Mašīnmācīšanās pielietojums programmatūras produktu automatizētās regresijas testēšanas paātrināšanai

Cipiševa, Svetlana

dc.contributor.advisor	Arnicāns, Guntis
dc.contributor.author	Cipiševa, Svetlana
dc.contributor.other	Latvijas Universitāte. Datorikas fakultāte
dc.date.accessioned	2024-06-20T01:04:11Z
dc.date.available	2024-06-20T01:04:11Z
dc.date.issued	2024
dc.identifier.other	101361
dc.identifier.uri	https://dspace.lu.lv/dspace/handle/7/66043
dc.description.abstract	Regresijas testēšana prasa ievērojamus laika un resursu ieguldījumus. Pētījums parāda, kā mašīnmācīšanās var palīdzēt ievērojami paātrināt regresijas testēšanu, izmantojot testpiemēru prioritizāciju, lai atklātu kļūdas pēc iespējas agrāk. Darbā ir izstrādāti seši mašīnmācīšanās modeļi, pamatojoties uz trim algoritmiem: punktveida “Gradient Boosting”, pāru veida “LambdaMART” un saraksta veida “NeuralNDCG”. “Gradient Boosting” demonstrē augstāko precizitāti, kas ir tikai 1-4% zemāka, salīdzinot ar ideāli sakārtotu datu kopu. Turklāt “Gradient Boosting” prasa visīsāko apmācības laiku, padarot to piemērotu ikdienas apmācībai nepārtrauktās integrēšanās vidē. Modeļi tika izstrādāti un pārbaudīti, balstoties uz projektiem, kas izmanto programmēšanas valodu “Java” un kas ietvēra 23 miljonus testkomplektu izpildes. Tas liecina par modeļu augsto uzticamību. Viena no darba unikālajām īpašībām ir tā, ka divas no izmantotajām iezīmēm tika iegūtas, pamatojoties uz pirmkoda izmaiņām. Tas kļuva iespējams, pateicoties “UniXCoder” lielā valodas modeļa izmantošanai, kas ir daļa no “CodeBERT” modeļa. Tā kā neviens no analizētajiem mūsdienu mašīnmācīšanās testēšanas rīkiem nepiedāvā šādu iespēju, tad darbs paver iespēju veidot jaunas inovatīvas metodes un rīkus testpiemēru rindošanai.
dc.description.abstract	Regression testing requires a significant investment of time and resources. This study shows how machine learning can substantially speed up regression testing by prioritising test cases so that faults are detected as early as possible. Six machine learning models were developed, based on three algorithms: pointwise Gradient Boosting, pairwise LambdaMART, and listwise NeuralNDCG. Gradient Boosting demonstrated the highest accuracy, only 1-4% lower on key metrics than a perfectly ordered dataset, and required the shortest training time, making it suitable for daily retraining in a continuous integration environment. The models were developed and tested on projects written in the Java programming language, covering 23 million test suite executions. This indicates the high reliability of the models. A distinctive aspect of the work is that two of the features used were derived from source code changes. This was made possible by the UniXCoder large language model, which is part of the CodeBERT model family. As none of the contemporary machine learning testing tools analysed offers this capability, the work opens the way to new methods and tools for ranking test cases.
dc.language.iso	lav
dc.publisher	Latvijas Universitāte
dc.rights	info:eu-repo/semantics/openAccess
dc.subject	Datorzinātne
dc.subject	Testpiemēra prioritāšu noteikšana
dc.subject	Rindošanas mašīnmācīšanās
dc.subject	Gradient Boosting
dc.subject	UniXCoder
dc.subject	LambdaMART
dc.title	Mašīnmācīšanās pielietojums programmatūras produktu automatizētās regresijas testēšanas paātrināšanai
dc.title.alternative	Application of machine learning to accelerate automated regression testing of software products
dc.type	info:eu-repo/semantics/masterThesis

Files in this item

Name: 302-101361-Cipiseva_Svetlana_s ...
Size: 2.659Mb
Format: PDF

This item appears in the following Collection(s)

Bakalaura un maģistra darbi (EZTF) / Bachelor's and Master's theses [5488]
