Effort-Aware Just-in-Time Software Defect Prediction
Type: Thesis
Degree: Master's
Title: Effort-Aware Just-in-Time Software Defect Prediction
Presenter: Sadra Goudarzdashti
Supervisor: Dr. Morteza Yousef Sanati
Advisor: Dr. Muharram Mansoorizadeh
Examiners: Dr. Reza Mohammadi, Dr. Shakoor Vakilian
Time and date of presentation: 2025
Place of presentation: Seminar
Abstract: Effort-aware Just-in-Time Software Defect Prediction (JIT-SDP) is one of the key challenges in software engineering, aiming to identify defective code changes at the moment of commit. This task plays a crucial role in reducing maintenance costs, improving software quality, and optimizing resource management. Given the limited resources available for code inspection, achieving high accuracy in detecting defective changes with minimal inspection effort is of particular importance. In this research, a novel approach based on language models is proposed to establish a semantic alignment between commit messages and code changes, enabling the model to gain a deeper understanding of the intent behind each change. To this end, a two-stage framework consisting of pre-training and fine-tuning phases was designed. In the pre-training phase, two complementary methods were employed: Masked Language Modeling (MLM) to extract semantic and structural representations from each component independently, and Contrastive Learning to pull the embeddings of related commit–code pairs closer together while pushing unrelated samples apart. In the fine-tuning phase, the model was trained on labeled data containing code changes, commit messages, and handcrafted features to predict defective changes. Experimental results on the JIT-Defects4J dataset demonstrated that the proposed method outperforms existing baselines across all evaluation metrics. Specifically, it achieved an improvement of 7% in F1 score, 1% in AUC, and 4.9% in Recall@20%Effort compared to the strongest baseline. These results indicate that leveraging semantic pre-training based on language models to jointly represent commit messages and code changes can effectively enhance prediction accuracy, improve model generalization, and ultimately contribute to the advancement of software quality assurance processes.
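
To make the contrastive pre-training idea concrete, the following is a minimal sketch (not the thesis code): each commit-message embedding is pulled toward the embedding of its own code change and pushed away from unrelated changes in the batch, using a symmetric InfoNCE-style loss. The toy encoders, dimensions, and temperature are illustrative assumptions; the thesis instead builds on pretrained language-model representations of commit messages and code diffs.

    # Sketch of commit-message / code-change contrastive alignment (PyTorch).
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ToyEncoder(nn.Module):
        """Stand-in encoder: embeds token ids and mean-pools them into one unit-norm vector."""
        def __init__(self, vocab_size=30522, dim=128):
            super().__init__()
            self.emb = nn.Embedding(vocab_size, dim)
            self.proj = nn.Linear(dim, dim)

        def forward(self, token_ids):                      # token_ids: (batch, seq_len)
            pooled = self.emb(token_ids).mean(dim=1)       # simple mean pooling
            return F.normalize(self.proj(pooled), dim=-1)

    def contrastive_loss(msg_vecs, code_vecs, temperature=0.07):
        """Symmetric InfoNCE: the i-th commit message should match the i-th code change."""
        logits = msg_vecs @ code_vecs.t() / temperature    # (batch, batch) similarity matrix
        targets = torch.arange(logits.size(0), device=logits.device)
        return 0.5 * (F.cross_entropy(logits, targets) +
                      F.cross_entropy(logits.t(), targets))

    if __name__ == "__main__":
        msg_encoder, code_encoder = ToyEncoder(), ToyEncoder()
        msg_ids = torch.randint(0, 30522, (8, 32))         # 8 commit messages, 32 tokens each
        code_ids = torch.randint(0, 30522, (8, 64))        # their code changes, 64 tokens each
        loss = contrastive_loss(msg_encoder(msg_ids), code_encoder(code_ids))
        loss.backward()                                    # gradients flow to both encoders
        print(f"contrastive loss: {loss.item():.4f}")

Recall@20%Effort, the effort-aware metric reported above, can be sketched in the same hedged spirit. The convention assumed here (not spelled out in the abstract) is that changes are ranked by predicted defect score, inspection effort is measured in changed lines, and the metric is the fraction of all defective changes caught within the first 20% of total effort.

    # Sketch of Recall@20%Effort under the ranking/effort assumptions stated above.
    def recall_at_effort(scores, churn, labels, effort_budget=0.20):
        ranked = sorted(zip(scores, churn, labels), key=lambda t: t[0], reverse=True)
        budget = effort_budget * sum(churn)                # allowed inspection effort
        spent, found = 0.0, 0
        for _, loc, defective in ranked:
            if spent + loc > budget:                       # next change would exceed the budget
                break
            spent += loc
            found += int(defective)
        total_defects = sum(labels)
        return found / total_defects if total_defects else 0.0

    # Example: 5 changes, total churn 100, budget 20 lines. Only the cheap top-ranked
    # defective change fits, so 1 of 2 defects is found and the result is 0.5.
    print(recall_at_effort(scores=[0.9, 0.8, 0.3, 0.2, 0.1],
                           churn=[10, 40, 30, 10, 10],
                           labels=[1, 0, 1, 0, 0]))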