An Empirical Study of the Impact of Class Overlap on the Performance and Interpretability of Cross-Version Defect Prediction

Publication type: Journal Article
Publication date: 2024-09-21
Scimago: Q3
WoS: Q4
SJR: 0.206
CiteScore: 1.8
Impact factor: 0.6
ISSN: 0218-1940, 1793-6403
Abstract

The class overlap problem refers to instances from different classes overlapping heavily in the feature space, and it is one of the obstacles to improving the performance of software defect prediction (SDP). Existing studies of the impact of class overlap on SDP have focused mainly on within-project and cross-project defect prediction, and existing methods for cleaning class-overlap instances are not suitable for cross-version defect prediction. In this paper, we propose a method for cleaning class-overlap instances based on the Ratio of K-nearest neighbors with the Same Label (RKSL), which removes training-set instances whose neighbor ratio is abnormal. Using the RKSL method, we investigate the impact of class overlap on the performance and interpretability of cross-version defect prediction models. The experimental results show that class overlap can significantly affect the performance of cross-version defect prediction models. The RKSL method can handle the class overlap problem in defect datasets, but it may affect the interpretability of the models. Based on an analysis of feature changes, we suggest that cleaning class-overlap instances can help models identify more important features.
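
The idea behind the neighbor ratio can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how an "abnormal" ratio is determined, so the fixed threshold `min_ratio`, the Euclidean distance, the default `k`, and the function names `same_label_ratio` and `clean_overlap` are all assumptions made for illustration only.

```python
import math
from typing import List, Sequence, Tuple


def same_label_ratio(X: Sequence[Sequence[float]], y: Sequence[int], k: int) -> List[float]:
    """For each instance, return the fraction of its k nearest neighbors
    (Euclidean distance, excluding the instance itself) that share its label."""
    n = len(X)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of instances")
    ratios = []
    for i in range(n):
        # Sort all other instances by distance to instance i.
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(X[i], X[j]),
        )
        neighbors = others[:k]
        same = sum(1 for j in neighbors if y[j] == y[i])
        ratios.append(same / k)
    return ratios


def clean_overlap(
    X: Sequence[Sequence[float]],
    y: Sequence[int],
    k: int = 5,
    min_ratio: float = 0.5,
) -> Tuple[list, list]:
    """Drop training instances whose same-label neighbor ratio is below
    min_ratio. The fixed threshold is an assumption of this sketch; the
    paper's criterion for an 'abnormal' ratio may differ."""
    ratios = same_label_ratio(X, y, k)
    keep = [i for i, r in enumerate(ratios) if r >= min_ratio]
    return [X[i] for i in keep], [y[i] for i in keep]
```

In a cross-version setting, a cleaning step of this kind would be applied only to the training data from the earlier version, leaving the later version used for testing untouched.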

Top-30

Journals
International Journal of Software Engineering and Knowledge Engineering: 3 publications, 100%

Publishers
World Scientific: 3 publications, 100%
  • We do not take into account publications without a DOI.
  • Statistics recalculated weekly.

Cite this
GOST
Han H. et al. An Empirical Study of the Impact of Class Overlap on the Performance and Interpretability of Cross-Version Defect Prediction // International Journal of Software Engineering and Knowledge Engineering. 2024. Vol. 34. No. 12. pp. 1895-1918.
GOST (all authors, up to 50)
Han H., Yu Q., Zhu Y., Cheng S., Zhang Yu. An Empirical Study of the Impact of Class Overlap on the Performance and Interpretability of Cross-Version Defect Prediction // International Journal of Software Engineering and Knowledge Engineering. 2024. Vol. 34. No. 12. pp. 1895-1918.
RIS
TY - JOUR
DO - 10.1142/s0218194024500414
UR - https://www.worldscientific.com/doi/10.1142/S0218194024500414
TI - An Empirical Study of the Impact of Class Overlap on the Performance and Interpretability of Cross-Version Defect Prediction
T2 - International Journal of Software Engineering and Knowledge Engineering
AU - Han, Hui
AU - Yu, Qiao
AU - Zhu, Yi
AU - Cheng, Shengyi
AU - Zhang, Yu
PY - 2024
DA - 2024/09/21
PB - World Scientific
SP - 1895
EP - 1918
IS - 12
VL - 34
SN - 0218-1940
SN - 1793-6403
ER -
BibTeX (up to 50 authors)
@article{2024_Han,
author = {Hui Han and Qiao Yu and Yi Zhu and Shengyi Cheng and Yu Zhang},
title = {An Empirical Study of the Impact of Class Overlap on the Performance and Interpretability of Cross-Version Defect Prediction},
journal = {International Journal of Software Engineering and Knowledge Engineering},
year = {2024},
volume = {34},
publisher = {World Scientific},
month = {sep},
url = {https://www.worldscientific.com/doi/10.1142/S0218194024500414},
number = {12},
pages = {1895--1918},
doi = {10.1142/s0218194024500414}
}
MLA
Han, Hui, et al. “An Empirical Study of the Impact of Class Overlap on the Performance and Interpretability of Cross-Version Defect Prediction.” International Journal of Software Engineering and Knowledge Engineering, vol. 34, no. 12, Sep. 2024, pp. 1895-1918. https://www.worldscientific.com/doi/10.1142/S0218194024500414.