Automatically Capturing Quality-Related Concerns in Bug Report Descriptions for Efficient Bug Triaging. Krasniqi, R. & Do, H. In 2021 ACM 25th International Conference on Evaluation and Assessment in Software Engineering (EASE), pages 10–19, Gothenburg, Sweden, 2022. ACM.
In the early phases of a project, software architects and developers design solutions to satisfy quality concerns. However, as a byproduct of the long-term maintenance effort, qualities tend to erode, causing quality-related bugs to surface across the codebase. In principle, quality-related concerns not only can be expensive and difficult to detect, but they can have a detrimental effect on the system operating as intended. Moreover, quality-related concerns can directly affect users' experiences at large. To address this problem, we build a quality-based bug classifier that leverages several feature selection techniques, TF-IDF, Chi-square, Mutual Information, and Extra Randomized Trees, including the incorporation of various machine learning algorithms. Our results indicate that Random Forest with the (TF-IDF+Chi-square) configuration achieved the best results for detecting six quality-related types, achieving a precision of 76%, recall of 70%, and F1 of 70%. However, the same approach returned low precision of 48%, recall of 15%, and F1 of 23% for detecting functional-related bugs. We argue that such low performance has resulted in an aftermath of overlapping content caused by functional and quality-related information which opens another challenging topic that we aim to expand in future work.
@inproceedings{krasniqi_automatically_2022,
	address = {Gothenburg, Sweden},
	title = {Automatically {Capturing} {Quality}-{Related} {Concerns} in {Bug} {Report} {Descriptions} for {Efficient} {Bug} {Triaging}},
	isbn = {978-1-4503-9613-4},
	doi = {10.1145/3530019.3530021},
	abstract = {In the early phases of a project, software architects and developers design solutions to satisfy quality concerns. However, as a byproduct of the long-term maintenance effort, qualities tend to erode, causing quality-related bugs to surface across the codebase. In principle, quality-related concerns not only can be expensive and difficult to detect, but they can have a detrimental effect on the system operating as intended. Moreover, quality-related concerns can directly affect users' experiences at large. To address this problem, we build a quality-based bug classifier that leverages several feature selection techniques, TF-IDF, Chi-square, Mutual Information, and Extra Randomized Trees, including the incorporation of various machine learning algorithms. Our results indicate that Random Forest with the (TF-IDF+Chi-square) configuration achieved the best results for detecting six quality-related types, achieving a precision of 76\%, recall of 70\%, and F1 of 70\%. However, the same approach returned low precision of 48\%, recall of 15\%, and F1 of 23\% for detecting functional-related bugs. We argue that such low performance has resulted in an aftermath of overlapping content caused by functional and quality-related information which opens another challenging topic that we aim to expand in future work.},
	language = {en},
	urldate = {2022-09-29},
	booktitle = {2021 {ACM} 25th {International} {Conference} on {Evaluation} and {Assessment} in {Software} {Engineering} ({EASE})},
	publisher = {ACM},
	author = {Krasniqi, Rrezarta and Do, Hyunsook},
	year = {2022},
	keywords = {Conference Full Papers},
	pages = {10--19},
}
