Automatically Capturing Quality-Related Concerns in Bug Report Descriptions for Efficient Bug Triaging

Automatically Capturing Quality-Related Concerns in Bug Report Descriptions for Efficient Bug Triaging. Krasniqi, R. & Do, H. In 2021 ACM 25th International Conference on Evaluation and Assessment in Software Engineering (EASE), pages 10–19, Gothenburg Sweden, 2022. ACM.

In the early phases of a project, software architects and developers design solutions to satisfy quality concerns. However, as a byproduct of the long-term maintenance effort, qualities tend to erode, causing quality-related bugs to surface across the codebase. In principle, quality-related concerns can not only be expensive and difficult to detect, but can also have a detrimental effect on the system operating as intended. Moreover, quality-related concerns can directly affect users' experiences at large. To address this problem, we build a quality-based bug classifier that leverages several feature selection techniques (TF-IDF, Chi-square, Mutual Information, and Extra Randomized Trees) in combination with various machine learning algorithms. Our results indicate that Random Forest with the (TF-IDF+Chi-square) configuration achieved the best results for detecting six quality-related types, achieving a precision of 76%, recall of 70%, and F1 of 70%. However, the same approach returned a low precision of 48%, recall of 15%, and F1 of 23% for detecting functional-related bugs. We argue that such low performance stems from overlapping content between functional and quality-related information, which opens another challenging topic that we aim to expand on in future work.
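The best-performing configuration named in the abstract (TF-IDF features, Chi-square feature selection, Random Forest) can be sketched with scikit-learn as below. This is a minimal illustration, not the authors' implementation: the bug report texts, the label names, the helper `build_classifier`, and the values of `k_features` and `n_estimators` are all hypothetical, since the abstract names neither the six quality types nor any hyperparameters.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.pipeline import Pipeline


def build_classifier(k_features=5, seed=0):
    """TF-IDF -> Chi-square top-k selection -> Random Forest (assumed settings)."""
    return Pipeline([
        # Turn each bug report description into a TF-IDF weighted term vector.
        ("tfidf", TfidfVectorizer(lowercase=True, stop_words="english")),
        # Keep the k terms with the highest chi-square score against the labels.
        ("chi2", SelectKBest(chi2, k=k_features)),
        # Classify the reduced vectors with a Random Forest.
        ("forest", RandomForestClassifier(n_estimators=100, random_state=seed)),
    ])


# Hypothetical toy data; a real study would use labeled issue-tracker reports.
reports = [
    "Search page takes thirty seconds to load with large result sets",
    "Memory usage grows until the server becomes slow and unresponsive",
    "Password is written to the log file in plain text",
    "Session token is not invalidated after logout",
    "Export button produces an empty CSV file",
    "Clicking save does not store the edited profile name",
]
labels = [
    "performance", "performance",
    "security", "security",
    "functional", "functional",
]

model = build_classifier(k_features=5)
model.fit(reports, labels)

predicted = model.predict(["Login page loads slowly under heavy traffic"])
print(predicted[0])
```

On six toy reports the prediction is not meaningful; the sketch only shows how the three stages chain together. Swapping `chi2` for `mutual_info_classif` in `SelectKBest` would give a Mutual Information variant of the same pipeline.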

@inproceedings{krasniqi_automatically_2022,
	address = {Gothenburg Sweden},
	title = {Automatically {Capturing} {Quality}-{Related} {Concerns} in {Bug} {Report} {Descriptions} for {Efficient} {Bug} {Triaging}},
	isbn = {978-1-4503-9613-4},
	doi = {10.1145/3530019.3530021},
	abstract = {In the early phases of a project, software architects and developers design solutions to satisfy quality concerns. However, as a byproduct of the long-term maintenance effort, qualities tend to erode, causing quality-related bugs to surface across the codebase. In principle, quality-related concerns can not only be expensive and difficult to detect, but can also have a detrimental effect on the system operating as intended. Moreover, quality-related concerns can directly affect users' experiences at large. To address this problem, we build a quality-based bug classifier that leverages several feature selection techniques (TF-IDF, Chi-square, Mutual Information, and Extra Randomized Trees) in combination with various machine learning algorithms. Our results indicate that Random Forest with the (TF-IDF+Chi-square) configuration achieved the best results for detecting six quality-related types, achieving a precision of 76\%, recall of 70\%, and F1 of 70\%. However, the same approach returned a low precision of 48\%, recall of 15\%, and F1 of 23\% for detecting functional-related bugs. We argue that such low performance stems from overlapping content between functional and quality-related information, which opens another challenging topic that we aim to expand on in future work.},
	language = {en},
	urldate = {2022-09-29},
	booktitle = {2021 {ACM} 25th {International} {Conference} on {Evaluation} and {Assessment} in {Software} {Engineering} ({EASE})},
	publisher = {ACM},
	author = {Krasniqi, Rrezarta and Do, Hyunsook},
	year = {2022},
	keywords = {Conference Full Papers},
	pages = {10--19},
}
