Identifying Self-Admitted Technical Debt through Code Comment Analysis with a Contextualized Vocabulary. de Freitas Farias, M. A., de Mendonça, M. G., Kalinowski, M., & Spínola, R. O. Information and Software Technology, 121:106270:1-18, 2020.
Context: Previous work has shown that code comment analysis can be used to detect Self-Admitted Technical Debt (SATD). However, current SATD identification strategies still return a large number of candidate SATD items, making the identification process laborious. Moreover, those strategies do not support automatic identification of the type of debt of each SATD item.
Objective: This work evaluates, improves, and applies a set of contextualized patterns we built to detect SATD using code comment analysis. We refer to this set of patterns as the TD identification vocabulary.
Method: We carried out three empirical studies. In the first, 23 participants analyzed the patterns of the vocabulary and rated how important each one is for identifying SATD items. In the second, we performed a qualitative analysis to investigate the relation between each pattern and the types of TD. These two studies resulted in an improved vocabulary. Finally, we performed a feasibility study in which we used the improved vocabulary to automatically identify SATD items, and the types of debt they represent, in three open source projects: ArgoUML, jEdit, and Lucene.
Results: The result was an improved vocabulary that takes into account the level of importance of each pattern and the relationship between patterns and TD types, supporting both the identification and the classification of SATD items. More than half of the patterns were considered decisive or very decisive for detecting SATD. In addition, using the improved vocabulary, we were able to find different TD types, such as code, design, defect, documentation, and requirement debt.
Conclusion: The studies allowed us to improve the vocabulary for identifying SATD through code comment analysis. The results show that pattern-based code comment analysis can contribute to improving existing methods, or creating new ones, for identifying and classifying SATD items.
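
As a rough illustration of what pattern-based code comment analysis can look like, the following Python sketch matches comments against a small pattern-to-type table. It is only a sketch under stated assumptions: the patterns, their mapping to TD types, and the function names `classify_comment` and `find_satd` are hypothetical and are not taken from the paper's vocabulary or tooling.

```python
import re
from collections import defaultdict

# Hypothetical pattern -> TD type table, for illustration only.
# These entries are NOT the vocabulary evaluated in the paper.
VOCABULARY = {
    "todo": "code",
    "fixme": "defect",
    "hack": "design",
    "workaround": "design",
    "not documented": "documentation",
    "not implemented yet": "requirement",
}


def classify_comment(comment):
    """Return (pattern, td_type) pairs for every pattern found in the comment."""
    text = comment.lower()
    hits = []
    for pattern, td_type in VOCABULARY.items():
        # Word boundaries avoid matching inside longer words (e.g. "hackathon").
        if re.search(r"\b" + re.escape(pattern) + r"\b", text):
            hits.append((pattern, td_type))
    return hits


def find_satd(comments):
    """Group candidate SATD comments by the TD type of the patterns they match."""
    by_type = defaultdict(list)
    for comment in comments:
        for _, td_type in classify_comment(comment):
            if comment not in by_type[td_type]:
                by_type[td_type].append(comment)
    return dict(by_type)


if __name__ == "__main__":
    sample = [
        "// TODO: remove this hack",
        "// computes the checksum",
        "// FIXME: crashes on empty input",
    ]
    for td_type, items in find_satd(sample).items():
        print(td_type, items)
```

A comment that matches several patterns is listed under each corresponding type, which reflects the idea that one comment can signal more than one kind of debt. A fuller implementation would presumably also weight patterns by their rated importance, which this sketch omits.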
