Combining Probabilistic Ranking and Latent Semantic Indexing for Feature Identification. Poshyvanyk, D., Guéhéneuc, Y., Marcus, A., Antoniol, G., & Rajlich, V. In Ebert, J. & Linos, P., editors, Proceedings of the 14th International Conference on Program Comprehension (ICPC), pages 137–148, June 2006. IEEE CS Press. Best paper. 10 pages.
The paper recasts feature location in source code as a decision-making problem under uncertainty. Its main contribution is the combination of two existing feature location techniques, each of which produces a ranked set of facts from the software as its answer to the feature identification problem. The first technique is a scenario-based probabilistic ranking of events observed while executing a program under given scenarios. The second treats feature location as an information retrieval task based on Latent Semantic Indexing of the source code. We show the viability and effectiveness of the combined technique with two case studies. The first replicates feature identification in Mozilla, which allows a direct comparison with previously published data. The second is a bug location problem in Mozilla. The results show that the combined technique improves feature identification significantly over either technique used independently.
