Physical and Conceptual Identifier Dispersion: Measures and Relation to Fault Proneness. Arnaoudova, V., Eshkevari, L. M., Oliveto, R., Guéhéneuc, Y., & Antoniol, G. In Ferenc, R. & Poshyvanyk, D., editors, Proceedings of the 26th International Conference on Software Maintenance (ICSM), pages 1–5, September, 2010. IEEE CS Press. 5 pages. Early Research Achievements Track. Best paper award.
Poorly-chosen identifiers have been reported in the literature as misleading and increasing the program comprehension effort. Identifiers are composed of terms, which can be dictionary words, acronyms, contractions, or simple strings. We conjecture that the use of identical terms in different contexts may increase the risk of faults. We investigate our conjecture using a measure combining term entropy and term context coverage to study whether certain terms increase the odds ratios of methods to be fault-prone. Entropy measures the physical dispersion of terms in a program: the higher the entropy, the more scattered across the program the terms. Context coverage measures the conceptual dispersion of terms: the higher their context coverage, the more unrelated the methods using them. We compute term entropy and context coverage of terms extracted from identifiers in Rhino 1.4R3 and ArgoUML 0.16. We show statistically that methods containing terms with high entropy and context coverage are more fault-prone than others.
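To make the two measures in the abstract concrete, here is a minimal sketch in Python. It rests on assumptions that the abstract does not confirm: entropy is taken as Shannon entropy over how a term's occurrences are distributed across methods, and context coverage as one minus the mean pairwise cosine similarity of the term-frequency vectors of the methods that use the term. The function names and the toy `methods` data are hypothetical, and the paper's exact definitions may differ.

```python
import math
from collections import Counter
from itertools import combinations


def term_entropy(term, methods):
    """Shannon entropy (in bits) of a term's occurrences across methods.

    Assumed reading of "physical dispersion": 0 when the term lives in a
    single method, larger when its occurrences are spread over many.
    """
    counts = [terms.count(term) for terms in methods.values()]
    total = sum(counts)
    if total == 0:
        return 0.0
    probs = [c / total for c in counts if c > 0]
    return sum(p * math.log2(1 / p) for p in probs)


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def context_coverage(term, methods):
    """1 - mean pairwise similarity of the methods that contain the term.

    Assumed reading of "conceptual dispersion": close to 1 when the methods
    sharing the term have otherwise unrelated vocabularies.
    """
    users = [Counter(terms) for terms in methods.values() if term in terms]
    pairs = list(combinations(users, 2))
    if not pairs:
        return 0.0
    return 1.0 - sum(cosine(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy data: method name -> terms split out of its identifiers.
methods = {
    "Parser.parse": ["parse", "node", "token", "node"],
    "Tree.addNode": ["add", "node"],
    "Logger.write": ["write", "message"],
}
```

Under these assumptions, `term_entropy("node", methods)` is positive because `node` occurs in two methods, whereas `term_entropy("write", methods)` is zero because `write` occurs in only one.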
@INPROCEEDINGS{Arnaoudova10-ICSM-ERA-IdentifierEntropy,
AUTHOR = {Venera Arnaoudova and Laleh Mousavi Eshkevari and
Rocco Oliveto and Yann-Gaël Guéhéneuc and Giuliano Antoniol},
BOOKTITLE = {Proceedings of the 26<sup>th</sup> International Conference on Software Maintenance (ICSM)},
TITLE = {Physical and Conceptual Identifier Dispersion: Measures
and Relation to Fault Proneness},
YEAR = {2010},
OPTADDRESS = {},
OPTCROSSREF = {},
EDITOR = {Rudolf Ferenc and Denys Poshyvanyk},
MONTH = {September},
NOTE = {5 pages. Early Research Achievements Track. \awardBest paper.},
OPTNUMBER = {},
OPTORGANIZATION = {},
PAGES = {1--5},
PUBLISHER = {IEEE CS Press},
OPTSERIES = {},
OPTVOLUME = {},
KEYWORDS = {Topic: <b>Identifier analysis</b>, Venue: <c>ICSM</c>,
Venue: <c>ICSME</c>},
URL = {http://www.ptidej.net/publications/documents/ICSM10a.doc.pdf},
PDF = {http://www.ptidej.net/publications/documents/ICSM10a.ppt.pdf},
ABSTRACT = {Poorly-chosen identifiers have been reported in the
literature as misleading and increasing the program comprehension
effort. Identifiers are composed of terms, which can be dictionary
words, acronyms, contractions, or simple strings. We conjecture that
the use of identical terms in different contexts may increase the
risk of faults. We investigate our conjecture using a measure
combining term entropy and term context coverage to study whether
certain terms increase the odds ratios of methods to be fault-prone.
Entropy measures the \emph{physical dispersion} of terms in a
program: the higher the entropy, the more scattered across the
program the terms. Context coverage measures the \emph{conceptual
dispersion} of terms: the higher their context coverage, the more
unrelated the methods using them. We compute term entropy and context
coverage of terms extracted from identifiers in Rhino 1.4R3 and
ArgoUML 0.16. We show statistically that methods containing terms
with high entropy and context coverage are more fault-prone than
others.}
}