REPENT: Analyzing the Nature of Identifier Renamings

REPENT: Analyzing the Nature of Identifier Renamings. Arnaoudova, V., Eshkevari, L. M., Di Penta, M., Oliveto, R., Antoniol, G., & Guéhéneuc, Y. Transactions on Software Engineering (TSE), 40(5):502–532, IEEE CS Press, May 2014. 30 pages.


Source code lexicon plays a paramount role in software quality: poor lexicon can lead to poor comprehensibility and even increase software fault-proneness. For this reason, renaming a program entity, i.e., altering the entity identifier, is an important activity during software evolution. Developers rename when they feel that the name of an entity is not (anymore) consistent with its functionality, or when such a name may be misleading. A survey that we performed with 71 developers suggests that 39% perform renaming from a few times per week to almost every day and that 92% of the participants consider that renaming is not straightforward. However, despite the cost that is associated with renaming, renamings are seldom if ever documented: for example, less than 1% of the renamings in the five programs that we studied. This explains why participants largely agree on the usefulness of automatically documenting renamings. In this paper we propose REPENT (REnaming Program ENTities), an approach to automatically document (detect and classify) identifier renamings in source code. REPENT detects renamings based on a combination of source code differencing and data flow analyses. Using a set of natural language tools, REPENT classifies renamings into the different dimensions of a taxonomy that we defined. Using the documented renamings, developers will be able to, for example, look up methods that are part of the public API (as they impact client applications), or look for inconsistencies between the name and the implementation of an entity that underwent a high-risk renaming (e.g., towards the opposite meaning). We evaluate the accuracy and completeness of REPENT on the evolution history of five open-source Java programs. The study indicates a precision of 88% and a recall of 92%. In addition, we report an exploratory study investigating and discussing how identifiers are renamed in the five programs, according to our taxonomy.

@ARTICLE{Arnaoudova14-TSE-REPENT,
   AUTHOR       = {Venera Arnaoudova and Laleh Mousavi Eshkevari and 
      Di Penta, Massimiliano and Rocco Oliveto and Giuliano Antoniol and 
      Yann-Ga{\"e}l Gu{\'e}h{\'e}neuc},
   JOURNAL      = {Transactions on Software Engineering (TSE)},
   TITLE        = {REPENT: Analyzing the Nature of Identifier Renamings},
   YEAR         = {2014},
   MONTH        = {May},
   NOTE         = {30 pages.},
   NUMBER       = {5},
   PAGES        = {502--532},
   VOLUME       = {40},
   EDITOR       = {Harald Gall},
   KEYWORDS     = {Topic: Identifier analysis, Venue: TSE},
   PUBLISHER    = {IEEE CS Press},
   URL          = {http://www.ptidej.net/publications/documents/TSE14.doc.pdf},
   ABSTRACT     = {Source code lexicon plays a paramount role in software 
      quality: poor lexicon can lead to poor comprehensibility and even 
      increase software fault-proneness. For this reason, renaming a 
      program entity, \ie{} altering the entity identifier, is an important 
      activity during software evolution. Developers rename when they feel 
      that the name of an entity is not (anymore) consistent with its 
      functionality, or when such a name may be misleading. A survey that 
      we performed with 71 developers suggests that 39\% perform renaming 
      from a few times per week to almost every day and that 92\% of the 
      participants consider that renaming is not straightforward. However, 
      despite the cost that is associated with renaming, renamings are 
      seldom if ever documented---for example, less than 1\% of the 
      renamings in the five programs that we studied. This explains why 
      participants largely agree on the usefulness of automatically 
      documenting renamings. In this paper we propose {\sc REPENT} ({\sc 
      REnaming Program ENTities}), an approach to automatically 
      document---detect and classify---identifier renamings in source code. 
      REPENT detects renamings based on a combination of source code 
      differencing and data flow analyses. Using a set of natural language 
      tools, REPENT classifies renamings into the different dimensions of a 
      taxonomy that we defined. Using the documented renamings, developers 
      will be able to, for example, look up methods that are part of the 
      public API (as they impact client applications), or look for 
      inconsistencies between the name and the implementation of an entity 
      that underwent a high risk renaming (\eg{} towards the opposite 
      meaning). We evaluate the accuracy and completeness of REPENT on the 
      evolution history of five open-source Java programs. The study 
      indicates a precision of 88\% and a recall of 92\%. In addition, we 
      report an exploratory study investigating and discussing how 
      identifiers are renamed in the five programs, according to our 
      taxonomy.}
}

