Noise in Mylyn Interaction Traces and Its Impact on Developers and Recommendation Systems. Soh, Z., Khomh, F., Guéhéneuc, Y., & Antoniol, G. Empirical Software Engineering (EMSE), 23(2):645–692, Springer, April 2018. 49 pages.
Interaction traces (ITs) are developers' logs collected while developers maintain or evolve software systems. Researchers use ITs to study developers' editing styles and recommend relevant program entities when developers perform changes on source code. However, when using ITs, they make assumptions that may not necessarily be true. This article assesses the extent to which researchers' assumptions are true and examines noise in ITs. It also investigates the impact of noise on previous studies. This article describes a quasi-experiment collecting both Mylyn ITs and video-screen captures while 15 participants performed four realistic software maintenance tasks. It assesses the noise in ITs by comparing Mylyn ITs and the ITs obtained from the video captures. It proposes an approach to correct noise and uses this approach to revisit previous studies. The collected data show that Mylyn ITs can miss, on average, about 6% of the time spent by participants performing tasks and can contain, on average, about 85% of false edit events, which are not real changes to the source code. The approach to correct noise reveals about 45% of misclassification of ITs. It can improve the precision and recall of recommendation systems from the literature by up to 56% and 62%, respectively. Mylyn ITs include noise that biases subsequent studies and, thus, can prevent researchers from assisting developers effectively. They must be cleaned before use in studies and recommendation systems. The results on Mylyn ITs open new perspectives for the investigation of noise in ITs generated by other monitoring tools such as DFlow, FeedBag, and Mimec, and for future studies based on ITs.
@ARTICLE{Soh17-EMSE-MylynNoise,
AUTHOR = {Zéphyrin Soh and Foutse Khomh and Yann-Gaël Guéhéneuc and
Giuliano Antoniol},
JOURNAL = {Empirical Software Engineering (EMSE)},
TITLE = {Noise in Mylyn Interaction Traces and Its Impact on
Developers and Recommendation Systems},
YEAR = {2018},
MONTH = {April},
NOTE = {49 pages.},
NUMBER = {2},
PAGES = {645--692},
VOLUME = {23},
EDITOR = {Robert Feldt and Thomas Zimmermann},
KEYWORDS = {Topic: Program comprehension, Venue: EMSE},
PUBLISHER = {Springer},
URL = {http://www.ptidej.net/publications/documents/EMSE17a.doc.pdf},
ABSTRACT = {Interaction traces (ITs) are developers' logs collected
while developers maintain or evolve software systems. Researchers use
ITs to study developers' editing styles and recommend relevant
program entities when developers perform changes on source code.
However, when using ITs, they make assumptions that may not
necessarily be true. This article assesses the extent to which
researchers' assumptions are true and examines noise in ITs. It also
investigates the impact of noise on previous studies. This article
describes a quasi-experiment collecting both Mylyn ITs and
video-screen captures while 15 participants performed four realistic
software maintenance tasks. It assesses the noise in ITs by comparing
Mylyn ITs and the ITs obtained from the video captures. It proposes
an approach to correct noise and uses this approach to revisit
previous studies. The collected data show that Mylyn ITs can miss, on
average, about 6\% of the time spent by participants performing tasks
and can contain, on average, about 85\% of false edit events, which
are not real changes to the source code. The approach to correct
noise reveals about 45\% of misclassification of ITs. It can improve
the precision and recall of recommendation systems from the
literature by up to 56\% and 62\%, respectively. Mylyn ITs include
noise that biases subsequent studies and, thus, can prevent
researchers from assisting developers effectively. They must be
cleaned before use in studies and recommendation systems. The results
on Mylyn ITs open new perspectives for the investigation of noise in
ITs generated by other monitoring tools such as DFlow, FeedBag, and
Mimec, and for future studies based on ITs.}
}