Effective Use of Analysts' Effort in Automated Tracing. Hayes, J. H., Dekhtyar, A., Larsen, J., & Guéhéneuc, Y. Requirements Engineering (REEN), 23(1):119–143, Springer, March, 2018. 26 pages.
@ARTICLE{Hayes16-REEN-AnalystEffort,
AUTHOR = {Jane Huffman Hayes and Alexander Dekhtyar and
Jody Larsen and Yann-Ga{\"e}l Gu{\'e}h{\'e}neuc},
JOURNAL = {Requirements Engineering (REEN)},
TITLE = {Effective Use of Analysts' Effort in Automated Tracing},
YEAR = {2018},
MONTH = {March},
NOTE = {26 pages.},
NUMBER = {1},
PAGES = {119--143},
VOLUME = {23},
EDITOR = {Pericles Loucopoulos},
KEYWORDS = {Topic: <b>Requirements and features</b>,
Venue: <b>REEN</b>},
PUBLISHER = {Springer},
URL = {http://www.ptidej.net/publications/documents/REEN16.doc.pdf},
ABSTRACT = {Because of the large amount of effort it takes to
manually trace requirements, automated traceability methods for
mapping textual software engineering artifacts to each other and
generating candidate links have received increased attention over the
past 15 years. Automatically generated links, however, are viewed as
candidates until human analysts confirm/reject them for the final
requirements traceability matrix. Studies have shown that analysts
are fallible, but necessary, participants in the tracing process.
There are two key measures guiding analyst work on the evaluation of
candidate links: accuracy of analyst decision and efficiency of their
work. Intuitively, it is expected that the more effort the analyst
spends on candidate link validation, the more accurate the final
traceability matrix is likely to be, although the exact nature of
this relationship may be difficult to gauge outright. To assist
analysts in making the best use of their time when reviewing
candidate links, prior work simulated four possible behaviors and
showed that more structured approaches save the analysts' time/effort
required to achieve certain levels of accuracy. However, these
behavioral simulations are complex to run and their results difficult
to interpret and use in practice. In this paper, we present a
mathematical model for evaluating analyst effort during requirements
tracing tasks. We apply this model to a simulation study of 12
candidate link validation approaches. The simulation study is
conducted on a number of different datasets. In each study, we assume
perfect analyst behavior (i.e., analyst always being correct when
making a determination about a link). Under this assumption, we
evaluate the estimated effort for the analyst and plot it against the
accuracy of the recovered traceability matrix. The effort estimation
model is guided by a parameter specifying the relationship between
the time it takes an analyst to evaluate a presented link and the
time it takes an analyst to discover a link not presented to her. We
construct a series of effort estimations based on different values of
the model parameter. We found that the analysts' approach to
candidate link validation---essentially the order in which the
analyst examines presented candidate links---does impact the effort.
We also found that the lowest ratio of the cost of finding a correct
link from scratch over the cost of recognizing a correct link yields
the lowest effort for all datasets, but that the lowest effort does
not always yield the highest quality matrix. We finally observed that
effort varies by dataset. We conclude that the link evaluation
approach we call ``Top 1 Not Yet Examined Feedback Pruning'' was the
overall winner in terms of effort and highest quality and, thus,
should be followed by human analysts if possible.}
}