A Study of Build Inflation in 30 Million CPAN Builds on 13 Perl Versions and 10 Operating Systems. Zolfagharinia, M., Adams, B., & Guéhéneuc, Y. Empirical Software Engineering (EMSE), 24(6):3933–3971, Springer, June, 2019. 38 pages.
Continuous Integration (CI) is a cornerstone of modern quality assurance, providing on-demand builds (compilation and tests) of code changes or software releases. Yet the many existing CI systems do not help developers in interpreting build results, in particular when facing build inflation. Build inflation arises when each code change has to be built on dozens of combinations (configurations) of runtime environments (REs), operating systems (OSes), and hardware architectures (HAs). A code change C1 sent to the CI system may introduce programming faults that cause all these builds to fail, while a change C2 introducing a new library dependency might cause only one particular build configuration to fail. Consequently, the one build failure due to C2 will be "hidden" among the dozens of build failures due to C1 when the CI system reports the results of the builds. We have named this phenomenon build inflation, because it may bias the interpretation of build results by developers by "hiding" certain types of faults. In this paper, we study build inflation through a large-scale study of the relationship between REs and OSes and build failures on 30 million builds of the CPAN repository on the CPAN Testers package-level CI system. We show that the builds of Perl packages may fail differently on different REs and OSes and any combination thereof. Thus, we show that the results provided by CPAN Testers require filtering and selection to identify real trends of build failures among the many failures. Manual analysis of 791 build failures shows that dependency faults (missing modules) and programming faults (undefined values) are the main reasons for failures, with dependency faults being easier to fix. We conclude with recommendations for practitioners and researchers in interpreting build results as well as for tool builders who should improve the scheduling of builds and the reporting of build failures.
@ARTICLE{Zolfagharinia19-EMSE-BuildInflation,
AUTHOR = {Mahdis Zolfagharinia and Bram Adams and
Yann-Gaël Guéhéneuc},
JOURNAL = {Empirical Software Engineering (EMSE)},
TITLE = {A Study of Build Inflation in 30 Million CPAN Builds on
13 Perl Versions and 10 Operating Systems},
YEAR = {2019},
MONTH = {June},
NOTE = {38 pages.},
NUMBER = {6},
PAGES = {3933--3971},
VOLUME = {24},
EDITOR = {Robert Feldt and Thomas Zimmermann},
KEYWORDS = {Topic: <b>Evolution patterns</b>, Venue: <b>EMSE</b>},
PUBLISHER = {Springer},
URL = {http://www.ptidej.net/publications/documents/EMSE19.doc.pdf},
ABSTRACT = {Continuous Integration (CI) is a cornerstone of modern
quality assurance, providing on-demand builds (compilation and tests)
of code changes or software releases. Yet the many existing CI
systems do not help developers in interpreting build results, in
particular when facing build inflation. Build inflation arises when
each code change has to be built on dozens of combinations
(configurations) of runtime environments (REs), operating systems
(OSes), and hardware architectures (HAs). A code change C1 sent to
the CI system may introduce programming faults that cause all
these builds to fail, while a change C2 introducing a new library
dependency might cause only one particular build configuration to
fail. Consequently, the one build failure due to C2 will be
``hidden'' among the dozens of build failures due to C1 when the CI
system reports the results of the builds. We have named this
phenomenon build inflation, because it may bias the interpretation of
build results by developers by ``hiding'' certain types of faults. In
this paper, we study build inflation through a large-scale study of
the relationship between REs and OSes and build failures on 30
million builds of the CPAN repository on the CPAN Testers
package-level CI system. We show that the builds of Perl packages may
fail differently on different REs and OSes and any combination
thereof. Thus, we show that the results provided by CPAN Testers
require filtering and selection to identify real trends of build
failures among the many failures. Manual analysis of 791 build
failures shows that dependency faults (missing modules) and
programming faults (undefined values) are the main reasons for
failures, with dependency faults being easier to fix. We conclude
with recommendations for practitioners and researchers in
interpreting build results as well as for tool builders who should
improve the scheduling of builds and the reporting of build failures.}
}