A Study of Build Inflation in 30 Million CPAN Builds on 13 Perl Versions and 10 Operating Systems

A Study of Build Inflation in 30 Million CPAN Builds on 13 Perl Versions and 10 Operating Systems. Zolfagharinia, M., Adams, B., & Guéhéneuc, Y. Empirical Software Engineering (EMSE), 24(6):3933–3971, Springer, June, 2019. 38 pages.


Continuous Integration (CI) is a cornerstone of modern quality assurance, providing on-demand builds (compilation and tests) of code changes or software releases. Yet the many existing CI systems do not help developers in interpreting build results, in particular when facing build inflation. Build inflation arises when each code change has to be built on dozens of combinations (configurations) of runtime environments (REs), operating systems (OSes), and hardware architectures (HAs). A code change C1 sent to the CI system may introduce programming faults that cause all these builds to fail, while a change C2 introducing a new library dependency might cause only one particular build configuration to fail. Consequently, the one build failure due to C2 will be "hidden" among the dozens of build failures due to C1 when the CI system reports the results of the builds. We have named this phenomenon build inflation, because it may bias the interpretation of build results by developers by "hiding" certain types of faults. In this paper, we study build inflation through a large-scale study of the relationship between REs and OSes and build failures on 30 million builds of the CPAN repository on the CPAN Testers package-level CI system. We show that the builds of Perl packages may fail differently on different REs and OSes and any combination thereof. Thus, we show that the results provided by CPAN Testers require filtering and selection to identify real trends of build failures among the many failures. Manual analysis of 791 build failures shows that dependency faults (missing modules) and programming faults (undefined values) are the main reasons for failures, with dependency faults being easier to fix. We conclude with recommendations for practitioners and researchers in interpreting build results as well as for tool builders who should improve the scheduling of builds and the reporting of build failures.
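The C1/C2 scenario in the abstract can be illustrated with a minimal sketch. It is not the paper's method or the CPAN Testers API: the build matrix (three Perl versions, two OSes), the record layout, and the names `run_builds` and `summarize` are all assumptions made for illustration. The sketch groups failures per change and separates those that fail on every configuration from those that fail on only some, so the single C2 failure is not lost in the raw failure count.

```python
from collections import defaultdict
from itertools import product

# Hypothetical build matrix (not from the paper): 3 Perl versions x 2 OSes.
PERLS = ["5.8", "5.10", "5.12"]
OSES = ["linux", "freebsd"]
CONFIGS = list(product(PERLS, OSES))


def run_builds():
    """Return illustrative (change, config, passed) records."""
    records = []
    for config in CONFIGS:
        # C1: a programming fault that fails on every configuration.
        records.append(("C1", config, False))
        # C2: a dependency fault that fails on one configuration only.
        records.append(("C2", config, config != ("5.8", "freebsd")))
    return records


def summarize(records):
    """Group failures per change and label how widely each change fails."""
    failed = defaultdict(list)
    for change, config, passed in records:
        if not passed:
            failed[change].append(config)
    summary = {}
    for change, configs in failed.items():
        if len(configs) == len(CONFIGS):
            kind = "all configurations"
        else:
            kind = "configuration-specific"
        summary[change] = (len(configs), kind, configs)
    return summary


records = run_builds()
# Raw view: one flat count in which the C2 failure is easy to overlook.
total_failures = sum(1 for _, _, passed in records if not passed)
# Filtered view: failures grouped by change and by breadth of failure.
summary = summarize(records)
for change, (count, kind, configs) in sorted(summary.items()):
    print(f"{change}: {count} failed build(s), {kind}")
```

In the flat count, six of the seven failures belong to C1. The grouped view reports C2 separately as a configuration-specific failure, which is the kind of filtering and selection the abstract argues build results need.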

@ARTICLE{Zolfagharinia19-EMSE-BuildInflation,
   AUTHOR       = {Mahdis Zolfagharinia and Bram Adams and 
      Yann-Gaël Guéhéneuc},
   JOURNAL      = {Empirical Software Engineering (EMSE)},
   TITLE        = {A Study of Build Inflation in 30 Million CPAN Builds on 
      13 Perl Versions and 10 Operating Systems},
   YEAR         = {2019},
   MONTH        = {June},
   NOTE         = {38 pages.},
   NUMBER       = {6},
   PAGES        = {3933--3971},
   VOLUME       = {24},
   EDITOR       = {Robert Feldt and Thomas Zimmermann},
   KEYWORDS     = {Topic: Evolution patterns, Venue: EMSE},
   PUBLISHER    = {Springer},
   URL          = {http://www.ptidej.net/publications/documents/EMSE19.doc.pdf},
   ABSTRACT     = {Continuous Integration (CI) is a cornerstone of modern 
      quality assurance, providing on-demand builds (compilation and tests) 
      of code changes or software releases. Yet the many existing CI 
      systems do not help developers in interpreting build results, in 
      particular when facing build inflation. Build inflation arises when 
      each code change has to be built on dozens of combinations 
      (configurations) of runtime environments (REs), operating systems 
      (OSes), and hardware architectures (HAs). A code change C1 sent to 
      the CI system may introduce programming faults that cause all 
      these builds to fail, while a change C2 introducing a new library 
      dependency might cause only one particular build configuration to 
      fail. Consequently, the one build failure due to C2 will be 
      ``hidden'' among the dozens of build failures due to C1 when the CI 
      system reports the results of the builds. We have named this 
      phenomenon build inflation, because it may bias the interpretation of 
      build results by developers by ``hiding'' certain types of faults. In 
      this paper, we study build inflation through a large-scale study of 
      the relationship between REs and OSes and build failures on 30 
      million builds of the CPAN repository on the CPAN Testers 
      package-level CI system. We show that the builds of Perl packages may 
      fail differently on different REs and OSes and any combination 
      thereof. Thus, we show that the results provided by CPAN Testers 
      require filtering and selection to identify real trends of build 
      failures among the many failures. Manual analysis of 791 build 
      failures shows that dependency faults (missing modules) and 
      programming faults (undefined values) are the main reasons for 
      failures, with dependency faults being easier to fix. We conclude 
      with recommendations for practitioners and researchers in 
      interpreting build results as well as for tool builders who should 
      improve the scheduling of builds and the reporting of build failures.}
}
