Ten Simple Rules for Making Research Software More Robust

Taschuk, M. & Wilson, G. Ten Simple Rules for Making Research Software More Robust. PLOS Computational Biology, 13(4):e1005412+, April 2017.

[Abstract] Software produced for research, published and otherwise, suffers from a number of common problems that make it difficult or impossible to run outside the original institution or even off the primary developer's computer. We present ten simple rules to make such software robust enough to be run by anyone, anywhere, and thereby delight your users and collaborators.

[Author summary] Many researchers have found out the hard way that there's a world of difference between “works for me on my machine” and “works for other people on theirs.” Many common challenges can be avoided by following a few simple rules; doing so not only improves reproducibility but can accelerate research.

[Excerpt] [...] Best practices in software engineering specifically aim to increase software robustness. However, most bioinformaticians learn what they know about software development on the job or otherwise informally [...]. Existing training programs and initiatives rarely have the time to cover software engineering in depth, especially since the field is so broad and developing so rapidly [...]. In addition, making software robust is not directly rewarded in science, and funding is difficult to come by [...]. Some proposed solutions to this problem include restructuring educational programs, hiring dedicated software engineers [...], partnering with private sector or grassroots organizations [...], or using specific technical tools like containerization or cloud computing [...]. Each of these requires time and, in some cases, institutional change.

The good news is you don't need to be a professionally trained programmer to write robust software. In fact, some of the best, most reliable pieces of software in many scientific communities are written by researchers [...] who have adopted strong software engineering approaches, have high standards of reproducibility, use good testing practices, and foster strong user bases through constantly evolving, clearly documented, useful, and useable software. [...]

So what is “robust” software? We implied above that it is software that works for people other than the original author and on machines other than its creator's. More specifically, we mean that:

[::] it can be installed on more than one computer with relative ease,
[::] it works consistently as advertised, and
[::] it can be integrated with other tools.

Our rules are generic and can be applied to all languages, libraries, packages, documentation styles, and operating systems for both closed-source and open-source software. They are also necessary steps toward making computational research replicable and reproducible: after all, if your tools and libraries cannot be run by others, they cannot be used to verify your results or as a stepping stone for future work [...]

[::] Rule 1: Use version control [...]
[::] Rule 2: Document your code and usage [...]
[::] Rule 3: Make common operations easy to control [...]
[::] Rule 4: Version your releases [...]
[::] Rule 5: Reuse software (within reason) [...]
[::] Rule 6: Rely on build tools and package managers for installation [...]
[::] Rule 7: Do not require root or other special privileges to install or run [...]
[::] Rule 8: Eliminate hard-coded paths [...]
[::] Rule 9: Include a small test set that can be run to ensure the software is actually working [...]
[::] Rule 10: Produce identical results when given identical inputs [...]

[...]
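The excerpt only names the rules, so the following is a minimal sketch of what Rules 3, 8, and 10 might look like in practice; it is not taken from the paper. The function names, the `--input` and `--seed` options, and the default path are all hypothetical choices made for illustration.

```python
import argparse
import random
from pathlib import Path


def parse_args(argv=None):
    """Expose common settings as command-line options (Rule 3)."""
    parser = argparse.ArgumentParser(description="Toy analysis (illustrative only).")
    # Rule 8: the input location is a parameter with a relative default,
    # not an absolute path baked into the code. The default is hypothetical.
    parser.add_argument("--input", type=Path, default=Path("data") / "input.txt")
    # Rule 10: the random seed is explicit, so a run can be repeated exactly.
    parser.add_argument("--seed", type=int, default=0)
    return parser.parse_args(argv)


def sample_scores(seed, n=5):
    """Return n pseudo-random scores from a generator seeded only by `seed`."""
    rng = random.Random(seed)  # private generator; no hidden global state
    return [rng.random() for _ in range(n)]
```

Calling `sample_scores` twice with the same seed returns the same list, and `parse_args(["--input", "x.txt", "--seed", "7"])` yields the supplied path and seed rather than values fixed in the source. A real tool would also record the seed and input path alongside its output.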

@article{taschukTenSimpleRules2017,
  title = {Ten Simple Rules for Making Research Software More Robust},
  author = {Taschuk, Morgan and Wilson, Greg},
  date = {2017-04},
  journaltitle = {PLOS Computational Biology},
  volume = {13},
  pages = {e1005412+},
  issn = {1553-7358},
  doi = {10.1371/journal.pcbi.1005412},
  url = {https://doi.org/10.1371/journal.pcbi.1005412},
  abstract = {[Abstract]

Software produced for research, published and otherwise, suffers from a number of common problems that make it difficult or impossible to run outside the original institution or even off the primary developer's computer. We present ten simple rules to make such software robust enough to be run by anyone, anywhere, and thereby delight your users and collaborators.

[Author summary]

Many researchers have found out the hard way that there's a world of difference between “works for me on my machine” and “works for other people on theirs.” Many common challenges can be avoided by following a few simple rules; doing so not only improves reproducibility but can accelerate research.

[Excerpt] [] [...] Best practices in software engineering specifically aim to increase software robustness. However, most bioinformaticians learn what they know about software development on the job or otherwise informally [...]. Existing training programs and initiatives rarely have the time to cover software engineering in depth, especially since the field is so broad and developing so rapidly [...]. In addition, making software robust is not directly rewarded in science, and funding is difficult to come by [...]. Some proposed solutions to this problem include restructuring educational programs, hiring dedicated software engineers [...], partnering with private sector or grassroots organizations [...], or using specific technical tools like containerization or cloud computing [...]. Each of these requires time and, in some cases, institutional change.

[] The good news is you don't need to be a professionally trained programmer to write robust software. In fact, some of the best, most reliable pieces of software in many scientific communities are written by researchers [...] who have adopted strong software engineering approaches, have high standards of reproducibility, use good testing practices, and foster strong user bases through constantly evolving, clearly documented, useful, and useable software. [...]

[] So what is “robust” software? We implied above that it is software that works for people other than the original author and on machines other than its creator's. More specifically, we mean that:

[::] it can be installed on more than one computer with relative ease, [::] it works consistently as advertised, and [::] it can be integrated with other tools.

[] Our rules are generic and can be applied to all languages, libraries, packages, documentation styles, and operating systems for both closed-source and open-source software. They are also necessary steps toward making computational research replicable and reproducible: after all, if your tools and libraries cannot be run by others, they cannot be used to verify your results or as a stepping stone for future work [...]

[::] Rule 1: Use version control [...] [::] Rule 2: Document your code and usage [...] [::] Rule 3: Make common operations easy to control [...] [::] Rule 4: Version your releases [...] [::] Rule 5: Reuse software (within reason) [...] [::] Rule 6: Rely on build tools and package managers for installation [...] [::] Rule 7: Do not require root or other special privileges to install or run [...] [::] Rule 8: Eliminate hard-coded paths [...] [::] Rule 9: Include a small test set that can be run to ensure the software is actually working [...] [::] Rule 10: Produce identical results when given identical inputs [...] [] [...]},
  keywords = {*imported-from-citeulike-INRMM,~INRMM-MiD:c-14337051,bias-disembodied-science-vs-computational-scholarship,check-list,computational-science,free-scientific-knowledge,reproducible-research,software-engineering,software-uncertainty},
  number = {4}
}
