Towards Long-term and Archivable Reproducibility. Akhlaghi, M., Infante-Sainz, R., Roukema, B. F., Valls-Gabaud, D., & Baena-Gallé, R. June, 2020.
@article{akhlaghi_towards_2020,
	title = {Towards {Long}-term and {Archivable} {Reproducibility}},
	url = {http://arxiv.org/abs/2006.03018v1},
	abstract = {Reproducible workflow solutions commonly use high-level technologies that
were popular when they were created, providing an immediate solution which is
unlikely to be sustainable in the long term. We therefore introduce a set of
criteria to address this problem and demonstrate their practicality and
implementation. The criteria have been tested in several research publications
and can be summarized as: completeness (no dependency beyond a POSIX-compatible
operating system, no administrator privileges, no network connection and
storage primarily in plain text); modular design; minimal complexity;
scalability; verifiable inputs and outputs; temporal provenance; linking
analysis with narrative; and free-and-open-source software. As a proof of
concept, we have implemented "Maneage", a solution which stores the project in
machine-actionable and human-readable plain-text, enables version-control,
cheap archiving, automatic parsing to extract data provenance, and
peer-reviewable verification. We show that requiring longevity of a
reproducible workflow solution is realistic, without sacrificing immediate or
short-term reproducibility and discuss the benefits of the criteria for
scientific progress. This paper has itself been written in Maneage, with
snapshot 1637cce.},
	language = {en},
	urldate = {2020-06-09},
	author = {Akhlaghi, Mohammad and Infante-Sainz, Raúl and Roukema, Boudewijn F. and Valls-Gabaud, David and Baena-Gallé, Roberto},
	month = jun,
	year = {2020},
}
