Reproducibility of scientific results, or lack thereof, has received increasing attention over recent years. Computational studies, by their nature, should be amongst the most reproducible. However, reproducing computational results often proves to be a challenge, even when code is made available. The need to adopt standards for the reproducibility of claims based on computational results is now clear to researchers. There is still a great deal of debate, however, about where responsibility for checking reproducibility lies, and about the appropriate tools and approaches for ensuring that a computational result is reproducible.
Many technologies exist to support and promote reproduction of computational results: containerisation tools like Docker; literate programming approaches such as Sweave, knitr and IPython; and cloud environments like Amazon Web Services. However, these technologies are tied to specific programming languages (e.g. Sweave/knitr to R; IPython to Python) or to platforms (e.g. Docker for 64-bit Linux environments only). To date, no single approach is able to span the broad range of technologies and platforms represented in computational biology and biotechnology.
In our recent preprint “Reference environments: A universal tool for reproducibility in computational biology”, now available on arXiv, we demonstrate an approach, and provide a set of tools, suitable for all computational work and not tied to a particular programming language or platform. We illustrate this approach, which we call ‘Reference Environments’, using examples from a number of published papers in different areas of computational biology, spanning the major languages and technologies in the field (Python, R, MATLAB, Fortran, C and Java).
The Reference Environments approach provides a transparent and flexible process for replication and recomputation of results. Ultimately, the most valuable aspect of this approach is the decoupling of methods in computational biology from their implementation. Separating the ‘how’ (method) of a publication from the ‘where’ (implementation) promotes genuinely open science and benefits the scientific community as a whole.
Read it here:
Daniel G. Hurley, Joseph Cursons, Matthew Faria, David M. Budden, Vijay Rajagopal, Edmund J. Crampin
Reference environments: A universal tool for reproducibility in computational biology