Reference environments: A universal tool for reproducibility in computational biology

Reproducibility of scientific results, or the lack of it, has received increasing attention in recent years. Computational studies should, by their nature, be among the most reproducible. In practice, however, reproducing computational results often proves challenging, even when the code is made available. The need for standards governing the reproducibility of claims based on computational results is now clear to researchers, but there is still a great deal of debate about where responsibility for checking reproducibility lies, and about which tools and approaches are appropriate for ensuring that a computational result can be reproduced.

Many technologies exist to support and promote the reproduction of computational results: containerisation tools such as Docker; literate programming approaches such as Sweave, knitr and IPython; and cloud environments such as Amazon Web Services. However, these technologies are tied to specific programming languages (e.g. Sweave/knitr to R; IPython to Python) or to platforms (e.g. Docker to 64-bit Linux environments only). To date, no single approach spans the broad range of technologies and platforms represented in computational biology and biotechnology.

In our new preprint “Reference environments: A universal tool for reproducibility in computational biology”, now available on arXiv, we demonstrate an approach and provide a set of tools that is suitable for all computational work and is not tied to a particular programming language or platform. We illustrate this approach, which we call ‘Reference Environments’, using examples from a number of published papers in different areas of computational biology, spanning the major languages and technologies in the field (Python/R/MATLAB/Fortran/C/Java).

[Figure: architecture diagram]

The Reference Environments approach provides a transparent and flexible process for replication and recomputation of results. Ultimately, the most valuable aspect of this approach is the decoupling of methods in computational biology from their implementation. Separating the ‘how’ (method) of a publication from the ‘where’ (implementation) promotes genuinely open science and benefits the scientific community as a whole.

Read it here:

Daniel G. Hurley, Joseph Cursons, Matthew Faria, David M. Budden, Vijay Rajagopal, Edmund J. Crampin
Reference environments: A universal tool for reproducibility in computational biology
arXiv:1810.03766
