Session Materials

Class examples and Reproducibility resources

Contact information

Presenter: April Clyburne-Sherin
Email: april@codeocean.com
Slidedeck: MSK Reproducible Data Management Slidedeck

Good enough practices in scientific computing. Wilson G, Bryan J, Cranston K, Kitzes J, Nederbragt L, et al. (2017). PLOS Computational Biology 13(6): e1005510. https://doi.org/10.1371/journal.pcbi.1005510

Ten Simple Rules for Reproducible Computational Research. Sandve GK, Nekrutenko A, Taylor J, Hovig E (2013). PLOS Computational Biology 9(10): e1003285. https://doi.org/10.1371/journal.pcbi.1003285

The Practice of Reproducible Research: Case Studies and Lessons from the Data-Intensive Sciences. Kitzes, J., Turek, D., & Deniz, F. (Eds.). (2018). Oakland, CA: University of California Press. https://www.gitbook.com/book/bids/the-practice-of-reproducible-research/details

Jupyter Notebooks and reproducible data science. Woodbridge M, Sanz D, Mietchen D, & Mounce R (2017).


Installations & Prerequisites

  • A GitHub account is recommended.
  • No installations are necessary.

Session description

Organization

Data preparation & collection:

Centralize your project in a repository:

  • Create one repository that holds all related research files: data, code, notebooks, documentation, etc.

Create directory structure:
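
As a minimal sketch, the layout could be created with a few lines of Python. The folder names are assumptions: the code and results directories are the ones named in the Automation section, and data is added here for the research data mentioned above. The function name `create_project_structure` is hypothetical.

```python
from pathlib import Path

# Assumed top-level folders; adjust the names to fit your project.
SUBDIRECTORIES = ("code", "data", "results")


def create_project_structure(root):
    """Create the assumed sub-directories under root and return their paths."""
    root = Path(root)
    created = []
    for name in SUBDIRECTORIES:
        directory = root / name
        # exist_ok=True makes the call safe to repeat on an existing project.
        directory.mkdir(parents=True, exist_ok=True)
        created.append(directory)
    return created
```

Calling `create_project_structure("my-project")` (a placeholder name) would leave the three folders side by side in one repository root.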

Documentation

Specify run environment:

  • Configure the run environment for your code.
    • Example: Base Environment: Python 3 with Anaconda

Specify dependencies:
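
One way to record dependencies, among several (`conda env export` and `pip freeze` are common alternatives), is to list every installed package with its exact version. The sketch below uses the standard-library `importlib.metadata` module, which assumes Python 3.8 or later; the function name `pinned_requirements` is hypothetical.

```python
from importlib import metadata


def pinned_requirements():
    """Return sorted 'name==version' lines for the installed distributions."""
    lines = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        version = dist.metadata.get("Version")
        # Skip entries whose metadata is incomplete.
        if name and version:
            lines.add(f"{name}=={version}")
    return sorted(lines, key=str.lower)
```

Writing the returned lines to a file such as requirements.txt, one per line, would give readers a pinned list to install from.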

Create a project level README:

Create metadata:

Automation

Create a master script:

  • Create a master script that executes your notebooks in order.
    • Create a file in the code directory.
    • Name the file “run.sh”.
  • Use nbconvert to execute and render your notebooks.
    • In your run.sh script, call nbconvert to execute each notebook and write the rendered output to the results directory.
  • Resource on automation using a driver script: https://www.practicereproducibleresearch.org/core-chapters/3-basic.html
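
The run.sh file described above would hold a shell command line. As a rough sketch in Python (the base environment named earlier), the same nbconvert call can be assembled and run once per notebook, in order. The notebook file names and the function names `nbconvert_command` and `run_notebooks` are hypothetical; `--to`, `--execute`, and `--output-dir` are standard `jupyter nbconvert` options. HTML output is an assumption, not something the session prescribes.

```python
import subprocess
from pathlib import Path


def nbconvert_command(notebook, results_dir):
    """Build the command that executes one notebook and renders it to HTML."""
    return [
        "jupyter", "nbconvert",
        "--to=html",                        # assumed output format
        "--execute",                        # run all cells before rendering
        f"--output-dir={Path(results_dir)}",
        str(notebook),
    ]


def run_notebooks(notebooks, results_dir, dry_run=False):
    """Run the notebooks in the order given; stop at the first failure."""
    commands = [nbconvert_command(nb, results_dir) for nb in notebooks]
    for command in commands:
        if dry_run:
            # Only show what would be executed.
            print(" ".join(command))
        else:
            # check=True raises CalledProcessError if a notebook fails.
            subprocess.run(command, check=True)
    return commands
```

For example, `run_notebooks(["01_clean.ipynb", "02_analyze.ipynb"], "../results")`, launched from the code directory, would execute both placeholder notebooks and place the rendered files in results. Passing `dry_run=True` prints the commands without running them.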

Create relative paths:
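
A short sketch of what relative paths can look like in Python: every location is derived from the code directory rather than written as an absolute path such as /Users/me/project/data. It assumes data and results directories that sit next to the code directory; the function name `project_paths` is hypothetical.

```python
from pathlib import Path


def project_paths(code_dir):
    """Derive the data and results directories from the code directory."""
    # The project root is assumed to be the parent of the code directory.
    project_root = Path(code_dir).resolve().parent
    return {
        "data": project_root / "data",
        "results": project_root / "results",
    }
```

In a script, `project_paths(Path(__file__).parent)` is one way to call it. In a notebook, where `__file__` is not defined, `project_paths(Path.cwd())` works as long as the notebook is run from the code directory. Either way, the project keeps working when the repository is moved or cloned to another machine.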

Dissemination

Specify a licence:

Publish your code and data:

  • Share your repository.
    • Obtain a DOI for your repository and use this link throughout your article.
    • Example: GitHub -> Binder -> Zenodo -> DOI linked in article

Recording

The recording of this session is here