Reproducibility is a major principle underpinning the scientific method, and scientific software is not an exception.
Anaconda is a distribution of the Python and R programming languages for scientific computing with more than 25 million users. But how reproducible is science made with Anaconda? And, most importantly:
Will you be able to reproduce the results of your research 10 years from now?
Currently, the reproducibility of Anaconda environments is not guaranteed. The conda list --explicit command provides only short-term reproducibility.
For example, if you use packages from non-standard channels, the owner could delete them at any moment. Also, the resolved URLs could vary due to changes in package labels or storage.
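To see which channels an environment actually depends on, you can inspect the URLs in the explicit spec file. The following is a minimal sketch and not part of the original workflow: list_spec_channels is a hypothetical helper name, and it assumes each package line has the form <channel-url>/<platform>/<file>, as in the output of conda list --explicit.

```shell
# List the distinct channel URLs referenced by an explicit spec file
# (the output of `conda list --explicit`).
# Assumption: every package line is a full URL shaped like
#   <channel-url>/<platform>/<package-file>[#<md5>]
list_spec_channels() {
    # $1: path to the spec file
    # Keep only URL lines (this skips comments and the @EXPLICIT marker),
    # drop any trailing "#<md5>" fragment, then strip the last two path
    # components (platform and file name), leaving the channel URL.
    grep -E '^https?://' "$1" \
        | sed -E 's/#.*$//; s#/[^/]+/[^/]+$##' \
        | sort -u
}
```

For example, after running conda list --explicit > spec-file.txt, calling list_spec_channels spec-file.txt prints one channel URL per line. Any channel you do not control is a place where packages could disappear later.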
There is an ongoing debate about how to unify the different available tools to solve this problem. In this workflow, I propose a simple but effective way to keep your environments reproducible using GitHub Actions and conda-pack:
conda-pack is a command line tool for creating archives of conda environments that can be installed on other systems and locations. This is useful for deploying code in a consistent environment, potentially where Python and/or conda isn’t already installed.
Every time you publish a new release of your code (e.g. a paper) on GitHub, the environment is solved, packed and uploaded as an asset.
name: pack

on:
  release:
    types: [published]

env:
  BASENAME: ${{ github.event.repository.name }}-${{ github.event.release.tag_name }}

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2

      - name: Setup Mambaforge
        uses: conda-incubator/setup-miniconda@v2
        with:
          miniforge-variant: Mambaforge
          miniforge-version: latest
          environment-file: environment.yml
          activate-environment: my-env
          use-mamba: true

      - name: Freeze packages
        shell: bash -l {0}
        run: conda env export -n my-env > $BASENAME.yml

      - name: Install conda-pack
        shell: bash -l {0}
        run: mamba install -c conda-forge conda-pack

      - name: Pack environment
        shell: bash -l {0}
        run: conda pack -n my-env -o $BASENAME.tar.gz

      - name: Upload assets
        uses: AButler/upload-release-assets@v2.0
        with:
          files: '${{ env.BASENAME }}.{yml,tar.gz}'
          repo-token: ${{ secrets.GITHUB_TOKEN }}
          release-tag: ${{ github.event.release.tag_name }}
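The freeze and pack steps of the workflow can also be run by hand, which is useful for checking that an environment packs cleanly before publishing a release. This is only a sketch: pack_release and the DRY_RUN switch are names invented for this example, and it assumes the environment already exists and that conda-pack is installed and on your PATH.

```shell
# Run the "Freeze packages" and "Pack environment" steps locally.
# Set DRY_RUN=1 to print the commands instead of executing them.
pack_release() {
    env_name="$1"   # environment name, e.g. my-env as in the workflow
    base="$2"       # output base name, e.g. <repository>-<tag>

    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "conda env export -n $env_name > $base.yml"
        echo "conda pack -n $env_name -o $base.tar.gz"
        return 0
    fi

    # Same commands as in the workflow, with the variables quoted.
    conda env export -n "$env_name" > "$base.yml" &&
        conda pack -n "$env_name" -o "$base.tar.gz"
}
```

Calling DRY_RUN=1 pack_release my-env my-paper-v1.0 shows what would be run; dropping DRY_RUN=1 produces my-paper-v1.0.yml and my-paper-v1.0.tar.gz in the current directory.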
Finally, follow the instructions to deploy an identical environment at any point in the future.
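As a rough illustration of what deployment looks like, the usual conda-pack steps are: extract the archive, activate the environment, and run conda-unpack to fix the path prefixes. The sketch below wraps those steps in a function; unpack_env and its arguments are hypothetical names, and it assumes a Linux or macOS machine with bash or zsh and an archive built for the same platform.

```shell
# Unpack a conda-pack archive and make the environment usable in place.
unpack_env() {
    archive="$1"      # e.g. the .tar.gz asset downloaded from the release
    target_dir="$2"   # directory where the environment will live

    mkdir -p "$target_dir" || return 1
    tar -xzf "$archive" -C "$target_dir" || return 1

    # Activating puts <target_dir>/bin on PATH for the current shell.
    . "$target_dir/bin/activate" || return 1

    # conda-unpack ships inside the archive and rewrites the prefixes
    # that were hard-coded when the environment was built.
    conda-unpack
}
```

For example, unpack_env my-paper-v1.0.tar.gz my-env leaves the current shell with the packed environment active. The environment is tied to the directory it was unpacked into, so re-run conda-unpack if you move it.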
Get the code
epassaro / repro-conda-envs
An example repository on how to keep Anaconda environments reproducible in the long term with GitHub Actions