jooh / neuroconda

Conda environment for neuroimaging analysis in python, R, etc

A Conda environment for neuroscience. This is a very inclusive environment that covers pretty much all neuroimaging-related packages you might want to use. Please let us know if you want to see any additional packages.

The idea with this repo is to provide an open specification of the computing environment that was used to run a particular analysis. If you report that you used a particular release of this environment in your manuscript, you are providing a fairly complete description of your analysis software. And if you share analysis code, there is a much better chance that someone else will be able to run it and reproduce your results.

Usage

If you've never used conda before, you may first have to run conda init. Then activate the environment:

conda activate neuroconda_2_0

As convenient as it may be, activating the environment in your shell login script is not recommended. The environment shadows system libraries, which can cause conflicts with e.g. vncserver and other packages outside the environment.

CBU users

Users at MRC CBU may want to use shell script wrappers for activating neuroconda. These also take care of adding various non-conda dependencies to the path (e.g., Matlab, SPM12, ANTs, FSL, Freesurfer). If you are not at CBU, you may find it useful to write your own versions of these wrappers.

source neuroconda.csh

Or if you are using sh-derived shells like bash:

source neuroconda.sh

We currently don't supply deactivation wrapper scripts (PRs welcome!). The standard conda deactivate route takes care of conda packages but leaves the non-conda dependencies on your path, so it's probably safest to start a fresh shell session every time you want to switch neuroconda versions.

Install

The recommended install route is through make.

To install in a custom location, set the PREFIX environment variable, e.g. in bash, PREFIX=~/temp/ make. Note that the neuroconda environment will be created inside this directory (unlike the conda prefix argument, which is a full path to the desired install location).
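
The difference between the two kinds of prefix can be sketched as follows. This is only an illustration: the environment name is an assumption borrowed from the activation example in Usage, and the directory is a hypothetical choice.

```shell
# Hypothetical parent directory passed to make
PREFIX="$HOME/temp/"
# Assumed environment name (taken from the Usage example, may differ per release)
env_name="neuroconda_2_0"
# Strip any trailing slash before joining the two parts
env_dir="${PREFIX%/}/$env_name"

# make route: PREFIX names the parent directory
echo "PREFIX=$PREFIX make  -> environment expected in $env_dir"
# plain conda route: -p names the environment directory itself
echo "conda env create -f neuroconda.yml -p $env_dir"
```

An environment installed outside conda's default locations can then be activated by passing its full path to conda activate.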

The make install route takes care of some basic setup, including a fix for Pycortex (see below) and enabling the jupyterlab code formatter extension. Alternatively, you can create the environment as usual with conda:

conda env create -f neuroconda.yml

Pycortex initial configuration

If you don't follow the make install route you will have problems with pycortex, which looks for file paths in the (invalid) build directory instead of the final install directory. Work around this by first importing pycortex to generate the default config, and then editing it to look for the subject database and colormaps in the correct location (note that if you are using this in a centralised install at e.g. CBU, you may want the subject database to be somewhere you have write access instead):

python -c "import cortex"
sed -i 's@build/bdist.linux-x86_64/wheel/pycortex-.*data/data@'"$CONDA_PREFIX"'@g' ~/.config/pycortex/options.cfg
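
To see what the substitution does before editing your config in place, you can run the same sed expression on a single line and print the result. The sample line and the fallback prefix below are hypothetical stand-ins; the real entries in options.cfg may look different.

```shell
# Hypothetical config entry pointing at the invalid build directory
sample='filestore = build/bdist.linux-x86_64/wheel/pycortex-1.0.2.data/data/share/pycortex/db'
# Use the active environment if there is one, else a hypothetical path
prefix="${CONDA_PREFIX:-/opt/conda/envs/neuroconda_2_0}"

# Same expression as above, but without -i, so nothing is modified on disk
fixed=$(printf '%s\n' "$sample" |
    sed 's@build/bdist.linux-x86_64/wheel/pycortex-.*data/data@'"$prefix"'@g')
printf '%s\n' "$fixed"
```

Everything up to and including the data/data component is replaced by the prefix, so the remainder of the path (here share/pycortex/db) ends up inside the environment directory.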

Suggested non-conda dependencies

To make full use of the packages in the environment (especially nipype), you may want the following on your system path:

  • SPM / Matlab
  • ANTs
  • Freesurfer
  • FSL

In past releases we used conda's env_vars.{sh,csh} functionality to add non-conda packages to the path, but this part of conda is profoundly broken at the moment (it does not work at all under csh, and restoring the path during deactivate doesn't work). The workaround for now is to create a shell script wrapper that takes care of adding non-conda packages to the path before activating the environment. For examples that we use at CBU, see neuroconda.sh and neuroconda.csh.
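
If you are writing your own wrapper, a minimal sketch for sh-derived shells might look like the following. It is not the CBU script: the helper name prepend_path and all install locations are hypothetical, and the file is meant to be sourced rather than executed.

```shell
# Prepend a directory to PATH if it exists and is not already listed.
prepend_path() {
    if [ -d "$1" ]; then
        case ":$PATH:" in
            *":$1:"*) ;;              # already on the path, nothing to do
            *) PATH="$1:$PATH" ;;
        esac
    fi
}

# Hypothetical install locations; adjust these for your site.
prepend_path /opt/fsl/bin
prepend_path /opt/ants/bin
prepend_path /opt/freesurfer/bin
export PATH

# A real wrapper would activate the environment here, after the path is set:
#   conda activate neuroconda_2_0
```

Skipping missing directories keeps the wrapper usable on machines where only some of the tools are installed.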

Dealing with firewall issues with HTTPS / SSL connections in git, conda, urllib3

If, like us, you are unlucky enough to sit behind a firewall with HTTPS inspection, you will need to set a few environment variables to get HTTPS connectivity for git and packages that depend on urllib3 / requests. I recommend setting REQUESTS_CA_BUNDLE and GIT_SSL_CAINFO to point to your site-specific certificate. You may also want to add your certificate to the ssl_verify option in your .condarc file.
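
For sh-derived shells, the settings could look something like this. The certificate path and the SITE_CA_CERT variable are hypothetical placeholders; ask your IT department where the real certificate bundle lives.

```shell
# Hypothetical location of your site's CA certificate bundle
SITE_CA_CERT="${SITE_CA_CERT:-/etc/ssl/certs/site-ca.pem}"

# Used by requests (and so by most packages built on urllib3)
export REQUESTS_CA_BUNDLE="$SITE_CA_CERT"
# Used by git for HTTPS remotes
export GIT_SSL_CAINFO="$SITE_CA_CERT"

# For conda, the equivalent entry in ~/.condarc would be:
#   ssl_verify: /etc/ssl/certs/site-ca.pem
```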

FAQ

  • Can I use neuroconda on Mac or Windows? No. We use multiple packages that are only available under Linux on Conda. You could probably put the environment into a Neurodocker container though.
  • I can't find package X. Pull requests are welcome! We aim for inclusivity, so barring conflicting dependencies, anything neuro-related goes.
  • This is not how you're meant to use environments. That's not a question, but you're right. If you're a developer you probably want to use a separate environment for each project you work on rather than a single monolith. But if you're a data analyst, you may value productivity and easy reproducibility over control over the exact package versions you use. Neuroconda is aimed at the latter group, much like Anaconda.
  • Are neuroconda environments fully reproducible? We try to get as close to full reproducibility as we can given that the environment is built from external sources (mainly conda-forge and pypi). We pin versions of all installed packages, but not builds, since these have a tendency to disappear from conda-forge over time, leading to broken environments. Reproducibility is limited by the fact that nothing stops the external source from changing what code a given version corresponds to the next time you build the environment. If you want stronger guarantees of exact reproducibility, you probably need to bundle the environment into a container image, which would also take care of any non-conda dependencies. The tradeoff is that you then have to work inside a container.

Problems

Please contact Johan Carlin or open an issue.

For developers

Contributions are welcome! The basic design of neuroconda is to list desired packages in neuroconda_basepackages.yml with minimal version pinning. The Makefile then takes care of constructing a new neuroconda.yml by building an environment and exporting it with pinned versions (but no builds, because these tend to go missing on conda-forge). The benefits of this two-yml design are that 1) updating is a lot faster than simply doing conda update --all in the full environment (and less prone to conflicts), and 2) by distinguishing required base packages from dependencies we can also prune packages that are no longer a dependency of a base package on update.
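
The two-yml idea can be sketched with plain conda commands. This is an assumption about the general approach, not the repository's actual Makefile recipe; the function name and the scratch environment name are hypothetical.

```shell
# Sketch only: not the actual Makefile recipe.
regenerate_spec() {
    tmp_env="neuroconda_tmp"   # hypothetical scratch environment name
    # Solve the loosely pinned base list into a full environment
    conda env create -n "$tmp_env" -f neuroconda_basepackages.yml || return 1
    # Export every resolved package with its version but without build strings
    conda env export -n "$tmp_env" --no-builds > neuroconda.yml || return 1
    # Remove the scratch environment again
    conda env remove -n "$tmp_env" -y
}
```

Because the full specification is always rebuilt from the base list, a dependency that no base package needs any more simply does not appear in the next export.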

Adding a package to neuroconda

If you just want to add a new package, take the following steps:

  1. Add the package to neuroconda_basepackages.yml, ideally without any version pinning.
  2. Run make update to generate a new neuroconda.yml file (including all dependencies) from neuroconda_basepackages.yml.
  3. Use e.g. git diff to check that the new neuroconda.yml does not contain any new pip packages that could have been installed with conda instead (this happens when a pip package has a dependency that wasn't already satisfied by the conda packages). If so, add them to the list in neuroconda_basepackages.yml and repeat the update process. We try to use conda packages whenever possible.
  4. Conversely, check that conda doesn't uninstall a conda package in order to install a newer pip package. This happens when a pip package requires a newer version than is available on conda-forge. In this case, move the package to the pip section in neuroconda_basepackages.yml and make a note of this (we may try to move it back later).
  5. Commit, push and submit a pull request.
  6. The maintainer merges and cuts a new release, after incrementing the version in neuroconda_basepackages.yml and README.md.
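
For the pip check in step 3, a small helper that lists only the pip entries of an environment file can make the comparison easier than reading the full diff. This is a hypothetical convenience, not part of the repository; it assumes the usual conda export layout, where pip packages are listed in a nested block under a "- pip:" entry.

```shell
# Print the packages listed under the "- pip:" entry of a conda environment file.
list_pip_packages() {
    awk '
        /^ *- pip: *$/ {
            match($0, /^ */); pip_indent = RLENGTH   # indentation of "- pip:"
            in_pip = 1
            next
        }
        in_pip {
            match($0, /^ */)
            # pip entries are list items indented deeper than "- pip:"
            if (RLENGTH > pip_indent && $0 ~ /^ *- /) {
                sub(/^ *- */, "")
                print
                next
            }
            in_pip = 0   # first line at shallower indentation ends the block
        }
    ' "$1"
}
```

Running list_pip_packages neuroconda.yml before and after make update and comparing the two listings shows which packages pip pulled in.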

Maintaining large conda environments is hard because the conda solver continues to exhibit performance issues. This bioconda issue has some useful suggestions for workarounds, as does this continuum blog post. I use the pycryptosat sat_solver in my .condarc, which seems to help a bit.

Other worthwhile contributions

  • deactivation shell wrapper scripts
  • tests (probably just try importing a few packages that are known to be tricky or have implicit dependencies, e.g. tensorflow, pycortex)
  • neurodocker container
  • CI
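
As a starting point for the tests item, an import smoke test could be as simple as the sketch below. The function name and the PYTHON override are hypothetical; the module list is up to you.

```shell
# Try to import each named Python module and report the result.
# Returns non-zero if any import fails.
check_imports() {
    status=0
    for module in "$@"; do
        if "${PYTHON:-python}" -c "import $module" >/dev/null 2>&1; then
            echo "ok: $module"
        else
            echo "FAILED: $module"
            status=1
        fi
    done
    return "$status"
}
```

With the environment activated, check_imports tensorflow cortex would cover the two tricky packages mentioned above (pycortex is imported as cortex).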

License: MIT License

