MeteoSwiss-APN / pyflexplot

Python FLEXPART Plotting

Use conda for installation

ruestefa opened this issue

  • PyFlexPlot version: v0.15.5

Description

So far, pyflexplot has been using the standard venv module to create virtual environments during installation. This requires some workarounds for packages depending on system libraries, namely cartopy (and its dependency shapely) depending on geos and proj. To build against the installed geos and proj versions, cartopy and shapely are installed from github (and cartopy has additionally been forked to add a pyproject.toml to enable seamless installation with pip).

Solution

The dependency problems can be solved by switching from "pip+venv" to conda (specifically, miniconda), which handles not only python dependencies, but also their non-python dependencies.

Potential drawback: The resulting virtual environments are substantially larger and may reach multiple GBs each. However, conda hard-links identical package files from its shared package cache into the environments where possible (i.e., when both are on the same filesystem), so the actual disk usage should be much lower and this should not matter.
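
Whether a given file is actually shared can be checked from its hard-link count. A minimal sketch, assuming GNU coreutils stat (as on Linux); the helper name link_count is made up for illustration, and the example path is the one that appears in the session further down:

```shell
# Hypothetical helper: print the number of hard links pointing to a file.
# Uses the "%h" format of GNU stat (number of hard links).
link_count() {
    stat -c '%h' "$1"
}

# Example (not run here): a count > 1 indicates that the file is shared,
# e.g. between conda's package cache and one or more environments.
# link_count /scratch/ruestefa/miniconda3/envs/pyflexplot.dev/lib/python3.8/site-packages/cartopy/__init__.py
```

A count of 1 does not necessarily indicate a problem: conda falls back to copying files when hard-linking is not possible.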

Tasks

  • Make conda available on tsa (ask Dani)
  • Figure out how to use conda with pyflexplot (for both development and deployment)
  • Adapt makefile for conda
  • Merge adapted makefile into blueprint (in parallel or as an alternative to "venv+pip"-makefile)

Conda on tsa

Test installation

  • Dani has installed miniconda on his /scratch for testing
    • Problem: I cannot create envs because I don't have write permissions there!
  • Research conda for multiple users and configuring a multiuser environment
    • Looks like a multiuser installation requires root and a designated group; not sure that's the way to go for OSM
  • For testing, I'll just use my conda installation on /project/s1063 (like on daint)

Outlook

  • For deployment, a standard single-user OSM installation is sufficient
  • Other users best just install their own miniconda if they want to use conda
  • Maybe a central standard environment could eventually be provided by OSM to cover standard use cases (like at IAC)

Create a conda environment

Try conda env create

  • Try conda env create -n pyflexplot python=3.8
    • Fails (also with variations of the name in -n .<name> and w/o python=...)
  • Github issue reveals that what I want is conda create ..., not conda env create
    • The latter is to create environments from .yml files (as far as I can tell)

Try conda create

  • Try conda create -n pyflexplot.test python=3.8 git+ssh://git@github.com/meteoswiss-apn/pyflexplot.git
    • Fails b/c the github package is not found in any channel
  • Stackoverflow: Cannot install from github, at least not this way
    • Could use pip inside conda, but then the dependencies would be installed with pip, defeating the purpose of conda
    • Alternatively, the environment can be specified in a .yml file, which can contain git links, and installed with conda env create
      • However, that file also needs to contain the dependencies (those in setup.py) for them to be installed by conda
  • Cleanest (i.e., least disruptive) approach looks to be working with a requirements file
    • Those can be used with conda create with --file
    • Move unpinned deps from setup.py to requirements/run-unpinned.txt and read the latter in the former to avoid duplication

Prospective workflow

  • Create environment:
    • Either with conda create, specifying the dependencies with an (unpinned or pinned) requirements file with --file;
    • Or with conda env create, using a previously exported environment.yml file
  • Install pyflexplot with pip
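
Both variants of this workflow can be sketched as shell functions. The function names are hypothetical, python=3.8 is taken from the attempts above, and conda run is used here as one possible way to call pip inside the new environment without activating it (an assumption; it is not part of the notes above):

```shell
# Hypothetical helper: create an environment from a requirements file,
# then install the package in the current directory into it with pip.
# Usage: create_env_from_requirements <env-name> <requirements-file>
create_env_from_requirements() {
    local env_name="$1"
    local req_file="$2"
    # conda resolves and installs python plus all listed dependencies
    conda create --yes --name "${env_name}" python=3.8 --file "${req_file}" || return 1
    # pip only needs to add the package itself (editable, for development)
    conda run --name "${env_name}" python -m pip install -e .
}

# Hypothetical helper: same, but from a previously exported environment file.
# Usage: create_env_from_yml <env-name> <environment-file>
create_env_from_yml() {
    local env_name="$1"
    local env_file="$2"
    conda env create --name "${env_name}" --file "${env_file}" || return 1
    conda run --name "${env_name}" python -m pip install -e .
}
```

For a deployment, the editable flag -e would presumably be dropped.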

Revisit setup.py vs. requirements.txt

  • Unpinned top-level runtime dependencies should be defined in setup.py, not in a requirements file that is read into setup.py
  • Briefly research this topic again for a refresher
  • Why not to read requirements.txt into setup.py: The former contains all recursive pinned dependencies for reproducible deployment, the latter the unpinned top-level dependencies for regular installation
    • So the argument is not so much about the files setup.py vs. requirements.txt per se, but about recursive pinned vs. top-level unpinned dependencies
  • How to read a requirements file: Use pkg_resources.parse_requirements (even if it should not be done)
  • Manage requirements files with pip-tools: Manually write requirements.in file (unpinned top-level deps), then generate requirements.txt file (recursive pinned deps) with pip-compile
    • Given there is no accepted convention and pip-tools is quite ubiquitous, it makes sense to adopt the .in vs. .txt convention for unpinned vs. pinned deps (even if the latter are produced manually with pip freeze instead of pip-compile)

Conclusions

  • Reading requirements.txt in setup.py is primarily discouraged to avoid using pinned dependencies during regular installation
  • If the requirements are read in setup.py from a file, they should be top-level unpinned deps and read with pkg_resources.parse_requirements
  • Following the pip-tools convention, unpinned/pinned requirements files should be named requirements.in/requirements.txt
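
Under this convention, requirements.txt should contain only pinned entries, while requirements.in normally contains none. A small sanity check, as a sketch: the helper name list_unpinned is hypothetical, and it assumes pip-style "==" pins as written by pip freeze or pip-compile. Option lines such as "-r other.txt" would be reported as well.

```shell
# Hypothetical helper: print all requirement lines that are not pinned with "==".
# Blank lines and comment lines are skipped.
# Usage: list_unpinned <requirements-file>
list_unpinned() {
    grep -Ev '^[[:space:]]*(#|$)' "$1" | grep -v '==' || true
}

# Usage (not run here): empty output means that every entry is pinned.
#   list_unpinned requirements/requirements.txt
```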

Try again

  • Adapt requirements file names and add requirements.in
  • Problem: /project (where my conda installation resides) is currently not available (CSCS maintenance)!
  • Take the opportunity to install miniconda from scratch to /scratch (pun intended), so it's documented here
    • See "Install miniconda" below

...and again

  • Try conda create -n pyflexplot.dev python=3.8 --file=requirements/requirements.in

    • Works, taking about 1.5 min
  • Try out env:

    $ echo $CONDA_DEFAULT_ENV $CONDA_PREFIX
    
    $ conda activate pyflexplot.dev                                                                                                                    
    (conda:pyflexplot.dev) $ echo $CONDA_DEFAULT_ENV                                                                                                                          
    pyflexplot.dev
    (conda:pyflexplot.dev) $ echo $CONDA_PREFIX
    /scratch/ruestefa/miniconda3/envs/pyflexplot.dev
    
    • Note: Added "conda:" in prompt modifier before the env name by changing "env_prompt" in ~/.condarc:

      $ cat ~/.condarc
      report_errors: true
      channels:
        - conda-forge
        - defaults
      channel_priority: strict
      env_prompt: '(conda:{default_env}) '
      auto_activate_base: false
      
  • Use env without activating it:

    (conda:pyflexplot.dev) $ python -V
    Python 3.8.10
    (conda:pyflexplot.dev) $ conda deactivate
    $ python -V
    Python 3.7.4
    $ /scratch/ruestefa/miniconda3/envs/pyflexplot.dev/bin/python -V                                                                             
    Python 3.8.10
    $ /scratch/ruestefa/miniconda3/envs/pyflexplot.dev/bin/python -c 'import cartopy; print(cartopy.__file__)'
    /scratch/ruestefa/miniconda3/envs/pyflexplot.dev/lib/python3.8/site-packages/cartopy/__init__.py
    
    • Works! Except that pyflexplot itself is not available yet, because so far only its dependencies have been installed...
  • Install pyflexplot with pip (in editable mode for development):

    $ /scratch/ruestefa/miniconda3/envs/pyflexplot.dev/bin/python -m pip install -e .
    ...
    $ /scratch/ruestefa/miniconda3/envs/pyflexplot.dev/bin/pyflexplot -V
    0.15.5
    
    • Works like a charm!
    • Change __version__ temporarily in src/pyflexplot/__init__.py and run .../pyflexplot -V again: Editable mode works!
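
Calling executables by their full path, as done above, can be wrapped in a small helper. This is a sketch with a hypothetical function name. It works for pip-installed commands because their entry-point scripts refer to the environment's python by absolute path. Note that activation scripts shipped by some conda packages (which may set environment variables) are not run this way.

```shell
# Hypothetical helper: run an executable from an environment's bin directory
# without activating the environment.
# Usage: run_in_env <env-prefix> <executable> [args...]
run_in_env() {
    local prefix="$1"
    local exe="$2"
    shift 2
    "${prefix}/bin/${exe}" "$@"
}

# Example (not run here; path taken from the session above):
#   run_in_env /scratch/ruestefa/miniconda3/envs/pyflexplot.dev pyflexplot -V
```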

Install miniconda

  • On daint (/project), I've installed it with easybuild, which is not available on tsa (AFAIK)
  • Follow installation instructions

    $ wget https://repo.anaconda.com/miniconda/Miniconda3-py39_4.9.2-Linux-x86_64.sh
    $ sha256sum Miniconda3-py39_4.9.2-Linux-x86_64.sh | grep -q '^536817d1b14cb1ada88900f5be51ce0a5e042bae178b5550e62f61e223deae7c' && echo OK
    $ bash Miniconda3-py39_4.9.2-Linux-x86_64.sh
    Welcome to Miniconda3 py39_4.9.2

    In order to continue the installation process, please review the license
    agreement.
    Please, press ENTER to continue
    >>>
    ...
    Do you accept the license terms? [yes|no]
    [no] >>> yes

    Miniconda3 will now be installed into this location:
    /users/ruestefa/miniconda3

      - Press ENTER to confirm the location
      - Press CTRL-C to abort the installation
      - Or specify a different location below

    [/users/ruestefa/miniconda3] >>> /scratch/ruestefa/miniconda3
    PREFIX=/scratch/ruestefa/miniconda3
    Unpacking payload ...
    Collecting package metadata (current_repodata.json): done
    Solving environment: done
    ...
    Preparing transaction: done
    Executing transaction: done
    installation finished.
    Do you wish the installer to initialize Miniconda3
    by running conda init? [yes|no]
    [no] >>> no

    You have chosen to not have conda modify your shell scripts at all.
    To activate conda's base environment in your current shell session:

    eval "$(/scratch/ruestefa/miniconda3/bin/conda shell.YOUR_SHELL_NAME hook)"

    To install conda's shell functions for easier access, first activate, then:

    conda init

    If you'd prefer that conda's base environment not be activated on startup,
       set the auto_activate_base parameter to false:

    conda config --set auto_activate_base false

    Thank you for installing Miniconda3!

  • The code to initialize conda was not appended to ~/.bashrc because I answered "no"

    • Users who always want to activate it and don't care too much about their bashrc should just answer "yes"
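
The checksum test in the installation steps above can also be written with the check mode of sha256sum, which returns a non-zero exit status on a mismatch. A sketch with a hypothetical helper name, assuming a sha256sum that supports -c (as in GNU coreutils):

```shell
# Hypothetical helper: succeed only if the file's SHA-256 digest matches.
# Usage: verify_sha256 <file> <expected-hex-digest>
verify_sha256() {
    # sha256sum -c expects lines of the form "<digest>  <file>"
    echo "$2  $1" | sha256sum -c - > /dev/null 2>&1
}

# Example (not run here), chaining the installer to a successful check:
#   verify_sha256 Miniconda3-py39_4.9.2-Linux-x86_64.sh \
#       536817d1b14cb1ada88900f5be51ce0a5e042bae178b5550e62f61e223deae7c \
#       && bash Miniconda3-py39_4.9.2-Linux-x86_64.sh
```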
  • Function that initializes conda with command conda-init (put in ~/.bashrc):

      # Show hint how to initialize conda
      unset -f conda
      function conda { echo "Initialize conda: conda-init" >&2; }
    
      # Command to initialize conda
      function conda-init()
      {
          local conda_path="/scratch/ruestefa/miniconda3"
          local conda_name="conda"
    
          # Unset temporary function defined above
          unset -f conda
          echo "initializing ${conda_name} from ${conda_path}" >&2
    
          # Based on the block of code created by 'conda init':
          #
          # # >>> conda initialize >>>
          # # !! Contents within this block are managed by 'conda init' !!
          # __conda_setup="$('/scratch/ruestefa/miniconda3/bin/conda' 'shell.bash' 'hook' 2> /dev/null)"
          # if [ $? -eq 0 ]; then
          #     eval "$__conda_setup"
          # else
          #     if [ -f "/scratch/ruestefa/miniconda3/etc/profile.d/conda.sh" ]; then
          #         . "/scratch/ruestefa/miniconda3/etc/profile.d/conda.sh"
          #     else
          #         export PATH="/scratch/ruestefa/miniconda3/bin:$PATH"
          #     fi
          # fi
          # unset __conda_setup
          # # <<< conda initialize <<<
    
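          # Declare separately: 'local var="$(cmd)"' would make $? the status of 'local' (always 0)
          local __conda_setup
          __conda_setup="$("${conda_path}/bin/conda" 'shell.bash' 'hook' 2> /dev/null)"
          if [ ${?} -eq 0 ]; then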
              eval "${__conda_setup}"
          else
              if [ -f "${conda_path}/etc/profile.d/conda.sh" ]; then
                  . "${conda_path}/etc/profile.d/conda.sh"
              else
                  export PATH="${conda_path}/bin:$PATH"
              fi
          fi
      }
    • Masks conda command to prompt user to initialize it with conda-init
    • When conda-init is run, the conda command is unmasked and conda is initialized
    • The code inserted by conda in ~/.bashrc if one answers "yes" is shown as a comment
  • In new shell:

    $ conda-init
    initializing conda from /scratch/ruestefa/miniconda3
    $ which conda
    /scratch/ruestefa/miniconda3/condabin/conda
    $ conda -V
    conda 4.9.2
    $ conda
    usage: conda [-h] [-V] command ...
    
    conda is a tool for managing and deploying applications, environments and packages.
    
    Options:
    
    positional arguments:
      command
        clean        Remove unused packages and caches.
        compare      Compare packages between conda environments.
        config       Modify configuration values in .condarc. This is modeled after the git config command. Writes to the user .condarc file
                     (/users/ruestefa/.condarc) by default.
        create       Create a new conda environment from a list of specified packages.
        help         Displays a list of available conda commands and their help strings.
        info         Display information about current conda install.
        init         Initialize conda for shell interaction. [Experimental]
        install      Installs a list of packages into a specified conda environment.
        list         List linked packages in a conda environment.
        package      Low-level conda package utility. (EXPERIMENTAL)
        remove       Remove a list of packages from a specified conda environment.
        uninstall    Alias for conda remove.
        run          Run an executable in a conda environment. [Experimental]
        search       Search for packages and display associated information. The input is a MatchSpec, a query language for conda packages. See
                     examples below.
        update       Updates conda packages to the latest compatible version.
        upgrade      Alias for conda update.
    
    optional arguments:
      -h, --help     Show this help message and exit.
      -V, --version  Show the conda version number and exit.
    
    conda commands available from other packages:
      env
    
    • Works! :-)
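
If the explicit conda-init step is not wanted, the placeholder function could instead initialize conda itself on first use and then forward the call. This is an alternative sketch, not the approach used above. The variable CONDA_ROOT is an assumption introduced here, and only the conda.sh fallback from the 'conda init' block is used rather than the shell hook.

```shell
# Assumed variable: root of the miniconda installation (default as above)
CONDA_ROOT="${CONDA_ROOT:-/scratch/ruestefa/miniconda3}"

# Placeholder that replaces itself with the real conda shell function
# the first time it is called.
conda() {
    # Remove this placeholder so that conda.sh can define the real function
    unset -f conda
    # conda.sh defines the 'conda' shell function; give up if it is missing
    . "${CONDA_ROOT}/etc/profile.d/conda.sh" || return 1
    # Forward the original arguments to the real conda
    conda "$@"
}
```

The trade-off is that the first conda call in a shell becomes slower, and that initialization happens implicitly rather than on request.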