rothnic / docker-tinyconda

A simple pattern for dockerizing apps, which uses conda to manage python versions and compiled dependencies, then pip to fill in the gaps

Home Page:https://hub.docker.com/r/rothnic/tinyconda/

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

docker-tinyconda

A minimal base image for miniconda, starting with the smallest glibc compatible docker image (debian), adding in miniconda runtime requirements, adding miniconda, then cleaning out as much as possible that isn't required for a dockerized python application.

Overview

What does this do for me?

  • Gives you a configurable python runtime in a smallish docker image.
  • Leverages conda, so widely used, compiled packages are simple and fast to install
  • Formalizes an application structure utilizing conda and pip, dependent on how you setup the environment. (see example environment.yml)
  • Provides a deployment environment that is very similar to a local development environment.

Quickstart with Example App

Assuming you have docker installed. The example uses pandas and click to demonstrate that we can use pre-combiled libraries and pip-installed libraries. The example modifies the example click app to build up a DataFrame of greetings.

  1. Clone this repository
  2. cd docker-miniconda/test_app
  3. docker build -t test_app .
  4. docker run --rm -it test_app

Output


# uses default CMD in Dockerfile
→ docker run --rm -it test_app /app/app.py
Your name: Nick
  greeting  name
0    Hello  Nick

# overrides CMD in Dockerfile
→ docker run --rm -it test_app /app/app.py --count=3
Your name: Nick
  greeting  name
0    Hello  Nick
1    Hello  Nick
2    Hello  Nick

→ docker run --rm -it test_app /app/app.py --count=3 --greeting=Yo
Your name: Nick
  greeting  name
0       Yo  Nick
1       Yo  Nick
2       Yo  Nick

Ok, but what did that do?

  1. Pulled the onbuild tinyconda image.
  2. Docker built an image with all of the app dependencies by having conda install the version of python and packages specified in environment.yml.
  3. Conda installed python 3.5 in an environment and all conda packages specified, including pandas, which requires numpy, normally requiring blas, fortran, and other headache inducing platform compilation dependencies be installed, which aren't required for running the app.
  4. Conda installed the pip packages specified.
  5. When you call run, docker spins up an instance container of the pre-build image containing all dependencies, with --rm meaning to throw away the container afterwards (doesn't affect the image), then -it controls the output from our app, then test_app is our image name, then any additional inputs are passed to /opt/conda/envs/app/bin/python, as if python was on your local machine.

Tutorial

  1. Most users should start with the Docker Toolbox

  2. Install Miniconda on local machine.

  3. You will either create a new project (git repository), or use an existing one. Open a console and change to this directory.

  4. enter conda create -n <your_app_name> python=3 (or just python for python=2)

  5. enter source activate <your_app_name>

  6. Now install everything your app requires with conda install <dependency>. If conda doesn't have the requirement for your or your target platform (this debian image), you have these options:

  7. Build the conda package and upload to Anaconda.org. Then proceed to the next dependency.

  8. If the dependency is available through pip and doesn't require compilation, just install it via pip in your environment.

  9. (last resort) Extend this docker image to install required dependencies to compile the package, if required. The downside to this is rebuilds of the image will take much longer, compared to just downloading and installing a pre-compiled dependency via conda.

  10. enter conda env export > environment.yml (see example app's file for reference)

  • Note: The tinyconda-onbuild image will automatically look for this file by name and install the requirements for you.
  1. Start up docker (see setup instructions if needed)
  2. Copy example app's Dockerfile to the root of your application's project
  3. Modify the CMD line as needed to setup the default python script called. This is passed to the environment's python.
  • Changing CMD results in the default of /path/to/env/python <CMD's arguments here> being run.
  • Note, you can override CMD when running your image.
  1. Build image with docker build -t <your_app_name> .
  2. Run the app with docker run --rm -it <your_app_name> or docker run --rm -it <your_app_name> /app/<script_name>.py arg1 arg2

Troubleshooting

Note: If you have issues when running your app in the docker environment, there is likely an issue between package requirements for your local machine OS vs linux. In that case do the following:

  1. Run the image with bash as entrypoint: docker run --rm -it --entrypoint=/bin/bash test_app
  2. enter source activate app
  3. conda install <package> where package is suspected to have issues
  4. conda env export
  5. Copy the output from export into the environment.yml and rebuild app.

Docker Images

The dockerfile is available for customization, but the most common use should be by using the pre-built version from docker hub. This currently comes in two flavors, tinyconda and tinyconda-onbuild.

  • rothnic/tinyconda:latest: base image (~290MB) with Miniconda installed
  • rothnic/tinyconda:onbuild-latest: auto installs dependencies with environment.yml file from conda

Python/Miniconda Version

The miniconda version is python3, but your environment.yml will specify the python version, and all package versions that you installed in your local environment before executing the creation of the environment.yml file.

Why not utilize a micro linux distribution?

Currently, there is a smaller miniconda distribution available that utilizes busybox, but this can no longer be built at the moment due to the move from glibc to musl c compiler. There is some work to utilize glibc with alpine, but this is immature at the moment, and likely to cause issues with conda at some point. This could move to alpine if it is found to avoid issues with all the pre-built binaries that conda is useful for in the first place.

About

A simple pattern for dockerizing apps, which uses conda to manage python versions and compiled dependencies, then pip to fill in the gaps

https://hub.docker.com/r/rothnic/tinyconda/


Languages

Language:Dockerfile 76.7%Language:Python 23.3%