aws / sagemaker-chainer-container

Docker container for running Chainer scripts to train and host Chainer models on SageMaker

Building the final image results in error

vangj opened this issue

In your example here for building the final container, you run this command: docker build -t preprod-chainer:4.1.0-gpu-py3 -f docker/4.1.0/final/py3/Dockerfile.gpu . (with the trailing dot as the build context). However, the build fails with the following error.

Step 9/12 : RUN pip3 install --no-cache /sagemaker_chainer_container-1.0-py2.py3-none-any.whl
 ---> Running in 03d3d1d3f888
Processing /sagemaker_chainer_container-1.0-py2.py3-none-any.whl
Requirement already satisfied: numpy>=1.14 in /usr/local/lib/python3.5/dist-packages (from sagemaker-chainer-container==1.0) (1.15.4)
Collecting retrying==1.3.3 (from sagemaker-chainer-container==1.0)
  Downloading https://files.pythonhosted.org/packages/44/ef/beae4b4ef80902f22e3af073397f079c96969c69b2c7d52a57ea9ae61c9d/retrying-1.3.3.tar.gz
Collecting sagemaker-containers>=2.2.5 (from sagemaker-chainer-container==1.0)
  Downloading https://files.pythonhosted.org/packages/be/7f/d5011a5d24d86872813140a31d9b8f6d38395a5937c917b9cff1c16e2a6d/sagemaker_containers-2.4.0.tar.gz (43kB)
Collecting chainer==5.0.0 (from sagemaker-chainer-container==1.0)
  Downloading https://files.pythonhosted.org/packages/bd/34/be31d10ff7f6a9452025866a6d515e1fbc877ff2ee68d9c7197c75f15797/chainer-5.0.0.tar.gz (510kB)
    Complete output from command python setup.py egg_info:
    
    We detected that ChainerMN is installed in your environment.
    ChainerMN has been integrated to Chainer and no separate installation
    is necessary. Please uninstall the old ChainerMN in advance.

A quick and dirty fix is to add the pip3 uninstall line below to the Dockerfile, immediately before the existing pip3 install line (line 18):

RUN pip3 uninstall chainermn -y
RUN pip3 install --no-cache /sagemaker_chainer_container-1.0-py2.py3-none-any.whl
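
If the same workaround is applied to Dockerfiles where ChainerMN might not be present, an unconditional uninstall may fail on some pip versions. As a rough sketch only (the helper names is_installed and uninstall_command are hypothetical and not part of this repository), the guard could look like this in Python, relying on pip show returning a non-zero exit code when a distribution is missing:

```python
import subprocess
import sys


def is_installed(dist_name):
    """Return True if `pip show` finds the named distribution."""
    result = subprocess.run(
        [sys.executable, "-m", "pip", "show", dist_name],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    # pip show exits with a non-zero status when the package is not found.
    return result.returncode == 0


def uninstall_command(dist_name, installed=None):
    """Return the pip command that removes dist_name, or None if it is absent.

    `installed` can be passed explicitly (handy for testing); when it is
    None the function asks pip whether the distribution is present.
    """
    if installed is None:
        installed = is_installed(dist_name)
    if not installed:
        return None
    return [sys.executable, "-m", "pip", "uninstall", "-y", dist_name]


if __name__ == "__main__":
    # Remove the standalone ChainerMN only when it is actually installed.
    command = uninstall_command("chainermn")
    if command is not None:
        subprocess.run(command, check=True)
```

This only illustrates the "uninstall if present" logic. In the Dockerfile itself, the single RUN pip3 uninstall line above does the same job when ChainerMN is known to be installed in the base image.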

Note the error is coming from here.

Note that I am building against 4.1.0, but in the logs above the wheel pulls in chainer 5.0.0.

Note that the wheel is dist/sagemaker_chainer_container-1.0-py2.py3-none-any.whl, which I created by following your instructions to run python setup.py bdist_wheel.

I haven't tested yet, as I am still mid-way through your README.rst.

Note this may impact all your final Dockerfiles.

Hello @vangj!
Thank you for reporting this! Our documentation and older Dockerfiles need to be updated to match the current state of the project.

Is there any reason why you want to use chainer 4.1 image instead of the latest 5.0?

We usually release any new updates only to images with the latest framework version.