google-deepmind / lab

A customisable 3D platform for agent-based AI research

Failed to find function dmlab_connect in library

kaustabpal opened this issue

After building the Docker image using the Dockerfile for scalable_agent and running sudo docker run --name scalable_agent kaustab/scalable_agent, I get the following error:

2020-07-07 00:27:46.528660: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-07-07 00:27:46.533306: I tensorflow/core/distributed_runtime/rpc/grpc_channel.cc:215] Initialize GrpcChannelCache for job local -> {0 -> localhost:41863}
2020-07-07 00:27:46.534586: I tensorflow/core/distributed_runtime/rpc/grpc_server_lib.cc:334] Started server with target: grpc://localhost:41863
INFO:tensorflow:Using dynamic batching.
INFO:tensorflow:Creating actor 0 with level explore_goal_locations_small
INFO:tensorflow:Creating actor 1 with level explore_goal_locations_small
INFO:tensorflow:Creating actor 2 with level explore_goal_locations_small
INFO:tensorflow:Creating actor 3 with level explore_goal_locations_small
INFO:tensorflow:Creating MonitoredSession, is_chief True
INFO:tensorflow:Create CheckpointSaverHook.
WARNING:tensorflow:Issue encountered when serializing py_process_processes.
Type is unsupported, or the types of the items don't match field type in CollectionDef. Note this is a warning and probably safe to ignore.
'PyProcess' object has no attribute 'name'
INFO:tensorflow:Starting all processes.
Failed to find function dmlab_connect in library!
Failed to find function dmlab_connect in library!
Traceback (most recent call last):
  File "experiment.py", line 700, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
Failed to find function dmlab_connect in library!
  File "experiment.py", line 694, in main
    train(action_set, level_names)
  File "experiment.py", line 587, in train
    hooks=[py_process.PyProcessHook()]) as session:
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/monitored_session.py", line 415, in MonitoredTrainingSession
    stop_grace_period_secs=stop_grace_period_secs)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/monitored_session.py", line 826, in __init__
    stop_grace_period_secs=stop_grace_period_secs)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/monitored_session.py", line 542, in __init__
    h.begin()
  File "/scalable_agent/py_process.py", line 192, in begin
    tp.map(lambda p: p.start(), tf.get_collection(PyProcess.COLLECTION))
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 253, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 572, in get
    raise self._value
RuntimeError: Failed to connect RL API
Failed to find function dmlab_connect in library!
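
The message itself appears to come from DeepMind Lab's shared-object loader, which dlopens the built Lab library and then looks up the dmlab_connect symbol; the error is printed when that lookup fails. One way to check the symbol directly is with ctypes. This is only a sketch: the library path below is a guess based on where pip installs packages in this image and may differ.

import ctypes

# Guessed path -- adjust to wherever pip actually placed the package:
so_path = '/usr/local/lib/python2.7/dist-packages/deepmind_lab/libdmlab.so'
lib = ctypes.CDLL(so_path)

# ctypes raises AttributeError for a missing symbol, so hasattr() is False
# exactly when the symbol lookup would fail -- matching the error above.
print(hasattr(lib, 'dmlab_connect'))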

Below are the contents of the Dockerfile I used to build the image:

FROM ubuntu:18.04

# Install dependencies.
# g++ (v. 5.4) does not work: https://github.com/tensorflow/tensorflow/issues/13308
RUN apt-get update && apt-get install -y \
    curl \
    wget \
    zip \
    unzip \
    software-properties-common \
    pkg-config \
    g++-4.8 \
    zlib1g-dev \
    python \
    lua5.1 \
    liblua5.1-0-dev \
    libffi-dev \
    gettext \
    freeglut3 \
    libsdl2-dev \
    libosmesa6-dev \
    libglu1-mesa \
    libglu1-mesa-dev \
    python-dev \
    build-essential \
    git \
    gnupg \
    python-setuptools \
    python-pip \
    libjpeg-dev

# Install bazel
RUN echo "deb [arch=amd64] http://storage.googleapis.com/bazel-apt stable jdk1.8" | \
    tee /etc/apt/sources.list.d/bazel.list && \
    curl https://bazel.build/bazel-release.pub.gpg | \
    apt-key add - && \
    apt-get update && apt-get install -y bazel

# Install TensorFlow and other dependencies
RUN pip install tensorflow==1.9.0 dm-sonnet==1.23

# Build and install DeepMind Lab pip package.
# We explicitly set the Numpy path as shown here:
# https://github.com/deepmind/lab/blob/master/docs/users/build.md
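# Note: in Python 2, print(np.get_include())[5:] parses as printing
# np.get_include()[5:], i.e. the include path with the leading "/usr/"
# stripped, so the path injected by the sed rewrites below is relative
# to /usr, where Lab's external "python" repository is rooted.
# (Under Python 3 the same command would fail, since print() returns None.)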
RUN NP_INC="$(python -c 'import numpy as np; print(np.get_include())[5:]')" && \
    git clone https://github.com/deepmind/lab.git --branch release-2019-02-04 && \
    cd lab && \
    sed -i 's@hdrs = glob(\[@hdrs = glob(["'"$NP_INC"'/\*\*/*.h", @g' python.BUILD && \
    sed -i 's@includes = \[@includes = ["'"$NP_INC"'", @g' python.BUILD && \
    bazel build -c opt python/pip_package:build_pip_package && \
    pip install wheel && \
    ./bazel-bin/python/pip_package/build_pip_package /tmp/dmlab_pkg && \
    pip install /tmp/dmlab_pkg/DeepMind_Lab-1.0-py2-none-any.whl --force-reinstall

# Install dataset (from https://github.com/deepmind/lab/tree/master/data/brady_konkle_oliva2008)
RUN mkdir dataset && \
    cd dataset && \
    pip install Pillow && \
    curl -sS https://raw.githubusercontent.com/deepmind/lab/master/data/brady_konkle_oliva2008/README.md | \
    tr '\n' '\r' | \
    sed -e 's/.*```sh\(.*\)```.*/\1/' | \
    tr '\r' '\n' | \
    bash

# Clone.
RUN git clone https://github.com/deepmind/scalable_agent.git
WORKDIR scalable_agent

# Build dynamic batching module.
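# The custom op must be compiled with the same C++ ABI as the pip-installed
# TensorFlow wheel, hence -D_GLIBCXX_USE_CXX11_ABI=0; TF_INC and TF_LIB are
# queried from the installed package so headers and libraries match.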
RUN TF_INC="$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_include())')" && \
    TF_LIB="$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_lib())')" && \
    g++-4.8 -std=c++11 -shared batcher.cc -o batcher.so -fPIC -I $TF_INC -O2 -D_GLIBCXX_USE_CXX11_ABI=0 -L$TF_LIB -ltensorflow_framework

# Run tests.
RUN python py_process_test.py
RUN python dynamic_batching_test.py
RUN python vtrace_test.py

# Run.
CMD ["sh", "-c", "python experiment.py --total_environment_frames=10000 --dataset_path=../dataset && python experiment.py --mode=test --test_num_episodes=5"]

# Docker commands:
#   docker rm scalable_agent -v
#   docker build -t scalable_agent .
#   docker run --name scalable_agent scalable_agent
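
If the pip package built in the image is itself broken, the same failure should be reproducible without scalable_agent by constructing an environment directly from the installed package. A minimal sketch (the level and observation names are just examples; DeepMind Lab config values must be strings):

import deepmind_lab

env = deepmind_lab.Lab(
    'explore_goal_locations_small',  # same level the actors use above
    ['RGB_INTERLEAVED'],             # any observation the engine exposes
    config={'width': '96', 'height': '72'})
env.reset()
print('environment constructed OK -- dmlab_connect was found')

If this snippet prints the same "Failed to find function dmlab_connect in library!" message, the problem is in the DeepMind Lab build and pip-install step above rather than in scalable_agent.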

How do I resolve this?

Hello -- which Dockerfile are you referring to? As far as I'm aware, we're not shipping (and hence not supporting) any Dockerfiles.

Please reopen this issue if this is still a problem and you have more information.