How should Python packages depending on TensorFlow structure their requirements?

Question

dustinvtran opened this issue 7 years ago

Many packages build on TensorFlow. For example, our work in Edward uses tensorflow>=1.0.0a0 as an install requirement.

However, this conflicts with tensorflow-gpu, which can no longer be installed because of the requirement specifically on tensorflow. What do you suggest is the best way to handle this?

One option suggested by @gokceneraslan (blei-lab/edward#428 (comment)) is to hack in the dependency according to whether the user has a GPU. Another option, which PrettyTensor and Keras employ, is to not require TensorFlow at all. (Neither option sounds good.)

Also see blei-lab/edward#428. Also looping in the GPflow devs (@jameshensman, @alexggmatthews) in case they have the same problem. (Note I'm raising this as an issue instead of asking on a mailing list, in case this is something that should be changed on TensorFlow's end and not ours.)

Stephan Hoyer · Answer 1 · Wed Feb 01 2017 05:23:01 GMT+0800 (China Standard Time)

I'm pretty sure the fundamental issue here is that PyPI doesn't support uploading wheels with and without GPU support under the same name (see PEP 425 for a list of supported tags). Hence the separate "tensorflow-gpu" distribution on PyPI.

Both of your suggested work-arounds (hacks in setup.py or simply removing problematic dependencies altogether) are commonly done by Python packages for scientific computing.

Dustin Tran · Answer 2 · Wed Feb 01 2017 09:45:56 GMT+0800 (China Standard Time)

Do you have some examples? They would be great references for deciding between the work-arounds. (I'm leaning towards removing the dependence.)

Stephan Hoyer · Answer 3 · Wed Feb 01 2017 10:11:00 GMT+0800 (China Standard Time)

@dustinvtran Here's a discussion about this for patsy: pydata/patsy#5

Yaroslav Bulatov · Answer 4 · Wed Feb 01 2017 11:28:15 GMT+0800 (China Standard Time)

@shoyer Interesting example. I wonder if that explains why pip install --upgrade $TF_BINARY_URL replaces MKL numpy on our machines with OpenBLAS numpy (TF's setup.py contains REQUIRED_PACKAGES = ['numpy >= 1.11.0', ...]).

Stephan Hoyer · Answer 5 · Thu Feb 02 2017 09:47:04 GMT+0800 (China Standard Time)

@yaroslavvb yes, that's likely the case. Pip install only recently got the option --upgrade-strategy=only-if-needed which is probably what you want to use here. Eventually this will become the default behavior (pypa/pip#3871).

Simon Perkins · Answer 6 · Tue Feb 07 2017 17:12:31 GMT+0800 (China Standard Time)

@dustinvtran See here for an example which detects a CUDA installation and then selects either tensorflow/tensorflow-gpu depending on CUDA availability.
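The linked example is not reproduced in this thread, so the following is only a minimal sketch of the general idea, not that project's code. It assumes a CUDA toolkit can be recognized by `nvcc` being on the PATH or by a `CUDA_HOME` environment variable pointing at a directory; both checks and the helper names `cuda_available` and `tensorflow_requirement` are hypothetical choices made for illustration.

```python
import os
import shutil


def cuda_available():
    """Heuristic (an assumption, not an official check): treat CUDA as
    present if nvcc is on PATH or CUDA_HOME points at an existing directory."""
    if shutil.which("nvcc"):
        return True
    cuda_home = os.environ.get("CUDA_HOME", "")
    return bool(cuda_home) and os.path.isdir(cuda_home)


def tensorflow_requirement(has_cuda, min_version="1.0.0"):
    """Return the requirement string for the matching TensorFlow distribution."""
    name = "tensorflow-gpu" if has_cuda else "tensorflow"
    return "{}>={}".format(name, min_version)


# In a setup.py this list would be passed to setup(install_requires=...).
install_requires = [tensorflow_requirement(cuda_available())]
print(install_requires)
```

Note that this kind of logic only runs when setup.py itself is executed, i.e. when installing from a source distribution; a prebuilt wheel carries fixed metadata, so the choice would be frozen on the machine that built it.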

Tomáš Karásek · Answer 7 · Thu Feb 09 2017 18:43:59 GMT+0800 (China Standard Time)

@sjperkins Detecting whether an NVIDIA GPU is available will not work when installing in a Dockerfile to a Docker image, or in general when you install somewhere you don't intend to run it.

I was checking if it could be done with setup.py extras:
https://setuptools.readthedocs.io/en/latest/setuptools.html#declaring-extras-optional-features-with-their-own-dependencies

but I don't think it's possible.

IMO the most straightforward approach would be to drop tensorflow from the install_requires list, just try to import it in setup.py, catch the ImportError, and raise some pip exception saying that you need either tensorflow or tensorflow-gpu installed.
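A rough sketch of that check, under a few assumptions: the helper name `require_any` is made up for illustration, it raises a plain `RuntimeError` rather than any pip-specific exception, and it uses `importlib.util.find_spec` to test for the module without actually importing it. Both the tensorflow and tensorflow-gpu distributions install the same importable module, `tensorflow`, so one module name covers both.

```python
import importlib.util


def require_any(module_names, hint):
    """Return the first top-level module name that can be found,
    or raise RuntimeError with the given hint if none is available."""
    for name in module_names:
        if importlib.util.find_spec(name) is not None:
            return name
    raise RuntimeError(hint)


# Hypothetical use near the top of setup.py:
# require_any(
#     ["tensorflow"],
#     "my_package needs TensorFlow: install either 'tensorflow' or 'tensorflow-gpu' first.",
# )
```

The same check could instead live in the package's own `__init__.py`, which would also cover installs from prebuilt wheels, where setup.py is not executed.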

Simon Perkins · Answer 8 · Thu Feb 09 2017 19:04:44 GMT+0800 (China Standard Time)

Detecting whether an NVIDIA GPU is available will not work when installing in a Dockerfile to a Docker image, or in general when you install somewhere you don't intend to run it.

@t0mk One still needs to install CUDA in the docker container. The example I provided will still compile CUDA code if GPUs aren't available, but it won't be able to target specific architectures and will default to sm_30.

Andrew Stepanov · Answer 9 · Sun Feb 19 2017 07:02:14 GMT+0800 (China Standard Time)

Currently I am using extras_require in setup.py

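from setuptools import setup

setup(
    name="my_package",
    # ... other stuff ...
    install_requires=[],  # list of dependencies, EXCLUDING tensorflow
    extras_require={
        # Installed with: pip install my_package[tf]
        "tf": ["tensorflow>=1.0.0"],
        # Installed with: pip install my_package[tf_gpu]
        "tf_gpu": ["tensorflow-gpu>=1.0.0"],
    },
)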

The problem here is that if the user does not specify which variant of the package they want, e.g. my_package[tf_gpu], tensorflow won't be required at all. But I think it is at least better than not specifying tensorflow anywhere.
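One way a package could soften that gap is to check at import time which TensorFlow distribution, if any, is installed, and point the user at the extras otherwise. The following is a sketch only: `find_installed` is a hypothetical helper, and it relies on `importlib.metadata`, which is in the standard library from Python 3.8 onward.

```python
from importlib import metadata


def find_installed(candidates=("tensorflow", "tensorflow-gpu")):
    """Return (name, version) of the first installed distribution
    among the candidates, or None if none of them is installed."""
    for name in candidates:
        try:
            return name, metadata.version(name)
        except metadata.PackageNotFoundError:
            continue
    return None


found = find_installed()
if found is None:
    print("TensorFlow not found; install my_package[tf] or my_package[tf_gpu].")
else:
    print("Using {} {}".format(*found))
```

Unlike an import check, this looks at distribution names, so it can also tell which of the two variants the user picked.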

Dustin Tran · Answer 10 · Tue Feb 28 2017 00:53:19 GMT+0800 (China Standard Time)

For Edward I decided to remove the explicit dependence on TensorFlow and make it part of extras_require. Not ideal, but I think it's the best present solution. Feel free to close this issue—it would be nice though to have this type of recommended advice in the docs.

johnthagen · Answer 11 · Fri May 13 2022 03:26:34 GMT+0800 (China Standard Time)

In case this helps others, we use this Poetry dependency specification to target different TF runtimes based on operating system and architecture. Note that these requirements are not used for training, only for making inferences using a pre-trained model.

[tool.poetry.dependencies]
# Use Tensorflow Lite on Linux to produce slim production Docker images.
tflite-runtime = { version = "*", markers = "sys_platform == 'linux'" }

# Tensorflow Lite only ships wheels for Linux, so use full tensorflow on other platforms.
tensorflow = { version = "*", markers = "sys_platform != 'linux' and platform_machine != 'arm64'" }

# ARM Mac wheels are hosted under a different PyPI package name.
tensorflow-macos = { version = "*", markers = "sys_platform == 'darwin' and platform_machine == 'arm64'" }
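To see which of the three packages those markers select on a given machine, the conditions can be mirrored in plain Python. This is an illustrative sketch, not how Poetry or pip evaluate markers internally; `selected_packages` is a hypothetical helper, and it assumes the `sys_platform` and `platform_machine` markers correspond to `sys.platform` and `platform.machine()`.

```python
import platform
import sys


def selected_packages(sys_platform, platform_machine):
    """Mirror the three marker expressions above and return the
    package names whose markers evaluate to true."""
    selected = []
    if sys_platform == "linux":
        selected.append("tflite-runtime")
    if sys_platform != "linux" and platform_machine != "arm64":
        selected.append("tensorflow")
    if sys_platform == "darwin" and platform_machine == "arm64":
        selected.append("tensorflow-macos")
    return selected


# Packages the markers would pick on the machine running this script.
print(selected_packages(sys.platform, platform.machine()))
```

Written this way it is easy to confirm that the markers never select two packages at once for the same platform and machine combination.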