TF 1.12.0, CPU/GPU, CUDA 9.0, CuDNN 7.4, Python 3.5, Ubuntu 16.04, Skylake, -AVX, +SSE4
bzamecnik opened this issue
Recent builds with and without GPU, without AVX, for Ubuntu 16.04.
| CPU/GPU | AVX/AVX2/FMA | SSE4.1/SSE4.2 | link | md5 |
|---|---|---|---|---|
| GPU | no | yes | download | 5d9fb5aee87456d5c0f1915b16844769 |
| CPU | no | yes | download | df616627cfcbe47d77df7f8628611b9e |
| GPU | no | no | download | 61e1081971626bccc047ea773a0f2eed |
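To make sure a download is intact, you can check it against the md5 from the table; this is a minimal sketch, and the filename/hash pair shown is for the first (GPU, +SSE4) row, so adjust both to whichever build you grabbed:

```shell
# Check the downloaded wheel against the md5 from the table above.
# NOTE: WHEEL/EXPECTED are for the GPU +SSE4 row; swap in your download's row.
WHEEL=tensorflow-1.12.0-cp35-cp35m-linux_x86_64.whl
EXPECTED=5d9fb5aee87456d5c0f1915b16844769
[ "$(md5sum "$WHEEL" 2>/dev/null | cut -d' ' -f1)" = "$EXPECTED" ] \
  && echo "md5 OK" || echo "md5 MISMATCH"
```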
Compiled on Intel Pentium G4400 (Skylake) without AVX/FMA instructions.
- TensorFlow 1.12.0 GPU
- Ubuntu 16.04 (libc6-2.23)
- Python 3.5
- CUDA 9.0
- CuDNN 7.4
- NCCL 1.3
- Bazel 0.19.2
- compute capabilities 5.2 (Maxwell), 6.1 (Pascal)
Successfully tested with Keras 2.2.4 on GTX 980 Ti.
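Before picking a wheel, it helps to check which SIMD extensions your CPU actually has — this is the whole reason these -AVX builds exist, since the stock wheels crash with "Illegal instruction" on CPUs without AVX. A Linux-only sketch reading /proc/cpuinfo:

```shell
# Print the SIMD-related flags of the first CPU core; if avx is missing,
# the stock AVX-compiled TF wheels will not run on this machine.
grep -m1 '^flags' /proc/cpuinfo | tr ' ' '\n' \
  | grep -Ex 'avx2?|fma|sse4_[12]' | sort -u
```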
Instructions
Install the Bazel build tool:
* For TF 1.12 the official guide recommends Bazel 0.15.0; 0.19.1 is broken (see Troubleshooting below), so 0.19.2 was used here
open https://docs.bazel.build/versions/master/install-ubuntu.html
# use binary installer (recommended)
sudo apt-get install pkg-config zip g++ zlib1g-dev unzip python
# download from https://github.com/bazelbuild/bazel/releases (~ 160 MB)
BAZEL_VERSION=0.19.2
BAZEL_INSTALLER=bazel-$BAZEL_VERSION-installer-linux-x86_64.sh
wget https://github.com/bazelbuild/bazel/releases/download/${BAZEL_VERSION}/${BAZEL_INSTALLER}
wget https://github.com/bazelbuild/bazel/releases/download/${BAZEL_VERSION}/${BAZEL_INSTALLER}.sha256
shasum -a 256 -c -b ${BAZEL_INSTALLER}.sha256
# bazel-0.19.2-installer-linux-x86_64.sh: OK
chmod +x ${BAZEL_INSTALLER}
./${BAZEL_INSTALLER} --user
# set in ~/.bashrc
export PATH="$PATH:$HOME/bin"
source $HOME/.bazel/bin/bazel-complete.bash
# check it
bazel version
Build:
git clone https://github.com/tensorflow/tensorflow.git
cd tensorflow/
git checkout v1.12.0
# install deps
# Python 3.5 on Ubuntu 16.04
sudo apt install python3-dev python3-pip
# NVIDIA - given the ML deb repo
sudo apt remove libcudnn6 libcudnn6-dev
sudo apt install libcudnn7 libcudnn7-dev
# cuDNN libraries end up in /usr/lib/x86_64-linux-gnu/
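A quick way to confirm the swap to libcudnn7 actually took effect (a sketch, assuming the NVIDIA ML deb repo package names used above):

```shell
# Confirm libcudnn7 replaced libcudnn6 and the shared object is in place.
dpkg -l 2>/dev/null | grep -E '^ii +libcudnn' || echo "no libcudnn package found"
ls /usr/lib/x86_64-linux-gnu/libcudnn.so.7* 2>/dev/null || echo "libcudnn.so.7 not found"
```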
# python virtual env + dependencies
mkvirtualenv tf-build
pip install -U pip six numpy wheel mock
pip install -U keras_applications==1.0.6 --no-deps
pip install -U keras_preprocessing==1.0.5 --no-deps
# not in the official build guide, but required by the tests
pip install -U scipy scikit-learn portpicker
./configure
Please specify the location where cuDNN 7 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: /usr/lib/x86_64-linux-gnu
# Do you wish to build TensorFlow with CUDA support? [y/N]: y
# Please specify the CUDA SDK version you want to use. [Leave empty to default to CUDA 9.0]:
# Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 7]:
# Do you wish to build TensorFlow with TensorRT support? [y/N]: n
# Please specify the NCCL version you want to use. If NCCL 2.2 is not installed, then you can use version 1.3 that can be fetched automatically but it may have worse performance with multiple GPUs. [Default is 2.2]: 1.3
# Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
# Please note that each additional compute capability significantly increases your build time and binary size. [Default is: 5.2]: 5.2,6.1
# without this export the build was failing
export TF_NEED_CUDA="1"
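The interactive prompts above can also be pre-answered via environment variables that ./configure reads, which makes the build scriptable. A sketch — the variable names below are as used by TF 1.12's configure.py, but double-check them against your checkout:

```shell
# Pre-answer ./configure (TF 1.12) so the build is reproducible.
export TF_NEED_CUDA=1
export TF_CUDA_VERSION=9.0
export TF_CUDNN_VERSION=7
export CUDNN_INSTALL_PATH=/usr/lib/x86_64-linux-gnu
export TF_NEED_TENSORRT=0
export TF_NCCL_VERSION=1.3
export TF_CUDA_COMPUTE_CAPABILITIES=5.2,6.1
```

Then run `./configure` as above; it picks these up instead of prompting.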
bazel build -c opt \
--config=cuda \
//tensorflow/tools/pip_package:build_pip_package
# INFO: Elapsed time: 16706.700s, Critical Path: 190.08s, Remote (0.00% of the time): # [queue: 0.00%, setup: 0.00%, process: 0.00%]
# INFO: 14075 processes: 14075 local.
# INFO: Build completed successfully, 16607 total actions
# real 278m28.498s (4:38 h)
# Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2
./bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
mv /tmp/tensorflow_pkg/tensorflow-1.12.0-cp35-cp35m-linux_x86_64.whl ~
pip install --no-cache-dir ~/tensorflow-1.12.0-cp35-cp35m-linux_x86_64.whl
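If pip refuses the wheel with "not a supported wheel on this platform", the usual cause is a Python-version mismatch: the cp35 tag in the filename must match your interpreter. A quick check:

```shell
# The wheel's cp35 tag must match the interpreter's major/minor version.
python3 -c 'import sys; print("cp%d%d" % sys.version_info[:2])'
```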
Test it:
cd
git clone https://github.com/keras-team/keras.git
cd keras/examples
python mnist_cnn.py
Troubleshooting
- tensorflow/tensorflow#23613: "--config=opt" -> "-c opt"
- tensorflow/tensorflow#4841: export TF_NEED_CUDA="1"
- Building with --config=cuda but TensorFlow is not configured to build with GPU support.
- no such target '@local_config_cuda//crosstool:toolchain': target 'toolchain' not declared in package 'crosstool'
- tensorflow/tensorflow#22654: seems that Bazel 0.19.1 is broken -> upgrade to 0.19.2
- Auto-Configuration Warning: 'TMP' environment variable is not set, using 'C:\Windows\Temp' as default
- tensorflow/tensorflow#23719: seems that Bazel 0.19.1 is broken -> upgrade to 0.19.2
- ERROR: /home/bza/.cache/bazel/_bazel_bza/a1c658cb770a9cc79bfab3126f33f28b/external/local_config_cc/BUILD:57:1: in cc_toolchain rule @local_config_cc//:cc-compiler-k8: Error while selecting cc_toolchain: Toolchain identifier 'local' was not found, valid identifiers are [local_linux, local_darwin, local_windows]
- it was picking old libcudnn6-dev -> remove and install libcudnn7-dev
- Cannot find cuda library libcudnn.so.6
- NCCL: 2.2 not working with libnccl2 -> specify 1.3
- ERROR: Config value cuda is not defined in any .rc file
- tensorflow/tensorflow#23401
- change "import /home/bza/tensorflow/.tf_configure.bazelrc" to "import /home/bza/tensorflow/tools/bazel.rc"
sed -i -E 's#\.tf_configure\.bazelrc#tools/bazel.rc#' .bazelrc
- tensorflow/tensorflow#24385
- Downgrade from the latest Bazel (0.22.0) to 0.19.2 or 0.15.0
GCC defaults to -march=native, so on my CPU (which lacks AVX) AVX is disabled automatically. When cross-compiling on another CPU that does have AVX, it can be disabled explicitly:
bazel build --config=opt \
--config=cuda \
--copt=-mno-avx \
--copt=-mno-avx2 \
--copt=-mno-fma \
//tensorflow/tools/pip_package:build_pip_package
Another build (with explicitly enabled SSE4 and with -D_GLIBCXX_USE_CXX11_ABI=0):
time bazel build -c opt \
--config=cuda \
--copt=-mno-avx \
--copt=-mno-avx2 \
--copt=-mno-fma \
--copt=-msse4.1 \
--copt=-msse4.2 \
--cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" \
//tensorflow/tools/pip_package:build_pip_package
Another build was without CUDA:
unset TF_NEED_CUDA
bazel clean
./configure
# ... build as above
- 5.2 - Maxwell 9xx series (including GeForce GTX 980 Ti)
- 6.1 - Pascal 10xx series
TODO:
- Docker build
- enable SSE4 instructions (likely not enabled with -march=native)
- TensorRT
- with GCC 5 possibly use:
- --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0"
- but it is useful only for custom TF operations
- maybe can be useful to overcome different versions of glibc tensorflow/tensorflow#15376
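Related to the glibc point above: since these wheels were built against libc6-2.23, importing them on a distro with an older glibc fails, so it is worth comparing versions first. A minimal check:

```shell
# The build machine had glibc 2.23 (Ubuntu 16.04); the target needs >= that.
ldd --version 2>/dev/null | head -n1
```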
Dude many thanks. I have searched all the web for this wheel for my old computer. Thanks for taking your time and building one.
THANK YOU
Been trying for a year to get a newer version than tensorflow v1.5 working. This is the first thing that has worked. 🎉
Should I build also 1.13?
@bzamecnik As of right now, I'm too afraid to update CUDA to work with 1.13 (like I said, it took me way too long to get this working), so I likely will not be able to test for you. I'm sure others in the community would love that contribution, though.
I really really appreciate this build. Now my old Xeon desktop can finally use tensorflow with version > 1.5. What a leap to 1.12! Thanks!!
@bzamecnik Hey tensorflow 2 has released can you please make a wheel for this version cpu