google / jax

Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more

Home Page:http://jax.readthedocs.io/

'+ptx84' is not a recognized feature for this target (ignoring feature)

adam-hartshorne opened this issue

Description

After updating to jax/jaxlib 0.4.27, any code that imports ordinary JAX functions prints repeated warnings of the following form:

'+ptx84' is not a recognized feature for this target (ignoring feature)

I don't know whether it is relevant, but PyTorch 2.2.2 is installed in the same virtual environment. With the previous version of jax/jaxlib I had no problem.

System info (python version, jaxlib version, accelerator, etc.)

Linux Ubuntu 22.04
jax: 0.4.26
jaxlib: 0.4.26
numpy: 1.24.4
python: 3.10.12 (main, Jul 5 2023, 18:54:27) [GCC 11.2.0]
NVIDIA-SMI 550.54.15 Driver Version: 550.54.15 CUDA Version: 12.4

I also ran into this issue. Downgrading jaxlib to 0.4.26 fixed it for me.

System info (python version, jaxlib version, accelerator, etc.)

jax:    0.4.27
jaxlib: 0.4.27
numpy:  1.26.4
python: 3.11.5 (main, Oct 25 2023, 16:19:59) [GCC 8.5.0 20210514 (Red Hat 8.5.0-20)]
jax.devices (1 total, 1 local): [cuda(id=0)]
process_count: 1
platform: uname_result(system='Linux', release='4.18.0-513.24.1.el8_9.x86_64', version='#1 SMP Thu Apr 4 18:13:02 UTC 2024', machine='x86_64')


$ nvidia-smi
Wed May  8 00:34:18 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA H100                    On  |   00000000:9D:00.0 Off |                    0 |
| N/A   36C    P0             82W /  700W |     534MiB /  95830MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A    157152      C   python                                        524MiB |
+-----------------------------------------------------------------------------------------+

For those experiencing this problem, can you please share the output of nvidia-smi (if you haven't already) and of pip list | grep nvidia?

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3090 Ti     Off |   00000000:01:00.0 Off |                  Off |
|  0%   59C    P2            128W /  450W |    2745MiB /  24564MiB |     19%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA GeForce RTX 3090 Ti     Off |   00000000:21:00.0 Off |                  Off |
| 32%   57C    P2            170W /  450W |    3453MiB /  24564MiB |     82%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

pip list | grep nvidia => Empty output (system CUDA)

> /usr/local/cuda-12.4/bin/nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Thu_Mar_28_02:18:24_PDT_2024
Cuda compilation tools, release 12.4, V12.4.131
Build cuda_12.4.r12.4/compiler.34097967_0

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4090        Off |   00000000:01:00.0  On |                  Off |
|  0%   47C    P8             27W /  450W |    5934MiB /  24564MiB |     24%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

pip list | grep nvidia => Empty output (system CUDA)

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Thu_Mar_28_02:18:24_PDT_2024
Cuda compilation tools, release 12.4, V12.4.131
Build cuda_12.4.r12.4/compiler.34097967_0

The problem occurs when ptxas from CUDA 12.4 is on your path. An older version works (e.g., pip install nvidia-cuda-nvcc-cu12, pinned to a CUDA 12.3 version).
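
One way to see which ptxas a shell would pick up is sketched below. This is only an illustration: the helper names `parse_cuda_release` and `ptxas_on_path_release` are made up for this example, and it assumes `ptxas --version` prints a `release X.Y` line in the same format as the nvcc output quoted in this thread. It inspects PATH only, and JAX/XLA may locate ptxas elsewhere (for example inside a pip-installed nvidia package).

```python
import re
import shutil
import subprocess


def parse_cuda_release(version_text):
    """Return (major, minor) from a 'release X.Y' line, or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", version_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def ptxas_on_path_release():
    """Return the CUDA release of the first ptxas found on PATH, or None."""
    ptxas = shutil.which("ptxas")
    if ptxas is None:
        return None
    result = subprocess.run(
        [ptxas, "--version"], capture_output=True, text=True, check=False
    )
    return parse_cuda_release(result.stdout)


if __name__ == "__main__":
    release = ptxas_on_path_release()
    if release is None:
        print("no usable ptxas found on PATH")
    elif release >= (12, 4):
        # Assumption taken from this thread: 12.4 triggers the warning with jaxlib 0.4.27.
        print(f"ptxas from CUDA {release[0]}.{release[1]} is on PATH")
    else:
        print(f"ptxas from CUDA {release[0]}.{release[1]} is on PATH (older than 12.4)")
```

If this reports 12.4 or newer alongside jaxlib 0.4.27, the workarounds discussed in this thread are the ones to try.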

I'm experiencing the same issue. Here's my output from nvidia-smi and pip list | grep nvidia:

(base) ➜  ~ nvidia-smi
Wed May  8 09:05:06 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.171.04             Driver Version: 535.171.04   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 4090        Off | 00000000:01:00.0  On |                  Off |
|  0%   47C    P0              63W / 450W |    191MiB / 24564MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      1525      G   /usr/lib/xorg/Xorg                          167MiB |
|    0   N/A  N/A      1612      G   /usr/bin/gnome-shell                         14MiB |
+---------------------------------------------------------------------------------------+
(base) ➜  ~ pip list | grep nvidia
nvidia-cublas-cu12        12.1.3.1
nvidia-cuda-cupti-cu12    12.1.105
nvidia-cuda-nvcc-cu12     12.4.131
nvidia-cuda-nvrtc-cu12    12.1.105
nvidia-cuda-runtime-cu12  12.1.105
nvidia-cudnn-cu12         8.9.2.26
nvidia-cufft-cu12         11.0.2.54
nvidia-curand-cu12        10.3.2.106
nvidia-cusolver-cu12      11.4.5.107
nvidia-cusparse-cu12      12.1.0.106
nvidia-nccl-cu12          2.19.3
nvidia-nvjitlink-cu12     12.4.127
nvidia-nvtx-cu12          12.1.105

Also experiencing the same issue. Any specific fixes yet?

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.129.03             Driver Version: 535.129.03   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 4090         On | 00000000:01:00.0 Off |                  Off |
|  0%   31C    P8              28W / 450W |      3MiB / 24564MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce RTX 4090         On | 00000000:24:00.0 Off |                  Off |
|  0%   31C    P8              24W / 450W |      3MiB / 24564MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   2  NVIDIA GeForce RTX 4090         On | 00000000:41:00.0 Off |                  Off |
|  0%   31C    P8              26W / 450W |      3MiB / 24564MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   3  NVIDIA GeForce RTX 4090         On | 00000000:61:00.0 Off |                  Off |
|  0%   31C    P8              23W / 450W |      3MiB / 24564MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   4  NVIDIA GeForce RTX 4090         On | 00000000:81:00.0 Off |                  Off |
|  0%   31C    P8              23W / 450W |      3MiB / 24564MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   5  NVIDIA GeForce RTX 4090         On | 00000000:A1:00.0 Off |                  Off |
|  0%   29C    P8              24W / 450W |      3MiB / 24564MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   6  NVIDIA GeForce RTX 4090         On | 00000000:C1:00.0 Off |                  Off |
|  0%   30C    P8              29W / 450W |      3MiB / 24564MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   7  NVIDIA GeForce RTX 4090         On | 00000000:E1:00.0 Off |                  Off |
|  0%   31C    P8              15W / 450W |      3MiB / 24564MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+

nvidia-cublas-cu12        12.4.5.8
nvidia-cuda-cupti-cu12    12.4.127
nvidia-cuda-nvcc-cu12     12.4.131
nvidia-cuda-nvrtc-cu12    12.4.127
nvidia-cuda-runtime-cu12  12.4.127
nvidia-cudnn-cu12         8.9.7.29
nvidia-cufft-cu12         11.2.1.3
nvidia-cusolver-cu12      11.6.1.9
nvidia-cusparse-cu12      12.3.1.170
nvidia-nccl-cu12          2.21.5
nvidia-nvjitlink-cu12     12.4.127

This is now fixed in XLA upstream, but it needs a new release.

As a workaround, you can downgrade nvidia-cuda-nvcc-cu12 to a CUDA 12.3 version, or downgrade jaxlib to 0.4.26.

What specific commands do I use to downgrade? Thank you!

Try:

pip install nvidia-cuda-nvcc-cu12==12.3.107
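
To check whether the pin is needed, or whether it took effect, the installed wheel version can be read from Python. The sketch below is illustrative only: `installed_nvcc_wheel_version` and `nvcc_wheel_needs_pin` are hypothetical helper names, and the "12.4 or newer" threshold is an assumption drawn from the reports in this thread (12.4.131 triggers the warning with jaxlib 0.4.27, a 12.3 version does not).

```python
from importlib import metadata


def installed_nvcc_wheel_version(package="nvidia-cuda-nvcc-cu12"):
    """Return the installed version string of the wheel, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def nvcc_wheel_needs_pin(version):
    """True if the version string is CUDA 12.4 or newer (assumed threshold)."""
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) >= (12, 4)


if __name__ == "__main__":
    version = installed_nvcc_wheel_version()
    if version is None:
        print("nvidia-cuda-nvcc-cu12 is not installed via pip (system CUDA?)")
    elif nvcc_wheel_needs_pin(version):
        print(f"{version}: consider pinning to a 12.3 release")
    else:
        print(f"{version}: older than 12.4, no pin needed")
```

With a system-wide CUDA install, as in several reports above, the wheel is absent and this check says nothing about the ptxas that is actually used.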

I have the same issue

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.78                 Driver Version: 550.78         CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA RTX 4000 SFF Ada ...    Off |   00000000:01:00.0 Off |                  Off |
| 30%   37C    P8             11W /   70W |   15291MiB /  20475MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A   1143328      C   ...yenv/versions/3.10.9/bin/python3.10      15284MiB |
+-----------------------------------------------------------------------------------------+

Same issue here. I am sorry but my machine is in a local private network. I cannot paste nvidia-smi.

Ubuntu 22.04
NVIDIA driver 550.54.15
CUDA 12.4 (nvcc 12.4.131) installed on system
Two RTX A6000

If you are OK with the previous version, this should help:

pip uninstall jax jaxlib && pip install "jax[cuda12]"==0.4.26

We just released jax and jaxlib v0.4.28, which resolves this issue.

Well, I just built jax and jaxlib from source. Now this shows up: '+ptx85' is not a recognized feature for this target (ignoring feature)

This is on the latest Arch Linux. jaxlib was built using:

export TF_CUDA_PATHS=/opt/cuda
python ./build/build.py \
	--bazel_options=--local_ram_resources=HOST_RAM*.2 \
	--target_cpu_features=native \
	--enable_cuda \
	--use_clang \
	--cuda_path=/opt/cuda \
	--cudnn_path=/usr \
	--cuda_compute_capabilities='7.5' \
	--cuda_version='12.5' \
	--cudnn_version='9.1.1'
$ pip list | grep nvidia
$ nvidia-smi 
Tue Jun  4 17:50:33 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.78                 Driver Version: 550.78         CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 2080 Ti     Off |   00000000:65:00.0  On |                  N/A |
|  0%   44C    P8             14W /  260W |    9106MiB /  11264MiB |      2%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Wed_Apr_17_19:19:55_PDT_2024
Cuda compilation tools, release 12.5, V12.5.40
Build cuda_12.5.r12.5/compiler.34177558_0
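
A note on the feature name, offered as a sketch rather than a definitive reading of the XLA or LLVM sources: the digits in '+ptx84' and '+ptx85' appear to encode the PTX ISA version (8.4, 8.5), and PTX ISA 8.0 through 8.5 shipped with CUDA 12.0 through 12.5 respectively. That would match the reports in this thread, where '+ptx84' shows up with CUDA 12.4 tools and '+ptx85' with CUDA 12.5. The helper names below are hypothetical.

```python
import re


def ptx_feature_to_isa(feature):
    """Parse a feature string such as '+ptx84' into (8, 4); None if it does not match."""
    match = re.fullmatch(r"\+?ptx(\d)(\d)", feature)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def cuda_release_for_feature(feature):
    """Map a PTX 8.0-8.5 feature string to its CUDA 12.x release, else None."""
    isa = ptx_feature_to_isa(feature)
    if isa is None:
        return None
    major, minor = isa
    # Only the 8.0-8.5 range is covered here; other versions are left unmapped.
    if major != 8 or not 0 <= minor <= 5:
        return None
    return f"12.{minor}"


if __name__ == "__main__":
    for name in ("+ptx84", "+ptx85"):
        print(name, "->", "CUDA", cuda_release_for_feature(name))
```

Read this way, the warning would mean that the compiler backend bundled with the jaxlib build does not yet know the PTX version requested for the installed CUDA tools.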