tensorflow / profiler

A profiling and performance analysis tool for TensorFlow

libcupti.so.10.1 not found

SourabhKul opened this issue

I am trying to profile on an AWS instance that has CUDA 11.0 and driver version 450+.

When I looked in the folder /usr/local/cuda/extras/CUPTI/lib64, I did not find the file libcupti.so.10.1. Instead, the folder had the file libcupti.so.10.0.

Now, when I use the tf.profiler API in my code, I see this warning:

Could not load dynamic library 'libcupti.so.10.1'; dlerror: libcupti.so.10.1: cannot open shared object file:

Is it required to have exactly CUDA 10.1 for the profiler to work?
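To see which CUPTI versions are actually installed, something like the following Python sketch may help. It only lists the libcupti files in a directory; the function name find_cupti_libs is made up for illustration, and the default path is the one mentioned above.

```python
import glob
import os


def find_cupti_libs(lib_dir="/usr/local/cuda/extras/CUPTI/lib64"):
    """Return the sorted file names of libcupti shared objects in lib_dir.

    find_cupti_libs is a hypothetical helper, not part of TensorFlow or CUDA.
    """
    pattern = os.path.join(lib_dir, "libcupti.so*")
    return sorted(os.path.basename(path) for path in glob.glob(pattern))


if __name__ == "__main__":
    # Show where the /usr/local/cuda symlink currently points, then list
    # the CUPTI libraries available under it.
    print(os.path.realpath("/usr/local/cuda"))
    print(find_cupti_libs())
```

If libcupti.so.10.1 is not in the list, the warning above is expected, since that is the exact file name the loader asks for.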

Use the base AMI in AWS, and run the command:

sudo rm /usr/local/cuda sudo ln -s /usr/local/cuda-10.1 /usr/local/cuda

to fix the issue

I am getting the same problem, on the same kind of AWS EC2 instance: driver 450+, CUDA 11.

But I get this error when using your code:
rm: invalid option -- 's'
Try 'rm --help' for more information.

What to do now?

Try running them as two separate commands, like so, and make sure the ln command has -s (single hyphen) and not --s:

sudo rm /usr/local/cuda

sudo ln -s /usr/local/cuda-10.1 /usr/local/cuda
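For anyone who would rather script this, here is a rough Python sketch of the same two steps with a couple of safety checks added. The helper name repoint_symlink is hypothetical, and it assumes /usr/local/cuda is a symlink (as on the AMI discussed here) and that the script runs with enough privileges to modify it.

```python
import os


def repoint_symlink(link_path, new_target):
    """Remove the symlink at link_path and recreate it pointing at new_target.

    Hypothetical helper mirroring `rm` followed by `ln -s`. It refuses to
    touch link_path if it exists but is a real file or directory.
    """
    if os.path.lexists(link_path) and not os.path.islink(link_path):
        raise ValueError(f"{link_path} exists and is not a symlink")
    if not os.path.isdir(new_target):
        raise FileNotFoundError(f"{new_target} is not a directory")
    if os.path.islink(link_path):
        os.remove(link_path)            # step 1: sudo rm /usr/local/cuda
    os.symlink(new_target, link_path)   # step 2: sudo ln -s <target> <link>
    return os.readlink(link_path)


# Example call (needs root for paths under /usr/local):
# repoint_symlink("/usr/local/cuda", "/usr/local/cuda-10.1")
```

Checking that the target directory exists first avoids ending up with a dangling /usr/local/cuda link if CUDA 10.1 is not installed on the machine.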

Thanks. But now this happens:

2020-12-07 18:36:26.878924: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcupti.so.10.1
2020-12-07 18:36:27.080827: E tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1441] function cupti_interface_->Subscribe( &subscriber_, (CUpti_CallbackFunc)ApiCallback, this)failed with error CUPTI_ERROR_INSUFFICIENT_PRIVILEGE

Adding the line options nvidia "NVreg_RestrictProfilingToAdminUsers=0" to /etc/modprobe.d/nvidia-kernel-common.conf and rebooting should resolve the permission issue.
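To double-check that the line made it into the file before rebooting, a small Python sketch like this could be used. It only parses the text of the config file; the function name allows_non_admin_profiling is made up for illustration, and it does not query the driver itself.

```python
def allows_non_admin_profiling(conf_text):
    """Return True if conf_text has an active `options nvidia` line that sets
    NVreg_RestrictProfilingToAdminUsers=0.

    Hypothetical helper: it inspects config text only, not the loaded driver.
    """
    wanted = "NVreg_RestrictProfilingToAdminUsers=0"
    for raw_line in conf_text.splitlines():
        # Drop comments, then drop the optional quotes around the option.
        line = raw_line.split("#", 1)[0].strip()
        parts = line.replace('"', "").split()
        if parts[:2] == ["options", "nvidia"] and wanted in parts[2:]:
            return True
    return False


if __name__ == "__main__":
    path = "/etc/modprobe.d/nvidia-kernel-common.conf"
    try:
        with open(path) as conf_file:
            print(allows_non_admin_profiling(conf_file.read()))
    except FileNotFoundError:
        print(f"{path} not found")
```

A True result only means the option is present in the file; the driver picks it up after the reboot.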

use sudo?

Thank you so much. It is solved at last. Hurray!!