libcupti.so.10.1 not found
SourabhKul opened this issue
I am trying to run the profiler on an AWS instance that has CUDA 11.0 and driver version 450.+
When I looked in the folder /usr/local/cuda/extras/CUPTI/lib64,
I did not find the file libcupti.so.10.1.
Instead, the folder had the file libcupti.so.10.0.
Now when I use the tf.profiler API in my code, I see this warning:
Could not load dynamic library 'libcupti.so.10.1'; dlerror: libcupti.so.10.1: cannot open shared object file:
Is exactly CUDA 10.1 required for the profiler to work?
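A quick way to see which CUPTI libraries are actually present, and whether the dynamic loader can open the one TensorFlow asks for, is sketched below. This is only a diagnostic sketch: the helper names find_cupti_libs and can_load are made up for illustration, and the directory is the one mentioned in the post above.

```python
import ctypes
import glob
import os


def find_cupti_libs(directory):
    """Return the sorted file names of libcupti shared objects in `directory`."""
    pattern = os.path.join(directory, "libcupti.so*")
    return sorted(os.path.basename(p) for p in glob.glob(pattern))


def can_load(soname):
    """Return True if the dynamic loader can open `soname`, else False."""
    try:
        ctypes.CDLL(soname)
        return True
    except OSError:
        return False


if __name__ == "__main__":
    # Path taken from the post above; adjust it if CUDA lives elsewhere.
    cupti_dir = "/usr/local/cuda/extras/CUPTI/lib64"
    print("found:", find_cupti_libs(cupti_dir))
    print("loadable:", can_load("libcupti.so.10.1"))
```

If the file exists in the directory but can_load still returns False, the directory is probably not on the loader's search path (for example LD_LIBRARY_PATH).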
Use the base AMI on AWS and run these commands:
sudo rm /usr/local/cuda sudo ln -s /usr/local/cuda-10.1 /usr/local/cuda
to fix the issue
I am getting the same problem, on the same kind of AWS EC2 instance (driver 450+, CUDA 11).
But I get this error when using your code:
rm: invalid option -- 's'
Try 'rm --help' for more information.
What to do now?
Try running them as two separate commands, as shown below, and make sure the ln command
uses -s (one hyphen, no space) and not -- s:
sudo rm /usr/local/cuda
sudo ln -s /usr/local/cuda-10.1 /usr/local/cuda
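To confirm the switch took effect, one can check where /usr/local/cuda points afterwards. A minimal sketch, using only the Python standard library; the helper name describe_link is hypothetical:

```python
import os


def describe_link(path):
    """Return (is_symlink, resolved_target) for `path`."""
    return os.path.islink(path), os.path.realpath(path)


if __name__ == "__main__":
    is_link, target = describe_link("/usr/local/cuda")
    print("symlink:", is_link)
    print("resolves to:", target)
```

After the two commands above, this should report a symlink that resolves to the cuda-10.1 directory, assuming that directory exists on the AMI.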
Thanks. But now this happens:
2020-12-07 18:36:26.878924: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcupti.so.10.1
2020-12-07 18:36:27.080827: E tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1441] function cupti_interface_->Subscribe( &subscriber_, (CUpti_CallbackFunc)ApiCallback, this)failed with error CUPTI_ERROR_INSUFFICIENT_PRIVILEGE
Adding the line options nvidia "NVreg_RestrictProfilingToAdminUsers=0"
to /etc/modprobe.d/nvidia-kernel-common.conf
and rebooting should resolve the permission issue.
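After the reboot, it may help to verify that the driver picked up the setting. The sketch below assumes the driver exposes its parameters in /proc/driver/nvidia/params as "Name: value" lines, with an RmProfilingAdminOnly entry (1 meaning profiling is restricted to admin users); the function name profiling_admin_only is made up for illustration.

```python
def profiling_admin_only(params_text):
    """Parse driver params text.

    Returns True if RmProfilingAdminOnly is 1, False if it has another value,
    and None if the entry is not present.
    """
    for line in params_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "RmProfilingAdminOnly":
            return value.strip() == "1"
    return None


if __name__ == "__main__":
    # Assumed location of the driver parameters; not confirmed in this thread.
    params_path = "/proc/driver/nvidia/params"
    try:
        with open(params_path) as f:
            print("admin only:", profiling_admin_only(f.read()))
    except OSError as err:
        print("could not read", params_path, "-", err)
```

If this still reports True after the reboot, the modprobe option was likely not applied, and CUPTI_ERROR_INSUFFICIENT_PRIVILEGE would be expected for non-root users.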
use sudo?
Thank you so much. It is solved at last. Hurray!!