Segmentation fault when running the model
LanHikari22 opened this issue
Hi,
I have two issues; I'll walk through my process. Initially, I tried to install the project requirements using
~/miniconda3/bin/python3 -m pip install git+https://github.com/suno-ai/bark.git
and installed the required cuDNN:
conda install cudnn=8.9.2
but got the following error when attempting to import BarkModel from transformers (source code is attached at the end of the post):
(base) ~ ➜ LD_LIBRARY_PATH=~/miniconda3/lib ~/miniconda3/bin/python3 ~/src/exp/bark_gpu.py
[+] Importing Transformers
2023-12-03 14:02:30.308417: I tensorflow/core/util/port.cc:110] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2023-12-03 14:02:30.414817: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-12-03 14:02:31.859108: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Traceback (most recent call last):
File "/home/lan/miniconda3/lib/python3.10/site-packages/transformers/utils/import_utils.py", line 1353, in _get_module
return importlib.import_module("." + module_name, self.__name__)
File "/home/lan/miniconda3/lib/python3.10/importlib/__init__.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 883, in exec_module
File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
File "/home/lan/miniconda3/lib/python3.10/site-packages/transformers/models/bark/modeling_bark.py", line 55, in <module>
from flash_attn import flash_attn_func, flash_attn_varlen_func
File "/home/lan/.local/lib/python3.10/site-packages/flash_attn/__init__.py", line 3, in <module>
from flash_attn.flash_attn_interface import (
File "/home/lan/.local/lib/python3.10/site-packages/flash_attn/flash_attn_interface.py", line 8, in <module>
import flash_attn_2_cuda as flash_attn_cuda
ImportError: /home/lan/.local/lib/python3.10/site-packages/flash_attn_2_cuda.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN3c104cuda20CUDACachingAllocator9allocatorE
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/lan/src/exp/bark_gpu.py", line 55, in <module>
from transformers import AutoProcessor, BarkModel
File "<frozen importlib._bootstrap>", line 1075, in _handle_fromlist
File "/home/lan/miniconda3/lib/python3.10/site-packages/transformers/utils/import_utils.py", line 1344, in __getattr__
value = getattr(module, name)
File "/home/lan/miniconda3/lib/python3.10/site-packages/transformers/utils/import_utils.py", line 1343, in __getattr__
module = self._get_module(self._class_to_module[name])
File "/home/lan/miniconda3/lib/python3.10/site-packages/transformers/utils/import_utils.py", line 1355, in _get_module
raise RuntimeError(
RuntimeError: Failed to import transformers.models.bark.modeling_bark because of the following error (look up to see its traceback):
/home/lan/.local/lib/python3.10/site-packages/flash_attn_2_cuda.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN3c104cuda20CUDACachingAllocator9allocatorE
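One thing I notice in the traceback: transformers loads from ~/miniconda3/... while flash_attn loads from ~/.local/..., and mixing two site-packages trees like that can pair incompatible wheels. A quick stdlib-only check (just a sketch of what I ran to confirm) that lists every location this interpreter imports from:

```python
# List the interpreter and all site-packages directories it will
# import from, to spot a mixed conda/user-site installation.
import site
import sys

print(sys.executable)
for path in site.getsitepackages() + [site.getusersitepackages()]:
    print(path)
```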
This project installs PyTorch 1.13.1, but the undefined symbol demangles to c10::cuda::CUDACachingAllocator::allocator, which suggests the prebuilt flash-attn wheel was compiled against a newer PyTorch ABI. I was able to bypass this issue by installing 2.1.1:
~/miniconda3/bin/python3 -m pip install torch==2.1.1
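Since flash-attn CUDA extensions are built against a specific torch ABI, the two versions have to match. Here is the small stdlib check I used to confirm what was actually installed (a sketch; I'm assuming the PyPI distribution names "torch" and "flash-attn"):

```python
# Report the installed versions of torch and flash-attn so an ABI
# mismatch between them is easy to spot.
from importlib.metadata import PackageNotFoundError, version

for pkg in ("torch", "flash-attn"):
    try:
        print(pkg, version(pkg))
    except PackageNotFoundError:
        print(pkg, "not installed")
```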
Now I am able to run the model and generate audio on CPU. However, trying to use my GPU with CUDA results in a segmentation fault when running the model:
(base) ~ ➜ LD_LIBRARY_PATH=~/miniconda3/lib ~/miniconda3/bin/python3 ~/src/exp/bark_gpu.py
[+] Importing Transformers
2023-12-03 14:16:58.823061: I tensorflow/core/util/port.cc:110] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2023-12-03 14:16:58.876651: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-12-03 14:16:59.624913: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Took 00:00:02.698
[+] Loading Processor
Took 00:00:00.413
[+] Loading Model
/home/lan/miniconda3/lib/python3.10/site-packages/torch/nn/utils/weight_norm.py:30: UserWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.
warnings.warn("torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.")
Took 00:00:12.618
[+] Processing Input
Took 00:00:00.179
[+] Running Model
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:10000 for open-end generation.
[1] 1156277 segmentation fault (core dumped) LD_LIBRARY_PATH=~/miniconda3/lib ~/miniconda3/bin/python3
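The next thing I plan to try is enabling faulthandler before loading the model, so the crash at least prints the Python frames it happened under (standard library, no extra install; sketch below):

```python
# faulthandler dumps the Python traceback of every thread when the
# process receives a fatal signal such as SIGSEGV, which should help
# narrow down which step of model.generate() crashes.
import faulthandler

faulthandler.enable()
# ... then load the processor/model and call model.generate() as before ...
print("faulthandler enabled:", faulthandler.is_enabled())
```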
You can find the source code in the attached
bark_gpu.py.txt
Please let me know if I need to provide any more information.
Thanks,
Lan