yangjianxin1 / Firefly

Firefly: a training tool for large language models. It supports training Qwen2, Yi1.5, Phi-3, Llama3, Gemma, MiniCPM, Yi, Deepseek, Orion, Xverse, Mixtral-8x7B, Zephyr, Mistral, Baichuan2, Llama2, Llama, Qwen, Baichuan, ChatGLM2, InternLM, Ziya2, Vicuna, Bloom, and other large models.

Problem encountered when training baichuan-7b-sft

Kris-rod opened this issue

I am training the single-GPU QLoRA version, but when I run the code I hit an error that I have not been able to resolve:
===================================BUG REPORT===================================
/home/ubuntu/anaconda3/envs/firefly/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:167: UserWarning: Welcome to bitsandbytes. For bug reports, please run

python -m bitsandbytes

warn(msg)

/home/ubuntu/anaconda3/envs/firefly/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:167: UserWarning: /home/ubuntu/anaconda3/envs/firefly did not contain ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0'] as expected! Searching further paths...
warn(msg)
/home/ubuntu/anaconda3/envs/firefly/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:167: UserWarning: /usr/local/cuda/lib64 did not contain ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0'] as expected! Searching further paths...
warn(msg)
The following directories listed in your path were found to be non-existent: {PosixPath('-DCMAKE_LINKER=/home/ubuntu/anaconda3/envs/firefly/bin/x86_64-conda-linux-gnu-ld -DCMAKE_STRIP=/home/ubuntu/anaconda3/envs/firefly/bin/x86_64-conda-linux-gnu-strip')}
The following directories listed in your path were found to be non-existent: {PosixPath('-Wl,-O2 -Wl,--sort-common -Wl,--as-needed -Wl,-z,relro -Wl,-z,now -Wl,--disable-new-dtags -Wl,--gc-sections -Wl,-rpath,/home/ubuntu/anaconda3/envs/firefly/lib -Wl,-rpath-link,/home/ubuntu/anaconda3/envs/firefly/lib -L/home/ubuntu/anaconda3/envs/firefly/lib')}
The following directories listed in your path were found to be non-existent: {PosixPath('-march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-all -fno-plt -Og -g -Wall -Wextra -fvar-tracking-assignments -ffunction-sections -pipe -isystem /home/ubuntu/anaconda3/envs/firefly/include')}
The following directories listed in your path were found to be non-existent: {PosixPath('-DNDEBUG -D_FORTIFY_SOURCE=2 -O2 -isystem /home/ubuntu/anaconda3/envs/firefly/include')}
The following directories listed in your path were found to be non-existent: {PosixPath('-D_DEBUG -D_FORTIFY_SOURCE=2 -Og -isystem /home/ubuntu/anaconda3/envs/firefly/include')}
The following directories listed in your path were found to be non-existent: {PosixPath('-march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe -isystem /home/ubuntu/anaconda3/envs/firefly/include')}
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching in backup paths...
DEBUG: Possible options found for libcudart.so: {PosixPath('/usr/local/cuda/lib64/libcudart.so')}
CUDA SETUP: PyTorch settings found: CUDA_VERSION=117, Highest Compute Capability: 8.9.
CUDA SETUP: To manually override the PyTorch CUDA version please see:https://github.com/TimDettmers/bitsandbytes/blob/main/how_to_use_nonpytorch_cuda.md
CUDA SETUP: Loading binary /home/ubuntu/anaconda3/envs/firefly/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda117.so...
libcusparse.so.11: cannot open shared object file: No such file or directory
CUDA SETUP: Something unexpected happened. Please compile from source:
git clone https://github.com/TimDettmers/bitsandbytes.git
cd bitsandbytes
CUDA_VERSION=117 make cuda11x
python setup.py install
Traceback (most recent call last):
  File "/home/ubuntu/anaconda3/Firefly-master/train.py", line 6, in <module>
    from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
  File "/home/ubuntu/anaconda3/envs/firefly/lib/python3.10/site-packages/peft/__init__.py", line 22, in <module>
    from .auto import (
  File "/home/ubuntu/anaconda3/envs/firefly/lib/python3.10/site-packages/peft/auto.py", line 30, in <module>
    from .mapping import MODEL_TYPE_TO_PEFT_MODEL_MAPPING
  File "/home/ubuntu/anaconda3/envs/firefly/lib/python3.10/site-packages/peft/mapping.py", line 20, in <module>
    from .peft_model import (
  File "/home/ubuntu/anaconda3/envs/firefly/lib/python3.10/site-packages/peft/peft_model.py", line 26, in <module>
    from accelerate import dispatch_model, infer_auto_device_map
  File "/home/ubuntu/anaconda3/envs/firefly/lib/python3.10/site-packages/accelerate/__init__.py", line 3, in <module>
    from .accelerator import Accelerator
  File "/home/ubuntu/anaconda3/envs/firefly/lib/python3.10/site-packages/accelerate/accelerator.py", line 35, in <module>
    from .checkpointing import load_accelerator_state, load_custom_state, save_accelerator_state, save_custom_state
  File "/home/ubuntu/anaconda3/envs/firefly/lib/python3.10/site-packages/accelerate/checkpointing.py", line 24, in <module>
    from .utils import (
  File "/home/ubuntu/anaconda3/envs/firefly/lib/python3.10/site-packages/accelerate/utils/__init__.py", line 131, in <module>
    from .bnb import has_4bit_bnb_layers, load_and_quantize_model
  File "/home/ubuntu/anaconda3/envs/firefly/lib/python3.10/site-packages/accelerate/utils/bnb.py", line 42, in <module>
    import bitsandbytes as bnb
  File "/home/ubuntu/anaconda3/envs/firefly/lib/python3.10/site-packages/bitsandbytes/__init__.py", line 6, in <module>
    from . import cuda_setup, utils, research
  File "/home/ubuntu/anaconda3/envs/firefly/lib/python3.10/site-packages/bitsandbytes/research/__init__.py", line 1, in <module>
    from . import nn
  File "/home/ubuntu/anaconda3/envs/firefly/lib/python3.10/site-packages/bitsandbytes/research/nn/__init__.py", line 1, in <module>
    from .modules import LinearFP8Mixed, LinearFP8Global
  File "/home/ubuntu/anaconda3/envs/firefly/lib/python3.10/site-packages/bitsandbytes/research/nn/modules.py", line 8, in <module>
    from bitsandbytes.optim import GlobalOptimManager
  File "/home/ubuntu/anaconda3/envs/firefly/lib/python3.10/site-packages/bitsandbytes/optim/__init__.py", line 6, in <module>
    from bitsandbytes.cextension import COMPILED_WITH_CUDA
  File "/home/ubuntu/anaconda3/envs/firefly/lib/python3.10/site-packages/bitsandbytes/cextension.py", line 20, in <module>
    raise RuntimeError('''
RuntimeError:
CUDA Setup failed despite GPU being available. Please run the following command to get more information:

    python -m bitsandbytes

    Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them
    to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
    and open an issue at: https://github.com/TimDettmers/bitsandbytes/issues
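
The message above suggests checking whether the CUDA libraries can be located through LD_LIBRARY_PATH. A minimal diagnostic sketch for that check follows. It is not part of Firefly or bitsandbytes: the helper names `find_library` and `can_load` are hypothetical, and the library names and the /usr/local/cuda/lib64 directory are taken from the log above. It only reports what the dynamic loader can see and does not fix anything.

```python
import ctypes
import os
from pathlib import Path


def find_library(name, search_dirs):
    """Return paths in search_dirs whose file name starts with `name` (hypothetical helper)."""
    hits = []
    for directory in search_dirs:
        path = Path(directory)
        if path.is_dir():
            hits.extend(sorted(str(f) for f in path.glob(name + "*")))
    return hits


def can_load(name):
    """Return True if the dynamic loader can open `name` (hypothetical helper)."""
    try:
        ctypes.CDLL(name)
        return True
    except OSError:
        return False


if __name__ == "__main__":
    # Directories from LD_LIBRARY_PATH, plus the backup path mentioned in the log.
    dirs = [d for d in os.environ.get("LD_LIBRARY_PATH", "").split(":") if d]
    dirs.append("/usr/local/cuda/lib64")

    # Library names taken from the log: the CUDA runtime and the one that failed to open.
    for lib in ("libcudart.so", "libcusparse.so.11"):
        print(lib, "| loadable:", can_load(lib), "| found in:", find_library(lib, dirs))
```

If `libcusparse.so.11` is reported as not loadable and not found, one possible explanation is that the CUDA toolkit under /usr/local/cuda is not an 11.x release, while the bitsandbytes binary in the log (libbitsandbytes_cuda117.so) expects CUDA 11.7 libraries. This is an assumption to verify, not a confirmed diagnosis.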