PhoebusSi / Alpaca-CoT

We unified the interfaces of instruction-tuning data (e.g., CoT data), multiple LLMs, and parameter-efficient methods (e.g., LoRA, P-Tuning) for easy use. We welcome open-source enthusiasts to initiate any meaningful PR on this repo and to integrate as many LLM-related technologies as possible. We built this fine-tuning platform to make it easy for researchers to get started with large models; we welcome any meaningful PR from open-source enthusiasts!


The recommended ChatGLM finetune command OOMs on a 3090 (24 GB); the code's default 8-bit quantization also causes OOM

StarrickLiu opened this issue · comments

Issue 1:

python3 uniform_finetune.py   --model_type chatglm --model_name_or_path THUDM/chatglm-6b \
    --data alpaca-belle-cot --lora_target_modules query_key_value \
    --lora_r 32 --lora_alpha 32 --lora_dropout 0.1 --per_gpu_train_batch_size 2 \
    --learning_rate 2e-5 --epochs 1

Running the command above causes an OOM during the training phase:

RuntimeError: CUDA out of memory. Tried to allocate 64.00 MiB (GPU 0; 23.69 GiB total capacity; 22.48 GiB already allocated; 6.06 MiB free; 22.55 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
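As the error message itself suggests, one mitigation worth trying before reducing batch size is to cap PyTorch's allocator block size via `PYTORCH_CUDA_ALLOC_CONF` to reduce fragmentation. This is only a sketch; the value 128 (MiB) is an assumption and a starting point, not a tuned setting:

```shell
# Cap the size of cached allocator blocks to reduce fragmentation
# (128 MiB is a commonly tried starting value, not a tuned one).
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128

# Then rerun the finetune command unchanged, e.g.:
# python3 uniform_finetune.py --model_type chatglm --model_name_or_path THUDM/chatglm-6b ...
```

Note that this only helps when reserved memory is well above allocated memory (as in the error above, 22.55 GiB reserved vs. 22.48 GiB allocated, the gap is small here, so it may not be enough on its own).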

With the following command, GLM training starts successfully and has not OOMed so far:

python3 uniform_finetune.py   --model_type chatglm --model_name_or_path /workspace/para/chatglm-6b \
     --data instinwild_ch --lora_target_modules query_key_value \
     --per_gpu_train_batch_size 1  --epochs 1 \
     --report_to wandb

GPU usage during training:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.85.12    Driver Version: 525.85.12    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:0A:00.0 Off |                  N/A |
| 31%   65C    P2   305W / 350W |  21916MiB / 24576MiB |     78%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+

Issue 2:

According to the README, int8 quantization cannot be used when training GLM, but the finetune code does not check for this case and skip it, which leads to OOM:

(screenshot of the offending quantization line in the finetune code)

Commenting out that line avoids the OOM at this point.
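Rather than commenting the line out by hand, the finetune script could skip int8 quantization for ChatGLM. A minimal sketch of such a guard, assuming the script collects keyword arguments for the model-loading call (the helper name `quantization_kwargs` is hypothetical, not from the repo):

```python
def quantization_kwargs(model_type: str, load_in_8bit: bool = True) -> dict:
    """Return model-loading kwargs, skipping int8 for ChatGLM.

    Per the README, ChatGLM training is incompatible with int8
    quantization, so we fall back to full/half precision for it.
    """
    if model_type == "chatglm":
        return {}  # no int8 for ChatGLM; load in fp16 instead
    if load_in_8bit:
        return {"load_in_8bit": True, "device_map": "auto"}
    return {}
```

These kwargs would then be spread into the `from_pretrained(...)` call so only non-ChatGLM models get quantized.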

ChatGLM is only 6B, yet I get OutOfMemoryError even with 32 GB. Strange; I haven't found the cause.
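A 6B model OOMing on 32 GB is less surprising than it sounds: the weights alone in fp16 take roughly 11.5 GiB (taking ~6.2e9 parameters as an approximation of ChatGLM-6B's size), and activations, gradients, and optimizer state, which grow with batch size and sequence length, come on top of that. A rough back-of-the-envelope check:

```python
# Rough fp16 memory estimate for ChatGLM-6B weights alone.
# 6.2e9 parameters is an approximation, not an exact count.
params = 6.2e9
bytes_per_param_fp16 = 2
weights_gib = params * bytes_per_param_fp16 / 2**30

print(f"~{weights_gib:.1f} GiB for weights alone")
# Activations, gradients, and optimizer state are extra and
# scale with batch size and sequence length.
```

This is why the 24 GB run above sits at ~21.9 GiB even with batch size 1: most of the headroom is consumed before the first batch.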