PhoebusSi / Alpaca-CoT

We unified the interfaces of instruction-tuning data (e.g., CoT data), multiple LLMs, and parameter-efficient methods (e.g., LoRA, P-Tuning) for easy use. We welcome open-source enthusiasts to initiate any meaningful PR on this repo and to integrate as many LLM-related technologies as possible. We built this fine-tuning platform to make it easy for researchers to get started with large models; we welcome any meaningful PR from open-source enthusiasts!


The recommended ChatGLM finetune command OOMs on a 3090 (24 GB); the code's default 8-bit quantization also causes OOM

StarrickLiu opened this issue · comments

Issue 1:

python3 uniform_finetune.py   --model_type chatglm --model_name_or_path THUDM/chatglm-6b \
    --data alpaca-belle-cot --lora_target_modules query_key_value \
    --lora_r 32 --lora_alpha 32 --lora_dropout 0.1 --per_gpu_train_batch_size 2 \
    --learning_rate 2e-5 --epochs 1

Running the command above causes an OOM during the training phase:

RuntimeError: CUDA out of memory. Tried to allocate 64.00 MiB (GPU 0; 23.69 GiB total capacity; 22.48 GiB already allocated; 6.06 MiB free; 22.55 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
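As the error message itself suggests, one mitigation worth trying before reducing batch size is to cap PyTorch's allocator block size via `PYTORCH_CUDA_ALLOC_CONF` to reduce fragmentation. This is only a sketch; the value 128 (MiB) is an assumption and a starting point, not a tuned setting:

```shell
# Cap the size of cached allocator blocks to reduce fragmentation
# (128 MiB is a commonly tried starting value, not a tuned one).
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128

# Then rerun the finetune command unchanged, e.g.:
# python3 uniform_finetune.py --model_type chatglm --model_name_or_path THUDM/chatglm-6b ...
```

Note that this only helps when reserved memory is well above allocated memory (as in the error above, 22.55 GiB reserved vs. 22.48 GiB allocated, the gap is small here, so it may not be enough on its own).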

With the following command, GLM training starts successfully and has not OOMed so far:

python3 uniform_finetune.py   --model_type chatglm --model_name_or_path /workspace/para/chatglm-6b \
     --data instinwild_ch --lora_target_modules query_key_value \
     --per_gpu_train_batch_size 1  --epochs 1 \
     --report_to wandb

GPU usage during training:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.85.12    Driver Version: 525.85.12    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:0A:00.0 Off |                  N/A |
| 31%   65C    P2   305W / 350W |  21916MiB / 24576MiB |     78%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+

Issue 2:

According to the README, int8 quantization cannot be used when training GLM, but the finetune code does not check for this case and skip it, which leads to OOM:

(screenshot of the offending quantization line in the finetune code)

Commenting out that line avoids the OOM at this point.
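Rather than commenting the line out by hand, the finetune script could skip int8 quantization for ChatGLM. A minimal sketch of such a guard, assuming the script collects keyword arguments for the model-loading call (the helper name `quantization_kwargs` is hypothetical, not from the repo):

```python
def quantization_kwargs(model_type: str, load_in_8bit: bool = True) -> dict:
    """Return model-loading kwargs, skipping int8 for ChatGLM.

    Per the README, ChatGLM training is incompatible with int8
    quantization, so we fall back to full/half precision for it.
    """
    if model_type == "chatglm":
        return {}  # no int8 for ChatGLM; load in fp16 instead
    if load_in_8bit:
        return {"load_in_8bit": True, "device_map": "auto"}
    return {}
```

These kwargs would then be spread into the `from_pretrained(...)` call so only non-ChatGLM models get quantized.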

ChatGLM is only 6B, yet I get OutOfMemoryError even with 32 GB. Strange; I haven't found the cause.
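A 6B model OOMing on 32 GB is less surprising than it sounds: the weights alone in fp16 take roughly 11.5 GiB (taking ~6.2e9 parameters as an approximation of ChatGLM-6B's size), and activations, gradients, and optimizer state, which grow with batch size and sequence length, come on top of that. A rough back-of-the-envelope check:

```python
# Rough fp16 memory estimate for ChatGLM-6B weights alone.
# 6.2e9 parameters is an approximation, not an exact count.
params = 6.2e9
bytes_per_param_fp16 = 2
weights_gib = params * bytes_per_param_fp16 / 2**30

print(f"~{weights_gib:.1f} GiB for weights alone")
# Activations, gradients, and optimizer state are extra and
# scale with batch size and sequence length.
```

This is why the 24 GB run above sits at ~21.9 GiB even with batch size 1: most of the headroom is consumed before the first batch.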