[Bug] large CUDA memory usage in the evaluation phase
ChenMnZ opened this issue
Mengzhao Chen commented
I train llama-7b with the following batch size settings:
--per_device_train_batch_size 4 \
--per_device_eval_batch_size 4 \
--gradient_accumulation_steps 4 \
During training it consumes about 9 GB of GPU memory. However, during evaluation (the MMLU evaluation) the memory consumption increases to 27 GB. Is there a bug in the evaluation process?
TIANSHU ZHU commented
Set --eval_accumulation_steps.
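For context: in the Hugging Face Trainer, --eval_accumulation_steps controls how many prediction steps are accumulated on the GPU before the output tensors (e.g. the logits) are moved to the CPU. If it is left unset, predictions for the entire evaluation set are held on the GPU before being offloaded, which would explain the memory spike. A minimal sketch of the adjusted flags (the value 1 is illustrative and trades evaluation speed for memory):
--per_device_eval_batch_size 4 \
--eval_accumulation_steps 1 \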