artidoro / qlora

QLoRA: Efficient Finetuning of Quantized LLMs

Home Page: https://arxiv.org/abs/2305.14314


[Bug] large CUDA memory usage in the evaluation phase

ChenMnZ opened this issue · comments

I train llama-7b with the following batch size settings:

    --per_device_train_batch_size 4 \
    --per_device_eval_batch_size 4 \
    --gradient_accumulation_steps 4 \

During training, it consumes about 9 GB of GPU memory. However, during evaluation (the MMLU evaluation), memory consumption increases to 27 GB. Is there a bug in the evaluation process?

Set --eval_accumulation_steps. Without it, the Hugging Face Trainer accumulates all prediction logits on the GPU until the evaluation loop finishes; setting it moves the accumulated tensors to the CPU every N steps, which keeps evaluation memory close to the training footprint.
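
For example, appended to the same launch command (a sketch; the value 4 here is an arbitrary choice, and smaller values trade some evaluation speed for lower peak GPU memory):

    --per_device_eval_batch_size 4 \
    --eval_accumulation_steps 4 \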