rmihaylov / falcontune

Tune any FALCON in 4-bit


Runtime Error: CUDA OUT OF MEMORY

amnasher opened this issue

I am trying to finetune the 7b-instruct GPTQ model, but I get a CUDA out-of-memory error when I specify a cutoff length of 2048.

Parameters:
-------config-------
dataset='./Falcontune_data.json'
data_type='alpaca'
lora_out_dir='./falcon-7b-instruct-4bit-customodel/'
lora_apply_dir=None
weights='gptq_model-4bit-64g.safetensors'
target_modules=['query_key_value']

------training------
mbatch_size=1
batch_size=2
gradient_accumulation_steps=2
epochs=3
lr=0.0003
cutoff_len=2048
lora_r=8
lora_alpha=16
lora_dropout=0.05
val_set_size=0.2
gradient_checkpointing=False
gradient_checkpointing_ratio=1
warmup_steps=5
save_steps=50
save_total_limit=3
logging_steps=5
checkpoint=False
skip=False
world_size=1
ddp=False
device_map='auto'
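
For reproduction, the dump above corresponds roughly to the following `falcontune finetune` invocation (flag names taken from the falcontune README and the config dump; the `--model` value is my assumption, so treat this as a sketch rather than the exact command I ran):

```
falcontune finetune \
    --model=falcon-7b-instruct-4bit \
    --weights=gptq_model-4bit-64g.safetensors \
    --dataset=./Falcontune_data.json \
    --data_type=alpaca \
    --lora_out_dir=./falcon-7b-instruct-4bit-customodel/ \
    --mbatch_size=1 \
    --batch_size=2 \
    --epochs=3 \
    --lr=3e-4 \
    --cutoff_len=2048 \
    --lora_r=8 \
    --lora_alpha=16 \
    --lora_dropout=0.05 \
    --warmup_steps=5 \
    --save_steps=50 \
    --save_total_limit=3 \
    --logging_steps=5 \
    --target_modules='["query_key_value"]'
```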

OutOfMemoryError: CUDA out of memory. Tried to allocate 508.00 MiB (GPU 0; 14.75
GiB total capacity; 12.29 GiB already allocated; 476.81 MiB free; 13.22 GiB
reserved in total by PyTorch) If reserved memory is >> allocated memory try
setting max_split_size_mb to avoid fragmentation. See documentation for Memory
Management and PYTORCH_CUDA_ALLOC_CONF
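
Following the error message's own suggestion, here is what I plan to try next (untested; the gradient-checkpointing flag form is my assumption based on the `gradient_checkpointing=False` entry in the config dump above):

```
# Allow the caching allocator to split large blocks, as the error
# message suggests, to reduce fragmentation.
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128

# Re-run with a shorter cutoff length, and with gradient checkpointing
# enabled (assumed flag) to trade compute for activation memory.
falcontune finetune \
    --cutoff_len=512 \
    --gradient_checkpointing=True \
    ...  # remaining flags as in the invocation above
```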