artidoro / qlora

QLoRA: Efficient Finetuning of Quantized LLMs

Home Page: https://arxiv.org/abs/2305.14314


extra memory usage for loading the model

XintianHan opened this issue

In Figure 6 of the paper, why are the memory usages for 7B, 13B, 33B, and 65B equal to 5046M, 8476M, 19302M, and 37074M, rather than roughly 3.5G, 6.5G, 16.5G, and 32.5G?

I understand that some memory goes to the quantization constants, but I don't think the gap should be this large; for 7B, for example, there is a ~1.5G gap.
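For reference, here is the rough arithmetic behind my expectation: a minimal sketch that only counts 4-bit weight storage plus the double-quantization constants, using the block sizes reported in the paper (64 for the NF4 weights, 256 for the second-level constants) and nominal parameter counts (7B = 7e9, etc., which differ slightly from the real LLaMA sizes):

```python
GiB = 1024 ** 3

def estimated_weight_footprint_gib(n_params: int) -> float:
    # NF4 stores each weight in 4 bits.
    bits_per_param = 4.0
    # Double quantization (per the paper): one 8-bit absmax per 64-weight
    # block, plus one fp32 constant per 256 first-level constants,
    # i.e. roughly 0.127 extra bits per parameter.
    bits_per_param += 8 / 64 + 32 / (64 * 256)
    return n_params * bits_per_param / 8 / GiB

# Nominal sizes; ignores LoRA adapters, CUDA context, activations, etc.
for name, n in [("7B", 7e9), ("13B", 13e9), ("33B", 33e9), ("65B", 65e9)]:
    print(f"{name}: ~{estimated_weight_footprint_gib(int(n)):.2f} GiB")
```

That gives roughly 3.4, 6.3, 15.9, and 31.2 GiB, so the constants alone don't seem to account for the extra memory shown in the figure.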