yandex / YaLM-100B

Pretrained language model with 100B parameters


CUDA out of memory

Aspector1 opened this issue · comments

Hello, I'm trying to use YaLM to generate text with the pretrained model, but when I run generation I get this error:

RuntimeError: CUDA out of memory. Tried to allocate 76.00 MiB (GPU 0; 5.80 GiB total capacity; 62.50 MiB already allocated; 20.81 MiB free; 64.00 MiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

My GPU is a GTX 1660 with 6 GB of VRAM. Is there anything I can do about it, or have I wasted a few weeks?
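For context, a rough back-of-the-envelope calculation (pure Python, no dependencies) shows why the weights alone cannot fit in 6 GiB of VRAM:

```python
def model_memory_gib(num_params: float, bytes_per_param: int) -> float:
    """Estimate raw weight memory in GiB (ignores activations, KV cache and overhead)."""
    return num_params * bytes_per_param / 1024**3

# YaLM-100B weights stored in fp16 (2 bytes per parameter)
weights = model_memory_gib(100e9, 2)
print(f"fp16 weights alone: {weights:.0f} GiB")                 # ~186 GiB
print(f"GTX 1660 VRAM: 6 GiB, {6 / weights:.1%} of the weights")
```

Even before any activations are allocated, the checkpoint is roughly 30x larger than the card's memory, which is why the allocator fails almost immediately.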

commented

The neural network requires 200 GB of video memory to run. Have you even looked into the details?

The neural network requires 200 GB of video memory to run. Have you even looked into the details?

I'm not trying to retrain the model, I'm trying to use it.

commented

There is no difference.

My GPU is a GTX 1660 with 6 GB of VRAM. Is there anything I can do about it, or have I wasted a few weeks?

You may try Hugging Face Accelerate: https://github.com/huggingface/accelerate (see https://github.com/huggingface/accelerate/blob/main/src/accelerate/big_modeling.py)

commented

My GPU is a GTX 1660 with 6 GB of VRAM. Is there anything I can do about it, or have I wasted a few weeks?

You may try Hugging Face Accelerate: https://github.com/huggingface/accelerate (see https://github.com/huggingface/accelerate/blob/main/src/accelerate/big_modeling.py)

Can you tell me more about how to load such a large model on the 1660?

@Aspector1 by the way, did you use Docker to run it?