mkturkcan / generative-agents

An attempt to build a working, locally-running cheap version of Generative Agents: Interactive Simulacra of Human Behavior

Out of Memory?

driftboat opened this issue

[screenshot of the out-of-memory error]

You don't have enough VRAM to load the default 3B flan-alpaca-xl model. Try replacing it with the 770M flan-alpaca-large model, which needs only a little over 3 GB of VRAM to load.
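If the repo loads its model by name through the Hugging Face transformers pipeline (an assumption about the code here, so names may differ), the swap is a one-line change; both IDs below are the real declare-lab checkpoints:

```python
# A minimal sketch of the model swap, assuming the repo uses the Hugging Face
# transformers pipeline; adjust to match the actual loading code.
from transformers import pipeline

# Default: 3B parameters, roughly a 16 GB GPU.
# generator = pipeline(model="declare-lab/flan-alpaca-xl", device=0)

# Replacement: 770M parameters, a little over 3 GB of VRAM.
generator = pipeline(model="declare-lab/flan-alpaca-large", device=0)

print(generator("Write a short plan for the day.", max_length=128)[0]["generated_text"])
```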

From my testing, the xl (3B) model seems to need a 16 GB GPU. The large (770M) model will work on an 8 GB GPU, but I don't know about a 6 GB card.

Replacing the xl model with a smaller one will allow it to run; however, I have found that anything below the xl model returns somewhat unintelligible results.

If you don't have a powerful enough GPU, you will either need to upgrade or consider a cloud service like Colab, Kaggle, or Paperspace Gradient, though none of the free tiers are powerful enough.
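As a rough illustration (my own sketch, not code from the repo), you could pick a checkpoint from the detected VRAM; the thresholds just restate the numbers above:

```python
# A rough sketch (not from the repo): choose a flan-alpaca checkpoint based on
# available VRAM. The thresholds restate the numbers reported in this thread.
import torch

def pick_checkpoint() -> str:
    if not torch.cuda.is_available():
        # No GPU at all: the suggestion above is a cloud service instead.
        raise RuntimeError("No CUDA GPU found; consider Colab/Kaggle/Paperspace.")
    total_gb = torch.cuda.get_device_properties(0).total_memory / 1024**3
    if total_gb >= 16:
        return "declare-lab/flan-alpaca-xl"   # 3B: needs ~16 GB in my testing
    return "declare-lab/flan-alpaca-large"    # 770M: works on an 8 GB card

print(pick_checkpoint())
```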


Yep, I ran xl (3B) successfully on a 3080 Ti (12 GB) and large (770M) successfully on a 3060 (6 GB), but as you said, the results generated with large (770M) are not very good.
Because of that, I tried running it on other models, such as ChatGLM-6B, which worked successfully on the 3060 (6 GB).
However, most LLMs keep emphasizing that they are a language model, along with all kinds of moral and ethical restrictions, when they run, so I am trying to run this project with a less restricted model.
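For reference, here is a sketch of loading ChatGLM-6B on a small card, following the model's published quickstart; the INT4 `quantize(4)` call is my assumption about how it fits in 6 GB, and wiring it into this project would still need prompt-handling changes:

```python
# Sketch following the ChatGLM-6B quickstart published by THUDM. quantize(4)
# (INT4 weights) is an assumption about how it fits on a 6 GB card; this repo
# does not ship this code.
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
model = (
    AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
    .half()
    .quantize(4)  # INT4: drops VRAM use to roughly 5-6 GB
    .cuda()
    .eval()
)

response, history = model.chat(tokenizer, "Plan your morning in one sentence.", history=[])
print(response)
```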

I found that with the large (770M) model I got quite a few stray symbols mixed in with the text. If you're happy to ignore them, or to post-process the output to remove them, the actual text content is quite usable. It would be interesting to see the results with the new flan-shareGPT models they've just released; I haven't gotten round to testing those.
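If it helps, a trivial post-processing pass along these lines (my own sketch; exactly which characters to strip is an assumption) makes the large (770M) output easier to use:

```python
# A trivial sketch of stripping stray symbols from large (770M) output.
# Which characters to drop is an assumption; adjust for what your runs produce.
import re

def clean_output(text: str) -> str:
    # Keep letters, digits, whitespace, and common punctuation; drop the rest.
    text = re.sub(r"[^A-Za-z0-9\s.,;:!?'\"()\-]", "", text)
    # Collapse whitespace runs left behind by the removals.
    return re.sub(r"\s+", " ", text).strip()

print(clean_output("Go to the \u25a0 market \u25a0\u25a0 and buy bread!"))
```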