yandex / YaLM-100B

Pretrained language model with 100B parameters

Provide pruned version for weaker hardware

CommanderTvis opened this issue · comments

It would be really useful to have a pruned version of the model (like Balaboba) that could run on less powerful GPU setups.

Quantization, even down to 4 bits, may also be possible, as has been done successfully for LLaMA: https://github.com/ggerganov/llama.cpp
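For reference, the core idea behind the kind of low-bit quantization llama.cpp uses is blockwise absmax scaling: each small block of weights gets its own scale so that the largest value in the block maps onto the 4-bit signed range. A minimal NumPy sketch (the function names and block size are illustrative, not taken from llama.cpp or YaLM):

```python
import numpy as np

def quantize_4bit(weights: np.ndarray, block_size: int = 64):
    """Blockwise absmax quantization to 4-bit signed codes (sketch).

    Each block of `block_size` values is scaled by its own absolute
    maximum, so the largest value in the block maps to +/-7.
    """
    flat = weights.astype(np.float32).ravel()
    pad = (-len(flat)) % block_size          # pad so blocks divide evenly
    flat = np.pad(flat, (0, pad))
    blocks = flat.reshape(-1, block_size)
    scales = np.abs(blocks).max(axis=1, keepdims=True)
    scales[scales == 0] = 1.0                # avoid division by zero
    codes = np.round(blocks / scales * 7).astype(np.int8)  # in [-7, 7]
    return codes, scales, weights.shape, pad

def dequantize_4bit(codes, scales, shape, pad):
    """Reconstruct an approximate float32 tensor from the 4-bit codes."""
    flat = (codes.astype(np.float32) / 7) * scales
    flat = flat.ravel()
    if pad:
        flat = flat[:-pad]
    return flat.reshape(shape)
```

A real implementation would additionally pack two 4-bit codes per byte (this sketch stores them in `int8` for clarity), which is where the ~8x memory reduction over float32 comes from.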

+1. This distributed-inference technique might also be very applicable here: https://petals.ml