question: Running (small) LLMs on Raspberry Pi 5
ChristianWeyer opened this issue
Summary
Hey!
Should we be able to run e.g. Phi-2 with Q4_K_M on a Raspberry Pi 5 (8GB RAM) using WasmEdge?
Did anybody already try it?
Thanks.
I think it should work. With WasmEdge, you can run llama2-7b on an 8GB RAM machine. But I don't have a Raspberry Pi at hand. You're welcome to give it a try.
Phi-2 model support is on the way. You can try TinyLlama first: https://www.secondstate.io/articles/tinyllama-1.1b-chat/
Thank you @alabulei1 !
One general hint: it would be great if the articles on secondstate.io would contain a date when they were published. This is kind of important, especially in the field of Gen AI where everything moves so fast...
BTW: TinyLlama 1.0 dropped recently 😊
https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0
> BTW: TinyLlama 1.0 dropped recently 😊 https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0
That's cool. We should support the new version. Let me give it a try.
> Thank you @alabulei1 !
> One general hint: it would be great if the articles on secondstate.io would contain a date when they were published. This is kind of important, especially in the field of Gen AI where everything moves so fast...
Great idea. Let me see how to add dates to these articles.
Thanks for the information. I can run TinyLlama-1.1B-Chat-v1.0 successfully. I will create an issue to update the model list. You're also welcome to give it a try.
curl -LO https://huggingface.co/TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF/resolve/main/tinyllama-1.1b-chat-v1.0.Q3_K_L.gguf
wasmedge --dir .:. --nn-preload default:GGML:AUTO:tinyllama-1.1b-chat-v1.0.Q3_K_L.gguf llama-chat.wasm -p chatml
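For anyone sizing this up on a Pi, a rough sanity check is to compare the quantized GGUF file size plus some runtime headroom against available memory. The numbers below are assumptions for illustration (a ~1.1B model at Q3_K_L is roughly 500 MB; the headroom figure for the KV cache and the WasmEdge runtime is a guess), not measured values:

```shell
# Rough sketch: will a quantized GGUF model fit in the Pi 5's 8 GB RAM?
# The file size is a lower bound on resident memory; the KV cache and
# runtime add overhead on top. Both figures below are assumptions.
model_mb=500        # approx. size of tinyllama-1.1b-chat at Q3_K_L
overhead_mb=400     # assumed headroom for KV cache + WasmEdge runtime
need_mb=$(( model_mb + overhead_mb ))

# MemAvailable from /proc/meminfo is reported in kB; convert to MB.
avail_mb=$(awk '/MemAvailable/ { print int($2/1024) }' /proc/meminfo)

if [ "$avail_mb" -ge "$need_mb" ]; then
  echo "should fit: need ~${need_mb} MB, ${avail_mb} MB available"
else
  echo "tight: need ~${need_mb} MB, only ${avail_mb} MB available"
fi
```

With an 8 GB Pi 5 and a 3-bit quantized 1.1B model there is plenty of margin; a 7B model at Q4_K_M (~4 GB file) would also fit, just with less room to spare.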
It works :-). Thanks!
Glad to hear that. May I ask how much RAM your Raspberry Pi has and which model you're using?
RPi 5 with 8GB RAM.