LlamaEdge / LlamaEdge

The easiest & fastest way to run customized and fine-tuned LLMs locally or on the edge

Home Page: https://llamaedge.com/


question: Running (small) LLMs on Raspberry Pi 5

ChristianWeyer opened this issue

Summary

Hey!

Should we be able to run e.g. Phi-2 with Q4_K_M on a Raspberry Pi 5 (8GB RAM) using WasmEdge?
Did anybody already try it?

Thanks.


I think it can work. With WasmEdge, you can run llama2-7b on an 8 GB RAM machine. But I don't have a Raspberry Pi on hand, so you're welcome to give it a try.
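(Back-of-the-envelope, as a rough sanity check rather than a measurement: a 7B model at Q4_K_M is about 7 B weights × ~4.5 bits ≈ 4 GB on disk, and the runtime needs roughly that plus the KV cache in memory, so an 8 GB board leaves some headroom. On the Pi you can compare the downloaded GGUF's size against available memory; the filename below is only a placeholder:)

ls -lh ./model.Q4_K_M.gguf

free -h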

The Phi-2 model support is on the way. You can try TinyLlama first. https://www.secondstate.io/articles/tinyllama-1.1b-chat/

Thank you @alabulei1 !

One general hint: it would be great if the articles on secondstate.io included the date when they were published. This is kind of important, especially in the field of Gen AI, where everything moves so fast...

BTW: TinyLlama 1.0 dropped recently 😊 https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0

That's cool. We should support the new version. Let me give it a try.

Great idea. Let me see how to add dates to these articles.

Hi @ChristianWeyer

Thanks for the information. I can run TinyLlama-1.1B-Chat-v1.0 successfully. I will create an issue to update the model list. You're also welcome to give it a try.

curl -LO https://huggingface.co/TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF/resolve/main/tinyllama-1.1b-chat-v1.0.Q3_K_L.gguf

wasmedge --dir .:. --nn-preload default:GGML:AUTO:tinyllama-1.1b-chat-v1.0.Q3_K_L.gguf llama-chat.wasm -p chatml
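(These two commands assume WasmEdge with the GGML plugin and the llama-chat.wasm app are already on the Pi. A minimal setup sketch, based on the install script and release asset documented by WasmEdge/LlamaEdge; please double-check against the current docs:)

curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install.sh | bash -s -- --plugin wasi_nn-ggml

curl -LO https://github.com/LlamaEdge/LlamaEdge/releases/latest/download/llama-chat.wasm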

It works :-). Thanks!

Glad to hear that. May I ask how much RAM your Raspberry Pi has and which model you're using?

RPi 5 with 8GB RAM.