LlamaEdge / LlamaEdge

The easiest & fastest way to run customized and fine-tuned LLMs locally or on the edge

Home Page: https://llamaedge.com/


question: Running (small) LLMs on Raspberry Pi 5

ChristianWeyer opened this issue

Summary

Hey!

Should we be able to run e.g. Phi-2 with Q4_K_M on a Raspberry Pi 5 (8GB RAM) using WasmEdge?
Did anybody already try it?

Thanks.


I think it can work. With WasmEdge, you can run llama2-7b on an 8 GB RAM machine. But I don't have a Raspberry Pi on hand, so you're welcome to give it a try.
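(Back-of-the-envelope, as a rough sanity check rather than a measurement: a 7B model at Q4_K_M is about 7 B weights × ~4.5 bits ≈ 4 GB on disk, and the runtime needs roughly that plus the KV cache in memory, so an 8 GB board leaves some headroom. On the Pi you can compare the downloaded GGUF's size against available memory; the filename below is only a placeholder:)

ls -lh ./model.Q4_K_M.gguf

free -h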

The Phi-2 model support is on the way. You can try TinyLlama first. https://www.secondstate.io/articles/tinyllama-1.1b-chat/

Thank you @alabulei1 !

One general hint: it would be great if the articles on secondstate.io included the date when they were published. This is kind of important, especially in the field of Gen AI, where everything moves so fast...

BTW: TinyLlama 1.0 dropped recently 😊 https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0

That's cool. We should support the new version. Let me give it a try.

Great idea. Let me see how to add dates to these articles.

Hi @ChristianWeyer

Thanks for the information. I can run TinyLlama-1.1B-Chat-v1.0 successfully. I will create an issue to update the model list. You're also welcome to give it a try.

curl -LO https://huggingface.co/TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF/resolve/main/tinyllama-1.1b-chat-v1.0.Q3_K_L.gguf

wasmedge --dir .:. --nn-preload default:GGML:AUTO:tinyllama-1.1b-chat-v1.0.Q3_K_L.gguf llama-chat.wasm -p chatml
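(These two commands assume WasmEdge with the GGML plugin and the llama-chat.wasm app are already on the Pi. A minimal setup sketch, based on the install script and release asset documented by WasmEdge/LlamaEdge; please double-check against the current docs:)

curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install.sh | bash -s -- --plugin wasi_nn-ggml

curl -LO https://github.com/LlamaEdge/LlamaEdge/releases/latest/download/llama-chat.wasm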

It works :-). Thanks!

Glad to hear that. May I ask how much RAM your Raspberry Pi has and which model you're using?

RPi 5 with 8GB RAM.