LAION-AI / natural_voice_assistant

Adding quantized Model

SaddamBInSyed opened this issue · comments

Hi
Thanks for this great work.

Is there a method to add a quantized LLM model so that it can be run on a GPU with under 10GB of VRAM, making it accessible to more users?

Note:
I am running llama3 via the ollama tool on my laptop, so once this option is available in this repo, I can test it on my laptop directly.
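For context, the memory saving the request is after comes from storing weights in a low-bit integer format instead of float32/float16. The repo itself would likely use a library such as bitsandbytes or a GGUF/GPTQ checkpoint for this; the snippet below is only a minimal NumPy sketch of the underlying idea (symmetric per-tensor int8 quantization), with all names being illustrative rather than taken from this repo:

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric per-tensor int8 quantization: keep int8 values plus one float scale."""
    scale = np.abs(weights).max() / 127.0
    q = np.round(weights / scale).astype(np.int8)  # values land in [-127, 127]
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float32 weights from the int8 representation."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(1024, 1024)).astype(np.float32)

q, scale = quantize_int8(w)

# int8 storage is 4x smaller than float32
print(w.nbytes // q.nbytes)  # 4

# rounding error is bounded by half a quantization step
print(float(np.abs(dequantize(q, scale) - w).max()) <= scale / 2)  # True
```

The same per-tensor idea, applied at 4 bits with per-group scales, is roughly what lets 7B/8B-class models fit under 10GB of VRAM.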

Thank you