mlc-ai / web-llm

High-performance In-browser LLM Inference Engine

Home Page: https://webllm.mlc.ai

I am seeing 100% RAM usage on my laptop when running this. Can you please let me know whether this is an issue, or whether the minimum RAM requirement is higher than 8 GB?

devashish234073 opened this issue

Hi @devashish234073, if you look at https://github.com/mlc-ai/web-llm/blob/main/examples/simple-chat/src/gh-config.js, each model has a field called vram_required_MB. That figure is an optimistic estimate, and actual usage will typically be higher. For Llama 7B q4f32 specifically, it is indeed around 8 GB. I would suggest trying a smaller model (e.g. one of the 3B ones), and using an f16 variant if your browser/device supports it.
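If you want to filter programmatically rather than reading the config by hand, here is a minimal sketch. It assumes the `prebuiltAppConfig` export of the current `@mlc-ai/web-llm` package, whose `model_list` entries carry the same `vram_required_MB` field as gh-config.js; the 8 GB budget and the 75% headroom factor are illustrative, not values from this thread.

```ts
// Sketch: list prebuilt models whose estimated VRAM fits the device.
import { prebuiltAppConfig } from "@mlc-ai/web-llm";

// vram_required_MB is an optimistic estimate, so leave headroom.
const budgetMB = 8 * 1024 * 0.75; // ~6 GB usable out of 8 GB (illustrative)

const candidates = prebuiltAppConfig.model_list
  // Exclude models with no estimate, then keep those within budget.
  .filter((m) => (m.vram_required_MB ?? Infinity) <= budgetMB)
  // Largest model that still fits comes first.
  .sort((a, b) => (b.vram_required_MB ?? 0) - (a.vram_required_MB ?? 0));

console.log(
  "Models likely to fit:",
  candidates.map((m) => `${m.model_id} (~${m.vram_required_MB} MB)`)
);
```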

You can also use https://github.com/mlc-ai/web-llm/tree/main/utils/vram_requirements to see how this usage is broken down.
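On the f16 point: whether a q4f16 model will work depends on the `shader-f16` WebGPU feature. Below is a minimal detection sketch using only standard WebGPU APIs (it assumes WebGPU type definitions such as `@webgpu/types` are available in your TypeScript setup).

```ts
// Sketch: detect WebGPU and shader-f16 support before choosing
// between q4f16 and q4f32 model variants.
async function supportsF16(): Promise<boolean> {
  if (!("gpu" in navigator)) {
    console.warn("WebGPU is not available in this browser.");
    return false;
  }
  const adapter = await navigator.gpu.requestAdapter();
  // "shader-f16" is the WebGPU feature gating 16-bit float shaders.
  return adapter?.features.has("shader-f16") ?? false;
}

supportsF16().then((ok) =>
  console.log(ok ? "f16 models should work" : "fall back to f32 models")
);
```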