mit-han-lab / TinyChatEngine

TinyChatEngine: On-Device LLM Inference Library

Home Page: https://mit-han-lab.github.io/TinyChatEngine/


Problem with "Loading model... Killed"

ecliipt opened this issue

So I've been trying to test TinyChatEngine on my Windows 10 laptop (7 GB of RAM, 11th-gen i3) under the latest WSL Debian, and I'm not sure whether this is a bug or just a lack of specs. Here's what I've encountered:

  • Every time I try to chat with LLaMA2_7B_chat_awq_int4 --QM QM_x86 (I followed the tutorial in the README; see the memory check after this list):
(venv) user@LAPTOP:~/TinyChatEngine/llm$ ./chat
TinyChatEngine by MIT HAN Lab: https://github.com/mit-han-lab/TinyChatEngine
Using model: LLaMA2_7B_chat
Using AWQ for 4bit quantization: https://github.com/mit-han-lab/llm-awq
Loading model... Killed
  • When I try to use OPT models like the 125M or the 1.3B (fp32; see the download check after this list):
(venv) user@LAPTOP:~/TinyChatEngine/llm$ ./chat OPT_125m
TinyChatEngine by MIT HAN Lab: https://github.com/mit-han-lab/TinyChatEngine
Using model: OPT_125m
Loading model... No such file or directory: INT4/models/OPT_125m/decoder/embed_tokens/weight.bin
terminate called after throwing an instance of 'char const*'
Aborted
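
For context on the first failure: a bare "Killed" almost always means the Linux kernel's OOM killer stopped the process. The LLaMA2-7B int4 weights are several GB on their own, and WSL2 by default only assigns the VM a fraction of the host's RAM, so a 7 GB laptop can plausibly run out while loading. A minimal check from inside WSL (standard tools, nothing TinyChatEngine-specific):

sudo dmesg | grep -iE "killed process|out of memory"   # confirm an OOM kill happened
free -h                                                # RAM/swap actually visible to WSL

If WSL's limit is the bottleneck, memory and swap can be raised in %UserProfile%\.wslconfig on the Windows side (the values below are illustrative, not recommendations), followed by wsl --shutdown to restart the VM:

[wsl2]
memory=6GB
swap=8GB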
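
The OPT failure is different: ./chat is looking for weight files under INT4/models/OPT_125m/ that were never downloaded. A quick sanity check, with the README's download step as the assumed fix (the exact OPT model ID and --QM value here are guesses; check tools/download_model.py for the supported names):

ls -lh INT4/models/OPT_125m/decoder/embed_tokens/weight.bin   # the file ./chat expects
python tools/download_model.py --model OPT_125m --QM fp32     # hypothetical ID/flag; verify against the script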

I'm not really going to use the OPT models, but I thought it would be good to note.
Is there anything I can do to "fix" this, or does it come down to the laptop's specs? Thanks & happy holidays.