LlamaEdge / LlamaEdge

The easiest & fastest way to run customized and fine-tuned LLMs locally or on the edge

Home Page: https://llamaedge.com/


Model requests

alabulei1 opened this issue

Summary

AI technology is evolving rapidly, and new versions of many popular models have been released. We should support the latest versions of the following models.

  • TinyLlama
  • OpenChat-3.5-1210
  • OpenChat-3.5-0106
  • Nous-Hermes-2-Mixtral-8x7B-DPO
  • Nous-Hermes-2-Mixtral-8x7B-SFT

Please help us extend the model list via comments.
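
For context, models on this list are usually verified with a command of the following shape. A minimal sketch, where MODEL.gguf and TEMPLATE are placeholders for the model file and its prompt template (see the verification comments below for concrete values):

# MODEL.gguf and TEMPLATE are placeholders, not actual file or flag values
wasmedge --dir .:. --nn-preload default:GGML:AUTO:MODEL.gguf llama-chat.wasm -p TEMPLATE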

Appendix

No response

I have verified tinyllama-1.1b-chat-v1.0, which works well.

The command lines I used:

wasmedge --dir .:. --nn-preload default:GGML:AUTO:tinyllama-1.1b-chat-v1.0.Q3_K_L.gguf llama-chat.wasm -p chatml

wasmedge --dir .:. --nn-preload default:GGML:AUTO:tinyllama-1.1b-chat-v1.0.Q3_K_L.gguf llama-api-server.wasm -p chatml

The wasm files I used:

wasmedge llama-chat.wasm -V
llama-chat 0.2.3

wasmedge llama-api-server.wasm -V
llama-api-server 0.2.3
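
Once llama-api-server.wasm is running, it can be smoke-tested with an OpenAI-style chat request. A minimal sketch, assuming the server's default bind address of 0.0.0.0:8080; the "model" value here is illustrative:

# assumes the default port 8080; adjust "model" to the preloaded model's name
curl -X POST http://localhost:8080/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"messages":[{"role":"user","content":"Hello"}], "model":"tinyllama-1.1b-chat-v1.0"}'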

I have verified OpenChat-3.5-1210, which works well.

The command lines I used:

wasmedge --dir .:. --nn-preload default:GGML:AUTO:openchat-3.5-1210.Q2_K.gguf llama-chat.wasm -p openchat -r '<|end_of_turn|>'

wasmedge --dir .:. --nn-preload default:GGML:AUTO:openchat-3.5-1210.Q2_K.gguf llama-api-server.wasm -p openchat -r '<|end_of_turn|>'

@apepkuss I noticed that you already added TinyLlama-Chat-v1.0. I think we can also add openchat-3.5-1210.

The latest release (b1016) already supports these two models. The download URLs:

Just tried Nous-Hermes-2-Mixtral-8x7B-DPO using the chatml prompt template. It works well. https://huggingface.co/NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO-GGUF
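
For reference, a sketch of the command shape following the earlier examples. The quantization level in the filename is an assumption; substitute whichever .gguf file was actually downloaded from that repo:

# the Q4_K_M filename is an assumed quantization, not verified in this thread
wasmedge --dir .:. --nn-preload default:GGML:AUTO:Nous-Hermes-2-Mixtral-8x7B-DPO.Q4_K_M.gguf llama-chat.wasm -p chatml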