Playground using llama-cpp
This repository contains experiments on running local models with the llama-cpp library, accessed through the node-llama-cpp wrapper. The supported models are listed below.
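As a rough idea of what this looks like in code, here is a minimal sketch based on the v2-style API from the node-llama-cpp guide linked below; class names and options may differ with the installed version, and the model file name is just one of the models benchmarked further down.

```js
import path from "path";
import { LlamaModel, LlamaContext, LlamaChatSession } from "node-llama-cpp";

// Load a GGUF model from the ./models/ folder (any of the downloaded models works).
const model = new LlamaModel({
    modelPath: path.join("models", "mistral-7b-instruct-v0.1.Q6_K.gguf")
});
const context = new LlamaContext({ model });
const session = new LlamaChatSession({ context });

// Ask a single question and print the completion.
const answer = await session.prompt("What is a GGUF file?");
console.log(answer);
```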
To install the project
npm install
To download a model from Hugging Face
npm run download:mistral-7b-instruct
It will be downloaded into the ./models/ folder.
To benchmark the downloaded models
npm run benchmark
Output:
Benching model... codellama-7b-instruct.Q4_K_M.gguf
Time elapsed: 21.71-seconds
Speed: 108.75 chars/second - 25.01 tokens/second
Benching model... llama-2-7b-chat.Q6_K.gguf
Time elapsed: 22.05-seconds
Speed: 83.73 chars/second - 18.82 tokens/second
Benching model... mistral-7b-instruct-v0.1.Q6_K.gguf
Time elapsed: 20.07-seconds
Speed: 71.05 chars/second - 17.19 tokens/second
Benching model... zephyr-7b-alpha.Q6_K.gguf
Time elapsed: 14.41-seconds
Speed: 75.00 chars/second - 16.65 tokens/second
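These numbers are simply generation speed measured over a single prompt. A plausible sketch of how such a measurement could be done with node-llama-cpp is shown below; this is an illustration under the same API assumptions as above, not the project's actual benchmark script.

```js
import path from "path";
import { LlamaModel, LlamaContext, LlamaChatSession } from "node-llama-cpp";

// Hypothetical helper: time a single prompt against one downloaded model.
async function benchModel(fileName) {
    console.log("Benching model...", fileName);
    const model = new LlamaModel({ modelPath: path.join("models", fileName) });
    const context = new LlamaContext({ model });
    const session = new LlamaChatSession({ context });

    const start = Date.now();
    const answer = await session.prompt("Tell me a short story about a llama.");
    const seconds = (Date.now() - start) / 1000;

    // chars/second from the raw string length, tokens/second via the tokenizer.
    const tokenCount = context.encode(answer).length;
    console.log(`Time elapsed: ${seconds.toFixed(2)} seconds`);
    console.log(`Speed: ${(answer.length / seconds).toFixed(2)} chars/second - ` +
                `${(tokenCount / seconds).toFixed(2)} tokens/second`);
}

await benchModel("mistral-7b-instruct-v0.1.Q6_K.gguf");
```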
Check the ./package.json scripts and look at download:* or validate:*.
To download a specific model
npm run download:mistral-7b-instruct
To check the download was successful
npm run validate:mistral-7b-instruct
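The validate:* scripts live in ./package.json; as a purely hypothetical sketch of what such a check could do, the snippet below verifies that the downloaded file exists, has a plausible size, and starts with the GGUF magic bytes.

```js
import fs from "fs";
import path from "path";

// Hypothetical validation sketch; the real validate:* scripts may work differently.
const modelFile = path.join("models", "mistral-7b-instruct-v0.1.Q6_K.gguf");

const stats = fs.statSync(modelFile);
if (stats.size < 1024 * 1024) {
    throw new Error(`${modelFile} looks too small to be a full GGUF download`);
}

// GGUF files start with the ASCII magic bytes "GGUF".
const fd = fs.openSync(modelFile, "r");
const magic = Buffer.alloc(4);
fs.readSync(fd, magic, 0, 4, 0);
fs.closeSync(fd);

console.log(magic.toString("ascii") === "GGUF" ? "Model looks valid" : "Not a GGUF file");
```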
- node-llama-cpp documentation: https://withcatai.github.io/node-llama-cpp/guide/
- model to download: https://huggingface.co/TheBloke/llama-2-7B-Arguments-GGUF
- LangChain official documentation: https://js.langchain.com/docs/modules/model_io/models/llms/integrations/llama_cpp
The models are in GGUF format and are downloaded from Hugging Face:
- https://huggingface.co/TheBloke/CodeLlama-13B-Instruct-GGUF
- https://huggingface.co/TheBloke/CodeLlama-7B-Instruct-GGUF
- https://huggingface.co/TheBloke/llama-2-7B-Arguments-GGUF
examples/
Contains examples:
- examples/langchain-chat-bot.js is a chat bot using langchain.js and the local model (a sketch of this integration is shown below)
- examples/raw-chat-bot.js is a chat bot using node-llama-cpp and the local model
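As a hedged sketch of what the LangChain.js variant could look like (based on the integration docs linked above; the exact import path and call method depend on the LangChain.js version installed):

```js
// Depending on the LangChain.js version, the import path may instead be
// "@langchain/community/llms/llama_cpp", and older versions use .call() instead of .invoke().
import { LlamaCpp } from "langchain/llms/llama_cpp";

const model = new LlamaCpp({
    modelPath: "./models/mistral-7b-instruct-v0.1.Q6_K.gguf"
});

const response = await model.invoke("Where do llamas come from?");
console.log(response);
```

The raw chat bot example presumably wraps the node-llama-cpp session shown at the top of this README in an interactive prompt loop.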