docker build -t llm-image .
docker run -d --name=llm-image-gguf-001 -p 8080:8080 -p 88:88 --restart unless-stopped llm-image:latest
- Web server docs (llama-cpp-python, OpenAPI): http://localhost:8080/docs
- Front end (it's on the house!): http://localhost:88/
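Once the container is up, the llama-cpp-python server speaks an OpenAI-compatible REST API on port 8080. A minimal sketch of querying it from Python with only the standard library (the `/v1/completions` route is llama-cpp-python's standard completions endpoint; the host, prompt, and `max_tokens` value here are just example values):

```python
import json
import urllib.request


def build_payload(prompt: str, max_tokens: int = 64) -> dict:
    """Build the JSON body for a /v1/completions request."""
    return {"prompt": prompt, "max_tokens": max_tokens}


def complete(prompt: str, host: str = "http://localhost:8080") -> str:
    """Send a completion request to the containerized server and
    return the generated text from the first choice."""
    data = json.dumps(build_payload(prompt)).encode()
    req = urllib.request.Request(
        f"{host}/v1/completions",
        data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["text"]


if __name__ == "__main__":
    print(complete("Q: What is the capital of France? A:"))
```

The interactive docs at http://localhost:8080/docs list the full request schema if you want to pass sampling parameters such as `temperature` or `stop`.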
Let's run Llama-2-7B (GGUF, 4-bit quantized)
MIT License