docker build -t llm-image .
docker run -d --name=llm-image-gguf-001 -p 8080:8080 -p 88:88 --restart unless-stopped llm-image:latest
- Web server docs (llama-cpp-python, OpenAPI): http://localhost:8080/docs
- Front end (it's on the house!): http://localhost:88/
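Once the container is up, the llama-cpp-python server speaks an OpenAI-compatible REST API on port 8080. A minimal sketch of querying it from Python with only the standard library (the `/v1/completions` route is llama-cpp-python's standard completions endpoint; the host, prompt, and `max_tokens` value here are just example values):

```python
import json
import urllib.request


def build_payload(prompt: str, max_tokens: int = 64) -> dict:
    """Build the JSON body for a /v1/completions request."""
    return {"prompt": prompt, "max_tokens": max_tokens}


def complete(prompt: str, host: str = "http://localhost:8080") -> str:
    """Send a completion request to the containerized server and
    return the generated text from the first choice."""
    data = json.dumps(build_payload(prompt)).encode()
    req = urllib.request.Request(
        f"{host}/v1/completions",
        data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["text"]


if __name__ == "__main__":
    print(complete("Q: What is the capital of France? A:"))
```

The interactive docs at http://localhost:8080/docs list the full request schema if you want to pass sampling parameters such as `temperature` or `stop`.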
Let's run Llama-2-7B (GGUF, 4-bit quantized)
MIT License