llama.cpp-static

Your daily, minimal static build of llama.cpp. Currently only amd64 server builds are available. Also available on Docker Hub.

Source code: https://github.com/ggerganov/llama.cpp
Built from: https://github.com/EZForever/llama.cpp-static

Usage

Please refer to the llama.cpp Docker guide and the server README.
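
For illustration, a minimal sketch of starting the server from one of these images. The Docker Hub image name (ezforever/llama.cpp-static) and the model path are assumptions not taken from this README, as is the entrypoint being the server binary itself (so arguments after the image name go straight to the server):

    docker run --rm -p 8080:8080 -v "$(pwd)/models:/models" \
        ezforever/llama.cpp-static:server-ssl-avx2 \
        -m /models/model.gguf --host 0.0.0.0 --port 8080

The server should then answer on http://localhost:8080; the server README mentioned above documents the full set of flags.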

Tag format

tl;dr: Use server-ssl-avx2 if you don't know what you're doing.

Server images are tagged in the format server-<ssl>-<avx>.

<ssl> is one of the following:

  • nossl: Minimal build with no SSL/TLS capability.
  • ssl: Built with OpenSSL (LLAMA_SERVER_SSL=ON), thus supports --ssl-key-file and --ssl-cert-file (see the example below).
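
For example, a hypothetical TLS-enabled invocation of an ssl-tagged image (the image name, port, and certificate paths are assumptions; only the --ssl-key-file and --ssl-cert-file flags come from this README):

    docker run --rm -p 8443:8443 \
        -v "$(pwd)/certs:/certs" -v "$(pwd)/models:/models" \
        ezforever/llama.cpp-static:server-ssl-avx2 \
        -m /models/model.gguf --host 0.0.0.0 --port 8443 \
        --ssl-key-file /certs/server.key --ssl-cert-file /certs/server.crt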

<avx> is one of the following:

  • noavx: All AVX-related optimizations are disabled. Do not use this build unless you are working around some known bug, or running LLMs on a 10-year-old potato.
  • avx: (Only) AVX instruction set is enabled. Might be useful if you are using some old CPUs that don't support AVX2.
  • avx2: AVX2 instruction set is enabled. This build should support most modern/recent CPUs with reasonable performance.
  • avx512: AVX512 and AVX512-VNNI instruction sets are enabled. Currently only some high-end or server-grade CPUs support these instruction sets, so check your hardware specs before using this build (see the check after this list).
  • oneapi: Experimental build with the Intel oneAPI compiler, inspired by ggerganov/llama.cpp#5067. Offers a ~30% speed increase in prompt processing (~20 tok/s vs. ~15 tok/s on my machine) compared to the avx2 builds. Not updated daily.
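
On Linux, one way to check which of these instruction sets your CPU supports is to inspect /proc/cpuinfo; a quick sketch (the flag names are as exposed by the kernel, e.g. avx512f for the AVX512 foundation and avx512_vnni for AVX512-VNNI):

    grep -m1 '^flags' /proc/cpuinfo | grep -ow 'avx\|avx2\|avx512f\|avx512_vnni'

Each supported instruction set is printed on its own line; pick the most specific tag your CPU supports.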

License

MIT License