buckedunicorn / llama.go

llama.go is like llama.cpp in pure Golang!

llama.go

Meta's LLaMA large language model inference in pure Golang using only CPU. No GPU needed.

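It stresses all CPU cores with FP32 math, so you'll need at least 32 GB of RAM for the 7B model: roughly 7 billion weights at 4 bytes each come to about 28 GB before any working buffers are counted.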
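
As a rough check on the RAM figure, the sketch below estimates the memory taken by FP32 weights alone. It is not code from llama.go: the helpers weightBytes and toGiB are hypothetical names introduced for illustration, and the 7,000,000,000 parameter count is a nominal assumption, not the exact size of the checkpoint.

```go
package main

import "fmt"

// bytesPerFP32 is the size of one 32-bit floating point weight.
const bytesPerFP32 = 4

// weightBytes returns the memory needed to hold paramCount weights
// at bytesPerParam bytes each.
// Hypothetical helper for illustration; not part of llama.go.
func weightBytes(paramCount, bytesPerParam uint64) uint64 {
	return paramCount * bytesPerParam
}

// toGiB converts a byte count to gibibytes (2^30 bytes).
func toGiB(b uint64) float64 {
	return float64(b) / (1 << 30)
}

func main() {
	// Assumption: a nominal 7 billion parameters for the "7B" model.
	const params7B = 7_000_000_000

	w := weightBytes(params7B, bytesPerFP32)
	fmt.Printf("FP32 weights: %d bytes (%.1f GiB)\n", w, toGiB(w))
}
```

The estimate covers weights only. Activations, the attention cache, and the Go runtime need additional memory on top of it, which is why the requirement is rounded up to 32 GB.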

AVX2/AVX-512 and ARM NEON optimizations are planned for a later release, and more details will follow.

Languages

C 58.6%, C++ 21.5%, Go 13.9%, Python 5.1%, Shell 0.6%, CMake 0.2%, Batchfile 0.2%, Makefile 0.0%