Just the bare basics to run inference on local hardware.
Currently working:
- gguf.py now reads the entire GGUF file and returns the file offsets of each tensor's data.
Todo:
- Load tensors into the model
- Run inference
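The parsing that gguf.py does starts with the fixed GGUF header. As a rough sketch of that first step (the function name `read_gguf_header` is mine, not from this repo): a GGUF file begins with the 4-byte magic `GGUF`, followed by a little-endian u32 version, a u64 tensor count, and a u64 metadata key-value count.

```python
import struct

GGUF_MAGIC = b"GGUF"  # 4-byte magic at the start of every GGUF file

def read_gguf_header(f):
    """Read the fixed-size GGUF header from a binary stream.

    Returns the version, the number of tensors, and the number of
    metadata key-value pairs that follow the header.
    """
    magic = f.read(4)
    if magic != GGUF_MAGIC:
        raise ValueError(f"not a GGUF file: magic={magic!r}")
    # version (u32), tensor_count (u64), metadata_kv_count (u64), little-endian
    version, tensor_count, kv_count = struct.unpack("<IQQ", f.read(20))
    return {"version": version, "tensor_count": tensor_count, "kv_count": kv_count}
```

Usage would look like `with open("model.gguf", "rb") as f: header = read_gguf_header(f)`; the metadata key-value pairs and tensor info records (which give each tensor's name, shape, type, and data offset) come immediately after this header.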
This is my own implementation for running inference on local LLM models.
Licensed under the GNU Affero General Public License v3.0.