GGUF support
Mihaiii opened this issue
Mihai Chirculescu commented
Feature request
Right now, transformers.js works with ONNX models. It would be useful to also support GGUF files (see llama.cpp).
Motivation
Wider support; also, ONNX doesn't quantize below 8-bit, but GGUF does.
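For context on what supporting the format would involve: a GGUF file begins with a small fixed header (magic bytes, format version, tensor count, metadata key/value count), followed by the metadata and tensor data. Below is a minimal sketch of parsing that header, assuming the little-endian layout used by GGUF version 2 and later as described in the llama.cpp repository; the synthetic header bytes are constructed for illustration only.

```python
import struct

def read_gguf_header(data: bytes) -> dict:
    """Parse the fixed-size GGUF header: 4-byte magic, uint32 version,
    uint64 tensor count, uint64 metadata key/value count (little-endian)."""
    magic, version, n_tensors, n_kv = struct.unpack_from("<4sIQQ", data, 0)
    if magic != b"GGUF":
        raise ValueError("not a GGUF file")
    return {"version": version, "tensors": n_tensors, "kv_pairs": n_kv}

# Synthetic example header: version 3, 2 tensors, 5 metadata key/value pairs.
header = struct.pack("<4sIQQ", b"GGUF", 3, 2, 5)
print(read_gguf_header(header))  # {'version': 3, 'tensors': 2, 'kv_pairs': 5}
```

The metadata section that follows the header is where quantization type, architecture, and tokenizer details live, which is what a loader in transformers.js would need to read.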
Your contribution
I could help with manual testing. As for the dev work, I'm unsure.