💡 [REQUEST] Add support for llamacpp
kumare3 opened this issue
Ketan Umare commented
Start Date
No response
Implementation PR
It would be great to serve Llama models on CPU using UnionML. This is possible via the Python bindings for llama.cpp, whose 4-bit quantization allows the models to run on CPU reasonably well.
https://github.com/nomic-ai/pygpt4all
Reference Issues
No response
Summary
Ideally, users would be able to fine-tune a model and then serve it via the llama.cpp module, all within the same UnionML app.
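As a rough sketch of the serving half: this assumes the `llama-cpp-python` bindings (`pip install llama-cpp-python`) rather than the linked pygpt4all, and the model path, helper names, and UnionML wiring are all hypothetical, not an existing API.

```python
# Sketch: serving a 4-bit quantized llama model on CPU via llama-cpp-python.
# The model path and the predict() wrapper are illustrative placeholders;
# how this plugs into a UnionML app is an open design question.

def build_generation_kwargs(prompt: str, max_tokens: int = 128) -> dict:
    """Pure helper (hypothetical): assemble kwargs for the llama.cpp binding."""
    return {"prompt": prompt, "max_tokens": max_tokens, "temperature": 0.7}

def predict(prompt: str, model_path: str = "models/llama-7b-q4_0.gguf") -> str:
    # Imported lazily so the module can be loaded without the binding installed.
    from llama_cpp import Llama  # pip install llama-cpp-python
    llm = Llama(model_path=model_path, n_ctx=512)
    out = llm(**build_generation_kwargs(prompt))
    return out["choices"][0]["text"]
```

In a UnionML app, something like `predict` would presumably be registered as the predictor that runs after fine-tuning produces the quantized model artifact.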
Basic Example
NA
Drawbacks
NA
Unresolved questions
No response