LlamaEdge / LlamaEdge

The easiest & fastest way to run customized and fine-tuned LLMs locally or on the edge

Home Page: https://llamaedge.com/


feat: Support for OpenAI Function Calling (for full drop-in replacement)

ChristianWeyer opened this issue · comments

Summary

Hello all!

Problem
AFAICS, the current implementation does not support OpenAI Function Calling. This would be a fantastic, powerful, and much-needed feature. Almost any serious LLM-integration application needs OpenAI Function Calling support, so we need it for open-source LLMs as well.

Success Criteria
Any OAI client can be used with LlamaEdge via the API server, even (and especially) those that use OAI Function Calling.

Reference:
https://platform.openai.com/docs/guides/function-calling
https://platform.openai.com/docs/api-reference/chat/create#chat-create-tools
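To make the success criteria concrete, here is a minimal sketch of the request body an OpenAI client would send when using tools, per the OpenAI docs linked above. The model name, the `get_weather` tool, and its schema are illustrative placeholders, not part of LlamaEdge:

```python
import json

# An OpenAI-style chat completion request body with a "tools" array.
# "local-model" and "get_weather" are hypothetical names for illustration.
request_body = {
    "model": "local-model",
    "messages": [
        {"role": "user", "content": "What is the weather in Berlin?"}
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get the current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
}

print(json.dumps(request_body, indent=2))
```

A drop-in replacement means the API server accepts this shape unchanged and returns a `tool_calls` response the client can parse.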

Thank you!

Appendix

No response

Yes, this is much needed. But it also heavily depends on the model: the model must be fine-tuned to generate reliable JSON responses. We will be on the lookout for such models.

Maybe you could also try to leverage the grammar support in llama.cpp? That would also enable models that are not explicitly fine-tuned for JSON output.

https://github.com/ggerganov/llama.cpp/blob/master/grammars/README.md
https://github.com/Maximilian-Winter/llama-cpp-agent
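For reference, llama.cpp's GBNF grammars constrain sampling so the model can only emit tokens matching the grammar. A heavily simplified sketch in the style of the grammars/README.md linked above (not the full JSON grammar shipped in the repo) might look like:

```
root   ::= object
object ::= "{" ws ( string ":" ws value ("," ws string ":" ws value)* )? ws "}"
value  ::= string | number
string ::= "\"" [a-zA-Z0-9_ ]* "\""
number ::= [0-9]+
ws     ::= [ \t\n]*
```

With such a grammar in place, even a model without function-calling fine-tuning is forced to produce syntactically valid JSON, though the field names and values are still up to the model.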

Just wanted to check where we are with this?
With the upcoming wave of agent-based systems, Function Calling / Tools will be an essential feature for successfully implementing agents on the edge.

Do you guys have any priority set for this?

Supporting OpenAI Tools calling (aka Function Calling) would be killer! This would enable 'micro agents' running on edge devices to be accessed programmatically in a standardized way.

cc @juntao

Hi @ChristianWeyer

Yes, we have several things going on here. First, check out our support for the Command-R model, which is ranked pretty high in function calling benchmarks.

https://github.com/second-state/WasmEdge-WASINN-examples/tree/master/wasmedge-ggml/command-r

Then, this pending PR showcases how to force-generate JSON text that is suitable for function calling.

https://github.com/second-state/WasmEdge-WASINN-examples/pull/127/files

After it is merged, we will add actual function-calling endpoints to the LlamaEdge API server, and then provide developers with a way to add their own endpoints.
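Once a model reliably emits JSON suitable for function calling, the server-side step is mostly parse-and-dispatch. A minimal sketch, assuming a hypothetical JSON shape with `name` and `arguments` fields (not LlamaEdge's actual format):

```python
import json

def get_weather(city: str) -> str:
    # Stub tool implementation for illustration only.
    return f"Sunny in {city}"

# Registry mapping tool names to local Python callables.
TOOLS = {"get_weather": get_weather}

def dispatch(model_output: str) -> str:
    """Parse a JSON tool call emitted by the model and invoke the matching tool."""
    call = json.loads(model_output)
    func = TOOLS[call["name"]]
    return func(**call["arguments"])

print(dispatch('{"name": "get_weather", "arguments": {"city": "Berlin"}}'))
# → Sunny in Berlin
```

The hard part the pending PR addresses is upstream of this: getting the model to emit that JSON reliably in the first place.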

Please help us test / plan these features. :)

Nice, looking forward to those!
Thanks for the hard work.

Full OpenAI API compatibility will be a real milestone.

Hey @juntao - just checking in and bumping... 😅.

Do you already know when Function / Tool Calling will make it into awesome-llamaedge?