microsoft / onnxruntime-genai

Generative AI extensions for onnxruntime

Can I use onnxruntime-genai with a grammar for output like llama.cpp?

han-minhee opened this issue · comments

llama.cpp can force the model to generate output that satisfies a given grammar
(which, as I understand it, works by picking the highest-logit token among the tokens the grammar allows at each step).
Is something like this possible with the current C or C# API?
I couldn't find a corresponding API in the documentation.
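For context, the mechanism described above (constrained decoding) is usually implemented by masking out disallowed tokens before the argmax. A minimal, library-agnostic sketch in plain Python (this is not onnxruntime-genai API; `is_allowed` stands in for a real grammar checker such as llama.cpp's GBNF matcher):

```python
import math

def constrained_argmax(logits, vocab, is_allowed):
    """Pick the highest-logit token among those the grammar allows.

    logits     -- one score per vocabulary entry
    vocab      -- the token strings, aligned with logits
    is_allowed -- predicate deciding whether the grammar permits a token
                  at the current decoding step (hypothetical placeholder)
    """
    best_tok, best_score = None, -math.inf
    for tok, score in zip(vocab, logits):
        if is_allowed(tok) and score > best_score:
            best_tok, best_score = tok, score
    return best_tok

# Toy example: a "grammar" that only allows digit tokens.
vocab = ["a", "7", "x", "3"]
logits = [2.5, 1.0, 3.0, 0.5]
print(constrained_argmax(logits, vocab, str.isdigit))  # "7"
```

In a real implementation the allowed set depends on the parser state, which is advanced after each accepted token; the sketch above only shows the per-step masking.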

Hi @han-minhee. This is something we plan to support in the future, but we do not have support for it right now.

We are currently planning the new features for the 0.4.0 release, and I will add this to the pool of candidates so it can be considered for 0.4.0.

I'll move this to discussions now.