microsoft / onnxruntime-genai

Generative AI extensions for onnxruntime

[.NET] Genny chat-bot sample doesn't support DirectML and Phi-3

asmirnov82 opened this issue · comments

I would like to update the ..examples\csharp\Genny sample to support DirectML and the Phi-3 model.

I managed to do it for Stateless mode (#568), but I ran into an issue in Stateful mode:

Inference fails with OnnxRuntimeGenAIException: 'Non-zero status code returned while running DmlFusedNode_0_0 node. Name:'DmlFusedNode_0_0' Status Message: invalid unordered_map<K, T> key'

This happens because of the current implementation of the private void AddPastTokens(Sequences sequences) method in the Genny sample:

// Only keep (context_length - max_length) worth of history
while (_pastTokens.Count > ModelOptions.ContextLength - SearchOptions.MaxLength)
{
    _pastTokens.RemoveAt(0);
}

For Phi-3, both ModelOptions.ContextLength and SearchOptions.MaxLength are equal to 4096, so the limit is 0: this method removes all tokens from the session history and passes an empty collection of tokens to generatorParams.SetInputIDs.

Should it just be while (_pastTokens.Count > ModelOptions.ContextLength) instead, so that the number of tokens passed doesn't exceed the model's ContextLength?
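To make the arithmetic concrete, here is a minimal standalone sketch of the trimming loop. It is not the Genny code itself: HistoryTrimDemo and Trim are hypothetical names used only for illustration, the token values are dummies, and the two 4096 values are the ones reported above for Phi-3. It only compares how many tokens each limit keeps; it does not call the GenAI API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class HistoryTrimDemo
{
    // Mirrors the loop in the sample: drop the oldest tokens
    // until at most `limit` tokens remain.
    public static List<int> Trim(IEnumerable<int> pastTokens, int limit)
    {
        var tokens = new List<int>(pastTokens);
        while (tokens.Count > limit)
        {
            tokens.RemoveAt(0);
        }
        return tokens;
    }

    public static void Main()
    {
        // Values reported in this issue for Phi-3 (assumed here as constants).
        const int contextLength = 4096;
        const int maxLength = 4096;

        // Dummy history of 100 token ids.
        var history = Enumerable.Range(0, 100).ToList();

        // Current check: limit is contextLength - maxLength.
        var current = Trim(history, contextLength - maxLength);

        // Proposed check: limit is contextLength.
        var proposed = Trim(history, contextLength);

        Console.WriteLine($"current limit keeps {current.Count} tokens");
        Console.WriteLine($"proposed limit keeps {proposed.Count} tokens");
    }
}
```

With the current limit the history ends up empty, which matches the empty input passed to SetInputIDs; with the proposed limit the 100 dummy tokens are kept. Whether the proposed limit leaves enough room for newly generated tokens depends on how MaxLength is interpreted, which is the open question here.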

The documentation for the GenAI C# API (https://onnxruntime.ai/docs/genai/api/csharp.html) doesn't describe what each SearchOptions field means or how SetInputIDs works, so I don't have a clear view of what this check is meant to achieve.

Could you please provide more info? If my understanding is correct, the fix can be applied. If you approve it, I'll finish the #568 PR.

@baijumeswani could you please assist?

Thanks for raising this issue @asmirnov82. We will look into it