microsoft/onnxruntime-genai
Generative AI extensions for onnxruntime
Stargazers: 350 · Watchers: 44 · Issues: 170 · Forks: 80
microsoft/onnxruntime-genai Issues
- phi-3-tutorial: DirectML is not working properly (Closed · 2 months ago · 3 comments)
- Feature Request: Specify seed in Python SDK (Closed · 2 months ago · 2 comments)
- Model for llama-3-8B, EP: CPU, precision: int4, generated using onnxruntime-genai/src/python/py/models/builder.py has issues (Closed · 2 months ago · 6 comments)
- Gemma-2b-it EP: CPU int4 ONNX model generated using onnxruntime-genai throws an error (Closed · 2 months ago · 4 comments)
- Extend GeneratorParams.SetSearchOption to allow setting strings (Closed · 2 months ago · 1 comment)
- GeneratorParams should not throw an exception (Updated · 2 months ago · 1 comment)
- Wasm/WebGPU backends? (Closed · 2 months ago · 1 comment)
- Regex decoding support (Closed · 2 months ago · 3 comments)
- How to release GPU memory after each inference? (Updated · 2 months ago · 1 comment)
- Is there a plan to support running Phi3 on the NPU of the Core Ultra? (Closed · 2 months ago · 1 comment)
- Extensions for LLM (Updated · 2 months ago · 2 comments)
- How to ignore EOS token when using onnxruntime-genai? (Closed · 2 months ago · 6 comments)
- Cannot use the DirectML packages to run on CPU in a Windows App (Closed · 2 months ago · 6 comments)
- OnnxRuntimeGenAIException: OrtValue shape verification failed when running the Phi-3 model with DML (Closed · 3 months ago · 4 comments)
- Is there a way to retrieve key-value cache values using onnxruntime-genai? (Closed · 3 months ago · 1 comment)
- Installing the package on Python 3.10 (AWS SageMaker) fails (Closed · 3 months ago · 12 comments)
- Documentation for running inference or a pre-built inference server (Closed · 3 months ago · 1 comment)
- Loading an `og.Model` sometimes throws a segfault (Closed · 3 months ago · 2 comments)
- Improve performance for quantized models on Power10 CPU (Closed · 3 months ago · 1 comment)
- Phi-3 does not load on iGPU Vega 11 (Ryzen 2400G) (Updated · a month ago · 3 comments)
- Fine-tuned Phi3 built with the model builder gives gibberish output due to lack of support for LoRA (Closed · 3 months ago · 4 comments)
- Unknown provider type: dml (Closed · 3 months ago · 5 comments)
- How to suppress info/warnings? (Closed · 3 months ago · 1 comment)
- ONNX version of the ChatQA-1.5-8B model throws an `Unknown value: pad_token_id` error on inference (Closed · 2 months ago · 9 comments)
- Phi-3 128k: `cos_cache dimension 0 should be of max_sequence_length.` when setting a larger context window (Closed · 3 months ago · 4 comments)
- Multi-target with .NET Standard 2.0 and .NET runtimes (Closed · 2 months ago · 2 comments)
- Phi-3 128K ONNX model continues token generation after it should stop (Closed · 3 months ago · 2 comments)
- Failing to load the GPU-enabled model on Linux: `Protobuf parsing failed` (Closed · 3 months ago · 3 comments)
- Huggingface Chat error for Phi3 ONNX (Closed · 3 months ago · 1 comment)
- Poor user experience: CUDA Graph warning for Phi3 (Closed · 3 months ago)
- Difference in response for the same prompt with the Phi-3 4k model (Closed · 3 months ago · 3 comments)
- Querying the IDs of all the DML devices and selecting a device for inference (Updated · 2 months ago · 7 comments)
- Phi-3 genai_config in .NET requires modification (Closed · 3 months ago · 3 comments)
- Error loading the assembly file (Closed · 3 months ago · 4 comments)
- RTL support and other languages? (Closed · 2 months ago · 6 comments)
- The output is weird for a local LLM (Closed · 3 months ago)
- ImportError: libcudnn.so.8: cannot open shared object file: No such file or directory (Closed · 2 months ago · 9 comments)
- Support for Ubuntu on Raspberry Pi (aarch64) (Closed · 3 months ago)
- Support for Apple M1/M2/M3 CPUs: Metal/MPS (Closed · 3 months ago · 1 comment)
- Ability to acquire loss for next-token generation (Closed · 3 months ago · 2 comments)
- Error while running Phi-3 with DML (Closed · 2 months ago · 10 comments)
- Can't use this lib and the non-genai version of onnxruntime in the same MSIX project (Closed · 2 months ago · 6 comments)
- Cannot run the Genny sample with CUDA (Closed · 2 months ago · 4 comments)
- [Java] Support roadmap? (Closed · 3 months ago · 1 comment)
- benchmark_e2e runtime error (Closed · 3 months ago · 6 comments)
- Cannot build the phi2 model on a system with 32 GB of RAM (Closed · 2 months ago · 2 comments)
- .NET Microsoft.ML.OnnxRuntimeGenAI 0.1.0-rc4 fails loading `Phi-3-mini-4k-instruct-onnx` (Closed · 3 months ago · 3 comments)
- Phi-3 on mobile (Closed · 2 months ago · 3 comments)
- Llava-style model support in ONNX? (Closed · 3 months ago · 1 comment)
- Phi-3 can't deal with Japanese. How can I solve this issue? (Closed · 2 months ago · 20 comments)