microsoft / onnxruntime-genai

Generative AI extensions for onnxruntime

ONNXRuntime genai fine tuning

Positronx opened this issue · comments

Hello,
Will fine-tuning large language models with onnxruntime-genai be supported at some point in the future?

Hi @Positronx, we have support for running models that have been fine-tuned with LoRA.
And you can fine-tune the model ahead of time with LoRA using the Olive tool (https://github.com/microsoft/olive).
Would that satisfy your requirement?
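For context on the workflow above: LoRA keeps the pretrained weights frozen and trains only a small low-rank update, which can be merged into the base weights at inference time. That is why a runtime like onnxruntime-genai only needs to apply adapters rather than support training. A minimal numpy sketch of the idea (toy dimensions, illustrative only, not the Olive or onnxruntime-genai API):

```python
import numpy as np

rng = np.random.default_rng(0)

d_out, d_in, rank = 8, 8, 2  # toy sizes; real models use d >> rank

W = rng.standard_normal((d_out, d_in))        # frozen pretrained weight
A = rng.standard_normal((rank, d_in)) * 0.01  # trainable low-rank factor
B = np.zeros((d_out, rank))                   # zero-initialized, so the
                                              # adapter starts as a no-op

x = rng.standard_normal(d_in)

# During fine-tuning only A and B are updated; at inference the update
# can be merged into the base weight: W' = W + B @ A.
y_base = W @ x
y_lora = (W + B @ A) @ x

# With B = 0 the adapted model matches the base model exactly.
assert np.allclose(y_base, y_lora)

# The adapter trains 2 * rank * d parameters instead of d * d.
print(W.size, A.size + B.size)
```

The practical upshot is that several adapters fine-tuned ahead of time (e.g. with Olive) can be swapped over one shared base model at inference.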

Hi @natke, I took a look at the Olive tool. It looks like it does what I want, except that it's written in Python. Unfortunately, I'm looking for a C++ tool.

This looks more like a support request than an issue with the code in this repo or its usage.

It should probably be closed.

Hi @arnfaldur, should I move it to a discussion and close this issue?

It sounds like the authors of this repo don't intend to add fine-tuning, because Olive already covers that. If that's the case, I would close this issue as not planned.

Regarding your specific needs: I don't know why you need it in C++, but in many cases you can simply call into Python from C++.

If you can't, you'll have to implement it yourself or wait for it to land in llama.cpp.

onnxruntime-genai is mainly designed to serve inference requests, and fine-tuning is not on our roadmap for now. The workflow pointed out by @natke is the recommended solution for this scenario.

I'll close this issue, but please feel free to share your ideas and suggestions.