Model Merging
okpatil4u opened this issue
Hello Eric,
This issue may not be relevant to this repo, but it seems like model merging is gathering some speed. Have you seen any examples? Any tips on how to implement this in candle?
Thanks!
@okpatil4u, thanks for your interest! If you mean LoRA model weight merging, candle-lora has already implemented it. Otherwise, could you please let me know what you meant by model merging?
Closing so that it does not become stale, please feel free to reopen!
Apologies Eric. I was thinking about the following repos.
https://github.com/yule-BUAA/MergeLM
https://github.com/cg123/mergekit
The idea is to take two different fine-tuned models with the same base model and merge them so that their expertise is compounded. This could be a pretty useful tool if its effectiveness is proven.
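For a sense of what the simplest variant looks like, here is a minimal sketch of linear weight merging (a weighted average of parameters, as in "model soups"). It uses plain `Vec<f32>` tensors rather than candle's `Tensor` type, and assumes both fine-tuned checkpoints come from the same base model, so their state dicts have identical parameter names and shapes; all names here are illustrative, not taken from mergekit or MergeLM.

```rust
use std::collections::HashMap;

/// Returns `alpha * a + (1 - alpha) * b` for each parameter, or `None`
/// if the checkpoints do not line up (mismatched keys or tensor sizes).
fn merge_linear(
    a: &HashMap<String, Vec<f32>>,
    b: &HashMap<String, Vec<f32>>,
    alpha: f32,
) -> Option<HashMap<String, Vec<f32>>> {
    let mut merged = HashMap::new();
    for (name, wa) in a {
        let wb = b.get(name)?;
        if wa.len() != wb.len() {
            return None;
        }
        let w: Vec<f32> = wa
            .iter()
            .zip(wb)
            .map(|(x, y)| alpha * x + (1.0 - alpha) * y)
            .collect();
        merged.insert(name.clone(), w);
    }
    // Reject checkpoints where `b` has parameters that `a` lacks.
    if merged.len() == b.len() {
        Some(merged)
    } else {
        None
    }
}

fn main() {
    let a = HashMap::from([("layer.weight".to_string(), vec![1.0_f32, 1.0])]);
    let b = HashMap::from([("layer.weight".to_string(), vec![0.0_f32, 0.0])]);
    let merged = merge_linear(&a, &b, 0.5).unwrap();
    assert_eq!(merged["layer.weight"], vec![0.5, 0.5]);
}
```

The repos above go further than this (e.g. task-vector arithmetic and sparsified deltas), but they all reduce to per-parameter operations like the one shown, so a candle implementation would mostly be tensor bookkeeping.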
Just giving you a heads up.
Do you mean something like Mixtral?
@okpatil4u, I just took a look at the following link: https://github.com/cg123/mergekit. I think it would definitely be possible to implement using candle-lora's trait-based swapping mechanism!
Closed so it doesn't become stale. Please feel free to reopen if you have any ideas!