akashmjn / tinydiarize

Minimal extension of OpenAI's Whisper adding speaker diarization with special tokens

Please Release Fine-Tuning Scripts

ohmguru opened this issue

Please release fine-tuning scripts so we can adapt this for medium and large whisper models.

Please! I would love to fine tune and contribute the larger models!

Yeah, I'm looking to port this to other human languages. Would be nice to know how it's done. It seems like the project is on ice right now?

Hi folks - sorry about the delay, I've been on a break for a bit with some personal life updates. Ack - and will keep you posted. Appreciate the patience/interest!

Would love to contribute! Waiting for the fine tuning scripts.

Hi everyone,

Firstly, thanks a lot for your interest in this project!

As you may have noticed, the releases I'd initially planned have been on pause over the last couple of months, particularly the requested release of finetuning code to reproduce the results shared in the repository.

Due to discussions on professional constraints with my employer, I've had to be conservative and refrain from making any major releases. My apologies, as I didn't anticipate an issue here given that I personally found finetuning relatively simple to implement and cheap to run on a consumer GPU. If you're interested in trying on your end, feel free to dig around the repo a bit to get an idea of how to start. Unfortunately, until the issue on my end is resolved, I recommend assuming an indefinite pause on further major releases from me.

I can imagine this is not news you'd have liked to hear, but I wanted to be transparent here and trust that you'd be able to understand. Much appreciated! 🙏

Best,
Akash

Thanks for the reply, totally get it!

Thanks for the thoughtful response and transparency Akash, that's totally understandable!

Hey @akashmjn - I'm VB, I lead the advocacy effort for open source audio at Hugging Face. It's sad to see that you've had to cut down on major releases because of professional reasons. If you're game then we'd love to help scale your experiments to large-v3 checkpoint. I think it'd be a huge win for the community.

Feel free to DM me at reach_vb and we can work something out! Hoping we can work something out that works for your constraints and the community too! 🤗

Open source for the win!

This project is insanely cool. Just thought I'd leave a comment!

Thanks for the note @Vaibhavs10, and excellent to see the interest in scaling this up from HuggingFace! Dropped you a DM and let's see what we can work out.