[Feature] Speaker Diarization

Question

Arche151 opened this issue 7 months ago

I'd like to propose a feature that could elevate Whishper's functionality further: the implementation of speaker diarization, utilizing pyannote.

This addition would be a game-changer imo and would finally render Trint obsolete for me. Being able to edit the speaker tags afterwards would be amazing too, as a workaround for pyannote's occasional speaker identification errors.
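To make the idea a bit more concrete, here's a rough sketch of what I have in mind. It is not based on Whishper's code or on pyannote's API: the `Turn` and `Segment` classes, the function names, and the `SPEAKER_00`-style labels are all placeholders I made up for illustration. It assumes the diarization step yields speaker turns with start/end times, assigns each transcript segment to the turn it overlaps most, and then lets the user rename the tags.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    """A diarization result: who spoke between start and end (seconds)."""
    start: float
    end: float
    speaker: str


@dataclass
class Segment:
    """A transcript segment; speaker is filled in later."""
    start: float
    end: float
    text: str
    speaker: str = "UNKNOWN"


def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(segments, turns):
    """Tag each segment with the speaker whose turn overlaps it the most."""
    for seg in segments:
        best_turn, best_overlap = None, 0.0
        for turn in turns:
            shared = overlap(seg.start, seg.end, turn.start, turn.end)
            if shared > best_overlap:
                best_turn, best_overlap = turn, shared
        if best_turn is not None:
            seg.speaker = best_turn.speaker
    return segments


def rename_speakers(segments, mapping):
    """Apply user-edited names; labels missing from mapping stay unchanged."""
    for seg in segments:
        seg.speaker = mapping.get(seg.speaker, seg.speaker)
    return segments


# Made-up example data, hypothetical labels and timings.
turns = [Turn(0.0, 4.0, "SPEAKER_00"), Turn(4.0, 9.0, "SPEAKER_01")]
segments = [
    Segment(0.5, 3.5, "Hi, thanks for joining."),
    Segment(3.8, 8.0, "Happy to be here."),
]

assign_speakers(segments, turns)
rename_speakers(segments, {"SPEAKER_00": "Alice", "SPEAKER_01": "Bob"})

for seg in segments:
    print(f"[{seg.speaker}] {seg.text}")
```

The rename step is the part I care about most: even if the automatic labels are off for a few segments, a per-segment or per-label override in the editor would make the result usable.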

I noticed your plans to introduce insanely-fast-whisper as an alternative backend - it now includes built-in diarization!

Looking forward to potentially seeing this functionality in future updates! :)