akashmjn / tinydiarize

Minimal extension of OpenAI's Whisper adding speaker diarization with special tokens

Edit decoding to force sampling of a timestamp after every speaker turn

akashmjn opened this issue

This would likely be a small patch to the logit filtering applied during decoding.
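
A minimal sketch of what such a filter could look like, written in plain Python so it runs without any dependencies. It is not the repository's implementation: the function name, `speaker_turn_id`, and `timestamp_begin` are hypothetical, and it assumes that timestamp tokens occupy all vocabulary ids at or above `timestamp_begin`. In a real decoder the same masking would presumably be applied to the logits tensor in place, alongside the other filters.

```python
import math
from typing import List


def force_timestamp_after_speaker_turn(
    logits: List[List[float]],
    tokens: List[List[int]],
    speaker_turn_id: int,   # hypothetical id of the speaker-turn token
    timestamp_begin: int,   # assumed: ids >= this value are timestamp tokens
) -> None:
    """Mask logits in place so only timestamps can follow a speaker turn.

    `logits[i]` holds the next-token scores for sequence i and
    `tokens[i]` holds the token ids decoded so far for that sequence.
    """
    for row_logits, row_tokens in zip(logits, tokens):
        # Only act on sequences whose most recent token is a speaker turn.
        if row_tokens and row_tokens[-1] == speaker_turn_id:
            # Suppress every non-timestamp token; sampling (greedy or
            # otherwise) is then restricted to the timestamp range.
            for token_id in range(timestamp_begin):
                row_logits[token_id] = -math.inf


# Toy vocabulary of 5 ids: 0-1 text, 2 speaker turn, 3-4 timestamps.
logits = [[0.5, 1.0, 0.2, 0.1, 0.3], [0.5, 1.0, 0.2, 0.1, 0.3]]
tokens = [[0, 2], [0, 1]]  # only the first sequence ends in a speaker turn
force_timestamp_after_speaker_turn(logits, tokens, speaker_turn_id=2, timestamp_begin=3)
```

After the call, the first row can only yield a timestamp id, while the second row is left untouched because its last token is ordinary text.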

Forcing a timestamp after each speaker turn makes transcripts more readable and sets things up for downstream global diarization (clustering).