Edit decoding to force-sample a timestamp after every speaker turn

Question

akashmjn opened this issue a year ago · comments

Forcing a timestamp token right after each speaker-turn token would likely be a small patch to the logit filtering applied during decoding.

Doing so makes for readable transcripts and sets things up for downstream global diarization (clustering).
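
A rough sketch of the kind of filter I have in mind, in plain Python rather than the project's actual decoding code. The token ids and the function names (`force_timestamp_after_turn`, `greedy_pick`) are made up for illustration, and it assumes timestamp tokens occupy a contiguous range at the end of the vocabulary:

```python
import math

# Hypothetical token ids for illustration only; real ids depend on the tokenizer.
SPEAKER_TURN_ID = 5
TIMESTAMP_BEGIN_ID = 6  # assumption: every id >= this is a timestamp token


def force_timestamp_after_turn(logits, prev_token):
    """Return a copy of logits; if the previous token was a speaker turn,
    mask every non-timestamp token so only a timestamp can be sampled."""
    if prev_token != SPEAKER_TURN_ID:
        return list(logits)
    return [
        value if token_id >= TIMESTAMP_BEGIN_ID else -math.inf
        for token_id, value in enumerate(logits)
    ]


def greedy_pick(logits):
    """Pick the index of the highest logit (greedy decoding)."""
    return max(range(len(logits)), key=logits.__getitem__)


# Toy vocabulary of 8 tokens; ids 6 and 7 play the role of timestamps.
logits = [2.0, 1.0, 0.5, 3.0, 0.1, 0.2, 0.3, 0.9]
filtered = force_timestamp_after_turn(logits, prev_token=SPEAKER_TURN_ID)
next_token = greedy_pick(filtered)
```

Without the filter, greedy decoding would pick a text token here; with it, the next token is always a timestamp, so each speaker turn starts a new timed segment.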