akashmjn / tinydiarize

Minimal extension of OpenAI's Whisper adding speaker diarization with special tokens

Not getting the SPEAKER SEGMENTS

prafulkl opened this issue · comments

I have downloaded the model and tested it on my local machine. However, I am facing a critical issue. The [SPEAKER_TURN] information that I was expecting is missing in all the WAV files that I processed using this component. It was working fine when I tested it last time, but now it's not. This absence of speaker segmentation is causing significant disruptions in my workflow. As I need to deploy this system into a production environment, it is crucial that this issue gets resolved as soon as possible.

The project hasn't been meaningfully updated since June. Whatever changed is probably in your own setup.

@prafulkl this is not a production-ready tool, so you should not make requests like this.

Hi @prafulkl, as the others have pointed out (ty!):

  • If things were working previously, it's most likely something on your end. Neither the checkpoints nor the inference code have changed. In general, you should share a reproducible example if you want someone to help debug an issue. I would suggest making sure your Python env isn't somehow clashing with the original OpenAI whisper package (see #16).
  • This is not a production-ready project, but rather a prototype/proof-of-concept that happens to work fairly well. For the reasons in #14, that will remain the case for some time. I have added a note to the README to clarify this. Hope that helps.

For these reasons, I'm going to have to close this issue.
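
For anyone landing here with the same symptom, the environment check suggested above can be sketched roughly as follows. This is a minimal sketch and not part of the project: it assumes the fork is imported under the module name `whisper` (the same name as the original OpenAI package), and the helper names `module_origin`, `distributions_providing`, and `count_speaker_turns` are made up for illustration. It only uses the standard library, so it reports where `whisper` would be imported from without loading any model.

```python
import importlib.util
from importlib import metadata

# Token string as it appears in this issue's transcripts.
SPEAKER_TURN = "[SPEAKER_TURN]"


def module_origin(name):
    """Return the file a top-level module would be imported from, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


def distributions_providing(top_level):
    """List (name, version) of installed distributions declaring this top-level package.

    Relies on each distribution's top_level.txt, which some installs
    (for example certain editable installs) may not provide.
    """
    found = []
    for dist in metadata.distributions():
        declared = (dist.read_text("top_level.txt") or "").split()
        if top_level in declared:
            found.append((dist.metadata["Name"], dist.version))
    return found


def count_speaker_turns(text):
    """Count speaker-turn markers in a transcript string."""
    return text.count(SPEAKER_TURN)


if __name__ == "__main__":
    # ASSUMPTION: the package of interest is imported as `whisper`.
    print("whisper resolves to:", module_origin("whisper"))
    print("distributions providing it:", distributions_providing("whisper"))
```

If the printed path points into `site-packages` rather than into your tinydiarize checkout, or more than one distribution is listed, the original package may be shadowing the fork. Running `count_speaker_turns` on a transcript is a quick way to confirm whether any markers are being emitted at all.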