I stop receiving predictions while streaming after some time
jayavanth opened this issue
I'm running
python whisper_online_server.py --model base.en --host localhost --port 43001 --vad
arecord -f S16_LE -c1 -r 16000 -t raw -D default | nc localhost 43001
and the client stops receiving predictions after 50-70 s. If I restart the client, it starts working again. I noticed that this happens more often with audio that has frequent silences. Some recordings where the "silence" is actually background noise did okay.
I'm also wondering how I can get word-level timestamps. Is there an option for that? I'm currently getting output like this:
20530 21390 Just saying, you know what?
24230 24770 before anything, because
24770 24830 for
24830 25770 anything because we've handled business
35130 36750 Well, I mean, the most
36770 37650 recent two rounds at
37650 38970 NIP's been able to put on the board have
38970 40210 been the result of some
40210 41190 form of aggression for
41190 42090 CI and the push into
42090 43250 halls to actually try and fight
43250 43550 out the
43550 45470 balconies at this point. And then now
45850 47010 here trying to be aggressive with
47010 48130 a boost off at the half wall.
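The lines above follow a "start_ms end_ms text" layout, so the pauses in the stream can be measured directly from the client output. The following is a minimal sketch, not part of whisper_streaming: `find_gaps` and `min_gap_ms` are hypothetical names, and the only assumption is that each output line starts with two integer millisecond timestamps.

```python
from typing import List, Tuple


def find_gaps(lines: List[str], min_gap_ms: int = 1000) -> List[Tuple[int, int]]:
    """Return (previous_end_ms, next_start_ms) pairs that are at least min_gap_ms apart.

    Assumes each line looks like "start_ms end_ms text"; other lines are skipped.
    """
    gaps = []
    prev_end = None
    for line in lines:
        parts = line.split(maxsplit=2)
        # Skip anything that does not begin with two integer timestamps.
        if len(parts) < 2 or not (parts[0].isdigit() and parts[1].isdigit()):
            continue
        beg, end = int(parts[0]), int(parts[1])
        if prev_end is not None and beg - prev_end >= min_gap_ms:
            gaps.append((prev_end, beg))
        prev_end = end
    return gaps


if __name__ == "__main__":
    sample = [
        "20530 21390 Just saying, you know what?",
        "24230 24770 before anything, because",
        "24770 24830 for",
        "24830 25770 anything because we've handled business",
        "35130 36750 Well, I mean, the most",
    ]
    for prev_end, next_beg in find_gaps(sample, min_gap_ms=5000):
        print(f"gap of {next_beg - prev_end} ms between {prev_end} and {next_beg}")
```

Saving the client output to a file and running it through a helper like this may show whether the stalls line up with long silent stretches.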
Thank you for this library 🙌
Hi, are you using --vad ?
> Also wondering how I can get word level timestamping. Is there an option for that? Because I'm currently getting something like this
You can print the word-level timestamps by overriding or rewriting this function: https://github.com/ufal/whisper_streaming/blob/c236a9984f7e71465eb04a63b5545198fce1c8eb/whisper_online.py#L412C4-L425C23
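As a rough illustration of what such an override could emit, here is a minimal sketch that prints one line per word instead of one line per segment. It is not the linked function: `format_word_lines` is a hypothetical helper, and it assumes the words are available as `(start_seconds, end_seconds, text)` tuples, which may not match the library's internal representation exactly.

```python
from typing import Iterable, List, Tuple

# Assumed shape of a timestamped word: (start_seconds, end_seconds, text).
Word = Tuple[float, float, str]


def format_word_lines(words: Iterable[Word], offset: float = 0.0) -> List[str]:
    """Return one "start_ms end_ms word" line per timestamped word.

    offset is added to both timestamps (in seconds) before conversion to ms.
    """
    lines = []
    for beg, end, text in words:
        text = text.strip()
        if not text:
            continue  # skip empty tokens
        beg_ms = (beg + offset) * 1000
        end_ms = (end + offset) * 1000
        lines.append(f"{beg_ms:.0f} {end_ms:.0f} {text}")
    return lines


if __name__ == "__main__":
    sample = [(20.53, 20.81, " Just"), (20.81, 21.39, " saying,")]
    print("\n".join(format_word_lines(sample)))
```

The line layout mirrors the "start_ms end_ms text" output shown in the question, so a client that already parses segment lines would not need to change.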
> Hi, are you using --vad ?
Yes, I noticed your command.
You're using the base.en model, which probably performs poorly and has outages like the ones you report. Use a bigger model for better quality.
Good luck!