Purfview / whisper-standalone-win

Whisper & Faster-Whisper standalone executables for those who don't want to bother with Python.

What settings improve the accuracy of word timestamps?

scaruslooner opened this issue · comments

I have this problem where the end time in the transcript cuts off the word instead of capturing the whole word.
What settings do you recommend I play with to get more accurate/precise timing for the end of a word?
(I'm using a 4060 Ti with 16 GB of VRAM.)

This is what I type in:
"C:\Program Files\Whisper\whisper-faster.exe" "C:\Users\camer\OneDrive\Desktop\Whisper-Faster_r186.1_windows\Whisper-Faster\2024-03-10 20-56-19.mp4" --model large-v3 --output_dir C:\Users\camer\OneDrive\Desktop\Whisper-Faster_r186.1_windows\Whisper-Faster --output_format srt --language en --sentence

Preview of the output:
(I also highlighted the words that are getting cut off.)
1
00:00:04,800 --> 00:00:05,240
Yo.

2
00:00:05,240 --> 00:00:05,680
Yo.

3
00:00:07,680 --> 00:00:09,300
The Pistons suck this year.

4
00:00:09,440 --> 00:00:10,620
I don't know why they're so bad.

5
00:00:11,620 --> 00:00:13,240
I feel like they need to get better.

6
00:00:14,720 --> 00:00:16,060
In everything they do.

7
00:00:16,380 --> 00:00:17,120
You feel me?

8
00:00:22,070 --> 00:00:23,770
I'm just kind of eating right now.

9
00:00:25,770 --> 00:00:26,750
Kyrie has the ball.

Share the json file.
Can you share the audio?

I don't know where the JSON file is, but I have the SRT file.
2024-03-10 20-56-19 words get cuttoff.txt

2024-03-10.20-56-19.mp4

Oh wait, I can generate the JSON. Here it is:
2024-03-10 20-56-19.json

Maybe you need to zip the JSON and SRT files, as I can't open them.

SrtAndJSON.zip
Does this work?

Nope, the zip doesn't work either. Upload it here -> https://wetransfer.com/

https://we.tl/t-r5dMVjh6Ue
(I switched the link.)

That is normal precision for whisper.
Maybe in the future it will be improved with wav2vec alignment.

--output_dir C:\Users\camer\OneDrive\Desktop\Whisper-Faster_r186.1_windows\Whisper-Faster

You can use -o source

--output_format srt

Not needed, as srt is the default.
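
On the timing question: since the end times above are described as normal precision for whisper, one possible workaround is to post-process the SRT and push each cue's end time out a little, without running into the next cue. The sketch below is only an illustration, not a feature of whisper-faster. The function name `pad_cue_ends` and the 200 ms default are assumptions made up for this example, and padding only keeps the subtitle on screen longer; it does not make the alignment itself more accurate.

```python
import re

# Matches an SRT timing line such as "00:00:07,680 --> 00:00:09,300".
CUE_RE = re.compile(r"^(\d{2}):(\d{2}):(\d{2}),(\d{3}) --> (\d{2}):(\d{2}):(\d{2}),(\d{3})$")


def to_ms(h, m, s, ms):
    """Convert hours, minutes, seconds, milliseconds to total milliseconds."""
    return ((int(h) * 60 + int(m)) * 60 + int(s)) * 1000 + int(ms)


def to_stamp(total_ms):
    """Format total milliseconds as an SRT timestamp (HH:MM:SS,mmm)."""
    h, rem = divmod(total_ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def pad_cue_ends(srt_text, pad_ms=200):
    """Hypothetical helper: extend each cue's end by pad_ms (an assumed value),
    but never past the start of the following cue."""
    lines = srt_text.splitlines()

    # Collect (line index, start ms, end ms) for every timing line.
    cues = []
    for i, line in enumerate(lines):
        match = CUE_RE.match(line.strip())
        if match:
            g = match.groups()
            cues.append((i, to_ms(*g[:4]), to_ms(*g[4:])))

    for n, (i, start, end) in enumerate(cues):
        new_end = end + pad_ms
        if n + 1 < len(cues):
            next_start = cues[n + 1][1]
            # Clamp to the next cue's start; never shorten the original cue.
            new_end = min(new_end, max(end, next_start))
        lines[i] = f"{to_stamp(start)} --> {to_stamp(new_end)}"

    return "\n".join(lines) + "\n"
```

To try it, read the .srt file as text, pass it through `pad_cue_ends`, and write the result to a new file. Back-to-back cues (like the two "Yo." lines above) are left unchanged because there is no gap to extend into.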