Purfview / whisper-standalone-win

Whisper & Faster-Whisper standalone executables for those who don't want to bother with Python.

Large V2 versus Large V3

EricVee68 opened this issue · comments

Love your executables, excellent work.
An observation I hope you can elaborate on, noticed while running Standalone Faster-Whisper r172.1:

When using --model=large-v2, I receive this message:
Warning: Word-level timestamps on translations may not be reliable.

When using --model=large-v3, I receive this message:
Warning: 'large-v3' model may produce inferior results, better use 'large-v2'!

Any updates to either of these, or recommended tweaks for either model (preferably v3)?

Thanks much!

> Any updates to either of these, or recommended tweaks for either model (preferably v3)?

What updates? Those are just warnings; use whatever you want.
I feel like "large-v3" needs different defaults for Whisper's pseudo-VAD settings to hallucinate less.

Got it...
I guess what I'm trying to convey is that YOUR implementation at least offers some "heads up" on what could result in problems. Especially since most other posts regarding Faster Whisper with model v3 seem to imply it's the best thing since sliced bread! ;-)

> Especially since most other posts regarding Faster Whisper with model v3 seem to imply it's the best thing since sliced bread! ;-)

Where did you see that? I've seen only complaints that it's worse.

Try to find a single post there saying it's better: openai/whisper#1762

And look at this: https://deepgram.com/learn/whisper-v3-results

Btw, it's a dupe of #101.