Large model

Question

mootje2 opened this issue a year ago

Thanks for the nice effort with this app. I was wondering whether I could use it with the large model, because for multilingual transcription the large model gives much better results than the one you are using. I have the large model on my Ubuntu server, and when I test it with Gradio it gives a much better transcription. How can I adjust the script to use the large model from my local server? Also, I saw that your demo on Hugging Face has a microphone option, which I miss here.
Thanks

Joshua Lochner · Answer 1 · Fri Sep 01 2023 01:36:35 GMT+0800 (China Standard Time)

The purpose of this project is to run Whisper directly in your browser instead of on a local server, so I won't be modifying it to support an external API. However, feel free to clone the repo yourself and then separate the frontend from the backend if you wish to reuse the user interface.

Yavuz Kömeçoğlu · Answer 2 · Thu Feb 01 2024 15:01:59 GMT+0800 (China Standard Time)

Hi @xenova,
We added it to the models list as 'Xenova/whisper-large': [1550]. The model downloads, but I get the error "RangeError: offset is out of bounds" during the transcription phase. I get the same error on devices with different amounts of RAM. How can I run the large model?

midpoint · Answer 3 · Mon Jun 17 2024 19:05:58 GMT+0800 (China Standard Time)

whisper-web\src\components\AudioManager.tsx

    const models = {
        // Original checkpoints
        'Xenova/whisper-tiny': [41, 152],
        'Xenova/whisper-base': [77, 291],
        'Xenova/whisper-small': [249],
        'Xenova/whisper-medium': [776],
        'Xenova/whisper-large-v2': [23776],
        'Xenova/whisper-large-v3': [17776],

        // Distil Whisper (English-only)
        'distil-whisper/distil-medium.en': [402],
        'distil-whisper/distil-large-v2': [767],
    };
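
For anyone adapting this list, here is a minimal sketch of how such a map might be read. It assumes that each array holds approximate download sizes in MB, with the quantized size first and the unquantized size (where one exists) second; the thread does not confirm this. The `ModelSizes` type and the `modelOptions` helper are hypothetical names used only for illustration, not part of whisper-web.

```typescript
// Assumption: each entry lists approximate download sizes in MB,
// quantized first and, where present, unquantized second.
type ModelSizes = Record<string, number[]>;

// A shortened copy of the list above, used only for this example.
const models: ModelSizes = {
    'Xenova/whisper-tiny': [41, 152],
    'Xenova/whisper-base': [77, 291],
    'Xenova/whisper-small': [249],
};

// Hypothetical helper: build display labels for the models available in a
// given mode, skipping models that have no size recorded for that mode.
function modelOptions(sizes: ModelSizes, quantized: boolean): string[] {
    const index = quantized ? 0 : 1;
    return Object.entries(sizes)
        .filter(([, mb]) => mb[index] !== undefined)
        .map(([name, mb]) => `${name} (${mb[index]}MB)`);
}

console.log(modelOptions(models, true));
console.log(modelOptions(models, false));
```

Under that assumption, an entry with a single number, such as the large checkpoints above, would only be offered in quantized mode.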