huggingface / speechbox

Does anyone have demo source code to process a file with the Whisper large model and get the output as VTT or SRT?

FurkanGozukara opened this issue · comments

Dear @patrickvonplaten, thank you very much for this repo.
I use Whisper almost every day to generate subtitles for my videos.

However, I need the VTT file it produces (basically the subtitle output format of the transcription).

Currently, to fix and improve punctuation, I am using fullstop-punctuation-multilang-large (https://huggingface.co/oliverguhr/fullstop-punctuation-multilang-large), but I can't say it is the best.

I would like to test your repo, but I need a demo of full VTT export.

Could you release demo source code that can output a VTT file? It could include both the fixed and the raw Whisper output for comparison.

Moreover, I have sent you a connection request on LinkedIn; I would appreciate it if you accepted: https://www.linkedin.com/in/furkangozukara/

One final thing: I am also very interested in Stable Diffusion and I am preparing tutorial videos. I hope you will consider adding my tutorial videos to the README here: https://huggingface.co/runwayml/stable-diffusion-v1-5

And perhaps reopen this topic so people can learn? Thank you: https://huggingface.co/runwayml/stable-diffusion-v1-5/discussions/66

For a demo, please have a look at: https://github.com/huggingface/speechbox#web-demo

It would be nice to not conflate diffusion models with this library. This library is not about diffusion models, but about speech models.

Thank you, I already saw the demo. But how do I get subtitle-formatted output?
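
For anyone with the same question, here is a minimal sketch of one possible way to turn timestamped transcription segments into VTT or SRT text. It is not taken from speechbox and does not call Whisper itself. It assumes you already have segments as `(start_seconds, end_seconds, text)` tuples from whatever transcription tool you use. The helper names `format_timestamp`, `to_vtt`, and `to_srt` are hypothetical and only for illustration.

```python
from typing import Iterable, Tuple

# Assumed input shape (not a speechbox type): (start_seconds, end_seconds, text)
Segment = Tuple[float, float, str]


def format_timestamp(seconds: float, decimal_marker: str = ".") -> str:
    """Format seconds as HH:MM:SS.mmm (VTT) or HH:MM:SS,mmm (SRT)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{decimal_marker}{ms:03d}"


def to_vtt(segments: Iterable[Segment]) -> str:
    """Build WebVTT text: a WEBVTT header, then one cue per segment."""
    lines = ["WEBVTT", ""]
    for start, end, text in segments:
        lines.append(f"{format_timestamp(start)} --> {format_timestamp(end)}")
        lines.append(text.strip())
        lines.append("")  # blank line ends the cue
    return "\n".join(lines)


def to_srt(segments: Iterable[Segment]) -> str:
    """Build SRT text: numbered cues with a comma before the milliseconds."""
    lines = []
    for index, (start, end, text) in enumerate(segments, start=1):
        lines.append(str(index))
        lines.append(
            f"{format_timestamp(start, ',')} --> {format_timestamp(end, ',')}"
        )
        lines.append(text.strip())
        lines.append("")  # blank line ends the cue
    return "\n".join(lines)
```

To compare raw and punctuation-fixed output, you could call `to_vtt` twice, once per version of the segments, and write each result to its own `.vtt` file with `open(path, "w", encoding="utf-8")`.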