oTranscribe / oTranscribe

A free & open tool for transcribing audio interviews

Home Page: oTranscribe.com

Export in format suitable for captions, such as vtt

orotau opened this issue

Really enjoy using this transcription program online, thanks!
I want to import the transcription I export from oTranscribe into http://amara.org/en/.
To do this I need it in a caption format such as VTT.
I can (and have) just edited the transcription file manually to get it into the .vtt format.
Happy to continue doing this, or I could put something together in Python. Any thoughts?

Hi @orotau, hi @ejb.

The problem with converting to the VTT format is that the transcription text does not carry a start and an end timestamp for each subtitle.

So I tried adding the start and end timestamps to each subtitle manually, using the insertTimestamp shortcut Ctrl/Cmd+J. The result looks something like this: https://prnt.sc/g7xe18
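To make the idea concrete, here is a minimal JavaScript sketch of how such manually timestamped text could be split into cues. It is not taken from the commit linked in this thread. The input shape (one subtitle per line, wrapped in a start and an end timestamp) and the names `parseCues`, `TIMESTAMP` and `CUE_LINE` are assumptions made for illustration; the real editor content may look different.

```javascript
// Assumed (hypothetical) input: one subtitle per line, wrapped in a start and
// an end timestamp, e.g. "00:38 Well, what we are going to do today 00:42".
// Timestamps are "mm:ss" or "h:mm:ss", as in the examples in this thread.
const TIMESTAMP = String.raw`(?:\d{1,2}:)?\d{1,2}:\d{2}`;
const CUE_LINE = new RegExp(`^(${TIMESTAMP})\\s+(.*?)\\s+(${TIMESTAMP})$`);

// Returns an array of { start, end, text } objects.
// Lines that are not wrapped in two timestamps are skipped.
function parseCues(text) {
  const cues = [];
  for (const rawLine of text.split(/\r?\n/)) {
    const match = CUE_LINE.exec(rawLine.trim());
    if (match) {
      cues.push({ start: match[1], text: match[2], end: match[3] });
    }
  }
  return cues;
}

// Usage example
const cues = parseCues(
  "00:38 Well, what we are going to do today 00:42\n" +
  "1:01:51 Let's go over an hour 1:01:56"
);
console.log(cues.length); // 2
```

Skipping lines without both timestamps keeps the sketch simple; a real implementation would probably want to warn the user about them instead.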

I've come up with a quick solution which you can find here: kostasx@46ab34c#diff-1162b350bf232465b180ba5f9c1c723f

I've tested the exported .vtt file, which looks like this:

WEBVTT

NOTE Paragraph

00:38 --> 00:42
Well, what we are going to do today is climb a pretty big mountain,

00:43 --> 00:44
Because, we're going to go from a neural net with 2 parameters

00:46 --> 00:49
To discussing the kind of neural nets in which

00:50 --> 00:56
people end up with 60 million parameters

21:33 --> 21:36
Some time later

1:01:51 --> 1:01:56
Let's go over an hour

1:26:16 --> 1:26:21
and see what happens...

Of course the code needs a lot of rewriting, but I think it's a good start.
Let me know if you have any ideas or suggestions.

P.S. I would suggest we stick with JavaScript on features like this. ;)
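One thing to watch for in the sample export above: the WebVTT format expects cue timestamps written as `mm:ss.ttt` or `hh:mm:ss.ttt`, so values like `00:38` or `1:01:51` may be rejected by stricter parsers. The following JavaScript sketch shows one way the timestamps could be normalised before the file is written. The function names `toVttTimestamp` and `buildVtt` and the `{ start, end, text }` cue shape are hypothetical and not part of oTranscribe. Milliseconds are set to `.000` because the source timestamps only have one-second resolution.

```javascript
// Convert "mm:ss" or "h:mm:ss" (hypothetical input shapes, matching the
// sample export in this thread) into the "hh:mm:ss.ttt" form.
function toVttTimestamp(stamp) {
  const match = /^(?:(\d+):)?(\d{1,2}):(\d{2})$/.exec(stamp.trim());
  if (!match) {
    throw new Error(`Unrecognised timestamp: ${stamp}`);
  }
  // Hours are optional in the input; default to "0" when they are missing.
  const [, hours = '0', minutes, seconds] = match;
  const pad = (value) => value.padStart(2, '0');
  return `${pad(hours)}:${pad(minutes)}:${seconds}.000`;
}

// Build the text of a .vtt file from an array of { start, end, text } cues.
// Cue blocks are separated from each other and from the header by a blank line.
function buildVtt(cues) {
  const blocks = cues.map(
    (cue) => `${toVttTimestamp(cue.start)} --> ${toVttTimestamp(cue.end)}\n${cue.text}`
  );
  return ['WEBVTT', ...blocks].join('\n\n') + '\n';
}

// Usage example
console.log(buildVtt([
  { start: '00:38', end: '00:42', text: 'Well, what we are going to do today is climb a pretty big mountain,' },
  { start: '1:01:51', end: '1:01:56', text: "Let's go over an hour" },
]));
```

Because this is plain string handling, it would run in the browser without any server component.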

Very cool @kostasx! I've had many people ask for this feature. Would be a valuable addition.

You both might be interested to know that @pietrop is working on a standalone app, inspired by oTranscribe, for creating VTT files. But I don't see why oTranscribe can't have basic support for VTT.

I agree that we should stick to JavaScript, since oTranscribe is meant to run without a server.