youngwoo-yoon / youtube-gesture-dataset

This repository contains scripts to build the YouTube Gesture Dataset.

Home Page: https://sites.google.com/view/youngwoo-yoon/projects/co-speech-gesture-generation


Transcript

ireneb612 opened this issue

commented

Hi, how should the transcript be formatted? Should it be a .vtt file? Is this OK?

[{'duration': 1.696, 'start': 12.937, 'text': 'This is a photograph'},
{'duration': 2.816, 'start': 14.633, 'text': 'of a man whom for many years'},
{'duration': 3.721, 'start': 17.449, 'text': 'I plotted to kill.'},
{'duration': 3.02, 'start': 21.17, 'text': 'This is my father,'},
{'duration': 3.737, 'start': 24.19, 'text': 'Clinton George "Bageye" Grant.'},
{'duration': 2.271, 'start': 27.927, 'text': "He's called Bageye because he has"}]

I used the youtube-dl library to download the video subtitles in VTT format. Here is a sample VTT file:

WEBVTT
Kind: captions
Language: en

00:00:12.972 --> 00:00:17.858
Do you ever stop and think,
during a romantic dinner,

00:00:17.882 --> 00:00:20.774
"I've just left my fingerprints
all over my wine glass."

00:00:20.798 --> 00:00:21.799
(Laughter)
...
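
If you already have a transcript as a list of dicts like the one in the question, it can probably be converted to the layout shown above. Below is a minimal sketch in Python, assuming each entry has `start` and `duration` in seconds plus a `text` field. The function names `format_timestamp` and `transcript_to_vtt` are hypothetical helpers, not part of this repository, and the header lines simply copy the sample. Whether the dataset scripts accept a file produced this way is not confirmed in this thread.

```python
def format_timestamp(seconds):
    """Format a time in seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    total_ms = round(seconds * 1000)  # work in integer milliseconds
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def transcript_to_vtt(entries, language="en"):
    """Build WebVTT text from dicts with 'start', 'duration' and 'text' keys.

    Hypothetical helper: the header mirrors the sample VTT file above.
    """
    lines = ["WEBVTT", "Kind: captions", f"Language: {language}", ""]
    for entry in entries:
        start = entry["start"]
        end = start + entry["duration"]  # cue end = start + duration
        lines.append(f"{format_timestamp(start)} --> {format_timestamp(end)}")
        lines.append(entry["text"])
        lines.append("")  # a blank line separates cues
    return "\n".join(lines)


if __name__ == "__main__":
    transcript = [
        {"duration": 1.696, "start": 12.937, "text": "This is a photograph"},
        {"duration": 2.816, "start": 14.633, "text": "of a man whom for many years"},
    ]
    print(transcript_to_vtt(transcript))
```

The result can be written to a .vtt file with an ordinary `open(path, "w", encoding="utf-8")` call.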

commented

Thank you!