Hangz-nju-cuhk / Talking-Face-Generation-DAVS

Code for Talking Face Generation by Adversarially Disentangled Audio-Visual Representation (AAAI 2019)

Is audio video offset considered in LRW?

leeeeeeo opened this issue

@Hangz-nju-cuhk According to your paper, the model is trained on LRW, but LRW videos contain audio-video offsets. [11] used SyncNet during dataset pre-processing to correct the offset.
Did you consider this problem when preparing the dataset?
Thank you!

[11] Joon Son Chung, Amir Jamaludin, and Andrew Zisserman. You said that? arXiv preprint arXiv:1705.02966, 2017.

@leeeeeeo I looked closely but did not see any audio-video offset cases in the LRW dataset. The LRW paper says the data has already been cleaned, so I believe such cases are very rare in this dataset. Of course, it could be cleaned further with SyncNet, but I don't think this extra pre-processing is crucial to our task.
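
For anyone who wants to spot-check a few clips before deciding whether extra cleaning is worth it, here is a rough sketch of a lag search between two per-frame signals. It is not SyncNet, which compares learned audio and video embeddings; this is a plain normalized cross-correlation swapped in for illustration. The inputs `audio_feat` and `video_feat` are hypothetical (for example, per-frame audio energy and a mouth-opening measure that you would have to extract yourself), and the function name and `max_lag` default are assumptions as well.

```python
from statistics import mean, pstdev


def _zscore(xs):
    """Normalize a sequence to zero mean and (roughly) unit variance."""
    m, s = mean(xs), pstdev(xs)
    return [(x - m) / (s + 1e-8) for x in xs]


def estimate_offset(audio_feat, video_feat, max_lag=15):
    """Search for the frame lag that best aligns two per-frame signals.

    Hypothetical helper, not part of this repository or of SyncNet.
    Returns (lag, score), where a positive lag means audio_feat[t + lag]
    lines up with video_feat[t], i.e. the audio is delayed by `lag` frames.
    """
    n = min(len(audio_feat), len(video_feat))
    if n <= max_lag:
        raise ValueError("sequences must be longer than max_lag")
    a = _zscore(list(audio_feat[:n]))
    v = _zscore(list(video_feat[:n]))

    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            x, y = a[lag:], v[:n - lag]
        else:
            x, y = a[:n + lag], v[-lag:]
        # Mean product over the overlapping frames only.
        score = sum(p * q for p, q in zip(x, y)) / len(x)
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag, best_score
```

A lag near zero on most clips would be consistent with the observation above that offsets are rare in LRW. Hand-crafted features like these are noisy, so treat the result as a sanity check only, not as a replacement for SyncNet-based correction.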