Hangz-nju-cuhk / Talking-Face-Generation-DAVS

Code for Talking Face Generation by Adversarially Disentangled Audio-Visual Representation (AAAI 2019)

Table 3: Audio-Visual Speech Recognition and 1:25000 audio-video retrieval results with different supervisions.

zzzzhuque opened this issue

Hi, after reading the paper, I am confused about Table 3.
What do visual acc, audio acc and combine acc mean?
How did you calculate the results of 67.5%, 91.8% and 95.2%?

Hi @ZHUTAO142857, sorry that I didn't notice this issue earlier.

I performed the audio-visual speech recognition task (word classification on LRW) as described in the paper. The accuracies in the table are the classification accuracies obtained using only video, only audio, or the combination of both.
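
For readers who want to see what the three columns measure, here is a minimal sketch of top-1 word classification accuracy for a video-only classifier, an audio-only classifier, and a combination of the two. Everything in it is an assumption made for illustration: the toy logits, the helper names (`softmax`, `top1_accuracy`, `fuse`), and in particular the fusion rule, which here simply averages the softmax scores of the two modalities. This thread does not say how the combination was actually done, so this is not the repository's evaluation code.

```python
import math

def softmax(logits):
    # Numerically stable softmax over one row of class scores.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top1_accuracy(scores, labels):
    # Fraction of samples whose highest-scoring class equals the label.
    correct = 0
    for row, y in zip(scores, labels):
        pred = max(range(len(row)), key=row.__getitem__)
        correct += int(pred == y)
    return correct / len(labels)

def fuse(video_logits, audio_logits):
    # ASSUMPTION: late fusion by averaging per-modality softmax scores.
    fused = []
    for v, a in zip(video_logits, audio_logits):
        pv, pa = softmax(v), softmax(a)
        fused.append([(x + y) / 2 for x, y in zip(pv, pa)])
    return fused

# Hypothetical toy data: 4 clips, 3 word classes (LRW itself has 500 classes).
labels = [0, 1, 2, 1]
video_logits = [[2.0, 0.1, 0.0], [0.2, 0.1, 1.5], [0.0, 0.3, 2.0], [1.0, 0.9, 0.0]]
audio_logits = [[1.5, 0.2, 0.1], [0.1, 3.0, 0.2], [0.1, 2.0, 0.5], [0.0, 2.5, 0.1]]

visual_acc = top1_accuracy(video_logits, labels)
audio_acc = top1_accuracy(audio_logits, labels)
combine_acc = top1_accuracy(fuse(video_logits, audio_logits), labels)

print(f"visual acc:  {visual_acc:.2%}")
print(f"audio acc:   {audio_acc:.2%}")
print(f"combine acc: {combine_acc:.2%}")
```

On this toy data the combined score is higher than either single modality because the two classifiers make mistakes on different clips, which is the usual motivation for reporting a combined accuracy alongside the single-modality ones.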