mpc001 / auto_avsr

Auto-AVSR: Lip-Reading Sentences Project

How to train an auto-avsr model from scratch through curriculum learning

sara-kkk opened this issue

Thank you for sharing the code.

I am interested in training a visual-only model from scratch on the LRS2 dataset, using curriculum learning.
What learning rate and how many epochs do you recommend when training on a subset of LRS2 that contains only short utterances of at most 4 seconds (100 frames)?
Could you provide details on how you trained the visual-only model available in the model zoo using only the LRS3 dataset (438 hours)?

Hi @sara-kkk, for LRS3 (438 hours), I first train on short utterances only (up to 100 frames) with a learning rate of 0.0002 for 75 epochs. I then load those weights and fine-tune on the whole of LRS3 with a learning rate of 0.001 for 75 epochs.
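
For anyone who wants to script this, here is a minimal sketch of the two-stage schedule described in the reply. It is not code from the auto_avsr repository: the `Stage` class, `select_utterances`, `seconds_to_frames`, and the manifest layout (a list of `(path, num_frames)` pairs) are all hypothetical names made up for illustration, and the 25 fps frame rate is an assumption inferred from "4 seconds (100 frames)". Only the frame cap, learning rates, and epoch counts come from the thread, and they were reported for LRS3, so they may need tuning for LRS2.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

FPS = 25  # assumption: 25 fps video, so 4 seconds correspond to 100 frames


@dataclass
class Stage:
    """One curriculum stage (hypothetical container, not an auto_avsr class)."""
    name: str
    max_frames: Optional[int]  # None means no cap on utterance length
    lr: float
    epochs: int
    init_from: Optional[str]   # stage whose weights are loaded before training


# Hyperparameters as reported in the reply for LRS3 (438 hours).
CURRICULUM: List[Stage] = [
    Stage("short", max_frames=100, lr=0.0002, epochs=75, init_from=None),
    Stage("full", max_frames=None, lr=0.001, epochs=75, init_from="short"),
]


def seconds_to_frames(seconds: float, fps: int = FPS) -> int:
    """Convert a duration in seconds to a frame count at the given frame rate."""
    return int(round(seconds * fps))


def select_utterances(
    manifest: List[Tuple[str, int]], max_frames: Optional[int]
) -> List[Tuple[str, int]]:
    """Keep the entries whose frame count does not exceed max_frames."""
    if max_frames is None:
        return list(manifest)
    return [(path, n) for path, n in manifest if n <= max_frames]


if __name__ == "__main__":
    # Toy manifest with made-up paths and lengths, for illustration only.
    toy_manifest = [("a.mp4", 60), ("b.mp4", 100), ("c.mp4", 240)]
    for stage in CURRICULUM:
        subset = select_utterances(toy_manifest, stage.max_frames)
        print(stage.name, stage.lr, stage.epochs, stage.init_from, len(subset))
```

The actual training call and checkpoint loading are left out on purpose, since they depend on the repository's own training script and options.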