kakaobrain / rq-vae-transformer

The official implementation of Autoregressive Image Generation using Residual Quantization (CVPR '22)


What is the reason for using cosine annealing scheduler for stage1 if min_lr = optimizer lr

fostiropoulos opened this issue · comments

It is not clear whether this is a bug in the implementation or intentional: the cosine annealing schedule never kicks in because the optimizer's lr is equal to the scheduler's minimum lr. Please advise.

@fostiropoulos ,
we implemented cosine annealing while exploring the optimal training settings for RQ-VAE, including a cosine learning rate schedule.
However, a cosine schedule requires the total number of training epochs to be fixed in advance, and we found that RQ-VAE keeps improving as training proceeds. Thus, instead of changing the scheduler implementation, we set the minimum lr equal to the base lr, which turns the cosine schedule into a constant lr schedule with warm-up.
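Below is a minimal sketch (not the repository's actual trainer code) illustrating the degenerate behavior, using PyTorch's built-in `CosineAnnealingLR` as a stand-in and a hypothetical base lr. When `eta_min` equals the optimizer's base lr, the cosine term has zero amplitude, so the lr stays constant at every step:

```python
# Sketch: cosine annealing collapses to a constant schedule when eta_min == base lr.
import torch

model = torch.nn.Linear(4, 4)
base_lr = 4e-5  # hypothetical value for illustration

optimizer = torch.optim.Adam(model.parameters(), lr=base_lr)

# eta_min == base_lr  =>  lr = eta_min + (base_lr - eta_min) * cos(...)/2 = base_lr
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
    optimizer, T_max=100, eta_min=base_lr
)

for step in range(5):
    optimizer.step()
    scheduler.step()
    print(step, scheduler.get_last_lr())  # always [4e-05]
```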