Teacher Forcing
lalalune opened this issue · comments
Review our teacher forcing strategy.
One idea that might be interesting is to set the teacher forcing ratio to 1 - loss. So we force until our loss drops below 0, then start to back off. By the end the model shouldn't care about sequence order.
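
A rough sketch of the `1 - loss` idea, to make it easier to discuss. It assumes the loss is a non-negative scalar (for example a running average of cross-entropy) and that the result is clamped to [0, 1] so it can be used as a probability. The names `teacher_forcing_ratio` and `use_teacher_forcing` are made up for illustration and are not from our codebase.

```python
import random


def teacher_forcing_ratio(loss: float) -> float:
    """Literal reading of "1 - loss", clamped to the range [0, 1].

    Assumption: `loss` is a scalar such as a running average of the
    training loss. The clamp is an addition so the value is a valid
    probability.
    """
    return max(0.0, min(1.0, 1.0 - loss))


def use_teacher_forcing(loss: float, rng: random.Random) -> bool:
    """Decide for one step (or one batch) whether to feed the ground truth.

    Returns True with probability teacher_forcing_ratio(loss).
    """
    return rng.random() < teacher_forcing_ratio(loss)


if __name__ == "__main__":
    rng = random.Random(0)
    for loss in (2.0, 1.0, 0.5, 0.1):
        print(loss, teacher_forcing_ratio(loss), use_teacher_forcing(loss, rng))
```

Worth noting when reviewing: read literally, this gives a ratio of 0 whenever the loss is 1 or higher and moves toward 1 as the loss approaches 0. If we want to force early and back off as the loss falls, the mapping probably needs to be flipped or rescaled (for example `min(1.0, loss)`), which is something to decide in this issue.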