Why do we need to train for Stage II and Stage III? Why not just train for one stage on the annotated dataset?

Question

chengyang00 opened this issue 3 years ago

I want to know why doing this can improve the performance. Thanks!

Abhinav Dayal · Answer 1 · Fri May 28 2021 12:28:46 GMT+0800 (China Standard Time)

My understanding: Stage 1 trains on synthetic data, which is very large. Stages 2 and 3 use manually annotated, accurate data containing the kinds of errors humans actually make. That data is tiny compared to the synthetic set, which is why those stages are called fine-tuning rather than training.
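To make the schedule concrete, here is a rough sketch of what "train on the big set first, then fine-tune on the small one" looks like mechanically. It is a toy one-weight model in plain Python, not the project's training code: the datasets, slopes, learning rates, and the `sgd` / `train_in_stages` helpers are all made up for illustration. It only shows how the weights carry over between stages, not why the staged recipe scores better.

```python
# Toy sketch of staged training: one weight w, model y = w * x.
# All data, slopes, and hyperparameters are hypothetical.

def sgd(w, data, lr, epochs):
    """Run plain SGD on squared error and return the updated weight."""
    for _ in range(epochs):
        for x, y in data:
            grad = 2.0 * (w * x - y) * x
            w -= lr * grad
    return w

def train_in_stages(w, stages):
    """Run each (data, lr, epochs) stage in order, carrying w across stages."""
    history = []
    for data, lr, epochs in stages:
        w = sgd(w, data, lr, epochs)
        history.append(w)
    return history

# Stage 1 stand-in: large "synthetic" set whose slope (1.8) is slightly off.
synthetic = [(i / 1000, 1.8 * i / 1000) for i in range(1, 1001)]
# Later-stage stand-in: tiny "annotated" set with the slope we want (2.0).
annotated = [(i / 20, 2.0 * i / 20) for i in range(1, 21)]

stages = [
    (synthetic, 0.1, 1),   # big data, one pass
    (annotated, 0.05, 3),  # small data, lower learning rate, a few passes
]
w_after_pretraining, w_after_finetuning = train_in_stages(0.0, stages)
print(w_after_pretraining, w_after_finetuning)
```

The first stage pulls the weight to whatever the large set implies, and the short second stage nudges it toward the small, accurate set without starting over.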

Alex Skurzhanskyi · Answer 2 · Fri May 28 2021 17:46:27 GMT+0800 (China Standard Time)

Thanks for answering this. You're right – different stages have data of different quality.