How to decide n_layers
leehelenah opened this issue
Helena Lee commented
Hello,
Thanks for the nice implementation.
I noticed that you set n_layers = 1 in conf/train.json.
In most experiments I have seen, people set n_layers to 6 or even higher.
Could that be a reason the Transformer model doesn't outperform RCNN in your results? Thank you.
lipengyu commented
A Transformer has more parameters than an RCNN, so it needs more data to fit. That is also why Transformer-based pretrained language models need huge corpora. So with a larger dataset, the results might be slightly different.
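As a rough illustration of the parameter-count point, here is a minimal sketch that estimates how the size of a standard Transformer encoder grows with n_layers. The function names and the dimensions (d_model = 512, d_ff = 2048, a 300-dimensional input, 256 hidden units) are assumptions chosen for the example, not values taken from this repository, and the BiLSTM count is only a rough stand-in for the recurrent part of an RCNN.

```python
def transformer_encoder_layer_params(d_model: int, d_ff: int) -> int:
    """Weights and biases in one standard Transformer encoder layer."""
    # Q, K, V and output projections: four d_model x d_model matrices plus biases.
    attention = 4 * (d_model * d_model + d_model)
    # Position-wise feed-forward: d_model -> d_ff -> d_model, with biases.
    feed_forward = (d_model * d_ff + d_ff) + (d_ff * d_model + d_model)
    # Two layer norms, each with a scale and a shift vector.
    layer_norms = 2 * (2 * d_model)
    return attention + feed_forward + layer_norms


def transformer_encoder_params(d_model: int, d_ff: int, n_layers: int) -> int:
    """Encoder stack size, ignoring embeddings and any classifier head."""
    return n_layers * transformer_encoder_layer_params(d_model, d_ff)


def bilstm_params(input_size: int, hidden_size: int) -> int:
    """Single-layer bidirectional LSTM with one bias vector per gate.

    Assumption: some libraries keep two bias vectors per gate,
    which adds a few thousand parameters to this count.
    """
    per_direction = 4 * (
        input_size * hidden_size + hidden_size * hidden_size + hidden_size
    )
    return 2 * per_direction


if __name__ == "__main__":
    # Illustrative sizes only; substitute the values from your own config.
    for n_layers in (1, 6):
        total = transformer_encoder_params(512, 2048, n_layers)
        print(f"Transformer encoder, n_layers={n_layers}: {total:,} parameters")
    print(f"BiLSTM (300 -> 256): {bilstm_params(300, 256):,} parameters")
```

The encoder count scales linearly with n_layers, so going from 1 to 6 layers multiplies the number of encoder weights to fit by six, which is why a deeper model tends to need more training data.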