ZeroRin / BertGCN


Difficulty in Reproducing the Performance.

wangywUST opened this issue

I ran the command
python3 train_bert_gcn.py --dataset R8 --pretrained_bert_ckpt checkpoint/roberta-base_R8/checkpoint.pth -m 0.5
without changing the code.

But the test accuracy is less than 0.8. Is there anything that I missed?

I checked my logs and noticed that for the BERT+GCN experiments I used batch size 128 and switched to 64 for the GAT variants, but I don't think that would make a big difference in performance.
Does the pretrained roberta-base model match the reported performance?

Thanks for your reply!

I just ran the following commands sequentially:

python3 build_graph.py R8
python3 finetune_bert.py --dataset R8
python3 train_bert_gcn.py --dataset R8 --pretrained_bert_ckpt checkpoint/roberta-base_R8/checkpoint.pth -m 0.5

on a V100 GPU, without changing any line of the code. But the test accuracy is below 0.8 in every epoch.

By "pretrained roberta-base model", do you mean the one that I produce with the second command, or one provided by you? If the latter, may I know how to use it? Thank you very much for your time, and I hope you can try running the commands listed above.

I mean the one produced by finetune_bert.py. I want to know whether the problem occurs during the training of the BERT module or during the joint BERT+GCN training. I'll also test it myself.

I noticed that I accidentally removed scheduler.step() in finetune_bert.py when reformatting my code, so the LR scheduler was not running and the BERT module failed to converge under a high learning rate. I have fixed this bug, and it should work now.
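
For anyone hitting the same issue, here is a minimal sketch of where the call belongs in a typical PyTorch loop (illustrative names only, not the repo's exact code):

import torch

# Without scheduler.step(), the learning rate stays at its initial value
# for the whole run, which is what made training diverge here.
model = torch.nn.Linear(768, 2)   # stand-in for the fine-tuned classifier
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[30], gamma=0.1)

for epoch in range(60):
    x = torch.randn(8, 768)       # dummy batch
    loss = model(x).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()              # the accidentally-removed call; restoring it
                                  # lets the LR decay as intended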

Thanks for checking!

Based on your updated code, I ran the following commands sequentially:

python3 build_graph.py R8
python3 finetune_bert.py --dataset R8
python3 train_bert_gcn.py --dataset R8 --pretrained_bert_ckpt checkpoint/roberta-base_R8/checkpoint.pth -m 0.5

on a V100 GPU, without changing any line of the code. But the test accuracy is still below 0.8 in every epoch.

Did you try these commands and see the results? Thanks a lot!

After running train_bert_gcn.py for 1 epoch I got ~0.97, so I thought it should work as expected.
It is still important to know whether the problem occurs in the BERT training or in the joint training. Can you check the test accuracy of 'checkpoint/roberta-base_R8/checkpoint.pth'? You could also try running:
python train_bert_gcn.py --dataset R8 -m 0.5
without the pretrained initialization.
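
If it helps, here is a rough sketch for inspecting that checkpoint (the key names are an assumption about what the training script saves; print them first and adjust):

import torch

# Load the checkpoint on CPU and inspect what was actually saved.
ckpt = torch.load('checkpoint/roberta-base_R8/checkpoint.pth', map_location='cpu')
print(ckpt.keys())

# If it holds state dicts for the BERT encoder and the classifier head,
# load them back into the fine-tuning model and evaluate on the test split
# to get the standalone BERT accuracy, isolating it from the joint training.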

I checked our finetuned roberta model again and noticed that it was trained with an initial lr of 1e-4, not 1e-3. According to my experiments today, training roberta with an lr of 1e-3 results in a bad model for initialization. I updated the default params and re-ran everything from scratch; the result matches our reported performance. Hopefully this solves the problem.
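
If you would rather not pull the updated defaults, you can also pass the learning rate explicitly on the command line (assuming your copy of finetune_bert.py exposes a --bert_lr flag; check its argparse if the name differs):

python3 finetune_bert.py --dataset R8 --bert_lr 1e-4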

Thank you for your time!

python3 train_bert_gcn.py --dataset R8 -m 0.5

gives good performance.