songyouwei / ABSA-PyTorch

Aspect Based Sentiment Analysis, PyTorch Implementations.


The accuracy of bert_spc.py is low

YHTtTtao opened this issue:

I ran the bert_spc.py model and got an accuracy of 65.8% and an F1 of 36.5%. Why is the accuracy not higher? I used the restaurant data. Is it because of the dataset?

Note that BERT is very sensitive to hyperparameters on small datasets.

My own experiments show that learning rates of 5e-5, 3e-5, and 2e-5 perform well.
BERT-SPC on Restaurant dataset: Acc==0.8446, F1==0.7698
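
For reference, the learning rate just flows through the argparser (the --learning_rate flag) into the optimizer, so trying these values needs no code changes. Roughly, the optimizer ends up being built as in this minimal sketch of the setup in train.py (the stand-in model and the exact names are assumptions):

```python
import torch
import torch.nn as nn

# Stand-in for the ABSA model that train.py actually builds (e.g. BERT_SPC);
# only the optimizer setup matters for this sketch.
model = nn.Linear(768, 3)

# BERT fine-tuning is very sensitive to the learning rate on these small datasets;
# 5e-5 / 3e-5 / 2e-5 are the values that worked well above.
# weight_decay mirrors the --l2reg argument (0.01 by default).
params = filter(lambda p: p.requires_grad, model.parameters())
optimizer = torch.optim.Adam(params, lr=2e-5, weight_decay=0.01)
```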


Please check the latest committed version.

Ok, thank you, I will try that again.

Awesome, the expert solved it in the end.

@songyouwei What are the parameters with which you achieve accuracy values of over 0.80? I'm running aen_bert with the default parameters (the ones defined in the argparser in train.py), but the accuracy only goes up to 0.59, where it converges. These are the parameters I use:

> training arguments:
>>> model_name: aen_bert
>>> dataset: restaurant
>>> optimizer: <class 'torch.optim.adam.Adam'>
>>> initializer: <function xavier_uniform_ at 0x7f4f29cca7a0>
>>> learning_rate: 2e-05
>>> dropout: 0.1
>>> l2reg: 0.01
>>> num_epoch: 10
>>> batch_size: 16
>>> log_step: 5
>>> embed_dim: 300
>>> hidden_dim: 300
>>> bert_dim: 768
>>> pretrained_bert_name: bert-base-uncased
>>> max_seq_len: 80
>>> polarities_dim: 3
>>> hops: 3
>>> device: cpu
>>> seed: None
>>> valset_ratio: 0
>>> local_context_focus: cdm
>>> SRD: 3
>>> model_class: <class 'models.aen.AEN_BERT'>
>>> dataset_file: {'train': './datasets/semeval14/Restaurants_Train.xml.seg', 'test': './datasets/semeval14/Restaurants_Test_Gold.xml.seg'}
>>> inputs_cols: ['text_raw_bert_indices', 'aspect_bert_indices']

@fhamborg Maybe it's because the batch size is small? What is the performance of this set of parameters on bert_spc? I'll check it later.

Hi, thanks for getting back! With the same parameters but on bert_spc, the performance I get on the restaurant set is a bit higher than for aen_bert, but still not in the 80% range:

> val_acc: 0.6455, val_f1: 0.4145
>> test_acc: 0.6670, test_f1: 0.3756

FYI, I attached the full console output: https://gist.github.com/fhamborg/dade525af54a158982967383444fade4

Hello, I just cloned the latest repository and checked the code after the latest PR. With the parameters set the same as yours, aen_bert's accuracy on the restaurant dataset reaches 81+. Here is my training log.
training log.txt

Hi guys, I've also tested all of the BERT-based models modified by my latest PR, and they work really well. Here are the logs; I hope they help.
bert_spc training log.txt
lcf_bert training log.txt

Hey @yangheng95, thanks for the logs! I still haven't figured out what exactly the difference is; my only halfway plausible guess is the random initialization of a few components in PyTorch and transformers. Would you be so kind as to post your bert_spc log with the same parameters as before, but also setting --seed 1337? That would allow a better comparison. Thank you =)
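
(For context, my understanding is that --seed fixes the Python, NumPy, and PyTorch RNGs roughly as in the sketch below; worth verifying against the current train.py.)

```python
import random

import numpy as np
import torch

# Sketch of the kind of seeding train.py presumably applies when --seed is set.
seed = 1337
random.seed(seed)
np.random.seed(seed)
torch.manual_seed(seed)
torch.cuda.manual_seed(seed)                # silently ignored if CUDA is unavailable
torch.backends.cudnn.deterministic = True   # trade speed for reproducibility
torch.backends.cudnn.benchmark = False
```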

Also, could you post the log when running aen_glove? Thank you very much in advance!

Hello @fhamborg, I trained the bert_spc model with 1337 as the seed and the result is still very good: >> test_acc: 0.8402, test_f1: 0.7692. I think cloning and referring to the latest code after the PR may solve your problem. Due to a busy schedule, I may not have time to adapt and train AEN-GloVe, but you can run it by adding the aen_glove model to train.py just as the other models are added.
bert_spc.training.log.seed1337.txt
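
For later readers: judging from the model_class and inputs_cols entries printed further up, registering the GloVe variant should come down to extending the model/input dictionaries in train.py, roughly as below. This is only a sketch; the dictionary names, the AEN_GloVe class name, and the non-BERT input columns are assumptions that need to be checked against models/aen.py and the rest of the repo.

```python
# Hypothetical additions to train.py for an 'aen_glove' entry.
from models.aen import AEN_BERT  # the GloVe-based AEN class should live in the same module

model_classes = {
    'aen_bert': AEN_BERT,
    # 'aen_glove': AEN_GloVe,  # hypothetical class name for the GloVe variant
}

input_colses = {
    'aen_bert': ['text_raw_bert_indices', 'aspect_bert_indices'],
    # 'aen_glove': ['text_raw_indices', 'aspect_indices'],  # plain GloVe index columns (assumed)
}
```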

Hi @yangheng95, thanks for your reply and the verification with seed 1337. I'm using the latest repo, i.e., including the PR to migrate to transformers. However, I tried it on another machine and the results only went up to roughly 70% (plus or minus) for all the approaches.

Also, I managed to train aen_glove, but in contrast to the results reported in the paper, I was only able to get roughly 50% on the validation and test sets. Do you have any idea where the difference for GloVe could come from?

@fhamborg Thank you for reporting this issue.
I just looked into this.
There might be something wrong with the recent release of the pretrained BERT package from https://github.com/huggingface/transformers, i.e., the package named transformers.

I installed it with pip install transformers, replaced pytorch_transformers imports with transformers, and reproduced this issue.

Try reinstalling and using the previous release, pytorch_transformers, with pip install pytorch-transformers.
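
For anyone else hitting this, the swap described here amounts to changing the installed package and the import; a minimal sketch (both packages expose BertModel.from_pretrained):

```python
# Working setup reported in this thread:  pip install pytorch-transformers
from pytorch_transformers import BertModel

# Setup that reproduces the low accuracy:  pip install transformers
# from transformers import BertModel

bert = BertModel.from_pretrained('bert-base-uncased')
```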

Thanks, you're right, I was using transformers instead of pytorch_transformers. I will check it out now :-)

Awesome, with pytorch_transformers I get much higher performance than with transformers, e.g.:

> val_acc: 0.8536, val_f1: 0.7924

Thanks for the hint, @songyouwei! Do you have any idea what might be causing this significant difference depending on whether pytorch_transformers or transformers is used?

Hello @songyouwei, how do I train aen_glove? Do I need to modify the training code?