songyouwei / ABSA-PyTorch

Aspect Based Sentiment Analysis, PyTorch Implementations.


The accuracy of bert_spc.py is low

YHTtTtao opened this issue:

I ran the bert_spc.py model and got an accuracy of 65.8% and an F1 of 36.5%. Why is the accuracy not higher? I used the restaurant data. Is it because of the dataset?

Note that BERT is very sensitive to hyperparameters on small datasets.

My own experiments show that learning rates of 5e-5, 3e-5, and 2e-5 perform well.
BERT-SPC on Restaurant dataset: Acc==0.8446, F1==0.7698
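
For reference, the learning rate just flows through the argparser (the --learning_rate flag) into the optimizer, so trying these values needs no code changes. Roughly, the optimizer ends up being built as in this minimal sketch of the setup in train.py (the stand-in model and the exact names are assumptions):

```python
import torch
import torch.nn as nn

# Stand-in for the ABSA model that train.py actually builds (e.g. BERT_SPC);
# only the optimizer setup matters for this sketch.
model = nn.Linear(768, 3)

# BERT fine-tuning is very sensitive to the learning rate on these small datasets;
# 5e-5 / 3e-5 / 2e-5 are the values that worked well above.
# weight_decay mirrors the --l2reg argument (0.01 by default).
params = filter(lambda p: p.requires_grad, model.parameters())
optimizer = torch.optim.Adam(params, lr=2e-5, weight_decay=0.01)
```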


Please check the latest committed version.

Ok, thank you, I will try that again.

Awesome, the expert solved it in the end.

@songyouwei What are the parameters with which you achieve accuracy values of over 0.80? I'm running aen_bert with the default parameters (the ones defined in the argparser in train.py), but the accuracy only goes up to 0.59, where it converges. These are the parameters I use:

> training arguments:
>>> model_name: aen_bert
>>> dataset: restaurant
>>> optimizer: <class 'torch.optim.adam.Adam'>
>>> initializer: <function xavier_uniform_ at 0x7f4f29cca7a0>
>>> learning_rate: 2e-05
>>> dropout: 0.1
>>> l2reg: 0.01
>>> num_epoch: 10
>>> batch_size: 16
>>> log_step: 5
>>> embed_dim: 300
>>> hidden_dim: 300
>>> bert_dim: 768
>>> pretrained_bert_name: bert-base-uncased
>>> max_seq_len: 80
>>> polarities_dim: 3
>>> hops: 3
>>> device: cpu
>>> seed: None
>>> valset_ratio: 0
>>> local_context_focus: cdm
>>> SRD: 3
>>> model_class: <class 'models.aen.AEN_BERT'>
>>> dataset_file: {'train': './datasets/semeval14/Restaurants_Train.xml.seg', 'test': './datasets/semeval14/Restaurants_Test_Gold.xml.seg'}
>>> inputs_cols: ['text_raw_bert_indices', 'aspect_bert_indices']

@fhamborg Maybe it's because the batch size is small? What is the performance of this set of parameters on bert_spc? I'll check it later.

Hi, thanks for getting back! With the same parameters but on bert_spc, the performance I get on the restaurant set is a bit higher than for aen_bert, but still not in the 80% range:

> val_acc: 0.6455, val_f1: 0.4145
>> test_acc: 0.6670, test_f1: 0.3756

FYI, I attached the full console output: https://gist.github.com/fhamborg/dade525af54a158982967383444fade4

Hello, I just cloned the latest repository and checked the code after the latest PR. With the parameters set the same as yours, aen_bert's accuracy on the restaurant dataset reaches 81+. Here is my training log.
training log.txt

Hi guys, I've also tested all of the BERT-based models modified by my latest PR, and they work really well. Here are the logs; I hope they help.
bert_spc training log.txt
lcf_bert training log.txt

Hey @yangheng95, thanks for the logs! I still haven't figured out what exactly the difference is; my only halfway plausible guess is the random initialization of a few components in PyTorch and transformers. Would you be so kind as to post your bert_spc log with the same parameters as before, but also setting --seed 1337? That would allow a better comparison. Thank you =)
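
(For context, my understanding is that --seed fixes the Python, NumPy, and PyTorch RNGs roughly as in the sketch below; worth verifying against the current train.py.)

```python
import random

import numpy as np
import torch

# Sketch of the kind of seeding train.py presumably applies when --seed is set.
seed = 1337
random.seed(seed)
np.random.seed(seed)
torch.manual_seed(seed)
torch.cuda.manual_seed(seed)                # silently ignored if CUDA is unavailable
torch.backends.cudnn.deterministic = True   # trade speed for reproducibility
torch.backends.cudnn.benchmark = False
```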

Also, could you post the log when running aen_glove? Thank you very much in advance!

Hello @fhamborg, I trained the bert_spc model with 1337 as the seed and the result is still very good: >> test_acc: 0.8402, test_f1: 0.7692. I think cloning and referring to the latest code after the PR may solve your problem. Due to a busy schedule, I may not have time to adapt and train AEN-GloVe, but you can run it by adding the aen_glove model to train.py just as the other models are added.
bert_spc.training.log.seed1337.txt
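
For later readers: judging from the model_class and inputs_cols entries printed further up, registering the GloVe variant should come down to extending the model/input dictionaries in train.py, roughly as below. This is only a sketch; the dictionary names, the AEN_GloVe class name, and the non-BERT input columns are assumptions that need to be checked against models/aen.py and the rest of the repo.

```python
# Hypothetical additions to train.py for an 'aen_glove' entry.
from models.aen import AEN_BERT  # the GloVe-based AEN class should live in the same module

model_classes = {
    'aen_bert': AEN_BERT,
    # 'aen_glove': AEN_GloVe,  # hypothetical class name for the GloVe variant
}

input_colses = {
    'aen_bert': ['text_raw_bert_indices', 'aspect_bert_indices'],
    # 'aen_glove': ['text_raw_indices', 'aspect_indices'],  # plain GloVe index columns (assumed)
}
```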

Hi @yangheng95, thanks for your reply and the verification with seed 1337. I'm using the latest repo, i.e., including the PR to migrate to transformers. However, I tried it on another machine and the results only went up to roughly 70% (plus or minus) for all the approaches.

Also, I managed to train aen_glove, but in contrast to the results reported in the paper, I was only able to get roughly 50% on the validation and test sets. Do you have any idea where the difference for GloVe could come from?

@fhamborg Thank you for reporting this issue.
I just looked into this.
There might be something wrong with the recent release of the pretrained BERT package from https://github.com/huggingface/transformers, i.e., the package named transformers.

I installed it with pip install transformers, replaced pytorch_transformers imports with transformers, and reproduced this issue.

Try reinstalling and using the previous release, pytorch_transformers, with pip install pytorch-transformers.
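
For anyone else hitting this, the swap described here amounts to changing the installed package and the import; a minimal sketch (both packages expose BertModel.from_pretrained):

```python
# Working setup reported in this thread:  pip install pytorch-transformers
from pytorch_transformers import BertModel

# Setup that reproduces the low accuracy:  pip install transformers
# from transformers import BertModel

bert = BertModel.from_pretrained('bert-base-uncased')
```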

Thanks, you're right, I was using transformers instead of pytorch_transformers. I will check it out now :-)

Awesome, with pytorch_transformers I get much higher performance than with transformers, e.g.:

> val_acc: 0.8536, val_f1: 0.7924

Thanks for the hint, @songyouwei! Do you have any idea what might be causing this significant difference depending on whether pytorch_transformers or transformers is used?

Hello @songyouwei, how do I train aen_glove? Do I need to modify the training code?