datnnt1997 / ViSA

Vietnamese sentiment analysis

ViSA

🎓 Training

The command below trains or fine-tunes a model for sentiment analysis.

python main.py train --task UIT-ViSD4SA \
                     --model_arch hier_roberta_sl \
                     --run_test \
                     --data_dir datasets/UIT-ViSD4SA \
                     --model_name_or_path vinai/phobert-base \
                     --output_dir outputs \
                     --max_seq_length 256 \
                     --train_batch_size 32 \
                     --eval_batch_size 32 \
                     --learning_rate 1e-4 \
                     --classifier_learning_rate 3e-3 \
                     --epochs 100 \
                     --early_stop 50 \
                     --overwrite_data
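
The command above combines `--epochs 100` with `--early_stop 50`. The README does not say how the two interact. A common reading, assumed here and not confirmed by the ViSA source, is that training stops once the validation score has not improved for `early_stop` consecutive epochs, or after `epochs` epochs at the latest. The sketch below illustrates that reading only; `run_with_early_stop` and its arguments are hypothetical names, not part of ViSA.

```python
def run_with_early_stop(val_scores, max_epochs=100, patience=50):
    """Simulate patience-based early stopping over per-epoch validation scores.

    ASSUMPTION: `patience` plays the role of --early_stop and `max_epochs`
    the role of --epochs. ViSA's real training loop may behave differently.
    Returns (best_epoch, best_score, epochs_run) with 0-based epoch indices.
    """
    best_score = float("-inf")
    best_epoch = -1
    epochs_run = 0
    for epoch, score in enumerate(val_scores[:max_epochs]):
        epochs_run = epoch + 1
        if score > best_score:
            # New best validation score: remember it and reset the wait.
            best_score, best_epoch = score, epoch
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs in a row: stop early.
            break
    return best_epoch, best_score, epochs_run


if __name__ == "__main__":
    # Toy validation scores, invented for illustration only.
    toy_scores = [0.50, 0.60, 0.55, 0.58, 0.59, 0.57]
    print(run_with_early_stop(toy_scores, max_epochs=100, patience=3))
```

Under this reading, a patience of 50 with a cap of 100 epochs means a run can only stop early if its best epoch falls within the first half of training.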

🥇 Performances

All experiments were run on an RTX 3090 GPU with 24GB VRAM and an AMD EPYC 7282 16-core CPU with 64GB RAM, both of which are available for rent on vast.ai. The pretrained models used for comparison are available on HuggingFace.

UIT-ViSD4SA (update 18/07/2022)
Table 1: The overall experimental results

| Task | Model | Accuracy | Micro Precision | Micro Recall | Micro F1-score | Macro Precision | Macro Recall | Macro F1-score | Reference |
|---|---|---|---|---|---|---|---|---|---|
| Aspect | BiLSTM_CRF_Base | ..... | 0.6563 | 0.6515 | 0.6539 | 0.6288 | 0.6162 | 0.6217 | Paper |
| Aspect | BiLSTM_CRF_Large | ..... | 0.6496 | 0.6685 | 0.6589 | 0.6200 | 0.6356 | 0.6276 | Paper |
| Aspect | HierRoBERTa_SL | 0.8061 | 0.6481 | 0.6726 | 0.6601 | 0.6169 | 0.6509 | 0.6331 | Log |
| Aspect | HierRoBERTa_ML | 0.8045 | 0.6528 | 0.6750 | 0.6637 | 0.6324 | 0.6474 | 0.6391 | Log |
| Polarity | BiLSTM_CRF_Base | ..... | 0.5488 | 0.5591 | 0.5539 | 0.4687 | 0.4639 | 0.4657 | Paper |
| Polarity | BiLSTM_CRF_Large | ..... | 0.5689 | 0.5978 | 0.5830 | 0.4900 | 0.5060 | 0.4977 | Paper |
| Polarity | HierRoBERTa_SL | 0.8110 | 0.6464 | 0.6659 | 0.6560 | 0.5601 | 0.5747 | 0.5673 | Log |
| Polarity | HierRoBERTa_ML | 0.8085 | 0.6526 | 0.6655 | 0.6590 | 0.5794 | 0.5734 | 0.5757 | Log |
| Aspect-polarity | BiLSTM_CRF_Base | ..... | 0.6071 | 0.6162 | 0.6116 | 0.4618 | 0.4342 | 0.4437 | Paper |
| Aspect-polarity | BiLSTM_CRF_Large | ..... | 0.6178 | 0.6299 | 0.6238 | 0.4684 | 0.4546 | 0.4570 | Paper |
| Aspect-polarity | HierRoBERTa_SL | 0.7709 | 0.6128 | 0.6401 | 0.6262 | 0.5089 | 0.5389 | 0.5166 | Log |
| Aspect-polarity | HierRoBERTa_ML | 0.7706 | 0.6213 | 0.6416 | 0.6313 | 0.5391 | 0.5195 | 0.5206 | Log |
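
Table 1 reports both micro- and macro-averaged scores, and the gap between them is largest where classes are imbalanced. The sketch below shows the usual definitions: micro-averaging pools true positives, false positives, and false negatives across classes before computing the scores, while macro-averaging computes the scores per class and takes an unweighted mean. The counts are invented for illustration and the function names are hypothetical. Whether ViSA's evaluation code computes its averages in exactly this way is an assumption.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1 from raw counts (0.0 when undefined)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def micro_macro(counts):
    """counts maps label -> (tp, fp, fn). Returns (micro, macro) score triples."""
    # Micro: pool the counts over all classes, then score once.
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    micro = precision_recall_f1(tp, fp, fn)
    # Macro: score each class separately, then average with equal weight.
    per_class = [precision_recall_f1(*c) for c in counts.values()]
    macro = tuple(sum(s[i] for s in per_class) / len(per_class) for i in range(3))
    return micro, macro


if __name__ == "__main__":
    # Hypothetical counts: one frequent, well-predicted class and one rare, weak class.
    toy_counts = {"POSITIVE": (90, 10, 10), "NEUTRAL": (5, 5, 15)}
    micro, macro = micro_macro(toy_counts)
    print("micro P/R/F1:", [round(x, 4) for x in micro])
    print("macro P/R/F1:", [round(x, 4) for x in macro])
```

With these toy counts the rare class drags the macro scores well below the micro scores, the same direction as the gap in the Polarity and Aspect-polarity rows above.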
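Table 2: Results per class for the aspect label of HierRoBERTa_ML

| Aspect | Precision | Recall | F1-score | Negative F1-score | Neutral F1-score | Positive F1-score |
|---|---|---|---|---|---|---|
| BATTERY | 0.7511 | 0.7612 | 0.7561 | 0.5944 | 0.5231 | 0.8121 |
| CAMERA | 0.7588 | 0.7650 | 0.7619 | 0.5836 | 0.5823 | 0.8062 |
| DESIGN | 0.7059 | 0.7024 | 0.7042 | 0.4188 | 0.2857 | 0.7600 |
| FEATURES | 0.5600 | 0.5784 | 0.5690 | 0.4894 | 0.4545 | 0.6667 |
| GENERAL | 0.6537 | 0.6743 | 0.6638 | 0.5478 | 0.4685 | 0.6705 |
| PERFORMANCE | 0.6381 | 0.6535 | 0.6457 | 0.5061 | 0.2714 | 0.7165 |
| PRICE | 0.4640 | 0.4981 | 0.4804 | 0.3937 | 0.2963 | 0.4907 |
| SCREEN | 0.6735 | 0.7174 | 0.6947 | 0.5067 | 0.3529 | 0.7748 |
| SER&ACC | 0.5672 | 0.6527 | 0.6069 | 0.2939 | 0.2857 | 0.6727 |
| STORAGE | 0.5517 | 0.4706 | 0.5079 | 0.3478 | 0.4444 | 0.6000 |

The first three score columns are the general scores for each aspect. The last three are the F1-scores per polarity within that aspect.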
Table 3: Results per class for the sentiment polarity label only, HierRoBERTa_ML

| Sentiment | Precision | Recall | F1-score |
|---|---|---|---|
| NEGATIVE | 0.5400 | 0.5579 | 0.5488 |
| NEUTRAL | 0.4704 | 0.4157 | 0.4414 |
| POSITIVE | 0.7278 | 0.7466 | 0.7371 |
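
As a consistency check, the per-class values in Table 3 can be related to the macro-averaged Polarity row for HierRoBERTa_ML in Table 1. The sketch below recomputes each F1-score as the harmonic mean of the reported precision and recall, then averages over the three classes with equal weight. It uses only the numbers printed in the table. That ViSA computes macro F1 as the mean of per-class F1-scores is an assumption, although the reported figures are consistent with it.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs copied from Table 3.
TABLE_3 = {
    "NEGATIVE": (0.5400, 0.5579),
    "NEUTRAL": (0.4704, 0.4157),
    "POSITIVE": (0.7278, 0.7466),
}

# Per-class F1 recomputed from the reported precision and recall.
per_class_f1 = {label: f1_score(p, r) for label, (p, r) in TABLE_3.items()}

# Macro averages: unweighted means over the three classes.
n = len(TABLE_3)
macro_precision = sum(p for p, _ in TABLE_3.values()) / n
macro_recall = sum(r for _, r in TABLE_3.values()) / n
macro_f1 = sum(per_class_f1.values()) / n

if __name__ == "__main__":
    for label, value in per_class_f1.items():
        print(f"{label}: F1 = {value:.4f}")
    print(f"macro P = {macro_precision:.4f}, R = {macro_recall:.4f}, F1 = {macro_f1:.4f}")
```

The macro precision and recall computed this way come to 0.5794 and 0.5734, matching the HierRoBERTa_ML Polarity row in Table 1.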
ABSA (update 18/07/2022)
YASO (update 18/07/2022)

📋 Todo

Models

  • Implement Hierarchical RoBERTa model (single layer).
  • Implement Hierarchical RoBERTa model (multiple layers).
  • Implement Hierarchical BERT model.
  • Implement Controllable Task-dependency loss.

Dataset processors

  • Read the UIT-ViSD4SA dataset and convert it to ABSA features.
  • Read the ABSA-{laptop, rest, twitter} dataset and convert it to ABSA features.
  • Read the YASO dataset and convert it to ABSA features.

Pipelines

  • Complete Train pipeline.
  • Complete Test pipeline.
  • Complete Predict pipeline.
  • Implement metrics for evaluating the ABSA task.

Documents

  • Introduce ViSA, its features, and the implemented models (Introduction).
  • Configure the environment and install any necessary libraries (Environments).
  • Explain how to run ViSA to train/fine-tune and validate the model (Training).
  • Describe the experimental setup and performance of the models (Performances).
