ViSA

Vietnamese sentiment analysis

🎓 Training

The command below trains (fine-tunes) a model for sentiment analysis.

python main.py train --task UIT-ViSD4SA \
                     --model_arch hier_roberta_sl \
                     --run_test \
                     --data_dir datasets/UIT-ViSD4SA \
                     --model_name_or_path vinai/phobert-base \
                     --output_dir outputs \
                     --max_seq_length 256 \
                     --train_batch_size 32 \
                     --eval_batch_size 32 \
                     --learning_rate 1e-4 \
                     --classifier_learning_rate 3e-3 \
                     --epochs 100 \
                     --early_stop 50 \
                     --overwrite_data
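
The command combines `--epochs 100` with `--early_stop 50`. A minimal sketch of how those two settings could interact, assuming `--early_stop` is a patience value (the number of epochs allowed without improvement in the validation score). The repository may define it differently, and the function name and score list below are hypothetical illustrations, not part of ViSA.

```python
from typing import Iterable, Tuple


def run_with_early_stop(
    val_scores: Iterable[float], epochs: int, patience: int
) -> Tuple[int, float, int]:
    """Simulate patience-based early stopping over per-epoch validation scores.

    Returns (best_epoch, best_score, last_epoch_run), with epochs counted from 1.
    """
    best_epoch, best_score, last_epoch = 0, float("-inf"), 0
    for epoch, score in zip(range(1, epochs + 1), val_scores):
        last_epoch = epoch
        if score > best_score:
            # Strict improvement: remember this epoch as the best so far.
            best_epoch, best_score = epoch, score
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` consecutive epochs: stop.
            break
    return best_epoch, best_score, last_epoch


# Hypothetical validation F1-scores, one per epoch.
scores = [0.50, 0.58, 0.61, 0.60, 0.59, 0.61, 0.60]
print(run_with_early_stop(scores, epochs=100, patience=3))
```

Under this reading, a patience of 50 with 100 epochs means training stops early only if the validation score has not improved for 50 consecutive epochs.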

🥇 Performances

All experiments were run on an RTX 3090 (24GB VRAM) with an AMD EPYC 7282 16-core CPU and 64GB RAM, both of which are available for rent on vast.ai. The pretrained models used for comparison are available on HuggingFace.

UIT-ViSD4SA (update 18/07/2022)

Table 1: The overall experimental results

Model	Accuracy	Micro Precision	Micro Recall	Micro F1-score	Macro Precision	Macro Recall	Macro F1-score	Reference
Aspect
BiLSTM_CRF_Base	.....	0.6563	0.6515	0.6539	0.6288	0.6162	0.6217	Paper
BiLSTM_CRF_Large	.....	0.6496	0.6685	0.6589	0.6200	0.6356	0.6276	Paper
HierRoBERTa_SL	0.8061	0.6481	0.6726	0.6601	0.6169	0.6509	0.6331	Log
HierRoBERTa_ML	0.8045	0.6528	0.6750	0.6637	0.6324	0.6474	0.6391	Log
Polarity
BiLSTM_CRF_Base	.....	0.5488	0.5591	0.5539	0.4687	0.4639	0.4657	Paper
BiLSTM_CRF_Large	.....	0.5689	0.5978	0.5830	0.4900	0.5060	0.4977	Paper
HierRoBERTa_SL	0.8110	0.6464	0.6659	0.6560	0.5601	0.5747	0.5673	Log
HierRoBERTa_ML	0.8085	0.6526	0.6655	0.6590	0.5794	0.5734	0.5757	Log
Aspect-polarity
BiLSTM_CRF_Base	.....	0.6071	0.6162	0.6116	0.4618	0.4342	0.4437	Paper
BiLSTM_CRF_Large	.....	0.6178	0.6299	0.6238	0.4684	0.4546	0.4570	Paper
HierRoBERTa_SL	0.7709	0.6128	0.6401	0.6262	0.5089	0.5389	0.5166	Log
HierRoBERTa_ML	0.7706	0.6213	0.6416	0.6313	0.5391	0.5195	0.5206	Log
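
Table 1 reports both micro- and macro-averaged scores. The sketch below shows how the two differ, using hypothetical per-class counts of true positives, false positives, and false negatives. The counts and function names are illustrative assumptions and are not taken from ViSA or its logs. Micro-averaging pools the counts across classes before computing the scores, while macro-averaging computes the scores per class and then takes an unweighted mean.

```python
from typing import Dict, Tuple

Counts = Tuple[int, int, int]  # (true positives, false positives, false negatives)


def prf(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Precision, recall, and F1 from raw counts (0.0 when undefined)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def micro_average(counts: Dict[str, Counts]) -> Tuple[float, float, float]:
    """Pool the counts over all classes, then compute the scores once."""
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    return prf(tp, fp, fn)


def macro_average(counts: Dict[str, Counts]) -> Tuple[float, float, float]:
    """Compute the scores per class, then take the unweighted mean of each."""
    per_class = [prf(*c) for c in counts.values()]
    n = len(per_class)
    return tuple(sum(scores[i] for scores in per_class) / n for i in range(3))


# Hypothetical counts: a frequent, easy class and two rarer, harder ones.
toy_counts = {
    "POSITIVE": (90, 10, 10),
    "NEGATIVE": (30, 20, 30),
    "NEUTRAL": (5, 15, 20),
}
micro = micro_average(toy_counts)
macro = macro_average(toy_counts)
print("micro P/R/F1:", [round(x, 4) for x in micro])
print("macro P/R/F1:", [round(x, 4) for x in macro])
```

With imbalanced classes like these, the micro average is dominated by the frequent class, so it is higher than the macro average. The same pattern appears in the Polarity and Aspect-polarity rows of Table 1.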

Table 2: Per-class results for the aspect labels of HierRoBERTa_ML

Aspect	Precision	Recall	F1-score	Negative F1	Neutral F1	Positive F1
BATTERY	0.7511	0.7612	0.7561	0.5944	0.5231	0.8121
CAMERA	0.7588	0.7650	0.7619	0.5836	0.5823	0.8062
DESIGN	0.7059	0.7024	0.7042	0.4188	0.2857	0.7600
FEATURES	0.5600	0.5784	0.5690	0.4894	0.4545	0.6667
GENERAL	0.6537	0.6743	0.6638	0.5478	0.4685	0.6705
PERFORMANCE	0.6381	0.6535	0.6457	0.5061	0.2714	0.7165
PRICE	0.4640	0.4981	0.4804	0.3937	0.2963	0.4907
SCREEN	0.6735	0.7174	0.6947	0.5067	0.3529	0.7748
SER&ACC	0.5672	0.6527	0.6069	0.2939	0.2857	0.6727
STORAGE	0.5517	0.4706	0.5079	0.3478	0.4444	0.6000

Table 3: Per-class results for the sentiment polarity labels only of HierRoBERTa_ML

Sentiment	Precision	Recall	F1-score
NEGATIVE	0.5400	0.5579	0.5488
NEUTRAL	0.4704	0.4157	0.4414
POSITIVE	0.7278	0.7466	0.7371
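
As a consistency check on Table 3, each F1-score can be recomputed as the harmonic mean of the reported precision and recall, and the unweighted means of the three rows can be compared with the macro-averaged Polarity scores of HierRoBERTa_ML in Table 1. The snippet is a small sketch that uses only the numbers printed in the tables. The helper name `f1_score` is an illustrative choice, not a function from ViSA.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1-score) copied from Table 3.
table3 = {
    "NEGATIVE": (0.5400, 0.5579, 0.5488),
    "NEUTRAL": (0.4704, 0.4157, 0.4414),
    "POSITIVE": (0.7278, 0.7466, 0.7371),
}

for label, (p, r, reported) in table3.items():
    print(f"{label}: recomputed F1 = {f1_score(p, r):.4f}, reported = {reported:.4f}")

# Unweighted (macro) means over the three polarity classes.
n = len(table3)
macro_p = sum(p for p, _, _ in table3.values()) / n
macro_r = sum(r for _, r, _ in table3.values()) / n
macro_f1 = sum(f for _, _, f in table3.values()) / n
print(f"macro P = {macro_p:.4f}, macro R = {macro_r:.4f}, macro F1 = {macro_f1:.4f}")
```

The recomputed F1-scores agree with Table 3 to four decimals, and the macro precision and recall reproduce the 0.5794 and 0.5734 listed in Table 1. The mean of the rounded F1-scores comes to about 0.5758 against the reported 0.5757, a gap consistent with rounding of the per-class values.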

ABSA (update 18/07/2022)

YASO (update 18/07/2022)

📋 Todo

Models

~~Implement Hierarchical RoBERTa model (single layer).~~
~~Implement Hierarchical RoBERTa model (multiple layers).~~
Implement Hierarchical BERT model.
~~Implement Controllable Task-dependency loss.~~

Dataset processors

~~Read the UIT-ViSD4SA dataset and convert it to ABSA features.~~
Read the ABSA-{laptop, rest, twitter} dataset and convert it to ABSA features.
Read the YASO dataset and convert it to ABSA features.

Pipelines

~~Complete Train pipeline.~~
~~Complete Test pipeline.~~
Complete Predict pipeline.
~~Implement metrics for evaluating the ABSA task.~~

Documents

Introduce ViSA, its features, and the implemented models (Introduction).
Configure the environment and install any necessary libraries (Environments).
Explain how to run ViSA to train/fine-tune and validate a model (Training).
~~Describe the experimental setup and performance of the models (Performances).~~
