spartan-minhbui / sentiment-analysis

A pipeline for sentiment classification that fine-tunes models with the Hugging Face Trainer and exports them to ONNX.

Fine-tune a pretrained model for sentiment analysis

Model Architecture

  • Pretrained models tried:
    • PhoBERT: vinai/phobert-base-v2
    • Bloom: bigscience/bloom-560m
  • Architecture: XXXXForSequenceClassification
  • Training uses the Hugging Face Trainer
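
A minimal sketch of the model loading step, assuming the Auto classes from transformers are used to resolve each checkpoint to its ...ForSequenceClassification class. The helper name load_classifier, the CHECKPOINTS mapping, and num_labels=3 are illustrative assumptions and are not taken from the repository.

```python
# Hypothetical helper; the repository's own loading code may differ.
CHECKPOINTS = {
    "phobert": "vinai/phobert-base-v2",
    "bloom": "bigscience/bloom-560m",
}


def load_classifier(name: str, num_labels: int = 3):
    """Return (tokenizer, model) for one of the checkpoints listed above.

    num_labels=3 is an assumption; set it to the number of sentiment classes
    in your dataset.
    """
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    path = CHECKPOINTS[name]
    tokenizer = AutoTokenizer.from_pretrained(path)
    # The Auto class picks the architecture-specific classification head
    # that matches the checkpoint's config.
    model = AutoModelForSequenceClassification.from_pretrained(
        path, num_labels=num_labels
    )
    return tokenizer, model
```

Calling load_classifier("phobert") downloads the checkpoint on first use, so it needs network access or a local cache.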

Dataset

Optimization

  • Use some optimization techniques to optimize ONNX - Optim
  • See: pipeline/onnx_converter.py
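
For orientation, a rough sketch of a plain ONNX export with torch.onnx.export. It does not reproduce the optimizations in pipeline/onnx_converter.py. The function name export_to_onnx, the output path, the opset version, and the sample sentence are assumptions. It also assumes the model's forward takes input_ids and attention_mask as its first two positional arguments, as RoBERTa-style models such as PhoBERT do; other architectures may order their arguments differently.

```python
# Axes that may vary between calls; everything else is fixed at export time.
DYNAMIC_AXES = {
    "input_ids": {0: "batch", 1: "sequence"},
    "attention_mask": {0: "batch", 1: "sequence"},
    "logits": {0: "batch"},
}


def export_to_onnx(model, tokenizer, output_path: str = "model.onnx", opset: int = 14):
    """Trace a fine-tuned classifier and write it to output_path (hypothetical helper)."""
    # Imported lazily so this module can be imported without torch installed.
    import torch

    model.eval()
    # Return plain tuples instead of ModelOutput objects while tracing.
    # Note: this mutates the model's config.
    model.config.return_dict = False

    sample = tokenizer("a sample sentence", return_tensors="pt")
    torch.onnx.export(
        model,
        (sample["input_ids"], sample["attention_mask"]),
        output_path,
        input_names=["input_ids", "attention_mask"],
        output_names=["logits"],
        dynamic_axes=DYNAMIC_AXES,
        opset_version=opset,
    )
    return output_path
```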

How to run

  1. Notes:
  • Pass model_class when initializing the SentimentProcessor class.
  • Pass a non-None n_folds to train with K-fold validation.
  • If you use Bloom, pass use_lora=True.
  2. Export environment variables: while read LINE; do export "$LINE"; done < .env
  3. Run training: PRETRAINED_PATH=bigscience/bloom-560m CUDA_VISIBLE_DEVICES=1,2 python -m torch.distributed.launch --nproc_per_node 2 --master-port=30000 pipeline/trainer.py
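
The n_folds option in the notes above refers to K-fold validation. As an illustration of what such a split does, here is a small standard-library sketch. It is not the repository's implementation, and the function name k_fold_indices and the seed parameter are hypothetical.

```python
import random


def k_fold_indices(n_samples: int, n_folds: int, seed: int = 42):
    """Yield (train_indices, val_indices) once per fold.

    Every sample lands in the validation set exactly once across all folds.
    """
    if n_folds < 2 or n_folds > n_samples:
        raise ValueError("n_folds must be between 2 and n_samples")

    indices = list(range(n_samples))
    # Shuffle with a fixed seed so the folds are reproducible.
    random.Random(seed).shuffle(indices)

    # Striding gives folds whose sizes differ by at most one.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        val = folds[k]
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, val


if __name__ == "__main__":
    for fold, (train, val) in enumerate(k_fold_indices(10, 5)):
        print(f"fold {fold}: {len(train)} train / {len(val)} validation samples")
```

With K-fold validation, one model is trained per fold on the train indices and evaluated on the held-out validation indices, and the scores are then averaged.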
