Transformers
Evaluation of Portuguese BERT base, multilingual BERT base, and RoBERTa large on ASSIN 1 RTE and TweetSentBR using Transformers, in addition to the ASSIN 1 STS and ASSIN 2 evaluations.
The formatted TweetSentBR data is not distributed, due to Twitter policy.
Instructions
- Install requirements
pip install -r ./examples/requirements.txt
- Update the Transformers package to support these tasks
pip install --upgrade .
- Run task
Replace {TASK_TYPE} with assin or tweetsent.
Replace {TASK} with the ASSIN task name, e.g. assin-ptbr-rte. Leave it blank for tweetsent.
a) For BERT multilingual
bash run_{TASK_TYPE}.sh {TASK} PT bert-base-multilingual-cased
b) For Portuguese BERT
bash run_{TASK_TYPE}.sh {TASK} PT neuralmind/bert-base-portuguese-cased
c) For RoBERTa
bash run_{TASK_TYPE}.sh {TASK} EN
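As a rough illustration of the placeholder substitution, the sketch below builds the command line for a run and prints it instead of executing it. The helper name build_cmd is hypothetical and not part of this repository. It also assumes that "leave blank" means the {TASK} argument is simply omitted; check run_tweetsent.sh before relying on that.

```shell
# Hypothetical helper (not part of the repository): prints the command
# for one run so the substitution can be checked before executing it.
build_cmd() {
  task_type="$1"   # assin or tweetsent
  task="$2"        # e.g. assin-ptbr-rte; empty for tweetsent
  lang="$3"        # PT or EN, as in the commands above
  model="$4"       # model name; empty for the RoBERTa command
  cmd="bash run_${task_type}.sh"
  # Assumption: a blank {TASK} means the argument is omitted entirely.
  if [ -n "$task" ]; then cmd="$cmd $task"; fi
  cmd="$cmd $lang"
  if [ -n "$model" ]; then cmd="$cmd $model"; fi
  printf '%s\n' "$cmd"
}

# Portuguese BERT on an ASSIN task
build_cmd assin assin-ptbr-rte PT neuralmind/bert-base-portuguese-cased
# Multilingual BERT on TweetSentBR ({TASK} left blank)
build_cmd tweetsent "" PT bert-base-multilingual-cased
```

The first call prints `bash run_assin.sh assin-ptbr-rte PT neuralmind/bert-base-portuguese-cased`, which matches the Portuguese BERT command above with the placeholders filled in.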
Results
- run_{TASK_TYPE}.sh outputs predictions.json in output/{MODEL}/{TASK}.
- Task evaluations and ASSIN XML in output/{MODEL}.
- Task evaluation scripts in {TASK_TYPE}_eval.yaml
- ASSIN XML formatting script in assin_xml.yaml
The similarity labels in the ASSIN XML were not modified.
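To check that a run produced its predictions, the output layout described above can be turned into a path. This is a minimal sketch: the helper name predictions_path is hypothetical, and it assumes {MODEL} and {TASK} are substituted verbatim into the directory names, which may not hold for model names that contain a slash.

```shell
# Hypothetical helper (not part of the repository): prints where
# predictions.json is expected, following output/{MODEL}/{TASK}.
predictions_path() {
  model="$1"   # value substituted for {MODEL} (assumed verbatim)
  task="$2"    # value substituted for {TASK}
  printf 'output/%s/%s/predictions.json\n' "$model" "$task"
}

path="$(predictions_path bert-base-multilingual-cased assin-ptbr-rte)"
if [ -f "$path" ]; then
  echo "found: $path"
else
  echo "missing: $path"
fi
```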