Sentiment Analysis Shared Task at BLP Workshop @EMNLP 2023
The aim of this task is to identify the polarity of social media content. Please see the Task Description below.
Table of contents:
- Important Dates
- Proceedings
- List of Versions
- Contents of the Directory
- Task Description
- Dataset
- Scorer and Official Evaluation Metrics
- Baselines
- Format checker
- Submission Guidelines
- Leaderboard
- Organizers
- 16 July 2023: Registration on Codalab and beginning of the development cycle
- 15 August 2023: Beginning of the evaluation cycle (test sets release and run submission)
- 18 August 2023: End of the evaluation cycle
- 20 August 2023: Publish rank list and share paper submission details
- **12 September 2023** (previously 10 September 2023): Deadline for the submission of working notes
- 10 October 2023: Notification of acceptance
- **18 October 2023** (previously 16 October 2023): Camera-ready due
- 8 December 2023: Workshop co-located with EMNLP-2023 (Singapore)
The title of the paper should be in the following format: < Team Name > at BLP-2023 Task 2: < Descriptive title of your paper >
For example, team AlphaX would have their title as follows: AlphaX at BLP-2023 Task 2: Transformer Models for Sentiment Analysis
- The shared task papers may consist of up to four (4) pages of content.
Templates: Shared task papers must follow the EMNLP 2023 two-column format, using the supplied official templates. The templates can be downloaded from the style files and formatting page. Please do not modify these style files, and do not use templates designed for other conferences. Submissions that do not conform to the required styles, including paper size, margin width, and font size restrictions, will be rejected without review. To verify conformance to publication standards, we will use the ACL pubcheck tool. The PDFs of camera-ready papers must be run through this tool prior to their final submission, and we recommend using it at submission time as well.
Submissions are open only to teams that submitted their systems during the evaluation phase and are listed on the leaderboard. The working notes are to be submitted anonymously. For the anonymity, double-blind submission, and reproducibility criteria, please follow the EMNLP 2023 instructions.
- [21/08/2023] Leaderboard announced
- [18/08/2023] Competition Ends
- [15/08/2023] Evaluation phase starts
- [15/08/2023] Test data released for evaluation phase
- [13/07/2023] Development phase starts
- [13/07/2023] Training and dev data released
- Main folder: data
  This directory contains data files for the task.
  - Sub folder: test
    This directory contains test files for the task evaluation.
- Main folder: bibtex
  This directory contains bibliography of related works.
- Main folder: baselines
  Contains scripts provided for baseline models of the task.
- Main folder: example_scripts
  Contains an example script provided to run DistilBERT model for the task.
- Main folder: format_checker
  Contains scripts provided to check the format of the submission file.
- Main folder: scorer
  Contains scripts provided to score the output of the model when provided with the label (i.e., dev).
- README.md
  This file!
Task: The objective is to detect the sentiment expressed in a given text. This is a multi-class classification task that involves determining whether the sentiment of the text is Positive, Negative, or Neutral.
For a brief overview of the dataset, kindly refer to the README.md file located in the data directory.
Each file uses the tsv format. A row within the tsv adheres to the following structure:
id text label
Where:
- id: an index or id of the text
- text: the text to classify
- label: Positive, Negative, or Neutral
14737 এখান থেকে সবাই শিক্ষা নিতে পারি । Positive
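The row structure above can be read with Python's standard `csv` module. The following is only a minimal sketch: the function name `read_examples` is hypothetical, and it assumes that a header row, if the files contain one, consists exactly of the column names `id`, `text`, and `label`.

```python
import csv
import io

LABELS = {"Positive", "Negative", "Neutral"}


def read_examples(lines):
    """Parse tab-separated rows of (id, text, label) from an iterable of lines."""
    # QUOTE_NONE leaves quotation marks inside the text column untouched.
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    examples = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        if row == ["id", "text", "label"]:
            continue  # assumption: a header row, if present, names the columns
        if len(row) != 3:
            raise ValueError(f"expected 3 columns, got {len(row)}: {row!r}")
        example_id, text, label = row
        if label not in LABELS:
            raise ValueError(f"unknown label {label!r} for id {example_id}")
        examples.append((example_id, text, label))
    return examples


# In-memory sample built from the example row shown above.
sample = "id\ttext\tlabel\n14737\tএখান থেকে সবাই শিক্ষা নিতে পারি ।\tPositive\n"
examples = read_examples(io.StringIO(sample))
```

To read a real file, pass an open file handle instead of the in-memory sample, opened with `encoding="utf-8"` and `newline=""`.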
The scorer for the task is located in the scorer module of the project. It reports the official evaluation metric and other metrics for a prediction file. The scorer invokes the format checker to verify that the output is properly formatted, and it also checks that the predictions file contains all entries from the gold file.
Install all prerequisites with:
pip install -r requirements.txt
Launch the scorer for the task as follows:
python scorer/task.py --gold-file-path=<path_gold_file> --pred-file-path=<predictions_file>
python scorer/task.py --pred_files_path task_dev_output.txt --gold_file_path data/dev.tsv
The official evaluation metric for the task is micro-F1. However, the scorer also reports accuracy, precision and recall.
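To make the metric concrete, here is a minimal sketch of micro-averaged F1 in plain Python. It is not the code in the scorer module; the function name `micro_f1` is an assumption for illustration. Micro-averaging pools true positives, false positives, and false negatives over all classes before computing F1.

```python
from collections import Counter


def micro_f1(gold, pred):
    """Micro-averaged F1 over all classes for single-label predictions."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # class p was predicted, but it was wrong
            fn[g] += 1  # the true class g was missed
    total_tp = sum(tp.values())
    total_fp = sum(fp.values())
    total_fn = sum(fn.values())
    denom = 2 * total_tp + total_fp + total_fn
    return 2 * total_tp / denom if denom else 0.0


gold = ["Positive", "Negative", "Neutral", "Positive"]
pred = ["Positive", "Negative", "Positive", "Neutral"]
score = micro_f1(gold, pred)  # 2 of 4 correct
```

When every text receives exactly one predicted label, each error counts once as a false positive and once as a false negative, so micro-F1 equals accuracy.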
The baselines module currently contains a majority baseline, a random baseline, and a simple n-gram baseline.
Baseline results for the task on the test set
Model | micro-F1 |
---|---|
Random Baseline | 0.3356 |
Majority Baseline | 0.4977 |
n-gram Baseline | 0.5514 |
Baseline results for the task on the dev-test set
Model | micro-F1 |
---|---|
Random Baseline | 0.3389 |
Majority Baseline | 0.4962 |
n-gram Baseline | 0.5736 |
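The majority and random baselines can be sketched in a few lines. This is an illustration under stated assumptions, not the code in the baselines module: the function names are hypothetical, the majority label is assumed to be taken from the training labels, and the random baseline is assumed to sample the three classes uniformly.

```python
import random
from collections import Counter

LABELS = ["Positive", "Negative", "Neutral"]


def majority_baseline(train_labels, n_predictions):
    """Predict the most frequent training label for every text."""
    majority_label, _ = Counter(train_labels).most_common(1)[0]
    return [majority_label] * n_predictions


def random_baseline(n_predictions, seed=0):
    """Predict a uniformly random label for every text (seeded for repeatability)."""
    rng = random.Random(seed)
    return [rng.choice(LABELS) for _ in range(n_predictions)]


majority_preds = majority_baseline(["Negative", "Negative", "Positive"], 2)
random_preds = random_baseline(5)
```

Under the uniform-sampling assumption, a random baseline over three classes is expected to score close to one third, which is in line with the random baseline figures in the tables above.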
The format checkers for the task are located in the format_checker module of the project. The format checker verifies that your generated results file complies with the expected format.
Before running the format checker, please install all prerequisites:
pip install -r requirements.txt
To launch it, please run the following command:
python format_checker/task.py -p results_files
python format_checker/task.py -p ./task.txt
results_files can be a single path or a space-separated list of paths.
Evaluation consists of two phases:
- Development phase: This phase involves working on the dev-test set.
- Evaluation phase: This phase involves working on the test set, which will be released during the evaluation cycle.
For each phase, please adhere to the following guidelines:
- We request each team to establish and manage a single account for all submissions. Hence, all runs should be submitted through the same account. Submissions made from multiple accounts by the same team may result in your system being excluded from the final ranking in the overview paper.
- The most recently uploaded file on the leaderboard will serve as your final submission.
- Adhere strictly to the naming convention for the output file, which must be named 'task.tsv'. Deviating from this convention could trigger an error on the leaderboard.
- Compress the '.tsv' file into a '.zip' file (for instance, zip task.zip task.tsv) and submit it through the Codalab page.
- With each submission, be sure to include your team name along with a brief explanation of your methodology.
- Each team is allowed a maximum of 50 submissions per day for the given task. Please adhere to this limit.
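The naming and compression steps above can also be done from Python with the standard `zipfile` module. This is a minimal sketch: `package_submission` is a hypothetical helper, and whether the submission file should start with an `id`/`label` header row is an assumption, so run the format checker on the resulting file before uploading.

```python
import csv
import tempfile
import zipfile
from pathlib import Path


def package_submission(predictions, out_dir=".", include_header=True):
    """Write (id, label) pairs to task.tsv and compress it into task.zip."""
    out_dir = Path(out_dir)
    tsv_path = out_dir / "task.tsv"  # file name required by the guidelines
    zip_path = out_dir / "task.zip"
    with open(tsv_path, "w", encoding="utf-8", newline="") as f:
        writer = csv.writer(f, delimiter="\t", lineterminator="\n")
        if include_header:
            # Assumption: a header row naming the two columns is accepted.
            writer.writerow(["id", "label"])
        writer.writerows(predictions)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        # arcname keeps the file at the top level of the archive.
        zf.write(tsv_path, arcname="task.tsv")
    return zip_path


# Example: package one prediction inside a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    archive = package_submission([("14737", "Positive")], out_dir=tmp)
    with zipfile.ZipFile(archive) as zf:
        names = zf.namelist()
        content = zf.read("task.tsv").decode("utf-8")
```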
The submission file format is tsv (tab-separated values). A row within the tsv adheres to the following structure:
id label
Where:
- id: an id of the text
- label: Positive, Negative, or Neutral
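A quick local check of this structure might look like the sketch below. It is not the project's format checker: the function name `check_prediction_lines` is hypothetical, and both the optional header row and the duplicate-id check are assumptions about what a valid file looks like.

```python
LABELS = {"Positive", "Negative", "Neutral"}


def check_prediction_lines(lines):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    seen_ids = set()
    for line_no, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        if not line:
            continue  # ignore blank lines
        fields = line.split("\t")
        if fields == ["id", "label"]:
            continue  # assumption: an optional header row
        if len(fields) != 2:
            problems.append(f"line {line_no}: expected 2 tab-separated fields")
            continue
        example_id, label = fields
        if label not in LABELS:
            problems.append(f"line {line_no}: unknown label {label!r}")
        if example_id in seen_ids:
            # Assumption: each id should appear at most once.
            problems.append(f"line {line_no}: duplicate id {example_id!r}")
        seen_ids.add(example_id)
    return problems


ok = check_prediction_lines(["id\tlabel\n", "1\tPositive\n", "2\tNeutral\n"])
bad = check_prediction_lines(["1\tHappy\n", "1\tPositive\n", "3 Positive\n"])
```

Here `ok` is empty, while `bad` reports an unknown label, a duplicate id, and a row that is not tab-separated.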
https://codalab.lisn.upsaclay.fr/competitions/14587
Ranking | Username | F1-Micro |
---|---|---|
1 | MoFa_Aambela | 0.7310 |
2 | yangst | 0.7267 |
3 | amlan107 | 0.7179 |
4 | Hari_vm | 0.7172 |
5 | PreronaTarannum | 0.7164 |
— | ShadmanRohan | 0.7155 |
6 | MEAkhter | 0.7112 |
7 | empty_box | 0.7109 |
8 | todiketan | 0.7094 |
9 | towhidul_tonmoy | 0.7088 |
10 | ptnv-s | 0.7078 |
11 | DeepBlueAI | 0.7076 |
12 | Raihan008 | 0.7058 |
13 | NLP_TEAM | 0.7052 |
14 | M1437 | 0.7036 |
15 | Semantic_Savants | 0.7002 |
16 | abdalimran | 0.6996 |
17 | Ka05aR | 0.6930 |
18 | VishwasGPai | 0.6824 |
19 | UFAL-ULD | 0.6768 |
20 | KrishnoDey | 0.6742 |
21 | Ssaha | 0.6702 |
22 | pramitb | 0.6584 |
23 | Trina_Chakraborty | 0.6194 |
— | Rachana8._K | 0.5962 |
24 | lixn | 0.5889 |
25 | Baseline (Majority) | 0.4977 |
26 | deepsarker | 0.4534 |
27 | rajeshdiu | 0.4129 |
28 | SSCP | 0.3390 |
29 | Baseline (Random) | 0.3356 |
30 | nnur594 | 0.2626 |
Submissions without a rank were submitted after the deadline due to formatting issues.
There are various papers associated with the task. Details for the papers specific to the task as well as an overall overview will be posted here as they come out. Bib entries for each paper are included here. For your convenience, the bib file is available as well.
@inproceedings{blp2023-overview-task2,
title = "BLP 2023 Task 2: Sentiment Analysis",
author = "Hasan, Md. Arid and Alam, Firoj and Anjum, Anika and Das, Shudipta and Anjum, Afiyat",
booktitle = "Proceedings of the 1st International Workshop on Bangla Language Processing (BLP-2023)",
month = dec,
year = "2023",
address = "Singapore",
publisher = "Association for Computational Linguistics",
}
@article{hasan2023zero,
title={Zero- and Few-Shot Prompting with LLMs: A Comparative Study with Fine-tuned Models for Bangla Sentiment Analysis},
author={Md. Arid Hasan and Shudipta Das and Afiyat Anjum and Firoj Alam and Anika Anjum and Avijit Sarker and Sheak Rashed Haider Noori},
year={2023},
eprint={2308.10783},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
@inproceedings{islam-etal-2021-sentnob-dataset,
title = "{S}ent{N}o{B}: A Dataset for Analysing Sentiment on Noisy {B}angla Texts",
author = "Islam, Khondoker Ittehadul and
Kar, Sudipta and
Islam, Md Saiful and
Amin, Mohammad Ruhul",
booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2021",
month = nov,
year = "2021",
address = "Punta Cana, Dominican Republic",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2021.findings-emnlp.278",
doi = "10.18653/v1/2021.findings-emnlp.278",
pages = "3265--3271",
}
Please join our Slack channel for discussion and questions.
- Md. Arid Hasan, GRA, University of New Brunswick
- Firoj Alam, Scientist, Qatar Computing Research Institute
- Shudipta Das, Daffodil International University
- Afiyat Anjum, Daffodil International University
- Anika Anjum, Daffodil International University