NLP_Text_Classification_with_Transformers_RoBERTa_and_XLNet_Model

Introduction:

This project covers the end-to-end implementation of loading, fine-tuning, and evaluating transformer models for NLP text classification tasks.

In this project, two transformer models are explored to categorize human emotions using a dataset from the Hugging Face library.

RoBERTa: A Robustly Optimized BERT Pretraining Approach
XLNet: Generalized Autoregressive Pretraining for Language Understanding

The architectures of these two models are analysed, their training and optimization techniques are studied, and finally the models are used to classify human emotions into separate categories.

Dataset:

Hugging Face Emotion Dataset

Emotion is a dataset of English Twitter messages labelled with six basic emotions: anger, fear, joy, love, sadness, and surprise.

We will be using the Human Emotions dataset from the Hugging Face library.

The dataset comprises three splits:

Train - 16000 rows and 2 columns
Validation - 2000 rows and 2 columns
Test - 2000 rows and 2 columns
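
The split sizes above can be checked with a short script. This is a minimal sketch, assuming the Hugging Face `datasets` package is installed and that the dataset is available on the Hub under the ID `dair-ai/emotion`. Both are assumptions, and the original notebooks may load the data differently. The helper names `load_emotion_splits` and `check_split_sizes` are hypothetical.

```python
# Assumed Hub ID for the Emotion dataset; adjust if your copy lives elsewhere.
DATASET_ID = "dair-ai/emotion"

# Row counts per split, as listed in this README.
EXPECTED_ROWS = {"train": 16000, "validation": 2000, "test": 2000}


def load_emotion_splits(dataset_id=DATASET_ID):
    """Download the train/validation/test splits (needs network access)."""
    # Imported lazily so the size check below can be used without `datasets`.
    from datasets import load_dataset
    return load_dataset(dataset_id)


def check_split_sizes(splits, expected=EXPECTED_ROWS):
    """Return {split: (rows, columns)}; raise if a row count is unexpected.

    `splits` is anything indexable by split name whose values support
    len() and expose a `column_names` list, as `datasets.Dataset` does.
    """
    shapes = {}
    for name, want in expected.items():
        split = splits[name]
        rows, cols = len(split), len(split.column_names)
        if rows != want:
            raise ValueError(f"{name}: expected {want} rows, found {rows}")
        shapes[name] = (rows, cols)
    return shapes
```

Calling `check_split_sizes(load_emotion_splits())` would then confirm that the downloaded copy matches the sizes listed above before any training starts.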

Project Implementation Steps:

The project aims to build two models, RoBERTa and XLNet, to perform classification on the human emotion dataset by implementing the steps below for both models.

Data Exploration and Analysis
Data Pre-processing
Creation of the RoBERTa/XLNet Model
Compiling the RoBERTa/XLNet Model
Model Training with Fine-Tuning
Model Evaluation and Validation
Model Performance Metrics
Saving the Final Optimized Model
Verifying the Final Optimized Model on the Unseen Test Data
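
The steps above could be sketched with ktrain roughly as follows. This is an illustrative outline rather than the project's actual code: the checkpoint names (`roberta-base`, `xlnet-base-cased`), the hyperparameter defaults, and the helper name `fine_tune` are all assumptions, and the function expects raw texts with integer labels as inputs.

```python
# Hugging Face checkpoint names assumed for the two models.
CHECKPOINTS = {"roberta": "roberta-base", "xlnet": "xlnet-base-cased"}


def fine_tune(model_key, x_train, y_train, x_val, y_val, class_names,
              maxlen=128, batch_size=16, lr=2e-5, epochs=3, save_dir=None):
    """Fine-tune one checkpoint with ktrain; return (learner, predictor).

    All default hyperparameters are placeholders, not tuned values.
    """
    checkpoint = CHECKPOINTS[model_key]  # KeyError for unknown model keys

    # Imported lazily: ktrain pulls in TensorFlow, which is slow to load.
    import ktrain
    from ktrain import text

    # Data pre-processing: tokenize raw texts for the chosen checkpoint.
    t = text.Transformer(checkpoint, maxlen=maxlen, class_names=class_names)
    trn = t.preprocess_train(x_train, y_train)
    val = t.preprocess_test(x_val, y_val)

    # Model creation; ktrain compiles the Keras model inside get_classifier().
    model = t.get_classifier()
    learner = ktrain.get_learner(model, train_data=trn, val_data=val,
                                 batch_size=batch_size)

    # Fine-tuning with the one-cycle learning-rate policy.
    learner.fit_onecycle(lr, epochs)

    # Evaluation on the validation split (prints a classification report).
    learner.validate(class_names=t.get_classes())

    # Wrap model and preprocessor so raw text can be classified later.
    predictor = ktrain.get_predictor(learner.model, preproc=t)
    if save_dir is not None:
        predictor.save(save_dir)
    return learner, predictor
```

Running the function once with `"roberta"` and once with `"xlnet"` keeps the two experiments comparable, and the returned predictor's `predict` method can then be applied to the unseen test texts.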

Tools & Technologies:

Python, numpy, pandas, ktrain, transformers, tensorflow, sklearn, matplotlib
