jsong336 / emotion-bert

Fine-tuning BERT-small on GoEmotions dataset.

Finding Human Emotions from Text Comments

Fine-grained emotion detection using BERT and the GoEmotions dataset. Details are in report.pdf.

Setup with Google Colab

The project was developed in Google Colab, with Google Drive as storage. To set up the environment, follow the steps below.

  1. Create a Colab notebook and clone the project.
from google.colab import drive
drive.mount('/content/drive/')

repos_dir = '/content/drive/MyDrive/{where you want to put in google drive}'
repos = 'fine-grained-emotions-bert' # our repository name
url = "https://github.com/jsong336/fine-grained-emotions-bert.git"

%cd $repos_dir
! git clone $url
%cd $repos_dir/$repos
! git pull 
  2. Go to Kaggle, create a developer API credential, and use it to download the datasets.
import os

os.environ['KAGGLE_USERNAME'] = ""
os.environ['KAGGLE_KEY'] = ""

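def download_from_kaggle(url, target_dir):
  # `url` is a Kaggle dataset slug of the form '<owner>/<dataset>'.
  dataset_name = url.split('/', 1)[-1]
  dirname = os.path.join(target_dir, dataset_name)
  # -p creates missing parent directories and does not fail if dirname exists.
  ! mkdir -p $dirname
  # The archive is saved to the current directory as <dataset>.zip.
  ! kaggle datasets download -d $url
  # unzip also tries <dataset>.zip when given the bare name.
  ! unzip $dataset_name -d $dirname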
  
input_dir = repos_dir + '/inputs'
download_from_kaggle('shivamb/go-emotions-google-emotions-dataset', input_dir)
download_from_kaggle('ishivinal/contractions', input_dir)
download_from_kaggle('bittlingmayer/spelling', input_dir)

Then open your Google Drive and confirm that the repository was cloned and the datasets were downloaded.
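
The same check can also be done from the notebook. The helper below is only a sketch: it assumes each dataset was unpacked into a subdirectory of `input_dir` named after the last part of its Kaggle slug, which is what `download_from_kaggle` does. The name `missing_datasets` is hypothetical and not part of the repository.

```python
import os

# Kaggle slugs used in the download step above.
DATASET_SLUGS = [
    'shivamb/go-emotions-google-emotions-dataset',
    'ishivinal/contractions',
    'bittlingmayer/spelling',
]

def missing_datasets(input_dir, slugs=DATASET_SLUGS):
    """Return the slugs whose target directory is absent or empty."""
    missing = []
    for slug in slugs:
        # Mirror download_from_kaggle: directory name is the part after '/'.
        dirname = os.path.join(input_dir, slug.split('/', 1)[-1])
        if not os.path.isdir(dirname) or not os.listdir(dirname):
            missing.append(slug)
    return missing
```

Calling `missing_datasets(input_dir)` after the downloads should return an empty list; any slug it returns points at a download worth re-running.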

  3. Go to notebooks/ and run main_prepare_dataset.ipynb. Afterwards, the train and test splits should be in inputs/.

  4. Run the main_bert_gru.ipynb and main_bert_dense.ipynb notebooks to train the models. Be careful: the notebooks write checkpoints to your Google Drive, and these can quickly take up a lot of space.

  5. Run view_model_analysis.ipynb to compare the models.
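
Since the training notebooks write checkpoints to Google Drive, it can help to check how much space a directory uses. The following is a minimal sketch using only the standard library. The function names are hypothetical, and the checkpoint location is not stated here, so pass whichever directory the notebooks actually write to.

```python
import os

def dir_size_bytes(path):
    """Sum the sizes of all regular files under path, recursively."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            # Skip symlinks so linked files are not counted.
            if os.path.isfile(file_path) and not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total

def format_mib(num_bytes):
    """Format a byte count as mebibytes with one decimal place."""
    return f"{num_bytes / 1024 ** 2:.1f} MiB"
```

For example, `format_mib(dir_size_bytes(some_dir))` prints a readable size for `some_dir`, where `some_dir` stands for the checkpoint directory in your Drive.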

Because all of the code was developed in Google Colab and the archives were stored in Google Drive, every view_* and main_* notebook contains the following block:

import os 

# os.environ['GO_EMOTIONS_COLAB_WORKDIR'] = '/content/drive/MyDrive/Notebooks/Repository/go-emotions/notebooks'
colab_workdir = os.environ.get('GO_EMOTIONS_COLAB_WORKDIR')

if colab_workdir:
    print('Running with colab')
    from google.colab import drive
    drive.mount('/content/drive')
    %cd $colab_workdir
    !pip install -q -r ../requirements.txt
else:
    print('Running with jupyter notebook')

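When running in Colab, you may need to uncomment the os.environ['GO_EMOTIONS_COLAB_WORKDIR'] line and set it to the notebooks directory of your clone in Google Drive. If the variable is unset, the block assumes a local Jupyter notebook and skips the Drive mount.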
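
The decision that block makes can be isolated in plain Python, without the `%cd` and `!pip` magics, which makes it easier to reason about. This is a sketch only: `resolve_workdir` is a hypothetical helper, not a function from the repository, and it reproduces just the environment lookup.

```python
import os

def resolve_workdir(env=None):
    """Return the Colab working directory from the environment, or None.

    None means the variable is unset or empty, in which case the notebook
    block treats the run as a local Jupyter session.
    """
    env = os.environ if env is None else env
    workdir = env.get('GO_EMOTIONS_COLAB_WORKDIR')
    # An empty string is treated like an unset variable, as in `if colab_workdir:`.
    return workdir or None
```

Passing a plain dict as `env` lets you try both branches without touching the real environment, for example `resolve_workdir({})` for the local case.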
