liliaellouz / Twitter-Sentiment-Analysis

Sentiment Analysis Twitter

This project was done as part of the Machine Learning course at EPFL. The competition was hosted at www.crowdai.org. We were given a dataset of 2,500,000 tweets, half labeled positive and half labeled negative. For the competition, we used a test dataset of 10,000 entries.

Necessary libraries

  • numpy
  • keras
  • tensorflow
  • sklearn
  • matplotlib
  • pickle
  • enchant
  • wordninja
  • re
  • glove
  • os
  • math
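
Of the libraries above, pickle, re, os and math ship with the Python standard library, so only the others need installing. A minimal sketch for checking which ones are still missing from the current environment is shown below. It assumes each library is imported under the name used in the list; `missing_modules` is a hypothetical helper, not part of this repository.

```python
import importlib.util

# Names copied from the list above; assumed to match the import names.
REQUIRED = [
    "numpy", "keras", "tensorflow", "sklearn", "matplotlib", "pickle",
    "enchant", "wordninja", "re", "glove", "os", "math",
]


def missing_modules(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing libraries:", ", ".join(missing))
    else:
        print("All required libraries are available.")
```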

Folders

Code and notebooks

  • helper.py: Contains methods to read and process the data.
  • process.py: Calls specific methods of helper.py to clean the data and saves the cleaned version.
  • paths.py: Contains only the paths used in our code, so that we do not have to define them separately in each script.
  • tfidfi_methods.py: Uses the GloVe weighted matrix and averages the vectors of the words in each tweet.
  • kaggle_submission.py: Creates the submission file from saved models.
  • create_word_vectors.py: Creates word embeddings, either from the Stanford pretrained files or by constructing our own GloVe embeddings.
  • run_models: Contains the models that we used to generate the Kaggle predictions.
  • run.py: Runs the project by calling the above scripts and creates the Kaggle submission.
  • tf_idf_models.ipynb: Notebook containing the models trained on TF-IDF.
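
To illustrate the idea behind averaging word vectors per tweet, here is a minimal sketch that builds one feature vector per tweet as a TF-IDF-weighted average of word embeddings. It is not the code from tfidfi_methods.py: the function names, the smoothed IDF formula, and the toy embeddings are all assumptions made for illustration, and real GloVe vectors would replace the two-dimensional toy dictionary.

```python
import math
from collections import Counter

import numpy as np


def idf_weights(tokenized_tweets):
    """Smoothed inverse document frequency for every word in the corpus."""
    n_docs = len(tokenized_tweets)
    doc_freq = Counter(word for tweet in tokenized_tweets for word in set(tweet))
    return {w: math.log((1 + n_docs) / (1 + df)) + 1.0 for w, df in doc_freq.items()}


def tweet_vector(tokens, embeddings, idf, dim):
    """TF-IDF-weighted average of the embeddings of the known words in a tweet."""
    counts = Counter(t for t in tokens if t in embeddings and t in idf)
    if not counts:
        # No known word: fall back to a zero vector.
        return np.zeros(dim)
    total = np.zeros(dim)
    weight_sum = 0.0
    for word, tf in counts.items():
        weight = tf * idf[word]
        total += weight * np.asarray(embeddings[word], dtype=float)
        weight_sum += weight
    return total / weight_sum


# Toy data (hypothetical): three tokenized tweets and 2-dimensional embeddings.
tweets = [["good", "day"], ["bad", "day"], ["good", "good", "movie"]]
embeddings = {
    "good": [1.0, 0.0],
    "bad": [-1.0, 0.0],
    "day": [0.0, 1.0],
    "movie": [0.0, 0.5],
}

idf = idf_weights(tweets)
features = np.vstack([tweet_vector(t, embeddings, idf, dim=2) for t in tweets])
```

The resulting `features` matrix has one row per tweet and can be passed to any classifier that accepts dense inputs.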

Run our project

To run our project, first install the libraries listed above, and then:

Useful links:

Collaborators
