e-vdb / movie-review-sentiment-classification

Sentiment analysis of movie reviews from the Internet Movie Database (IMDb).

movie-review-sentiment-classification

Summary

Sentiment analysis of movie reviews from the Internet Movie Database (IMDb).

Dataset

The data is the "IMDB Dataset of 50K Movie Reviews" from Kaggle: https://www.kaggle.com/lakshmi25npathi/imdb-dataset-of-50k-movie-reviews

It contains 50,000 reviews with balanced classes: 25,000 positive and 25,000 negative.
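
The class balance stated above can be checked with a few lines of standard-library Python. This is a minimal sketch, not code from the repository: the column names `review` and `sentiment` and the helper name `label_counts` are assumptions, and the in-memory sample stands in for the real CSV file.

```python
import csv
import io
from collections import Counter


def label_counts(csv_file, label_column="sentiment"):
    """Count how many rows carry each label in an open CSV file object."""
    reader = csv.DictReader(csv_file)
    return Counter(row[label_column] for row in reader)


# Tiny in-memory stand-in for the Kaggle CSV (assumed columns: review, sentiment).
sample = io.StringIO(
    "review,sentiment\n"
    '"A moving, beautifully shot film.",positive\n'
    '"Two hours I will never get back.",negative\n'
    '"Great cast, great script.",positive\n'
    '"Dull and predictable.",negative\n'
)

counts = label_counts(sample)
print(counts)
```

On the full dataset, the same function could be called on the downloaded file opened with `open(path, newline="", encoding="utf-8")`; both labels should then report 25,000 rows.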

Streamlit interface

[Screenshot: Streamlit interface]

Repository content

  • model/: folder containing the Python scripts
  • moviereview-sentimentanalysis.ipynb: Jupyter Notebook

Notebook

  • Natural Language Processing (NLP)
  • Machine learning model: Logistic Regression
  • Evaluation of the accuracy score
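
The three notebook steps above (text vectorization, logistic regression, accuracy evaluation) can be sketched with scikit-learn as follows. This is an illustrative outline under stated assumptions, not the notebook's actual code: the toy reviews, the 75/25 split, and the `max_iter` setting are all made up for the example.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Hypothetical toy data standing in for the IMDb reviews (1 = positive, 0 = negative).
reviews = [
    "a wonderful and touching story",
    "brilliant acting and a great script",
    "one of the best films this year",
    "great fun from start to finish",
    "a boring and predictable mess",
    "terrible acting and an awful script",
    "one of the worst films this year",
    "awful pacing from start to finish",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Hold out a stratified test set so both classes appear in train and test.
X_train, X_test, y_train, y_test = train_test_split(
    reviews, labels, test_size=0.25, random_state=0, stratify=labels
)

# Bag-of-words counts feeding a logistic regression classifier.
model = make_pipeline(CountVectorizer(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

predictions = model.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
print(f"accuracy: {accuracy:.2f}")
```

With only eight toy reviews the score is not meaningful; the point is the shape of the pipeline, which stays the same when the full dataset is substituted.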

Visualization of the most discriminating features

Count Vectorizer (unigrams only)

[Figure: most discriminating features, Count Vectorizer with unigrams only]

Count Vectorizer (unigrams and bigrams)

[Figure: most discriminating features, Count Vectorizer with unigrams and bigrams]

Count Vectorizer with TF-IDF rescaling (unigrams only)

[Figure: most discriminating features, TF-IDF with unigrams only]

Count Vectorizer with TF-IDF rescaling (unigrams and bigrams)

[Figure: most discriminating features, TF-IDF with unigrams and bigrams]

TASK LIST

  • Implement machine learning algorithm using Scikit-learn
  • Implement deep learning algorithm using Keras
  • Deploy model: Streamlit interface

License: MIT License


Languages

Jupyter Notebook 99.1%, Python 0.9%