This is a sentiment analysis tool that takes review data sets from sources such as Amazon product reviews, IGN game reviews, and IMDB movie reviews, and builds one of three user-selected analyzer models. Once a model is constructed, the system performs sentiment analysis on the given test data and writes the results to a .TSV file bearing the model's name.
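As a rough illustration of the output step, the sketch below writes predictions to a tab-separated file named after the model. The function name `write_results`, the `id` and `sentiment` column names, and the sample values are assumptions made for this example, not the tool's actual code.

```python
import csv


def write_results(model_name, review_ids, predictions):
    """Write one (id, sentiment) row per review to '<model_name>.tsv'."""
    path = f"{model_name}.tsv"
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle, delimiter="\t")
        writer.writerow(["id", "sentiment"])  # assumed column names
        writer.writerows(zip(review_ids, predictions))
    return path


# Hypothetical model name and predictions, for illustration only.
output_path = write_results("bag_of_words", ["r1", "r2"], [1, 0])
```

The resulting file can be read back with `pandas.read_csv(output_path, sep="\t")`.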
- Python 3.6.x
- Pandas 0.19.2
- sklearn 0.0
- numpy 1.12.0
- BeautifulSoup4 4.4.1
- scipy 0.19.0
- gensim 1.0.1
- NLTK 3.2.2
The NLTK data can be downloaded by opening a Python interpreter in the console/command window on your machine and entering:
import nltk
nltk.download()
A UI will appear. Press the download button to download everything that the Natural Language Toolkit has to offer.
All models use a Random Forest Classifier and differ only in the features it is trained on:
- Bag of Words
- Word to Vector - Average
- Word to Vector - Centroid
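The three feature types can be sketched roughly as follows. This is a minimal illustration in plain Python, not the tool's implementation: the toy word vectors and cluster assignments are made up, all function names are hypothetical, and "Centroid" is assumed here to mean counting review words per word-vector cluster. In a real run the vectors would come from a trained Word2Vec model (for example via gensim) and the clusters from a clustering step such as k-means.

```python
from collections import Counter

# Made-up 2-dimensional word vectors standing in for learned Word2Vec vectors.
TOY_VECTORS = {
    "great": [1.0, 0.0],
    "good": [0.8, 0.2],
    "bad": [0.0, 1.0],
    "awful": [0.2, 0.8],
}

# Hypothetical cluster index per word, standing in for the result of
# clustering the word vectors.
TOY_CLUSTERS = {"great": 0, "good": 0, "bad": 1, "awful": 1}


def bag_of_words(tokens, vocabulary):
    """Count how often each vocabulary word occurs in the review."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]


def average_vector(tokens, vectors, dim):
    """Average the vectors of the review words that have a vector."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return [0.0] * dim
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def bag_of_centroids(tokens, clusters, n_clusters):
    """Count how many review words fall into each word cluster."""
    features = [0] * n_clusters
    for t in tokens:
        if t in clusters:
            features[clusters[t]] += 1
    return features


review = "good movie great cast bad ending".split()
vocabulary = ["bad", "good", "great", "movie"]

bow = bag_of_words(review, vocabulary)
avg = average_vector(review, TOY_VECTORS, dim=2)
centroids = bag_of_centroids(review, TOY_CLUSTERS, n_clusters=2)
```

Each function turns one review into a fixed-length row of numbers. Stacking one row per review gives a feature matrix that a classifier such as scikit-learn's `RandomForestClassifier` can be fitted on.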