Feature extraction for sentiment analysis with Logistic Regression and Naive Bayes

• Generated feature vectors for 50,000 IMDB reviews and 900,000 tweets using two techniques, achieving accuracies of ~85% and ~64% respectively.

• The first technique was Doc2Vec, in which document vectors are learned by an artificial neural network; the vectors capture the context of the words while reducing the size of the data.

• The second technique was a traditional NLP approach: the features were words that occurred in at least 1% of either the positive or the negative texts and occurred twice as many times in positive reviews as in negative reviews, or vice versa. The feature vectors are simple binary vectors.

• Technologies used: Python
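
The word-selection rule for the traditional NLP technique could be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the function names `select_features` and `to_binary_vector`, the toy documents, and the parameter names are all hypothetical. It also assumes that both the 1% threshold and the 2x ratio are measured in document counts, and that "twice as many" means "at least twice as many"; the original description does not settle either point.

```python
from collections import Counter


def select_features(pos_docs, neg_docs, min_doc_frac=0.01, ratio=2.0):
    """Select words by the frequency and skew rule (hypothetical helper).

    pos_docs / neg_docs are lists of tokenized documents (lists of words).
    """
    # Document frequency per class: set() makes a word count once per document.
    pos_df = Counter(w for doc in pos_docs for w in set(doc))
    neg_df = Counter(w for doc in neg_docs for w in set(doc))

    features = []
    for word in sorted(set(pos_df) | set(neg_df)):
        p, n = pos_df[word], neg_df[word]
        # Appears in at least 1% of the positive or of the negative texts.
        frequent = (p >= min_doc_frac * len(pos_docs)
                    or n >= min_doc_frac * len(neg_docs))
        # Assumption: "twice as many" is read as "at least twice as many".
        skewed = p >= ratio * n or n >= ratio * p
        if frequent and skewed:
            features.append(word)
    return features


def to_binary_vector(doc, features):
    """Binary vector: 1 if the feature word is present in the document, else 0."""
    present = set(doc)
    return [1 if w in present else 0 for w in features]


# Toy data for illustration only (not taken from the IMDB or tweet datasets).
pos_docs = [["great", "movie"], ["great", "fun"], ["good", "plot"], ["the", "movie"]]
neg_docs = [["bad", "movie"], ["boring", "plot"], ["bad", "acting"], ["the", "bad"]]

features = select_features(pos_docs, neg_docs)
vector = to_binary_vector(["great", "bad", "movie"], features)
print(features)
print(vector)
```

Words that appear equally often in both classes (here "plot" and "the") are dropped, which is the point of the ratio test: only words that lean toward one sentiment become features. The resulting binary vectors can then be fed to a classifier such as Logistic Regression or Naive Bayes.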

About

Uses Word2Vec/Doc2Vec and traditional NLP techniques
