• Generated feature vectors for 50,000 IMDB reviews and 900,000 tweets using two techniques, achieving accuracies of ~85% and ~64%, respectively.
• The first technique was Doc2Vec, in which document vectors are learned by an artificial neural network that captures the context of the words while reducing the size of the data.
• The second technique was a traditional NLP approach: the features were words that occurred in at least 1% of either the positive or the negative texts and occurred twice as often in one class as in the other. The feature vectors are simple binary vectors.
Technologies used: Python
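
The word-selection rule in the second technique can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names (`document_frequency`, `select_features`, `binary_vector`), the whitespace tokenizer, the use of document shares rather than raw occurrence counts, and reading "twice as many" as "at least twice" are all assumptions.

```python
from collections import Counter


def document_frequency(docs):
    """Count, for each word, how many documents contain it at least once."""
    counts = Counter()
    for doc in docs:
        # Assumption: lowercase + whitespace split stands in for real tokenization.
        counts.update(set(doc.lower().split()))
    return counts


def select_features(pos_docs, neg_docs, min_share=0.01, ratio=2.0):
    """Keep words that are frequent in either class and skewed toward one class.

    min_share: minimum share of documents in either class (1% in the text).
    ratio: how much more often the word must appear in one class (2x in the text).
    """
    pos_df = document_frequency(pos_docs)
    neg_df = document_frequency(neg_docs)
    selected = []
    for word in sorted(set(pos_df) | set(neg_df)):
        pos_share = pos_df[word] / len(pos_docs)
        neg_share = neg_df[word] / len(neg_docs)
        frequent = pos_share >= min_share or neg_share >= min_share
        skewed = pos_share >= ratio * neg_share or neg_share >= ratio * pos_share
        if frequent and skewed:
            selected.append(word)
    return selected


def binary_vector(doc, vocabulary):
    """Return 1 for each vocabulary word present in the document, else 0."""
    tokens = set(doc.lower().split())
    return [1 if word in tokens else 0 for word in vocabulary]


# Toy data, invented purely for illustration.
pos_docs = ["great movie", "great acting", "good plot"]
neg_docs = ["bad movie", "bad acting", "boring plot"]

vocabulary = select_features(pos_docs, neg_docs)
vector = binary_vector("great plot but boring acting", vocabulary)
```

On this toy data, words shared equally by both classes ("movie", "acting", "plot") are dropped, and each document is reduced to a presence/absence vector over the remaining words, which could then be passed to any standard classifier.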