amazon-reviews classification data-science data-visualization exploratory-data-analysis jupyter-notebook machine-learning model-building-and-evaluation natural-language-processing nlp-machine-learning nltk-python python sentiment-analysis sentiment-classification

Sentiment Analysis of Product Reviews

Overview

This repository contains a sentiment analysis project that uses machine learning to classify product reviews as either positive or negative. The analysis is performed on a dataset containing reviews from Amazon customers. Various machine learning models including Logistic Regression, Naive Bayes, Decision Trees, and Random Forests are evaluated to determine which performs best at predicting sentiment based on text content.

Repository Structure

Reviews.csv: The dataset file containing review texts and scores. The dataset can be downloaded from Kaggle.
sentiment_analysis.ipynb: Jupyter notebook containing all the analyses and visualizations.

Installation

To run this project, you need Python 3.x and the following libraries:

pandas
numpy
seaborn
matplotlib
scikit-learn
nltk

Data

The dataset includes the following key columns:

Text: The text of the review.
Score: The rating given by the user, used to derive the sentiment.
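
The README does not say how Score is turned into a sentiment label. The sketch below assumes a common convention for 1-5 star ratings: 4-5 is positive, 1-2 is negative, and 3 is treated as neutral and dropped. The function name score_to_sentiment and the thresholds are illustrative assumptions, not taken from the notebook.

```python
from typing import Optional


def score_to_sentiment(score: int) -> Optional[str]:
    """Map a star rating to a binary sentiment label.

    Assumed thresholds (not confirmed by the notebook):
    4-5 -> "positive", 1-2 -> "negative", 3 -> None (neutral, dropped).
    """
    if score >= 4:
        return "positive"
    if score <= 2:
        return "negative"
    return None


# Toy rows standing in for (Text, Score) pairs from Reviews.csv.
rows = [("Great taste", 5), ("Stale and bland", 1), ("It was okay", 3)]
labeled = [(text, score_to_sentiment(score)) for text, score in rows]
# Keep only rows that received a positive or negative label.
labeled = [(text, label) for text, label in labeled if label is not None]
```

With pandas, the same mapping could be applied with df["Score"].map(score_to_sentiment), followed by dropping the rows where the label is missing.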

Features

Text preprocessing includes cleaning, tokenization, and lemmatization.
Sentiment scores are derived using NLTK's VADER.
TF-IDF vectorization converts the text to a numeric form suitable for training ML models.
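
The cleaning and vectorization steps listed above might look roughly like the following. This is a simplified sketch, not the notebook's actual pipeline: the clean_text helper and the toy reviews are hypothetical, and lemmatization and VADER scoring are left out because they require downloading NLTK resources (WordNet and the VADER lexicon) first.

```python
import re

from sklearn.feature_extraction.text import TfidfVectorizer


def clean_text(text: str) -> str:
    """Lowercase, strip HTML tags, and keep only letters and spaces."""
    text = re.sub(r"<[^>]+>", " ", text)      # drop HTML remnants such as <br />
    text = re.sub(r"[^a-zA-Z\s]", " ", text)  # drop digits and punctuation
    return " ".join(text.lower().split())     # normalize whitespace


# Toy reviews standing in for the Text column.
reviews = [
    "Great coffee!<br />Will buy again.",
    "Terrible... the bag arrived torn.",
    "Good value, great taste.",
]
cleaned = [clean_text(r) for r in reviews]

# Fit TF-IDF on the cleaned text; X is a sparse matrix with one row per review.
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(cleaned)
```

Each row of X is a weighted bag-of-words vector, which is the form the classifiers listed under Models Implemented expect as input.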

Usage

To run the sentiment analysis, open the Jupyter notebook sentiment_analysis.ipynb and execute the cells sequentially.

Models Implemented

Logistic Regression
Bernoulli Naive Bayes
Multinomial Naive Bayes
Decision Tree
Random Forest
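
As a rough illustration, the five models above can be set up side by side with scikit-learn and fitted on the same TF-IDF features. The toy data and the hyperparameters (max_iter, n_estimators, random_state) are assumptions made for this sketch; the notebook's actual settings may differ.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import BernoulliNB, MultinomialNB
from sklearn.tree import DecisionTreeClassifier

# Toy data: 1 = positive, 0 = negative.
texts = [
    "great taste and fast delivery",
    "awful flavor would not buy again",
    "love this tea",
    "stale and bland",
    "excellent quality",
    "terrible packaging and broken seal",
]
labels = [1, 0, 1, 0, 1, 0]

X = TfidfVectorizer().fit_transform(texts)

# Hyperparameters here are illustrative defaults, not the notebook's values.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Bernoulli Naive Bayes": BernoulliNB(),
    "Multinomial Naive Bayes": MultinomialNB(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
}

# Fit every model on the same features and collect its predictions.
fitted = {name: model.fit(X, labels) for name, model in models.items()}
predictions = {name: model.predict(X) for name, model in fitted.items()}
```

Keeping the models in a dictionary makes it straightforward to loop over them when comparing results. In practice the predictions would be made on a held-out test split rather than on the training data used here.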

Results

The effectiveness of each model is evaluated based on accuracy, precision, recall, and F1-score. Results are visually presented through a bar chart comparing the performance of the models.
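
The four metrics can be computed with scikit-learn as sketched below. The evaluate helper and the toy labels are hypothetical and only show the calculation; they are not results from the notebook.

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support


def evaluate(y_true, y_pred):
    """Return accuracy, precision, recall, and F1 for the positive class (1)."""
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average="binary", pos_label=1
    )
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy labels: 3 true positives, 1 false positive, 1 false negative, 3 true negatives.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
scores = evaluate(y_true, y_pred)
```

Calling evaluate once per model yields one dictionary of scores per model, which is a convenient input for a grouped bar chart.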

Contributing

Contributions to this project are welcome! You can contribute in the following ways:

Enhancing the preprocessing pipeline.
Experimenting with different models or tuning the hyperparameters.
Improving the visualization of results.

License

Distributed under the MIT License. See LICENSE for more information.

Contact

Jaya Surya Thota - feel free to contact me!

About

Sentiment Analysis on Amazon Fine Food Reviews
