julio-pimentel / Griffith_7130ICT_assignment

Repository for the final assignment of 7130ICT Data Analytics.

Amazon Tablet Market Analysis

7130ICT Data Analytics
Master of Information Technology
Griffith University, Queensland, Australia

This report presents the most relevant insights on the tablet market extracted from the Amazon product review dataset. The dataset covers many product categories, such as Books, Video Games, and Digital Music; this report uses only the Electronics category, and specifically the Tablet products. The analysis was done with the following Python libraries and modules: Pandas, gzip, NumPy, Matplotlib, Seaborn, Scikit-Learn, PyLab, statsmodels, itertools, Calplot, pyecharts, NLTK, and Altair. The report is divided into three parts: Basic Analysis, Advanced Analysis, and Evaluation.

The Basic Analysis describes the exploratory analysis performed on the Tablet dataset. It also explains in detail how the Electronics and Tablet datasets were prepared and pre-processed. Finally, this section proposes several hypotheses that form the basis of the Advanced Analysis.
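As an illustration of the kind of data preparation described above, the following is a minimal sketch that reads a gzip-compressed, one-JSON-object-per-line review file and keeps only tablet reviews. It is not the notebook's actual code. The field names (`asin`, `overall`, `reviewText`, `unixReviewTime`), the helper names, the `tablet_asins` set, and the sample records are all assumptions made for the example.

```python
import gzip
import json
import os
import tempfile


def read_reviews(path):
    """Yield one review dict per non-empty line of a gzip-compressed JSON-lines file."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def keep_tablets(reviews, tablet_asins):
    """Keep reviews of tablet products, dropping records without a rating or review text."""
    for review in reviews:
        if review.get("asin") not in tablet_asins:
            continue
        if review.get("overall") is None or not review.get("reviewText"):
            continue
        yield review


# Made-up sample records (hypothetical field names) so the sketch runs
# without the real dataset.
sample = [
    {"asin": "T1", "overall": 5.0, "reviewText": "Great tablet", "unixReviewTime": 1400000000},
    {"asin": "X9", "overall": 2.0, "reviewText": "Bad headphones", "unixReviewTime": 1400086400},
    {"asin": "T1", "overall": 1.0, "reviewText": "", "unixReviewTime": 1400172800},
]
sample_path = os.path.join(tempfile.mkdtemp(), "reviews_sample.json.gz")
with gzip.open(sample_path, "wt", encoding="utf-8") as handle:
    for record in sample:
        handle.write(json.dumps(record) + "\n")

# "T1" stands in for the product IDs identified as tablets.
tablet_reviews = list(keep_tablets(read_reviews(sample_path), tablet_asins={"T1"}))
print(len(tablet_reviews))
```

In practice the set of tablet product IDs would come from the product metadata, and the filtered records would typically be loaded into a Pandas DataFrame for the exploratory analysis.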

The Advanced Analysis is divided into three parts: Brand and Product Analysis, Sentiment Analysis, and Time Series Analysis. First, the Brand and Product Analysis describes the most relevant brands in the market and the most popular products on sale. Second, in the Sentiment Analysis, the review scores are converted into positive and negative sentiment labels, and a Naïve Bayes model is built to predict the sentiment of a review. Expanding on the sentiment analysis, lexical scores are generated in an attempt to predict the rating that a given user would give to a product.
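The sentiment step can be illustrated with a small multinomial Naïve Bayes classifier using add-one (Laplace) smoothing. This is a standard-library sketch, not the report's implementation, which lists Scikit-Learn and NLTK among its libraries. The rating thresholds (4 and above is positive, 2 and below is negative, 3 is dropped), the class and function names, and the sample reviews are assumptions.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def label_from_rating(rating):
    """Map a star rating to a sentiment label (thresholds are an assumption)."""
    if rating >= 4:
        return "positive"
    if rating <= 2:
        return "negative"
    return None  # neutral ratings are left out of training


class NaiveBayesSentiment:
    """Multinomial Naive Bayes over word counts with add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocabulary = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                if word not in self.vocabulary:
                    continue  # ignore words never seen in training
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up reviews for illustration only.
rated_reviews = [
    ("Great screen and fast performance", 5),
    ("Love this tablet, battery lasts all day", 4),
    ("Terrible battery, stopped working after a week", 1),
    ("Slow and the screen cracked, very disappointed", 2),
    ("It is okay", 3),
]
labelled = [(text, label_from_rating(stars)) for text, stars in rated_reviews]
labelled = [(text, label) for text, label in labelled if label is not None]

model = NaiveBayesSentiment().fit(*zip(*labelled))
print(model.predict("fast tablet, great screen"))
print(model.predict("terrible, slow and disappointed"))
```

Working in log space avoids numeric underflow when many word probabilities are multiplied, and the smoothing keeps a word seen in only one class from forcing the other class's probability to zero.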

Third, in the Time Series Analysis, trend and seasonality are captured, and forecasting methods are applied and evaluated. Reviews by month and by day of the week are also explored to identify patterns. Finally, the variation in sales rank for each brand over the whole period of the data is visualised using an interactive plot.
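The time series steps can be sketched in a few lines: counting reviews per month and per weekday, smoothing with a moving average to expose the trend, and scoring a simple forecast. This is a standard-library illustration under stated assumptions, not the report's method. The seasonal-naive forecast and mean absolute error are stand-ins chosen for brevity, and all function names and sample values are hypothetical.

```python
from collections import Counter
from datetime import datetime, timezone

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def to_utc(timestamp):
    """Convert a Unix timestamp (seconds) to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)


def reviews_per_month(timestamps):
    """Count reviews per (year, month), sorted chronologically."""
    counts = Counter((to_utc(ts).year, to_utc(ts).month) for ts in timestamps)
    return dict(sorted(counts.items()))


def reviews_per_weekday(timestamps):
    """Count reviews per day of the week, Monday first."""
    counts = Counter(WEEKDAYS[to_utc(ts).weekday()] for ts in timestamps)
    return {name: counts[name] for name in WEEKDAYS}


def moving_average(values, window):
    """Trailing moving average, a simple way to expose the trend."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


def seasonal_naive_forecast(values, season_length, horizon):
    """Forecast by repeating the last observed season."""
    last_season = values[-season_length:]
    return [last_season[step % season_length] for step in range(horizon)]


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and forecast values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up review dates for illustration only.
timestamps = [
    datetime(2014, 5, 13, tzinfo=timezone.utc).timestamp(),
    datetime(2014, 5, 14, tzinfo=timezone.utc).timestamp(),
    datetime(2014, 6, 3, tzinfo=timezone.utc).timestamp(),
]
monthly = reviews_per_month(timestamps)
by_weekday = reviews_per_weekday(timestamps)

# Made-up monthly review counts: first half for fitting, second half for evaluation.
monthly_counts = [10, 20, 30, 40, 50, 60]
trend = moving_average(monthly_counts, window=2)
forecast = seasonal_naive_forecast(monthly_counts[:3], season_length=3, horizon=3)
error = mean_absolute_error(monthly_counts[3:], forecast)
print(monthly, by_weekday, trend, forecast, error)
```

Holding out the most recent observations and comparing them with the forecast is the usual way to evaluate a forecasting method; a library such as statsmodels provides fuller trend and seasonality decomposition for real data.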

Acknowledgements:
Gabriela Monteiro
Julio Pimentel
Dr. Henry Nguyen, Griffith University
Mr. Thanh Cong Phan
Julian McAuley, UCSD
Uma Maheswari Raju, Towards Data Science

Languages

Language:Jupyter Notebook 100.0%