RileyCC56 / Amazon_Vine_Analysis

Performs extract, transform, and load (ETL) on Amazon product reviews, then compares the resulting tables to determine whether Amazon Vine program members are biased toward favorable reviews relative to non-Vine reviewers.

Overview:

This analysis examines the Amazon Vine program. It compares reviews written by Vine members with reviews written by non-members to determine whether Vine members are biased toward favorable reviews.

The Amazon Vine program is a service that allows manufacturers and publishers to receive reviews for their products. Companies pay a small fee to Amazon and provide products to Amazon Vine members, who are then required to publish a review.

Approximately 50 Amazon product review datasets are available. Each one contains reviews for a specific product category, from clothing apparel to wireless products.

For this analysis we used PySpark to perform ETL on the Amazon reviews dataset for the Tools category, loaded the results into an AWS RDS database, and connected to it with pgAdmin to determine the differences.

Technology Used:

PySpark, Amazon Web Services, Amazon RDS, Amazon S3, Python, Pandas

Results:

After computing our metrics and building the DataFrames, we found the following results:

o Total Vine reviews = 285

o Total non-Vine reviews = 31,545

o Vine 5 Star reviews = 163

o Non-Vine 5 Star reviews = 14,614

o 5-star Vine review percentage = 57.2%

o 5-star non-Vine review percentage = 46.3%
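
The two percentages follow directly from the counts listed above. As a minimal sketch of that arithmetic in plain Python (the function name `five_star_percentage` is a hypothetical helper, not taken from the project notebook):

```python
def five_star_percentage(five_star_count: int, total_count: int) -> float:
    """Return the share of 5-star reviews as a percentage of all reviews."""
    if total_count <= 0:
        raise ValueError("total_count must be positive")
    return 100.0 * five_star_count / total_count


# Counts copied from the results listed above.
vine_pct = five_star_percentage(163, 285)
non_vine_pct = five_star_percentage(14_614, 31_545)
difference = vine_pct - non_vine_pct

print(f"Vine 5-star share:     {vine_pct:.1f}%")
print(f"Non-Vine 5-star share: {non_vine_pct:.1f}%")
print(f"Difference:            {difference:.1f} percentage points")
```

Rounded to one decimal place, this reproduces the 57.2% and 46.3% figures reported above.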

Summary:

Our metrics show that 57.2% of Vine reviews were 5-star, compared with 46.3% of non-Vine reviews. This difference of about 10.9 percentage points suggests that Vine members rate products more favorably than non-Vine reviewers.

One additional result that could support our analysis is the mean star rating, on the one-to-five scale, for each group. This would compare the overall ratings of Vine and non-Vine reviews rather than only the share of 5-star reviews.
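
A minimal sketch of that follow-up metric is shown here in plain Python. The column names `star_rating` and `vine` (with values `"Y"` and `"N"`) are assumptions about the table layout, and the rows are made-up sample data used only to illustrate the calculation, not results from this project.

```python
from statistics import mean

# Hypothetical sample rows; real rows would come from the reviews table.
sample_reviews = [
    {"star_rating": 5, "vine": "Y"},
    {"star_rating": 4, "vine": "Y"},
    {"star_rating": 5, "vine": "N"},
    {"star_rating": 2, "vine": "N"},
    {"star_rating": 3, "vine": "N"},
]


def mean_rating_by_vine(rows):
    """Group star ratings by the Vine flag and return the mean per group."""
    groups = {}
    for row in rows:
        groups.setdefault(row["vine"], []).append(row["star_rating"])
    return {flag: mean(ratings) for flag, ratings in groups.items()}


result = mean_rating_by_vine(sample_reviews)
for flag, avg in sorted(result.items()):
    print(f"vine={flag}: mean star rating = {avg:.2f}")
```

The same grouping could be done in PySpark or Pandas with a group-by on the Vine flag; plain Python is used here only to keep the sketch self-contained.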

Languages

Language:Jupyter Notebook 100.0%