Victoriapm / WeRateDogs_Twitter_Data_Wrangling

Data wrangling, analysis, and visualization of the WeRateDogs tweet archive

Wrangle and Analyze a Dataset: WeRateDogs Twitter Archive

Introduction

This project is the final task for part 4 of the Udacity Data Analyst Nanodegree.

The project consists of creating a Jupyter notebook that performs data wrangling tasks on a given dataset.

Learning objectives

  • Data wrangling, which consists of:
    • Gathering data (downloadable file in the Resources tab in the leftmost panel of the Udacity classroom)
    • Assessing data
    • Cleaning data
  • Storing, analyzing, and visualizing your wrangled data
  • Reporting on 1) your data wrangling efforts and 2) your data analyses and visualizations
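The assess, clean, and store steps listed above can be sketched with pandas as follows. This is a minimal illustration and not the notebook's actual code: the small DataFrame is made-up stand-in data, the column names are assumptions about the archive's layout, and the helper names `assess` and `clean` are hypothetical.

```python
import pandas as pd

# Gather: a tiny made-up frame stands in for the real archive
# (column names are assumed, values are invented for illustration).
raw = pd.DataFrame({
    "tweet_id": [1, 2, 2, 3],
    "timestamp": [
        "2017-01-01 10:00:00",
        "2017-01-02 11:30:00",
        "2017-01-02 11:30:00",
        "2017-01-03 09:15:00",
    ],
    "rating_numerator": [13, 12, 12, 11],
    "rating_denominator": [10, 10, 10, 10],
})


def assess(df):
    """Assess: collect a few programmatic quality indicators."""
    return {
        "rows": len(df),
        "duplicated_rows": int(df.duplicated().sum()),
        "missing_values": int(df.isna().sum().sum()),
    }


def clean(df):
    """Clean: drop exact duplicates and fix column types on a copy."""
    cleaned = df.drop_duplicates().copy()
    # Timestamps arrive as strings; convert them to datetimes.
    cleaned["timestamp"] = pd.to_datetime(cleaned["timestamp"])
    # IDs are identifiers, not quantities, so store them as strings.
    cleaned["tweet_id"] = cleaned["tweet_id"].astype(str)
    return cleaned.reset_index(drop=True)


summary = assess(raw)
tidy = clean(raw)

# Store: uncomment to write the cleaned data to disk.
# tidy.to_csv("twitter_archive_master.csv", index=False)
```

Working on a copy inside `clean` keeps the raw data untouched, so the assessment can be rerun against the original at any time.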

Analysis Description

The dataset wrangled, analyzed, and visualized in this project is the tweet archive of Twitter user @dog_rates, also known as WeRateDogs.

WeRateDogs is a Twitter account that rates people's dogs with a humorous comment about the dog. These ratings almost always have a denominator of 10. The numerators, though? Almost always greater than 10. 11/10, 12/10, 13/10, etc. Why? Because... "they're good dogs Brent."
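As a rough sketch of how such a rating could be pulled out of a tweet's text, the snippet below uses a regular expression. The pattern, the function name `extract_rating`, and the sample texts are assumptions made for illustration, not the approach used in the notebook.

```python
import re

# A numerator (optionally with decimals), a slash, then an integer denominator.
RATING_PATTERN = re.compile(r"(\d+(?:\.\d+)?)/(\d+)")


def extract_rating(text):
    """Return (numerator, denominator) for the first rating-like match, or None."""
    match = RATING_PATTERN.search(text)
    if match is None:
        return None
    return float(match.group(1)), int(match.group(2))


# Made-up example texts, not taken from the archive.
samples = [
    "This is Bella. 13/10 would pet",
    "Scored 9.75/10 for effort",
    "No rating here",
]
ratings = [extract_rating(text) for text in samples]
```

Because only the first fraction-like match is used, text containing a date or another fraction before the rating would be misread, so extracted values are worth checking during the assessment step.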

WeRateDogs has over 4 million followers and has received international media coverage.

Contents

  • wrangle_act.ipynb file where the analysis was developed; a Jupyter Notebook containing the code and Markdown cells
  • .html file with the .ipynb output converted to a web version for easy viewing
  • twitter_archive_master.csv file with the cleaned data used in the analysis
  • wrangle_report.pdf file with a written report about the wrangling efforts. This is to be framed as an internal document.
  • act_report.pdf file with a written report that communicates the insights and displays the visualization(s) produced from the wrangled data. This is to be framed as an external document, like a blog post or magazine article.

Pre-requisites

No installation is needed to view the analysis. To reproduce the project, Python 3.5 and the following libraries are needed:

  • pandas
  • NumPy
  • Matplotlib
  • requests
  • tweepy
  • json (part of the Python standard library, no separate installation needed)
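Before running the notebook, a quick check along these lines can show which of the libraries above are missing from the current environment. This is an optional sketch; the helper name `missing_packages` is hypothetical and not part of the project.

```python
import importlib.util

# Import names of the libraries listed above.
REQUIRED = ["pandas", "numpy", "matplotlib", "requests", "tweepy", "json"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are available.")
```

Any third-party package reported as missing can be installed with pip or conda before opening the notebook.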
