joohk / Udacity-DataAnalystND

My progress through the Udacity Data Analyst Nanodegree

Udacity-DanD-P0

In this repository I'll be trying to maintain a relatively presentable copy of my Udacity DanD projects. If you're feeling extra scrutinizing today, go ahead and skip the earlier projects: I took massive aesthetic liberties and churned out hideous but functional work quickly so I could get to the juicy learnin'.

P0 - Chopsticks

Introductory project: very straightforward basic stats on chopstick size preferences among students.
Jupyter Notebook, Python

P1 - Stroop

Looking at the Stroop effect in a small sample and doing some more basic stats.
Jupyter Notebook, Python (numpy, pandas)
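
A common way to compare the two Stroop conditions is a paired (dependent) t-test on each participant's congruent and incongruent reading times. The sketch below is an assumption about the kind of "basic stats" involved, not the notebook's actual code; the timings are made-up illustrative values, and it uses only the standard library rather than numpy or pandas.

```python
import math
import statistics


def paired_t_statistic(congruent, incongruent):
    """Return (mean difference, sample std of differences, t statistic)."""
    if len(congruent) != len(incongruent):
        raise ValueError("Each participant needs one time per condition")
    # Per-participant difference: incongruent minus congruent
    diffs = [b - a for a, b in zip(congruent, incongruent)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    t_stat = mean_diff / (sd_diff / math.sqrt(n))
    return mean_diff, sd_diff, t_stat


# Hypothetical reading times in seconds, NOT the project's data
congruent = [12.0, 14.0, 11.0, 15.0]
incongruent = [18.0, 20.0, 19.0, 21.0]

mean_diff, sd_diff, t_stat = paired_t_statistic(congruent, incongruent)
print(f"mean diff = {mean_diff:.2f}s, t({len(congruent) - 1}) = {t_stat:.2f}")
```

The t statistic would then be compared against a critical value for n - 1 degrees of freedom.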

P2 - Titanic Survival Data

Took the Titanic data set from Kaggle and explored it with numpy and pandas. The Kaggle competition asks you to look at stats as they relate to passenger survival, but for this exercise I didn't go terribly in depth on survival alone, preferring to also look at relationships among other variables such as class, sex, fare, and age.
Jupyter Notebook, Python (numpy, pandas)
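
Relationships like these are often summarized with a pandas group-by. The following is a minimal sketch, not the notebook's code: the column names follow the Kaggle Titanic CSV (`Pclass`, `Sex`, `Age`, `Fare`, `Survived`), while the rows are a tiny made-up sample used only to keep the example self-contained.

```python
import pandas as pd

# Toy rows for illustration only, NOT real passenger records
passengers = pd.DataFrame({
    "Pclass": [1, 1, 3, 3, 3, 2],
    "Sex": ["female", "male", "male", "female", "male", "female"],
    "Age": [38.0, 54.0, 22.0, 26.0, None, 14.0],
    "Fare": [71.28, 51.86, 7.25, 7.92, 8.05, 30.07],
    "Survived": [1, 0, 0, 1, 0, 1],
})

# One row per (class, sex) pair with a few summary statistics.
# Missing ages are skipped by median(), which is pandas' default.
summary = (
    passengers.groupby(["Pclass", "Sex"])
    .agg(
        mean_fare=("Fare", "mean"),
        median_age=("Age", "median"),
        survival_rate=("Survived", "mean"),
    )
    .reset_index()
)

print(summary)
```

With the real data, the same call would be run on the frame returned by `pd.read_csv` for the Kaggle training file.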

P3 - OpenStreetMap Data Wrangle with MongoDB

Exported an XML document from OpenStreetMap covering the Hampton Roads area in Virginia. The XML file was over 1 GB. Cleaned the data, saved it as JSON, loaded it into MongoDB, and looked at a couple of different statistics on the area. This one was a fav!
Jupyter Notebook, Python (numpy, pandas, ElementTree), XML, JSON, MongoDB
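
For a file this large, the XML is usually streamed rather than loaded whole. Below is a rough sketch of that idea using `ElementTree.iterparse`; the function name `iter_osm_elements`, the output document shape, and the inline sample XML are all illustrative assumptions, not the project's actual cleaning code.

```python
import io
import json
import xml.etree.ElementTree as ET


def iter_osm_elements(source, wanted=("node", "way")):
    """Yield one dict per <node>/<way> element without building the full tree."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag in wanted:
            doc = {"type": elem.tag, "id": elem.get("id")}
            # Child <tag k="..." v="..."/> elements carry the key/value data
            tags = {t.get("k"): t.get("v") for t in elem.iter("tag")}
            if tags:
                doc["tags"] = tags
            yield doc
            elem.clear()  # drop this element's children to limit memory use


# Tiny made-up sample standing in for the real export
sample = b"""<osm>
  <node id="1"><tag k="amenity" v="cafe"/></node>
  <way id="2"><nd ref="1"/><tag k="highway" v="residential"/></way>
</osm>"""

docs = list(iter_osm_elements(io.BytesIO(sample)))
# One JSON document per line
json_lines = "\n".join(json.dumps(d) for d in docs)
print(json_lines)
```

Writing one JSON document per line keeps the output in a shape that `mongoimport` can read directly.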

P4 - Data Mining the Tanzanian Water Table With R

Exploratory data analysis for the drivendata.org machine learning competition on predicting water well failures in Tanzania, in support of the nonprofit Taarifa.
R, R Markdown, Statistical Analysis, ggplot2

Languages

HTML 94.5%, Jupyter Notebook 5.4%, Python 0.1%