xiong-qiao / DS100

This is the repo of all work done during the DS100 course at UC Berkeley, Fall 2018.

DS100

This is the repo of all work done during the DS100 course at UC Berkeley, Fall 2018, including 12 labs, 7 homeworks, and 3 projects.


Homework

HW0: Introductions (Setup, Prerequisites, and Classification)

HW1: Food Safety (Cleaning and Exploring Data with Pandas)

HW2: Bike Sharing (EDA and Visualization)

HW3: Loss Minimization (Modeling, Estimation, and Gradient Descent)

HW4: Spam/Ham Classification (Feature Engineering, Logistic Regression, Cross-Validation)

HW5: Hypothesis Testing: Does The Hot Hand Effect Exist?

HW6: Scalable Data Processing Using Ray

Lab

Lab01: Getting familiar with JupyterHub and an introduction to matplotlib, a Python visualization library

Lab02: Pandas Overview

Lab03: Data Cleaning and Visualization

Lab04: Practice plotting, applying data transformations, and working with kernel density estimators. (Working with data from the World Bank containing various statistics for countries and territories around the world.)

Lab05: Modeling and Estimation

Lab06: Multiple Linear Regression and Feature Engineering

Lab07: Feature Engineering & Cross-Validation

Lab09: Logistic Regression

Lab10: Use Bootstrap to Estimate Mean and Variance

Lab11: SQL, FEC Data, and Small Donors

Lab12: Introduction to Data Commons

Project

Project1: Trump, Twitter, and Text (working with the Twitter API to analyze Donald Trump's tweets)

Project2: NYC Taxi Rides (The Data Science Lifecycle)

  • Project2A

    • Part1: Data Wrangling
    • Part2: EDA, Visualization, Feature Engineering
  • Project2B

    • Part3: NYC Accidents Data
    • Part4: Feature Engineering and Model Fitting

Languages

Jupyter Notebook 74.5%, HTML 25.1%, TeX 0.3%, Python 0.1%