Python-and-Spark-for-Big-Data
Course Notebooks for Python and Spark for Big Data
Course Outline:
-
Course Introduction
- Promo/Intro Video
- Course Curriculum Overview
- Introduction to Spark, RDDs, and Spark 2.0
-
Course Set-up
- Set-up Overview
- EC2 Installation Guide
- Local Installation Guide with VirtualBox
- Databricks Notebooks
- Unix Command Line Basics and Jupyter Notebook Overview
-
Spark DataFrames
- Spark DataFrames Section Introduction
- Spark DataFrame Basics
- Spark DataFrame Operations
- Groupby and Aggregate Functions
- Missing Data
- Dates and Timestamps
-
Spark DataFrame Project
- DataFrame Project Exercise
- DataFrame Project Exercise Solutions
-
Machine Learning
- Introduction to Machine Learning and ISLR
- Machine Learning with Spark and Python and MLlib
- Consulting Project Approach Overview
-
Linear Regression
- Introduction to Linear Regression
- Discussion on Data Transformations
- Linear Regression with PySpark Example (Car Data)
- Linear Regression Consulting Project (Housing Data)
- Linear Regression Consulting Project Solution
-
Logistic Regression
- Introduction to Logistic Regression
- Logistic Regression Example
- Logistic Regression Consulting Project (Customer Churn)
- Logistic Regression Consulting Project Solution
-
Tree Methods
- Introduction to Tree Methods
- Decision Tree and Random Forest Example
- Random Forest Classification Consulting Project - Dog Food Data
- RF Classification Consulting Project Solutions
- RF Regression Project (Facebook Data)
-
Clustering
- Introduction to K-means Clustering
- Clustering Example - Iris Dataset
- Clustering Consulting Project - Customer Segmentation (Fake Data)
- Clustering Consulting Project Solutions
-
Recommender System
- Introduction to Recommender Systems and Collaborative Filtering
- Code Along Project - MovieLens Dataset
- Possible Consulting Project - Company Service Reviews
-
Natural Language Processing
- Introduction to Project/NLP/Naive Bayes Model
- What are pipelines?
- Code Along
-
Spark Streaming
- Introduction to Spark Streaming
- Spark Streaming Code-along!