swilliamc / SparkSQL

UC Davis Distributed Computing with Spark SQL (with Databricks) and Databricks Apache Spark SQL for Data Analysts

SparkSQL

Distributed Computing with Spark SQL (UC Davis and Databricks)

Week 1: 101 Introduction to Spark and Queries in Spark SQL

Week 2: 102 Spark Core Concepts and Spark Internals

Week 3: 103 Engineering Data Pipelines

Week 4: 104 Machine Learning Applications of Spark and Linear Regression/Logistic Regression Classifier

Logistic Regression Classifier Machine Learning Assignment (with Python scikit-learn)


Databricks Apache Spark SQL for Data Analysts

W1 Introduction

W2 Big Data and Apache Spark

W3 Spark SQL on Databricks, Data Visualization, and Exploratory Data Analysis

W4 Spark SQL Powered Queries and Spark User Interface

W5 Managing Nested Data Structures, Manipulating Data, and Data Munging

W6 Higher Order Functions, Aggregating and Summarizing, Partitioning Tables, and Sharing Insights

W7 Modern Data Storage and Using Delta Lake

W8 Building and Maintaining Delta Tables, Managing Records in Delta Tables, and Delta Engine Optimization

W9 SQL Coding Challenges

Languages

Language:HTML 100.0%