marti1125 / DataAnalysisWithPythonAndPySpark

Code repository for the "PySpark in Action" book

Home Page: https://www.manning.com/books/data-analysis-with-python-and-pyspark

Data Analysis with Python and PySpark

This is the companion repository for the Data Analysis with Python and PySpark book (Manning, 2022). It contains the source code and data download scripts, when pertinent.

Get the data

The complete data set for the book is roughly 1 GB. Because of this, I moved the data sources to another repository, so that getting the code does not require cloning a gigantic repository. The book assumes the data is under ./data.
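Since the book's code expects the data under ./data, a quick check before running an example can save a confusing error later. The snippet below is only a minimal sketch: the helper name `find_data_dir` is hypothetical and not part of this repository, and it assumes you run it from the repository root.

```python
from pathlib import Path


def find_data_dir(root: Path = Path(".")) -> Path:
    """Return the data directory under `root`, or raise if it is missing.

    `find_data_dir` is a hypothetical helper, not part of the book's code.
    """
    data_dir = root / "data"
    if not data_dir.is_dir():
        # The data sources live in a separate repository, so a fresh clone
        # of the code will not have this directory yet.
        raise FileNotFoundError(
            f"Expected a data directory at {data_dir}; "
            "download the data sources there first."
        )
    return data_dir
```

Calling `find_data_dir()` with no argument checks the current working directory; pass a different `root` if you run the examples from elsewhere.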

Mistakes or omissions

If you encounter mistakes in the book manuscript (including the printed source code), please use the Manning platform to provide feedback.

Languages

Language:Python 100.0%