This notebook walks you through the basics of PySpark and how to manipulate data with it.
To run the notebook, install the requirements:
pip install -r requirements.txt
Python >= 3.5 is required, and I recommend installing the requirements in a clean environment.
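If you are not sure which interpreter your environment uses, a quick check like the one below can help. It is only a sketch: the helper name `python_is_supported` and the constant `MIN_PYTHON` are made up for illustration and are not part of this notebook or of PySpark.

```python
import sys

# Minimum version stated in this README (assumed to be the only constraint).
MIN_PYTHON = (3, 5)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    if python_is_supported():
        print("Python {}.{} is supported.".format(*sys.version_info[:2]))
    else:
        print("Python >= {}.{} is required.".format(*MIN_PYTHON))
```

Run it with the same interpreter you plan to use for the notebook, so that the check reflects the environment where the requirements get installed.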
Have fun!