- Python
- PySpark
- SQL
Uses PySpark to profile, clean, and briefly analyze Spotify artist data.
- In the terminal, clone the GitHub repository using the following command:
$ git clone https://github.com/jessgschueler/Spark-CR
- In a venv, pip install the requirements.txt file
- Create a /data directory and run get_data.sh inside it
- Run the main.py file
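
The steps above can be sketched as one shell function. This is a minimal sketch, not the author's script. The function name `setup_spark_cr`, the venv directory `.venv`, the `python3` and `bash` commands, the reading of "/data" as a `data` folder inside the cloned repository, and the location of get_data.sh at the repository root are all assumptions not stated in this README; adjust them to match the repository.

```shell
# Hypothetical helper: nothing runs until the function is called.
setup_spark_cr() {
    # Clone the repository and move into it
    git clone https://github.com/jessgschueler/Spark-CR || return 1
    cd Spark-CR || return 1

    # Create and activate a virtual environment (directory name is an assumption)
    python3 -m venv .venv || return 1
    . .venv/bin/activate || return 1

    # Install the project dependencies
    pip install -r requirements.txt || return 1

    # Create the data directory and run get_data.sh from inside it
    # (assumes get_data.sh lives at the repository root)
    mkdir -p data || return 1
    (cd data && bash ../get_data.sh) || return 1

    # Run the analysis
    python main.py
}
```

Calling `setup_spark_cr` from an empty working directory runs the steps in order and stops at the first one that fails. Because a shell function runs in the current shell, the working directory and the activated venv remain in effect afterwards.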
- None at this time
MIT
Copyright (c) 6/24/22 Jess Schueler