This Docker container is intended for learning PySpark programming. It includes the following components.
- Hadoop v3.2.1
- Spark v2.4.4
- Conda 3 with Python v3.7
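To confirm which versions are actually present, a short check such as the following can be run in a notebook or Python shell inside the container. This is a minimal sketch: the helper name `component_versions` is hypothetical, and it only assumes that the `pyspark` package may or may not be importable from the active Conda environment.

```python
import sys


def component_versions():
    """Return the Python version and, if importable, the PySpark version."""
    versions = {"python": "%d.%d" % sys.version_info[:2]}
    try:
        # pyspark is only available where Spark's Python bindings are installed.
        import pyspark
        versions["pyspark"] = pyspark.__version__
    except ImportError:
        versions["pyspark"] = None
    return versions


print(component_versions())
```

If the environment matches the list above, the reported values should correspond to Python 3.7 and Spark 2.4.4; a `None` for `pyspark` suggests the interpreter in use is not the one configured for Spark.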
Once the container is running, you can visit the following pages.
- HDFS
- YARN
- Spark
- Spark History
- Jupyter Lab
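A quick way to see which of these pages are reachable is to probe their ports from the host. The sketch below assumes the container publishes each service on its usual upstream default port on `localhost` (9870 for the HDFS NameNode UI in Hadoop 3, 8088 for YARN, 8080 for the Spark master UI, 18080 for the Spark History Server, 8888 for Jupyter Lab). These port numbers are assumptions, not taken from this project; adjust them to match the actual port mappings. The names `DEFAULT_PORTS`, `is_port_open`, and `check_services` are hypothetical helpers.

```python
import socket

# Assumed upstream default ports; the container may map them differently.
DEFAULT_PORTS = {
    "HDFS": 9870,
    "YARN": 8088,
    "Spark": 8080,
    "Spark History": 18080,
    "Jupyter Lab": 8888,
}


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_services(host="localhost", ports=DEFAULT_PORTS):
    """Map each service name to whether its port accepts connections."""
    return {name: is_port_open(host, port) for name, port in ports.items()}
```

Calling `check_services()` after the container has started returns a dictionary of service names and booleans. An open port only shows that something is listening there, not that the page itself has finished loading.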
To start the Docker container, run the following command.
bash ./start-docker-container.sh
Click the link below to access the portal.