nghoanglong / spark-cluster-with-docker

An implementation of Apache Spark (combined with PySpark and Jupyter Notebook) on top of a Hadoop cluster using Docker

Run Spark Cluster within Docker

This is an implementation of a Spark cluster on top of Hadoop (1 master node, 2 slave nodes) using Docker.

Follow these steps on Windows 10:

1. clone the GitHub repo

# Step 1
git clone https://github.com/nghoanglong/spark-cluster-with-docker.git

# Step 2
cd spark-cluster-with-docker

2. pull the Docker image

docker pull ghcr.io/nghoanglong/spark-cluster-with-docker/spark-cluster:1.0

3. start the cluster

docker-compose up

4. access the web UIs

  1. hadoop cluster: http://localhost:50070/
  2. hadoop cluster - resource manager: http://localhost:8088/
  3. spark cluster: http://localhost:8080/
  4. jupyter notebook: http://localhost:8888/
  5. spark history server: http://localhost:18080/
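Once the containers are running, you can check that all five web UIs are reachable with a small Python script (a sketch; the service labels are informal names, and the ports are the ones listed above — the UIs may take a minute to come up after `docker-compose up`):

```python
from urllib.request import urlopen
from urllib.error import URLError

# Web UIs exposed by the cluster (ports from the list above)
ENDPOINTS = {
    "hadoop-namenode": "http://localhost:50070/",
    "resource-manager": "http://localhost:8088/",
    "spark-master": "http://localhost:8080/",
    "jupyter-notebook": "http://localhost:8888/",
    "spark-history": "http://localhost:18080/",
}

def check(url, timeout=5):
    """Return True if the UI answers with HTTP 200, False otherwise."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (URLError, OSError):
        return False

if __name__ == "__main__":
    for name, url in ENDPOINTS.items():
        print(f"{name:18s} {'UP' if check(url) else 'DOWN'}")
```

If a service reports DOWN right after startup, wait a little and re-run the script before assuming something is misconfigured.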


Languages

Shell: 78.2%, Dockerfile: 21.8%