docker pull bde2020/hadoop-namenode:1.1.0-hadoop2.8-java8
docker pull bde2020/hadoop-datanode:1.1.0-hadoop2.8-java8
docker pull bde2020/spark-base:2.1.0-hadoop2.8-hive-java8
docker pull bde2020/spark-master:2.1.0-hadoop2.8-hive-java8
docker pull bde2020/spark-worker:2.1.0-hadoop2.8-hive-java8
docker pull bde2020/spark-notebook:2.1.0-hadoop2.8-hive
docker pull bde2020/hdfs-filebrowser:3.11
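The pulled images are wired together by a `docker-compose.yml`. A minimal sketch of what such a file could look like is shown below; the service names, environment variables, and port mappings are assumptions for illustration, not the project's actual compose file.

```yaml
# Hypothetical minimal docker-compose.yml sketch (not the project's real file).
version: "2"
services:
  namenode:
    image: bde2020/hadoop-namenode:1.1.0-hadoop2.8-java8
    ports:
      - "50070:50070"           # Namenode web UI
    environment:
      - CLUSTER_NAME=test       # assumed cluster name
  datanode:
    image: bde2020/hadoop-datanode:1.1.0-hadoop2.8-java8
    ports:
      - "50075:50075"           # Datanode web UI
    environment:
      - CORE_CONF_fs_defaultFS=hdfs://namenode:8020   # assumed HDFS address
  spark-master:
    image: bde2020/spark-master:2.1.0-hadoop2.8-hive-java8
    ports:
      - "8080:8080"             # Spark master web UI
```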
To start an HDFS/Spark Workbench, run:
docker-compose up -d
Once the containers are up, the following interfaces are available:
- Namenode: http://localhost:50070
- Datanode: http://localhost:50075
- Spark-master: http://localhost:8080
- Spark-notebook: http://localhost:9001
- Hue (HDFS Filebrowser): http://localhost:8088/home
The jar file is packaged under the `jarfile` directory.
- Compiled with Scala 2.11.11
- Built for Spark 2.1.0
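Since the jar is built for Spark 2.1.0 with Scala 2.11.11, it could be submitted to the cluster roughly as follows; the application class, jar filename, and master URL are assumptions, not values from this project.

```shell
# Hypothetical spark-submit invocation (class, jar name and master URL assumed).
spark-submit \
  --master spark://spark-master:7077 \
  --class org.example.PredictionJob \
  jarfile/app_2.11-0.1.jar
```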
Run the make targets in the following order; the sequence matters for the job to complete correctly:
make prepare-raw-dataset
make ingest-hdfs
make jar
make prediction
make prediction-result
make clean-output
make clean-input
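The ordering above could be encoded in the Makefile itself via target prerequisites. The sketch below is hypothetical: the scripts, HDFS paths, and container names are assumptions, not taken from this project.

```make
# Hypothetical Makefile sketch; paths, scripts and container names are assumed.
prepare-raw-dataset:
	./scripts/prepare-raw-dataset.sh            # assumed helper script

ingest-hdfs: prepare-raw-dataset
	docker exec namenode hdfs dfs -put /data/raw /input   # assumed paths

jar:
	sbt package                                 # Scala 2.11.11 / Spark 2.1.0 build

prediction: jar ingest-hdfs
	# submit the Spark job here (command omitted)

clean-output:
	docker exec namenode hdfs dfs -rm -r /output  # assumed output path
```

With prerequisites declared this way, `make prediction` alone would pull in the earlier steps automatically.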