lisy09 / spark-dev-box

A practical dev suite for scala spark

spark-dev-box

Origin: https://github.com/lisy09/spark-dev-box

This project provides a practical development suite for Scala Spark applications.

Directory

  • spark-app/: module directory where the Spark application is written
  • vendor/hadoop-docker: vendored Hadoop Docker images

How to run

Prerequisites

Build all Docker images in:

  • kafka-docker
  • vendor/hadoop-docker
  • vendor/apache-livy-docker

and build the Spark application jar at:

  • spark-app/build_results/spark-app.jar

You can do this in one command:

make all

Step1. Deploy the dev cluster with docker-compose

make deploy

Step2. Submit the Spark Streaming application through the Apache Livy REST API

This step requires curl to be installed.

make submit

To stop the Spark Streaming application:

make unsubmit
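
For orientation: submitting through Livy's REST API generally means a POST to /batches with a JSON body naming the jar and its main class, and stopping the application means a DELETE on /batches/{batchId}. The sketch below builds those two requests in Python. What `make submit` and `make unsubmit` actually run is defined in the repository's Makefile and may differ. The Livy address (8998 is Livy's default port), the jar location, and the class name `example.Main` are placeholders, not values taken from this project.

```python
import json
import urllib.request

# Assumption: Livy listens on its default port and is reachable from the host.
LIVY_URL = "http://localhost:8998"


def build_submit_request(jar_path, class_name, livy_url=LIVY_URL):
    """Build (but do not send) a POST /batches request for Livy."""
    payload = {"file": jar_path, "className": class_name}
    return urllib.request.Request(
        url=f"{livy_url}/batches",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_kill_request(batch_id, livy_url=LIVY_URL):
    """Build (but do not send) a DELETE /batches/{batchId} request."""
    return urllib.request.Request(
        url=f"{livy_url}/batches/{batch_id}",
        method="DELETE",
    )


# Hypothetical usage against a running cluster (placeholder jar path and class):
#   req = build_submit_request("/path/to/spark-app.jar", "example.Main")
#   with urllib.request.urlopen(req) as resp:
#       batch = json.load(resp)   # Livy replies with a batch object that has an "id"
#   urllib.request.urlopen(build_kill_request(batch["id"]))
```

Building the requests separately from sending them makes it easy to inspect the URL and payload before anything reaches the cluster.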

Step3. Start the Kafka client to produce streaming data

make send-batch

Step4. Check the latest state stored in Redis through an HTTP server

make curl-wordcount

Alternatively, check the GET /v1/wordcount API at http://localhost:8001/docs
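
The same endpoint can also be queried from a script. The following is a minimal Python sketch that uses the host, port, and path given above. It assumes the server replies with JSON, which this README does not state, so the response shape is left uninterpreted. The function names are illustrative and not part of the project.

```python
import json
import urllib.request


def wordcount_url(base_url="http://localhost:8001"):
    """Return the full URL of the GET /v1/wordcount endpoint."""
    return f"{base_url.rstrip('/')}/v1/wordcount"


def fetch_wordcount(base_url="http://localhost:8001", timeout=5.0):
    """Fetch the current word-count state.

    Assumption: the body is JSON; it is returned as parsed, without
    assuming any particular structure.
    """
    with urllib.request.urlopen(wordcount_url(base_url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Hypothetical usage once the dev cluster is deployed and data has been sent:
#   print(fetch_wordcount())
```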

StepX. Undeploy the dev cluster

make undeploy

About

License: MIT License


Languages

  • Python 41.1%
  • Shell 18.5%
  • HTML 12.5%
  • Dockerfile 12.2%
  • Makefile 9.5%
  • Scala 6.3%