damian-barsotti / spark-docker-thrift

Docker compose cluster with running thrift server

Spark cluster with thrift server implemented with docker compose

General

A simple Spark standalone cluster, run with Docker Compose, with a running Thrift server for testing purposes. The cluster shares the Spark warehouse directory with the Thrift server. Tested with Tableau.

Set file sharing permissions

chmod g+rwx shared-folder
sudo chown :root shared-folder
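
To confirm that the permission change took effect, the group bits of the folder can be inspected. The following is a minimal sketch, not part of the repository; the helper name `group_has_rwx` is hypothetical.

```python
import os
import stat


def group_has_rwx(path: str) -> bool:
    """Return True if the group has read, write, and execute bits on path."""
    mode = os.stat(path).st_mode
    # S_IRWXG is the mask for the three group permission bits.
    return (mode & stat.S_IRWXG) == stat.S_IRWXG
```

Calling `group_has_rwx("shared-folder")` from the repository root should return `True` after the `chmod` command above has run.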

Build the images

docker compose build

Run Docker Compose

The final step in creating the test cluster is to start the services defined in the compose file:

docker compose up -d

Show Spark UI

http://localhost:8080/

You should see the thrift server as an application.
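
The same check can be scripted instead of done in the browser. The sketch below assumes that the Spark standalone master in this image serves its status as JSON at `/json/` on the web UI port, and that the Thrift server registers under an application name containing "thrift"; both are assumptions, and the function names are hypothetical.

```python
import json
from urllib.request import urlopen

# Assumption: the master exposes a JSON status page next to the web UI.
MASTER_JSON_URL = "http://localhost:8080/json/"


def active_app_names(status: dict) -> list:
    """Return the names of the applications the master reports as active."""
    return [app.get("name", "") for app in status.get("activeapps", [])]


def has_thrift_server(status: dict) -> bool:
    """True if any active application name mentions 'thrift' (case-insensitive)."""
    return any("thrift" in name.lower() for name in active_app_names(status))


def fetch_status(url: str = MASTER_JSON_URL) -> dict:
    """Download and decode the master status document."""
    with urlopen(url, timeout=5) as response:
        return json.load(response)
```

With the cluster running, `has_thrift_server(fetch_status())` is expected to return `True` once the Thrift server has registered with the master.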

Test the cluster

docker compose run -it spark-cmd /opt/spark/bin/spark-submit --master spark://spark-master:7077 /shared-folder/load_data_write_to_server.py
docker compose run -it spark-cmd /shared-folder/connect-thrift-server.sh

Run Jupyter notebook in the cluster

docker compose up jupyter

and open the link beginning with "http://127.0.0.1:8888/lab?token=" that is printed in the console output.

There are some example notebooks in the work folder inside Jupyter.
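
If the Jupyter service is started in the background, the tokenized link can be recovered from its logs (for example the output of `docker compose logs jupyter`). The following is a minimal sketch that assumes the link has exactly the form quoted above; the helper name `find_jupyter_url` is hypothetical.

```python
import re
from typing import Optional

# Matches the tokenized JupyterLab link quoted in this README.
JUPYTER_URL_PATTERN = re.compile(r"http://127\.0\.0\.1:8888/lab\?token=\w+")


def find_jupyter_url(log_text: str) -> Optional[str]:
    """Return the first tokenized JupyterLab URL in log_text, or None."""
    match = JUPYTER_URL_PATTERN.search(log_text)
    return match.group(0) if match else None
```

Passing the captured log text to `find_jupyter_url` returns the link to open, or `None` if the server has not printed it yet.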

Show the IP address of the thrift server

./show-thrift-server-ip.sh
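
Before pointing a client such as Tableau at that address, it can help to check that the server accepts TCP connections. The sketch below is not part of the repository; the helper name `can_connect` is hypothetical, and the port to test is an assumption (10000 is the usual default for the Spark Thrift server, but the compose file may map a different one).

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False
```

For example, `can_connect(ip, 10000)`, where `ip` is the address printed by the script above, reports whether the port is reachable. It only tests the TCP handshake, not whether the server answers SQL queries.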

About

License: GNU General Public License v3.0

Languages

Jupyter Notebook 72.2%, Dockerfile 14.2%, Python 8.5%, Shell 5.0%