big-data-europe / docker-spark

Apache Spark docker image

spark-history-server not actually working with the docker-compose.yaml in readme

tlzxsun opened this issue

There are two issues:

  1. spark-history-server and the master/worker are not on the same node (they run in separate containers), so the logs in /tmp/spark-events are never read by the history server.
  2. There seems to be no spark-defaults.conf on the master/worker nodes, and spark.eventLog.enabled is false by default, so no logs are written.

I managed to make it work by mounting /spark/conf/spark-defaults.conf and /tmp/spark-events from the host, and it seems to be working now.
But I don't think this is a good approach, so I was hoping for a better solution.
Thanks a lot~

version: '3'
services:
  spark-master:
    image: bde2020/spark-master:3.3.0-hadoop3.3
    container_name: spark-master
    ports:
      - "8080:8080"
      - "7077:7077"
    environment:
      - INIT_DAEMON_STEP=setup_spark
    volumes:
      - /tmp/spark-events-local:/tmp/spark-events
      - /Users/matteo/spark/conf/spark-defaults.conf:/spark/conf/spark-defaults.conf
  spark-worker-1:
    image: bde2020/spark-worker:3.3.0-hadoop3.3
    container_name: spark-worker-1
    depends_on:
      - spark-master
    ports:
      - "8081:8081"
    environment:
      - "SPARK_MASTER=spark://spark-master:7077"
    volumes:
      - /tmp/spark-events-local:/tmp/spark-events
      - /Users/matteo/spark/conf/spark-defaults.conf:/spark/conf/spark-defaults.conf
  spark-worker-2:
    image: bde2020/spark-worker:3.3.0-hadoop3.3
    container_name: spark-worker-2
    depends_on:
      - spark-master
    ports:
      - "8082:8081"
    environment:
      - "SPARK_MASTER=spark://spark-master:7077"
    volumes:
      - /tmp/spark-events-local:/tmp/spark-events
      - /Users/matteo/spark/conf/spark-defaults.conf:/spark/conf/spark-defaults.conf
  spark-history-server:
    image: bde2020/spark-history-server:3.3.0-hadoop3.3
    container_name: spark-history-server
    depends_on:
      - spark-master
    ports:
      - "18081:18081"
    volumes:
      - /tmp/spark-events-local:/tmp/spark-events
      - /Users/matteo/spark/conf/spark-defaults.conf:/spark/conf/spark-defaults.conf

Could you please share the spark-defaults.conf file that you are using?

Regarding point 2 (spark.eventLog.enabled):

I am sorry, all the files are gone now. But I think there was only one line in the conf:
spark.eventLog.enabled true
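
As a possible answer to the "better solution" asked for in the issue, a named Docker volume could replace the host path for /tmp/spark-events, so the setup no longer depends on a directory on one particular machine. The following is only a minimal, untested sketch: the volume name spark-events and the relative path ./conf/spark-defaults.conf are assumptions, only one worker is shown, and it relies on Spark's default event log directory being /tmp/spark-events. Note that event logs are written by the driver process, so whichever container runs the driver needs both the config file and the shared volume.

```yaml
# Sketch only, not a tested configuration.
# "spark-events" is a hypothetical named volume shared by all services.
# "./conf/spark-defaults.conf" is a hypothetical path to a file containing:
#   spark.eventLog.enabled true
version: '3'
services:
  spark-master:
    image: bde2020/spark-master:3.3.0-hadoop3.3
    container_name: spark-master
    ports:
      - "8080:8080"
      - "7077:7077"
    environment:
      - INIT_DAEMON_STEP=setup_spark
    volumes:
      - spark-events:/tmp/spark-events
      - ./conf/spark-defaults.conf:/spark/conf/spark-defaults.conf
  spark-worker-1:
    image: bde2020/spark-worker:3.3.0-hadoop3.3
    container_name: spark-worker-1
    depends_on:
      - spark-master
    ports:
      - "8081:8081"
    environment:
      - "SPARK_MASTER=spark://spark-master:7077"
    volumes:
      - spark-events:/tmp/spark-events
      - ./conf/spark-defaults.conf:/spark/conf/spark-defaults.conf
  spark-history-server:
    image: bde2020/spark-history-server:3.3.0-hadoop3.3
    container_name: spark-history-server
    depends_on:
      - spark-master
    ports:
      - "18081:18081"
    volumes:
      # The history server only needs to read the shared event logs.
      - spark-events:/tmp/spark-events
# Top-level declaration of the shared named volume.
volumes:
  spark-events:
```

This still requires a spark-defaults.conf on the host, so it only removes the dependency on the host's /tmp directory, not on the config file itself.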