Kubernetes example clarification on spark-submit
Fixmetal opened this issue · comments
Hello
I'm trying to migrate an application onto Kubernetes using this solution. I'm a bit confused because I'm not from the Spark world and I need some directions. I hope someone out there can help me out.
Following https://github.com/big-data-europe/docker-spark#kubernetes-deployment I can successfully create a pod based on the base image with my application. But this pod will stay up forever, since spark-submit just runs endlessly.
I think it's just me, but I don't see how this is correct, since this way we would have:
- A master pod (which is cluster manager, right?)
- One or more worker(s) pod(s) which should compute what submitted applications instruct to
- A pod per application, which will stay up forever (until the application eventually ends)
What I was expecting from spark-submit was to submit the application to the workers and then end its own life, but maybe I'm just looking at this the wrong way.
Can some Spark expert clarify the exact use case on k8s?
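For reference, this is the behaviour I was expecting. A sketch, assuming a standalone master reachable at spark://spark-master:7077 and placeholder jar/class names: in cluster deploy mode, spark-submit hands the driver off to the cluster and can return instead of running forever.

```sh
# Sketch only: cluster deploy mode launches the driver inside the
# Spark standalone cluster, so this command can exit after submission.
# <className> and <path-to-mainClass-file.jar> are placeholders.
bin/spark-submit \
  --master spark://spark-master:7077 \
  --deploy-mode cluster \
  --class <className> \
  <path-to-mainClass-file.jar>
```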
This is how I handled the spark-submit
operation:
```yaml
---
apiVersion: batch/v1
kind: Job
metadata:
  # Kubernetes object names must be lowercase (RFC 1123), so no camelCase here
  name: my-spark-submit-job
spec:
  template:
    metadata:
      labels:
        app: spark-client
    spec:
      containers:
        - name: my-spark-submit-job-container
          image: myCustomImage
          command: [ "bin/spark-submit" ]
          args:
            - "--master"
            - "spark://spark-master:7077"
            - "--deploy-mode"
            - "client"
            - "--conf"
            - "spark.yarn.submit.waitAppCompletion=false"
            - "--conf"
            - "spark.driver.host=spark-client"
            - "--conf"
            - "spark.executor.memory=2g"
            - "--conf"
            - "spark.executor.cores=1"
            - "--conf"
            - "spark.locality.wait=0"
            - "--conf"
            - "spark.network.timeout=432000"
            - "--conf"
            - "spark.ui.showConsoleProgress=false"
            - "--conf"
            - "spark.driver.extraClassPath=<path-to-dependency-file.jar>"
            - "--conf"
            - "spark.driver.extraJavaOptions=-Dlog4j.configurationFile=<log4jdriver-properties-file> -Djava.security.egd=file:///dev/urandom"
            - "--class"
            - "<className>"
            - "--jars"
            - "<path-to-dependency-file.jar>"
            - "<path-to-mainClass-file.jar>"
            - "-c"
            - "<path-to-application-config-file>"
      restartPolicy: OnFailure
  backoffLimit: 3
```
I think I found an answer: in client mode the pod itself becomes the Driver and instructs the workers to run the application's tasks. A proper answer is here, I guess. From what I understood, the Master is the Cluster Manager, the Workers host the Executors, and my application pod runs the Driver.
Hence I converted the whole thing into a Deployment instead of using a Job.
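For anyone landing here later, a minimal sketch of that Deployment, with the same placeholder image and jar names as above (the Deployment keeps exactly one driver pod running and restarts it if it dies):

```yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: spark-client
spec:
  replicas: 1
  selector:
    matchLabels:
      app: spark-client
  template:
    metadata:
      labels:
        app: spark-client
    spec:
      containers:
        - name: spark-client
          image: myCustomImage   # placeholder, same custom image as the Job above
          command: [ "bin/spark-submit" ]
          args:
            - "--master"
            - "spark://spark-master:7077"
            - "--deploy-mode"
            - "client"
            - "--conf"
            - "spark.driver.host=spark-client"
            - "--class"
            - "<className>"
            - "<path-to-mainClass-file.jar>"
```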
Feel free to comment but I feel this is the point I was missing so I'm closing the issue.