js-ts / kfp-spark

Orchestrate spark jobs on kubeflow pipelines

KFP version: 1.7.0+
Kubernetes version: 1.17+

Orchestrate Spark Jobs using Kubeflow pipelines

Install Kubeflow Pipelines standalone or full Kubeflow

For a standalone Kubeflow Pipelines installation:

https://www.kubeflow.org/docs/components/pipelines/installation/

For a full Kubeflow installation:

https://www.kubeflow.org/docs/started/installing-kubeflow/

Install Spark Operator

https://github.com/GoogleCloudPlatform/spark-on-k8s-operator#installation

Create Spark Service Account and add permissions

kubectl apply -f ./scripts/spark-rbac.yaml
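The contents of spark-rbac.yaml are not shown here. As a rough illustration of what such a file typically grants, the sketch below builds a ServiceAccount, Role, and RoleBinding as plain Python dictionaries and prints them as JSON, which kubectl also accepts. The namespace `kubeflow`, the role name, and the exact rule list are assumptions for illustration; only the service account name `spark-sa` comes from this guide.

```python
import json

NAMESPACE = "kubeflow"        # assumption: namespace where pipeline runs execute
SERVICE_ACCOUNT = "spark-sa"  # service account name used when starting the run


def build_spark_rbac(namespace: str, service_account: str) -> dict:
    """Return a Kubernetes List holding a ServiceAccount, Role, and RoleBinding."""
    metadata = {"namespace": namespace}
    role_name = f"{service_account}-role"  # hypothetical role name

    account = {
        "apiVersion": "v1",
        "kind": "ServiceAccount",
        "metadata": {"name": service_account, **metadata},
    }
    role = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": role_name, **metadata},
        "rules": [
            # Allow managing SparkApplication custom resources.
            {
                "apiGroups": ["sparkoperator.k8s.io"],
                "resources": ["sparkapplications"],
                "verbs": ["create", "get", "list", "watch", "delete"],
            },
            # Allow the Spark driver to manage executor pods and related objects.
            {
                "apiGroups": [""],
                "resources": ["pods", "services", "configmaps"],
                "verbs": ["create", "get", "list", "watch", "delete"],
            },
        ],
    }
    binding = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "RoleBinding",
        "metadata": {"name": f"{role_name}-binding", **metadata},
        "subjects": [
            {"kind": "ServiceAccount", "name": service_account, "namespace": namespace}
        ],
        "roleRef": {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "Role",
            "name": role_name,
        },
    }
    return {"apiVersion": "v1", "kind": "List", "items": [account, role, binding]}


manifest = build_spark_rbac(NAMESPACE, SERVICE_ACCOUNT)
print(json.dumps(manifest, indent=2))
```

Redirecting the output to a file and applying it with kubectl apply -f would create the three objects; the repository's own spark-rbac.yaml remains the authoritative version.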

Run the notebook kubeflow-pipeline.ipynb
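The notebook itself is not reproduced here. For orientation, a Spark job handled by the Spark Operator is described by a SparkApplication custom resource, and a pipeline step would submit something of that shape. The sketch below builds one as a Python dictionary. The job name, namespace, image, and jar path are assumptions modeled on the operator's SparkPi example, not values taken from the notebook.

```python
import json


def build_spark_application(name: str, namespace: str, service_account: str) -> dict:
    """Return a SparkApplication manifest for a small SparkPi job (illustrative values)."""
    return {
        "apiVersion": "sparkoperator.k8s.io/v1beta2",
        "kind": "SparkApplication",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "type": "Scala",
            "mode": "cluster",
            # Assumed image and jar, following the operator's SparkPi example.
            "image": "gcr.io/spark-operator/spark:v3.1.1",
            "mainClass": "org.apache.spark.examples.SparkPi",
            "mainApplicationFile": (
                "local:///opt/spark/examples/jars/spark-examples_2.12-3.1.1.jar"
            ),
            "sparkVersion": "3.1.1",
            "restartPolicy": {"type": "Never"},
            # The driver runs under the service account created in the RBAC step.
            "driver": {
                "cores": 1,
                "memory": "512m",
                "serviceAccount": service_account,
            },
            "executor": {"cores": 1, "instances": 1, "memory": "512m"},
        },
    }


# Hypothetical job name and namespace; only "spark-sa" comes from this guide.
spark_app = build_spark_application("spark-pi", "kubeflow", "spark-sa")
print(json.dumps(spark_app, indent=2))
```

The driver's serviceAccount field is where the permissions from the previous step take effect, since the driver pod needs them to create executor pods.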

Access the Kubeflow or KFP UI

Upload pipeline

Upload the spark_job_pipeline.yaml file


Create Run


Start the pipeline and set the service account to spark-sa


Wait until the execution finishes, then check the print-message logs to view the result.
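The upload, create-run, and service-account steps can likely also be scripted with the KFP Python SDK instead of the UI. The sketch below assumes the KFP v1 SDK, where `kfp.Client.create_run_from_pipeline_package` accepts a `service_account` argument in recent 1.x releases; check the installed version before relying on it. The host URL, experiment name, and run name are hypothetical placeholders.

```python
from typing import Any, Dict


def build_run_kwargs(
    package_path: str,
    service_account: str,
    experiment_name: str,
    run_name: str,
) -> Dict[str, Any]:
    """Collect the arguments for submitting a compiled pipeline package."""
    return {
        "pipeline_file": package_path,
        "arguments": {},  # this sketch passes no pipeline parameters
        "run_name": run_name,
        "experiment_name": experiment_name,
        "service_account": service_account,
    }


def submit_run(host: str, run_kwargs: Dict[str, Any]):
    """Submit the run to a KFP endpoint (requires the kfp package and a reachable API)."""
    import kfp  # imported lazily so the helper above works without the SDK installed

    client = kfp.Client(host=host)
    return client.create_run_from_pipeline_package(**run_kwargs)


run_kwargs = build_run_kwargs(
    package_path="spark_job_pipeline.yaml",
    service_account="spark-sa",
    experiment_name="spark-experiment",  # hypothetical experiment name
    run_name="spark-job-run",            # hypothetical run name
)

# Example call, assuming the KFP API is reachable at this hypothetical address:
# result = submit_run("http://localhost:8080", run_kwargs)
```

Keeping the argument construction separate from the submission makes it easy to inspect what will be sent before contacting the cluster.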


License: Apache License 2.0

Languages: Jupyter Notebook 100.0%