pachyderm / openshift-operator

Operator for deploying Pachyderm on OpenShift

Pachyderm Operator

The Pachyderm operator is an application that installs, watches, and manages Pachyderm resources in your OpenShift cluster.

Pachyderm is the data foundation for machine learning. It provides industry-leading data versioning, pipelines, and lineage that allow data science teams to automate the machine learning lifecycle and optimize their machine learning operations (MLOps).

Usage

1. Using AWS S3 for Pachd storage

  • Create a secret which contains the AWS S3 storage information
$ oc create secret generic pachyderm-aws \
    --from-literal=access-id=ABCDEFGHIJKLMNOPQR \
    --from-literal=access-secret=dkhfjdshfj/fjkdshfiuUjmfhdsjkhfjdhs/KLhdfuiseh \
    --from-literal=bucket=pachyderm-bucket \
    --from-literal=region=us-east-1

secret/pachyderm-aws created
$
  • Create a Pachyderm custom resource in the same namespace
$ cat <<EOF > pachyderm-cr.yaml
apiVersion: aiml.pachyderm.com/v1beta1
kind: Pachyderm
metadata:
  name: pachyderm-sample
  namespace: pachyderm-test
spec:
  console:
    disable: true
  pachd:
    metrics:
      disable: false
    storage:
      amazon:
        credentialSecretName: pachyderm-aws
      backend: AMAZON
EOF
$ oc create -f pachyderm-cr.yaml
pachyderm.aiml.pachyderm.com/pachyderm-sample created
$ 
  • Verify that Pachyderm is up and running
$ oc get pachyderm pachyderm-sample -o yaml | yq e '.status' -
phase: Running
$   
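
The status check above can also be scripted. The following is a minimal sketch, not part of the operator itself: it polls the custom resource until `.status.phase` reports the wanted value. The function names `get_phase` and `wait_for_phase`, the default of 30 attempts, and the 10-second delay are assumptions chosen for illustration; only the `Running` phase and the resource names come from the example above.

```shell
# Read the phase of a Pachyderm custom resource.
# $1 = resource name, $2 = namespace
get_phase() {
  oc get pachyderm "$1" -n "$2" -o jsonpath='{.status.phase}'
}

# Poll until the resource reaches the wanted phase or attempts run out.
# $1 = name, $2 = namespace, $3 = wanted phase (default Running),
# $4 = attempts (default 30), $5 = delay in seconds (default 10)
wait_for_phase() {
  name=$1
  namespace=$2
  want=${3:-Running}
  attempts=${4:-30}
  delay=${5:-10}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # Treat a failed lookup as "no phase yet" and keep polling.
    phase=$(get_phase "$name" "$namespace" 2>/dev/null) || phase=""
    if [ "$phase" = "$want" ]; then
      echo "$name is $phase"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "timed out waiting for $name to reach $want" >&2
  return 1
}

# Usage (requires a logged-in oc session):
# wait_for_phase pachyderm-sample pachyderm-test
```

Keeping the lookup in its own function makes it easy to swap in a different query, for example one that uses `yq` as in the command above.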

2. Using a user-provided PostgreSQL database

  • Set PostgreSQL to disabled in pachyderm.spec.postgresql

  • Provide the connection information for your PostgreSQL instance in pachyderm.spec.pachd.postgresql

  • Create a Kubernetes secret to hold the PostgreSQL password. The secret must contain the key postgres-password
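
The last step can be sketched as follows. The only detail taken from the steps above is the key `postgres-password`; the secret name `pachyderm-postgres`, the namespace `pachyderm-test`, and the placeholder password are hypothetical values. The script only writes a manifest to disk, so it can be reviewed before it is applied.

```shell
# Hypothetical values; replace them with your own.
SECRET_NAME=pachyderm-postgres
NAMESPACE=pachyderm-test
PG_PASSWORD='change-me'

# Write a Secret manifest whose only key is postgres-password.
# stringData lets the API server handle the base64 encoding.
cat <<EOF > postgres-secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: ${SECRET_NAME}
  namespace: ${NAMESPACE}
type: Opaque
stringData:
  postgres-password: "${PG_PASSWORD}"
EOF

# Apply it once you are logged in to the cluster:
# oc apply -f postgres-secret.yaml
```

Create the secret in the same namespace as the Pachyderm custom resource so the operator can read it.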

License: Apache License 2.0