ArroyoSystems / arroyo

Distributed stream processing engine in Rust

Home Page: https://arroyo.dev

Controller panics when stopping pipeline

juchiast opened this issue:

Deployed on AWS EKS. The controller logs this panic when a pipeline is stopped:

2023-08-29T15:36:10.402273Z ERROR arroyo_server_common: panicked at 'called `Result::unwrap()` on an `Err` value: Request ID: None Body: <?xml version="1.0" encoding="UTF-8"?>
<Error><Code>AccessDenied</Code><Message>Access Denied</Message><RequestId>9GVR3VRECTWSPBJQ</RequestId><HostId>R0W2NZnbLfvwxlQ8htuN7ijJmQjeuKDDsKOkdhqz7WL55F5iCvpcdZgaXrKsfstuQKuzS9z9m40=</HostId></Error>', arroyo-state/src/parquet.rs:131:14 panic.file="arroyo-state/src/parquet.rs" panic.line=131 panic.column=14

Output of kubectl describe deployment/arroyo-controller:

Name:               arroyo-controller
Namespace:          default
CreationTimestamp:  Tue, 29 Aug 2023 11:03:20 +0700
Labels:             app=arroyo-controller
                    app.kubernetes.io/instance=arroyo
                    app.kubernetes.io/managed-by=Helm
                    app.kubernetes.io/name=arroyo
                    app.kubernetes.io/version=0.5.1
                    helm.sh/chart=arroyo-0.5.1
Annotations:        deployment.kubernetes.io/revision: 2
                    meta.helm.sh/release-name: arroyo
                    meta.helm.sh/release-namespace: default
Selector:           app=arroyo-controller
Replicas:           1 desired | 1 updated | 1 total | 1 available | 0 unavailable
StrategyType:       Recreate
MinReadySeconds:    0
Pod Template:
  Labels:           app=arroyo-controller
                    app.kubernetes.io/instance=arroyo
                    app.kubernetes.io/managed-by=Helm
                    app.kubernetes.io/name=arroyo
                    app.kubernetes.io/version=0.5.1
                    helm.sh/chart=arroyo-0.5.1
  Annotations:      prometheus.io/path: /metrics
                    prometheus.io/port: 9191
                    prometheus.io/scrape: true
  Service Account:  arroyo
  Containers:
   arroyo-controller:
    Image:       ghcr.io/arroyosystems/arroyo-services:0.5.1
    Ports:       9190/TCP, 9191/TCP
    Host Ports:  0/TCP, 0/TCP
    Args:
      controller
    Requests:
      cpu:      1
      memory:   2Gi
    Liveness:   http-get http://:admin/status delay=0s timeout=1s period=10s #success=1 #failure=3
    Readiness:  http-get http://:admin/status delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:
      K8S_WORKER_SERVICE_ACCOUNT_NAME:  arroyo
      S3_BUCKET:                        serversstack-arroyoarroyobucket05f593c6-yadx1dmgn3rh
      S3_REGION:                        us-west-1
      DATABASE_HOST:                    arroyo-postgresql.default.svc.cluster.local
      DATABASE_PORT:                    5432
      DATABASE_NAME:                    arroyo
      DATABASE_USER:                    arroyo
      DATABASE_PASSWORD:                <set to the key 'password' in secret 'arroyo-postgresql'>  Optional: false
      CONTROLLER_ADDR:                  http://arroyo-controller:9190
      REMOTE_COMPILER_ENDPOINT:         http://arroyo-compiler:9000
      SCHEDULER:                        kubernetes
      K8S_NAMESPACE:                     (v1:metadata.namespace)
      K8S_WORKER_NAME:                  arroyo
      K8S_WORKER_LABELS:                helm.sh/chart: arroyo-0.5.1
                                        app.kubernetes.io/name: arroyo
                                        app.kubernetes.io/instance: arroyo
                                        app.kubernetes.io/version: "0.5.1"
                                        app.kubernetes.io/managed-by: Helm
      K8S_WORKER_ANNOTATIONS:           prometheus.io/path: /metrics
                                        prometheus.io/port: "6901"
                                        prometheus.io/scrape: "true"
      K8S_WORKER_IMAGE:                 ghcr.io/arroyosystems/arroyo-worker:0.5.1
      K8S_WORKER_IMAGE_PULL_POLICY:     IfNotPresent
      K8S_WORKER_RESOURCES:             limits: {}
                                        requests:
                                          cpu: 400m
                                          memory: 200Mi
      K8S_WORKER_SLOTS:
    Mounts:                             <none>
  Volumes:                              <none>
Conditions:
  Type           Status  Reason
  ----           ------  ------
  Available      True    MinimumReplicasAvailable
  Progressing    True    NewReplicaSetAvailable
OldReplicaSets:  arroyo-controller-794658d686 (0/0 replicas created)
NewReplicaSet:   arroyo-controller-5ff5c645d4 (1/1 replicas created)
Events:          <none>

According to the S3 request logs, the failing request was not made with the arroyo service account's role. The requester was the EKS node group's instance role:

#	requester
1	arn:aws:sts::586927300535:assumed-role/ServersStack-EKSClusterclusterNodegroupClusterNode-QFIH2CJ3RDA2/i-0dced49d58c01a152

A successful request looked like this:

#	requester
1	arn:aws:sts::586927300535:assumed-role/ServersStack-ArroyoArroyoSARole09B95647-I86TNKR6N78N/WebIdentitySession
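
To make the difference between the two requesters easier to spot, here is a minimal Rust sketch that splits an STS assumed-role ARN into account, role name and session name. The names `AssumedRole`, `parse_assumed_role` and `looks_like_node_role` are made up for this example and are not part of Arroyo or any AWS library. Treating a session name that starts with "i-" as an EC2 instance role is only a heuristic based on the two log lines above.

```rust
/// Pieces of an STS assumed-role ARN of the form
/// arn:aws:sts::<account>:assumed-role/<role-name>/<session-name>
#[derive(Debug, PartialEq)]
pub struct AssumedRole<'a> {
    pub account: &'a str,
    pub role: &'a str,
    pub session: &'a str,
}

/// Returns None if the string is not an STS assumed-role ARN.
pub fn parse_assumed_role(arn: &str) -> Option<AssumedRole<'_>> {
    // Fields: arn : partition : service : region (empty for sts) : account : resource
    let mut parts = arn.splitn(6, ':');
    if parts.next()? != "arn" {
        return None;
    }
    let _partition = parts.next()?;
    if parts.next()? != "sts" {
        return None;
    }
    let _region = parts.next()?;
    let account = parts.next()?;
    let resource = parts.next()?;

    // The resource part is "assumed-role/<role-name>/<session-name>".
    let rest = resource.strip_prefix("assumed-role/")?;
    let (role, session) = rest.rsplit_once('/')?;
    Some(AssumedRole { account, role, session })
}

/// Heuristic (an assumption, not an AWS guarantee): sessions created from an
/// EC2 instance profile are named after the instance ID, e.g. "i-0dced...".
pub fn looks_like_node_role(r: &AssumedRole<'_>) -> bool {
    r.session.starts_with("i-")
}
```

Run against the two requesters above, the first parses to session `i-0dced49d58c01a152` (the node's instance role) and the second to session `WebIdentitySession` (the service account role).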

Rusoto does not automatically use the IAM role attached to the Kubernetes service account. See:
https://github.com/rusoto/rusoto/blob/master/AWS-CREDENTIALS.md
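
For anyone debugging the same thing: on EKS, a pod whose service account is bound to an IAM role gets the environment variables AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE injected. Below is a rough, std-only Rust sketch of checking for them at startup, so the process can tell whether web identity credentials are available before it falls back to the node's role. `WebIdentityEnv`, `web_identity_from` and `web_identity_from_process_env` are hypothetical names for this example, not Arroyo or Rusoto APIs. As far as I know the rusoto_sts crate has a WebIdentityProvider with a from_k8s_env constructor that reads these same variables, but verify that against the Rusoto version in use.

```rust
use std::env;

/// Environment variables injected into pods that use an IAM role
/// through their Kubernetes service account on EKS.
pub const ROLE_ARN_VAR: &str = "AWS_ROLE_ARN";
pub const TOKEN_FILE_VAR: &str = "AWS_WEB_IDENTITY_TOKEN_FILE";

/// Hypothetical holder for the two settings (name invented for this sketch).
#[derive(Debug, PartialEq)]
pub struct WebIdentityEnv {
    pub role_arn: String,
    pub token_file: String,
}

/// Returns Some only if both variables are present and non-empty.
/// The lookup is passed in so the logic can be tested without
/// touching the real process environment.
pub fn web_identity_from<F>(lookup: F) -> Option<WebIdentityEnv>
where
    F: Fn(&str) -> Option<String>,
{
    let role_arn = lookup(ROLE_ARN_VAR).filter(|v| !v.is_empty())?;
    let token_file = lookup(TOKEN_FILE_VAR).filter(|v| !v.is_empty())?;
    Some(WebIdentityEnv { role_arn, token_file })
}

/// Convenience wrapper that reads the real process environment.
pub fn web_identity_from_process_env() -> Option<WebIdentityEnv> {
    web_identity_from(|key| env::var(key).ok())
}
```

If `web_identity_from_process_env()` returns None inside the controller pod, the service account annotation or the webhook injection is the first thing to check. If it returns Some but requests still go out under the node role, the credential provider in the client is the likelier culprit.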