Reconciler error
cameronbraid opened this issue · comments
Operator version: I get the same error with both ghcr.io/risingwavelabs/risingwave-operator:latest
and ghcr.io/risingwavelabs/risingwave-operator:v0.2.5
2023-02-18T09:13:55Z ERROR Reconciler error {"controller": "risingwave", "controllerGroup": "risingwave.risingwavelabs.com", "controllerKind": "RisingWave", "RisingWave": {"name":"risingwave","namespace":"drivenow-staging-z"}, "namespace": "drivenow-staging-z", "name": "risingwave", "reconcileID": "2c7919aa-d406-4966-a6ca-9c11824845ab", "error": "unable to sync meta service: Service \"risingwave-meta\" is invalid: [spec.clusterIPs[0]: Invalid value: []string(nil): primary clusterIP can not be unset, spec.ipFamilies[0]: Invalid value: []core.IPFamily(nil): primary ipFamily can not be unset, spec.ports[0].port: Invalid value: 0: must be between 1 and 65535, inclusive, spec.ports[1].port: Invalid value: 0: must be between 1 and 65535, inclusive, spec.ports[2].port: Invalid value: 0: must be between 1 and 65535, inclusive, spec.ports[1]: Duplicate value: core.ServicePort{Name:\"\", Protocol:\"TCP\", AppProtocol:(*string)(nil), Port:0, TargetPort:intstr.IntOrString{Type:0, IntVal:0, StrVal:\"\"}, NodePort:0}, spec.ports[2]: Duplicate value: core.ServicePort{Name:\"\", Protocol:\"TCP\", AppProtocol:(*string)(nil), Port:0, TargetPort:intstr.IntOrString{Type:0, IntVal:0, StrVal:\"\"}, NodePort:0}]; unable to sync compute service: Service \"risingwave-compute\" is invalid: [spec.clusterIPs[0]: Invalid value: []string(nil): primary clusterIP can not be unset, spec.ipFamilies[0]: Invalid value: []core.IPFamily(nil): primary ipFamily can not be unset, spec.ports[0].port: Invalid value: 0: must be between 1 and 65535, inclusive, spec.ports[1].port: Invalid value: 0: must be between 1 and 65535, inclusive, spec.ports[1]: Duplicate value: core.ServicePort{Name:\"\", Protocol:\"TCP\", AppProtocol:(*string)(nil), Port:0, TargetPort:intstr.IntOrString{Type:0, IntVal:0, StrVal:\"\"}, NodePort:0}]; unable to sync compactor service: Service \"risingwave-compactor\" is invalid: [spec.clusterIPs[0]: Invalid value: []string(nil): primary clusterIP can not be unset, spec.ipFamilies[0]: 
Invalid value: []core.IPFamily(nil): primary ipFamily can not be unset, spec.ports[0].port: Invalid value: 0: must be between 1 and 65535, inclusive, spec.ports[1].port: Invalid value: 0: must be between 1 and 65535, inclusive, spec.ports[1]: Duplicate value: core.ServicePort{Name:\"\", Protocol:\"TCP\", AppProtocol:(*string)(nil), Port:0, TargetPort:intstr.IntOrString{Type:0, IntVal:0, StrVal:\"\"}, NodePort:0}]; unable to sync object: Deployment.apps \"risingwave-compactor\" is invalid: [spec.template.spec.containers[0].ports[0].containerPort: Required value, spec.template.spec.containers[0].ports[1].containerPort: Required value]; unable to sync frontend service: Service \"risingwave-frontend\" is invalid: [spec.clusterIPs[0]: Invalid value: []string(nil): primary clusterIP can not be unset, spec.ipFamilies[0]: Invalid value: []core.IPFamily(nil): primary ipFamily can not be unset, spec.ports[0].port: Invalid value: 0: must be between 1 and 65535, inclusive, spec.ports[1].port: Invalid value: 0: must be between 1 and 65535, inclusive, spec.ports[1]: Duplicate value: core.ServicePort{Name:\"\", Protocol:\"TCP\", AppProtocol:(*string)(nil), Port:0, TargetPort:intstr.IntOrString{Type:0, IntVal:0, StrVal:\"\"}, NodePort:0}]; unable to sync object: Deployment.apps \"risingwave-frontend\" is invalid: [spec.template.spec.containers[0].ports[0].containerPort: Required value, spec.template.spec.containers[0].ports[1].containerPort: Required value]; unable to sync connector service: Service \"risingwave-connector\" is invalid: [spec.ports[0].port: Invalid value: 0: must be between 1 and 65535, inclusive, spec.ports[1].port: Invalid value: 0: must be between 1 and 65535, inclusive, spec.ports[1]: Duplicate value: core.ServicePort{Name:\"\", Protocol:\"TCP\", AppProtocol:(*string)(nil), Port:0, TargetPort:intstr.IntOrString{Type:0, IntVal:0, StrVal:\"\"}, NodePort:0}]; unable to sync meta service: Service \"risingwave-meta\" is invalid: [spec.clusterIPs[0]: Invalid value: 
[]string(nil): primary clusterIP can not be unset, spec.ipFamilies[0]: Invalid value: []core.IPFamily(nil): primary ipFamily can not be unset, spec.ports[0].port: Invalid value: 0: must be between 1 and 65535, inclusive, spec.ports[1].port: Invalid value: 0: must be between 1 and 65535, inclusive, spec.ports[2].port: Invalid value: 0: must be between 1 and 65535, inclusive, spec.ports[1]: Duplicate value: core.ServicePort{Name:\"\", Protocol:\"TCP\", AppProtocol:(*string)(nil), Port:0, TargetPort:intstr.IntOrString{Type:0, IntVal:0, StrVal:\"\"}, NodePort:0}, spec.ports[2]: Duplicate value: core.ServicePort{Name:\"\", Protocol:\"TCP\", AppProtocol:(*string)(nil), Port:0, TargetPort:intstr.IntOrString{Type:0, IntVal:0, StrVal:\"\"}, NodePort:0}]; unable to sync compute service: Service \"risingwave-compute\" is invalid: [spec.clusterIPs[0]: Invalid value: []string(nil): primary clusterIP can not be unset, spec.ipFamilies[0]: Invalid value: []core.IPFamily(nil): primary ipFamily can not be unset, spec.ports[0].port: Invalid value: 0: must be between 1 and 65535, inclusive, spec.ports[1].port: Invalid value: 0: must be between 1 and 65535, inclusive, spec.ports[1]: Duplicate value: core.ServicePort{Name:\"\", Protocol:\"TCP\", AppProtocol:(*string)(nil), Port:0, TargetPort:intstr.IntOrString{Type:0, IntVal:0, StrVal:\"\"}, NodePort:0}]; unable to sync compactor service: Service \"risingwave-compactor\" is invalid: [spec.clusterIPs[0]: Invalid value: []string(nil): primary clusterIP can not be unset, spec.ipFamilies[0]: Invalid value: []core.IPFamily(nil): primary ipFamily can not be unset, spec.ports[0].port: Invalid value: 0: must be between 1 and 65535, inclusive, spec.ports[1].port: Invalid value: 0: must be between 1 and 65535, inclusive, spec.ports[1]: Duplicate value: core.ServicePort{Name:\"\", Protocol:\"TCP\", AppProtocol:(*string)(nil), Port:0, TargetPort:intstr.IntOrString{Type:0, IntVal:0, StrVal:\"\"}, NodePort:0}]; unable to sync object: Deployment.apps 
\"risingwave-compactor\" is invalid: [spec.template.spec.containers[0].ports[0].containerPort: Required value, spec.template.spec.containers[0].ports[1].containerPort: Required value]; unable to sync frontend service: Service \"risingwave-frontend\" is invalid: [spec.clusterIPs[0]: Invalid value: []string(nil): primary clusterIP can not be unset, spec.ipFamilies[0]: Invalid value: []core.IPFamily(nil): primary ipFamily can not be unset, spec.ports[0].port: Invalid value: 0: must be between 1 and 65535, inclusive, spec.ports[1].port: Invalid value: 0: must be between 1 and 65535, inclusive, spec.ports[1]: Duplicate value: core.ServicePort{Name:\"\", Protocol:\"TCP\", AppProtocol:(*string)(nil), Port:0, TargetPort:intstr.IntOrString{Type:0, IntVal:0, StrVal:\"\"}, NodePort:0}]; unable to sync object: Deployment.apps \"risingwave-frontend\" is invalid: [spec.template.spec.containers[0].ports[0].containerPort: Required value, spec.template.spec.containers[0].ports[1].containerPort: Required value]; unable to sync connector service: Service \"risingwave-connector\" is invalid: [spec.ports[0].port: Invalid value: 0: must be between 1 and 65535, inclusive, spec.ports[1].port: Invalid value: 0: must be between 1 and 65535, inclusive, spec.ports[1]: Duplicate value: core.ServicePort{Name:\"\", Protocol:\"TCP\", AppProtocol:(*string)(nil), Port:0, TargetPort:intstr.IntOrString{Type:0, IntVal:0, StrVal:\"\"}, NodePort:0}]", "errorCauses": [{"error": "unable to sync meta service: Service \"risingwave-meta\" is invalid: [spec.clusterIPs[0]: Invalid value: []string(nil): primary clusterIP can not be unset, spec.ipFamilies[0]: Invalid value: []core.IPFamily(nil): primary ipFamily can not be unset, spec.ports[0].port: Invalid value: 0: must be between 1 and 65535, inclusive, spec.ports[1].port: Invalid value: 0: must be between 1 and 65535, inclusive, spec.ports[2].port: Invalid value: 0: must be between 1 and 65535, inclusive, spec.ports[1]: Duplicate value: 
core.ServicePort{Name:\"\", Protocol:\"TCP\", AppProtocol:(*string)(nil), Port:0, TargetPort:intstr.IntOrString{Type:0, IntVal:0, StrVal:\"\"}, NodePort:0}, spec.ports[2]: Duplicate value: core.ServicePort{Name:\"\", Protocol:\"TCP\", AppProtocol:(*string)(nil), Port:0, TargetPort:intstr.IntOrString{Type:0, IntVal:0, StrVal:\"\"}, NodePort:0}]"}, {"error": "unable to sync compute service: Service \"risingwave-compute\" is invalid: [spec.clusterIPs[0]: Invalid value: []string(nil): primary clusterIP can not be unset, spec.ipFamilies[0]: Invalid value: []core.IPFamily(nil): primary ipFamily can not be unset, spec.ports[0].port: Invalid value: 0: must be between 1 and 65535, inclusive, spec.ports[1].port: Invalid value: 0: must be between 1 and 65535, inclusive, spec.ports[1]: Duplicate value: core.ServicePort{Name:\"\", Protocol:\"TCP\", AppProtocol:(*string)(nil), Port:0, TargetPort:intstr.IntOrString{Type:0, IntVal:0, StrVal:\"\"}, NodePort:0}]"}, {"error": "unable to sync compactor service: Service \"risingwave-compactor\" is invalid: [spec.clusterIPs[0]: Invalid value: []string(nil): primary clusterIP can not be unset, spec.ipFamilies[0]: Invalid value: []core.IPFamily(nil): primary ipFamily can not be unset, spec.ports[0].port: Invalid value: 0: must be between 1 and 65535, inclusive, spec.ports[1].port: Invalid value: 0: must be between 1 and 65535, inclusive, spec.ports[1]: Duplicate value: core.ServicePort{Name:\"\", Protocol:\"TCP\", AppProtocol:(*string)(nil), Port:0, TargetPort:intstr.IntOrString{Type:0, IntVal:0, StrVal:\"\"}, NodePort:0}]"}, {"error": "unable to sync object: Deployment.apps \"risingwave-compactor\" is invalid: [spec.template.spec.containers[0].ports[0].containerPort: Required value, spec.template.spec.containers[0].ports[1].containerPort: Required value]"}, {"error": "unable to sync frontend service: Service \"risingwave-frontend\" is invalid: [spec.clusterIPs[0]: Invalid value: []string(nil): primary clusterIP can not be unset, 
spec.ipFamilies[0]: Invalid value: []core.IPFamily(nil): primary ipFamily can not be unset, spec.ports[0].port: Invalid value: 0: must be between 1 and 65535, inclusive, spec.ports[1].port: Invalid value: 0: must be between 1 and 65535, inclusive, spec.ports[1]: Duplicate value: core.ServicePort{Name:\"\", Protocol:\"TCP\", AppProtocol:(*string)(nil), Port:0, TargetPort:intstr.IntOrString{Type:0, IntVal:0, StrVal:\"\"}, NodePort:0}]"}, {"error": "unable to sync object: Deployment.apps \"risingwave-frontend\" is invalid: [spec.template.spec.containers[0].ports[0].containerPort: Required value, spec.template.spec.containers[0].ports[1].containerPort: Required value]"}, {"error": "unable to sync connector service: Service \"risingwave-connector\" is invalid: [spec.ports[0].port: Invalid value: 0: must be between 1 and 65535, inclusive, spec.ports[1].port: Invalid value: 0: must be between 1 and 65535, inclusive, spec.ports[1]: Duplicate value: core.ServicePort{Name:\"\", Protocol:\"TCP\", AppProtocol:(*string)(nil), Port:0, TargetPort:intstr.IntOrString{Type:0, IntVal:0, StrVal:\"\"}, NodePort:0}]"}, {"error": "unable to sync meta service: Service \"risingwave-meta\" is invalid: [spec.clusterIPs[0]: Invalid value: []string(nil): primary clusterIP can not be unset, spec.ipFamilies[0]: Invalid value: []core.IPFamily(nil): primary ipFamily can not be unset, spec.ports[0].port: Invalid value: 0: must be between 1 and 65535, inclusive, spec.ports[1].port: Invalid value: 0: must be between 1 and 65535, inclusive, spec.ports[2].port: Invalid value: 0: must be between 1 and 65535, inclusive, spec.ports[1]: Duplicate value: core.ServicePort{Name:\"\", Protocol:\"TCP\", AppProtocol:(*string)(nil), Port:0, TargetPort:intstr.IntOrString{Type:0, IntVal:0, StrVal:\"\"}, NodePort:0}, spec.ports[2]: Duplicate value: core.ServicePort{Name:\"\", Protocol:\"TCP\", AppProtocol:(*string)(nil), Port:0, TargetPort:intstr.IntOrString{Type:0, IntVal:0, StrVal:\"\"}, NodePort:0}]"}, {"error": 
"unable to sync compute service: Service \"risingwave-compute\" is invalid: [spec.clusterIPs[0]: Invalid value: []string(nil): primary clusterIP can not be unset, spec.ipFamilies[0]: Invalid value: []core.IPFamily(nil): primary ipFamily can not be unset, spec.ports[0].port: Invalid value: 0: must be between 1 and 65535, inclusive, spec.ports[1].port: Invalid value: 0: must be between 1 and 65535, inclusive, spec.ports[1]: Duplicate value: core.ServicePort{Name:\"\", Protocol:\"TCP\", AppProtocol:(*string)(nil), Port:0, TargetPort:intstr.IntOrString{Type:0, IntVal:0, StrVal:\"\"}, NodePort:0}]"}, {"error": "unable to sync compactor service: Service \"risingwave-compactor\" is invalid: [spec.clusterIPs[0]: Invalid value: []string(nil): primary clusterIP can not be unset, spec.ipFamilies[0]: Invalid value: []core.IPFamily(nil): primary ipFamily can not be unset, spec.ports[0].port: Invalid value: 0: must be between 1 and 65535, inclusive, spec.ports[1].port: Invalid value: 0: must be between 1 and 65535, inclusive, spec.ports[1]: Duplicate value: core.ServicePort{Name:\"\", Protocol:\"TCP\", AppProtocol:(*string)(nil), Port:0, TargetPort:intstr.IntOrString{Type:0, IntVal:0, StrVal:\"\"}, NodePort:0}]"}, {"error": "unable to sync object: Deployment.apps \"risingwave-compactor\" is invalid: [spec.template.spec.containers[0].ports[0].containerPort: Required value, spec.template.spec.containers[0].ports[1].containerPort: Required value]"}, {"error": "unable to sync frontend service: Service \"risingwave-frontend\" is invalid: [spec.clusterIPs[0]: Invalid value: []string(nil): primary clusterIP can not be unset, spec.ipFamilies[0]: Invalid value: []core.IPFamily(nil): primary ipFamily can not be unset, spec.ports[0].port: Invalid value: 0: must be between 1 and 65535, inclusive, spec.ports[1].port: Invalid value: 0: must be between 1 and 65535, inclusive, spec.ports[1]: Duplicate value: core.ServicePort{Name:\"\", Protocol:\"TCP\", AppProtocol:(*string)(nil), Port:0, 
TargetPort:intstr.IntOrString{Type:0, IntVal:0, StrVal:\"\"}, NodePort:0}]"}, {"error": "unable to sync object: Deployment.apps \"risingwave-frontend\" is invalid: [spec.template.spec.containers[0].ports[0].containerPort: Required value, spec.template.spec.containers[0].ports[1].containerPort: Required value]"}, {"error": "unable to sync connector service: Service \"risingwave-connector\" is invalid: [spec.ports[0].port: Invalid value: 0: must be between 1 and 65535, inclusive, spec.ports[1].port: Invalid value: 0: must be between 1 and 65535, inclusive, spec.ports[1]: Duplicate value: core.ServicePort{Name:\"\", Protocol:\"TCP\", AppProtocol:(*string)(nil), Port:0, TargetPort:intstr.IntOrString{Type:0, IntVal:0, StrVal:\"\"}, NodePort:0}]"}]}
resource:
apiVersion: risingwave.risingwavelabs.com/v1alpha1
kind: RisingWave
metadata:
  name: risingwave
spec:
  storages:
    meta:
      etcd:
        endpoint: risingwave-etcd:2388
    object:
      minio:
        secret: risingwave-minio-credentials
        endpoint: http://minio.minio.svc.cluster.local:9000
        bucket: risingwave/staging
  global:
    image: ghcr.io/risingwavelabs/risingwave:latest
    imagePullPolicy: IfNotPresent
    nodeSelector:
      node: node02
    resources:
      limits:
        cpu: 2
        memory: 4Gi
      requests:
        cpu: 100m
        memory: 100Mi
    replicas:
      meta: 1
      frontend: 1
      compute: 1
      compactor: 1
Hi @cameronbraid, thanks for reporting the issue. Which version of Kubernetes are you using? The error also happened to me, and it is specific to some older versions of Kubernetes. Could you try version 1.22 or later?
yep, same here.. was on 1.21. I suggest you prevent the operator from creating any resources in a kube version that is too old.
Would save some time in troubleshooting.
Thanks
Cameron
fyi - it's working great in 1.25 :)
> I suggest you prevent the operator from creating any resources in a kube version that is too old.
Thanks! I'll do it after then.
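The suggested guard could look roughly like the following. This is only a sketch of the comparison logic in Python, not the operator's actual code: the function names are hypothetical, and the minimum of 1.22 is an assumption taken from the maintainer's "1.22+" suggestion above. The version string is assumed to have the usual `v<major>.<minor>.<patch>` shape reported by the API server.

```python
import re
from typing import Tuple

# Assumed minimum, based on the "try a version of 1.22+" advice in this thread.
MIN_SUPPORTED: Tuple[int, int] = (1, 22)


def parse_kube_version(git_version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a string such as 'v1.25.6' or 'v1.21.14+k3s1'."""
    match = re.match(r"v?(\d+)\.(\d+)", git_version)
    if match is None:
        raise ValueError(f"unrecognized Kubernetes version: {git_version!r}")
    return int(match.group(1)), int(match.group(2))


def is_supported(git_version: str, minimum: Tuple[int, int] = MIN_SUPPORTED) -> bool:
    """Return True when the cluster version is at least the assumed minimum."""
    return parse_kube_version(git_version) >= minimum


if __name__ == "__main__":
    for version in ("v1.21.14", "v1.25.6"):
        state = "supported" if is_supported(version) else "too old, refuse to create resources"
        print(f"{version}: {state}")
```

A controller would run a check like this once at startup and refuse to reconcile on a cluster that fails it, which surfaces the problem immediately instead of through Service validation errors.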
It's not working great :(
I am getting the same error now on kube 1.25.6
I tried deleting the meta statefulset and it got recreated with invalid ports:
ports:
  - name: service
    containerPort: 0
    protocol: TCP
  - name: metrics
    containerPort: 0
    protocol: TCP
  - name: dashboard
    containerPort: 0
    protocol: TCP
I'll attach the logs of the operator. What I did was scale the operator to 0, delete the meta statefulset, scale the operator back to 1, and download the logs
logs-from-manager-in-risingwave-operator-controller-manager-dd7b8b6cd-d8wsb.log
@cameronbraid In case you still have this cluster, could you please attach the following information:
# operator logs
k logs -n risingwave-operator-system risingwave-operator-controller-manager-86cb5f4fb8-l2mft > operator.log
# a description of your service
k get service risingwave-meta -o yaml > service.yaml
# a description of your statefulset
k get statefulset <meta-statefulset> -o yaml > meta.yaml
(GitHub wouldn't allow attaching a .yaml file)
fyi, to get it working I manually edited the statefulset to set the ports correctly
the operator image is ghcr.io/risingwavelabs/risingwave-operator:v0.2.5
@arkbriar I looked into it only briefly, but my suspicion is that NewMetaService() may be incorrect. We get Port: 0, which is just the default value. Maybe this is because the corresponding field is optional and is not set? We could introduce custom defaults for that case.
Thank you for your help @cameronbraid. I will try to look into this issue in the next couple of days. Please let me know if you have any more helpful information. I have never run into this one so far.
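To see why a zero-valued port produces both kinds of message in the log (the range error and the "Duplicate value" error), here is a simplified imitation of the Service port checks in Python. It is a sketch for illustration only, not the API server's implementation: the function name is hypothetical, and the duplicate key is assumed to be the (protocol, port) pair, as the `core.ServicePort{...}` values in the log suggest.

```python
from typing import Dict, List, Set, Tuple


def validate_service_ports(ports: List[Dict]) -> List[str]:
    """Return validation messages for a list of Service port entries."""
    errors: List[str] = []
    seen: Set[Tuple[str, int]] = set()
    for index, entry in enumerate(ports):
        port = entry.get("port", 0)  # an unset port is the zero value
        protocol = entry.get("protocol", "TCP")
        if not 1 <= port <= 65535:
            errors.append(
                f"spec.ports[{index}].port: Invalid value: {port}: "
                "must be between 1 and 65535, inclusive"
            )
        key = (protocol, port)
        if key in seen:
            # Several unset ports all collapse to (TCP, 0), so they collide.
            errors.append(f"spec.ports[{index}]: Duplicate value")
        seen.add(key)
    return errors


if __name__ == "__main__":
    # Three unset ports, as generated for the meta service in this issue.
    for message in validate_service_ports([{}, {}, {}]):
        print(message)
```

With three unset ports this yields three range errors and two duplicate errors, the same counts reported for the risingwave-meta Service in the log above.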
could it be related to #360 ?
I found something that may be related
I created a kind cluster and deployed the controller and a RW resource. After the RW resource was created and the RW cluster was provisioned, I looked at the in-cluster RW resource YAML and found this:
...
spec:
  components:
    compactor:
      ports:
        metrics: 1260
        service: 6660
    compute:
      ports:
        metrics: 1222
        service: 5688
    frontend:
      ports:
        metrics: 8080
        service: 4567
    meta:
      ports:
        dashboard: 5691
        metrics: 1250
        service: 5690
...
This doesn't exist within the yaml I used to create it (https://raw.githubusercontent.com/risingwavelabs/risingwave-operator/main/docs/manifests/risingwave/risingwave-in-memory.yaml)
It also doesn't exist in my other cluster, where I am getting this issue. Here is the RW resource YAML from there.
Two things to note: the spec.components tree is missing, and the status shows Running=false.
apiVersion: risingwave.risingwavelabs.com/v1alpha1
kind: RisingWave
metadata:
  annotations:
    kapp.k14s.io/identity: v1;drivenow-staging-z/risingwave.risingwavelabs.com/RisingWave/risingwave;risingwave.risingwavelabs.com/v1alpha1
    kapp.k14s.io/original: '{"apiVersion":"risingwave.risingwavelabs.com/v1alpha1","kind":"RisingWave","metadata":{"annotations":{"meta.helm.sh/release-name":"risingwave"},"labels":{"gitops.jenkins-x.io/pipeline":"namespaces","kapp.k14s.io/app":"1611641776429462175","kapp.k14s.io/association":"v1.b31f6902351713787f49f4d08bb8a038"},"name":"risingwave","namespace":"drivenow-staging-z"},"spec":{"global":{"image":"ghcr.io/risingwavelabs/risingwave:latest","imagePullPolicy":"IfNotPresent","nodeSelector":{"node":"node02"},"podTemplate":"drivenow","replicas":{"compactor":1,"compute":1,"frontend":1,"meta":1},"resources":{"limits":{"cpu":2,"memory":"4Gi"},"requests":{"cpu":"100m","memory":"100Mi"}}},"storages":{"meta":{"etcd":{"endpoint":"risingwave-etcd:2379","secret":"risingwave-etcd-auth"}},"object":{"minio":{"bucket":"risingwave-staging","endpoint":"minio.minio.svc.cluster.local:9000","secret":"risingwave-minio-credentials"}}}}}'
    kapp.k14s.io/original-diff-md5: 80f4b4bc344be2ab44201e763040b235
    meta.helm.sh/release-name: risingwave
  creationTimestamp: "2023-02-22T02:41:49Z"
  generation: 2
  labels:
    gitops.jenkins-x.io/pipeline: namespaces
    kapp.k14s.io/app: "1611641776429462175"
    kapp.k14s.io/association: v1.b31f6902351713787f49f4d08bb8a038
  name: risingwave
  namespace: drivenow-staging-z
  resourceVersion: "1460700225"
  uid: c4ecf740-34c5-4d96-94fa-8bd73f333296
spec:
  enableOpenKruise: false
  global:
    image: ghcr.io/risingwavelabs/risingwave:latest
    imagePullPolicy: IfNotPresent
    nodeSelector:
      node: node02
    podTemplate: drivenow
    replicas:
      compactor: 1
      compute: 1
      frontend: 1
      meta: 1
    resources:
      limits:
        cpu: 2
        memory: 4Gi
      requests:
        cpu: 100m
        memory: 100Mi
    serviceType: ClusterIP
  storages:
    meta:
      etcd:
        endpoint: risingwave-etcd:2379
        secret: risingwave-etcd-auth
    object:
      minio:
        bucket: risingwave-staging
        endpoint: minio.minio.svc.cluster.local:9000
        secret: risingwave-minio-credentials
status:
  componentReplicas:
    compactor:
      groups:
      - exists: true
        name: ""
        running: 1
        target: 1
      running: 1
      target: 1
    compute:
      groups:
      - exists: true
        name: ""
        running: 1
        target: 1
      running: 1
      target: 1
    frontend:
      groups:
      - exists: true
        name: ""
        running: 1
        target: 1
      running: 1
      target: 1
    meta:
      groups:
      - exists: true
        name: ""
        running: 1
        target: 1
      running: 1
      target: 1
  conditions:
  - lastTransitionTime: "2023-02-22T02:41:51Z"
    status: "True"
    type: Initializing
  - lastTransitionTime: null
    status: "False"
    type: Running
  observedGeneration: 2
  storages:
    meta:
      type: Etcd
    object:
      type: MinIO
Also, in my broken cluster, if I add the spec.components data to my RW resource, this fixes the controller issue,
so it looks like the root cause has to do with the Initializing phase
> could it be related to #360 ?
I guess so. The ports are initialized by the webhooks only when the object is created. So upgrading the CRDs might leave the new fields with default values. We will find a way to fix it.
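One way to picture the missing step is a defaulting pass that also runs on existing objects, not only at creation. The Python sketch below is an illustration under assumptions, not the operator's fix: the function name is hypothetical, the resource is modeled as a plain dict, and the port numbers are simply the values observed on the fresh kind cluster earlier in this thread, treated here as assumed defaults.

```python
import copy

# Values copied from the kind cluster output earlier in this thread.
# Treating them as the defaults is an assumption made for this sketch.
ASSUMED_DEFAULT_PORTS = {
    "meta": {"service": 5690, "metrics": 1250, "dashboard": 5691},
    "frontend": {"service": 4567, "metrics": 8080},
    "compute": {"service": 5688, "metrics": 1222},
    "compactor": {"service": 6660, "metrics": 1260},
}


def fill_missing_ports(resource: dict) -> dict:
    """Return a copy of the resource with unset (missing or 0) ports filled in."""
    patched = copy.deepcopy(resource)
    components = patched.setdefault("spec", {}).setdefault("components", {})
    for component, defaults in ASSUMED_DEFAULT_PORTS.items():
        ports = components.setdefault(component, {}).setdefault("ports", {})
        for name, value in defaults.items():
            if not ports.get(name):  # missing key or zero value
                ports[name] = value
    return patched


if __name__ == "__main__":
    # A resource created before the port fields existed has no spec.components.
    old_resource = {"spec": {"global": {"replicas": {"meta": 1}}}}
    print(fill_missing_ports(old_resource)["spec"]["components"]["meta"]["ports"])
```

Ports that a user has already set are left untouched; only missing or zero entries are replaced, which mirrors the manual workaround of adding spec.components to the resource.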
It should be fixed. I'm here to close the issue. Feel free to re-open it if it happens again on the latest version.