APIService configured incorrectly
dmcstravick7 opened this issue · comments
Issue
- The metrics-server APIService `v1beta1.external.metrics.k8s.io` is failing to come up (`ServiceNotFound`).
- All KEDA resources are deployed as `keda-operator*`; in this case that includes `keda-operator-metrics-apiserver`.
- The spec for `/keda/config/metrics-server/api_service.yaml` contains a `spec.service.name` of `keda-metrics-apiserver`. ^ Issue
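For reference, the mismatch looks roughly like this in the rendered APIService manifest (a sketch: the service name is the point at issue, while the namespace and other fields are illustrative):

```yaml
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  name: v1beta1.external.metrics.k8s.io
spec:
  group: external.metrics.k8s.io
  version: v1beta1
  service:
    # The deployed Service is named keda-operator-metrics-apiserver,
    # so this reference resolves to nothing (ServiceNotFound):
    name: keda-metrics-apiserver
    namespace: kube-system
```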
Fix
- Edit the APIService and update `spec.service.name` to `keda-operator-metrics-apiserver`, e.g.:
  `kubectl edit apiservice v1beta1.external.metrics.k8s.io`
- Change `spec.service.name` from `keda-metrics-apiserver` to `keda-operator-metrics-apiserver`.
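Instead of an interactive `kubectl edit`, the same fix can be applied non-interactively with a merge patch (a sketch; it obviously requires access to the affected cluster):

```
kubectl patch apiservice v1beta1.external.metrics.k8s.io \
  --type merge \
  -p '{"spec":{"service":{"name":"keda-operator-metrics-apiserver"}}}'
```

Note that anything applied this way will be reverted on the next Helm upgrade unless the chart itself is fixed.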
Steps to reproduce
- Install KEDA via Helm chart 2.11.1 (note: this has also happened with previous versions)
- Running any kubectl command such as `kubectl get pods` returns the below error (it also returns the pods):

```
57503 memcache.go:287] couldn't get resource list for external.metrics.k8s.io/v1beta1: the server is currently unable to handle the request
```
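The failing APIService can be inspected directly to confirm this failure mode (a cluster-dependent check, so exact output will vary; the jsonpath filter prints the configured service name and the `Available` condition message):

```
kubectl get apiservice v1beta1.external.metrics.k8s.io
kubectl get apiservice v1beta1.external.metrics.k8s.io \
  -o jsonpath='{.spec.service.name}{"\n"}{.status.conditions[?(@.type=="Available")].message}'
```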
Notes
- Could this be due to me not setting some value in the chart that I am unaware of?
Originally posted by @dmcstravick7 in kedacore/keda#4769
Is this a Helm issue or KEDA deployment itself? Always happy to review PRs.
I think it's a KEDA issue, as the Kubernetes manifest specifies the wrong name, which is then picked up by Helm in my case.
I'll create a PR; I just want to be very sure that this is an issue for someone other than me.
Yes, but the main question is do we need to update the Helm chart and/or KEDA core which generates the manifests. @zroubalik can you take a look? I keep forgetting where we annotate this.
I am having this same issue too.
Hello,
I have just tried with a fresh cluster using latest helm chart (v2.11.1) and default values and I can't reproduce the issue. In my case, the apiservice points to the correct service.
Could you share your values to test them?
@tomkerkhove, this can only happen with Helm because otherwise the e2e tests wouldn't pass, as the metrics server wouldn't have been reachable, so I'm moving this issue to the charts repo.
Hi @JorTurFer,
I use Terraform to deploy a helm_release, but it uses all default values besides the ones below.
Something unique about my environment: KEDA 2.7.1 is already deployed (not via Helm), and I'm deleting all the resources (leaving only the CRDs) and deploying the helm_release of KEDA 2.11.1. The deployment works fine, and the only issue is the one mentioned in the post.
```yaml
operator:
  replicaCount: 2
prometheus:
  metricServer:
    podMonitor:
      namespace: kube-system
  operator:
    podMonitor:
      namespace: kube-system
    prometheusRules:
      namespace: kube-system
serviceAccount:
  annotations:
    XYZ_ROLE
webhooks:
  enabled: false
```
Recapping:
You have KEDA v2.7.1 installed and you want to migrate to v2.11 but it doesn't work, right? You are using default values + the values above. Is it correct?
After the confirmation, I'll try your scenario again
That's correct, with the context of also switching from vanilla manifests, to now using Helm via Terraform.
I can also provide the kubectl commands I am using to delete the Keda resources (everything except ScaledObjects)
It'd be nice ❤️ I'll try to reproduce your scenario exactly
The below is what I'm running to delete all the required resources, then label and annotate the CRDs so that the Helm deploy can take them over.
Sidenote: the only reason I'm doing it this way is that I will have several ScaledObjects running in production and I don't want to affect/have to re-deploy those (which I think happens if I just run `k delete -f keda-manifest`). Is this accurate?
```shell
k delete ClusterRole keda-operator
k delete ClusterRole keda-external-metrics-reader
k delete RoleBinding keda-auth-reader -n kube-system
k delete ClusterRoleBinding keda-hpa-controller-external-metrics
k delete ClusterRoleBinding keda-operator
k delete ClusterRoleBinding keda-system-auth-delegator
k delete Service keda-metrics-apiserver -n kube-system
k delete Deployment/keda-metrics-apiserver -n kube-system
k delete Deployment/keda-operator -n kube-system
k label --overwrite crd clustertriggerauthentications.keda.sh app.kubernetes.io/managed-by="Helm"
k annotate --overwrite crd clustertriggerauthentications.keda.sh meta.helm.sh/release-name="keda"
k annotate --overwrite crd clustertriggerauthentications.keda.sh meta.helm.sh/release-namespace="kube-system"
k label --overwrite crd scaledjobs.keda.sh app.kubernetes.io/managed-by="Helm"
k annotate --overwrite crd scaledjobs.keda.sh meta.helm.sh/release-name="keda"
k annotate --overwrite crd scaledjobs.keda.sh meta.helm.sh/release-namespace="kube-system"
k label --overwrite crd scaledobjects.keda.sh app.kubernetes.io/managed-by="Helm"
k annotate --overwrite crd scaledobjects.keda.sh meta.helm.sh/release-name="keda"
k annotate --overwrite crd scaledobjects.keda.sh meta.helm.sh/release-namespace="kube-system"
k label --overwrite crd triggerauthentications.keda.sh app.kubernetes.io/managed-by="Helm"
k annotate --overwrite crd triggerauthentications.keda.sh meta.helm.sh/release-name="keda"
k annotate --overwrite crd triggerauthentications.keda.sh meta.helm.sh/release-namespace="kube-system"
k label --overwrite APIService v1beta1.external.metrics.k8s.io app.kubernetes.io/managed-by="Helm"
k annotate --overwrite APIService v1beta1.external.metrics.k8s.io meta.helm.sh/release-name="keda"
k annotate --overwrite APIService v1beta1.external.metrics.k8s.io meta.helm.sh/release-namespace="kube-system"
```
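After relabelling, Helm ownership of each resource can be spot-checked before the upgrade (a cluster-dependent sanity check, not part of the original steps; the escaped dots are required by kubectl's jsonpath syntax for label keys containing dots):

```
k get crd scaledobjects.keda.sh \
  -o jsonpath='{.metadata.labels.app\.kubernetes\.io/managed-by}{"\n"}{.metadata.annotations.meta\.helm\.sh/release-name}'
```

Helm refuses to adopt resources whose `managed-by` label or release annotations don't match, so this is worth confirming per resource.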
I'm going to test it soon (today or tomorrow at the latest), but I have a question in the meantime: why didn't you just upgrade the chart using `helm upgrade`? It doesn't require any deletion and works in one step. I mean, if you have installed KEDA (or any other component) using Helm, you can upgrade it just with `helm upgrade`.
That would be ideal. But currently we're also moving management of helm to Terraform, using the helm_release provider.
AFAIK, that provider supports upgrading out-of-the-box, so you just need to change the `version` field and it will execute `helm upgrade`.
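In `helm_release` terms, the upgrade is just a version bump (a sketch; the release name, namespace, values file, and repository URL here are assumptions for illustration, not taken from the thread):

```hcl
resource "helm_release" "keda" {
  name       = "keda"
  namespace  = "kube-system"
  repository = "https://kedacore.github.io/charts"
  chart      = "keda"
  version    = "2.11.1" # changing this triggers a helm upgrade on the next apply

  values = [file("${path.module}/keda-values.yaml")]
}
```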
Hmm, that might work. I tried that previously but it didn't work, as it was missing all the labels and annotations. Maybe I can just adjust my process: I won't delete any resources, but I'll do all the labelling/annotating and attempt to use Terraform to deploy.
I'll add a comment once I try that. Thanks for your help so far @JorTurFer!
Experienced a similar issue with version 2.11.1: Prometheus was not able to scrape metrics from the metrics server.
When trying to check the application locally using
`kubectl port-forward $(kubectl get pods -l app=keda-operator-metrics-apiserver -n keda -o name) 8080:8080 -n keda`
the connection crashes.
When testing the same chart version but setting this value:

```yaml
image:
  metricsApiServer:
    tag: "2.10.1"
```

everything seems to work properly.
hi @ArieLevs ,
It's a known bug: we removed the Prometheus server by mistake and have already merged the fix; a hotfix release will be cut soon. I'd suggest paying attention to this comment about the metrics.
In any case, I think the problems here are different, because the APIService uses port 6443.