kubecost / cost-analyzer-helm-chart

Kubecost helm chart

Home Page: http://kubecost.com/install

[Bug] Secondaries Aggregator can't be turned off.

passionInfinite opened this issue

Kubecost Helm Chart Version

v2.0.2 and v2.1.0 both

Kubernetes Version

v1.27.7

Kubernetes Platform

AKS

Description

With kubecostAggregator.enabled: false and kubecostAggregator.deployMethod: disabled, nginx fails to load the rendered nginx.conf, so the frontend pod keeps restarting.

nginx.conf fails to load because the aggregator upstream is not defined once the aggregator is disabled, yet location blocks in the conf file still proxy to it. nginx exits with: [emerg] host not found in upstream "aggregator".

I believe all of those location blocks should be rendered based on the aggregator settings. Currently they are hardcoded and assume the aggregator is running.

Example: https://github.com/kubecost/cost-analyzer-helm-chart/blob/v2.1.0/cost-analyzer/templates/cost-analyzer-frontend-config-map-template.yaml#L309
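
The failure mode described above can be checked offline before deploying. The following is a minimal sketch, not part of the chart: it scans rendered nginx config text for proxy_pass hosts that have no matching upstream block. The function name and the sample config (including the 127.0.0.1:9001 address) are hypothetical. Note that nginx would still try to resolve such a host through DNS at startup, so a reported name is only a candidate for the "host not found in upstream" error, not a guaranteed failure.

```python
import re

# Matches `upstream NAME {` declarations.
UPSTREAM_RE = re.compile(r"\bupstream\s+([\w.-]+)\s*\{")
# Matches `proxy_pass http://HOST...` and captures HOST (without port or path).
PROXY_PASS_RE = re.compile(r"\bproxy_pass\s+https?://([^/:;\s]+)")


def undeclared_upstreams(conf_text: str) -> set:
    """Return proxy_pass hosts that have no matching `upstream` block."""
    declared = set(UPSTREAM_RE.findall(conf_text))
    referenced = set(PROXY_PASS_RE.findall(conf_text))
    return referenced - declared


# Hypothetical config: `api` is declared, `aggregator` is not.
SAMPLE = """
upstream api {
    server 127.0.0.1:9001;
}
server {
    location /api/ { proxy_pass http://api/; }
    location /login { proxy_pass http://aggregator/login; }
}
"""

print(undeclared_upstreams(SAMPLE))  # {'aggregator'}
```

Running this against the output of a local template render would flag the aggregator references without needing a cluster.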

Steps to reproduce

  1. Set kubecostAggregator.enabled to false and kubecostAggregator.deployMethod to disabled.
  2. Deploy Kubecost. The frontend pod restarts repeatedly because nginx fails to parse the nginx.conf file.

Expected behavior

I expect everything else to work as is, even when the aggregator is disabled.

Impact

We can't really use any v2.x version.

Screenshots

No response

Logs

/docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration
/docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/
/docker-entrypoint.sh: Launching /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh
10-listen-on-ipv6-by-default.sh: info: can not modify /etc/nginx/conf.d/default.conf (read-only file system?)
/docker-entrypoint.sh: Sourcing /docker-entrypoint.d/15-local-resolvers.envsh
/docker-entrypoint.sh: Launching /docker-entrypoint.d/20-envsubst-on-templates.sh
/docker-entrypoint.sh: Launching /docker-entrypoint.d/30-tune-worker-processes.sh
/docker-entrypoint.sh: Configuration complete; ready for start up
2024/02/23 00:21:55 [emerg] 1#1: host not found in upstream "aggregator" in /etc/nginx/conf.d/default.conf:94
nginx: [emerg] host not found in upstream "aggregator" in /etc/nginx/conf.d/default.conf:94

Slack discussion

No response

Troubleshooting

  • I have read and followed the issue guidelines and this is a bug impacting only the Helm chart.
  • I have searched other issues in this repository and mine is not recorded.

cc @michaelmdresser thoughts here?

@AjayTripathy this is a well-defined bug (thanks @passionInfinite!). I'm guessing that this install path slipped through the cracks when releasing 2.0; this was intended to be a supported configuration.

@michaelmdresser any workarounds until this gets fixed? Happy to contribute to move forward. We really want to move to v2.x with our internal releases.

To work around it, you can edit the nginx.conf file in the frontend container (either with kubectl exec or by editing the configmap before installing it in the cluster) to remove the offending references to the aggregator upstream.
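
As a rough illustration of that workaround, the sketch below removes location blocks that proxy to the aggregator upstream from nginx config text. It is an assumption-laden helper, not something shipped with the chart: the function name is hypothetical, and the regex only handles flat location blocks with no nested braces, which is the shape of the blocks quoted in this thread.

```python
import re

# A `location ... { ... }` block with no nested braces. Nested blocks
# would need a real nginx config parser instead of a regex.
LOCATION_RE = re.compile(r"[ \t]*\blocation\s+[^{}]*\{[^{}]*\}[ \t]*\n?")


def strip_upstream_locations(conf_text: str, upstream: str = "aggregator") -> str:
    """Remove flat location blocks whose proxy_pass targets `upstream`."""
    target = re.compile(
        r"\bproxy_pass\s+https?://" + re.escape(upstream) + r"(?=[/:;\s])"
    )

    def drop_if_targeted(match: re.Match) -> str:
        block = match.group(0)
        return "" if target.search(block) else block

    return LOCATION_RE.sub(drop_if_targeted, conf_text)


# Hypothetical input: one block proxies to the aggregator, one does not.
SAMPLE = """server {
    location /oidc/ {
        proxy_pass http://aggregator/oidc/;
        proxy_redirect off;
    }
    location / {
        root /var/www;
    }
}
"""

cleaned = strip_upstream_locations(SAMPLE)
print(cleaned)
```

Only the /oidc/ block is dropped here; the `location /` block and the surrounding `server` block are left untouched. Review the cleaned output before applying it, since any frontend feature that depended on the removed routes will stop working.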

A contribution would involve identifying the parts of https://github.com/kubecost/cost-analyzer-helm-chart/blob/develop/cost-analyzer/templates/cost-analyzer-frontend-config-map-template.yaml that must be modified when kubecostAggregator.deployMethod=disabled to remove the references to the aggregator upstream -- a quick glance reveals these as likely candidates:

location /oidc/ {
    proxy_connect_timeout 180;
    proxy_send_timeout 180;
    proxy_read_timeout 180;
    proxy_pass http://aggregator/oidc/;
    proxy_redirect off;
    proxy_http_version 1.1;
    proxy_set_header Connection "";
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
}
location /saml/ {
    proxy_connect_timeout 180;
    proxy_send_timeout 180;
    proxy_read_timeout 180;
    proxy_pass http://aggregator/saml/;
    proxy_redirect off;
    proxy_http_version 1.1;
    proxy_set_header Connection "";
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
}
location /login {
    proxy_connect_timeout 180;
    proxy_send_timeout 180;
    proxy_read_timeout 180;
    proxy_pass http://aggregator/login;
    proxy_redirect off;
    proxy_http_version 1.1;
    proxy_set_header Connection "";
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Original-URI $request_uri;
}
location /logout {
    proxy_connect_timeout 180;
    proxy_send_timeout 180;
    proxy_read_timeout 180;
    proxy_pass http://aggregator/logout;
    proxy_redirect off;
    proxy_http_version 1.1;
    proxy_set_header Connection "";
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
}

It's debatable whether or not the frontend container should even exist if Aggregator is disabled -- it is almost certain that the FE will be in a degraded state because it does not have its expected API server available. A more sensible thing might be to disable the frontend entirely when Aggregator is disabled.

@williamkubecost was possibly going to take a shot?

@AjayTripathy Just created a PR #3184

Thank you!! Maybe @williamkubecost can help test/review #3184 ?

@passionInfinite was a little faster than me! more than happy to review 😄

Thank you @williamkubecost !! Let me know if you need anything from my end 😄

I'm unsure how it impacts this, but there is a helm value for disabling components on secondary deployments.
federatedETL.agentOnly=true

It disables Grafana, forecasting service, and the frontend.

@srpomeroy Yes, it disables them, but then the cluster controller doesn't work: it looks for backend APIs, while only the cost-model pod is running and nginx is not available. See kubecost/features-bugs#90

@williamkubecost

Hi, noting here that in v2.2.1 the issue of referencing an 'aggregator' upstream is still present.

This section of the nginx.conf still populates by default if oidc or saml are enabled:

        location /auth {
            proxy_pass http://aggregator/isAuthenticated;
        }

It's coming from this part of the frontend configmap template: https://github.com/kubecost/cost-analyzer-helm-chart/blob/v2.2.1/cost-analyzer/templates/cost-analyzer-frontend-config-map-template.yaml#L355

Is it intended to require the aggregator if saml or oidc are enabled?

I believe it is; cc @jessegoodier to confirm.
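
If SAML/OIDC does require the aggregator, the combination could be caught before install with a pre-flight check on the Helm values. The sketch below is hypothetical and not part of the chart: it takes the values as a plain dict, and the key names oidc.enabled and saml.enabled are assumptions about the values layout (kubecostAggregator.deployMethod is the key named in this issue).

```python
def auth_requires_aggregator(values: dict) -> list:
    """Return problems found when SAML/OIDC is on but the aggregator is off."""

    def get(path: str, default=None):
        # Walk a dotted path through nested dicts, e.g. "saml.enabled".
        node = values
        for key in path.split("."):
            if not isinstance(node, dict) or key not in node:
                return default
            node = node[key]
        return node

    aggregator_off = get("kubecostAggregator.deployMethod") == "disabled"
    problems = []
    for feature in ("oidc", "saml"):  # assumed value keys
        if aggregator_off and get(feature + ".enabled", False):
            problems.append(
                feature + ".enabled is true but kubecostAggregator.deployMethod "
                "is 'disabled'; the rendered /auth location would proxy to the "
                "missing 'aggregator' upstream"
            )
    return problems


# Hypothetical values for a secondary cluster with SAML left enabled.
values = {
    "kubecostAggregator": {"deployMethod": "disabled"},
    "saml": {"enabled": True},
}
for problem in auth_requires_aggregator(values):
    print(problem)
```

An empty list means the check found no conflict; it says nothing about other templates that may also reference the aggregator.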

Thanks for the question @ravnalexquinn
Can you clarify what the goal is?

If I'm following this issue correctly, there was a question on how to disable all optional services on an agent (agentOnly=true)?

Are you saying that you have a SAML/OIDC config that you are also using for agents?

@jessegoodier If we turn off aggregator on secondaries can we use the kubescaler on secondaries?

clusterController is a local-cluster-only utility.
You can run kubecost on the local cluster and also ship the federated metrics. We are happy to help with this.