aws / containers-roadmap

This is the public roadmap for AWS container services (ECS, ECR, Fargate, and EKS).

Home Page: https://aws.amazon.com/about-aws/whats-new/containers/

[EKS] Enable HPA with CloudWatch metrics and alarms

joshuabaird opened this issue · comments

Tell us about your request
Many ECS customers make use of ECS service autoscaling based on CloudWatch metrics and alarms. This functionality is desired in EKS.

Community projects that add this support include https://github.com/chankh/k8s-cloudwatch-adapter

Which service(s) is this request for?
This could be EKS

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
We need to be able to use HPA to autoscale pods based on CloudWatch metrics and alarms.

Are you currently working around this issue?
We use ECS.

You can do this today in EKS by using custom metrics with the Kubernetes metrics server (part of the HPA implementation). These are typically metrics from within the Kubernetes cluster, however, and they perform a similar function to how CloudWatch metrics and alarms work for ECS. I think what you are describing would be to consume CloudWatch metrics as external metrics in the metrics server and use these to trigger scaling.

Can you describe the use case that having this feature would enable?

k8s/EKS has native support for autoscaling both application container replicas and cluster worker nodes based on the cluster's built-in metrics collection. You don't need to integrate external systems like CloudWatch.

However, if you did still want to scale off CloudWatch, e.g. maybe you are triggering scaling inside EKS in response to SQS queue length, then you could trigger a Lambda that tells the EKS k8s API to scale your application up or down (like kubectl scale from the CLI).
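As a rough illustration of the Lambda idea, the scaling decision itself could look something like the sketch below. Everything here is hypothetical: the function names, the 100-messages-per-pod target, and the replica bounds are assumptions, and the `scale` callable stands in for whatever actually talks to the k8s API (for example, a wrapper around the Kubernetes Python client's `AppsV1Api.patch_namespaced_deployment_scale`).

```python
import math
from typing import Callable


def desired_replicas(queue_length: int, messages_per_pod: int,
                     min_replicas: int = 1, max_replicas: int = 10) -> int:
    """Replica count that keeps roughly `messages_per_pod` messages per pod.

    All parameters are illustrative assumptions, not values from this thread.
    """
    if messages_per_pod <= 0:
        raise ValueError("messages_per_pod must be positive")
    wanted = math.ceil(queue_length / messages_per_pod)
    # Clamp so an empty or runaway queue cannot scale to zero or without bound.
    return max(min_replicas, min(max_replicas, wanted))


def handle_alarm(queue_length: int, scale: Callable[[int], None],
                 messages_per_pod: int = 100) -> int:
    """Hypothetical handler body: compute the replica count and apply it.

    `scale` is injected so the decision logic can be tested without a cluster.
    """
    replicas = desired_replicas(queue_length, messages_per_pod)
    scale(replicas)
    return replicas
```

Injecting `scale` keeps the arithmetic separate from cluster access, so the same function can be exercised locally with `print` or a list's `append` before it is wired to a real Deployment.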

The project @joshuabaird mentioned has disappeared. If you want to go the other direction and export internal cluster metrics to CloudWatch, then Istio has an adaptor for exporting traffic metrics, and issue #38 might be relevant.

Sorry - my point was that users migrating from ECS are most likely using CloudWatch metrics to autoscale their ECS services, so this same functionality is desired in EKS. It's an important feature to consider seeing that CloudWatch is a fundamental service in AWS, in my opinion.

Use-case:

We currently push custom metrics to CloudWatch for various things. These metrics are used to autoscale our ECS services. So, ideally, we need similar functionality in EKS otherwise we would need to re-write our entire metrics pipeline for this use case.

Cool @joshuabaird, if you already have CloudWatch metrics then right now you can trigger:

  • Cluster scaling just by scaling the worker node ASG as normal; new nodes automatically join the cluster, and workloads on deleted nodes are automatically rescheduled
  • Service scaling using the CloudWatch -> Lambda -> EKS k8s API approach

The k8s built-in horizontal autoscaler also supports custom metrics, so someone or AWS could implement a CloudWatch metrics adaptor. Then the native k8s service scaling would be driven off CloudWatch metrics. There are existing projects for scaling based on Azure, Google, Datadog and Prometheus metrics. Adding a CloudWatch metrics adaptor seems like a good way to add what you want?
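For context on what a metrics adaptor would feed into: the Kubernetes documentation describes the horizontal autoscaler's core calculation as `desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue)`, skipped when the ratio is within a tolerance of 1.0 (0.1 by default). The sketch below reproduces only that formula; the function name is made up, and the real controller also handles min/max bounds, missing metrics, and stabilization windows.

```python
import math


def hpa_desired_replicas(current_replicas: int, current_value: float,
                         target_value: float, tolerance: float = 0.1) -> int:
    """Simplified version of the documented HPA replica calculation.

    `current_value` would be the metric an adaptor reports (e.g. a CloudWatch
    value); `target_value` is the target set on the HPA object.
    """
    if target_value <= 0:
        raise ValueError("target_value must be positive")
    ratio = current_value / target_value
    # Within tolerance of the target: leave the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)
```

For example, 4 replicas reporting a value of 200 against a target of 100 would be scaled to 8, while a value of 105 falls inside the tolerance and leaves the count at 4.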

Yep, adding a CloudWatch metrics adapter sounds like a solution!

While I understand this is possible using the described workaround, I believe HPA is a standard Kubernetes feature, and metrics-server could be part of the default toolset for a managed Kubernetes cluster.

I'd really like this feature to become available without much setup!

How can I push some of my application's custom metrics to CloudWatch and use them in my HPA?

While the k8s-cloudwatch-adapter and EKS Container Insights are great, the adapter is still a major cluster-level component: if it stops working for any reason, autoscaling stops for all deployments, which is a significant risk.

In ECS, by contrast, autoscaling is completely managed, with nothing to maintain in-house. This is one of the many reliability issues with the AWS-managed k8s platform: it's not truly AWS managed, and a significant chunk of cluster-level components still needs to be managed in-house. If AWS managed such basic components, our team would be much more likely to adopt EKS because of the increased reliability.

Are there any updates on this? I have just spoken to AWS Support, and they indicated that the k8s-cloudwatch-adapter is now unsupported. When can we expect a way to natively scale services based on CloudWatch metrics without having to use a workaround?

Hi, k8s-cloudwatch-adapter seems to be archived, but I can't find what it was archived in favor of.
Could you point me to an alternative, please?

The CloudWatch metrics adapter doesn't seem like a good solution. The major issue is that it uses the GetMetricData API, which has rate limits, so if someone ran a script to fetch metrics for another use case, you might suddenly see your autoscaling stop working or, worse, start scaling down.

The adapter either needs to maintain some state itself, or AWS should extend the HPA with an event-driven approach like the one in ECS, where an alarm triggers an autoscaling action, similar to KEDA.
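To make the "maintain some state" idea concrete, here is a minimal sketch of a cache that limits how often a metric is fetched and falls back to the last known value when a fetch fails (for example, because the call was throttled). This is not how k8s-cloudwatch-adapter works; the class name, the 60-second TTL, and the injected `fetch` callable (which in practice might wrap a GetMetricData call) are all assumptions for illustration.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class CachedMetric:
    """Hypothetical TTL cache with stale fallback for metric lookups."""

    def __init__(self, fetch: Callable[[str], float],
                 ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch          # assumed to raise on throttling/errors
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so tests can control time
        self._cache: Dict[str, Tuple[float, float]] = {}  # name -> (fetched_at, value)

    def get(self, name: str) -> Optional[float]:
        now = self._clock()
        cached = self._cache.get(name)
        # Fresh entry: answer from the cache without calling the API.
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]
        try:
            value = self._fetch(name)
        except Exception:
            # Fetch failed: serve the stale value if there is one, so a
            # throttled call does not look like "metric dropped to nothing".
            return cached[1] if cached is not None else None
        self._cache[name] = (now, value)
        return value
```

Serving a stale value is a trade-off rather than a fix: it avoids a spurious scale-down during throttling, but a long outage would leave scaling decisions pinned to old data, so a real implementation would probably cap how stale a value may get.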