kubernetes / kubernetes

Production-Grade Container Scheduling and Management

Reduce the set of metrics exposed by the kubelet

dashpole opened this issue

Background

As of 1.12, the kubelet exposes a number of metrics endpoints served directly from cAdvisor. These include:

  • The cAdvisor Prometheus metrics endpoint (/metrics/cadvisor)
  • The "direct" cAdvisor JSON endpoints (/stats/, /stats/container, /stats/{podName}/{containerName})
  • The machine info endpoint (/spec/)

The kubelet also exposes the summary API, which is not exposed directly by cAdvisor, but queries cAdvisor as one of its sources for metrics.
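
For reference, the summary API can be queried through the API server's node proxy. Below is a minimal client-go sketch, assuming in-cluster credentials and a placeholder node name; kubectl get --raw "/api/v1/nodes/<node>/proxy/stats/summary" shows the same data:

	package main

	import (
		"context"
		"fmt"

		"k8s.io/client-go/kubernetes"
		"k8s.io/client-go/rest"
	)

	func main() {
		// Assumes in-cluster credentials with permission to proxy to nodes;
		// "node-1" is a placeholder node name.
		cfg, err := rest.InClusterConfig()
		if err != nil {
			panic(err)
		}
		client := kubernetes.NewForConfigOrDie(cfg)

		// GET /api/v1/nodes/node-1/proxy/stats/summary
		raw, err := client.CoreV1().RESTClient().Get().
			Resource("nodes").
			Name("node-1").
			SubResource("proxy").
			Suffix("stats/summary").
			DoRaw(context.TODO())
		if err != nil {
			panic(err)
		}
		fmt.Println(string(raw))
	}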

The Monitoring Architecture documentation describes the path for "core" metrics, and for "monitoring" metrics. The Core Metrics proposal describes the set of metrics that we consider core, and their uses. The motivation for the split architecture is:

  • To minimize the performance impact of stats collection for core metrics, allowing these to be collected more frequently
  • To make the monitoring pipeline replaceable and extensible.

Current kubelet metrics that are not included in core metrics

  • Pod and Node-level Network Metrics
  • Persistent Volume Metrics
  • Container-level (Nvidia) GPU Metrics
  • Node-Level RLimit Metrics
  • Misc Memory Metrics (e.g. PageFaults)
  • Container, Pod, and Node-level Inode metrics (for ephemeral storage)
  • Container, Pod, and Node-level DiskIO metrics (from cAdvisor)

Deprecating and removing the Summary API will require out-of-tree sources for each of these metrics. "Direct" cAdvisor endpoints are not often used, and have even been broken for multiple releases (#62544) without anyone raising an issue.

Working Items

  • [1.13] Introduce Kubelet pod-resources grpc endpoint; KEP: kubernetes/community#2454
  • [1.14] Introduce Kubelet Resource Metrics API
  • [1.15] Deprecate the "direct" cAdvisor API endpoints by adding and deprecating a --enable-cadvisor-json-endpoints flag
  • [1.18] Default the --enable-cadvisor-json-endpoints flag to disabled
  • [1.21] Remove the --enable-cadvisor-json-endpoints flag
  • [1.21] Transition Monitoring Server to Kubelet Resource Metrics API (requires 3 versions skew)
  • [TBD] Propose out-of-tree replacements for kubelet monitoring endpoints
  • [TBD] Deprecate the Summary API and cAdvisor Prometheus endpoints by adding and deprecating a --enable-container-monitoring-endpoints flag
  • [TBD+2] Remove "direct" cAdvisor API endpoints
  • [TBD+2] Default the --enable-container-monitoring-endpoints flag to disabled
  • [TBD+4] Remove the Summary API and cAdvisor Prometheus metrics, and remove the --enable-container-monitoring-endpoints flag.

Open Questions

  • Should the kubelet be a source for any monitoring metrics?
    • For example, metrics about the kubelet itself, or DiskIO metrics for empty-dir volumes (which are "owned" by the kubelet).
  • What will provide the metrics listed above, now that the kubelet no longer does?
    • cAdvisor can provide Network, RLimit, Misc Memory metrics, Inode metrics, and DiskIO metrics.
      • cAdvisor only works for some runtimes, but is a drop-in replacement for "direct" cAdvisor API endpoints
    • Container Runtimes can be a source for container-level Memory, Inode, Network and DiskIO metrics.
    • NVIDIA GPU metrics can be provided by a daemonset published by NVIDIA.
    • No source for Persistent Volume metrics?

/sig node
/sig instrumentation
/kind feature
/priority important-longterm
cc @kubernetes/sig-node-proposals @kubernetes/sig-instrumentation-misc

Shouldn't container/pod metrics be gathered from CRI instead of from cAdvisor? The same goes for info about inode/byte usage of the image store and container root filesystems.

That's strongly runtime-dependent, and we should not expect that cAdvisor will support all types of runtimes supported by CRI (e.g. VM-based runtimes like https://github.com/Mirantis/virtlet/).

cc @brancz

Introduce Core Metrics Kubelet API

Is there a proposal out for this, besides the design/scratch doc I put out a while ago on using Prometheus metrics for this purpose?

I don’t think there is anything else.

Shouldn't container/pod metrics be gathered from CRI instead of from cAdvisor? The same goes for info about inode/byte usage of the image store and container root filesystems.

@jellonek you are correct. That is why I list cAdvisor as "one of its sources for metrics". The other notable source is metrics from the runtime through CRI.
I added the container runtime as a source of metrics that the kubelet won't provide in the future.

I don't think we should introduce the core metrics API as described in the proposal you linked above. It's basically just summary 2.0 in terms of format, which makes it highly kube-specific. I'd much rather just have a stable Prometheus endpoint, as discussed previously. That way, tools that collect metrics are much less likely to have to learn a new format.

EDIT: too much speed reading

I don't think we should introduce the core metrics API as described in the proposal you linked above. It's basically just summary 2.0 in terms of format, which makes it highly kube-specific.

The format of the core metrics API is one of the non-goals listed in the proposal linked above. It just lists the contents of the API. We can debate format during the proposal process.
I intend for this to be an umbrella issue. I am trying to document current state, rough timeline, and gaps I see in terms of which metrics don't have a transition path to the new architecture.

Ah, apologies -- too much speed reading plus trying to recall from the last time I looked at things :-)

hi~ I would like to know how the "direct" cAdvisor API endpoints will be deprecated.
Does that mean first deprecating the direct cAdvisor port in the kubelet, and then removing the vendored cAdvisor code from the kubelet afterwards?

Removing the vendored cAdvisor would require making sure we have out-of-tree sources for all the metrics. Deprecating the "direct" cAdvisor API endpoints is mostly just about deprecating that piece of the API, I believe. Removing the rest will come later.

Should the kubelet be a source for any monitoring metrics?

  • For example, metrics about the kubelet itself, or DiskIO metrics for empty-dir volumes (which are "owned" by the kubelet).

Given that I've been unable to find an accurate source of Prometheus-formatted metrics for empty-dirs, I am an enthusiastic +1 to including these. (Does anyone know if these are exported elsewhere? The exported cadvisor disk usage metrics in 1.8.7 appear to exclude any overlays and are useless for monitoring real container disk usage.)

Given that I've been unable to find an accurate source of Prometheus-formatted metrics for empty-dirs

The kubelet's /metrics endpoint has disk usage metrics for volumes. cAdvisor's disk usage metrics just measure the writable layer of the container.
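
A rough sketch of pulling those series through the API server's node proxy, assuming a client-go clientset (restClient would typically be clientset.CoreV1().RESTClient()) and a placeholder node name:

	package volstats

	import (
		"context"
		"fmt"
		"strings"

		"k8s.io/client-go/rest"
	)

	// PrintVolumeStats dumps the kubelet's volume-related Prometheus series for a
	// node, fetched via the API server's node proxy.
	func PrintVolumeStats(ctx context.Context, restClient rest.Interface, node string) error {
		raw, err := restClient.Get().
			Resource("nodes").
			Name(node).
			SubResource("proxy").
			Suffix("metrics").
			DoRaw(ctx)
		if err != nil {
			return err
		}
		for _, line := range strings.Split(string(raw), "\n") {
			// e.g. kubelet_volume_stats_used_bytes{namespace="default",persistentvolumeclaim="data"} 1.2e+07
			if strings.HasPrefix(line, "kubelet_volume_stats_") {
				fmt.Println(line)
			}
		}
		return nil
	}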

@dashpole the linked file appears to only collect PVC stats, not emptyDir/local storage metrics.

@dashpole I think there is a small typo in the first point of the description of this issue. Shouldn't that be /metrics/cadvisor? https://github.com/kubernetes/kubernetes/blob/v1.12.0/pkg/kubelet/server/server.go#L71

Edit: the issue description has been updated.

/cc

@dashpole Hello, as a newcomer here, I want to check whether my understanding of this issue is correct.

  1. Currently the summary API endpoint provides redundant metrics, and we should replace it with some new endpoint that only provides the metrics needed by components like metrics-server, the scheduler, the HPA, and so on.
  2. After removing the summary endpoint as well as the direct cAdvisor endpoints, the new endpoint is, by default, the only way the kubelet provides metrics information.
  3. If users want more metrics, they should run a DaemonSet as the source of those metrics.

If the understanding above is correct, I have a few questions.

  1. What kind of metrics would the new endpoint contain? It seems the description in https://github.com/kubernetes/community/blob/master/contributors/design-proposals/instrumentation/core-metrics-pipeline.md#metric-requirements gives the answer:
     • Node-level usage metrics for Filesystems, CPU, and Memory
     • Pod-level usage metrics for Filesystems, CPU, and Memory
     • Container-level usage metrics for Filesystems, CPU, and Memory
  2. Should we remove cAdvisor entirely to reduce the load on the kubelet/node, or just disable the endpoints (/stats/summary, /stats/container, /stats/{podName}/{containerName}) and still use cAdvisor as the metrics source consumed by the new endpoint?
  3. If we should remove cAdvisor entirely, where could the new endpoint get metrics from? Should we implement some kind of "light-weight cAdvisor" that collects only the metrics consumed by the new endpoint (those in point 1)?

@WanLinghao those are good questions. I have some ideas, but nothing is set in stone yet.

Understanding

  1. Information in the summary API isn't redundant, but the kubelet should be scoped as narrowly as possible.

  2. The kubelet may or may not provide some supplemental metrics in Prometheus format, depending on which metrics require knowledge only the kubelet has ("ephemeral storage", for example, is defined by the kubelet). The short answer is that this is still up in the air, but the goal is for each metric in the summary API to still exist afterwards, though possibly not from the kubelet.

  3. That is one possible solution, but isn't determined yet

Questions

  1. I am working on a proposal for this right now. You are correct that it will contain the metrics described in the core metrics proposal.

  2. The eventual goal is to remove cAdvisor, but it is likely it will be around for some time.

  3. At a high level, cAdvisor is both a cgroup discovery tool, and a cgroup monitoring tool. It discovers cgroups by recursively watching all cgroup directories using inotify, and responding to inotify events by monitoring the discovered cgroup. Since the kubelet already manages pod, allocatable (/kubepods), QoS, kubelet, runtime and misc cgroups, it doesn't need any discovery for those. The kubelet also already gets container metrics from the CRI, so it doesn't need discovery or monitoring for those. The only thing left is adding minimal monitoring to the container manager in the kubelet, which should be ~100 lines of code, rather than an entire cAdvisor. As stated before, this is not set in stone, but as a worst-case design is a marked improvement over the current state.
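
To make the "minimal monitoring" idea concrete, here is a toy sketch (not the actual design) of sampling usage for cgroups the kubelet already manages. It assumes cgroup v1 mounted at /sys/fs/cgroup and the default /kubepods hierarchy; a cgroup v2 host or a different cgroup driver would use different paths:

	package main

	import (
		"fmt"
		"os"
		"path/filepath"
		"strconv"
		"strings"
	)

	// readUint reads a single integer counter from a cgroup file.
	func readUint(path string) (uint64, error) {
		b, err := os.ReadFile(path)
		if err != nil {
			return 0, err
		}
		return strconv.ParseUint(strings.TrimSpace(string(b)), 10, 64)
	}

	func main() {
		// The kubelet created these cgroups itself, so no inotify-based discovery is needed.
		cgroups := []string{"/kubepods", "/kubepods/besteffort", "/kubepods/burstable"}
		for _, cg := range cgroups {
			mem, err := readUint(filepath.Join("/sys/fs/cgroup/memory", cg, "memory.usage_in_bytes"))
			if err != nil {
				fmt.Printf("%s: %v\n", cg, err)
				continue
			}
			cpu, err := readUint(filepath.Join("/sys/fs/cgroup/cpuacct", cg, "cpuacct.usage"))
			if err != nil {
				fmt.Printf("%s: %v\n", cg, err)
				continue
			}
			fmt.Printf("%s: memory=%d bytes, cpu=%d ns\n", cg, mem, cpu)
		}
	}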

@dashpole, and if a person wants to use Prometheus for monitoring, then who will provide the Prometheus-formatted metrics? Currently these are exposed through the /metrics/cadvisor endpoint on the kubelet.

Would we have to run cAdvisor as a daemonset for that purpose?

@dashpole So why would we remove the summary API?

Information in the summary API isn't redundant, but the kubelet should be scoped as narrowly as possible.

It seems we will replace the Summary API with the Core Metrics Kubelet API. What's the difference between them?

[1.14] Introduce Core Metrics Kubelet API
[1.14] Deprecate the Summary API

@sachinmsft that is not yet entirely determined. If you want the exact metrics cAdvisor produces, you will likely have to run cAdvisor as a daemonset. However, my hope is that we can find a better way which will involve a combination of metrics from the kubelet and from other sources.

@WanLinghao The Core Metrics API has a more limited set of metrics, as required by the metrics server.

Hi dashpole
I am a beginner with using the Kubernetes API server for metrics collection. I have been trying to collect various metrics using the "k8s.io/metrics/" Golang package. So far I am able to get only a few metrics, like CPU, Memory, and Ephemeral Storage of the containers. I would like to know how I should go about collecting metrics about the persistent volume usage of a Pod (if I get it per container, that's fine, I can do the addition :)) as well as "latency" (the time taken to serve an API request) for a Pod (I would like to find either the average latency or the latency per API call). Can you point me towards the correct way to get those metrics? By the way, my metrics collector application is deployed as a Pod itself, which forwards the metrics data to a renderer application for creating visualizations. Thanks in advance.

@khsahaji please ping me on slack. That isn't relevant to this issue.

/milestone v1.14
We're entering burndown and this looks relevant to the 1.14 release. Please /milestone clear if I am incorrect

Hey!

Gentle reminder that code freeze is this Friday.

Since this is part of the 1.14 milestone, is it going to be resolved soon?

@ibrasho The only PR for 1.14 is #73946. Then this can be moved to 1.15

/milestone v1.15

If you want the exact metrics cAdvisor produces, you will likely have to run cAdvisor as a daemonset.

Running cAdvisor on its own may produce the same numbers, but it does not label them in the way the Kubelet does with pod, container, etc., so it is not a replacement.

Unless I'm missing something, we would need a new program which duplicates this part of Kubelet+cAdvisor.

I may be wrong, but last I checked, all Kubernetes does is rewrite the available well-known Docker labels to pod_name, pod, container_name, and container; that should be possible to replicate with metric relabelling in Prometheus. I agree that the exact same response as from cAdvisor directly is not currently possible, although it could be made possible by implementing label replacement directly in cAdvisor.

We tried it, and resorted to creating a new program. If you describe how to do it without, that would make a great blog post.

@bboreham I opened google/cadvisor#2227 to track that.

This looks like it should be a KEP and not a tracking issue.

I see:

[1.15] Deprecate the "direct" cAdvisor API endpoints by adding and deprecating a --enable-cadvisor-json-endpoints flag

listed as the work item for 1.15; are there any updates regarding this?

/milestone clear

@justaugustus

This is one of the working items under the metrics overhaul KEP: https://github.com/kubernetes/enhancements/blob/master/keps/sig-instrumentation/20181106-kubernetes-metrics-overhaul.md#export-less-metrics

The 1.15 item was completed in #78504, so this doesn't have anything left for 1.15.

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

/remove-lifecycle stale

Is it possible to get an update for 1.17?
Which kubelet endpoint should I use for collecting cAdvisor metrics with a Prometheus server?
Which other kubelet endpoints are supported for direct access to additional metrics?
@juliusv
@dashpole

I think I got it now.
This is what I have understood from #59827:
If one needs to collect all cAdvisor metrics with Prometheus, don't scrape the kubelet, but rather scrape a separate/standalone cAdvisor instance running as a daemonset.
I hope this is right :)

I discussed it with David and proposed that I will work on a KEP proposing replacements for the kubelet monitoring endpoints.

@dashpole If I want to enable the subcontainers option in a kubelet API request to collect more metrics about sidecar containers, does that conflict with this proposal?

ref:

	// Whether to also include information from subcontainers.
	// Default: false.
	// +optional
	Subcontainers bool `json:"subcontainers,omitempty"`

Since progress on this has slipped multiple releases, I removed references to specific versions in which some steps are happening. Once we have a plan to move forward, we can update those with the up-to-date plan.

Hey @dashpole, is there any update on this? From what I gather, it is not yet clear when the changes will happen; however, is there any resource (design doc, open discussion, etc.) where we can keep track of updates even if decisions are not final? (I'm mostly interested in the future of stats/summary, to be honest.)

@serathius I see that you mentioned working on proposing replacements for the kubelet's monitoring endpoints. If there are any resources to keep track of that too (even in a preliminary state), it would be really nice to know. Thanks!

/unassign

I am not able to drive this work right now. This is still the place to check for updates.

@ChrsMark My team and I have been following this as well, and working on replacement options for stats/summary while we wait for guidance from the Kubernetes SIGs. Happy to discuss what we've looked at and decided on if it would be helpful, and would be interested in hearing others' solutions as well. My impression is that no in-tree replacement for stats/summary is being planned by sig-node or sig-instrumentation.

@jpepin happy to join in such a meeting/design-proposal/discussion if there is any in the future!

@dashpole - Now that --enable-cadvisor-json-endpoints defaults to false with 1.18, what's the recommended equivalent for /spec/ to get machine info? It would be great if we could provide alternatives for each of the endpoints we are deprecating (and removing).

@vishiy You can run cAdvisor as a daemonset to get the exact same information. Or, I'd recommend checking out the Prometheus node exporter, which has most of the same information available in Prometheus format and which IMO is easier to integrate into popular monitoring pipelines.
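
For example, with a standalone cAdvisor daemonset exposing its default port, machine info is available from cAdvisor's own API. A rough sketch follows; the node address is a placeholder, and the exact API path should be checked against the cAdvisor version you deploy:

	package main

	import (
		"fmt"
		"io"
		"net/http"
	)

	func main() {
		// Hypothetical address of a cAdvisor daemonset pod on the node of
		// interest, listening on cAdvisor's default port 8080.
		resp, err := http.Get("http://10.0.0.12:8080/api/v2.0/machine")
		if err != nil {
			panic(err)
		}
		defer resp.Body.Close()

		body, err := io.ReadAll(resp.Body)
		if err != nil {
			panic(err)
		}
		// JSON describing cores, memory capacity, topology, and so on.
		fmt.Println(string(body))
	}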

@dashpole I understand that external replacements are the way to go. That works well for Linux, since a container can always be given a level of host access similar to the Kubelet's.

However, I believe that's not an option on Windows. AFAIK, anything running as a container (until privileged containers are supported on Windows) cannot provide the same metrics as the Kubelet (currently network and some disk metrics - excluding CPU and memory, as those would still be available from the Kubelet).

I'm trying to understand the changes: does k8s plan to remove cAdvisor metrics from the kubelet?
I can deploy cAdvisor myself as a DaemonSet, but would I get the disk or PVC metrics that I currently get from the kubelet?
The gap between what we have today and what we plan to achieve later is unclear to me. Could anyone help me understand?

@dashpole - Thanks. But running yet another pod/workload just for a few metrics, and running several of them to cover each of these metrics, is not scalable/sustainable in my opinion. So unless the metrics server has these metrics available as a single consolidator/aggregator, we should probably rethink deprecating the remaining endpoints. It just breaks tooling & solutions people have built, and these metrics are super useful. What do you think?

@dashpole what's the plan here? Is this issue moving forward? Do we need to re-discuss it based on recent comments?

I want to make sure this PR #95763 is not going to waste.

@SergeyKanzhelev Thanks for bumping this. Given the process we recently took for Accelerator metrics, it seems like deprecating a set of metrics should have its own KEP. This was originally part of the metrics overhaul, but was removed from that KEP recently since it hadn't been finished. Given the feedback above, it seems reasonable to have a KEP in which people can discuss pros/cons of moving forward with the deprecation, and let the sig leads make the call.

Once we have a KEP for removing the cadvisor json endpoints, we should probably close this issue. I think the question of whether to deprecate and replace endpoints is probably better answered by its own KEP.

Got it. I've set the PR on hold for now. Is this something that sig-instrumentation can discuss and push forward?

Sig-node has historically handled questions of which endpoints the kubelet exposes. Sig-Instrumentation should probably be an involved sig.

@dashpole What's the best way to get involved with this KEP discussing the deprecation?

Seems relevant to this discussion: apparently a set of metrics like machine_memory_bytes and machine_cpu_cores stopped being generated by kubelet in 1.19, due to some reorganisation inside cAdvisor #95204.

I opened kubernetes/enhancements#2130, which we can use as a place to discuss concerns around removing the cAdvisor json endpoints.

node-exporter maintainer here, just to clarify: the node-exporter doesn't provide any container-level metrics.

Follow-up from discussion at Sig-Node today:

Just to clarify, kubernetes/enhancements#2130 is trying to consolidate existing endpoints, and isn't intended to remove content. Potential migration paths are covered as part of the KEP, but to summarize, most metrics being removed should be available in the /metrics/cadvisor kubelet endpoint. If you rely on metrics that are not included in /metrics/cadvisor, please describe those on the KEP.

Since the introduction of the CRI, Sig-Node has been interested in removing cAdvisor from the kubelet, since it requires container runtimes to integrate with cAdvisor to get complete metrics. Since 2016, the desired monitoring architecture has been one in which monitoring metrics are collected out-of-tree. This issue has been tracking the steps we would need to take to reach those goals, but doesn't cover specifics (such as migration steps for users), or guarantee that we will ever actually reach the desired monitoring architecture. It may very well be the case (as @smarterclayton pointed out above) that the primary monitoring endpoints are too widely used to break at this point.

As a status update:
If kubernetes/enhancements#2130 is accepted, and the monitoring server is moved to /metrics/resource, we will be in the following state in 1.21:

  • The only remaining container/pod/node monitoring endpoints on the kubelet are /metrics/cadvisor and /stats/summary.
  • Third-party device metrics are no longer collected in-tree.
  • The "monitoring" endpoints listed above are used only for monitoring, and not for kubectl top or autoscaling.
  • cAdvisor is only used to provide data for monitoring endpoints, and not for other endpoints. This may change with the introduction of pod-level metrics.
  • cAdvisor is still used to provide metrics for eviction.

That means anyone who works on this in the future can consider cAdvisor + monitoring endpoints largely in isolation.

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

/remove-lifecycle stale
/lifecycle frozen

/remove-kind design
/kind feature

kind/design will soon be removed from k/k in favor of kind/feature. Relevant discussion can be found here: kubernetes/community#5641

/track

/track