CPUStats.UsageCoreNanoSeconds is inaccurate on Windows - about 1/10 of the real value.
qiutongs opened this issue
What happened?
The CPU utilization metric CPUStats.UsageCoreNanoSeconds from the Kubelet summary API is not accurate on Windows. The reported value is only about 1/10 of the real value, as shown by comparison with VM-level metrics.
[Chart: summary API CPU utilization metric]
[Chart: VM CPU utilization metric]
What did you expect to happen?
CPUStats.UsageCoreNanoSeconds should be accurate on Windows too.
How can we reproduce it (as minimally and precisely as possible)?
$ kubectl get --raw "/api/v1/nodes/<NODE>/proxy/stats/summary"
T1
"cpu": {
"time": "2023-12-06T21:23:48Z",
"usageNanoCores": 8000000,
"usageCoreNanoSeconds": 20985800000000
},
T2
"cpu": {
"time": "2023-12-06T21:24:38Z",
"usageNanoCores": 8000000,
"usageCoreNanoSeconds": 20986320000000
},
(20986320000000 - 20985800000000) ns / 50 s = 520000000 ns / 50 s ≈ 0.0104 s/s, i.e. roughly 1% of one core.
This matches the magnitude shown in the summary API metric chart.
Anything else we need to know?
This problem doesn't exist on Linux.
Call stack
- The metric is set here by the kubelet: kubernetes/pkg/kubelet/winstats/winstats.go, lines 118 to 129 in 7fe31be
- The data is collected using Windows perf counters: kubernetes/pkg/kubelet/winstats/perfcounter_nodestats.go, lines 206 to 212 in 7fe31be
- query here -
Kubernetes version
$ kubectl version
1.27
Cloud provider
OS version
# On Linux:
$ cat /etc/os-release
# paste output here
$ uname -a
# paste output here
# On Windows:
C:\> wmic os get Caption, Version, BuildNumber, OSArchitecture
# paste output here
Install tools
Container runtime (CRI) and version (if applicable)
Related plugins (CNI, CSI, ...) and versions (if applicable)
This issue is currently awaiting triage.
If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.
The triage/accepted label can be added by org members by writing /triage accepted in a comment.
Could usageNanoCores be wrong?
This code caught my attention:
kubernetes/pkg/kubelet/winstats/perfcounter_nodestats.go
Lines 252 to 254 in 85684ec
usageNanoCores is computed in cri_stats_provider.go, like this:
kubernetes/pkg/kubelet/stats/cri_stats_provider.go
Lines 792 to 797 in 26923b9
So wouldn't something like this be equivalent (just going by looking at the code)?
cpuUsageNanoCores := (p.cpuUsageCoreNanoSecondsCache.latestValue - p.cpuUsageCoreNanoSecondsCache.previousValue) / float64(perfCounterUpdatePeriod) * float64(time.Second/time.Nanosecond)
/sig windows
You can open a discussion with SIG Windows: https://github.com/kubernetes/community/tree/master/sig-windows