fenrus75 / powertop

The Linux PowerTOP tool -- please post patches to the mailing list instead of using github pull requests

Home Page: http://www.01.org/powertop

PowerTOP reports over 100% C0 state residency

yarda opened this issue

This seems to be reproducible on multiple systems.

It was originally reported by Erik Hamera <ehamera@redhat.com>; I am forwarding part of his report:

PowerTOP reports over 400% in C0 in the Idle stats tab on Alder Lake-P.
I thought about it and remembered that top has reported >100% on multi-CPU systems, but these statistics are per-CPU. On the other hand, there is obviously asymmetric multiprocessing here:

# for a in `seq 1 19`; do echo -n "$a "; cat /sys/devices/system/cpu/cpu$a/topology/thread_siblings_list; done
1 0-1
2 2-3
3 2-3
4 4-5
5 4-5
6 6-7
7 6-7
8 8-9
9 8-9
10 10-11
11 10-11
12 12
13 13
14 14
15 15
16 16
17 17
18 18
19 19

So I guess it is possible that the faster cores are about 4.5 times faster than the slower ones and that PowerTOP calibrates itself on the slower ones. I still consider a value over 100% a bug, though, because it is very unintuitive. It would be better to cap the faster cores at 100% and scale the maximum for the slower ones accordingly (to around 22% in this case).

PowerTOP 2.14     Overview   Idle stats   Frequency stats   Device stats   Tunables   WakeUp                            


           Pkg(HW)  |            Core(HW) |            CPU(OS) 0   CPU(OS) 1
                    |                     | C0 active 445.1%        1.9%
                    |                     | POLL        0.0%    0.0 ms  0.0%    0.0 ms
                    |                     | C1_ACPI     0.0%    0.0 ms  0.0%    0.0 ms
C2 (pc2)    0.0%    |                     | C2_ACPI     0.0%    0.0 ms  0.2%    1.0 ms
C3 (pc3)    0.0%    | C3 (cc3)    0.0%    | C3_ACPI     0.0%    0.0 ms 99.5%    1.1 ms
C6 (pc6)    0.0%    | C6 (cc6)    0.0%    |
C7 (pc7)    0.0%    | C7 (cc7)    0.0%    |
C8 (pc8)    0.0%    |                     |
C9 (pc9)    0.0%    |                     |
C10 (pc10)  0.0%    |                     |

                    |            Core(HW) |            CPU(OS) 2   CPU(OS) 3
                    |                     | C0 active 445.2%        0.0%
                    |                     | POLL        0.0%    0.0 ms  0.0%    0.0 ms
                    |                     | C1_ACPI     0.0%    0.0 ms  0.0%    0.9 ms
                    |                     | C2_ACPI     0.0%    0.0 ms  0.0%    0.0 ms
                    | C3 (cc3)    0.0%    | C3_ACPI     0.0%    0.0 ms100.0%  835.4 ms
                    | C6 (cc6)    0.0%    |
                    | C7 (cc7)    0.0%    |
                    |                     |
                    |                     |
                    |                     |

                    |            Core(HW) |            CPU(OS) 4   CPU(OS) 5
                    |                     | C0 active   0.6%        0.0%
                    |                     | POLL        0.0%    0.0 ms  0.0%    0.0 ms
                    |                     | C1_ACPI     0.0%    0.2 ms  0.0%    0.0 ms
                    |                     | C2_ACPI     4.3%    0.9 ms  0.0%    0.0 ms
                    | C3 (cc3)    0.0%    | C3_ACPI    95.4%  265.7 ms100.0%  835.6 ms
                    | C6 (cc6)    4.3%    |
                    | C7 (cc7)   95.4%    |
                    |                     |
                    |                     |
                    |                     |

                    |            Core(HW) |            CPU(OS) 6   CPU(OS) 7
                    |                     | C0 active 445.1%        0.0%
                    |                     | POLL        0.0%    0.0 ms  0.0%    0.0 ms
                    |                     | C1_ACPI     0.0%    0.0 ms  0.0%    0.5 ms
                    |                     | C2_ACPI     0.0%    0.0 ms  0.2%    1.0 ms
                    | C3 (cc3)    0.0%    | C3_ACPI     0.0%    0.4 ms 99.8%  645.8 ms
                    | C6 (cc6)    0.0%    |
                    | C7 (cc7)    0.0%    |
                    |                     |
                    |                     |

I did not realize ADL was fully public already.

A CPU where the TSC runs at different rates on different cores is buggy and should be fixed, so clearly there is a silicon bug lurking here.

Thanks for feedback, closing.

Reopening. I am able to reproduce this problem on different machines. I verified that the TSC counts at the same base frequency on all cores. I think the problem is that, because of turbo, APERF can count faster than the TSC. If turbo is disabled on the machines, the C0 residency does not go over 100%.

I think the problem is here:

state->duration_delta = ratio * (state->duration_after - state->duration_before) / state->after_count;

Ratio is:
ratio = 1.0 * time_delta / (tsc_after - tsc_before);

That is, after simplification we get something like time_delta * aperf_delta / tsc_delta.

And here:

sprintf(buffer,"%5.1f%%", percentage(cstates[i]->duration_delta / time_factor));

Because time_delta == time_factor, we finally get aperf_delta / tsc_delta, which can exceed 1.0 (100%) when turbo is in use.

And that is exactly what I observe when the machines reach turbo frequencies under load.
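
The simplification can be checked numerically. The following is a minimal sketch in C, not PowerTOP's actual code: the helper name c0_fraction_aperf is hypothetical, after_count is assumed to be 1, and the C0 duration counter is assumed to be APERF, as described in this comment.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of PowerTOP). It mirrors the arithmetic
 * quoted above, assuming after_count == 1 and time_factor == time_delta. */
double c0_fraction_aperf(uint64_t aperf_delta, uint64_t tsc_delta,
                         double time_delta)
{
    assert(tsc_delta != 0);
    assert(time_delta > 0.0);

    /* ratio = 1.0 * time_delta / (tsc_after - tsc_before) */
    double ratio = 1.0 * time_delta / (double)tsc_delta;

    /* duration_delta = ratio * (duration_after - duration_before) */
    double duration_delta = ratio * (double)aperf_delta;

    /* Value passed to percentage(): duration_delta / time_factor.
     * time_delta cancels, leaving aperf_delta / tsc_delta. */
    return duration_delta / time_delta;
}
```

With illustrative counter values where APERF advances 4.45 times as fast as the TSC (for example aperf_delta = 4450000 and tsc_delta = 1000000), the function returns about 4.45, which would be shown as roughly 445%. When APERF advances no faster than the TSC, the result stays at or below 1.0.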

Or am I missing something?

I have constant_tsc, which runs at the maximum core frequency without turbo.

I am not sure how to hack this into the current code; maybe just clamp the C0 residency to 100%?

Clamping to 100% will probably not work, because the CPU could, for example, spend 50% of the time in C0 while running at 200% turbo, which would be reported as 100% C0 residency, and that is not correct.
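
To illustrate why clamping hides the problem rather than fixing it, here is a small sketch in C. Both function names are hypothetical, and the model (APERF/TSC equals the C0 fraction times the turbo ratio) is a simplifying assumption taken from the example in this comment.

```c
/* Hypothetical model: APERF/TSC for a CPU that spends c0_fraction of the
 * interval in C0 while running at turbo_ratio times the TSC frequency. */
double aperf_over_tsc_ratio(double c0_fraction, double turbo_ratio)
{
    return c0_fraction * turbo_ratio;
}

/* Hypothetical clamp of the displayed value to 1.0 (100%). */
double clamp_c0_fraction(double value)
{
    return value > 1.0 ? 1.0 : value;
}
```

Under this model, a CPU that is in C0 for half of the interval at twice the TSC frequency and a CPU that is in C0 for the whole interval at the TSC frequency both end up at 1.0 after clamping, so the clamped value cannot tell them apart.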

What about mperf_delta/tsc_delta? Would it help?
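
A minimal sketch of that idea in C, assuming that MPERF advances at the TSC rate while the CPU is in C0 and stops in idle states, whereas APERF advances at the actual clock rate in C0. The helper names are hypothetical and this is not PowerTOP code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: fraction of the interval spent in C0, assuming
 * MPERF counts at the TSC rate only while the CPU is in C0. */
double c0_fraction_mperf(uint64_t mperf_delta, uint64_t tsc_delta)
{
    assert(tsc_delta != 0);
    return (double)mperf_delta / (double)tsc_delta;
}

/* Hypothetical helper: average clock speed while in C0, expressed as a
 * multiple of the TSC frequency. */
double avg_freq_ratio(uint64_t aperf_delta, uint64_t mperf_delta)
{
    assert(mperf_delta != 0);
    return (double)aperf_delta / (double)mperf_delta;
}
```

Under these assumptions the two effects are separated: a CPU that is busy half of the time at twice the TSC frequency gives a C0 fraction of 0.5 and a frequency ratio of 2.0, instead of a combined value of 1.0.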

Being over 100% could be considered "correct" if it is just turbo executing more clock cycles than the nominal 100%. Is your turbo really four times the TSC? Or is your TSC frequency scaling? If not, an explanation in the documentation might be enough. Alternatively, the C0 active percentage is simply 100 minus the sum of the idle percentages, which would avoid all the MSR arithmetic.

Yes, in this case it seems the turbo is four times the TSC and the TSC is constant. I was a bit confused because I have seen this only very rarely (on very few machines) in the past. I am OK with just documenting it.
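
The "100 minus the sum of the idle percents" alternative could look roughly like the following C sketch. The function name is hypothetical, and flooring the result at 0 is an added assumption to absorb rounding in the inputs.

```c
#include <stddef.h>

/* Hypothetical helper: derive the C0 percentage from the per-state idle
 * percentages instead of from MSR counters. */
double c0_percent_from_idle(const double *idle_percent, size_t n)
{
    double idle_sum = 0.0;
    for (size_t i = 0; i < n; i++)
        idle_sum += idle_percent[i];

    double c0 = 100.0 - idle_sum;
    /* Assumption: never report a negative residency. */
    return c0 < 0.0 ? 0.0 : c0;
}
```

Using the CPU 1 idle figures from the output above (0.0, 0.0, 0.2 and 99.5), this gives about 0.3%, while the reported C0 value for that CPU was 1.9%, so the two methods need not agree exactly.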