ColinIanKing / powerstat

Powerstat measures the power consumption of a machine using the battery stats or the Intel RAPL interface. The output is like vmstat but also shows power consumption statistics. At the end of a run, powerstat will calculate the average, standard deviation and min/max of the gathered data.

Home Page: https://github.com/ColinIanKing/powerstat

powerstat fails with "Device does not have any RAPL domains", likely due to wrong strncmp()

anarazel opened this issue

Hi,

Starting recently (I think since 5ee1a9b), powerstat reports:
Device does not have any RAPL domains, cannot power measure power usage

I think that's due to

powerstat/powerstat.c

Lines 1742 to 1744 in 6a6353e

/* Ignore duplicated RAPL info from mmio */
if (strncmp(entry->d_name, "intel-rapl-mmio", 15))
	continue;

Note that we skip if strncmp() returns non-zero, i.e. when the entry name does not start with "intel-rapl-mmio". I suspect this is a copy-and-paste error from

powerstat/powerstat.c

Lines 1745 to 1747 in 6a6353e

/* Ignore non Intel RAPL interfaces */
if (strncmp(entry->d_name, "intel-rapl", 10))
	continue;

which does want to skip on any mismatch.

I guess this means that powerstat will just use the mmio domain, and nothing else. My system doesn't have that, and thus fails.

Regards,

Andres
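To make the inverted test concrete, here is a minimal sketch of how the two filters could be written so that their intent is explicit. The helper names `has_prefix` and `skip_powercap_entry` are hypothetical and do not come from powerstat.c, and this is not necessarily how the issue was fixed upstream. The key point is that strncmp() returns 0 on a match, so the mmio check has to skip when the result is zero, while the generic check skips when it is non-zero.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: true if name starts with prefix.
 * strncmp() returns 0 when the compared bytes are equal.
 */
static bool has_prefix(const char *name, const char *prefix)
{
	return strncmp(name, prefix, strlen(prefix)) == 0;
}

/*
 * Hypothetical filter: true if a powercap directory entry
 * should be skipped when scanning for RAPL domains.
 */
static bool skip_powercap_entry(const char *name)
{
	/* Ignore duplicated RAPL info from mmio: skip on a MATCH */
	if (has_prefix(name, "intel-rapl-mmio"))
		return true;

	/* Ignore non Intel RAPL interfaces: skip on a MISMATCH */
	if (!has_prefix(name, "intel-rapl"))
		return true;

	return false;
}
```

With this shape, an entry such as `intel-rapl:0` is kept, while `intel-rapl-mmio:0`, `.` and `..` are skipped. Wrapping strncmp() in a boolean helper is one way to avoid this class of mistake, since the call site then reads as the condition it is meant to express.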

Thanks for the bug report and fix. An earlier fix tried to remove duplicated RAPL domains, but it was clearly broken. I've added a fix that checks for duplicated domain names and filters out extraneous ones. The Watts field was also incorrect, as it was just the sum of the packages and not of all the RAPL power domains, so I've fixed that too.

Fix committed: 8e80286
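For readers who want a picture of what filtering by duplicated domain name can look like, the following is a rough sketch only. It is not the code from commit 8e80286. The struct `domain_set`, the function `domain_set_add`, and the size limits are all assumptions made for illustration. The idea is to remember each domain name already seen and reject any name that appears again.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_DOMAINS 32	/* assumed upper bound, for illustration only */
#define NAME_LEN    64	/* assumed maximum name length, including NUL */

/* Hypothetical set of domain names seen so far */
struct domain_set {
	char names[MAX_DOMAINS][NAME_LEN];
	size_t count;
};

/*
 * Add name to the set. Returns true if it was newly added,
 * false if it is a duplicate, too long, or the set is full.
 */
static bool domain_set_add(struct domain_set *set, const char *name)
{
	size_t i;

	if (strlen(name) >= NAME_LEN)
		return false;

	for (i = 0; i < set->count; i++) {
		if (strcmp(set->names[i], name) == 0)
			return false;	/* already seen, filter it out */
	}

	if (set->count >= MAX_DOMAINS)
		return false;

	strcpy(set->names[set->count], name);
	set->count++;
	return true;
}
```

A caller would start from a zero-initialised `struct domain_set`, call `domain_set_add()` with each domain name it reads (for example `package-0` or `core`), and only register a domain for measurement when the call returns true.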

Thanks, it now works!

Did you intentionally leave this in:

printf("powercap: %s\n", entry->d_name);

I now see this:

powercap: .
powercap: ..
powercap: intel-rapl:1
powercap: intel-rapl:0:0
powercap: intel-rapl
powercap: intel-rapl:0
powercap: intel-rapl:1:0
Running for 60.0 seconds (60 samples at 1.0 second intervals).
...

when running powerstat. I don't think these, particularly . and .., are all that informative?