sosy-lab / cpu-energy-meter

A tool for measuring energy consumption of Intel CPUs

Measure power rather than energy

TrevorVillwock opened this issue · comments

Hello,

This is a question rather than an issue, so apologies if I'm posting in the wrong place. Is there a way to get the tool to report Package and DRAM power data rather than energy? I'm repeatedly triggering the measurement every 51 ms using the -e flag and outputting the measurements to a CSV file. Is the best method just to calculate the power myself using the time and energy measurements, or is there a more precise way to measure the power directly? I'm slightly confused, since RAPL stands for Running Average Power Limit, yet all the data reported from the MSRs seems to be energy.

Many thanks,
Trevor

Are you talking about CPU Energy Meter? It has an -e option, but it affects only the internal measurement frequency (which is irrelevant for most users and should usually be computed automatically instead) and does not cause repeated output. How are you achieving this?

But yes, the CPU only reports accumulated energy consumption. Power can be computed from this using the time difference and the energy difference, but note that it will always be an average power for the respective time interval. (And for short measurement intervals, the low resolution of the energy measurements could introduce significant errors in the computation; I'm not sure whether this would be the case for 51 ms.)
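The computation described above can be sketched as follows. This is a minimal illustration and not part of CPU Energy Meter: the struct and function names are hypothetical, and the energy resolution passed to `power_error_bound_W` is an assumed input whose real value depends on the CPU.

```c
#include <assert.h>

/* One sample of accumulated energy (hypothetical type for illustration). */
struct energy_sample {
  double time_s;   /* time of the reading, in seconds */
  double energy_J; /* accumulated energy at that time, in joules */
};

/* Average power in watts over the interval between two samples.
 * This is always an average over the interval, never an instantaneous value. */
static double average_power_W(struct energy_sample prev, struct energy_sample cur) {
  const double dt = cur.time_s - prev.time_s;
  assert(dt > 0.0); /* samples must be in chronological order */
  return (cur.energy_J - prev.energy_J) / dt;
}

/* Upper bound on the power error caused by a limited energy resolution,
 * assuming each reading is rounded down to a multiple of resolution_J.
 * The error of the energy difference is then smaller than one step,
 * so the bound grows as the interval gets shorter. */
static double power_error_bound_W(double resolution_J, double dt_s) {
  assert(dt_s > 0.0);
  return resolution_J / dt_s;
}
```

For example, two samples taken 51 ms apart whose accumulated energy differs by 0.765 J give an average power of 15 W. With a made-up resolution of 1 mJ, the error bound for the same interval would be roughly 0.02 W; halving the interval doubles that bound.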

The name Running Average Power Limit refers to something else, actually, namely the possibility to limit how much power the CPU consumes on average, which is apparently what Intel considers to be the main feature of this functionality. Energy measurements are more like a side effect of RAPL.

Thanks for the reply! That's all very helpful. Yes, I'm talking about CPU Energy Meter. To get repeated output I simply copied the line print_results(num_node, cum_energy_J, measurement_start_time, measurement_end_time, fpt) that gets executed on SIGINT or SIGUSR1 to the body of the while loop like so, with FILE *fpt being the csv file:

static int measure_and_print_results(FILE *fpt) {

  const int num_node = get_num_rapl_nodes();
  struct timeval measurement_start_time, measurement_end_time;
  double prev_sample[num_node][RAPL_NR_DOMAIN];

  // Read initial values
  if (get_total_energy_consumed_for_nodes(num_node, prev_sample, NULL) != 0) {
    return 1;
  }
  gettimeofday(&measurement_start_time, NULL);

  double cum_energy_J[num_node][RAPL_NR_DOMAIN];
  memset(cum_energy_J, 0, sizeof(cum_energy_J));
  const struct timespec signal_timelimit = compute_msr_probe_interval_time();
  const sigset_t signal_set = get_sigset();

  // Actual measurement loop
  while (true) {
    // Wait for signal or timeout
    const int rcvd_signal = sigtimedwait(&signal_set, NULL, &signal_timelimit);

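    // print to csv; take the end timestamp first, since it is otherwise
    // unset until a signal has been received
    gettimeofday(&measurement_end_time, NULL);
    print_results(num_node, cum_energy_J, measurement_start_time, measurement_end_time, fpt);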
    // handle errors
    if (rcvd_signal == -1) {
      if (errno == EAGAIN) {
        DEBUG("Time limit elapsed, reading values to ensure overflows are detected.%s", "");
      } else if (errno == EINTR) {
        // interrupted, just try again
      } else {
        warn("Waiting for signal failed.");
        return 1;
      }
    }
    // make sure to read in each iteration, otherwise we might miss overflows
    if (get_total_energy_consumed_for_nodes(num_node, prev_sample, cum_energy_J) != 0) {
      return 1;
    }

    // handle signals
    if (rcvd_signal != -1) {
      gettimeofday(&measurement_end_time, NULL);
      DEBUG("Received signal %d.", rcvd_signal);
      if (rcvd_signal == SIGINT) {
        // printf("\nprinting results\n");
        print_results(num_node, cum_energy_J, measurement_start_time, measurement_end_time, fpt);
        break;

      } else if (rcvd_signal == SIGUSR1) {
        print_results(num_node, cum_energy_J, measurement_start_time, measurement_end_time, fpt);

      } else {
        warnx("Received unexpected signal %d", rcvd_signal);
        return 1;
      }
    }
  }

  return 0;
}

In retrospect though that's pretty hackish and probably not exactly what I want to do. Should I be sending a SIGUSR1 signal to the process periodically instead to accomplish what I want?
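The comment in the loop above about missing overflows refers to the raw energy counters wrapping around: the RAPL energy status registers hold a 32-bit count, so readings must be taken often enough that at most one wrap happens between two of them. A minimal sketch of wrap-aware accumulation follows. The function names and the `joules_per_count` parameter are hypothetical, and this is not a copy of how CPU Energy Meter implements it.

```c
#include <assert.h>
#include <stdint.h>

/* Difference between two raw readings of a 32-bit counter that wraps.
 * Unsigned arithmetic is modulo 2^32, so the result is correct as long as
 * the counter wrapped at most once between prev and cur. */
static uint32_t raw_delta(uint32_t prev, uint32_t cur) {
  return (uint32_t)(cur - prev);
}

/* Add the energy consumed between two raw readings to a running total.
 * joules_per_count is the energy represented by one counter step
 * (an input here; the real value is reported by the CPU). */
static double accumulate_J(double total_J, uint32_t prev, uint32_t cur,
                           double joules_per_count) {
  assert(joules_per_count > 0.0);
  return total_J + (double)raw_delta(prev, cur) * joules_per_count;
}
```

If two wraps occur between readings, the delta silently comes out 2^32 counts too small, which is why the loop reads the counters on every timeout even when no output is requested.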

> Yes, I'm talking about CPU Energy Meter. To get repeated output I simply copied the line

Ah I see. Please always indicate if you are using a modified version when reporting issues or asking questions to the developers, otherwise this will inevitably lead to confusion.

> Should I be sending a SIGUSR1 signal to the process periodically instead to accomplish what I want?

Depends on what you want to do. As explained, there is no way to get power measurements except by sampling energy measurements (with the inherent measurement overhead and imprecision) and deriving the power values from them.

If you still want to do this and use CPU Energy Meter for it, adding a command-line flag for the output interval would be a solution, and probably a better one than using another process to regularly send signals (which would add further jitter).

A second flag for output in CSV format (similar to -r) would also be possible.

But I think that we would probably not want to compute the power values in CPU Energy Meter itself, and would rather let users do this on their own, so that they understand the problems related to it.
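As an illustration of the output-interval idea discussed above, one way to keep a sampling period from drifting inside a single process is to sleep until absolute deadlines rather than for relative durations. The sketch below assumes Linux with POSIX `clock_nanosleep`. The names `timespec_add_ms` and `run_periodic` are hypothetical, and this is neither an existing CPU Energy Meter option nor a drop-in replacement for its `sigtimedwait` loop.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <time.h>

/* Advance a timespec by interval_ms milliseconds, keeping tv_nsec normalized. */
static struct timespec timespec_add_ms(struct timespec t, long interval_ms) {
  assert(interval_ms > 0);
  t.tv_sec += interval_ms / 1000;
  t.tv_nsec += (interval_ms % 1000) * 1000000L;
  if (t.tv_nsec >= 1000000000L) {
    t.tv_sec += 1;
    t.tv_nsec -= 1000000000L;
  }
  return t;
}

/* Call sample() count times, once every interval_ms milliseconds.
 * Deadlines are absolute times on CLOCK_MONOTONIC, so the time spent inside
 * sample() does not add up as drift over many iterations. */
static int run_periodic(long interval_ms, int count, void (*sample)(void)) {
  struct timespec deadline;
  if (clock_gettime(CLOCK_MONOTONIC, &deadline) != 0) {
    return -1;
  }
  for (int i = 0; i < count; i++) {
    deadline = timespec_add_ms(deadline, interval_ms);
    int rc;
    do {
      /* clock_nanosleep returns the error number directly, not via errno */
      rc = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &deadline, NULL);
    } while (rc == EINTR);
    if (rc != 0) {
      return -1;
    }
    sample();
  }
  return 0;
}
```

Individual wake-ups can still be late because of scheduling, so the actual timestamp of each reading should be recorded and used when computing power, rather than assuming the nominal interval.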

Ah got it, sorry for the confusion--I'm a relatively new developer so that's good to know. I should be fine calculating average power outside of CPU Energy Meter. All I'd like to do is get periodic measurements of the energy usage for time periods of ~51-200ms. Just to confirm, when you say add a command line flag with the output interval, you mean a new one, not the -e option that already exists, correct? Thanks for all the help.

Yes, exactly. The -e parameter is mostly a historical relic that is useful for debugging, but there is no reason for users to use it, so it would be better to add a new parameter.

Got it, thanks!