HPCToolkit / hpctoolkit

HPCToolkit performance tools: measurement and analysis components

Cannot test floating-point metrics

HuG-Cloud opened this issue · comments

I want to test some floating-point related metrics.

  1. When I listed the available events with hpcrun -l, the floating-point metrics provided by PAPI were marked as not available:
    PAPI_SP_OPS No Floating point operations; optimized to count scaled single precision vector operations
    PAPI_DP_OPS No Floating point operations; optimized to count scaled double precision vector operations
  2. The listing shows FP_ARITH as available, but hpcrun reports the following error when I actually run it:

HPCToolkit fatal error: event FP_ARITH is unknown or unsupported.
If running a dynamically-linked program with hpcrun, use 'hpcrun -L <program>' for a list of available events.
If running a statically-linked program built with hpclink, set HPCRUN_EVENT_LIST=LIST in your environment and
run your program to see a list of available events.
Note: Either of the aforementioned methods will exit after listing available events. Arguments to your program
will be ignored. Thus, an execution to list events can be run on a single core and it will execute for only a few
seconds.

I apologize that your bug report went unnoticed. A week ago, we moved to GitLab; GitHub is now just a mirror.

Your error report isn't actionable.

What platform are you running on? What version of HPCToolkit are you using? How did you build HPCToolkit? With spack or configure? What configuration options did you use? (Was PAPI compiled into hpcrun at build time?) Exactly what was the hpcrun command that you issued where hpcrun reported an error?

Please add the output of hpcrun -l to your bug report.

Please also add the output of papi_avail and papi_native_avail.
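
For anyone gathering the same information, here is a minimal sketch of a helper that captures the requested outputs in one pass. The three commands are the ones named in this thread. The function names (`collect`, `write_reports`) and the output file names are illustrative assumptions, not part of HPCToolkit or PAPI.

```python
import os
import shutil
import subprocess

# Commands requested in this thread; the file names are arbitrary choices.
DIAGNOSTICS = {
    "hpcrun.txt": ["hpcrun", "-l"],
    "papi_avail.txt": ["papi_avail"],
    "papi_native_avail.txt": ["papi_native_avail"],
}


def collect(commands):
    """Run each command and return {file name: captured stdout + stderr}."""
    results = {}
    for filename, argv in commands.items():
        # Record a note instead of failing when a tool is not installed.
        if shutil.which(argv[0]) is None:
            results[filename] = f"{argv[0]}: not found on PATH\n"
            continue
        proc = subprocess.run(argv, capture_output=True, text=True)
        results[filename] = proc.stdout + proc.stderr
    return results


def write_reports(results, directory="."):
    """Write each captured output to its own file under `directory`."""
    for filename, text in results.items():
        with open(os.path.join(directory, filename), "w") as handle:
            handle.write(text)


# Typical use: write_reports(collect(DIAGNOSTICS))
```

Run it in the same environment (modules, spack load, and so on) that you use for the failing hpcrun command, so the listings reflect the installation that actually produced the error.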

I'm sorry for not being clear.

I'm running on an Intel Xeon, and the HPCToolkit version is 2022.05.15. I built it with spack: spack install hpctoolkit +mpi +cuda. PAPI support seems to be enabled by default?

The command I used is: mpirun -n 4 hpcrun -e CPUTIME -e FP_ARITH ./my .
And I got the error:

HPCToolkit fatal error: event FP_ARITH is unknown or unsupported.
If running a dynamically-linked program with hpcrun, use 'hpcrun -L <program>' for a list of available events.

If running a statically-linked program built with hpclink, set HPCRUN_EVENT_LIST=LIST in your environment and
run your program to see a list of available events.

Note: Either of the aforementioned methods will exit after listing available events. Arguments to your program
will be ignored. Thus, an execution to list events can be run on a single core and it will execute for only a few
seconds.

What should I do to make it support floating-point metrics?

The attached files are the output of "hpcrun -l" and "papi_avail -d", respectively.
hpcrun.txt
papi.txt
@jmellorcrummey

In the hpcrun output, there is the following header above the hardware performance counter events. See the note associated with an asterisk.

===========================================================================
Available Linux perf events
===========================================================================
(*) Denotes the counter may not be profilable.

Later, we report the events:

skx::FP_ARITH   Floating-point instructions retired (*)
FP_ARITH        Floating-point instructions retired

The entry skx::FP_ARITH, which is the Skylake-specific name for the counter, carries the asterisk, meaning that the FP_ARITH counter may not be profilable. When the same event is listed without the Skylake prefix "skx", the asterisk is missing.
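
To check a long listing for this kind of mismatch, a small parser can flag which entries carry the asterisk. This is only a sketch: it assumes each event line has the shape shown in the excerpt above (a name, at least two spaces, a description, and an optional trailing "(*)"). The function name `parse_events` is hypothetical, and the real hpcrun -l output may contain lines this pattern does not handle.

```python
import re

# Excerpt copied from the listing quoted in this thread.
SAMPLE = """\
(*) Denotes the counter may not be profilable.
skx::FP_ARITH   Floating-point instructions retired (*)
FP_ARITH        Floating-point instructions retired
"""

# Assumed line shape: NAME, two or more spaces, DESCRIPTION, optional "(*)".
EVENT_LINE = re.compile(r"^([\w:.\-]+)\s{2,}(.+?)\s*(\(\*\))?$")


def parse_events(listing):
    """Return (name, description, may_not_be_profilable) for each event line."""
    events = []
    for line in listing.splitlines():
        match = EVENT_LINE.match(line.strip())
        if match:
            name, description, star = match.groups()
            events.append((name, description, star is not None))
    return events


for name, _description, flagged in parse_events(SAMPLE):
    print(name, "may not be profilable" if flagged else "no asterisk")
```

An event that appears both with and without the asterisk under different names, as FP_ARITH does here, is worth treating as possibly unprofilable until it has been tested.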

We'll have to look into why the unprefixed FP_ARITH event isn't annotated with an asterisk. The reason FP_ARITH is not found seems to be a bug in the PAPI library or in the libpfm library that it depends upon: papi_native_avail lists FP_ARITH, but "papi_native_avail -e FP_ARITH" says that the event is not found.

Unfortunately, Intel has made floating-point operations very hard to profile on recent processors such as Skylake. (See https://groups.google.com/a/icl.utk.edu/g/ptools-perfapi/c/AlPdLO6fU7o) There are many events that need to be counted and summed together. We can only profile native events in the hardware, not events derived by summing together counts of other events.
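
To illustrate what "counted and summed together" means, here is a rough sketch of how a scaled double-precision operation count could be derived from per-width instruction counts after a run. The sub-event names and the weights (operations per instruction for each vector width) are assumptions made for illustration. Check papi_native_avail on your machine for the names it actually reports. The function `scaled_dp_ops` is hypothetical and not part of HPCToolkit or PAPI.

```python
# Assumed sub-events and weights: the number of double-precision values
# processed by one retired instruction of each width. Verify the names
# against papi_native_avail before relying on them.
DP_WEIGHTS = {
    "FP_ARITH:SCALAR_DOUBLE": 1,
    "FP_ARITH:128B_PACKED_DOUBLE": 2,
    "FP_ARITH:256B_PACKED_DOUBLE": 4,
    "FP_ARITH:512B_PACKED_DOUBLE": 8,
}


def scaled_dp_ops(counts):
    """Weighted sum of per-width instruction counts; missing events count as 0."""
    return sum(weight * counts.get(event, 0) for event, weight in DP_WEIGHTS.items())


# Made-up counts, purely to show the arithmetic.
example_counts = {
    "FP_ARITH:SCALAR_DOUBLE": 100,
    "FP_ARITH:128B_PACKED_DOUBLE": 10,
    "FP_ARITH:256B_PACKED_DOUBLE": 5,
    "FP_ARITH:512B_PACKED_DOUBLE": 1,
}
print(scaled_dp_ops(example_counts))  # 100*1 + 10*2 + 5*4 + 1*8 = 148
```

Because the total exists only as a sum over several separately counted events, no single hardware counter overflows in step with it, which is why it cannot be sampled as one event the way a native counter can.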

I see, thank you.