TristanSullivan / CSBS_BMPaper_Analysis

Scripts to make the figures for the HEP CPU benchmarking paper submitted to CSBS in August 2021

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Scripts to make the figures for the HEP CPU benchmarking paper submitted to CSBS in August 2021

The contents of this repository pertain to Figures 1, 3, 7, 8, 9, and 10. There is a separate script to produce each figure. For Figures 3 and 9, there is a python script and a C++ ROOT macro, because I couldn't make the stat boxes work in python. For each figure except Figure 7, there is a script to read the data in from a .pkl file and print it. I write this output to a text file, which is read by the scripts that make the plots. For Figure 3, there are two scripts to read the data: one for the B2 workload, and one for the other six HEP workloads. The output must be combined into a single file.

The format of the data is 
<benchmark name> one of hs06, hepscore, atlas-gen, cms-gen-sim, cms-digi, cms-reco, belle2-gen-sim-reco, lhcb-gen-sim
<CPU model 1>
score1
...
scoreN
<CPU model 2>
score1
...
score N
etc.

There is a block for each benchmark, containing sub-blocks for each CPU, containing the score obtained for that CPU and that benchmark

For Figure 1, the script writes out only the mean and standard deviation of the score for each CPU. The other scripts write out the score for each run, so the output contains the full distributions.

The scripts to generate the plots use PYROOT. Reading in the data files works the same way for each one, except MakeprmonPlot.py, for which the data are in .csv files. For each of the others, lists are created of the CPU models and benchmarks in the output files from the first step. These are used to parse the data files, and fill a dictionary called bmresults containing the benchmark score for each benchmark and cpu. An example is bmresults["Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz"]["hepscore"]["32.0"], where the last index is the number of cores used on the machine. bmresults["Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz"]["hepscore"]["32.0"] is a list of the HEPscore values for some number of runs on the Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz CPU. The length of the list is the number of runs.

bmresults is then used to fill histograms or graphs, which are plotted with various options. For Figures 3 and 9, the python scripts write out histograms to ROOT files, and the plots are made with C++ macros.

This repository contains all the .pkl files, text output files, and the final figures for the version of the paper originally submitted. For example, to produce Figure 1, I did

./GetSPECvsHS06data.py > SPECvsHS06data_reproduceoldversion_sigma

on a VM with python3. Then, I copied SPECvsHS06data_reproduceoldversion_sigma to my laptop with python2 set up with ROOT, and did

./MakeSPECvsHS06_reproduceoldversion.py

For Figures 1, 7, and 8, I modified them by hand to make the final version. There are comments in the scripts describing what was done.

About

Scripts to make the figures for the HEP CPU benchmarking paper submitted to CSBS in August 2021


Languages

Language:Python 83.7%Language:C 16.3%