RWTH-HPC / must-rma-correctness22-supplemental

On-the-Fly Data Race Detection for MPI RMA Programs with MUST - Supplemental Material

Authors: Simon Schwitanski, Joachim Jenke, Felix Tomski, Christian Terboven, Matthias S. Müller

This is supplemental material for the paper "On-the-Fly Data Race Detection for MPI RMA Programs with MUST".

Repository Structure

  • must_rma: Sources of MUST-RMA with helper script for installation
  • docker: Dockerfile to build the software environment for the classification quality benchmarks
  • classification_quality: Script to generate the classification quality table out of the test cases
  • overhead_measurement: JUBE scripts to reproduce the measurements
  • overhead_results: Results of the overhead measurements on CLAIX18 (RWTH cluster)

Source Code

The sources of MUST-RMA are available in must_rma/src. Note that this folder also contains many files that are unrelated to the paper. The contributions / tests can be found in the following folders and files:

Software Requirements

The following software packages are needed to reproduce the results:

  • Clang compiler (preferably in version 12.0.1)
  • MPI library with support for at least MPI 3.0 (preferably Intel MPI or MPICH)
  • CMake in version 3.20 or newer
  • libxml2 parser (libxml2-dev)
  • Python 3

The classification quality benchmarks in addition need:

  • LLVM lit in version 14.0.0 (available via PyPI)
  • FileCheck binary (distributed with LLVM)
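Before building, it can help to confirm that the tools listed above are on the PATH. The following is a minimal sketch and not part of the repository. The executable names in the example call (in particular mpicc for the MPI compiler wrapper) are assumptions, since these names differ between systems and MPI libraries.

```shell
# Report which of the given commands are not available on the PATH.
# Prints one line per missing command and returns non-zero if any is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example call (executable names are assumptions, adjust them to your system):
# check_tools clang mpicc cmake python3 lit FileCheck
```

This only checks that the commands exist. It does not verify the versions recommended above.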

The overhead evaluation in addition needs:

Classification Quality Benchmarks

To simplify the reproduction of the classification quality benchmarks, we provide a Dockerfile that sets up the software environment required to build and run MUST-RMA with the benchmarks. If a cluster environment is used instead, the following Docker build and run steps can be skipped.

Build the Docker image with the tag must-rma, adjust the permissions of the must_rma subfolder to match the container user, and run the resulting image with the MUST source code mounted as a volume:

# cd $ROOT
# docker build docker -t must-rma
# chown -R 1000:1000 ./must_rma
# docker run --rm -it \
    -v $(pwd)/must_rma:/must_rma must-rma /bin/bash

Change to the must_rma directory. Install MUST-RMA by using the provided install script build_must.sh:

$ cd $ROOT/must_rma
$ ./build_must.sh

The build and installation paths can be set within the script. In the following, we assume that MUST-RMA was built in the folder $BUILD and installed in $INSTALL.

Change into the $BUILD directory and run the tests:

$ cd $BUILD
$ lit -j 1 tests/OneSidedChecks/ | tee test_output.log

This runs all 81 test cases and prints the results (number of passed and failed tests). Passed tests are marked as PASS, failed tests as FAIL or XFAIL (in lit, XFAIL denotes an expected failure). The number of workers (parameter -j) can be increased; however, spawning too many workers might cause test cases to fail if not enough cores are available to run them.
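For a quick sanity check of the log, the result codes can be counted directly. This is a minimal sketch, not the script shipped in classification_quality. It assumes that each per-test line written by lit starts with the result code followed by a colon (for example "PASS: "). The function names are hypothetical.

```shell
# Count the lines in a lit log that start with a given result code.
# Usage: count_lit_results CODE LOGFILE   (CODE is e.g. PASS, FAIL, or XFAIL)
count_lit_results() {
    # grep -c exits non-zero when nothing matches but still prints 0
    grep -c "^$1: " "$2" || true
}

# Print one summary line per result code.
# Usage: summarize_lit_log LOGFILE
summarize_lit_log() {
    for code in PASS FAIL XFAIL; do
        echo "$code: $(count_lit_results "$code" "$1")"
    done
}

# Example call:
# summarize_lit_log test_output.log
```

Because the pattern is anchored at the start of the line, XFAIL lines are not counted as FAIL.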

To produce the result table, we provide a Python script that parses the test_output.log file. Change back to the classification_quality folder and pass the test output log file to the script:

$ cd $ROOT/classification_quality
$ python3 generate_classification_quality_table.py \
    $BUILD/test_output.log

To check your own applications / binaries, MUST-RMA can be run with:

$ $INSTALL/bin/mustrun --must:distributed \
    --must:tsan --must:rma \
    -np <number of processes> <binary>
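To check the same binary with several process counts, the command above can be wrapped in a small loop. This is a sketch only: the function name run_with_must and the MUSTRUN override variable are hypothetical helpers, and the mustrun flags are exactly the ones shown above.

```shell
# Run one binary under MUST-RMA for several process counts.
# Usage: run_with_must BINARY NP [NP...]
# MUSTRUN overrides the launcher; MUSTRUN=echo prints the arguments
# that would be passed instead of running anything.
run_with_must() {
    binary=$1
    shift
    for np in "$@"; do
        "${MUSTRUN:-$INSTALL/bin/mustrun}" --must:distributed \
            --must:tsan --must:rma \
            -np "$np" "$binary"
    done
}

# Example call (./my_app is a placeholder for your own binary):
# run_with_must ./my_app 2 4 8
```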

Overhead Evaluation

The overhead evaluation is specific to the CLAIX cluster, so running the benchmarks in another environment requires manual adaptations. We provide a JUBE configuration to ease reproduction. The most important parameter sets in the JUBE configuration (prk_rma.xml) are:

  • prk_kernel_args_pset: number of iterations and grid size to be used in the kernels
  • prk_system_pset: system configuration, e.g., number of nodes to be used

After configuring all required parameters, the benchmarks can be run with

$ cd $ROOT/overhead_measurement
$ jube run prk_rma.xml -t kernel_name

where kernel_name can be stencil or transpose.

The JUBE configuration (1) builds MUST-RMA, (2) builds the chosen kernel with and without TSan instrumentation, and (3) submits one Slurm job per requested number of nodes, each of which runs the three configurations (plain, tsan, must-rma). After the Slurm jobs have finished, the results can be retrieved with

$ cd $ROOT/overhead_measurement
$ jube result -a bench_run --id <id of JUBE run>

This prints the results (average iteration time in seconds, per configuration) as a table.
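To turn the reported times into an overhead figure, one option is the slowdown factor relative to the plain run. The sketch below assumes this ratio as the definition of overhead, which is an assumption and not necessarily the metric used in the paper. The function name is hypothetical.

```shell
# Print the slowdown factor of a configuration relative to the plain run.
# Usage: slowdown PLAIN_TIME CONFIG_TIME   (both times in the same unit)
slowdown() {
    awk -v plain="$1" -v cfg="$2" 'BEGIN { printf "%.2f\n", cfg / plain }'
}

# Example call with the times taken from the JUBE result table:
# slowdown <plain time> <must-rma time>
```

A value of 1.00 means no measurable overhead; larger values mean the instrumented run is slower by that factor.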
