GeertBosch / benchmarks

About

I wrote these benchmarks for a presentation on "Performance Tips, Tricks, and Gotchas". They compare several ways of doing the same thing in C++ that look only subtly different on the surface but may differ significantly in performance. Writing them was an interesting learning opportunity: I learned how to write benchmarks along the way, and although I already knew that in principle these approaches perform differently, I had never actually taken the time to measure them.

Benchmarks include measurements for:

  • Function call overhead: Virtual member function vs. non-virtual member function vs. lambda function vs. std::function
  • Effects of data locality/cache misses
  • False sharing between threads
  • Using mutexes vs. atomics
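To make the false sharing item concrete, here is a minimal sketch of the two memory layouts such a benchmark typically compares. It is not taken from benchmarks.cpp: the names PackedCounter, PaddedCounter, incrementPerThread and kCacheLine are hypothetical, the 64-byte cache line is an assumption that holds on most x86-64 machines but not everywhere, and the over-aligned std::vector storage assumes C++17.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Assumed cache line size; 64 bytes is common on x86-64 but not universal.
constexpr std::size_t kCacheLine = 64;

// Counters stored back to back: neighbouring counters likely share a cache line,
// so threads writing to different counters still contend for the same line.
struct PackedCounter {
    std::atomic<std::uint64_t> value{0};
};

// Each counter is aligned, and therefore padded, to its own cache line.
struct alignas(kCacheLine) PaddedCounter {
    std::atomic<std::uint64_t> value{0};
};

static_assert(sizeof(PaddedCounter) == kCacheLine,
              "each padded counter should occupy exactly one assumed cache line");

// Starts `threads` workers; each one increments only its own counter.
// Returns the sum of all counters so the work cannot be optimized away.
template <typename Counter>
std::uint64_t incrementPerThread(std::size_t threads, std::uint64_t iterations) {
    assert(threads > 0);
    std::vector<Counter> counters(threads);
    std::vector<std::thread> workers;
    workers.reserve(threads);
    for (std::size_t i = 0; i < threads; ++i) {
        workers.emplace_back([&counters, i, iterations] {
            for (std::uint64_t n = 0; n < iterations; ++n) {
                counters[i].value.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& worker : workers) {
        worker.join();
    }
    std::uint64_t total = 0;
    for (const auto& counter : counters) {
        total += counter.value.load();
    }
    return total;
}
```

Timing incrementPerThread<PackedCounter>(...) against incrementPerThread<PaddedCounter>(...) with the same arguments is one way to observe the effect. Both calls return the same total; only the memory layout differs.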

This is a work in progress and there may be mistakes. There are also a few TODOs left in benchmarks.cpp that are worth paying attention to. I'll clean this up further in the coming weeks.

How to Install and Run

# Install Conan, which is used to fetch Google Benchmark.
sudo apt-get install python3-venv
python3 -m venv pyenv
source pyenv/bin/activate
pip install conan

# Configure Conan.
conan profile new default --detect
conan profile update settings.compiler.libcxx=libstdc++11 default

# Create the build directory and download Google Benchmark into it.
mkdir build && cd build
conan install ..

# Configure cmake
cmake .. -G "Unix Makefiles" -DCMAKE_BUILD_TYPE=Release
# Build benchmark
cmake --build .
# Run benchmark
./bin/benchmarks

Output on my machine:

Running ./build/bin/benchmarks
Run on (16 X 3396.7 MHz CPU s)
CPU Caches:
  L1 Data 32K (x8)
  L1 Instruction 32K (x8)
  L2 Unified 1024K (x8)
  L3 Unified 25344K (x1)
Load Average: 0.00, 0.11, 0.27
----------------------------------------------------------------------------------------
Benchmark                                              Time             CPU   Iterations
----------------------------------------------------------------------------------------
BM_virtualFunctionCallsThroughPointerToParent       2.51 ns         2.51 ns    294239104
BM_virtualFunctionCallsThroughPointerToChild        1.61 ns         1.61 ns    433945778
BM_virtualFunctionCallsThroughInstanceOfChild      0.295 ns        0.295 ns   1000000000
BM_nonVirtualNonInlineFunctionCall                  3.24 ns         3.24 ns    215936970
BM_inlineFunctionCall                              0.295 ns        0.295 ns   1000000000
BM_noFunctionCall                                  0.295 ns        0.295 ns   1000000000
BM_stdFunctionCall                                  1.77 ns         1.77 ns    395824197
BM_lambdaFunctionCall                              0.295 ns        0.295 ns   1000000000
BM_stdFunctionPassedAsParameterFunctionCall         2.06 ns         2.06 ns    339063066
BM_lambdaPassedAsParameterFunctionCall             0.295 ns        0.295 ns   1000000000
BM_sequentialListAccess                             1367 ns         1367 ns       440074
BM_sequentialArrayAccess                             148 ns          148 ns      4704685
BM_sequentialArrayAccessSmallerThanL1               47.8 ns         47.8 ns     14637763
BM_randomArrayAccessSmallerThanL1                   99.9 ns         99.9 ns      7034840
BM_sequentialArrayAccessBiggerThanL1              158070 ns       158063 ns         4429
BM_randomArrayAccessBiggerThanL1                  727313 ns       727301 ns          839
BM_falseSharing/manual_time                      2841579 ns        41720 ns          246
BM_noFalseSharing/manual_time                    2144634 ns        39660 ns          326
BM_useMutex/manual_time                        116323172 ns        50073 ns            6
BM_useMutexNoContention/manual_time             15982798 ns        36639 ns           44
BM_useAtomic/manual_time                        28326920 ns        39443 ns           25
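
For readers who want a picture of what the BM_useMutex and BM_useAtomic rows compare, the following is a rough sketch of the two approaches to a shared counter. It is an illustration under assumptions, not the code in benchmarks.cpp: countWithMutex and countWithAtomic are hypothetical names, and the real benchmarks may differ in thread count, workload, and timing method.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

// Shared counter protected by a mutex: every increment takes the lock.
std::uint64_t countWithMutex(unsigned threads, std::uint64_t iterations) {
    assert(threads > 0);
    std::mutex mutex;
    std::uint64_t counter = 0;
    std::vector<std::thread> workers;
    workers.reserve(threads);
    for (unsigned t = 0; t < threads; ++t) {
        workers.emplace_back([&mutex, &counter, iterations] {
            for (std::uint64_t n = 0; n < iterations; ++n) {
                std::lock_guard<std::mutex> lock(mutex);
                ++counter;
            }
        });
    }
    for (auto& worker : workers) {
        worker.join();
    }
    return counter;
}

// Shared counter as an atomic: every increment is a single atomic read-modify-write.
std::uint64_t countWithAtomic(unsigned threads, std::uint64_t iterations) {
    assert(threads > 0);
    std::atomic<std::uint64_t> counter{0};
    std::vector<std::thread> workers;
    workers.reserve(threads);
    for (unsigned t = 0; t < threads; ++t) {
        workers.emplace_back([&counter, iterations] {
            for (std::uint64_t n = 0; n < iterations; ++n) {
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& worker : workers) {
        worker.join();
    }
    return counter.load();
}
```

Both functions return threads * iterations, so they do the same work and differ only in how the shared counter is synchronized.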
