BencheeAsync

Benchee plugin for benchmarking multi-process performance for async work.

This plugin makes it possible to optimize systems whose work is spread across multiple processes. On its own, Benchee benchmarks a single function within a single executing process and cannot track work performed in other processes. BencheeAsync measures units of async work completed, which makes it possible to optimize async pipelines.

The goal of this library is to approximately track units of async work done and the rate of completion.

Installation

def deps do
  [
    # benchee is used internally
    {:benchee, "~> 1.0", only: [:dev, :test]},
    {:benchee_async, "~> 0.1.0", only: [:dev, :test]}
  ]
end

Usage

The following must be configured:

Start the BencheeAsync.Reporter GenServer.
Benchmark functions must call BencheeAsync.Reporter.record/0 to record a unit of work completed.
Set the extended_statistics: true option for Benchee.Formatters.Console so that the units of work done (the sample size) are displayed.

Example

This shows an example of running Benchee from within an ExUnit test suite.

defmodule MyAppTest do
  use ExUnit.Case, async: false

  test "measure async work!" do
    # start the reporter process
    start_supervised!(BencheeAsync.Reporter)

    # use BencheeAsync instead of Benchee
    BencheeAsync.run(
      %{
        "case_100_ms" => fn ->
          Task.start(fn ->
            :timer.sleep(100)
            BencheeAsync.Reporter.record()
          end)
          :timer.sleep(2500)
        end,
        "case_1000_ms" => fn ->
          Task.start(fn ->
            :timer.sleep(1000)
            BencheeAsync.Reporter.record()
          end)
          :timer.sleep(1500)
        end
      },
      time: 1,
      warmup: 3,
      # use extended_statistics to view units of work done
      formatters: [{Benchee.Formatters.Console, extended_statistics: true}]
    )
  end
end

The resulting console output will be as follows:

Operating System: macOS
CPU Information: Apple M1 Pro
Number of Available Cores: 10
Available memory: 32 GB
Elixir 1.15.5
Erlang 26.1.2

Benchmark suite executing with the following configuration:
warmup: 3 s
time: 1 s
memory time: 0 ns
reduction time: 0 ns
parallel: 1
inputs: none specified
Estimated total run time: 8 s

Benchmarking case_1000_ms ...
Benchmarking case_100_ms ...

Name                       ips        average  deviation         median         99th %
case_100_ms           9.92        0.101 s     ±0.20%        0.101 s        0.101 s
case_1000_ms          1.00         1.00 s     ±0.04%         1.00 s         1.00 s

Comparison:
case_100_ms           9.92
case_1000_ms          1.00 - 9.93x slower +0.90 s

Extended statistics:

Name                     minimum        maximum    sample size                     mode
case_100_ms          0.101 s        0.101 s              3                     None
case_1000_ms          1.00 s         1.00 s              1                     None

Interpretation differences from Benchee are as follows:

ips: The maximum iterations per second of the async process(es) if the async logic was repeatedly executed in isolation.
average, deviation, median, 99th %: The statistics of the execution time between each reported unit of work done.
sample size: The number of reported units of work done, which corresponds to the number of BencheeAsync.Reporter.record/0 calls.
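
As a rough illustration of how these columns relate, the sketch below takes a hypothetical list of durations (in seconds), one per reported unit of work, and derives the sample size, the average, and the ips as the reciprocal of the average. The AsyncStats module and the sample values are made up for this example; the actual statistics are computed by Benchee and may differ in detail.

```elixir
defmodule AsyncStats do
  @moduledoc "Hypothetical helper for illustration; not part of BencheeAsync."

  # Average duration in seconds across all reported units of work.
  def average(durations), do: Enum.sum(durations) / length(durations)

  # Iterations per second: the reciprocal of the average duration.
  def ips(durations), do: 1 / average(durations)
end

# Three reported units of work, each taking roughly 0.101 s (made-up values).
durations = [0.101, 0.101, 0.101]

IO.inspect(length(durations), label: "sample size")
IO.inspect(AsyncStats.average(durations), label: "average (s)")
IO.inspect(AsyncStats.ips(durations), label: "ips")
```

With an average of about 0.101 s, the ips comes out a little under 10, which is the relationship visible in the case_100_ms row above.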

Usage with Inputs

Inputs work as well with no additional configuration needed.
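
A minimal sketch of a run with inputs, assuming BencheeAsync.run/2 forwards the standard Benchee inputs option unchanged. The scenario name, the input values, and the sleep durations are hypothetical and are not the benchmark that produced the output below.

```elixir
defmodule MyAppInputsTest do
  use ExUnit.Case, async: false

  test "measure async work across inputs" do
    start_supervised!(BencheeAsync.Reporter)

    BencheeAsync.run(
      %{
        # with inputs, each benchmark function receives the current input
        "sleep_then_record" => fn delay_ms ->
          Task.start(fn ->
            :timer.sleep(delay_ms)
            BencheeAsync.Reporter.record()
          end)

          # keep the scenario running long enough for the task to report
          :timer.sleep(delay_ms + 50)
        end
      },
      # hypothetical inputs: a delay in milliseconds per input name
      inputs: %{"Small" => 10, "Medium" => 50, "Bigger" => 75},
      time: 3,
      warmup: 0,
      formatters: [{Benchee.Formatters.Console, extended_statistics: true}]
    )
  end
end
```

The console output that follows comes from the original benchmark run (scenarios case_faster and case_slower), not from this sketch.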

Operating System: macOS
CPU Information: Apple M1 Pro
Number of Available Cores: 10
Available memory: 32 GB
Elixir 1.15.5
Erlang 26.1.2

Benchmark suite executing with the following configuration:
warmup: 0 ns
time: 3 s
memory time: 0 ns
reduction time: 0 ns
parallel: 1
inputs: Bigger, Medium, Small
Estimated total run time: 18 s

Benchmarking case_faster with input Bigger ...
Benchmarking case_faster with input Medium ...
Benchmarking case_faster with input Small ...
Benchmarking case_slower with input Bigger ...
Benchmarking case_slower with input Medium ...
Benchmarking case_slower with input Small ...

##### With input Bigger #####
Name                  ips        average  deviation         median         99th %
case_faster        1.08 M     0.00092 ms    ±36.87%     0.00092 ms     0.00154 ms
case_slower     0.00001 M       75.90 ms     ±0.22%       75.94 ms       76.03 ms

Comparison:
case_faster        1.08 M
case_slower     0.00001 M - 82215.44x slower +75.90 ms

Extended statistics:

Name                minimum        maximum    sample size                     mode
case_faster      0.00013 ms     0.00154 ms             39               0.00088 ms
case_slower        75.27 ms       76.03 ms             20                     None

##### With input Medium #####
Name                  ips        average  deviation         median         99th %
case_faster      982.25 K     0.00102 ms   ±151.32%     0.00083 ms      0.0123 ms
case_slower      0.0196 K       51.04 ms     ±0.76%       51.00 ms       52.96 ms

Comparison:
case_faster      982.25 K
case_slower      0.0196 K - 50138.38x slower +51.04 ms

Extended statistics:

Name                minimum        maximum    sample size                     mode
case_faster      0.00013 ms      0.0123 ms             58               0.00075 ms
case_slower        50.49 ms       52.96 ms             30                     None

##### With input Small #####
Name                  ips        average  deviation         median         99th %
case_faster        1.68 M     0.00059 ms    ±38.29%     0.00058 ms     0.00108 ms
case_slower     0.00009 M       11.00 ms     ±1.08%       11.01 ms       11.61 ms

Comparison:
case_faster        1.68 M
case_slower     0.00009 M - 18489.07x slower +11.00 ms

Extended statistics:

Name                minimum        maximum    sample size                     mode
case_faster      0.00013 ms     0.00275 ms            272               0.00063 ms
case_slower        10.44 ms       11.69 ms            143    11.02 ms, 11.04 ms, 11.01

Usage in a Real World Application

It is advised to mock your async functions using :meck or Mimic. The mocked function is where you call BencheeAsync.Reporter.record/0.
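
A minimal sketch of this approach using :meck, assuming a hypothetical application in which MyApp.Pipeline.enqueue/1 hands work to another process that eventually calls MyApp.Sink.write/1. Neither module is part of this library, and both are assumed to be compiled as part of the application under test; substitute the function that marks the end of a unit of work in your own system.

```elixir
defmodule MyApp.PipelineBenchTest do
  # mocks created by :meck are global, so the test must not run async
  use ExUnit.Case, async: false

  test "measure pipeline throughput" do
    start_supervised!(BencheeAsync.Reporter)

    # MyApp.Sink is hypothetical: the last step of the async pipeline.
    # :passthrough keeps the original implementation available.
    :meck.new(MyApp.Sink, [:passthrough])

    :meck.expect(MyApp.Sink, :write, fn event ->
      # record one unit of work each time the pipeline reaches the sink
      BencheeAsync.Reporter.record()
      # then run the real MyApp.Sink.write/1
      :meck.passthrough([event])
    end)

    try do
      BencheeAsync.run(
        %{
          # MyApp.Pipeline.enqueue/1 is hypothetical: it returns immediately
          # and the work completes in another process
          "enqueue" => fn -> MyApp.Pipeline.enqueue(%{id: 1}) end
        },
        time: 3,
        warmup: 1,
        formatters: [{Benchee.Formatters.Console, extended_statistics: true}]
      )
    after
      # restore the original module even if the benchmark raises
      :meck.unload(MyApp.Sink)
    end
  end
end
```

Because the mock calls through to the original function, the pipeline behaves as usual and the only addition is the call that records each completed unit of work.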

Internals and Behavior

This library injects hooks into Benchee.run/1 in order to benchmark async work.

BencheeAsync uses only Benchee's public APIs to inject these hooks. All user-provided hooks are executed after the injected hooks.

Global hooks need to be injected in order to initiate tracking of post-warmup and post-scenario timings.

To allow BencheeAsync.Reporter.record/0 to work without specifying scenario name or input name, the input is used in the local :before_scenario hook in order to identify the scenario-input combination being benchmarked. The input is then hashed using :erlang.phash2/2 for internal referencing.
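
To illustrate the hashing step, the sketch below hashes a few hypothetical input values with :erlang.phash2/2. The input values and the range are assumptions made for this example; the exact term and range that BencheeAsync uses internally are not shown here.

```elixir
# Hypothetical inputs, as they might be passed via the inputs option.
inputs = %{"Small" => 10, "Medium" => 50, "Bigger" => 75}

# :erlang.phash2/2 maps any term to an integer in 0..range-1.
# The range is an assumption chosen for illustration.
range = 1_000_000

# Build a lookup from input name to the hash of its value.
keys =
  for {name, value} <- inputs, into: %{} do
    {name, :erlang.phash2(value, range)}
  end

IO.inspect(keys, label: "input hashes")
```

The same term always hashes to the same integer, which is what makes the hash usable as a compact reference to the input being benchmarked.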

Limitations

The memory_time and reduction_time Benchee options extend the execution time, so the sample size will include units of work recorded beyond the configured time value.
