GitHub Action for Continuous Benchmarking of Pull Requests
This repository provides a GitHub Action for continuous benchmarking of pull requests: it compares the performance of the PR code against that of the base branch, aiming to detect performance regressions.
Concretely, this action parses the benchmark output generated by several popular tools, and when the PR results are worse than the baseline by more than a specified threshold, it can raise an alert via a commit comment and/or a workflow failure.
The following benchmarking tools are supported:
- `cargo bench` for Rust projects
- `go test -bench` for Go projects
- benchmark.js for JavaScript/TypeScript projects
- pytest-benchmark for Python projects with pytest
- Google Benchmark Framework for C++ projects
- Catch2 for C++ projects
How to use: workflow setup
This action takes as input the files that contain the benchmark outputs from the PR and the base branch. To ensure the measurements are carried out under similar conditions, the results should be collected in the same workflow run, on the same machine.
This is an example workflow that is triggered on pull requests targeting the master branch and processes the benchmark results of JavaScript code.
```yaml
name: Performance Regression Test
on:
  # This action only works for pull requests
  pull_request:
    branches: [master]

jobs:
  benchmark:
    name: Time benchmark
    runs-on: ubuntu-latest
    steps:
      # Check out the pull request branch
      - uses: actions/checkout@v2
        with:
          path: pr
      # Check out the base branch (to compare performance against)
      - uses: actions/checkout@v2
        with:
          ref: master
          path: master
      - uses: actions/setup-node@v1
        with:
          node-version: '15'
      # Run the benchmark on master and store the output to a file
      - name: Run benchmark on master (baseline)
        run: cd master/examples/benchmarkjs && npm install && node bench.js | tee benchmarks.txt
      # Run the benchmark on the PR branch and store the output to a separate file
      # (must use the same tool as above)
      - name: Run pull request benchmark
        run: cd pr/examples/benchmarkjs && npm install && node bench.js | tee benchmarks.txt
      - name: Compare benchmark result
        uses: larabr/github-action-benchmark@master
        with:
          # What benchmark tool the benchmarks.txt files came from
          tool: 'benchmarkjs'
          name: 'time benchmark'
          # Where the two output files from the benchmark tool are stored
          pr-benchmark-file-path: pr/examples/benchmarkjs/benchmarks.txt
          base-benchmark-file-path: master/examples/benchmarkjs/benchmarks.txt
          # A comment will be left on the latest PR commit if `alert-threshold` is exceeded
          comment-on-alert: true
          alert-threshold: '130%'
          # The workflow will fail if `fail-threshold` is exceeded
          fail-on-alert: true
          fail-threshold: '150%'
          # A token is needed to leave commit comments
          github-token: ${{ secrets.GITHUB_TOKEN }}
```
By default, this action marks the result as a performance regression when it is worse than the baseline, exceeding a 50% threshold. For example, if the baseline benchmark result was 100 iter/ns and the PR result is 150 iter/ns, the PR is 50% worse than the baseline and an alert will be raised. The threshold can be changed via the `alert-threshold` input.
For documentation on each action input, see the definitions in `action.yml`.
Tool-specific setup
Please read the README.md file in each example directory. In general, you capture the stdout of the benchmark tool and store it to a file, then pass the two file paths via the `pr-benchmark-file-path` and `base-benchmark-file-path` inputs.
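As a minimal sketch of this capture step, piping the benchmark output through `tee` both displays it and stores it to a file. Here `echo` stands in for the actual benchmark command (e.g. `cargo bench` or `node bench.js`), and the output line is illustrative:

```shell
# Simulated benchmark run: replace `echo` with your real benchmark command
echo "fib(10) x 1,234,567 ops/sec" | tee benchmarks.txt
# benchmarks.txt can now be passed to the action via pr-benchmark-file-path
cat benchmarks.txt
```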
- `cargo bench` for Rust projects
- `go test` for Go projects
- Benchmark.js for JavaScript/TypeScript projects
- pytest-benchmark for Python projects with pytest
- Google Benchmark Framework for C++ projects
These examples are run as part of this repository's own workflows.
How to add new benchmark tool support
- Add your tool name in `src/config.ts`
- Implement the logic to extract benchmark results from the output in `src/extract.ts`
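As an illustrative, hypothetical sketch (not the actual code in `src/extract.ts`), an extractor for benchmark.js-style output might parse lines of the form `fib(10) x 1,234,567 ops/sec ±0.99% (90 runs sampled)`:

```typescript
// Hypothetical extractor sketch: the names and the result shape below
// are assumptions, not this repository's actual API.
interface BenchmarkResult {
  name: string;   // benchmark name, e.g. "fib(10)"
  value: number;  // ops/sec, with thousands separators stripped
  range: string;  // relative margin of error, e.g. "±0.99%"
}

function extractBenchmarkJsLine(line: string): BenchmarkResult | null {
  // benchmark.js prints: "<name> x <ops> ops/sec ±<err>% (<n> runs sampled)"
  const m = line.match(/^(.+)\sx\s([0-9,.]+)\s+ops\/sec\s+(±[\d.]+%)/);
  if (m === null) return null; // not a benchmark result line
  return {
    name: m[1],
    value: parseFloat(m[2].replace(/,/g, '')),
    range: m[3],
  };
}

console.log(extractBenchmarkJsLine('fib(10) x 1,234,567 ops/sec ±0.99% (90 runs sampled)'));
```

A real extractor would map every such line across the PR and baseline output files, so that matching entries can be compared pairwise against the thresholds.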
Stability of Virtual Environment
Based on the benchmark results of the examples in this repository, the run-to-run fluctuation of the measurements is about ±10-20%. If your benchmarks use resources such as the network or file I/O, the fluctuation might be bigger.
If this fluctuation is not acceptable, please prepare a stable environment to run the benchmarks in. GitHub Actions supports self-hosted runners.
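For instance, pinning the benchmark job to a self-hosted runner only requires changing its `runs-on` key (the `linux` label below is an assumption; use whatever labels your runner is registered with):

```yaml
jobs:
  benchmark:
    # Run on a self-hosted machine instead of a GitHub-hosted VM;
    # the labels must match those of your registered runner
    runs-on: [self-hosted, linux]
```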
Related actions
This is a hard fork of the Continuous Benchmarking GitHub Action: it reuses some core components but provides a different feature set. In particular, there is no support for pushing benchmark results to GitHub Pages, and this action is not meant to carry out benchmark comparisons outside of pull requests.