Nighthawk

An L7 HTTP protocol family benchmarking tool based on Envoy

Current state

The Nighthawk client supports HTTP/1.1 and HTTP/2 over both plaintext (http://) and TLS (https://) connections.

HTTPS certificates are not yet validated.

Prerequisites

Ubuntu

First, follow steps 1 and 2 of Envoy's "Quick start Bazel build for developers" guide.

Building and testing Nighthawk

# test it
bazel test -c fastbuild //test:nighthawk_test

Build it

# for best accuracy it is important to specify -c opt.
bazel build -c opt //:nighthawk_client

Using the Nighthawk client

$ bazel-bin/nighthawk_client --help

USAGE: 

   bazel-bin/nighthawk_client  [--request-body-size <uint32_t>]
                               [--request-header <string>] ... 
                               [--request-method <GET|HEAD|POST|PUT|DELETE
                               |CONNECT|OPTIONS|TRACE>] [--address-family
                               <auto|v4|v6>] [--burst-size <uint64_t>]
                               [--prefetch-connections] [--output-format
                               <human|yaml|json>] [-v <trace|debug|info
                               |warn|error|critical>] [--concurrency
                               <string>] [--h2] [--timeout <uint64_t>]
                               [--duration <uint64_t>] [--connections
                               <uint64_t>] [--rps <uint64_t>] [--]
                               [--version] [-h] <uri format>


Where: 

   --request-body-size <uint32_t>
     Size of the request body to send. NH will send a number of consecutive
     'a' characters equal to the number specified here. (default: 0, no
     data).

   --request-header <string>  (accepted multiple times)
     Raw request headers in the format of 'name: value' pairs. This
     argument may be specified multiple times.

   --request-method <GET|HEAD|POST|PUT|DELETE|CONNECT|OPTIONS|TRACE>
     Request method used when sending requests. The default is 'GET'.

   --address-family <auto|v4|v6>
     Network address family preference. Possible values: [auto, v4, v6]. The
     default is 'v4'.

   --burst-size <uint64_t>
     Release requests in bursts of the specified size (default: 0, no
     bursting).

   --prefetch-connections
     Prefetch connections before benchmarking (HTTP/1 only).

   --output-format <human|yaml|json>
     Format of the output. Possible values: [human, yaml, json]. The
     default output format is 'human'.

   -v <trace|debug|info|warn|error|critical>,  --verbosity <trace|debug
      |info|warn|error|critical>
     Verbosity of the output. Possible values: [trace, debug, info, warn,
     error, critical]. The default level is 'info'.

   --concurrency <string>
     The number of concurrent event loops that should be used. Specify
     'auto' to let Nighthawk leverage all vCPUs that have affinity to the
     Nighthawk process. Note that increasing this results in an effective
     load multiplier combined with the configured --rps and --connections
     values. Default: 1.

   --h2
     Use HTTP/2

   --timeout <uint64_t>
     Timeout period in seconds used for both connection timeout and grace
     period waiting for lagging responses to come in after the test run is
     done. Default: 5.

   --duration <uint64_t>
     The number of seconds that the test should run. Default: 5.

   --connections <uint64_t>
     The number of connections per event loop that the test should
     maximally use. HTTP/1 only. Default: 1.

   --rps <uint64_t>
     The target requests-per-second rate. Default: 5.

   --,  --ignore_rest
     Ignores the rest of the labeled arguments following this flag.

   --version
     Displays version information and exits.

   -h,  --help
     Displays usage information and exits.

   <uri format>
     (required)  uri to benchmark. http:// and https:// are supported, but
     in case of https no certificates are validated.


   Nighthawk, a L7 HTTP protocol family benchmarking tool based on Envoy.
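
As a rough illustration of combining the flags above, the sketch below prints (rather than runs) a client invocation that sends POST requests with a 1024-byte body for 10 seconds and asks for JSON output. The function name print_nighthawk_cmd, the chosen values, and the target URI are placeholders for illustration and are not part of Nighthawk; only the flags themselves come from the help text.

```shell
# Hypothetical helper (not part of Nighthawk): print a client command line
# built only from flags listed in the --help output above.
print_nighthawk_cmd() {
  # $1: URI of the server under test (a placeholder value is used below).
  echo bazel-bin/nighthawk_client \
    --rps 500 \
    --duration 10 \
    --connections 4 \
    --request-method POST \
    --request-body-size 1024 \
    --output-format json \
    "$1"
}

print_nighthawk_cmd http://127.0.0.1:10000/
```

Removing the echo runs the command for real once bazel-bin/nighthawk_client has been built.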

Sample benchmark run

# start the benchmark target (Envoy in this case) on core 3.
$ taskset -c 3 /path/to/envoy --config-path nighthawk/tools/envoy.yaml

# run a quick benchmark using cpu cores 4 and 5.
$ taskset -c 4-5 bazel-bin/nighthawk_client --rps 1000 --concurrency auto http://127.0.0.1:10000/

Nighthawk - A layer 7 protocol benchmarking tool.

benchmark_http_client.queue_to_connect: 9993 samples, mean: 0.000010053s, pstdev: 0.000011278s
Percentile  Count       Latency        
0           1           0.000006713s   
0.5         4997        0.000007821s   
0.75        7495        0.000008677s   
0.8         7995        0.000009084s   
0.9         8994        0.000011583s   
0.95        9494        0.000015702s   
0.990625    9900        0.000077299s   
0.999023    9984        0.000145863s   
1           9993        0.000232383s   

benchmark_http_client.request_to_response: 9993 samples, mean: 0.000115456s, pstdev: 0.000052326s
Percentile  Count       Latency        
0           1           0.000080279s   
0.5         4998        0.000104799s   
0.75        7496        0.000113787s   
0.8         7996        0.000121359s   
0.9         8994        0.000153487s   
0.95        9494        0.000180647s   
0.990625    9900        0.000382591s   
0.999023    9984        0.000608159s   
1           9993        0.000985951s   

sequencer.blocking: 14 samples, mean: 0.000531127s, pstdev: 0.000070919s
Percentile  Count       Latency        
0           1           0.000484127s   
0.5         7           0.000495615s   
0.75        11          0.000521007s   
0.8         12          0.000545887s   
0.9         13          0.000655839s   
1           14          0.000736223s   

sequencer.callback: 9993 samples, mean: 0.000131079s, pstdev: 0.000060199s
Percentile  Count       Latency        
0           1           0.000091547s   
0.5         4998        0.000116935s   
0.75        7495        0.000127351s   
0.8         7995        0.000137807s   
0.9         8994        0.000174335s   
0.95        9495        0.000210063s   
0.990625    9900        0.000444063s   
0.999023    9984        0.000664383s   
1           9993        0.001103615s   

Counter                                 Value       Per second
client.benchmark.http_2xx               9995        1999.00
client.upstream_cx_close_notify         98          19.60
client.upstream_cx_http1_total          100         20.00
client.upstream_cx_rx_bytes_total       8585215     1717043.00
client.upstream_cx_total                100         20.00
client.upstream_cx_tx_bytes_total       569715      113943.00
client.upstream_rq_pending_total        100         20.00
client.upstream_rq_total                9995        1999.00
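
The counters above are consistent with the load multiplier described for --concurrency. Assuming that --concurrency auto resolved to two workers (one per core in the taskset mask 4-5) and that the default 5-second duration applied, the expected request count is rps x workers x duration. The helper name expected_requests is hypothetical; the sketch only redoes the arithmetic.

```shell
# Hypothetical helper (not part of Nighthawk): expected total request count.
# $1: --rps value, $2: number of workers (assumed), $3: duration in seconds.
expected_requests() {
  echo $(( $1 * $2 * $3 ))
}

# 1000 rps per worker, 2 workers (assumption), 5 seconds (default duration).
expected_requests 1000 2 5   # prints 10000

# Observed: client.upstream_rq_total is 9995 over the same 5 seconds,
# which matches the 1999.00 shown in the "Per second" column.
echo $(( 9995 / 5 ))         # prints 1999
```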

Accuracy and repeatability considerations when using the Nighthawk client

Processes not related to the benchmarking task at hand may add significant noise. Consider stopping any processes that are not needed.
Be aware that power state management and CPU frequency changes can introduce significant noise. When idle, Nighthawk uses a busy loop to achieve precise timings when starting requests, which helps minimize this. Still, consider disabling C-state changes in the system BIOS.
Be aware that CPU thermal throttling may skew results.
Consider using taskset to isolate client and server. On machines with multiple physical CPUs there is a choice here. You can partition client and server on the same physical processor, or run each of them on a different physical CPU. Be aware of the latency effects of interconnects such as QPI.
Consider disabling hyper-threading.
Consider tuning the benchmarking system for low (network) latency. You can do that manually, or install tuned and apply its network-latency profile with the command below. As this may change boot flags, take precautions and familiarize yourself with the tool on systems that you don't mind breaking. For example, running it has been observed to break dual-boot systems!

sudo tuned-adm profile network-latency

When using Nighthawk with concurrency > 1 or multiple connections, workers may produce significantly different results. This can happen for various reasons:
- Server fairness. For example, connections may or may not end up being serviced by the same server thread.
- One of the clients may be unlucky and systematically spend time waiting while requests from the other(s) are serviced, because of how request release timings interact with server processing time.
Consider using separate machines for the clients and server(s).
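
Some of the settings mentioned above can be inspected before a run. The sketch below is Linux-specific and assumes the kernel exposes the listed sysfs files (they may be absent on older kernels, virtual machines, or containers). show_setting is a hypothetical helper, not part of Nighthawk.

```shell
# Hypothetical helper: print the contents of a file, or "unavailable"
# when the file does not exist or cannot be read.
show_setting() {
  if [ -r "$1" ]; then
    cat "$1"
  else
    echo "unavailable"
  fi
}

# 1 means hyper-threading (SMT) is currently active, 0 means it is not.
echo "SMT active:    $(show_setting /sys/devices/system/cpu/smt/active)"
# Frequency scaling governor for CPU 0, for example "performance" or "powersave".
echo "cpu0 governor: $(show_setting /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor)"
# Logical CPUs sharing a physical core with CPU 0; useful when choosing taskset masks.
echo "cpu0 siblings: $(show_setting /sys/devices/system/cpu/cpu0/topology/thread_siblings_list)"
```

Checking the sibling list for each core used in a taskset mask helps avoid pinning client and server to two hyper-threads of the same physical core.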
