Full end-to-end test harness for the Vector log & metrics router. This is the test framework used to generate the performance and correctness results displayed in the Vector docs. You can learn more about how it works in the How It Works section, and begin using it via the Usage section.
- `disk_buffer_performance` test
- `file_to_tcp_performance` test
- `tcp_to_blackhole_performance` test
- `tcp_to_tcp_performance` test
- `tcp_to_http_performance` test
- `regex_parsing_performance` test
- `disk_buffer_persistence_correctness` test
- `file_rotate_create_correctness` test
- `file_rotate_truncate_correctness` test
- `file_truncate_correctness` test
- `sighup_correctness` test
- `wrapped_json_correctness` test
- High-level results can be found in the Vector performance and correctness documentation sections.
- Detailed results can be found within each test case's README.
- Raw performance result data can be found in our public S3 bucket.
- You can run your own queries against the raw data. See the Usage section.
- `/ansible` - global Ansible resources and tasks
- `/bin` - contains all scripts
- `/cases` - contains all test cases
- `/terraform` - global Terraform state, resources, and modules
1. This step is optional, but highly recommended: set up a `vector`-specific AWS profile in your `~/.aws/credentials` file. We highly recommend running the Vector test harness in a separate AWS sandbox account if possible.
2. Create an Amazon-compatible key pair. This will be used for SSH access to test instances.
3. Run `cp .envrc.example .envrc`. Read through the file and update it as necessary.
4. Run `source .envrc` to prepare the environment. Alternatively, install direnv to do this automatically.
5. Run:

   ```
   AWS_PROFILE=timber ./bin/test -t [tcp_to_tcp_performance]
   ```

   This script will take care of running the necessary Terraform and Ansible scripts.
- `bin/test` - run a test
- `bin/cohort` - perform a cohort analysis of test results across a subject's versions
- `bin/compare` - compare test results across all subjects
We recommend cloning a similar test, since doing so removes a lot of the boilerplate. If you prefer to start from scratch:
1. Create a new folder in the `/cases` directory. Its name should end with `_performance` or `_correctness` to clarify the type of test.
2. Add a `README.md` providing an overview of the test. See the `tcp_to_tcp_performance` test for an example.
3. Add a `terraform/main.tf` file for provisioning test resources.
4. Add an `ansible/bootstrap.yml` to bootstrap the environment.
5. Add an `ansible/run.yml` to run the test against each subject.
6. Add any additional files as you see fit for each test.
7. Run `bin/test -t <name_of_test>`.
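The layout described above can be scaffolded in a few lines of shell. This is only a sketch; `my_sink_performance` is a hypothetical test name used for illustration:

```shell
#!/usr/bin/env bash
# Sketch: scaffold the file layout for a new test case.
# "my_sink_performance" is a hypothetical name; real names must end
# with _performance or _correctness.
set -eu

TEST_NAME="my_sink_performance"
TEST_DIR="cases/${TEST_NAME}"

mkdir -p "${TEST_DIR}/terraform" "${TEST_DIR}/ansible"
touch "${TEST_DIR}/README.md" \
      "${TEST_DIR}/terraform/main.tf" \
      "${TEST_DIR}/ansible/bootstrap.yml" \
      "${TEST_DIR}/ansible/run.yml"

ls -R "${TEST_DIR}"
```

From there, fill in the Terraform resources and Ansible playbooks for your specific subject.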
You should not change tests that have historical test data. You can change test subject versions, since test data is partitioned by version, but you cannot change a test's execution strategy, as this would corrupt historical test data. If you need to change a test in a way that would invalidate historical data, we recommend creating an entirely new test.
Simply delete the test's folder and any associated data in the S3 bucket.
The Vector test harness is a mix of bash, Terraform, and Ansible scripts. Each test case lives in the `/cases` directory and has full control over its bootstrap and test process via its own Terraform and Ansible scripts. The location of these scripts is dictated by the `test` script and is outlined in more detail in the Adding a test section. Each test falls into one of two categories: performance tests and correctness tests.
Performance tests measure performance and MUST capture detailed performance data as outlined in the Performance Data and Rules sections.
In addition to the `test` script, there are `compare` and `cohort` scripts. Each of these scripts analyzes the performance data captured when executing a test. More information on this data and how it is captured and analyzed can be found in the Performance Data section. Finally, each script includes a usage overview that you can access with the `--help` flag.
Performance test data is captured via `dstat`, a lightweight utility that captures a variety of system statistics at 1-second snapshot intervals. The final result is a CSV where each row represents a snapshot. You can see the `dstat` command used in the `ansible/roles/profiling/start.yml` file.
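To get a feel for the captured data, here is a sketch that extracts one metric from a dstat-style CSV with `awk`. The sample rows are fabricated for illustration; in a real capture there is one row per 1-second snapshot, with the full column order given by the schema below:

```shell
#!/usr/bin/env bash
# Sketch: average the cpu_usr column (column 2) of a dstat-style CSV.
# The sample rows below are fabricated for illustration only.
set -eu

cat > /tmp/sample_dstat.csv <<'EOF'
1559073720,10.5,2.0
1559073721,12.5,2.5
1559073722,11.0,3.0
EOF

awk -F, '{ sum += $2 } END { printf "avg cpu_usr: %.2f\n", sum / NR }' /tmp/sample_dstat.csv
```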
The performance data schema is reflected in the Athena table definition as well as the CSV itself. The following is an ordered list of columns:
| Name | Type |
|---|---|
| `epoch` | `double` |
| `cpu_usr` | `double` |
| `cpu_sys` | `double` |
| `cpu_idl` | `double` |
| `cpu_wai` | `double` |
| `cpu_hiq` | `double` |
| `cpu_siq` | `double` |
| `disk_read` | `double` |
| `disk_writ` | `double` |
| `io_read` | `double` |
| `io_writ` | `double` |
| `load_avg_1m` | `double` |
| `load_avg_5m` | `double` |
| `load_avg_15m` | `double` |
| `mem_used` | `double` |
| `mem_buff` | `double` |
| `mem_cach` | `double` |
| `mem_free` | `double` |
| `net_recv` | `double` |
| `net_send` | `double` |
| `procs_run` | `double` |
| `procs_bulk` | `double` |
| `procs_new` | `double` |
| `procs_total` | `double` |
| `sys_init` | `double` |
| `sys_csw` | `double` |
| `sock_total` | `double` |
| `sock_tcp` | `double` |
| `sock_udp` | `double` |
| `sock_raw` | `double` |
| `sock_frg` | `double` |
| `tcp_lis` | `double` |
| `tcp_act` | `double` |
| `tcp_syn` | `double` |
| `tcp_tim` | `double` |
| `tcp_clo` | `double` |
All performance data is made public via the `vector-tests` S3 bucket in the `us-east-1` region. The partitioning follows the Hive partitioning structure, with variable names in the path. For example:

```
name=tcp_to_tcp_performance/configuration=default/subject=vector/version=v0.2.0-dev.1-20-gae8eba2/timestamp=1559073720
```

- `name` = the test name.
- `configuration` = the test's specific configuration (tests can have multiple configurations if necessary).
- `subject` = the test subject, such as `vector`.
- `version` = the version of the test subject.
- `timestamp` = when the test was executed.
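Because the path is a series of `key=value` segments, it is straightforward to split it back into its parts in shell. A small sketch, using the example path above:

```shell
#!/usr/bin/env bash
# Sketch: split a Hive-style partition path into its key=value parts.
set -eu

path="name=tcp_to_tcp_performance/configuration=default/subject=vector/version=v0.2.0-dev.1-20-gae8eba2/timestamp=1559073720"

# One key=value pair per line
echo "$path" | tr '/' '\n'

# Pull out a single key, e.g. the subject version
version=$(echo "$path" | tr '/' '\n' | awk -F= '$1 == "version" { print $2 }')
echo "version: $version"
```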
Analysis of this data is performed through the AWS Athena service. This allows us to execute complex queries on the performance data stored in S3. You can see the queries run in the `compare` and `cohort` scripts.
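The real queries run in Athena over the S3 data, but the shape of a compare-style analysis can be sketched locally. Here the subjects and numbers are fabricated; the point is simply grouping snapshots by subject and averaging a metric, roughly a `SELECT subject, AVG(cpu_usr) ... GROUP BY subject`:

```shell
#!/usr/bin/env bash
# Sketch: a compare-style aggregation done locally with awk instead of
# Athena. Subjects and values below are fabricated for illustration.
set -eu

cat > /tmp/sample_results.csv <<'EOF'
vector,45.0
vector,55.0
other_subject,80.0
other_subject,90.0
EOF

# Average the metric per subject, sorted by subject name
result=$(awk -F, '{ sum[$1] += $2; n[$1]++ }
                  END { for (s in sum) printf "%s %.1f\n", s, sum[s] / n[s] }' \
             /tmp/sample_results.csv | sort)
echo "$result"
```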
Correctness tests simply verify behavior. These tests are not required to capture or persist any data. The results can be manually verified and placed in the test's README.
Since correctness tests are pass/fail there is no data to capture other than the successful running of the test.
Generally, correctness tests verify output. Because test subjects differ in capability, we use a variety of methods to capture output (TCP, HTTP, and file). The method is highly dependent on the test subject and what it supports. For example, the Splunk Forwarders only support TCP and Splunk-specific outputs.
To make capturing this data easy, we created a `test_server` Ansible role that spins up various test servers and provides a simple way to capture summary output.
Tests must operate in isolated, reproducible environments; they must never run locally. The obvious benefit is that this removes variables across tests, but it also improves collaboration, since remote environments are easily accessible and reproducible by other engineers.
- ALWAYS filter to resources specific to your `test_name`, `test_configuration`, and `user_id` (ex: Ansible host targeting).
- ALWAYS make sure initial instance state is identical across test subjects. We recommend explicitly stopping all test subjects in case a previous failure left a subject running.
- ALWAYS use the `profile` Ansible role to capture data. This ensures a consistent data structure across tests.
- ALWAYS run performance tests for at least 1 minute so that a 1m CPU load average can be calculated.
- Use Ansible roles whenever possible.
- If you are not testing local data collection, we recommend using TCP as a data source, since it is a lightweight source that is more likely to be consistent, performance-wise, across subjects.
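To illustrate the first rule, resource filtering amounts to narrowing a shared pool of hosts down to the ones tagged with your test and user. The inventory format and naming scheme below are hypothetical, purely to show the idea; the harness's actual targeting is done via Ansible host patterns:

```shell
#!/usr/bin/env bash
# Sketch: filter hosts down to one test_name/user_id pair. The
# inventory format below is hypothetical, for illustration only.
set -eu

cat > /tmp/inventory.txt <<'EOF'
host-1 test_name=tcp_to_tcp_performance user_id=alice
host-2 test_name=tcp_to_tcp_performance user_id=bob
host-3 test_name=file_to_tcp_performance user_id=alice
EOF

matches=$(grep 'test_name=tcp_to_tcp_performance' /tmp/inventory.txt \
            | grep 'user_id=alice')
echo "$matches"
```

Without the `user_id` filter, two engineers running the same test concurrently would target each other's instances.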