Full end-to-end test harness for the Vector log & metrics router. This is the test framework used to generate the performance and correctness results displayed in the Vector docs. You can learn more about how it works in the How It Works section, and begin using it via the Usage section.
- `disk_buffer_performance` test
- `file_to_tcp_performance` test
- `tcp_to_blackhole_performance` test
- `tcp_to_tcp_performance` test
- `tcp_to_http_performance` test
- `regex_parsing_performance` test
- `disk_buffer_persistence_correctness` test
- `file_rotate_create_correctness` test
- `file_rotate_truncate_correctness` test
- `file_truncate_correctness` test
- `sighup_correctness` test
- `wrapped_json_correctness` test
- High-level results can be found in the Vector performance and correctness documentation sections.
- Detailed results can be found within each test case's README.
- Raw performance result data can be found in our public S3 bucket.
- You can run your own queries against the raw data. See the Usage section.
- `/ansible` - global Ansible resources and tasks
- `/bin` - contains all scripts
- `/cases` - contains all test cases
- `/terraform` - global Terraform state, resources, and modules
1. This step is optional, but highly recommended: set up a `vector`-specific AWS profile in your `~/.aws/credentials` file. We highly recommend running the Vector test harness in a separate AWS sandbox account if possible.
2. Create an Amazon-compatible key pair. This will be used for SSH access to test instances.
3. Run `cp .envrc.example .envrc`. Read through the file and update it as necessary.
4. Run `source .envrc` to prepare the environment. Alternatively, install direnv to do this automatically.
5. Run:

   ```
   AWS_PROFILE=timber ./bin/test -t [tcp_to_tcp_performance]
   ```

   This script will take care of running the necessary Terraform and Ansible scripts.
- `bin/test` - run a test
- `bin/cohort` - perform a cohort analysis of test results across a subject's versions
- `bin/compare` - compare test results across all subjects
We recommend cloning a similar test, since doing so removes a lot of the boilerplate. If you prefer to start from scratch:
1. Create a new folder in the `/cases` directory. Its name should end with `_performance` or `_correctness` to clarify the type of test.
2. Add a `README.md` providing an overview of the test. See the `tcp_to_tcp_performance` test for an example.
3. Add a `terraform/main.tf` file for provisioning test resources.
4. Add an `ansible/bootstrap.yml` to bootstrap the environment.
5. Add an `ansible/run.yml` to run the test against each subject.
6. Add any additional files as you see fit for each test.
7. Run `bin/test -t <name_of_test>`.
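The layout described above can be scaffolded in a few lines of shell. This is only a sketch; `my_sink_performance` is a hypothetical test name used for illustration:

```shell
#!/usr/bin/env bash
# Sketch: scaffold the file layout for a new test case.
# "my_sink_performance" is a hypothetical name; real names must end
# with _performance or _correctness.
set -eu

TEST_NAME="my_sink_performance"
TEST_DIR="cases/${TEST_NAME}"

mkdir -p "${TEST_DIR}/terraform" "${TEST_DIR}/ansible"
touch "${TEST_DIR}/README.md" \
      "${TEST_DIR}/terraform/main.tf" \
      "${TEST_DIR}/ansible/bootstrap.yml" \
      "${TEST_DIR}/ansible/run.yml"

ls -R "${TEST_DIR}"
```

From there, fill in the Terraform resources and Ansible playbooks for your specific subject.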
You should not change tests that have historical test data. You can change test subject versions, since test data is partitioned by version, but you cannot change a test's execution strategy, as this would corrupt historical test data. If you need to change a test in a way that would invalidate historical data, we recommend creating an entirely new test.
Simply delete the test's folder and any associated data in the S3 bucket.
The Vector test harness is a mix of bash, Terraform, and Ansible scripts. Each test case lives in the `/cases` directory and has full control over its bootstrap and test process via its own Terraform and Ansible scripts. The location of these scripts is dictated by the `test` script and is outlined in more detail in the Adding a test section. Each test falls into one of two categories: performance tests and correctness tests.
Performance tests measure performance and MUST capture detailed performance data as outlined in the Performance Data and Rules sections.
In addition to the `test` script, there are `compare` and `cohort` scripts. Each of these scripts analyzes the performance data captured when executing a test. More information on this data and how it is captured and analyzed can be found in the Performance Data section. Finally, each script includes a usage overview that you can access with the `--help` flag.
Performance test data is captured via `dstat`, a lightweight utility that captures a variety of system statistics at 1-second snapshot intervals. The final result is a CSV where each row represents a snapshot. You can see the `dstat` command used in the `ansible/roles/profiling/start.yml` file.
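To get a feel for the captured data, here is a sketch that extracts one metric from a dstat-style CSV with `awk`. The sample rows are fabricated for illustration; in a real capture there is one row per 1-second snapshot, with the full column order given by the schema below:

```shell
#!/usr/bin/env bash
# Sketch: average the cpu_usr column (column 2) of a dstat-style CSV.
# The sample rows below are fabricated for illustration only.
set -eu

cat > /tmp/sample_dstat.csv <<'EOF'
1559073720,10.5,2.0
1559073721,12.5,2.5
1559073722,11.0,3.0
EOF

awk -F, '{ sum += $2 } END { printf "avg cpu_usr: %.2f\n", sum / NR }' /tmp/sample_dstat.csv
```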
The performance data schema is reflected in the Athena table definition as well as the CSV itself. The following is an ordered list of columns:
| Name | Type |
|---|---|
| `epoch` | `double` |
| `cpu_usr` | `double` |
| `cpu_sys` | `double` |
| `cpu_idl` | `double` |
| `cpu_wai` | `double` |
| `cpu_hiq` | `double` |
| `cpu_siq` | `double` |
| `disk_read` | `double` |
| `disk_writ` | `double` |
| `io_read` | `double` |
| `io_writ` | `double` |
| `load_avg_1m` | `double` |
| `load_avg_5m` | `double` |
| `load_avg_15m` | `double` |
| `mem_used` | `double` |
| `mem_buff` | `double` |
| `mem_cach` | `double` |
| `mem_free` | `double` |
| `net_recv` | `double` |
| `net_send` | `double` |
| `procs_run` | `double` |
| `procs_bulk` | `double` |
| `procs_new` | `double` |
| `procs_total` | `double` |
| `sys_init` | `double` |
| `sys_csw` | `double` |
| `sock_total` | `double` |
| `sock_tcp` | `double` |
| `sock_udp` | `double` |
| `sock_raw` | `double` |
| `sock_frg` | `double` |
| `tcp_lis` | `double` |
| `tcp_act` | `double` |
| `tcp_syn` | `double` |
| `tcp_tim` | `double` |
| `tcp_clo` | `double` |
All performance data is made public via the `vector-tests` S3 bucket in the `us-east-1` region. The partitioning follows the Hive partitioning structure, with variable names in the path. For example:

```
name=tcp_to_tcp_performance/configuration=default/subject=vector/version=v0.2.0-dev.1-20-gae8eba2/timestamp=1559073720
```

- `name` = the test name.
- `configuration` = the test's specific configuration (tests can have multiple configurations if necessary).
- `subject` = the test subject, such as `vector`.
- `version` = the version of the test subject.
- `timestamp` = when the test was executed.
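Because the path is a series of `key=value` segments, it is straightforward to split it back into its parts in shell. A small sketch, using the example path above:

```shell
#!/usr/bin/env bash
# Sketch: split a Hive-style partition path into its key=value parts.
set -eu

path="name=tcp_to_tcp_performance/configuration=default/subject=vector/version=v0.2.0-dev.1-20-gae8eba2/timestamp=1559073720"

# One key=value pair per line
echo "$path" | tr '/' '\n'

# Pull out a single key, e.g. the subject version
version=$(echo "$path" | tr '/' '\n' | awk -F= '$1 == "version" { print $2 }')
echo "version: $version"
```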
Analysis of this data is performed through the AWS Athena service. This allows us to execute complex queries on the performance data stored in S3. You can see the queries run in the `compare` and `cohort` scripts.
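The real queries run in Athena over the S3 data, but the shape of a compare-style analysis can be sketched locally. Here the subjects and numbers are fabricated; the point is simply grouping snapshots by subject and averaging a metric, roughly a `SELECT subject, AVG(cpu_usr) ... GROUP BY subject`:

```shell
#!/usr/bin/env bash
# Sketch: a compare-style aggregation done locally with awk instead of
# Athena. Subjects and values below are fabricated for illustration.
set -eu

cat > /tmp/sample_results.csv <<'EOF'
vector,45.0
vector,55.0
other_subject,80.0
other_subject,90.0
EOF

# Average the metric per subject, sorted by subject name
result=$(awk -F, '{ sum[$1] += $2; n[$1]++ }
                  END { for (s in sum) printf "%s %.1f\n", s, sum[s] / n[s] }' \
             /tmp/sample_results.csv | sort)
echo "$result"
```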
Correctness tests simply verify behavior. These tests are not required to capture or persist any data. The results can be manually verified and placed in the test's README.
Since correctness tests are pass/fail there is no data to capture other than the successful running of the test.
Generally, correctness tests verify output. Because test subjects differ in capability, we use a variety of methods to capture output (TCP, HTTP, and file). The method is highly dependent on the test subject and what it supports. For example, the Splunk Forwarders only support TCP and Splunk-specific outputs.
To make capturing this data easy, we created a `test_server` Ansible role that spins up various test servers and provides a simple way to capture summary output.
Tests must operate in isolated, reproducible environments; they must never run locally. The obvious benefit is that this removes variables across tests, but it also improves collaboration, since remote environments are easily accessible and reproducible by other engineers.
- ALWAYS filter to resources specific to your `test_name`, `test_configuration`, and `user_id` (ex: Ansible host targeting).
- ALWAYS make sure initial instance state is identical across test subjects. We recommend explicitly stopping all test subjects in case a previous failure left a subject running.
- ALWAYS use the `profile` Ansible role to capture data. This ensures a consistent data structure across tests.
- ALWAYS run performance tests for at least 1 minute so that a 1m CPU load average can be calculated.
- Use Ansible roles whenever possible.
- If you are not testing local data collection, we recommend using TCP as a data source, since it is a lightweight source that is more likely to be consistent, performance-wise, across subjects.
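To illustrate the first rule, resource filtering amounts to narrowing a shared pool of hosts down to the ones tagged with your test and user. The inventory format and naming scheme below are hypothetical, purely to show the idea; the harness's actual targeting is done via Ansible host patterns:

```shell
#!/usr/bin/env bash
# Sketch: filter hosts down to one test_name/user_id pair. The
# inventory format below is hypothetical, for illustration only.
set -eu

cat > /tmp/inventory.txt <<'EOF'
host-1 test_name=tcp_to_tcp_performance user_id=alice
host-2 test_name=tcp_to_tcp_performance user_id=bob
host-3 test_name=file_to_tcp_performance user_id=alice
EOF

matches=$(grep 'test_name=tcp_to_tcp_performance' /tmp/inventory.txt \
            | grep 'user_id=alice')
echo "$matches"
```

Without the `user_id` filter, two engineers running the same test concurrently would target each other's instances.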