ADOC-tracker

What's this?

This is a profiler framework used in the paper "ADOC: Automatically Harmonizing Dataflow Between Components in Log-Structured Key-Value Stores for Improved Performance".

What's it for?

You can use this framework to track the following real-time metrics:

Metric	Measurement tool	Content	Result file
IO information	iostat	CPU utilization (IO process only) and bandwidth, sampled every second	text file named IOSTAT.txt
Process information	pidstat	CPU utilization (sys, usr, total) and disk info (bytes read/written)	stat_result.csv
Real-time throughput	db_bench	time elapsed and throughput	report.csv
Output of db_bench	N/A	workload information, performance summary, level states, etc.	stdout.txt
*perf output	perf	execution trace	perf.out

* The usage of perf is shown in the example directory, but you need to configure it yourself. It is not embedded automatically, because of both its performance overhead and the size of its output.
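
As a rough illustration of working with the throughput file, the sketch below summarizes the contents of a report.csv. It assumes the two-column header secs_elapsed,interval_qps that db_bench's reporter normally writes; check the header of your own file first. The function name summarize_report and the sample numbers are made up for this example.

```python
import csv
import io


def summarize_report(csv_text):
    """Return (total_seconds, mean_qps, peak_qps) for a db_bench report.

    Assumes the columns are named secs_elapsed and interval_qps; adjust
    the names if your report.csv header differs.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [(int(r["secs_elapsed"]), int(r["interval_qps"])) for r in reader]
    if not rows:
        raise ValueError("report contains no samples")
    qps = [q for _, q in rows]
    # The last row carries the total elapsed time of the run.
    return rows[-1][0], sum(qps) / len(qps), max(qps)


# Made-up sample data, only to show the call.
sample = "secs_elapsed,interval_qps\n1,1000\n2,3000\n3,2000\n"
total_seconds, mean_qps, peak_qps = summarize_report(sample)
print(total_seconds, mean_qps, peak_qps)
```

For a real run, pass the text of the generated file instead of the sample string, for example open("report.csv").read().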

There are also some other things you can use, for example:

The cgroup tool is embedded in the system, so you can:
1. limit the CPU clock number to control the vCPUs used by db_bench (set in default.ini)
2. limit the bandwidth by bytes_wrote_in; see the example "bandwidth_influence"
3. make further modifications to make full use of the cgroup tool
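
To make the CPU limit concrete: under the cgroup v1 CPU controller, a group may use cpu.cfs_quota_us microseconds of CPU time in every cpu.cfs_period_us, so a budget of n vCPUs corresponds to a quota of n times the period. The sketch below only does that arithmetic. The function name is hypothetical, and this is not necessarily how the framework applies the value from default.ini.

```python
def cfs_quota_for_vcpus(vcpus, period_us=100_000):
    """Return the cpu.cfs_quota_us value that caps a cgroup at `vcpus` CPUs.

    period_us defaults to 100 ms, the usual kernel default for
    cpu.cfs_period_us.
    """
    if vcpus <= 0:
        raise ValueError("vcpus must be positive")
    return int(vcpus * period_us)


# Print the quota for a few example vCPU budgets.
for n in (1, 2, 0.5):
    print(n, cfs_quota_for_vcpus(n))
```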
You can use the db_bench_dynamic_runner to simulate the following scenario:
1. Your db_bench is running alongside another program with higher throughput (the throughput is generated from Alibaba's workload trace, using only the first hour of machine No. 48)

If you are interested in the impact of each parameter, try the parameter_influence example. We will upload the ANOVA test script later, so that you can use one-way ANOVA to analyze the impact of different parameters and pick the most important ones. This function is inspired by the paper Rafiki.
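
The idea behind one-way ANOVA can be sketched in a few lines of plain Python. This is not the script mentioned above, only an illustration of the F statistic: the variance between groups (for example, throughput samples grouped by one parameter's value) divided by the variance within groups. The function name and the sample numbers are made up.

```python
def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of sample groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more samples than groups")
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation explained by the grouping (between-group sum of squares).
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Residual variation inside each group (within-group sum of squares).
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        raise ValueError("no variation within groups; F is undefined")
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Made-up throughput samples for three settings of one parameter.
samples = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
print(one_way_anova_f(samples))
```

A larger F means the parameter explains more of the variation in the measured metric. Turning F into a p-value requires the F distribution, which a statistics package can provide.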

Warning!!!

The result files can be very large. Use the commands sudo gzip **/LOG* and sudo gzip **/iostat* to compress the oversized files.
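
The ** pattern only recurses into subdirectories in shells that support it (in bash, after shopt -s globstar). A Python equivalent that does not depend on the shell could look like the sketch below. The function name compress_results is hypothetical, and the patterns simply mirror the two commands above.

```python
import gzip
import shutil
from pathlib import Path


def compress_results(root, patterns=("LOG*", "iostat*")):
    """Gzip every file under `root` matching a pattern, then delete the original."""
    compressed = []
    for pattern in patterns:
        # sorted() collects all matches first, so the .gz files created
        # below are not picked up while the tree is being walked.
        for path in sorted(Path(root).rglob(pattern)):
            if not path.is_file() or path.suffix == ".gz":
                continue
            target = path.with_name(path.name + ".gz")
            with open(path, "rb") as src, gzip.open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            path.unlink()
            compressed.append(target)
    return compressed
```

Call it with the directory that holds your result files, and run it with sufficient permissions if the files were written by a process started with sudo.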

Preparation

Download RocksDB and compile db_bench.
1. Modify default.ini and set the db_bench path
2. You can always reload the path within the running script
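
The layout of default.ini is not described here, so the following is only a sketch of how such a file could be read with Python's configparser. The section name [paths] and the key db_bench_path are hypothetical; use whatever names the shipped default.ini actually contains.

```python
import configparser

# Hypothetical contents; the real default.ini may use other names.
EXAMPLE_INI = """
[paths]
db_bench_path = /home/user/rocksdb/db_bench
"""


def read_db_bench_path(ini_text, section="paths", key="db_bench_path"):
    """Return the db_bench path stored in an INI document."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    return parser.get(section, key)


print(read_db_bench_path(EXAMPLE_INI))
```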
This framework was designed for evaluating the impact of thread number and batch size (the common size of the Memtable and SSTable), but you can always change the configuration in config.json.
You will need several Python packages and the following system tools:
1. iostat
2. pidstat
3. top
4. perf
5. cgroup
Please download the flame graph tool from this link if you want to plot flame graphs.
If you are interested, you can find the plot script at this link.

After you have installed all the packages, create a directory and create a DB_launcher class to run your experiments. Refer to the following examples for further details.

What's in the dirs?

dir name	usage
bandwidth_influence	use cgroup to limit the available bandwidth
parameter_influence	traverse all options, and use the ANOVA method to evaluate the impact of each parameter
rate-limited-fillrandom	run the fillrandom workload with a rate-limiter in db_bench
fillrandom	the basic usage, run fillrandom and monitor the resource usage
white_noise_fillrandom	run fillrandom with varying bandwidth, the bandwidth follows a sine function
on_cpu_analysis	run fillrandom and save the perf results
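
For intuition about the white_noise_fillrandom setup, a sine-shaped bandwidth limit can be generated as in the sketch below. The function name and the base, amplitude, and period values are made up for illustration and are not taken from the example directory.

```python
import math


def sine_bandwidth(t_sec, base_mbps=200.0, amplitude_mbps=100.0, period_sec=600.0):
    """Bandwidth limit (MB/s) at time t_sec for a sine-shaped schedule."""
    return base_mbps + amplitude_mbps * math.sin(2 * math.pi * t_sec / period_sec)


# Sample the schedule at quarter-period steps over one full period.
for t in range(0, 601, 150):
    print(t, round(sine_bandwidth(t), 1))
```

With these values the limit oscillates between 100 and 300 MB/s and repeats every 10 minutes.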
