Demo of Non-Hierarchical Caching (NHC)

This is a demo of our FAST'21 paper on non-hierarchical caching (NHC). This implementation is based on Guanzhou's fork of Intel's Open-CAS Framework (OCF) and an event-driven flash SSD simulator called FlashSim.

NHC was originally named multi-factor caching (MFC). The old name is used in this demo code. Following the OCF naming convention, the capacity device is named core and the performant device is named cache.

Tested to work on Ubuntu 20.04.

Overview

Folder structure:

# This is the Flashsim submodule
flashsim/

# This is the user-level context
contexts/
 |- ul-exp/     # Context code wrapping OCF, for user-level benchmarking    
 |   |- src/
 |   |   |- cache/      # Cache volume FlashSim driver, queue, and log
 |   |   |- core/       # Core  volume FlashSim driver, queue, and log
 |   |   |- simfs/      # Dummy application context
 |   |   |- fuzzy/      # Fuzzy testing workload (for correctness)
 |   |   |- bench/      # All benchmarking logic should go here
 |   |   |- main.c
 |   |- Makefile
 |   |- cache-ssd.conf  # Cache SSD configuration used in experiments
 |   |- core-ssd.conf   # Core  SSD configuration used in experiments
 |   |- run-flashsim.sh

# This is the OCF library; we added the mf cache modes to the engine
doc/
env/
 |- posix/      # POSIX environment specific support
inc/            # OCF headers exposed to context code
src/            # OCF library source code
 |- engine/             
 |   |- engine_mfwa.c   # Multi-factor cache mode read with write-around
 |   |- engine_mfwa.h
 |   |- engine_mfwb.c   # Multi-factor cache mode read with write-back
 |   |- engine_mfwb.h
 |   |- mf_monitor.c    # Multi-factor monitor logic
 |   |- mf_monitor.h
tests/
Makefile        # Do not run `make` directly in the root path

Everything we have added into the OCF library is marked by [Orthus FLAG BEGIN] and [Orthus FLAG END] for easier future reference.

Preparation

Prerequisites:

$ sudo apt update
$ sudo apt upgrade
$ sudo apt install libboost-all-dev
$ sudo apt install python3-pip
$ pip3 install matplotlib

Clone the repo recursively (there is a submodule - the Flash SSD simulator flashsim):

$ git clone --recursive git@github.com:josehu07/nhc-demo.git
$ cd nhc-demo
$ git submodule update --init --recursive

Go into the example context ul-exp and compile:

$ cd contexts/ul-exp
$ make

This links the OCF library to this location and compiles it together with the context's main file into a single executable ./bench. Run all the following steps from this ul-exp/ directory.

Throughput Benchmarking

Cache & Core Flashsim Instances

First, set up the cache SSD and core SSD configurations in cache-ssd.conf and core-ssd.conf (the defaults should be good enough). Ensure that the PAGE_ENABLE_DATA option is set to 0 in both configs. Then, start the cache and core FlashSim devices:

# In shell 1:
$ ./run-flashsim.sh cache
  # Compiles flashsim on the first invocation

# In shell 2:
$ ./run-flashsim.sh core

Make sure the current file system supports creating the UNIX-domain sockets that FlashSim requires; otherwise, bind() fails.
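One way to check this before launching the devices is to try binding a throwaway UNIX-domain socket in the working directory. The following is a minimal sketch, not part of the demo: the function name `can_bind_unix_socket` and the probe file name are made up for illustration, and it assumes FlashSim creates its sockets in the directory it is started from.

```python
import os
import socket


def can_bind_unix_socket(directory="."):
    """Return True if a UNIX-domain socket can be bound inside `directory`."""
    # Hypothetical probe file name; any unused name in the directory works.
    path = os.path.join(directory, "sock-probe-%d" % os.getpid())
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.bind(path)  # the same call that fails on unsupported file systems
        return True
    except OSError:
        return False
    finally:
        sock.close()
        # bind() leaves a socket file behind on success; clean it up.
        if os.path.exists(path):
            os.unlink(path)


if __name__ == "__main__":
    if can_bind_unix_socket():
        print("UNIX-domain sockets can be created here")
    else:
        print("bind() failed in this directory")
```

If the probe fails, moving the checkout to a native Linux file system is a reasonable first thing to try.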

Doing the Throughput Benchmark

Then, in yet another shell:

# In shell 3:
$ ./tp-benchmark.sh MODE INTENSITY READ_PERCENTAGE HIT_RATIO 
  # Where:
  #    mode := pt|wa|wb|wt|mfwa|mfwb|mfwt
  #    intensity (reqs/sec) must be a multiple of 10
  #    read_percentage := 100|95|50|0
  #    hit_ratio := 99|95|80
  # E.g., ./tp-benchmark.sh mfwa 12000 100 99

This generates a result txt file under result/ with an appropriate filename.
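When running many rounds, it can help to validate the arguments and build the command line in one place. The sketch below only encodes the constraints listed in the usage text above. The helper names `build_command` and `run_sweep` are hypothetical, the "positive" check on intensity is an added assumption, and whether the FlashSim instances can be reused across consecutive runs without a restart is not stated here.

```python
import subprocess

# Allowed values, copied from the usage text of tp-benchmark.sh.
MODES = {"pt", "wa", "wb", "wt", "mfwa", "mfwb", "mfwt"}
READ_PERCENTAGES = {100, 95, 50, 0}
HIT_RATIOS = {99, 95, 80}


def build_command(mode, intensity, read_percentage, hit_ratio):
    """Validate the arguments and return the argv list for one run."""
    if mode not in MODES:
        raise ValueError("unknown mode: %r" % (mode,))
    # Assumption: intensity must also be positive, not just a multiple of 10.
    if intensity <= 0 or intensity % 10 != 0:
        raise ValueError("intensity must be a positive multiple of 10")
    if read_percentage not in READ_PERCENTAGES:
        raise ValueError("unsupported read percentage: %r" % (read_percentage,))
    if hit_ratio not in HIT_RATIOS:
        raise ValueError("unsupported hit ratio: %r" % (hit_ratio,))
    return ["./tp-benchmark.sh", mode, str(intensity),
            str(read_percentage), str(hit_ratio)]


def run_sweep(modes, intensity, read_percentage, hit_ratio):
    """Run the benchmark once per mode (hypothetical helper).

    Assumes it is called from ul-exp/ with both FlashSim devices running.
    """
    for mode in modes:
        cmd = build_command(mode, intensity, read_percentage, hit_ratio)
        subprocess.run(cmd, check=True)
```

For example, `run_sweep(["wa", "mfwa"], 12000, 100, 99)` would issue the two runs compared later in this document.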

Visualizing the Results Over Time

After running several experiments with different parameters, visualize all the result txt files over time with:

$ python3 plot-results.py

Each result txt file produces a corresponding png image under result/.

Seeing the Effects of NHC

NHC adapts by redirecting an appropriate amount of excess traffic to the core device when the request intensity exceeds the cache device's maximum bandwidth. In this case, the core device serves some of the requests (extra throughput) and may also relieve contention at the cache device (keeping the cache at its optimal throughput). For example, the following two settings reveal the benefits of NHC:

Original write-around (wa) with intensity 12000 reqs/sec, 100% read, 99% hit rate: ~30 MiB/s throughput at cache;
Non-hierarchical write-around (mfwa) with the same parameters: ~31 MiB/s throughput at cache + ~10 MiB/s at core.
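To make the comparison concrete, the aggregate throughput of each setting is the sum of what the two devices deliver. The short calculation below uses only the approximate figures quoted above; the variable and function names are illustrative.

```python
def total_throughput(cache_mib_s, core_mib_s=0.0):
    """Aggregate throughput is the sum of both devices' contributions."""
    return cache_mib_s + core_mib_s


# Approximate figures quoted above (12000 reqs/sec, 100% read, 99% hit rate).
wa_total = total_throughput(30.0)          # wa: cache only
mfwa_total = total_throughput(31.0, 10.0)  # mfwa: cache + core

gain_percent = (mfwa_total - wa_total) / wa_total * 100.0

print("wa:   %.0f MiB/s" % wa_total)
print("mfwa: %.0f MiB/s" % mfwa_total)
print("gain: %.1f%%" % gain_percent)
```

With these rounded numbers, mfwa delivers about 41 MiB/s in total against about 30 MiB/s for wa, roughly a one-third improvement.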

More on Demo Usage

Fuzzy Testing

Ensure that the PAGE_ENABLE_DATA option is set to 1 in both the cache and core FlashSim config files. Then, start the cache and core FlashSim devices in two shells, the same way as for normal benchmarking.
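Because this setting differs between throughput benchmarking (0) and fuzzy testing (1), a quick check of both config files before starting the devices can avoid a wasted run. The sketch below assumes the config files consist of whitespace-separated `KEY VALUE` lines with `#` starting a comment; that format is an assumption and should be verified against cache-ssd.conf. The helper names `read_option` and `check_configs` are hypothetical.

```python
def read_option(conf_text, key):
    """Return the value of `key` as a string, or None if it is absent.

    Assumes one `KEY VALUE` pair per line and `#` comments (unverified).
    """
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        parts = line.split()
        if parts[0] == key and len(parts) >= 2:
            return parts[1]
    return None


def check_configs(paths, expected):
    """Return the config paths whose PAGE_ENABLE_DATA is not `expected`."""
    mismatched = []
    for path in paths:
        with open(path) as conf_file:
            value = read_option(conf_file.read(), "PAGE_ENABLE_DATA")
        if value != expected:
            mismatched.append(path)
    return mismatched
```

For fuzzy testing, `check_configs(["cache-ssd.conf", "core-ssd.conf"], "1")` should return an empty list; for throughput benchmarking, pass `"0"` instead.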

Then, in yet another shell:

# In shell 3:
$ ./bench MODE fuzzy
  # E.g., ./bench mfwa fuzzy

Adding a New Benchmark

To add a new benchmarking experiment called new_bench for example, do:

Navigate to the folder contexts/ul-exp/src/bench/, which holds the implementations of the benchmarking experiments. Add new_bench.c & new_bench.h there. (Use throughput.c & throughput.h as a guide.)
In contexts/ul-exp/src/main.c, find the arrays bench_names and bench_funcs at the top, and add your new benchmark to both. (Use the throughput benchmark as a guide.)

The exact way of invoking ./bench on experiments depends on the arguments expected by the benchmark code.
