
Artifact for the OSDI '23 paper: "Ensō: A Streaming Interface for NIC-Application Communication"

Home page: https://enso.cs.cmu.edu


Ensō Evaluation


Ensō is a high-performance streaming interface for NIC-application communication. Here you will find the scripts and instructions to reproduce the main claims in the OSDI '23 paper. We also include the applications modified to use Ensō that we use as part of the evaluation. Refer to the Ensō repository for Ensō's source code and to the Ensō documentation for more information on how to use Ensō for your own projects.

The instructions are split into two sections. The first explains how to set up the environment and manually run an experiment using an echo server. The second explains how to automatically run experiments that verify the main claims in the paper.

Getting Started Instructions

In this section, we describe how to set up the environment and run a simple Ensō-based echo server, feeding it packets generated by the EnsōGen packet generator.

Testbed

You need two or three machines to run the experiments:

  • DUT: This is the machine with the "design under test".
  • Packet Generator: This is the machine that will be used to generate and send traffic to the DUT, and profile the response traffic echoed back by the DUT.
  • Client: This is the machine that will run the experiment script and coordinate both the DUT and the Packet Generator machines. It may be the same as the Packet Generator machine or, for instance, your laptop. The Client machine should have SSH access to both the DUT and the Packet Generator machines.

The DUT machine and the Packet Generator machine should be equipped with an Intel Stratix 10 MX FPGA and be connected back-to-back by a 100GbE cable.

Requirements

Both the DUT and Packet Generator machines should be capable of running Ensō. Refer to Ensō's Setup for the system requirements and dependencies. If the DUT and the Packet Generator machines are running Ubuntu, the experiment.py script will attempt to automatically install the dependencies for you.

In addition, the client machine should have Python 3.9 or later installed, as well as rsync and ssh. These are not automatically installed by the experiment script.

If Python 3.9 or later is not available in your distribution, you may install a recent Python through Homebrew or pyenv.
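A quick way to check the interpreter on the client machine (a minimal sketch, using only the standard library):

```python
import sys

# Check that the local Python satisfies the 3.9+ requirement.
version = sys.version.split()[0]
if sys.version_info >= (3, 9):
    print(f"Python {version}: OK")
else:
    print(f"Python {version}: too old, install 3.9 or later")
```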

Plots

Plotting the results of the experiments requires some extra dependencies (LaTeX). On Ubuntu, you may install the required dependencies by running:

sudo apt install texlive-latex-base texlive-latex-extra \
  texlive-fonts-recommended texlive-fonts-extra cm-super dvipng

SSH Connectivity

You should be able to access both the DUT and the Packet Generator machines from the client machine using key-based SSH with a passwordless key. The experiment script does not support the SSH agent, so you must specify the IdentityFile in the ~/.ssh/config file on the client machine. For example:

Host host_a
    HostName host1.example.com
    User user
    IdentityFile ~/.ssh/id_rsa

Host host_b
    HostName host2.example.com
    User user
    IdentityFile ~/.ssh/id_rsa

You will use the host names defined in the ~/.ssh/config file to specify the DUT and Packet Generator machines in the experiment configuration (in the example above, host_a and host_b).

Setup

Once you have all the required client dependencies installed and SSH connectivity to all machines, you are ready to set up the experiment environment. Start by cloning this repository on the client machine, if you haven't already:

git clone https://github.com/crossroadsfpga/enso_eval
cd enso_eval

With the repository cloned, you can automatically set up the experiment environment by running the setup.sh script (in the root enso_eval directory) on the client machine. The script takes as its argument either a path to the bitstream to be used in the experiments1 or --download to automatically download the appropriate bitstream from Ensō's git repository. For example:

./setup.sh --download

The script will also automatically clone the Ensō repository, install some Python dependencies needed on the client machine, and pre-populate the experiment_config.toml configuration file.

After running the script you should edit the generated experiment_config.toml file to specify the host names of the DUT and Packet Generator machines, as well as other details about the environment such as PCIe addresses and FPGA IDs. Refer to the comments in the configuration file itself for more details about each option that you need to configure.

Once you are done setting all the options in the configuration file, you can use the experiment.py script to run the experiments. We will describe how to run the complete set of experiments in the next section. For now, run the experiment.py script with the --setup-only option to skip the experiments and make sure that everything is set up correctly. For example:

python3 experiment.py . --setup-only

Helper environment variables

To avoid having to specify command line arguments for parameters that you already defined in the experiment_config.toml file, you can use the set_constants.py script to define helper environment variables based on the configuration file. We will use these variables in the next steps.

To define the helper variables in your shell, run the command below on the client machine:

source <(python3 set_constants.py)

Environment variables are only defined in the current shell session. If you open a new terminal window, remember to run the above command again!
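We do not show set_constants.py's internals here; conceptually, such a script reads the configuration and prints shell export lines for the caller to source. A hypothetical sketch (the config keys and values below are illustrative, not the script's actual ones):

```python
# Illustrative sketch of how a script like set_constants.py can turn
# configuration values into "export" lines for the shell to source.
# The real script reads experiment_config.toml; this dict stands in for it.
def emit_exports(config: dict) -> list:
    return [f"export {key.upper()}={value}" for key, value in config.items()]

config = {"dut_hostname": "host_a", "pktgen_hostname": "host_b"}
for line in emit_exports(config):
    print(line)
# export DUT_HOSTNAME=host_a
# export PKTGEN_HOSTNAME=host_b
```

Because the output is printed rather than set directly, `source <(python3 set_constants.py)` applies the variables to the current shell session.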

Running a simple echo experiment

To get things started, we will describe how to manually run an echo server on the DUT machine and feed it with packets generated using EnsōGen on the Packet Generator machine. This serves two purposes. First, it should help familiarize you with how to run an application using Ensō. Second, it verifies the main claim that Ensō can sustain full 100 Gbps line rate (148.8 Mpps) using a single core.
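The 148.8 Mpps figure follows from Ethernet's per-packet wire overhead: a minimum-size 64 B frame also occupies 8 B of preamble and 12 B of inter-frame gap on the wire. A quick back-of-the-envelope check:

```python
# 100 Gbps line rate in packets per second for minimum-size frames:
# each 64 B frame carries 8 B of preamble and 12 B of inter-frame gap,
# so it occupies 84 B (672 bits) on the wire.
frame = 64
overhead = 8 + 12
bits_on_wire = (frame + overhead) * 8  # 672 bits per packet
pps = 100e9 / bits_on_wire
print(f"{pps / 1e6:.1f} Mpps")  # 148.8
```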

Loading the bitstream

Start by using the newly-installed enso command to load the bitstream and configure the Ensō NICs on both the DUT and Packet Generator machines. On the client machine, run:

# Bring up the NIC on the Packet Generator machine.
enso $PKTGEN_ENSO_PATH --host $PKTGEN_HOSTNAME \
  --fpga $PKTGEN_ENSO_FPGA_ID --fallback-queues 4 --enable-rr

# Bring up the NIC on the DUT machine.
enso $DUT_ENSO_PATH --host $DUT_HOSTNAME --fpga $ENSO_DUT_FPGA_ID

You may run both commands at the same time in different terminal windows. Make sure to define the helper environment variables in each terminal window that you open by running source <(python3 set_constants.py).

Below is an explanation of the options used.

To set up the Packet Generator machine's Ensō NIC:

  • $PKTGEN_ENSO_PATH: This is the absolute path to the root of the Ensō repository (e.g., /home/pktgen/enso) on the Packet Generator machine.
  • --host $PKTGEN_HOSTNAME: This specifies the host name of the Packet Generator machine as defined in the ~/.ssh/config file (e.g., host_a or host_b in the example above).
  • --fpga $PKTGEN_ENSO_FPGA_ID: This specifies the FPGA ID to use on the Packet Generator machine.
  • --fallback-queues 4: This specifies the number of fallback queues to use to receive packets back in EnsōGen. Fallback queues are used to send packets that do not match any rule in the Ensō NIC's flow table. Using fallback queues ensures that EnsōGen will receive any packet that arrives at the host's Ensō NIC.
  • --enable-rr: Fallback queues work using RSS by default. This option enables round-robin scheduling of fallback queues, which ensures that EnsōGen will receive back an even distribution of packets among its pipes, regardless of the workload.
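To illustrate the difference between the two policies, here is a toy sketch (the queue count and hash function are stand-ins, not Ensō's actual implementation):

```python
# Toy illustration of the two fallback-queue distribution policies:
# RSS sends each flow's packets to a hash-determined queue, while
# round-robin spreads packets evenly regardless of the flows involved.
NB_QUEUES = 4

def rss_queue(flow_id: int) -> int:
    # Stand-in for a real RSS hash over the flow's 5-tuple.
    return hash(flow_id) % NB_QUEUES

def round_robin(n_packets: int) -> list:
    return [i % NB_QUEUES for i in range(n_packets)]

# A single-flow workload: RSS puts every packet on one queue,
# while round-robin still spreads packets across all four.
print([rss_queue(42) for _ in range(8)])  # all the same queue
print(round_robin(8))                     # [0, 1, 2, 3, 0, 1, 2, 3]
```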

To set up the DUT machine's Ensō NIC:

  • $DUT_ENSO_PATH: This is the absolute path to the root of the Ensō repository (e.g., /home/dut/enso) on the DUT machine.
  • --host $DUT_HOSTNAME: This specifies the host name of the DUT machine as defined in the ~/.ssh/config file (e.g., host_a or host_b in the example above).
  • --fpga $ENSO_DUT_FPGA_ID: This specifies the FPGA ID to use on the DUT machine.

After the commands finish running, you will be presented with a JTAG console for both FPGAs. You can leave the consoles open as we will use them to check NIC statistics later. You should open a new terminal window to run the next steps.

Before moving on, check if the FPGAs are listed as PCIe devices. You can do so by running enso/scripts/list_enso_nics.sh on both machines. If an FPGA is listed as a USB device but not as a PCIe device, you should reboot the machine and run the above command again for the corresponding machine. The script will also give you a warning if you need to reboot.2

Running the echo server and EnsōGen

Now that both FPGAs are loaded and configured, you should run the echo server. To do so, run the following commands (starting at the client machine so that we can export the environment variables):

# Ensure that you are at the enso_eval repo and that
# you have defined the helper environment variables.
cd <enso_eval repo>
source <(python3 set_constants.py)

# SSH to the DUT machine and bring the helper environment variables along.
ssh -t $DUT_HOSTNAME "$(python3 set_constants.py); exec \$SHELL -l"

# Run the echo server.
cd $DUT_ENSO_PATH
sudo ./build/software/examples/echo 1 2 0

This runs the echo server with one CPU core and two pipes, without busy looping for each packet (the last argument sets extra cycles per packet to zero). You can later try changing the number of CPU cores, pipes per core, and cycles per packet to see how they affect performance (remember to also change EnsōGen's input pcap accordingly).

After the echo server is running, you can run EnsōGen to send packets. You may specify any pcap file as input to EnsōGen. For convenience, we provide some sample pcaps that you can use in enso/scripts/sample_pcaps. These pcaps are composed of packets with the destination IP and port that match the echo server's default bind.

Start by sending 1000 packets (again starting at the client machine so that we can export the environment variables):

# Ensure that you are at the enso_eval repo and that
# you have defined the helper environment variables.
cd <enso_eval repo>
source <(python3 set_constants.py)

# SSH to the pktgen machine and bring the helper environment variables along.
ssh -t $PKTGEN_HOSTNAME "$(python3 set_constants.py); exec \$SHELL -l"

# Run EnsōGen.
cd $PKTGEN_ENSO_PATH
sudo ./scripts/ensogen.sh ./scripts/sample_pcaps/2_64_1_2.pcap 100 \
  --pcie-addr $PKTGEN_PCIE_ADDR_ENSO --count 1000

Note that 100 is the sending rate in Gbps and --pcie-addr $PKTGEN_PCIE_ADDR_ENSO specifies the PCIe address of the Ensō NIC that we will use with EnsōGen. The --count option specifies the number of packets to send.

The sample pcap file names follow the format <number of destinations>_<packet size>_<number of sources>_<number of packets>.pcap. 64 B is the minimum packet size, and here we use a pcap with two destinations so that packets reach both pipes. EnsōGen sends the same pcap file repeatedly until the target number of packets is reached.
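A small helper to decode these file names might look like this (a sketch based only on the naming convention above; the function name is our own):

```python
# Decode an EnsoGen sample pcap name of the form
# "<number of destinations>_<packet size>_<number of sources>_<number of packets>.pcap".
def parse_pcap_name(name: str) -> dict:
    stem = name.rsplit(".", 1)[0]
    nb_dsts, pkt_size, nb_srcs, nb_pkts = (int(f) for f in stem.split("_"))
    return {"destinations": nb_dsts, "packet_size": pkt_size,
            "sources": nb_srcs, "packets": nb_pkts}

print(parse_pcap_name("2_64_1_2.pcap"))
# {'destinations': 2, 'packet_size': 64, 'sources': 1, 'packets': 2}
```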

After running the command above, you should see the echo server forwarding 1000 packets and EnsōGen receiving them back. You can check the statistics on the JTAG console for both machines to see how the NIC counters have changed. Simply type the following command on each JTAG console to see the statistics:

get_top_stats

You should pay attention to the first two counters: IN_PKT and OUT_PKT. These show the number of packets that entered the NIC (RX) and left the NIC (TX). You may refer to the Hardware Counter docs for a description of each counter.

You can also have EnsōGen send an unlimited number of packets by omitting the --count option. It will then send packets until you press Ctrl+C in the EnsōGen terminal. Let's use this to verify the throughput of the echo server:

# Ensure that you are at the enso_eval repo and that
# you have defined the helper environment variables.
cd <enso_eval repo>
source <(python3 set_constants.py)

ssh -t $PKTGEN_HOSTNAME "$(python3 set_constants.py); exec \$SHELL -l"

cd $PKTGEN_ENSO_PATH
sudo ./scripts/ensogen.sh ./scripts/sample_pcaps/2_64_1_2.pcap 100 \
  --pcie-addr $PKTGEN_PCIE_ADDR_ENSO  # Note that we do not use --count here.

If your CPU is powerful enough, you should see that the echo server is sending and receiving packets at full 100 Gbps line rate (148.8 Mpps) using a single core! (You may run htop on the DUT machine to check the CPU utilization.)

You may now stop EnsōGen by pressing Ctrl+C. After EnsōGen stops, you can also stop the echo server by pressing Ctrl+C. The same is true for the two JTAG consoles.

Detailed Instructions

We now describe how to reproduce the main claims in the paper. Assuming that you have followed the previous instructions to configure SSH connectivity and set up the environment, you should now be able to run the experiments associated with each claim automatically (using experiment.py) and plot the corresponding figures (using paper_plots.py).

Claims

Tabulated below is a brief summary of each claim, its associated figure name in paper_plots.py, the corresponding experiment names in experiment.py, and the estimated time to run.

| Claim | Figure name in paper_plots.py | Experiment names in experiment.py | Estimated time to run |
|---|---|---|---|
| #1 Ensō can reach full 100 Gbps line rate (148.8 Mpps) with a single core. | rate_vs_cores | "Ensō throughput vs. cores" | 3 min |
| #2 Reactive notifications allow Ensō to reach 100 Gbps but increase latency. | rtt_vs_load_reactive_notif | "Ensō RTT vs. load", "Ensō (notification per packet) RTT vs. load" | 7 min |
| #3 Ensō's notification prefetching keeps the high-throughput benefits of reactive notifications while reducing the latency impact. | rtt_vs_load_pref_notif | "Ensō RTT vs. load", "Ensō (prefetching) RTT vs. load" | 7 min |
| #4 Ensō scales to more than 1000 pipes with 4 cores but throughput drops when using more than 32 pipes with a single core and the most pessimistic workload. | rate_vs_nb_pipes | "Ensō throughput vs. ensō pipes (1 core)", "Ensō throughput vs. ensō pipes (2 cores)", "Ensō throughput vs. ensō pipes (4 cores)", "Ensō throughput vs. ensō pipes (8 cores)" | 20 min |
| #5 Ensō can achieve 100 Gbps throughput regardless of number of cores and packet sizes. | rate_vs_cores_vs_pkt_sizes | "Ensō throughput vs. packet size" | 9 min |
| #6 Ensō's implementation of Google's Maglev load balancer reaches 148.8 Mpps with four cores for both the cached and the SYN-flood workload. | maglev | "Ensō Maglev throughput (SYN flood)", "Ensō Maglev throughput (Cached)" | 7 min |
| All claims (avoiding redundant experiments) | | | 42 min |

Running the Experiments

You may choose to run all the experiments or only a specific subset to satisfy a desired claim. In the table above, we list the set of experiments associated with each claim. For instance, to reproduce claim #2, you may run the following command (from the client machine):

./experiment.py ../data -f "Ensō RTT vs. load" \
                        -f "Ensō (notification per packet) RTT vs. load"

If you do not specify any filter with -f, experiment.py will run all the available experiments.

When you run experiment.py, it will automatically copy the enso and enso_eval repositories to the DUT and Packet Generator machines, run the setup on these machines, and then run the experiments. The script will automatically checkpoint progress and, in case of interruption, resume from where it left off. This is also useful for claims that share some of the experiments, as the script will avoid repeating the experiments that have already been run.

The result of the experiments will be saved in the data directory that you specify (../data in the example above). You should always use the same data directory to avoid running duplicate experiments.

Producing plots

Once experiment.py finishes running, you can produce the plot associated with a given claim using the paper_plots.py script. You can specify which plot to produce using the --pick option followed by the figure name. Refer to the claims table for the figure name associated with each claim.

For instance, to produce the plot for claim #2, you may run the following command (from the client machine):

./paper_plots.py ../data ../plots --pick rtt_vs_load_reactive_notif

This will save the plot named rtt_vs_load_reactive_notif.pdf in the ../plots directory.

If you run paper_plots.py without --pick, it will try to produce all the plots with data available in the data directory that you specified.

Running more samples

All throughput experiments follow the RFC 2544 methodology: a binary search finds the maximum throughput that can be sustained without packet loss. This adds substantial time to the experiments, as we automatically try up to $\log_2(1000) \approx 10$ different rates to find the maximum lossless rate (for 100 Gbps with 0.1 Gbps precision). In the paper, we also repeat each binary search 10 times (i.e., up to 100 samples per configuration in total).

To speed things up, experiment.py runs only a single binary search for each configuration by default. This saves time when evaluating the artifact but may make the results noisier than the 10 searches per configuration used in the paper. To run the experiments with more iterations, pass --iters <number of iterations> to the experiment.py script. Be aware that running all experiments with 10 iterations per configuration takes around 7 hours.
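The binary-search procedure described above can be sketched as follows (has_loss stands in for running one RFC 2544 trial at a given rate; the 73.2 Gbps threshold is a made-up example):

```python
# RFC 2544-style binary search over sending rates (in Gbps): find the
# highest rate a design sustains with zero packet loss, to within a
# given precision. Needs ~log2((hi - lo) / precision) ~= 10 trials for
# the 0-100 Gbps range at 0.1 Gbps precision.
def max_lossless_rate(has_loss, lo=0.0, hi=100.0, precision=0.1):
    while hi - lo > precision:
        mid = (lo + hi) / 2
        if has_loss(mid):
            hi = mid  # loss observed: the sustainable rate is below mid
        else:
            lo = mid  # no loss: the sustainable rate is at least mid
    return lo

# Example with a hypothetical design that starts dropping above 73.2 Gbps:
print(f"{max_lossless_rate(lambda rate: rate > 73.2):.1f} Gbps")  # 73.1 Gbps
```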

Logs

You may watch the logs while the experiments are running to see what is being executed in either the DUT or Packet Generator machines. Logs are also useful to diagnose any issues that may arise during the experiments.

To watch the DUT machine logs, run (on the client machine):

tail -f dut.log

To watch the Packet Generator machine logs, run (on the client machine):

tail -f pktgen.log

Other experiments

This repository also contains the code needed to reproduce the other experiments in the paper's evaluation, including the baseline experiments that evaluate the E810 NIC with DPDK as well as the remaining applications that we ported to run on Ensō.

Evaluating the E810 with DPDK

You can also use the experiment.py script to run experiments with the E810 NIC. However, to do so, you may need to change the testbed configuration so that the Stratix 10 MX FPGA on the Packet Generator machine is connected to the E810 NIC on the DUT machine. The DUT machine must also have DPDK 20.11 installed. Refer to DPDK's documentation for instructions on how to install it.

Once you have the testbed configured and DPDK installed, you can run the experiments with the --dpdk e810 option. As with the Ensō experiments, you should specify a path in the client machine to save the results from the experiments (it may be the same as the one used for the Ensō experiments).

./experiment.py ../data --dpdk e810

Remaining applications

This repository also includes the following applications:

All applications work with both Ensō and DPDK. You can refer to the README files in each application's directory for instructions on how to run them.

Footnotes

  1. Refer to the Ensō documentation for more information on how to synthesize the bitstream.

  2. Rebooting should only be required the first time you load the FPGA after the server has been power cycled. Note that this is not the same as rebooting the machine.

License: BSD 3-Clause Clear License