NoServer 凵

NoServer is a full-system queuing simulator for serverless workflows.

Key Features

  • Supports serverless workflows, scaling to thousands of function instances and hundreds of nodes.
  • Simulates all abstraction layers of a typical serverless system.
  • Disaggregates the three scheduling dimensions: load balancing (LB), autoscaling (AS), and function placement (FP).
    • Separates serverless policy logic from the underlying system implementation (see the sketch below).
  • Models heterogeneous worker types (e.g., Harvest VMs).
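
To make the last two points concrete, the three scheduling dimensions can be expressed as independent, pluggable policies. The sketch below is a hypothetical illustration of that separation; the class and method names are assumptions, not NoServer's actual API:

    # Hypothetical policy interfaces illustrating the LB/AS/FP split.
    # Names and signatures are illustrative, not NoServer's actual API.
    from abc import ABC, abstractmethod

    class LoadBalancingPolicy(ABC):
        @abstractmethod
        def route(self, invocation, instances):
            """Pick the function instance that serves this invocation."""

    class AutoscalingPolicy(ABC):
        @abstractmethod
        def desired_scale(self, function, metrics):
            """Return the target instance count for a function."""

    class PlacementPolicy(ABC):
        @abstractmethod
        def place(self, new_instance, nodes):
            """Choose the worker node that hosts a new instance."""

    # Example policy: route to the instance with the shortest queue.
    class LeastLoaded(LoadBalancingPolicy):
        def route(self, invocation, instances):
            return min(instances, key=lambda inst: len(inst.queue))

Because each policy only reads simulator state (queues, metrics, nodes), swapping one policy out never requires touching the underlying system model.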

Overview

Architecture of NoServer

Each abstraction layer builds on a base queuing model; the sketch below gives a rough illustration of the idea.
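
As a rough illustration of what such a queuing model captures (not NoServer's actual implementation), the snippet below simulates a single-core function instance draining a FIFO queue of Poisson arrivals with a deterministic 1 s service time:

    # Minimal M/D/1-style sketch of one function instance: Poisson
    # arrivals, FIFO queue, one core, deterministic service time.
    # Illustrative only; not NoServer's actual model.
    import random

    def simulate(arrival_rate_rps, service_time_s, n, seed=0):
        rng = random.Random(seed)
        t = 0.0          # arrival clock
        free_at = 0.0    # time at which the core next becomes idle
        delays = []
        for _ in range(n):
            t += rng.expovariate(arrival_rate_rps)  # next Poisson arrival
            start = max(t, free_at)                 # wait if the core is busy
            delays.append(start - t)                # queuing delay (excl. service)
            free_at = start + service_time_s
        return delays

    delays = sorted(simulate(arrival_rate_rps=0.8, service_time_s=1.0, n=4096))
    print(f"p50 queuing delay: {delays[len(delays) // 2]:.2f} s")
    print(f"p99 queuing delay: {delays[int(len(delays) * 0.99)]:.2f} s")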

Setup

$ pip3 install -r requirements.txt

Usage

$ python3 -m noserver [flags]

Flags:
  --mode: Simulation mode to run. Available options: [test, rps, dag, benchmark, trace].
  --trace: Path to the DAG trace to simulate. Default: 'data/trace_dags.pkl'.
  --hvm: Specify a fixed Harvest VM from the trace to simulate.
  --logfile: Log file path.
  --display: Display the task DAG. (Opposite option: --nodisplay)
  --vm: Number of normal VMs. Default: 2.
  --cores: Number of cores per VM. Default: 40.
  --stages: Number of stages in the task DAG. Default: 8.
  --invocations: Total number of invocations in the task DAG. Default: 4096.
  --width: Width of the DAG. Default: 1.
  --depth: Depth of the DAG. Default: 1.
  --rps: Requests-per-second arrival rate. Default: 1.0.
  --config: Path to a configuration file. Default: './configs/default.py'.
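
For example, the following runs a DAG simulation on four normal VMs at 2 requests per second (the flag values here are arbitrary; only --mode is required):

$ python3 -m noserver --mode dag --vm 4 --cores 40 --stages 8 --invocations 1024 --rps 2.0 --nodisplay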

Note:
  • The '--mode' flag is required. You must provide a valid simulation mode.
  • Use '--display' to show the task DAG graph. Use '--nodisplay' to suppress the display.
  • To use other configurations: 
    python3 -m noserver --config=./configs/another_config.py:params
  • To override parameters:
    python3 -m noserver --mode dag --noconfig.harvestvm.ENABLE_HARVEST
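
The syntax above (a '.py:params' suffix and '--noconfig.*' boolean overrides) matches the ml_collections config_flags convention, where a configuration file is a Python module exposing get_config(). Assuming that convention, a minimal config file might look like the sketch below; only harvestvm.ENABLE_HARVEST is taken from the override example, the rest is hypothetical:

    # Hypothetical ./configs/another_config.py, assuming the
    # ml_collections config_flags convention; fields other than
    # harvestvm.ENABLE_HARVEST are illustrative.
    import ml_collections

    def get_config(params: str = "") -> ml_collections.ConfigDict:
        config = ml_collections.ConfigDict()
        config.harvestvm = ml_collections.ConfigDict()
        # Toggled off on the command line via --noconfig.harvestvm.ENABLE_HARVEST.
        config.harvestvm.ENABLE_HARVEST = True
        return config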

Validation

I validated NoServer against the serverless platform vHive (a benchmarking framework built around Knative).

For the following experiments, the cluster specifications were as follows:

  • Machine type: c220g5 on CloudLab Wisconsin
    • Number of cores per node: 40 (2 sockets w/ 10 cores each, 2 hyperthreads per core)
    • Maximum theoretical throughput: around 40 requests per second per node, since each hardware thread completes at most one 1 s request per second (a quick check follows this list).
  • Cluster size: 11 nodes
    • 1 master + 10 workers
  • Function execution time: 1 s
  • Function memory footprint: 170 MiB
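
The ~40 rps figure follows directly from these specs, as the quick back-of-the-envelope check below shows (the cluster-wide number is my extrapolation, assuming linear scaling across workers):

    # Back-of-the-envelope peak throughput for the cluster above.
    threads_per_node = 2 * 10 * 2   # sockets x cores x hyperthreads = 40
    exec_time_s = 1.0               # function execution time
    workers = 10                    # worker nodes (master excluded)

    per_node_rps = threads_per_node / exec_time_s
    print(f"per-node peak: {per_node_rps:.0f} rps")            # 40 rps
    print(f"cluster peak:  {per_node_rps * workers:.0f} rps")  # 400 rps, assuming linear scaling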

Resource Utilization

Validation of Resource Utilization



Real-Time Autoscaling

Real-Time Autoscaling



Latency & Cold Start

In the following experiments, the cluster was not warmed up, so that cold starts were preserved.

  • 50th percentile (p50):

Validation of p50 Queuing Latency



  • 99th percentile (p99):

Validation of p99 Queuing Latency



License

Apache License 2.0

