ServerlessFlowBench

Computer Engineering Master's Degree final project @ University of Rome 'Tor Vergata'

Author: Francesco Marino

Academic Year: 2019/2020

ServerlessFlowBench is a framework that allows users to:

deploy serverless functions to Amazon Web Services, Google Cloud Platform and OpenWhisk (already defined functions are available),
deploy serverless function compositions to Amazon Web Services, Google Cloud Platform and OpenWhisk (as before, already defined compositions are available),
perform HTTP benchmarks on deployed functions and compositions.

Execution Requirements

Java Development Kit (JDK) version 8 (recommended) or newer
Docker Desktop with the following Docker images installed:
- amazon/aws-cli with tag 2.0.60
- google/cloud-sdk with tag 316.0.0
- francescom412/ow-utils-complete with tag 63a5498
- influxdb with tag 1.8.2
- grafana/grafana with tag 6.5.0
- mysql with tag 8.0.17
- bschitter/alpine-with-wrk2 with tag 0.1
Amazon Web Services valid account that can access the following services:
- AWS Lambda,
- Amazon API Gateway,
- AWS Step Functions,
- Amazon S3,
- Amazon Rekognition,
- Amazon Translate.
Google Cloud Platform valid account with, at least, the following enabled:
OpenWhisk running deployment
[OPTIONAL] Azure valid and active account with, at least, the following enabled:
- Vision API,
- Face API.

Please note: the framework remains usable even when only one or two of the three serverless platforms are available.
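The Docker images listed above can be fetched ahead of time. The following is a minimal sketch, not part of the framework, that builds the corresponding `docker pull` command lines from the image names and tags given in this section. The names `REQUIRED_IMAGES` and `pull_commands` are hypothetical.

```python
# Image names and tags copied from the requirements list above.
REQUIRED_IMAGES = {
    "amazon/aws-cli": "2.0.60",
    "google/cloud-sdk": "316.0.0",
    "francescom412/ow-utils-complete": "63a5498",
    "influxdb": "1.8.2",
    "grafana/grafana": "6.5.0",
    "mysql": "8.0.17",
    "bschitter/alpine-with-wrk2": "0.1",
}


def pull_commands(images=REQUIRED_IMAGES):
    """Return one `docker pull` argument list per required image."""
    return [["docker", "pull", f"{name}:{tag}"] for name, tag in images.items()]


if __name__ == "__main__":
    # Print the commands instead of running them, so nothing is pulled by accident.
    for command in pull_commands():
        print(" ".join(command))
```

Each printed line can be pasted into a shell, or the argument lists can be handed to `subprocess.run` once Docker Desktop is running.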

Project structure description

docker_env

Folder containing the files needed for a container-based execution of the project architecture.

Content:

docker-compose.yml used to describe and deploy the project support architecture,
grafana_storage folder, to be added if not present, used to store Grafana container content and implement persistence,
influx_storage folder, to be added if not present, used to store InfluxDB container content and implement persistence,
mysql_storage folder, to be added if not present, used to store MySQL container content and implement persistence,
grafana_dashboards folder used to store Grafana dashboards needed to show benchmarks results.

Notes:

Composition containers' description:

MySQL: a relational database used to keep track of every entity deployed to the cloud, so that each of them can be reached and, if needed, deleted.
InfluxDB: a time series database used to keep track of benchmark results, each stored with the date and time at which the test was performed.
Grafana: a visualization tool used to show the benchmark results stored in InfluxDB in clear, explanatory dashboards.
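To illustrate what a timestamped benchmark result looks like in a time series database, here is a minimal sketch that formats one data point in the InfluxDB 1.x line protocol. It is not the framework's InfluxClient.java. The measurement, tag and field names used in the example call are assumptions made for illustration only.

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def _escape_key(text):
    """Escape commas, equals signs and spaces in tag/field keys and tag values."""
    return text.replace(",", r"\,").replace("=", r"\=").replace(" ", r"\ ")


def _escape_measurement(text):
    """Escape commas and spaces in a measurement name."""
    return text.replace(",", r"\,").replace(" ", r"\ ")


def to_line_protocol(measurement, tags, fields, when):
    """Format one point; only float fields are handled to keep the sketch short.

    `when` must be a timezone-aware datetime.
    """
    tag_part = "".join(
        f",{_escape_key(key)}={_escape_key(value)}" for key, value in sorted(tags.items())
    )
    field_part = ",".join(f"{_escape_key(key)}={float(value)}" for key, value in fields.items())
    # Line protocol timestamps default to nanoseconds since the Unix epoch.
    delta = when - EPOCH
    timestamp_ns = (delta.days * 86400 + delta.seconds) * 10**9 + delta.microseconds * 1000
    return f"{_escape_measurement(measurement)}{tag_part} {field_part} {timestamp_ns}"


if __name__ == "__main__":
    print(to_line_protocol("latency_test", {"provider": "aws"},
                           {"avg_latency_ms": 12.5}, datetime.now(timezone.utc)))
```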

Credentials:

The docker-compose.yml file lists the credentials needed to access the service containers:

MySQL:
- username: root,
- password: password.
InfluxDB:
- username: root,
- password: password.
Grafana:
- username: root,
- password: password.

First start:

To import the dashboards saved in grafana_dashboards into Grafana, once the Docker Compose environment is up:

connect to http://localhost:3000,
log in using the Grafana username and password,
select the "setting" panel,
choose "datasources" and add a new datasource,
choose InfluxDB as the datasource type, set http://influx-db:8086 as the URL (replacing "influx-db" with your InfluxDB Docker container name if it differs), select your database (its name can be set in the config.properties file located in the project root) and insert the InfluxDB credentials (please note: an error message appears if the database does not exist yet; make sure the name is correct and ignore the error, since the configuration becomes consistent once the first measurements are inserted),
select the "+" tab,
choose "import" option,
select every dashboard inside the grafana_dashboards directory.
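The datasource step above can also be scripted. The sketch below only builds the JSON body that Grafana's data source HTTP API (`POST /api/datasources`) accepts for an InfluxDB source; it does not send it. The function name, the datasource name "InfluxDB" and the database name "benchmarks" are assumptions, and the dashboards may expect a different datasource name.

```python
import json


def influx_datasource_payload(name, container_name, database, user, password):
    """Build the JSON body for creating an InfluxDB data source in Grafana."""
    return {
        "name": name,
        "type": "influxdb",
        # "proxy" makes the Grafana server contact InfluxDB, which is what
        # allows the Docker container name to be used as the host.
        "access": "proxy",
        "url": f"http://{container_name}:8086",
        "database": database,
        "user": user,
        "secureJsonData": {"password": password},
    }


if __name__ == "__main__":
    body = influx_datasource_payload("InfluxDB", "influx-db", "benchmarks", "root", "password")
    print(json.dumps(body, indent=2))
```

The printed body could then be posted to http://localhost:3000/api/datasources with the Grafana credentials, for example with an HTTP client of your choice.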

serverless_functions

Folder containing examples of serverless functions and compositions created and benchmarked by the author.

Functions:

The implemented functionalities are:

basic_composition: composition that simply calls two different functions.
- latency_test: JSON response generator.
- cpu_test: big number factorization.
memory_test: dynamic array allocation and filling.
face_recognition: detection of face and anger in an image.
- image_recognition: detection of faces.
- anger_detection: detection of anger if face found.
cycle_translator: translation of sentences from any language to English (OpenWhisk version not implemented).
- loop_controller: utility to manage multiple sentence translations at a time.
- language_detection: detection of the sentence language.
- sentence_translation: translation to English language.
- translation_logger: translation logging in a cloud bucket.

Each of them has been implemented in Python, Java and Node.js (JavaScript), in a separate version for each tested provider.
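As an illustration of the kind of CPU-bound work that cpu_test performs, here is a minimal sketch of integer factorization by trial division. It is not the code shipped in serverless_functions; the function name `factorize` and the choice of trial division are assumptions.

```python
def factorize(n):
    """Return the prime factors of n (with multiplicity) by trial division."""
    if n < 2:
        raise ValueError("n must be an integer greater than 1")
    factors = []
    divisor = 2
    while divisor * divisor <= n:
        # Divide out the current divisor as many times as it fits.
        while n % divisor == 0:
            factors.append(divisor)
            n //= divisor
        # After 2, only odd candidates need to be tried.
        divisor += 1 if divisor == 2 else 2
    if n > 1:
        # Whatever remains is a prime factor larger than the square root.
        factors.append(n)
    return factors


if __name__ == "__main__":
    print(factorize(360))
```

The running time grows with the square root of the largest prime factor, so the input size is a convenient knob for tuning how long the function stays busy.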

Content:

aws folder containing functionalities meant to be deployed to Amazon Web Services:
- java containing Java AWS version of the functionalities,
- node containing Node.js AWS version of the functionalities,
- python containing Python AWS version of the functionalities,
- orchestration_handler folder containing a Python handler to execute compositions and return their results.
gcloud folder containing functionalities meant to be deployed to Google Cloud Platform:
- java containing Java Google Cloud version of the functionalities,
- node containing Node.js Google Cloud version of the functionalities,
- python containing Python Google Cloud version of the functionalities,
- orchestration_handler folder containing a Python handler to execute compositions and return their results.
openwhisk folder containing functionalities meant to be deployed to OpenWhisk:
- java containing Java OpenWhisk version of the functionalities,
- node containing Node.js OpenWhisk version of the functionalities,
- python containing Python OpenWhisk version of the functionalities.

src

This directory contains, in its subdirectories, the Java code for the execution of the Serverless Composition Performance Project. Further details are provided in the next section.

Java Project structure description

The entire Java project was developed using JetBrains' IntelliJ IDEA, so it is recommended to open it with this IDE for better code navigation.

The main folder contains the class ServerlessFlowBenchMain.java, the application entry point that allows the user to:

deploy serverless functions,
deploy serverless compositions,
optionally deploy elements needed by the previous entities to work (e.g. cloud buckets),
perform benchmarks on functions and compositions,
deploy serverless functions that collect information about their execution environment,
remove every entity previously deployed.

cmd package

This package contains classes for shell commands execution grouped by functionality type.

In the main folder there are:

CommandExecutor.java, an abstract class providing common functions needed for shell command execution,
CommandUtility.java, an abstract class providing common functions and elements needed for shell command building,
StreamGobbler.java used to collect the output of executed shell commands.

cmd.benchmark_commands package

BenchmarkCommandExecutor.java needed to execute load benchmarks, cold start benchmarks and collect results,
BenchmarkCommandUtility.java needed to build shell commands for load benchmarks execution using wrk2,
output_parsing package containing utilities to parse benchmarks results:
- BenchmarkCollector.java needed to parse wrk2 benchmarks results,
- BenchmarkStats.java needed to collect wrk2 benchmarks results.
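To give an idea of what a load benchmark command built for wrk2 looks like, here is a minimal sketch that assembles the argument list for one constant-throughput run. It is not a port of BenchmarkCommandUtility.java. The flags (`-t`, `-c`, `-d`, `-R`, `--latency`) are wrk2 options, while the function name, the parameter names and the default binary name `wrk2` are assumptions (upstream wrk2 builds a binary called `wrk`, so check the name used inside your container).

```python
def build_wrk2_args(url, threads, connections, duration_s, rate, binary="wrk2"):
    """Return the argument list for one constant-throughput wrk2 run."""
    for label, value in (("threads", threads), ("connections", connections),
                         ("duration_s", duration_s), ("rate", rate)):
        if value <= 0:
            raise ValueError(f"{label} must be positive, got {value}")
    return [
        binary,
        f"-t{threads}",       # number of worker threads
        f"-c{connections}",   # number of open HTTP connections
        f"-d{duration_s}s",   # test duration in seconds
        f"-R{rate}",          # target throughput in requests per second
        "--latency",          # print the detailed latency distribution
        url,
    ]


if __name__ == "__main__":
    print(" ".join(build_wrk2_args("http://localhost:8080/", 2, 100, 30, 500)))
```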

cmd.docker_daemon_utility package

DockerException.java raised when a Docker daemon execution related error occurs,
DockerExecutor.java needed to check that Docker containers are correctly configured, that Docker images are present and that the Docker composition is running.

cmd.functionality_commands package

AmazonCommandUtility.java used to create Amazon Web Services CLI shell commands,
GoogleCommandUtility.java used to create Google Cloud Platform CLI shell commands,
OpenWhiskCommandUtility.java used to create OpenWhisk CLI shell commands,
BucketsCommandExecutor.java used to execute cloud buckets related commands,
CompositionCommandExecutor.java used to execute serverless compositions related commands,
FunctionCommandExecutor.java used to execute serverless functions related commands,
TablesCommandExecutor.java used to execute cloud NoSQL storage related commands,
IllegalNameException.java raised on an attempt to assign a malformed name to a resource,
output_parsing package containing utilities to parse command outputs:
- ReplyCollector.java used to collect console command execution output,
- URLFinder.java used to collect the deployment URL from console command execution output,
security package containing security utilities:
- GoogleAuthClient.java used to authenticate Google Cloud Workflows [BETA] execution URLs.

databases package

This package contains classes needed for external databases interaction.

databases.influx package

InfluxClient.java used to export benchmark results to the time series database InfluxDB.

databases.mysql package

CloudEntityData.java used to collect functions, compositions, bucket and NoSQL table information,
DAO.java, an abstract class providing common information and methods needed by database access objects,
FunctionalityURL.java used to collect the resource deployment URL,
MySQLConnect.java used to connect to and disconnect from the MySQL database,
daos package containing database access objects implementations:
- BucketsRepositoryDAO.java needed for cloud buckets' persistence management,
- CompositionsRepositoryDAO.java needed for serverless compositions' persistence management,
- FunctionsRepositoryDAO.java needed for serverless functions' persistence management,
- TablesRepositoryDAO.java needed for cloud NoSQL tables' persistence management.

utility package

This package contains classes needed for configuration purposes.

ComposeManager.java used to automatically obtain the Docker images referenced in docker-compose.yml,
PropertiesManager.java used to get configuration parameters from the config.properties file stored in the project root (further details are provided in the following sections).

User specific required files

Authentication files related to the user's active services are required to run the application (the ones used during development were excluded through the .gitignore file for privacy reasons).

Amazon Web Services

A file named credentials is required in serverless_functions/aws/.aws. It should contain the AWS account access key and secret, and has the following structure:

[default]
aws_access_key_id=xxxxxxxxxx
aws_secret_access_key=xxxxxxxxxx

It can be downloaded from AWS Console → My Security Credentials (in the account menu) → Access Keys → New Access Key.
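Before starting the framework, it can be useful to check that the credentials file has the structure shown above. The following is a minimal sketch using Python's standard configparser module; the function name `missing_aws_keys` is hypothetical and the check is not performed by the framework itself.

```python
import configparser

REQUIRED_KEYS = ("aws_access_key_id", "aws_secret_access_key")


def missing_aws_keys(text, profile="default"):
    """Return the required keys that are absent or empty in the given profile."""
    # Interpolation is disabled so that '%' characters in values are taken literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    if not parser.has_section(profile):
        return list(REQUIRED_KEYS)
    return [
        key for key in REQUIRED_KEYS
        if not parser.get(profile, key, fallback="").strip()
    ]


if __name__ == "__main__":
    # Path taken from the section above, relative to the project root.
    with open("serverless_functions/aws/.aws/credentials") as handle:
        print(missing_aws_keys(handle.read()) or "credentials file looks complete")
```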

Google Cloud Platform

A file named credentials.json is required in serverless_functions/gcloud/.credentials. It should contain the information of a Google Cloud Platform service account, and has the following structure:

{
  "type": "service_account",
  "project_id": "id of the Google Cloud Platform project",
  "private_key_id": "xxxxxxxxxxxxxxx",
  "private_key": "-----BEGIN PRIVATE KEY-----\nxxxxxxxxxxxxxxxxxx\n-----END PRIVATE KEY-----\n",
  "client_email": "xxxxxxxxxx@xxxxx.xxx",
  "client_id": "xxxxxxxxxxxxxxx",
  "auth_uri": "https://xxxxxxxxxxxxx",
  "token_uri": "https://xxxxxxxxxx",
  "auth_provider_x509_cert_url": "https://xxxxxxxxxx",
  "client_x509_cert_url": "https://xxxxxxxxxxx"
}

It can be downloaded from Google Cloud Platform Console → API and services (in the side menu) → Credentials → Service accounts (selecting the one with desired authorizations) → New key.
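The structure shown above can be checked with a few lines of Python before running the framework. This is a minimal sketch that only verifies the presence of the fields listed in this section; the function name `check_service_account` is hypothetical and no request is made to Google Cloud Platform.

```python
import json

# Field names copied from the credentials.json structure shown above.
EXPECTED_FIELDS = (
    "type", "project_id", "private_key_id", "private_key", "client_email",
    "client_id", "auth_uri", "token_uri", "auth_provider_x509_cert_url",
    "client_x509_cert_url",
)


def check_service_account(text):
    """Return a list of problems found in a service account key file."""
    data = json.loads(text)
    if not isinstance(data, dict):
        return ["top-level JSON value is not an object"]
    problems = [f"missing field: {name}" for name in EXPECTED_FIELDS if name not in data]
    if "type" in data and data["type"] != "service_account":
        problems.append('field "type" is not "service_account"')
    return problems


if __name__ == "__main__":
    # Path taken from the section above, relative to the project root.
    with open("serverless_functions/gcloud/.credentials/credentials.json") as handle:
        print(check_service_account(handle.read()) or "credentials.json looks complete")
```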

Azure API in OpenWhisk [optional]

These files are needed only if the user wants to run benchmarks on OpenWhisk for the originally defined anger detection workflows. Since each file is specific to a single function, several copies of this information are needed. The strings needed to fill these files can be found in the Azure Console → Resources (in the side menu) → Choose the specific Cognitive Service resource → Keys and endpoints.

Java:

A file named AzureConfig.java is required in both serverless_functions/openwhisk/java/face_recognition/anger_detection/src/main/java/anger_detection and serverless_functions/openwhisk/java/face_recognition/image_recognition/src/main/java/image_recognition, with the following structure:

public class AzureConfig {
	protected static String endpoint = "xxxxxxxxxx";
	protected static String key = "xxxxxxxxxx";
}

Node.js:

A file named azureconfig.js is required in both serverless_functions/openwhisk/node/face_recognition/anger_detection and serverless_functions/openwhisk/node/face_recognition/image_recognition, with the following structure:

module.exports = {
    endpoint: "xxxxxxxxxx",
    key: "xxxxxxxxxx"
};

Python:

A file named azureconfig.py is required in both serverless_functions/openwhisk/python/face_recognition/anger_detection and serverless_functions/openwhisk/python/face_recognition/image_recognition, with the following structure:

endpoint = "xxxxxxxxxx"
key = "xxxxxxxxxx"

config.properties

A file named config.properties is required in the project root, with the following structure (filled with valid, current information):

docker_compose_dir=absolute_path_to:docker_env

mysql_ip=localhost ['localhost' to use Docker compose MySQL instance]
mysql_port=3306
mysql_user=xxxxxxx
mysql_password=xxxxxxx
mysql_dbname=xxxxxxx

influx_ip=localhost ['localhost' to use Docker compose InfluxDB instance]
influx_port=8086
influx_user=xxxxxxx
influx_password=xxxxxxx
influx_dbname=xxxxxxx

google_cloud_auth_json_path=absolute_path_to:credentials.json
google_cloud_cli_container_name=gcloud-cli
google_cloud_stage_bucket=name_of_stage_bucket_in_Google_Cloud_Platform

aws_auth_folder_path=absolute_path_to:credentials
aws_lambda_execution_role=arn:xxxxxxx
aws_step_functions_execution_role=arn:xxxxxxx

openwhisk_host=xxx.xxx.xxx.xxx
openwhisk_auth=xxxxxxx
openwhisk_ignore_ssl=True [or False if OpenWhisk is deployed on a SSL certified endpoint]

google_handler_function_path=absolute_path_to:serverless_functions/gcloud/orchestration_handler
aws_handler_function_path=absolute_path_to:serverless_functions/aws/orchestration_handler

Please note: to successfully execute the provided functions on AWS, the Lambda role needs access to Comprehend, Translate, Rekognition, S3 and Step Functions, while the Step Functions role needs access to Lambda only.
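A quick way to catch a forgotten entry in config.properties is to parse the file and compare it with the keys of the template above. The following is a minimal sketch, not the framework's PropertiesManager.java. It assumes that the bracketed notes of the template have been removed from the real file, and the names `parse_properties` and `missing_keys` are hypothetical. Keys belonging to a platform that is not used may legitimately be reported.

```python
# Key names copied from the config.properties template above.
TEMPLATE_KEYS = (
    "docker_compose_dir",
    "mysql_ip", "mysql_port", "mysql_user", "mysql_password", "mysql_dbname",
    "influx_ip", "influx_port", "influx_user", "influx_password", "influx_dbname",
    "google_cloud_auth_json_path", "google_cloud_cli_container_name",
    "google_cloud_stage_bucket",
    "aws_auth_folder_path", "aws_lambda_execution_role",
    "aws_step_functions_execution_role",
    "openwhisk_host", "openwhisk_auth", "openwhisk_ignore_ssl",
    "google_handler_function_path", "aws_handler_function_path",
)


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and '#' or '!' comments."""
    properties = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, separator, value = line.partition("=")
        if separator:
            properties[key.strip()] = value.strip()
    return properties


def missing_keys(properties, expected=TEMPLATE_KEYS):
    """Return the template keys that are absent or have an empty value."""
    return [key for key in expected if not properties.get(key)]


if __name__ == "__main__":
    with open("config.properties") as handle:
        print(missing_keys(parse_properties(handle.read())) or "all template keys are set")
```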

Serverless functions packages creation

This section explains how to create packages ready for deployment to the different service providers.

Amazon Web Services

Java:

The .jar file to deploy can be easily created using the project management tool Maven.

Here is an example of the pom.xml file.

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>GROUP_ID</groupId>
    <artifactId>PROJECT_NAME</artifactId>
    <version>VERSION</version>

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
    </properties>

    <dependencies>
        <dependency>
            <groupId>com.amazonaws</groupId>
            <artifactId>aws-lambda-java-core</artifactId>
            <version>x.x.x</version>
        </dependency>
        <dependency>
            <groupId>com.amazonaws</groupId>
            <artifactId>aws-lambda-java-events</artifactId>
            <version>x.x.x</version>
        </dependency>
        <dependency>
            <groupId>com.amazonaws</groupId>
            <artifactId>aws-lambda-java-log4j2</artifactId>
            <version>x.x.x</version>
        </dependency>
        <dependency>
            <groupId>javax.json</groupId>
            <artifactId>javax.json-api</artifactId>
            <version>x.x.x</version>
        </dependency>
        <dependency>
            <groupId>javax.json.bind</groupId>
            <artifactId>javax.json.bind-api</artifactId>
            <version>x.x.x</version>
        </dependency>
        <dependency>
            <groupId>org.glassfish</groupId>
            <artifactId>javax.json</artifactId>
            <version>x.x.x</version>
        </dependency>
        <dependency>
            <groupId>com.google.code.gson</groupId>
            <artifactId>gson</artifactId>
            <version>x.x.x</version>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-shade-plugin</artifactId>
                <version>x.x.x</version>
                <configuration>
                    <createDependencyReducedPom>false</createDependencyReducedPom>
                </configuration>
                <executions>
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>shade</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>x.x.x</version>
                <configuration>
                    <source>1.8</source>
                    <target>1.8</target>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>

Node.js:

In order to create a Node.js zipped package:

define the package.json file with every needed dependency (an example can be found at the end of this subsection),
install every needed dependency using npm inside a folder named node_modules placed in the Node.js project root,
put package.json file, node_modules folder and .js code files inside a .zip archive ready to be deployed.

Here is an example of the package.json file.

{
  "name": "PROJECT_NAME",
  "version": "VERSION",
  "description": "PROJECT_DESCRIPTION",
  "main": "index.js",
  "author": "PROJECT_AUTHOR",
  "license": "ISC",
  "dependencies": {
    "dependency_name": "^x.x.x"
  }
}

Please note: package creation for the AWS Node.js example functions can be performed automatically by running the generate_archives.sh script.
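The archive creation step for Node.js can also be sketched with Python's standard zipfile module. The following is an illustration only and is not the generate_archives.sh script; the function name `zip_node_package` is hypothetical. It collects package.json, the node_modules folder and every .js file found under the project root.

```python
import os
import zipfile


def zip_node_package(project_root, archive_path):
    """Zip package.json, node_modules and .js files; return the archived names."""
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for folder, _subfolders, files in os.walk(project_root):
            for name in files:
                full_path = os.path.join(folder, name)
                relative = os.path.relpath(full_path, project_root)
                top_level = relative.split(os.sep)[0]
                wanted = (
                    relative == "package.json"
                    or top_level == "node_modules"
                    or name.endswith(".js")
                )
                if wanted:
                    # Paths are stored relative to the project root.
                    archive.write(full_path, relative)
        return sorted(archive.namelist())


if __name__ == "__main__":
    # Hypothetical paths: replace them with a real project root and target file.
    print(zip_node_package("my_function", "my_function.zip"))
```

Writing the archive outside the project root, as in the example call, avoids any risk of the archive being picked up by a later run.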

Python:

In order to create a Python zipped package:

install every needed dependency using pip inside the Python project root,
put every installed dependency and the .py files inside a .zip archive ready to be deployed.

Please note:

In the common case where the function only needs to communicate with AWS services, a .zip archive containing just the .py files is sufficient.
Package creation for the AWS Python example functions can be performed automatically by running the generate_archives.sh script.

Google Cloud Platform

For Google Cloud Platform no archive creation is needed.

Java:

The project to deploy can be easily created using Maven; to perform the deployment, it is enough to pass the project root path to the deployment utility.

Here is an example of the pom.xml file needed for Google Cloud Functions deployment.

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>GROUP_ID</groupId>
    <artifactId>PROJECT_NAME</artifactId>
    <version>VERSION</version>

    <properties>
        <maven.compiler.target>1.8</maven.compiler.target>
        <maven.compiler.source>1.8</maven.compiler.source>
    </properties>

    <dependencies>
        <dependency>
            <groupId>com.google.cloud.functions</groupId>
            <artifactId>functions-framework-api</artifactId>
            <version>x.x.x</version>
        </dependency>
        <dependency>
            <groupId>javax.json</groupId>
            <artifactId>javax.json-api</artifactId>
            <version>x.x.x</version>
        </dependency>
        <dependency>
            <groupId>javax.json.bind</groupId>
            <artifactId>javax.json.bind-api</artifactId>
            <version>x.x.x</version>
        </dependency>
        <dependency>
            <groupId>org.glassfish</groupId>
            <artifactId>javax.json</artifactId>
            <version>x.x.x</version>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>x.x.x</version>
                <configuration>
                    <excludes>
                        <exclude>.google/</exclude>
                    </excludes>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>

Node.js:

In order to create a Node.js package to deploy:

define the package.json file with every needed dependency (an example can be found in the Amazon Web Services Node.js section),
put package.json file and .js code files inside the project root to deploy and pass its absolute path to the deployment utility.

Python:

In order to create a Python package to deploy:

put every needed .py file in the package root,
create a requirements.txt file in the package root with every needed dependency.

The deployment process is similar to the ones for Node.js and Java in Google Cloud Platform.

Here is an example of the requirements.txt file needed for Google Cloud Functions deployment.

dependency-name==x.x.x
dependency-name==x.x.x
dependency-name==x.x.x
...

OpenWhisk

Java:

The .jar file to deploy can be created, again, using Maven.

Here is an example of the pom.xml file.

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>GROUP_ID</groupId>
    <artifactId>PROJECT_NAME</artifactId>
    <version>VERSION</version>

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
    </properties>

    <dependencies>
        <dependency>
            <groupId>com.google.code.gson</groupId>
            <artifactId>gson</artifactId>
            <version>x.x.x</version>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-shade-plugin</artifactId>
                <version>x.x.x</version>
                <configuration>
                    <createDependencyReducedPom>false</createDependencyReducedPom>
                </configuration>
                <executions>
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>shade</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>x.x.x</version>
                <configuration>
                    <source>1.8</source>
                    <target>1.8</target>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>

Node.js:

In order to create a Node.js zipped package:

define the package.json file with every needed dependency (an example can be found in the Amazon Web Services Node.js subsection),
install every needed dependency using npm inside a folder named node_modules placed in the Node.js project root,
put package.json file, node_modules folder and .js code files inside a .zip archive ready to be deployed.

Please note: package creation for the OpenWhisk Node.js example functions can be performed automatically by running the generate_archives.sh script.

Python:

In order to create a Python zipped package:

create the entry point file in the Python project root and name it __main__.py,
create a virtual environment,
install every needed dependency using pip inside the virtual environment,
put the virtualenv directory and the .py files inside a .zip archive ready to be deployed.

To create a virtual environment, run the following command from the Python project root:

$ virtualenv virtualenv

To install dependencies, run the following commands from the Python project root:

$ source virtualenv/bin/activate
(virtualenv) $ pip install dependency-name
(virtualenv) $ pip install dependency-name
...

Please note: package creation for the OpenWhisk Python example functions can be performed automatically by running the generate_archives.sh script.
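Once an OpenWhisk Python archive has been built, its layout can be checked against the steps above. The following is a minimal sketch based only on what this section describes (a __main__.py entry point and a virtualenv directory at the archive root); the function name `check_openwhisk_python_archive` is hypothetical.

```python
import zipfile


def check_openwhisk_python_archive(archive):
    """Return a list of layout problems; `archive` is a path or a file-like object."""
    with zipfile.ZipFile(archive) as opened:
        names = opened.namelist()
    problems = []
    if "__main__.py" not in names:
        problems.append("__main__.py is missing from the archive root")
    if not any(name.startswith("virtualenv/") for name in names):
        problems.append("virtualenv directory is missing from the archive root")
    return problems


if __name__ == "__main__":
    # Hypothetical archive name: replace it with the archive you created.
    print(check_openwhisk_python_archive("my_function.zip") or "archive layout looks fine")
```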
