This project implements peer to peer distribution of content (such as files or OCI container images) in a Kubernetes cluster. The source of the content could be another node in the same cluster, an OCI container registry (like Azure Container Registry) or a remote blob store (such as Azure Blob Storage).
This project is a work in progress and can be used for experimental and development purposes. It is not yet production ready, but is under active development.
This section shows how to get started with peerd. To see all available commands, run make help.
$ make help
_____ _
| __ \ | |
| |__) |__ ___ _ __ __| |
| ___/ _ \/ _ \ '__/ _` |
| | | __/ __/ | | (_| |
|_| \___|\___|_| \__,_|
all Runs the peerd build targets in the correct order.
build-image Build the peerd docker image.
build Build the peerd packages.
coverage Generates test results for code coverage.
help Generates help for all targets with a description.
...
peerd is a self-contained binary that can be run directly on each node of a cluster, as a systemd service (peerd.service).
Alternatively, it can also be deployed as DaemonSet pods using the helm chart.
- An existing Kubernetes cluster with containerd as the container runtime.
With containerd, peerd leverages the hosts configuration to act as a mirror for container images.
The helm chart deploys a DaemonSet to the peerd-ns namespace, and mounts the containerd socket to the peerd containers.
The peerd container image is available at ghcr.io/azure/acr/peerd. To deploy, run the following.
CLUSTER_CONTEXT=<your-cluster-context> && \
TAG=v0.0.3-alpha && \
HELM_RELEASE_NAME=peerd && \
HELM_CHART_DIR=./build/package/peerd-helm && \
helm --kube-context=$CLUSTER_CONTEXT install --wait $HELM_RELEASE_NAME $HELM_CHART_DIR \
--set peerd.image.ref=ghcr.io/azure/acr/dev/peerd:$TAG
By default, mcr.microsoft.com and ghcr.io are mirrored, but this is configurable. For example, to mirror docker.io as well, run the following.
CLUSTER_CONTEXT=<your-cluster-context> && \
TAG=v0.0.3-alpha && \
HELM_RELEASE_NAME=peerd && \
HELM_CHART_DIR=./build/package/peerd-helm && \
helm --kube-context=$CLUSTER_CONTEXT install --wait $HELM_RELEASE_NAME $HELM_CHART_DIR \
--set peerd.image.ref=ghcr.io/azure/acr/dev/peerd:$TAG \
--set peerd.hosts="mcr.microsoft.com ghcr.io docker.io"
On deployment, each peerd instance will try to connect to its peers in the cluster.
- When connected successfully, each pod will generate an event P2PConnected. This event is used to signal that the peerd instance is ready to serve requests to its peers.
- When a request is served by downloading data from a peer, peerd will emit an event called P2PActive, signalling that it's actively communicating with a peer and serving data from it.
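One way to look for these events from the command line is sketched below. It assumes the event name (P2PConnected or P2PActive) is recorded as the Kubernetes event reason, which this document does not confirm, and that CLUSTER_CONTEXT is set as in the deployment commands. The function name peerd_events is hypothetical.

```shell
# peerd_events lists events with the given reason in the peerd-ns namespace.
# Assumption: peerd records the event name as the Kubernetes event "reason".
peerd_events() {
  local reason="$1"
  kubectl --context="$CLUSTER_CONTEXT" -n peerd-ns get events \
    --field-selector "reason=${reason}"
}
```

For example, peerd_events P2PConnected should list one event per connected pod if the assumption holds.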
To see logs from the peerd pods, run the following.
kubectl --context=$CLUSTER_CONTEXT -n peerd-ns logs -l app=peerd -f
For local development or experimentation, you can build the peerd docker image, create a kind cluster, and deploy the peerd application to each node in it. To build and deploy to a 3 node kind cluster, run the following.
$ make build-image && \
make kind-create kind-deploy
...
...
daemonset.apps/peerd created
service/peerd created
waiting for pods to connect
pods: peerd-5trwv peerd-q2c45 peerd-tkj5k
checking pod 'peerd-5trwv' for event 'P2PConnected'
checking pod 'peerd-q2c45' for event 'P2PConnected'
checking pod 'peerd-tkj5k' for event 'P2PConnected'
Success: All pods have event 'P2PConnected'.
Clean up your deployment.
$ make kind-delete
There are two kinds of test workloads available in this repository:
- Simple peer to peer file sharing by specifying the range of bytes to read.
  - This enables block level file drivers, such as Overlaybd, to use peerd as the p2p proxy.
  - This test is run by deploying the random test workload to the kind cluster.
  - The workload is deployed to each node, and outputs performance metrics that are observed by it, such as the speed of downloads and error rates.

      $ make ci-kind-random
      ...
      {"level":"info","node":"random-zb9vm","version":"bb7ee6a","mode":"upstream","size":22980743,"readsPerBlob":5,"time":"2024-03-07T21:50:29Z","message":"downloading blob"}
      {"level":"info","node":"random-9gcvw","version":"bb7ee6a","upstream.p50":21.25170790666404,"upstream.p75":5.834663359546446,"upstream.p90":0.7871542327673121,"upstream.p95":0.2965091294200036,"upstream.p100":0.2645602612715345,"time":"2024-03-07T21:50:34Z","message":"speeds (MB/s)"}
      {"level":"info","node":"random-9gcvw","version":"bb7ee6a","p2p.p50":5.802082290454193,"p2p.p75":1.986398855488793,"p2p.p90":0.6210418172329215,"p2p.p95":0.0523776186045032,"p2p.p100":0.023341096448268952,"time":"2024-03-07T21:50:34Z","message":"speeds (MB/s)"}
      {"level":"info","node":"random-9gcvw","version":"bb7ee6a","p2p.error_rate":0,"upstream.error_rate":0,"time":"2024-03-07T21:50:34Z","message":"error rates"}
      ...
      # Clean up
      $ make kind-delete

- Peer to peer sharing of container images that are available in the containerd content store of a node.
  - This enables pulling container images from peers in the cluster, instead of from the registry.
  - This test is run by deploying the ctr test workload to the kind cluster.
  - The workload is deployed to each node, and outputs performance metrics that are observed by it, such as the speed of downloads and error rates.

      $ make ci-kind-ctr
      ...
      ...
      ...
      # Clean up
      $ make kind-delete
To build the peerd binary, run the following.
$ make
...
The build produces a binary and a systemd service unit file. Additionally, it bin-places the API swagger file.
|-- peerd # The binary
|-- peerd.service # The service unit file for systemd
|-- swagger.yml # The swagger file for the REST API
peerd allows a Kubernetes node to share its container images (when using containerd) with other nodes in the cluster.
It also allows a node to act as a mirror for files obtained from any HTTP upstream source (such as an Azure Blob using
a SAS URL), and can discover and serve a specified byte range of the file to/from other nodes in the cluster.
When a range of an HTTP file is requested, peerd first attempts to discover if any of its peers already have that exact
range. The file is identified by its SHA256 digest, and only upstream URLs that specify this digest are supported.
For example, to download the first 100 bytes of a layer of the mcr.microsoft.com/hello-world:latest container image,
whose SAS URL can be obtained by querying GET https://mcr.microsoft.com/v2/hello-world/blobs/sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4,
the following request can be made to peerd, which is assumed to run at http://localhost:30000.
GET http://localhost:30000/blobs/https://westus2.data.mcr.microsoft.com/01031d61e1024861afee5d512651eb9f-h36fskt2ei//docker/registry/v2/blobs/sha256/a3/a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4/data?se=2024-03-13T21%3A35%3A45Z&sig=mSdsz%2FXkQjze%2Bzhy7pEAlr0WPrUnlhbcgnPfAoxVzuE%3D&sp=r&spr=https&sr=b&sv=2018-03-28&regid=01031d61e1024861afee5d512651eb9f
Range: bytes=0-99
peerd will first attempt to discover this range from its peers and, if found, will download the range from one of
them. If not found, it will download the range from the upstream HTTP source and serve it to the client. Additionally,
the instance will cache the bytes (and optionally, prefetch the entire file) and advertise the cached bytes to its peers,
so that they can serve it in the future without having to download it from the upstream source.
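The request above could be issued with curl roughly as follows. This is a minimal sketch: the function names peerd_blob_url and peerd_get_range and the PEERD_ADDR variable are hypothetical helpers, and the address http://localhost:30000 is the assumption already made in this section. HTTP byte ranges are inclusive, so the first 100 bytes are bytes 0 through 99.

```shell
# Assumption: peerd listens at http://localhost:30000 unless PEERD_ADDR is set.
PEERD_ADDR="${PEERD_ADDR:-http://localhost:30000}"

# peerd_blob_url prints the peerd proxy URL for an upstream blob URL,
# following the /blobs/<upstream URL> shape of the example request.
peerd_blob_url() {
  printf '%s/blobs/%s\n' "$PEERD_ADDR" "$1"
}

# peerd_get_range requests bytes first..last (inclusive) of an upstream blob
# through peerd and writes the response body to stdout.
peerd_get_range() {
  local upstream_url="$1" first="$2" last="$3"
  curl --silent --show-error --fail \
    --header "Range: bytes=${first}-${last}" \
    "$(peerd_blob_url "$upstream_url")"
}
```

With the SAS URL stored in a variable, peerd_get_range "$SAS_URL" 0 99 > first-100-bytes.bin would save the requested range to a file.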
This approach requires knowing exactly how to parse the digest from the HTTP URL, and is currently supported for the following:
- mcr.microsoft.com
- Azure Container Registry
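To illustrate what parsing the digest from a URL can involve, here is a rough sketch that handles the two URL shapes shown in this section: a sha256:<hex> segment, and a sha256/<2 hex>/<64 hex> path as in the MCR SAS URL. It is not peerd's actual parsing logic, and the function name extract_sha256_digest is hypothetical.

```shell
# extract_sha256_digest prints "sha256:<hex>" for a URL whose path contains
# either "sha256:<64 hex>" or "sha256/<2 hex>/<64 hex>".
# It returns non-zero when no digest is found.
extract_sha256_digest() {
  local url="$1" hex
  # Ignore the query string (SAS parameters) when searching for the digest.
  url="${url%%\?*}"
  hex="$(printf '%s\n' "$url" \
    | grep -oE 'sha256[:/]([0-9a-f]{2}/)?[0-9a-f]{64}' \
    | head -n 1 \
    | grep -oE '[0-9a-f]{64}$' || true)"
  [ -n "$hex" ] || return 1
  printf 'sha256:%s\n' "$hex"
}
```

A URL that carries no recognizable digest makes the function fail, mirroring the restriction that only upstream URLs specifying the digest are supported.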
With this facility, peerd can be used as the p2p proxy for Overlaybd.
"p2pConfig": {
"enable": true,
"address": "localhost:30000/blobs"
}
Pulling a container image to a node in Kubernetes is often a time consuming process, especially in scenarios where the
registry becomes a bottleneck, such as deploying a large cluster or scaling out in response to bursty traffic. To increase
throughput, nodes in the cluster which already have the image can be used as an alternate image source. peerd subscribes
to events in the containerd content store, and advertises local images to peers. When a node needs an image, it can query
its peers for the image, and download it from them instead of the registry. Containerd has a mirror facility that can be
used to configure peerd as the mirror for container images.
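For readers unfamiliar with the containerd mirror facility, the sketch below writes a hosts.toml of the general shape containerd reads from its registry configuration directory (commonly /etc/containerd/certs.d/<registry>/hosts.toml, when config_path points there). The function name write_mirror_hosts is hypothetical, the mirror address is an assumption, and the file peerd actually writes may differ.

```shell
# write_mirror_hosts writes a containerd hosts.toml that lists a mirror for a
# registry, with the registry itself as the fallback server.
# Arguments: <certs.d directory> <registry host> <mirror URL>
write_mirror_hosts() {
  local certs_dir="$1" registry="$2" mirror="$3"
  mkdir -p "${certs_dir}/${registry}"
  cat > "${certs_dir}/${registry}/hosts.toml" <<EOF
server = "https://${registry}"

[host."${mirror}"]
  capabilities = ["pull", "resolve"]
EOF
}
```

For example, write_mirror_hosts /tmp/certs.d mcr.microsoft.com http://localhost:30000 produces a file for inspection without touching the node's real containerd configuration.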
The APIs are described in the swagger.yaml.
See design.
Please read our CONTRIBUTING.md which outlines all of our policies, procedures, and requirements for contributing to this project.
The Spegel project has greatly inspired this work, and a big THANK YOU to Philip Laine and Simon Gottschlag at Xenit for generously sharing their insights with us. A hat tip also to the DADI P2P Proxy project for demonstrating the integration with Overlaybd.
Term | Definition |
---|---|
ACR | Azure Container Registry |
AKS | Azure Kubernetes Service |
ACI | Azure Container Instances |
DHT | Distributed Hash Table |
OCI | Open Container Initiative |
P2P | Peer to Peer |
POC | Proof of Concept |
TCMU | Target Core Module Userspace |