google / crfs

CRFS: Container Registry Filesystem

comment on CVMFS

rptaylor opened this issue

FYI, you may already have heard of it, but this seems similar to using CVMFS for container distribution.

General information on CVMFS:
https://cvmfs.readthedocs.io/en/stable/
https://cernvm.cern.ch/portal/filesystem
https://github.com/cvmfs/cvmfs

Information about loading docker images on demand from CVMFS:
https://cvmfs.readthedocs.io/en/stable/cpt-graphdriver.html

Information about automatically converting container images and publishing them to CVMFS (with DUCC):
https://cvmfs.readthedocs.io/en/stable/cpt-ducc.html

Thanks, I hadn't seen that before.

I see the similarity: both make outbound HTTP requests to fault in data as needed. I'm guessing the difference is that CVMFS has a specialized server, whereas CRFS's goal is to use existing Container Registry servers and stay compatible with normal "docker pull" etc. workflows.
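To make "fault in data as needed" concrete, here is a minimal sketch of on-demand reads with a local chunk cache. It is not how CRFS or CVMFS is actually implemented: the `LazyBlob` class, the `fetch_range` callable, and the chunk size are all hypothetical names chosen for illustration. In a real client, `fetch_range` would presumably issue an HTTP request for a byte range; here it is just a function so the sketch runs without a network.

```python
from typing import Callable, Dict


class LazyBlob:
    """Reads byte ranges of a remote blob on demand and caches fetched chunks.

    Hypothetical illustration only; not the CRFS or CVMFS implementation.
    """

    def __init__(self, size: int, fetch_range: Callable[[int, int], bytes],
                 chunk_size: int = 4) -> None:
        self.size = size
        self.fetch_range = fetch_range  # (start, end_exclusive) -> bytes
        self.chunk_size = chunk_size
        self.chunks: Dict[int, bytes] = {}  # chunk index -> cached bytes
        self.requests = 0  # number of remote fetches issued so far

    def _chunk(self, index: int) -> bytes:
        # Fetch a chunk only the first time it is touched.
        if index not in self.chunks:
            start = index * self.chunk_size
            end = min(start + self.chunk_size, self.size)
            self.chunks[index] = self.fetch_range(start, end)
            self.requests += 1
        return self.chunks[index]

    def read(self, offset: int, length: int) -> bytes:
        end = min(offset + length, self.size)
        if offset >= end:
            return b""
        first = offset // self.chunk_size
        last = (end - 1) // self.chunk_size
        data = b"".join(self._chunk(i) for i in range(first, last + 1))
        skip = offset - first * self.chunk_size
        return data[skip:skip + (end - offset)]


# Stand-in for a remote server: slicing a bytes object in memory.
remote = b"hello lazy world"
blob = LazyBlob(len(remote), lambda start, end: remote[start:end])
print(blob.read(6, 4), blob.requests)
```

Only the chunks that overlap the requested range are fetched, and a repeated read of the same range is served from the cache without another request.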

Yes, CVMFS has a hierarchical stratum model of servers: a single authoritative stratum 0, and multiple stratum 1 servers that replicate from it. This handles the load balancing and distribution part, like a CDN. The same infrastructure can be used to serve repositories of container images as well as repositories of any other content/software. The stratum servers are based on Apache httpd.
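As a rough illustration of why having several stratum 1 replicas helps, a client can try the replicas in order and fall back to the next one when a request fails. The function below is a sketch under assumptions, not the CVMFS client code: `fetch_with_failover`, the `fetch` callable, and the example hostnames are all made up.

```python
from typing import Callable, List, Sequence, Tuple


def fetch_with_failover(path: str, mirrors: Sequence[str],
                        fetch: Callable[[str], bytes]) -> bytes:
    """Try each mirror in order and return the first successful response.

    `fetch` is a hypothetical callable taking a full URL and returning bytes,
    raising OSError on failure.
    """
    errors: List[Tuple[str, OSError]] = []
    for base in mirrors:
        url = base.rstrip("/") + "/" + path.lstrip("/")
        try:
            return fetch(url)
        except OSError as exc:
            # Remember the failure and move on to the next replica.
            errors.append((base, exc))
    raise OSError(f"all mirrors failed for {path}: {errors}")


# Hypothetical replicas: the first one is down, the second one answers.
def fake_fetch(url: str) -> bytes:
    if url.startswith("http://stratum1-a.example.org"):
        raise OSError("connection refused")
    return b"data from " + url.encode()


mirrors = ["http://stratum1-a.example.org/", "http://stratum1-b.example.org/"]
print(fetch_with_failover("/repo/file", mirrors, fake_fetch))
```

Since every replica serves the same content, the client does not care which one answers.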

When using CVMFS, DUCC (the last URL in the post) is sort of like a shim layer that takes the images from the existing container registry servers and automates publishing them onto CVMFS.

It would be very nice indeed to just do the usual 'docker pull library/fedora' etc. and have the lazy fetching and caching work transparently.
On the other hand, one other benefit of CVMFS is that all content is block-level deduplicated and cached locally with a CAS (content-addressable storage) scheme. So if a new user wants to run a public container image whose content is similar to another image that has already been started, that content will already be present in the local cache, even if the tar.gz files in the container images are completely different.
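A toy model of the CAS idea, to show why shared content is fetched only once even when it arrives via different images. This is a sketch, not CVMFS code: `LocalCache`, `catalog`, the file paths, and the choice of SHA-256 as the hash are all assumptions made for the example.

```python
import hashlib
from typing import Callable, Dict


def digest(data: bytes) -> str:
    # Content address: the hash of the bytes (SHA-256 assumed for this sketch).
    return hashlib.sha256(data).hexdigest()


def catalog(files: Dict[str, bytes]) -> Dict[str, str]:
    """Map each path in an image to the content address of its data."""
    return {path: digest(data) for path, data in files.items()}


class LocalCache:
    """Caches objects by content address; identical content is fetched once."""

    def __init__(self, fetch_object: Callable[[str], bytes]) -> None:
        self.fetch_object = fetch_object  # content address -> bytes
        self.objects: Dict[str, bytes] = {}
        self.misses = 0  # number of objects fetched from the remote store

    def get(self, key: str) -> bytes:
        if key not in self.objects:
            self.objects[key] = self.fetch_object(key)
            self.misses += 1
        return self.objects[key]


# Two hypothetical images that share one file but differ in another.
image_a = {"/lib/libc.so": b"shared libc bytes", "/bin/app-a": b"app a"}
image_b = {"/usr/lib/libc.so": b"shared libc bytes", "/bin/app-b": b"app b"}

# Stand-in for the remote store: content address -> data.
remote_store = {digest(d): d for img in (image_a, image_b) for d in img.values()}
cache = LocalCache(remote_store.__getitem__)

for key in catalog(image_a).values():
    cache.get(key)
misses_after_a = cache.misses

for key in catalog(image_b).values():
    cache.get(key)
misses_after_b = cache.misses

print(misses_after_a, misses_after_b)
```

Starting the second image costs one fetch rather than two, because the shared file resolves to a content address that is already in the cache, regardless of its path or which layer tarball it came from.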

The containerd remote snapshotter now exists and can be used by CVMFS and other remote filesystems.