google / crfs

CRFS: Container Registry Filesystem

Support docker private registry API

ktock opened this issue · comments

Currently crfs supports only GCR and uses a GCR-specific API, so it can't be used with a Docker private registry (especially a local, insecure (HTTP) one).
Wouldn't it be great to support private registries, to make crfs easier to try?
It would also be a good starting point for supporting other OCI (Docker)-compliant registries, since we can focus on the API-related issues and set the auth-related issues aside.
For example:

$ ls /crfs/layers/127.0.0.1:5000/
my
$ ls /crfs/layers/127.0.0.1:5000/my
ubuntu
$ ls /crfs/layers/127.0.0.1:5000/my/ubuntu
18.04  sha256-2bca06c5f3ca2402e6fd5ab82fad0c3d8d6ee18e2def29bcadaae5360d0d43d9
$ ls /crfs/layers/127.0.0.1:5000/my/ubuntu/18.04/
0       sha256-0c0ed20421e1c2fbadc7fb185d4e37348de9b39a390c09957f2b9a6b68bd4785
1       sha256-24e2698eca10208eab4c4dad0dfad485a30c8307902404ffec2da284ae848fb8
2       sha256-2b01b35b83e6609c41f1aac861cd65914934fa503f645ca17c9ebff45907b9c5
3       sha256-646be464f13960b2cd0bf3a741a42f1bf658bee676ffbc49183222bdfb79e249
bottom  top
config
$ ls /crfs/layers/127.0.0.1:5000/my/ubuntu/18.04/bottom
bin  boot  dev  etc  home  lib  lib64  media  mnt  opt  proc  root  run  sbin  srv  sys  tmp  usr  var

I have a draft patch implementing this idea on my branch, so I'm willing to contribute if possible.

Note on image path format:
Currently crfs supports only the <owner>/<image> image path format, and this patch intends to keep that restriction. Supporting arbitrary image path formats may be future work.

An overview of the ideas follows:

Use OCI-compliant API

Currently, crfs uses GCR-specific APIs for the following purposes:

For private registries, use the OCI (Docker)-compliant API instead of the GCR-specific one, for example:

  • Listing owners on a host: GET on /v2/_catalog (filter the response to keep only <owner>/<image>-style image paths, then parse it).
  • Listing images stored under an owner: GET on /v2/_catalog (filter the response to keep only <owner>/<image>-style image paths, then parse it).
  • Listing the mapping from manifest digests to the tag names on an image: GET on /v2/<owner>/<image>/tags/list to get the list of tag names, then HEAD on /v2/<owner>/<image>/manifests/<tag name> with an Accept: application/vnd.docker.distribution.manifest.v2+json header to get the digest of the V2 manifest (returned in the Docker-Content-Digest response header).
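
The requests listed above can be sketched in Go using only the standard library. This is a rough illustration, not the patch on my branch: groupByOwner and newDigestRequest are hypothetical helper names, the catalog body is assumed to be the usual {"repositories": [...]} JSON, and pagination and auth are ignored.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"sort"
	"strings"
)

// manifestV2 is the media type sent in the Accept header so that the
// registry reports the digest of the V2 manifest.
const manifestV2 = "application/vnd.docker.distribution.manifest.v2+json"

// catalog mirrors the assumed JSON body of GET /v2/_catalog.
type catalog struct {
	Repositories []string `json:"repositories"`
}

// groupByOwner (hypothetical helper) parses a catalog body and returns a map
// from owner to sorted image names, keeping only "<owner>/<image>" entries.
func groupByOwner(body []byte) (map[string][]string, error) {
	var c catalog
	if err := json.Unmarshal(body, &c); err != nil {
		return nil, fmt.Errorf("decoding catalog: %w", err)
	}
	owners := make(map[string][]string)
	for _, repo := range c.Repositories {
		parts := strings.Split(repo, "/")
		if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
			continue // skip paths such as "busybox" or "a/b/c"
		}
		owners[parts[0]] = append(owners[parts[0]], parts[1])
	}
	for _, images := range owners {
		sort.Strings(images)
	}
	return owners, nil
}

// newDigestRequest (hypothetical helper) builds the HEAD request for one tag.
// The digest is expected in the Docker-Content-Digest response header.
func newDigestRequest(baseURL, repo, tag string) (*http.Request, error) {
	target := fmt.Sprintf("%s/v2/%s/manifests/%s", strings.TrimSuffix(baseURL, "/"), repo, tag)
	req, err := http.NewRequest(http.MethodHead, target, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Accept", manifestV2)
	return req, nil
}

func main() {
	body := []byte(`{"repositories":["my/ubuntu","my/alpine","busybox","a/b/c"]}`)
	owners, err := groupByOwner(body)
	if err != nil {
		panic(err)
	}
	fmt.Println(owners)

	req, err := newDigestRequest("http://127.0.0.1:5000", "my/ubuntu", "18.04")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL)
	// After sending req with an http.Client, the digest would be read with
	// resp.Header.Get("Docker-Content-Digest").
}
```

The same filtered catalog result serves both the owner listing and the image listing, which is why a single parse step is enough for the first two bullets.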

Use the appropriate scheme based on the host name

As the current crfs code base already does, it is better to select the URL scheme using github.com/google/go-containerregistry/pkg/name.
For example, when requesting the API of a Docker private registry on localhost, it is good to use http (not https).
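
To make the idea concrete, here is a minimal stdlib-only sketch of that decision. schemeForRegistry is a hypothetical name, and the rule it applies (loopback hosts get http, everything else https) is a simplifying assumption for illustration; it is not the logic of the name package, which should be preferred in real code.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// schemeForRegistry (hypothetical helper) returns "http" for registries that
// resolve to the local machine by name or loopback IP, and "https" otherwise.
func schemeForRegistry(registry string) string {
	host := registry
	// Drop the port if one is present; on error keep the input unchanged.
	if h, _, err := net.SplitHostPort(registry); err == nil {
		host = h
	}
	// Allow a bare bracketed IPv6 literal such as "[::1]".
	host = strings.Trim(host, "[]")

	if host == "localhost" {
		return "http"
	}
	if ip := net.ParseIP(host); ip != nil && ip.IsLoopback() {
		return "http"
	}
	return "https"
}

func main() {
	for _, r := range []string{"127.0.0.1:5000", "localhost:5000", "gcr.io"} {
		fmt.Printf("%s://%s/v2/\n", schemeForRegistry(r), r)
	}
}
```

Defaulting to https and opting into http only for loopback hosts keeps remote registries from being contacted in plaintext by accident.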

This is hard to do in a broadly compatible way. OCI is dropping catalog for various reasons. We've discussed replacing it here and here. Might be worth continuing the conversation over there.

Using the catalog API seems reasonable for certain registries, but it won't work everywhere 😞

I would like help implementing support for other registries. The current support for gcr.io is just because that's what I needed myself and I wanted to do something quickly for a proof of concept.

But the filesystem hierarchy was definitely designed to support others in the future. I was just short on time fleshing them all out.

Brad,

I would like to help with this project. I have extensive file system development experience starting with AT&T Unix V7. I am a Google GDE Cloud Platform and I know the Google services as a developer very well. I would like to help with supporting IBM's Container Registry (ICR), which I do not know as well as GCR. Seth Vargo (Google) @sethvargo can provide a reference about me.

@jhanley-com, great, thanks!

I will start studying your code and the IBM GCR specs this weekend. What is the best way to work with you? My Twitter is https://twitter.com/NeoPrimeAws. I prefer to use email instead of public forums.

Using this bug for discussion and GitHub PRs for code review is the norm. It may not be the best, but it keeps things consistent and everybody else informed about the status and reduces duplication.