GoogleCloudPlatform / gcr-cleaner

Delete untagged image refs in Google Container Registry or Artifact Registry

gcr-cleaner fails to delete images in DockerHub

JKCHacking opened this issue

TL;DR

Thank you for developing this very useful tool. According to the README, it also deletes images in DockerHub.

We are planning to use gcr-cleaner to limit the number of images in our repositories in DockerHub and GCR (a single tool for both registries). However, when we tested the tool against DockerHub, it failed to find the images to delete and printed ✗ no refs were deleted, even though there are existing images that should have matched the condition.

Steps to reproduce:

  1. Create a DockerHub repository.
  2. Add images in the repository with tags: dev-1, dev-2, prod
  3. docker login (input username and password)
  4. ./gcr-cleaner-cli -repo="joshnee16/test-clean-up-images" -dry-run -keep=1 -tag-filter-all=^dev-

Version:
gcr-cleaner-cli == 0.10.0

Expected behavior

The expected behavior:

  1. Images with tags dev-1 and dev-2 are deleted.
  2. The image with tag prod is not deleted.

Observed behavior

Actual behavior:
Images with tags dev-1 and dev-2 are not deleted.

Debug log output

$ ./gcr-cleaner-cli -repo="joshnee16/test-clean-up-images" -dry-run -keep=1 -tag-filter-all=^dev-

{"message":"cli is starting","severity":"DEBUG","time":"2022-11-07T08:46:54Z","version":"gcr-cleaner-cli 0.10.0 (103e593b1bf59309841ff4a483cf33f3474acf28, linux/amd64)"}
WARNING: Running in dry-run mode - nothing will actually be cleaned!

Deleting refs older than 2022-11-07T08:46:54Z on 1 repo(s)...

joshnee16/test-clean-up-images
{"message":"computed repo","repo":"index.docker.io/joshnee16/test-clean-up-images","severity":"DEBUG","time":"2022-11-07T08:46:54Z"}
{"keep":1,"manifests":[],"message":"computed all manifests","severity":"DEBUG","time":"2022-11-07T08:47:04Z"}
  ✗ no refs were deleted
{"message":"cli finished","severity":"DEBUG","time":"2022-11-07T08:47:04Z"}

Additional information

No response

What is the repo layout and timestamps?

The repo doesn't have any child repositories; it's just flat for the sake of testing. It contains 3 images with tags dev-1, dev-2, and prod, and they were pushed 4 days ago.

Also, does the origin of these images affect the result? I ask because these 3 images all came from a single image, just with different tags.

I would expect it to detect the manifests and then print out why it's not deleting things. Can you run without keep or tag-filter? If you have a quick reproduction for me to make those images on my personal dockerhub, I can try to reproduce it on my end too.

Hm - actually, I can query your repo. I'm a little confused because the result is:

Children:[]string(nil), Manifests:map[string]google.ManifestInfo(nil), Name:"", Tags:[]string{"dev-1", "dev-2", "prod"}

There are no manifests, but you have tags. So it appears GCR/GAR don't behave the same as Docker Hub.
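
To make the difference concrete, here is a minimal sketch of what a manifest-driven cleaner sees in each case. It is not gcr-cleaner's code. The payloads are hand-written for illustration, the `manifest` key and the digest are assumptions about the GCR-style reply shape, and `manifest_digests` and `tag_names` are hypothetical helper names.

```python
import json

# Hypothetical digest, used only so the example has something to key on.
HYPOTHETICAL_DIGEST = "sha256:" + "0" * 64

# Tags-only reply, matching what was observed for the Docker Hub repo:
# tags are present, but there is no manifest map at all.
TAGS_ONLY_BODY = json.dumps({
    "name": "joshnee16/test-clean-up-images",
    "tags": ["dev-1", "dev-2", "prod"],
})

# Assumed GCR-style reply: the same fields plus a "manifest" map keyed by digest.
GCR_STYLE_BODY = json.dumps({
    "name": "example-project/example-image",
    "tags": ["dev-1"],
    "manifest": {HYPOTHETICAL_DIGEST: {"tag": ["dev-1"]}},
})


def manifest_digests(tags_list_body):
    """Return the digests listed in the reply's manifest map, if any."""
    data = json.loads(tags_list_body)
    return sorted((data.get("manifest") or {}).keys())


def tag_names(tags_list_body):
    """Return the tag names listed in the reply."""
    return json.loads(tags_list_body).get("tags") or []


if __name__ == "__main__":
    for label, body in (("tags-only", TAGS_ONLY_BODY), ("gcr-style", GCR_STYLE_BODY)):
        print(label, "tags:", tag_names(body), "digests:", manifest_digests(body))
```

With a tags-only reply the digest list is empty, so a cleaner that iterates over manifests has nothing to evaluate, which is consistent with the empty "manifests":[] in the debug log.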

You can pull my images:
docker pull joshnee16/test-clean-up-images:dev-2
docker pull joshnee16/test-clean-up-images:dev-1
docker pull joshnee16/test-clean-up-images:prod

Yes, I also sent a curl request to GET /v2/<name>/tags/list, and the response contained no manifests either.

Does this mean that we can't use this tool for Docker Hub? I believe you use the manifests as the list of images to delete from the registry. Couldn't you also loop over the tags to delete the images?

It seems that DockerHub has its own set of APIs. For example, to get the list of tags in DockerHub: curl -X GET https://hub.docker.com/v2/namespaces/joshnee16/repositories/test-clean-up-images/tags

The output includes manifest details for each tag:

        {
            "creator": 20322939,
            "id": 323036787,
            "images": [
                {
                    "architecture": "amd64",
                    "features": "",
                    "variant": null,
                    "digest": "sha256:89e1e6e82a9e12e77e82f60263cb3dca7f20abd94d75525d021a6365043dcca3",
                    "os": "linux",
                    "os_features": "",
                    "os_version": null,
                    "size": 13975909,
                    "status": "active",
                    "last_pulled": "2022-11-07T03:55:36.457302Z",
                    "last_pushed": "2022-11-04T06:49:03.157121Z"
                }
            ],
            "last_updated": "2022-11-04T05:16:54.477237Z",
            "last_updater": 20322939,
            "last_updater_username": "joshnee16",
            "name": "dev-2",
            "repository": 18449011,
            "full_size": 13975909,
            "v2": true,
            "tag_status": "active",
            "tag_last_pulled": "2022-11-07T03:55:36.457302Z",
            "tag_last_pushed": "2022-11-04T05:16:54.477237Z",
            "media_type": "application/vnd.docker.container.image.v1+json",
            "digest": "sha256:89e1e6e82a9e12e77e82f60263cb3dca7f20abd94d75525d021a6365043dcca3"
        },
        ...

From what I can see in cleaner.go, it only uses gcrgoogle.List(), which only references the Docker Registry API.
Are there plans to support DockerHub-specific requests in the future?
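
For anyone who wants to experiment with a tag-driven approach, the sketch below groups tag entries like the one shown above by digest and then selects digests whose tags all match a pattern. It only uses the "name" and "digest" fields from that output. The "all tags must match" rule is my reading of the -tag-filter-all flag name, not a confirmed description of gcr-cleaner's behavior, and the digests and function names are hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical digests for illustration only.
DIGEST_A = "sha256:" + "a" * 64
DIGEST_B = "sha256:" + "b" * 64


def group_tags_by_digest(entries):
    """Map each digest to the sorted list of tag names that point at it."""
    grouped = defaultdict(list)
    for entry in entries:
        grouped[entry["digest"]].append(entry["name"])
    return {digest: sorted(names) for digest, names in grouped.items()}


def digests_where_all_tags_match(entries, pattern):
    """Return digests for which every tag matches the regular expression."""
    rx = re.compile(pattern)
    grouped = group_tags_by_digest(entries)
    return sorted(
        digest
        for digest, names in grouped.items()
        if all(rx.search(name) for name in names)
    )


# Case 1: one image pushed under three tags, so all tags share a digest.
shared = [
    {"name": "dev-1", "digest": DIGEST_A},
    {"name": "dev-2", "digest": DIGEST_A},
    {"name": "prod", "digest": DIGEST_A},
]

# Case 2: the dev tags and the prod tag point at different digests.
separate = [
    {"name": "dev-1", "digest": DIGEST_A},
    {"name": "dev-2", "digest": DIGEST_A},
    {"name": "prod", "digest": DIGEST_B},
]

if __name__ == "__main__":
    print(digests_where_all_tags_match(shared, r"^dev-"))
    print(digests_where_all_tags_match(separate, r"^dev-"))
```

Under that assumed rule, the shared-digest case selects nothing, because the prod tag sits on the same digest as the dev tags. So the origin of the images could matter: tags created from a single image may all resolve to one digest.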

Hi @JKCHacking - users should prefer the native Google Artifact Registry functionality instead of gcr-cleaner. We are only fixing bugs and security issues in gcr-cleaner now that there's a native (and free) feature in the Google Cloud product.

Unfortunately it looks like the DockerHub API changed at some point, so gcr-cleaner no longer works there. Sorry.