Docker rate limit on the hypervisors hits the CI

Question

ccamacho opened this issue 2 years ago · comments

Carlos Camacho Gonzalez commented 2 years ago

Describe the bug
On the task:

Play: Run cluster deployment on prepared hypervisors
Task: kubeinit.kubeinit.kubeinit_services : Create a new working container image
Action: ansible.builtin.command
Path: /root/.ansible/collections/ansible_collections/kubeinit/kubeinit/roles/kubeinit_services/tasks/create_provision_container.yml:75
Host: localhost ( task delegated to hypervisor-01 )
Status: failed
Started: 04 Mar 2022 18:11:58 +0000
Ended: 04 Mar 2022 18:14:21 +0000
Duration: 00:02:22.43

We have this error:

initializing source docker://ubuntu:focal: reading manifest focal in docker.io/library/ubuntu: toomanyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limit

To Reproduce
Everywhere in the CI.

Expected behavior
The CI should not hit the Docker Hub pull rate limit.

Workaround
After running podman login docker.io on the hypervisors, the problem is partially solved until a reboot, which clears the login auth data.
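
The login is most likely lost on reboot because podman writes credentials by default to ${XDG_RUNTIME_DIR}/containers/auth.json, which normally lives on a tmpfs. A minimal sketch of one way to make the workaround survive a reboot, assuming it is acceptable to keep the credentials in a persistent file on the hypervisors. The path /etc/containers/auth.json and the helper name login_persistent are illustrative assumptions, not something taken from KubeInit:

```shell
# Hypothetical helper: log in once and store the credentials in a file that
# survives reboots, instead of podman's default runtime-directory location.
# The AUTHFILE path is an assumption; any persistent, root-only path works.
AUTHFILE="${AUTHFILE:-/etc/containers/auth.json}"

login_persistent() {
    registry="$1"
    # --authfile tells podman where to write the credentials.
    podman login --authfile "$AUTHFILE" "$registry"
}

# Later pulls must read the same file; REGISTRY_AUTH_FILE is the
# environment variable podman consults for the auth file location.
export REGISTRY_AUTH_FILE="$AUTHFILE"
```

Calling login_persistent docker.io prompts for the Docker Hub credentials once. Whether the tool used by the failing task picks up REGISTRY_AUTH_FILE should be verified on a hypervisor before relying on this.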

Carlos Camacho Gonzalez · Answer 1 · Sat Mar 05 2022 18:22:55 GMT+0800 (China Standard Time)

This https://github.com/Kubeinit/kubeinit/blob/main/kubeinit/roles/kubeinit_prepare/tasks/prepare_podman.yml#L39 needs to run in the hypervisors.

Called from roles/kubeinit_services/tasks/00_create_service_pod.yml

Maybe we need to do the login first.

Glenn Marcy · Answer 2 · Sun Mar 06 2022 05:57:54 GMT+0800 (China Standard Time)

The prepare_podman tasks are run on the hypervisor where the services will be located. Is the provision container being placed somewhere else?

Check the logs: the problem isn't that prepare_podman wasn't called, it's that the docker.io login on the hypervisor was skipped - https://ci.kubeinit.org/file/kubeinit-ci/jobs/cdk-libvirt-1-0-1-h-periodic-pid-weekly-u/results/415.html
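
A skipped login like this only shows up later as a toomanyrequests error during the pull. As a rough sketch, a pre-flight check could fail early instead. The helper name require_login is hypothetical and not part of KubeInit; it relies on podman login --get-login, which prints the logged-in user and returns an error when no login is found:

```shell
# Hypothetical pre-flight check: fail early when no registry login exists,
# rather than discovering it later as a pull rate-limit error.
require_login() {
    registry="$1"
    # `podman login --get-login` prints the logged-in user and returns
    # non-zero when no credentials are stored for the registry.
    if user="$(podman login --get-login "$registry" 2>/dev/null)"; then
        echo "logged in to $registry as $user"
    else
        echo "no login found for $registry" >&2
        return 1
    fi
}
```

Running require_login docker.io on the hypervisor before the image is created would turn the skipped login into an explicit failure at the start of the job.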

Carlos Camacho Gonzalez · Answer 3 · Sun Mar 06 2022 15:44:36 GMT+0800 (China Standard Time)

Right, the deployment is the same as it has been since the beginning: the services pod runs on the first hypervisor (the single-node deployment), and the playbook is also run from the hypervisor (no container yet). I wanted to have it working before making the change to trigger the deployment from a container.