[Issue] Seems we need to add rangeEnd to chunkStart to push an image to the Aliyun DevOps registry
cloorc opened this issue · comments
Current Behavior
- Download regctl v0.5.5 for Windows.
- Download the image from docker.io/thorstenhans/helm3:3.8.2 to ocidir://helm3-3.8.2.tar
- Log in to the Aliyun DevOps docker registry: regctl registry login xxx.aliyun.com -u ... -p ... (this outputs a success message).
- Push the image to Aliyun DevOps: regctl image copy ocidir://helm3-3.8.2.tar xxx.aliyun.com/group/helm3:3.8.2
- The image is only partially pushed and the push is unable to continue.
- After checking out the source code from the main branch, changing line 568 of schema/reg/blob.go from chunkStart = rangeEnd + 1 to chunkStart += rangeEnd + 1, and rebuilding regctl, the push could finish (not able to verify yet).
time="2023-12-14T19:28:32+08:00" level=warning msg="Sleeping for backoff" Host="xxx.aliyun.com:443" Seconds=7.9994458999999996
time="2023-12-14T19:28:34+08:00" level=warning msg="Sleeping for backoff" Host="xxx.aliyun.com:443" Seconds=6.105377
time="2023-12-14T19:28:36+08:00" level=warning msg="Sleeping for backoff" Host="xxx.aliyun.com:443" Seconds=4.0457817
time="2023-12-14T19:28:41+08:00" level=warning msg="Sleeping for backoff" Host="xxx.aliyun.com:443" Seconds=1.994271
time="2023-12-14T19:28:41+08:00" level=warning msg="Sleeping for backoff" Host="xxx.aliyun.com:443" Seconds=3.9940072
time="2023-12-14T19:28:41+08:00" level=warning msg="Sleeping for backoff" Host="xxx.aliyun.com:443" Seconds=7.9938414
time="2023-12-14T19:28:43+08:00" level=warning msg="Sleeping for backoff" Host="xxx.aliyun.com:443" Seconds=6.3028506
time="2023-12-14T19:28:45+08:00" level=warning msg="Sleeping for backoff" Host="xxx.aliyun.com:443" Seconds=3.9651752
sha256:f423bbb [========================================] 100.00% 4.474kB/4.474kB
sha256:67bfedf [========================================] 100.00% 770.000B/770.000B
sha256:2043435 [=========> ] 23.61% 2.097MB/8.882MB
sha256:6e09ff9 [===> ] 7.86% 2.097MB/26.673MB
sha256:d6f7a29 [======> ] 15.56% 2.097MB/13.481MB
sha256:9c630af [> ] 0.00% 0.000B/126.000B
sha256:219578c [> ] 0.00% 0.000B/94.000B
Manifests: 0/1 | Blobs: 6.297MB copied, 0.000B skipped, 56.485MB queued | Elapsed: 109s
Expected Behavior
Pushed successfully.
Steps To Reproduce
Please see the steps listed under Current Behavior.
Version
$ regctl version
VCSTag: v0.5.5
VCSRef: 278ecbfdfce3e8b89ac15348ffe750b823ee1ea9
VCSCommit: 278ecbfdfce3e8b89ac15348ffe750b823ee1ea9
VCSState: clean
VCSDate: 2023-11-24T21:10:14Z
Platform: windows/amd64
GoVer: go1.21.4
GoCompiler: gc
Environment
Windows 10 (21H2, 19044.2075) Enterprise Edition LTSC
- Running as binary or container:
- Host platform:
- Registry description:
Anything else
None.
There's a good chance this is a bug in Aliyun's implementation. Support for chunked uploads is pretty bad in the ecosystem, and they may be expecting the streaming upload used by Docker but not defined in the OCI spec (yet). The streaming upload is a single patch, so they may be giving the range for the single patch and not know it's broken when multiple chunks are used.
A good way to check would be to run the OCI conformance test against the registry. You can find details of running that at https://github.com/opencontainers/distribution-spec/blob/main/conformance/README.md. Since it sounds like you're familiar with running Go and have access to their registry, would you be able to give this a try and report back the findings?
@sudo-bmitch Bad news, almost none of the conformance tests passed. I will report the issue to the Aliyun DevOps vendor. Thanks!
@sudo-bmitch Hi Brandon, I'm very curious about why we use = rangeEnd + 1 instead of += rangeEnd. I thought all of them should be += ... because the Range header is used to indicate the bytes sent from the client, which means += gives exactly the bytes we have already sent. Am I misunderstanding something?
@cloorc The range header we should see from the registry is the overall blob range, and not the single patch result. The client uses the header to indicate what part of the blob is being pushed, and the server response indicates what range of the blob it has received across multiple patch requests. There's similar logic to resume a failed chunk push, where the client can request what range the server has received so far, to know where to start resending.
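To illustrate the point about the cumulative range, here is a minimal Go sketch. It is not regclient's actual code: the helpers rangeEnd and offsets are hypothetical names, and the header values are simulated responses from a registry that reports the overall blob range after two 2 MiB patch requests. Under that assumption, plain assignment follows the upload correctly, while += counts the earlier chunks twice.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// rangeEnd extracts the inclusive end offset from a "<start>-<end>" Range
// header value. Hypothetical helper for illustration only.
func rangeEnd(header string) (int64, error) {
	_, endStr, ok := strings.Cut(strings.TrimSpace(header), "-")
	if !ok {
		return 0, fmt.Errorf("malformed range header %q", header)
	}
	end, err := strconv.ParseInt(endStr, 10, 64)
	if err != nil {
		return 0, fmt.Errorf("malformed range end in %q: %w", header, err)
	}
	return end, nil
}

// offsets returns the chunkStart value computed after each response header.
// With accumulate=false it uses "chunkStart = rangeEnd + 1";
// with accumulate=true it uses "chunkStart += rangeEnd + 1".
func offsets(headers []string, accumulate bool) ([]int64, error) {
	var chunkStart int64
	out := make([]int64, 0, len(headers))
	for _, h := range headers {
		end, err := rangeEnd(h)
		if err != nil {
			return nil, err
		}
		if accumulate {
			chunkStart += end + 1
		} else {
			chunkStart = end + 1
		}
		out = append(out, chunkStart)
	}
	return out, nil
}

func main() {
	// Simulated (assumed) responses: the range covers the whole blob so far.
	headers := []string{"0-2097151", "0-4194303"}

	assign, _ := offsets(headers, false)
	add, _ := offsets(headers, true)
	fmt.Println("chunkStart =  rangeEnd + 1:", assign) // [2097152 4194304]
	fmt.Println("chunkStart += rangeEnd + 1:", add)    // [2097152 6291456]
}
```

With a cumulative range, the += variant would ask to start the third chunk at byte 6291456 even though only 4194304 bytes have been sent. The += form only lines up when a registry reports the range of each individual patch, which is the behavior suspected here.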
This is a code path many registries don't see from clients, which also means support is weak, so regclient avoids it by default unless the normal push fails. I've been working on getting it better tested in the OCI conformance tests to hopefully reduce these issues in the ecosystem, but it will take time to identify and fix them.
Closing this as an issue with the specific registry. Feel free to continue the conversation here if there's anything that was missed.