[>=4.3.1] Chunked uploads are broken with urllib3 v2
sathieu opened this issue · comments
With vcrpy 5.0.0, when updating urllib3 from 1.26.16
to 2.0.3
, chunked uploads are failing.
This is because the body appears to vcrpy unchunked:
(Pdb) self.cassette.requests[18].url == self._vcr_request.url
True
(Pdb) self.cassette.requests[18].body
b'45\r\nBlob bf9500cc6aa4fac1a75c958496c9e9caeceacb256113e28836fc5b3bd448f706\r\n0\r\n\r\n'
(Pdb) [c for c in self._vcr_request.body]
[b'Blob bf9500cc6aa4fac1a75c958496c9e9caeceacb256113e28836fc5b3bd448f706']
(Pdb) self._vcr_request.headers
{'User-Agent': 'gitlabracadabra/1.13.0', 'Docker-Distribution-Api-Version': 'registry/2.0', 'Authorization': 'Bearer eyJhbGciOiJSUzI1NiIsImtpZCI6IlJJWTM6SUhGMjpZUlhVOlJGTVI6NEVRRDpJWVZDOjZMSzM6MzVJTDpGRFJGOklKUE46Q0MySjpMTEIzIiwidHlwIjoiSldUIn0.eyJhY2Nlc3MiOlt7InR5cGUiOiJyZXBvc2l0b3J5IiwibmFtZSI6InJvb3QvdGVzdF9yZWdpc3RyeS9vcGVyYXRvci1mcmFtZXdvcmsvb2xtIiwiYWN0aW9ucyI6WyJwdXNoIiwicHVsbCJdfV0sImp0aSI6ImM5ZGQ1YWQxLWQ0M2ItNDI0Yi1iZDUwLWU4NzJlYjYxZjYwNSIsImF1ZCI6ImNvbnRhaW5lcl9yZWdpc3RyeSIsInN1YiI6InJvb3QiLCJpc3MiOiJnaXRsYWItaXNzdWVyIiwiaWF0IjoxNjE3MTkzMjE5LCJuYmYiOjE2MTcxOTMyMTQsImV4cCI6MTYxNzE5MzUxOX0.b6PKFISAEPnDnfTKR8XVIHUOc6JYomUOB-u_w3QQIqF8gkb92n199zvFSwcpOZolcXjk1DvCfMtUkOvklzAgXyIwNcBx7t_o_SO4mCqWEnbfWxzhJKKSPm6YLAC0IoQbxCsPfA4CK7XEOCJU7eTLaCTqLvq6GvmpjtyLF-KgZV8eUV4zPDNYkhvUqcJXSQR_UHqQ4k32h8itBGYzfhykHSP2B2ZvbBDHLYmBlsDFtc7LJVUEEvKKhCR1VQppwgIoHwJwu7OR3R6I5_XO23NwBbPYvAQZC9twPBKNvoPfcToMzSF_keuAVodUgAyYMUHmnFPw5jX0-PcPt0L9j6a9rA', 'Cookie': 'perf_bar_enabled=true; experimentation_subject_id=eyJfcmFpbHMiOnsibWVzc2FnZSI6IkltSmpaRFZrTXpRMkxXWXlPR010TkRZek1DMWhNMll5TFRNMlpEYzNZVFJrWWpsa09DST0iLCJleHAiOm51bGwsInB1ciI6ImNvb2tpZS5leHBlcmltZW50YXRpb25fc3ViamVjdF9pZCJ9fQ%3D%3D--403d642cd6f4aaa4365664b26bec713ff089b313', 'Transfer-Encoding': 'chunked'}
(Pdb) self.cassette.requests[18].headers
{'Authorization': 'Bearer eyJhbGciOiJSUzI1NiIsImtpZCI6IlJJWTM6SUhGMjpZUlhVOlJGTVI6NEVRRDpJWVZDOjZMSzM6MzVJTDpGRFJGOklKUE46Q0MySjpMTEIzIiwidHlwIjoiSldUIn0.eyJhY2Nlc3MiOlt7InR5cGUiOiJyZXBvc2l0b3J5IiwibmFtZSI6InJvb3QvdGVzdF9yZWdpc3RyeS9vcGVyYXRvci1mcmFtZXdvcmsvb2xtIiwiYWN0aW9ucyI6WyJwdXNoIiwicHVsbCJdfV0sImp0aSI6ImM5ZGQ1YWQxLWQ0M2ItNDI0Yi1iZDUwLWU4NzJlYjYxZjYwNSIsImF1ZCI6ImNvbnRhaW5lcl9yZWdpc3RyeSIsInN1YiI6InJvb3QiLCJpc3MiOiJnaXRsYWItaXNzdWVyIiwiaWF0IjoxNjE3MTkzMjE5LCJuYmYiOjE2MTcxOTMyMTQsImV4cCI6MTYxNzE5MzUxOX0.b6PKFISAEPnDnfTKR8XVIHUOc6JYomUOB-u_w3QQIqF8gkb92n199zvFSwcpOZolcXjk1DvCfMtUkOvklzAgXyIwNcBx7t_o_SO4mCqWEnbfWxzhJKKSPm6YLAC0IoQbxCsPfA4CK7XEOCJU7eTLaCTqLvq6GvmpjtyLF-KgZV8eUV4zPDNYkhvUqcJXSQR_UHqQ4k32h8itBGYzfhykHSP2B2ZvbBDHLYmBlsDFtc7LJVUEEvKKhCR1VQppwgIoHwJwu7OR3R6I5_XO23NwBbPYvAQZC9twPBKNvoPfcToMzSF_keuAVodUgAyYMUHmnFPw5jX0-PcPt0L9j6a9rA', 'Cookie': 'experimentation_subject_id=eyJfcmFpbHMiOnsibWVzc2FnZSI6IkltSmpaRFZrTXpRMkxXWXlPR010TkRZek1DMWhNMll5TFRNMlpEYzNZVFJrWWpsa09DST0iLCJleHAiOm51bGwsInB1ciI6ImNvb2tpZS5leHBlcmltZW50YXRpb25fc3ViamVjdF9pZCJ9fQ%3D%3D--403d642cd6f4aaa4365664b26bec713ff089b313; perf_bar_enabled=true', 'Docker-Distribution-Api-Version': 'registry/2.0', 'Transfer-Encoding': 'chunked'}
Notes:
- The fix is probably similar to #723.
- The affected cassette is here
- I use a custom body matcher, but I don't think this is relevant here (except that this could be a way to workaround)
Hi @sathieu, I have not been successful in reproducing this issue yet. Could you provide a minimal reproducer in Python with VCR.py 5.0.0 and chunked transfer that works with urllib3 v2 but not v2?
In case the log at https://gitlab.com/gitlabracadabra/gitlabracadabra/-/jobs/4563346377 disappears, I am attaching a full raw copy here: full_job.log
@hartwork Here is a minimal reproducer:
works with requirements-good.txt
, fails with requirements-bad.txt
.
Hi @sathieu thanks for these files, well done! 🙏
I confirm that this works with urllib3 v1 but not v2. My understanding is that the request matcher that compres request bodies does not consider these arguably equal bodies as equal.
Comparision of bodies is done as follows:
Lines 51 to 57 in 4f70152
So both sides are transformed before comparison. Which transformer that is is looked up here…
Lines 99 to 104 in 4f70152
…and it is fed by this block here:
Lines 85 to 92 in 4f70152
Now that we have that: Maybe one way to fix this is to add a new transformer there? I have a demo of something like that on a new branch issue-734-fix-body-matcher-for-chunked-requests
now. Could you have a look? What do you think?
One more thing: method no_read
that you added — where can I learn more about that and its purpose, is that some common API? Thanks!
@hartwork Thanks !
Adding a transformer would work, but the question is why the body is unchuncked with the Transfer-Encoding: chunked
header still present? Maybe urllib3/urllib3#2565 (see also urllib3/urllib3#2291 (comment), especially If Content-Length can be determined, set Content-Length and proceed not chunked.)?
NB: this no_read
method was a test (for the read
method). It was not needed, and you can ignore it.
@sathieu thanks for your reply!
I have fixed multiple bugs in the demo in branch issue-734-fix-body-matcher-for-chunked-requests
in the meantime.
Regarding the difference, with urllib3 v1 request.body
is of type bytearray
while with v2 it's of type Stream
, i.e. your custom class.