[5.1.0] tests/unit/test_serialize.py::test_serialize_binary_request fails if simplejson is installed in the environment
mgorny opened this issue · comments
When simplejson is installed in the system, it is used over the built-in json module. This causes the following test to fail:
$ python -m pytest
========================================================= test session starts =========================================================
platform linux -- Python 3.10.12, pytest-7.4.0, pluggy-1.2.0
rootdir: /tmp/vcrpy
configfile: pyproject.toml
plugins: httpbin-2.0.0, cov-4.1.0
collected 255 items / 10 skipped
tests/integration/test_basic.py ..... [ 1%]
tests/integration/test_config.py ........... [ 6%]
tests/integration/test_disksaver.py .... [ 7%]
tests/integration/test_filter.py .......... [ 11%]
tests/integration/test_ignore.py .... [ 13%]
tests/integration/test_matchers.py .............. [ 18%]
tests/integration/test_multiple.py . [ 19%]
tests/integration/test_record_mode.py ........ [ 22%]
tests/integration/test_register_matcher.py .... [ 23%]
tests/integration/test_register_persister.py ... [ 25%]
tests/integration/test_register_serializer.py . [ 25%]
tests/integration/test_request.py .. [ 26%]
tests/integration/test_stubs.py .... [ 27%]
tests/integration/test_urllib2.py .................. [ 34%]
tests/unit/test_cassettes.py ............................... [ 47%]
tests/unit/test_errors.py .... [ 48%]
tests/unit/test_filters.py ........................ [ 58%]
tests/unit/test_json_serializer.py . [ 58%]
tests/unit/test_matchers.py ............................ [ 69%]
tests/unit/test_migration.py ... [ 70%]
tests/unit/test_persist.py .... [ 72%]
tests/unit/test_request.py ................. [ 78%]
tests/unit/test_response.py .... [ 80%]
tests/unit/test_serialize.py .............F. [ 86%]
tests/unit/test_stubs.py .. [ 87%]
tests/unit/test_unittest.py ......... [ 90%]
tests/unit/test_vcr.py ....................... [ 99%]
tests/unit/test_vcr_import.py . [100%]
============================================================== FAILURES ===============================================================
____________________________________________________ test_serialize_binary_request ____________________________________________________
    def test_serialize_binary_request():
        msg = "Does this HTTP interaction contain binary data?"
        request = Request(method="POST", uri="http://localhost/", body=b"\x8c", headers={})
        try:
>           serialize({"requests": [request], "responses": [{}]}, jsonserializer)
tests/unit/test_serialize.py:111:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
vcr/serialize.py:59: in serialize
    return serializer.serialize(data)
vcr/serializers/jsonserializer.py:19: in serialize
    return json.dumps(cassette_dict, indent=4) + "\n"
.tox/py310/lib/python3.10/site-packages/simplejson/__init__.py:395: in dumps
    **kw).encode(obj)
.tox/py310/lib/python3.10/site-packages/simplejson/encoder.py:300: in encode
    chunks = list(chunks)
.tox/py310/lib/python3.10/site-packages/simplejson/encoder.py:714: in _iterencode
    for chunk in _iterencode_dict(o, _current_indent_level):
.tox/py310/lib/python3.10/site-packages/simplejson/encoder.py:668: in _iterencode_dict
    for chunk in chunks:
.tox/py310/lib/python3.10/site-packages/simplejson/encoder.py:544: in _iterencode_list
    for chunk in chunks:
.tox/py310/lib/python3.10/site-packages/simplejson/encoder.py:668: in _iterencode_dict
    for chunk in chunks:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
dct = {'body': b'\x8c', 'headers': {}, 'method': 'POST', 'uri': 'http://localhost/'}, _current_indent_level = 4
    def _iterencode_dict(dct, _current_indent_level):
        if not dct:
            yield '{}'
            return
        if markers is not None:
            markerid = id(dct)
            if markerid in markers:
                raise ValueError("Circular reference detected")
            markers[markerid] = dct
        yield '{'
        if _indent is not None:
            _current_indent_level += 1
            newline_indent = '\n' + (_indent * _current_indent_level)
            item_separator = _item_separator + newline_indent
            yield newline_indent
        else:
            newline_indent = None
            item_separator = _item_separator
        first = True
        if _PY3:
            iteritems = dct.items()
        else:
            iteritems = dct.iteritems()
        if _item_sort_key:
            items = []
            for k, v in dct.items():
                if not isinstance(k, string_types):
                    k = _stringify_key(k)
                    if k is None:
                        continue
                items.append((k, v))
            items.sort(key=_item_sort_key)
        else:
            items = iteritems
        for key, value in items:
            if not (_item_sort_key or isinstance(key, string_types)):
                key = _stringify_key(key)
                if key is None:
                    # _skipkeys must be True
                    continue
            if first:
                first = False
            else:
                yield item_separator
            yield _encoder(key)
            yield _key_separator
            if isinstance(value, string_types):
                yield _encoder(value)
            elif _PY3 and isinstance(value, bytes) and _encoding is not None:
>               yield _encoder(value)
E               UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8c in position 0: invalid start byte
.tox/py310/lib/python3.10/site-packages/simplejson/encoder.py:633: UnicodeDecodeError
During handling of the above exception, another exception occurred:
    def test_serialize_binary_request():
        msg = "Does this HTTP interaction contain binary data?"
        request = Request(method="POST", uri="http://localhost/", body=b"\x8c", headers={})
        try:
            serialize({"requests": [request], "responses": [{}]}, jsonserializer)
        except (UnicodeDecodeError, TypeError) as exc:
>           assert msg in str(exc)
E           assert 'Does this HTTP interaction contain binary data?' in "'utf-8' codec can't decode byte 0x8c in position 0: invalid start byte"
E            +  where "'utf-8' codec can't decode byte 0x8c in position 0: invalid start byte" = str(UnicodeDecodeError('utf-8', b'\x8c', 0, 1, 'invalid start byte'))
tests/unit/test_serialize.py:113: AssertionError
======================================================= short test summary info =======================================================
FAILED tests/unit/test_serialize.py::test_serialize_binary_request - assert 'Does this HTTP interaction contain binary data?' in "'utf-8' codec can't decode byte 0x8c in position 0: invalid start byte"
============================================= 1 failed, 254 passed, 10 skipped in 11.74s ==============================================
@mgorny before having a closer look:
- Which versions of VCR.py are known to be affected or not affected, and which did you use and try?
- Is there a related ticket in Gentoo that I just failed to find?
> @mgorny before having a closer look:
> * Which versions of VCR.py are known affected or not affected, what did you use and try?
5.1.0 failed, 5.0.0 passed.
> * Is there a related ticket in Gentoo that I just failed to find?
No, I noticed while bumping, so I deselected it.
Bisect says it's 4f70152 (CC @jairhenrique):
commit 4f70152e7ce510cde41cf071585cbdb481e4e8f2 (HEAD)
Author: Jair Henrique <jair.henrique@gmail.com>
AuthorDate: 2023-06-27 14:12:40 +0200
Commit: Jair Henrique <jair.henrique@gmail.com>
CommitDate: 2023-06-27 22:36:26 +0200
Enable rule B (flake8-bugbear) on ruff
Prior to this change, the exception is:
ValueError: 'utf-8' codec can't decode byte 0x45 in position 0: invalid start byteDoes this HTTP interaction contain binary data? If so, use a different serializer (like the yaml serializer) for this request?
After it, it is:
ValueError: 'utf-8' codec can't decode byte 0x8c in position 0: invalid start byte
Without simplejson installed, it is:
ValueError: Does this HTTP interaction contain binary data? If so, use a different serializer (like the yaml serializer) for this request?
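To make the difference between the two failure modes concrete, here is a minimal sketch that uses only the standard library. It assumes (as the traceback above suggests) that simplejson tries to decode bytes values as UTF-8, which is imitated here with a plain `bytes.decode` call; `capture_exception` is a hypothetical helper written for this illustration, not part of vcrpy.

```python
import json


def capture_exception(action):
    """Run action() and return the exception it raises, or None."""
    try:
        action()
    except Exception as exc:
        return exc
    return None


# The built-in json module refuses bytes values outright.
stdlib_error = capture_exception(lambda: json.dumps({"body": b"\x8c"}))

# Assumption: simplejson decodes bytes as UTF-8 first, so invalid
# input fails the same way a plain decode call does.
decode_error = capture_exception(lambda: b"\x8c".decode("utf-8"))

print(type(stdlib_error).__name__)
print(type(decode_error).__name__, "-", decode_error)

# UnicodeDecodeError is a subclass of ValueError, not of TypeError.
print(issubclass(UnicodeDecodeError, ValueError))
print(issubclass(UnicodeDecodeError, TypeError))
```

Because the two exception types are unrelated, a handler that only catches `TypeError` lets the decode failure through unmodified.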
Perhaps the simplest solution would be to remove simplejson support entirely; I suspect it's only there for py2 support.
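For context, a sketch of the optional-import pattern that would produce this behaviour. This is an assumption: the thread does not quote the import section of the serializer module, and `active_json_backend` is a hypothetical helper added only to show which module was picked.

```python
# Assumed pattern: prefer simplejson when it is importable,
# otherwise fall back to the standard library.
try:
    import simplejson as json
except ImportError:
    import json


def active_json_backend():
    """Return the name of the module that the `json` alias points to."""
    return json.__name__


print(active_json_backend())
```

Under that assumption, dropping simplejson support would reduce the block to a plain `import json`, so the serializer would behave the same regardless of what else is installed.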
Oh, and my educated guess is that this is the problematic part of the change:
diff --git a/vcr/serializers/jsonserializer.py b/vcr/serializers/jsonserializer.py
index 5ffef3e..55cf780 100644
--- a/vcr/serializers/jsonserializer.py
+++ b/vcr/serializers/jsonserializer.py
@@ -17,13 +17,5 @@ def serialize(cassette_dict):
     try:
         return json.dumps(cassette_dict, indent=4) + "\n"
-    except UnicodeDecodeError as original:  # py2
-        raise UnicodeDecodeError(
-            original.encoding,
-            b"Error serializing cassette to JSON",
-            original.start,
-            original.end,
-            original.args[-1] + error_message,
-        )
-    except TypeError:  # py3
-        raise TypeError(error_message)
+    except TypeError:
+        raise TypeError(error_message) from None
Note that it removes the exception rewriting for the "py2" case, and I guess simplejson falls into that case.
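One possible way to cover both cases, sketched under stated assumptions rather than as the fix the maintainers chose: catch `UnicodeDecodeError` alongside `TypeError` and re-raise with the hint. The `dumps` parameter and `simplejson_like_dumps` are hypothetical and exist only so the sketch runs without simplejson installed; the message text is taken from the output quoted above.

```python
import json

ERROR_MESSAGE = (
    "Does this HTTP interaction contain binary data? "
    "If so, use a different serializer (like the yaml serializer) for this request?"
)


def serialize(cassette_dict, dumps=json.dumps):
    """Serialize to JSON, replacing low-level errors with the hint."""
    try:
        return dumps(cassette_dict, indent=4) + "\n"
    except (UnicodeDecodeError, TypeError):
        # Both backends end up here: the stdlib raises TypeError, while a
        # backend that decodes bytes raises UnicodeDecodeError.
        raise TypeError(ERROR_MESSAGE) from None


def simplejson_like_dumps(obj, **kwargs):
    """Hypothetical stand-in that decodes bytes as UTF-8 (assumed behaviour)."""
    def decode_bytes(value):
        if isinstance(value, bytes):
            return value.decode("utf-8")
        raise TypeError(f"Object of type {type(value).__name__} is not JSON serializable")
    return json.dumps(obj, default=decode_bytes, **kwargs)


for backend in (json.dumps, simplejson_like_dumps):
    try:
        serialize({"body": b"\x8c"}, dumps=backend)
    except TypeError as exc:
        print(backend.__name__, "->", exc)
```

With this shape, the assertion in `test_serialize_binary_request` would see the same message with either backend.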
@mgorny thanks for the additional details! 🙏
Thanks!