frostming / unearth

A utility to fetch and download python packages

Home Page: https://unearth.readthedocs.io

json.decoder.JSONDecodeError during find_all_packages

paugier opened this issue

In a GitLab CI job, I get a traceback when using PDM (pdm-project/pdm#2532). I think the problem is related to unearth, because the exception can be reproduced with unearth alone:

$ python3.9 -c "from unearth import PackageFinder as F; f = F(index_urls=['https://pypi.org/simple/']); print(list(f.find_all_packages('flit-core')))"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/appuser/.local/lib/python3.9/site-packages/unearth/finder.py", line 295, in find_all_packages
    self._find_packages(package_name, allow_yanked), hashes=hashes or {}
  File "/home/appuser/.local/lib/python3.9/site-packages/unearth/finder.py", line 275, in _find_packages
    return sorted(all_packages, key=self._sort_key, reverse=True)
  File "/home/appuser/.local/lib/python3.9/site-packages/unearth/collector.py", line 135, in collect_links_from_location
    yield from _collect_links_from_index(session, location)
  File "/home/appuser/.local/lib/python3.9/site-packages/unearth/collector.py", line 85, in parse_json_response
    data = json.loads(page.content)
  File "/usr/local/lib/python3.9/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/usr/local/lib/python3.9/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/local/lib/python3.9/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Unterminated string starting at: line 1 column 10236 (char 10235)

Interestingly, (i) this code runs fine locally (and even locally in the same Docker image used for the CI) and (ii) I can install packages with pip in the Gitlab CI.

System (please complete the following information):

  • unearth version: 0.12.1
  • Python version: 3.9
  • OS: Linux

Additional context

This causes pdm-project/pdm#2532.

Since it is not reproducible, can you inspect the response content at exactly this line:

  File "/home/appuser/.local/lib/python3.9/site-packages/unearth/collector.py", line 85, in parse_json_response
    data = json.loads(page.content)

Print page.content and you can probably figure out what the problem is.
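As a rough illustration of that kind of inspection, the sketch below reports where parsing failed and how the payload ends. The helper name describe_json_failure is hypothetical and not part of unearth; it relies only on the standard json module and the msg and pos attributes of JSONDecodeError.

```python
import json


def describe_json_failure(raw: bytes, context: int = 80) -> str:
    """Return a short diagnostic for a payload that should be JSON.

    Hypothetical helper for debugging; not part of unearth.
    """
    text = raw.decode("utf-8", errors="replace")
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        # exc.pos is the character offset at which the decoder gave up.
        return (
            f"{exc.msg} at char {exc.pos} of {len(text)}; "
            f"payload ends with: {text[-context:]!r}"
        )
    return "payload parsed cleanly"
```

If the reported offset is close to the total length and the tail stops in the middle of a value, the body was most likely cut off in transit rather than malformed at the source.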

  File "/builds/fluiddyn/unearth/src/unearth/collector.py", line 88, in parse_json_response
    raise RuntimeError(page.content)
RuntimeError: b'{"files":[{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.0rc1-py2.py3-none-any.whl","hashes":{"sha256":"1d717e7336997feed076c4f5dbdbe9ce45062e680f2b1de319b4c759f809a561"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 3.3","size":84110,"upload-time":"2019-11-17T20:54:08.119802Z","url":"https://files.pythonhosted.org/packages/12/4f/8a0a7b2033b8a80451d214a289aecf486afdfb8e155b25986b0cbd3eb6e8/flit_core-2.0rc1-py2.py3-none-any.whl","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.0rc1.tar.gz","hashes":{"sha256":"d78f4b5b8fb2b484a98974b6da8d0edc8e7af55f60da7f40e0a9ddd2c36a5932"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 3.3","size":22702,"upload-time":"2019-11-17T20:54:10.798088Z","url":"https://files.pythonhosted.org/packages/7f/8c/583b4412da71153ec70ed78341983c242a234d47abcfc8485284c6bb7b48/flit_core-2.0rc1.tar.gz","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.0rc2-py2.py3-none-any.whl","hashes":{"sha256":"35a83504f509fcfd19bc53859d938cf2ad3385a2a19bfeb1745d1c957d39115c"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 3.3","size":84124,"upload-time":"2019-11-17T21:01:19.976000Z","url":"https://files.pythonhosted.org/packages/d0/72/0fe258ce61fa1b59adb6c76a701b19a96e0033fbe054b297a2012e33ad44/flit_core-2.0rc2-py2.py3-none-any.whl","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.0rc2.tar.gz","hashes":{"sha256":"b34eef2a6da426c659b5bbfc7a18cbfba2a72bbf7dc20d75a15fd5fc90c1d937"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 
3.3","size":22697,"upload-time":"2019-11-17T21:01:22.111432Z","url":"https://files.pythonhosted.org/packages/2f/1b/41ac0da91712d9c3e7d06a6e1eb7dfe616c96a14e4036e9b9c37ea9ee6f8/flit_core-2.0rc2.tar.gz","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.0rc3-py2.py3-none-any.whl","hashes":{"sha256":"9c5e882e51ddb4206626f576f0a8217ebdf011ab34aeb9d4bb91f101cad03981"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 3.3","size":84135,"upload-time":"2019-11-19T09:38:40.419494Z","url":"https://files.pythonhosted.org/packages/6a/ff/be83d749ff1ad481b09e1e6069178c2d5d6c56a10b493353f0cc405e8475/flit_core-2.0rc3-py2.py3-none-any.whl","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.0rc3.tar.gz","hashes":{"sha256":"207a70987a60e67c475955996813ed95d485f97eee288d03fc04bff01b2c56b8"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 3.3","size":22527,"upload-time":"2019-11-19T09:38:42.261378Z","url":"https://files.pythonhosted.org/packages/2d/36/bcd4bfb529261a27f113eb2e6fb9f5e5aed4d0b79be59a76ce65689a1892/flit_core-2.0rc3.tar.gz","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.0-py2.py3-none-any.whl","hashes":{"sha256":"6315800ae208f0f1de1ee89997e16f69dacc5e18d3fd2a65e4e518e3d78dbdda"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 3.3","size":84102,"upload-time":"2019-11-23T09:24:17.224079Z","url":"https://files.pythonhosted.org/packages/dc/81/1f336b50c81e5345aafe7469e4f4c1104faa82b76e6e9885456b47d898fe/flit_core-2.0-py2.py3-none-any.whl","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.0.tar.gz","hashes":{"sha256":"8e91d877c663b16e70d88a2f652bc9e0ae71501cbb81c5ab8d48c838e731ba80"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 
3.3","size":22527,"upload-time":"2019-11-23T09:24:19.013574Z","url":"https://files.pythonhosted.org/packages/ec/cc/60e05480a5bf4b44ee1dbd179ca715ca4d192597d054e8c97bc0403060e8/flit_core-2.0.tar.gz","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.0.1-py2.py3-none-any.whl","hashes":{"sha256":"1eb2bf3fd805560ed3ad6abca365a03681d1bf1f7d80707dc3bc3ce6833d52f4"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 3.3","size":84353,"upload-time":"2019-11-23T13:23:44.884523Z","url":"https://files.pythonhosted.org/packages/36/6a/b0e5ba2ad9d801887c8df7095535635292ce9b97f63cbb86f2b4d96dfebf/flit_core-2.0.1-py2.py3-none-any.whl","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.0.1.tar.gz","hashes":{"sha256":"96e7708bc88c03b58e0d35f1171197737e701e29a901a8b49c13d3fd21866560"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 3.3","size":22616,"upload-time":"2019-11-23T13:23:46.654078Z","url":"https://files.pythonhosted.org/packages/0e/3d/e9b28cd1d220ca635234e37567099bf4d50ea0a98a77b360b8d8042352e6/flit_core-2.0.1.tar.gz","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.0.2-py2.py3-none-any.whl","hashes":{"sha256":"c49546abb6afe371a13b78a2595d5afe1c0cd0aaa9dd753d800cd21259e51222"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 3.3","size":84764,"upload-time":"2019-11-23T13:40:37.588869Z","url":"https://files.pythonhosted.org/packages/66/d2/c520657053052af580573e32aeafe50a9f68fc77c5d87ff551ca856d2aa3/flit_core-2.0.2-py2.py3-none-any.whl","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.0.2.tar.gz","hashes":{"sha256":"9efcdca4ae84fd4d831e18d3cdb85a0b4f211a52d4b832408ff9a65bcc309928"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 
3.3","size":22764,"upload-time":"2019-11-23T13:40:39.483209Z","url":"https://files.pythonhosted.org/packages/89/cf/a76f37dfded167e97936b8d53308abe5a8d00b97d417a6a405e69167e685/flit_core-2.0.2.tar.gz","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.1.0-py2.py3-none-any.whl","hashes":{"sha256":"c6dff661e9e290d51084cefc38b0971d692290e8a352d0b6cec6006be764b4d1"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 3.3","size":39162,"upload-time":"2019-11-26T09:48:48.530001Z","url":"https://files.pythonhosted.org/packages/b6/b0/50719ef7d12cd39ccfa4e48abb593764c8e4a6d0d9bdf7815be1949142ff/flit_core-2.1.0-py2.py3-none-any.whl","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.1.0.tar.gz","hashes":{"sha256":"d2ebad9351c34083c16388d1df64a6e19579affcec02bfc05746714eef9f82fb"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 3.3","size":22978,"upload-time":"2019-11-26T09:48:49.922785Z","url":"https://files.pythonhosted.org/packages/6c/6a/f945cf72957752ba0655260a8cb9c1139ea134c5f4b104bc48027349a6f4/flit_core-2.1.0.tar.gz","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.2.0-py2.py3-none-any.whl","hashes":{"sha256":"4df2b9b43f00764a81e7ea742829749183a7f5a9e360fa5c3a9e8643dadd716a"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 3.3","size":40023,"upload-time":"2020-01-14T10:57:57.314090Z","url":"https://files.pythonhosted.org/packages/25/4c/0b1ed660937d96ed192c376d3983dd7b052b887c8041ae020c950c0d06f0/flit_core-2.2.0-py2.py3-none-any.whl","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.2.0.tar.gz","hashes":{"sha256":"4efb8bffc1a04d8e550e877f0c9acf53109a021cc27c2a89b1b467715dc1d657"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 
3.3","size":23131,"upload-time":"2020-01-14T10:57:59.011481Z","url":"https://files.pythonhosted.org/packages/77/72/5dda5dc417a4e702e0d7e4a77e9802792a0e4a2daec2aeed915ead7db477/flit_core-2.2.0.tar.gz","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.3.0-py2.py3-none-any.whl","hashes":{"sha256":"a8f8904b534966712390e0a2e434cd33f76037730a0aaed299a286f9e18cac2b"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 3.3","size":40020,"upload-time":"2020-04-08T08:04:01.308900Z","url":"https://files.pythonhosted.org/packages/4b/3c/82798771fc1fd978c9225c5ae25eef45cb23b0df4728f208024a5b57901f/flit_core-2.3.0-py2.py3-none-any.whl","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-2.3.0.tar.gz","hashes":{"sha256":"a50bcd8bf5785e3a7d95434244f30ba693e794c5204ac1ee908fc07c4acdbf80"},"requires-python":">=2.7, !=3.0, !=3.1, !=3.2, != 3.3","size":22995,"upload-time":"2020-04-08T08:04:02.852440Z","url":"https://files.pythonhosted.org/packages/bb/92/e51c58d463ebbabb7b226662655cef6d17d3b4b83f570b08f6be0fe2b1b8/flit_core-2.3.0.tar.gz","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-3.0.0-py3-none-any.whl","hashes":{"sha256":"a787754978cfe3c192a5fc6baf2179ae85b05395804de7d7fe2864d9431e8d03"},"requires-python":">=3.4","size":36921,"upload-time":"2020-09-06T10:57:29.444835Z","url":"https://files.pythonhosted.org/packages/a8/66/67758f788959c2557c4d0f80e4895c3c0802873be95b82a5213ea39542d7/flit_core-3.0.0-py3-none-any.whl","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-3.0.0.tar.gz","hashes":{"sha256":"a465052057e2d6d957e6850e9915245adedfc4fd0dd5737d0791bf3132417c2d"},"requires-python":">=3.4","size":22037,"upload-time":"2020-09-06T10:57:30.734781Z","url":"https://files.pythonhosted.org/packages/0e/b9/040baf94b40c80081bbecbd90365a5d7765a1c07e31b6c949838cc4c93d1/flit_core-3.0.0.tar.gz","yanked":false},{"core-metadata":f
alse,"data-dist-info-metadata":false,"filename":"flit_core-3.1.0-py3-none-any.whl","hashes":{"sha256":"1d06e64a6af7e1fd1496563b160df29dd32714e00b473f3b763f6e6810476517"},"requires-python":">=3.4","size":38715,"upload-time":"2021-03-01T15:36:57.289033Z","url":"https://files.pythonhosted.org/packages/ed/0c/50352b127c0936cd59dd762db41d0e17986401c42ba613fa502e926d33ec/flit_core-3.1.0-py3-none-any.whl","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-3.1.0.tar.gz","hashes":{"sha256":"22ff73be39a2b3c9e0692dfbbea3ad4a9d127e5733736a87dbb8ddcbf7309b1e"},"requires-python":">=3.4","size":22706,"upload-time":"2021-03-01T15:36:58.522778Z","url":"https://files.pythonhosted.org/packages/4c/8f/bed80c03f71cb3a2935882f391b53d2510c359191e5e0361650fa02d1365/flit_core-3.1.0.tar.gz","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-3.2.0-py3-none-any.whl","hashes":{"sha256":"6f25843e908dfc3e907b6b9ee71e3d185bcb5aebab8c3431e4e34c261e5ff1b5"},"requires-python":">=3.4","size":45693,"upload-time":"2021-03-21T21:20:19.175500Z","url":"http'

page.content ends with

[...],{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-3.2.0-py3-none-any.whl",
"hashes":{"sha256":"6f25843e908dfc3e907b6b9ee71e3d185bcb5aebab8c3431e4e34c261e5ff1b5"},
"requires-python":">=3.4","size":45693,"upload-time":"2021-03-21T21:20:19.175500Z","url":"http'

So indeed, the JSON text is unterminated.

Questions:

  • why do I get this unterminated response?
  • how can unearth avoid that or complete the loading of this data when it happens?
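On the second question, one possible client-side mitigation is to fetch again when the body does not parse. This is only a sketch under assumptions: load_json_with_retry and its fetch callable are hypothetical names, this is not how unearth behaves, and a retry cannot help if something between the client and the index truncates every response.

```python
import json
from typing import Any, Callable


def load_json_with_retry(fetch: Callable[[], bytes], attempts: int = 3) -> Any:
    """Call fetch() until its result parses as JSON, at most `attempts` times.

    Hypothetical helper; `fetch` is any zero-argument callable returning
    the raw response body as bytes.
    """
    last_error = None
    for _ in range(attempts):
        try:
            return json.loads(fetch())
        except json.JSONDecodeError as exc:
            # Likely a truncated body: remember the error and try again.
            last_error = exc
    raise RuntimeError(
        f"response still not valid JSON after {attempts} attempts"
    ) from last_error
```

With requests, fetch could be something like lambda: session.get(url, headers=headers, timeout=120).content, where session, url and headers are whatever the caller already uses.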

Note that I can reproduce the exception with this simple code:

from datetime import datetime
from requests import Session

session = Session()

print("before get:", datetime.now())
resp = session.get(
    "https://pypi.org/simple/flit-core/",
    headers={
        "Accept": "application/vnd.pypi.simple.v1+json",
        "Cache-Control": "no-cache",
    },
    timeout=120,
)
print("after get:", datetime.now())

print(resp)
print(resp.content[-400:])
print(resp.json()["versions"])

which gives

before get: 2024-01-03 10:22:01.244619
after get: 2024-01-03 10:22:01.277003
<Response [200]>
b'g/packages/4c/8f/bed80c03f71cb3a2935882f391b53d2510c359191e5e0361650fa02d1365/flit_core-3.1.0.tar.gz","yanked":false},{"core-metadata":false,"data-dist-info-metadata":false,"filename":"flit_core-3.2.0-py3-none-any.whl","hashes":{"sha256":"6f25843e908dfc3e907b6b9ee71e3d185bcb5aebab8c3431e4e34c261e5ff1b5"},"requires-python":">=3.4","size":45693,"upload-time":"2021-03-21T21:20:19.175500Z","url":"http'
Traceback (most recent call last):
  File "/home/appuser/.local/lib/python3.9/site-packages/requests/models.py", line 960, in json
    return complexjson.loads(self.content.decode(encoding), **kwargs)
  File "/usr/local/lib/python3.9/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/usr/local/lib/python3.9/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/local/lib/python3.9/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Unterminated string starting at: line 1 column 10236 (char 10235)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/builds/fluiddyn/fluidsim/tmp_bug_unearth.py", line 20, in <module>
    print(resp.json()["versions"])
  File "/home/appuser/.local/lib/python3.9/site-packages/requests/models.py", line 968, in json
    raise RequestsJSONDecodeError(e.msg, e.doc, e.pos)
requests.exceptions.JSONDecodeError: Unterminated string starting at: line 1 column 10236 (char 10235)

Note that everything is fine with pip in the same environment (GitLab CI). In particular, pip index versions flit-core prints the correct data.

print(resp.content[-400:])

Did this line play an important role in reproducing the issue?

No. This line was only to visualize what happens, i.e. the response is truncated.

If requests itself can reproduce this, why not ask about it there? I don't think any behavior of requests can be tweaked via arguments to bypass this.
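Before reporting upstream, it may help to check whether the server announced more bytes than were received. The sketch below is an assumption-laden illustration: body_is_truncated is a hypothetical helper, and the comparison is only meaningful when the response carries a Content-Length header and no Content-Encoding, since a decoded body no longer matches the declared length.

```python
from typing import Mapping, Optional


def body_is_truncated(headers: Mapping[str, str], body: bytes) -> Optional[bool]:
    """Compare the declared Content-Length with the bytes actually received.

    Returns None when the comparison is not meaningful (no Content-Length,
    an unparsable one, or a content encoding such as gzip). Hypothetical
    helper for diagnosis only.
    """
    normalized = {key.lower(): value for key, value in headers.items()}
    encoding = normalized.get("content-encoding", "identity").strip().lower()
    if encoding not in ("", "identity"):
        return None
    declared = normalized.get("content-length")
    if declared is None or not declared.strip().isdigit():
        return None
    return len(body) < int(declared)
```

With the reproduction script above, this would be called as body_is_truncated(resp.headers, resp.content); a True result would point to the connection or an intermediary rather than to the JSON parsing.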