sckott / habanero

client for Crossref search API

Home Page: https://habanero.readthedocs.io

UnboundLocalError in request_class.py

vgreg opened this issue · comments

Python 3.11
Habanero 1.2.3

I'm getting the following error on line 164 of request_class.py: UnboundLocalError: cannot access local variable 'r' where it is not associated with a value.

It seems that you can reach that line (check_json(r)) with r undefined if requests.get() raises a RequestException before returning. Because the exception is caught and only printed, execution continues with r still undefined.

def _req(self, payload, should_warn):
    try:
        r = requests.get(
            self._url(),
            params=payload,
            headers=make_ua(self.mailto, self.ua_string),
        )
        r.raise_for_status()
    except requests.exceptions.HTTPError:
        try:
            f = r.json()
            raise RequestError(r.status_code, f["message"][0]["message"])
        except:
            if should_warn:
                mssg = "%s: %s" % (r.status_code, r.reason)
                warnings.warn(mssg)
                return None
            else:
                r.raise_for_status()
    except requests.exceptions.RequestException as e:
        print(e)
    check_json(r)
    return r.json()
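
The pattern described above can be reproduced without touching the network. The following is a minimal sketch, not habanero's code: FakeRequestException, failing_get, req_buggy, req_fixed and outcome are all hypothetical names, and FakeRequestException merely stands in for requests.exceptions.RequestException. Re-raising in req_fixed is one possible way to avoid the fall-through, not necessarily the change the maintainer would choose.

```python
class FakeRequestException(Exception):
    """Hypothetical stand-in for requests.exceptions.RequestException."""


def failing_get():
    # Simulates requests.get() raising before it returns a response.
    raise FakeRequestException("Connection to api.crossref.org timed out.")


def req_buggy(get):
    try:
        r = get()
    except FakeRequestException as e:
        print(e)  # exception is swallowed, so execution falls through
    return r  # r was never bound if get() raised


def req_fixed(get):
    try:
        r = get()
    except FakeRequestException as e:
        # Surface the connection failure instead of continuing without r.
        raise RuntimeError(f"request failed: {e}") from e
    return r


def outcome(func):
    # Report the name of the exception the caller ends up seeing.
    try:
        return func(failing_get)
    except Exception as exc:
        return type(exc).__name__


print(outcome(req_buggy))
print(outcome(req_fixed))
```

With the buggy variant the caller sees an UnboundLocalError that hides the real cause; with the re-raising variant the caller sees an error that carries the original connection message.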

Thank you for the report @vgreg

Can you please share a reproducible example that caused this error? It seems the except clauses aren't catching whatever error is thrown, so r is never defined. It'd be nice to have an example to figure out what that error is.

I am still looking for an example that will consistently reproduce the error. I was retrieving all articles for a set of about 150 journals and the error occurred for two of them, but I have been able to re-run the request for both with no issue the second time.

Here is a simplified version of a request that failed once but has been working every other time:

from habanero import Crossref
cr = Crossref()
query = {"issn": "0028-3932"}


responses = cr.works(
    filter=query, cursor="*", cursor_max=12000
)

cursor_max is set to slightly more than the number of DOIs for the journal.

Thanks - I'll see if I can get that to fail

This may be difficult to track down - the fact that it doesn't happen consistently suggests it's an intermittent problem with the Crossref API

I was able to reproduce a similar error and see what gets printed on line 163. Here is the exception:

HTTPSConnectionPool(host='api.crossref.org', port=443): Max retries exceeded with url: /works?filter=issn%3A0028-3932&cursor=DnF1ZXJ5VGhlbkZldGNoBgAAAAAFuuH-Fmx3VDZUUHY5VHlhdThmaGVtbFhBOVEAAAAABcXycBZPY3FES3VMU1R5R3JIWHlwQUZBcktnAAAAAAXd460WTUpsaGN0RGFRbS1yN0ZYWTJ3MG5pUQAAAAAGAprKFlVTQUNpdVFEVHZLdWVZQWxVZEJDUUEAAAAABblxaxY0bldDU3pmSlJZeWhaSGk2VHVVdHh3AAAAAAW0yJAWaUpOMms5em5SUmVMR2JjT2VGdEFtdw%3D%3D (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0xffff33f68e90>, 'Connection to api.crossref.org timed out. (connect timeout=None)'))

It seems that ConnectTimeout is derived from RequestException, so it is caught on line 162:
https://requests.readthedocs.io/en/latest/api/#requests.ConnectionError

However, the code continues to line 164 with r still undefined.
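
The inheritance claim is easy to check directly. This small sketch assumes the requests package is installed and uses only its public exception classes; the variable names are illustrative.

```python
import requests

exc = requests.exceptions

# ConnectTimeout inherits from both ConnectionError and Timeout,
# and each of those inherits from RequestException.
is_request_exception = issubclass(exc.ConnectTimeout, exc.RequestException)

# It is not an HTTPError, so the first except clause in _req does not match.
is_http_error = issubclass(exc.ConnectTimeout, exc.HTTPError)

print(is_request_exception)
print(is_http_error)
```

So a connect timeout skips the HTTPError handler and lands in the generic RequestException handler, which only prints the error.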

Okay, thanks for this. I'll try to get to this soon

Having the same issue...

HTTPSConnectionPool(host='api.crossref.org', port=443): Max retries exceeded with url: /works?query=author%3AMONAGHAN+A%2BAND%2Btitle%3A%E2%80%98CALMLY+CRITICAL%E2%80%99%3A+EVOLVING+RUSSIAN+VIEWS+OF+US+HEGEMONY%2BAND%2Byear%3A2006%2BAND%2Bjournal%3AJOURNAL+OF+STRATEGIC+STUDIES&rows=1 (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f3fec6f7f70>: Failed to establish a new connection: [Errno 101] Network is unreachable'))
An error occurred: local variable 'r' referenced before assignment

Let me know if I can help with debugging (but the run continues despite the error...)

thanks for your report @sdspieg ! Sorry about the issue. I started working on this, but I just haven't had time to finish it off. I'll let you know if I could use any help.

@vgreg @sdspieg Can both of you reinstall from Github and try again?

closing for now, if it pops up again ping here