blekhmanlab / rxivist

API providing access to papers and authors scraped from biorxiv.org

Home Page: https://rxivist.org

Be more sensitive to crawling errors

rabdill opened this issue

If a call keeps failing (checking publication status, for example), the crawler shouldn't just keep hammering the remote server:

Refreshing article 31476
Determining publication status for DOI 10.1101/402800.
Error fetching publication data: ('Connection aborted.', OSError(0, 'Error'))
Retrying:
Determining publication status for DOI 10.1101/402800.
Error fetching publication data: ('Connection aborted.', OSError(0, 'Error'))
Giving up on this one for now.

Refreshing article 25634
Determining publication status for DOI 10.1101/105825.
Paper already has publication recorded. Skipping.
Recorded 2 stats for ID 25634

Refreshing article 33182
Determining publication status for DOI 10.1101/425991.
Error fetching publication data: ('Connection aborted.', OSError(0, 'Error'))
Retrying:
Determining publication status for DOI 10.1101/425991.
Error fetching publication data: ('Connection aborted.', OSError(0, 'Error'))
Giving up on this one for now.
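
One way to stop the repeated failures shown in the log above might be a small consecutive-failure tracker that pauses a whole category of calls once it has failed several times in a row. This is only a sketch, not the spider's actual code: `FailureTracker`, `refresh_articles`, `fetch_status`, the threshold of 3, and the 5-minute cooldown are all hypothetical names and values chosen for illustration.

```python
import time


class FailureTracker:
    """Counts consecutive failures for one kind of call (hypothetical helper).

    After `max_consecutive` failures in a row, further calls are refused
    until `cooldown_seconds` have passed.
    """

    def __init__(self, max_consecutive=3, cooldown_seconds=300.0,
                 clock=time.monotonic):
        self.max_consecutive = max_consecutive
        self.cooldown_seconds = cooldown_seconds
        self._clock = clock  # injectable so the behavior can be tested
        self._failures = 0
        self._paused_until = None

    def allow(self):
        """Return True if a call may be attempted right now."""
        if self._paused_until is None:
            return True
        if self._clock() >= self._paused_until:
            # Cooldown is over: reset and permit a fresh attempt.
            self._paused_until = None
            self._failures = 0
            return True
        return False

    def record_success(self):
        self._failures = 0

    def record_failure(self):
        self._failures += 1
        if self._failures >= self.max_consecutive:
            self._paused_until = self._clock() + self.cooldown_seconds


def refresh_articles(dois, fetch_status, tracker):
    """Check publication status for each DOI, backing off on repeated errors.

    `fetch_status` is a stand-in for whatever function performs the request.
    Catching OSError is an assumption based on the OSError in the log; it
    also covers the `requests` exceptions, which derive from it.
    """
    results = {}
    skipped = []
    for doi in dois:
        if not tracker.allow():
            skipped.append(doi)  # tracker is paused: no request is sent
            continue
        try:
            results[doi] = fetch_status(doi)
        except OSError:
            tracker.record_failure()
            skipped.append(doi)
        else:
            tracker.record_success()
    return results, skipped
```

With a tracker like this, an article whose status check fails is skipped without an immediate retry, and once the threshold is hit the remaining articles in the run skip the publication check entirely until the cooldown expires. Skipped DOIs are returned so they can be revisited in a later crawl.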