oduwsdl / archivenow

A Tool To Push Web Resources Into Web Archives

[Feature request] Add retry logic

Igetin opened this issue

Description

The program should have retry logic for the case where a request to an archive service fails. In my experience, this happens a lot with The Internet Archive. For example:

$ archivenow --ia --is https://twitter.com/Itaoka1/status/494145244540063745
Error (The Internet Archive): HTTPSConnectionPool(host='web.archive.org', port=443): Max retries exceeded with url: /save/https://twitter.com/Itaoka1/status/494145244540063745 (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7f6d61a34d10>, 'Connection to web.archive.org timed out. (connect timeout=120)'))
https://archive.li/wip/v4SzI

I would prefer that the command not complete until the requests to all of the given archive services have succeeded, or until a maximum number of retries (per service) has been reached. The retry count should be configurable via a command-line option (e.g. --max-retries 20), with a reasonable default (5?) when the user doesn't give the option.
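
To make the request concrete, here is a rough sketch of the behaviour I have in mind. It is not based on archivenow's actual code: `push_with_retries`, `archive_all`, the `push` callables, and the exponential back-off delay are all hypothetical names and choices made up for illustration, and `max_retries` stands in for the proposed --max-retries option.

```python
import time
from typing import Callable, Dict, Optional

DEFAULT_MAX_RETRIES = 5  # the default suggested in this issue; an assumption


def push_with_retries(
    push: Callable[[str], str],
    url: str,
    max_retries: int = DEFAULT_MAX_RETRIES,
    delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Call push(url) until it succeeds or the retries are used up.

    `push` is a hypothetical stand-in for whatever function submits the
    URL to one archive service and returns the archived URL.
    """
    attempts = max_retries + 1  # the first try plus max_retries retries
    for attempt in range(attempts):
        try:
            return push(url)
        except Exception as exc:  # broad on purpose; a real version would narrow this
            print(f"Attempt {attempt + 1}/{attempts} failed: {exc}")
            if attempt < attempts - 1:
                # Wait longer after each failure (exponential back-off).
                sleep(delay * 2 ** attempt)
    return None  # every attempt failed


def archive_all(
    services: Dict[str, Callable[[str], str]],
    url: str,
    max_retries: int = DEFAULT_MAX_RETRIES,
    delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Optional[str]]:
    """Retry each service independently, so one flaky service does not affect the others."""
    return {
        name: push_with_retries(push, url, max_retries, delay, sleep)
        for name, push in services.items()
    }
```

With something like this, the retry budget is counted per service, and the command only returns once every service has either succeeded or exhausted its retries. A service that still fails maps to None, so the failure can be reported to the user at the end.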

Currently, the user has to manually re-run the archival for each service whose request failed.