oduwsdl / archivenow

A Tool To Push Web Resources Into Web Archives

Handle case where no optional parameters are specified

machawk1 opened this issue

I attempted to run the tool with no optional parameters, supplying only the positional URI parameter:

archivenow http://some-urir

and was shown the command-line help instead. It would be better to handle this usage more gracefully, e.g., by defaulting to the "--all" or "--ia" flag when no archive is explicitly specified.

Done.

Now, if no archive argument is provided, the page is pushed to "ia". If "ia" is not available, the first archive in the list is used as the default.

Example:
$ archivenow www.foxnews.com
['https://web.archive.org/web/20170215165408/http://www.foxnews.com']
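The fallback rule described above can be sketched in a few lines. This is only an illustration, not archivenow's actual code: the function name pick_default_archive, its parameters, and the placeholder keys "archive_a" and "archive_b" are hypothetical. Only the "ia" key comes from this thread.

```python
def pick_default_archive(available, preferred="ia"):
    """Return the archive key to use when the user names no archive.

    `available` is an ordered list of archive keys (hypothetical structure).
    """
    if not available:
        raise ValueError("no archive handlers are configured")
    if preferred in available:
        # "ia" is present, so it is the default.
        return preferred
    # Otherwise fall back to the first archive in the list.
    return available[0]


print(pick_default_archive(["archive_a", "ia"]))
print(pick_default_archive(["archive_a", "archive_b"]))
```

The first call selects "ia" because it is present. The second falls back to the first key in the list.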

@maturban I am getting an error when I try what you did.

$ archivenow www.foxnews.com
['The Internet Archive: Unexpected error']

It might be good to report what the "Unexpected error" is.

I am using version: 2017.2.15.12.17.30 from pip, as reported by pip show archivenow.

@machawk1 Here is how errors/exceptions are handled for now:

            ...
        except Exception as e:
            pass
        return self.name + ": Unexpected error"

where self.name is the name of an archive.

Would it be fine to do:

            ...
        except Exception as e:
            # Return inside the handler so `e` is still bound (Python 3 unbinds it after the except block)
            return "Error (" + self.name + "): " + str(e)

Or should I catch each exception separately?

@maturban It would be good to know WHY the push to the archive failed. If an archive is down, for example, you might expect requests.exceptions.ConnectTimeout; you could then perform some action in the except body and report more information about the error instead of simply saying, "An error occurred".
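As a rough sketch of that suggestion, the handler could catch the requests exception classes from most to least specific and say what went wrong. The helper name describe_push_failure and the push callable are hypothetical and not part of archivenow. The exception classes are the real ones from requests.exceptions.

```python
import requests


def describe_push_failure(archive_name, push):
    """Call push() and return its result, or a message naming the failure cause.

    `push` is any zero-argument callable that performs the HTTP request
    (hypothetical stand-in for an archive handler's push logic).
    """
    prefix = "Error (" + archive_name + "): "
    try:
        return push()
    except requests.exceptions.ConnectTimeout:
        # Checked first: ConnectTimeout subclasses both ConnectionError and Timeout.
        return prefix + "timed out while connecting; the archive may be down"
    except requests.exceptions.Timeout:
        return prefix + "the request timed out"
    except requests.exceptions.ConnectionError as e:
        return prefix + "could not connect: " + str(e)
    except requests.exceptions.RequestException as e:
        # Any other requests-level failure.
        return prefix + str(e)


def simulated_push():
    # Simulates an unreachable archive without touching the network.
    raise requests.exceptions.ConnectTimeout("simulated timeout")


print(describe_push_failure("The Internet Archive", simulated_push))
```

The order of the except clauses matters: a broader class listed first would hide the more specific ones below it.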

@machawk1
Now, a more meaningful message is shown when an error occurs.

@maturban You can probably close this ticket, as the request has been fulfilled.

I pulled a new version, killed my internet connection, and tried:

archivenow www.cs.odu.edu/~mln/

The error I got seemed weird:

["Error (The Internet Archive): HTTPSConnectionPool(host='web.archive.org', port=443): Max retries exceeded with url: /save/www.cs.odu.edu/~mln/ (Caused by NewConnectionError('<requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x105933950>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known',))"]