Add functionality to synthesize WARCs from archive.today
lesleyodu opened this issue
Feeding the mementos from a TimeMap generated by MemGator to the "synthesize warcs" action in Hypercane results in exceptions for mementos from archive.today. There appears to be a CAPTCHA.
This turned out to be far more complicated than we expected. @lesleyodu, could you summarize our email conversation as a comment here so we have it available when I can work on Hypercane again? Thanks.
- The CAPTCHA does not appear when Hypercane runs from a research network that archive.today has approved and whitelisted.
- The archive.today zip files containing the original resources are no longer available.
- HTML: we need to recreate the head (the title is OK) and undo the rewriting in the body to mimic the raw memento capability of other archives.
- Replace the otmt raw memento calculations with MementoEmbed.
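
To make the HTML item concrete, here is a rough sketch of what "recreate the head and undo rewriting in the body" could look like. It is not Hypercane code. It assumes that archive.today rewrites links into the form `https://archive.ph/o/<id>/<original-url>`, which should be checked against real mementos, and the function names `undo_link_rewriting` and `rebuild_document` are hypothetical.

```python
import re
from html import escape

# ASSUMPTION: rewritten links look like
#   https://archive.ph/o/<id>/<original-url>
# on any of the archive.today mirror domains listed here.
# Verify this against real mementos before relying on it.
REWRITE_PREFIX = re.compile(
    r"https?://archive\.(?:today|ph|is|md|vn|li|fo)/o/[A-Za-z0-9]+/(?=https?://)"
)


def undo_link_rewriting(body_html: str) -> str:
    """Strip the assumed archive.today prefix, leaving the original URL."""
    return REWRITE_PREFIX.sub("", body_html)


def rebuild_document(title: str, body_html: str) -> str:
    """Wrap the cleaned body in a minimal document with a recreated head.

    Only the title is restored, since that is the one head element the
    discussion above says is still recoverable.
    """
    head = f"<head><title>{escape(title)}</title></head>"
    body = undo_link_rewriting(body_html)
    return f"<!DOCTYPE html>\n<html>{head}<body>{body}</body></html>"
```

This only approximates a raw memento: rewritten images, stylesheets, and inline styles would each need their own handling, and the original head elements other than the title stay lost.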