For any URL, or a JSON with URLs in one of its fields, this saves the HTML and a PDF of the page.

To use with multiple documents:

    a = URLArchiver.new("multiple")     # or "multifull" to also get the full text
    a.multiarchive(json, "fieldname")
    a.genOutput                         # to get the input JSON with the paths to the HTML and PDFs

To use with a single document:

    a = URLArchiver.new("single")
    a.archiveone("url")
    a.genOutput                         # if you want a JSON with the page text and paths
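The multiple-document usage takes a JSON and the name of the field holding the URLs. As a rough sketch of what such input might look like, the snippet below builds a small JSON and lists the URLs found in one field. The records, the field name "link", and the helper `urls_in_field` are hypothetical and not part of URLArchiver; it is also an assumption that the JSON is passed as a string. The URLArchiver calls are left as comments so the snippet runs with only the Ruby standard library.

```ruby
require "json"

# Hypothetical input: an array of records, each with a URL in the "link" field.
# The records and the field name are illustrative only.
records = [
  { "title" => "Example one", "link" => "http://example.com/a" },
  { "title" => "Example two", "link" => "http://example.com/b" },
  { "title" => "No URL here" }
]
input_json = JSON.generate(records)

# Hypothetical helper (not part of URLArchiver): collect the values of one
# field across all records, skipping records that lack the field.
def urls_in_field(json, fieldname)
  JSON.parse(json).map { |record| record[fieldname] }.compact
end

urls = urls_in_field(input_json, "link")
puts urls

# With URLArchiver loaded, the calls described above would then be
# (assuming the JSON is passed as a string):
#   a = URLArchiver.new("multiple")
#   a.multiarchive(input_json, "link")
#   puts a.genOutput
```

Checking the field this way before archiving can help catch a misspelled field name early, since a wrong name yields an empty list here.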