CrsiX / WebsiteCrawler

Retrieve and store whole websites

WebsiteCrawler

Use this tool to retrieve whole websites and store them locally.
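As a rough illustration of what retrieving a site and storing it locally involves, the sketch below collects links from a page and maps each URL to a file path. It is not taken from WebsiteCrawler itself: `LinkCollector`, `local_path` and the `mirror` output directory are hypothetical names, and only the Python standard library is used.

```python
from html.parser import HTMLParser
from pathlib import PurePosixPath
from urllib.parse import urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collect absolute URLs from the href attributes of <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the URL of the page.
                self.links.append(urljoin(self.base_url, value))


def local_path(url, root="mirror"):
    """Map a URL to a relative path below a hypothetical output directory."""
    parts = urlsplit(url)
    path = parts.path
    if path == "" or path.endswith("/"):
        path += "index.html"  # directory-style URLs need a file name
    return str(PurePosixPath(root, parts.netloc, path.lstrip("/")))
```

A crawler built on this idea would fetch each collected link in turn and write the response body to the path returned by `local_path`. The sketch does not sanitize `..` segments or query strings, which a real implementation would have to handle.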

Limitations

  • No subdomains
  • No media download (images will be supported soon)
  • Poor support for dynamically generated websites
  • No file checking (e.g. CSP) or integrity hash verification
  • No support for CORS
  • No support for dynamic loading (e.g. XHR)
  • CSS, JavaScript and other resources are never interpreted or executed
  • No authentication
  • No download of fonts, images and other resources referenced in CSS or JavaScript files
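The "No subdomains" limitation can be pictured as a strict host comparison, assuming it means that links pointing to a subdomain of the start URL are not followed (the list above does not spell this out). The helper name `same_host` is hypothetical and is not part of WebsiteCrawler.

```python
from urllib.parse import urlsplit


def same_host(start_url, candidate_url):
    """Return True only if both URLs have exactly the same host name."""
    start = urlsplit(start_url).hostname
    candidate = urlsplit(candidate_url).hostname
    # URLs without a host (e.g. mailto: links) never match.
    return start is not None and start == candidate
```

Under this assumption, a crawl started at `https://example.com/` would follow `https://example.com/a` but skip `https://blog.example.com/a`.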

About

License: BSD 3-Clause "New" or "Revised" License


Languages

Language: Python 100.0%