rbSparky / Scraper

Cybersec Project

Home Page: https://web-scraper-utility.rbsparky.repl.co/

Web Scraper Utility

Hosted on Replit, along with the code. Built with Flask, Jinja2, and Bootstrap.

Features

Extracting lists, tables, and other data from all HTML tags

  • All data is collected and stored in a JSON file.
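
As a rough illustration of this feature, the sketch below pulls list items and table rows out of an HTML string and serializes them to JSON. It is not the project's actual code: the class `ListTableExtractor` and the function `extract_to_json` are hypothetical names, and it uses only the Python standard library rather than whatever parser the project relies on. Nested lists and nested tables are not handled.

```python
import json
from html.parser import HTMLParser


class ListTableExtractor(HTMLParser):
    """Hypothetical helper: collects text from list items and table rows."""

    def __init__(self):
        super().__init__()
        self.lists = []    # one list of item strings per <ul>/<ol>
        self.tables = []   # one list of rows per <table>
        self._row = None   # cells of the row currently being read
        self._text = None  # text buffer for the current <li>/<td>/<th>

    def handle_starttag(self, tag, attrs):
        if tag in ("ul", "ol"):
            self.lists.append([])
        elif tag == "table":
            self.tables.append([])
        elif tag == "tr":
            self._row = []
        elif tag in ("li", "td", "th"):
            self._text = []

    def handle_data(self, data):
        # Only keep text that sits inside an item or a cell.
        if self._text is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "li" and self._text is not None:
            if self.lists:
                self.lists[-1].append("".join(self._text).strip())
            self._text = None
        elif tag in ("td", "th") and self._text is not None:
            if self._row is not None:
                self._row.append("".join(self._text).strip())
            self._text = None
        elif tag == "tr" and self._row is not None:
            if self.tables:
                self.tables[-1].append(self._row)
            self._row = None


def extract_to_json(html):
    """Return the extracted lists and tables as a JSON string."""
    parser = ListTableExtractor()
    parser.feed(html)
    parser.close()
    return json.dumps({"lists": parser.lists, "tables": parser.tables}, indent=2)
```

Writing the returned string to a file would give the kind of JSON output described above.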

Search result summarizer with NLP

Generating sitemap

  • Crawls a site up to a given depth to map the site's overall architecture.
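
A depth-limited crawl of this kind can be sketched as a breadth-first search over same-site links. This is only an assumption about how such a feature might work, not the project's implementation: `build_sitemap`, `LinkParser`, and `fetch_html` are hypothetical names, and the `fetch` parameter exists so the crawl can be tried without network access.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Hypothetical helper: collects href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch_html(url):
    """Default fetcher: download a page and decode it as text."""
    with urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def build_sitemap(start_url, max_depth, fetch=fetch_html):
    """Breadth-first crawl; returns {url: [same-site links found on it]}."""
    site = urlparse(start_url).netloc
    sitemap = {}
    seen = {start_url}
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        try:
            html = fetch(url)
        except OSError:
            continue  # skip pages that cannot be fetched
        parser = LinkParser()
        parser.feed(html)
        children = []
        for href in parser.links:
            # Resolve relative links and drop "#fragment" parts.
            link = urldefrag(urljoin(url, href)).url
            if urlparse(link).netloc == site and link not in children:
                children.append(link)
        sitemap[url] = children
        # Only follow links while below the depth limit.
        if depth < max_depth:
            for link in children:
                if link not in seen:
                    seen.add(link)
                    queue.append((link, depth + 1))
    return sitemap
```

With `max_depth=0` only the start page is fetched; each increment follows links one level further. A real crawler would also want to respect robots.txt and rate limits, which this sketch leaves out.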

Improvements to be made

TBD:

  • Image scraping
  • Batch Image Downloading
  • JSON Download Feature


Languages

Python 45.5%, HTML 41.7%, CSS 10.5%, Nix 2.3%