website-scraper

There are 10 repositories under the website-scraper topic.

website-scraper / node-website-scraper
Download website to local directory (including all css, images, js, etc.)
javascript website-scraper scraper nodejs hacktoberfest
Language:JavaScript 1565
imthaghost / goclone
Website Cloner - Utilizes powerful Go routines to clone websites to your computer within seconds.
cloning crawler go golang website-cloner website-scraper
Language:Go 1372
josephlimtech / linkedin-profile-scraper-api
🕵️‍♂️ LinkedIn profile scraper returning structured profile data in JSON.
crawler crawling expressjs json linkedin linkedin-bot linkedin-crawler linkedin-profile linkedin-profile-scraper linkedin-scraper linkedin-scraping nodejs profile-data puppeteer scraper scrapers scraping scraping-websites spider website-scraper
Language:TypeScript 564
z0m31en7 / Uscrapper
Uscrapper Vanta: Dive deeper into the web with this powerful open-source tool. Extract valuable insights with ease and efficiency, from both surface and deep web sources. Empower your data mining and analysis with Vanta's advanced capabilities. Fast, reliable, and user-friendly, Uscrapper Vanta is the ultimate choice for researchers and analysts.
darkweb darkweb-crawler information-extraction information-gathering osint osint-python osint-tool python reconnaissance selenium selenium-webscraper tor web-scraping webcra webcrawler webscraping website-scraper websites
Language:Python 513
website-scraper / website-scraper-puppeteer
Plugin for website-scraper which returns html for dynamic websites using puppeteer
chrome chromium hacktoberfest javascript nodejs puppeteer scraper website-scraper
Language:JavaScript 324
Kooboo / Kooboo
CMS, WebSite, Application and Ecommerce Development Tool Using JavaScript
cms development javascript kooboo magento shopify templates web-application-platform website-builder website-development website-scraper wordpress
Language:C# 318
html2rss / html2rss-web
🕸 Generates RSS feeds of any website and serves them to the web! Docker. Automatic scraping, use the built-in configs or create your own. Rolling release for speedy updates.
html2rss ruby docker scraper rss feed builder website-scraper rss-feed-scraper html2rss-configs rss-feed rolling-release rss-aggregator feed-configs webfeeds serves roda webfeed
Language:Ruby 91
erlange / wbm-dl
Wayback Machine Downloader. 🔥 Download your entire archived websites from the Internet Archive Wayback Machine.
command-line-app command-line-parser command-line-tool console console-app console-application csharp internet internet-archive internet-wayback-machine wayback-machine wayback-machine-downloader website-scraper
Language:C# 89
OSINT-TECHNOLOGIES / dpulse
DPULSE - Tool for complex approach to domain OSINT
data-gathering domain-analysis information-gathering information-security infosectools intelligence intelligence-gathering osint osint-tool web-scraping google-dorking webscraping website-scraper cybersecurity cybersecurity-education cybersecurity-tool osint-tools pentest pentest-tool pentesting
Language:Python 82
xarantolus / Collect
A server to collect & archive websites that also supports video downloads
self-hosted webinterface archive website-archive video-downloader website-scraper web-archiving
Language:TypeScript 79
LexiestLeszek / scrapeGPT
ScrapeGPT is a RAG-based Telegram bot designed to scrape and analyze websites, then answer questions based on the scraped content. The bot utilizes Retrieval Augmented Generation and webscraping to return natural language answers to the user's queries.
crawler huggingface large-language-models llm ollama proxy rag retrieval-augmented-generation robots-txt scraper telegram-bot website-scraper
Language:Python 71
MLArtist / WebScraper
Python-based web crawling script with randomized intervals, user-agent rotation, and proxy server IP rotation to outsmart website bots and prevent blocking.
beautiful-soup beautifulsoup beautifulsoup4 crawler crawling-python iprotation robots-txt scraper scraping scrapper scrapping-python user-agent website-crawler website-scraper
Language:Python 70
shurco / goClone
🌱 goClone - clone websites in seconds
cloner cloning crawler crawling go goclone golang hacktoberfest scraping scraping-websites scrapper website-cloner website-scraper wp2static
Language:Go 64
website-scraper / node-website-scraper-phantom
Plugin for website-scraper which returns html for dynamic websites using PhantomJS.
hacktoberfest javascript nodejs phantomjs scraper website-scraper
Language:JavaScript 59
CRAKZOR / linkedin-post-automator
Automatically curates and posts content to LinkedIn. It can optionally use web scraping to gather data, which is then fed to ChatGPT to craft engaging LinkedIn posts.
chatgpt-api linkedin linkedin-bot linkedin-post linkedin-posts-automation linkedin-scraper openai-api post-scheduler post-scheduling website-scraper chatgpt-bot
Language:Python 57
yuis-ice / jseval
Evaluate JavaScript on a URL through headless Chrome browser.
command-line headless-browser web-browser browser-automation pupeteer headless-browsers cmdline commandline-interface cli-utilities eval evaluator scrapers datascraping scrapping data-scraping webscrapping web-crawling scrapper website-scraper web-scrapping
Language:JavaScript 25
vlmaier / marvel-snap-scrapr
Scraper for https://marvelsnapzone.com to retrieve metadata of Marvel SNAP cards.
marvel marvel-characters marvel-snap crawler crawler-python game scraper website-crawler website-scraper
Language:Python 21
faheel / file-extensions
JSON collection of scraped file extensions, along with their description and type, from FileInfo.com
file-extensions scraper python3 scraped-data json fileinfo website-scraper
Language:Python 18
jeanrauwers / followers-scraper-serverless
Now you can keep track of your followers from YouTube, Instagram and Twitter accounts - Followers scraper API on AWS serverless
aws aws-lambda aws-serverless followers-scraper instagram instagram-scraper instagramscraper lambda nodejs-lambda scraper twitter twitter-scraper twittersc typescript webscraper webscraper-api webscraping website-scraper youtube
Language:TypeScript 18
Ashwin-op / Email-Extractor
A spider to crawl webpages
spider crawler scrapy website-scraper python
Language:Python 16
cometolearnofficial / WebHawk
Website Penetration Testing Tool With DoS Attack Feature
website hacking penetration-testing website-scraper webhawk come-to-learn ddos-attack termux
Language:Python 16
dtflare / GPTparser
Use GPTparser with your OpenAI API to scrape & parse files into structured JSON files.
dataset-creation json-mode json-parser openai-api-chatbot website-scraper
Language:Python 12
orangmuda / SECTOOL
sᴇᴀʀᴄʜ ᴇɴɢɪɴᴇ sᴄʀᴀᴘᴇʀ ᴛᴏᴏʟ (ʙᴀsʜ)
crawler crawling website-scraper scraper
Language:Shell 12
website-scraper / website-scraper-existing-directory
Plugin for website-scraper which allows saving resources to an existing directory
hacktoberfest javascript nodejs website-scraper
Language:JavaScript 11
dann1 / ndown
Bandwidth efficient scheduled downloads
aria2 bandwidth scheduler website-scraper wget youtube-downloader
Language:Shell 10
epegzz / node-scraper
Scraping websites made easy! A minimalistic yet powerful tool for collecting data from websites.
axios cheerio javascript node scraper scraping website-scraper
Language:JavaScript 9
nigeld3v / Tumblr_Image_scrape
Download ALL the images (JPEG/GIF/PNG) from any Tumblr website! This project employs Python3 and BeautifulSoup4 to scrape a Tumblr site (with the URL provided by the user) and download, page by page, all the images from the site's posts. Ideal for archiving other people's Tumblrs <3
tumblr tumblr-image-scrape beautifulsoup beautifulsoup4 archive image images comics webcomics fashion art gif gifs scraper website-scraper graphics graphics-library design blogging blog
Language:Python 9
codassassin / website-url-scraper
This is a website URL scraper built using Python.
website-scraper website-scanner url-finder url-parser
Language:Python 8
methylDragon / news-anaCrawler
Article Dataset Generator for Internet News Sites. Crawls news sites, analyses them with NLP (sentiment analysis), and pushes to a database.
scraping website-scraper dataset-generation python3 jupyter-notebook script
Language:Jupyter Notebook 8
jasniec / WebsiteParser
Simple library which parses web pages into objects using attributes
parser csharp netstandard-libraries dotnet website-scraper web-crawler
Language:C# 7
SamuraiPolix / openbible-verse-scraper
This script scrapes the verses and references from an openbible.info page into a JSON file - if needed, we use bible-api.com to translate to another bible version.
bible bible-verse bible-verse-references bible-verses bibledata python json website-scraper bible-scraper verse-scraper bible-api bible-search bible-search-engine bible-study bible-translations biblequote
Language:Python 7
thenurhabib / linkext
A Python script for automatically collecting links from a web page.
scraper website-scraper link-extractor python3 automation hacking-tool
Language:Python 7
anaustinbeing / website-scraper
Scrapes any website to retrieve all hyperlinks from it in a matter of seconds. Scraping made easy!
website webscraping webscraper website-scraper python
Language:Python 6
Hatim315 / Manhua-Manga-Manhwa_Downloader
This script downloads manhua, manga or manhwa and saves them in a directory of the same name.
image-scraper manga manga-scraper manhua manhua-scraper manhwa manhwa-scraper website-scraper scraper advanced-scraper python3 python python-3 python-scraper asynchronous asyncio async asynchronous-programming
Language:Python 6
Sachinart / alexa-rank-checker
Alexa Bulk Website Rank Checker PHP Script 2020 Latest! You can grab the website rankings of 200+ URLs at once!
php rank script seo website-scraper amazon-alexa website ranks
Language:CSS 6
ajaygithub2 / yellow-pages-scraper
A script for scraping the yellowpages.com website for name, contact, address and link
scraping scraper website-scraper yellow-pages yellow-pages-scraper
Language:Python 5
