There are 62 repositories under the web-scraper topic.
A collection of awesome web crawlers and spiders in different languages
PHP Curl Class makes it easy to send HTTP requests and integrate with web APIs
A list of practical knowledge-building projects.
Web Scraper in Go, similar to BeautifulSoup
Generate and download e-books from online sources.
Faster requests on Python 3
:rocket: Stealth - Secure, Peer-to-Peer, Private and Automatable Web Browser/Scraper/Proxy
A versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.
Detailed web scraping tutorials for dummies with financial data crawlers on Reddit WallStreetBets, CME (both options and futures), US Treasury, CFTC, LME, MacroTrends, SHFE and alternative data crawlers on Tomtom, BBC, Wall Street Journal, Al Jazeera, Reuters, Financial Times, Bloomberg, CNN, Fortune, The Economist
Scrape data from Google Maps. Extracts data such as the name, address, phone number, website URL, rating, number of reviews, latitude and longitude, reviews, email, and more for each place
A universal web-util for PHP.
A framework for creating semi-automatic web content extractors
CLI tool for saving a faithful copy of a complete web page in a single HTML file (based on SingleFile)
`scrape_linkedin` is a Python package that allows you to scrape personal LinkedIn profiles and company pages, turning the data into structured JSON.
NBA Stats API via Basketball Reference
Fetch user's data across social media
A Reddit bot that summarizes news articles written in Spanish or English. It uses a custom built algorithm to rank words and sentences.
A collection of awesome web scrapers and crawlers.
Scrapes the front end of Facebook pages with no limitations and provides a feature to turn the data into structured JSON or CSV
Free trial Web Unblocker - an AI-powered proxy solution that can bypass even the most sophisticated anti-bot systems
Lightweight scraper for Google News
Metadata HTML scraper and parser for Node.js (supports Promises and callback style)
This repository contains a script to scrape Facebook Marketplace data using Playwright, BeautifulSoup and Streamlit.
A simple Python library that provides easy access to the SEC website for parsing filings, collecting data, and querying documents.
An Instagram bot that, given a post URL, spams mentions to increase the chances of winning Instagram giveaways.
A multithreaded 🕸️ web crawler that recursively crawls a website and creates a 🔽 markdown file for each page, designed for LLM RAG
A command line program to download Hentai videos and images from multiple websites