web-scraper

There are 96 repositories under the web-scraper topic.

firecrawl / firecrawl
🔥 The Web Data API for AI - Turn entire websites into LLM-ready markdown or structured data
ai ai-agents ai-crawler ai-scraping ai-search crawler data-extraction html-to-markdown llm markdown scraper scraping web-crawler web-data web-data-extraction web-scraper web-scraping web-search webscraping
Language:TypeScript 66911
ScrapeGraphAI / Scrapegraph-ai
Python scraper based on AI
scraping scraping-python automated-scraper llm web-crawler web-scraping ai-scraping crawler markdown rag web-crawlers ai-crawler ai-search large-language-model web-data-extraction web-search web-scraper data-extraction web-data webscraping
Language:Python 21730
getmaxun / maxun
⚡ Easiest no code web data extraction platform • Instantly turn any website into API or spreadsheet ⚡
automation no-code scraper web-automation web-scraper web-scraping api browser browser-automation playwright self-hosted robotic-process-automation rpa no-code-web-scraper agents data-extraction webscraping hacktoberfest hacktoberfest-accepted nocode
Language:TypeScript 13826
D4Vinci / Scrapling
🕷️ An undetectable, powerful, flexible, high-performance Python library to make Web Scraping Easy and Effortless as it should be!
crawler crawling crawling-python playwright python scraping selectors stealth web-scraper web-scraping web-scraping-python webscraping xpath automation ai ai-scraping data data-extraction mcp mcp-server
Language:Python 8110
BruceDone / awesome-crawler
A collection of awesome web crawlers and spiders in different languages
web-crawler crawler web-scraper spider node-crawler scraper awesome
6992
jaypyles / Scraperr
Self-hosted webscraper.
opensource self-hosted webscraper docker helm kubernetes playwright python scraping web-scraper web-scrapers web-scraping webscraping
Language:TypeScript 4681
arpit-omprakash / 100ProjectsOfCode
A list of practical knowledge-building projects.
programming projects python web-scraper music-player search-engine java javascript c cpp11 csharp processing
3504
php-curl-class / php-curl-class
PHP Curl Class makes it easy to send HTTP requests and integrate with web APIs
php curl class api api-client client framework http http-client http-proxy json php-curl php-curl-library proxy requests restful web-scraper web-scraping web-service xml
Language:PHP 3300
gosom / google-maps-scraper
Scrape data from Google Maps. Extracts data such as the name, address, phone number, website URL, rating, review count, latitude and longitude, reviews, email, and more for each place
golang google-maps-scraping web-scraper web-scraping distributed-scraper distributed-scraping google-maps
Language:Go 2409
anaskhan96 / soup
Web Scraper in Go, similar to BeautifulSoup
golang go webscraper webscraping beautifulsoup web-scraper html-node
Language:Go 2218
dipu-bd / lightnovel-crawler
Generate and download e-books from online sources.
lightnovel termux web-scraper console-app python lightnovel-crawler discord telegram kindle-books
Language:Python 1913
itsOwen / CyberScraper-2077
A Powerful web scraper powered by LLM | OpenAI, Gemini & Ollama
ai-scraping llm openai scraper webscraping gemini-api llm-scraper web-scraper
Language:Python 1886
oxylabs / google-ai-mode-scraper
Scrape Google AI Mode responses without blocks on a large scale.
ai-mode google-ai google-ai-mode proxy-scrape web-scraper web-scraper-api scraper-api
Language:Java 1451
oxylabs / how-to-scrape-amazon-product-data
The process of extracting product data from Amazon using Python, including titles, ratings, prices, images, and descriptions.
amazon amazon-scraper python web-scraper web-scraping web-scraping-python
1233
juancarlospaco / faster-than-requests
Faster requests on Python 3
python python3 python-library http-requests python-requests web-scraping download-file speed curl cython urllib faster-than-requests open-data urllib3 requests3 scrapy ndjson high-performance requests-toolbelt web-scraper
Language:Nim 1126
tholian-network / stealth
🚀 Stealth - Secure, Peer-to-Peer, Private and Automateable Web Browser/Scraper/Proxy
web-browser web-scraper web-proxy web-filter privacy-protection anonymity browser-automation
Language:JavaScript 1113
gildas-lormeau / single-file-cli
CLI tool for saving a faithful copy of a complete web page in a single HTML file (based on SingleFile)
cli nodejs single-file web-archiving web-scraper web-scraping archiving scraping-websites crawler web-crawler deno dockerfile
Language:JavaScript 1034
Oshan96 / monkey-dl
Bulk download your favourite anime episodes from your favourite anime websites
anime-downloader anime anime-search anime-fans anime-scraper web-scraper 9anime animeultima animepahe 4anime animepahe-downloader ffmpeg hls-downloader monkey-dl
Language:Python 861
je-suis-tm / web-scraping
Detailed web scraping tutorials for dummies with financial data crawlers on Reddit WallStreetBets, CME (both options and futures), US Treasury, CFTC, LME, MacroTrends, SHFE and alternative data crawlers on Tomtom, BBC, Wall Street Journal, Al Jazeera, Reuters, Financial Times, Bloomberg, CNN, Fortune, The Economist
sraping scrapper futures futures-historical-data reuters wall-street-journal bloomberg python-web-scraper news-websites financial-times web-scraping news-scraper web-scraper newsletter web-scrapers options-data financial-data data-scraping data-scraper wallstreetbets
Language:Python 833
postmodern / spidr
A versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.
spider ruby spider-links crawler web scraper web-scraping web-spider web-crawler web-scraper
Language:Ruby 827
k0rnh0li0 / onlyfans-dl
OnlyFans content downloader
onlyfans media-downloader python web-scraper
Language:Python 797
cassidoo / scrapers
A list of scrapers from around the web.
scraper web-scraper list scrape-websites
695
oxylabs / how-to-scrape-google-scholar
A guide for extracting titles, authors, and citations from Google Scholar using Python and Oxylabs SERP Scraper API.
google-scholar google-scholar-scraper python python-scraper scraper-api web-scraper web-scraping google-scholar-scrapper google-search-scraper
Language:Python 587
spekulatius / PHPScraper
A universal web-util for PHP.
php php-spider php-crawler scraping-websites php-scraper scraper scraping web-scraper web-scraping beautifulsoup scrapy php-spiders puppeteer pyppeteer chromium headless-chrome
Language:PHP 572
oxylabs / how-to-scrape-amazon-prices
Code for extracting best-selling items, search results, and currently available deals from Amazon using Python and Oxylabs E-Commerce Scraper API.
amazon amazon-scraper api python python-scraper scraper-api web-scraper web-scraping
Language:Python 532
jaebradley / basketball_reference_web_scraper
NBA Stats API via Basketball Reference
basketball-reference python nba web-scraping web-scraper
Language:HTML 517
oxylabs / quick-start-guide
Python quick start guides to get the most out of Oxylabs' Web Scraper API free trial.
scraper web-scraper oxylabs scraper-api scraper-python scrapers scraping scraping-websites web-scraping
516
0x676e67 / wreq
An ergonomic Rust HTTP Client with TLS fingerprint
http-client http2 ja3 ja4 tls websocket https akamai web-scraper fingerprint rust http tls-fingerprint crawler scraper client websocket-client
Language:Rust 508
austinoboyle / scrape-linkedin-selenium
`scrape_linkedin` is a python package that allows you to scrape personal LinkedIn profiles & company pages - turning the data into structured json.
selenium selenium-webdriver linkedin scraping web-scraper web-scraping python scrape scraper
Language:HTML 508
AlexMathew / scrapple
A framework for creating semi-automatic web content extractors
python css-selector xpath-expression web-scraper web-scraping scrapers scraping scrapy selector extractor crawler selector-expression tutorial lxml beautifulsoup
Language:Python 503
shaikhsajid1111 / social-media-profile-scrapers
Fetch user's data across social media
web-scraping web-scraper request python selenium-python facebook-scraper twitter-scraper pinterest reddit-scraper medium-scraper tiktok-scraper quora-scraper instagram-scraper pinterest-scrapper scrapping-python social-media
Language:Python 496
paulpierre / markdown-crawler
A multithreaded 🕸️ web crawler that recursively crawls a website and creates a 🔽 markdown file for each page, designed for LLM RAG
html-to-markdown html-to-markdown-converter html2md llm llmops markdown markdown-parser rag web-scraper markdown-crawler markdown-scraper md-crawler
Language:Python 414
crwlrsoft / crawler
Library for Rapid (Web) Crawler and Scraper Development
crawling php scraper scraping scraping-websites web-crawler web-crawling web-scraping hacktoberfest crawler web-scraper
Language:PHP 366
passivebot / facebook-marketplace-scraper
This repository contains a script to scrape Facebook Marketplace data using Playwright, BeautifulSoup and Streamlit.
database facebook facebook-marketing-automation facebook-marketplace python sqlite3 web-automation web-scraper web-scraping playwright playwright-python
Language:Python 353
lewisdonovan / google-news-scraper
Lightweight scraper for Google News
google-news google-news-scraper news news-scraper news-articles web-scraper crawler web-crawler news-crawler google-crawler
Language:TypeScript 348
oxylabs / web-unblocker
Free trial Web Unblocker - an AI-powered proxy solution that can bypass even the most sophisticated anti-bot systems.
unblocker web-scraper web-unblocker bypass bypasscaptcha captcha captcha-solving web-scraping-api captcha-breaking captcha-bypass amazon-captcha unblocker-website webiste-unblocker-github website-unlocker website-unblocker unblocked-websites rotate-captcha web-proxy-server unblocker-websites school-unblocker
Language:Python 325
