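There are 227 repositories under the scraper topic.
A visual no-code/code-free web crawler/spider. 易采集: a visual browser automation testing, data collection, and crawler application for designing and running crawler tasks graphically, without writing code. Alias: ServiceWrapper, an intelligent service wrapping system for web applications.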
🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.
The fast, flexible, and elegant library for parsing and manipulating HTML and XML.
AIHawk aims to ease the job hunt by automating the job application process. Using artificial intelligence, it enables users to apply for multiple jobs in a tailored way.
Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation.
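newspaper3k is a news, full-text, and article metadata extraction library for Python 3. Advanced docs:
🚀 Douyin_TikTok_Download_API is an out-of-the-box, high-performance asynchronous data scraping tool for Douyin, Kuaishou, TikTok, and Bilibili. It supports API calls as well as online batch parsing and downloading.
📙 Chinese Xinhua Dictionary database, covering xiehouyu (two-part allegorical sayings), idioms, words, and Chinese characters.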
🔥Open Source No Code Web Data Extraction Platform. Turn Websites To APIs & Spreadsheets With No-Code Robots In Minutes🔥
AV movie management system; crawler for avmoo, javbus, and javlibrary; online AV video library; AV magnet link database. Japanese Adult Video Library, Adult Video Magnet Links - Japanese Adult Video Database
Use LLMs to dig out what you care about from massive amounts of information and a variety of sources daily.
A Smart, Automatic, Fast and Lightweight Web Scraper for Python
A collection of awesome web crawlers and spiders in different languages
Declarative web scraping
Distributed crawler powered by Headless Chrome
Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with BeautifulSoup, Playwright, and raw HTTP. Both headful and headless mode. With proxy rotation.
A social networking service scraper in Python
Turn any webpage into structured data using LLMs
YouTube video downloader in JavaScript.
A community-driven way to read and chat with AI bots - powered by ChatGPT.
Analysis of bot protection systems with available countermeasures 🚿. How to defeat anti-bot systems 👻 and get around browser fingerprinting scripts 🕵️‍♂️ when scraping the web?
Scrape all the media from an OnlyFans account - Updated regularly
A Japanese movie scraper plugin for Emby/Jellyfin that fetches film information from certain websites.
Do you want to LEARN NEW STUFF for FREE? Don't worry, with the power of web-scraping and automation, this script will find the necessary Udemy coupons & enroll you for PAID UDEMY COURSES, ABSOLUTELY FREE!
There was no free Dictionary API on the web when I wanted one for my friend, so I created one.
Geziyor, blazing fast web crawling & scraping framework for Go. Supports JS rendering.
Node.js scraper to get data from Google Play
Twitter API Scraper | Without an API key | Twitter Internal API | Free | Twitter scraper | Twitter Bot
Downloads and archives content from reddit