Ondra Urban's starred repositories
playwright
Playwright is a framework for Web Testing and Automation. It allows testing Chromium, Firefox and WebKit with a single API.
docusaurus
Easy to maintain open source documentation websites.
Prompt-Engineering-Guide
🐙 Guides, papers, lectures, notebooks and resources for prompt engineering
changedetection.io
The best and simplest free open source web page change detection, website watcher, restock monitor and notification service. Designed for simplicity: monitor which websites had a text change, for free. Also covers website defacement monitoring and price change notification.
crawlee
Crawlee: a web scraping and browser automation library for Node.js to build reliable crawlers, in JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP, in both headful and headless mode, with proxy rotation.
readability
A standalone version of the readability lib
browser-fingerprinting
Analysis of bot protection systems with available countermeasures 🚿. How to defeat anti-bot systems 👻 and get around browser fingerprinting scripts 🕵️‍♂️ when scraping the web.
SourceCodeSyntaxHighlight
Quick Look extension for highlighting source code files on macOS 10.15 and later.
QLMarkdown
macOS Quick Look extension for Markdown files.
user-agents
A JavaScript library for generating random user agents with data that's updated daily.
fingerprint-suite
Browser fingerprinting tools for anonymizing your scrapers. Developed by Apify.
proxy-chain
Node.js implementation of a proxy server (think Squid) with support for SSL, authentication and upstream proxy chaining.
secret-agent
The web scraper that's nearly impossible to block - now called @ulixee/hero
got-scraping
HTTP client made for scraping based on got.
http2-wrapper
Use HTTP/2 the same way as HTTP/1
You-Dont-Know-Axios
Enriched documentation for axios
actor-scraper
House of Apify Scrapers. Generic scraping actors with a simple UI to handle complex web crawling and scraping use cases.
apify-sdk-python
The Apify SDK for Python is the official library for creating Apify Actors in Python. It provides useful features like actor lifecycle management, local storage emulation, and actor event handling.
apify-sdk-js
Apify SDK monorepo
browser-pool
A Node.js library to easily manage and rotate a pool of web browsers, using any of the popular browser automation libraries like Puppeteer, Playwright, or SecretAgent.
apify-client-python
Apify API client for Python