Burak Emre Kabakcı's starred repositories
crawlee
Crawlee: a web scraping and browser automation library for Node.js for building reliable crawlers in JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Supports both headful and headless modes, with proxy rotation.
rehype-pretty-code
Beautiful code blocks for Markdown or MDX.
duckdb-shellfs-extension
DuckDB extension allowing shell commands to be used for input and output.
unstructured
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
terraform-google-identity-engine-data-plane
Identity/Identity-Engine - Data plane Terraform module
selectorgadget
Go go CSS / DOM inspection.
json-to-ffmpeg
Experimental JSON to ffmpeg filter complex converter
ossinsight
Analysis, comparison, trends, and rankings of open source software. You can also get insights from more than 7 billion GitHub events using natural language (powered by OpenAI). Follow us on Twitter: https://twitter.com/ossinsight
datacontract-specification
The Data Contract Specification Repository
meltano-codespace-ready
Have your first Meltano project running within 5 minutes: no setup, no install, no boundaries. All inside GitHub Codespaces. (GitHub account required)
dbt-llm-tools
RAG-based LLM chatbot for dbt projects
datafusion-comet
Apache DataFusion Comet Spark Accelerator