Beast code in Giters

360fish's starred repositories

GoogleScraper

A Python module to scrape several search engines (such as Google, Yandex, Bing, and DuckDuckGo), with asynchronous networking support.

Language: HTML | License: Apache-2.0 | 260300

Monster-Crawler

A Tutorial Showing Scrapy Web Scraping and Data Visualization

Language: Python | 1600

lemon-agent

Plan-Validate-Solve (PVS) Agent for accurate, reliable, and reproducible workflow automation

Language: TypeScript | License: MIT | 30700

Plan-and-Solve-Prompting

Code for our ACL 2023 Paper "Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models".

Language: Python | 56000

factory-pattern-vectorstore-interface

A pattern that lets you try several vector databases while changing as little code as possible

Language: Python | 3400

hr-gpt

An AI HR Agent who lives in Slack (GPT-powered)

Language: Python | License: NOASSERTION | 5800

Uscrapper Vanta: Dive deeper into the web with this powerful open-source tool. Extract valuable insights with ease and efficiency, from both surface and deep web sources. Empower your data mining and analysis with Vanta's advanced capabilities. Fast, reliable, and user-friendly, Uscrapper Vanta is the ultimate choice for researchers and analysts.

Language: Python | License: MIT | 45300

dataflowkit

Extract structured data from websites. Website scraping.

Language: Go | License: BSD-3-Clause | 65400

amazon-scraper

A simple web scraper to extract Product Data and Pricing from Amazon

Language: Python | 31600

crawlee

Crawlee: a web scraping and browser automation library for Node.js to build reliable crawlers, in JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP, in both headful and headless mode, with proxy rotation.

Language: TypeScript | License: Apache-2.0 | 1381600