zye1996's starred repositories
Alpaca-CoT
We unify the interfaces of instruction-tuning data (e.g., CoT data), multiple LLMs, and parameter-efficient methods (e.g., LoRA, P-tuning) for easy use. This gives researchers a fine-tuning platform for large models that is easy to pick up and use. We welcome open-source enthusiasts to open any meaningful PR on this repo and integrate as many LLM-related technologies as possible.
PDF-Extract-Kit
A Comprehensive Toolkit for High-Quality PDF Content Extraction
crawlee-python
Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with BeautifulSoup, Playwright, and raw HTTP. Both headful and headless mode. With proxy rotation.
mistral.rs
Blazingly fast LLM inference.
Scrapegraph-ai
Python scraper based on AI
awesome-software-architecture
🚀 A curated list of awesome articles, videos, and other resources to learn and practice software architecture, patterns, and principles.
reddit-dataset
Dataset of threads and comments from Reddit
LexLIP-ICCV23
Official Code for the ICCV23 Paper: "LexLIP: Lexicon-Bottlenecked Language-Image Pre-Training for Large-Scale Image-Text Sparse Retrieval"