caoxuwen's starred repositories

AutoGPT

AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.

Language: Python | License: NOASSERTION | Stargazers: 168353 | Issues: 1544 | Issues: 2809

langchain

🦜🔗 Build context-aware reasoning applications

Language: Jupyter Notebook | License: MIT | Stargazers: 94880 | Issues: 691 | Issues: 7867

open-interpreter

A natural language interface for computers

Language: Python | License: AGPL-3.0 | Stargazers: 55373 | Issues: 416 | Issues: 971

gpt-engineer

Platform to experiment with the AI Software Engineer. Terminal based. NOTE: Very different from https://gptengineer.app

Language: Python | License: MIT | Stargazers: 52391 | Issues: 513 | Issues: 482

dify

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.

Language: TypeScript | License: NOASSERTION | Stargazers: 51729 | Issues: 368 | Issues: 4920

nanoGPT

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Language: Python | License: MIT | Stargazers: 37339 | Issues: 375 | Issues: 318

FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

Language: Python | License: Apache-2.0 | Stargazers: 36954 | Issues: 352 | Issues: 1830

guidance

A guidance language for controlling large language models.

Language: Jupyter Notebook | License: MIT | Stargazers: 19093 | Issues: 119 | Issues: 544

dspy

DSPy: The framework for programming—not prompting—foundation models

Language: Python | License: MIT | Stargazers: 18776 | Issues: 140 | Issues: 811

MinerU

A one-stop, open-source, high-quality data extraction tool that supports PDF, webpage, and multi-format e-book extraction.

Language: Python | License: AGPL-3.0 | Stargazers: 15492 | Issues: 80 | Issues: 546

ChuanhuChatGPT

GUI for ChatGPT API and many LLMs. Supports agents, file-based QA, GPT finetuning and query with web search. All with a neat UI.

Language: Python | License: GPL-3.0 | Stargazers: 15254 | Issues: 84 | Issues: 794

triton

Development repository for the Triton language and compiler

doris

Apache Doris is an easy-to-use, high performance and unified analytics database.

Language: Java | License: Apache-2.0 | Stargazers: 12703 | Issues: 283 | Issues: 7489

datahub

The Metadata Platform for your Data and AI Stack

Language: Java | License: Apache-2.0 | Stargazers: 9910 | Issues: 251 | Issues: 2221

Language: Jupyter Notebook | License: MIT | Stargazers: 9365 | Issues: 85 | Issues: 30

imaginAIry

Pythonic AI generation of images and videos

Language: Python | License: MIT | Stargazers: 7957 | Issues: 51 | Issues: 262

PDF-Extract-Kit

A Comprehensive Toolkit for High-Quality PDF Content Extraction

Language: Python | License: AGPL-3.0 | Stargazers: 5630 | Issues: 37 | Issues: 128

researchgpt

An LLM-based research assistant that lets you have a conversation with a research paper

Language: Python | License: MIT | Stargazers: 3556 | Issues: 41 | Issues: 61

Baichuan-13B

A 13B large language model developed by Baichuan Intelligent Technology

Language: Python | License: Apache-2.0 | Stargazers: 2982 | Issues: 32 | Issues: 193

kangas

🦘 Explore multimedia datasets at scale

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 1041 | Issues: 13 | Issues: 15

cc_net

Tools to download and clean up Common Crawl data

Language: Python | License: MIT | Stargazers: 971 | Issues: 23 | Issues: 44

tonbo

A portable embedded database using Arrow.

Language: Rust | License: Apache-2.0 | Stargazers: 774 | Issues: 14 | Issues: 65

awesome-data-catalogs

📙 Awesome Data Catalogs and Observability Platforms.

langchain-visualizer

Visualization and debugging tool for LangChain workflows

Language: Python | License: MIT | Stargazers: 721 | Issues: 9 | Issues: 24

yanagishima

Web UI for Trino, Hive and SparkSQL

Language: Java | License: Apache-2.0 | Stargazers: 634 | Issues: 29 | Issues: 127

devchat

Automate your dev tasks with AI-powered scripts, from your IDE's chat panel.

Language: Python | License: Apache-2.0 | Stargazers: 353 | Issues: 9 | Issues: 298

pdfdeal

A Python wrapper for the Doc2X API that also comes with local text processing (to improve PDF recall in RAG).

Language: Python | License: MIT | Stargazers: 197 | Issues: 2 | Issues: 9

ProX

Official Repo for "Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale"

Language: Python | License: Apache-2.0 | Stargazers: 191 | Issues: 0 | Issues: 0

uniflow-llm-based-pdf-extraction-text-cleaning-data-clustering

LLM-based text extraction from unstructured data such as PDFs, Word documents, and HTML pages. Transform and cluster the text into your desired format. Less information loss, more interpretation, and faster R&D!

Language: Python | License: Apache-2.0 | Stargazers: 187 | Issues: 5 | Issues: 13

Language: Python | License: NOASSERTION | Stargazers: 17 | Issues: 9 | Issues: 0