Ren Xuancheng's starred repositories

candle

Minimalist ML framework for Rust

Language: Rust | License: Apache-2.0 | Stargazers: 14076 | Issues: 0

open-webui

User-friendly WebUI for LLMs (Formerly Ollama WebUI)

Language: Svelte | License: MIT | Stargazers: 26579 | Issues: 0

hugo-PaperMod

A fast, clean, responsive Hugo theme.

Language: HTML | License: MIT | Stargazers: 8967 | Issues: 0

nanotron

Minimalistic large language model 3D-parallelism training

Language: Python | License: Apache-2.0 | Stargazers: 901 | Issues: 0

template

This is the repository for the distill web framework

Language: JavaScript | License: Apache-2.0 | Stargazers: 769 | Issues: 0

locust

Write scalable load tests in plain Python 🚗💨

Language: Python | License: MIT | Stargazers: 23964 | Issues: 0

mediawiki-services-parsoid

This is a mirror from https://gerrit.wikimedia.org/g/mediawiki/services/parsoid/. See https://www.mediawiki.org/wiki/Developer_access for contributing.

Language: PHP | License: GPL-2.0 | Stargazers: 148 | Issues: 0

crawlee

Crawlee: a web scraping and browser automation library for Node.js to build reliable crawlers in JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP, in both headful and headless modes, with proxy rotation.

Language: TypeScript | License: Apache-2.0 | Stargazers: 12721 | Issues: 0

paimon

Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.

Language: Java | License: Apache-2.0 | Stargazers: 2027 | Issues: 0

marlin

FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.

Language: Python | License: Apache-2.0 | Stargazers: 402 | Issues: 0

python-magic

A python wrapper for libmagic

Language: Python | License: NOASSERTION | Stargazers: 2562 | Issues: 0

pandoc

Universal markup converter

Language: Haskell | License: NOASSERTION | Stargazers: 32929 | Issues: 0

marker

Convert PDF to markdown quickly with high accuracy

Language: Python | License: GPL-3.0 | Stargazers: 12168 | Issues: 0

dify

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.

Language: TypeScript | License: NOASSERTION | Stargazers: 33304 | Issues: 0

HunyuanDiT

Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding

Language: Python | License: NOASSERTION | Stargazers: 2225 | Issues: 0

data-juicer

A one-stop data processing system to make data higher-quality, juicier, and more digestible for LLMs! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷

Language: Python | License: Apache-2.0 | Stargazers: 1627 | Issues: 0

Language: Python | Stargazers: 680 | Issues: 0

internetarchive

A Python and Command-Line Interface to Archive.org

Language: Python | License: AGPL-3.0 | Stargazers: 1546 | Issues: 0

ia-download

Internet archive downloader

Language: Jupyter Notebook | Stargazers: 2 | Issues: 0

llama-cpp-python

Python bindings for llama.cpp

Language: Python | License: MIT | Stargazers: 6937 | Issues: 0

dash-cookbook

Recipes for creating AI Applications with APIs from DashScope (and friends)!

License: Apache-2.0 | Stargazers: 19 | Issues: 0

python-markdownify

Convert HTML to Markdown

Language: Python | License: MIT | Stargazers: 831 | Issues: 0

the-stack-v2

Code for the curation of The Stack v2 and StarCoder2 training data

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 70 | Issues: 0

octopack

🐙 OctoPack: Instruction Tuning Code Large Language Models

Language: Jupyter Notebook | License: MIT | Stargazers: 394 | Issues: 0

unsloth

Finetune Llama 3, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory

Language: Python | License: Apache-2.0 | Stargazers: 11356 | Issues: 0

datatrove

Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.

Language: Python | License: Apache-2.0 | Stargazers: 1645 | Issues: 0

llama3

The official Meta Llama 3 GitHub site

Language: Python | License: NOASSERTION | Stargazers: 21926 | Issues: 0

CodeQwen1.5

CodeQwen1.5 is the code version of Qwen, the large language model series developed by Qwen team, Alibaba Cloud.

Language: Python | Stargazers: 336 | Issues: 0

web-content-extraction-benchmark

Web Content Extraction Benchmark

Language: Python | License: Apache-2.0 | Stargazers: 13 | Issues: 0

yt-dlp

A feature-rich command-line audio/video downloader

Language: Python | License: Unlicense | Stargazers: 74222 | Issues: 0