Linpeng Tang's starred repositories
data-juicer
A one-stop data processing system to make data higher-quality, juicier, and more digestible for (multimodal) LLMs! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
NeMo-Guardrails
NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems.
guardrails
Adding guardrails to large language models.
streaming-llm
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
distilabel
⚗️ distilabel is a framework for synthetic data and AI feedback for AI engineers who require high-quality outputs, full data ownership, and overall efficiency.
openllmetry
Open-source observability for your LLM application, based on OpenTelemetry
pdf2htmlEX
Convert PDF to HTML without losing text or format.
instill-core
🔮 Instill Core is a full-stack AI infrastructure tool for data, model and pipeline orchestration, designed to streamline every aspect of building versatile AI-first applications