heluocs's starred repositories
MediaCrawler
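Crawlers for Xiaohongshu notes and comments, Douyin videos and comments, Kuaishou videos and comments, Bilibili videos and comments, and Weibo posts and comments.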
data-juicer
A one-stop data processing system to make data higher-quality, juicier, and more digestible for (multimodal) LLMs! 🍎 🍋 🌽 ➡️ 🍸 🍹 🍷
build-your-own-x
Master programming by recreating your favorite technologies from scratch.
TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
streaming-llm
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
javascript
JavaScript Style Guide
parquet-format
Apache Parquet Format
stanford_alpaca
Code and documentation to train Stanford's Alpaca models, and generate the data.
Awesome-LLMOps
An awesome, curated list of the best LLMOps tools for developers.
katalyst-core
Katalyst aims to provide a universal solution to help improve resource utilization and optimize overall costs in the cloud. This repository contains the core components of the Katalyst system, including multiple agents and centralized components.