YP's starred repositories

ChatTTS

A generative speech model for daily dialogue.

Language: Python | License: AGPL-3.0 | Stargazers: 29705 | Issues: 172 | Issues: 480

LLMs-from-scratch

Implementing a ChatGPT-like LLM in PyTorch from scratch, step by step

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 25134 | Issues: 278 | Issues: 77

llm.c

LLM training in simple, raw C/CUDA

Language: Cuda | License: MIT | Stargazers: 22949 | Issues: 225 | Issues: 130

tiny-gpu

A minimal GPU design in Verilog to learn how GPUs work from the ground up

Language: SystemVerilog | Stargazers: 6862 | Issues: 68 | Issues: 22

hatchet

A distributed, fault-tolerant task queue

Shiro

📜 A minimalist personal website embodying the purity of paper and freshness of snow.

Language: TypeScript | License: NOASSERTION | Stargazers: 3238 | Issues: 13 | Issues: 103

LapisCV

📃 A ready-to-use resume template for Obsidian / Typora

Language: CSS | License: MIT | Stargazers: 2554 | Issues: 34 | Issues: 13

torchtitan

A native PyTorch library for large model training

Language: Python | License: BSD-3-Clause | Stargazers: 1491 | Issues: 35 | Issues: 124

ThunderKittens

Tile primitives for speedy kernels

Language: Cuda | License: MIT | Stargazers: 1456 | Issues: 26 | Issues: 22

how-to-optim-algorithm-in-cuda

How to optimize some algorithms in CUDA.

llm-reasoners

A library for advanced large language model reasoning

Language: Python | License: Apache-2.0 | Stargazers: 1071 | Issues: 15 | Issues: 33

LlamaGym

Fine-tune LLM agents with online reinforcement learning

Language: Python | License: MIT | Stargazers: 964 | Issues: 7 | Issues: 9

tiny-universe

"A White-Box Guide to Building Large Models": a fully hand-built Tiny-Universe

poe-api-wrapper

👾 A Python API wrapper for Poe.com. With this, you will have free access to GPT-4, Claude, Llama, Gemini, Mistral and more! 🚀

Language: Python | License: GPL-3.0 | Stargazers: 780 | Issues: 21 | Issues: 157

tutel

Tutel MoE: An Optimized Mixture-of-Experts Implementation

Language: Python | License: MIT | Stargazers: 703 | Issues: 15 | Issues: 58

LLM-PowerHouse-A-Curated-Guide-for-Large-Language-Models-with-Custom-Training-and-Inferencing

LLM-PowerHouse: Unleash LLMs' potential through curated tutorials, best practices, and ready-to-use code for custom training and inferencing.

Language: Jupyter Notebook | License: MIT | Stargazers: 583 | Issues: 15 | Issues: 0

EasyContext

Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware.

Language: Python | License: Apache-2.0 | Stargazers: 582 | Issues: 9 | Issues: 41

makeMoE

From-scratch implementation of a sparse mixture of experts language model inspired by Andrej Karpathy's makemore :)

Language: Jupyter Notebook | License: MIT | Stargazers: 567 | Issues: 7 | Issues: 3

awesomeMLSys

An ML Systems Onboarding list

text-clustering

Easily embed, cluster and semantically label text datasets

Language: Python | License: Apache-2.0 | Stargazers: 419 | Issues: 34 | Issues: 5

ipc

[Start here!] Flow-IPC - Modern C++ toolkit for high-speed inter-process communication (IPC)

Language: C++ | License: Apache-2.0 | Stargazers: 273 | Issues: 6 | Issues: 12

Chinese-Resume-in-Typst

A Chinese resume written in Typst: concise syntax, attractive styling, ready to use out of the box, with an optional photo

ns3-ai

Enables interaction between ns-3 and popular frameworks using Python, which means you can train and test your AI algorithms in ns-3 without changing any of the frameworks you are using now!

Language: C++ | License: GPL-2.0 | Stargazers: 219 | Issues: 12 | Issues: 82

cuda-repo

From zero to hero CUDA for accelerating maths and machine learning on GPU.

Language: Cuda | License: MIT | Stargazers: 163 | Issues: 4 | Issues: 0

scattermoe

Triton-based implementation of Sparse Mixture of Experts.

Language: Python | License: Apache-2.0 | Stargazers: 159 | Issues: 5 | Issues: 12

fms-fsdp

🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash attention v2.

Language: Python | License: Apache-2.0 | Stargazers: 141 | Issues: 11 | Issues: 31

PyNorch

Recreating PyTorch from scratch (C/C++, CUDA and Python, with multi-GPU support and automatic differentiation!)

ccml

A simple autodiff library

Language: Objective-C | Stargazers: 62 | Issues: 3 | Issues: 0

InterProcessPyObjects

High-performance and seamless sharing and modification of Python objects between processes, without the periodic overhead of serialization and deserialization. Provides fast inter-process communication (IPC) via shared memory. Supports NumPy, Torch arrays, custom classes (including dataclass), classes with methods, and asyncio

Language: Python | License: Apache-2.0 | Stargazers: 54 | Issues: 2 | Issues: 3