Yulong Wang's starred repositories
the-algorithm
Source code for Twitter's Recommendation Algorithm
Chinese-LLaMA-Alpaca
Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment (Chinese LLaMA & Alpaca LLMs)
Awesome-LLM
Awesome-LLM: a curated list of Large Language Model resources
TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
search_with_lepton
Building a quick conversation-based search demo with Lepton AI.
accelerate
🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support
bitsandbytes
Accessible large language models via k-bit quantization for PyTorch.
FasterTransformer
Transformer-related optimizations, including BERT and GPT
AITemplate
AITemplate is a Python framework that renders neural networks into high-performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
Fengshenbang-LM
Fengshenbang-LM is an open-source large language model ecosystem led by the Cognitive Computing and Natural Language Research Center of the IDEA research institute, serving as infrastructure for Chinese AIGC and cognitive intelligence.
DeepSpeed-MII
MII makes low-latency, high-throughput inference possible, powered by DeepSpeed.
smoothquant
[ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
HolisticTraceAnalysis
A library to analyze PyTorch traces.