wm901115nwpu's starred repositories

ray

Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

Language: Python | License: Apache-2.0 | Stargazers: 31201 | Issues: 473 | Issues: 17415

jax

Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more

Language: Python | License: Apache-2.0 | Stargazers: 28045 | Issues: 323 | Issues: 5132
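
The "composable transformations" idea can be illustrated with a toy pure-Python analogue. This is a conceptual sketch, not JAX's API or implementation: real JAX derives exact gradients by tracing programs, while this toy `grad` uses finite differences; the point is only that transformations are functions on functions and therefore compose.

```python
# Toy analogues of composable function transformations.
# NOT JAX: real JAX traces programs and differentiates exactly;
# this sketch approximates derivatives with central differences.

def grad(f, h=1e-5):
    """Transform f into a function returning its (approximate) derivative."""
    return lambda x: (f(x + h) - f(x - h)) / (2 * h)

def vmap(f):
    """Transform f into a function mapped over a list of inputs."""
    return lambda xs: [f(x) for x in xs]

square = lambda x: x * x

dsquare = grad(square)            # x -> ~2x
ddsquare = grad(grad(square))     # transformations compose: x -> ~2
batched = vmap(grad(square))      # and mix: [x1, x2] -> [~2*x1, ~2*x2]
```

In JAX itself the same composition reads `jax.vmap(jax.grad(f))`, with `jax.jit` wrappable around either.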

generative-models

Generative Models by Stability AI

Language: Python | License: MIT | Stargazers: 22375 | Issues: 236 | Issues: 259

mlx

MLX: An array framework for Apple silicon

Open-Sora-Plan

This project aims to reproduce Sora (OpenAI's text-to-video model); we hope the open-source community will contribute to it.

Language: Python | License: MIT | Stargazers: 10205 | Issues: 152 | Issues: 149

yolov9

Implementation of the paper "YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information"

Language: Python | License: GPL-3.0 | Stargazers: 7959 | Issues: 53 | Issues: 360

text-generation-inference

Large Language Model Text Generation Inference

Language: Python | License: Apache-2.0 | Stargazers: 7937 | Issues: 99 | Issues: 1050

gemma.cpp

A lightweight, standalone C++ inference engine for Google's Gemma models.

Language: C++ | License: Apache-2.0 | Stargazers: 5529 | Issues: 38 | Issues: 65

trlx

A repo for distributed training of language models with Reinforcement Learning from Human Feedback (RLHF)

Language: Python | License: MIT | Stargazers: 4331 | Issues: 49 | Issues: 282

DiffSinger

DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code

Language: Python | License: MIT | Stargazers: 4114 | Issues: 41 | Issues: 98

alignment-handbook

Robust recipes to align language models with human and AI preferences

Language: Python | License: Apache-2.0 | Stargazers: 3842 | Issues: 110 | Issues: 111

LLaMA2-Accessory

An Open-source Toolkit for LLM Development

Language: Python | License: NOASSERTION | Stargazers: 2516 | Issues: 36 | Issues: 125

Qwen-Agent

Agent framework and applications built upon Qwen1.5, featuring Function Calling, Code Interpreter, RAG, and Chrome extension.

Language: Python | License: NOASSERTION | Stargazers: 1689 | Issues: 27 | Issues: 135

GaLore

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

Language: Python | License: Apache-2.0 | Stargazers: 1126 | Issues: 19 | Issues: 34

OpenDiT

OpenDiT: An Easy, Fast and Memory-Efficient System for DiT Training and Inference

Language: Python | License: Apache-2.0 | Stargazers: 1006 | Issues: 20 | Issues: 51

LlamaEdge

The easiest & fastest way to run customized and fine-tuned LLMs locally or on the edge

Language: Rust | License: Apache-2.0 | Stargazers: 645 | Issues: 16 | Issues: 79

tensorrtllm_backend

The Triton TensorRT-LLM Backend

Language: Python | License: Apache-2.0 | Stargazers: 485 | Issues: 24 | Issues: 353

flash-linear-attention

Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton

Language: Python | License: MIT | Stargazers: 457 | Issues: 14 | Issues: 1
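
The O(n) trick behind linear attention, replacing softmax(QKᵀ)V with a running sum of feature-mapped key/value outer products, can be sketched in plain Python. This is a minimal causal example with a simple positive feature map; the repo's value is fused Triton kernels for this recurrence, which this sketch does not attempt.

```python
import math

def phi(x):
    """A simple positive feature map (elu(x) + 1); any positive map works here."""
    return [math.exp(v) if v < 0 else v + 1.0 for v in x]

def linear_attention(qs, ks, vs):
    """Causal linear attention via the O(n) recurrence.

    Maintains S = sum_j phi(k_j) v_j^T and z = sum_j phi(k_j), so each
    output is o_i = (phi(q_i)^T S_i) / (phi(q_i)^T z_i); no n x n
    attention matrix is ever materialized.
    """
    d, dv = len(ks[0]), len(vs[0])
    S = [[0.0] * dv for _ in range(d)]   # running sum of outer(phi(k), v)
    z = [0.0] * d                        # running sum of phi(k)
    outs = []
    for q, k, v in zip(qs, ks, vs):
        fq, fk = phi(q), phi(k)
        for a in range(d):
            z[a] += fk[a]
            for b in range(dv):
                S[a][b] += fk[a] * v[b]
        denom = sum(fq[a] * z[a] for a in range(d))
        outs.append([sum(fq[a] * S[a][b] for a in range(d)) / denom
                     for b in range(dv)])
    return outs
```

The recurrence is exactly equivalent to the quadratic form sum_{j<=i} (phi(q_i)·phi(k_j)) v_j normalized by the same weights, which is easy to check on small inputs.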

SoraReview

The official GitHub page for the review paper "Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models".

ring-flash-attention

Ring attention implementation with flash attention
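
Both ring and flash attention rest on the same online-softmax merge: attention over one block of keys/values can be combined with later blocks using only a running max, denominator, and weighted accumulator. A minimal single-query, scalar-value sketch in plain Python (the distributed ring itself, which circulates these partial states between devices, is not shown):

```python
import math

def attend_blocks(score_blocks, value_blocks):
    """Numerically stable blockwise softmax attention for one query.

    Processes (scores, values) one block at a time, keeping only the
    running max m, denominator l, and weighted accumulator acc; this
    triple is the partial state that ring attention passes around a
    ring of devices instead of materializing all scores at once.
    """
    m, l, acc = float("-inf"), 0.0, 0.0
    for scores, values in zip(score_blocks, value_blocks):
        m_new = max(m, max(scores))
        scale = math.exp(m - m_new)          # rescale old partial sums
        l = l * scale + sum(math.exp(s - m_new) for s in scores)
        acc = acc * scale + sum(math.exp(s - m_new) * v
                                for s, v in zip(scores, values))
        m = m_new
    return acc / l
```

However the scores are split into blocks, the result equals the full softmax-weighted average, which is what makes the blockwise (and hence distributed) computation exact rather than approximate.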

LLM_MultiAgents_Survey_Papers

Large Language Model based Multi-Agents: A Survey of Progress and Challenges

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 160 | Issues: 3 | Issues: 4

self-speculative-decoding

Code associated with the paper "Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative Decoding"

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 96 | Issues: 4 | Issues: 14
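
The draft-then-verify loop at the heart of (self-)speculative decoding can be sketched with toy deterministic models. The `target_model` and `draft_model` callables here are hypothetical stand-ins (in the paper, the draft is the same model with layers skipped), and this greedy version accepts on exact match, whereas sampling variants accept or reject probabilistically:

```python
def speculative_decode(target_model, draft_model, prompt, n_draft=4, n_tokens=8):
    """Greedy speculative decoding sketch.

    The cheap draft model proposes n_draft tokens; the target model
    checks what it would emit at each drafted position (conceptually in
    one batched pass); the longest agreeing prefix is accepted plus one
    corrected token, so every verify step yields at least one token and
    the output is identical to plain greedy decoding with the target.
    """
    seq = list(prompt)
    while len(seq) - len(prompt) < n_tokens:
        # 1. draft proposes a short continuation
        draft = []
        for _ in range(n_draft):
            draft.append(draft_model(seq + draft))
        # 2. target verifies the drafted positions
        accepted = 0
        for i in range(n_draft):
            t = target_model(seq + draft[:i])
            if t == draft[i]:
                accepted += 1
            else:
                seq.extend(draft[:accepted] + [t])  # keep prefix + correction
                break
        else:
            seq.extend(draft)                       # every drafted token accepted
    return seq[len(prompt):][:n_tokens]
```

The speedup comes from step 2 being one parallel pass over n_draft positions instead of n_draft sequential target calls; the better the draft agrees with the target, the more tokens each pass yields.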

fp6_llm

Efficient GPU support for LLM inference with 6-bit quantization (FP6).

Language: Cuda | License: Apache-2.0 | Stargazers: 80 | Issues: 0 | Issues: 0

flash-linear-rnn

Implementations of various linear RNN layers using PyTorch and Triton

QLLM

[ICLR 2024] This is the official PyTorch implementation of "QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Models"

Language: Python | License: Apache-2.0 | Stargazers: 21 | Issues: 7 | Issues: 0
Language: Python | License: BSD-3-Clause | Stargazers: 18 | Issues: 1 | Issues: 0