Amanda-Barbara's starred repositories
Open-Sora-Plan
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to this project.
flops-counter.pytorch
FLOPs counter for convolutional networks in the PyTorch framework
DeepSeek-V2
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization in both training and inference.
flashinfer
FlashInfer: Kernel Library for LLM Serving
CUDA-Learn-Notes
🎉 CUDA notes / hand-written CUDA for large models / C++ notes, updated sporadically: flash_attn, sgemm, sgemv, warp reduce, block reduce, dot product, elementwise, softmax, layernorm, rmsnorm, hist, etc.
tensorrtllm_backend
The Triton TensorRT-LLM Backend
Awesome-LLM-Long-Context-Modeling
📰 Must-read papers and blogs on LLM based Long Context Modeling 🔥
MiniGPT4-video
Official code for MiniGPT4-video
ring-flash-attention
Ring attention implementation with flash attention
Consistency_LLM
[ICML 2024] CLLMs: Consistency Large Language Models
long-context-attention
Sequence Parallel Attention for Long Context LLM Model Training and Inference
Open-Sora-Plan
This project aims to reproduce Sora (OpenAI's T2V model), but we have only limited resources. We sincerely hope the whole open-source community can contribute to this project.