TKONIY

Yangshen⚡Deng's starred repositories

transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

Language: Python | License: Apache-2.0 | 128624 1100 15138

Open-Sora

Open-Sora: Democratizing Efficient Video Production for All

Language: Python | License: Apache-2.0 | 20240 176 353

LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Language: Python | License: Apache-2.0 | 17946 157 1381

RWKV-LM

RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best of RNN and transformer - great performance, fast inference, saves VRAM, fast training, "infinite" ctx_len, and free sentence embedding.

Language: Python | License: Apache-2.0 | 11966 137 197

DALLE2-pytorch

Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural network, in PyTorch

Language: Python | License: MIT | 10947 122 207

Open-Sora-Plan

This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to this project.

Language: Python | License: Apache-2.0 | 10880 160 192

Yi

A series of large language models trained from scratch by developers @01-ai

Language: Python | License: Apache-2.0 | 7436 112 287

TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

Language: C++ | License: Apache-2.0 | 7399 84 1546

streaming-llm

[ICLR 2024] Efficient Streaming Language Models with Attention Sinks

Language: Python | License: MIT | 6354 60 78

github-markdown-toc

Easy TOC creation for GitHub README.md

Language: Shell | License: MIT | 3198 40 81

DeepSeek-V2

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

License: MIT | 2877 22 64

Ask-Anything

[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.

Language: Python | License: MIT | 2861 37 187

lightllm

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

Language: Python | License: Apache-2.0 | 2033 21 167

LLMAgentPapers

Must-read Papers on LLM Agents.

1423 43 7

Video-ChatGPT

[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for spatiotemporal video representation. We also introduce a rigorous 'Quantitative Evaluation Benchmarking' for video-based conversational models.

Language: Python | License: CC-BY-4.0 | 1052 14 107

punica

Serving multiple LoRA-finetuned LLMs as one

Language: Python | License: Apache-2.0 | 887 14 37

flashinfer

FlashInfer: Kernel Library for LLM Serving

Language: Cuda | License: Apache-2.0 | 760 13 70

ringattention

Transformers with Arbitrarily Large Context

Language: Python | License: Apache-2.0 | 566 5 15

Awesome-LLM-Long-Context-Modeling

📰 Must-read papers and blogs on LLM-based Long Context Modeling 🔥

License: MIT | 556 23 3

megalodon

Reference implementation of Megalodon 7B model

Language: Cuda | License: MIT | 492 14 7

dash

Scalable Hashing on Persistent Memory

Language: C++ | License: MIT | 184 6 12

TriForce

TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding

Language: Python | 140 1 5

DistServe

Disaggregated serving system for Large Language Models (LLMs).

Language: Jupyter Notebook | License: Apache-2.0 | 139 4 12

NExT-QA

NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions (CVPR'21)

Language: Python | License: MIT | 109 2 27

sarathi-serve

A low-latency & high-throughput serving engine for LLMs

Language: Python | License: Apache-2.0 | 66 0 0

MSVBASE

MSVBASE is a system that efficiently supports complex queries combining approximate similarity search with relational operators. It integrates high-dimensional vector indices into PostgreSQL, a relational database, to facilitate complex approximate similarity queries.

Language: C++ | License: MIT | 58 7 8