Yangshen⚡Deng (TKONIY)

Company: @DBGroup-SUSTech

Location: Shenzhen, China

Home Page: https://dengyangshen.netlify.app/

Organizations
DBGroup-SUSTech

Yangshen⚡Deng's starred repositories

transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

Language: Python | License: Apache-2.0 | Stargazers: 128624 | Issues: 1100 | Issues: 15138

Open-Sora

Open-Sora: Democratizing Efficient Video Production for All

Language: Python | License: Apache-2.0 | Stargazers: 20240 | Issues: 176 | Issues: 353

LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Language: Python | License: Apache-2.0 | Stargazers: 17946 | Issues: 157 | Issues: 1381

RWKV-LM

RWKV is an RNN with transformer-level LLM performance. It can be trained directly like a GPT (parallelizable), combining the best of RNNs and transformers: great performance, fast inference, low VRAM usage, fast training, "infinite" ctx_len, and free sentence embeddings.

Language: Python | License: Apache-2.0 | Stargazers: 11966 | Issues: 137 | Issues: 197
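
As a rough illustration of the recurrence described above (a minimal sketch under simplifying assumptions, not the repository's actual code: shapes, the decay formulation, and numerical stability are all simplified), the WKV mixing can be run token by token with constant per-step state:

```python
import torch

def wkv_recurrent(k, v, w, u):
    """Simplified, numerically naive WKV-style recurrence run token by token.

    k, v : (T, C) per-token key/value channels
    w    : (C,)   positive per-channel decay of the past state
    u    : (C,)   per-channel bonus applied to the current token
    """
    T, C = k.shape
    num = torch.zeros(C)   # running weighted sum of values
    den = torch.zeros(C)   # running sum of weights
    outputs = []
    for t in range(T):
        cur = torch.exp(u + k[t])                        # weight of the current token
        outputs.append((num + cur * v[t]) / (den + cur))
        decay = torch.exp(-w)                            # exponentially forget the past
        num = decay * num + torch.exp(k[t]) * v[t]
        den = decay * den + torch.exp(k[t])
    return torch.stack(outputs)                          # (T, C)

# Toy usage: per-step state is O(C), independent of sequence length.
T, C = 8, 4
out = wkv_recurrent(torch.randn(T, C), torch.randn(T, C),
                    torch.rand(C), torch.randn(C))
print(out.shape)  # torch.Size([8, 4])
```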

DALLE2-pytorch

Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural network, in PyTorch

Language: Python | License: MIT | Stargazers: 10947 | Issues: 122 | Issues: 207

Open-Sora-Plan

This project aims to reproduce Sora (OpenAI's text-to-video model); we hope the open-source community will contribute to it.

Language: Python | License: Apache-2.0 | Stargazers: 10880 | Issues: 160 | Issues: 192

Yi

A series of large language models trained from scratch by developers @01-ai

Language: Python | License: Apache-2.0 | Stargazers: 7436 | Issues: 112 | Issues: 287

TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

Language: C++ | License: Apache-2.0 | Stargazers: 7399 | Issues: 84 | Issues: 1546
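
For context, recent releases of TensorRT-LLM document a high-level Python "LLM" API along the lines of the sketch below; import paths and argument names have changed across versions, so treat the specifics here as assumptions rather than the definitive interface:

```python
# Hedged sketch of the high-level Python API described above (per recent
# TensorRT-LLM documentation; names may differ in older releases).
from tensorrt_llm import LLM, SamplingParams

prompts = ["Hello, my name is", "The capital of France is"]
sampling = SamplingParams(temperature=0.8, top_p=0.95)

# Constructing the LLM compiles the model into an optimized TensorRT engine
# for the local GPU the first time it runs.
llm = LLM(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")

for output in llm.generate(prompts, sampling):
    print(output.prompt, "->", output.outputs[0].text)
```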

streaming-llm

[ICLR 2024] Efficient Streaming Language Models with Attention Sinks

Language: Python | License: MIT | Stargazers: 6354 | Issues: 60 | Issues: 78
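
The attention-sink idea from the paper keeps a handful of initial tokens plus a sliding window of recent tokens in the KV cache; a minimal conceptual sketch of that eviction policy (not the repository's API) looks like this:

```python
def sink_cache_positions(cache_len, budget, n_sink=4):
    """Conceptual sketch of the attention-sink eviction policy: always keep
    the first n_sink tokens, then fill the remaining KV-cache budget with
    the most recent tokens. Names and defaults are illustrative."""
    if cache_len <= budget:
        return list(range(cache_len))
    n_recent = budget - n_sink
    return list(range(n_sink)) + list(range(cache_len - n_recent, cache_len))

# With a budget of 8 slots and 12 cached tokens, positions 4-7 are evicted:
print(sink_cache_positions(12, 8))  # [0, 1, 2, 3, 8, 9, 10, 11]
```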

github-markdown-toc

Easy TOC creation for GitHub README.md

Language: Shell | License: MIT | Stargazers: 3198 | Issues: 40 | Issues: 81

DeepSeek-V2

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

Ask-Anything

[CVPR 2024 Highlight][VideoChatGPT] ChatGPT with video understanding! Also supports many more LMs, such as MiniGPT-4, StableLM, and MOSS.

Language: Python | License: MIT | Stargazers: 2861 | Issues: 37 | Issues: 187

lightllm

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

Language: Python | License: Apache-2.0 | Stargazers: 2033 | Issues: 21 | Issues: 167

LLMAgentPapers

Must-read Papers on LLM Agents.

Video-ChatGPT

[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for spatiotemporal video representation. We also introduce a rigorous 'Quantitative Evaluation Benchmarking' for video-based conversational models.

Language: Python | License: CC-BY-4.0 | Stargazers: 1052 | Issues: 14 | Issues: 107

punica

Serving multiple LoRA-finetuned LLMs as one

Language: Python | License: Apache-2.0 | Stargazers: 887 | Issues: 14 | Issues: 37
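
Conceptually, serving many LoRA adapters "as one" means every request in a batch shares the base weights but applies its own low-rank delta. The naive PyTorch sketch below illustrates that computation (which punica's SGMV CUDA kernel performs efficiently); the function and parameter names here are illustrative, not punica's API:

```python
import torch

def multi_lora_linear(x, W, A, B, adapter_idx, scale=1.0):
    """Naive multi-LoRA batched linear layer.

    x           : (batch, d_in)   one token per request
    W           : (d_in, d_out)   shared base weight
    A           : (n_adapters, d_in, r)  per-adapter LoRA "down" matrices
    B           : (n_adapters, r, d_out) per-adapter LoRA "up" matrices
    adapter_idx : (batch,)        which adapter each request uses
    """
    base = x @ W                              # shared dense path for all requests
    A_sel = A[adapter_idx]                    # (batch, d_in, r)
    B_sel = B[adapter_idx]                    # (batch, r, d_out)
    delta = torch.bmm(torch.bmm(x.unsqueeze(1), A_sel), B_sel).squeeze(1)
    return base + scale * delta

# Toy usage: 3 requests, 2 different rank-4 adapters, served in one batch.
x = torch.randn(3, 16)
W = torch.randn(16, 32)
A = torch.randn(2, 16, 4)
B = torch.randn(2, 4, 32)
print(multi_lora_linear(x, W, A, B, torch.tensor([0, 1, 0])).shape)  # (3, 32)
```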

flashinfer

FlashInfer: Kernel Library for LLM Serving

Language: Cuda | License: Apache-2.0 | Stargazers: 760 | Issues: 13 | Issues: 70

ringattention

Transformers with Arbitrarily Large Context

Language: Python | License: Apache-2.0 | Stargazers: 566 | Issues: 5 | Issues: 15

Awesome-LLM-Long-Context-Modeling

📰 Must-read papers and blogs on LLM based Long Context Modeling 🔥

megalodon

Reference implementation of the Megalodon 7B model

Language: Cuda | License: MIT | Stargazers: 492 | Issues: 14 | Issues: 7

dash

Scalable Hashing on Persistent Memory

Language: C++ | License: MIT | Stargazers: 184 | Issues: 6 | Issues: 12

TriForce

TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding

DistServe

Disaggregated serving system for Large Language Models (LLMs).

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 139 | Issues: 4 | Issues: 12

NExT-QA

NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions (CVPR'21)

Language: Python | License: MIT | Stargazers: 109 | Issues: 2 | Issues: 27

sarathi-serve

A low-latency & high-throughput serving engine for LLMs

Language: Python | License: Apache-2.0 | Stargazers: 66 | Issues: 0 | Issues: 0

MSVBASE

MSVBASE is a system that efficiently supports complex queries combining approximate similarity search and relational operators. It integrates high-dimensional vector indices into PostgreSQL, a relational database, to facilitate complex approximate similarity queries.

Language: C++ | License: MIT | Stargazers: 58 | Issues: 7 | Issues: 8
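
The query pattern MSVBASE targets, mixing a relational predicate with top-k vector similarity, can be illustrated with a brute-force in-memory stand-in. This is only a conceptual sketch in NumPy: MSVBASE actually performs such queries inside PostgreSQL with approximate vector indices, and the names below are illustrative:

```python
import numpy as np

def filtered_topk(vectors, metadata, query, predicate, k=5):
    """Brute-force stand-in for a combined relational + vector-similarity query:
    keep rows passing the relational predicate, then return the k nearest
    vectors among them (exact search here, approximate index in MSVBASE)."""
    keep = [i for i, m in enumerate(metadata) if predicate(m)]
    if not keep:
        return []
    cand = vectors[keep]
    dists = np.linalg.norm(cand - query, axis=1)
    order = np.argsort(dists)[:k]
    return [(keep[i], float(dists[i])) for i in order]

# Toy usage: top-3 nearest vectors among rows whose price is below 50.
vecs = np.random.rand(100, 8).astype(np.float32)
meta = [{"price": float(p)} for p in np.random.rand(100) * 100]
print(filtered_topk(vecs, meta, np.random.rand(8), lambda m: m["price"] < 50, k=3))
```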

SwiftTransformer

High-performance Transformer implementation in C++.

Language: C++ | Stargazers: 41 | Issues: 1 | Issues: 0

FastLanesGPU

Accelerating GPU Data Processing using FastLanes Compression

Language: Cuda | License: MIT | Stargazers: 9 | Issues: 3 | Issues: 0

GPUDB-Prefetch

Source code of our DaMoN@SIGMOD 2024 paper "How Does Software Prefetching Work on GPU Query Processing?"

Language: Cuda | Stargazers: 5 | Issues: 0 | Issues: 0