Dingkang Liang (dk-liang)

Company: Huazhong University of Science & Technology

Location: Luoyu Road 1037, Wuhan, China

Home Page: https://dk-liang.github.io/

Dingkang Liang's starred repositories

llama3

The official Meta Llama 3 GitHub site

Language: Python; License: NOASSERTION; Stargazers: 22372; Issues: 182; Issues: 176

Open-Sora

Open-Sora: Democratizing Efficient Video Production for All

Language: Python; License: Apache-2.0; Stargazers: 19262; Issues: 170; Issues: 329

Omost

Your image is almost there!

Language: Python; License: Apache-2.0; Stargazers: 6544; Issues: 39; Issues: 62

VAR

[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!

Language: Python; License: MIT; Stargazers: 3727; Issues: 110; Issues: 68

MobileAgent

Mobile-Agent: The Powerful Mobile Device Operation Assistant Family

Language: Python; License: MIT; Stargazers: 2231; Issues: 35; Issues: 26

InternLM-XComposer

InternLM-XComposer2 is a groundbreaking vision-language large model (VLLM) excelling in free-form text-image composition and comprehension.

MambaOut

MambaOut: Do We Really Need Mamba for Vision?

Language: Python; License: Apache-2.0; Stargazers: 1841; Issues: 6; Issues: 239

2d-gaussian-splatting

[SIGGRAPH'24] 2D Gaussian Splatting for Geometrically Accurate Radiance Fields

Language: Python; License: NOASSERTION; Stargazers: 1530; Issues: 40; Issues: 80

flash-linear-attention

Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton

Language: Python; License: MIT; Stargazers: 659; Issues: 19; Issues: 21

OneLLM

[CVPR 2024] OneLLM: One Framework to Align All Modalities with Language

Language: Python; License: NOASSERTION; Stargazers: 500; Issues: 11; Issues: 19

RADIO

Official repository for "AM-RADIO: Reduce All Domains Into One"

Language: Python; License: NOASSERTION; Stargazers: 487; Issues: 21; Issues: 18

Vista

A Generalizable World Model for Autonomous Driving

Language: Python; License: Apache-2.0; Stargazers: 342; Issues: 18; Issues: 13

Vision-RWKV

Vision-RWKV: Efficient and Scalable Visual Perception with RWKV-Like Architectures

Language: Python; License: Apache-2.0; Stargazers: 278; Issues: 5; Issues: 26

MA-LMM

(2024CVPR) MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding

Language: Python; License: MIT; Stargazers: 166; Issues: 4; Issues: 22

nxtp

Object Recognition as Next Token Prediction (CVPR 2024)

Language: Python; License: NOASSERTION; Stargazers: 137; Issues: 2; Issues: 2

numpy-hilbert-curve

NumPy implementation of Hilbert curves in arbitrary dimensions

Language: Python; License: MIT; Stargazers: 137; Issues: 5; Issues: 0

GenAD

GenAD: Generative End-to-End Autonomous Driving

Language: Python; License: Apache-2.0; Stargazers: 131; Issues: 7; Issues: 8

WidthFormer

WidthFormer: Toward Efficient Transformer-based BEV View Transformation

Language: Python; License: Apache-2.0; Stargazers: 118; Issues: 12; Issues: 15

vHeat

vHeat: Building Vision Models upon Heat Conduction

DiG

DiG: Scalable and Efficient Diffusion Models with Gated Linear Attention

Language: Python; License: MIT; Stargazers: 85; Issues: 0; Issues: 0
Language: Python; License: MIT; Stargazers: 69; Issues: 0; Issues: 0

1d-tokenizer

This repo contains the code for our paper An Image is Worth 32 Tokens for Reconstruction and Generation

Language: Jupyter Notebook; License: Apache-2.0; Stargazers: 62; Issues: 0; Issues: 0

Vision-Mamba-A-Comprehensive-Survey-and-Taxonomy

Vision Mamba: A Comprehensive Survey and Taxonomy

SOLE

Official code of "Segment any 3D Object with Language"

Language: Python; License: MIT; Stargazers: 31; Issues: 5; Issues: 4
Stargazers: 6; Issues: 0; Issues: 0

ViTWSS3D

[ICCV 23] A Simple Vision Transformer for Weakly Semi-supervised 3D Object Detection

Language: Python; License: Apache-2.0; Stargazers: 6; Issues: 0; Issues: 0