dk-liang

Huazhong University of Science & Technology

Luoyu Road 1037, Wuhan, China

https://dk-liang.github.io/

Dingkang Liang's starred repositories

llama3

The official Meta Llama 3 GitHub site

Language: Python | License: NOASSERTION | 22372 182 176

Open-Sora

Open-Sora: Democratizing Efficient Video Production for All

Language: Python | License: Apache-2.0 | 19262 170 329

Omost

Your image is almost there!

Language: Python | License: Apache-2.0 | 6544 39 62

VAR

[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!

Language: Python | License: MIT | 3727 110 68

website

Language: HTML | 2315 16 14

MobileAgent

Mobile-Agent: The Powerful Mobile Device Operation Assistant Family

Language: Python | License: MIT | 2231 35 26

InternLM-XComposer

InternLM-XComposer2 is a groundbreaking vision-language large model (VLLM) excelling in free-form text-image composition and comprehension.

Language: Python | 1889 36 283

MambaOut

MambaOut: Do We Really Need Mamba for Vision?

Language: Python | License: Apache-2.0 | 1841 6 239

2d-gaussian-splatting

[SIGGRAPH'24] 2D Gaussian Splatting for Geometrically Accurate Radiance Fields

Language: Python | License: NOASSERTION | 1530 40 80

LLaVA-NeXT

Language: Python | 825 19 59

flash-linear-attention

Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton

Language: Python | License: MIT | 659 19 21

OneLLM

[CVPR 2024] OneLLM: One Framework to Align All Modalities with Language

Language: Python | License: NOASSERTION | 500 11 19

RADIO

Official repository for "AM-RADIO: Reduce All Domains Into One"

Language: Python | License: NOASSERTION | 487 21 18

Vista

A Generalizable World Model for Autonomous Driving

Language: Python | License: Apache-2.0 | 342 18 13

Vision-RWKV

Vision-RWKV: Efficient and Scalable Visual Perception with RWKV-Like Architectures

Language: Python | License: Apache-2.0 | 278 5 26

MA-LMM

(2024CVPR) MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding

Language: Python | License: MIT | 166 4 22

nxtp

Object Recognition as Next Token Prediction (CVPR 2024)

Language: Python | License: NOASSERTION | 137 2 2

numpy-hilbert-curve

Numpy implementation of Hilbert curves in arbitrary dimensions

Language: Python | License: MIT | 137 5 0

GenAD

GenAD: Generative End-to-End Autonomous Driving

Language: Python | License: Apache-2.0 | 131 7 8

OmniDrive

WidthFormer

WidthFormer: Toward Efficient Transformer-based BEV View Transformation

Language: Python | License: Apache-2.0 | 118 12 15

vHeat

vHeat: Building Vision Models upon Heat Conduction

Language: Python | 86 4 2

DiG

DiG: Scalable and Efficient Diffusion Models with Gated Linear Attention

Language: Python | License: MIT | 85 0 0

ViG

Language: Python | License: MIT | 69 0 0

1d-tokenizer

This repo contains the code for our paper An Image is Worth 32 Tokens for Reconstruction and Generation

Language: Jupyter Notebook | License: Apache-2.0 | 62 0 0

Vision-Mamba-A-Comprehensive-Survey-and-Taxonomy

Vision Mamba: A Comprehensive Survey and Taxonomy

SOLE

Official code of "Segment any 3D Object with Language"

Language: Python | License: MIT | 31 5 4

Sparse-Tuning

MoE-Jetpack

6 0 0

ViTWSS3D

[ICCV 23] A Simple Vision Transformer for Weakly Semi-supervised 3D Object Detection

Language: Python | License: Apache-2.0 | 6 0 0