yukang's repositories
NAS-quantization
The code for Joint Neural Architecture Search and Quantization
Pose-Mobile
A real-time posing app
AutoGPT
An experimental open-source attempt to make GPT-4 fully autonomous.
Awesome-BEV-Perception-Multi-Cameras
Awesome papers about Multi-Camera 3D Object Detection and Segmentation in Bird's-Eye-View, such as DETR3D, BEVDet, and BEVFormer
Composition-Stable-Diffusion
Image Composition via Stable Diffusion
DiT
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
Grounded-Segment-Anything
Marrying Grounding DINO with Segment Anything & Stable Diffusion & BLIP & Whisper & ChatBot - Automatically Detect, Segment and Generate Anything with Image, Text, and Speech Inputs
GroundingDINO
The official implementation of "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
LinK
[CVPR 2023] LinK: Linear Kernel for LiDAR-based 3D Perception
LongBench
LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding
Mask3D
Mask3D predicts accurate 3D semantic instances achieving state-of-the-art on ScanNet, ScanNet200, S3DIS and STPLS3D.
spconv
Spatial Sparse Convolution Library
spvnas
[ECCV 2020] Searching Efficient 3D Architectures with Sparse Point-Voxel Convolution
SST
Code for “Fully Sparse 3D Object Detection” & “Embracing Single Stride 3D Object Detector with Sparse Transformer”
stanford_alpaca
Code and documentation to train Stanford's Alpaca models, and generate the data.
VILA
VILA - a multi-image visual language model with training, inference and evaluation recipes, deployable from cloud to edge (Jetson Orin and laptops)