XuDong Frank Wang's starred repositories
DeepSeek-VL
DeepSeek-VL: Towards Real-World Vision-Language Understanding
CRATE-alpha
This repository includes the official implementation of our paper "Scaling White-Box Transformers for Vision"
Awesome-Unsupervised-Object-Localization
Curated list of awesome works on unsupervised object localization in 2D images.
scaling_on_scales
When do we not need larger vision models?
BrainDecodesDeepNets
PyTorch implementation of "Brain Decodes Deep Nets"
annotated_deep_learning_paper_implementations
🧑‍🏫 60 Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans (cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠
faster-rcnn.pytorch
A faster pytorch implementation of faster r-cnn
DetectAndTrack
The implementation of an algorithm presented in the CVPR18 paper: "Detect-and-Track: Efficient Pose Estimation in Videos"
PixArt-alpha
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis
Open-Sora-Plan
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to this project.
Structured-Diffusion-Guidance
Training-Free Structured Diffusion Guidance for Compositional Text-to-Image Synthesis
MaskTrackRCNN
MaskTrackRCNN for video instance segmentation based on mmdetection