frank-xwang

UC Berkeley

San Francisco Bay Area

http://people.eecs.berkeley.edu/~xdwang/

XuDong Frank Wang's starred repositories

annotated_deep_learning_paper_implementations

🧑‍🏫 60 Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans (cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠

Language: Python | License: MIT | 52209 436 130

grok-1

Grok open release

Language: Python | License: Apache-2.0 | 49204 561 202

Open-Sora-Plan

This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.

Language: Python | License: MIT | 11007 164 217

xformers

Hackable and optimized Transformers building blocks, supporting a composable construction.

Language: Python | License: NOASSERTION | 8095 80 501

faster-rcnn.pytorch

A faster PyTorch implementation of Faster R-CNN

Language: Python | License: MIT | 7631 91 838

EMO

Emote Portrait Alive: Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions

dust3r

DUSt3R: Geometric 3D Vision Made Easy

Language: Python | License: NOASSERTION | 4764 54 129

PixArt-alpha

PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis

Language: Python | License: AGPL-3.0 | 2581 460

DeepSeek-VL

DeepSeek-VL: Towards Real-World Vision-Language Understanding

Language: Python | License: MIT | 1900 18 45

FreeU

FreeU: Free Lunch in Diffusion U-Net (CVPR2024 Oral)

cambrian

Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

Language: Python | License: Apache-2.0 | 1596 20 44

ml-4m

4M: Massively Multimodal Masked Modeling

Language: Python | License: Apache-2.0 | 1435 31 16

LLaVA-NeXT

Language: Python | 1360 22 100

FeatUp

Official code for "FeatUp: A Model-Agnostic Framework for Features at Any Resolution" ICLR 2024

Language: Jupyter Notebook | License: MIT | 1305 19 57

DetectAndTrack

The implementation of an algorithm presented in the CVPR18 paper: "Detect-and-Track: Efficient Pose Estimation in Videos"

Language: Python | License: Apache-2.0 | 999 59 66

RADIO

Official repository for "AM-RADIO: Reduce All Domains Into One"

Language: Python | License: NOASSERTION | 521 21 23

MaskTrackRCNN

MaskTrackRCNN for video instance segmentation based on mmdetection

Language: Python | License: Apache-2.0 | 430 6 60

garfield

[CVPR'24] Group Anything with Radiance Fields

Language: Python | License: MIT | 355 7 27

zest_code

This is the official implementation of ZeST

Language: Jupyter Notebook | License: MIT | 330 10 9

FreeSOLO

FreeSOLO for unsupervised instance segmentation, CVPR 2022

Language: Python | License: NOASSERTION | 312 4 17

Structured-Diffusion-Guidance

Training-Free Structured Diffusion Guidance for Compositional Text-to-Image Synthesis

Language: Jupyter Notebook | License: NOASSERTION | 300 7 14

scaling_on_scales

When do we not need larger vision models?

Language: Python | License: MIT | 273 6 14

LayoutGPT

Official repo for LayoutGPT

Language: Python | License: MIT | 271 13 17

ComfyUI-InstanceDiffusion

Language: Python | License: Apache-2.0 | 147 4 10

bsq-vit

[BSQ-ViT] Image and Video Tokenization with Binary Spherical Quantization

Language: Python | License: MIT | 65 5 2

Awesome-Unsupervised-Object-Localization

Curated list of awesome works on unsupervised object localization in 2D images.

License: Apache-2.0 | 61 5 1

BrainDecodesDeepNets

PyTorch implementation of "Brain Decodes Deep Nets"

Language: Jupyter Notebook | 48 2 1

CRATE-alpha

This repository includes the official implementation of our paper "Scaling White-Box Transformers for Vision"

Language: Python | 35 2 1

UnScene3D

Unsupervised 3D Instance Segmentation

Language: Python | License: BSD-3-Clause | 3100

CuVLER

Enhanced Unsupervised Object Discoveries through Exhaustive Self-Supervised Transformers

Language: Python | License: MIT | 5 20