XuDong Frank Wang (frank-xwang)

Company: UC Berkeley

Location: San Francisco Bay Area

Home Page: http://people.eecs.berkeley.edu/~xdwang/

Twitter: @XDWang101

XuDong Frank Wang's starred repositories

bsq-vit

[BSQ-ViT] Image and Video Tokenization with Binary Spherical Quantization

Language: Python | License: MIT | Stargazers: 62 | Issues: 0

cambrian

Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

Language: Python | License: Apache-2.0 | Stargazers: 1422 | Issues: 0

DeepSeek-VL

DeepSeek-VL: Towards Real-World Vision-Language Understanding

Language: Python | License: MIT | Stargazers: 1854 | Issues: 0

ml-4m

4M: Massively Multimodal Masked Modeling

Language: Python | License: Apache-2.0 | Stargazers: 1242 | Issues: 0

CRATE-alpha

This repository includes the official implementation of our paper "Scaling White-Box Transformers for Vision"

Language: Python | Stargazers: 35 | Issues: 0

Awesome-Unsupervised-Object-Localization

Curated list of awesome works on unsupervised object localization in 2D images.

License: Apache-2.0 | Stargazers: 60 | Issues: 0

Language: Python | Stargazers: 1073 | Issues: 0

scaling_on_scales

When do we not need larger vision models?

Language: Python | License: MIT | Stargazers: 253 | Issues: 0

UnScene3D

Unsupervised 3D Instance Segmentation

Language: Python | License: BSD-3-Clause | Stargazers: 27 | Issues: 0

xformers

Hackable and optimized Transformers building blocks, supporting a composable construction.

Language: Python | License: NOASSERTION | Stargazers: 7982 | Issues: 0

BrainDecodesDeepNets

PyTorch implementation of "Brain Decodes Deep Nets"

Language: Jupyter Notebook | Stargazers: 39 | Issues: 0

annotated_deep_learning_paper_implementations

🧑‍🏫 60 implementations/tutorials of deep learning papers with side-by-side notes 📝, including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), GANs (cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠

Language: Python | License: MIT | Stargazers: 51303 | Issues: 0

zest_code

This is the official implementation of ZeST

Language: Jupyter Notebook | License: MIT | Stargazers: 327 | Issues: 0

FreeU

FreeU: Free Lunch in Diffusion U-Net (CVPR2024 Oral)

License: MIT | Stargazers: 1592 | Issues: 0

CuVLER

Enhanced Unsupervised Object Discoveries through Exhaustive Self-Supervised Transformers

Language: Python | License: MIT | Stargazers: 5 | Issues: 0

garfield

[CVPR'24] Group Anything with Radiance Fields

Language: Python | License: MIT | Stargazers: 349 | Issues: 0

FeatUp

Official code for "FeatUp: A Model-Agnostic Framework for Features at Any Resolution" ICLR 2024

Language: Jupyter Notebook | License: MIT | Stargazers: 1287 | Issues: 0

grok-1

Grok open release

Language: Python | License: Apache-2.0 | Stargazers: 49149 | Issues: 0

faster-rcnn.pytorch

A faster PyTorch implementation of Faster R-CNN

Language: Python | License: MIT | Stargazers: 7624 | Issues: 0

FreeSOLO

FreeSOLO for unsupervised instance segmentation, CVPR 2022

Language: Python | License: NOASSERTION | Stargazers: 312 | Issues: 0

DetectAndTrack

The implementation of an algorithm presented in the CVPR18 paper: "Detect-and-Track: Efficient Pose Estimation in Videos"

Language: Python | License: Apache-2.0 | Stargazers: 997 | Issues: 0

PixArt-alpha

PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis

Language: Python | License: AGPL-3.0 | Stargazers: 2525 | Issues: 0

dust3r

DUSt3R: Geometric 3D Vision Made Easy

Language: Python | License: NOASSERTION | Stargazers: 4660 | Issues: 0

Open-Sora-Plan

This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.

Language: Python | License: Apache-2.0 | Stargazers: 10879 | Issues: 0

LayoutGPT

Official repo for LayoutGPT

Language: Python | License: MIT | Stargazers: 265 | Issues: 0

Structured-Diffusion-Guidance

Training-Free Structured Diffusion Guidance for Compositional Text-to-Image Synthesis

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 298 | Issues: 0

MaskTrackRCNN

MaskTrackRCNN for video instance segmentation based on mmdetection

Language: Python | License: Apache-2.0 | Stargazers: 430 | Issues: 0

RADIO

Official repository for "AM-RADIO: Reduce All Domains Into One"

Language: Python | License: NOASSERTION | Stargazers: 501 | Issues: 0

EMO

Emote Portrait Alive: Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions

Stargazers: 7174 | Issues: 0

Language: Python | License: Apache-2.0 | Stargazers: 145 | Issues: 0