Xin (Eric) Wang's starred repositories
segment-anything
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
stablediffusion
High-Resolution Image Synthesis with Latent Diffusion Models
generative-models
Generative Models by Stability AI
Grounded-Segment-Anything
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
gaussian-splatting
Original reference implementation of "3D Gaussian Splatting for Real-Time Radiance Field Rendering"
MetaTransformer
Meta-Transformer for Unified Multimodal Learning
Multimodal-GPT
Multimodal-GPT
Transformer-in-Vision
Recent Transformer-based CV and related works.
Neural-Network-Diffusion
We introduce a novel approach for parameter generation, named neural network parameter diffusion (p-diff), which employs a standard latent diffusion model to synthesize a new set of parameters.
Multi-Modality-Arena
Chatbot Arena meets multi-modality! Multi-Modality Arena allows you to benchmark vision-language models side-by-side while providing images as inputs. Supports MiniGPT-4, LLaMA-Adapter V2, LLaVA, BLIP-2, and many more!
Structured-Diffusion-Guidance
Training-Free Structured Diffusion Guidance for Compositional Text-to-Image Synthesis
habitat-matterport3d-dataset
This repository contains code to reproduce experimental results from our HM3D paper in NeurIPS 2021.
Aerial-Vision-and-Dialog-Navigation
Codebase for the ACL 2023 Findings paper "Aerial Vision-and-Dialog Navigation"
MultipanelVQA
Code for the MultipanelVQA benchmark