Xin (Eric) Wang (eric-xw)

Company: University of California, Santa Cruz

Organizations
eric-ai-lab

Xin (Eric) Wang's starred repositories

whisper

Robust Speech Recognition via Large-Scale Weak Supervision

Language: Python | License: MIT | Stargazers: 63540 | Issues: 530 | Issues: 0

segment-anything

The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 45181 | Issues: 299 | Issues: 650

stablediffusion

High-Resolution Image Synthesis with Latent Diffusion Models

Language: Python | License: MIT | Stargazers: 37284 | Issues: 442 | Issues: 292

generative-models

Generative Models by Stability AI

Language: Python | License: MIT | Stargazers: 23099 | Issues: 251 | Issues: 274

Grounded-Segment-Anything

Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 13982 | Issues: 113 | Issues: 369

gaussian-splatting

Original reference implementation of "3D Gaussian Splatting for Real-Time Radiance Field Rendering"

Language: Python | License: NOASSERTION | Stargazers: 12323 | Issues: 110 | Issues: 804

LAVIS

LAVIS - A One-stop Library for Language-Vision Intelligence

Language: Jupyter Notebook | License: BSD-3-Clause | Stargazers: 9115 | Issues: 95 | Issues: 625

Language: Python | License: NOASSERTION | Stargazers: 6062 | Issues: 70 | Issues: 115

Voyager

An Open-Ended Embodied Agent with Large Language Models

Language: JavaScript | License: MIT | Stargazers: 5317 | Issues: 62 | Issues: 139

OpenAGI

OpenAGI: When LLM Meets Domain Experts

Language: Python | License: MIT | Stargazers: 1800 | Issues: 26 | Issues: 16

Language: Python | License: Apache-2.0 | Stargazers: 1671 | Issues: 133 | Issues: 18

MetaTransformer

Meta-Transformer for Unified Multimodal Learning

Language: Python | License: Apache-2.0 | Stargazers: 1462 | Issues: 22 | Issues: 65

Multimodal-GPT

Multimodal-GPT

Language: Python | License: Apache-2.0 | Stargazers: 1435 | Issues: 12 | Issues: 16

Transformer-in-Vision

Recent Transformer-based CV and related works.

Neural-Network-Diffusion

We introduce a novel approach for parameter generation, named neural network parameter diffusion (p-diff), which employs a standard latent diffusion model to synthesize a new set of parameters.

AWSIM

Open source simulator for self-driving vehicles

Language: C# | License: NOASSERTION | Stargazers: 473 | Issues: 52 | Issues: 91

Multi-Modality-Arena

Chatbot Arena meets multi-modality! Multi-Modality Arena allows you to benchmark vision-language models side-by-side while providing images as inputs. Supports MiniGPT-4, LLaMA-Adapter V2, LLaVA, BLIP-2, and many more!

Structured-Diffusion-Guidance

Training-Free Structured Diffusion Guidance for Compositional Text-to-Image Synthesis

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 298 | Issues: 7 | Issues: 14

Language: Python | License: Apache-2.0 | Stargazers: 173 | Issues: 10 | Issues: 24

habitat-matterport3d-dataset

This repository contains code to reproduce experimental results from our HM3D paper in NeurIPS 2021.

Language: Python | License: MIT | Stargazers: 128 | Issues: 9 | Issues: 5

LLMScore

LLMScore: Unveiling the Power of Large Language Models in Text-to-Image Synthesis Evaluation

PEViT

Official implementation of AAAI 2023 paper "Parameter-efficient Model Adaptation for Vision Transformers"

Language: Python | License: MIT | Stargazers: 93 | Issues: 6 | Issues: 8

TIP

Multimodal-Procedural-Planning

Aerial-Vision-and-Dialog-Navigation

Codebase of ACL 2023 Findings "Aerial Vision-and-Dialog Navigation"

CPL

Official implementation of our EMNLP 2022 paper "CPL: Counterfactual Prompt Learning for Vision and Language Models"

Language: Python | License: MIT | Stargazers: 31 | Issues: 3 | Issues: 9

FedVLN

[ECCV 2022] Official PyTorch implementation of the paper "FedVLN: Privacy-preserving Federated Vision-and-Language Navigation"

Language: C++ | License: MIT | Stargazers: 12 | Issues: 3 | Issues: 0

T2IAT

T2IAT: Measuring Valence and Stereotypical Biases in Text-to-Image Generation

Language: Python | License: MIT | Stargazers: 7 | Issues: 4 | Issues: 0

PECTVLM

Code implementation for Findings of EMNLP 2023 paper "Parameter-Efficient Cross-lingual Transfer of Vision and Language Models via Translation-based Alignment"

Language: Smalltalk | License: MIT | Stargazers: 6 | Issues: 3 | Issues: 0

R2H

Official implementation of the EMNLP 2023 paper "R2H: Building Multimodal Navigation Helpers that Respond to Help Requests"

Language: Python | Stargazers: 3 | Issues: 3 | Issues: 0

MultipanelVQA

Code for the MultipanelVQA benchmark

Stargazers: 3 | Issues: 0 | Issues: 0