Xin (Eric) Wang (eric-xw)

Company: University of California, Santa Cruz


Organizations
eric-ai-lab

Xin (Eric) Wang's starred repositories

segment-anything

The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 45983 | Issues: 303 | Issues: 658

llama3

The official Meta Llama 3 GitHub site

Language: Python | License: NOASSERTION | Stargazers: 25121 | Issues: 206 | Issues: 215

generative-models

Generative Models by Stability AI

Language: Python | License: MIT | Stargazers: 23635 | Issues: 251 | Issues: 289

gaussian-splatting

Original reference implementation of "3D Gaussian Splatting for Real-Time Radiance Field Rendering"

Language: Python | License: NOASSERTION | Stargazers: 13031 | Issues: 112 | Issues: 856

SWE-agent

SWE-agent takes a GitHub issue and tries to automatically fix it, using GPT-4, or your LM of choice. It solves 12.47% of bugs in the SWE-bench evaluation set and takes just 1 minute to run.

Language: Python | License: MIT | Stargazers: 12181 | Issues: 87 | Issues: 342

MemGPT

Create LLM agents with long-term memory and custom tools 📚🦙

Language: Python | License: Apache-2.0 | Stargazers: 11006 | Issues: 113 | Issues: 684
Language: Python | License: Apache-2.0 | Stargazers: 7041 | Issues: 67 | Issues: 69

Voyager

An Open-Ended Embodied Agent with Large Language Models

Language: JavaScript | License: MIT | Stargazers: 5411 | Issues: 62 | Issues: 143

OpenAGI

OpenAGI: When LLM Meets Domain Experts

Language: Python | License: MIT | Stargazers: 1870 | Issues: 27 | Issues: 16
Language: Python | License: Apache-2.0 | Stargazers: 1715 | Issues: 123 | Issues: 20

MetaTransformer

Meta-Transformer for Unified Multimodal Learning

Language: Python | License: Apache-2.0 | Stargazers: 1476 | Issues: 22 | Issues: 65

Multimodal-GPT

Multimodal-GPT

Language: Python | License: Apache-2.0 | Stargazers: 1450 | Issues: 12 | Issues: 17

Neural-Network-Parameter-Diffusion

We introduce a novel approach for parameter generation, named neural network parameter diffusion (p-diff), which employs a standard latent diffusion model to synthesize a new set of parameters

AWSIM

Open source simulator for self-driving vehicles

Language: C# | License: NOASSERTION | Stargazers: 483 | Issues: 55 | Issues: 96

Multi-Modality-Arena

Chatbot Arena meets multi-modality! Multi-Modality Arena allows you to benchmark vision-language models side-by-side while providing images as inputs. Supports MiniGPT-4, LLaMA-Adapter V2, LLaVA, BLIP-2, and many more!

swap-anything

"SwapAnything: Enabling Arbitrary Object Swapping in Personalized Visual Editing"

Language: Python | License: Apache-2.0 | Stargazers: 187 | Issues: 10 | Issues: 37

LLMScore

LLMScore: Unveiling the Power of Large Language Models in Text-to-Image Synthesis Evaluation

TIP

Multimodal-Procedural-Planning

Aerial-Vision-and-Dialog-Navigation

Codebase of ACL 2023 Findings "Aerial Vision-and-Dialog Navigation"

ComCLIP

Official implementation and dataset for the NAACL 2024 paper "ComCLIP: Training-Free Compositional Image and Text Matching"

Language: Python | License: MIT | Stargazers: 27 | Issues: 3 | Issues: 0

Discffusion

Official repo for the paper "Discffusion: Discriminative Diffusion Models as Few-shot Vision and Language Learners"

Language: Python | License: MIT | Stargazers: 25 | Issues: 2 | Issues: 0

llm_coordination

Code repository for the paper "LLM-Coordination: Evaluating and Analyzing Multi-agent Coordination Abilities in Large Language Models"

Language: Python | License: MIT | Stargazers: 18 | Issues: 4 | Issues: 0

MMWorld

Official repo of the paper "MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos"

Language: Python | License: MIT | Stargazers: 16 | Issues: 1 | Issues: 0

ProbMed

"Worse than Random? An Embarrassingly Simple Probing Evaluation of Large Multimodal Models in Medical VQA"

Language: Python | Stargazers: 11 | Issues: 1 | Issues: 0

T2IAT

T2IAT: Measuring Valence and Stereotypical Biases in Text-to-Image Generation

Language: Python | License: MIT | Stargazers: 7 | Issues: 4 | Issues: 0

MultipanelVQA

Code for the MultipanelVQA benchmark "Muffin or Chihuahua? Challenging Large Vision-Language Models with Multipanel VQA"

Language: Jupyter Notebook | License: MIT | Stargazers: 6 | Issues: 3 | Issues: 0

PECTVLM

Code implementation for Findings of EMNLP 2023 paper "Parameter-Efficient Cross-lingual Transfer of Vision and Language Models via Translation-based Alignment"

Language: Smalltalk | License: MIT | Stargazers: 6 | Issues: 3 | Issues: 0

Naivgation-as-wish

Official implementation of the NAACL 2024 paper "Navigation as Attackers Wish? Towards Building Robust Embodied Agents under Federated Learning"

Language: Python | License: MIT | Stargazers: 4 | Issues: 2 | Issues: 0

R2H

Official implementation of the EMNLP 2023 paper "R2H: Building Multimodal Navigation Helpers that Respond to Help Requests"

Language: Python | Stargazers: 4 | Issues: 3 | Issues: 0