Xu Cao (IrohXu)

Company: UIUC

Location: Palo Alto

Home Page: https://www.irohxucao.com/

Twitter: @IrohXu


Organizations
SZCHAI

Xu Cao's starred repositories

RareBench

[KDD2024 ADS Track] RareBench: Can LLMs Serve as Rare Diseases Specialists?

Language: Python | License: Apache-2.0 | Stargazers: 12 | Issues: 0

Vim

[ICML 2024] Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model

Language: Python | License: Apache-2.0 | Stargazers: 2783 | Issues: 0

graphrag

A modular graph-based Retrieval-Augmented Generation (RAG) system

Language: Python | License: MIT | Stargazers: 16773 | Issues: 0

Qwen2-VL

Qwen2-VL is the multimodal large language model series developed by the Qwen team, Alibaba Cloud.

Language: Python | License: Apache-2.0 | Stargazers: 1479 | Issues: 0

Baichuan-13B

A 13B large language model developed by Baichuan Intelligent Technology

Language: Python | License: Apache-2.0 | Stargazers: 2979 | Issues: 0

CODA-LM

Official PyTorch implementation of CODA-LM (https://arxiv.org/abs/2404.10595)

Language: Python | Stargazers: 56 | Issues: 0

VisionLLM

VisionLLM Series

Language: Python | License: Apache-2.0 | Stargazers: 832 | Issues: 0

segment-anything-2

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 10419 | Issues: 0

GPT4Tools

GPT4Tools is an intelligent system that can automatically decide, control, and utilize different visual foundation models, allowing the user to interact with images during a conversation.

Language: Python | License: NOASSERTION | Stargazers: 751 | Issues: 0

Multi-Modality-Arena

Chatbot Arena meets multi-modality! Multi-Modality Arena allows you to benchmark vision-language models side-by-side while providing images as inputs. Supports MiniGPT-4, LLaMA-Adapter V2, LLaVA, BLIP-2, and many more!

Language: Python | Stargazers: 435 | Issues: 0

stable-diffusion-webui

Stable Diffusion web UI

Language: Python | License: AGPL-3.0 | Stargazers: 139129 | Issues: 0

PerlDiff

PerlDiff: Controllable Street View Synthesis Using Perspective-Layout Diffusion Models

Stargazers: 32 | Issues: 0

LLaMA-Factory

Efficiently Fine-Tune 100+ LLMs in WebUI (ACL 2024)

Language: Python | License: Apache-2.0 | Stargazers: 30253 | Issues: 0

EB1A

EB1A Full Application - I-140 and I-485

Language: TeX | Stargazers: 212 | Issues: 0

DDPM_inversion

Official PyTorch implementation of the paper "An Edit Friendly DDPM Noise Space: Inversion and Manipulations" (CVPR 2024).

Language: Python | License: MIT | Stargazers: 243 | Issues: 0

HiDiffusion

[ECCV 2024] HiDiffusion: Increases the resolution and speed of your diffusion model by only adding a single line of code!

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 731 | Issues: 0

DiLightNet

Official code release for [SIGGRAPH 2024] DiLightNet: Fine-grained Lighting Control for Diffusion-based Image Generation

Language: Python | License: MIT | Stargazers: 88 | Issues: 0

RPG-DiffusionMaster

[ICML 2024] Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (RPG)

Language: Jupyter Notebook | License: MIT | Stargazers: 1641 | Issues: 0

Paints-UNDO

Understand Human Behavior to Align True Needs

Language: Python | License: Apache-2.0 | Stargazers: 3252 | Issues: 0

DriveDreamer

[ECCV 2024] DriveDreamer: Towards Real-world-driven World Models for Autonomous Driving

Stargazers: 279 | Issues: 0

Omost

Your image is almost there!

Language: Python | License: Apache-2.0 | Stargazers: 7178 | Issues: 0

Mora

Mora: More like Sora for Generalist Video Generation

Language: Python | Stargazers: 1471 | Issues: 0

SEINE

[ICLR 2024] SEINE: Short-to-Long Video Diffusion Model for Generative Transition and Prediction

Language: Python | License: Apache-2.0 | Stargazers: 882 | Issues: 0

SimGen

Simulator-conditioned Driving Scene Generation

Stargazers: 45 | Issues: 0

LayoutGPT

Official repo for LayoutGPT

Language: Python | License: MIT | Stargazers: 282 | Issues: 0

euler-scheduler

My implementation of a Diffusers-like scheduler for performing the Euler method on Conditional Flow Matching models

Language: Python | License: MIT | Stargazers: 7 | Issues: 0

Visual-Reasoning-Papers

📄 A curated list of visual reasoning papers.

Language: TeX | Stargazers: 20 | Issues: 0

MMSI

Code for "Modeling Multimodal Social Interactions: New Challenges and Baselines with Densely Aligned Representations" (CVPR 2024 Oral)

Language: Python | License: MIT | Stargazers: 8 | Issues: 0

DiverGen

DiverGen (CVPR 2024) & BSGAL (ICML 2024)

Language: Python | License: BSD-2-Clause | Stargazers: 33 | Issues: 0

diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.

Language: Python | License: Apache-2.0 | Stargazers: 25023 | Issues: 0