Xu Cao (IrohXu)

Company: UIUC

Location: Palo Alto

Home Page: https://www.irohxucao.com/

Twitter: @IrohXu


Organizations
SZCHAI

Xu Cao's starred repositories

stable-diffusion-webui

Stable Diffusion web UI

Language: Python | License: AGPL-3.0 | Stargazers: 136978 | Issues: 0

PerlDiff

PerlDiff: Controllable Street View Synthesis Using Perspective-Layout Diffusion Models

Stargazers: 28 | Issues: 0

LLaMA-Factory

A WebUI for Efficient Fine-Tuning of 100+ LLMs (ACL 2024)

Language: Python | License: Apache-2.0 | Stargazers: 28069 | Issues: 0

EB1A

EB1A Full Application - I-140 and I-485

Language: TeX | Stargazers: 200 | Issues: 0

DDPM_inversion

Official pytorch implementation of the paper: "An Edit Friendly DDPM Noise Space: Inversion and Manipulations". CVPR 2024.

Language: Python | License: MIT | Stargazers: 227 | Issues: 0

HiDiffusion

[ECCV 2024] HiDiffusion: Increases the resolution and speed of your diffusion model by only adding a single line of code!

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 712 | Issues: 0

DiLightNet

Official Code Release for [SIGGRAPH 2024] DiLightNet: Fine-grained Lighting Control for Diffusion-based Image Generation

Language: Python | License: MIT | Stargazers: 61 | Issues: 0

RPG-DiffusionMaster

[ICML 2024] Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (RPG)

Language: Jupyter Notebook | Stargazers: 1615 | Issues: 0

Paints-UNDO

Understand Human Behavior to Align True Needs

Language: Python | License: Apache-2.0 | Stargazers: 3042 | Issues: 0

DriveDreamer

[ECCV 2024] DriveDreamer: Towards Real-world-driven World Models for Autonomous Driving

Stargazers: 269 | Issues: 0

Omost

Your image is almost there!

Language: Python | License: Apache-2.0 | Stargazers: 6979 | Issues: 0

Mora

Mora: More like Sora for Generalist Video Generation

Language: Python | Stargazers: 1454 | Issues: 0

SEINE

[ICLR 2024] SEINE: Short-to-Long Video Diffusion Model for Generative Transition and Prediction

Language: Python | License: Apache-2.0 | Stargazers: 869 | Issues: 0

SimGen

Simulator-conditioned Driving Scene Generation

Stargazers: 41 | Issues: 0

LayoutGPT

Official repo for LayoutGPT

Language: Python | License: MIT | Stargazers: 273 | Issues: 0

euler-scheduler

My implementation of a Diffusers-like scheduler for performing the Euler method on Conditional Flow Matching models

Language: Python | License: MIT | Stargazers: 7 | Issues: 0

Visual-Reasoning-Papers

📄 A curated list of visual reasoning papers.

Language: TeX | Stargazers: 19 | Issues: 0

MMSI

Code for "Modeling Multimodal Social Interactions: New Challenges and Baselines with Densely Aligned Representations" (CVPR 2024 Oral)

Language: Python | License: MIT | Stargazers: 7 | Issues: 0

DiverGen

DiverGen (CVPR 2024) & BSGAL (ICML 2024)

Language: Python | License: MIT | Stargazers: 33 | Issues: 0

diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.

Language: Python | License: Apache-2.0 | Stargazers: 24321 | Issues: 0

VCog-Bench

What is the Visual Cognition Gap between Humans and Multimodal LLMs?

Language: Python | License: MIT | Stargazers: 3 | Issues: 0

yolov10

YOLOv10: Real-Time End-to-End Object Detection

Language: Python | License: AGPL-3.0 | Stargazers: 8599 | Issues: 0

MapUncertaintyPrediction

[CVPR 2024 Award Candidate] Producing and Leveraging Online Map Uncertainty in Trajectory Prediction

Language: Python | License: Apache-2.0 | Stargazers: 121 | Issues: 0

REDFormer

[ITSC 23] Official codebase for the paper 'Radar Enlighten the Dark: Enhancing Low-Visibility Perception for Automated Vehicles with Camera-Radar Fusion'

Language: Python | License: MIT | Stargazers: 52 | Issues: 0

Awesome-LLM-Reasoning

Reasoning in Large Language Models: Papers and Resources, including Chain-of-Thought, Instruction-Tuning and Multimodality.

License: MIT | Stargazers: 1321 | Issues: 0

Awesome-Multimodal-Large-Language-Models

✨✨ Latest Advances on Multimodal Large Language Models

Stargazers: 10939 | Issues: 0

BLINK_Benchmark

This repo contains evaluation code for the paper "BLINK: Multimodal Large Language Models Can See but Not Perceive". https://arxiv.org/abs/2404.12390 [ECCV 2024]

Language: Python | License: Apache-2.0 | Stargazers: 89 | Issues: 0

Vista

A Generalizable World Model for Autonomous Driving

Language: Python | License: Apache-2.0 | Stargazers: 419 | Issues: 0

chroma

the AI-native open-source embedding database

Language: Rust | License: Apache-2.0 | Stargazers: 13850 | Issues: 0
Language: Python | Stargazers: 161 | Issues: 0