Jiazhi Yang (YTEP-ZHI)

Company: @OpenDriveLab

Location: Shanghai, China

Home Page: https://scholar.google.com/citations?user=Ju7nGX8AAAAJ&hl=zh-CN

Jiazhi Yang's starred repositories

InstantID

InstantID: Zero-shot Identity-Preserving Generation in Seconds 🔥

Language: Python | License: Apache-2.0 | Stargazers: 9703 | Issues: 122 | Issues: 168

StableCascade

Official Code for Stable Cascade

Language: Jupyter Notebook | License: MIT | Stargazers: 6279 | Issues: 57 | Issues: 111
Language: Python | License: Apache-2.0 | Stargazers: 3634 | Issues: 49 | Issues: 94

Vim

Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model

RPG-DiffusionMaster

Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (RPG)

Language: Jupyter Notebook | License: AGPL-3.0 | Stargazers: 1420 | Issues: 23 | Issues: 35

onediff

OneDiff: An out-of-the-box acceleration library for diffusion models.

hallucination-leaderboard

Leaderboard Comparing LLM Performance at Producing Hallucinations when Summarizing Short Documents

OpenLRM

An open-source implementation of Large Reconstruction Models

Language: Python | License: Apache-2.0 | Stargazers: 722 | Issues: 26 | Issues: 34

ml-aim

This repository provides the code and model checkpoints of the research paper: Scalable Pre-training of Large Autoregressive Image Models

Language: Python | License: NOASSERTION | Stargazers: 609 | Issues: 20 | Issues: 4

Bunny

A family of lightweight multimodal models.

Language: Python | License: Apache-2.0 | Stargazers: 459 | Issues: 11 | Issues: 27

gaussian_splatting_notes

A detailed explanation of the formulae behind Gaussian splatting

LaVIT

LaVIT: Unified Language-Vision Pretraining in LLM with Dynamic Discrete Visual Tokenization

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 330 | Issues: 15 | Issues: 17

local-attention

An implementation of local windowed attention for language modeling

Language: Python | License: MIT | Stargazers: 325 | Issues: 5 | Issues: 16

FreeNoise

[ICLR 2024] Code for FreeNoise based on VideoCrafter

Language: Python | License: Apache-2.0 | Stargazers: 314 | Issues: 6 | Issues: 13

DCNv4

[CVPR 2024] Deformable Convolution v4

Language: Python | License: MIT | Stargazers: 311 | Issues: 3 | Issues: 41

InfoBatch

Lossless Training Speed Up by Unbiased Dynamic Data Pruning

VIRL

Code for V-IRL: Grounding Virtual Intelligence in Real Life

particle-sfm

ParticleSfM: Exploiting Dense Point Trajectories for Localizing Moving Cameras in the Wild. ECCV 2022.

Language: C++ | License: GPL-3.0 | Stargazers: 226 | Issues: 15 | Issues: 15

Forge_VFM4AD

A comprehensive survey of forging vision foundation models for autonomous driving, including challenges, methodologies, and opportunities.

iclr2024-openreview-submissions

ICLR 2024 OpenReview Submission Data

ZeroShape

Code repository for "ZeroShape: Regression-based Zero-shot Shape Reconstruction".

clip_prs

Official implementation of "Interpreting CLIP's Image Representation via Text-Based Decomposition"

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 71 | Issues: 3 | Issues: 1

EvalCrafter

[CVPR 2024] EvalCrafter: Benchmarking and Evaluating Large Video Generation Models

Language: Jupyter Notebook | Stargazers: 65 | Issues: 1 | Issues: 9

360DVD

[CVPR 2024] 360DVD: Controllable Panorama Video Generation with 360-Degree Video Diffusion Model

POP3D

Source code for NeurIPS paper "POP-3D: Open-Vocabulary 3D Occupancy Prediction from Images"

Optix

Memory Efficient Training Framework for Large Video Generation Model

Language: Python | License: Apache-2.0 | Stargazers: 17 | Issues: 4 | Issues: 1