Wenhao Chai (rese1f)

Company: University of Washington

Location: Seattle, US

Home Page: https://rese1f.github.io/

Twitter: @re5e1f

Organizations
CVNext

Wenhao Chai's repositories

StableVideo

[ICCV 2023] StableVideo: Text-driven Consistency-aware Diffusion Video Editing

Language: Python · License: Apache-2.0 · Stargazers: 1335 · Watchers: 20 · Issues: 23

MovieChat

[CVPR 2024] 🎬💭 Chat with over 10K frames of video!

Language: Python · License: BSD-3-Clause · Stargazers: 408 · Watchers: 10 · Issues: 57

Awesome-VQVAE

📚 A collection of resources and papers on Vector Quantized Variational Autoencoders (VQ-VAE) and their applications

CityGen

πŸ™οΈπŸŒ†πŸŒƒ Try Infinite and Controllable 3D City Layout Generation!

STEVE

β›πŸ’Ž STEVE in Minecraft is for See and Think: Embodied Agent in Virtual Environment

License: MIT · Stargazers: 27 · Watchers: 0 · Issues: 3

Awesome-DriveLM

📚 A collection of resources and papers on Large Language Models in autonomous driving

PoseDA

[ICCV 2023] Global Adaptation meets Local Generalization: Unsupervised Domain Adaptation for 3D Human Pose Estimation

UniAP

[AAAI 2024] UniAP: Towards Universal Animal Perception in Vision via Few-shot Learning

Language: Python · License: MIT · Stargazers: 9 · Watchers: 4 · Issues: 0

old_web

Personal website built on Beautiful Jekyll; feel free to clone and modify

Language: HTML · License: MIT · Stargazers: 3 · Watchers: 0 · Issues: 0

UniVHP

Unified Human-centric Perception Model and Benchmark in Sports

arxiv-daily

🎓 Automatically updates papers in selected fields using GitHub Actions (updated every 12 hours)

Language: Python · License: MIT · Stargazers: 1 · Watchers: 0 · Issues: 0

Awesome-LLM-3D

Awesome-LLM-3D: a curated list of resources on multimodal large language models in the 3D world

License: MIT · Stargazers: 1 · Watchers: 0 · Issues: 0

3D-VisTA

Official implementation of ICCV 2023 paper "3D-VisTA: Pre-trained Transformer for 3D Vision and Text Alignment"

License: MIT · Stargazers: 0 · Watchers: 0 · Issues: 0

all-seeing

This is the official implementation of the paper "The All-Seeing Project: Towards Panoptic Visual Recognition and Understanding of the Open World"

Stargazers: 0 · Watchers: 1 · Issues: 0

awesome-3D-gaussian-splatting

Curated list of papers and resources focused on 3D Gaussian Splatting, intended to keep pace with the anticipated surge of research in the coming months.

License: MIT · Stargazers: 0 · Watchers: 0 · Issues: 0

Awesome-Foundation-Models

A curated list of foundation models for vision and language tasks

Stargazers: 0 · Watchers: 0 · Issues: 0

Awesome-Long-Context

A curated list of resources on long context in large language models and video understanding.

Stargazers: 0 · Watchers: 0 · Issues: 0

Awesome-MLLM-Hallucination

📖 A curated list of resources dedicated to hallucination in multimodal large language models (MLLMs).

Stargazers: 0 · Watchers: 0 · Issues: 0

Awesome-Multimodal-Large-Language-Models

Latest Papers and Datasets on Multimodal Large Language Models

Stargazers: 0 · Watchers: 0 · Issues: 0

awesome-NeRF

A curated list of awesome neural radiance fields papers

Language: TeX · License: MIT · Stargazers: 0 · Watchers: 0 · Issues: 0

Awesome-Skeleton-based-Action-Recognition

A curated paper list of awesome skeleton-based action recognition.

License: MIT · Stargazers: 0 · Watchers: 0 · Issues: 0

DriveLM

DriveLM: Drive on Language

License: Apache-2.0 · Stargazers: 0 · Watchers: 0 · Issues: 0

ED-Pose

[ICLR 2023] Official implementation of the paper "Explicit Box Detection Unifies End-to-End Multi-Person Pose Estimation"

License: Apache-2.0 · Stargazers: 0 · Watchers: 0 · Issues: 0

ipl-uw.github.io

Website for IPL

Language: HTML · Stargazers: 0 · Watchers: 0 · Issues: 0

LLaMA-Efficient-Tuning

Easy-to-use fine-tuning framework using PEFT (PT+SFT+RLHF with QLoRA) (LLaMA-2, BLOOM, Falcon, Baichuan)

License: Apache-2.0 · Stargazers: 0 · Watchers: 0 · Issues: 0

LLM-Agent-Paper-List

The paper list of the 86-page paper "The Rise and Potential of Large Language Model Based Agents: A Survey" by Zhiheng Xi et al.

Stargazers: 0 · Watchers: 0 · Issues: 0

minisora

The Mini Sora project aims to explore the implementation path and future development direction of Sora.

Stargazers: 0 · Watchers: 0 · Issues: 0

Multi-Modality-Arena

Chatbot Arena meets multi-modality! Multi-Modality Arena allows you to benchmark vision-language models side-by-side while providing images as inputs. Supports MiniGPT-4, LLaMA-Adapter V2, LLaVA, BLIP-2, and many more!

Stargazers: 0 · Watchers: 0 · Issues: 0

OpenScene

3D Occupancy Prediction Benchmark in Autonomous Driving

Language: Python · License: Apache-2.0 · Stargazers: 0 · Watchers: 1 · Issues: 0

rese1f

Config files for my GitHub profile.

Stargazers: 0 · Watchers: 0 · Issues: 0