Yuhao Dong (dongyh20)


User data from GitHub: https://github.com/dongyh20

Company: Tsinghua University

GitHub: @dongyh20

Yuhao Dong's repositories

Octopus

[ECCV2024] 🐙Octopus, an embodied vision-language model trained with RLEF that shows superior performance in embodied visual planning and programming.

Insight-V

[CVPR2025 Highlight] Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models

Chain-of-Spot

Chain-of-Spot: Interactive Reasoning Improves Large Vision-language Models

Language: Python | License: Apache-2.0 | Stargazers: 96 | Issues: 5 | Issues: 8

C2P

[CVPR2023] Complete-to-Partial 4D Distillation for Self-Supervised Point Cloud Sequence Representation Learning

Awesome-RL-based-Reasoning-MLLMs

This repository provides a valuable reference for researchers in the field of multimodality. Start your exploration of RL-based reasoning MLLMs here!

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

Awesome_Think_With_Images

Resources and paper list for "Thinking with Images for LVLMs". This repository accompanies our survey on how LVLMs can leverage visual information for complex reasoning, planning, and generation.

Stargazers: 0 | Issues: 0 | Issues: 0

c2p.github.io

Project page for C2P [CVPR2023]

Language: JavaScript | Stargazers: 0 | Issues: 1 | Issues: 0

lmms-eval

Accelerating the development of large multimodal models (LMMs) with lmms-eval

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

VLMEvalKit

Open-source evaluation toolkit for large vision-language models (LVLMs), supporting 160+ VLMs and 50+ benchmarks

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0