LMMs-Lab's repositories
open-r1-multimodal
A fork that adds multimodal model training to open-r1
LLaVA-OneVision-1.5
Fully Open Framework for Democratized Multimodal Training
lmms-engine
A simple, unified multimodal models training engine. Lean, flexible, and built for hacking at scale.
RelateAnything
Relate Anything Model takes an image as input and uses SAM to identify the corresponding mask within the image.
multimodal-search-r1
MMSearch-R1 is an end-to-end RL framework that enables LMMs to perform on-demand, multi-turn search with real-world multimodal search tools.
multimodal-sae
[ICCV 2025] Auto Interpretation Pipeline and many other functionalities for Multimodal SAE Analysis.
lean-runner
Deploying High-Performance Lean 4 Server in One Click
VLMEvalKit
An open-source toolkit for evaluating MLLMs on Spatial Intelligence using the EASI protocol
openevolve
Open-source implementation of AlphaEvolve
DeepseekLeanPlayground
The math library of Lean 4
DiffSynth-Studio
Enjoy the magic of Diffusion models!