LMMs-Lab (EvolvingLMMs-Lab)

Organization data from GitHub: https://github.com/EvolvingLMMs-Lab

Feeling and building multimodal intelligence.

Location: Singapore

GitHub: @EvolvingLMMs-Lab

Twitter: @lmmslab

LMMs-Lab's repositories

Otter

🦦 Otter, a multimodal model based on OpenFlamingo (an open-source version of DeepMind's Flamingo), trained on MIMIC-IT and demonstrating improved instruction-following and in-context learning abilities.

Language: Python · License: MIT · Stargazers: 3276 · Issues: 79 · Issues: 165

lmms-eval

One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks

Language: Python · License: NOASSERTION · Stargazers: 3263 · Issues: 6 · Issues: 378

open-r1-multimodal

A fork to add multimodal model training to open-r1

Language: Python · License: Apache-2.0 · Stargazers: 1416 · Issues: 13 · Issues: 28

LLaVA-OneVision-1.5

Fully Open Framework for Democratized Multimodal Training

Language: Python · License: Apache-2.0 · Stargazers: 605 · Issues: 0 · Issues: 0

lmms-engine

A simple, unified training engine for multimodal models. Lean, flexible, and built for hacking at scale.

Language: Python · Stargazers: 474 · Issues: 0 · Issues: 0

RelateAnything

The Relate Anything Model takes an image as input and uses SAM to identify the corresponding masks within the image.

Language: Python · License: Apache-2.0 · Stargazers: 455 · Issues: 9 · Issues: 12

LongVA

Long Context Transfer from Language to Vision

Language: Python · License: NOASSERTION · Stargazers: 396 · Issues: 7 · Issues: 37

multimodal-search-r1

MMSearch-R1 is an end-to-end RL framework that enables LMMs to perform on-demand, multi-turn search with real-world multimodal search tools.

Language: Python · License: Apache-2.0 · Stargazers: 347 · Issues: 0 · Issues: 0

EgoLife

[CVPR 2025] EgoLife: Towards Egocentric Life Assistant

Language: Python · License: NOASSERTION · Stargazers: 343 · Issues: 7 · Issues: 12

NEO

NEO Series: Native Vision-Language Models from First Principles

Language: Python · License: Apache-2.0 · Stargazers: 221 · Issues: 0 · Issues: 0

multimodal-sae

[ICCV 2025] Auto Interpretation Pipeline and many other functionalities for Multimodal SAE Analysis.

Language: Python · License: NOASSERTION · Stargazers: 159 · Issues: 1 · Issues: 4

MGPO

High-Resolution Visual Reasoning via Multi-Turn Grounding-Based Reinforcement Learning

Stargazers: 51 · Issues: 0 · Issues: 0

sae

A framework for applying sparse autoencoders (SAEs) to any model.

Language: Python · Stargazers: 41 · Issues: 0 · Issues: 0
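
For readers unfamiliar with the technique this repository implements, a sparse autoencoder maps a model's activations into an overcomplete latent space with a sparsity penalty, so that individual latent features become more interpretable. The sketch below is a minimal, generic illustration of the encode/decode/loss structure in pure Python; it is not the repository's actual API, and all function and variable names here are hypothetical.

```python
# Minimal sparse-autoencoder sketch (generic illustration, not this repo's API).
# encode: z = relu(W_enc @ x + b_enc); decode: x_hat = W_dec @ z + b_dec,
# trained to minimize ||x - x_hat||^2 + l1_coeff * ||z||_1.

def relu(v):
    return [max(0.0, x) for x in v]

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def encode(W_enc, b_enc, x):
    """Map an activation vector to a (hopefully sparse) latent code."""
    pre = [p + b for p, b in zip(matvec(W_enc, x), b_enc)]
    return relu(pre)

def decode(W_dec, b_dec, z):
    """Reconstruct the activation from the latent code."""
    return [r + b for r, b in zip(matvec(W_dec, z), b_dec)]

def sae_loss(x, x_hat, z, l1_coeff=0.01):
    """Reconstruction error plus L1 sparsity penalty on the code."""
    recon = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    return recon + l1_coeff * sum(abs(v) for v in z)

# Toy example: 2-dim activations, 4 latent features (overcomplete dictionary).
W_enc = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
b_enc = [0.0] * 4
W_dec = [[1.0, 0.0, -1.0, 0.0], [0.0, 1.0, 0.0, -1.0]]
b_dec = [0.0, 0.0]

x = [0.5, -0.25]
z = encode(W_enc, b_enc, x)      # ReLU zeroes half the features: the code is sparse
x_hat = decode(W_dec, b_dec, z)  # perfect reconstruction in this toy setup
print(z, x_hat, sae_loss(x, x_hat, z))
```

In practice W_enc and W_dec are learned by gradient descent over many activation vectors, and the L1 coefficient trades reconstruction fidelity against sparsity.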

lean-runner

Deploy a high-performance Lean 4 server in one click.

Language: Python · License: MIT · Stargazers: 9 · Issues: 0 · Issues: 0

EASI

Holistic Evaluation of Multimodal LLMs on Spatial Intelligence

License: Apache-2.0 · Stargazers: 5 · Issues: 0 · Issues: 0

sglang

SGLang is a structured generation language designed for large language models (LLMs). It makes your interaction with models faster and more controllable.

Language: Python · License: Apache-2.0 · Stargazers: 3 · Issues: 0 · Issues: 0

VLMEvalKit

An open-source evaluation toolkit to evaluate MLLMs on Spatial Intelligence using the EASI protocol

Language: Python · License: Apache-2.0 · Stargazers: 3 · Issues: 0 · Issues: 0

openevolve

Open-source implementation of AlphaEvolve

Language: Python · License: Apache-2.0 · Stargazers: 2 · Issues: 0 · Issues: 0

agent-rl

A fork of verl that supports multi-turn tool use and many more agentic tasks.

License: MIT · Stargazers: 1 · Issues: 0 · Issues: 0

DeepseekLeanPlayground

The math library of Lean 4

License: Apache-2.0 · Stargazers: 0 · Issues: 0 · Issues: 0

DiffSynth-Studio

Enjoy the magic of Diffusion models!

Language: Python · License: Apache-2.0 · Stargazers: 0 · Issues: 0 · Issues: 0