nahidalam's repositories

maya

Maya: An Instruction Finetuned Multilingual Multimodal Model using Aya

Language: Python · License: Apache-2.0 · Stars: 107 · Issues: 4

LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Language: Python · License: Apache-2.0 · Stars: 6 · Issues: 1

anything-llm

The all-in-one Desktop & Docker AI application with built-in RAG, AI agents, and more.

License: MIT · Stars: 0 · Issues: 0

apple_pie

Robot foundation models

Stars: 0 · Issues: 0

Cosmos

Cosmos is a world model development platform consisting of world foundation models, tokenizers, and a video processing pipeline, built to accelerate Physical AI development at robotics and AV labs. Purpose-built for Physical AI, the Cosmos repository enables end users to run the Cosmos models, run inference scripts, and generate videos.

License: Apache-2.0 · Stars: 0 · Issues: 0

HunyuanVideo

HunyuanVideo: A Systematic Framework For Large Video Generation Model

License: NOASSERTION · Stars: 0 · Issues: 0

imm

Official implementation of Inductive Moment Matching

License: NOASSERTION · Stars: 0 · Issues: 0

Isaac-GR00T

NVIDIA Isaac GR00T N1 is the world's first open foundation model for generalized humanoid robot reasoning and skills.

Language: Jupyter Notebook · License: Apache-2.0 · Stars: 0 · Issues: 0

kokoro

https://hf.co/hexgrad/Kokoro-82M

License: Apache-2.0 · Stars: 0 · Issues: 0

lerobot

🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning

License: Apache-2.0 · Stars: 0 · Issues: 0

lmms-eval

Accelerating the development of large multimodal models (LMMs) with lmms-eval

Language: Python · License: NOASSERTION · Stars: 0 · Issues: 0

mllms_know

[ICLR'25] Official code for the paper "MLLMs Know Where to Look: Training-free Perception of Small Visual Details with Multimodal LLMs"

Stars: 0 · Issues: 0

ModernBERT

Bringing BERT into modernity via both architecture changes and scaling

License: Apache-2.0 · Stars: 0 · Issues: 0

open-r1

Fully open reproduction of DeepSeek-R1

License: Apache-2.0 · Stars: 0 · Issues: 0

Open-Sora

Open-Sora: Democratizing Efficient Video Production for All

Language: Python · License: Apache-2.0 · Stars: 0 · Issues: 0

open_clip

An open-source implementation of CLIP.

Language: Jupyter Notebook · License: NOASSERTION · Stars: 0 · Issues: 0

smolagents

🤗 smolagents: a barebones library for agents. Agents write Python code to call tools and orchestrate other agents.

License: Apache-2.0 · Stars: 0 · Issues: 0

smollm

Everything about the SmolLM2 and SmolVLM family of models

License: Apache-2.0 · Stars: 0 · Issues: 0

tidybot2

TidyBot++: An Open-Source Holonomic Mobile Manipulator for Robot Learning

License: MIT · Stars: 0 · Issues: 0

video-generation-survey

A reading list on video generation

Stars: 0 · Issues: 0

VLMEvalKit

Open-source evaluation toolkit for large vision-language models (LVLMs), supporting 160+ VLMs and 50+ benchmarks

Language: Python · License: Apache-2.0 · Stars: 0 · Issues: 0