Cherryjingyao's starred repositories

visualnav-transformer

Official code and checkpoint release for mobile robot foundation models: GNM, ViNT, and NoMaD.

Language: Python | License: MIT | Stargazers: 514 | Issues: 0

SuperPrompt

SuperPrompt is an attempt to engineer prompts that might help us understand AI agents.

Stargazers: 3812 | Issues: 0

Qwen-Agent

Agent framework and applications built upon Qwen2, featuring Function Calling, Code Interpreter, RAG, and Chrome extension.

Language: Python | License: NOASSERTION | Stargazers: 3065 | Issues: 0

Qwen2-VL

Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Language: Python | License: Apache-2.0 | Stargazers: 1741 | Issues: 0

JARVIS-1

JARVIS-1: Open-world Multi-task Agents with Memory-Augmented Multimodal Language Models

Language: Java | Stargazers: 329 | Issues: 0

Agent-Smith

[ICML2024] Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast

Language: Python | License: MIT | Stargazers: 76 | Issues: 0

visualwebarena

VisualWebArena is a benchmark for multimodal agents.

Language: Python | License: MIT | Stargazers: 211 | Issues: 0

Multi-Agent-GPT

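Multi-Agent-GPT: a multimodal expert assistant GPT built on RAG and agents. It integrates tools for modalities such as text, image, and audio, and supports local deployment and building private databases.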

Language: Python | License: MIT | Stargazers: 216 | Issues: 0

Chatglm_lora_multi-gpu

ChatGLM on multiple GPUs using deepspeed and …

Language: Python | Stargazers: 395 | Issues: 0

alfworld

ALFWorld: Aligning Text and Embodied Environments for Interactive Learning

Language: Python | License: MIT | Stargazers: 328 | Issues: 0

JARVIS

JARVIS, a system to connect LLMs with ML community. Paper: https://arxiv.org/pdf/2303.17580.pdf

Language: Python | License: MIT | Stargazers: 23531 | Issues: 0

Language: Python | License: NOASSERTION | Stargazers: 34512 | Issues: 0

FireAct

FireAct: Toward Language Agent Fine-tuning

Language: Python | License: MIT | Stargazers: 241 | Issues: 0

MiniCPM-V

MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone

Language: Python | License: Apache-2.0 | Stargazers: 11775 | Issues: 0

modelscope-agent

ModelScope-Agent: An agent framework connecting models in ModelScope with the world

Language: Python | License: Apache-2.0 | Stargazers: 2598 | Issues: 0

autogen

A programming framework for agentic AI 🤖

Language: Jupyter Notebook | License: CC-BY-4.0 | Stargazers: 30691 | Issues: 0

agentUniverse

agentUniverse is an LLM multi-agent framework that allows developers to easily build multi-agent applications.

Language: Python | License: Apache-2.0 | Stargazers: 741 | Issues: 0

Language: Python | Stargazers: 89 | Issues: 0

mem0

The Memory layer for your AI apps

Language: Python | License: Apache-2.0 | Stargazers: 21548 | Issues: 0

Ask-Anything

[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.

Language: Python | License: MIT | Stargazers: 2973 | Issues: 0

MiniGPT4-video

Official code for Goldfish model for long video understanding and MiniGPT4-video for short video understanding

Language: Python | License: BSD-3-Clause | Stargazers: 534 | Issues: 0

humanplus

[CoRL 2024] HumanPlus: Humanoid Shadowing and Imitation from Humans

Language: Python | Stargazers: 508 | Issues: 0

BLIP

PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation

Language: Jupyter Notebook | License: BSD-3-Clause | Stargazers: 4627 | Issues: 0

lamorel

Lamorel is a Python library designed for RL practitioners eager to use Large Language Models (LLMs).

Language: Python | License: MIT | Stargazers: 186 | Issues: 0

ViLaIn

An official implementation of Vision-Language Interpreter (ViLaIn)

Language: SAS | License: MIT | Stargazers: 23 | Issues: 0

open_flamingo

An open-source framework for training large multimodal models.

Language: Python | License: MIT | Stargazers: 3651 | Issues: 0

VQASynth

Compose multimodal datasets 🎹

Language: Python | Stargazers: 170 | Issues: 0

flamingo-pytorch

Implementation of 🦩 Flamingo, state-of-the-art few-shot visual question answering attention net out of Deepmind, in Pytorch

Language: Python | License: MIT | Stargazers: 1191 | Issues: 0

reflect

[CoRL 2023] REFLECT: Summarizing Robot Experiences for Failure Explanation and Correction

Language: Jupyter Notebook | License: MIT | Stargazers: 70 | Issues: 0