uenian33's starred repositories

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python | License: Apache-2.0 | Stargazers: 25391 | Issues: 219 | Issues: 4105

CogVLM

a state-of-the-art-level open visual language model | multimodal pretrained model

Language: Python | License: Apache-2.0 | Stargazers: 5800 | Issues: 65 | Issues: 415

moondream

tiny vision language model

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 4795 | Issues: 51 | Issues: 111

koboldcpp

Run GGUF models easily with a KoboldAI UI. One File. Zero Install.

Language: C++ | License: AGPL-3.0 | Stargazers: 4686 | Issues: 67 | Issues: 713

Qwen-VL

The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.

Language: Python | License: NOASSERTION | Stargazers: 4608 | Issues: 50 | Issues: 421

MGM

Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"

Language: Python | License: Apache-2.0 | Stargazers: 3147 | Issues: 26 | Issues: 129

FoundationPose

[CVPR 2024 Highlight] FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects

Language: Python | License: NOASSERTION | Stargazers: 1275 | Issues: 31 | Issues: 201

InternVideo

[ECCV2024] Video Foundation Models & Data for Multimodal Understanding

Language: Python | License: Apache-2.0 | Stargazers: 1260 | Issues: 29 | Issues: 148

TinyGPT-V

TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones

Language: Python | License: BSD-3-Clause | Stargazers: 1229 | Issues: 19 | Issues: 33

3D-LLM

Code for 3D-LLM: Injecting the 3D World into Large Language Models

Language: Python | License: MIT | Stargazers: 880 | Issues: 16 | Issues: 62

octo

Octo is a transformer-based robot policy trained on a diverse mix of 800k robot trajectories.

Language: Python | License: MIT | Stargazers: 735 | Issues: 19 | Issues: 92

dift

[NeurIPS'23] Emergent Correspondence from Image Diffusion

Language: Python | License: MIT | Stargazers: 573 | Issues: 7 | Issues: 22

dobb-e

Dobb·E: An open-source, general framework for learning household robotic manipulation

Language: G-code | License: MIT | Stargazers: 558 | Issues: 15 | Issues: 7

omniglue

Code release for CVPR'24 submission 'OmniGlue'

Language: Python | License: Apache-2.0 | Stargazers: 500 | Issues: 10 | Issues: 23

tabbyAPI

An OAI compatible exllamav2 API that's both lightweight and fast

Language: Python | License: AGPL-3.0 | Stargazers: 421 | Issues: 9 | Issues: 89

alfworld

ALFWorld: Aligning Text and Embodied Environments for Interactive Learning

Language: Python | License: MIT | Stargazers: 318 | Issues: 8 | Issues: 72

embodied-generalist

[ICML 2024] Official code repository for 3D embodied generalist agent LEO

Language: Python | License: MIT | Stargazers: 308 | Issues: 15 | Issues: 42

Object-Goal-Navigation

PyTorch code for the NeurIPS 2020 paper "Object Goal Navigation using Goal-Oriented Semantic Exploration"

Language: Python | License: MIT | Stargazers: 295 | Issues: 6 | Issues: 32

awesome-temporal-action-segmentation

A curated list of awesome temporal action segmentation resources.

simple-diffusion

A minimal implementation of a denoising diffusion model in PyTorch.

Language: Python | License: MIT | Stargazers: 80 | Issues: 2 | Issues: 1

GeoAware-SC

Official implementation of the paper "Telling Left from Right: Identifying Geometry-Aware Semantic Correspondence"

LDM_correspondences

Unsupervised Semantic Correspondence Using Stable Diffusion

Language: Python | License: Apache-2.0 | Stargazers: 48 | Issues: 3 | Issues: 3

deformable_gym

A collection of RL gymnasium environments for learning to grasp 3D deformable objects.

Language: Python | License: NOASSERTION | Stargazers: 19 | Issues: 9 | Issues: 22
Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 13 | Issues: 8 | Issues: 1

softgym_tfn

Code for CoRL 2022 paper: https://arxiv.org/abs/2211.09006 (simulation environments)

Language: C++ | License: MIT | Stargazers: 10 | Issues: 6 | Issues: 1
Language: Python | License: MIT | Stargazers: 4 | Issues: 1 | Issues: 0

SSSCWEB

This repository contains the official implementation of "Self-supervised Learning of Semantic Correspondence Using Web Videos", which has been accepted to the 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2024).

Language: Python | Stargazers: 1 | Issues: 2 | Issues: 0