yangyuke001

Zhejiang University

Hangzhou

杨宇克's starred repositories

3D-VLA

[ICML 2024] 3D-VLA: A 3D Vision-Language-Action Generative World Model

Language: Python | 25000

SegmentAnything3D

[ICCV'23 Workshop] SAM3D: Segment Anything in 3D Scenes

Language: Python | License: MIT | 90900

Awesome-Embodied-Agent-with-LLMs

This is a curated list of "Embodied AI or robot with Large Language Models" research. Watch this repository for the latest updates! 🔥

RT-2

Democratization of RT-2 "RT-2: New model translates vision and language into action"

Language: Python | License: MIT | 31800

embodied-agents

Seamlessly integrate state-of-the-art transformer models into robotics stacks

Language: Python | License: Apache-2.0 | 13400

robotic-transformer-pytorch

Implementation of RT1 (Robotic Transformer) in Pytorch

Language: Python | License: MIT | 36100

habitat-sim

A flexible, high-performance 3D simulator for Embodied AI research.

Language: C++ | License: MIT | 248100

Awesome-Embodied-AI

License: Apache-2.0 | 20400

ChatGLM2-6B

ChatGLM2-6B: An Open Bilingual Chat LLM | Open-source bilingual dialogue language model

Language: Python | License: NOASSERTION | 1564900

MiniCPM-V

MiniCPM-Llama3-V 2.5: A GPT-4V Level Multimodal LLM on Your Phone

Language: Python | License: Apache-2.0 | 805900

Qwen2

Qwen2 is the large language model series developed by Qwen team, Alibaba Cloud.

Language: Shell | 649300

MedicalGPT

MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training Pipeline. Trains medical large language models, implementing incremental pretraining (PT), supervised fine-tuning (SFT), RLHF, DPO, and ORPO.

Language: Python | License: Apache-2.0 | 301700

dino-tracker

Official Pytorch Implementation for “DINO-Tracker: Taming DINO for Self-Supervised Point Tracking in a Single Video”

Language: Python | License: MIT | 31600

narrator

David Attenborough narrates your life

Language: Python | 433000

FlagEmbedding

Retrieval and Retrieval-augmented LLMs

Language: Python | License: MIT | 618500

YOLO-World

[CVPR 2024] Real-Time Open-Vocabulary Object Detection

Language: Python | License: GPL-3.0 | 404700

Depth-Anything

[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation

Language: Python | License: Apache-2.0 | 652600

modelscope-agent

ModelScope-Agent: An agent framework connecting models in ModelScope with the world

Language: Python | License: Apache-2.0 | 229600

h2o-llmstudio

H2O LLM Studio - a framework and no-code GUI for fine-tuning LLMs. Documentation: https://h2oai.github.io/h2o-llmstudio/

Language: Python | License: Apache-2.0 | 379300

LLaMA-Factory

A WebUI for Efficient Fine-Tuning of 100+ LLMs (ACL 2024)

Language: Python | License: Apache-2.0 | 2769000

Grounding-DINO-1.5-API

API for Grounding DINO 1.5: IDEA Research's Most Capable Open-World Object Detection Model Series

Language: Python | License: Apache-2.0 | 64200

SEED-X

Multimodal Models in Real World

Language: Jupyter Notebook | License: NOASSERTION | 34300

llama3

The official Meta Llama 3 GitHub site

Language: Python | License: NOASSERTION | 2410300

SD-inference

Stable Diffusion inference

Language: Python | License: MIT | 18400

lmdeploy

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Language: Python | License: Apache-2.0 | 351100

Monkey

[CVPR 2024 Highlight] Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models

Language: Python | License: MIT | 159300

Awesome-ChatGPT

A collection of ChatGPT learning resources, continuously updated.

ProPainter

[ICCV 2023] ProPainter: Improving Propagation and Transformer for Video Inpainting

Language: Python | License: NOASSERTION | 516600

IOPaint

Image inpainting tool powered by SOTA AI models. Remove any unwanted objects, defects, or people from your pictures, or erase and replace (powered by Stable Diffusion) anything in your pictures.

Language: Python | License: Apache-2.0 | 1830000

AnyText

Official implementation of the paper "AnyText: Multilingual Visual Text Generation And Editing"

Language: Python | License: Apache-2.0 | 406700