Xubing Ye's starred repositories
generative-models
Generative Models by Stability AI
Awesome-Multimodal-Large-Language-Models
:sparkles::sparkles:Latest Advances on Multimodal Large Language Models
mistral-src
Reference implementation of the Mistral AI 7B v0.1 model.
awesome-RLHF
A curated list of reinforcement learning with human feedback resources (continually updated)
Awesome-Text-to-Image
(ෆ`꒳´ෆ) A Survey on Text-to-Image Generation/Synthesis.
direct-preference-optimization
Reference implementation for DPO (Direct Preference Optimization)
RPG-DiffusionMaster
[ICML 2024] Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (RPG)
Video-Swin-Transformer
This is an official implementation for "Video Swin Transformer".
Awesome-Multimodal-Research
A curated list of multimodal-related research.
ImageReward
[NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences for Text-to-image Generation
llm-hallucination-survey
Reading list of hallucination in LLMs. Check out our new survey paper: "Siren’s Song in the AI Ocean: A Survey on Hallucination in Large Language Models"
ReferFormer
[CVPR 2022] Official Implementation of ReferFormer
deep-learning-dynamics-paper-list
This is a list of peer-reviewed, representative papers on deep learning dynamics (the optimization dynamics of neural networks). The success of deep learning is attributable to both network architecture and stochastic optimization, so deep learning dynamics play an essential role in the theoretical foundations of deep learning.
tongji-undergrad-thesis
:page_facing_up: Tongji University Undergraduate Thesis Template | Overleaf / Mac / Linux / Windows / Workshop / Docker
segment-caption-anything
[CVPR 24] This repository provides code for running inference and training for "Segment and Caption Anything" (SCA), links for downloading the trained model checkpoints, and example notebooks / a Gradio demo showing how to use the model.
Awesome-Video-Object-Segmentation
:bookmark: Curated list of video object segmentation (VOS) papers, datasets, and projects.