Chaoyou Fu (BradyFU)

Company: Nanjing University

Location: Shanghai

Home Page: https://bradyfu.github.io/


Organizations
VITA-MLLM

Chaoyou Fu's starred repositories

EAGLE

EAGLE: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders

Language: Python | License: Apache-2.0 | Stargazers: 507 | Issues: 0

MME-RealWorld

✨✨ MME-RealWorld: Could Your Multimodal LLM Challenge High-Resolution Real-World Scenarios that are Difficult for Humans?

Language: Python | Stargazers: 68 | Issues: 0

VITA

✨✨VITA: Towards Open-Source Interactive Omni Multimodal LLM

Language: Python | License: NOASSERTION | Stargazers: 804 | Issues: 0

RWKU

RWKU: Benchmarking Real-World Knowledge Unlearning for Large Language Models. NeurIPS 2024

Language: Python | Stargazers: 52 | Issues: 0

SliME

✨✨Beyond LLaVA-HD: Diving into High-Resolution Large Multimodal Models

Language: Python | License: Apache-2.0 | Stargazers: 132 | Issues: 0

Awesome-Open-Vocabulary-Detection-and-Segmentation

Awesome OVD-OVS - A Survey on Open-Vocabulary Detection and Segmentation: Past, Present, and Future

Stargazers: 97 | Issues: 0

Libra

Simple PyTorch implementation of "Libra: Building Decoupled Vision System on Large Language Models" (accepted by ICML 2024)

Language: Python | License: Apache-2.0 | Stargazers: 41 | Issues: 0

Video-MME

✨✨Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis

Stargazers: 374 | Issues: 0

Language: Python | Stargazers: 31 | Issues: 0

Language: Python | License: Apache-2.0 | Stargazers: 100 | Issues: 0

Language: HTML | Stargazers: 64 | Issues: 0

MGM

Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"

Language: Python | License: Apache-2.0 | Stargazers: 3189 | Issues: 0

VMamba

VMamba: Visual State Space Models; the code is based on Mamba

Language: Python | License: MIT | Stargazers: 2067 | Issues: 0

APE

[CVPR 2024] Aligning and Prompting Everything All at Once for Universal Visual Perception

Language: Python | License: Apache-2.0 | Stargazers: 478 | Issues: 0

LLaMA-VID

LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models (ECCV 2024)

Language: Python | License: Apache-2.0 | Stargazers: 693 | Issues: 0

4DGaussians

[CVPR 2024] 4D Gaussian Splatting for Real-Time Dynamic Scene Rendering

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 2113 | Issues: 0

GaussianDreamer

GaussianDreamer: Fast Generation from Text to 3D Gaussians by Bridging 2D and 3D Diffusion Models (CVPR 2024)

Language: Python | License: Apache-2.0 | Stargazers: 650 | Issues: 0

Lion

Lion: Kindling Vision Intelligence within Large Language Models

Stargazers: 52 | Issues: 0

Stargazers: 33 | Issues: 0

CNeRF

PyTorch implementation of the AAAI 2023 Oral paper "Semantic 3D-aware Portrait Synthesis and Manipulation Based on Compositional Neural Radiance Field"

Language: Python | License: NOASSERTION | Stargazers: 39 | Issues: 0

MUST-GAN

PyTorch implementation of the CVPR 2021 paper "MUST-GAN: Multi-level Statistics Transfer for Self-driven Person Image Generation"

Language: Python | Stargazers: 75 | Issues: 0

MUTR

[AAAI 2024] Referred by Multi-Modality: A Unified Temporal Transformer for Video Object Segmentation

Language: Python | License: MIT | Stargazers: 63 | Issues: 0

PanoVOS

[ECCV 2024] PanoVOS: Bridging Non-panoramic and Panoramic Views with Transformer for Video Segmentation

License: BSD-3-Clause | Stargazers: 17 | Issues: 0

Woodpecker

✨✨Woodpecker: Hallucination Correction for Multimodal Large Language Models. The first work to correct hallucinations in MLLMs.

Language: Python | Stargazers: 601 | Issues: 0

LongLoRA

Code and documents of LongLoRA and LongAlpaca (ICLR 2024 Oral)

Language: Python | License: Apache-2.0 | Stargazers: 2608 | Issues: 0

MQ-Det

Official PyTorch implementation of "Multi-modal Queried Object Detection in the Wild" (accepted by NeurIPS 2023)

Language: Python | License: Apache-2.0 | Stargazers: 258 | Issues: 0

vision-process-webui

💡💡💡 Awesome computer vision app in Gradio

Language: Python | License: Apache-2.0 | Stargazers: 41 | Issues: 0

Awesome-Multimodal-Large-Language-Models

✨✨ Latest Advances on Multimodal Large Language Models

Stargazers: 11987 | Issues: 0

SeqTR

SeqTR: A Simple yet Universal Network for Visual Grounding

Language: Python | Stargazers: 128 | Issues: 0

TiNeuVox

TiNeuVox: Fast Dynamic Radiance Fields with Time-Aware Neural Voxels (SIGGRAPH Asia 2022)

Language: Python | License: Apache-2.0 | Stargazers: 323 | Issues: 0