Zhixing Sun's repositories

2024-AAAI-HPT

Learning Hierarchical Prompt with Structured Linguistic Knowledge for Vision-Language Models (AAAI 2024)

Language: Python · License: MIT · Stargazers: 0 · Issues: 0

Agent-Attention

Official repository of Agent Attention

Language: Python · Stargazers: 0 · Issues: 0

AttriCLIP

[CVPR 2023] AttriCLIP: A Non-Incremental Learner for Incremental Knowledge Learning

Language: Python · Stargazers: 0 · Issues: 0

BiDistFSCIL

Official implementation of the CVPR 2023 paper "Few-Shot Class-Incremental Learning via Class-Aware Bilateral Distillation".

Language: Python · Stargazers: 0 · Issues: 0

code-samples

Holds code for our CVPR'23 tutorial: All Things ViTs: Understanding and Interpreting Attention in Vision.

Language: Jupyter Notebook · License: Apache-2.0 · Stargazers: 0 · Issues: 0

FGVP

Official code for "Fine-Grained Visual Prompting" (NeurIPS 2023)

Stargazers: 0 · Issues: 0

FLatten-Transformer

Official repository of FLatten Transformer (ICCV2023)

Stargazers: 0 · Issues: 0

GraphRAG-Local-UI

GraphRAG using local LLMs — features a robust API and multiple apps for indexing, prompt tuning, querying, chat, and visualization. Meant to be the ultimate GraphRAG/KG local LLM app.

Language: Python · License: MIT · Stargazers: 0 · Issues: 0

IELT

Source code for the paper "Fine-Grained Visual Classification via Internal Ensemble Learning Transformer"

Language: Python · License: MIT · Stargazers: 0 · Issues: 0

LLaVA-Plus-Codebase

LLaVA-Plus: Large Language and Vision Assistants that Plug and Learn to Use Skills

Language: Python · License: Apache-2.0 · Stargazers: 0 · Issues: 0

MiniGPT-4

Open-source code for MiniGPT-4 and MiniGPT-v2

Language: Python · License: BSD-3-Clause · Stargazers: 0 · Issues: 0

Monkey

[CVPR 2024 Highlight] Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models

License: MIT · Stargazers: 0 · Issues: 0

multimodal-prompt-learning

[CVPR 2023] Official repository of the paper "MaPLe: Multi-modal Prompt Learning".

Language: Python · License: MIT · Stargazers: 0 · Issues: 0

Oryx

MLLM for On-Demand Spatial-Temporal Understanding at Arbitrary Resolution

Stargazers: 0 · Issues: 0

ovsam

[arXiv preprint] Official code for the paper "Open-Vocabulary SAM".

Language: Python · License: NOASSERTION · Stargazers: 0 · Issues: 0

Qwen2-VL

Qwen2-VL is the multimodal large language model series developed by the Qwen team at Alibaba Cloud.

License: Apache-2.0 · Stargazers: 0 · Issues: 0

recognize-anything

Code for the Recognize Anything Model (RAM) and Tag2Text Model

Language: Jupyter Notebook · License: Apache-2.0 · Stargazers: 0 · Issues: 0

RevisitingCIL

PyTorch code repository for "Revisiting Class-Incremental Learning with Pre-Trained Models: Generalizability and Adaptivity are All You Need".

Language: Python · Stargazers: 0 · Issues: 0

SHIP

Official code for the ICCV 2023 paper "Improving Zero-Shot Generalization for CLIP with Synthesized Prompts"

Language: Python · License: MIT · Stargazers: 0 · Issues: 0

some_useful_python_program

Some useful Python programs

Language: Python · Stargazers: 0 · Issues: 0

sunhongbo.github.io

GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes

Language: JavaScript · License: MIT · Stargazers: 0 · Issues: 0

TokenPacker

Code for "TokenPacker: Efficient Visual Projector for Multimodal LLM".

Stargazers: 0 · Issues: 0

Vitron

A unified pixel-level vision LLM for understanding, generating, segmenting, and editing

Stargazers: 0 · Issues: 0