Zhao Jiahe (ZJHTerry18)

Zhao Jiahe's starred repositories

MiniGPT-4

Open-source code for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)

Language: Python | License: BSD-3-Clause | Stargazers: 25157 | Issues: 220 | Issues: 452

LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Language: Python | License: Apache-2.0 | Stargazers: 18196 | Issues: 158 | Issues: 1400

LAVIS

LAVIS - A One-stop Library for Language-Vision Intelligence

Language: Jupyter Notebook | License: BSD-3-Clause | Stargazers: 9244 | Issues: 96 | Issues: 627

LLaMA-Adapter

[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters

Language: Python | License: GPL-3.0 | Stargazers: 5618 | Issues: 78 | Issues: 141

mmtracking

OpenMMLab Video Perception Toolbox. It supports Video Object Detection (VID), Multiple Object Tracking (MOT), Single Object Tracking (SOT), Video Instance Segmentation (VIS) with a unified framework.

Language: Python | License: Apache-2.0 | Stargazers: 3448 | Issues: 47 | Issues: 459

LLaMA2-Accessory

An Open-source Toolkit for LLM Development

Language: Python | License: NOASSERTION | Stargazers: 2622 | Issues: 36 | Issues: 133

computer-vision-in-action

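A closed-loop learning platform for computer vision where code can be run interactively online. It includes the Chinese e-book "Computer Vision in Action: Algorithms and Applications", its source code, and a reader community (continuously updated ...) 📘 Online e-book: https://charmve.github.io/computer-vision-in-action/ 👇 Project homepage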

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 2467 | Issues: 36 | Issues: 75

InternLM-XComposer

InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output

GLIP

Grounded Language-Image Pre-training

Language: Python | License: MIT | Stargazers: 2081 | Issues: 45 | Issues: 168

LISA

Project Page for "LISA: Reasoning Segmentation via Large Language Model"

Language: Python | License: Apache-2.0 | Stargazers: 1650 | Issues: 11 | Issues: 127

awesome-openai-vision-api-experiments

Must-have resource for anyone who wants to experiment with and build on the OpenAI vision API 🔥

open-images-dataset

Open Images is a dataset of ~9 million images that have been annotated with image-level labels and bounding boxes spanning thousands of classes.

VisionLLM

VisionLLM Series

Language: Python | License: Apache-2.0 | Stargazers: 745 | Issues: 38 | Issues: 11

GPT4RoI

GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest

Language: Python | License: NOASSERTION | Stargazers: 479 | Issues: 8 | Issues: 44

fromage

🧀 Code and models for the ICML 2023 paper "Grounding Language Models to Images for Multimodal Inputs and Outputs".

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 467 | Issues: 12 | Issues: 37

Language: Python | License: BSD-3-Clause | Stargazers: 337 | Issues: 18 | Issues: 16

MIC

MMICL, a state-of-the-art VLM with in-context learning ability, from ICL, PKU

PCT

This is an official implementation of our CVPR 2023 paper "Human Pose as Compositional Tokens" (https://arxiv.org/pdf/2303.11638.pdf)

Language: Python | License: MIT | Stargazers: 293 | Issues: 6 | Issues: 40

LAMM

[NeurIPS 2023 Datasets and Benchmarks Track] LAMM: Multi-Modal Large Language Models and Applications as AI Agents

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 213 | Issues: 13 | Issues: 12

ContextDET

Contextual Object Detection with Multimodal Large Language Models

APTM

The official code of "Towards Unified Text-based Person Retrieval: A Large-scale Multi-Attribute and Language Search Benchmark"

Language: Python | License: MIT | Stargazers: 124 | Issues: 4 | Issues: 24

ISR_ICCV2023_Oral

The code for ICCV2023 Oral paper: Identity-Seeking Self-Supervised Representation Learning for Generalizable Person Re-identification

UAL

The code for ECCV2022 paper: Reliability-Aware Prediction via Uncertainty Learning for Person Image Retrieval

PVIT

Repository of paper: Position-Enhanced Visual Instruction Tuning for Multimodal Large Language Models

Language: Python | License: Apache-2.0 | Stargazers: 26 | Issues: 4 | Issues: 8

pointingqa

Code for paper "Point and Ask: Incorporating Pointing into Visual Question Answering"

Pix2SeqV2-Pytorch

Simple implementation of Pix2seqV2 (multi-task)

Language: Python | Stargazers: 16 | Issues: 0 | Issues: 0