zhangtao's repositories

e2ec

E2EC: An End-to-End Contour-based Method for High-Quality High-Speed Instance Segmentation

Language: Python | License: NOASSERTION | Stargazers: 203 | Issues: 9 | Issues: 45

DVIS

DVIS: Decoupled Video Instance Segmentation Framework

Language: Python | License: MIT | Stargazers: 118 | Issues: 4 | Issues: 30
Language: Python | License: MIT | Stargazers: 69 | Issues: 2 | Issues: 14

PCM

Point Cloud Mamba: Point Cloud Learning via State Space Model

Language: Python | License: MIT | Stargazers: 2 | Issues: 1 | Issues: 1
Language: Python | License: NOASSERTION | Stargazers: 1 | Issues: 0 | Issues: 0

bubogpt

BuboGPT: Enabling Visual Grounding in Multi-Modal LLMs

Language: Python | License: BSD-3-Clause | Stargazers: 0 | Issues: 0 | Issues: 0
Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0
Stargazers: 0 | Issues: 1 | Issues: 0
Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

dinov2

PyTorch code and models for the DINOv2 self-supervised learning method.

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

DragGAN

Code for DragGAN (SIGGRAPH 2023)

Stargazers: 0 | Issues: 0 | Issues: 0
Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 1 | Issues: 0

fc-clip

This repo contains the code for our paper Convolutions Die Hard: Open-Vocabulary Panoptic Segmentation with Single Frozen Convolutional CLIP

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

HIPIE

Code release for "Hierarchical Open-vocabulary Universal Image Segmentation"

Language: Jupyter Notebook | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

InternLM

InternLM has open-sourced 7 and 20 billion parameter base models and chat models.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

LLaVA

Visual Instruction Tuning: Large Language-and-Vision Assistant built towards multimodal GPT-4 level capabilities.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

Mask2Former

Code release for "Masked-attention Mask Transformer for Universal Image Segmentation"

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

OmniScient-Model

This repo contains the code for our paper Towards Open-Ended Visual Recognition with Large Language Model

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

OpenSeeD

A Simple Framework for Open-Vocabulary Segmentation and Detection

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

Osprey

The code for "Osprey: Pixel Understanding with Visual Instruction Tuning"

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0
Stargazers: 0 | Issues: 1 | Issues: 0
Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0
Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0
Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

Semantic-SAM

Official implementation of the paper "Semantic-SAM: Segment and Recognize Anything at Any Granularity"

Stargazers: 0 | Issues: 0 | Issues: 0

subobjects

Official repository of paper "Subobject-level Image Tokenization"

Stargazers: 0 | Issues: 0 | Issues: 0
Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

VAR

[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction"

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

zhang-tao-whu.github.io

AcadHomepage: A Modern and Responsive Academic Personal Homepage

Language: JavaScript | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0