Wenguan Wang's repositories
VOS_Correspondence
Official code for the CVPR 2023 paper "Boosting Video Object Segmentation via Space-time Correspondence Learning"
Stereoscopic-thumbnail-creation-via-efficient-stereo-saliency-detection
Stereoscopic Thumbnail Creation via Efficient Stereo Saliency Detection (TVCG16)
SupertrajectorySeg
Semi-Supervised Video Object Segmentation with Super-Trajectories (ICCV2017, PAMI2018)
ContrastiveSeg
Exploring Cross-Image Pixel Contrast for Semantic Segmentation
visdial-gnn
(CVPR19Oral) Reasoning Visual Dialogs with Structural and Partial Observations
Active_VLN
Repository for the ECCV 2020 paper `Active Visual Information Gathering for Vision-Language Navigation`
Segment-and-Track-Anything
An open-source project dedicated to tracking and segmenting any object in videos, either automatically or interactively. The primary algorithms used are the Segment Anything Model (SAM) for key-frame segmentation and Associating Objects with Transformers (AOT) for efficient tracking and propagation.
AGNN
Zero-shot Video Object Segmentation via Attentive Graph Neural Networks (ICCV2019 Oral)
C-HOI
Cascaded Human-Object Interaction Recognition (CVPR2020)
CompositionalHumanParsing
(ICCV2019) Learning Compositional Neural Information Fusion for Human Parsing
DNC
Official PyTorch implementation of 'Visual Recognition with Deep Nearest Centroids'. (ICLR2023 Spotlight)
DoraemonGPT
Official repository of DoraemonGPT: Toward Understanding Dynamic Scenes with Large Language Models
ETPNav
Official Implementation of "ETPNav: Evolving Topological Planning for Vision-Language Navigation in Continuous Environments"
GMMSeg
[NeurIPS 2022 Spotlight] GMMSeg: Gaussian Mixture based Generative Semantic Segmentation Models
GraphMemVOS
Code for ECCV 2020 paper: Video Object Segmentation with Episodic Graph Memory Networks
LANA-VLN
Repository of our CVPR2023 paper "Lana: A Language-Capable Navigator for Instruction Following and Generation"
ProtoSeg
CVPR2022 (Oral) - Rethinking Semantic Segmentation: A Prototype View
SSM-VLN
Code and Data for our CVPR 2021 paper "Structured Scene Memory for Vision-Language Navigation"
TD-STP
Code for MM 22 "Target-Driven Structured Transformer Planner for Vision-Language Navigation"
VAR
[CVPR 2022] Visual Abductive Reasoning