Yui010206

UNC, Chapel Hill

Chapel Hill

https://yui010206.github.io/

Shoubin's repositories

SeViLA

[NeurIPS 2023] Self-Chained Image-Language Model for Video Localization and Question Answering

Language: Python | License: BSD-3-Clause | 165 3 24

CREMA

☕️ CREMA: Multimodal Compositional Video Reasoning via Efficient Modular Adaptation and Fusion

Language: Python | License: BSD-3-Clause | 19 20

MoPRL

[TCSVT] Regularity Learning via Explicit Distribution Modeling for Skeletal Video Anomaly Detection

Language: Python | 10 10

AIART_Website

An image style translation website

020

AlphaPose

Real-Time and Accurate Full-Body Multi-Person Pose Estimation & Tracking System

Language: Python | License: NOASSERTION | 010

arunmallya.github.io

my public website

Language: JavaScript | 010

awesome-anomaly-detection

A curated list of awesome anomaly detection resources

010

awesome-vln

A curated list of research papers in Vision-Language Navigation (VLN)

License: MIT | 010

detectron2

Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.

Language: Python | License: Apache-2.0 | 010

grid-feats-vqa

Grid features pre-training code for visual question answering

Language: Python | License: Apache-2.0 | 000

HOI-Learning-List

A list of Human-Object Interaction Learning.

010

just-ask

[TPAMI Special Issue on ICCV 2021 Best Papers, Oral] Just Ask: Learning to Answer Questions from Millions of Narrated Videos

Language: Jupyter Notebook | License: Apache-2.0 | 010

LAVIS

LAVIS - A One-stop Library for Language-Vision Intelligence

Language: Python | License: BSD-3-Clause | 000

MAC

020

magenta

Magenta: Music and Art Generation with Machine Intelligence

Language: Python | License: Apache-2.0 | 010

merlot_reserve

Code release for "MERLOT Reserve: Neural Script Knowledge through Vision and Language and Sound"

Language: Python | License: MIT | 010

mmf

A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)

Language: Python | License: NOASSERTION | 010

n2nmn

Code release for Hu et al., "Learning to Reason: End-to-End Module Networks for Visual Question Answering," in ICCV, 2017

Language: SourcePawn | License: BSD-2-Clause | 010

Person-Search-with-Natural-Language-Description

Person Search with Natural Language Description

Language: Lua | 010

Research

novel deep learning research works with PaddlePaddle

Language: Python | License: Apache-2.0 | 010

Scene-Graph-Benchmark.pytorch

A new codebase for popular Scene Graph Generation methods (2020). Visualization & Scene Graph Extraction on custom images/datasets are provided. It's also a PyTorch implementation of paper “Unbiased Scene Graph Generation from Biased Training CVPR 2020”

Language: Jupyter Notebook | License: NOASSERTION | 010

seg2vid

Video Generation from Single Semantic Label Map

Language: Python | 010

SJTUThesis

Shanghai Jiao Tong University XeLaTeX Thesis Template

Language: TeX | License: Apache-2.0 | 010

SlowFast

PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.

Language: Python | License: Apache-2.0 | 010

transformer-time-series-prediction

proof of concept for a transformer-based time series prediction model

Language: Python | License: MIT | 010

VGT

Video Graph Transformer for Video Question Answering (ECCV'22)

Language: Python | License: Apache-2.0 | 000

video-swin-transformer-pytorch

Video Swin Transformer - PyTorch

Language: Python | License: MIT | 010

video_feature_extractor

Easy to use video deep features extractor

Language: Python | License: Apache-2.0 | 010

ViLT

Code for the ICML 2021 (long talk) paper: "ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision"

Language: Python | License: Apache-2.0 | 010

Yui010206.github.io

Language: SCSS | License: MIT | 000