Lily's repositories
actionformer_release
Code release for ActionFormer (ECCV 2022)
ATM
Action Temporality Modeling for Video Question Answering
ChineseNMT
ChineseNMT: Translate English to Chinese with PyTorch Implementation of Transformer
CoVGT
Contrastive Video Question Answering via Video Graph Transformer (IEEE T-PAMI'23)
CRIPP-VQA
CRIPP-VQA Benchmark -- EMNLP, 2022
dest
The official implementation of Learning Fine-Grained Visual Understanding for Video Question Answering via Decoupling Spatial-Temporal Modeling (BMVC 2022 Spotlight).
DJ-RN
Part of the HAKE project (HAKE-3D). Code for the CVPR 2020 paper "Detailed 2D-3D Joint Representation for Human-Object Interaction".
DL-Demos
Demos for deep learning
evals
Evals is a framework for evaluating OpenAI models and an open-source registry of benchmarks.
fairseq
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
G-VUE
General-purpose Vision Understanding Evaluation
Glance-Focus
Source code for Glance and Focus: Memory Prompting for Multi-Event Video Question Answering (accepted at NeurIPS 2023)
hetsgg-torch
Hetsgg
MARS
MARS: Motion-Augmented RGB Stream for Action Recognition
ML-MWN
Official code for Multi-Label Meta Weighting for Long-Tailed Dynamic Scene Graph Generation (ICMR 2023).
mmf
A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)
ncNet
ncNet, a Transformer-based model for supporting NL2VIS.
OpenPSG
Benchmarking Panoptic Scene Graph Generation (PSG), ECCV'22
python-causality-handbook
Causal Inference for the Brave and True. A light-hearted yet rigorous approach to learning about impact estimation and causality.
pytracking
Visual tracking library based on PyTorch.
question-decomposition-to-sql
Weakly Supervised Text-to-SQL Parsing through Question Decomposition
RelateAnything
Relate Anything Model is capable of taking an image as input and utilizing SAM to identify the corresponding mask within the image.
RelTR
RelTR: Relation Transformer for Scene Graph Generation: https://arxiv.org/abs/2201.11460v2
SGDiff
Official implementation for "Diffusion-Based Scene Graph to Image Generation with Masked Contrastive Pre-Training" https://arxiv.org/abs/2211.11138
SIMPAC-2023-146--
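A Chinese text analysis package for the environmental domain, built on a purely neural-network architecture. It supports models such as EnvBert, LSTM, RNN, and word2vec, as well as custom models. Downstream tasks include classification, regression, multiple choice, sentiment analysis, and named entity recognition. Special topics include climate change text analysis and environmental knowledge graphs. The interfaces are optimized for domain research, so models can be used with one click.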
svitt
Code for CVPR 2023 paper "SViTT: Temporal Learning of Sparse Video-Text Transformers"
visual-chatgpt
Official repo for the paper: Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models