Lily's repositories

actionformer_release

Code release for ActionFormer (ECCV 2022)

Language: Python, License: MIT, Stargazers: 1, Issues: 0

CoVGT

Contrastive Video Question Answering via Video Graph Transformer (IEEE T-PAMI'23)

License: Apache-2.0, Stargazers: 0, Issues: 0

CRIPP-VQA

CRIPP-VQA Benchmark -- EMNLP, 2022

License: MIT, Stargazers: 0, Issues: 0

dest

The official implementation of Learning Fine-Grained Visual Understanding for Video Question Answering via Decoupling Spatial-Temporal Modeling (BMVC 2022 Spotlight).

License: MIT, Stargazers: 0, Issues: 0

DJ-RN

Part of the HAKE project (HAKE-3D). Code for our CVPR 2020 paper "Detailed 2D-3D Joint Representation for Human-Object Interaction".

License: Apache-2.0, Stargazers: 0, Issues: 0

evals

Evals is a framework for evaluating OpenAI models and an open-source registry of benchmarks.

License: MIT, Stargazers: 0, Issues: 0

G-VUE

General-purpose Vision Understanding Evaluation

Stargazers: 0, Issues: 0

gluon-cv

Gluon CV Toolkit

License: Apache-2.0, Stargazers: 0, Issues: 0

Graphormer

Graphormer is a deep learning package that allows researchers and developers to train custom models for molecule modeling tasks. It aims to accelerate the research and application in AI for molecule science, such as material design, drug discovery, etc.

License: MIT, Stargazers: 0, Issues: 0

ML-MWN

Official code for "Multi-Label Meta Weighting for Long-Tailed Dynamic Scene Graph Generation" (ICMR 2023).

License: MIT, Stargazers: 0, Issues: 0

mmf

A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)

License: NOASSERTION, Stargazers: 0, Issues: 0

OpenPSG

Benchmarking Panoptic Scene Graph Generation (PSG), ECCV'22

License: MIT, Stargazers: 0, Issues: 0

paper-reading

Paragraph-by-paragraph close reading of classic and new deep learning papers

License: Apache-2.0, Stargazers: 0, Issues: 0

pytracking

Visual tracking library based on PyTorch.

License: GPL-3.0, Stargazers: 0, Issues: 0

question-decomposition-to-sql

Weakly Supervised Text-to-SQL Parsing through Question Decomposition

License: MIT, Stargazers: 0, Issues: 0

RelateAnything

Relate Anything Model is capable of taking an image as input and utilizing SAM to identify the corresponding mask within the image.

License: Apache-2.0, Stargazers: 0, Issues: 0

RelTR

RelTR: Relation Transformer for Scene Graph Generation: https://arxiv.org/abs/2201.11460v2

Stargazers: 0, Issues: 0

SIMPAC-2023-146--

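A text analysis package for Chinese environmental-domain text, built on a pure neural-network architecture. It supports models such as EnvBert, LSTM, RNN, and word2vec, as well as custom models. Downstream tasks include classification, regression, multiple choice, sentiment analysis, and named entity recognition; special topics include climate-change text analysis and environmental knowledge graphs. The interfaces are optimized for domain research, so models can be used in one step.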

License: Apache-2.0, Stargazers: 0, Issues: 0

STTran

Spatial-Temporal Transformer for Dynamic Scene Graph Generation (ICCV 2021)

License: MIT, Stargazers: 0, Issues: 0

study_resources

Study resources on models and engineering

Stargazers: 0, Issues: 0

svitt

Code for CVPR 2023 paper "SViTT: Temporal Learning of Sparse Video-Text Transformers"

License: MIT, Stargazers: 0, Issues: 0

VGT

Video Graph Transformer for Video Question Answering (ECCV'22)

License: Apache-2.0, Stargazers: 0, Issues: 0

visual-chatgpt

Official repo for the paper: Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models

License: MIT, Stargazers: 0, Issues: 0

vit-pytorch

Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch

License: MIT, Stargazers: 0, Issues: 0