Maria Xenochristou (mariaxen)


Company: Stanford

Location: Palo Alto, California

Twitter: @mariaxenoch


Maria Xenochristou's starred repositories

google-research

Google Research

Language: Jupyter Notebook · License: Apache-2.0 · Stargazers: 33,546 · Issues: 749 · Issues: 1,223

fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

Language: Python · License: MIT · Stargazers: 29,930 · Issues: 426 · Issues: 4,173

tensorboardX

tensorboard for pytorch (and chainer, mxnet, numpy, ...)

Language: Python · License: MIT · Stargazers: 7,839 · Issues: 84 · Issues: 450

adrenaline

Instant answers to any programming question

deepfakes_faceswap

From deepfakes' faceswap: https://www.reddit.com/user/deepfakes/

youtube-8m

Starter code for working with the YouTube-8M dataset.

Language: Python · License: Apache-2.0 · Stargazers: 2,298 · Issues: 109 · Issues: 25

AliceMind

ALIbaba's Collection of Encoder-decoders from MinD (Machine IntelligeNce of Damo) Lab

Language: Python · License: Apache-2.0 · Stargazers: 1,964 · Issues: 50 · Issues: 80

VideoMAE

[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training

Language: Python · License: NOASSERTION · Stargazers: 1,284 · Issues: 16 · Issues: 118

info8010-deep-learning

Lectures for INFO8010 Deep Learning, ULiège

Language: Jupyter Notebook · License: BSD-3-Clause · Stargazers: 1,213 · Issues: 65 · Issues: 2

neural_renderer

"Neural 3D Mesh Renderer" (CVPR 2018) by H. Kato, Y. Ushiku, and T. Harada.

Language: Python · License: MIT · Stargazers: 1,133 · Issues: 51 · Issues: 40

CVinW_Readings

A collection of papers on the topic of "Computer Vision in the Wild (CVinW)"

Awesome-CLIP

Awesome list for research on CLIP (Contrastive Language-Image Pre-Training).

ast

Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".

Language: Jupyter Notebook · License: BSD-3-Clause · Stargazers: 1,079 · Issues: 18 · Issues: 131

VideoX

VideoX: a collection of video cross-modal models

Language: Python · License: NOASSERTION · Stargazers: 954 · Issues: 22 · Issues: 110

SPIN

Repository for the paper "Learning to Reconstruct 3D Human Pose and Shape via Model-fitting in the Loop"

Language: Python · License: NOASSERTION · Stargazers: 804 · Issues: 23 · Issues: 131

UniFormer

[ICLR2022] official implementation of UniFormer

Language: Python · License: Apache-2.0 · Stargazers: 803 · Issues: 10 · Issues: 130

human_body_prior

VPoser: Variational Human Pose Prior

Language: Jupyter Notebook · License: NOASSERTION · Stargazers: 771 · Issues: 25 · Issues: 67

awesome-audio-visual

A curated list of different papers and datasets in various areas of audio-visual processing

Multimodal-Toolkit

Multimodal model for text and tabular data, with HuggingFace transformers as the building block for the text data

Language: Python · License: Apache-2.0 · Stargazers: 570 · Issues: 25 · Issues: 54

ActionCLIP

This is the official implementation of the paper "ActionCLIP: A New Paradigm for Action Recognition"

Language: Python · License: MIT · Stargazers: 486 · Issues: 4 · Issues: 50

video_features

Extract video features from raw videos using multiple GPUs. We support RAFT flow frames as well as S3D, I3D, R(2+1)D, VGGish, CLIP, and TIMM models.

Language: Python · License: MIT · Stargazers: 482 · Issues: 6 · Issues: 72

MultiBench

[NeurIPS 2021] Multiscale Benchmarks for Multimodal Representation Learning

Language: HTML · License: MIT · Stargazers: 461 · Issues: 16 · Issues: 32

everything_at_once

This is the official implementation of "Everything at Once - Multi-modal Fusion Transformer for Video Retrieval" (CVPR 2022)

Bridge-Prompt

[CVPR2022] Bridge-Prompt: Towards Ordinal Action Understanding in Instructional Videos

AVLnet

Code for the AVLnet (Interspeech 2021) and Cascaded Multilingual (Interspeech 2021) papers.

Language: Python · License: NOASSERTION · Stargazers: 49 · Issues: 1 · Issues: 1

RepNet-Pytorch

Temporal repetition counting

TCAF-GZSL

This repository contains the code for our ECCV 2022 paper "Temporal and cross-modal attention for audio-visual zero-shot learning"

Language: Python · License: MIT · Stargazers: 24 · Issues: 5 · Issues: 10

fitclip

Code for the FitCLIP method

Language: Python · License: MIT · Stargazers: 7 · Issues: 3 · Issues: 1