Maria Xenochristou (mariaxen)


Company: Stanford

Location: Palo Alto, California

Twitter: @mariaxenoch


Maria Xenochristou's starred repositories

google-research

Google Research

Language: Jupyter Notebook · License: Apache-2.0 · Stargazers: 33,546 · Issues: 749 · Issues: 1,223

fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

Language: Python · License: MIT · Stargazers: 29,930 · Issues: 426 · Issues: 4,173

tensorboardX

tensorboard for pytorch (and chainer, mxnet, numpy, ...)

Language: Python · License: MIT · Stargazers: 7,839 · Issues: 84 · Issues: 450

adrenaline

Instant answers to any programming question

deepfakes_faceswap

From deepfakes' faceswap: https://www.reddit.com/user/deepfakes/

youtube-8m

Starter code for working with the YouTube-8M dataset.

Language: Python · License: Apache-2.0 · Stargazers: 2,298 · Issues: 109 · Issues: 25

AliceMind

ALIbaba's Collection of Encoder-decoders from MinD (Machine IntelligeNce of Damo) Lab

Language: Python · License: Apache-2.0 · Stargazers: 1,964 · Issues: 50 · Issues: 80

VideoMAE

[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training

Language: Python · License: NOASSERTION · Stargazers: 1,284 · Issues: 16 · Issues: 118

info8010-deep-learning

Lectures for INFO8010 Deep Learning, ULiège

Language: Jupyter Notebook · License: BSD-3-Clause · Stargazers: 1,213 · Issues: 65 · Issues: 2

neural_renderer

"Neural 3D Mesh Renderer" (CVPR 2018) by H. Kato, Y. Ushiku, and T. Harada.

Language: Python · License: MIT · Stargazers: 1,133 · Issues: 51 · Issues: 40

CVinW_Readings

A collection of papers on the topic of "Computer Vision in the Wild (CVinW)"

Awesome-CLIP

Awesome list for research on CLIP (Contrastive Language-Image Pre-Training).

ast

Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".

Language: Jupyter Notebook · License: BSD-3-Clause · Stargazers: 1,079 · Issues: 18 · Issues: 131

VideoX

VideoX: a collection of video cross-modal models

Language: Python · License: NOASSERTION · Stargazers: 954 · Issues: 22 · Issues: 110

SPIN

Repository for the paper "Learning to Reconstruct 3D Human Pose and Shape via Model-fitting in the Loop"

Language: Python · License: NOASSERTION · Stargazers: 804 · Issues: 23 · Issues: 131

UniFormer

[ICLR2022] official implementation of UniFormer

Language: Python · License: Apache-2.0 · Stargazers: 803 · Issues: 10 · Issues: 130

human_body_prior

VPoser: Variational Human Pose Prior

Language: Jupyter Notebook · License: NOASSERTION · Stargazers: 771 · Issues: 25 · Issues: 67

awesome-audio-visual

A curated list of different papers and datasets in various areas of audio-visual processing

Multimodal-Toolkit

Multimodal model for text and tabular data, with HuggingFace transformers as the building block for the text data

Language: Python · License: Apache-2.0 · Stargazers: 570 · Issues: 25 · Issues: 54

ActionCLIP

This is the official implementation of the paper "ActionCLIP: A New Paradigm for Action Recognition"

Language: Python · License: MIT · Stargazers: 486 · Issues: 4 · Issues: 50

video_features

Extract video features from raw videos using multiple GPUs. We support RAFT flow frames as well as S3D, I3D, R(2+1)D, VGGish, CLIP, and TIMM models.

Language: Python · License: MIT · Stargazers: 482 · Issues: 6 · Issues: 72

MultiBench

[NeurIPS 2021] Multiscale Benchmarks for Multimodal Representation Learning

Language: HTML · License: MIT · Stargazers: 461 · Issues: 16 · Issues: 32

everything_at_once

This is the official implementation of "Everything at Once - Multi-modal Fusion Transformer for Video Retrieval" (CVPR 2022)

Bridge-Prompt

[CVPR2022] Bridge-Prompt: Towards Ordinal Action Understanding in Instructional Videos

AVLnet

Code for the AVLnet (Interspeech 2021) and Cascaded Multilingual (Interspeech 2021) papers.

Language: Python · License: NOASSERTION · Stargazers: 49 · Issues: 1 · Issues: 1

RepNet-Pytorch

Temporal repetition counting

TCAF-GZSL

This repository contains the code for our ECCV 2022 paper "Temporal and cross-modal attention for audio-visual zero-shot learning"

Language: Python · License: MIT · Stargazers: 24 · Issues: 5 · Issues: 10

fitclip

Code for the FitCLIP method

Language: Python · License: MIT · Stargazers: 7 · Issues: 3 · Issues: 1