Andrew Rouditchenko (roudimit)


Company: Massachusetts Institute of Technology

Location: Cambridge, Massachusetts

Home Page: http://people.csail.mit.edu/roudi/

Twitter: @arouditchenko

Andrew Rouditchenko's repositories

MUSIC_dataset

MUSIC Dataset from The Sound of Pixels (ECCV '18)

AVLnet

Code for the AVLnet (Interspeech 2021) and Cascaded Multilingual (Interspeech 2021) papers.

Language: Python | License: NOASSERTION | Stargazers: 49 | Issues: 1 | Issues: 1

whisper-flamingo

[Interspeech 2024] Whisper-Flamingo: Integrating Visual Features into Whisper for Audio-Visual Speech Recognition and Translation

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 48 | Issues: 0 | Issues: 0

c2kd

Code for the C2KD paper (ICASSP 2023)

Language: Python | License: BSD-3-Clause | Stargazers: 15 | Issues: 1 | Issues: 0

Sound-of-Pixels

Codebase for ECCV18 "The Sound of Pixels"

Language: Python | License: MIT | Stargazers: 1 | Issues: 0 | Issues: 0

video_feature_extractor

Easy to use video deep features extractor

Language: Python | License: Apache-2.0 | Stargazers: 1 | Issues: 0 | Issues: 0

awesome-video-text-retrieval

A curated list of deep learning resources for video-text retrieval.

Stargazers: 0 | Issues: 0 | Issues: 0

everything_at_once

Implementation of "Everything at Once - Multi-modal Fusion Transformer for Video Retrieval" (CVPR 2022)

Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0

MIL-NCE_HowTo100M

PyTorch GPU distributed training code for MIL-NCE HowTo100M

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

MIT-6.058-Notebook

Notebooks collection I made for the MIT IAP Signals and Systems class

Language: Jupyter Notebook | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

Language: Jupyter Notebook | License: BSD-3-Clause | Stargazers: 0 | Issues: 1 | Issues: 0

Spoken-ObjectNet

Official code for Spoken ObjectNet: A Bias-Controlled Spoken Caption Dataset (Interspeech 2021)

Language: Shell | License: BSD-3-Clause | Stargazers: 0 | Issues: 0 | Issues: 0