satwikkottur

We ranked 1st in the DSTC8 Audio-Visual Scene-Aware Dialog competition. This is the source code for our IEEE/ACM TASLP (AAAI2020-DSTC8-AVSD) paper "Bridging Text and Video: A Universal Multimodal Transformer for Video-Audio Scene-Aware Dialog".

Language: Python, License: MIT

EgoVLP

[Arxiv2022] Egocentric Video-Language Pretraining

Language: Python

gitignore

A collection of useful .gitignore templates

License: CC0-1.0

ImageTextDetector

Fall 2014 course project for a Computer Vision course

Language: Matlab

lang-emerge-parlai

Implementation of EMNLP 2017 Paper "Natural Language Does Not Emerge 'Naturally' in Multi-Agent Dialog" using PyTorch and ParlAI

Language: Python

MTN

Code for the paper "Multimodal Transformer Networks for End-to-End Video-Grounded Dialogue Systems" (ACL19)

Language: Python, License: MIT

mturk-code-samples

Code samples to help you get started with the Amazon Mechanical Turk Requester API

Language: Java, License: Apache-2.0

neural-networks-and-deep-learning

Code samples for my book "Neural Networks and Deep Learning"

Language: Python

nn

Language: Lua, License: NOASSERTION

ParlAI

A framework for training and evaluating AI models on a variety of openly available dialogue datasets.

Language: Python, License: MIT

rnn

Recurrent Neural Network library for Torch7's nn

Language: Lua, License: BSD-3-Clause

satwikkottur.github.io

Personal Webpage

Language: JavaScript, License: MIT

simmc

To support next-generation virtual assistants that can handle multimodal inputs and perform multimodal actions, we introduce two new datasets (both in the virtual shopping domain), the annotation schema, the core technical tasks, and the baseline models. The code for the baselines and the datasets will be open-sourced.

Language: Python, License: NOASSERTION

Satwik Kottur's repositories

clevr-dialog

VisualWord2Vec

StochasticMCMC

MovieRecommend

FluidSimulator

abstract_scenes_v002

DeepLearningMovies

DSTC8-AVSD

EgoVLP

gitignore

ImageTextDetector

lang-emerge-parlai

MTN

mturk-code-samples

neural-networks-and-deep-learning

nn

ParlAI

rnn

satwikkottur.github.io

simmc

simmc2

sparse-app

tensorflow

tutorials

VD-BERT

visdial-1

visdial-bert

visdial-challenge-starter-pytorch

visual-semantic-embedding