There are 14 repositories under the msvd topic.
A summary of Video-to-Text datasets. This repository is part of the review paper *Bridging Vision and Language from the Video-to-Text Perspective: A Comprehensive Review*.
PyTorch code for *Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners*
A PyTorch implementation of state-of-the-art video captioning models (2015-2019) on the MSVD and MSR-VTT datasets.
[ACM MM 2017 & IEEE TMM 2020] This is the Theano code for the paper "Video Description with Spatial Temporal Attention"
Source code for Semantics-Assisted Video Captioning Model Trained with Scheduled Sampling Strategy
Source code for Delving Deeper into the Decoder for Video Captioning
Source code of the paper titled *Improving Video Captioning with Temporal Composition of a Visual-Syntactic Embedding*
Python implementation for extracting several visual feature representations from videos (see the feature-extraction sketch after this list)
Source code of the paper titled *Attentive Visual Semantic Specialized Network for Video Captioning*
An attention-based encoder-decoder model for video captioning on the MSVD dataset (a minimal sketch appears after this list)
[Pattern Recognition 2021] This is the Theano code for our paper "Enhancing the Alignment between Target Words and Corresponding Frames for Video Captioning".
This project uses deep learning to automatically generate contextually relevant captions for videos: it extracts spatial and temporal features and applies Gaussian attention to focus on the most informative regions (see the Gaussian-attention sketch after this list). This improves video indexing, retrieval, and accessibility for visually impaired users.
LSTM RNN and Transformer networks for video captioning on MSVD and MSR-VTT using attributes and SVOS
MSVD-Indonesian: A Benchmark for Multimodal Video-Text Tasks in Indonesian (Bahasa Indonesia).
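For context on the feature-extraction step mentioned above, the following is a minimal, illustrative PyTorch sketch of per-frame feature extraction with a pretrained ResNet-50 (torchvision >= 0.13 assumed). It is not the pipeline of any repository listed here; those typically support several backbones and both appearance and motion features.

```python
# Illustrative sketch only: per-frame appearance features from a pretrained CNN.
# Assumes torchvision >= 0.13 (weights="DEFAULT"); repositories above may differ.
import torch
import torchvision.models as models
import torchvision.transforms as T

# Load a pretrained ResNet-50 and drop its classification head.
backbone = models.resnet50(weights="DEFAULT")
backbone.fc = torch.nn.Identity()
backbone.eval()

preprocess = T.Compose([
    T.Resize(256),
    T.CenterCrop(224),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def extract_features(frames):
    """frames: list of PIL images sampled from a video -> (num_frames, 2048) tensor."""
    batch = torch.stack([preprocess(f) for f in frames])
    return backbone(batch)
```

In practice, frames are sampled uniformly from each clip and the resulting per-frame features are saved to disk before caption-model training.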
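The attention-based encoder-decoder design behind several of these captioning repositories can be summarized in a few lines. The sketch below is a simplified, hypothetical PyTorch module (the dimensions and the concatenation-based attention scoring are placeholders), not the implementation of any specific repository: a GRU encodes the frame features, and at each decoding step attention re-weights the encoder states into a context vector.

```python
import torch
import torch.nn as nn

class AttnCaptioner(nn.Module):
    """Minimal attention-based encoder-decoder for video captioning (illustrative only)."""
    def __init__(self, feat_dim=2048, hid_dim=512, vocab_size=10000, emb_dim=300):
        super().__init__()
        self.encoder = nn.GRU(feat_dim, hid_dim, batch_first=True)
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.attn = nn.Linear(hid_dim * 2, 1)          # concatenation-based scoring
        self.decoder = nn.GRUCell(emb_dim + hid_dim, hid_dim)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, feats, captions):
        # feats: (B, T, feat_dim) frame features; captions: (B, L) token ids
        enc_out, h = self.encoder(feats)               # (B, T, H), (1, B, H)
        h = h.squeeze(0)
        logits = []
        for t in range(captions.size(1)):
            # Score each frame against the current decoder state, then softmax over time.
            scores = self.attn(torch.cat(
                [enc_out, h.unsqueeze(1).expand_as(enc_out)], dim=-1)).squeeze(-1)
            ctx = (scores.softmax(dim=1).unsqueeze(-1) * enc_out).sum(dim=1)
            h = self.decoder(torch.cat([self.embed(captions[:, t]), ctx], dim=-1), h)
            logits.append(self.out(h))
        return torch.stack(logits, dim=1)              # (B, L, vocab_size)
```

With frame features of shape (B, T, 2048) and integer caption tokens, `forward` returns per-step vocabulary logits suitable for cross-entropy training with teacher forcing.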
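"Gaussian attention", as mentioned in the caption-generation project above, generally means weighting features with a Gaussian profile rather than free-form attention weights. Below is a minimal sketch of a temporal variant; the function name, the (B, T, D) layout, and the per-clip center/width parameters are assumptions for illustration, and the project's actual mechanism may operate spatially and differ in detail.

```python
import torch

def gaussian_temporal_attention(frame_feats, center, sigma):
    """Weight frame features with a Gaussian over temporal positions (illustrative sketch).

    frame_feats: (B, T, D) frame features; center, sigma: (B,) in frame units.
    Returns a (B, D) attended feature per clip.
    """
    B, T, _ = frame_feats.shape
    positions = torch.arange(T, dtype=frame_feats.dtype, device=frame_feats.device)
    # Unnormalized Gaussian weight for each frame position, per batch element.
    weights = torch.exp(
        -0.5 * ((positions.unsqueeze(0) - center.unsqueeze(1)) / sigma.unsqueeze(1)) ** 2)
    weights = weights / weights.sum(dim=1, keepdim=True)
    return (weights.unsqueeze(-1) * frame_feats).sum(dim=1)
```

Intuitively, a center near T/2 with a large sigma approaches uniform mean pooling, while a small sigma concentrates the caption model on a few frames around the predicted center.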