microsoft / XPretrain

Multi-modality pre-training

microsoft/XPretrain Issues

Is there no classification in the HD-VILA dataset?
Updated 24 days ago (1 comment)
about clipvip-vit-16 pretrained weights file
Updated a month ago
Pretrained Checkpoints of CLIP-VIP
Updated 2 months ago
Code for transcript text processing
Closed 2 years ago (26 comments)
About activitynet captions dataset in CLIP-ViP
Updated 2 months ago
Pretrained Checkpoints of LF-VILA
Closed 2 months ago (1 comment)
Hi, how to understand the LF-hdvila-8m?
Updated 3 months ago (1 comment)
Code for transcript text processing
Updated 4 months ago (1 comment)
Dockerfile and requirements for Clip-ViP
Updated 5 months ago
About LF-VILA code in PatchEmbed3D of video encoder
Updated 5 months ago
Error on starting horovod
Updated 6 months ago
Asking for a simple script to get text and video features
Updated 7 months ago (8 comments)
Model checkpoints
Updated 10 months ago
Error in finetuning
Updated 10 months ago (1 comment)
video caption of HD-VILA-100M Dataset
Closed 10 months ago (1 comment)
How long does CLIP-VIP pretraining takes?
Updated a year ago (1 comment)
About the zero-shot performance
Closed a year ago (1 comment)
About the zero-shot performance
Closed a year ago (2 comments)
where are the train9k.jsonl and test1ka.jsonl files in MSRVTT retrieval?
Closed a year ago (3 comments)
Where is the MSRVTT json file in CLIP-ViP?
Closed a year ago (2 comments)
CLIP-VIP OFA caption generate
Closed a year ago (1 comment)
MSR-VTT fine tune epochs number
Closed a year ago (2 comments)
Ways to open the .mdb caption files
Closed a year ago (2 comments)
Captions for HD-ViLA-100M
Closed a year ago (1 comment)
How to prepare pretrain data for LF-VILA?
Closed a year ago (2 comments)
How to use HD-VILA as multimodal TextEncoder?
Closed a year ago (3 comments)
Video compression/decoding methods of each dataset in CLIP-ViP
Closed a year ago (1 comment)
About OFA-Caption generated captions on HD-VILA-100M
Closed a year ago (1 comment)
Question regarding video proxy mechanism in CLIP-ViP
Closed a year ago (4 comments)
Reproducing the result of CLIP-ViP performance on MSRVTT
Closed a year ago (4 comments)
In CLIP-ViP, what is the results of OFA captions + HD-VILA-10M?
Closed a year ago (1 comment)
Questions about HD-VILA
Closed 2 years ago (4 comments)
[CLS] token in CLIP-ViP
Closed 2 years ago (2 comments)
releasing code and pretrain
Closed 2 years ago (3 comments)
Long Video Processing in LF-VILA
Closed 2 years ago (3 comments)
Where can i get the asr text
Closed 2 years ago (1 comment)
where to download the ASR transcriptions?
Closed 2 years ago (1 comment)
HD-VILA-100M dataset, where is the text corresponding to each video?
Closed 2 years ago (2 comments)