TXH-mercury / VAST

Code and Model for VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset

https://arxiv.org/abs/2305.18500

TXH-mercury/VAST Issues

Question about Table 6
Updated a month ago
Is there any plan to release the fine-tuned models for downstream tasks?
Updated a month ago (4 comments)
"/data/IndexAnno.py", "VQA-msrvtt.json", and "descs_qa_trainval.json"
Closed a month ago (3 comments)
Can you share the checkpoints of the fine-tuned models?
Updated a month ago
/github/workspace/src/video/ffmpeg/threaded_decoder.cc:292: [14:29:09] /github/workspace/src/video/ffmpeg/threaded_decoder.cc:218: Check failed: avcodec_send_packet(dec_ctx_.get(), pkt.get()) >= 0 (-11 vs. 0) Thread worker: Error sending packet.
Updated a month ago (1 comment)
How did you get the audio for "datasets/srcdata/msrvtt/audios"?
Closed a month ago (3 comments)
Error in the finetune_qa_msvd task (missing key 'desc' or 'caption' in descs_qa_trainval.json)
Closed 3 months ago (4 comments)
Activitynet-QA annotations are missing
Closed a month ago
Problem running finetuning on TGIF
Updated a month ago
What's the function of the param 'captioner_mode'?
Updated 2 months ago
The overall pipeline implementations of caption generation for VAST-27M
Updated 2 months ago
What is the difference between the arguments "--local-rank" and "--local_rank"?
Updated 3 months ago
Missing config files for pretrain
Updated 3 months ago
How can I fine-tune a model for a downstream task?
Updated 3 months ago (1 comment)
License?
Closed 4 months ago (1 comment)
Error when labelling my own data with VAST's captioner?
Closed 4 months ago
What are the minimum requirements for GPU memory?
Updated 4 months ago
Error while captioning using a single processor
Updated 4 months ago
Memory usage during validation
Updated 4 months ago
Dataset download
Updated 4 months ago
Inference code
Updated 4 months ago
Code Release Please
Updated 4 months ago (11 comments)
Code release
Updated 5 months ago (7 comments)
Nice work!
Updated 5 months ago (7 comments)