Code and Model for VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset
Home Page: https://arxiv.org/abs/2305.18500