qiantianwen's repositories
NuScenes-QA
[AAAI 2024] NuScenes-QA: A Multi-modal Visual Question Answering Benchmark for Autonomous Driving Scenario.
ViGA
"Video Moment Retrieval from Text Queries via Single Frame Annotation" in SIGIR 2022.
Language: Python, License: MIT
X-Trans2Cap
[CVPR 2022] X-Trans2Cap: Cross-Modal Knowledge Transfer using Transformer for 3D Dense Captioning
Language: Python, License: Apache-2.0