Multiple modalities

Question

HaibiaoXuan opened this issue 9 months ago

How can I use multiple modalities at the same time for a single task, such as text + image, text + audio, or text + point cloud?

Yiyuan Zhang · Answer 1 · Mon Sep 04 2023 20:45:02 GMT+0800 (China Standard Time)

You can simply concatenate these multimodal embeddings and then feed them to the shared encoder.
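
For readers who want something concrete, here is a minimal sketch of what that could look like in PyTorch. It is not taken from the repository: the random tensors stand in for the outputs of per-modality tokenizers, `EMBED_DIM` and the token counts are made up, and a generic `nn.TransformerEncoder` is used as a placeholder for the shared encoder. The one real requirement it illustrates is that every modality must already be projected to the same embedding width before concatenation.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

EMBED_DIM = 64  # assumed shared embedding width (hypothetical value)

# Hypothetical stand-ins for per-modality tokenizer outputs, shaped
# (batch, num_tokens, EMBED_DIM). Token counts may differ per modality.
text_tokens = torch.randn(2, 16, EMBED_DIM)
image_tokens = torch.randn(2, 49, EMBED_DIM)

# Concatenate along the token (sequence) dimension, not the feature dimension.
fused_tokens = torch.cat([text_tokens, image_tokens], dim=1)  # (2, 65, 64)

# Placeholder for the shared encoder: a small generic transformer encoder.
encoder_layer = nn.TransformerEncoderLayer(
    d_model=EMBED_DIM, nhead=4, dim_feedforward=128, batch_first=True
)
shared_encoder = nn.TransformerEncoder(encoder_layer, num_layers=2)
shared_encoder.eval()

with torch.no_grad():
    encoded = shared_encoder(fused_tokens)  # (2, 65, 64)

# Mean-pool over all tokens to get one vector per sample for a task head.
pooled = encoded.mean(dim=1)  # (2, 64)
```

Because the concatenation is along the token dimension, the encoder's self-attention can mix information across modalities; swapping in audio or point-cloud tokens only changes the second tensor.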

Xu Hong Bo · Answer 2 · Thu Sep 21 2023 15:23:36 GMT+0800 (China Standard Time)

"You can simply concatenate these multimodal embeddings and then feed them to the shared encoder."

Does this mean concatenating the vectors produced just before classification and then feeding the result to the classifier?

Yiyuan Zhang · Answer 3 · Thu Sep 21 2023 15:25:15 GMT+0800 (China Standard Time)

Simple concatenation is enough.
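
The reading proposed in Answer 2, concatenating per-modality feature vectors right before the classifier, could be sketched as below. This is only an illustration of that reading, not a confirmed description of the authors' setup: the feature tensors are random placeholders for pooled encoder outputs, and `FEATURE_DIM`, `NUM_CLASSES`, and the single linear classifier are assumptions.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

FEATURE_DIM = 64  # assumed size of each pooled feature vector (hypothetical)
NUM_CLASSES = 10  # assumed number of target classes (hypothetical)

# Hypothetical pooled features, one vector per sample and modality.
text_feat = torch.randn(2, FEATURE_DIM)
audio_feat = torch.randn(2, FEATURE_DIM)

# Concatenate along the feature dimension, so the classifier sees both.
fused_feat = torch.cat([text_feat, audio_feat], dim=-1)  # (2, 128)

# The classifier input width must match the concatenated width.
classifier = nn.Linear(2 * FEATURE_DIM, NUM_CLASSES)
logits = classifier(fused_feat)  # (2, 10)
```

Here the modalities only interact inside the classifier, so the input width of the classifier grows with each modality added.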