Question about Hypothesis 2

x2ss opened this issue 10 months ago

Hi,

Thank you for sharing this insightful work.

I have a question about Hypothesis 2 on page 6 (arXiv version):

Hypothesis 2: It is still worthwhile to further explore the potential of SSM for visual
detection and segmentation since these tasks align with Characteristic 2, despite not fulfilling
Characteristic 1

• Characteristic 1: The task involves processing long sequences.
• Characteristic 2: The task requires causal token mixing mode.

However, at the end of page 5 (arXiv version), the paper says that "both detection on COCO and segmentation on ADE20K can be considered long-sequence tasks".

So perhaps Hypothesis 2 should read: "It is still worthwhile to further explore the potential of SSM for visual
detection and segmentation since these tasks align with Characteristic 1, despite not fulfilling
Characteristic 2"?

If I have misunderstood anything, I would appreciate your pointing it out. I look forward to your response.

Thanks and best!

Weihao Yu · Answer 1 · Fri May 17 2024 16:57:30 GMT+0800 (China Standard Time)

Hi @x2ss , thank you so much for your attention to our work. Yes, this is a typo. Thanks for your reminder, I will correct it in the next version.

x2ss · Answer 2 · Fri May 17 2024 17:44:58 GMT+0800 (China Standard Time)

Thanks for your reply.

I hope the next version of the paper (or perhaps another paper) will include more discussion of detection and segmentation, since many detection and segmentation tasks see performance gains from employing Mamba.

Thanks and best!