The arch figure doesn't conform to the code

Question

kytimmylai opened this issue 9 months ago · comments

In the arch figure of the gated CNN block, I found that the concat step is missing, and that the gate branch and the conv branch should share the same linear layer. Adjusting the figure this way would bring it closer to what the code actually does at inference.

Weihao Yu · Answer 1 · Wed Jun 26 2024 05:14:19 GMT+0800 (China Standard Time)

Hi @kytimmylai,

The conv can be a standard depthwise convolution or a depthwise convolution on partial channels (as in InceptionNeXt), controlled by the conv ratio.
The gate branch and the conv branch do not share the same linear layer (linear + split).
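
To make the "linear + split" and concat steps discussed in this thread concrete, here is a minimal NumPy sketch of a gated block. It is not the repository's code. The names `gated_block`, `depthwise_conv1d`, and `gelu`, the 1-D token axis, the tanh GELU approximation, and the residual connection are all assumptions made for illustration, and normalization is omitted.

```python
import numpy as np

def gelu(v):
    # tanh approximation of GELU; the activation choice is an assumption
    return 0.5 * v * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (v + 0.044715 * v**3)))

def depthwise_conv1d(c, kernels):
    # c: (tokens, channels), kernels: (channels, k); one kernel per channel.
    # A 1-D stand-in for the depthwise conv, with "same" output length.
    cols = [np.convolve(c[:, j], kernels[j], mode="same") for j in range(c.shape[1])]
    return np.stack(cols, axis=-1)

def gated_block(x, w1, w2, kernels):
    # x: (tokens, dim), w1: (dim, 2 * hidden), w2: (hidden, dim)
    hidden = w2.shape[0]
    conv_channels = kernels.shape[0]  # hypothetical: set by a conv ratio
    y = x @ w1  # one fused linear projection
    # split into gate, identity, and conv parts
    g, i, c = np.split(y, [hidden, 2 * hidden - conv_channels], axis=-1)
    c = depthwise_conv1d(c, kernels)  # conv only on the partial channels
    merged = np.concatenate([i, c], axis=-1)  # the concat step
    return x + (gelu(g) * merged) @ w2  # gate, project back, add residual

rng = np.random.default_rng(0)
dim, hidden, conv_channels, tokens = 4, 8, 3, 6
x = rng.standard_normal((tokens, dim))
w1 = rng.standard_normal((dim, 2 * hidden))
w2 = rng.standard_normal((hidden, dim))
kernels = rng.standard_normal((conv_channels, 3))
out = gated_block(x, w1, w2, kernels)

# Each split part depends only on its own columns of the fused weight,
# so a fused linear followed by a split acts like separate linears per branch.
g_fused = np.split(x @ w1, [hidden, 2 * hidden - conv_channels], axis=-1)[0]
g_separate = x @ w1[:, :hidden]
```

In this sketch, setting `conv_channels` equal to `hidden` corresponds to applying the depthwise conv to the whole non-gate half, while smaller values leave some channels as an identity path.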