Wrong about the computational complexity
Med-Process opened this issue
Given an input feature of size H x W x C, the whole multi-axis gMLP block (Figure 3) costs: input proj (2HWC^2) + output proj (2HWC^2) + block-gMLP dense (3HWC^2) + grid-gMLP dense (3HWC^2) = 10HWC^2.
Note: the first Dense layer in each gMLP block expands channels from C to 2C, which costs 2HWC^2. The output Dense in gMLP costs HWC^2. So each gMLP block costs 3HWC^2 in total.
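For anyone who wants to check the arithmetic, here is a minimal sketch of the count above. It is not taken from the repository. The helper names `dense_cost` and `multi_axis_gmlp_cost` are made up for illustration. It assumes that the input projection maps C to 2C, that the output projection maps 2C back to C, and that each branch (block and grid) works on C channels. It counts only the Dense layers listed above, one multiply-accumulate per weight per pixel, and ignores biases and the spatial projection inside the gating unit.

```python
def dense_cost(h, w, c_in, c_out):
    """Multiply-accumulates of a per-pixel Dense layer (bias ignored)."""
    return h * w * c_in * c_out


def multi_axis_gmlp_cost(h, w, c):
    """Dense-layer cost of one multi-axis gMLP block (hypothetical helper)."""
    # Assumed: the input projection expands C -> 2C, i.e. 2HWC^2.
    input_proj = dense_cost(h, w, c, 2 * c)
    # Assumed: the output projection maps 2C -> C, i.e. 2HWC^2.
    output_proj = dense_cost(h, w, 2 * c, c)
    # Each gMLP branch: first Dense C -> 2C (2HWC^2) plus output Dense
    # C -> C (HWC^2), i.e. 3HWC^2 per branch.
    gmlp_branch = dense_cost(h, w, c, 2 * c) + dense_cost(h, w, c, c)
    # Two branches: block-gMLP and grid-gMLP.
    return input_proj + output_proj + 2 * gmlp_branch


if __name__ == "__main__":
    h, w, c = 64, 64, 32
    total = multi_axis_gmlp_cost(h, w, c)
    # Ratio of the total to HWC^2; the count above says this should be 10.
    print(total / (h * w * c * c))
```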
I see, thank you. But Figure 3 is confusing about where the channel count is C and where it is C/2.
Oh yeah, you're right. We followed the common complexity convention from Swin, etc., but didn't expect it to cause confusion.