How to avoid a very large sequence length?
IceClear opened this issue · comments
Hi, @JunMa11 . Thanks for your great work.
I have a small question related to the network setting.
Since the sequence length L is set to the product of C, H, and W of the image patch according to the paper: given a 320×320 image patch, C can be 32 according to the code (if my understanding is correct), so at the first scale of the U-Net (after the first pooling) L = 160×160×32 = 819.2K, which can be quite large.
Do I misunderstand some details? Or are there some strategies to avoid such a case?
Thanks again and look forward to your help :)
Hi, @IceClear
We followed the common practice in vision transformers, and there is a transpose operation. Thus, C is the length.
U-Mamba/umamba/nnunetv2/nets/UMambaBot.py
Line 205 in 548f3b2
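For what it's worth, the common vision-transformer pattern referred to above flattens the spatial dimensions and then transposes, so the sequence layer sees the flattened feature map rather than one C×H×W-long sequence. Here is a minimal shape-only sketch of that flatten-and-transpose step using NumPy (the actual UMambaBot.py operates on PyTorch tensors, and the exact shapes there may differ — the numbers below just reuse the 32×160×160 example from the question):

```python
import numpy as np

# Feature map after the first pooling: (batch, C, H, W)
B, C, H, W = 1, 32, 160, 160
x = np.zeros((B, C, H, W))

# Flatten the spatial dims: (B, C, H*W)
x_flat = x.reshape(B, C, H * W)

# Transpose the last two axes: (B, H*W, C).
# The sequence dimension and the channel dimension are swapped,
# so the model never processes a single sequence of C*H*W = 819200 elements.
tokens = x_flat.transpose(0, 2, 1)
print(tokens.shape)  # (1, 25600, 32)
```

So after the transpose, the per-layer sequence is far shorter than the C·H·W product computed in the question.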