SHI-Labs / Neighborhood-Attention-Transformer

Neighborhood Attention Transformer, arXiv 2022 / CVPR 2023. Dilated Neighborhood Attention Transformer, arXiv 2022


Questions about the algorithm speed.

jikerWRN opened this issue · comments

Hi, thanks for your great work.
I noticed that the paper does not include a speed comparison. I would like to know how NAT compares in speed to the Swin Transformer and to CNN models.
Thanks!

Hello, and thank you for your interest.
We did not discuss that in this version of the paper because the implementation (the CUDA kernel) is still a work in progress. Since both training and inference speeds depend heavily on the implementation, a comparison would not be meaningful at this stage.
That said, with the currently released version of the kernel, we found that NAT-Mini is faster than Swin-Tiny at ImageNet classification inference, and NAT-Tiny is about as fast as Swin-Tiny. NAT-Small and NAT-Base, however, are slower than their Swin counterparts, due to inefficiencies still present in our kernel, which we are working on.
Note that all of these speeds depend on input size, so with bigger models (more channels) or larger inputs (higher resolution), the results will differ.
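If you want to run a rough comparison yourself, a minimal throughput sketch like the one below measures images/second at a fixed input size. The model constructors are assumptions for illustration: `nat_mini` as exposed by this repo's classification code (`nat.py`), and timm's `swin_tiny_patch4_window7_224`.

```python
import time

import torch
import timm

# Assumption: `nat_mini` is provided by this repo's nat.py (registered as a timm model).
from nat import nat_mini


def throughput(model, batch_size=64, resolution=224, warmup=10, iters=50):
    """Return images/second for one model at a fixed batch size and resolution."""
    model = model.cuda().eval()
    x = torch.randn(batch_size, 3, resolution, resolution, device="cuda")
    with torch.no_grad():
        for _ in range(warmup):      # warm-up passes so kernel launch and
            model(x)                 # caching costs don't skew the timing
        torch.cuda.synchronize()
        start = time.time()
        for _ in range(iters):
            model(x)
        torch.cuda.synchronize()     # wait for all queued kernels to finish
    return batch_size * iters / (time.time() - start)


if __name__ == "__main__":
    print("NAT-Mini:", throughput(nat_mini(pretrained=False)))
    print("Swin-T:  ", throughput(timm.create_model("swin_tiny_patch4_window7_224")))
```

Changing `batch_size` or `resolution` in this sketch is the quickest way to see the input-size dependence mentioned above.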

I hope this answers your question, but please let me know if you need further clarification.

Ok, thanks for your reply!