HazyResearch / safari

Convolutions for Sequence Modeling

Question about discrepancy between implementations available in the repo and related papers

sylee0124 opened this issue

Hi, I'm a bit confused about how the current implementations in the repo relate to the implementations used and discussed in the related papers. I'll state what I think is true; please correct me if I'm wrong.

  • FlashConv from H3
    A fused kernel is implemented in fftconv_cuda.cu, but it does not use block FFT.

  • FlashButterfly in "Simple Hardware-Efficient Long Convolutions for Sequence Modeling"
    long_conv.py uses BlockFFT (which is the same as the Butterfly decomposition) with support for learnable parameters for dft_matrix. However, it does not use a fused kernel, and the three-pass algorithm is not implemented either.
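
To make clear what I mean by block FFT, here is a minimal NumPy sketch of the idea: a length-N FFT with N = n1 * n2 written as two small DFT matrix multiplies with a twiddle correction in between, then used for a causal long convolution. This is only an illustration under my own assumptions, not the code in long_conv.py or fftconv_cuda.cu. The names `block_fft`, `block_ifft`, and `long_conv_block_fft` are hypothetical, and the local helper `dft_matrix` just builds the standard DFT matrix (as I understand it, these small matrices are what would be made learnable).

```python
import numpy as np


def dft_matrix(n):
    # Standard n x n DFT matrix: F[k, m] = exp(-2*pi*i*k*m / n).
    idx = np.arange(n)
    return np.exp(-2j * np.pi * np.outer(idx, idx) / n)


def block_fft(x, n1, n2):
    # Hypothetical helper: length n1*n2 FFT via two matrix multiplies
    # (one Cooley-Tukey / butterfly step), not the repo's implementation.
    n = n1 * n2
    x = np.asarray(x, dtype=complex)
    assert x.size == n
    x2 = x.reshape(n1, n2)                  # x2[m1, m2] = x[n2*m1 + m2]
    a = dft_matrix(n1) @ x2                 # DFT over m1 -> a[k1, m2]
    twiddle = np.exp(-2j * np.pi * np.outer(np.arange(n1), np.arange(n2)) / n)
    c = (a * twiddle) @ dft_matrix(n2).T    # DFT over m2 -> c[k1, k2]
    return c.T.reshape(n)                   # output index is k1 + n1*k2


def block_ifft(X, n1, n2):
    # Inverse FFT through the conjugation identity ifft(X) = conj(fft(conj(X))) / n.
    return np.conj(block_fft(np.conj(X), n1, n2)) / (n1 * n2)


def long_conv_block_fft(u, k, n1, n2):
    # Causal convolution of u with kernel k, zero-padded to n1*n2 points.
    n = n1 * n2
    L = len(u)
    assert n >= L + len(k) - 1, "padding too short for a linear convolution"
    u_pad = np.zeros(n)
    u_pad[:L] = u
    k_pad = np.zeros(n)
    k_pad[:len(k)] = k
    y = block_ifft(block_fft(u_pad, n1, n2) * block_fft(k_pad, n1, n2), n1, n2)
    return y.real[:L]
```

In this sketch every step is a separate NumPy call with its own intermediate array, which is the unfused situation I am asking about. A fused kernel would, as far as I understand, keep those intermediates on-chip instead.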

commented

Thanks for verifying :)
When can I expect this performance update? Will it happen anytime soon?
