Question about the model size
fransilvionGenomica opened this issue
Hi,
I am trying to build a Hyena model using the hyperparameters from the 4th row of Table A4. I am using the standalone implementation of the model:
`layer2 = HyenaOperator(d_model=1024, l_max=19072, order=36, filter_order=64, num_inner_mlps=4, emb_dim=17, w=14)`
However, when I check the number of parameters, I get ~42M instead of 355M as stated in the paper. Is it because I am using the standalone implementation? But even then how come the difference is so big? Or maybe I am missing something?
`def count_parameters(model):
    # Count only trainable parameters
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

count_parameters(layer2)`
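
To see where the parameters actually sit, a per-submodule breakdown might help. Below is a minimal sketch, assuming a PyTorch-style model that exposes `named_parameters()`. The helper name `count_parameters_by_module` and its `depth` argument are my own and not part of the Hyena code:

```python
from collections import defaultdict


def count_parameters_by_module(model, depth=1):
    """Sum trainable parameter counts, grouped by the first `depth`
    dot-separated parts of each parameter name.

    Hypothetical helper: assumes `model.named_parameters()` yields
    (name, parameter) pairs, as torch.nn.Module does.
    """
    totals = defaultdict(int)
    for name, p in model.named_parameters():
        if not p.requires_grad:
            continue  # skip frozen parameters, as in count_parameters
        # e.g. "a.b.weight" -> "a" for depth=1, "a.b" for depth=2
        key = ".".join(name.split(".")[:depth])
        totals[key] += p.numel()
    return dict(totals)
```

Calling it on the operator built above should return a dictionary mapping each top-level submodule name to its trainable parameter count, and the values should add up to the total reported by `count_parameters`.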
Thank you for the clarification!