lucidrains / g-mlp-pytorch

Implementation of gMLP, an all-MLP replacement for Transformers, in Pytorch

Parameter count doesn't line up with paper

titu1994 opened this issue

Just a note (and correct me if I misunderstood the paper) -

The parameter count for the Tiny gMLP doesn't line up with the parameter count from the paper for 30 layers, dim 128, and ff_mult 6.
That's probably due to the doubling of parameters here - https://github.com/lucidrains/g-mlp-pytorch/blob/main/g_mlp_pytorch/g_mlp_pytorch.py#L111

The fix would be to halve this back to dim_ff, and all 3 lines here would need to halve their respective dims as well - https://github.com/lucidrains/g-mlp-pytorch/blob/main/g_mlp_pytorch/g_mlp_pytorch.py#L64-L66

With that change, the param count comes out to roughly 5.5M.
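
For reference, here is a rough back-of-the-envelope sketch of the per-block count with and without the doubling. It is not taken from the repo: the helper name `gmlp_block_params` is hypothetical, and it assumes each block is a pre-norm LayerNorm, an input projection, a gating unit (LayerNorm on one half of the channels plus a seq_len x seq_len spatial projection), and an output projection. The sequence length of 196 (224x224 images, 16x16 patches) is also an assumption, and the patch embedding and classifier head are left out.

```python
def gmlp_block_params(dim, ff_mult, seq_len, doubled=False):
    """Approximate parameter count of one gMLP block (hypothetical helper)."""
    dim_ff = dim * ff_mult
    # Width of the input projection: dim_ff, or 2 * dim_ff in the doubled variant.
    proj_width = dim_ff * 2 if doubled else dim_ff
    # The gating unit splits the projected channels into two halves.
    gate_width = proj_width // 2

    pre_norm = 2 * dim                        # LayerNorm weight + bias
    proj_in = dim * proj_width + proj_width   # Linear(dim, proj_width)
    gate_norm = 2 * gate_width                # LayerNorm on the gated half
    spatial = seq_len * seq_len + seq_len     # spatial projection weight + bias
    proj_out = gate_width * dim + dim         # Linear(gate_width, dim)
    return pre_norm + proj_in + gate_norm + spatial + proj_out


if __name__ == "__main__":
    depth, dim, ff_mult, seq_len = 30, 128, 6, 196  # seq_len is an assumption
    for doubled in (True, False):
        per_block = gmlp_block_params(dim, ff_mult, seq_len, doubled=doubled)
        label = "doubled" if doubled else "halved"
        print(f"{label}: {per_block:,} per block, {depth * per_block:,} total")
```

Under these assumptions the doubled variant comes out at a bit under twice the halved one, since the spatial projection does not depend on the channel width.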

@titu1994 Hi Somshubra! I made the changes in 0.0.9 - could you let me know if it matches up now?

Param counts are now very close to the paper, thanks!