mit-han-lab / lite-transformer

[ICLR 2020] Lite Transformer with Long-Short Range Attention

Home Page: https://arxiv.org/abs/2004.11886

Training config for wikitext-103

pichuang1984 opened this issue · comments

Thanks so much for open-sourcing your code!

Would it be possible to provide the training configs for the wikitext-103 dataset?

Thanks

Thank you for asking! We just added the code and configs to the new language-model branch, along with the pre-trained checkpoints.
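
For anyone following along: one quick way to confirm that the branch's language-model architectures are visible from a local checkout is to query fairseq's architecture registry. This is only a minimal sketch, assuming the language-model branch is checked out and its bundled fairseq fork is importable (e.g. run from the repository root); it is not part of the repository itself.

    # Minimal sketch: list the multi-branch architectures fairseq knows about.
    # Assumes the lite-transformer language-model branch is checked out and its
    # bundled fairseq fork is on the Python path.
    from fairseq.models import ARCH_MODEL_REGISTRY

    multibranch_archs = sorted(
        name for name in ARCH_MODEL_REGISTRY if "multibranch" in name
    )
    # Expected to include transformer_lm_multibranch_v2_wiki103_small on that branch.
    print(multibranch_archs)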

Thanks for providing the code and configs. However, it seems that the architecture transformer_lm_multibranch_v2_wiki103_small is not available in transformer_multibranch_v2.py?

Hi, the model is in the file lite-transformer/fairseq/models/transformer_lm_multibranch_v2.py in the language-model branch.

Thanks, you are right; I was still looking at transformer_multibranch_v2.py.
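
For future readers: the architecture name is bound to its default hyperparameters inside that file via fairseq's register_model_architecture decorator. The sketch below only illustrates that pattern; the model name and hyperparameter values are placeholders rather than the repository's actual configuration, so consult transformer_lm_multibranch_v2.py on the language-model branch for the real definitions.

    # Illustrative only: how fairseq binds an architecture name to default
    # hyperparameters. The model name and values below are placeholders; the real
    # definition lives in fairseq/models/transformer_lm_multibranch_v2.py on the
    # language-model branch, and the decorator only works once the corresponding
    # model class has been registered with @register_model.
    from fairseq.models import register_model_architecture

    @register_model_architecture(
        "transformer_lm_multibranch_v2",                # assumed model name
        "transformer_lm_multibranch_v2_wiki103_small",  # architecture name from this thread
    )
    def transformer_lm_multibranch_v2_wiki103_small(args):
        # fairseq convention: fill in defaults only if not already set on the command line
        args.decoder_embed_dim = getattr(args, "decoder_embed_dim", 512)
        args.decoder_layers = getattr(args, "decoder_layers", 6)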