`inv_freq` of `RotaryPositionEmbedding` is hard-coded to 10k
shijie-wu opened this issue
Shijie Wu commented
`theta` in `inv_freq` of `RotaryPositionEmbedding` is hard-coded to 10k:
TransformerEngine/transformer_engine/pytorch/attention.py
Lines 1371 to 1377 in 50e7a3d
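For context, a minimal sketch of the standard RoPE inverse-frequency computation, with the base (`theta`) exposed as a parameter rather than hard-coded. This is an illustration of the requested change, not TransformerEngine's actual code; the function name `rope_inv_freq` is hypothetical.

```python
def rope_inv_freq(dim: int, base: float = 10000.0) -> list[float]:
    """Return the RoPE inverse frequencies 1 / base**(2i/dim)
    for i = 0, 1, ..., dim//2 - 1.

    `base` is the theta that the issue reports as hard-coded to 10000;
    exposing it lets callers use other values (e.g. larger bases used
    by some long-context models).
    """
    return [1.0 / base ** (2 * i / dim) for i in range(dim // 2)]

# Default base reproduces the hard-coded behaviour; a different base
# simply rescales the rotation frequencies.
freqs_default = rope_inv_freq(8)                 # base = 10000
freqs_other = rope_inv_freq(8, base=500000.0)    # larger base, lower freqs
```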
Przemyslaw Tredak commented
@sudhakarsingh27 Could you take a look at it?