dmlc / gluon-nlp

NLP made easy

Home Page: https://nlp.gluon.ai/

difference between gluonnlp 0.10.0 and gluonnlp 1.0.0 RoBERTaModel?

makua-bernal opened this issue

I'm working on converting a RoBERTa model to gluonnlp 0.10.0 with mxnet 1.7.0.

I managed to get it working in gluonnlp 1.0.0 with mxnet 2.0.0, and the hidden-layer activations match the source model. In gluonnlp 0.10.0 with mxnet 1.7.0, however, they differ very slightly.

The discrepancy starts in the first layer, so I'm assuming it has something to do with the embeddings.
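For reference, this is roughly how I'm locating the first divergent layer. It's a minimal sketch: `acts_source` and `acts_v0100` are placeholders for the per-layer hidden states I dump from each model as numpy arrays (index 0 being the embedding output); it just walks the layers and reports the first one that fails an `allclose` check.

```python
import numpy as np

def first_divergent_layer(acts_source, acts_other, rtol=1e-5, atol=1e-6):
    """Return the index of the first layer whose activations differ, or None.

    acts_source / acts_other: lists of numpy arrays, one per layer,
    in the same order for both models (placeholder names).
    """
    for i, (a, b) in enumerate(zip(acts_source, acts_other)):
        if not np.allclose(a, b, rtol=rtol, atol=atol):
            max_abs_diff = np.max(np.abs(a - b))
            print(f"layer {i}: max abs diff = {max_abs_diff:.3e}")
            return i
    return None

# Example usage (assuming index 0 holds the embedding output):
# idx = first_divergent_layer(acts_source, acts_v0100)
# if idx == 0:
#     print("divergence starts at the embedding layer")
```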

I could have made a mistake somewhere, but I'm wondering if there's a simpler explanation.