hooshvare / parsbert-ner

🤗 ParsBERT Persian NER Tasks

Home Page: https://huggingface.co/HooshvareLab/bert-base-parsbert-ner-uncased


Multi-lingual BERT

danyaljj opened this issue

Salaam,
Nice work!

Did you have a chance to try multilingual BERT?
The Transformers package has a minimal example for NER: https://github.com/huggingface/transformers/tree/master/examples/token-classification

Hi,
Thank you so much!

Yes, but we didn't include it in our paper. For your record and for others, I have updated the README with mBERT results on the NER datasets.

| Dataset | ParsBERT (F1) | mBERT (F1) |
|---------|---------------|------------|
| PEYMA   | 93.10*        | 86.64      |
| ARMAN   | 98.79*        | 95.89      |

Good! I am glad you have tried mBERT.
And the numbers look reasonable to me.

Several other queries for you:

  1. A while ago I collected a lot of raw Persian text. I suspect it is bigger than your current pretraining corpus. Feel free to use it: https://github.com/persiannlp/persian-raw-text
  2. What model sizes are you releasing?
  3. When are you guys planning to release the models?
  4. It would be great to make your models available on the Huggingface hub: https://huggingface.co/models. That would give your work a lot of visibility.

Thank you again!

  • We plan to upgrade the model shortly (it is already under development and will be published soon). We have also used a much larger corpus for this upgrade!
  • The model size and architecture are based on the BERT-base configuration!
  • In fact, the model has already been released on the Huggingface hub; you can access it from here! (Also, the pipeline shown in the README is implemented with Huggingface — check it out!)
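As a minimal sketch of the point above: since the model is on the Huggingface hub, it can be loaded through the standard `transformers` pipeline API. The model name comes from the repo's home page; the Persian example sentence is an arbitrary choice for illustration, not taken from the README.

```python
# Minimal sketch: tagging a Persian sentence with the released ParsBERT NER
# model via the Huggingface transformers pipeline. Downloads the model on
# first use.
from transformers import pipeline

ner_tagger = pipeline(
    "ner",
    model="HooshvareLab/bert-base-parsbert-ner-uncased",
)

# Arbitrary example sentence: "Iran is a country in the Middle East."
results = ner_tagger("ایران کشوری در خاورمیانه است")

# Each result is a dict with the sub-token, its predicted entity label,
# and the model's confidence score.
for token in results:
    print(token["word"], token["entity"], round(token["score"], 3))
```

The pipeline returns predictions per sub-token; to merge them into whole-entity spans, `pipeline("ner", aggregation_strategy="simple", ...)` can be used instead.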

Wonderful! 🥳