Need more description about the CRF based NER model train details

Question

ArupDas15 opened this issue 3 years ago

Please give further details on how to train the NER model on custom data. The dataset (https://github.com/MISabic/NER-Bangla-Dataset) on which the model was trained has both IOB and BIOES tags. I cannot tell which tagging scheme was used to train the model in the example given (screenshot attached for reference).

Also, from the screenshot, it appears that you passed the same example (the same sentence, twice) for training and then again for testing. This is not clear to me either.

Please give more details about the architecture of the CRF model used for training. I tried to find these details in your paper (https://arxiv.org/pdf/2102.00405.pdf) but could not work them out. Please shed some light on these aspects, as I cannot follow the internals and the model looks like a black box to me. Hoping for a quick and positive response.

Sagor Sarker · Answer 1 · Wed Dec 15 2021 20:59:06 GMT+0800 (China Standard Time)

Hello @ArupDas15 ,
Thanks for raising this issue.
Below I break down your questions and answer each one.

Which tagging format can I use for training NER?
Answer: The module does not depend on the tagging scheme, so you can use IOB, BILOU, or BIOES. Just keep in mind that the tags in your test data must follow the same scheme as the tags in your training data. I used the BIOES format from https://github.com/MISabic/NER-Bangla-Dataset.
Why did you pass the same example for training and testing?
Answer: It's just an example. First, preprocess your data into the format shown in the example. Then split it into train and test sets and pass them to the training method.
What is the training model? Where can I find details about that model?
Answer: I used sklearn-crfsuite (a scikit-learn-style wrapper around CRFsuite) for the NER task. Its API documentation describes the model's arguments. If you want to understand CRFs, please read these: paper1, paper2.
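
The preprocessing described in the first two answers can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the library's own code: it assumes each sentence is a list of (token, tag) pairs and that the tags are IOB2-style (`B-X`, `I-X`, `O`). The helper names `iob_to_bioes` and `train_test_split` and the labels `PER` and `LOC` are hypothetical.

```python
import random


def iob_to_bioes(tags):
    """Convert one sentence's IOB2 tags (B-X, I-X, O) to BIOES tags."""
    bioes = []
    for i, tag in enumerate(tags):
        if tag == "O":
            bioes.append(tag)
            continue
        prefix, label = tag.split("-", 1)
        next_tag = tags[i + 1] if i + 1 < len(tags) else "O"
        # The entity continues only if the next tag is I- with the same label.
        continues = next_tag == "I-" + label
        if prefix == "B":
            bioes.append(("B-" if continues else "S-") + label)
        elif prefix == "I":
            bioes.append(("I-" if continues else "E-") + label)
        else:
            raise ValueError(f"unexpected tag: {tag}")
    return bioes


def train_test_split(sentences, test_ratio=0.2, seed=0):
    """Shuffle a copy of the sentences and split it into (train, test)."""
    shuffled = sentences[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical toy data: five copies of one IOB2-tagged sentence.
sentence = [("Rahim", "B-PER"), ("Uddin", "I-PER"), ("lives", "O"), ("in", "O"), ("Dhaka", "B-LOC")]
tokens = [token for token, _ in sentence]
converted_tags = iob_to_bioes([tag for _, tag in sentence])
data = [list(zip(tokens, converted_tags)) for _ in range(5)]
train_data, test_data = train_test_split(data)
```

Converting everything to one scheme before splitting keeps the training and test tags consistent, which is the only constraint the answer above places on the tagging format.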

I hope this answers your questions.
Regards

Arup Das · Answer 2 · Thu Dec 16 2021 22:54:33 GMT+0800 (China Standard Time)

Hi @sagorbrur,
Thank you very much for your prompt reply. Is it possible to take your pre-trained model and train it further? That is, can I feed in new instances and update the trained model (bn_ber.pkl), or do I need to train from scratch?

Sagor Sarker · Answer 3 · Thu Dec 16 2021 23:06:12 GMT+0800 (China Standard Time)

Hi @ArupDas15,
I don't think so. You need to train from scratch.
You can merge your new dataset with the NER-Bangla-Dataset data and then train a new model. Remember that your new dataset must use the same tagging format as that dataset.
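
A minimal sketch of that merge step, assuming both datasets are already lists of sentences made of (token, tag) pairs and that the target scheme is BIOES. The function names `tag_prefixes` and `merge_datasets` are hypothetical helpers, not part of any library.

```python
def tag_prefixes(sentences):
    """Collect the scheme prefixes (B, I, O, E, S, ...) used in a dataset."""
    prefixes = set()
    for sentence in sentences:
        for _token, tag in sentence:
            prefixes.add("O" if tag == "O" else tag.split("-", 1)[0])
    return prefixes


def merge_datasets(base, new, allowed_prefixes=frozenset("BIOES")):
    """Concatenate two datasets after checking that the new one uses only allowed prefixes."""
    unexpected = tag_prefixes(new) - allowed_prefixes
    if unexpected:
        raise ValueError(f"new data uses tags outside the expected scheme: {sorted(unexpected)}")
    return base + new


# Hypothetical toy data in BIOES format.
base = [[("Dhaka", "S-LOC"), ("is", "O"), ("big", "O")]]
new = [[("Rahim", "B-PER"), ("Uddin", "E-PER"), ("came", "O")]]
merged = merge_datasets(base, new)
```

A prefix check like this catches the obvious case of mixing BILOU data (`L-`, `U-` prefixes) into a BIOES dataset. It cannot tell IOB data from BIOES data that happens to contain no single-token entities, so it is only a first sanity check.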

Arup Das · Answer 4 · Sat Dec 18 2021 00:03:47 GMT+0800 (China Standard Time)

Hi @sagorbrur,
I tried to reproduce your results to make sure I am doing things correctly. I trained on the training data and tested on the test data from https://github.com/MISabic/NER-Bangla-Dataset. Based on your results (https://arxiv.org/pdf/2102.00405.pdf), I was expecting an F1 score of 66.88, but I am getting an F1 score of 90.35, which is also exactly the value I get for accuracy.

I am attaching a screenshot for your kind reference:

I must be doing something wrong here. Can you please help me out?

Sagor Sarker · Answer 5 · Sat Dec 18 2021 01:40:44 GMT+0800 (China Standard Time)

Hello @ArupDas15 ,
There is nothing wrong with your training.
I think it's a problem with the sklearn metrics.
You can predict with the trained model and check the F1 score using seqeval.
Here's my F1 score using seqeval:
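
As a rough illustration of why a token-level score and an entity-level score can differ so much, the sketch below computes both on a toy example. It is not the author's code or result and it is not seqeval's implementation: it is a simplified, strict BIOES span matcher written with the standard library only, and the function names and the `PER` and `LOC` labels are assumptions. One plausible factor is that token-level metrics give credit for every correctly predicted `O` tag, while an entity-level F1 only counts spans that match exactly.

```python
def extract_entities(tags):
    """Return the set of (label, start, end) spans in one BIOES-tagged sentence."""
    entities = set()
    start, label = None, None
    for i, tag in enumerate(tags):
        if tag == "O":
            start, label = None, None
            continue
        prefix, current = tag.split("-", 1)
        if prefix == "S":
            entities.add((current, i, i))
            start, label = None, None
        elif prefix == "B":
            start, label = i, current
        elif prefix == "I":
            if current != label:  # broken span: discard it
                start, label = None, None
        elif prefix == "E":
            if start is not None and current == label:
                entities.add((current, start, i))
            start, label = None, None
    return entities


def token_accuracy(gold, pred):
    """Fraction of tokens whose predicted tag equals the gold tag."""
    pairs = [(g, p) for gs, ps in zip(gold, pred) for g, p in zip(gs, ps)]
    return sum(g == p for g, p in pairs) / len(pairs)


def entity_f1(gold, pred):
    """Micro-averaged F1 over exactly matching entity spans."""
    tp = n_gold = n_pred = 0
    for gs, ps in zip(gold, pred):
        gold_spans, pred_spans = extract_entities(gs), extract_entities(ps)
        tp += len(gold_spans & pred_spans)
        n_gold += len(gold_spans)
        n_pred += len(pred_spans)
    if tp == 0:
        return 0.0
    precision, recall = tp / n_pred, tp / n_gold
    return 2 * precision * recall / (precision + recall)


gold = [["O", "O", "B-PER", "E-PER", "O", "O", "S-LOC", "O", "O", "O"]]
pred = [["O", "O", "B-PER", "I-PER", "O", "O", "O", "O", "O", "O"]]
accuracy = token_accuracy(gold, pred)
f1 = entity_f1(gold, pred)
```

Here 8 of 10 tokens are tagged correctly, so `accuracy` is 0.8, yet no predicted span matches a gold span exactly, so `f1` is 0.0. For single-label token classification, a micro-averaged F1 over tokens equals accuracy, which would be consistent with the identical F1 and accuracy values reported above.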