AI4Bharat / Indic-BERT-v1

Indic-BERT-v1: BERT-based Multilingual Model for 11 Indic Languages and Indian English. For the latest IndicBERT v2, see: https://github.com/AI4Bharat/IndicBERT

Home Page: https://indicnlp.ai4bharat.org

Documentation to implement NER

koushikram3420 opened this issue

Hey,
I tried using Indic-BERT for NER on news articles (for clustering them) with the `transformers` library. During tokenization, some of the tokens get split into subword pieces; I wanted to know if there is any way to avoid that.
Also, when I ran the same example you show in your documentation, I got different results:
brisbane (2)
chanakya (2)
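
Regarding the splitting mentioned above: with a SentencePiece subword vocabulary the splitting itself cannot be avoided, so word-level NER tags are usually aligned to the subword pieces instead. Below is a minimal sketch of that alignment, assuming the Hugging Face `transformers` library, the `ai4bharat/indic-bert` checkpoint, and a made-up word list and tag set.

```python
# Sketch: how word-level NER labels are typically aligned to subword tokens.
# Assumes `transformers` with a fast tokenizer; the words and labels are illustrative.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("ai4bharat/indic-bert")

words = ["brisbane", "hosted", "the", "match"]   # hypothetical word-level input
word_labels = ["B-LOC", "O", "O", "O"]           # hypothetical word-level tags

enc = tokenizer(words, is_split_into_words=True, add_special_tokens=True)
print(tokenizer.convert_ids_to_tokens(enc["input_ids"]))  # shows the subword splits

aligned = []
for word_id in enc.word_ids():       # word_ids() needs a fast tokenizer
    if word_id is None:              # special tokens such as [CLS] / [SEP]
        aligned.append(-100)         # -100 is ignored by the token-classification loss
    else:
        aligned.append(word_labels[word_id])  # every piece inherits its word's tag
print(aligned)
```

If word-level predictions are needed afterwards, the usual convention is to keep the prediction on the first piece of each word and mask the remaining pieces with -100.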
Kindly help me understand why the tokens are not being recognized properly. When I give custom inputs to the tokenizer in the same format, the tokens are not recognized and the encoding comes out as 1, even with `add_special_tokens`.
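
(One way to check what "encoding as 1" means: for ALBERT-style SentencePiece vocabularies, id 1 is typically the unknown-token id, so everything collapsing to 1 suggests the text is falling outside the vocabulary. A sketch of how that could be checked; the `keep_accents` part is a commonly suggested workaround for Indic scripts rather than something confirmed in this thread, and the Hindi sentence is made up.)

```python
# Sketch: check whether id 1 is the tokenizer's <unk> id and whether the
# input text is being mapped to <unk>. Assumes the `ai4bharat/indic-bert`
# checkpoint; the example sentence is illustrative.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("ai4bharat/indic-bert")
print(tokenizer.unk_token, tokenizer.unk_token_id)  # if this prints 1, "encoding as 1" means <unk>

text = "ब्रिस्बेन में मैच हुआ"   # hypothetical Hindi input
print(tokenizer.convert_ids_to_tokens(tokenizer(text)["input_ids"]))

# The ALBERT tokenizer strips combining marks unless keep_accents=True, which
# can mangle Indic scripts and push tokens out of the vocabulary.
slow = AutoTokenizer.from_pretrained("ai4bharat/indic-bert", keep_accents=True, use_fast=False)
print(slow.convert_ids_to_tokens(slow(text)["input_ids"]))
```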
It would be helpful if you could share an example NER implementation.
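
In the absence of an official notebook, a single token-classification training step might look like the sketch below (assumptions: PyTorch, the `ai4bharat/indic-bert` checkpoint, and a made-up tag set and sentence; a real setup would fine-tune on a labelled NER dataset with `Trainer` or an optimizer loop).

```python
# Sketch: one training step of token classification (NER) with Indic-BERT.
# Assumes `transformers` with PyTorch; the tag set and input are illustrative.
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

label_list = ["O", "B-PER", "I-PER", "B-LOC", "I-LOC"]   # illustrative tags
tokenizer = AutoTokenizer.from_pretrained("ai4bharat/indic-bert")
model = AutoModelForTokenClassification.from_pretrained(
    "ai4bharat/indic-bert", num_labels=len(label_list)   # head is randomly initialised
)

words = ["chanakya", "lived", "in", "pataliputra"]       # hypothetical input
word_labels = ["B-PER", "O", "O", "B-LOC"]
enc = tokenizer(words, is_split_into_words=True, return_tensors="pt")

# One label id per subword position; -100 marks positions the loss ignores.
labels = torch.full(enc["input_ids"].shape, -100, dtype=torch.long)
for i, wid in enumerate(enc.word_ids()):
    if wid is not None:
        labels[0, i] = label_list.index(word_labels[wid])

out = model(**enc, labels=labels)
print(out.loss, out.logits.shape)   # logits: (1, seq_len, num_labels)
```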

Can you please share your notebook?
Thanks in advance.

Has anybody been able to create an example of NER for an Indian language using Indic-BERT? That would be very helpful. @koushikram3420, which model have you used? I think if you have used Indic-BERT, then according to your process the label size should be 768, whereas in your case the label size is 9.
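
One way to check the two sizes being compared here is to inspect the model config: `hidden_size` is the width of the encoder output (the figure 768 refers to this), while the token-classification head is sized by `num_labels`, so 9 is what a BIO tag set would give. A sketch, assuming `transformers` and the `ai4bharat/indic-bert` checkpoint:

```python
# Sketch: hidden_size is the encoder width; num_labels sizes the NER head.
# Assumes `transformers` and the `ai4bharat/indic-bert` checkpoint.
from transformers import AutoConfig, AutoModelForTokenClassification

config = AutoConfig.from_pretrained("ai4bharat/indic-bert", num_labels=9)
print(config.hidden_size)   # encoder width (what the 768 refers to)
print(config.num_labels)    # 9, i.e. the number of NER tags the head predicts

model = AutoModelForTokenClassification.from_pretrained("ai4bharat/indic-bert", num_labels=9)
print(model.classifier)     # Linear(hidden_size -> num_labels)
```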