glample / tagger

Named Entity Recognition Tool

Inconsistent conversion for IOBES to IOB

sbmaruf opened this issue · comments

Example:
From eng.testb,

CRICKET NNP I-NP O
- : O O
LEICESTERSHIRE NNP I-NP I-ORG
TAKE NNP I-NP O
OVER IN I-PP O
AT NNP I-NP O
TOP NNP I-NP O
AFTER NNP I-NP O
INNINGS NNP I-NP O
VICTORY NN I-NP O
. . O O

The code uses the update_tag_scheme() function to convert the tags from IOB to IOBES.
Then, during evaluation, it converts them back from IOBES to IOB here.
The output written to the files looks like the following:

CRICKET NNP I-NP O O
- : O O O
LEICESTERSHIRE NNP I-NP B-ORG O
TAKE NNP I-NP O O
OVER IN I-PP O O
AT NNP I-NP O O
TOP NNP I-NP O O
AFTER NNP I-NP O O
INNINGS NNP I-NP O O
VICTORY NN I-NP O O
. . O O O

where the last column is the predicted output and the column before it is the TRUE tag.
LEICESTERSHIRE is tagged I-ORG in the given dataset, but when the output is written to the file it becomes B-ORG. Isn't that a wrong conversion? The results may vary because of this.
@glample

OK, I got it. It converts IOBES to IOB2, whereas the original dataset has its tags in IOB1 format.
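
For anyone else who hits this, here is a minimal sketch of the round trip, assuming the standard definitions of IOB1, IOB2 and IOBES. The function names are made up for illustration and are not the functions in this repo. In IOB1 an entity starts with I- unless it directly follows another entity of the same type; in IOB2 every entity starts with B-. So a single-token I-ORG becomes B-ORG after going through IOBES and back.

```python
def iob1_to_iob2(tags):
    """Hypothetical helper: rewrite IOB1 tags so every entity starts with B-."""
    out = []
    prev = "O"
    for tag in tags:
        # An I- tag opens a new entity if the previous tag was O
        # or belonged to an entity of a different type.
        if tag.startswith("I-") and (prev == "O" or prev[2:] != tag[2:]):
            out.append("B-" + tag[2:])
        else:
            out.append(tag)
        prev = tag
    return out


def iob2_to_iobes(tags):
    """Hypothetical helper: mark single-token entities S- and entity ends E-."""
    out = []
    for i, tag in enumerate(tags):
        nxt = tags[i + 1] if i + 1 < len(tags) else "O"
        continues = nxt.startswith("I-")  # the entity goes on in the next token
        if tag.startswith("B-"):
            out.append(tag if continues else "S-" + tag[2:])
        elif tag.startswith("I-"):
            out.append(tag if continues else "E-" + tag[2:])
        else:
            out.append(tag)
    return out


def iobes_to_iob2(tags):
    """Hypothetical helper: map S- back to B- and E- back to I-."""
    out = []
    for tag in tags:
        if tag.startswith("S-"):
            out.append("B-" + tag[2:])
        elif tag.startswith("E-"):
            out.append("I-" + tag[2:])
        else:
            out.append(tag)
    return out


# Gold tags for "CRICKET - LEICESTERSHIRE TAKE" as they appear in eng.testb (IOB1).
gold_iob1 = ["O", "O", "I-ORG", "O"]
as_iobes = iob2_to_iobes(iob1_to_iob2(gold_iob1))
written = iobes_to_iob2(as_iobes)
print(as_iobes)  # ['O', 'O', 'S-ORG', 'O']
print(written)   # ['O', 'O', 'B-ORG', 'O']
```

The entity span and type are unchanged; only the label that marks the first token differs. As long as the true and predicted columns are both written in IOB2, an evaluation based on entity spans should see the same entities.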