Bug: NER splits hyphenated names into multiple entities.
home15c6 opened this issue
I know hyphenated names like "Jean-Luc Godard" are not typical in Vietnamese, but they may appear in texts, such as news articles.
For ner('Jean-Luc Godard', deep=True)
Expected: B-PER, I-PER, I-PER -> 1 entity
Actual: B-PER, B-PER, I-PER -> 2 entities
Note: The model works as expected for "Công ty TNHH Bảo hiểm Nhân thọ Dai-ichi Việt Nam" -> 1 entity
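To make the entity counts above concrete, here is a minimal sketch of how BIO tags are usually grouped into entities. It does not call the library. The helper `bio_spans` is hypothetical, and the tag lists are copied from the expected and actual sequences in this report (the exact tokenization behind those three tags is not shown here, so only the tags are used).

```python
def bio_spans(tags):
    """Group BIO tags into (label, start, end) spans; end is exclusive.

    Hypothetical helper for illustration only, not part of the library.
    """
    spans = []
    label, start = None, None
    for i, tag in enumerate(tags):
        if tag.startswith("B-"):
            # A B- tag always opens a new entity, closing any open one.
            if label is not None:
                spans.append((label, start, i))
            label, start = tag[2:], i
        elif tag.startswith("I-") and label == tag[2:]:
            # Continuation of the currently open entity.
            continue
        else:
            # "O" or a mismatched I- tag closes the open entity.
            if label is not None:
                spans.append((label, start, i))
            if tag.startswith("I-"):
                # Assumption: a stray I- tag starts a new entity.
                label, start = tag[2:], i
            else:
                label, start = None, None
    if label is not None:
        spans.append((label, start, len(tags)))
    return spans


expected = ["B-PER", "I-PER", "I-PER"]
actual = ["B-PER", "B-PER", "I-PER"]

print(len(bio_spans(expected)))  # 1
print(len(bio_spans(actual)))    # 2
```

The second `B-PER` is what splits the name: under BIO conventions each `B-` tag starts a new entity, so the actual output is read as two separate PER entities.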