NICTA / nicta-ner

NICTA Named Entity Recogniser is a rule-based named entity recogniser that extracts named entities, such as Organisation, Location and Person names, from text. It is written in Java.

Home Page: http://t3as.org/

Update word lists

meh9 opened this issue

The word lists are old, and because of an encoding conversion problem in the original files, some characters are missing and have been replaced with '?', e.g. line 5 of WIKI_LOC_EXTRACTION:

Hü?ün

We should update all the word lists.
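
To gauge how many entries are affected before regenerating the lists, something like the following could flag suspect lines. This is only a rough sketch: the class name `WordListCheck` and the method `suspectLines` are made up for illustration and are not part of nicta-ner, and it assumes the damaged characters show up either as a literal '?' or as the Unicode replacement character U+FFFD. A '?' can occasionally be legitimate, so the hits would still need a manual look.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of nicta-ner.
public class WordListCheck {

    /** Returns the 1-based numbers of lines containing '?' or U+FFFD. */
    public static List<Integer> suspectLines(List<String> lines) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            String line = lines.get(i);
            // Heuristic only: a literal '?' or the replacement character
            // is treated as a sign of a lost character.
            if (line.indexOf('?') >= 0 || line.indexOf('\uFFFD') >= 0) {
                hits.add(i + 1);
            }
        }
        return hits;
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: java WordListCheck <word-list-file>");
            return;
        }
        // Assumes the file is valid UTF-8; readAllLines throws
        // MalformedInputException if it is not.
        List<String> lines =
                Files.readAllLines(Paths.get(args[0]), StandardCharsets.UTF_8);
        for (int n : suspectLines(lines)) {
            System.out.println(n + ": " + lines.get(n - 1));
        }
    }
}
```

Run against a word list file, this prints each suspect line with its line number, which gives a quick checklist of entries to compare against a fresh extraction.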

Done.