carfly / bi_ilp

Bilingual Named Entity Recognition with Integer Linear Programming

Directories
- data: training and test data
- ner: Stanford NER
  + models: model files and property files
- ilp: Bilingual ILP
- scripts: helper scripts

Steps (see run.sh)
- CRF-based NER
 * training: build Chinese and English NER models
 * test: predict NER results and the probability of each tag for each word
- Calculate PMI
 * calculate PMI (pointwise mutual information) scores from the NER results
- Bilingual ILP
 * predict NER results with bilingual ILP
- Evaluation
 * evaluate the performance of CRF-based and Bi_ILP-based NER
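
The PMI step can be illustrated with a minimal Python sketch. It is not taken from the scripts in this repository: the function name `pmi_scores`, the input format (a list of Chinese/English tag pairs for aligned words), and the toy data are all assumptions made for illustration. It only shows the standard definition PMI(x, y) = log(p(x, y) / (p(x) p(y))) applied to co-occurring tags.

```python
import math
from collections import Counter


def pmi_scores(aligned_tag_pairs):
    """Return PMI for every (zh_tag, en_tag) pair seen in the input.

    aligned_tag_pairs is a hypothetical input format: one (zh_tag, en_tag)
    tuple per aligned word pair, with tags taken from the CRF output.
    """
    pair_counts = Counter(aligned_tag_pairs)
    zh_counts = Counter(zh for zh, _ in aligned_tag_pairs)
    en_counts = Counter(en for _, en in aligned_tag_pairs)
    total = len(aligned_tag_pairs)

    scores = {}
    for (zh, en), n in pair_counts.items():
        # p(x, y) / (p(x) * p(y)) simplifies to n * total / (count_x * count_y)
        scores[(zh, en)] = math.log(n * total / (zh_counts[zh] * en_counts[en]))
    return scores


# Toy data (made up): tags of aligned Chinese/English words.
pairs = [
    ("PER", "PER"), ("PER", "PER"), ("LOC", "LOC"),
    ("O", "O"), ("O", "O"), ("O", "O"),
    ("PER", "O"), ("O", "LOC"),
]
scores = pmi_scores(pairs)
for pair, value in sorted(scores.items()):
    print(pair, round(value, 3))
```

Tag pairs that co-occur more often than chance get a positive score, and pairs that co-occur less often than chance get a negative one. Scores of this kind are what a bilingual decoding step could use to reward or penalize tag combinations on aligned words.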

Languages

Perl 46.8%, Python 43.8%, Shell 9.4%