RelationExtraction

-- data_prep.py: reads and loads necessary data for the classifier.

-- features.py: contains "Mention" and "MentionPair" classes to do feature engineering.

-- run.py: contains the main function to run the whole system.

-- data folder: contains the original train/dev/test files, plus extra knowledge and tree data.

-- model, train.txt, test.txt, test.tagged: outputs of the system from its best-performing run.

Follow the instructions below to reproduce the results. The final score should be 42.89%.

1. Get on a department machine.
2. Clone the GitHub repo https://github.com/jiaeyan/RelationExtraction, change into the repo directory, and run the following commands (to check the final performance directly, skip to STEP 7):
3. python run.py --task train # to generate the train feature file
4. python run.py --task test # to generate the test feature file
5. sh mallet-maxent-classifier.sh -train -model=model -gold=train.txt
6. sh mallet-maxent-classifier.sh -classify -model=model -input=test.txt > test.tagged
7. python relation-evaluator.py data/rel-testset.gold test.tagged

Note: Step 6 takes a long time. Please ignore the "ClassNotFound" error.
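
The five commands above (the two feature-generation runs, Mallet training, classification, and evaluation) can also be chained from one script. The following is a minimal sketch, not part of the repository: the names STEPS, format_step, and run_steps are hypothetical, and only the command lines themselves are taken from the instructions. It defaults to a dry run that prints the commands. Stopping on a non-zero exit code (check=True) is an assumption; the README does not say whether the "ClassNotFound" error changes the exit status.

```python
import shlex
import subprocess

# The five commands from the instructions, in order.
# Each entry is (argv, stdout_file); stdout_file is None unless the
# instructions redirect the command's output to a file.
STEPS = [
    (["python", "run.py", "--task", "train"], None),
    (["python", "run.py", "--task", "test"], None),
    (["sh", "mallet-maxent-classifier.sh", "-train",
      "-model=model", "-gold=train.txt"], None),
    (["sh", "mallet-maxent-classifier.sh", "-classify",
      "-model=model", "-input=test.txt"], "test.tagged"),
    (["python", "relation-evaluator.py",
      "data/rel-testset.gold", "test.tagged"], None),
]


def format_step(argv, stdout_file):
    """Render one step as the shell line it corresponds to."""
    line = " ".join(shlex.quote(arg) for arg in argv)
    return f"{line} > {stdout_file}" if stdout_file else line


def run_steps(steps, dry_run=True):
    """Return the shell lines; unless dry_run is set, also execute them in order."""
    lines = []
    for argv, stdout_file in steps:
        lines.append(format_step(argv, stdout_file))
        if dry_run:
            continue
        if stdout_file:
            # Equivalent of "command > file" in the shell.
            with open(stdout_file, "w") as out:
                subprocess.run(argv, stdout=out, check=True)
        else:
            # Assumption: a non-zero exit code should abort the pipeline.
            subprocess.run(argv, check=True)
    return lines


if __name__ == "__main__":
    # Dry run only: print the commands without executing anything.
    for line in run_steps(STEPS, dry_run=True):
        print(line)
```

To actually execute the pipeline, call run_steps(STEPS, dry_run=False) from inside the cloned repo. To check only the final performance, pass STEPS[-1:] instead.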

About

Implementation of relation extraction between entities in text: feature engineering on top of the Maximum Entropy template provided by Mallet.
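
To illustrate what feature engineering over mention pairs can look like, here is a small self-contained sketch. It is not the code in features.py: the classes below only borrow the names Mention and MentionPair, and every field, feature name, the "REL" label, and the space-separated output line are assumptions made for illustration. The actual feature set and the file format expected by mallet-maxent-classifier.sh are defined by the repository.

```python
from dataclasses import dataclass


@dataclass
class Mention:
    """Hypothetical entity mention; the fields are assumptions."""
    text: str         # surface string of the mention
    entity_type: str  # coarse entity type, e.g. "PER" or "ORG"
    start: int        # index of the first token
    end: int          # index one past the last token


@dataclass
class MentionPair:
    """Hypothetical candidate relation between two mentions."""
    left: Mention
    right: Mention
    label: str = "no_rel"  # assumed placeholder for "no relation"

    def features(self):
        # Number of tokens between the two mentions, capped at 5.
        gap = max(0, self.right.start - self.left.end)
        return [
            f"type1={self.left.entity_type}",
            f"type2={self.right.entity_type}",
            f"types={self.left.entity_type}-{self.right.entity_type}",
            # Last token of each mention, lowercased, as a crude head word.
            f"head1={self.left.text.split()[-1].lower()}",
            f"head2={self.right.text.split()[-1].lower()}",
            f"gap={min(gap, 5)}",
        ]

    def to_line(self):
        # Assumed layout: label first, then space-separated features.
        return " ".join([self.label] + self.features())


# Tokens: Bill(0) Gates(1) founded(2) Microsoft(3)
pair = MentionPair(
    Mention("Bill Gates", "PER", 0, 2),
    Mention("Microsoft", "ORG", 3, 4),
    label="REL",
)
print(pair.to_line())
```

Each feature is a plain string, so a maximum entropy classifier can treat it as a binary indicator. Combined features such as types=PER-ORG let the model weight type pairs directly instead of relying on the two single-type features alone.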
