StraysWonderland / EntityResolver

A simple python entity-resolver sample for the CORA dataset

EntityResolver

A simple Python entity-resolver sample for the CORA dataset. It resolves possible duplicate publications by comparing their authors and the title tag.
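As an illustration of this kind of comparison, here is a minimal sketch of a pairwise duplicate check. It is not the repository's actual code: the record layout (dicts with `author` and `title` keys), the helper names `similarity` and `is_duplicate`, the use of `difflib.SequenceMatcher`, and the 0.85 threshold are all assumptions made for the example.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a similarity ratio in [0, 1], ignoring case and extra whitespace."""
    norm_a = " ".join(a.lower().split())
    norm_b = " ".join(b.lower().split())
    return SequenceMatcher(None, norm_a, norm_b).ratio()


def is_duplicate(rec_a, rec_b, threshold=0.85):
    """Flag two records as duplicates when both authors and title are similar.

    The threshold is a hypothetical value chosen for illustration only.
    """
    return (
        similarity(rec_a["title"], rec_b["title"]) >= threshold
        and similarity(rec_a["author"], rec_b["author"]) >= threshold
    )


# Hypothetical records, for illustration only.
a = {"author": "Y. Freund and R. Schapire",
     "title": "Experiments with a new boosting algorithm"}
b = {"author": "y. freund  and r. schapire",
     "title": "Experiments with a New Boosting Algorithm"}
c = {"author": "L. Breiman", "title": "Bagging predictors"}

print(is_duplicate(a, b))
print(is_duplicate(a, c))
```

Requiring both fields to match tends to favor precision over recall, since a pair with a noisy author string is rejected even when the titles agree.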

Results

  • 72125 duplicates in gold-standard
  • 20748 retrieved duplicates

  • True positives: 16672
  • True negatives: 271103
  • False positives: 4076
  • False negatives: 55453

  • Precision: 0.80
  • Recall: 0.23
  • F1 Score: 0.359
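
The reported scores can be reproduced from the counts above. The sketch below uses the standard definitions of precision, recall, and F1. The function name `precision_recall_f1` is made up for this example and is not taken from the repository.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp)  # share of retrieved duplicates that are correct
    recall = tp / (tp + fn)     # share of gold-standard duplicates that were found
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Counts taken from the results listed above.
p, r, f1 = precision_recall_f1(tp=16672, fp=4076, fn=55453)
print(f"Precision: {p:.2f}, Recall: {r:.2f}, F1: {f1:.3f}")
# prints "Precision: 0.80, Recall: 0.23, F1: 0.359"
```

Note that `tp + fp` equals the 20748 retrieved duplicates and `tp + fn` equals the 72125 duplicates in the gold standard, so the counts are internally consistent.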

Languages

  • Jupyter Notebook: 72.9%
  • Python: 27.1%