eamsen / pythia

Semantic web oracle.

Entity pre-filtering (backend)

eamsen opened this issue

Add some basic entity pre-filtering at the backend to reduce network and frontend load.

Added entity filtering based on a minimum number of evidence documents (currently set to 2).
Filters out up to 40% of entities, roughly halving the load for the later passes, especially scoring.
Reduces recall from 0.63 to 0.55, but this is mostly due to the missing entity identification/clustering.
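
For illustration, the evidence-count filter described above could be sketched roughly as follows. This is not the project's actual code: the `Entity` type, its `evidence_docs` field, and the `prefilter` function are hypothetical names chosen for the example; only the threshold of 2 comes from this issue.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Set

# Threshold mentioned in the issue; everything else here is an assumption.
MIN_EVIDENCE_DOCS = 2


@dataclass
class Entity:
    """Hypothetical entity record: a name plus the IDs of documents mentioning it."""
    name: str
    evidence_docs: Set[str] = field(default_factory=set)


def prefilter(entities: Iterable[Entity],
              min_docs: int = MIN_EVIDENCE_DOCS) -> List[Entity]:
    """Keep only entities supported by at least `min_docs` distinct documents."""
    return [e for e in entities if len(e.evidence_docs) >= min_docs]


candidates = [
    Entity("Berlin", {"doc1", "doc2", "doc3"}),
    Entity("Pythia", {"doc2"}),            # single document: dropped
    Entity("Delphi", {"doc1", "doc4"}),
    Entity("Oracle", set()),               # no evidence: dropped
]

kept = prefilter(candidates)
print([e.name for e in kept])
```

Because the filter counts distinct documents per surface form, variants of the same entity that are not yet merged each fall below the threshold on their own, which is consistent with the recall drop being attributed to the missing identification/clustering step.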