amir-zeldes / xrenner

eXternally configurable REference and Non Named Entity Recognizer

Reattach modifiers of first coordinate that appear after subsequent coordinate

amir-zeldes opened this issue

If we have a coordinated markable for which the dependency graph attaches a modifier to the first coordinate, but this modifier appears after a subsequent coordinate, we should surmise that the modification applies to the 'big' markable. For example, take these gold markables:

[[Sean Anderson] and [Sandy Anderson] of [Idaho]]

The dependency graph says the following:

1   Sean    _   NNP NNP _   2   nn  _   _
2   Anderson    _   NNP NNP _   0   root    _   _
3   and _   CC  CC  _   2   cc  _   _
4   Sandy   _   NNP NNP _   5   nn  _   _
5   Anderson    _   NNP NNP _   2   conj    _   _
6   of  _   IN  IN  _   2   prep    _   _
7   Idaho   _   NNP NNP _   6   pobj    _   _

Semantically, 'of Idaho' modifies the coordinate NP: both persons are from Idaho. If we leave the graph as is, xrenner will produce these wrong markables, with the span of 'Sean' going all the way to the end to include its 'of' modifier:

[[Sean Anderson and [Sandy Anderson] of [Idaho]]]
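To make the problem concrete, here is a minimal sketch (not xrenner's actual code) of how a markable span derived from the descendant subgraph ends up too wide. The `ROWS` tuples and the helper `subtree_ids` are hypothetical names introduced only for illustration. The sketch assumes the 'little' markable is built from all descendants of its head except the coordination dependents (`cc`, `conj`).

```python
from collections import defaultdict

# (id, form, head, deprel) taken from the parse above
ROWS = [
    (1, "Sean", 2, "nn"),
    (2, "Anderson", 0, "root"),
    (3, "and", 2, "cc"),
    (4, "Sandy", 5, "nn"),
    (5, "Anderson", 2, "conj"),
    (6, "of", 2, "prep"),
    (7, "Idaho", 6, "pobj"),
]


def subtree_ids(rows, head_id, skip_rels=frozenset()):
    """Collect head_id plus all of its descendants.

    Direct dependents whose relation is in skip_rels are left out,
    together with everything below them (assumed behaviour).
    """
    children = defaultdict(list)
    for tok_id, _form, head, rel in rows:
        children[head].append((tok_id, rel))

    found = {head_id}
    stack = [c for c, rel in children[head_id] if rel not in skip_rels]
    while stack:
        tok = stack.pop()
        found.add(tok)
        stack.extend(c for c, _rel in children[tok])
    return found


# 'Little' markable headed by token 2, coordination dependents skipped
little = subtree_ids(ROWS, 2, skip_rels={"cc", "conj"})
span = (min(little), max(little))
```

Because the 'of' PP (tokens 6 and 7) is a dependent of token 2, the set contains tokens 1, 2, 6 and 7, so the resulting span runs from token 1 to token 7 and swallows the second coordinate.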

Ideally we would make the modifier (the of-PP) attach only to the 'big markable' (the coordinate markable), while removing it from the descendant subgraph of the 'little markable' that covers only Sean. This should happen during 'big markable' construction, currently in xrenner_xrenner.process_sentence()
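One possible shape for this reattachment, as a rough sketch only: it is not taken from xrenner, and `split_coordinate_markable`, `descendants` and the `ROWS` layout are hypothetical. It assumes a modifier counts as trailing when it is a direct, non-coordination dependent of the first coordinate's head and its token id is greater than that of the first `conj` dependent.

```python
from collections import defaultdict

COORD_RELS = {"cc", "conj"}

# (id, form, head, deprel) taken from the parse above
ROWS = [
    (1, "Sean", 2, "nn"),
    (2, "Anderson", 0, "root"),
    (3, "and", 2, "cc"),
    (4, "Sandy", 5, "nn"),
    (5, "Anderson", 2, "conj"),
    (6, "of", 2, "prep"),
    (7, "Idaho", 6, "pobj"),
]


def descendants(children, tok_id):
    """Return tok_id together with every token below it."""
    found = {tok_id}
    stack = list(children[tok_id])
    while stack:
        tok = stack.pop()
        found.add(tok)
        stack.extend(children[tok])
    return found


def split_coordinate_markable(rows, head_id):
    """Return (little_ids, big_ids) for a coordination headed by head_id.

    Modifiers of the head that follow the first conjunct are assigned
    to the big markable only (assumed heuristic).
    """
    children = defaultdict(list)
    rel_of = {}
    for tok_id, _form, head, rel in rows:
        children[head].append(tok_id)
        rel_of[tok_id] = rel

    conj_ids = [c for c in children[head_id] if rel_of[c] == "conj"]
    if not conj_ids:
        # No coordination: there is no big markable to build
        return descendants(children, head_id), None

    first_conj = min(conj_ids)
    little = {head_id}
    big_only = set()
    for child in children[head_id]:
        sub = descendants(children, child)
        if rel_of[child] in COORD_RELS or child > first_conj:
            big_only |= sub  # coordination material or trailing modifier
        else:
            little |= sub    # modifier that precedes the coordination
    return little, little | big_only


little, big = split_coordinate_markable(ROWS, 2)
```

For the example sentence, `little` covers only tokens 1 and 2 ('Sean Anderson'), while `big` covers tokens 1 through 7, matching the gold markables. Whether token position alone is a sufficient criterion is an open design question. Punctuation or appositions attached to the first coordinate may need separate handling.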