BeWe11 / rasa_composite_entities

A Rasa NLU component for composite entities.

Problems with Un-ordered Entities (with Duckling)

ttlekich opened this issue

From a brief glance, it seems as though the composite entity extractor relies on the entity list being sorted by appearance in the text. Most of the time, the entities are in the order they appear in the text; however, Duckling seems to change this order.

Examples (what currently happens with rasa/duckling):

  • Say I have a composite entity pattern: wordA number wordB, where wordA and wordB are entities and number is a Duckling-parsable number. If the string wordA 30 wordB is parsed by Rasa with Duckling in the training pipeline, the composite entity does not get caught. Only the primitive entities wordA, wordB, and 30 are caught.

  • If I had the composite entity pattern: wordA wordB number (same constraints as above), and the string wordA wordB 30 was parsed by Rasa with Duckling, then the composite entity would get caught.

Duckling seems to put its parsed entities at the end of the entity array. Locally, I just sorted the entity array by start to get them back in order (in _find_composite_entities) - this fixes the issue above for me. I am not sure if this is the best way of fixing this issue or maybe I missed the underlying issue, but I'd be happy to make a PR if any changes are needed.
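For illustration, here is a minimal sketch of that local workaround. It assumes the entities are dicts with a start character offset, as in Rasa NLU output. The sample list, the extractor labels, and the helper name sort_entities_by_start are hypothetical and are not taken from the package's code.

```python
from operator import itemgetter

# Hypothetical entity list for the text "wordA 30 wordB".
# The Duckling number comes last even though it appears
# in the middle of the text.
entities = [
    {"entity": "wordA", "value": "wordA", "start": 0, "end": 5,
     "extractor": "CRFEntityExtractor"},      # illustrative label
    {"entity": "wordB", "value": "wordB", "start": 9, "end": 14,
     "extractor": "CRFEntityExtractor"},      # illustrative label
    {"entity": "number", "value": 30, "start": 6, "end": 8,
     "extractor": "DucklingHTTPExtractor"},   # illustrative label
]


def sort_entities_by_start(entities):
    """Return a new list of entities ordered by their start offset."""
    return sorted(entities, key=itemgetter("start"))


ordered = sort_entities_by_start(entities)
order = [e["entity"] for e in ordered]
print(order)
```

With the list back in text order, a pattern such as wordA number wordB can be matched against the sequence of entity names.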

Thank you!

Hey @ttlekich,

thank you for mentioning this behavior, I'd consider it a bug. I've fixed it in e5c3109 by sorting the entities by their start value before processing them.

Version 1.0.2 of the package contains the fix; just run pip install -U rasa-composite-entities

Awesome, thank you!