asyml / ForteHealth

The project is in the incubation stage and still under development. ForteHealth is a flexible and powerful ML workflow builder for biomedical and clinical scenarios. This is part of the CASL project: http://casl-project.ai/

Optimize index finding in scispacy processor

nikhilranjan7 opened this issue

Is your feature request related to a problem? Please describe.
Optimize the find_index function in the scispacy processor: it currently scans the whole sentence once for each item.

Describe the solution you'd like
Accumulate all the items from hearst_patterns first, then iterate over them while traversing the input pack text once. This way we don't have to traverse the whole input pack text for each item.
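
A minimal sketch of that idea, not the actual ForteHealth code: the function name `find_spans_single_pass` and its signature are hypothetical, and it assumes the accumulated items are plain strings listed in the order they appear in the text. Each search resumes where the previous match ended, so text that has already been matched is not rescanned.

```python
from typing import List, Optional, Tuple


def find_spans_single_pass(
    text: str, items: List[str]
) -> List[Optional[Tuple[int, int]]]:
    """Return (start, end) character offsets for each item, or None if absent.

    Assumption: `items` are ordered by their position in `text`.
    """
    spans: List[Optional[Tuple[int, int]]] = []
    cursor = 0  # position in `text` up to which items have been matched
    for item in items:
        # Search only from the cursor onward instead of from index 0.
        start = text.find(item, cursor)
        if start == -1:
            # Item not found after the cursor; leave the cursor unchanged.
            spans.append(None)
            continue
        end = start + len(item)
        spans.append((start, end))
        cursor = end
    return spans
```

Note that a missing item still causes a scan from the cursor to the end of the text, so the single-pass behaviour holds only when every item is present and in order.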

Not needed anymore, as we are no longer doing any search or iteration.