asyml / ForteHealth

The project is in the incubation stage and still under development. ForteHealth is a flexible and powerful ML workflow builder for biomedical and clinical scenarios. This is part of the CASL project: http://casl-project.ai/


Fix abbreviation detector in Scispacy processor

nikhilranjan7 opened this issue

Describe the bug
The abbreviation detection processor stores incorrect begin and end indices for detected abbreviations.

To Reproduce
Steps to reproduce the behavior:
Run the Scispacy processor for abbreviation detection.

Expected behavior
Example:

"Spinal and bulbar muscular atrophy (SBMA) is an
inherited motor neuron disease caused by the expansion
of a polyglutamine tract within the androgen receptor (AR)."

Expected: long_form = Spinal and bulbar muscular atrophy
Actual: the long_form stored in tmp_abrv.text is only a single character, because of incorrect indexing.
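
To make the expected behavior concrete, here is a minimal sketch in plain Python of the character offsets the example sentence should produce. It does not use the ForteHealth or Scispacy APIs; `char_span` is a hypothetical helper written only for this illustration, and it assumes that begin and end are meant to be character offsets into the document text, which the issue does not state explicitly.

```python
# Example sentence from the issue, joined into a single string.
text = (
    "Spinal and bulbar muscular atrophy (SBMA) is an "
    "inherited motor neuron disease caused by the expansion "
    "of a polyglutamine tract within the androgen receptor (AR)."
)


def char_span(text: str, phrase: str) -> tuple[int, int]:
    """Hypothetical helper: return (begin, end) character offsets of the
    first occurrence of `phrase` in `text`, with `end` exclusive."""
    begin = text.find(phrase)
    if begin == -1:
        raise ValueError(f"{phrase!r} not found in text")
    return begin, begin + len(phrase)


long_form = "Spinal and bulbar muscular atrophy"
begin, end = char_span(text, long_form)

# Slicing the text with correct offsets must return the whole long form,
# not a single character.
assert text[begin:end] == long_form
assert end - begin == len(long_form)
```

A check of this shape, slicing the document text with the stored begin and end and comparing the result to the long form, could serve as a regression test once the processor is fixed.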

@nikhilranjan7 It would be great if you could add some description to these issues, both this one and the index search optimization one. It's helpful for anyone tracking an old issue or looking for an issue to work on, based on their bandwidth, etc.