asyml / ForteHealth

The project is in the incubation stage and still under development. ForteHealth is a flexible and powerful ML workflow builder for biomedical and clinical scenarios. This is part of the CASL project: http://casl-project.ai/

Implement a scispaCy processor as wrapper

Piyush13y opened this issue

Is your feature request related to a problem? Please describe.
Two new tasks are to be incorporated into the medical pipeline: Abbreviation Detection and Hyponym Detection. We use scispaCy to implement both tasks, so we need a processor that wraps all the scispaCy functionality the pipeline might require.

Describe the solution you'd like
The task is to develop a processor called ScispacyProcessor that wraps the scispaCy methods used for these two tasks, and for any future task that relies on scispaCy functionality. Beyond writing the wrapper, we also need to generate the new ontologies that it will use. Please ping me on Slack, or add a comment here, once you start working on this so I can update the ontologies required for these tasks here.
(Ontology generation will be handled in #25.)

  • Annotation
    • Abbreviation
      • long_form
  • Link
    • Hyponym
      • hyponym_link
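
To make the shape of these entries concrete, here is a minimal sketch using plain Python dataclasses. It is not Forte's ontology API and not the code that #25 will generate: the `Annotation` and `Link` classes below are simplified stand-ins, the `begin`/`end` and `parent`/`child` fields are assumptions, and treating `hyponym_link` as a string label for the matched pattern is also an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Annotation:
    """Stand-in for a span annotation: character offsets into the text."""
    begin: int
    end: int


@dataclass
class Abbreviation(Annotation):
    """A short form found in the text, plus its expanded long form."""
    long_form: Optional[str] = None


@dataclass
class Link:
    """Stand-in for a directed relation between two annotations."""
    parent: Annotation
    child: Annotation


@dataclass
class Hyponym(Link):
    """Relates a general term (parent) to a more specific term (child)."""
    # Assumption: a label for the pattern that produced the link.
    hyponym_link: Optional[str] = None


text = "Spinal and bulbar muscular atrophy (SBMA) is an inherited disease."
start = text.index("SBMA")
abbr = Abbreviation(
    begin=start,
    end=start + len("SBMA"),
    long_form="Spinal and bulbar muscular atrophy",
)
```

The real entries would be defined through Forte's ontology generation rather than written by hand; the sketch only shows which attribute hangs off which parent type.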

While implementing this processor, you can refer to the NegationContextAnalyzer processor for the practices and design principles we should follow. All processors in Forte share a similar design, since each is a modular component of a Forte pipeline. If you have any questions, feel free to add comments in this issue.

You can also refer to the following link for help on how scispaCy's abbreviation detection and hyponym detection are used out of the box.
https://pythonlang.dev/repo/allenai-scispacy/
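
As a rough illustration of the out-of-the-box usage, the sketch below attaches both scispaCy detectors to a pipeline and flattens their results into plain dictionaries. The helper names (`build_pipeline`, `collect_abbreviations`, `collect_hyponyms`) and the dictionary keys are hypothetical and chosen only to echo the ontology entries above; the choice of the `en_core_sci_sm` model is also an assumption. This is not the ScispacyProcessor itself, just the scispaCy side that such a processor might wrap.

```python
def build_pipeline(model_name="en_core_sci_sm"):
    """Load a scispaCy model and add both detectors.

    Requires spaCy, scispaCy, and the named model to be installed.
    """
    import spacy
    # Importing these modules registers the two pipe factories with spaCy.
    from scispacy.abbreviation import AbbreviationDetector  # noqa: F401
    from scispacy.hyponym_detector import HyponymDetector  # noqa: F401

    nlp = spacy.load(model_name)
    nlp.add_pipe("abbreviation_detector")
    nlp.add_pipe("hyponym_detector", last=True, config={"extended": False})
    return nlp


def collect_abbreviations(doc):
    """Turn doc._.abbreviations into dicts with offsets and the long form."""
    return [
        {
            "begin": span.start_char,
            "end": span.end_char,
            "text": span.text,
            "long_form": span._.long_form.text,
        }
        for span in doc._.abbreviations
    ]


def collect_hyponyms(doc):
    """Turn doc._.hearst_patterns (predicate, general, specific) into dicts."""
    return [
        {
            "hyponym_link": predicate,
            "general": general.text,
            "specific": specific.text,
        }
        for predicate, general, specific in doc._.hearst_patterns
    ]
```

With the model installed, `doc = build_pipeline()(text)` followed by the two `collect_*` calls would give the values a processor could then write into the data pack as Abbreviation and Hyponym entries.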

commented

Hi Piyush, I'm starting to implement this feature now. Could you help generate the ontology? Thanks.