asyml / ForteHealth

The project is in the incubation stage and still under development. ForteHealth is a flexible and powerful ML workflow builder for biomedical and clinical scenarios. This is part of the CASL project: http://casl-project.ai/

Create an example for building a bio NER pipeline

Leolty opened this issue

Describe the solution you'd like

ForteHealth incorporates ScispaCy for bio NER annotation. As the very first example, I think we can simply create a pipeline for bio NER annotation. The demo from scispacy is here

In scispacy, the model en_ner_bc5cdr_md annotates Disease and Chemical entities, while the model en_ner_bionlp13cg_md annotates Cancer, Organ, etc. We can also show this by building the pipeline with different configurations.
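
To make the "different configuration" idea concrete, here is a rough sketch that calls scispacy directly through spaCy rather than through ForteHealth. The `MODEL_CONFIGS` dictionary, its keys, and the `annotate` helper are hypothetical names made up for illustration. The label lists only repeat the entity types mentioned above and are not exhaustive. Running `annotate` assumes scispacy and the chosen model package are installed.

```python
from typing import Dict, List, Tuple

# Hypothetical configuration table: one entry per scispacy model.
# "example_labels" only lists the entity types mentioned in this issue.
MODEL_CONFIGS: Dict[str, Dict[str, object]] = {
    "chemical_disease": {
        "model_name": "en_ner_bc5cdr_md",
        "example_labels": ["DISEASE", "CHEMICAL"],
    },
    "cancer_genetics": {
        "model_name": "en_ner_bionlp13cg_md",
        "example_labels": ["CANCER", "ORGAN"],
    },
}


def annotate(text: str, config_key: str) -> List[Tuple[str, str]]:
    """Return (entity text, entity label) pairs for the configured model."""
    # Imported lazily so the configuration table can be inspected
    # even when spaCy and the scispacy models are not installed.
    import spacy

    nlp = spacy.load(MODEL_CONFIGS[config_key]["model_name"])
    doc = nlp(text)
    return [(ent.text, ent.label_) for ent in doc.ents]
```

Switching from `annotate(text, "chemical_disease")` to `annotate(text, "cancer_genetics")` changes only the configuration key, which is the behaviour the example pipeline could demonstrate.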

Possible included components:

  1. Sentence Segmenter
  2. Tokenizer
  3. Bio NER Tagger
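
As a rough illustration of how these three components could chain together, here is a toy sketch in plain Python. It does not use the ForteHealth or scispacy APIs: the regex-based segmenter and tokenizer, the `TOY_LEXICON` lookup table, and every function name are stand-ins invented for this example, and a real pipeline would replace the lookup with a scispacy model.

```python
import re
from typing import Dict, List, Tuple

# Toy stand-in for a trained NER model: lowercase token -> entity label.
TOY_LEXICON: Dict[str, str] = {"aspirin": "CHEMICAL", "headache": "DISEASE"}


def segment_sentences(text: str) -> List[str]:
    """Step 1: naive split after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence: str) -> List[str]:
    """Step 2: split into word tokens and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", sentence)


def tag_entities(tokens: List[str], lexicon: Dict[str, str]) -> List[Tuple[str, str]]:
    """Step 3: label every token that appears in the lexicon."""
    return [(t, lexicon[t.lower()]) for t in tokens if t.lower() in lexicon]


def run_pipeline(text: str) -> List[Tuple[str, List[Tuple[str, str]]]]:
    """Run the three steps in order and collect entities per sentence."""
    results = []
    for sentence in segment_sentences(text):
        tokens = tokenize(sentence)
        results.append((sentence, tag_entities(tokens, TOY_LEXICON)))
    return results


if __name__ == "__main__":
    for sentence, entities in run_pipeline("Aspirin relieves headache. It is cheap."):
        print(sentence, entities)
```

Each stage consumes only the output of the previous one, so any of them can be swapped for a real component without touching the others.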