allenai / scispacy

A full spaCy pipeline and models for scientific/biomedical documents.

Home Page: https://allenai.github.io/scispacy/


NER-classes: Where to find definitions?

raven44099 opened this issue

Hi, I'm an enthusiastic user of your library!

On your website you state that there is a GGP class in this dataset.

| Model | F1 | Entity Types |
|---|---|---|
| en_ner_craft_md | 77.56 | GGP, SO, TAXON, CHEBI, GO, CL |

However, the original CRAFT paper doesn't list this class:

| Terminology | Total Annotations |
|---|---|
| ChEBI | 8,137 |
| CL | 5,760 |
| Entrez Gene | 12,277 |
| GO BP | 16,184 |
| GO CC | 8,354/4,707 |
| GO MF | 4,062 |
| NCBITaxon | 7,449 |
| PRO | 15,594 |
| SO | 22,090 |
| All | 99,907 |

I tried to find the mapping, but was not successful. Where can I find information about the definitions of the classes used for your NER models?

  • en_ner_craft_md
  • en_ner_jnlpba_md
  • en_ner_bc5cdr_md
  • en_ner_bionlp13cg_md

I believe this should contain the information you are looking for: https://github.com/cambridgeltl/MTL-Bioinformatics-2016/blob/master/Additional%20file%201.pdf

GGP specifically stands for gene/gene product.
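
For quick reference, here is a minimal Python sketch of a label glossary for `en_ner_craft_md`. Only the GGP gloss comes from the reply above; the other glosses are the usual expansions of the CRAFT terminology names. The `PAPER_TO_LABEL` mapping (in particular, folding both Entrez Gene and PRO into GGP) is an assumption and should be checked against the linked PDF. The names `CRAFT_LABELS`, `PAPER_TO_LABEL`, `describe`, and `model_labels` are hypothetical helpers, not part of scispaCy.

```python
# Glosses for the labels listed for en_ner_craft_md.
# GGP is taken from the reply above; the rest are the standard
# expansions of the CRAFT terminology names.
CRAFT_LABELS = {
    "GGP": "gene or gene product",
    "SO": "Sequence Ontology term",
    "TAXON": "NCBI Taxonomy organism",
    "CHEBI": "Chemical Entities of Biological Interest (ChEBI) term",
    "GO": "Gene Ontology term",
    "CL": "Cell Ontology term",
}

# ASSUMPTION: how the terminologies in the CRAFT paper's table fold into
# the model's labels. Verify against the linked supplementary file.
PAPER_TO_LABEL = {
    "ChEBI": "CHEBI",
    "CL": "CL",
    "Entrez Gene": "GGP",
    "GO BP": "GO",
    "GO CC": "GO",
    "GO MF": "GO",
    "NCBITaxon": "TAXON",
    "PRO": "GGP",
    "SO": "SO",
}


def describe(label):
    """Return a short gloss for a label, or a placeholder if it is unknown."""
    return CRAFT_LABELS.get(label, "unknown label")


def model_labels(model_name="en_ner_craft_md"):
    """Return the labels that a loaded model's NER component can emit.

    Requires spaCy and the named scispaCy model to be installed, so the
    import is deferred until the function is called.
    """
    import spacy

    nlp = spacy.load(model_name)
    return sorted(nlp.get_pipe("ner").labels)
```

If the model is installed, calling `model_labels()` and passing each result to `describe` gives a quick check that the glossary covers every label the model actually emits. The same pattern should work for the other three models once their label definitions are taken from the linked file.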