shreyashub / BioFLAIR

BioFLAIR: Pretrained Pooled Contextualized Embeddings for Biomedical Sequence Labeling Tasks

BioFLAIR

This repository provides the code for fine-tuning BioFLAIR, a pretrained pooled contextualized embedding model for biomedical sequence labeling tasks such as named entity recognition (NER). For details, please refer to our paper, "BioFLAIR: Pretrained Pooled Contextualized Embeddings for Biomedical Sequence Labeling Tasks" (arXiv:1908.05760).

Installation

BioFLAIR is built using FLAIR. Check out their repo for more information.

$ pip install flair
$ git clone https://github.com/shreyashub/BioFLAIR.git

Datasets

We provide pre-processed versions of the following benchmark datasets:

  • NCBI
  • BC5CDR (complete, chemicals, diseases)
  • JNLPBA
  • Species-800
  • LINNAEUS
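
Before fine-tuning, it can help to inspect a dataset file. The sketch below assumes the pre-processed files use a CoNLL-style column layout (one token and its tag per line, whitespace-separated, with a blank line between sentences), which is the layout FLAIR's column readers expect; this README does not state the format, so check the files first. The function name read_column_file, the sample text, and the B-Disease/I-Disease tag names are invented for illustration.

```python
from collections import Counter


def read_column_file(lines):
    """Yield sentences as lists of (token, tag) pairs.

    Assumes a CoNLL-style layout: first column is the token, last column
    is the tag, and a blank line separates sentences.
    """
    sentence = []
    for line in lines:
        line = line.strip()
        if not line:
            # A blank line closes the current sentence, if any.
            if sentence:
                yield sentence
                sentence = []
            continue
        parts = line.split()
        sentence.append((parts[0], parts[-1]))
    # Emit the final sentence when the input does not end with a blank line.
    if sentence:
        yield sentence


# Invented sample, not taken from any of the datasets listed above.
sample = """Patients O
with O
cystic B-Disease
fibrosis I-Disease
. O

No O
symptoms O
. O
"""

sentences = list(read_column_file(sample.splitlines()))
tag_counts = Counter(tag for sent in sentences for _, tag in sent)
print(len(sentences), "sentences;", dict(tag_counts))
```

To inspect a real file, pass an open file handle instead of sample.splitlines().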

Fine-Tuning

Run fine_tune.py to start the fine-tuning process.

To select a dataset, set data_folder = 'data/ner/DATASET_NAME' in fine_tune.py, replacing DATASET_NAME with the name of the dataset's folder.
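
A mistyped DATASET_NAME will only surface once fine_tune.py tries to read the data. As a minimal sketch, the hypothetical helper below (resolve_data_folder is not part of this repository) builds the path under data/ner and reports which dataset folders exist if the requested one is missing. It assumes each dataset lives in its own subfolder of data/ner, as the data_folder setting suggests.

```python
from pathlib import Path


def resolve_data_folder(dataset_name, root="data/ner"):
    """Return root/dataset_name as a Path, or raise if the folder is missing.

    Hypothetical helper: assumes one subfolder per dataset under `root`.
    """
    root_path = Path(root)
    folder = root_path / dataset_name
    if not folder.is_dir():
        # List the subfolders that do exist to make the typo easy to spot.
        available = []
        if root_path.is_dir():
            available = sorted(p.name for p in root_path.iterdir() if p.is_dir())
        raise FileNotFoundError(
            f"{folder} not found; available datasets: {available}"
        )
    return folder


# Example use inside fine_tune.py (the folder name 'NCBI' is an assumption):
# data_folder = resolve_data_folder("NCBI")
```

Run from the repository root so that the relative path data/ner resolves correctly.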

Citation

@article{sharma2019bioflair,
  title={BioFLAIR: Pretrained Pooled Contextualized Embeddings for Biomedical Sequence Labeling Tasks},
  author={Sharma, Shreyas and Daniel Jr, Ron},
  journal={arXiv preprint arXiv:1908.05760},
  year={2019}
}

Contact

Please email your questions or comments to Shreyas Sharma (shreyas.rox101@gmail.com).
