yutanakamura-tky / ebmnlp

0. What is this?

This is a solution to the EBM-NLP task proposed in an ACL 2018 publication by Benjamin Nye et al.

The method is named entity recognition (NER) using BioELMo + CRF, implemented in PyTorch.

1. Preparation

  1. Clone this repository:
$ git clone https://github.com/iBotamon/ebmnlp.git
  2. Create and activate a virtual environment:
$ cd ebmnlp
$ python -m venv .
$ source bin/activate
  3. Install the necessary packages:
$ pip install --upgrade pip
$ pip install -r requirements.txt
  4. Download the following files:

Alternatively, you can download them by running this:

$ bash get_pretrained_models.sh

2. How to use BioELMo + CRF model

2-1. Use via command line

  1. Prepare a text file that contains an RCT abstract (e.g., sample.txt).

  2. Run like this:

$ python ebmnlp.py TEXT_FILE_NAME
  3. The NER tagging result is printed to standard output, one tab-separated tag and token per line:
I-I	Remdesivir
O	in
I-P	adults
I-P	with
I-P	severe
I-P	COVID-19
I-P	:
O	a
O	randomised
O	,
O	double-blind
O	,
O	placebo-controlled
O	,
O	multicentre
  4. To save the result to a file, run:
$ python ebmnlp.py TEXT_FILE_NAME OUTPUT_FILE_NAME
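The tagged output shown above is easy to post-process. The following is a minimal sketch, not part of this repository, that reads tab-separated `TAG<tab>TOKEN` lines and merges consecutive tokens that share the same tag into spans. The function names `parse_tagged_lines` and `group_spans` are hypothetical, and the sketch assumes that `O` marks tokens outside any entity, as the sample output suggests.

```python
from itertools import groupby

# A few lines copied from the sample output above (tag, tab, token).
SAMPLE = (
    "I-I\tRemdesivir\n"
    "O\tin\n"
    "I-P\tadults\n"
    "I-P\twith\n"
    "I-P\tsevere\n"
    "I-P\tCOVID-19\n"
    "I-P\t:\n"
    "O\ta\n"
)


def parse_tagged_lines(text):
    """Return a list of (tag, token) pairs from tab-separated lines."""
    pairs = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        tag, token = line.split("\t", 1)
        pairs.append((tag, token))
    return pairs


def group_spans(pairs, outside_tag="O"):
    """Merge runs of consecutive tokens with the same tag into spans.

    Tokens tagged with `outside_tag` are dropped (assumption: "O" means
    the token belongs to no entity).
    """
    spans = []
    for tag, run in groupby(pairs, key=lambda pair: pair[0]):
        if tag == outside_tag:
            continue
        spans.append((tag, " ".join(token for _, token in run)))
    return spans


spans = group_spans(parse_tagged_lines(SAMPLE))
print(spans)
```

To process a saved result, read OUTPUT_FILE_NAME as text and pass its contents to `parse_tagged_lines` in place of `SAMPLE`.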

2-2. Use via Web browser

  1. Run this:
$ bash run_flask.sh
  2. Open localhost:5000 in your web browser.

  3. You can then use the PIO identification system interactively.

Flask demo image

3. How to train BioELMo + CRF model yourself

  1. Obtain the EBM-NLP dataset ebm_nlp_1_00.tar.gz from the dataset authors' repository.

  2. Extract ebm_nlp_1_00.tar.gz into the official directory so that the layout looks like this:

- models
- templates
- official
  └ ebm_nlp_1_00
    └ annotations
      └ ..
    └ documents
      └ ..
  3. Run this:
$ python ebmnlp_bioelmo_crf.py
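The extraction in step 2 can also be done from Python, which makes it easy to confirm the layout before training. This is a minimal sketch rather than a script shipped with the repository: `extract_dataset` and `missing_subdirs` are hypothetical helper names, and it assumes the archive unpacks to a top-level ebm_nlp_1_00 directory, as the tree above implies.

```python
import tarfile
from pathlib import Path


def extract_dataset(archive, target_dir="official"):
    """Extract the dataset archive into target_dir and return the dataset root.

    Assumption: the archive contains a top-level "ebm_nlp_1_00" directory.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(target)
    return target / "ebm_nlp_1_00"


def missing_subdirs(dataset_root, expected=("annotations", "documents")):
    """Return the expected subdirectories that are absent under dataset_root."""
    root = Path(dataset_root)
    return [name for name in expected if not (root / name).is_dir()]
```

For example, calling `extract_dataset("ebm_nlp_1_00.tar.gz")` from the repository root and then `missing_subdirs` on the returned path should give an empty list when the layout matches the tree above.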

You can specify the CUDA device number like this:

$ python ebmnlp_bioelmo_crf.py --cuda 3
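For readers unfamiliar with this kind of flag, the sketch below shows one common way a numeric `--cuda` option is turned into a PyTorch-style device name such as `cuda:3`. It is illustrative only and is not taken from ebmnlp_bioelmo_crf.py: the helper names are hypothetical, and falling back to the CPU when the flag is omitted is an assumption, so the actual script may behave differently.

```python
import argparse


def build_parser():
    """Build a parser with a --cuda option (hypothetical, for illustration)."""
    parser = argparse.ArgumentParser(description="Illustrative training entry point")
    parser.add_argument(
        "--cuda",
        type=int,
        default=None,
        help="CUDA device number; CPU is used when omitted (assumed default)",
    )
    return parser


def device_string(cuda):
    """Map a device number to a device name in the form PyTorch accepts."""
    return "cpu" if cuda is None else f"cuda:{cuda}"


args = build_parser().parse_args(["--cuda", "3"])
print(device_string(args.cuda))  # prints "cuda:3"
```

A string in this form can be passed to `torch.device` to select the corresponding GPU.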
