yoseflaw / nerindo

Named Entity Recognition with BiLSTM, CRF, and Attention-based models implemented in PyTorch for Indonesian News.

Home Page: http://nerindo-simple.herokuapp.com

Nerindo

Named Entity Recognition (NER) for Bahasa Indonesia with PyTorch.

Corpus for NER:

The step-by-step implementation in Google Colab is indexed here.

The fine-tuned Indonesian word embeddings (id_ft.bin) are available here. They are based on the word embeddings trained in indonesian-word-embedding.

Included configurations

  1. BiLSTM
  2. BiLSTM + Word Embeddings
  3. BiLSTM + Word Embeddings + Char Embeddings (CNN)
  4. BiLSTM + Word Embeddings + Char Embeddings (CNN) + Attention Layer
  5. Transformer (simplified BERT) + Word Embeddings + Char Embeddings (CNN)
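
As a rough illustration of how configuration 3 fits together, the sketch below concatenates word embeddings with character-level CNN features and feeds them to a BiLSTM. It is not the repository's code: the class name `BiLSTMCharCNNTagger`, all layer sizes, and the random input tensors are assumptions chosen only to keep the example small.

```python
import torch
import torch.nn as nn


class BiLSTMCharCNNTagger(nn.Module):
    """Hypothetical sketch: word embeddings + char-CNN features -> BiLSTM -> tag scores."""

    def __init__(self, word_vocab, char_vocab, num_tags,
                 word_dim=50, char_dim=16, char_filters=24,
                 kernel_size=3, hidden=64):
        super().__init__()
        # Index 0 is reserved for padding in both vocabularies (an assumption).
        self.word_emb = nn.Embedding(word_vocab, word_dim, padding_idx=0)
        self.char_emb = nn.Embedding(char_vocab, char_dim, padding_idx=0)
        self.char_cnn = nn.Conv1d(char_dim, char_filters, kernel_size,
                                  padding=kernel_size // 2)
        self.lstm = nn.LSTM(word_dim + char_filters, hidden,
                            batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, num_tags)

    def forward(self, words, chars):
        # words: (batch, seq_len), chars: (batch, seq_len, word_len)
        b, s, w = chars.shape
        c = self.char_emb(chars.reshape(b * s, w))   # (b*s, word_len, char_dim)
        c = self.char_cnn(c.transpose(1, 2))         # (b*s, char_filters, word_len)
        c = c.max(dim=2).values.reshape(b, s, -1)    # max-pool over characters
        x = torch.cat([self.word_emb(words), c], dim=2)
        h, _ = self.lstm(x)                          # (batch, seq_len, 2 * hidden)
        return self.out(h)                           # per-token tag scores


# Toy usage with random token and character ids.
model = BiLSTMCharCNNTagger(word_vocab=100, char_vocab=40, num_tags=5)
words = torch.randint(1, 100, (2, 7))
chars = torch.randint(1, 40, (2, 7, 10))
logits = model(words, chars)
```

In a setup like this, pretrained vectors could be loaded into the word embedding layer with `nn.Embedding.from_pretrained`, and an attention layer could be placed between the BiLSTM output and the final linear layer.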

Learning rate finder

Automatic learning rate finder based on pytorch-lr-finder.

Note: because the learning rate is selected automatically from the same search range for every model, the selected value may not be the best one for a given model. To see the best learning rates, check the Google Colab version.
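
The general idea behind a learning rate range test can be sketched without any framework: increase the learning rate exponentially at every step, record the loss, stop once the loss diverges, and pick the rate where the loss falls fastest. The snippet below is a toy illustration on a one-dimensional quadratic loss, not the pytorch-lr-finder API; the function names, the divergence threshold, and the toy loss are all assumptions.

```python
import math


def lr_range_test(grad_fn, loss_fn, w0, start_lr=1e-4, end_lr=10.0,
                  num_iter=50, diverge_th=5.0):
    """Sweep the learning rate exponentially and record the loss after each step."""
    gamma = (end_lr / start_lr) ** (1.0 / (num_iter - 1))  # per-step multiplier
    w, lr = w0, start_lr
    lrs, losses = [], []
    best = float("inf")
    for _ in range(num_iter):
        w = w - lr * grad_fn(w)          # one plain gradient step at the current rate
        loss = loss_fn(w)
        lrs.append(lr)
        losses.append(loss)
        best = min(best, loss)
        if loss > diverge_th * best:     # stop once the loss has clearly blown up
            break
        lr *= gamma
    return lrs, losses


def suggest_lr(lrs, losses):
    """Return the rate at the start of the steepest loss drop per unit of log(lr)."""
    slopes = [
        (losses[i + 1] - losses[i]) / (math.log(lrs[i + 1]) - math.log(lrs[i]))
        for i in range(len(lrs) - 1)
    ]
    steepest = min(range(len(slopes)), key=slopes.__getitem__)
    return lrs[steepest]


# Toy problem (an assumption): loss(w) = 2 * w^2, so the gradient is 4 * w.
lrs, losses = lr_range_test(grad_fn=lambda w: 4.0 * w,
                            loss_fn=lambda w: 2.0 * w * w,
                            w0=1.0)
suggested = suggest_lr(lrs, losses)
```

Because every model is swept over the same range, the suggested value depends on where each model's loss curve happens to drop fastest inside that range, which is why a hand-tuned rate can still do better.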

Example output:

[Image: LR Finder Example Output]

Final result

[Image: LR Finder Example Output]

Main reference

Gunawan, W., Suhartono, D., Purnomo, F., & Ongko, A. (2018). Named-entity recognition for Indonesian language using bidirectional LSTM-CNNs. Procedia Computer Science, 135, 425-432.
