grit-id / nergrit-corpus

Open source corpus for Indonesian Named Entity Recognition, Sentiment Analysis and Statement Extraction. https://ner.grit.id/

NERGRIT CORPUS

MUST READ

If you use this corpus, you must include the list of NERGRIT contributors and the LICENSE. The corpus may be used for both learning and commercial purposes.

NERGRIT

NERGRIT is a set of machine-learning-based NLP tools for Indonesian Named Entity Recognition, Statement Extraction, and Sentiment Analysis. This is the corpus we built. Models trained on it with GloVe embeddings reach the following approximate F1 scores:

  • Named Entity Recognition: ~80%
  • Statement Extraction: ~70%
  • Sentiment Analysis: ~75%
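
For reference, F1 is the harmonic mean of precision and recall. The sketch below computes it from raw counts. The function name and the counts in the usage line are made up for illustration; they are not NERGRIT results, and the project's own evaluation code may differ.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Return F1, the harmonic mean of precision and recall."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    if predicted == 0 or actual == 0:
        return 0.0  # precision or recall is undefined; report 0 by convention
    precision = true_positives / predicted
    recall = true_positives / actual
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, for illustration only.
print(f"{f1_score(80, 20, 20):.2f}")
```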

Download

Open https://ner.grit.id/index.php/front/about and click "GET NERGRIT CORPUS".

How to Use

Python 3 is recommended, and training on a GPU is preferable to training on a CPU. First, install virtualenv:

pip install virtualenv

Set up a virtual environment for NER and Statement Extraction using Anago version 1.0.8:

cd nergrit-corpus
mkdir venv
virtualenv venv
source venv/bin/activate
pip install anago==1.0.8
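
Before training, it can help to confirm that the interpreter is Python 3 and that the virtual environment is actually active. The following is a minimal sketch using only the standard library; the helper name in_virtualenv is hypothetical and not part of this repository.

```python
import sys


def in_virtualenv() -> bool:
    """Return True if the interpreter runs inside a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; newer ones (and venv)
    # make sys.prefix differ from sys.base_prefix.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or sys.prefix != base_prefix


print("Python 3:", sys.version_info.major >= 3)
print("virtualenv active:", in_virtualenv())
```

Run it after "source venv/bin/activate"; if the second line prints False, the environment was not activated in the current shell.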

Train the model for Named Entity Recognition (NER) with:

cd ner/
python make_model.py

Try the model with:

python tag.py
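
A sequence tagger of this kind typically produces one label per token. Assuming the labels follow the common BIO scheme (an assumption: the tag set and the output format of tag.py are not described here), the sketch below groups per-token labels into entity spans. The function name, the example sentence, and the PER/LOC labels are hypothetical.

```python
from typing import List, Optional, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (entity_text, entity_type) pairs."""
    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type: Optional[str] = None
    for token, tag in zip(tokens, tags):
        prefix, _, label = tag.partition("-")
        if prefix == "B" or (prefix == "I" and label != current_type):
            # Start a new entity; a stray I- tag is treated as a beginning.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [token], label
        elif prefix == "I":
            # Continue the entity that is currently open.
            current_tokens.append(token)
        else:
            # "O" (or any unrecognized tag) closes the open entity.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [], None
    if current_tokens:
        spans.append((" ".join(current_tokens), current_type))
    return spans


# Hypothetical tagger output, for illustration only.
tokens = ["Joko", "Widodo", "tiba", "di", "Jakarta"]
tags = ["B-PER", "I-PER", "O", "O", "B-LOC"]
print(bio_to_spans(tokens, tags))
```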

From the repository root, train the model for Statement Extraction with:

cd statement/
python make_model.py

Try the model with:

python tag_statetement.py

Set up a separate virtual environment for Sentiment Analysis using Anago version 0.0.5:

cd nergrit-corpus
mkdir venvold
virtualenv venvold
source venvold/bin/activate
pip install anago==0.0.5

Train the model for Sentiment Analysis with:

cd sentiment/
python make_model.py

Try the model with:

python test_data.py

To improve the model further, you can train your own GloVe embeddings on Indonesian Wikipedia or any other Indonesian articles.
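
GloVe vectors are usually distributed as plain text, one word per line followed by its space-separated vector values. The sketch below parses that format into a dictionary using only the standard library. The function name and the sample words and numbers are hypothetical, and how make_model.py consumes the embeddings is not described here.

```python
import io
from typing import Dict, Iterable, List


def load_glove(lines: Iterable[str]) -> Dict[str, List[float]]:
    """Parse GloVe text format: a word followed by its vector values."""
    embeddings: Dict[str, List[float]] = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        embeddings[parts[0]] = [float(value) for value in parts[1:]]
    return embeddings


# Hypothetical two-word, three-dimensional sample standing in for a real file.
sample = io.StringIO("saya 0.1 0.2 0.3\nmakan -0.4 0.5 0.6\n")
vectors = load_glove(sample)
print(len(vectors), len(vectors["saya"]))
```

For a real embeddings file, pass an open file handle instead of the in-memory sample, for example open(path, encoding="utf-8"), where path is wherever you saved the vectors.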

Copyright (C) 2019 NERGRIT DEVELOPERS

Coach:

  • Riyanti Kusumawati

Mentor:

Lead Developer:

Developer:

Annotator:

Github Issue Reporter:

Try NERGRIT at: https://ner.grit.id and visit us at https://grit.id/
