ntuanhung / VNER

Vietnamese Named Entity Recognition

Vietnamese Named Entity Recognition using Seq2Seq / EncoderDecoder / Attention

Could it be successful?

============================

Vietnamese Named Entity Recognition

[English] [Vietnamese]

This repository contains starter code for training and evaluating machine learning models for the Vietnamese Named Entity Recognition problem. It is part of the underthesea project. The code provides an end-to-end working example of reading datasets, training machine learning models, and evaluating their performance. It can be extended to train your own custom models.

1. Installation

1.1 Requirements

This code is written in Python. The dependencies are:

  • Operating Systems: Linux (Ubuntu, CentOS), Mac
  • Python 3.6
  • Anaconda

Python Packages

  • underthesea==1.1.7
  • languageflow==1.1.7

1.2 Download and Setup Environment

Clone project using git

$ git clone https://github.com/undertheseanlp/ner.git

Create environment and install requirements

$ cd ner
$ conda create -n uts.ner python=3.6
$ source activate uts.ner
$ pip install -r requirements.txt

2. Usage

2.1 Using a pretrained model

$ cd ner
$ source activate uts.ner
$ python ner.py -fin tmp/input.txt -fout tmp/output.txt

2.2 Train a new dataset

Prepare a new dataset
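
The expected file format is not documented in this section. As a minimal sketch, the snippet below assumes a CoNLL-style layout: one token per line, the NER tag in the last column, and a blank line between sentences. The function name `read_conll` and the sample sentences are hypothetical and are not part of this repository; check them against the actual contents of data/vlsp2018/corpus/train.txt before relying on them.

```python
from typing import Iterable, List, Tuple

Sentence = List[Tuple[str, str]]  # list of (token, tag) pairs


def read_conll(lines: Iterable[str]) -> List[Sentence]:
    """Parse column-formatted lines into sentences of (token, tag) pairs.

    Assumed format (not confirmed by this repository): one token per line,
    columns separated by tabs (or whitespace if no tab is present), the NER
    tag in the last column, and a blank line between sentences.
    """
    sentences: List[Sentence] = []
    current: Sentence = []
    for raw in lines:
        line = raw.strip()
        if not line:
            # A blank line closes the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        # Prefer tabs so that multi-syllable tokens such as "Hà Nội" stay whole.
        columns = line.split("\t") if "\t" in line else line.split()
        current.append((columns[0], columns[-1]))
    if current:
        sentences.append(current)
    return sentences


# Hypothetical sample data, for illustration only.
sample = [
    "Anh\tO",
    "Nam\tB-PER",
    "sống\tO",
    "ở\tO",
    "Hà Nội\tB-LOC",
    "",
    "Tôi\tO",
    "học\tO",
    "ở\tO",
    "Huế\tB-LOC",
]
sentences = read_conll(sample)
```

To read a real file, pass an open file handle instead of the list, for example `read_conll(open(path, encoding="utf-8"))`.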

Train and test

$ cd ner
$ source activate uts.ner
$ python train.py --train data/vlsp2018/corpus/train.txt
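
How train.py reports test results is not described here. NER models are commonly scored with entity-level precision, recall, and F1 over BIO tags, so the following is a minimal sketch of that metric, assuming BIO-tagged output. The helper names `bio_spans` and `prf` are hypothetical and do not come from this repository or from underthesea.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int, str]  # (start index, end index exclusive, entity type)


def bio_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from a BIO tag sequence."""
    spans: Set[Span] = set()
    start, label = None, None
    # The trailing "O" is a sentinel that flushes a span ending at the last token.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and tag[2:] == label
        if label is not None and not continues:
            spans.add((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
        elif tag.startswith("I-") and label is None:
            # Tolerate an I- tag that has no preceding B- tag.
            start, label = i, tag[2:]
    return spans


def prf(gold: List[str], pred: List[str]) -> Tuple[float, float, float]:
    """Entity-level precision, recall, and F1; a span counts only on exact match."""
    gold_spans, pred_spans = bio_spans(gold), bio_spans(pred)
    correct = len(gold_spans & pred_spans)
    precision = correct / len(pred_spans) if pred_spans else 0.0
    recall = correct / len(gold_spans) if gold_spans else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical tags for one sentence: the model finds the person but misses the location.
gold = ["O", "B-PER", "I-PER", "O", "B-LOC"]
pred = ["O", "B-PER", "I-PER", "O", "O"]
precision, recall, f1 = prf(gold, pred)
```

This version scores a single sentence. For a whole corpus, accumulate the span counts over all sentences before computing the ratios, rather than averaging per-sentence scores.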

2.3 Sharing a model

To be updated

3. References

To be updated

Last update: 07/2018

License: GNU Affero General Public License v3.0