utkarshminhas / LegalNER

Legal Text Annotation and Entity Recognition (project for the Applied Machine Learning in Computational Linguistics course at Indiana University Bloomington)

To build corpus

Keep the TSV data files in the same folder as the notebook MLCL - Project - Corpus.ipynb.
Running the notebook creates the file train.txt in the path set by data_folder = './project/example/ner/test/'

This file is the corpus used by MLCL_Project.ipynb.
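
The corpus step above can be sketched outside the notebook. This is a minimal illustration, not the notebook's actual code. It assumes each TSV row holds a token in the first column and an entity tag in the second, that a blank row marks a sentence boundary, and that the output is a space-separated "token tag" file. The function name build_corpus and the default tag "O" are hypothetical.

```python
import csv
from pathlib import Path


def build_corpus(tsv_dir, out_path):
    """Merge every *.tsv in tsv_dir into one column-format file at out_path.

    Assumed layout (not confirmed by the project): column 0 is the token,
    column 1 is the entity tag, and a blank row ends a sentence.
    Returns the number of tokens written.
    """
    tsv_dir, out_path = Path(tsv_dir), Path(out_path)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    n_tokens = 0
    with out_path.open("w", encoding="utf-8") as out:
        for tsv_file in sorted(tsv_dir.glob("*.tsv")):
            with tsv_file.open(encoding="utf-8", newline="") as f:
                # QUOTE_NONE keeps quotation marks in legal text as literal tokens
                for row in csv.reader(f, delimiter="\t", quoting=csv.QUOTE_NONE):
                    if not row or not row[0].strip():
                        out.write("\n")  # blank row = sentence boundary
                        continue
                    token = row[0].strip()
                    # Untagged tokens fall back to the hypothetical default "O"
                    tag = row[1].strip() if len(row) > 1 and row[1].strip() else "O"
                    out.write(f"{token} {tag}\n")
                    n_tokens += 1
            out.write("\n")  # keep sentences from different files apart
    return n_tokens
```

With the TSV files next to the script, a call such as build_corpus('.', './project/example/ner/test/train.txt') would write the corpus to the path named above.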

To train model

Keep train.txt in the same folder as the notebook MLCL_Project.ipynb.
The resulting model files (training.png, best-model.pt) are written to the path ./project/example/ner
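
Before starting a training run, it can help to confirm that train.txt parses as expected. The sketch below is not part of the project. It assumes the same space-separated "token tag" layout with blank lines between sentences, and the helper name summarize_corpus is hypothetical.

```python
from collections import Counter


def summarize_corpus(path):
    """Return (sentence_count, Counter of tags) for a column-format file.

    Assumes one 'token tag' pair per line and blank lines between sentences.
    Raises ValueError on a non-blank line that has no tag column.
    """
    sentences, tags, in_sentence = 0, Counter(), False
    with open(path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            parts = line.split()
            if not parts:
                in_sentence = False  # blank line closes the current sentence
                continue
            if len(parts) < 2:
                raise ValueError(f"line {line_no}: expected 'token tag', got {line!r}")
            if not in_sentence:
                sentences += 1
                in_sentence = True
            tags[parts[-1]] += 1  # the last column is taken as the tag
    return sentences, tags
```

Calling summarize_corpus('train.txt') returns the sentence count and a per-tag tally, which makes a badly skewed or empty tag set visible before any time is spent on training.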
