nested-people-entities-quran

This repository contains a dataset of the Indonesian translation of the Quran annotated with nested people entities (up to two levels), together with supervised learning implementations (BiLSTM-CRF, IndoBERT, CRF) for nested people entity extraction on that text.

Dataset

Dataset file: TA_dataset_raw_labeled_nested_4th
Desc: The dataset is taken from the Tanzil Quran corpus and covers Juz 1 through Juz 6. The only entity tag used in this research is PER (person), which marks people entities; O marks tokens that fall outside any people entity. Labels follow the IOB format, and all entity tags were labeled manually.
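
To illustrate how two levels of IOB labels can encode nested people entities, here is a minimal sketch. The token list, the two tag layers, and the helper `iob_spans` are all hypothetical and written for illustration only; they do not reflect the actual column layout of the dataset file or the code in the notebooks. The sketch assumes one tag layer for the outer entity and one for the inner entity.

```python
def iob_spans(tokens, tags):
    """Collect (start, end, text) spans for PER entities from one IOB tag layer."""
    spans = []
    start = None  # index where the currently open entity began, if any
    for i, tag in enumerate(tags):
        if tag == "B-PER":
            # A new entity begins; close any entity that is still open.
            if start is not None:
                spans.append((start, i, " ".join(tokens[start:i])))
            start = i
        elif tag == "I-PER" and start is not None:
            # Continuation of the open entity.
            continue
        else:
            # "O" (or a stray I-PER with no open entity) ends the open entity.
            if start is not None:
                spans.append((start, i, " ".join(tokens[start:i])))
            start = None
    if start is not None:
        spans.append((start, len(tokens), " ".join(tokens[start:])))
    return spans


# Hypothetical example phrase (not taken from the dataset file).
tokens = ["Musa", "berkata", "kepada", "kaum", "Musa"]
# Assumed layer 1: outer entities ("kaum Musa" is a group of people).
outer_tags = ["B-PER", "O", "O", "B-PER", "I-PER"]
# Assumed layer 2: entities nested inside an outer entity ("Musa" within "kaum Musa").
inner_tags = ["O", "O", "O", "O", "B-PER"]

outer_spans = iob_spans(tokens, outer_tags)
inner_spans = iob_spans(tokens, inner_tags)
print(outer_spans)
print(inner_spans)
```

Keeping one tag sequence per nesting level means each layer stays a valid flat IOB sequence, so a standard sequence labeler can be applied to each level separately.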
