elerdg / NER-BERT-Italian

Fine-tune BERT models for Italian NER

Named Entity Recognition with BERT on Italian

This project aims to fine-tune pre-trained BERT models for named-entity recognition (NER) on Italian data from the WikiNEural dataset.

Overview:

The dataset:

WikiNEural IT comprises 111k sentences from Wikipedia, tokenized and NER-tagged. The dataset is organized in three splits: train, validation, and test. The sentences are cased and contain punctuation. The entity categories follow the BIO scheme (B- marks the first token of an entity, I- a continuation, O a token outside any entity) and are encoded as follows:

{'O': 0, 'B-PER': 1, 'I-PER': 2, 'B-ORG': 3, 'I-ORG': 4, 'B-LOC': 5, 'I-LOC': 6, 'B-MISC': 7, 'I-MISC': 8}
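To make the encoding concrete, here is a minimal sketch of how integer tags could be mapped back to label names and grouped into entity spans. The label map is copied from above; the helper `extract_entities` and the example sentence are hypothetical illustrations, not code or data from this repository.

```python
# Label map copied from the README; id2label is simply its inverse.
label2id = {'O': 0, 'B-PER': 1, 'I-PER': 2, 'B-ORG': 3, 'I-ORG': 4,
            'B-LOC': 5, 'I-LOC': 6, 'B-MISC': 7, 'I-MISC': 8}
id2label = {i: label for label, i in label2id.items()}


def extract_entities(tokens, tag_ids):
    """Group BIO-tagged tokens into (entity_type, text) pairs.

    Hypothetical helper for illustration only.
    """
    entities = []
    current_type, current_tokens = None, []
    for token, tag_id in zip(tokens, tag_ids):
        prefix, _, entity_type = id2label[tag_id].partition("-")
        # An I- tag of the same type extends the entity that is open.
        if prefix == "I" and entity_type == current_type:
            current_tokens.append(token)
            continue
        # Any other tag closes the open entity, if there is one.
        if current_type is not None:
            entities.append((current_type, " ".join(current_tokens)))
        if prefix == "O":
            current_type, current_tokens = None, []
        else:
            # A B- tag (or a stray I- tag, treated leniently) starts a new entity.
            current_type, current_tokens = entity_type, [token]
    if current_type is not None:
        entities.append((current_type, " ".join(current_tokens)))
    return entities


# Hypothetical example sentence, not taken from the dataset.
tokens = ["Dante", "Alighieri", "nacque", "a", "Firenze", "."]
tag_ids = [1, 2, 0, 0, 5, 0]
print(extract_entities(tokens, tag_ids))
```

The same `label2id` / `id2label` pair is what a token-classification model would typically be configured with, so that predictions can be read back as label names.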

The pre-trained models in this project:
