arop / ner-re-pt

Named entity extraction from Portuguese web text

Home Page: http://hdl.handle.net/10216/106094

Named entity extraction from Portuguese web text

My master's dissertation on named entity extraction from Portuguese web text, at FEUP (Faculty of Engineering of the University of Porto).

Entity extraction for the Portuguese language using well-established tools (OpenNLP, Stanford CoreNLP, spaCy and NLTK), applied specifically to the news section of SIGARRA, the University of Porto Information System, and all its subdomains.
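As a rough illustration of what the output of such tools can look like once it is post-processed, here is a minimal sketch that groups token-level tags into entity spans. It assumes the tagger emits BIO-style tags (`B-`, `I-`, `O`), which is a common convention but is not confirmed for this project. The function name `bio_to_entities`, the labels `ORG` and `PER`, and the sample sentence are all hypothetical.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (word, BIO tag), e.g. ("Porto", "I-ORG")


def bio_to_entities(tokens: List[Token]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (entity text, label) pairs."""
    entities = []
    words, label = [], None
    for word, tag in tokens:
        prefix, _, tag_label = tag.partition("-")
        # Continue the open entity only on an I- tag with the same label.
        if prefix == "I" and words and label == tag_label:
            words.append(word)
            continue
        # Any other tag closes the open entity, if there is one.
        if words:
            entities.append((" ".join(words), label))
        if prefix in ("B", "I") and tag_label:
            # B- opens a new entity; a stray I- is treated as an opening tag.
            words, label = [word], tag_label
        else:
            words, label = [], None
    if words:
        entities.append((" ".join(words), label))
    return entities


# Hypothetical tagged sentence; the labels are illustrative only.
sentence = [
    ("A", "O"),
    ("Universidade", "B-ORG"),
    ("do", "I-ORG"),
    ("Porto", "I-ORG"),
    ("recebeu", "O"),
    ("Sérgio", "B-PER"),
    ("Nunes", "I-PER"),
    (".", "O"),
]
print(bio_to_entities(sentence))
```

Each tool has its own native output format, so a conversion step like this would differ per tool; the wiki describes the actual setup used for each one.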

Author: André Ricardo Oliveira Pires

Supervisor: Sérgio Nunes

Co-supervisor: José Devezas

In collaboration with: FEUP InfoLab and INESC TEC

For more information on the development process, guidelines for each tool, results obtained, resources created (trained NER models and annotated dataset) and more, check the wiki.

Languages

Python 61.8%, Shell 26.9%, Perl 11.3%