Computational Linguistics & Text Mining Lab's repositories
MFS_classifier
This repo contains scripts that attempt to remove the most frequent sense (MFS) bias from a word sense disambiguation (WSD) system.
LongTailIdentity
Generating profiles of long tail identities from text
WordNetMapper
This repo maps lexical keys, offsets, and ilidefs from one WordNet version to another ["16","17","171","20","21","30"]. It uses the index.sense files from WordNet (http://wordnet.princeton.edu/) and the automatically generated mappings between WordNet offsets (http://nlp.lsi.upc.edu/tools/download-map.php).
morphosyntactic_parser_nl
Morphosyntactic parser for Dutch based on the Alpino parser
EL-long-tail-phenomena
Systematic study of long tail phenomena in the task of entity linking
BabelfyReimplementation
Reimplementation of Babelfy (http://babelfy.org)
positive-interpretations
Code (Python 3.6) for automatically scoring and classifying positive interpretations generated from negations in OntoNotes
opinion_miner_deluxe
Opinion miner based on machine learning that can be trained on a corpus of KAF/NAF files
EmotionTagger
Uses an emotion tagger to tag text with emotions
EventStoryLine
Materials for the StoryLine extraction task - annotated data, baselines and evaluation scripts, evaluation data.
OpeNER_corpus
OpeNER corpus: news articles and hotel reviews annotated with opinion expressions, holders, targets and their relations
LOTUS
Code of LOTUS, the largest LOD text index, allowing free text access to the LOD Laundromat data collection
HumanLikeEL
Human-Like Entity Linking using Contextual knowledge
cltl-magicplace
Annotate NAF documents with the NewsReader pipeline on the Lisa computer (SURFsara)
nwr-triple-api
Queries the KnowledgeStore populated with NewsReader output and represents the result as SEM-RDF or SEM-JSON
OldBailey
Processing the Old Bailey data to create LOD
Spoken-versus-Written
Code and data for our VarDial 2018 paper on spoken versus written image descriptions