Computational Linguistics & Text Mining Lab's repositories
EventStoryLine
Materials for the StoryLine extraction task: annotated data, baselines, evaluation scripts, and evaluation data.
opinion_miner_deluxe
Opinion miner based on machine learning that can be trained on a corpus of KAF/NAF files
BabelfyReimplementation
Reimplementation of Babelfy (http://babelfy.org)
morphosyntactic_parser_nl
Morphosyntactic parser for Dutch based on the Alpino parser
EL-long-tail-phenomena
Systematic study of long tail phenomena in the task of entity linking
HumanLikeEL
Human-Like Entity Linking using Contextual knowledge
WordNetMapper
Maps lexical keys, offsets, and ilidefs from one WordNet version to another (supported versions: "16", "17", "171", "20", "21", "30"). It uses the index.sense files from WordNet (http://wordnet.princeton.edu/) and the automatically generated mappings between WordNet offsets (http://nlp.lsi.upc.edu/tools/download-map.php)
MFS_classifier
Scripts that attempt to remove the most frequent sense (MFS) bias from a WSD system.
EmotionTagger
Tags text with emotions using an emotion tagger
LongTailIdentity
Generating profiles of long tail identities from text
cltl-magicplace
Annotate NAF documents with the NewsReader pipeline on the Lisa compute cluster (SURFsara)
LOTUS
Code of LOTUS, the largest LOD text index, allowing free text access to the LOD Laundromat data collection
nwr-triple-api
Queries the KnowledgeStore populated with NewsReader output and represents the result as SEM-RDF or SEM-JSON
OldBailey
Processes the Old Bailey data to create LOD
OpeNER_corpus
OpeNER corpus: news articles and hotel reviews annotated with opinion expressions, holders, targets, and their relations
positive-interpretations
Code (Python 3.6) for automatically scoring and classifying positive interpretations generated from negations in OntoNotes
Spoken-versus-Written
Code and data for our VarDial 2018 paper on spoken versus written image descriptions