Computational Linguistics & Text Mining Lab's repositories
python-for-text-analysis
If you want to use Python for text analysis, this course is for you!
ba-text-mining
Hands-on material for the BA Text Mining course, taught at VU Amsterdam
SpaCy-to-NAF
spaCy-to-naf converter
ma-hlt-labs
Human Language Technology notebooks for lab sessions, for Master's students
entity-identification-from-scratch
Entity recognition and linking for historical documents in Dutch, developed within the Clariah+ project at VU Amsterdam
vu-rm-pip3
Dutch NewsReader pipeline
multilingual-wiki-event-pipeline
This project aims to extract information about incidents of a particular type. This information consists of structured data on the incidents from Wikidata, as well as unstructured descriptions and supporting sources from Wikipedia. We obtain information from Wikipedia in multiple languages.
cltl-ma-thesis
(LaTeX) MA thesis template
ma-communicative-robots
Communication robots
reference-framing-perspective
Workshop website
cltl.github.io
CLTL organization site
rfp_corpus_collection
Collect a referentially grounded corpus for the 1st workshop on Reference, Framing, and Perspective (LREC-COLING 2024)
bibliography
CLTL bibtex bibliography
grounding-toxicity
Code base for the paper Grounding Toxicity in Real-World Events across Languages
InappropriateLanguageDetection
This repository contains annotated data on inappropriate language in online discussions, generated through a combination of expert annotation, crowd-sourcing, and ChatGPT-based methods.
Lingoturk
Creating crowdsourcing-based experiments made easy
panli-crowdtruth
A CrowdTruth analysis of the PANLI dataset
panli-models
Model evaluation on the PANLI dataset
Reddit_topic
Toxicity analysis of Reddit conversations across topics and languages
unkown_script
Code base for the paper Unknown Script: Impact of Script on Cross-Lingual Transfer