Language Machines's repositories
ucto
Unicode tokeniser. Ucto tokenizes text files: it separates words from punctuation and splits sentences. It also offers several other basic preprocessing steps, such as changing case, all of which you can use to make your text suitable for further processing such as indexing, part-of-speech tagging, or machine translation. Ucto comes with tokenisation rules for several languages and can be easily extended to suit other languages. It has been incorporated for tokenizing Dutch text in Frog, our Dutch morpho-syntactic processor. http://ilk.uvt.nl/ucto
ticcltools
Tools for TICCL
foliautils
Command-line utilities for working with the Format for Linguistic Annotation (FoLiA), powered by libfolia (C++), written by Ko van der Sloot (CLST, Radboud University)
timblserver
TiMBL implements several memory-based learning algorithms. This is the server part.
dialect2keywords
Web interface designed to convert words in Dutch dialects ("dialectopgaven") into standard Dutch keywords ("vernederlandste trefwoorden").
actiontests
Small program to test Travis issues, such as OSX and Clang OpenMP support.
clariah-plus-tasks
An overview of CLARIAH-PLUS tasks at CLST, Radboud University, Nijmegen
JASMIN-BLISS-Negation
Documentation of a corpus sample of Dutch human-computer dialogues annotated with negation cues.
ticcactions
Collection of GitHub Actions for use in ticc software
timbltests
Unit tests for TiMBL