Filip Ginter's repositories
simstring-cuda
A quick implementation of cosine-based fuzzy string lookup (a little like the well-known simstring library) using sklearn, torch, and GPU acceleration. It can hold its own with an index of a few million strings, batched queries, and a GPU. Otherwise it is slower than simstring, but on the other hand it is easy to install. I leave this here in case anyone wants it.
ainl_2020_tutorial
AINL2020 tutorial on Finnish NLP
hunspell-fi
Best-effort, good-coverage Finnish hunspell dictionary, automatically gathered
multiling_parser
Stuff and scripts to prepare data for a "one model for all languages" run of the Turku Neural Parser Pipeline
biltema-ebike-controller-upgrade
This repository documents a controller upgrade to Biltema's ebike. I simply record here what was needed in the hope that it is useful also for somebody else.
blast-baseline
throwaway repo
data-tooling
How should we store and serve the dataset?
feature_predictor
Little test needed for our parser
fginter.github.io
A Jekyll version of the "Editorial" theme by HTML5 UP.
flask_shelve
Get/store JSON
friendly-dl
Friendly URL fetcher for veronika
jekyll-hook
Trigger Jekyll deployment by GitHub webhook event
Megatron-LM
Ongoing research training transformer language models at scale, including: BERT & GPT-2
subword-nmt
Unsupervised Word Segmentation for Neural Machine Translation and Text Generation
tiedetestaajat-demo
Demo for the tiedetestaajat workshop
transformer-mlabel
Could we get multilabel output from a transformer in some reasonable manner?
yle_rss_downloader
Downloads YLE RSS news
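
The cosine-based fuzzy lookup mentioned for simstring-cuda can be illustrated with a minimal sketch. This is not the repository's code: it assumes character trigram features (the description does not say which features are used) and swaps sklearn, torch, and the GPU for plain standard-library Python so that it runs anywhere. The names `ngrams`, `vectorize`, `build_index`, and `lookup`, and the default threshold of 0.5, are hypothetical.

```python
import math
from collections import Counter


def ngrams(s, n=3):
    """Character n-grams of a lowercased string padded with spaces (assumed feature choice)."""
    padded = " " * (n - 1) + s.lower() + " " * (n - 1)
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def vectorize(s, n=3):
    """Sparse, L2-normalised n-gram count vector stored as a dict."""
    counts = Counter(ngrams(s, n))
    # Fall back to 1.0 so an empty feature set does not divide by zero.
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {gram: c / norm for gram, c in counts.items()}


def build_index(strings, n=3):
    """Precompute one normalised vector per indexed string."""
    return [(s, vectorize(s, n)) for s in strings]


def lookup(index, query, threshold=0.5, n=3):
    """Return (string, cosine) pairs with cosine >= threshold, best match first."""
    q = vectorize(query, n)
    hits = []
    for s, v in index:
        # Both vectors have unit length, so the dot product is the cosine similarity.
        sim = sum(w * v.get(gram, 0.0) for gram, w in q.items())
        if sim >= threshold:
            hits.append((s, sim))
    return sorted(hits, key=lambda hit: -hit[1])


index = build_index(["helsinki", "turku", "tampere"])
hits = lookup(index, "helsinky")
print(hits[0][0])
```

The per-string loop in `lookup` is the part that a batched matrix product on a GPU would presumably replace when the index holds millions of strings.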