Language Technology at the University of Helsinki (Helsinki-NLP)

Projects and resources developed in the Language Technology Research Group at the University of Helsinki.

Location: Helsinki, Finland

Home Page: https://blogs.helsinki.fi/language-technology/

Twitter: @HelsinkiNLP

Language Technology at the University of Helsinki's repositories

Language: Makefile | License: NOASSERTION | Stargazers: 801 | Issues: 22 | Issues: 36

Opus-MT

Open neural machine translation models and web services

Language: Python | License: MIT | Stargazers: 613 | Issues: 17 | Issues: 84

OPUS-MT-train

Training open neural machine translation models

Language: Makefile | License: MIT | Stargazers: 331 | Issues: 18 | Issues: 99

OpusFilter

OpusFilter - Parallel corpus processing toolkit

Language: Python | License: MIT | Stargazers: 101 | Issues: 11 | Issues: 36

OPUS-CAT

OPUS-CAT is a collection of software that makes it possible to use OPUS-MT neural machine translation models in professional translation. It includes a local offline MT engine and a set of CAT tool plugins.

Language: C# | License: MIT | Stargazers: 70 | Issues: 12 | Issues: 96

OPUS

The Open Parallel Corpus

mammoth

MAMMOTH: MAssively Multilingual Modular Open Translation @ Helsinki

Language: Python | License: MIT | Stargazers: 21 | Issues: 4 | Issues: 31

OPUS-MT-testsets

Benchmarks for evaluating MT models

Language: Smalltalk | License: NOASSERTION | Stargazers: 10 | Issues: 3 | Issues: 14

neural-search-tutorials

Additional Notebooks for the Building NLP Applications course

Language: Jupyter Notebook | Stargazers: 5 | Issues: 3 | Issues: 0

Language: SCSS | Stargazers: 4 | Issues: 3 | Issues: 0

opus-fast-mosestokenizer

C++ mosestokenizer (OPUS fork)

Language: C++ | License: LGPL-2.1 | Stargazers: 3 | Issues: 2 | Issues: 2

uncertainty-aware-nli

Uncertainty-aware fine-tuning of transformers with NLI data.

Language: Python | License: MIT | Stargazers: 3 | Issues: 2 | Issues: 0

murre24

A manually annotated dataset of Finnish varieties in Suomi24, the largest Finnish internet forum, together with the IDs of automatically annotated dialectal messages and the scripts used for classification and evaluation.

Language: Python | License: CC-BY-4.0 | Stargazers: 2 | Issues: 2 | Issues: 0

dialect-topic-model

Scripts and metadata for the paper "Corpus-based dialectometry with topic models"

Language: Python | License: CC-BY-4.0 | Stargazers: 1 | Issues: 1 | Issues: 0

External-MT-leaderboard

Leaderboards for external MT models

License: CC-BY-SA-4.0 | Stargazers: 1 | Issues: 2 | Issues: 1

lm-vs-mt

A Comparison of Language Modeling and Translation as Multilingual Pretraining Objectives

Language: Python | Stargazers: 1 | Issues: 2 | Issues: 0

OPUS-API

API for searching corpora from OPUS

OPUS-MT-leaderboard-recipes

Makefile recipes shared between all leaderboard repos

Language: Makefile | License: CC-BY-SA-4.0 | Stargazers: 1 | Issues: 2 | Issues: 0

OpusDistillery

Training pipelines for Firefox Translations neural machine translation models (adapted for OPUS-MT and integrating GreenNLP metrics)

Language: Python | License: MPL-2.0 | Stargazers: 1 | Issues: 1 | Issues: 5

Contributed-MT-leaderboard

Leaderboard of contributed MT results

License: CC-BY-SA-4.0 | Stargazers: 0 | Issues: 2 | Issues: 0

eflomal

Efficient Low-Memory Aligner

Language: C | License: GPL-3.0 | Stargazers: 0 | Issues: 0 | Issues: 0

Language: Makefile | License: CC-BY-SA-4.0 | Stargazers: 0 | Issues: 2 | Issues: 0

lowres-spain-st

Scripts and data from the Helsinki-NLP participation in the WMT24 shared task "Translation into Low-Resource Languages of Spain".

Language: Shell | Stargazers: 0 | Issues: 2 | Issues: 0

Stargazers: 0 | Issues: 3 | Issues: 0

swa_gaussian

Code repo for "A Simple Baseline for Bayesian Uncertainty in Deep Learning" (Helsinki-NLP fork)

Language: Jupyter Notebook | License: BSD-2-Clause | Stargazers: 0 | Issues: 1 | Issues: 0

Language: Python | License: MIT | Stargazers: 0 | Issues: 5 | Issues: 0