john-hewitt / morph16

Morphologically-motivated phrase tables from AMTA 2016 paper

Morphologically Motivated Phrase Tables

Available here are phrase tables for inflected forms in each of 10 languages, with the number of phrase pairs:

  • Finnish: 11,340,214
  • Russian: 1,281,644
  • Turkish: 697,548
  • Georgian: 592,544
  • Czech: 467,688
  • Korean: 358,879
  • Swahili: 65,614
  • Urdu: 54,493

Each table was constructed according to the method described in "Automatic Construction of Morphologically Motivated Translation Models for Highly Inflected, Low-Resource Languages" (AMTA 2016). If you use these tables, please cite the paper.
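
This README does not state the on-disk format of the tables. As a rough sketch only, the Python below assumes a Moses-style layout in which each line holds `source ||| target ||| scores`, with fields separated by `|||`. That layout, the `read_phrase_table` helper, the `PhrasePair` type, and the sample lines are all illustrative assumptions, not taken from the released files; check an actual table before relying on it.

```python
import io
from typing import Iterable, Iterator, List, NamedTuple


class PhrasePair(NamedTuple):
    """One phrase-table entry (hypothetical structure for illustration)."""
    source: str
    target: str
    scores: List[float]


def read_phrase_table(lines: Iterable[str]) -> Iterator[PhrasePair]:
    """Yield phrase pairs from lines in an ASSUMED 'src ||| tgt ||| scores' format."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        fields = [field.strip() for field in line.split("|||")]
        if len(fields) < 2:
            continue  # not a source/target pair under the assumed format
        # The third field, if present and non-empty, is read as space-separated floats.
        has_scores = len(fields) > 2 and fields[2] != ""
        scores = [float(x) for x in fields[2].split()] if has_scores else []
        yield PhrasePair(fields[0], fields[1], scores)


# Made-up sample lines, not drawn from the released tables.
sample = io.StringIO(
    "talossa ||| in the house ||| 0.5 0.25\n"
    "\n"
    "taloissa ||| in the houses ||| 0.5\n"
)
pairs = list(read_phrase_table(sample))
for pair in pairs:
    print(pair.source, "->", pair.target, pair.scores)
```

To read a real file, pass an open file handle instead of the in-memory sample, for example `read_phrase_table(open(path, encoding="utf-8"))`, where `path` points to one of the downloaded tables. Because the function is a generator, it does not hold the whole table in memory, which matters for the larger tables such as Finnish.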

For questions, concerns, or new language requests, please contact John Hewitt.

https://www.seas.upenn.edu/~johnhew
