bicici / SMTData

Datasets for machine translation

MT Data

Datasets for machine translation

  • English-Turkish Parallel Corpus: 1984_en-tr_SentenceAligned_ParallelCorpus.zip [Usage: For training SMT systems or running sentence alignment experiments.]
  • German-English Parallel Corpus for Word Alignment: German-English_WordAlignment.zip [Usage: This parallel corpus is manually word-aligned and can be used for training and testing word alignment systems or statistical machine translation systems.]
  • Turkish-English Parallel Corpus for WMT2018: TurkishTDK_WMT2018.zip
  • German-English Parallel Corpus for WMT2019: PrussianCulturalHeritageFoundation_WMT2019.zip
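
The sentence-aligned corpora above are distributed as zip archives. As a rough sketch of how one might load such a corpus for an SMT or sentence alignment experiment, the Python below lists an archive's members and pairs up the lines of two of them. The internal layout of these archives is not documented here, so everything about it is an assumption: the code presumes two UTF-8 plain-text members with one sentence per line, and the member names `corpus.en` and `corpus.tr` in the usage comment are hypothetical placeholders. Inspect the real archive with `list_members` first.

```python
import zipfile


def list_members(zip_path):
    """Return the member names in the archive so its real layout can be inspected."""
    with zipfile.ZipFile(zip_path) as zf:
        return zf.namelist()


def read_lines(zf, member, encoding="utf-8"):
    """Read one archive member and split it into lines (assumes plain text)."""
    text = zf.read(member).decode(encoding)
    # Drop the trailing newline, then split; also strip Windows carriage returns.
    return [line.rstrip("\r") for line in text.rstrip("\n").split("\n")]


def read_sentence_pairs(zip_path, src_member, tgt_member, encoding="utf-8"):
    """Pair line i of the source member with line i of the target member.

    Assumption: both members are line-aligned, one sentence per line.
    Raises ValueError if the two members differ in line count.
    """
    with zipfile.ZipFile(zip_path) as zf:
        src = read_lines(zf, src_member, encoding)
        tgt = read_lines(zf, tgt_member, encoding)
    if len(src) != len(tgt):
        raise ValueError(
            f"line counts differ: {len(src)} source vs {len(tgt)} target"
        )
    return list(zip(src, tgt))


# Hypothetical usage; the member names are placeholders, not the archive's real contents:
# print(list_members("1984_en-tr_SentenceAligned_ParallelCorpus.zip"))
# pairs = read_sentence_pairs(
#     "1984_en-tr_SentenceAligned_ParallelCorpus.zip", "corpus.en", "corpus.tr"
# )
```

Failing loudly on a line-count mismatch is deliberate: silently truncating to the shorter side would shift every later sentence pair out of alignment.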
