There are 2 repositories under the parallel-corpus topic.
A very comprehensive parallel corpus of Classical Chinese (ancient texts) and modern Chinese
Data resources for Indonesian-language NLP
This repository contains the code and data of the paper titled "Not Low-Resource Anymore: Aligner Ensembling, Batch Filtering, and New Datasets for Bengali-English Machine Translation" published in Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020), November 16 - November 20, 2020.
OpusFilter - Parallel corpus processing toolkit
The Business Scene Dialogue corpus
A multilingual, multi-style and multi-granularity dataset for cross-language textual similarity detection
Leeds University and King Saud University (LK) Hadith Corpus
Neural Machine Translation on the Nepali-English language pair
Machine translation (MT) benchmark dataset for languages in the Horn of Africa.
Curated list of publicly available parallel corpora for Indian languages
Abkhazian-focused multilingual and monolingual corpora for natural language processing (NLP)
An easy-to-use library for linguistically comparing one sentence and its words to another, in the same language or a different one. It is useful, for instance, for comparing a translation with the original text, finding differences and similarities between two translations, or seeing how a machine translation differs from a reference translation.
The IIT Bombay English-Hindi Parallel Corpus
Machine Translation from Sanskrit to Hindi using Unsupervised and Supervised Learning
A corpus that can be used to train English-to-Italian End-to-End Speech-to-Text Machine Translation models
A Python application that generates parallel corpora for any language pair; the output can be used to train neural machine translation (NMT) systems
Machine translation from English to Odia.
Editor for normalising learner texts (error annotation and tagging).
AMI Meeting Parallel Corpus
Pali Buddhist scriptures of 15 countries and their parallel corpus
4,500 sentences in Irish, tokenized, manually lemmatized, translated into English.
Code to extract a multilingual parallel corpus from the Press Information Bureau (PIB) website.
A parallel corpus of Sorani, Kurmanji and English
Parallel corpus annotation and visualization
Parallel corpus and multilingual machine translation system for the Pali Buddhist scriptures of 15 countries
Tajik-to-Persian transliteration model
Extracting present perfects (and related forms) from parallel corpora
Thai-Lao parallel corpus
Neovim plugin for aligning bilingual parallel texts
Online parallel text alignment tool.
English to Odia/Oriya parallel corpus of phrases