Stargazers: 4
Watchers: 5
Issues: 31
Forks: 1
Helsinki-NLP/OPUS-ingest Issues
new version of JParaCrawl (Updated 21 days ago)
Multi-Parallel Corpus of North Levantine Arabic (Updated 6 months ago)
macocu datasets (Closed 7 months ago; 1 comment)
Add datasets from https://github.com/Softcatala/nmt-softcatala (Updated 8 months ago)
Add CoVoST dataset (Updated 8 months ago)
Add en-th dataset (Closed 8 months ago; 1 comment)
MDN Web Docs (Closed 8 months ago; 1 comment)
NLLB dataset (Closed 9 months ago; 1 comment)
datasets collected in NLLB from various sources (Updated 9 months ago)
ELRA-W0232 is empty (Updated 10 months ago; 1 comment)
add TALPCo dataset (Updated 10 months ago)
bug: unable to clone all submodules (Closed a year ago; 1 comment)
There are two template files for each type (Closed a year ago; 1 comment)
Add Multilingual corpus of Caucasian languages (Updated a year ago; 2 comments)
Add CLUVI corpus for Galician>Spanish, English (Updated a year ago)
wmt21 multilingual data set (Updated a year ago)
gourmet swahili english does not show in opus api (Closed a year ago; 2 comments)
LoResMT data sets (Updated a year ago)
JW300 alignment problems (Closed a year ago; 1 comment)
add UN new UN corpus (Closed a year ago; 2 comments)
update various outdated corpora (Closed a year ago)
mulitparallel and updated ParaCrawl corpus (Closed a year ago)
update parsed data (Closed a year ago; 1 comment)
better use of disk space and temp directories (Closed a year ago)
UD compatible pre-processing (Closed a year ago; 1 comment)
release filtered/unfiltered commoncrawl and rapid corpus (Closed a year ago; 1 comment)
alignments missing? (Closed a year ago; 1 comment)
update tatoeba (Closed a year ago)
Invalid xml (Updated 5 years ago)
add mediawiki translation corpus (Closed 5 years ago)
improve makefiles (Updated 5 years ago)