boli-ai / vishal

vishal

The dataset will be hosted on 🤗 Datasets.

| Dataset | Processing | Type | Language | Owner | Citation |
| --- | --- | --- | --- | --- | --- |
| AI4Bharat IndicCorp | In process | Original, scraped | en-in, hi, as, bn, gu, kn, ml, mr, or, pa, ta, te | HC | citation |
| CC-100 Corpus | In process | Original, romanized | as, bn, bn_rom, gu, hi, hi_rom, kn, ml, mr, ne, or, pa, sa, si, sd, ta, ta_rom, te, te_rom, ur, ur_rom | HC | citation |
| WMT News Crawl | Available to pick up | Original, scraped | bn, gu, hi, kn, ml, mr, or, pa, ta, te | | citation |
| Charles University Hindi Monolingual Corpus | Available to pick up | Parallel corpora | hi, en | | |
| IIT Bombay Hindi Monolingual Corpus | Available to pick up | Parallel corpora, monolingual | hi, en | | citation |

About

License: Apache License 2.0