dpriskorn / LexUtils

Collection of utilities to work semi-automatically on lexemes in Wikidata

Support the Common Voice dataset

dpriskorn opened this issue

How should big downloads be handled?
Do the sentences have a unique, persistent ID?
Is there an API for looking up a sentence by its ID?

Emailed Mozilla 2 days ago to ask.

No answer from Mozilla yet. Maybe the best way forward is to count lines in their file with sentences?
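
Counting lines in a large sentences file could be sketched roughly as below. This is only a minimal illustration, not LexUtils code: the function name `count_lines`, the chunk size, and the command-line usage are assumptions, and it assumes the file is plain text with one sentence per line. It reads the file in fixed-size chunks, so memory use stays small even if the download is big.

```python
import sys


def count_lines(path, chunk_size=1024 * 1024):
    """Count lines in a file without loading it all into memory.

    Hypothetical helper: assumes one sentence per line.
    """
    count = 0
    last_byte = b""
    with open(path, "rb") as f:
        # Read fixed-size binary chunks and count newline bytes in each.
        while chunk := f.read(chunk_size):
            count += chunk.count(b"\n")
            last_byte = chunk[-1:]
    # A final line without a trailing newline still counts as a line.
    if last_byte and last_byte != b"\n":
        count += 1
    return count


if __name__ == "__main__" and len(sys.argv) > 1:
    # The path is supplied by the user; no real file name is assumed here.
    print(count_lines(sys.argv[1]))
```

Note that a line count only gives a position in one particular version of the file. Whether that position is stable enough to act as an ID across dataset releases is exactly the open question above.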