Hungarian and Russian scrapes hang
erip opened this issue · comments
Using wikipron with keys hun and rus seems to hang indefinitely. Both have been running for an hour without any movement; see below:
$ wikipron hun > hun.tsv
INFO: Language: 'Hungarian'
INFO: No cut-off date specified
Unfortunately I have very little diagnostic info to offer -- many other languages (including more complicated scrapes like zho) have completed successfully.
Sometimes the server does that. We don't know why. The big scrape (see data/scrape) has logic to resume from hangups, which may be of help to you.
Those are, I think, the two largest languages in the entire collection, though. If it hangs it'll be on one of the two of them.
I'll try replicating your Hungarian example and report back.
Thanks, @kylebgorman! As a workaround I can use the pre-scraped transcriptions from the data/ dir. I'm mostly filing this as documentation for future issue-havers, though it's likely that this isn't really a bug in the client but in the server, as you state.
I just remembered something. Both of those languages have narrow (square brackets) pronunciations, almost exclusively. You'll want to add --phonetic to your flags. (Note that this is renamed --narrow in the next release; see #402.)
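To make the point above concrete, here is a minimal sketch of why a scrape that only accepts broad (slash-delimited) transcriptions would produce no rows for a language whose entries are almost all narrow (square-bracketed). This is not wikipron's actual code: the `entries` sample, the two regexes, and the `select` helper are all made up for illustration, and it assumes the default mode looks for /.../ transcriptions.

```python
import re

# Illustrative sample only (not scraped data): every transcription is
# narrow, i.e. wrapped in square brackets.
entries = {
    "ház": ["[ˈhaːz]"],
    "víz": ["[ˈviːz]"],
    "kutya": ["[ˈkucɒ]"],
}

# Hypothetical patterns for the two transcription styles.
PHONEMIC = re.compile(r"^/(.+)/$")    # broad: /.../
PHONETIC = re.compile(r"^\[(.+)\]$")  # narrow: [...]


def select(entries, pattern):
    """Return (word, pronunciation) pairs whose transcription matches pattern."""
    pairs = []
    for word, prons in entries.items():
        for pron in prons:
            match = pattern.match(pron)
            if match:
                # Keep the transcription without its delimiters.
                pairs.append((word, match.group(1)))
    return pairs


phonemic_pairs = select(entries, PHONEMIC)  # nothing matches /.../
phonetic_pairs = select(entries, PHONETIC)  # every entry matches [...]
print(len(phonemic_pairs), len(phonetic_pairs))  # prints "0 3"
```

With the flag mentioned above, the original command would presumably become wikipron --phonetic hun > hun.tsv (the flag placement is an assumption).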
Closing this since I believe my comment a month ago is the explanation for the issue. This is not exactly a bug.