CUNY-CL / wikipron

Massively multilingual pronunciation mining

[lat] Broken selector

kylebgorman opened this issue

As of at least #509 the custom selector for Latin has been broken.

Latin has a custom selector because the headwords lack macrons. The Romans of course didn't use macrons (nor did they consistently indicate vowel length), but just about every modern-era pedagogical resource does, so this was frankly a bizarre decision by the editors. It requires us to find the macronized forms somewhere else on the page (namely in the etymology subsection) and then merge them with the pronunciations. To debug, the obvious thing to do is to mock up one stream or the other, either the etymological macronized headwords or the pronunciations, and see which one isn't matching.
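As a rough illustration of that debugging approach, here is a minimal sketch that mocks both streams and reports which headwords fail to pair up. Nothing here is wikipron's actual code: `strip_macrons`, `diagnose`, and the sample words and transcriptions are hypothetical placeholders, and it assumes the merge is keyed on the macron-free headword.

```python
import unicodedata


def strip_macrons(word: str) -> str:
    """Removes combining macrons (U+0304) to recover the plain headword."""
    decomposed = unicodedata.normalize("NFD", word)
    stripped = "".join(ch for ch in decomposed if ch != "\u0304")
    return unicodedata.normalize("NFC", stripped)


def diagnose(macronized, pronunciations):
    """Compares the two streams by plain headword and reports mismatches."""
    macron_keys = {strip_macrons(form) for form in macronized}
    pron_keys = {headword for headword, _ in pronunciations}
    return {
        "matched": sorted(macron_keys & pron_keys),
        "macronized_only": sorted(macron_keys - pron_keys),
        "pronunciation_only": sorted(pron_keys - macron_keys),
    }


# Mocked streams; the words and transcriptions are made up for illustration.
mock_macronized = ["rosa", "amīcus", "māter"]
mock_pronunciations = [
    ("rosa", "ˈro.sa"),
    ("amicus", "aˈmiː.kus"),
    ("pater", "ˈpa.ter"),
]

report = diagnose(mock_macronized, mock_pronunciations)
for bucket, words in report.items():
    print(bucket, words)
```

Swapping one mocked list for the real scraped stream at a time should show which side is coming back empty or keyed differently: if everything lands in `pronunciation_only`, the etymology lookup is the likely culprit, and vice versa.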

It should be possible to push through with a new version of #509 and just ignore Latin, which is probably a good idea given that it's been a while since a big scrape was merged.

Testing of Latin is (hackily) paused in #520.