jwngr / sdow

Six Degrees of Wikipedia

Home Page: https://www.sixdegreesofwikipedia.com

sdow is not able to follow links with accented characters

Nhoya opened this issue

Apparently, links with accented characters break the search result. Tags work fine; the problem is that Unicode characters are not handled correctly.

Demo: https://www.sixdegreesofwikipedia.com/?source=White%20hat%20%28computer%20security%29&target=%C3%89cole
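For reference, the `target` parameter in the demo URL is the percent-encoded UTF-8 form of École, so the link itself is well-formed. A minimal sketch using only the Python standard library shows the decoding; the helper name `decode_query` is made up for illustration and is not part of sdow:

```python
from urllib.parse import urlsplit, parse_qs

DEMO_URL = (
    "https://www.sixdegreesofwikipedia.com/"
    "?source=White%20hat%20%28computer%20security%29&target=%C3%89cole"
)


def decode_query(url):
    """Return the query parameters of `url` as decoded Unicode strings."""
    query = urlsplit(url).query
    # parse_qs percent-decodes values as UTF-8 by default and returns lists;
    # keep the first value for each key.
    return {key: values[0] for key, values in parse_qs(query).items()}


params = decode_query(DEMO_URL)
print(params["source"])
print(params["target"])
```

The bytes `C3 89` are the UTF-8 encoding of U+00C9 (É), which is why `%C3%89cole` decodes to the page title École.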

Nope, there actually is no path between those pages because no pages link to École:

sqlite> SELECT id FROM pages WHERE title = 'École';
19771211
sqlite> SELECT incoming_links FROM links WHERE id = 19771211;

École is a disambiguation page, and there is a general Wikipedia policy to never link to a disambiguation page. I've kept disambiguation pages in the database for completeness, although I've definitely considered removing them.
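The same check can be scripted. The sketch below is illustrative only: it builds a tiny in-memory SQLite database with the two tables and columns named in the queries above, and it assumes `incoming_links` is a `|`-separated string of page IDs, which is not confirmed in this thread. The helper `incoming_link_ids` and every sample row other than École's ID are hypothetical.

```python
import sqlite3


def incoming_link_ids(conn, title):
    """Return the IDs of pages linking to `title`, or [] if there are none."""
    row = conn.execute(
        "SELECT l.incoming_links FROM pages p "
        "JOIN links l ON l.id = p.id WHERE p.title = ?",
        (title,),  # bound parameter: the Unicode title is passed as-is
    ).fetchone()
    if row is None or not row[0]:
        return []
    # Assumption: incoming links are stored as a '|'-separated string.
    return [int(page_id) for page_id in row[0].split("|")]


# Hypothetical sample data mirroring the schema used in the queries above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pages (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("CREATE TABLE links (id INTEGER PRIMARY KEY, incoming_links TEXT)")
conn.execute("INSERT INTO pages VALUES (19771211, 'École')")
conn.execute("INSERT INTO links VALUES (19771211, '')")
conn.execute("INSERT INTO pages VALUES (1, 'Doppelgänger')")
conn.execute("INSERT INTO links VALUES (1, '2|3')")

print(incoming_link_ids(conn, "École"))
print(incoming_link_ids(conn, "Doppelgänger"))
```

In this sketch, a page with no incoming links yields an empty list, which corresponds to the "no path" result, while a title with non-ASCII characters is looked up like any other.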

To confirm that Unicode characters work properly, try searching for a path from Doppelgänger to El Niño.