kiselev-dv / gazetteer

OSM ElasticSearch geocoder and addresses exporter

Home Page: http://osm.me

Use names from Wikidata for wikipedia-tagged objects

d1g opened this issue · comments

commented

Since old_name and alt_name are not tagged in OSM, these variants should at least be taken into account during geocoding:
http://www.openstreetmap.org/relation/337422

  • Питер
  • Петербург
  • СПб
  • etc

https://www.wikidata.org/wiki/Q656

It would be great to have this feature!

To get that data into gazetteer's output, I need a local dump of Wikidata. If I query Wikidata during data processing, generating the data will take forever. So I'll probably implement this if it's possible to get a subset of Wikidata.

commented

to have local dump of wikidata

@kiselev-dv

a SPARQL query to fetch labels

44791 Results in 452 ms thanks to WDQS

@d1g, cool, a few more things:

  1. How do I get the language code for a label?
  2. How do I get type codes? I think Wikidata items can be bound not only to localities but also to streets and boundaries.

As an option, I can collect the full list of Wikidata identifiers during the first pass, download all the Wikidata data in a few batched API calls, and join that data later. But it's easier and faster to have that data downloaded before the gazetteer run.

commented

@kiselev-dv

  1. Simply add (LANG(?l) as ?lang) after ?l.
  2. I'm not exactly sure which codes you mean. Could you please give an example for SPB or another item?
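For reference, a query with the language code added might be built like this. The linked query itself is not reproduced in the thread, so the variable names `?item` and `?l` and the `rdfs:label` predicate are assumptions; `labels_query` is a hypothetical helper.

```python
def labels_query(type_qid):
    """Return a SPARQL query selecting every label plus its language code."""
    return f"""
SELECT ?item ?l (LANG(?l) AS ?lang) WHERE {{
  ?item wdt:P31 wd:{type_qid} ;   # instance of the given type
        rdfs:label ?l .           # one row per label, in every language
}}
""".strip()


query = labels_query("Q515")  # Q515 is the "cities" type used in this thread
```

Note that `rdfs:label` returns only the main label per language. Short forms such as those listed at the top of this issue are usually stored as aliases, which Wikidata exposes as `skos:altLabel`, so a real query would probably need both.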

Here is the code for cities:

 ?item wdt:P31 wd:Q515; # cities

How could I get something like:

?item wdt:P31 wd:Q515; # cities
or ?item wdt:P31 wd:Q123; # states
or ?item wdt:P31 wd:Q1234; # streets 
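One way to express that "or" in SPARQL is a `VALUES` clause, which binds a variable to each listed type in turn (a `UNION` of patterns would also work). A minimal sketch, with `types_query` as a hypothetical helper and the placeholder ids from the snippet above kept as-is:

```python
def types_query(type_qids):
    """Return a SPARQL query matching items that are instances of any listed type."""
    values = " ".join(f"wd:{qid}" for qid in type_qids)
    return (
        "SELECT ?item ?type WHERE {\n"
        f"  VALUES ?type {{ {values} }}\n"  # ?type takes each listed value in turn
        "  ?item wdt:P31 ?type .\n"
        "}"
    )


# Q123 and Q1234 are the placeholder ids from the question, not real type codes.
query = types_query(["Q515", "Q123", "Q1234"])
```

Selecting `?type` alongside `?item` keeps the matched type in each row, so it can later be mapped to a locality, street, or boundary.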

Heh, that's what I've been afraid of. Are there any analogs of RDBMS joins, subqueries, or recursive queries in Wikidata?

commented

I prefer to load data using SPARQL but do the really complex processing with regular tools, e.g. Python.
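As an illustration of that split, the sketch below groups labels by item and language from a response in the standard SPARQL JSON results format. The binding names `item` and `l` are assumptions matching the variables discussed above, `group_labels` is a hypothetical helper, and the sample response is hand-written rather than real WDQS output.

```python
from collections import defaultdict


def group_labels(results):
    """Group labels from SPARQL JSON results as {qid: {lang: [labels]}}."""
    names = defaultdict(lambda: defaultdict(list))
    for row in results["results"]["bindings"]:
        # Entity URIs look like http://www.wikidata.org/entity/Q656
        qid = row["item"]["value"].rsplit("/", 1)[-1]
        label = row["l"]
        lang = label.get("xml:lang", "")  # language tag of the literal, if any
        names[qid][lang].append(label["value"])
    return {qid: dict(by_lang) for qid, by_lang in names.items()}


# Hand-written sample in the SPARQL JSON results shape.
sample = {
    "results": {
        "bindings": [
            {"item": {"type": "uri", "value": "http://www.wikidata.org/entity/Q656"},
             "l": {"type": "literal", "xml:lang": "en", "value": "Saint Petersburg"}},
            {"item": {"type": "uri", "value": "http://www.wikidata.org/entity/Q656"},
             "l": {"type": "literal", "xml:lang": "ru", "value": "Санкт-Петербург"}},
        ]
    }
}

grouped = group_labels(sample)
```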

commented

@kiselev-dv, it is possible to fetch any division using the Q10864048 item: 3623 results in 6046 ms
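The query behind that link is not reproduced here. One common pattern for this kind of lookup, and the closest SPARQL analog to a recursive query, is a property path that follows "subclass of" (P279) any number of times. A sketch under that assumption, with `subclass_instances_query` as a hypothetical helper:

```python
def subclass_instances_query(root_qid):
    """Return a SPARQL query for instances of root_qid or of any of its subclasses."""
    return (
        "SELECT ?item WHERE {\n"
        # P31 = instance of; P279* = zero or more "subclass of" hops
        f"  ?item wdt:P31/wdt:P279* wd:{root_qid} .\n"
        "}"
    )


query = subclass_instances_query("Q10864048")
```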