adbar / toponyms

Old prototype for toponym extraction in historical texts written in German

This code base is outdated; development now takes place in the geokelone package.

Project basics

Prototype for toponym extraction in historical texts written in German, as seen in:

Files currently in the repository

  • Scripts released under GNU GPLv3 license

    1. Extraction of place names (work in progress; only the stable version is included)
    2. Preparation of data from Geonames (used as a fallback)
  • Curated registers of place names, released under the CC BY-SA 4.0 license (2016 versions, update pending)

    1. Manually curated historical lists: continental, state, and region levels
    2. Semi-automatically completed list: cities and towns
    3. Data gathered from Wikipedia (CC-BY license), to be cleared and uploaded
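To illustrate how the curated registers and the Geonames fallback listed above could fit together, here is a minimal sketch in Python. It is not the code from this repository: the function name `extract_toponyms`, the dictionary layout, and the sample entries are all hypothetical, and the real scripts may tokenize and match quite differently. The sketch only shows the general idea of checking curated lists first and consulting the fallback data only when no curated entry matches.

```python
import re

# Hypothetical sample entries; the actual registers are not reproduced here.
CURATED = {
    "Europa": "continent",
    "Preußen": "state",
    "Schlesien": "region",
    "Breslau": "city",
}

# Hypothetical stand-in for data prepared from Geonames.
GEONAMES_FALLBACK = {
    "Wien": "city",
    "Graz": "city",
}

# Simple word tokenizer; \w matches umlauts and ß in Python 3 strings.
TOKEN = re.compile(r"\w+")


def extract_toponyms(text, curated=CURATED, fallback=GEONAMES_FALLBACK,
                     stoplist=frozenset()):
    """Return (token, level, source) tuples for tokens found in a register.

    Curated entries take precedence; the fallback is consulted only
    when the token is absent from the curated register.
    """
    results = []
    for match in TOKEN.finditer(text):
        token = match.group(0)
        if token in stoplist:
            # Skip tokens explicitly excluded by the stop list.
            continue
        if token in curated:
            results.append((token, curated[token], "curated"))
        elif token in fallback:
            results.append((token, fallback[token], "geonames"))
    return results


sample = "Die Reise ging von Breslau durch Schlesien nach Wien."
hits = extract_toponyms(sample)
for token, level, source in hits:
    print(token, level, source)
```

Looking up single tokens keeps the sketch short; multi-word place names and historical spelling variants would need additional handling that is not shown here.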

Sources

Data

Wikipedia

Other

Stop lists

Method

Additional information

Thanks to Logan Pecinovsky (BBAW, Berlin) and Judith Brottrager (ÖAW, Vienna) for helping with the curation of the lists.

License: GNU General Public License v3.0