swalter2 / lemon.dbpedia

lemon lexicon for DBpedia

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

lemon lexica for DBpedia

The folders en, de, es and jp contain the development versions of an English, German, Spanish and Japanese lexicon for the DBpedia ontology. They comprise several LDP files with entries using the lemon design patterns and pooled by domain (persons, organizations, arts and entertainment, animals and plants, etc.), together with a file containing all entries that could not be created using those patterns but only by writing lemon RDF triples (extra.ttl). Additionally, the file references.ttl defines classes and properties that are not part of the DBpedia ontology but are used in the lexicalizations.

In order to create a single RDF lexicon file, run the export script with the language folder as argument, for example:

$ ./export.sh en

This requires:

The file allURIs contains a list of all URIs in the DBpedia 3.8 ontology (schema but no instance data). Exporting the English lexicon creates the files en_lexicalizedURIs (all URIs that occur in the lexicalizations) and en_todoURIs (all URIs that do not yet occur in any lexicalization).

Further, the statistics.py script outputs the number of verbalizations (per classes, properties and total) as well as the average number of entries and their distribution.

Latest release

English lexicon version 1 (July 2013)

The first release of the English lexicon for DBpedia 3.8 covers 353 classes as well as 300 properties (all those that have more than 10,000 occurrences in the DBpedia dataset, with only a few exceptions).

  • Total lexicalizations: 1,216 (1.8 entries per concept)
  • Class lexicalizations: 443 (1.3 entries per class)
  • Property lexicalizations: 773 (2.4 entries per property)

Published on: lemon-model.net/lexica/dbpedia_en (under Creative Commons BY 3.0 license)

Want to contribute?

If you want to help to improve and extend the lexicon, if you want to port it to others languages, or if you are using the lexicon, we'd love to hear from you!

About

lemon lexicon for DBpedia


Languages

Language:Python 69.0%Language:Scala 24.2%Language:Shell 6.8%