qtheperfect / wordnet-to-json

WordNet in JSON format.

WordNet in JSON Format

The WordNet dataset is distributed in a custom format that requires dedicated parsing routines. JSON is a universal data format that is supported across programming languages.

With JSON, anyone can import and use the WordNet dataset with ease.

Citation

This dataset is based on: Princeton University "About WordNet." WordNet. Princeton University. 2010. http://wordnet.princeton.edu

Please cite them if you use this dataset.

Download

See releases.

File Structure

WordNet (root object)

An entire WordNet database.

Fields:

  • synset (map to Synset) from synset ID to synset object.
  • lemma (map to string array) from pos.lemma to synset IDs that contain it.
  • lemmaRanked (map to string array) like lemma, but synsets are ordered from the most frequently used to the least. Only a subset of the synsets is ranked, so lemmaRanked has fewer synsets.
  • exception (map to string array) from exceptional word to its forms.
  • example (map to string) from example ID to sentence template.
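
As an illustration of how the root object can be consumed, here is a minimal Go sketch that decodes a document of this shape and looks up the glosses for a lemma. The struct definitions and the helpers load and glosses are assumptions written for this example, not the project's own API. The struct tags are taken from the field names listed above, and Go's encoding/json also matches keys case-insensitively. The sample document is hand-written and the synset ID "n1" is made up.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Synset holds only the fields this sketch needs.
type Synset struct {
	Pos   string   `json:"pos"`
	Word  []string `json:"word"`
	Gloss string   `json:"gloss"`
}

// WordNet mirrors the root object. The struct tags are assumptions
// based on the field names in this README.
type WordNet struct {
	Synset      map[string]*Synset  `json:"synset"`
	Lemma       map[string][]string `json:"lemma"`
	LemmaRanked map[string][]string `json:"lemmaRanked"`
	Exception   map[string][]string `json:"exception"`
	Example     map[string]string   `json:"example"`
}

// sample is a tiny hand-written document, not real WordNet data.
// The synset ID "n1" is hypothetical.
const sample = `{
  "synset": {"n1": {"pos": "n", "word": ["dog"], "gloss": "a domesticated canine"}},
  "lemma": {"n.dog": ["n1"]},
  "lemmaRanked": {"n.dog": ["n1"]},
  "exception": {},
  "example": {}
}`

// load decodes a JSON document into a WordNet value.
func load(data []byte) (*WordNet, error) {
	var wn WordNet
	if err := json.Unmarshal(data, &wn); err != nil {
		return nil, err
	}
	return &wn, nil
}

// glosses returns the gloss of every synset listed under the
// "pos.lemma" key, skipping IDs that are missing from the synset map.
func glosses(wn *WordNet, pos, lemma string) []string {
	var out []string
	for _, id := range wn.Lemma[pos+"."+lemma] {
		if s, ok := wn.Synset[id]; ok {
			out = append(out, s.Gloss)
		}
	}
	return out
}

func main() {
	wn, err := load([]byte(sample))
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	for _, g := range glosses(wn, "n", "dog") {
		fmt.Println(g)
	}
}
```

For a real release file, the bytes could be read with os.ReadFile and passed to load in the same way.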

Synset

A set of synonymous words.

Fields:

  • offset (int) synset offset in the raw data file, also used as an identifier.
  • pos (string) part of speech:
    • a: adjective
    • n: noun
    • r: adverb
    • s: satellite
    • v: verb
  • word (string array) words in this synset.
  • pointer (Pointer array) pointers to other synsets.
  • frame (Frame array) sentence frames for verbs.
  • gloss (string) lexical definition.
  • example (Example array) usage examples for words in this synset. Verbs only.

Pointer

Denotes a semantic relation from one synset or word to another.

Fields:

  • symbol (string) relation between the 2 words. Target is <symbol> to source. See their meanings here.
  • synset (string) target synset ID.
  • source (int) index of word in source synset, -1 for entire synset.
  • target (int) index of word in target synset, -1 for entire synset.
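
To show how the source and target indices might be interpreted, the following Go sketch resolves a pointer into the words it connects. It assumes the indices are zero-based (since -1 is reserved for the entire synset). The helpers wordsAt, resolve, and sampleSynsets, and the synset IDs "n1" and "n2", are hypothetical. "@" is WordNet's hypernym symbol.

```go
package main

import "fmt"

// Pointer follows the field list above.
type Pointer struct {
	Symbol string `json:"symbol"`
	Synset string `json:"synset"`
	Source int    `json:"source"`
	Target int    `json:"target"`
}

// Synset holds only the fields this sketch needs.
type Synset struct {
	Word    []string  `json:"word"`
	Pointer []Pointer `json:"pointer"`
}

// wordsAt returns the words an index refers to: every word for -1,
// otherwise the single word at that zero-based position (an assumption).
func wordsAt(s *Synset, idx int) ([]string, error) {
	if idx == -1 {
		return s.Word, nil
	}
	if idx < 0 || idx >= len(s.Word) {
		return nil, fmt.Errorf("word index %d out of range", idx)
	}
	return s.Word[idx : idx+1], nil
}

// resolve returns the source-side and target-side words of pointer p,
// where src is the synset that contains p.
func resolve(synsets map[string]*Synset, src *Synset, p Pointer) (from, to []string, err error) {
	dst, ok := synsets[p.Synset]
	if !ok {
		return nil, nil, fmt.Errorf("unknown target synset %q", p.Synset)
	}
	if from, err = wordsAt(src, p.Source); err != nil {
		return nil, nil, err
	}
	if to, err = wordsAt(dst, p.Target); err != nil {
		return nil, nil, err
	}
	return from, to, nil
}

// sampleSynsets builds hand-written illustration data with made-up IDs.
func sampleSynsets() map[string]*Synset {
	return map[string]*Synset{
		"n1": {
			Word:    []string{"dog", "domestic dog"},
			Pointer: []Pointer{{Symbol: "@", Synset: "n2", Source: -1, Target: -1}},
		},
		"n2": {Word: []string{"canine", "canid"}},
	}
}

func main() {
	synsets := sampleSynsets()
	src := synsets["n1"]
	for _, p := range src.Pointer {
		from, to, err := resolve(synsets, src, p)
		if err != nil {
			fmt.Println("skipping pointer:", err)
			continue
		}
		fmt.Printf("%v --%s--> %v\n", from, p.Symbol, to)
	}
}
```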

Frame

Links a synset word to a generic phrase that illustrates how to use it. Applies to verbs only.

Fields:

  • wordNumber (int) index of word in the containing synset, -1 for entire synset.
  • frameNumber (int) frame number on the WordNet site.

Example

Links a synset word to an example sentence. Applies to verbs only.

Fields:

  • wordNumber (int) index of word in the containing synset, -1 for entire synset.
  • templateNumber (int) number of the template in the WordNet.Example field.
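
One way to turn an Example entry into a sentence is sketched below in Go. It rests on two assumptions that are not stated in this README: that the keys of WordNet.Example are the decimal form of templateNumber, and that each template contains a %s placeholder for the verb. The sentences helper and the sample data are hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Example follows the field list above.
type Example struct {
	WordNumber     int `json:"wordNumber"`
	TemplateNumber int `json:"templateNumber"`
}

// Synset holds only the fields this sketch needs.
type Synset struct {
	Word    []string  `json:"word"`
	Example []Example `json:"example"`
}

// sentences fills each referenced template with the word(s) it applies to.
// Assumptions: template keys are decimal template numbers, and "%s" marks
// the slot for the verb. Entries that cannot be resolved are skipped.
func sentences(templates map[string]string, s *Synset) []string {
	var out []string
	for _, ex := range s.Example {
		tmpl, ok := templates[strconv.Itoa(ex.TemplateNumber)]
		if !ok {
			continue
		}
		words := s.Word // -1 means the example applies to every word
		if ex.WordNumber != -1 {
			if ex.WordNumber < 0 || ex.WordNumber >= len(s.Word) {
				continue
			}
			words = s.Word[ex.WordNumber : ex.WordNumber+1]
		}
		for _, w := range words {
			out = append(out, strings.Replace(tmpl, "%s", w, 1))
		}
	}
	return out
}

func main() {
	// Hand-written illustration data, not copied from the dataset.
	templates := map[string]string{"1": "The children %s to the playground"}
	s := &Synset{
		Word:    []string{"run", "sprint"},
		Example: []Example{{WordNumber: -1, TemplateNumber: 1}},
	}
	for _, line := range sentences(templates, s) {
		fmt.Println(line)
	}
}
```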

Go API

If you are working with Go, I encourage you to skip this JSON file and work directly with the Go API. This JSON dump is simply a marshaled WordNet struct.

Having Trouble?

If you have any issues, questions, or comments, feel free to share them in the issues section.

License: MIT License