spirineta / GL_ro

Haskell implementation of Pustejovsky's Generative Lexicon for the Romanian language

Generative Lexicon (GL) is a theory of linguistic semantics which focuses on the distributed nature of compositionality in natural language. The first major work outlining the framework is James Pustejovsky's "The Generative Lexicon" (1991). Subsequent important developments are presented in Pustejovsky and Boguraev (1993), Busa (1996), and Bouillon (1997). The first unified treatment of GL was given in Pustejovsky (1995). Unlike purely verb-based approaches to compositionality, Generative Lexicon attempts to spread the semantic load across all constituents of the utterance. Central to the philosophical perspective of GL are two major lines of inquiry: (1) How is it that we are able to deploy a finite number of words in our language in an unbounded number of contexts? (2) Are lexical information and the representations used in composing meanings separable from our commonsense knowledge?

GL was initially developed as a theoretical framework for encoding selectional knowledge in natural language. This in turn required making some changes in the formal rules of representation and composition. Perhaps the most controversial aspect of GL has been the manner in which lexically encoded knowledge is exploited in the construction of interpretations for linguistic utterances. The computational resources available to a lexical item within this theory consist of the following four levels:

   1. Lexical Typing Structure: giving an explicit type for a word positioned within a type system for the language;
   2. Argument Structure: specifying the number and nature of the arguments to a predicate;
   3. Event Structure: defining the event type of the expression and any subeventual structure it may have;
   4. Qualia Structure: a structural differentiation of the predicative force for a lexical item.
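
The four levels above can be pictured as the fields of a single record. The following is only a minimal sketch under stated assumptions: every type, field, and function name (LexType, Argument, EventStructure, LexicalEntry, arity, isSubtypeOf) is hypothetical and is not taken from the GL_ro sources, and the entry for Romanian "a construi" (to build) is a toy example.

```haskell
import Data.List (isPrefixOf)

-- Illustrative sketch only: all names below are assumptions,
-- not the actual GL_ro data model.

-- Level 1: a position in a toy type system, given as a path from the root.
newtype LexType = LexType [String]
  deriving (Eq, Show)

-- Level 2: one argument of a predicate, with a type constraint on its filler.
data Argument = Argument
  { argName :: String
  , argType :: LexType
  } deriving (Eq, Show)

-- Level 3: the event type, with optional subevents.
data EventStructure
  = State
  | Process
  | Transition [EventStructure]  -- subeventual structure
  deriving (Eq, Show)

-- Level 4: qualia values, tagged by role.
data QualiaRole = Formal | Constitutive | Telic | Agentive
  deriving (Eq, Show)

data LexicalEntry = LexicalEntry
  { lemma     :: String
  , lexType   :: LexType
  , arguments :: [Argument]
  , event     :: Maybe EventStructure      -- Nothing for non-eventive words
  , qualia    :: [(QualiaRole, String)]
  } deriving (Eq, Show)

-- Number of arguments the entry's predicate takes.
arity :: LexicalEntry -> Int
arity = length . arguments

-- True when the entry's type lies at or below the given type in the hierarchy.
isSubtypeOf :: LexicalEntry -> LexType -> Bool
isSubtypeOf entry (LexType ancestor) =
  let LexType path = lexType entry
  in ancestor `isPrefixOf` path

-- Toy entry for "a construi" (to build); the values are made up for illustration.
construi :: LexicalEntry
construi = LexicalEntry
  { lemma     = "a construi"
  , lexType   = LexType ["event", "creation"]
  , arguments =
      [ Argument "builder"  (LexType ["entity", "animate"])
      , Argument "artifact" (LexType ["entity", "artifact"])
      ]
  , event     = Just (Transition [Process, State])
  , qualia    = [(Formal, "exist"), (Agentive, "build_act")]
  }
```

Loaded in GHCi, `arity construi` evaluates to 2, and ``construi `isSubtypeOf` LexType ["event"]`` evaluates to True. Encoding the type as a path keeps the subtype check a simple prefix test; a real type system with multiple inheritance would need a richer structure.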

The qualia structure, inspired by Moravcsik's (1975) interpretation of the aitia of Aristotle, is defined by Pustejovsky as the modes of explanation associated with a word or phrase in the language. Its four roles are defined as follows:

   1. formal: the basic category which distinguishes the meaning of a word within a larger domain;
   2. constitutive: the relation between an object and its constituent parts;
   3. telic: the purpose or function of the object, if there is one;
   4. agentive: the factors involved in the object's origins or "coming into being".
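
As a rough illustration of the four roles, a qualia structure can be written as a record with one field per role. This is a sketch under stated assumptions, not the representation used in GL_ro: the names Qualia, carte, piatra, and explain are hypothetical, the values are English glosses chosen for readability, and modelling "piatra" (stone) without telic or agentive values is a simplification made for this example.

```haskell
-- Illustrative sketch only: names and values are assumptions,
-- not the GL_ro API.

data Qualia = Qualia
  { formal       :: String        -- basic category within a larger domain
  , constitutive :: [String]      -- constituent parts
  , telic        :: Maybe String  -- purpose or function, if there is one
  , agentive     :: Maybe String  -- how the object comes into being
  } deriving (Eq, Show)

-- Toy qualia for Romanian "carte" (book).
carte :: Qualia
carte = Qualia
  { formal       = "physical_object"
  , constitutive = ["pages", "cover"]
  , telic        = Just "read"
  , agentive     = Just "write"
  }

-- Toy qualia for "piatra" (stone), modelled here with no purpose and
-- no creating act (an assumption of this sketch).
piatra :: Qualia
piatra = Qualia
  { formal       = "natural_object"
  , constitutive = []
  , telic        = Nothing
  , agentive     = Nothing
  }

-- Render every mode of explanation that is present as a short sentence.
explain :: String -> Qualia -> [String]
explain w q = concat
  [ [w ++ " is a kind of " ++ formal q]
  , [w ++ " has part " ++ p | p <- constitutive q]
  , [w ++ " is for: " ++ t | Just t <- [telic q]]
  , [w ++ " comes from: " ++ a | Just a <- [agentive q]]
  ]
```

The telic role is a Maybe because the definition above says "if there is one"; `explain "piatra" piatra` therefore yields only the formal sentence, while `explain "carte" carte` yields five sentences.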

References:
    * Bouillon, P. 1997. Polymorphie et sémantique lexicale: le cas des adjectifs, Ph.D. Dissertation, Paris VII, Paris.
    * Busa, F. 1996. Compositionality and the Semantics of Nominals, Ph.D. Dissertation, Brandeis University.
    * Moravcsik, J. M. 1975. Aitia as Generative Factor in Aristotle's Philosophy, Dialogue, 14:622-36.
    * Pustejovsky, J. 1991. The Generative Lexicon, Computational Linguistics, 17(4).
    * Pustejovsky, J. 1995. The Generative Lexicon, MIT Press, Cambridge, MA.
    * Pustejovsky, J. and B. Boguraev. 1993. Lexical Knowledge Representation and Natural Language Processing, Artificial Intelligence, 63:193-223.

----

About the SIMPLE-CLIPS Ontology


PAROLE-SIMPLE-CLIPS offers the advantage of being compatible with the other eleven PAROLE-SIMPLE lexicons that were built for European languages and that share a common theoretical model, representation language and building methodology. A PAROLE-SIMPLE-CLIPS entry gathers together all the phonological, morphological and inherent syntactic and semantic properties of a headword. Its subcategorization patterns are described in terms of optionality, syntactic function, syntagmatic realization as well as morpho-syntactic, syntactic and lexical properties of each slot filler. At the semantic level, the theoretical approach adopted by the SIMPLE model is essentially grounded on a revisited version of some fundamental aspects of the Generative Lexicon. A SIMPLE-CLIPS semantic unit is richly endowed with a wide range of fine-grained, structured information, most relevant for NLP applications. First among them is the ontological typing: the lexicon is structured in terms of a multidimensional type system based on both hierarchical and non-hierarchical conceptual relations, taking into account the principle of orthogonal inheritance. Other relevant information types in a word entry are its domain of use; type of denoted event; synonymy and morphological derivation relations; membership in a class of regular polysemy; and any relevant distinctive semantic features. Particularly notable is the information encoded in the Extended Qualia Structure (a set of 60 semantic relations that allow modelling both the different meaning dimensions of a word sense and its relationships to other lexical units) and in the Predicative Representation, which describes the semantic scenario the word sense is involved in and characterizes its participants in terms of thematic roles and semantic constraints. In a word's description, lexical information is interrelated across the four description levels. Syntactic and semantic information, in particular, are related to each other through the projection of the predicate-argument structure onto its syntactic realization(s).
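
To make the shape of such a semantic unit more concrete, here is a small sketch in which extended-qualia links point to other units and each participant of the predicative representation records the syntactic function that realizes it. It rests entirely on assumptions: the type and field names, the relation labels ("is_a", "used_for", "created_by"), and the two toy units are hypothetical and do not reproduce the actual SIMPLE-CLIPS schema or its 60 relations.

```haskell
-- Illustrative sketch only: names and relation labels are assumptions
-- and do not reproduce the SIMPLE-CLIPS schema.

data QualiaRole = Formal | Constitutive | Telic | Agentive
  deriving (Eq, Show)

-- One extended-qualia link from a semantic unit to another lexical unit.
data Relation = Relation
  { role    :: QualiaRole
  , relName :: String   -- hypothetical relation label
  , target  :: String   -- identifier of the related semantic unit
  } deriving (Eq, Show)

-- One participant of the predicative representation, tied to the
-- syntactic function of the slot that realizes it.
data Participant = Participant
  { thematicRole :: String
  , constraint   :: String   -- semantic constraint on the filler
  , syntacticFn  :: String
  } deriving (Eq, Show)

data SemanticUnit = SemanticUnit
  { unitId       :: String
  , ontoType     :: String         -- ontological type
  , domain       :: Maybe String   -- domain of use, if any
  , relations    :: [Relation]
  , participants :: [Participant]
  } deriving (Eq, Show)

-- Identifiers of all units linked to this one under a given qualia role.
relatedBy :: QualiaRole -> SemanticUnit -> [String]
relatedBy r u = [target rel | rel <- relations u, role rel == r]

-- Syntactic function that realizes a given thematic role, if any.
realizedAs :: String -> SemanticUnit -> Maybe String
realizedAs th u =
  lookup th [(thematicRole p, syntacticFn p) | p <- participants u]

-- Toy noun unit: "carte" (book).
carte :: SemanticUnit
carte = SemanticUnit
  { unitId       = "carte"
  , ontoType     = "artifact"
  , domain       = Just "publishing"
  , relations    =
      [ Relation Formal   "is_a"       "obiect"
      , Relation Telic    "used_for"   "citi"
      , Relation Agentive "created_by" "scrie"
      ]
  , participants = []
  }

-- Toy verb unit: "a citi" (to read).
citi :: SemanticUnit
citi = SemanticUnit
  { unitId       = "citi"
  , ontoType     = "event"
  , domain       = Nothing
  , relations    = []
  , participants =
      [ Participant "agent" "human"       "subject"
      , Participant "theme" "information" "object"
      ]
  }
```

With these toy units, `relatedBy Telic carte` gives ["citi"] and `realizedAs "agent" citi` gives Just "subject", which mirrors the idea of projecting the predicate-argument structure onto a syntactic realization.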
