bryanegan / navajo-tdl

Most updated Navajo tdl

LinGO Grammar Matrix v 0.6, October 15, 2003 (dpf)

This is a minor tuning of version 0.5, including a refinement of the KEYS
attributes, more normalization of predicate names (especially for messages),
some bug fixes in the syntactic rule schemata, and a few additional lexical
types.  Details on these changes will be available in the soon-to-be-released 
Matrix Users' Guide.

For those who have already developed a grammar based on the Matrix, the
following changes will have to be made manually in your language-specific
files in order to make them consistent with this version:

1. Changes in feature geometry

   a. SYNSEM.LOCAL.LKEYS ==>
      SYNSEM.LKEYS

      The feature LKEYS has been moved up from LOCAL to SYNSEM, to shorten
      this frequently mentioned path.  (NB: As a related change, the type
      'local-basic' which formerly introduced LKEYS has been deleted.)

   b. LKEYS.--KEYREL ==>
      LKEYS.KEYREL

      LKEYS.--ALTKEYREL ==>
      LKEYS.ALTKEYREL

      These two attributes are the only pointers to relations in the RELS
      list for lexical types, and since they are not shortcuts, the leading
      hyphens have been dropped.  (NB: Since the attributes --COMPKEY and
      --OCOMPKEY are just shortcuts, the leading hyphens for these two names
      remain as a reminder.)

   c. CAT.HEAD.KEYS.MESSAGE ==>
      CONT.MSG

      Since the message value of a headed phrase is not always identified 
      with that of its head daughter, it was an error to make the attribute
      MESSAGE a head feature.  This attribute is now moved to CONT, and its
      name shortened for convenience to MSG.
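
      The path changes in 1a and 1b are mechanical wherever a grammar
      writes them as dotted paths, so they can be scripted.  The Python
      sketch below is one possible approach and is not part of the Matrix
      distribution; the function name is made up for illustration.  It
      assumes dotted paths only (paths spelled out with nested brackets
      need manual editing), and it leaves 1c alone because that change
      also affects the attribute's value.

```python
import re

# Dotted-path rewrites for items 1a and 1b, applied one after another.
# Hypothetical helper: not shipped with the Matrix.
PATH_RENAMES = [
    (re.compile(r"\bSYNSEM\.LOCAL\.LKEYS\b"), "SYNSEM.LKEYS"),    # 1a
    (re.compile(r"\bLKEYS\.--KEYREL\b"), "LKEYS.KEYREL"),         # 1b
    (re.compile(r"\bLKEYS\.--ALTKEYREL\b"), "LKEYS.ALTKEYREL"),   # 1b
]


def migrate_lkeys_paths(tdl_text):
    """Return tdl_text with the 1a/1b changes applied to dotted paths."""
    for pattern, replacement in PATH_RENAMES:
        tdl_text = pattern.sub(replacement, tdl_text)
    return tdl_text
```

      Note that --COMPKEY and --OCOMPKEY keep their hyphens; only the
      enclosing path is shortened.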


2. Strings and symbols: RULE-NAME
   
   The type 'symbol', which along with 'string' was a subtype of 'atom', has
   been dropped, since the distinction between symbols and strings is not
   useful, and was a source of potential confusion.  So any attributes whose
   values were of type 'symbol' should be changed to be of type 'string',
   and values assigned to these attributes should be converted accordingly.
   In particular, in subtypes of 'rule', the value of the attribute RULE-NAME
   should be changed to be enclosed in double quotes; e.g.
          [ RULE-NAME 'subj-head ] ==>
          [ RULE-NAME "subj-head" ]
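
   The RULE-NAME conversion can also be scripted.  The following Python
   sketch is an illustration only (the function name is hypothetical); it
   assumes the old symbol value is written as a single quote followed by
   a name with no embedded whitespace, as in the example above.

```python
import re

# Matches RULE-NAME followed by an old-style symbol such as 'subj-head.
_SYMBOL_RULE_NAME = re.compile(r"(\bRULE-NAME\s+)'([^\s\[\],&]+)")


def quote_rule_names(tdl_text):
    """Rewrite RULE-NAME 'name as RULE-NAME "name"."""
    return _SYMBOL_RULE_NAME.sub(r'\1"\2"', tdl_text)
```

   Values that are already enclosed in double quotes do not match the
   pattern and are left unchanged.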

3. KEYS attributes

   The attributes KEY and ALTKEY can have as values subsorts of the type
   'predsort' (the same kinds of values allowed for the attribute PRED
   in semantic relations).  These attributes enable a word or phrase to be
   semantically selected by a predicate, and as head features they propagate
   up from the lexical head of the phrase.  For example, a verb can select
   for a prepositional phrase headed by a particular preposition, as long as
   the preposition has lexically assigned a specific value (a subtype of
   predsort) to its SYNSEM.LOCAL.CAT.HEAD.KEYS.KEY attribute, and the verb 
   similarly constrains the KEY value of its PP complement (accessed via the 
   SYNSEM.LKEYS.--COMPKEY or --OCOMPKEY of the verb).  Note that the values
   of KEY and ALTKEY are of the same type as the values of the PRED attribute
   within semantic relations, but it is not always the case that the KEY
   value of a sign is identified with the PRED value of one of the relations
   in its RELS list.  Typically, closed-class lexical entries may identify
   KEY and PRED values, but open-class lexical entries won't, since their
   KEY value will be some underspecified subtype of 'predsort' (e.g.
   'noun_rel' or 'verb_rel').

4. Relations and messages

   The revisions introduced in version 0.5 for improved MRSs have led to a
   potential confusion in naming of relations and their PRED values, so we
   introduce a simple naming convention where all subtypes of the type
   'relation' bear the suffix "-relation" as part of their name, and all
   values of the attribute PRED (subsorts of 'predsort') within relations 
   bear the suffix "_rel" as part of their name.

   In keeping with the reduction to a small number of subtypes of 'relation',
   the value of the attribute MSG is now always the relation subtype 'message'
   with appropriate values in the PRED attribute of the 'message' relation,
   drawing from subtypes of the type 'predsort'.  For example, in the type
   'imperative-clause', the following change has been made: 

   imperative-clause := clause &
     [ SYNSEM.LOCAL.CAT.HEAD.KEYS.MESSAGE command ].    ==>

   imperative-clause := clause &
     [ SYNSEM.LOCAL.CONT.MSG.PRED command_m_rel ].
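
   A rewrite of this kind can be sketched in Python as follows.  This is
   an illustration under stated assumptions, not a tool from the Matrix:
   the function and table names are made up, only the 'command' entry in
   the table comes from these notes, and the remaining old-to-new message
   names must be filled in from your own grammar.  Values the table does
   not know are left untouched and reported for manual editing.

```python
import re

# Only 'command' -> 'command_m_rel' is taken from the release notes;
# add the other message types your grammar uses.
MESSAGE_TO_PRED = {"command": "command_m_rel"}

# Old dotted path followed by a plain type name (not a #coreference).
_MESSAGE_PATH = re.compile(r"\bCAT\.HEAD\.KEYS\.MESSAGE(\s+)([A-Za-z0-9_-]+)")


def migrate_messages(tdl_text, table=MESSAGE_TO_PRED):
    """Return (new_text, unknown_values) for the MESSAGE -> MSG.PRED change."""
    unknown = []

    def _swap(match):
        gap, old_value = match.group(1), match.group(2)
        if old_value not in table:
            unknown.append(old_value)
            return match.group(0)  # leave for manual review
        return f"CONT.MSG.PRED{gap}{table[old_value]}"

    return _MESSAGE_PATH.sub(_swap, tdl_text), unknown
```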

5. Rules

   This version incorporates several corrections and improvements to the
   definitions of lexical and syntactic rules proposed by colleagues working 
   on the Japanese and Norwegian grammars, as follows:

   a. In the definition of 'lex-rule', the order of appending of the RELS 
      lists has been reversed, for convenience.
   b. The type 'basic-head-subj-phrase' no longer inherits from the type
      'head-compositional' - this was an error preventing coherent MRSs.
   c. The type 'basic-extracted-comp-phrase' no longer identifies the LEX value
      of mother and daughter - this too was an error making the rule unusable.
   d. The type 'basic-head-mod-phrase-simple' no longer identifies the value
      of HOOK on mother and nonhead daughter, since this is no longer uniform
      for scopal and intersective modifiers.  Instead this identification is
      done in the type 'scopal-mod-phrase'; in contrast, the type
      'isect-mod-phrase' now inherits from 'head-compositional', identifying
      the HOOK values of mother and head daughter.
   e. In a related change, the type 'extracted-adj-phrase' is now restricted
      to extracting intersective modifiers, so that the value of HOOK can be
      correctly constrained.

6. Lexeme types

   In order to capture the usual configuration of semantic constraints for
   open-class lexical entries, the types 'lex-item', 'norm-lex-item', and
   'lexeme' have been added.  Some closed-class lexical entries, like those
   for determiners in English, do not conform to the constraints in 
   'norm-lex-item', but most lexical entries will.  We further add the
   constraint that the outputs of lexeme-to-lexeme rules will conform to the 
   constraints in 'norm-lex-item'.  We look forward to feedback, as always.
  
------------------------------------------------------------------------------
Grammar matrix v 0.5, August 15, 2003 (dpf)

This is an upgrade of version 0.4 of the grammar matrix, with some
further normalization of relation names and MRS feature geometry to be
consistent with the Copestake et al.  paper, "Introduction to MRS",
being readied for publication.

If you have already developed a grammar based on the matrix, you will
need to make at least one set of manual adjustments to your
language-specific grammar files, since the location of the KEYS
attribute has changed, and the constraints on its attributes have also
changed.  The KEYS attributes had been used in matrix-derived grammars
for two distinct purposes, first to simplify the notation when defining
lexical types, and second to express constraints on semantic selection
within phrases.  The first usage was a convenient shorthand notation
which is irrelevant to phrasal signs, while the second is crucial in
constraining phrases.  These two notions are now distinct in the
matrix, with the attribute LKEYS now containing these 'shorthand'
attributes convenient for defining lexical types, and the attribute
KEYS now made a HEAD feature.  The attributes in KEYS are also more
strictly constrained, with KEY and ALTKEY no longer taking whole
relations as values, but only semantic sorts (see the User Guide for
elaboration).  Likewise, the MESSAGE attribute now simply takes a
'message' type (or the distinguished type 'no-msg') as its value,
rather than a difference list.

Obligatory changes to make to language-specific grammar files:

(1) Where your grammar used the KEY and ALTKEY attributes to constrain
    the properties of a selected constituent (complement, specifier, 
    subject, or modifier), change these values of KEYS.KEY and 
    KEYS.ALTKEY to be subtypes of the type 'semsort'.  See the User 
    Guide for elaboration.
(2) Change these paths for SYNSEM.LOCAL.KEYS.KEY and ...ALTKEY to be
    SYNSEM.LOCAL.CAT.HEAD.KEYS.KEY and ...ALTKEY
(3) Where your grammar used the KEY and ALTKEY attributes to constrain
    the value of a lexical type's own semantic relations, change these
    paths for SYNSEM.LOCAL.KEYS.KEY and ...ALTKEY to be
    SYNSEM.LOCAL.LKEYS.--KEYREL and ...--ALTKEYREL
(4) Change the value of KEYS.MESSAGE by removing the diff-list brackets.
(5) Change the paths SYNSEM.LOCAL.KEYS.MESSAGE to
    SYNSEM.LOCAL.CAT.HEAD.KEYS.MESSAGE
(6) Change the values for --COMPKEY and --OCOMPKEY to be the semantic
    sort of the relevant complement, rather than the type of a relation
    (again, see the User Guide for elaboration of semantic sorts).
(7) Change the paths SYNSEM.LOCAL.KEYS.--COMPKEY and ...--OCOMPKEY to
    SYNSEM.LOCAL.LKEYS.--COMPKEY and ...--OCOMPKEY
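
Of the steps above, (5) and (7) are plain path substitutions, while the
KEY and ALTKEY paths go to different places depending on how they were
used, as in (2) versus (3).  The Python sketch below, which is only an
illustration with made-up names, applies the two mechanical rewrites to
dotted paths and lists the line numbers that still mention the old KEY
or ALTKEY path so they can be decided by hand.  Steps (1), (4), and (6)
change values rather than paths and are not attempted.

```python
import re

# Steps (5) and (7): pure path substitutions on dotted paths.
MECHANICAL = [
    (re.compile(r"\bSYNSEM\.LOCAL\.KEYS\.MESSAGE\b"),
     "SYNSEM.LOCAL.CAT.HEAD.KEYS.MESSAGE"),
    (re.compile(r"\bSYNSEM\.LOCAL\.KEYS\.--(O?COMPKEY)\b"),
     r"SYNSEM.LOCAL.LKEYS.--\1"),
]

# Steps (2) and (3): the right target depends on usage, so only report.
NEEDS_REVIEW = re.compile(r"\bSYNSEM\.LOCAL\.KEYS\.(?:ALT)?KEY\b")


def migrate_keys_paths(tdl_text):
    """Return (new_text, line_numbers_needing_manual_review)."""
    for pattern, replacement in MECHANICAL:
        tdl_text = pattern.sub(replacement, tdl_text)
    review = [
        number
        for number, line in enumerate(tdl_text.splitlines(), start=1)
        if NEEDS_REVIEW.search(line)
    ]
    return tdl_text, review
```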

In addition, you may need to make further adjustments, depending on
whether you have made explicit reference to the affected features or
types, which have been changed as follows:

(a) Deleted feature

    The feature E-INDEX was introduced into the matrix for v 0.4, based 
    on its use in the ERG at the time for treating the semantics of 
    predicative PPs and gerunds.  However, improved analysis of English 
    has removed the current motivation for this attribute in HOOK, so 
    it has been deleted from the matrix in order to be consistent with 
    the emerging MRS documentation.

(b) Renaming of type 'mrs-thing', and changes to its subtypes

    The supertype of 'individual' and 'handle' has been renamed from
    'mrs-thing' to 'semarg' (for 'semantic argument').  
    Also, one of its subtypes 'non-expl' has been deleted, since it was 
    confusingly redundant with the type 'event-or-ref-index'.  
    Corresponding adjustments have been made to the type hierarchy under 
    'semarg', though the leaf types remain the same.

(c) Renaming of other relations

    To support a more consistent naming convention for relations, any 
    relation or predicate whose name formerly ended in "-rel" now has a 
    name which is like the previous one except that the hyphen ("-") is 
    always replaced with an underscore ("_").  An explanation of the 
    naming conventions can be found in the Matrix User Guide.
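
    If your grammar refers to such names, the renaming can be sketched
    in Python as below.  This is an illustration only, with a made-up
    function name, and it rests on one reading of the rule: only the
    hyphen of the trailing "-rel" is replaced, and any other hyphens in
    the name are kept.  Check the result against the matrix types before
    relying on it.

```python
import re

# A name-final "-rel": preceded by a name character and not followed by
# more name characters, so "-relation" and "-rel-" are not touched.
_REL_SUFFIX = re.compile(r"(?<=\w)-rel(?![\w-])")


def rename_rel_suffix(tdl_text):
    """Replace a name-final "-rel" with "_rel" throughout tdl_text."""
    return _REL_SUFFIX.sub("_rel", tdl_text)
```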

------------------------------------------------------------------------------
Grammar matrix v 0.4, March 10, 2003

This is a minor upgrade of the first version of the grammar matrix
(v 0.3), designed to standardize the feature geometry and naming
conventions for MRS feature structures, and to enable stronger
principles of semantic composition, as presented in Copestake,
Lascarides, and Flickinger (2001).

If you have already developed a grammar based on the matrix, you will
need to make the following manual adjustments to your language-specific
grammar files:

(1) Renamed features 
    Summary: Naming conventions now made consistent with soon-to-be-published
    standard reference on MRS.
    Recommended procedure: Do a global replace for each of the following in
    all of your *.tdl files:
    LISZT    -->  RELS
    H-CONS   -->  HCONS
    TOP      -->  LTOP
    HNDL     -->  LBL
    SC-ARG   -->  HARG
    OUTSCPD  -->  LARG
    SOA      -->  MARG
    RESTR    -->  RSTR
    BV       -->  ARG0
    EVENT    -->  ARG0
    INST     -->  ARG0
    LABEL    -->  WLINK
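
    The global replace can be scripted.  The Python sketch below is one
    possible approach and is not part of the matrix distribution; the
    function name is made up for illustration.  It matches each old
    feature name only as a whole identifier, so names such as LTOP,
    E-INDEX, and --TOPKEY are not affected by the TOP rule, and it
    rewrites every occurrence in the text, including any in comments.

```python
import re

# Old feature name -> new feature name, as listed above.
RENAMES = {
    "LISZT": "RELS", "H-CONS": "HCONS", "TOP": "LTOP", "HNDL": "LBL",
    "SC-ARG": "HARG", "OUTSCPD": "LARG", "SOA": "MARG", "RESTR": "RSTR",
    "BV": "ARG0", "EVENT": "ARG0", "INST": "ARG0", "LABEL": "WLINK",
}

# Whole-identifier match: TDL names may contain hyphens, so a hyphen on
# either side counts as part of a longer name.
_FEATURE = re.compile(
    r"(?<![\w-])("
    + "|".join(sorted(map(re.escape, RENAMES), key=len, reverse=True))
    + r")(?![\w-])"
)


def rename_features(tdl_text):
    """Apply all renames in a single pass over tdl_text."""
    return _FEATURE.sub(lambda match: RENAMES[match.group(1)], tdl_text)
```

    Apply the function to the contents of each *.tdl file and review the
    resulting diff before loading the grammar.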

(2) Introduction of HOOK attribute
    Summary: The externally visible attributes of an MRS are now grouped
    within a single attribute called HOOK, which is consistently used in
    constructions to identify the properties of the semantic head daughter
    with those of the phrase.  The features in HOOK include the familiar
    LTOP (formerly TOP), INDEX, and E-INDEX, as well as a new feature XARG
    which is unified with the semantic index of the controlled argument of
    a phrase (to simplify the definition of e.g. equi and raising types).
    Recommended procedure: In each of your *.tdl files, search for each 
    occurrence of the three features LTOP, INDEX, and E-INDEX, and insert 
    HOOK into the path preceding each feature.  In some cases, you will see 
    that you can simplify the re-entrancies in your feature structures by 
    referring to HOOK instead of referring to each of the three attributes
    separately.  In addition, consider revising your lexical types 
    for equi and raising predicates to make use of the new XARG feature, 
    which should enable you to avoid reference to arguments of arguments.
    
(3) Naming of argument roles (ARG1, ARG2, ARG3, ARG4)
    Summary: Each relation now assigns its first (least oblique) argument
    to ARG1, its next argument to ARG2, and so on.   The major change from
    the first version of the matrix is to assign objects of transitive verbs
    to ARG2 rather than ARG3, and similarly for objects of prepositions.
    Recommended procedure: In each of your *.tdl files, search for ARG3, and 
    consider replacing it with ARG2.  Check all other role name assignments 
    to ensure that role names are assigned consistently.

(4) Basic relation types
    Summary: The inventory of basic relation types has been simplified.
    Recommended procedure: Review the subtypes that your grammar defined for
    the original basic relation types, and revise them to employ the new
    relation types, consistent with the changes made in step (3) above.  Note
    that a basic relation type has been added for quantifiers: quant-rel.

(5) Deleted features (--TOPKEY)
    Summary: Some semantics-related features proved to be unnecessary.
    Recommended procedure: None required, unless your grammar makes use of
    the feature --TOPKEY, in which case you may choose to introduce this
    feature as part of your language-specific inventory of features.


------------------------------------------------------------------------------
[Original notes for v 0.3]

This is an extremely preliminary first cut at the grammar
matrix.  It has not been tested except by being loaded into
the LKB.  It contains the following:

-- basic types which define the feature geometry
-- types for MRS semantics
-- underspecified supertypes of lexical rules
-- underspecified supertypes of phrase structure rules

Of these, the last were the most hastily thrown together.
They are basically taken from the syntax.tdl file of the LinGO
English grammar, and then simplified by removing constraints
that are either likely to be specific to English or are
related to the LinGO analysis of coordination.

These phrase structure rule types are of necessity underconstrained
and merely instantiating them will surely lead to a grammar
with gross overgeneration.  Thus, it is expected that they
will be either augmented directly, or via subtypes that fill
in some of the missing constraints.  One clear example of this
is the lack of constraints on the HEAD values in phrase structure
rules.  Since it's not clear what the appropriate 'universal'
head type hierarchy will/could be, I've refrained from even 
defining types like 'verbal'...

Similarly, certain parts of the type hierarchy might need to be
modified.  In an ideal world, the matrix type hierarchy would only
need to be extended at the bottom for individual grammars.  However,
it is not clear that this is possible or desirable even in principle,
and it is certainly not the case for this preliminary first version!
(Defaults may help here...)

The single biggest gap in the matrix is the utter lack of
lexical types.  I hope that it can be useful even with this
huge lacuna.  Since the matrix holds so closely to the LinGO
grammar, the lexical types of the LinGO grammar should be 
used as models for creating lexical types.  Note in particular
that the rules assume lexical threading of NON-LOCAL features.
Beware that some feature names (notably HNDL) and many type names
differ between the matrix and the LinGO grammar, even when they
are logically and mnemonically related.

Future versions of the matrix should include further documentation
as well as more types (especially lexical types).  Revisions to
the types included in this version should also be anticipated, 
since it seems extremely unlikely that this first guess as to
what's universally useful will turn out to be entirely correct.

Stephan Oepen has kindly cleaned up the collateral .lsp files
included in this distribution of the matrix.  Take a look at
lkb/script for information on how various files (including .tdl
files) are included, and which aspects of the grammar should
be encoded in which .tdl files.

April 5, 2002 (erb)

Added improvements to supertypes for lexical rules.  "Derivational"
lexical rules are now lexeme-to-lexeme, rather than word-to-word
(a hold-over from PAGE).  Lexeme-to-lexeme rules can be spelling-changing,
and apply _inside_ lexeme-to-word (that is, inflectional) rules,
as expected.

June 18, 2002 (oe)

Fix generator support (by adding a suitable `mrsglobals.lsp'); more
cleaning up of `script' and related collateral files.
