ekoontz / menard

A Clojure library for generating and parsing expressions from grammars and lexicons.

Home Page: https://hiro-tan.org/nlquiz

Trying to go beyond the demo.

martinodb opened this issue

Hi! This looks great, but I can't get beyond the demo. When I try to translate one of the Dutch sentences generated and translated in the demo, I get nothing. What am I doing wrong? Thanks!

$ lein repl
[..]
user=> (load "menard/translate")
[..]
user=> (in-ns 'menard.translate)
#object[clojure.lang.Namespace 0x93f18a47 "menard.translate"]
menard.translate=> (demo)
# intensifier adjective; 5 examples:
---
Volkomen eenzaam.|
                 |Completely lonely.
[..]
Uw wortels.|
           |Your carrots.
menard.translate=> (translate "Uw wortels.")
Uw wortels..|
            |_.
nil
menard.translate=> (translate "uw wortels.")
Uw wortels..|
            |_.
nil
menard.translate=> (translate "uw wortels")
Uw wortels.|
           |_.
nil
menard.translate=> (translate "Uw wortels.")
Uw wortels..|
            |_.
nil
menard.translate=> (nl-to-en-spec "Uw wortels.")
{:cat nil, :phrasal true, :reflexive false, :subcat [], :agr {:person :top, :number :top, :gender :top}, :sem {:obj {:obj nil}}, :max-depth 15, :comp {:interrogative? :top}}
menard.translate=> 

Hi Martin, try this:

menard.translate=> (-> "Uw wortels" nl/parse first translate)
Uw wortels.|
           |Your carrots.
nil
menard.translate=> 
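
The parse-then-translate step above can be wrapped in a small helper. This is a minimal sketch, assuming (as the transcript suggests) that the parse function returns a sequence of parse maps and that the translate function accepts one such map. `translate-string` is a hypothetical name, not part of menard, and both functions are passed in as arguments so the sketch does not depend on menard's namespaces:

```clojure
(defn translate-string
  "Parse `s` with `parse-fn`, then hand the first parse to `translate-fn`.
  Returns nil when `parse-fn` yields no parses.
  Hypothetical helper for illustration; not part of menard."
  [parse-fn translate-fn s]
  (when-let [parsed (first (parse-fn s))]
    (translate-fn parsed)))

;; Assumed usage in the REPL session above:
;; (translate-string nl/parse translate "Uw wortels")
```

Taking the functions as parameters also makes it easy to try the helper with stand-ins before loading the library.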

That worked, thanks!

Sorry I took a while to respond. It only took a moment to test that snippet, but I've been experimenting a little to figure out what's going on. So the demo function doesn't need to call the parse function because it translates the output of the generate function, which already produces a map with the expected keys, right?

Something closer to what demo is doing would be, for instance:


menard.translate=> (nth nl/expressions 2)
{:note "noun verb", :example "ik slaap", :subcat [], :cat :verb, :phrasal true, :head {:phrasal false}, :sem {:obj :none}}
menard.translate=> (-> (nth nl/expressions 2) nl/generate  translate )
Voorbeelden gaan.|
                 |Examples go.
nil
menard.translate=> (-> (nth nl/expressions 2) nl/generate  translate )
Guus zag.|
         |Guus saw.
nil
menard.translate=> (-> (nth nl/expressions 2) nl/generate  translate )
Hagedissen moetten.|
                   |Lizards must.
nil
menard.translate=> (-> (nth nl/expressions 2) nl/generate  translate )
Je zegd.|
        |You 🤠 say.
nil
menard.translate=> (-> (nth nl/expressions 2) nl/generate  translate )
Wortels voorkomen.|
                  |Carrots prevent.

The random output of generate is quite large, but we can see the map keys:

menard.translate=> (-> (nth nl/expressions 2) nl/generate keys )
(:aux :cat :comp-derivation :syntax-tree :variant :phrasal :infl :rule :reflexive :interogative? :subcat :agr :1 :note :head :sem :root :head-derivation :menard.generate/done? :2 :example :menard.generate/started? :abbreviation :comp)

These are similar to the keys we get from the output of parse (edit: I noticed the maps are similar but not identical):

menard.translate=> (-> "Uw wortels" nl/parse first keys)
(:cat :comp-derivation :slash :phrasal :np? :infl :mod :rule :reflexive :interogative? :subcat :agr :1 :definite? :head :sem :root :head-derivation :mods-nested? :2 :menard.generate/started? :comp :menard.nesting/only-one-allowed-of)
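
To see exactly where the two maps differ, the key lists printed above can be compared as sets. This is a minimal sketch using only `clojure.set`. The key lists are copied from the two transcripts above, and `key-diff` is a hypothetical helper name, not part of menard. Note that the two maps come from different expressions (a generated sentence and a parsed noun phrase), so some of the differences may reflect that rather than generate versus parse:

```clojure
(require '[clojure.set :as set])

;; Keys copied from the (nl/generate ...) transcript above.
(def generate-keys
  #{:aux :cat :comp-derivation :syntax-tree :variant :phrasal :infl :rule
    :reflexive :interogative? :subcat :agr :1 :note :head :sem :root
    :head-derivation :menard.generate/done? :2 :example
    :menard.generate/started? :abbreviation :comp})

;; Keys copied from the (nl/parse ...) transcript above.
(def parse-keys
  #{:cat :comp-derivation :slash :phrasal :np? :infl :mod :rule :reflexive
    :interogative? :subcat :agr :1 :definite? :head :sem :root
    :head-derivation :mods-nested? :2 :menard.generate/started? :comp
    :menard.nesting/only-one-allowed-of})

(defn key-diff
  "Return the keys unique to each set and the keys they share."
  [generated parsed]
  {:only-in-generate (set/difference generated parsed)
   :only-in-parse    (set/difference parsed generated)
   :shared           (set/intersection generated parsed)})

(def diff (key-diff generate-keys parse-keys))
```

With these inputs, `(:only-in-generate diff)` holds seven keys (such as `:syntax-tree` and `:example`), `(:only-in-parse diff)` holds six (such as `:slash` and `:np?`), and seventeen keys are shared.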

BTW, it may be convenient to add a warning that sentences must be written without a period. I was a bit puzzled at first by this error:

menard.translate=> (-> "Uw wortels." nl/parse first translate)
INFO  10 may 2021 17:20:15,707 menard.nederlands: no lexemes found for: [wortels.]; will use null lexemes instead.
Uw _.|
     |Your _.
nil
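
Until the parser accepts a trailing period, one workaround is to strip final punctuation before parsing. This is a minimal sketch under that assumption. `strip-final-punctuation` is a hypothetical helper, not part of menard, and not necessarily how the library itself handles the problem:

```clojure
(require '[clojure.string :as str])

(defn strip-final-punctuation
  "Trim surrounding whitespace, then remove any run of trailing
  '.', '?' or '!' characters. Hypothetical helper; not part of menard."
  [s]
  (-> s
      str/trim
      (str/replace #"[.?!]+$" "")))

;; Assumed usage in the REPL session above:
;; (-> "Uw wortels." strip-final-punctuation nl/parse first translate)
```

The token `wortels.` then reaches the lexicon lookup as `wortels`, which is the form the earlier successful call used.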

Well, I'll have to keep digging but I guess you can count this issue as fixed. Greetings and thanks!

Ok, I'll close this and fix the issue with the period. I agree it's confusing if it outputs "Uw wortels" but does not accept "Uw wortels." (with the period as part of the input). Yes, the outputs of (generate) and (parse) are quite large and hard to read; you can run u/pprint on such a structure to see it more readably, e.g.:

menard.nederlands> (->> "Uw wortels." parse first syntax-tree)
"[np:2 .uw +wortels]"
menard.nederlands> (->> "Uw wortels." parse first u/pprint)
{:cat [[1] :noun],
 :comp-derivation
 [[2]
  {:0
   {:curriculum-is-none-default {:match? true},
    :null-is-false-by-default {:match? true},
    :dont-inflect-determiners {:match? true}},
   :1 {:possessive-is-definite {:match? true}}}],
 :slash false,
 :phrasal true,
 :np? true,
 :infl [[3] :top],
 :mod nil,
 :rule "np:2",
 :reflexive [[4] false],
 :interogative? [[5] :top],
 :subcat [],
 :agr [[6] {:person :3rd, :number [[7] :plur], :gender :common}],
 :1
 [[8]
  {:cat :det,
   :derivation [2],
   :null? false,
   :phrasal false,
   :inflected? true,
   :possessive? true,
   :curriculum :menard.nederlands/none,
   :subcat [],
   :agr [6],
   :definite? [[9] true],
   :sem
   {:arg2 [[10] :top],
    :countable? [[11] :top],
    :pred [[12] :you],
    :arg1 [[13] :top],
    :context :polite},
   :head-derivation [2],
   :mods-nested? true,
   :canonical "uw"}],
 :definite? [9],
 :head
 [[14]
  {:cat [1],
   :slash false,
   :regular true,
   :derivation
   [[15]
    {:0
     {:curriculum-is-none-default {:match? true},
      :propernouns-are-nonreflexive {:match? true},
      :noun-semantics {:match? true},
      :nouns-are-not-propernouns {:match? true},
      :null-is-false-by-default {:match? true},
      :noun-gender-common-default {:match? true},
      :nouns-have-empty-modifiers {:match? true},
      :nouns-are-not-pronouns {:match? true},
      :propernouns-have-empty-modifiers {:match? true},
      :pronouns-are-nonreflexive {:match? true},
      :politeness-is-unspecified {:match? true}},
     :2 {:common-noun {:match? true}}}],
   :null? false,
   :phrasal false,
   :infl [3],
   :mod [[16] []],
   :reflexive [4],
   :interogative? [5],
   :curriculum :menard.nederlands/none,
   :subcat {:1 [8], :2 []},
   :agr [6],
   :definite? [9],
   :sem
   {:ref [[17] {:number [7]}],
    :arg2 [10],
    :context [[18] :none],
    :countable? [11],
    :quant [12],
    :pred [[19] :carrot],
    :mod [],
    :arg1 [13]},
   :root [[20] "wortel"],
   :head-derivation [15],
   :mods-nested? false,
   :canonical [20],
   :inflection :s,
   :propernoun false,
   :pronoun false,
   :surface "wortels"}],
 :sem {:pred [19], :ref [17], :mod [16], :quant [12], :context [18]},
 :root [20],
 :head-derivation [15],
 :mods-nested? true,
 :2 [14],
 :menard.generate/started? true,
 :comp [8],
 :menard.nesting/only-one-allowed-of :nest-only}
menard.nederlands> 
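
When even the pretty-printed structure is too long, another option is to look at only a few top-level keys. This is a minimal sketch with plain `select-keys`. `sample-parse` is a simplified stand-in that copies a handful of scalar entries from the output above (the real map is far larger and uses shared references), and `summarize` is a hypothetical helper, not part of menard:

```clojure
;; Simplified stand-in for a parse result; values copied from the
;; pretty-printed output above, everything else omitted.
(def sample-parse
  {:rule "np:2"
   :phrasal true
   :np? true
   :slash false
   :subcat []
   :mods-nested? true})

(defn summarize
  "Return only the entries of map `m` whose keys are listed in `ks`."
  [m ks]
  (select-keys m ks))

(def summary (summarize sample-parse [:rule :phrasal :np?]))
```

Against a real parse this would presumably be called as `(summarize (first (parse "Uw wortels")) [:rule :phrasal :np?])`.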

Fixed the period issue here:

3985177