mjackson / citrus

Parsing Expressions for Ruby

Home Page: http://mjackson.github.io/citrus

Rules don't respond to their own name?

ehahn9 opened this issue:

grammar Foo
  # The value of digit cannot be retrieved because digit is undefined,
  # which seems odd, especially contrasted with the next rule, below
  rule bad
    digit { 100 * digit.value }
  end

  # The value of digit can be retrieved in this case
  rule good
    (digit !'xxx') { 100 * digit.value }
  end

  # The value of thing cannot be retrieved because thing is undefined,
  # which seems odd, especially contrasted with the next rule, below
  rule bad2
    thing:digit { 100 * thing.value }
  end

  # The value of thing can be retrieved in this case
  rule good2
    (thing:digit !'xxx') { 100 * thing.value }
  end

  rule digit
    [0-9]
  end
end

Hopefully this will clear things up.

grammar Foo
  rule bad
    digit { 100 * super() }
  end

  rule good
    (digit !'xxx') { 100 * digit.value }
  end

  rule bad2
    thing:digit { 100 * super() }
  end

  rule good2
    (thing:digit !'xxx') { 100 * thing.value }
  end

  rule digit
    [0-9] { to_i }
  end
end

When you attach a semantic block to an Alias (digit in this case), the block operates on the match that is produced by digit, not some higher level rule. The match object is extended with the semantic block, so you can use super just as you would on any other Ruby object.
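
The extend-and-super mechanism described here can be sketched in plain Ruby without loading Citrus. SketchMatch and TimesHundred are hypothetical stand-ins, not Citrus classes; the module plays the role of the semantic block attached to the alias, assuming the underlying rule's own value is the integer form of the matched text.

```ruby
# Hypothetical stand-in for a match produced by the digit rule.
# This is not Citrus's Match class, only an object with a value method.
class SketchMatch
  def initialize(text)
    @text = text
  end

  # Plays the role of digit's own block: [0-9] { to_i }
  def value
    @text.to_i
  end
end

# Plays the role of the block attached to the alias: digit { 100 * super() }
module TimesHundred
  def value
    100 * super()
  end
end

match = SketchMatch.new("7")
match.extend(TimesHundred) # method lookup now checks TimesHundred first
result = match.value       # TimesHundred#value runs, then SketchMatch#value via super
puts result                # prints 700
```

Because the module sits in front of the object's class in method lookup, super() reaches the value the object would have produced on its own, which is why no rule name is needed inside the block.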

Thanks for the excellent clarification. I think what I was expecting was that digit would be an alias for super() in these cases. I have two reasons for this:

  1. It makes editing the rule much less error-prone: if you edit the match, you have to be absolutely certain to edit the block as well, replacing super() with a term name, or a term name with super() (e.g. when deleting a term).
  2. Having the alias might be justifiable under Ruby's "least surprise" thinking. At least, I was surprised by this behavior.

Anyway, my grammar is all 100% working and I'm very happy with your wonderful gem, so feel free to close this ticket if you like!

ps: FYI, I've used citrus to create a parser for dates, like chronic but more complete (and of course, more extensible!).

I appreciate the feedback. You bring up some good points. It would be trivial to hack in the behavior you desire, but I'm not sure yet what implications this might have on other types of rules. Let's wait and see if this is a problem for anyone else before tweaking the behavior here.

I'd love to see what you're doing with Citrus. Have you open sourced your work?

Hey Michael, I just wanted to let you know that I got tripped up by this as well.

rule statement
    top:(select_statement | set_operation_statement) <Veritas::SQL::Statement>
end

I expected to be able to refer to top as an element and do things to it. However, if I should instead be dealing with the select_statement itself directly, I will do that!

In this commit I have added the ability to retrieve self by calling either captures[0] or any label (and/or alias) that the match may have. This does introduce a small backwards-compatibility issue: previously you could get the first sub match by calling captures[0], but it is now available as captures[1], the second as captures[2], and so on. The array you get back from calling matches, however, is still the same: just the sub matches.

Thanks, Michael. I think we're still miscommunicating slightly (sorry). Here's an example from my real code which I wish worked, but doesn't until I replace number with something else (super() or captures[0], neither of which is visually stunning):

rule year
  number {
    case number
    when 0..20 then number + 2000
    when 30..99 then number + 1900
    else number
    end
  }
end

To my thinking, this should work :-). I suspect this is also what knowtheory was suggesting, above, too (although there with an explicit label). Oddly, I prefer super() to captures[0] so I've not taken advantage of the new behavior. I know it caused some grief to break compatibility with the commit, so if you want to back that out and add this construct instead, that would be lovely :-).

I suspect you are rightly growing weary of this topic, so I'm happy to close the ticket and soldier on with whatever syntax you find best.

I don't think we're miscommunicating; it's just that the example you give doesn't work. Here's why: when you retrieve the number match, it is a Citrus::Match object. Thus, in your case statement, Range#=== fails to match because it is a range of integers, none of which are equal to the string value of the match object. Hopefully the following example will clear things up.

require 'citrus'

Citrus.eval(<<CITRUS)
grammar Date
  rule year
    number {
      int = number.to_i
      case int
      when 0..20 then int + 2000
      when 30..99 then int + 1900
      else int
      end
    }
  end

  rule number
    [0-9] { to_i }
  end
end
CITRUS

require 'test/unit'

class DateTest < Test::Unit::TestCase
  def test_number
    m = Date.parse('1', :root => :number)
    assert_equal('1', m)
    assert_equal(1, m.value)
  end

  def test_year
    m = Date.parse('2', :root => :year)
    assert_equal('2', m)
    assert_equal(2002, m.value)
  end
end
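
The Range#=== point made before this example can also be checked in plain Ruby. This is a minimal sketch that uses a String as a stand-in for the match object (an assumption; it does not load Citrus), and expand_year is a hypothetical helper name that mirrors the case statement from the year rule.

```ruby
# Hypothetical helper mirroring the case statement from the year rule.
def expand_year(number)
  case number
  when 0..20 then number + 2000
  when 30..99 then number + 1900
  else number
  end
end

raw = "2" # String standing in for the text of the match

puts((0..20) === raw)         # prints false: an Integer range does not cover a String
puts((0..20) === raw.to_i)    # prints true
puts expand_year(raw).inspect # prints "2": both when branches were skipped
puts expand_year(raw.to_i)    # prints 2002
```

With the string, neither when branch matches, so the else branch returns the input unchanged, which is easy to mistake for the label not resolving at all.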

Notice that in the semantic block associated with number inside the year rule, we can't use number.value. Instead, we need to call number.to_i explicitly. This is because we are already inside the block that runs when value is called on a match generated by the year rule. Calling number.value inside that block would therefore call the same block again, resulting in infinite recursion.
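
The recursion can be reproduced in plain Ruby. This is a sketch under the assumption that the number label returns the very match the block is attached to, as the explanation implies; LabeledMatch, CallsValue, and CallsToI are hypothetical names, not part of Citrus.

```ruby
# Hypothetical stand-in for a match whose label refers back to itself.
class LabeledMatch
  def initialize(text)
    @text = text
  end

  def to_i
    @text.to_i
  end

  # Assumed behavior: the label returns the match the block is attached to.
  def number
    self
  end
end

# Block that asks the labeled match for its value: it re-enters itself.
module CallsValue
  def value
    number.value + 2000
  end
end

# Block that converts the matched text directly: it terminates.
module CallsToI
  def value
    number.to_i + 2000
  end
end

looping = LabeledMatch.new("2").extend(CallsValue)
begin
  looping.value
rescue SystemStackError
  puts "number.value never returns"
end

puts LabeledMatch.new("2").extend(CallsToI).value # prints 2002
```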

DUH! My bad. I had assumed this wasn't working because of the topic of this thread, the syntax for referencing the value by name. I never stopped to think that when changing from super() to number, the semantics of referencing the value changed as well.

In my world, this would mean number.value rather than number.to_i - but the mistake (mine!) is the same.

Thanks, Michael. Sorry for taking up your time on this.