pyparsing / pyparsing

Python library for creating PEG parsers

IndentedBlock causes parse actions of its first element to be called twice

Honza0297 opened this issue · comments

Sorry for opening another issue in a short time, but:

When the first element of IndentedBlock has a parse action (entry1 and its e1_action in the example), it is performed twice:

from pyparsing import IndentedBlock, Keyword, Word, alphanums

def e1_action(x):
    print("entry1")
entry1 = Keyword("entry1").set_parse_action(e1_action)

def e2_action(x):
    print("entry2")
entry2 = Keyword("entry2").set_parse_action(e2_action)

entries = IndentedBlock(entry1 | entry2, recursive=True, grouped=True)
header = Keyword("header") + Word(alphanums + "_") + entries


s = """
header foo
    entry1
    entry2"""
header.parse_string(s)

Expected result:

entry1
entry2

Actual result:

entry1
entry1
entry2

I dug into it and found that the cause is this line in IndentedBlock: self.expr.try_parse(instring, anchor_loc, do_actions=doActions), specifically the do_actions=doActions argument.
I hotfixed it locally to do_actions=False (I would not expect parse actions to run when the block is only being probed and will be parsed again afterwards), and now the output is as I expected.
Is this truly a bug, or a (rather strange) feature?
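
To make the suspected mechanism concrete, here is a toy model of "probe first, then parse for real". It is only a sketch: match_keyword, parse_block and probe_runs_actions are hypothetical names made up for illustration, not pyparsing internals, and the real IndentedBlock code does considerably more.

```python
# Toy model only: these names are hypothetical and are not pyparsing internals.
calls = []

def match_keyword(text, loc, keyword, do_actions=True):
    """Match `keyword` at `loc`; run the stand-in parse action if requested."""
    if not text.startswith(keyword, loc):
        raise ValueError(f"expected {keyword!r} at position {loc}")
    if do_actions:
        calls.append(keyword)  # stands in for the user's parse action
    return loc + len(keyword)

def parse_block(text, loc, keyword, probe_runs_actions):
    """Probe the first element as a lookahead, then parse it for real."""
    match_keyword(text, loc, keyword, do_actions=probe_runs_actions)
    return match_keyword(text, loc, keyword, do_actions=True)

# Probe runs actions too: the action fires once per pass, so twice in total.
parse_block("entry1", 0, "entry1", probe_runs_actions=True)
calls_with_probe_actions = list(calls)

# Probe suppresses actions: only the real parse fires the action.
calls.clear()
parse_block("entry1", 0, "entry1", probe_runs_actions=False)
calls_without_probe_actions = list(calls)

print(calls_with_probe_actions)     # ['entry1', 'entry1']
print(calls_without_probe_actions)  # ['entry1']
```

In this model, suppressing actions during the probe is what my local hotfix amounts to.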

Thanks for your answer in advance!

You can also prevent double evaluation of the parse action by enabling packrat parsing.

ParserElement.enable_packrat()

This caches the result of the try_parse lookahead, so that when it succeeds, the actual parse doesn't repeat the parsing and the parse action, but just returns the value from the packrat cache.
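
The caching idea can be sketched with a hand-rolled memo table. This is a toy illustration under simplifying assumptions, not pyparsing's packrat implementation: parse_entry, cache and action_calls are hypothetical names, and the cache key here is just (text, loc).

```python
# Toy model only: a hand-rolled cache, not pyparsing's packrat implementation.
action_calls = []
cache = {}

def parse_entry(text, loc):
    """Match 'entry1' at `loc`; the stand-in parse action runs only on a cache miss."""
    key = (text, loc)
    if key in cache:
        return cache[key]              # cache hit: no re-parse, no action
    if not text.startswith("entry1", loc):
        raise ValueError("no match")
    action_calls.append("entry1")      # stands in for the user's parse action
    cache[key] = loc + len("entry1")
    return cache[key]

probe_end = parse_entry("entry1", 0)   # lookahead, fills the cache
real_end = parse_entry("entry1", 0)    # actual parse, served from the cache

print(action_calls)  # ['entry1']
```

The second call at the same position finds the cached result, so the action is recorded only once even though the element was "parsed" twice.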

Oh, deeply sorry for not responding, somehow forgot to. :(
Anyway, ParserElement.enable_packrat() is the solution! Thanks!