Misleading debug text when encountering `\r`
rowlesmr opened this issue · comments
Matthew Rowles commented
Having a `\r` in a string that is being parsed resets the output line in the debug output, overwriting what was already there.
The parsing is correct; only the explanatory text is wrong.
`\r` is important as a standalone character, as I need to be able to accept it as a line terminator.
from pyparsing import (
    Opt,
    ParserElement,
    Regex,
)

if __name__ == "__main__":
    ParserElement.set_default_whitespace_chars(" \t")
    debug = True
    line_term = (("\r" + Opt("\n")) | "\n").set_debug(flag=debug).set_name("line_term")
    comment = (Regex("#.*(?=(\r\n?)|\n)") + line_term).set_debug(flag=debug).set_name("comment")
    string = (Regex("[a-z0-9]+") + Opt(line_term)).set_debug(flag=debug).set_name("string")
    value = (string | comment).set_debug(flag=debug).set_name("value")
    file = (value[...] + line_term[...]).set_debug(flag=debug)
    s = """#multi word comment \nval val2 \r val3\nval4 \t\n\n\r"""
    print(f"{file.parse_string(s, parse_all=True)=}")
results in (in part):
#... more stuff before
Match line_term at loc 20(1,21)
#multi word comment
^
Matched line_term -> ['\n']
Matched comment -> ['#multi word comment ', '\n']
Matched value -> ['#multi word comment ', '\n']
Match value at loc 21(2,1)
val3
^
Match string at loc 21(2,1)
val3
^
Match line_term at loc 24(2,4)
val3
^
Match line_term failed, ParseException raised: Expected '\r', found 'val2' (at char 25), (line:2, col:5)
Matched string -> ['val']
Matched value -> ['val']
Match value at loc 24(2,4)
val3
^
Match string at loc 25(2,5)
val3
^
Match line_term at loc 29(2,9)
val3
^
Matched line_term -> ['\r']
Matched string -> ['val2', '\r']
Matched value -> ['val2', '\r']
#... more stuff after
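One plausible reading of the output above is that the debug text echoes the current input line as-is, so the terminal acts on the embedded `\r`, returns to column 0, and lets ` val3` overwrite the start of the line. The sketch below is not pyparsing's code: `current_line` and `show_position` are hypothetical helpers, written with the standard library only, that imitate this kind of echo and show how escaping the carriage return keeps the whole line visible.

```python
def current_line(text: str, loc: int) -> str:
    """Return the newline-delimited line of `text` that contains index `loc`."""
    start = text.rfind("\n", 0, loc) + 1
    end = text.find("\n", loc)
    return text[start:end] if end >= 0 else text[start:]


def show_position(text: str, loc: int, escape: bool = False) -> str:
    """Build a two-line display: the current line, then a caret under `loc`."""
    line = current_line(text, loc)
    col = loc - (text.rfind("\n", 0, loc) + 1)
    if escape:
        # Each escaped "\r" before the caret widens the line by one character,
        # so shift the caret to keep it under the same input character.
        col += line[:col].count("\r")
        line = line.replace("\r", "\\r")
    return f"{line}\n{' ' * col}^"


# Same content as the sample string in this issue.
s = "#multi word comment \nval val2 \r val3\nval4 \t\n\n\r"

raw = show_position(s, 25)                # still contains a real carriage return
safe = show_position(s, 25, escape=True)  # carriage return shown as text

print(safe)
```

With `escape=True` the line prints as `val val2 \r val3` with the caret under the `v` of `val2`. Printing `raw` instead sends a real carriage return to the terminal, which is the overwriting effect described in this report.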
Paul McGuire commented
Interesting issue. Could you also please supply a small sample string I can use for `s`? Probably paste the repr of the string so that the control characters show up properly.
Matthew Rowles commented
One string:
s="""#multi word comment \nval val2 \r val3\nval4 \t\n\n\r"""
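For reference, `repr()` renders the control characters in this sample as escape sequences rather than sending them to the terminal. A small sketch using only built-ins:

```python
s = """#multi word comment \nval val2 \r val3\nval4 \t\n\n\r"""

# repr() shows "\n", "\r", and "\t" as backslash escapes.
print(repr(s))

# The same conversion is available inside an f-string via !r.
print(f"{s!r}")
```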