goodmami / pe

Fastest general-purpose parsing library for Python with a familiar API

Auto-ignore (e.g., whitespace)

goodmami opened this issue

Being able to ignore whitespace when parsing without explicitly putting it in the grammar can make the grammars much shorter and easier to write and comprehend (assuming one understands the implicit whitespace exists). There are some challenges:

  • There needs to be a way to enable/disable the implicit whitespace, e.g., with some rule operator (pegged uses <)
  • There needs to be a way to customize what is ignored
  • There should ideally be a way for the parser to not look for whitespace too often. E.g., if it occurs between every item in a sequence and we have:
    A <- "a" B
    B <- "b" "c"
    
    ...then a naive system would look for and discard whitespace after "a" and before B, then again before "b" inside the B rule (that is, twice in a row at the same input position)
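
To make the three points above concrete, here is a rough sketch in plain Python. It is not pe's API or implementation; `ToyParser`, `skip`, `literal`, `sequence`, `parse_abc`, and the `ignore`, `dedupe`, and `auto_ignore` parameters are all hypothetical names invented for illustration. It assumes the ignore pattern is tried before every item of a sequence, and it shows one possible way to avoid the repeated scan: remember the position where the last ignore attempt stopped and skip the rescan if we are still there.

```python
import re


class ToyParser:
    """Toy recognizer (hypothetical, not pe) with implicit ignoring in sequences."""

    def __init__(self, text, ignore=r"\s*", dedupe=False):
        self.text = text
        self.ignore = re.compile(ignore)  # customizable: what gets ignored
        self.dedupe = dedupe              # whether to avoid repeated scans
        self.ignore_attempts = 0          # how many times the ignore pattern ran
        self._clean_pos = -1              # where the last ignore attempt stopped

    def skip(self, pos):
        # Assumes a greedy ignore pattern such as \s*, so running it a
        # second time at the position where it stopped consumes nothing.
        if self.dedupe and pos == self._clean_pos:
            return pos
        self.ignore_attempts += 1
        m = self.ignore.match(self.text, pos)
        if m:
            pos = m.end()
        self._clean_pos = pos
        return pos

    def literal(self, s):
        def parse(pos):
            return pos + len(s) if self.text.startswith(s, pos) else None
        return parse

    def sequence(self, *items, auto_ignore=True):
        # auto_ignore=False stands in for a rule operator that turns
        # the implicit ignoring off for one rule.
        def parse(pos):
            for item in items:
                if auto_ignore:
                    pos = self.skip(pos)
                pos = item(pos)
                if pos is None:
                    return None
            return pos
        return parse


def parse_abc(text, dedupe=False):
    """Recognize  A <- "a" B ; B <- "b" "c"  and report the ignore attempts."""
    p = ToyParser(text, dedupe=dedupe)
    B = p.sequence(p.literal("b"), p.literal("c"))
    A = p.sequence(p.literal("a"), B)
    return A(0), p.ignore_attempts


print(parse_abc("a b c"))               # (5, 4): whitespace tried twice before "b"
print(parse_abc("a b c", dedupe=True))  # (5, 3): the repeated attempt is skipped
```

The position check is cheap and stays valid under backtracking because the input never changes, but it only catches attempts that are directly back to back. Whether something like this belongs at parse time or should be resolved when the grammar is compiled is an open question.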