Left recursion
goodmami opened this issue · comments
So far I have not seen a need for left recursion, particularly because pe emits values in a flat list. But some people seem to really care about it, so this issue is for tracking its implementation.
Using the memo to escape the left-recursive loop looks promising, and this method was chosen for Python's new pegen parser. Guido wrote about it here: https://medium.com/@gvanrossum_83706/left-recursive-peg-grammars-65dab3c580e1
Not sure how applicable this is to PEG but ANTLR can deal with direct left-recursion very gracefully and from its docs I quote
The most natural expression of some common language constructs is left recursive. For example C declarators and arithmetic expressions.
which is why it's always a very handy feature for a parser generator to support at least direct left recursion.
See also the corresponding publication. From there I quote
ANTLR 4 generates ALL(*) parsers and supports direct left-recursion through grammar rewriting
so maybe a similar automatic grammar-rewriting strategy could be used by pe
as well?
Thanks for the links. I haven't seen that paper yet, and I'm curious how the ALL(*) parsing works. I think that some limited grammar rewriting is possible to avoid trivial left-recursion cases in the same way that pe's "common" optimizations do grammar rewriting, such as transforming patterns like "a" "b"
into "ab"
or "a" / "b"
into [ab]
.
If you continue the section you quoted, they say this:
Direct left-recursion covers the most common cases, such as arithmetic expression productions, like E → E . id, and C declarators. We made an engineering decision not to support indirect or hidden left-recursion.
So they do not handle the more difficult left-recursion cases, which sounds like a practical decision (especially since they are apparently doing grammar analysis at parse time instead of compile time).