sirthias / parboiled2

A macro-based PEG parser generator for Scala 2.10+

Reintroduce error recovery from pb1

sirthias opened this issue

Can be implemented largely by simply wrapping a parser and using a special ParserInput, which allows for insertion and removal of virtual characters in the same way as the MutableInputBuffer from pb1.

The techniques used for recovery in pb1 should still work much the same as before.

I am using parboiled2 to build a parser for a small language, and I really like its speed and flexibility. Having error recovery would really help in building a user-friendly text editor for my language. If this feature will not be built, is there any good blog post that explains in detail the techniques used by parboiled1?

It will really help us to start building tooling around a language we are creating.

I'm currently developing an IDE-like environment with syntax error hints for my users. For now I use parboiled1, but it is very slow for some inputs, so I'm considering switching to parboiled2. It would be a no-brainer if it had error recovery like parboiled1 does.

+1 for this feature

Would be great to have this feature in parboiled2!

+1 for this feature

I know I am a bit late, but could someone expand a bit on what would be needed to implement this?

I see

https://github.com/sirthias/parboiled/blob/master/parboiled-core/src/main/java/org/parboiled/buffers/MutableInputBuffer.java

Roughly how much implementation effort would be required, and would an outsider be able to do this? :)

The ParserInput, which roughly corresponds to the input buffers of PB1, is not where the meat of the recovery logic lives.
PB2 would need an equivalent of the RecoveringParseRunner from PB1:
https://github.com/sirthias/parboiled/blob/master/parboiled-core/src/main/java/org/parboiled/parserunners/RecoveringParseRunner.java

Essentially, this ParseRunner implements another layer on the outside, around the actual parser logic, that "watches" how a parser eats through an input. If the parsing run succeeds, all is well. If it doesn't, the RecoveringParseRunner "changes" the input somewhat and re-runs the parser, potentially many times. This way it can overcome all errors, one by one, and eventually make the parser succeed.
Even though this process can be much slower than a "normal" parsing run, the delay rarely matters, because error recovery only needs to be fast enough on human timescales, which is usually not that hard.
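
To make the re-run loop above more concrete, here is a minimal, dependency-free sketch in Scala. It does not use parboiled and is not how PB1's RecoveringParseRunner is actually implemented. The toy grammar (single digits separated by `+`), the `firstError` stand-in for a real parser, and the names `Fix`, `candidates` and `recover` are all assumptions made for illustration. The selection heuristic (keep the single-character edit that lets the parser get furthest) is also just one plausible choice.

```scala
object RecoverySketch {
  // Hypothetical stand-in for a real parser run on the toy grammar
  //   digit ('+' digit)*
  // Returns the index of the first error, or None if the input parses.
  def firstError(input: String): Option[Int] = {
    // Even positions must hold a digit, odd positions a '+'.
    val bad = input.indices.find { i =>
      val c = input.charAt(i)
      if (i % 2 == 0) !c.isDigit else c != '+'
    }
    // A valid input has odd length, i.e. it ends on a digit.
    bad.orElse(if (input.length % 2 == 1) None else Some(input.length))
  }

  // Hypothetical record of one repair applied to the input.
  final case class Fix(index: Int, description: String)

  // All single-character edits tried at the error position:
  // delete the offending character, or insert a placeholder before it.
  private def candidates(s: String, ix: Int): List[(String, Fix)] = {
    val deletion =
      if (ix < s.length)
        List((s.substring(0, ix) + s.substring(ix + 1), Fix(ix, "deleted character")))
      else Nil
    val insertions = List('0', '+').map { c =>
      (s.substring(0, ix) + c + s.substring(ix), Fix(ix, s"inserted '$c'"))
    }
    deletion ++ insertions
  }

  // The outer layer: run the parser, and on failure change the input
  // and re-run, until the parse succeeds or the fix budget is used up.
  def recover(input: String, maxFixes: Int = 20): (String, List[Fix]) = {
    @annotation.tailrec
    def loop(current: String, fixes: List[Fix], budget: Int): (String, List[Fix]) =
      firstError(current) match {
        case Some(ix) if budget > 0 =>
          // Keep the edit that lets the parser advance furthest.
          val (best, fix) = candidates(current, ix).maxBy { case (s, _) =>
            firstError(s).getOrElse(Int.MaxValue)
          }
          loop(best, fix :: fixes, budget - 1)
        case _ => (current, fixes.reverse)
      }
    loop(input, Nil, maxFixes)
  }
}
```

Calling `RecoverySketch.recover("1+")` repairs the dangling operator by inserting a placeholder digit. The returned list of `Fix` values is what an editor could turn into syntax error hints. Each round re-runs the parser several times, which is the cost the comment above accepts as tolerable on human timescales.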

So, error recovery for PB2 should be implementable by adding something like the MutableInputBuffer as another ParserInput, plus something on the outside that wraps the actual parser (an equivalent to the RecoveringParseRunner from PB1).
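
As a rough illustration of the input side, the following sketch shows an input buffer that supports inserting and removing characters and undoing those edits. It is standalone Scala and does not extend parboiled2's ParserInput trait. The class `MutableInput`, the `Edit` types and all method names are hypothetical, and PB1's MutableInputBuffer may well represent its edits differently.

```scala
object MutableInputSketch {
  // Hypothetical edit records, kept so that a recovery attempt can be rolled back.
  sealed trait Edit
  final case class Inserted(index: Int, char: Char) extends Edit
  final case class Removed(index: Int, char: Char) extends Edit

  // A character-indexed input that an outer recovery layer can modify
  // between parser runs. The original string itself is never changed.
  final class MutableInput(original: String) {
    private val view = new java.lang.StringBuilder(original)
    private var edits: List[Edit] = Nil // most recent edit first

    // Read access, as a parser would need it.
    def length: Int = view.length()
    def charAt(ix: Int): Char = view.charAt(ix)
    def sliceString(start: Int, end: Int): String = view.substring(start, end)

    // Insert a "virtual" character before position ix.
    def insertChar(ix: Int, c: Char): Unit = {
      view.insert(ix, c)
      edits = Inserted(ix, c) :: edits
    }

    // Hide the character at position ix from the parser.
    def removeChar(ix: Int): Unit = {
      val c = view.charAt(ix)
      view.deleteCharAt(ix)
      edits = Removed(ix, c) :: edits
    }

    // Roll back the most recent edit, if there is one.
    def undoLast(): Unit = edits match {
      case Inserted(ix, _) :: rest =>
        view.deleteCharAt(ix)
        edits = rest
      case Removed(ix, c) :: rest =>
        view.insert(ix, c)
        edits = rest
      case Nil => ()
    }

    // Edits in the order they were applied, e.g. for error reporting.
    def appliedEdits: List[Edit] = edits.reverse
  }
}
```

An outer runner could call `insertChar` or `removeChar` at the reported error position, re-run the parser against the same buffer, and call `undoLast()` when an attempt does not help. Keeping the edit log separate from the original text also means error positions could later be mapped back to the user's real input, although this sketch does not do that mapping.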

Note however: PB2 is essentially EOL, because its macro core cannot be easily ported to Scala 3 and would have to be completely rewritten. AFAICS I won't have the capacity or the drive to do this myself.
Also, the parsing landscape in the Scala ecosystem has changed a lot since "the parboileds" were written (2009 - 2014).
We now have things like fast-parse and cats-parse.

So, if you want to implement error recovery for a Scala parsing framework, I'd look into doing it for one of these newer libraries rather than parboiled2.