textX / Arpeggio

Parser interpreter based on PEG grammars written in Python http://textx.github.io/Arpeggio/


Indentation based languages

mathrb opened this issue · comments

Hello

I know pure PEG cannot parse an indentation based language.
But like pegjs, Arpeggio might have a way to do it.

Do you think it's feasible? If so, could you give me some hints?

Kind regards

Hi
Yes, I think it is feasible, but I never got a chance to put some time into it. There is some discussion in the textX issue.

There are two tasks/issues to be decided/solved:

  1. How to track the indentation level during recursive descent.
  2. How to specify indentation language rules in the grammar.

For the second point, what comes to mind is either to introduce new Match subclasses, Indent and Dedent, and insert them at the places in the grammar where an increase/decrease of the indentation level is expected. Another way to deal with it is to introduce a parsing expression, Indented, which wraps other expressions that are expected to be at a higher indentation level. Everything should be backward compatible regarding whitespace skipping. Also, what is considered an indentation increase (tabs/spaces, how many?) should be configurable.
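To make the first point more concrete, here is a minimal sketch of the usual stack-based way to track indentation, written as a standalone preprocessing pass in plain Python. It is not part of Arpeggio and uses no Arpeggio API: the names `mark_indentation`, `INDENT`, `DEDENT`, and `tab_width` are hypothetical and chosen only for illustration. The markers it emits are roughly what Indent and Dedent matches would have to recognize.

```python
# Hypothetical marker strings; a real implementation would pick its own.
INDENT, DEDENT = "<INDENT>", "<DEDENT>"


def mark_indentation(text, tab_width=4):
    """Return the non-blank lines of `text`, stripped, with INDENT/DEDENT
    markers inserted wherever the indentation level changes.

    `tab_width` is an assumed knob for how many columns a tab counts as.
    """
    stack = [0]  # indentation widths of the currently open blocks
    out = []
    for line in text.splitlines():
        if not line.strip():
            continue  # blank lines do not affect the indentation level
        expanded = line.expandtabs(tab_width)
        width = len(expanded) - len(expanded.lstrip(" "))
        if width > stack[-1]:
            # Deeper than the current block: open a new one.
            stack.append(width)
            out.append(INDENT)
        else:
            # Shallower: close blocks until the widths match again.
            while width < stack[-1]:
                stack.pop()
                out.append(DEDENT)
            if width != stack[-1]:
                raise ValueError(f"inconsistent dedent: {line!r}")
        out.append(expanded.strip())
    # Close any blocks still open at the end of the input.
    out.extend(DEDENT for _ in stack[1:])
    return out
```

Doing the same thing inside the parser would mean keeping this stack in the parser state and restoring it on backtracking, which is the harder part of the design.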

Thanks for your answer @igordejanovic
The grammar I have highly depends on tracking indentation, so it might be complicated to implement.
Thanks

FYI the PyParsing project recently introduced an IndentedBlock class to handle this (previously they used a helper method). Code and docs (same MIT license) at pyparsing/pyparsing@2dd2e2b

Is there a plan to implement such a feature?

I don't have the resources to work on this, but I would be glad to help out by discussing the design and reviewing the implementation if anyone has the time and will to work on it.