jzimmerman / langcc

langcc: A Next-Generation Compiler Compiler

Home Page:https://langcc.io

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Generating directly executable parsers

nsajko opened this issue · comments

The old LR parsing and finite automata literature used to distinguish between two basic ways of implementing DFA: table-based and "directly executable".

Directly executable (or hardcoded) is when the tables are actually encoded in the generated C++ code, usually there would be a label and a block of generated code for each state, and goto would be used for jumping from one state to the other.

If I understood correctly, your approach was to start from table-based and then augment it with some "recursive-descent" stuff (I only skimmed your paper as of yet).

Is the directly executable/hardcoded apprach compatible with your work? Have you considered letting langcc generate such parsers?

While one could theoretically expand the generated LR parsing tables into executable code, such code would not be significantly more intuitive; it would still look like hard-coded tables, as the table indices (and resulting jump indices) do not provide significant intuition. For this reason, it does not appear worthwhile to enable this in langcc.

Usually the motivation is performance. Hardcoding the tables into executable code should enable the C++ compiler to do a lot more optimization than it otherwise could, and I guess that less data cache pressure is also a good thing.

There might be a slight performance advantage (or maybe not -- considering also the instruction cache), but in either case I don't think it would be nearly significant enough to justify that amount of complicated codegen.