linkrope / gamma

Extended Affix Grammar Compiler Generator

EOF not recognized on corner case specification

kuniss opened this issue

Minimal erroneous EAG specification:

NoEOF<+ 'Done': Done>: 'x'.
Done = 'Done'.

The only sentence that should be recognized is a single 'x':

~/git/gamma/build$ echo x | ./NoEOF 
info: NoEOF compiler (generated with epsilon)
Done

But it also compiles successfully on the input 'xx':

~/git/gamma/build$ echo xx | ./NoEOF 
info: NoEOF compiler (generated with epsilon)
Done 

In fact, it accepts arbitrary input after the first 'x':

~/git/gamma/build$ echo xsakdjf7 | ./NoEOF 
info: NoEOF compiler (generated with epsilon)
Done 

This was once a feature of Oberon:
If the source code was MODULE Compiler; ... END Compiler. then the parser stopped at the final dot.
The rest of the file was used for test commands:
https://github.com/linkrope/gamma/blob/master/test/oberon0/Sample.Mod#L46-L52
Typically the last line of the source code was something like

END Compiler.Compile *

where you selected Compiler.Compile * in the Oberon system to run the function of the module.
That was pretty cool back then.
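
The stop-at-the-final-symbol behavior can be sketched in a few lines of Python. This is only an illustration of the idea for the NoEOF grammar above, not the code that the generator emits. The function name recognize_prefix and the require_eof flag are hypothetical, and whitespace skipping is an assumption about what the scanner does.

```python
def recognize_prefix(text: str, require_eof: bool = False) -> bool:
    """Top-down recognizer for the one-terminal grammar 'x' (hypothetical sketch)."""
    pos = 0
    # Skip leading whitespace (assumed scanner behavior).
    while pos < len(text) and text[pos].isspace():
        pos += 1
    # The start symbol derives exactly one terminal 'x'.
    if pos >= len(text) or text[pos] != "x":
        return False
    pos += 1
    if not require_eof:
        # Oberon-style: stop after the last symbol of the start rule
        # and never look at the rest of the input.
        return True
    # Strict variant: only whitespace may follow the recognized sentence.
    return text[pos:].strip() == ""


if __name__ == "__main__":
    for sample in ["x\n", "xx\n", "xsakdjf7\n"]:
        print(repr(sample),
              recognize_prefix(sample),
              recognize_prefix(sample, require_eof=True))
```

Without the end check, all three inputs from the report are accepted; with it, only the single 'x' is.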

While this works for top-down parsers, bottom-up parsers usually introduce the extra rule

S' -> S <EOF>

The end symbol is the required lookahead to finally reduce everything on the stack.
So it could be difficult to reproduce the Oberon behavior with the bottom-up parser.
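
To make the role of the end symbol concrete, here is a minimal shift-reduce sketch in Python for the augmented grammar S' -> S <EOF>, S -> 'x'. The table is written by hand for this one grammar and is an assumption for illustration. It does not reflect how a bottom-up parser in gamma is or would be built.

```python
EOF = "<EOF>"

# Hand-built LR(1)-style table for  S' -> S <EOF>,  S -> 'x'.
ACTION = {
    (0, "x"): ("shift", 1),
    (1, EOF): ("reduce", "S", 1),  # reduce S -> 'x' only with <EOF> as lookahead
    (2, EOF): ("accept",),
}
GOTO = {(0, "S"): 2}


def parse(tokens):
    """Return True if the token list is a sentence of the augmented grammar."""
    stack = [0]
    stream = list(tokens) + [EOF]
    pos = 0
    while True:
        action = ACTION.get((stack[-1], stream[pos]))
        if action is None:
            return False  # no table entry: syntax error
        if action[0] == "shift":
            stack.append(action[1])
            pos += 1
        elif action[0] == "reduce":
            _, lhs, length = action
            del stack[-length:]  # pop one state per right-hand-side symbol
            stack.append(GOTO[(stack[-1], lhs)])
        else:
            return True  # accept


if __name__ == "__main__":
    print(parse(["x"]))       # single 'x'
    print(parse(["x", "x"]))  # trailing input after the first 'x'
```

In this sketch the reduction of 'x' to S is only possible when the lookahead is the end symbol, so trailing input is rejected rather than ignored.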

On the other hand, removing this "feature" would break the test cases for Sample.Mod.
You would no longer be able to generate an Oberon compiler...

Didn't know that, even back then.

What do you think about making it a special generator option, since it is quite unusual for other languages?

I guess it only works if the last symbol in the grammar is a terminal, doesn't it?