linkrope / gamma

Extended Affix Grammar Compiler Generator

Error positions of generated compilers are incorrect

kuniss opened this issue

Grammar: https://github.com/linkrope/gamma/blob/master/example/decl-appl.eag
Test source test.declappl (preserve empty lines!):

DECL aa
DECL aa

APPL ab
APPL ba


This results in the following wrong error positions:

>~/git/gamma/build$ .//DeclAppl ../test/test.declappl 
info: DeclAppl compiler (generated with epsilon)
error: predicate 'NotAlreadyDeclared' failed
../test/test.declappl:4:1 APPL ab
                          ^
error: predicate 'Declared' failed
../test/test.declappl:5:1 APPL ba
                          ^
error: predicate 'Declared' failed
../test/test.declappl:7:1 
                          ^
info: errors detected: 3

While these positions are not exactly helpful, they are not really wrong.
In the rule, the predicate follows a terminal.
The generated code for the terminal moves on to the next terminal.
The following predicate uses the position of this next terminal for the error message.
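
The effect described above can be reproduced with a minimal C++ sketch. The names `Token`, `Scanner`, and `parseDecl` are hypothetical and not taken from Gamma's generated code; the sketch only assumes that the scanner has already moved past the last consumed terminal by the time the predicate runs.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token with a 1-based source position.
struct Token {
    std::string text;
    int line;
    int col;
};

// Hypothetical scanner: current() is always the next unconsumed token.
struct Scanner {
    std::vector<Token> tokens;
    std::size_t index = 0;

    const Token& current() const {
        assert(index < tokens.size());
        return tokens[index];
    }
    // Moves to the next token; stays on the last one at end of input.
    void advance() {
        if (index + 1 < tokens.size()) ++index;
    }
};

// The position a failed predicate would report versus the one we want.
struct Positions {
    Token reported;  // where the scanner stands when the predicate runs
    Token wanted;    // the identifier the predicate is actually about
};

// Consumes `'DECL' id` and then "runs" the predicate.
Positions parseDecl(Scanner& s) {
    s.advance();              // consume 'DECL', scanner now stands on the id
    Token id = s.current();   // remember the identifier
    s.advance();              // consume the id, scanner now stands on the next terminal
    return Positions{s.current(), id};
}
```

Fed with the tokens of the test source, the second `DECL aa` yields `APPL` at 4:1 as the reported position, while the offending identifier sits at 2:6.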

Interestingly, SOAG handles one of the two cases better than the other evaluators:

error: predicate Declared failed
stdin:2:6 APPL b
               ^

So, it seems, you're in the best position to fix this issue.

Interesting that the SOAG-generated compiler handles it better while using the same predicate generator... Without having looked into the code, I currently cannot imagine why it should handle it better.

You may assign it to me. I will try to investigate and maybe improve it.

Maybe this is caused by this line:

    for (size_t i = NextHeap + Arity; i >= NextHeap; --i)

... At least, it touches Arity + 1 positions when copying them from the PosStack, one more than I would expect.
Worth giving a change here a try...
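
To make the iteration count concrete, here is a small C++ sketch of the same loop shape. `nextHeap` and `arity` are stand-ins for the variables in the quoted line, and the two helper functions are hypothetical. Whether the extra slot is intended (for instance, for a node header) cannot be told from the loop alone.

```cpp
#include <cassert>
#include <cstddef>

// Counts the iterations of the quoted loop shape, which includes
// the lower bound: nextHeap + arity down to nextHeap.
std::size_t countInclusive(std::size_t nextHeap, std::size_t arity) {
    assert(nextHeap + arity >= nextHeap);  // no overflow in the start index
    std::size_t count = 0;
    for (std::size_t i = nextHeap + arity; i >= nextHeap; --i) {
        ++count;
        // Guard added for this sketch: with an unsigned index and
        // nextHeap == 0, `i >= nextHeap` is always true and --i wraps around.
        if (i == 0) break;
    }
    return count;
}

// Variant that excludes the lower bound and so visits exactly
// `arity` slots: nextHeap + arity down to nextHeap + 1.
std::size_t countExclusive(std::size_t nextHeap, std::size_t arity) {
    assert(nextHeap + arity >= nextHeap);  // no overflow in the start index
    std::size_t count = 0;
    for (std::size_t i = nextHeap + arity; i > nextHeap; --i) {
        ++count;
    }
    return count;
}
```

For `nextHeap = 10` and `arity = 3`, the inclusive form visits slots 13, 12, 11, and 10 (four slots), while the exclusive form visits only 13, 12, and 11.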

Didn't take off. The intended "correction" did not change any position...

Nevertheless, the mentioned line of code looks like an off-by-one error, doesn't it, @linkrope?

I am more and more coming to the conclusion that this is a conceptual problem in Gamma: how to determine a helpful error position for failed predicates.

First, I took a closer look at the SOAG error positions for predicates, as they seemed to be more accurate than those of the other evaluators. They are computed generically in eSOGA.d::GenPredPos.
In fact, they are more accurate only by chance. The SOAG generator tries to find the first visit before the predicate call and takes the position associated with the symbol occurrence that visit is intended for. By chance, this may be the symbol the predicate applies to. But it may also be the visit for the last symbol in the rule. In that case, the position of the next terminal (which may not even be covered by the rule at hand) is used. That happens for the NotAlreadyDeclared predicate in the given example, whereas the Declared predicate happens to target the visit of the id symbol, so the right position is picked accidentally.

The Single Sweep evaluator generator computes the predicate error positions in sweep.d::GenerateNont. It applies a similar strategy: it traverses the symbol occurrences analyzed before, according to the computed sweep order, and takes the position associated with that symbol. Almost always, this is not the symbol the predicate is associated with. So, the error positions reported by the Single Sweep based compiler in our example all point to the wrong position: the next identifier in the token stream, which is not even covered by the rule containing the failing predicate.

The LEAG evaluator, which is embedded in the LL(1) parser, acts even more simply. It just uses the token position at which the scanner currently stands while the predicate is executed. If there is no terminal behind the predicate in the rule (which is the case in our example), the error position will always point to the wrong token, as happens in our example.

To overcome this, we need to associate the error position of a predicate with the position provided by a terminal hyper symbol whose value is applied at a defining affix position of the predicate. E.g., for the rule alternative

        { <- Table, + Table1>
            'DECL' id <id>
            NotAlreadyDeclared <id, Table>
            <id ';' Table, Table1>
        |

the position of the terminal recognized by the hyper terminal id could be used as the position provider, since its synthesized affix value, represented by the affix variable id, appears at the defining affix position of the NotAlreadyDeclared predicate. The best way would be to ask the specification author to put the affix value created by a hyper terminal (maybe indirectly) at the first affix position of a predicate, as a convention.
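
A rough C++ sketch of that idea follows. It is not Gamma code: `Position`, `TerminalPositions`, and `predicateErrorPosition` are hypothetical names, and the sketch assumes the generator can record, per affix variable, the position of the hyper terminal that synthesized it. It relaxes the proposed convention slightly by taking the first argument that has a recorded position, and falls back to the scanner position otherwise.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// 1-based source position.
struct Position {
    int line;
    int col;
};

// Hypothetical bookkeeping: affix variable name -> position of the
// hyper terminal that synthesized its value.
using TerminalPositions = std::map<std::string, Position>;

// Chooses the error position for a failed predicate call.
// Prefers the first affix argument bound by a hyper terminal;
// otherwise falls back to where the scanner currently stands.
Position predicateErrorPosition(const std::vector<std::string>& affixArgs,
                                const TerminalPositions& bound,
                                Position scannerPos) {
    assert(scannerPos.line >= 1 && scannerPos.col >= 1);  // positions are 1-based
    for (const std::string& arg : affixArgs) {
        auto it = bound.find(arg);
        if (it != bound.end()) {
            return it->second;
        }
    }
    return scannerPos;
}
```

For `NotAlreadyDeclared <id, Table>` with `id` recorded at 2:6 and the scanner standing at 4:1, the function returns 2:6. A predicate whose arguments were not produced by any hyper terminal still reports the scanner position.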