mewmew / uc

A compiler for the µC language.

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

parser: Add source position tracking of nodes

mewmew opened this issue · comments

commented

From ast.go:

// TODO: Add source position tracking of nodes.

This requires some careful considerations. How would we like to store the node source position information? Should we always store both the start and the end of a node, or just the start and calculate the end from the length of the node (would this always be possible?).

Related to issue #22 which tracks the pretty-printing of line:col pairs.

commented

Gocc already provides the positional information when generating error messages for input streams which are not reducible by the grammar. We may therefore postpone the implementation of this feature until the semantic analysis stage, at which point positional information should be present for each node of the AST.

commented

Positional information was added to each AST node with commit 313ddfd.

Tracking of positional information was added in commit 5080451. Note, this commit breaks all parser test cases since they don't contain any positional information yet.

Commit 2ba2f0f added positional information to the quiet/lexer/l05.c test case. All other parser test cases are still broken.

commented

Updated by commit 17205c0.

Updated by commit b64e65e

commented

As far as I can tell, this issue was resolved with commit b64e65e. Anything else we need before closing it?

The conversion of positional information to line:col pairs is tracked by issue #25.

Well it depends on if we want to clean up unnecessary to track positions, and just track the essential ones.

commented

Well it depends on if we want to clean up unnecessary to track positions, and just track the essential ones.

That's true. Lets give it a couple of days to sink in before we take a decision. Basically, once we finish the semantic analysis phase, we will have a better understanding of which parts of the AST API we are actually using. At that point, perhaps we'll get rid of the rest to optimize memory usage.

commented

Lets discuss this issue next time we meet. Which positional tracking information should we include in the nodes? The minimum required for good semantic messages, the current information, or all information of token positions?

commented

To facilitate generation of informative error messages, a decision has been taken to store positional information for each relevant token.

An example from clang is presented below, where the right-parenthesis ) is pointed out in the error message.

Contents of a.c:

void f(int a, int b, int c) {
}

int main(void) {
    f(1, 2, 3, 4, 5);
}
u@x1 ~/Desktop> clang -o a a.c
a.c:5:13: error: too many arguments to function call, expected 3, have 5
        f(1, 2, 3, 4, 5);
        ~          ^~~~
a.c:1:1: note: 'f' declared here
void f(int a, int b, int c) {
^
1 error generated.

Closing this issue for now.