Keep `Span` data for AST nodes

MovingtoMars opened this issue 8 years ago

Keeping Span data for AST nodes allows nice error messages in the semantic analysis phase. It is also a step towards something like gofmt.

Perhaps something like

pub struct Spanned<T> {
    value: T,
    span: Span,
}

and then update parser like so:

fn parse_import_decl(&mut self) -> PResult<Spanned<ast::ImportDecl>> {

Or even:

pub type PResult<T> = ::std::result::Result<Spanned<T>, Error>;
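
For illustration, here is a rough sketch of how the second variant might fit together. It assumes `Span` is a pair of 32-bit byte offsets; the `Error` and `ImportDecl` types and the string-based toy parser are hypothetical stand-ins, not rgo's actual code.

```rust
/// Byte range in the source file. Two 32-bit offsets is an assumption.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Span {
    pub start: u32,
    pub end: u32,
}

/// A value together with the source range it was parsed from.
#[derive(Debug, Clone, PartialEq)]
pub struct Spanned<T> {
    pub value: T,
    pub span: Span,
}

impl<T> Spanned<T> {
    /// Transform the wrapped value while keeping the span.
    pub fn map<U, F: FnOnce(T) -> U>(self, f: F) -> Spanned<U> {
        Spanned { value: f(self.value), span: self.span }
    }
}

/// Hypothetical error type; it carries a span so errors can point at source.
#[derive(Debug, PartialEq)]
pub struct Error {
    pub message: String,
    pub span: Span,
}

/// Every successful parse result is wrapped in `Spanned`.
pub type PResult<T> = Result<Spanned<T>, Error>;

/// Hypothetical stand-in for `ast::ImportDecl`.
#[derive(Debug, PartialEq)]
pub struct ImportDecl {
    pub path: String,
}

/// Toy parser for `import "path"` that works on a string instead of tokens.
pub fn parse_import_decl(src: &str) -> PResult<ImportDecl> {
    let whole = Span { start: 0, end: src.len() as u32 };
    let rest = src.strip_prefix("import ").ok_or_else(|| Error {
        message: "expected `import`".to_string(),
        span: whole,
    })?;
    let path = rest.trim().trim_matches('"').to_string();
    Ok(Spanned { value: ImportDecl { path }, span: whole })
}
```

With this alias the signature stays short (`PResult<ImportDecl>`), at the cost that every parse function returns a span whether or not the caller needs one.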

Yohaï-Eliel Berreby · Answer 1 · Sat May 21 2016 15:56:15 GMT+0800 (China Standard Time)

Sounds like a good idea, we'll need position info, indeed. I fear the size overhead is going to be massive if we use Spanned<T> everywhere, though. The Go AST implementation uses an interface, and sometimes computes the end position on the fly based on the position of inner nodes, which could save space.
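
A quick way to put numbers on that overhead is to measure it. This is only a sketch: it assumes `Span` is two `u32` offsets, and `span_overhead` is a hypothetical helper.

```rust
use std::mem::size_of;

/// Assumed layout: two 32-bit byte offsets, 8 bytes in total.
pub struct Span {
    pub start: u32,
    pub end: u32,
}

pub struct Spanned<T> {
    pub value: T,
    pub span: Span,
}

/// Extra bytes that wrapping `T` in `Spanned<T>` costs, padding included.
pub fn span_overhead<T>() -> usize {
    size_of::<Spanned<T>>() - size_of::<T>()
}
```

Under that assumption the cost is at least 8 bytes per wrapped node, and somewhat more for small payloads because of alignment padding.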

What approach did you use for ark?

Liam · Answer 2 · Sat May 21 2016 16:05:21 GMT+0800 (China Standard Time)

In Ark we use

type nodePos struct {
    pos lexer.Position
}

which gets embedded in every AST node, using Go's struct embedding.

In rgo, Span is only 64 bits, which isn't too bad. I've got an idea for a try_parse!(val) macro that:

1. records the start of the current token
2. runs try!(val)
3. records the end of the previous token
4. returns val wrapped in Spanned<T>, using the two recorded offsets

This seems like the option that'll take the least code to support.
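
A minimal sketch of what such a macro could look like. The `Parser`, `Token`, and `Error` types and the `current_start`, `prev_end`, `bump`, and `parse_pair` methods are hypothetical, the macro takes the parser as an explicit first argument, and a `match` stands in for `try!` so the code compiles on any Rust edition.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Span { pub start: u32, pub end: u32 }

#[derive(Debug, PartialEq)]
pub struct Spanned<T> { pub value: T, pub span: Span }

#[derive(Debug, PartialEq)]
pub struct Error(pub String);

/// Hypothetical token: only its byte range matters here.
#[derive(Debug, Clone, Copy)]
pub struct Token { pub start: u32, pub end: u32 }

/// Evaluates `$val`, returns early on error, and wraps the success value
/// in `Spanned` using offsets recorded before and after parsing.
macro_rules! try_parse {
    ($parser:expr, $val:expr) => {{
        let start = $parser.current_start();
        let v = match $val {
            Ok(v) => v,
            Err(e) => return Err(e.into()),
        };
        let end = $parser.prev_end();
        Spanned { value: v, span: Span { start: start, end: end } }
    }};
}

pub struct Parser { tokens: Vec<Token>, pos: usize }

impl Parser {
    pub fn new(tokens: Vec<Token>) -> Self { Parser { tokens, pos: 0 } }

    /// Start offset of the current (not yet consumed) token.
    fn current_start(&self) -> u32 {
        match self.tokens.get(self.pos) {
            Some(t) => t.start,
            None => self.prev_end(),
        }
    }

    /// End offset of the most recently consumed token (0 if none).
    fn prev_end(&self) -> u32 {
        if self.pos == 0 { 0 } else { self.tokens[self.pos - 1].end }
    }

    /// Consume one token, or fail at end of input.
    fn bump(&mut self) -> Result<Token, Error> {
        let tok = self.tokens.get(self.pos).copied()
            .ok_or_else(|| Error("unexpected end of input".to_string()))?;
        self.pos += 1;
        Ok(tok)
    }

    /// Toy rule: consume two tokens and report how many were consumed.
    pub fn parse_pair(&mut self) -> Result<Spanned<usize>, Error> {
        let pair = try_parse!(self, self.bump().and_then(|_| self.bump()).map(|_| 2usize));
        Ok(pair)
    }
}
```

The appeal of this shape is that individual parse functions never touch offsets themselves; the macro does the bookkeeping at each call site.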

Another option would be, like you said, to store spans only for the primitive nodes, and then to compute them for more complex nodes based on their children. This would save some memory, but require much more implementation work.
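
Roughly, that alternative might look like the sketch below. The `Node` trait and the `Ident` and `BinaryExpr` types are hypothetical; only leaf nodes store a span, and composite nodes derive theirs on demand.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Span { pub start: u32, pub end: u32 }

/// Every AST node can report its span, stored or computed.
pub trait Node {
    fn span(&self) -> Span;
}

/// Leaf node: stores its span directly.
pub struct Ident {
    pub name: String,
    pub span: Span,
}

/// Composite node: stores no span of its own.
pub struct BinaryExpr {
    pub lhs: Box<dyn Node>,
    pub op: char,
    pub rhs: Box<dyn Node>,
}

impl Node for Ident {
    fn span(&self) -> Span { self.span }
}

impl Node for BinaryExpr {
    /// Runs from the start of the left operand to the end of the right one.
    fn span(&self) -> Span {
        Span { start: self.lhs.span().start, end: self.rhs.span().end }
    }
}
```

The extra work is that each composite node type needs its own `span` implementation, and nodes that begin or end with a keyword or delimiter still have to store at least one offset.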

Yohaï-Eliel Berreby · Answer 3 · Sat May 21 2016 16:10:53 GMT+0800 (China Standard Time)

64 bits are not too much, indeed. Your try_parse! macro idea looks good. 👍