CSTParser

A parser for Julia using Tokenize that aims to extend the built-in parser by providing additional meta information along with the resultant AST.

Additional Output

EXPRs are iterable, producing their children in the order that they appear in the source code, including punctuation.

Example:
```
f(x) = x*2 becomes [f(x), =, x*2]
f(x) becomes [f, (, x, )]
```
The byte span of each EXPR is stored, allowing a mapping between a byte position in the source code and the relevant parsed expression. The span of a single token includes any trailing whitespace, newlines or comments. This also allows for fast partial parsing of modified source code.
Formatting hints are generated as the source code is parsed (e.g. mismatched indents for blocks, missing white space around operators).
Declarations of modules, functions, datatypes and variables are tracked and stored in the relevant hierarchical scopes attached to the expressions that declare the scope. This allows for a mapping between any identifying symbol and the code that it refers to.
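
The iteration order and the byte-span lookup described above can be illustrated with a small self-contained sketch. It does not use the package: `Node` and `path_at` are hypothetical stand-ins written for this example, the spans are assigned by hand, and offsets are assumed to be 0-based. The only behaviour taken from the text is that children come back in source order, punctuation included, and that a token's span covers its trailing whitespace.

```julia
# Toy stand-in for a parsed node. This is NOT one of the package's types.
struct Node
    label::String
    span::Int                # width in bytes, trailing whitespace included
    children::Vector{Node}
end

# Convenience constructor for leaf tokens (no children).
Node(label, span) = Node(label, span, Node[])

# Iterating a node yields its children in source order.
Base.iterate(n::Node, state=1) =
    state > length(n.children) ? nothing : (n.children[state], state + 1)
Base.length(n::Node) = length(n.children)
Base.eltype(::Type{Node}) = Node

# Collect the labels of the nodes containing the 0-based byte `offset`,
# outermost first, by accumulating child spans from left to right.
function path_at(node::Node, offset::Int, path=String[])
    push!(path, node.label)
    start = 0
    for child in node
        if start <= offset < start + child.span
            return path_at(child, offset - start, path)
        end
        start += child.span
    end
    return path
end

# Hand-built tree for the 10-byte source "f(x) = x*2".
call = Node("f(x)", 5, [Node("f", 1), Node("(", 1), Node("x", 1), Node(")", 2)])
rhs  = Node("x*2", 3, [Node("x", 1), Node("*", 1), Node("2", 1)])
root = Node("f(x) = x*2", 10, [call, Node("=", 2), rhs])

[c.label for c in root]   # ["f(x)", "=", "x*2"]
[c.label for c in call]   # ["f", "(", "x", ")"]
path_at(root, 8)          # ["f(x) = x*2", "x*2", "*"]
```

Because every span includes trailing whitespace, the children's spans add up exactly to the parent's span, so the lookup never falls into a gap between siblings.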

Structure

Expressions are represented solely by the following types:

Parser.SyntaxNode
  Parser.EXPR
  Parser.INSTANCE
    Parser.HEAD{K}
    Parser.IDENTIFIER
    Parser.KEYWORD{K}
    Parser.LITERAL{K}
    Parser.OPERATOR{P,K,dot}
    Parser.PUNCTUATION{K}
  Parser.QUOTENODE

The K parameterisation refers to the kind of the associated token as specified by Tokenize.Tokens.Kind. The P and dot parameters for operators refer to the precedence of the operator and whether it is dotted (e.g. .+).

INSTANCEs represent singular objects that may have a concrete or implicit relation to a portion of the source text. In the former case they have a span storing the width in bytes that they occupy in the source text; in the latter case their span is 0. Additionally, IDENTIFIERs store their value as a Symbol and LITERALs as a String.

EXPRs are equivalent to Base.Expr but have extra fields to store their span and any punctuation tokens.
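
As a rough illustration of how such parameterised types behave in Julia, the sketch below mirrors part of the hierarchy with hypothetical `Toy*` types. These are not the package's definitions: the field names are assumptions, plain Symbols stand in for Tokenize.Tokens.Kind values, and the precedence number is arbitrary. The point is only that the kind, precedence and dottedness live in the type itself and can be recovered through dispatch, while the instance stores just its span (and, for identifiers, a Symbol value).

```julia
# Hypothetical mirror of part of the hierarchy; NOT the package's definitions.
abstract type ToyNode end
abstract type ToyInstance <: ToyNode end

struct ToyKeyword{K} <: ToyInstance
    span::Int        # width in bytes; 0 for an implicit token
end

struct ToyIdentifier <: ToyInstance
    span::Int
    val::Symbol      # identifiers carry their value as a Symbol
end

struct ToyOperator{P,K,dot} <: ToyInstance
    span::Int
end

# Read the type parameters back through dispatch.
kind(::ToyKeyword{K}) where {K} = K
kind(::ToyOperator{P,K,dot}) where {P,K,dot} = K
precedence(::ToyOperator{P,K,dot}) where {P,K,dot} = P
isdotted(::ToyOperator{P,K,dot}) where {P,K,dot} = dot

# Symbols stand in for token kinds; 11 is an arbitrary precedence value.
plus    = ToyOperator{11,:PLUS,false}(2)   # e.g. "+ " occupies 2 bytes
dotplus = ToyOperator{11,:PLUS,true}(3)    # e.g. ".+ " occupies 3 bytes
kw      = ToyKeyword{:FUNCTION}(9)         # e.g. "function " occupies 9 bytes
name    = ToyIdentifier(1, :f)

kind(kw)           # :FUNCTION
isdotted(dotplus)  # true
```

Encoding these properties as type parameters lets methods specialise on a particular keyword or operator without storing extra fields in every instance.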
