jfinkels / gast

Python AST that abstracts the underlying Python version

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

GAST, daou naer!

A generic AST to represent Python2 and Python3's Abstract Syntax Tree(AST).

GAST provides a compatibility layer between the AST of various Python versions, as produced by ast.parse from the standard ast module.

Basic Usage

>>> import ast, gast
>>> code = open('file.py').read()
>>> tree = ast.parse(code)
>>> gtree = gast.ast_to_gast(tree)
>>> ... # process gtree
>>> tree = gast.gast_to_ast(gtree)
>>> ... # do stuff specific to tree

API

gast provides the same API as the ast module. All functions and classes from the ast module are also available in the gast module, and work accordingly on the gast tree.

Three notable exceptions:

  1. gast.parse directly produces a GAST node. It's equivalent to running
    gast.ast_to_gast on the output of ast.parse.
  2. gast.dump dumps the gast common representation, not the original
    one.
  3. gast.gast_to_ast and gast.ast_to_gast can be used to convert
    from one ast to the other, back and forth.

Version Compatibility

GAST is tested using tox and Travis on the following Python versions:

  • 2.7
  • 3.0
  • 3.1
  • 3.2
  • 3.3
  • 3.4
  • 3.5

AST Changes

Python3

The AST used by GAST is the same as the one used in Python3.5, to the notable exception of ast.arg being replaced by ast.Name with an ast.Param context. Addiitonnaly, ast.Name have an extra annotation field.

Python2

To cope with Python3 features, several nodes from the Python2 AST are extended with some new attributes/children.

  • FunctionDef nodes have a returns attribute.
  • ClassDef nodes have a keywords attribute.
  • With's context_expr and optional_vars fields are hold in a withitem object.
  • Raise's type, inst and tback fields are hold in a single exc field, using the transformation raise E, V, T => raise E(V).with_traceback(T).
  • TryExcept and TryFinally nodes are merged in the Try node.
  • Name nodes have an annotation attribute.
  • arguments nodes have a kwonlyargs and kw_defaults attributes.
  • Call nodes loose their starargs attribute, replaced by an argument wrapped in a Starred node. They also loose their kwargs attribute, wrapped in a keyword node with the identifier set to None, as done in Python3.

Pit Falls

  • In Python3, None, True and False are parsed as NamedConstant while they are parsed as regular Name in Python2. gast uses the same convention.

ASDL

-- ASDL's six builtin types are identifier, int, string, bytes, object, singleton

module Python
{
    mod = Module(stmt* body)
        | Interactive(stmt* body)
        | Expression(expr body)

        -- not really an actual node but useful in Jython's typesystem.
        | Suite(stmt* body)

    stmt = FunctionDef(identifier name, arguments args,
                       stmt* body, expr* decorator_list, expr? returns)
          | AsyncFunctionDef(identifier name, arguments args,
                             stmt* body, expr* decorator_list, expr? returns)

          | ClassDef(identifier name,
             expr* bases,
             keyword* keywords,
             stmt* body,
             expr* decorator_list)
          | Return(expr? value)

          | Delete(expr* targets)
          | Assign(expr* targets, expr value)
          | AugAssign(expr target, operator op, expr value)

          -- not sure if bool is allowed, can always use int
          | Print(expr? dest, expr* values, bool nl)

          -- use 'orelse' because else is a keyword in target languages
          | For(expr target, expr iter, stmt* body, stmt* orelse)
          | AsyncFor(expr target, expr iter, stmt* body, stmt* orelse)
          | While(expr test, stmt* body, stmt* orelse)
          | If(expr test, stmt* body, stmt* orelse)
          | With(withitem* items, stmt* body)
          | AsyncWith(withitem* items, stmt* body)

          | Raise(expr? exc, expr? cause)
          | Try(stmt* body, excepthandler* handlers, stmt* orelse, stmt* finalbody)
          | Assert(expr test, expr? msg)

          | Import(alias* names)
          | ImportFrom(identifier? module, alias* names, int? level)

          | Global(identifier* names)
          | Nonlocal(identifier* names)
          | Expr(expr value)
          | Pass | Break | Continue

          -- XXX Jython will be different
          -- col_offset is the byte offset in the utf8 string the parser uses
          attributes (int lineno, int col_offset)

          -- BoolOp() can use left & right?
    expr = BoolOp(boolop op, expr* values)
         | BinOp(expr left, operator op, expr right)
         | UnaryOp(unaryop op, expr operand)
         | Lambda(arguments args, expr body)
         | IfExp(expr test, expr body, expr orelse)
         | Dict(expr* keys, expr* values)
         | Set(expr* elts)
         | ListComp(expr elt, comprehension* generators)
         | SetComp(expr elt, comprehension* generators)
         | DictComp(expr key, expr value, comprehension* generators)
         | GeneratorExp(expr elt, comprehension* generators)
         -- the grammar constrains where yield expressions can occur
         | Await(expr value)
         | Yield(expr? value)
         | YieldFrom(expr value)
         -- need sequences for compare to distinguish between
         -- x < 4 < 3 and (x < 4) < 3
         | Compare(expr left, cmpop* ops, expr* comparators)
         | Call(expr func, expr* args, keyword* keywords)
         | Num(object n) -- a number as a PyObject.
         | Str(string s) -- need to specify raw, unicode, etc?
         | Bytes(bytes s)
         | NameConstant(singleton value)
         | Ellipsis

         -- the following expression can appear in assignment context
         | Attribute(expr value, identifier attr, expr_context ctx)
         | Subscript(expr value, slice slice, expr_context ctx)
         | Starred(expr value, expr_context ctx)
         | Name(identifier id, expr_context ctx, expr? annotation)
         | List(expr* elts, expr_context ctx)
         | Tuple(expr* elts, expr_context ctx)

          -- col_offset is the byte offset in the utf8 string the parser uses
          attributes (int lineno, int col_offset)

    expr_context = Load | Store | Del | AugLoad | AugStore | Param

    slice = Slice(expr? lower, expr? upper, expr? step)
          | ExtSlice(slice* dims)
          | Index(expr value)

    boolop = And | Or

    operator = Add | Sub | Mult | MatMult | Div | Mod | Pow | LShift
                 | RShift | BitOr | BitXor | BitAnd | FloorDiv

    unaryop = Invert | Not | UAdd | USub

    cmpop = Eq | NotEq | Lt | LtE | Gt | GtE | Is | IsNot | In | NotIn

    comprehension = (expr target, expr iter, expr* ifs)

    excepthandler = ExceptHandler(expr? type, expr? name, stmt* body)
                    attributes (int lineno, int col_offset)

    arguments = (expr* args, expr? vararg, expr* kwonlyargs, expr* kw_defaults,
                 expr? kwarg, expr* defaults)

    -- keyword arguments supplied to call (NULL identifier for **kwargs)
    keyword = (identifier? arg, expr value)

    -- import name with optional 'as' alias.
    alias = (identifier name, identifier? asname)

    withitem = (expr context_expr, expr? optional_vars)
}

About

Python AST that abstracts the underlying Python version

License:BSD 3-Clause "New" or "Revised" License


Languages

Language:Python 100.0%