A pure Python ECMAScript 5.1 parser and engine.
Unpack the archive, cd into the source directory, and run the following command:
python setup.py install
Assuming you have pip and git installed, run the following command to install from the GitHub repository:
pip install git+git://github.com/jefkistler/BigRig.git#egg=BigRig
The setup.py installer will install a script named bigrig that provides a basic shell for executing scripts. With no arguments the script will launch an interactive read-evaluate-print loop:
$ bigrig
> (function() { return "Hello World!"; })()
Hello World!
>
Press Ctrl+D to exit the shell, or Ctrl+C to reset the prompt.
Positional filename arguments may be given, corresponding to script files that will be executed in the given order in the same execution context:
$ bigrig ./script.js ...
The console object provides a single log method that will print the toString representations of the given arguments:
$ bigrig
> console.log("Hello World!");
Hello World!
undefined
>
The --eval or -e flag can be specified to execute a string:
$ bigrig -e "console.log('test');"
test
$
The main interface for using the parsing library is found in the bigrig.parser module. To parse an ECMAScript file into an abstract syntax tree, the utility function parse_file is provided:
from bigrig import parser
ast = parser.parse_file('/path/to/an/ecmascript/file.js', encoding='utf-8')
Upon encountering unparseable source, the parser will raise a bigrig.parser.ParseException with what is hopefully a useful error message. Note that parse_file accepts the keyword arguments line, which is the line number of the start of the source file; column, which is the character offset on the current line at which the source file begins; and encoding, which is the character encoding of the source file, so that it may be converted to unicode internally.
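As a minimal sketch of handling a parse failure, the helper below wraps parse_file and reports a ParseException instead of letting it propagate. The name parse_or_report is hypothetical and not part of BigRig; it relies only on parse_file and ParseException as described above, and forwards whatever keyword arguments (line, column, encoding) it is given.

```python
import sys


def parse_or_report(path, **options):
    """Return the syntax tree for path, or None after reporting a parse error.

    Hypothetical helper: options are passed straight through to parse_file,
    so line, column and encoding keep whatever defaults the library defines.
    """
    # Imported lazily so the helper can be defined before bigrig is installed.
    from bigrig import parser

    try:
        return parser.parse_file(path, **options)
    except parser.ParseException as error:
        # Write the message to stderr and signal failure with None.
        sys.stderr.write('%s: %s\n' % (path, error))
        return None
```

Calling parse_or_report('/path/to/an/ecmascript/file.js', encoding='utf-8') would then return the tree on success and None on a syntax error. A script embedded partway through another file could pass line and column as well, so that reported locations refer to the enclosing file.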
The utility function bigrig.parser.parse_string works in a similar fashion to parse_file except that it accepts source as a string instead of the path to a file. If you'd like to associate a file name with the source for location tracking, it accepts one in the keyword argument filename.
If you would like more control over parsing productions, you can use the parser building utility functions found in bigrig.parser in the form of make_file_parser and make_string_parser. These utilities simply build a parser for the given inputs without attempting to parse anything. This is useful if you want to see the result of parsing a production other than Program by calling one of the parse-prefixed parsing methods. Here's a quick example of parsing a function declaration using a Parser object:
from bigrig import parser
source = 'function example() { console.log("example"); }'
parser_obj = parser.make_string_parser(source)
function_node = parser_obj.parse_function_declaration()
The abstract syntax tree is composed of bigrig.parser.node.Node objects, with some terminals being expressed as list, None and unicode objects. To navigate the tree, nodes provide a simple fields and attributes interface. Fields represent child nodes in the parse tree and attributes are metadata about the node. To examine a node's fields, an iterable of available field names is stored in the node_object.fields attribute and may be examined using the iter_fields generator method, which yields (name, value) pairs. If you simply want to iterate over the child values, nodes provide an iter_children generator method.
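The traversal described above can be sketched with a small stand-in. The Node, Name and Call classes below are NOT BigRig's classes; they only mimic the fields, iter_fields and iter_children interface so the example runs on its own. The walk function is likewise hypothetical and shows one way to handle nodes, lists and other terminals uniformly.

```python
class Node(object):
    """Stand-in node that mimics the described interface (not bigrig's class)."""
    fields = ()

    def __init__(self, **values):
        # Store each declared field as an attribute, defaulting to None.
        for name in self.fields:
            setattr(self, name, values.get(name))

    def iter_fields(self):
        # Yield (name, value) pairs in declaration order.
        for name in self.fields:
            yield name, getattr(self, name)

    def iter_children(self):
        # Yield only the values of the fields.
        for _, value in self.iter_fields():
            yield value


class Name(Node):
    fields = ('value',)


class Call(Node):
    fields = ('callee', 'arguments')


def walk(value, depth=0, out=None):
    """Collect an indented outline of a tree of nodes, lists and terminals."""
    if out is None:
        out = []
    indent = '  ' * depth
    if isinstance(value, Node):
        out.append(indent + type(value).__name__)
        for child in value.iter_children():
            walk(child, depth + 1, out)
    elif isinstance(value, list):
        out.append(indent + 'list')
        for item in value:
            walk(item, depth + 1, out)
    else:
        # Terminals such as strings or None are shown by their repr.
        out.append(indent + repr(value))
    return out


tree = Call(callee=Name(value='log'), arguments=[Name(value='x'), None])
field_names = [name for name, _ in tree.iter_fields()]
outline = walk(tree)
print('\n'.join(outline))
```

With a real tree returned by parse_file, the same walk function should apply as long as the nodes expose iter_children as described, with the isinstance check pointed at bigrig.parser.node.Node instead of the stand-in.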
To see the available node types that are built by the default Parser class, have a look over the bigrig.parser.ast module. If these node types are insufficient for your needs, have a look at the bigrig.parser.factory module, which contains the base node building mixin class that the default Parser class uses to build the abstract syntax tree. Making your own node factory parser mixin class will allow you to customize the abstract syntax tree that the parser will build.
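To illustrate the mixin idea in general terms, here is a rough sketch in which a parser core delegates node construction to a factory method supplied by a mixin. Every class and method name below (BaseParser, parse_number, create_number and the two factories) is hypothetical and chosen for illustration; the real hook names are whatever bigrig.parser.factory defines.

```python
class BaseParser(object):
    """Stand-in parser core: recognizes input, delegates node building."""

    def parse_number(self, text):
        # The core never builds nodes itself; it asks the factory mixin.
        return self.create_number(float(text))


class TupleNodeFactory(object):
    """Hypothetical factory mixin that builds nodes as tuples."""

    def create_number(self, value):
        return ('Number', value)


class DictNodeFactory(object):
    """Hypothetical factory mixin that builds nodes as dictionaries."""

    def create_number(self, value):
        return {'type': 'Number', 'value': value}


# Listing the mixin first lets its create_number satisfy the parser core.
class TupleParser(TupleNodeFactory, BaseParser):
    pass


class DictParser(DictNodeFactory, BaseParser):
    pass


tuple_node = TupleParser().parse_number('42')
dict_node = DictParser().parse_number('42')
print(tuple_node)
print(dict_node)
```

The parsing logic is written once, and swapping the mixin changes the shape of the resulting tree without touching the grammar code.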
The ECMAScript tokenizing class is found in the bigrig.parser.scanner module. This module provides the utility functions make_file_scanner and make_string_scanner to quickly build tokenizers for ECMAScript source files and strings respectively. The Token types are defined within the bigrig.token module, so look there to see what the various lexical tokens are. The public interface of the scanner class consists simply of a next method, which produces the next lexical token from the input. To facilitate parsing source with lookahead, the bigrig.parser.scanner.TokenStream class provides a light buffering wrapper around Scanner objects, adding the peek method which returns the next Token in the source without advancing the stream state. Here's a quick example of tokenizing an ECMAScript string:
from bigrig.parser import scanner

source = 'if (token) { console.log(token); } else { console.log("error!"); }'
scanner_obj = scanner.make_string_scanner(source)

# Pull tokens one at a time until the end of the input is reached.
while True:
    token = scanner_obj.next()
    if token.type == 'EOF':
        break
    print(token)
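The one-token lookahead that TokenStream is described as providing can be sketched as a thin buffer around any object with a next method. The classes below are stand-ins written for this example and are not BigRig's implementation: ListScanner replays a prepared list of strings in place of real tokens, and PeekableStream is a hypothetical name.

```python
class ListScanner(object):
    """Stand-in scanner: returns prepared tokens, then 'EOF' forever."""

    def __init__(self, tokens):
        self._tokens = list(tokens)
        self._index = 0

    def next(self):
        if self._index < len(self._tokens):
            token = self._tokens[self._index]
            self._index += 1
            return token
        return 'EOF'


class PeekableStream(object):
    """Hypothetical wrapper that adds peek() on top of a scanner's next()."""

    _EMPTY = object()  # Sentinel meaning that no token is buffered.

    def __init__(self, scanner):
        self._scanner = scanner
        self._buffered = self._EMPTY

    def peek(self):
        # Fetch and remember the upcoming token without consuming it.
        if self._buffered is self._EMPTY:
            self._buffered = self._scanner.next()
        return self._buffered

    def next(self):
        # Hand out the buffered token first, if there is one.
        if self._buffered is not self._EMPTY:
            token, self._buffered = self._buffered, self._EMPTY
            return token
        return self._scanner.next()


stream = PeekableStream(ListScanner(['if', '(', 'token', ')']))
first_peek = stream.peek()
second_peek = stream.peek()
first_next = stream.next()
second_next = stream.next()
print(first_peek, second_peek, first_next, second_next)
```

Repeated calls to peek return the same token until next consumes it, which lets a parser decide which production to try based on the upcoming token.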