jacegu / treeton

Treeton

Treeton is a JSON parser built on top of Treetop.

This is a pet project I developed to learn about grammars and parser generators. Along the way I have also gained a deeper understanding of the JSON format.

Disclaimer

Despite its really cool name, this project is not intended to be used in production environments.

If you don't want to believe me and think that Treeton is so cool that it's worth a try, let me throw a couple of benchmarks at you.

This is the profiling of Treeton and Yajl when parsing the same JSON:

//TODO

Usage

//TODO

The Grammar

You can read the whole JSON RFC, but the railroad diagrams give you the grammar in a nutshell.

I picked the JSON grammar for this experiment because it's fairly simple. It is basically composed of six types: booleans, strings, numbers, arrays, objects and null.

Let's look at the grammar of each one of them, and how it has been translated to Treetop rules:

Values

Railroad diagram for value grammar

Which is translated nicely into:

rule value
  string / number / array / object / true / false / null
end

The boolean values true, false and the null value are terminals expressed in their own rules:

rule true
  'true'
end

rule false
  'false'
end

rule null
  'null'
end
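
To make the ordered choice in the value rule concrete, here is a minimal plain-Ruby sketch of how the three literal terminals could be tried one after another. It is not Treeton code: `parse_literal` and `LITERALS` are hypothetical names, and it uses StringScanner from the standard library instead of a Treetop-generated parser.

```ruby
require 'strscan'

# Hypothetical table (not part of Treeton): each terminal and the Ruby
# value it would stand for.
LITERALS = { /true/ => true, /false/ => false, /null/ => nil }.freeze

# Tries each literal in order, the way a PEG ordered choice (`/`) tries
# its alternatives. Returns [:ok, value] or [:no_match, nil]; the tag is
# needed because nil is itself a valid result for 'null'.
def parse_literal(input)
  scanner = StringScanner.new(input)
  LITERALS.each do |pattern, ruby_value|
    # scan only matches at the current position and advances on success
    next unless scanner.scan(pattern)
    # Accept only if the literal consumed the whole input
    return [:ok, ruby_value] if scanner.eos?
    scanner.reset
  end
  [:no_match, nil]
end
```

For example, `parse_literal('null')` returns `[:ok, nil]`, while `parse_literal('nul')` returns `[:no_match, nil]`.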

Numbers

Railroad diagram for numbers grammar

This has been translated into the following rule:

rule number
  integer_part decimal_part? exponent?
end

You can check out the detailed rules in the numbers grammar.
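
The sub-rules are not shown here, so the following is only a sketch of what the three parts could look like, written as plain Ruby regular expressions that follow the number syntax of the JSON RFC. The constant names mirror the rule names above, but their bodies are assumptions (for instance, that the optional minus sign belongs to `integer_part`), not Treeton's actual definitions.

```ruby
# Assumed shapes of the sub-rules, based on the JSON RFC number syntax.
INTEGER_PART = /-?(?:0|[1-9]\d*)/   # optional minus, no leading zeros
DECIMAL_PART = /\.\d+/              # a dot followed by at least one digit
EXPONENT     = /[eE][+-]?\d+/       # e or E, optional sign, digits

# integer_part decimal_part? exponent?, anchored to the whole string
NUMBER = /\A#{INTEGER_PART}(?:#{DECIMAL_PART})?(?:#{EXPONENT})?\z/

# Hypothetical helper: true when text is a complete JSON number.
def json_number?(text)
  NUMBER.match?(text)
end
```

With these definitions `json_number?('-12.5e+3')` is true, while `json_number?('01')`, `json_number?('1.')` and `json_number?('.5')` are all false.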

Strings

Railroad diagram for strings grammar

The rule for strings is pretty straightforward:

rule string
  quotation_mark (escaped_character / character)* quotation_mark
end

The definition of each of these sub-rules can be found in the strings grammar.
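
As an illustration of the shape of the string rule, here is a small plain-Ruby recognizer built on StringScanner. The three patterns are assumptions derived from the JSON RFC (they are not copied from the strings grammar), and `json_string?` is a hypothetical helper name.

```ruby
require 'strscan'

QUOTATION_MARK    = /"/
# A backslash followed by one of the escapes listed in the JSON RFC
ESCAPED_CHARACTER = /\\(?:["\\\/bfnrt]|u\h{4})/
# Anything except a quotation mark, a backslash or a control character
CHARACTER         = /[^"\\\x00-\x1F]/

# quotation_mark (escaped_character / character)* quotation_mark
def json_string?(text)
  scanner = StringScanner.new(text)
  return false unless scanner.scan(QUOTATION_MARK)
  loop do
    # Ordered choice: try the escape first, then a plain character
    break unless scanner.scan(ESCAPED_CHARACTER) || scanner.scan(CHARACTER)
  end
  return false unless scanner.scan(QUOTATION_MARK)
  scanner.eos?
end
```

Trying `escaped_character` before `character` does not change the result here, because the assumed `character` pattern never matches a backslash, but it keeps the sketch in the same order as the rule.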

Arrays

Railroad diagram for arrays grammar

In this case the rule gets a little bit uglier in order to handle the three possible cases: empty array, array with a single element, array with more than one element.

Notice the reference to the previous rule, value, which makes the array able to hold any of the types recognized by JSON grammar.

rule array
  open_square_bracket value? (comma value)* close_square_bracket
end

The sub-rules can be found in the arrays grammar.
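
The following plain-Ruby sketch mirrors the shape of the array rule with StringScanner, next to a variant that groups the first value with the repetition. It rests on assumptions: `comma` is taken to match a comma with optional surrounding whitespace, and `value` is reduced to literals, integers and nested arrays. All method names are hypothetical. Under those assumptions the rule as written also seems to accept a leading comma such as `[,1]`, because `value?` can match nothing before `(comma value)*` starts; the grouped variant rejects it.

```ruby
require 'strscan'

# Simplified stand-in for the value rule (literals, integers, nested arrays).
def value(scanner, array_rule)
  scanner.scan(/true|false|null|-?\d+/) || send(array_rule, scanner)
end

# (comma value)*: stops, and rewinds, at the first pair that does not match.
def comma_values(scanner, array_rule)
  loop do
    mark = scanner.pos
    matched = scanner.scan(/\s*,\s*/) && value(scanner, array_rule)
    unless matched
      scanner.pos = mark
      break
    end
  end
end

# open_square_bracket value? (comma value)* close_square_bracket
def array_as_written(scanner)
  start = scanner.pos
  if scanner.scan(/\[\s*/)
    value(scanner, :array_as_written)            # value? may match nothing
    comma_values(scanner, :array_as_written)
    return true if scanner.scan(/\s*\]/)
  end
  scanner.pos = start
  nil
end

# open_square_bracket (value (comma value)*)? close_square_bracket
def array_grouped(scanner)
  start = scanner.pos
  if scanner.scan(/\[\s*/)
    comma_values(scanner, :array_grouped) if value(scanner, :array_grouped)
    return true if scanner.scan(/\s*\]/)
  end
  scanner.pos = start
  nil
end

# True when the given rule matches the whole text.
def accepts?(rule, text)
  scanner = StringScanner.new(text)
  send(rule, scanner) ? scanner.eos? : false
end
```

Both variants accept `[]`, `[1]` and `[1, [true, null]]` and reject `[1,]`; they differ only on inputs that start with a comma.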

Objects

Railroad diagram for objects grammar

The object grammar rule is quite similar to the array rule:

rule object
  open_curly_brace (string colon value)? (comma string colon value)* close_curly_brace
end

The sub-rules can be found in the objects grammar.

The comma rule is defined in the arrays grammar.
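
To show how the pairs matched by the object rule could be collected into a Ruby Hash, here is a minimal StringScanner sketch. It is an illustration under simplifying assumptions, not Treeton's implementation: keys and string values contain no escapes, the only other values are integers, and `pair` and `parse_object` are hypothetical names.

```ruby
require 'strscan'

# string colon value: on success stores the pair in `into` and returns true;
# on failure rewinds the scanner and returns false.
def pair(scanner, into)
  mark = scanner.pos
  key = scanner.scan(/"[^"\\]*"/)
  if key && scanner.scan(/\s*:\s*/)
    raw = scanner.scan(/-?(?:0|[1-9]\d*)|"[^"\\]*"/)
    if raw
      # Strip the quotes from strings, convert everything else to Integer
      into[key[1..-2]] = raw.start_with?('"') ? raw[1..-2] : Integer(raw, 10)
      return true
    end
  end
  scanner.pos = mark
  false
end

# open_curly_brace (string colon value)? (comma string colon value)* close_curly_brace
# Returns a Hash, or nil when the text is not a complete object.
def parse_object(text)
  scanner = StringScanner.new(text)
  result = {}
  return nil unless scanner.scan(/\{\s*/)
  pair(scanner, result)                          # optional first pair
  loop do                                        # (comma pair)*
    mark = scanner.pos
    matched = scanner.scan(/\s*,\s*/) && pair(scanner, result)
    unless matched
      scanner.pos = mark
      break
    end
  end
  return nil unless scanner.scan(/\s*\}/)
  scanner.eos? ? result : nil
end
```

For example, `parse_object('{"a": 1, "b": "x"}')` returns `{"a" => 1, "b" => "x"}`, and `parse_object('{"a" 1}')` returns nil because the colon is missing.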

About

License:MIT License


Languages

Language:Ruby 100.0%