binast / binjs-ref

Reference implementation for the JavaScript Binary AST format

Home Page: https://binast.github.io/binjs-ref/binjs/index.html

In Context-0.2, try and simplify grammar walking order

Yoric opened this issue

The grammar walking order in Context 0.1 is a bit complicated: it combines two depth-first strategies, and implementing it properly seems to require two stacks. There is probably a simpler order that we could adopt (breadth-first, depth-first, or by need).

Are you referring to needing to traverse the models section differently than the AST?

The root AST decoder we wrote for the V8 implementation uses a fairly simple DFS traversal in IDL/field order, and shouldn't require two stacks.

Well, according to Dominic's document, it's definitely not a simple DFS.

I'll use the word "enqueue" because "push" is overloaded in both Dominic's document and the Python implementation.

  • When we visit an Interface, we enqueue all the fields that are not arrays (stack 1).
  • Whenever we encounter a field that is an array, we need to handle the length, then recurse immediately to the contents of the array (stack 2), before proceeding with stack 1.
  • Also, depending on the path/stack, the moment at which we check whether a table has already been visited is not the same.
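To make the three points above concrete, here is a minimal sketch of such a walk. It is only a toy model, not a transcription of Dominic's document or of the Python implementation: the grammar, the field kinds, the name `walk_order`, and the choice to treat stack 1 as a LIFO list are all assumptions made for illustration. Stack 2 is hidden in the Python call stack, and the "already visited" check happens at a different moment on each path.

```python
# Hypothetical toy grammar: interface -> [(field, kind, target)].
# kind is "scalar" (leaf value), "iface" (one interface) or
# "array" (a list of `target` interfaces).
GRAMMAR = {
    "Script": [("directives", "array", "Directive"), ("body", "iface", "Block")],
    "Directive": [("raw", "scalar", None)],
    "Block": [("statements", "array", "Statement"), ("scope", "scalar", None)],
    "Statement": [("expr", "scalar", None)],
}


def walk_order(grammar, root):
    """Return the order in which array lengths and scalar fields are reached."""
    order = []
    visited = {root}
    deferred = []  # "stack 1": non-array fields, handled later (LIFO here)

    def visit_interface(name):
        # The recursion below plays the role of "stack 2".
        for field, kind, target in grammar[name]:
            path = f"{name}.{field}"
            if kind == "array":
                # Handle the length, then recurse into the contents at once.
                order.append(path + ".length")
                if target not in visited:  # checked at recursion time
                    visited.add(target)
                    visit_interface(target)
            else:
                deferred.append((path, kind, target))

    visit_interface(root)
    while deferred:
        path, kind, target = deferred.pop()
        if kind == "scalar":
            order.append(path)
        elif target not in visited:  # checked only when the field is popped
            visited.add(target)
            visit_interface(target)
    return order


if __name__ == "__main__":
    for step in walk_order(GRAMMAR, "Script"):
        print(step)
```

Even on this tiny grammar the resulting order interleaves the two strategies, which is the complication being discussed.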

Did you find a way to merge stack 1 and stack 2? If my memory serves, in the Python implementation, stack 2 is hidden by the Python call stack, but that's not good practice in production code, at least for a browser, as it makes it hard to fail softly if memory is exhausted. Here, grammars are simple, so both stacks are always relatively small, but that's generally a risk that we want to avoid (and it might even get rejected in future versions of Firefox by the static checker).
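One way to avoid relying on the host call stack is to make both stacks explicit and put a cap on their combined size, so that the walk can report an error instead of overflowing. The sketch below does not merge the two stacks; it only shows the "fail softly" part. The toy grammar, `walk_bounded`, `WalkLimitExceeded`, and the `limit` parameter are hypothetical names, and the LIFO handling of stack 1 is an assumption.

```python
# Hypothetical toy grammar: interface -> [(field, kind, target)].
GRAMMAR = {
    "Script": [("directives", "array", "Directive"), ("body", "iface", "Block")],
    "Directive": [("raw", "scalar", None)],
    "Block": [("statements", "array", "Statement"), ("scope", "scalar", None)],
    "Statement": [("expr", "scalar", None)],
}


class WalkLimitExceeded(Exception):
    """Raised instead of overflowing when the stacks grow past the limit."""


def walk_bounded(grammar, root, limit=64):
    order = []
    visited = {root}
    deferred = []         # "stack 1": non-array fields, handled later
    frames = [[root, 0]]  # "stack 2": [interface, next field index]
    while frames or deferred:
        if frames:
            frame = frames[-1]
            name, index = frame
            fields = grammar[name]
            if index == len(fields):
                frames.pop()  # this interface is fully scanned
                continue
            frame[1] = index + 1
            field, kind, target = fields[index]
            path = f"{name}.{field}"
            if kind == "array":
                order.append(path + ".length")
                if target not in visited:
                    visited.add(target)
                    frames.append([target, 0])
            else:
                deferred.append((path, kind, target))
        else:
            path, kind, target = deferred.pop()
            if kind == "scalar":
                order.append(path)
            elif target not in visited:
                visited.add(target)
                frames.append([target, 0])
        # Soft failure: the caller can catch this and reject the file.
        if len(frames) + len(deferred) > limit:
            raise WalkLimitExceeded(f"more than {limit} pending entries")
    return order
```

A caller would wrap `walk_bounded(GRAMMAR, "Script")` in a `try`/`except WalkLimitExceeded` and treat the exception as a decoding error rather than a crash.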

Ideally, I'd prefer a way to load the tables without needing to walk the grammar itself, as this is some fairly heavy machinery.

In previous versions of the format, we had a convention that tables were written in the order in which they were needed for reading. This makes reading the prelude much simpler (and theoretically easier to parallelize with reading the contents), but doesn't scale well towards laziness.
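As a rough illustration of that older convention, the sketch below writes tables in the order in which they are first needed, so the reader is a plain sequential loop with no grammar walk. The table names, payloads, and the functions `write_prelude` and `read_prelude` are hypothetical; the real prelude encoding is not shown here.

```python
def write_prelude(accesses, tables):
    """Emit (name, payload) pairs in first-use order.

    `accesses` is the sequence of table names in the order the reader
    will need them; repeated names are written only once.
    """
    first_use = list(dict.fromkeys(accesses))  # dedupe, keep first occurrence
    return [(name, tables[name]) for name in first_use]


def read_prelude(prelude):
    """Load every table by reading the prelude front to back."""
    loaded = {}
    for name, payload in prelude:
        loaded[name] = payload
    return loaded


if __name__ == "__main__":
    tables = {"A": b"\x01", "B": b"\x02", "C": b"\x03"}
    prelude = write_prelude(["A", "B", "A", "C"], tables)
    print([name for name, _ in prelude])
    print(read_prelude(prelude) == tables)
```

The downside mentioned above shows up in `read_prelude`: to reach a table that is needed late, the reader has to go through every earlier entry first, which works against lazy loading.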