yeslogic / fathom

🚧 (Alpha stage software) A declarative data definition language for formally specifying binary data formats. 🚧

Let formats

brendanzab opened this issue

In the OpenType format we sometimes have to use records as a way to store intermediate values during parsing. For example:

htmx <- required_table "hmtx" {
    hhea <- deref _ hhea.link,
    maxp <- deref _ maxp.link,
    table <- htmx_table
        hhea.number_of_long_horizontal_metrics
        maxp.num_glyphs,
},

This is not ideal, as it results in a copy of hhea and maxp appearing in the resulting data structures after parsing. For example:

4752 = [
    {
        hhea = {
            major_version = 1,
            ⋮
            number_of_long_horizontal_metrics = 1,
        },
        maxp = { version = 20480, num_glyphs = 100 },
        table = {
            h_metrics = [ { advance_width = 1500, left_side_bearing = 300 } ],
        },
    },
]

It would be nice, however, to be able to write something like:

htmx <- required_table "hmtx" (
    let hhea <- deref _ hhea.link;
    let maxp <- deref _ maxp.link;
    htmx_table
        hhea.number_of_long_horizontal_metrics
        maxp.num_glyphs,
),

This would employ a new format of the form let x <- f₁; f₂, which parses f₁, adds the result to the environment as x, and then parses f₂ in that extended environment. This would result in a data structure that looks something like:

4752 = [
    {
        h_metrics = [ { advance_width = 1500, left_side_bearing = 300 } ],
    },
]
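
To make the difference between the two results concrete, here is a minimal Python sketch of the idea. It is not Fathom code: the combinators `u16be`, `array_of`, `record` and `let` are hypothetical stand-ins, and the `deref` lookups from the example are replaced by a count stored inline before the array, purely to keep the sketch short.

```python
import struct

# Each "format" here is a function (data, pos, env) -> (value, new_pos).
# All names below are hypothetical and only illustrate the proposal.

def u16be(data, pos, env):
    """Read one big-endian unsigned 16-bit integer."""
    (value,) = struct.unpack_from(">H", data, pos)
    return value, pos + 2

def array_of(count_name, elem):
    """Read `env[count_name]` consecutive elements."""
    def run(data, pos, env):
        items = []
        for _ in range(env[count_name]):
            item, pos = elem(data, pos, env)
            items.append(item)
        return items, pos
    return run

def record(*fields):
    """Parse fields in order; every field is kept in the result."""
    def run(data, pos, env):
        result = {}
        for name, fmt in fields:
            # Later fields can see the values of earlier fields.
            value, pos = fmt(data, pos, {**env, **result})
            result[name] = value
        return result, pos
    return run

def let(name, bound, body):
    """Parse `bound`, bind it in the environment, return only `body`."""
    def run(data, pos, env):
        value, pos = bound(data, pos, env)
        return body(data, pos, {**env, name: value})
    return run

# A count of 2 followed by two values.
data = struct.pack(">HHH", 2, 10, 20)

as_record = record(("count", u16be), ("items", array_of("count", u16be)))(data, 0, {})
as_let = let("count", u16be, array_of("count", u16be))(data, 0, {})

print(as_record)  # the intermediate `count` is part of the parsed value
print(as_let)     # only the array survives
```

Both versions consume the same bytes; the only difference is whether the intermediate value ends up in the parsed output.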

This also has the tantalizing possibility of being further improved to look something like:

htmx <- required_table "hmtx" (
    htmx_table
        (deref _ hhea.link).number_of_long_horizontal_metrics
        (deref _ maxp.link).num_glyphs,
),

Rough Specification

The typing and parsing rules for this format could look something like:

  fโ‚ : Format     x : Repr fโ‚ โŠข fโ‚‚ : Format
โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
           let x <- fโ‚; fโ‚‚ : Format


  s .. s' : fโ‚ โŸน eโ‚   x : Repr f = eโ‚ โŠข s' .. s'' : fโ‚ โŸน eโ‚‚
โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
          s .. s'' : (let x <- fโ‚; fโ‚‚) โŸน eโ‚‚
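
As a rough illustration of the typing rule, the following Python sketch checks a tiny, made-up fragment of a format language. The AST classes (`U16Be`, `Array`, `Let`), the string-valued host types and the helper names are all assumptions made for this sketch, not Fathom's actual elaborator. The `Let` case mirrors the rule: check f₁ in the current context, then check f₂ with `x : Repr f₁` added.

```python
from dataclasses import dataclass

# Hypothetical AST for a tiny fragment of a format language.
@dataclass(frozen=True)
class U16Be:
    pass

@dataclass(frozen=True)
class Array:
    length_var: str   # name of a variable holding the element count
    elem: object

@dataclass(frozen=True)
class Let:
    name: str
    bound: object     # f1
    body: object      # f2, may mention `name`

def repr_of(fmt):
    """Host type of a format, written as a string for simplicity."""
    if isinstance(fmt, U16Be):
        return "U16"
    if isinstance(fmt, Array):
        return f"Array ({repr_of(fmt.elem)})"
    # The host type of a let format is deliberately left undefined here.
    raise NotImplementedError("Repr of a let format is left open")

def check_format(ctx, fmt):
    """Raise TypeError unless `fmt` is a well-formed format under `ctx`."""
    if isinstance(fmt, U16Be):
        return
    if isinstance(fmt, Array):
        if ctx.get(fmt.length_var) != "U16":
            raise TypeError(f"{fmt.length_var!r} is not a U16 in scope")
        check_format(ctx, fmt.elem)
        return
    if isinstance(fmt, Let):
        check_format(ctx, fmt.bound)                      # f1 : Format
        extended = {**ctx, fmt.name: repr_of(fmt.bound)}  # x : Repr f1
        check_format(extended, fmt.body)                  # ... |- f2 : Format
        return
    raise TypeError(f"not a format: {fmt!r}")

def is_well_formed(fmt):
    try:
        check_format({}, fmt)
        return True
    except TypeError:
        return False

with_let = is_well_formed(Let("n", U16Be(), Array("n", U16Be())))
without_let = is_well_formed(Array("n", U16Be()))
print(with_let, without_let)
```

The array is only accepted when a surrounding `Let` has put its length variable in scope, which is the role of the second premise.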

Alas, difficulty arises when attempting to define a host representation for this format.

Repr (let x <- fโ‚; fโ‚‚ x) = Repr (fโ‚‚ x)
                                    ^ where is `x : Repr fโ‚` bound?
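
One way to see the scoping problem is to compute the host type symbolically and track which variables it mentions. The Python sketch below does this for a made-up format fragment; it assumes, purely for illustration, that an array's host type mentions its length, so that the body's representation can depend on `x`. The classes and function names are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical AST for a tiny fragment of a format language.
@dataclass(frozen=True)
class U16Be:
    pass

@dataclass(frozen=True)
class Array:
    length_var: str
    elem: object

@dataclass(frozen=True)
class Let:
    name: str
    bound: object
    body: object

def naive_repr(fmt):
    """Host type as text, using Repr (let x <- f1; f2) = Repr f2."""
    if isinstance(fmt, U16Be):
        return "U16"
    if isinstance(fmt, Array):
        # Assumption: the host type of an array mentions its length.
        return f"Array {fmt.length_var} ({naive_repr(fmt.elem)})"
    if isinstance(fmt, Let):
        return naive_repr(fmt.body)  # the binder for `name` is dropped
    raise TypeError(f"not a format: {fmt!r}")

def format_free_vars(fmt):
    """Variables the format uses without binding them."""
    if isinstance(fmt, U16Be):
        return set()
    if isinstance(fmt, Array):
        return {fmt.length_var} | format_free_vars(fmt.elem)
    if isinstance(fmt, Let):
        return format_free_vars(fmt.bound) | (format_free_vars(fmt.body) - {fmt.name})
    raise TypeError(f"not a format: {fmt!r}")

def repr_free_vars(fmt):
    """Variables the naive host type mentions without binding them."""
    if isinstance(fmt, U16Be):
        return set()
    if isinstance(fmt, Array):
        return {fmt.length_var} | repr_free_vars(fmt.elem)
    if isinstance(fmt, Let):
        return repr_free_vars(fmt.body)  # nothing removes `name` here
    raise TypeError(f"not a format: {fmt!r}")

fmt = Let("x", U16Be(), Array("x", U16Be()))
host_type = naive_repr(fmt)
print(host_type)
print(format_free_vars(fmt), repr_free_vars(fmt))
```

The format itself is closed, but the naively computed host type still mentions `x`, which no longer has a binder.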

I'm not sure yet how to resolve this.

I also suspect that this could make implementing a dual binary semantics more challenging, but we’ll likely also struggle with this for link and deref types.

Fixed by #371?

No, this is a bit different… here we bind the result of some parsed data but then don’t add a corresponding field in the representation type. I’m not sure if that really makes sense or is a good idea or not! The computed fields in #371 allow you to add a constant in the middle of a record format and do show up in the representation type. I realise this is confusing though (and the confusion might show there’s an issue with the design).