yeslogic / fathom

🚧 (Alpha stage software) A declarative data definition language for formally specifying binary data formats. 🚧

Conditional formats

brendanzab opened this issue

It would be useful to have format descriptions that refine other formats with predicates:

{ x <- f | e }

For example, in OpenType we might see something like:

{ sfnt_version <- u32be | sfnt_version == 0x00010000 || sfnt_version == "OTTO" }

Rough specification

The typing rules for the core language could look something like:

  f : Format     x : Repr Format ⊢ e : Bool
──────────────────────────────────────────────
           { x <- f | e } : Format


Repr { x <- f | e } = Repr f


  s .. s' : f ⟹ e₁   x : Repr f = e₁ ⊢ e₂ = true : Bool
────────────────────────────────────────────────────────────
          s .. s' : { x <- f | e₂ } ⟹ e₁

  • { x <- f | e } is a Format when:
    • f is a Format
    • assuming x : Repr Format, e is a Bool
  • the representation of { x <- f | e } is Repr f
  • the bit sequence s .. s' is recognized with { x <- f | e₂ } as an expression e₁ when:
    • the bit sequence s .. s' is recognized with f as an expression e₁
    • assuming x : Repr f = e₁, e₂ is the same Bool as true
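
As a rough illustration of the recognition rule, here is a minimal Rust sketch of how a parser might treat `{ x <- f | e }`: run `f` first, then keep the result only if the predicate holds on it. This is not Fathom's implementation. The names `ParseResult`, `u32be`, `cond`, `sfnt_version` and `OTTO` are hypothetical, the sketch works on bytes rather than bits, and it assumes that "OTTO" stands for the four ASCII bytes read as a big-endian `u32`.

```rust
/// Hypothetical result type: the parsed value plus the remaining input.
type ParseResult<'a, T> = Option<(T, &'a [u8])>;

/// Stand-in for the `u32be` format: read a big-endian 32-bit unsigned integer.
fn u32be(input: &[u8]) -> ParseResult<'_, u32> {
    if input.len() < 4 {
        return None;
    }
    let (head, rest) = input.split_at(4);
    Some((u32::from_be_bytes([head[0], head[1], head[2], head[3]]), rest))
}

/// `{ x <- f | e }`: recognize the input with `f`, then require `e` to hold
/// for the parsed value. The result type is the same as for `f` alone,
/// mirroring `Repr { x <- f | e } = Repr f`.
fn cond<'a, T>(
    input: &'a [u8],
    f: impl Fn(&'a [u8]) -> ParseResult<'a, T>,
    pred: impl Fn(&T) -> bool,
) -> ParseResult<'a, T> {
    let (value, rest) = f(input)?;
    if pred(&value) {
        Some((value, rest))
    } else {
        None
    }
}

/// Assumption: the tag "OTTO" is compared as a big-endian `u32`.
const OTTO: u32 = u32::from_be_bytes(*b"OTTO");

/// `{ sfnt_version <- u32be | sfnt_version == 0x00010000 || sfnt_version == "OTTO" }`
fn sfnt_version(input: &[u8]) -> ParseResult<'_, u32> {
    cond(input, u32be, |version| *version == 0x0001_0000 || *version == OTTO)
}
```

With this sketch, `sfnt_version(b"OTTOrest")` succeeds and leaves `b"rest"` unconsumed, while an input starting with any other four bytes is rejected even though `u32be` alone would accept it.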

Elaborated typing rules

Some might find it easier to read these rules with explicit typing contexts, Γ:

  Γ ⊢ f : Format     Γ, x : Repr Format ⊢ e : Bool
────────────────────────────────────────────────────
           Γ ⊢ { x <- f | e } : Format
  • under the context Γ, { x <- f | e } is a Format when:
    • under the context Γ, f is a Format
    • under the context Γ, x : Repr Format, e is a Bool
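
To make the role of the context concrete, here is a small Rust sketch of a type checker for this one rule. It is only an illustration under stated assumptions: `Ty`, `Term`, `Context`, `repr`, `synth` and `sfnt_version_format` are hypothetical names, the language is cut down to one primitive format and a few expressions, and the sketch binds `x` to the representation of `f` (that is, it reads `Repr Format` in the premise as `Repr f`).

```rust
/// Types of the cut-down core language in this sketch.
#[derive(Debug, Clone, PartialEq)]
enum Ty {
    Format,
    Bool,
    U32,
}

/// Terms: one primitive format, a few expressions, and conditional formats.
#[derive(Debug, Clone)]
enum Term {
    U32Be,
    Var(&'static str),
    Lit(u32),
    Eq(Box<Term>, Box<Term>),
    Or(Box<Term>, Box<Term>),
    /// `{ x <- f | e }`
    Cond(&'static str, Box<Term>, Box<Term>),
}

/// The typing context: a list of bindings, innermost last.
type Context = Vec<(&'static str, Ty)>;

/// `Repr f`: the type of values produced by parsing the format `f`.
fn repr(format: &Term) -> Option<Ty> {
    match format {
        Term::U32Be => Some(Ty::U32),
        // Repr { x <- f | e } = Repr f
        Term::Cond(_, f, _) => repr(f),
        _ => None,
    }
}

/// Synthesize the type of `term` under `ctx`, or `None` if it is ill-typed.
fn synth(ctx: &Context, term: &Term) -> Option<Ty> {
    match term {
        Term::U32Be => Some(Ty::Format),
        Term::Lit(_) => Some(Ty::U32),
        // Look the variable up, preferring the innermost binding.
        Term::Var(x) => ctx.iter().rev().find(|(y, _)| y == x).map(|(_, ty)| ty.clone()),
        Term::Eq(a, b) => match (synth(ctx, a)?, synth(ctx, b)?) {
            (Ty::U32, Ty::U32) => Some(Ty::Bool),
            _ => None,
        },
        Term::Or(a, b) => match (synth(ctx, a)?, synth(ctx, b)?) {
            (Ty::Bool, Ty::Bool) => Some(Ty::Bool),
            _ => None,
        },
        Term::Cond(x, f, e) => {
            // First premise: under the context, f is a Format.
            if synth(ctx, f)? != Ty::Format {
                return None;
            }
            // Second premise: under the context extended with x, e is a Bool.
            let mut extended = ctx.clone();
            extended.push((*x, repr(f)?));
            if synth(&extended, e)? != Ty::Bool {
                return None;
            }
            Some(Ty::Format)
        }
    }
}

/// The OpenType example, with "OTTO" written as its big-endian `u32` value.
fn sfnt_version_format() -> Term {
    let x = || Box::new(Term::Var("sfnt_version"));
    let eq = |n| Box::new(Term::Eq(x(), Box::new(Term::Lit(n))));
    Term::Cond(
        "sfnt_version",
        Box::new(Term::U32Be),
        Box::new(Term::Or(eq(0x0001_0000), eq(0x4F54_544F))),
    )
}
```

Checking `sfnt_version_format()` under an empty context yields `Some(Ty::Format)`. The variable `sfnt_version` is only in scope while the predicate is checked, so the same `Var` on its own is rejected.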

Note: The binary interpretation uses notation similar to the one Mark Brown used in the Prolog prototype.

Naming ideas

Some alternative names for these format descriptions could be:

  • conditional formats
  • refinement formats
  • guard formats

Future extensions

Predicate preservation

Eventually we could preserve the guard condition in the representation types, using refinement types:

-   f : Format     x : Repr Format ⊢ e : Bool
+   f : Format     x : Repr Format ⊢ e : Prop
  ──────────────────────────────────────────────
             { x <- f | e } : Format


- Repr { x <- f | e } = Repr f
+ Repr { x <- f | e } = { x : Repr f | e }
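
Rust has no refinement types, but the idea of carrying the guard in the representation can be approximated with a wrapper type whose only constructor checks the predicate. The following is a sketch under that assumption, not a proposal for the generated code. The module `refined`, the type `SfntVersion`, and the helper `is_otto` are hypothetical names.

```rust
mod refined {
    /// Approximates `{ x : u32 | x == 0x00010000 || x == "OTTO" }`.
    /// The field is private, so values can only be built through `new`.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub struct SfntVersion(u32);

    impl SfntVersion {
        pub const TRUE_TYPE: u32 = 0x0001_0000;
        /// Assumption: "OTTO" is the four ASCII bytes read as a big-endian `u32`.
        pub const OTTO: u32 = 0x4F54_544F;

        /// Check the guard once, at construction time.
        pub fn new(raw: u32) -> Option<Self> {
            if raw == Self::TRUE_TYPE || raw == Self::OTTO {
                Some(SfntVersion(raw))
            } else {
                None
            }
        }

        /// Forget the refinement and return the underlying `Repr f` value.
        pub fn get(self) -> u32 {
            self.0
        }
    }
}

use refined::SfntVersion;

/// Code that receives an `SfntVersion` can rely on the guard having held,
/// so only two values are possible here.
fn is_otto(version: SfntVersion) -> bool {
    version.get() == SfntVersion::OTTO
}
```

Unlike a real refinement type, the predicate here lives in the constructor rather than in the type itself, so the compiler cannot reason about it. It only guarantees that every `SfntVersion` went through the check.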

Syntactic Sugar

We could eventually add some sugar for record formats:

{
    sfnt_version <- u32be == 0x00010000 || "OTTO",
    ...
}

Or perhaps:

{
    sfnt_version <- u32be if sfnt_version == 0x00010000 || sfnt_version == "OTTO",
    ...
}