untitaker / html5gum

A WHATWG-compliant HTML5 tokenizer and tag soup parser

Audit code for potential panics

untitaker opened this issue

@lebensterben pointed out in #32 that it's currently hard to understand why certain potential panics (supposedly) don't occur in practice.

We should

  • start documenting the relevant invariants in code comments
  • write more explicit assertion messages for when those invariants are violated (either by adding more debug_asserts on top or by some other means)
  • statically enforce the above (this is probably impossible)
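As a rough illustration of the first two points, here is a minimal sketch of an invariant documented in a comment and backed by assertions with explicit messages. The Emitter type and its methods are simplified, hypothetical stand-ins and are not meant to mirror html5gum's actual code.

```rust
/// Hypothetical, simplified emitter state (an assumption for illustration,
/// not html5gum's real type).
#[derive(Debug, Default)]
pub struct Emitter {
    /// INVARIANT: `Some` from `init_start_tag` until `emit_current_tag`,
    /// `None` at all other times. The state machine is expected to uphold this.
    current_tag: Option<String>,
    emitted: Vec<String>,
}

impl Emitter {
    /// Starts a new tag. Must not be called while another tag is pending.
    pub fn init_start_tag(&mut self) {
        // Checked in debug builds only; compiled out in release builds.
        debug_assert!(
            self.current_tag.is_none(),
            "init_start_tag called while tag {:?} is still pending",
            self.current_tag
        );
        self.current_tag = Some(String::new());
    }

    /// Appends to the pending tag's name.
    pub fn push_tag_name(&mut self, s: &str) {
        // Relies on the invariant above; the message says which call order
        // was violated instead of a bare `unwrap` panic.
        self.current_tag
            .as_mut()
            .expect("push_tag_name called without a pending tag; init_start_tag must run first")
            .push_str(s);
    }

    /// Finishes the pending tag and records it.
    pub fn emit_current_tag(&mut self) {
        let tag = self
            .current_tag
            .take()
            .expect("emit_current_tag called without a pending tag; init_start_tag must run first");
        self.emitted.push(tag);
    }

    /// Tags emitted so far, in order.
    pub fn emitted(&self) -> &[String] {
        &self.emitted
    }
}
```

The expect messages still panic if the invariant is broken, but the panic then names the violated assumption, which is the point of the second bullet.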

Labelling this as documentation, as I'm not aware of any actual panics being hit in practice.

Additionally, we can investigate how usages of ArrayVec or other panicking utilities can be avoided entirely.

For example, with enough type trickery we could statically ensure that each state transition produces at most CAP errors for some const CAP (that is, bound how many times set_character_error is called per invocation of consume). That would give us a statically verifiable upper bound on how large the array must be. (This upper bound does exist in practice, but it is unknown to the Rust compiler.)

My gut feeling is that it would be too much effort, though, and that the fuzzer should already find most problems.
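To make the static-bound idea a bit more concrete, one possible shape is sketched below: each transition returns its errors in a fixed-size array, so the return type itself caps the number of errors per step. Everything here (Error, StepErrors, this consume signature, and the choice of CAP = 2) is hypothetical and only illustrates the technique; it is not how html5gum is structured.

```rust
/// Hypothetical parse error kinds (assumed for illustration).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Error {
    UnexpectedNull,
    EofInTag,
}

/// Errors produced by one state transition.
/// The array length is part of the type, so a transition cannot
/// report more than `CAP` errors; no runtime capacity check is needed.
#[derive(Debug)]
pub struct StepErrors<const CAP: usize> {
    slots: [Option<Error>; CAP],
}

impl<const CAP: usize> StepErrors<CAP> {
    /// A transition that reported no errors.
    pub fn none() -> Self {
        Self { slots: [None; CAP] }
    }

    /// Iterates over the errors that were actually reported.
    pub fn iter(&self) -> impl Iterator<Item = Error> + '_ {
        self.slots.iter().flatten().copied()
    }
}

impl StepErrors<2> {
    /// Exactly one error.
    pub fn one(e: Error) -> Self {
        Self { slots: [Some(e), None] }
    }

    /// Exactly two errors, the maximum for this capacity.
    pub fn two(a: Error, b: Error) -> Self {
        Self { slots: [Some(a), Some(b)] }
    }
}

/// Hypothetical single-step transition: `None` stands for end of input.
/// The signature promises at most 2 errors per call.
pub fn consume(byte: Option<u8>) -> StepErrors<2> {
    match byte {
        Some(0) => StepErrors::one(Error::UnexpectedNull),
        None => StepErrors::one(Error::EofInTag),
        Some(_) => StepErrors::none(),
    }
}
```

With this shape the caller can copy the returned errors into a buffer sized from CAP without any fallible push. The cost is that every transition has to be rewritten to return its errors instead of calling a setter, which is roughly the effort concern raised above.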