rust-fuzz / arbitrary

Generating structured data from arbitrary, unstructured input.

Home Page:https://docs.rs/arbitrary/

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

`size_hint` computation time blows up for "wide" recursive enums and structs

sgued opened this issue · comments

commented

Calling size_hint on a "wide" recursive struct or enum is extremely slow. Examples:

#[derive(Debug, Arbitrary)]
enum Enum {
    None,
    A(Box<Enum>),
    B(Box<Enum>),
    C(Box<Enum>),
    D(Box<Enum>),
    E(Box<Enum>),
    F(Box<Enum>),
    G(Box<Enum>),
    H(Box<Enum>),
    I(Box<Enum>),
}


#[derive(Debug, Arbitrary)]
struct Struct {
    a: Box<Struct>,
    b: Box<Struct>,
    c: Box<Struct>,
    d: Box<Struct>,
    e: Box<Struct>,
    f: Box<Struct>,
    g: Box<Struct>,
    h: Box<Struct>,
    i: Box<Struct>,
}

This is because even though the depth parameter prevents this from going into an infinite loop, the derive implementation always calls size_hint once per field member even if they are of the same type. That makes the complexity close to $witdth^{recursion_depth}$, which here corresponds to $9^{20}$

Ideally the solution would be to somehow memoize the results of the calls to size_hint for each type but I don't see how to do that without having to add Any bounds everywhere. I tried implementing an unintrusive way to bail out when recursion is detected in https://github.com/sgued/arbitrary/tree/recursive-size-hint but it still doesn't solve every problem.

I think the most simple solution would be to make size_hint fallible and return an error in case of recursion. This would make it possible to bail out early for all implementations. It would still be easily possible to get the always correct (0, None) value by using unwrap_or_default. This would however be a breaking change.

Another solution I can think of would be to provide a way to override the size_hint implementation by something that always return (0, None)