silentbicycle / theft

property-based testing for C: generate input to find obscure bugs, then reduce to minimal failing input

Configuring theft to run shrinking to completion and then stop

DRMacIver opened this issue

I'm currently trying to evaluate different approaches to shrinking for a paper, and in order to do this I'd like to be able to configure theft so that it runs until it finds a failure, shrinks that failure to completion, prints it, and then stops.

I can't seem to figure out how to do this, and I'm not sure whether this is because it's currently not possible, possible but undocumented, or I'm just failing at reading the docs (and the source).

As far as I can tell from reading the source, the theft logic currently is that shrinking is counted as part of a trial (the glossary suggests that this is not the case, but the source suggests that it is), and that theft will run N trials and run shrinking on each of the subset of these that fail. That suggests that the post-trial hook would be the place to configure this if one existed, but as far as I can tell none of the return codes from such a callback mean "don't run any more trials".

You're right, it's not clear. The most direct way to do it, currently, is to use the trial_pre hook to return HALT if info->failures is greater than 0. It's framed as "should we start another trial, based on results thus far?", rather than "stop after this trial". Really, it should be valid in either place.
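A minimal sketch of that pre-trial check, for illustration only. To stay self-contained it does not include theft.h: the `trial_pre_info` struct, the `trial_pre_res` enum, and the toy `run_trials` loop are hypothetical stand-ins modeled on the description above (a hook that receives an `info` with a `failures` count and can return a halt code). The real type and constant names depend on the theft version, so check theft.h before adapting it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for theft's pre-trial hook types (assumption). */
enum trial_pre_res { TRIAL_PRE_CONTINUE, TRIAL_PRE_HALT };

struct trial_pre_info {
    size_t trial_id; /* index of the trial about to start */
    size_t failures; /* failing trials seen so far */
};

typedef enum trial_pre_res trial_pre_cb(const struct trial_pre_info *info,
                                        void *env);
typedef bool property_fn(size_t trial_id);

/* The hook: "should we start another trial, based on results thus far?" */
enum trial_pre_res halt_after_first_failure(const struct trial_pre_info *info,
                                            void *env) {
    (void)env; /* unused in this sketch */
    return info->failures > 0 ? TRIAL_PRE_HALT : TRIAL_PRE_CONTINUE;
}

/* Toy runner standing in for the library's trial loop (assumption).
 * Returns how many trials were actually started. */
size_t run_trials(size_t max_trials, property_fn *prop,
                  trial_pre_cb *pre, void *env) {
    assert(prop != NULL && pre != NULL);
    struct trial_pre_info info = { 0, 0 };
    size_t started = 0;
    for (size_t i = 0; i < max_trials; i++) {
        info.trial_id = i;
        if (pre(&info, env) == TRIAL_PRE_HALT) {
            break; /* the hook asked for no further trials */
        }
        started++;
        if (!prop(i)) {
            info.failures++; /* a real runner would shrink the input here */
        }
    }
    return started;
}

/* Example properties: one fails on the third trial, one never fails. */
bool fails_on_third_trial(size_t trial_id) { return trial_id != 2; }
bool always_passes(size_t trial_id) { (void)trial_id; return true; }
```

With these stand-ins, `run_trials(100, fails_on_third_trial, halt_after_first_failure, NULL)` starts three trials: the failing trial completes (including whatever shrinking the runner does), and the hook halts the run before a fourth trial begins.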

More generally, though, the hook interface has been unclear / insufficiently documented because it has changed a bunch release to release. In the upcoming release, it's changing again, mostly because of architectural considerations for supporting multicore searching and shrinking -- for example, hooks will note whether they're running on the main/supervisor process or a child/worker; in either case, they have distinct address spaces, with copy-on-write, so flags set from a worker won't be visible to the supervisor's hooks.
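The address-space point can be seen with plain POSIX `fork`, independent of theft. The sketch below is not theft's implementation (the thread does not describe the upcoming release's mechanism); the flag and function names are hypothetical. It shows a flag set in a forked worker staying unset in the supervisor, with the worker's exit status used as one possible channel for reporting back.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* A flag a hook might set on failure (hypothetical name). */
static bool saw_failure = false;

/* Forks a worker that sets the flag, then returns the supervisor's view of
 * it: 0 if still unset, 1 if set, -1 on error. If worker_exit is non-NULL,
 * it receives the worker's exit status. */
int supervisor_view_after_worker_sets_flag(int *worker_exit) {
    saw_failure = false;

    pid_t pid = fork();
    if (pid < 0) {
        return -1; /* fork failed */
    }
    if (pid == 0) {
        /* Worker: this write lands in the child's private copy of the page. */
        saw_failure = true;
        _exit(saw_failure ? 1 : 0); /* report the result via exit status */
    }

    /* Supervisor: wait for the worker, then look at our own copy. */
    int status = 0;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status)) {
        return -1;
    }
    if (worker_exit != NULL) {
        *worker_exit = WEXITSTATUS(status);
    }
    return saw_failure ? 1 : 0;
}
```

The supervisor's copy of `saw_failure` is untouched by the worker's write, so any state a worker-side hook needs to share has to travel over an explicit channel such as an exit status, a pipe, or shared memory.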

There are several things that can currently be done via hooks (such as adjusting a logging level or ignoring known issues during shrinking) that will become new configuration options. The hook interface has design tension because I want it to be open-ended enough to experiment with new features, but many common cases really belong in the main configuration or as part of the API. The hooks also don't compose well currently (see issue #17), for somewhat C-specific reasons that should be addressed soon.

Once the restructuring for multicore is complete, cleaning up the hooks interface (including documenting it better) will be the next highest priority.

> It's framed as "should we start another trial, based on results thus far?", rather than "stop after this trial". Really, it should be valid in either place.

Ah, that makes sense, thanks! But yeah, I didn't think to look there and hadn't realised that the available options would be different.

> More generally, though, the hook interface has been unclear / insufficiently documented

Yeah, I've generally found theft relatively easy to figure out (although this required a fair bit of reading the source), but the hook interface has been a bit of a struggle. Once it's stabilised a bit more, perhaps adding some recipes to the documentation would be helpful?

Also, based on your answer, I do think the glossary entry for "trial" is wrong: it definitely reads to me as if it corresponds to a single call to the test function, when in reality it is many calls to the test function when you include shrinking. I think this is a large part of why I struggled with the hooks.
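To make the "one trial, many calls" point concrete, here is a toy model that counts calls to the test function inside a single trial. It is a sketch under stated assumptions, not theft's shrinking algorithm: the `run_one_trial` function, the `trial_result` struct, and the halve-or-decrement shrinker are all hypothetical and exist only to show the accounting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct trial_result {
    bool failed;      /* did the original input fail the property? */
    unsigned minimal; /* smallest failing input found (if failed) */
    size_t calls;     /* calls to the test function within this trial */
};

/* Example property: holds only for values below 10. */
bool prop_below_ten(unsigned n) { return n < 10; }

/* One trial: call the property once on the generated input and, if it
 * fails, keep calling it on smaller candidates until none of them fail. */
struct trial_result run_one_trial(unsigned input, bool (*prop)(unsigned)) {
    assert(prop != NULL);
    struct trial_result r = { false, input, 0 };

    r.calls++;
    if (prop(input)) {
        return r; /* passing trial: exactly one call */
    }
    r.failed = true;

    bool progress = true;
    while (progress && r.minimal > 0) {
        progress = false;
        /* Toy shrink candidates: half the value, then one less. */
        unsigned candidates[2] = { r.minimal / 2, r.minimal - 1 };
        for (size_t i = 0; i < 2; i++) {
            r.calls++;
            if (!prop(candidates[i])) {
                r.minimal = candidates[i]; /* still fails: keep shrinking */
                progress = true;
                break;
            }
        }
    }
    return r;
}
```

For a passing input such as 3, the trial makes a single call. For a failing input such as 100, the same trial ends at the minimal failing value 10 only after several further calls, which is why per-trial hooks see shrinking as part of the trial rather than as a separate step.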