google / fuzzbench

FuzzBench - Fuzzer benchmarking as a service.

Home Page: https://google.github.io/fuzzbench/

corpus archive changes

vanhauser-thc opened this issue · comments

I just noticed that the corpus archive was changed and only contains the corpus and crashes, not other information files.

While I understand that non-corpus/non-crash data is not useful for gathering coverage, removing it takes away our ability to analyze in depth what happened during a fuzzer run, e.g. via default/fuzzer_stats or the introspection features we can activate.

This prevents us from actually learning what works and what does not when developing the fuzzer: because of CPU and randomness fluctuations, a coverage difference of something like 0.3% (or higher if only two very similar variants are run) is common, so coverage numbers alone often cannot tell us whether a change helped.

Can this change please be reverted, or an alternative found? Otherwise FuzzBench is much less helpful to us :(

@jonathanmetzman

We definitely need the change I made at some point; storing the same file 90 times was grossly inefficient. But I want to get this working for you.
Intuitively, I don't understand why the stats file is no longer included, because the new approach is to archive only modified files (https://github.com/google/fuzzbench/blob/master/experiment/runner.py#L390), even if the name is the same. Do you know why the stats file might not be considered "modified"? Maybe we are doing something wrong, or AFL++ is doing something funky.

Actually, I think a different feature is at fault here: https://github.com/google/fuzzbench/blame/master/experiment/runner.py#L51 but I'm not sure it's new. Maybe I was using it differently in the past (e.g. only to decide whether the corpus was "unchanged") while still including the file in the zip, and now I no longer do that?
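
To make the two mechanisms discussed here concrete, below is a minimal sketch of "archive only modified files" combined with a name-based exclusion list. It is not FuzzBench's actual runner.py code: `snapshot`, `changed_files`, `archive`, and the contents of `EXCLUDED_NAMES` are all hypothetical names chosen for illustration, and change detection by modification time and size is an assumption.

```python
import os
import tarfile
import tempfile

# Hypothetical exclusion list for illustration only; the real list in
# runner.py may contain different names.
EXCLUDED_NAMES = {'fuzzer_stats'}


def snapshot(directory):
    """Map each file's relative path to its (mtime_ns, size) pair."""
    state = {}
    for root, _, names in os.walk(directory):
        for name in names:
            path = os.path.join(root, name)
            stat = os.stat(path)
            rel_path = os.path.relpath(path, directory)
            state[rel_path] = (stat.st_mtime_ns, stat.st_size)
    return state


def changed_files(previous, current, excluded=EXCLUDED_NAMES):
    """Return paths that are new or modified since `previous`.

    A file whose base name is in `excluded` is dropped even if it changed.
    """
    return sorted(
        path for path, meta in current.items()
        if previous.get(path) != meta
        and os.path.basename(path) not in excluded
    )


def archive(directory, paths, archive_path):
    """Write only the given relative paths into a gzipped tarball."""
    with tarfile.open(archive_path, 'w:gz') as tar:
        for path in paths:
            tar.add(os.path.join(directory, path), arcname=path)
```

In this sketch, a stats file that is rewritten every cycle does count as modified, but it never reaches the archive while its name is on the exclusion list. Calling `changed_files(before, after, excluded=set())` would include it again, which is roughly the behavior asked for in this issue.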

I'm not sure I want to push a potentially breaking change before the contest, though. I guess that because fixing this will only remove code, it's probably low risk.

@vanhauser-thc Would you consider this critical for the competition?
As you know, we plan to launch the pre-run this weekend : )

No, that has nothing to do with the competition. I do not need this for the competition run :)

Often a simple run is enough to test whether a change is good or bad.
But sometimes we need to collect data on why something is working or not, and for that we need the metadata we collect.