Can this library be used under AFL?
0xfocu5 opened this issue · comments
This is my harness code:
use afl::fuzz;
use std::thread;
use std::time::Duration;
use arbitrary::Arbitrary;

#[derive(Arbitrary, Debug)]
struct MyData {
    a: u32,
    b: bool,
    c: Vec<u8>,
}

fn main() {
    afl::fuzz!(|data: MyData| {
        thread::sleep(Duration::from_secs(20));
        println!("1111");
    });
}
and the seed is
-> % xxd seed
00000000: 0102 0304 0105 0607 08
I want the data to be divided as follows:
The first four bytes (01 02 03 04) are for a.
The next byte (01) is for b. In this case, 01 represents true.
The remaining bytes (05 06 07 08) are for c.
But I got this:
pwndbg> p/x data
$1 = test::MyData {
  a: <synthetic pointer>,
  b: <synthetic pointer>,
  c: alloc::vec::Vec<u8, alloc::alloc::Global> {
    buf: alloc::raw_vec::RawVec<u8, alloc::alloc::Global> {
      ptr: core::ptr::unique::Unique<u8> {
        pointer: core::ptr::non_null::NonNull<u8> {
          pointer: 0x5555557bbbc0
        },
        _marker: core::marker::PhantomData<u8>
      },
      cap: 0x8,
      alloc: alloc::alloc::Global
    },
    len: <synthetic pointer>
  }
}
pwndbg> x/2gx 0x5555557bbbc0
0x5555557bbbc0: 0x0000000000000806 0x0000000000000000
According to GDB, the result is incorrect. Is there a problem with my usage?
I have never used arbitrary with AFL, but there is no fundamental reason why they shouldn't be compatible. arbitrary isn't doing anything magical.
I'd suggest filing a bug with the AFL rust bindings and asking there.
I think @0xfocu5's question is about how Arbitrary builds a struct from a sequence of bytes.
In particular, is what he wrote here correct?
The first four bytes (01 02 03 04) are for a.
The next byte (01) is for b. In this case, 01 represents true.
The remaining bytes (05 06 07 08) are for c.
I.e., should the bytes 0102 0304 0105 0607 08 construct a MyData with the following?
* `a`: `0x04030201` (or `0x01020304`)
* `b`: `true`
* `c`: `[ 0x05, 0x06, 0x07, 0x08 ]`
I don't think this is a problem with afl.rs, but if your answer to the above is "yes," then it may be.
I.e., should the bytes 0102 0304 0105 0607 08 construct a MyData with the following?
* `a`: `0x04030201` (or `0x01020304`)
* `b`: `true`
* `c`: `[ 0x05, 0x06, 0x07, 0x08 ]`
Ignoring the actual generated values, yes, the seed should be split between the fields in the way described.
In particular, c won't simply be the seed bytes, because the Arbitrary implementation for Vec<A> computes a length up front based on how much data is available and then adds up to that many elements to the vector. There is no specialization for Vec<u8> in particular. It will also probably end up with fewer than four elements, because some of the seed data is used to calculate the length.
Finally, the number of bytes each Arbitrary implementation consumes, and how exactly they are mapped to the constructed instance, are internal implementation details that are subject to change across releases. So I do not recommend relying on these exact details.
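If a harness really does need the fixed layout from the question, one option is to take raw bytes and decode them by hand instead of deriving Arbitrary. The following is a minimal std-only sketch, not anything provided by arbitrary or afl.rs: `parse_my_data` is a hypothetical helper, and both the little-endian reading of `a` and the "any nonzero byte is true" rule for `b` are assumptions.

```rust
/// Hand-decoded counterpart of the struct from the question.
#[derive(Debug, PartialEq)]
struct MyData {
    a: u32,
    b: bool,
    c: Vec<u8>,
}

/// Hypothetical helper: decode `bytes` with a fixed layout.
/// Assumptions: `a` is little-endian, any nonzero byte means `true`.
fn parse_my_data(bytes: &[u8]) -> Option<MyData> {
    // Need at least 4 bytes for `a` and 1 byte for `b`.
    if bytes.len() < 5 {
        return None;
    }
    let a = u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]);
    let b = bytes[4] != 0;
    // Everything after the first five bytes goes into `c` unchanged.
    let c = bytes[5..].to_vec();
    Some(MyData { a, b, c })
}
```

In an AFL harness this would presumably be called from a closure that takes `&[u8]`, returning early when `parse_my_data` yields `None`. The trade-off is that the layout is now the harness's own responsibility rather than something derived from the type.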
Apologies, we changed the implementation of Unstructured::arbitrary_iter some time ago, and I forgot. The change was in this commit, and its message has the motivation. This should reinforce my point about relying on those exact details: they are likely to change.
Anyway, it reads a byte on each iteration to decide whether to generate another element (and add it to the vector) or to stop.
Line 676 in 37ccb7d
Line 607 in 37ccb7d
Lines 713 to 729 in 37ccb7d
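The per-iteration scheme described above can be modeled with a small std-only sketch. This is a simplified illustration, not the crate's code: `model_vec_u8` is a hypothetical name, and the details (the control byte means "continue" when its low bit is set, bytes are taken from the front, exhausted input reads as zero) are assumptions that may not match any particular release.

```rust
/// Simplified model of building a Vec<u8> one element at a time.
/// Assumptions (not taken from the crate): a control byte with its low
/// bit set means "generate another element", and reads past the end of
/// the input yield zero.
fn model_vec_u8(bytes: &[u8]) -> Vec<u8> {
    let mut input = bytes.iter().copied();
    let mut out = Vec::new();
    loop {
        // One control byte per iteration decides whether to continue.
        let keep_going = input.next().unwrap_or(0) & 1 == 1;
        if !keep_going {
            break;
        }
        // The following byte becomes the next element.
        out.push(input.next().unwrap_or(0));
    }
    out
}
```

Under these assumptions, the four bytes left over for c in the question (05 06 07 08) give a two-element vector holding 0x06 and 0x08, which is consistent with the 0x0806 word in the GDB dump.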