rust-fuzz / arbitrary

This is my harness code:

use afl::fuzz;
use std::thread;
use std::time::Duration;
use arbitrary::Arbitrary;

#[derive(Arbitrary, Debug)]
struct MyData {
    a: u32,
    b: bool,
    c: Vec<u8>,
}

fn main() {
    afl::fuzz!(|data: MyData| {
        thread::sleep(Duration::from_secs(20));
        println!("1111");
    });
}

And the seed is:

-> % xxd seed  
00000000: 0102 0304 0105 0607 08

I expect the data to be divided like this:

The first four bytes (01 02 03 04) are for a.
The next byte (01) is for b. In this case, 01 represents true.
The remaining bytes (05 06 07 08) are for c.

But I got this:

pwndbg> p/x data
$1 = test::MyData {
  a: <synthetic pointer>,
  b: <synthetic pointer>,
  c: alloc::vec::Vec<u8, alloc::alloc::Global> {
    buf: alloc::raw_vec::RawVec<u8, alloc::alloc::Global> {
      ptr: core::ptr::unique::Unique<u8> {
        pointer: core::ptr::non_null::NonNull<u8> {
          pointer: 0x5555557bbbc0
        },
        _marker: core::marker::PhantomData<u8>
      },
      cap: 0x8,
      alloc: alloc::alloc::Global
    },
    len: <synthetic pointer>
  }
}
pwndbg> x/2gx 0x5555557bbbc0
0x5555557bbbc0: 0x0000000000000806      0x0000000000000000

According to GDB, the result is incorrect. Is there a problem with my usage?

I have never used arbitrary with AFL but there is no fundamental reason why they shouldn't be compatible. arbitrary isn't doing anything magical.

I'd suggest filing a bug with the AFL rust bindings and asking there.

I think @0xfocu5's question is about how Arbitrary builds a struct from a sequence of bytes.

In particular, is what he wrote here correct?

The first four bytes (01 02 03 04) are for a.
The next byte (01) is for b. In this case, 01 represents true.
The remaining bytes (05 06 07 08) are for c.

I.e., should the bytes 0102 0304 0105 0607 08 construct a MyData with the following?

a: 0x04030201 (or 0x01020304)
b: true
c: [ 0x05, 0x06, 0x07, 0x08 ]

I don't think this is a problem with afl.rs, but if your answer to the above is "yes," then it may be.

I.e., should the bytes 0102 0304 0105 0607 08 construct a MyData with the following?
* `a`: `0x04030201` (or `0x01020304`)
* `b`: `true`
* `c`: `[ 0x05, 0x06, 0x07, 0x08 ]`

Ignoring the actual generated values, yes, the seed should be split between the fields in the way described.

In particular, c won't simply be the seed bytes, because the implementation of Arbitrary for Vec<A> computes a length up front based on how much data is available and then adds up to that many elements to the vector. There is no specialization for Vec<u8> in particular. Also, c will probably end up with fewer than four elements because some of the seed data is used to calculate the length.

Finally, the number of bytes each Arbitrary implementation consumes and how exactly they get mapped to the constructed instance are internal implementation details that are subject to change across releases. So I do not recommend relying on these exact details.
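If a fixed byte layout like the one in the question is really needed, one option (not something proposed in this thread) is to take raw bytes in the harness and split them by hand instead of deriving Arbitrary. Below is a minimal sketch under stated assumptions: `from_bytes` is a hypothetical helper, `a` is read as little-endian, and any non-zero byte counts as `true` for `b`.

```rust
// Hand-written decoding with a fixed layout:
// 4 bytes for `a`, 1 byte for `b`, everything else for `c`.
#[derive(Debug, PartialEq)]
struct MyData {
    a: u32,
    b: bool,
    c: Vec<u8>,
}

impl MyData {
    /// Hypothetical helper (not part of `arbitrary` or `afl`).
    /// Returns `None` when fewer than five bytes are available.
    fn from_bytes(bytes: &[u8]) -> Option<MyData> {
        match bytes {
            [a0, a1, a2, a3, flag, tail @ ..] => Some(MyData {
                // Assumption: little-endian byte order for `a`.
                a: u32::from_le_bytes([*a0, *a1, *a2, *a3]),
                // Assumption: any non-zero byte means `true`.
                b: *flag != 0,
                c: tail.to_vec(),
            }),
            _ => None,
        }
    }
}

fn main() {
    // The seed from the question.
    let seed = [0x01, 0x02, 0x03, 0x04, 0x01, 0x05, 0x06, 0x07, 0x08];
    let data = MyData::from_bytes(&seed).expect("seed has at least five bytes");
    assert_eq!(data.a, 0x0403_0201);
    assert!(data.b);
    assert_eq!(data.c, vec![0x05, 0x06, 0x07, 0x08]);
    println!("{:?}", data);
}
```

In an AFL harness this would presumably be called from a closure that takes `data: &[u8]`, returning early when `from_bytes` yields `None`. The trade-off is that the layout is now yours to maintain, whereas the derived implementation can change between releases.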

Apologies, we changed the implementation of Unstructured::arbitrary_iter some time ago, and I forgot. The change was in this commit, and its message has the motivation. This should reinforce my point about relying on those exact details: they are likely to change.

Anyway, the iterator now reads one byte on each iteration to decide whether to keep generating elements (which are ultimately added to the vector) or to stop.

arbitrary/src/lib.rs

Line 676 in 37ccb7d

u.arbitrary_iter()?.collect()

arbitrary/src/unstructured.rs

Line 607 in 37ccb7d

pub fn arbitrary_iter<'b, ElementType: Arbitrary<'a>>(

arbitrary/src/unstructured.rs

Lines 713 to 729 in 37ccb7d

    /// Utility iterator produced by [`Unstructured::arbitrary_iter`]
    pub struct ArbitraryIter<'a, 'b, ElementType> {
        u: &'b mut Unstructured<'a>,
        _marker: PhantomData<ElementType>,
    }

    impl<'a, 'b, ElementType: Arbitrary<'a>> Iterator for ArbitraryIter<'a, 'b, ElementType> {
        type Item = Result<ElementType>;
        fn next(&mut self) -> Option<Result<ElementType>> {
            let keep_going = self.u.arbitrary().unwrap_or(false);
            if keep_going {
                Some(Arbitrary::arbitrary(self.u))
            } else {
                None
            }
        }
    }
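To make the per-element "keep going" byte concrete, here is a minimal, self-contained model of that decoding applied to the seed from the question. `MiniUnstructured` and `decode_seed` are hypothetical names, not part of the arbitrary crate. The little-endian `u32`, the low-bit test for `bool`, and the zero-fill once the input runs out are assumptions of this sketch; the real crate's behavior may differ and can change between releases.

```rust
/// Hypothetical, simplified stand-in for `arbitrary::Unstructured`.
struct MiniUnstructured<'a> {
    data: &'a [u8],
}

impl<'a> MiniUnstructured<'a> {
    fn new(data: &'a [u8]) -> Self {
        MiniUnstructured { data }
    }

    /// Take one byte; yield 0 once the input is exhausted (assumed zero-fill).
    fn byte(&mut self) -> u8 {
        match self.data.split_first() {
            Some((&first, rest)) => {
                self.data = rest;
                first
            }
            None => 0,
        }
    }

    /// Assumption: integers are read in little-endian order.
    fn u32_le(&mut self) -> u32 {
        let bytes = [self.byte(), self.byte(), self.byte(), self.byte()];
        u32::from_le_bytes(bytes)
    }

    /// Assumption: a bool is the low bit of one byte.
    fn boolean(&mut self) -> bool {
        self.byte() & 1 == 1
    }

    /// Mirrors the shape of `ArbitraryIter::next`: read a "keep going" flag,
    /// then one element, until the flag is false.
    fn vec_u8(&mut self) -> Vec<u8> {
        let mut out = Vec::new();
        while self.boolean() {
            out.push(self.byte());
        }
        out
    }
}

/// Decode the fields of `MyData` in declaration order: a, b, c.
fn decode_seed(seed: &[u8]) -> (u32, bool, Vec<u8>) {
    let mut u = MiniUnstructured::new(seed);
    let a = u.u32_le();
    let b = u.boolean();
    let c = u.vec_u8();
    (a, b, c)
}

fn main() {
    let seed = [0x01, 0x02, 0x03, 0x04, 0x01, 0x05, 0x06, 0x07, 0x08];
    let (a, b, c) = decode_seed(&seed);
    // 05 and 07 are consumed as "keep going" flags, so only 06 and 08 land in c.
    assert_eq!(c, vec![0x06, 0x08]);
    println!("a = {:#x}, b = {}, c = {:?}", a, b, c);
}
```

Under these assumptions `c` holds two elements rather than four, which is consistent with the bytes 06 08 visible in the GDB memory dump above.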

Thanks very much, @fitzgen.

@0xfocu5 Does this resolve your question?

Thanks very much. I got it.

Can this library be used under AFL?