jqwik-team / jqwik

Property-Based Testing on the JUnit Platform

Home Page: http://jqwik.net

generated arbitrary list from fixed inputs

drobert opened this issue

Testing Problem

With complicated data sets (e.g. for Spark or similar) we often need to establish the 'bounds' or the 'universe of allowed IDs' up front, so that we can produce a semi-arbitrary data set whose parts are still highly likely (or guaranteed) to join together safely.

Consider testing a system that combines ad clicks with ad campaigns, with data types looking something like this (note this is an intentionally simplistic example):

// imagine getters/setters exist, etc.
class AdClick {
  private long adId;
  private Instant clickTime;
}
class Ad {
  private long id;
  private long campaignId;
}
class Campaign {
  private long id;
  private BigDecimal costPerClick;
  private BigDecimal budget;
}

Processing might join all ad clicks against all ads against all campaigns to produce, for each campaign at some point in time, the total cost (number of clicks * cost per click) versus the budget. One important test case is the 'happy path' where every ad click corresponds to an ad and every ad corresponds to a campaign.
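To make the join concrete, here is a minimal sketch of the processing described above, in plain Java with no jqwik involved. The class name CampaignSpend, the method totalCost, and the use of records (Java 16+) instead of getters/setters are assumptions for illustration only; the issue does not show the real processing code.

```java
import java.math.BigDecimal;
import java.time.Instant;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the system under test; not part of the issue.
class CampaignSpend {
    record AdClick(long adId, Instant clickTime) {}
    record Ad(long id, long campaignId) {}
    record Campaign(long id, BigDecimal costPerClick, BigDecimal budget) {}

    /** Returns campaignId -> total click cost. Clicks that do not join to an ad and a campaign are skipped. */
    static Map<Long, BigDecimal> totalCost(List<AdClick> clicks, List<Ad> ads, List<Campaign> campaigns) {
        // Index ads and campaigns by id so each click needs two lookups.
        Map<Long, Long> campaignIdByAdId = new HashMap<>();
        for (Ad ad : ads) {
            campaignIdByAdId.put(ad.id(), ad.campaignId());
        }
        Map<Long, BigDecimal> costPerClickByCampaignId = new HashMap<>();
        for (Campaign campaign : campaigns) {
            costPerClickByCampaignId.put(campaign.id(), campaign.costPerClick());
        }

        Map<Long, BigDecimal> totals = new HashMap<>();
        for (AdClick click : clicks) {
            Long campaignId = campaignIdByAdId.get(click.adId());
            if (campaignId == null) continue; // click for an unknown ad
            BigDecimal costPerClick = costPerClickByCampaignId.get(campaignId);
            if (costPerClick == null) continue; // ad for an unknown campaign
            totals.merge(campaignId, costPerClick, BigDecimal::add);
        }
        return totals;
    }
}
```

The two `continue` branches are exactly what the 'happy path' data set must never hit, which is why the generated IDs have to line up.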

Mechanically, I think the approach would generally be to:

  1. produce Arbitrary<List<Campaign>> (some arbitrary list of campaigns, comprising the 'universe' of known campaigns)
  2. produce one or more Ad for each Campaign
  3. produce one or more AdClick for each Ad
    (alternatively, produce all known campaign ids and all known ad ids up-front and then generate the full objects from there).
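The steps above can be sketched without any jqwik API, just to show the shape of the data. This uses java.util.Random in place of arbitraries, so it says nothing about shrinking or about how jqwik would express it. JoinableData, Universe, and generate are hypothetical names, and the 1 to 3 ads per campaign and clicks per ad are arbitrary choices.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical sketch: plain Random instead of jqwik arbitraries.
class JoinableData {
    record Campaign(long id) {}
    record Ad(long id, long campaignId) {}
    record AdClick(long adId) {}
    record Universe(List<Campaign> campaigns, List<Ad> ads, List<AdClick> clicks) {}

    static Universe generate(Random random, int campaignCount) {
        List<Campaign> campaigns = new ArrayList<>();
        List<Ad> ads = new ArrayList<>();
        List<AdClick> clicks = new ArrayList<>();
        long nextAdId = 0;
        for (long campaignId = 0; campaignId < campaignCount; campaignId++) {
            // Step 1: the campaign defines the 'universe' of valid campaign ids.
            campaigns.add(new Campaign(campaignId));
            // Step 2: one or more ads per campaign, each pointing at a known campaign.
            int adCount = 1 + random.nextInt(3);
            for (int a = 0; a < adCount; a++) {
                Ad ad = new Ad(nextAdId++, campaignId);
                ads.add(ad);
                // Step 3: one or more clicks per ad, each pointing at a known ad.
                int clickCount = 1 + random.nextInt(3);
                for (int c = 0; c < clickCount; c++) {
                    clicks.add(new AdClick(ad.id()));
                }
            }
        }
        return new Universe(campaigns, ads, clicks);
    }
}
```

Because children are only ever created from an existing parent, every click joins to an ad and every ad joins to a campaign by construction.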

In either case, at some point we will be inside a flatMap operation holding a List<Campaign>, and we need to create at least one AdGroup for each Campaign. I don't see an approach that looks much different from this:

Arbitrary<List<Campaign>> arbCampaigns = ...;
arbCampaigns.flatMap(campaigns -> {
  // note: this is List<Arbitrary<T>> rather than Arbitrary<List<T>>.
  // semantically, it feels reasonable, the list isn't really arbitrary but
  // each element within the list is arbitrary except for the corresponding campaign id
  // (needs java.util.stream.Collectors)
  List<Arbitrary<AdGroup>> arbAdGroups =
    campaigns.stream()
      .map(Campaign::getId)
      .map(campaignId -> Arbitraries.longs().map(adId -> new AdGroup(adId, campaignId)))
      .collect(Collectors.toList());
  // and a similar flatMap for the list of arbitrary ad clicks
  // stuck here: flatMap has to return an Arbitrary, but all we have is a List<Arbitrary<AdGroup>>
});

Suggested Solution

I think it would be useful to have a built-in mechanism to go from List<Arbitrary<T>> (and/or Stream<Arbitrary<T>>) to Arbitrary<List<T>>. Something like:

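// I picked 'sequence' as the common FP name for such an operation
// (needs java.util.ArrayList, java.util.List, net.jqwik.api.Arbitraries, net.jqwik.api.Arbitrary)
public static <T> Arbitrary<List<T>> sequence(List<Arbitrary<T>> in) {
    // start from an arbitrary that always yields an empty list
    Arbitrary<List<T>> result = Arbitraries.just(new ArrayList<>());
    for (Arbitrary<T> arb : in) {
        // append one generated element; copy so that no list is shared between tries
        result = result.flatMap(list -> arb.map(e -> {
            List<T> extended = new ArrayList<>(list);
            extended.add(e);
            return extended;
        }));
    }
    return result;
}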
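To show what 'sequence' does independent of jqwik, here is a self-contained sketch using a toy generator type. Gen is a hypothetical stand-in (a function from a Random to a value) and is not jqwik's Arbitrary; it has no shrinking or edge cases. Only the way the list of generators is folded into one generator of lists is the point.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.function.Function;

class SequenceSketch {
    /** Hypothetical stand-in for an arbitrary: produces a value from a random source. */
    interface Gen<T> {
        T sample(Random random);

        default <U> Gen<U> map(Function<T, U> f) {
            return random -> f.apply(sample(random));
        }

        default <U> Gen<U> flatMap(Function<T, Gen<U>> f) {
            return random -> f.apply(sample(random)).sample(random);
        }

        static <T> Gen<T> just(T value) {
            return random -> value;
        }
    }

    /** Turns a list of generators into one generator of lists, preserving order. */
    static <T> Gen<List<T>> sequence(List<Gen<T>> gens) {
        Gen<List<T>> result = Gen.just(new ArrayList<>());
        for (Gen<T> gen : gens) {
            // Extend the list generated so far by one freshly generated element.
            result = result.flatMap(list -> gen.map(element -> {
                List<T> extended = new ArrayList<>(list);
                extended.add(element);
                return extended;
            }));
        }
        return result;
    }
}
```

As far as I know, jqwik's Combinators.combine also accepts a List<Arbitrary<T>> and may already cover this case; that is worth checking against the jqwik version in use.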

The above example would then look something like:

Arbitrary<List<Campaign>> arbCampaigns = ...;
arbCampaigns.flatMap(campaigns -> {
  List<Arbitrary<AdGroup>> tmpArbAdGroups =
    campaigns.stream()
      .map(Campaign::getId)
      .map(campaignId -> Arbitraries.longs().map(adId -> new AdGroup(adId, campaignId)))
      .collect(Collectors.toList());

  Arbitrary<List<AdGroup>> arbAdGroups = sequence(tmpArbAdGroups);
  return arbAdGroups;
});

I haven't gone through your problem in detail (yet). Have you looked at ListArbitrary.mapEach(..) and ListArbitrary.flatMapEach(..)?

A somewhat simpler example using flatMapEach for what I think you want to do:

@Property(tries = 10)
void addAgesToFixedListOfUsers(@ForAll("users") List<User> users) {
	System.out.println(users);
	// Assertions.assertThat(users).hasSize(1);
}

@Provide
Arbitrary<List<User>> users() {
	Arbitrary<String> names = Arbitraries.strings().alpha().ofLength(5);
	ListArbitrary<User> users = names.map(User::new).list().ofMinSize(1).ofMaxSize(5);
	return users.flatMapEach((allUsers, user) -> {
		IntegerArbitrary ages = Arbitraries.integers().between(0, 100);
		return ages.map(age -> {
			user.age = age;
			return user;
		});
	});
}

static class User {
	String name;
	int age = -1;

	public User(String name) {
		this.name = name;
	}

	@Override
	public String toString() {
		return "User{name='" + name + '\'' + ", age=" + age + '}';
	}
}

One could argue that there should be a simpler form for

return users.flatMapEach((allUsers, user) -> {
	IntegerArbitrary ages = Arbitraries.integers().between(0, 100);
	return ages.map(age -> {
		user.age = age;
		return user;
	});
});

Especially since allUsers is not needed in this case. For example, something like:

IntegerArbitrary ages = Arbitraries.integers().between(0, 100);
return users.combineEach(ages, (user, age) -> {
	user.age = age;
	return user;
});