Tuesday, February 18, 2020

In praise of the thoughful design: how property-based testing helps me to be a better developer

The developer's testing toolbox is one of these things which rarely stays unchanged. For sure, some testing practices have proven to be more valuable than others but still, we are constantly looking for better, faster and more expressive ways to test our code. Property-based testing, largely unknown to Java community, is yet another gem crafted by Haskell folks and described in QuickCheck paper.

The power of this testing technique has been quickly realized by Scala community (where the ScalaCheck library was born) and many others but the Java ecosystem has lacked the interest into adopting property-based testing for quite some time. Luckily, since the jqwik appearance, the things are slowly changing for better.

For many, it is quite difficult to grasp what property-based testing is and how it could be exploited. The excellent presentation Property-based Testing for Better Code by Jessica Kerr and comprehensive An introduction to property-based testing, Property-based Testing Patterns series of articles are excellent sources to get you hooked, but in today's post we are going to try discovering the practical side of the property-based testing for typical Java developer using jqwik.

To start with, what the name property-based testing actually implies? The first thought of every Java developer would be it aims to test all getters and setters (hello 100% coverage)? Not really, although for some data structures it could be useful. Instead, we should identify the high-level characteristics, if you will, of the component, data structure, or even individual function and efficiently test them by formulating the hypothesis.

Our first example falls into category "There and back again": serialization and deserialization into JSON representation. The class under the test is User POJO, although trivial, please notice that it has one temporal property of type OffsetDateTime.

public class User {
    private String username;
    @JsonFormat(pattern = "yyyy-MM-dd'T'HH:mm:ss[.SSS[SSS]]XXX", shape = Shape.STRING)
    private OffsetDateTime created;
    
    // ...
}

It is surprising to see how often manipulation with date/time properties are causing issues these days since everyone tries to use own representation. As you could spot, our contract is using ISO-8601 interchange format with optional milliseconds part. What we would like to make sure is that any valid instance of User could be serialized into JSON and desearialized back into Java object without loosing any date/time precision. As an exercise, let us try to express that in pseudo code first:

For any user
  Serialize user instance to JSON
  Deserialize user instance back from JSON
  Two user instances must be identical

Looks simple enough but here comes the surprising part: let us take a look at how this pseudo code projects into real test case using jqwik library. It gets as close to our pseudo code as it possibly could.

@Property
void serdes(@ForAll("users") User user) throws JsonProcessingException {
    final String json = serdes.serialize(user);

    assertThat(serdes.deserialize(json))
        .satisfies(other -> {
            assertThat(user.getUsername()).isEqualTo(other.getUsername());
            assertThat(user.getCreated().isEqual(other.getCreated())).isTrue();
        });
        
    Statistics.collect(user.getCreated().getOffset());
}

The test case reads very easy, mostly natural, but obviously, there is some background hidden behind jqwik's @Property and @ForAll annotations. Let us start from @ForAll and clear out where all these User instances are coming from. As you may guess, these instances must be generated, preferably in a randomized fashion.

For most of the built-in data types jqwik has a rich set of data providers (Arbitraries), but since we are dealing with application-specific class, we have to supply our own generation strategy. It should be able to emit User class instances with the wide range of usernames and the date/time instants for different set of timezones and offsets. Let us do a sneak peek at the provider implementation first and discuss it in details right after.

@Provide
Arbitrary<User> users() {
    final Arbitrary<String> usernames = Arbitraries.strings().alpha().ofMaxLength(64);
 
    final Arbitrary<OffsetDateTime> dates = Arbitraries
        .of(List.copyOf(ZoneId.getAvailableZoneIds()))
        .flatMap(zone -> Arbitraries
            .longs()
            .between(1266258398000L, 1897410427000L) // ~ +/- 10 years
            .unique()
            .map(epochMilli -> Instant.ofEpochMilli(epochMilli))
            .map(instant -> OffsetDateTime.from(instant.atZone(ZoneId.of(zone)))));

    return Combinators
        .combine(usernames, dates)
        .as((username, created) -> new User(username).created(created));

}

The source of usernames is easy: just random strings. The source of dates basically could be any date/time between 2010 and 2030 whereas the timezone part (thus the offset) is randomly picked from all available region-based zone identifiers. For example, below are some samples jqwik came up with.

{"username":"zrAazzaDZ","created":"2020-05-06T01:36:07.496496+03:00"}
{"username":"AZztZaZZWAaNaqagPLzZiz","created":"2023-03-20T00:48:22.737737+08:00"}
{"username":"aazGZZzaoAAEAGZUIzaaDEm","created":"2019-03-12T08:22:12.658658+04:00"}
{"username":"Ezw","created":"2011-10-28T08:07:33.542542Z"}
{"username":"AFaAzaOLAZOjsZqlaZZixZaZzyZzxrda","created":"2022-07-09T14:04:20.849849+02:00"}
{"username":"aaYeZzkhAzAazJ","created":"2016-07-22T22:20:25.162162+06:00"}
{"username":"BzkoNGzBcaWcrDaaazzCZAaaPd","created":"2020-08-12T22:23:56.902902+08:45"}
{"username":"MazNzaTZZAEhXoz","created":"2027-09-26T17:12:34.872872+11:00"}
{"username":"zqZzZYamO","created":"2023-01-10T03:16:41.879879-03:00"}
{"username":"GaaUazzldqGJZsqksRZuaNAqzANLAAlj","created":"2015-03-19T04:16:24.098098Z"}
...

By default, jqwik will run the test against 1000 different sets of parameter values (randomized User instances). The quite helpful Statistics container allows to collect whatever distribution insights you are curious about. Just in case, why not to collect the distribution by zone offsets?

    ...
    -04:00 (94) :  9.40 %
    -03:00 (76) :  7.60 %
    +02:00 (75) :  7.50 %
    -05:00 (74) :  7.40 %
    +01:00 (72) :  7.20 %
    +03:00 (69) :  6.90 %
    Z      (62) :  6.20 %
    -06:00 (54) :  5.40 %
    +11:00 (42) :  4.20 %
    -07:00 (39) :  3.90 %
    +08:00 (37) :  3.70 %
    +07:00 (34) :  3.40 %
    +10:00 (34) :  3.40 %
    +06:00 (26) :  2.60 %
    +12:00 (23) :  2.30 %
    +05:00 (23) :  2.30 %
    -08:00 (20) :  2.00 %
    ...    

Let us consider another example. Imagine at some point we decided to reimplement the equality for User class (which in Java means, overriding equals and hashCode) based on username property. With that, for any pair of User class instances the following invariants must hold true:

  • if two User instances have the same username, they are equal and must have same hash code
  • if two User instances have different usernames, they are not equal (but hash code may not necessarily be different)
It is the perfect fit for property-based testing and jqwik in particular makes such kind of tests trivial to write and maintain.

@Provide
Arbitrary<String> usernames() {
    return Arbitraries.strings().alpha().ofMaxLength(64);
}

@Property
void equals(@ForAll("usernames") String username, @ForAll("usernames") String other) {
    Assume.that(!username.equals(other));
        
    assertThat(new User(username))
        .isEqualTo(new User(username))
        .isNotEqualTo(new User(other))
        .extracting(User::hashCode)
        .isEqualTo(new User(username).hashCode());
}

The assumptions expressed through Assume allow to put additional constraints on the generated parameters since we introduce two sources of the usernames, it could happen that both of them emit the identical username at the same run so the test would fail.

The question you may be holding up to now is: what is the point? It is surely possible to test serialization / deserialization or equals/hashCode without embarking on property-based testing and using jqwik, so why even bother? Fair enough, but the answer to this question basically lies deeply in how we approach the design of our software systems.

By and large, property-based testing is heavily influenced by functional programming, not a first thing which comes into mind with respect to Java (at least, not yet), to say it mildly. The randomized generation of test data is not novel idea per se, however what property-based testing is encouraging you to do, at least in my opinion, is to think in more abstract terms, focus not on individual operations (equals, compare, add, sort, serialize, ...) but what kind of properties, characteristics, laws and/or invariants they come with to obey. It certainly feels like an alien technique, paradigm shift if you will, encourages to spend more time on designing the right thing. It does not mean that from now on all your tests must be property-based but I believe it certainly deserves the place in the front row of our testing toolboxes.

Please find the complete project sources available on Github.