Tuesday, July 28, 2020

There are never enough of them: enriching Apache Avro generated classes with custom Java annotations

Apache Avro, along with Apache Thrift and Protocol Buffers, is often used as a platform-neutral, extensible mechanism for serializing structured data. In the context of event-driven systems, Apache Avro schemas play the role of language-agnostic contracts shared between loosely coupled components of the system, which are not necessarily written in the same programming language.

Probably the most widely adopted reference architecture for such systems revolves around Apache Kafka backed by Schema Registry and Apache Avro, although many other excellent options are available. Nevertheless, why Apache Avro?

The official documentation page summarizes the key advantages Apache Avro has over Apache Thrift and Protocol Buffers pretty well. But we are going to add another one to the list: biased (in a good sense) support for Java and the JVM platform in general.

Let us imagine that one of the components (or, it has to be said, microservices) takes care of payment processing. Not every payment succeeds, and to propagate such failures the component broadcasts a PaymentRejectedEvent whenever such an unfortunate event happens. Here is its Apache Avro schema, persisted in the PaymentRejectedEvent.avsc file.

{
    "type": "record",
    "name": "PaymentRejectedEvent",
    "namespace": "com.example.event",
    "fields": [
        {
            "name": "id",
            "type": {
                "type": "string",
                "logicalType": "uuid"
            }
        },
        {
            "name": "reason",
            "type": {
                "type": "enum",
                "name": "PaymentStatus",
                "namespace": "com.example.event",
                "symbols": [
                    "EXPIRED_CARD",
                    "INSUFFICIENT_FUNDS",
                    "DECLINED"
                ]
            }
        },
        {
            "name": "date",
            "type": {
                "type": "long",
                "logicalType": "local-timestamp-millis"
            }
        }
    ]
}

The event is deliberately kept simple; you can safely assume that in a more or less realistic system it would carry considerably more details. To turn this event into a Java class at build time, we could use the Apache Avro Maven plugin, which is as easy as it gets.

<plugin>
    <groupId>org.apache.avro</groupId>
    <artifactId>avro-maven-plugin</artifactId>
    <version>1.10.0</version>
    <configuration>
        <stringType>String</stringType>
    </configuration>
    <executions>
        <execution>
            <phase>generate-sources</phase>
            <goals>
                <goal>schema</goal>
            </goals>
            <configuration>
                <sourceDirectory>${project.basedir}/src/main/avro/</sourceDirectory>
                <outputDirectory>${project.build.directory}/generated-sources/avro/</outputDirectory>
            </configuration>
        </execution>
    </executions>
</plugin>

Once the build finishes, you will get the PaymentRejectedEvent Java class generated. But a few annoyances emerge right away:

@org.apache.avro.specific.AvroGenerated
public class PaymentRejectedEvent extends ... {
   private java.lang.String id;
   private com.example.event.PaymentStatus reason;
   private long date;
}

The Java types of the id and date fields are not really what we would expect: String and long instead of UUID and LocalDateTime. Luckily, this is easy to fix, for example by specifying the customConversions plugin property.

<plugin>
    <groupId>org.apache.avro</groupId>
    <artifactId>avro-maven-plugin</artifactId>
    <version>1.10.0</version>
    <configuration>
        <stringType>String</stringType>
        <customConversions>
            org.apache.avro.Conversions$UUIDConversion,org.apache.avro.data.TimeConversions$LocalTimestampMillisConversion
        </customConversions>
    </configuration>
    <executions>
        <execution>
            <phase>generate-sources</phase>
            <goals>
                <goal>schema</goal>
            </goals>
            <configuration>
                <sourceDirectory>${project.basedir}/src/main/avro/</sourceDirectory>
                <outputDirectory>${project.build.directory}/generated-sources/avro/</outputDirectory>
            </configuration>
        </execution>
    </executions>
</plugin>

If we build the project this time, the plugin generates the right types.

@org.apache.avro.specific.AvroGenerated
public class PaymentRejectedEvent extends ... {
   private java.util.UUID id;
   private com.example.event.PaymentStatus reason;
   private java.time.LocalDateTime date;
}
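To make the effect of the two conversions more tangible, here is a minimal JDK-only sketch of the mapping between the wire representation (a string and a long) and the richer Java types. The class and method names are hypothetical and this is not the Apache Avro implementation; it only assumes that local-timestamp-millis counts milliseconds from 1970-01-01T00:00 on a local timeline, which is modelled here with a fixed UTC offset.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.util.UUID;

// Hypothetical helper for illustration only, not part of Apache Avro.
final class LogicalTypeSketch {
    // "uuid" logical type: a plain string on the wire, java.util.UUID in Java.
    static UUID uuidFromWire(String value) {
        return UUID.fromString(value);
    }

    static String uuidToWire(UUID value) {
        return value.toString();
    }

    // "local-timestamp-millis" logical type: a long on the wire, LocalDateTime in Java.
    // Assumption: the local timeline is modelled by interpreting the millis at offset UTC.
    static LocalDateTime localTimestampFromWire(long millis) {
        return LocalDateTime.ofInstant(Instant.ofEpochMilli(millis), ZoneOffset.UTC);
    }

    static long localTimestampToWire(LocalDateTime value) {
        return value.toInstant(ZoneOffset.UTC).toEpochMilli();
    }

    public static void main(String[] args) {
        UUID id = uuidFromWire("123e4567-e89b-12d3-a456-426614174000");
        LocalDateTime date = localTimestampFromWire(1595894400000L);
        System.out.println(id + " rejected at " + date); // date prints as 2020-07-28T00:00
    }
}
```

The serialized form stays exactly the same as before; only the generated accessors work with the friendlier types.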

The generated class looks much better! But what about the next challenge? In Java, annotations are commonly used to associate additional metadata with a particular language element. What if we have to add a custom, application-specific annotation to all generated event classes? It does not really matter which one; let it be @javax.annotation.Generated, for example. It turns out that with Apache Avro this is not an issue: it has a dedicated javaAnnotation property we could benefit from.

{
    "type": "record",
    "name": "PaymentRejectedEvent",
    "namespace": "com.example.event",
    "javaAnnotation": "javax.annotation.Generated(\"avro\")",
    "fields": [
        {
            "name": "id",
            "type": {
                "type": "string",
                "logicalType": "uuid"
            }
        },
        {
            "name": "reason",
            "type": {
                "type": "enum",
                "name": "PaymentStatus",
                "namespace": "com.example.event",
                "symbols": [
                    "EXPIRED_CARD",
                    "INSUFFICIENT_FUNDS",
                    "DECLINED"
                ]
            }
        },
        {
            "name": "date",
            "type": {
                "type": "long",
                "logicalType": "local-timestamp-millis"
            }
        }
    ]
}

When we rebuild the project one more time (hopefully the last one), the generated PaymentRejectedEvent Java class is going to be decorated with the additional custom annotation.

@javax.annotation.Generated("avro")
@org.apache.avro.specific.AvroGenerated
public class PaymentRejectedEvent extends ... {
   private java.util.UUID id;
   private com.example.event.PaymentStatus reason;
   private java.time.LocalDateTime date;
}
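One detail worth keeping in mind when picking the annotation: @javax.annotation.Generated is declared with source retention, so it is not visible through reflection at runtime. If the application needs to discover the annotated event classes at runtime, the annotation has to be declared with runtime retention. The sketch below illustrates that with a hypothetical @DomainEvent annotation and a hand-written stand-in for the generated class; none of these names come from Apache Avro or from the original project.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.Optional;

// Hypothetical example: a runtime-visible annotation and a reflective lookup.
final class AnnotationLookup {
    // RUNTIME retention is what makes the annotation visible to reflection.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface DomainEvent {
        String value();
    }

    // Stand-in for a generated class that carries the custom annotation.
    @DomainEvent("payments")
    static final class SamplePaymentRejectedEvent {
    }

    // A class without the annotation, for comparison.
    static final class PlainClass {
    }

    // Returns the annotation value if the class is annotated, empty otherwise.
    static Optional<String> topicOf(Class<?> type) {
        return Optional.ofNullable(type.getAnnotation(DomainEvent.class))
            .map(DomainEvent::value);
    }

    public static void main(String[] args) {
        System.out.println(topicOf(SamplePaymentRejectedEvent.class)); // Optional[payments]
        System.out.println(topicOf(PlainClass.class));                 // Optional.empty
    }
}
```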

Obviously, the javaAnnotation property has no effect if the schema is used to produce the respective constructs in other programming languages, but it still feels good to see that Java has privileged support in Apache Avro, thanks for that! As a side note, it is good to see that after a fairly long period of inactivity the project is getting its second wind, with regular releases and new features delivered constantly.

The complete source code is available on Github.

Thursday, April 30, 2020

The crypto quirks using JDK's Cipher streams (and what to do about that)

In our day-to-day job we often run into the recurrent theme of transferring data (for example, files) from one location to another. It sounds like a really simple task but let us make it a bit more difficult by stating the fact that these files may contain confidential information and could be transferred over non-secure communication channels.

One of the solutions which comes to mind first is to use encryption algorithms. Since the files could be really large, hundreds of megabytes or tens of gigabytes, using a symmetric encryption scheme like AES would probably make a lot of sense. Besides encryption, it would be great to make sure that the data is not tampered with in transit. Fortunately, there is a thing called authenticated encryption which simultaneously provides confidentiality, integrity, and authenticity guarantees. Galois/Counter Mode (GCM) is one of the most popular modes that supports authenticated encryption and could be used along with AES. These thoughts lead us to AES256-GCM128 (a 256-bit key with a 128-bit authentication tag), a sufficiently strong encryption scheme.

In case you are on the JVM platform, you should feel lucky since AES and GCM are supported by the Java Cryptography Architecture (JCA) out of the box. With that being said, let us see how far we could go.

The first thing we have to do is to generate a new AES256 key. As always, OWASP has a number of recommendations on using JCA/JCE APIs properly.

final SecureRandom secureRandom = new SecureRandom();
        
final byte[] key = new byte[32];
secureRandom.nextBytes(key);

final SecretKey secretKey = new SecretKeySpec(key, "AES");

Also, to initialize the AES/GCM cipher we need to generate a random initialization vector (IV for short), which must never be reused with the same key. As per NIST recommendations, its length should be 12 bytes (96 bits).

For IVs, it is recommended that implementations restrict support to the length of 96 bits, to promote interoperability, efficiency, and simplicity of design. - Recommendation for Block Cipher Modes of Operation: Galois/Counter Mode (GCM) and GMAC

So here we are:

final byte[] iv = new byte[12];
secureRandom.nextBytes(iv);
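Before moving on to files, a small in-memory round trip may help to see what the key, the IV and the authentication tag actually do. The sketch below uses only JCA classes; the class and method names are made up for this illustration. It relies on the fact that the JCA AES/GCM cipher appends the tag to the ciphertext, and that any modification of the ciphertext surfaces as an AEADBadTagException on decryption.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Optional;

import javax.crypto.AEADBadTagException;
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper for illustration: AES/GCM on small in-memory payloads.
final class GcmRoundTrip {
    private static final int TAG_BITS = 128;

    static byte[] encrypt(SecretKey key, byte[] iv, byte[] plaintext) {
        try {
            final Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            return cipher.doFinal(plaintext); // ciphertext followed by the 16-byte tag
        } catch (GeneralSecurityException ex) {
            throw new IllegalStateException(ex);
        }
    }

    // Returns the plaintext, or empty if the authentication tag does not match.
    static Optional<byte[]> decrypt(SecretKey key, byte[] iv, byte[] ciphertext) {
        try {
            final Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            return Optional.of(cipher.doFinal(ciphertext));
        } catch (AEADBadTagException ex) {
            return Optional.empty(); // data was modified or key / IV are wrong
        } catch (GeneralSecurityException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        final SecureRandom secureRandom = new SecureRandom();
        final byte[] keyBytes = new byte[32];
        secureRandom.nextBytes(keyBytes);
        final SecretKey key = new SecretKeySpec(keyBytes, "AES");
        final byte[] iv = new byte[12];
        secureRandom.nextBytes(iv);

        final byte[] ciphertext = encrypt(key, iv, "confidential".getBytes(StandardCharsets.UTF_8));
        System.out.println("untouched accepted: " + decrypt(key, iv, ciphertext).isPresent());

        ciphertext[0] ^= 1; // flip a single bit "in transit"
        System.out.println("tampered accepted: " + decrypt(key, iv, ciphertext).isPresent());
    }
}
```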

Having the AES key and IV ready, we could create a cipher instance and actually perform the encryption part. Dealing with large files calls for streaming, therefore we use BufferedInputStream / BufferedOutputStream combined with CipherOutputStream for encryption.

public static void encrypt(SecretKey secretKey, byte[] iv, final File input, 
        final File output) throws Throwable {

    final Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
    final GCMParameterSpec parameterSpec = new GCMParameterSpec(128, iv);
    cipher.init(Cipher.ENCRYPT_MODE, secretKey, parameterSpec);

    try (final BufferedInputStream in = new BufferedInputStream(new FileInputStream(input))) {
        try (final BufferedOutputStream out = new BufferedOutputStream(new CipherOutputStream(new FileOutputStream(output), cipher))) {
            int length = 0;
            byte[] bytes = new byte[16 * 1024];

            while ((length = in.read(bytes)) != -1) {
                out.write(bytes, 0, length);
            }
        }
    }
}

Please note how we specify the GCM cipher parameters with a tag size of 128 bits and initialize the cipher in encryption mode (be aware of some GCM limitations when dealing with files over 64 GB). The decryption part is no different, except that the cipher is initialized in decryption mode.

public static void decrypt(SecretKey secretKey, byte[] iv, final File input, 
        final File output) throws Throwable {

    final Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
    final GCMParameterSpec parameterSpec = new GCMParameterSpec(128, iv);
    cipher.init(Cipher.DECRYPT_MODE, secretKey, parameterSpec);
        
    try (BufferedInputStream in = new BufferedInputStream(new CipherInputStream(new FileInputStream(input), cipher))) {
        try (BufferedOutputStream out = new BufferedOutputStream(new FileOutputStream(output))) {
            int length = 0;
            byte[] bytes = new byte[16 * 1024];
                
            while ((length = in.read(bytes)) != -1) {
                out.write(bytes, 0, length);
            }
        }
    }
}
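As for the 64 GB remark above, the limit follows from the GCM specification: NIST SP 800-38D caps the plaintext of a single invocation (one key / IV pair) at 2^39 - 256 bits, that is 2^32 - 2 blocks of 16 bytes. A tiny sketch of that arithmetic, with a hypothetical helper class, could serve as a guard before encrypting a file in one go.

```java
// Hypothetical helper for illustration: the per-invocation GCM plaintext limit.
final class GcmLimits {
    private static final long BLOCK_BYTES = 16;

    // NIST SP 800-38D: at most 2^32 - 2 blocks of plaintext per key / IV pair.
    static long maxPlaintextBytes() {
        final long maxBlocks = (1L << 32) - 2;
        return maxBlocks * BLOCK_BYTES;
    }

    static boolean fitsIntoSingleInvocation(long fileSizeBytes) {
        return fileSizeBytes >= 0 && fileSizeBytes <= maxPlaintextBytes();
    }

    public static void main(String[] args) {
        System.out.println(maxPlaintextBytes());                      // 68719476704, just under 64 GiB
        System.out.println(fitsIntoSingleInvocation(42L * 1024 * 1024)); // true
    }
}
```

Files beyond that size would have to be split into chunks, each encrypted with its own IV.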

It seems like we are done, right? Unfortunately, not really: encrypting and decrypting small files takes just a few moments, but dealing with more or less realistic data samples gives shocking results.

Almost 8 minutes to process a ~42 MB file (and, as you may guess, the larger the file, the longer it takes). A quick analysis reveals that most of that time is spent decrypting the data (please note that this is by no means a benchmark, merely a test). The search for possible culprits points to a long-standing list of reported issues with AES/GCM and CipherInputStream / CipherOutputStream in the JCA implementation.

So what are the alternatives? It seems possible to sacrifice CipherInputStream / CipherOutputStream, refactor the implementation to use ciphers directly and make the encryption / decryption work using JCA primitives. But arguably there is a better way: bringing in the battle-tested BouncyCastle library.
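For completeness, the first alternative (dropping the cipher streams and driving the JCA Cipher directly) might look roughly like the sketch below. The class and method names are hypothetical and the performance of this variant was not measured here. Two caveats apply: depending on the JDK version, the provider may hold decrypted data back until the tag is verified in doFinal(), so memory use can grow with the input size, and any output written before doFinal() succeeds should be treated as unauthenticated.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.security.GeneralSecurityException;

import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Hypothetical sketch: AES/GCM over streams without CipherInputStream / CipherOutputStream.
final class DirectCipher {
    // mode is Cipher.ENCRYPT_MODE or Cipher.DECRYPT_MODE
    static void transform(int mode, SecretKey key, byte[] iv, InputStream in, OutputStream out)
            throws IOException, GeneralSecurityException {
        final Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(mode, key, new GCMParameterSpec(128, iv));

        final byte[] buffer = new byte[16 * 1024];
        int length;
        while ((length = in.read(buffer)) != -1) {
            final byte[] chunk = cipher.update(buffer, 0, length);
            if (chunk != null) { // update() may return null when no output is ready yet
                out.write(chunk);
            }
        }

        // Emits the remaining data; appends (encryption) or verifies (decryption) the tag.
        out.write(cipher.doFinal());
    }

    // Convenience wrapper for in-memory data, mostly useful for trying the sketch out.
    static byte[] apply(int mode, SecretKey key, byte[] iv, byte[] data) {
        try (ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            transform(mode, key, iv, new ByteArrayInputStream(data), out);
            return out.toByteArray();
        } catch (IOException | GeneralSecurityException ex) {
            throw new IllegalStateException(ex);
        }
    }
}
```

For files, the same transform method could be called with buffered FileInputStream / FileOutputStream instances in place of the in-memory streams.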

From the implementation perspective, the BouncyCastle version looks mostly identical to the original JCA stream-based one. Indeed, although the class names are unchanged, the CipherOutputStream / CipherInputStream in the snippet below come from BouncyCastle.

public static void encrypt(SecretKey secretKey, byte[] iv, final File input, 
        final File output) throws Throwable {

    final GCMBlockCipher cipher = new GCMBlockCipher(new AESEngine());
    cipher.init(true, new AEADParameters(new KeyParameter(secretKey.getEncoded()), 128, iv));

    try (BufferedInputStream in = new BufferedInputStream(new FileInputStream(input))) {
        try (BufferedOutputStream out = new BufferedOutputStream(new CipherOutputStream(new FileOutputStream(output), cipher))) {
            int length = 0;
            byte[] bytes = new byte[16 * 1024];

            while ((length = in.read(bytes)) != -1) {
                out.write(bytes, 0, length);
            }
        }
    }
}

public static void decrypt(SecretKey secretKey, byte[] iv, final File input, 
        final File output) throws Throwable {

    final GCMBlockCipher cipher = new GCMBlockCipher(new AESEngine());
    cipher.init(false, new AEADParameters(new KeyParameter(secretKey.getEncoded()), 128, iv));

    try (BufferedInputStream in = new BufferedInputStream(new CipherInputStream(new FileInputStream(input), cipher))) {
        try (BufferedOutputStream out = new BufferedOutputStream(new FileOutputStream(output))) {
            int length = 0;
            byte[] bytes = new byte[16 * 1024];
                
            while ((length = in.read(bytes)) != -1) {
                out.write(bytes, 0, length);
            }
        }
    }
}

Re-running the previous encryption / decryption tests using BouncyCastle crypto primitives yields a completely different picture.

To be fair, file encryption / decryption on the JVM platform looked like a solved problem at first but turned out to be full of surprising discoveries. Nonetheless, thanks to BouncyCastle, some shortcomings of the JCA implementation are addressed in an efficient and clean way.

Please find the complete sources available on Github.

Tuesday, February 18, 2020

In praise of the thoughtful design: how property-based testing helps me to be a better developer

The developer's testing toolbox is one of those things that rarely stay unchanged. For sure, some testing practices have proven to be more valuable than others, but still, we are constantly looking for better, faster and more expressive ways to test our code. Property-based testing, largely unknown to the Java community, is yet another gem crafted by Haskell folks and described in the QuickCheck paper.

The power of this testing technique was quickly realized by the Scala community (where the ScalaCheck library was born) and many others, but the Java ecosystem showed little interest in adopting property-based testing for quite some time. Luckily, since the appearance of jqwik, things are slowly changing for the better.

For many, it is quite difficult to grasp what property-based testing is and how it could be exploited. The presentation Property-based Testing for Better Code by Jessica Kerr and the comprehensive An introduction to property-based testing and Property-based Testing Patterns series of articles are excellent sources to get you hooked, but in today's post we are going to explore the practical side of property-based testing for a typical Java developer using jqwik.

To start with, what does the name property-based testing actually imply? The first thought of every Java developer would be that it aims to test all getters and setters (hello, 100% coverage). Not really, although for some data structures it could be useful. Instead, we should identify the high-level characteristics, if you will, of the component, data structure, or even individual function and efficiently test them by formulating a hypothesis.

Our first example falls into the category "There and back again": serialization into and deserialization from a JSON representation. The class under test is the User POJO; although it is trivial, please notice that it has one temporal property of type OffsetDateTime.

public class User {
    private String username;
    @JsonFormat(pattern = "yyyy-MM-dd'T'HH:mm:ss[.SSS[SSS]]XXX", shape = Shape.STRING)
    private OffsetDateTime created;
    
    // ...
}

It is surprising to see how often the manipulation of date/time properties causes issues these days, since everyone tries to use their own representation. As you could spot, our contract uses the ISO-8601 interchange format with an optional milliseconds part. What we would like to make sure is that any valid instance of User can be serialized into JSON and deserialized back into a Java object without losing any date/time precision. As an exercise, let us try to express that in pseudo code first:

For any user
  Serialize user instance to JSON
  Deserialize user instance back from JSON
  Two user instances must be identical

Looks simple enough, but here comes the surprising part: let us take a look at how this pseudo code projects onto a real test case using the jqwik library. It gets as close to our pseudo code as it possibly could.

@Property
void serdes(@ForAll("users") User user) throws JsonProcessingException {
    final String json = serdes.serialize(user);

    assertThat(serdes.deserialize(json))
        .satisfies(other -> {
            assertThat(user.getUsername()).isEqualTo(other.getUsername());
            assertThat(user.getCreated().isEqual(other.getCreated())).isTrue();
        });
        
    Statistics.collect(user.getCreated().getOffset());
}

The test case reads very easily, almost naturally, but obviously there is some machinery hidden behind jqwik's @Property and @ForAll annotations. Let us start with @ForAll and clarify where all these User instances are coming from. As you may guess, these instances must be generated, preferably in a randomized fashion.
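To demystify the idea a little, here is what a hand-rolled, heavily simplified version of such a property could look like with nothing but the JDK: generate many random date/time values, run each of them through a "there and back again" round trip, and report the first counterexample. This is only a sketch under stated assumptions: it is not how jqwik works internally (there is no shrinking, edge-case handling or reporting), it uses the JDK's ISO_OFFSET_DATE_TIME formatter instead of the JSON serializer from the post, and all names are hypothetical.

```java
import java.time.Instant;
import java.time.OffsetDateTime;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Optional;
import java.util.Random;

// Hypothetical sketch of a hand-rolled property check, for illustration only.
final class RoundTripProperty {
    private static final long FROM = 1266258398000L; // same range as in the post
    private static final long TO = 1897410427000L;

    // Generator: a random instant in the range, seen from a randomly picked zone.
    static OffsetDateTime randomDate(Random random, List<String> zones) {
        final String zone = zones.get(random.nextInt(zones.size()));
        final long epochMilli = FROM + (long) (random.nextDouble() * (TO - FROM));
        return Instant.ofEpochMilli(epochMilli).atZone(ZoneId.of(zone)).toOffsetDateTime();
    }

    // Property: formatting and parsing back yields an identical value.
    static boolean roundTrips(OffsetDateTime value) {
        final String text = DateTimeFormatter.ISO_OFFSET_DATE_TIME.format(value);
        return OffsetDateTime.parse(text, DateTimeFormatter.ISO_OFFSET_DATE_TIME).equals(value);
    }

    // Runner: tries the property on many generated values, returns the first failure.
    static Optional<OffsetDateTime> findCounterexample(long seed, int tries) {
        final Random random = new Random(seed);
        final List<String> zones = new ArrayList<>(ZoneId.getAvailableZoneIds());
        Collections.sort(zones); // stable order, so a seed reproduces the same run

        for (int i = 0; i < tries; i++) {
            final OffsetDateTime candidate = randomDate(random, zones);
            if (!roundTrips(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(findCounterexample(42L, 1000)
            .map(failure -> "falsified by " + failure)
            .orElse("property held for all generated values"));
    }
}
```

A library like jqwik takes care of the generator plumbing, the seed handling and the reporting, which is exactly the part we do not want to maintain by hand.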

For most of the built-in data types jqwik has a rich set of data providers (Arbitraries), but since we are dealing with an application-specific class, we have to supply our own generation strategy. It should be able to emit User class instances with a wide range of usernames and date/time instants for different sets of timezones and offsets. Let us take a sneak peek at the provider implementation first and discuss it in detail right after.

@Provide
Arbitrary<User> users() {
    final Arbitrary<String> usernames = Arbitraries.strings().alpha().ofMaxLength(64);
 
    final Arbitrary<OffsetDateTime> dates = Arbitraries
        .of(List.copyOf(ZoneId.getAvailableZoneIds()))
        .flatMap(zone -> Arbitraries
            .longs()
            .between(1266258398000L, 1897410427000L) // ~ +/- 10 years
            .unique()
            .map(epochMilli -> Instant.ofEpochMilli(epochMilli))
            .map(instant -> OffsetDateTime.from(instant.atZone(ZoneId.of(zone)))));

    return Combinators
        .combine(usernames, dates)
        .as((username, created) -> new User(username).created(created));

}

The source of usernames is easy: just random strings. The source of dates basically could be any date/time between 2010 and 2030 whereas the timezone part (thus the offset) is randomly picked from all available region-based zone identifiers. For example, below are some samples jqwik came up with.

{"username":"zrAazzaDZ","created":"2020-05-06T01:36:07.496496+03:00"}
{"username":"AZztZaZZWAaNaqagPLzZiz","created":"2023-03-20T00:48:22.737737+08:00"}
{"username":"aazGZZzaoAAEAGZUIzaaDEm","created":"2019-03-12T08:22:12.658658+04:00"}
{"username":"Ezw","created":"2011-10-28T08:07:33.542542Z"}
{"username":"AFaAzaOLAZOjsZqlaZZixZaZzyZzxrda","created":"2022-07-09T14:04:20.849849+02:00"}
{"username":"aaYeZzkhAzAazJ","created":"2016-07-22T22:20:25.162162+06:00"}
{"username":"BzkoNGzBcaWcrDaaazzCZAaaPd","created":"2020-08-12T22:23:56.902902+08:45"}
{"username":"MazNzaTZZAEhXoz","created":"2027-09-26T17:12:34.872872+11:00"}
{"username":"zqZzZYamO","created":"2023-01-10T03:16:41.879879-03:00"}
{"username":"GaaUazzldqGJZsqksRZuaNAqzANLAAlj","created":"2015-03-19T04:16:24.098098Z"}
...

By default, jqwik will run the test against 1000 different sets of parameter values (randomized User instances). The quite helpful Statistics container allows you to collect whatever distribution insights you are curious about. For instance, why not collect the distribution by zone offsets?

    ...
    -04:00 (94) :  9.40 %
    -03:00 (76) :  7.60 %
    +02:00 (75) :  7.50 %
    -05:00 (74) :  7.40 %
    +01:00 (72) :  7.20 %
    +03:00 (69) :  6.90 %
    Z      (62) :  6.20 %
    -06:00 (54) :  5.40 %
    +11:00 (42) :  4.20 %
    -07:00 (39) :  3.90 %
    +08:00 (37) :  3.70 %
    +07:00 (34) :  3.40 %
    +10:00 (34) :  3.40 %
    +06:00 (26) :  2.60 %
    +12:00 (23) :  2.30 %
    +05:00 (23) :  2.30 %
    -08:00 (20) :  2.00 %
    ...    

Let us consider another example. Imagine that at some point we decided to reimplement equality for the User class (which in Java means overriding equals and hashCode) based on the username property. With that, for any pair of User class instances the following invariants must hold true:

  • if two User instances have the same username, they are equal and must have the same hash code
  • if two User instances have different usernames, they are not equal (but their hash codes are not necessarily different)

It is a perfect fit for property-based testing, and jqwik in particular makes such tests trivial to write and maintain.

@Provide
Arbitrary<String> usernames() {
    return Arbitraries.strings().alpha().ofMaxLength(64);
}

@Property
void equals(@ForAll("usernames") String username, @ForAll("usernames") String other) {
    Assume.that(!username.equals(other));
        
    assertThat(new User(username))
        .isEqualTo(new User(username))
        .isNotEqualTo(new User(other))
        .extracting(User::hashCode)
        .isEqualTo(new User(username).hashCode());
}

The assumptions expressed through Assume allow us to put additional constraints on the generated parameters. Since we introduce two sources of usernames, both of them could emit an identical username in the same run; without the assumption, the isNotEqualTo check would fail for such a pair, so those runs are discarded instead.
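For readers who would like to see the two invariants spelled out without any library, the sketch below checks them over randomly generated username pairs using only the JDK. The SketchUser class is a hypothetical stand-in for the User class from the post (its real equals / hashCode are not shown here), and pairs with identical usernames are treated as trivially satisfying the second invariant rather than being discarded the way Assume does.

```java
import java.util.Objects;
import java.util.Random;

// Hypothetical sketch: equals / hashCode invariants checked over random inputs.
final class EqualityProperty {
    // Stand-in for the User class, with equality based on username only.
    static final class SketchUser {
        private final String username;

        SketchUser(String username) {
            this.username = Objects.requireNonNull(username);
        }

        @Override
        public boolean equals(Object other) {
            return other instanceof SketchUser && ((SketchUser) other).username.equals(username);
        }

        @Override
        public int hashCode() {
            return username.hashCode();
        }
    }

    // Random alphabetic string of length 0..64, similar in spirit to the provider above.
    static String randomUsername(Random random) {
        final int length = random.nextInt(65);
        final StringBuilder builder = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            final char base = random.nextBoolean() ? 'a' : 'A';
            builder.append((char) (base + random.nextInt(26)));
        }
        return builder.toString();
    }

    // Both invariants for one pair of usernames.
    static boolean holds(String username, String other) {
        final SketchUser user = new SketchUser(username);
        final SketchUser twin = new SketchUser(username);
        final boolean sameNameEqual = user.equals(twin) && user.hashCode() == twin.hashCode();
        final boolean differentNameNotEqual =
            username.equals(other) || !user.equals(new SketchUser(other));
        return sameNameEqual && differentNameNotEqual;
    }

    static int countViolations(long seed, int tries) {
        final Random random = new Random(seed);
        int violations = 0;
        for (int i = 0; i < tries; i++) {
            if (!holds(randomUsername(random), randomUsername(random))) {
                violations++;
            }
        }
        return violations;
    }

    public static void main(String[] args) {
        System.out.println("violations: " + countViolations(7L, 1000)); // violations: 0
    }
}
```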

The question you may be asking by now is: what is the point? It is surely possible to test serialization / deserialization or equals / hashCode without embarking on property-based testing and using jqwik, so why even bother? Fair enough, but the answer to this question lies deep in how we approach the design of our software systems.

By and large, property-based testing is heavily influenced by functional programming, which, to put it mildly, is not the first thing that comes to mind with respect to Java (at least, not yet). The randomized generation of test data is not a novel idea per se; however, what property-based testing encourages you to do, at least in my opinion, is to think in more abstract terms and to focus not on individual operations (equals, compare, add, sort, serialize, ...) but on the properties, characteristics, laws and/or invariants they have to obey. It certainly feels like an alien technique, a paradigm shift if you will, and it encourages you to spend more time on designing the right thing. It does not mean that from now on all your tests must be property-based, but I believe it certainly deserves a place in the front row of our testing toolboxes.

Please find the complete project sources available on Github.

Saturday, November 30, 2019

Spring has you covered, again: consumer-driven contract testing for messaging continued

In the previous post we started to talk about consumer-driven contract testing in the context of message-based communications. In today's post, we are going to include yet another tool in our testing toolbox, but before that, let me do a quick refresher on the system under the microscope. It has two services, Order Service and Shipment Service. The Order Service publishes messages / events to the message queue and the Shipment Service consumes them from there.

The search for suitable test scaffolding led us to the discovery of the Pact framework (to be precise, Pact JVM). Pact offers simple and straightforward ways to write consumer and producer tests, leaving no excuse for not doing consumer-driven contract testing. But there is another player on the field, Spring Cloud Contract, and this is what we are going to discuss today.

To start with, Spring Cloud Contract fits best for JVM-based projects built on top of the terrific Spring portfolio (although you could make it work in polyglot scenarios as well). In addition, the collaboration flow that Spring Cloud Contract adopts is slightly different from the one Pact taught us, which is not necessarily a bad thing. Let us get straight to the point.

Since we are limiting the scope to messaging only, the first thing Spring Cloud Contract asks us to do is to define the messaging contract specification, written using the convenient Groovy Contract DSL.

package contracts

org.springframework.cloud.contract.spec.Contract.make {
    name "OrderConfirmed Event"
    label 'order'
    
    input {
        triggeredBy('createOrder()')
    }
    
    outputMessage {
        sentTo 'orders'
        
        body([
            orderId: $(anyUuid()),
            paymentId: $(anyUuid()),
            amount: $(anyDouble()),
            street: $(anyNonBlankString()),
            city: $(anyNonBlankString()),
            state: $(regex('[A-Z]{2}')),
            zip: $(regex('[0-9]{5}')),
            country: $(anyOf('USA','Mexico'))
        ])
        
        headers {
            header('Content-Type', 'application/json')
        }
    }
}

It closely resembles the Pact specifications we are already familiar with (if you are not a big fan of Groovy, there is no real need to learn it in order to use Spring Cloud Contract). The interesting parts here are the triggeredBy and sentTo blocks: basically, those outline how the message is produced (or triggered) and where it should land (channel or queue name), respectively. In this case, createOrder() is just a method name which we have to provide the implementation for.

package com.example.order;

import java.math.BigDecimal;
import java.util.UUID;

import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.cloud.contract.verifier.messaging.boot.AutoConfigureMessageVerifier;
import org.springframework.integration.support.MessageBuilder;
import org.springframework.messaging.MessageChannel;
import org.springframework.test.context.junit4.SpringRunner;

import com.example.order.event.OrderConfirmed;

@RunWith(SpringRunner.class)
@SpringBootTest
@AutoConfigureMessageVerifier
public class OrderBase {
    @Autowired private MessageChannel orders;
    
    public void createOrder() {
        final OrderConfirmed order = new OrderConfirmed();
        order.setOrderId(UUID.randomUUID());
        order.setPaymentId(UUID.randomUUID());
        order.setAmount(new BigDecimal("102.32"));
        order.setStreet("1203 Westmisnter Blvrd");
        order.setCity("Westminster");
        order.setCountry("USA");
        order.setState("MI");
        order.setZip("92239");

        orders.send(
            MessageBuilder
                .withPayload(order)
                .setHeader("Content-Type", "application/json")
                .build());
    }
}
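To see what the matchers in the contract body ask for, it may help to check the values set in createOrder() against them by hand. The sketch below does that with plain JDK regular expressions. Please treat the patterns for anyUuid(), anyDouble() and anyNonBlankString() as approximations written for this illustration, not as the patterns Spring Cloud Contract actually uses; the class name, method names and the sample UUIDs are hypothetical as well.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.regex.Pattern;

// Hypothetical sketch: approximating the contract's field matchers with JDK regexes.
final class ContractMatcherSketch {
    private static final Map<String, Predicate<String>> RULES = new LinkedHashMap<>();

    static {
        // Approximations of the DSL matchers, not Spring Cloud Contract's own patterns.
        final Predicate<String> uuid = matching("[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}");
        final Predicate<String> nonBlank = value -> !value.trim().isEmpty();

        RULES.put("orderId", uuid);
        RULES.put("paymentId", uuid);
        RULES.put("amount", matching("-?[0-9]+(\\.[0-9]+)?"));
        RULES.put("street", nonBlank);
        RULES.put("city", nonBlank);
        RULES.put("state", matching("[A-Z]{2}"));   // taken verbatim from the contract
        RULES.put("zip", matching("[0-9]{5}"));     // taken verbatim from the contract
        RULES.put("country", value -> value.equals("USA") || value.equals("Mexico"));
    }

    private static Predicate<String> matching(String regex) {
        final Pattern pattern = Pattern.compile(regex);
        return value -> pattern.matcher(value).matches();
    }

    // Returns the names of the fields which are missing or violate their rule.
    static List<String> violations(Map<String, String> message) {
        final List<String> failed = new ArrayList<>();
        RULES.forEach((field, rule) -> {
            final String value = message.get(field);
            if (value == null || !rule.test(value)) {
                failed.add(field);
            }
        });
        return failed;
    }

    // The values from createOrder(), with fixed UUIDs made up for the example.
    static Map<String, String> sampleMessage() {
        final Map<String, String> message = new LinkedHashMap<>();
        message.put("orderId", "123e4567-e89b-12d3-a456-426614174000");
        message.put("paymentId", "123e4567-e89b-12d3-a456-426614174001");
        message.put("amount", "102.32");
        message.put("street", "1203 Westmisnter Blvrd");
        message.put("city", "Westminster");
        message.put("state", "MI");
        message.put("zip", "92239");
        message.put("country", "USA");
        return message;
    }

    public static void main(String[] args) {
        System.out.println(violations(sampleMessage())); // []
    }
}
```

In the real setup none of this has to be written by hand: the producer-side tests that perform such checks are generated from the contract.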

There is one small detail left out though: these contracts are managed by providers (or, better to say, producers), not consumers. Not only that, the producers are responsible for publishing all the stubs for consumers so that the consumers are able to write their tests against them. It is certainly a different path than the one Pact takes, but on the bright side, the test suite for producers is 100% generated by the Apache Maven / Gradle plugins.

<plugin>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-contract-maven-plugin</artifactId>
    <version>2.1.4.RELEASE</version>
    <extensions>true</extensions>
    <configuration>
        <packageWithBaseClasses>com.example.order</packageWithBaseClasses>
    </configuration>
</plugin>

As you may have noticed, the plugin assumes that the base test classes (the ones which have to provide the createOrder() method implementation) are located in the com.example.order package, exactly where we have placed the OrderBase class. To complete the setup, we need to add a few dependencies to our pom.xml file.


<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-dependencies</artifactId>
            <version>Greenwich.SR4</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-dependencies</artifactId>
            <version>2.1.10.RELEASE</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>

<dependencies>
    <dependency>
        <groupId>org.springframework.cloud</groupId>
        <artifactId>spring-cloud-starter-contract-verifier</artifactId>
        <scope>test</scope>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-test</artifactId>
        <scope>test</scope>
    </dependency>
</dependencies>

And we are done with the producer side! If we run mvn clean install right now, two things are going to happen. First, you will notice that some tests were run and passed although we wrote none: these were generated on our behalf.

-------------------------------------------------------
 T E S T S
-------------------------------------------------------
Running com.example.order.OrderTest

....

Results :

Tests run: 1, Failures: 0, Errors: 0, Skipped: 0

And secondly, the stubs for consumers are going to be generated (and published) as well (in this case, bundled into order-service-messaging-contract-tests-0.0.1-SNAPSHOT-stubs.jar).

...
[INFO]
[INFO] --- spring-cloud-contract-maven-plugin:2.1.4.RELEASE:generateStubs (default-generateStubs) @ order-service-messaging-contract-tests ---
[INFO] Files matching this pattern will be excluded from stubs generation []
[INFO] Building jar: order-service-messaging-contract-tests-0.0.1-SNAPSHOT-stubs.jar
[INFO]
....

Awesome, so we have the messaging contract specification and stubs published; the ball is now in the consumer's court, the Shipment Service. Probably the trickiest part for the consumer would be to configure the messaging integration library of choice. In our case it is going to be Spring Cloud Stream, although other integrations are also available.

The fastest way to understand how Spring Cloud Contract works on the consumer side is to start from the end and to look at the complete sample test suite first.

@RunWith(SpringRunner.class)
@SpringBootTest
@AutoConfigureMessageVerifier
@AutoConfigureStubRunner(
    ids = "com.example:order-service-messaging-contract-tests:+:stubs", 
    stubsMode = StubRunnerProperties.StubsMode.LOCAL
)
public class OrderMessagingContractTest {
    @Autowired private MessageVerifier<Message<?>> verifier;
    @Autowired private StubFinder stubFinder;

    @Test
    public void testOrderConfirmed() throws Exception {
        stubFinder.trigger("order");
        
        final Message<?> message = verifier.receive("orders");
        assertThat(message, notNullValue());
        assertThat(message.getPayload(), isJson(
            allOf(List.of(
                withJsonPath("$.orderId"),
                withJsonPath("$.paymentId"),
                withJsonPath("$.amount"),
                withJsonPath("$.street"),
                withJsonPath("$.city"),
                withJsonPath("$.state"),
                withJsonPath("$.zip"),
                withJsonPath("$.country")
            ))));
    }
}

At the top, @AutoConfigureStubRunner references the stubs published by the producer, effectively the ones from the order-service-messaging-contract-tests-0.0.1-SNAPSHOT-stubs.jar archive. The StubFinder helps us pick the right stub for the test case and trigger a particular messaging contract verification flow by calling stubFinder.trigger("order"). The value "order" is not arbitrary: it should match the label assigned to the contract specification, which in our case is defined as:

package contracts

org.springframework.cloud.contract.spec.Contract.make {
    ...
    label 'order'
    ...
}

With that, the test should look simple and straightforward: trigger the flow, then verify that the message has been placed into the messaging channel and satisfies the consumer's expectations. From the configuration standpoint, we only need to provide this messaging channel to run the tests against.

@SpringBootConfiguration
public class OrderMessagingConfiguration {
    @Bean
    PollableChannel orders() {
        return MessageChannels.queue().get();
    }
}

And again, the name of the bean, orders, is not a random pick: it has to match the destination from the contract specification:

package contracts

org.springframework.cloud.contract.spec.Contract.make {
    ...
    outputMessage {
        sentTo 'orders'
        ...
    }
    ...
}
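To make the role of the two names concrete, here is a minimal plain-Java sketch of the idea: a label selects a contract, and the contract's destination selects the channel the message lands in. It is a simplified model under our own assumptions, not how Spring Cloud Contract is implemented; the class StubTriggerSketch, its Contract holder and the sample payload are all hypothetical.

```java
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

// Simplified model of a stub runner; NOT Spring Cloud Contract's implementation.
public class StubTriggerSketch {
    // A contract reduced to the two names that must line up with the test.
    static final class Contract {
        final String label;
        final String destination;
        final String payload;

        Contract(String label, String destination, String payload) {
            this.label = label;
            this.destination = destination;
            this.payload = payload;
        }
    }

    private final Map<String, Contract> contractsByLabel = new ConcurrentHashMap<>();
    private final Map<String, Queue<String>> channels = new ConcurrentHashMap<>();

    void register(Contract contract) {
        contractsByLabel.put(contract.label, contract);
    }

    // Looks the contract up by label and publishes its payload to the destination.
    boolean trigger(String label) {
        Contract contract = contractsByLabel.get(label);
        if (contract == null) {
            return false; // unknown label: nothing is sent
        }
        channel(contract.destination).add(contract.payload);
        return true;
    }

    // Takes the next message from a named channel; null means nothing arrived.
    String receive(String destination) {
        return channel(destination).poll();
    }

    private Queue<String> channel(String name) {
        return channels.computeIfAbsent(name, n -> new ConcurrentLinkedQueue<>());
    }

    public static void main(String[] args) {
        StubTriggerSketch stubs = new StubTriggerSketch();
        stubs.register(new Contract("order", "orders", "{\"city\":\"Westminster\"}"));
        stubs.trigger("order");
        System.out.println(stubs.receive("orders"));    // the registered payload
        System.out.println(stubs.receive("shipments")); // null, wrong channel name
    }
}
```

If either name drifts (a different label in the test or a differently named channel bean), the message simply never shows up where the test is looking, which is exactly the kind of failure the real test would report.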

Last but not least, let us enumerate the dependencies which are required on consumer side (luckily, there is no need to use any additional Apache Maven or Gradle plugins).

<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-dependencies</artifactId>
            <version>Greenwich.SR4</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>

<dependencies>
    <dependency>
        <groupId>org.springframework.cloud</groupId>
        <artifactId>spring-cloud-starter-contract-verifier</artifactId>
        <scope>test</scope>
    </dependency>
    <dependency>
        <groupId>org.springframework.cloud</groupId>
        <artifactId>spring-cloud-starter-contract-stub-runner</artifactId>
        <scope>test</scope>
    </dependency>
    <dependency>
        <groupId>org.springframework.cloud</groupId>
        <artifactId>spring-cloud-stream</artifactId>
        <version>2.2.1.RELEASE</version>
        <type>test-jar</type>
        <scope>test</scope>
        <classifier>test-binder</classifier>
    </dependency>
</dependencies>

A quick note here. The last dependency is quite an important piece of the puzzle, it brings the integration of the Spring Cloud Stream with Spring Cloud Contract. With that, the consumers are all set.

-------------------------------------------------------
 T E S T S
-------------------------------------------------------
Running com.example.order.OrderMessagingContractTest

...

Results :

Tests run: 1, Failures: 0, Errors: 0, Skipped: 0

To close the loop, we should look back at one of the core promises of consumer-driven contract testing: allowing the producers to evolve the contracts without breaking the consumers. Practically, that means consumers may contribute their tests back to the producers, although the importance of doing that is less of a concern with Spring Cloud Contract. The reason is simple: the producers are the ones who write the message contract specifications first, and the tests generated out of these specifications are expected to fail against any breaking change. Nonetheless, there are a number of benefits for producers in knowing how the consumers use their messages, so please give it some thought.

Hopefully, it was an interesting subject to discuss. Spring Cloud Contract brings a somewhat different perspective on applying consumer-driven contract testing to messaging. It is an appealing alternative to Pact JVM, especially if your applications and services already rely on Spring projects.

As always, the complete project sources are available on GitHub.

Thursday, October 31, 2019

Tell us what you want and we will make it so: consumer-driven contract testing for messaging

Quite some time ago we talked about consumer-driven contract testing from the perspective of REST(ful) web APIs in general and their projection into Java (the JAX-RS 2.0 specification) in particular. It would be fair to say that REST still dominates the web API landscape, at least with respect to public APIs; however, the shift towards microservices and/or service-based architecture is changing the alignment of forces very fast. One such disruptive trend is messaging.

Modern REST(ful) APIs are implemented mostly over the HTTP 1.1 protocol and are constrained by its request/response communication style. HTTP/2 is here to help out, but still, not every use case fits into this communication model. Often the job could be performed asynchronously and the fact of its completion could be broadcast to interested parties later on. This is how most things work in real life, and messaging is a perfect answer to that.

The messaging space is really crowded, with an astonishing number of message brokers and brokerless options available. We are not going to talk about that, focusing instead on another tricky subject: the message contracts. Once the producer emits a message or event, it lands in the queue/topic/channel, ready to be consumed. It is here to stay for some time. Obviously, the producer knows what it publishes, but what about consumers? How would they know what to expect?

At this moment, many of us would scream: use schema-based serialization! And indeed, Apache Avro, Apache Thrift, Protocol Buffers, MessagePack, ... are here to address that. At the end of the day, such messages and events become part of the provider contract, along with the REST(ful) web APIs if any, and have to be communicated and evolved over time without breaking the consumers. But ... you would be surprised to know how many organizations found their nirvana in JSON and use it to pass messages and events around, throwing such blobs at consumers, no schema whatsoever! In this post we are going to look at how the consumer-driven contract testing technique could help us in such a situation.

Let us consider a simple system with two services, Order Service and Shipment Service. The Order Service publishes the messages / events to the message queue and Shipment Service consumes them from there.

Since Order Service is implemented in Java, the events are just POJO classes, serialized into JSON before arriving at the message broker using one of the numerous libraries out there. OrderConfirmed is one such event.

public class OrderConfirmed {
    private UUID orderId;
    private UUID paymentId;
    private BigDecimal amount;
    private String street;
    private String city;
    private String state;
    private String zip;
    private String country;

    // Getters and setters omitted for brevity
}

As often happens, the Shipment Service team was handed a sample JSON snippet, or pointed to some piece of documentation or a reference Java class, and that is basically it. How could the Shipment Service team kick off the integration while being sure their interpretation is correct and the message data they need will not suddenly disappear? Consumer-driven contract testing to the rescue!

The Shipment Service team could (and should) start off by writing test cases against the OrderConfirmed message, embedding the knowledge they have, and our old friend the Pact framework (to be precise, Pact JVM) is the right tool for that. So what might the test case look like?

public class OrderConfirmedConsumerTest {
    private static final String PROVIDER_ID = "Order Service";
    private static final String CONSUMER_ID = "Shipment Service";
    
    @Rule
    public MessagePactProviderRule provider = new MessagePactProviderRule(this);
    private byte[] message;

    @Pact(provider = PROVIDER_ID, consumer = CONSUMER_ID)
    public MessagePact pact(MessagePactBuilder builder) {
        return builder
            .given("default")
            .expectsToReceive("an Order confirmation message")
            .withMetadata(Map.of("Content-Type", "application/json"))
            .withContent(new PactDslJsonBody()
                .uuid("orderId")
                .uuid("paymentId")
                .decimalType("amount")
                .stringType("street")
                .stringType("city")
                .stringType("state")
                .stringType("zip")
                .stringType("country"))
            .toPact();
    }

    @Test
    @PactVerification(PROVIDER_ID)
    public void test() throws Exception {
        Assert.assertNotNull(message);
    }

    public void setMessage(byte[] messageContents) {
        message = messageContents;
    }
}

It is exceptionally simple and straightforward, with no boilerplate added. The test case is designed right from the JSON representation of the OrderConfirmed message. But we are only halfway through: the Shipment Service team should somehow contribute their expectations back to the Order Service so the producer can keep track of who consumes the OrderConfirmed message and how. The Pact test harness takes care of that by generating the pact files (sets of agreements, or pacts) out of each JUnit test case into the 'target/pacts' folder. Below is an example of the generated Shipment Service-Order Service.json pact file after running the OrderConfirmedConsumerTest test suite.

{
  "consumer": {
    "name": "Shipment Service"
  },
  "provider": {
    "name": "Order Service"
  },
  "messages": [
    {
      "description": "an Order confirmation message",
      "metaData": {
        "contentType": "application/json"
      },
      "contents": {
        "zip": "string",
        "country": "string",
        "amount": 100,
        "orderId": "e2490de5-5bd3-43d5-b7c4-526e33f71304",
        "city": "string",
        "paymentId": "e2490de5-5bd3-43d5-b7c4-526e33f71304",
        "street": "string",
        "state": "string"
      },
      "providerStates": [
        {
          "name": "default"
        }
      ],
      "matchingRules": {
        "body": {
          "$.orderId": {
            "matchers": [
              {
                "match": "regex",
                "regex": "[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
              }
            ],
            "combine": "AND"
          },
          "$.paymentId": {
            "matchers": [
              {
                "match": "regex",
                "regex": "[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
              }
            ],
            "combine": "AND"
          },
          "$.amount": {
            "matchers": [
              {
                "match": "decimal"
              }
            ],
            "combine": "AND"
          },
          "$.street": {
            "matchers": [
              {
                "match": "type"
              }
            ],
            "combine": "AND"
          },
          "$.city": {
            "matchers": [
              {
                "match": "type"
              }
            ],
            "combine": "AND"
          },
          "$.state": {
            "matchers": [
              {
                "match": "type"
              }
            ],
            "combine": "AND"
          },
          "$.zip": {
            "matchers": [
              {
                "match": "type"
              }
            ],
            "combine": "AND"
          },
          "$.country": {
            "matchers": [
              {
                "match": "type"
              }
            ],
            "combine": "AND"
          }
        }
      }
    }
  ],
  "metadata": {
    "pactSpecification": {
      "version": "3.0.0"
    },
    "pact-jvm": {
      "version": "4.0.2"
    }
  }
}
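The matchingRules section is what makes the pact more than a frozen sample: each field is checked by kind rather than by exact value. The plain-Java sketch below shows one way such rules could be evaluated against a message. It is a simplified model and not Pact's implementation; the class MatchingRulesSketch, its method names and the way "decimal" is approximated are our assumptions.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.regex.Pattern;

// Simplified model of the "regex", "decimal" and "type" matchers; NOT Pact's implementation.
public class MatchingRulesSketch {
    private static final Pattern UUID_PATTERN = Pattern.compile(
        "[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}");

    // One rule per field, following the fields of the OrderConfirmed message.
    static Map<String, Predicate<Object>> rules() {
        Predicate<Object> uuid = v -> v instanceof String
            && UUID_PATTERN.matcher((String) v).matches();
        // Assumption: "decimal" is approximated as a floating point or BigDecimal value.
        Predicate<Object> decimal = v -> v instanceof BigDecimal
            || v instanceof Double || v instanceof Float;
        // "type" only checks the kind of value (here: a string), not its content.
        Predicate<Object> string = v -> v instanceof String;

        Map<String, Predicate<Object>> rules = new LinkedHashMap<>();
        rules.put("orderId", uuid);
        rules.put("paymentId", uuid);
        rules.put("amount", decimal);
        rules.put("street", string);
        rules.put("city", string);
        rules.put("state", string);
        rules.put("zip", string);
        rules.put("country", string);
        return rules;
    }

    // Returns one entry per missing or mismatching field; extra fields are ignored.
    static List<String> violations(Map<String, Object> message) {
        List<String> problems = new ArrayList<>();
        for (Map.Entry<String, Predicate<Object>> rule : rules().entrySet()) {
            Object value = message.get(rule.getKey());
            if (value == null) {
                problems.add("missing: " + rule.getKey());
            } else if (!rule.getValue().test(value)) {
                problems.add("mismatch: " + rule.getKey());
            }
        }
        return problems;
    }
}
```

Note that fields the consumer did not ask for are ignored in this sketch, so the producer stays free to add new ones, while removing or retyping a field the consumer relies on is reported.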

The next step for Shipment Service team is to share this pact file with Order Service team so these guys could run the provider-side Pact verifications as part of their test suites.

@RunWith(PactRunner.class)
@Provider(OrderServicePactsTest.PROVIDER_ID)
@PactFolder("pacts") 
public class OrderServicePactsTest {
    public static final String PROVIDER_ID = "Order Service";

    @TestTarget
    public final Target target = new AmqpTarget();
    private ObjectMapper objectMapper;
    
    @Before
    public void setUp() {
        objectMapper = new ObjectMapper();
    }

    @State("default")
    public void toDefaultState() {
    }
    
    @PactVerifyProvider("an Order confirmation message")
    public String verifyOrderConfirmed() throws JsonProcessingException {
        final OrderConfirmed order = new OrderConfirmed();
        
        order.setOrderId(UUID.randomUUID());
        order.setPaymentId(UUID.randomUUID());
        order.setAmount(new BigDecimal("102.33"));
        order.setStreet("1203 Westmisnter Blvrd");
        order.setCity("Westminster");
        order.setCountry("USA");
        order.setState("MI");
        order.setZip("92239");

        return objectMapper.writeValueAsString(order);
    }
}

The test harness picks up all the pact files from the @PactFolder and runs the tests against the @TestTarget. In this case we are wiring AmqpTarget, provided out of the box, but you could plug in your own specific target easily.

And this is basically it! The consumer (Shipment Service) has its expectations expressed in the test cases and shared with the producer (Order Service) in the shape of pact files. The producer has its own set of tests to make sure its model matches the consumers' view. Both sides can continue to evolve independently, and trust each other, as long as the pacts are not denounced (hopefully, never).

To be fair, Pact is not the only choice for doing consumer-driven contract testing; in the upcoming post (already in the works) we are going to talk about yet another excellent option, Spring Cloud Contract.

As for today, the complete project sources are available on GitHub.

Thursday, July 18, 2019

Testing Spring Boot conditionals the sane way

If you are a more or less experienced Spring Boot user, it is very likely that at some point you will run into a situation where particular beans or configurations have to be injected conditionally. The mechanics are well understood, but testing such conditions (and their combinations) can get messy. In this post we are going to talk about some possible (arguably, sane) ways to approach that.

Since Spring Boot 1.5.x is still widely used (even though it is racing towards its EOL this August), we will cover it along with Spring Boot 2.1.x, both with JUnit 4.x and JUnit 5.x. The techniques we are about to cover are equally applicable to regular configuration classes and auto-configuration classes.

The example we will be playing with would be related to our home-made logging. Let us assume our Spring Boot application requires some bean for dedicated logger with name "sample". In certain circumstances however this logger has to be disabled (or become effectively a noop), so the property logging.enabled serves like a kill switch here. We use Slf4j and Logback in this example, but it is not really important. The LoggingConfiguration snippet below reflects this idea.

@Configuration
public class LoggingConfiguration {
    @Configuration
    @ConditionalOnProperty(name = "logging.enabled", matchIfMissing = true)
    public static class Slf4jConfiguration {
        @Bean
        Logger logger() {
            return LoggerFactory.getLogger("sample");
        }
    }
    
    @Bean
    @ConditionalOnMissingBean
    Logger logger() {
        return new NOPLoggerFactory().getLogger("sample"); 
    }
}
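Read as a decision table, the configuration above boils down to a single check on logging.enabled. The plain-Java sketch below mirrors how we understand @ConditionalOnProperty to behave when no havingValue is given: a missing property falls back to matchIfMissing, and a present one matches unless it equals "false". The class ConditionalSketch and its method names are hypothetical and exist only for illustration.

```java
import java.util.Map;

// Illustration of the decision the conditional configuration encodes; not Spring Boot code.
public class ConditionalSketch {
    // Mirrors @ConditionalOnProperty(name = ..., matchIfMissing = ...) without havingValue:
    // a missing property yields matchIfMissing, a present one matches unless it is "false".
    static boolean matches(Map<String, String> properties, String name, boolean matchIfMissing) {
        String value = properties.get(name);
        if (value == null) {
            return matchIfMissing;
        }
        return !"false".equalsIgnoreCase(value);
    }

    // The Slf4j logger wins when the condition matches; otherwise the no-op fallback is used.
    static String selectLogger(Map<String, String> properties) {
        return matches(properties, "logging.enabled", true) ? "slf4j" : "noop";
    }

    public static void main(String[] args) {
        System.out.println(selectLogger(Map.of()));                           // slf4j
        System.out.println(selectLogger(Map.of("logging.enabled", "true")));  // slf4j
        System.out.println(selectLogger(Map.of("logging.enabled", "false"))); // noop
    }
}
```

Those three rows are the combinations the tests in the rest of the post need to cover.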

So how would we test that? Spring Boot (and Spring Framework in general) has always offered outstanding test scaffolding support. The @SpringBootTest and @TestPropertySource annotations make it quick to bootstrap the application context with customized properties. There is one issue though: they are applied at the test class level, not per test method. It certainly makes sense, but basically requires you to create a test class per combination of conditionals.

If you are still on JUnit 4.x, there is one trick you may find useful which exploits the Enclosed runner, a hidden gem of the framework.

@RunWith(Enclosed.class)
public class LoggingConfigurationTest {
    @RunWith(SpringRunner.class)
    @SpringBootTest
    public static class LoggerEnabledTest {
        @Autowired private Logger logger;
        
        @Test
        public void loggerShouldBeSlf4j() {
            assertThat(logger).isInstanceOf(ch.qos.logback.classic.Logger.class);
        }
    }
    
    @RunWith(SpringRunner.class)
    @SpringBootTest
    @TestPropertySource(properties = "logging.enabled=false")
    public static class LoggerDisabledTest {
        @Autowired private Logger logger;
        
        @Test
        public void loggerShouldBeNoop() {
            assertThat(logger).isSameAs(NOPLogger.NOP_LOGGER);
        }
    }
}

You still have a class per condition, but at least they are all in the same nest. With JUnit 5.x, some things got easier, but not to the level one might expect. Unfortunately, Spring Boot 1.5.x does not support JUnit 5.x natively, so we have to rely on the extension provided by the spring-test-junit5 community module. Here are the relevant changes in pom.xml; please notice that junit is explicitly excluded from the spring-boot-starter-test dependency graph.

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-test</artifactId>
    <scope>test</scope>
    <exclusions>
        <exclusion>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
        </exclusion>
    </exclusions>
</dependency>

<dependency>
    <groupId>com.github.sbrannen</groupId>
    <artifactId>spring-test-junit5</artifactId>
    <version>1.5.0</version>
    <scope>test</scope>
</dependency>

<dependency>
    <groupId>org.junit.jupiter</groupId>
    <artifactId>junit-jupiter-api</artifactId>
    <version>5.5.0</version>
    <scope>test</scope>
</dependency>

<dependency>
    <groupId>org.junit.jupiter</groupId>
    <artifactId>junit-jupiter-engine</artifactId>
    <version>5.5.0</version>
    <scope>test</scope>
</dependency>

The test case itself is not very different besides usage of the @Nested annotation, which comes from JUnit 5.x to support tests as inner classes.

public class LoggingConfigurationTest {
    @Nested
    @ExtendWith(SpringExtension.class)
    @SpringBootTest
    @DisplayName("Logging is enabled, expecting Slf4j logger")
    public static class LoggerEnabledTest {
        @Autowired private Logger logger;
        
        @Test
        public void loggerShouldBeSlf4j() {
            assertThat(logger).isInstanceOf(ch.qos.logback.classic.Logger.class);
        }
    }
    
    @Nested
    @ExtendWith(SpringExtension.class)
    @SpringBootTest
    @TestPropertySource(properties = "logging.enabled=false")
    @DisplayName("Logging is disabled, expecting NOOP logger")
    public static class LoggerDisabledTest {
        @Autowired private Logger logger;
        
        @Test
        public void loggerShouldBeNoop() {
            assertThat(logger).isSameAs(NOPLogger.NOP_LOGGER);
        }
    }
}

If you try to run the tests from the command line using Apache Maven and Maven Surefire plugin, you might be surprised to see that none of them were executed during the build. The issue is that ... all nested classes are excluded ... so we need to put in place another workaround.

<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-surefire-plugin</artifactId>
    <version>2.22.2</version>
    <configuration>
        <excludes>
            <exclude />
        </excludes>
    </configuration>
</plugin>
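To see why the nested test classes get caught, recall that the compiler gives every nested class a binary name of the form Outer$Inner, and, as we understand the plugin documentation, the default exclusion is a pattern along the lines of **/*$*. The small sketch below only demonstrates the naming part; the class NestedClassNames and the looksNested check (a rough stand-in for the glob) are hypothetical.

```java
// Illustration only: shows the binary names javac gives to nested classes.
public class NestedClassNames {
    // Stand-in for a nested test class such as LoggerEnabledTest.
    static class LoggerEnabledTest {
    }

    // Name of the compiled class file, e.g. Outer$Inner.class for a nested class.
    static String classFileName(Class<?> type) {
        return type.getName() + ".class";
    }

    // Rough approximation of a "*$*" glob: any file name containing a dollar sign.
    static boolean looksNested(String fileName) {
        return fileName.contains("$");
    }

    public static void main(String[] args) {
        String nested = classFileName(LoggerEnabledTest.class);
        String topLevel = classFileName(NestedClassNames.class);
        System.out.println(nested + " -> " + looksNested(nested));
        System.out.println(topLevel + " -> " + looksNested(topLevel));
    }
}
```

The empty exclude element in the plugin configuration overrides that default, so class files with a dollar sign in their names are considered again.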

With that, things should be rolling smoothly. But enough about legacy: Spring Boot 2.1.x comes as a complete game changer. The family of context runners, ApplicationContextRunner, ReactiveWebApplicationContextRunner and WebApplicationContextRunner, provides an easy and straightforward way to tailor the context at the per test method level, keeping the test executions incredibly fast.

public class LoggingConfigurationTest {
    private final ApplicationContextRunner runner = new ApplicationContextRunner()
        .withConfiguration(UserConfigurations.of(LoggingConfiguration.class));
    
    @Test
    public void loggerShouldBeSlf4j() {
        runner
            .run(ctx -> 
                assertThat(ctx.getBean(Logger.class)).isInstanceOf(ch.qos.logback.classic.Logger.class)
            );
    }
    
    @Test
    public void loggerShouldBeNoop() {
        runner
            .withPropertyValues("logging.enabled=false")
            .run(ctx -> 
                assertThat(ctx.getBean(Logger.class)).isSameAs(NOPLogger.NOP_LOGGER)
            );
    }
}
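One detail worth spelling out is why a single runner field can be shared by both tests: to our understanding, each with... call hands back a new runner instead of changing the existing one, so the property set in one test does not leak into another. The toy class below models that copy-on-write behaviour in plain Java; ImmutableRunnerSketch is hypothetical and is not Spring Boot's ApplicationContextRunner.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;

// Toy model of a copy-on-write runner; NOT Spring Boot's ApplicationContextRunner.
public class ImmutableRunnerSketch {
    private final Map<String, String> properties;

    ImmutableRunnerSketch() {
        this(Collections.emptyMap());
    }

    private ImmutableRunnerSketch(Map<String, String> properties) {
        this.properties = properties;
    }

    // Returns a new runner carrying the extra "key=value" pair; this one stays untouched.
    ImmutableRunnerSketch withPropertyValues(String pair) {
        String[] parts = pair.split("=", 2);
        Map<String, String> copy = new LinkedHashMap<>(properties);
        copy.put(parts[0], parts.length > 1 ? parts[1] : "");
        return new ImmutableRunnerSketch(Collections.unmodifiableMap(copy));
    }

    // Hands the effective properties to the callback, standing in for a started context.
    void run(Consumer<Map<String, String>> assertions) {
        assertions.accept(properties);
    }

    public static void main(String[] args) {
        ImmutableRunnerSketch base = new ImmutableRunnerSketch();
        ImmutableRunnerSketch disabled = base.withPropertyValues("logging.enabled=false");
        base.run(p -> System.out.println(p.get("logging.enabled")));     // null
        disabled.run(p -> System.out.println(p.get("logging.enabled"))); // false
    }
}
```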

It looks really great. The JUnit 5.x support in Spring Boot 2.1.x is much better, and with the upcoming 2.2 release JUnit 5.x will be the default engine (not to worry, the old JUnit 4.x will still be supported). As of now, the switch to JUnit 5.x needs a bit of work on the dependencies side.

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-test</artifactId>
    <scope>test</scope>
    <exclusions>
        <exclusion>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
        </exclusion>
    </exclusions>
</dependency>

<dependency>
    <groupId>org.junit.jupiter</groupId>
    <artifactId>junit-jupiter-api</artifactId>
    <scope>test</scope>
</dependency>

<dependency>
    <groupId>org.junit.jupiter</groupId>
    <artifactId>junit-jupiter-engine</artifactId>
    <scope>test</scope>
</dependency>

As an additional step, you may need to use a recent Maven Surefire plugin, 2.22.0 or above, with out-of-the-box JUnit 5.x support. The snippet below illustrates that.

<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-surefire-plugin</artifactId>
    <version>2.22.2</version>
</plugin>

The sample configuration we have worked with is pretty naive; many real-world applications end up with quite complex contexts built out of many conditionals. The flexibility and enormous opportunities that come with the context runners, an invaluable addition to the Spring Boot 2.x test scaffolding, are real life savers, so please keep them in mind.

The complete project sources are available on GitHub.

Saturday, May 11, 2019

When HTTP status code is not enough: tackling web APIs error reporting

One area of the RESTful web APIs design, quite frequently overlooked, is how to report errors and problems, either related to business or application. The proper usage of the HTTP status codes comes to mind first, and although quite handy, often it is not informative enough. Let us take 400 Bad Request for example. Yes, it clearly states that the request is problematic, but what exactly is wrong?

The RESTful architectural style does not dictate what should be done in this case, so everyone invents their own styles, conventions and specifications. It could be as simple as including an error message in the response or as shortsighted as copy/pasting long stack traces (in the case of Java or .NET, to name a few culprits). There is no shortage of ideas but luckily, we have at least some guidance available in the form of RFC 7807: Problem Details for HTTP APIs. Although it is a Proposed Standard rather than a full Internet Standard, it outlines good common principles on the problem at hand, and this is what we are going to talk about in this post.

In a nutshell, RFC 7807: Problem Details for HTTP APIs just proposes an error or problem representation (in JSON or XML format) which may include the following details:

  • type - A URI reference that identifies the problem type
  • title - A short, human-readable summary of the problem type
  • status - The HTTP status code
  • detail - A human-readable explanation specific to this occurrence of the problem
  • instance - A URI reference that identifies the specific occurrence of the problem

More importantly, the problem type definitions may extend the problem details object with additional members, contributing to the ones above. As you see, it looks dead simple from the implementation perspective. Even better, thanks to Zalando, we already have the RFC 7807: Problem Details for HTTP APIs implementation for Java (and Spring Web in particular). So ... let us give it a try!
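Before turning to the library, it may help to see how little the format itself demands. The hand-rolled sketch below assembles the five standard members plus extension members into a JSON object using only the JDK. The class ProblemDetailSketch, the example.com URI and the deliberately minimal JSON rendering (it escapes only quotes and backslashes) are our assumptions for illustration; a real service would rely on a proper JSON library.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hand-rolled illustration of the RFC 7807 members; a real service would use a JSON library.
public class ProblemDetailSketch {
    // Collects the standard members first, then the extension members next to them.
    static Map<String, Object> problem(String type, String title, int status,
                                       String detail, String instance,
                                       Map<String, Object> extensions) {
        Map<String, Object> members = new LinkedHashMap<>();
        members.put("type", type);
        members.put("title", title);
        members.put("status", status);
        members.put("detail", detail);
        members.put("instance", instance);
        members.putAll(extensions);
        return members;
    }

    // Minimal JSON rendering: numbers are written as-is, everything else as a string.
    static String toJson(Map<String, Object> members) {
        return members.entrySet().stream()
            .map(e -> quote(e.getKey()) + ":" + render(e.getValue()))
            .collect(Collectors.joining(",", "{", "}"));
    }

    private static String render(Object value) {
        return value instanceof Number ? value.toString() : quote(String.valueOf(value));
    }

    // Escapes only backslashes and double quotes, which is enough for this sketch.
    private static String quote(String text) {
        return "\"" + text.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    public static void main(String[] args) {
        System.out.println(toJson(problem(
            "https://example.com/problems/person-not-found", // hypothetical problem type URI
            "Person is not found", 404,
            "Person with identifier '1' is not found",
            "/api/people/1",
            Map.of("id", "1"))));
    }
}
```

The response would be served with the application/problem+json media type, which is what tells clients to interpret the body as problem details.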

Our imaginary People Management web API is going to be built using the state of the art technology stack, Spring Boot and Apache CXF, the popular web services framework and JAX-RS 2.1 implementation. To keep it somewhat simple, there are only two endpoints which are exposed: registration and lookup by person identifier.

Sweeping aside the tons of issues and business constraints you may run into while developing real-world services, even with this simple API a few things may go wrong. The first problem we are going to tackle: what if the person you are looking for is not registered yet? Looks like a fit for 404 Not Found, right? Indeed, let us start with our first problem, PersonNotFoundProblem!

public class PersonNotFoundProblem extends AbstractThrowableProblem {
    private static final long serialVersionUID = 7662154827584418806L;
    private static final URI TYPE = URI.create("http://localhost:21020/problems/person-not-found");
    
    public PersonNotFoundProblem(final String id, final URI instance) {
        super(TYPE, "Person is not found", Status.NOT_FOUND, 
            "Person with identifier '" + id + "' is not found", instance, 
                null, Map.of("id", id));
    }
}

It closely resembles a typical Java exception, and it really is one, since AbstractThrowableProblem is a subclass of RuntimeException. As such, we could throw it from our JAX-RS API.

@Produces({ MediaType.APPLICATION_JSON, "application/problem+json" })
@GET
@Path("{id}")
public Person findById(@PathParam("id") String id) {
    return service
        .findById(id)
        .orElseThrow(() -> new PersonNotFoundProblem(id, uriInfo.getRequestUri()));
}

If we run the server and just try to fetch the person providing any identifier, the problem detail response is going to be returned back (since the dataset is not pre-populated), for example:

$ curl "http://localhost:21020/api/people/1" -H  "Accept: */*" 

HTTP/1.1 404
Content-Type: application/problem+json

{
    "type" : "http://localhost:21020/problems/person-not-found",
    "title" : "Person is not found",
    "status" : 404,
    "detail" : "Person with identifier '1' is not found",
    "instance" : "http://localhost:21020/api/people/1",
    "id" : "1"
}

Please notice the usage of the application/problem+json media type along with additional property id being included into the response. Although there are many things which could be improved, it is arguably better than just naked 404 (or 500 caused by EntityNotFoundException). Plus, the documentation section behind this type of the problem (in our case, http://localhost:21020/problems/person-not-found) could be consulted in case further clarifications may be needed.

Designing the problems after exceptions is just one option. You may often (and for very valid reasons) refrain from coupling your business logic with unrelated details. In this case, it is perfectly valid to return the problem details as the response payload from the JAX-RS resource. As an example, the registration process may raise NonUniqueEmailException, so our web API layer could transform it into an appropriate problem detail.

@Consumes(MediaType.APPLICATION_JSON)
@Produces({ MediaType.APPLICATION_JSON, "application/problem+json" })
@POST
public Response register(@Valid final CreatePerson payload) {
    try {
        final Person person = service.register(payload.getEmail(), 
            payload.getFirstName(), payload.getLastName());
            
        return Response
            .created(uriInfo.getRequestUriBuilder().path(person.getId()).build())
            .entity(person)
            .build();

    } catch (final NonUniqueEmailException ex) {
        return Response
            .status(Response.Status.BAD_REQUEST)
            .type("application/problem+json")
            .entity(Problem
                .builder()
                .withType(URI.create("http://localhost:21020/problems/non-unique-email"))
                .withInstance(uriInfo.getRequestUri())
                .withStatus(Status.BAD_REQUEST)
                .withTitle("The email address is not unique")
                .withDetail(ex.getMessage())
                .with("email", payload.getEmail())
                .build())
            .build();
    }
}

To trigger this issue, it is enough to run the server instance and try to register the same person twice, like we have done below.

$ curl -X POST "http://localhost:21020/api/people" \
     -H  "Accept: */*" -H "Content-Type: application/json" \
     -d '{"email":"john@smith.com", "firstName":"John", "lastName": "Smith"}'

HTTP/1.1 400
Content-Type: application/problem+json

{
    "type" : "http://localhost:21020/problems/non-unique-email",
    "title" : "The email address is not unique",
    "status" : 400,
    "detail" : "The email 'john@smith.com' is not unique and is already registered",
    "instance" : "http://localhost:21020/api/people",
    "email" : "john@smith.com"
}

Great, so our last example is a bit more complicated but probably, at the same time, the most realistic one. Our web API relies heavily on Bean Validation to make sure the input provided by the consumers of the API is valid. How would we represent the validation errors as problem details? The most straightforward way is to supply a dedicated ExceptionMapper provider, which is part of the JAX-RS specification. Let us introduce one.

@Provider
public class ValidationExceptionMapper implements ExceptionMapper<ValidationException> {
    @Context private UriInfo uriInfo;
    
    @Override
    public Response toResponse(final ValidationException ex) {
        if (ex instanceof ConstraintViolationException) {
            final ConstraintViolationException constraint = (ConstraintViolationException) ex;
            
            final ThrowableProblem problem = Problem
                    .builder()
                    .withType(URI.create("http://localhost:21020/problems/invalid-parameters"))
                    .withTitle("One or more request parameters are not valid")
                    .withStatus(Status.BAD_REQUEST)
                    .withInstance(uriInfo.getRequestUri())
                    .with("invalid-parameters", constraint
                        .getConstraintViolations()
                        .stream()
                        .map(this::buildViolation)
                        .collect(Collectors.toList()))
                    .build();

            return Response
                .status(Response.Status.BAD_REQUEST)
                .type("application/problem+json")
                .entity(problem)
                .build();
        }
        
        return Response
            .status(Response.Status.INTERNAL_SERVER_ERROR)
            .type("application/problem+json")
            .entity(Problem
                .builder()
                .withTitle("The server is not able to process the request")
                .withType(URI.create("http://localhost:21020/problems/server-error"))
                .withInstance(uriInfo.getRequestUri())
                .withStatus(Status.INTERNAL_SERVER_ERROR)
                .withDetail(ex.getMessage())
                .build())
            .build();
    }

    protected Map<?, ?> buildViolation(ConstraintViolation<?> violation) {
        return Map.of(
                "bean", violation.getRootBeanClass().getName(),
                "property", violation.getPropertyPath().toString(),
                "reason", violation.getMessage(),
                "value", Objects.requireNonNullElse(violation.getInvalidValue(), "null")
            );
    }
}

The snippet above distinguishes two kinds of issues: ConstraintViolationExceptions indicate invalid input and are mapped to 400 Bad Request, whereas generic ValidationExceptions indicate a problem on the server side and are mapped to 500 Internal Server Error. We only extract the basic details about the violations; however, even that improves the error reporting a lot.

$ curl -X POST "http://localhost:21020/api/people" \
    -H  "Accept: */*" -H "Content-Type: application/json" \
    -d '{"email":"john.smith", "firstName":"John"}' -i    

HTTP/1.1 400                                                                    
Content-Type: application/problem+json                                              
                                                                                
{                                                                               
    "type" : "http://localhost:21020/problems/invalid-parameters",                
    "title" : "One or more request parameters are not valid",                     
    "status" : 400,                                                               
    "instance" : "http://localhost:21020/api/people",                             
    "invalid-parameters" : [ 
        {
            "reason" : "must not be blank",                                             
            "value" : "null",                                                           
            "bean" : "com.example.problem.resource.PeopleResource",                     
            "property" : "register.payload.lastName"                                    
        }, 
        {                                                                          
            "reason" : "must be a well-formed email address",                           
            "value" : "john.smith",                                                     
            "bean" : "com.example.problem.resource.PeopleResource",                     
            "property" : "register.payload.email"                                       
        } 
    ]                                                                           
}                                                                               

This time the additional information bundled into the invalid-parameters member is quite verbose: we know the class (PeopleResource, taken from the root bean class) as well as the method (register), the method's argument (payload) and the properties (lastName and email), all three extracted from the property path.
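If the consumers of the API would rather not parse the property path themselves, the mapper could split it into separate members. Here is a minimal sketch, assuming the path always has the dotted method.argument.property form shown in the response above. The PropertyPathParts class and its split method are hypothetical names introduced for illustration, not part of Bean Validation or the problem library.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of any library: it works on the string form
// of the property path, e.g. "register.payload.lastName".
public class PropertyPathParts {
    /**
     * Splits a method-parameter violation path into method, argument and
     * property. Assumes the dotted form "method.argument.property"; nested
     * properties (if any) stay together in the "property" entry.
     */
    public static Map<String, String> split(final String propertyPath) {
        // Limit of 3 keeps everything after the second dot in one segment
        final String[] segments = propertyPath.split("\\.", 3);

        final Map<String, String> parts = new LinkedHashMap<>();
        parts.put("method", segments[0]);
        if (segments.length > 1) {
            parts.put("argument", segments[1]);
        }
        if (segments.length > 2) {
            parts.put("property", segments[2]);
        }
        return parts;
    }

    public static void main(final String[] args) {
        // prints {method=register, argument=payload, property=lastName}
        System.out.println(split("register.payload.lastName"));
    }
}
```

String splitting keeps the sketch short, but it is fragile. The Path returned by getPropertyPath() is an Iterable of Path.Node instances, each exposing its name and kind, so walking the nodes is likely the more robust choice for anything beyond a simple case like this one.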

Meaningful error reporting is one of the cornerstones of modern RESTful web APIs. Often it is not easy, but it is definitely worth the effort. The consumers (which often are just other developers) should have a clear understanding of what went wrong and what to do about it. The RFC 7807: Problem Details for HTTP APIs is a step in the right direction, and libraries like problem and problem-spring-web are here to back you up. Please make use of them.

The complete source code is available on GitHub.