Tuesday, July 28, 2020

There are never enough of them: enriching Apache Avro generated classes with custom Java annotations

Apache Avro, along with Apache Thrift and Protocol Buffers, is often used as a platform-neutral, extensible mechanism for serializing structured data. In the context of event-driven systems, Apache Avro schemas play the role of language-agnostic contracts shared between the loosely coupled components of the system, which are not necessarily written in the same programming language.

Probably the most widely adopted reference architecture for such systems revolves around Apache Kafka backed by Schema Registry and Apache Avro, although many other excellent options are available. Nevertheless, why Apache Avro?

The official documentation page summarizes the key advantages of Apache Avro over Apache Thrift and Protocol Buffers pretty well. But we are going to add another one to the list: biased (in a good sense) support for Java and the JVM platform in general.

Let us imagine that one of the components (or, it has to be said, microservices) takes care of payment processing. Not every payment succeeds, and to propagate such failures the component broadcasts a PaymentRejectedEvent whenever such an unfortunate event happens. Here is its Apache Avro schema, persisted in the PaymentRejectedEvent.avsc file.

{
    "type": "record",
    "name": "PaymentRejectedEvent",
    "namespace": "com.example.event",
    "fields": [
        {
            "name": "id",
            "type": {
                "type": "string",
                "logicalType": "uuid"
            }
        },
        {
            "name": "reason",
            "type": {
                "type": "enum",
                "name": "PaymentStatus",
                "namespace": "com.example.event",
                "symbols": [
                    "EXPIRED_CARD",
                    "INSUFFICIENT_FUNDS",
                    "DECLINED"
                ]
            }
        },
        {
            "name": "date",
            "type": {
                "type": "long",
                "logicalType": "local-timestamp-millis"
            }
        }
    ]
}

The event is deliberately kept simple; you can safely assume that in a more or less realistic system it would carry considerably more details. To turn this event into a Java class at build time, we could use the Apache Avro Maven plugin, which is as easy as it gets.

<plugin>
    <groupId>org.apache.avro</groupId>
    <artifactId>avro-maven-plugin</artifactId>
    <version>1.10.0</version>
    <configuration>
        <stringType>String</stringType>
    </configuration>
    <executions>
        <execution>
            <phase>generate-sources</phase>
            <goals>
                <goal>schema</goal>
            </goals>
            <configuration>
                <sourceDirectory>${project.basedir}/src/main/avro/</sourceDirectory>
                <outputDirectory>${project.build.directory}/generated-sources/avro/</outputDirectory>
            </configuration>
        </execution>
    </executions>
</plugin>

Once the build finishes, you will get the PaymentRejectedEvent Java class generated. But a few annoyances are going to emerge right away:

@org.apache.avro.specific.AvroGenerated
public class PaymentRejectedEvent extends ... {
   private java.lang.String id;
   private com.example.event.PaymentStatus reason;
   private long date;
}

The Java types of the id and date fields are not really what we would expect. Luckily, this is easy to fix, for example by specifying the customConversions plugin property.

<plugin>
    <groupId>org.apache.avro</groupId>
    <artifactId>avro-maven-plugin</artifactId>
    <version>1.10.0</version>
    <configuration>
        <stringType>String</stringType>
        <customConversions>
            org.apache.avro.Conversions$UUIDConversion,org.apache.avro.data.TimeConversions$LocalTimestampMillisConversion
        </customConversions>
    </configuration>
    <executions>
        <execution>
            <phase>generate-sources</phase>
            <goals>
                <goal>schema</goal>
            </goals>
            <configuration>
                <sourceDirectory>${project.basedir}/src/main/avro/</sourceDirectory>
                <outputDirectory>${project.build.directory}/generated-sources/avro/</outputDirectory>
            </configuration>
        </execution>
    </executions>
</plugin>

If we build the project this time, the plugin will generate the right types.

@org.apache.avro.specific.AvroGenerated
public class PaymentRejectedEvent extends ... {
   private java.util.UUID id;
   private com.example.event.PaymentStatus reason;
   private java.time.LocalDateTime date;
}
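To make the effect of the two conversions more tangible, here is a minimal sketch of the mapping they perform, written against the JDK only. It is not the Avro implementation: the class name LogicalTypeSketch and its helper methods are hypothetical, and interpreting local-timestamp-millis against UTC is an assumption made for this illustration.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.util.UUID;

// Hypothetical illustration, not part of Apache Avro.
public class LogicalTypeSketch {
    // "uuid" logical type: the serialized value is a plain string.
    static UUID toUuid(String raw) {
        return UUID.fromString(raw);
    }

    static String fromUuid(UUID id) {
        return id.toString();
    }

    // "local-timestamp-millis" logical type: the serialized value is a long
    // counting milliseconds; it is read against UTC here (an assumption).
    static LocalDateTime toLocalDateTime(long millis) {
        return LocalDateTime.ofInstant(Instant.ofEpochMilli(millis), ZoneOffset.UTC);
    }

    static long fromLocalDateTime(LocalDateTime dateTime) {
        return dateTime.toInstant(ZoneOffset.UTC).toEpochMilli();
    }

    public static void main(String[] args) {
        UUID id = UUID.randomUUID();
        LocalDateTime date = LocalDateTime.of(2020, 7, 28, 12, 0);

        // Both values survive a round trip through their serialized form.
        System.out.println(toUuid(fromUuid(id)).equals(id));
        System.out.println(toLocalDateTime(fromLocalDateTime(date)).equals(date));
    }
}
```

With the conversions registered in the plugin, the generated class exposes the richer types directly, so application code does not need helpers like these.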

It looks much better! But what about the next challenge? In Java, annotations are commonly used to associate additional pieces of metadata with a particular language element. What if we have to add a custom, application-specific annotation to all generated event classes? It does not really matter which one; let it be @javax.annotation.Generated, for example. It turns out that with Apache Avro this is not an issue: it has a dedicated javaAnnotation property we could benefit from.

{
    "type": "record",
    "name": "PaymentRejectedEvent",
    "namespace": "com.example.event",
    "javaAnnotation": "javax.annotation.Generated(\"avro\")",
    "fields": [
        {
            "name": "id",
            "type": {
                "type": "string",
                "logicalType": "uuid"
            }
        },
        {
            "name": "reason",
            "type": {
                "type": "enum",
                "name": "PaymentStatus",
                "namespace": "com.example.event",
                "symbols": [
                    "EXPIRED_CARD",
                    "INSUFFICIENT_FUNDS",
                    "DECLINED"
                ]
            }
        },
        {
            "name": "date",
            "type": {
                "type": "long",
                "logicalType": "local-timestamp-millis"
            }
        }
    ]
}

When we rebuild the project one more time (hopefully the last one), the generated PaymentRejectedEvent Java class is going to be decorated with the additional custom annotation.

@javax.annotation.Generated("avro")
@org.apache.avro.specific.AvroGenerated
public class PaymentRejectedEvent extends ... {
   private java.util.UUID id;
   private com.example.event.PaymentStatus reason;
   private java.time.LocalDateTime date;
}
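One thing to keep in mind is that @javax.annotation.Generated is declared with source retention, so it is not visible to reflection at runtime. If the application needs to discover the annotation on event classes at runtime, the annotation has to be retained at runtime. Below is a minimal sketch of such a lookup; the EventMetadata annotation and the PaymentRejectedEventStandIn class are hypothetical stand-ins (not generated by Avro) so that the example is self-contained.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical illustration; none of these types come from Apache Avro.
public class AnnotationLookupSketch {
    // A custom annotation that, unlike javax.annotation.Generated,
    // is kept in the class file and visible through reflection.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface EventMetadata {
        String value();
    }

    // Stand-in for a generated event class carrying the annotation.
    @EventMetadata("avro")
    static class PaymentRejectedEventStandIn {
    }

    // Returns the annotation value, or null if the type is not annotated.
    static String metadataOf(Class<?> type) {
        EventMetadata metadata = type.getAnnotation(EventMetadata.class);
        return metadata == null ? null : metadata.value();
    }

    public static void main(String[] args) {
        System.out.println(metadataOf(PaymentRejectedEventStandIn.class));
        System.out.println(metadataOf(String.class));
    }
}
```

In the schema, such an annotation would be referenced by its fully qualified name in the javaAnnotation property, in the same way as javax.annotation.Generated above.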

Obviously, this property has no effect if the schema is used to produce the respective constructs in other programming languages, but it still feels good to see that Java has privileged support in Apache Avro, thanks for that! As a side note, it is good to see that after a rather long period of inactivity the project is experiencing a second wind, with regular releases and new features delivered constantly.

The complete source code is available on GitHub.