Monday, October 30, 2017

Better late than never: SSE, or Server-Sent Events, are now in JAX-RS

Server-Sent Events (or just SSE) is a quite useful protocol which allows servers to push data to clients over HTTP. It is something web browsers have supported for ages but, surprisingly, it was neglected by the JAX-RS specification for quite a long time. Although Jersey had an extension available for the SSE media type, the API has never been formalized and, as such, was not portable to other JAX-RS implementations.

Luckily, JAX-RS 2.1, also known as JSR-370, has changed that by making SSE support, both client-side and server-side, a part of the official specification. In today's post we are going to look at how to integrate SSE support into existing Java REST(ful) web services, using the recently released version 3.2.0 of the terrific Apache CXF framework. In fact, besides the bootstrapping, there is nothing really CXF-specific here: all the examples should work in any other framework which implements the JAX-RS 2.1 specification.

Without further ado, let us get started. As a significant number of Java projects these days are built on top of the awesome Spring Framework, our sample application is going to use Spring Boot and the Apache CXF Spring Boot integration to get us off the ground quickly. Our good old buddy Apache Maven will help us as well by managing the project dependencies.


<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter</artifactId>
    <version>1.5.8.RELEASE</version>
</dependency>

<dependency>
    <groupId>org.apache.cxf</groupId>
    <artifactId>cxf-rt-frontend-jaxrs</artifactId>
    <version>3.2.0</version>
</dependency>

<dependency>
    <groupId>org.apache.cxf</groupId>
    <artifactId>cxf-spring-boot-starter-jaxrs</artifactId>
    <version>3.2.0</version>
</dependency>

<dependency>
    <groupId>org.apache.cxf</groupId>
    <artifactId>cxf-rt-rs-client</artifactId>
    <version>3.2.0</version>
</dependency>

<dependency>
    <groupId>org.apache.cxf</groupId>
    <artifactId>cxf-rt-rs-sse</artifactId>
    <version>3.2.0</version>
</dependency>

Under the hood Apache CXF is using the Atmosphere framework to implement the SSE transport, so this is another dependency we have to include.


<dependency>
    <groupId>org.atmosphere</groupId>
    <artifactId>atmosphere-runtime</artifactId>
    <version>2.4.14</version>
</dependency>

Relying on the Atmosphere framework introduces the need to provide an additional configuration setting, namely transportId, to ensure that the SSE-capable transport will be picked up at runtime. The relevant details could be added into the application.yml file:

cxf:
  servlet:
    init:
      transportId: http://cxf.apache.org/transports/http/sse

Great, so the foundation is there, moving on. The REST(ful) web service we are going to build will expose imaginary CPU load averages (randomly generated, for simplicity) as SSE streams. The Stats class constitutes our data model.

public class Stats {
    private long timestamp;
    private int load;

    public Stats() {
    }

    public Stats(long timestamp, int load) {
        this.timestamp = timestamp;
        this.load = load;
    }

    // Getters and setters are omitted
    ...
}

Speaking of streams, the Reactive Streams specification made its way into Java 9 and hopefully we are going to see accelerated adoption of reactive programming models by the Java community. Moreover, developing SSE-enabled REST(ful) web services becomes much easier and more straightforward when backed by Reactive Streams. To make the case, let us onboard RxJava 2 into our sample application.


<dependency>
    <groupId>io.reactivex.rxjava2</groupId>
    <artifactId>rxjava</artifactId>
    <version>2.1.6</version>
</dependency>

This is a good moment to start with our StatsRestService class, a typical JAX-RS resource implementation. The key SSE capabilities in JAX-RS 2.1 are centered around the Sse contextual object, which could be injected like this:

@Service
@Path("/api/stats")
public class StatsRestService {
    @Context 
    public void setSse(Sse sse) {
        // Access Sse context here
    }

Out of the Sse context we could get access to two very useful abstractions: SseBroadcaster and OutboundSseEvent.Builder, for example:

private SseBroadcaster broadcaster;
private Builder builder;
    
@Context 
public void setSse(Sse sse) {
    this.broadcaster = sse.newBroadcaster();
    this.builder = sse.newEventBuilder();
}

As you may have already guessed, the OutboundSseEvent.Builder constructs instances of the OutboundSseEvent class which could be sent over the wire, while the SseBroadcaster broadcasts the same SSE stream to all the connected clients. With that being said, we could generate a stream of OutboundSseEvents and distribute it to everyone who is interested:

private static void subscribe(final SseBroadcaster broadcaster, final Builder builder) {
    Flowable
        .interval(1, TimeUnit.SECONDS)
        .zipWith(eventsStream(builder), (id, bldr) -> createSseEvent(bldr, id))
        .subscribeOn(Schedulers.single())
        .subscribe(broadcaster::broadcast);
}

private static Flowable<OutboundSseEvent.Builder> eventsStream(final Builder builder) {
    return Flowable.generate(emitter -> emitter.onNext(builder.name("stats")));
}

If you are not familiar with RxJava 2, no worries, this is what is happening here. The eventsStream method returns an effectively infinite stream of OutboundSseEvent.Builder instances for SSE events of type stats. The subscribe method is a little bit more involved. We start off by creating a stream which emits a sequential number every second, e.g. 0, 1, 2, 3, 4, 5, 6, ... and so on. Later on, we combine this stream with the one returned by the eventsStream method, essentially merging both streams into a single one which emits a tuple of (number, OutboundSseEvent.Builder) every second. Frankly speaking, this tuple is not very useful to us, so we transform it into an instance of the OutboundSseEvent class, treating the number as the SSE event identifier:

private static final Random RANDOM = new Random();

private static OutboundSseEvent createSseEvent(OutboundSseEvent.Builder builder, long id) {
    return builder
        .id(Long.toString(id))
        .data(Stats.class, new Stats(new Date().getTime(), RANDOM.nextInt(100)))
        .mediaType(MediaType.APPLICATION_JSON_TYPE)
        .build();
}

The OutboundSseEvent may carry any payload in the data property, which will be serialized with respect to the mediaType specified, using the usual MessageBodyWriter resolution strategy. Once we get our OutboundSseEvent instance, we send it off using the SseBroadcaster::broadcast method. Please notice that we handed off the control flow to another thread using the subscribeOn operator; this is what you would usually want to do.

Good, hopefully the stream part is clear now, but how could we actually subscribe to the SSE events emitted by the SseBroadcaster? That is easier than you might think:

@GET
@Path("broadcast")
@Produces(MediaType.SERVER_SENT_EVENTS)
public void broadcast(@Context SseEventSink sink) {
    broadcaster.register(sink);
}

And we are all set. The most important piece here is the content type being produced, which should be set to MediaType.SERVER_SENT_EVENTS. In this case, the contextual instance of the SseEventSink becomes available and could be registered with the SseBroadcaster instance.

To see our JAX-RS resource in action, we need to bootstrap the server instance using, for example, JAXRSServerFactoryBean, configuring all the necessary providers along the way. Please take note that we are also explicitly specifying the transport to be used, in this case SseHttpTransportFactory.TRANSPORT_ID.

@Configuration
@EnableWebMvc
public class AppConfig extends WebMvcConfigurerAdapter {
    @Bean
    public Server rsServer(Bus bus, StatsRestService service) {
        JAXRSServerFactoryBean endpoint = new JAXRSServerFactoryBean();
        endpoint.setBus(bus);
        endpoint.setAddress("/");
        endpoint.setServiceBean(service);
        endpoint.setTransportId(SseHttpTransportFactory.TRANSPORT_ID);
        endpoint.setProvider(new JacksonJsonProvider());
        return endpoint.create();
    }
    
    @Override
    public void addResourceHandlers(ResourceHandlerRegistry registry) {
        registry
          .addResourceHandler("/static/**")
          .addResourceLocations("classpath:/web-ui/"); 
    }
}

To close the loop, we just need to supply the runner for our Spring Boot application:

@SpringBootApplication
public class SseServerStarter {    
    public static void main(String[] args) {
        SpringApplication.run(SseServerStarter.class, args);
    }
}

Now, if we run the application and navigate to http://localhost:8080/static/broadcast.html using multiple web browsers or different tabs within the same browser, we would observe an identical stream of events charted inside all of them.

Nice, broadcasting is certainly a valid use case, but what about returning an independent SSE stream on each endpoint invocation? Easy, just use SseEventSink methods, like send and close, to manipulate the SSE stream directly.

@GET
@Path("sse")
@Produces(MediaType.SERVER_SENT_EVENTS)
public void stats(@Context SseEventSink sink) {
    Flowable
        .interval(1, TimeUnit.SECONDS)
        .zipWith(eventsStream(builder), (id, bldr) -> createSseEvent(bldr, id))
        .subscribeOn(Schedulers.single())
        .subscribe(sink::send, ex -> {}, sink::close);
}

This time, if we run the application and navigate to http://localhost:8080/static/index.html using multiple web browsers or different tabs within the same browser, we would observe completely different charts.

Excellent, the server-side APIs are indeed very concise and easy to use. But what about the client side, could we consume SSE streams from Java applications? The answer is yes, absolutely. JAX-RS 2.1 outlines the client-side API as well, with SseEventSource at the heart of it.


final WebTarget target = ClientBuilder
    .newClient()
    .register(JacksonJsonProvider.class)
    .target("http://localhost:8080/services/api/stats/sse");
        
try (final SseEventSource eventSource =
            SseEventSource
                .target(target)
                .reconnectingEvery(5, TimeUnit.SECONDS)
                .build()) {

    eventSource.register(event -> {
        final Stats stats = event.readData(Stats.class, MediaType.APPLICATION_JSON_TYPE);
        System.out.println("name: " + event.getName());
        System.out.println("id: " + event.getId());
        System.out.println("comment: " + event.getComment());
        System.out.println("data: " + stats.getLoad() + ", " + stats.getTimestamp());
        System.out.println("---------------");
    });
    eventSource.open();

    // Just consume SSE events for 10 seconds
    Thread.sleep(10000); 
}     

If we run this code snippet (assuming the server is up and running as well), we would see something like this in the console (as you may recall, the data is generated randomly).

name: stats
id: 0
comment: null
data: 82, 1509376080027
---------------
name: stats
id: 1
comment: null
data: 68, 1509376081033
---------------
name: stats
id: 2
comment: null
data: 12, 1509376082028
---------------
name: stats
id: 3
comment: null
data: 5, 1509376083028
---------------

...

As we can see, the OutboundSseEvent from the server side becomes an InboundSseEvent on the client side. The client may consume any payload from the data property, which could be deserialized by specifying the expected media type, using the usual MessageBodyReader resolution strategy.

There is a lot of material squeezed into this single post. And still, there are a few more things regarding SSE and JAX-RS 2.1 which we have not covered here, like, for example, using HttpHeaders.LAST_EVENT_ID_HEADER or configuring reconnect delays. Those could be a great topic for an upcoming post if there is an interest in learning about them.
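Just to give a taste of the former, here is a minimal, untested sketch of how a resource method might pick up the identifier of the last event the client has seen (the endpoint name and the resume logic are made up purely for illustration):

@GET
@Path("resume")
@Produces(MediaType.SERVER_SENT_EVENTS)
public void resume(@HeaderParam(HttpHeaders.LAST_EVENT_ID_HEADER) String lastEventId,
        @Context SseEventSink sink) {
    // The client (for example, SseEventSource or the browser) resends the identifier of
    // the last event it received, so the stream could be resumed from that point instead
    // of starting over from scratch.
    final long lastSeenId = (lastEventId == null) ? -1 : Long.parseLong(lastEventId);
    // ... rebuild the Flowable starting from lastSeenId + 1 and feed it into the sink
}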

To conclude, the SSE support in JAX-RS is what many of us have been awaiting for so long. Finally, it is there, please give it a try!

The complete project sources are available on Github.

Wednesday, July 12, 2017

The Patterns of the Antipatterns: Architecture

We have always been looking for the best ways to architect our applications and platforms so they are maintainable, extensible, observable, adaptable and easy to evolve (this is by no means a complete list of the properties we would like to have, but the foundational ones). Layered architecture, component-based architecture, hexagonal architecture, microservice architecture (and many others) advocate great design principles and guidelines to achieve these goals.

For example, layered architecture suggests having a clear separation between different layers (persistence, services, caching, presentation, ...) with strict rules imposed on how the layers could talk to each other. Component-based architecture, quite widespread in the world of UI development but not limited to it, introduces components (which internally may be built on top of layered architecture or another architectural style) as the key building blocks of applications or systems.

In turn, hexagonal architecture goes even further by introducing the concepts of ports and adapters around the application core. Last but not least, microservice architecture, the hottest architectural style these days, advocates structuring the application or system as a set of loosely coupled, ideally small, self-sufficient services which collaborate with each other.

Each of the aforementioned architectural styles (as well as others we haven't mentioned) has its own strengths and weaknesses and a context where it shines the most. But there are two kinds of architectures which beat them all, and those are the ones we are going to discuss right away.

Proximity Architecture

There is only one principle this architectural style offers: use anything you want anywhere you need. Do you need access to a service from a data transfer object? No problem, just do it! Maybe your RESTful resource needs access to persistence services? Not an issue, go for it, because, by and large, why bother? It is just a pile of classes (usually within the same application), so use static fields or methods, dependency injection or any other technique your language / frameworks provide. In the end, if the compiler is happy, what is the problem? Right?

No, it is not right ... For sure, you have heard funny terms such as "big ball of mud" or "spaghetti code". Those are the typical outcomes when proximity architecture drives the show.

There are many reasons why proximity architecture emerges. Those reasons range from the constant urge to push features out, to a lack of knowledge and/or experience in software craftsmanship, or, in the worst case, an unethical and careless approach to the software engineering discipline. If you happen to have startup experience, or worked for one of those large companies that still claim to be a startup to justify their chaos, I'm sure you have seen this form of architecture or one of its variations. Are they all doomed?

Well, I would argue, no, not really, but fleeing the claws of proximity architecture solely depends on the trade-offs you are going to make. For developers, it is hard or even impossible to push back on the timelines set for the delivery of new features (unless you are exceptionally lucky and your manager understands what technical debt is and how to keep it under control). You have to, and you will, take shortcuts. However, it is enormously important to strive for the right balance between pouring in hacky code (on top of a bunch of other hacks) and going with a quick, not-the-best option while keeping in mind how much harm it could do to the overall application architecture (and how hard it would be to refactor, improve or reverse that later on). Undoubtedly, seasoned software developers are able to cope with that.

I know, it sounds too vague to be actionable advice, but every company is unique and there is no magic universal rule or recipe to stop proximity architecture from poisoning your applications. One thing is clear though: if you let it in and do nothing, you are going to end up with an unmanageable codebase, and the only route left to take would be to rewrite everything from scratch.

Micromonolith Architecture

Microservices, microservices, microservices ... everyone talks about them ... You hear it from left and right: if you don't do microservices, you are stuck in the stone age. And many actually truly believe that all their problems and issues could be solved with microservices.

In many regards, microservice architecture could put life into stagnating projects, assuming you are ready to embrace it across the whole organization, culturally and technically. This is a serious long-term commitment, and as such, the changes won't happen overnight. But so many organizations fail to see the new beast rising here: micromonolith architecture in microservice clothing.

Here is what you could be sold as microservice architecture. Many organizations ended up with a monolith and have no choice but to deal with that. Quality is terrible, productivity is going down, releasing new features takes months (or even years) ... and throwing more people at it does not help at all. So everyone tries to escape the monolith, whatever it takes, building standalone applications or services on the side; luckily, the terrific Spring Boot makes that as easy as falling off a log.

At first glance, it doesn't look bad; after all, they are calling it a microservice, and a bunch of things are calling each other, right? Except that this is not right at all. Applications and services that thrive in this kind of unhealthy ecosystem often do not own any data. They do not have sufficient knowledge to function independently and they do not belong to a concrete domain. These kinds of services need to talk to the monolith to perform any useful function and, in the worst case scenario, they share a database with the monolith and become mere chatty proxies.

Why is that? Well, because in most cases it is very hard to migrate a large monolithic application to microservice architecture (the true one). And along with the migration, the organization should transform itself to become fertile ground for the cultivation of microservices, naturally. Not everyone is brave enough to pull it off, but the good news is that it is feasible. And here is why.

While building the monolith, the organization (hopefully) becomes an expert in its domain. There are tons of knowledge developed around the product, there are well-defined, well-understood use cases and flows, and domain-driven design might be of great help to formalize and capture all that. Once there is a solid understanding of what should be built, it is time to start cutting pieces off the monolith. It is going to be a bumpy ride for sure, and probably you won't be able to get from point A to point B directly (refactorings? splitting the monolith into modules?), not to forget about evolving your testing practices (consumer contract tests? component tests? e2e tests?) along the way. But the rewards are worth the effort.

Microservice architecture is a great one: it opens up an unlimited number of opportunities and, when done right, could bring back the joy of development and serve as a solid foundation for the organization's software stacks, current and future. But neglecting its core principles will pave the road to frustration and chaos.

Monday, June 19, 2017

The Patterns of the Antipatterns: Design / Testing

It's been a while since I started my professional career as a software developer. Over the years I have worked for many companies, on a whole bunch of different projects, trying my best to deliver them and be proud of the job. However, drifting from company to company, from project to project, you see the same bad design decisions made over and over again, no matter how old the codebase is or how many people have touched it. They are so widely present that they truly become patterns themselves, hence the title of this blog post: the patterns of the antipatterns.

I am definitely not the first one to identify them and certainly not the last one either (so let all the credit go to the original sources, if any exist). They all come from real projects, primarily Java ones, though other details should stay undisclosed. With that ...

Code first, no time to think

This hardly sounds like the name of an antipattern, but I believe it is. We are all developers, and unsurprisingly, we love to code. Whenever we get some feature or idea to implement, we often just start coding it, without thinking much about the design or even looking around to see whether we could borrow or reuse some functionality which is already there. As such, we reinvent the wheel all the time, hopefully enjoying at least the coding part ...

I am absolutely convinced that spending some time just thinking about the high-level design, incarnating it in terms of interfaces, traits or protocols (whatever your language of choice offers), is a must-do exercise. It won't take more time, really, but the results are rewarding: a beautiful, well-thought-out solution at the end.

TDD is a terrific set of practices to help and guide you through. While you are developing test cases, you are iterating over and over, making changes and with each step getting closer to the right implementation. Please adopt it in each and every project you are part of.

Occasional Doer

Good, enough philosophy, now we are moving on to a real antipatterns. Undoubtedly, many of you witnessed such pieces of code:

public void setSomething(Object something) {
    if (!heavensAreFalling) {
        this.something = something;
    }
}

Essentially, you call the method to set some value, and surely you expect the value to be set once the invocation is completed. However, due to the presence of some logic inside the method implementation, it may actually do nothing, silently ...

This is an example of really bad design. Instead, such interactions could be modeled using, for example, the State pattern, or at least by throwing an exception saying that the operation cannot be completed due to internal constraints. Yes, it may require a bit more code to be written, but at least the expectations will be met.
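For instance, a minimal sketch of the "fail loudly" option (assuming an unchecked IllegalStateException is an acceptable way to signal the violated constraint) could look like this:

public void setSomething(Object something) {
    if (heavensAreFalling) {
        // Refuse explicitly instead of silently ignoring the new value
        throw new IllegalStateException("Unable to set the value, the heavens are falling");
    }
    this.something = something;
}

At least now the caller learns immediately that the expectation was not met, instead of discovering it much later.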

The Universe

Often there is a class within the application which references (or is referenced by) nearly every other class of the application. For example, all classes in the project must be inherited from some base class.

public class MyService extends MyBase {
  ...
}
public class MyDao extends MyBase {
  ...
}

In most cases, this is not a good idea. Not only does it create a hard dependency on this single class, but also on every dependency this guy is using internally. You may argue that in Java every single class implicitly inherits from Object. Indeed, but this is part of the language specification and has nothing to do with application logic, so there is no need to mimic that.

Another extreme is to have a class which serves as the central brain of the application. It knows about every service / dao / ... and provides accessors (most of the time, static ones) so anyone can just turn to it and ask for anything, for example:

public class Services {
    public static MyService1 getMyService1() {...}
    public static MyService2 getMyService2() {...}
    public static MyService3 getMyService3() {...}
}

Those are universes: eventually they leak into every class of the application (it is easy to call a static method, right?) and couple everything together. In a more or less large codebase, getting rid of such universes is exceptionally hard.

To prevent such things from poisoning your projects, use the dependency injection pattern, implemented by Spring Framework, Dagger, Guice, CDI or HK2, or pass the necessary dependencies through constructors or method arguments, as sketched below. It will actually help you to see the smells early on and address them even before they become a problem.
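As a tiny illustration, here is a sketch of passing the collaborators through the constructor instead of reaching out to the static universe (the OrderProcessor class is hypothetical, the service names are borrowed from the snippet above):

public class OrderProcessor {
    private final MyService1 service1;
    private final MyService2 service2;

    // The dependencies are stated explicitly and could easily be replaced in tests,
    // no hidden calls to Services.getMyService1() anywhere in the class.
    public OrderProcessor(MyService1 service1, MyService2 service2) {
        this.service1 = service1;
        this.service2 = service2;
    }
}

If the constructor starts to accumulate too many parameters, that is exactly the kind of smell mentioned above, telling you the class is probably doing too much.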

The Onionated Inheritance

This is a really fun and scary one, very often present in projects which expose and implement REST(ful) web services. Let us suppose you have a Customer class with quite a few different properties, for example:

public class Customer {
    private String id;
    private String firstName;
    private String lastName;
    private Address address;
    private Company company;
    ...
}

Now, you have been asked to create an API (or expose it through other means) which finds the customer but returns only their id, firstName and lastName. Sounds straightforward, doesn't it? Let us do it:

public class CustomerWithIdAndNames {
    private String id;
    private String firstName;
    private String lastName;
}

Looks pretty neat, right? Everyone is happy, but the next day you get to work on another feature where you need to design an API which returns id, firstName, lastName and also company. Sounds pretty easy, we already have CustomerWithIdAndNames and only need to enrich it with a company. A perfect job for inheritance, right?

public class CustomerWithIdAndNamesAndCompany extends CustomerWithIdAndNames {
    private Company company;
}

It works! But after a week another request comes in, where a new API also needs to expose the address property ... anyway, you get the idea. In the end you wind up with dozens of classes which extend each other, adding properties here and there (like onions, hence the antipattern name) to fulfill the API contracts.

Yet another road to hell ... There are quite a few options here; the simplest one is JSON Views (here are some examples using Jackson), where the underlying class stays the same but different views of it could be returned. In case you really care about not fetching the data you don't need, another option is GraphQL, which we have covered last time. Essentially, the message here is: don't create such onionated hierarchies; use a single representation but different techniques to assemble it and fetch the necessary pieces efficiently.
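To give a rough idea of the JSON Views option, here is a hedged sketch using Jackson annotations (the view marker classes are arbitrary names, not part of any API): the single Customer class is annotated once, and every endpoint picks the view it needs.

public class Views {
    public static class IdAndNames { }
    public static class WithCompany extends IdAndNames { }
}

public class Customer {
    @JsonView(Views.IdAndNames.class) private String id;
    @JsonView(Views.IdAndNames.class) private String firstName;
    @JsonView(Views.IdAndNames.class) private String lastName;
    @JsonView(Views.WithCompany.class) private Company company;
    ...
}

// Serializes only the fields visible in the requested view
final String json = new ObjectMapper()
    .writerWithView(Views.IdAndNames.class)
    .writeValueAsString(customer);

A new representation becomes just another marker class (or a @JsonView annotation on the resource method, with Jackson's JAX-RS provider), not another subclass in an ever-growing hierarchy.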

IDD: if-driven development

Most real-world projects are internally built using a pretty complex set of application and business rules, cemented by applying different layers of extensibility and customization (if needed). In most cases, the implementation choice is pretty unpretentious and the logic is driven by conditional statements, for example:

int price = ...;
if (calculateTaxes) {
    price += taxes;
}

With time, the business rules evolve, and so do the conditional expressions they are embodied in, becoming real monsters in the end:

int price = ...;
if (calculateTaxes && ("US".equals(country) || "GB".equals(country)) && 
    discount == null || (discount != null && discount.getType() == TAXES)) {
    price += taxes;
}

You see where I am going with this: the project is built on top of IDD practices, if-driven development. Not only does the code become fragile, prone to condition errors and very hard to follow, it is also very scary to change! To be fair, you also need to test all possible combinations of such if branches to ensure they make sense and the right branch is picked up (I bet no one is actually doing that because it is a gigantic amount of tests and effort)!

In many cases feature toggles could be an answer. But if you are used to writing such code, please take some time and read some books about design patterns, test-driven development and coding practices. There are so many of them; below is the list I would definitely recommend:

There are so many great ones to add here, but those are a very good starting point and a highly rewarding investment. Much better alternatives to if statements are going to fill your mind.
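Just to hint at where those books lead, here is a small, illustrative sketch (the Order and rule names are made up) of pushing the tax condition from the example above into a dedicated rule object instead of growing the if statement:

public interface PricingRule {
    boolean appliesTo(Order order);
    int apply(int price, Order order);
}

public class TaxRule implements PricingRule {
    @Override
    public boolean appliesTo(Order order) {
        // The applicability logic lives in one small place and can be unit tested in isolation
        return order.isTaxable()
            && (order.getDiscount() == null || order.getDiscount().getType() == DiscountType.TAXES);
    }

    @Override
    public int apply(int price, Order order) {
        return price + order.getTaxes();
    }
}

// The calculation just folds over the applicable rules, no giant conditional anymore
int price = basePrice;
for (PricingRule rule : rules) {
    if (rule.appliesTo(order)) {
        price = rule.apply(price, order);
    }
}

Each new business rule becomes a new class (or a feature toggle around one), rather than yet another clause in an already fragile condition.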

Test Lottery

When you hear something like "All projects in our organization are agile and using TDD practices", an idealistic picture comes to mind and you hope everything is done right, by the book. In reality, in most projects things are very far from that. Hopefully, at least there are some test suites you could trust, or could you?

Please welcome the test lottery: the kingdom of flaky tests. Have you ever seen builds with random test failures? Like, on every consecutive build (without any changes introduced) some tests suddenly pass but others start to fail? And maybe one out of ten builds turns green (jackpot!) as the stars finally get aligned properly? If no one cares, it becomes the new normal, and every member who joins the team is told to ignore those failures: "they have been failing for ages, screw them".

Tests are as important as the mainstream code you push into production. They need to be maintained, refactored and kept clean. Keep all your builds green all the time; if you see some flakiness or random failures, address them right away. It is not easy, but in many cases you may actually run into real bugs! You have to trust your test suites, otherwise why do you need them at all?

Test Framework Factory

This guy is an outstanding one. It happens so often, it feels like every organization is just obligated to create its own test framework, usually on top of existing ones. At first it sounds like a good idea and, arguably, it really is. Unfortunately, in 99% of cases the outcome is yet another monstrous framework no one wants to use but is forced to, because "it took 15 man-years to develop and we cannot throw it away after that". Everyone struggles, productivity goes down at the speed of light, and quality does not improve at all.

Think twice before taking this route. The goal of simplifying the testing of the complex applications the organization is working on is certainly a good one. But don't fall into the trap of throwing people at it, who are going to spend a few months or even years working on the "perfect & mighty test framework" in isolation, maybe doing a great job overall, but not solving any real problems, instead just creating new ones. If you decide to embark on this train anyway, try hard to stay focused, address the pain points at their core, prototype, always ask for feedback, listen and iterate ... And keep it stable, easy to use, helpful, trustworthy and as lightweight as possible.

Conclusions

You cannot build everything right from the beginning. Sometimes our decisions are impacted by pressure and deadlines, or by factors outside of our control. But it does not mean that we should let things stay like that forever, getting worse and worse every single day. Please, don't ...

This post is a mesh of ideas from many people, terrific former teammates and awesome friends, so credit goes to all of them. If you happen to like it, you may be interested in checking out the upcoming post, where we are going to talk about architectural antipatterns.

Wednesday, April 26, 2017

When following REST(ful) principles might look impractical, GraphQL could come to the rescue

I am certainly late jumping on this trendy train, but today we are going to talk about GraphQL, a very interesting approach to building web services and APIs. In my opinion, it would be fair to state that REST(ful) architecture is built on top of quite reasonable principles and constraints (although the debates over that are never ending in the industry).

So ... what is GraphQL? By and large, it is yet another kind of query language. But what makes it interesting is that it is designed to give the clients (e.g., the frontends) the ability to express their needs (e.g., to the backends) in terms of the data they are expecting. Frankly speaking, GraphQL goes much further than that, but this is one of its most compelling features.

GraphQL is just a specification, without any particular requirements on the programming language or technology stack but, not surprisingly, it got widespread acceptance in modern web development, both on the client side (Apollo, Relay) and the server side (the class of APIs often called BFFs these days). In today's post we are going to give GraphQL a ride, discuss where it could be a good fit and why you may consider adopting it. Although there are quite a few options on the table, Sangria, a terrific Scala-based implementation of the GraphQL specification, is the foundation we are going to build atop.

Essentially, our goal is to develop a simple user management web API. The data model is far from being complete but good enough to serve its purpose. Here is the User class.

case class User(id: Long, email: String, created: Option[LocalDateTime], 
  firstName: Option[String], lastName: Option[String], roles: Option[Seq[Role.Value]], 
    active: Option[Boolean], address: Option[Address])

The Role is a hardcoded enumeration:

object Role extends Enumeration {
  val USER, ADMIN = Value
}

While the Address is another small class in the data model:

case class Address(street: String, city: String, state: String, 
  zip: String, country: String)

At its core, GraphQL is strongly typed. That means the application-specific models should somehow be represented in GraphQL. To put it naturally, we need to define a schema. In Sangria, schema definitions are pretty straightforward and consist of three main categories, borrowed from the GraphQL specification: object types, query types and mutation types. We are going to touch upon all of these, but the object type definitions sound like a logical point to start with.

val UserType = ObjectType(
  "User",
  "User Type",
  interfaces[Unit, User](),
  fields[Unit, User](
    Field("id", LongType, Some("User id"), tags = ProjectionName("id") :: Nil, 
      resolve = _.value.id),
    Field("email", StringType, Some("User email address"), 
      resolve = _.value.email),
    Field("created", OptionType(LocalDateTimeType), Some("User creation date"), 
      resolve = _.value.created),
    Field("address", OptionType(AddressType), Some("User address"), 
      resolve = _.value.address),
    Field("active", OptionType(BooleanType), Some("User is active or not"), 
      resolve = _.value.active),
    Field("firstName", OptionType(StringType), Some("User first name"), 
      resolve = _.value.firstName),
    Field("lastName", OptionType(StringType), Some("User last name"), 
      resolve = _.value.lastName),
    Field("roles", OptionType(ListType(RoleEnumType)), Some("User roles"), 
      resolve = _.value.roles)
  ))

In many respects, it is just a direct mapping from User to UserType. Sangria can ease you out of doing that by providing macro support, so you may get the schema generated at compile time. The AddressType definition is very similar, so let us skip over it and look at how to deal with enumerations, like Role.

val RoleEnumType = EnumType(
  "Role",
  Some("List of roles"),
  List(
    EnumValue("USER", value = Role.USER, description = Some("User")),
    EnumValue("ADMIN", value = Role.ADMIN, description = Some("Administrator"))
  ))

Easy, simple and compact ... In traditional REST(ful) web services the metadata about the resources is generally not available out of the box. However, several complementary specifications, like JSON Schema, could fill this gap with a bit of work.

Good, so the types are there, but what are these queries and mutations? Query is a special type within the GraphQL specification which basically describes how you would like to fetch the data and the shape of it. For example, there is often a need to get user details by identifier, which could be expressed by the following GraphQL query:

query {
  user(id: 1) {
    email
    firstName
    lastName
    address {
      country
    }
  }
}

You can literally read it as-is: look up the user with identifier 1 and return only his email, first and last names, and address with the country only. Awesome: not only are GraphQL queries exceptionally powerful, they also give the interested parties control over what is returned back. A priceless feature if you have to support a diversity of different clients without exploding the number of API endpoints. Defining the query types in Sangria is also a no-brainer, for example:

val Query = ObjectType(
  "Query", fields[Repository, Unit](
    Field("user", OptionType(UserType), arguments = ID :: Nil, 
      resolve = (ctx) => ctx.ctx.findById(ctx.arg(ID))),
    Field("users", ListType(UserType), 
      resolve = Projector((ctx, f) => ctx.ctx.findAll(f.map(_.name))))
  ))

There are in fact two queries which the code snippet above describes. The one we have seen before, fetching a user by identifier, and another one, fetching all users. Here is a quick example of the latter:

query {
  users {
    id 
    email 
  }
}

Hopefully you would agree that no explanations are needed, the intent is clear. Queries are arguably the strongest argument in favor of adopting GraphQL, the value proposition is really tremendous. With Sangria you do have access to the fields which the clients want back, so the data store could be told to return only these subsets, using projections, selects, or similar concepts. To be closer to reality, our sample application stores data in MongoDB, so we could ask it to return only the fields the client is interested in.

def findAll(fields: Seq[String]): Future[Seq[User]] = collection.flatMap(
   _.find(document())
    .projection(
      fields
       .foldLeft(document())((doc, field) => doc.merge(field -> BSONInteger(1)))
    )
    .cursor[User]()
    .collect(Int.MaxValue, Cursor.FailOnError[Seq[User]]())
  )

If we get back to typical REST(ful) web APIs, the approach most widely used these days to outline the shape of the desired response is to pass a query string parameter, for example /api/users?fields=email,firstName,lastName, .... However, from the implementation perspective, not many frameworks support such a feature natively, so everyone has to come up with their own way. Regarding the querying capabilities, in case you happen to be a user of the terrific Apache CXF framework, you may benefit from its quite powerful search extension, which we talked about some time ago.

While queries usually just fetch data, mutations serve the purpose of data modification. Syntactically they are very similar to queries but their interpretation is different. For example, here is one of the ways we could add a new user to the application.

mutation {
  add(email: "a@b.com", firstName: "John", lastName: "Smith", roles: [ADMIN]) {
    id
    active
    email
    firstName
    lastName
    address {
      street
      country
    }
  }
}

In this mutation a new user, John Smith, with email a@b.com and the ADMIN role assigned, is going to be added to the system. As with queries, the client is always in control of which data shape it needs from the server when the mutation completes. Mutations could be thought of as calls for action and resemble method invocations a lot; for example, the activation of the user may be done like that:

mutation {
  activate(id: 1) {
    active
  }
}

In Sangria, mutations are described exactly like queries; for example, the ones we have looked at before have the following type definition:

val Mutation = ObjectType(
  "Mutation", fields[Repository, Unit](
    Field("activate", OptionType(UserType), 
      arguments = ID :: Nil,
      resolve = (ctx) => ctx.ctx.activateById(ctx.arg(ID))),
    Field("add", OptionType(UserType), 
      arguments = EmailArg :: FirstNameArg :: LastNameArg :: RolesArg :: Nil,
      resolve = (ctx) => ctx.ctx.addUser(ctx.arg(EmailArg), ctx.arg(FirstNameArg), 
        ctx.arg(LastNameArg), ctx.arg(RolesArg)))
  ))

With that, our GraphQL schema is complete:

val UserSchema = Schema(Query, Some(Mutation))

That's great, however ... what can we do with it? A just-in-time question: please welcome the GraphQL server. As we remember, there is no attachment to a particular technology or stack, but in the universe of web APIs you can think of a GraphQL server as a single endpoint bound to the POST HTTP verb. And once we started to talk about HTTP and Scala, who could do a better job than the amazing Akka HTTP; luckily, Sangria has seamless integration with it.

val route: Route = path("users") {
  post {
    entity(as[String]) { document =>
      QueryParser.parse(document) match {
        case Success(queryAst) =>
          complete(Executor.execute(SchemaDefinition.UserSchema, queryAst, repository)
            .map(OK -> _)
            .recover {
              case error: QueryAnalysisError => BadRequest -> error.resolveError
              case error: ErrorWithResolver => InternalServerError -> error.resolveError
            })

        case Failure(error) => complete(BadRequest -> Error(error.getMessage))
      }
    }
  } ~ get {
    complete(SchemaRenderer.renderSchema(SchemaDefinition.UserSchema))
  }
}

You may notice that we also expose our schema under a GET endpoint; what is it here for? Well, if you are familiar with Swagger, which we have talked about a lot here, it is a very similar concept. The schema contains all the necessary pieces, enough for external tools to automatically discover the respective GraphQL queries and mutations, along with the types they are referencing. GraphiQL, an in-browser IDE for exploring GraphQL, is one of those (think of Swagger UI in the REST(ful) services world).

We are mostly there: our GraphQL server is ready, so let us run it and send off a couple of queries and mutations to get a feeling for it:

sbt run

[info] Running com.example.graphql.Boot
INFO  akka.event.slf4j.Slf4jLogger  - Slf4jLogger started
INFO  reactivemongo.api.MongoDriver  - No mongo-async-driver configuration found
INFO  reactivemongo.api.MongoDriver  - [Supervisor-1] Creating connection: Connection-2
INFO  r.core.actors.MongoDBSystem  - [Supervisor-1/Connection-2] Starting the MongoDBSystem akka://reactivemongo/user/Connection-2

Very likely our data store (we are using MongoDB run as a Docker container) has no users at the moment, so it sounds like a good idea to add one right away:

$ curl -vi -X POST http://localhost:48080/users -H "Content-Type: application/json" -d " \
   mutation { \
     add(email: \"a@b.com\", firstName: \"John\", lastName: \"Smith\", roles: [ADMIN]) { \
       id \
       active \
       email \
       firstName \
       lastName \
       address { \
       street \
         country \
       } \
     } \
   }"

HTTP/1.1 200 OK
Server: akka-http/10.0.5
Date: Tue, 25 Apr 2017 01:01:25 GMT
Content-Type: application/json
Content-Length: 123

{
  "data":{
    "add":{
      "email":"a@b.com",
      "lastName":"Smith",
      "firstName":"John",
      "id":1493082085000,
      "address":null,
      "active":false
    }
  }
}

It seems to work perfectly fine. The response details will always be wrapped into the data envelope, no matter what kind of query or mutation you are running, for example:

$ curl -vi -X POST http://localhost:48080/users -H "Content-Type: application/json" -d " \
   query { \
     users { \
       id \
       email \
     } \
   }"

HTTP/1.1 200 OK
Server: akka-http/10.0.5
Date: Tue, 25 Apr 2017 01:09:21 GMT
Content-Type: application/json
Content-Length: 98

{
  "data":{
    "users":[
      {
        "id":1493082085000,
        "email":"a@b.com"
      }
    ]
  }
}

Exactly as we ordered ... Honestly, working with GraphQL feels natural, especially when data querying is involved. And we didn't even talk about fragments, variables, directives and a lot of other things.

Now comes the question: should we abandon all our practices, JAX-RS, Spring MVC, ... and switch to GraphQL? I honestly believe that this is not the case; GraphQL is a good fit for addressing certain kinds of problems, but, by and large, traditional REST(ful) web services, combined with Swagger or any other established API specification framework, are here to stay.

And please be warned: along with the benefits, GraphQL comes at a price. For example, HTTP caching and cache control won't apply anymore, HATEOAS does not make much sense either, the responses look uniform no matter what you are calling, reliability suffers as everything is behind a single facade, ... With that in mind, GraphQL is indeed a great tool, please use it wisely!

The complete project source is available on Github.

Monday, January 9, 2017

Overcast and Docker: test your apps as if you are releasing them to production

I am sure everyone would agree that a rare application architecture these days could survive without relying on some kind of data store (either relational or NoSQL), or/and messaging middleware, or/and external caches, ... just to name a few. Testing such applications becomes a real challenge.

Luckily, if you are a JVM developer, things are not that bad. Most of the time you have the option to fall back to an embedded version of the data store or message broker in your integration or component test scenarios. But what if the solution you are using is not JVM-based? Great examples of those are RabbitMQ, Redis, Memcached, MySQL and PostgreSQL, which are extremely popular choices these days, and for very good reasons. Even better, what if your integration testing strategy is set to exercise the component (read, microservice) in an environment as close to production as possible? Should we give up here? Or should we write a bunch of flaky shell scripts to orchestrate the test runs and scare most of the developers to death? Let us see what we can do here ...

Many of you are already screaming at this point: just use Docker, or CoreOS! And this is exactly what we are going to talk about in this post; more precisely, how to use Docker to back integration / component testing. I think Docker does not need an introduction anymore. Even those of us who spent the last couple of years in a cave on a deserted island in the middle of the ocean have heard about it.

Our sample application is going to be built on top of the Spring projects portfolio, heavily relying on Spring Boot magic to wire all the pieces together (who doesn't, right? it works pretty well indeed). It will be implementing a very simple workflow: publish a message to the RabbitMQ exchange app.exchange (using the app.queue routing key), consume the message from the RabbitMQ queue app.queue and store it in a Redis list data structure under the key messages. The three self-explanatory code snippets below demonstrate how each functional piece is done:

@Component
public class AppQueueMessageSender {
    @Autowired private RabbitTemplate rabbitTemplate;
 
    public void send(final String message) {
        rabbitTemplate.convertAndSend("app.exchange", "app.queue", message);
    }
}
@Component
public class AppQueueMessageListener {
    @Autowired private AppMessageRepository repository;
 
    @RabbitListener(
        queues = "app.queue", 
        containerFactory = "rabbitListenerContainerFactory", 
        admin = "amqpAdmin"
    )
    public void onMessage(final String message) {
        repository.persist(message);
    }
}
@Repository
public class AppMessageRepository {
    @Autowired private StringRedisTemplate redisTemplate;
 
    public void persist(final String message) {
        redisTemplate.opsForList().rightPush("messages", message);
    }
 
    public Collection<String> readAll() {
        return redisTemplate.opsForList().range("messages", 0, -1);
    }
 
    public long size() {
        return redisTemplate.opsForList().size("messages");
    }
}

As you can see, the implementation deliberately does the bare minimum; we are more interested in the fact that quite a few interactions with RabbitMQ and Redis are happening here. The configuration class includes only the necessary beans, everything else is figured out by Spring Boot automatic discovery from the classpath dependencies.

@Configuration
@EnableAutoConfiguration
@EnableRabbit
@ComponentScan(basePackageClasses = AppConfiguration.class)
public class AppConfiguration {
    @Bean
    Queue queue() {
        return new Queue("app.queue", false);
    }

    @Bean
    TopicExchange exchange() {
        return new TopicExchange("app.exchange");
    }

    @Bean
    Binding binding(Queue queue, TopicExchange exchange) {
        return BindingBuilder.bind(queue).to(exchange).with(queue.getName());
    }
 
    @Bean
    StringRedisTemplate template(RedisConnectionFactory connectionFactory) {
        return new StringRedisTemplate(connectionFactory);
    }
}

At the very end comes application.yml. Essentially, it contains the default connection parameters for RabbitMQ and Redis, plus a bit of logging level tuning.

logging:
  level:
    root: INFO
    
spring:
  rabbitmq:
    host: localhost
    username: guest
    password: guest
    port: 5672
    
  redis:
    host: localhost
    port: 6379

With that, our application is ready to be run. For convenience, the project repository contains a docker-compose.yml with the official RabbitMQ and Redis images from the Docker Hub.

Being TDD believers and practitioners, we make sure no application leaves the gate without a thorough set of test suites and test cases. Keeping unit tests and integration tests out of the scope of our discussion, let us jump right into component testing with this simple scenario.

@RunWith(SpringJUnit4ClassRunner.class)
@SpringBootTest(
    classes = { AppConfiguration.class }, 
    webEnvironment = WebEnvironment.NONE
)
public class AppComponentTest {
    @Autowired private AppQueueMessageSender sender;
    @Autowired private AppMessageRepository repository;
 
    @Test
    public void testThatMessageHasBeenPersisted() {
        sender.send("Test Message!");
        await().atMost(1, SECONDS).until(() -> repository.size() > 0);
        assertThat(repository.readAll()).containsExactly("Test Message!");
    }
}

It is a really basic test case, exercising the main flow, but it makes an important point: no mocks / stubs / ... allowed, we expect the real thing. The line is somewhat blurry, but this is what makes component tests different from, let's say, integration or e2e tests: we test a single full-fledged application (component) with the real dependencies (when it makes sense).

This is the right time for the excellent Overcast project to appear on the stage and help us out. Overcast brings the power of Docker to enrich the test harness of JVM applications. Among many other things, it allows defining and managing the lifecycle of Docker containers from within Java code (or, more precisely, any programming language based on the JVM).

Unfortunately, the last released version of Overcast, 2.5.1, is pretty old and does not include a lot of features and enhancements. However, building it from source is a no-brainer (hopefully a new release is going to be available soon).

git clone https://github.com/xebialabs/overcast
cd overcast
./gradlew install

Essentially, the only prerequisite is to provide the configuration file overcast.conf with the list of named containers to run. In our case, we need RabbitMQ and Redis.

rabbitmq {
    dockerImage="rabbitmq:3.6.6"
    exposeAllPorts=true
    remove=true
    removeVolume=true
}

redis {
    dockerImage="redis:3.2.6"
    exposeAllPorts=true
    remove=true
    removeVolume=true
}

Great! The syntax is not as powerful as what Docker Compose supports, but it is simple, straightforward and, to be fair, quite sufficient. Once the configuration file is placed into the src/test/resources folder, we could move on and use the Overcast Java API to manage these containers programmatically. It is natural to introduce a dedicated configuration class in this case, as we are using the Spring Framework.

@Configuration
public class OvercastConfiguration {
    @Autowired private ConfigurableEnvironment env;

    @Bean(initMethod = "setup", destroyMethod = "teardown")
    @Qualifier("rabbitmq")
    public CloudHost rabbitmq() {
        return CloudHostFactory.getCloudHost("rabbitmq");
    }

    @Bean(initMethod = "setup", destroyMethod = "teardown")
    @Qualifier("redis")
    public CloudHost redis() {
        return CloudHostFactory.getCloudHost("redis");
    }

    @PostConstruct
    public void init() throws TimeoutException {
        final CloudHost redis = redis();
        final CloudHost rabbitmq = rabbitmq();

        final Map<String, Object> properties = new HashMap<>();
        properties.put("spring.rabbitmq.host", rabbitmq.getHostName());
        properties.put("spring.rabbitmq.port", rabbitmq.getPort(5672));

        properties.put("spring.redis.host", redis.getHostName());
        properties.put("spring.redis.port", redis.getPort(6379));

        final PropertySource<?> source = new MapPropertySource("overcast", properties);
        env.getPropertySources().addFirst(source);
    }
}

And that is literally all we need! Just a couple of important notes here. Docker is going to expose random ports for each container, so we could run many test cases in parallel on the same box without any port conflicts. On most operating systems it is safe to use localhost to access the running containers, but for the ones without native Docker support there are workarounds with Docker Machine or boot2docker. That is why we override the connection settings, both host and port, for RabbitMQ and Redis respectively, asking for the bindings at runtime:

properties.put("spring.rabbitmq.host", rabbitmq.getHostName());
properties.put("spring.rabbitmq.port", rabbitmq.getPort(5672));

properties.put("spring.redis.host", redis.getHostName());
properties.put("spring.redis.port", redis.getPort(6379));

Lastly, more advanced Docker users may wonder how Overcast is able to figure out where the Docker daemon is running. Which port is it bound to? Does it use TLS or not? Under the hood Overcast uses the terrific Spotify Docker Client, which is able to retrieve all the relevant details from the environment variables; this works in the majority of use cases (though you can always provide your own settings).

To finish up, let us include this configuration into the test case:

@SpringBootTest(
    classes = { OvercastConfiguration.class, AppConfiguration.class }, 
    webEnvironment = WebEnvironment.NONE
)

Easy, isn't it? If we go ahead and run mvn test for our project, all test cases should pass (please notice that the first run may take some time as Docker will have to pull the container images from the remote repository).

> mvn test
...
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0
...

No doubt, Docker raises testing techniques to a new level. With the help of such awesome libraries as Overcast, seasoned JVM developers have even more options to come up with realistic test scenarios and run them against the components in a "mostly" production environment (on the wave of hype, it fits perfectly into microservices testing strategies). There are many areas where Overcast could and will improve, but it brings a lot of value even now and is definitely worth checking out.

Probably the most annoying issue you may encounter when working with Docker containers is waiting for the moment when the container is fully started and ready to accept requests (which heavily depends on what kind of underlying service the container is running). Although work on that has been started, Overcast does not help with this particular problem yet, though simple, old-style sleeps may be good enough (versus, for example, a bit more complex port polling).
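For reference, a rough sketch of such port polling (plain Java, nothing Overcast-specific, the class name and timeouts are arbitrary) could look along these lines:

import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.time.Duration;
import java.util.concurrent.TimeoutException;

public final class PortWaiter {
    // Polls the TCP port until a connection succeeds or the timeout expires.
    public static void await(String host, int port, Duration timeout) 
            throws TimeoutException, InterruptedException {
        final long deadline = System.nanoTime() + timeout.toNanos();
        while (System.nanoTime() < deadline) {
            try (Socket socket = new Socket()) {
                socket.connect(new InetSocketAddress(host, port), 1000);
                return; // the port is open, the container is (most likely) ready
            } catch (IOException ex) {
                Thread.sleep(500); // not up yet, try again shortly
            }
        }
        throw new TimeoutException(host + ":" + port + " did not become available in time");
    }
}

It could be invoked right from the init() method above, for example PortWaiter.await(rabbitmq.getHostName(), rabbitmq.getPort(5672), Duration.ofSeconds(30)), keeping in mind that an open port does not always mean the service behind it is fully initialized.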

But, but, but ... always remember the testing pyramid and strive for the right balance. Create as many test cases as you need to cover the most critical and important flows, but no more. Unit and integration tests should be your main weapon.

The complete project is available on Github.