Monday, June 19, 2017

The Patterns of the Antipatterns: Design / Testing

It has been a while since I started my professional career as a software developer. Over the years I have worked for many companies, on a whole bunch of different projects, trying my best to deliver them and be proud of the job. However, drifting from company to company, from project to project, you see the same bad design decisions made over and over again, no matter how old the codebase is or how many people have touched it. They are so widely present that they truly become patterns themselves, hence the title of this blog: the patterns of the antipatterns.

I am definitely not the first one to identify them and certainly not the last one either (so all the credit goes to the original sources, if any exist). They all come from real projects, primarily Java ones, though other details should stay undisclosed. With that ...

Code first, no time to think

This hardly sounds like the name of an antipattern, but I believe it is. We are all developers, and unsurprisingly we love to code. Whenever we get some feature or idea to implement, we often just start coding it, without thinking much about the design or even looking around to see whether we could borrow / reuse some functionality which is already there. As such, we reinvent the wheel all the time, hopefully enjoying at least the coding part ...

I am absolutely convinced that spending some time just thinking about the high-level design, incarnating it in terms of interfaces, traits or protocols (whatever your language of choice is offering), is a must-do exercise. It won't really take more time, but the results are rewarding: a beautiful, well-thought-out solution at the end.

TDD is a set of terrific practices to help and guide you through this. While you are developing test cases, you iterate over and over, making changes and with each step getting closer to the right implementation. Please adopt it in each and every project you are part of.

Occasional Doer

Good, enough philosophy, now we are moving on to real antipatterns. Undoubtedly, many of you have witnessed pieces of code like this:

public void setSomething(Object something) {
    if (!heavensAreFalling) {
        this.something = something;
    }
}

Essentially, you call the method to set some value, and surely you expect the value to be set once the invocation completes. However, due to the presence of some logic inside the method implementation, it may actually do nothing, silently ...

This is an example of really bad design. Instead, such interactions could be modeled using, for example, the State pattern, or at least by throwing an exception saying that the operation cannot be completed due to internal constraints. Yes, it may require a bit more code to be written, but at least the expectations will be met.
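To illustrate the second option, here is a minimal sketch of the "fail loudly" variant (the exception type and message are just one possible choice):

public void setSomething(Object something) {
    if (heavensAreFalling) {
        // Make the unmet expectation explicit instead of silently ignoring the call
        throw new IllegalStateException("Cannot set the value: heavens are falling");
    }
    this.something = something;
}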

The Universe

Often there is a class within the application which references (or is referenced by) almost every other class of the application. For example, all classes in the project must inherit from some base class:

public class MyService extends MyBase {
  ...
}
public class MyDao extends MyBase {
  ...
}

In most cases, this is not a good idea. Not only does it create a hard dependency on this single class, but also on every dependency this class uses internally. You may argue that in Java every single class implicitly inherits from Object. Indeed, but this is part of the language specification and has nothing to do with application logic, so there is no need to mimic that.

Another extreme is to have a class which serves as the central brain of the application. It knows about every service / dao / ... and provides accessors (most of the time static) so anyone can just turn to it and ask for anything, for example:

public class Services {
    public static MyService1 getMyService1() {...}
    public static MyService2 getMyService2() {...}
    public static MyService3 getMyService3() {...}
}

Those are universes; eventually they leak into every class of the application (it is easy to call a static method, right?) and couple everything together. In a more or less large codebase, getting rid of such universes is exceptionally hard.

To prevent such things from poisoning your projects, use the dependency injection pattern, implemented by Spring Framework, Dagger, Guice, CDI or HK2, or simply pass the necessary dependencies through constructors or method arguments. It will actually help you to see the smells early on and address them before they even become a problem.
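For example, instead of reaching out to the static Services universe, a consumer can simply declare what it needs through its constructor. The OrderProcessor below is a purely hypothetical sketch, reusing the MyService types from the snippet above:

public class OrderProcessor {
    private final MyService1 service1;
    private final MyService2 service2;

    // The dependencies are explicit and visible at construction time;
    // a constructor that keeps growing is an early smell worth addressing
    public OrderProcessor(MyService1 service1, MyService2 service2) {
        this.service1 = service1;
        this.service2 = service2;
    }
    ...
}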

The Onionated Inheritance

This is a really fun and scary one, very often present in projects which expose and implement REST(ful) web services. Let us suppose you have a Customer class, with quite a few different properties, for example:

public class Customer {
    private String id;
    private String firstName;
    private String lastName;
    private Address address;
    private Company company;
    ...
}

Now, you have been asked to create an API (or expose it through other means) which finds the customer but returns only his id, firstName and lastName. Sounds straightforward, doesn't it? Let us do it:

public class CustomerWithIdAndNames {
    private String id;
    private String firstName;
    private String lastName;
}

Looks pretty neat, right? Everyone is happy, but the next day you get to work on another feature where you need to design an API which returns id, firstName, lastName and also company. Sounds pretty easy, we already have CustomerWithIdAndNames, we only need to enrich it with a company. A perfect job for inheritance, right?

public class CustomerWithIdAndNamesAndCompany extends CustomerWithIdAndNames {
    private Company company;
}

It works! But after a week another request comes in, where a new API also needs to expose the address property; anyway, you get the idea. So in the end you wind up with a dozen classes which extend each other, adding properties here and there (like onions, hence the antipattern name) to fulfill the API contracts.

Yet another road to hell ... There are quite a few options here; the simplest one is JSON Views (here are some examples using Jackson) where the underlying class stays the same but different views of it can be returned. In case you really care about not fetching the data you don't need, another option is GraphQL, which we covered last time. Essentially, the message here is: don't create such onionated hierarchies, use a single representation but apply different techniques to assemble it and fetch the necessary pieces efficiently.
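As a taste of the JSON Views approach, here is a minimal sketch using Jackson annotations (the Views marker classes and the helper methods are illustrative assumptions, not part of the original project):

import com.fasterxml.jackson.annotation.JsonView;
import com.fasterxml.jackson.databind.ObjectMapper;

class Views {
    static class IdAndNames {}
    static class WithCompany extends IdAndNames {}
}

class Customer {
    @JsonView(Views.IdAndNames.class)  private String id;
    @JsonView(Views.IdAndNames.class)  private String firstName;
    @JsonView(Views.IdAndNames.class)  private String lastName;
    @JsonView(Views.WithCompany.class) private Company company;
    private Address address;
    // getters / setters omitted
}

class CustomerViews {
    // The very same Customer instance is rendered differently depending on the active view
    static String idAndNames(ObjectMapper mapper, Customer customer) throws Exception {
        return mapper.writerWithView(Views.IdAndNames.class).writeValueAsString(customer);
    }

    static String withCompany(ObjectMapper mapper, Customer customer) throws Exception {
        return mapper.writerWithView(Views.WithCompany.class).writeValueAsString(customer);
    }
}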

IDD: if-driven development

Most real-world projects are internally built using a pretty complex set of application and business rules, cemented by applying different layers of extensibility and customization (if needed). In most cases, the implementation choice is pretty unpretentious and the logic is driven by conditional statements, for example:

int price = ...;
if (calculateTaxes) {
    price += taxes;
}

With time, the business rules evolve, and so do the conditional expressions they are embodied in, turning into real monsters at the end:

int price = ...;
if (calculateTaxes && (country == "US" || country == "GB") && 
    discount == null || (discount != null && discount.getType() == TAXES)) {
    price += taxes;
}

You see where I am going with this: the project is built on top of IDD practices, if-driven development. Not only does the code become fragile, prone to condition errors and very hard to follow, it is very scary to change as well! To be fair, you also need to test all possible combinations of such if branches to ensure they make sense and the right branch is picked (I bet no one is actually doing that, because it would require a gigantic amount of tests and effort)!

In many cases feature toggles could be an answer. But if you are used to writing such code, please take some time and read books about design patterns, test-driven development and coding practices. There are so many of them; below is the list I would definitely recommend:

There are so many great ones to add here, but those are a very good starting point and a highly rewarding investment. Much better alternatives to if statements are going to fill your mind; one possible direction for the pricing example above is sketched right below.
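The sketch assumes hypothetical Order and Discount types standing in for the original snippet's fields; the idea is to push every business rule behind its own small, individually testable strategy so each condition lives in exactly one named place:

import java.util.*;

// Hypothetical types standing in for the price / taxes / discount fields of the original snippet
class Order {
    String country;
    int taxes;
    Discount discount;
}

class Discount {
    enum Type { TAXES, OTHER }
    Type type;
}

// Each business rule becomes a small strategy that can be unit tested in isolation
interface PricingRule {
    boolean applies(Order order);
    int apply(int price, Order order);
}

class TaxRule implements PricingRule {
    private static final Set<String> TAXED_COUNTRIES = new HashSet<>(Arrays.asList("US", "GB"));

    public boolean applies(Order order) {
        return TAXED_COUNTRIES.contains(order.country)
            && (order.discount == null || order.discount.type == Discount.Type.TAXES);
    }

    public int apply(int price, Order order) {
        return price + order.taxes;
    }
}

class PriceCalculator {
    private final List<PricingRule> rules;

    PriceCalculator(List<PricingRule> rules) {
        this.rules = rules;
    }

    int calculate(int basePrice, Order order) {
        int price = basePrice;
        for (PricingRule rule : rules) {
            if (rule.applies(order)) { // a single, well-named "if" per rule
                price = rule.apply(price, order);
            }
        }
        return price;
    }
}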

Test Lottery

When you hear something like "All projects in our organization are agile and use TDD practices", idealistic pictures come to mind and you hope everything is done right, by the book. In reality, most projects are very far from that. Hopefully, at least there are some test suites you can trust, or can you?

Please welcome the test lottery: the kingdom of flaky tests. Have you ever seen builds with random test failures? Where on every consecutive build (without any changes introduced) some tests suddenly pass while others start to fail? And maybe one build out of ten turns green (jackpot!) as the stars finally get aligned properly? If no one cares, it becomes the new normal, and every new member who joins the team is told to ignore those failures: "they have been failing for ages, screw them".

Tests are as important as the mainstream code you push into production. They need to be maintained, refactored and kept clean. Keep all your builds green all the time; if you see some flakiness or random failures, address them right away. It is not easy, but in many cases you may actually run into real bugs! You have to trust your test suites, otherwise why do you need them at all?

Test Framework Factory

This guy is an outstanding one. It happens so often that it feels like every organization is just obligated to create its own test framework, usually on top of existing ones. At first it sounds like a good idea, and arguably it really is. Unfortunately, in 99% of cases the outcome is yet another monstrous framework no one wants to use but is forced to, because "it took 15 man-years to develop and we cannot throw it away after that". Everyone struggles, productivity goes down at the speed of light, and quality does not improve at all.

Think twice before taking this route. The goal of simplifying the testing of the complex applications the organization is working on is certainly a good one. But don't fall into the trap of throwing people at it, who are going to spend a few months or even years working on the "perfect & mighty test framework" in isolation, maybe doing a great job overall, but not solving any real problems, instead just creating new ones. If you decide to embark on this train anyway, try hard to stay focused, address the pain points at their core, prototype, always ask for feedback, listen and iterate ... And keep it stable, easy to use, helpful, trustworthy and as lightweight as possible.

Conclusions

You cannot build everything right from the beginning. Sometimes our decisions are impacted by pressure and deadlines, or by factors outside of our control. But that does not mean we should let things stay like that forever, getting worse and worse every single day. Please, don't ...

This post is a mesh of ideas from many people, terrific former teammates and awesome friends, and the credit goes to all of them. If you happen to like it, you may be interested in checking out the upcoming post, where we are going to talk about architectural antipatterns.

Wednesday, April 26, 2017

When following REST(ful) principles might look impractical, GraphQL could come to the rescue

I am certainly late jumping on the trendy train, but today we are going to talk about GraphQL, a very interesting alternative approach to building web services and APIs. In my opinion, it would be fair to restate that REST(ful) architecture is built on top of quite reasonable principles and constraints (although the debates over that never end in the industry).

So ... what is GraphQL? By and large, it is yet another kind of query language. But what makes it interesting is that it is designed to give the clients (e.g., the frontends) the ability to express their needs (e.g., to the backends) in terms of the data they expect. Frankly speaking, GraphQL goes much further than that, but this is one of its most compelling features.

GraphQL is just a specification, without any particular requirements regarding programming language or technology stack, but, not surprisingly, it has gained widespread acceptance in modern web development, both on the client side (Apollo, Relay) and the server side (the class of APIs often called BFFs these days). In today's post we are going to give GraphQL a ride, discuss where it could be a good fit and why you may consider adopting it. Although there are quite a few options on the table, Sangria, a terrific Scala-based implementation of the GraphQL specification, is the foundation we are going to build atop.

Essentially, our goal is to develop a simple user management web API. The data model is far from being complete but good enough to serve its purpose. Here is the User class.

case class User(id: Long, email: String, created: Option[LocalDateTime], 
  firstName: Option[String], lastName: Option[String], roles: Option[Seq[Role.Value]], 
    active: Option[Boolean], address: Option[Address])

The Role is a hardcoded enumeration:

object Role extends Enumeration {
  val USER, ADMIN = Value
}

While the Address is another small class in the data model:

case class Address(street: String, city: String, state: String, 
  zip: String, country: String)

At its core, GraphQL is strongly typed. That means the application-specific models should somehow be represented in GraphQL; to put it naturally, we need to define a schema. In Sangria, schema definitions are pretty straightforward and consist of three main categories, borrowed from the GraphQL specification: object types, query types and mutation types. We are going to touch on all of these, but the object type definitions sound like a logical point to start with.

val UserType = ObjectType(
  "User",
  "User Type",
  interfaces[Unit, User](),
  fields[Unit, User](
    Field("id", LongType, Some("User id"), tags = ProjectionName("id") :: Nil, 
      resolve = _.value.id),
    Field("email", StringType, Some("User email address"), 
      resolve = _.value.email),
    Field("created", OptionType(LocalDateTimeType), Some("User creation date"), 
      resolve = _.value.created),
    Field("address", OptionType(AddressType), Some("User address"), 
      resolve = _.value.address),
    Field("active", OptionType(BooleanType), Some("User is active or not"), 
      resolve = _.value.active),
    Field("firstName", OptionType(StringType), Some("User first name"), 
      resolve = _.value.firstName),
    Field("lastName", OptionType(StringType), Some("User last name"), 
      resolve = _.value.lastName),
    Field("roles", OptionType(ListType(RoleEnumType)), Some("User roles"), 
      resolve = _.value.roles)
  ))

In many respects, it is just a direct mapping from User to UserType. Sangria can save you from doing that by hand thanks to its macro support, so you may get the schema generated at compile time. The AddressType definition is very similar, so let us skip over it and look at how to deal with enumerations, like Role.

val RoleEnumType = EnumType(
  "Role",
  Some("List of roles"),
  List(
    EnumValue("USER", value = Role.USER, description = Some("User")),
    EnumValue("ADMIN", value = Role.ADMIN, description = Some("Administrator"))
  ))

Easy, simple and compact ... In traditional REST(ful) web services, metadata about the resources is generally not available out of the box. However, several complementary specifications, like JSON Schema, could fill this gap with a bit of work.

Good, so the types are there, but what are these queries and mutations? Query is a special type within the GraphQL specification which basically describes how you would like to fetch the data and what shape it should have. For example, there is often a need to get user details by identifier, which could be expressed by the following GraphQL query:

query {
  user(id: 1) {
    email
    firstName
    lastName
    address {
      country
    }
  }
}

You can literally read it as-is: look up the user with identifier 1 and return only his email, first and last names, and the address with the country only. Awesome, not only are GraphQL queries exceptionally powerful, but they also hand the control over what is returned to the interested parties. A priceless feature if you have to support a diversity of clients without exploding the number of API endpoints. Defining the query types in Sangria is also a no-brainer, for example:

val Query = ObjectType(
  "Query", fields[Repository, Unit](
    Field("user", OptionType(UserType), arguments = ID :: Nil, 
      resolve = (ctx) => ctx.ctx.findById(ctx.arg(ID))),
    Field("users", ListType(UserType), 
      resolve = Projector((ctx, f) => ctx.ctx.findAll(f.map(_.name))))
  ))

The code snippet above in fact describes two queries: the one we have seen before, fetching a user by identifier, and another one, fetching all users. Here is a quick example of the latter:

query {
  users {
    id 
    email 
  }
}

Hopefully you would agree that no explanation is needed, the intent is clear. Queries are arguably the strongest argument in favor of adopting GraphQL; the value proposition is really tremendous. With Sangria you have access to the fields which clients want back, so the data store could be told to return only those subsets, using projections, selects or similar concepts. To be closer to reality, our sample application stores data in MongoDB, so we could ask it to return only the fields the client is interested in.

def findAll(fields: Seq[String]): Future[Seq[User]] = collection.flatMap(
   _.find(document())
    .projection(
      fields
       .foldLeft(document())((doc, field) => doc.merge(field -> BSONInteger(1)))
    )
    .cursor[User]()
    .collect(Int.MaxValue, Cursor.FailOnError[Seq[User]]())
  )

If we get back to typical REST(ful) web APIs, the approach most widely used these days to outline the shape of the desired response is to pass a query string parameter, for example /api/users?fields=email,firstName,lastName, .... However, from the implementation perspective, not many frameworks support such features natively, so everyone has to come up with their own way, as illustrated below. Regarding the querying capabilities, in case you happen to be a user of the terrific Apache CXF framework, you may benefit from its quite powerful search extension, which we talked about some time ago.
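Just to give a flavor of the hand-rolled plumbing this convention usually implies, here is a hypothetical JAX-RS resource in Java (the UserService, the User getters and the default field set are assumptions made purely for this sketch):

import java.util.*;
import java.util.stream.Collectors;
import javax.ws.rs.*;
import javax.ws.rs.core.MediaType;

@Path("/api/users")
public class UserResource {
    private final UserService userService; // hypothetical service returning User entities

    public UserResource(UserService userService) {
        this.userService = userService;
    }

    @GET
    @Produces(MediaType.APPLICATION_JSON)
    public List<Map<String, Object>> findAll(@QueryParam("fields") String fields) {
        // Parse the "?fields=email,firstName,lastName" convention by hand
        final Set<String> requested = (fields == null || fields.isEmpty())
            ? new HashSet<>(Arrays.asList("id", "email", "firstName", "lastName"))
            : new HashSet<>(Arrays.asList(fields.split(",")));

        return userService.findAll().stream()
            .map(user -> toView(user, requested))
            .collect(Collectors.toList());
    }

    // Keep only the properties the client explicitly asked for
    private Map<String, Object> toView(User user, Set<String> requested) {
        final Map<String, Object> view = new LinkedHashMap<>();
        if (requested.contains("id")) view.put("id", user.getId());
        if (requested.contains("email")) view.put("email", user.getEmail());
        if (requested.contains("firstName")) view.put("firstName", user.getFirstName());
        if (requested.contains("lastName")) view.put("lastName", user.getLastName());
        return view;
    }
}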

If queries usually just fetch data, mutations serve the purpose of data modification. Syntactically they are very similar to queries, but their interpretation is different. For example, here is one of the ways we could add a new user to the application.

mutation {
  add(email: "a@b.com", firstName: "John", lastName: "Smith", roles: [ADMIN]) {
    id
    active
    email
    firstName
    lastName
    address {
      street
      country
    }
  }
}

In this mutation a new user, John Smith, with email a@b.com and the ADMIN role assigned, is going to be added to the system. As with queries, the client is always in control of which data shape it needs from the server once the mutation completes. Mutations can be thought of as calls for action and resemble method invocations a lot; for example, the activation of a user may be done like that:

mutation {
  activate(id: 1) {
    active
  }
}

In Sangria, mutations are described exactly like queries; for example, the ones we have just looked at have the following type definition:

val Mutation = ObjectType(
  "Mutation", fields[Repository, Unit](
    Field("activate", OptionType(UserType), 
      arguments = ID :: Nil,
      resolve = (ctx) => ctx.ctx.activateById(ctx.arg(ID))),
    Field("add", OptionType(UserType), 
      arguments = EmailArg :: FirstNameArg :: LastNameArg :: RolesArg :: Nil,
      resolve = (ctx) => ctx.ctx.addUser(ctx.arg(EmailArg), ctx.arg(FirstNameArg), 
        ctx.arg(LastNameArg), ctx.arg(RolesArg)))
  ))

With that, our GraphQL schema is complete:

val UserSchema = Schema(Query, Some(Mutation))

That's great, however ... what can we do with it? A just-in-time question, please welcome the GraphQL server. As we remember, there is no attachment to a particular technology or stack, but in the universe of web APIs you can think of a GraphQL server as a single endpoint bound to the POST HTTP verb. And, once we start talking about HTTP and Scala, who could do a better job than the amazing Akka HTTP; luckily, Sangria has seamless integration with it.

val route: Route = path("users") {
  post {
    entity(as[String]) { document =>
      QueryParser.parse(document) match {
        case Success(queryAst) =>
          complete(Executor.execute(SchemaDefinition.UserSchema, queryAst, repository)
            .map(OK -> _)
            .recover {
              case error: QueryAnalysisError => BadRequest -> error.resolveError
              case error: ErrorWithResolver => InternalServerError -> error.resolveError
            })

        case Failure(error) => complete(BadRequest -> Error(error.getMessage))
      }
    }
  } ~ get {
    complete(SchemaRenderer.renderSchema(SchemaDefinition.UserSchema))
  }
}

You may notice that we also expose our schema under a GET endpoint; what is it here for? Well, if you are familiar with Swagger, which we have talked about a lot here, it is a very similar concept. The schema contains all the necessary pieces, enough for external tools to automatically discover the respective GraphQL queries and mutations, along with the types they reference. GraphiQL, an in-browser IDE for exploring GraphQL, is one of those (think of Swagger UI in the REST(ful) services world).

We are mostly there, our GraphQL server is ready; let us run it and send off a couple of queries and mutations to get a feel for it:

sbt run

[info] Running com.example.graphql.Boot
INFO  akka.event.slf4j.Slf4jLogger  - Slf4jLogger started
INFO  reactivemongo.api.MongoDriver  - No mongo-async-driver configuration found
INFO  reactivemongo.api.MongoDriver  - [Supervisor-1] Creating connection: Connection-2
INFO  r.core.actors.MongoDBSystem  - [Supervisor-1/Connection-2] Starting the MongoDBSystem akka://reactivemongo/user/Connection-2

Very likely our data store (we are using MongoDB running as a Docker container) has no users at the moment, so it sounds like a good idea to add one right away:

$ curl -vi -X POST http://localhost:48080/users -H "Content-Type: application/json" -d " \
   mutation { \
     add(email: \"a@b.com\", firstName: \"John\", lastName: \"Smith\", roles: [ADMIN]) { \
       id \
       active \
       email \
       firstName \
       lastName \
       address { \
         street \
         country \
       } \
     } \
   }"

HTTP/1.1 200 OK
Server: akka-http/10.0.5
Date: Tue, 25 Apr 2017 01:01:25 GMT
Content-Type: application/json
Content-Length: 123

{
  "data":{
    "add":{
      "email":"a@b.com",
      "lastName":"Smith",
      "firstName":"John",
      "id":1493082085000,
      "address":null,
      "active":false
    }
  }
}

It seems to work perfectly fine. The response details will always be wrapped in a data envelope, no matter what kind of query or mutation you are running, for example:

$ curl -vi -X POST http://localhost:48080/users -H "Content-Type: application/json" -d " \
   query { \
     users { \
       id \
       email \
     } \
   }"

HTTP/1.1 200 OK
Server: akka-http/10.0.5
Date: Tue, 25 Apr 2017 01:09:21 GMT
Content-Type: application/json
Content-Length: 98
                                                                                                                                                  
{
  "data":{
    "users":[
      {
        "id":1493082085000,
        "email":"a@b.com"
      }
    ]
  }
}

Exactly as we ordered ... Honestly, working with GraphQL feels natural, especially when data querying is involved. And we didn't even talk about fragments, variables, directives and a lot of other things.

Now comes the question: should we abandon all our practices, JAX-RS, Spring MVC, ... and switch to GraphQL? I honestly believe that this is not the case; GraphQL is a good fit for a certain class of problems, but by and large, traditional REST(ful) web services, combined with Swagger or any other established API specification framework, are here to stay.

And please be warned, along with the benefits, GraphQL comes at a price. For example, HTTP caching and cache control won't apply anymore, HATEOAS does not make much sense either, responses are uniform no matter what you are calling, reliability suffers as everything sits behind a single facade, ... With that in mind, GraphQL is indeed a great tool, please use it wisely!

The complete project source is available on Github.

Monday, January 9, 2017

Overcast and Docker: test your apps as if you are releasing them to production

I am sure everyone would agree that a rare application architecture these days can survive without relying on some kind of data store (either relational or NoSQL), and/or messaging middleware, and/or external caches, ... just to name a few. Testing such applications becomes a real challenge.

Luckily, if you are a JVM developer, things are not that bad. Most of the time you have the option to fall back to an embedded version of the data store or message broker in your integration or component test scenarios. But what if the solution you are using is not JVM-based? Great examples are RabbitMQ, Redis, Memcached, MySQL and PostgreSQL, which are extremely popular choices these days, and for very good reasons. Even better, what if your integration testing strategy is set to exercise the component (read: microservice) in an environment as close to production as possible? Should we give up here? Or should we write a bunch of flaky shell scripts to orchestrate the test runs and scare most developers to death? Let us see what we can do here ...

Many of you are already screaming at this point: just use Docker, or CoreOS! And this is exactly what we are going to talk about in this post, more precisely, how to use Docker to back integration / component testing. I think Docker does not need an introduction anymore. Even those of us who spent the last couple of years in a cave on a deserted island in the middle of the ocean have heard about it.

Our sample application is going to be built on top of the Spring projects portfolio, heavily relying on Spring Boot magic to wire all the pieces together (who doesn't, right? it works pretty well indeed). It will implement a very simple workflow: publish a message to the RabbitMQ exchange app.exchange (using the app.queue routing key), consume the message from the RabbitMQ queue app.queue and store it in a Redis list data structure under the key messages. The three self-explanatory code snippets below demonstrate how each functional piece is done:

@Component
public class AppQueueMessageSender {
    @Autowired private RabbitTemplate rabbitTemplate;
 
    public void send(final String message) {
        rabbitTemplate.convertAndSend("app.exchange", "app.queue", message);
    }
}
@Component
public class AppQueueMessageListener {
    @Autowired private AppMessageRepository repository;
 
    @RabbitListener(
        queues = "app.queue", 
        containerFactory = "rabbitListenerContainerFactory", 
        admin = "amqpAdmin"
    )
    public void onMessage(final String message) {
        repository.persist(message);
    }
}
@Repository
public class AppMessageRepository {
    @Autowired private StringRedisTemplate redisTemplate;
 
    public void persist(final String message) {
        redisTemplate.opsForList().rightPush("messages", message);
    }
 
    public Collection<String> readAll() {
        return redisTemplate.opsForList().range("messages", 0, -1);
    }
 
    public long size() {
        return redisTemplate.opsForList().size("messages");
    }
}

As you can see, the implementation deliberately does the bare minimum; we are more interested in the fact that quite a few interactions with RabbitMQ and Redis are happening here. The configuration class includes only the necessary beans; everything else is figured out by Spring Boot's automatic discovery from the classpath dependencies.

@Configuration
@EnableAutoConfiguration
@EnableRabbit
@ComponentScan(basePackageClasses = AppConfiguration.class)
public class AppConfiguration {
    @Bean
    Queue queue() {
        return new Queue("app.queue", false);
    }

    @Bean
    TopicExchange exchange() {
        return new TopicExchange("app.exchange");
    }

    @Bean
    Binding binding(Queue queue, TopicExchange exchange) {
        return BindingBuilder.bind(queue).to(exchange).with(queue.getName());
    }
 
    @Bean
    StringRedisTemplate template(RedisConnectionFactory connectionFactory) {
        return new StringRedisTemplate(connectionFactory);
    }
}

At the very end comes application.yml. Essentially it contains the default connection parameters for RabbitMQ and Redis, plus a bit of logging level tuning.

logging:
  level:
    root: INFO
    
spring:
  rabbitmq:
    host: localhost
    username: guest
    password: guest
    port: 5672
    
  redis:
    host: localhost
    port: 6379

With that, our application is ready to be run. For convenience, the project repository contains a docker-compose.yml with the official RabbitMQ and Redis images from Docker Hub.

Being TDD believers and practitioners, we make sure no application leaves the gate without a thorough set of test suites and test cases. Keeping unit tests and integration tests out of the scope of this discussion, let us jump right into component testing with this simple scenario.

@RunWith(SpringJUnit4ClassRunner.class)
@SpringBootTest(
    classes = { AppConfiguration.class }, 
    webEnvironment = WebEnvironment.NONE
)
public class AppComponentTest {
    @Autowired private AppQueueMessageSender sender;
    @Autowired private AppMessageRepository repository;
 
    @Test
    public void testThatMessageHasBeenPersisted() {
        sender.send("Test Message!");
        await().atMost(1, SECONDS).until(() -> repository.size() > 0);
        assertThat(repository.readAll()).containsExactly("Test Message!");
    }
}

It is a really basic test case, exercising the main flow, but it makes an important point: no mocks / stubs / ... allowed, we expect the real things. The line is somewhat blurry, but this is what makes component tests different from, let's say, integration or e2e tests: we test a single full-fledged application (component) with its real dependencies (when it makes sense).

This is the right time for the excellent Overcast project to appear on the stage and help us out. Overcast brings the power of Docker to enrich the test harness of JVM applications. Among many other things, it allows defining and managing the lifecycle of Docker containers from within Java code (or, more precisely, any programming language based on the JVM).

Unfortunately, the last released version of Overcast, 2.5.1, is pretty old and does not include a lot of features and enhancements. However, building it from source is a no-brainer (hopefully the new release is going to be available soon).

git clone https://github.com/xebialabs/overcast
cd overcast
./gradlew install

Essentially, the only prerequisite is to provide the configuration file overcast.conf with the list of named containers to run. In our case, we need RabbitMQ and Redis.

rabbitmq {
    dockerImage="rabbitmq:3.6.6"
    exposeAllPorts=true
    remove=true
    removeVolume=true
}

redis {
    dockerImage="redis:3.2.6"
    exposeAllPorts=true
    remove=true
    removeVolume=true
}

Great! The syntax is not as powerful as what Docker Compose supports, but it is simple, straightforward and, to be fair, quite sufficient. Once the configuration file is placed into the src/test/resources folder, we can move on and use the Overcast Java API to manage these containers programmatically. Since we are using the Spring Framework, it is natural to introduce a dedicated configuration class for this.

@Configuration
public class OvercastConfiguration {
    @Autowired private ConfigurableEnvironment env;

    @Bean(initMethod = "setup", destroyMethod = "teardown")
    @Qualifier("rabbitmq")
    public CloudHost rabbitmq() {
        return CloudHostFactory.getCloudHost("rabbitmq");
    }

    @Bean(initMethod = "setup", destroyMethod = "teardown")
    @Qualifier("redis")
    public CloudHost redis() {
        return CloudHostFactory.getCloudHost("redis");
    }

    @PostConstruct
    public void init() throws TimeoutException {
        final CloudHost redis = redis();
        final CloudHost rabbitmq = rabbitmq();

        final Map<String, Object> properties = new HashMap<>();
        properties.put("spring.rabbitmq.host", rabbitmq.getHostName());
        properties.put("spring.rabbitmq.port", rabbitmq.getPort(5672));

        properties.put("spring.redis.host", redis.getHostName());
        properties.put("spring.redis.port", redis.getPort(6379));

        final PropertySource<?> source = new MapPropertySource("overcast", properties);
        env.getPropertySources().addFirst(source);
    }
}

And that is literally all we need! Just a couple of important notes here. Docker is going to expose random ports for each container, so we can run many test cases in parallel on the same box without any port conflicts. On most operating systems it is safe to use localhost to access the running containers, but for the ones without native Docker support, workarounds with Docker Machine or boot2docker exist. That is why we override the connection settings for both host and port for RabbitMQ and Redis respectively, asking for the bindings at runtime:

properties.put("spring.rabbitmq.host", rabbitmq.getHostName());
properties.put("spring.rabbitmq.port", rabbitmq.getPort(5672));

properties.put("spring.redis.host", redis.getHostName());
properties.put("spring.redis.port", redis.getPort(6379));

Lastly, more advanced Docker users may wonder how Overcast is able to figure out where the Docker daemon is running. Which port is it bound to? Does it use TLS or not? Under the hood, Overcast uses the terrific Spotify Docker Client, which is able to retrieve all the relevant details from the environment variables; this works in the majority of use cases (though you can always provide your own settings).
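For the curious, here is a tiny standalone sketch of that environment-based discovery, using the Spotify Docker Client directly (this only illustrates the library's behavior; it is not Overcast's actual internal code):

import com.spotify.docker.client.DefaultDockerClient;
import com.spotify.docker.client.DockerClient;

public class DockerEnvCheck {
    public static void main(String[] args) throws Exception {
        // fromEnv() picks up DOCKER_HOST, DOCKER_CERT_PATH and DOCKER_TLS_VERIFY,
        // falling back to the default local socket when they are not set
        try (DockerClient docker = DefaultDockerClient.fromEnv().build()) {
            System.out.println(docker.ping()); // prints "OK" when the daemon is reachable
        }
    }
}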

To finish up, let us include this configuration into the test case:

@SpringBootTest(
    classes = { OvercastConfiguration.class, AppConfiguration.class }, 
    webEnvironment = WebEnvironment.NONE
)

Easy, isn't it? If we go ahead and run mvn test for our project, all test cases should pass (please note that the first run may take some time, as Docker has to pull the container images from the remote repository).

> mvn test
...
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0
...

No doubt, Docker raises testing techniques to a new level. With the help of such awesome libraries as Overcast, seasoned JVM developers have even more options to come up with realistic test scenarios and run them against the components in a "mostly" production environment (on the wave of hype, it fits perfectly into microservices testing strategies). There are many areas where Overcast could and will improve, but it brings a lot of value even now and is definitely worth checking out.

Probably the most annoying issue you may encounter when working with Docker containers is waiting for the moment when a container is fully started and ready to accept requests (which heavily depends on what kind of underlying service the container is running). Although work on that has started, Overcast does not help with this particular problem yet, so simple, old-style sleeps may be good enough (versus slightly more involved port polling, for example, sketched below).
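Here is what such port polling might look like, as a hedged sketch (the PortWaiter class and its timings are assumptions, not part of the sample project); it could be called from the init() method above before registering the connection properties:

import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.concurrent.TimeoutException;

public final class PortWaiter {
    // Keep trying to open a TCP connection until it succeeds or the timeout expires
    public static void waitForPort(String host, int port, long timeoutMillis)
            throws InterruptedException, TimeoutException {
        final long deadline = System.currentTimeMillis() + timeoutMillis;
        while (System.currentTimeMillis() < deadline) {
            try (Socket socket = new Socket()) {
                socket.connect(new InetSocketAddress(host, port), 500);
                return; // the port is accepting connections
            } catch (IOException ex) {
                Thread.sleep(250); // not ready yet, retry shortly
            }
        }
        throw new TimeoutException("Port " + host + ":" + port + " is not reachable");
    }
}

For example, PortWaiter.waitForPort(rabbitmq.getHostName(), rabbitmq.getPort(5672), 30000) would block until RabbitMQ starts listening. Keep in mind that an open port does not always mean the service is fully initialized, so a short sleep or an application-level check may still be needed on top.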

But, but, but ... always remember the testing pyramid and strive for the right balance. Create as many test cases as you need to cover the most critical and important flows, but no more. Unit and integration tests should be your main weapon.

The complete project is available on Github.