A Service shouldn't know about its Environment


Recently I was involved in setting up a build and deploy system for a suite of microservices. Each service was bundled into its own Docker image and deployed by Ansible. One of the biggest lessons I learnt is the subject of this post.

Background


  • The project encompasses around twenty microservices (a figure that is only set to increase).
  • All packaged and deployed as Docker images.
  • Deployment by Ansible.   
  • Some services need to be deployed to different machines on different networks, since some are public facing, some backend, some internal, and so on.
  • Different environment variables need to be injected into different running Docker containers.
  • Each service lives in its own git repo.
  • Each service built by CI server which runs project tests and creates a Docker image. 
  • Big effort to minimise the amount of work/config required to create a new microservice.

The Problem


Where should the Ansible deployment configuration for each service be stored?

 

First idea: Each service owns its deployment configuration.


At first we had the idea of each service being in control of its own deployment destiny. This involved each project storing its Ansible config in its git repo. I liked this since it meant that everything you needed to know about a service was in one place. It was meant to be great for auditing and keeping track of changes, since you only needed to worry about one project.

It all sounded great in theory. However, we soon discovered it was impractical for a number of reasons.

Making changes to deployment config is hard when it's defined in multiple repos.


The first sign this was a bad idea was that we found ourselves having to make changes in many different projects whenever we wanted to change the deployment mechanism. For example, say we wanted to run a command after every deploy. This would require changing the common Ansible role responsible for deployment and then updating each service to use the new version of that role. We would then have to rebuild each project, with the pointless step of building and tagging a new Docker image that was identical to the last (since changes to the Ansible deployment config did not affect the Docker image for a given service). With the number of microservices set to increase, this problem was only going to get worse.

There's no central place which describes your environment.


Answering the question "What else runs on network/machine X?" becomes impractical, as you have to check an ever-growing number of service repos.

Dependency is the wrong way around


If each project has its own deployment config, it is in effect describing a subset of the environment. In our project, this seemed both impractical and inflexible. In retrospect, a service knowing about its environment seems like a dependency the wrong way round. To explain this, consider the following relationships between the entities composing a typical Dockerized Java web app.



The diagram above shows the relationships between various entities in two projects.  Each arrow can be read as "depends on".  This diagram also shows that the "Environment" can't be drawn as a single entity in its own right as it's distributed among multiple projects.

Looking at the above from right to left:
  • A library is imported by a webapp. 
  • A webapp is built into a Docker image.
  • The Ansible config deploys Docker Containers of specified Docker images.
  • The Ansible config describes a subset of the environment. 
The arrows that point from the Ansible config to the Environment feel out of place and wrong! In fact, if any of the other arrows above were reversed, we'd have a design problem:

  • If a library knew about the webapp using it, it wouldn't be reusable. 
  • If the webapp knew it was deployed in Docker, it would require re-work to deploy it outside of Docker. 
Since the Project has Ansible config that defines the Environment, the Project is coupled to its environment and deployment implementation. The same could be said for the project's relationship with Docker. The webapp is coupled to Docker via its corresponding Dockerfile, which means replacing Docker with an alternative could be arduous. However, this coupling seemed OK to us at the time, since the Environment was in much more of a state of flux than our Docker containers.

Second idea: Deployment config in its own project


With this idea, each project is only responsible for defining its Docker image and not how it's deployed. No Ansible config lives in any microservice repo; it all lives in another project which is used to deploy everything.

The diagram can now be re-drawn as follows:



Here we see the Environment as a fully fledged entity with its own git repo. The project knows nothing about where or how it's deployed. The project is only responsible for building Docker images.
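
To make "the Environment as a single entity" a little more concrete, here is a minimal Python sketch of one central description that maps each service to its Docker image, its target hosts and the environment variables injected into its containers. Every name in it (services, hosts, image tags, variables, the `services_on` helper) is hypothetical and purely illustrative; our real config was written as Ansible config, not Python.

```python
from dataclasses import dataclass, field


@dataclass
class Deployment:
    """One service as seen by the environment repo (all values hypothetical)."""
    service: str
    image: str
    hosts: list
    env: dict = field(default_factory=dict)


# The whole environment lives in one place instead of being spread across repos.
ENVIRONMENT = [
    Deployment("accounts-api", "registry.example/accounts-api:1.4.0",
               ["public-1", "public-2"], {"DB_HOST": "db.internal"}),
    Deployment("billing-worker", "registry.example/billing-worker:2.0.1",
               ["backend-1"], {"QUEUE_URL": "amqp://mq.internal"}),
    Deployment("admin-ui", "registry.example/admin-ui:0.9.3",
               ["internal-1", "backend-1"]),
]


def services_on(host, environment=ENVIRONMENT):
    """Answer "what else runs on machine X?" from the single description."""
    return sorted(d.service for d in environment if host in d.hosts)


if __name__ == "__main__":
    print(services_on("backend-1"))
```

Because the services themselves appear here only as image names, none of them needs to know which hosts or networks it ends up on.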

After switching from the first idea to the second, we quickly found that this made more sense conceptually and was easier in practice.

Why this was better for us:



  • The Environment config could be understood in full by looking at one repo.
  • All Ansible roles that we had created were then moved into this one repo which also helped with exploring the config.
  • Making changes to the deployment config was far easier being in one place.
  • When changing the config, it was less likely to require effort that was proportional to the number of microservices. 
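
The last point can be sketched in Python, assuming a deployment plan is generated for every service from one shared list of steps. Adding a command that runs after every deploy is then a single change, whatever the number of services. The step strings, service names and the `plan_for` and `plan_all` helpers are hypothetical; they only illustrate the shape of the idea and do not model how Ansible actually works.

```python
# Hypothetical shared deployment steps, defined once in the environment repo.
COMMON_STEPS = [
    "pull {image}",
    "stop {service}",
    "run {service} from {image}",
]

# Hypothetical services, known to the environment repo only by image name.
SERVICES = {
    "accounts-api": "registry.example/accounts-api:1.4.0",
    "billing-worker": "registry.example/billing-worker:2.0.1",
}


def plan_for(service, image, steps=COMMON_STEPS):
    """Render the shared steps for one service."""
    return [step.format(service=service, image=image) for step in steps]


def plan_all(services=SERVICES, steps=COMMON_STEPS):
    """Render the shared steps for every service in the environment."""
    return {name: plan_for(name, image, steps) for name, image in services.items()}


if __name__ == "__main__":
    # One extra post-deploy command: one list changes, no service repo does.
    steps = COMMON_STEPS + ["notify deploy of {service}"]
    for name, plan in plan_all(steps=steps).items():
        print(name, plan)
```

No Docker image has to be rebuilt for a change like this, since nothing inside any service repo is touched.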
