Thoughts from Sam Newman's Microservices Data Decomposition Talk

April 30, 2023

I was fortunate to attend a great talk by Sam Newman on the above subject. Here is a brief summary of the content and my thoughts.

First off, I should say that I went into the talk regarding myself as "experienced" on the subject of microservices. I anticipated learning about the more complex "data decomposition" aspects rather than the basics. However, I have to admit that even the fundamental points on microservices were still a benefit for me to hear. This is likely due to Sam being such a good teacher. I found myself re-learning topics in a way I hadn't anticipated.

Microservices and Backwards Compatibility

Sam described microservices as an architectural style which allows you to partition your functionality into separate services which are independently deployable. The data that each service stores is hidden within the microservice boundary. Microservices allow you to scale your organisation, since you can have teams working on separate systems in parallel.

Tolerant Reader Pattern

Backwards compatible releases are achieved with the "tolerant reader pattern" (i.e. clients not failing when they encounter new fields in a response payload). Some tools make this hard to adopt, such as Java's RMI and any library which autogenerates code from a schema. Martin Fowler has written about this pattern too.
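To make the idea concrete, here is a minimal sketch of a tolerant reader in Python. The payload shape, the field names and the CustomerSummary type are all made up for illustration; they are not from the talk. The client picks out only the fields it needs and ignores anything else the service adds later.

```python
import json
from dataclasses import dataclass


@dataclass
class CustomerSummary:
    """The only fields this (hypothetical) client actually cares about."""
    customer_id: str
    name: str


def read_customer(payload: str) -> CustomerSummary:
    """Tolerant reader: pick out the needed fields and ignore everything else."""
    data = json.loads(payload)
    return CustomerSummary(customer_id=data["id"], name=data["name"])


# Response from an older version of the (hypothetical) service.
v1 = '{"id": "c-1", "name": "Ada"}'
# A newer version adds fields; the reader must not fail on them.
v2 = '{"id": "c-1", "name": "Ada", "loyaltyTier": "gold", "address": {"city": "Leeds"}}'

# Both versions of the payload produce the same result for this client.
assert read_customer(v1) == read_customer(v2)
```

A client generated strictly from a schema would typically reject v2 until it was regenerated, which is the coupling the pattern is trying to avoid.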

Services shouldn't over-expose

Just as clients shouldn't depend on every single field, services should expose only the minimum that is required. The best way to find out what that minimum is, is to speak to the actual clients. Perhaps they only need to know the total number of apples from the inventory service, and not the row and shelf in the warehouse where they are located. I have seen this problem first hand too!
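The apples example could look something like the sketch below. StockRecord, StockLevel and stock_level are hypothetical names of my own; the point is only that the internal record (with row and shelf) stays inside the service and the public response carries just the total.

```python
from dataclasses import dataclass, asdict


@dataclass
class StockRecord:
    """Internal storage detail, hidden inside the service boundary."""
    product: str
    row: int
    shelf: int
    quantity: int


@dataclass
class StockLevel:
    """Public contract: only what the clients said they need."""
    product: str
    total: int


def stock_level(records: list[StockRecord], product: str) -> StockLevel:
    """Summarise internal records into the minimal public response."""
    total = sum(r.quantity for r in records if r.product == product)
    return StockLevel(product=product, total=total)


records = [
    StockRecord("apple", row=1, shelf=3, quantity=40),
    StockRecord("apple", row=7, shelf=1, quantity=25),
    StockRecord("pear", row=2, shelf=2, quantity=10),
]
print(asdict(stock_level(records, "apple")))  # {'product': 'apple', 'total': 65}
```

Because row and shelf never appear in the response, the warehouse layout can change without any client noticing.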

Shared Domain Model should be avoided

I have seen this problem, and its downsides, first hand. Dave Farley's great 15 minute video makes the same point: code duplication is fine if it avoids coupling, which is a far worse problem. To quote Sam:

"My general rule of thumb: don’t violate DRY within a microservice, but be relaxed about violating DRY across all services. The evils of too much coupling between services are far worse than the problems caused by code duplication."

JSON isn't the best for microservices

As soon as this was said, people started asking for more detail. Sam's points were:

It's human readable (but you may not care about that if the traffic is purely machine to machine).

I'd argue that even if your service is purely machine to machine, human readability is a massive benefit for debugging.

It's not compact.

Apparently it's about the same size as an XML representation, which surprised me.

Protocol Buffers (protobuf) is a good alternative if you are willing to sacrifice human readability: it is a binary format, so it can be faster to serialize and deserialize.
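A rough way to get a feel for the trade-off is to encode the same record as JSON and as a fixed binary layout. The sketch below uses Python's struct module purely as a stand-in for a binary format; it is not protobuf and does not use protobuf's wire encoding, and the sensor reading is invented for the example.

```python
import json
import struct

# Made-up payload, for illustration only.
reading = {"sensor_id": 1234, "temperature": 21.5, "humidity": 40.25}

# Text encoding: field names travel with every message.
as_json = json.dumps(reading).encode("utf-8")

# Binary stand-in: one unsigned 32-bit int and two 64-bit floats, little-endian.
# Field names are not sent; both sides must already agree on the layout.
as_binary = struct.pack(
    "<Idd", reading["sensor_id"], reading["temperature"], reading["humidity"]
)

print(as_json)         # readable in a log or a network trace
print(as_binary)       # compact, but opaque without the layout
print(len(as_json), len(as_binary))

# Decoding the binary form needs the same layout string.
sensor_id, temperature, humidity = struct.unpack("<Idd", as_binary)
```

The exact sizes depend on the payload, but the pattern is the same: the binary form drops the field names and number formatting, and pays for it with readability when debugging.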

Management Information / Analytical Data

Multiple people (including myself) asked "how do you query/join data from multiple sources when it's in separate services' data stores?". I am not clear on the recommendation here. Sam mentioned that there are two main options: a data warehouse or a data mesh. I think I need to read up on this. This looks like a good place to start: https://martinfowler.com/articles/data-mesh-principles.html

Static Reference Data - Where to put it?

Country codes are a great example of this. You might have multiple services which all need this data, so where does it go? There are multiple options, and the answer will depend on the impact of your services being out of sync.

Separate service - perhaps this is overkill but it guarantees a single view of the data. Clients can cache the data to speed things up.
Copy and paste / bundle it in every service. This can be achieved cleanly with a shared library. This might be good enough if you understand the impact of the data being inconsistent. But make sure you are not creating services which all have to be released together.

The horror!

In terms of delivery, Sam said:

"You won't appreciate the true horror, pain and suffering of microservices until you're running them in production"

This was used to point out that microservices should be released little and often. I guess the smaller the system you have in production, the less the horror!

Phill Barber's Blog