Monolith vs Microservices

A friend writes, talking about a stateless software component (let's call it X) that we both know well (edited for clarity and to remove concrete details):

The reason why I call it a monolith is because we still have to deploy the whole thing at once. Ex: We want to enhance the capability <X>. If we had another component in front I could make changes to just that component. When we discussed it today I felt we were very concerned about the global rollout. We actually might have a component like that in front but for some reason we need changes in X?

This required a longer discussion, so I decided to write it out. I have maintained one complex set of stateless microservices and have built from scratch and launched at least two more (maybe three, depending on how you count). Microservices are perfect for:

Isolation of concerns, primarily between different teams and secondarily between technologies.
Scaling of different parts of the system independently.
Deployment of different parts of the system independently.

But you pay so much for these advantages:

You lose type safety between components. You can get it back through marshalling (aka serialization), which works all right for shallow object graphs but terribly for deep ones where the number of dependent objects is large.
You increase the complexity of your build pipeline and deployment pipeline. If you are early in the software lifecycle this is a huge cost to pay in advance. And if you are not running on hyperscalers but in your own datacenters, now you need to figure out the right ratios of different components and how to deploy them.
You start having to deal with eventual consistency of configuration and the split-brain problem between components. If you want to avoid that, you concentrate the configuration state in a single component (hopefully the first one in the processing chain) and then lug that information through the rest of the processing.
You increase the complexity of testing: you have to introduce end-to-end testing rather than testing a single component. And if you introduce mocking, now you have to maintain the original component and the mocks of it in all the other dependent components.
You increase the complexity of debugging. You have to trace through multiple components to find the source of the problem.
You increase local development setup and maintenance costs. To put it concretely: a Dockerfile is no longer all you need - gotta reach for docker-compose. For stateless components this is just extra work without any benefit.
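
To make the type safety point from the list above concrete, here is a minimal Python sketch. Everything in it (Order, Customer, Address, the JSON wire format) is a hypothetical stand-in, not anything from component X. Inside one process a type checker can follow the nested objects; once the same data is serialized, the receiver gets plain dicts and string keys, and every level of nesting has to be rebuilt by hand.

```python
import json
from dataclasses import asdict, dataclass


# Hypothetical domain types, used only for illustration.
@dataclass
class Address:
    city: str


@dataclass
class Customer:
    name: str
    address: Address


@dataclass
class Order:
    order_id: int
    customer: Customer


def city_in_process(order: Order) -> str:
    # Monolith: a type checker can verify every attribute access here.
    return order.customer.address.city


def send(order: Order) -> str:
    # Service boundary: the object graph is flattened to JSON text.
    return json.dumps(asdict(order))


def city_over_the_wire(payload: str) -> str:
    # Receiver: only dicts and string keys. A renamed or missing field
    # shows up as a KeyError at runtime, not as a build error.
    data = json.loads(payload)
    return data["customer"]["address"]["city"]


def decode_order(payload: str) -> Order:
    # Getting the types back means rebuilding each nested level by hand
    # (or adopting a schema tool), and keeping it in sync on both sides.
    data = json.loads(payload)
    customer = data["customer"]
    return Order(
        order_id=data["order_id"],
        customer=Customer(
            name=customer["name"],
            address=Address(city=customer["address"]["city"]),
        ),
    )
```

With three levels of nesting this is merely tedious. The decoding code grows with every dependent object in the graph, which is where the cost starts to hurt.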

So am I “down on microservices”? I go back to the first point I made above: if you have different development teams (and teams can’t scale indefinitely) then moving toward microservices is a good choice. If you have already discovered which parts of the code change most often - sure. If you have already discovered what your different scaling needs are - sure. But I would always start from a monolith and move toward microservices only cautiously. Going the other way around is nigh impossible (oh different language? oh completely different libraries? oh different threading model? oh meant to deploy on a completely different service? oh completely different logging and metrics?). Software development already requires a lot of discipline at the level of a single component; microservices require even more, across multiple teams. Opening that Pandora’s box is (too) easy but closing it is hard.

And while I’m on the subject: in the context of stateless components, I have a strong preference for the threaded model over cooperative multitasking. Don’t get me wrong: cooperative multitasking is beautiful. It’s just that the beauty comes at the cost of high cognitive overhead, and for a small team especially (or for later generations of developers) it is exceptionally hard to keep getting it right. And again the same argument of “is it easy to change later” applies: if you start with a threaded model and later decide that you really need cooperative multitasking, it’s far, far easier to go from threaded to cooperative than the other way around.
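
One illustration of that cognitive overhead, as a hedged sketch rather than a benchmark: in a cooperative model a single handler that forgets to yield stalls everyone else, while in a threaded model the same mistake only stalls its own thread. The example below assumes Python's asyncio as the cooperative runtime and uses time.sleep as a stand-in for any blocking library call; the function names are made up for the example.

```python
import asyncio
import threading
import time

BLOCK_SECONDS = 0.2  # stand-in for a blocking library call


def blocking_call() -> None:
    time.sleep(BLOCK_SECONDS)


async def careless_handler() -> None:
    # The bug: a blocking call inside a coroutine. Nothing yields to the
    # event loop, so no other handler can run until this one returns.
    blocking_call()


def run_cooperative(n: int) -> float:
    async def main() -> None:
        await asyncio.gather(*(careless_handler() for _ in range(n)))

    start = time.perf_counter()
    asyncio.run(main())
    return time.perf_counter() - start


def run_threaded(n: int) -> float:
    threads = [threading.Thread(target=blocking_call) for _ in range(n)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    # Cooperative handlers end up running one after another (about
    # n * BLOCK_SECONDS); threads overlap (about BLOCK_SECONDS).
    print(f"cooperative: {run_cooperative(4):.2f}s")
    print(f"threaded:    {run_threaded(4):.2f}s")
```

The fix on the cooperative side is well known (await a non-blocking equivalent, or push the call to an executor), but it has to be applied correctly at every call site, by every developer, forever. That is the discipline cost I mean.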

PS. My friend, as a reaction to reading the draft, had this to say:

I’m not sure that blog post settled our debate 😊 We are in agreement that starting with a monolith is always correct. <elided> A better blog post might be “how to scale beyond the monolith” and talk through the ways to evolve the system with their pros/cons.

I shall endeavor to oblige.

Last modified on 2024-05-18