Distributed architecture smells: a simple guide to avoiding common mistakes
By Giulio Caccin, Developer at Scania
After more than 10 years of working with distributed systems, I have learned that a monolith is not just an application living in a single repository; more importantly, it is a highly coupled set of APIs/contracts (where a single change forces a chain of changes) that share specific traits.
In the same way, a distributed system is not defined by living in multiple repositories, but by being decoupled while still maintaining high cohesion (responsibilities divided across artifacts/repositories).
Along the way, in distributed systems where cohesion and coupling are wrongly balanced, you will end up with the dreaded distributed monolith.1
In this blog post I would like to highlight some of the common mistakes I have come across during my years of experience and share them with you.
7 common mistakes
Binary Coupling
When overreliance on code reusability is the norm, you might push as much code as possible into centralized binaries such as:
- Contract libraries
- Clients for API access
- Common functions to manipulate data (e.g. serializers)
The consequences of using these kinds of libraries are:
- Service logic pushed inside the clients
- Performance problems will be harder to identify and fix
- New standards take longer to adopt in place of the old library
- A specific language/platform is forced on consumers in order to distribute/inject the library
This means that the service owner imposes their own development speed on the client app and pushes their operational complexity onto it, forcing developers to keep learning how to use those libraries as they evolve instead of learning new, more useful tools.
This gets even worse when a complex graph of common dependencies is built from these artifacts, requiring multiple commit cycles just to ship a single improvement.2
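As a contrast, here is a minimal sketch of a consumer that owns a thin client and only the slice of the contract it actually reads, instead of importing the provider's shared client/contract library. All names are hypothetical, and it assumes the kotlinx.serialization plugin is available:

```kotlin
import java.net.URI
import java.net.http.HttpClient
import java.net.http.HttpRequest
import java.net.http.HttpResponse
import kotlinx.serialization.Serializable
import kotlinx.serialization.json.Json

// Consumer-owned slice of the provider's contract: only the fields this
// service actually reads, defined here instead of in a shared contract library.
@Serializable
data class CustomerName(val id: String, val fullName: String)

// Thin, consumer-owned client: a few lines of HTTP instead of a versioned
// client binary pushed by the service owner.
class CustomerNameClient(
    private val baseUrl: String,
    private val http: HttpClient = HttpClient.newHttpClient(),
    private val json: Json = Json { ignoreUnknownKeys = true },  // tolerate fields we do not care about
) {
    fun fetch(id: String): CustomerName {
        val request = HttpRequest.newBuilder(URI.create("$baseUrl/customers/$id")).GET().build()
        val body = http.send(request, HttpResponse.BodyHandlers.ofString()).body()
        return json.decodeFromString(CustomerName.serializer(), body)
    }
}
```

The consumer can evolve this client at its own pace, and the service owner never gets to push binaries, or their release schedule, into the consumer's build.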
Snowflake Environments
All projects need multiple environments. Each one helps catch different problems or reproduce specific behavior. Sometimes we cut corners and make some of them special, skipping the checks that keep them operational. Consider environments such as:
- Containerized service
- Mock environments
- Dev environments
- Canary/Nightly/Staging
- Production
One of the most common mistakes is not creating all of them in the same way (through automated CI/CD), or not giving them the same logging, alerts, data migration scripts and so on. This can prevent us from testing the deployment process itself, or the data migration scripts.
Environment configuration in code
I have often seen configuration values hard-coded inside the codebase; this forces a new build at every change and can have worrying security consequences.
Configuration values need to be detached both from the environment and from the artifact/repository that uses them, living either in a separate repository or in a feature flag provider.
In code and in its tests, inject only the portion of the configuration that is needed, ideally through a dependency injection container (to lower coupling).
A good rule of thumb is that no file-based configuration should be required by unit tests; this implies that the configuration is provided in the test itself (high cohesion), which also decouples it from the file system.
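For example, here is a minimal sketch (all names are hypothetical) of injecting only the needed slice of configuration, with a test that provides it inline:

```kotlin
// Only the slice of configuration this service needs, injected from outside
// (a DI container, environment variables, a config service) rather than read
// from a file inside the codebase.
data class PaymentConfig(val gatewayUrl: String, val timeoutMillis: Long)

class PaymentService(private val config: PaymentConfig) {
    fun describeCall(): String = "POST ${config.gatewayUrl} (timeout ${config.timeoutMillis} ms)"
}

// A unit test provides the configuration inline: no file-based configuration
// is required, and the test stays decoupled from the file system.
fun main() {
    val service = PaymentService(PaymentConfig(gatewayUrl = "https://payments.example.test", timeoutMillis = 500))
    check(service.describeCall().contains("payments.example.test"))
}
```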
Overuse of data and models
This smell appears when, inside your system, you use the full data model provided by a service even though you only need parts of it. Focus on the slice of data relevant to your microservice and disregard everything you don't need; avoid even serializing the fields you don't use.
Make eventual consistency your friend and increase resilience by relying on event-provided data instead of over-relying on RPC calls and caching, but be careful not to turn eventual consistency into eventual inconsistency by ignoring the data's lifecycle.
Be clear about which data you own, and even if you could, never share database resources directly.
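As an illustration, here is a minimal sketch (all names are hypothetical) of a local read model fed by events rather than by synchronous RPC calls to the owning service:

```kotlin
import java.util.concurrent.ConcurrentHashMap

// Only the fields this service needs from the event, not the owner's full model.
data class ProductPriceChanged(val productId: String, val priceCents: Long)

// Local read model fed by events: reads stay fast and resilient because they
// never make a synchronous call to the pricing service.
class PricingReadModel {
    private val prices = ConcurrentHashMap<String, Long>()

    // Called by whatever consumes the event stream (a Kafka listener, an AMQP handler, ...).
    fun on(event: ProductPriceChanged) {
        prices[event.productId] = event.priceCents
    }

    fun priceOf(productId: String): Long? = prices[productId]
}
```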
DRY?! WET!
Don't repeat yourself (DRY) is a rule that invites us not to write the same code twice, but sometimes that needs to be postponed. That's why it's worth keeping "write everything twice" (WET) in mind before jumping to hasty abstractions!
Once I found out that sharing code through binaries could lead to slowdowns, I realized that even basic code reuse should not always be prioritized over single-responsibility, decoupled code. Even inside the same artifact, for example, models used to respect outbound contracts evolve differently from core domain models, so it's important to treat them as distinct entities.
This applies to all sorts of structures and functions, which led me to use feature packaging more and more. So please, never reuse request and response models inside your domain code, and use different domain models to solve different problems.
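A minimal sketch of the idea (all names are hypothetical): the outbound contract model and the domain model are kept as separate types, even while they still look alike, with an explicit mapping between them:

```kotlin
// Domain model: owned by the core logic, free to evolve with the business rules.
data class Order(
    val id: String,
    val totalCents: Long,
    val internalRiskScore: Int,   // never meant to leave the service
)

// Outbound contract model: owned by the API, evolves with the consumers.
data class OrderResponse(
    val id: String,
    val total: String,
)

// Explicit mapping: a few lines of "repetition" that keep the two models free
// to change for different reasons.
fun Order.toResponse() = OrderResponse(
    id = id,
    total = "%.2f".format(totalCents / 100.0),
)
```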
Artifacts leaking
If you are storing external libraries inside your repo, stop immediately and start using the most appropriate dependency management tool for your ecosystem (such as NPM, NuGet or Gradle). Make sure all environments either retrieve dependencies in the same way or, even better, act on the same immutable package produced by the build and stored in that same tool.
Make sure you can always determine which commit an artifact/version was built from, and remember to include that information in your logs, especially when doing blue-green deployments.
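As an example in Gradle's Kotlin DSL, a minimal build.gradle.kts sketch (versions and coordinates are illustrative, not a recommendation) that resolves dependencies from a package repository instead of jars committed to the source repo:

```kotlin
// build.gradle.kts: dependencies are resolved from a package repository and
// pinned to exact versions, instead of jars committed to the source repo.
plugins {
    kotlin("jvm") version "1.9.24"
}

repositories {
    mavenCentral()   // or your internal artifact repository
}

dependencies {
    implementation("com.fasterxml.jackson.core:jackson-databind:2.17.1")
    // instead of: implementation(files("libs/jackson-databind.jar"))
    testImplementation(kotlin("test"))
}
```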
Not testing difficult things
Complex stored procedures, RPC calls without a standardized interaction model (or with one like SOAP, which can be hard to use from many languages), full-duplex communication, database functions you are not sure your testing framework can invoke, interactions between microservices.
If it's easy for you to test those technologies, or if you can accept the risk of not testing them, go ahead. But if you cannot test a technology at all, consider not adopting it in the first place.
More importantly, if the answer to something hard is “this will require manual testing”, you should know you will end up in a bad place.
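One way to keep the surrounding logic testable is to put the difficult call behind a small interface; here is a minimal sketch (all names are hypothetical):

```kotlin
// The hard-to-test call (a stored procedure, a SOAP endpoint, a full-duplex
// channel, ...) hides behind a small interface.
interface CreditCheck {
    fun isCreditworthy(customerId: String): Boolean
}

// The surrounding logic depends only on the interface, so it stays unit-testable.
class OrderApproval(private val creditCheck: CreditCheck) {
    fun approve(customerId: String, amountCents: Long): Boolean =
        amountCents < 10_000_00 && creditCheck.isCreditworthy(customerId)
}

// In a unit test, a trivial fake replaces the difficult dependency; the real
// implementation is exercised separately in an integration environment.
fun main() {
    val alwaysApproves = object : CreditCheck {
        override fun isCreditworthy(customerId: String) = true
    }
    check(OrderApproval(alwaysApproves).approve("c-42", 5_000_00))
}
```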
How to fight back
There are several3 ways4 to make sure you are not building the next distributed monolith; these are my suggestions:
- Aim to use better languages instead of relying on complex libraries: many Java libraries can be replaced entirely by features built into Kotlin, or by using the latest .NET features
- Check your logs and test your alerts regularly in all environments, such as dev or release
- Keep a similar data flow in all your non-mocked environments
- Never store production configurations in the code repository; copy them onto the target environment instead
- Automate as much as possible the creation of contracts and data transfer models
o Use OpenAPI documentation
o Remember that errors should be documented as well
o Auto-generate it from your code (it's OK to have comments in your code if you use them to create documentation)
- Accept the fact that sometimes you need RPC5, and stop worrying about “real”/mature REST6
o Consider using Protocol Buffers to define and generate those calls and avoid common versioning problems
- Enforce cross-functional requirements in all your environments
o Use integration tests to make sure observability works
o Remember to keep logs consistent with the definition of the various logging levels (hint: ERROR is only for a failure of the current activity, not for application-wide failures; see the sketch after this list)
o Check tracing and logging in all environments
o Establish sensible defaults for the alerting system
- Use the right tool for the right task
o REST is for rapid iteration (human-readable) over stateless resources on HTTP
o gRPC is gold when resources are scarce
o GraphQL when the client decides how to aggregate data; otherwise it is unnecessary in most cases
o Kafka when you need to sustain a complex, scalable event-driven architecture with high throughput7
o RabbitMQ or a similar, less feature-rich (AMQP) queue-processing system when you need speed of development and simplicity
- Test in a mock environment
o Prioritize having an environment where all major features can be tested during the automated deployment pipeline
o Use a mock server or similar tools to guarantee reproducible and predictable E2E runs
- Consider carefully whether you can go for a simple monolith8
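On the logging-level hint above, here is a minimal sketch (all names are hypothetical, using SLF4J) of ERROR marking a failure of the current activity rather than of the whole application:

```kotlin
import org.slf4j.Logger
import org.slf4j.LoggerFactory

class PaymentFailed(message: String, cause: Throwable? = null) : RuntimeException(message, cause)

interface PaymentGateway {
    fun charge(orderId: String, amountCents: Long)
}

class CheckoutHandler(private val gateway: PaymentGateway) {
    private val log: Logger = LoggerFactory.getLogger(CheckoutHandler::class.java)

    fun checkout(orderId: String, amountCents: Long) {
        log.debug("Charging order {} for {} cents", orderId, amountCents)  // diagnostic detail
        try {
            gateway.charge(orderId, amountCents)
            log.info("Order {} charged", orderId)                          // normal business event
        } catch (e: Exception) {
            // ERROR: this checkout failed, but the application keeps serving other requests.
            log.error("Charging order {} failed", orderId, e)
            throw PaymentFailed("Could not charge order $orderId", e)
        }
    }
}
```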
Acknowledgements
Thanks to all the Digital Dealer developers who provided feedback and suggestions to early drafts of this article. Thanks to Martin Blomster for his advice, insights and support.
/Giulio Caccin
Reference list
- https://www.microservices.com/talks/dont-build-a-distributed-monolith
- https://www.infoq.com/news/2016/02/services-distributed-monolith
- https://developers.redhat.com/blog/2017/06/22/12-factors-to-cloud-success
- https://12factor.net
- https://nordicapis.com/when-to-use-what-rest-graphql-webhooks-grpc/
- https://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm
- https://ankittrehan2000.medium.com/amqp-rabbitmq-vs-kafka-for-asynchronous-communication-4c4dd703819f
- https://www.martinfowler.com/bliki/MonolithFirst.html