Last Updated: 2021-05-15
Ability to test-drive risky new technologies
In a monolith, bringing in a new technology (e.g. a database) risks impacting everyone. Not so with microservices.
i.e. the benefit of technological heterogeneity.
One server going down doesn't mean the others have to. (Counterpoint from monoliths: you can run a monolith on multiple machines too)
Ease of deployment
A one-line change to a million-line monolith requires a deploy of the whole shebang, and because this is understandably risky, deploys happen less often. (Counterpoint: now you need to do 10x more deploys and have the dev-ops monitoring infrastructure in place)
Network failures and distributed system hardships
Whole nasty class of problems not experienced much in monoliths.
If you have ten different languages, how will people move across teams? How will you hire? How will you have enough experience to run these things at any scale?
Moreover, you will never build up tooling around some particular tech.
Service Oriented Architectures (SOA) Tips
- promote reusability of software. It's a good sign if two or more end-user applications could use the same service.
- you should be more worried about what happens between the boxes than inside them (e.g. ensure that you use as few communication protocols as possible - if one service uses REST, the next Java RMI, etc., you'll have problems)
- come up with a set of up to ten principles about your system (e.g. "all communication must be REST", "all logs must be collected centrally", "make choices that favor rapid feedback and change")
- have a standard system for monitoring (e.g. Nagios) and metrics (e.g. Graphite)
- interfaces are more than protocols. Say you are using REST: Will you use verbs or nouns? How will you handle versioning? How will you handle pagination?
- spend testing effort on ensuring errors are correctly labelled as such (for this is part of your interface). E.g. if you have a server over REST that returns 2XX for errors or confuses 4xx with 5xx, your bugs will be difficult to track and your safety measures may break down.
- get your team to follow your principles by providing them with templates or boilerplate out of the box. Could be a base project, could be a code-generator. This way, programmers need to go out of their way to veer from best practices.
- if this shared base functionality is not available in a language a microservice team wants to use, you might mandate that they connect via a technology where your team has already implemented the important stuff. E.g. Netflix uses "sidecar services" which communicate with the JVM (their main stack).
- loosely coupled services should know as little as possible about the services they collaborate with.
- we want high cohesion (i.e. related functionality should live together). Otherwise we'll need to modify five places to make any change. Low cohesion also hurts performance since the interfaces become "chatty". Think about updates to a User: you don't want to expose raw CRUD. You want to expose specific, validated ways of modifying the entity via events.
- think about microservices like you would departments in a company. Finance does not need to know how forklifts work, but it probably does need inventory levels for the accounts. So inventory will be shared between these two contexts. But here's the clincher: we don't share the full entity. Even though we know what shelf a piece of inventory is on within the warehouse service, we don't give that info to finance (i.e. to its "public interface") since it is too much info and breaks the boundaries. In other words: the internal and external representation of a given object can deviate. Concepts, like a "return", can have the same names in different contexts but entail quite different things.
- don't think about data when designing interfaces. Think about what things each service should do for other ones. E.g. "finance should set up payroll for new employees". By doing this, you avoid having an interface that is CRUD and totally anemic.
- before going microservices, use monolith modules in your first iterations of the code to experiment with the cohesiveness. Wait for things to stabilize before splitting into services since it is WAY cheaper to get it wrong during the monolith stage. TLDR: Avoid premature decomposition.
- consider nesting services within other services (e.g. finance talks to warehouse with a /request_stock_levels request; behind the scenes, warehouse talks to an inventory service). This makes it much easier to test since fewer services need to be stubbed out.
- avoid breaking changes. If you need to add a data field to a response, don't have existing consumers designed such that they break when extra data gets added, i.e. use the tolerant reader pattern. If extra data is added to the user model (e.g. alongside email address and name), the client should be fine with that.
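The earlier tip about labelling errors correctly can be sketched as a single handler that keeps the status-code families honest. Everything here (the handler name, the in-memory `store`) is illustrative and framework-free:

```python
class ValidationError(Exception):
    """The caller sent bad input: their fault, so 4xx."""

def handle_get_customer(customer_id, store):
    # Success is 2xx, caller mistakes are 4xx, our failures are 5xx.
    try:
        if not customer_id.isdigit():
            raise ValidationError("customer_id must be numeric")
        return 200, store[customer_id]              # may raise KeyError
    except ValidationError as e:
        return 400, {"error": str(e)}               # client error, never 2xx
    except KeyError:
        return 404, {"error": "no such customer"}   # missing resource, not 5xx
    except Exception:
        return 500, {"error": "internal failure"}   # genuinely our fault
```

Consumers (and your alerting) can then rely on the status family alone to tell whose bug it is.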
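The "don't think about data when designing interfaces" tip above might look like this in code. The names (`FinanceService.set_up_payroll`) are made up for illustration; the point is the contrast with an anemic field-updater:

```python
class AnemicFinanceService:
    """Anemic: callers must know the internal data shape and invariants."""
    def __init__(self):
        self.rows = {}

    def update_employee(self, emp_id, fields):
        self.rows.setdefault(emp_id, {}).update(fields)  # any field, no rules

class FinanceService:
    """Behavior-oriented: the service owns what "set up payroll" means."""
    def __init__(self):
        self.payroll = {}

    def set_up_payroll(self, emp_id, salary):
        if salary <= 0:
            raise ValueError("salary must be positive")
        self.payroll[emp_id] = {"salary": salary, "status": "active"}

finance = FinanceService()
finance.set_up_payroll("e1", 50_000)
```

With the behavioral interface, validation and invariants live in exactly one place instead of being re-implemented by every caller.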
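The tolerant reader pattern from the last tip is easy to show concretely: read only the fields you need and ignore the rest, so a payload that grows new fields never breaks you.

```python
import json

def read_user(payload: str) -> dict:
    """Tolerant reader: extract only the fields this consumer cares about."""
    data = json.loads(payload)
    return {"name": data.get("name"), "email": data.get("email")}

# The old payload works...
v1 = read_user('{"name": "Ada", "email": "ada@example.com"}')
# ...and so does a newer payload with a field we have never heard of.
v2 = read_user('{"name": "Ada", "email": "ada@example.com", "phone": "555-1234"}')
assert v1 == v2
```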
Gotchas with DB integration
- internal implementation details are leaked via the DB tables and fields accessible in other services. Some teams recommend delaying implementation of a proper data store so as to avoid a data-store dominated view of how the interface over the wire should look.
- changes to schema can break consumers. Therefore teams become scared to make changes for fear of breaking faraway code.
- database sharing lets you share data but not behavior. Therefore things like validations will need to be duplicated in every service where the (say) user model is used.
- if you let multiple entities save data, you'll have trouble handling collisions of state or triggering behavior based on state changes
DRY in the context of microservices
- DRY is great within one microservice, but less so between them.
- Tempts you to create libraries of shared code, but this can increase coupling between microservices. Imagine you shared common objects between each service. Then updating this library would require updates of all microservices! Moreover: state currently in the message queues might become invalid and need to be transformed or drained.
- The key is to ensure the shared code does not leak outside the service boundary.
- Using common code like logging is totally OK since this is mostly an internal concept.
- If you are using a template as a base for services, it is probably better to copy it over than make it a dependency. That way you don't force redeploys.
- The dream is to make your service super easy to use
- Risk that you start putting code that should live on the server into the client. This happens because the same team often writes both server and client code.
- Then you get to the nasty point where multiple clients need to change to effect a modification. This is bad. You want to be able to update both independently.
- AWS is the ideal model. Every API call can be done by hand and the libraries just wrap that. The libraries are built by different people.
- The client library should separate things related to transport protocol, failover, service discovery and things related to the service itself.
Access by reference
- Say you request something about a customer resource via the customer service. Now you have this customer entity, but it changes out from under you. Does this matter? For things like a view in a sales dashboard, it might be OK. But for sending an email? Basically, the longer we hold it in memory, the higher the chance it is stale. Danger situation: a long queue of jobs.
- The solution is to pass by ID and look up just-in-time.
- Counterpoint: Certain use-cases might require the customer as it looked THEN. In this case, sending an event with the changes might be the way to go.
- Another downside: these constant lookups are hard on performance (# of requests) and increase coupling (+ unrelated data may leak)
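A minimal sketch of the pass-by-ID idea, with a plain dict standing in for the customer service:

```python
# Hypothetical in-memory stand-in for a remote customer service.
customer_service = {"c1": {"email": "old@example.com"}}

def send_email_by_copy(customer_snapshot):
    return customer_snapshot["email"]               # may be stale by send time

def send_email_by_reference(customer_id):
    return customer_service[customer_id]["email"]   # fresh at the moment of use

snapshot = dict(customer_service["c1"])              # job enqueued with a copy
customer_service["c1"]["email"] = "new@example.com"  # customer updates email meanwhile

assert send_email_by_copy(snapshot) == "old@example.com"    # stale!
assert send_email_by_reference("c1") == "new@example.com"   # correct
```

The trade-off described above applies: the by-reference version pays an extra lookup per use.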
Imagine that creating a customer also entails many other actions like:
- entry into loyalty points bank
- postal system sending welcome pack
- sending a welcome email to the customer
How do we orchestrate all this?
One option is that the original customer service system does it all through a series of request/response calls. If synchronous, we could even know whether each stage worked (but only if synchronous and FAST)
The downsides to doing this synchronously are:
- danger of anemic CRUD operations where the customer-service area knows too much about the other systems [my solution in monolith: succinct messages and good boundaries]
- we accumulate too many responsibilities in the orchestrator. It becomes a GOD entity.
The alternative is pub/sub. Its downsides are:
- more difficult to get an overview [idea for solution? tooling that tells you all subscribers]
- more additional work required to check if the right thing happened. How would we know if the loyalty points entry was created? The author suggests having an independent monitoring service that maps onto the overall flow
- how will you handle long-running async processes where the response is available only at a time when the requesting node happens - through bad luck - to be down? What do you do with this response message? Do you store it somewhere? How should you react?
- you can get catastrophic failover if you allow any given message to be consumed an infinite number of times. Imagine one job causes an exception that kills the worker. After a timeout another worker takes it, and so on, until all workers are constantly consuming this message and dying.
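One common guard against that last failure mode is a cap on delivery attempts plus a dead-letter store. Real brokers expose this differently (e.g. redelivery limits and dead-letter queues); this is just the idea in plain Python:

```python
MAX_ATTEMPTS = 3

def drain(queue, handler):
    """Deliver each message, dead-lettering any that fail MAX_ATTEMPTS times."""
    attempts, dead_letters = {}, []
    while queue:
        msg = queue.pop(0)
        attempts[msg] = attempts.get(msg, 0) + 1
        try:
            handler(msg)
        except Exception:
            if attempts[msg] >= MAX_ATTEMPTS:
                dead_letters.append(msg)   # park it for human inspection
            else:
                queue.append(msg)          # redeliver, but only a few times
    return dead_letters

def handler(msg):
    if msg == "poison":
        raise RuntimeError("worker crashed")

assert drain(["ok", "poison"], handler) == ["poison"]
```

The poison message ends up quarantined instead of cycling through (and killing) every worker.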
tip with pub-sub: use correlation IDs to trace requests across process boundaries
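A sketch of that tip: the entry point mints one correlation ID and every downstream event carries it unchanged, so logs from different processes can be joined up later. Event and field names here are illustrative:

```python
import uuid

log = []  # stand-in for a centralized log/event store

def emit(event):
    log.append(event)
    if event["type"] == "customer_created":
        # Downstream consumers propagate the same ID - they never mint a new one.
        emit({"type": "welcome_email_sent",
              "correlation_id": event["correlation_id"]})

def create_customer(name):
    correlation_id = str(uuid.uuid4())   # minted once, at the entry point
    emit({"type": "customer_created", "name": name,
          "correlation_id": correlation_id})
    return correlation_id

cid = create_customer("Ada")
assert all(e["correlation_id"] == cid for e in log)
```

Grepping the central log for one correlation ID now reconstructs the whole cross-process flow.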
Event based integration
- Async is less easy to reason about than sync (because it is not consecutive in time)
- Event-based says "this happened". Client has no idea who or what will react to it. Therefore highly decoupled.
- Often message brokers like RabbitMQ allow microservices to emit events and other services to subscribe. Brokers can even handle state of consumers and check what messages have been seen.
- Be cognizant of overly intelligent message brokers that hide too much logic in their middleware instead of in your system.
- Consider having multiple versions of certain endpoints co-exist in order to avoid forcing microservices and clients to upgrade in lock-step when there is a breaking change.
- Obviously not ideal to have two or more versions to support and test, but it is better than breaking changes. The goal is to move all clients to the new version gradually.
- Versions can exist in the URLs (e.g. api/v1/products) or in HTTP headers
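A toy router showing two versions co-existing behind URL prefixes (the paths and payload shapes are made up):

```python
def products_v1():
    return {"products": ["notes"]}                      # old, flat shape

def products_v2():
    return {"products": [{"name": "notes", "id": 1}]}   # new, richer shape

# Both versions stay routable until the last v1 consumer migrates.
ROUTES = {
    "/api/v1/products": products_v1,
    "/api/v2/products": products_v2,
}

def dispatch(path):
    return ROUTES[path]()

assert dispatch("/api/v1/products") == {"products": ["notes"]}
```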
Issues with REST
- Should you be blind to the client device? E.g. surely the endpoint of your admin's helpdesk application for /customers can afford to deliver much more data than it would to a bandwidth-constrained mobile app.
- If the client needs data from three endpoints (e.g. /recommendations), it could be slow and very chatty. Changes might need work across many teams.
- One solution is an API gateway that marshals multiple backend calls on behalf of the frontend. Issue here is that this gateway gets entangled with everything and you lose independence. A possible solution is to have one gateway for each client (e.g. helpdesk gateway, mobile app gateway, etc.)
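A sketch of the one-gateway-per-client idea (often called "backends for frontends"); the backend functions here are stand-ins for real network calls:

```python
# Hypothetical backend services (in reality, remote calls).
def customer_service(cid):
    return {"id": cid, "name": "Ada", "address": "1 Main St"}

def orders_service(cid):
    return [{"order": 1}, {"order": 2}]

def helpdesk_gateway(cid):
    # The helpdesk app can afford the full-fat aggregated response.
    return {"customer": customer_service(cid), "orders": orders_service(cid)}

def mobile_gateway(cid):
    # Bandwidth-constrained: just the name and an order count.
    c = customer_service(cid)
    return {"name": c["name"], "order_count": len(orders_service(cid))}

assert mobile_gateway("c1") == {"name": "Ada", "order_count": 2}
```

Each client team owns its own gateway, so no single gateway becomes entangled with everything.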
Build or buy?
- Main issue is lack of control when you integrate, say, WordPress as a base. Better to use WordPress as a service JUST FOR actual web pages while simultaneously doing everything else your way outside of WordPress and with whatever tools you want. WordPress now even has a CLI. Even if it didn't, you could use a proxy server. You might consider thinking more abstractly: how could you FAKE this entire service while building around it?
- Say your company uses a proprietary project management system. You might create facades around this that pulls out a) projects and b) employees in a way your other services can use. That way you can delegate domain concepts to the proprietary software while still maintaining control, testability, etc.
- Capture and intercept calls to legacy system and decide whether and where to route. Allows you to upgrade piecemeal.
Counterpoint: You might already be doing microservices
At Oxbridge Notes, the fact that my Heroku deploy talks to an SQL database via a network endpoint could be said to be a "microservice". As could my use of PayPal, S3, and other vendor integrations.
- Source: Building Microservices by Sam Newman