Previously, we identified and explored a set of components that enable us to communicate from a source to a destination using push-based and pull-based communication models. We explored application servers, load balancers, message queues, and data streams.
This second installment is designed to be consumable independently of the first and focuses on identifying a second set of components including message topics, event buses, and workflow engines. It explores how these components can help us operate and scale increasingly complex integrations. We examine how these building blocks can be used to achieve reliable, observable, flexible, and performant distributed processes.
Hopefully all without boring you senseless.
We will examine three types of components that are more complex than communicating from a source to a destination:
- Message fan-out
- Rule-based conditional routing of messages across a set of consumers
- Orchestrating and choreographing complex workflows
  - Event-driven choreography
  - Engine-driven orchestration
Note that these topics are exclusively related to asynchronous, pull-based processes. We will therefore not cover service meshes or other components that are designed solely with push-based distributed service integrations in mind.
Sending a message from a source to a destination can be thought of as passing a note in middle school. (I haven’t been in middle school for a while, so things may have changed. Either way, dinosaur that I am, I will proceed with this analogy.)
You have a note that you want to send to your buddy. If your buddy’s desk is next to yours, you just pass it along. If you’ve got good aim and you know where your buddy is sitting, you lob it over when the teacher is not looking. This method is both less reliable and less secure. If your note isn’t very private or you trust your neighbor not to look, you might write your buddy’s name on the cover and pass it off to your neighbor for delivery. This method also has reliability and security implications.
But let’s say you wanted to invite everyone in the class to your birthday party on Sunday. You might write a note and have everyone toss it around the class. Of course, your buddy would probably get mad at you if the note makes its way to everyone in the class except them.
A better approach might be to have a class bulletin board. You put your invitation up on the bulletin board, and your buddies check the bulletin board to see if any new events are posted for the class. This way, if your buddy doesn’t see the birthday invitation, they have only themselves to blame.
This approach fails in several scenarios. What if you want your buddies to attend your party, but your buddies are not very diligent about checking the bulletin board? What if you want to selectively invite a subset of the class?
And thus emerged the practice of personalized party invitations.
You make a list of everyone you want to invite to your party. You create a separate invitation for each person on the list. You deliver these invitations to everyone you want to invite. You check off when you’ve delivered the invitation and you check off RSVPs you receive.
The notepad approach is manageable for a small party with a short guest list. As the party grows, however, you need more capable tools, and your notepad has to evolve along with the scale of your problems:
Your invitation list becomes longer. You start needing to keep track of gifts you receive. You need to collect RSVP counts for catering reasons. When delivering all the invitations in person is no longer feasible, you need to record and keep track of mail addresses. You need to send out reminders when you haven’t heard back.
For my bar-mitzvah and my sisters’ bat-mitzvahs, my father kept spreadsheets. My wife and I used the same spreadsheets for our wedding.
Message complexity in human systems and message complexity in distributed systems are very similar.
Akin to the scenario of wanting to broadcast an invitation to everyone in your class, systems frequently need to broadcast information to a set of consumers. As such, engineers have built messaging primitives to enable pub/sub or publish/subscribe capabilities. In AWS, the simplest publish primitive is provided by Simple Notification Service or SNS. SNS enables you to create message topics to which you can publish messages.
Each message topic represents a bulletin board-like construct to which publishers can deliver messages. Messages can be posted to this bulletin board, and different consumers can subscribe to these messages. Subscribers can be Lambda functions, SQS queues, email addresses, or text messages. For an exhaustive list of subscribers, see the AWS documentation.
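The bulletin-board model can be sketched in a few lines of plain Python. This is a toy, in-memory illustration of fan-out, not the SNS API: the `Topic` class, its method names, and the `BirthdayInvitations` topic are all hypothetical, and a real topic would deliver asynchronously over the network with retries.

```python
from typing import Callable, Dict

# A handler stands in for a subscriber endpoint (a function, a queue, an inbox).
Handler = Callable[[dict], None]


class Topic:
    """Hypothetical in-memory topic: one publish, many deliveries."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._subscribers: Dict[str, Handler] = {}

    def subscribe(self, subscriber_id: str, handler: Handler) -> None:
        # Register (or replace) the handler for this subscriber.
        self._subscribers[subscriber_id] = handler

    def publish(self, message: dict) -> int:
        # Fan out: every subscriber receives its own copy of the message.
        for handler in self._subscribers.values():
            handler(dict(message))
        return len(self._subscribers)


# Two lists stand in for an email inbox and a text-message inbox.
inbox = {"email": [], "sms": []}

topic = Topic("BirthdayInvitations")
topic.subscribe("email", inbox["email"].append)
topic.subscribe("sms", inbox["sms"].append)

delivered = topic.publish({"event": "party", "day": "Sunday"})
```

The publisher only knows about the topic. Adding a third subscriber requires no change on the publishing side, which is the property that makes fan-out through a topic attractive.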
A single SNS topic allows up to 12.5 million subscriptions. As such, if you need to take a single message and fan it out to a lot of consumers, SNS is a pretty good bet. RabbitMQ and ActiveMQ allow you to define limitless subscriptions per topic, but since you’re in control of the infrastructure, you bear the responsibility of scaling it to meet the demand.
AWS does provide a managed Amazon MQ solution that lets you operate RabbitMQ and ActiveMQ clusters, but you need to interpret metrics and determine scaling policies yourself. This will usually add more operational complexity to your solutions than architecting with SNS limitations in mind.
Frequently, when you fan out messages in distributed systems, you might not want to fan all messages out equally.
Assume, for instance, that you have a topic called ProductPurchaseNotifications in a web store. Now, assume you are selling both physical and digital products from the same storefront. You might have an upstream system responsible for ordering new inventory, which is very interested in physical product purchases but not interested in digital products at all. You might also have a royalty platform that is only interested in digital products. You'll also probably have an email system that is indiscriminate and wishes to receive all product purchase notifications.
It is, of course, possible to deliver all product purchase notifications to all consumers and let them discard what they are not interested in, but this isn’t a very efficient solution. It results in a more congested network and resource waste.
Many pub/sub systems have therefore implemented filtering capabilities in the subscription layer. In fact, as one of the pre:Invent announcements in November 2022, Amazon added support for payload-based message filtering in SNS.
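To make subscription-layer filtering concrete, here is a minimal sketch in plain Python of the web store scenario. It is an illustration under stated assumptions, not the SNS implementation: the `product_type` field, the subscriber names, and the `matches` and `route` helpers are hypothetical. The policy shape (a field name mapped to a list of accepted values) is loosely modeled on how SNS filter policies are written, and only exact-match string values are handled here.

```python
from typing import Dict, List

# A policy maps a message field to the list of values the subscriber accepts.
FilterPolicy = Dict[str, List[str]]


def matches(policy: FilterPolicy, message: dict) -> bool:
    """Return True if the message satisfies every key in the policy.

    An empty policy has no constraints, so it matches every message.
    """
    return all(message.get(key) in allowed for key, allowed in policy.items())


# Hypothetical subscriptions on a ProductPurchaseNotifications topic.
subscriptions: Dict[str, FilterPolicy] = {
    "inventory": {"product_type": ["physical"]},  # reorders physical stock
    "royalties": {"product_type": ["digital"]},   # pays digital royalties
    "email": {},                                  # wants every purchase
}


def route(message: dict) -> List[str]:
    """List the subscribers that should receive this message."""
    return [name for name, policy in subscriptions.items()
            if matches(policy, message)]


physical_targets = route({"order_id": "o-1", "product_type": "physical"})
digital_targets = route({"order_id": "o-2", "product_type": "digital"})
```

With these policies, a physical purchase reaches the inventory and email subscribers only, and a digital purchase reaches the royalty and email subscribers only. The filtering happens before delivery, so uninterested consumers never see the message or pay for processing it.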