Cloud Data Warehouses (CDWs) are increasingly working their way into the dependency graph for important parts of the business: user-facing features, operational tools, customer comms, and even billing. Running this kind of operational work on a CDW might look promising initially, but as workloads expand, companies paint themselves into a corner:
Either the cost (in warehouse invoices) of delivering the work outpaces the value delivered, or hard performance limits inherent to the design of analytical data warehouses prevent teams from delivering the capabilities needed to serve the work in production.
Why?
Operational workloads have fundamental requirements that are diametrically opposed to those of analytical systems, and we’re finding that a tool designed for the latter doesn’t always solve for the former. That said, teams running operational work on the warehouse aren’t completely irrational. There are many good reasons for building this way, especially initially.
What is operational?
First, a working definition. An operational tool facilitates the day-to-day operation of your business.
Think of it in contrast to analytical tools, which facilitate historical analysis of your business to inform longer-term resource allocation or strategy.
If an operational system goes down for the day, there are people who will either be unable to do their job, or deliver a degraded service that day.
| Analytical Work | Operational Work |
| --- | --- |
| Historical analysis of the business | Day-to-day operation of the business |
| Informs longer-term resource allocation and strategy | Drives automated interventions in the business |
| Downtime is annoying but tolerable | Downtime blocks people's jobs or degrades service |
To simplify things, most operational work can be generalized as automated interventions in the business.
How is it different?
Going deeper into the technical requirements for analytical vs operational workloads, there are clear conflicts:
Static data is a feature for analytical work, but a bug for operational work.
When you’re doing iterative exploratory analysis or navigating between business reports, it’s convenient to be able to lock the input data down as a constant and assume only the SQL is changing.
But in operational workloads it’s reversed:
You want to lock down the SQL and always get as close as possible to the “current state” of data to operate on.
You don’t want to send customers notifications that no longer apply.
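To make the contrast concrete, here's a minimal sketch, assuming Snowflake-style time-travel syntax and hypothetical `orders` and `trials` tables:

```sql
-- Analytical: pin the inputs to a fixed snapshot so iterative
-- query editing runs against stable data (Snowflake-style time
-- travel; the exact syntax varies by warehouse).
SELECT customer_id, SUM(amount) AS total_spend
FROM orders AT (TIMESTAMP => '2024-06-01 00:00:00'::timestamp_tz)
GROUP BY customer_id;

-- Operational: the SQL is fixed, but every run must see current
-- state, e.g. don't notify a customer whose trial already converted.
SELECT customer_id
FROM trials
WHERE status = 'active'
  AND expires_at < CURRENT_TIMESTAMP + INTERVAL '3 days';
```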
Analytics needs historical data, operations needs fresh data.
Looking at how data changes over time is crucial to analytics, but less so for operations, where you mainly just want the data to be as fresh as possible.
Ad-hoc SQL queries are a vital part of analytical work, but not of operational work.
For analyst productivity, analytical tools need to be ready to answer a brand-new SQL query quickly, and most CDWs are heavily optimized for this, making architectural tradeoffs in its favor.
The operational workload, on the other hand, is more akin to traditional software development:
SQL might need to be iteratively written on a smaller scale of data in a dev environment, but in production the SQL is locked down by design.
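In practice, "locked down" SQL usually ships as a named, versioned object that the application calls by reference, not a string someone edits in a console. A minimal sketch, assuming a hypothetical expiring-trials use case and standard warehouse SQL:

```sql
-- Operational SQL deployed like code: reviewed, versioned,
-- and referenced by name from the application.
CREATE OR REPLACE VIEW expiring_trials_v1 AS
SELECT t.customer_id, c.email, t.expires_at
FROM trials AS t
JOIN customers AS c ON c.id = t.customer_id
WHERE t.status = 'active'
  AND t.expires_at < CURRENT_TIMESTAMP + INTERVAL '3 days';

-- Every production invocation runs the same fixed query;
-- only the underlying data changes between runs.
SELECT * FROM expiring_trials_v1;
```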
Uptime is nice to have for analytics, but it’s mandatory in operations.
This one is pretty self-explanatory. Yes, downtime is always annoying, but an operational system going down at 3am results in a pager going off and sleep being ruined. This is seldom the case for an analytical system.
It’s not all opposites, though.
Both types of work add value by combining different sources of data.
Both use SQL queries that are complex, join-heavy, and multi-level (see the sketch after this list).
Both need to handle many different teams’ workflows without disruption.
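As an illustration of that shared shape, here's a hedged sketch with hypothetical tables and standard SQL: a multi-level query built from layered CTEs and joins, equally at home behind a dashboard or an automated intervention.

```sql
-- Multi-level, join-heavy SQL: the shape is the same whether the
-- result feeds a report or an automated intervention.
WITH recent_orders AS (
  SELECT customer_id, SUM(amount) AS spend_30d
  FROM orders
  WHERE ordered_at > CURRENT_TIMESTAMP - INTERVAL '30 days'
  GROUP BY customer_id
),
open_tickets AS (
  SELECT customer_id, COUNT(*) AS ticket_count
  FROM support_tickets
  WHERE status = 'open'
  GROUP BY customer_id
)
SELECT c.id,
       c.email,
       r.spend_30d,
       COALESCE(t.ticket_count, 0) AS open_tickets
FROM customers AS c
JOIN recent_orders AS r ON r.customer_id = c.id
LEFT JOIN open_tickets AS t ON t.customer_id = c.id;
```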
A tool built from the ground up for operational purposes might share some design choices with analytical ones, but the differences add up: run operational work on a tool built for analytics and you get some pretty stark inefficiencies in both data freshness and total cost.
We come to praise Cloud Data Warehouses, not to bury them
In spite of all this, data teams continue to expand into operational work on the warehouse.
Why?
We asked, and here’s what came up: