The New York Times recently ran a piece on a purported sudden spate of banks closing customer accounts. Little of it is surprising if you have read previous issues of Bits about Money. The reported anecdotal user experiences have a common theme to them. Banks frequently present to their users as notably disorganized, discombobulated institutions. This is an alarming and surprising fact for the parts of society that are supposed to accurately keep track of all of the money.
Why does this happen? Why does it happen across issues as diverse as bank-initiated account closures, credit card or Zelle fraud, debit card reissuance, and mortgage foreclosures? Why does it happen in such a similar fashion across many institutions, of all sizes, who exist in vicious competition with each other and who know their customers hate this?
Banks are extremely good at tracking one kind of truth, ledgers. They are extremely bad at tracking certain other forms of truth, for structural reasons. In pathological cases, which are extremely uncommon relative to all banking activity but which nonetheless happen every day and which will impact some people extremely disproportionately, the bank will appear to lack object permanence. Every interaction of the user with it feels like being Bill Murray in Groundhog Day: the people you’re talking to remember literally nothing of what they’ve promised before, what you’ve told them, and the months or years of history that led to this moment.
How did we end up here?
Recordkeeping systems
Like every bureaucratic system, banks run on a formal system of recordkeeping which requires an unrecognized, illegible shadow system to actually function. The interactions between those systems, and what they are optimized for tracking and not optimized for, cause a lot of the pathologies that people see. The seminal text on this, focused on government bureaucracies, is Seeing Like a State.
Because banks are filled with extremely creative people, we call the primary system banking is conducted on a “core.” The largest banks in the world have complicated bespoke subsystems for this, but most banks are not in the software development business, and instead license a system from a so-called core processor like Jack Henry or Fiserv.
One could fill a book with architecture diagrams for a mid-sized financial institution. The key thing that non-specialists need to understand is a) the “core” does a lot of what you think of banking as, b) the core interfaces with many other systems which make up a bank, c) in particular, the core interfaces with the ledgers of the bank, and d) all of these systems together cannot represent reality nearly as well as you’d hope.
These systems typically grow over the years by accretion, caused by the normal processes of software development, regulatory changes, and competitive pressures. No system will ever be able to answer all interesting questions about a user; that is formally undecidable in computer science. Banks are extremely, painfully aware that the ordinary operation of the business of the bank will occasionally drop things on the floor. They have long since automated the fat head of customer issues, and the long tail is kicked over to operational and customer support teams.
Every time responsibility moves between subsystems, be they different organizations, different computer systems, or different groups within the bank, some percentage of cases will simply break. The boundaries of systems are responsible for a huge percentage of all operational issues at banks. (They’re also where most security vulnerabilities live: systems A and B usually agree on reality, but a bad actor can sometimes intentionally get them to disagree, in ways which cause the bad actor to gain value before A and B reconcile their view of reality.)
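To make "A and B reconcile their view of reality" concrete, here is a minimal sketch in Python. It assumes each system can export a simple mapping from account ID to balance; the function name, the account IDs, and the data shapes are all hypothetical, and real reconciliation at a bank is far more involved than this.

```python
from decimal import Decimal


def reconcile(system_a, system_b):
    """Return every account where the two systems disagree.

    Each argument is a hypothetical export: {account_id: balance}.
    The result maps account_id to (value_in_a, value_in_b), with None
    standing in for "this system has never heard of the account".
    """
    breaks = {}
    for account_id in system_a.keys() | system_b.keys():
        a = system_a.get(account_id)
        b = system_b.get(account_id)
        if a != b:
            breaks[account_id] = (a, b)
    return breaks


# Illustrative data only.
core_view = {"acct-1": Decimal("100.00"), "acct-2": Decimal("50.00")}
card_view = {
    "acct-1": Decimal("100.00"),
    "acct-2": Decimal("75.00"),
    "acct-3": Decimal("10.00"),
}
for account_id, (a, b) in sorted(reconcile(core_view, card_view).items()):
    print(account_id, "core:", a, "cards:", b)
```

The window between a disagreement arising and a job like this noticing it is exactly the window in which a dropped case, or a bad actor, lives.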
A major technological advance over the course of the last few decades has been ticketing systems, which strikes many technologists as being crazy, because they’re almost the simplest software that you can describe as software. All a ticketing system does is enforce an invariant: if there is a problem with a case number assigned to it, and it goes between Group A and Group B, Group A needs to know it no longer is responsible and Group B needs to know it is now responsible. Then you can do can’t-believe-they-pay-us-for-this computing and observe things like “Group B is now working on 10,342 cases”, “There are 76 cases which Group B has not acted on within the last month”, and “Ginger seems to be anomalously unproductive at closing out cases relative to her nearest coworkers.”
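For the technologists in the audience, the whole idea fits in a few dozen lines. This is a sketch under stated assumptions, not any vendor's actual product: the `Ticket` fields, the function names, and the 30-day threshold are all invented for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Ticket:
    case_id: int
    owner: str                # exactly one group is responsible at any time
    last_touched: datetime    # most recent activity, including handoffs
    history: list = field(default_factory=list)


def hand_off(ticket, from_group, to_group, now):
    """Move responsibility from one group to another, on the record."""
    # The invariant: only the current owner can give a case away.
    if ticket.owner != from_group:
        raise ValueError(
            f"case {ticket.case_id} belongs to {ticket.owner}, not {from_group}"
        )
    ticket.history.append((now, from_group, to_group))
    ticket.owner = to_group
    ticket.last_touched = now


def open_cases_by_group(tickets):
    """'Group B is now working on N cases.'"""
    return Counter(t.owner for t in tickets)


def stale_cases(tickets, group, now, max_age=timedelta(days=30)):
    """Cases owned by `group` that nobody has touched within `max_age`."""
    return [
        t.case_id
        for t in tickets
        if t.owner == group and now - t.last_touched > max_age
    ]
```

Everything valuable here comes from the single `owner` field and the refusal to change it off the record; the reports are just queries over that one fact.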
So why didn’t ticketing systems solve this problem? Part of it is that the problem is self-referential: the ticketing system is not the core. The ticketing system is not the subsystem that is directly responsible for anything of interest to you. The ticketing system is an entirely new system, which requires integration with other subsystems and which will frequently need to do handovers to them. This interface is frightening, unexplored territory where new classes of issues that you’ve never seen before can spring up.
Bank systems are an interesting combination of designed and accidental. They accrete like sedimentary layers. A particular force which affects banks more than most institutions is that the banking industry has undergone decades of consolidation. When banks merge, one bank doesn’t simply eat the other and digest its balance sheet and people. They end up running their systems in parallel for years while working out an integration plan. That plan will, almost inevitably, cause one of the systems to mostly “win” and the other system to mostly “lose”, but for business reasons, something of the loser will be retained indefinitely. It now has to be grafted onto the winner, despite frequently being itself decades out of date, having its own collection of grafted acquirees partially attached to it, and needing expert input from people who are no longer with the firm.
Users can watch this play out in real time. For example, I banked at First Republic and also bank at Chase, which now owns First Republic. In something which sounds unimpressive and would blow the mind of bank CTOs from as recently as ten years ago, both sides of the bank understand that the same person has an account on the other side. (You wouldn’t think testing Social Security Numbers for equality requires any high-tech wizardry, and you’d be right. The thing which was actually hard was building a process to allow complex ad hoc bidirectional synching of systems that were not built in tandem with each other.)
But because that integration is ongoing and will take years to resolve, neither part of the bank knows consequential things the other part knows about me, even where it strikes most people as obvious that they should. Chase eagerly communicates timelines for transitioning the home loan that First Republic very definitely never wrote. It is utterly clueless about the Line of Credit that they factually did extend. And it will require a lot of midnight oil from hundreds or thousands of people for most of another year before I can walk into a Chase branch and ask what the balance is on an account serviced by First Republic’s core.
Human accountability and its malcontents
So let’s talk about how banks spackle over the infelicities in their systems.
First, the bank builds many subsystems which interface with its core processing systems and ledgers. These systems are built so internal bank staff can see what a customer has done in their accounts and, perhaps, act upon those accounts on their behalf.
For those keeping score: yep, this interface boundary is another place which can cause the bank to fail to agree with reality. Relatively simple programming issues can cause the staff-exposed view of an account to fail to agree with reality known to the bank.
For example, they not infrequently fail to show some staff transactions which are “pending.” In many cases, “pending” has consequences which are extremely similar to being finalized from the perspective of the user, but a particular system might simply not show them. You’d think that is a confusing choice to make and often underrate the possibility that no one ever made this choice, not really. Sure, it exists (inarguably in this case) in code, and that code might be described in a requirements analysis document that someone handwaved together 18 years ago, but nobody ever said “Nah, exclude pending transactions”. This was a simple oversight, projected into the future indefinitely, to the enduring annoyance of old hands among staff and the conti