I am convinced there exists something we can call statistical literacy.
Unfortunately, I don’t yet know exactly what it is, so it is hard to write about.
One thing is clear: it is not about knowledge of statistical tools and
techniques. Most of the statistically literate people I meet don’t know a lick
of formal statistics. They just picked up statistical literacy from … somewhere.
They don’t know the definition of a standard deviation, but they can follow a
statistical argument just fine.
The opposite is also possible: a few years ago I had a formidable toolbox of
statistical computations I was able to do, but I would be very confused by a
basic statistical argument outside the narrow region of techniques I had
learned.
In other words, it is not about calculations. I think it is about an intuitive
sense for process variation, and how sources of variation compare to each
other.
Content warning: this is the most arrogant article I’ve written in a long time.
I ask you to bear with me, because I think it is an important observation to
discuss. Unfortunately, I lack the clarity of mind to make it more approachable:
the article is arrogant because I am dumb, not because of the subject matter
itself.
Hopefully, someone else can run with this and do a better job than I.
It’s hard to write directly about something when one doesn’t know what it is, so
we will proceed by analogy and example.
Back in the 1500s, shipping insurance was priced under the assumption that if you
just knew enough about the voyage, you could tell for certain whether it would
be successful or not, barring the will of God. Thus, when asked to insure a
shipment, the underwriter would thoroughly investigate things like the captain’s
experience, ship maintenance status, size of crew rations, recency of
navigational charts, etc. After much research, they would conclude either that
the shipment ought to be successful, or that it ought not to be. They arrived at
a logical, binary conclusion: either the shipment will make it (based on all we
know) or it will not. Then they quoted a price based on whether or not the
shipment would make it.
This type of logical reasoning leads to a normative perspective of what the
future ought to look like. Combined with the idea that every case is unique,
this is typical of statistical illiteracy. The statistically
illiterate predict what the future will look like based on detailed knowledge
and logical sequences of events. Given that we hadn’t yet invented statistics in
the 1500s, it is not surprising our insurer would think that way.
Of course, even underwriters at the time knew that sometimes ships that ought
to make it run into a surprise storm and sink. Similarly, ships that ought not
to make it are sometimes lucky and arrive safely. To the 1500s insurer, these
are expressions of the will of God, and are incalculable annoyances, rather than
factors to consider when pricing.
This is similar to how a gambler in the 1500s could tell you that dice were
designed to land on each number equally often – but would refuse to give you a
probability for the next throw, because the outcome of any given throw is “not
uncertain, just unknown”: God has predetermined a specific number for each
throw, and we have no way of knowing how God makes that selection.1 This
distinction between the uncertain and unknown still happens among the
statistically illiterate today.
The revolution in mindset that happened in the 1600s and 1700s was that one
could ignore most of what made a shipment unique and instead price the
insurance based on what a primitive reference class of shipments had in common,
inferring general success propensities from that. Insurers that did this
outprofited those that did not, in part because they were able to set a more
accurate price on the insurance, and in part because they spent less on
investigating each individual voyage.
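To make the reference-class idea concrete, here is a minimal sketch of how such a price could be computed. Everything in it is an assumption for illustration: the function name, the record of 100 past voyages with 8 losses, and the insured value are all made up, and real insurers of the period certainly did not work from tidy lists like this one.

```python
def reference_class_premium(outcomes, insured_value, loading=0.0):
    """Price a policy from the loss rate of a reference class of voyages.

    outcomes: list of booleans, True meaning the ship arrived safely.
    insured_value: payout owed if the ship is lost.
    loading: hypothetical markup on top of the break-even price.
    """
    losses = sum(1 for arrived in outcomes if not arrived)
    loss_rate = losses / len(outcomes)
    # Break-even premium is the expected payout; loading adds a margin.
    return insured_value * loss_rate * (1 + loading)


# Hypothetical record: 100 comparable past voyages, 8 of them lost.
past_voyages = [True] * 92 + [False] * 8

premium = reference_class_premium(past_voyages, insured_value=1000)
print(f"{premium:.2f}")  # prints 80.00
```

Note what the function does not take as input: the captain’s experience, the state of the charts, or anything else specific to the voyage being insured. The price comes entirely from how often similar voyages were lost.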
I like the mid-1800s quote from Lecky commenting on the rise of rationalism,
saying
My object in the present work has been to trace the history of the spirit of
Rationalism: by which I understand not any class of definite doctrines or
criticisms, but rather a