There are many catchphrases about Haskell.
- Haskell is useless.
- Haskell aims to avoid success at all costs.
- Haskell is the best procedural language in the world.
These sound like dismissals or absurdities from the outside, but once you learn
what they really mean, they take on a new light. In this article, I want to
explain the third. (See the appendix if you are curious about the first two.)
This article really, really tried to become a monad i/o tutorial, but I think
I stopped it in time.[1] By that I mean I had to rewrite it twice and delete
large chunks of monad i/o tutorial material. Here, we are going to jump right
in and focus on the interesting stuff.
Effectful computations in Haskell are first class values. This means we can
store them in variables or data structures for later use. There is a Haskell
function
randomRIO :: (Int, Int) -> IO Int
which, when given two integers as arguments, picks a random integer between
them. We can put calls to this function into a list, like so:
some_dice = [ randomRIO(1, 6), randomRIO(1, 6) ]
This is a list of two calls to randomRIO. What surprises non-Haskellers is
that when this list is created, no random numbers are generated. Coming from
other programming languages, we are used to side effects (such as random
generation) being executed directly when the side effectful function is
called.[2] You may think Haskell is different here due to laziness, but that’s
also not true. Even if we put these calls into a strict data structure, no
randomness would happen.
We can add more random generation to the list:
more_dice = some_dice <> [ randomRIO(1, 6) ]
and still no random numbers will be generated. We can go ahead and manipulate
this list in all sorts of ways, and still no random numbers would be
generated.
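To make this observable without relying on randomness, here is a minimal sketch that swaps randomRIO for a hypothetical tick effect which bumps a counter. The names tick and build_then_run are illustration only, and sequence_ is a standard function (not otherwise discussed here) that executes every side effect object in a list.

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Hypothetical stand-in for randomRIO: an effect we can observe.
-- Executing the returned object adds one to the counter.
tick :: IORef Int -> IO ()
tick counter = modifyIORef' counter (+ 1)

-- Returns the counter value after building the list of side effect
-- objects, and again after executing them.
build_then_run :: IO (Int, Int)
build_then_run = do
  counter <- newIORef 0
  -- Calling tick only creates side effect objects; nothing is counted yet.
  let some_ticks = [tick counter, tick counter]
      more_ticks = some_ticks <> [tick counter]
  built <- readIORef counter
  -- Executing the objects is what makes the effects happen.
  sequence_ more_ticks
  executed <- readIORef counter
  pure (built, executed)

main :: IO ()
main = do
  (built, executed) <- build_then_run
  putStrLn ("ticks after building the list: " <> show built)
  putStrLn ("ticks after executing the list: " <> show executed)
```

Under these assumptions the counter reads 0 after the list is built and 3 after it is executed, which matches the claim that creating and manipulating the list runs nothing.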
To be clear, the randomRIO function could well be called.[3] Whether this
actually happens is a question of lazy evaluation, optimisation, etc. When
it is called, it returns a value of type IO Int. It’s just that this value is
not an integer. If anything, we can think of it as a set of instructions for
eventually, somehow, getting an integer. It’s not an actual integer. It’s an
object encapsulating a side effect. When this side effect object executes, it
will produce a random integer, but the object itself just describes the
computation; it is not an integer.
In other words, in Haskell, it is not enough to call a side effectful function
to execute its side effects. When we call the side effectful function, it
produces an object encapsulating the side effect, and this object can be
executed in the future to produce the result of the side effect.[4] Readers
familiar with JavaScript promises will recognise part of this concept, with
one difference: a promise starts running as soon as it is created, whereas a
Haskell side effect object does nothing until it is executed.
The common way we teach beginners to execute side effect objects is by
calling them from a do block, using the special <- assignment operator to
extract their result. As a first approximation, we can think of the following
code as the way to force side effects to execute.
dice_print = do
  side <- randomRIO(1, 6)
  printf "It landed on %d\n" side
We can imagine that the <- arrow executes the side effect object returned by
randomRIO and captures the value it produces. Similarly, the side effect
object returned by printf gets executed, but we don’t capture the result; we
don’t care about the value produced by it, we only care about the side effect
itself.
The lie-to-children here is that we pretend the do block is magical and that
when it executes, it also executes side effects of functions called in it. This
mental model will take the beginner a long way, but at some point, one will want
to break free of it. That is when Haskell starts to really shine as a procedural
language.
This article features another lie-to-children: it will have type signatures
specialised to IO a and [a]. All the functions I mention are more generic
than I’m letting on.
- Anywhere this article says IO a it will work with any type of side effect
  (like Maybe a, Rand g a, StateT s m a, etc.)
- Anywhere this article says [a] it probably also works with other
  collection/container types (like Maybe a, Array i a, HashMap k a, Tree a,
  etc.)
The reason this article uses more direct type signatures is to hopefully be
readable also to someone who does not use Haskell.[5] If the reader has never
seen ml style syntax before, I think the most important thing to know is that
function calls aren’t written like isInfixOf("llo", "hello, world\n") but
rather with spaces, as in isInfixOf "llo" "hello, world\n".
In order to drive the point home, we’ll start by seeing what it is the do
block actually does, because it’s not magic at all. In fact, every do block can
be converted to just two operators. If you already know this, skip ahead to the
next section.
then
If we want to combine two side effects into one, we can use the *> operator,
which is pronounced then or sequence right. It takes two side effect objects
and creates a new side effect object that executes both when it is itself
executed. The value produced by this new composite object is going to be the
value produced by the second object it’s constructed from. In that sense, the
*> operator is a lot like the comma operator in C: it chains together
statements, and returns the result of the last.
(*>) :: IO a -> IO b -> IO b
For example, here we combine two print statements into a single side effect
object.
greeting :: IO ()
greeting = putStr "hello, " *> putStrLn "world"
This is a single side effect object called greeting, but its execution will
involve the execution of two separate print statements.
We teach beginners to write this as
greeting = do
  putStr "hello, "
  putStrLn "world"
which is the exact same thing, although arguably easier to read. The interesting
thing is the implication for how we look at do blocks. It turns out they may
not be magical at all; maybe they are just inserting implicit commas, i.e. a
pretty way of taking multiple side effect objects and combining them into a
single, bigger, side effect object.
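The comma-like behaviour can be checked with a small sketch. It assumes a hypothetical next_ticket effect (not part of any library) that hands out 1, 2, 3, … on successive executions, so we can see both that each effect ran and which value the combined object produced.

```haskell
import Data.IORef (IORef, atomicModifyIORef', newIORef)

-- Hypothetical effect: each execution bumps the counter and produces
-- the new value, i.e. 1, 2, 3, ... on successive executions.
next_ticket :: IORef Int -> IO Int
next_ticket counter = atomicModifyIORef' counter (\n -> (n + 1, n + 1))

-- Two effects combined with the *> operator.
two_with_then :: IORef Int -> IO Int
two_with_then counter = next_ticket counter *> next_ticket counter

-- The same thing written as a do block.
two_with_do :: IORef Int -> IO Int
two_with_do counter = do
  next_ticket counter
  next_ticket counter

main :: IO ()
main = do
  a <- newIORef 0
  b <- newIORef 0
  x <- two_with_then a
  y <- two_with_do b
  -- Both forms produce the value of the second effect, which is 2
  -- because the first effect was executed before it.
  print (x, y)
```

Both versions produce 2: the first ticket is taken and discarded, and the second one becomes the result of the combined object.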
bind
The one thing we cannot do with *> is take the result from the left-hand side
effect and use it to influence the right-hand side function call, because the
*> operator discards the first result before executing the second effect –
just like the comma operator in C. The die throwing code we saw previously,
dice_print = do
  side <- randomRIO(1, 6)
  printf "It landed on %d\n" side
cannot be implemented with just *>. We need the additional operator >>=
which takes a side effect object and plugs the value it produces into another
side effectful function.[6] This operator is widely known as bind.
(>>=) :: IO a -> (a -> IO b) -> IO b
Using this operator, we could write the above do block as
print_side :: Int -> IO ()
print_side side = printf "It landed on %d\n" side
dice_print :: IO ()
dice_print = randomRIO(1, 6) >>= print_side
and this would take the result of executing the side effect of randomRIO and
plug it into another side effectful function, namely print_side.
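When a do block captures more than one result, the same translation applies repeatedly, with an anonymous function taking the place of each named helper. The following is a sketch under assumptions: next_ticket is a hypothetical effect producing 1, 2, 3, … on successive executions, and pure is a standard function that wraps a plain value in a side effect object.

```haskell
import Data.IORef (IORef, atomicModifyIORef', newIORef)

-- Hypothetical effect: produces 1, 2, 3, ... on successive executions.
next_ticket :: IORef Int -> IO Int
next_ticket counter = atomicModifyIORef' counter (\n -> (n + 1, n + 1))

-- A do block that captures two results with the <- arrow.
sum_with_do :: IORef Int -> IO Int
sum_with_do counter = do
  x <- next_ticket counter
  y <- next_ticket counter
  pure (x + y)

-- The same computation with explicit bind: each <- becomes a >>=
-- followed by an anonymous function that receives the captured value.
sum_with_bind :: IORef Int -> IO Int
sum_with_bind counter =
  next_ticket counter >>= \x ->
    next_ticket counter >>= \y ->
      pure (x + y)

main :: IO ()
main = do
  a <- newIORef 0
  b <- newIORef 0
  s1 <- sum_with_do a
  s2 <- sum_with_bind b
  print (s1, s2)
```

Both versions take tickets 1 and 2 and produce their sum, 3.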
Two operators are all of do blocks
This illustrates that do blocks are built from only two operators. If we stick
with do blocks for all side effects, we will never learn why Haskell is the
greatest procedural programming language in the world, because we are limiting
ourselves to just two operators for dealing with side effects.
Let’s lift our gaze and see what happens when we look beyond those. There are
more functions for dealing with side effects.
We’ll start with the basics and work our way up.
pure
If we want to construct a new side effect object that always produces a
specific value, we can use the function pure.
For example, this creates a side effect object that always produces the integer 4.
loaded_die :: IO Int
loaded_die =
  -- Chosen by fair dice roll.
  -- Guaranteed to be random.
  pure 4
Creating side effect objects that always produce a known value might not seem
very useful, but it comes up all the time when bridging the worlds of pure code
and side effects.
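One place where this bridging shows up is a conditional whose branches must both be side effect objects. The sketch below is an assumption-laden illustration: pick_die and next_ticket are hypothetical names, and next_ticket stands in for a fair die so that we can observe whether it was executed.

```haskell
import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef)

-- Hypothetical stand-in for a fair die: produces 1, 2, 3, ... and lets
-- us observe how many times it has been executed.
next_ticket :: IORef Int -> IO Int
next_ticket counter = atomicModifyIORef' counter (\n -> (n + 1, n + 1))

-- Both branches must have type IO Int, so the known value 4 is wrapped
-- with pure to sit alongside the real side effect object.
pick_die :: Bool -> IO Int -> IO Int
pick_die cheat fair_die = if cheat then pure 4 else fair_die

-- Returns the loaded result, how many times the fair die had been
-- executed at that point, and then a result from the fair die.
demo :: IO (Int, Int, Int)
demo = do
  counter <- newIORef 0
  let fair_die = next_ticket counter
  loaded <- pick_die True fair_die
  used <- readIORef counter
  fair <- pick_die False fair_die
  pure (loaded, used, fair)

main :: IO ()
main = demo >>= print
```

In this sketch the cheating branch produces 4 without executing the fair die at all, even though the fair die object was passed in as an argument.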
fmap
One of the most used functions when working with side effects in Haskell is
fmap.
fmap :: (a -> b) -> IO a -> IO b
This takes a pure function, and a side effect object, and returns a new side effect
object that is similar to the one it got, except the value produced will be
transformed by the function first.
Transforming the results of side effects is so common that fmap has an
operator alias: <$>. For example, to get the length of the path to the user’s
home directory, we can do
path_length :: IO Int
path_length = length <$> getEnv "HOME"
-- equivalent to
-- path_length = fmap length (getEnv "HOME")
This creates a new side effect object which will produce the result of applying
the length function to the result of the side effect of getEnv.
liftA2, liftA3, …
Where fmap allows us to transform the value produced by a single side effect,
sometimes we need to create a side effect object that produces something based
on multiple other side effects. This is where liftA2 and friends come in.
liftA2 :: (a -> b -> c) -> IO a -> IO b -> IO c
liftA3 :: (a -> b -> c -> d) -> IO a -> IO b -> IO c -> IO d
These can be thought of like fmap, except they don’t transform the result of
just one side effect, but they combine the results of multiple side effects with
one function, and create a new side effect object that produces the return value
of that function.
This is also common enough that it can be written with two operators: the <$>
we already saw for the first argument, and then <*> for the rest of the
arguments. To check if a user’s home directory contains the user name, we do
home_username :: IO Bool
home_username = isInfixOf <$> getEnv "LOGNAME" <*> getEnv "HOME"
-- equivalent to
-- home_username = liftA2 isInfixOf (getEnv "LOGNAME") (getEnv "HOME")
This has created a new side effect object that will produce true if $LOGNAME
is a part of $HOME, and false otherwise.
Intermission: what’s the point?
I am raving ecstatic about this. Combining the results of side effects through
liftA2 and friends is such a fundamental technique that the favicon of this
website is a stylised version of the <*> operator.
But the reader may understandably be a little underwhelmed. It seems like we
have only learned to do in Haskell what we can do in Python and every other
programming language already. All popular programming languages let us use the
results of side effects as arguments to other functions. There are already two
reasons we should care about Haskell, though, even before we see what’s to come.
These are refactorability and discipline.
The biggest benefit is probably refactorability. We might have the following
code that throws two dice and sums them up:
sum_dice :: IO Int
sum_dice = liftA2 (+) (randomRIO(1,6)) (randomRIO(1,6))
and we think the repetition is annoying, so we put the actual die-tossing code
into a variable.
sum_dice :: IO Int
sum_dice =
  let toss_die = randomRIO(1,6)
  in liftA2 (+) toss_die toss_die
If we did this in Python, we would accidentally store the result of the die
toss in the toss_die variable! If the die lands on 3, we will compute 3+3,
rather than the intention of summing two different tosses. With the way side
effects are first class values in Haskell, we are always free to blindly (!)
extract code into a variable name, and this will never change how the code
runs.[7] This is called equational reasoning, and it’s incredibly powerful. It
is one of the things that make Haskell such a strong enterprise language, and
so nice to work with procedural code in.
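The refactoring claim can be checked with a deterministic stand-in for the die. In this sketch, next_ticket is a hypothetical effect producing 1, 2, 3, … on successive executions; if naming the side effect object caused it to run only once, the refactored version would produce 1+1 instead of 1+2.

```haskell
import Control.Applicative (liftA2)
import Data.IORef (IORef, atomicModifyIORef', newIORef)

-- Hypothetical stand-in for a die toss: produces 1, 2, 3, ... so that
-- repeated executions are distinguishable.
next_ticket :: IORef Int -> IO Int
next_ticket counter = atomicModifyIORef' counter (\n -> (n + 1, n + 1))

-- The original shape: the effectful call is written out twice.
sum_inline :: IORef Int -> IO Int
sum_inline counter = liftA2 (+) (next_ticket counter) (next_ticket counter)

-- The refactored shape: the side effect object is given a name and the
-- name is used twice. The effect still executes twice.
sum_named :: IORef Int -> IO Int
sum_named counter =
  let take_ticket = next_ticket counter
  in liftA2 (+) take_ticket take_ticket

main :: IO ()
main = do
  a <- newIORef 0
  b <- newIORef 0
  x <- sum_inline a
  y <- sum_named b
  print (x, y)
```

Both versions produce 3, so extracting the call into a variable did not change how many times the effect runs.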
A second benefit is that by being structured in how we allow side effects to
affect subsequent computation, we have a lower risk of accidentally introducing
side effects where they were not intended. We also have greater control over
what can affect what, reducing the risk of interaction bugs.
The reader might, for example, argue that the previous refactoring example works
just fine in Python, by using an anonymous function as a faux side effect object:
import random
def sum_dice():
    # random.randint includes both endpoints, so (1, 6) is a six-sided die.
    toss_die = lambda: random.randint(1, 6)
    return toss_die() + toss_die()
but this (a) requires being careful in refactoring, (b) does not prevent
accidentally triggering the side effect where it was not intended, and (c)
requires a language feature (sequence points) to disambiguate in which order
side effects get executed. With Haskell, we just don’t have to care. W