Categories: Programming
Introduction
I have plenty of experience in procedural and object-oriented programming, but little in functional styles (just one project in Scala). Just for fun, I decided to learn a little Haskell programming, and found it interesting and challenging but not mind-bending - except for monads. For some reason, the way monads are described in all the standard documentation just made my head spin.
After much pondering, I think I’ve got it figured out - and it isn’t so hard after all. Below is an explanation aimed at object-oriented programmers in general. It’s a practical explanation, not a mathematical one; I’m really interested in what this concept can do for me as a developer rather than the technical details of how it is defined.
Note that the monad pattern doesn’t occur only in Haskell - it comes up in many functional languages, and can also be useful in object-oriented programming.
This has turned out to be quite a long article, so you might want to read the summary at the end first, to decide whether you really want to bother with the rest. At the least, it’s a counter-balance to the many short articles on monads I read which stopped just when things were getting interesting; here all the details are spelled out at (possibly excessive) length. This article could be considered an extension of the wikipedia article on monads in functional languages; if you can understand that page then you don’t need this article!
Contents
- Monad as a Design Pattern
- Monad Pattern Benefits
- Monads in Code
- Some Concrete Examples
- A Quick Look at the Monad Laws
- Conclusion
- Summary
- Appendix 1: The State Pattern
- Appendix 2: Haskell’s IO Monad Datatype
- References
Monad as a Design Pattern
The official definition of a monad is a set of constraints; anything that matches these constraints is a monad. But to a developer, a monad is a design pattern.
The definition doesn’t really define names for the parts of the pattern; it says things like “there must be an operation such that ..”. Different languages/communities then have different conventions for what to call these operations. In addition, there are several equivalent formulations of the pattern. In the same way that a mathematical equation can be written in multiple equivalent forms, the monad laws can also be written in various ways, eg using map+join or the “Kleisli composition operator” (which Haskell calls the “fish operator” as it is written >=>
). However the most common definition of a monad is based around an operation typically called “bind” (or compose or flatmap).
Because it is a pattern, it can be implemented in just about any language, without support from the language or its standard library - as long as the language supports generics and higher-ordered-functions, ie functions as parameters to other functions. However there are libraries for various languages that provide specific base types that monadic types can be derived from, and a few languages provide useful “sugar syntax” for working with types that derive from specific types in their standard libraries. Haskell’s standard monad type and “do” keyword are an example, and I believe F# also has custom features for this.
The things that the monad pattern can provide are described in general terms below, then this article takes a closer look at how a monad definition might look to an object-oriented programmer. Next there are some examples of specific monad implementations to get an idea of how this works in practice. Finally the official monad definition is looked at again in general terms, particularly how it corresponds to the examples.
There are occasional code examples here; these are given in pseudo-code that are based on Kotlin - a language less verbose than Java, but still in the same family.
Some useful terminology: monad is the name of the general pattern (and sometimes of a standard base class); a monadic type is a specific implementation.
Interestingly, the word monad originally comes from greek philosophy and then was used in category theory (a branch of mathematics). The relevance of these theoretical concepts to functional programming was really only revealed by Eugenio Moggi in 1989 (quite late for a core software concept) and the full current understanding (including functors and monoids) was only formulated around 2010 (see history section of previous link).
Monad Pattern Benefits
There are a few things that monadic types can do which makes them worthwhile looking at. They can:
- compose functions
- manage hidden state
- represent pipelines of transformations
- act as containers of multiple values
- perform “control flow” over a sequence of operations
The next few sections looks at these aspects in turn, although they are actually related, ie are different “views” of the same behaviour with different emphasis.
Monads as Function Composers
One of the principles of functional programming is function composition: the idea of producing one function from multiple others. Among other uses this is a building block of abstraction.
Consider a couple of function definitions (using functional-language-style syntax here):
inc = i -> i + 1
intToString = i -> tostring(i)
It is possible to create a new function which combines these:
incThenString = i -> intToString(inc(i))
This is function composition. And in fact it is possible to define a function that creates such functions for you (example in Kotlin):
fun<T1,T2,T3> compose(f1: (T1)->T2, f2: (T2)->T3) : (T1)->T3 = { t -> f2(f1(t)) }
val inc : (Int) -> Int = { x -> x + 1 }
val intToString : (Int) -> String = { x -> x.toString() + "!" }
val incThenString = compose(inc, intToString)
println(incThenString(12)) // "13!"
This function compose
takes any two functions and returns a new function that applies them one after the other - providing the parameter type of each function matches the return type of the preceding one. This is often implemented as an infix operator, ie f1 compose f2
, and sometimes written >.>
ie f1 >.> f2
. Data can be considered to “flow” through the functions in the order listed (and as shown later, this is how monads compose functions).
Actually, “dot” composition is probably the more common function composition operator outside of monads, and combines the functions in the other order: f . g
means f(g(x))
, ie data is processed first by g and then by f. Data “flows” from right to left - which sometimes matches “normal english usage” better: sqrt . sum . squareall
produces a combined function which calculates “the square root of the sum of the squares of all input values”.
There is, however, a limitation on simple function composition: the composed functions cannot influence each other except via the output of one being the input of the other. Monadic composition instead requires input functions with signatures (T1)->SomeMonad<T2>
and (T2)->SomeMonad<T3>
and produces a combined function with signature (T1)->SomeMonad<T3>
, ie the functions to be composed need to wrap their results. In return for this extra complexity, this supports a “hidden side channel” between the functions that allows the resulting (composed) function to take into account things other than just the input and output parameters of each function.
The way that the side-channel is used depends upon the type SomeMonad, ie there isn’t just a single composition implementation; each monadic type defines its own implementation. And note that normal function composition doesn’t work for two functions (T1)->SomeMonad<T2>
and (T2)->SomeMonad<T3>
because the output type of the first function is not equal to the input type of the second; instead monadic composition is required. This ability to support a “side channel” is why the monad’s version of function composition is sometimes called function composition on steroids.
As noted earlier, there are various ways of expressing the core monad pattern. One way is using the “Kleisli composition operator” aka “fish operator” (written >=>
which looks like a fish); this truly takes two functions with the appropriate (monad-returning) signatures and returns a new function. Another way is using the “bind operator” (written >>=
) in which a value of a monadic type internally wraps/hides one function and a single new function is then “bound to” or “composed with” the internal one. Yet another also embeds a function “inside the monadic value” and then builds the necessary composition logic from two operations: map
and join
(aka flatten
). All these are equivalent but provide different “apis” to the monad pattern; the “bind” approach is generally the most useful for programmers (leads to the shortest code). The “bind” operation is called flatmap
in some contexts.
Of course in most cases we want to eventually get the result of the execution of composed functions - something not directly available as the composed functions accept an input but return a wrapper (SomeMonad<T>
) rather than a directly useful value. Monadic types therefore generally provide additional functions to “unwrap” a value of that type, ie “extract” the wrapped value. That isn’t part of the monadic pattern though - and in fact Haskell’s IO monad does not provide any unwrap operation; its whole goal is to compose functions which have side effects.
This ability to compose functions is really the basis for the other benefits such as state-management, pipelines, and collections.
Monads as Hidden State Managers
In object-oriented languages, the concept of objects with private fields is, well, a core thing. And in an object-oriented language we have things like the builder-pattern. In the purest form of the builder pattern, the builder type provides a set of methods where each call returns a copy of the builder object it was invoked on, with one or more fields updated. Eventually we ask the builder to create some final object. During the sequence of calls to builder methods, there is effectively some “hidden state” that is being propagated at each call from the old to new object.
In functional languages, implementing something like a builder is more complex. Functions can have hidden state, ie values that can affect their outputs but are not visible in their parameter-lists; see closures or partial application. However there is no obvious mechanism for passing this along a sequence of function calls. Monadic function composition provides a way to pass this state from one “function evaluation” to another. Many functional languages use this to implement things like the object-oriented “builder” pattern or state-machines.
For object-oriented programmers, this might actually feel pretty obvious. A monad is an object with state which has a method which takes some parameter and produces another object with state - that’s pretty much what any (immutable) object does. However it’s important to note a couple of things..
First, state management isn’t perhaps so obvious to mathematicians or functional programmers - and their different view of state does tend to affect the terminology which they use when talking about these things.
And second, while there are obvious ways to do similar things in object-oriented languages, what the monad pattern does provide is a simple and consistent API to things doing state-management. Each monadic “object” wraps some value that is “the thing being processed” and provides a way to apply a transformation to that value; code which operates on this value needs to know nothing else about it. This provides strong decoupling between code that “transforms values of type X” and the code that decides how hidden state might be used to pre-process the value before the transformation or post-process the result.
Monads as Pipelines
When you have a chain of different operations that you want to apply to some starting type T then the monad pattern allows you compose them together, ending up with a single function (convenient). Once you’ve built this combined function, you invoke it and you get back the final result of all the functions - though due to the way the pattern is defined, there are some complications about providing the initial value and extracting the final result; that’s discussed later.
This is the “pipeline” or “lazy” view of the monad pattern; focusing on the “bind” operation leads us to think of it as effectively “building a program to execute later”. This is a very useful concept - eg when processing a stream of input values.
Of course regular imperative programming languages support declaring functions consisting of a sequence of statements, ie if you want to just execute a sequence of transforms, you could do this instead:
fun myFunction(a : SomeType) {
val b = f1(a);
val c = f2(b);
val d = f3(c);
return f4(d);
}
However the monad pattern provides an elegant “chained function call” syntax to express such sequences. And (far) more importantly, the monadic type can implement custom control flow ie can decide which of the functions in a pipeline are executed when (based on hidden state). This is discussed further in the section “Monads as a Programmable Semicolon”.
The monad pattern doesn’t require that the functions it composes together are pure functions; they can potentially have side-effects. In fact, a function can even have no return-value (return type void or unit in many languages). A monad therefore represents “a sequence of operations” rather than “a pipeline of transformations”. When bind
is called with a function that returns no value (Kotlin Unit
, Java void
) then the resulting monad’s type is M<Unit>
or similar. The next operation bound to this monad obviously won’t take a parameter, so sometimes a variant of the bind method (often written >>
) is defined whose parameter is not a function but simply an instance of a monad; see later.
A pipeline does need some way to trigger its execution. The previous section on hidden state mentioned that a monadic type typically has at least one additional operation to extract/unwrap the value within the monad; that is typically the trigger for evaluation of a pipeline ie “pulling” a value from the end of the pipeline causes all steps to be executed.
Monads as Containers
The monad laws say that each monadic type needs to have a constructor or factory method which creates an “instance” of it, given some input value. This isn’t quite what it seems, but let’s assume the obvious interpretation for now. What this implies is that a monadic type is a container for a value of some type. And this suggests the possibility that each call to the type’s “bind” function doesn’t have to build a compound function aka pipeline, but can instead be executed immediately, producing a new container of the transformed type.
Depending on the purpose of the monadic type, simply executing functions immediately as they are passed to bind
can be a more obvious and efficient solution than building up a “pipeline”.
While a monadic type has just one generic type parameter, there is no constraint on the number of instances of that type it manages. In particular, collections such as lists can fulfil the constraints for being a monad - or looking at it the other way around, a monadic type can provide the expected behaviour of such a collection while also providing the benefits of the monad pattern. In recent years many object-oriented languages have extended the API of their existing collection types to provide monad-compatible methods (map
and bind/flatmap
).
In a pipeline-centric monadic type, the values that flow from the pipeline might be sourced from somewhere external, eg a file or the results of a DB query. Executing operations immediately (container) vs executing them on-demand (pipeline) might have quite different performance and memory-usage behaviour.
While a monadic type can be a container, not every container is a monadic type; it must also have associated operations which fulfil the monad laws.
Monads as Control Flow Managers
Wikipedia has this to say:
a monad is a structure that combines program fragments (functions) and wraps their return values in a type with additional computation. In addition to defining a wrapping monadic type, monads define two operators: one to wrap a value in the monad type, and another to compose together functions that output values of the monad type (these are known as monadic functions). General-purpose languages use monads to reduce boilerplate code needed for common operations (such as dealing with undefined values or fallible functions, or encapsulating bookkeeping code). Functional languages use monads to turn complicated sequences of functions into succinct pipelines that abstract away control flow, and side-effects.
This is equivalent to the above sections: a monad is a container of values and can compose functions together in ways that the monadic type controls. This allows separation of the “steps to be applied” (user of the monadic type) and “how to apply the steps” (the monadic type itself).
The monad pattern has similarities with the strategy design pattern often applied in object-oriented programming; some “controller” is passed an object which it calls back into at the appropriate time. The listener/observer pattern is also similar.
There is an analogy that is apparently common in the Haskell community (sorry, couldn’t find the original source):
[…] monads have been described as “programmable semicolons”; a semicolon is the operator used to chain together individual statements in many imperative programming languages, thus the expression implies that extra code will be executed between the statements in the pipeline.
This nicely captures the concept of monads. When code is just a sequence of statements, then a compiler will always do the same thing: execute those statements in order. By changing that code to a sequence of calls to a monadic type, that monadic type can apply all sorts of interesting “control flow” across those statements.
In Java, C, C++, etc. we are used to having a hard-wired set of control structures: if/then/else, for, while/do, try/catch etc. But these aren’t always sufficient. It would often be nice to be able to build a list of statements to execute, and then pass that list to some “executor module” that figures out which ones to execute, based on whatever rules it likes. The simplest rule would be: execute each one in order, and return the output of the last function. Another interesting rule would be: execute each one in order but if an exception is thrown then execute a special “handler” function and return its output.
This kind of thing is possible to do in object-oriented languages. It is even possible in C by using function-pointers. Executing functions in this way can be thought of as some variant of the “Strategy” pattern. But it is so inconvenient that it is rarely worth-while.
The monad design pattern has an API which makes implementing new flow-control patterns possible.
Things that custom control-flow can enable are:
- hidden state management
- optional execution of operations (apply a statement only under some conditions)
- building pipelines of operations to be executed later
- applying an operation to a collection of values rather than just one.
These are all things that can be done without monads, but monads provide a clean and consistent concept that unifies them.
And importantly, a function which defines a pipeline of operations can be passed an initial monad as a parameter, and then bind various operations to it. This allows the caller to produce quite different effects by passing different monads to the same function; the monad can:
- apply operations immediately to one value
- apply operations in an “on demand” manner to multiple values read from some external source (eg a database)
- apply operations to multiple values in parallel using multiple threads
- log operations before and after invoking them
- and many other possibilities
This is sometimes described as “being polymorphic over the monad type” - the function accepts any monad which is a subtype of some specified base type, with the actual implementation being exchangeable.
The initial monad has this kind of control over behaviour because the “bind” operation is a “map-and-join” operation where the result of the map step is not under the control of the original monad, but the result of the join step is. This does rely on the composed functions returning a monad object which is compatible with the original monad - eg the original might be “some kind of list” and the composed functions must also return “some kind of list”. However that’s still pretty flexible. The exact limitations are described later.
Monads in Code
For an object-oriented developer, one way to think about the monad pattern is as an interface which:
- is a generic interface with exactly one type-parameter T, ie is of form
SomeMonadicType<T>
- has at least one concrete implementation (variant)
- has a factory-method for creating an instance
- has a “map-and-join” method that takes a function as parameter (usually named “bind” or “flatmap” or “
>>=
”) - optionally has a “map” method (can be built on top of map-and-join, but often has a dedicated implementation for performance)
- has one or more additional methods to somehow extract useful data out of the monad (except in the case of Haskell’s IO monad!)
I have used the term “map-and-join” because I think this describes the operations better than the standard terminology. In particular, the word “bind” suggests building a pipeline when immediate application is often the actual implementation which is rather confusing.
As noted earlier, there are other ways to define the monad pattern which would lead to a slightly different “api” (eg one with two mandatory methods map
and join
). However the version using “bind” (map-and-join) is by far the most common.
In Kotlin-ish pseudo-code, a monadic type definition might look like:
interface SomeMonadicType<T1> {
fun<T2> bind(fn : (T1) -> SomeMonadicType<T2>) : SomeMonadicType<T2>
fun<T2> map(fn : (T1) -> T2) : SomeMonadicType<T2> // optional
....
}
// Factory method to construct some implementation of the above interface
fun<T> someMonadicTypeOf(T initialValue) : SomeMonadicType<T> {...}
The method-names “bind” and “map” can be called something more appropriate for any particular type; for example a type focused on building pipelines might call a method “andThen”. An interface might even offer multiple methods that follow the bind/map law. However this article will assume bind
and map
..
We’ll have a look at a more formal definition of the monad laws towards the end of the article, and see how they correspond to this more informal description. However this will do in order to understand some examples.
Method map
takes as parameter a function which accepts a value of type T1 and returns a value of type T2, ie is capable of transforming the value(s) that the monadic type provides. The map
method itself returns a value of the same monadic type as the value it was invoked on - but which can provide zero or more instances of type T2 instead.
Method bind
(map-and-join) is similar in principle to the map method, but more general-purpose - and in fact map
can be implemented in terms of bind
, ie is a special case of it. Here the function passed in also accepts a value of type T1, but instead of returning a T2 it returns a “wrapped” T2 value with the same monadic type as the value it was invoked on. The “join” part is where the magic discussed earlier is applied - the “peeking at hidden state” or “programmable semicolon” behaviour. Sounds complicated, but it’s not really; the examples below will hopefully make this clearer.
Map and bind are effectively “factory methods” which create a new monadic value from an existing one (though in some types and circumstances the returned instance is just the original one on which the method was invoked, ie the method is a no-op). And because they return a monadic value which also has map and bind methods, it is possible to elegantly chain calls.
The functions passed to map
and bind
are a kind of strategy or callback function; map and bind can then do whatever it wants with that function. The power of the monad pattern is that different monadic types do different kinds of things with that function - and this is hidden from the caller. A function can be given a monadic value, call map, and know what it will get back: another monadic value. What exactly has happened to the data, though, depends on the monadic type - and isn’t actually revealed until something extracts data from the monadic value.
The expression “can provide” has been carefully chosen here, but for many monads the word “wraps” could be used instead, and I’ll use it below when it feels appropriate.
We’ll look at several different kinds of monads below to get a feel for what kinds of things a monad can do. We’ll look at:
- lists
- optional values
- error collectors
- streams/generators and other forms of delayed execution
- state-encapsulating monads
I’ll describe these in pseudo-code-like manner, as they are really themselves patterns that can be implemented in any language that supports basic functional features.
As these will show, this simple pattern of “give me a function and I’ll call you” allows the decisions about what to do to the data to be separated from concerns like when to do it, under what conditions to do it, and how to get the input value - or in other words allows new flow control patterns to be implemented. This brings benefits like cleaning up code (centralizing those decisions into the monad rather than the calling code) and hiding state.
The magic is of course in:
- what private members the implementations have (their internal hidden state)
- how the concrete types implement map and bind
- what other methods the interface has for extracting data from the monadic type…
These benefits get stronger the more calls to map and bind occur in sequence; the more calls that are made, the greater the benefits of having the associated “flow control” code centralized and deduplicated.
Note also that although object-oriented terms like “class” and “method” are being used here, many functional languages have no classes and the same pattern is implemented using algebraic data types and functions instead.
Some Concrete Examples
Now that some general background has been covered, let’s look at some specific monadic types that can be useful in programming.
Many of these examples could be implemented in an object-oriented language without the monad pattern. However using this pattern provides great abstractions; it allows methods to focus on the steps to be performed separately from the mechanism which applies them (which is encoded in the monadic type).
The List Pattern
We all know the List type. But as noted above, a monad can act as a container, ie a monad can act as a list - or expressed otherwise, a list can can fulfil the requirements for a monad. All we require is:
- a way of building an initial list (a constructor)
- (optionally) a map method which takes a single value from the list and returns a new single value (possibly of a different type)
- a bind method which takes a single value from the list and returns a list of values (possibly of a different type)
Method map iterates over each element, transforming it to a new value and the results are collected into a new list which is always the same size as the original list - but might be “of” a different type. The simple signature of the passed function, (T1)->T2
, doesn’t allow this method to do the state propagation that we talked about earlier. This signature also means that map
is also unable to change the shape of its container; the bind
method is needed for that.
Method bind requires a slightly more complicated callback function - one that converts each input value to a list of values. However the implementation of List.bind
concatenates these lists together, resulting in a single list which has zero or more elements for each input element. To filter out a value from the input list, the callback function returns an empty list, while obviously one element can also be turned into multiple. To produce the same 1:1 behaviour as map
, the callback function needs to transform each input element to a list of size 1 wrapping the transformed value; this makes it clear why a dedicated implementation of map can be more efficient! Due to this “flattening” step from List<List<T>>
to List<T>
the bind
operation is often called flatmap
.
Neither map
nor bind
can change the monadic type, eg change a List to a Set. If that is needed, it has to happen as additional non-monadic methods on the type.
It was mentioned earlier that a monadic type can choose to have its bind
and map
methods invoke the provided function immediately, or save it for later. If the “immediate” approach is applied, then the bind method simply iterates over the wrapped list, applying the provided function to each element in turn. If the “delayed” approach is applied, then bind creates a new function-object which iterates over a list, calling the original function, and flattening the results - and then returns a new List value which wraps this new function.
The use of monads to provide “control flow” was discussed earlier; the way that a caller provides a single function to a List monad and that function is applied repeatedly to each element (iteration) can be considered an application of this.
Here’s a possible (though ugly) implementation in Kotlin, using the “immediate function application” approach. Note that a few type-casts (someval as SomeType
) are used in this code to work around some Kotlin limitations related to the JVM’s “type erasure” - particularly with arrays of generic types. For-loops are also used rather than things like fold
because, well, this is trying to demonstrate monadic concepts to developers who are perhaps not familiar with functional programming.
class MyList<T1>(val data: Array<T1>) {
// An optimised map function
fun <T2> map(fn: (T1) -> T2): MyList<T2> {
val results = arrayOfNulls<Any>(data.size) as Array<T2>
for(i in 0 .. data.size - 1) {
results[i] = fn(data[i])
}
return MyList(results)
}
// A monadic bind method which concatenates all the results together..ie "flatmap"
fun <T2> bind(fn: (T1) -> MyList<T2>): MyList<T2> {
// Apply fn to each original element and store the results as a collection of MyList<T2>
val results = arrayOfNulls<MyList<Any>>(data.size) as Array<MyList<T2>>
for(i in 0 .. data.size - 1) {
results[i] = fn(data[i])
}
// Flatten an array of MyList<T2> to a single MyList<T2> holding the combined set of values.
// Creating an intermediate MyList below isn't necessary, ie results could have been passed directly, but
// the intermediate form corresponds better to the official specification of bind/flatten
return flatten(MyList(results))
}
// Here, flatten means to produce a new MyList whose data is the concatenation of all data in the nested MyLists
private fun<T> flatten(src: MyList<MyList<T>>) : MyList<T> {
var result = arrayOf<Any>() as Array<T>
for(item in src.data) {
result = result + item.data // append each item's nested array<T> to result
}
return MyList(result)
}
// Alternative (but less efficient) implementation of map on top of bind.
fun<T2> map2(fn : (T1) -> T2) : MyList<T2> = bind { item -> MyList(singletonArrayOf(fn(item))) }
// Ugly workaround for Kotlin issues with arrays of generic types
fun<T> singletonArrayOf(element: T) : Array<T> {
val result = arrayOfNulls<Any>(1) as Array<T>
result[0] = element
return result
}
fun foreach(fn: (T1) -> Unit): MyList<T1> {
data.forEach(fn)
return this
}
}
fun main(args: Array<String>) {
println("Hello World!")
// demonstrate transformation without changing list size
println("== Multiply elements of list by 2 using map")
MyList(arrayOf(1, 2, 3))
.map { it -> it * 2}
.foreach(::println) // { it -> println(it) }
// demonstrate transformation which increases list size
println("== Duplicate elements using bind")
MyList(arrayOf(1, 2, 3))
.bind { it -> MyList(arrayOf(it, it)) }
.foreach(::println) // { it -> println(it) }
// demonstrate transformation which decreases list size
println("== Filter odd elements using bind")
MyList(arrayOf(1, 2, 3))
.bind { it -> if (it % 2 == 0) MyList(arrayOf()) else MyList(arrayOf(it)) }
.foreach(::println) // { it -> println(it) }
}
Note that map2
is implemented by changing the original function into one which maps each item into a single-element list. Method bind
(aka flatmap
) applies this function then merges each of these single-element lists together, producing again a simple list with the same number of elements as the original.
For any “container” monad, the order in which map
and flatmap
apply the provided transform to the contained values is type-specific. For an ordered collection such as List it is expected that the type must produce a result in the same order as the original, ie the result of applying the function to element 0 of the input list must come before the result of applying the function to element 1 of the input list. However the computation on each element is independent; therefore the computation on each element of the list can potentially be executed in parallel. Any such behaviour would be part of the monadic type’s implementation and nicely encapsulated from the caller.
The power of having type List
define monad-style map
and bind
methods is that a function can accept a list and use those methods to define transformation on the values - but exactly how and when those transformations are applied will depend upon how the concrete type of the List value implements those methods. Imperative-style code which sees a list as a “dumb container for data” and simply uses a for-loop to get the elements from the list and operate on them is far less flexible. The monad-style code (map/bind instead of foreach) is also more elegant as the looping logic is centralized (in the map/bind methods) rather than duplicated in each place that processes the values.
The Optional Value Pattern
Many object-oriented languages already have a type in their standard libraries that is a very simple monad: it’s typically called Optional
or Maybe
. This effectively wraps zero or one instances of the generic type parameter.
This pattern takes advantage of the “flow control” part of the monad design pattern.
An optional/maybe monadic type has two distinct states: wrapping a value or not wrapping anything. The overall pattern can be implemented using a separate variant of the type for each state, or a single variant with an internal flag of some sort to indicate which state it represents; the multiple variant approach is more common but Java’s Optional is (currently) implemented as a single class.
For the multi-variant approach, these variants might be called Some<T>/None
or Just<T>/Nothing
or similar. For the single-class approach, there are typically factory-methods for the different states (eg Optional.of(x)
and Optional.empty()
).
Invoking the map
function when the state is “value present” invokes the provided function passing the wrapped value, then returns a new instance which wraps the result of the function.
Invoking the map
function when the state is “value missing” ignores the provided function - ie the call is a no-op. It can’t actually do anything else, as the provided function expects an input value but there isn’t one. In effect this state is “frozen”; if this state is ever reached then all later map operations have no effect. It’s a one-way trap door in the optional value processing flow.
I’m sure all readers are familiar with this; it leads to code like this (ignoring Kotlin’s built-in nullable types):
val case1 = Optional.of(2)
.map { i -> i + 2 }
.map { i -> i * 3 }
.map { i -> i.toString() + "!" }
.orElse("None")
println(case1) // shows "12!"
val case2 = Optional.empty<Int>()
.map { i -> i + 2 }
.map { i -> i * 3 }
.map { i -> i.toString() + "!" }
.orElse("Empty") // shows "Empty"
println(case2)
The benefits are hopefully clear; the code can apply a sequence of transformations without needing to care whether data is missing or present. Eventually of course the result will be needed, but the check can be done once at the end rather than before each step. Or rather: the check before each step has been pushed down into the Optional monadic type where it can be defined just once (reuseably) rather than repeatedly in the caller (see “programmable semicolon”). This ability to push control logic down into monads is one of their cool features, driven by this “I’ll call you” approach.
However there is a problem here: what if we want to apply a transformation which might return Empty rather than a value?
The map function cannot handle that; it always takes exactly one value in and returns exactly one value - but Empty is not a value. Now in this case the map-method could be defined to look for a null result and return Empty, but that wouldn’t work in other cases - or in languages which don’t support nulls.
The bind method can, however save the day here. The signature of the function passed to bind allows it to return any result which the enclosing/applying monadic type can support; in particular it allows the function to “switch the monad state” from Present to Empty. For this type, the “combine values” feature of bind is basically a no-op but the fact that a monad is returned rather than a value is important.
Neither bind
nor map
can change the monadic type, eg change an Optional into a List. Bind can change the variant of its monadic type (eg value-present to value-missing), while map cannot do even that. Both methods can change the type of the wrapped value, ie <T1>
to <T2>
. This is true for all monadic types.
Using bind may look something like this (Java’s Optional type uses method-name flatMap
instead of bind
):
val case1 = Optional.of(1)
.map { i -> i + 1 }
.flatMap { i -> if ( i%2 == 0) Optional.of(i/2) else Optional.empty() }
.map { i -> i.toString() + "!" }
.orElse("None")
println(case1) // shows "1!"
val case2 = Optional.of(2)
.map { i -> i + 1 }
.flatMap { i -> if ( i%2 == 0) Optional.of(i/2) else Optional.empty() }
.map { i -> i.toString() + "!" }
.orElse("None")
println(case2) // shows "None"
Note that the function passed to flatmap is more complex: it needs to return Optional<T>
rather than just T - but in exchange gets the power to switch from “value present” to “value missing”.
As with any monadic type, the map/bind methods can be immediate or can build an internal pipeline and execute the functions only when data is eventually “pulled” from the monadic value.
An immediate-mode implementation might look something like this:
interface MyOptional<T1> {
fun<T2> map(fn: (T1)->T2) : MyOptional<T2>
fun<T2> bind(fn: (T1)->MyOptional<T2>) : MyOptional<T2>
fun orElse(onEmpty: T1) : T1
companion object {
private val EMPTY = MyOptionalEmpty<Any>()
fun<T> empty() = EMPTY as MyOptional<T>
fun<T> of(value : T) = MyOptionalOf(value)
}
}
class MyOptionalOf<T1>(val value : T1) : MyOptional<T1> {
override fun <T2> map(fn: (T1) -> T2): MyOptional<T2> = MyOptionalOf(fn(value))
override fun <T2> bind(fn: (T1) -> MyOptional<T2>): MyOptional<T2> = fn(value)
override fun orElse(onEmpty: T1) : T1 = value
// A map implementation built on top of bind - though rather pointless as
// MyOptionalOf.bind is a no-op ie just passes the result through (no state to manage).
// This code is therefore nearly identical to "map".
fun<T2> map2(fn: (T1) -> T2) : MyOptional<T2> = bind { v -> MyOptional.of(fn(v)) }
}
class MyOptionalEmpty<T1>() : MyOptional<T1> {
override fun <T2> map(fn: (T1) -> T2): MyOptional<T2> = this as MyOptionalEmpty<T2>
override fun <T2> bind(fn: (T1) -> MyOptional<T2>): MyOptional<T2> = this as MyOptionalEmpty<T2>
override fun orElse(onEmpty: T1) : T1 = onEmpty
}
fun main(args: Array<String>) {
val case1 = MyOptional.of(1)
.map { i -> i + 1 }
.bind { i -> if ( i%2 == 0) MyOptional.of(i/2) else MyOptional.empty() }
.map { i -> i.toString() + "!" }
.orElse("None")
println(case1) // shows "1!"
val case2 = MyOptional.of(2)
.map { i -> i + 1 }
.bind { i -> if ( i%2 == 0) MyOptional.of(i/2) else MyOptional.empty() }
.map { i -> i.toString() + "!" }
.orElse("None")
println(case2) // shows "None"
For those not familiar with Kotlin: the “companion object” bit is equivalent to a Java static method and somevalue as SomeType
is a type-cast. And yes, this would normally be implemented in Kotlin using a sealed class.
This approach uses an interface and multiple implementations - ie is effectively a functional sum type. Java’s implementation of Optional instead uses a single class (similar to MyOptionalOf) and uses null for the value to indicate that “this object is in state empty”. I find this “sum type” approach more elegant.
It’s quite cool how the “empty” variant of this pattern is effectively “frozen”; all transformation operations are no-ops and just return “this” (which in this case is the singleton EMPTY instance).
What’s cooler is again this “programmable semicolon” behaviour; the code using the monad just makes a sequence of calls map and bind, and the monad gets the opportunity to “insert code” between these operations - in this case to use the “current state” (data present or missing) to treat the provided function appropriately (invoke it or ignore it). Adding all those checks manually would look rather ugly:
var case1a : Int? = 1
var case1b = if (case1a != null) case1a + 1 else null
var case1c = if (case1b != null) case1b/2 else null
var case1d = if (case1c != null) case1c.toString() + "!" else null
var case1e = if (case1d != null) case1d else "None"
And if “null” cannot be used as a “missing value” indicator, then this gets even uglier. Delegating this control-flow-decision-making to another object is well worth the cost of having to wrap the operations being applied in lambdas passed as parameters.
This whole pattern’s idea of “postpone missing data checks until the end” has a little bit in common with the concept of (fine-grained) exception-handling, where the original idea was that the “happy path” in a function could be embedded in a try-clause and all the error-handling pushed to the catch-clause. Unlike exception-handling, this optional approach is based on a general-purpose pattern.
You might also want to look at the section discussing the Maybe type in the wikipedia page on monads in functional programming.
The Error Collector Pattern
This is very similar to the Optional/Maybe pattern, except that instead of representing two states valid-data and missing-data the monad represents two states has-error and has-result.
When the optional-value pattern reaches variant None (missing data) then it effectively stops changing state and ignores all later invocations of map/bind. The error-collector pattern does the same; as long as there is no error, provided functions are executed against the “current value” and a new monad is created to wrap the new state but once an error object has been obtained, all further map/bind operations are ignored and the existing monadic value is returned unchanged.
Like the optional-value pattern, this allows a sequence of transformations to be applied, with any checks for error state postponed until the very end of the sequence of operations. At that point, the “first error encountered” is then available from the current version of the monad.
And as with the optional-value pattern, this “postpone error checks until the end” has a little bit in common with Java’s concept of (fine-grained) exception-handling, where the original idea was that the “happy path” in a function could be embedded in a try-clause and all the error-handling pushed to the catch-clause.
This monad pattern is often called Either
, with it having two variants Left
and Right
; the Left
variant is the “frozen” one (ignores map operations and just returns itself) and the Right
variant is the “evolving” one (applies the function to its wrapped value and returns a new instance of type Either). Yes, these names (Either, Left, Right) are very poor … a recurring problem with functional programming concepts. Note that Either is mostly limited to error-handling; this “left is frozen” behaviour means it doesn’t have much use in other contexts.
As with the optional-value case, method map
cannot switch monad variants; only bind
can - ie invoking a function which always succeeds should be done via map
and invoking one which might fail should be done via bind
. And again the “join” part of bind (state-sharing) isn’t really relevant here as in this pattern each monad variant is only wrapping one value.
It was mentioned earlier that a monad has only one generic type parameter. This pattern seems to have two: one for the “error type” and one for the “result type”. However the error type never changes; the function passed to bind always takes only the result type as an input parameter and can return a result of a different type but the error type is always the same. The error value is really a “side channel” or “hidden state”.
Note that the pattern of “data validation” is quite different. In that case, multiple independent checks are applied to the same data, and often all detected errors should be gathered in one pass. The error-collection pattern is different: a sequence of transforms are being applied to data, and once the first error occurs then processing must stop as there is no input to pass to the next transform.
No code example is needed here; an “immediate mode” implementation of this pattern is just a trivial extension of the code for the Optional pattern above.
Delayed Execution
As noted, a monad’s bind method can choose to execute the provided function immediately, or can just add it to a list, ie build a pipeline of things to execute at some future time.
A pipeline of transforms can be used to support “pull-based” processing. Imagine some monadic type M<T>
which gets its values for T from a file, or a database query, or even a large collection. The “execute immediately” approach will cause the first call to bind to load the entire set of source values, apply the function to them, and then store those results in memory waiting for the next call to bind, or a call to obtain the results. It may be better for the type to instead offer a method to get just one value - and then to read just one value from the upstream source, pass it through the pipeline, and return it - ie process data “on demand”.
A pipeline can also be used to implement asynchronous processing. A monad can be built (set of commands) and then passed to a threadpool for execution. The monad schedules each step of the pipeline for execution in turn, ie schedules each pipeline step only after the previous one has completed.
Apache Spark is a distributed data processing system whose client libraries for Scala use this pattern: the Spark classes provide APIs which accept functions as parameters, and internally build up a pipeline of transforms to be applied to data. This pipeline is then executed in parallel on multiple servers against different partitions of the dataset.
Haskell uses monads extensively, and in fact a Haskell program can be seen as code that first builds a single top-level monadic value which is a (delayed) pipeline which contains nested pipelines which contains nested pipelines, etc. These pipelines are the whole program - and then a single “go” command triggers this pipeline, performing execution of the application.
A Quick Look at the Monad Laws
So far, all discussion has been rather informal. Now that the purpose of monads has been described, let’s look at the official laws for monads and see how they correspond to the informal description.
There are three different (but equivalent) ways of defining a monad:
- return/bind (with map being optionally built on top)
- return/map/join (with bind being optionally built on top) - also called the (T, η, μ) triple.
- return/kleisli (with bind and map optionally built on top)
The first option is the one most often found in functional programming languages - and the one most found in tutorials and discussions of the monad pattern. And sadly, monad laws written in this form are actually the hardest of the three to make sense of. This section of the article is only going to look at the laws in general terms though (mostly because I don’t deeply understand it myself) so it makes sense to look at the most common form.
Here are the laws (in return/bind formulation) in Haskell syntax (and note that in Haskell, the word “class” is not the same as in an object-oriented language; it’s closer to declaring a namespace for a set of function prototypes):
Left identity: (return a) >>= h ≡ h a
Right identity: m >>= return ≡ m
Associativity: (m >>= g) >>= h ≡ m >>= (\x -> g x >>= h)
The word “return” above means a factory method for creating an instance of the monadic type. And yes this name is confusing for programmers. To add to the confusion, this operation is sometimes called “unit” or “pure”.
Lets rewrite this in pseudo-kotlin:
given:
interface M<A> {
fun bind(fn : (A) -> M<X>) : M<X>
companion object {
fun<A> return_(a : A) : M<A> { .. }
}
}
left identity: M.return_(a).bind(h) == h(a)
right identity: m.bind(M::return_) == m
Associativity: m.bind(g).bind(h) == m.bind { a -> g(a).bind(h) }
The “left identity” law just says that the factory function (“return”) for the type produces a monadic value which effectively encodes the value and an “identity function”, ie one that has a value and a function which does not modify that value (passes it through). Binding another function to it therefore produces a monadic value which is equivalent to passing the original value through the new function alone. The factory function cannot embed any “hidden state” into the created object - eg the current time-stamp or the current state of a variable in the environment.
The “right identity” law says that starting with a monad in any state, asking it to invoke the “factory function” produces a new value that encodes exactly the same value and hidden state as the value it was invoked on.
As far as I can see, these laws exist to make monads “repeatable”. I think both of these laws actually come pretty naturally, and a programmer would have to try pretty hard to create a monadic type that did not follow these laws.
The associativity law says that given an existing monad, and two transformations we want to append to it (to form a pipeline), it shouldn’t matter whether we:
- create a monad then apply G then H, or
- create a monad then pass in a function which is a composition of G and H
While I don’t fully understand the consequences of associativity law, I suspect that it (and the identity laws) ensure that doing “obvious refactorings” of code which look like they should be equivalent don’t introduce unexpected behaviour changes.
When a language has a way of knowing which functions are “pure” (without side-effects), then a compiler for that language can perform some optimisations such as memoization (when the function is called multiple times with the same value) or simply removing it when the function’s return-value is never used. When a type that follows the monad laws is invoked only with pure functions, then there is an additional set of optimisations that can be applied:
The importance of referential transparency is that it allows the programmer and the compiler to reason about program behavior as a rewrite system. This can help in proving correctness, simplifying an algorithm, assisting in modifying code without breaking it, or optimizing code by means of memoization, common subexpression elimination, lazy evaluation, or parallelization.
However these benefits do not apply to languages without “guaranteed pure functions” - which is the vast majority of both object-oriented and functional languages.
Conclusion
Hopefully now it is clear that “bind” enables several behaviours:
- it is there to “flatten” nested values of the same monadic type - and this is how state is propagated.
- it allows the “variant” of the monadic type to change (which is also about changing from one hidden state to another)
- it composes functions, allowing pipelines to be built
- it allows “control flow” to be inserted between steps in a pipeline (including one where the steps are executed immediately)
This works in cooperation with the functions passed to bind (as they return appropriate instances of the same monadic type).
Hopefully it is clear why method-name flatmap
is sometimes used instead of bind: it maps then flattens the results. In fact, an alternative way to express the monad pattern is to put this “flattening” at the center, by providing two functions: a map
operation as above, and a separate join
operation which takes no parameters and does purely the “flattening” part, ie converts a type M<M<T>>
to an M<T>
. Method bind is then simply join(map(f))
, ie is built on top of simpler map and join methods. This approach is in some ways easier to understand, but requires two mandatory methods (map and join) with bind derived from them, rather than one mandatory method (bind) with map derived from that (and join being unnecessary).
While function-composition can be at the core of any monadic pattern, for many use-cases it is valid for bind to do “immediate execution” of the functions provided to it. For such implementations, the results are actually equivalent - except for memory, performance, and code complexity. Patterns that truly rely on pipelines (eg asynchronous processing) can’t do immediate execution; function composition (or equivalently, building up a list of callbacks) is truly needed there.
Object-oriented languages have adopted many patterns inspired by this - eg Java’s “stream” concept which provides a syntax which can build a “list of callbacks” or can execute each operation immediately. This in turn allows the same syntax to process data that is already in memory or being loaded on-demand, with the “flow control strategy” being decided by the stream rather than the code calling it.
Monads don’t necessarily rely on pure functions. The monadic functions themselves (bind, map, etc) are expected to be pure - they always return a new value of the same monadic type. However the functions passed as parameters to these don’t need to be pure; they can have side-effects if needed. The Haskell IO monad in fact relies on this.
And by the way, a different variant of “bind” is sometimes defined (written >>
in haskell) which simply takes M<T2>
rather than (T1)->M<T2>
. This is equivalent to invoking the standard bind
passing a function (ignored)->M<T2>
. This form is used when composing a pipeline and an operation does not need the result of the previous operation - usually because the previous operation has no return value, ie is a side-effect-causing statement rather than a function. Don’t be fooled by the fact that the function used as a parameter returns the same type as the the bind function does; bind doesn’t necessarily just “pass through” the value that the bound function returns.
Summary
The monad pattern is relatively simple but allows a rich set of useful behaviours to be built on top of it.
It doesn’t provide anything which cannot be done in an object-oriented language using normal classes (to hold hidden state) together with the “strategy” pattern (callbacks). However its elegance and simplicity can make it an appealing alternative to creating different solutions for each of these problems.
And of course non-object-oriented languages need some mechanism for handling state, defining pipelines, etc. The monad pattern provides this.
Appendix 1: The State Pattern
As an OO programmer, we are used to having objects whose primary purpose is to accumulate some result over multiple steps then return it, but where some additional state needs to be tracked in addition to the “primary” value and which may affect how the primary value is computed.
It is possible to model this in pure functional programming, simply by having every function that modifies the state return an updated copy of that state, and pass the updated value into the next function. However this can get rather ugly.
In fact, this is exactly what the state monad pattern does. However it hides the state data-structure from most of the code, making it a field of the monad itself so the individual expressions can mostly ignore it. When the expressions do want to access the state, they can call the “get” and “put” functions. The get function is obviously still functional. The “put” function is also functional; it creates a new copy of the state with the specified change in it. And the “return” function creates a new monad which has a copy of the current enclosing monad’s state, so that when the “returned” monad is later invoked, it can continue where it last left off.
The bind operator can be thought of as “propagating state” at the same time as executing the new expression against the “primary contained value”.
The result is that there is just one thing for the calling code to handle: the fact that the “stateful” object is now of type “State a” rather than just “a”. Otherwise, the details of the state are now irrelevant to the caller.
Note that although this datatype is called “State”, it is still fully immutable, ie does not involve modifying data “in place” at any time. It simply automates the process of creating a slightly-altered copy of the earlier state when needed, and setting it as the “default state” variable used for all subsequent function calls.
Parser implementations for various datafiles (whether languages, config files, etc) are implemented like the State monad, so that the caller doesn’t need to care about the internal state variables.
This pattern is described here in an appendix rather than in the patterns section earlier as it probably isn’t particularly useful in hybrid object-functional languages; object-oriented programming has its own mechanism for handling objects with state.
Appendix 2: Haskell’s IO Monad Datatype
Just for the curious, this section looks briefly at the famous Haskell IO monad.
It was noted earlier that monads provide a method for transforming data (bind and maybe map), but aren’t required to provide any method for actually extracting the results afterwards. Of course most of them do, because the functions passed to bind/map are usually “pure functions” and there isn’t any point in executing a pure function and ignoring its result!
However it is possible to pass functions with side-effects into a monad if desired.
Haskell leverages this to ensure the difference between “pure functions” and ones with side-effects are represented in its type system and therefore that any code which tries to mix them together inappropriately will simple fail to compile. Some other languages prevent such mixing with keywords or annotations to mark specific functions as pure, but Haskell does this just with the same compile-time-type-checking that validates other rules such as Int and String not being compatible.
Haskell defines an IO monad type in its standard library; this type has map/bind methods as usual - but has no methods for extracting data from the monad afterwards; therefore the only useful way to use a value of this type is by passing a function that has side-effects. And every function in the Haskell standard library which is non-pure (has a side-effect), ie interacts with the external world, either takes an IO monad as a parameter, or produces an IO monad as its return-type. The result is that this IO type effectively “taints” everything it touches. Pure code can call pure code, and impure code can call pure code, but any code that tries to call impure code itself becomes impure.
A haskell program therefore always ends up structured as an IO monad at the top level which is a composition of pure and impure functions. The impure parts can call pure parts, but once a pure operation is invoke, you can be guaranteed that no side-effects will happen within it.
Note that the name “IO” for this monad is perhaps a little misleading; it is simply a monad class that executes its nested statements in order, and keeps the results from escaping. However input and output are always impure (non-repeatable interactions with the outside world) and the standard-library functions which do this all rely on IO types (as do other impure standard library functions).
The form >>
is used when a statement does not need the result of the previous statement - usually because the previous statement returns unit (void), eg a function to print to the screen.
Using the binding operator multiple times is rather clumsy; therefore there is a convenient syntactic sugar called a “do expression” which has exactly the same effect. A “do expression” returns a value, like any haskell function. The value it returns must be a Monad (proxy), eg “IO String” (meaning IO<String>
) or similar, and not a plain (non-monadic) value. If the last statement in the block doesn’t already produce a monad then one must be created - and as noted in the section titled “Monad Laws”, “factory methods” for monadic types (aka constructors) are typically named “return”. Yes, this is confusing - this has nothing to do with the return
keyword common in other languages. Which of the many overloaded functions called return
is used depends upon the parameters and the return type expected from the do-expression (Haskell can use type-deduction based on return types). For example:
-- normal function returning a string
getstr :: String
getstr = "hello"
-- normal function returning a value of type IO (which is a monad) wrapping a string.
getstrio :: IO String
getstrio = do return "hello"
References
- Wikipedia: Monad (functional programming)
- [video] An Intuitive Introduction to Monads in Under 10 Minutes
- [video] Computerphile: What is a Monad?
- [video] Studying With Alex: Intro to Monads for Software Engineers
- Typed Logic: Monads in Java
- Tomasz Nurkiewicz: Functional Programming in Pure Java: Functor and Monad Examples - Reactive programming libraries are often based on monadic principles, and this covers enough to understand the RxJava library.
- [video] Oliver Lugg: A Sensible Introduction to Category Theory - a quick look at the general concept of category theory helps to explain a lot of the terminology related to monads
- Milewski: Monoids on Steroids - the full truth about monoids and monads; wonderfully written but still brain-scrambling.
- Dmitri Zaitsev: To Functor or not to Functor