Writing State Monads in Ocaml
One of my projects for quite a while is to understand Monads. I have been working on this, off and on, for an embarrassing amount of time. Yes, I could have just learned Haskell and that would have done it but I wanted to be a pain in the arse. Also, I had a horrible experience attempting to learn Haskell a long time ago. Thus, I decided as I use Ocaml for most things then I should be able to do it. First, I read this article a few times and this which is really good and helps hugely in my understanding. Anyway, I will take a step by step approach to showing how to write a State Monad for Ocaml.
So, first thing, what is a Monad? If you remember your undergraduate philosophy classes a Monad is an entity that is basically impenetrable but can communicate. Thus, you don’t know what’s in the monad but you can communicate this. With this in mind, a Monad in code is:
module type MONAD = sig type 'a t val bind : 'a t -> ('a -> 'b t) -> 'b t val return : 'a -> 'a t end
You will notice here that I am declaring a module type and not a module. Basically, I want to build monads out of things. Thus, Monad in OO terms is an abstract base class. So, if we want to have a “kind of” Monad, we have to build it. In any case, the type
'a t is the type system wrapper around a polymorphic value. Basically, the type system wraps the value in a type so that it can be manipulated at the type level rather than at the value level.
return is kind of an odd way to put it but it “return”s a Monad out of a value of some kind (as has been pointed out by pozorvlak, mote is a better term for values of that type and is a common mistake in Monad tutorials); we don’t care what kind (or type) the value is.
bind is a way for the user to bind a function to a Monad. What you will get is the Monad feeding the value into the function and expecting a new Monad from the function then it will automatically return that Monad.
So, you can create different kinds of Monads to do different things and one of the side-effects is that side-effects are encapsulated within the Monad itself. This means that you can do things which are not “pure” functional programming within a purely functional system (this isn’t strictly true either). As you can see with
bind, you will always return a Monad. However, I can hear you saying, “but how do I get values out of a Monad?” Don’t worry you can do that by having a function like this:
val access : 'a t -> 'a
This function is some times called “run” as well. What it does is it asks the Monad for its computation (aka “a thunk”) as the Monad writer can put off actually doing the computation until this stage in the processing. So, with that knowledge, let’s attempt the next stage:
module type STATE = sig type t val empty : t end
Again, this is an abstract base class for “State” and it allows you to define your own state using a functor (I should write another thing on functors). Basically, I don’t want to constrain the kinds of StateMonads and just let my users define what kind of state they need. So, bringing this together with the MONAD declaration above:
module type STATE_MONAD = functor(State : STATE) -> sig include MONAD val access : 'a t -> 'a val put : State.t -> unit t val get : State.t t end
Ok, again another declaration with no real code, please be patient all will be revealed. This has access and put and get functions. The access function, actually does the computation of functions that are bound within the Monad. The put and get is something I am still trying to get my head around but it basically allows you to use the StateMonad as a variable store. You can store things in it like a variable and retrieve the information and return it.
So, now on to some Real Code:
module StateMonad : STATE_MONAD = functor(State : STATE) -> struct type state = State.t type 'a t = state -> ('a * state) let bind m f = fun s -> match m s with | (x, s') -> f x s' let return a = fun s -> (a, s) let access m = match m State.empty with | (x, s) -> x let put s = fun _ -> ((), s) let get = fun s -> (s, s) end
So, lets break this down. The functor allows the user to set what kind of state they want. I will demonstrate this below. The next bit is
type 'a t = state -> ('a * state). This defines the actual type that is wrapped inside the StateMonad. Basicaly it said “This Monad is a function which takes one state and translates that state into a value and the next state”. Thus, when you put a state in the StateMonad via
return, it returns just what it says on the tin a function which takes a state and return a tuple of a value and the state. The next function
bind binds a function to the Monad and the monad calls the last function it created (in this case the “m” in the code) and matches the output of “m” with the expected tuple which is then passed to the next function in the chain. The
access function takes an empty state and that fires the function chain to come to a final state and value then the final value is returned.
get functions. I am not sure I actually understand these as well as I should so I will tell you now that if what I say is wrong, please let me know. So basically
put creates a degenerate return function which puts the value in the same kind of wrapper as a function. The
get passes back a function which fires immediately and gives you back it’s value. I am completely unsure of how exactly this works but it seems to work in practice. If someone could enlighten me, that would be wonderful.
So, now I will present a trivial example: IntStateMonad.
module IntStateMonad = StateMonad( struct type t = int let empty = 0 end )
This tells the StateMonad what kind of state the user wants. This basically fills in the functor State from above. I am still slightly unsure if the empty value is really necessary or if there is some other way I can get the underlying function to fire without it. So here is an example:
let _ = let blah = IntStateMonad.return 1 in let blah2 = IntStateMonad.bind blah (fun i -> IntStateMonad.return (succ i)) in print_endline (string_of_int (IntStateMonad.access blah2))
So, can we do something more Haskell like in this instance? Yes, with the “perform” notation that is provided by pamonadcustom, which uses camp4 to parse ocaml and do syntax transformations. So we can do this like so:
let return = IntStateMonad.return let _ = let blah = perform with IntStateMonad.bind in a <-- return 1; b <-- return (succ a); return b in print_endline (string_of_int (IntStateMonad.access blah))
So you have to bind the return and bind functions to something that is accessible by the pa_monad syntax. Otherwise, it takes care of interleaving the binds then it returns the resulting Monad which you can run (or access…whatever you prefer to call it).
put, you can use it in this way:
let _ = let blah = perform with IntStateMonad.bind in a <-- return 1; b <-- IntStateMonad.put 5; c <-- return (succ a); d <-- IntStateMonad.get; return (c + d) in print_endline (string_of_int (IntStateMonad.access blah))
So, there you have it. This is a Monad which keeps track of the state for you and will return the evaluated functions in a type wrapper. This will also get and store state like a variable wrapped inside the bound function calls. All side-effects are explicitly handled inside the Monad and Leibniz gets to smile. Who said philosophy was useless?
UPDATE Added information on motes and clarified some stuff on side-effects in a purely functional programming language.