Using return vs. not using return in the list monad
I started my Grand Haskell Crusade (GHC :) ) and I am a bit confused with monads and IO functions. Could anyone explain simply what is the difference between those two functions?
f1 = do x <- [1,2]
[x, x+1] -- this is monad, right?
f2 = do x <- [1,2]
return [x, x+1]
The results are:
*Main> f1
[1,2,2,3]
*Main> f2
[[1,2],[2,3]]
Solution 1:
To see why you get the particular answers that arise, the desugaring explanations are very helpful. Let me supplement them with a little general advice about developing perceptions of Haskell code.
Haskell's type system makes no distinction between two separable "moral" purposes:
-
[x]
the type of values which are lists with elements drawn fromx
-
[x]
the type of computations of elements ofx
which allow prioritized choice
The fact that these two notions have the same representation does not mean that they play the same roles. In f1
, the [x, x+1]
is playing the role of computation, so the possibilities it generates are merged into the choice generated by the whole computation: that's what the >>=
of the list monad does. In f2
, however, the [x, x+1]
is playing the role of value, so that the whole computation generates a prioritized choice between two values (which happen to be list values).
Haskell does not use types to make this distinction [and you may have guessed by now that I think it should, but that's another story]. Instead, it uses syntax. So you need to train your head to perceive the value and computation roles when you read code. The do
notation is a special syntax for constructing computations. What goes inside the do
is built from the following template kit:
The three blue pieces make do
-computations. I've marked the computation holes in blue and the value holes in red. This is not meant to be a complete syntax, just a guide to how to perceive pieces of code in your mind.
Indeed, you may write any old expression in the blue places provided it has a suitably monadic type, and the computation so generated will be merged into the overall computation using >>=
as needed. In your f1
example, your list is in a blue place and treated as prioritized choice.
Similarly, you may write expressions in red places which may very well have monadic types (like lists in this case), but they will be treated as values all the same. That's what happens in f2
: as it were, the result's outer brackets are blue, but the inner brackets are red.
Train your brain to make the value/computation separation when you read code, so that you know instinctively which parts of the text are doing which job. Once you've reprogrammed your head, the distinction between f1
and f2
will seem completely normal!
Solution 2:
The other answers here are correct, but I wonder if they're not quite what you need... I'll try to keep this as simple as possible, just two points:
Point 1. return
is not a special thing in the Haskell language. It's not a keyword, and it's not syntactic sugar for something else. It's just a function that's part of the Monad
typeclass. Its signature is simply:
return :: a -> m a
where m
is whichever monad we're talking about at the time. It takes a "pure" value and jams it into your monad. (Incidentally, there's another function called pure
that's basically a synonym for return
... I like it better because the name is more obvious!) Anyway, if m
is the list monad, then return
has this type:
return :: a -> [a]
If it helps, you could think of the type synonym type List a = [a]
, which might make it slightly more obvious that List
is the thing we're substituting for m
. Anyway, if you were going to implement return
yourself, the only reasonable way you'd implement it is by taking some value (of whatever type a
) and sticking it in a list by itself:
return a = [a]
So I can say return 1
in the list monad, and I'll get [1]
. I can likewise say return [1, 2, 3]
and I'll get [[1, 2, 3]]
.
Point 2. IO
is a monad, but not all monads are IO
. Many Haskell tutorials seem to conflate the two topics largely for historical reasons (incidentally, the same confusing historical reasons that led to return
being so poorly named). It sounds like you might have some (understandable) confusion around that.
In your code, you're in the list monad because you wrote do x <- [1, 2]
. If instead you had written do x <- getLine
for example, you'd be in the IO
monad (because getLine
returns IO String
). Anyway, you're in the list monad, so you get list's definition of return
described above. You also get list's definition of >>=
, which is just (a flipped version of) concatMap
, defined as:
concatMap :: (a -> [b]) -> [a] -> [b]
concatMap f xs = concat (map f xs)
The other posted answers pretty much have it covered from here :) I know I didn't answer your question directly, but I'm hoping these two points instead addressed the fundamental things you might have found confusing.
Solution 3:
It's easy to see when re-writing the code with bind and return:
[1,2] >>= (\x-> [x,x+1]) === concatMap (\x-> [ x,x+1 ]) [1,2]
[1,2] >>= (\x-> return [x,x+1]) === concatMap (\x-> [ [x,x+1] ]) [1,2]
Your first code is tantamount to calling join
on the results of the second, removing one monadic "layer" introduced by return :: a -> m a
, conflating the "list" of the monad in use, with the "list" of your value. If you were returning a pair, say, it wouldn't have made much sense to omit the return
:
-- WRONG: type mismatch
[1,2] >>= (\x-> (x,x+1)) === concatMap (\x-> ( x,x+1 )) [1,2]
-- OK:
[1,2] >>= (\x-> return (x,x+1)) === concatMap (\x-> [ (x,x+1) ]) [1,2]
Or, we can use a join/fmap
re-write:
ma >>= famb === join (fmap famb ma) -- famb :: a -> m b, m ~ []
join (fmap (\x-> [x,x+1]) [1,2]) = concat [ [ x,x+1 ] | x<-[1,2]]
join (fmap (\x-> (x,x+1)) [1,2]) = concat [ ( x,x+1 ) | x<-[1,2]] -- WRONG
join (fmap (\x-> return [x,x+1]) [1,2]) = concat [ [ [x,x+1] ] | x<-[1,2]]
= [y | x<-[1,2], y<-[ x,x+1 ]]
{- WRONG -} = [y | x<-[1,2], y<-( x,x+1 )]
= [y | x<-[1,2], y<-[[x,x+1]]]
Solution 4:
The List type ([]
) is a monad, yes.
Now, remember what return
does. This is easy to see from its type signature: return :: Monad m => a -> m a
. Let's substitute the list type in: return :: a -> [a]
. So this function takes some value and returns just a list of that value. It's equivalent to \ x -> [x]
.
So in the first code sample, you have one list at the very end: [x, x+1]
. In the second sample, you have a nested list: one list comes from [x, x + 1]
and another list comes from return
. The line return [x, x + 1]
could be rewritten to [[x, x + 1]]
in this case.
At the end, the result is the list of all possible results. That is, we concatenate the result of x
as 1
and the result of x
as 2
(thanks to the x <- [1,2]
line). So in the first case, we concatenate two lists; in the second case, we concatenate two lists of lists, because the extra return
wrapped the result in an extra list.