In what sense is the IO Monad pure?
I've had the IO monad described to me as a State monad where the state is "the real world". The proponents of this approach to IO argue that this makes IO operations pure, as in referentially transparent. Why is that? From my perspective it appears that code inside the IO monad has plenty of observable side effects. Also, isn't it possible to describe pretty much any non-pure function like a function of the real world? For example, can't we think of, say, C's malloc as being a function that takes a RealWorld and an Int and returns a pointer and a RealWorld, only just like in the IO monad the RealWorld is implicit?
Note: I know what a monad is and how it's used. Please don't respond with a link to a random monad tutorial unless it specifically addresses my question.
I think the best explanation I've heard was actually one I came across fairly recently on SO. IO Foo is a recipe for creating a Foo. Another common, more literal, way of saying this is that it is a "program that produces a Foo". It can be executed (many times) to create a Foo or die trying. The execution of the recipe/program is what we ultimately want (otherwise, why write one?), but the thing that is represented by an IO action in our code is the recipe itself.
That recipe is a pure value, in exactly the same sense that a String is a pure value. Recipes can be combined and manipulated in interesting, sometimes astonishing, ways, but the many ways these recipes can be combined (except for the blatantly non-pure unsafePerformIO, unsafeCoerce, etc.) are all completely referentially transparent, deterministic, and all that nice stuff. The resulting recipe depends in absolutely no way whatsoever on the state of anything other than the recipes that it was built up from.
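To make the recipe idea concrete, here is a minimal sketch. The names bump and demo are made up for illustration, and an IORef counter stands in for "the outside world" so that executions can be counted.

```haskell
import Data.IORef (IORef, newIORef, modifyIORef', readIORef)

-- A recipe that, when executed, bumps a counter.
-- Building or naming this value executes nothing.
bump :: IORef Int -> IO ()
bump ref = modifyIORef' ref (+ 1)

-- Hypothetical demo: returns the counter before and after execution.
demo :: IO (Int, Int)
demo = do
  ref <- newIORef 0
  -- Recipes are ordinary values: put them in a list, combine them.
  let recipes  = [bump ref, bump ref, bump ref]
      combined = sequence_ recipes   -- one bigger recipe, still not executed
  before <- readIORef ref            -- nothing has been bumped yet
  combined                           -- execute the combined recipe once
  combined                           -- the same recipe can be executed again
  after <- readIORef ref
  return (before, after)
```

Loading this in GHCi and evaluating demo should give (0,6): constructing recipes and combined changes nothing, and only executing combined (twice) touches the counter.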
Also, isn't it possible to describe pretty much any non-pure function like a function of the real world? For example, can't we think of, say, C's malloc as being a function that takes a RealWorld and an Int and returns a pointer and a RealWorld, only just like in the IO monad the RealWorld is implicit?
For sure ...
The whole idea of functional programming is to describe programs as a combination of small, independent calculations building up bigger computations.
Keeping these calculations independent brings lots of benefits, ranging from concise programs, efficient and efficiently parallelizable code, and laziness up to the rigorous guarantee that control flows as intended, with no chance of interference or corruption of arbitrary data.
Now - in some cases (like IO), we need impure code. Calculations involving such operations cannot be independent - they could mutate arbitrary data of another computation.
The point is - Haskell is always pure, IO doesn't change this.
So our impure, non-independent code has to get a common dependency: we have to pass a RealWorld. Whatever stateful computation we want to run, we have to pass it this RealWorld thing to apply our changes to, and whatever other stateful computation wants to see or make changes has to know the RealWorld too.
Whether this is done explicitly or implicitly through the IO monad is irrelevant. You build up a Haskell program as a giant computation that transforms data, and one part of this data is the RealWorld.
Once the initial main :: IO () gets called when your program is run, with the current real world as a parameter, this real world gets carried through all impure calculations involved, just as data would in a State. That's what monadic >>= (bind) takes care of.
And where the RealWorld doesn't get passed (as in pure computations, or in actions that are never >>=-ed into main), there is no chance of doing anything with it. And where it does get passed, that happened by purely functional passing of an (implicit) parameter. That's why
let foo = putStrLn "AAARGH" in 42
does absolutely nothing - and why the IO monad - like anything else - is pure. What happens inside this code can of course be impure, but it's all caught inside, with no chance of interfering with non-connected computations.
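The world-passing picture can be sketched with a toy model. Everything here (World, Action, say, andThen) is a made-up stand-in, not GHC's actual IO type; the "world" is just the list of lines printed so far.

```haskell
-- Toy stand-in for the real world: the lines printed so far.
-- (Assumption for illustration; the real RealWorld is opaque.)
newtype World = World [String] deriving (Eq, Show)

-- An action is a pure function from a world to a result and a new world.
newtype Action a = Action { runAction :: World -> (a, World) }

-- Toy counterpart of putStrLn: appends a line to the world.
say :: String -> Action ()
say s = Action (\(World out) -> ((), World (out ++ [s])))

-- Toy counterpart of >>=: threads the world from one action to the next.
andThen :: Action a -> (a -> Action b) -> Action b
andThen (Action f) k = Action (\w -> let (a, w') = f w in runAction (k a) w')

-- Mirrors `let foo = putStrLn "AAARGH" in 42`: foo is never handed a world,
-- so it cannot change one.
ignored :: Int
ignored = let foo = say "AAARGH" in 42

-- Only actions chained into the program ever receive the world.
program :: Action Int
program =
  say "hello" `andThen` \_ ->
  say "world" `andThen` \_ ->
  Action (\w -> (ignored, w))

-- Plays the role of the runtime handing the initial world to main.
demo :: (Int, World)
demo = runAction program (World [])
```

Evaluating demo should give (42, World ["hello","world"]): "AAARGH" never shows up, because foo was never connected to the chain that carries the world.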
Suppose we have something like:
animatePowBoomWhenHearNoiseInMicrophone :: TimeDiff -> Sample -> IO ()
animatePowBoomWhenHearNoiseInMicrophone
  levelWeightedAverageHalfLife levelThreshold = ...
programA :: IO ()
programA = animatePowBoomWhenHearNoiseInMicrophone 3 10000
programB :: IO ()
programB = animatePowBoomWhenHearNoiseInMicrophone 3 10000
Here's a point of view:
animatePowBoomWhenHearNoiseInMicrophone is a pure function in the sense that its results for the same input, programA and programB, are exactly the same. You can do main = programA or main = programB and it would be exactly the same.
animatePowBoomWhenHearNoiseInMicrophone is a function receiving two arguments and resulting in a description of a program. The Haskell runtime can execute this description if you set main to it or otherwise include it in main via binding.
What is IO? IO is a DSL for describing imperative programs, encoded in "pure-haskell" data structures and functions.
"complete-haskell", aka GHC, implements both "pure-haskell" and an imperative decoder/executor for IO.
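As a rough illustration of the DSL view, here is a tiny console language encoded as plain data, with two interpreters. Console, greet, runIO and runPure are hypothetical names; the real IO type is not defined this way, the sketch only mirrors the "description plus executor" split.

```haskell
-- A tiny, hypothetical console DSL: a program is plain data.
data Console a
  = Done a
  | PrintLine String (Console a)
  | ReadLine (String -> Console a)

-- A description built from pure constructors; nothing is executed here.
greet :: Console ()
greet =
  PrintLine "Name?" (ReadLine (\name -> PrintLine ("Hello, " ++ name) (Done ())))

-- Imperative executor: runs a description against the real console.
runIO :: Console a -> IO a
runIO (Done a)        = return a
runIO (PrintLine s k) = putStrLn s >> runIO k
runIO (ReadLine k)    = getLine >>= runIO . k

-- Pure interpreter: feeds canned input lines and collects the output lines.
runPure :: [String] -> Console a -> ([String], a)
runPure _        (Done a)        = ([], a)
runPure ins      (PrintLine s k) = let (out, a) = runPure ins k in (s : out, a)
runPure (i : is) (ReadLine k)    = runPure is (k i)
runPure []       (ReadLine k)    = runPure [] (k "")  -- assumption: exhausted input reads as ""
```

In GHCi, runIO greet talks to the terminal, while runPure ["Ada"] greet evaluates to (["Name?","Hello, Ada"],()) without performing any I/O. The same description serves both, which is the sense in which the description itself is a pure value.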
It quite simply comes down to extensional equality:
If you were to call getLine twice, then both calls would return an IO String which would look exactly the same on the outside each time. If you were to write a function to take two IO Strings and return a Bool to signal a detected difference between them, it could not detect any difference from any observable properties. It could not ask any other function whether they are equal, and any attempt at using >>= must also return something in IO, all of which are equal externally.
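A small sketch of that last point: there is no Eq instance for IO actions, so the closest one can get is to run both and compare the results, and then the answer is itself stuck in IO. The helper name sameResult is made up for illustration.

```haskell
-- Hypothetical helper: compares what two actions produce when run.
-- The result is IO Bool, not Bool; >>= (here via do-notation) never
-- lets the comparison escape from IO.
sameResult :: IO String -> IO String -> IO Bool
sameResult a b = do
  x <- a
  y <- b
  return (x == y)
```

Note that this compares the results of one particular execution, not the actions themselves. For example, sameResult getLine getLine may yield True or False depending on what is typed, even though both arguments are the very same value.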
I'll let Martin Odersky answer this:
The IO monad does not make a function pure. It just makes it obvious that it's impure.
Sounds clear enough.