Why is catching an exception non-pure, but throwing an exception is pure?
Solution 1:
Because throwing an exception inside a function doesn't make that function's result dependent on anything but the argument values and the definition of the function; the function remains pure. OTOH catching an exception inside a function does (or at least can) make that function no longer a pure function.
I'm going to look at two kinds of exceptions. The first is nondeterministic; such exceptions arise unpredictably at runtime, and include things like out of memory errors. The existence of these exceptions is not included in the meaning of the functions that might generate them. They're just an unpleasant fact of life we have to deal with because we have actual physical machines in the real world, which don't always match up to the abstractions we're using to help us program them.
If a function throws such an exception, it means that that one particular attempt to evaluate the function failed to produce a value. It doesn't necessarily mean that the function's result is undefined (on the arguments it was invoked on this time), but the system was unable to produce the result.
If you could catch such an exception within a pure caller, you could do things like have a function that returns one (non-bottom) value when a sub-computation completes successfully, and another when it runs out of memory. This doesn't make sense as a pure function; the value computed by a function call should be uniquely determined by the values of its arguments and the definition of the function. Being able to return something different depending on whether the sub-computation ran out of memory makes the return value dependent on something else (how much memory is available on the physical machine, what other programs are running, the operating system and its policies, etc.); by definition a function which can behave this way is not pure and can't (normally) exist in Haskell.
Because of purely operational failures, we do have to allow that evaluating a function may produce bottom instead of the value it "should" have produced. That doesn't completely ruin our semantic interpretation of Haskell programs, because we know the bottom will cause all the callers to produce bottom as well (unless they didn't need the value that was supposed to be computed, but in that case non-strict evaluation implies that the system never would have tried to evaluate this function and failed). That sounds bad, but when we place our computation inside the IO
monad than we can safely catch such exceptions. Values in the IO
monad are allowed to depend on things "outside" the program; in fact they can change their value dependent on anything in the world (this is why one common interpretation of IO
values is that they are as if they were passed a representation of the entire universe). So it's perfectly okay for an IO
value to have one result if a pure sub-computation runs out of memory and another result if it doesn't.
But what about deterministic exceptions? Here I'm talking about exceptions that are always thrown when evaluating a particular function on a particular set of arguments. Such exceptions include divide-by-zero errors, as well as any exception explicitly thrown from a pure function (since its result can only depend on its arguments and its definition, if it evaluates to a throw once it will always evaluate to the same throw for the same arguments[1]).
It might seem like this class of exceptions should be catchable in pure code. After all, the value of 1 / 0
just is a divide-by-zero error. If a function can have a different result depending on whether a sub-computation evaluates to a divide-by-zero error by checking whether it's passing in a zero, why can't it do this by checking whether the result is a divide-by-zero error?
Here we get back to the point larsmans made in a comment. If a pure function can observe which exception it gets from throw ex1 + throw ex2
, then its result becomes dependent on the order of execution. But that's up to the runtime system, and it could conceivably even change between two different executions of the same system. Maybe we've got some advanced auto-parallelising implementation which tries different parallelisation strategies on each execution in order to try to converge on the best strategy over multiple runs. This would make the result of the exception-catching function depend on the strategy being used, the number of CPUs in the machine, the load on the machine, the operating system and its scheduling policies, etc.
Again, the definition of a pure function is that only information which comes into a function through its arguments (and its definition) should affect its result. In the case of non-IO
functions, the information affecting which exception gets thrown doesn't come into the function through its arguments or definition, so it can't have an effect on the result. But computations in the IO
monad implicitly are allowed to depend on any detail of the entire universe, so catching such exceptions is fine there.
As for your second dot point: no, other monads wouldn't work for catching exceptions. All the same arguments apply; computations producing Maybe x
or [y]
aren't supposed to depend on anything outside their arguments, and catching any kind of exception "leaks" all sorts of details about things which aren't included in those function arguments.
Remember, there's nothing particularly special about monads. They don't work any differently than other parts of Haskell. The monad typeclass is defined in ordinary Haskell code, as are almost all monad implementations. All the same rules that apply to ordinary Haskell code apply to all monads. It's IO
itself that is special, not the fact that it's a monad.
As for how other pure languages handle exception catching, the only other language with enforced purity that I have experience with is Mercury.[2] Mercury does it a little differently from Haskell, and you can catch exceptions in pure code.
Mercury is a logic programming language, so rather than being built on functions, Mercury programs are built from predicates; a call to a predicate can have zero, one, or more solutions (if you're familiar with programming in the list monad to get nondeterminism, it's a little bit like the entire language is in the list monad). Operationally, Mercury execution uses backtracking to recursively enumerate all possible solutions to a predicate, but the semantics of a nondeterministic predicate is that it simply has a set of solutions for each set of its input arguments, as opposed to a Haskell function which calculates a single result value for each set of its input arguments. Like Haskell, Mercury is pure (including I/O, though it uses a slightly different mechanism), so each call to a predicate must uniquely determine a single solution set, which depends only on the arguments and the definition of the predicate.
Mercury tracks the "determinism" of each predicate. Predicates which always result in exactly one solution are called det
(short for deterministic). Those which generate at least one solution are called multi
. There are a few other determinism classes as well, but they're not relevant here.
Catching an exception with a try
block (or by explicitly calling the higher-order predicates which implement it) has determinism cc_multi
. The cc stands for "committed choice". It means "this computation has at least one solution, and operationally the program is only going to get one of them". This is because running the sub-computation and seeing whether it produced an exception has a solution set which is the union of the sub-computation's "normal" solutions plus the set of all possible exceptions it could throw. Since "all possible exceptions" includes every possible runtime failure, most of which will never actually happen, this solution set can't be fully realised. There's no possible way the execution engine could actually backtrack through every possible solution to the try
block, so instead it just gives you a solution (either a normal one, or an indication that all possibilities were explored and there was no solution or exception, or the first exception that happened to arise).
Because the compiler keeps track of the determinism, it will not allow you to call try
in a context where the complete solution set matters. You can't use it to generate all solutions which don't encounter an exception, for example, because the compiler will complain that it needs all solutions to a cc_multi
call, which is only going to produce one. However you also can't call it from a det
predicate, because the compiler will complain that a det
predicate (which is supposed to have exactly one solution) is making a cc_multi
call, which will have multiple solutions (we're just only going to know what one of them is).
So how on earth is this useful? Well, you can have main
(and other things it calls, if that's useful) declared as cc_multi
, and they can call try
with no problems. This means that the entire program has multiple "solutions" in theory, but running it will generate a solution. This allows you to write a program that behaves differently when it happens to run out of memory at some point. But it doesn't spoil the declarative semantics because the "real" result it would have computed with more memory available is still in the solution set (just as the out-of-memory exception is still in the solution set when the program actually does compute a value), it's just that we only end up with one arbitrary solution.
It's important that det
(there is exactly one solution) is treated differently from cc_multi
(there are multiple solutions, but you can only have one of them). Similarly to the reasoning about catching exceptions in Haskell, exception catching can't be allowed to happen in a non-"committed choice" context, or you could get pure predicates producing different solution sets depending on information from the real world that they shouldn't have access to. The cc_multi
determinism of try
allows us to write programs as if they produced an infinite solution set (mostly full of minor variants of unlikely exceptions), and prevents us from writing programs that actually need more than one solution from the set.[3]
[1] Unless evaluating it encounters a nondeterministic error first. Real life's a pain.
[2] Languages which merely encourage the programmer to use purity without enforcing it (such as Scala) tend to just let you catch exceptions wherever you want, same as they allow you to do I/O wherever you want.
[3] Note that the "committed choice" concept is not how Mercury handles pure I/O. For that, Mercury uses unique types, which is orthogonal to the "committed choice" determinism class.
Solution 2:
The paper delnan mentions in the comments, and the answers to this previous question, certainly provide sufficient reasons for only catching exceptions in IO
.
However, I can also see why reasons such as observing evaluation order or breaking monotonicity may not be persuasive on an intuitive level; it's difficult to imagine how either could cause much harm in the vast majority of code. As such, it might help to recall that exception handling is a control flow structure of a distinctly non-local variety, and being able to catch exceptions in pure code would allow (mis)using them for that purpose.
Allow me to illustrate exactly what horror this entails.
First, we define an exception type to use, and a version of catch
that can be used in pure code:
newtype Exit a = Exit { getExit :: a } deriving (Typeable)
instance Show (Exit a) where show _ = "EXIT"
instance (Typeable a) => Exception (Exit a)
unsafeCatch :: (Exception e) => a -> (e -> a) -> a
unsafeCatch x f = unsafePerformIO $ catch (seq x $ return x) (return . f)
This will let us throw any Typeable
value and then catch it from some outer scope, without the consent of any intervening expressions. For instance, we could hide an Exit
throw inside something we pass to a higher-order function to escape with some intermediate value generated by its evaluation. Astute readers may have figured out by now where this is going:
callCC :: (Typeable a) => ((a -> b) -> a) -> a
callCC f = unsafeCatch (f (throw . Exit)) (\(Exit e) -> e)
Yes, that actually works, with the caveat that it requires any use of the continuation to be evaluated whenever the whole expression is. Keep that in mind if you try this out, or just use deepseq
if nuking from orbit is more your speed.
Behold:
-- This will clearly never terminate, no matter what k is
foo k = fix (\f x -> if x > 100 then f (k x) else f (x + 1)) 0
But:
∀x. x ⊢ callCC foo
101
Escaping from inside a map
:
seqs :: [a] -> [a]
seqs xs = foldr (\h t -> h `seq` t `seq` (h:t)) [] xs
bar n k = map (\x -> if x > 10 then k [x] else x) [0..n]
Note the need to force evaluation.
∀x. x ⊢ callCC (seqs . bar 9)
[0,1,2,3,4,5,6,7,8,9]
∀x. x ⊢ callCC (seqs . bar 11)
[11]
...ugh.
Now, let us never speak of this again.