What are arrows, and how can I use them?
I don't know a tutorial, but I think it's easiest to understand arrows if you look at some concrete examples. The biggest problem I had learning how to use arrows was that none of the tutorials or examples actually show how to use arrows, just how to compose them. So, with that in mind, here's my mini-tutorial. I'll examine two different arrows: functions and a user-defined arrow type MyArr
.
-- type representing a computation
data MyArr b c = MyArr (b -> (c,MyArr b c))
1) An Arrow is a calculation from input of a specified type to output of a specified type. The arrow typeclass takes three type arguments: the arrow type, the input type, and the output type. Looking at the instance head for arrow instances we find:
instance Arrow (->) b c where
instance Arrow MyArr b c where
The Arrow (either (->)
or MyArr
) is an abstraction of a computation.
For a function b -> c
, b
is the input and c
is the output.
For a MyArr b c
, b
is the input and c
is the output.
2) To actually run an arrow computation, you use a function specific to your arrow type. For functions you simply apply the function to an argument. For other arrows, there needs to be a separate function (just like runIdentity
, runState
, etc. for monads).
-- run a function arrow
runF :: (b -> c) -> b -> c
runF = id
-- run a MyArr arrow, discarding the remaining computation
runMyArr :: MyArr b c -> b -> c
runMyArr (MyArr step) = fst . step
3) Arrows are frequently used to process a list of inputs. For functions these can be done in parallel, but for some arrows output at any given step depends upon previous inputs (e.g. keeping a running total of inputs).
-- run a function arrow over multiple inputs
runFList :: (b -> c) -> [b] -> [c]
runFList f = map f
-- run a MyArr over multiple inputs.
-- Each step of the computation gives the next step to use
runMyArrList :: MyArr b c -> [b] -> [c]
runMyArrList _ [] = []
runMyArrList (MyArr step) (b:bs) = let (this, step') = step b
in this : runMyArrList step' bs
This is one reason Arrows are useful. They provide a computation model that can implicitly make use of state without ever exposing that state to the programmer. The programmer can use arrowized computations and combine them to create sophisticated systems.
Here's a MyArr that keeps count of the number of inputs it has received:
-- count the number of inputs received:
count :: MyArr b Int
count = count' 0
where
count' n = MyArr (\_ -> (n+1, count' (n+1)))
Now the function runMyArrList count
will take a list length n as input and return a list of Ints from 1 to n.
Note that we still haven't used any "arrow" functions, that is either Arrow class methods or functions written in terms of them.
4) Most of the code above is specific to each Arrow instance[1]. Everything in Control.Arrow
(and Control.Category
) is about composing arrows to make new arrows. If we pretend that Category is part of Arrow instead of a separate class:
-- combine two arrows in sequence
>>> :: Arrow a => a b c -> a c d -> a b d
-- the function arrow instance
-- >>> :: (b -> c) -> (c -> d) -> (b -> d)
-- this is just flip (.)
-- MyArr instance
-- >>> :: MyArr b c -> MyArr c d -> MyArr b d
The >>>
function takes two arrows and uses the output of the first as input to the second.
Here's another operator, commonly called "fanout":
-- &&& applies two arrows to a single input in parallel
&&& :: Arrow a => a b c -> a b c' -> a b (c,c')
-- function instance type
-- &&& :: (b -> c) -> (b -> c') -> (b -> (c,c'))
-- MyArr instance type
-- &&& :: MyArr b c -> MyArr b c' -> MyArr b (c,c')
-- first and second omitted for brevity, see the accepted answer from KennyTM's link
-- for further details.
Since Control.Arrow
provides a means to combine computations, here's one example:
-- function that, given an input n, returns "n+1" and "n*2"
calc1 :: Int -> (Int,Int)
calc1 = (+1) &&& (*2)
I've frequently found functions like calc1
useful in complicated folds, or functions that operate on pointers for example.
The Monad
type class provides us with a means to combine monadic computations into a single new monadic computation using the >>=
function. Similarly, the Arrow
class provides us with means to combine arrowized computations into a single new arrowized computation using a few primitive functions (first
, arr
, and ***
, with >>>
and id
from Control.Category). Also similar to Monads, the question of "What does an arrow do?" can't be generally answered. It depends on the arrow.
Unfortunately I don't know of many examples of arrow instances in the wild. Functions and FRP seem to be the most common applications. HXT is the only other significant usage that comes to mind.
[1] Except count
. It's possible to write a count function that does the same thing for any instance of ArrowLoop
.
From a glance at your history on Stack Overflow, I'm going to assume that you're comfortable with some of the other standard type classes, particularly Functor
and Monoid
, and start with a brief analogy from those.
The single operation on Functor
is fmap
, which serves as a generalized version of map
on lists. This is pretty much the entire purpose of the type class; it defines "things that you can map over". So, in a sense Functor
represents a generalization of that particular aspect of lists.
The operations for Monoid
are generalized versions of the empty list and (++)
, and it defines "things that can be combined associatively, with a particular thing that's an identity value". Lists are pretty much the simplest thing that fits that description, and Monoid
represents a generalization of that aspect of lists.
In the same way as the above two, the operations on the Category
type class are generalized versions of id
and (.)
, and it defines "things connecting two types in a particular direction, that can be connected head-to-tail". So, this represents a generalization of that aspect of functions. Notably not included in the generalization are currying or function application.
The Arrow
type class builds off of Category
, but the underlying concept is the same: Arrow
s are things that compose like functions and have an "identity arrow" defined for any type. The additional operations defined on the Arrow
class itself just define a way to lift an arbitrary function to an Arrow
and a way to combine two arrows "in parallel" as a single arrow between tuples.
So, the first thing to keep in mind here is that expressions building Arrow
s are, essentially, elaborate function composition. The combinators like (***)
and (>>>)
are for writing "pointfree" style, while proc
notation gives a way to assign temporary names to inputs and outputs while wiring things up.
A useful thing to note here is that, even though Arrow
s are sometimes described as being "the next step" from Monad
s, there's really not a terribly meaningful relationship there. For any Monad
you can work with Kleisli arrows, which are just functions with a type like a -> m b
. The (<=<)
operator in Control.Monad
is arrow composition for these. On the other hand, Arrow
s don't get you a Monad
unless you also include the ArrowApply
class. So there's no direct connection as such.
The key difference here is that whereas Monad
s can be used to sequence computations and do things step-by-step, Arrow
s are in some sense "timeless" just like regular functions. They can include extra machinery and functionality that gets spliced in by (.)
, but it's more like building a pipeline, not accumulating actions.
The other related type classes add additional functionality to an arrow, such as being able to combine arrows with Either
as well as (,)
.
My favorite example of an Arrow
is stateful stream transducers, which look something like this:
data StreamTrans a b = StreamTrans (a -> (b, StreamTrans a b))
A StreamTrans
arrow converts an input value to an output and an "updated" version of itself; consider the ways that this differs from a stateful Monad
.
Writing instances for Arrow
and its related type classes for the above type may be a good exercise for understanding how they work!
I also wrote a similar answer previously that you may find helpful.
I'd like to add that arrows in Haskell are far simpler than they might appear based on the literature. They are simply abstractions of functions.
To see how this is practically useful, consider that you have a bunch of
functions you want to compose, where some of them are pure and some are
monadic. For example, f :: a -> b
, g :: b -> m1 c
, and h :: c -> m2 d
.
Knowing each of the types involved, I could build a composition by hand, but
the output type of the composition would have to reflect the intermediate
monad types (in the above case, m1 (m2 d)
). What if I just wanted to treat
the functions as if they were just a -> b
, b -> c
, and c -> d
? That is,
I want to abstract away the presence of monads and reason only about the
underlying types. I can use arrows to do exactly this.
Here is an arrow which abstracts away the presence of IO for functions in the IO monad, such that I can compose them with pure functions without the composing code needing to know that IO is involved. We start by defining an IOArrow to wrap IO functions:
data IOArrow a b = IOArrow { runIOArrow :: a -> IO b }
instance Category IOArrow where
id = IOArrow return
IOArrow f . IOArrow g = IOArrow $ f <=< g
instance Arrow IOArrow where
arr f = IOArrow $ return . f
first (IOArrow f) = IOArrow $ \(a, c) -> do
x <- f a
return (x, c)
Then I make some simple functions that I want to compose:
foo :: Int -> String
foo = show
bar :: String -> IO Int
bar = return . read
And use them:
main :: IO ()
main = do
let f = arr (++ "!") . arr foo . IOArrow bar . arr id
result <- runIOArrow f "123"
putStrLn result
Here I am calling IOArrow and runIOArrow, but if I were passing these arrows around in a library of polymorphic functions, they would only need to accept arguments of type "Arrow a => a b c". None of the library code would need to be made aware that a monad was involved. Only the creator and end user of the arrow needs to know.
Generalizing IOArrow to work for functions in any Monad is called the "Kleisli arrow", and there is already a builtin arrow for doing just that:
main :: IO ()
main = do
let g = arr (++ "!") . arr foo . Kleisli bar . arr id
result <- runKleisli g "123"
putStrLn result
You could of course also use arrow composition operators, and proc syntax, to make it a little clearer that arrows are involved:
arrowUser :: Arrow a => a String String -> a String String
arrowUser f = proc x -> do
y <- f -< x
returnA -< y
main :: IO ()
main = do
let h = arr (++ "!")
<<< arr foo
<<< Kleisli bar
<<< arr id
result <- runKleisli (arrowUser h) "123"
putStrLn result
Here it should be clear that although main
knows the IO monad is involved,
arrowUser
does not. There would be no way of "hiding" IO from arrowUser
without arrows -- not without resorting to unsafePerformIO
to turn the
intermediate monadic value back into a pure one (and thus losing that context
forever). For example:
arrowUser' :: (String -> String) -> String -> String
arrowUser' f x = f x
main' :: IO ()
main' = do
let h = (++ "!") . foo . unsafePerformIO . bar . id
result = arrowUser' h "123"
putStrLn result
Try writing that without unsafePerformIO
, and without arrowUser'
having to
deal with any Monad type arguments.