How do functors work in haskell?
I've accidently written a
Haskell Functors Tutorial
I'll answer your question using examples, and I'll put the types underneath in comments.
Watch out for the pattern in the types.
fmap
is a generalisation of map
Functors are for giving you the fmap
function. fmap
works like map
, so let's check out map
first:
map (subtract 1) [2,4,8,16] = [1,3,7,15]
-- Int->Int [Int] [Int]
So it uses the function (subtract 1)
inside the list. In fact, for lists, fmap
does exactly what map
does. Let's multiply everything by 10 this time:
fmap (* 10) [2,4,8,16] = [20,40,80,160]
-- Int->Int [Int] [Int]
I'd describe this as mapping the function that multiplies by 10 over the list.
fmap
also works on Maybe
What else can I fmap
over? Let's use the Maybe data type, which has two types of values, Nothing
and Just x
. (You can use Nothing
to represent a failure to get an answer while Just x
represents an answer.)
fmap (+7) (Just 10) = Just 17
fmap (+7) Nothing = Nothing
-- Int->Int Maybe Int Maybe Int
OK, so again, fmap
is using (+7)
inside the Maybe.
And we can fmap other functions too. length
finds the length of a list, so we can fmap it over Maybe [Double]
fmap length Nothing = Nothing
fmap length (Just [5.0, 4.0, 3.0, 2.0, 1.573458]) = Just 5
-- [Double]->Int Maybe [Double] Maybe Int
Actually length :: [a] -> Int
but I'm using it here on [Double]
so I specialised it.
Let's use show
to turn stuff into strings. Secretly the actual type of show
is Show a => a -> String
, but that's a bit long, and I'm using it here on an Int
, so it specialises to Int -> String
.
fmap show (Just 12) = Just "12"
fmap show Nothing = Nothing
-- Int->String Maybe Int Maybe String
also, looking back to lists
fmap show [3,4,5] = ["3", "4", "5"]
-- Int->String [Int] [String]
fmap
works on Either something
Let's use it on a slightly different structure, Either
. Values of type Either a b
are either Left a
values or Right b
values. Sometimes we use Either to represent a success Right goodvalue
or failure Left errordetails
, and sometime just to mix together values of two types into one. Anyway, the functor for the Either data type only works on the Right
- it leaves Left
values alone. That makes sense particularly if you're using Right values as the successful ones (and in fact we wouldn't be able to make it work on both because the types aren't necessarily the same). Lets use the type Either String Int
as an example
fmap (5*) (Left "hi") = Left "hi"
fmap (5*) (Right 4) = Right 20
-- Int->Int Either String Int Either String Int
It makes (5*)
work inside the Either, but for Eithers, only the Right
values get changed. But we can do it the other way round on Either Int String
, as long as the function works on strings. Let's put ", cool!"
at the end of stuff, using (++ ", cool!")
.
fmap (++ ", cool!") (Left 4) = Left 4
fmap (++ ", cool!") (Right "fmap edits values") = Right "fmap edits values, cool!"
-- String->String Either Int String Either Int String
It's especially cool to use fmap
on IO
Now one of my favourite ways of using fmap is to use it on IO
values to edit the value some IO operation gives me. Let's make an example that lets you type something in and then prints it out straight away:
echo1 :: IO ()
echo1 = do
putStrLn "Say something!"
whattheysaid <- getLine -- getLine :: IO String
putStrLn whattheysaid -- putStrLn :: String -> IO ()
We can write that in a way that feels neater to me:
echo2 :: IO ()
echo2 = putStrLn "Say something"
>> getLine >>= putStrLn
>>
does one thing after another, but the reason I like this is because >>=
takes the String that getLine
gave us and feeds it to putStrLn
which takes a String.
What if we wanted to just greet the user:
greet1 :: IO ()
greet1 = do
putStrLn "What's your name?"
name <- getLine
putStrLn ("Hello, " ++ name)
If we wanted to write that in the neater way I'm a bit stuck. I'd have to write
greet2 :: IO ()
greet2 = putStrLn "What's your name?"
>> getLine >>= (\name -> putStrLn ("Hello, " ++ name))
which is not nicer than the do
version. In fact the do
notation is there so you don't have to do this. But can fmap
come to the rescue? Yes it can. ("Hello, "++)
is a function that I can fmap over the getLine!
fmap ("Hello, " ++) getLine = -- read a line, return "Hello, " in front of it
-- String->String IO String IO String
we can use it like this:
greet3 :: IO ()
greet3 = putStrLn "What's your name?"
>> fmap ("Hello, "++) getLine >>= putStrLn
We can pull this trick on anything we're given. Let's disagree with whether "True" or "False" was typed in:
fmap not readLn = -- read a line that has a Bool on it, change it
-- Bool->Bool IO Bool IO Bool
Or let's just report the size of a file:
fmap length (readFile "test.txt") = -- read the file, return its length
-- String->Int IO String IO Int
-- [a]->Int IO [Char] IO Int (more precisely)
Conclusions: What does fmap
do, and what does it do it to?
If you've been watching the patterns in the types and thinking about the examples you'll have noticed that fmap takes a function that works on some values, and applies that function on something that has or produces those values somehow, editing the values. (eg readLn was to read Bool, so had type IO Bool
there's a Boolean value in it in the sense that it produces a Bool
, eg2 [4,5,6]
has Int
s in it.)
fmap :: (a -> b) -> Something a -> Something b
this works for Something
being List-of (written []
), Maybe
, Either String
, Either Int
, IO
and loads of over things. We call it a Functor if this works in a sensible way (there are some rules - later). The actual type of fmap is
fmap :: Functor something => (a -> b) -> something a -> something b
but we usually replace something
with f
for brevity. It's all the same to the compiler, though:
fmap :: Functor f => (a -> b) -> f a -> f b
Have a look back at the types and check this always works - think about Either String Int
carefully - what's f
that time?
Appendix: What are the Functor rules, and why do we have them?
id
is the identity function:
id :: a -> a
id x = x
Here are the rules:
fmap id == id -- identity identity
fmap (f . g) == fmap f . fmap g -- composition
Firstly the identity identity: If you map the function that does nothing, that doesn't change anything. That sounds obvious (a lot of rules do), but you can interpret that as saying that fmap
is only allowed to change the values, not the structure. fmap
isn't allowed to turn Just 4
into Nothing
, or [6]
into [1,2,3,6]
, or Right 4
into Left 4
because more than just the data changed - the structure or context for that data changed.
I hit this rule once when I was working on a graphical user interface project - I wanted to be able to edit the values, but I couldn't do it without changing the structure underneath. No-one would have really noticed the difference because it had the same effect, but realising it didn't obey the functor rules made me rethink my whole design, and it's much cleaner, slicker and faster now.
Secondly the composition: this means you can choose whether to fmap one function at a time, or fmap them both at the same time. If fmap
leaves the structure/context of your values alone and just edits them with the function its given, it will work with this rule too.
There's a secret third rule that mathematicians have, but we don't call it a rule in Haskell, because it just looks like a type declaration:
fmap :: (a -> b) -> something a -> something b
This one stops you from applying the function to just the first value in your list, for example. This law is enforced by the compiler.
Why do we have them? To make sure fmap
doesn't sneakily do anything behind the scenes or change anything we didn't expect. They're not enforced by the compiler (asking the compiler to prove a theorem before it compiles your code isn't fair, and would slow compilation down - the programmer should check). This means you can cheat the laws a bit, but that's a bad plan because your code can give unexpected results.
The laws of Functor are to make sure that fmap
applies your function fairly, equally, everywhere and without any other changes. That's a good, clean, clear, reliable, reusable thing.
A fuzzy explanation would be that a Functor
is some sort of container and an associated function fmap
that allows you to alter whatever is contained, given a function that transforms the contained.
For instance, lists are this kind of container, such that fmap (+1) [1,2,3,4]
yields [2,3,4,5]
.
Maybe
can also be made a functor, such that fmap toUpper (Just 'a')
yields Just 'A'
.
The general type of fmap
shows quite neatly what is going on:
fmap :: Functor f => (a -> b) -> f a -> f b
And the specialized versions may make it clearer. Here's the list version:
fmap :: (a -> b) -> [a] -> [b]
And the Maybe version:
fmap :: (a -> b) -> Maybe a -> Maybe b
You can gain information on the standard Functor
instances by querying GHCI with :i Functor
and many modules define more instances of Functor
s (and other type classes.)
Please don't take the word "container" too seriously, though. Functor
s are a well-defined concept, but you can often reason about it with this fuzzy analogy.
Your best bet in understanding what is going on is simply reading the definition of each of the instances, which should give you an intuition on what is going on. From there it's only a small step to really formalize your understanding of the concept. What needs to be added is a clarification of what our "container" really is, and that each instance much satisfy a pair of simple laws.
It's important to keep separate in your head the distinction between a functor itself, and a value in a type which has a functor applied to it. A functor itself is a type constructor like Maybe
, IO
, or the list constructor []
. A value in a functor is some particular value in a type with that type constructor applied. e.g. Just 3
is one particular value in the type Maybe Int
(that type is the Maybe
functor applied to the type Int
), putStrLn "Hello World"
is one particular value in the type IO ()
, and [2, 4, 8, 16, 32]
is one particular value in the type [Int]
.
I like to think of a value in a type with a functor applied as being "the same" as a value in the base type, but with some extra "context". People often use a container analogy for a functor, which works pretty naturally for quite a few functors but then becomes more of a hindrance than a help when you're having to convince yourself that IO
or (->) r
is like a container.
So if an Int
represents an integer value, then a Maybe Int
represents an integer value that may not be present ("may not be present" is the "context"). An [Int]
represents an integer value with a number of possible values (this is the same interpretation of the list functor as the "nondeterminism" interpretation of the list monad). An IO Int
represents an integer value whose precise value depends on the entire universe (or alternatively, it represents an integer value that can be obtained by running an external process). A Char -> Int
is an integer value for any Char
value ("function taking r
as an argument" is a functor for any type r
; with r
as Char
(->) Char
is the type constructor which is a functor, which applied to Int
becomes (->) Char Int
or Char -> Int
in infix notation).
The only thing you can do with a general functor is fmap
, with the type Functor f => (a -> b) -> (f a -> f b)
. fmap
transforms a function that operates on normal values into a function that operates on values with additional context added by a functor; what exactly this does is different for each functor, but you can do it with all of them.
So with the Maybe
functor fmap (+1)
is the function that computes a possibly-not-present integer 1 higher than its input possibly-not-present integer. With the list functor fmap (+1)
is the function that computes a nondeterministic integer 1 higher than its input nondeterministic integer. With the IO
functor, fmap (+1)
is the function that computes an integer 1 higher than its input integer-whose-value-depends-on-the-external-universe. With the (->) Char
functor, fmap (+1)
is the function that adds 1 to an integer which depends on a Char
(when I feed a Char
to the return value, I get 1 higher than what I would have got by feeding the same Char
to the original value).
But in general, for some unknown functor f
, fmap (+1)
applied to some value in f Int
is "functor version" of the function (+1)
on ordinary Int
s. It adds 1 to the integer in whatever kind of "context" this particular functor has.
On its own, fmap
isn't necessarily that useful. Usually when you're writing a concrete program and working with a functor, you're working with one particular functor, and you often think of fmap
as whatever it does for that particular functor. When I'm working with [Int]
, I'm often not thinking of my [Int]
values as nondeterministic integers, I just think of them as lists of integers, and I think of fmap
the same way I think of map
.
So why bother with functors? Why not just have map
for lists, applyToMaybe
for Maybe
s, and applyToIO
for IO
s? Then everyone would know what they do and no one would have to understand weird abstract concepts like functors.
The key is the recognition that there are a lot of functors out there; almost all container types for a start (hence the container analogy for what functors are). Every one of them has an operation corresponding to fmap
, even if we don't have functors. Whenever you write an algorithm solely in terms of the fmap
operation (or map
, or whatever it's called for your particular type), then if you write it in terms of functors rather than your particular type then it works for all functors.
It can also serve as a form of documentation. If I hand off one of my list values to a function you wrote that operates on lists, it could do any number of things. But if I hand off my list to a function you wrote that operates on values in an arbitrary functor, then I know that the implementation of your function can't be using list features, only functor features.
Thinking back to how you would use functorish things in traditional imperative programming might help see the benefits. There container types like arrays, lists, trees, etc, will usually have some pattern you use to iterate over them. It can be slightly different for different containers, although libraries often provide standard iteration interfaces to address this. But you still end up writing a little for-loop every time you want to iterate over them, and when what you want to do is calculate a result for each item in the container and collect all the results you typically end up mixing in the logic for constructing the new container as you go.
fmap
is every for loop of that form you will ever write, sorted once and for all by the library writers, before you even sit down to program. Plus it can also be used with things like Maybe
and (->) r
that probably wouldn't be seen as having anything to do with a designing a consistent container interface in imperative languages.
In Haskell, functors capture the notion of having containers of "stuff", such that you can manipulate that "stuff" without changing the shape of the container.
Functors provide one function, fmap
, that lets you do this, by taking a regular function and "lifting" it to a function from containers of one type of element to another:
fmap :: Functor f => (a -> b) -> (f a -> f b)
For example, []
, the list type constructor, is a functor:
> fmap show [1, 2, 3]
["1","2","3"]
and so are many other Haskell type constructors, like Maybe
and Map Integer
1:
> fmap (+1) (Just 3)
Just 4
> fmap length (Data.Map.fromList [(1, "hi"), (2, "there")])
fromList [(1,2),(2,5)]
Note that fmap
is not allowed to change the "shape" of the container, so if for example you fmap
a list, the result has the same number of elements, and if you fmap
a Just
it can't become a Nothing
. In formal terms, we require that fmap id = id
, i.e. if you fmap
the identity function, nothing changes.
So far I've been using the term "container", but it's really a bit more general than that. For example, IO
is also a functor, and what we mean by "shape" in that case is that fmap
on an IO
action shouldn't change the side effects. In fact, any monad is a functor2.
In category theory, functors allow you to convert between different categories, but in Haskell we only really have one category, often called Hask. Therefore all functors in Haskell convert from Hask to Hask, so they are what we call endofunctors (functors from a category to itself).
In their simplest form, functors are somewhat boring. There is only so much you can do with just one operation. However, once you start adding operations, you can go from regular functors to applicative functors to monads and things quickly get a lot more interesting, but that's beyond the scope of this answer.
1 But Set
is not, because it can only store Ord
types. Functors must be able to contain any type.
2 Due to historical reasons, Functor
is not a superclass of Monad
, though many people think it should be.