How do you know when to use fold-left and when to use fold-right?
I'm aware that fold-left produces left-leaning trees and fold-right produces right-leaning trees, but when I reach for a fold, I sometimes find myself getting bogged down in headache-inducing thought trying to determine which kind of fold is appropriate. I usually end up unwinding the entire problem and stepping through the implementation of the fold function as it applies to my problem.
So what I want to know is:
- What are some rules of thumb for determining whether to fold left or fold right?
- How can I quickly decide which type of fold to use given the problem I'm facing?
There is an example in Scala by Example (PDF) of using a fold to write a function called flatten which concatenates a list of element lists into a single list. In that case, a right fold is the proper choice (given the way the lists are concatenated), but I had to think about it a bit to arrive at that conclusion.
Since folding is such a common action in (functional) programming, I'd like to be able to make these kinds of decisions quickly and confidently. So... any tips?
Solution 1:
You can transfer a fold into an infix operator notation (writing in between):
This example fold using the accumulator function x
fold x [A, B, C, D]
thus equals
A x B x C x D
Now you just have to reason about the associativity of your operator (by putting parentheses!).
If you have a left-associative operator, you'll set the parentheses like this
((A x B) x C) x D
Here, you use a left fold. Example (haskell-style pseudocode)
foldl (-) [1, 2, 3] == (1 - 2) - 3 == 1 - 2 - 3 // - is left-associative
If your operator is right-associative (right fold), the parentheses would be set like this:
A x (B x (C x D))
Example: Cons-Operator
foldr (:) [] [1, 2, 3] == 1 : (2 : (3 : [])) == 1 : 2 : 3 : [] == [1, 2, 3]
In general, arithmetic operators (most operators) are left-associative, so foldl
is more widespread. But in the other cases, infix notation + parentheses is quite useful.
Solution 2:
Olin Shivers differentiated them by saying "foldl is the fundamental list iterator" and "foldr is the fundamental list recursion operator." If you look at how foldl works:
((1 + 2) + 3) + 4
you can see the accumulator (as in a tail-recursive iteration) being built. In contrast, foldr proceeds:
1 + (2 + (3 + 4))
where you can see the traversal to the base case 4 and building up the result from there.
So I posit a rule of thumb: if it looks like a list iteration, one that would be simple to write in tail-recursive form, foldl is the way to go.
But really this will be probably be most evident from the associativity of the operators you are using. If they are left-associative, use foldl. If they are right-associative, use foldr.
Solution 3:
Other posters have given good answers and I won't repeat what they've already said. As you have given a Scala example in your question, I'll give a Scala specific example. As Tricks already said, a foldRight
needs to preserve n-1
stack frames, where n
is the length of your list and this can easily lead to a stack overflow - and not even tail recursion could save you from this.
A List(1,2,3).foldRight(0)(_ + _)
would reduce to:
1 + List(2,3).foldRight(0)(_ + _) // first stack frame
2 + List(3).foldRight(0)(_ + _) // second stack frame
3 + 0 // third stack frame
// (I don't remember if the JVM allocates space
// on the stack for the third frame as well)
while List(1,2,3).foldLeft(0)(_ + _)
would reduce to:
(((0 + 1) + 2) + 3)
which can be iteratively computed, as done in the implementation of List
.
In a strictly evaluated language as Scala, a foldRight
can easily blow the stack up for large lists, while a foldLeft
won't.
Example:
scala> List.range(1, 10000).foldLeft(0)(_ + _)
res1: Int = 49995000
scala> List.range(1, 10000).foldRight(0)(_ + _)
java.lang.StackOverflowError
at scala.List.foldRight(List.scala:1081)
at scala.List.foldRight(List.scala:1081)
at scala.List.foldRight(List.scala:1081)
at scala.List.foldRight(List.scala:1081)
at scala.List.foldRight(List.scala:1081)
at scala.List.foldRight(List.scala:1081)
at scala.List.foldRight(List.scala:1081)
at scala.List.foldRight(List.scala:1081)
at scala.List.foldRig...
My rule of thumb is therefore - for operators that don't have a specific associativity, always use foldLeft
, at least in Scala. Otherwise, go with other advice given in the answers ;).
Solution 4:
It's also worth noting (and I realise this is stating the obvious a bit), in the case of a commutative operator the two are pretty much equivalent. In this situation a foldl might be the better choice:
foldl:
(((1 + 2) + 3) + 4)
can calculate each operation and carry the accumulated value forward
foldr:
(1 + (2 + (3 + 4)))
needs a stack frame to be opened for 1 + ?
and 2 + ?
before calculating 3 + 4
, then it needs to go back and do the calculation for each.
I'm not enough of an expert on functional languages or compiler optimisations to say whether this is will actually make a difference but it certainly seems cleaner to use a foldl with commutative operators.