Why are difference lists more efficient than regular concatenation in Haskell?

I am currently working my way through the online book Learn You a Haskell, and have come to a chapter where the author explains that some list concatenations can be inefficient. For example,

((((a ++ b) ++ c) ++ d) ++ e) ++ f

is supposedly inefficient. The solution the author comes up with is to use 'difference lists' defined as

newtype DiffList a = DiffList {getDiffList :: [a] -> [a] }

instance Monoid (DiffList a) where
    mempty = DiffList (\xs -> [] ++ xs)
    (DiffList f) `mappend` (DiffList g) = DiffList (\xs -> f (g xs))

I am struggling to understand why DiffList is more computationally efficient than a simple concatenation in some cases. Could someone explain to me in simple terms why the above example is so inefficient, and in what way the DiffList solves this problem?


The problem in

((((a ++ b) ++ c) ++ d) ++ e) ++ f

is the nesting. The applications of (++) are left-nested, and that's bad; right-nesting

a ++ (b ++ (c ++ (d ++ (e ++ f))))

would not be a problem. That is because (++) is defined as

[] ++ ys = ys
(x:xs) ++ ys = x : (xs ++ ys)

so to find which equation to use, the implementation must dive into the expression tree

             (++)
             /  \
          (++)   f
          /  \
       (++)   e
       /  \
    (++)   d
    /  \
 (++)   c
 /  \
a    b

until it finds out whether the left operand is empty or not. If it's not empty, its head is taken and bubbled to the top, but the tail of the left operand is left untouched, so when the next element of the concatenation is demanded, the same procedure starts again.

When the concatenations are right-nested, the left operand of (++) is always at the top, and checking for emptiness/bubbling up the head are O(1).

But when the concatenations are left-nested n layers deep, n nodes of the tree must be traversed to reach the first element, and that happens again for each element of the result coming from the first list (n-1 nodes for those coming from the second list, and so on).

Let us consider a = "hello" in

hi = ((((a ++ b) ++ c) ++ d) ++ e) ++ f

and we want to evaluate take 5 hi. So first, it must be checked whether

(((a ++ b) ++ c) ++ d) ++ e

is empty. For that, it must be checked whether

((a ++ b) ++ c) ++ d

is empty. For that, it must be checked whether

(a ++ b) ++ c

is empty. For that, it must be checked whether

a ++ b

is empty. For that, it must be checked whether

a

is empty. Phew. It isn't, so we can bubble up again, assembling

a ++ b                             = 'h':("ello" ++ b)
(a ++ b) ++ c                      = 'h':(("ello" ++ b) ++ c)
((a ++ b) ++ c) ++ d               = 'h':((("ello" ++ b) ++ c) ++ d)
(((a ++ b) ++ c) ++ d) ++ e        = 'h':(((("ello" ++ b) ++ c) ++ d) ++ e)
((((a ++ b) ++ c) ++ d) ++ e) ++ f = 'h':((((("ello" ++ b) ++ c) ++ d) ++ e) ++ f)

and for the 'e', we must repeat, and for the 'l's too...

Drawing a part of the tree, the bubbling up goes like this:

            (++)
            /  \
         (++)   c
         /  \
'h':"ello"   b

becomes first

     (++)
     /  \
   (:)   c
  /   \
'h'   (++)
      /  \
 "ello"   b

and then

      (:)
      / \
    'h' (++)
        /  \
     (++)   c
     /  \
"ello"   b

all the way back to the top. The tree that finally becomes the right child of the top-level (:) has exactly the same structure as the original tree, unless the leftmost list is empty, in which case the

 (++)
 /  \
[]   b

node is collapsed to just b.
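
The descent and bubbling-up can be made concrete with a toy model. This is only a sketch and not how GHC actually represents thunks: Expr, Leaf, App, step and headCost are names made up for illustration, and "cost" simply counts the tree nodes visited to produce one element.

```haskell
-- Toy model of an unevaluated tree of (++) applications.
-- All names here are illustrative; none of them come from a library.
data Expr a
  = Leaf [a]                -- a plain list
  | App (Expr a) (Expr a)   -- an unevaluated (l ++ r)

-- Produce the first element (if any) and the remaining expression,
-- together with the number of nodes visited on the way.
step :: Expr a -> (Int, Maybe (a, Expr a))
step (Leaf [])       = (1, Nothing)
step (Leaf (x : xs)) = (1, Just (x, Leaf xs))
step (App l r) =
  case step l of
    -- Left operand is empty: the node collapses to its right operand.
    (n, Nothing)      -> let (m, res) = step r in (n + m + 1, res)
    -- Left operand is non-empty: bubble the head up, keep the tree shape.
    (n, Just (x, l')) -> (n + 1, Just (x, App l' r))

-- Both builders assume a non-empty list of lists (foldl1/foldr1 are partial).
leftNested, rightNested :: [[a]] -> Expr a
leftNested  = foldl1 App . map Leaf   -- ((a ++ b) ++ c) ++ ...
rightNested = foldr1 App . map Leaf   -- a ++ (b ++ (c ++ ...))

-- Nodes visited to obtain the first element.
headCost :: Expr a -> Int
headCost = fst . step
```

In this model, six non-empty lists give headCost 6 for the left-nested tree and 2 for the right-nested one, and the left-nested cost stays at 6 for the next element because the remaining tree has the same shape.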

So if you have left-nested concatenations of short lists, the concatenation becomes quadratic, because getting the head of the concatenation is an O(nesting-depth) operation. In general, the concatenation of a left-nested

((...((a_d ++ a_{d-1}) ++ a_{d-2}) ...) ++ a_2) ++ a_1

is O(sum [i * length a_i | i <- [1 .. d]]) to evaluate fully.
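
To get a feel for the numbers, here is a small cost model. It is an assumption-laden sketch, not a benchmark: it charges one unit per cons cell that xs ++ ys rebuilds (one per element of xs, none for ys), and the function names are invented for this example.

```haskell
import Data.List (foldl')

-- Cost model (an assumption, not a measurement): fully evaluating xs ++ ys
-- rebuilds one cons cell per element of xs and none for ys.

-- ((l1 ++ l2) ++ l3) ++ ... : every append re-walks everything built so far.
leftNestedCost :: [[a]] -> Int
leftNestedCost []       = 0
leftNestedCost (l : ls) = snd (foldl' walk (length l, 0) ls)
  where
    -- State: (length of the accumulated prefix, cells rebuilt so far).
    walk (prefixLen, cost) next = (prefixLen + length next, cost + prefixLen)

-- l1 ++ (l2 ++ (l3 ++ ...)) : each list except the last is walked once.
rightNestedCost :: [[a]] -> Int
rightNestedCost [] = 0
rightNestedCost ls = sum (map length (init ls))
```

For six lists of five elements each, this model gives 75 for the left-nested form and 25 for the right-nested one. For 100 singleton lists it gives 4950 versus 99, which is the quadratic versus linear behaviour described above.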

With difference lists (sans the newtype wrapper for simplicity of exposition), it's not important whether the compositions are left-nested

((((a ++) . (b ++)) . (c ++)) . (d ++)) . (e ++)

or right-nested. Once you have traversed the nesting to reach the (a ++), that (++) is hoisted to the top of the expression tree, so getting at each element of a is again O(1).

In fact, the whole composition is reassociated with difference lists, as soon as you require the first element,

((((a ++) . (b ++)) . (c ++)) . (d ++)) . (e ++) $ f

becomes

((((a ++) . (b ++)) . (c ++)) . (d ++)) $ (e ++) f
(((a ++) . (b ++)) . (c ++)) $ (d ++) ((e ++) f)
((a ++) . (b ++)) $ (c ++) ((d ++) ((e ++) f))
(a ++) $ (b ++) ((c ++) ((d ++) ((e ++) f)))
a ++ (b ++ (c ++ (d ++ (e ++ f))))

and after that, each list is the immediate left operand of the top-level (++) after the preceding list has been consumed.
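
As a quick check of that rewriting, the following sketch uses some arbitrary example strings (the values of a to f are made up) and compares the left-nested composition with the right-nested concatenation it reduces to. The last definition illustrates that (a ++) yields the elements of a without looking at its argument.

```haskell
a, b, c, d, e, f :: String
a = "hello"
b = ", "
c = "diff"
d = "erence"
e = " "
f = "lists"

-- Left-nested composition of prepending functions, applied to the final list.
composed :: String
composed = ((((a ++) . (b ++)) . (c ++)) . (d ++)) . (e ++) $ f

-- The right-nested concatenation it reassociates to.
rightNested :: String
rightNested = a ++ (b ++ (c ++ (d ++ (e ++ f))))

-- (a ++) never inspects its argument, so the elements of a are available
-- even when the final list cannot be evaluated at all.
firstFive :: String
firstFive = take 5 (((a ++) . (b ++)) . (c ++) $ undefined)
```

Here composed and rightNested are the same string, and firstFive evaluates to "hello" without ever touching undefined.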

The important thing in that is that the prepending function (a ++) can start producing its result without inspecting its argument, so that the reassociation from

             ($)
             / \
           (.)  f
           / \
         (.) (e ++)
         / \
       (.) (d ++)
       / \
     (.) (c ++)
     / \
(a ++) (b ++)

via

           ($)---------
           /           \
         (.)           ($)
         / \           / \
       (.) (d ++) (e ++)  f
       / \
     (.) (c ++)
     / \
(a ++) (b ++)

to

     ($)
     / \
(a ++) ($)
       / \
  (b ++) ($)
         / \
    (c ++) ($)
           / \
      (d ++) ($)
             / \
        (e ++)  f

doesn't need to know anything about the composed functions or the final list f, so it's just an O(depth) rewriting. Then the top-level

     ($)
     / \
(a ++)  stuff

becomes

 (++)
 /  \
a    stuff

and all elements of a can be obtained in one step. In this example, where we had pure left-nesting, only one rewriting is necessary. If instead of (for example) (d ++) the function in that place had been a left-nested composition, (((g ++) . (h ++)) . (i ++)) . (j ++), the top-level reassociation would leave that untouched and this would be reassociated when it becomes the left operand of the top-level ($) after all previous lists have been consumed.

The total work needed for all reassociations is O(number of lists), so the overall cost for the concatenation is O(number of lists + sum (map length lists)). (That means you can bring bad performance to this too, by inserting a lot of deeply left-nested ([] ++).)
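
A minimal sketch of that, assuming we deliberately build the composition left-nested with foldl (the worst shape for plain (++)). The names build, concatViaDL and concatLeftNested are invented for this example.

```haskell
-- Compose prepending functions left-nested:
-- (((id . (l1 ++)) . (l2 ++)) . (l3 ++)) ...
build :: [[a]] -> ([a] -> [a])
build = foldl (\acc l -> acc . (l ++)) id

-- Apply the composed function to [] to get an ordinary list back.
concatViaDL :: [[a]] -> [a]
concatViaDL ls = build ls []

-- The same nesting with plain lists, for comparison:
-- ((([] ++ l1) ++ l2) ++ l3) ...
concatLeftNested :: [[a]] -> [a]
concatLeftNested = foldl (++) []
```

Both functions return the same list as concat. The difference is only in cost: concatLeftNested re-walks the accumulated prefix on every append, while concatViaDL pays for one reassociation and then walks each list once.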

The

newtype DiffList a = DiffList {getDiffList :: [a] -> [a] }

instance Monoid (DiffList a) where
    mempty = DiffList (\xs -> [] ++ xs)
    (DiffList f) `mappend` (DiffList g) = DiffList (\xs -> f (g xs))

just wraps that so that it is more convenient to handle abstractly.

DiffList (a ++) `mappend` DiffList (b ++) ~> DiffList ((a ++) . (b++))
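
One practical note: since base 4.11 (GHC 8.4), Semigroup is a superclass of Monoid, so the instance as quoted needs a Semigroup instance to compile on current compilers. A sketch of an equivalent modern version follows; the helpers toDiffList and fromDiffList are assumed names for converting to and from ordinary lists.

```haskell
newtype DiffList a = DiffList { getDiffList :: [a] -> [a] }

-- Since base 4.11 (GHC 8.4), Semigroup is a superclass of Monoid,
-- so the composition lives in (<>).
instance Semigroup (DiffList a) where
  DiffList f <> DiffList g = DiffList (f . g)

-- mappend defaults to (<>); the empty difference list prepends nothing.
instance Monoid (DiffList a) where
  mempty = DiffList id

-- Assumed helper: wrap a list as "prepend this list".
toDiffList :: [a] -> DiffList a
toDiffList xs = DiffList (xs ++)

-- Assumed helper: apply to [] to recover an ordinary list.
fromDiffList :: DiffList a -> [a]
fromDiffList (DiffList f) = f []
```

With these, fromDiffList (mconcat (map toDiffList ["ab", "cd", "ef"])) gives "abcdef".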

Note that it is only efficient for functions that don't need to inspect their argument to start producing output. If arbitrary functions are wrapped in DiffLists, you have no such efficiency guarantees. In particular, appending ((++ a), wrapped or not) can create left-nested trees of (++) when composed right-nested, so you can recreate the O(n²) concatenation behaviour that way if the DiffList constructor is exposed.


It might help to look at the definition of concatenation:

[]     ++ ys = ys
(x:xs) ++ ys = x : (xs ++ ys)

As you can see, in order to concatenate two lists you need to go over the left list and create a "copy" of it, just so you can change its end (this is because you can't directly change the end of the old list, due to immutability).

If you do your concatenations in the right associative way, there is no problem. Once inserted, a list will never have to be touched again (notice how ++'s definition never inspects the list on the right) so each list element is only inserted "once" for a total time of O(N).

-- This is O(n)
(a ++ (b ++ (c ++ (d ++ (e ++ f)))))

However, if you do concatenation in a left associative way, then the "current" list will have to be "torn down" and "rebuilt" every time you add another list fragment to the end. Each list element will be iterated over when it's inserted and again whenever future fragments are appended! It's like the problem you get in C if you naïvely call strcat multiple times in a row.


As for difference lists, the trick is that they kind of keep an explicit "hole" at the end. When you convert a DList back to a normal list you pass it what you want to be in the hole and it will be ready to go. Normal lists, on the other hand, always plug up the hole in the end with [] so if you want to change it (when concatenating) then you need to rip open the list to get to that point.

The definition of difference lists with functions can look intimidating at first, but it's actually pretty simple. You can view them from an Object Oriented point of view by thinking of them as opaque objects with a "toList" method that receives the list to insert in the hole at the end and returns the DL's internal prefix plus the tail that was provided. It's efficient because you only plug the "hole" at the end, once you are done combining everything.
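
Here is a rough sketch of that "hole" view, without any newtype wrapper. The names DL, fromList, append, toList and greeting are chosen for this example and are not taken from a particular library.

```haskell
-- A list with a "hole" at the end: give it the tail, get the whole list.
type DL a = [a] -> [a]

-- The prefix xs, with the hole left open.
fromList :: [a] -> DL a
fromList xs = \hole -> xs ++ hole

-- Appending only composes functions; no list is walked yet.
append :: DL a -> DL a -> DL a
append = (.)

-- Plug the hole with [] once, at the very end.
toList :: DL a -> [a]
toList dl = dl []

-- Backtick operators are left-associative by default, so this is left-nested.
greeting :: DL Char
greeting = fromList "hello" `append` fromList ", " `append` fromList "world"
```

toList greeting gives "hello, world", and because the hole is still open until you plug it, greeting "!" gives "hello, world!".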