Understanding how recursive functions work

1. The function is called recursively until a condition is met. That condition is a > b. When this condition is met, return 0. At first glance, I would expect the return value to be 0, which is obviously incorrect.

Here is what the computer computing sumInts(2,5) would think if it were able to:

I want to compute sumInts(2, 5)
for this, I need to compute sumInts(3, 5)
and add 2 to the result.
  I want to compute sumInts(3, 5)
  for this, I need to compute sumInts(4, 5)
  and add 3 to the result.
    I want to compute sumInts(4, 5)
    for this, I need to compute sumInts(5, 5)
    and add 4 to the result.
      I want to compute sumInts(5, 5)
      for this, I need to compute sumInts(6, 5)
      and add 5 to the result.
        I want to compute sumInts(6, 5)
        since 6 > 5, this is zero.
      The computation yielded 0, therefore I shall return 5 = 5 + 0.
    The computation yielded 5, therefore I shall return 9 = 4 + 5.
  The computation yielded 9, therefore I shall return 12 = 3 + 9.
The computation yielded 12, therefore I shall return 14 = 2 + 12.

As you see, one call to the function sumInts does return 0. However, this is not the final value, because the computer still has to add 5 to that 0, then 4 to the result, then 3, then 2, as described by the last four sentences of our computer's thoughts. Note that in a recursion the computer not only has to compute the recursive call, it also has to remember what to do with the value the recursive call returns. There is a special area of the computer's memory called the stack where this kind of information is saved. This space is limited, and functions that recurse too deeply can exhaust it: this is the stack overflow that gives its name to our most loved website.

Your statement seems to make the implicit assumption that the computer forgets where it was when it makes a recursive call. It does not, which is why your conclusion does not match your observation.
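The computer's "thoughts" can be reproduced with a traced variant of the function. This is only a sketch: the question's full function is not shown here, so the body is reconstructed from the a > b condition and the recursive return line, and the name sumIntsTraced and the depth parameter are hypothetical additions used purely for indentation.

```swift
// Hypothetical traced variant of sumInts; `depth` exists only to indent the output.
func sumIntsTraced(a: Int, b: Int, depth: Int = 0) -> Int {
    let indent = String(repeating: "  ", count: depth)
    print("\(indent)I want to compute sumInts(\(a), \(b))")
    if a > b {
        // Base case: nothing left to add.
        print("\(indent)since \(a) > \(b), this is zero.")
        return 0
    }
    // This call is suspended here, remembering that it still has to add `a`.
    let rest = sumIntsTraced(a: a + 1, b: b, depth: depth + 1)
    let result = a + rest
    print("\(indent)The computation yielded \(rest), therefore I return \(result) = \(a) + \(rest).")
    return result
}

let tracedTotal = sumIntsTraced(a: 2, b: 5)
print(tracedTotal)
```

Each level of indentation corresponds to one call that is still waiting on the stack for the call below it.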

2. Printing out the value of 'a' on each iteration yields a value which I would expect: 2, 3, 4, 5 (at which point 5+1 > b which meets the first condition: a > b) but I still don't see how the value of 14 is achieved.

This is because the return value is not a itself but the sum of the value of a and the value returned by the recursive call.
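To see the difference between the values of a that get printed and the value that gets returned, here is a small sketch. It assumes the function looks roughly like this (a reconstruction written with current Swift argument labels), and the seen array is a hypothetical stand-in for the question's print statement.

```swift
// Hypothetical log of every value of `a` that reaches the recursive branch.
var seen: [Int] = []

func sumInts(a: Int, b: Int) -> Int {
    if a > b {
        return 0
    }
    seen.append(a)  // stand-in for printing `a`
    // The return value is `a` plus whatever the next call returns, not `a` alone.
    return a + sumInts(a: a + 1, b: b)
}

let result = sumInts(a: 2, b: 5)
print(seen)    // the values of `a` that were observed
print(result)  // the sum of those values
```

The observed values are 2, 3, 4, 5, while the returned value is their sum, 14.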

I think the confusion stems from thinking of it as "the same function" being called many times. If you think of it as "many copies of the same function being called", then it may be clearer:

Only one copy of the function ever returns 0, and it's not the first one (it's the last one). So the result of calling the first one is not 0.

For the second bit of confusion, I think it will be easier to spell out the recursion in English. Read this line:

return a + sumInts(a + 1, b: b)

as "return the value of 'a' plus (the return value of another copy of the function, which is the copy's value of 'a' plus (the return value of another copy of the function, which is the second copy's value of 'a' plus (...", with each copy of the function spawning a new copy of itself with a increased by 1, until the a > b condition is met.

By the time you reach the a > b condition being true, you have a (potentially arbitrarily) long stack of copies of the function all in the middle of being run, all waiting on the result of the next copy to find out what they should add to 'a'.

(edit: also, something to be aware of is that the stack of copies of the function I mention is a real thing that takes up real memory, and will crash your program if it gets too large. The compiler can optimize it out in some cases, but exhausting stack space is a significant and unfortunate limitation of recursive functions in many languages.)
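For comparison, the same sum can be written as a loop, which keeps a single running total instead of a stack of waiting copies. This is a swapped-in technique, not the recursive approach from the question, and the name sumIntsIterative is hypothetical.

```swift
// Hypothetical loop-based equivalent: no call waits on another, so the stack does not grow.
func sumIntsIterative(a: Int, b: Int) -> Int {
    // Mirror the recursive base case: an empty range sums to 0.
    guard a <= b else { return 0 }
    var total = 0
    for value in a...b {
        total += value
    }
    return total
}

let iterativeTotal = sumIntsIterative(a: 2, b: 5)
print(iterativeTotal)
```

For a range as small as 2 to 5 the recursive form is perfectly fine; the loop only matters when the range is large enough for the depth of the recursion to become a concern.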

To understand recursion you must think of the problem in a different way. Instead of a long logical sequence of steps that makes sense as a whole, you take a large problem, break it up into smaller problems, and solve those. Once you have answers for the subproblems, you combine them to make the solution to the bigger problem. Think of you and your friends needing to count the marbles in a huge bucket. You each take a smaller bucket and count it individually, and when you are done you add the totals together. Now, if each of you finds some friends and splits the buckets further, you just need to wait for those friends to figure out their totals and bring them back to you, and then you add them up. And so on. The special case is when you get only 1 marble to count: you just hand it back and say 1. Let the people above you do the adding; you are done.

You must remember that every time the function calls itself recursively, it creates a new context with a subset of the problem. Once that part is resolved, its result is returned so that the previous call can complete.
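The marble analogy can be sketched in code as well. Everything here is an illustrative assumption: the bucket is modeled as an array, countMarbles is a hypothetical name, and splitting in half is just one way to hand smaller buckets to "friends".

```swift
// Hypothetical model: a bucket is a slice of an array, one element per marble.
func countMarbles(_ bucket: ArraySlice<Int>) -> Int {
    // Special case: with at most one marble there is nothing to split.
    if bucket.count <= 1 {
        return bucket.count
    }
    // Split the bucket in two and let two "friends" count the halves.
    let mid = bucket.startIndex + bucket.count / 2
    let left = countMarbles(bucket[bucket.startIndex..<mid])
    let right = countMarbles(bucket[mid..<bucket.endIndex])
    // Combine the sub-results into the answer for this bucket.
    return left + right
}

let marbles = Array(repeating: 1, count: 37)
let marbleTotal = countMarbles(marbles[...])
print(marbleTotal)
```

Each call works on its own smaller bucket and only adds up what comes back, which is the same shape as sumInts adding a to the result of the next call.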

Let me show you the steps:

sumInts(a: 2, b: 5) will return: 2 + sumInts(a: 3, b: 5)
sumInts(a: 3, b: 5) will return: 3 + sumInts(a: 4, b: 5)
sumInts(a: 4, b: 5) will return: 4 + sumInts(a: 5, b: 5)
sumInts(a: 5, b: 5) will return: 5 + sumInts(a: 6, b: 5)
sumInts(a: 6, b: 5) will return: 0

Once sumInts(a: 6, b: 5) has executed, the results can be computed, so going back up the chain you get:

 sumInts(a: 6, b: 5) = 0
 sumInts(a: 5, b: 5) = 5 + 0 = 5
 sumInts(a: 4, b: 5) = 4 + 5 = 9
 sumInts(a: 3, b: 5) = 3 + 9 = 12
 sumInts(a: 2, b: 5) = 2 + 12 = 14.

Another way to represent the structure of the recursion:

 sumInts(a: 2, b: 5) = 2 + sumInts(a: 3, b: 5)
 sumInts(a: 2, b: 5) = 2 + 3 + sumInts(a: 4, b: 5)  
 sumInts(a: 2, b: 5) = 2 + 3 + 4 + sumInts(a: 5, b: 5)  
 sumInts(a: 2, b: 5) = 2 + 3 + 4 + 5 + sumInts(a: 6, b: 5)
 sumInts(a: 2, b: 5) = 2 + 3 + 4 + 5 + 0
 sumInts(a: 2, b: 5) = 14
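The flattened expansion above can also be produced by a recursive function that collects the terms instead of adding them right away. This is only an illustration; expansionTerms is a hypothetical helper, not part of the original code.

```swift
// Hypothetical helper: returns the terms of the fully expanded sum, including the final 0.
func expansionTerms(a: Int, b: Int) -> [Int] {
    if a > b {
        return [0]  // the base case contributes the trailing 0
    }
    // Same shape as sumInts, but concatenating instead of adding.
    return [a] + expansionTerms(a: a + 1, b: b)
}

let terms = expansionTerms(a: 2, b: 5)
let expansion = terms.map { String($0) }.joined(separator: " + ")
print(expansion, "=", terms.reduce(0, +))  // prints "2 + 3 + 4 + 5 + 0 = 14"
```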
