Memoization using Closures

I wrote a little article about how to use closures in Python a couple weeks ago, but I’m not convinced that I really gave a good motivation for why we might want to use them. The example I gave was a little bit simplistic—here’s a better one.

First, we need to talk about memoization (if you’re familiar with
memoization already, feel free to skip along a bit). Suppose we want to write a
function to calculate the n^{th } Fibonacci number. We might first
write a naïve recursive function:

```
def fib(n):
if n < 2:
return n
else:
return fib(n - 1) + fib(n - 2)
```

This does the job (assuming we give it appropriate input), but it’s actually
terribly inefficient. Check out the call tree for `fib(5)`

:

We’ve got a few redundant calls here (notably `fib(2)`

and `fib(3)`

), and for
higher values of `n`

they really start adding up. As it turns out, this
Fibonacci implementation is O(2^{n}). Our elegantly simple algorithm is
awful, just awful.

But we can salvage it! If we were to record our calculated values (by closing
over our already-calculated values, say) we could make this thing *phenomenally*
more efficient.

```
def memoized_fib(n):
values = [0, 1]
def fib_helper(k):
if len(values) > k:
result = values[k]
else:
result = fib_helper(k - 1) + fib_helper(k - 2)
values.append(result)
return result
return fib_helper(n)
```

We’re creating a list called `values`

, for which each entry contains the
Fibonacci number associated with its index, starting with the first two. After
we’ve recursed all the way down the first time, we build up the list as we pop
functions off the stack, so subsequent calls to `fib_helper`

don’t really need
to do any recursion at all—they can just look up the appropriate value in
`values`

. The list `values`

is hidden inside a closure, so it’s protected from
the outside world. We certainly could have implemented memoization using a
global variable, but why would we if we we don’t have to? Our closure provides
an equally efficient and more elegant solution to the problem.

Just to drive home the efficiency gain, `memoized_fib(500)`

takes 0.014s to run
on my machine; `fib(500)`

is still calculating an hour later. Which makes sense,
since we’ve replaced an exponential algorithm with a linear one. Feels good.

And, yes, it’s totally true that we could have just calculated the Fibonacci numbers iteratively and avoided the whole fiasco. But what would we have learned?