DEV Community

mari tang
mari tang

Posted on

SICP 1.2 - Ackermann's Function and Contemplating Infinity

Let's talk about Ackermann's function. It looks something like this

(define (A x y)
  (cond ((= y 0) 0)
        ((= x 0) (* 2 y))
        ((= y 1) 2)
        (else (A (- x 1)
                 (A x (- y 1))))))
Enter fullscreen mode Exit fullscreen mode

to describe its behavior:

  • if y is 0, return 0
  • if x is 0, return 2*y
  • if y is 1, return 2
  • otherwise, return Ackemrann's function with x = x-1 and y = Ackermann(x, y-1)

In order to evaluate it at anything besides the base cases, we're actually recursively evaluating Ackermann's function to even get the second argument. Working it out with arguments as low as (A 2 4) leads to huge numbers. (A 2 4) results in 65536, and larger numbers such as (A 4 4) are theoretically computable, but not within our lifetimes, at least with current hardware.

How do we start to deal with numbers like that?

SICP has us manually compute (A 1 10), (A 2 4), and (A 3 3), which should be enough to convince you that we're dealing with unfathomably large numbers.

I started to see some interesting patterns. (A 1 10) starts unrolling like this:

(A 1 10)
(A 0 (A 1 9))
(A 0 (A 0 (A 1 8)))
(A 0 (A 0 (A 0 (A 1 7))))
....
(A 0 (A 0 (A 0 (A 0 (A 0 (A 0 (A 0 (A 0 (A 0 (A 1 1)))))))))) 
Enter fullscreen mode Exit fullscreen mode

Once we hit the (A 1 1) we've finally got a base case, and it returns 2 to the parent context. (A 0 2) then evaluates to (* 2 2), which will kick one down to (A 0 4), and so on.

If the 10 just unrolls the function 10 layers deep until it finally results in a 2, and each layer is resolved by multiplying the value returned from the next layer by 2, what we've actually got is (A 1 10) = 2^10, which we can generalize as (A 1 y) = 2^y.

That's a good step, and it explains the behavior of the function for certain values, but let's dig a bit deeper. Can we do the same thing for other values of x? Let's look at the case of (A 0 y), and plug in some values so we can see what happens.

(A 0 1) => 2
(A 0 2) => 4
(A 0 3) => 6
(A 0 4) => 8

that looks like a 2*y to me!

Here's the part that blew my mind: how do we prove it?

Well, functions in comp sci are not entirely unlike functions in math. We can actually just plug in what we know- we can do partial applications of arguments and see how much we can work out. if we want to find out what (A 0 y) is, we can actually just try to evaluate it!

Let's take the body of our function and substitute 0 for x and y for y.

(cond ((= y 0) 0)
      ((= 0 0) (* 2 y))
      ((= y 1) 2)
      (else (A (- 0 1)
               (A 0 (- y 1)))))
Enter fullscreen mode Exit fullscreen mode

it looks like the second conditional is being hit. 0 == 0, after all. so what does that return? (* 2 y), exactly as desired.

Let's scale it up. what if we work with (A 1 y)?

(cond ((= y 0) 0)
      ((= 1 0) (* 2 y))
      ((= y 1) 2)
      (else (A (- 1 1)
               (A 1 (- y 1)))))
Enter fullscreen mode Exit fullscreen mode

what we get is:

(A (- 1 1) (A 1 (- y 1))))
Enter fullscreen mode Exit fullscreen mode

which becomes:

(A 0 (A 1 (- y 1))))
Enter fullscreen mode Exit fullscreen mode

Now, if (A 0 y) = 2*y, we can substitute 2*y for (A 0 y)

(* 2 (A 1 (- y 1)))
Enter fullscreen mode Exit fullscreen mode

Okay, but we still have to get the second argument by evaluating (A 1 (- y 1))

(* 2 (A (- 1 1)
        (A 1 (- 1 (- y 1)))))
Enter fullscreen mode Exit fullscreen mode

hmmmmmm

(* 2 (A 0 (A 1 (- y 2))))
Enter fullscreen mode Exit fullscreen mode

hmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm

(* 2 (* 2 (A 1 (- y 2))))
Enter fullscreen mode Exit fullscreen mode

okay, so we've seen how this unfolds, right? Eventually we'll get to ...(A 1(- y y-1)), which is ...(A 1 1). Let's see what that looks like:

(cond ((= 1 0) 0)
      ((= 1 0) (* 2 y))
      ((= 1 1) 2)
      (else (A (- 1 1)
               (A 1 (- 1 1)))))
Enter fullscreen mode Exit fullscreen mode
((= 1 1) 2)
Enter fullscreen mode Exit fullscreen mode

well, 1 == 1, no?

...(A 0 (A 0 (A 1 2)))
Enter fullscreen mode Exit fullscreen mode

which becomes

...(A 0 (A 0 2))
Enter fullscreen mode Exit fullscreen mode

and finally

...(A 0 (* 2 2))
Enter fullscreen mode Exit fullscreen mode

Each of the (A 0 ...) becomes 2 * (whatever the rest is). What do we call it when we multiply by the same number a bunch of times? exponentiation! we're doing n*2 y times, so that's 2^y!

You can scale up to (A 2 y), which ends up being the next order higher, raising 2^2^2... y times.

so why all of this? Ackermann's function does a few interesting things. each X is one level "up" from the previous (constant -> multiplication -> exponentiation -> exponentiation of exponents), which i find abstractly really fascinating.

On more practical levels, it gives us insight into how we can understand super-exponential growth through recursion, how we can think about a function that grows at a nearly unthinkable rate.

Top comments (0)