Wednesday, March 26, 2008

One day a student came to Moon and said: “I understand how to make a better garbage collector. We must keep a reference count of the pointers to each cons.”
Moon patiently told the student the following story:
One day a student came to Moon and said: “I understand how to make a better garbage collector...

—Danny Hillis explains tale-recursion.


Q: What's this?

(lambda (x)
  (make-isomorphism x 'howard))

A: Gur pheevrq Ubjneq vfbzbecuvfz. (rot 13)

Monday, March 10, 2008

Woof!

I've been barking up the wrong tree. Although the calls that implement the bind function are not tail recursive, the final call in the bind function is, so the monadic code is tail recursive in the larger sense (that is, if the monad calls itself or another monadic function, the call will be a tail call).
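
Here's a minimal sketch of that larger sense, assuming a `(value . state)' pair for the intermediate result (the helper names are placeholders, not taken from any of the posts below):

(define (make-result value state) (cons value state))   ; assumed representation
(define (get-value result) (car result))
(define (get-state result) (cdr result))

(define (unit value)
  (lambda (state) (make-result value state)))

(define (bind M f)
  (lambda (state)
    (let* ((intermediate (M state))
           (i-state (get-state intermediate))
           (i-value (get-value intermediate))
           (next    (f i-value)))
      (next i-state))))                  ; the final call is a tail call

(define (fetch-state) (lambda (state) (make-result state state)))
(define (decrement)   (lambda (state) (make-result 'ok (- state 1))))

(define (countdown)
  (bind (fetch-state)
        (lambda (n)
          (if (zero? n)
              (unit 'done)
              (bind (decrement)
                    (lambda (ignored) (countdown)))))))

;; ((countdown) 1000000) => (done . 0) without the stack growing with the
;; count, because each wrapper hands control to the next with a tail call.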

Thursday, March 6, 2008

Total silence

Crickets chirp. Somewhere a faint voice calls out “Freebird!”

Actually, Rebecca Frankel suggested that “you embed a monad in a lazy language to provide ‘local strictness’ and a comonad in a strict language to provide ‘local laziness’.” I don't have my head wrapped around monads securely, so I'm very fuzzy on the notion of comonads. However, I've read that streams (that is, lazy lists, not character streams) are comonads.
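
For what it's worth, here's a rough sketch of what the comonad operations look like for streams built with `delay' and `force' (the definitions are illustrative guesses at the shape, not anything canonical): `extract' pulls out the current element, and `extend' applies a function to the stream and to every one of its tails.

(define-syntax stream-cons
  (syntax-rules ()
    ((_ head tail) (cons head (delay tail)))))

(define (stream-car s) (car s))
(define (stream-cdr s) (force (cdr s)))

;; extract : Stream a -> a
(define (extract s) (stream-car s))

;; extend : (Stream a -> b) -> Stream a -> Stream b
(define (extend f s)
  (stream-cons (f s) (extend f (stream-cdr s))))

;; Example: a "locally lazy" computation over a stream of integers.
(define (integers-from n) (stream-cons n (integers-from (+ n 1))))
(define (sum-of-next-two s)
  (+ (stream-car s) (stream-car (stream-cdr s))))

;; (stream-car (extend sum-of-next-two (integers-from 0)))              => 1
;; (stream-car (stream-cdr (extend sum-of-next-two (integers-from 0)))) => 3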

Wednesday, March 5, 2008

A problem with monads?

It took me a while, but I've finally gotten to the point I wanted to make.

Take a look at the state monad:
(define (bind M f)
  (lambda (ma)                                 ; ma is the incoming state
    (let* ((intermediate (M ma))               ; run the wrapped computation on it
           (i-state (get-state intermediate))  ; the updated state
           (i-value (get-value intermediate))  ; the value it produced
           (next    (f i-value)))              ; f maps that value to the next monadic procedure
      (next i-state))))                        ; run the next procedure on the updated state

We would use this monad to chain together a bunch of procedures that will pass a `state variable' around. The procedures aren't actually `chained' though, they are `nested'. In order to extend the chain, we put a wrapper around the existing chain, catch its return value and feed the appropriate elements through to the next function in the chain. The structure is like a set of Russian dolls where the initial function is the smallest and the subsequent functions wrap around it.

When we run this monad, the first thing we have to do is invoke the next inner monad. Of course this monad has to invoke its next inner monad, etc. etc. until we get to the innermost one. The problem is this: each invocation is consuming resources that cannot be released until the inner invocations finish. So before we even start executing the first chained procedure we are allocating resources for every chained procedure that comes after it. A sequence of chained procedures will allocate the stack frame for the last procedure first, and this frame will be retained until the entire monadic computation is complete. If a sequence has, say, twenty chained procedures, we'll push twenty stack frames as the very first operation.
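
To make the Russian-doll picture concrete, here's a sketch that uses the bind above plus hypothetical helpers for the parts not shown (the pair accessors, a `unit' constructor, and three trivial steps):

(define (make-result value state) (cons value state))
(define (get-value result) (car result))
(define (get-state result) (cdr result))
(define (unit value) (lambda (state) (make-result value state)))

;; Each step records its tag in the state and passes the tag along as the value.
(define (step tag)
  (lambda (previous-value)
    (lambda (state) (make-result tag (cons tag state)))))

;; Three chained steps: the first step ('a) is innermost, the last ('c) is outermost.
(define chained
  (bind (bind (bind (unit 'start) (step 'a)) (step 'b)) (step 'c)))

;; (chained '()) => (c c b a), i.e. the value c paired with the state (c b a).
;; Invoking `chained' enters the wrapper for step 'c first; that frame stays
;; live while it invokes the 'b wrapper, which stays live while it invokes the
;; 'a wrapper, so the frames go on the stack last-step-first.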

The state monad is not tail recursive.

I suppose this is a trivial point for some people, but I find this to be remarkable. The lack of tail recursion would place limits on the number of procedures that could be chained together in this way. It is unlikely a person would write a program that exceeds the limit and overflows the stack, but it is not at all impossible for a meta-computation to do so. (Imagine an x86 emulator that used the state monad and chained together the procedures that implemented the instruction set.) What I find remarkable is that there appears to be no underlying logical reason why this isn't tail recursive. The chain of procedures is obviously linear and the resources for the final element in the chain aren't really used until that element is computed, yet they are the first to be allocated.

I have some notions about why this isn't tail recursive, but I'm not enough of an expert in monads or category theory to be sure, yet. I'm hoping that someone who is more of an expert can validate or refute the problem. I think the answer is one of these:
  1. Yes, it is a problem and
    • tail-recursion is just incompatible with monads, or
    • monads are so cool that we're willing to live with the problem, or
    • only Haskell people use monads in real life, and in the theoretical realm you can have as much stack as you can imagine
  2. No, it isn't a problem because you can simply rewrite the state monad like this ...
  3. No, it isn't a problem, just wrap your code in the `tail recursion' monad and magic will happen.
  4. You're an idiot because this obvious thing you clearly didn't understand fixes it.
Any opinions or ideas?

Tuesday, March 4, 2008

Functions do not return values

Ok, let me start pulling some of these disparate threads together.

Do functions return values?

There are basically three ways a function can finish its computation:
  1. Returning a literal constant.
  2. Returning the value of a variable.
  3. Calling another function.
As I mentioned last time, when we call another function to compute the return value, we are actually delegating the computation to the other function. The caller doesn't need to hang around to simply pass back the value, it can free its resources and just transfer control to the callee.
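
For example (the names here are invented for illustration):

;; `report-length' finishes by calling `describe'; it doesn't need to regain
;; control just to pass the answer back, so its frame can be released at the call.
(define (describe n)
  (if (< n 10) 'short 'long))

(define (report-length items)
  (describe (length items)))      ; tail call: the delegation in case 3

;; (report-length '(a b c)) => short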

But obviously, a function like
(define (foo) 3)

has to return the value 3. Well, not really. The interpreter or compiler could treat literal constants and variables as calls to the identity function. We'd never be able to tell the difference. Without loss of generality, we can assume that this is exactly what happens. So every user-defined function finishes its computation by explicit delegation to another function or implicit delegation to the identity function.
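
In other words, we can imagine the rewrite (purely hypothetical; no particular compiler is claimed to do this):

;; Same observable behavior as (define (foo) 3), assuming an `identity'
;; procedure such as the one defined just below.
(define (foo) (identity 3))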

But surely the identity function would have to return! Well, not really.
(define (identity x)
  (call/cc (lambda (k) (k x))))
Identity can delegate to call/cc. call/cc delegates to its argument procedure, and that delegates to the captured continuation.

My point isn't to amuse and amaze with stupid continuation tricks, however. My point is that all control transfer, be it a function call or a return, can be uniformly modeled as a function call. This isn't news. People have been doing this for years. Henry Baker took it to the extreme when he suggested that this could be an effective technique for implementing Scheme in C. Felix Winkelmann's Chicken compiler uses this exact technique. The underlying implementation never returns a value. When the stack is exhausted, the live data is evacuated and the stack pointer is reset.
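
As an illustration of that uniform view (an invented example, not Baker's or Chicken's actual machinery), here is factorial in continuation-passing style; no procedure ever returns a value to its caller, it always hands the result to a continuation by making another call:

(define (factorial-cps n k)
  (if (zero? n)
      (k 1)                                        ; "returning" = calling k
      (factorial-cps (- n 1)
                     (lambda (accumulated) (k (* n accumulated))))))

;; (factorial-cps 5 (lambda (answer) answer)) => 120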

Next time: a few odd consequences.

Monday, March 3, 2008

Actor Model

Another digression, but still heading in the same general direction.

Sussman and Steele developed Scheme to explore some of the semantics of Hewitt's Actor model. They discovered that the code that implemented message passing and the code that implemented function application were identical (modulo the naming). In other words, a closure can be considered an actor that waits for a message (the arguments) and runs until it computes a value. Message passing is simply invoking a closure on a set of arguments. (This doesn't capture the entire Actor model, but it captures some interesting facets.)
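
A tiny sketch of that reading (the counter example is invented, not from the Actor literature):

;; The closure waits for a message (its arguments), does its work, and
;; computes a value; "sending a message" is just applying the closure.
(define (make-counter-actor)
  (let ((count 0))
    (lambda (message)
      (case message
        ((increment!) (set! count (+ count 1)) count)
        ((value)      count)
        (else         (error "Unknown message:" message))))))

(define counter (make-counter-actor))

;; (counter 'increment!) => 1
;; (counter 'increment!) => 2
;; (counter 'value)      => 2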

A crucial part of the Actor Model is the ability to delegate computation. Any Actor can foist off part or all of a computation onto another Actor. If an Actor simply redirects a message to a different Actor (after possibly munging the payload of the message), it is important that it free any temporary resources it was using. There is no logical limit on the length of a delegation chain, so there shouldn't be a physical one if it isn't necessary.

In Scheme, this translates into the requirement for general tail recursion. If a Scheme function, call it `Foo', delegates its computation to another function `Bar', there is no need to return control to `Foo' if all it is going to do is pass the return value on up the stack. The temporary resources used by `Foo' can be released when control passes to `Bar'. If `Bar' were to delegate to `Baz', then `Bar' can release its temporary resources when it passes control. This chaining of delegation can be indefinitely long. Stack overflow is not an issue because we free the stack frames as soon as we no longer need them.
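
Sketched in code (the helper names are invented for the example):

;; Each delegation is a tail call, so a properly tail-recursive Scheme
;; releases the caller's frame as control passes to the callee.
(define (preprocess request) (cons 'preprocessed request))
(define (annotate request)   (cons 'annotated request))

(define (baz request) (list 'handled request))
(define (bar request) (baz (annotate request)))    ; Bar delegates to Baz
(define (foo request) (bar (preprocess request)))  ; Foo delegates to Bar

;; (foo 'ping) => (handled (annotated preprocessed . ping))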

Languages that do not support general tail recursion cannot support this kind of indefinite delegation: the delegation chain must be bounded.


To tie this in with the post from a few days ago:

Languages with general tail recursion can run any program that can run without tail recursion. However, there is a class of programs that can run only if general tail recursion can be depended upon. Included in this class are continuation-passing style and indefinite delegation. (CPS can be considered a subset of indefinite delegation if you wish). You cannot run these styles of programs without resorting to whole-program analysis and restructuring.
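
A minimal example of a program in that class is the classic mutually recursive even/odd pair; it delegates back and forth indefinitely and runs in bounded space only if tail calls are handled properly:

(define (my-even? n)
  (if (zero? n) #t (my-odd? (- n 1))))

(define (my-odd? n)
  (if (zero? n) #f (my-even? (- n 1))))

;; (my-even? 10000000) => #t in constant stack space with general tail
;; recursion; without it, the delegation chain is bounded by the stack.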