## Monday, August 20, 2007

### Factor with a dash of curry

Finally, Factor has compiling curry! This is a great development, and it makes designing combinators in Factor much easier. I previously discussed designing combinators, but that post was very confusing and probably too advanced. With curry in full availability, this should be much easier, and the explanation much cleaner.

Unfortunately, Factor's `curry` has nothing to
do with this delicious Indian meal.

What's curry?

I hear you asking, "What the hell are you talking about? What's curry?" Curry is a simple function, named after mathematician Haskell Curry who developed combinatory logic, though not currying. The man usually credited with that discovery, a bit earlier on, and far before modern computers were developed, was Moses Schönfinkel, so the process may also be known as Schönfinkelisation. Currying is the technique of implementing a multiargument function by chaining together several single-argument functions. To use Haskell as an example, say you have the function `add`, which adds two integers. The function could be defined as
`add :: Integer -> (Integer -> Integer)add = \x -> (\y -> x+y)`

Not all of you are Haskellers, so let me explain how that works. The first line is the type signature (that fact is indicated by the `::`) which tells us that the type of `add` is a function (indicated by `->` which takes an `Integer` and returns a function (`Integer -> Integer`) from `Integer` to `Integer`. The second line is a series of lambdas which defines the actual function. (We know it's a definition because of `=`.) The syntax `\a -> b` specifies an anonymous function which takes a value named a and returns the expression b. So add is a function which takes a value x, and returns a function y which takes a value y and returns the sum of x and y. As it happens, because Haskell is a so-called "auto-currying" language, and because `->` is right-associative due to the frequency of this construct, the exact same thing can be written as
`add :: Integer -> Integer -> Integeradd x y = x+y`

How Factor does it

Currying is a great construct, but it has almost nothing to do with Factor's `curry`. Factor's `curry` is, by the more fashionable nomenclature of those at Lambda the Ultimate, partial application: given a function, apply it with a partial argument list, yielding a new function with just those arguments filled in. (I think this operation would more appropriately be called `with` in Factor, but for now, it's `curry`.) In Scheme, this is accomplished with the nice macro `cut` (SRFI 26).

At first, it may sound like partial application only suits applicative, not concatenative languages, as we do not usually think about applying a function to its arguments, but rather composing words which operate on a stack. But in this context, all we mean by partial application, or "currying", is getting something onto the stack before the quotation is called.

With this realization, this sort of currying is easy to do on quotations. In Joy, and in Factor as it was a few years ago, quotations are lists. To prepend a constant to that list is the same as to push it on the stack before executing the quotation. The operation is simply `cons`. Now that Factor's quotations are specially tagged arrays, the code for prepending an element is `swap add*`, but the meaning remains the same. If you're a careful reader, you saw that I made a mistake there: if you want to partially apply a quotation with a word as your data, this word has to be "literalized" first, or wrapped so that it is not executed but rather pushed. So, until recently, the code for `curry` amounted to `swap literalize add*`. (`literalize` does nothing if its argument is not a word.)

But now, `curry` has been added as a primitive for better interpreter performance, and a compiler intrinsic so it can be safely used in compiling Factor code.

Derived operations

You might be wondering if this single operation can take the place of all of the uses of Scheme's `cut` macro. It can. Using only `curry`, `call` and basic stack shuffling words, it is easy to define several more complex operations. For something simple, here's how to partially apply a quotation to two arguments, rather than one:
`: 2curry ( a b quot -- quot' )    curry curry ; inline`

The `inline` declaration is there to allow things using `2curry` to be compiled. Composing two quotations is slightly more complicated. We want this to be fast, so we'll build a quotation, using `curry`, which calls the two quotations in order. This is more efficient than composition using `append`.
`: compose ( quot1 quot2 -- q1q2 )    [ >r call r> call ] 2curry ; inline`

Another useful operation is `curry*`, which partially applies a quotation based on the second item on the stack. As an example, `curry* map` is a complete replacement for `map-with`. Because of this, all of the `-with` combinators have been removed in favor of direct use of `curry`.
`: curry* ( param object quot -- object curry )    >r swap r> [ >r swap r> call ] 2curry ; inline`

Practical applications

Aside from eliminating the need for all `-with` combinators (I'm very glad I'll never have to write another one myself!), `curry` makes writing all sorts of combinators much simpler. For an involved example, see the new definitions of the sequence combinators, which are not so scary anymore. For something simpler, take the problem of counting up the number of elements in a sequence which satisfy a certain property. Say you have the sequence `{ 1 2 4 1 2 1 }` and you want to count tne number of ones. A simple way to do this is to simply take the subset satisfying the property, and then count how long it is:
`: count ( seq quot -- num )    subset length ; inline`

Using this, code to count up the number of ones could be `[ 1 = ] count`. This pattern is used a few times in the Factor core, though there is no word named `count`. Using it, we can write a `count=` word which counts the number of times an element occurs in a sequence:
`: count= ( seq elt -- num )    [ = ] curry count ;`

But there's a downside to this: for no good reason, you build a new sequence, which is immediately consumed and discarded. What if we just counted up the length instead? The `count` combinator could be rewritten as:
`: count ( seq quot -- num )    0 -rot [ rot >r call r>  swap [ 1+ ] when ] curry each ;`

There's a little bit of stack acrobatics around calling that quotation so that the quot has access to lower things on the stack.

If you think about it, there are actually two pieces here, which we can factor out. The first piece, which I'll call `sigma`, incrementally maps all values in a sequence to a new value, and sums the results. The second piece calls a quotation and converts the boolean result to either one or zero, which can be summed up. When these are separated, the implementation is even cleaner:
`: sigma ( seq quot -- num )    [ rot >r call r> + ] curry 0 -rot each ; inline: count ( seq quot -- num )    [ [ 1 ] [ 0 ] if ] compose sigma ; inline`

Both of these are faster than their counterparts (`map sum` and `subset length`) by only a constant factor, but, just for fun, let's measure this. Right now, Factor doesn't have very advanced profiling tools, but I haven't felt the need for them. For many things, a simple timing function can work well. After entering in the definitions for `sigma`, the following lines can be entered at the listener to time the word against `map sum`:
`: slow 1000000 [ ] map sum ;: fast 1000000 [ ] sigma ;{ slow fast } [ compile ] each[ slow ] time ! prints time for map sum[ fast ] time ! prints time for sigma`

On my computer, `slow` takes 195ms, and `fast` takes 126ms, for around a 35% speedup!

#### 1 comment:

Anonymous said...

WTF are you on? "More fashionable"? How about "correct"?