Monday, August 20, 2007

Factor with a dash of curry

Finally, Factor has compiling curry! This is a great development, and it makes designing combinators in Factor much easier. I previously discussed designing combinators, but that post was very confusing and probably too advanced. With curry in full availability, this should be much easier, and the explanation much cleaner.


Unfortunately, Factor's curry has nothing to
do with this delicious Indian meal.


What's curry?

I hear you asking, "What the hell are you talking about? What's curry?" Curry is a simple function, named after mathematician Haskell Curry who developed combinatory logic, though not currying. The man usually credited with that discovery, a bit earlier on, and far before modern computers were developed, was Moses Schönfinkel, so the process may also be known as Schönfinkelisation. Currying is the technique of implementing a multiargument function by chaining together several single-argument functions. To use Haskell as an example, say you have the function add, which adds two integers. The function could be defined as

add :: Integer -> (Integer -> Integer)
add = \x -> (\y -> x+y)

Not all of you are Haskellers, so let me explain how that works. The first line is the type signature (that fact is indicated by the ::) which tells us that the type of add is a function (indicated by -> which takes an Integer and returns a function (Integer -> Integer) from Integer to Integer. The second line is a series of lambdas which defines the actual function. (We know it's a definition because of =.) The syntax \a -> b specifies an anonymous function which takes a value named a and returns the expression b. So add is a function which takes a value x, and returns a function y which takes a value y and returns the sum of x and y. As it happens, because Haskell is a so-called "auto-currying" language, and because -> is right-associative due to the frequency of this construct, the exact same thing can be written as

add :: Integer -> Integer -> Integer
add x y = x+y


How Factor does it

Currying is a great construct, but it has almost nothing to do with Factor's curry. Factor's curry is, by the more fashionable nomenclature of those at Lambda the Ultimate, partial application: given a function, apply it with a partial argument list, yielding a new function with just those arguments filled in. (I think this operation would more appropriately be called with in Factor, but for now, it's curry.) In Scheme, this is accomplished with the nice macro cut (SRFI 26).

At first, it may sound like partial application only suits applicative, not concatenative languages, as we do not usually think about applying a function to its arguments, but rather composing words which operate on a stack. But in this context, all we mean by partial application, or "currying", is getting something onto the stack before the quotation is called.

With this realization, this sort of currying is easy to do on quotations. In Joy, and in Factor as it was a few years ago, quotations are lists. To prepend a constant to that list is the same as to push it on the stack before executing the quotation. The operation is simply cons. Now that Factor's quotations are specially tagged arrays, the code for prepending an element is swap add*, but the meaning remains the same. If you're a careful reader, you saw that I made a mistake there: if you want to partially apply a quotation with a word as your data, this word has to be "literalized" first, or wrapped so that it is not executed but rather pushed. So, until recently, the code for curry amounted to swap literalize add*. (literalize does nothing if its argument is not a word.)

But now, curry has been added as a primitive for better interpreter performance, and a compiler intrinsic so it can be safely used in compiling Factor code.

Derived operations

You might be wondering if this single operation can take the place of all of the uses of Scheme's cut macro. It can. Using only curry, call and basic stack shuffling words, it is easy to define several more complex operations. For something simple, here's how to partially apply a quotation to two arguments, rather than one:

: 2curry ( a b quot -- quot' )
curry curry ; inline

The inline declaration is there to allow things using 2curry to be compiled. Composing two quotations is slightly more complicated. We want this to be fast, so we'll build a quotation, using curry, which calls the two quotations in order. This is more efficient than composition using append.

: compose ( quot1 quot2 -- q1q2 )
[ >r call r> call ] 2curry ; inline

Another useful operation is curry*, which partially applies a quotation based on the second item on the stack. As an example, curry* map is a complete replacement for map-with. Because of this, all of the -with combinators have been removed in favor of direct use of curry.

: curry* ( param object quot -- object curry )
>r swap r> [ >r swap r> call ] 2curry ; inline


Practical applications

Aside from eliminating the need for all -with combinators (I'm very glad I'll never have to write another one myself!), curry makes writing all sorts of combinators much simpler. For an involved example, see the new definitions of the sequence combinators, which are not so scary anymore. For something simpler, take the problem of counting up the number of elements in a sequence which satisfy a certain property. Say you have the sequence { 1 2 4 1 2 1 } and you want to count tne number of ones. A simple way to do this is to simply take the subset satisfying the property, and then count how long it is:

: count ( seq quot -- num )
subset length ; inline

Using this, code to count up the number of ones could be [ 1 = ] count. This pattern is used a few times in the Factor core, though there is no word named count. Using it, we can write a count= word which counts the number of times an element occurs in a sequence:

: count= ( seq elt -- num )
[ = ] curry count ;

But there's a downside to this: for no good reason, you build a new sequence, which is immediately consumed and discarded. What if we just counted up the length instead? The count combinator could be rewritten as:

: count ( seq quot -- num )
0 -rot [ rot >r call r> swap [ 1+ ] when ] curry each ;

There's a little bit of stack acrobatics around calling that quotation so that the quot has access to lower things on the stack.

If you think about it, there are actually two pieces here, which we can factor out. The first piece, which I'll call sigma, incrementally maps all values in a sequence to a new value, and sums the results. The second piece calls a quotation and converts the boolean result to either one or zero, which can be summed up. When these are separated, the implementation is even cleaner:

: sigma ( seq quot -- num )
[ rot >r call r> + ] curry 0 -rot each ; inline

: count ( seq quot -- num )
[ [ 1 ] [ 0 ] if ] compose sigma ; inline


Both of these are faster than their counterparts (map sum and subset length) by only a constant factor, but, just for fun, let's measure this. Right now, Factor doesn't have very advanced profiling tools, but I haven't felt the need for them. For many things, a simple timing function can work well. After entering in the definitions for sigma, the following lines can be entered at the listener to time the word against map sum:

: slow 1000000 [ ] map sum ;
: fast 1000000 [ ] sigma ;
{ slow fast } [ compile ] each
[ slow ] time ! prints time for map sum
[ fast ] time ! prints time for sigma

On my computer, slow takes 195ms, and fast takes 126ms, for around a 35% speedup!

1 comment:

Anonymous said...

WTF are you on? "More fashionable"? How about "correct"?