
Wednesday, May 7, 2008

Interval maps in Factor

Recently, I wrote a little library in Factor to get the script of a Unicode code point. It's in the Factor git repository in the vocab unicode.script. Initially, I used a relatively simple representation of the data: there was a byte array, where the index was the code point and the elements were bytes corresponding to scripts. (It's possible to use a byte array because there are only seventy-some scripts to care about.) Lookup consisted of char>num-table nth num>name-table nth. But this was pretty inefficient. The largest code point (that I wanted to represent here) was something around 195,000, meaning that the byte array took up almost 200KB. Even if I somehow got rid of the empty space (and I don't see an obvious way to do that without a bunch of overhead), there are still around 100,000 code points whose script I wanted to encode.

But we can do better than taking up 100KB. The thing about this data is that scripts are in a bunch of contiguous ranges. That is, two characters that are next to each other in code point order are very likely to have the same script. The file in the Unicode Character Database encoding this information actually uses special syntax to denote a range, rather than write out each one individually. So what if we store these intervals directly rather than store each element of the intervals?

A data structure to hold intervals with O(log n) lookup and insertion has already been developed: interval trees. They're described in Chapter 14 of Introduction to Algorithms starting on page 311, but I won't describe them here. At first, I tried to implement these, but I realized that, for my purposes, they're overkill. They're really easy to get wrong: if you implement them on top of another kind of balanced binary tree, you have to make sure that balancing preserves certain invariants about annotations on the tree. Still, if you need fast insertion and deletion, they make the most sense.

A much simpler solution is to just have a sorted array of intervals, each associated with a value. The right interval, and then the corresponding value, can be found by simple binary search. I don't even need to know how to do binary search, because it's already in the Factor library! This is efficient as long as the interval map is constructed all at once, which it is in this case. By a high constant factor, this is also more space-efficient than using binary trees. The whole solution takes less than 30 lines of code.

(Note: the intervals here are closed and must be disjoint, and <=> must be defined on their endpoints. I don't use the intervals in math.intervals, both to save space and because they're overkill here. Interval maps don't follow the assoc protocol because intervals aren't discrete; eg floats are acceptable as keys.)

First, the tuples we'll be using: an interval-map is the whole associative structure, containing a single slot for the underlying array.

TUPLE: interval-map array ;

That array consists of interval-nodes, which have a beginning, end and corresponding value.

TUPLE: interval-node from to value ;

Let's assume we already have the sorted array of intervals. Given a key and an interval map, find-interval will give the index of the interval which might contain the given key.

: find-interval ( key interval-map -- i )
[ from>> <=> ] binsearch ;

interval-contains? tests if a node contains a given key.

: interval-contains? ( object interval-node -- ? )
[ from>> ] [ to>> ] bi between? ;

Finally, interval-at* searches an interval map to find a key, finding the correct interval and returning its value only if the interval contains the key.

: fixup-value ( value ? -- value/f ? )
[ drop f f ] unless* ;

: interval-at* ( key map -- value ? )
array>> [ find-interval ] 2keep swapd nth
[ nip value>> ] [ interval-contains? ] 2bi
fixup-value ;

A few convenience words, analogous to those for assocs:

: interval-at ( key map -- value ) interval-at* drop ;
: interval-key? ( key map -- ? ) interval-at* nip ;

So, to construct an interval map, there are a few things that have to be done. The input is an abstract specification, consisting of an assoc where the keys are either (1) 2arrays, where the first element is the beginning of an interval and the second is the end, or (2) numbers, representing an interval of the form [a,a]. This can be converted into a form where everything is of kind (1) with the following:

: all-intervals ( sequence -- intervals )
[ >r dup number? [ dup 2array ] when r> ] assoc-map
{ } assoc-like ;

Once that is done, the objects should be converted to intervals:

: >intervals ( specification -- intervals )
[ >r first2 r> interval-node boa ] { } assoc>map ;

After that, and after the intervals are sorted, it needs to be assured that all intervals are disjoint. For this, we can use the monotonic? combinator, which checks to make sure that all adjacent pairs in a sequence satisfy a predicate. (This is more useful than it sounds at first.)

: disjoint? ( node1 node2 -- ? )
[ to>> ] [ from>> ] bi* < ;

: ensure-disjoint ( intervals -- intervals )
dup [ disjoint? ] monotonic?
[ "Intervals are not disjoint" throw ] unless ;

And, to put it all together, using a tuple array for improved space efficiency:

: <interval-map> ( specification -- map )
all-intervals [ [ first second ] compare ] sort
>intervals ensure-disjoint >tuple-array
interval-map boa ;
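
Putting it all together, here's a hypothetical usage sketch; the ranges and script names are made up for illustration:

{ { { 97 122 } "Latin" } { 1024 "Cyrillic" } } <interval-map>
100 over interval-at .    ! prints "Latin"
5000 swap interval-key? . ! prints f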

All in all, in the case of representing the table of scripts, a table which was previously 200KB is now 20KB. Yay!

Sunday, January 13, 2008

until: The Ultimate Enumerator

Recently, via programming.reddit, I found this paper by Oleg Kiselyov about the best way to build a generic collection protocol to go through a sequence. We want to do this so that operations on collections can be implemented once generically to work on any kind of collection.

Oleg's idea

There are two main ways to go about this: one is a cursor, which is called an iterator in Java and C#: basically, you have an object which represents a particular point in a collection, and there are operations to get the current value and the next cursor position, if one exists. The alternative is to have a higher order function called an enumerator which represents, basically, a left fold. Oleg's conclusion is that the left fold operator is the best way to go about things, and he proceeds to demonstrate how this works out in languages with and without call/cc. He shows how cursors and enumerators are isomorphic, defining each in terms of the other.

Oleg's function is something like this: you have a coll-fold-left function which takes a collection, a function, and an unbounded number of seeds. This will return those seeds after this left fold processing is done. The function takes an element and the values of the seed inputs and returns an indicator with new values for the seeds. The indicator determines whether to keep going or to stop now and return the current values from coll-fold-left.

An enumeration function like this, which is fully general, is much better than a cursor, because cursors depend on mutable state and they are much more difficult to implement, say, when traversing a tree with no links to parent nodes. Also, enumerators handle termination of the collection much better than iterators, where it must either be manually checked whether the collection has terminated, or an exception thrown when reading past the end. There are also efficiency problems, and some interesting research into doing compiler optimizations like deforestation based on enumerators.

Concatenativity and accumulators

I'm going to talk about how this can all work in Factor. Factor does have call/cc, so we can use the simpler formulation of the enumerator, which depends on call/cc for full control. (I won't go into how to stop and start an iteration in the middle in this article; Oleg's example can be translated directly into Factor.) But we also have something else to take advantage of, which neither Scheme nor Haskell have at their disposal: a stack.

Remember, Factor is a concatenative language, and that means that whenever we make a function call, it operates on a huge built-in accumulator. You could think of the stack as one accumulator ("all words are functions from stack to stack") or as an unbounded number of accumulators (which makes more sense thinking about things from a dataflow perspective, that the stack is used to glue together the inputs and outputs of various words).

Here's an example. You know the regular old left fold? As in, taking a list like [1, 2, 3, 4], an initial value (called the identity) 0, and a function like (+), and transforming this into ((((0+1)+2)+3)+4)? Well, in Factor we call that reduce, and here's how it's defined:

: reduce ( sequence identity quotation -- value )
! quotation: accumulator element -- new-value
swapd each ; inline

swapd is a stack shuffler with the effect ( x y z -- y x z ) and each is a word which takes a sequence and a quotation, and calls the quotation on each element of the sequence in order. each normally takes a quotation with stack effect ( element -- ).

But how does this work? each, like nearly every combinator, is defined to expose its quotation to values further down on the stack, so we can mess around with that stuff all we want as long as the result is balanced. Here, the quotation that we give the whole thing is messing with the top thing on the stack, right under the element, each time it's called. So, by the end, the top of the stack has our answer. Keep in mind that all of this is happening completely without mutable state.
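
For example, summing a sequence with reduce looks like this:

{ 1 2 3 4 } 0 [ + ] reduce . ! prints 10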

until

So, that's the great simplification that you get from concatenative languages: there is no need to maintain the accumulators. Here's all we need to define the perfect enumeration protocol: a word which I call until ( collection quotation -- ). It takes a quotation and a collection from the stack and leaves nothing. The quotation is of the effect ( element -- ? ). If the quotation returns true, the iteration through the list stops there.

until is significantly easier to define than coll-fold-left because we don't have to worry about the accumulators. Here's a simple definition with linked lists, if f is used as the empty list:

TUPLE: cons car cdr ;

: until ( list quot -- )
over [
[ >r cons-car r> call ] 2keep rot
[ 2drop ] [ >r cons-cdr r> until ] if
] [ 2drop ] if ; inline

Of course, we'd probably want to define it as a generic word, if we're doing this in Factor and will use it for more than one thing.

(until is kinda like the generic iteration protocol I've defined for assocs, using assoc-find, except until should be easier to implement.)
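
Here's a hypothetical sketch of until in action on such a list, assuming a constructor <cons> defined with C::

C: <cons> cons

4 f <cons> 3 swap <cons> 2 swap <cons> 1 swap <cons> ! the list (1 2 3 4)
[ dup . 3 = ] until ! prints 1, 2 and 3, then stops without visiting 4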

Derived operations

Using until as a base, we can easily define other basic operations like each, find and assoc-each as they currently exist in Factor.

: each ( seq quot -- )
[ f ] compose until ; inline

: find ( seq quot -- i elt ) ! This is hideous, and should be factored
>r >r -1 f r> r> [ ! i f elt quot
2swap >r >r keep r> 1+ r> drop ! ? elt i
rot [ rot and ] keep ! i f/elt ?
] curry until dup [ nip f swap ] unless ; inline

: assoc-each ( assoc quot -- )
! This assumes that the assoc implements until, giving 2arrays for each element
[ first2 ] swap compose each ; inline


Evaluation

You might notice that find will involve a little bit of redundant arithmetic on integer-indexed sequences. I'm not sure how to avoid this without complicating until and leaking implementation details. Also, you might notice that there's no obvious way to implement find* (without doing a bunch of redundant iteration), which takes an index to start as an argument. This is a result of the fact that find* can only really be implemented efficiently on collections with random access indexing. There's also no find-last.

until isn't everything. As I pointed out just now, some operations, like find, may have to do redundant arithmetic when operating on certain kinds of collections to get an index, when that index must already be calculated for the traversal itself. Another limitation is that there's no obvious way to implement map. (I would still have to suggest going with Factor's current scheme of the words new, new-resizable and like. But this scheme doesn't work with, say, linked lists.) This could be implemented in place of the assoc iteration protocol, but that might result in some unnecessary allocation of 2arrays when iterating through things like hashtables.

Nevertheless, something like until could definitely help us make our sequence protocol more generic, allowing for (lazy) lists, assocs, and what we currently call sequences to be used all through the same protocol.


If you don't understand the title allusion, you should read some of the original Scheme papers.

Thursday, January 3, 2008

Learning Factor

Learning Factor is tough. One reason for this is that Factor is very different from other programming languages. Programmers today are used to imperative programming languages where data is stored and passed around in named variables (or function calls, which name their variables). Factor is the opposite of this. A lot of code tends to be written in a functional style, and even more jarringly, variables are rare, only referenced in a small fraction of words. Nobody intends to change any of this; it's a feature, not a bug!

What we do need to change is the other reason for Factor's difficulty to learn: lack of documentation and organization thereof. It's hard for me to say this when a lot of effort has been put into documentation, especially creating a comprehensive reference for the Factor core and libraries. When I got started on Factor, there was basically no documentation at all; I just asked questions on the IRC channel. Now there's tons of it, but it's in many places.

Two starting places (not necessarily in this order) are the FAQ and the Factor cookbook. The Factor cookbook is included in the Factor distribution, accessed by running Factor, selecting the "Browser" tab and clicking on Factor cookbook, or online. It was written by Slava Pestov, and illustrates really clearly the basics of Factor.

But once you get done with that, where should you go? Here are a bunch of options, depending on what exactly you want to do:
  • Factor's included reference documentation, written mostly by Slava, is sound and basically complete but not very tutorial-like.
  • Slava's blog is the best place to learn about new language features as they're developed.
  • Aaron Schaefer wrote a series of blog posts that introduce Factor, starting here.
  • I wrote a bunch of introductory blog posts.
  • Elie Chaftari wrote an introduction to programming Factor, and he wrote about the FFI.
  • You can look at many other blogs about Factor through the Planet Factor aggregator. Scattered around here are many more introductions to Factor.
  • The Joy papers by Manfred von Thun provide a theoretical basis for concatenative languages.
  • You could try reading some source code in the libraries. Simple demos have the tag demo, which you can query in the vocab browser. The core sequence library is a good place to look for clean, idiomatic code.

The best way to learn Factor, after you have a grasp of the basic syntax, is to play around with it and try to build something useful. Here are some project ideas:
  • Write a date parsing library, accepting ISO 8601 format or a natural English (and maybe internationalized) format. A calendar backend already exists.
  • Create an embedded, lexically scoped, infix/prefix DSL for complex math calculations
  • Make an analog clock with the Factor UI library, or a GUI for Factor server administration
  • Solve some Project Euler problems in Factor
  • Whenever you encounter some part of Factor that you don't like, or some missing language feature, implement it. It's very likely that you'll be able to do whatever you want in the library

These problems may sound a bit difficult, but most of them have simple part-way solutions so you can begin to get your feet wet. And with any of these, if you have useful working code, we'd be happy to take it into the Factor library, if you feel like releasing it under a BSD-style license.

If you get stuck, don't despair! The friendly people on the Factor mailing list and irc.freenode.net's #concatenative channel are here to answer all of your questions. You could also just email me. Eventually, we'll have a Factor book, but it hasn't been started yet, in part because Factor's not at version 1.0 and in part because none of us have enough free time to write it.

Update: Factor documentation isn't as bad as I originally made it out to be, so I changed this post a little.

Tuesday, December 4, 2007

extra/delegation: an annotated vocabulary

In order to allow for better compiler optimization of slot operations and other improvements, Slava plans to add tuple inheritance to the Factor object system. The exact form of this isn't completely finalized, but it looks like delegation in its current form isn't going to stick around much longer. I proposed a new kind of delegation mechanism, along with other changes to the Factor object system. In the end, a more conservative change was chosen in the form of mixins.

I described the Factor object system before, but I'll review delegation here. Basically, when a generic word dispatches on a tuple, one of two things can happen: either a method is found and called, or no method is found. If no method is found, the generic word checks the delegate slot. If that slot is filled (with something that's not f), the method will be re-called on that delegate in place of the original object. For more information on this, see my previous blog post.

There are a few problems with delegates. One problem is the slicing problem: when relying on delegation to extend behavior and a method is delegated, the method call knows only about the delegate, not the original object. Sometimes, this is the desired behavior, but other times it can lead to subtle bugs. Another problem is performance. When looking up the value of a direct tuple slot, the required operation is a simple method dispatch and pointer arithmetic. If the compiler knows the tuple class of an object before runtime, it can even eliminate the dispatch step. However, when looking at a delegate slot, there is an additional pointer indirection. Crucially, with the current system, there is no way to avoid the dispatch. Tuple inheritance would require fewer levels of indirection and give more opportunity for optimization. With delegation, method dispatch takes linear time with respect to the length of the delegate chain, and this is bad. A very subtle problem with delegation is that everything is delegated at once, when often, we only need to delegate a couple things.

Yet delegation is still sometimes useful, and there must be a way to fix the slicing problem without breaking everything*. I've written a vocabulary (vocab) called delegate which attempts to fix this. The premise is that this will constitute delegation after it is removed from the core. Central to the vocab is the concept of a group, or a word which signifies a number of methods. There are three builtin types of groups: tuple classes, generic words and protocols. There is a word called group-words which extracts the constituent words from a group:

GENERIC: group-words ( group -- words )

For a generic word, the group consists just of itself:

M: generic group-words
1array ;

For a tuple class, the group consists of all of the slot readers and writers, except for the delegate slot. (delegate and set-delegate aren't generic words, anyway)

M: tuple-class group-words
"slots" word-prop 1 tail ! The first slot is the delegate
! 1 tail should be removed when the delegate slot is removed
dup [ slot-spec-reader ] map
swap [ slot-spec-writer ] map append ;

A protocol is an explicitly-defined list of generic words which form a group. The syntax PROTOCOL: protocol-name words... ; is used to define a protocol, and the words it contains are stored in the "protocol-words" word property.

: define-protocol ( wordlist protocol -- )
swap { } like "protocol-words" set-word-prop ;

: PROTOCOL:
CREATE dup reset-generic dup define-symbol
parse-definition swap define-protocol ; parsing

PREDICATE: word protocol "protocol-words" word-prop ;

M: protocol group-words
"protocol-words" word-prop ;

Here's an example of the definition of the sequence protocol:

PROTOCOL: sequence-protocol
clone clone-like like new new-resizable nth nth-unsafe
set-nth set-nth-unsafe length immutable set-length lengthen ;

The advantage of defining delegation over these groups is that, usually, we don't need to delegate for everything. With this, multiple delegation over disjoint groups is possible. This is in use in the xml.data vocab, to let tags act like their names, their attributes assoc and their children sequence at the same time. It could also be used for multiple tuple inheritance.

One mechanism which replaces delegation is called consultation. Billy Tanksley has suggested that this is what Factor's delegation should be termed; in essence, consultation is the same as delegation, except that it's limited to a single group. Say you have some class foo which implements all generic words in the group bar. There is another class baz, which has a word baz>foo for converting itself into a foo, or getting a corresponding foo to which to delegate the methods. Then, we could define CONSULT: bar baz baz>foo ; in order to define the delegation. If bar contains the words qux and quux, then the following method definitions will be generated:

M: baz qux baz>foo qux ;
M: baz quux baz>foo quux ;
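
For a more concrete (hypothetical) example, a wrapper tuple can consult its underlying sequence for the whole sequence-protocol group defined above:

TUPLE: counted-seq underlying ;
CONSULT: sequence-protocol counted-seq counted-seq-underlying ;

Now generic words like length and nth, called on a counted-seq, are forwarded to the object in its underlying slot.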

Here's the implementation:

: spin ( x y z -- z y x ) ! This stack shuffler makes things easier to think about
swap rot ;

: define-consult-method ( word class quot -- )
pick add <method> spin define-method ;

: define-consult ( class group quot -- )
>r group-words r>
swapd [ define-consult-method ] 2curry each ;

: CONSULT:
scan-word scan-word parse-definition swapd define-consult ; parsing

So, this fixes one problem with delegation: that it happens for everything. The second problem, performance, can be solved with increased static typing, for example static typing of tuple slots. But this won't be an available feature of Factor for some time. Until then, if we're sure that a tuple slot is a specific class, we can use declare to let the compiler know. For example,

CONSULT: bar baz
baz>foo { foo } declare ;

There's still one last problem, the slicing problem. A complete solution would involve heavy hackery into the Factor method dispatch system, allowing one object to mimic another. But a simpler solution is to just copy all of the method implementations from one class to another, for the generic words in a specific group. This can be unsafe in some situations, such as when the group is a tuple class, but it can also be useful. I called the system mimicry.

: define-mimic ( group mimicker mimicked -- )
>r >r group-words r> r> [
pick "methods" word-prop at [ 2drop ]
[ method-def <method> spin define-method ] if*
] 2curry each ;

: MIMIC:
scan-word scan-word scan-word define-mimic ; parsing

So there you have it: in fewer than 40 lines, the basis for a replacement of the delegation system, at least partially fixing all three main problems. The code is in my git repository. There are still problems, of course: metadata about what mimics/consults what is not stored, and when you see a generic word or class, you'll see all of the mechanically-defined methods. It's amazing, though, how flexible Factor is to this kind of manipulation and metaprogramming.



* Actually, it's possible to re-implement delegation in Factor, completely eliminating the slicing problem, but at a slight performance cost for tuple slot lookups, and at the cost of breaking a lot of code. The thing is, if instances of tuple class a all delegate to tuple class b, then instances of a will return f from b?. To get around this, semi-hacks like the is? word are used.

Friday, October 26, 2007

Factor's object system

In my FAQ, I wrote that Factor was object-oriented. That might have confused some of you: how is Factor object-oriented when code isn't organized in terms of classes like in Java? When there is no system for message passing*? When tuples can't inherit from each other?

Rationale

Originally, Factor didn't have an object system. It was said that it's not necessary; we can get by with just words, conditionals and lists. But as the library grew more complicated, it became clear that we needed a cleaner method of dispatch. This took the form of generic words, or words that can be overloaded by the class of one of their arguments. Initially, we only had predicate classes, and generic words were in fact no more than syntactic sugar for cond. But we also needed a good way to define data types, so that they wouldn't be confused with each other. Later, we realized the need for a system of 'inheritance' from tuples to tuples, flexible constructors, extensible unions and other things. Piece by piece, features were added to form an emerging coherent whole which could be used to organize some Factor code.

It's important to note that most words are not methods and not all data is stored in tuples. Groups of Factor code are put in vocabularies, not in classes. It's common to write a lot of code without using any object-oriented features (aside from generic words defined in libraries). Nevertheless, object orientation, when used appropriately, makes code much better organized.

The class hierarchy

In Factor, everything is an object. Well, of course everything is an object; everything is something! But I mean everything is an instance of the class object. Under that class, there are a number of built-in classes, representing basic things like tuple, vector, array, fixnum, float, bignum, ratio, quotation, curry, etc. You can test whether an object is an instance of a particular class with predicate words following the naming scheme class?. To test if an object is a tuple, use the word tuple?.

On top of these builtin classes, there are a number of abstractions. For example, the class number is the union of all numeric classes. Under this is a numerical tower of subclasses, similar to Scheme. An example of a couple unions of this kind:

UNION: integer fixnum bignum ;
UNION: rational ratio integer ;

Factor also supports extensible unions, called mixins. Whereas a union is closed, later code is allowed to add things to a mixin. The above code can be translated into mixin syntax as:

MIXIN: rational
MIXIN: integer
INSTANCE: rational ratio
INSTANCE: rational integer
INSTANCE: integer fixnum
INSTANCE: integer bignum

This is useful for things like sequences and assocs, which form mixin classes. Unions are inappropriate here, as new sequences and assocs can be made in the library.

Going in the other direction, classes can be narrowed down using predicate classes. A predicate class is a subclass of some other class consisting of instances which satisfy a conditional. For example, the natural numbers (defined as integers 0 or greater) could be defined as

PREDICATE: integer natural 0 >= ;
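
With this definition, the predicate word natural? comes for free:

5 natural? .  ! prints t
-2 natural? . ! prints f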


Generic words

So what use are all these classes? We can use predicates on them to test membership, but that's not all that useful. The centerpiece of the object system is generic words, or words which dispatch on the class of their arguments. Typically, only single dispatch is needed, so multiple dispatch is not in the core**. A generic word which dispatches on the top of the stack is defined with the parsing word GENERIC:. Methods, or implementations of the generic word, are written with the parsing word M:. Take the following useless example:

GENERIC: process ( thing -- processed )
M: string process
"foo" append ;
M: integer process
5 + ;

Here, there is a generic word process with two methods: one on strings (which appends "foo") and one on integers (which adds 5). This is equivalent to:

: process ( thing -- processed )
{
{ [ dup string? ] [ "foo" append ] }
{ [ dup integer? ] [ 5 + ] }
} cond ;

But the first version of process has the advantage that it is extensible later, in other files. Additionally, generic words are sensitive to the class hierarchy and will test things in an appropriate order, and dispatch is more optimized than with a simple cond expression.
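
For example, calling process dispatches on the class of the object on top of the stack:

"bar" process . ! prints "barfoo"
7 process .     ! prints 12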

There are other ways to dispatch besides GENERIC:. Other method combinations include MATH:, for words which dispatch on builtin math types for two arguments, HOOK:, to dispatch on the class of the value of a variable, and GENERIC#, which allows dispatch on items further down on the stack. You can also make your own method combinations.

Tuples, delegation and constructors

It's possible to construct your own data types using just arrays and predicate classes, but that can get awkward and ambiguous. To make new data structures, Factor offers tuples, basically another name for structs or records. Here is an example of a tuple class called foo, with slots bar and baz, together with a constructor called <foo>:

TUPLE: foo bar baz ;
C: <foo> foo

The first line creates the tuple class, and the second line makes the constructor. (Slava described the details and rationale for the constructor system.) It is simple to use this. To make a new foo tuple, just get the values of bar and baz on the stack and call <foo>. With that object, the bar slot can be read with the word foo-bar, and the baz slot can be read with the word foo-baz. To write values to those slots, use the words set-foo-bar and set-foo-baz. And, of course, to test if an object is a foo tuple, use the word foo?. All in all, TUPLE: is a wildly unhygienic macro, and we like it that way!

But wouldn't it be useful if we could let tuples inherit from each other? We've already seen that Factor allows for general subtypes and supertypes, but this does not extend to tuples. The way that tuples can be based off each other is through delegation: one tuple delegates to another. Each tuple has a slot, called the delegate (accessed with delegate and set-delegate). When a generic word dispatches on a tuple, but it doesn't have a method implementation for that tuple, it looks in the delegate slot for what to do next. If there is a delegate, then the generic word is again invoked on that delegate, with the hope that generic word itself has an implementation for that class. If not, it's on to the next delegate until there is no delegate anymore--the delegate is f.***

Instead of inheriting from a class, tuples delegate to an instance. That is, if I have my tuple TUPLE: foo bar baz and I want it to inherit from TUPLE: bing quux ;, all I have to do is make an instance of foo and an instance of bing, and set the delegate of the foo to the bing. Slot accessors like bing-quux and set-bing-quux are generic words implemented, by default, only on that one tuple class that they were made for. So, if you call bing-quux on a tuple of class foo that delegates to a bing, the foo instance will fail to respond and the generic word will be called again on the delegate, bing. This delegate will know what to do, and it'll return its quux value.
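
In code, that looks something like this (a sketch, with a constructor for bing defined the same way as <foo>):

TUPLE: bing quux ;
C: <bing> bing

1 2 <foo> 3 <bing> over set-delegate ! the foo now delegates to a bing
dup foo-bar . ! prints 1: foo answers this itself
bing-quux .   ! prints 3: the call falls through to the delegate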

There's a subtle problem here with delegates. Look at the following code:

TUPLE: foo bar baz ;
TUPLE: bing quux ;
GENERIC: how
M: bing how drop "bing" print ;
M: foo how drop "foo" print ;
GENERIC: who
M: bing who how ;

You might expect that when who is called on a foo instance, "foo" is printed, but in fact, "bing" is printed. This is called the slicing problem: the object is "sliced" when using delegates, so only part of the object is represented in further method calls. On one hand, the behavior of M: bing who is more predictable, but on the other hand, information is lost. It's not a horrible thing, just something to watch out for.

Use cases

As I said before, a lot of Factor code has no need for object orientation. Going by my own experience, some complicated libraries, such as Unicode and inverse, don't yet use the object system at all. The XML-RPC vocabulary relies heavily on generic words to perform dispatch in turning Factor data into XML-RPC format, though no new classes are defined. The XML library, however, uses tuples to represent XML at various stages of processing, and dispatches on the class of these tuples to provide complex functionality. Several tuples are defined in the XML library to represent exceptions, though any object can be thrown as an exception.

In general, the object system should be used when it's useful and not used when it's not useful. That seems simple enough, but it is a radical departure from the philosophy of other languages where all procedures are methods. It's good to use whichever strategy you think will produce the most maintainable, clear code.

The future

For a while, I hesitated to write about the object system because it changes so frequently. Tuple constructors, for example, changed very recently to eliminate the default constructor and replace it with a more flexible system. Mixins are also a recent addition, yet they significantly change the way default methods are implemented. Soon, Slava plans to replace delegation with tuple inheritance.

Since the Factor object system is constructed completely in Factor, rather than being built in to the virtual machine, any changes to it can be done within Factor itself. This is a huge advantage. One change that I proposed could easily be written in Factor and run without any special care. Fundamental changes to the object system can even be interactively tested and debugged!



* Eduardo Cavazos has written a message passing object system in Factor, and he uses it for some of his code. Factor allows other object systems to be created, but with his system, not everything is an 'object' that can receive messages. Still, it's an interesting (and surprisingly terse) example of the flexibility of Factor, and of useful syntax extension. It is not widely used, however.

** I implemented double dispatch as a library called visitor (since it's implemented with the visitor pattern, yielding a simple lexicographic method order). Maybe in the future we'll have double dispatch in the Factor core, to better implement words like like, which end up approximating double dispatch using conditionals. Though it makes sense in theory, I can't imagine any use for triple dispatch in Factor.

*** As you can see, if you make an infinite (that is, recursive) delegate chain, method resolution may not halt. But why would you ever want to do that? Anyway, since we have predicate classes, it's already not guaranteed to halt!

Blogging is much more fun than writing essays about the history of South Africa.


Friday, September 7, 2007

A survey of stack shufflers

Factor is almost Lisp. Aside from the libraries, there's just one tiny difference: Factor has no named variables1. Factor is a concatenative language. This means that programs are assembled by composing other operations. Trial and error has found that the only workable structure for functions to pass data between each other is a stack2. All words in Factor work, in theory, by taking the stack, doing something to it, and returning a new stack, which can be passed on to the next word. In reality, most words only modify the top few items.

For those of you who slept through CS 101 (or, like me, never took it), here's an overview of stacks. We have just three operations: push, pop, and empty. The operation push puts a datum on the end of the stack. The operation pop deletes the top item from the stack, returning it. The operation empty tests whether the stack has any items in it. Initially, we say the stack is empty. Stacks are most commonly implemented as linked lists or growable arrays. Wikipedia has more details.

For our purposes, we'll assume that the empty operation is off-limits3. Generally, we should know whether or not the stack is empty--it has all of our data on it, after all. If a word tries to use more things on the stack than exist, the program crashes with a stack underflow error. This is a good thing, reflecting a real error with the structure of the program which needs to be fixed rather than ignored.

The basics: constants and drop

So, how can we make push and pop work with concatenative languages? We have no place to get the arguments for these, or return the results of these functions, except the stack itself. Instead of using push and pop directly, slightly more complicated operations are used. These provide not only a way to mess around with the stack, but also the data flow of the language.

First, the easy case: pushing constant values. Say you want to push 1 onto the stack. This is represented in Factor as, simply, 1. Think about it: when you have a literal, and when all data is stored on the stack, what else would you want to do with it besides push it onto that stack? In this way, we can actually think of 1 as a function from stacks to stacks, which takes one stack and returns another which has 1 pushed on to the front of it.

It's so much fun to push these values on the stack! But what if we want to get them off? We want to perform pop on the stack, but ignore the result. We can imagine this implemented as

_ = pop(stack)

You can think of it like dropping the top of the stack. In fact, that's what we'll call this operation, drop.

We can try this all out at the Factor listener4. Open up Factor and push a bunch of numbers to the stack. To see what's there, you can use the command .s. The stack is printed from bottom to top order. To delete the item on the top of the stack, use the command drop.
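
A session might look something like this:

1 2 3 .s ! prints 1 2 3, bottom to top
drop .s  ! prints 1 2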

Building blocks: dup, swap and the retain stack

Unfortunately, there's not much you can do with just those two things, pushing literals and dropping them, even if you add in primitive functions and functions derived from those. One useful action is to duplicate the top of the stack. Since "duplicate" is way too long to type so many times, we'll call it dup instead. dup can be implemented as

item = pop(stack)
push(stack, item)
push(stack, item)

dup is useful if you want to do two things with one piece of data. It's also useful if you want to apply the same piece of data twice to the same function. For example, let's look at the definition of sq, which squares the number on the top of the stack5:

: sq ( x -- x^2 )
dup * ;

This means that the one input to sq is sent as both arguments to *. The part in parentheses is called the stack signature, a listing of inputs and outputs to the word. It is basically a comment; the names included are not significant, except to the reader5. To give you an idea of what's going on here, the equivalent function in Haskell would be

sq x = x*x


Another basic building block is swap. This switches the top two things on the stack. Say you want to negate a number. The easiest way to do this, given a function - ( x y -- x-y ), is something like this:

neg x = 0-x

If we translate 0-x into Factor as 0 -, we get the wrong answer; this is equivalent to x-0, an identity function. So, what we need to do is push 0 onto the stack, swap it with x, and then perform a subtraction:

: neg ( x -- -x )
0 swap - ;


The third basic operation is dipping under the top of the stack to do stuff below. For example, say you want to calculate, given x, y and z, (x+y)*z. The easiest way to do this is to dip under the top of the stack, add x and y, then rise back up and multiply that result by z. In Factor, the dipping down operation is >r and the rising up operation is r>. All together, the Factor code for this function comes out as:

: (x+y)*z ( x y z -- (x+y)*z )
>r + r> * ;

(Yes, in Factor, (x+y)*z is a valid word name, since words are just whitespace-delimited chunks. And I was feeling a little unimaginative.)
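
For example:

1 2 3 (x+y)*z . ! prints 9, since (1+2)*3 = 9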

Derived operations

Using just those five basic operations--dup, drop, swap, >r and r>, we can implement any other stack shuffling operation6. Here are a few that turn out to be useful. The code here is extremely simple, but resist the temptation to do your own stack shuffling with the above operators and not the below: using derived operations makes your code look far cleaner and clearer to future readers.

swapd swaps the two items under the top of the stack. The suffix "d" indicates that it does this underneath.

: swapd ( x y z -- y x z )
>r swap r> ;


rot rotates the third item on the stack up to the top, pushing the top two down.

: rot ( x y z -- y z x )
swapd swap ;


-rot rotates the top item on the stack down to the third position, pushing the two below up. It is the inverse of rot.

: -rot ( x y z -- z x y )
swap swapd ;


nip is like drop, but it deletes the second item on the stack.

: nip ( x y -- x )
>r drop r> ;


dupd duplicates the item second from the top of the stack, and puts the duplicate underneath. This is useful if you want to do something with the top two things on the stack, but retain what's second.

: dupd ( x y -- x x y )
>r dup r> ;


over takes the second item on the stack and puts a copy of it over the top.

: over ( x y -- x y x )
dupd swap ;


tuck tucks the top item on the stack under the second one.

: tuck ( x y -- y x y )
swap over ;


pick picks the third item on the stack and places a copy of it on top.

: pick ( x y z -- x y z x )
>r over r> swap ;


In Factor, these are all implemented as primitives, but the only reason for this is performance. Note that, in the above descriptions, when I said "copy" I meant copying a pointer, not copying data (which can be done with clone).

Keep

There's a really cool operator known as keep, which often makes code far more elegant. Strictly speaking, it's not a stack shuffler; it takes a quotation as an argument, making it a combinator, or higher order function. Basically, keep encapsulates a simple, common pattern:

dup >r ... r>

replacing it with

[ ... ] keep

There's also 2keep, which is equivalent to

2dup >r >r ... r> r>

These, with the analogous 3keep, are more useful than they first seem. They are used for any instance where you want to process the top one or more items on the stack, together with something underneath, while retaining those items on the top for further use.
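
For instance, here's 2keep preserving its two inputs in a listener sketch:

2 3 [ + ] 2keep .s ! stack is now 5 2 3: the sum, with both inputs back on top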

What's this all for?

This list of functions probably looks daunting to memorize, and it is difficult at first. But it's useful, and they soon become second nature. In Factor and other concatenative languages, stack shufflers are the glue used to tie the output of certain functions to the input of other functions. No matter what programming language you're working in, everything ends up being a (perhaps convoluted) stringing together of different functions. Once you get used to it, using the stack and its associated transformations can be even more natural than using tons of named variables.



1 OK, technically, you could use locals for named variables. But it's all stack manipulation underneath. Also, there's dynamically scoped variables, but those aren't used the way lexically scoped locals typically are; they're used for passing data over greater distances.

2 Experimental languages have used a queue rather than, or in addition to, a stack. But this generally leads to more difficult to use--and compile--models. The original point-free language, FP, used function composition without a stack, but the way it uses lists is trivially equivalent to the stack. However, it ends up being somewhat awkward in practice. If you have an idea for how to improve either of these, or of another data structure for concatenative languages, I'd love to hear it. Please comment.

3 Strictly speaking, this isn't the case in Factor. In fact, a function to test whether the stack is empty is trivial to implement:

: stack-empty ( -- ? )
datastack empty? ;

But it's a bad idea to use this, since words should only care about their arguments, and they should take a fixed (or easy to determine, in the case of certain macros) number of arguments.

4 Here's how to get and open Factor. If you're on Windows or Mac, download the appropriate package from the Factor website and unpack it somewhere. On either platform, double click on the F icon, and a Factor listener window will pop up. On Linux or other Unixes, download the source, then make with the appropriate target (run without arguments to see options), and run ./factor. See the readme for more details. No matter how you get it started, once you have the listener open, you can just type in words and press enter to execute them.

5 The structure of stack comments is fairly simple. Before the -- are the inputs, and after are the outputs. Both sides are written from the bottom of the stack to the top of the stack. For example, ( a b -- c d ) indicates a word which takes a second from the top of the stack and b on the top of the stack, pops those off, and returns c second from the top and d on the top. The names should indicate something about the arguments, ideally more informative than their type (though that is often used). If the same name is used on the left and right sides of the --, that implies that the same object exists before and after the function is called. This occurs most often in stack shufflers.

6 Actually, there's a set of two combinators, cake and k, which can do anything, including any stack shuffle, and call, and curry. They're described in a fun paper by Brent Kerby, The Theory of Concatenative Combinators.




Note: Another way to look at the stack shufflers is through a functional implementation of stacks in terms of linked lists with pattern matching. We'll call the front of the list the top of the stack. Here are a few stack shufflers in Haskell, a language where : means cons:

drop (x:stack) = stack
swap (x:y:stack) = y:x:stack
dup (x:stack) = x:x:stack
over (x:y:stack) = y:x:y:stack
tuck (x:y:stack) = x:y:x:stack
rot (x:y:z:stack) = z:x:y:stack
dip f (x:stack) = x:f (stack)
keep f stack = dip f (dup stack)

This could be the basis for a dynamically typed concatenative language in Haskell, though it wouldn't work with static typing very well. (You could use nested tuples, with statically known types, instead. This requires a whole bunch of polymorphism and type classes. But with this, it is impossible to have a statically unknown stack depth, which makes it impossible to do certain types of recursion.)

Another way to think of it in Haskell is in terms of higher order functions:

drop f x = f -- same as const
swap f x y = f y x -- same as flip
dup f x = f x x

This, in my mind, more closely resembles the real data flow manipulation involved here. However, they are hard to compose in real programs.

Monday, August 20, 2007

Factor with a dash of curry

Finally, Factor has compiling curry! This is a great development, and it makes designing combinators in Factor much easier. I previously discussed designing combinators, but that post was very confusing and probably too advanced. With curry in full availability, this should be much easier, and the explanation much cleaner.


Unfortunately, Factor's curry has nothing to do with this delicious Indian meal.


What's curry?

I hear you asking, "What the hell are you talking about? What's curry?" Curry is a simple function, named after mathematician Haskell Curry who developed combinatory logic, though not currying. The man usually credited with that discovery, a bit earlier on, and far before modern computers were developed, was Moses Schönfinkel, so the process may also be known as Schönfinkelisation. Currying is the technique of implementing a multiargument function by chaining together several single-argument functions. To use Haskell as an example, say you have the function add, which adds two integers. The function could be defined as

add :: Integer -> (Integer -> Integer)
add = \x -> (\y -> x+y)

Not all of you are Haskellers, so let me explain how that works. The first line is the type signature (that fact is indicated by the ::), which tells us that the type of add is a function (indicated by ->) which takes an Integer and returns a function (Integer -> Integer) from Integer to Integer. The second line is a series of lambdas which defines the actual function. (We know it's a definition because of the =.) The syntax \a -> b specifies an anonymous function which takes a value named a and returns the expression b. So add is a function which takes a value x and returns a function which takes a value y and returns the sum of x and y. As it happens, because Haskell is a so-called "auto-currying" language, and because -> is right-associative due to the frequency of this construct, the exact same thing can be written as

add :: Integer -> Integer -> Integer
add x y = x+y


How Factor does it

Currying is a great construct, but it has almost nothing to do with Factor's curry. Factor's curry is, by the more fashionable nomenclature of those at Lambda the Ultimate, partial application: given a function, apply it with a partial argument list, yielding a new function with just those arguments filled in. (I think this operation would more appropriately be called with in Factor, but for now, it's curry.) In Scheme, this is accomplished with the nice macro cut (SRFI 26).

At first, it may sound like partial application only suits applicative, not concatenative languages, as we do not usually think about applying a function to its arguments, but rather composing words which operate on a stack. But in this context, all we mean by partial application, or "currying", is getting something onto the stack before the quotation is called.

With this realization, this sort of currying is easy to do on quotations. In Joy, and in Factor as it was a few years ago, quotations are lists. To prepend a constant to that list is the same as to push it on the stack before executing the quotation. The operation is simply cons. Now that Factor's quotations are specially tagged arrays, the code for prepending an element is swap add*, but the meaning remains the same. If you're a careful reader, you saw that I made a mistake there: if you want to partially apply a quotation with a word as your data, this word has to be "literalized" first, or wrapped so that it is not executed but rather pushed. So, until recently, the code for curry amounted to swap literalize add*. (literalize does nothing if its argument is not a word.)

But now, curry has been added as a primitive for better interpreter performance, and a compiler intrinsic so it can be safely used in compiling Factor code.
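
In the listener, you can see what curry builds and what it does when called:

1 [ + ] curry .        ! prints [ 1 + ]
3 1 [ + ] curry call . ! prints 4: the 1 is pushed before + runs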

Derived operations

You might be wondering if this single operation can take the place of all of the uses of Scheme's cut macro. It can. Using only curry, call and basic stack shuffling words, it is easy to define several more complex operations. For something simple, here's how to partially apply a quotation to two arguments, rather than one:

: 2curry ( a b quot -- quot' )
curry curry ; inline

The inline declaration is there to allow things using 2curry to be compiled. Composing two quotations is slightly more complicated. We want this to be fast, so we'll build a quotation, using curry, which calls the two quotations in order. This is more efficient than composition using append.

: compose ( quot1 quot2 -- q1q2 )
[ >r call r> call ] 2curry ; inline

Another useful operation is curry*, which partially applies a quotation based on the second item on the stack. As an example, curry* map is a complete replacement for map-with. Because of this, all of the -with combinators have been removed in favor of direct use of curry.

: curry* ( param object quot -- object curry )
>r swap r> [ >r swap r> call ] 2curry ; inline
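
For example, to multiply each element of a sequence by a constant:

10 { 1 2 3 } [ * ] curry* map . ! prints { 10 20 30 }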


Practical applications

Aside from eliminating the need for all -with combinators (I'm very glad I'll never have to write another one myself!), curry makes writing all sorts of combinators much simpler. For an involved example, see the new definitions of the sequence combinators, which are not so scary anymore. For something simpler, take the problem of counting up the number of elements in a sequence which satisfy a certain property. Say you have the sequence { 1 2 4 1 2 1 } and you want to count the number of ones. A simple way to do this is to simply take the subset satisfying the property, and then count how long it is:

: count ( seq quot -- num )
subset length ; inline

Using this, code to count up the number of ones could be [ 1 = ] count. This pattern is used a few times in the Factor core, though there is no word named count. Using it, we can write a count= word which counts the number of times an element occurs in a sequence:

: count= ( seq elt -- num )
[ = ] curry count ;
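
For example:

{ 1 2 4 1 2 1 } 1 count= . ! prints 3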

But there's a downside to this: for no good reason, you build a new sequence, which is immediately consumed and discarded. What if we just counted up the length instead? The count combinator could be rewritten as:

: count ( seq quot -- num )
0 -rot [ rot >r call r> swap [ 1+ ] when ] curry each ;

There's a little bit of stack acrobatics around calling that quotation so that the quot has access to lower things on the stack.

If you think about it, there are actually two pieces here, which we can factor out. The first piece, which I'll call sigma, incrementally maps all values in a sequence to a new value, and sums the results. The second piece calls a quotation and converts the boolean result to either one or zero, which can be summed up. When these are separated, the implementation is even cleaner:

: sigma ( seq quot -- num )
[ rot >r call r> + ] curry 0 -rot each ; inline

: count ( seq quot -- num )
[ [ 1 ] [ 0 ] if ] compose sigma ; inline


Both of these are faster than their counterparts (map sum and subset length) by only a constant factor, but, just for fun, let's measure this. Right now, Factor doesn't have very advanced profiling tools, but I haven't felt the need for them. For many things, a simple timing function can work well. After entering in the definitions for sigma, the following lines can be entered at the listener to time the word against map sum:

: slow 1000000 [ ] map sum ;
: fast 1000000 [ ] sigma ;
{ slow fast } [ compile ] each
[ slow ] time ! prints time for map sum
[ fast ] time ! prints time for sigma

On my computer, slow takes 195ms, and fast takes 126ms, for around a 35% speedup!

Tuesday, July 31, 2007

Factor[ial] 101

If you're starting Factor, or any other stack-based language, you'll be a little confused about how the dataflow works. A good first exercise, which probably every successful Factor programmer has done at least once, is a word to calculate the factorial of a number. If you haven't done that yet, I strongly recommend that you implement it now, in any way you feel like. Afterwards, you can continue reading this blog post. If you've already implemented factorial before, skip down past the second section to the third. In case you forgot, the algorithm for calculating factorial can be described as:

0! = 1
(n+1)! = (n!)(n+1)

It is only defined on nonnegative integers, and it's OK if your implementation has weird behavior, such as infinite recursion, for input outside of this domain. (Mine does too.)


Chances are, you wrote something like this:

: factorial ( n -- n! )
dup zero? [ drop 1 ] [ dup 1- factorial * ] if ;

If you didn't, that's OK, because I'm going to explain this in excruciating detail. Hopefully, this will serve as a good introduction to non-Factorers or beginning Factorers. It should be readable even if you're not running Factor, but you might find it easier if you open up a listener to play around with at the same time.
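
If you have a listener open, try it out:

5 factorial . ! prints 120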

So, let's start at the beginning. The colon : indicates the start of a word (this is the Factor term for a function), which is called factorial. After that, the stack comment ( n -- n! ) describes the stack effect of the code. Informally, the stack comment here says that factorial takes n off the stack and pushes n! on. For example, if the stack consists of 3 6 3 1 5, then after invoking factorial, the stack will be 3 6 3 1 120. (The stack here is written bottom to top, left to right.)

Now let's look at the actual code. The implementation of the factorial word is everything after the stack comment and before the semicolon ;, which terminates the definition. So the definition is dup zero? [ drop 1 ] [ dup 1- factorial * ] if. I'll go through this word by word. dup is a so-called "stack shuffler" whose only purpose is to manipulate the data stack. This word duplicates the top item on the stack, so there are two copies of it. (Note that it doesn't actually copy the datum, it just copies a reference to it, except for the few objects that are stored on the stack by value.) So if the stack is 1 2 3, then after executing the dup word, the stack is 1 2 3 3. Other stack shufflers include swap, drop, nip, tuck, 2dup and over. These can be found in Factor's built-in documentation system. In general, stack shufflers, like other words, only deal with the top few items on the stack. Underneath what is known to be on the stack, words by convention leave the stack as it is.

The second word in the definition of factorial is zero?. Looking at that word's definition using \ zero? see, we can see that its stack effect is ( x -- ? ). In informal notation, ? in a stack effect means that that item is a boolean, either t or f. The fact that the word zero? returns a boolean is also indicated by the fact that the name ends with a question mark. (Note: this convention only applies when the word returns a single value. When it returns multiple values, one of which is a boolean, the question mark precedes the word, as in ?head.) So, the word zero? takes the object on the top of the stack and returns a boolean indicating whether or not that value is zero.

Let's look at what we have so far: dup zero?. Applied to a few inputs, here's what it yields:

1 2 3 dup zero? ==> 1 2 3 f
0 dup zero? ==> 0 t
dup zero? ==> Datastack underflow error

The last error happened because there wasn't enough on the stack to proceed. It was triggered by dup, which expects at least one item on the stack. Most stack underflow errors can be detected by the compiler beforehand.

As you might have guessed, the purpose of the dup before the zero? was so that we could save the value of the number and use it later, while also using it to test its zero-ness. Now, we'll enter a conditional, dependent on whether that value was zero. Following the code we've analyzed are two quotations enclosed in square brackets. These are like code blocks or anonymous functions in other programming languages, but at the same time, they can function as sequences. In their literal form, they are simply pushed onto the stack like other data literals. After these blocks is if, forming a conditional. if works like this: first, the top three items on the stack are popped. If the former third-from-the-top item is f, the top quotation is executed. Otherwise, the second-from-the-top quotation is executed. I've written extensively about conditionals in Factor in a previous blog post.

So we still have to look at the code within the quotations. The first branch, [ drop 1 ] is executed when the condition is true, that is, when the input to factorial is zero. That branch drops the item on the top of the stack (the original input) and pushes 1. So, in total, 0 factorial returns 1.

The second branch, [ dup 1- factorial * ], is a bit more complicated. First, the item on the top of the stack is duplicated, and 1 is subtracted from the result. So, if the stack before that operation was 1, then the stack afterwards should be 1 0. After that, factorial is called recursively. This should only affect the top of the stack. The stack now is 1 1. The word * takes two numbers off the top of the stack and pushes their product. So, for an input of 1, the result of this branch is 1.
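
If the dataflow is still hazy, here is a hand trace of 2 factorial, with the stack shown after each step:

2 dup zero? ==> 2 f ! not zero, so the second branch runs
dup 1- ==> 2 1
factorial ==> 2 1 ! 1! is 1, by the same process
* ==> 2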

I'm reading The Art of Computer Programming right now, so, in Knuth's style, I'll announce this: the proof that this code is correct is left as Exercise 1.


Now here's a funny piece of code which does the same thing:

: factorial ( n -- n! )
1 [ 1+ * ] reduce ;

Exercise 2 explains this code and proves it correct.


1. [HM50] Prove the first Factor code listing a correct implementation of factorial.

2. [HM50] (a) Explain the workings of the second code listing. (b) Prove the second code listing a correct implementation of factorial.

(In case you're wondering why these exercises are rated so difficult, you should know that nobody's ever formally described the semantics of Factor, and that is a prerequisite to a working proof of the correctness of any algorithm implementation. No, you can't look in the back of the book to find the answer.)


OK, OK, I'll give an answer to exercise 2(a), since it may be a bit confusing to those unacquainted with some of the more esoteric features of Factor. In Factor, all nonnegative integers are sequences. The length of an integer, viewed as a sequence, is the integer itself, and the element at each index is just that index. In other words, an integer n represents the range of integers from 0 to n-1, inclusive. So, to get factorial, all we need to do is add 1 to each of these, and find the product of the whole thing. A direct implementation of that would be

: factorial ( n -- n! )
[ 1+ ] map product ;

map is a combinator (a word which takes a quotation as an argument) which, like Common Lisp's mapcar, applies a function to each element of a sequence (or, in Lisp, a list) and returns the resulting sequence (or list). And product simply calculates the product of all the elements of a sequence. This implementation works as expected. However, it's a little more efficient if we don't build the whole new sequence, with one added to each element, and instead incrementally calculate the product as we range through the numbers 0 to n-1. To do that, we can use another combinator called reduce. reduce takes a sequence (such as an integer), a start value (such as 1), and a quotation. reduce executes the quotation with each element of the sequence on the top of the stack, and the accumulated value beneath it. The quotation returns a new accumulated value, which is ultimately the return value of the reduce. So, here, the quotation [ 1+ * ] adds one to the sequence element and then multiplies it into the accumulator.
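
To make this concrete, here is how 3 factorial unfolds under reduce. The accumulator starts at 1, and the "sequence" 3 has the elements 0, 1 and 2:

1 0 1+ * ==> 1 ! element 0
1 1 1+ * ==> 2 ! element 1
2 2 1+ * ==> 6 ! element 2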

Now, I'll expect the answers to the other two exercises to be on my desk Monday morning. You are dismissed.

Sunday, July 15, 2007

Messing around at the Factor REPL

Read Eval Print Loops, or REPLs, are really useful, I've found. One of my favorite uses, beyond prototyping, is playing around and solving problems. Often these are math problems, but for slightly more complicated things, I need string processing. Simple string processing things like this also make an easy example for explaining Factor to beginners like you. (If you're not a Factor beginner, you may want to stop reading.)

My younger brother is learning how to type right now. My dad found Tux Typing, a GPL typing program for Windows and Linux. So far, it's great; there's just one problem: when you use the word mode (where you have to type words before they fall to the bottom of the screen, creating horrible crashing sounds) there are only about 35 words in the long word setting. My dad asked me to fix this, and since it was a simple little task, I agreed.

I started by figuring out the file format for the dictionaries. It was easy: the first line was the title (I chose "Huge words") and after that, the words were listed in all caps, separated by Unix line endings. Next, I copied the text of the Wikipedia entry Economy of the United States into Wordpad and saved it to the desktop. Then I downloaded Factor to the computer I was working on and fired up the REPL.

The first thing in making the word list is getting a list of space-separated things. So I made a file reader object and got an array of all the lines. I joined these lines with a space, then split everything separated by a space (separating both lines and words on the same line).
"article.txt" <file-reader> lines " " join " " split

Now an array of words is lying on top of the stack. There are a bunch of operations we need to do to manipulate this, and Factor's sequence combinators help make it easier. So I made sure that each word had at least three letters in it:
[ length 3 >= ] subset

And I put all the words in upper case:
[ >upper ] map

And I made sure that each character of each word was an upper case letter, to filter out apostrophes, numbers, and other similar things:
[ [ LETTER? ] all? ] subset

And finally, I made sure there were no duplicates:
prune

So, to join this together in the correct file format and add a title, I used
"Huge words" add* "\n" join

yielding a string. To write this string back to the original file, all I needed was
"article.txt" <file-writer> [ print ] with-stream

And that's it! All in all, 10-15 minutes work and I got the game working with a good word list. But the REPL made it a lot easier.

Update: I didn't have the <file-reader> and <file-writer> displayed properly before, but it's fixed now. Thanks Sam!

Tuesday, January 16, 2007

Factor's make

One of the more distinctive features of Factor is a word called make. It's used for easily generating sequences where it wouldn't work to use a literal, a sequence initializer, or something that operates on an existing sequence. In its simplest form, make can be used like

[ 1 , 2 , 3 , ] { } make

which is equivalent to { 1 2 3 }. The word , appends the top of the stack to an invisible accumulator, which is output at the end. Of course, , can be used in a loop, and you can even call another word to add elements. For example,

: add-all ( seq -- ) [ , ] each ;
[ 3 add-all 4 add-all ] { } make

which generates { 0 1 2 0 1 2 3 }. [Note: this takes advantage of a curious property of Factor, that numbers themselves are sequences representing the range 0..n-1] add-all turns out to be a pretty useful word, so it's actually included in the Factor standard library under the name % (with a more efficient implementation). So the previous code could be rewritten [ 3 % 4 % ] { } make. These names may seem obscure at first, but you'll get used to them quickly.
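
% works with any sequence, not just integers, so for example

[ { 1 2 } % { 3 4 } % ] { } make

produces { 1 2 3 4 }.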

The importance of this is that, with make, you can do many complex things in constructing sequences that would be much more annoying if the accumulator had to be passed around explicitly. An example of this is take-until, defined using make and the imperative parsing code from previous entries:

: take-until ( quot -- string )
#! Take the substring of a string starting at spot
#! from code until the quotation given is true and
#! advance spot to after the substring.
[ [
dup slip swap dup [ get-char , ] unless
] skip-until ] "" make nip ;


Here's another example: say you want to write a function that takes a number and outputs the string "The answer to [whatever the number is] squared is [the number squared].". Factor doesn't require complicated string interpolation or inefficient multiple concatenations as other languages would; instead, it uses make. The word # converts a number to a string and then appends it to the current product of make. The code is below:

: square-description ( num -- string )
[
"The answer to " %
dup #
" squared is " %
sq #
"." %
] "" make ;

When used in this way, "" make turns out rather like the C function sprintf. In other cases, like most uses of { } make, it ends up acting as a substitute for Lisp's ` (quasiquote).

make isn't any sort of magical builtin; it's actually implemented in very simple Factor code. It lives in the namespaces vocabulary and is defined as follows:

: make ( quot exemplar -- seq )
>r [
V{ } clone building set
call
building get
] with-scope
r> like ; inline

In English, all that does is make a new empty vector, place it in the building variable (which is dynamically scoped), call the given quotation, retrieve the variable, and convert the contents to the given exemplar. It operates inside a new scope so any outside definition of building is unchanged, and make's value for it is invisible after the word is run.

Obviously, something's missing: how do elements get on this new sequence? The answer is that , puts them on. Its implementation is even simpler than that of make:

: , ( elt -- ) building get push ;

And that is basically all there is to make.

Update: That's not all, folks! There's also a way to push a whole sequence, all at once, onto the building. How? Using a word cryptically called %:

: % ( seq -- ) building get push-all ;

The push-all word is an in-place append.

Monday, January 1, 2007

Designing combinators in Factor

Factor's a great programming language, but one of the more annoying points of Factor is that it can be difficult to write complicated combinators, the Factor/Joy term for higher order functions. Here's one that I wrote recently. See if you can tell what it does:

: map-replace ( seq pred map -- )
pick dup length [ ! ( seq pred map elt i )
>r swap >r rot >r ! ( pred elt | seq map i )
[ swap call ] 2keep rot ! ( pred elt ? | seq map i )
[
r> r> swap >r rot >r ! ( elt map | pred seq i )
[ call ] keep ! ( new-elt map | pred seq i )
r> r> r> rot >r rot >r ! ( new-elt seq i | map pred )
swap [ set-nth ] keep ! ( seq | map pred )
r> r> swap ! ( seq pred map )
] [
drop r> swap r> r> drop ! ( seq map pred )
] if
] 2each 3drop ; inline

I showed this code to Slava Pestov, the creator of Factor, and he didn't understand it. This is ridiculously illegible, and the funny part is, it only has five words that actually *do* anything; the rest just move stuff around the stack.

This code looks complicated, but it actually does a pretty simple thing. It takes a mutable sequence and two quotations, a predicate and a map. The predicate is applied to each element. If the result is not f, the map quotation is run with the element, and the resulting value is put in the place where the element used to be. An example of this is

{ 1 2 3 } dup [ 2 = ] [ 3 + ] map-replace .

which outputs { 1 5 3 }. I wrote this up as part of some code to change XML code in place, as a supplement to xml-map, which makes a new tree entirely.

To simplify this, I wrote a new combinator which takes only one quotation. This quotation is run on the element. If the result is f, nothing happens to the element. But if it is something else, the item is set to the resulting value. This can be written as:

: map-replace2 ( seq quot -- )
over dup length [ ! ( seq quot elt i )
>r rot >r swap [ call ] keep ! ( result quot | seq i )
swap [ ! ( quot result | seq i )
r> r> swap [ set-nth ] keep ! ( quot seq )
] [ ! ( quot | seq i )
r> r> drop ! ( quot seq )
] if* swap
] 2each 2drop ; inline


In the end, it turned out all of this was almost completely pointless. Apparently, the Factor standard library already has a word called inject, which is like map, except that it works in place. This is basically what I wanted all along; I was just making the mistake of premature optimization, shying away from setting each element individually because I thought it'd be too costly. inject is defined as

: inject ( seq quot -- )
over length [
[ -rot change-nth ] 3keep
] repeat 2drop ; inline
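
A quick example of inject at the listener, in the same spirit as the map-replace example above:

{ 1 2 3 } dup [ 1+ ] inject .

This outputs { 2 3 4 }; the array is modified in place.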

Using inject, Slava implemented my first map-replace as

: (map-replace) ( pred quot elt -- newelt )
[ -rot slip ] keep rot [ swap call ] [ nip ] if ;

: map-replace ( seq pred quot -- )
rot
[ pick pick >r >r (map-replace) r> r> rot ] inject
2drop ;

but at this point, it looks like inject will be enough by itself. Now, going back to the original purpose, it is really simple to lift this to operate on XML trees rather than just sequences. xml-inject now joins the existing combinators xml-each, xml-map and xml-find, all of which were very simple to implement given the existing parallel combinators on sequences:

GENERIC: (xml-inject) ( quot tag -- ) inline
M: tag (xml-inject)
tag-children [
swap [ call ] keep
swap [ (xml-inject) ] keep
] inject ;
M: object (xml-inject) 2drop ;
M: xml-doc (xml-inject) delegate (xml-inject) ;
: xml-inject ( tag quot -- ) ! quot: tag -- tag
swap (xml-inject) ; inline


Update: Note that in the current version of Factor, inject is called change-each. Also, when writing new combinators now, it's very often useful to use a newer word called curry, which I described here.
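
As a taste of curry, which prepends a value to a quotation:

{ 1 2 3 } 10 [ + ] curry map . ! same as { 1 2 3 } [ 10 + ] map .

This prints { 11 12 13 }.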

Tuesday, December 19, 2006

Variants of if in Factor

Factor has many different combinators where other languages use only one. This can be confusing to beginners, who may spend a lot of time looking this sort of thing up using see. At first, all the variants may seem like unnecessary complication, but they actually allow code to be much more concise than it would be if the only combinator were if. The following article assumes basic knowledge of Factor syntax.

The first combinator for conditional execution, which I just mentioned, is if. This is used in the following way:

! code producing a value on the top of the stack
[
! code executed if the value on the top of the stack does not equal f
! (any value that is not f is true as far as Factor is concerned)
! this is equivalent to a "then" block
] [
! code executed if the value on the top of the stack equals f
! this is equivalent to an "else" block
] if

Neither of the two blocks of code has access to the value on the stack that chose the branch. Below is an example of a word that uses if:

: operate ( patient -- )
dup asleep? [
cut-open
] [
dup anaesthetize-more operate
] if ;


The simplest variation of if is when. when is used when the "else" block is empty. The following two lines are equivalent:

[ then code ] [ ] if
[ then code ] when

You'd use when in any sort of situation where action only needs to be taken when the predicate is true. Below is an example of usage:

: ?alarm ( patient -- )
dead? [ alarm set-off ] when ;


Another simple variation is unless, used when the "then" block is empty. The following two lines are equivalent:

[ ] [ else code ] if
[ else code ] unless

You'd use unless in any sort of situation where action only needs to be taken when the predicate is false.
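
Continuing the medical theme, a sketch (alive? is a made-up word with stack effect ( patient -- ? )):

: check-pulse ( patient -- )
alive? [ alarm set-off ] unless ;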

These conditionals all assume that you don't need the value of the predicate in the case where it is not f. When you do need the value, use if*, when* and unless*. Note that in the "else" branch, f is *not* put on the stack, since it conveys no information. The following two lines are equivalent:

dup [ then ] [ drop else ] if
[ then ] [ else ] if*

This may not seem useful at first, but it actually comes up a lot. unless* is particularly useful when you want to replace the top of the stack if it is f, but leave it as it is otherwise. Below is an example of unless*:

: knife ( -- knife )
scalpel [ hacksaw ] unless* ;

(where scalpel and hacksaw are both ( -- knife ))


There are some cases where you don't need to branch but just need to choose between two values, using a predicate. For this, you can use ?. The following two lines are equivalent:

[ 1 ] [ 2 ] if
1 2 ?

This is not always used for literal values; it can also be used for values taken from the stack. An example:

: family-message ( patient -- string )
dead? "My sincerest condolances for your son's death"
"Congratulations on the sucessful operation" ? ;


Factor programmers should be wary of deeply nested conditionals, as they are often evidence of poor factoring. However, sometimes they do become necessary, and Factor provides a Lisp-like cond for this case. The word iterates through a sequence of pairs, each containing a predicate and a quotation. The first predicate to return something other than f is selected, and its associated quotation executed. Generally, t is used to provide an "else" case. An example is below.

: cut-open ( patient -- )
knife {
{ [ dup not ] [ throw ] }
{ [ 2dup appropriate-knife? ]
[ swap body-part cut ] } ! body-part is ( patient -- part )
! cut is ( part -- )
{ [ t ] [ change-knife cut-open ] } ! change-knife is ( knife -- knife )
} cond ;

In this case, it might be preferable to use nested conditionals, but in the case where there are three or four real predicates which can be either true or false, cond makes things much simpler.

The final type of conditional is ?if. After more than a year of Factor programming, though, I still don't feel perfectly comfortable with it and have never found the need to use it. Of all the branching combinators, it is the least used (see below). However, it deserves mention because it is useful in some cases. The following two lines are equivalent:

[ then ] [ else ] ?if
dup [ nip then ] [ drop else ] if

?if is used when you have a condition value, which is passed to the "then" block if it is not f, and a default, which is passed to the "else" block otherwise.
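
Here is one contrived sketch where it fits; chart-of ( patient -- chart/f ) and admit ( patient -- chart ) are made up for the example:

: ?chart ( patient -- chart )
dup chart-of [ ] [ admit ] ?if ;

If the patient already has a chart, it ends up on the stack (the patient is nipped away); otherwise the patient is admitted to produce one.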

Using the command [ { if when unless if* when* unless* ?if ? cond } [ dup usage length swap set ] each ] make-hash, we can see how often each type of if is used. This yields the following hash associating words to the number of times they are used in the core:

H{
{ unless 44 }
{ when 100 }
{ cond 36 }
{ if 1455 }
{ ? 32 }
{ unless* 28 }
{ when* 59 }
{ ?if 26 }
{ if* 36 }
}

From this, it is evident that if is used far more often than the other types of conditionals. However, all types are useful and it is good to know all of them.