^{1}. Factor is a concatenative language. This means that programs are assembled by composing other operations. Trial and error has found that the only workable structure for functions to pass data between each other is a stack

^{2}All words in Factor work, in theory, by taking the stack, doing something to it, and returning a new stack, which can be passed on to the next word. In reality, most words only modify the top few items.

For those of you who slept through CS 101 (or, like me, never took it), here's an overview of stacks. We have just three operations:

`push`

, `pop`

, and `empty`

. The operation `push`

puts a datum on the end of the stack. The operation `pop`

deletes the top item from the stack, returning it. The operation `empty`

tests whether the stack has any items in it. Initially, we say the stack is empty. Stacks are most commonly implemented as linked lists or growable arrays. Wikipedia has more details.For our purposes, we'll assume that the

`empty`

operation is off-limits^{3}. Generally, we should know whether or not the stack is empty--it has all of our data on it, after all. If a word tries to use more things on the stack then exist, the program crashes with a

*stack underflow error*. This is a good thing, reflecting a real error with the structure of the program which needs to be fixed rather than ignored.

**The basics: constants and drop**

So, how can we make push and pop work with concatenative languages? We have no place to get the arguments for these, or return the results of these functions except the stack itself. Instead of using push and pop directly, slightly more complicated operations are used. These provide not a way to mess around with the stack, but also the data flow of the language.

First, the easy case: pushing constant values. Say you want to push 1 onto the stack. This is represented in Factor as, simply, 1. Think about it: when you have a literal, and when all data is stored on the stack, what else would you want to do with it besides push it onto that stack? In this way, we can actually think of 1 as a function from stacks to stacks, which takes one stack and returns another which has 1 pushed on to the front of it.

It's so much fun to push these values on the stack! But what if we want to get them off? We want to perform

`pop`

on the stack, but ignore the result. We can imagine this implemented as

_ = pop(stack)

You can think of it like dropping the top of the stack. In fact, that's what we'll call this operation,

`drop`

.We can try this all out at the Factor listener

^{4}. Open up Factor and push a bunch of numbers to the stack. To see what's there, you can use the command

`.s`

. The stack is printed from bottom to top order. To delete the item on the top of the stack, use the command `drop`

.**Building blocks: dup, swap and the retain stack**

Unfortunately, there's not much you can do with just those two things, pushing literals and dropping them, even if you add in primitive functions and functions derived from those. One useful action is to duplicate the top of the stack. Since "duplicate" is way too long to type so many times, we'll call it

`dup`

instead. `dup`

can be implemented as

item = pop(stack)

push(stack, item)

push(stack, item)

`dup`

is useful if you want to do two things with one piece of data. It's also useful if you want to apply the same piece of data twice to the same function. For example, let's look at the definition of `sq`

, which squares the number on the top of the stack^{5}:

: sq ( x -- x^2 )

dup * ;

This means that the one input to

`sq`

is sent as both arguments to `*`

. The part in parentheses is called the stack signature, a listing of inputs and outputs to the word. It is basically a comment; the names included are not significant, except to the reader^{5}. To give you an idea of what's going on here, the equivalent function in Haskell would be

sq x = x*x

Another basic building block is

`swap`

. This switches the top two things on the stack. Say you want to negate a number. The easiest way to do this, given a function `- ( x y -- x-y )`

, is something like this:

neg x = 0-x

If we translate

`0-x`

into Factor as `0 -`

, we get the wrong answer; this is equivalent to `x-0`

, an identity function. So, what we need to do is push 0 onto the stack, swap it with x, and then perform a subtraction:

: neg ( x -- -x )

0 swap - ;

The third basic operation is dipping under the top of the stack to do stuff below. For example, say you want to calculate, given x, y and z, (x+y)*z. The easiest way to do this is to dip under the top of the stack, add x and y, then rise back up and multiply that result by z. In Factor, the dipping down operation is

`>r`

and the rising up operation is `r>`

. All together, the Factor code for this function comes out as:

: (x+y)*z ( x y z -- (x+y)*z )

>r + r> * ;

(Yes, in Factor,

`(x+y)*z`

is a valid word name, since words are just whitespace-delimited chunks. And I was feeling a little unimaginative.)**Derived operations**

Using just those five basic operations--dup, drop, swap, >r and r>, we can implement any other stack shuffling operation

^{6}. Here are a few that turn out to be useful. The code here is extremely simple, but resist the temptation to do your own stack shuffling with the above operators and not the below: using derived operations makes your code look far cleaner and clearer to future readers.

`swapd`

swaps the two items under the top of the stack. The suffix "d" indicates that it does this underneath.

: swapd ( x y z -- y x z )

>r swap r> ;

`rot`

rotates the third item on the stack up to the top, pushing the top two down.

: rot ( x y z -- y z x )

swapd swap ;

`-rot`

rotates the top item on the stack down to the third position, pushing the two below up. It is the inverse of `rot`

.

: -rot ( x y z -- z x y )

swap swapd ;

`nip`

is like `drop`

, but it deletes the second item on the stack.

: nip ( x y -- x )

>r drop r> ;

`dupd`

duplicates the item second from the top of the stack, and puts the duplicate underneath. This is useful if you want to do something with the top two things on the stack, but retain what's second.

: dupd ( x y -- x x y )

>r dup r> ;

`over`

takes the second item on the stack and puts a copy of it over the top.

: over ( x y -- x y x )

dupd swap ;

`tuck`

tucks the top item on the stack under the second one

: tuck ( x y -- y x y )

swap over ;

`pick`

picks the third item on the stack and places a copy of it on top

: pick ( x y z -- x y z x )

>r over r> swap ;

In Factor, these are all implemented as primitives, but the only reason for this is performance. Note that, in the above descriptions, when I said "copy" I meant copying a pointer, not copying data (which can be done with

`clone`

).**Keep**

There's a really cool operator known as

`keep`

, which often makes code far more elegant. Strictly speaking, it's not a stack shuffler; it takes a quotation as an argument, making it a combinator, or higher order function. Basically, `keep`` encapsulates a simple, common pattern:`

dup >r ... r>

replacing it with

[ ... ] keep

There's also `2keep`

, which is equivalent to

2dup >r >r ... r> r>

These, with the analogous `3keep`

, are more useful than they first seem. They are used for any instance where you want to process the top one or more items on the stack, together with something underneath, while retaining those items on the top for further use.

**What's this all for?**

This list of functions probably looks daunting to memorize, and it is difficult at first. But it's useful, and they soon become second nature. In Factor and other concatenative languages, stack shufflers are the glue used to tie the output of certain functions to the input of other functions. No matter what programming language you're working in, everything ends up being a (perhaps convoluted) stringing together of different functions. Once you get used to it, using the stack and its associated transformations can be even more natural than using tons of named variables.

^{1} OK, technically, you could use locals for named variables. But it's all stack manipulation underneath. Also, there's dynamically scoped variables, but those aren't used the way lexically scoped locals typically are; they're used for passing data over greater distances.

^{2} Experimental languages have used a queue rather than, or in addition to, a stack. But this generally leads to more difficult to use--and compile--models. The original point-free language, FP, used function composition without a stack, but the way it uses lists is trivially equivalent to the stack. However, it ends up being somewhat awkward in practice. If you have an idea for how to improve either of these, or of another data structure for concatenative languages, I'd love to hear it. Please comment.

^{3} Strictly speaking, this isn't the case in Factor. In fact, a function to test whether the stack is empty is trivial to implement:

: stack-empty ( -- ? )

datastack empty? ;

But it's a bad idea to use this, since words should only care about their arguments, and they should take a fixed (or easy to determine, in the case of certain macros) number of arguments.

^{4} Here's how to get and open Factor. If you're on Windows or Mac, download the appropriate package from the Factor website and unpack it somewhere. On either platform, double click on the F icon, and a Factor listener window will pop up. On Linux or other Unixes, download the source, then `make`

with the appropriate target (run without arguments to see options), and run `./factor`

. See the readme for more details. No matter how you get it started, once you have the listener open, you can just type in words and press enter to execute them.

^{5} The structure of stack comments is fairly simple. Before the `--`

is the inputs, and after is the outputs. Both sides are written from the bottom of the stack to the top of the stack. For example, `( a b -- c d )`

indicates a word which takes `a`

second from the top of the stack, and `b`

on the top of the stack, pops those off, and returns `c`

second from the top and

d

on the top. The names should indicate something about the arguments, ideally more informative than their type (though that is often used). If the same name is used on the left and right sides of the `--`

, that implies that the same object exists before and after the function is called. This occurs most often in stack shufflers.^{6}Actually, there's a set of two combinators, cake and k, which can do

*anything*, including any stack shuffle, and call, and curry. They're described in a fun paper by Brent Kerby, The Theory of Concatenative Combinators.

Note: Another way to look at the stack shufflers is through a functional implementation of stacks in terms of linked lists with pattern matching. We'll call the front of the list the top of the stack Here's a few stack shufflers in Haskell, a language where

`:`

means cons:

drop (x:stack) = stack

swap (x:y:stack) = y:x:stack

dup (x:stack) = x:x:stack

over (x:y:stack) = y:x:y:stack

tuck (x:y:stack) = x:y:x:stack

rot (x:y:z:stack) = z:x:y:stack

dip f (x:stack) = x:f (stack)

keep f stack = dip f (dup stack)

This could be the basis for a dynamically typed concatenative language in Haskell, though it wouldn't work with static typing very well. (You could use nested tuples, with statically known types, are used instead. This requires a whole bunch of polymorphism and type classes. But with this, it is impossible to have a statically unknown stack depth. This makes it impossible to do certain types of recursion.)

Another way to think of it in Haskell is in terms of higher order functions:

drop f x = f -- same as const

swap f x y = f y x -- same as flip

dup f x = f x x

This, in my mind, more closely resembles the real data flow manipulation involved here. However, they are hard to compose in real programs.

## 7 comments:

You mention experimental queue-based languages; I did some exploring in that area, and I think they're more interesting than you indicate.

A 'normal' stack-based concatenative language actually isn't purely stack based. It has a stack, plus another unspecified data structure that holds the code. A stack+queue language specifies that code being executed is held on a queue.

Check out http://www.nsl.com/k/xy/xy.htm for an example.

Toy languages which use a queue rather than a stack are basically impossible to use. XY is a cool idea, but I think it'd be difficult to efficiently compile it (though you could still get reasonable performance if you use the optimized K functions a lot). But research is always interesting, and new models of concatenative languages are never a bad thing. Sorry if I was a little off-putting.

I think the stack effect comment for dupd is incorrect. it should be ( x y -- x y y ) no?

Logan, the stack effect is correct. In a stack effect, the top of the stack is at the right.

I think I have stack effect dyslexia. This is not the first time I've made that mistake ;)

I had the same idea and wrote a forth like syntax in lisp, with functions taking lists and returning lists. it's in my blog, but it's in spanish.

anonymous,

I know a little bit of Spanish, and Google knows more. Could you give me a link to your blog? Joy represents stacks as linked lists, so I wonder how similar the two ideas are. It's an interesting comment.

Post a Comment