For those of you who slept through CS 101 (or, like me, never took it), here's an overview of stacks. We have just three operations:
empty. The operation
pushputs a datum on the end of the stack. The operation
popdeletes the top item from the stack, returning it. The operation
emptytests whether the stack has any items in it. Initially, we say the stack is empty. Stacks are most commonly implemented as linked lists or growable arrays. Wikipedia has more details.
For our purposes, we'll assume that the
emptyoperation is off-limits3. Generally, we should know whether or not the stack is empty--it has all of our data on it, after all. If a word tries to use more things on the stack then exist, the program crashes with a stack underflow error. This is a good thing, reflecting a real error with the structure of the program which needs to be fixed rather than ignored.
The basics: constants and drop
So, how can we make push and pop work with concatenative languages? We have no place to get the arguments for these, or return the results of these functions except the stack itself. Instead of using push and pop directly, slightly more complicated operations are used. These provide not a way to mess around with the stack, but also the data flow of the language.
First, the easy case: pushing constant values. Say you want to push 1 onto the stack. This is represented in Factor as, simply, 1. Think about it: when you have a literal, and when all data is stored on the stack, what else would you want to do with it besides push it onto that stack? In this way, we can actually think of 1 as a function from stacks to stacks, which takes one stack and returns another which has 1 pushed on to the front of it.
It's so much fun to push these values on the stack! But what if we want to get them off? We want to perform
popon the stack, but ignore the result. We can imagine this implemented as
_ = pop(stack)
You can think of it like dropping the top of the stack. In fact, that's what we'll call this operation,
We can try this all out at the Factor listener4. Open up Factor and push a bunch of numbers to the stack. To see what's there, you can use the command
.s. The stack is printed from bottom to top order. To delete the item on the top of the stack, use the command
Building blocks: dup, swap and the retain stack
Unfortunately, there's not much you can do with just those two things, pushing literals and dropping them, even if you add in primitive functions and functions derived from those. One useful action is to duplicate the top of the stack. Since "duplicate" is way too long to type so many times, we'll call it
dupcan be implemented as
item = pop(stack)
dupis useful if you want to do two things with one piece of data. It's also useful if you want to apply the same piece of data twice to the same function. For example, let's look at the definition of
sq, which squares the number on the top of the stack5:
: sq ( x -- x^2 )
dup * ;
This means that the one input to
sqis sent as both arguments to
*. The part in parentheses is called the stack signature, a listing of inputs and outputs to the word. It is basically a comment; the names included are not significant, except to the reader5. To give you an idea of what's going on here, the equivalent function in Haskell would be
sq x = x*x
Another basic building block is
swap. This switches the top two things on the stack. Say you want to negate a number. The easiest way to do this, given a function
- ( x y -- x-y ), is something like this:
neg x = 0-x
If we translate
0-xinto Factor as
0 -, we get the wrong answer; this is equivalent to
x-0, an identity function. So, what we need to do is push 0 onto the stack, swap it with x, and then perform a subtraction:
: neg ( x -- -x )
0 swap - ;
The third basic operation is dipping under the top of the stack to do stuff below. For example, say you want to calculate, given x, y and z, (x+y)*z. The easiest way to do this is to dip under the top of the stack, add x and y, then rise back up and multiply that result by z. In Factor, the dipping down operation is
>rand the rising up operation is
r>. All together, the Factor code for this function comes out as:
: (x+y)*z ( x y z -- (x+y)*z )
>r + r> * ;
(Yes, in Factor,
(x+y)*zis a valid word name, since words are just whitespace-delimited chunks. And I was feeling a little unimaginative.)
Using just those five basic operations--dup, drop, swap, >r and r>, we can implement any other stack shuffling operation6. Here are a few that turn out to be useful. The code here is extremely simple, but resist the temptation to do your own stack shuffling with the above operators and not the below: using derived operations makes your code look far cleaner and clearer to future readers.
swapdswaps the two items under the top of the stack. The suffix "d" indicates that it does this underneath.
: swapd ( x y z -- y x z )
>r swap r> ;
rotrotates the third item on the stack up to the top, pushing the top two down.
: rot ( x y z -- y z x )
swapd swap ;
-rotrotates the top item on the stack down to the third position, pushing the two below up. It is the inverse of
: -rot ( x y z -- z x y )
swap swapd ;
drop, but it deletes the second item on the stack.
: nip ( x y -- x )
>r drop r> ;
dupdduplicates the item second from the top of the stack, and puts the duplicate underneath. This is useful if you want to do something with the top two things on the stack, but retain what's second.
: dupd ( x y -- x x y )
>r dup r> ;
overtakes the second item on the stack and puts a copy of it over the top.
: over ( x y -- x y x )
dupd swap ;
tucktucks the top item on the stack under the second one
: tuck ( x y -- y x y )
swap over ;
pickpicks the third item on the stack and places a copy of it on top
: pick ( x y z -- x y z x )
>r over r> swap ;
In Factor, these are all implemented as primitives, but the only reason for this is performance. Note that, in the above descriptions, when I said "copy" I meant copying a pointer, not copying data (which can be done with
There's a really cool operator known as
keep, which often makes code far more elegant. Strictly speaking, it's not a stack shuffler; it takes a quotation as an argument, making it a combinator, or higher order function. Basically,
keepon the top. The names should indicate something about the arguments, ideally more informative than their type (though that is often used). If the same name is used on the left and right sides of the
encapsulates a simple, common pattern:d
dup >r ... r>
replacing it with
[ ... ] keep
2keep, which is equivalent to
2dup >r >r ... r> r>
These, with the analogous
3keep, are more useful than they first seem. They are used for any instance where you want to process the top one or more items on the stack, together with something underneath, while retaining those items on the top for further use.
What's this all for?
This list of functions probably looks daunting to memorize, and it is difficult at first. But it's useful, and they soon become second nature. In Factor and other concatenative languages, stack shufflers are the glue used to tie the output of certain functions to the input of other functions. No matter what programming language you're working in, everything ends up being a (perhaps convoluted) stringing together of different functions. Once you get used to it, using the stack and its associated transformations can be even more natural than using tons of named variables.
1 OK, technically, you could use locals for named variables. But it's all stack manipulation underneath. Also, there's dynamically scoped variables, but those aren't used the way lexically scoped locals typically are; they're used for passing data over greater distances.
2 Experimental languages have used a queue rather than, or in addition to, a stack. But this generally leads to more difficult to use--and compile--models. The original point-free language, FP, used function composition without a stack, but the way it uses lists is trivially equivalent to the stack. However, it ends up being somewhat awkward in practice. If you have an idea for how to improve either of these, or of another data structure for concatenative languages, I'd love to hear it. Please comment.
3 Strictly speaking, this isn't the case in Factor. In fact, a function to test whether the stack is empty is trivial to implement:
: stack-empty ( -- ? )
datastack empty? ;
But it's a bad idea to use this, since words should only care about their arguments, and they should take a fixed (or easy to determine, in the case of certain macros) number of arguments.
4 Here's how to get and open Factor. If you're on Windows or Mac, download the appropriate package from the Factor website and unpack it somewhere. On either platform, double click on the F icon, and a Factor listener window will pop up. On Linux or other Unixes, download the source, then
makewith the appropriate target (run without arguments to see options), and run
./factor. See the readme for more details. No matter how you get it started, once you have the listener open, you can just type in words and press enter to execute them.
5 The structure of stack comments is fairly simple. Before the
--is the inputs, and after is the outputs. Both sides are written from the bottom of the stack to the top of the stack. For example,
( a b -- c d )indicates a word which takes
asecond from the top of the stack, and
bon the top of the stack, pops those off, and returns
csecond from the top and
--, that implies that the same object exists before and after the function is called. This occurs most often in stack shufflers.
6 Actually, there's a set of two combinators, cake and k, which can do anything, including any stack shuffle, and call, and curry. They're described in a fun paper by Brent Kerby, The Theory of Concatenative Combinators.
Note: Another way to look at the stack shufflers is through a functional implementation of stacks in terms of linked lists with pattern matching. We'll call the front of the list the top of the stack Here's a few stack shufflers in Haskell, a language where
drop (x:stack) = stack
swap (x:y:stack) = y:x:stack
dup (x:stack) = x:x:stack
over (x:y:stack) = y:x:y:stack
tuck (x:y:stack) = x:y:x:stack
rot (x:y:z:stack) = z:x:y:stack
dip f (x:stack) = x:f (stack)
keep f stack = dip f (dup stack)
This could be the basis for a dynamically typed concatenative language in Haskell, though it wouldn't work with static typing very well. (You could use nested tuples, with statically known types, are used instead. This requires a whole bunch of polymorphism and type classes. But with this, it is impossible to have a statically unknown stack depth. This makes it impossible to do certain types of recursion.)
Another way to think of it in Haskell is in terms of higher order functions:
drop f x = f -- same as const
swap f x y = f y x -- same as flip
dup f x = f x x
This, in my mind, more closely resembles the real data flow manipulation involved here. However, they are hard to compose in real programs.