The basic idea behind the XML combinators is to let XML be processed the same way Lispers (and Factorers) have always processed lists and other sequences: with
map each find subsetand now,
inject. These operations are slightly more complicated when used with a tree structure like XML, rather than a flat sequence as the operations were originally designed for. Fortunately, Factor's generic word system makes this relatively easy.
There's one consideration necessary in designing these that I wasn't happy with: since I had to iterate over all nodes, not just the leaf nodes. In all combinators, the result can be affected by which order is used. I made a relatively arbitrary decision and said that first the parent nodes were processed and then the leaves.
A note before the code of
xml-each, the simplest of the combinators: this uses Factor's generic words, which dispatch only on the top of the stack.
xml-eachmust dispatch on the class of the XML tag, not the quotation. However, the Factor convention for combinators is that the quotation is on the top of the stack, because this looks better syntactically. So I need to do a swap before entering the generic word. That said, here's the combinator:
GENERIC: (xml-each) ( quot tag -- ) inline
M: tag (xml-each)
[ swap call ] 2keep
tag-children [ (xml-each) ] each-with ;
M: object (xml-each)
swap call ;
M: xml-doc (xml-each)
delegate (xml-each) ;
: xml-each ( tag quot -- ) ! quot: tag --
swap (xml-each) ; inline
The rest of the combinators are implemented in much the same way, with only trivial changes to make different ones. Unlike some combinators (see my previous post) this is very simple to implement, not in small part because I have existing combinators to help. (Unfortunately, it doesn't seem to be possible to abstract a general "lift" combinator (analogous to
liftMin Haskell for monads) to turn
xml-map, despite the similarities between them).
A really simple usage of
[ write-item ] xml-each, which prints out each element of the given XML tag, document, or other type of element.
As a more in-depth example,
xml-find(which works just like find, but without returning an index) is used to implement the equivalent of
: get-id ( tag id -- elem )
dup tag? [
[ string? ] subset concat
] [ drop f ] if
] xml-find nip ;
(Note: one weakness of the XML utilities that currently exist is that it's annoying to deal with XML tag attributes, though this should be fixed when I do more real work using them)