My younger brother is learning how to type right now. My dad found Tux Typing, a GPL typing program for Windows and Linux. So far, it's great; there's just one problem: when you use the word mode (where you have to type words before they fall to the bottom of the screen, creating horrible crashing sounds) there are only about 35 words in the long word setting. My dad asked me to fix this, and since it was a simple little task, I agreed.
I started by figuring out the file format for the dictionaries. It was easy: the first line was the title (I chose "Huge words") and after that, the words were listed in all caps, one per line, with Unix line endings. Next, I copied the text of the Wikipedia entry Economy of the United States into WordPad and saved it to the desktop. Then I downloaded Factor to the computer I was working on and fired up the REPL.
The first step in making the word list is breaking the article into space-separated tokens. So I made a file reader object and got an array of all the lines. I joined these lines with a space, then split the result on spaces, which separates words on different lines as well as words on the same line.
"article.txt" <file-reader> lines " " join " " split
Now an array of words is lying on top of the stack. There are a bunch of operations we need to do to manipulate this, and Factor's sequence combinators help make it easier. So I made sure that each word had at least three letters in it:
[ length 3 >= ] subset
And I put all the words in upper case:
[ >upper ] map
And I made sure that each character of each word was an upper case letter, to filter out apostrophes, numbers, and other similar things:
[ [ LETTER? ] all? ] subset
And finally, I made sure there were no duplicates:
prune
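For readers who don't know Factor, the steps so far can be sketched in Python. This is a rough equivalent rather than the code that was actually run: the sample lines and the name clean_words are made up for illustration, the letter check assumes plain ASCII A-Z, and whether Factor's prune keeps first-seen order the way this sketch does is also an assumption.

```python
# Hypothetical sample standing in for the lines of article.txt.
lines = ["The economy of the United States", "is the world's largest in 2006."]

# Join the lines with a space, then split on single spaces.
tokens = " ".join(lines).split(" ")


def clean_words(tokens):
    """Apply the four filtering steps in the same order as the Factor snippets."""
    # Keep only tokens with at least three characters.
    long_enough = [t for t in tokens if len(t) >= 3]
    # Put every token in upper case.
    upper = [t.upper() for t in long_enough]
    # Keep tokens made only of A-Z (assumption: ASCII letters only).
    letters_only = [w for w in upper if all("A" <= c <= "Z" for c in w)]
    # Drop duplicates; dict.fromkeys keeps the first occurrence of each word.
    return list(dict.fromkeys(letters_only))


words = clean_words(tokens)
print(words)  # prints ['THE', 'ECONOMY', 'UNITED', 'STATES', 'LARGEST']
```

In this Python version, blank lines or double spaces in the input produce empty strings after the split, and the length check discards them along with the short words.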
So, to join this together in the correct file format and add a title, I used
"Huge words" add* "\n" join
yielding a string. To write this string back to the original file, all I needed was
"article.txt" <file-writer> [ print ] with-stream
And that's it! All in all, it took 10-15 minutes of work, and the game now has a good word list. The REPL made the whole thing a lot easier.
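The dictionary format described earlier (a title line, then one all-caps word per line, with Unix line endings) can also be sketched in Python. Again, this is only an illustration under stated assumptions: write_word_list and the file name huge_words.txt are hypothetical, and the sketch writes to a temporary directory rather than back over the article file.

```python
import os
import tempfile


def write_word_list(path, title, words):
    """Write the title on the first line, then one word per line."""
    # newline="\n" forces Unix line endings, even on Windows.
    with open(path, "w", newline="\n") as f:
        f.write("\n".join([title] + words) + "\n")


# Hypothetical output location and word list, for illustration only.
path = os.path.join(tempfile.mkdtemp(), "huge_words.txt")
write_word_list(path, "Huge words", ["ECONOMY", "UNITED", "STATES"])

# Read the file back as bytes to check the exact line endings.
with open(path, "rb") as f:
    data = f.read()
print(data)  # prints b'Huge words\nECONOMY\nUNITED\nSTATES\n'
```

Forcing the newline matters on Windows, where text mode would otherwise write \r\n and break the Unix line endings the format uses.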
Update: I didn't have the <file-reader> and <file-writer> displayed properly before, but it's fixed now. Thanks Sam!
1 comment:
Looks like <file-reader> and <file-writer> aren't properly displayed, as you forgot to escape < and > in your blog post.