Saturday, January 30, 2010

Runtime type feedback in Factor

I realized today that, thanks to all the great machinery that Slava Pestov has built, Factor already has everything needed to support feedback-directed optimizations. To demonstrate this, I implemented a simple version of type feedback. The whole thing is less than 30 lines of code. It could definitely use a bit of improvement, but in a simple benchmark, I recorded a speedup of more than two times compared to the purely statically optimized version.

The idea underlying runtime type feedback is that we can produce specialized versions of code based on the types it is used with. For example, if we consider a word like append. On each iteration of the main loop, it calls a method on one of the two inputs to read from the input sequence. It also calls a method on the output to write to it. In general, none of these types are known until the program runs. However, if we know that the inputs are both arrays, then the output will also be an array and all of these method calls can be inlined using the static analysis of the compiler's propagation pass. This makes the program run significantly faster.

The existing hints mechanism capitalizes on this. Hints make it possible to declare, "this word is likely to be called with the following input from the following classes". With this declaration, the compiler can make a few specialized versions of a word annotated with hints, one for each predicted combination of classes. When the word gets called, it does a series of type checks to choose which version to use, based on the types of the arguments. If none of the types match, it calls the default version.

The problem with this is that you have to declare, in advance, what types you expect to come up. Sometimes, you don't really know what types to expect until the program runs. This is where runtime type feedback comes in. In this system, while the program runs, the language runtime collects profile information about what types words are being called with. If a word is called with a particular type enough times, then hints are added to the word to make it specialized on that type. The word is recompiled with these hints, and future calls to the word have access to this specialized version.

When I first heard about type feedback, it sounded really hard to implement. But it turns out, because Factor already has the hints mechanism and supports runtime redefinition of words, that type feedback is easy. All I did was use a hashtable to store the frequencies of types that a word was called with, and add hints based on this.

There are some disadvantages to this approach. The first is that the compiler must be packaged with any program that wants to use type feedback at runtime. This takes up memory at runtime and makes the program take longer to start up because of the bigger executable. Additionally, any runtime optimization makes programs run slower at the beginning. The system has a certain degree of "warmup time" where it is not running at full speed because runtime optimizations have not yet been applied. An extreme case of this is in a full just-in-time (JIT) compiler, where all compilation takes place at runtime, and the program is initially running in an interpreter or something similar. This would not be quite as big of an issue here, where runtime type feedback is added to an ahead-of-time (AOT) compiler, but it's still a problem.

My implementation of type feedback is far from perfect, and should be considered a proof of concept at this point. It won't be included as a default feature of Factor any time soon. First, the code that performs profiling might be faster if this code were written in a lower-level style. On some benchmarks, the cost of profiling in this implementation outweighs the benefits of the type feedback. Second, the way I use hints is pretty inefficient. I add a hint and then recompile the entire word with all the hints. So if n type combinations are added, then it spends O(n2) time compiling, where it should have only spent O(n) time. Third, hints are not implemented as well as they could be in terms of runtime performance. A better implementation, based on multimethods, would probably improve the performance of type feedback. Finally, the Factor compiler itself isn't as fast as it could be. Compilation speed matters much more in the presence of runtime compilation than for AOT compilation, so this hasn't been the top priority so far.

Despite these flaws, runtime type feedback, even implemented naively, creates a significant speedup. It seems like dynamic optimizations have a significant benefit for dynamically typed, object-oriented languages like Factor, that static optimizations alone can't completely match.