Fri 21 - Thu 27 October 2011 Portland, Oregon, United States

When optimizing large-scale applications, striking the balance between steady-state performance, start-up time, and code size has always been a grand challenge. While recent advances in trace compilation have significantly improved the steady-state performance of trace JITs for large-scale Java applications, the size control aspect of a trace compilation system remains largely overlooked. For instance, using the DaCapo 9.12 benchmarks, we observe that 40% of traces selected by a state-of-the-art trace selection algorithm are short-lived and, on average, each selected basic block is replicated 13 times in the trace cache.

This paper studies the size control problem for a class of commonly used trace selection algorithms and proposes six techniques to reduce the footprint of trace selection without incurring any performance loss. The crux of our approach is to target redundancies in trace selection in the form of either short-lived traces or unnecessary trace duplication.

Using one of the best performing selection algorithms as the baseline, we demonstrate that, on the DaCapo 9.12 benchmarks and DayTrader 2.0 on WebSphere Application Server 7.0, our techniques reduce the code size and compilation time by 69% and the start-up time by 43% while retaining the steady-state performance. On DayTrader 2.0, an example of large-scale application, our techniques also improve the steady-state performance by 10%.