While there has been decades of work on developing automatic, locality-enhancing transformations for regular programs that operate over dense matrices and arrays, there has been little investigation of such transformations for irregular programs, which operate over pointer-based data structures such as graphs, trees and lists. In this paper, we argue that, for a class of irregular applications we call traversal codes, there exists substantial data reuse and hence opportunity for locality exploitation. We develop a novel optimization called point blocking, inspired by the classic tiling loop transformation, and show that it can substantially enhance temporal locality in traversal codes. We then present a transformation and optimization framework called TreeTiler that automatically detects opportunities for applying point blocking and applies the transformation. TreeTiler uses autotuning techniques to determine appropriate parameters for the transformation. For a series of traversal algorithms drawn from real-world applications, we show that TreeTiler is able to deliver performance improvements of up to 245% over an optimized (but non-transformed) parallel baseline, and in several cases, significantly better scalability.
