SPLASH 2011
Fri 21 - Thu 27 October 2011 Portland, Oregon, United States

In dealing with the container bloat problem, we identify five memory compaction techniques, which can be used to reduce the footprint of the large number of small objects that make these containers. Using these techniques, we describe two alternative methods for more efficient encoding of the JRE’s ubiquitous HashMap data structure, and present a mathematical model in which the footprint of this can be analyzed. The fused hashing encoding method reduces memory overhead by 20%—45% on a 32-bit environment and 45%—65% on a 64-bit environment. This encoding guarantees these figures as lower bound regardless of the distribution of keys in hash buckets. The more opportunistic squashed hashing, achieves expected savings of 25%—70% on a 32-bit environment and 30%—75% on a 64-bit environments, but these savings can degrade and are not guaranteed against bad (and unlikely) distribution of keys to buckets. Both techniques are applicable and give merit to an implementation of HashSet which is independent of that of HashMap. Benchmarking using the SPECjvm2008, SPECjbb2005 and DaCapo suites does not demonstrate significant major slowdown or speedup. For TreeMap we show two encodings which reduce the overhead of tree nodes by 43% & 46% on a 32-bit environment and 55% & 73% on a 64-bit environment. These also give to separating the implementation of TreeSet from that of TreeMap, which gives rise to footprint reduction of 59% & 54% on a 32-bit environment and 61% & 77% on a 64-bit environment.