Monday, October 24, 2011

Boring charts

Here is a chart that shows the amount of free space remaining after each garbage collection (MIT/GNU Scheme compiling itself, heap size 1935360 words):
Compare that chart with this one, which shows the amount of free space remaining after each GC with heaps of sizes 8388608, 16777216, 33554432, and 100663296 words. Note that the y axis now uses a log scale so that the wide variation in heap size fits on one chart.
As we move to larger and larger heap sizes, the proportion of memory that is freed after each GC increases. This makes sense because the program allocates memory in the same way regardless of how much heap is present (a program that adjusts its allocation profile in response to the heap size would look much, much different). When the heap is small, large allocations of temporary storage (like the one at about 8.5×10^8) make a significant difference in the amount of free space, but when the heap is big, these differences virtually disappear.
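
To put rough numbers on that, here is a small Python sketch. The heap sizes are the ones used for the charts, but LIVE_WORDS and SPIKE_WORDS are made-up values chosen only for illustration, not measurements from the compiler run. It prints the fraction of the heap that is free after a GC, with and without a temporary allocation spike, for each heap size.

```python
# Heap sizes (in words) taken from the charts in this post.
HEAP_SIZES = [1935360, 8388608, 16777216, 33554432, 100663296]

# ASSUMPTION: hypothetical amount of data that survives a GC, in words.
LIVE_WORDS = 1_000_000
# ASSUMPTION: hypothetical size of a burst of temporary storage, in words.
SPIKE_WORDS = 500_000


def free_fraction(heap_words, live_words):
    """Fraction of the heap that is free right after a collection."""
    return (heap_words - live_words) / heap_words


for heap in HEAP_SIZES:
    base = free_fraction(heap, LIVE_WORDS)
    spiked = free_fraction(heap, LIVE_WORDS + SPIKE_WORDS)
    # The dip is how much of the heap the spike takes away from free space.
    print(f"{heap:>10} words: free {base:6.1%}, "
          f"with spike {spiked:6.1%}, dip {base - spiked:6.2%}")
```

With the same spike, the dip shrinks as the heap grows, which is why the larger heaps look flat.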

It is almost disappointing that the detail is lost when we use larger heap sizes. The charts for the large heap sizes are essentially flat. At large heap sizes, the variations in temporary storage usage become insignificant. In other words, the number of garbage collections becomes less dependent upon the moment-by-moment allocation of the program being run and approaches the simple limit of the total allocation divided by the heap size.
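
The limit can be sketched the same way. In the Python below, TOTAL_ALLOCATION and LIVE_WORDS are hypothetical round numbers, not figures from the actual run. One estimate accounts for the live data that each GC has to keep; the other is the bare limit of total allocation divided by heap size.

```python
# Heap sizes (in words) taken from the charts in this post.
HEAP_SIZES = [1935360, 8388608, 16777216, 33554432, 100663296]

# ASSUMPTION: hypothetical total words allocated over the whole run.
TOTAL_ALLOCATION = 1_000_000_000
# ASSUMPTION: hypothetical amount of data that survives each GC, in words.
LIVE_WORDS = 1_000_000


def gc_estimate_with_live_data(total, heap, live):
    """Estimated GC count when each collection recovers (heap - live) words."""
    return total / (heap - live)


def gc_estimate_limit(total, heap):
    """Limiting estimate that ignores live data: total allocation / heap size."""
    return total / heap


for heap in HEAP_SIZES:
    detailed = gc_estimate_with_live_data(TOTAL_ALLOCATION, heap, LIVE_WORDS)
    limit = gc_estimate_limit(TOTAL_ALLOCATION, heap)
    print(f"{heap:>10} words: ~{detailed:8.1f} GCs, "
          f"limit {limit:8.1f}, ratio {detailed / limit:.3f}")
```

Under these assumptions the ratio between the two estimates heads toward 1 as the heap grows, so the details of what the program keeps alive matter less and less.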

2 comments:

kbob said...

Would you see anything interesting in the bigger heaps if you plotted words in use instead of words free?

x = total allocations (same as now)
y = heap size - words free

At the very least, that should keep the huge free space from hiding the signal.

Also, what is your word size? 4 or 8 bytes?

Joe Marshall said...

kbob asked: Would you see anything interesting in the bigger heaps if you plotted words in use instead of words free?
Yes. The chart looks roughly like that first one, but flipped upside down. The ‘in use’ charts are much more interesting to look at, and I wanted to discuss them in more detail, but they detract from the key point: the details are irrelevant at larger heap sizes.


Also, what is your word size? 4 or 8 bytes?
I'm using the 64-bit version of MIT/GNU Scheme, so the word size is 8 bytes.