I read, via Don Box, Jan Gray’s monumental piece about performance of managed code in .NET. If you care about performance in general it’s a good read. This provoked a lot of thought and I’ll write more, but also suggested a specific coding technique for making loops faster; I tried it out and it failed the first rough-cut test, but suggests an improvement for future language designs. (Warning: in-the-trenches geeky.) (Update: on iterators and dynamic languages and Java and C#, with more benchmarks.)
(Update: massively-erudite write-up from Erik van Konijnenburg.)

Update · Boy, this one touched a nerve, setting (I think) an all-time record for incoming email. Write-up tacked on at the end, but here is a massive, fascinating deep-dive into the subject from Erik van Konijnenburg, who detours into FORTRAN-land, where they really care about their loops. I haven’t had time to give it the attention it obviously deserves.

Array Dereferencing · Gray does some measurements and discovers that array references are a little more expensive than you’d like in .NET because it does bounds-checking on each one; the same is doubtless true in Java. This is a little bit irritating, especially if you believe as I do that arrays of native integers are the ideal in-memory data structure; no others come close in terms of conceptual simplicity and low impedence mismatch with the CPU’s view of the world.

In fact, consider the single most common idiom for looping through the elements of an array:

for (int i = 0; i < a.length; i++)
  doWhateverTo(a[i]);

I wonder, since the virtual machine is bounds-checking each reference to a[i], why it is that I have to redundantly check i against a.length every time around the loop?

So, since I had insomnia anyhow, I wrote a little teeny benchmark to investigate this a bit. Here's the core:

  int frob(int in)
  {
    return (in % 1234) & 0xff;
  }

  int sum1(int [] a)
  {
    int sum = 0;
    for (int i = 0; i < a.length; i++)
      sum += frob(a[i]);
    return sum;
  }

  int sum2(int [] a)
  {
    int sum = 0;
    int i = 0;
    try
    {
      while (true)
	sum += frob(a[i++]);
    }
    catch (ArrayIndexOutOfBoundsException e)
    {
      if (i >= a.length)
	return sum;
      throw e;
    }
  }

The trick is that the sum2() loop is simpler but the VM has to build, throw, and catch an exception to get out of it. So you’d expect that if this is a good idea at all, it would do better on big rather than small arrays. I ran this against a million array elements in five different ways, summarized in the table below (the numbers are seconds). The results are partially as you’d expect in that the cost of the exception-catching method falls inversely as the size of the array.

RunsArray Sizesum1()sum2()
100001000.0452.063
100010000.0430.206
100100000.0350.068
101000000.0510.048
110000000.0470.048

However, the exception-catching never really gets to be clearly better than the old-fashioned technique. So this is probably not a coding breakthrough (although it might be on some other JVMs, and I’d be interested to see what happened if someone recompiled that into C#, the source including mainline is here). However, I do wonder why designers of modern languages don’t provide an idiom for looping along the lines of:

for (int i indexes a)
  doWhateverTo(a[i]);

Update June 12 · I got lots of email on this one. They told me some things I already knew, and some useful things I didn’t know.

Research · Piotr Kaminski writes to point at research by Michael Zastre. Stan Dyck relates the story of someone who claims running the loop backward from a.length to 0 would be faster.

Dynamic Languages · Every dynamic language on the planet has this kind of facility, of course. (Thanks to Andrew Sidwell and Adam Turoff) Some perl idioms look like this:

foreach $i (0 .. #$array)
foreach $i (@array)

Some python stuff looks like this (thanks to Fazal Majid and Simon Willison):

for x in a:
	doWhateverTo(x)

And:

for line in open('/etc/passwd'):
	print line.split(':')[0]

Matt Brubeck points out you can do this with Javascript too.

Look, I knew this, OK? I don’t to hurt the feelings over in dynamic-language land, but if I’m obsessing about squeezing the last few cycles out of traversing an array, I ain’t programming in Perl or Python or any of their ilk.

Java and C# · Java 1.5 is going to have an iterator that looks like this (thanks to Piotr Kaminski and Elliot Hughes):

int[] xs;
for (x : xs) {
    f(x);
}

C# has this already built-in (thanks to Ben Meadowcroft and Jim Ancona):

foreach (int i in integerArray)
    doWhateverTo(i);

Jim Ancona went so far as to translate my benchmark into C# and run it once with the optimizer off:

RunsArray Sizesum1()sum2()foreach
100001000.0700.4910.060
100010000.1000.0900.040
100100000.0600.0910.050
101000000.0800.0900.080
110000000.0800.0800.090

And once with it on:

RunsArray Sizesum1()sum2()foreach
100001000.0500.4410.050
100010000.0600.0900.051
100100000.0600.0200.030
101000000.0900.0100.090
110000000.0500.0600.050

Jim writes:

The non-optimized results seem to pretty much parallel your Java results, except the optimizer seems to be able to do something with your exception-throwing approach, and the CLR exception-throwing overhead seems generally lower. The 'foreach' approach pretty much tracks the normal 'for' loop, so maybe they aren't turning off bounds checking.

Note that neither Java nor C#, unlike what I suggested, straightforwardly give you access to the index into the array, so would be less useful for dealing with multiple parallel arrays.

Anyhow.


author · Dad · software · colophon · rights
picture of the day
June 11, 2003
· Technology (81 fragments)
· · Coding (98 more)

By .

I am an employee
of Amazon.com, but
the opinions expressed here
are my own, and no other party
necessarily agrees with them.

A full disclosure of my
professional interests is
on the author page.