Subtleties with loops

A common task in imperative programming languages is writing a loop. A loop that can terminate requires a way to check the terminating condition and a way to repeatedly execute some part of the code. These two mechanisms exists in many forms: from the crudest approach of using an if and a goto (that must jump backwards in the code) to higher-level structured constructs like for and while ending in very high-level constructs built around higher-order functions in for_each-like constructs and more recently, in the context of GPU programming, the idea of a kernel function instantiated over a n-dimensional domain (where typically n ≤ 3 but most of the time n = 1).

These more advanced mechanisms make writing loops a commonplace task and typically regarded as uneventful. Yet, there are situations when things get subtler than we would like.

A ranged-loop over integers

Let’s consider a construct like this in some sort of pseudo-Pascal:

for i := lower to upper do
  S(i)

in which the statement S(i) is repeatedly executed with the value of the variable i starting with a value lower. Between each repetition we increase i by one. We stop repeating S(i) when i has the value upper. This is, S(upper) is executed but S(upper+1) is not.

As an example:

for i := 1 to 5 do
  writeln(i);

will print

A possible implementation

Let’s imagine how this could be compiled to a lower level representation. Imagine we only have goto and if + goto (as a way to mimick a bit how current computers work).

Back to our loop:

for i := lower to upper do
  S(i)

could be implemented like

i := lower;
loop:
  if i <= upper then goto repeated;
  goto after_loop;
repeated:
  S(i);
  i := i + 1;
  goto loop;
after_loop:
  { ... }

Iterating a whole range of integers

Now consider that, for some reason, we want to iterate over all the integers of, say, 32-bit. For simplicity, we will assume unsigned integers but signed integers face similar issues.

for i := 0 to 4294967295 do
  S(i)

It still seems not to be a big deal. But look at i, what type should it have?

If we use the implementation above, consider the last iteration. This is, when, i = 4294967295. The i variable has to be able to represent 4294967295 so it has to be at least 32-bit. If it is exactly 32-bit it will overflow when we compute i := i + 1;.

Here each system may behave differently: some system will simply wrap-around and i will become 0. Which is bad because 0 ≤ 4294967295 which is the condition we use to check whether we have to keep repeating so we will never terminate. Some other machine may trap, which is slightly better (we do terminate!) but prevents our correct program from running.

Now if you’re on a 64-bit system (or a system where the CPU provides efficient 64-bit integer arithmetic), this is easy to address, just make i to be 64-bit and you’re done.

But this is a bit of an unsatisfying answer and further questions may arise at this point.

What if we want to iterate all the 64-bit? Granted, this is a very large number of iterations and so we’re probably never going to terminate in a reasonable amount of time.

What if our CPU does not provide 32-bit integers and representing 64-bit magnitudes is expensive? The reality is that nowadays additions (and subtractions) are cheap for a CPU. For instance, on most 32-bit systems, adding or subtracting a 64-bit integer can be done with two instructions (rather than one if 64-bit were natively supported).

What if we chose to use a 64-bit integer (no matter if supported or not) but our loop has an unknown upper bound. If N is less than 4294967295 it would be fine to use a 32-bit integer.

for i := 0 to N do
  S(i)

This leaves us with a bit of an uneasy feeling and while modern machines could use a larger integer, we probably want a solution that always works.

A safer, but less nice, implementation

Can we implement the loop in a way so this issue is a non-problem? The answer is yes, but the loop will not look as nice.

if lower > upper then goto after_loop;
i := lower;
repeated:
  S(i);
  if i = upper then goto after_loop;
  i := i + 1;
  goto repeated;
after_loop:
  { ... }

Let’s be honest, this construction does not look very nice but it avoids any overflow. So i only has to be as large as lower and upper. In other words, there is no need to make it larger “just in case”.

Impact on optimisation

Compilers these days are very smart and the two loops can be compiled efficiently (they will emit almost the same code for both), so the less safe version has no particular performance advantage over the safer one.

From a teaching perspective, though, the less safe version is probably easier to explain.

What about C and C++?

But then, if we may overflow, what about a loop like this?

// Assume N is int
for (int i = 0; i <= N; i++)
  S(i);

According to the spec, the loop above is equivalent to the following code:

{
  int i = 0;
  while (i <= N) {
    S(i);
    i++;
  }
}

The C++ standard also tells us that signed integer overflow is undefined behaviour (UB) in C and C++.

Our loop is incorrect when N is 2147483647 (2147483647 is INT_MAX, assuming int is a 32-bit integer, which typically is) because it triggers UB in i++.

When a program triggers UB all bets are off in terms of its mandated behaviour. The observed behaviour becomes typically platform and/or compiler dependent. For example, in clang on x86-64 a loop like the above will loop forever at -O0 but it seems to work at -O1 or higher optimisation levels, in GCC on x86-64 it is likely to not to terminate at any optimisation level.

In contrast, a loop like this

// Assume N is unsigned
for (unsigned i = 0; i <= N; i++)
  S(i);

will never terminate when N = 4294967295. In C and C++, overflow of unsigned integers is well-defined as wrapping-around.

Based on the approach seen above, a way to correctly implement either case is as follows:

// Example for the signed case.
for (int i = 0; ; i++) {
  S(i);
  if (i == N) break;
}

Again, it does not look great but it is always correct.