Subtleties with loops
A common task in imperative programming languages is writing a loop. A loop
that can terminate requires a way to check the terminating condition and a way
to repeatedly execute some part of the code. These two mechanisms exists in
many forms: from the crudest approach of using an if
and a goto
(that must
jump backwards in the code) to higher-level structured constructs like for
and while
ending in very high-level constructs built around higher-order
functions in for_each
-like constructs and more recently, in the context of
GPU programming, the idea of a kernel function instantiated over a
n-dimensional domain (where typically n ≤ 3 but most of the time n = 1).
These more advanced mechanisms make writing loops a commonplace task and typically regarded as uneventful. Yet, there are situations when things get subtler than we would like.
A ranged-loop over integers
Let’s consider a construct like this in some sort of pseudo-Pascal:
for i := lower to upper do
S(i)
in which the statement S(i)
is repeatedly executed with the value of the
variable i
starting with a value lower
. Between each repetition we increase
i
by one. We stop repeating S(i)
when i
has the value upper
. This is,
S(upper)
is executed but S(upper+1)
is not.
As an example:
for i := 1 to 5 do
writeln(i);
will print
1
2
3
4
5
A possible implementation
Let’s imagine how this could be compiled to a lower level representation. Imagine
we only have goto
and if + goto
(as a way to mimick a bit how current computers
work).
Back to our loop:
for i := lower to upper do
S(i)
could be implemented like
i := lower;
loop:
if i <= upper then goto repeated;
goto after_loop;
repeated:
S(i);
i := i + 1;
goto loop;
after_loop:
{ ... }
Iterating a whole range of integers
Now consider that, for some reason, we want to iterate over all the integers of, say, 32-bit. For simplicity, we will assume unsigned integers but signed integers face similar issues.
for i := 0 to 4294967295 do
S(i)
It still seems not to be a big deal. But look at i
, what type should it have?
If we use the implementation above, consider the last iteration. This is, when,
i = 4294967295
. The i
variable has to be able to represent 4294967295
so
it has to be at least 32-bit. If it is exactly 32-bit it will overflow when we
compute i := i + 1;
.
Here each system may behave differently: some system will simply wrap-around
and i
will become 0
. Which is bad because 0 ≤ 4294967295
which is the
condition we use to check whether we have to keep repeating so we will never
terminate. Some other machine may trap, which is slightly better (we do
terminate!) but prevents our correct program from running.
Now if you’re on a 64-bit system (or a system where the CPU provides efficient
64-bit integer arithmetic), this is easy to address, just make i
to be 64-bit
and you’re done.
But this is a bit of an unsatisfying answer and further questions may arise at this point.
What if we want to iterate all the 64-bit? Granted, this is a very large number of iterations and so we’re probably never going to terminate in a reasonable amount of time.
What if our CPU does not provide 32-bit integers and representing 64-bit magnitudes is expensive? The reality is that nowadays additions (and subtractions) are cheap for a CPU. For instance, on most 32-bit systems, adding or subtracting a 64-bit integer can be done with two instructions (rather than one if 64-bit were natively supported).
What if we chose to use a 64-bit integer (no matter if supported or not) but our loop has an unknown upper bound. If N is less than 4294967295 it would be fine to use a 32-bit integer.
for i := 0 to N do
S(i)
This leaves us with a bit of an uneasy feeling and while modern machines could use a larger integer, we probably want a solution that always works.
A safer, but less nice, implementation
Can we implement the loop in a way so this issue is a non-problem? The answer is yes, but the loop will not look as nice.
if lower > upper then goto after_loop;
i := lower;
repeated:
S(i);
if i = upper then goto after_loop;
i := i + 1;
goto repeated;
after_loop:
{ ... }
Let’s be honest, this construction does not look very nice but it avoids any
overflow. So i
only has to be as large as lower
and upper
. In other
words, there is no need to make it larger “just in case”.
Impact on optimisation
Compilers these days are very smart and the two loops can be compiled efficiently (they will emit almost the same code for both), so the less safe version has no particular performance advantage over the safer one.
From a teaching perspective, though, the less safe version is probably easier to explain.
What about C and C++?
But then, if we may overflow, what about a loop like this?
// Assume N is int
for (int i = 0; i <= N; i++)
S(i);
According to the spec, the loop above is equivalent to the following code:
{
int i = 0;
while (i <= N) {
S(i);
i++;
}
}
The C++ standard also tells us that signed integer overflow is undefined behaviour (UB) in C and C++.
Our loop is incorrect when N
is 2147483647
(2147483647
is INT_MAX
,
assuming int
is a 32-bit integer, which typically is) because it triggers UB
in i++
.
When a program triggers UB all bets are off in terms of its mandated behaviour.
The observed behaviour becomes typically platform and/or compiler dependent.
For example, in clang on x86-64 a loop like the above will loop forever at
-O0
but it seems to work at -O1
or higher optimisation levels, in GCC on
x86-64 it is likely to not to terminate at any optimisation level.
In contrast, a loop like this
// Assume N is unsigned
for (unsigned i = 0; i <= N; i++)
S(i);
will never terminate when N = 4294967295
. In C and C++, overflow of unsigned
integers is well-defined as wrapping-around.
Based on the approach seen above, a way to correctly implement either case is as follows:
// Example for the signed case.
int i = 0;
if (!(i == N)) { // i != N
for (;; i++) {
S(i);
if (i == N) break;
}
}
or similarly
int i = 0;
if (!(i == N)) { // i != N
do {
S(i);
i++;
} while (!(i == N)); // i != N
}
Again, it does not look great but it is always correct.