In chapter 6 we saw conditional branches and we ended commenting that they can be used to implement higher control constructs. In this chapter we will see a few of them.
It may be at this point rather obvious how to implement an if-statement construct. Given a structure of the form
our code may look like
It can be done the other way around.
How to evaluate conditions?
Conditions in most programming languages are explicitly given a boolean type (e.g.
bool in C++,
LOGICAL in Fortran) or an integer type with a restricted set of values (e.g. 0 or 1 in C). Several programming languages implicitly define a conversion from regular integers to condition expressions. For instance in C, an expression
e of type integer or pointer when used in a condition is the same as evaluating the condition
e != 0 (which will give a
1 value). C++ defines the same conversion but the resulting type is
bool so it can be
false. Fortran does not have any of these conversions so the operand will always be required.
For expressions of the form
e1 ⊕ e2 where ⊕ is a relational operator like <, >, ≤, ≥, etc. We can use the instruction
cmp followed by a
b.cond with the relevant condition.
Evaluation of conditional expressions may be impacted by the language specification. Some programming languages implement conditionally evaluated logical operators (also called short-circuit evaluation). This makes these operators non-strict and closer to if-statements than the mathematical definition of the logical operators. Some other programming languages are strict in the definition of the logical operators, which means they evaluate all the operands and then apply the logical operation. C and C++ use short-circuit evaluation which means an expression like
e1 && e2 will only evaluate
e1 is true. Similarly in an expression like
e1 || e2,
e2 is evaluated only if
e1 is false (having to use "if" in the definition of these operators is what makes them "conditionally evaluated"). This means that in practice a C/C++ code like the following
is implemented like
of course doing exactly the above would be ridiculous, so in practice looks more like this (in C/C++ again, though not very nice).
We can simplify this further to make it closer to assembly (I'm adding a few redundant C/C++ labels below for clarity)
Do we always have to implement C/C++ logical operations like this? Not really, only if we're unsure that unconditionally evaluating the second operand may be different than conditionally evaluating it. For instance a case like
it won't harm to evaluate both operands so we can avoid a few branches and actually evaluate it as if the code had been (note a single
& which means bitwise operation that always evaluates the two operands).
but in a case like
We don't really want to evaluate
((10 / p) > 5) when
p == 0. The same strategy is used for
|| in which sometimes we can evaluate it as
The following table shows some examples. Take these as suggestions, more clever sequences can be used in some circumstances.
|Condition to evaluate (C/C++)||Instruction sequence|
If we want to evaluate
If we can reverse the operands
If we can reverse the operands
Without short-cut evaluation (i.e.
Without short-cut evaluation (i.e.
For the less common case when
In some other cases like !(e1 && e2) we can use De Morgan rules to evaluate it like
A common idiom, in C/C++ is the following
In Fortran there is a shorthand syntax for this case
Or more generally, in C/C++
Which in C/C++ can be written more compactly as
In many circumstances
e2 are such that it is correct to rewrite the general case as
or the opposite form
If we can write the code in one of the two forms above, and given that this case is relatively common, AArch64 has specific support for it.
csel rdest, rsource1, rsource2, cond sets the value of the destination register
rsource1 if the condition
cond is true, otherwise it is set to
csel stands for conditional selection/
For instance, a typical
max operation looks like this in C.
with csel we can implement the code above as
which is handy because it avoids a branch.
To implement a loop construct of the form
we can do
Usually inverting the flow is a bit better
This has the added benefit that if we remove the initial branch to
loop_check, this is the equivalent to the C
do-while loop or the Pascal
While we can use regular branches and comparison instructions to implement loops, AArch64 provides a couple of specialized instructions for some common circumstances:
cbz. They conditionally branch to some label if a given register is non-zero or zero respectively. This is useful for loops like this.
because we can now implement them as
cbz. That said, for this particular example it is possible to use subs instruction that acts like
sub but also updates the condition flags. So we can write
cbz can be more useful when a register being zero is a condition and its computation cannot update the flags (or doing that would require adding an extra
cmp). We have not seen pointers yet, but a common scenario is traversing a linked list ended by a null pointer.
Because a null pointer is represented usually with all zeros after
it = it->next we can use
cbnzr without having to do
Fibonacci (using a loop)
In chapter 8 we made a function that recursively computed the Fibonacci numbers. Today we are going to see how to compute them using a loop. We will reuse the same main function and only change the
fibonacci function itself.
As a reminder of the fibonacci function is actually a sequence of numbers Fi defined by the following recurrence:
- F0 = 0
- F1 = 1
- Fn = Fn-1 + Fn-2, where n > 1
To compute the current element we only need the previous two elements, we don't really need to strictly follow the mathematical definition doing a recursive function call, like we did in chapter 8. Results for
fibonacci(1) will be immediately solved, no need to loop. Any other fibonacci(n) can be computed by repeatedly computing
n - 1 previous Fibonacci numbers.
61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 fibonacci: // fibonacci(n) -> result // n is 32-bit and will be passed in w0 // result is 64-bit and will be returned in x0 mov w3, w0 // w3 ← w0 cmp w3, #1 // compare w3 and 1 and update the flags b.le simple_case // jump to simple_case if w3 <= 1 sub w3, w3, #1 // w3 ← w3 - 1 mov x1, #0 // x1 ← 0 mov x2, #1 // x1 ← 1 b loop_check // branch to loop_check loop_body: add x0, x1, x2 // x0 ← x1 + x2 mov x1, x2 // x1 ← x2 mov x2, x0 // x2 ← x0 sub w3, w3, #1 // w3 ← w3 - 1 loop_check: cbnz w3, loop_body // branch to loop_body if w3 != 0 b fibonacci_end // branch to fibonacci_end simple_case: sxtw x0, w0 // x0 ← ExtendSigned32To64(w0) fibonacci_end: ret
As in the recursive version of this function, cases 0 and 1 are handled specially. We first keep w0 into w3, line 65, and then we compare w3 with 1, line 66. If the flags are such that
w3 ≤ 1 then we branch to the
simple_case which the only thing it does is extending
x0, line 82. If the fibonacci number to compute is not 0 or 1, then we have to compute the
n - 1 earlier ones. So we have to decrease
w3 by 1, line 68. In register
x1 we will have
fibonacci(n-2) and in x2 we will have fibonacci(n-1). The two initially known ones are 0 and 1 respectively (this is, for
fibonacci(2)), lines 69-70.
Now everything is set for the loop, so we jump to the check, line 71. The loop check uses
w3, line 78. Once
w3 becomes zero means that we do not have to compute any other fibonacci number, so cbnz will branch back to
loop_body only when
w3 is not 0.
The loop body simply computes the current fibonacci by adding
x2 (recall, x1 is
fibonacci(n-2) and x2 is
fibonacci(n-1)) and leaving the result in
x0, line 73. Now we need to prepare the next iteration of the loop:
x2 (which is
fibonacci(n-1)) will be
fibonacci(n-2) in the next iteration, so we put it in
x1, line 74. Similarly,
x0 (which is
fibonacci(n)) will be
fibonacci(n-1) in the next iteration, so we put it
x2, line 75.
At this point one fibonacci number has been computed, so one less to compute. So we decrease
w3 by one, line 76. Now the loop_check is executed which will iterate if needed. Once we leave the loop, we still need to skip the
simple_case, line 80.
To test this, just take the code in chapter 8 and replace the recursive function with the iterative function.
This is all for today.