ARM assembler in Raspberry Pi – Chapter 5
Until now our small assembler programs have executed one instruction after another. If our ARM processor were only able to run this way it would be of limited use: it could not react to existing conditions which may require different sequences of instructions. Handling such situations is the purpose of the branch instructions.
A special register
In chapter 2 we learnt that our Raspberry Pi ARM processor has 16 integer general purpose registers and we also said that some of them play special roles in our program. I deliberately ignored which registers were special as it was not relevant at that time.
But now it is relevant, at least for register r15. This register is very special, so special that it also has another name: pc. You are unlikely to see it written as r15, since that is confusing (although correct from the point of view of the ARM architecture). From now on we will only use pc to name it.
What does pc stand for? pc means program counter. This name, the origins of which are in the dawn of computing, means little to nothing nowadays. In general the pc register (also called ip, instruction pointer, in other architectures like 386 or x86_64) contains the address of the next instruction to be executed.
When the ARM processor executes an instruction, one of two things may happen at the end of its execution. If the instruction does not modify pc (and most instructions do not), pc is just incremented by 4 (as if we did add pc, pc, #4). Why 4? Because in ARM, instructions are 32 bits wide, so there are 4 bytes between consecutive instructions. If the instruction modifies pc then the new value for pc is used.
Once the processor has fully executed an instruction, it uses the value in pc as the address of the next instruction to execute. This way, an instruction that does not modify pc will be followed by the next contiguous instruction in memory (since pc has been automatically increased by 4). This is called implicit sequencing of instructions: after one has run, usually the next one in memory runs. But if an instruction does modify pc, setting it to a value other than pc + 4, then a different instruction of the program runs next. This process of changing the value of pc is called branching. In ARM this is done using branch instructions.
You can tell the processor to branch unconditionally by using the instruction b (for branch) and a label. Consider the following program.
/* -- branch01.s */
.text
.global main
main:
    mov r0, #2 /* r0 ← 2 */
    b end      /* branch to 'end' */
    mov r0, #3 /* r0 ← 3 */
end:
    bx lr
If you execute this program you will see that it returns an error code of 2.
$ ./branch01 ; echo $?
2
What happened is that instruction b end branched (modifying the pc) to the instruction at the label end, which is bx lr, the instruction we run at the end of our program. This way the instruction mov r0, #3 has not actually been run at all (the processor never reached that instruction).
At this point the unconditional branch instruction b may look a bit useless. This is not the case. In fact this instruction is essential in some contexts, in particular when combined with conditional branching. But before we can talk about conditional branching we need to talk about conditions.
If our processor were only able to branch just because, it would not be very useful. It is much more useful to branch when some condition is met. So a processor should be able to evaluate some sort of conditions.
Before continuing, we need to unveil another register called cpsr (for Current Program Status Register). This register is a bit special and directly modifying it is out of the scope of this chapter. That said, it keeps some values that can be read and updated when executing an instruction. The values of that register include four condition code flags called N (negative), Z (zero), C (carry) and V (overflow). These four condition code flags are usually read by branch instructions. Arithmetic instructions and special testing and comparison instructions can update these condition codes too if requested.
The semantics of these four condition codes in instructions updating the cpsr are roughly the following:
N will be enabled if the result of the instruction yields a negative number. Disabled otherwise.
Z will be enabled if the result of the instruction yields a zero value. Disabled if nonzero.
C will be enabled if the result of the instruction yields a value that requires a 33rd bit to be fully represented, for instance an addition that overflows the 32 bit range of unsigned integers. Subtractions are a special case: C is enabled when the subtraction does not borrow and disabled when it does. Subtracting a smaller number from a larger one enables C, while doing the subtraction the other way round disables it.
V will be enabled if the result of the instruction yields a value that cannot be represented in 32 bit two’s complement. Disabled otherwise.
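As a rough illustration of the list above, the following sketch sets the flags from plain arithmetic instructions. It assumes two things that have not been introduced in this chapter: the s suffix (adds, subs), which is how an ARM arithmetic instruction is asked to update the condition codes, and the mvn instruction, which moves the bitwise NOT of its operand. The file name flags01.s is made up for this example, and the flag values in the comments are worked out by hand from the rules above.

```asm
/* -- flags01.s (hypothetical example, not part of the original series) */
.text
.global main
main:
    mvn r1, #0        /* r1 ← 0xFFFFFFFF (bitwise NOT of 0) */
    mov r2, #1        /* r2 ← 1 */
    adds r3, r1, r2   /* r3 ← 0. The sum needs a 33rd bit: N=0 Z=1 C=1 V=0 */
    mov r1, #1        /* r1 ← 1 (mov without s leaves the flags alone) */
    mov r2, #2        /* r2 ← 2 */
    subs r3, r1, r2   /* r3 ← -1. The subtraction borrows: N=1 Z=0 C=0 V=0 */
    mov r0, #0        /* r0 ← 0, the error code of the program */
    bx lr
```

Without the s suffix, add and sub compute the same results but leave the condition codes untouched.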
So we have all the pieces needed to perform branches conditionally. But first, let’s start by comparing two values. We use the instruction cmp for this purpose.
cmp r1, r2 /* updates cpsr doing "r1 - r2", but r1 and r2 are not modified */
This instruction subtracts the value in the second register from the value in the first register. What could happen in the snippet above? Some examples:
If r2 had a value (strictly) greater than r1 then N would be enabled because r1-r2 would yield a negative result.
If r1 and r2 had the same value, then Z would be enabled because r1-r2 would be zero.
If r1 was 1 and r2 was 0 then r1-r2 would not borrow, so in this case C would be enabled. If the values were swapped (r1 was 0 and r2 was 1) then C would be disabled because the subtraction does borrow.
If r1 was 2147483647 (the largest positive integer in 32 bit two’s complement) and r2 was -1 then r1-r2 would be 2147483648, but such a number cannot be represented in 32 bit two’s complement, so V would be enabled to signal this.
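The last case is the least intuitive, so here is a minimal sketch of it. It assumes the mvn instruction (move the bitwise NOT of the operand, not introduced in this chapter) to load the two values, and the file name is made up for this example. The flags in the comment are worked out by hand from the rules above.

```asm
/* -- overflow01.s (hypothetical example, not part of the original series) */
.text
.global main
main:
    mvn r1, #0x80000000  /* r1 ← 0x7FFFFFFF = 2147483647 */
    mvn r2, #0           /* r2 ← 0xFFFFFFFF = -1 in two's complement */
    cmp r1, r2           /* r1-r2 wraps around to 0x80000000: N=1 Z=0 C=0 V=1 */
    mov r0, #0           /* r0 ← 0, the error code of the program */
    bx lr
```

Note that N is enabled here even though r1 is greater than r2: the wrapped-around result has its sign bit set. This is why a signed comparison has to look at V as well as N.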
How can we use these flags to represent useful conditions for our programs?
EQ (equal). When Z is enabled (Z is 1)
NE (not equal). When Z is disabled (Z is 0)
GE (greater or equal than, in two’s complement). When both V and N are enabled or disabled (V is N)
LT (lower than, in two’s complement). This is the opposite of GE, so when V and N are not both enabled or disabled (V is not N)
GT (greater than, in two’s complement). When Z is disabled and N and V are both enabled or disabled (Z is 0, N is V)
LE (lower or equal than, in two’s complement). This is the opposite of GT, so when Z is enabled or, if not that, N and V are not both enabled or disabled (Z is 1. If Z is not 1 then N is not V)
MI (minus/negative). When N is enabled (N is 1)
PL (plus/positive or zero). When N is disabled (N is 0)
VS (overflow set). When V is enabled (V is 1)
VC (overflow clear). When V is disabled (V is 0)
HI (higher). When C is enabled and Z is disabled (C is 1 and Z is 0)
LS (lower or same). When C is disabled or Z is enabled (C is 0 or Z is 1)
HS (carry set/higher or same). When C is enabled (C is 1)
LO (carry clear/lower). When C is disabled (C is 0)
These conditions can be combined with our b instruction to generate new instructions. This way, beq will branch only if Z is 1. If the condition of a conditional branch is not met, then the branch is ignored and the next instruction will be run. It is the programmer’s task to make sure that the condition codes are properly set prior to a conditional branch.
/* -- compare01.s */
.text
.global main
main:
    mov r1, #2       /* r1 ← 2 */
    mov r2, #2       /* r2 ← 2 */
    cmp r1, r2       /* update cpsr condition codes with the value of r1-r2 */
    beq case_equal   /* branch to case_equal only if Z = 1 */
case_different:
    mov r0, #2       /* r0 ← 2 */
    b end            /* branch to end */
case_equal:
    mov r0, #1       /* r0 ← 1 */
end:
    bx lr
If you run this program it will return an error code of 1 because both r1 and r2 have the same value. Now change mov r1, #2 in line 5 to be mov r1, #3 and the returned error code should be 2. Note that at the end of case_different we do not want to run the case_equal instructions, thus we have to branch to end (otherwise the error code would always be 1).
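The same flags can be read by different conditions, and the answer depends on which one we pick. The following sketch compares 0xFFFFFFFF with 1 and tests the result first with GT (two’s complement) and then with HI. The file name compare02.s and the labels are made up for this example, and mvn, which moves the bitwise NOT of its operand, has not been introduced in this chapter.

```asm
/* -- compare02.s (hypothetical example, not part of the original series) */
.text
.global main
main:
    mvn r1, #0           /* r1 ← 0xFFFFFFFF: -1 in two's complement, 4294967295 if read as unsigned */
    mov r2, #1           /* r2 ← 1 */
    cmp r1, r2           /* update cpsr condition codes with the value of r1-r2 */
    bgt signed_greater   /* GT: -1 is not greater than 1, so this branch is ignored */
    bhi unsigned_higher  /* HI: 4294967295 is higher than 1, so this branch is taken */
    mov r0, #0           /* r0 ← 0 (not reached with these values) */
    b end                /* branch to end */
signed_greater:
    mov r0, #1           /* r0 ← 1 (not reached with these values) */
    b end                /* branch to end */
unsigned_higher:
    mov r0, #2           /* r0 ← 2 */
end:
    bx lr
```

With these values the program should return an error code of 2. The cmp leaves N=1, Z=0, C=1 and V=0, so GT (Z is 0 and N is V) is not met while HI (C is 1 and Z is 0) is. Since bgt does not modify the condition codes, bhi reads the flags set by the same cmp.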
That’s all for today.