ARM assembler in Raspberry Pi

ARM architecture has been for long targeted at embedded systems. Embedded systems usually end being used in massively manufactured products (dishwashers, mobile phones, TV sets, etc). In this context margins are very tight so a designer will always try to spare as much components as possible (a cent saved in hundreds of thousands or even millions of appliances may pay off). One relatively expensive component is memory although every day memory is less and less expensive. Anyway, in constrained memory environments being able to save memory is good and ARM instruction set was designed with this goal in mind. It will take us several chapters to learn all of these techniques, today we will start with one feature usually named shifted operand.

Indexing modes

We have seen that, except for load (ldr), store (str) and branches (b and bXX), ARM instructions take as operands either registers or immediate values. We have also seen that the first operand is usually the destination register (being str a notable exception as there it plays the role of source because the destination is now the memory). Instruction mov has another operand, a register or an immediate value. Arithmetic instructions like add and and (and many others) have two more source registers, the first of which is always a register and the second can be a register or an immediate value.

These sets of allowed operands in instructions are collectively called indexing modes. Today this concept will look a bit off since we will not index anything. The name indexing makes sense in memory operands but ARM instructions, except load and store, do not have memory operands. This is the nomenclature you will find in ARM documentation so it seems sensible to use theirs.

We can summarize the syntax of most of the ARM instructions in the following pattern

instruction Rdest, Rsource1, source2

There are some exceptions, mainly move (mov), branches, load and stores. In fact move is not so different actually.

mov Rdest, source2

Both Rdest and Rsource1 must be registers. In the next section we will talk about source2.

We will discuss the indexing modes of load and store instructions in a future chapter. Branches, on the other hand, are surprisingly simple and their single operand is just a label of our program, so there is little to discuss on indexing modes for branches.

Shifted operand

What is this mysterious source2 in the instruction patterns above? If you recall the previous chapters we have used registers or immediate values. So at least that source2 is this: register or immediate value. You can use an immediate or a register where a source2 is expected. Some examples follow, but we have already used them in the examples of previous chapters.

mov r0, #1
mov r1, r0
add r2, r1, r0
add r2, r3, #4

But source2 can be much more than just a simple register or an immediate. In fact, when it is a register we can combine it with a shift operation. We already saw one of these shift operations in chapter 6. Not it is time to unveil all of them.

LSL #n Logical Shift Left. Shifts bits n times left. The n leftmost bits are lost and the n rightmost are set to zero.
LSL Rsource3 Like the previous one but instead of an immediate the lower byte of a register specifies the amount of shifting.
LSR #n Logical Shift Right. Shifts bits n times right. The n rightmost bits are lost and the n leftmost bits are set to zero,
LSR Rsource3 Like the previous one but instead of an immediate the lower byte of a register specifies the amount of shifting.
ASR #n Arithmetic Shift Right. Like LSR but the leftmost bit before shifting is used instead of zero in the n leftmost ones.
ASR Rsource3 Like the previous one but using a the lower byte of a register instead of an immediate.
ROR #n Rotate Right. Like LSR but the n rightmost bits are not lost but pushed onto the n leftmost bits
ROR Rsource3 Like the previous one but using a the lower byte of a register instead of an immediate.

In the listing above, n is an immediate from 1 to 31. These extra operations may be applied to the value in the second source register (to the value, not to the register itself) so we can perform some more operations in a single instruction. For instance, ARM does not have any shift right or left instruction. You just use the mov instruction.

mov r1, r2, LSL #1

You may be wondering why one would want to shift left or right the value of a register. If you recall chapter 6 we saw that shifting left (LSL) a value gives a value that the same as multiplying it by 2. Conversely, shifting it right (ASR if we use two's complement, LSR otherwise) is the same as dividing by 2. Since a shift of n is the same as doing n shifts of 1, shifts actually multiply or divide a value by 2ⁿ.

mov r1, r2, LSL #1      /* r1 ← (r2*2) */
mov r1, r2, LSL #2      /* r1 ← (r2*4) */
mov r1, r3, ASR #3      /* r1 ← (r3/8) */
mov r3, #4
mov r1, r2, LSL r3      /* r1 ← (r2*16) */

We can combine it with add to get some useful cases.

add r1, r2, r2, LSL #1   /* r1 ← r2 + (r2*2) equivalent to r1 ← r2*3 */
add r1, r2, r2, LSL #2   /* r1 ← r2 + (r2*4) equivalent to r1 ← r2*5 */

You can do something similar with sub.

sub r1, r2, r2, LSL #3  /* r1 ← r2 - (r2*8) equivalent to r1 ← r2*(-7)

ARM comes with a handy rsb (Reverse Substract) instruction which computes Rdest ← source2 - Rsource1 (compare it to sub which computes Rdest ← Rsource1 - source2).

rsb r1, r2, r2, LSL #3      /* r1 ← (r2*8) - r2 equivalent to r1 ← r2*7 */

Another example, a bit more contrived.

/* Complicated way to multiply the initial value of r1 by 42 = 7*3*2 */
rsb r1, r1, r1, LSL #3  /* r1 ← (r1*8) - r1 equivalent to r1 ← 7*r1 */
add r1, r1, r1, LSL #1  /* r1 ← r1 + (2*r1) equivalent to r1 ← 3*r1 */
add r1, r1, r1          /* r1 ← r1 + r1     equivalent to r1 ← 2*r1 */

You are probably wondering why would we want to use shifts to perform multiplications. Well, the generic multiplication instruction always work but it is usually much harder to compute by our ARM processor so it may take more time. There are times where there is no other option but for many small constant values a single instruction may be more efficient.

Rotations are less useful than shifts in everyday use. They are usually used in cryptography, to reorder bits and "scramble" them. ARM does not provide a way to rotate left but we can do a n rotate left doing a 32-n rotate right.

/* Assume r1 is 0x12345678 */
mov r1, r1, ROR #1   /* r1 ← r1 ror 1. This is r1 ← 0x91a2b3c */
mov r1, r1, ROR #31  /* r1 ← r1 ror 31. This is r1 ← 0x12345678 */

That's all for today.