ARM assembler in Raspberry Pi

So far our small assembler programs have output messages using printf and some of them have read input using scanf. These two functions are implemented in the C library, so they are more or less supported in any environment supporting the C language. But how does a program actually communicate with the world?

The operating system

Our Raspberry Pi runs Raspbian. Raspbian is an operating system based on Debian on top of the Linux kernel. The operating system is a piece of software (usually a collection of pieces that together form a useful system) that enables and manages the resources required by programs to run. Which sort of resources, you may be wondering? Well, many different kinds of them: processes, files, network devices, network communications, screens, printers, terminals, timers, etc.

From the point of view of the program, the operating system is just a big servant providing lots of services to the program. But the operating system is also a caretaker, taking action when something goes wrong or programs (sometimes caused by the users of the operating system) attempt to do something that they are not authorized to do. In our case, Linux is the kernel of the Raspbian operating system. The kernel provides the most basic functionality needed to provide these services (sometimes it provides them directly, sometimes it just provides the minimal essential functionality so they can be implemented). It can be viewed as a foundational program that it is always running (or at least, always ready) so it can serve the requests of the programs run by the users. Linux is a UNIX®-like kernel and as such shares lots of features with the long lineage of UNIX®-like operating systems.

Processes

In order to assign resources, the operating system needs an entity to which grant such resources. This entity is called a process. A process is a running program. The same program may be run several times, each time it is run it is a different process.

System calls

A process interacts with the operating system by performing system calls. A system call is conceptually like calling a function but more sophisticated. It is more sophisticated because now we need to satisfy some extra security requirements. An operating system is a critical part of a system and we cannot let processes dodge the operating system control. A usual function call offers no protection of any kind. Any strategy we could design on top of a plain function call would easily be possible to circumvent. As a consequence of this constraint, we need support from the architecture (in our case ARM) in order to safely and securely implement a system call mechanism.

In Linux ARM we can perform a system call by using the instruction swi. This instruction means software interruption and its sole purpose is to make a system call to the operating system. It receives a 24-bit operand that is not used at all by the processor but could be used by the the operating system to tell which service has been requested. In Linux such approach is not used and a 0 is set as the operand instead. So, in summary, in Linux we will always use swi #0 to perform a system call.

An operating system, and particularly Linux, provides lots of services through system calls so we need a way to select one of them. We will do this using the register r7. System calls are similar to function calls in that they receive parameters. No system call in Linux receives more than 7 arguments and the arguments are passed in registers r0 to r6. If the system call returns some value it will be returned in register r0.

Note that the system call convention is incompatible with the convention defined by the AAPCS, so programs will need specific code that deals with a system call. In particular, it makes sense to wrap these system calls into normal functions, that externally, i.e. from the point of the caller, follow the AAPCS. This is precisely the main purpose of the C library. In Linux, the C library is usually GNU Libc (but others can be used in Linux). These libraries hide the extra complexity of making system calls under the appearance of a normal function call.

Hello world, the system call way

As a simple illustration of calling the operating system we will write the archetypical "Hello world" program using system calls. In this case we will call the function write. Write receives three parameters: a file descriptor where we will write some data, a pointer to the data that will be written and the size of such data. Of these three, the most obscure may be now the file descriptor. Without entering into much details, it is just a number that identifies a file assigned to the process. Processes usually start with three preassigned files: the standard input, with the number 0, the standard output, with the number 1, and the standard error, with the number 2. We will write our messages to the standard output, so we will use the file descriptor 1.

The "ea-C" way

Continuing with our example, first we will call write through the C library. The C library follows the AAPCS convention. The prototype of the write system call can be found in the Linux man pages and is as follows.

ssize_t write(int fd, const void *buf, size_t count);

Here both size_t and ssize_t are 32-bit integers, where the former is unsigned and the latter signed. Equipped with our knowledge of the AAPCS and ARM assembler it should not be hard for us to perform a call like the following

const char greeting[13] = "Hello world\n";
write(1, greeting, sizeof(greeting)); // Here sizeof(greeting) is 13

Here is the code

/* write_c.s */

.data

greeting: .asciz "Hello world\n"
after_greeting:

/* This is an assembler constant: the assembler will compute it. Needless to say
   that this must evaluate to a constant value. In this case we are computing the
   difference of addresses between the address after_greeting and greeting. In this
   case it will be 13 */
.set size_of_greeting, after_greeting - greeting

.text

.globl main

main:
    push {r4, lr}

    /* Prepare the call to write */  
    mov r0, #1                /* First argument: 1 */
    ldr r1, addr_of_greeting  /* Second argument: &greeting */
    mov r2, #size_of_greeting /* Third argument: sizeof(greeting) */
    bl write                  /* write(1, greeting, sizeof(greeting));

    mov r0, #0
    pop {r4, lr}
    bx lr

addr_of_greeting : .word greeting

The system call way

Ok, calling the system call through the C library was not harder than calling a normal function. Let's try the same directly performing a Linux system call. First we have to identify the number of the system call and put it in r7. The call write has the number 4 (you can see the numbers in the file /usr/include/arm-linux-gnueabihf/asm/unistd.h). The parameters are usually the same as in the C function, so we will use registers r0, r1 and r2 likewise.

/* write_sys.s */

.data


greeting: .asciz "Hello world\n"
after_greeting:

.set size_of_greeting, after_greeting - greeting

.text

.globl main

main:
    push {r7, lr}

    /* Prepare the system call */
    mov r0, #1                  /* r0 ← 1 */
    ldr r1, addr_of_greeting    /* r1 ← &greeting */
    mov r2, #size_of_greeting   /* r2 ← sizeof(greeting) */

    mov r7, #4                  /* select system call 'write' */
    swi #0                      /* perform the system call */

    mov r0, #0
    pop {r7, lr}
    bx lr

addr_of_greeting : .word greeting

As you can see it is not that different to a function call but instead of branching to a specific address of code using bl we use swi #0. Truth be told, it is rather unusual to perform system calls directly. It is almost always preferable to call the C library instead.

That's all for today.