Compilation example

Example of compiling C code

Suppose you want to compile a short C program into machine code, the bits that your microcontroller can execute.

Let’s start with a stupidly simple program, called simple.c.

void main() {
    int x = 255;
    x = x + 0xAAAA;
}

On a Raspberry Pi, I could compile that like this: gcc simple.c.

(GCC is the GNU Compiler Collection, which includes the GNU C compiler. It’s a ubiquitous free, open-source compiler, licensed under the GPL. The other major player in the open-source compiler world is Clang/LLVM, which uses a permissive license: Apache 2.0 with LLVM exceptions today, and a BSD-style license in older releases. Anyway, gcc is installed by default on the Pi, so we’ll stick with that.)

This takes the 53-byte C file we started with and turns it into a 7912-byte binary, called a.out by default.

We can look at some of the steps along the way.

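gcc -S simple.c just compiles the C into assembly, written to simple.s, but doesn’t translate that into machine code. (The listing here also shows simple.o, the object file that gcc -c simple.c produces.)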

pi@raspberrypi:~ $ ls -l
total 20
-rwxr-xr-x 1 pi pi 7912 Dec 11 15:30 a.out
-rw-r--r-- 1 pi pi   53 Dec 11 14:42 simple.c
-rw-r--r-- 1 pi pi  852 Dec 11 15:30 simple.o
-rw-r--r-- 1 pi pi  807 Dec 11 14:56 simple.s

Here are the contents of simple.s.

pi@raspberrypi:~ $ cat simple.s
        .arch armv6
        .eabi_attribute 28, 1
        .eabi_attribute 20, 1
        .eabi_attribute 21, 1
        .eabi_attribute 23, 3
        .eabi_attribute 24, 1
        .eabi_attribute 25, 1
        .eabi_attribute 26, 2
        .eabi_attribute 30, 6
        .eabi_attribute 34, 1
        .eabi_attribute 18, 4
        .file   "simple.c"
        .text
        .align  2
        .global main
        .arch armv6
        .syntax unified
        .arm
        .fpu vfp
        .type   main, %function
main:
        @ args = 0, pretend = 0, frame = 8
        @ frame_needed = 1, uses_anonymous_args = 0
        @ link register save eliminated.
        str     fp, [sp, #-4]!
        add     fp, sp, #0
        sub     sp, sp, #12
        mov     r3, #255
        str     r3, [fp, #-8]
        ldr     r3, [fp, #-8]
        add     r3, r3, #43520
        add     r3, r3, #170
        str     r3, [fp, #-8]
        nop
        add     sp, fp, #0
        @ sp needed
        ldr     fp, [sp], #4
        bx      lr
        .size   main, .-main
        .ident  "GCC: (Raspbian 8.3.0-6+rpi1) 8.3.0"
        .section        .note.GNU-stack,"",%progbits

The lines that start with @ are just comments.

The crux of the program is in three lines:

        mov     r3, #255
        ...
        add     r3, r3, #43520
        add     r3, r3, #170

Note that 43520 is the same as 0xAA00, and 170 is the same as 0x00AA.

All the sp and fp stuff deals with the stack pointer and frame pointer, respectively. The last line, bx lr, just means “return from the function using the address in the link register.” (bx stands for “branch and exchange.”)

gcc -c simple.c compiles the C code to assembly, and then translates that into machine code, stopping before the link step. The result is the object file simple.o.

pi@raspberrypi:~ $ xxd simple.o
00000000: 7f45 4c46 0101 0100 0000 0000 0000 0000  .ELF............
00000010: 0100 2800 0100 0000 0000 0000 0000 0000  ..(.............
00000020: c401 0000 0000 0005 3400 0000 0000 2800  ........4.....(.
00000030: 0a00 0900 04b0 2de5 00b0 8de2 0cd0 4de2  ......-.......M.
00000040: ff30 a0e3 0830 0be5 0830 1be5 aa3c 83e2  .0...0...0...<..
00000050: aa30 83e2 0830 0be5 0000 a0e1 00d0 8be2  .0...0..........
00000060: 04b0 9de4 1eff 2fe1 0047 4343 3a20 2852  ....../..GCC: (R
00000070: 6173 7062 6961 6e20 382e 332e 302d 362b  aspbian 8.3.0-6+
00000080: 7270 6931 2920 382e 332e 3000 412e 0000  rpi1) 8.3.0.A...
00000090: 0061 6561 6269 0001 2400 0000 0536 0006  .aeabi..$....6..
000000a0: 0608 0109 010a 0212 0414 0115 0117 0318  ................
000000b0: 0119 011a 021c 011e 0622 0100 0000 0000  ........."......
000000c0: 0000 0000 0000 0000 0000 0000 0100 0000  ................
000000d0: 0000 0000 0000 0000 0400 f1ff 0000 0000  ................
000000e0: 0000 0000 0000 0000 0300 0100 0000 0000  ................
000000f0: 0000 0000 0000 0000 0300 0200 0000 0000  ................
00000100: 0000 0000 0000 0000 0300 0300 0a00 0000  ................
00000110: 0000 0000 0000 0000 0000 0100 0000 0000  ................
00000120: 0000 0000 0000 0000 0300 0500 0000 0000  ................
00000130: 0000 0000 0000 0000 0300 0400 0000 0000  ................
00000140: 0000 0000 0000 0000 0300 0600 0d00 0000  ................
00000150: 0000 0000 3400 0000 1200 0100 0073 696d  ....4........sim
00000160: 706c 652e 6300 2461 006d 6169 6e00 002e  ple.c.$a.main...
00000170: 7379 6d74 6162 002e 7374 7274 6162 002e  symtab..strtab..
00000180: 7368 7374 7274 6162 002e 7465 7874 002e  shstrtab..text..
00000190: 6461 7461 002e 6273 7300 2e63 6f6d 6d65  data..bss..comme
000001a0: 6e74 002e 6e6f 7465 2e47 4e55 2d73 7461  nt..note.GNU-sta
000001b0: 636b 002e 4152 4d2e 6174 7472 6962 7574  ck..ARM.attribut
000001c0: 6573 0000 0000 0000 0000 0000 0000 0000  es..............
000001d0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
000001e0: 0000 0000 0000 0000 0000 0000 1b00 0000  ................
000001f0: 0100 0000 0600 0000 0000 0000 3400 0000  ............4...
00000200: 3400 0000 0000 0000 0000 0000 0400 0000  4...............
00000210: 0000 0000 2100 0000 0100 0000 0300 0000  ....!...........
00000220: 0000 0000 6800 0000 0000 0000 0000 0000  ....h...........
00000230: 0000 0000 0100 0000 0000 0000 2700 0000  ............'...
00000240: 0800 0000 0300 0000 0000 0000 6800 0000  ............h...
00000250: 0000 0000 0000 0000 0000 0000 0100 0000  ................
00000260: 0000 0000 2c00 0000 0100 0000 3000 0000  ....,.......0...
00000270: 0000 0000 6800 0000 2400 0000 0000 0000  ....h...$.......
00000280: 0000 0000 0100 0000 0100 0000 3500 0000  ............5...
00000290: 0100 0000 0000 0000 0000 0000 8c00 0000  ................
000002a0: 0000 0000 0000 0000 0000 0000 0100 0000  ................
000002b0: 0000 0000 4500 0000 0300 0070 0000 0000  ....E......p....
000002c0: 0000 0000 8c00 0000 2f00 0000 0000 0000  ......../.......
000002d0: 0000 0000 0100 0000 0000 0000 0100 0000  ................
000002e0: 0200 0000 0000 0000 0000 0000 bc00 0000  ................
000002f0: a000 0000 0800 0000 0900 0000 0400 0000  ................
00000300: 1000 0000 0900 0000 0300 0000 0000 0000  ................
00000310: 0000 0000 5c01 0000 1200 0000 0000 0000  ....\...........
00000320: 0000 0000 0100 0000 0000 0000 1100 0000  ................
00000330: 0300 0000 0000 0000 0000 0000 6e01 0000  ............n...
00000340: 5500 0000 0000 0000 0000 0000 0100 0000  U...............
00000350: 0000 0000                                ....

We can disassemble this with objdump.

pi@raspberrypi:~ $ objdump -d simple.o
simple.o:     file format elf32-littlearm
Disassembly of section .text:
00000000 <main>:
   0:   e52db004        push    {fp}            ; (str fp, [sp, #-4]!)
   4:   e28db000        add     fp, sp, #0
   8:   e24dd00c        sub     sp, sp, #12
   c:   e3a030ff        mov     r3, #255        ; 0xff
  10:   e50b3008        str     r3, [fp, #-8]
  14:   e51b3008        ldr     r3, [fp, #-8]
  18:   e2833caa        add     r3, r3, #43520  ; 0xaa00
  1c:   e28330aa        add     r3, r3, #170    ; 0xaa
  20:   e50b3008        str     r3, [fp, #-8]
  24:   e1a00000        nop                     ; (mov r0, r0)
  28:   e28bd000        add     sp, fp, #0
  2c:   e49db004        pop     {fp}            ; (ldr fp, [sp], #4)
  30:   e12fff1e        bx      lr

More info about shrinking binaries: https://journal.lunar.sh/2020/10/24/tiny-linux-c-binaries.html