Embedded from Scratch: System boot and hardware access
Federico Terraneo
Advanced Operating Systems (2016/2017)

Outline
Bare-metal programming
    HW architecture overview
    Linker script
    Boot process
    High-level programming languages requirements
    Assembler start-up script
Hardware access
    Memory mapped peripheral registers
    Data-sheets
    Bit manipulation
Example
    Blinking LED

Bare-metal programming

HW architecture overview
Processes running on top of an operating system live in a virtual address space.
Pointers handled in a C program can only access addresses in the virtual space.
A process is completely isolated from the physical address space.
Loading a program into memory is the responsibility of the operating system.
Processes can't access HW peripherals directly; they can only make syscalls to the OS.
[Figure: system memory]
HW architecture overview
This class shows how programming is done without an OS abstraction:
➔ To learn how to program in a bare-metal environment
➔ To see how programming is done inside an OS
For the sake of simplicity the target will be a micro-controller, a computing device that even nowadays is often programmed without an OS.
Micro-controller architecture: a single chip including CPU core, memory and peripherals (a “computer on a chip”).
We will be using the STM32F407 micro-controller:
➔ 32-bit CPU, popular ARM (Cortex-M4) architecture
➔ On-chip 192 KB RAM and 1 MB of Flash memory
➔ Wide range of on-chip peripherals (USB, Ethernet, ADC/DAC, serial port, GPIO, etc.)
STM32F407 HW architecture overview
No support for virtual memory: C/C++ programs are written in terms of physical addresses.
Program code is located in the Flash. Flash is memory-mapped from address 0x0, so direct code execution from Flash is possible.
Stack, heap and global variables are placed in RAM. RAM is memory-mapped at 0x20000000.
Addresses starting from 0x40000000 are reserved for hardware peripherals. This range is sparsely used and mostly unmapped.
Part of the address space is unmapped. Accessing unmapped areas causes a fault interrupt to be generated.
Simplified STM32F407 memory map:
    0x00000000 - 0x000FFFFF    Flash
    0x20000000 - 0x2001FFFF    RAM
    0x40000000 - 0xFFFFFFFF    Peripherals
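As a minimal sketch, the simplified memory map can be written down as a C function that tells which region an address falls in. The enum and function names are hypothetical and only serve the illustration; the real peripheral range is sparsely populated, which this simplified view ignores.

```c
#include <stdint.h>

/* Region names for the simplified memory map (illustrative only). */
enum region { REGION_FLASH, REGION_RAM, REGION_PERIPHERALS, REGION_UNMAPPED };

/* Classify a physical address according to the simplified memory map. */
enum region classify_address(uint32_t addr)
{
    if (addr <= 0x000FFFFFu)
        return REGION_FLASH;        /* 1 MB of Flash starting at 0x0 */
    if (addr >= 0x20000000u && addr <= 0x2001FFFFu)
        return REGION_RAM;          /* 128 KB of RAM */
    if (addr >= 0x40000000u)
        return REGION_PERIPHERALS;  /* reserved for hardware peripherals */
    return REGION_UNMAPPED;         /* an access here would raise a fault */
}
```

For instance, 0x00100000 is the first address past the Flash and falls in an unmapped hole, so a pointer holding that value could not be dereferenced safely.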
Linker script
The compilation stage does not differ from OS-based environments.
At the linking stage, when generating the bare-metal program, the linker needs to know where to place each section:
➔ Code (.text section) → Flash memory
➔ Data (.data / .bss) → RAM
Stack and heap are also allocated in RAM, but at run-time.
This is the purpose of the linker script: an additional file passed to the linker to describe the memory regions of the architecture.
Linker script for a C program on STM32F407 (1/2)

ENTRY(Reset_Handler)

MEMORY
{
    flash(rx) : ORIGIN = 0x00000000, LENGTH = 1M
    ram(wx)   : ORIGIN = 0x20000000, LENGTH = 128K
}

_stack_top = 0x20000000+128*1024;

SECTIONS
{
    . = 0;
    .text :
    {
        /* Startup code must go at address 0 */
        KEEP(*(.isr_vector))
        *(.text)
        . = ALIGN(4);
        *(.rodata)
    } > flash

    . = ALIGN(8);
    _etext = .;
…

The MEMORY block specifies the hardware memory layout.
ENTRY tells the linker the first function that will be called at boot.
_stack_top is a variable defining the start address of the stack.
The SECTIONS block tells the linker how to map the program sections into memory regions.
“.” is a special variable called the “location counter”, which is incremented after each section mapping by the section size.
The .text block maps the .isr_vector, .text and .rodata sections to the Flash memory.
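The value of _stack_top can be checked with a little arithmetic. The sketch below repeats the expression from the linker script in C; the macro and function names are hypothetical. It also assumes the usual ARM convention of a full-descending stack, where the stack pointer is decremented before each store, which is why the start address of the stack may legitimately be one byte past the end of RAM.

```c
#include <stdint.h>

#define RAM_ORIGIN 0x20000000u        /* ORIGIN of the ram region */
#define RAM_LENGTH (128u * 1024u)     /* LENGTH = 128K */

/* Same expression the linker script uses for _stack_top. */
uint32_t stack_top(void)
{
    return RAM_ORIGIN + RAM_LENGTH;
}

/* Address written by the first 32-bit push: the stack pointer is
   decremented by 4 before the store (full-descending stack). */
uint32_t first_push_address(void)
{
    return stack_top() - 4u;
}
```

With these values stack_top() is 0x20020000, one past the last RAM byte 0x2001FFFF, and the first pushed word lands at 0x2001FFFC, the last word of RAM.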
Advanced Operating Systems, Giuseppe Massari
Linker script for a C program on STM32F407 (2/2)

…
    .data :
    {
        _data = .;
        *(.data)
        . = ALIGN(8);
        _edata = .;
    } > ram AT > flash

    _bss_start = .;
    .bss :
    {
        *(.bss)
        . = ALIGN(8);
    } > ram
    _bss_end = .;

    _end = .;
}

The .data block maps the .data section to RAM, but also places a copy of it in the Flash (used to initialize it at boot).
The .bss block maps the .bss section to RAM.
The linker script defines several variables that will be used for initializing sections at run-time, and uses ALIGN commands to satisfy the alignment requirements of the process ABI (Application Binary Interface).
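The effect of an ALIGN command on the location counter can be illustrated in C. This is only a sketch of the rounding rule (round up to the next multiple of a power of two); the function name align_up is hypothetical and the addresses are made-up examples.

```c
#include <stdint.h>

/* Round addr up to the next multiple of align (align must be a power
   of two). This mirrors what ". = ALIGN(align);" does to the location
   counter in the linker script. */
uint32_t align_up(uint32_t addr, uint32_t align)
{
    return (addr + align - 1u) & ~(align - 1u);
}
```

For example, if .data ended at 0x20000001, ALIGN(8) would move the location counter to 0x20000008, so _edata would be a multiple of 8 and the word-by-word start-up loops would terminate exactly on it.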
Boot process
A CPU can be seen as a complex state machine executing the instruction pointed to by the Program Counter (or Instruction Pointer) register, and incrementing it to execute the next instruction.
Question: Where does the processor start executing code from when the system is powered on?
The answer is architecture dependent, but there are two common solutions:
1) Set the Program Counter to a predefined address (e.g., 0x0) to identify the first instruction to execute
2) Read a predefined memory location containing the address of the first instruction, and use that value to initialize the Program Counter (ARM Cortex processors approach)
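The second solution can be sketched as a small host-side simulation in C. The struct and function names are hypothetical; the table layout (first word for the stack pointer, second word for the program counter) follows the one used on the Cortex-M4 in the rest of this lecture.

```c
#include <stdint.h>

/* Hypothetical model of the two CPU registers involved in reset. */
struct cpu_state {
    uint32_t sp;   /* stack pointer */
    uint32_t pc;   /* program counter */
};

/* Solution 2: fetch the initial register values from a table stored at
   the start of memory instead of jumping to a fixed address. */
struct cpu_state simulate_reset(const uint32_t *vector_table)
{
    struct cpu_state cpu;
    cpu.sp = vector_table[0];  /* word at offset 0x0: initial stack pointer */
    cpu.pc = vector_table[1];  /* word at offset 0x4: address of first instruction */
    return cpu;
}
```

Calling simulate_reset() on a table holding { 0x20020000, 0x00000101 } (made-up values) yields sp = 0x20020000 and pc = 0x00000101: the hardware never assumes where the code starts, it looks the address up.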
Boot process
In the STM32F407 the address of the first instruction must be placed at 0x00000004.
Question: Is it enough to put there the address of the main() function of a C program?
Answer: no. High-level programming languages make assumptions about the state of their execution environment, be it the abstraction of a process in an OS or a bare-metal machine.
A small part of the program needs to be written in assembler. This part, executed at boot, aims at satisfying the assumptions above.
We are going to see the assumptions made for C and C++ programs.
C programming language requirements
The stack pointer register must point to the top of a suitable memory area. The compiler implicitly uses the stack to allocate local variables within functions. Until the stack pointer is initialized, only assembler code can be executed.
Global and static initialized variables must be set to their initial value. Since they are placed in RAM, and after power-on the content of RAM is random, they must be explicitly initialized.
Global and static uninitialized variables must be set to 0.
If the program uses the heap, additional support is required. For bare-metal embedded systems the environment may choose not to provide a heap.
If the program uses the C standard library, certain syscalls need to be provided to perform the requested high-level operations. For example, write() is called by printf, and open() when opening a file. For bare-metal embedded systems the environment may choose not to provide an implementation of these syscalls.
C programming language start-up script (1/2)

    .syntax unified
    .cpu cortex-m4
    .thumb

    .section .isr_vector
    .global __Vectors
__Vectors:
    .word _stack_top
    .word Reset_Handler

    .section .text
    .global Reset_Handler
    .type Reset_Handler, %function
Reset_Handler:
    /* Copy .data from FLASH to RAM */
    ldr r0, =_data
    ldr r1, =_edata
    ldr r2, =_etext
    cmp r0, r1
    beq nodata
    subs r2, r2, #4
dataloop:
    ldr r3, [r2, #4]!
    str r3, [r0], #4
    cmp r1, r0
    bne dataloop
nodata:

The .syntax, .cpu and .thumb directives identify the file as assembler for ARM Cortex-M4 processors.
The .isr_vector section contains a table of pointers:
    0x00000000 = stack pointer init (_stack_top, defined in the linker script)
    0x00000004 = program counter init (Reset_Handler)
The .global and .type directives declare the Reset_Handler function.
The dataloop block copies the .data section from Flash to RAM (as shown in the linker script); _data, _edata and _etext are defined in the linker script.
C programming language start-up script (2/2)

    /* Zero .bss */
    ldr r0, =_bss_start
    ldr r1, =_bss_end
    cmp r0, r1
    beq nobss
    movs r3, #0
bssloop:
    str r3, [r0], #4
    cmp r1, r0
    bne bssloop
nobss:
    /* Jump to main() */
    bl main
    /* If main() returns, endless loop */
loop:
    b loop
    .size Reset_Handler, .-Reset_Handler

The bssloop block sets the .bss section to zero; _bss_start and _bss_end are defined in the linker script.
bl is the “function call” instruction in ARM assembly.
We will NOT see how to provide a heap nor how to implement syscalls in a bare-metal environment.
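To make the two assembler loops easier to follow, their logic can be sketched in C and run on a host, with ordinary arrays standing in for Flash and RAM. The function names copy_data and zero_bss are hypothetical; on the target the arguments would correspond to the linker script symbols (_data, _edata, _etext, _bss_start, _bss_end). This is an illustration of the algorithm, not a replacement for the start-up script.

```c
#include <stdint.h>

/* Copy the initial values of .data from their load address in Flash
   (src) to their run-time address in RAM (dst up to dst_end). */
void copy_data(uint32_t *dst, const uint32_t *dst_end, const uint32_t *src)
{
    while (dst != dst_end)
        *dst++ = *src++;   /* one 32-bit word per iteration, like dataloop */
}

/* Fill the .bss range (start up to end) with zeros. */
void zero_bss(uint32_t *start, const uint32_t *end)
{
    while (start != end)
        *start++ = 0;      /* one 32-bit word per iteration, like bssloop */
}
```

Both loops advance by one word and stop on an exact pointer match, which is why the linker script aligns the end of each section: an unaligned end address would never be hit. The loop condition is checked before the first iteration, mirroring the cmp/beq that skips an empty section.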
C++ programming language requirements
All the requirements for the C programming language apply. As with C, the heap and syscalls may be omitted if not used.
Constructors of global objects need to be called before main:
➔ C++ classes can have constructors
➔ C++ objects can be declared as global variables
➔ Their constructors need to be called!
If the program uses exceptions, the compiler generates additional sections that need to be handled in the linker script. For bare-metal embedded systems the environment may choose not to provide C++ exception support.
Linker script for a C++ program on STM32F407

ENTRY(Reset_Handler)
…
SECTIONS
{
    . = 0;
    .text :
    {
        /* Startup code must go at address 0 */
        KEEP(*(.isr_vector))
        *(.text)
        . = ALIGN(4);
        *(.rodata)

        /* Table of global constructors, for C++ */
        . = ALIGN(4);
        __init_array_start = .;
        KEEP (*(.init_array))
        __init_array_end = .;
    } > flash
…

Up to the .rodata line there is no difference with respect to the C program case.
The .init_array section contains a table (built by the compiler) of function pointers to the constructors of global objects.
The rest of the script has no more differences with respect to the C program case.
C++ programming language start-up script

…
    .global Reset_Handler
    .type Reset_Handler, %function
Reset_Handler:
…
nobss:
    /* Call global constructors for C++
       Can't use r0-r3 as the callee doesn't preserve them */
    ldr r4, =__init_array_start
    ldr r5, =__init_array_end
    cmp r4, r5
    beq noctor
ctorloop:
    ldr r3, [r4], #4
    blx r3
    cmp r5, r4
    bne ctorloop
noctor:
    /* Jump to main() */
    bl main
    /* If main() returns, endless loop */
loop:
    b loop
    .size Reset_Handler, .-Reset_Handler

Up to the nobss label there is no difference with respect to the C program case.
The Reset_Handler function first initializes .data and .bss, as global constructors may reference other global variables.
The ctorloop loop calls the function pointers one by one.
From the jump to main() onwards there are no more differences with respect to the C program case.
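The constructor loop can also be sketched in C as a walk over a table of function pointers. Everything named here (ctor_fn, call_constructors, the two example functions and the table) is hypothetical and stands in for the compiler-generated .init_array contents; the sketch is written in C so it can be tried on a host.

```c
/* Type of an entry in the constructor table: a function taking no
   arguments and returning nothing. */
typedef void (*ctor_fn)(void);

/* Call every function pointer in the half-open range [start, end),
   in order, as the ctorloop in the start-up script does. */
void call_constructors(const ctor_fn *start, const ctor_fn *end)
{
    for (const ctor_fn *p = start; p != end; ++p)
        (*p)();
}

/* Two stand-in "constructors" that record the order in which they ran. */
int call_log[2];
int call_count = 0;
void ctor_a(void) { call_log[call_count++] = 1; }
void ctor_b(void) { call_log[call_count++] = 2; }

/* Stand-in for the table delimited by __init_array_start/__init_array_end. */
const ctor_fn example_table[] = { ctor_a, ctor_b };
```

Calling call_constructors(example_table, example_table + 2) runs ctor_a and then ctor_b. An empty table, where start equals end, results in no calls, matching the cmp/beq guard before ctorloop.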

Hardware access

HW/SW interfacing
Question: how can software interact with hardware?
Peripheral registers are the most common way used by hardware peripherals to expose their functionality to the software:
➔ They are memory locations mapped to specific addresses in the processor address space
➔ They must not be confused with CPU registers!
➔ They are mapped at physical addresses (not virtual ones)
In operating systems with memory protection (Linux, Mac, Windows) they are accessible only from within the OS kernel.
In a micro-controller environment they are freely accessible.
Memory mapped peripheral registers
Similarities with programming language variables:
➔ They are accessible in the same way, for example through load/store assembler instructions in RISC processors
➔ In many cases they can be both read and written in software, even if some registers may be read-only
➔ They are 8, 16 or 32 bits wide, just like the unsigned char, unsigned short and unsigned int data types in C/C++
Memory mapped peripheral registers
Differences with programming language variables:
➔ What gets written in those registers causes actions in the real world, such as an LED turning on, an ADC initiating a conversion, a character being sent through a serial port, etc.
➔ They are at well-specified memory addresses. When a variable is allocated in RAM, whether on the stack or the heap, the address where it is allocated does not matter to the programmer. If a peripheral register is mapped at address 0x101e5018, instead, the programmer needs to be sure to write exactly at that address.
➔ Peripheral registers are not for the exclusive use of the programmer: they are shared between the hardware and the software. The hardware can decide to change the content of a register to signal events or status flags, while variables simply keep the value stored in them by the programmer.
Memory mapped peripheral registers
Question: How can a programmer know which peripherals are available in a given architecture?
For a micro-controller, the available peripherals are detailed in a document usually called the data-sheet or reference manual.
Data-sheets are usually available on the manufacturer’s website.
Question: Assuming there is a 32-bit register called IODIR0 at address 0xe0028008, how can a C/C++ program access it to write 0 to it?
Memory mapped peripheral registers (access method 1)

void clearReg()
{
    (*((volatile unsigned int *) 0xe0028008)) = 0;
}

We use a C cast operator to cast the number 0xe0028008 to a pointer to an unsigned int.
The choice of unsigned int is due to the fact that the register is a 32-bit register and unsigned int is a 32-bit data type (on a 32-bit processor).
The '*' at the left dereferences the pointer, thus giving access to the memory location pointed to by the pointer. Zero is written to that address.
The volatile keyword is necessary to disable compiler optimizations, such as instruction reordering and redundant write elimination, that would cause problems because the compiler is not aware that a peripheral register is located at that memory address.
Memory mapped peripheral registers (access method 1)

#define IODIR0 (*((volatile unsigned int *) 0xe0028008))

void clearReg()
{
    IODIR0 = 0;
}

Defining a macro gives more readable code: it makes writing to a peripheral register similar in syntax to writing to a variable.
The symbolic name IODIR0 can be easily looked up in the data-sheet.
Common practice: group the macros defining all registers of an architecture in a header file. Most micro-controller manufacturers already provide that header file.
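Since the width of unsigned int depends on the processor, a common variation is to use the fixed-width type uint32_t from <stdint.h>. The sketch below is meant to be tried on a host, so the macro points at an ordinary variable (fake_iodir0, a hypothetical stand-in) instead of the hardware address; on the target the macro would use the address from the data-sheet.

```c
#include <stdint.h>

/* Host-side stand-in for the hardware register (hypothetical): on the
   target the macro would use the fixed address from the data-sheet. */
volatile uint32_t fake_iodir0 = 0xFFFFFFFFu;

/* uint32_t is exactly 32 bits wide on every platform that provides it. */
#define IODIR0 (*((volatile uint32_t *) &fake_iodir0))

void clearReg(void)
{
    IODIR0 = 0;   /* same syntax as a plain variable assignment */
}
```

The code using the macro does not change when the stand-in variable is replaced by the real address, which is one reason the macro style is convenient.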
Memory mapped peripheral registers (access method 2)
A peripheral rarely has only one register. Usually a peripheral is controlled through a set of registers, which are often mapped at contiguous or close-by addresses.
This allows grouping all the registers of a peripheral in a data structure, like this:

struct GpioPeripheral
{
    volatile unsigned int CRL;
    volatile unsigned int CRH;
    volatile unsigned int BSRR;
    volatile unsigned int BRR;
};

#define GPIO ((struct GpioPeripheral *)0xfeeeaba0)
Memory mapped peripheral registers (access method 2)
Using this method, the code to clear register CRL in the GPIO peripheral is:

void clearReg()
{
    GPIO->CRL = 0;
}

There are no differences, not even in performance, between the two methods.
However, some manufacturers use the first method and some the second one, so it is necessary to know both.
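The struct-based method works because consecutive members land at consecutive register offsets. A minimal host-side sketch can check this with offsetof; it uses uint32_t instead of unsigned int to pin the width, and a hypothetical fake_gpio variable in place of the real peripheral address.

```c
#include <stddef.h>
#include <stdint.h>

struct GpioPeripheral {
    volatile uint32_t CRL;   /* offset 0x00 */
    volatile uint32_t CRH;   /* offset 0x04 */
    volatile uint32_t BSRR;  /* offset 0x08 */
    volatile uint32_t BRR;   /* offset 0x0C */
};

/* Host-side stand-in for the peripheral (hypothetical): on the target
   GPIO would be the base address cast to a struct pointer. */
struct GpioPeripheral fake_gpio = { 1u, 2u, 3u, 4u };
#define GPIO (&fake_gpio)

void clearReg(void)
{
    GPIO->CRL = 0;   /* touches only the register at offset 0x00 */
}
```

After clearReg(), CRL is 0 while CRH, BSRR and BRR keep their values. When registers are not contiguous in the data-sheet, padding members are needed in the struct to keep the offsets right.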
Data-sheets
A data-sheet typically documents each peripheral register with a figure of its bit layout.
Information provided for each register:
➔ Name (e.g., REGEXAMPLE)
➔ Address in memory (e.g., 0xE0000004)
➔ Meaning and access permissions of all the bits: whether they are readable (r) and/or writable (w). Some bits may be unused.
Bit manipulation
Given the following register representation in code:

#define REGEXAMPLE (*((volatile unsigned char *) 0xe0000004))

Questions
1) How to set bit EN to 1 leaving the other bits unaffected?
2) How to clear bit CNF2 to 0 leaving the other bits unaffected?
3) How to test if bit FLAGA is set to 1?

Answers
1) REGEXAMPLE |= (1<<0);
2) REGEXAMPLE &= ~(1<<2);
3) if (REGEXAMPLE & (1<<4)) {…}
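The three answers can be tried on a host with a small sketch. The bit positions (EN = 0, CNF2 = 2, FLAGA = 4) are read off the answers above; the named constants, the helper functions and the fake_reg variable standing in for the hardware register are hypothetical.

```c
#include <stdint.h>

/* Bit positions, as used in the answers. */
#define EN_BIT    0
#define CNF2_BIT  2
#define FLAGA_BIT 4

/* Host-side stand-in for the 8-bit register (hypothetical). */
volatile uint8_t fake_reg = 0x04;   /* initially only CNF2 is set */
#define REGEXAMPLE fake_reg

/* 1) Set EN: OR with a mask that has only the EN bit set. */
void set_en(void)       { REGEXAMPLE |= (uint8_t)(1u << EN_BIT); }

/* 2) Clear CNF2: AND with a mask that has every bit set except CNF2. */
void clear_cnf2(void)   { REGEXAMPLE &= (uint8_t)~(1u << CNF2_BIT); }

/* 3) Test FLAGA: AND with the mask and compare against zero. */
int  flaga_is_set(void) { return (REGEXAMPLE & (1u << FLAGA_BIT)) != 0; }
```

Starting from 0x04, set_en() gives 0x05 and clear_cnf2() then gives 0x01; the other bits are untouched in both steps. Naming the bit positions keeps the code traceable to the data-sheet.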
Bit manipulation