
ARM University Program
Copyright © ARM Ltd 2014

Teaching Embedded System Design and Optimization with the ARM Cortex-M0+ Microcontrollers

Dr. Alexander G. Dean
Dept. of ECE
North Carolina State University
Raleigh, NC
[email protected]
http://www.cesr.ncsu.edu/agdean

Course Approach
  Hands-on MCU development experience
    Programming in C with free toolchain from Keil
    Simple, inexpensive hardware
    Easy system expansion with Arduino-compatible hardware
  Relevant, useful material based on 13 years of experience
    Interaction with industry through on-site design reviews of real embedded systems
    Teaching large (50 to 120 students) undergraduate and graduate embedded system courses

Prerequisites
  Introductory course: introduction to computer organization, C programming
  Advanced course: introductory course. Other courses also helpful

Course Materials
  Easy adoption for your own course
  Flexibility with modular design
  All source files provided (pptx, docx, vsd, c, h …)

Course modules typically include
  PowerPoint slides/lecture notes
  Demonstration code for use in lecture and outside of class
  Homework questions and solutions
  Lab exercise(s) with step-by-step procedure and questions
  Programming project(s) with solution

Status
  Introductory course materials available now
  Advanced course materials available early summer
  Developing a textbook to support both courses


Worldwide Adoption

  Video introduction
  Nearly 100 adoptions since launch last summer
  North & South America, Europe, Asia, Africa

“We were delighted to be one of the first institutions to receive the ARM University Program's Lab-in-a-Box on Embedded Systems. It has immediately proven itself to me as an excellent resource for our research and teaching activities.” Dr. Boris Adryan, University of Cambridge, UK.

“Freescale is delighted to be building on our longstanding relationship with ARM by providing access to our solutions to universities and electronic engineering students around the world. The Lab-in-a-Box is just one way Freescale and ARM collaboratively reach out to our engineers of the future and help us create a new wave of innovative embedded technologies that will drive our increasingly connected world.” Andy Mastronardi, Director, University Programs, Freescale

Cortex Processor Cores
  Cortex-A: application profile
    High performance, multiprocessing
  Cortex-R: real-time profile
    Predictable performance
  Cortex-M: microcontroller profile, optimized for embedded applications
    Implementation
      Short pipeline
      Fast interrupt response
      Low cost, low power
      Fast GPIO access
    Instruction set
      Good code density (16-bit Thumb-2 instructions)
      Bit and byte operations
      Optional single-cycle multiply instruction, hardware divide, saturated math
      Optional DSP & SIMD instructions
      Optional floating point unit

Target Board – Freescale Freedom KL25Z
  32-bit ARM Cortex-M0+ processor
  Freescale Kinetis MKL25Z128VLK4 microcontroller
    Extremely low power use
    48 MHz max processor clock freq.
    128 KB Flash ROM, 16 KB RAM
    Wide range of peripherals, including USB on-the-go
  FRDM-KL25Z board
    $13 (USD)
    Peripherals: 3-axis accelerometer, RGB LED, capacitive touch slider
    Expansion ports are compatible with Arduino shield ecosystem – endless opportunities, low-cost hardware
    mbed.org enabled - online software development toolchain, reusable code

Images courtesy of Freescale

Hardware Ecosystem
  Arduino shields
    Wide variety
    Low cost
    High volume
  Xtrinsic sensor board from element 14

Images courtesy of Freescale, Adafruit, Element 14, Parallax, Seeed Studio


And Even More Shields

Image courtesy of Seeed Studio


SOFTWARE DEVELOPMENT TOOLS

Software Development Suite: MDK-ARM
  Low cost tools for ARM7, ARM9, Cortex-M and Cortex-R4 MCUs
    Extensive support for many devices
    Core and peripheral simulation
    Flash support
  Integrated development environment with ARM compiler
    Full (pro) version available for teaching, free
    32 KB size-limited version available also
    Debugger with full access to program, core and peripherals
    Real-time trace on devices based on Cortex-M3 and M4
  Real-Time Library
    KEIL RTX RTOS + Source Code
    TCP networking suite, Flash File System, CAN Driver Library, USB Device Interface
  Debug Hardware
  Evaluation boards
  Separate support channel
  See www.keil.com


CURRICULUM OVERVIEW

Curriculum Overview
  Introductory Course: Building an Embedded System with a Microcontroller
    Microcontroller concepts
    Software design basics
    ARM Cortex-M0+ architecture and interrupt system
    C as implemented in assembly language
    Peripherals and interfacing
  Advanced Course: Embedded System Design, Analysis and Optimization
    Creating responsive multithreaded systems
    Optimizing code speed
    Optimizing system power and energy
    Optimizing memory requirements
  Details in appendix

Introductory Course Modules
  Software Design Basics: CMSIS, Emb. SW Engineering, Concurrency
  Cortex-M0+ Processor Core
  C as Implemented in Assembly Lang.
  Interrupts
  GP I/O
  Analog I/O
  Timers
  Serial Comm.
  DMA
  Using Arduino Shields
  HW & SW for Robustness

Introduction to Embedded Systems
  What is an Embedded System?
  Why add a computer to the larger system?
  Differences between…
    Embedded and general-purpose computers
    Microcontrollers and microprocessors
  Embedded system
    Functions
    Attributes
    Constraints
    Economics

Embedded MCU vs. General-Purpose CPU
  Both have a CPU core to execute instructions
  Microcontroller has peripherals for concurrent embedded interfacing and control
    Analog
    Non-logic level signals
    Timing, communications
    Reliability and safety
  Embedded systems have concurrent, reactive behaviors
    Must respond to sequences and combinations of events
    Real-time systems have deadlines on responses
    Typically must perform multiple separate activities concurrently

Concurrent Hardware & Software Operation

Embedded systems rely on both MCU hardware peripherals and software to get everything done on time

[Figure: timeline of interleaved hardware and software activity over time]

Advanced Course Modules (1/2)
  Concurrency Concepts and Interrupts
  Design, Analysis & Optimization of Real-Time Systems
  Cooperative Task Scheduling
  Preemptive Task Scheduling
  CMSIS-DSP and Cortex-M0+ Features
  Source Code and Toolchain Tuning
  High-Level Optimizations
  Execution-Time Profiling
  Examining Object Code
  C as Implemented in Assembly Lang.

Advanced Course Modules (2/2)
  Power & Energy Modeling & Analysis
  Optimizing Power and/or Energy with Sleep and DVFS
  Cortex-M0+ Core
  KL25Z MCU
  Freedom Board
  Tools for Analyzing Memory Use
  C Program Memory Use Concepts
  ROM Size Optimization
  RAM Size Optimization


COURSE MODULE DETAILS


INTRODUCTORY COURSE

Software Development, Processor & Interrupts
(Columns: Module, Presentations, Demonstration Code, Homework and Test Questions, Laboratory Exercise, Programming Project)

Introduction to Embedded Systems
  Presentation
  Homework and test questions: Solutions

SW Design Basics: Concurrency, SW Eng. & CMSIS APIs
  Presentation
  Homework and test questions: Solutions

Cortex-M0+ Processor Core
  Presentation
  Homework and test questions: Solutions
  Text Processing in Assembly Language (Lab Exercise, Code)
  Integer Square Root Approximation (Assignment, Starter code, Solution)

C Code as Implemented in Assembly Language
  Presentation
  Demonstration code: C/Asm Demo 1 Code, Demo 2 Code
  Homework and test questions: Solutions, source project
  Examining Toolchain Output (Lab Exercise, Code)

Interrupts
  Presentation
  Demonstration code: Interrupt Demo Code, Notes
  Homework and test questions: Solutions, spreadsheet
  Measuring Interrupt Timing (Lab Exercise)
  Human Response Timer (Assignment, Solution)

Using Peripherals

General Purpose Digital Interfacing
  Presentation
  Demonstration code: GPIO Demo Code (includes Basic Light Switch, RGB LED Flasher, Speaker tone generator, Text LCD with parallel bus)
  Homework and test questions: Solutions, Spreadsheet
  Basic User Interface with LCD and switches (Lab Exercise, Code)
  Slide Whistle (Assignment, Solution)

Analog Interfacing
  Presentation
  Demonstration code: Analog Interfacing Demo Code, Interfacing with Arduino Analog Devices
  Homework and test questions: Solutions, spreadsheet
  ADC: Voltmeter (Lab Exercise, Code)
  Comparator: Voltage Monitor (Lab Exercise, Code)
  DAC: Signal Generator (Lab Exercise, Code)
  Infrared Proximity Sensor (Assignment, Solution)

Timers
  Presentation
  Demonstration code: Timer Demo Code
  Homework and test questions: Solutions, spreadsheet
  Signal Generator with Precision Timing and Buffering (Lab Exercise, Code, Solution)
  Clock with Pulsing LED (Assignment, Solution)
  Using PWM to control motor Speed (Assignment, Solution)

Using Peripherals

Serial Communication
  Presentation
  Demonstration code: I2C and Accelerometer, Creating a Console Interface with a UART, Interfacing with Arduino Serial Devices
  Homework and test questions: Solutions
  UART Performance Analysis (Lab Exercise, Code)
  UART: Creating a Speedometer with a GPS Receiver (Assignment, Solution)
  Interfacing with an SD memory card using SPI (Assignment, Solution)

Improving System Robustness with Hardware and Software
  Presentation
  Demonstration code: Watchdog Timer, Stack Overflow Detection
  Homework and test questions: Solutions
  Testing an embedded system’s robustness (Lab Exercise, Code)
  Making a system robust (Assignment, Solution)

Using Direct Memory Access to Improve Performance
  Presentation
  Demonstration code: Memory Copy & ISR Replacement
  Homework and test questions: Solutions
  Evaluating Memory Copy Speeds (Lab Exercise, Code)
  DMA: Upgrading from ISRs to DMA (Assignment, Solution)


ADVANCED COURSE

Building Multithreaded Systems
(Columns: Module, Presentations, Demonstration Code, Homework and Test Questions, Laboratory Exercise, Programming Project)

Introduction to Advanced Topics
  Presentation

Managing Concurrency with Cooperative and Preemptive Schedulers
  Presentation
  Demonstration code: RTX Preemptive Scheduler, Nonpreemptive Scheduler
  Homework and test questions: Solutions
  Evaluating Scheduler Responsiveness (Assignment, Code, Solution)

Designing Multithreaded Applications with RTOS Support
  Presentation
  Demonstration code: Data race conditions with preemption
  Homework and test questions: Solutions
  Using RTOS Mechanisms (Assignment, Code, Solution)
  Upgrading the Waveform Generator to use an RTOS (Assignment, Solution)

Design, Analysis, and Optimization of Real-Time Systems
  Presentation
  Demonstration code: Response time evaluation
  Homework and test questions: Solutions
  Instrumenting RTX and verifying task timing (Assignment, Solution)

Advanced Debugging with Cortex-M0+ & MDK
  Presentation
  Demonstration code: Microtrace buffer, Kernel-aware debugging with MDK
  Using the MTB (Assignment, Code)

Performance Analysis and Optimization
(Columns: Module, Presentations, Demonstration Code, Homework and Test Questions, Laboratory Exercise, Programming Project)

Profiling Program Execution Time
  Presentation
  Demonstration code: Profiling Spherical Geometry Calculations
  Homework and test questions: Solutions
  Profiling Lab (Lab Exercise, Code)
  Tilt-compensated compass profiling (Assignment, Solution)

Examining Object Code without Getting Lost
  Presentation
  Demonstration code: Sample program
  Homework and test questions: Solutions

Speed Optimization with Toolchain Tuning
  Presentation
  Demonstration code: Tuning the toolchain
  Homework and test questions: Solutions
  Evaluating Compiler Optimizations (Lab Exercise, Code)
  Toolchain tuning for the TC compass (Assignment, Solution)

Speed Optimization with Program Transformations
  Presentation
  Demonstration code: Optimizing Spherical Geometry Calculations
  Homework and test questions: Solutions
  Code optimization for the TC compass (Assignment, Solution)

DSP Acceleration with Cortex-M0+ and the CMSIS-DSP Library
  Presentation
  Demonstration code: Real-time audio filtering
  Homework and test questions: Solutions
  Using the CMSIS-DSP library for an ultrasonic rangefinder (Assignment, Solution)

Power and Energy Analysis and Optimization
(Columns: Module, Presentations, Demonstration Code, Homework and Test Questions, Laboratory Exercise, Programming Project)

Power and Energy Analysis
  Presentation
  Homework and test questions: Solutions
  Freedom Board Power Analysis Lab (Lab Exercise, Code)
  Freedom Board Energy Analysis (Lab Exercise, Code)

KL25Z Features for Low Power and Energy
  Presentation
  Demonstration code: Evaluating Power in Active and Sleep Modes
  Homework and test questions: Solutions
  Evaluating KL25Z Low-Power Standby Modes (Lab Exercise, Code)
  KL25Z Voltage and Frequency Scaling (Lab Exercise, Code)

Optimizing Power or Energy Use
  Presentation
  Demonstration code: Analyzing and optimizing data logger energy use
  Homework and test questions: Solutions
  This Side Up: Optimizing an orientation logger for lower energy use (Assignment, Solution)

Memory Analysis and Optimization
(Columns: Module, Presentations, Demonstration Code, Homework and Test Questions, Laboratory Exercise, Programming Project)

Profiling and Reducing ROM and RAM Memory Requirements
  Presentation
  Demonstration code: Graphics rendering size profiling and optimization
  Homework and test questions: Solutions
  Evaluating the impact of the compiler optimizations and toolchain options
  Reducing RAM and ROM for a multithreaded system (Assignment, Solution)


INTRODUCTORY COURSE

Introduction to Embedded Systems Design
  Embedded System Fundamentals
    Concurrency
    Software Engineering for Embedded Systems
    CMSIS
    Improving System Robustness with Hardware and Software
  Processor
    Cortex-M0+ Processor Core
    C Code as Implemented in Assembly Language
    Interrupts
  Using Peripherals
    General Purpose Digital Interfacing
    Analog Interfacing
    Timers
    Serial Communication
    Direct Memory Access


Software Design Basics
  Concurrency in software and hardware
    Peripherals
    Software tasks
  Task scheduling and response time
    Prioritization
    Preemption
  Software engineering
    Development models
    Design before coding
    Graphical representations: statecharts, flowcharts, sequence diagrams
    Essential UML
    Peer review
    Testing concepts

[Figure: task timelines (Rec, Dec, Check, Sw, LCD) under static, dynamic run-to-completion, and dynamic preemptive scheduling]

Cortex Microcontroller Software Interface Standard
  Vendor-independent hardware abstraction layer for Cortex-M
  Standardizes interfaces to
    Processor core
    Peripherals
    Debug access port
    RTOS
  Provides
    Optimized libraries of DSP functions in fixed and floating point
    Peripheral system view description

Cortex-M0+ CPU Core
  Processor core registers
  Memory space, contents and addressing
  ARMv6-M Thumb instruction set overview

C as Implemented in Assembly Language
  We program in C for convenience, but should understand the assembly code implementing it
    Code efficiency
    Ease of analysis
  An overview of what C gets compiled to
    C start-up module
    Register use
    Activation records
    Subroutines
    Data types & classes
    Using pointers
    Control flow

[Figure: stack layout, listed from lower to higher addresses]
  Lower address
  (Free stack space)
  Activation record for current function: Local storage (<- Stack ptr), Return address, Arguments
  Activation record for caller function: Local storage, Return address, Arguments
  Activation record for caller’s caller function: Local storage, Return address, Arguments
  Activation record for caller’s caller’s caller function: Local storage, Return address, Arguments
  Higher address

Exceptions and Interrupts
  Exception and Interrupt Concepts
    Vector table
    Stack use
    Processing sequence
      Entering an Exception Handler
      Exiting an Exception Handler
  Cortex-M0+ Interrupts
    NVIC interrupt controller, exception mask, prioritization
    Using Port Module and External Interrupts
  Timing Analysis
  Program Design with Interrupts
    Sharing Data Safely Between ISRs and Other Threads
      Data atomicity and race conditions
      Volatile data

General-Purpose I/O
  GPIO
    Basic Concepts
    Port Circuitry
    Control Registers
    Accessing Hardware Registers in C
    Clocking and Muxing
  Circuit Interfacing
    Inputs
      Switches
    Outputs
      LEDs
      Speaker
    Both
      Interfacing with a Text LCD

[Figure: GPIO port circuitry for data bus bit n. Labeled elements: address decoder on the address bus producing the PDOR, PDIR, PDDR, PSOR, PCOR and PTOR select signals; Port Data Direction Register; Port Data Output Register with Set, Rst and Tgl inputs; Port Data Input Register clocked by the I/O clock; Pin Control Register MUX field; pin or pad on package.]
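A minimal host-side sketch of how the set, clear and toggle registers named in the port circuitry act on the output register. The struct and function names (`gpio_model_t`, `gpio_write_psor` and so on) are hypothetical and only model the behavior in ordinary RAM; on the KL25Z the registers are memory-mapped and would be reached through the vendor's device header, which is not shown here.

```c
#include <stdint.h>

/* Host-side model of one GPIO port. Field names follow the registers on the
   slide; the struct itself is hypothetical and lives in ordinary RAM. */
typedef struct {
    uint32_t PDOR; /* Port Data Output Register */
    uint32_t PDIR; /* Port Data Input Register */
    uint32_t PDDR; /* Port Data Direction Register, 1 = output */
} gpio_model_t;

/* Build a single-bit mask for pin number `bit` (0 to 31). */
#define GPIO_MASK(bit) ((uint32_t)1u << (bit))

/* Configure a pin as an output by setting its direction bit. */
void gpio_make_output(gpio_model_t *p, unsigned bit)
{
    p->PDDR |= GPIO_MASK(bit);
}

/* Writes to PSOR, PCOR and PTOR set, clear and toggle the PDOR bits
   selected by the mask; bits written as 0 are left unchanged. */
void gpio_write_psor(gpio_model_t *p, uint32_t mask) { p->PDOR |= mask; }
void gpio_write_pcor(gpio_model_t *p, uint32_t mask) { p->PDOR &= ~mask; }
void gpio_write_ptor(gpio_model_t *p, uint32_t mask) { p->PDOR ^= mask; }
```

Because each write names only the bits to change, software does not need a read-modify-write sequence on the output register, which matters when an ISR and the main thread drive different pins of the same port.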

Demo, Lab and Project
  Demo
    Basic Light Switch
    RGB LED Flasher
    Speaker tone generator
  Lab
    Interfacing with a generic text LCD and switches
    Can target Arduino LCD + switch shield instead
  Project
    Slide whistle sound effect generator

Analog Interfacing
  Analog and digital domains
    Quantization and transfer functions
    Sampling
  Converters
    DAC
    Comparator
    ADC – flash, successive approximation
  KL25Z Peripherals
    Digital to analog converter
    Analog comparator
      Reference voltage DAC
    Analog to digital converter
      Clock configuration
      Input channel multiplexer
      Conversion triggers
      Special features: averaging, low power, repeat, automatic compare

[Figure: successive approximation conversion. Voltage vs. time from start of conversion; the test voltage (DAC output) is compared with the analog input at steps T1 to T6 on a scale from 000000 to 111111.]
  T1: know xxxxxx, try 100000
  T2: know 1xxxxx, try 110000
  T3: know 10xxxx, try 101000
  T4: know 100xxx, try 100100
  T5: know 1001xx, try 100110
  T6: know 10011x, try 100111
  Result: know 100110. Done.
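The successive approximation steps in the figure can be sketched as a short loop. This is a host-side illustration under stated assumptions, not the KL25Z ADC's actual implementation: the function name `sar_convert`, the ideal DAC model and the floating-point comparison are all hypothetical, and `bits` is assumed to be less than 32.

```c
#include <stdint.h>

/* Simulate an ideal successive approximation conversion.
   Each step tries setting the next lower bit, compares the resulting DAC
   test voltage with the input, and keeps the bit only if the test voltage
   does not exceed the input. */
uint32_t sar_convert(double vin, double vref, unsigned bits)
{
    uint32_t code = 0u;
    for (unsigned i = bits; i-- > 0u; ) {
        uint32_t trial = code | ((uint32_t)1u << i);
        /* Ideal DAC: test voltage is vref * trial / 2^bits. */
        double vtest = vref * (double)trial / (double)((uint32_t)1u << bits);
        if (vin >= vtest) {
            code = trial;   /* input is at or above the test voltage: keep the bit */
        }
    }
    return code;
}
```

With 6 bits and an input just above code 100110, the trial codes follow the same sequence as the figure (100000, 110000, 101000, 100100, 100110, 100111), so the conversion always takes one comparison per bit regardless of the input value.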

Demos, Labs and Projects
  Demo
    Measure voltage with ADC
    Detect low voltage with comparator
    Waveform generator with DAC
  Labs
    ADC – Supply voltage monitor
    Comparator – Low voltage alarm
    DAC – Waveform generator
  Project
    Infrared proximity sensor using ADC and digital output

[Figure: object present, reflects IR back to receiver]

Timers
  Concepts
    Elapsed time measurement
    Event counting
    Periodic interrupts
    Input capture
    Output compare
    Pulse-width modulation
    Software data queues
  KL25Z Timers
    Periodic Interrupt Timer
    Timer/PWM Module
    Low-Power Timer
    SysTick

[Figure: Periodic Interrupt Timer. A clock drives a presettable binary counter that raises an interrupt and reloads its start value. Read current timer value (TVL) from PIT_CVALn; read/write timer start value (TSV) from PIT_LDVALn.]
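A short sketch of how a start value for the down-counter in the figure could be chosen for a wanted interrupt rate. It assumes the counter counts from the start value down to zero and then reloads, so one period lasts start value + 1 clock cycles. The function name `pit_load_value` and the 24 MHz clock used in the usage note are illustrative assumptions, not values taken from the course material.

```c
#include <stdint.h>

/* Compute the start (load) value for a down-counting periodic timer.
   The period is (load + 1) input clock cycles, so load = f_clk / f_int - 1.
   Returns 0 if the request cannot be met (rate of 0, or faster than the clock). */
uint32_t pit_load_value(uint32_t clock_hz, uint32_t interrupt_hz)
{
    if (interrupt_hz == 0u || interrupt_hz > clock_hz) {
        return 0u;
    }
    return clock_hz / interrupt_hz - 1u;
}
```

For example, with an assumed 24 MHz timer clock, a 1 kHz interrupt rate needs a start value of 23999. Integer division truncates, so rates that do not divide the clock evenly come out slightly fast.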

Demo, Lab and Project
  Demo
    Count interrupts, adjust PWM signal based on board tilt as measured by accelerometer
  Lab
    Signal generator with precision timing and buffering
  Projects
    Time-of-day clock with pulsing LED
    Using PWM to control a motor’s speed

Serial Communications
  Serial communications
    Concepts
    Tools
    Software: polling, interrupts and buffering
    Processing binary and text messages
  UART communications
    Concepts
    KL25 UART peripheral
  SPI communications
    Concepts
    KL25 SPI peripheral
  I2C communications
    Concepts
    KL25 I2C peripheral

[Figure: data sampling times at the receiver, measured from time zero: Tbit*1.5, Tbit*2.5, Tbit*3.5, Tbit*4.5, Tbit*5.5, Tbit*6.5, Tbit*7.5, Tbit*8.5, Tbit*9.5, Tbit*10.5]

[Figure: serial interface software structure with tx_isr and rx_isr, send_string and get_string]

[Figure: state machine for filtering received text sentences]
  States: Start, Talker + Sentence Type, Sentence Body, Checksum1, Checksum2
  Transition labels (condition: action):
    $: Append char to buf.
    Any char. except *, \r or \n: Append char to buf. Inc. counter
    *, \r or \n, non-text, or counter>6
    buf==$SDDBT, $VWVHW, or $YXXDR: Enqueue all chars. from buf
    Any char. except *: Enqueue char
    *: Enqueue char
    Any char.: Save as checksum1
    \r or \n
    Any char.: Save as checksum2
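The two checksum characters saved by the state machine can be checked once a sentence is complete. The sketch below assumes standard NMEA 0183 framing, which the sentence identifiers ($SDDBT, $VWVHW, $YXXDR) suggest: the checksum is the XOR of every character between `$` and `*`, written as two hexadecimal digits. The function names are hypothetical and the course's own solution may differ.

```c
#include <stddef.h>
#include <stdint.h>

/* Convert one hexadecimal digit to its value, or -1 if it is not a hex digit. */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Return 1 if the sentence has the form "$...*HH" and HH matches the XOR of
   the characters between '$' and '*'; return 0 otherwise. */
int nmea_checksum_ok(const char *s)
{
    if (s == NULL || *s != '$') return 0;

    uint8_t sum = 0u;
    const char *p = s + 1;
    while (*p != '\0' && *p != '*') {
        sum ^= (uint8_t)*p++;
    }
    if (*p != '*') return 0;          /* no checksum field present */

    int hi = hex_value(p[1]);         /* checksum1 in the state machine */
    if (hi < 0) return 0;
    int lo = hex_value(p[2]);         /* checksum2 in the state machine */
    if (lo < 0) return 0;

    return sum == (uint8_t)((hi << 4) | lo);
}
```

In an interrupt-driven design, the XOR could instead be accumulated character by character in the receive path, so that only the final comparison remains when the second checksum character arrives.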

Demo, Lab and Projects
  Demo
    I2C communication with onboard accelerometer
    Console interface with a UART
  Lab
    UART performance and timing analysis
  Projects
    UART: Creating a speedometer with a GPS receiver
    SPI: Interfacing with an SD memory card


Improving System Robustness

Low voltage detector

Watchdog timer

Defensive programming

Stack depth analysis

Testing and test coverage

[Figure: watchdog timer (WDT) value vs. time: Start WDT, Restart WDT, Restart WDT, then WDT times out and resets system]

[Figure: memory map with SF Regs, Global Data, Heap, A Stack, B Stack and Instructions for Thread A and Thread B, with Monitor RAM]

Direct Memory Access
  Basic Concepts
  DMA peripheral
    Selecting trigger sources
  DMA Peripherals in Cortex-M0+
  DMA Applications
    Data Transfer
    Replacing ISRs

Demo, Lab and Project
  Demo
    Memory copy
    Waveform playback to DAC without ISR
  Lab
    Evaluate memory copy speed with different DMA configurations
  Project
    Remote data acquisition system with serial control and analog data input


ADVANCED COURSE

Advanced Design, Analysis and Optimization
  Advanced Scheduling and Design
    Cooperative and Preemptive Schedulers
    Designing Multithreaded Systems
    Sharing Data Safely with RTOS Support
    Timing Analysis of Real-Time Systems
    Advanced Debugging with Cortex-M0+
  Analysis and Optimizations
    Code Execution Speed
      Analysis
      Optimization
    Power and Energy
      Analysis and Modeling
      Optimization
    Memory Requirements
      Analysis
      Optimization

Cooperative and Preemptive Schedulers
  Task scheduling and response time
    Prioritization and preemption
  Non-preemptive scheduling
    Task states and scheduling rules
    Scheduler implementation
    RTX API and example
    Limitations
      Response time
      Prioritization
      Software structure
  Preemptive scheduling
    Task states and scheduling rules
    Scheduler implementation and context switching mechanics
    RTX API and example
    Limitations
      Data sharing
      Memory use
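As an illustration of non-preemptive scheduling with prioritization, here is a minimal run-to-completion scheduler sketch. It is not the course's scheduler and does not use the RTX API; the type `rtc_task_t` and the functions `rtc_tick` and `rtc_run_one` are hypothetical names. Tasks are stored in priority order, index 0 being the highest.

```c
#include <stddef.h>

/* One entry in the task table. Entries are ordered by priority,
   with index 0 the highest. All names here are hypothetical. */
typedef struct {
    void (*run)(void);   /* task body, runs to completion; may be NULL */
    unsigned period;     /* ticks between releases */
    unsigned countdown;  /* ticks left until the next release */
    unsigned pending;    /* releases not yet serviced */
} rtc_task_t;

/* Called once per timer tick (for example from a periodic ISR):
   counts down each task and releases it when its period elapses. */
void rtc_tick(rtc_task_t *tasks, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (tasks[i].countdown > 0u) {
            tasks[i].countdown--;
        }
        if (tasks[i].countdown == 0u) {
            tasks[i].pending++;
            tasks[i].countdown = tasks[i].period;
        }
    }
}

/* Called repeatedly from the main loop: runs the highest-priority
   released task to completion. Returns its index, or -1 if idle. */
int rtc_run_one(rtc_task_t *tasks, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (tasks[i].pending > 0u) {
            tasks[i].pending--;
            if (tasks[i].run != NULL) {
                tasks[i].run();
            }
            return (int)i;
        }
    }
    return -1;
}
```

The response-time limitation listed above is visible in the structure: once `rtc_run_one` starts a task, a higher-priority task released in the meantime has to wait until that task returns.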

Designing Multithreaded Systems
  What is in an RTOS?
    Bounded response times
    Preemption
    Time management
    Synchronization
    Task creation & control
  Signaling events
  Sharing data safely
    Concepts
      Data atomicity
      Readers and writers
      Race conditions
    Mechanisms
      Synchronization objects
      Scheduler lock
      Interrupt masking
  Keil RTX Real-Time Operating System

Creating Real-Time Systems
  Periodic task model
    Releases, periods and deadlines
    Worst-case execution time (analytical and empirical)
    Simplifying assumptions
  Metrics
    Response time
    Schedulability
  Priority-based scheduling
    Priority assignment
    Fixed priority (RM, DM)
      Response time analysis
      Utilization bound schedulability test
    Dynamic priority (EDF)
      Response time analysis
      Utilization bound schedulability test

[Figure: timeline from 0 to 12 showing releases of periodic tasks P1, P2 and P3]

Response time: $R_i = C_i + \sum_{j \in hp(i)} \lceil R_i / T_j \rceil C_j$

Utilization bound: $U_{Max} = m (2^{1/m} - 1)$
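The fixed-priority response time analysis on this slide is usually solved by iteration: start from the task's own execution time and re-evaluate the interference from higher-priority tasks until the value stops changing. The sketch below assumes the simplifying conditions of the basic periodic model (deadline equal to period, no blocking, zero context switch time); the function name and the array layout are hypothetical.

```c
#include <stdint.h>

/* Worst-case response time of task i under fixed-priority preemptive
   scheduling. Tasks are indexed by priority, 0 = highest.
   c[] holds worst-case execution times and t[] holds periods, in the
   same time unit. Iterates R = C_i + sum over j < i of ceil(R / T_j) * C_j
   until R stops changing. Returns -1 if the response time exceeds the
   task's period (taken here as its deadline). */
long response_time(const uint32_t c[], const uint32_t t[], unsigned i)
{
    uint32_t r = c[i];
    for (;;) {
        uint32_t next = c[i];
        for (unsigned j = 0; j < i; j++) {
            /* (r + T_j - 1) / T_j is ceil(r / T_j) in integer arithmetic */
            next += ((r + t[j] - 1u) / t[j]) * c[j];
        }
        if (next > t[i]) {
            return -1L;          /* deadline missed: not schedulable */
        }
        if (next == r) {
            return (long)r;      /* fixed point reached */
        }
        r = next;
    }
}
```

As a worked case, three tasks with execution times 1, 2, 3 and periods 4, 6, 12 have a total utilization of about 0.83, above the three-task utilization bound of about 0.78, yet the iteration gives response times 1, 3 and 10, all within their periods. The utilization bound test is sufficient but not necessary, which is why the exact analysis is worth having.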


Handling More Complex RT Systems Supporting aperiodic tasks

Non-zero context switch times

Dependencies between tasks

Dealing with priority inversion

Dealing with WCET >> ACET

Supporting different deadlines

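[Equation: response time recurrence for R_i extended with a blocking term B_i; the remaining terms are not recoverable from the extracted text]

[Figure: timeline of tasks labeled L, M and H and a resource R, with numbered steps 1 to 5]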

Lab and Projects
  Lab – Response Time Comparison
    Non-preemptive, non-prioritized
    Non-preemptive, prioritized
    Preemptive, prioritized
  Projects
    Worst-case execution time analysis for the Cortex-M0+
    Instrumenting RTX and verifying task timing

Image courtesy Sparkfun

Advanced Debugging with Cortex-M0+
  CoreSight
    Micro Trace Buffer
    Data Watchpoints
    Hardware breakpoints
  Kernel-aware debugging in Keil MDK

Code Execution Speed Analysis
  Overview
  Timing Methods
    Timer peripheral
    Scope with twiddle bits
  Profiling Methods
    Program counter sampling
    Hardware trace support
  Object Code Inspection
    How to keep from getting lost

Code Execution Speed Optimization
  Rationale and Trade-offs
    Design-time vs. Compile-time Optimizations
    Maintainability and Portability vs. Fast Code
    The Evils of Premature Optimization
    Trust but Verify (the Compiler)
  Using the Compiler to the Fullest
    What should the compiler be able to do?
    Helping and persuading the compiler
    Precalculation
  Algorithms, Data Organization and Data Structures
  Math
    Fixed point and integer math
    Reduced precision floating-point math
    Polynomial approximations

Cortex-M Optimizations
  CMSIS-DSP Library
    Primarily supports vectors and matrices for DSP
    Versions optimized for Cortex-M0, M0+, M3, or M4
    Data types
      Floating point
      Fixed point: q7, q15, q31, q63
    Wide range of functions available
      Basic math, fast math, complex math
      Filters, transforms, matrix functions
      Motor control, Statistical functions
      Support functions
      Interpolation functions
  Cortex-M0+-specific coding practices
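To make the fixed-point data types concrete, here is a small hand-written sketch of a q15 multiply: a 16-bit signed value with 15 fractional bits, so 16384 represents 0.5. This is not a CMSIS-DSP function; the type name `fix15_t` and the function `fix15_mul` are hypothetical, and the code assumes the compiler performs an arithmetic right shift on negative values, which is common but implementation-defined in C.

```c
#include <stdint.h>

/* 16-bit signed fixed point with 15 fractional bits (range -1.0 to just under 1.0). */
typedef int16_t fix15_t;

/* Multiply two q15 values with saturation.
   The 32-bit product has 30 fractional bits; shifting right by 15
   returns it to 15 fractional bits. Only -1.0 * -1.0 overflows,
   and that case saturates to the largest positive value. */
fix15_t fix15_mul(fix15_t a, fix15_t b)
{
    int32_t product = ((int32_t)a * (int32_t)b) >> 15;
    if (product > 32767) {
        product = 32767;
    }
    return (fix15_t)product;
}
```

On a core without a floating point unit, such as the Cortex-M0+, this replaces a software floating-point multiply with one integer multiply and a shift, at the cost of a fixed range and resolution.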

Demo and Project
  Demos
    Profiling and optimizing spherical geometry code
  Project
    Profiling and optimizing a tilt-compensated compass

Power and Energy Analysis
  Modeling fundamentals
    Basic models
    Static & dynamic power
    Optimizing for power vs. energy
  Power systems
    Voltage regulators and switching converters
    Switches (diodes and transistors)
  Modeling system power
  Measuring current and power
    KL25Z processor
    Freedom board
    Sampling V and I
  Measuring energy
    Sampling V and I and integrating
    Using an ultracapacitor

[Figure: capacitor voltage vs. time, falling from V1 to V2 over Δt]

Demos and Labs
  Demos
    Power measurement
    Ultracapacitor-based energy measurement
  Labs
    Freedom board power analysis – where does the power go?
    Energy analysis – how long will the ultracap power the board?

[Figure labels: Power, MCU, ???]

MCU Current (µA) at VDD = 3V

Mode   | Normal | LL  | VLL  | VLP
Run    | 5000   | -   | -    | 250
Wait   | 3700   | -   | -    | 135
Stop   | 345    | 1.9 | -    | 4.4
Stop 3 | -      | -   | 1.4  | -
Stop 1 | -      | -   | 0.77 | -
Stop 0 | -      | -   | 0.38 | -

Power and Energy Optimization
  System optimization challenges
  Optimizations for peripherals
    Standby and low-power modes
    Clock gating
    Voltage scaling and conversion
    Frequency scaling
  Optimizations for the MCU
    Voltage scaling
    Clock frequency scaling
    Voltage and frequency scaling
    Active & sleep modes
  Integrating power and energy management into task schedulers
    Non-real-time systems
    Real-time systems

[Figure: average power (mW, 0 to 1.6) vs. MCU clock frequency (MHz, 0 to 50), with curves PActive, PSleep and Paverage]

Analysis of Memory Requirements
  Memory is free up to a point … after which it becomes expensive
  Reducing memory requirements may enable
    Use of a less expensive MCU
    Implementation of a more sophisticated algorithm
    More diagnostic code
    More fault logging
  Determining Memory Requirements
    Program memory use
      Static, automatic and dynamic data
      Code memory
    Understanding the linker map file

Optimization of Memory Requirements
  Optimizing Data Memory
    Language support for read-only data
    Toolchain support
    Memory models
    Data sizes
    Packed data structures and bitfields
  Improving stack memory size estimation
  Shrinking activation records
  Using stack-friendly functions

Optimization of Memory Requirements
  Optimizing Code Memory
    Language support
    Toolchain configuration
    Function outlining
    Memory models
    Optimized library variants
    Improving similar or identical code
  Optimization for Multitasking Systems
    Improving the accuracy of stack memory estimates
    Reducing or eliminating preemption
    Combining tasks to reduce stack count
    Preemption threshold scheduling

[Figure: RAM layouts for preemptive dynamic and non-preemptive dynamic scheduling, showing Task 1 to Task 4 statics and per-task max stack regions]


EXAMPLE PROJECTS


INFRARED PROXIMITY SENSOR

Concepts
  Detect object by shining infrared light and measuring impact on ambient light level
  Components
    Infrared LED – emits IR light
    Infrared phototransistor – conducts more current as IR increases

[Figure: no object present, no IR reflected back to receiver]

[Figure: object present, reflects IR back to receiver]

Using Differential Measurements
  Basic approach of measuring ambient + reflected light is unreliable
    Vulnerable to changes in ambient light due to flicker in light sources, shadows, etc.
  Use differential measurements instead
    Measure brightness with IRLED off
    Measure brightness with IRLED on
    Difference in brightness levels indicates amount of IR reflected

[Figure: measure IR level with IRLED off; measure IR level with IRLED on]

Response Time Issues
  IR Sensor (phototransistor) does not respond instantaneously
    Has internal capacitance which must be charged or discharged
    Rate of change depends on brightness
  Need to introduce time delay in processing sequence
    Change IRLED value
    Wait for some time for sensor to respond
    Read IR Sensor

[Figure: IRLED switching between On and Off over time, and the IR sensor output moving between darker and lighter]
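The differential measurement and the settling delay can be combined into one routine: switch the IRLED, wait, read, and subtract. The sketch below is an illustration only. The hardware hooks (`set_irled`, `delay_us`, `read_sensor`) are hypothetical function pointers, the simulated hardware exists only so the code can run on a PC, and it assumes the sensor reading rises as more IR reaches the phototransistor, which depends on how the sensor is wired.

```c
/* Hardware access hooks; all three are hypothetical and board-specific. */
typedef struct {
    void     (*set_irled)(int on);       /* drive the IR LED on (1) or off (0) */
    void     (*delay_us)(unsigned us);   /* busy-wait or timer delay */
    unsigned (*read_sensor)(void);       /* e.g. an ADC reading of the phototransistor */
} ir_hw_t;

/* Differential measurement: reading with the IRLED on minus reading with
   it off. The ambient contribution appears in both readings and cancels. */
int ir_reflected_level(const ir_hw_t *hw, unsigned settle_us)
{
    hw->set_irled(0);
    hw->delay_us(settle_us);             /* let the sensor settle */
    unsigned off_level = hw->read_sensor();

    hw->set_irled(1);
    hw->delay_us(settle_us);             /* let the sensor settle again */
    unsigned on_level = hw->read_sensor();

    hw->set_irled(0);                    /* leave the LED off between measurements */
    return (int)on_level - (int)off_level;
}

/* Host-side stand-in for the hardware so the sketch can run on a PC. */
static int sim_led_on;
static unsigned sim_ambient = 100u;      /* ambient light contribution */
static unsigned sim_reflect = 40u;       /* extra reading when IR is reflected */

static void sim_set_irled(int on) { sim_led_on = on; }
static void sim_delay_us(unsigned us) { (void)us; }
static unsigned sim_read_sensor(void)
{
    return sim_ambient + (sim_led_on ? sim_reflect : 0u);
}

static const ir_hw_t sim_hw = { sim_set_irled, sim_delay_us, sim_read_sensor };
```

In the simulation the result stays at 40 whether the ambient level is 100 or 900, which is the point of measuring differentially. The settling time would have to be chosen from measurements of the actual sensor.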


BUBBLE LEVEL WITH TEXT LCD

Learning Objectives
  Configuring GPIO pins for input and output
  Interfacing with Text LCD controller
  Developing Code
    Printing text
    Interfacing with accelerometer

Text LCD Module

[Figure: MCU connected to the LCD controller (HD44780 or equivalent), which drives the LCD glass (16 rows) and, through serial data, the column drivers (40 columns each)]
  Signals: Enable (E), Read/~Write (R/~W), Register Select (RS), Data Bus (DB0-7)
  Other pins: Contrast Adjustment (VO), Ground (VSS), Supply (VDD)

Hardware Overview
  MCU communicates with LCD controller via
    Parallel data bus (DB0-7)
    Read/~Write
    Register Select (register or data?)
    Enable
  LCD controller interprets commands from MCU
    Write to display memory
    Change configuration
    Read status
    Read memory

LCD Controller Instructions

Instruction | RS | R/W | B7 | B6 | B5 | B4 | B3 | B2 | B1 | B0 | Description
Clear display | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | Clears display and returns cursor to the home position (address 0).
Cursor home | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | * | Returns cursor to home position.
Entry mode set | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | I/D | S | Sets cursor move direction (I/D); specifies to shift the display (S). These operations are performed during data read/write.
Display on/off control | 0 | 0 | 0 | 0 | 0 | 0 | 1 | D | C | B | Sets on/off of all display (D), cursor on/off (C), and blink of cursor position character (B).
Cursor/display shift | 0 | 0 | 0 | 0 | 0 | 1 | S/C | R/L | * | * | Cursor-move or display-shift (S/C), direction (R/L).
Function set | 0 | 0 | 0 | 0 | 1 | DL | N | F | * | * | Sets interface data length (DL), number of display lines (N), and character font (F).
Set CGRAM address | 0 | 0 | 0 | 1 | CGRAM address (B5-B0) | Sets the CGRAM address. CGRAM data are sent and received after this setting.
Set DDRAM address | 0 | 0 | 1 | DDRAM address (B6-B0) | Sets the DDRAM address. DDRAM data are sent and received after this setting.
Read busy flag & address counter | 0 | 1 | BF | CGRAM/DDRAM address (B6-B0) | Reads busy flag (BF) indicating internal operation being performed and reads CGRAM or DDRAM address counter contents.
Write CGRAM or DDRAM | 1 | 0 | Write Data (B7-B0) | Write data to CGRAM or DDRAM.
Read from CG/DDRAM | 1 | 1 | Read Data (B7-B0) | Read data from CGRAM or DDRAM.
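The instruction table maps directly onto small helper functions that build the command byte (B7 to B0) sent with RS = 0 and R/W = 0. The sketch below only assembles the bytes; driving the E, RS and data lines is board-specific and not shown. The function names are hypothetical, and the meaning assigned to each flag (for example DL = 1 for an 8-bit interface) follows the usual HD44780 convention rather than anything stated in the table.

```c
#include <stdint.h>

/* Clear display: 0000 0001 */
uint8_t lcd_cmd_clear(void)
{
    return 0x01u;
}

/* Entry mode set: 0000 01 I/D S
   increment = 1 moves the cursor right after each character (assumed polarity). */
uint8_t lcd_cmd_entry_mode(int increment, int shift_display)
{
    return (uint8_t)(0x04u | (increment ? 0x02u : 0u) | (shift_display ? 0x01u : 0u));
}

/* Display on/off control: 0000 1 D C B */
uint8_t lcd_cmd_display(int display_on, int cursor_on, int blink_on)
{
    return (uint8_t)(0x08u | (display_on ? 0x04u : 0u)
                           | (cursor_on  ? 0x02u : 0u)
                           | (blink_on   ? 0x01u : 0u));
}

/* Function set: 001 DL N F **
   eight_bit = 1 selects an 8-bit data bus, two_lines = 1 a two-line display
   (assumed polarities, per the usual HD44780 convention). */
uint8_t lcd_cmd_function_set(int eight_bit, int two_lines, int large_font)
{
    return (uint8_t)(0x20u | (eight_bit  ? 0x10u : 0u)
                           | (two_lines  ? 0x08u : 0u)
                           | (large_font ? 0x04u : 0u));
}

/* Set DDRAM address: 1 followed by a 7-bit address */
uint8_t lcd_cmd_set_ddram(uint8_t address)
{
    return (uint8_t)(0x80u | (address & 0x7Fu));
}
```

A typical start-up sequence built from these helpers would send function set, display control, clear display and entry mode in turn, waiting for the busy flag to clear (or for a fixed delay) between commands.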


POWER AND ENERGY MEASUREMENT LAB

Learning Objectives
  Using a timer peripheral
    Low-power timer configuration
    Interrupt handlers
  Low-Power MCU operation
    Low power modes
    Run and stop modes
    Low-leakage wakeup unit
  Measuring power and energy
    Concepts
    Using an ultracapacitor

System Power
  The MCU uses up to about 15 to 20 mW
  Many embedded systems have peripheral circuitry, and that also draws power
    So we need to consider that as well.
    The Freedom board is no exception – it is a good example.

Freedom KL25Z Board Power Architecture
  Will evaluate impact of disconnecting OpenSDA interface

[Figure: board power architecture. Labeled elements: supply rails P5V_SDA, P5-9_VIN, P5V_KL25Z, P3V3, P3V3_KL25Z and P3V3_SDA; linear 3.3V regulator; coin cell; KL25Z MCU; inertial sensor; RGB LEDs; OpenSDA interface; jumpers J3, J4 and J14 (Reset).]

How Do We Measure Energy?
  Energy $W(T) = \int_0^T V(t) I(t) \, dt$
  V and I will vary as we turn on and off devices, change clock speeds, etc.
  We need something to integrate power over time
  Solutions:
    Use sampling energy meter, but low sample rate will limit accuracy
      MCU may wake up for only a few microseconds before going to sleep
    Power the circuit from a capacitor with a known capacitance C, then calculate capacitor energy before and after test
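For the sampling approach, the energy integral can be approximated from evenly spaced voltage and current samples. The sketch below uses the trapezoidal rule; the function name `energy_joules` and the sample layout are hypothetical, and nothing here models how a particular energy meter works.

```c
#include <stddef.h>

/* Approximate W = integral of V(t) * I(t) dt with the trapezoidal rule.
   v[] and i[] hold n simultaneous samples of voltage (V) and current (A)
   taken every dt seconds. Returns energy in joules; 0 if fewer than
   two samples are given. */
double energy_joules(const double v[], const double i[], size_t n, double dt)
{
    double w = 0.0;
    for (size_t k = 1; k < n; k++) {
        double p_prev = v[k - 1] * i[k - 1];   /* power at the previous sample */
        double p_now  = v[k] * i[k];           /* power at the current sample */
        w += 0.5 * (p_prev + p_now) * dt;
    }
    return w;
}
```

The accuracy limit noted above shows up directly: if the MCU wakes for a few microseconds between two samples, that burst of current never appears in `i[]`, and the computed energy is too low.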

Capacitor-Based Energy Measurement
  Measure capacitor voltage before (V1) and after (V2)
  The energy W used by the circuit can be calculated:
    $W = \frac{C (V_1^2 - V_2^2)}{2}$
  Average power is energy divided by time:
    $P = \frac{C (V_1^2 - V_2^2)}{2t}$
  A constant current load I will take t seconds to discharge the capacitor from V1 to V2:
    $t = \frac{C (V_1 - V_2)}{I}$
  A constant resistance load R will take t seconds to discharge the capacitor from V1 to V2:
    $t = -RC \ln\frac{V_2}{V_1}$

[Figure: capacitor voltage vs. time, falling from V1 to V2 over Δt]
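The capacitor-based calculations are simple enough to wrap in helper functions. This is an illustrative sketch with hypothetical function names; it covers the energy, average power and constant-current discharge time, and leaves out the constant-resistance case. Units are farads, volts, amperes, seconds, joules and watts.

```c
/* Energy drawn from a capacitor of capacitance c (F) as its voltage
   falls from v1 to v2 (V): W = C * (V1^2 - V2^2) / 2, in joules. */
double cap_energy_used(double c, double v1, double v2)
{
    return 0.5 * c * (v1 * v1 - v2 * v2);
}

/* Average power over the test: the energy used divided by the
   elapsed time t (s), in watts. */
double cap_average_power(double c, double v1, double v2, double t)
{
    return cap_energy_used(c, v1, v2) / t;
}

/* Time for a constant current load i (A) to discharge the capacitor
   from v1 to v2: t = C * (V1 - V2) / I, in seconds. */
double cap_discharge_time_const_current(double c, double v1, double v2, double i)
{
    return c * (v1 - v2) / i;
}
```

For a sense of scale, a 1 F capacitor falling from 3.0 V to 2.0 V has delivered 2.5 J; if that took 100 s, the average power was 25 mW, and a steady 5 mA load would take 200 s to cause the same voltage drop. These example values are chosen for round numbers and are not measurements from the Freedom board.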

Summary
  Ready-to-use teaching material
  Supports range of embedded systems courses
    Introductory
      Programming a microcontroller in C
    Advanced
      Optimizing response time
      Optimizing execution speed
      Optimizing memory use
      Optimizing power and energy
  Targets Freescale Freedom-KL25Z MCU platform
  For more information, contact
    [email protected]
    [email protected]
    www.arm.com/university

