DSP Architecture Differences and Examples of Embedded Computers
Lecture 3 January 18, 2005
EENG 449b / CPSC 439b Computer Systems
Andreas Savvides
andreas.savvides@yale.edu
Office: AKW 212
Tel 432-1275
Course Website: http://www.eng.yale.edu/enalab/courses/2005s/eeng449b
Recap: 5 Steps of MIPS Datapath
[Figure: five-stage MIPS datapath. Stages: Instruction Fetch; Instr. Decode / Reg. Fetch; Execute / Addr. Calc; Memory Access; Write Back. Labels: Memory, Reg File, Sign Extend, ALU, Zero?, Data Memory, LMD, MUX, Adder (+4), Next SEQ PC, Next PC, Address, Inst, RS1, RS2, RD, Imm, WB Data]
How do DSP Processors Differ?
Designed for high-performance, repetitive, numerically intensive tasks
Distinct features:
• Single-cycle multiply-accumulate (MAC) instructions
o Useful for digital filters, FFTs, correlation computations
• Several memory accesses in the same cycle
• One or more address generation units
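As a rough illustration of why the MAC operation matters, the sketch below writes an FIR digital filter as a loop whose body is one multiply followed by one add into an accumulator. On a DSP each inner iteration would map to a single MAC instruction; here Python only models the arithmetic. The function name `fir_filter` and the sample values are assumptions made for this example.

```python
def fir_filter(samples, coeffs):
    """Direct-form FIR filter: y[n] = sum over k of coeffs[k] * samples[n - k]."""
    output = []
    for n in range(len(samples)):
        acc = 0  # models the accumulator register
        for k, c in enumerate(coeffs):
            if n - k >= 0:
                acc += c * samples[n - k]  # one multiply-accumulate (MAC)
        output.append(acc)
    return output


# 3-tap moving sum over a short, made-up input sequence
result = fir_filter([1, 2, 3, 4], [1, 1, 1])
print(result)  # [1, 3, 6, 9]
```

Each output sample costs one MAC per filter tap, which is why a single-cycle MAC and several memory accesses per cycle (one coefficient, one sample) go together.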
An example DSP processor datapath
Specialized Addressing Modes
Register indirect addressing with post increment
• In MIPS we have add R4, (R1)
• How would it be in DSP?
Modulo addressing
Bit-reverse addressing => FFT
• FFT algorithms shuffle their addressing
o E.g., for an 8-point FFT, 0,1,2,3,4,5 is accessed as 0,4,2,6,1,5
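The two addressing modes can be sketched in software as follows. A DSP address generation unit does this in hardware at no extra cycle cost; the helper names `bit_reverse` and `modulo_next`, the buffer base address 100, and the buffer length 4 are assumptions chosen only for illustration.

```python
def bit_reverse(index, num_bits):
    """Return index with its lowest num_bits bits in reversed order."""
    reversed_index = 0
    for _ in range(num_bits):
        reversed_index = (reversed_index << 1) | (index & 1)
        index >>= 1
    return reversed_index


def modulo_next(address, step, base, length):
    """Advance a pointer inside a circular buffer [base, base + length)."""
    return base + (address - base + step) % length


# Access order for an 8-point FFT (3 address bits)
fft_order = [bit_reverse(i, 3) for i in range(8)]
print(fft_order)  # [0, 4, 2, 6, 1, 5, 3, 7]

# Post-increment with wrap-around in a 4-entry buffer starting at address 100
addr = 100
visited = []
for _ in range(6):
    visited.append(addr)
    addr = modulo_next(addr, 1, 100, 4)
print(visited)  # [100, 101, 102, 103, 100, 101]
```

Modulo addressing is what makes circular buffers for filter delay lines cheap: the pointer wraps without an explicit compare-and-reset in the inner loop.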
Specialized I/O Handling Mechanisms
DSPs need to get a lot of data from the outside world
• Cameras, cell phones, MP3 players
Acquire data w/o processor interruption
• Specialized interrupt schemes
• DMA transfer units, specialized serial and parallel ports
• Multiport memories and independent memory banks
• Multiple on-chip buses
Tools disadvantage: general purpose processors have more tools available.
DSP Design Choices
Arithmetic format
• Fixed Point vs. Floating Point
• Fixed point: numbers are integers or fractions in a fixed range
• Floating point:
o Exponent and mantissa
o Mantissa x 2^exponent
Fixed vs. floating point tradeoffs?
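One way to get a feel for the tradeoff is to model a 16-bit fractional fixed-point format in software. The sketch below assumes the common Q15 convention (1 sign bit, 15 fractional bits), which the slides do not name explicitly; the helper names are also assumptions. Values outside [-1, 1) saturate, and the resolution is fixed at 2^-15, whereas a floating point exponent would trade range for precision automatically.

```python
Q15_SCALE = 1 << 15  # 32768: one sign bit, 15 fractional bits


def to_q15(value):
    """Quantize a real number to a 16-bit Q15 integer, saturating at the range limits."""
    raw = int(round(value * Q15_SCALE))
    return max(-Q15_SCALE, min(Q15_SCALE - 1, raw))


def from_q15(raw):
    """Convert a Q15 integer back to a Python float."""
    return raw / Q15_SCALE


def q15_mul(a, b):
    """Multiply two Q15 numbers; the wide product is shifted back to Q15 (no saturation)."""
    return (a * b) >> 15


half = to_q15(0.5)
quarter = q15_mul(half, half)
print(half, quarter, from_q15(quarter))  # 16384 8192 0.25
print(to_q15(1.0))  # 32767: 1.0 is just out of range, so it saturates
```

The multiply needs only an integer multiplier and a shift, which is part of why fixed-point parts tend to be cheaper and lower power; the cost is that the programmer has to manage scaling and overflow by hand.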
DSP Data Widths & Speed
Floating point: mostly 32-bit
Fixed point: 16-bit
Speed factor:
• Clock speed does not tell the whole story
• MIPS is the common metric
• Some DSPs use a VLIW architecture
Harvard vs. Von Neumann
Software Development Path
An Example Microcontroller: OKI ML67Q5002 (Not a DSP!)
OKI ML67Q5002
• 32-bit ARM7TDMI core (16-bit THUMB mode)
• Built-in memory:
o SRAM 32 Kbytes
o Boot ROM 4 Kbytes
o FLASH memory 256 Kbytes
• Provided interfaces:
o 4 channels of 10-bit resolution ADC
o DMA support
o SPI, SIO, I2C, UART, PWM interfaces
o 42 configurable GPIO pins
o Variety of external and internal configurable interrupts
o 6 hardware timers
Features
• ARM7TDMI
• ROM-less (ML675001), 256KB MCP Flash (ML67Q5002), 512KB MCP Flash (ML67Q5003)
• 8KB Unified Cache
• 32KB RAM
• Interrupts 25 + 1 FIQ
• I2C (1-ch x master)
• DMA (2-ch)
• Timers (7 x 16-bit)
• WDT (16-bit)
• PWM (2 x 16-bit)
• UART (2-ch) / SIO (1-ch)
• GPIO (5 x 8-bit)
• ADC (4-ch x 10-bit)
• Up to 66MHz
• -40 ~ +85 C
• Package: 144 LFBGA, 144 QFP
XYZ Computation: The OKI ARM ML675001/67Q5002/67Q5003
[Slide from OKI Semiconductor]
OKI ARM ML675001/67Q5002/67Q5003
ARM7TDMI
What does ARM7TDMI Mean?
Based on an ARM7 core
• Von Neumann Architecture
o Instructions and data share the same address and data bus
• Approximately 1.9 clock cycles per instruction
• T: Thumb architecture extension, 2 instruction sets
o ARM 32-bits
o Thumb 16-bits
• D: Core has debug extensions
• M: Core has an enhanced multiplier (32x8) with instructions for 64-bit results
• I: Core has EmbeddedICE Logic Extensions
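A back-of-the-envelope throughput estimate follows from the cycles-per-instruction figure. The sketch below combines the roughly 1.9 CPI quoted here with the 66 MHz maximum clock from the OKI feature list; the function name `mips_rating` is an assumption, and real throughput depends on the instruction mix and memory wait states.

```python
def mips_rating(clock_mhz, cycles_per_instruction):
    """Millions of instructions per second = clock (MHz) / CPI."""
    return clock_mhz / cycles_per_instruction


# 66 MHz maximum clock and about 1.9 cycles per instruction, both from the slides
print(round(mips_rating(66, 1.9), 1))  # 34.7
```

This is the sense in which clock speed alone does not tell the whole story: the same clock with a lower CPI, or with several operations per instruction, yields more useful work.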
CPU States
CPU can be in either ARM or THUMB state
• User can implicitly change the processor state from ARM to THUMB
• All exception handling happens in ARM mode
• If an exception happens during THUMB mode, the processor transitions to ARM to handle the exception and returns to THUMB at the end of the exception handler
THUMB mode trades off performance for code density
• Cheaper memory and lower power consumption for embedded systems
[Figure: memory map marking where FLASH, external SRAM, and internal RAM start]
MCU Basics: What are interrupts?
Asynchronous breaks in the program execution
• Press of a button, expiration of a timer, DMA interrupt indicating the completion of a memory transfer
When an interrupt occurs, the processor will transition to the corresponding interrupt handler to service the interrupt and then resume execution
The OKI processor has an 8-level interrupt priority mechanism
• Total of 24 types of interrupts that can happen during instruction execution
o 1 fast external interrupt
o 4 external interrupts
o 19 internal interrupts
- E.g., system timer, watchdog timer, DMA interrupts, etc.
The chip has mechanisms for dealing with interrupts
• Interrupts are enabled and disabled through registers for each peripheral
Hardware Timers (16-bit)
• Control register: controls the mode (interval or one-shot), starts and stops the timer, enables/disables the interrupts for this timer
• Compare register: holds the value to compare against
• Base register: holds the value that initializes the timer at startup
• Clock divider
Steps in Setting up a Hardware Timer
Example using hardware TIMER0
1. Stop the timer and disable interrupts by writing to the control register (TIMECNTL0)
2. Write the timer starting value to the base register (TIMEBASE0)
3. Write the stop value to the compare register (TIMECOMP0)
4. Start the timer by writing to the control register (TIMECNTL0)
An interrupt will occur when the counter register reaches the value of the compare register.
Note: After the interrupt is handled, the status register (TIMESTAT0) needs to be cleared to use the timer again.
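The sequence can be rehearsed against a toy software model before touching hardware. The sketch below is only a simulation: the register names come from the slides, but the bit positions (`START`, `INT_ENABLE`), the up-counting behaviour, and the way the status flag is cleared are assumptions, not the real OKI register map. Consult the chip's user manual for the actual layout.

```python
class Timer0Model:
    """Toy software model of a 16-bit timer; not the real OKI register map."""

    START = 0x1       # hypothetical "run" bit in TIMECNTL0
    INT_ENABLE = 0x2  # hypothetical "interrupt enable" bit in TIMECNTL0

    def __init__(self):
        self.TIMECNTL0 = 0  # control register
        self.TIMEBASE0 = 0  # value loaded into the counter at start
        self.TIMECOMP0 = 0  # compare (stop) value
        self.TIMESTAT0 = 0  # status: set once the compare value is reached
        self.counter = 0

    def write_control(self, value):
        starting = (value & self.START) and not (self.TIMECNTL0 & self.START)
        self.TIMECNTL0 = value
        if starting:
            self.counter = self.TIMEBASE0  # reload from the base register

    def tick(self):
        """Advance one timer clock; return True if an interrupt is raised."""
        if not (self.TIMECNTL0 & self.START) or self.TIMESTAT0:
            return False  # stopped, or waiting for the status to be cleared
        self.counter = (self.counter + 1) & 0xFFFF  # 16-bit counter
        if self.counter == self.TIMECOMP0:
            self.TIMESTAT0 = 1
            return bool(self.TIMECNTL0 & self.INT_ENABLE)
        return False


timer = Timer0Model()
timer.write_control(0)                               # 1. stop timer, disable interrupts
timer.TIMEBASE0 = 0                                  # 2. starting value
timer.TIMECOMP0 = 5                                  # 3. stop (compare) value
timer.write_control(timer.START | timer.INT_ENABLE)  # 4. start the timer

ticks = 0
while not timer.tick():
    ticks += 1
print(ticks + 1)     # 5 clocks until the interrupt fires
timer.TIMESTAT0 = 0  # clear the status so the timer can be used again
```

On the real part the same four writes would go to memory-mapped register addresses, and the clock divider would set how long each tick lasts.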
How do you access peripherals?
You can access peripherals and GPIO by reading/writing registers
Typically one would write device drivers and then use higher level abstractions
You will need this knowledge to write device drivers for different peripherals and to assess the real-time capabilities of your software
Some platforms & applications
Seismic monitoring, personal exploration rover, mobile micro-servers, networked info-mechanical systems, hierarchical wireless sensor networks
[NIMS, UCLA] [Robotics, CMU] [Intel + UCLA]
[CENS, UCLA][Intel + UCLA]
[Slide from V. Ragunanthan]
A Generic Sensor Node Architecture
PROCESSING SUB-SYSTEM
COMMUNICATION SUB-SYSTEM
SENSING SUB-SYSTEM
POWER MGMT. SUB-SYSTEM
ACTUATION SUB-SYSTEM
Base Case: The Mica Mote (The most popular sensing platform today)
[Figure: Mica mote block diagram. Labels: AVR 128, 8-bit MCU; DS2401 Unique ID; 51-pin I/O connector (digital I/O, analog I/O, programming lines); transmission power control; hardware accelerators; radio transceiver (CC1000 or CC2420); power regulation MAX1678 (3V); co-processor; external flash]
For more information refer to the TinyOS Website http://www.tinyos.net
What is Stargate?
A single board, wireless-equipped computing platform
• Developed at Intel Research
Leverages advances in computation, communication and storage to facilitate wireless systems research
System architecture
Computation sub-system
PXA255 processor based on the XScale microarch.
• Successor to the StrongARM family
• Variable clock (100 - 400 MHz), less than 500 mW power
• Several sleep modes, rich set of peripherals
Wireless DPM: Hierarchical radios
Three vastly different wireless radios supported
Combined to form a power-efficient, heterogeneous communication subsystem
• Hierarchical device discovery and connection setup scheme leads to up to 40X savings in discovery power
Technology   Data Rate   Tx Current   Energy per bit   Idle Current   Startup time
Mote         76.8 Kbps   10 mA        430 nJ/bit       7 mA           Low
Bluetooth    1 Mbps      45 mA        149 nJ/bit       22 mA          Medium
802.11       11 Mbps     300 mA       90 nJ/bit        160 mA         High
[Figure: IEEE 802.11, Bluetooth, and Mote radios compared on energy per bit, startup time, and idle current]
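To make the energy-per-bit column concrete, the sketch below estimates the transmit energy for a payload on each radio. The per-bit figures are taken from the table in the slides; the function name and the 1000-byte payload are assumptions, and the estimate deliberately ignores startup and idle costs.

```python
# Energy per bit in nJ/bit, copied from the radio comparison table in the slides
ENERGY_PER_BIT_NJ = {"Mote": 430, "Bluetooth": 149, "802.11": 90}


def transmit_energy_mj(radio, payload_bytes):
    """Energy in millijoules to send payload_bytes, ignoring startup and idle costs."""
    bits = payload_bytes * 8
    return ENERGY_PER_BIT_NJ[radio] * bits * 1e-6  # nJ -> mJ


for radio in ENERGY_PER_BIT_NJ:
    print(radio, transmit_energy_mj(radio, 1000))
```

Per bit, 802.11 is the cheapest, but its idle current and startup time are the highest in the table. That is the motivation for the hierarchy: use the low-power radios for discovery and small messages, and bring up the fast radio only for bulk transfers.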
Other power management features
Wake on wireless: Bluetooth based remote wakeup
• BT module awake, rest of the system is shut down
• Incoming BT packet causes wakeup
• On-demand power management (event-driven apps)
• BT module in "wake on wireless" mode draws ~3 mA
Motion detection for wake up
• Passive small-bead mercury switch connected to GPIO
• Movement causes the switch to close and wake up the system
• Can also be used to trigger wireless scanning for APs
UCLA iBadge
iBadge Functional Units
Main Processing Unit
• ATMega128L Microcontroller from Atmel
• Responsible for power management, localization, and interfacing the different functional units
Localization Unit:
• Relative and absolute positioning
• Responsible for obtaining the precise 3D location of the iBadge in the classroom
• Estimates its 3D location using an ad-hoc localization process
Speech Processing Unit:
• Consists of TI DSP and CODEC
• Performs speech coding and front end processing of the real-time speech of the children
• Two modes of operation (Simple Coding or Front End Processing) based on power requirements and user request
iBadge Functional Units (Continued)
Power Management/Tracking Unit:
• Battery Monitors (DS2438) keep track of the energy usage of various functional units
• CMOS switches provide control to turn on/off different parts of the circuit
Orientation/Tilt Sensing Unit
• Accelerometer combined with magnetometer provides the orientation of the children relative to the earth's magnetic field
Environment Sensing Unit
• Temperature, Humidity, Atmospheric Pressure, and Light Intensity
Telos: New OEP Mote*
Single board philosophy
• Robustness, Ease of use, Lower Cost
• Integrated Humidity & Temperature sensor
First platform to use 802.15.4
• CC2420 radio, 2.4 GHz, 250 kbps (12x mica2)
• 3x RX power consumption of CC1000, 1/3 turn on time
• Same TX power as CC1000
Motorola HCS08 processor
• Lower power consumption, 1.8V operation, faster wakeup time
• 40 MHz CPU clock, 4K RAM
Package
• Integrated onboard antenna, +3dBi gain
• Removed 51-pin connector
• Everything USB & Ethernet based
• 2/3 A or 2 AA batteries
• Weatherproof packaging
Support in upcoming TinyOS 1.1.3 Release
Codesigned by UC Berkeley and Intel Research
Available February from Moteiv (moteiv.com)
*D. Culler, UC Berkeley
Yale’s XYZ Sensor Node
Sensor node created for experimentation
• Low cost, low power, many peripherals
• Integrated accelerometer, light and temperature sensor
Uses an IEEE 802.15.4 protocol
• Chipcon 2420 radio
OKI ARM Thumb Processor
• 256KB FLASH, 32KB RAM
• Max clock speed 58MHz, scales down to 2MHz
• Multiple power management functions
Powered with 3 AA batteries & has external connectors for attaching peripheral boards
Designed at Yale Enalab and Cogent Computer Systems, will be used as the main platform for the course
XYZ’s Architecture
XYZ: Communication Subsystem
Chipcon CC2420 Zigbee RF Transceiver
• 2.4 GHz IEEE 802.15.4 @ 250Kbps
• Programmable output power
• RX/TX data buffering
• Digital RSSI support
• DSSS modulation
• Security features
o CTR encryption/decryption
o CBC-MAC authentication
o CCM encryption and authentication
o All security operations are based on AES encryption using 128-bit keys
XYZ: Supervisor Circuitry & Low Power Sleep
[Figure: supervisor circuitry block diagram. Labels: OKI μC, RTC DS1337, voltage regulator, 3 x AA batteries, 2.5V, 3.3V, I2C, WAKEUP, Enable, Interrupt (SQW)]
DS1337 Real Time clock datasheet: http://pdfserv.maxim-ic.com/en/ds/DS1337.pdf
Step 1: The μC selects the total time it wants to be turned off and programs the DS1337 accordingly, through the 2-wire serial interface.
Step 2: The DS1337 turns off the μC and uses its own crystal to keep track of time.
Step 3: The DS1337 wakes up the μC after the programmed amount of time has elapsed.
Note that the DS1337 RTC can disable the voltage regulator and completely turn off the sensor node!
XYZ: On Board Sensors
[Figure: on-board sensors connected to the OKI μC. Labels: Light, Accelerometer (X, Y), Temperature, ADC inputs AIN0, AIN1, AIN2, PIOE5 (EXINT0)]
2-axis accelerometer datasheet (ADXL202E): http://www.rotomotion.com/datasheets/ADXL202E_a.pdf
Temperature Sensor datasheet (TMP05): http://www.analog.com/UploadedFiles/Data_Sheets/192632828TMP05_6_prk.pdf
Light Sensor datasheet (TSL251R): http://www.goblack.de/desy/digitalt/sensoren/tsl-250/tsl250r.pdf
Current Efforts on XYZ Peripheral Boards
Ragobots @ UCLA
Suspended nodes and camera support @ Yale, ENALAB
Manufacturers of Microcontroller Based Sensor Nodes
Millenial Net (www.millenial.com)
• iBean sensor nodes
Ember (www.ember.com)
• Integrated IEEE 802.15.4 stack and radio on a single chip
Crossbow (www.xbow.com)
• Mica2 mote, Micaz, Dot mote and Stargate Platform
Intel Research
• Stargate, iMote
Dust Inc
• Smart Dust
Cogent Computer (www.cogcomp.com)
• XYZ Node (CSB502) in collaboration with ENALAB@Yale
Moteiv
• Telos Mote
Sensoria Corporation (www.sensoria.com)
• WINS NG Nodes
More….
Other Sensor Node Projects
Augmented off-the-shelf systems
• PC104 computers (used in some habitat monitoring applications)
• iPAQ PDAs (used for prototypes @ UCLA/CENS)
Networked Infomechanical Systems (NIMS)
• www.cens.ucla.edu
Dedicated embedded sensor nodes and SOCs
• MIT uAMP nodes (http://www-mtl.mit.edu/research/icsystems/uamps/)
• Berkeley BWRC picoradio node (http://bwrc.eecs.berkeley.edu/Research/Pico_Radio)
• ISI Pasta node (http://pasta.east.isi.edu)
Typical Operating Characteristics for 4 classes of Sensor Nodes
Source: J. Hill, M. Horton, R. King and L. Krishnamurthy, "The Platforms Enabling Wireless Sensor Networks", Communications of the ACM, June 2004
Power Perspective: Comparison of Energy Sources

Energy Source             Power (Energy) Density                     Source of Estimates
Batteries (Zinc-Air)      1050 - 1560 mWh/cm3 (1.4 V)                Published data from manufacturers
Batteries (Lithium ion)   300 mWh/cm3 (3 - 4 V)                      Published data from manufacturers
Solar (Outdoors)          15 mW/cm2 - direct sun                     Published data and testing
                          0.15 mW/cm2 - cloudy day
Solar (Indoor)            0.006 mW/cm2 - my desk                     Testing
                          0.57 mW/cm2 - 12 in. under a 60W bulb
Vibrations                0.001 - 0.1 mW/cm3                         Simulations and Testing
Acoustic Noise            3E-6 mW/cm2 at 75 dB sound level           Direct Calculations from Acoustic Theory
                          9.6E-4 mW/cm2 at 100 dB sound level
Passive Human Powered     1.8 mW (Shoe inserts >> 1 cm2)             Published Study
Thermal Conversion        0.0018 mW - 10 deg. C gradient             Published Study
Nuclear Reaction          80 mW/cm3                                  Published Data
                          1E6 mWh/cm3
Fuel Cells                300 - 500 mW/cm3                           Published Data
                          ~4000 mWh/cm3

With aggressive energy management, ENS might live off the environment.
Source: UC Berkeley & CENS
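The densities in the table turn into node lifetimes and harvester sizes with simple arithmetic. The sketch below is a rough estimate only: the densities are from the table, while the 1 cm3 battery volume, the 1 mW average load, and the function names are made-up inputs for illustration. It also ignores conversion losses and battery self-discharge.

```python
def lifetime_hours(energy_density_mwh_per_cm3, volume_cm3, average_power_mw):
    """Hours a store of energy lasts at a constant average power draw."""
    return energy_density_mwh_per_cm3 * volume_cm3 / average_power_mw


def harvest_area_cm2(average_power_mw, power_density_mw_per_cm2):
    """Collector area needed for harvesting to cover the average draw."""
    return average_power_mw / power_density_mw_per_cm2


# Lithium ion at 300 mWh/cm3, hypothetical 1 cm3 cell and 1 mW average load
print(lifetime_hours(300, 1.0, 1.0))  # 300.0 hours

# Outdoor solar on a cloudy day at 0.15 mW/cm2, same hypothetical 1 mW load
print(harvest_area_cm2(1.0, 0.15))
```

The comparison shows why average power matters so much: cutting the load by a factor of ten extends battery life by the same factor and shrinks the required solar area to match.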
Many ways to Optimize Power Consumption
Power aware computing
• Ultra-low power microcontrollers
• Dynamic power management HW
o Dynamic voltage scaling (e.g., Intel's PXA, Transmeta's Crusoe)
o Components that switch off after some idle time
Energy aware software
• Power aware OS: dim displays, sleep on idle times, power aware scheduling
Power management of radios
• Sometimes listen overhead is larger than transmit overhead
Energy aware packet forwarding
• Radio automatically forwards packets at a lower level, while the rest of the node is asleep
Energy aware wireless communication
• Exploit performance-energy tradeoffs of the communication subsystem, better neighbor coordination, choice of modulation schemes