8/22/2019 Ece Viii Embedded System Design [06ec82] Notes
Embedded System Design 06EC82
ECE, SJBIT
Sub: Embedded System Design Sub code: 06EC82
Sem: VIII
PART A
UNIT - 1
INTRODUCTION: Overview of embedded systems, embedded system design challenges,
common design metrics and optimizing them. Survey of different embedded system design
technologies, trade-offs. Custom Single-Purpose Processors, Design of custom single purpose
processors. 4 Hours
UNIT - 2
SINGLE-PURPOSE PROCESSORS: Hardware, Combinational Logic, Sequential Logic, RT
level Combinational and Sequential Components, Optimizing single-purpose processors.
Single-Purpose Processors: Software, Basic Architecture, Operation, Programmer's View, Development
Environment, ASIPs. 6 Hours
UNIT - 3
Standard Single-Purpose Peripherals, Timers, Counters, UART, PWM, LCD Controllers,
Keypad controllers, Stepper Motor Controller, A to D Converters, Examples. 6 Hours
UNIT - 4
MEMORY: Introduction, Common memory Types, Composing memory, Memory Hierarchy
and Cache, Advanced RAM. Interfacing, Communication Basics, Microprocessor Interfacing,
Arbitration, Advanced Communication Principles, Protocols - Serial, Parallel. 8 Hours
PART - B
UNIT - 5
INTERRUPTS: Basics - Shared Data Problem - Interrupt latency. Survey of Software
Architecture, Round Robin, Round Robin with Interrupts - Function Queues - scheduling -
RTOS architecture. 8 Hours
UNIT - 6
INTRODUCTION TO RTOS, MORE OS SERVICES: Tasks - States - Data - Semaphores and shared data. More operating system services - Message Queues - Mail Boxes - Timers - Events - Memory Management. 8 Hours
UNIT - 7 & 8
Basic Design Using RTOS: Principles - An example, Encapsulating semaphores and Queues.
Hard real-time scheduling considerations - Saving Memory space and power. Hardware
software co-design aspects in embedded systems. 12 Hours
INDEX SHEET
SL. NO. TOPIC
UNIT - 1
INTRODUCTION: Overview of embedded systems
01 Embedded systems overview
02 Design challenges, common design metrics
03 Processor technology
04 IC technology
05 Design technology
06 Tradeoffs
07 Recommended questions and solutions
UNIT - 2
CUSTOM SINGLE-PURPOSE PROCESSORS
01 HARDWARE: Introduction, combinational logic
02 Sequential logic
03 Custom single purpose processor design
04 RT level processor design
05 Optimizing custom processors
06 SOFTWARE: Basic architecture
07 Operation
08 Programmer's view
09 Development environment
10 ASIPs
11 Recommended questions and solutions
UNIT - 3
Standard Single Purpose Processors: Peripherals
01 Introduction, timers, counters, watchdog timers
02 UART, PWM
03 LCD controllers, Stepper Motor controllers
04 Analog to Digital converters, RTC
05 Recommended questions and solutions
UNIT - 4
Memory and Microprocessor interfacing
01 Introduction, memory write ability
02 Common memory types
03 Composing memory
04 Memory hierarchy and cache
05 Advanced RAM
06 Communication basics
07 Microprocessor interfacing
08 Arbitration
09 Multilevel bus architectures
10 Advanced communication principles
11 Recommended questions and solutions
UNIT - 5
INTERRUPTS and Survey of software architecture
01 Shared data problem
02 Round robin
03 Function queues
04 RTOS architecture
05 Recommended questions and solutions
UNIT - 6
INTRODUCTION TO RTOS, MORE ON OS SERVICES
01 Tasks, states, data
02 Semaphores
03 Message queues, mail boxes
04 Events, memory management
05 Recommended questions and solutions
UNIT - 7 & 8
BASIC DESIGN USING RTOS
01 Principles
02 Encapsulating semaphores
03 Hard real time scheduling considerations
04 Saving memory and power
05 Recommended questions and solutions
PART A
UNIT- 1
INTRODUCTION: Overview of embedded systems, embedded system design challenges,
common design metrics and optimizing them. Survey of different embedded system design
technologies, trade-offs. Custom Single-Purpose Processors, Design of custom single purpose
processors.
4 Hours
TEXT BOOKS:
1. Embedded System Design: A Unified Hardware/Software Introduction - Frank Vahid, Tony Givargis, John Wiley & Sons, Inc., 2002
2. An Embedded Software Primer - David E. Simon, Pearson Education, 1999
REFERENCE BOOKS:
1. Embedded Systems: Architecture and Programming - Raj Kamal, TMH, 2008
2. Embedded Systems Architecture: A Comprehensive Guide for Engineers and Programmers - Tammy Noergaard, Elsevier Publication, 2005
3. Embedded C Programming - Barnett, Cox & O'Cull, Thomson, 2005
EMBEDDED SYSTEM DESIGN
UNIT 1
INTRODUCTION
1.1 Embedded systems overview
1.2 Design Challenges
1.3 Processor Technology
1.4 IC Technology
1.5 Design Technology
1.1. Embedded systems overview
An embedded system is nearly any computing system other than a desktop computer. An embedded system is a dedicated system which performs the desired function upon power up, repeatedly.
Embedded systems are found in a variety of common electronic devices, such as: (a) consumer electronics -- cell phones, pagers, digital cameras, camcorders, VCD players, videocassette recorders, portable video games, calculators, and personal digital assistants; (b) home appliances -- microwave ovens, answering machines, thermostats, home security systems, washing machines, and lighting systems; (c) office automation -- fax machines, copiers, printers, and scanners; (d) business equipment -- cash registers, curbside check-in, alarm systems, card readers, product scanners, and automated teller machines; (e) automobiles -- transmission control, cruise control, fuel injection, anti-lock brakes, and active suspension.
Common characteristics of embedded systems:
Embedded systems have several common characteristics that distinguish them from other computing systems:
1. Single functioned: An embedded system executes a single program repeatedly. The entire program is executed in a loop over and over again.
2. Tightly constrained: It should cost little, perform fast enough to process data in real time, fit on a single chip, consume as little power as possible, etc.
3. Reactive and real time: Embedded systems should continuously react to changes in the environment. They should also process and compute data in real time without delay.
Fig 1.1 An embedded system example: a digital camera
1.2 Design challenges
Design metrics:
A design metric is a measure of an implementation's features, such as cost, size, performance and power. An embedded system
- must cost little
- must be sized to fit on a single chip
- must perform in real time (response time)
- must consume minimum power
The embedded system designer must construct an implementation that fulfills the desired functionality. Apart from meeting the functionality, the designer should also optimize numerous design metrics.
Common design metrics that a design engineer should consider:
- NRE (non-recurring engineering) cost: the one-time monetary cost of designing the system.
- Unit cost: the monetary cost of manufacturing each copy of the system, excluding NRE cost.
- Size: the physical space required by the system, often measured in bytes for software and in number of gates for hardware.
- Performance: the execution/response time of the system.
- Power: the amount of power consumed by the system, which may determine the lifetime of the battery and the cooling requirements of the IC. More power means more heat.
- Flexibility: the ability to change the functionality of the system.
- Time to prototype: the time needed to build a working version of the system without incurring heavy NRE cost.
- Time to market: the time required to develop the system and release it to the market.
- Maintainability: the ability to modify the system after its release to the market.
- Correctness: our confidence that we have implemented the system's functionality correctly.
- Safety: the probability that the system does not cause any harm.
Metrics typically compete with one another: improving one often leads to worsening of another.
Fig 1.2 Design metric competition
1.2.1 Time to Market Design Metric:
- Time to market: Introducing an embedded system to the market early can make a big difference to the system's profitability. Market windows are generally very narrow, often on the order of a few months. Missing this window can mean a significant loss in sales.
Fig 1.3 Time to Market
(A) Market window (B) simplified revenue model for computing revenue loss
Let's investigate the loss of revenue that can occur due to delayed entry of a product into the market. We can use a simple triangle model in which the y axis represents the market rise and the x axis represents time, including the point of entry into the market. The revenue for an on-time market entry is the area of the triangle labeled On-time, and the revenue for a delayed entry is the area of the triangle labeled Delayed. The revenue loss for a delayed entry is the difference between the areas of these two triangles.
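$$\text{Percentage revenue loss} = \frac{\text{On-time} - \text{Delayed}}{\text{On-time}} \times 100\,\%$$
$$\text{Area of a triangle} = \tfrac{1}{2} \times \text{base} \times \text{height}$$
W -- height of the on-time triangle (the market rise)
D -- delayed entry (in weeks or months)
2W -- product's lifetime
$$\text{Area of on-time triangle} = \tfrac{1}{2} \cdot 2W \cdot W = W^2$$
$$\text{Area of delayed triangle} = \tfrac{1}{2}\,(W - D + W)(W - D)$$
$$\text{Percentage revenue loss} = \frac{D\,(3W - D)}{2W^2} \times 100\,\%$$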
Ex: The product's lifetime is 52 weeks (2W = 52, so W = 26).
The delay of entry to the market is 4 weeks (D = 4).
Percentage revenue loss = (4 * (3*26 - 4) / (2 * 26 * 26)) * 100 % = 22 % (approximately)
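The triangle model can be checked numerically. The following is a minimal sketch in Python; the function name `percent_revenue_loss` and its parameter names are illustrative assumptions, not part of the original notes. It computes the two triangle areas directly rather than using the closed-form expression, so it also serves as a check on that formula.

```python
def percent_revenue_loss(lifetime: float, delay: float) -> float:
    """Percentage revenue loss under the simplified triangle model.

    lifetime is the product lifetime 2W and delay is the late entry D,
    both in the same time unit (weeks or months).
    """
    w = lifetime / 2.0  # W, half of the product lifetime
    if not 0 <= delay <= w:
        raise ValueError("delay must satisfy 0 <= D <= W")
    on_time = 0.5 * (2 * w) * w                    # area of the on-time triangle
    delayed = 0.5 * (w - delay + w) * (w - delay)  # area of the delayed triangle
    return (on_time - delayed) / on_time * 100.0


if __name__ == "__main__":
    # Example from the text: 2W = 52 weeks, D = 4 weeks
    print(round(percent_revenue_loss(52, 4)))  # prints 22
```

With the same 52-week lifetime, a 10-week delay gives a loss of about 50 %, which shows how quickly the loss grows with the delay.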
1.2.2 The NRE and Unit cost Design metrics:
Unlike with other design metrics, the best technology choice here depends on the number of units produced. Suppose three technologies are available:
Technology | NRE cost | Unit cost
A | $2,000 | $100
B | $30,000 | $30
C | $100,000 | $2
Total cost = NRE cost + unit cost * no. of units
Per-product cost = total cost / no. of units
= (NRE cost / no. of units) + unit cost
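How the best choice shifts with volume can be seen by evaluating these formulas for the three technologies. This is a small sketch in Python; the helper names and the sample volumes (10, 1,000 and 10,000 units) are assumptions chosen for illustration.

```python
# (NRE cost, unit cost) in dollars for the three technologies in the text
TECHNOLOGIES = {
    "A": (2_000, 100),
    "B": (30_000, 30),
    "C": (100_000, 2),
}


def total_cost(nre: float, unit_cost: float, units: int) -> float:
    """Total cost = NRE cost + unit cost * number of units."""
    return nre + unit_cost * units


def per_product_cost(nre: float, unit_cost: float, units: int) -> float:
    """Per-product cost = total cost / number of units."""
    return total_cost(nre, unit_cost, units) / units


def cheapest_technology(units: int) -> str:
    """Name of the technology with the lowest total cost at this volume."""
    return min(TECHNOLOGIES, key=lambda name: total_cost(*TECHNOLOGIES[name], units))


if __name__ == "__main__":
    for units in (10, 1_000, 10_000):
        best = cheapest_technology(units)
        print(units, best, per_product_cost(*TECHNOLOGIES[best], units))
```

Under these numbers technology A is cheapest at low volume, B in the middle range, and C at high volume, because the NRE cost is spread over more units as the volume grows.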
1.2.3 The performance Design metric:
The performance of a system is a measure of how long the system takes to execute our desired tasks. There are several measures of performance. The two main measures are:
Latency or response time: the time between the start of a task's execution and its end.
Throughput: the number of tasks that are processed per unit time.
Speedup is a method of comparing the performance of two systems:
Speedup of A over B = performance of A / performance of B.
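As a brief illustration of the speedup ratio, the sketch below assumes that performance is taken as the reciprocal of latency, so that a shorter latency means higher performance. That interpretation, the function name, and the latency values used are assumptions for the example, not definitions from the notes.

```python
def speedup(latency_a: float, latency_b: float) -> float:
    """Speedup of A over B, with performance taken as 1 / latency."""
    performance_a = 1.0 / latency_a
    performance_b = 1.0 / latency_b
    return performance_a / performance_b


if __name__ == "__main__":
    # Hypothetical latencies: A finishes a task in 2 s, B needs 4 s.
    print(speedup(2.0, 4.0))  # prints 2.0
```

Under this assumption the speedup of A over B reduces to latency of B divided by latency of A.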
Technologies used in embedded systems:
A technology is a manner of accomplishing a task. Three types of technologies are central to embedded system design:
Processor technologies
IC technologies
Design technologies
Processor technology: relates to the architecture of the computation engine used to implement a system's desired functionality. The term processor is generally associated with programmable software processors, but many non-programmable digital systems can also be thought of as processors.
Single purpose processor: a digital system designed to execute exactly one function. Performance may be good, but flexibility may be poor.
Application specific processor: may serve as a compromise between single purpose and general purpose processors. An ASIP is a programmable processor optimized for a particular class of applications having common characteristics, such as embedded control, digital signal processing, or telecommunications. This provides flexibility while achieving good performance, low power and small size.
General purpose processor: The designer of a general purpose processor, or microprocessor, builds a programmable device that is suitable for a variety of applications, to maximize the number of devices sold.
Design considerations:
Should accommodate different kinds of programs
Should provide a general data path to handle a variety of computations
Design technology: involves converting our concept of the desired functionality into an implementation. The design process should optimize the design metrics and should also reach an implementation quickly.
Variations of the top-down design process have become popular.
1.3.1 Processor Technologies:
1. General Purpose Processors - Software
2. Single Purpose Processors - Hardware
3. Application Specific Processors: Application specific instruction set processors (ASIP)
1. General Purpose Processors - Software
They are programmable devices used in a variety of applications. They are also known as microprocessors. They have a program memory and a general data path with a large register file and a general ALU. The data path must be large enough to handle a variety of computations. The programmer writes a program in the program memory to carry out the required functionality, using the features (instructions) provided by the general data path. This is called the software portion of the system. Such a processor offers several benefits: low time-to-market, low NRE cost, and high flexibility.
Design time and NRE cost are low, because the designer must only write a program, but need not do any digital design. Flexibility is high, because changing functionality requires only changing the program. Unit cost may be relatively low in small quantities, since the processor manufacturer sells large quantities to other customers and hence distributes the NRE cost over many units. Performance may be fast for computation-intensive applications, if using a fast processor, due to advanced architecture features and leading edge IC technology.
There are also some design-metric drawbacks: Unit cost may be too high for large quantities. Performance may be slow for certain applications. Size and power may be large due to unnecessary processor hardware.
Figure 1.4(d) illustrates the use of a single-purpose processor in our embedded system example, representing an exact fit of the desired functionality, nothing more, nothing less.
Fig 1.4 Processors vary in their customization for the problem at hand: (a) desired functionality, (b) general-purpose processor, (c) application-specific processor, (d) single-purpose processor.
Fig 1.5 Implementing desired functionality on a general purpose processor
2. Single Purpose Processors - Hardware:
This is a digital circuit designed to execute exactly one program. It contains only the components needed to execute that program, and it contains no program memory. The user cannot change the functionality of the chip. Such processors are fast, low powered and small.
An embedded system designer creates a single-purpose processor by designing a custom digital circuit. Using a single-purpose processor in an embedded system results in several design metric benefits and drawbacks, which are essentially the inverse of those for general purpose processors. Performance may be fast, size and power may be small, and unit cost may be low for large quantities, while design time and NRE costs may be high, flexibility is low, unit cost may be high for small quantities, and performance may not match general-purpose processors for some applications.
Fig 1.6 Implementing desired functionality on a single purpose processor
3. Application Specific Processors: Application specific instruction set processors (ASIP):
They are programmable processors optimized for a particular class of applications having common characteristics. They strike a compromise between general-purpose and single-purpose processors. They have a program memory, an optimized data path and special functional units. They offer some flexibility while still achieving good performance, size and power.
An application-specific instruction-set processor (or ASIP) can serve as a compromise between the above processor options. An ASIP is designed for a particular class of applications with common characteristics, such as digital-signal processing, telecommunications, embedded control, etc. The designer of such a processor can optimize the datapath for the application class, perhaps adding special functional units for common operations, and eliminating other infrequently used units.
Fig 1.7 Implementing desired functionality on an application specific processor
Digital-signal processors (DSPs) are a common class of ASIP, so they deserve special mention. A DSP is a processor designed to perform common operations on digital signals, which are the digital encodings of analog signals like video and audio. These operations carry out common signal processing tasks like signal filtering, transformation, or combination. Such operations are usually math-intensive, including operations like multiply and add or shift and add. To support such operations, a DSP may have special purpose datapath components such as a multiply-accumulate unit, which can perform a computation like T = T + M[i]*k using only one instruction. Because DSP programs often manipulate large arrays of data, a DSP may also include special hardware to fetch sequential data memory locations in parallel with other operations, to further speed execution.
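The multiply-accumulate computation T = T + M[i]*k can be written out in software to show what the dedicated unit replaces. This is only an illustrative sketch in Python; the function name and the sample data are assumptions. On a general purpose processor each loop iteration costs separate multiply, add and memory-fetch steps, whereas a DSP's multiply-accumulate unit can perform the marked line as a single instruction.

```python
def multiply_accumulate(m, k):
    """Accumulate T = T + M[i]*k over every element of the array M."""
    t = 0
    for i in range(len(m)):
        t = t + m[i] * k  # the step a multiply-accumulate unit does in one instruction
    return t


if __name__ == "__main__":
    samples = [1, 2, 3]  # hypothetical data memory contents M
    print(multiply_accumulate(samples, 2))  # prints 12
```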
Highlight the merits and demerits of single purpose processors and general-purpose processors.
Single Purpose Processors:
Merits:
1. They are fast
2. They consume low power
3. They have small size
4. Unit cost may be low for large quantities
Demerits:
1. NRE costs may be high
2. Low flexibility
3. Unit cost may be high for small quantities
4. Performance may not match general-purpose processors for some applications
General Purpose Processors:
Merits:
1. High flexibility
2. Low NRE costs
3. Low time to market
4. Performance may be fast for computation-intensive applications.
Demerits:
1. Unit cost may be relatively high for large quantities.
2. Performance may be slower for certain applications.
3. Size and power may be large due to unnecessary processor hardware.
How is a single purpose processor distinctly different from a general-purpose processor?
Sl. No. | Single Purpose Processor | General Purpose Processor
1. | Executes exactly one program. | Executes any program written by the user.
2. | The functionality cannot be changed. | The functionality can be changed by the user by writing the required program.
3. | Does not have program memory. | Has program memory.
4. | Has no flexibility and contains only the resources required for its particular functionality. | Has a very large amount of resources, which may or may not be used for a particular functionality, as decided by the user.
5. | Merits include: fast, low power consumption, small size, and unit cost that may be low for large quantities. | Merits include: high flexibility, low NRE costs, low time to market, and performance that may be fast for computation-intensive applications.
1.4 IC Technology
Every processor must eventually be implemented on an IC. IC technology involves the manner in which we map a digital (gate-level) implementation onto an IC. An IC (Integrated Circuit), often called a chip, is a semiconductor device consisting of a set of connected transistors and other devices. A number of different processes exist to build semiconductors, the most popular of which is CMOS (Complementary Metal Oxide Semiconductor). The IC technologies differ by how customized the IC is for a particular implementation. IC technology is independent of processor technology; any type of processor can be mapped to any type of IC technology.
Fig 1.8 The independence of processor and IC technologies: any processor technology can be mapped to any IC technology.
To understand the differences among IC technologies, we must first recognize that semiconductors consist of numerous layers. The bottom layers form the transistors. The middle layers form logic gates. The top layers connect these gates with wires. One way to create these layers is by depositing photo-sensitive chemicals on the chip surface and then shining light through masks to change regions of the chemicals. Thus, the task of building the layers is actually one of designing appropriate masks. A set of masks is often called a layout. The narrowest line that we can create on a chip is called the feature size, which today is well below one micrometer (sub-micron).
1.4.1 Full-custom/VLSI
In a full-custom IC technology, we optimize all layers for our particular embedded system's digital implementation. Such optimization includes placing the transistors to minimize interconnection lengths, sizing the transistors to optimize signal transmissions and routing wires among the transistors. Once we complete all the masks, we send the mask specifications to a fabrication plant that builds the actual ICs. Full-custom IC design, often referred to as VLSI (Very Large Scale Integration) design, has very high NRE cost and long turnaround times (typically months) before the IC becomes available, but can yield excellent performance with small size and power. It is usually used only in high-volume or extremely performance-critical applications.
1.4.2 Semi-custom ASIC (gate array and standard cell)
In an ASIC (Application-Specific IC) technology, the lower layers are fully or partially built, leaving us to finish the upper layers. In a gate array technology, the masks for the transistor and gate levels are already built (i.e., the IC already consists of arrays of gates). The remaining task is to connect these gates to achieve our particular implementation. In a standard cell technology, logic-level cells (such as an AND gate or an AND-OR-INVERT combination) have their mask portions pre-designed, usually by hand. Thus, the remaining task is to arrange these portions into complete masks for the gate level, and then to connect the cells. ASICs are by far the most popular IC technology, as they provide for good performance and size, with much less NRE cost than full-custom ICs.
1.4.3 PLD
In a PLD (Programmable Logic Device) technology, all layers already exist, so we can purchase the actual IC. The layers implement a programmable circuit, where programming has a lower-level meaning than a software program. The programming that takes place may consist of creating or destroying connections between wires that connect gates, either by blowing a fuse, or setting a bit in a programmable switch. Small devices, called programmers, connected to a desktop computer can typically perform such programming. We can divide PLDs into two types, simple and complex. One type of simple PLD is a PLA (Programmable Logic Array), which consists of a programmable array of AND gates and a programmable array of OR gates. Another type is a PAL (Programmable Array Logic), which uses just one programmable array to reduce the number of expensive programmable components. One type of complex PLD, growing very rapidly in popularity over the past decade, is the FPGA (Field Programmable Gate Array), which offers more general connectivity among blocks of logic, rather than just arrays of logic as with PLAs and PALs, and is thus able to implement far more complex designs. PLDs offer very low NRE cost and almost instant IC availability. However, they are typically bigger than ASICs, may have higher unit cost, may consume more power, and may be slower (especially FPGAs). They still provide reasonable performance, though, so they are especially well suited to rapid prototyping.
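The PLA structure described above, a programmable AND array feeding a programmable OR array, can be modelled in a few lines. The following Python sketch is a simplified software model, not a description of any real device; the data layout (`and_plane`, `or_plane`) and the half-adder configuration are assumptions chosen for illustration. "Programming" the device corresponds to choosing the contents of the two arrays.

```python
def pla_outputs(inputs, and_plane, or_plane):
    """Evaluate a PLA-style two-level circuit.

    inputs    -- tuple of bools, one per input pin
    and_plane -- list of product terms; each term maps an input index to the
                 value (True/False) that input must have for the term to be 1
    or_plane  -- list of outputs; each output lists the product terms it ORs
    """
    products = [
        all(inputs[pin] == value for pin, value in term.items())
        for term in and_plane
    ]
    return [any(products[t] for t in terms) for terms in or_plane]


# Hypothetical programming of the two arrays as a half adder on inputs a, b.
HALF_ADDER_AND = [
    {0: True, 1: False},   # a AND (NOT b)
    {0: False, 1: True},   # (NOT a) AND b
    {0: True, 1: True},    # a AND b
]
HALF_ADDER_OR = [
    [0, 1],                # sum   = term 0 OR term 1
    [2],                   # carry = term 2
]

if __name__ == "__main__":
    for a in (False, True):
        for b in (False, True):
            print(a, b, pla_outputs((a, b), HALF_ADDER_AND, HALF_ADDER_OR))
```

Changing only the two tables gives a different logic function on the same "hardware", which is the sense in which the text says programming here has a lower-level meaning than a software program.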
1.5 DESIGN TECHNOLOGY:
Design technology involves the manner in which we convert our concept of desired system functionality into an implementation. We must not only design the implementation to optimise design metrics, but we must do so quickly.
Variations of a top-down design process have become popular in the past decade, an ideal form of which is illustrated in the figure. The designer refines the system through several abstraction levels. At the system level the designer describes the desired functionality in an executable language like C. This is called system specification.
The designer refines this specification by distributing portions of it among several general and/or single purpose processors, yielding behavioural specifications for each processor.
The designer refines these specifications into register-transfer (RT) specifications by converting behaviour on general-purpose processors to assembly code, and by converting behaviour on single purpose processors to a connection of register-transfer components and state machines. The designer then refines the RT level specification into a logic specification.
Finally, the designer refines the remaining specifications into an implementation consisting of machine code for general purpose processors and a gate level netlist for single purpose processors.
Fig 1.9 Ideal top-down design process, and productivity improvers.
There are three main approaches to improving the design process for increased productivity, which we label as compilation/synthesis, libraries/IP, and test/verification. Several other approaches also exist.
1.5.1 Compilation/Synthesis
Compilation/Synthesis lets a designer specify desired functionality in an abstract manner, and automatically generates lower-level implementation details. Describing a system at high abstraction levels can improve productivity by reducing the amount of details, often by an order of magnitude, that a designer must specify.
A logic synthesis tool converts Boolean expressions into a connection of logic gates (called a netlist). A register-transfer (RT) synthesis tool converts finite-state machines and register-transfers into a datapath of RT components and a controller of Boolean equations. A behavioral synthesis tool converts a sequential program into finite-state machines and register transfers. Likewise, a software compiler converts a sequential program to assembly code, which is essentially register-transfer code. Finally, a system synthesis tool converts an abstract system specification into a set of sequential programs on general and single-purpose processors.
The relatively recent maturation of RT and behavioral synthesis tools has enabled a unified view of the design process for single-purpose and general-purpose processors. Design for the former is commonly known as hardware design, and design for the latter as software design. In the past, the design processes were radically different: software designers wrote sequential programs, while hardware designers connected components.
Fig 1.10 The co-design ladder: recent maturation of synthesis enables a unified view
of hardware and software.
1.5.2 Libraries/IP
Libraries involve re-use of pre-existing implementations. Using libraries of existing implementations can improve productivity if the time it takes to find, acquire, integrate and test a library item is less than that of designing the item oneself. A logic-level library may consist of layouts for gates and cells. An RT-level library may consist of layouts for RT components, like registers, multiplexors, decoders, and functional units. A behavioral-level library may consist of commonly used components, such as compression components, bus interfaces, display controllers, and even general purpose processors. The advent of system-level integration has caused a great change in this level of library.
1.5.3 Test/Verification
Test/Verification involves ensuring that functionality is correct. Such assurance can prevent time-consuming debugging at low abstraction levels and iterating back to high abstraction levels. Simulation is the most common method of testing for correct functionality, although more formal verification techniques are growing in popularity. At the logic level, gate level simulators provide output signal timing waveforms given input signal waveforms. Likewise, general-purpose processor simulators execute machine code. At the RT-level, hardware description language (HDL) simulators execute RT-level descriptions and provide output waveforms given input waveforms. At the behavioral level, HDL simulators simulate sequential programs, and co-simulators connect HDL and general purpose processor simulators to enable hardware/software co-verification. At the system level, a model simulator simulates the initial system specification using an abstract computation model, independent of any processor technology, to verify correctness and completeness of the specification.
1.5.4 More productivity improvers
There are numerous additional approaches to improving designer productivity. Standards focus on developing well-defined methods for specification, synthesis and libraries. Such standards can reduce the problems that arise when a designer uses multiple tools, or retrieves or provides design information from or to other designers. Common standards include language standards, synthesis standards and library standards.
Languages focus on capturing desired functionality with minimum designer effort. For example, the sequential programming language of C is giving way to the object oriented language of C++, which in turn has given some ground to Java. As another example, state-machine languages permit direct capture of functionality as a set of states and transitions, which can then be translated to other languages like C.
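To make the idea of capturing functionality as a set of states and transitions concrete, the sketch below writes a small state machine as a transition table and steps through it. It is written in Python purely for illustration; the states, events and function name are hypothetical and are not taken from any particular state-machine language.

```python
# Transition table: (current state, event) -> next state.
# The states and events are hypothetical, chosen only to illustrate the idea.
TRANSITIONS = {
    ("IDLE", "start"): "RUNNING",
    ("RUNNING", "pause"): "PAUSED",
    ("PAUSED", "start"): "RUNNING",
    ("RUNNING", "stop"): "IDLE",
    ("PAUSED", "stop"): "IDLE",
}


def run(events, state="IDLE"):
    """Apply each event in turn and return the final state."""
    for event in events:
        # Events with no transition from the current state are ignored.
        state = TRANSITIONS.get((state, event), state)
    return state


if __name__ == "__main__":
    print(run(["start", "pause"]))  # prints PAUSED
```

The table is the whole specification; a translator could mechanically turn the same table into a `switch` statement in C.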
Frameworks provide a software environment for the application of numerous tools throughout the design process and management of versions of implementations. For example, a framework might generate the UNIX directories needed for various simulators and synthesis tools, supporting application of those tools through menu selections in a single graphical user interface.
RECOMMENDED QUESTIONS
UNIT 1
Overview of embedded systems
1. What is an embedded system? Why is it so hard to define an embedded system?
2. List and define the three main characteristics of embedded systems that distinguish such systems from other computing systems.
3. What is a design metric?
4. List a pair of design metrics that may compete with one another, providing an intuitive explanation of the reason behind it.
5. What is a market window, and why is it so important to reach the market early in this window?
6. What is NRE cost?
7. List and define the three main processor technologies. What are the benefits of using each of the different processor technologies?
8. List the main IC technologies and their benefits.
9. List the three main design technologies. How is each helpful to designers?
10. Provide a definition of Moore's law.
11. Compute the annual growth rate of IC capacity and designer productivity.
12. What is the design gap?
13. What is a renaissance engineer, and why is such an engineer so important in the current market?
14. Define what is meant by the mythical man-month.
QUESTION PAPER SOLUTION
UNIT 1
Q1. Highlight the merits and demerits of single-purpose processors and general-
purpose processors.
Single-Purpose Processors:
Merits:
1. They are fast.
2. They consume low power.
3. They have a small size.
4. Unit cost may be low for large quantities.
Demerits:
1. NRE costs may be high.
2. Flexibility is low.
3. Unit cost may be high for small quantities.
4. Performance may not match general-purpose processors for some applications.
General-Purpose Processors:
Merits:
1. High flexibility.
2. Low NRE costs.
3. Short time-to-market.
4. Performance may be fast for computation-intensive applications.
Demerits:
1. Unit cost may be relatively high for large quantities.
2. Performance may be slow for certain applications.
3. Size and power may be large due to unnecessary processor hardware.
Q2. How is a single-purpose processor distinctly different from a general-purpose processor?
Sl. No. | Single-Purpose Processor | General-Purpose Processor
1 | Executes exactly one program. | Executes any program written by the user.
2 | The functionality cannot be changed. | The functionality can be changed by the user by writing the required program.
3 | Has no program memory. | Has program memory.
4 | Has no flexibility and contains only the resources required for its particular functionality. | Has a very large amount of resources, which may or may not be used for a particular functionality, as decided by the user.
5 | Merits: fast, low power, small size; unit cost may be low for large quantities. | Merits: high flexibility, low NRE costs, short time-to-market; performance may be fast for computation-intensive applications.
Q3. Explain the three Processor Technologies Briefly
1. General-Purpose Processors (Software):
They are programmable devices used in a variety of applications. They are also known as microprocessors.
They have a program memory and a general data path with a large register file and a general ALU. The
data path must be large enough to handle a variety of computations. The programmer writes a program
that carries out the required functionality into the program memory, using the features (instructions)
provided by the general data path. This is called the software portion of the system. Such a processor
offers several benefits: short time-to-market, low NRE cost and high flexibility.
2. Single Purpose Processors Hardware:
This is a digital circuit designed to execute exactly one program. It contains only the
components needed to execute that single program, and it contains no program memory. The user cannot
change the functionality of the chip. Such processors are fast, low-power and small.
3. Application-Specific Processors: Application-Specific Instruction-Set Processors (ASIPs)
They are programmable processors optimized for a particular class of applications having common
characteristics. They strike a compromise between general-purpose and single-purpose processors. They
have a program memory, an optimized data path and special functional units. They offer good
performance, size and power while retaining some flexibility.
Q4. What are the common design metrics that a design engineer should
consider?
- NRE (non-recurring engineering) cost: the one-time monetary cost of designing the system.
- Unit cost: the monetary cost of manufacturing each copy of the system, excluding NRE cost.
- Size: the physical space required by the system, often measured in bytes for software and in number
of gates for hardware.
- Performance: the execution/response time of the system.
- Power: the amount of power consumed by the system, which may determine the lifetime of the battery and
the cooling requirement of the IC. More power means more heat.
- Flexibility: the ability to change the functionality of the system.
- Time to prototype: the time needed to build a working system without incurring heavy NRE.
- Time to market: the time required to develop the system and release it to the market.
- Maintainability: the ability to modify the system after its release to the market.
- Correctness: our confidence that we have implemented the system's functionality correctly.
- Safety: the probability that the system does not cause any harm.
Metrics typically compete with one another: improving one often leads to worsening of another.
Q5. Write short notes on IC technology
Every processor must eventually be implemented on an IC. IC technology involves the manner in which
we map a digital (gate-level) implementation onto an IC. An IC (Integrated Circuit), often called a chip,
is a semiconductor device consisting of a set of connected transistors and other devices. A number of
different processes exist to build semiconductors, the most popular of which is CMOS (Complementary
Metal Oxide Semiconductor). The IC technologies differ by how customized the IC is for a particular
implementation. IC technology is independent from processor technology; any type of processor can be
mapped to any type of IC technology.
The independence of processor and IC technologies: any processor technology can be
mapped to any IC technology.
To understand the differences among IC technologies, we must first recognize that semiconductors
consist of numerous layers. The bottom layers form the transistors. The middle layers form logic gates.
The top layers connect these gates with wires. One way to create these layers is by depositing photo-sensitive chemicals on the chip surface and then shining light through masks to change regions of the
chemicals. Thus, the task of building the layers is actually one of designing appropriate masks. A set of
masks is often called a layout. The narrowest line that we can create on a chip is called the feature size,
which today is well below one micrometer (sub-micron). For each IC technology, all layers must
eventually be built to get a working IC; the question is who builds each layer and when.
Q6. Derive the equation for the percentage revenue loss for any market rise. A
product was delayed by 4 weeks in releasing to market. The peak
revenue for on-time entry to market would occur after 20 weeks, for a market rise angle of 45°. Find the percentage revenue loss.
Ans: Let us investigate the loss of revenue that can occur due to delayed entry of a product in the
market. We can use a simple triangle model in which the y-axis is the market rise and the x-axis is time,
including the point of entry to the market. The revenue for an on-time market entry is the area of the
triangle labeled "on-time", and the revenue for a delayed-entry product is the area of the triangle
labeled "delayed". The revenue loss for a delayed entry is the difference of these triangles' areas.
$$\%\ \text{revenue loss} = \frac{\text{on-time} - \text{delayed}}{\text{on-time}} \times 100\%$$
The area of a triangle = 1/2 * base * height, where
W -- time at which the on-time market peaks; with a 45° market rise, the peak height is also W
D -- delay of entry (in weeks or months)
2W -- product's lifetime
$$\text{Area of on-time triangle} = \tfrac{1}{2} \cdot 2W \cdot W = W^2$$
$$\text{Area of delayed triangle} = \tfrac{1}{2} \cdot (W - D + W) \cdot (W - D)$$
$$\%\ \text{revenue loss} = \frac{D\,(3W - D)}{2W^2} \times 100\%$$
Ex: The product's lifetime is 52 weeks, so W = 26.
The delay of entry to the market is 4 weeks, so D = 4.
Percentage revenue loss = 4(3*26 - 4)/(2*26*26) * 100% ≈ 22%
For the values given in the question (W = 20 weeks, D = 4 weeks), the percentage revenue loss is 4(3*20 - 4)/(2*20*20) * 100% = 28%.
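The triangle model can be checked numerically. The following is a minimal Python sketch, not part of the original notes; the function name `revenue_loss_percent` is a hypothetical choice, and the code assumes the 45° market rise used above, so that the peak height equals W.

```python
def revenue_loss_percent(w, d):
    """Percentage revenue loss for peak time w and entry delay d (same time unit).

    Assumes the simple triangle model with a 45-degree market rise,
    so the on-time peak height equals w.
    """
    if not 0 <= d <= w:
        raise ValueError("delay must satisfy 0 <= d <= w")
    on_time = 0.5 * (2 * w) * w            # area of the on-time triangle
    delayed = 0.5 * (2 * w - d) * (w - d)  # area of the delayed triangle
    return (on_time - delayed) / on_time * 100.0


def revenue_loss_closed_form(w, d):
    """Same quantity from the derived formula D(3W - D) / (2W^2) * 100."""
    return d * (3 * w - d) / (2 * w * w) * 100.0


print(round(revenue_loss_percent(26, 4)))  # 52-week lifetime, 4-week delay
print(round(revenue_loss_percent(20, 4)))  # peak at 20 weeks, 4-week delay
```

Computing the loss both from the triangle areas and from the closed form is a quick way to confirm that the derivation is consistent.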
Q7. Compare GPP, SPP and ASIP along with their block diagrams.
1. General-Purpose Processors (Software):
They are programmable devices used in a variety of applications. They are also known as
microprocessors. They have a program memory and a general data path with a large register file and a
general ALU. The data path must be large enough to handle a variety of computations. The
programmer writes a program that carries out the required functionality into the program memory,
using the features (instructions) provided by the general data path. This is called the
software portion of the system. Such a processor offers several benefits: short
time-to-market, low NRE cost and high flexibility.
Design time and NRE cost are low, because the designer must only write a program, but need not do any
digital design. Flexibility is high, because changing functionality requires only changing the program. Unit
cost may be relatively low in small quantities, since the processor manufacturer sells large quantities to
other customers and hence distributes the NRE cost over many units. Performance may be fast for
computation-intensive applications, if using a fast processor, due to advanced architecture features and
leading-edge IC technology.
There are also some design-metric drawbacks: unit cost may be too high for large quantities. Performance may be
slow for certain applications. Size and power may be large due to unnecessary processor hardware.
Figure 1.4(d) illustrates the use of a single-purpose processor in our embedded system example,
representing an exact fit of the desired functionality, nothing more, nothing less.
Fig 1.4 Processors vary in their customization for the problem at hand: (a) desired functionality, (b) general-
purpose processor, (c) application-specific processor, (d) single-purpose processor.
Fig 1.5 Implementing desired functionality on different General purpose processor
2. Single-Purpose Processors (Hardware):
This is a digital circuit designed to execute exactly one program. It contains only the
components needed to execute that single program, and it contains no program memory. The user cannot
change the functionality of the chip. Such processors are fast, low-power and small.
An embedded system designer creates a single-purpose processor by designing a custom digital circuit.
Using a single-purpose processor in an embedded system results in several design metric benefits and
drawbacks, which are essentially the inverse of those for general-purpose processors. Performance may
be fast, size and power may be small, and unit cost may be low for large quantities, while design time
and NRE costs may be high, flexibility is low, unit cost may be high for small quantities, and performance
may not match general-purpose processors for some applications.
Fig 1.6 Implementing desired functionality on different single purpose processor
3. Application-Specific Processors: Application-Specific Instruction-Set Processors (ASIPs):
They are programmable processors optimized for a particular class of applications having common
characteristics. They strike a compromise between general-purpose and single-purpose processors. They
have a program memory, an optimized data path and special functional units. They offer good
performance, size and power while retaining some flexibility.
An application-specific instruction-set processor (or ASIP) can serve as a compromise between the above
processor options. An ASIP is designed for a particular class of applications with common characteristics,
such as digital-signal processing, telecommunications, embedded control, etc. The designer of such a
processor can optimize the datapath for the application class, perhaps adding special functional units for
common operations, and eliminating other infrequently used units.
Fig 1.7 Implementing desired functionality on different Application Specific processor
Digital-signal processors (DSPs) are a common class of ASIP, so they deserve special mention. A DSP is a
processor designed to perform common operations on digital signals, which are the digital encodings of
analog signals like video and audio. These operations carry out common signal processing tasks like
signal filtering, transformation, or combination. Such operations are usually math-intensive, including
operations like multiply and add or shift and add. To support such operations, a DSP may have special-
purpose datapath components such as a multiply-accumulate unit, which can perform a computation like T
= T + M[i]*k using only one instruction. Because DSP programs often manipulate large arrays of data, a
DSP may also include special hardware to fetch sequential data memory locations in parallel with other
operations, to further speed execution.
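The multiply-accumulate pattern T = T + M[i]*k can be illustrated in software. The sketch below is an illustrative assumption rather than anything from the notes: the function name `multiply_accumulate` and the use of one coefficient per sample (a dot product, as in a filter) are hypothetical. Each loop iteration corresponds to what a DSP's multiply-accumulate unit could perform with a single instruction.

```python
def multiply_accumulate(samples, coeffs):
    """Return (accumulator, number_of_mac_operations) for sum(samples[i] * coeffs[i])."""
    if len(samples) != len(coeffs):
        raise ValueError("samples and coeffs must have the same length")
    acc = 0
    mac_ops = 0
    for m, k in zip(samples, coeffs):
        acc = acc + m * k  # one multiply-accumulate: T = T + M[i] * k
        mac_ops += 1
    return acc, mac_ops


result, ops = multiply_accumulate([1, 2, 3], [4, 5, 6])
print(result, ops)
```

On a processor without such a unit, every iteration would need at least a separate multiply and a separate add, which is why the dedicated datapath component helps math-intensive signal processing.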
Q8. Suggest two methods to improve productivity.
There are numerous additional approaches to improving designer productivity. Standards focus on
developing well-defined methods for specification, synthesis and libraries. Such standards can reduce
the problems that arise when a designer uses multiple tools, or retrieves or provides design information
from or to other designers. Common standards include language standards, synthesis standards and
library standards.
Languages focus on capturing desired functionality with minimum designer effort. For example, the
sequential programming language of C is giving way to the object oriented language of C++, which in
turn has given some ground to Java. As another example, state-machine languages permit direct capture
of functionality as a set of states and transitions, which can then be translated to other languages like C.
Frameworks provide a software environment for the application of numerous tools throughout the
design process and management of versions of implementations. For example, a framework might
generate the UNIX directories needed for various simulators and synthesis tools, supporting application
of those tools through menu selections in a single graphical user interface.
UNIT 2
SINGLE-PURPOSE PROCESSORS: Hardware, Combinational Logic, Sequential Logic, RT
level Combinational and Sequential Components, Optimizing single-purpose processors. Single-
Purpose Processors: Software, Basic Architecture, Operation, Programmers View, Development
Environment, ASIPS.
6 Hours
TEXT BOOKS:
1. Embedded System Design: A Unified Hardware/Software Introduction - Frank
Vahid, Tony Givargis, John Wiley & Sons, Inc., 2002
REFERENCE BOOKS:
1. Embedded Systems: Architecture and Programming, Raj Kamal, TMH. 2008
2. Embedded Systems Architecture A Comprehensive Guide for Engineers and
Programmers, Tammy Noergaard, Elsevier Publication, 2005
3. Embedded C programming, Barnett, Cox & Ocull, Thomson (2005).
UNIT 2
CUSTOM SINGLE PURPOSE PROCESSORS: HARDWARE
2.1 INTRODUCTION:
A processor is a digital circuit designed to perform computation tasks. A processor consists of a datapath capable of storing and manipulating data and a controller capable of moving data through the datapath.
A general-purpose processor is designed to carry out a wide variety of computation tasks. A single-purpose processor is designed specifically to carry out a particular computation task.
A custom single-purpose processor may be fast, small and low-power, but it may also have
high NRE cost, a longer time-to-market and less flexibility.
2.2 COMBINATIONAL LOGIC:
1. Transistors and Logic Gates
2. Basic combinational logic design
3. RT level combinational components
Transistors and Logic Gates:
A transistor is the basic electrical component in digital systems. A transistor acts as a
simple on/off switch. CMOS (Complementary Metal Oxide Semiconductor) is one common type of transistor technology.
Fig 2.1 View of a CMOS transistor on silicon
The CMOS transistor consists of a gate, a source and a drain, where the gate controls the current
flow from source to drain. A high voltage, typically +3 V or +5 V, is supplied to represent
logic 1; the low voltage is typically ground and is treated as logic 0.
When logic 1 is applied to the gate of an nMOS transistor, the transistor conducts, so current flows.
When logic 0 is applied to the gate, the transistor does not conduct. A pMOS transistor behaves in the opposite way.
Fig 2.2 (a) & (b) CMOS transistor implementation
Fig 2.2 (a), (b) & (c) CMOS transistor implementation of inverter, NAND and NOR gates
Digital system designers work at the abstraction level of logic gates where each gate is
represented symbolically with Boolean equation as shown in figure 2.3
Fig 2.3 Basic logic gates
Combinational logic design:
A combinational circuit is a digital circuit whose output is purely a function of its
present inputs. Such a circuit has no memory of past inputs. An example is shown below.
Fig 2.4 Combinational logic design: problem description, truth table, output equations, minimized output equations, final circuit.
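The idea that a combinational output depends only on the present inputs can be sketched in code. The example below is a hypothetical three-input majority function chosen for illustration (it is not the example from Fig 2.4): the truth table is enumerated from the problem description and compared with a minimized sum-of-products equation.

```python
from itertools import product


def majority_spec(a, b, c):
    """Problem description: output is 1 when at least two inputs are 1."""
    return 1 if (a + b + c) >= 2 else 0


def majority_minimized(a, b, c):
    """Minimized equation: y = ab + ac + bc (AND and OR gates only)."""
    return (a & b) | (a & c) | (b & c)


# Truth table: one row per input combination, no stored state involved.
truth_table = {bits: majority_spec(*bits) for bits in product((0, 1), repeat=3)}

for bits, y in truth_table.items():
    print(bits, y)
```

Because there is no memory, checking all 2^n input rows is enough to show that the minimized circuit matches the specification.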
RT level combinational components:
Designing complex digital circuits using only logic gates is time-consuming, so designers use
register-transfer (RT) level combinational components such as multiplexers, decoders, adders,
comparators and ALUs.
Fig 2.5 combinational components
2.3 Sequential logic
a. Flip flops
b. RT level sequential components
c. Sequential logic design
2.3.1 Flip flops
A sequential circuit is a digital circuit whose outputs are a function of the present as well
as previous input values. The most basic sequential circuit is a flip flop, which stores a single
bit.
D flip flop: It has two inputs, D and clock. When clock is 1, the value of D is stored in the flip
flop and appears at output Q. When clock is 0, the previously stored bit is maintained and
continues to appear at Q.
SR flip flop: It has three inputs, S, R and clock. When clock is 1, inputs S and R are examined:
if S is 1, a 1 is stored; if R is 1, a 0 is stored; if both S and R are 0, there is no change; if both
are 1, the behavior is undefined. Thus S stands for set and R for reset.
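The behavior described above can be captured in a small behavioral model. This is a sketch in Python under stated assumptions: the class names `DFlipFlop` and `SRFlipFlop` and the method name `apply` are hypothetical, the stored bit starts at 0, and the undefined S = R = 1 case is modeled by raising an error.

```python
class DFlipFlop:
    """Stores D while clock is 1; holds the previous bit while clock is 0."""

    def __init__(self):
        self.q = 0  # assumed initial state

    def apply(self, d, clock):
        if clock == 1:
            self.q = d
        return self.q


class SRFlipFlop:
    """Set/reset storage element examined only while clock is 1."""

    def __init__(self):
        self.q = 0  # assumed initial state

    def apply(self, s, r, clock):
        if clock == 1:
            if s == 1 and r == 1:
                raise ValueError("S = R = 1: behavior is undefined")
            if s == 1:
                self.q = 1
            elif r == 1:
                self.q = 0
        return self.q


d_ff = DFlipFlop()
print(d_ff.apply(1, 1), d_ff.apply(0, 0))  # store 1, then hold it
```

Unlike the combinational example, the output here depends on what was stored earlier, which is exactly what makes the circuit sequential.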
Fig 2.6 Sequential components
2.3.2 RT level sequential components:
Registers, shift registers and counters are RT level sequential components.
A register stores n bits from its n-bit data input I, with those stored bits
appearing at its output Q. The bits are stored in parallel.
A shift register stores n bits, but these bits cannot be stored in parallel; instead,
they are shifted into the register serially. A shift register has one data input I
and two control inputs, clock and shift.
A counter is a register that can also increment (add binary 1 to) its stored
binary value. A synchronous input's value only has an effect during a clock edge. An
asynchronous input's value affects the circuit independent of the clock. All these
are shown in figure 2.6.
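A behavioral sketch may help contrast serial shifting with counting. The class names `ShiftRegister` and `Counter`, the method name `clock_edge`, and the choice to shift toward higher indices are illustrative assumptions, not details from the notes. Each call models one clock edge.

```python
class ShiftRegister:
    """n-bit shift register: one data input, bits enter serially."""

    def __init__(self, n):
        self.bits = [0] * n

    def clock_edge(self, data_in, shift):
        if shift:
            # New bit enters at index 0; the oldest bit falls off the end.
            self.bits = [data_in] + self.bits[:-1]
        return list(self.bits)


class Counter:
    """n-bit counter: a register that can add 1 to its stored value."""

    def __init__(self, n):
        self.n = n
        self.value = 0

    def clock_edge(self, count):
        if count:
            self.value = (self.value + 1) % (1 << self.n)  # wraps at 2^n
        return self.value


sr = ShiftRegister(4)
for bit in (1, 0, 1):
    sr.clock_edge(bit, shift=1)
print(sr.bits)
```

Loading an n-bit value into the shift register takes n clock edges, whereas a parallel register would capture all n bits in one.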
2.3.3 Sequential logic design
Sequential logic design can be achieved using a straightforward technique,
which is illustrated below.
Fig 2.7 (a), (b), (c), (d) Sequential logic design
Fig 2.7 (e), (f) Sequential logic design
2.4 Custom single purpose processor design:
A basic processor consists of a controller and a datapath. The datapath stores and
manipulates a system's data. The controller carries out the configuration of the datapath:
it sets the datapath control inputs, such as register load and multiplexer select signals, of the register units, functional units and connection units to obtain the desired configuration of the datapath.
Fig 2.8 A basic processor(a) controller and datapath
(b) view inside the controller and datapath
Example program:
- First create the algorithm.
- Convert the algorithm to a complex state machine, known as an FSMD (finite-state machine with datapath).
- Templates can be used to perform this conversion.
Fig : 2.9 Example program GCD
- Create a register for any declared variable.
- Create a functional unit for each arithmetic operation.
- Connect the ports, registers and functional units:
  - based on reads and writes;
  - use multiplexors for multiple sources.
- Create a unique identifier for each datapath component control input and output.
Templates for creating state diagram:
- We finished the datapath.
- We have a state table for the next state and control logic.
- All that's left is combinational logic design.
- This is not an optimized design, but we see the basic steps.
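The algorithm-to-FSMD conversion can be sketched in software. The following Python simulation assumes the common subtraction-based GCD algorithm; the state names (`LOAD`, `TEST`, `SUB_X`, `SUB_Y`, `DONE`) and the function name `gcd_fsmd` are hypothetical and are not taken from Fig 2.9. The variables `x` and `y` play the role of datapath registers, and the subtraction and comparisons play the role of functional units.

```python
def gcd_fsmd(x_i, y_i):
    """Simulate a GCD FSMD one state per clock cycle; return (gcd, cycles)."""
    if x_i <= 0 or y_i <= 0:
        raise ValueError("inputs must be positive integers")
    state = "LOAD"
    x = y = 0      # datapath registers, one per declared variable
    cycles = 0
    while True:
        cycles += 1
        if state == "LOAD":      # load the input ports into the registers
            x, y = x_i, y_i
            state = "TEST"
        elif state == "TEST":    # comparators drive the next-state logic
            if x == y:
                state = "DONE"
            elif x < y:
                state = "SUB_Y"
            else:
                state = "SUB_X"
        elif state == "SUB_Y":   # subtractor result written back to y
            y = y - x
            state = "TEST"
        elif state == "SUB_X":   # subtractor result written back to x
            x = x - y
            state = "TEST"
        else:                    # DONE: drive the output port
            return x, cycles


print(gcd_fsmd(12, 8)[0])
```

Each branch of the `if` chain corresponds to one row group of the state table: the controller picks the next state, and the datapath performs the register update for the current one.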
Fig 2.10 : Templates for creating state diagram
2.5 RT level Custom Single Purpose processor Design:
- We often start with a state machine rather than an algorithm.
  - Cycle timing is often too central to the functionality.
- Example: a bus bridge that converts a 4-bit bus to an 8-bit bus.
- Start with an FSMD, known as the register-transfer (RT) level.
- Exercise: complete the design.
Fig 2.13 RT level Custom Single Purpose processor Design example
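The bus bridge can be sketched as a two-state machine. This is a simplified illustration, not the design from Fig 2.13: the state names, the function name `bridge_4_to_8`, the omission of handshake signals, and the assumption that the low nibble arrives first are all hypothetical.

```python
def bridge_4_to_8(nibbles):
    """Combine successive 4-bit values into 8-bit values (low nibble first)."""
    state = "WAIT_FIRST"
    low = 0          # register holding the first nibble
    out = []
    for nib in nibbles:
        if not 0 <= nib <= 0xF:
            raise ValueError("each input must fit in 4 bits")
        if state == "WAIT_FIRST":
            low = nib                       # latch the low half
            state = "WAIT_SECOND"
        else:
            out.append((nib << 4) | low)    # emit the full 8-bit word
            state = "WAIT_FIRST"
    return out


print([hex(v) for v in bridge_4_to_8([0xA, 0x3, 0x5, 0xF])])
```

A real bridge would also track request and acknowledge signals on both buses, which is why cycle timing is central to the design and a state machine is the natural starting point.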
2.6 Optimizing Custom single-purpose processors
- Optimization is the task of making design metric values the best possible.
- Optimization opportunities:
  - original program
  - FSMD
  - datapath
  - FSM
Optimizing the original program:
- Analyze program attributes and look for areas of possible improvement:
  - number of computations
  - size of variables
  - time and space complexity
  - operations used (multiplication and division are very expensive)
Fig 2.15 optimizing the program
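One way to analyze the number of computations is to count loop iterations for two versions of the same program. The sketch below compares a subtraction-based GCD with a remainder-based GCD; both function names are hypothetical, and the choice of GCD as the program is an assumption made to stay consistent with the earlier example.

```python
def gcd_subtract(x, y):
    """Subtraction-based GCD; returns (gcd, iterations). Inputs must be positive."""
    steps = 0
    while x != y:
        if x < y:
            y -= x
        else:
            x -= y
        steps += 1
    return x, steps


def gcd_remainder(x, y):
    """Remainder-based GCD; returns (gcd, iterations). Inputs must be positive."""
    steps = 0
    while y != 0:
        x, y = y, x % y
        steps += 1
    return x, steps


print(gcd_subtract(42, 8))
print(gcd_remainder(42, 8))
```

The remainder version needs fewer iterations, but each iteration uses a remainder operation, which is a form of division and therefore costly in hardware. This is the kind of trade-off the attribute list above asks the designer to weigh.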
Optimizing the FSMD:
- Areas of possible improvement:
  - merge states
    - states with constants on transitions can be eliminated, since the transition taken is already known
    - states with independent operations can be merged
  - separate states
    - states which require complex operations (a*b*c*d) can be
      broken into smaller states to reduce hardware size
  - scheduling
Fig 2.16 Optimizing the FSMD for GCD
Optimizing the datapath:
- Sharing of functional units
  - a one-to-one mapping, as done previously, is not necessary
  - if the same operation occurs in different states, those states can share a single
    functional unit
- Multi-functional units
  - ALUs support a variety of operations, so an ALU can be shared among
    operations occurring in different states
Optimizing the FSM:
- State encoding
  - the task of assigning a unique bit pattern to each state in an FSM
  - the size of the state register and of the combinational logic vary with the encoding
  - can be treated as an ordering problem
- State minimization
  - the task of merging equivalent states into a single state
  - two states are equivalent if, for all possible input combinations, they
    generate the same outputs and transition to the same
    next state
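State minimization can be sketched as repeated partition refinement: states start grouped by their outputs, and groups are split until states in the same group also move to the same groups on every input. The function name `equivalent_state_groups` and the three-state example machine are hypothetical; this is a small illustrative sketch intended for tiny FSMs, not an efficient implementation.

```python
def equivalent_state_groups(states, inputs, output, next_state):
    """Group equivalent states of an FSM.

    output[(s, i)] and next_state[(s, i)] give the output and the next
    state of state s on input i.
    """
    # Initial partition: states with identical outputs on every input.
    block = {s: tuple(output[(s, i)] for i in inputs) for s in states}
    while True:
        # Refine: also require matching blocks for every next state.
        refined = {
            s: (block[s], tuple(block[next_state[(s, i)]] for i in inputs))
            for s in states
        }
        if len(set(refined.values())) == len(set(block.values())):
            break  # no group was split, so the partition is stable
        block = refined
    groups = {}
    for s in states:
        groups.setdefault(block[s], []).append(s)
    return sorted(sorted(g) for g in groups.values())


states = ["A", "B", "C"]
inputs = [0, 1]
output = {("A", 0): 0, ("A", 1): 1, ("B", 0): 0, ("B", 1): 1,
          ("C", 0): 0, ("C", 1): 0}
next_state = {("A", 0): "A", ("A", 1): "C", ("B", 0): "B", ("B", 1): "C",
              ("C", 0): "A", ("C", 1): "B"}
print(equivalent_state_groups(states, inputs, output, next_state))
```

In this example A and B produce the same outputs and lead to equivalent next states, so they can be merged, leaving a two-state machine that needs only a one-bit state register.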
GENERAL-PURPOSE PROCESSORS: SOFTWARE
A general-purpose processor is a processor designed for a variety of computation tasks.
- Low unit cost, in part because the manufacturer spreads NRE over large numbers of units
  - Motorola sold half a billion 68HC05 microcontrollers in
    1996 alone
- Carefully designed, since higher NRE is acceptable
  - Can yield good performance, size and power
- Low NRE cost, short time-to-market/prototype, high flexibility
  - User just writes software; no processor design
- a.k.a. "microprocessor": "micro" was used when they were implemented on one or a few chips rather than entire rooms
Basic Architecture:
A general-purpose processor, sometimes called a CPU, consists of a datapath
and a control unit, linked with a memory.
- Control unit and datapath
  - Note the similarity to a single-purpose processor
- Key differences
  - The datapath is general
  - The control unit doesn't store the algorithm; the algorithm is
    programmed into the memory
Datapath Operations:
- Load
  - Read a memory location into a register
- ALU operation
  - Input certain registers through the ALU, store the result back in a register
- Store
  - Write a register to a memory location
Fig 2.17 GPP basic architecture
Control unit:
- The control unit configures the datapath operations.
  - The sequence of desired operations (instructions) is stored in memory: the
    program.
- The instruction cycle is broken into several sub-operations, each taking one clock cycle, e.g.:
  - Fetch: Get the next instruction into the IR
  - Decode: Determine what the instruction means
  - Fetch operands: Move data from memory to a datapath register
  - Execute: Move data through the ALU
  - Store results: Write data from a register to memory
Control Unit Sub-Operations:
- Fetch
  - Get the next instruction into the IR
  - PC: program counter, always points to the next instruction
  - IR: holds the fetched instruction
- Decode
  - Determine what the instruction means
- Fetch operands
  - Move data from memory to a datapath register
- Execute
  - Move data through the ALU
  - This particular instruction does nothing during this sub-operation
- Store results
  - Write data from a register to memory
  - This particular instruction does nothing during this sub-operation
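The sub-operations above can be tied together with a toy simulator. Everything about the instruction format here is a hypothetical assumption made for illustration: the three opcodes (`load`, `add`, `store`), the tuple encoding, the four-register file and the function name `run` do not come from the notes.

```python
def run(program, data_memory, num_registers=4):
    """Execute a list of (opcode, a, b) instructions; return (registers, memory)."""
    pc = 0                       # program counter: index of the next instruction
    regs = [0] * num_registers
    while pc < len(program):
        ir = program[pc]         # Fetch: next instruction into the IR
        pc += 1
        opcode, a, b = ir        # Decode: determine what the instruction means
        if opcode == "load":     # Fetch operands: memory -> register a
            regs[a] = data_memory[b]
        elif opcode == "add":    # Execute: registers a and b through the ALU
            regs[a] = regs[a] + regs[b]
        elif opcode == "store":  # Store results: register a -> memory
            data_memory[b] = regs[a]
        else:
            raise ValueError("unknown opcode: " + str(opcode))
    return regs, data_memory


program = [
    ("load", 0, 500),   # R0 = M[500]
    ("load", 1, 501),   # R1 = M[501]
    ("add", 0, 1),      # R0 = R0 + R1
    ("store", 0, 502),  # M[502] = R0
]
regs, memory = run(program, {500: 7, 501: 5, 502: 0})
print(memory[502])
```

Note that each instruction only uses some of the sub-operations: a load does nothing in the execute and store steps, for example, which matches the remark in the list above.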
Memory:
Program information consists of the sequence of instructions that cause the processor
to carry out the desired system functionality. Data information represents the values
being input, output and transformed by the program. We can store program and data
together or separately.
In a Princeton architecture, data and program words share the same memory space. The
Princeton architecture may result in a simpler hardware connection to memory, since
only one connection is necessary.
In a Harvard architecture, the program memory space is distinct from the data memory
space. A Harvard architecture, while requiring two connections, can perform instruction
and data fetches simultaneously, so may result in improved performance.
Most machines have a Princeton architecture. The Intel 8051 is a well-known Harvard architecture.
Figure 2.19: Two memory architectures: (a) Harvard, (b) Princeton
Memory may be read-only memory (ROM) or readable and writable memory
(RAM). ROM is usually much more compact than RAM. An embedded system often uses
ROM for program memory, since, unlike in desktop systems, an embedded system's
program does not change. Constant data may be stored in ROM, but other data of
course requires RAM.
Memory may be on-chip or off-chip. On-chip memory resides on the same IC as the
processor, while off-chip memory resides on a separate IC. The processor can usually
access on-chip memory much faster than off-chip memory, perhaps in just one cycle, but
finite IC capacity of course implies only a limited amount of on-chip memory.
Figure 2.20: Cache memory
To reduce the time needed to access (read or write) memory, a local copy of a portion
of memory may be kept in a small but especially fast memory called cache. Cache
memory often resides on-chip, and often uses fast but expensive static RAM technology
rather than slower but cheaper dynamic RAM. Cache memory is based on the principle
that if at a particular time a processor accesses a particular memory location, then the
processor will likely access that location and immediate neighbors of the location in the
near future.
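The locality principle can be observed with a small simulation. The sketch below models a direct-mapped cache, which is only one possible organization and is an assumption here; the function name `hit_rate`, the line count and the line size are hypothetical parameters.

```python
def hit_rate(addresses, num_lines=4, line_size=4):
    """Fraction of accesses that hit in a direct-mapped cache."""
    addresses = list(addresses)
    if not addresses:
        raise ValueError("need at least one access")
    tags = [None] * num_lines          # which memory block each line holds
    hits = 0
    for addr in addresses:
        block = addr // line_size      # neighbors share a block
        index = block % num_lines      # the line this block maps to
        if tags[index] == block:
            hits += 1
        else:
            tags[index] = block        # miss: copy the whole block in
    return hits / len(addresses)


print(hit_rate(range(16)))             # sequential accesses
print(hit_rate([0, 16, 0, 16, 0, 16])) # two blocks fighting for one line
```

Sequential accesses hit often because each miss brings in the immediate neighbors of the requested location, whereas an access pattern that alternates between two blocks mapping to the same line gets no benefit.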
Operation:
Instruction execution:
1. Fetch instruction: the task of reading the next instruction from memory into
the instruction register.
2. Decode instruction: the task of determining what operation the instruction
in the instruction register represents (e.g., add, move, etc.).
3. Fetch operands: the task of moving the instruction's operand data into appropriate registers.
4. Execute operation: the task of feeding the appropriate registers through the
ALU and back into an appropriate register.
5. Store results: the task of writing a register into memory.
If each stage takes one clock cycle, then we can see that a single instruction may take
several cycles to complete.
Pipelining
Pipelining is a common way to increase the instruction throughput of a microprocessor.
We first make a simple analogy of two people approaching the chore of washing and
drying 8 dishes. In one approach, the first person washes all 8 dishes, and then the
second person dries all 8 dishes. Assuming 1 minute per dish per person, this approach
requires 16 minutes. The approach is clearly inefficient since at any time only one
person is working and the other is idle. Obviously, a better approach is for the second
person to begin drying the first dish immediately after it has been washed. This
approach requires only 9 minutes -- 1 minute for the first dish to be washed, and then 8
more minutes until the last dish is finally dry. We refer to this latter approach as
pipelined.
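The arithmetic of the dish-washing analogy generalizes to any number of items and stages. The two function names below are hypothetical, and the sketch assumes every stage takes the same amount of time and that the pipeline never stalls.

```python
def sequential_time(items, stages, time_per_stage=1):
    """Each stage finishes all items before the next stage starts."""
    return items * stages * time_per_stage


def pipelined_time(items, stages, time_per_stage=1):
    """Stages overlap: fill the pipeline once, then finish one item per step."""
    return (stages + items - 1) * time_per_stage


# 8 dishes, 2 stages (wash, dry), 1 minute per dish per person.
print(sequential_time(8, 2), pipelined_time(8, 2))
# 8 instructions through the 5 instruction-execution stages listed earlier.
print(sequential_time(8, 5), pipelined_time(8, 5))
```

With many items, the pipelined time approaches one item per time step, so throughput improves by nearly a factor equal to the number of stages even though each individual item still takes just as long.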
Figure 2.21: Pipelining: (a) non-pipelined dish cleaning, (b) pipelined dish cleaning,
(c) pipelined instruction execution.
Each dish is like an instruction, and the two tasks of washing and drying are like the five
stages listed above. By using a separate unit (each akin to a person) for each stage, we can
pipeline instruction execution. After the instruction fetch unit fetches the first
instruction, the decode unit decodes it while the instruction fetch unit simultaneously
fetches the next instruction.
Superscalar and VLIW Architectures:
Performance can be improved by:
- Faster clock (but there's a limit)
- Pipelining: slice up the instruction into stages, overlap the stages
- Multiple ALUs to support more than one instruction stream
  - Superscalar
    - Scalar: non-vector operations
    - Fetches instructions in batches, executes as many as
      possible
    - May require extensive hardware to detect
      independent instructions
  - VLIW: each word in memory has multiple independent
    instructions
    - Relies on the compiler to detect and schedule
      instructions
    - Currently growing in popularity
Programmer's View
- The programmer doesn't need a detailed understanding of the architecture
  - Instead, needs to know what instructions can be executed
- Two levels of instructions:
  - Assembly level
  - Structured languages (C, C++, Java, etc.)
- Most development today is done using structured languages
  - But some assembly-level programming may still be necessary
  - Drivers: the portion of a program that communicates with and/or controls
    (drives) another device
    - Often have detailed timing considerations, extensive bit
      manipulation
    - Assembly level may be best for these
Fig 2.22 Instruction stored in memory
Instruction Set:
Defines the legal set of instructions for that processor
Data transfer: memory/register, register/register, I/O, etc.
Arithmetic/logical: move register through ALU and back
Branches: determine next PC value when not just PC+1
Addressing Modes:
Fig 2.23 Addressing modes
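An addressing mode tells the processor how to interpret an instruction's operand field. The sketch below assumes the five modes commonly listed in such a figure (immediate, register-direct, register-indirect, direct, indirect). The mode names as strings, and the register and memory contents, are sample values chosen only for this illustration.

```python
# Sketch of how an operand field is interpreted under five common
# addressing modes. Register and memory contents are made-up sample data.

registers = {0: 7, 1: 3}          # register number -> contents
memory = {3: 42, 5: 9, 9: 100}    # address -> contents

def fetch_operand(mode, field):
    """Return the data selected by the operand field under the given mode."""
    if mode == "immediate":          # field is the data itself
        return field
    if mode == "register-direct":    # field names a register holding the data
        return registers[field]
    if mode == "register-indirect":  # register holds the memory address of the data
        return memory[registers[field]]
    if mode == "direct":             # field is the memory address of the data
        return memory[field]
    if mode == "indirect":           # memory at field holds the address of the data
        return memory[memory[field]]
    raise ValueError("unknown addressing mode: " + mode)

print(fetch_operand("register-indirect", 1))   # register 1 holds 3, memory[3] is 42
```

The same operand field value yields different data depending on the mode, which is why the mode is encoded in the instruction alongside the operand.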
Fig 2.24 A Simple (Trivial) Instruction Set
Program and data memory space
The embedded systems programmer must be aware of the size of the available memory
for program and for data. The programmer must not exceed these limits. In addition,
the programmer will probably want to be aware of on-chip program and data memory
capacity, taking care to fit the necessary program and data in on-chip memory if
possible.
Registers
The assembly-language programmer must know how many registers are available for
general-purpose data storage. For example, a base register may exist, which permits the
programmer to use a data-transfer instruction where the processor adds an operand
field to the base register to obtain an actual memory address.
I/O
The programmer should be aware of the processor's input and output (I/O) facilities,
with which the processor communicates with other devices. One common I/O facility is
parallel I/O, in which the programmer can read or write a port (a collection of external
pins) by reading or writing a special-function register. Another common I/O facility is a
system bus, consisting of address and data ports that are automatically activated by
certain addresses or types of instructions.
Interrupts
An interrupt causes the processor to suspend execution of the main program, and
instead jump to an Interrupt Service Routine (ISR) that fulfills a special, short-term
processing need. In particular, the processor stores the current PC, and sets it to the
address of the ISR. After the ISR completes, the processor resumes execution of the
main program by restoring the PC.
The programmer should be aware of the types of
interrupts supported by the processor (we describe several types in a subsequent
chapter), and must write ISRs when necessary. The assembly-language programmer
places each ISR at a specific address in program memory. The structured-language
programmer must do so also; some compilers allow a programmer to force a procedure
to start at a particular memory location, while others recognize pre-defined names for
particular ISRs.
For example, we may need to record the occurrence of an event from a peripheral
device, such as the pressing of a button. We record the event by setting a variable in
memory when that event occurs, although the user's main program may not process
that event until later. Rather than requiring the user to insert checks for the event
throughout the main program, the programmer merely needs to write an interrupt service
routine and associate it with an input pin connected to the button. The processor will
then call the routine automatically when the button is pressed.
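The sequence described above (store the PC, jump to the ISR, set a variable, restore the PC, process the event later) can be sketched with a tiny simulated processor. Everything here is hypothetical: the ISR address of 100, the four-instruction main loop, and the choice that only the instruction at address 3 checks the flag.

```python
# Sketch of the interrupt sequence: save PC, jump to ISR, restore PC.
# The ISR address, the flag name and the tiny "processor" are hypothetical.

ISR_ADDRESS = 100
MAIN_LOOP_LENGTH = 4      # main program occupies addresses 0..3

class Processor:
    def __init__(self):
        self.pc = 0
        self.saved_pc = None
        self.interrupt_pending = False
        self.button_pressed = False   # variable in memory set by the ISR
        self.events_handled = 0

    def step(self):
        """Execute one instruction, servicing a pending interrupt first."""
        if self.interrupt_pending:
            self.interrupt_pending = False
            self.saved_pc = self.pc       # store the current PC
            self.pc = ISR_ADDRESS         # set the PC to the ISR address
        if self.pc == ISR_ADDRESS:
            self.button_pressed = True    # ISR body: just record the event
            self.pc = self.saved_pc       # return: restore the PC
            self.saved_pc = None
            return
        # Main program: only the instruction at address 3 looks at the flag.
        if self.pc == 3 and self.button_pressed:
            self.button_pressed = False
            self.events_handled += 1
        self.pc = (self.pc + 1) % MAIN_LOOP_LENGTH

cpu = Processor()
cpu.step()
cpu.step()                       # main program has advanced to address 2
cpu.interrupt_pending = True     # the button is pressed
cpu.step()                       # ISR runs; PC is restored to 2
pc_after_isr = cpu.pc
cpu.step()
cpu.step()                       # instruction at address 3 processes the event
print(pc_after_isr, cpu.events_handled)
```

The main program contains a single check of the flag rather than checks scattered throughout, and it resumes at the same address it was interrupted at.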
Operating System
Optional software layer providing low-level services to a program (application).
File management, disk access
Keyboard/display interfacing
Scheduling multiple programs for execution
Or even just multiple threads from one program
Program makes system calls to the OS
Development Environment
Development processor
The processor on which we write and debug our programs
Usually a PC
Target processor
The processor that the program will run on in our embedded system
Often different from the development processor
Software Development Process
Compilers
Cross compiler
Runs on one processor, but generates code for another
Assemblers
Linkers
Debuggers
Profilers
Fig 2.25 Software Development Process
Running a Program:
If the development processor is different from the target, how can we run our compiled
code? Two options:
Download to target processor
Simulate
Simulation
One method: Hardware description language
But slow, not always available
Another method: Instruction set simulator (ISS)
Runs on development processor, but executes instructions of target
processor
Testing and Debugging: ISS
Gives us control over time: set breakpoints, look at register values, set values, step-by-step execution, ...
But, doesn't interact with real environment
Download to board
Use device programmer
Runs in real environment, but not controllable
Compromise: emulator
Runs in real environment, at speed or near
Supports some controllability from the PC
Fig 2.26 software design process
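To make the idea of an instruction set simulator concrete, here is a minimal sketch of one. It runs on the development machine but interprets the instructions of a target, and it offers the controllability mentioned above (breakpoints, register inspection, single stepping). The target machine is invented for this example: four registers and the opcodes MOVI, ADD, SUB, JZ and HALT are assumptions, not a real processor.

```python
# Minimal instruction-set simulator sketch. The target machine below
# (4 registers, 5 opcodes) is invented for illustration only.

class ISS:
    def __init__(self, program):
        self.program = program
        self.pc = 0
        self.regs = [0] * 4
        self.breakpoints = set()
        self.halted = False

    def step(self):
        """Execute exactly one target instruction."""
        op, *args = self.program[self.pc]
        self.pc += 1
        if op == "MOVI":                 # Rn = immediate
            self.regs[args[0]] = args[1]
        elif op == "ADD":                # Rn = Rn + Rm
            self.regs[args[0]] += self.regs[args[1]]
        elif op == "SUB":                # Rn = Rn - Rm
            self.regs[args[0]] -= self.regs[args[1]]
        elif op == "JZ":                 # if Rn == 0: PC = PC + offset
            if self.regs[args[0]] == 0:
                self.pc += args[1]
        elif op == "HALT":
            self.halted = True
        else:
            raise ValueError("unknown opcode: " + op)

    def run(self):
        """Run until HALT or until the PC reaches a breakpoint address."""
        while not self.halted:
            self.step()
            if self.pc in self.breakpoints:
                break

# Target program: r0 = 3 + 2 + 1
program = [
    ("MOVI", 0, 0),    # 0: sum = 0
    ("MOVI", 1, 3),    # 1: counter = 3
    ("MOVI", 2, 1),    # 2: constant 1
    ("MOVI", 3, 0),    # 3: constant 0, used for an unconditional jump
    ("JZ", 1, 3),      # 4: if counter == 0 go to 8
    ("ADD", 0, 1),     # 5: sum += counter
    ("SUB", 1, 2),     # 6: counter -= 1
    ("JZ", 3, -4),     # 7: go back to 4
    ("HALT",),         # 8
]

sim = ISS(program)
sim.breakpoints.add(5)
sim.run()                          # stops just before the first ADD
regs_at_breakpoint = list(sim.regs)
sim.breakpoints.clear()
sim.run()                          # continue to HALT
print(regs_at_breakpoint, sim.regs[0])
```

Because the simulated registers are ordinary variables, they can be inspected or changed at any breakpoint. What the sketch cannot do is interact with the real environment, which is the limitation noted for ISS-based debugging.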
Application-Specific Instruction-Set Processors (ASIPs):
General-purpose processors
Sometimes too general to be effective in demanding application
e.g., video processing requires huge video buffers and operations on large arrays of data, inefficient on a GPP
But single-purpose processor has high NRE, not programmable
ASIPs targeted to a particular domain
Contain architectural features specific to that domain
e.g., embedded control, digital signal processing, video
processing, network processing, telecommunications, etc.
Still programmable
A Common ASIP: Microcontroller
For embedded control applications
Reading sensors, setting actuators
Mostly dealing with events (bits): data is present, but not in huge
amounts
e.g., VCR, disk drive, digital camera (assuming SPP for image
compression), washing machine, microwave oven
Microcontroller features
On-chip peripherals
Timers, analog-digital converters, serial communication, etc.
Tightly integrated for programmer, typically part of register space
On-chip program and data memory
Direct programmer access to many of the chip's pins
Specialized instructions for bit-manipulation and other low-level operations
Digital Signal Processors (DSP)
For signal processing applications
Large amounts of digitized data, often streaming
Data transformations must be applied fast
e.g., cell-phone voice filter, digital TV, music synthesizer
DSP features
Several instruction execution units
Multiply-accumulate single-cycle instruction, other instrs.
Efficient vector operations, e.g., add two arrays
Vector ALUs, loop buffers, etc.
Selecting a Microprocessor
Issues
Technical: speed, power, size, cost
Other: development environment, prior expertise, licensing, etc.
Speed: how do we evaluate a processor's speed?
Clock speed, but instructions per cycle may differ
Instructions per second, but work per instruction may differ
Dhrystone: synthetic benchmark, developed in 1984. Measured in Dhrystones/sec.
MIPS: 1 MIPS = 1757 Dhrystones per second (based on Digital's VAX 11/780). A.k.a. Dhrystone MIPS. Commonly used today.
So, 750 MIPS = 750*1757 = 1,317,750 Dhrystones per second
SPEC: set of more realistic benchmarks, but oriented to desktops
EEMBC: EDN Embedded Microprocessor Benchmark Consortium
Suites of benchmarks: automotive, consumer electronics, networking, office automation, telecommunications
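The Dhrystone MIPS conversion and the caveat about clock speed can be checked with a few lines of arithmetic. The 1757 baseline comes from the notes above. The two processors compared at the end (100 MHz at 1.0 instructions per cycle, 80 MHz at 1.5) are made-up figures used only to show why clock speed alone can mislead.

```python
# Conversion between Dhrystone MIPS and Dhrystones per second, using the
# VAX 11/780 baseline of 1757 Dhrystones per second quoted in the notes.

VAX_DHRYSTONES_PER_SECOND = 1757

def dmips_to_dhrystones(dmips):
    return dmips * VAX_DHRYSTONES_PER_SECOND

def dhrystones_to_dmips(dhrystones_per_second):
    return dhrystones_per_second / VAX_DHRYSTONES_PER_SECOND

def instructions_per_second(clock_hz, instructions_per_cycle):
    return clock_hz * instructions_per_cycle

print(dmips_to_dhrystones(750))            # 1317750

# Hypothetical processors: the slower clock wins on instruction rate.
fast_clock = instructions_per_second(100_000_000, 1.0)
slow_clock = instructions_per_second(80_000_000, 1.5)
print(slow_clock > fast_clock)             # True
```

Instructions per second is itself only a proxy, since the work done per instruction differs between instruction sets. That is the motivation for benchmark-based measures such as Dhrystone.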
Designing a General Purpose Processor
Not something an embedded system designer normally would do
But instructive to see how simply we can build one top down
Remember that real processors aren't usually built this way
Much more optimized, much more bottom-up design
Fig:2.27 A simple microprocessor
RECOMMENDED QUESTIONS
UNIT 2
( Hardware)
1. What is a single purpose processor? What are the benefits of choosing a
single purpose processor over a general purpose processor?
2. How do nMOS and pMOS transistors differ?
3. Build a 3-input NAND gate using a minimum number of CMOS transistors.
4. Build a 3-input NOR gate using a minimum number of CMOS transistors.
5. Build a 2-input AND gate using a minimum number of CMOS transistors.
6. Build a 2-input OR gate using a minimum number of CMOS transistors.
7. Explain why NAND and NOR gates are more common than AND and OR
gates.
8. Distinguish between combinational and sequential circuits.
9. Design a 2-bit comparator with a single output "less than" using
combinational design technique.
10. Design a 3 x 8 decoder with truth table and K-maps.
11. What is the difference between synchronous and asynchronous circuits?
12. What is the purpose of datapath and control path?
13. Design a single purpose processor that outputs Fibonacci numbers up to n
places. Start with a function computing the desired result, translate it into a
state diagram and sketch a probable datapath.
UNIT 2
( Software)
1. Describe why a general purpose processor could cost less than a single
purpose processor.
2. Create a table listing the address spaces for 8, 16, 24, 32 and 64 bit address
sizes.
3. Illustrate how program and data memory fetches can be overlapped in a
Harvard architecture.
4. For a microcontroller, create a table listing five existing variations, stressing
the features that differ from the basic version.
QUESTION PAPER SOLUTION
UNIT 2
Q1. Write an algorithm for GCD with more time complexity and write the
FSMD, and also determine the total number of steps required for GCD.
First create algorithm
Convert algorithm to complex state machine
Known as FSMD: finite-state machine with datapath
Can use templates to perform such conversion
GCD
Create a register for any declared variable
Create a functional unit for each arithmetic operation
Connect the ports, registers and functional units
Based on reads and writes
Use multiplexors for multiple sources
Create unique identifier for each datapath component control input and output
Templates for creating state diagram :
We finished the datapath
We have a state table for the next state and control logic
All that's left is combinational logic design
This is not an optimized design, but we see the basic steps
Templates for creating state diagram
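As a sketch of the "first create algorithm" step and of how a step count can be obtained, the function below implements a subtraction-based GCD and counts one step for each state an FSMD would pass through. The state breakdown used here (one state for the loop test, one for the comparison, one for the subtraction) is an assumption. A different template would give a different count, so the numbers are illustrative rather than the answer expected by the question paper.

```python
# Subtraction-based GCD with a step counter, in the spirit of an FSMD in
# which each loop test, comparison and assignment occupies one state.
# The exact state breakdown is an assumption made for this sketch.

def gcd_steps(x, y):
    """Return (gcd, steps) for positive integers x and y."""
    steps = 0
    while True:
        steps += 1              # state: evaluate x != y
        if x == y:
            break
        steps += 1              # state: evaluate x < y
        if x < y:
            y = y - x
        else:
            x = x - y
        steps += 1              # state: perform the subtraction
    return x, steps

print(gcd_steps(42, 8))         # (2, 25): 8 subtractions, 3 states each, plus the final test
```

The inputs must be positive, since the subtraction loop never terminates if either value is zero. The count grows with the ratio of the inputs, which is what makes this version the slower, "more time complexity" form of the algorithm.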
Q2. Explain the different methods to optimize the FSMD.
Optimization is the task of making design metric values the best
possible
Optimization opportunities
original program
FSMD
datapath
FSM
Optimizing the original program
Analyze program attributes and look for areas of possible
improvement
number of computations
size of variable
time and space complexity
operations used
multiplication and division very expensive
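One way to act on "multiplication and division very expensive" is to replace them with shifts and additions when a constant is involved. The sketch below is an illustration of that idea only. The constants 10 and 4 are arbitrary choices and are not taken from the notes.

```python
# Replacing multiplication and division by constants with shifts and adds.
# In hardware, a shift by a constant is just wiring, so the datapath needs
# an adder rather than a multiplier or divider. Constants are examples.

def times_ten_multiply(x):
    return x * 10

def times_ten_shift_add(x):
    return (x << 3) + (x << 1)     # 8x + 2x

def divide_by_four(x):
    return x // 4

def divide_by_four_shift(x):
    return x >> 2                  # same result for non-negative integers

print(times_ten_shift_add(7), divide_by_four_shift(17))   # 70 4
```

Both rewritten functions return the same values as the originals for non-negative integers, so the change affects only the cost of the operations used, not the program's behavior.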
Q3. Explain the different memory architectures.
Program information consists of the sequence of instructions that cause the processor
to carry out the desired system functionality. Data information represents the values
being input, output and transformed by the program. We can store program and data
together or separately.
In a Princeton architecture, data and program words share the same memory space. The
Princeton architecture may result in a simpler hardware connection to memory, since
only one connection is necessary.
In a Harvard architecture, the program memory space is distinct from the data memory
space. A Harvard architecture, while requiring two connections, can perform instruction
and data fetches simultaneously, so may result in improved performance.
Most machines have a Princeton architecture. The Intel 8051 is a well-known Harvard
architecture.
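The performance difference can be illustrated with a rough count of memory cycles. The model below is a deliberate simplification and all of its assumptions are ours: every instruction needs one fetch, some instructions also need one data access, each access takes one cycle, and a Harvard machine fully overlaps a data access with an instruction fetch.

```python
# Rough memory-cycle count for the two architectures. Assumes one cycle per
# access and perfect overlap on the Harvard machine; real processors are
# more complicated, so treat the numbers as illustrative only.

def memory_cycles(num_instructions, num_data_accesses, harvard):
    if harvard:
        # Separate connections: a data access proceeds in parallel with
        # an instruction fetch.
        return max(num_instructions, num_data_accesses)
    # Single shared connection: all accesses are serialized.
    return num_instructions + num_data_accesses

princeton = memory_cycles(100, 40, harvard=False)
harvard = memory_cycles(100, 40, harvard=True)
print(princeton, harvard)       # 140 100
```

Under these assumptions a program of 100 instructions with 40 data accesses needs 140 memory cycles on the shared connection and 100 with separate connections.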
Figure 2.19: Two memory architectures: (a) Harvard, (b) Princeton
Memory may be read-only memory (ROM) or readable and writable memory
(RAM). ROM is usually much more compact than RAM. An embedded system often uses
ROM for program memory, since, unlike in desktop systems, an embedded system's
program does not change. Constant data may be stored in ROM, but other data of
course requires RAM.
Memory may be on-chip or off-chip. On-chip memory resides on the same IC as the
processor, while off-chip memory resides on a separate IC. The processor can usually
access on-chip memory much faster than off-chip memory, perhaps in just one cycle, but
finite IC capacity of course implies only a limited amount of on-chip memory.
Q4. Explain pipelining for instruction execution with dish cleaning.
Pipelining is a common way to increase the instruction throughput of a microprocessor.
We first make a simple analogy of two people approaching the chore of washing and
drying 8 dishes. In one approach, the first person washes all 8 dishes, and then the
second person dries all 8 dishes. Assuming 1 minute per dish per person, this approach
requires 16 minutes. The approach is clearly inefficient since at any time only one
person is working and the other is idle. Obviously, a better approach is for the second
person to begin drying the first dish immediately after it has been washed. This
approach requires only 9 minutes -- 1 minute for the first dish to be washed, and then 8
more minutes until the last dish is finally dry.
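The 16-minute and 9-minute figures follow from a simple formula, sketched below. With n items and k stages of equal duration, the non-pipelined approach takes n*k time units, while the pipelined approach takes k units for the first item and one more unit for each remaining item. The function and parameter names are ours.

```python
# Time to push a number of items through stages of equal duration.

def non_pipelined_time(items, stages, stage_time=1):
    # No overlap: only one stage is busy at any moment.
    return items * stages * stage_time

def pipelined_time(items, stages, stage_time=1):
    # The first item needs every stage; each later item finishes one
    # stage time after the previous one.
    return (stages + items - 1) * stage_time

print(non_pipelined_time(8, 2), pipelined_time(8, 2))   # 16 9
```

Applying the same formula to eight instructions in a five-stage pipeline gives 12 time units instead of 40, assuming every stage takes one unit and nothing stalls the pipeline.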
Figure 2.21: Pipelining: (a) non-pipelined dish cleaning, (b) pipelined dish cleaning,
(c) pipelined instruction execution.
Each dish is like an instruction, and the two tasks of washing and drying are like the five
stages listed above. By using a separate unit (each akin to a person) for each stage, we can
pipeline instruction execution. After the instruction fetch unit fetches the first
instruction, the decode unit decodes it while the instruction fetch unit simultaneously
fetches the next instruction.
Q5. Explain the software development process.
Software Development Process
Compilers
Cross compiler
Runs on one processor, but generates code for another
Assemblers
Linkers
Debuggers
Profilers
Fig 2.25 Software Development Process
Running a Program:
If the development processor is different from the target, how can we run our compiled
code? Two options:
Download to target processor
Simulate
Simulation
One method: Hardware description language
But slow, not always available
Another method: Instruction set simulator (ISS)
Runs on development processor, but executes instructions of target
processor
Testing and Debugging: ISS
Gives us control over time: set breakpoints, look at register values, set
values, step-by-step execution, ...
But, doesn't interact with real environment
Download to board
Use device programmer
Runs in real environment, but not controllable
Compromise: emulator
Runs in real environment, at speed or near
Supports some controllability from the PC
Fig 2.26 software design process
optimizing the program
Optimizing the FSMD:
Areas of possible improvements
merge states
states with constants on transitions can be eliminated, transition taken is already known
states with independent operations can be merged
separate states
states which require complex operations (a*b*c*d) can be
broken into smaller states to reduce hardware size
scheduling
optimizing the FSMD for GCD
Optimizing the datapath:
Sharing of functional units
one-to-one mapping, as done previously, is not necessary
if same operation occurs in different states, they can share a single
functional unit
Multi-functional units
ALUs support a variety of operations, so one can be shared among operations occurring in different states
Optimizing the FSM:
State encoding
Embedded System Desi