R. Ernst, TU Braunschweig 1
Embedded System Modeling, Part 1
R. Ernst
TU Braunschweig, Germany
Lecture Overview
1 Introduction and motivation (Part 1)
2 Application modeling (Part 1)
3 Target architecture modeling (Part 2)
Embedded Systems
• Embedded system: a microcomputer system “embedded” in a technical system
[Figure: examples of embedded systems]
Embedded Systems - Trends
• Trend 1: higher system integration
– integration of complete programmable subsystems on a single IC - Systems-on-Chip (SOC)
– programmable platforms examples: network processors, multi-media platforms, automotive platforms
Automotive Platform - Example MPC 555
[Figure: MPC 555 block diagram. PowerPC core (RCPU), Flash (448 KB), SRAM (26 KB), burst interface, system control, U-bus, E-bus, bus interfaces, Inter Module Bus, CAN bus interfaces (2), serial multi-channel module, peripheral channels (PWM, ...), ADC (2x16), time processing units (2). Legend: programmable processors; weakly programmable co-processors; reused components (IP); memories; interface & control]
MPC 555
IC consists for the most part of IP and reused components
source: Motorola and Microprocessor Report, April 20, 1998
Platform component types
• Another example: Philips Nexperia™ platform (Source: Th. Claasen, DAC 2000)
[Figure: Nexperia DVP system silicon. TriMedia™ CPU (TM-core with I$ and D$) and MIPS™ CPU (core with I$ and D$), each with a PI bus and several device I/P blocks, connected through the MMI and the DVP memory bus to SDRAM. Legend: programmable processors (OS + API + custom SW); weakly programmable co-processors; configurable IP components; memories; communication components]
Embedded System - Trends - 2
• Trend 2: networked systems: ubiquitous computing, telecom, automotive, avionics, space, ...
[Figure: subsystem integration and service integration in networked systems. Image source: Siemens]
Embedded System Design Constraints
• many non-functional constraints influence design goals and architectures
– tight cost margins
– hard time constraints
– power consumption (mobile systems)
– safety including EMI
– size, weight, ...
• reduced and overlapping design cycles
Embedded architectures are heterogeneous
• different processing element types
– processors, weakly programmable coprocessors, IP components
• different interconnection networks and communication protocols
• different memory types
• different scheduling and synchronization strategies
[Figure: heterogeneous architecture with processors (P), a DSP, a co-processor (CoP), and memories (M)]
Managing HW architecture complexity
• development of application programmer interfaces (API) to hide complexity from application programmer and improve portability
• specialized RTOS to control resource sharing and interfaces
complex multi-level HW/SW architecture
Software architecture example
• layered software architecture with API
• embedded SW is heterogeneous
[Figure: hardware architecture (core, cache, private and shared memory, timers, I/O, interrupt and bus control, periphery on the chip bus; processing element pe1, communication element ce1) below the software architecture (drivers, RTOS, RTOS-APIs, API, application)]
SW architecture - 2
[Figure: application with blocks B1 ... B4 and processes P1 ... P4, written in process languages (e.g. C, SystemC, VHDL) and communicating via shared memory or directly; below it the run-time system: API, middleware, RTOS (e.g. VxWorks), SW library (drivers etc.)]
ES implementation challenges
• integration
– design process integration
– heterogeneous component and language integration (VSIA, Accellera)
• design space exploration and optimization
• verification
2 Application modeling
• use of different modeling languages in a single system
– languages/semantics established in application domain: flow graphs, FSM, ...
– domain specific optimizations required: signal flow graph transformations, FSM transformations, ...
– investment in language environments
single design language unlikely
[Figure: system composed of subsystems 1 to 3 and an IP block, specified in different languages (Simulink, UML, input language 2)]
Application & Architecture
[Figure: application layer (application development: subsystems 1 to 3 and IP in Simulink, UML, input language 2) above the architecture layer (heterogeneous HW of P, DSP, CoP, M and the SW stack of core, RTOS, drivers, RTOS-APIs, application); the implementation language forms the “articulation point” between both layers. Part 1: application models. Part 2: architecture models]
Embedded System Modeling Principles
• standard embedded system model
– networks of communicating processes (P1, P2)
– processes locally sequential
[Figure: two communicating processes P1 and P2]
Sequential processes
• sequential process languages („Host“ language)
– programming languages
• C, C++, Java, assembly language, ...
– hardware description languages
• VHDL processes, SystemC processes
Sequential process modeling
• unbuffered control and acyclic data flow graphs
for (i = 0; i <= n/2; i++) {
    j = i * 2 + 1;
    g[i] = a[j] * p[i];
    if (g[j] > max_val)
        g[j] = max_val;
}
[Figure: process control flow of the loop and the data flow graph of its basic block (operations *, +, [], >, = on i, j, a, p, g, max_val and the constants 2 and 1)]
basic block: no branch except at end
Sequential process modeling - 2
• basic block (BB) represented by data flow graph (DFG)
– nodes are elementary operations
– edges are variables/signals
– acyclic graph
– signals not buffered
• operations activated (ready) if all signals/variables available („AND“)
• all output signals available after operation execution
• basic model for compilers and HW synthesis tools
– operation scheduling
(but: RT and logic synthesis assume single cycle)
– allocation of operations, variables, busses
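The “AND” activation rule can be illustrated with a small interpreter. The following is a minimal sketch in Python, not the model of any particular compiler or synthesis tool; the function `run_dfg`, the node names, and the tuple layout are assumptions made for this example. Each operation fires exactly once, as soon as all of its input signals are available.

```python
import operator

def run_dfg(nodes, inputs):
    """Fire every operation once, as soon as all of its input signals exist."""
    values = dict(inputs)      # signals that are currently available
    pending = list(nodes)      # operations that have not fired yet
    order = []                 # firing order, for inspection
    while pending:
        # "AND" rule: an operation is ready only if all inputs are available
        ready = [n for n in pending if all(s in values for s in n[2])]
        if not ready:
            raise ValueError("graph is cyclic or an input signal is missing")
        for name, func, sources, target in ready:
            values[target] = func(*(values[s] for s in sources))
            order.append(name)
        pending = [n for n in pending if n not in ready]
    return values, order

# Hypothetical DFG for: j = i * 2 + 1; g_i = a[j] * p[i]
# Each node is (name, operation, input signals, output signal).
NODES = [
    ("mul2", operator.mul, ("a_j", "p_i"), "g_i"),
    ("idx_a", operator.getitem, ("a", "j"), "a_j"),
    ("add", operator.add, ("t", "one"), "j"),
    ("idx_p", operator.getitem, ("p", "i"), "p_i"),
    ("mul1", operator.mul, ("i", "two"), "t"),
]

values, order = run_dfg(
    NODES, {"i": 1, "two": 2, "one": 1, "a": [0, 1, 2, 3, 4], "p": [5, 6, 7]}
)
print(values["j"], values["g_i"], order)
```

The node list is deliberately given out of order: the firing order follows from data availability alone, which is the freedom that operation scheduling exploits.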
Sequential process modeling - 3
• process control flow represented by control flow graph
– nodes are basic blocks
– edges represent control branches
– signals not buffered
• basic blocks activated if flow arrives on one input edge („OR“)
• one output branch activated after BB execution
• global model for compilers and HW synthesis tools
– in HW synthesis often combined in single graph (CDFG)
for (i = 0; i <= n/2; i++) {
    j = i * 2 + 1;
    g[i] = a[j] * p[i];
    if (g[j] > max_val)
        g[j] = max_val;
}
Process communication
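• communication is part of the “process coordination” language
• shared variable communication: processes P1 ... P4 access a shared memory, e.g. “a = b + c;” in one process and “g = a * c;” in another
• message passing communication: processes exchange messages directly, e.g. “send (a);” in the sender and “receive (a);” in the receiver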
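The two communication styles can be contrasted in a few lines. This is a minimal sketch using Python threads as stand-ins for processes; the function names and the use of `threading.Event` for coordination are assumptions of this example, not part of the lecture material.

```python
import queue
import threading

def shared_variable_demo(b, c):
    """P1 and P2 communicate through a variable in shared memory."""
    shared = {}
    written = threading.Event()   # extra coordination: the variable alone gives no ordering
    result = []

    def p1():
        shared["a"] = b + c       # "a = b + c;"
        written.set()

    def p2():
        written.wait()            # without this, P2 might read before P1 has written
        result.append(shared["a"] * c)   # "g = a * c;"

    threads = [threading.Thread(target=p2), threading.Thread(target=p1)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result[0]

def message_passing_demo(b, c):
    """P1 and P2 communicate through an explicit channel."""
    channel = queue.Queue()
    result = []

    def p1():
        channel.put(b + c)        # "send (a);"

    def p2():
        a = channel.get()         # "receive (a);" blocks until a message arrives
        result.append(a * c)

    threads = [threading.Thread(target=p2), threading.Thread(target=p1)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result[0]

print(shared_variable_demo(2, 3), message_passing_demo(2, 3))
```

In the shared variable version the ordering between writer and reader has to be added separately; in the message passing version the blocking receive provides it.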
2 Application modeling
• Important application models
– Kahn graphs
– concurrent FSM and synchronous/reactive systems
– asynchronous processes
– systems with periodic process activation
– others (e.g. Petri nets)
• different coordination semantics: “models of computation”
Kahn process networks
• communicating processes with directed flow
• communication: token “stream” between two processes
• process: operations on tokens
[Figure: processes P1 and P2 connected by token streams]
Kahn process networks
• special class of process networks
• communication via FIFO with unbounded capacity
• process:
– destructive read (“consumption”) at process start
– non-destructive write (“production”) at process end
– blocking read - process only executed if data available
– non-blocking write
[Figure: processes P1 and P2 with input and output FIFOs]
Kahn process networks - 2
• monotonous process
– a process is monotonous if $F(x_1 x_2 \ldots x_n x_{n+1}) \sqsupseteq F(x_1 x_2 \ldots x_n)$
i.e., the output string grows with the input string
• monotonous Kahn process networks are independent of the order of process executions
– excellent for design space exploration
– buffering supports pipelining
– process order / scheduling can be optimized for
• processor cost minimization
• FIFO memory minimization - tokens are often data arrays (e.g. frames)
• timing constraints - often periodic I/O data tokens (signal processing)
– minimum cycle defines maximum schedule length
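The independence from the order of process executions can be checked experimentally. The sketch below is a minimal Python illustration under stated assumptions: the three processes, the FIFO names, and the randomized scheduler are invented for this example, and a blocking read is emulated by simply not executing a process whose input FIFO is empty.

```python
import random
from collections import deque

def run_network(samples, seed):
    """Run a three-stage Kahn network under a randomized process order."""
    rng = random.Random(seed)
    source, f1, f2, out = deque(samples), deque(), deque(), []
    total = 0

    def p1():                      # forwards input tokens
        if source:
            f1.append(source.popleft())

    def p2():                      # doubles every token
        if f1:                     # blocking read: no token, no execution
            f2.append(2 * f1.popleft())

    def p3():                      # running sum, keeps local state
        nonlocal total
        if f2:
            total += f2.popleft()
            out.append(total)

    while source or f1 or f2:
        # arbitrary execution order in every round
        for process in rng.sample([p1, p2, p3], 3):
            process()
    return out

results = {tuple(run_network([1, 2, 3], seed)) for seed in range(20)}
print(results)
```

All schedules produce the same output stream; what changes between schedules is only how many tokens sit in the FIFOs at a time, which is the quantity a scheduler can optimize.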
R. Ernst, TU Braunschweig 29
Data flow process networks
• data flow process networks are Kahn process networks with firing rules
– conditions for process execution: number and type of tokens (“AND”)
– output tokens per process execution
• examples of data flow process networks
– SDF - processes consume / produce fixed number of tokens (Lee/Messerschmitt)
– cyclo static DF - processes cycle through a fixed token prod/cons (Lauwereins)
– boolean DF - token cons/prod depends on boolean control input (Buck/Lee)
• commercial tools COSSAP, SPW, DSP Station, ...
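For SDF, the fixed token rates allow a static schedule to be derived from the balance equations: for every edge, the number of tokens produced per schedule period must equal the number consumed. The following minimal Python sketch computes the smallest integer repetition count per process. The function name, the edge tuple layout, and the restriction to a connected graph are assumptions of this example.

```python
from fractions import Fraction
from math import gcd

def repetition_vector(edges):
    """edges: list of (producer, consumer, tokens_produced, tokens_consumed).

    Solves q[producer] * produced == q[consumer] * consumed for all edges.
    Assumes a connected graph.
    """
    rate = {edges[0][0]: Fraction(1)}
    changed = True
    while changed:
        changed = False
        for src, dst, prod, cons in edges:
            if src in rate and dst not in rate:
                rate[dst] = rate[src] * prod / cons
                changed = True
            elif dst in rate and src not in rate:
                rate[src] = rate[dst] * cons / prod
                changed = True
            elif src in rate and dst in rate and rate[src] * prod != rate[dst] * cons:
                raise ValueError("inconsistent rates: no periodic schedule exists")
    # scale the rational solution to the smallest positive integers
    scale = 1
    for r in rate.values():
        scale = scale * r.denominator // gcd(scale, r.denominator)
    counts = {name: int(r * scale) for name, r in rate.items()}
    common = 0
    for value in counts.values():
        common = gcd(common, value)
    return {name: value // common for name, value in counts.items()}

# A produces 2 tokens per firing, B consumes 3; B produces 1, C consumes 2.
print(repetition_vector([("A", "B", 2, 3), ("B", "C", 1, 2)]))
```

For the example, A fires three times, B twice, and C once per period, so every FIFO returns to its initial fill level.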
Example: Modem (Lee[86])
optimization for minimal program size (“single appearance” schedule):
(16A)(16B)(2C)IJKLM(2N)PFDOEGH
[Figure: SDF graph of the modem with nodes A to P and token rates on the edges; node labels as extracted: In, Filt, Hil, Eq, Mul, Deci, Out, Fork, Biq, Add, Deco, sc, Conj]
FSMs
• Finite state machines (FSM)
– Moore FSM
– Mealy FSM
Moore FSM
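• $M = (S, I, O, \delta, \lambda)$
– $S$ state set
– $I$ input set
– $O$ output set
– $\delta: I \times S \to S$ transition function
– $\lambda: S \to O$ output function
[Figure: Moore FSM graph (states s1/o1 and s2/o2, transitions labeled with inputs ii and ij) and Moore FSM structure (input I, output O, memory holding state S, next state S´)]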
Mealy FSM
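• $M = (S, I, O, \delta, \lambda)$
– $S$ state set
– $I$ input set
– $O$ output set
– $\delta: I \times S \to S$ transition function
– $\lambda: I \times S \to O$ output function
[Figure: Mealy FSM graph (states s1 and s2, transitions labeled ii/o2 and ij/o1) and Mealy FSM structure (input I, output O, memory holding state S, next state S´)]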
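The difference between the two output functions shows up when both machine types implement the same behavior. The sketch below is a minimal Python illustration using a rising-edge detector as a hypothetical example; the table-based encoding and all names are assumptions of this sketch. The transition function is a dictionary keyed by (input, state), matching the I x S domain above.

```python
def run_moore(delta, lam, state, inputs):
    """Moore FSM: the output depends on the state only."""
    outputs = [lam[state]]
    for x in inputs:
        state = delta[(x, state)]
        outputs.append(lam[state])
    return outputs

def run_mealy(delta, lam, state, inputs):
    """Mealy FSM: the output depends on input and state."""
    outputs = []
    for x in inputs:
        outputs.append(lam[(x, state)])
        state = delta[(x, state)]
    return outputs

# Mealy edge detector: the state stores the previous input value.
MEALY_DELTA = {(x, s): x for x in (0, 1) for s in (0, 1)}
MEALY_LAMBDA = {(x, s): int(x == 1 and s == 0) for x in (0, 1) for s in (0, 1)}

# Moore edge detector: needs a third state to hold the output for one step.
MOORE_DELTA = {
    (0, "low"): "low", (0, "edge"): "low", (0, "high"): "low",
    (1, "low"): "edge", (1, "edge"): "high", (1, "high"): "high",
}
MOORE_LAMBDA = {"low": 0, "edge": 1, "high": 0}

stimulus = [0, 1, 1, 0, 1]
print(run_mealy(MEALY_DELTA, MEALY_LAMBDA, 0, stimulus))
print(run_moore(MOORE_DELTA, MOORE_LAMBDA, "low", stimulus))
```

The Mealy machine needs two states and reacts in the same step as the input; the Moore machine needs three states and shows the same output sequence one transition later.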
FSM semantics
• synchronous switching networks
– state transition at every clock cycle (RTL)
• general FSM
– FSM transits upon new input value or input event
– abstracts from target architecture clocking
implementation not constrained
• HW implementation 1 to n clock cycles
• SW implementation m processor instructions
Concurrent FSM
• complex system functions described as concurrent FSM
– easier to understand
– usually lower implementation cost
– required for multiprocessor and networked systems
[Figure: two communicating FSMs, FSM 1 and FSM 2]
Concurrent FSMs example - traffic lights
• two traffic light FSMs A1 and A2, each with S = {r, y1, y2, g}
• E = {F1, F2}
– F1: vehicle approaches light 1
– F2: vehicle approaches light 2
• A1 leaves r on “wait for 5s and F1 and A2.r”
• A2 leaves r on “wait for 5s and F2 and not F1 and A1.r”
• remaining transitions in both FSMs: “wait for 2s”, “wait for 20s”, “wait for 5s”
[Figure: state graphs of A1 and A2 over the states r, y1, g, y2]
traffic light implementation
• SW : n instruction cycles + signal communication
• HW: 1 to n clock cycles + signal communication
FSM transition time implementation dependent
non-deterministic behavior
Concurrent FSM cooperation
• reachability for asynchronous FSM
• non-deterministic behavior: behavior is not only dependent on FSM input and state
• illegal behavior
• must extend behavior to exclude illegal states
– semaphore, monitor, ...
[Figure: reachability graph of the A1A2 product states (rr, y1r, ry1, gr, rg, y2r, ry2, y1y1, ...) under inputs F1 and F2, with non-deterministic branches and illegal states marked]
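The reachability argument can be reproduced by exhaustive state exploration. The following Python sketch is a strongly simplified model of the traffic light example: the timed waits are abstracted away, and the asynchronous case is modeled by letting each FSM sample the inputs at a different time and switch one at a time. Both simplifications, and all function names, are assumptions of this example. A product state counts as illegal here if neither light is red.

```python
from itertools import product

NEXT = {"y1": "g", "g": "y2", "y2": "r"}          # timed transitions, delays abstracted
INPUTS = list(product([False, True], repeat=2))   # all values of (F1, F2)

def step_a1(own, other, f1, f2):
    if own == "r":
        return "y1" if (f1 and other == "r") else "r"
    return NEXT[own]

def step_a2(own, other, f1, f2):
    if own == "r":
        return "y1" if (f2 and not f1 and other == "r") else "r"
    return NEXT[own]

def reachable(synchronous):
    """Explore all product states (A1, A2) reachable from (r, r)."""
    seen, todo = {("r", "r")}, [("r", "r")]
    while todo:
        s1, s2 = todo.pop()
        for in1, in2 in product(INPUTS, INPUTS):
            if synchronous and in1 != in2:
                continue           # one global snapshot of the inputs per step
            n1 = step_a1(s1, s2, *in1)
            n2 = step_a2(s2, s1, *in2)
            if synchronous:
                successors = {(n1, n2)}
            else:                  # either FSM may also lag behind
                successors = {(n1, n2), (n1, s2), (s1, n2)}
            for nxt in successors - seen:
                seen.add(nxt)
                todo.append(nxt)
    return seen

def illegal(states):
    return {s for s in states if "r" not in s}

print(sorted(illegal(reachable(synchronous=True))))
print(sorted(illegal(reachable(synchronous=False))))
```

Under these assumptions the synchronous exploration reaches only states in which at least one light is red, while the asynchronous exploration reaches states such as (y1, y1), which is the reason a semaphore or monitor is needed.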
Behavior extension using monitor AM
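• monitor AM with states idle, A1ack, A2ack; transitions triggered by A1.req, A2.req, A1.r, A2.r
• A1 and A2 are each extended by a state req
– A1 issues its request on “wait for 5s and F1 and AM.idle” and proceeds on AM.A1ack
– A2 issues its request on “wait for 5s and F2 and AM.idle” and proceeds on AM.A2ack
• remaining transitions in both FSMs: “wait for 2s”, “wait for 20s”, “wait for 5s”
[Figure: state graphs of the monitor AM and of the extended FSMs A1 and A2 (states r, req, y1, g, y2)]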
Synchronous FSM
• state transition in t
deterministic behavior
– simplifies simulation and verification
– implementation t
• implementation strongly constrained
• must introduce global synchronization or guarantee target architecture timing
Traffic light with synchronous FSM
• FSMs A1 and A2 as in the concurrent FSM example: A1 leaves r on “wait for 5s and F1 and A2.r”, A2 leaves r on “wait for 5s and F2 and not F1 and A1.r”
• reachability for synchronous FSM
[Figure: reachability graph of the A1A2 product states rr, y1r, ry1, gr, rg, y2r, ry2; rr is left on F1 or on F2 and not F1]
Synchronous reactive systems
• FSM processes communicate via events with tev
• process is activated by any input event (“OR”)
• process reacts instantaneously
• output data are immediately available to all other processes
• timing is introduced as time events
[Figure: processes P1 ... P5 communicating via events, including a time event “after t1: event”]
Synchronous reactive systems - 2
• properties relevant to implementation
– better control of system response times than in system with buffering
– exact time events + instantaneous signaling and execution
defines total order of events and process executions (synchronous FSMs)
system determinate (if processes are determinate)
– total order of process execution and exact timing cause design space limitation
• impact depends on target architecture and system requirements
• high importance for computation intensive tasks and tight timing requirements
Synchronous reactive systems - 3
• hierarchical extensions for complex systems
• language examples: ESTEREL, STATECHARTs (UML), ...
[Figure: stopwatch example (STATECHART) with hierarchical states stopwatch, zero, disp (reg, lap), run (on, off), chime, time; transitions labeled A, B, C, D, H and conditions [in(run.on)], [in(run.off)]]
Asynchronous communicating processes
• example: SDL (System description language)
– locally controlled processes with buffered communication
– processes control buffer function
– complex semantics with non-determinism
– used in telecommunication
– example: Telelogic Tau
Periodically activated processes
• example SIMULINK (The MathWorks)
• processes B1, ..., B4; input and output ports (i1, o1, o2)
• activation period ts
• communication using shared variables
– destructive write, non destructive read
potential data „loss“
[Figure: processes B1 (ts = 1), B2 (ts = 2), B3 (ts = 3), B4 (ts = 4) between input port i1 and output ports o1, o2]
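The potential data “loss” follows directly from the register semantics when writer and reader run at different periods. The sketch below is a minimal Python simulation, not Simulink code; the function name, the integer time base, and the rule that the writer runs before the reader at the same instant are assumptions of this example.

```python
def simulate(writer_period, reader_period, horizon):
    """Return the values the reader observes; the writer emits 0, 1, 2, ..."""
    register = None       # shared variable
    produced = 0
    observed = []
    for t in range(horizon):
        if t % writer_period == 0:
            register = produced      # destructive write: the old value is overwritten
            produced += 1
        if t % reader_period == 0:
            observed.append(register)   # non-destructive read: the value stays in place
    return observed

print(simulate(writer_period=1, reader_period=3, horizon=9))   # fast writer, slow reader
print(simulate(writer_period=3, reader_period=1, horizon=6))   # slow writer, fast reader
```

With a fast writer the reader sees only every third value and the others are lost; with a fast reader no value is lost, but each one is read several times.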
Other semantics
• Petri nets
– rich class of related semantics
• CFSM (VCC)
• environment: continuous time models
Combining models of computation
• co-simulation approach
• compositional approach
[Figure: system composed of subsystems 1 to 3 and an IP block, specified in different languages (Simulink, UML, input language 2)]
Co-simulation approach
• individual implementation with local optimization and low-level communication synthesis
• integration network - shallow integration
– co-simulation backplane
– common communication protocol for co-synthesis
[Figure: subsystems 1 ... n (languages 1 ... n) and the environment (language x), each subsystem with its own optimization and co-design & validation flow, connected by the integration network]
Co-Simulation Strategies
• one simulator with „foreign language interface“ to other models
– examples: VHDL simulator
• centralized event scheduling and timing
– simulator backbone (timing architecture dependent - part 2)
• decentralized, hierarchical event scheduling
– example: PTOLEMY
PTOLEMY co-simulation
• hierarchy of domains communicating through „event horizons“
• block portholes provide: initialize (), receive data (), send data ()
[Figure: two domains (subsystem 1 ... domain n), each with its own scheduler; block portholes connect the domains across the event horizon; event schedule synchronization between the schedulers]
Integration network limitations
• semantics exploitation in implementation very limited
• global constraints, example: timing constraints across subsystems
[Figure: path from bus interface through process network to bus interface, with end-to-end constraint t max]
• memory system optimization: buffer optimization under changing system states
• IP HW and SW modules
• shared (heterogeneous) target platform and predefined RTOS
• often many suppliers; particular problem: no details known
Compositional approaches
• PCC and Stateflow
• PTOLEMY II and Cocentric
• SPI
Process coordination calculus - PCC
• process network with 2 process types (Grötker et al.)
– data driven processes: may be activated if sufficient token (SDF)
– event driven processes: activated at any input event (FSM)
– event and stream edges
[Figure: SDF processes and an FSM process connected by stream edges and event edges (queues)]
• execution-order-dependent behavior
– order of data process execution determines event process behavior
– event arrival times influence data process behavior
PCC - 2
• approach: use additional scheduling constraints to control network behavior
design space limitation
• commercial language with 2 process types: Stateflow (SIMULINK)
*charts [Girault, Lee, Lee, 97]
• hierarchical process network (PTOLEMY II)
• refinement of processes
– single model per process node
– different models across hierarchy
– rules for embedding process nodes with different semantics
• example: requires termination of minimum cycle in DF - process nodes
– hierarchy order: FSM - X1 - FSM - X2 - FSM - ...,
where Xi : SDF, HDF, SR, DE (discrete event)
*charts - example
[Figure: hierarchical *charts model of a game with player 1 and player 2. Top-level FSM with states “game off” and “game on” (transitions coin/blueLt, exit/redLt, error/flashTilt,redLt; signals coin, ready, go, stop, time, blueLt, yellowLt, greenLt, redLt, flashLt). Beneath it an SR model containing FSMs with states idle, wait, waitgo, waitstop, end (transitions such as start/yellowLt, go timeout/end, ready ^ ¬ timeout/start, end/greenLt, stop/exit, stop/error, timeout/error). The timeout signal is derived from time by an SDF subgraph (add, delay, constant, compare)]
Hierarchical process networks - 2
• commercial tool: CoCentric System Studio (Synopsys)
– hierarchical combination of data flow process network and synchronous reactive model
– many additional constructs, such as “weak” and “strong” process termination, gated processes, ...
PCC and *charts
• both models can be used as a general representation
– for simulation (executable nodes)
– for optimization
• low-level coordination language used
• both require adherence to predefined model(s) of computation
– legacy code, 3rd party SW, partially documented system parts?
Single model with intervals - SPI
• SPI - System Property Intervals (www.spi-process.org)
• abstract model with intervals [Ziegenbein et al.]
• process communication via channels
– FIFO buffers (C) destructive read, non-destructive write
– registers (R) destructive write, non-destructive read
• system properties annotated as intervals
– communication, timing, constraints
covering systems with conditional or unknown behavior
• virtual processes and channels to model coordination
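Annotating a property as an interval means carrying a lower and an upper bound through every calculation instead of a single value. The sketch below is a minimal Python illustration of that idea only; the `Interval` class and its methods are hypothetical and are not taken from the SPI model definition.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Interval:
    """Closed interval [lo, hi] bounding an uncertain system property."""
    lo: int
    hi: int

    def __add__(self, other):
        # bounds of a sum: add lower bounds and upper bounds separately
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def times(self, executions):
        """Bounds accumulated over a number of executions (executions >= 0)."""
        return Interval(self.lo * executions, self.hi * executions)

    def contains(self, value):
        return self.lo <= value <= self.hi

# A process that sends 0 or 1 token per execution, depending on a condition
# that is unknown at modeling time.
per_execution = Interval(0, 1)
after_16 = per_execution.times(16)
print(after_16, after_16.contains(5))
```

A conditional send of this kind cannot be expressed with a fixed SDF rate, but the interval [0, 16] still bounds the buffer space and load that an implementation has to cover.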
Example - remote motor control
• P1: bus interface control
• P2: bus message processing
    try_receive (message) from P1;
    if address(message) = MyAddress then
        value = decode(message);
        send (value) to P3;
    end if;
• P3: motor control loop
• ports: I1: bus signal, O1: error signal, I3: sensor signal, O3: motor control signal
[Figure: process network P1, P2, P3 with global parameter, element D, an SDF part, timing annotations 16 t1, t1, t3, and latency constraint < t lat]
Remote motor control - SPI model
[Figure: SPI model with processes Pbus, P1, P2, Perror, Ptime, P31, P32, P33, Pmotor, Psensor and channels C1 to C20; edges annotated with token rates (1, 16) and rate intervals ([0,1], [0,16]); initial tokens dinit (1, 16, 151); latency constraints LC = [t1, t1] and LC = [t3, t3]; SDF part; periodic activation with deadline = end of period; domain coupling]
Part 1 - Conclusion
• process networks are standard in embedded systems application languages
• different models of computation support exploitation of subsystem properties
• multi-language designs with reuse are becoming standard
• language combination with co-simulation or compositional approach
• compositional approach more complex but exploits global system properties for implementation (research)
Literature
• Overview
– G. DeMicheli, R. Ernst, W. Wolf. Readings in Hardware/Software Co-design. Morgan Kaufmann Publishers, 2002. Collection of papers on modeling, simulation and implementation.
• Compiler
– A.V. Aho, R. Sethi, J.D. Ullman. Compilers: Principles, Techniques and Tools. Addison-Wesley, Reading, 88.
– S. Muchnick. Advanced Compiler Design and Implementation. Morgan Kaufmann Publishers, 97.
• Synthesis
– G. DeMicheli. Synthesis and Optimization of Digital Circuits. McGraw-Hill, 94.
Literature - 2
• Models of Computation
– E. A. Lee, Th. M. Parks: Dataflow Process Networks, Proceedings of the IEEE, Vol. 83, No. 5, pp. 773-799, May 1995.
– N. Halbwachs: Synchronous Programming of Reactive Systems, Kluwer Academic Publishers, 1993.
– D. Harel: The STATEMATE Semantics of StateCharts, ACM Transactions on Software Engineering and Methodology, 5(4), pp. 293-333, October 1996.
– ITU-T: Recommendation Z.100. CCITT specification and description language SDL, 1993.
– A. Girault, B. Lee, E. A. Lee: A Preliminary Study of Hierarchical Finite State Machines with Multiple Concurrency Models, Technical Report UCB/ERL M97/57, 1997.
– T. Grötker, R. Schoenen, H. Meyr: PCC: A Modeling Technique for Mixed Control/Data Flow Systems, Proceedings of ED&TC '97, pp. 482-486, 1997.
Literature - 3
• Models of Computation - cont'd
– S. A. Edwards. Languages for Digital Embedded Systems. Kluwer Academic Publishers, 2000.
– SPI papers: www.spi-project.org