
Public

FP7-ICT-2009-4 (247999) COMPLEX

COdesign and power Management in PLatform-based design space EXploration

Project Duration 2009-12-01 – 2012-11-30 Type IP

WP no. Deliverable no. Lead participant

WP1 D1.2.1 POLIMI

Definition of application, stimuli and platform specification, and definition of tool interfaces

Prepared by Gianluca Palermo (POLIMI), Carlo Brandolese (POLIMI), Francisco

Ferrero (GMV), Fernando H. Casanueva (UC), Gunnar Schomaker

(OFFIS), Claus Brunzema (OFFIS), Kim Grüttner (OFFIS), Kai Hylla

(OFFIS), Bart Vanthournout (COWARE), Davide Quaglia (EDALAB),

Luciano Lavago (POLITO), Massimo Poncino (POLITO), Emanuel

Vaumorin (MDS), Chantal Couvreur (IMEC), Saif Ali Butt (CV)

Issued by OFFIS

Document Number/Rev. COMPLEX/PoliMi/R/D1.2.1/1.1

Classification COMPLEX Public

Submission Date 2010-10-15

Due Date 2010-08-31

Project co-funded by the European Commission within the Seventh Framework Programme (2007-2013)

© Copyright 2010 OFFIS e.V., STMicroelectronics srl., STMicroelectronics Beijing R&D Inc, Thales

Communications SA, GMV Aerospace and Defence SA, CoWare NV, ChipVision Design Systems AG,

EDALab srl, Magillem Design Services SAS, Politecnico di Milano, Universidad de Cantabria, Politecnico di

Torino, Interuniversitair Micro-Electronica Centrum vzw, European Electronic Chips & Systems design

Initiative.

This document may be copied freely for use in the public domain. Sections of it may be copied provided

that acknowledgement is given of this original work. No responsibility is assumed by COMPLEX or its

members for any application or design, nor for any infringements of patents or rights of others which may result

from the use of this document.


History of Changes

ED. REV. DATE PAGES REASON FOR CHANGES

GP 1.0 2010-10-12 153 Complete version of the Deliverable.

KG 1.1 2010-10-15 152 Final review and publication




Contents

1 Scope of this document
2 COMPLEX Design Flow
2.1 Introduction
2.2 MDA Design Entry
2.2.1 UML/MARTE Design Entry
2.2.2 Matlab/Stateflow Design Entry
2.3 Estimation and Model Generation
2.3.1 Task separation / Test bench generation
2.3.2 Source Analysis and Augmented Code Generation (SW tasks)
2.3.3 Source Analysis and Augmented Code Generation (HW tasks)
2.3.4 Block annotated C++ (BAC++)
2.3.5 Virtual System Generator
2.3.6 Global resource manager (Pre-optimized Power Controller)
2.3.7 Summary: Model Estimation and Generation Flow
2.4 Simulation
2.4.1 Goals
2.4.2 Requirements
2.5 Exploration and Optimization
2.5.1 Simulation Traces and Analysis
2.5.2 Design Space Exploration
2.5.3 DSE-XML Interface
3 Application of COMPLEX Design Flow to Use Cases
3.1 USE CASE 1 – Tool chain Description
3.1.1 Description of the use-case
3.1.2 Flow Coverage
3.2.1 Description of the use-case
3.2.2 Flow Coverage
3.3.1 Short description of the use case
3.3.2 Definition of the design space
3.3.3 Description of the coverage
4 COMPLEX Tool-set Overview
4.1 MOST Tool Overview
4.2 SCOPE+ Tool Overview
4.3 UML/MARTE Design Entry Tool Overview
4.4 SWAT: SW Estimation Tool Overview
4.5 Virtual Platform Tool Overview
4.6 PowerOpt Tool Overview
4.7 HIFSuite Tool Overview
4.8 Memories Modelling, Characterization and Optimization (MCO) Tool Overview
4.9 IPXACT Tool-Chain Overview
4.10 SMOG Tool Overview
4.11 IMEC Global Resource Manager Tool Overview
4.12 SystemC Network Simulation Library (SCNSL) Overview
5 Summary
6 References
A. COMPLEX Flow and Project Activities Overview
B. Tool-set Description
B.1. MOST Tool Description
B.2. SCOPE+ Tool Description
B.3. UML/MARTE Design Entry Tool Description
B.4. SWAT: SW Estimation Tool Description
B.5. Virtual Platform Tool Description
B.6. PowerOpt Tool Description
B.7. HIFSuite Tool Description
B.8. Memories Modelling, Characterization and Optimization Tool Description
B.9. IPXACT Tool-Chain Description
B.10. SMOG Tool Description
B.11. IMEC Global Resource Manager (GRM) Tool Description
B.12. SystemC Network Simulation Library (SCNSL) Description




1 Scope of this document

This deliverable is the result of Task T1.2 - System and tool interface specification (Start:
M1 - End: M9). The task is led by PoliMi, and the participants are PoliMi, CoWare,
CV, EDALab, GMV, MDS, UC, OFFIS, Polito and IMEC.

This deliverable documents the specification of the COMPLEX design approach. It may be
updated to describe important changes to the approach that arise during the project. To this
end we consider it a “living” document that will be publicly available. The deliverable is
intended as a reference document on the design flow and tools for the subsequent activities
and deliverables. For this reason, the COMPLEX consortium plans to release an update of this
document at M18 to cover possible changes in the design flow.

The COMPLEX flow follows a platform-based design approach where the functionality and

architecture view of the system are separated.

The first goal of this deliverable is to describe the requirements of the main interfaces in the

COMPLEX design flow to enable interoperability among all involved partners. As described

in the Description of Work [1] the following requirements are focused on:

“Application” and stimuli description: Defines the functional view of the system, including
the definition of the initial functional and non-functional specification methodology using
MARTE. Matlab/Stateflow is also required as an additional system modelling input
incorporating dynamic system behaviour.

Platform description: Defines the architectural view of the system. It includes the definition
of the MARTE HW resource modelling methodology supporting the specification of the
execution platform. From this initial architectural specification, the corresponding IP-XACT
description will be generated.

Model generation and cost-function definition: Defines the steps needed to build the system
model starting from the application and platform description. Model generation and cost-function
definition should take into account the design space exploration feedback loop, which can be
run automatically or driven manually by the designer.

Tool interface identification: Identifies the tool interfaces required for a shared
methodology that ensures the interoperability of the different EDA tools within the design
process work-flow. The identification should take into account the specific needs of each
COMPLEX use case defined in D1.1.1 - Definition of requirements, industrial use-cases and
evaluation strategy.

The document is composed of three main parts. The first describes the COMPLEX design flow,
presenting each step in terms of goals and requirements (see Chapter 2). The second presents
the use-case specific mapping of the COMPLEX flow, clarifying the interaction among the tools
(see Chapter 3). The third briefly describes all the tools in the COMPLEX flow in terms of
overview (see Chapter 4), tool/user interface and portability (see Appendix B).




2 COMPLEX Design Flow

2.1 Introduction

The overall COMPLEX Design Flow as defined in the Description of Work [1] is split into a

variety of different intermediate steps as depicted in Figure 1.

Figure 1: Complex Design Flow

MDA design entry

COMPLEX provides an MDA (Model Driven Architecture) design entry (a), using the

MARTE UML profile as well as the Stateflow and Simulink tools. The platform independent

model (PIM) specifies the application or behaviour model of the system. The use-case

scenario of the PIM is defined using UML or Matlab/Stateflow (b). The system specification

model describes the system functionality and synchronisation points through abstract

communication channels (e.g. handshake) and defines some kind of communication

scheduling. The platform description model (PDM) (d) describes the connection of allocated

execution and memory resources. The user-constrained HW/SW separation and mapping (c)

describes the binding of the processes in the platform independent model to execution units

and memories of the PDM.
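The binding described above can be pictured as a lookup from PIM processes to PDM resources. The following is a minimal sketch in Python, not the COMPLEX mapping format: the class `PlatformModel`, the function `check_mapping`, and all process and resource names are assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class PlatformModel:
    """Hypothetical stand-in for a PDM: named execution units and memories."""
    execution_units: set = field(default_factory=set)
    memories: set = field(default_factory=set)


def check_mapping(processes, platform, mapping):
    """Return a list of problems found in a process-to-resource mapping.

    `mapping` maps each process name to an (execution_unit, memory) pair.
    """
    problems = []
    for proc in processes:
        if proc not in mapping:
            problems.append(f"{proc}: not mapped")
            continue
        unit, memory = mapping[proc]
        if unit not in platform.execution_units:
            problems.append(f"{proc}: unknown execution unit {unit!r}")
        if memory not in platform.memories:
            problems.append(f"{proc}: unknown memory {memory!r}")
    return problems


# Illustrative names only; none of them come from the COMPLEX use cases.
pdm = PlatformModel(execution_units={"cpu0", "hw_acc0"}, memories={"sram0"})
pim_processes = ["sensor_read", "filter", "radio_tx"]
mapping = {
    "sensor_read": ("cpu0", "sram0"),
    "filter": ("hw_acc0", "sram0"),
}
problems = check_mapping(pim_processes, pdm, mapping)
print(problems)
```

A check of this kind would flag processes that are left unbound or bound to resources the platform model does not allocate, before the mapping is handed to later steps of the flow.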

[Figure 1, diagram not reproduced: flow blocks (a) to (t), grouped into the phases MDA design entry, executable specification, estimation & model generation, simulation, and exploration & optimization. The design space definition (t) lists: functional reimplementation; hardware/software partitioning/separation; runtime management; embedded software/compiler optimizations; IP platform selection & configuration; memory configuration/management (static & dynamic); custom hardware synthesis constraints.]




Executable Specification

As the PIM is a pure specification model, for functional evaluation it is either simulated

directly using the Mathworks tools, in order to analyse and optimise the network

performance, or converted to an executable SystemC model (e) for the detailed platform

design. This model contains functional descriptions of tasks that will run as user-defined

hardware like ASICs, as software on a processor, or are provided as IP-components from

third-party vendors. The latter ones are required just for functional simulation and are not

being modified during the subsequent flow. In order to execute the SystemC model specified

in (e), it needs to be stimulated. The stimuli (f) might originate from user interaction or

communication with other components that are part of the environment. External system

stimuli are derived from the MARTE use-case specifications or from the environment model

in Stateflow/Simulink (b). The IP-XACT platform specification (g) consists of blocks

(components) with interconnected interfaces. Each block represents an IP component that can

be configured and characterized by the use of meta-data annotations. IP-XACT allows

different views on each IP component. To enable high-speed TLM simulation, a view with

associated SystemC and OSCI TLM2 descriptions can be used. The IP component meta-data

(area, delay, power, etc.) can be described by a non-functional view. The IP-XACT

description is generated from the MARTE PDM (d). From the IP-XACT platform

specification a structural top-level view of the platform architecture is assembled. It consists

of processing elements, dedicated hardware, memories, and interconnects. COMPLEX does

not use interconnected RTL components but virtual platform IP components. Their behaviour

is modelled in SystemC and their communication interfaces are OSCI TLM2 compliant.

Consequently all interconnection models used in COMPLEX are also TLM2 compatible.
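The idea of several views on one IP component can be sketched as follows. This is an illustrative in-memory picture only, not the IP-XACT XML schema: the class `IPComponent`, the view keys `tlm` and `non_functional`, the model name `bus0_tlm2`, and all numeric values are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class IPComponent:
    """Hypothetical in-memory picture of one platform block and its views."""
    name: str
    views: dict = field(default_factory=dict)

    def view(self, kind):
        """Return the requested view, or fail loudly if the block lacks it."""
        try:
            return self.views[kind]
        except KeyError:
            raise KeyError(f"{self.name} has no {kind!r} view") from None


bus = IPComponent(
    name="bus0",
    views={
        # View used for fast TLM simulation (the model name is made up).
        "tlm": {"language": "SystemC", "model": "bus0_tlm2"},
        # Non-functional view carrying meta-data; the numbers are placeholders.
        "non_functional": {"area_mm2": 0.5, "delay_ns": 2.0, "power_mw": 1.2},
    },
)

sim_model = bus.view("tlm")["model"]
power_mw = bus.view("non_functional")["power_mw"]
print(sim_model, power_mw)
```

Keeping the simulation model and the meta-data in separate views lets a simulator and an estimation tool each read only the part they need.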

Estimation & model generation

Step (h) collects all information from the executable specification phase, parses and

analyses/elaborates the SystemC specification model, reads the mapping information, and the

IP-XACT platform specification meta-data. All this information is written to an internal

design representation. From that internal representation the behaviour description of each

component can be extracted and forwarded to the domain specific analysis and synthesis

tools. Behaviours mapped to dedicated HW resources (HW tasks) are forwarded to existing

source analysis and behavioural synthesis tools (i). Behaviours mapped to SW resources, such as

general purpose processors or digital signal processors, are forwarded to existing source

analysis and cross-compilation tools (j). Additionally test benches with activation traces and

constraints for the behavioural synthesis and cross-compilation are forwarded to these third

party tools. HW tasks which shall be implemented in custom hardware enter block (i) together

with typical input stimuli (input data and active/idle statistics) as well as synthesis constraints

(such as technology node, threshold voltage, available area, etc.). Each task is then analysed

and fully synthesised (scheduling, binding, allocation, implementation of power-management

methodology, controller generation, floor-planning, etc.) down to RT-level using third party

behavioural synthesis tools (ChipVision's PowerOpt is used in the project). Finally, code in

BAC++ (Block Annotated C++) is generated. The BAC++ code is clustered in such a way that

control structures with variable run-time or power, as well as bus requests, are separated. Delay, static

power, dynamic power, and variation information are instrumented to the behavioural C++

code so that a power and delay aware simulation can be performed. The SW task's code will

be analyzed and cross-compiled by existing third party tools (j). During analysis metrics like

power consumption and worst-case execution time are estimated and a model is generated

from that data. As in the behavioural synthesis in (i), code including BAC++

annotations is generated by an augmented cross-compiler back-end. Block (k) represents




SystemC/TLM2 performance and power characterized platform IP components like HW

accelerators, communication resources (bus, point-to-point channel), and memories. The

virtual system generator (l) reads the BAC++ from (i) and (j) and takes the instantiated virtual

IP component models from (k). The virtual system generator assembles these different input

blocks to an executable system model. Therefore, the instrumented code coming from (i)

and (j) needs to be connected to the remaining virtual platform IP components, using the

OSCI TLM2 API. The output of this generator is a complete performance and power aware

system model with up to basic block accuracy.
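The principle behind block-level annotation can be illustrated with a small sketch: each basic block carries a delay and a power figure, and the simulation accumulates time and energy as blocks execute. The sketch is written in Python for brevity and does not reproduce the BAC++ format; the class `BlockAnnotation`, the function `simulate`, the block names, and all numbers are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockAnnotation:
    """Hypothetical per-basic-block costs attached by an estimation tool."""
    delay_ns: float          # execution delay of the block
    dynamic_power_mw: float  # dynamic power drawn while the block runs


def simulate(block_trace, annotations, static_power_mw):
    """Accumulate time (ns) and energy (pJ) over an executed block sequence."""
    time_ns = 0.0
    energy_pj = 0.0
    for block_id in block_trace:
        cost = annotations[block_id]
        time_ns += cost.delay_ns
        # mW * ns = pJ; static power is paid for as long as the block runs.
        energy_pj += (cost.dynamic_power_mw + static_power_mw) * cost.delay_ns
    return time_ns, energy_pj


# Placeholder annotations; real values would come from synthesis or
# cross-compilation analysis.
annotations = {
    "entry": BlockAnnotation(delay_ns=10.0, dynamic_power_mw=3.0),
    "loop": BlockAnnotation(delay_ns=5.0, dynamic_power_mw=4.0),
    "exit": BlockAnnotation(delay_ns=10.0, dynamic_power_mw=3.0),
}
trace = ["entry", "loop", "loop", "loop", "exit"]
time_ns, energy_pj = simulate(trace, annotations, static_power_mw=1.0)
print(time_ns, energy_pj)
```

Because the costs are attached to blocks rather than to individual instructions, the accuracy of such a model is bounded by the basic block granularity mentioned above.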

Simulation

The custom hardware synthesized in (i) and parts of the virtual platform IP components

specified in (k) provide dynamic power management (DPM) abilities. For the first iteration of

the entire Design-Space Exploration (DSE), an initial power controller (m) will be

automatically generated, controlling each component's power state based on the transition

cost and the activity distribution. Afterwards, the DPM policies can be automatically refined

or modified by the user. The generated SystemC model (n) can be compiled and directly

executed on the host machine of the designer. The instrumented code (BAC++) allows different

simulation traces to be written when the model is driven by the system input stimuli (f) specified

through use-cases (b). The granularity of tracing information can be parameterized to the

needs of the analysis and exploration step. It is expected that full tracing will slow down the

entire simulation significantly, which makes an appropriate choice of granularity extremely

important. The tracing granularity can be chosen for each component independently and can

be refined hierarchically. This allows finer-grained monitoring of the behaviour of components

of interest and coarser-grained monitoring of the others.
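One way to picture a power-state decision based on transition cost and activity is the textbook break-even rule: a component should enter a low-power state only when the idle period is long enough to recover the energy spent on the transition. The sketch below shows that rule only; it is not the controller generated in COMPLEX, it assumes idle lengths are known from activity statistics, and all function names and numbers are assumptions.

```python
def break_even_time_us(transition_energy_uj, idle_power_mw, sleep_power_mw):
    """Idle length above which entering the sleep state saves energy."""
    saved_power_mw = idle_power_mw - sleep_power_mw
    if saved_power_mw <= 0:
        # Sleeping saves nothing, so no idle period is ever long enough.
        return float("inf")
    # uJ / mW = ms, so multiply by 1000 to obtain microseconds.
    return transition_energy_uj / saved_power_mw * 1000.0


def choose_states(idle_periods_us, break_even_us, transition_latency_us):
    """Pick 'sleep' or 'idle' for each idle period of known length."""
    threshold_us = max(break_even_us, transition_latency_us)
    return ["sleep" if t > threshold_us else "idle" for t in idle_periods_us]


# Placeholder characterisation data for one component.
t_be = break_even_time_us(
    transition_energy_uj=2.0, idle_power_mw=10.0, sleep_power_mw=2.0
)
states = choose_states([100.0, 300.0, 1000.0], t_be, transition_latency_us=50.0)
print(t_be, states)
```

With these placeholder values only the two longer idle periods justify a transition; a user-refined policy could replace the fixed threshold with a predictive one.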

In case of networked embedded systems, e.g., wireless sensor networks, COMPLEX will also

address the simulation of communications among different, distributed embedded systems,

since this aspect is significant in the assessment of the performance of the design solutions

and therefore in the design-space exploration and optimisation.

Exploration & optimisation

The simulation trace (o) contains timing, dynamic and static power information (with respect

to process variation) of each platform component, related to the executed workload as well as

other relevant metrics like memory usage. User-defined module, port, process, and function

names from the system specification (a, e) are preserved to ensure traceability to the input

model of the executable specification. Simulation traces of each platform part are read into an

analysis tool (p). Main tasks of this tool are the extraction of activity- and power-relevant data

of the different platform parts and the pre-processing of this data for either graphical

presentation to the designer (q) or as input for an automatic exploration and optimization

tool (r). The visualization engine (q) will take power and activity-data prepared by the

analysis tool (p) and present this information to the designer. One possible visualization type

is a platform power-breakdown in which the power contribution of all platform parts can be

inspected. Platform evaluation and optimisation (r) is two-fold. On the one hand the user can

constrain the overall platform selection, deduce further constraints on HW/SW separation, or

identify power-consuming implementations and replace them with more power-efficient ones.

On the other hand, the automatic exploration and optimization tool is based on multi-objective

optimization heuristics to efficiently navigate the overall design space defined in (t). Once

the design space definition has been obtained, the exploration tool starts an optimization loop

interacting with the rest of the COMPLEX design flow to find the optimal system

configuration in terms of a user constrained target function. In the optimization loop, the DSE

framework generates a new design space instance (s) to be automatically evaluated by the rest




of the COMPLEX design flow, which returns the new power and performance metrics (as scalar

values) from (p). All information gained from the platform analysis and optimization

phase will serve as input and feedback (s) to the next iteration of the platform refinement flow

and thus will lead to an optimized executable specification of the overall system.
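The optimization loop described above can be sketched as: generate a design space instance, evaluate it, and keep the non-dominated results. In the sketch below, exhaustive enumeration stands in for the multi-objective heuristics, and `evaluate` is a placeholder for a full run of the flow; the parameter names `freq_mhz` and `cache_kb` and the cost formulas are invented so that the example runs.

```python
from itertools import product


def dominates(a, b):
    """True if metrics `a` are no worse than `b` everywhere and better somewhere.

    All metrics are minimised.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(evaluated):
    """Keep the (instance, metrics) pairs not dominated by any other pair."""
    return [
        (inst, m)
        for inst, m in evaluated
        if not any(dominates(other, m) for _, other in evaluated)
    ]


def evaluate(instance):
    """Placeholder for one run of the flow; returns (power_mw, latency_ms).

    The formulas are invented; real metrics would come from simulation and
    trace analysis.
    """
    freq_mhz, cache_kb = instance
    power_mw = 0.5 * freq_mhz + 2.0 * cache_kb
    latency_ms = 1000.0 / freq_mhz + 64.0 / cache_kb
    return power_mw, latency_ms


# Hypothetical design space definition with two parameters.
design_space = {"freq_mhz": [100, 200], "cache_kb": [16, 64]}
instances = list(product(design_space["freq_mhz"], design_space["cache_kb"]))
evaluated = [(inst, evaluate(inst)) for inst in instances]
front = pareto_front(evaluated)
print([inst for inst, _ in front])
```

Here the instance with the slow clock and the large cache is dropped because another instance is better in both power and latency; the remaining instances are trade-offs among which the user-constrained target function would choose.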

To ensure a seamless interaction between these steps, specification formats, model descriptions,

and tool interfaces have to be defined carefully. In this document, the

continuous evolution of the required definitions is documented as they are refined during the

course of the COMPLEX project.

The following sections are organised according to the different phases of the COMPLEX

Design Flow. Every connection between different design/tool tasks in the flow is covered and

the corresponding requirements for the tool interfaces, model descriptions etc. are discussed.




2.2 MDA Design Entry

The COMPLEX project supports two different model-driven capture mechanisms for modelling
embedded systems: the UML/MARTE design branch and the Matlab/Stateflow design branch.
The selection of the design branch depends on use-case specific needs and company
choices. However, UML/MARTE is more appropriate for large projects and whenever design
space exploration needs to be performed. Figure 2 depicts the two design branches in COMPLEX:

Figure 2: Model-Driven Architecture design entry

The UML/MARTE design entry for COMPLEX covers all activities of the COMPLEX
design flow depicted in Figure 1 that relate to the MDA entry and executable
specification. Figure 3 shows the different modelling activities and the output artefacts
derived from them.

• Activities (a) and (e) (Figure 1) are covered by the UML/MARTE PIM modelling activity,
which outputs the MARTE PIM, the CFAM model (see Appendix B.3 for an additional
description of this interface) and the SystemC executable specification of the application.

• Activities (b) and (f) (Figure 1) are covered by the UML/MARTE Stimuli Definition
activity, which produces the System Input Stimuli used to exercise the SW and HW
parts of the system.

• Activities (d) and (g) (Figure 1) are covered by the UML/MARTE PDM modelling
activity, which outputs the MARTE PDM and the IP-XACT specification of the
system platform.

• Activity (c) (Figure 1) is covered by the UML/MARTE PSM modelling activity, which
produces the MARTE PSM model and the XML files which enable the
design exploration (XML system description and XML design space).

Figure 3: UML/MARTE Design Entry

In contrast, the Matlab/Stateflow design entry for the COMPLEX project covers only a
subset of the activities of the COMPLEX design flow depicted in Figure 1 that relate to
the MDA entry and executable specification.

In particular, it covers activities (a) and (e) for the generation of the SystemC executable
specification of the application, and activities (b) and (f) for the generation of the Input
Stimuli to exercise the modelled system.




2.2.1 UML/MARTE Design Entry

2.2.1.1 Goals

The goals to be achieved by the UML/MARTE model-driven front-end can be summarized by

the following points:

Description

The definition of a COMPLEX modelling methodology based on UML/MARTE able to

describe a heterogeneous embedded system composed of SW and HW components, and

suitable to feed the simulation and design exploration processes that enable finding the

optimum architectural mapping.

The development of a COMPLEX modelling environment supporting the aforementioned
COMPLEX modelling methodology and based on the use and, if necessary, the extension or
adaptation of state-of-the-art capture tools.

The development of a COMPLEX transformation toolset enabling the generation of a

SystemC specification model from the UML/MARTE model.

2.2.1.2 Requirements

The following table provides the general requirements for the UML/MARTE modelling
methodology and environment, enumerated as RMMi.

RMM1 The modelling methodology shall be a component-based methodology.
Rationale: To enhance modularity and enable the HW/SW partitioning process.

RMM2 The description of the system functions in terms of HW and SW components, as
well as the specification of the use cases, the constraints on the HW/SW separation and
the mapping rules, shall be logically separated.
Rationale: Separation of Concerns.

RMM3 COMPLEX shall support UML-based profiles.

RMM4 The COMPLEX project shall support the UML/MARTE profile for modelling both
system functions and platform.
Rationale: Follow the MARTE profile standard.

RMM5 The COMPLEX modelling framework should support the development of UML
profiles wherever the UML/MARTE profile does not cover specific semantics needed
in the COMPLEX project.
Rationale: To yield a complete methodology in which the syntactical contributions of
COMPLEX are clearly packaged and separated, and which serves to improve MARTE.

RMM6 The definition of non-functional properties (e.g. throughput, end-to-end latencies,
robustness, survivability), formal assertions (e.g. VSL), and constraints (i.e. OCL
expressions), and their assignment to system components shall be supported.

RMM7 The modelling environment shall be based on an open-source Integrated
Development Environment.

RMM8 The methodology shall enforce the separation of concerns by means of the definition
of model views.
Rationale: The separation of concerns concept allows managing the complexity of
the system modelling by focusing the system designer on specific aspects of the
system, like data exchange, application real-time aspects or platform-related issues.

RMM9 The methodology shall enforce the traceability of system requirements to models.

RMM10 The methodology shall enforce the traceability of system requirements to text by
means of additional comments and references to the system specification.

RMM11 The COMPLEX UML/MARTE design modelling shall use open standards
whenever they exist and are suitable.

RMM12 The COMPLEX UML/MARTE design methodology shall provide an integrated
Eclipse-based environment for PIM, PDM and PSM modelling, as well as the
Stimuli definition.

RMM13 The COMPLEX UML/MARTE design framework shall support the use of
configuration management tools.

Input required | Required by tool | Input provided by
System Specification | MARTE design entry toolset | Use Case Provider

Output provided | Provided by tool | Output required by
UML/MARTE PIM | UML Model Editor | User Generated HW/SW Mapping and Task separation
SystemC executable specification of the model | PIM to SystemC generator | HW/SW Task Separation and Test Bench Generation
CFAM executable specification of the model | PIM to CFAM generator | Estimation and Model Generation (SCOPE+)
UML/MARTE PDM | UML Model Editor | User Generated HW/SW Mapping and Task separation
IP-XACT specification of the system platform | PDM to IP-XACT generator | Virtual System Generator
System Stimuli | Input Model | HW/SW Task Separation and Test Bench Generation
UML/MARTE PSM | UML Model Editor | User Generated HW/SW Mapping and Task separation
XML System description file | PSM to XMLD generator | Estimation and Model Generation (SCOPE+)
XML Design Space file | PSM to XMLDS generator | Estimation and Model Generation (SCOPE+)

2.2.1.3 MARTE PIM and System specification in SystemC

2.2.1.3.1 Goals

The goal of this step in the COMPLEX flow is to capture a Platform Independent Model

(PIM) of the application through the MARTE front-end and to generate a Concurrent

Functional Application Model (CFAM) and a SystemC specification.


2.2.1.3.2 Requirements

Several requirements are to be addressed in this design step. They are listed below for the
PIM UML/MARTE front-end, the CFAM API and the SystemC executable specification.

The following table defines the requirements applicable to the capture of the PIM within the
UML/MARTE model-driven front-end, enumerated as RMIi.

RMI1 The PIM model shall support two views, the functional view and the

communication and concurrency view.

Rationale: this requirement states the minimum set of system concerns to be

considered in the PIM model. For example: Data Structure, Interface and

Functional specification (structure of functionality and/or classes), Concurrency

structure.

RMI2 The PIM model shall follow a SW-centric approach.

Rationale: It is convenient to assume initially a SW-based implementation as the
default solution, before the DSE flow is applied. This assumption eliminates
any interpretation ambiguity for the implementation flow if no allocation is
specified.

By default, and if no further mapping information is present or considered, the

PIM model is considered as a Concurrent Functional Application Model. Thus

PIM components are actually SW components by default.

RMI3 An application component (or system component) shall support the following
types of interface definition:

Cyclical interface: it is executed periodically based on the platform clock.

Sporadic interface: it is executed with a maximum arrival time between calls.

Protected interface: its execution is blocking, i.e. only one component can call
the interface at a time.

Rationale: An application component is a system component representing the
functionality of the system, potentially concurrent, and is thus part of the
platform independent model (PIM). In a SW-centric methodology, an application
component is synonymous with a SW component.
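As an illustration of the protected interface in RMI3, the following minimal C++ sketch serialises calls to an interface with a mutex. The Counter component and its increment() method are hypothetical names, and std::mutex merely stands in for whatever mutual-exclusion service the CFAM API actually provides.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical application component; not taken from the COMPLEX deliverable.
class Counter {
public:
    // Protected interface: the lock serialises callers, so at most one
    // caller executes the body at any time.
    int increment() {
        std::lock_guard<std::mutex> guard(lock_);
        ++value_;
        assert(value_ > 0);  // invariant of this toy component
        return value_;
    }

private:
    std::mutex lock_;
    int value_ = 0;
};
```

A second component calling increment() concurrently would simply wait until the first call has returned.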

RMI4 The COMPLEX UML/MARTE PIM model shall support the association of real-

time constraints and non-functional properties to system components (e.g., through

the association to their interfaces).

The following table provides the requirements relative to the CFAM API, enumerated as
RMCi. For additional information about the CFAM model, refer to APPENDIX B.3.

RMC1 The action language for the CFAM API shall be ANSI-C.




Rationale: C/C++ is the most widespread language in embedded software

programming. Additionally, estimation background technology of COMPLEX (e.g.

SCoPE) takes this API as input.

RMC2 The CFAM model shall keep a high correlation with the UML/MARTE

description of the application.

The CFAM shall support and separate the specification of interfaces, functionality,

class structure, and concurrency structure.

It shall support non-functional attributes of system components. For instance, a
throughput constraint put on a specific interface in the UML/MARTE model
should find an almost straightforward interpretation in the CFAM API (e.g., as a
requirement associated with a function interface).

Rationale: This should simplify and accelerate the generation tool, and make
the task of the automatic generation clearer.

RMC3 Each UML/MARTE application component shall be mapped to a C/C++-based
structure that closely resembles a SW component from a SW programmer's perspective
(e.g., a class implementing and packaging the interface).

Rationale: This way, the CFAM will be easily understood by a SW programmer, a
HW designer or a system-level designer.

RMC4 The following services shall be modelled in the CFAM API:

Message queuing

Synchronization services

Board Support Package

Rationale: Better suitability for the system level. A generic API enables an
agnostic representation of any concurrent application (based either on an RTOS or
on a language that itself incorporates concurrency and synchronization facilities,
such as Ada) and even of the concurrency structure of a HW/SW implementation. It
also facilitates re-allocation of functionality from SW to HW.

RMC5 The specification methodology shall enforce the seamless re-allocation of system

components either on SW or HW components.

Rationale: To widen the exploration space.

RMC6 The specification of system components shall use the same CFAM API and action
language, regardless of their allocation to platform components. Thus a change in the
allocation of a system component shall not require any editing of it.




Rationale: This allows the CFAM API to be regarded also as a PIM or system-level
model. Recall that the COMPLEX methodology is SW-centric, so the CFAM is considered
an application model by default. It also eases moving functionality from SW to HW
both at MARTE level and at CFAM level (e.g. by changing an arrow end in a MARTE
model or changing a command in the CFAM).

The following requirements are related to the MARTE PIM model to CFAM application

model generator, enumerated as RM2Ci.

RM2C1 The COMPLEX UML/MARTE PIM to CFAM generator shall follow the system

component naming conventions.

RM2C2 Each cyclical interface shall be transformed into an active task, with its own

execution flow.

RM2C3 Sporadic and protected interfaces shall be transformed into passive tasks, which

do not have their own execution flow.
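The transformation rules for cyclical, sporadic and protected interfaces stated above amount to a simple classification. A minimal sketch, assuming hypothetical InterfaceKind and TaskKind enumerations that are not part of the COMPLEX generator:

```cpp
#include <cassert>

// Hypothetical names used only for illustration.
enum class InterfaceKind { Cyclical, Sporadic, Protected };
enum class TaskKind { Active, Passive };

// Cyclical interfaces become active tasks with their own execution flow;
// sporadic and protected interfaces become passive tasks.
constexpr TaskKind task_kind_for(InterfaceKind kind) {
    return kind == InterfaceKind::Cyclical ? TaskKind::Active
                                           : TaskKind::Passive;
}
```

A generator could apply such a function to every interface of a component to decide whether an execution flow has to be created for it.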

RM2C3 At least the following types of communication and synchronization mechanisms
shall be allowed in the application model:

FIFOs with blocking semantics

Mutexes

Message Queues

Rationale: In critical systems, the execution of component interfaces is typically
modelled with a FIFO queue, although this depends on the computational model. That is
why message queue channels are also considered.
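To make "FIFO with blocking semantics" concrete, the following sketch shows a bounded FIFO whose write blocks while the channel is full and whose read blocks while it is empty. BlockingFifo is a hypothetical class built on the C++ standard library; it is not the CFAM API, which defines its own channel services.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>

// Bounded FIFO with blocking semantics (illustrative only).
template <typename T>
class BlockingFifo {
public:
    explicit BlockingFifo(std::size_t capacity) : capacity_(capacity) {
        assert(capacity_ > 0);
    }

    // Blocks the caller while the FIFO is full.
    void write(const T& item) {
        std::unique_lock<std::mutex> lock(mutex_);
        not_full_.wait(lock, [this] { return items_.size() < capacity_; });
        items_.push_back(item);
        not_empty_.notify_one();
    }

    // Blocks the caller while the FIFO is empty.
    T read() {
        std::unique_lock<std::mutex> lock(mutex_);
        not_empty_.wait(lock, [this] { return !items_.empty(); });
        T item = items_.front();
        items_.pop_front();
        not_full_.notify_one();
        return item;
    }

    std::size_t size() {
        std::lock_guard<std::mutex> lock(mutex_);
        return items_.size();
    }

private:
    const std::size_t capacity_;
    std::mutex mutex_;
    std::condition_variable not_full_;
    std::condition_variable not_empty_;
    std::deque<T> items_;
};
```

In a concurrent application, a producer task would call write() and a consumer task read(); items are always delivered in the order in which they were written.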

RM2C4 The name of the following elements of the UML/MARTE application

components shall be preserved in the CFAM model:

Application (SW) component

Provided and Required interfaces

Port identifiers

RM2C5 At least, the following traceability information should be included as comments:

Detection of the UML/MARTE interfaces and generation of the corresponding

interfaces in the CFAM.

Detection of UML/MARTE application components and generation of the

corresponding code structure in the CFAM.

Detection of the UML/MARTE required/provided ports (types, direction) and the

generation of the corresponding code structure in the CFAM.

Requirements included in the UML/MARTE model.




The following requirements are related to the generation of the SystemC executable model

from the UML/MARTE specification, enumerated as RMSCi.

RMSC1 The transformation from the UML/MARTE model to the SystemC model shall

ensure the separation of the application and platform levels.

Rationale: This is achieved because the generation targets the CFAM and the XMLD
files, which clearly separates the PIM (or application model, since the approach is
SW-centric) from the PDM and the allocation information, which are present in the
XMLD.

RMSC2 The transformation of the system model into the executable model shall be

automatic with no user intervention (button click approach).

RMSC3 Action code (function bodies) associated to the UML/MARTE model shall be

used without user modifications (assuming this code fulfils the limitations of the

COMPLEX estimation tools: SCoPE, etc) for the generation of the SystemC

executable specification.

RMSC4 The transformation tool shall be integrated with the UML/MARTE modelling

(capture) tool.

RMSC5 If desired, the transformation tool shall be separable from and independent of
the specification (capture) tool.

RMSC6 The transformation tool shall be portable to other modelling environments.

RMSC7 COMPLEX shall support model-to-text transformation based on current Eclipse
transformation technologies (the M2M and M2T projects).

RMSC8 A homogeneous and understandable naming convention for the generation tools

shall be provided.

RMSC9 Non-functional attributes of system components shall be preserved during

generation.

RMSC10 The generator will place, as comments at the beginning of each generated file
(C/C++ comments in the CFAM case, XML comments in the XMLD and IP-XACT cases),
information about the generation tool that produced the file, including at least
its version and the date of generation.
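The header comment required by RMSC10 could be produced along the following lines. This is only a sketch: the function name, the CommentStyle enumeration and the exact wording of the header are assumptions, and the tool name, version and date are supplied by the caller.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: C/C++ comments for the CFAM, XML comments for
// the XMLD and IP-XACT files.
enum class CommentStyle { C, Xml };

std::string generation_header(CommentStyle style, const std::string& tool,
                              const std::string& version,
                              const std::string& date) {
    assert(!tool.empty());
    const std::string text =
        " Generated by " + tool + " version " + version + " on " + date + " ";
    return style == CommentStyle::C ? "/*" + text + "*/"
                                    : "<!--" + text + "-->";
}
```

For XML output the caller has to make sure that the arguments do not contain the sequence "--", which is not allowed inside an XML comment.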

RMSC11 The generator will be able to include comments with information supporting the

traceability of the generation process, including the recognition of MARTE

elements, and specifically those elements that involve some generation. It is
desirable that this be an optional feature for the user.

RMSC12 The generation will support the production of a configurable and executable
SystemC model representing a set of PSMs. It will enable the exploration tool
to obtain the estimations for a PSM by configuring the executable PSM before
simulation, without requiring re-generation (re-compilation) of the executable
PSM.

Rationale: Faster iterations along the DSE cycle.
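The idea behind RMSC12 is that platform parameters are read when the executable model starts, so a new configuration needs no recompilation. The sketch below illustrates this with a plain "key=value" text format, which is purely an assumption made for brevity (the COMPLEX flow exchanges XML files); parse_config and required are hypothetical helpers.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Parse "key=value" lines into a parameter table at simulation start-up.
std::map<std::string, double> parse_config(const std::string& text) {
    std::map<std::string, double> params;
    std::istringstream lines(text);
    std::string line;
    while (std::getline(lines, line)) {
        const std::string::size_type eq = line.find('=');
        if (eq == std::string::npos || eq == 0) continue;  // not key=value
        params[line.substr(0, eq)] = std::stod(line.substr(eq + 1));
    }
    return params;
}

// Look up a parameter that the model cannot run without.
double required(const std::map<std::string, double>& params,
                const std::string& key) {
    const auto it = params.find(key);
    assert(it != params.end());  // a missing parameter is a configuration error
    return it->second;
}
```

An exploration tool would then only rewrite the configuration text between two simulation runs, leaving the compiled model untouched.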




RMSC13 From a UML/MARTE model reflecting several PSMs, a SystemC executable

model will be generated, reflecting such PSMs and enabling the configuration

from the Exploration tool.

Rationale: Enable the correlation between the UML/MARTE model and the

SystemC executable model, while enabling platform configuration (including

exploration of new architectures) from the exploration tool, without requiring a

re-generation of the executable model.




2.2.1.4 Use cases System Input Stimuli and Test bench generation

2.2.1.4.1 Goals

The following table introduces the goals to be achieved by the UML/MARTE model-driven
front-end regarding the use case stimuli definition and test bench generation.

Description

To define a methodology to design the test bench environment in parallel with the system

model within the same modelling environment.

To define a methodology to specify test cases and their associated requirements supported by

the system-modelling environment.

To define a mechanism to associate input stimuli to system components.

2.2.1.4.2 Requirements

The following table defines the requirements applicable to the capture of the Use Case Model
within the UML/MARTE model-driven front-end, enumerated as RMUCi.

RMUC1 It shall be possible to design the test bench environment using the same modelling
environment used to develop the system model.

Rationale: Efficiency. It provides as much homogeneity as possible in the modelling of
the system and its environment. This does not prevent taking advantage of the
greater flexibility that is usually available when developing environment models,
compared to the system description.

RMUC2 It shall be possible to specify test cases and associate them to requirements.

Rationale: To enable the different types of checks/validations that have to be
performed on a system and, moreover, to capture the criteria (functionality,
performance) under which each check is considered passed (e.g., the power
consumption of a vocoder can be measured with a silence detection block enabled
and with it disabled; the power consumption requirements can differ in each case).

RMUC3 It shall be possible to define input stimuli for each test case.


Rationale: To allow the specification of input stimuli, which are usually associated
with test cases and requirements (e.g., the previously mentioned power consumption
check requires the silence detector to be enabled or disabled, depending on the test case).

RMUC4 All test cases shall be traced to existing requirements in the COMPLEX
UML/MARTE model.

RMUC5 The COMPLEX system stimuli generator shall generate traceability information

between test cases and requirements.




2.2.1.5 User-generated HW/SW Mapping and Task separation

2.2.1.5.1 Goals

The goal to be achieved by the UML/MARTE model-driven front-end regarding the
definition of HW/SW partitioning and mapping is to define a mechanism to modify the
HW/SW partitioning and mapping under user-specified constraints.

2.2.1.5.2 Requirements

The following table defines the requirements for the capture of User Constrained HW/SW

separation and Mapping applicable within the UML/MARTE model-driven front-end for the

generation of a Platform Specific Model (PSM), enumerated as RMHSi.

RMHS1 The modelling environment shall allow the specification of constraints on the
definition of the system model properties that affect the evaluation of the
system performance and power consumption (i.e. working frequency and voltage,
implementation technology of processing elements, or allocation of system
components onto different HW executive elements).

Rationale: It enables the design space exploration, as an exploration of the
performance yielded by the different PSMs, requiring neither changes to the
UML/MARTE model nor a new generation from it.

RMHS2 The modelling environment shall ensure the generation of at least a single system

model.

Rationale: A specific complete mapping guarantees that at least one executable
PSM can be generated. A complete mapping means that no ambiguity about which
platform resource will execute a given functionality remains after the
allocation.

RMHS3 The COMPLEX UML/MARTE model shall support the parametric configurability

already supported by Multicube SCoPE.

Rationale: In order to retain, within the MDA methodology, at least the same
capability for expressing the exploration space.

RMHS4 The modelling environment shall also allow the specification of a set of different

allocation schemes within the PSM.


Rationale: It enables the exploration of different allocations from the PIM

UML/MARTE model (Application model under the SW-centric assumption) to

platform resources.

RMHS5 The specification of a mapping set shall be done by means of adding mapping

constraints on the system model.

Rationale: To enable a compact representation of the different mappings in the
model. Mapping constraints are expressions that define a set of feasible
mappings. For example, given a function f1, two processors P1 and P2, and the set of
mappings {f1-P1, f1-P2, f1-P1^P2}, several sets of feasible mappings are possible,
e.g. the mapping set MS1 = {f1-P1, f1-P2}. If MS1 is expressed in the model, the
executable will let the exploration tool explore the cases where f1 executes on
a single processor Pi of the two available.
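The mapping-set example of RMHS5 can be reproduced with a small enumeration. The sketch below lists every non-empty subset of the processors as a candidate mapping for one function and keeps those accepted by a constraint; Mapping, feasible_mappings and single_processor are hypothetical names, and a real flow would express the constraint in the model (e.g. in OCL) rather than in C++.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <set>
#include <string>
#include <vector>

// Processors that execute one function, e.g. {"P1"} or {"P1", "P2"}.
using Mapping = std::set<std::string>;

// Enumerate every non-empty subset of processors and keep those that
// satisfy the mapping constraint.
std::vector<Mapping> feasible_mappings(
        const std::vector<std::string>& processors,
        const std::function<bool(const Mapping&)>& constraint) {
    assert(processors.size() < 16);  // keep the enumeration small
    std::vector<Mapping> result;
    const std::size_t count = std::size_t(1) << processors.size();
    for (std::size_t mask = 1; mask < count; ++mask) {
        Mapping candidate;
        for (std::size_t i = 0; i < processors.size(); ++i) {
            if (mask & (std::size_t(1) << i)) candidate.insert(processors[i]);
        }
        if (constraint(candidate)) result.push_back(candidate);
    }
    return result;
}

// Constraint behind MS1: the function runs on exactly one processor.
bool single_processor(const Mapping& m) { return m.size() == 1; }
```

With processors P1 and P2, the unconstrained enumeration yields the three mappings of the example, while single_processor reduces them to the two members of MS1.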

RMHS6 Mapping constraints should be expressed as constructs under a suitable, preferably

standardised, language (e.g., OCL).

Rationale: In order to enable a natural and expressive formulation of the mapping
constraints, and to favour the acceptance of the methodology.

The following table defines the requirements applicable to the XML system description

interface derived from the PSM model, enumerated as RMXSDi.

RMXSD1 It shall support instantiation and connection of the same elements enumerated in

section for the description of the PDM.

RMXSD2 It shall support the configurability of the PDM, for the configuration from the

exploration tool.

RMXSD3 Allocation relations between application components (CFAM) and platform

resources shall be explicitly and separately described in this file.

RMXSD4 Allocation relations between application components (CFAM) and platform

resources shall be configurable, to express a set of PSMs, which, after

generation of the executable specifications, enables the exploration of different

allocations.

The following table defines the requirements applicable to the XML design space interface

derived from the PIM and PSM models, enumerated as RMXDSi.

RMXDS1 It shall support the configuration of the parameters (e.g., frequency, voltage,

number of cores, …)

RMXDS2 It shall support the configuration of a specific mapping (corresponding to the

mapping space)




2.2.1.6 MARTE PDM and Architecture Description (IP-XACT)

2.2.1.6.1 Goals

The goal to be achieved by the UML/MARTE model-driven front-end regarding the
definition and transformation of the PDM is to describe the platform in a standard
format (IP-XACT) that enables the export of the optimum hardware platform found
during DSE for different uses (e.g., as input for the virtual system generator and for
the implementation phases).

2.2.1.6.2 Requirements

The following table provides the requirements relative to the UML/MARTE PDM model,
enumerated as RMPDi.


RMPD1 The PDM specification shall support the following HW platform elements:

processor cores

platform communication infrastructure

memories

usual platform elements: e.g. timers, watchdogs, arbiters, bridges, I/O (e.g. RS-423
required by the use case).

existing IPs

areas for custom hardware and their (register) interfaces

HW loadable components (e.g. PLD, FPGA)

Application-specific processing elements (e.g. DSP)

Rationale: In order to develop a sufficiently expressive methodology for describing

the HW platform.

RMPD2 The system platform model shall be supported by the MARTE HRM and SRM

sub-profiles.

Rationale: The most suitable sub-profile of MARTE for this task.

RMPD3 The system platform model will support and require at least the specification of the

Architecture of the HW Platform of the system (Instance of System-Level

Architectural Components and their Interconnection).

Rationale: The system architecture is usually an aspect open to design and with a

significant impact in performance.

RMPD4 The system platform model shall include the minimum information required by

SCoPE+ to enable the generation of the system platform executable model (e.g.,

through the inclusion of SCoPE+) and generation of performance figures. This

information will be preferably selected from UML/MARTE stereotype attributes.

Rationale: To guarantee the generation of an executable from the UML/MARTE

model useful for DSE.

RMPD5 The modelling environment shall enable the generation (through the corresponding

generator developed in COMPLEX) of an IP/XACT specification of the system

platform.

Rationale: To provide a standard description of the optimum platform (which is a

static description).




RMPD6 The generation of the IP-XACT specification of the system platform shall be based
on the optimal HW architecture.

The following table provides the requirements relative to the IP/XACT description of the

platform model to be extracted after DSE, enumerated as RMIPXi.

RMIPX1 The IP/XACT format shall support instantiation and connection of the same HW

platform elements previously enumerated (processor, buses, etc)

RMIPX2 A minimum set of IP-XACT elements [4] (based on the IP-XACT subset currently
supported by SCoPE [7][8]) is enumerated:

Design element (spirit:design) and its Vendor Library Name Version
(VLNV) identifier.

Component instances section (spirit:componentInstances).

Component instance element (spirit:componentInstance) and the following
related sub-elements:

Component instance name (spirit:instanceName)

Component instance VLNV identifier.

Vendor extensions. Here "context labels" are used to specify the type of
component (used for the automatic/default generation of components for the virtual
platform used for the estimation).

Component interconnection section (spirit:interconnections).

Component interconnection (spirit:interconnection) and the following
related sub-elements:

The interconnection name element (spirit:instanceName)

The active interface element (spirit:activeInterface), to support the
specification of a two-sided interconnection. Master, Slave, MirroredMaster
and MirroredSlave references will be supported.
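The minimum element set listed in RMIPX2 can be pictured as a small in-memory data model. The following sketch is an assumption made for illustration, not the SCoPE plug-in or an IP-XACT library: all type and field names are hypothetical, and component_ref and bus_ref only mirror the kind of references an active interface carries. The check verifies that both ends of every interconnection name an existing component instance.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct Vlnv { std::string vendor, library, name, version; };

struct ComponentInstance {
    std::string instance_name;  // cf. spirit:instanceName
    Vlnv component;             // VLNV of the instantiated component
};

struct ActiveInterface {
    std::string component_ref;  // instance name of the connected component
    std::string bus_ref;        // bus interface on that component
};

struct Interconnection {
    std::string name;
    ActiveInterface a, b;       // the two sides of the interconnection
};

struct Design {
    Vlnv id;                    // VLNV identifier of the design itself
    std::vector<ComponentInstance> instances;
    std::vector<Interconnection> interconnections;
};

bool has_instance(const Design& d, const std::string& name) {
    for (const ComponentInstance& c : d.instances)
        if (c.instance_name == name) return true;
    return false;
}

// True when both ends of every interconnection name an existing instance.
bool references_resolve(const Design& d) {
    for (const Interconnection& i : d.interconnections)
        if (!has_instance(d, i.a.component_ref) ||
            !has_instance(d, i.b.component_ref))
            return false;
    return true;
}
```

A generator could run such a check before writing the file, so that a dangling reference is reported against the UML/MARTE model rather than discovered by a downstream tool.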

RMIPX3 The IP-XACT file will also include XML comments carrying generation tool
information, to enable the correlation of the UML/MARTE model with the IP-XACT
product.

RMIPX4 The IP/XACT format shall support the architecture description and the

integration of other IPs for further estimation (as a valid input for the virtual

system generator).

RMIPX5 The IP/XACT format shall facilitate the integration of IPs for implementation
purposes.

The following requirements are related to the generator of the IP/XACT description of the

platform model, enumerated as RMGXi:

RMGX1 The generator shall follow the naming conventions and general requirements.

RMGX2 The generator shall support the generation of the IP-XACT code reflecting the
backbone of the architecture, supporting predefined processor cores, buses, memories
and basic platform elements.

RMGX3 The generator shall include the existing IP-XACT descriptions of existing IPs.
The tool should support the inclusion of specific IPs described by means of separate
IP-XACT files.

RMGX4 The generator shall be backward compatible with SW estimation tools (e.g. SCoPE),
that is, it shall support the IP-XACT format currently supported by the SCoPE IP-XACT
plug-in. This way, such a plug-in can be used (and extended) for the automatic
generation of the IP-XACT description of the architecture.

RMGX5 The generator shall support the generation of the platform configuration. The
minimum set of configuration parameters is the following: number of processors (for
a node) and memory map (base address and size of devices).
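The minimum configuration named in RMGX5 (number of processors and a memory map of base addresses and sizes) lends itself to a simple consistency check before generation. The sketch below is illustrative only; PlatformConfig, MemoryRegion and the check itself are assumptions and not part of the COMPLEX generator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

struct MemoryRegion {
    std::string device;
    std::uint64_t base;  // base address of the device
    std::uint64_t size;  // size of the address window in bytes
};

struct PlatformConfig {
    unsigned processors;  // number of processors of the node
    std::vector<MemoryRegion> memory_map;
};

// First address after the region.
std::uint64_t end_of(const MemoryRegion& r) {
    assert(r.base + r.size >= r.base);  // region must not wrap around
    return r.base + r.size;
}

// True when there is at least one processor and no two regions overlap.
bool config_is_consistent(const PlatformConfig& cfg) {
    const std::vector<MemoryRegion>& map = cfg.memory_map;
    for (std::size_t i = 0; i < map.size(); ++i) {
        for (std::size_t j = i + 1; j < map.size(); ++j) {
            if (map[i].base < end_of(map[j]) && map[j].base < end_of(map[i]))
                return false;
        }
    }
    return cfg.processors > 0;
}
```

Two regions that merely touch (one ends where the next begins) are accepted; only a genuine overlap of address ranges is rejected.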

RMGX6 The name of the UML/MARTE HW components shall be preserved in the

IP/XACT description.




2.2.2 Matlab/Stateflow Design Entry

2.2.2.1 Goals

The goals to be achieved by the Matlab/Stateflow model-driven front-end can be summarized

by the following points:

Description

The definition of a COMPLEX modelling methodology based on Matlab/Stateflow able to

describe the dynamic behaviour of an embedded system as a function of input stimuli.

The development of a COMPLEX modelling environment supporting the aforementioned
COMPLEX modelling methodology and based on the use and, if necessary, the extension
or adaptation of state-of-the-art capture tools.

The development of a COMPLEX transformation toolset enabling the generation of a

SystemC specification model from the Stateflow model.

2.2.2.2 Requirements

This section is deliberately structured to be as similar as possible to the previous section,
describing the requirements for the UML/MARTE entry. The following table provides the
general requirements for the Stateflow modelling methodology and environment, enumerated
as RSMi.

RSM1 The modelling methodology shall be a component-based methodology.

Rationale: To enhance modularity and enable the HW/SW partitioning process.

RSM2 The description of the system functions, as well as the specification of the use

cases, shall be logically separated.

Rationale: Separation of Concerns.

RSM3 COMPLEX shall support the Stateflow tool for modelling the system function and

stimuli.


Rationale: Support a de-facto industrial standard tool.

RSM4 The methodology shall enforce the separation of concerns by means of appropriate

segregation between functionality and stimuli/observation in different blocks.

Rationale: the separation of concerns concept allows managing the complexity of

the system modelling by focusing the system designer on specific aspects of the

system, like specification and verification.

RSM5 The COMPLEX Stateflow design modelling shall use existing tools and model

formats whenever they exist and are suitable.

RSM6 The COMPLEX Stateflow design methodology shall provide an integrated design

environment for functional modelling, as well as the stimuli definition.


tool

Input

provided by

System Specification | Stateflow editor | Use Case Provider

SystemC executable specification of the model | HIFSuite | HW/SW Task Separation and Test Bench Generation

System Stimuli Input Model HW/SW Task

Separation and Test

Bench Generation

2.2.2.3 Stateflow System specification in SystemC


2.2.2.3.1 Goals

The goal of this step in the COMPLEX flow is to capture a Platform Independent Model

(PIM) of the application through the Stateflow front-end and to generate a SystemC

specification.

2.2.2.3.2 Requirements

The requirements to be addressed in this design step are listed here for the Stateflow front-end
and SystemC executable specification.

The following table defines the requirements (enumerated as RSIi) applicable to the capture of
the PIM within the Stateflow model-driven front-end.

RSI1 The PIM model shall support two views, the functional view (single FSM)

and the communication and concurrency view (Statechart with AND states).

Rationale: this requirement states the minimum set of system concerns to be

considered in the PIM model. For example: Functional specification

(structure of functionality), Concurrency structure.

RSI2 The PIM model shall follow a SystemC-centric approach.

SystemC processes can be mapped indifferently to HW and to SW, thus

implementing Statechart concurrency as multiple processes eases the

subsequent partitioning process. Each AND state (concurrent FSM) shall be

transformed into an active task (SystemC thread), with its own execution

flow.

RSI3 The COMPLEX Stateflow to SystemC generator shall follow the system

component naming conventions.

RSI4 At least the following types of communication and synchronization
mechanisms shall be allowed in the application model:

FIFOs with blocking semantics

Events

The following requirements (enumerated as RSSCi) are related to the generation of the

SystemC executable model from the Stateflow specification:

RSSC1 The transformation from the Stateflow model to the SystemC model shall ensure

the separation of the application and platform levels.

Rationale: the SystemC generator generates only the SystemC application model,

while the platform model is created separately by the platform provider.

RSSC2 The transformation of the system model into the executable model shall be
automatic with no user intervention (button click approach).

RSSC3 Action code (function bodies) associated with the Stateflow model shall be used

without user modifications (assuming this code fulfils the limitations of the

COMPLEX estimation tools: SCoPE, etc.) for the generation of the SystemC

executable specification.

RSSC4 A homogeneous and understandable naming convention for the generation tools

shall be provided.

2.2.1.4 Use cases: System Input Stimuli and Test bench generation

2.2.1.4.1 Goals

The following table introduces the goals to be achieved by the Stateflow model-driven

front-end regarding the use case stimuli definition and test bench generation.

Description

To define a methodology to design the test bench environment in parallel with the system

model within the same modelling environment.

To define a methodology to specify test cases and their associated requirements supported by

the system-modelling environment.

To define a mechanism to associate input stimuli to system components.

2.2.1.4.2 Requirements

The following table defines the requirements applicable to the capture of the Use Case Model

within the Stateflow model-driven front-end, enumerated as RSUCi.

RSUC1 It shall be possible to design the test bench environment using the same modelling

environment used to develop the system model.

Rationale: Efficiency. It makes the modelling of the system and of its environment as

homogeneous as possible. This does not prevent taking advantage of the greater

flexibility that is usually available when developing environment models, compared

to the system description.

RSUC2 It shall be possible to specify test cases and associate them with requirements.


Rationale: To enable the different types of checks/validations that have to be

applied to a system and, moreover, to choose the criteria (functionality,

performance) to consider for each check. For example, the power consumption of a

wireless radio can be measured under different medium access control mechanisms,

and the power consumption requirements can differ in each case.

RSUC3 It shall be possible to define input stimuli for each test case.

Rationale: To allow input stimuli to be specified; these are usually associated with test

cases and requirements (e.g., the previously mentioned power consumption check requires

the radio to be enabled or disabled, depending on the test case).




2.3 Estimation and Model Generation

2.3.1 Task separation / Test bench generation

2.3.1.1 Goals

Cut the system specification in SystemC into modules/tasks based on communication

boundaries, user constraints and allocation/mapping information from the MARTE model.

Transform and adapt the modules/tasks to a form suitable for the following estimation tools.

Generate a virtual platform skeleton that will integrate the annotated BAC++ output from the

estimation tools.

Block test benches are generated to allow simulation and test of the separated modules. These

test benches are either based on replaying traces coming from stimuli files or implement a

representation of the surrounding remainder of the specified system.

2.3.1.2 Requirements

A system elaboration with detection of processes, ports, channels and bindings needs to be

performed in order to be able to detect communication/separation points. The system

specification code must follow some restrictions in order to be fully analyzable; otherwise a

full code interpreter would be needed in the elaboration engine. All system construction must

occur in the module constructors in a declarative style that can be elaborated at compile time;

recursion and unrestricted loops are not allowed there.

For separating a behavioural block that communicates with other blocks in the abstract

SystemC description, we need to adapt the abstraction level of the communication to the

appropriate level for the estimation tools. Traces containing input stimuli are reused from the

executable system specification. If the separated modules are to be tested by a representation

of the residual system, this rest is assembled and transformed into a suitable test bench.

User-defined HW/SW separation and mapping information enters the system in the form of a

textual description listing identifiers/instance names of the design with their desired

configuration.
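The concrete syntax of this textual description is not fixed in this section. As an illustration only, the following sketch assumes a hypothetical format with one entry per line, `<instance name> -> <target resource>` (for example `T0 -> CPU0`), and reads it into a lookup table. The format, the function name `parse_mapping` and the comment syntax are assumptions made for this example and are not part of the COMPLEX tool interfaces.

```cpp
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical line format (an assumption, not defined by COMPLEX):
//   <instance name> -> <target resource>      e.g.  "T0 -> CPU0"
// Blank lines and lines starting with '#' are ignored.
using Mapping = std::map<std::string, std::string>;

// Remove leading and trailing whitespace.
static std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    const auto first = s.find_first_not_of(ws);
    if (first == std::string::npos) return "";
    const auto last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Parse the whole description; throws std::runtime_error on malformed lines.
Mapping parse_mapping(const std::string& text) {
    Mapping result;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        line = trim(line);
        if (line.empty() || line[0] == '#') continue;
        const auto arrow = line.find("->");
        if (arrow == std::string::npos)
            throw std::runtime_error("missing '->' in: " + line);
        const std::string instance = trim(line.substr(0, arrow));
        const std::string resource = trim(line.substr(arrow + 2));
        if (instance.empty() || resource.empty())
            throw std::runtime_error("incomplete entry: " + line);
        result[instance] = resource;  // a later entry overrides an earlier one
    }
    return result;
}
```

Calling `parse_mapping("T0 -> CPU0\nT1 -> HW0\n")` yields a table in which `T0` is bound to `CPU0` and `T1` to `HW0`; a separation tool could then look up each elaborated instance name in this table.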

The IP-XACT description is used to generate a skeleton virtual platform that can be linked

with the output of the SW and HW estimation tools. There will be a finite set of supported

interface modelling elements in the IP-XACT model, e.g. FIFOs, handshake- and double-

handshake channels.



Input required | Required by tool | Input provided by
System specification in SystemC | SMOG | MARTE PIM – Matlab/StateFlow or System designer
System input stimuli | SMOG | Use cases / System designer
User constrained HW/SW separation and mapping | SMOG | System designer
Architecture/Platform description (IP-XACT) | SMOG | IP-XACT tool chain

Output provided | Provided by tool | Output required by
Separated HW tasks | SMOG | Source Analysis and augmented code generation (HW tasks)
Separated SW tasks | SMOG | Source Analysis and augmented code generation (SW tasks)
Block test benches | SMOG | Source Analysis and augmented code generation (HW tasks)/(SW tasks)




2.3.2 Source Analysis and Augmented Code Generation (SW tasks)

2.3.2.1 Goals

This step implements two main features of the software related portion of the design flow:

Software task analysis

Augmented code generation

Since the two processes are strictly related, they are considered as a single design flow step.

More precisely, the augmented code generation needs the information (costs and program

structure) derived by the analysis phase.

Source Code Analysis

The analysis of the source code has the goal of building an abstract model of the source code

of a given task. In particular, the model should completely represent the structure and the

semantics of the task but must be independent of the actual input data. To account for data

dependencies, the generated model has the form of a source code program with the same

functionality as the original task but with profiling and tracing capabilities.

The non-functional information that should be traced (primarily execution time and power

consumption) requires a model of the underlying architecture. For the sake of modularity and

flexibility, such models are only used in a post-processing phase during which all estimates

are calculated and collected.

The input of this step is a software task in the form of a set of C language source files. The

output consists of the same input files annotated with estimates of the non-functional

properties being analyzed. Reports and traces can also be generated as by-products of the

analysis flow.

Augmented Code Generation

The non-functional model produced in the analysis phase is rearranged so as to

obtain a new model, representing the exact behaviour of the task, that can be simulated

along with the rest of the architecture. Simulation of the augmented code model includes

accounting for execution time and power consumption, as well as other dynamic

non-functional properties that might be of interest within the flow.




The augmented code will take the form of the original C source code enriched with suitable

calls to a set of functions defined as part of the BAC++ and SystemC API. These calls allow

keeping track at runtime of the evolution of the non-functional figures under study. Since the

overall simulation will be based on SystemC/BAC++, it is necessary to wrap the pure C code

within suitably defined SystemC wrappers.
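To make the idea of augmented code more concrete, the following sketch shows a small task function next to an instrumented version that calls a tracing function once per basic block. The tracing interface (`TraceCounters`, `trace_block`) and all per-block cost figures are hypothetical placeholders chosen for this example; the actual BAC++ and SystemC API calls are not defined in this section, and the SystemC wrapper is omitted.

```cpp
#include <cstdint>

// Hypothetical tracing interface; the real BAC++/SystemC API is not shown here.
struct TraceCounters {
    std::uint64_t cycles = 0;  // accumulated estimated clock cycles
    double energy_nj = 0.0;    // accumulated estimated energy in nJ
};

// Record the estimated cost of one executed basic block.
inline void trace_block(TraceCounters& t, std::uint64_t cycles, double energy_nj) {
    t.cycles += cycles;
    t.energy_nj += energy_nj;
}

// Original task code: sum of the positive elements of an array.
int sum_positive(const int* data, int n) {
    int sum = 0;
    for (int i = 0; i < n; ++i)
        if (data[i] > 0) sum += data[i];
    return sum;
}

// Augmented version: same functionality, one tracing call per basic block.
// The cost numbers are made-up placeholders, not estimation results.
int sum_positive_augmented(const int* data, int n, TraceCounters& t) {
    int sum = 0;
    trace_block(t, 2, 0.5);           // entry block
    for (int i = 0; i < n; ++i) {
        trace_block(t, 3, 0.75);      // loop test and element compare
        if (data[i] > 0) {
            sum += data[i];
            trace_block(t, 1, 0.25);  // accumulate block
        }
    }
    trace_block(t, 1, 0.25);          // exit block
    return sum;
}
```

Because the tracing calls follow the actual control flow, the accumulated figures depend on the input data: an array with more positive elements executes the accumulate block more often and therefore reports more cycles and energy, while the functional result stays identical to that of the original code.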

In the simulation of different interacting software tasks, the operating system plays a crucial

role. To account for its contribution, two solutions are possible, depending on the specific

portion of the operating system that is being considered:

Characterization of individual functions. This is the preferred approach to account for the

impact of the operating system when explicit calls are present in the user code. This applies,

for example, to most of the I/O and memory management functions.

Statistical characterization of hidden contributions, such as interrupt handling and scheduling.

In both cases some form of characterization of the operating system non-functional behaviour

is necessary. This can either be accomplished using the abovementioned source code analysis

techniques or using ad-hoc, lower level statistical approaches combined with suitable

measurement campaigns.
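The two kinds of operating system contribution can be combined in a simple cost model. The sketch below is one possible reading, not the project's method: explicit calls are charged with a characterized cost per call, and hidden contributions (interrupt handling, scheduling) are modelled as a statistical overhead ratio applied to the visible time. The names `OsModel` and `estimate_task_time_us`, the units, and the multiplicative overhead model are all assumptions made for this example.

```cpp
#include <map>
#include <string>

// Hypothetical OS characterization data (assumed structure and units).
struct OsModel {
    std::map<std::string, double> call_cost_us;  // cost per explicit OS call
    double hidden_overhead_ratio;                // e.g. 0.25 = 25 % extra time
};

// Estimate the task time including OS contributions.
// user_time_us : time of the user code itself, from source analysis
// call_counts  : how often each explicit OS function was called
double estimate_task_time_us(double user_time_us,
                             const std::map<std::string, unsigned>& call_counts,
                             const OsModel& os) {
    double explicit_us = 0.0;
    for (const auto& kv : call_counts) {
        const auto it = os.call_cost_us.find(kv.first);
        if (it != os.call_cost_us.end())         // uncharacterized calls cost 0 here
            explicit_us += it->second * kv.second;
    }
    const double visible_us = user_time_us + explicit_us;
    // Hidden contributions are spread statistically over the visible time.
    return visible_us * (1.0 + os.hidden_overhead_ratio);
}
```

In a real flow the per-call costs would come from the source code analysis or from measurement campaigns, and calls without a characterization would more likely be reported than silently ignored.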


The input of the source analysis module must comply with the following requirements,

enumerated as RESWi:

RESW1 Input must consist of a set of header and implementation C source files. The

language supported is compliant with both the ANSI X3.159-1989 and ISO/IEC

9899:1990 standards.

RESW2 Compiler-specific extensions to the language are not considered by the standard

tool-chain but can be dealt with by suitable, configurable extensions.

RESW3 The overall structure of the program can either be that of a stand-alone

application or that of a library. In the latter case, suitable stubs must be created to

exercise the code being analysed.

RESW4 The structure of the application can consist of several processes or tasks.

Interaction between tasks and processes is not strictly part of the application code

but can be accounted for by suitably modelling the underlying operating system.

RESW5 To complete the estimation flow a characterization of the underlying hardware

architecture must be provided.

RESW6 The input for augmented code generation must comply with the following

requirements

RESW7 The nature of the non-functional information must be numerical.

RESW8 The interaction between functional and non-functional aspects of the application

is implemented by means of instrumentation of the original source code of the

application with suitable tracing functions.




RESW8 An API of tracing functions must be defined. This API must support all the three

types of instrumentation envisaged (post-processing, light-weight and

SystemC/BAC++).

RESW9 The information to be traced and its format must be specified.



Input required | Required by tool | Input provided by
Behavioural description of the task to be analyzed/augmented, given as C source code | SWAT | HW/SW Task Separation and Test Bench generation
Characterisation of the target processor | SWAT | Technology provider
List of functions in the SW code that should not be estimated (e.g. pre-characterized function, or HW stub) | SWAT | Application provider

Output provided | Provided by tool | Output required by
Augmented SW task with power/performance annotations | SWAT | Virtual System Generator
Power/Performance estimation reports of the SW code | SWAT | User
Execution Traces | SWAT | Design Space Exploration




2.3.3 Source Analysis and Augmented Code Generation (HW tasks)

2.3.3.1 Goals

This step comprises two main features: (1) the HW task analysis and (2) augmented

code generation. These two features are closely related and are therefore included within a

single design flow step.

The goal of this step is to estimate the non-functional properties of HW tasks in terms of

power and timing. Using this information, a self-simulating model of the relevant tasks is

created that will be used during simulation in order to obtain information about power and

timing that depends on the functional behaviour. It is also possible to see the impact of the

chosen hardware technology on the functional behaviour, i.e., on timing.

Analysis

First, the high-level description (SystemC) of the task is analysed. Analysis is done using a

functional simulation of the task. During simulation functional behaviour as well as typical

data patterns for all operations are obtained. A power optimized RTL description is created,

which is the basis for the estimation. During estimation several non-functional properties like

dynamic and static power as well as timing are obtained, according to the synthesis-

constraints given by the user or the DSE-tool, respectively.

During synthesis of the task, an RTL description of the hardware component implementing

the task accompanied by a pre-optimized power-controller is generated. The power controller

allows setting the hardware component into individual power modes. For mode selection an

interface is provided that can be used by the power manager of the overall system. In order to

find a good power management policy for the system, the analysis of the components also

provides information about additional costs and penalties regarding switches between power

modes. This information is provided in terms of a power mode table, which contains the

following information:

ID | External parameter               | Resulting parameter            | Penalty (for each subsequent mode)
   | Supply voltage | Clock frequency | Avg. dyn. power | Avg. leakage | Switching time | Switching power

Table 2-1: Power mode table per component




Augmented Code Generation

With the estimated RTL description it is possible to create a self-simulation model of the

description. Besides the functional behaviour, this model contains the non-functional

information obtained during analysis. The model will then be used during simulation of the

overall system. This self-simulation model will be implemented using BAC++ (see

Section 2.3.4). For each block identified inside the RTL implementation of the behaviour, the

functional part will contain the number of clock cycles required by the HW to execute that

particular block and the capacitance switched during execution. Data dependency of the

switched capacitance is considered statistically.

The non-functional model contains information about the leakage and the clock frequency for

each power mode, as specified in the power mode table (see Table 2-1). The current clock

frequency can be used by the functional part to calculate the time that a certain piece of

behaviour consumes during execution. The switched capacitance together with the supply

voltage etc. can be used to calculate the power dissipation during execution. When switching

from one power mode to another, the execution time is increased by the penalty that is also

specified in the power mode table. The same applies to the energy.
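The bookkeeping described above can be sketched as follows. The example assumes the common CMOS approximation E_dyn = C_sw * Vdd^2 for the dynamic energy of a block (other conventions include a factor 1/2), takes the block time as cycles divided by the clock frequency, adds leakage as P_leak times that time, and charges the switching penalty as switching power times switching time. The struct and function names are illustrative, and storing one penalty per target mode is a simplification of the per-transition penalties of Table 2-1.

```cpp
#include <cstddef>
#include <vector>

// One row of a power mode table (field names are illustrative).
struct PowerMode {
    double vdd_v;           // supply voltage [V]
    double f_clk_hz;        // clock frequency [Hz]
    double p_leak_w;        // average leakage power [W]
    double switch_time_s;   // penalty: time needed to enter this mode [s]
    double switch_power_w;  // penalty: power drawn while switching [W]
};

// Running totals kept during simulation.
struct Accounting {
    double time_s = 0.0;
    double energy_j = 0.0;
    std::size_t mode = 0;   // index of the current power mode
};

// Account for one executed block with 'cycles' clock cycles and
// 'c_sw_f' farads of switched capacitance.
void execute_block(Accounting& a, const std::vector<PowerMode>& table,
                   double cycles, double c_sw_f) {
    const PowerMode& m = table[a.mode];
    const double t = cycles / m.f_clk_hz;          // block execution time
    a.time_s += t;
    a.energy_j += c_sw_f * m.vdd_v * m.vdd_v       // dynamic energy (assumed C*Vdd^2)
                + m.p_leak_w * t;                  // leakage energy
}

// Change the power mode and add the time and energy penalties.
void switch_mode(Accounting& a, const std::vector<PowerMode>& table,
                 std::size_t next) {
    if (next == a.mode) return;                    // no switch, no penalty
    const PowerMode& m = table[next];
    a.time_s += m.switch_time_s;
    a.energy_j += m.switch_power_w * m.switch_time_s;
    a.mode = next;
}
```

Running the same block in a mode with lower supply voltage and clock frequency reduces its dynamic energy but lengthens its execution time, so the leakage share grows; this is the trade-off a power management policy has to weigh against the switching penalties.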



Input required | Required by tool | Input provided by
Behavioural description of the task to be implemented in custom hardware, given as C/C++ source code that is compliant with a synthesisable subset of the C/C++ language | PowerOpt | HW/SW Task Separation and Test Bench generation
Characterisation of the target technology | PowerOpt | Technology provider
Synthesis constraints (clock speed, power vs. area, etc.) | PowerOpt | HW/SW Task Separation and Test Bench generation

Output provided | Provided by tool | Output required by
BAC++ (see Section 2.3.4 for details) version of the characterised behaviour | PowerOpt | Virtual System Generator
A list of different power modes, provided by the component (see Table 2-1 for details) | PowerOpt | Global Resource Manager




2.3.4 Block annotated C++ (BAC++)

The aim of block annotated C++ (BAC++) is to enrich the functional model of a certain

behaviour with additional information representing the hardware that executes this

behaviour. This information is then used during simulation to obtain information about

power and timing.

In general, the augmented C++ code contains three parts:

(1) the functional behaviour, including estimated values that depend on the actual behaviour

during simulation, e.g., switched capacitance, clock cycles, instruction count, etc.;

(2) the non-functional model, containing information about values that are independent of

the actual behaviour, such as static power;

and (3) an observer that translates the measured values into the metrics required by the

user, i.e. power and timing.

Values related to the functional behaviour of the task are represented in terms of per-block

annotated C++ code. That code is built from (basic) blocks, each one containing a small part

of the behaviour as well as metrics for power and timing estimation that depend directly on

the behaviour. For HW this could be the switched capacitance and for SW the instruction

count, for example. During simulation, different blocks are executed, depending on the actual

control flow caused by the applied input stimuli.

The non-functional part does not depend on the actual behaviour, but it depends on the HW

implementing/executing the behaviour. It may also depend on the processing unit's current

power mode. That is, non-functional values may be influenced by the overall power manager

by setting the power modes of the processing unit.

The observer combines information from the functional and from the non-functional model,

in order to obtain the metrics required by the user and the DSE-tool, respectively. Thus, it

translates the metrics, obtained during BAC++ simulation into the values that should be

traced. In order to reduce the amount of data created during system simulation, the observer is

also able to perform some pre-processing e.g., sliding window averaging.
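As an illustration of the pre-processing mentioned above, the following sketch implements a sliding-window average over the most recent samples, as an observer might apply to per-block power values before handing them to the trace file generator. The class name `SlidingAverage` and its interface are assumptions made for this example; the observer API itself is not specified here.

```cpp
#include <cstddef>
#include <deque>

// Sliding-window average over the most recent 'window' samples.
class SlidingAverage {
public:
    // A window of 0 is treated as 1 to avoid an empty window.
    explicit SlidingAverage(std::size_t window)
        : window_(window == 0 ? 1 : window) {}

    // Add a sample and return the average over the current window.
    double push(double sample) {
        samples_.push_back(sample);
        sum_ += sample;
        if (samples_.size() > window_) {  // drop the oldest sample
            sum_ -= samples_.front();
            samples_.pop_front();
        }
        return sum_ / static_cast<double>(samples_.size());
    }

private:
    std::size_t window_;
    std::deque<double> samples_;
    double sum_ = 0.0;
};
```

Writing only the averaged value, for example once per window, instead of every raw sample is one way to reduce the trace volume; the window length then trades trace size against time resolution. For very long runs the running sum may accumulate rounding error and could be recomputed periodically.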

It is important to note that the values that enrich the functional model depend on the type of

the processing unit implementing/executing the behaviour. Thus each of the two

characterisation flows (i) and (j) from Figure 1 will create BAC++ monitoring different

values. The same applies to the non-functional model. As just mentioned, the particular

observer is responsible for translating the values from the functional and the non-functional

model into the metrics required by the user. It is also important to note that the observers for

both HW-BAC++ and SW-BAC++ use the same API to communicate with the trace file generator.




Figure 4: Augmented code estimation

Figure 4 shows how the three parts mentioned above collaborate during simulation. It shows a

generic approach, which is also suitable for software tasks, running on a processor.

If multiple behaviours are mapped to the same execution unit (e.g., multiple tasks running on

the same processor), some kind of scheduler, or even a more powerful real-time operating

system (RTOS), is required. In this case, each task is augmented as mentioned above. The

processing unit is still represented using a single non-functional model. The so-called

runnable contains the functional model of the task, as well as the non-functional model of the

HW. It also provides the interface that can be used by the overall power manager to set the

power modes of the HW component executing the behaviour. The runnable is wrapped by the

TLM2 template, which enables communication with the surrounding system.

[Figure 4 elements: runnable (BAC++, incl. scheduler) containing the behaviour (functional model, estimated values) and the non-functional model with its operating mode table (ID, Vdd, fmax, avg. Pleak, avg. Pdyn, penalty (power), penalty (delay), etc.); TLM2 template (depending on component type) with IF and MEM connections to the surrounding system; observer (depending on TLM template type) with information pre-processing and trace generation; trace file generator. Further labels: component characterisation, power controller generation, operating mode ID, policy.]

2.3.5 Virtual System Generator

2.3.5.1 Goals

At this step in the flow, the virtual platform is generated according to the architecture model

and the task/communication mapping. The goal is to generate an efficient simulation model

that can be executed in the simulation step (see Section 2.4) in order to provide the

design space exploration tools with simulation trace information.

The Virtual System Generator takes a description of the architecture definition and the

SystemC code for the architecture models as well as the augmented code for HW and SW

tasks to assemble a complete description of the system that can be executed. The SystemC

models for the platform description can be defined at different levels of abstraction (e.g.

abstract processing elements versus instruction set simulators) therefore the Virtual System

Generator needs to be able to insert the necessary communication interfaces between models

to bridge these abstraction levels.


The Virtual System Generator must comply with the following requirements, enumerated as

RVSGi:

RVSG1 All models for architecture components, HW tasks and SW tasks should comply

with the IEEE1666 SystemC standard.

RVSG2 The external interfaces of the components that will be integrated should comply

with a predefined set of interface definitions that minimally contains TLM2.0

base protocol and sc_signal.

RVSG3 There is a predefined set of platform elements that support abstract application

mapping. This will at least contain instruction set simulators, virtual processing

resources and custom HW models.

RVSG4 Refinement of abstract communication elements for HW and SW tasks to

platform (via transactors/remote service calls over TLM) requires well-defined

protocols / interfaces for the HW and SW tasks.

RVSG5 Automatic insertion of abstraction level conversion for communication interfaces

requires well-defined protocols/interfaces for the platform components

RVSG6 The Virtual System Generator will be able to implement the SW task mapping to

either virtual platform resources (like Virtual Processing Units (VPUs)) or to an

instruction set simulator.

RVSG7 The Virtual System Generator should support configurable instrumentation for

the simulation executable to simplify design space exploration from a single set

of models. Configuration can be used for power control, frequency scaling, power

modes, execution modes etc.

RVSG8 To enable fast turnaround time between different platform variants the Virtual

System Generator tools should support run-time configurable platform

descriptions for a certain predefined coding style.

Input required | Required by tool | Input provided by
IP-XACT specification of the system platform | Virtual Platform tools | IP-XACT tool chain
Augmented SystemC code for the SW task models | Virtual Platform tools | Source analysis and augmented code generation (SW tasks) - SWAT
Augmented SystemC code for the HW task models | Virtual Platform tools | Source analysis and augmented code generation (HW tasks) - PowerOpt
HW and SW task mapping to system platform | Virtual Platform tools | Task Separation and test bench generation - SMOG

Output provided | Provided by tool | Output required by
SystemC code for the platform simulator | Virtual Platform tools | Simulator
Build and launch scripts for platform simulator | Virtual Platform tools | Simulator
Elaboration time configuration for platform simulator | Virtual Platform tools | Simulator




[Figure: from system-level model to TLM model. Elements shown: parallel application description (tasks T0, T1, T2 as sequential C/C++ code, behaviours with active or passive tasks, ports and interfaces); architecture/platform description in SystemC (PPC with ISA, pipeline, cache behaviour, clock frequency, voltage and frequency scaling; IBM PLB with arbiter, width, protocol and scheduling policy; dedicated HW with target technology and max. area; memory with data and address widths); mapping description (T0 -> CPU0, T1 -> HW0, T2 -> HW0); virtual system prototype as executable specification in SystemC (e.g. on CoWare Platform) with CPU0, HW0, a TLM 2 router (IBM PLB protocol), cache model and memory model. Annotations: the OSCI TLM 2 communication model (PV, BA, BCA) contains the bus power and timing model; the communication graph of T0 shows only explicit communication nodes, its computation nodes contain power and timing annotations, and instruction and data fetches are handled by the cache model; the system memory model comes from the IP-XACT repository; the communication graph of T1 shows parallel execution obtained from behavioural synthesis, with power and execution time annotated in computation nodes.]

2.3.6 Global resource manager (Pre-optimized Power Controller)

2.3.6.1 Goals

The pre-optimized power controller consists of two main parts.

The first main part contains one power controller per HW component (see Section 2.3.3),

which allows the HW component implementing the task to be set into individual power

modes and provides an interface to the Global Resource Manager (GRM) of the overall

system.

The second main part is the GRM, which optimizes the system parameters at run time, i.e.

it adapts the hardware platform and the application configuration during execution in order

to further reduce the power consumption. The GRM acts as a middleware between the

application and the platform. Among other functionalities, the GRM can vary the

frequency of processors, switch power islands on and off, select power modes of HW

components, or switch between different qualities of service offered by the application.

The Global Resource Manager (GRM) is loaded on the host processor of the platform. It is a

software task running in parallel with the applications. The goals of the GRM are as follows:

First, this GRM should support a holistic view of resources and quality management. This

is needed for global resource allocation decisions, arbitrating between all applications, and

optimizing a utility function (also called Quality of user Experience (QoE)), given the

available resources. This QoE allows a trade-off, negotiated with the user, between quality and

cost.

Second, this GRM should transparently optimize the resource usage and the application

mapping on the platform. This is needed to facilitate the application development and manage

the quality requirements without rewriting the applications.

Third, this GRM should dynamically adapt to a changing context. This is needed to achieve

high efficiency in a changing environment.

Since such a GRM is intended for embedded platforms, only a lightweight implementation

is acceptable. To that end, this GRM should be considered in the system simulation to control

its complexity and monitor its overhead in terms of performance and power consumption.


The GRM is loaded on the host processor of the platform. It is a software task, specified in C,

and running on top of the basic OS services. The interfaces of the GRM with the design-time

exploration (see Section 2.5.2) are as follows:

[Figure: COMPLEX design flow overview, including the pre-optimized Global Resource Manager (SW task) and the Power Management HW Interface]

The design-time exploration leads to a multi-dimensional set of optimal application mappings.

Each mapping is characterized by a code version together with an optimal combination of

application constraints, HW/SW partitioning, used platform resources, operation modes of

each platform component, and costs.

At run-time, whenever the environment changes, the GRM selects a mapping from this

multi-dimensional set in order to optimize the utility function, while respecting the

application constraints and minimizing the costs.
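A run-time selection of this kind can be sketched as a scan over the pre-computed set. The example below is a deliberately small model and not the GRM algorithm: each mapping is reduced to three hypothetical figures (utility, latency, power), the constraints are a latency bound and a power budget, and among the feasible points the one with the highest utility is chosen, with lower power breaking ties. All names are assumptions made for this example.

```cpp
#include <cstddef>
#include <vector>

// One pre-computed operating point from design-time exploration.
// The fields and the two-constraint model are assumptions.
struct OperatingPoint {
    double utility;     // quality delivered to the user (higher is better)
    double latency_ms;  // predicted execution time of the mapping
    double power_mw;    // predicted power cost of the mapping
};

// Return the index of the feasible point with the highest utility,
// breaking ties by lowest power; -1 if no point meets the constraints.
int select_mapping(const std::vector<OperatingPoint>& points,
                   double max_latency_ms, double power_budget_mw) {
    int best = -1;
    for (std::size_t i = 0; i < points.size(); ++i) {
        const OperatingPoint& p = points[i];
        if (p.latency_ms > max_latency_ms || p.power_mw > power_budget_mw)
            continue;  // violates a constraint
        if (best < 0 || p.utility > points[best].utility ||
            (p.utility == points[best].utility &&
             p.power_mw < points[best].power_mw))
            best = static_cast<int>(i);
    }
    return best;
}
```

A linear scan keeps the run-time manager lightweight as long as the set delivered by design-time exploration is small; the cost of switching from the current mapping to the selected one (see the power mode tables) is not considered in this sketch.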

The GRM interfaces with the HW components, including their power controllers (see Section

2.3.2.1) and with the processors of the platform, through APIs:

To send them activation requests (to run a specific task under specific constraints) or

deactivation requests.

To get access to their power consumption and timing information.

Input required | Required by tool | Input provided by
Hardware platform description (high-level view) | GRM | IP-XACT tool chain or platform provider
Application constraints | GRM | System designer
Multi-dimensional set of optimal platform configurations (HW/SW partitioning and memory architecture) and application operation modes (XML) | GRM | Design Space Exploration
Switching costs (power, execution time) between power modes of HW components (from power mode tables) | GRM | Source Analysis and Augmented Code Generation (HW tasks)
Switching costs (power, execution time) between power modes of SW processors | GRM | IP-XACT tool chain or platform provider
APIs for setting the HW components, for power mode selection, and for accessing power and timing information | GRM | Source Analysis and Augmented Code Generation (HW tasks)
APIs for setting the SW components, for power mode selection, and for accessing power and timing information | GRM | Platform provider

Output provided | Provided by tool | Output required by
GRM SW task for run-time management of platform resources, to run in parallel with the application | GRM | Final executable specification




2.3.7 Summary: Model Estimation and Generation Flow

[Figure: summary of the model estimation and generation flow. Elements shown: the Parallel Application Description (tasks M0, M1, M2 in sequential C/C++/SystemC, with or without wait(), connected through channels from a Protocol Elements Library); the Architecture/Platform Description in SPIRIT/IP-XACT (CPU, bus, arbiter, dedicated HW, with ISA, pipeline, cache behaviour, clock frequency, voltage and frequency scaling, bus width and protocol, scheduling policy, maximum area and technology); the Mapping Description (M0 -> CPU0, M1 -> HW0, M2 -> HW0); the C/C++ front-end and FOSSY elaborator; the Virtual System Description; the SW execution time estimator and ChipVision PowerOpt, both producing “3 Address SystemC” with instrumentation; the OSCI TLM 2 communication model (PV, BA, BCA); the block testbench generator; and the executable specification in SystemC (e.g. on the CoWare platform).]

2.4 Simulation

2.4.1 Goals

The goal of the simulation step is to enable a detailed evaluation of the HW/SW platform
obtained from the previous part of the flow. Because the COMPLEX flow may be adopted to
design a node in a distributed environment, this step is not limited to the SystemC/BAC++
models of the platform in a closed environment: it also includes the evaluation of distributed
environments through network simulation.

Another goal is the development of a flexible and efficient instrumentation interface for
dynamic/configurable tracing of different architecture properties (power, temperature, …).

2.4.2 Requirements

The simulation must comply with the following requirements, enumerated as RVSSi:

RVSS1 Configurable during run-time in terms of Accuracy / Granularity, Scaling factors

(voltage scaling, frequency scaling), Overheads (e.g. state switching times and

cost)

RVSS2 Support for temporal decoupling between synchronisation points. The intended

temporal decoupling requires configuration hooks to identify synchronisation

points in VP model.

RVSS3 The simulation should generate trace files for Design Space Exploration in a
predefined trace format. For this purpose, a set of instrumentation APIs should be
predefined and used in the different platform component models as well as in the
HW and SW task descriptions.

RVSS4 The simulation should support trace file generation for SW tasks through an

interface to the underlying processing resource simulation model (instruction set

simulator or virtual processing resource)

RVSS5 The simulation should support configuration through the power controller by

providing an interface to the different platform component models for power

configuration

RVSS6 The instrumentation API supported by the simulation should provide interfaces

for static and dynamic power annotation. Static power annotation should be

possible per component process. Dynamic power annotation should be enabled

per basic block of the functionality of the component.


RVSS7 The instrumentation API provided by the simulation should allow for different

instrumentation values depending on run-time configuration parameters like

voltage and frequency changes.

RVSS8 Trace file generation shall be controllable at run-time, both with respect to the
beginning and end points of the trace generation and with respect to which
instrumentation APIs are enabled during the simulation.
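To make requirements RVSS6 to RVSS8 more concrete, the following sketch shows one possible shape of such an instrumentation interface. It is not the COMPLEX API: the `PowerTracer` class, its methods and the channel names are assumptions, and the dynamic energy of a basic block is approximated with the textbook CMOS relation E ≈ C·V², so that the annotated value changes when the supply voltage is scaled.

```python
class PowerTracer:
    """Toy instrumentation interface; names and structure are assumptions."""

    def __init__(self, enabled_channels):
        self.enabled_channels = set(enabled_channels)  # which annotations to keep
        self.active = False                            # run-time start/stop switch
        self.records = []

    def start(self):
        self.active = True

    def stop(self):
        self.active = False

    def annotate(self, time_ns, component, channel, value):
        """Record a sample only while tracing is active and the channel is enabled."""
        if self.active and channel in self.enabled_channels:
            self.records.append((time_ns, component, channel, value))


def dynamic_energy_nj(switched_capacitance_nf, voltage_v):
    """Approximate energy of one basic-block execution: E = C * V^2 (nF * V^2 = nJ)."""
    return switched_capacitance_nf * voltage_v ** 2


if __name__ == "__main__":
    tracer = PowerTracer(enabled_channels={"dynamic_energy_nj"})
    tracer.start()
    # The same basic block annotated at two supply voltages.
    for t, volts in ((10, 1.0), (20, 0.5)):
        tracer.annotate(t, "Top.HW0", "dynamic_energy_nj", dynamic_energy_nj(2.0, volts))
    tracer.annotate(30, "Top.HW0", "temperature_c", 55.0)  # channel not enabled: dropped
    tracer.stop()
    print(tracer.records)
```

In a SystemC model the time stamp would come from the simulation kernel rather than being passed in explicitly; it is a plain argument here to keep the sketch self-contained.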

Input required | Required by tool | Input provided by
SystemC code and build scripts for virtual platform | SystemC simulator | Virtual System Generator
Initial configuration settings for the different platform components | SystemC simulator | Virtual System Generator
Instrumentation setup | SystemC simulator | User or Design Space Exploration

Output provided | Provided by tool | Output required by
Trace files | Simulator | Design Space Exploration




2.5 Exploration and Optimization

Design Space Exploration (DSE) is a central phase in the design of novel computing
platforms. For a given system specification there may be many different design alternatives
that need to be evaluated and compared in order to decide which one to implement. Design
alternatives may consist of tuning and allocating hardware components, different mappings
of software tasks to resources, different scheduling policies implemented on shared
resources, functional modifications, memory assignment, as well as lower-level design
parameters such as clock frequency or bus/network width and topology.

DSE should involve the analysis of multiple criteria, since each design alternative usually
represents a trade-off among different optimization goals. For instance, high-performance
processors are usually more expensive in terms of area and power consumption than
low-performance processors. So far, most design optimization methodologies consider only
a single cost aspect, e.g., energy, speed or memory footprint. However, optimizing one cost
aspect often makes the others worse, by an unpredictable amount.
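When several criteria are considered at once, design alternatives are commonly compared through Pareto dominance: an alternative is kept only if no other alternative is at least as good in every objective and strictly better in one. The sketch below illustrates this standard notion for two objectives that are both minimized; the design points are invented for illustration.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Return the points that no other point dominates, in input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# (energy in mJ, execution time in ms); values are invented for illustration.
DESIGNS = [(10.0, 50.0), (12.0, 30.0), (20.0, 20.0), (15.0, 40.0), (25.0, 25.0)]

if __name__ == "__main__":
    print(pareto_front(DESIGNS))
```

Here (15.0, 40.0) is dominated by (12.0, 30.0) and (25.0, 25.0) by (20.0, 20.0); the remaining three points are the trade-offs among which a designer would still have to choose.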

The DSE step is the part of the design flow that creates a feedback loop between
performance estimation and the parameter configuration of the target system. In the
COMPLEX flow, the DSE loop can act at different levels, each with a different set of
parameters.

At the MDA design entry and Executable Specification levels, the exploration loop can
explore the design space in terms of:

• Functional Reimplementation
• Mapping of HW/SW tasks
• IP Selection and Configurations
• Memory Configurations

At the Estimation and Model Generation level, on the other hand, the explorable parameters
are the following:

• IP Configuration
• Memory Configuration
• Custom Hardware Synthesis constraints
• Selection of Embedded SW optimization
• RunTime Management strategies




2.5.1 Simulation Traces and Analysis

2.5.1.1 Goals

The main goal of this part of the flow is to extract the activity- and power-relevant data of
the different platform parts and to pre-process these data so that they can either be presented
graphically to the designer through a visualization/reporting tool or be used by an automatic
exploration and optimization tool. The visualization engine takes the power and activity data
prepared by the analysis tool and presents this information to the designer.


The reporting and visualisation support should comply with the following requirements,
enumerated as RDVi:

RDV1 It should be able to present the simulation trace graphically (e.g. using known

viewers on top of well established trace formats like VCD).

RDV2 It should be able to maintain the design hierarchy in the reporting (represented
through a hierarchical naming convention, such as
TopModule.SubModule1.SubModule2.Port).

RDV3 It should be able to trace the activity of each module of the platform in order to

understand in detail the system behaviours. Core activity, power consumption for

each core, memory and interconnection utilization, are possible examples.

RDV4 It should be able to generate compact metrics to evaluate the generated instance of

the design space.
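Requirements RDV2 and RDV4 can be read together: a trace that keeps hierarchical names can be folded into compact per-module metrics. The following sketch assumes a trace reduced to (hierarchical name, energy) pairs with "." as the separator; the trace layout, the `energy_by_module` helper and the numbers are assumptions made for illustration only.

```python
from collections import defaultdict


def energy_by_module(samples, depth):
    """Sum energy per hierarchy prefix of the given depth.

    samples: iterable of (hierarchical_name, energy_uj) pairs, with names
    such as "Top.Cpu0.Cache" using "." as the hierarchy separator.
    """
    totals = defaultdict(float)
    for name, energy_uj in samples:
        prefix = ".".join(name.split(".")[:depth])
        totals[prefix] += energy_uj
    return dict(totals)


# Invented trace samples.
TRACE = [
    ("Top.Cpu0.Core", 4.0),
    ("Top.Cpu0.Cache", 1.5),
    ("Top.Bus", 0.5),
    ("Top.Cpu0.Core", 2.0),
]

if __name__ == "__main__":
    print(energy_by_module(TRACE, depth=2))  # one total per second-level module
    print(energy_by_module(TRACE, depth=1))  # a single total for the whole platform
```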

Input required | Required by tool | Input provided by
Activity traces for each module of the platform including hierarchy | COWARE Virtual Platform tools or external graphical tools (e.g., GTKWave) | Simulation Platform


Output provided | Provided by tool | Output required by
Memory access traces including timing and addresses required | Simulation Platform | Exploration and Optimization
Compact metrics evaluation reported following the DSE-XML interface defined in Section 2.5.3 | Simulation Platform | Exploration and Optimization
Graphical visualization of the HW/SW platform activity | COWARE or external graphical tools (e.g., GTKWave) | User




2.5.2 Design Space Exploration

2.5.2.1 Goals

Starting from the definition of the design space, the DSE step iteratively generates an instance

of the design space to be given as input to the model generation phase. The simulation phase

uses the generated model to estimate the performance values, and to give feedback to the DSE

step for the generation of the next design space instance.
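The feedback loop described above can be sketched as follows. The sketch is deliberately simple: `evaluate` is a stand-in for model generation plus simulation with invented formulas, and the next instance is chosen by a greedy neighbourhood search, which is an assumption made for illustration and not the heuristic used by the COMPLEX exploration tool.

```python
def neighbours(index, sizes):
    """Yield index tuples that differ from `index` by one step in one parameter."""
    for i, size in enumerate(sizes):
        for step in (-1, 1):
            j = index[i] + step
            if 0 <= j < size:
                yield index[:i] + (j,) + index[i + 1:]


def evaluate(config):
    """Stand-in for model generation and simulation; formulas are invented."""
    exec_time_ms = 1000.0 / config["freq_mhz"] + 64.0 / config["cache_kb"]
    power_mw = 0.5 * config["freq_mhz"] + 2.0 * config["cache_kb"]
    return {"exec_time_ms": exec_time_ms, "power_mw": power_mw}


def energy_uj(metrics):
    return metrics["exec_time_ms"] * metrics["power_mw"]


def explore(design_space, evaluate, cost):
    """Greedy loop: simulate the neighbours of the current instance, move to the best."""
    names = sorted(design_space)
    sizes = [len(design_space[n]) for n in names]

    def instance(index):
        return {n: design_space[n][k] for n, k in zip(names, index)}

    current = tuple(0 for _ in names)
    current_cost = cost(evaluate(instance(current)))
    while True:
        scored = [(cost(evaluate(instance(idx))), idx) for idx in neighbours(current, sizes)]
        better = [s for s in scored if s[0] < current_cost]
        if not better:
            return instance(current), current_cost  # no neighbour improves: stop
        current_cost, current = min(better)         # feedback selects the next instance


DESIGN_SPACE = {"cache_kb": [8, 16, 32], "freq_mhz": [100, 200, 400]}

if __name__ == "__main__":
    print(explore(DESIGN_SPACE, evaluate, energy_uj))
```

A greedy search can stop in a local optimum; it is used here only because it makes the "simulate, then decide the next instance" structure of the loop visible in a few lines.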

This step is needed to navigate the design space (by changing the system parameters) in
order to find the optimal system configurations among all the possible alternatives.
Moreover, the design space exploration loop is also used to gain knowledge about the
system parameters (such as main effects and interaction effects) and about the design space
(such as the distribution of configurations with respect to system performance). This phase
can be carried out by a user-centric DSE or by an automatic DSE.
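As an illustration of what a main effect is, the sketch below uses the usual design-of-experiments definition: the mean of a metric over all runs at one level of a parameter, minus the overall mean. The run records and the `main_effects` helper are invented for illustration and do not describe the analysis implemented by any COMPLEX tool.

```python
from collections import defaultdict
from statistics import mean


def main_effects(runs, parameter, metric):
    """Mean of `metric` at each level of `parameter`, minus the overall mean."""
    overall = mean(r[metric] for r in runs)
    by_level = defaultdict(list)
    for r in runs:
        by_level[r[parameter]].append(r[metric])
    return {level: mean(values) - overall for level, values in by_level.items()}


# Invented results of four simulated design space instances.
RUNS = [
    {"cache_kb": 8, "freq_mhz": 100, "exec_time_ms": 18.0},
    {"cache_kb": 8, "freq_mhz": 200, "exec_time_ms": 13.0},
    {"cache_kb": 16, "freq_mhz": 100, "exec_time_ms": 14.0},
    {"cache_kb": 16, "freq_mhz": 200, "exec_time_ms": 9.0},
]

if __name__ == "__main__":
    print(main_effects(RUNS, "cache_kb", "exec_time_ms"))
    print(main_effects(RUNS, "freq_mhz", "exec_time_ms"))
```

In this invented data set the frequency has a larger main effect on execution time than the cache size, which is the kind of knowledge that can help a designer decide which parameters deserve a finer exploration.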

The goal of using an automatic design space exploration and optimization tool is that it can
interact automatically with the system models, so that, once the target problem is formally
defined, the designer does not need to intervene in the DSE phase (except for the analysis of
the results).

On the other hand, a user-centric DSE flow is needed when a detailed analysis of the system
behaviour is necessary (e.g. trace analysis or timing behaviour), when the problem cannot be
formally defined or is not easy to define, or when the automatic modification of the
parameters of the system model is not possible or requires a larger modelling effort.


Summarizing, on the one hand the user can constrain the overall platform selection, deduce

further constraints on HW/SW separation, or identify power-consuming implementations and

replace them with power-efficient ones. On the other hand, the automatic exploration and

optimization tool is based on multi-objective optimization heuristics to efficiently navigate a

parametric version of the overall design space defined formally.


The DSE step should be able to generate a new design space instance once it has collected
information from previously executed instances of the design space. Moreover, it should
also be able to read user-defined constraints and to report violations.

To this end, requirements have been identified on the interfaces with the other steps of the
flow. They differ depending on whether the optimization and exploration loop is carried out
by the user or by an automatic design space exploration framework.

User Requirements, enumerated as RDUi:

RDU1 The design space that can be explored by the user should be defined.

RDU2 The system parameters that the user is supposed to be able to modify should be
easily accessible. Moreover, the user effort to explore the design space should be
as low as possible (e.g. large modifications to parts of the code should be avoided
at this step).

RDU3 Modifications to system parameters should be traceable in the executable model.

RDU4 The user should easily be able to keep track of the explored configurations and
their performance.

RDU5 The incoming reports or the visualization tool should be able to support the
designer in his/her decisions.

Automatic Exploration & Optimization Tool requirements, enumerated as RDAi:

RDA1 The target system should be configurable in terms of a discrete set of parameters.

RDA2 A suitable executable system model should be provided.

RDA3 It should be possible to automatically start/stop/control the simulation.

RDA4 It should be capable of instantiating a certain design instance with parameters from
the design-space definition.

RDA5 It should receive compact metrics in order to calculate the next design instance to
be simulated.

All these requirements are summarized as compliance with the interface specification
defined in [18], where a DSE-XML format is used to model, parametrically and formally,
the complete design space of the HW/SW platform.




Input required | Required by tool | Input provided by
Memory access traces including timing and addresses required | Memories MCO tool | Simulation Trace analysis or Simulation Platform
Formal design space definition as defined in the DSE-XML interface (see section 2.5.3) | MOST | Use Case Provider
Evaluation metrics for a particular design space instance as defined in the DSE-XML interface (see section 2.5.3) | MOST | Simulation Trace analysis or Simulation Platform
Parametrical version of the target HW/SW platform | MOST | Use Case Provider
Executable version of the target HW/SW platform | MOST | COMPLEX Estimation Flow

Output provided | Provided by tool | Output required by
Design space instance definition as described in the DSE-XML interface (see section 2.5.3) | MOST | COMPLEX Estimation Flow
Optimal HW/SW platform configuration in the design space and design space analysis | MOST | User
Optimal memory configuration | Memories MCO tool | User




2.5.3 DSE-XML Interface

The interaction between user agents and the exploration framework is shown in the following

figure:

Essentially, two kinds of user agents are assumed to interact with the exploration framework:

• Use case and simulator provider. This is the provider of the use case and its associated
simulator. The provider is responsible for releasing the combined package of the system
simulation model (or an automatic flow to generate it) and the target application running
on it.

• Exploration architect. This is the user who is responsible for identifying the optimal
configuration of the architecture underlying the use case.

A use case is defined as the combination of the target architecture and the application running

on it. The simulator is the executable model of the use case and it is a single executable file

(binary or script), which interacts with the design space exploration tool to provide the value

of the estimated metrics, given an input configuration. In literature, the simulator is also

referred to as the solver.

We define the interface between the Design Space Exploration Tool and the exploration
architect as the human-computer-interaction interface. This interface can be GUI (Graphical
User Interface)-based or command-line-based, and it is used for specifying and solving the
exploration problem in terms of optimization metrics and constraints.

The goal of the DSE-XML interface is to address the interaction between the simulator and
the design space exploration tools, which is essentially an automatic program-to-program
interaction. In general, the interaction can be described as follows:

1. The design space exploration tool generates one feasible system configuration whose

system metrics should be estimated by the simulator.

2. The simulator generates a set of system metrics to be passed back to the design space

exploration tool.

The specification of the formats of the input/output data to/from the simulator is defined as

the explorer/simulator interface.




In order to link the use case and the simulator to the design space exploration tool, a design
space definition file should be released by the use case and simulator provider together with
the executable model of the use case (simulator). This file describes the set of configurable
parameters of the simulator, their value ranges and the set of evaluation metrics that can be
estimated by the simulator. It also describes how to invoke the simulator, as well as an
optional set of rules with which the generated parameter values should comply. The rules
are used by the exploration tool only to avoid generating invalid or unfeasible solutions
during the automated exploration process.

The DSE-XML specification [18] provides an XML based grammar for writing both the

design space definition file and the simulator interface files.
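The role of the parameters, metrics and rules in a design space definition can be illustrated with a small in-memory example. This sketch is not the DSE-XML grammar of [18]: the dictionary layout, the parameter names and the single rule are assumptions chosen only to show how rules keep an explorer from generating unfeasible configurations.

```python
import itertools

# Hypothetical in-memory counterpart of a design space definition file.
# All names and values are illustrative; this is NOT the DSE-XML format.
DESIGN_SPACE = {
    "parameters": {
        "num_cores": [1, 2, 4],
        "l2_size_kb": [0, 128, 256],
    },
    "metrics": ["energy_mj", "exec_time_ms"],
    "rules": [
        # Invented rule: a single core may not be paired with the largest L2.
        lambda c: not (c["num_cores"] == 1 and c["l2_size_kb"] == 256),
    ],
}


def feasible_configurations(space):
    """Enumerate every parameter combination that satisfies all rules."""
    names = list(space["parameters"])
    for values in itertools.product(*(space["parameters"][n] for n in names)):
        config = dict(zip(names, values))
        if all(rule(config) for rule in space["rules"]):
            yield config


if __name__ == "__main__":
    for config in feasible_configurations(DESIGN_SPACE):
        print(config)
```

Of the nine combinations of the two parameters, the rule removes one, so an explorer built on this definition would only ever pass eight configurations to the simulator.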




3 Application of COMPLEX Design Flow to Use Cases

As described in deliverable D1.1.1 (Definition of requirements, industrial use-cases and
evaluation strategy), during the first months of the project the consortium defined three
industrial use cases to exercise the COMPLEX design flow.

The three identified use cases are the following:

• Use case 1 – Embedded distributed system – a wireless sensor network platform for
elderly people monitoring.

• Use case 2 – Surveillance system – a multiprocessor platform for audio and video
processing applications targeting an audio-based surveillance system.

• Use case 3 – Space domain system – a near-real-time data processor for object survey,
tracking and imaging for supporting Space Situational Awareness (SSA).

This chapter outlines the application of the COMPLEX design flow to the defined use cases.
Starting from the flow coverage figures already presented in deliverable D1.1.1, each of the
following sections gives more detail on the part of the tool chain adopted to design the
target use case.




3.1 USE CASE 1 – Tool chain Description

3.1.1 Description of the use-case

The candidate application used in Use Case 1 to validate the COMPLEX design flow for a
Wireless Sensor Network platform comes from the health care domain. It is a virtual
machine oriented to data processing in body sensor networks. A node of this application is
able to perform some computations (such as max, min, median, etc.) on data sets collected
from sensors. The parameters of these computations (called “features”), such as sampling
rate, window, shift of data set, etc., can be tuned, and the features can be activated or
deactivated depending on the application demands. Parameter tuning and
activation/deactivation of features can be done at run time. One prominent application of
this virtual machine will be to detect body movements, for example to monitor the health
state of an elderly patient.

The target HW platform that will be used for this application is based on the ReISC SoC,
which is a project developed in STMicroelectronics; the chip, in 90 nm technology, was
taped out at the end of 2009.

3.1.2 Flow Coverage

The figure below (Figure 5) represents the flow coverage of use-case 1.

Figure 5: Flow Coverage of use-case 1




The part of the COMPLEX flow that this use case is able to exercise is mainly related to the
SW stack, since the target HW platform already exists and modifications on the HW side can
be made only to the memory subsystem and its usage.

The flow is iterative in the sense that the DSE loop will be used to tune the memory
subsystem, as stated above, and to address power management issues.

The following figure shows a more detailed view of the design flow used for the Use Case 1.

Figure 6: Detailed flow of use case 1

As shown in Figure 6, the entry point of this use case is a Stateflow description of the target
application. The HIFSuite tool generates a C/SystemC description of the application starting
from its Stateflow specification. The C/SystemC code will be used on top of the SW stack of
the target platform. At this point, without executing the entire SW stack on the target
platform, the SW estimation tool generates a first evaluation of the SW code from the
power/performance point of view, allowing the designer to modify or analyze the SW
application with only a small time effort.

The platform side is mainly based on the REISC SoC TLM model, enhanced with a
configurable memory subsystem that is the main objective of the DSE phase. The HIFSuite
tool is also used to generate some components of the TLM model starting from their RTL
description.

Due to the distributed nature of the target application, this use case will also enable the
evaluation of the runtime management of the platform in a multi-node environment. For this
purpose, the SystemC Network Simulation Library (SCNSL) is used to describe a wireless
network scenario in which the REISC SoC node is connected to other, less detailed nodes
which either receive packets or produce interfering traffic on the radio channel.





3.2.1 Description of the use-case

This use-case is an audio-driven surveillance system integrating audio and video acquisition

modules. The audio part acts as a continuous monitor that activates the video acquisition if an

intrusion has been detected in the surroundings of the device. This offers the possibility to

disconnect several parts of the design when not active and save power.

The application is divided into three main steps:

Step 1: Audio Activity Detection
The goal of this step is to detect whether there is any activity in the surroundings of the
device. Only one microphone is active during this phase of the process. If activity is
detected, the system switches to step 2; otherwise it remains in observing mode.

Step 2: Features extraction and/or classification and/or localization and/or abnormal event
detection
In this step, audio activity has been detected and the system analyses the audio information.
Four microphones are active, and the objective is to extract relevant information from the
audio stream and to detect whether there is a threat in the surroundings of the device, in
which case the system switches to step 3.
The first prototype of the system will only implement features extraction and abnormal
event detection, in order to focus on tools usage. Then, if no blocking issues are raised, the
classification and localization algorithms can be implemented.

Step 3: Image acquisition and/or transmission
The last step consists in acquiring video in order to observe the threat. The camera is
activated and an algorithm selects relevant images. Only key images are selected and
transmitted, in order to save power and bandwidth.

3.2.2 Flow Coverage

The figure below (Figure 7) represents the flow coverage of use-case 2.

General view of the flow:

The flow is iterative, which means that the architecture of the system can be further refined
after simulation. The DSE tool, for example, can run several simulations and find the best
compromise among a set of parameters that, in this particular use case, range from the
cache/memory configuration to the selection of the best HW/SW architecture in terms of
number and types of HW accelerators.




Figure 7: Flow Coverage of use-case 2

Detailed flow:

As detailed in Figure 8, the flow is composed as follows:

• The target application is described as a set of SystemC tasks that express the inner
parallelism of the application.

• The SMOG HW/SW separation tool processes the application and the constraints, and
generates specialized tasks targeting hardware or software implementation.

• Hardware tasks are then further processed by the PowerOpt tool, which adds timing and
power annotations. This tool also generates a hardware power controller that is capable of
changing the functioning mode of the hardware module and adapting its power
consumption.

• On the software side, tasks are processed by the “SW estimation” tool, which also adds
timing and power information to the tasks.

• At this step of the design, a first version of the virtual platform based on SystemC TLM 2.0
can be generated and simulated on the host machine in order to get a first idea of timings
and power consumption.

• One step further in the design flow consists in the generation of a more detailed SystemC
TLM 2.0 environment based on the REISC SoC platform, where SW tasks are mapped
directly onto the ISS, while the HW modules are mapped onto the physical platform. At
this level of detail, a resource manager is also considered, which interacts with the HW
modules and the processor in order to reduce the power consumed by the platform.

• The DSE tool is then capable of running several simulations and testing several
configurations in order to extract functioning points that present the best compromise
among performance, power and area.

Figure 8: Detailed flow of use case 2




Run-time:

A second power optimization level consists in optimizing system parameters at run-time,
which means adapting the platform and application structures during execution in order to
further reduce power consumption. This is handled by the GRM (Global Resource Manager),
which acts as a middleware between the application and the platform. Among other
functionalities, the GRM can vary the frequency of processors, switch power islands on and
off, select power modes of hardware modules, or switch between different qualities of
service proposed by the application. The figure below represents the system, composed of
three power islands and one main processor, called the host, that manages the execution of
the system.





3.3.1 Short description of the use case

The use case is a space domain application: an object survey, tracking and imaging system
for the Space Situational Awareness (SSA) market. Two different but overlapping
definitions are needed to understand what SSA means:

Space surveillance: it is defined as the routine, operational service of detection, correlation,

characterization, and orbit determination of space objects.

SSA can be preliminarily defined as a comprehensive knowledge of the population of space

objects, of existing threats/risks, and of the space environment.

The proposed use case will centre on the space segment of the SSA system, on-board

applications that provide surveillance and situational awareness data to the ground segment

for its correlation with other ground-based sources.

The following list describes briefly the main functions of this application:

Video Capture: this function digitalizes the analogue image provided by the optical device
installed in the satellite. The digitalization is performed with a suitable periodicity so that
the space-based application can process the information and report possible space events to
the ground segment.

Image processing: this module is responsible for the image pre-processing functions that
precede detection, tracking and hazard estimation. These functions deal with image
resolution reduction, region selection and image filtering.

Image Reporting Unit: this module is responsible for forwarding the pre-processed images
to the Object Survey function for their analysis via a high-speed bus.

Object Survey: this first functional block detects the Resident Space Objects (RSO) against

the space background, determines the object nature, and discards those shapes that are out of

the scope of the SSA analysis. Once a pattern is recognized, information about the estimated

position with regard to the satellite attitude and position is passed to the Object Tracking and

Hazard Estimation functions.

Object Tracking: once an RSO is detected, its position and velocity need to be

determined and an orbit calculated. An RSO with a solved orbit can be periodically revisited

and its orbit updated. This functional block is only executed if the detection of RSO is

triggered.

Hazard Estimation: for potentially hostile RSO, manoeuvres can be detected and threat

potential can be assessed if the RSO position and velocity are observed with sufficient

accuracy. This functional block is only executed if the detection of RSO is triggered.

Routing Unit: if a potentially hostile RSO is detected, and its orbit interferes with the spacecraft

orbit, the satellite operator can be notified whenever the RSO is manoeuvring onto dangerous

trajectories with respect to the spacecraft or high value systems. This functional block will

provide information about the detected RSO to the ground segment for its correlation with




other existing data and human decision-making. This functional block is only executed if the

detection of RSO is triggered.

Star tracker: a star tracker is an optical device that measures the position(s) of star(s) so that it

is able to determine the spacecraft attitude with respect to a celestial reference system. The

main astrometric measurable magnitudes are position (celestial coordinates of right ascension

and declination), motion (Newtonian motion through space) and parallax (annual motion of a

star directly related to its distance from the Earth).

GPS receiver: a GPS receiver provides the geo-location and time information to the satellite

to support SSA observations. Whilst this kind of sensor is usually hardware-based, this one

will be software-based and will form part of the use case evaluation. All signal processing

(working on raw signal samples) intended to obtain the measurements from the satellites, as

well as position and time data, is performed by software tasks, instead of the conventional

hardware approach.

For the purpose of the COMPLEX project, the star tracker sensor will be simulated, as well as the

camera device and video capturing process. However, the GPS receiver will only be simulated

in terms of raw data input, so that signal processing will be performed by software tasks.

Regarding the target platform for this use case, a mixed HW/SW architecture is proposed in

order to exercise the performance estimation and HW/SW partitioning process prior to its

implementation onto platform resources.

According to the application specification, the platform should make it possible to:

Acquire images from the camera device.

Perform simple imaging filtering, resolution change and compression.

Provide access to high-speed buses (i.e. Spacewire and RS-422 serial buses)

Provide computational resources based on ARM processors for the execution of the SW

components.

The following picture shows the suggested platform architecture for Use Case 3.

[Block diagram. Labels recovered from the figure: FPGA (SpaceWire port, camera interface, SDRAM1, Flash ROM1); ARM processor (SpaceWire port, RF-system interface, SDRAM2, Flash ROM2, RS-422 Port1 for the GPS receiver, RS-422 Port2 for the star tracker); computing platform (DSP, ARM processor) with SDRAM3, Flash ROM3, RS-422 Port1 for the GPS receiver and the GPS RF front-end IF; serial lines between the blocks.]

Figure 9: HW platform for the space use case.

3.3.2 Definition of the design space

The proposed use case and the proposed platform present some fixed aspects. However, many

other aspects are open, and certainly require a design space exploration in order to find the

best configuration in terms of performance (assuming, of course, a correct functional

behaviour of the SSA). Such a design space is defined by the following open aspects:




Some functionality can be executed either in SW or in HW (e.g., Image

processing, or at least some of its sub-functionalities, could be done in SW by integrating an

MPSoC in the FPGA).

Functionalities implemented as SW can either be executed by the same processor or by

different processors.

Number of processors (if any) in FPGA

Sizes of RAM and ROM memories

Voltage levels and working Frequencies

The consideration of alternative communication links (currently RS-422 serial links are

assumed).

Complementarily, there are also some constraints on the DSE which can be stated in advance and help to

conveniently reduce the design space. Specifically, the GPS receiver functionality is defined

as SW and it makes no sense to execute it outside the ARM processor tied to the RF front end.

Although moving the GPS receiver functionality to other execution resources of the platform would

be feasible, constraining the exploration by assuming a fixed allocation of such functionality

will enable a more focused and efficient exploration.
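
As a rough illustration of how the open aspects and the fixed GPS allocation could be written down, the following Python sketch enumerates a small design space. All parameter names, value ranges and the feasibility rule are hypothetical; this document lists the open aspects but does not give concrete values or a data format.

```python
from itertools import product

# Hypothetical parameter ranges, chosen only for illustration.
DESIGN_SPACE = {
    "image_processing": ["HW", "SW"],      # HW/SW implementation choice
    "fpga_processors": [0, 1, 2],          # number of processors in the FPGA
    "ram_kib": [256, 512],                 # RAM size
    "frequency_mhz": [100, 200],           # working frequency
    "gps_receiver": ["arm_rf_frontend"],   # fixed allocation (DSE constraint)
}


def is_feasible(point):
    """Assumed rule: SW image processing in the FPGA needs an FPGA processor."""
    if point["image_processing"] == "SW" and point["fpga_processors"] == 0:
        return False
    return True


def enumerate_points(space):
    """Yield every feasible combination of parameter values."""
    names = list(space)
    for values in product(*(space[n] for n in names)):
        point = dict(zip(names, values))
        if is_feasible(point):
            yield point


points = list(enumerate_points(DESIGN_SPACE))
```

Giving the GPS receiver a single allowed value is one simple way to express a fixed allocation: the parameter stays visible in every design point but no longer multiplies the number of points to explore.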

3.3.3 Description of the coverage

Figure 10: COMPLEX flow Coverage of Use Case 3.




Figure 10 shows the COMPLEX flow coverage of the Use Case 3 (UC3 in short). UC3

validates the upper part of the COMPLEX flow, specifically, the MARTE entry and code

generation.

In Figure 11, a detail of the Use Case 3 flow is given. A UML/MARTE specification which

includes the description of the SSA functions, including a previous and orthogonal

specification of data types and functional interfaces, and the concurrency structure of the

system will define the Platform Independent Model (PIM). The use cases of the SSA and the

non-functional constraints will be captured.

The description of the platform (PDM) and the allocation of system functionalities

(application components) to platform resources to produce a platform specific model (PSM)

will also be performed using the developed UML/MARTE methodology. Constraints on

the design space exploration, including fixed allocations (e.g., the GPS functionality tied to

the processor associated to the RF front-end) will be captured.

Additional information can complete the PIM in one of two alternative ways. A first

possibility could be to add time information, without specific functionality. This would enable

the generation of models supporting static real-time analyses which do not require the actual

functionality (a typical case is schedulability analysis). This type of modeling and analysis is

well known and performed at early phases of design in the space domain.

[Flow diagram. Labels recovered from the figure: Model-Driven Architecture (MDA) Entry with PDM, PIM and PSM; SC-PDM, XML SysD, XML DS and CFAM as inputs to SCoPE+; DSE loop between SCoPE+ (Metrics_1, Metrics_i) and MOST (Conf.); after DSE, optimum metrics and final platform; MDS Studio; manual refinement (after DSE) producing .bin(s); Virtual Platform ISS producing Metrics_2.]

XML DS: XML Design Space file

XML SysD: XML System Description file

CFAM: Concurrent Functional Application Model

SC-PDM: SystemC Platform Description Model

.bin: binary executable of the application

Figure 11: Detail of the design flow in the Use Case 3.

Use Case 3 will explore a complementary and subsequent analysis possibility, available once the

functional C/C++ code exists. It is of interest both for enabling a functional

validation of the system and for early performance estimation relying on such code.

Here is where the native source analysis technology of SCoPE+ will be useful, since SCoPE+

will be able to automatically extract the associated performance figures from the actual C/C++

code, considering the allocation of the application components to the target platform

resources.

It should be noted that the proposed flow in Figure 10 covers both SW and HW estimations,

since the SCoPE+ framework will also be equipped with a HW estimation technology (in T2.4).

As can be seen in Figure 9, there is an FPGA in the proposed platform, which, as mentioned in

section 3.3.2, widens the design space, since there is the possibility to move some SSA

functions either to software or to hardware.

The SCoPE+ framework (developed in T2.2 and T2.4 of COMPLEX) will serve for the

generation of a PSM SystemC based executable model. It will take as inputs:

The Concurrent Functional Application Model (CFAM), coded under the CFAM API, plus the

actual C/C++ action code associated to each function. This CFAM description of the

application will have a mapping to a SystemC PIM, which can feed the rest of the

COMPLEX flow and tools.

The System Description (SysD), including the platform architecture and the allocation of

SW tasks to HW resources, in XML format (XMLD).

These two inputs (CFAM and XMLD) in turn will be previously generated through the

corresponding generators developed in T2.1 of COMPLEX.

Use Case 3 also covers the usage of the exploration tool (MOST), since the different

configurations and runs of the SystemC executable will be driven by MOST, which will be

the tool supporting an efficient search of the design space.

Another innovative aspect in COMPLEX is that the XMLD file will actually be a Space of

System Descriptions. This means that the file will describe not only a specific platform and

allocation, but a set of feasible platforms and allocations. Therefore, the resulting executable

is actually a configurable executable, which reflects a set of PSMs instead of a single PSM.

This will enable a fast exploration cycle, driven by the MOST exploration tool and the

executable specification generated through SCoPE+, which will not need to be recompiled

between the different executions triggered by the different explored solutions.

Additional XML files can be used and supported as input by SCoPE+, in order to facilitate its

interfacing with the MOST exploration tool. For instance, an XML file will be used to state the

metrics the exploration tool is interested in.

Each new configuration explored by MOST corresponds to a different XML Configuration

file. Each XML configuration file, read by the executable specification before starting the

simulation, will complete the system definition, by providing specific values to platform

configurable parameters, and by defining a specific allocation (when a mapping space, instead

of a specific mapping, was described). The metrics produced by SCoPE+ (metrics_1 in Figure

11) will also have an XML format understood by MOST, which, after reading them, selects the

next point of the design space to be explored. The example will consider performance

estimations, including timing and power figures (as well as others, such as CPU load).

Performance figures will consider SW and HW estimations, since there will be allocations to

HW execution resources.
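
The exchange of XML configuration and metrics files described above can be sketched as follows. This is a minimal Python illustration under stated assumptions: the element names (`configuration`, `parameter`, `metrics`, `metric`), the cost model in `fake_simulator` and all numbers are made up, and they are not the actual SCoPE+ or MOST schemas. The loop also evaluates a fixed list of points, whereas a real explorer would pick each next point from the metrics it has read.

```python
import xml.etree.ElementTree as ET


def write_configuration(point):
    """Serialise one design point as an XML configuration string."""
    root = ET.Element("configuration")
    for name, value in point.items():
        ET.SubElement(root, "parameter", name=name, value=str(value))
    return ET.tostring(root, encoding="unicode")


def read_metrics(xml_text):
    """Parse a metrics XML string into a {name: float} dictionary."""
    root = ET.fromstring(xml_text)
    return {m.get("name"): float(m.get("value")) for m in root.iter("metric")}


def fake_simulator(config_xml):
    """Stand-in for the configurable executable, with a made-up cost model."""
    params = {p.get("name"): float(p.get("value"))
              for p in ET.fromstring(config_xml).iter("parameter")}
    time_ms = 1000.0 / params["frequency_mhz"]
    power_mw = 0.5 * params["frequency_mhz"]
    root = ET.Element("metrics")
    ET.SubElement(root, "metric", name="time_ms", value=str(time_ms))
    ET.SubElement(root, "metric", name="power_mw", value=str(power_mw))
    return ET.tostring(root, encoding="unicode")


def explore(points, simulate):
    """Evaluate each point by passing a new configuration, without rebuilding."""
    results = []
    for point in points:
        metrics = read_metrics(simulate(write_configuration(point)))
        results.append((point, metrics))
    return results


results = explore([{"frequency_mhz": f} for f in (100, 200, 400)], fake_simulator)
```

The point of the sketch is the division of labour: the simulator is built once, and only the small configuration document changes between runs.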

The optimum solutions found by the MOST explorer will allow the designer to manually

update (if necessary) the UML/MARTE description, to make explicit the optimum PDM.

From this PDM, there will be an automatic generation of a corresponding IP-XACT




description, with the optimum hardware architecture found. This generation will be performed

with the MARTE2IPXACT generator developed in T2.1 of COMPLEX. The IP-XACT PDM

description generated will enable the integration of IP-XACT descriptions of platform IPs.

The Synopsys-CoWare Platform Creator tool will serve to validate the solution obtained in

the DSE loop. This platform, relying on an Instruction Set Simulator (ISS) technology, will

produce a more accurate estimation of performance, and will serve to validate the

performance results obtained with SCoPE+ (by comparing metrics_1 and metrics_2). Notice

that the ISS-based estimation technology is slower, which makes it less suitable for a DSE

loop. However, it will enable:

The assessment and validation of the performance results obtained through SCoPE+, and

the validity of the system-level DSE for taking early design decisions. For this, at least a

single design space point should be sufficient. Obtaining more points could serve to

confirm the results and even to find potential systematic deviations.

The assessment of the speed-up obtained with SCoPE+ for the estimation loop.
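
The two assessments listed above amount to simple ratios. The following Python sketch shows one possible way to compute them; the metric names and all numbers are invented for illustration and are not results of the project.

```python
def relative_deviation(estimate, reference):
    """Signed relative deviation of an estimate from the reference value."""
    return (estimate - reference) / reference


def compare_metrics(metrics_1, metrics_2):
    """Per-metric deviation of the fast estimates from the ISS-based figures."""
    return {name: relative_deviation(metrics_1[name], metrics_2[name])
            for name in metrics_1 if name in metrics_2}


def speed_up(iss_seconds, native_seconds):
    """How many times faster the native estimation ran than the ISS run."""
    return iss_seconds / native_seconds


# Made-up numbers, for illustration only.
metrics_1 = {"time_ms": 10.5, "power_mw": 48.0}   # fast source-level estimation
metrics_2 = {"time_ms": 10.0, "power_mw": 50.0}   # ISS-based reference
deviation = compare_metrics(metrics_1, metrics_2)
```

Repeating the comparison over several design points, as the text suggests, would show whether the deviations share a sign and magnitude, which is what a systematic deviation looks like.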

The CoWare Virtual Platform generation will take as input the IP-XACT file obtained by the

DSE loop, as well as the IP-XACT files of the involved IPs.

Manual refinement will be required in general to generate the SW application code from the

PSM. This refinement can be supported by some methodology, such as SWGen, and/or

partially by similar generation techniques as the ones used for code generation from

UML/MARTE. However, the development of a general, complete and efficient SW

generation methodology requires an effort which is out of the scope of COMPLEX.

Summarizing, the collaboration of

the UML/MARTE capture methodology developed in COMPLEX,

the generators developed in COMPLEX to produce the input to SCoPE+,

the SCoPE+ native source performance estimation framework, enabled in COMPLEX with

capabilities for abstract specification of the concurrent functional applications, and for

producing a highly configurable executable platform,

and the MOST exploration tool

will be used to quickly and efficiently find an optimum implementation of the SSA system.

Then, an IP-XACT description of the optimum hardware architecture for the execution of the

SSA will be generated. A manual generation of the firmware (maybe assisted by the SWGen

library or similar generation tools) will be done. It will serve as input to CoWare Virtual

Platform in order to provide a validation of the performance figures obtained through

SCoPE+.




4 COMPLEX Tool-set Overview

This chapter presents an overview of the full set of tools that will be used in the COMPLEX

project. The goal of this chapter is to build a reference for the next COMPLEX deliverables

and documents regarding the tool description.

In this chapter only a brief description of each tool is reported, while a more detailed overview

is given in Appendix B for interested readers.

To better capture the relation between the tools and the use cases presented before, the

following table presents the tool coverage for each of the COMPLEX use cases.

Use Case 1: Distributed System

Use Case 2: Surveillance System

Use Case 3: Space System

MOST DSE X X X

SCoPE+ X

UML/MARTE tool-chain X

SWAT X X

CoWare Virtual Platform X

PowerOpt X

HIFSuite X

Memories MCO Tool X X

IP-XACT tool-chain X

SMOG Tools X

IMEC GRM X

SystemC Network Simulation Library (SCNSL) X




4.1 MOST Tool Overview

The Multi-Objective System Tune (MOST) tool is a proprietary tool for discrete optimization

specifically designed for enabling design space exploration of hardware/software

architectures.

Multicube explorer (m3explorer.sourceforge.net) is the open source version used as proof of

concept for methods and techniques to be included in the proprietary version (MOST).

MOST is a design space exploration tool that helps drive the designer towards near-optimal

solutions to the architectural exploration problem. The final product of the framework is a

Pareto set of configurations within the design evaluation space of the given architecture and

an analysis of the effects of design space parameters on the objective functions.
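
To make the notion of a Pareto set concrete, the following Python sketch filters a list of evaluated configurations down to the non-dominated ones. It is a textbook dominance filter given for illustration, not the algorithm used inside MOST; the configuration names and objective values are made up, and all objectives are assumed to be minimised.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and better in one.

    Objectives are tuples of values to be minimised.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_set(evaluated):
    """Return the configurations whose objective vectors are not dominated."""
    return [
        (config, objs)
        for config, objs in evaluated
        if not any(dominates(other, objs) for _, other in evaluated)
    ]


# Made-up (time_ms, power_mw) values for four configurations.
evaluated = [
    ("A", (10.0, 50.0)),
    ("B", (5.0, 100.0)),
    ("C", (12.0, 60.0)),   # slower and more power-hungry than A
    ("D", (2.5, 200.0)),
]
front = pareto_set(evaluated)
```

Here configuration C is dropped because A is better in both objectives, while A, B and D each represent a different trade-off between execution time and power.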

One of the goals of MOST is to provide a command line interface to construct automated

exploration strategies. Those strategies are implemented by means of command scripts

interpreted by the tool without the need for manual intervention. This structure can easily

support the batch execution of complex strategies.

In the COMPLEX Design Flow the MOST tool will be used to automatically close some of

the feedback loops.




4.2 SCoPE+ Tool Overview

In COMPLEX, SCoPE+, a source-code-level performance estimation framework, will be in

charge of enabling the fast estimation of performance figures from the SystemC executable

specification of the PSM. This fast, but sufficiently accurate estimation is crucial for enabling

a practical Design Space Exploration (DSE) loop. Specifically, in COMPLEX, the fast

estimations of SCoPE+ will enable MOST, the exploration tool in the COMPLEX flow, to

look into more solutions in the design space than an ISS-based exploration could cover in

the same exploration time.

Although SCoPE+ seems to cover most of the estimation and model generation flow in the

COMPLEX framework, it does not create any redundancy, since it covers this area at a higher

abstraction level than the rest of the tools, for fast system estimation.




4.3 UML/MARTE Design Entry Tool Overview

The Papyrus MDT tool is selected as the model-driven entry tool for the development of the

MARTE Platform Independent Model (PIM), Platform Description Model (PDM) and

Platform Specific Model (PSM).

Papyrus provides a graphical editor for the Unified Modelling Language (UML 2), and also supports

the development of profile extensions to the UML meta-model. This last characteristic makes

the Papyrus editor quite interesting if the UML model needs to be extended with specific

concepts that are not available in the MARTE profile. In this sense, the UML Profile for

Modelling and Analysis of Real-Time and Embedded Systems (MARTE) is already available

and thus usable by the Papyrus user.

The MARTE profile supports the model-driven development of real time embedded systems

by adding relevant techniques to UML. The profile provides support for specification, design,

and verification of complex hardware/software systems and is intended to replace the existing

UML Profile for Schedulability, Performance and Time.

Another MARTE concern is to enable model-based analysis. Therefore the MARTE

profile intends to support existing analysis techniques. MARTE focuses on performance and

schedulability analysis.




4.4 SWAT: SW Estimation Tool Overview

The SWAT software estimation tool chain has the goal of providing early estimates of the

impact of software application on the overall performance of the system, mainly in terms of

execution time and power consumption.

The tool chain is composed of several cascaded modules that can be used independently, for

specific estimation needs, or automatically, for a complete estimation run. The main goal of

the toolset is that of providing estimations of different non-functional aspects by decoupling

the aspects related to the structure of the application (source code), the specificities of the

executor (architecture) and the data dependencies (profiling).
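
The decoupling of structure, executor and data dependencies can be pictured with a small calculation. The Python sketch below is only an assumed, simplified model: basic blocks carry an instruction mix (structure), each instruction class has a cost on a given architecture (executor), and profiling gives execution counts (data dependencies). The block names, instruction classes and costs are invented and do not describe how SWAT is actually implemented.

```python
# Structure: instruction-class counts per basic block (from source analysis).
BLOCKS = {
    "init": {"alu": 4, "mem": 2},
    "loop": {"alu": 6, "mem": 3, "branch": 1},
}
# Executor: cost per instruction class as (cycles, nanojoules); made-up values.
COSTS = {"alu": (1, 0.5), "mem": (3, 2.0), "branch": (2, 1.0)}
# Data dependencies: how often each block was executed (from profiling).
COUNTS = {"init": 1, "loop": 100}


def estimate(blocks, costs, counts):
    """Combine the three independent inputs into total cycles and energy."""
    cycles = 0
    energy = 0.0
    for name, mix in blocks.items():
        for cls, n in mix.items():
            c, e = costs[cls]
            cycles += counts[name] * n * c
            energy += counts[name] * n * e
    return cycles, energy


total_cycles, total_energy_nj = estimate(BLOCKS, COSTS, COUNTS)
```

Because the three inputs are separate tables, a different processor only requires a new `COSTS` table, and a different input data set only requires new `COUNTS`.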




4.5 Virtual Platform Tool Overview

The CoWare Virtual Platform tools are a simulation environment that supports multiple design

tasks, ranging from early development of embedded software for multicore systems to the

evaluation of the system performance in terms of software execution time, memory

bottlenecks or power consumption for various architectural variants or software mappings.

The Virtual Platform tools provide solutions for early software development as well as for

architecture exploration.




4.6 PowerOpt Tool Overview

ChipVision will provide its low-power behavioural synthesis tool PowerOpt. In the

COMPLEX framework, PowerOpt is used for the power estimation of custom hardware

blocks. After the behavioural synthesis of each entire HW task, monolithic basic blocks are

automatically identified. Power and delay figures are automatically inserted, according to

lower level simulation. The main tasks include:

Providing timing and power information for custom hardware components

Low-level power and data dependencies are modelled statistically

High-level power and timing are estimated by simulating the system

Mapping the recording of run-time and power traces to the tool API

PowerOpt automatically synthesizes power-efficient RTL architectures from the electronic

system level (ESL). With its analysis framework, the tool offers fast and accurate analysis

capabilities at the system-level using real activity data, thus enabling the users to explore

trade-offs between power, area, and timing. With these analysis capabilities, higher power

savings compared to traditional RT-based optimizations and compared to other high level

synthesis tools can be obtained (up to 75% compared to traditional, hand-crafted RTL).

Additional benefits are faster time to results and higher productivity as the user works with

compact system level code which integrates with system level simulation and modelling.




4.7 HIFSuite Tool Overview

HIFSuite is a set of tools and application programming interfaces (APIs) that provides support

for modelling and verification of HW/SW systems. The core of HIFSuite is the HDL

Intermediate Format (HIF) language upon which a set of front-end and back-end tools have

been developed to allow the conversion of HDL code into HIF code and vice-versa. HIFSuite

allows designers to manipulate and integrate heterogeneous components implemented by

using different hardware description languages (HDLs). Moreover, HIFSuite includes tools,

which rely on HIF APIs, for manipulating HIF descriptions in order to support code

abstraction and post-refinement verification.

HIFSuite plays two roles in COMPLEX Design Flow:

Generation of a C/SystemC description of a Stateflow specification of a SW application;

Abstraction from RTL to TLM of the models of HW components.




4.8 Memories Modelling, Characterization and Optimization (MCO) Tool Overview

The objective of this tool is to enhance the DSE framework by adding the

exploration of the memory architecture. While the DSE framework is able to explore the

memory dimension using conventional structural parameters (e.g., memory size, memory

width) or functional parameters (e.g., burst access), the memory tool provided by POLITO

will be able to explore alternative memory organizations, in particular, based on multi-banked

solutions specifically tailored to the application. This tool is meant as a plug-in to the DSE

engine.




4.9 IPXACT Tool-Chain Overview

The Magillem tool suite is used today in production design and verification flows of the

leading SoC integrators (STM, STE, NXP, TI, Qualcomm, etc.) and envisaged in advanced

ESL flows by system manufacturers (Thales, Astrium, Airbus, etc.). Magillem is used as a

framework for metadata management and thus can be considered as the backbone for the

interoperability of tools and models around a common description of IPs, systems and

subsystems based on a well accepted standard: IP-XACT (IEEE 1685).

In COMPLEX, Magillem will be used as the glue between the tools of the global

architecture exploration framework and for the implementation of automation engines.




4.10 SMOG Tool Overview

The Separated Model Generation tool is used to prepare the input to the HW and SW

estimation tools. SMOG takes information from four different sources. The executable

SystemC system specification contains the behaviour and the communication of the design.

The system input stimuli are used to test the implementation. User constraints on the HW/SW

mapping of particular components guide the DSO and finally the MARTE platform

description model defines details of the virtual platform that is used later in the COMPLEX

design flow.




4.11 IMEC Global Resource Manager Tool Overview


The first main part contains one power controller per HW component (generated by the tool

described in Section 4.6), which allows the HW component implementing the task to be set

into individual power modes, and provides an interface to the Global Resource Manager

(GRM) of the overall system.

The second main part is the GRM, optimizing the system parameters at run time, i.e. adapting

the hardware platform and the application configuration during execution in order to further

reduce the power consumption. The GRM acts as a middleware between the application and

the platform. Among other functionalities, the GRM can vary the frequency of processors,

power on and off power islands, select power modes of HW components, or switch between

different qualities of service proposed by the application.
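
As a toy illustration of the kind of run-time decision described above, the following Python sketch selects an operating point under a power budget. The policy (best quality of service that fits the budget, ties broken by lower power), the `OperatingPoint` fields and all values are assumptions made for this example; they do not describe the actual GRM algorithm or interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatingPoint:
    frequency_mhz: int
    quality: int        # higher is better (application quality of service)
    power_mw: float


def select_operating_point(points, power_budget_mw):
    """Pick the best quality that fits the budget; break ties on lower power."""
    feasible = [p for p in points if p.power_mw <= power_budget_mw]
    if not feasible:
        return None
    return max(feasible, key=lambda p: (p.quality, -p.power_mw))


# Made-up operating points combining a processor frequency and a QoS level.
POINTS = [
    OperatingPoint(100, 1, 40.0),
    OperatingPoint(200, 2, 90.0),
    OperatingPoint(200, 2, 80.0),
    OperatingPoint(400, 3, 210.0),
]

choice = select_operating_point(POINTS, power_budget_mw=100.0)
```

With a budget of 100 mW the sketch settles on the 200 MHz point that draws 80 mW; lowering the budget at run time would push the choice towards a lower frequency and quality level.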




4.12 SystemC Network Simulation Library (SCNSL) Overview

Next-generation networked embedded systems pose new challenges in the design and

simulation domains. System design choices may affect the network behaviour and Network

design choices may impact on the System design. For this reason, it is important - at the early

stages of the design flow - to model and simulate not only the system under design, but also

the heterogeneous networked environment in which it operates [19]. For this purpose, in

COMPLEX we exploit a modelling language traditionally used for System design - SystemC

- to build a System/Network simulator named SystemC Network Simulation Library

(SCNSL). This library makes it possible to model network scenarios in which different kinds of nodes,

or nodes described at different abstraction levels, interact together. As done by basic SystemC

for signals on the bus, SCNSL provides primitives to model packet transmission, reception,

contention on the channel and wireless path loss. The use of SystemC as underlying

technology has the advantage that HW, SW, and network can be jointly designed, validated

and refined. The advantages of SCNSL are:

simplicity: a single language/tool, i.e., SystemC, is used to model both the system (i.e.,

CPU, memory, peripherals) and the communication network;

efficiency: faster simulations can be performed since no external network simulator is

required;

re-use of SystemC IP blocks;

scalability: support of different abstraction levels in the design description;

openness: several tools available for SystemC can be exploited seamlessly;

extensibility: the use of standard SystemC and the source code availability guarantee the

extensibility of the library to meet design-specific constraints.




5 Summary

This document contains the current status of the required definition of application, stimuli and

platform specification, as well as the identification of tool interfaces for the overall

COMPLEX Design Flow. Since the underlying use-case requirements, implementation

constraints, and evaluation results are expected to have an impact on the overall interaction

between the different parts of the flow, this document is meant to evolve with the course of

the project along with the stabilisation of these definitions.

For this reason, the COMPLEX consortium has planned to release an update of this document

at M18 to cover possible changes in the design flow.




6 References

[1] "Description of Work". COMPLEX – COdesign and power Management in Platform-

based design-space EXploration, FP7-ICT-2009- 4 (247999), 2009.

[2] Ch. Ykman-Couvreur. "Exploration Framework for Run-Time Resource Management

of Embedded Multi-Core Platforms", IEEE Int. Conf. On Embedded Computer

Systems: Architectures, Modelling and Simulation, Samos, Greece, July 2010.

[3] J.W. Shipman, "Tkinter reference: a GUI for Python", New Mexico Tech Computer

Center, January 2009.

[4] The Spirit Consortium. IP-XACT v1.4: A Specification for XML meta-data and tool

interfaces. March 2008.

[5] E. Vaumorin, J. Stuyt, F. Kilic, "SPIRIT IP-XACT Extensions and Exploitation for

Verification Software Methodology"

[6] D. Calvo, H. Posadas, E. Villar. Automatic Generation of HW/SW SystemC Models

for System Simulation using IP-XACT. Submitted to VLSI SoC 2010.

[7] www.teisa.unican.es/scope

[8] http://www.teisa.unican.es/gim/en/scope/multicube.html

[9] http://www.multicube.eu/

[10] http://www.scalopes.eu/

[11] J.Castillo, D. Quijano. SCoPE v1.1.0 user manual. August, 2008.

[12] H. Posadas, E.Villar, USER MANUAL for M3P SCoPE Plug-in:XML Management

for DSE. V1.0.2. Authorized by E. Villar. December, 2008.

[13] H. Posadas, F. Herrera, P. Sánchez, E. Villar, F. Blasco. "System-Level Performance

Analysis in SystemC" In Proceedings of DATE'04. February, 2004.

[14] H. Posadas, E. Villar, Dominique Ragot, Marcos Martínez. "Early Modelling of

Linux-based RTOS Platforms in a SystemC Time-Approximate Co-Simulation

Environment". IEEE International Symposium on Object/Component/Service-

Oriented Real-Time Distributed Computing (ISORC'10). 2010-05.

[15] J. Castillo, H. Posadas, E. Villar, Marcos Martínez. "Fast Instruction Cache Modelling

for Approximate Timed HW/SW Co-Simulation ". 20th Great Lakes Symposium on

VLSI (GLSVLSI'10), Providence, USA. 2010-05.

[16] F. Fummi, D. Quaglia, F. Stefanni, A SystemC-based Framework for Modelling and

Simulation of Networked Embedded Systems, Proc. of "Forum on Specification and

Design Languages (FDL)", Stuttgart, Germany, Sep. 23-25, 2008

[17] www.teisa.unican.es/SWGen

[18] Specification of the XML interface between design tools and use cases R1.4.

www.multicube.eu

[19] N. Bombieri, F. Fummi, and D. Quaglia, "System/network design space exploration

based on TLM for networked embedded systems," ACM Transactions on Embedded

Computing Systems (TECS), vol. 9, no. 4, Mar. 2010.

[20] F. Fummi, D. Quaglia, F. Stefanni, A SystemC-based Framework for Modelling and

Simulation of Networked Embedded Systems, Proc. of "Forum on Specification and

Design Languages (FDL)", Stuttgart, Germany, Sep. 23-25, 2008.






A. COMPLEX Flow and Project Activities Overview

[Flow diagram mapping the COMPLEX design flow onto the project activities. Labels recovered from the figure:

Flow elements: use-cases; MARTE or StateFlow/Simulink PIM; user-constrained HW/SW separation & mapping; MARTE PDM (Platform Description Model); system specification in SystemC; system input stimuli; HW tasks (source analysis, quick synthesis, functional, power & timing model generation); SW tasks (X-compilation, source analysis, functional, power & timing model generation); BAC++ code; automatically pre-optimized power controller; bus cycle accurate SystemC model; simulation trace; trace analysis tool; power & performance metrics; visualization/reporting tool; exploration & optimization tool; design space definition; parameter assignment; platform description (IP-XACT); middleware, virtual platform and IP component models.

WP1 Requirements, specification and integration to holistic design environment: T1.1 Requirements & Definitions (requirements definition, industrial use-case definition, evaluation definition); T1.2 System and tool interface specification; T1.3 existing tool integration; T1.4 industrial flow integration.

WP2 Estimation and model generation: T2.1 Model-driven design front-end; T2.2; T2.3; T2.4; T2.5 HW/SW task separation & testbench generation; T2.5 virtual system generator with TLM2 interface synthesis.

WP3 System exploration and optimization: T3.1 design space definition; T3.2 Embedded software optimizations; T3.3 Custom hardware optimizations; T3.4 Design-space exploration (platform IP selection & configuration, memory configuration & management, HW/SW partitioning & separation); T3.5 Run-time management.

WP4 demonstration & industrial evaluation. WP5 dissemination & exploitation (T5.2 contribution to industrial standards). WP6 Management.

BAC++ = Block Annotated C++ code.]




B. Tool-set Description

This annex presents an overview of the full set of tools that will be used in the COMPLEX

project, detailing the description presented in Chapter 4. Each section of the annex gives a

brief introduction of each tool. The same subsections will be found for each tool, presenting:

Tool Overview: Presents a brief overview of the tool with main features

Tool Architecture: Presents an overview of the tool architecture and its components

Tool Interface: Presents the requirements imposed by interoperability/interfacing of tools

and applications/ architecture simulators (standards, formats, files)

User Interface: Describes the user interface and its type (script-based, graphical)

Portability: Defines OS and libraries requirements and tool dependencies

Tool Documentation: Specifies the type of documentation associated to the tool (User

Manual, Development Manual, Tutorials)




B.1. MOST Tool Description

The Multi-Objective System Tuner (MOST) is a proprietary tool for discrete optimization, specifically designed to enable design space exploration of hardware/software architectures.

Multicube Explorer (m3explorer.sourceforge.net) is the open source version, used as a proof of concept for the methods and techniques to be included in the proprietary version (MOST).

B.1.1 Tool Overview

MOST is a design space exploration tool that helps drive the designer towards near-optimal solutions to the architectural exploration problem. The final product of the framework is a Pareto set of configurations within the design evaluation space of the given architecture, together with an analysis of the effects of the design space parameters on the objective functions.
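To illustrate what a Pareto set of configurations means, the following minimal C++ sketch filters a list of evaluated configurations down to the non-dominated ones. It assumes that all objectives are to be minimised and that all points have the same number of objectives; the names Objectives, dominates and pareto_set are hypothetical and are not part of the MOST API.

```cpp
#include <cstddef>
#include <vector>

// Objective values of one evaluated configuration (all to be minimised).
// Hypothetical type for illustration; not part of MOST.
using Objectives = std::vector<double>;

// True if a is no worse than b in every objective and strictly better in one.
bool dominates(const Objectives& a, const Objectives& b) {
    bool strictly_better = false;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (a[i] > b[i]) return false;
        if (a[i] < b[i]) strictly_better = true;
    }
    return strictly_better;
}

// Keep only the points that no other point dominates (input order preserved).
std::vector<Objectives> pareto_set(const std::vector<Objectives>& points) {
    std::vector<Objectives> front;
    for (std::size_t i = 0; i < points.size(); ++i) {
        bool dominated = false;
        for (std::size_t j = 0; j < points.size() && !dominated; ++j) {
            if (i != j && dominates(points[j], points[i])) dominated = true;
        }
        if (!dominated) front.push_back(points[i]);
    }
    return front;
}
```

This quadratic filter is adequate for a few thousand evaluated points; an optimizer would normally maintain the front incrementally rather than recomputing it.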

One of the goals of MOST is to provide a command line interface for constructing automated exploration strategies. These strategies are implemented as command scripts that the tool interprets without manual intervention, which makes the batch execution of complex strategies straightforward.

B.1.2 Tool Architecture

The tool is composed of an interpreter (shell) and a kernel engine that orchestrates the optimization process by invoking the constituent, interchangeable blocks of the framework:




The internal organization of the software has been factored to provide standard, common APIs for the modules associated with the fundamental functionalities of MOST. The standard API consists of a dynamically linkable object interface, and it has been implemented for the design of experiments and the optimization modules.

A similar standard data interchange format supports the introduction of response surface models, which provide analytical approximations of the target system metrics without requiring lengthy simulations.

The following is a more detailed list of the modules available in MOST:

Design of experiments. The following design of experiments (DoE) modules have been implemented in the MOST framework:

- Full search DoE
- Full factorial DoE
- Random DoE
- Box-Behnken
- Central Composite Design (CCD)

Optimization algorithms. The following optimization modules have been implemented in the MOST framework:

- NSGA-II genetic algorithm
- Multi-objective simulated annealing algorithms
- DoE and Pareto DoE algorithms
- Steepest descent (for single-objective optimization problems)
- RESPIR

Response surface modelling. The MOST response surface modelling functionality is composed of a set of analytical models and a special data interchange format. The models have been implemented as external programs using common development frameworks such as the GNU Scientific Library and R, and other well-known model building libraries such as FANN. The modules that will be available are:

- Linear Regression
- Radial Basis Functions
- Shepard Interpolation
- Spline-based Regression
- Neural Networks

Design space analysis reports. The user interface of the MOST tool can generate a textual report of the optimization (configuration and objective pairs) as well as more comprehensive HTML reports if the Multicube Explorer tool is installed on the machine.
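As an illustration of how a response surface model can approximate a metric without running a simulation, the sketch below implements plain Shepard (inverse-distance-weighted) interpolation, one of the model types listed above. It is a generic textbook formulation, not the MOST implementation; the Sample type, the shepard_predict function and the default power of 2 are assumptions made for the example.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// One simulated design point: parameter values and one measured metric.
// Hypothetical type for illustration.
struct Sample {
    std::vector<double> x;
    double y;
};

// Shepard (inverse-distance-weighted) prediction of the metric at point q.
// power controls how quickly the influence of a sample decays with distance.
// Assumes samples is non-empty and every point has the same dimension as q.
double shepard_predict(const std::vector<Sample>& samples,
                       const std::vector<double>& q, double power = 2.0) {
    double weighted_sum = 0.0;
    double weight_total = 0.0;
    for (const Sample& s : samples) {
        double dist2 = 0.0;
        for (std::size_t i = 0; i < q.size(); ++i) {
            const double d = s.x[i] - q[i];
            dist2 += d * d;
        }
        if (dist2 == 0.0) return s.y;  // exact hit: reproduce the sample
        const double w = 1.0 / std::pow(std::sqrt(dist2), power);
        weighted_sum += w * s.y;
        weight_total += w;
    }
    return weighted_sum / weight_total;
}
```

The prediction is a weighted average of the known metric values, so it always stays within the range of the simulated samples and reproduces them exactly at the sampled points.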

B.1.3 Tool Interface

The integration of the MOST tool with the use-cases of the COMPLEX design flow is

performed by using a standard XML interface (MULTICUBE XML R1.4 [18], as specified

in the Multicube Explorer R1.0 manual, http://www.multicube.eu). The XML is interpreted

directly by the kernel of MOST to create an internal, optimized representation of the design

space.

The MOST core design space representation provides a set of abstract operations that are

mapped on the specific use case under analysis. The abstract operations are represented by

iterators (full search, random and factorial iterators) over the feasible design space. These

services are exploited by the DoE and Optimizer plug-ins instantiated within the MOST shell.
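The idea of a full search iterator over a discrete design space can be sketched as follows. This is only an illustration under simplifying assumptions: each parameter is reduced to its number of discrete levels, feasibility rules are ignored, and the names Configuration, next_configuration and full_search are hypothetical, not the MOST plug-in API.

```cpp
#include <cstddef>
#include <vector>

// A configuration holds one level index per design space parameter.
// Hypothetical simplification: real parameters also carry names and values.
using Configuration = std::vector<std::size_t>;

// Advance cfg to the next point in odometer order.
// Returns false once every combination has been visited (cfg wraps to zeros).
bool next_configuration(Configuration& cfg,
                        const std::vector<std::size_t>& levels) {
    for (std::size_t i = 0; i < cfg.size(); ++i) {
        if (++cfg[i] < levels[i]) return true;
        cfg[i] = 0;  // carry into the next parameter
    }
    return false;
}

// Enumerate the whole space by repeatedly advancing from the all-zero point.
std::vector<Configuration> full_search(const std::vector<std::size_t>& levels) {
    std::vector<Configuration> all;
    for (std::size_t n : levels) {
        if (n == 0) return all;  // a parameter without levels: empty space
    }
    Configuration cfg(levels.size(), 0);
    do {
        all.push_back(cfg);
    } while (next_configuration(cfg, levels));
    return all;
}
```

A random iterator would instead draw each index uniformly from its range, and a factorial iterator would restrict each parameter to a subset of its levels (for example the minimum and the maximum) before enumerating.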

The core design space representation provides services for validating architectural choices at

the optimizer level and evaluating the associated objective functions. The objective functions

are defined as a subset of the use case system level metrics and can be manipulated by the

user by interacting with MOST.

The design space exploration is performed by using the external use case simulator

integrated with the XML interface. In principle, the optimizer instantiates a set of architectural

configurations by means of the design space iterators, converts the representation to the

standard Multicube XML format and executes the simulator. Information about simulator runs

will be displayed directly on the MOST shell. MOST creates a specific directory to execute

each instance of the simulator. In this directory, a valid system parameters file is created

before starting the simulator. A system metrics file is obtained as the output of the simulator

execution.

B.1.4 User Interface

The main user interface associated with the MOST tool is a command line interface that can

be driven by execution scripts. This particular interface is suitable for remote execution of

design space exploration on server farms. An HTML result presentation is available in the

tool by using M3Explorer.

Example of MOST script:

Example of HTML report.





B.1.5 Portability

The MOST tool will be available in licensed binary format for x86 Ubuntu.

The M3Explorer version is open source and available at: m3explorer.sourceforge.net

B.1.6 Tool documentation

The scripting language of MOST is documented by manuals available on the command line and by several examples included in the distribution. The header files of the APIs, as well as the SDK for developing RSMs, will be commented appropriately to enable the development of external libraries. The M3Explorer version comes with user and developer manuals. Equivalent manuals will also be made available for the MOST version during the project lifetime.




B.2. SCoPE+ Tool Description

SCoPE+, a source-code level performance estimation framework, will be in charge of the fast estimation of performance figures from the SystemC executable specification of the PSM in COMPLEX. This fast yet sufficiently accurate estimation is crucial for a practical Design Space Exploration (DSE) loop. Specifically, the fast estimations of SCoPE+ will enable MOST, the exploration tool in the COMPLEX flow, to examine more solutions in the design space than an ISS-based exploration could in the same exploration time.

B.2.1 Tool Overview

SCoPE+ will be built in COMPLEX as an extension of SCoPE [7].

The current features of SCoPE are:

- Functional simulation of concurrent SW under three different APIs (POSIX, uCOS-II and Win32).
- Modelling of the RTOS and of drivers (including the loading of kernel modules and the handling of interrupts).
- Modelling of the hardware platform. For this purpose, SCoPE includes:
  o A TLM2-based bus for the communication with peripherals and the transmission of hardware interrupts.
  o DMA for copying large amounts of data.
  o Simple memory for the simulation of cache and DMA traffic.
  o Hardware interface for an easy connection of custom hardware.
  o Network interface that works as a network card for the NoC.
  o External network simulator to implement the NoC connected to SCoPE.
- Several types of performance figures, which include:
  o Time figures: running time, CPU load.
  o RTOS performance figures: number of thread and context switches.
  o Number of instructions and cache misses.
  o Power and energy figures of the core and of the cache.
- Simulation speed up to one order of magnitude faster than an ISS.
- Configurable executable specification: the executable specification allows different platform parameters (e.g., memory size, number of processors) to be checked without recompilation, which enables a faster DSE.
- A set of plug-ins enabling XML-based interfacing of SCoPE with the exploration tool and with the user.




In COMPLEX, SCoPE will be extended to SCoPE+ which, in addition to the current features of SCoPE, will offer the following:

- Support of a Concurrent Functional Application Modelling API (CFAM API). This enables a higher level of abstraction for capturing the concurrent functionality. Currently, in SCoPE, concurrent functionality is captured as a software application and is thus RTOS-API dependent.
- Support of both SW and HW estimations for the functional code captured under the CFAM API (currently, SCoPE performs only SW estimations). For each functional task, either a SW or a HW estimation will be performed, depending on whether the task is allocated to a SW or to a HW computation resource.
- Extension of the configurability options of the executable specification. Specifically, it would be interesting to enable runs of the executable specification for different allocations without requiring recompilation.

Details about SCoPE+ regarding these extensions and innovations with respect to the current SCoPE will be reported in D2.2.1 and D2.2.2.


B.2.2 Tool Architecture

Figure 12 shows a simplified scheme of the SCoPE architecture. It shows the SCoPE kernel itself, plus the SCoPE code supporting the different APIs for interfacing the tool. The internal architecture of the SCoPE kernel (RTOS modelling part, driver modelling, etc.) is omitted in Figure 12 since it is not relevant to the interfacing of the tool. Other architectural elements related to interfaces are not represented in Figure 12 because they are not relevant for the COMPLEX project.

Figure 12: SCoPE Architecture (focused on tool interface).

As shown on the left hand side of Figure 12, some parts of the tool are dedicated to supporting a specific programming API, namely C/C++ plus system calls to a specific RTOS API (POSIX, uCOS-II, Win32). On the right hand side, Figure 12 shows the parts of the tool which enable:

- The specification of the platform. This includes describing the hardware architecture, but also the platform SW elements.
- The specification of the allocation of software tasks to different platform elements.
- The configuration of the simulation and of the different metrics.

As an integral part of SCoPE, SystemC code wrapped by the sc_main function is directly supported as an input that defines the platform architecture and the allocation of software tasks to platform resources, and that controls the simulation, e.g. the simulated time.

SCoPE performs a very abstract modelling of the basic elements of the hardware platform (such as the processor and the instruction cache) and of the RTOS. This means that SCoPE does not need to include and wrap ISS-based models of the simulated processors. However, it is possible to include models of Intellectual Property (IP) blocks. As represented on the right hand side of Figure 12, a basic requirement for such IP integration is that the IP model must present a SystemC-TLM2 interface.

Additionally, the interface capabilities of SCoPE are enhanced by two plug-ins. These plug-ins are not part of the inner architecture of SCoPE. Indeed, SCoPE can be used without installing them, relying only on the SystemC-based interface for describing the platform. However, the plug-ins facilitate the interfacing between SCoPE and other tools, such as M3Explorer. It is therefore useful to show them together with the SCoPE architecture when focusing on the interface capabilities of SCoPE.

The SCoPE M3Explorer Plug-in (M3P in Figure 12) enables an XML-based interface, defined in the MULTICUBE project [8][9], with exploration tools such as M3Explorer.

The SCoPE IP-XACT Plug-in (IPXACTP in Figure 12), developed in the SCALOPES project [10], enables an XML-based interface similar to the one defined in MULTICUBE, but additionally supporting an IP-XACT description of the target platform. It also enables the integration of IPs for which a related IP-XACT description is available.

Further details on the requirements on these interfaces are explained in the following section.

As mentioned in the previous section, SCoPE+ will extend SCoPE in three main aspects. Building on Figure 12, Figure 13 shows the impact of these extensions on the tool architecture.

Figure 13: SCoPE+ architecture (focused on tool interface).

SCoPE+ will extend the SCoPE architecture to support the CFAM API. The other main architectural impact is the inclusion of the HW estimation support in the SCoPE kernel (shown as a box within the SCoPE kernel box in Figure 13).

To clarify how the HW estimation method will be integrated into SCoPE, further insight into the SCoPE kernel is necessary. Figure 14 shows a simplified scheme of the way SCoPE models a processing element (for instance, a processor).

Figure 14: Simplified performance estimation flow of SCoPE.

Currently, SCoPE uses two basic steps to produce the executable specification that yields the performance figures.

In the first step, the stand-alone performance figures (execution time, power consumption) for the inner code of each SW task are obtained. To this end, the source code is parsed in order to recognize basic blocks. A performance figure is associated with each basic block by means of an estimation method. Currently, three estimation methods are possible:

- Operator Cost Model: each C/C++ operator has an average value of time and energy consumption.
- ASM Code Analysis Model: the number of instructions of each basic block is obtained by analyzing the cross-compiled code. The execution time is estimated by multiplying this number by the average Cycles per Instruction (CPI) of the processor. Multiplying the number of instructions by an average energy per instruction gives an estimate of the total energy required by the processor.
- ASM Op-code Model: this method considers different times and energies per instruction (instead of the linear model followed by the previous approaches).
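The difference between the linear ASM Code Analysis Model and the ASM Op-code Model can be sketched as follows. This is an illustrative reading of the description above, not SCoPE code: the structures, the function names and the conversion from cycles to seconds through a clock frequency are assumptions made for the example.

```cpp
#include <cstdint>
#include <vector>

// Average figures of the target processor (values supplied by the user).
struct ProcessorModel {
    double clock_hz;                // core clock frequency
    double avg_cpi;                 // average cycles per instruction
    double energy_per_instr_joule;  // average energy per instruction
};

// Estimated cost of one basic block.
struct BlockEstimate {
    double time_s;
    double energy_j;
};

// Linear model: only the instruction count of the basic block is used.
BlockEstimate estimate_block(std::uint64_t instructions,
                             const ProcessorModel& cpu) {
    const double n = static_cast<double>(instructions);
    return BlockEstimate{n * cpu.avg_cpi / cpu.clock_hz,
                         n * cpu.energy_per_instr_joule};
}

// Op-code model: each instruction class has its own cycle and energy cost.
struct OpcodeUse {
    std::uint64_t count;  // occurrences of the op-code in the basic block
    double cycles;        // cycles per occurrence
    double energy_j;      // energy per occurrence
};

BlockEstimate estimate_block_by_opcode(const std::vector<OpcodeUse>& uses,
                                       double clock_hz) {
    double cycles = 0.0;
    double energy = 0.0;
    for (const OpcodeUse& u : uses) {
        cycles += static_cast<double>(u.count) * u.cycles;
        energy += static_cast<double>(u.count) * u.energy_j;
    }
    return BlockEstimate{cycles / clock_hz, energy};
}
```

For example, a block of 200 instructions on a 100 MHz core with an average CPI of 1.5 is estimated at 300 cycles, i.e. 3 microseconds, under the linear model.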

Once the estimations have been made, the figures are back-annotated to the source code. The instrumented source can then be compiled (against the SCoPE library) to obtain the executable.

Figure 15 shows how Figure 14 is enhanced with the incorporation of the HW estimation method. Now, instead of application code written for a specific RTOS API, the concurrent functionality of the specification code is parsed. The parsing now has to consider allocation information in order to determine whether each basic block belongs to the software or to the hardware partition. In SCoPE this is not necessary, since all the application code is assumed to belong to the software partition.




Figure 15: Simplified performance estimation flow of SCoPE+.

Therefore, the identified basic blocks are now classified as software or hardware basic blocks. Depending on the type of basic block, either a SW estimation (applying any of the three estimation methods mentioned before) or a HW estimation is performed. As in SCoPE, SCoPE+ will require the SW estimation method to be specified (now shown explicitly in the figure). SCoPE+ will also allow the HW estimation method to be configured. Here, rather than choosing among different estimation techniques, the configuration is expected to cover the aspects that affect the figures the estimation yields: for instance, ASAP versus ALAP scheduling, minimum latency versus minimum area, the library of hardware operators, the clock frequency, etc.
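To make the ASAP versus ALAP option concrete, the sketch below computes both schedules for the data-flow graph of a basic block. It is a generic textbook formulation and not the estimation method of SCoPE+: it assumes that every operation takes one control step, that resources are unlimited, and that operations are numbered in topological order; the names Graph, asap_schedule and alap_schedule are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// preds[i] lists the operations whose results operation i consumes.
// Assumption: every predecessor index is smaller than i (topological order)
// and every operation takes exactly one control step.
using Graph = std::vector<std::vector<std::size_t>>;

// ASAP: every operation starts as soon as all its predecessors are done.
std::vector<int> asap_schedule(const Graph& preds) {
    std::vector<int> step(preds.size(), 0);
    for (std::size_t i = 0; i < preds.size(); ++i) {
        for (std::size_t p : preds[i]) {
            step[i] = std::max(step[i], step[p] + 1);
        }
    }
    return step;
}

// ALAP: every operation starts as late as the latency bound last_step allows.
std::vector<int> alap_schedule(const Graph& preds, int last_step) {
    std::vector<int> step(preds.size(), last_step);
    for (std::size_t i = preds.size(); i-- > 0;) {
        for (std::size_t p : preds[i]) {
            step[p] = std::min(step[p], step[i] - 1);
        }
    }
    return step;
}
```

Operations whose ASAP and ALAP steps coincide lie on the critical path; the others have some freedom that a scheduler can use to trade latency for area.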


B.2.3 Tool Interface

SCoPE+ will take a system description as input, and will offer different alternatives for providing it.

In all of them, SCoPE+ will support ANSI C/C++ code as the action language for describing functionality, that is, for function implementation. A general limitation is that SCoPE+ will work only on source code (not on binary code). A specific limitation is that SCoPE+ will not support an enum declaration ending with a comma:

enum fruits {apple, lemon, orange,}; // not supported

enum fruits {apple, lemon, orange}; // supported

SCoPE+ will enable the description of a system by letting the user define a concurrent application, the platform and the allocation of application elements to platform resources.

B.2.3.1 Concurrent Application

As in SCoPE, SCoPE+ will enable the user to specify a concurrent application relying on the

services of an RTOS. The following programming APIs will be supported:

POSIX

uCOS-II (v2.85)

Win32




A complete list of the functions supported by the tool for each RTOS API is documented in

the SCoPE user manual.

A current limitation of the POSIX API support is that the dynamic creation of processes, that is, fork statements, is not supported. Currently, only static processes are supported. Each of them has to be wrapped within a function, which is treated as if it were a main entry function.

B.2.3.2 Platform Definition, Allocation, System Configuration and Metrics

SCoPE+ will keep the current SystemC-based mechanism of SCoPE for capturing the platform definition, allocating application/specification processes to platform elements and configuring the simulation. The following code is an example.

/* \file sc_main.cpp */
#include <iostream>
#include <vector>
using namespace std;

#include "sc_scope.h" // platform headers

int uc_main(void); // user main function

void *user_code(void *) { /* User application */
    uc_main();
    return NULL;
}

/* sc_main */
#define NUM_NODES 1 // number of nodes

int sc_main(int argc, char **argv) {
    UC_rtos_class *rtos;
    vector<UC_rtos_class *> rtos_list;
    UC_TLM_bus_class *bus;
    UC_TLM_bus_class *hub;

    for (int node = 0; node < NUM_NODES; node++) {
        // the rtos with the number of cpus
        rtos = new UC_rtos_class(1, "arm926t");
        // register new process
        (*rtos)[0]->new_process(user_code, NULL, "bubble");
        rtos_list.push_back(rtos);

        bus = new UC_TLM_bus_class(sc_gen_unique_name("HAL"), 100000000, RAM_START);
        hub = new UC_TLM_bus_class(sc_gen_unique_name("hub"), 100000000);

        // Processor to bus binding
        (*rtos)[0]->bind(bus);
        bus->bind(hub);

        UC_hw_memory *mem = new UC_hw_memory(sc_gen_unique_name("mem"),
            RAM_START, RAM_START + RAM_SIZE - 1, 100 /* resp. time (ns) */);
        hub->bind(mem);
        bus->generate_memory_map();
    }

    sc_start(100, SC_MS);
    cout << "Main finish" << endl; // simulation finished
    cout << "Simulated time: " << sc_time_stamp() << endl;

    for (int i = 0; i < NUM_NODES; i++) delete rtos_list[i];
    return 0;
}

Further details on how to build a platform, the allocation and the system configuration with this format are found in [11].




Similarly, the capabilities for specifying the platform through XML files by using the different plug-ins will be kept in SCoPE+.

The M3P plug-in currently enables interfacing SCoPE through a set of XML files. There is one input XML file for each of the following:

- System description: describes the HW platform, the SW platform and the SW application, the allocation of the SW tasks to the platform elements, and the simulation parameters.
- System configuration: defines the configurable parameters of the platform. The number of processors in an SMP system, the size of the caches, the bandwidth of a bus or the memory delay are possible configuration parameters.
- Metrics definition: defines the metrics that SCoPE will report once the simulation has finished.

After running the executable specification, an output metric report XML file is produced. It contains the metrics specified in the metrics definition file. Further details about the specific format of the XML files of the M3P plug-in are available in [11].

It is foreseen that, during the COMPLEX project, the capabilities for making the allocation configurable will be improved, specifically regarding the possibility of exploring the impact of different allocations with a single executable specification.

Similarly, SCoPE+ will preserve the interfacing currently enabled by the SCoPE IP-XACT plug-in. This plug-in separates the information required by the tool into a set of XML files, one for each of:

- HW platform (i.e., hw_platform.xml)
- SW platform (i.e., sw_platform.xml)
- Allocation (i.e., mapping.xml)
- Platform and simulation configuration (i.e., params.xml)
- Metrics definition (i.e., metrics.xml)

Again, running the executable produces an output XML file with the requested metrics.

The main novelty in the format of the XML files concerns the HW platform description file, which makes it possible to read a platform described according to the IP-XACT standard.

B.2.3.3 HW IP insertion

Another way to interface with SCoPE+ concerns HW IP components. SCoPE+ admits the insertion of a SystemC-TLM model of the IP.

For such an insertion, the SystemC model of the IP has to inherit from a SCoPE class, uc_generic_bus_if. This class contains the interface to connect any module to the bus. The new hardware has to implement the read and write functions to be accessible from the software.




Figure 16: Wrapper for HW IP connection to the TLM bus.

The following is a simple example of a 4-byte register that may be the bus interface of a complex HW IP.

#include "uc_generic_bus_if.h"

class my_periph : public uc_hw_if {
    sc_time response_time;
    int value;
public:
    my_periph(sc_module_name module_name, unsigned int begin, unsigned int end,
              int irq_num, int ret);
    int write(DATA data, int size, ADDRESS addr, int tlm_id);
    int read(DATA *data, int size, ADDRESS addr, int tlm_id);
};

As shown, HW IP modules have to be developed in SystemC, inheriting the SCoPE HW interface (uc_hw_if). The constructor receives as arguments the module name (sc_module_name type), the HW interrupt number associated with the peripheral and the delay time. In this case, it also receives the beginning and the end of the assigned memory address range, and thus uses the four-parameter constructor of the interface class. However, this simple case does not use the IRQ service.

The uc_hw_if class is a wrapper with a simple interface, which is shown in Figure 16. This wrapper gives access to the TLM interface of the bus model (which includes TLM data structures and functions) through the simple interface shown in Figure 16, composed of read and write functions with data, size and address as parameters. This makes the interfacing easier, since the developer of the HW IP model does not need to know about complex TLM structures, while still being able to work with the relevant data.
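What the read and write functions of such a 4-byte register might do can be sketched without any SCoPE or SystemC dependency. The class below is a stand-alone mock, not SCoPE code: it does not inherit from the SCoPE HW interface, the class name register_model is hypothetical, and the return convention (number of bytes transferred, or -1 for an access outside the register) is an assumption made for the example. Only the DATA and ADDRESS typedefs follow the types named in this section.

```cpp
#include <cstring>

// Stand-in typedefs (DATA = void *, ADDRESS = unsigned int).
// This sketch does NOT include or derive from any SCoPE header; a real model
// would inherit the SCoPE HW interface instead.
typedef void* DATA;
typedef unsigned int ADDRESS;

class register_model {
    ADDRESS begin_;           // first bus address mapped to the register
    unsigned char value_[4];  // contents of the 4-byte register

    // True if [addr, addr + size) lies entirely inside the register.
    bool in_range(ADDRESS addr, unsigned int size) const {
        return addr >= begin_ && size <= 4 && addr - begin_ <= 4 - size;
    }

public:
    explicit register_model(ADDRESS begin) : begin_(begin) {
        std::memset(value_, 0, sizeof value_);
    }

    // Copy size bytes from the register into data.
    // Assumed convention: returns bytes transferred, or -1 if out of range.
    int read(ADDRESS addr, DATA data, unsigned int size) {
        if (!in_range(addr, size)) return -1;
        std::memcpy(data, value_ + (addr - begin_), size);
        return static_cast<int>(size);
    }

    // Copy size bytes from data into the register (same return convention).
    int write(ADDRESS addr, DATA data, unsigned int size) {
        if (!in_range(addr, size)) return -1;
        std::memcpy(value_ + (addr - begin_), data, size);
        return static_cast<int>(size);
    }
};
```

In an actual SCoPE model the same two functions would be reached through the bus wrapper, so the software running on the simulated processor would see the register at its assigned address range.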

Of course, the user can still use the TLM interface directly if desired. The list of public functions of the uc_hw_if class is shown below to give an idea of the interface enabled by this wrapping class:

UC_hw_if_base(sc_module_name module_name, unsigned int begin, unsigned int end, int irq_num = -1);
UC_hw_if_base(sc_module_name module_name, int irq_num = -1);

// Receive request
virtual void b_transport(tlm_generic_payload &trans, sc_time &delay);

// Send interrupt
virtual void send_irq();
virtual void send_irq(int id);

virtual UC_tlm_target_socket<> * get_target_port();

// Specific peripheral functions (DATA = void *, ADDRESS = unsigned int)
virtual int read(ADDRESS addr, DATA data, unsigned int size, void *extension);
virtual int write(ADDRESS addr, DATA data, unsigned int size, void *extension);
virtual int read(ADDRESS addr, DATA data, unsigned int size);
virtual int write(ADDRESS addr, DATA data, unsigned int size);

// Burst size control
virtual void set_burst_size( int burst_size );
virtual int get_burst_size();

Notice that the HW IP TLM model can also include inner processes. In such a case, the constructor has to declare an SC_THREAD with an associated member function (within the context of the HW IP module) that contains the process functionality.

Finally, TLM models of HW IPs can be encapsulated within corresponding IP-XACT descriptions. When such an IP-XACT encapsulation is given, they can be plugged in via the SCoPE IP-XACT plug-in.

B.2.3.4 Specification of the Concurrent Functionality

An API for building Concurrent Functional Application Models, the CFAM API, will be supported by SCoPE+. It will be a C/C++-based API in which computational code is written as when using any of the aforementioned RTOS APIs, with the same limitations. For concurrency, communication and other services typically related to RTOS system calls, the CFAM API of SCoPE+ will rely on generic statements that are not tied to the syntax of a specific RTOS API.

The CFAM will be an important innovative aspect of SCoPE+. Through the CFAM API, it will be possible to re-assign application functional components to different implementation alternatives (e.g. as software in the same node, as software in a different node, as hardware), and to estimate them, without changing the computational code. This will remove the need for code transformations (either manual or automated) for such different estimations. For this purpose, the CFAM will include additional statements for indicating the allocation (whose syntax is yet to be decided).


B.2.4 User Interface

The SCoPE+ framework will provide a user interface based on the console and on Makefiles for configuring the compilation of the system executable (which requires defining estimation methods, source files, library paths, etc.). It also requires a set of environment variables to be set.

The plug-ins are also used via the command line and require only slight modifications of the Makefiles to include the corresponding libraries.

Overall performance figures are dumped to the console and can be redirected to a file. Moreover, specific performance metrics are dumped to the report XML file when the plug-in is used.

The graphical user interface is being reworked in projects running in parallel to COMPLEX.

B.2.5 Portability

SCoPE installation has the following requirements:

- 32-bit Linux platform
- GNU C/C++ compiler 4.0 or later
- SystemC 2.2.0 (for gcc versions after 4.1.2, the patched version of SystemC 2.2.0 is required)
- flex 2.5.33 or later
- bison (GNU bison) 2.3 or later
- zlib1g and zlib1g-dev libraries
- For the ASM estimation method: a cross-compiler for the specific target architecture (e.g. arm-elf-)

The tool has been checked on the following 32-bit Linux distributions: Fedora 8, Fedora 11, Ubuntu 10.04, and Debian.

Currently, SCoPE has been ported to Ubuntu 10.04 with flex 2.5.33, bison 2.4.1 and gcc-4.4.1 (this version is not publicly available). An additional effort to port SCoPE to 64-bit platforms is also in progress.


B.2.6 Tool documentation

SCoPE has a related website (www.teisa.unican.es/scope), where the latest public versions of SCoPE can be downloaded.

Currently, SCoPE is delivered with the typical README and INSTALL files in the distribution. It also comes with several examples, ranging from a simple single-node, single-processor, single-process platform up to a simple network with two nodes and several software processes.

SCoPE also has a related public manual (http://www.teisa.unican.es/gim/en/scope/scope_web/scope_download.php), updated to SCoPE v.1.1.5. Currently, an effort outside the COMPLEX project is in progress to update the manual to the latest version of SCoPE (v 1.1.4), as well as to cover the rebuilt graphical interface and the features developed in other projects, such as a thermal plug-in, the modelling of voltage scaling, etc.

Related publications can also be found on the SCoPE website, from the initial ideas, published in [13], to the latest contributions to the tool, published in [14].




The M3P plug-in has a related site [8], where the plug-in itself can be downloaded and introductory information about it is given. Within the M3P plug-in distribution, the user will also find a related manual [12] and a reference guide.

The SCoPE IP-XACT plug-in also has a reference guide. Its XML format is similar to the format described for the M3P plug-in, except for the file separation and for the HW platform description.

For SCoPE+, the SCoPE manual will be updated to account for all features, formats and ways of use, including both the new features and the ones currently supported by SCoPE. The SCoPE site will also be updated to include references to new publications and to enable the download of the new version of the tool and of the updated manual. If modifications to the plug-ins are required to reflect further configurability or to cover a wider IP-XACT set, the new plug-in versions will be made available together with the updated documentation.




B.3. UML/MARTE Design Entry Tool Description

The Papyrus MDT tool is selected as the model-driven entry tool for the definition of the MARTE Platform Independent Model (PIM), Platform Description Model (PDM) and Platform Specific Model (PSM).

Papyrus provides a graphical editor for the Unified Modelling Language (UML 2), and also supports the development of profile extensions to the UML meta-model. This last characteristic makes the Papyrus editor particularly interesting if the UML model needs to be extended with specific concepts that are not available in the MARTE profile. The UML Profile for Modelling and Analysis of Real-time and Embedded Systems (MARTE) itself is already available in Papyrus and thus usable by the Papyrus user.

The MARTE profile supports the model-driven development of real-time embedded systems by adding relevant techniques to UML. The profile provides support for the specification, design and verification of complex hardware/software systems and is intended to replace the existing UML Profile for Schedulability, Performance and Time.

Another concern of MARTE is to enable model-based analysis. Therefore, the MARTE profile intends to support existing analysis techniques. MARTE focuses on performance and schedulability analysis.

B.3.1 Tool Overview

Papyrus is an open source graphical modelling tool. It is integrated in the Eclipse environment

and complies with the UML2 standard, the diagram Interchange (DI2) standard and supports

the UML-Profiles SysML, MARTE, CCM and EAST-ADL2. Regarding the UML MARTE

profile, current Papyrus implementation is based on the OMG specification UML Profile for

MARTE V1.0 (formal/2009-11-02) November 2009. This implementation provides the

profile, the model library and a beta version of the Value Specification Language (VSL)

editor.

Papyrus also allows UML profile extensions in all the ways listed above. Different approaches
may be followed when developing a domain-specific profile:

- Creation of a profile embedded in Eclipse plug-ins: this approach eases deployment of the
UML profile by creating a dynamic profile definition that can be used in Papyrus. The profile
can be registered in an extension point in order to ease its retrieval within UML tools, which
saves the user from manually searching for the desired profile. This approach is extremely
quick, at the cost of losing some control.
- Creation of a static profile definition: the Eclipse UML2 project has defined a new feature
named "Static Profiles" that makes it possible to use stereotypes via generated Java classes
instead of profile definitions. This feature may be useful to add special behaviour to the
profile. For example, a derived property is a particular kind of property that is not supposed to
be set by the user, but is automatically deduced from the context where the stereotype is
applied.

Papyrus also provides facilities to write OCL constraints with a content assistant (code
completion). OCL constraints declared in the profile stereotypes may be evaluated on user
models.




The Object Constraint Language (OCL) is a declarative language for describing rules that
apply to Unified Modelling Language (UML) models. It was developed at IBM and is now part
of the UML standard. Initially, OCL was only a formal specification language extension to UML.
OCL may now be used with any Object Management Group (OMG) Meta-Object Facility (MOF)
meta-model, including UML. OCL is a precise textual language that provides constraint and
object query expressions on any MOF model or meta-model that cannot otherwise be expressed
by diagrammatic notation. OCL is a key component of the new OMG standard recommendation
for transforming models, the Queries/Views/Transformations (QVT) specification.

OCL language statements are constructed in four parts:

- a context that defines the limited situation in which the statement is valid;
- a property that represents some characteristics of the context (e.g., if the context is a
class, a property might be an attribute);
- an operation (e.g., arithmetic, set-oriented) that manipulates or qualifies a property; and
- keywords (e.g., if, then, else, and, or, not, implies) that are used to specify
conditional expressions.
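
To make the four parts more concrete, the following sketch mimics in Python how an invariant over a model element could be checked. The `Task` class, its attributes and the rule itself are hypothetical, and the comment shows only roughly what a comparable OCL invariant might look like; none of it is taken from the MARTE profile or from Papyrus.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical model element; plays the role of the OCL context."""
    period_ms: int      # property of the context
    deadline_ms: int    # property of the context
    periodic: bool      # property of the context


def deadline_invariant(task: Task) -> bool:
    # Rough OCL counterpart (illustrative only):
    #   context Task
    #   inv: self.periodic implies self.deadline_ms <= self.period_ms
    # "implies" is a keyword, "<=" is an operation on two properties.
    return (not task.periodic) or (task.deadline_ms <= task.period_ms)


def violations(tasks):
    """Return the tasks for which the invariant does not hold."""
    return [t for t in tasks if not deadline_invariant(t)]
```

Evaluating `violations` over all `Task` instances of a model corresponds, loosely, to evaluating a stereotype constraint on a user model.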


The following picture shows the Papyrus architecture. Papyrus relies on Eclipse technologies
(indeed, it is an Eclipse project) and on the EMF (Eclipse Modelling Framework) and
GEF (Graphical Editing Framework) plug-ins for modelling support. Papyrus also integrates
with other profiles and code transformations that are based on Eclipse technologies and
compatible with the XML Metadata Interchange (XMI) format and the UML2 meta-model.

Figure 17: Papyrus Architecture


The following requirements in terms of interoperability with the end user and other tools are
foreseen during the development of the COMPLEX UML/MARTE modelling tool:

- The COMPLEX UML/MARTE modelling tool shall conform to the XMI and MOF based
meta-model specifications.
- The COMPLEX UML/MARTE modelling tool should be interoperable with other model
editors.
- The COMPLEX UML/MARTE modelling tool shall be open.
- The COMPLEX UML/MARTE modelling tool shall be based on Eclipse.
- The COMPLEX UML/MARTE modelling tool should provide user-friendly functionality.
- The source of the COMPLEX UML/MARTE modelling tool developed in the project shall be
freely available.
- The COMPLEX UML/MARTE modelling tool shall be compatible with fine-grained
configuration management.
- The COMPLEX UML/MARTE modelling tool and the necessary transformation engines
shall be runnable on the same Eclipse instance.
- The COMPLEX UML/MARTE modelling tool shall produce model files which do not
require manual transformation to feed transformation tools.
- The COMPLEX UML/MARTE modelling tool should facilitate the integration of the
transformation tools.
- The COMPLEX UML/MARTE modelling tool shall support the UML 2.0 meta-model.
- The COMPLEX UML/MARTE modelling tool shall be executable on both Windows and
Linux operating systems.
- The COMPLEX UML/MARTE modelling tool environment (Eclipse-based) shall use a
Java Runtime Environment whose version is higher than 5.0.


The following image depicts the Papyrus MDT graphical interface. Four important elements
should be noted in this image:

- Project outline: describes the model structure, including all UML elements and links created
in the different UML diagrams. It also provides information on the applied profiles and on the
imports of other packages.
- Profile-specific entry: the property tab allows the definition of profile-specific values for the
attributes of the stereotype assigned to the UML element, and hosts the editor for constraints
based on OCL and VSL.
- Model diagrams: the modelling canvas, including tabs for all currently opened
model diagrams that form part of the UML project.
- Elements: defines which UML elements and links are available for each kind of model
diagram.




Figure 18: Papyrus MDT graphical user interface

B.3.5 Portability

Papyrus may be installed as a standalone product (RCP versions) for the Windows win32 series
(XP) and Linux/gtk-x86. There are modelling bundles for other platforms such as OS X. In these
cases, the installation of Papyrus is performed via the update site.

The following requirements must be met by the system in order to install Papyrus:

- Java virtual machine, version 5 or higher
- Eclipse Modelling Bundle (Galileo)
- ANTLR


The following material from the Papyrus webpage (http://www.papyrusuml.org) is intended
to be used:

- Papyrus First Steps: http://www.papyrusuml.org/scripts/home/publigen/content/templates/show.asp?P=121&L=EN&ITEMID=24
- Tutorial for UML Profile creation in Papyrus: http://www.papyrusuml.org/home/liblocal/docs/Documentation/ProfileTutorial/TutorialForProfileUsageInPapyrus.doc
- Constraints declaration: http://www.papyrusuml.org/scripts/home/publigen/content/templates/show.asp?P=143&L=EN&ITEMID=29




B.4. SWAT: SW Estimation Tool Description

The SWAT software estimation tool chain has the goal of providing early estimates of the
impact of the software application on the overall performance of the system, mainly in terms
of execution time and power consumption.

B.4.1 Tool Overview

The tool chain is composed of several cascaded modules that can be used independently, for
specific estimation needs, or automatically, for a complete estimation run. The main goal of
the toolset is to provide estimates of different non-functional aspects by decoupling
the aspects related to the structure of the application (source code), the specific features of the
executor (architecture) and the data dependencies (profiling).


The software estimation tool chain is built around a single configurable modelling core. It

consists of a C language parser suitably enhanced with some specific features that analyses

the structure of the input application and builds several internal models. The core is structured

as schematically shown in the following figure.

[Figure: structure of the estimation core. Blocks: source code, CST, AST, annotated AST,
pseudo-assembly model, static cost model, static code model, target model, compiler model.]

The source code is progressively transformed into a concrete syntax tree (CST), then simplified
into an abstract syntax tree (AST), then annotated according to the specific needs of the model
that the core is building, and finally transformed into a low-level pseudo-assembly
representation. Such a representation does not aim at modelling the functional behaviour of the
original source code but rather at capturing the aspects relevant to execution time and power
dissipation. For this reason, it ignores all the details of actual compilation and focuses on
coarser-grained aspects such as the number and type of low-level operations. It is thus worth
noting that this phase generates a static model of the application that is totally data-independent.

To complete the construction of such a model, no information other than the source code is
needed. The next step consists in producing an executable model that makes it possible to take
into account the dependencies on input data. This phase requires operating both at source code
level and at assembly level. To perform this step, furthermore, a model of the compiler and a
model of the target architecture are needed. The former model accounts for some
specific features of the compiler and has only a limited impact on the output model. The latter
describes the structure of the architecture of the executor and its effect is that of tuning the
pseudo-assembly model for the specific processor. It is worth noting that this latter model
generally requires elementary timing and power figures. In the absence of such data,
the estimation flow can still be used and will result in dimensionless estimates, mainly
useful to compare alternative implementations of the same application at source level.

The outputs of the two modelling phases just described can be combined into a single
complete dynamic model in different ways, depending on the goal of the estimation process.

The first way of combining the results of the core leads to the generation of an
instrumented version of the original source code, the host-executable model. Such a model
contains all the infrastructure necessary to derive structural (static) and profiling (dynamic)
information and annotate it into source-level execution traces. This arrangement is
shown in the figure below.

[Figure: host-based estimation flow. Blocks: source code, CST, AST, annotated AST,
pseudo-assembly model, static cost model, static code model, target model, compiler model,
host executable model, stimuli, execution (host), post processing, dynamic estimates.]

The model generated in this way can be executed on a generic host machine and fed with
appropriate stimuli, leading to the production of source-level execution traces that combine
the application code model (static) with the response to actual stimuli (dynamic). A further
post-processing step combines all the information available to produce the estimation
report. In its finest-grain form, such a report indicates the execution time and the power
consumption of all "atomic" constituents of the application. Such atoms are defined according
to the semantics of the source language and to the expected execution on a real processor, and
range in complexity from simple assignments, arithmetic operations and accesses to variables to
more complex behaviours such as function calls and returns, loops and so on.

To be really useful, estimates should be provided at the same level of abstraction at which the
application developer thinks and designs the program, that is, the source level. The
post-processing engine, in fact, can collect atomic contributions per line of the source code and,
at an even coarser grain, per function. The estimates collected in this way can either be
back-annotated to the original source code or presented through a graphical user interface
(currently under development).
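
As an illustration of this post-processing step, the sketch below aggregates atomic contributions from a source-level execution trace, first per line and then per function. The trace record layout (`function`, `line`, `time`, `energy`) and all numbers are assumptions made for this example; the actual SWAT trace and report formats are not described here. Energy is used instead of power only because it can be summed across atoms.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Tuple


class Atom(NamedTuple):
    """One executed atomic constituent (hypothetical trace record)."""
    function: str
    line: int
    time: float    # execution time contribution, arbitrary units
    energy: float  # energy contribution, arbitrary units


def per_line(trace: Iterable[Atom]) -> Dict[Tuple[str, int], Tuple[float, float]]:
    """Sum time and energy of all atoms executed on each source line."""
    totals = defaultdict(lambda: [0.0, 0.0])
    for atom in trace:
        entry = totals[(atom.function, atom.line)]
        entry[0] += atom.time
        entry[1] += atom.energy
    return {key: (t, e) for key, (t, e) in totals.items()}


def per_function(lines) -> Dict[str, Tuple[float, float]]:
    """Coarser view: sum the per-line totals of each function."""
    totals = defaultdict(lambda: [0.0, 0.0])
    for (function, _line), (t, e) in lines.items():
        totals[function][0] += t
        totals[function][1] += e
    return {name: (t, e) for name, (t, e) in totals.items()}


# A loop body on line 12 executed twice, plus one atom on lines 13 and 20.
trace = [
    Atom("filter", 12, 2.0, 0.5),
    Atom("filter", 12, 2.0, 0.5),
    Atom("filter", 13, 1.0, 0.25),
    Atom("main", 20, 4.0, 1.0),
]
lines = per_line(trace)
functions = per_function(lines)
```

The per-line dictionary is the kind of data that could be back-annotated next to each source line, while the per-function dictionary gives the coarser view.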




Thanks to the decoupling of the behaviour of the application into a static model and a
dynamic model, the same host-executable model can be used with different sets of input
stimuli to produce different estimates, corresponding to a wide range of usage scenarios. It is
worth noting that, since the structure of the application itself does not change with data, the
static source code model needs to be built only once.

The intermediate models built by the tool chain core can also be combined to produce an
augmented version of the original source code capable of interacting with a system-wide
model both in terms of functional behaviour and non-functional aspects. This usage of the
flow is schematically shown in the following figure.

[Figure: generation of SystemC-annotated source code. Blocks: source code, CST, AST,
annotated AST, pseudo-assembly model, static cost model, static code model, target model,
compiler model, SystemC, BAC++, Instrumented Source Code.]

The output of this flow is functionally equivalent to the input code but includes suitable
annotations (in the form of SystemC calls) that assign non-functional properties to individual
portions of the code. The content of each annotation can be either dimensionless or expressed
in physical units such as seconds or amperes. In the former case, the model only requires a
code generation phase that inserts the required calls. In the latter, on the other hand, a run
through the estimation flow described above is necessary. This step, in fact, generates the
elementary cost contributions to be passed to the SystemC calls inserted into the original code.
The level of granularity of the instrumentation, and its consequent accuracy, can be tuned by
the user to obtain the best trade-off between accuracy and simulation performance.

Concerning the precise format of the calls injected into the original code and the
implementation of the corresponding functions, an API compatible with the rest of the system
model must be defined.

The last possible usage of the outputs of the estimation core consists in a combination that is
similar to the one just described, but differs in terms of goals. In this case, in fact, the goal is
to produce a new version of the application suitably enriched with a lightweight
infrastructure that can be compiled into the final application and deployed with it. This
provides the application itself with awareness of its non-functional behaviour at run-time.
This feature, combined with a specific software component, provides a complete
infrastructure to support dynamic management of non-functional aspects. The flow in the
figure below schematizes the sequence of operations required.

[Figure: generation of lightweight instrumented code for deployment. Blocks: source code, CST,
AST, annotated AST, pseudo-assembly model, static cost model, static code model, target model,
compiler model, Lightweight Instrumented Source Code, standard cross-compilation toolchain,
Binary code for deployment.]

The source code generated following this flow is very similar to that annotated with SystemC
calls. The main difference lies in the complexity of the library supporting the annotation.
Since in this case all the code and the data structures will be deployed, it is crucial that they
are kept as simple and small as possible. It is also important that the user can specify
which portions of the code need to be annotated and which need not. Realistically, in fact,
only a small subset of the source code of the application is likely to be interesting (or critical)
from the non-functional point of view. A first run through the estimation tool chain, for
example, might be useful to locate such critical portions.

The source level estimation flow has the limitation that it requires the complete source code of
the application to proceed. It often happens, though, that portions of the application (libraries,
operating systems) are only available in binary form. A call to these binary functions is thus
a sort of "hole" in the source level model. To account for these binary portions of the
application there is no other solution than providing some sort of black-box modelling.

The figure below sketches an estimation flow for binary libraries.

[Figure: estimation flow for binary libraries. Blocks: binary library and headers, prototype
extraction, stub/measure code generation, execution (ISS or target), model learning, stimuli
generation, analytical cost models, target model, user profile, stimuli.]

It assumes that at least a set of header files defining the interface of the library is available,
along with an instruction set simulator of the target platform. It is worth noting that
knowledge of how to use the library functions is necessary to use this flow. More
agnostic approaches have also been proposed in the literature but are beyond the scope of this
tool chain.

The proposed flow has the main goal of simplifying and automating the process of executing
a given set of functions with suitable stimuli, collecting the results in terms of execution time
and, possibly, power consumption, and deriving a simple analytical cost model to be used
within the source level estimation flow.
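
A minimal sketch of what deriving a simple analytical cost model could look like, assuming that the cost of a library function depends linearly on a single size parameter and that the measurements come from running generated stubs on an ISS or on the target. The measurement values are made up for illustration, and ordinary least squares is just one possible fitting technique; the document does not state which method the tool chain uses.

```python
from typing import List, Tuple


def fit_linear_cost(samples: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Ordinary least-squares fit of cost = a + b * n.

    samples: (n, measured_cost) pairs, e.g. buffer size and cycles.
    Returns the coefficients (a, b).
    """
    count = len(samples)
    if count < 2:
        raise ValueError("at least two measurements are needed")
    mean_n = sum(n for n, _ in samples) / count
    mean_c = sum(c for _, c in samples) / count
    var_n = sum((n - mean_n) ** 2 for n, _ in samples)
    if var_n == 0:
        raise ValueError("measurements must use at least two distinct sizes")
    cov = sum((n - mean_n) * (c - mean_c) for n, c in samples)
    b = cov / var_n
    a = mean_c - b * mean_n
    return a, b


def estimate(model: Tuple[float, float], n: float) -> float:
    """Evaluate the analytical cost model for a call with size n."""
    a, b = model
    return a + b * n


# Hypothetical measurements of a copy-like function: (bytes, cycles).
measurements = [(16, 52.0), (64, 148.0), (256, 532.0), (1024, 2068.0)]
model = fit_linear_cost(measurements)
```

The resulting coefficients could then stand in for the binary function wherever the source level model would otherwise have a "hole", for instance by calling `estimate(model, size)` for each call to the function.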


The software estimation tool chain interfaces with the rest of the flow by means of C source
code files and structured reports. In particular, the input of the tool chain consists of a
set of C source files (implementation and headers). Using this information, the estimation tools
are capable of generating both an augmented code and a dimensionless estimation model.
This model is a sort of abstract timing and power consumption model consisting of the
original source code annotated with suitable hooks, in the form of function calls that can
later be specialized to match the timing and energy characteristics of the target architecture and
system. The abstract estimation can be regarded as an enhanced fine-grained profiling that can
be enriched with actual figures in a post-processing phase. For this purpose, additional
information is needed: a timing and power model of the underlying instruction set
architecture. Such a model must be in tabular form and must specify the contributions to
execution time and to power consumption of the elementary operations of the microprocessor
and of the relevant parts of the system, such as memories and buses. The accuracy and the
granularity of such figures can vary from average values to more detailed data for specific
operations.
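
The role of the tabular model can be sketched as follows. The operation classes, the table layout and all figures are hypothetical; the sketch only illustrates how dimensionless operation counts could be turned into time and energy, falling back to an average entry when no specific figure is available for an operation.

```python
from typing import Dict, Optional, Tuple

# Hypothetical tabular model: operation class -> (cycles, energy in nJ).
# "default" holds average values used when no specific entry exists.
TARGET_MODEL: Dict[str, Tuple[float, float]] = {
    "default": (1.0, 0.5),
    "mul": (3.0, 1.5),
    "load": (2.0, 2.0),
}


def specialize(counts: Dict[str, int],
               table: Optional[Dict[str, Tuple[float, float]]] = None
               ) -> Tuple[float, float]:
    """Turn operation counts into (time, energy).

    Without a table every operation costs one unit, which yields a
    dimensionless estimate; with a table the result is in the table's units.
    """
    if table is None:
        total = float(sum(counts.values()))
        return total, total
    time = energy = 0.0
    for op, n in counts.items():
        cycles, nj = table.get(op, table["default"])
        time += n * cycles
        energy += n * nj
    return time, energy


# Operation counts as a profiling run might report them (made-up numbers).
counts = {"add": 10, "mul": 4, "load": 6}
dimensionless = specialize(counts)
physical = specialize(counts, TARGET_MODEL)
```

A coarse table with only a `default` entry corresponds to the "average values" end of the accuracy range, while adding per-operation rows moves towards the more detailed end.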

The output of the tool is primarily an annotated version of the input source code. The
specific form of the annotation depends on the flow being considered (static estimation,
lightweight model or augmented code model for dynamic estimation). When annotated for
static estimation, the instrumented source code can be compiled and executed on a host
machine to produce a complete and detailed report on the timing and power characteristics of
the application. This report takes the form of a list of power and timing figures back-annotated
on the original code. Such figures can be post-processed to produce coarser-grain
information, such as application-wide average values or average execution time and power
consumption per function.


The typical interaction with the tool chain is through the command line. The whole toolset is
wrapped within a script that provides the same interface as GCC and behaves exactly like it
(the same approach can be followed for other compilers by customizing the wrapper script).
This approach allows the tool chain to be used as if it were the compiler itself. Considering, for
example, a project made of several source files plus a Makefile, the estimation flow can be
run using the same Makefile and letting the script take the place of GCC.
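
The wrapper idea can be sketched as below. This is not the actual SWAT script: the instrumentation step is a placeholder that only derives an output file name, and the `gcc` command name and the `.inst.c` suffix are assumptions. The sketch shows only how a drop-in wrapper could keep the compiler's command-line interface while redirecting C sources through an extra step.

```python
from typing import Callable, List


def instrumented_name(source: str) -> str:
    """Hypothetical naming rule for the instrumented copy of a source file."""
    return source[:-2] + ".inst.c"


def wrap_compiler_call(args: List[str],
                       instrument: Callable[[str], str] = instrumented_name,
                       compiler: str = "gcc") -> List[str]:
    """Build the real compiler command for a GCC-style argument list.

    Every argument ending in ".c" that is not the value of "-o" is replaced
    by its instrumented counterpart; all other options pass through.
    """
    command = [compiler]
    previous = ""
    for arg in args:
        if arg.endswith(".c") and previous != "-o":
            command.append(instrument(arg))
        else:
            command.append(arg)
        previous = arg
    return command


# A Makefile rule such as "$(CC) -O2 -c main.c -o main.o" would then yield:
command = wrap_compiler_call(["-O2", "-c", "main.c", "-o", "main.o"])
```

If such a wrapper were installed as an executable script, an existing build could be redirected by overriding the compiler variable on the `make` command line, assuming the Makefile uses the conventional `CC` variable.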

Individual tools of the estimation flow can also be used from the command line to customize
the flow for specific needs. This makes it possible, for example, to instrument and estimate
only a portion of a complex application.

A prototype graphical user interface is being designed and developed to allow fast
estimation and analysis of isolated parts of a complex application. This GUI allows selecting
the portions of code to be considered and the options for estimation, and provides a visual
representation of the execution time and power consumption of the code.

B.4.5 Portability

The tool chain has been developed under Linux and needs several scripting tools such as awk,

tcl, bash and others. The key tools are distributed in binary format, along with all the required

scripts. The graphical user interface is being developed based on the Qt library under Linux.

Source code of the tool chain is also available to the partners subject to specific agreements.


The documentation comes in different forms:

- General tool chain description and documentation.

- A short user manual, providing basic usage information by means of worked

examples and how-to.

- Man pages, both in man format and in html.

- Help. Each tool of the tool chain provides a simple command line help.

All the code is documented with Doxygen. The code documentation is only available along
with the source code, subject to specific agreements.




B.5. Virtual Platform Tool Description

The Virtual Platform tools are a simulation environment that supports multiple design tasks,
ranging from early development of embedded software for multicore systems to the
evaluation of system performance in terms of software execution time, memory
bottlenecks or power consumption for various architectural variants or software mappings.

B.5.1 Tool Overview

The Virtual Platform tools provide solutions for early software development as well as for
architecture exploration.

When used for early software development, virtual platforms are software models of complete
systems that provide software engineers with high-speed, pre-silicon development
environments months before hardware is available. Virtual platforms enable concurrent
development of SoC hardware and software, significantly shortening embedded system
suppliers' hardware/software integration time and accelerating their products to market.
Because they are based on software models, virtual platforms are particularly effective
for developing and debugging multi-core designs. As software development effort has begun
to overtake hardware development costs in modern SoCs, software availability has become
the key factor gating time to profit for semiconductor providers. Virtual platforms are an
effective solution for accelerating pre-silicon software development. The virtual platform tool,
Innovator, provides an integrated SystemC development environment for assembling virtual
prototypes from transaction-level models (TLMs) from various sources, and for creating new
ones.

Alternatively, virtual platforms are also used to capture the platform architecture at a higher
level of abstraction: processors, busses and peripherals are modelled in a manner that allows
the execution of real software applications on the platform model, so that the performance of
the architecture can be properly evaluated and explored. The virtual platform tool, Platform
Architect, consists of a SystemC-based graphical environment for capturing the entire product
platform and the dashboard for initiating the platform analysis functions. Platform Architect's
analysis tools provide textual and graphical views for the system designer, verification
engineer, and software architect to analyze SoC architectures and performance, including
critical items such as software execution and bus occupancy. For architectural analysis,
Platform Architect provides views to:

- Analyze cycle-accurate performance
- Study throughput and bottlenecks
- Look at bus switching and cache usage to reduce power
- Optimize bus & memory architecture

For functional analysis, Platform Architect provides views to:

- Look at system response and task scheduling
- Analyze processor loading to drive partitioning
- Profile software for optimization
- Cross-correlate different views to extract powerful information

Analysis views can be configured at run-time using SystemC Explorer (a Platform Architect
feature), enabling the user to decide on the data that is captured. The visualization environment
allows the user to see the default graphical views dynamically during simulation or during
post-processing to identify bottlenecks in the design. Views can be re-configured as required
and data from multiple simulations can be grouped together for easy comparison of candidate
architectures.


The virtual platform tools are a set of tools and libraries that can be used together to create
and use virtual platforms for early software development and/or architecture exploration.
These two solutions are built using the same set of key infrastructure elements:

Model Wizard: a tool that enables a user to create SystemC models for individual
components of a system. The tool facilitates the use of SystemC and TLM2 constructs. It
generates the interface for a component as well as the internal register map based on graphical
inputs or a script. The user can then add the functionality to the generated template. The tool
supports full roundtrip engineering and test-driven design, and documentation can also be
generated.

Platform Creator: a tool that enables a user to assemble a platform from individual
components. There is a graphical block diagram entry as well as a scripting interface.
Individual components are loaded from a library or imported as SystemC or RTL code.
Library components can be configurable, or additional generators can be called to enable
maximum flexibility. The tool configures the simulation environment. There are additional
features to support models of different abstraction levels, to enable easy switching between
implementations of a component, and to provide links to documentation, library management,
etc. There is support for automatic transactor insertion to support multi-abstraction level
simulations.

Model library: the virtual platform tools come with a rich library of IP components, ranging
from processor models and interconnect infrastructure models to common peripheral models.
Examples are ARM, Tensilica, PowerPC, MIPS, Pentium, etc. processor models; NIC301 and
Sonics bus interconnect models; PrimeCell peripheral models and memory controllers; etc. To
support the different use models, these components are available at different levels of
abstraction and are highly configurable. Certain 3rd party models come with their own
integrated tool environments. The model library contains interconnect models and their
integrated configuration tools.

SystemC simulator: the virtual platform tools come with a custom SystemC simulator that
supports the full SystemC and TLM standards and is further enhanced to provide
links and interfaces to various debugging and analysis tools. The simulator also supports
co-simulation with several RTL simulators, emulators and HW prototyping solutions.

SystemC debug tool: the simulator is supported by an Eclipse-based SystemC debugger
that provides source level debugging as well as additional SystemC debugging features to
enable model designers to validate and debug their component models. The debugger comes
with port and variable tracing, sequence charts for events and threads, timed breakpoints, etc.

Platform debug tool: this tool, called Platform Analyzer, provides a more abstract view
of a simulation, targeted at the SW developer. There are no SystemC debug features, but
additional platform debug features are available. It provides register and memory
visibility with SW and HW breakpoints and watchpoints. It integrates with SW debuggers for
the SW running on the processors as well as with the SW analysis tools. There is support for
creating custom debug and analysis views for a particular platform model.

HW analysis: in order to support the architecture exploration use case, there is a HW analysis
tool that provides transaction tracing and statistical analysis views for the platform
architecture. It provides transaction throughput and latency analysis for the individual
interconnect components as well as for the overall architecture.

SW analysis: in order to support the SW development and SW optimization use cases, there is
a SW analysis tool that provides memory and cache statistics as well as an OS-enabled context
view, function tracing, stack tracing, etc.


The interfaces for the different components of the virtual platform tools are as follows:

- Model Wizard:

Has a TCL API that interacts with the internal data-model of the tool. This can be used

to create component templates from a script. It is also used to create custom imports

from other formats like IP-XACT.

It exports SystemC code compliant with the IEEE-1666 standard and the TLM2-LRM.

It uses the SCML (SystemC Modelling Library) for common model patterns. There is

a source code version of this modelling library available. The generated SystemC code

uses a particular style to enable roundtrip engineering.

It exports a proprietary XML library format that can be used by platform assembly

tools to avoid having to import the SystemC code. It is used by platform architect to

identify the interfaces of the component.

- Platform Creator:

- Has a TCL API that interacts with the internal data-model of the tool. This can be

used to load libraries, manage parameter configurations, create and build a virtual

platform from a script.

- It uses a proprietary XML format to store the current project settings.

- It uses a proprietary XML format to load and store libraries of components.

- It can import SystemC compliant with the IEEE-1666 standard and the TLM2-LRM.

There are a few limitations to the C++ design patterns that can be handled for port

interfaces. There is support for recognizing template and constructor arguments as

well as scml_properties: all these will be presented as parameters to the user. Platform

creator can handle a range of dependencies between these parameters and the SystemC

interfaces of a component.

- It can import VHDL and Verilog RTL descriptions and supports a restricted set of

pin interfaces and data-types on those models. There is a mapping from this set of

interfaces and data-types to the closest SystemC equivalent.

- It exports SystemC code compliant with the IEEE-1666 standard and the TLM2-

LRM. For RTL components it will use the RTL co-simulation API‘s from the

simulation infrastructure. For advanced build options there is a dependency on

proprietary API‘s of the simulation and build infrastructure.

- It can export RTL code for the platform connectivity in case all components are

described at an abstraction level for which there is an RTL connectivity equivalent.



Page 119

- Model library:

- Is described in SystemC, or consists of models wrapped in SystemC compliant to the

platform creator import requirements

- Has an XML description according to the platform creator library format. This

library description refers to the SystemC code but also to additional TCL scripts that

are used to validate configuration options when the component is used in platform

creator, or to setup a correct build infrastructure for the component. Certain library

components or scripts can depend on 3rd

party tools for the configuration of complex

components.

- SystemC simulator:

- Is compliant to the IEEE-1666 standard and the TLM2 LRM.

- Has additional interfaces for

- 3rd

party debugger integration: this is to enable SW debuggers to interact with

the SW running on the processor models in a platform simulation

- Analysis API‘s: the SystemC simulator supports a proprietary, documented

API for logging transactions and collecting statistical analysis information.

There is also support for the standard SystemC tracing API‘s

- RTL co-simulation: the SystemC simulator supports the standard PLI

interfaces of RTL simulators to enable co-simulation between SystemC and

RTL.

- SystemC debug tool:

- relies on GDB and has a TCL API to build a simulation and to interact with a

running simulation.

- The debugging tool exists as a standalone tool and also as an Eclipse perspective, which enables integrated debugging with source code.

- HW and SW analysis tools:

- they rely on a proprietary analysis database format, for which there is a TCL API to

extract and manipulate analysis information.


As already mentioned, most tools have a TCL API that allows script-based interaction with the features of the tools. Some of these TCL APIs are integrated: for example, the platform creator TCL API integrates the simulator TCL API, so that a platform can be created, built and started from a single script. This enables runtime and debugging support from within platform creator.

All tools have a graphical user interface. Most of the analysis and debug tools can be integrated with the Eclipse integrated development environment to create a single cockpit for debug and analysis. The platform analyzer and platform creator tools are independent and cannot be integrated with Eclipse. Most tools provide a console from which TCL commands can be issued or scripts launched.

B.5.5 Portability

The tools depend on certain OS versions, as well as on SystemC release versions, gcc compiler versions and 3rd-party tool versions. These dependencies vary with the version of the tool. Typically there is one major tool release per year, in which some of these dependencies change. For backward compatibility, a second or even third release version of each dependency is always supported, to allow migration of platforms and models. For accurate information regarding these dependencies it is best to consult the release letter and the tool/model documentation.


The virtual platform tools come with an extensive set of manuals. Each tool and model comes with its own manual (typically a user manual and a detailed command line interface reference manual). Interface APIs are documented in the tool manuals. Modelling APIs and modelling guidelines come as separate manuals.

Manuals are provided as PDF and HTML documentation; some tools also generate or display model documentation.




B.6. PowerOpt Tool Description

ChipVision will provide its low-power behavioural synthesis tool PowerOpt. In the COMPLEX framework, PowerOpt is used for the power estimation of custom hardware blocks. After the behavioural synthesis of each entire HW task, monolithic basic blocks are automatically identified. Power and delay figures are automatically inserted, according to lower level simulation. The main tasks include:

- Providing timing and power information for custom hardware components

- Modelling low-level power and data dependencies statistically

- Estimating high-level power and timing by simulating the system

- Mapping the recording of run-time and power traces to the tool API

B.6.1 Tool Overview

PowerOpt automatically synthesizes power-efficient RTL architectures from the electronic

system level (ESL). With its analysis framework, the tool offers fast and accurate analysis

capabilities at the system-level using real activity data, thus enabling the users to explore

trade-offs between power, area, and timing. With these analysis capabilities, higher power

savings compared to traditional RT-based optimizations and compared to other high level

synthesis tools can be obtained (up to 75% compared to traditional, hand-crafted RTL).

Additional benefits are faster time to results and higher productivity as the user works with

compact system level code which integrates with system level simulation and modelling. The

following figure makes a comparison of the traditional RT-based design flow and the

behavioural synthesis design flow using PowerOpt.

Traditional design flow with manual creation of RTL code (left) compared to power-optimizing flow with automatic RTL generation using ChipVision's PowerOpt (right).

[Figure. Abstraction levels: System Architecture, SoC Architecture, Block Architecture, RT Level, Gate Level, Layout. Traditional flow: C/C++/SystemC, RTL Coding, RTL Level, Synthesis, Gate Level, Physical Design. PowerOpt flow: C/C++/SystemC, PowerOpt Power Optimizing High Level Synthesis, RTL Level with CPF/UPF and SDC constraints, Synthesis, Gate Level, Physical Design.]





The PowerOpt tool flow can be divided into the following six phases:

- Setup

- Scheduling

- Activity Generation

- Synthesis

- Power Estimation

- Export

The ESL specification of a design is compiled and loaded into PowerOpt in the setup phase

and then an intermediate representation is generated. The user can then analyze the design

hierarchy and the control/data flow.

At this point various constraints can be set on the design. This includes synthesis attributes,

like voltage and frequency, as well as architectural decisions, like memory mapping.

Furthermore, various transformations, like loop unrolling or inlining, can be enabled.

PowerOpt Flow – Power Optimizing Automatic Synthesis

The design can now be synthesized, starting with the scheduling phase. Based upon the intermediate representation, several source code level optimizations (path depth reduction, common sub-expression elimination, etc.) and transformations are applied. During the actual scheduling step the latency of all processes in the design is determined. The control and data flow dependencies are considered, as well as the timing information (latency) of the executed operations. The timing information is extracted from the power library.

The scheduling phase gives the user a first overview of the timing requirements of the design

and a check is performed to ensure the design is synthesizable. If not, a warning message is

generated.




It is possible to change constraints (modify the number of resources used in the design or set

new values for voltage or frequency) after scheduling and re-schedule to see the impact

immediately. Loop optimizations (pipelining) can be activated for loops in the design.

If all timing constraints are met, the user can proceed in the flow. The next step depends on whether area or power is the optimization target during synthesis. To create a power-optimized design, an activity profile of the design needs to be generated. This is accomplished by the Activity Generation procedure.

This procedure involves the following steps:

- Building an un-optimized architecture,

- Generating input stimuli (test vectors) and golden results files,

- Exporting and simulating the architecture, and

- Loading the activity profile into PowerOpt.

The activity profile is then used to optimize for low power in the succeeding synthesis step, where the initial steps of allocation and binding are performed. Different allocations, and different bindings for each allocation, are tested and the power-optimal solution is finally chosen.

Next, a netlist for the synthesis result is built up internally. Based on the netlist generated

during synthesis the power, area, and timing values are calculated for the design under test.

Information required for this process is extracted from the power library.

PowerOpt offers comprehensive analysis capabilities showing the impact of architectural

changes at system-level on power, timing, and area. The tool produces a set of reports to

interpret and analyze the results.

The user may then refine the design either by setting constraints (resource, timing, or

scheduling constraints) or by modifying the source code of the design. Different synthesis

runs can be performed without having to re-simulate the design as it is sufficient to re-

schedule and re-synthesize after changing constraints. This interactive way of working with

the tool enables an effective design space exploration with short feedback loops.

If no activity profile is generated, the user can proceed directly from Scheduling to Synthesis; activity generation is not required and no energy estimation is performed. Note that none of the activity-based power optimizations can be applied in this operation mode, in which the optimization goal is to save area.

Another important advantage of PowerOpt is that the superior analysis capabilities offered by

high-level synthesis can be extended for power tradeoffs. The initial steps in the PowerOpt

flow of importing and compiling the design only need to be done once. From that point

onwards, the user can work interactively with the tool using the flow recommended by

ChipVision:

Typically, a first synthesis run is done using the default constraints. The user can exploit the

comprehensive analysis capabilities offered by PowerOpt in order to explore the design in

terms of timing, area, and power before Verilog is generated.




Then, the user can change constraints as explained above and proceed with Scheduling,

Activity Generation, and Synthesis, skipping the initial steps of the flow. No re-compiling of

the design is required unless the source code is changed.

This procedure is repeated until all requirements are fulfilled.

Once the result meets the expectations of the user then the design can be exported as

synthesizable Verilog RTL code. PowerOpt can also generate a test bench for the exported

design that processes the same vectors as the ESL level test bench. The exported design can

be easily verified by simulation. UPF and CPF constraints (voltage domains) as well as false

path SDC constraints can be generated to control the logic synthesis and layout tools during

the design flow steps.


PowerOpt is a behavioural synthesis tool, which can operate on the highest level of the system

design. It accepts design entry in C/C++ or SystemC. Due to its powerful front-end it allows

the user to harness almost all the high level features of the system description in C/C++ or

SystemC. The full description of its language capabilities is to be found in the Language

Modelling Guide which is delivered with PowerOpt.

At the output side, PowerOpt offers a direct interface to RT simulation and logic synthesis

tools. To support the RT simulation PowerOpt outputs a clean and human readable Verilog

description of the system, a test bench, test vectors, and the simulation script for almost all the

major simulation tools. To support the logic synthesis, PowerOpt outputs a synthesis script,

SDC constraints which also include all the false paths and UPF/CPF constraints for a power-

optimized design flow.


PowerOpt is available with two user interfaces:

- Graphical user interface (GUI)

- Script-only user interface (command line version)

Both interfaces support the Tcl script language, so the full Tcl/Tk 8.4 command set can be used to control PowerOpt. This is useful because it allows users to perform multiple synthesis runs with different constraints. The commands can be written into a file and sourced after starting the tool.

PowerOpt also supports full scripting capability. All commands that are issued by the GUI are

stored in a command log file that can later be used (also in part) in the command line version

or in the command prompt of the GUI.

B.6.5 Portability

PowerOpt is available for different Linux distributions. To increase portability and simplify installation, the tool is shipped with the complete tool chain that is required to install and operate the tool and to compile and simulate customer designs.

The following Linux operating systems are supported:




- RedHat Enterprise 4.4 (for 32 bit and 64 bit architectures, recommended)

- RedHat Enterprise 5.3 (for 32 bit and 64 bit architectures, recommended)

- SuSE 10.2 (for 32 bit architectures only)

- SuSE 11.1 (for 32 bit and 64 bit architectures, recommended)

A complete PowerOpt installation consists of two tar balls, the PowerOpt installation itself

and the tool chain required to operate the tool on the supported operating system and to run

customer designs in C, C++, or SystemC. This makes PowerOpt self-contained for all the

tools on which it depends.

The tool chain consists of the following components:

- binutils-2.19, required by gcc-3.4.6 and gcc-4.3.2

- gcc-3.4.6, required to compile customer designs in C, C++, or SystemC

- gcc-4.3.2, required to compile new power libraries (component data bases, CDBs)

- gmp-4.2.4, required to compile mpfr-4.2.4

- Icarus Verilog simulator. Note that the version delivered with PowerOpt is enhanced compared to the online version available at http://www.icarus.com/eda/verilog/.

- mpfr-4.2.4, required to compile gcc-4.3.2

- systemc-2.2.0, required to compile customer designs written in SystemC

B.6.6 Tool Documentation

PowerOpt is shipped with a full set of documentation:

- User Guide

- Language Modelling Guide

- Reference Manual

- Tutorial

These documents will be adapted to reflect the enhancements developed during the COMPLEX project.




B.7. HIFSuite Tool Description

B.7.1 Tool Overview

HIFSuite is a set of tools and application programming interfaces (APIs) that provides support

for modelling and verification of HW/SW systems. The core of HIFSuite is the HDL

Intermediate Format (HIF) language upon which a set of front-end and back-end tools have

been developed to allow the conversion of HDL code into HIF code and vice-versa. HIFSuite

allows designers to manipulate and integrate heterogeneous components implemented by

using different hardware description languages (HDLs). Moreover, HIFSuite includes tools,

which rely on HIF APIs, for manipulating HIF descriptions in order to support code

abstraction and post-refinement verification.

HIFSuite plays two roles in COMPLEX Design Flow:

1 - Generation of SystemC models from Stateflow descriptions;

2 - Abstraction from RTL to TLM of the models of HW components.


Figure 19 shows an overview of the HIFSuite features and components. HIFSuite is

composed of:

- An HIF core-language and APIs: a set of HIF objects corresponding to traditional HDL constructs such as processes, variable/signal declarations, sequential and concurrent statements, etc.

- A set of front/back-end conversion tools:

- HDL2HIF. Front-end tools that parse VHDL, Verilog and SystemC (RTL and TLM)

descriptions and generate the corresponding HIF representations.

- HIF2HDL. Back-end tools that convert HIF models into VHDL, Verilog, SystemC

(RTL and TLM) and NuSMV code.

- A set of APIs in C++ that allow designers to develop HIF-based tools to explore,

manipulate and extract information from HIF descriptions. The HIF code manipulated by

such APIs can be converted back to the target HDLs by means of HIF2HDL.

- A set of tools developed upon the HIF APIs that manipulate HIF code to support

modelling and verification of HW/SW systems. In particular:

- A2T: a tool that automatically abstracts RTL IPs into TLM models.




Figure 19: HIFSuite overview


Interaction of HIFSuite with other tools is possible in two ways:

- by providing HIFSuite with files containing VHDL, Verilog or SystemC code or

Stateflow descriptions;

- by interfacing with the HIFSuite APIs (see next section for details).


The current version of the user interface is textual. HIFSuite is invoked by entering a simple command line which specifies parameters for code conversion and manipulation. Moreover, expert users can interact with HIFSuite by means of a set of powerful C++ APIs which allow exploring, manipulating and extracting information from HIF descriptions. The HIF APIs comprise two subsets: the HIF core-language APIs and the HIF manipulation APIs.

B.7.4.1 HIF core-language APIs

Each HIF construct is mapped to a C++ class which describes specific properties and

attributes of the corresponding HDL construct. Each class is provided with a set of methods

for getting or setting such properties and attributes.

The UML class diagram in Figure 20 shows part of the HIF core-language APIs class hierarchy. Object is the root of the HIF class hierarchy: every class in the HIF core-language APIs has Object as its ultimate parent.

[Figure 19 labels: HDL2HIF front-end conversion tool; HIF2HDL back-end conversion tool; HIF APIs; Automatic Abstraction Tool (A2T); other manipulation & verification tools; languages SystemC, VHDL, Verilog and NuSMV; C/C++ and Stateflow marked "To be developed in COMPLEX".]




Figure 20: Part of the HIF core-language class diagram

B.7.4.2 HIF Manipulation APIs

The HIF manipulation APIs are used to manipulate the objects in HIF trees and they are

exploited by the tools described in the following Chapters.

The first step for HIF manipulation consists of reading the HIF description by the following

function:

Object* Hif::File::ASCII::read(const char* filename)

This function loads the file and builds the corresponding tree data structure in memory. An analogous writing function dumps the modified HIF tree to a file:

char Hif::File::ASCII::write(const char* filename, Object* obj)

Once the HIF file is loaded in memory, many APIs are available to navigate the HIF

description; the most important ones are listed hereafter.

Search function. The search function finds the objects which match criteria specified by the user. It searches for the target objects starting from a given object until it reaches the bottom of the HIF tree (or the maximum depth, if the corresponding parameter is set). For example, the search function can be used to find all variables whose name matches state, starting from base_object, as in Figure 21.

[Figure 20 class names: Object; TypeObject, SimpleTypeObject, CompositeTypeObject; BitObject, BoolObject, CharObject, EnumObject, IntObject, PointerObject, RealObject, TypeRefObject; ArrayObject, RecordObject; ActionObject; AssignObject, CaseObject, ExitObject, ForObject, IfObject, NextObject, PCallObject, ReturnObject, SwitchObject, WaitObject, WhileObject.]




Visitor design pattern. In object-oriented programming and software engineering, the visitor

design pattern is generally adopted as a way for separating an algorithm from an object

structure. A practical result of this separation is the ability to add new operations to existing

object structures without modifying these structures. In fact, the visitor design pattern is very

useful when there is a tree-based hierarchy of objects and it is necessary to allow an easy

implementation of new features to manipulate such a tree. The HIF APIs provide visitor

techniques in two forms: as an interface which must be extended to provide visitor operators,

and as an apply() function. In the first case, a virtual method is inserted inside the HIF object

hierarchy, which simply calls a specific-implemented visiting method on the object passed as

parameter. The passed object is called visitor and it is a pure abstract class. Hence, the

programmer has to extend such a visitor to visit and manage the HIF tree, by implementing

the desired visiting methods, in accordance with its goals. On the contrary, the apply()

function is useful to perform a user-defined function on all the objects contained in a subtree

of a HIF description. The signature for the apply function is the following:

void Hif::apply (Object *o,

char(*f)(Object *,void *),

void *data)
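As a rough illustration of how an apply()-style traversal can be used, the sketch below calls a user function on every node of a subtree and counts the nodes with a given name. Node, apply_sketch(), NameCounter, count_named() and count_nodes_named() are hypothetical stand-ins written for this example and are not part of the HIF APIs; only the callback shape (an object pointer plus a user data pointer, returning char) mirrors the signature above. The meaning of the returned char is not stated in this document, so the sketch simply ignores it.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a HIF tree node (NOT the real Hif Object class).
struct Node {
    std::string name;
    std::vector<std::unique_ptr<Node>> children;

    explicit Node(std::string n) : name(std::move(n)) {}

    // Appends a child and returns a non-owning pointer to it.
    Node* add(std::string n) {
        children.push_back(std::make_unique<Node>(std::move(n)));
        return children.back().get();
    }
};

// Callback type shaped like the one in the apply() signature.
using Callback = char (*)(Node*, void*);

// Pre-order traversal calling f on every node of the subtree rooted at o.
// Assumption of this sketch: the value returned by f is ignored.
void apply_sketch(Node* o, Callback f, void* data) {
    if (o == nullptr) return;
    f(o, data);
    for (auto& child : o->children) apply_sketch(child.get(), f, data);
}

// User data carried through the traversal.
struct NameCounter {
    std::string wanted;
    int hits = 0;
};

// Callback: counts nodes whose name equals NameCounter::wanted.
char count_named(Node* o, void* data) {
    auto* counter = static_cast<NameCounter*>(data);
    if (o->name == counter->wanted) ++counter->hits;
    return 0;
}

// Convenience wrapper: number of nodes named `wanted` in the subtree.
int count_nodes_named(Node* root, const std::string& wanted) {
    NameCounter counter;
    counter.wanted = wanted;
    apply_sketch(root, &count_named, &counter);
    return counter.hits;
}
```

The void* data argument is what lets a plain function pointer accumulate state across the traversal; the visitor interface described above serves the same purpose when per-class behaviour is needed.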

Compare function. It provides designers with a way to compare two HIF objects and the

respective subtrees. Its signature is the following:

static char compare (Object *obj1, Object *obj2)

Object replacement function. It provides designers with a way for replacing an object and its

subtree with another one. Its signature is the following:

int Hif::replace(Object* from, Object* to)

Hif::hif_query query;
query.set_object_type(NameNode); // search for NameNode
query.set_name("state"); // search for string "state"
std::list<Node*>* found_object = Hif::search(base_object, query);

Figure 21 Search function usage example

B.7.5 Portability

HIFSuite can be compiled on both Linux and Windows operating systems. Compilation requires the following tools and libraries: cmake, bison, flex, perl, clang and the boost C++ libraries. In addition, Microsoft Visual Studio 9 SP1 and gnuwin32 are necessary under Windows, while gcc is necessary under Linux.

Running HIFSuite requires only the boost C++ libraries.

HIFSuite documentation is composed of:

- a tutorial which guides the user in entering commands and developing new HIF-based manipulation tools by exploiting the APIs;

- a Doxygen-based document which describes the HIF class hierarchy and details for using the APIs.

Public documentation and a demo version of release 3.4 are available at http://hifsuite.edalab.it




B.8. Memories Modelling, Characterization and Optimization Tool Description

B.8.1 Tool Overview

The objective of this tool is to enhance the DSE framework by adding exploration of the memory architecture. While the DSE framework is able to explore the memory dimension using conventional structural parameters (e.g., memory size, memory width) or functional parameters (e.g., burst access), the memory tool provided by POLITO will be able to explore alternative memory organizations, in particular multi-banked solutions specifically tailored to the application. This tool is meant as a plug-in to the DSE engine.

The overall tool includes two main components:

- The modelling & characterization tool, which builds models for the metrics of interest (power, energy and various forms of delay) based on the relevant parameters. Since we envision sub-banking as the only type of "organization", the size of a memory block is currently the only relevant parameter. Additional parameters include supply voltage Vdd and threshold voltage Vth (provided that the technology supports regulation). Models are of an empirical nature and will be built by means of an accurate characterization of different memory blocks using ST's proprietary memory compiler in the target technology. Characterization data will be fitted against a linear model template using conventional least mean square regression. Linear models provide enough accuracy given the good correlation between the metrics and memory size (as already demonstrated in the literature).

- The optimization tool, which is based on the idea of building a multi-bank implementation of a memory block (be it a scratchpad or a cache) that is customized to the memory access profile of an application. The rationale behind the optimization is that the distribution of accesses to memory cells is non-uniform, while power, energy and delay are roughly proportional to memory size. Together, these two observations show that it is convenient to implement a multi-banked memory, so that the average memory access is both more energy-efficient and faster than an access to a monolithic memory.


The tool has no particular "software" architecture. Figure 22 shows the workflow of the tool.

The characterization is run once for a given technology and yields a set of raw data as a function of the parameters, each specified in input as a range between a minimum and a maximum value. The raw data in output correspond to a list of values for each metric, one for each point in the space, as shown in the figure.

The modeling tool takes the raw data as inputs and builds a linear function of the same parameters using least mean square regression. Notice that there will be a model for each metric of interest.
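The modeling step can be sketched as an ordinary least-squares fit of one metric against memory size. The code below is a minimal illustration under stated assumptions: a single parameter (block size), one model per metric, and hypothetical names (LinearModel, fit_linear) that are not taken from the tool itself.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Linear model of one metric: metric = intercept + slope * size.
struct LinearModel {
    double intercept;
    double slope;
    double predict(double size) const { return intercept + slope * size; }
};

// Ordinary least-squares fit of a metric against memory size.
// Requires at least two samples, not all with the same size.
LinearModel fit_linear(const std::vector<double>& sizes,
                       const std::vector<double>& metric) {
    assert(sizes.size() == metric.size() && sizes.size() >= 2);
    const double n = static_cast<double>(sizes.size());

    // Means of both variables.
    double sum_x = 0.0, sum_y = 0.0;
    for (std::size_t i = 0; i < sizes.size(); ++i) {
        sum_x += sizes[i];
        sum_y += metric[i];
    }
    const double mean_x = sum_x / n;
    const double mean_y = sum_y / n;

    // Centered sums: sxx = sum (x - mean_x)^2, sxy = sum (x - mean_x)(y - mean_y).
    double sxx = 0.0, sxy = 0.0;
    for (std::size_t i = 0; i < sizes.size(); ++i) {
        const double dx = sizes[i] - mean_x;
        sxx += dx * dx;
        sxy += dx * (metric[i] - mean_y);
    }
    assert(sxx > 0.0);  // sizes must not all be equal

    const double slope = sxy / sxx;
    return {mean_y - slope * mean_x, slope};
}
```

Calling fit_linear once per metric (energy per access, delay, and so on) yields one LinearModel each, whose predict() can then be evaluated for bank sizes that were not characterized directly.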




Figure 22: Overall Flow of Memory Exploration: Characterization, Modeling, and Optimization.

Finally, the optimization tool will take as inputs the address trace (i.e., address X accessed at cycle Y), the number N of words of the target memory block (e.g., 64 KB), and the memory models built in the previous step. Based on these data, it will produce a multi-banked implementation of the N-word memory.

For simplicity of the decoding logic, the banks map contiguous portions of the memory, i.e.,

there is no relocation whatsoever. Figure 23 shows this contiguous type of sub-banking.

Figure 23: Conceptual example of memory sub-banking.

Notice also that neither replication nor overlapping of memory locations is expected; therefore the sum of the sizes of the banks equals the size of the memory (in the figure, N1 + N2 + N3 + N4 = N).
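To make the benefit of contiguous sub-banking concrete, the sketch below evaluates the total access energy of a given partition for a given address trace, using a linear energy-per-access model. It is only an illustration: EnergyModel and banked_energy are hypothetical names, and the sketch assumes that each access costs only the energy of the bank it hits, ignoring decoder overhead and leakage.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Energy per access modelled as a linear function of bank size (in words).
struct EnergyModel {
    double intercept;
    double slope;
    double per_access(std::size_t words) const {
        return intercept + slope * static_cast<double>(words);
    }
};

// Total access energy of a contiguous multi-bank partition.
// bank_sizes: sizes N1..Nk in words, laid out back to back and summing to N.
// trace: accessed word addresses, each in [0, N).
double banked_energy(const std::vector<std::size_t>& bank_sizes,
                     const std::vector<std::size_t>& trace,
                     const EnergyModel& model) {
    double total = 0.0;
    for (std::size_t addr : trace) {
        std::size_t base = 0;  // first address of the current bank
        bool found = false;
        for (std::size_t size : bank_sizes) {
            if (addr < base + size) {  // address falls into this bank
                total += model.per_access(size);
                found = true;
                break;
            }
            base += size;
        }
        assert(found);  // every address must map to exactly one bank
        (void)found;    // keeps release builds free of unused-variable warnings
    }
    return total;
}
```

Passing a single bank of N words gives the monolithic baseline, so two calls to banked_energy are enough to compare a candidate partition against the unpartitioned memory for the same trace.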





The interfaces between tools concern information for which no universally acknowledged standard formats exist. Therefore, no particular compliance with any standard is required or expected. At the same time, there is flexibility in adapting the tools to possible legacy formats, or to formats that will emerge from other partners. For example, should the trace be obtained by means of conventional profilers such as pixie, pixstats or gprof, it would not be a problem to support such trace formats. Currently our plan is to support the most essential and intuitive format for a trace, that is:

<access type> <address>

where access type is either 'read' or 'write'.
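A reader for this trace format could look like the following sketch. The record layout (access type followed by address) comes from the text above; everything else is an assumption made for illustration: the names Access, AccessType and parse_trace are hypothetical, the access type is assumed to be spelled exactly "read" or "write", the address is assumed to be decimal or 0x-prefixed hexadecimal, and malformed lines are silently skipped.

```cpp
#include <cstddef>
#include <cstdint>
#include <exception>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

enum class AccessType { Read, Write };

struct Access {
    AccessType type;
    std::uint64_t address;
};

// Parses one "<access type> <address>" record per line.
// Lines that do not match the assumed format are skipped.
std::vector<Access> parse_trace(std::istream& in) {
    std::vector<Access> trace;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string type, address;
        if (!(fields >> type >> address)) continue;         // too few fields
        if (type != "read" && type != "write") continue;    // unknown access type
        try {
            std::size_t used = 0;
            // Base 0 lets std::stoull detect a 0x prefix (a leading 0 means octal).
            const std::uint64_t value = std::stoull(address, &used, 0);
            if (used != address.size()) continue;           // trailing garbage
            trace.push_back({type == "read" ? AccessType::Read : AccessType::Write,
                             value});
        } catch (const std::exception&) {
            continue;  // not a number, or out of range
        }
    }
    return trace;
}
```

Because the function takes a std::istream, the same code can read from a file stream or from an in-memory string, which is convenient while the tool is still being developed standalone.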

Further requirements might arise from the interface with the DSE tool, but they will be

analyzed when designing the tool integration.


The tool will initially be developed as a standalone tool for debugging reasons. We plan, however, to eventually comply with the DSE interface, as our tool (the optimization part in particular) should become an add-on of the DSE. Therefore, both the input and the output interface will be the same as those of the DSE.

B.8.5 Portability

The tool will not require any special language support or libraries. The prototype will be implemented in plain C, using standard libraries. Although code development will be done on Linux platforms, the code is expected to be portable to any platform for which a C compiler is available.


Documentation will be a relevant part of the deliverables concerning the memory tool. User

Manual (including some guided tutorials) and Development Manual will be provided.




B.9. IPXACT Tool-Chain Description

The Magillem tool suite is used today in production design and verification flows of the

leading SoC integrators (STM, STE, NXP, TI, Qualcomm, etc.) and envisaged in advanced

ESL flows by system manufacturers (Thales, Astrium, Airbus, etc.). Magillem is used as a

framework for metadata management and thus can be considered as the backbone for the

interoperability of tools and models around a common description of IPs, systems and

subsystems based on a well accepted standard: IP-XACT (IEEE 1685).

In COMPLEX, Magillem will act as the glue between the tools of the global architecture exploration framework and will be used for the implementation of automation engines.

B.9.1 Tool Overview

The Magillem tool suite is the leading offer today for managing the IP-XACT IEEE 1685 standard. IP-XACT, initially issued by the SPIRIT consortium (now Accellera), is nowadays recognized by the electronics community as an apposite choice for managing the new ESL design flows properly and efficiently. Nevertheless, the migration from a legacy design flow to one that takes full benefit of IP-XACT requires some heavy and complex operations. The next figure presents the four steps which have to be completed. They are detailed in the following subsections.

A Four Steps Methodology to Build ESL Flows with IP-XACT

IP Description

The goal of this first step is to package all the components of an IP library into XML files in accordance with the IP-XACT schema, which describes the syntax and semantic rules for the description of three kinds of elements: the bus definitions, the components and the designs (in which components are instantiated). Thus the purpose of the IP packaging is to fill in, for each component, the XML fields that describe its attributes: physical ports, interfaces, parameters, generics, register map, physical attributes, etc. An important part of the schema is dedicated to referencing the files related to the different views of a component: a view may be, for instance, a simulable model in a specific language (VHDL, Verilog, SystemC, etc.) or documentation files (e.g. PDF, HTML, Framemaker). This work facilitates future reuse of existing components, because all of their features are easily accessible for their integration and configuration in a bigger system, as explained in the next step.

System Description

After this step, it is possible to import, configure and integrate components into the system, assemble the design, resolve connection issues, and automate design tasks, thus lightening the verification steps. Some examples of the use of IP-XACT at this level are: partial or full automation of design assembly and configuration, detection of communication protocol mismatches, top level netlisting, or automatic customization of compilation and simulation of designs. The work that is the topic of this paper falls in this category: managing the generation of specific verification code from the IP-XACT description of a system and its components.

Design Automation and Flow Control

The third step of the methodology, depicted in Figure 29, aims at linking the design activities around the central IP-XACT database by means of a dedicated environment which provides access to the IP-XACT information. The tool suite chosen for this study (Magillem environment) provides an IP Packager, a Platform Assembly tool, as well as a Generator Studio to develop and debug additional TGI-based generators. These may be encapsulated within the IP-XACT representation of an IP. A generator may, for example, simply launch the execution of a script, getting argument values from the design description in IP-XACT, or it may on the contrary be a more complex engine whose role is to modify the design itself (e.g. add connections, insert adapters, or configure components).

Principle Schema for an IP-XACT Flow

Checkers can also be developed and used to verify design rules at some point, before going further in the

design flow. Besides, IP-XACT provides mechanisms to describe the sequences of chained generators and

checkers.

Advanced Design Flow Architecture

This last step in the methodology has a high potential because it exploits all features described

previously and allows the actual implementation of advanced ESL activities, such as

architecture exploration or software application automated mapping on a hardware platform.

These examples show the complexity that has to be managed by the first three steps: all components must be packaged and their configurability must be taken into account; the design assembly automation should be maximized, while any architecture choice should be handled. Finally, the generator chains, as defined previously, can be configured and controlled by supervisor engines: for instance, a validation sequence will configure and execute several times the generators dedicated to test bench configuration, compilation and simulation.


MIP: Magillem IP Packager

Moving efficiently to an IP-based methodology and flow requires legacy IP libraries to be captured in a technology-independent format with an easy-to-use, scalable and automated process. Magillem IP Packager automatically creates an IP-XACT certified description for any VHDL, Verilog or SystemC component, using a technology that is non-intrusive to the existing flow. The scalability of the packager also makes it possible to import libraries of legacy components automatically, and its high level of modularity allows it to deal with any client directory structure and to handle various customer-specific information.

Principle Schema for importing IP legacy into IP-XACT

MPA: Magillem Platform Assembly

To accelerate the design of complex systems, such as System-on-Chip (SoC) and FPGA-based solutions, the IP-XACT standard provides a mechanism for describing and handling multi-sourced IP that enables automated design integration and configuration within multi-vendor tool flows. To achieve these goals, the MPA is the centrepiece of a powerful, intuitive Integrated Design Environment. The user-friendly interface guides the designer during platform assembly and configuration, and streamlines exploration and implementation of IP-based systems. Features include:

- Analog and system platform viewer

- Version and configuration management

- Certification requirements traceability

- Verification tool kit

- Checkbox interface to connect IP (bus, signal, split, tie)

- Drag and drop of IP into the design

- Various EDA tool connectors

- RTL and ESL netlisters

Magillem Platform Assembly




MGS: Magillem Generator Studio

Magillem Generator Studio is used to write IP-XACT based generators quickly and easily.

Supported features: integration into the Magillem environment; debug mode; assistance and

guidance; automatic completion; support for all versions of IP-XACT; support for both

IP-XACT generator interfaces, the Loose Generator Interface (LGI) and the Tight Generator

Interface (TGI), the latter being the new interface made for the IP-XACT 1.4 schema; extended

and enhanced API to access legacy data.

Magillem Generator Studio

MGS is fully integrated in the Magillem Design Environment, allowing a complete

interaction with MPA and other Magillem modules. MGS allows building an efficient bridge

based on TGI between the IP-XACT meta-data and the existing customer design flow and

assets.

Magillem Generator Studio architecture




Features:

- Automatic Completion, dynamic syntax checking and inline documentation in a user

friendly environment

- Used in conjunction with MPA (Magillem Platform Assembly), the TGI recorder

feature can record operations performed in Magillem schematic editor (instantiate and

configure components, create connections…) to automatically create a TGI script

- Automatic creation of IP-XACT generator files for new or imported code, exploration

and management of any existing IP-XACT generators

- Interactive and command line execution of IP-XACT generators

TGI usage:

- Ensure 100% error free manipulation of an IP database

- Generators can be implemented in different languages (Java, TCL, Python…)

- TGI-like API provided for all previous versions of IP-XACT (1.0, 1.1, 1.2) and

support for LGI (Loose Generator Interface)

- Ready for the IP-XACT IEEE 1685

MRV: Magillem Register View

Entering the market with a product targeting the traditional need of IC designers to manage

the Registers, MAGILLEM is offering a brand new approach: Customers do not have to

choose between an Excel based Register capture system, disconnected from their design, or an

expensive, dedicated Register management tool, still not addressing the issues of collaborative

work. Cost effective, and non-compromising, MRV by Magillem offers a Register View of

IP-XACT Systems and IPs:

Magillem Register View

Supported features: complete register and bitfield editor (instead of a simple viewer); move

and resize bitfields (interactive update of the bitfields with the mouse); memory map

editor (move and replace memory blocks); drag and drop inside the memory map (registers

to memory map); copy/paste of memory blocks in the memory map; visual identification of

overlaps (immediate debugging); true synchronization with the RTL or ESL platform (visualize

and edit the full project hierarchy); true hierarchical description (single-source solutions

cannot scale to the next generation of designs: concurrent developments handle more than 500

different XML files containing memory map fragments, and the relationships, ordering and

hierarchy defined in the project must be preserved).
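The overlap identification mentioned in the feature list amounts to an interval intersection check on the memory map. A minimal Python sketch follows; it assumes each block is described only by a base address and a size, and it is not taken from MRV. The `Block` type and the example addresses are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Block:
    name: str
    base: int   # first byte address occupied by the block
    size: int   # size in bytes

    @property
    def end(self) -> int:
        return self.base + self.size  # exclusive end address


def find_overlaps(blocks: List[Block]) -> List[Tuple[str, str]]:
    """Return pairs of block names whose address ranges intersect."""
    ordered = sorted(blocks, key=lambda b: b.base)
    overlaps = []
    for i, first in enumerate(ordered):
        for second in ordered[i + 1:]:
            if second.base >= first.end:
                break  # sorted by base: no later block can overlap `first`
            overlaps.append((first.name, second.name))
    return overlaps


# Hypothetical memory map with one deliberate overlap.
memory_map = [
    Block("ctrl_regs", 0x0000, 0x100),
    Block("dma_regs", 0x00F0, 0x40),   # starts inside ctrl_regs
    Block("sram", 0x1000, 0x800),
]
overlapping = find_overlaps(memory_map)
```

Sorting by base address lets the inner loop stop at the first block that starts beyond the end of the current one, which keeps the check cheap when the map is assembled from many fragments.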


MIP: Magillem IP Packager

IP-XACT v1.0, v1.2, v1.4, v1.5 certified, IP-XACT IEEE 1685, Coherency Check

Source of importation:

- CoreUse standard structure of repository support

- Custom structure of repository support

- IP yellow Page integration

- Hierarchical IPs containing a mix of VHDL and Verilog components

- Complete File set creation of the component and its dependencies

Change and release management tool:

- CVS connector

- Clearcase connector

Standard digital and Analog HDL import:

- VHDL IEEE 1076, '87 or '93 import

- Verilog IEEE 1364, '95, '01 or '05 import

- SystemC IEEE 1666 import

- VHDL-AMS IEEE 1076.1 import

- Verilog-AMS LRM 2.3.1 import

- SystemC-AMS LRM 1.0 import

Customer Specific information Import:

- Register and memory map definition: CSV, Excel, Framemaker, SystemRDL

- Legacy XML

- Documentation

MPA: Magillem Platform Assembly

IP-XACT v1.0, v1.2, v1.4, v1.5 support, IEEE 1685 support

Hierarchical Netlister:

- VHDL IEEE 1076

- Verilog IEEE 1364

- SystemC IEEE 1666, with automatic shell generation for VHDL and Verilog IPs (IUS,

Questa)

- VHDL-AMS IEEE 1076.1

- Verilog-AMS LRM 2.3.1

- SystemC-AMS LRM 1.0

- SystemVerilog IEEE 1800




EDA Tool Connectors:

IC and system simulation:

- Modelsim connector (Mentor)

- Incisive connector (Cadence)

- VCS connector (Synopsys)

ESL synthesis:

- Catapult C connector (Mentor)

- C-to-Silicon connector (Cadence)

IC synthesis:

- Design Compiler connector (Synopsys)

- Encounter RTL compiler (Cadence)

FPGA synthesis:

- ISE connector (Xilinx)

- Quartus II connector (Altera)

- Synplify connector (Synopsys)

- Precision connector (Mentor)

Analog IC simulation:

- HSpice & Eldo connector

- Virtuoso connector (Cadence)

Database: Cadence OpenAccess

Legacy design import: Import VHDL design to IP-XACT, Import Verilog design to IP-XACT


Magillem tools are available as plug-ins for the Eclipse framework (graphical interfaces), and can

also be driven from the command line through scripts (Tcl, Python).

B.9.5 Portability

OS: Windows, Linux


Available documentation: User Manual, Tutorials.




B.10. SMOG Tool Description

The Separated Model Generation tool is used to prepare the input to the HW and SW

estimation tools. SMOG takes information from four different sources. The executable

SystemC system specification contains the behaviour and the communication of the design.

The system input stimuli are used to test the implementation. User constraints on the HW/SW

mapping of particular components guide the DSE and finally the MARTE platform

description model defines details of the virtual platform that is used later in the COMPLEX

design flow.

B.10.1 Tool Overview

For a successful DSE the full executable SystemC system specification must be cut up into

self-contained modules/tasks. SMOG analyses the SystemC specification and separates

behavioural parts along communication points. According to the user-constrained HW/SW

mapping and the platform description model different interfaces (TLM calls, wrapper

modules etc.) are created which match the needs of the estimation tools that follow SMOG in

the COMPLEX design flow.

For every part a corresponding block test bench is created that is either based on stimuli files

or emulates the behaviour of the rest of the original system. The separated part and the test

bench form a new executable system that can be simulated and tested.

A virtual platform skeleton is generated according to mapping information and the

architecture/platform description. This skeleton can be combined with the BAC++ output

from the HW and SW estimation tools to form the executable virtual simulation platform.






SMOG uses four sources of information:

The executable system specification. This is standard SystemC 2.2 using a limited set of

communication elements: FIFOs, handshake- and double-handshake channels.

System input stimuli. These are trace files or designated test bench modules written in

SystemC.

User-constrained HW/SW mapping. This is a text file configuring components of the

system specification to be hardware or software. The components are identified by the instance

names or identifiers given in the SystemC source code.

Architecture/platform description. This is a standard IP-XACT file.
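The syntax of the HW/SW mapping file is not specified here. Purely as an illustration, the Python sketch below assumes one `<instance> = hw|sw` entry per line with `#` comments; this format, the instance names and the helper functions are hypothetical and are not the SMOG file format.

```python
from typing import Dict, List


def parse_mapping(text: str) -> Dict[str, str]:
    """Parse lines of the assumed form '<instance> = hw|sw'; '#' starts a comment."""
    mapping: Dict[str, str] = {}
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue  # blank line or comment only
        name, sep, target = (part.strip() for part in line.partition("="))
        target = target.lower()
        if sep != "=" or not name or target not in ("hw", "sw"):
            raise ValueError(f"line {number}: expected '<instance> = hw|sw'")
        mapping[name] = target
    return mapping


def components_mapped_to(mapping: Dict[str, str], target: str) -> List[str]:
    """Return the sorted instance names constrained to 'hw' or 'sw'."""
    return sorted(name for name, value in mapping.items() if value == target)


# Hypothetical instance names, as they might appear in a SystemC specification.
EXAMPLE = """
# user-constrained mapping
top.filter = hw
top.controller = sw   # control code stays in software
top.codec = HW
"""
mapping = parse_mapping(EXAMPLE)
```

A tool in this position would then use the two groups to decide which interface (for example a TLM call or a wrapper module) to generate for each separated part.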

[Figure: SMOG structure. Inputs: SystemC specification, system input stimuli, user HW/SW mapping, architecture/platform description. SMOG components: C++ front-end, elaborator, internal representation, part extractor, testbench generator, virtual system skeleton generator. Outputs: sequential C code, block testbench, virtual platform skeleton. Connected tools: PowerOpt, SW estimation, BAC++.]





There is no need for a graphical user interface; all information is passed via SystemC source

code and textual configuration files.

B.10.5 Portability

SMOG is developed under GNU/Linux, but since there is only text file interaction, porting

SMOG to other platforms should pose no problem.


There will be at least a detailed User Manual and a tutorial using use case 2 as an example.




B.11. IMEC Global Resource Manager (GRM) Tool Description



The first main part contains one power controller per HW component (generated by the tool

described in Section 4.6). The controller allows the HW component implementing the task to

be set into individual power modes, and provides an interface to the Global Resource Manager

(GRM) of the overall system.

The second main part is the GRM, optimizing the system parameters at run time, i.e. adapting

the hardware platform and the application configuration during execution in order to further

reduce the power consumption. The GRM acts as a middleware between the application and

the platform. Among other functionalities, the GRM can vary the frequency of processors,

power on and off power islands, select power modes of HW components, or switch between

different qualities of service proposed by the application. This GRM is described in the

following.

The Global Resource Manager (GRM) is loaded on the host processor of the platform. It is a

software task, specified in C, and running on top of the basic OS services in parallel with the

applications. The goals of the GRM are to support a holistic view of resources and quality

management, to transparently optimize the resource usage and the application mapping on the

platform, and to dynamically adapt to changing context.


Figure 24: Distributed and hierarchical GRM approach

The current GRM architecture (see Figure 24) follows a distributed and hierarchical approach.

On the one hand, the GRM is loaded on the host processor of the platform. It is a software

task running in parallel with the applications. It provides a bridge between the applications,

the user, and the platform; it conforms to each Intellectual Property (IP) core (e.g., ASIC,

FPGA, multi-CPUs); and it is used to find global and optimal trade-offs in application

mapping. A detailed view of the GRM is depicted in Figure 25. On the other hand, the

approach allows each IP core to possibly execute its own resource management without any

restriction, through a Local Resource Manager (LRM). Such an LRM encapsulates the local

policies and mechanisms used to initiate, monitor and control computation on its IP core.

Figure 25: GRM architecture

The following terminology assumptions are made. A system consists of multiple applications,

activated at run time. An application (e.g., video streaming) consists of jobs (e.g., MPEG4

encoder) communicating with each other through inter-job channels, as follows. (1) One job is

mapped entirely on one IP core. (2) Whereas the functional specification of a job is fixed,

there may be several specific algorithms or implementations for a given job. Also a job

implementation can take several forms (fixed logic, configurable logic, software) and offer

different characteristics. The associated meta-data (qualities, platform resource usage, costs),

provided at design time, are structured and stored in the Job information database of the GRM

to enable fast exploration during run-time decisions. (3) A job can consist of multiple tasks

communicating with each other. To conform to the hierarchical approach of the GRM, jobs

and communication between them are managed by the GRM, whereas tasks and

communication between them are managed at the IP core level.
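To make the role of the Job information database concrete, the following Python sketch shows one possible run-time decision: choosing, among the implementations of a job, the one with the lowest power that still meets a quality requirement on an available IP core. The GRM itself is specified in C; the data layout, field names, figures and selection rule below are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class Implementation:
    name: str
    ip_core: str      # the single IP core the job is mapped on
    quality: int      # higher is better (arbitrary units)
    power_mw: float   # average power cost of this implementation


# Hypothetical job information database: job name -> design-time meta-data.
JOB_DB: Dict[str, List[Implementation]] = {
    "mpeg4_encoder": [
        Implementation("sw_low", "cpu0", quality=1, power_mw=120.0),
        Implementation("sw_high", "cpu0", quality=3, power_mw=300.0),
        Implementation("hw_accel", "fpga0", quality=3, power_mw=90.0),
    ],
}


def select_implementation(job: str, min_quality: int,
                          free_cores: Set[str]) -> Optional[Implementation]:
    """Pick the lowest-power implementation that meets the quality
    requirement and whose IP core is currently available."""
    candidates = [
        impl for impl in JOB_DB.get(job, [])
        if impl.quality >= min_quality and impl.ip_core in free_cores
    ]
    return min(candidates, key=lambda impl: impl.power_mw, default=None)


best = select_implementation("mpeg4_encoder", 3, {"cpu0", "fpga0"})
fallback = select_implementation("mpeg4_encoder", 3, {"cpu0"})
```

When the FPGA is busy, the same query falls back to the high-quality software implementation, which illustrates how structured meta-data allows a fast decision without re-exploring the design space at run time.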

To provide a bridge between the applications, the user, and the platform, generic services are

supported by the GRM. Among them, we distinguish between services called by the GRM

and being automated and services called by the applications and controlled by the application

programmer. The latter services relate to job execution (start, stop, resume, kill,

synchronize, wait, switching point), message exchanges, event recognition and handling,

timer interrupts, and shared memory access. The services called by the GRM are classified

into managers to structure the interface between the GRM and the applications, the user, and

the platform, respectively.




The interface with the applications is provided by three managers: the application, job, and

inter-job channel managers. Their goal is to enable a holistic view of the platform resources, a

dynamic adaptation to changing context, and a transparent optimization of resource usage.

The interface with the user (or external entity accessing application specifications) is provided

by the QoE manager. QoE is a subjective measure of the application value from the user

perspective. It is influenced by the user terminal device (e.g., low- or high-definition TV), his

environment (e.g., in the car or at home), his expectations, the nature of the content and its

importance (e.g., a simple yes/no message or an orchestral concert).

The interface with the platform is provided by three managers: the platform, IP core, and

routing path managers. The goal of the IP core manager is to direct requests to the

corresponding IP cores. The routing path manager is responsible for establishing a set of

routing paths, globally optimizing the usage of the communication infrastructure, and

enabling dynamic bandwidth allocation.

More information can be found in [2].


As mentioned in Section 2.3.6.2, the GRM is loaded on the host processor of the platform. It

is a software task, specified in C, and running on top of the basic OS services.

The GRM interfaces with the design-time exploration tool (MOST), with the HW

components, including their power controllers (see Section 2.3.3) and with the processors of

the platform, through APIs.


A preliminary Graphical User Interface (GUI) is currently available. It is built with Tkinter,

a GUI widget set for the Python programming language.

B.11.5 Portability

Supported OS: Linux


Both a paper describing the GRM and a user manual describing how to apply it are planned.




B.12. SystemC Network Simulation Library (SCNSL) Description


Next-generation networked embedded systems pose new challenges in the design and

simulation domains. System design choices may affect the network behaviour, and Network

design choices may affect the System design. For this reason, it is important, at the

early stages of the design flow, to model and simulate not only the system under design but

also the heterogeneous networked environment in which it operates [19]. For this purpose, in

COMPLEX we exploit a modelling language traditionally used for System design,

SystemC, to build a packet-based network simulator named SystemC Network Simulation

Library (SCNSL).

SCNSL allows the modelling of network scenarios in which different kinds of nodes, or nodes

described at different abstraction levels, interact. Using SystemC as the single tool

has the advantage that HW, SW, and network can be jointly designed, validated and refined.

The following description of SCNSL regards a proof-of-concept implementation [20], but a

completely new version will be created in the COMPLEX project.


The driving motivation at the base of SCNSL is to have a single simulation tool to model both

the embedded system under design and the surrounding network environment. SystemC has

been chosen for its great flexibility, but a lot of work has been done to introduce some

important elements for network simulation.

Figure 26 Relationship of SCNSL with respect to traditional SystemC modelling.

Figure 26 shows the relationship among the system under design, SCNSL and the SystemC

standard library. In traditional scenarios, the system under design is modeled by using the

primitives provided by the SystemC standard library, i.e., modules, processes, ports, and

events. The resulting module is then simulated by a simulation engine, either the one provided

in the SystemC free distribution or a third-party tool.




To perform network simulations new primitives are required as described below. Starting

from SystemC primitives, SCNSL provides such elements so that they can be used together

with System models to create network scenarios.

Another point regards the description of the simulation scenario. In SystemC, such description

is usually provided in the sc_main() function which creates module instances and connects

them before starting simulation; in this phase, it is not possible to specify simulation events as

in a storyboard (e.g., "at time X the module Y is activated"). In many network simulators,

instead, such functionality is available: the designer not only specifies the network

topology but can also plan events, e.g., node movements, link failures, activation/deactivation

of traffic sources, and packet drops. For this reason, SCNSL also supports the

registration of such network-oriented events during the initial instantiation of SystemC

modules.
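The storyboard idea can be sketched independently of SystemC. The Python example below registers timed network events up front and replays them in time order; it only illustrates the concept and does not reproduce the SCNSL registration interface. The `Storyboard` class, the event labels and the state tables are hypothetical.

```python
import heapq
from typing import Callable, Dict, List, Tuple


class Storyboard:
    """Collects timed events before the run and replays them in time order."""

    def __init__(self) -> None:
        self._queue: List[Tuple[float, int, str, Callable[[], None]]] = []
        self._count = 0  # tie-breaker: keeps registration order at equal times
        self.log: List[Tuple[float, str]] = []

    def register(self, time_s: float, label: str,
                 action: Callable[[], None]) -> None:
        heapq.heappush(self._queue, (time_s, self._count, label, action))
        self._count += 1

    def run(self) -> None:
        while self._queue:
            time_s, _, label, action = heapq.heappop(self._queue)
            action()
            self.log.append((time_s, label))


def set_state(table: Dict[str, bool], key: str,
              value: bool) -> Callable[[], None]:
    """Build an action that updates one entry of a state table."""
    def action() -> None:
        table[key] = value
    return action


links = {"n1-n2": True}    # hypothetical link state: True means "up"
sources = {"n1": False}    # hypothetical traffic source state

board = Storyboard()
board.register(2.0, "link n1-n2 fails", set_state(links, "n1-n2", False))
board.register(0.5, "source n1 on", set_state(sources, "n1", True))
board.register(3.0, "source n1 off", set_state(sources, "n1", False))
board.run()
```

In SCNSL the ordering is delegated to the SystemC scheduler rather than to a private queue; the heap here merely stands in for that scheduler so that the sketch stays self-contained.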

As depicted in Figure 26, the model of the system under design uses both traditional SystemC

primitives for the specification of its internal behavior, and SCNSL primitives to send and

receive packets on the network channel and to test if the channel is busy. SCNSL takes care of

translating network primitives (e.g., packet events) into SystemC primitives.

Main components

To support network modeling and simulation, a tool has to provide the following elements:

- Kernel: the kernel is responsible for the correct simulation, i.e., its adherence to the behavior

of an actual communication channel; the kernel has to execute events in the correct temporal

order and it has to take into account the physical features of the channel such as, for example,

propagation delay, signal loss and so forth.

- Node: nodes are the active elements of the network; they produce, transform and consume

transmitted data.

- Packet: in packet-switched networks the packet is the unit of data exchanged among nodes; it

consists of a header and a payload.

- Channel: the channel is an abstraction of the transmitting medium which connects two or

more nodes; it can be either a point-to-point link or a shared medium.

- Port: nodes use ports to send and receive packets.




Figure 27 Main components of SCNSL.

Figure 27 shows the main components of SCNSL; they can be easily related to the previous

list as explained below.

Channels are very important components, because they are an abstraction of the transmission

media. Standard SystemC channels are generally used to model interconnections between HW

components and, therefore, they can be used to model the network at the physical level. However,

many general purpose network simulators reproduce transmissions at packet level to speed up

simulations. SCNSL follows this approach and provides a flexible channel abstraction named

Communicator_if_t. A communicator is the most abstract transmission component and, in

fact, both NodeProxy and Network classes derive from it. New capabilities and behaviour can

be easily added by extending this class. Communicators can be interconnected with each other to

create chains. Each valid chain shall have on one end a NodeProxy instance and, on the other

end, the Network; hence transmitted packets will move from the source NodeProxy to the

Network traversing zero or more intermediate communicators and then they will eventually

traverse the communicators placed between the Network and the destination NodeProxy. In

this way, it is possible to modify the simulation behaviour by just creating a new

communicator and placing its instance between the network and the desired NodeProxy.
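The chain of communicators can be illustrated with a short Python sketch. The class names echo the text, but this is not the SCNSL C++ API: the `bind` and `send` methods, the `DropEveryNth` communicator and the packet strings are hypothetical, and only the direction from the source NodeProxy to the Network is modelled.

```python
from typing import List, Optional


class Communicator:
    """Most abstract transmission element: forwards packets to the next one."""

    def __init__(self) -> None:
        self.next: Optional["Communicator"] = None

    def bind(self, nxt: "Communicator") -> "Communicator":
        """Connect this communicator to the next one and return the latter."""
        self.next = nxt
        return nxt

    def send(self, packet: str) -> None:
        if self.next is not None:
            self.next.send(packet)


class NodeProxy(Communicator):
    """Node-side end of a chain; inherits plain forwarding."""


class Network(Communicator):
    """Network-side end of a chain; records what reaches the medium."""

    def __init__(self) -> None:
        super().__init__()
        self.delivered: List[str] = []

    def send(self, packet: str) -> None:
        self.delivered.append(packet)


class DropEveryNth(Communicator):
    """Intermediate communicator that silently drops every n-th packet."""

    def __init__(self, n: int) -> None:
        super().__init__()
        self.n = n
        self.seen = 0

    def send(self, packet: str) -> None:
        self.seen += 1
        if self.seen % self.n == 0:
            return  # packet dropped, nothing is forwarded
        super().send(packet)


proxy = NodeProxy()
network = Network()
proxy.bind(DropEveryNth(3)).bind(network)
for index in range(6):
    proxy.send(f"pkt{index}")
```

Neither the node proxy nor the network had to change to introduce packet loss, which is the point of placing new behaviour in an intermediate communicator.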

The SCNSL simulation kernel is implemented by the Network_if_t class. This class is the

most complex object of SCNSL, because it manages transmissions and, for this reason, it

must be highly optimized. For instance, in the wireless model, the network transmits packets

and simulates their transmission delay; it can delete ongoing transmissions, change node

position, check which nodes are able to receive a packet, and verify if a received packet has

been corrupted due to collisions. The standard SystemC kernel does not address these aspects

directly, but it provides important primitives such as concurrency models and events. The

network class uses these SystemC primitives to reproduce transmission behaviour. In

particular, it is worth noting that SCNSL does not have its own scheduler, since it exploits the

SystemC scheduler by mapping network events onto standard SystemC events. There are two

concrete classes implementing the Network interface: WirelessNetwork and

LinkNetwork, which reproduce the wireless and wired behaviour of a channel, respectively.

For performance reasons, the simulation kernel has been implemented inside the same object

that implements the wireless/wired channel.
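Two of the tasks attributed to the network class, computing the transmission delay and detecting packets corrupted by collisions, can be sketched as follows. This Python example is a simplified model under explicit assumptions: a single shared medium, all nodes within range of each other, and a delay equal to packet length divided by bitrate. It is not the SCNSL implementation, and all names and numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class Transmission:
    sender: str
    start_s: float      # time at which the first bit is sent
    length_bits: int    # packet length used for the delay computation


def duration_s(tx: Transmission, bitrate_bps: float) -> float:
    """Transmission delay of one packet at the given bitrate."""
    return tx.length_bits / bitrate_bps


def corrupted_senders(transmissions: List[Transmission],
                      bitrate_bps: float) -> Set[str]:
    """Return the senders whose packets overlap in time with another
    packet on the shared medium (assumption: every node hears every other)."""
    damaged: Set[str] = set()
    for i, a in enumerate(transmissions):
        a_end = a.start_s + duration_s(a, bitrate_bps)
        for b in transmissions[i + 1:]:
            b_end = b.start_s + duration_s(b, bitrate_bps)
            if a.start_s < b_end and b.start_s < a_end:
                damaged.update((a.sender, b.sender))
    return damaged


BITRATE_BPS = 250_000.0   # hypothetical channel bitrate
on_air = [
    Transmission("A", 0.000, 1000),   # occupies the medium until 0.004 s
    Transmission("B", 0.002, 500),    # starts while A is still transmitting
    Transmission("C", 0.010, 1000),   # starts after the medium is free
]
damaged = corrupted_senders(on_air, BITRATE_BPS)
```

A real wireless model would additionally account for node positions and propagation effects when deciding which receivers are affected; the sketch only captures the timing aspect.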




The Node is one critical point of our library which supports both System and Network design.

From the point of view of a network simulator the node is just the producer or consumer of

packets and therefore its implementation is not important. However, for the system designer,

node implementation is crucial and many operations are connected to its modelling, i.e.,

change of abstraction level, validation, fault injection, HW/SW partitioning, mapping to an

available platform, synthesis, and so forth. For this reason we introduced the class

NodeProxy_if_t which decouples node implementation from network simulation. Each Node

instance is connected to a NodeProxy instance and, from the perspective of the network

simulation kernel, the NodeProxy instance is the alter ego of the node. This solution keeps

a stable and well-defined interface between the NodeProxy and the simulation kernel

and, at the same time, leaves complete freedom in the modelling choices for the node. As

depicted in Figure 27, the box named Node is separated from the simulation kernel by the box

named NodeProxy, and different strategies can be adopted for modelling the node, e.g.,

interconnection of basic blocks or a finite-state machine. It is worth noting that other SystemC

libraries can also be used in node implementation, e.g., re-used IP blocks and testing

components such as the well-known SystemC Verification Library. For example, the figure

also shows an optional package above the node; this package is provided by SCNSL and it

contains some additional SystemC modules, i.e., an RTL description of a timer and a source

of stimuli. These components may simplify designer's work even if they are outside the scope

of network simulation.

Another critical point in the design of the tool has been the concept of packet. Generally,

packet format depends on the corresponding protocol even if some features are always

present, e.g., the length and source/destination address. System design requires a bit-accurate

description of packet contents to test parsing functionality while from the point of view of the

network simulator the strictly required fields are the length for bitrate computation and some

flags to mark collisions (if routing is performed by the simulator, source/destination addresses

are used too). Furthermore, the smaller the number of different packet formats, the more

efficient is the simulator implementation. To meet these opposite requirements in SCNSL, an

internal packet format is used by the simulator while the system designer can use other

different packet formats according to protocol design. The conversion between the user packet

format and the internal packet format is performed in the NodeProxy.
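The split between a bit-accurate user packet and a lean internal packet can be sketched in Python as below. The field layout (one byte each for source and destination, a two-byte payload length, then the payload) and all type and function names are assumptions chosen for the example; they do not describe the actual SCNSL packet classes.

```python
from dataclasses import dataclass


@dataclass
class UserPacket:
    """Protocol-level packet as seen by the system designer (hypothetical layout)."""
    source: int        # 0..255
    destination: int   # 0..255
    payload: bytes

    def to_bytes(self) -> bytes:
        # Header: source, destination, 2-byte big-endian payload length.
        header = bytes([self.source, self.destination])
        header += len(self.payload).to_bytes(2, "big")
        return header + self.payload


@dataclass
class InternalPacket:
    """What the simulator needs: length for timing, addresses, collision flag."""
    length_bits: int
    source: int
    destination: int
    collided: bool = False
    data: bytes = b""   # opaque, bit-accurate user packet carried along


def to_internal(packet: UserPacket) -> InternalPacket:
    """Conversion performed on the sending side of the node proxy."""
    raw = packet.to_bytes()
    return InternalPacket(len(raw) * 8, packet.source,
                          packet.destination, data=raw)


def to_user(packet: InternalPacket) -> UserPacket:
    """Conversion performed on the receiving side of the node proxy."""
    raw = packet.data
    size = int.from_bytes(raw[2:4], "big")
    return UserPacket(raw[0], raw[1], raw[4:4 + size])


original = UserPacket(source=1, destination=2, payload=b"hello")
internal = to_internal(original)
restored = to_user(internal)
```

The simulator only reads `length_bits`, the addresses and the collision flag, while the node still receives the full byte sequence it needs to test its parsing code.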

Figure 28 shows the class hierarchy of the Communicator; as said before, both Network and

NodeProxy inherit from the Communicator. A wireless network is a specific kind of Network

with its own behaviour and thus derives from the abstract Network. NodeProxies depend both

on the type of network and on the abstraction level used in node implementation; for example,

Figure 28 reports a TLM and an RTL version of a wireless NodeProxy.




Figure 28 Class hierarchy of the Communicator.


SCNSL is an extension of SystemC. Thus, it is directly integrated into any design flow,

library or tool which supports SystemC.

The only additional requirement is that the user module bound to the

NodeProxy shall derive from the Node interface.


SCNSL has no direct user interface. The simulation scenario is instantiated inside the

sc_main() of the test case, as usually done for standard SystemC simulations.

Regarding simulation outputs, there are two different traces.

Nodes outputs: since nodes are completely under user control, statistics, information and

traces can be collected in the format most suitable for the user's objectives, such as log files, database

queries, etc.

Backend outputs: the SCNSL internal modules have a tracing facility, which can be

activated at compile time. The traces are printed on standard output, but users can redirect

them to a file for post-processing.

The new version of SCNSL shall have better facilities to generate traces to be used in the

exploitation & optimization phase of the COMPLEX flow.

B.12.5 Portability

SCNSL is an extension of SystemC written in standard C++ and therefore can be used on the

same platforms available for SystemC.


The simulator is an open source project hosted at http://scnsl.sourceforge.net.





Development support includes:

- Wiki pages

- Source code repository (SVN for the last version, Bazaar for the new version)

- Bug tracking facility

- Mailing list

New SCNSL versions shall also have other documents in LaTeX/PDF describing the

programming style adopted and the User Guide.
