
Design Space Exploration with SimpleScalar


Vittorio Zaccaria – Alari @ ST 2001

The SimpleScalar Toolset


Simulation Suite


SimpleScalar ISA

• clean and simple instruction set architecture:
  • MIPS/DLX plus more addressing modes, minus delay slots
  • 64-bit instruction encoding facilitates instruction set research
  • 16-bit space for hints, new instructions, and annotations
  • four-operand instruction format, up to 256 registers
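The field widths above add up as follows: 16 annotation bits plus four 8-bit register specifiers (8 bits address up to 256 registers) leave 16 bits of the 64-bit word for the opcode. The sketch below splits an instruction word into those fields. The bit positions, the field names and the `RegFormat` type are assumptions made for illustration; the slides give only the widths.

```python
from typing import NamedTuple


class RegFormat(NamedTuple):
    """Fields of a hypothetical 64-bit, four-operand instruction word."""
    annote: int    # 16-bit space for hints and annotations
    opcode: int    # 16 bits (what remains of the 64-bit word)
    rs: int        # 8-bit register specifier
    rt: int        # 8-bit register specifier
    rd: int        # 8-bit register specifier
    ru_shamt: int  # 8-bit fourth operand


def decode_register_format(word: int) -> RegFormat:
    """Split a 64-bit word into fields; the bit positions are assumed."""
    if not 0 <= word < (1 << 64):
        raise ValueError("expected a 64-bit instruction word")
    return RegFormat(
        annote=(word >> 48) & 0xFFFF,
        opcode=(word >> 32) & 0xFFFF,
        rs=(word >> 24) & 0xFF,
        rt=(word >> 16) & 0xFF,
        rd=(word >> 8) & 0xFF,
        ru_shamt=word & 0xFF,
    )


# Build a word from known field values and decode it again.
example_word = (0xBEEF << 48) | (0x0042 << 32) | (1 << 24) | (2 << 16) | (3 << 8) | 4
decoded = decode_register_format(example_word)
```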


SimpleScalar Architected State


Out of order simulator

Configurable set of FUs


Configurable Memory Hierarchy

• All cache and TLB configurations are specified with the same format:

<nsets>:<bsize>:<assoc>:<repl>

• Block replacement policy
  • l - for LRU
  • f - for FIFO
  • r - for RANDOM
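As an illustration of the format above, the following sketch parses a `<nsets>:<bsize>:<assoc>:<repl>` string and derives the total capacity. The names `CacheConfig` and `parse_cache_config` are hypothetical, and the capacity formula assumes that `bsize` is the block size in bytes.

```python
from dataclasses import dataclass

# Replacement policy letters as listed on the slide.
REPLACEMENT_POLICIES = {"l": "LRU", "f": "FIFO", "r": "RANDOM"}


@dataclass(frozen=True)
class CacheConfig:
    nsets: int  # number of sets
    bsize: int  # block size (assumed to be in bytes)
    assoc: int  # associativity (ways per set)
    repl: str   # 'l', 'f' or 'r'

    @property
    def capacity_bytes(self) -> int:
        """Total data capacity: sets x block size x ways."""
        return self.nsets * self.bsize * self.assoc


def parse_cache_config(spec: str) -> CacheConfig:
    """Parse '<nsets>:<bsize>:<assoc>:<repl>' into a CacheConfig."""
    fields = spec.strip().split(":")
    if len(fields) != 4:
        raise ValueError(f"expected 4 colon-separated fields, got {spec!r}")
    nsets, bsize, assoc = (int(field) for field in fields[:3])
    repl = fields[3].lower()
    if repl not in REPLACEMENT_POLICIES:
        raise ValueError(f"unknown replacement policy {fields[3]!r}")
    if min(nsets, bsize, assoc) <= 0:
        raise ValueError("nsets, bsize and assoc must be positive")
    return CacheConfig(nsets, bsize, assoc, repl)


l2 = parse_cache_config("1024:64:4:f")
print(l2.capacity_bytes, REPLACEMENT_POLICIES[l2.repl])
```

With the assumptions above, `1024:64:4:f` describes a 4-way cache of 1024 x 64 x 4 bytes with FIFO replacement.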


Design Space Exploration

• Metric definition
  • Energy*Delay
  • Area*Delay
• Design space definition
  • L1 and L2 caches, number of ALUs ...
• Embedded Application Definition
• Metric minimization
  • Exhaustive search
  • Greedy search
  • Gradient search
  • Simulated Annealing and so on


Design Space Exploration: A case study.

• Metric defined: Price over Performance (PoP) = area*CPI
• Design space:
  • sets, block size, associativity and replacement policy for each cache;
  • number of integer ALUs;
  • number of integer multipliers;
  • number of floating-point ALUs;
  • number of floating-point multipliers.

Design space exploration performed by F. Cassoli and A. Ferrante @ ALARI


Design Space Definition

• Ranges for each parameter
  • DL1:128:{32, 64}:4:L
  • IL1:{256, 512}:32:1:L
  • UL2:{1024, 2048}:{64, 128}:4:{L, F}
  • IALU:{2, 4}
  • IMULT:{1, 2, 4}
  • FPALU:{1, 4}
  • FPMULT:{1, 2}
• 768 different cases
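The count of 768 is the product of the range sizes: 2 (DL1) x 2 (IL1) x 2 x 2 x 2 (UL2) x 2 (IALU) x 3 (IMULT) x 2 (FPALU) x 2 (FPMULT). The sketch below enumerates the points. The dictionary keys are hypothetical names, and each range is read in the `<nsets>:<bsize>:<assoc>:<repl>` order, so the DL1 range {32, 64} is taken as the block size; that reading is an assumption.

```python
from itertools import product

# Only the parameters that vary are listed; fixed fields (for example the
# DL1 associativity of 4) are left out. Key names are illustrative.
DESIGN_SPACE = {
    "dl1_bsize": (32, 64),
    "il1_nsets": (256, 512),
    "ul2_nsets": (1024, 2048),
    "ul2_bsize": (64, 128),
    "ul2_repl": ("l", "f"),
    "ialu": (2, 4),
    "imult": (1, 2, 4),
    "fpalu": (1, 4),
    "fpmult": (1, 2),
}


def enumerate_design_space(space=DESIGN_SPACE):
    """Yield every configuration as a dict mapping parameter name to value."""
    names = list(space)
    for values in product(*(space[name] for name in names)):
        yield dict(zip(names, values))


configs = list(enumerate_design_space())
print(len(configs))  # 768
```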


Embedded Application

• EPIC decoder (Efficient Pyramid Image deCoder)
  • Image data compression utility written in C.
  • Free MediaBench source
  • Based on wavelet decomposition and a Huffman entropy (de)coder.


Cost Function

F(x) = A(x)*D(x)

• A(x): area of x (sum of the equivalent gates of each module). Models found in the literature.
• D(x): delay of x (computed through simulation of EPIC on architecture x).
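A minimal sketch of how F(x) = A(x)*D(x) could drive an exhaustive search. The functions `toy_area` and `toy_delay` are made-up placeholders with invented coefficients; they stand in for the literature area models and the EPIC simulation, and are not the models used in the case study.

```python
import math
from typing import Callable, Dict, Iterable, Optional, Tuple

Config = Dict[str, int]


def cost(x: Config, area: Callable[[Config], float],
         delay: Callable[[Config], float]) -> float:
    """F(x) = A(x) * D(x)."""
    return area(x) * delay(x)


def exhaustive_search(configs: Iterable[Config],
                      area: Callable[[Config], float],
                      delay: Callable[[Config], float],
                      ) -> Tuple[Optional[Config], float]:
    """Evaluate every configuration and return the one with the lowest cost."""
    best, best_cost = None, math.inf
    for x in configs:
        c = cost(x, area, delay)
        if c < best_cost:
            best, best_cost = x, c
    return best, best_cost


# Placeholder models (assumptions, for illustration only).
def toy_area(x: Config) -> float:
    return 1000.0 + 200.0 * x["ialu"]  # area grows with each ALU


def toy_delay(x: Config) -> float:
    return 1.0 + 1.0 / x["ialu"]       # diminishing returns on delay


candidates = [{"ialu": n} for n in (1, 2, 4)]
best, best_cost = exhaustive_search(candidates, toy_area, toy_delay)
print(best, best_cost)
```

Even with these toy models the product has an interior minimum: adding ALUs keeps reducing delay, but past some point the extra area outweighs the gain.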


Result of the exhaustive search

[Figure: plot with x-axis ticks from 2 to 702 and y-axis ticks from 9.E+05 to 3.E+06; legend: DL1-# of sets = 32, DL1-# of sets = 64]


Optimal Configuration

• The lowest value of the PoP is 998’732.31, obtained with:
  DL1: 128:32:4:L
  IL1: 256:32:1:L
  UL2: 1024:64:4:F
  IALU: 4
  IMULT: 2
  FPALU: 4
  FPMULT: 2
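For illustration, the configuration above could be turned into a simulator command line as sketched below. The option names are assumed to be those of the `sim-outorder` simulator shipped with SimpleScalar 3.0 and should be checked against the simulator's own help output; the helper `sim_outorder_args` and the benchmark path are hypothetical.

```python
import shlex
from typing import Dict, List, Union

# Optimal configuration reported above, in <nsets>:<bsize>:<assoc>:<repl> form.
OPTIMAL: Dict[str, Union[str, int]] = {
    "dl1": "128:32:4:l",
    "il1": "256:32:1:l",
    "ul2": "1024:64:4:f",
    "ialu": 4,
    "imult": 2,
    "fpalu": 4,
    "fpmult": 2,
}


def sim_outorder_args(cfg: Dict[str, Union[str, int]], binary: str,
                      simulator: str = "sim-outorder") -> List[str]:
    """Build an argument list for one simulation (option names assumed)."""
    return [
        simulator,
        "-cache:dl1", f"dl1:{cfg['dl1']}",
        "-cache:il1", f"il1:{cfg['il1']}",
        "-cache:dl2", f"ul2:{cfg['ul2']}",
        "-cache:il2", "dl2",  # instruction L2 shares the data L2 (unified)
        "-res:ialu", str(cfg["ialu"]),
        "-res:imult", str(cfg["imult"]),
        "-res:fpalu", str(cfg["fpalu"]),
        "-res:fpmult", str(cfg["fpmult"]),
        binary,
    ]


# "path/to/epic_decoder" is a placeholder, not a path from the slides.
args = sim_outorder_args(OPTIMAL, "path/to/epic_decoder")
print(shlex.join(args))
```

Building a list rather than a string keeps each argument separate, so the result can be passed directly to `subprocess.run` without shell quoting issues.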


Cost Function Properties

• The difference between the PoPs for a DL1 cache of 32 and of 64 sets is very small.

• The difference between the PoPs for an IL1 cache of 256 and of 512 sets is very small.


[Figure: plot with x-axis ticks from 1 to 151 and y-axis ticks from 0.E+00 to 3.E+06; legend: UL2-# of sets = 1024, UL2-# of sets = 2048, each split into UL2-dim of block = 64 and UL2-dim of block = 128]


Cost Function Properties

• Increasing the number of UL2 sets increases the PoP (on average).
• Increasing the block size of the UL2 cache always leads to an abrupt growth of the PoP.
• The L2 cache grows so much that it becomes significantly larger than the rest of the system.


Cost Function Properties

[Figure: plot with x-axis ticks from 1 to 41 and y-axis ticks from 9.80E+05 to 1.18E+06; legend: # IALU = 2, # IALU = 4]


Cost Function Properties

[Figure: plot with x-axis ticks from 25 to 45 and y-axis ticks from 9.95E+05 to 1.04E+06; legend: # IMULT = 1, # IMULT = 2, # IMULT = 4; FPALU = 1, FPALU = 4]


Cost Function Properties

[Figure: plot with x-axis ticks from 37 to 40 and y-axis ticks from 9.98E+05 to 1.01E+06; legend: FPMULT = 1, UL2 type = L; FPMULT = 1, UL2 type = F; FPMULT = 2, UL2 type = L; FPMULT = 2, UL2 type = F]


Area – CPI scatter plot

[Figure: scatter plot; x axis: Area (ticks from 1200000 to 3200000); y axis: CPI (ticks from 0 to 1.4)]


Conclusions

• The PoP decreases when the number of integer ALUs is doubled: a great benefit for a small area increase.
• The optimal configuration has IMULT = 2 (not 1 or 4, because EPIC does not expose much parallelism).
• However, FPALU = 4 leads to better results than FPALU = 1.
• The FIFO policy for the L2 cache outperforms LRU.
• Adding an FPMULT brings the same benefits.


Conclusions

• A greedy algorithm has also been applied to minimize the cost function.
• Starting from different points:
  • average number of simulations required = 49
  • minimum number of simulations required = 11
  • maximum number of simulations required = 83
• The full-search optimum was always reached.
• Since an exhaustive search needs 768 simulations, the greedy search cuts the number of simulations by about 93.6% on average (49 instead of 768).
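The slides do not describe the greedy algorithm that was used. The sketch below shows one possible variant, assumed here for illustration: change one parameter at a time, keep any change that lowers the cost, and stop when a full pass brings no improvement. The search space and `toy_cost` are hypothetical stand-ins; in the case study each cost evaluation would be one simulation.

```python
from typing import Callable, Dict, Sequence, Tuple

Config = Dict[str, int]


def greedy_search(space: Dict[str, Sequence[int]], start: Config,
                  cost: Callable[[Config], float]) -> Tuple[Config, float, int]:
    """One-parameter-at-a-time descent.

    Returns the best configuration found, its cost, and the number of
    distinct cost evaluations (repeated points are served from a cache).
    """
    evaluated: Dict[tuple, float] = {}

    def evaluate(x: Config) -> float:
        key = tuple(sorted(x.items()))
        if key not in evaluated:
            evaluated[key] = cost(x)
        return evaluated[key]

    current = dict(start)
    current_cost = evaluate(current)
    improved = True
    while improved:
        improved = False
        for name, values in space.items():
            for value in values:
                if value == current[name]:
                    continue
                candidate = {**current, name: value}
                candidate_cost = evaluate(candidate)
                if candidate_cost < current_cost:
                    current, current_cost = candidate, candidate_cost
                    improved = True
    return current, current_cost, len(evaluated)


# Hypothetical search space and cost, for illustration only.
toy_space = {"ialu": (2, 4), "imult": (1, 2, 4), "fpalu": (1, 4)}


def toy_cost(x: Config) -> float:
    return 1 + abs(x["ialu"] - 4) + abs(x["imult"] - 2) + abs(x["fpalu"] - 4)


best, best_cost, n_sims = greedy_search(
    toy_space, {"ialu": 2, "imult": 1, "fpalu": 1}, toy_cost)
print(best, best_cost, n_sims)
```

A descent of this kind is only guaranteed to reach a local minimum, and the number of evaluations depends on the starting point, which is consistent with the reported spread between 11 and 83 simulations. With the reported average, the saving over the exhaustive search is 1 - 49/768, roughly 0.936.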

