Performance Simulators José Nelson Amaral CMPUT 429 Dept. of Computing Science University of...

Post on 03-Jan-2016


Performance Simulators

José Nelson Amaral

CMPUT 429

Dept. of Computing Science

University of Alberta

Reading material

• Section 1.3.2 Performance Simulators in Baer’s textbook.

Circuit Design Simulation (SPICE)

• Wires, gates, transistors, CMOS, electric signals, etc.

Logical Design Simulation

• Arithmetic and Logic Units (ALUs), Programmable Logic Arrays (PLAs)

• Hardware description languages (VHDL, Verilog)

Register Transfer Level (RTL)

• Microarchitecture level: data flow between basic blocks; control lines

RETRO: Univ. of Western Australia

rtlib: Universität Hamburg

Processor and Memory Hierarchy Description (SimpleScalar)

• ISA definition, cache specifications, etc.

System level simulators

• I/O, multithreading, multiprocessing

Flavors of Simulation

• Trace-driven simulators: input is a sequence of instructions that have been executed by a program.
– Needs trace collection
• hardware monitors: imprecise
• software monitors: slow and interfere with execution
• need lots of storage for the traces

• Execution-driven simulators: input is from a program interpreter.
– Level of detail is a design choice

Baer, p. 19
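To make the trace-driven flavor concrete, here is a minimal sketch that replays a previously collected trace through a cache model. It assumes the trace is a list of byte addresses and that the modeled cache is direct-mapped; the function name `simulate_direct_mapped`, the cache geometry, and the trace itself are hypothetical and not taken from the slides or the textbook.

```python
def simulate_direct_mapped(trace, num_lines=4, line_size=16):
    """Replay a trace of byte addresses through a direct-mapped cache model.

    Returns (hits, misses). The cache geometry is a hypothetical choice.
    """
    tags = [None] * num_lines          # one stored tag per cache line
    hits = misses = 0
    for addr in trace:
        block = addr // line_size      # memory block containing the address
        index = block % num_lines      # cache line that block maps to
        tag = block // num_lines       # distinguishes blocks sharing a line
        if tags[index] == tag:
            hits += 1
        else:
            misses += 1
            tags[index] = tag          # fill the line on a miss
    return hits, misses


# A made-up trace standing in for addresses recorded during a real run.
trace = [0, 4, 8, 64, 0, 4, 128, 64]
hits, misses = simulate_direct_mapped(trace)
print(hits, misses)
```

The simulator never runs the program; it only consumes the recorded trace, which is why trace collection and storage are the main costs of this approach. An execution-driven simulator would instead generate the addresses on the fly by interpreting the program.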

Drawbacks of simulators

• Difficult to simulate I/O
• Simulation takes a long time
– slowdown of 30 to 40 times!
• Takes more than 5 hours to simulate a 2-minute program.

Baer, p. 20

Speeding Up Simulation

• Simulate only the first billion instructions
– Probably not representative of the actual execution.

• Fast-forward the first billion instructions and simulate the second billion instructions
– Again, only one contiguous portion of the program is simulated.

• Sample execution intervals (e.g., every 10th interval of 100 million instructions)

• Detect similar phases in the program.

Baer, p. 21
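The first three strategies differ only in which intervals of the execution are handed to the detailed simulator. The sketch below illustrates that choice, assuming the execution is divided into fixed-size intervals numbered from 0; the function names and the 50-interval program are hypothetical.

```python
def first_n(num_intervals, n):
    """Simulate only the first n intervals."""
    return list(range(min(n, num_intervals)))


def fast_forward(num_intervals, skip, n):
    """Skip the first `skip` intervals, then simulate the next n."""
    return list(range(skip, min(skip + n, num_intervals)))


def periodic_sample(num_intervals, period):
    """Simulate one interval out of every `period` intervals."""
    return list(range(0, num_intervals, period))


# Hypothetical program: 50 intervals of 100 million instructions each,
# so 10 intervals correspond to one billion instructions.
num_intervals = 50
print(first_n(num_intervals, 10))           # first billion instructions
print(fast_forward(num_intervals, 10, 10))  # second billion instructions
print(periodic_sample(num_intervals, 10))   # every 10th interval
```

Only the periodic sample touches the whole execution; the other two cover a single contiguous region, which is the weakness noted in the bullets.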

Speeding Up Simulation (Phase-based simulation)

1. Divide execution into intervals (e.g., 100 million instructions).

2. Give each interval a signature (e.g., average frequency of execution of each basic block).

[Figure: one signature value per interval: 0.1 0.8 0.4 0.7 0.2 0.5 0.9 0.3 0.1 0.3 0.8 0.6 0.1]

3. Cluster intervals based on signature.

[Figure: interval signatures marked by cluster: 0.1 0.8 0.4 0.7 0.2 0.5 0.9 0.3 0.1 0.3 0.8 0.6 0.0]

4. Simulate a limited number of samples from each cluster.

[Figure: sampled intervals: 0.4 0.9 0.1 0.3 0.6]

5. Weight the results of the simulation based on cluster frequency.

Baer, p. 21
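The five steps can be sketched as follows. This is only an illustration under several assumptions: each signature is a single number in [0, 1] (the values are the ones shown on the slide), whereas a real signature would be a vector of basic-block frequencies; equal-width binning stands in for a proper clustering algorithm; one representative is simulated per cluster; and `simulate_interval` is a hypothetical callback that returns a metric such as CPI for a given interval.

```python
from collections import defaultdict


def cluster_by_bins(signatures, num_bins=3):
    """Group interval indices into equal-width signature bins.

    This is a simple stand-in for real clustering; signatures are
    assumed to lie in [0, 1].
    """
    clusters = defaultdict(list)
    for i, s in enumerate(signatures):
        b = min(int(s * num_bins), num_bins - 1)
        clusters[b].append(i)
    return dict(clusters)


def pick_representative(members, signatures):
    """Pick the member whose signature is closest to the cluster mean."""
    mean = sum(signatures[i] for i in members) / len(members)
    return min(members, key=lambda i: abs(signatures[i] - mean))


def phase_based_estimate(signatures, simulate_interval, num_bins=3):
    """Simulate one interval per cluster and weight by cluster size."""
    clusters = cluster_by_bins(signatures, num_bins)
    total = len(signatures)
    estimate = 0.0
    simulated = []
    for members in clusters.values():
        rep = pick_representative(members, signatures)
        simulated.append(rep)
        # Weight = fraction of all intervals that fall in this cluster.
        estimate += simulate_interval(rep) * len(members) / total
    return estimate, sorted(simulated)


# Signature values from the slide, one per interval.
signatures = [0.1, 0.8, 0.4, 0.7, 0.2, 0.5, 0.9, 0.3, 0.1, 0.3, 0.8, 0.6, 0.1]


def fake_cpi(i):
    """Made-up metric standing in for a detailed simulation of interval i."""
    return 1.0 + signatures[i]


estimate, simulated = phase_based_estimate(signatures, fake_cpi)
print("intervals simulated:", simulated)
print("weighted estimate:", round(estimate, 3))
```

Here only 3 of the 13 intervals are simulated in detail; the rest contribute through the weight of the cluster they belong to.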

Simulation Accuracy

[Figure: accuracy comparison of three approaches: first billion instructions, fast forwarding, phase-based]

Baer, p. 22

Smaller inputs

• Handcraft smaller inputs to a set of benchmarks
– Smaller runs should be statistically equivalent to the original benchmark runs
– Advantage: no sampling
– Disadvantage: no automation, difficult to find adequate reduced inputs

Baer, p. 22

Further reading

• T. Austin, E. Larson, and D. Ernst, “SimpleScalar: An Infrastructure for Computer System Modeling,” IEEE Computer, 35, 2, Feb. 2002, 59-67


How to find papers in library

1. Go to http://www.library.ualberta.ca/
2. Click on Science on the left-hand side.
3. Click on Computing Science.
4. Select the database (most common are ACM, IEEE, Springer). For this paper click on IEEE Xplore.
5. If you are off-campus click on Web Access. Enter your CCID and password.