Date post: 18-Jan-2016
Power Awareness through Selective Dynamically Optimized Traces Roni Rosner, Yoav Almog, Micha Moffie, Naftali Schwartz and Avi Mendelson – Intel Labs, Haifa, Israel Presenter: Ioana Burcea
Transcript
Page 1: Power Awareness through Selective Dynamically Optimized Traces Roni Rosner, Yoav Almog, Micha Moffie, Naftali Schwartz and Avi Mendelson – Intel Labs,

Power Awareness through Selective Dynamically Optimized Traces

Roni Rosner, Yoav Almog, Micha Moffie, Naftali Schwartz and Avi Mendelson – Intel Labs, Haifa, Israel

Presenter: Ioana Burcea

Page 2

Agenda

Motivation for PARROT = Power-Aware aRchitecture Running Optimized Traces

PARROT Concept and Architecture

Performance and Energy Results

Discussion
– What makes PARROT a power-aware architecture?
– What is new about this paper? / What are the contributions of this paper?

Page 3

Motivation

We pay more energy per task
– Poor scaling of performance with power consumption

PARROT tries to change the balance
– Filtering Techniques to Improve Trace-Cache Efficiency – PACT 2001
– Selecting Long Atomic Traces for High Coverage – ICS 2003
– Specialized Dynamic Optimizations for High-Performance Energy-Efficient Microarchitecture – CGO 2004

Page 4

PARROT Concepts – The Big Picture

Based on the well-known cold/hot (10/90) paradigm

PARROT Principles
– Reuse: trace-cache centric
– Dynamic optimizations: more performance with less energy
– Focus: invest where it pays
– Pipeline decoupling: hybrid front-end, cold and hot execution pipelines
– Transparency: immune to s/w compatibility

Page 5

Traces and Trace Selection

Decoded atomic traces
– Complex retirement & recovery in case of misprediction
– More aggressive optimizations

Trace Selection – deterministic criteria
– Capacity limitation: 64 uops
– Complete basic blocks
– Terminating CTI (control-transfer instructions)
  – Indirect jumps, software exceptions, backward taken branches
– Return instructions: procedure inlining
– Trace join
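The selection criteria above can be illustrated with a small sketch. It is only a toy model under stated assumptions: the `BasicBlock` type, the CTI kind strings, and the greedy loop are hypothetical names introduced here, and the handling of return instructions (procedure inlining) and trace joins is not modeled because the slide gives no detail on them.

```python
from dataclasses import dataclass

MAX_TRACE_UOPS = 64  # capacity limit stated on the slide

# CTI kinds that end a trace according to the slide (string labels are hypothetical)
TERMINATING_CTIS = {"indirect_jump", "sw_exception", "backward_taken_branch"}


@dataclass
class BasicBlock:
    uops: int  # number of uops in the block
    cti: str   # kind of control-transfer instruction that ends the block


def build_trace(blocks):
    """Greedily collect whole basic blocks until a criterion ends the trace."""
    trace, total = [], 0
    for bb in blocks:
        if total + bb.uops > MAX_TRACE_UOPS:
            break  # only complete basic blocks are kept, so stop before overflow
        trace.append(bb)
        total += bb.uops
        if bb.cti in TERMINATING_CTIS:
            break  # a terminating CTI closes the trace
    return trace
```

Because the loop depends only on the instruction stream, the same dynamic path always yields the same trace, which is what "deterministic criteria" suggests.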

Page 6

Microarchitecture

Split-execution vs. unified-execution
– Foreground phase: fetch-to-execution pipeline
– Background phase (post-processing): trace selection and optimization

Page 7

Microarchitecture (cont’d)

• Two predictors: GHR = Global History Register

• Branch predictor

• Trace predictor

• Deterministic trace build scheme

• Filtering mechanisms:

• The hot filter selects frequent traces from those executed on the cold pipeline

• The blazing filter selects for optimization the hottest traces

• Dynamic optimizations

• generic and core-specific optimizations

• gradually applied (?)
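A minimal sketch of the two-stage filtering idea, assuming simple execution counters. The thresholds, the class name `TraceFilters`, and the choice to count blazing candidates by their executions from the trace cache are all assumptions made for illustration; the slides do not specify how the filters are implemented.

```python
from collections import Counter

HOT_THRESHOLD = 8       # hypothetical: cold executions before a trace is cached
BLAZING_THRESHOLD = 64  # hypothetical: hot executions before a trace is optimized


class TraceFilters:
    def __init__(self):
        self.cold_counts = Counter()  # executions seen on the cold pipeline
        self.hot_counts = Counter()   # executions served from the trace cache
        self.trace_cache = set()      # traces promoted by the hot filter
        self.optimized = set()        # traces promoted by the blazing filter

    def execute(self, trace_id):
        """Record one execution and return the pipeline that served it."""
        if trace_id in self.trace_cache:
            self.hot_counts[trace_id] += 1
            if self.hot_counts[trace_id] >= BLAZING_THRESHOLD:
                self.optimized.add(trace_id)  # hottest traces go to the optimizer
            return "hot"
        self.cold_counts[trace_id] += 1
        if self.cold_counts[trace_id] >= HOT_THRESHOLD:
            self.trace_cache.add(trace_id)  # frequent trace enters the trace cache
        return "cold"
```

The point of the two levels is that the cost of building a trace is paid only for frequent code, and the larger cost of optimizing it only for the most frequent subset.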

Page 8

Simulation framework

An “in-house” proprietary performance and power simulator

Optimizations applied as different passes
– Optimization delay for one trace ~ 100 cycles

Energy simulation
– Power consumption matrix for each operation on each hardware unit
– Leakage
  – Assumed uniform in space (over the processor core and L2 cache) and in time, modeling a high temperature
  – LE = PMAX * (0.05 * M + 0.4 * K) * CYC
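The energy model can be sketched as two small functions. This is an illustration rather than the simulator's actual code: the slides do not define M and K, so they are passed through as opaque factors, and treating each matrix entry as an energy cost per event is an assumption made here.

```python
def dynamic_energy(op_counts, energy_matrix):
    """Sum count * per-event cost over every (operation, hardware unit) pair.

    op_counts maps (operation, unit) to an event count; energy_matrix[op][unit]
    is the assumed cost of one such event.
    """
    return sum(count * energy_matrix[op][unit]
               for (op, unit), count in op_counts.items())


def leakage_energy(p_max, m, k, cycles):
    """LE = PMAX * (0.05 * M + 0.4 * K) * CYC, exactly as written on the slide."""
    return p_max * (0.05 * m + 0.4 * k) * cycles


def total_energy(op_counts, energy_matrix, p_max, m, k, cycles):
    """Total energy as the sum of the dynamic and leakage components."""
    return dynamic_energy(op_counts, energy_matrix) + leakage_energy(p_max, m, k, cycles)
```

Since the leakage term grows linearly with CYC, any change that shortens execution also reduces leakage energy in this model.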

Page 9

Configuration Space

Page 10

Experimental Evaluation

Metrics
– IPC
– Total energy
– Cubic-MIPS-per-WATT (CMPW)
  – A measure of the design tradeoffs between power and performance

Benchmarks
– SpecInt2000
– SpecFP2000
– Office
– Multimedia
– DotNet
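As a worked illustration of the CMPW metric, the sketch below computes MIPS cubed divided by average power. The function name and arguments are hypothetical, and the inputs in the usage note are made-up numbers, not results from the paper.

```python
def cmpw(instructions, seconds, watts):
    """Cubic-MIPS-per-Watt: (millions of instructions per second) ** 3 / watts."""
    mips = instructions / seconds / 1e6
    return mips ** 3 / watts


def cmpw_ratio(perf_gain, power_gain):
    """Relative CMPW change for given performance and power scaling factors."""
    return perf_gain ** 3 / power_gain
```

Because performance is cubed, `cmpw_ratio(1.10, 1.30)` is slightly above 1: a 10% speedup bought with 30% more power still counts as a small improvement under this metric, whereas it would be a loss under plain MIPS-per-Watt.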

Page 11

Performance and Power Awareness

Page 12

Extreme Microarchitectural Alternatives

Page 13

Hot Code Predictability

Page 14

Trace-cache Fetch Coverage

Page 15

Optimizer Capabilities

Page 16

Energy Breakdown

Page 17

Their Conclusions…

Page 18

Our Conclusions

What makes PARROT a power-aware architecture?

What is new about this paper? / What are the contributions of this paper?
– rePlay (?)

