
cs.virginia/~krw7c/avf.html

Date post: 04-Feb-2016

http://www.cs.virginia.edu/~krw7c/avf.html

[Flowchart: outcomes of a particle strike]

Particle strike causes bit flip!

Bit read?
  no: benign fault, no error
  yes: Bit has error protection?
    no: Does bit matter?
      no: benign fault, no error
      yes: Silent Data Corruption
    Detection only: Does bit matter?
      yes: True Detected Unrecoverable Error
      no: False Detected Unrecoverable Error
    Detection & Correction: benign fault, no error
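The decision flow in the flowchart above can be written as a small function. This is a minimal sketch, not code from the poster: the names `Protection`, `Outcome`, and `classify_bit_flip` are hypothetical and chosen only for illustration.

```python
from enum import Enum


class Protection(Enum):
    """Error protection on the struck bit (labels follow the flowchart)."""
    NONE = "none"
    DETECTION_ONLY = "detection only"
    DETECTION_AND_CORRECTION = "detection & correction"


class Outcome(Enum):
    """Possible results of a particle-induced bit flip."""
    BENIGN = "benign fault, no error"
    SDC = "silent data corruption"
    TRUE_DUE = "true detected unrecoverable error"
    FALSE_DUE = "false detected unrecoverable error"


def classify_bit_flip(bit_read: bool, protection: Protection, bit_matters: bool) -> Outcome:
    """Walk the flowchart for one flipped bit and return its outcome."""
    if not bit_read:
        # A flipped bit that is never read cannot affect the program.
        return Outcome.BENIGN
    if protection is Protection.DETECTION_AND_CORRECTION:
        # The flip is detected and repaired before it is used.
        return Outcome.BENIGN
    if protection is Protection.DETECTION_ONLY:
        # The flip is always reported; it is a "false" error if the bit
        # would not have changed the program's result.
        return Outcome.TRUE_DUE if bit_matters else Outcome.FALSE_DUE
    # No protection: the flip goes unnoticed and corrupts data only if it matters.
    return Outcome.SDC if bit_matters else Outcome.BENIGN
```

For example, `classify_bit_flip(True, Protection.NONE, True)` returns `Outcome.SDC`, the case that redundancy is meant to prevent.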

[Plot annotation: Outliers]

We identify strong correlations between structural AVF values and a small set of processor metrics.

Using linear and quadratic regression, we derive an AVF characterization that uses only a few variables. These characterizations can be used to predict AVF accurately!
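To make the regression step concrete, the sketch below fits a linear and a quadratic model of AVF against a single metric using ordinary least squares. Everything specific here is an assumption for illustration: the poster does not name its metrics, so `occupancy` is a hypothetical stand-in, and the AVF values are synthetic numbers generated from a known quadratic, not measured results.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations.

    Returns coefficients [c0, c1, ...], lowest degree first.
    """
    k = degree + 1
    ata = [[sum(x ** (i + j) for x in xs) for j in range(k)] for i in range(k)]
    aty = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(k)]
    return solve_linear_system(ata, aty)


def predict(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


def sum_squared_error(coeffs, xs, ys):
    """Total squared prediction error of a fit on the given samples."""
    return sum((predict(coeffs, x) - y) ** 2 for x, y in zip(xs, ys))


# Synthetic data only (assumption): a hypothetical metric and made-up AVF values.
occupancy = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
avf_samples = [0.05 + 0.3 * u + 0.4 * u * u for u in occupancy]

linear_fit = fit_polynomial(occupancy, avf_samples, degree=1)
quadratic_fit = fit_polynomial(occupancy, avf_samples, degree=2)
```

Comparing `sum_squared_error` for the two fits shows when the quadratic term is worth keeping. The normal equations are adequate for a few low-degree terms; a library least-squares routine would be the more robust choice for larger models.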

FIT = Failure in Time; 1 FIT = 1 failure in a billion hours.

[Figure (Intel Corporation): Data Corruption FIT, log scale from 1 to 100000, versus year, 2003 to 2012. Series: 1000 MTBF Goal; FIT if all flips manifest as errors; FIT if 10% of flips manifest as errors.]
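The FIT definition and the "10% of flips manifest as errors" series translate into simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, a year is assumed to be 365 days, and the 1000-year target in the usage note is an assumed example value, not a figure taken from the chart.

```python
HOURS_PER_BILLION = 1e9      # 1 FIT = 1 failure per 10^9 device-hours
HOURS_PER_YEAR = 24 * 365    # assumption: 365-day year


def fit_to_mtbf_hours(fit: float) -> float:
    """Mean time between failures, in hours, for a given FIT rate."""
    return HOURS_PER_BILLION / fit


def mtbf_years_to_fit(years: float) -> float:
    """FIT rate that corresponds to a target MTBF given in years."""
    return HOURS_PER_BILLION / (years * HOURS_PER_YEAR)


def derated_fit(raw_fit: float, fraction_manifesting: float) -> float:
    """Effective FIT when only a fraction of bit flips become visible errors."""
    return raw_fit * fraction_manifesting
```

Under these assumptions, `mtbf_years_to_fit(1000)` is roughly 114 FIT, and `derated_fit(raw, 0.10)` gives the error rate if only one flip in ten manifests, which is the gap between the two FIT curves.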

• Transient faults due to particle strikes are a key challenge in microprocessor design.
• As transistor counts increase exponentially, per-chip faults are a growing burden.
• Spatial and temporal redundancy techniques are used to protect against faults.
• Redundancy techniques assume that any fault will result in a visible program error (i.e., that the Architectural Vulnerability Factor (AVF) is 100 percent).
• Over-design can hurt performance and drain power.

$$\mathrm{AVF}_{\mathrm{bit}} = \text{Probability bit matters} = \frac{\text{\# of visible errors}}{\text{\# of bit flips from particle strikes}}$$
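The ratio above can be computed directly from two counts. This is a minimal sketch with a hypothetical function name, `avf_bit`; how the counts are gathered (for example, by fault injection or simulation) is outside what the poster states.

```python
def avf_bit(visible_errors: int, bit_flips: int) -> float:
    """AVF of a bit: visible errors divided by bit flips from particle strikes."""
    if bit_flips <= 0:
        raise ValueError("bit_flips must be positive")
    if not 0 <= visible_errors <= bit_flips:
        raise ValueError("visible_errors must lie between 0 and bit_flips")
    return visible_errors / bit_flips
```

For instance, 25 visible errors out of 100 bit flips gives an AVF of 0.25, well below the 100 percent that full redundancy assumes.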

As soft errors become more of a problem, protection will be needed even for everyday PCs. Providing total redundancy is too expensive and assumes that AVF is 100%.

Our work shows that AVF varies over time.


Kristen Walcott, Greg Humphreys, Sudhanva Gurumurthi

University of Virginia
{walcott, humper, gurumurthi}@cs.virginia.edu

Dynamic Prediction of Architectural Vulnerability

What bits matter?

Computer Science at the UNIVERSITY of VIRGINIA

Rising Problem

Dynamic AVF Prediction

Calculating Vulnerability

Prediction Results

Challenge

Future Work

With an accurate predictor, redundancy may be turned on only when vulnerability is high. Preliminary results show that partial redundancy provides a significant performance boost over full redundancy. Next we will perform a more rigorous exploration of the design space of partial redundant multithreading implementations and investigate redundancy toggling policies.
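One simple way to picture turning redundancy on only when vulnerability is high is a fixed-threshold policy over per-interval AVF predictions. The sketch below is an assumption for illustration, not the authors' policy: the poster leaves toggling policies as future work, and the names `redundancy_schedule` and `fraction_protected` are hypothetical.

```python
def redundancy_schedule(predicted_avf, threshold):
    """Return one flag per interval: True means run that interval redundantly.

    Redundancy is enabled whenever the predicted AVF meets or exceeds the threshold.
    """
    return [p >= threshold for p in predicted_avf]


def fraction_protected(schedule):
    """Fraction of intervals that run with redundancy enabled."""
    if not schedule:
        raise ValueError("schedule must not be empty")
    return sum(schedule) / len(schedule)
```

With predictions `[0.1, 0.5, 0.3]` and a threshold of `0.3`, the schedule is `[False, True, True]`, so two thirds of the intervals pay the cost of redundancy. A real policy would also need to account for the overhead of switching redundancy on and off.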
