FAULT TOLERANT SYSTEMS · euler.ecs.umass.edu/ece655/pdf/Part20-ch8-vlsi2.pdf · 2016-10-17 · Memory...

Date post: 11-Mar-2020

Part.20.1 Copyright 2007 Koren & Krishna, Morgan-Kaufman

FAULT TOLERANT SYSTEMS

http://www.ecs.umass.edu/ece/koren/FaultTolerantSystems

Part 20 – VLSI 2

Chapter 8 – Defect Tolerance in VLSI Circuits

Part.20.2 Copyright 2007 Koren & Krishna, Morgan-Kaufman

Opportunities for Yield Enhancement

♦ The yield of a chip can be enhanced through
  • Architecture choice (redundancy – including spare components in the design)
    » the chip can still be operational in the presence of some faults
  • Decreasing the critical area, and consequently λ, at the design stage during
    » compaction
    » routing
    » placement
    » floorplanning
  • Decreasing the defect density
♦ We concentrate on the first two options


Part.20.3 Copyright 2007 Koren & Krishna, Morgan-Kaufman

Yield Enhancement through Redundancy

♦ In many ICs, identical blocks of circuits (also called modules) are replicated
  • Example: Memory chips
♦ If the whole chip is expected to be fault-free, the yield will be very low
♦ Adding spare modules increases the yield
  • Example: Large memory chips - either spare rows, or spare columns, or both are added
♦ Adding spare modules also increases the chip area
♦ This results in fewer chips per wafer
♦ Even with a higher yield, we may end up with fewer operational chips per wafer

Part.20.4 Copyright 2007 Koren & Krishna, Morgan-Kaufman

Effective Yield

♦ Area of chip increases as spares are added ⇒ fewer chips on wafer
♦ Yield by itself may not be the right measure for circuits with redundancy
♦ Effective Yield takes into account the increase in chip area - measures the real benefits of redundancy
♦ Number of spares is selected so that the effective yield is maximized


Part.20.5 Copyright 2007 Koren & Krishna, Morgan-Kaufman

Effective Yield - Example

♦ Yield vs. Effective Yield for a circuit with N=10 modules and R spares
♦ Negative Binomial distribution used
♦ λ=0.1 ; α=1 ; R=0,1,2,...,10
♦ Optimal number of spares: R=2
♦ How were these yields calculated?

Part.20.6 Copyright 2007 Koren & Krishna, Morgan-Kaufman

Basic Model of Redundancy - Replicated Circuits

♦ M modules needed for proper operation of chip
♦ R spare modules are added
♦ All N=M+R modules are identical
♦ At least M out of N modules must be fault-free
♦ Average number of faults per module is denoted by λm
♦ If Poisson distribution is used: λm = λ/N

$$Y_{chip} = \sum_{i=M}^{N} \binom{N}{i} \left(e^{-\lambda/N}\right)^{i} \left(1 - e^{-\lambda/N}\right)^{N-i}$$
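The Poisson yield expression can be evaluated directly. The following is a minimal Python sketch, assuming λ is the average number of faults on the whole chip, so that each of the N modules sees λ/N. The function name `chip_yield_poisson` is introduced here for illustration and does not come from the slides.

```python
from math import comb, exp

def chip_yield_poisson(n, m, lam):
    """Probability that at least m of n identical modules are fault-free.

    n   -- total number of modules (N = M + R)
    m   -- modules needed for operation (M)
    lam -- average number of faults per chip (assumed spread evenly,
           so each module has lam / n faults on average)
    """
    p_good = exp(-lam / n)  # Poisson probability that one module has no fault
    return sum(
        comb(n, i) * p_good**i * (1.0 - p_good) ** (n - i)
        for i in range(m, n + 1)
    )

# Example: 8 required modules plus 2 spares versus no spares at all.
print(chip_yield_poisson(10, 8, 1.0), chip_yield_poisson(10, 10, 1.0))
```

With no spares (M = N) the sum collapses to the single term e^{-λ}, which is a quick sanity check on the formula.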


Part.20.7 Copyright 2007 Koren & Krishna, Morgan-Kaufman

Redundancy – Negative Binomial Model

♦ An equivalent expression when using the Poisson distribution:

$$Y_{chip} = \sum_{i=M}^{N} \sum_{k=0}^{N-i} (-1)^{k} \binom{N}{i} \binom{N-i}{k} e^{-(i+k)\lambda/N}$$

♦ Compounding this expression with the Gamma distribution as compounder -

$$Y_{chip} = \sum_{i=M}^{N} \sum_{k=0}^{N-i} (-1)^{k} \binom{N}{i} \binom{N-i}{k} \left(1 + \frac{(i+k)\lambda}{N\alpha}\right)^{-\alpha}$$

♦ Example: The yield of a chip with N=10 modules; λ selected so that the yield with no redundancy is 0.1
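The negative binomial expression, together with an effective-yield sweep over the number of spares, can be sketched as follows. This is an illustration under stated assumptions, not the authors' calculation: effective yield is taken as yield times the area ratio M/(M+R), every spare is assumed to occupy the same area as a regular module, and the per-module fault average is assumed to stay fixed as spares are added (so the chip-level λ grows with N). The names `chip_yield_nb` and `effective_yield` are hypothetical.

```python
from math import comb

def chip_yield_nb(n, m, lam, alpha):
    """Yield of an m-out-of-n chip under the negative binomial model.

    lam   -- average number of faults per chip (all n modules)
    alpha -- clustering parameter of the Gamma compounder
    """
    total = 0.0
    for i in range(m, n + 1):
        for k in range(n - i + 1):
            total += (
                (-1) ** k
                * comb(n, i)
                * comb(n - i, k)
                * (1.0 + (i + k) * lam / (n * alpha)) ** (-alpha)
            )
    return total

def effective_yield(m, r, lam_module, alpha):
    """Yield scaled by the area penalty of r spares (assumed model)."""
    n = m + r
    lam_chip = n * lam_module      # assumption: faults scale with module count
    return chip_yield_nb(n, m, lam_chip, alpha) * m / n

# Sweep R = 0..10 for M = 10, reading the slide's 0.1 as a per-module value
# (the slide does not say whether 0.1 is per module or per chip).
best_r = max(range(11), key=lambda r: effective_yield(10, r, 0.1, 1.0))
print("spares maximizing effective yield under these assumptions:", best_r)
```

Whether the sweep reproduces the optimum quoted on the slide depends on how λ and the area overhead were defined for the original plot, which the extracted text does not specify.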

Part.20.8 Copyright 2007 Koren & Krishna, Morgan-Kaufman

More Complex Designs

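♦ Two (or more) different types of modules
♦ Support circuitry with no redundancy
♦ Average number of faults per module: λmi
♦ For support circuitry (ck – chip kill): λck

$$Y_{chip} = \sum_{i_1=M_1}^{N_1} \sum_{k_1=0}^{N_1-i_1} \sum_{i_2=M_2}^{N_2} \sum_{k_2=0}^{N_2-i_2} (-1)^{k_1} (-1)^{k_2} \binom{N_1}{i_1} \binom{N_1-i_1}{k_1} \binom{N_2}{i_2} \binom{N_2-i_2}{k_2} \left(1 + \frac{(i_1+k_1)\lambda_{m_1} + (i_2+k_2)\lambda_{m_2} + \lambda_{ck}}{\alpha}\right)^{-\alpha}$$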
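The four-fold sum for two module types plus non-redundant support circuitry translates into nested loops. The sketch below is one possible direct evaluation; the function name `chip_yield_two_types` and the argument order are assumptions made for this example.

```python
from math import comb

def chip_yield_two_types(n1, m1, lam_m1, n2, m2, lam_m2, lam_ck, alpha):
    """Yield with two module types and chip-kill support circuitry.

    n1, m1 -- total / required modules of type 1 (likewise n2, m2)
    lam_m1, lam_m2 -- average faults per module of each type
    lam_ck -- average faults in the support (chip-kill) circuitry
    alpha  -- clustering parameter shared by the whole chip
    """
    total = 0.0
    for i1 in range(m1, n1 + 1):
        for k1 in range(n1 - i1 + 1):
            for i2 in range(m2, n2 + 1):
                for k2 in range(n2 - i2 + 1):
                    sign = (-1) ** (k1 + k2)
                    coeff = (
                        comb(n1, i1) * comb(n1 - i1, k1)
                        * comb(n2, i2) * comb(n2 - i2, k2)
                    )
                    faults = (i1 + k1) * lam_m1 + (i2 + k2) * lam_m2 + lam_ck
                    total += sign * coeff * (1.0 + faults / alpha) ** (-alpha)
    return total

# One spare of type 1 added to a chip that needs 2 + 3 modules.
print(chip_yield_two_types(3, 2, 0.1, 3, 3, 0.2, 0.2, 1.0))
```

Without any spares the sum reduces to a single term, (1 + (N1·λm1 + N2·λm2 + λck)/α)^(-α), which makes a convenient check of an implementation.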


Part.20.9 Copyright 2007 Koren & Krishna, Morgan-Kaufman

Memory Arrays with Redundancy

♦ Memory arrays – highly regular
♦ Simplifies incorporating redundancy into their design
♦ Defect-tolerance techniques successfully applied to memories since late 1970's
♦ Simplest technique - spare rows and columns (word lines and bit lines)
♦ Another technique – using error-correcting codes
♦ Yield increases 30-fold in early prototypes
♦ 1.5-to-3-fold increases in yield of mature processes

Part.20.10 Copyright 2007 Koren & Krishna, Morgan-Kaufman

Defect Tolerant Memories

♦ One of the earliest designs: IBM’s 16K bit chip
♦ Six redundant bit lines, four redundant word lines
♦ Added area of 7%
♦ Word-line and bit-line failures + individual cell failures
♦ Decoders are “programmed” using fusible links or laser
♦ A row containing one or more defective memory cells is disconnected by blowing a fusible link
♦ Disconnected row replaced by spare row with a programmable decoder (fusible links) - can replace any defective row


Part.20.11 Copyright 2007 Koren & Krishna, Morgan-Kaufman

Which Rows/Columns to Replace?

♦ 1. Identify faulty cells through testing, e.g., Built-In Self Testing (BIST)
♦ 2. Determine which rows/columns to replace
  • More complex – a single faulty cell can be replaced by either a spare row or a spare column
  • Example: 6×6 array, 2+2 spares
♦ Use Row First assignment:
♦ Use all available spare rows first
♦ Replace rows R0 and R1
♦ 4 defective cells left
♦ Array cannot be repaired
♦ Can another algorithm do better?
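The Row First assignment can be sketched in a few lines of Python. The slide's figure did not survive extraction, so the defect map below is hypothetical: it was chosen only to be consistent with the statements on these slides (6×6 array, 2 spare rows, 2 spare columns, rows R0 and R1 replaced first, 4 defective cells left). The name `row_first` is also an assumption.

```python
# Hypothetical defect map as (row, column) pairs; NOT taken from the slide figure.
DEFECTS = {(0, 2), (0, 4), (1, 2), (2, 2), (3, 0), (5, 1), (5, 3)}

def row_first(defects, spare_rows, spare_cols):
    """Spend every spare row first (lowest row index first), then columns.

    Returns (rows, cols, after_rows, leftover), where after_rows holds the
    defects still present once the spare rows are used up and leftover holds
    the defects that no spare could cover.
    """
    faulty_rows = sorted({r for r, _ in defects})
    rows = faulty_rows[:spare_rows]
    after_rows = {(r, c) for r, c in defects if r not in rows}
    cols = sorted({c for _, c in after_rows})[:spare_cols]
    leftover = {(r, c) for r, c in after_rows if c not in cols}
    return rows, cols, after_rows, leftover

rows, cols, after_rows, leftover = row_first(DEFECTS, 2, 2)
print("rows replaced:", rows)
print("defects left after using the spare rows:", len(after_rows))
print("repairable with Row First:", not leftover)
```

On this map the four defects left after the row pass sit in four different columns, so two spare columns cannot cover them.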

Part.20.12 Copyright 2007 Koren & Krishna, Morgan-Kaufman

Bipartite Graph

♦ Two sets of vertices – corresponding to rows and columns
♦ An edge connects Ri to Cj if the cell at their intersection is defective
♦ Select the minimum number of vertices to cover all edges
  • For each edge at least one incident vertex must be selected
♦ Example:
  • Select C2 and R5
  • Select one out of C0 and R3
  • Select one out of C4 and R0
♦ Covering all edges with no more than the available spare rows and spare columns (a constrained bipartite edge-covering problem) is NP-complete
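For an array as small as the 6×6 example, the covering problem can simply be searched exhaustively. The sketch below is an illustration rather than a practical repair algorithm: it tries every choice of at most `spare_rows` row vertices and checks whether the uncovered edges fit within `spare_cols` column vertices. The defect map is hypothetical (the slide figure is not available) and `find_repair` is an invented name.

```python
from itertools import combinations

# Hypothetical defect map; each (row, column) pair is an edge Ri - Cj.
DEFECTS = {(0, 2), (0, 4), (1, 2), (2, 2), (3, 0), (5, 1), (5, 3)}

def find_repair(defects, spare_rows, spare_cols):
    """Exhaustive search for a cover within the spare budget.

    Returns (rows, cols) as sets, or None when no assignment exists.
    Running time grows exponentially with the number of spare rows.
    """
    faulty_rows = sorted({r for r, _ in defects})
    for k in range(spare_rows + 1):
        for rows in combinations(faulty_rows, k):
            # Every defect not covered by a chosen row needs its column.
            cols = {c for r, c in defects if r not in rows}
            if len(cols) <= spare_cols:
                return set(rows), cols
    return None

print(find_repair(DEFECTS, 2, 2))
```

The exponential search is acceptable only because the example is tiny; it is the cost of such searches on real arrays that motivates heuristics.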


Part.20.13 Copyright 2007 Koren & Krishna, Morgan-Kaufman

Heuristics

♦ Should we restrict to spare rows (columns) only?
  • Two defects in same column (row)
  • A complete column (row) can be defective
♦ Two-step algorithm:
  • 1. Replace must-repair rows (and columns)
    » Must-repair row: # of defects > # of available spare columns
    » After this, other rows (columns) may become must-repair
  • 2. Simple heuristic like Row-First to deal with the remaining few defects
♦ Example:
  • C2 is a must-repair column
  • Then – R5 becomes a must-repair row
  • Finally, replace R0 and C0
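The two-step algorithm can be sketched as below. The defect map is again hypothetical, picked so that the run follows the sequence described on the slide (C2, then R5, then R0 and C0); the function name and the tie-breaking order (lowest index first, rows before columns) are assumptions of this sketch.

```python
from collections import Counter

# Hypothetical defect map as (row, column) pairs; NOT the slide's figure.
DEFECTS = {(0, 2), (0, 4), (1, 2), (2, 2), (3, 0), (5, 1), (5, 3)}

def must_repair_then_row_first(defects, spare_rows, spare_cols):
    """Return (rows, cols) in replacement order, or None if unrepairable."""
    remaining = set(defects)
    rows, cols = [], []
    # Step 1: keep replacing must-repair lines until none is left.
    while True:
        rows_left = spare_rows - len(rows)
        cols_left = spare_cols - len(cols)
        row_count = Counter(r for r, _ in remaining)
        col_count = Counter(c for _, c in remaining)
        forced_row = next(
            (r for r in sorted(row_count) if row_count[r] > cols_left), None)
        forced_col = next(
            (c for c in sorted(col_count) if col_count[c] > rows_left), None)
        if forced_row is not None:
            if rows_left == 0:
                return None  # needs a spare row, none available
            rows.append(forced_row)
            remaining = {(r, c) for r, c in remaining if r != forced_row}
        elif forced_col is not None:
            if cols_left == 0:
                return None  # needs a spare column, none available
            cols.append(forced_col)
            remaining = {(r, c) for r, c in remaining if c != forced_col}
        else:
            break
    # Step 2: Row-First on whatever is left.
    for row in sorted({r for r, _ in remaining}):
        if len(rows) == spare_rows:
            break
        rows.append(row)
        remaining = {(r, c) for r, c in remaining if r != row}
    for col in sorted({c for _, c in remaining}):
        if len(cols) == spare_cols:
            return None
        cols.append(col)
        remaining = {(r, c) for r, c in remaining if c != col}
    return rows, cols

print(must_repair_then_row_first(DEFECTS, 2, 2))
```

Step 1 is exact (a must-repair line has no alternative), while step 2 is still a heuristic and can fail on maps that an exhaustive search would repair.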

Part.20.14 Copyright 2007 Koren & Krishna, Morgan-Kaufman

IBM's 16Mb DRAM with ECC

♦ Spare rows/columns and error-correcting code (ECC)
♦ 4 independent quadrants with 16 redundant bit lines and 24 redundant word lines per quadrant
♦ For every 137 bits, 9 check bits allow correction of any single bit error
♦ Every 8 adjacent bits assigned to 8 separate words – lower probability of 2 or more faults in same word
♦ Write includes: (1) Fetch, (2) Write back
♦ Benefit of combined strategy larger than sum of expected benefits:
  • ECC effective against individual cell failures
  • redundancy effective against failures in same row/column
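The effect of assigning 8 adjacent bits to 8 separate words can be illustrated with a toy mapping. The modulo mapping below is an assumption made for this sketch; the slides do not describe the chip's actual bit-to-word assignment, and the names `ecc_word_of` and `faults_per_word` are hypothetical.

```python
from collections import Counter

INTERLEAVE = 8  # adjacent bits are spread over this many ECC words

def ecc_word_of(bit_position):
    """Toy mapping (assumed): physical bit position -> ECC word index."""
    return bit_position % INTERLEAVE

def faults_per_word(faulty_positions):
    """Count how many faulty bits land in each ECC word."""
    return Counter(ecc_word_of(p) for p in faulty_positions)

# A cluster of 8 physically adjacent faulty bits puts one fault in each
# word, which a single-error-correcting code can handle.
burst = range(40, 48)
print(max(faults_per_word(burst).values()))

# Two faults exactly 8 positions apart share a word and defeat the code.
print(faults_per_word([40, 48]))
```

Since defects tend to cluster, neighbouring cells are the most likely to fail together, and this kind of interleaving keeps such a cluster from exceeding the one-error-per-word limit.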


Part.20.15 Copyright 2007 Koren & Krishna, Morgan-Kaufman

Reliability vs Yield

♦ ECC commonly used in memory to protect against transient faults during operation - increases reliability

♦ Reliability improvement due to ECC is only slightly affected by its use to correct defective cells

♦ Still, redundant rows and columns are most commonly used

♦ Incorporated also in large cache units

♦ Benefits of redundant rows/columns are especially significant in early stages of production when yield is low
  • earlier introduction of new products into the market

Part.20.16 Copyright 2007 Koren & Krishna, Morgan-Kaufman

New Defect-Tolerant Memories

♦ Memory ICs have become very large
♦ Conventional redundancy of rows and columns - not sufficient
♦ Partitioning into sub-arrays is a must
  • Decrease the current
  • Shorten bit and word lines to reduce access time
♦ Disadvantages of conventional techniques
  • Inefficient use of redundant lines
  • Unable to deal with chip-kill defects
♦ New defect tolerance techniques are necessary


Part.20.17 Copyright 2007 Koren & Krishna, Morgan-Kaufman

Memory with Redundant Blocks

♦ 1 Gb DRAM constructed out of 4 256 Mb subarrays
  • Each subarray can become part of up to 4 different ICs
  • 16 subarrays (marked) would ordinarily not be fabricated
  • Resulting in a 2% increase in area – only column redundancy
  • No improvement in yield if Poisson model followed, considerable improvement if negative binomial model used

Part.20.18 Copyright 2007 Koren & Krishna, Morgan-Kaufman

Memory with Redundant Blocks

♦ 1 Gb DRAM is partitioned into 8 128 Mb mats
  • 512 basic arrays of size 256 Kbit (32 x 16 matrix)
  • 32 spare rows and 32 spare columns
  • 4 spare rows are allocated to a 16 Mbit portion of the mat
  • 8 spare columns are allocated to a 32 Mbit portion of the mat
♦ 8 redundant blocks of size 1 Mbit each
  • 4 basic 256 Kbit arrays
  • 8 spare rows + 4 spare columns
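The figures quoted for this organization are mutually consistent, which a short arithmetic check makes explicit. The sketch assumes binary prefixes (1 Kbit = 1024 bits) and reads the 32 spare rows and 32 spare columns as per-mat totals; both readings are assumptions, since the slide does not state them.

```python
KBIT = 1024
MBIT = 1024 * KBIT
GBIT = 1024 * MBIT

array_bits = 256 * KBIT            # one basic array
arrays_per_mat = 32 * 16           # 32 x 16 matrix of basic arrays
mat_bits = arrays_per_mat * array_bits
chip_bits = 8 * mat_bits           # 8 mats per chip

# Spares per mat, assuming one group of spares per 16 Mbit / 32 Mbit portion.
spare_rows_per_mat = (mat_bits // (16 * MBIT)) * 4
spare_cols_per_mat = (mat_bits // (32 * MBIT)) * 8

redundant_block_bits = 4 * array_bits   # 4 basic arrays per redundant block

print(mat_bits // MBIT, "Mbit per mat")
print(chip_bits // GBIT, "Gbit per chip")
print(spare_rows_per_mat, "spare rows and", spare_cols_per_mat, "spare columns per mat")
print(redundant_block_bits // MBIT, "Mbit per redundant block")
```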


Part.20.19 Copyright 2007 Koren & Krishna, Morgan-Kaufman

128Mb mat (32 x 16 256Kbit arrays)

Part.20.20 Copyright 2007 Koren & Krishna, Morgan-Kaufman

Block Diagram

8 mats (128Mbit each)+8 redundant blocks (1Mbit each)

♦ A redundant block includes 4 256Kbit arrays, 8 redundant rows and 4 redundant columns


Part.20.21 Copyright 2007 Koren & Krishna, Morgan-Kaufman

Yield Comparison (for half chip)

Axis label: λ (1/cm²)

Part.20.22 Copyright 2007 Koren & Krishna, Morgan-Kaufman

Defect-Tolerant Microprocessor

♦ ESPRIT Project
♦ 16-bit processor core
♦ Data Path - 1 spare bit-slice
♦ Control Path - PLAs with spare product terms
♦ Area overhead < 25%


Part.20.23 Copyright 2007 Koren & Krishna, Morgan-Kaufman

Data path and control redundancy

♦ Effective Yield

Part.20.24 Copyright 2007 Koren & Krishna, Morgan-Kaufman

The 3D Computer (Hughes Labs)

♦ A massively parallel array (SIMD) for image processing
♦ 32 x 32 array (5-wafer stack)
♦ 128 x 128 array (15-wafer stack) - 100 BOPS
♦ Redundancy is a must


Part.20.25 Copyright 2007 Koren & Krishna, Morgan-Kaufman

The 3D Computer - Defect Tolerance

♦ “Conventional” Wafer Stack Wafer-scale Design

Part.20.26 Copyright 2007 Koren & Krishna, Morgan-Kaufman

The 3D Computer - Interstitial Redundancy

♦ (2,4) redundancy
♦ Local redundancy uniformly distributed
♦ Simple switches and short interconnections


Part.20.27 Copyright 2007 Koren & Krishna, Morgan-Kaufman

Effect of Floorplanning on Yield

♦ VLSI designers rarely consider yield issues when selecting a floorplan for a newly designed chip
♦ This is justified for chips which are small relative to defect clusters
♦ Floorplanning can affect yield under the following conditions:
  • Area of chip is very large
  • Defects are clustered
  • Defect clusters are medium-sized compared to chip
  • Chip has modules with different sensitivities to defects
♦ or
  • Chip has incorporated redundancy

Part.20.28 Copyright 2007 Koren & Krishna, Morgan-Kaufman

Effect of Floorplanning - Simple Chip

♦ Example:
♦ A chip consists of nine equal-area functional units
♦ Fault densities
♦ Fault clusters are medium-size (2x2 or 2x3)
♦ Two floorplans:


Part.20.29 Copyright 2007 Koren & Krishna, Morgan-Kaufman

Floorplanning For High Yield - Recommendations

♦ Floorplan (b) has a higher yield
♦ Module with highest fault density is placed in center
♦ Modules with lowest fault densities are placed in corners
♦ Distance from center - inversely related to sensitivity to defects
♦ Intuitive explanation - likelihood of one defect cluster killing two or more chips is reduced
♦ May have a negative impact on wiring length

Part.20.30 Copyright 2007 Koren & Krishna, Morgan-Kaufman

Original and Alternate Floorplans of Matsushita's Adenart µprocessor


Part.20.31 Copyright 2007 Koren & Krishna, Morgan-Kaufman

Yield and Wiring Cost of 4 Floorplans

Part.20.32 Copyright 2007 Koren & Krishna, Morgan-Kaufman

Floorplanning - Redundancy

♦ Example: A chip with 4 modules M1, M2, S1, S2
♦ S1 - a spare for M1; S2 - a spare for M2
♦ Floorplan (c) has the highest yield
♦ Guarantees separation between primary modules and their spares for any size and shape of defect clusters
♦ It is less likely that the same cluster will hit both the module and its spare, thus killing the chip


Part.20.33 Copyright 2007 Koren & Krishna, Morgan-Kaufman

Example - Floorplan of the 3-D Computer

♦ A (2,4) structure - every spare unit adjacent to the four primary units that it can replace
♦ Short interconnection links between a spare and a primary
♦ Advantage: Performance degradation upon a failure is minimal
♦ Disadvantage: Proximity of primary units and spare results in a low yield in the presence of clustered faults

Figure caption: original floorplan

Part.20.34 Copyright 2007 Koren & Krishna, Morgan-Kaufman

An Alternate Floorplan

♦ Spare is placed farther apart from the primary units it can replace
♦ 128 x 128 array
♦ Medium-area Negative Binomial
♦ Defect block size of two rows (α=2)
