Engineering Insights 2006

Self-Testable, Self-Adaptable and Error-Resilient System Design: Coping with Increasing Variability and Reliability Concerns

Tim Cheng, Electrical and Computer Engineering

Drivers of System-on-a-Chip/System-in-a-Package Design & Test Technologies

Time-to-Money
• Decreasing design window
• Less tolerance for design revisions

Complexity
• Exponentially more transistors
• Increasing complexity in system context

NanoScale Effects
• Coupling cap
• Signal integrity
• Inductance
• Leakage

Heterogeneity
• Greater diversity of on-chip elements: processors, SW, RF, memory, analog, high-speed bus

Challenges Facing the Next Decade in Integrated System Design

• Managing and exploiting design partitioning and trade-offs for heterogeneous systems: HW, SW, analog, RF, MEMS, optical, etc.
• Power and energy
• Verification and test
• Reliability and robustness
• Implementation fabrics beyond silicon

Increasing Failure Sources and Failure Rates

[Figure: on-die temperature variations, temperature scale from 40 to 110 °C]

• SEU
• Random defects
• Parametric variations
• Soft errors
• Design errors

Harder to Design Reliable System-Chips

• First-silicon success rate has been dropping
  – <30% for complex ASIC/[email protected]
  – Pre-silicon logic bugs have been increasing at 3X-4X/generation for Intel’s processors
• Yield has been dropping for volume production
  – IBM’s 8-core Cell-Processor chips: ~10-20% yield
• “Better than worst-case” design resulting in failures w/o defects
  – Increase in variation of process parameters with scaling
  – Worst-case design getting way too conservative
• One-time-factory production testing will be too costly and insufficient for failure screening

New Design and Test Paradigm: Reliable Systems With Unreliable Components

• Systems must be designed to cope with failures
• Efficient silicon debug is a must
  – Design for debugging would become necessary
• Must have embedded self-test for error detection
  – For both testing in manufacturing line and in-the-field
  – Both on-line and off-line testing
• Must be re-configurable and adaptable for error recovery
  – Using spares to replace defective parts
  – Using redundancy to mask errors
  – Using self-tuning to compensate variations

Some of Our Research Results

• Embedded Software-Based Self-Test for SoC
  – Reuse of embedded processors and on-chip resources for self-test and diagnosis
• Test, Characterization and Diagnosis for High Speed Serial IO Interfaces
• Silicon Debug for Timing Failures
• Formal Equivalence Checking between System Specification and RTL Code

Embedded-Software-Based Self-Test For Programmable Systems

Test and diagnosis are applications of a programmable SOC!!

• Reuse of on-chip programmable components for test
  – Processor/DSP/FPGA cores for on-chip test generation, measurement, response analysis and even diagnosis
• Self-test a processor using its instruction set for high fault coverage
• Use the tested processor/DSP to test buses, interfaces and other components, including analog and mixed-signal components

Embedded SW-Based Self-Testing for Programmable System Chips

[Figure: a low-cost tester attached to a system chip. Blocks shown: CPU, DSP, IP core, system memory, on-chip memory (holding test program, responses and signatures), bus arbiter, on-chip bus, and VCI bus-interface master and target wrappers. Phases annotated in the figure: loading test program at low speed; self-test at operational speed; unloading response signature at low speed.]

• Low-cost tester
• High-quality at-speed test
• Low test overhead
• Non-intrusive

Test in normal operational mode
• No violation of power consumption
• More accurate speed-binning
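
The three phases on this slide (load a test program, run it at operational speed, unload a compacted signature) can be illustrated with a toy model. This is only a sketch under stated assumptions: the 8-bit `alu` model, the `TEST_PROGRAM` contents, the stuck-at-0 fault injection and the rotate-and-XOR `compact` function (a stand-in for a hardware signature register) are all hypothetical and are not taken from the talk.

```python
from typing import List, Optional, Tuple

Instruction = Tuple[str, int, int]  # (opcode, operand_a, operand_b)

# Hypothetical test program; a real one would be derived from the
# processor's instruction set to maximize fault coverage.
TEST_PROGRAM: List[Instruction] = [
    ("add", 0x55, 0xAA),
    ("add", 0xFF, 0x01),
    ("and", 0xF0, 0x3C),
    ("xor", 0xFF, 0x0F),
    ("add", 0x7F, 0x7F),
]


def alu(op: str, a: int, b: int, stuck_at_0_bit: Optional[int] = None) -> int:
    """Toy 8-bit ALU. An injected fault forces one result bit to 0."""
    if op == "add":
        result = (a + b) & 0xFF
    elif op == "and":
        result = a & b
    elif op == "xor":
        result = a ^ b
    else:
        raise ValueError(f"unknown opcode: {op}")
    if stuck_at_0_bit is not None:
        result &= ~(1 << stuck_at_0_bit) & 0xFF
    return result


def compact(signature: int, response: int) -> int:
    """Fold one response into a 16-bit signature (rotate left, then XOR)."""
    rotated = ((signature << 1) | (signature >> 15)) & 0xFFFF
    return rotated ^ response


def run_self_test(program: List[Instruction],
                  stuck_at_0_bit: Optional[int] = None) -> int:
    """Phase 2: execute the loaded program and return the signature."""
    signature = 0
    for op, a, b in program:
        signature = compact(signature, alu(op, a, b, stuck_at_0_bit))
    return signature


# Phases 1 (load) and 3 (unload) reduce to passing data in and out here.
golden = run_self_test(TEST_PROGRAM)                      # fault-free reference
observed = run_self_test(TEST_PROGRAM, stuck_at_0_bit=4)  # faulty device
print("pass" if observed == golden else "fail")
```

Only the short program and the final signature cross the tester interface, which is why a slow, low-cost tester suffices in this flow while the test itself runs at the chip's own speed.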

DSP-Based Self-Test for Analog/Mixed-Signal Components

• DSP-based vs. analog ATE
  – Increased test throughput
  – Reduced switching & settling time
  – Device response is memorized and analyzed for different parameters
  – Software DSP doesn’t have to be real-time
  – Performance limited by ADC/DAC
• DSP-based embedded self-test: on-chip tester
  – Relieve need of expensive ATE
  – Reduce external noise
• Practical issues
  – Test quality limited by DAC/ADC
  – DAC/ADC are not always available, and must be tested first

[Figure: a DSP-based ATE (DAC, analog DUT, ADC, memory, digital signal processing, synchronization) shown next to the on-SOC equivalent (DAC, analog CUT, ADC, memory, digital signal processing, synchronization, DSP/processor core). Labels "1-bit DAC" and "1-bit DSM" also appear.]

Self-Test for Analog/MS Components

• A self-test architecture using the delta-sigma modulation principle for signal generation and for waveform acquisition

[Figure: self-test architecture partitioned between ATE and SOC. Blocks shown: test stimuli & spec., programmable core + memory, software ΔΣ modulator, on-chip DAC / low-res. DAC, analog CUT, on-chip ADC / 1-bit ΔΣ modulator, response analysis, pass/fail.]
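
To make the delta-sigma principle concrete, the sketch below encodes a slowly varying stimulus as a 1-bit stream and recovers it with a crude low-pass filter. It is illustrative only: the first-order loop, the moving-average decoder, and the names `delta_sigma_modulate` and `moving_average_decode` are assumptions, not the architecture used in the talk.

```python
import math
from typing import List, Sequence


def delta_sigma_modulate(samples: Sequence[float]) -> List[int]:
    """First-order delta-sigma modulator: inputs in [-1, 1] -> bits 0/1."""
    integrator = 0.0
    feedback = 0.0  # previous output mapped to -1.0 / +1.0
    bits: List[int] = []
    for x in samples:
        integrator += x - feedback          # accumulate the tracking error
        bit = 1 if integrator >= 0.0 else 0  # 1-bit quantizer
        feedback = 1.0 if bit else -1.0
        bits.append(bit)
    return bits


def moving_average_decode(bits: Sequence[int], window: int) -> List[float]:
    """Crude low-pass filter: average the +/-1 levels over a sliding window."""
    levels = [1.0 if b else -1.0 for b in bits]
    return [sum(levels[i:i + window]) / window
            for i in range(len(levels) - window + 1)]


# A slow sine as a hypothetical test stimulus (heavily oversampled).
n = 4000
stimulus = [0.5 * math.sin(2 * math.pi * i / n) for i in range(n)]
bitstream = delta_sigma_modulate(stimulus)
recovered = moving_average_decode(bitstream, window=64)
```

Because the stream is a single bit wide, a 1-bit DAC followed by an analog low-pass filter can play it back, which is one reason the approach suits on-chip stimulus generation.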

Bit-Error-Rate (BER) Estimation for High-Speed Serial Interface

• Most applications require 10^-12 or even lower BER
  – Measuring BER impractical for production testing
• BER can be estimated using parameters with strong correlation with BER:
  – Spectral information of jitter
    • Frequencies and amplitudes of Periodic Jitter (PJ)
    • RMS value of Random Jitter (RJ)
  – Jitter transfer characteristics of a CDR circuit
    • Magnitude response (low pass filter characteristic)
    • Phase response (timing response in clock recovery)
  – Channel characteristics
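
A rough sketch of how such parameters could feed a BER estimate is given below. It is not the estimator from the talk. It assumes a first-order low-pass jitter transfer for the CDR, sinusoidal PJ, Gaussian RJ, and an error whenever an edge wanders more than 0.5 UI. All function names and numeric values are hypothetical. The last helper only shows why direct measurement is slow: sending 10^12 bits at 2.5 Gbps takes 400 seconds.

```python
import math


def q_function(x: float) -> float:
    """Gaussian tail probability Q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def untracked_fraction(pj_freq_hz: float, cdr_bandwidth_hz: float) -> float:
    """|1 - H(f)| for an assumed first-order low-pass jitter transfer H(f)."""
    ratio = pj_freq_hz / cdr_bandwidth_hz
    return ratio / math.sqrt(1.0 + ratio * ratio)


def estimate_ber(pj_amp_ui: float, pj_freq_hz: float, rj_rms_ui: float,
                 cdr_bandwidth_hz: float, transition_density: float = 0.5,
                 phase_steps: int = 360) -> float:
    """Average the RJ tail probability over one period of the residual PJ."""
    residual = pj_amp_ui * untracked_fraction(pj_freq_hz, cdr_bandwidth_hz)
    total = 0.0
    for i in range(phase_steps):
        offset = residual * math.sin(2.0 * math.pi * i / phase_steps)
        # An edge displaced by more than 0.5 UI reaches the sampling instant.
        total += q_function((0.5 - offset) / rj_rms_ui)
        total += q_function((0.5 + offset) / rj_rms_ui)
    return transition_density * total / phase_steps


def seconds_to_send(bits: float, bit_rate_bps: float) -> float:
    """Time needed just to transmit a given number of bits."""
    return bits / bit_rate_bps


# Hypothetical numbers: 0.25 UI peak PJ, 0.04 UI rms RJ, 1 MHz CDR bandwidth.
for freq_hz in (1e4, 1e6, 1e8):
    print(freq_hz, estimate_ber(0.25, freq_hz, 0.04, 1e6))
print(seconds_to_send(1e12, 2.5e9), "s to send 1e12 bits at 2.5 Gbps")
```

In this model, low-frequency PJ is tracked by the recovered clock and contributes little, while PJ well above the CDR bandwidth passes through almost unattenuated and raises the estimated BER.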

BER Estimation - HW Validation

[Figure: hardware validation setup built around Maxim’s 2.5 Gbps CDR circuit (MAX 3873A) and Synthesys Research’s BERTScope. Signals labeled: 2.5 Gbps data with jitter, clock with jitter, recovered clock. Measurements labeled: phase response of the CDR circuit, magnitude response of the CDR circuit, and BER.]

Results - Clk-like and PRBS Data

[Figure: three plots of measured (Mea.) and estimated (Est.) BER versus PJ frequency, 0.01 to 100 MHz on a log axis. Panels: clock-like pattern w/ 0.5T PJ, clock-like pattern w/ 0.45T PJ, and PRBS pattern w/ 0.5T PJ. BER axes run from 1.0E-12 up to 1.0E-05 in one panel and up to 1.0E-07 in the other two.]

• Measured and estimated BER match very well
• The difference is larger at low (~10^-12) BER levels, due to the bounded nature of RJ in practice

Testable/Debuggable Design for Adaptive Equalizer in High Speed Serial-Link

[Figure: serial IO link. Digital side: digital core, pattern generator, pattern analyzer, serializer, deserializer. Analog side: Tx with pre-emphasis (PE), Rx with equalizer (EQ) and CDR. A second diagram of the RX shows EQ, CDR, clock, data, signal, equalized signal, access point, on-chip monitor and drive circuit.]

• A design-for-testability (DfT) solution for adaptive equalizer (EQ) in high-speed IO
• Applicable to various EQ architectures
• Addressing RX’s observability & controllability problems
• Lower test cost and higher fault coverage than conventional eye-diagram-based method

Testable Design for Adaptive Equalizer

General Architecture for DFE/FFE

[Figure: unequalized input y(t), EQ, equalized output v(t), data I(k), error ε(k), adaptation algorithm (Adp. Alg.), clk, and a v(t)/I(k) path marked "for DFE".]

Testable Decision-Feedback Equalizer (DFE) or Feed-Forward Equalizer (FFE):

[Figure: FFE with input y(k), output v(k), data I(k), error ε(k), Z^-1 delay elements, tap coefficients ci, adaptation algorithm (Adp. Alg.), pattern generator, clock, K1, and a chain of FFs with scan out.]

• Minor HW modification of EQ: a FF chain (storing tap-coefficients), a pattern generator, and some switches
  – Extra circuits are all digital
• Addressing both characterization and production testing

Experiments: Testable Equalizers

[Figure: experimental setup. Tester side: digital PE card, DSP, clock, analog input, digital input. DUT: FFE with input y(t), tap coefficients ci, error ε(k), data I(k) and I'(k), adaptation algorithm (Adp. Alg.), pattern generator.]

[Figure: three plots of tap weight versus tap number (-2 to 2): fault-free vs. with tap-1 s.a. fault; fault-free vs. with gain error; fault-free vs. with offset voltage.]

• Applying digital tests and examining tap-coefficients for fault detection and diagnosis
• Demonstrating much higher fault coverage than eye-diagram-based approach
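
The idea of examining converged tap coefficients can be sketched with a small simulation. This is a minimal illustration, not the experiment from the talk. It assumes a 5-tap FFE adapted by plain LMS against a known training pattern, a hypothetical 3-tap channel, and a fault modeled as a gain error in one tap multiplier. The names `adapt_ffe` and `suspicious_taps` are invented for the example.

```python
import random
from typing import List, Optional, Sequence


def adapt_ffe(channel: Sequence[float], n_taps: int = 5, mu: float = 0.01,
              n_symbols: int = 20000,
              tap_gain: Optional[Sequence[float]] = None,
              seed: int = 1) -> List[float]:
    """Run LMS adaptation and return the stored tap coefficients."""
    rng = random.Random(seed)
    gain = list(tap_gain) if tap_gain is not None else [1.0] * n_taps
    coeff = [0.0] * n_taps
    # Assumes the channel's main cursor is its middle tap.
    delay = (len(channel) - 1) // 2 + n_taps // 2
    symbols: List[int] = []
    received: List[float] = []
    for k in range(n_symbols):
        symbols.append(rng.choice((-1, 1)))  # known training pattern I(k)
        received.append(sum(h * symbols[k - j]
                            for j, h in enumerate(channel) if k - j >= 0))
        if k < max(delay, n_taps - 1):
            continue  # wait until the delay line is full
        window = [received[k - i] for i in range(n_taps)]
        # The datapath applies each (possibly faulty) multiplier gain.
        v = sum(c * g * x for c, g, x in zip(coeff, gain, window))
        err = symbols[k - delay] - v
        # The adaptation logic is unaware of the gain error.
        coeff = [c + mu * err * x for c, x in zip(coeff, window)]
    return coeff


def suspicious_taps(measured: Sequence[float], reference: Sequence[float],
                    tol: float = 0.1) -> List[int]:
    """Indices whose converged coefficient deviates from the reference."""
    return [i for i, (m, r) in enumerate(zip(measured, reference))
            if abs(m - r) > tol]


channel = [0.2, 1.0, 0.3]                 # hypothetical ISI channel
reference = adapt_ffe(channel)            # fault-free coefficients
faulty = adapt_ffe(channel, tap_gain=[1.0, 1.0, 0.5, 1.0, 1.0])
print(suspicious_taps(faulty, reference))
```

In this model the adaptation loop compensates for the weak multiplier by driving the stored coefficient of that tap to roughly twice its fault-free value, so a purely digital read-out of the coefficients exposes an analog fault.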

From Test to Recovery/Reconfiguration - Examples

• Memory
  – “BIST → BISD → BISR” a common practice
• Analog/RF/High-speed IO components
  – Digitally assisted self-calibration: fine-tuning performance; more robust to process, temperature and voltage variations…
• Dynamic circuits
  – Using programmable keeper and on-chip leakage sensors for tuning performance and robustness
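
For the memory case, the self-test, self-diagnosis and self-repair chain can be sketched as follows. This is a toy model under stated assumptions: a word-addressable memory with injected stuck-at-0 bits, a March C- style test using all-0 and all-1 backgrounds, and repair by remapping failing addresses to spare words. The `Memory` class and the function names are hypothetical.

```python
from typing import Dict, List, Optional


class Memory:
    """Toy memory with spare words and injectable stuck-at-0 defects."""

    def __init__(self, size: int, spares: int = 2, width: int = 8) -> None:
        self.size = size
        self.mask = (1 << width) - 1
        self.cells = [0] * size
        self.spare_cells = [0] * spares
        self.remap: Dict[int, int] = {}          # address -> spare index
        self.stuck_at_zero: Dict[int, int] = {}  # address -> defective bits

    def write(self, addr: int, value: int) -> None:
        if addr in self.remap:
            self.spare_cells[self.remap[addr]] = value & self.mask
        else:
            defect = self.stuck_at_zero.get(addr, 0)
            self.cells[addr] = value & self.mask & ~defect

    def read(self, addr: int) -> int:
        if addr in self.remap:
            return self.spare_cells[self.remap[addr]]
        return self.cells[addr]


def march_test(mem: Memory) -> List[int]:
    """BIST + BISD: run the march elements, return failing addresses."""
    failing = set()
    up = range(mem.size)
    down = range(mem.size - 1, -1, -1)

    def element(order: range, expect: Optional[int],
                write: Optional[int]) -> None:
        for addr in order:
            if expect is not None and mem.read(addr) != expect:
                failing.add(addr)
            if write is not None:
                mem.write(addr, write)

    element(up, None, 0)
    element(up, 0, mem.mask)
    element(up, mem.mask, 0)
    element(down, 0, mem.mask)
    element(down, mem.mask, 0)
    element(up, 0, None)
    return sorted(failing)


def repair(mem: Memory, failing: List[int]) -> bool:
    """BISR: map each failing address to a spare; False if spares run out."""
    for addr in failing:
        if len(mem.remap) >= len(mem.spare_cells):
            return False
        mem.remap[addr] = len(mem.remap)
    return True


mem = Memory(size=16)
mem.stuck_at_zero[5] = 0x04          # inject a defect at address 5, bit 2
failing = march_test(mem)            # detect and locate
repaired = repair(mem, failing)      # reconfigure with a spare
print(failing, repaired, march_test(mem))
```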

Dynamic Circuit Using Static Keeper

[Figure: dynamic circuit with a conventional static keeper. Labels: clk, RS0 ... RS7, D0 ... D7, LBL0, LBL1, N0.]

• Keeper upsizing degrades average performance

Pessimistic Design Hurts Performance

[Figure: histogram of number of dies (0 to 200) versus normalized IOFF (0 to 6), from 130nm CMOS measurements at 110°C, with the nominal corner and the worst-case corner marked.]

• Substantial variation in leakage across dies
• 4-5X variation between nominal and worst-case leakage
• Performance determined at nominal leakage
• Robustness determined at worst-case leakage

Programmable Keeper for Dynamic Ckts

[Figure: 3-bit programmable keeper on the same dynamic circuit (clk, RS0 ... RS7, D0 ... D7, LBL0, LBL1, N0), with keeper devices of width W, 2W and 4W controlled by b[2:0].]

• Opportunistic speedup via keeper downsizing

Ref: C. Kim and K. Roy, Purdue

On-Die Leakage Sensor

C. Kim et al., VLSI Circuits Symp. ‘04

[Figure: sensor layout, 83 μm by 73 μm, with current reference, comparators, current mirrors, VBIAS gen., NMOS device and test interface.]

• High leakage sensing gain
• Compact analog design sharing bias generators

Technology: 90nm dual Vt CMOS
VDD: 1.2V
Resolution: 7 levels
Power consumption: 0.66 mW @ 80°C
Dimensions: 83 × 73 μm²

Leakage Binning Results

[Figure: leakage binning results by output code from the leakage sensor: 001, 010, 011, 100, 101, 110, 111.]

Test & Tuning Process for Self-Tunable Design

[Figure: production flow with stages Fab, wafer test, Assembly, Burn in, Package test and Customer, annotated with process detection, leakage measurement, on-die leakage sensor, and program using fuses.]
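
The tuning step (measure leakage, then program the keeper) might look like the sketch below. The 7-level sensor codes and the W / 2W / 4W keeper legs come from the slides. The bin edges, the bit-to-leg assignment, and the policy of copying the sensor code directly into the keeper setting are assumptions made for illustration.

```python
import bisect
from typing import Sequence

# Hypothetical bin edges in units of normalized IOFF; six edges give 7 levels.
BIN_EDGES: Sequence[float] = (1.0, 2.0, 3.0, 4.0, 5.0, 6.0)


def sensor_code(normalized_ioff: float,
                edges: Sequence[float] = BIN_EDGES) -> int:
    """Quantize leakage into output codes 1..7 (binary 001..111)."""
    return 1 + bisect.bisect_right(edges, normalized_ioff)


def keeper_width(b: int, unit_width: float = 1.0) -> float:
    """Total keeper width for setting b[2:0] with legs sized W, 2W and 4W."""
    if not 0 <= b <= 7:
        raise ValueError("b must fit in 3 bits")
    legs = (1, 2, 4)  # assumed: b[0] enables W, b[1] enables 2W, b[2] enables 4W
    return unit_width * sum(w for i, w in enumerate(legs) if (b >> i) & 1)


def fuse_setting(normalized_ioff: float) -> str:
    """Assumed policy: leakier dies get a proportionally stronger keeper."""
    return format(sensor_code(normalized_ioff), "03b")


for ioff in (0.5, 2.5, 6.5):
    bits = fuse_setting(ioff)
    print(ioff, bits, keeper_width(int(bits, 2)))
```

Under this policy a low-leakage die keeps only the smallest keeper and gains speed, while a high-leakage die enables all legs to preserve robustness.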

Digital Logic - Exploring Redundancy and Reconfiguration Tradeoffs

• Redundancy
  – Suitable for soft/transient/marginality errors
  – Different forms:
    • Hardware redundancy
    • Time redundancy
    • Information redundancy
• Reconfiguration
  – Suitable for hard errors (e.g. defects)
• Design methodology and tools for area, power, performance and reliability tradeoffs?
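
Two of the redundancy forms listed above can be shown in a few lines. The sketch uses triple modular redundancy with a bitwise majority voter as an example of hardware redundancy, and a single even-parity bit as an example of information redundancy. These particular schemes are chosen for illustration and are not named in the talk.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 vote: masks an error confined to one of three copies."""
    return (a & b) | (a & c) | (b & c)


def even_parity_bit(word: int) -> int:
    """Parity bit that makes the total number of 1s even."""
    return bin(word).count("1") & 1


def parity_ok(word: int, parity: int) -> bool:
    """Detects (but cannot correct) any single-bit error in the word."""
    return even_parity_bit(word) == parity


good = 0b1010
upset = good ^ 0b1100                         # transient error in one copy
print(bin(majority_vote(good, good, upset)))  # voter output

stored_parity = even_parity_bit(0b1011)
print(parity_ok(0b1011 ^ 0b0100, stored_parity))  # single-bit flip
```

Voting masks the error at roughly three times the hardware cost, while parity is cheap but only detects the error, which is the kind of area versus reliability tradeoff the slide asks about.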

Summary

• Quality can't be added, it has to be designed in.
• Cost-effective embedded self-test will replace existing manufacturing test methodologies for heterogeneous SoC/SiP
• Post-silicon tuning/calibration/reconfiguration is becoming promising, and necessary, for Si nano systems

