
Paper 7.3, International Test Conference, ©2011 IEEE (978-1-4577-0152-8/11/$26.00)

Logic BIST Silicon Debug and Volume Diagnosis Methodology

M. Enamul Amyeen, Andal Jayalakshmi, Srikanth Venkataraman, Sundar V. Pathy, Ewe C. Tan

Intel Corporation, Hillsboro, OR
(enamul.amyeen | andal.jayalakshmi | srikanth.venkataraman | sundar.v.pathy | ewe.cheong.tan)@intel.com

Abstract

Post silicon speed-path debug and production volume diagnosis for yield learning are critical to meet product time-to-market demand. In this paper, we present a Logic BIST speed-path debug technique and methodology for achieving higher frequency demand. We have developed a methodology for Logic BIST production-fail volume diagnosis and present tester time and memory overhead tradeoffs and optimizations for enabling volume diagnosis. Results are presented showing successful isolation of silicon speed-paths on Intel® SOCs.

1. Introduction

Post silicon debug to root-cause speed-path failures and production volume diagnosis for yield ramp are two critical components for meeting product time-to-market demand. Increasing test cost and the high complexity of testing multiple functional blocks are making logic built-in self-test, or LBIST, an attractive alternative to ATPG tests [1,2,3].

Logic built-in self-test is a means whereby an integrated circuit can test its own circuitry. It operates by exercising the circuit logic and then detecting whether the logic behaved as intended, using an on-chip test generator and test response compactor. The LBIST test method is seen as an attractive alternative because it can be applied at almost any level of configuration with a minimal interface and tester support. The other major benefit is tester memory savings: since LBIST generates patterns on the die, patterns do not have to be stored in tester memory. LBIST generates pseudo-random patterns and usually needs additional test points to compensate for their otherwise inadequate test coverage. The additional test points result in higher (typically more than 90%) fault coverage for LBIST patterns. The LBIST controller and the test points have their own overheads in area and power; the typical hardware overhead for LBIST ranges from 1% to 2%. Nonetheless, LBIST is becoming an industry standard because its advantages outweigh these costs.

In product development, post-silicon debug and diagnosis methodologies play a major role. Silicon debug methods are used between first silicon and product qualification to root-cause any design flaws, and diagnosis methods are used to root-cause any fab process defects that delay product ramp-up and high-volume production. Speed-path debug identifies the performance-limiting paths, which are fixed in future product steppings. Traditionally, functional patterns are used for speed debug, but they are very expensive and time consuming due to their long test sequences [4]. Scan-based tests, on the other hand, are easier to debug due to the limited number of at-speed capture cycles and can provide the top limiting speed-paths.

In this paper we present flows for Logic BIST post-silicon debug and production-fail volume diagnosis. The silicon results demonstrate the effectiveness of the proposed methodology in successfully isolating speed-paths for LBIST failures on Intel SOCs. We propose a novel two-pass flow for LBIST on-line volume diagnosis that eliminates the conventional interactive debug approach. To the best of our knowledge, this is the first approach for LBIST on-line volume diagnosis.

The rest of the paper is organized as follows: Section 2 presents an overview of LBIST debug and logic diagnosis, including the LBIST implementation, speed-path debug, and the unit-level and production-fail logic diagnosis flows. Section 3 presents the LBIST speed-path debug methodology. The LBIST diagnosis flows for unit-level and production-fail volume diagnosis are described in section 4. The experimental results, including isolation of silicon speed-paths and the tester overhead tradeoffs and optimization for LBIST volume diagnosis, are presented in section 5. Section 6 concludes the paper.


2. Logic BIST Debug and Diagnosis Overview

Since LBIST uses pseudo-random patterns, we need special methods for debug and diagnosis of blocks that have LBIST circuitry. In contrast, scan-based ATPG debug and diagnosis methods are based on deterministic test patterns targeted at specific faults. In this paper, for debug and diagnosis we convert LBIST patterns into the conventional ATPG ASCII pattern format. In this section we present an overview of the standard LBIST test flow, the speed-path debug method, and the logic diagnosis flow.

2.1 Logic BIST Overview

An integrated circuit with LBIST has special test circuitry embedded in the design for generating the stimulus and detecting the response. There are many implementations of LBIST, but almost all use a Pseudo-Random Pattern Generator (PRPG) to generate the stimulus for the design and a Multiple Input Signature Register (MISR) to capture the response. The PRPG produces the test pattern data and supplies it to the internal scan chains, and the MISR compacts the scan chain responses of the circuit into a signature for that cycle of operation. This signature is unique in the sense that each failure in the device will likely result in a different value. In a standard LBIST flow, the MISR signatures are accumulated over the test cycles (usually a large number, such as 100,000) and compared with the golden signature at the end to determine whether the unit fails or passes. This is shown in Figure 1.
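
To make the PRPG/MISR mechanics concrete, the following minimal Python sketch models an LFSR-based PRPG feeding parallel scan chains and a MISR compacting the unloaded responses into a signature. The register widths, tap polynomial, and the stand-in circuit response are illustrative assumptions, not the hardware described in this paper.

```python
# Minimal LBIST model: an LFSR-based PRPG generates the stimulus and a MISR
# compacts the unload responses into a signature. Widths, taps, and the
# stand-in circuit response are illustrative assumptions only.

def lfsr_step(state, taps=(0, 2, 3, 5), width=16):
    """Advance the LFSR one cycle; return (new_state, output_bit)."""
    out = state & 1
    fb = 0
    for t in taps:
        fb ^= (state >> t) & 1
    return (state >> 1) | (fb << (width - 1)), out

def misr_step(sig, parallel_bits, taps=(0, 2, 3, 5), width=16):
    """Advance the MISR one cycle, folding one scan-out bit per chain into it."""
    fb = 0
    for t in taps:
        fb ^= (sig >> t) & 1
    sig = (sig >> 1) | (fb << (width - 1))
    for i, bit in enumerate(parallel_bits):
        sig ^= bit << i
    return sig

def run_lbist(trials, chain_len, n_chains, seed=0xACE1):
    prpg, misr = seed, 0
    for _ in range(trials):
        # Load phase: the PRPG supplies one bit per chain per shift cycle.
        stimulus = [[0] * n_chains for _ in range(chain_len)]
        for shift in range(chain_len):
            for c in range(n_chains):
                prpg, stimulus[shift][c] = lfsr_step(prpg)
        # Capture: real silicon supplies this; inverted stimulus is a stand-in.
        response = [[b ^ 1 for b in row] for row in stimulus]
        # Unload phase: each shift feeds one bit per chain into the MISR.
        for row in response:
            misr = misr_step(misr, row)
    return misr  # compared against the golden signature at the end

print(hex(run_lbist(trials=10, chain_len=8, n_chains=4)))
```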

Figure 1. Standard LBIST implementation

A standard LBIST diagnosis flow uses a search method such as binary search to find the failing vector, then applies this vector and shifts out the scan chain responses. This method is time consuming due to its adaptive search and is not suitable for online fail data collection. For volume production-fail diagnosis, we need a method to capture failures online (while on the tester) and log the fail information in the tester datalog while keeping the tester overheads to a minimum. We have developed and implemented a flow for production-fail diagnosis; the details and the tester overhead computations are discussed in sections 4 and 5, respectively. Next, we present an overview of the speed-path debug methodology.
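
To illustrate why the adaptive search is slow, here is a hedged sketch of a binary search over trials; `session_passes` is a hypothetical stand-in for rerunning the LBIST session up to a trial boundary and comparing the accumulated signature against golden.

```python
def first_failing_trial(session_passes, n_trials):
    """Binary search for the earliest failing LBIST trial.

    session_passes(k) is a stand-in that reruns LBIST through trial k and
    returns True if the accumulated MISR signature still matches golden;
    assumes at least one trial fails. Each probe is a full tester session,
    which is what makes this adaptive flow too slow for online volume
    fail-data collection.
    """
    lo, hi = 0, n_trials - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if session_passes(mid):
            lo = mid + 1      # first failure lies after `mid`
        else:
            hi = mid          # failure at or before `mid`
    return lo
```

With 100,000 trials this still costs roughly ceil(log2 100,000) = 17 rerun sessions per failing unit.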

2.2 Speed-path Debug

Speed paths limit the performance of a chip, causing the chip to fail test at the target frequency. Finding the speed paths enables us to find places where potential design fixes can be applied to improve performance and push the design timing wall. Identifying speed paths is a crucial step in the post-silicon stage for speed-sensitive products. At-speed fail data collection plays a critical role in identifying the speed paths. Transition fault patterns can be applied to several chips on the tester, and at-speed fail data can be collected for different design corners corresponding to different voltages and temperatures. The frequency range of the at-speed fail data collection can be determined from Shmoo plots, as shown in Figure 2. The plot shows the first failing frequency, at which at-speed fail data is taken; test engineers usually take additional at-speed fail data to investigate additional speed paths that need to be optimized up to the frequency goal of the design.


Figure 2. Shmoo Plot

We used the Poirot speed-path analysis tool [4,6], which identifies potential speed-paths for the top sightings, to address the speed gap and hit the frequency goal. Figure 3 shows the speed-path analysis for a two-cycle launch-on-capture test. The first functional clock pulse initiates the transition, and the second clock pulse captures the observed values. Finally, failures are observed during unloading of the scan chain values. The speed-path analysis tool does both static and dynamic analysis to determine the source and destination flops. It then ranks and identifies candidate paths based on path sensitization using simulation values.

In the next section we provide an overview of the logic diagnosis flow for engineering and volume diagnosis.

2.3 Logic Diagnosis

The diagnosis process is used to identify and root-cause the issues behind low product yield so that corrective actions can be taken. The true purpose of logic diagnosis is to determine the location of the defect, though it can also be used to find the logic nature of the defect [5,6]. There are two applications for logic diagnosis: first, offline or unit-level diagnosis, to root-cause failures during engineering debug; second, online or production-fail volume diagnosis, to root-cause yield issues.

Figure 3. Launch-on capture at-speed transition cycles

The unit-level diagnosis flow is shown in Figure 4. The fail data is obtained from testing the failing unit on the tester and is converted to readable failure information for the logic diagnosis tool to analyze. The logic diagnosis tool uses the failure observations from the tester datalog and the failure observations of a fault simulator built into the tool to analyze and isolate the failure. It produces a prioritized list of candidate fault locations for further analysis through physical failure analysis techniques.

In a turnkey high volume diagnosis flow setup, the datalogs for failing units are analyzed online by the turnkey diagnosis servers and the results are stored in a database server. These results are later analyzed for systematic defects to resolve any yield issues.

3. Speed-path Debug Flow

This section describes our methodology to collect and analyze LBIST at-speed transition failures. The fail data is collected by setting up a Shmoo over voltage and frequency to find the corresponding Fmax limits Fmax1, Fmax2, Fmax3, ..., Fmax100 and logging the first failing pattern or trial. Once the failing trial is identified, the scan chain values are shifted out serially through the Test Data Out (TDO) pin. The failing scan cells are identified by comparing the response with the expected scan cell values.
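
A minimal sketch of these two steps, assuming hypothetical stand-ins: `run_lbist_at` for a tester LBIST execution at a given frequency, and expected chain values coming from simulation of the failing trial.

```python
def find_fmax(run_lbist_at, frequencies):
    """Sweep frequency (ascending) at a fixed voltage and return the last
    passing frequency (the Fmax limit) and the first failing one.
    run_lbist_at(f) -> bool is a stand-in for a tester LBIST execution."""
    fmax = first_fail = None
    for f in sorted(frequencies):
        if run_lbist_at(f):
            fmax = f
        else:
            first_fail = f
            break
    return fmax, first_fail

def failing_scan_cells(observed, expected):
    """Compare the chain contents unloaded through TDO against the expected
    (simulated) values; return the miscomparing scan-cell positions."""
    return [i for i, (o, e) in enumerate(zip(observed, expected)) if o != e]

print(failing_scan_cells("0110010", "0100000"))  # -> [2, 5]
```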



Figure 4. Unit-level logic diagnosis flow

Figure 5. Logic BIST diagnosis flow

We run speed-path debug analysis on the Fmax failures and report potential speed-paths with their source and destination flops. The speed-paths are ordered by Fmax value, and the common speed-paths from the highest limiting Fmax to the lowest limiting Fmax are consolidated. For further analysis of the top N Fmax limiters from this bucketing process, the speed-path sightings and the information on the limiting paths are recorded. The sighted speed-paths are compared with the timing database, and paths with small timing margin are targeted for fixes in the next product stepping.
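
The bucketing step can be sketched as follows, assuming each diagnosis run reports candidate paths as sequences of cell/pin names (the names below are hypothetical).

```python
from collections import Counter

def consolidate_speedpaths(paths_by_fmax):
    """paths_by_fmax maps an Fmax value to the candidate paths reported at
    that Fmax; each path is a tuple of cell/pin names. Returns 2-node path
    segments ranked by how many candidate paths share them."""
    seg_counts = Counter()
    for paths in paths_by_fmax.values():
        for path in paths:
            for seg in zip(path, path[1:]):  # every consecutive segment
                seg_counts[seg] += 1
    return seg_counts.most_common()

ranked = consolidate_speedpaths({
    950: [("ff_a/Q", "adder_3/CI", "adder_3/CO", "ff_b/D")],
    900: [("ff_c/Q", "adder_3/CI", "adder_3/CO", "ff_d/D")],
})
print(ranked[0])  # the adder-internal segment is shared by both Fmax buckets
```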

4. Logic BIST Diagnosis

There are two applications to logic diagnosis. It can be used for unit-level diagnosis which identifies issues with packaged units that failed on the tester. The other application is for high-volume diagnosis where the failures are collected online from the failing units for later analysis. This section describes our methodology for these two applications.

4.1 Unit-level Diagnosis

In this section, we review the overall flow and the components of unit-level LBIST diagnosis. We used a methodology similar to the ATPG diagnosis used to root-cause hard defects in the circuit. The overall flow is presented in Figure 5.

Logic diagnosis needs a few collaterals to do the simulation required to identify the fault locations. Two components need to be in place: 1. ASCII test patterns and 2. a failure file. We generated the ASCII test patterns using the LBIST pattern database. The LBIST failure data that comes from the tester datalog needs special handling, as it is just a raw scan dump rather than useful failure information; we have automated this conversion to create a failure file. Custom ASCII patterns containing both failing and passing patterns are prepared from the pattern files so the diagnosis tool can analyze the actual failures against simulated faults. The results obtained for simulation of injected faults on Intel SOCs are presented in section 5. The components of the LBIST diagnosis flow are described in detail in the following subsections.



4.1.1 Logic BIST ASCII Pattern generation

During the LBIST front-end flow, the LBIST database allows us to generate ASCII patterns that are the equivalent of the LBIST pseudo-random patterns generated on the die. Since the LBIST pattern database has the capability to initialize a PRPG, set up the Linear Feedback Shift Register (LFSR) connections and other constraints, and generate random patterns, we can simulate the LBIST controller and generate the pseudo-random patterns that actually get generated on the chip. These patterns can be written out in ASCII format for later use by the logic diagnosis tool.

Since the LBIST test cycle count is large (100K), the LBIST pattern database would have to be run for all of these cycles to generate all 100K patterns, which is time consuming and occupies large disk space. To overcome this, we developed a flow to generate only the patterns of interest. First, we run a fault simulation to the end pattern to generate the LBIST trace, which contains the PRPG seed for each of the 100K patterns. Then we extract the PRPG/MISR signature from the generated trace file to generate the pattern of interest.
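
A sketch of the regeneration step, under the same illustrative LFSR model as the sketch in section 2.1; the per-trial seed value and bit counts here are assumptions.

```python
def lfsr_step(state, taps=(0, 2, 3, 5), width=16):
    """One cycle of the illustrative 16-bit PRPG LFSR from section 2.1."""
    out = state & 1
    fb = 0
    for t in taps:
        fb ^= (state >> t) & 1
    return (state >> 1) | (fb << (width - 1)), out

def regenerate_trial(trial_seed, n_bits):
    """Regenerate one trial's stimulus directly from that trial's PRPG seed
    (as recorded in the LBIST trace), avoiding a replay of all 100K trials
    from the initial seed."""
    state, bits = trial_seed, []
    for _ in range(n_bits):
        state, bit = lfsr_step(state)
        bits.append(bit)
    return bits

# Seed as it would be extracted from the trace file for the failing trial
# (the value is illustrative).
print(regenerate_trial(0x5B3C, n_bits=16))
```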

4.1.2 Custom ASCII Pattern generation

Since LBIST uses a large number of test cycles and the failing patterns can be far apart, we need an automated way to create a custom pattern file that collects patterns from multiple pattern files. The logic diagnosis tool uses both the passing and failing patterns and responses for its analysis of actual failures against simulations of injected faults. The LBIST patterns can be pre-generated and stored for later use, or generated on demand using the method described in section 4.1.1.

4.1.3 Fail Datalog conversion

Unlike ATPG diagnosis, where the chip response is compared to golden on the tester and the result is logged in the tester datalog, responses for LBIST failures are not compared online. A failure triggers the execution of diagnostic patterns, which dump out the failing segment of the scan chains (concatenated as a daisy chain) as a long list of 0s and 1s. The fail responses are compared offline to determine the failing chain and flop. It is possible to dump out just the fail segments (instead of the whole chain) if the LBIST controller can be configured for that, but this is product specific, as the hardware to reconfigure the chains of interest must be present in the design. The MISR segment configuration file provides the chain-to-segment mapping, which helps to identify the failing chain from the fail response.
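
A sketch of the offline conversion, assuming the segment configuration supplies (chain, start, length) slices of the concatenated dump; the file format and names are assumptions.

```python
def convert_fail_datalog(raw_dump, expected, segment_map):
    """raw_dump, expected: bit strings for the concatenated (daisy-chained)
    scan unload. segment_map: list of (chain_name, start, length) giving
    each chain's slice of the concatenated dump, as derived from the MISR
    segment configuration file. Returns {chain_name: [failing flop offsets]}."""
    fails = {}
    for chain, start, length in segment_map:
        miscompares = [i for i in range(length)
                       if raw_dump[start + i] != expected[start + i]]
        if miscompares:
            fails[chain] = miscompares
    return fails

seg_map = [("chain0", 0, 4), ("chain1", 4, 4)]
print(convert_fail_datalog("01101100", "01100100", seg_map))
# -> {'chain1': [0]}  i.e., flop offset 0 in chain1 fails
```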

Figure 6. Logic BIST volume diagnosis flow

4.2 Production-fail volume diagnosis

The overall flow for production-fail volume diagnosis includes three basic components: test content generation, fail data collection and automated analysis. The overall flow is given in Figure 6.

The test content for the diagnosis flow is generated by the LBIST front-end flow and consists of Go-NoGo and Fault-Isolation patterns for the pass and fail flows, respectively. These are used by the test instance, which identifies the failing MISRs and triggers the fail-data collection. The fail data is analyzed using a flow similar to the one used for unit-level diagnosis. The details of each of these three processes are explained in the following subsections.

Figure 7. Logic BIST pattern slices

4.2.1 Test Content generation

The test content is generated by the LBIST front-end flow as Go-NoGo (passing) and Fault-Isolation (failing) patterns. The LBIST flow uses 100K patterns, sliced into 4 windows: 1-100, 101-1,000, 1,001-10K, and 10,001-100K; one Go-NoGo pattern is generated for each window. During the pass flow, the MISR signatures of the first 100 trials in each window are compared with the golden signatures, and the failing trials are stored for use by the fail flow. The cumulative MISR of each window is also stored and compared, to determine whether there are failures outside the first 100 trials of that window.
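
A small sketch of the window mapping, assuming the window boundaries above; `windows_to_regenerate` is a hypothetical helper name.

```python
# The four Go-NoGo trial windows described above.
WINDOWS = [(1, 100), (101, 1_000), (1_001, 10_000), (10_001, 100_000)]

def windows_to_regenerate(failing_trials):
    """Map observed failing trials onto the 4 Go-NoGo windows so that only
    the relevant Fault-Isolation content needs to run in the fail flow."""
    hit = set()
    for t in failing_trials:
        for lo, hi in WINDOWS:
            if lo <= t <= hi:
                hit.add((lo, hi))
    return sorted(hit)

print(windows_to_regenerate([57, 57_770]))  # -> [(1, 100), (10001, 100000)]
```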

The Fault-Isolation patterns are generated for the first 100 trials of each window, totaling 400 trials. The actual failing trials that fall within these 400 trials are run in a second iteration to dump out the scan chains. The scan chain responses are not compared with golden on the tester due to pattern memory restrictions; we have automated this comparison offline, as explained in the next subsection.

4.2.2 Fail data collection

A Tester Method class is implemented to handle the high-volume manufacturing fault-isolation data collection in a consolidated manner. The test method has 3 pattern-execution passes. In the first pass, the single ATE pattern per trial window (4 patterns in this example) is executed in a single burst and each failing pattern is logged; at the end of the first pass, we know which windows have failing trials in them. This leads to the second pass, where we execute the targeted trial patterns (the first 100 per window) in a burst and log all the failing patterns, which identifies the individual failing trials. It is possible that a failing trial is outside the targeted trial list within that window; in that case, no further data collection is done. At the end of the second pass, we have the list of individual failing trials, and we move on to the third pass, where we execute each failing trial's diag pattern (in concatenated chain mode) to log the scan chain data to a datalog file. The datalog file is then used offline for further Poirot-based failure analysis. The 3 passes are shown in Figure 7 and Figure 8.
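
The control flow of the three passes can be sketched as follows; the `tester` object and its attributes (gonogo_pattern, trial_patterns, diag_pattern, execute_burst, execute_diag) are hypothetical stand-ins for the ATE test-method API, not the actual Tester Method class.

```python
def three_pass_fault_isolation(tester, windows, targeted=100):
    """Sketch of the 3-pass HVM fail-data collection described above."""
    # Pass 1: one Go-NoGo pattern per window, executed in a single burst;
    # the logged pattern fails identify the windows with failing trials.
    failing_windows = tester.execute_burst([w.gonogo_pattern for w in windows])

    # Pass 2: for each failing window, burst the targeted per-trial compare
    # patterns (first 100 per window) and log the individual failing trials.
    failing_trials = []
    for w in failing_windows:
        failing_trials += tester.execute_burst(w.trial_patterns[:targeted])

    # A failing trial outside the targeted list gets no further collection.
    # Pass 3: run each failing trial's diag pattern (chains concatenated)
    # and dump the scan data to the datalog for offline Poirot analysis.
    for trial in failing_trials:
        tester.execute_diag(trial.diag_pattern, log_to_datalog=True)
```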

Figure 8. Shifting out of failing MISR segment chains

One optimization is to combine the first and second passes into just 4 patterns, generating the LBIST patterns such that the MISR compare happens at the end of each trial for the first hundred trials of each window. This allows the third pass to be 100 diag patterns (with chain concatenation for the TDO dump) per window, for a total of 400 patterns. This provides better test time at slightly higher pattern generation complexity.

4.2.3 Automated Analysis

The LBIST failures are analyzed using an automated flow similar to unit-level diagnosis; the details of the unit-level flow and its individual components were explained in section 4.1. The advantage of volume diagnosis over unit-level diagnosis is its capability to collect fail data during production and analyze the failures.

5. Experimental Results

We applied the Logic BIST debug and diagnosis flow to isolate speed-path failures and manufacturing defects on Intel® SOCs. The circuit statistics for the LBIST blocks are shown in Table 1. Columns 2 and 3 show the primary input and primary output counts, respectively. The total gate count is shown in column 4; for the largest block the gate count is 5.9 million. The total number of scan chains is shown in column 5 and varies from ~100 to 600. The last column shows the total number of scan cells, which are evenly distributed across the scan chains.

Several diagnostic measures are used to evaluate the match between the simulation failures and the observed failures and to rank the candidate faults. The diagnostic measures are illustrated in Figure 9. The term intersection is used to describe a match between fault simulation failures and the observed behavior [6,7,8]. On some test cycles a stuck-at fault will cause simulation failures even though there are no observed failures on those cycles. These additional simulation failures are termed mispredictions [9,10,11], and they are caused by non-excitation of the silicon defect on those simulation cycles. If we observe a complete match between simulation and observed failures for a test cycle, it is referred to as a cycle-intersection.

Figure 9. Diagnostic measures for ranking candidates

The fault candidates are classified into different groups based on the intersection, cycle-intersection, and misprediction counts. Faults with identical behavior, i.e., the same intersection, cycle-intersection, and misprediction counts, belong to the same candidate fault class.
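
A sketch of how these measures and the candidate classification could be computed, assuming failures are recorded as (test cycle, observation point) pairs; the representation is an assumption.

```python
def diagnostic_measures(observed, simulated):
    """observed, simulated: sets of (test_cycle, observation_point) failures.
    Returns (intersection, cycle-intersection, misprediction, non-prediction)
    counts; a cycle-intersection is a cycle where the simulated and observed
    failing observation points match completely."""
    inter   = len(observed & simulated)
    mispred = len(simulated - observed)   # simulated fails never observed
    nonpred = len(observed - simulated)   # observed fails never simulated
    cycles  = {c for c, _ in observed | simulated}
    cyc_inter = sum(
        1 for c in cycles
        if {p for cc, p in observed if cc == c} ==
           {p for cc, p in simulated if cc == c})
    return inter, cyc_inter, mispred, nonpred

def group_candidates(faults):
    """Faults with identical (intersection, cycle-intersection,
    misprediction) counts fall into the same candidate class."""
    classes = {}
    for name, measures in faults.items():
        classes.setdefault(measures[:3], []).append(name)
    return classes

obs = {(1, "fA"), (3, "fB")}
sim = {(1, "fA"), (2, "fC")}
print(diagnostic_measures(obs, sim))  # -> (1, 1, 1, 1)
```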

5.1 Silicon Speed-path Debug results

In this section, we present results of silicon speed-path debug using LBIST content. Of all the LBIST failing units, 25% failed at the specified Vmin voltage and passed at a higher voltage. After characterization, three units were selected for further debug.

Table 2 shows the LBIST failure characteristics of the three units with respect to clock period, voltage, first failing trial, and number of failing flops. On unit S1, failures were observed at 10 ns within a voltage range of 0.85 V to 1.0 V for 100,000 LBIST patterns. A binary search was then performed to identify the failing LBIST scan operation cycle, or trial. For all three units, failures were observed beyond 10,000 trials. The number of failing destination flops, given in the last column, was obtained by shifting out the scan chain values and comparing them with the expected response.

Table 2: LBIST Speed Failures

Unit ID | Clock Period | Voltage | Failing trial | No. of failing flops
S1 | 10 ns | 0.85 V to 1.0 V | 10,769 | 14
S2 | 10 ns | 0.85 V to 1.0 V | 57,770 | 7
S3 | 10 ns | 0.84 V | 57,770 | 7

Units S2 and S3 produced identical failing-flop signatures at failing trial 57,770. The failing trial and failing-flop information was passed to the speed-path diagnosis tool [4], which identified failing speed-paths corresponding to each failing flop. The common segments among the failing paths were then ranked based on their number of occurrences: a common path segment ranks higher if it appears in a larger number of candidate paths. On all three units, the top-ranked path segments resided within or at the boundary of a specific full adder cell instance. Next, we mapped the failing locations in the layout and identified the failing regions of interest.

Table 1. Circuit statistics

LBIST Block | PIs | POs | Gate Count | # Scan Chains | Latch Count
BlockG | 478 | 443 | 5,946,335 | 661 | 216,471
BlockI | 806 | 655 | 4,164,887 | 640 | 147,714
BlockC | 547 | 439 | 2,696,123 | 128 | 120,570
BlockD | 727 | 451 | 4,492,995 | 128 | 84,579
Block0 | 3,205 | 3,779 | 2,795,196 | 192 | 84,540
Block1 | 4,384 | 4,638 | 4,086,324 | 360 | 17,057
Block2 | 2,858 | 1,949 | 4,131,242 | 288 | 134,173
Block3 | 792 | 576 | 2,482,149 | 192 | 84,409


In order to confirm and root-cause the speed failures, a laser-assisted device alteration (LADA) [12] probing tool was used. The units were thinned from the back side of the chip for LADA scanning. The LADA probing patterns were created with only the single failing trial. The failing pattern, or trial, was then applied repeatedly while scanning the laser over the layout location identified from diagnosis. When the unit shifts from a failing condition to a passing condition, it is referred to as a hit and identified as a dot. Figure 10(a) shows the LADA hit at the failing adder cell, at the driver and receiver nodes of the adder cell; the adder cell location is highlighted by a bounding rectangle. Figure 10(b) shows the LADA hit with a layout image overlay.


Figure 10. (a) Image of LADA hit at the adder cell location; (b) LADA hit image with layout overlay


Figure 11 (a) Failing and (b) passing Shmoo with LADA scanning.

Figure 11 shows the failing and passing Shmoo plots during LADA scanning. The passing Shmoo is observed while parking the LADA on the region of interest identified from diagnosis.

The LADA hit confirmed the failing area identified from speed-path debug. In order to identify the failing condition, time-resolved emission (TRE) [13,14] probing was used. Figure 12 shows the output from TRE probing. Probing at the input and output of the adder identified the failing cycle: the failing condition shows a missing pulse on the carry-out signal. Circuit simulation later confirmed that weaker signal strength at one of the adder inputs caused the failure. The driver strength and the adder cell design were changed to rectify the problem in a future stepping of the product.

Figure 12. TRE probing waveform at the adder cell

5.2 Simulation Results for Unit-Level Diagnosis

We have successfully set up the unit-level LBIST diagnosis flow on the LBIST blocks. We generated the ASCII patterns and performed simulation validation for all the LBISTed blocks.

We used a fault simulation tool for logic validation, and all blocks passed the simulation. We used Poirot for simulation validation with injected faults and obtained good resolution for all LBIST blocks. The results are tabulated in Table 3; the diagnosis results for 3 random faults per block are shown. The "Defect type" column shows the type of defect, either stuck-at-0 (SA0) or stuck-at-1 (SA1). The "Number of failures" column gives the total failing patterns for the defect, obtained from the failure information file. The run time of the logic diagnosis tool was 5.61 seconds on average for the faults shown.

5.3 Tester overhead tradeoffs

We now analyze the test time and tester memory overheads for enabling LBIST volume diagnosis. The overheads are computed based on design data, and we have detailed the observations below. There are two types of patterns: 1. Go-NoGo and 2. Fault-Isolation.

Table 3. Logic BIST simulation results

LBIST Block | Fault # | Defect type | Number of failures | Runtime (secs)
BlockG | 1 | SA1 | 438 | 4.28
BlockG | 2 | SA0 | 27 | 7.49
BlockG | 3 | SA1 | 705 | 4.31
BlockI | 1 | SA0 | 86 | 8.05
BlockI | 2 | SA1 | 83 | 3.10
BlockI | 3 | SA0 | 23 | 6.39
BlockC | 1 | SA0 | 78 | 4.59
BlockC | 2 | SA0 | 269 | 3.31
BlockC | 3 | SA1 | 2997 | 10.05
BlockD | 1 | SA0 | 127 | 7.46
BlockD | 2 | SA1 | 131 | 10.22
BlockD | 3 | SA0 | 1161 | 8.36
Block0 | 1 | SA1 | 135 | 1.44
Block0 | 2 | SA1 | 654 | 4.17
Block0 | 3 | SA1 | 639 | 3.51
Block1 | 1 | SA0 | 1599 | 7.22
Block1 | 2 | SA1 | 139 | 2.15
Block1 | 3 | SA1 | 2854 | 6.02
Block2 | 1 | SA1 | 218 | 8.16
Block2 | 2 | SA0 | 54 | 3.37
Block2 | 3 | SA0 | 546 | 6.38
Block3 | 1 | SA1 | 1520 | 4.17
Block3 | 2 | SA1 | 113 | 3.30
Block3 | 3 | SA1 | 895 | 7.02

The Go-NoGo patterns are used to identify the failing MISRs, and the Fault-Isolation patterns are used to concatenate the scan chains into a single chain and shift out the responses through the Test Data Out (TDO) pin. The test time computations for these two pattern types are given below.

Test Time: Based on current estimates, there will be some additional cost for both passing and failing units for the Go-NoGo pattern relative to the vanilla LBIST pattern, which executes a single MISR signature pass-fail test. We first define terms to compute test time. Let $L_{scan}$ be the length of a scan chain, $T_{shift}$ the shift clock period, $M_{bits}$ the number of MISR bits, $N_{tr}$ the number of trials, $N_{cmp}$ the number of MISR compares, and $T_{test}$ the tester clock period. For the vanilla, or baseline, single MISR signature pass-fail test, the total execution time comprises the time for shifting the scan chain values over all trials and the time for a single shift-out of the final MISR signature. MISR bits are shifted out through the TDO pin at the tester clock frequency; to shift the MISR bits, additional overhead register bits, denoted $O_{bits}$, also need to be shifted. The total execution time for the baseline single MISR signature pass-fail test is computed as:

$$T_{base} = N_{tr} \cdot L_{scan} \cdot T_{shift} + (M_{bits} + O_{bits}) \cdot T_{test}$$

The shift clock frequency for our design is four times the tester clock frequency. For our design, $N_{tr}$ = 100,000 trials, a shift clock period of 10 ns, and a tester clock period of 40 ns are used. The number of MISR bits, $M_{bits}$, varies and can range from 100 to 600. Similarly, the shifting overhead register bits, $O_{bits}$, vary from 450 to 2000.

For Go-NoGo patterns, assuming 400 MISR compares, represented as $N_{cmp}$, the total execution time for the Go-NoGo pattern on a passing unit is calculated as:

$$T_{gonogo\_pass} = N_{tr} \cdot L_{scan} \cdot T_{shift} + (M_{bits} + O_{bits}) \cdot T_{test} \cdot N_{cmp}$$
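
To make the formulas concrete, here is a small calculation with the numbers quoted above; the scan chain length $L_{scan}$ is not stated in the paper, so 350 cells is an assumed, illustrative value.

```python
# Worked test-time example using the numbers quoted above.
N_tr    = 100_000          # number of trials
T_shift = 10e-9            # 10 ns shift clock period
T_test  = 40e-9            # 40 ns tester clock period
L_scan  = 350              # assumed scan chain length (not stated in paper)
M_bits  = 600              # MISR bits (upper end of the quoted range)
O_bits  = 2000             # overhead register bits (upper end)
N_cmp   = 400              # Go-NoGo MISR compares

T_base        = N_tr * L_scan * T_shift + (M_bits + O_bits) * T_test
T_gonogo_pass = N_tr * L_scan * T_shift + (M_bits + O_bits) * T_test * N_cmp
print(f"T_base        = {T_base * 1e3:.1f} ms")
print(f"T_gonogo_pass = {T_gonogo_pass * 1e3:.1f} ms "
      f"(+{(T_gonogo_pass / T_base - 1) * 100:.1f}%)")
# Prints ~350.1 ms vs ~391.7 ms (+11.9%), inside the 1-15% overhead range
# quoted below; smaller MISR configurations land near the low end.
```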

The $T_{gonogo}$ execution time for a passing unit varies; for our designs it can be 1% to 15% higher than $T_{base}$. For a failing unit, the failing MISR signatures need to be corrected before continuing the execution of the remaining trials. The execution time for a failing unit depends on the number of failing trials and the number of MISR bits that get corrupted. Let $N_{failtr}$ be the number of failing trials; typically 5% to 20% of total trials fail on a failing unit. Once a failing MISR signature is compared with the good MISR signature stored in the tester capture memory, the corrupted MISR bits are identified. Let $R_{cap}$ (0.005 millisecond) be the time to correct each corrupted MISR bit. For logic fails, typically a single scan chain segment fails and the corresponding MISR segment bits get corrupted. Let $N_{seg}$ be the number of chain segments, so that $M_{bits}/N_{seg}$ represents the number of MISR bits corresponding to a single chain segment. The execution time for Go-NoGo patterns on a failing unit can be computed as:

$$T_{gonogo\_fail} = T_{gonogo\_pass} + N_{failtr} \cdot \frac{M_{bits}}{N_{seg}} \cdot R_{cap}$$



On a failing unit, the corrupted MISR signature bits need to be restored for the Go-NoGo passing pattern. Assuming 10% of the MISR-compared trials fail and a single MISR segment's bits are corrupted, the overhead for Go-NoGo patterns varies between 2% and 20% on a failing unit. For the Fault-Isolation pattern, execution time comprises the shifting of the scan chain values through TDO, the fault-isolation pattern overhead ($O_{pat}$), and the datalogging time of observed values from the capture memory. Let $L_{seg}$ be the length of the single chain obtained by concatenating the chain segments. The execution time for the Fault-Isolation pattern is computed as:

$$T_{FI} = L_{seg} \cdot T_{test} + L_{seg} \cdot R_{cap} + O_{pat}$$

The fail-unit test time is estimated to be 5-10x $T_{base}$, mostly dominated by the serial scan unload time. The overall test time increase depends on $T_{base}$ relative to the total time of all tests, and also on the ratio of good to bad (LBIST-failing) units; the exact increase will depend on the particular configuration of the block. Initially this test time increase will not be much of a concern, since there will be a small amount of material (units) to test, but over time, as the product ramps, we may have to dial down the sample of fail units to reduce the overall test time.

Tester Memory: For Go-NoGo patterns, good MISR signatures for the compared failing trials are stored. For our LBIST blocks, the maximum number of trials for MISR compare is limited by the available tester memory. With current tester capacity, for the largest failing block we can accommodate 500 MISR compares within the capture memory limit. For the Fault-Isolation pattern, expected values of the scan chain captures are not stored, to save tester memory; the failing scan cells are determined offline by comparing observed values with the expected chain values.

6. Conclusions

We presented a Logic BIST debug and volume diagnosis methodology for isolating speed-path failures and manufacturing defects. Silicon results are presented isolating speed-paths using at-speed scan test content on Intel SOCs. A novel volume diagnosis methodology for LBIST diagnosis is developed and presented, eliminating the conventional interactive failure diagnosis approach.

Acknowledgement

The authors would like to thank Carlston Lim for enabling the speed-path diagnosis flow, Ajithkumar Kalangara for implementing the LBIST test class, Chin Wah Lim for the silicon failure debug, and Inn Chin Wong from the SOC product engineering team for enabling volume diagnosis deployment.

References

[1] K. M. Butler, "ATPG versus Logic BIST: Now and in the Future", IEEE International Test Conference, 2001.
[2] G. Crowell and R. Press, "Using Scan Based Techniques for Fault Isolation in Logic Devices", Microelectronics Failure Analysis, pp. 132-138, Oct. 2004.
[3] W.-T. Cheng, M. Sharma, T. Rinderknecht, L. Lai, and C. Hill, "Signature Based Diagnosis for Logic BIST", IEEE International Test Conference, Oct. 2007.
[4] R. McLaughlin, S. Venkataraman, and C. Lim, "Automated Debug of Speed Path Failures Using Functional Tests", IEEE VLSI Test Symposium, pp. 91-96, 2009.
[5] M. Abramovici, M. A. Breuer, and A. D. Friedman, Digital System Testing and Testable Design, AT&T Bell Laboratories and W. H. Freeman and Company, 1990.
[6] S. Venkataraman and S. Drummonds, "Poirot: Applications of a Logic Fault Diagnosis Tool", IEEE Design & Test of Computers, pp. 19-31, Jan.-Feb. 2001.
[7] B. Chess, D. B. Lavo, F. J. Ferguson, and T. Larrabee, "Diagnosis of Realistic Bridging Faults with Single Stuck-at Information", IEEE/ACM International Conference on CAD, pp. 185-192, Nov. 1995.
[8] D. B. Lavo, T. Larrabee, and B. Chess, "Beyond the Byzantine Generals: Unexpected Behavior and Bridging Fault Diagnosis", IEEE International Test Conference, pp. 611-619, Oct. 1996.
[9] D. Josephson, S. Poehlman, and V. Govan, "Debug Methodology for the McKinley Processor", IEEE International Test Conference, pp. 451-460, 2001.
[10] P. Dahlgren, P. Dickinson, and I. Parulkar, "Latch Divergency in Microprocessor Failure Analysis", IEEE International Test Conference, pp. 755-763, 2003.
[11] P. Maxwell, I. Hartanto, and L. Bentz, "Comparing Functional and Structural Test", IEEE International Test Conference, pp. 400-407, 2000.
[12] J. A. Rowlette and T. Eiles, "Critical Timing Analysis in Microprocessors Using Near-IR Laser Assisted Device Alteration", IEEE International Test Conference, pp. 264-273, Oct. 2003.
[13] J. C. Tsang, J. A. Kash, and D. P. Vallett, "Picosecond Imaging Circuit Analysis", IBM Journal of Research and Development, 44(4), pp. 583-604, 2000.
[14] J. C. Tsang, J. A. Kash, and D. P. Vallett, "Time-Resolved Optical Characterization of Electrical Activity in Integrated Circuits", Proceedings of the IEEE, 88(9), pp. 1440-1459, 2000.
