IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 35, NO. 7, JULY 2016 1219

Parallelized Network-on-Chip-Reused Test Access Mechanism for Multiple Identical Cores

Taewoo Han, Inhyuk Choi, Hyunggoy Oh, and Sungho Kang, Senior Member, IEEE

Abstract—This paper proposes a new network-on-chip (NoC)-reused test access mechanism (TAM) for testing multiple identical cores. It can test multiple cores concurrently and identify faulty cores so that the chip can be derated by excluding them. In order to minimize the test time, the TAM utilizes the majority value of the test response data. All of the cores can thereby be tested in parallel, and the test costs (in both test pins and test time) are exactly the same as those for a single core. The hardware overhead is minimized by reusing the NoC infrastructure, and transfer-counters are designed to serve as a majority analyzer. The experimental results in this paper show that the proposed TAM can test multiple cores in the same time as a single core and with negligible hardware overhead.

Index Terms—Multicore, network-on-chip (NoC), parallel test, test access mechanism (TAM).

I. INTRODUCTION

A system-on-chip (SoC) design mainly consists of multiple IP cores, each of which contains an individual design block and its design. Thus, an SoC test implies a highly structured design-for-test (DFT) infrastructure to observe and control individual core test solutions [1]. The design of a communication infrastructure within such complex systems requires high performance and high quality levels while connecting an increasing number of cores. Conventional communication architectures can cause severe on-chip synchronization errors, unpredictable delays, and high power consumption. A network-on-chip (NoC) has been proposed as a solution to overcome the limitations of bus-based and point-to-point communication architectures [2]. When the NoC is used as an interconnection fabric, the cores in the SoC can be tested using the NoC as the test access mechanism (TAM). This NoC-reused TAM allows the use of existing functional interconnects, with reduced area, pin count, and test time costs [3]. The infrastructure of the NoC, which includes routers and interconnections, must be tested before the NoC is reused as a TAM [4]. Amory et al. [5] and Cota et al. [6] proposed a DFT scheme to test all identical routers concurrently, and Xiang and Zhang [7] and Xiang [8] proposed a scheme to test interconnections at a reduced cost. An NoC-reused TAM facilitates design reuse and localizes the DFT effort to access points and core wrappers such as the IEEE 1500 [9], and therefore reduces the impact of last-minute design changes. For heterogeneous cores, research on the optimization of a dedicated TAM [10], [11] and an NoC-reused TAM [12] demonstrated that pin-count-aware test schedule optimization can reduce the test time for a given number of pins.

Manuscript received November 26, 2014; revised June 16, 2015; accepted September 2, 2015. Date of publication September 23, 2015; date of current version June 16, 2016. This work was supported by the National Research Foundation of Korea through the Korea Government (MSIP) under Grant 2015R1A2A1A13001751. This paper was recommended by Associate Editor J. L. Dworak. (Corresponding author: Sungho Kang.)

T. Han is with the SoC Development, System LSI, Samsung Inc., Gyounggi-do 445-701, Korea (e-mail: [email protected]).

I. Choi, H. Oh, and S. Kang are with the Computer Systems and Reliable SOC Laboratory, Department of Electrical and Electronic Engineering, Yonsei University, Seoul 120-749, Korea (e-mail: [email protected]; [email protected]; [email protected]).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TCAD.2015.2481872

Recently, modern microprocessor designs have evolved to include multiple identical cores [13], and an NoC helps to implement reconfigurable systems and topology reconfiguration for defect-tolerant NoC-based homogeneous multicore or many-core systems [14]. Building such a highly reliable system begins with an accurate test that identifies faulty cores so that the chip can be derated by excluding them. A pipeline-based TAM was proposed for parallel testing of multiple identical cores [15]. The NoC-reused parallel TAM (NRP-TAM) [16] adopts this pipeline-based test scheme for use as an NoC-reused TAM. However, the pipeline-based TAM requires additional test time when the primary core has a fault. The worst case occurs when the primary core fails continuously, thereby requiring N tests for N cores. In addition, if one chip needs additional test time, the other chips on the same wafer must also wait for the additional test. As a result, the overall test process is delayed, and a single fault at the primary core can have a large impact. A majority-based TAM [17] was proposed to overcome the limitations of the pipeline-based TAM. It can test all cores using the same test pins and test time as required for testing a single core, but it is designed as a dedicated TAM. The majority analyzer in the dedicated TAM is a combinational module, and it is hard to apply to NoC infrastructures, which have sequential logic with routing buffers.

In this paper, a completely parallelized NoC-reused TAM for multiple identical cores is proposed. It is implemented by utilizing the majority-based TAM scheme. Also, a dedicated majority analyzer is designed for reusing NoC infrastructures. With the majority-based TAM scheme, cores that produce test response (TR) data different from the majority value (MV) can be considered faulty. The MV is then tested by the automated test equipment (ATE) to determine whether it matches the expected value; a mismatch indicates a fault. The proposed NoC-reused TAM targets the most common NoC architectures [18] and is flexible in its design, configuration, and application. The proposed TAM can be used to perform a complete core-level diagnosis, and the test process is completely parallelized to minimize the test costs.

II. PREVIOUS WORKS

A. NoC-Reused TAM

An NoC-reused TAM for testing multiple identical cores has been studied with the pipeline-based TAM scheme. If the pipeline-based TAM is applied to an NoC-reused TAM as it is, the bandwidth of the TAM is reduced by half. This is because the test pattern (TP) data and the TR data of a primary core are transferred in one direction. Fig. 1(a) shows a simple diagram of the NoC-reused TAM in which the pipeline-based TAM is applied as it is. Cores connected to the routers are omitted in this figure. The width of a flit in the NoC is W. In order to transfer the TP and the primary core’s TR in the same direction, the TP uses W/2 of the bandwidth and the primary TR uses the other W/2. In the pipeline-based TAM, the spare output channels can be used to directly observe the response of another primary core. This allows two sets of cores to be compared in parallel, each with its own observable core [15]. Because Primary1 reuses only half of the bandwidth (W/2) of the test output pins, the remaining bandwidth (W/2) of the test output pins can be used for Primary2 to construct the two tracks. This can reduce the test time stochastically (the probability that both primary cores are faulty is expected to be lower than the probability that a single primary core is faulty). However, it reuses only half of the bandwidth of the test input pins and half of the bidirectional links in the NoC. Therefore, it requires more test time than a TAM that reuses the full bandwidth of the links in the NoC. In this paper, “pipeline-based TAM” refers to a dedicated TAM and “Pipeline” refers to the NoC-reused TAM that adopts the pipeline-based test scheme as it is.

Fig. 1. Simple diagram of the NoC-reused TAMs. (a) Pipeline (two tracks) [15]. (b) NRP-TAM [16].

0278-0070 © 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

The limited test bandwidth of Pipeline is overcome by the NRP-TAM [16]. NRP-TAM is a parallel TAM specialized for multiple identical cores in an NoC-based system. In order to use the full bandwidth of the NoC interconnection as the width of the TAM, a new deterministic routing algorithm for transferring test data was designed. Fig. 1(b) represents this TAM. If a link is in use for transferring the TP, the primary core’s TR is transferred by bypassing that link. NRP-TAM can test the homogeneous cores in an NoC efficiently, but it has the same drawback as the pipeline-based TAM: if the primary core has a fault, additional test time is required for testing the other cores.

Applying the majority-based TAM to an NoC-reused TAM promises to overcome the limitations above.

B. Majority-Based TAM

The majority-based TAM [17] has intuitive and clear fundamentals. In a multicore system, multiple identical cores can be tested in parallel using broadcast TPs. If there are no faults, the TR data can be predicted to be the same for all of the cores. Expecting that most cores will not have faults, the TAM analyzes the TR data and finds the MV. A core that produces TR data different from the MV can then be considered faulty. The MV is in turn tested by the ATE to determine whether it matches the expected value; a mismatch indicates a fault. When the MV is equal to the expected data, more than half of the cores are not faulty. If the TR data of a core is different from the MV, that core is recorded in the error registers as a faulty core. When the MV is different from the expected data, more than half of the cores are faulty. In this case, the TAM may produce a wrong diagnosis: the nonfaulty cores would be recorded in the error registers instead, but such a multicore chip would be discarded anyway. If an exact diagnosis is required even when more than half of the cores have faults, the test process is repeated with the nonfaulty cores.
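The decision rule described above can be illustrated with a short software sketch. The Python fragment below is only an illustration under stated assumptions, not the hardware of [17]: the function names `majority_value` and `diagnose` are hypothetical, TR data is modeled as equal-length bit strings, and a bit of the MV is taken to be 1 only when strictly more than half of the cores output 1.

```python
def majority_value(responses):
    """Bitwise majority value (MV) over equal-length TR bit strings.

    Assumption: a bit of the MV is 1 only when strictly more than half
    of the cores produced 1 at that position; a tie yields 0.
    """
    n_cores = len(responses)
    width = len(responses[0])
    mv_bits = []
    for position in range(width):
        ones = sum(1 for tr in responses if tr[position] == "1")
        mv_bits.append("1" if ones > n_cores / 2 else "0")
    return "".join(mv_bits)


def diagnose(responses, expected):
    """Return (MV, indices of cores that differ from the MV, MV == expected).

    The index list plays the role of the error registers. When the last
    element is False, more than half of the cores are faulty and the
    recorded indices are not trustworthy.
    """
    mv = majority_value(responses)
    suspects = [i for i, tr in enumerate(responses) if tr != mv]
    return mv, suspects, mv == expected
```

For example, with four cores whose responses are `"1010"`, `"1010"`, `"1000"`, and `"1010"` and an expected value of `"1010"`, the sketch reports the third core (index 2) as faulty and confirms that the MV matches the expected data.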

Fig. 2. Architecture of the proposed NoC-reused TAM.

Consequently, with the majority-based TAM scheme, all of the cores can be tested simultaneously using the same test pins and test time as those required for testing a single core. The test methodology with the proposed NoC-reused TAM is described in this paper using the scan test. Because the TAM is only concerned with transferring test data, it can be extended to scan, functional, and other tests for multiple identical cores.

III. PROPOSED NOC-REUSED TAM

A typical structure of an NoC system with the proposed NoC-reused TAM is shown in Fig. 2. Detailed descriptions of the TAM are presented in the following subsections.

A. Architecture

Fig. 2 represents a typical structure of an NoC system with the NoC-reused TAM which adopts the majority-based TAM scheme. It reuses the buffers at the inputs of the routers. Some multiplexers and comparators (XOR gates) are added; however, the comparators can be shared with the testing of the routers. In addition, a bitwise counter is designed for analyzing the MV. It is assumed that Router0 is the source and sink node. The external ATE sends the TP data to Router0, receives the MV of the cores from Router0, and confirms whether the response data is identical to the expected data. Router0 transfers the TP data from the ATE to Core0 and, at the same time, to Router1. One input buffer in Router1 is reused as the pipelining register for the TP data. The red lines indicate the transmission of the TP data, which is transferred from Router0 to Router1 and Router4 according to the proposed routing algorithm. The link between Router0 and Router1 is shared by the TP data and the majority counting data through time division. The majority counting data is the number of “1”s in the TR data, which is used for analyzing the MV. Therefore, the green lines represent the transmission of both the TP data and the majority counting data. Each router has a majority analyzer, which uses bitwise transfer-counters to maximize the efficiency of its process. The blue lines indicate the transmission of the complete MV, which is transferred from Router5 to Router4 and from Router1 to Router0.

Fig. 3. Routing results of the proposed NoC-reused TAM.

All routers and cores can transfer and receive the TP data, the majority counting data, and the MV in the same way. Each router then compares the TR data of its core with the MV. As a result, each router requires additional buffering to adjust the timing. This buffering reuses the input buffer between the core and the switch box.

B. Routing Algorithm

The proposed TAM uses shortest-path routing to transmit the TP data, the MV, and the majority counting data. It reuses the bidirectional links in the NoC to provide abundant TAM bandwidth. In order to transmit the TP data and the counting data for analyzing the MV simultaneously, the TAM utilizes the difference between the speed of the scan shift cycle and the speed of the transmission cycle. Typically, the speed of the transmission cycle in an NoC is higher than 600 MHz, whereas the speed of the scan shift cycle is about 15 MHz [18]. Therefore, the proposed TAM can transmit enough counting data per TR data. Fig. 3 represents a 4 × 3 NoC system and the routing according to the proposed algorithm. Cores connected to the routers are omitted.

TP data is transmitted from Router0 (the first router) to the other routers. One scan shift cycle after a router receives the TP data, its core outputs the TR data and the router generates the majority counting data. The majority counting data is transmitted to the next router. Router11 (the last router) receives all of the majority counting data and generates the MV. The MV is transmitted from Router11 to the other routers. With this routing algorithm, each router receives the MV one scan shift cycle after its core outputs the TR data. In order to compare the MV with the TR of its core, the router requires one buffer (detailed timing information is given in Fig. 5). Because all of the routers in an NoC are routed in the same way, the depth of the buffering is always one clock, regardless of the number of cores. Therefore, all cores in the NoC can be tested concurrently with the pipelined test data.

C. Majority Analyzer

Each router in the proposed NoC-reused TAM has a majority analyzer for analyzing the MV. It counts the number of 1s in the TR data of each core. Bitwise transfer-counters are designed to maximize the number of testable cores, and the simplified architecture helps to reduce the operation time and hardware overhead. The transfer-counter is a newly designed counter architecture that counts and transfers the data simultaneously. Fig. 4 shows the architecture of the majority analyzer and how it uses transfer-counters. For 4 × 3 cores, three bits are required to determine whether the number of 1s in the TR data is larger than half of the total number of cores (2^3 > 12/2).

When the least significant bit of the majority counting data is transferred from the previous router, this bit is added to the TR data of the core by the adder. Thus, the majority counting data is increased only when the bit of the TR data is 1. The result of the bitwise sum is transferred to the next router. The next bit of the majority counting data is then transferred from the previous router, and this bit is added to the preceding carry bit by the adder. This sum is transferred to the next router as well. After the same process takes place for the most significant bit, all of the majority counting data of the previous routers has been counted and transferred. At the last router, the calculated majority counting data is compared with half of the total number of cores, and the MV is 1 when the majority counting data is larger than this half.

Fig. 4. Architecture of majority analyzer.

Fig. 5. Timing diagram of the proposed NoC-reused TAM.
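The bit-serial counting described above can be mimicked in software. The following Python sketch is a behavioral illustration only, written under several assumptions: the names `router_stage`, `count_ones_serially`, and `majority_bit` are hypothetical, each router is modeled as a chain of half-adders operating on a count that arrives least significant bit first, and the counter is made wide enough to hold the full count (four bits for 12 cores) rather than the three-bit counter mentioned in the text, whose overflow handling is not spelled out there.

```python
def router_stage(count_bits, tr_bit):
    """Model one router: add the core's TR bit to a bit-serial count.

    count_bits is the incoming majority counting data, least significant
    bit first. Each output bit is forwarded as soon as it is computed,
    so counting and transferring overlap.
    """
    carry = tr_bit  # the TR bit is added to the least significant bit
    forwarded = []
    for bit in count_bits:
        forwarded.append(bit ^ carry)  # half-adder sum, sent to next router
        carry = bit & carry            # half-adder carry, kept for next bit
    return forwarded


def count_ones_serially(tr_bits, width):
    """Pass a zero count through one router stage per core."""
    count_bits = [0] * width
    for tr_bit in tr_bits:
        count_bits = router_stage(count_bits, tr_bit)
    return sum(bit << i for i, bit in enumerate(count_bits))


def majority_bit(tr_bits):
    """Last router: MV is 1 when the count exceeds half of the cores."""
    width = len(tr_bits).bit_length()  # assumption: no overflow allowed
    ones = count_ones_serially(tr_bits, width)
    return int(ones > len(tr_bits) / 2)
```

Because each stage only needs the current bit and one carry, a router can forward bit k while the previous router is already producing bit k + 1, which is the property the transfer-counter relies on.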

D. Timing Analysis

To represent the pipelined test data and the concurrent test process of the proposed NoC-reused TAM, a timing diagram of the test data in the TAM is shown in Fig. 5. TP data is transferred from Core0 to Core1, and from Core1 to Core2 (TP1 is the first TP and TP2 is the second TP). After one scan shift cycle, Core1 produces the TR data, and the majority counting data (mcnt) is calculated and transmitted according to the transmission cycle. The MV of all cores can be analyzed in the same way, and Core1 receives the MV from the last core (Core3 in this case) after one more scan shift cycle. In order to compare the TR data and the MV, the TR data must be buffered for one scan shift cycle.

$$\frac{\text{Transmission clock}}{\text{Scan shift clock}} \geq \left(\log_2 \frac{\text{\# of cores}}{2} + 1\right) + \text{\# of cores in row} + \text{\# of cores in column}. \tag{1}$$

The proposed TAM utilizes the difference between the speed of the scan shift cycle and the speed of the transmission cycle. Equation (1) represents the relation between the speed of the cycles and the number of testable cores. The left side indicates the number of transmission clocks available per scan shift clock. The right side is composed of the number of bits for counting the MV and the length of the longest transmission path. In order to transfer and count the majority counting data simultaneously, transfer-counters are used [if a router counted all bits of the majority counting data and then transferred them to the next router, the calculation on the right side of (1) would be a multiplication instead of an addition]. Given that the speed of the transmission cycle in an NoC is 600 MHz and the speed of the scan shift cycle is 15 MHz, the proposed TAM can test 256 cores concurrently. As a result, a large number of cores can be tested with the same test time as that of a single core.

Fig. 6. Expected test time of NoC-reused TAMs (N = 12 and G = 10).
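As a quick numerical check of (1), the sketch below evaluates both sides for a given mesh. It is an illustration under assumptions: the function names are hypothetical, the logarithm is rounded down so that the 4 × 3 case yields the three counter bits stated in Section III-C, and the 256 cores are assumed to be arranged as a 16 × 16 mesh, which the text does not state explicitly.

```python
import math


def counter_bits(n_cores):
    """Bits needed to decide whether more than half of the cores output 1.

    Assumption: the log term of (1) is rounded down, which gives three
    bits for 12 cores as in the 4 x 3 example.
    """
    return math.floor(math.log2(n_cores / 2)) + 1


def cycles_needed(rows, cols):
    """Right side of (1): counter bits plus the longest path (rows + cols)."""
    return counter_bits(rows * cols) + rows + cols


def fits(transmission_hz, scan_shift_hz, rows, cols):
    """True when the clock ratio on the left side of (1) is sufficient."""
    return transmission_hz / scan_shift_hz >= cycles_needed(rows, cols)
```

Under these assumptions, a 16 × 16 mesh needs 8 + 16 + 16 = 40 transmission cycles per scan shift cycle, which equals the ratio 600 MHz / 15 MHz = 40, so `fits(600e6, 15e6, 16, 16)` holds with no slack; this agrees with the 256-core figure quoted above.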

IV. EXPERIMENTAL RESULTS

Several experiments were performed to verify the effectiveness of the proposed NoC-reused TAM. The experimental results include comparisons of the proposed TAM with the previous NoC-reused TAMs that adopted the pipeline-based TAM scheme. These were implemented with a general NoC [19] and synthesized with the Synopsys 90-nm generic library [20] in order to analyze the proposed TAM in real NoC systems.

A. Test Time

One advantage of the majority-based test scheme is that it can test multiple identical cores in the same amount of time as one core. In Fig. 6, the expected test time of the proposed NoC-reused TAM is always 1T (T: time for testing one core). On the other hand, both the dedicated TAM and the NoC-reused TAMs that adopt the pipeline-based test scheme require additional test time depending on the yield of the core. Giles et al. [15] analyzed the expected test time of the pipeline-based TAM. The expected test time necessary to determine the pass/fail status of each core (up to the decision to exit according to the deration policy) may be calculated as a function of the per-core yield, Y. The proposed TAM and NRP-TAM can reuse the full bandwidth of the bidirectional NoC links as the TAM, but Pipeline reuses one directional link for both the TP data and the TR data of the primary core. When the width of the TAM is halved, the test time tends to nearly double [16]. The rest of the output channels in Pipeline can be reused to form two tracks. Therefore, the experimental result of Pipeline in Fig. 6 is double the expected test time of the pipeline-based TAM with two tracks [9].

Fig. 6 represents a case in which at least ten of the 12 cores are good, but the NoC-reused TAMs have consistent expected test times in various other cases as well [15], [17]. While the expected test time of the majority-based TAM is always 1T, the expected test time of the pipeline-based TAM increases rapidly with decreasing yield. Multiple tracks in the pipeline-based TAM can reduce the expected test time, but they are applicable only under particular conditions (asymmetric channels at test inputs and test outputs) and, above all, they require more test time than a single track when the yield of the core is good.

TABLE I
HARDWARE OVERHEAD OF THE PROPOSED TAM IN 4 × 3 NOC

TABLE II
HARDWARE OVERHEAD OF THE PROPOSED TAM IN 4 × 3 × 2 NOC

B. Hardware Overhead

NoC-reused TAMs were designed in RTL code and synthesized in order to compare the hardware size. Table I shows the hardware overhead of the NRP-TAM and the proposed TAM in a 4 × 3 (2-D) NoC, and Table II shows the case of a 4 × 3 × 2 (3-D) NoC. The hardware of the proposed TAM is added per router, so it can be extended to various numbers of routers. Therefore, Tables I and II show similar tendencies regardless of the number of routers or their dimensions. The hardware size of the proposed TAM with an NoC is expressed as the number of NAND gates. Since the NoC-reused TAMs reuse the input buffers as pipelining registers and for additional buffering, the hardware overhead of the TAMs decreases as the number of buffers in the original NoC increases. Furthermore, the hardware overhead increases with a larger flit size in the NoC, which is the width of the TAM.

The hardware overhead of the proposed NoC-reused TAM is less than 5% in the worst case of the experiments. The remaining components are similar, but the proposed TAM has more hardware overhead than the NRP-TAM due to the majority analyzers. However, considering that a modern multicore processor system contains far more than a million gates, the hardware overhead of the proposed TAM in a chip is negligible.

Consequently, the proposed TAM can be implemented with minimal hardware overhead by reusing the NoC infrastructure, and it is the only NoC-reused TAM that can test multiple identical cores in an NoC using the same test time as required for testing a single core.

V. CONCLUSION

In this paper, a novel NoC-reused TAM for parallel testing of a multicore system is described. All of the cores can be tested simultaneously with the majority-based test strategy, and the test time is the same as that required for a single core. A majority analyzer with transfer-counters is designed, which uses the MV of the TR data to test multiple identical cores. The hardware overhead is minimized by reusing the infrastructure of an NoC. Experimental results show that the proposed NoC-reused TAM achieves a minimized test time with an abundant TAM width and negligible hardware overhead. The majority-based NoC-reused TAM is only concerned with the delivery of TR data, so it is compatible with, and can be improved by, existing DFT technologies.

REFERENCES

[1] International Technology Roadmap for Semiconductors (ITRS): 2013 Edition, Semicond. Ind. Assoc., Washington, DC, USA, 2013, pp. 26–30.

[2] R. Marculescu, U. Y. Ogras, L.-S. Peh, N. E. Jerger, and Y. Hoskote, “Outstanding research problems in NoC design: System, microarchitecture, and circuit perspectives,” IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 28, no. 1, pp. 3–21, Jan. 2009.

[3] E. Cota, A. M. Amory, and M. S. Lubaszewski, Reliability, Availability and Serviceability of Networks-on-Chip. New York, NY, USA: Springer, 2012.

[4] R. Nourmandi-Pour and N. Mousavian, “A fully parallel BIST-based method to test the crosstalk defects on the inter-switch links in NOC,” Microelectron. J., vol. 44, pp. 248–257, Mar. 2013.

[5] A. M. Amory, E. Briao, E. Cota, M. Lubaszewski, and F. G. Moraes, “A scalable test strategy for network-on-chip routers,” in Proc. IEEE Int. Test Conf., Austin, TX, USA, 2005, Art. ID 25.1.

[6] E. Cota et al., “A high fault coverage approach for the test of data, control, and handshake interconnects in mesh networks-on-chip,” IEEE Trans. Comput., vol. 57, no. 9, pp. 1202–1215, Sep. 2008.

[7] D. Xiang and Y. Zhang, “Cost-effective power-aware core testing in NoCs based on a new unicast-based multicast scheme,” IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 30, no. 1, pp. 135–147, Jan. 2011.

[8] D. Xiang, “A cost-effective scheme for network-on-chip router and interconnect testing,” in Proc. IEEE Asian Test Symp., Jiaoxi Township, Taiwan, 2013, pp. 207–212.

[9] E. J. Marinissen and Y. Zorian, “IEEE Std 1500 enables modular SoC testing,” IEEE Des. Test. Comput., vol. 26, no. 1, pp. 8–17, Jan./Feb. 2009.

[10] V. Iyengar, K. Chakrabarty, and E. J. Marinissen, “Test wrapper and test access mechanism co-optimization for system-on-chip,” J. Electron. Test., vol. 18, no. 2, pp. 213–230, 2002.

[11] B. Noia, K. Chakrabarty, and E. J. Marinissen, “Optimization methods for post-bond testing of 3D stacked ICs,” J. Electron. Test. Theory Appl., vol. 28, pp. 103–120, Feb. 2012.

[12] R. Michael and K. Chakrabarty, “Optimization of test pin-count, test scheduling, and test access for NoC-based multicore SoCs,” IEEE Trans. Comput., vol. 63, no. 3, pp. 691–702, Mar. 2014.

[13] I. Parulkar, T. Ziaja, R. Pendurkar, A. D’Souza, and A. Majumdar, “A scalable, low cost design-for-test architecture for UltraSPARC chip multi-processors,” in Proc. IEEE Int. Test Conf., Baltimore, MD, USA, 2002, pp. 726–735.

[14] L. Zhang, Y. Han, Q. Xu, X. Li, and H. Li, “On topology reconfiguration for defect-tolerant NoC-based homogeneous manycore systems,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 17, no. 9, pp. 1173–1186, Sep. 2009.

[15] G. Giles, J. Wang, A. Sehgal, K. J. Balakrishnan, and J. Wingfield, “Test access mechanism for multiple identical cores,” in Proc. IEEE Int. Test Conf., Santa Clara, CA, USA, 2009, pp. 1–10.

[16] T. Han, I. Choi, H. Oh, and S. Kang, “A scalable and parallel test access strategy for NoC-based multicore system,” in Proc. IEEE Asian Test Symp., Hangzhou, China, 2014, pp. 81–86.

[17] T. Han, I. Choi, and S. Kang, “Majority-based test access mechanism for parallel testing of multiple identical cores,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 23, no. 8, pp. 1439–1447, Aug. 2015.

[18] E. Salminen, A. Kulmala, and T. D. Hämäläinen, “Survey of network-on-chip proposals,” White Paper, OCP-IP, 2008, pp. 1–13.

[19] (Apr. 8, 2014). Efficient Microarchitecture for Network-on-Chip Routers. [Online]. Available: http://purl.stanford.edu/wr368td5072

[20] (Mar. 26, 2012). 90 nm Generic Library. [Online]. Available: http://www.synopsys.com/Community/UniversityProgram

