Functional Safety Design and Quality concerns
for automotive semiconductors
with some recommendations from an OEM
14th November, 2018
(C) Copyright NISSAN MOTOR CO., LTD. 2018 All rights reserved.
Functional Safety Design Flow in ISO26262
Roles: OEM / Component Supplier

Hazard analysis → ASIL definition
  S: severity
+ E: exposure
+ C: controllability
= ASIL (A, B, C, D)
ASIL (integrity level rises from A to D)

                      A            B           C           D
PMHF                  <1000 FIT    <100 FIT    <100 FIT    <10 FIT
Failure detection
  SPFM                -            ≥90%        ≥97%        ≥99%
  LFM                 -            ≥60%        ≥80%        ≥90%
- System Design
- Setting of Safety Goal
- Allocation of Safety Function
Component Development
Safety integrity level
System structure
ASIL requirement
- ASIL: Automotive Safety Integrity Level
- PMHF: Probabilistic Metric for random Hardware Failures
- SPFM: Single-Point Fault Metric
- LFM: Latent Fault Metric
Failure Mode & Effect Analysis
PMHF, SPFM/LFM
Safety Mechanism Design (HW & SW)
Reliability Improvement (HW)
ASIL Achievement
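The ASIL targets in the table above can be checked mechanically. The following is a minimal sketch, not a qualified tool: the target values are transcribed from the slide, while the SPFM and LFM formulas are the ones commonly quoted from ISO 26262-5 and are an assumption here, since the slides do not state them. All function and parameter names are hypothetical.

```python
# Targets transcribed from the slide's table; None means no target is given.
ASIL_TARGETS = {
    "A": {"pmhf_fit": 1000.0, "spfm": None, "lfm": None},
    "B": {"pmhf_fit": 100.0, "spfm": 0.90, "lfm": 0.60},
    "C": {"pmhf_fit": 100.0, "spfm": 0.97, "lfm": 0.80},
    "D": {"pmhf_fit": 10.0, "spfm": 0.99, "lfm": 0.90},
}


def spfm(total_fit, single_point_fit, residual_fit):
    """Single-point fault metric (assumed formula, see lead-in).

    All arguments are failure rates in FIT for the safety-related hardware.
    """
    return 1.0 - (single_point_fit + residual_fit) / total_fit


def lfm(total_fit, single_point_fit, residual_fit, latent_fit):
    """Latent fault metric (assumed formula, see lead-in)."""
    return 1.0 - latent_fit / (total_fit - single_point_fit - residual_fit)


def meets_asil(asil, pmhf_fit, spfm_value, lfm_value):
    """Return True if all targets listed for the given ASIL are met."""
    target = ASIL_TARGETS[asil]
    ok = pmhf_fit < target["pmhf_fit"]
    if target["spfm"] is not None:
        ok = ok and spfm_value >= target["spfm"]
    if target["lfm"] is not None:
        ok = ok and lfm_value >= target["lfm"]
    return ok


# Hypothetical component: 1000 FIT total, 2 FIT single-point,
# 3 FIT residual, 50 FIT latent multiple-point faults.
example_spfm = spfm(1000.0, 2.0, 3.0)
example_lfm = lfm(1000.0, 2.0, 3.0, 50.0)
print(meets_asil("D", 8.0, example_spfm, example_lfm))
```

The PMHF value is passed in as an input because the slides do not describe how it is computed.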
ASIL Design in Microcontrollers
Function A
System layer
Function B
Function C
Component layer
Main control(Function A)
ASIL D
Safety mechanisms are required in microcontrollers that control important safety-related functions
such as driving, turning and stopping
Allocation of Safety Function
Sub control(Function C)
Reliability and Safety Design to satisfy the PMHF/SPFM/LFM
required for each ASIL
- Failure rate reduction based on FMEA
- Proper choice of safety mechanism
ASIL D: PMHF < 10 FIT, SPFM ≥ 99%, LFM ≥ 90%
ASIL B: PMHF < 100 FIT, SPFM ≥ 90%, LFM ≥ 60%
Depending on the system requirements, the most severe level, ASIL D, is imposed on the
microcontroller itself.
Design step and Requirement level in ISO26262
ASIL D
ASIL B
ASIL C
ASIL B
Safety Mechanism Implementation

CPU core
- SW core self-test diagnostics (SW)
- Lockstep Cores
- MISR (Multiple-Input Signature Register)
- Single-Core Optimized Tightly Coupled Fault Supervisor Configuration

Memory & Memory bus
- Address Parity/EDC
- Data ECC
- Fail Safe Guard for memory protection
- Arbiter error detection

Infrastructures
- Triple Modular Redundant register (TMR)
- Error Control Module (ECM)
- Clock Monitor
- WDT
- CRC
- Field-BIST for digital logic and memories

System Bus & Peripherals
- Address Parity/EDC
- Data Parity/EDC
- Fail Safe Guard for peripherals
- Arbiter error detection
- Self-diagnostics (HW+SW)
- A/D test
System Safety Goal
Safety Mechanism Implementation Tools
Software
- Basic Software (BSW)
- Self-Test Software Library
- Fault Injection with HW (ICE)

Application Design

Software Development
- Organization
- Process
- Tools (e.g. compiler)

Safety Manual / Application Note
- Recommended Usage
- Failure Control Method
- Software Test Description (test flow / fault injection case)

ISO26262 Certification Evidence

Automotive microcontroller/SoC vendors provide OEMs and system vendors with various solutions for smooth ISO26262 compliance
Differences between automotive & consumer semiconductors
Comparison axes: field failure rate, reliability design, production (inspection)

Automotive
■ Target: "Zero Defect" (AEC-Q004)
■ Reliability design: AEC-Q100
  ・HTOL/HTS/THS/... ≥ 1,000 hr
■ Design of test coverage: AEC-Q100
  ・Indication for each type of circuit
■ Functional safety: ISO26262
  ・Functional safety such as dual monitoring ("dual watch")
■ Core tools: APQP, PPAP
  ・Definition of the quality approach for automotive
■ Inspection in process
  ・HT/LT test, visual test, X-ray, ...
  ・100% test
■ Screening
  ・Burn-in, HT/LT stress test, thermal shock, ...

Consumer
■ Target: N/A (approx. ~200 ppm)
■ Reliability design at a general level
  ・HTOL/HTS/THS/... < 1,000 hr
■ Inspection at a general level
  ・Only RT test
  ・Sampling test (no screening)
AEC: Automotive Electronics Council
HTOL: High Temperature Operating Life
HTS: High Temperature Storage
THS: High Temperature and High Humidity Storage

■ Higher reliability and production quality in automotive
■ Zero Defect is defined in AEC-Q004
Example of product categories
■ IEC 60749-43:2017
Semiconductor devices – Mechanical and climatic test methods –Part 43: Guidelines for IC reliability qualification plans
Categories: Automotive Use A / Automotive Use B / Consumer use

Examples of application
  Automotive Use A: Powertrains, brakes, driving support systems, airbags
  Automotive Use B: Navigation systems, car air conditioners, audio systems
  Consumer use: Home electronics, toys, appliances

Annual operating hours
  Automotive Use A: 500 h (driving hours); differs depending on whether or not the device works with KEY ON/OFF
  Automotive Use B: 500 h (driving hours)
  Consumer use: Up to 8760 h; differs among applications

Useful life
  Automotive Use A: 15 years (cumulative failure probability: 0.1 %)
  Automotive Use B: 15 years (cumulative failure probability: 0.1 %)
  Consumer use: Up to 10 years (cumulative failure probability: 0.1 %); differs among applications

Early failure rate
  Automotive Use A: 1 × 10^-6 or below per annum
  Automotive Use B: 50 × 10^-6 or below per annum
  Consumer use: Up to 500 × 10^-6 per annum; differs among applications
“Zero Defect” AEC-Q004
Quality concerns and recommendations for automotive semiconductors
1. Many early failures and sudden failures
 1) Insufficient screening coverage → Enhancement in screening capability
 2) Problems caused by foundry management
  - Poor communication in development phase → Enhance foundry management
  - Concerns about quality control after SOP → Continuous PDCA loop after SOP
2. Problems with analysis response
  - Difficulty in taking countermeasures because the root cause is unknown → Complete clarification of the root cause
  - Long time taken to clarify the root cause → Shorten lead time to analysis
Issues found in field failure units: early failures and sudden failures
*Result of Nissan in-house production (2016-2018)
■ 79% of field failures were caused by semiconductor breakdown
■ Many early and sudden failures occurred (within about 12 months of car delivery, especially within the first 3 months)
Recommendations:
・Enhancement in screening capability
・Foundry management
・PDCA after SOP
Foundry management
Fabless company
  Front-end (Foundry A): Wafer processing → TEG inspection → Wafer inspection (PAT)
  Back-end (Foundry B): Molding → Electrical screening, FCT (RT/HT/LT)

■ Poor communication in development phase (lack of FMEA and DRBFM review)
■ Concerns about quality monitoring and outflow prevention after SOP
Weak quality follow-up (unknown process conditions)

Recommendations:
・Enhancement of foundry control (FMEA, DRBFM)
・Continuous PDCA after SOP
Problem caused by analysis response
■ Difficulty in completely clarifying the root cause of failed semiconductors

Clarified ratio
  Company A: 7%
  Company B: 75%
  Company C: 90%

Chart: months to take permanent countermeasures (reproduction complete = month 0; axis from 1 to 12 months), by chip maker and failure mode

Chip maker   Failure mode
X            Abnormal formation of oxide layer
X            Wrong connection of contact via hole
X            Abnormal insulation layer
X            Wrong wire bonding loop
X            W/B pad corrosion
Y            Wrong connection of contact via hole
Y            Contamination in silicide process
Y            Lattice defect
Z            Contamination on wafer backside
Z            LED die crack
Z            Burn out (die damage)
Z            Short by terminal burr

■ Clarifying the root cause takes a long time, even after repeated failures
Improvement of analysis response
1) Clarification of the root cause through reliable analysis based on FTA
   ✓ Estimation of the root cause from FTA
   ✓ Verification of FTA factors against the failure phenomenon
   Analysis techniques: FIB, OBIRCH
2) Shorten lead time to analysis
End
Appendix
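Lockstep cores
Two processor cores on a single die run with synchronized clocks and perform the same processing. A comparator circuit compares the two results, and the result is used only when they match.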
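The lockstep principle described above can be illustrated in software. This is only a behavioral sketch under stated assumptions: real lockstep cores compare in hardware, cycle by cycle, and the names `lockstep_execute`, `healthy_core` and `faulty_core` are hypothetical.

```python
class LockstepMismatch(Exception):
    """Raised when the two cores produce different results."""


def lockstep_execute(core_a, core_b, operand):
    """Run the same operation on both cores and compare the results.

    The result is returned (committed) only if both cores agree.
    """
    result_a = core_a(operand)
    result_b = core_b(operand)
    if result_a != result_b:
        raise LockstepMismatch(f"core A={result_a!r}, core B={result_b!r}")
    return result_a


def healthy_core(x):
    """Stand-in for a correctly working core."""
    return (x * 3 + 1) & 0xFFFF


def faulty_core(x):
    """Same computation with one result bit flipped (injected fault)."""
    return healthy_core(x) ^ 0x0004


print(lockstep_execute(healthy_core, healthy_core, 7))
try:
    lockstep_execute(healthy_core, faulty_core, 7)
except LockstepMismatch as err:
    print("fault detected:", err)
```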
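MISR: Multiple Input Signature Register
Accepts the test responses of multiple scan chains simultaneously and compresses long serial data along the time axis into a multi-bit sequence. Used in BIST as the compaction circuit for test responses.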
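A MISR can be modeled as a linear feedback shift register whose state is XORed with one parallel input word per clock. The sketch below is one common variant and is not taken from any particular device: the 8-bit width, the tap mask `0xB8` and the zero seed are assumptions chosen for illustration.

```python
def misr_step(state, inputs, taps, width):
    """Advance the MISR by one clock.

    The register shifts left by one bit, the parity of the tapped bits is
    fed back into bit 0, and the parallel input word is XORed into the state.
    """
    mask = (1 << width) - 1
    feedback = bin(state & taps).count("1") & 1
    shifted = ((state << 1) | feedback) & mask
    return shifted ^ (inputs & mask)


def misr_signature(responses, taps=0xB8, width=8, seed=0):
    """Compress a sequence of parallel test responses into one signature."""
    state = seed
    for word in responses:
        state = misr_step(state, word, taps, width)
    return state


# Hypothetical test responses, one word per clock.
good_responses = [0x3C, 0xA5, 0x00, 0xFF, 0x12]
bad_responses = list(good_responses)
bad_responses[2] ^= 0x10  # single-bit fault in one response word

print(misr_signature(good_responses) != misr_signature(bad_responses))
```

In BIST the final signature is compared with a known-good reference value. Because the signature is shorter than the response stream, some multi-bit error patterns can alias to the good signature.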
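Single-Core Optimized Tightly Coupled Fault Supervisor Configuration
Single-core tightly coupled scheme: a diagnostic circuit that monitors internal signals is combined with the processing core, so that comparison and self-diagnosis run automatically. Compared with the conventional dual-core scheme, this reduces the amount of hardware and software.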
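Triple Modular Redundant register (TMR)
A redundant configuration in which a circuit module is triplicated. A majority vote over the three outputs yields the correct result even if one module fails. Comparing the outputs with each other also enables fault detection.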
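The majority vote and the output comparison can be expressed in a few lines. This is a minimal sketch that treats each module output as an integer register value; the function name `tmr_vote` is hypothetical.

```python
def tmr_vote(a, b, c):
    """Bitwise majority vote over three redundant register values.

    Returns the voted value and a flag that is True when the three
    copies do not all agree, i.e. a fault was detected.
    """
    voted = (a & b) | (a & c) | (b & c)
    fault_detected = not (a == b == c)
    return voted, fault_detected


# One copy has a flipped bit; the vote still returns the correct value.
print(tmr_vote(0b1010, 0b1010, 0b1110))
```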
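Arbiter error detection
Detects arbitration errors. When memory is accessed, each processor sends a request to the arbitration unit (arbiter), which decides which processor is granted use of the common bus.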
WDT: Watch Dog Timer
Monitors normal system operation: the timer is reset whenever a clock input arrives within the configured timer period.
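The watchdog behavior can be sketched as follows. This is a simplified software model, not a description of any specific hardware: time is passed in explicitly as integer milliseconds so the example is deterministic, and the class and method names are hypothetical.

```python
class WatchdogTimer:
    """Software model of a watchdog with a fixed timeout."""

    def __init__(self, timeout_ms):
        self.timeout_ms = timeout_ms
        self.last_kick_ms = 0

    def kick(self, now_ms):
        """Reset the timer; called by a system that is running normally."""
        self.last_kick_ms = now_ms

    def expired(self, now_ms):
        """True if no kick arrived within the timeout period."""
        return (now_ms - self.last_kick_ms) > self.timeout_ms


wdt = WatchdogTimer(timeout_ms=100)
wdt.kick(now_ms=0)
print(wdt.expired(now_ms=50))   # kicked recently
print(wdt.expired(now_ms=250))  # no kick for 250 ms
```

In a real system an expired watchdog would typically trigger a reset or an error signal rather than return a flag.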
EDC (Error Detection Code)
Works like a checksum to check whether an error is present. If an error is found, it is corrected using the correction code held in the ECC.
ECC (Error Correction Code)
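The difference between detection and correction can be shown with two textbook codes. The sketch below uses a single even-parity bit as the detection code and a Hamming(7,4) code as the correction code. These are illustrative choices and an assumption here; the slides do not say which codes the microcontrollers actually use.

```python
def even_parity(bits):
    """Parity bit that makes the total number of ones even (detection only)."""
    return sum(bits) % 2


def hamming74_encode(data):
    """Encode 4 data bits into a 7-bit Hamming codeword (positions 1..7)."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers positions 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers positions 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct up to one flipped bit and return (data, corrected_flag)."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s4  # 1-based position of the bad bit
    if syndrome:
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]], bool(syndrome)


word = hamming74_encode([1, 0, 1, 1])
word[4] ^= 1  # inject a single-bit error
print(hamming74_decode(word))
```

A single parity bit only reveals that an odd number of bits changed, whereas the Hamming syndrome also locates a single flipped bit so that it can be corrected.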
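ECM (Error Control Module)
A module that generates interrupts or internal reset signals in response to error signals raised by various error sources and monitor circuits.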
Fail safe guard for RAM protection
A mechanism that prevents RAM data from being overwritten by a runaway CPU.
FMEDA: Failure Modes, Effects and Diagnostic Analysis
FIT
A unit of failure rate: the number of failures per 10^9 hours.
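As a worked example of the unit, the sketch below converts FIT to a per-hour rate and estimates expected failures in a fleet. The 10 FIT figure is the ASIL D PMHF target and 500 h is the annual driving time quoted earlier in the slides; the fleet size of one million units is a hypothetical value, and a constant failure rate is assumed.

```python
FIT_HOURS = 1e9  # 1 FIT = 1 failure per 10^9 device-hours


def fit_to_rate_per_hour(fit):
    """Convert a failure rate in FIT to failures per device-hour."""
    return fit / FIT_HOURS


def expected_failures(fit, units, hours_per_unit):
    """Expected number of failures, assuming a constant failure rate."""
    return fit_to_rate_per_hour(fit) * units * hours_per_unit


# 10 FIT, one million units (hypothetical), 500 operating hours each per year
print(round(expected_failures(10, 1_000_000, 500), 3))
```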