Date post: | 06-May-2015 |
Category: |
Technology |
Upload: | infinit-innovationsnetvaerket-for-it |
07/03/14
Risk Based QA
Michael Agerkvist Petersen [email protected]
dk.linkedin.com/in/michaelagerkvist
Michael Agerkvist Petersen
• QA @ Radiometer Medical • 18+ years with Medical Devices – HW development – SW development – Project Management – Process Improvement – QA
• Owner of MDCA (Medical Device Compliance Assistance) – Spare-time job :)
Risk Based QA
• Based on my experience from: – Making a class C Medical Device at Novo Nordisk – Working with Encrypted Pin Pads (ATM keyboards) at Cryptera
– Other projects
• This will not be a complete introduction to (safety) risk management or Risk Based Testing.
Agenda
• Introduction
• Risk Based Quality Assurance
  – Regulatory Risks – Process Rigor
  – Risk Based Documentation Rigor
  – Risk Based Testing
  – Proactive QA
• Examples
• Pitfalls
Introduction
Some definitions
• Quality: The totality of features and characteristics of a product or service that bear on its ability to satisfy stated or implied needs [ISO]
  – There are many different customers: Users, Organization, Regulatory…
• QA: Quality Assurance – many ways to implement this in practice
  – Ensuring that development is in compliance (with the defined process)
  – Doing the actual testing
  – “Anything” which helps ensure Quality for the different customers
Software today
• Today software controls more and more of our daily life.
• Software failures may negatively affect business, human health or even human life.
• This results in increasing public and regulatory demands for better products, and even more time pressure.
Regulatory Compliance
• More and more industries get regulated
• Some are more regulated than others
[Figure: regulation cycle: Accident -> Injury or other loss -> Public reaction -> New laws & regulations -> Regulated Industry]
[Figure, not to scale: degree of regulation for Nuclear, Flight, Medical Device, Military, Payment systems and Public Systems]
• The bar is raised over time
Testing Paradox
• Testing is a structured approach to reducing the number of defects, but it is the last activity before release, so it is often the first to be sacrificed.
• Testing is always a sample. You can never test everything, and you can always find more to test.
• A good test case is one that finds a defect – running a complete test suite without finding a single defect will not add much value.
• The problem with most systematic test methods, like black box methods (equivalence partitioning, boundary value analysis, cause-effect graphing, etc.), is that they generate too many test cases, many of which will never find a defect.
• However, in a regulated world it adds value to run tests which do not find defects – a “clean sheet” is needed to make a submission without too many questions.
Risk Based Approach
• A way to lessen the work load could be to test more in:
  – bad areas of the product.
  – the most important functional areas and product properties.
• And to test less in the other areas…
• But how to:
  – find the right areas?
  – prioritize other QA activities?
Risk Based Quality Assurance
Traditional QA
• Consists of:
  – Review
  – Dynamic Test
  – Static analysis
  – Templates
  – Source Code standards
  – Traceability analysis
  – Checklists
  – Etc.
• These activities are typically:
  – Time consuming
  – Passive
  – Always compromised due to schedule pressure
Balancing QA
Too little
• Safety risk
• Poor reliability
• Additional cost
• Project delay
• Regulatory failure
• Customer complaints
Too much
• Additional cost
• Project delay
• Under verification (in the more important areas)
• Increased maintenance
Result of poor QA
[Figure: consequences listed by severity, from high to low]
• Business cost: Liability and litigation – Recalls – Loss of company image – In-market updates – Reduction in performance – No regulatory approval – Project delay
• Customer cost: Death – Major Injury – Minor Injury – Security violation – Reduction in performance – Loss of data
Risk Based QA Approach
• Some QA activities will still be Passive
  – They are well known and needed
• Some QA activities will be more Proactive
  – We know we are going to release with defects, so why not try to mitigate the impact of specific types of defects in the design?
• All QA activities will be prioritized
  – Some defects or lack of process/documentation will do more harm than others. So why use the same effort and activities on every feature or entity?
• Use the risk based QA approach to
  – Find the most critical defects as early as possible at the lowest effort/cost
  – Find the most critical classes of defects and mitigate their impact in the design
  – Balance the rigor of process, documentation and design
Some risk definitions
• Harm: Physical injury or damage to the health of people, or damage to property or the environment.
  – From project delay (financial loss) to death (e.g. of user)
• Hazard: Potential source of harm
• Probability: Of a harm occurring.
  – Could correspond to how frequently the user uses the functionality.
• Severity: Measure of the possible consequence of a hazard
  – Customer cost: Loss of data -> Security violation -> Injury -> Death
  – Business: Project delay -> Recalls -> Liability & litigation
• Risk = Severity x Probability
[ISO 14971]
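The product rule can be sketched in a few lines of Python. This is only an illustration: the integer 1-5 scales, the function name `risk_score` and the hazards in the example are assumptions, not part of the slides.

```python
def risk_score(severity: int, probability: int) -> int:
    """Return Risk = Severity x Probability on assumed 1-5 integer scales."""
    for name, value in (("severity", severity), ("probability", probability)):
        if not 1 <= value <= 5:
            raise ValueError(f"{name} must be between 1 and 5, got {value}")
    return severity * probability


# Hypothetical hazards as (severity, probability) pairs.
hazards = {
    "patient data mix-up": (4, 2),
    "loss of old data": (1, 4),
    "delayed result": (2, 3),
}

# Rank the hazards so the highest risk is looked at first.
ranked = sorted(hazards, key=lambda h: risk_score(*hazards[h]), reverse=True)
for name in ranked:
    print(name, risk_score(*hazards[name]))
```

Only the relative order matters here; the absolute numbers carry no meaning on their own.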
Probability
• Probability of failure
  – Usage frequency (functions used several times a day vs once in a lifetime)
  – For Medical Device Software, the probability of SW defects = 100%, but the probability of Harm needs to take into account the probabilities from the entire chain of events
Probability Levels: Probability of Harm
Rating | Description | Definition
5 | Frequent | Constantly present
4 | Probable | Has been/will be reported, but no more than once per month
3 | Occasional | Has been/will be reported, but no more than once per year
2 | Remote | Has been/will be reported, but no more than once in the product's lifetime (~10 years)
1 | Improbable | Considered unlikely to occur
This is not science… And very difficult in real life… So do not spend too much time finding the right value
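One way to read the probability levels is as thresholds on an expected number of reported events per year. The sketch below assumes exactly that reading; the numeric cut-offs (12 per year, 1 per year, 1 per 10 years) are derived from the definitions in the table, while the function name `probability_rating` and the choice to treat a rate of zero as "Improbable" are assumptions.

```python
def probability_rating(events_per_year: float) -> int:
    """Map an estimated event rate to the 1-5 probability rating.

    Assumed reading of the table: "once per month" = 12 per year,
    "once in the product's lifetime" = 1 per 10 years = 0.1 per year.
    """
    if events_per_year < 0:
        raise ValueError("event rate cannot be negative")
    if events_per_year == 0:
        return 1  # Improbable: considered unlikely to occur
    if events_per_year <= 0.1:
        return 2  # Remote: at most once in ~10 years
    if events_per_year <= 1:
        return 3  # Occasional: at most once per year
    if events_per_year <= 12:
        return 4  # Probable: at most once per month
    return 5      # Frequent: constantly present


for rate in (0, 0.05, 0.5, 6, 100):
    print(rate, "->", probability_rating(rate))
```

In practice the rate itself is a rough estimate, so the rating should be treated as a relative ranking aid rather than a measurement.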
Severity Levels: Severity of potential Harm
# | Term | Description | Harm description (examples)
5 | Catastrophic | Results in immediate death of patient or user | Immediate death of person caused by explosion, fire or electrical shock
4 | Serious | Results in permanent impairment or critical injury that would require medical or surgical intervention to preclude irreversible impairment or damage | Incorrect medical treatment of patient due to patient data mix-up; body part damage (e.g. eye or fingers)
3 | Moderate | Results in temporary/reversible injury or temporary/reversible impairment requiring professional medical or surgical intervention | Incorrect or inadequate medical treatment
2 | Minor | Results in temporary injury or impairment not requiring professional medical intervention | Minor incorrect or inadequate medical treatment; delayed medical treatment, loss of sample/new sample required or no result; equipment damage; privacy violation (e.g. data leak)
1 | Negligible | Inconvenience or temporary discomfort | Unsatisfied user; loss of old data
Quantifying Severity & Probability
• Severity should be rated by considering the different viewpoints of the system’s stakeholders.
• The probability of impact can only be estimated indirectly, e.g. by evaluation of
  – frequency of use,
  – quality indicators like the complexity of the software itself,
  – the quality of the documentation
  – etc.
Quantifying Severity & Probability
• Do not spend too much time determining the exact values
• The relative scoring is the most important in order to identify the most critical parts.
• The scoring could be rather informal and based on a brainstorm
Note: For Medical Device Class B and Class C it is expected that the Risk Analysis is more elaborate than stated here
Risk levels
• Four levels (3 should be OK)
• Based on IEC 62304 safety class (A, B, C)
• Classification used for:
  – Requirements (i.e. some functions, e.g. Risk Control Measures, are more critical than others)
  – Structural elements, both design and actual implementation (i.e. elements implementing critical requirements)
[Figure: risk level scale from least critical to most critical: A, B, C1, C2]
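Plotting the risks
[Figure: risk matrix with Severity (low to high) on one axis and Probability (low to high) on the other; regions marked Low Risks, Medium Risks and High Risks, with the risk levels A, B, C1 and C2 plotted]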
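A risk matrix like this can be sketched as a small lookup from severity and probability to a risk level. The slides do not give the cell boundaries, so the score thresholds below (15, 10, 5) and the function name `risk_level` are purely hypothetical and would have to be agreed per project.

```python
def risk_level(severity: int, probability: int) -> str:
    """Bin a severity x probability score into A, B, C1 or C2.

    The thresholds are illustrative assumptions, not taken from the slides.
    """
    if not (1 <= severity <= 5 and 1 <= probability <= 5):
        raise ValueError("severity and probability must be in 1..5")
    score = severity * probability
    if score >= 15:
        return "C2"  # most critical
    if score >= 10:
        return "C1"
    if score >= 5:
        return "B"
    return "A"       # least critical


# Print the matrix: probability on the rows (high to low), severity on the columns.
for probability in range(5, 0, -1):
    row = [risk_level(severity, probability) for severity in range(1, 6)]
    print(probability, " ".join(f"{level:>2}" for level in row))
```

Printing the whole matrix makes it easy to review the boundaries with stakeholders before they are used to steer QA effort.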
Different risk perspectives
For a Medical Device:
• Safety: Freedom from unacceptable risks. Identifying and mitigating safety risks to patients and users.
• Effectiveness: Fulfil the medical claims, delivering value to the patient and users. Meeting the user’s needs. Correctness of the product, meeting its specifications.
• Customer satisfaction: Good user experience, good service, ease of use, free of defects, reliable.
• Regulatory: Being in compliance by meeting the Regulatory Expectations, including Safety, Efficacy and Customer satisfaction.
• Project: Meeting the organisation's expectations, including Quality, Safety, Effectiveness and Regulatory, but also timelines and other traditional project risks.
Different risk perspectives
[Figure: the risk perspectives: Safety, Effectiveness, Regulatory, Customer Satisfaction, Project risk]
Regulatory Risk – Process Rigor
Process Rigor
Common FDA Warning Letter issues
– Lack of
  • test specifications and test results
  • comprehensive, up-to-date specifications (design input)
– Inadequate
  • fault handling and stress testing
  • change and release control
Regulatory Risks versus Productivity & Predictability
• Regulations and standards do not seek benefits in productivity or project predictability
• But they don’t preclude productivity and predictability from being important.
• So it is your responsibility, and in your interest, to have development processes focusing on:
  – Productivity
  – Predictability
  without compromising on Safety, Customer satisfaction, Effectiveness and Regulatory risks
Regulatory Risks versus Productivity & Predictability
• Risk Driven Approach:
  – Regulatory interpretation – focus on the intention
  – Continuous process improvement within the intention and frame of the Regulatory expectations
  – Align the process with the different associated risks (e.g. Safety, Effectiveness, and Customer satisfaction) and focus on what really matters
• “Too much will always be too much, but maybe not enough”
Regulatory Risk: When not in compliance
• Compliance is not created by:
  – using a checklist
  – copying from the:
    • standards
    • regulations
    • guidance documents
• Creating compliance is about meeting the “intent” of the standards – not just following “the letter of the law”
Regulatory Risk – Don’t climb too high
E.g. Tools validation. Balancing between:
• what is required and
• when value adding stops
[Figure: scale with the levels “minimum” and “optimum” marked]
Risk Based Documentation Rigor
Documentation Rigor
• Too little
  – Difficult to anchor decisions
  – High regulatory risk
  – Project delay
  – Maintenance is difficult
• Too much
  – Additional cost
  – Less time for development (project delay)
  – Maintenance is difficult
  – Risk of inconsistency
Requirements Rigor
[Figure: rigor of requirements (low to high) plotted against risk (low to high) for the requirement areas Safety, Efficacy, UI, User Satisfaction and Service]
Design Rigor
• IEC 62304 allows software to be decomposed into software items with different safety classes
  – if they are segregated and a segregation rationale is provided
  – No rationale => the Safety Class is the same for all Items/Units.
  – The type of segregation could vary based on risk and other factors
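The segregation rule can be expressed as a small check. This is a simplified sketch of the rule as stated on the slide, not a rendering of the standard's full text; the `SoftwareItem` fields and the function name `effective_class` are hypothetical.

```python
from dataclasses import dataclass

SAFETY_CLASSES = ("A", "B", "C")


@dataclass
class SoftwareItem:
    name: str
    own_class: str            # class the item would get on its own merits
    segregated: bool = False  # is the item segregated from the rest?
    rationale: str = ""       # documented segregation rationale


def effective_class(item: SoftwareItem, system_class: str) -> str:
    """Return the safety class to apply to an item.

    An item keeps its own class only if it is segregated AND a rationale
    is documented; otherwise it inherits the class of the software system.
    """
    if item.own_class not in SAFETY_CLASSES or system_class not in SAFETY_CLASSES:
        raise ValueError("safety class must be one of A, B, C")
    if item.segregated and item.rationale.strip():
        return item.own_class
    return system_class


items = [
    SoftwareItem("UI", "A", segregated=True, rationale="separate process, no shared memory"),
    SoftwareItem("Logger", "A"),
    SoftwareItem("Dose calculation", "C"),
]
for item in items:
    print(item.name, "->", effective_class(item, system_class="C"))
```

Making the rationale a required input keeps the "no rationale, no lower class" rule visible in reviews.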
Design Rigor – Item/Unit Level
A higher Risk Level requires a more detailed and rigorous:
• Architecture description
• Detailed Design
If the entire SW is Safety Class C, Detailed Design is required, but it is still possible to use the Risk Based Approach and adjust the detail and rigor
[Figure: architecture with the components UI, Function, Model, Driver, HAL and SI, with items/units marked with risk levels from A (least critical component) to C2 (most critical component)]
Allocate a Risk Level to the different Items/Units based on their responsibility
Risk Based Testing
• Identify the most critical functions
• Consider:
  – Evaluate whether the users will identify defects in a function or attribute.
  – Use historical data to identify functional areas with many defects
• Do extra testing in critical areas and areas with many defects
  – Use domain specialists
  – Extend the (automated) regression test when new defects are found
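Using historical data can be as simple as counting logged defects per functional area. The sketch below assumes a defect log of `(area, summary)` pairs; the log format, the entries and the function name `defect_hotspots` are made up for illustration.

```python
from collections import Counter


def defect_hotspots(defect_log, top_n=3):
    """Return the top_n functional areas with the most logged defects."""
    counts = Counter(area for area, _summary in defect_log)
    return counts.most_common(top_n)


# Hypothetical defect log: (functional area, short summary).
defect_log = [
    ("key handling", "keys lost after power cycle"),
    ("logging", "log entry truncated"),
    ("key handling", "derived key rejected"),
    ("display", "wrong icon"),
    ("key handling", "CRC mismatch on read"),
    ("logging", "logging an error raises a new error"),
]

for area, count in defect_hotspots(defect_log, top_n=2):
    print(f"{area}: {count} defects -> candidate for extra testing")
```

The resulting ranking is an input to the risk evaluation, alongside the criticality of each function.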
For System Testing
Risk level | SW System test activities
C2, C1 | Functional; Exploratory; consider other test strategies (Stress, Boundary, Stability, State transition, Recovery) depending on the Function under test and the Risk level
B | Functional; Security; Exploratory
A | Functional (all requirements have at least one TC)
• Allocate a Risk Level to the different Functions/requirements based on the possible severity and probability.
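Once every requirement has a Risk Level, the system test activities can be looked up mechanically. A minimal sketch, assuming requirements are kept as an ID-to-level mapping; the requirement IDs and the function name `plan_system_tests` are hypothetical, and the activity lists mirror the table above.

```python
SYSTEM_TEST_ACTIVITIES = {
    "C2": ("Functional", "Exploratory", "Other strategies as applicable"),
    "C1": ("Functional", "Exploratory", "Other strategies as applicable"),
    "B": ("Functional", "Security", "Exploratory"),
    "A": ("Functional",),
}


def plan_system_tests(requirement_levels):
    """Map each requirement ID to the system test activities for its risk level."""
    plan = {}
    for req_id, level in requirement_levels.items():
        if level not in SYSTEM_TEST_ACTIVITIES:
            raise ValueError(f"unknown risk level {level!r} for {req_id}")
        plan[req_id] = SYSTEM_TEST_ACTIVITIES[level]
    return plan


# Hypothetical requirements with allocated risk levels.
plan = plan_system_tests({"REQ-001": "C2", "REQ-002": "B", "REQ-003": "A"})
for req_id, activities in plan.items():
    print(req_id, ", ".join(activities))
```

Rejecting unknown levels catches requirements that were never classified.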
Item/Unit Level testing
A higher Risk Level requires more detailed and rigorous QA Activities:
• Reviews,
• Testing,
• Etc.
If appropriate, use historical data to identify Items/Units with many defects in order to adjust the Risk Level
[Figure: architecture with the components UI, Function, Model, Driver, HAL and SI, with items/units marked with risk levels from A (least critical component) to C2 (most critical component)]
Item/Unit Level testing
Risk level | Unit Testing activities
C2 | Formal Code Review by at least one SW Developer plus the SW Risk Manager; Static Analysis; Software Unit Test, 100% decision/condition coverage; Integration test using decision tables and classification trees
C1 | Formal Code Review by at least one SW Developer; Static Analysis; Software Unit Test, 100% statement coverage; Integration test using decision tables
B | Formal Code Review by at least one SW Developer; Static Analysis; Software Unit Test, 100% function coverage; Integration test
A | Informal Peer Code Review; Integration test as part of System Level Test
Note: Risk Level A is not to be used for Class C Software
Item/Unit Level testing & Complexity
Risk level | Complexity | Reduction
C2 | McCabe <= 2 AND LoC < 10 | Reduce Unit Test to 100% Function coverage
C1 | McCabe <= 3 AND LoC < 20 | Reduce Unit Test to 100% Function coverage
B | McCabe <= 3 AND LoC < 30 | No Unit Test necessary (not for Class C Software)
A | NA | NA
• Complexity is a root cause of many defects, but low complexity code may also require less testing.
• In source code, use complexity metrics to adjust the QA activities
• Note: Complexity metrics are also part of the code standard
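The reduction rules can be combined with the base unit test coverage per risk level into one decision function. This is a sketch under assumptions: the function name `required_unit_coverage`, the `software_class_c` flag and the use of `None` for "no unit test" are illustrative choices; the thresholds are the ones from the table.

```python
# Base unit test coverage per risk level (None = no unit test listed).
BASE_COVERAGE = {"C2": "decision/condition", "C1": "statement", "B": "function", "A": None}

# Per level: (max McCabe, LoC must be below this, coverage after reduction).
REDUCTION = {
    "C2": (2, 10, "function"),
    "C1": (3, 20, "function"),
    "B": (3, 30, None),
}


def required_unit_coverage(level, mccabe, loc, software_class_c=False):
    """Return the required 100% coverage type for a unit, or None for no unit test."""
    if level not in BASE_COVERAGE:
        raise ValueError(f"unknown risk level {level!r}")
    rule = REDUCTION.get(level)
    if rule is None:
        return BASE_COVERAGE[level]  # level A: nothing to reduce
    max_mccabe, loc_limit, reduced = rule
    is_simple = mccabe <= max_mccabe and loc < loc_limit
    if not is_simple:
        return BASE_COVERAGE[level]
    if level == "B" and software_class_c:
        return BASE_COVERAGE[level]  # dropping unit tests is not allowed for Class C software
    return reduced


print(required_unit_coverage("C2", mccabe=2, loc=8))
print(required_unit_coverage("C2", mccabe=5, loc=40))
print(required_unit_coverage("B", mccabe=2, loc=12, software_class_c=True))
```

The McCabe and LoC values would typically come from a static analysis tool that already runs as part of the code standard checks.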
Proactive QA
• Uses the same approach as for Medical Device Safety Risk Management (ISO 14971, IEC/TR 80002-1).
• Identify the most critical SW hazards and their causes, e.g.:
  – Loss of configuration could make the SW/Device useless
  – Faulty data could result in faulty functionality and/or results
  – Never-ending waiting loops could make the SW/Device slow or useless
• Implement proper mitigations in the design
Examples
Example – Keyboard in ATM
[Figure: ATM architecture: a PC running Win XP, XFS, XFS drivers and the Bank App, connected to the Keyboard, Dispenser, Display, Card Reader and Ext Keyboard; a Master-key with derived-keys]
• <1% of the source code does keyboard functionality
• The remainder is related to:
  – Security: crypto, key-handling, surveillance
  – Service: Configuration, status log etc.
Reliability-wise, key handling is very important
• Without the Master key the keyboard needs to go back to the manufacturer
• Without derived keys the keyboard needs a service tech. visit
Risk Based QA for ATM keyboard
• “Spontaneous” loss of keys in the field
• Risk based approach: New file system with
  – CRC Error correction
  – “Black box recorder” (log of field events for debugging)
  – More rigor of requirements and design
  – Unit testing of the new file system
Risk Based QA for ATM keyboard
• In the pilot phase: several incidents where the keyboard was “completely dead”
• In certain situations, defects in both the CRC Error correction and the “Black box recorder” ended up in: “logging an error results in a new error…”
• Learnings:
  – Adding “mitigations” in software increases complexity, which may end up in more erroneous SW
  – Remember integration and scenario testing
  – Consider some kind of recovery mechanism
Example Proactive QA for VHF radio
• The VHF Radio stores “vital” data in EEPROM.
• During SW test a HW design flaw is found: the HW does not signal “Power Down” in time => “vital” data is corrupted => the VHF Radio is useless.
• Proactive QA:
  – CRC protection of data
  – Shadowing of data
  – Controlled scheme for data update
  – The approach also mitigates SW failures
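CRC protection, shadowing and a controlled update order can be sketched together. The example below is an in-memory stand-in for the EEPROM written in Python for readability; real firmware would differ, and the record layout (payload followed by a little-endian CRC-32), the class name `ShadowedStore` and the two-slot scheme are assumptions. Note that a CRC detects corruption; recovery comes from the shadow copy.

```python
import struct
import zlib


def pack_record(payload: bytes) -> bytes:
    """Append a CRC-32 so corruption can be detected when reading back."""
    return payload + struct.pack("<I", zlib.crc32(payload))


def unpack_record(record: bytes):
    """Return the payload if the stored CRC matches, otherwise None."""
    if len(record) < 4:
        return None
    payload, stored_crc = record[:-4], struct.unpack("<I", record[-4:])[0]
    return payload if zlib.crc32(payload) == stored_crc else None


class ShadowedStore:
    """Two copies of the data, updated one after the other."""

    def __init__(self):
        self.slots = [b"", b""]  # stand-in for two EEPROM areas

    def write(self, payload: bytes) -> None:
        record = pack_record(payload)
        self.slots[0] = record  # power loss here leaves the old shadow copy valid
        self.slots[1] = record  # power loss here leaves the new primary copy valid

    def read(self):
        for record in self.slots:
            payload = unpack_record(record)
            if payload is not None:
                return payload
        return None  # both copies corrupted: fall back to defaults or service


store = ShadowedStore()
store.write(b"channel=16")
store.slots[0] = store.slots[0][:3]  # simulate a write torn by power loss
print(store.read())
```

Because the same check runs on every read, the scheme also catches data corrupted by SW defects, not only by the late “Power Down” signal.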
Battery monitor in Medical Device
• A battery powered Medical Device has a battery monitor to inform the user when charging is needed
• Proactive QA
  – Battery power is (also) displayed as the number of measurements left
  – If the number of measurements left <= 2 then measurement is not possible
  – When the number of measurements left <= 2 then it is always decreased by 1, independently of battery status
  – The monitoring algorithm adapts when the battery degrades
ICU Monitor in Demo mode
• The ICU Monitor has a demo mode to show realistic waveforms in sales situations.
• An ICU Monitor erroneously jumped into Demo mode while monitoring a real patient
• Proactive QA
  – Changing the waveforms to non-realistic waveforms
  – Writing “Demo” where waveforms are displayed
  – Timeout on Demo mode: jumping back to real mode after a period of inactivity
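The banner and the inactivity timeout can be sketched as a small state holder. This is an illustration only: the class name `MonitorMode`, the 300 second default and the injectable clock are assumptions made so the behaviour can be tested without waiting.

```python
import time


class MonitorMode:
    """Tracks Demo mode and falls back to real mode after inactivity."""

    def __init__(self, demo_timeout_s: float = 300.0, clock=time.monotonic):
        self._clock = clock
        self._timeout = demo_timeout_s
        self._demo = False
        self._last_activity = clock()

    def enter_demo(self) -> None:
        self._demo = True
        self._last_activity = self._clock()

    def user_activity(self) -> None:
        """Call on every key press or touch to restart the inactivity timer."""
        self._last_activity = self._clock()

    @property
    def demo_active(self) -> bool:
        if self._demo and self._clock() - self._last_activity >= self._timeout:
            self._demo = False  # timeout: jump back to real mode
        return self._demo

    def banner(self) -> str:
        """Text to draw where the waveforms are displayed."""
        return "DEMO" if self.demo_active else ""


# Usage with a fake clock so the timeout can be shown immediately.
now = [0.0]
mode = MonitorMode(demo_timeout_s=60, clock=lambda: now[0])
mode.enter_demo()
print(repr(mode.banner()))
now[0] = 61.0
print(repr(mode.banner()))
```

Checking the timeout whenever the mode is queried means the display cannot keep showing demo waveforms after the timer has expired.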
Pitfalls
Pitfalls
• Too much is considered critical, or too much is considered not critical
  – Risk of not finding the critical defects
• Customer and Manufacturer have different views of what is critical
  – Risk of an unsatisfied customer
• Management only buys the cost reduction part of risk based QA
  – Risk of poor quality
Pitfalls
• No use of historical data, including data from within the project
  – If defect trends differ from your risk evaluation (e.g. unidentified critical defects), then you should adapt your Risk Based Approach.
• Design mitigations add too much complexity
  – resulting in other defects and SW that is difficult to maintain
• The “system” to handle Risk Based QA is too complex.