Budapest University of Technology and Economics, Department of Measurement and Information Systems
Software in safety‐critical systems
Software safety requirements
Software safety integrity
Definitions
Programmable electronic (PE)
o Based on computer technology; may comprise hardware, software, and input and/or output units
Software
o Intellectual creation comprising the programs, procedures, data, rules and any associated documentation pertaining to the operation of a data processing system
Safety-related software
o Software that is used to implement safety functions
Software safety integrity
o Likelihood of software in a PE system achieving its safety functions under all stated conditions within a specified period of time
Systematic safety integrity
o Part of the safety integrity relating to systematic failures in a dangerous mode of failure
o Errors in the design or implementation of software cause systematic failures
Determining software SIL
Basic concepts:
o Software SIL shall be identical to the PE system SIL (implementing the safety function)
o Exception: Mechanism exists to prevent the failure of a software module from causing the system to go to an unsafe state
Reducing software SIL requires
o Analysis of failure modes and effects (probabilistic analysis)
o Analysis of independence between software and the applied prevention mechanisms
Different software modules may have different SIL
Architecture for reducing software SIL
Traversal of independent software modules
[Figure: Two channels (Channel 1, Channel 2) accessing I/O with mutual locking and signature checks; the independent software modules are traversed in numbered steps (1)–(4). Caption: 2 channels and independent acknowledgement.]
Problems and solutions
Systematic failures in complex software:
o Development of fault-free software cannot be guaranteed in the case of complex functions
• Goal: Reducing the number of faults (remaining after verification and validation activities) that may cause a hazard → fault prevention, fault removal, fault tolerance, fault forecasting
o Target failure measure cannot be demonstrated by a quantitative analysis
• General techniques do not exist, estimations are questionable
Standards prescribe methods and techniques for software development, operation and maintenance:
1. Safety lifecycle
2. Techniques and measures in all phases of the lifecycle
3. Documentation
4. Competence and independence of personnel
1. Safety lifecycle
Hardware and software development
PE system architecture (partitioning of functions) determines software requirements
PES integration follows software development
Final step: E/E/PES integration
[Figure: E/E/PES safety requirements specification → E/E/PES architecture; then in parallel: software safety requirements → software design and development, and hardware safety requirements → programmable / non-programmable hardware design and development; then PES integration (software and hardware) → E/E/PES integration]
Software safety lifecycle
Safety requirements specification has two parts:
o Software safety functions
o Software safety integrity levels
Validation planning is required
Integration with PE hardware is required
Final step: Software safety validation
[Figure: Software safety requirements (safety functions, safety integrity) → software safety validation planning → software design and development → PES integration (hw and sw) → software safety validation]
Example lifecycle (V‐model)
Development model is not prescribed by the standards
o V-model is characterized by clear conditions to step forward
o V&V planning is explicitly supported (input for V&V activities)
Software quality assurance
Software Quality Assurance Plan
o Specifying activities and documents according to ISO 9000-3
• ISO 9001 accreditation
o Determining all technical and control activities in the lifecycle
• Activities, inputs and outputs (esp. verification and validation)
• Quantitative quality metrics
• Specification of its own updating (frequency, responsibility, methods)
o External supplier control
Software configuration management
o Configuration control before release for all artifacts (code, documents, …)
o Changes require authorization
Problem reporting and corrective actions (issue tracking)
o “Lifecycle” of problems: from reporting through analysis, design and implementation to validation
o Preventive actions
2. Techniques and measures
Basic approach
Goal: Preventing the introduction of systematic faults and controlling the residual faults
SIL determines the set of techniques to be applied as
o M: Mandatory
o HR: Highly recommended (rationale behind not using it should be detailed and agreed with the assessor)
o R: Recommended
o ‐‐‐: No recommendation for or against being used
o NR: Not recommended
Combinations of techniques are allowed
o E.g., alternate or equivalent techniques are marked
Hierarchy of methods is formed (references to tables)
Example: Guide to selection of techniques
Software safety requirements specification:
o Techniques 2a and 2b are alternatives
o Referenced table: Semi-formal methods (B.7)
Hierarchy of design methods
Specific techniques: Design
Safety bag technique
o Independent external monitor ensuring that the main computer performs safely
Memorizing executed traces
o Comparison of program execution with previously documented reference in order to force it to fail safely if it attempts to execute a path which is not allowed
Defensive programming
o Checking anomalous control/data flow and data values during execution
• E.g., checking variable ranges, plausibility, consistency of configuration, availability of hw, etc.
o Reacting in a safe manner
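Defensive checks of this kind can be sketched in C as follows; this is an illustrative sketch only, and the sensor, its valid range and the plausibility limit are made-up assumptions, not taken from the lecture:

```c
#include <stdbool.h>

/* Assumed physical range and plausibility limit for a hypothetical
 * temperature sensor (illustrative values only). */
#define TEMP_MIN  (-40)
#define TEMP_MAX  (125)
#define MAX_DELTA (10)    /* largest plausible change between samples */

typedef enum { READ_OK, READ_FAULT } read_status_t;

static int last_valid = 20;   /* last accepted sample */

/* Accept a raw sample only if it passes a range check and a
 * plausibility check; otherwise report a fault so the caller
 * can react in a safe manner (e.g., keep the last valid value). */
read_status_t read_temperature(int raw, int *out)
{
    if (raw < TEMP_MIN || raw > TEMP_MAX)
        return READ_FAULT;                 /* range check failed */
    if (raw - last_valid > MAX_DELTA || last_valid - raw > MAX_DELTA)
        return READ_FAULT;                 /* implausible jump */
    last_valid = raw;
    *out = raw;
    return READ_OK;
}
```

On a fault the caller falls back to a safe reaction, e.g. holding the last valid value and signalling the failure.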
Hierarchy of V&V methods
Specific techniques: Verification
Probabilistic testing
o Deriving probabilistic figures about the reliability of components from (automated) testing
• via environment simulation focusing on frequent trajectories
Test case execution from error seeding
o Inserting errors in order to estimate the number of remaining errors after testing from the number of inserted and detected errors
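The seeding estimate can be written down directly; the following is a sketch of the standard Mills-style calculation, with a function name of our own choosing:

```c
/* Error seeding estimate: if `seeded` errors were inserted and
 * `found_seeded` of them were detected together with `found_real`
 * genuine errors, the detection ratio found_seeded/seeded is assumed
 * to hold for genuine errors too, so the estimated total number of
 * genuine errors is found_real * seeded / found_seeded. */
int estimated_remaining(int seeded, int found_seeded, int found_real)
{
    if (found_seeded <= 0)
        return -1;                        /* no estimate possible */
    int estimated_total = (found_real * seeded) / found_seeded;
    return estimated_total - found_real;  /* errors likely remaining */
}
```

For example, with 100 seeded errors of which 80 were found, alongside 40 genuine errors, the estimate is 50 genuine errors in total, i.e. about 10 remaining.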
Fagan inspections
o Revealing mistakes by a systematic audit of documents and design artifacts
Sneak circuit analysis
o Detecting unexpected paths or logic flow (latent conditions inadvertently designed into the system) which initiate undesired functions
Application of tools in the lifecycle
Fault prevention:
o Program translation from high-level programming languages
o MBD, CASE tools: High level modeling and code/configuration generators
Fault removal:
o Analysis, testing and diagnosis
o Correction (code modification)
Management tools
o Contributing both to fault prevention and removal
o Includes project management, configuration management, issue tracking
Safety concerns of tools
Types of tools
o Tools potentially introducing faults
• Modeling and programming tools
• Program translation tools
o Tools potentially failing to detect faults
• Analysis and testing tools
• Project management tools
Requirements
o Use certified or widely adopted tools
• “Increased confidence from use” (no evidence of improper results yet)
o Use the well‐tested parts without altering the usage
o Check the output of tools (analysis/diversity)
o Control access and versions
Safety of programming languages
Factors for selection of languages
o Functional characteristics (probability of faults)
• Logical soundness (unambiguous definition)
• Complexity of definition (understandability)
• Expressive power
• Verifiability (consistency with specification)
• Vulnerability (security aspects)
o Availability and quality of tools
o Expertise available in the design team
Coding standards (subsets of languages) are defined
o “Dangerous” constructs are excluded (e.g., function pointers)
o Static checking can be used to verify the subset
Specific (certified) compilers are available
o Compiler verification kit for third-party compilers
Constructs that make verification difficult (61508):
• Unconditional jumps excluding subroutine calls
• Recursion
• Pointers, heaps or any type of dynamic variables
• Interrupt handling at source code level
• Multiple entries and exits of loops and subprograms
• Implicit variable initialization or declaration
• Variant records and equivalence
• Procedural parameters
Language comparison
Criteria:
• Wild jumps: Jump to arbitrary address in memory
• Overwrites: Overwriting arbitrary address in memory
• Model of math: Well-defined data types
• Separate compilation: Type checking across modules
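Plain C, for instance, offers no protection against overwrites; a safe subset or a defensive layer replaces raw pointer writes with a bounds-checked accessor. A minimal sketch with made-up names:

```c
#include <stdbool.h>
#include <stddef.h>

#define BUF_LEN 8
static int buf[BUF_LEN];

/* In C, `*(p + i) = v` can overwrite an arbitrary address; routing
 * all writes through a checked accessor rejects out-of-bounds
 * accesses instead of silently corrupting memory. */
bool safe_write(size_t index, int value)
{
    if (index >= BUF_LEN)
        return false;       /* out-of-bounds write rejected */
    buf[index] = value;
    return true;
}
```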
Coding standards for C and C++
MISRA C (Motor Industry Software Reliability Association)
o Safe subset of C (2004): 141 rules (121 required, 20 advisory)
o Examples:
• Rule 33 (Required): The right hand side of a "&&" or "||" operator shall not contain side effects.
• Rule 49 (Advisory): Tests of a value against zero should be made explicit, unless the operand is effectively Boolean.
• Rule 59 (Required): The statement forming the body of an "if", "else if", "else", "while", "do ... while", or "for" statement shall always be enclosed in braces.
• Rule 104 (Required): Non-constant pointers to functions shall not be used.
o Tools to check “MISRA conformance” (LDRA, PolySpace, …)
• Test cases to demonstrate adherence to MISRA rules
MISRA C++ (2008): 228 rules
US DoD, JSF C++: 221 rules (incl. metric guidelines)
o “Joint Strike Fighter Air Vehicle C++ Coding Standard”
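The quoted rules can be illustrated with a small fragment; the function and variable names below are our own made-up example, not taken from the MISRA documents:

```c
#include <stdbool.h>

static int call_count = 0;

/* A function with a side effect (it increments call_count). */
static bool sensor_ready(void)
{
    call_count++;
    return true;
}

int check(int x)
{
    /* Rule 33: the side-effecting call is evaluated before the "&&",
     * so no side effect appears on its right-hand side. */
    bool ready = sensor_ready();

    /* Rule 49: the test of x against zero is made explicit.
     * Rule 59: every branch body is enclosed in braces. */
    if (ready && (x != 0))
    {
        return 1;
    }
    else
    {
        return 0;
    }
}
```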
Interesting facts
Boeing 777: Approx. 35 languages are used
o Mostly Ada with assembler (e.g., cabin management system)
o Onboard extinguishers in PLM
o Seatback entertainment system in C++ with MFC
European Space Agency:
o Mandates Ada for mission critical systems
Honeywell: Aircraft navigation data loader in C
Lockheed: F‐22 Advanced Tactical Fighter program in Ada 83 with a small amount in assembly
GM trucks: Vehicle controllers mostly in Modula-GM (a variant of Modula-2)
TGV France: Braking and switching system in Ada
Westinghouse: Automatic Train Protection (ATP) systems in Pascal
Safety‐critical OS: Required properties
Partitioning in space
o Memory protection
o Guaranteed resource availability
Partitioning in time
o Deterministic scheduling
o Guaranteed resource availability in time
Mandatory access control for critical objects
o Not (only) discretionary
Bounded execution time
o Also for system functions
Support for fault tolerance and high availability
o Fault detection and recovery / failover
o Redundancy control
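On the application level, bounded execution time can be approximated with a monotonic-clock deadline check; the sketch below uses made-up names and POSIX `clock_gettime`, whereas a safety-critical OS would enforce the bound itself via time partitioning:

```c
#define _POSIX_C_SOURCE 199309L
#include <stdbool.h>
#include <time.h>

static long elapsed_ms(const struct timespec *start)
{
    struct timespec now;
    clock_gettime(CLOCK_MONOTONIC, &now);
    return (now.tv_sec - start->tv_sec) * 1000L
         + (now.tv_nsec - start->tv_nsec) / 1000000L;
}

/* Trivial step used in the usage example below. */
void noop_step(void)
{
}

/* Run `iterations` steps of `step`, aborting with `false` if the
 * deadline is exceeded, so the caller can signal a fault. */
bool run_with_deadline(void (*step)(void), int iterations, long deadline_ms)
{
    struct timespec start;
    clock_gettime(CLOCK_MONOTONIC, &start);
    for (int i = 0; i < iterations; i++) {
        step();
        if (elapsed_ms(&start) > deadline_ms)
            return false;   /* deadline overrun */
    }
    return true;
}
```

E.g., `run_with_deadline(noop_step, 10, 1000)` completes within its deadline and returns `true`.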
Example: Safety and RTOS
Compromise needed
o Complex RTOS:
• Difficult to test
o “Bare machine”:
• Fewer scheduling risks
• High maintenance risks
Example: Tornado® for Safety Critical Systems
o Integrated software solution uses Wind River's securely partitioned VxWorks® AE653 RTOS
o ARINC 653: Time and space partitioning (guaranteed isolation)
o RTCA/DO‐178B: Level A certification
o POSIX, Ada, C support
3. Documentation
Principles for documentation
Type of documentation
o Comprehensive (overall lifecycle)
• E.g., Software Verification Plan
o Specific (for a given lifecycle phase)
• E.g., Software Source Code Verification Report
Document Cross Reference Table
o Determines documentation for a lifecycle phase
o Determines relations among documents
Traceability of documents is required
o Relationships between documents are specified (input, output)
o Terminology, references, abbreviations are consistent
Merging documents is allowed
o If the responsible persons (authors) are not required to be independent
Document cross reference table (EN50128)
[Table legend: a cell marks the creation of a document; reading a column vertically gives the documents used in a given phase.]
Example
Document structure in EN 50128
30 documents in a systematic structure:
o Specification
o Design
o Verification
System Development Phase
o System Requirements Specification
o System Safety Requirements Specification
o System Architecture Description
o System Safety Plan
Software Planning Phase
o Software Development Plan
o Software Quality Assurance Plan
o Software Configuration Management Plan
o Software Verification Plan
o Software Integration Test Plan
o Software/hardware Integration Test Plan
o Software Validation Plan
o Software Maintenance Plan
Software Requirements Spec. Phase
o Software Requirements Specification
o Software Requirements Test Specification
o Software Requirements Verification Report
Software Architecture & Design Phase
o Software Architecture Specification
o Software Design Specification
o Software Architecture and Design Verification Report
Software Module Design Phase
o Software Module Design Specification
o Software Module Test Specification
o Software Module Verification Report
Coding Phase
o Software Source Code & Supporting Documentation
o Software Source Code Verification Report
Software Module Testing Phase
o Software Module Test Report
Software Integration Phase
o Software Integration Test Report
Software/hardware Integration Phase
o Software/hardware Integration Test Report
Software Validation Phase
o Software Validation Report
Software Assessment Phase
o Software Assessment Report
Software Maintenance Phase
o Software Maintenance Records
o Software Change Records
4. Competence and independenceof personnel
Human factors
In contrast to computers:
o Humans often fail in:
• reacting in time
• following a predefined set of instructions
o Humans are good at:
• handling unanticipated problems
Human errors
o Not all kinds of human errors are equally likely
o Hazard analysis (FMECA) is possible in a given context
o Results shall be integrated into the system safety analysis
Reducing the errors of developers
o Safe languages, tools, environments
o Training, experience and redundancy (independence)
Reducing operator errors:
o Designing ergonomic HMI (patterns are available)
o Designing to aid the operator rather than take over
Organization
Safety management
o Quality assurance
o Safety Organization
Competence shall be demonstrated
o Training, experience and qualifications
Independence of roles:
o DES: Designer (analyst, architect, coder, unit tester)
o VER: Verifier
o VAL: Validator
o ASS: Assessor
o MAN: Project manager
o QUA: Quality assurance personnel
Independence of personnel
[Figure (EN 50128): required independence of roles, per organization and person:
o SIL 0: DES, VER and VAL may be the same person; independent ASS
o SIL 1 or 2: DES is a different person from VER and VAL; independent ASS
o SIL 3 or 4: DES (with its own MGR) is independent from VER and VAL (with their own MGR), or DES, VER and VAL are three different persons; ASS belongs to an independent organization]
Specific design aspects: Development of generic software
Overview of the lifecycle
[Figure (V-model): System development → Requirement specification → Architecture design → Module design → Module coding → Module testing → Software integration → Software/hardware integration → Software validation → Software assessment → Operation and maintenance; the module test plan, integration test plan and validation test plan are prepared on the descending branch]
Generic software: It can be used and re‐used after parameterization with specific data
[Figure: Parameterization comprises design for parameterization and V&V of parameterization]
Design aspects
Design aspects per lifecycle phase:
o System development: Decision about parameterized functions
o Software requirement specification: Specification of parameterized functions; specification of data validation
o Architecture design: Design of interfaces for parameterization
o Module design: Separation of program code and parameters
o Verification and validation: Checking for potential parameter values and their combinations
o Maintenance: Checking compatibility of changes in program code and parameters
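The specification of data validation can be illustrated with a sketch; the parameter set, its ranges and the checksum scheme below are made-up assumptions:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical application parameters for a generic control function;
 * the fields, limits and checksum are illustrative only. */
typedef struct {
    uint16_t speed_limit;  /* km/h */
    uint16_t brake_delay;  /* ms   */
    uint16_t checksum;     /* sum of the fields above */
} params_t;

static uint16_t param_sum(const params_t *p)
{
    return (uint16_t)(p->speed_limit + p->brake_delay);
}

/* Generic software must reject an invalid parameterization before
 * start-up: integrity check first, then a range check per parameter. */
bool validate_params(const params_t *p)
{
    if (p->checksum != param_sum(p))
        return false;                     /* corrupted data       */
    if (p->speed_limit == 0 || p->speed_limit > 300)
        return false;                     /* out of allowed range */
    if (p->brake_delay > 500)
        return false;
    return true;
}
```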
Parameterization activities
Activity and the similar lifecycle phase:
o Specifying application requirements: Software requirements specification
o Data preparation, data testing (verification), data test reporting: Software architecture and design
o Parameterization (configuration): Software integration
o Validation of the system parameterized with application data: Software validation, software assessment
Summary
Design of safe software
o Lifecycle
o Techniques and measures
o Documentation
o Competence and independence of personnel
Specific aspects
o Safe subsets of programming languages
o The role of tools
o Safe operating systems
o Development of generic software