EMER: Engineering Critical Systems: human scale systems with emergence
EME : 2
Safety critical systems
• Safety critical systems engineering has to consider emergent behaviour
– Safety is itself emergent
– A system is considered safe when its potential for undesired emergent behaviour is sufficiently restricted
• Recent work considers complex systems where emergent behaviours cannot be controlled
– but do need to be understood
• Command and control
• Building evacuation and crowd management
• Transport systems management
EME : 3
Social scale complex systems
A system of systems is a group of systems that interact to achieve some operational goal
• Systems of systems (SoS) are a focus of research in the HISE and Enterprise Systems groups in the department
– Large Scale Complex IT Systems
– Social-scale critical systems
• Systems of interest all include people
– Which adds irrationality to the behaviours of the system
• Start by defining SoS
Alexander, Hall-May & Kelly, 2004 onwards – http://www-users.cs.york.ac.uk/~tpk/
EME : 4
Human SoS characteristics
• Goals:
– Overall goals, shared by all components
– Individual component goals
• Autonomy:
– Multiple heterogeneous components with at least some individual capabilities and independence of action
• Mobility:
– Components are spatially distributed and mobile
– Communication is by ad hoc networks
• Components need to collaborate to achieve overall goals
– No (reliable) central command and control
http://www-users.cs.york.ac.uk/~tpk/issc04c.pdf
EME : 5
Example: building evacuation
• Emergency situation in a familiar setting
– Individual goal is to get out
• Typically following established exit route, not emergency route
– Overall goal is to clear the building fast and to know it is clear
– Emergency disrupts social and communication structures
• Glasgow evacuation simulations
– Use Monte Carlo simulation, not individual behaviours
– Not formally engineered, but built using appropriate engineering background
– Based on scenario analysis and simulation in realistic settings
See http://www.dcs.gla.ac.uk/~johnson/, e.g. papers/9_11.PDF
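A minimal sketch of the Monte Carlo idea above, in Python. It is not the Glasgow simulator: the building model (a number of exits, each possibly blocked), the parameter names and the random exit choice are all assumptions made for illustration. Occupants are sampled, not modelled as individuals.

```python
import random
import statistics

def simulate_evacuation(n_people, n_exits, exit_rate, blocked_prob, rng):
    """One random trial: return the number of time steps needed to clear the building."""
    # Assumption: each exit is blocked independently with probability blocked_prob.
    open_exits = [e for e in range(n_exits) if rng.random() >= blocked_prob]
    if not open_exits:
        open_exits = [0]  # assumption: at least one exit always stays usable
    # No behavioural model: every occupant picks an open exit uniformly at random.
    queues = {e: 0 for e in open_exits}
    for _ in range(n_people):
        queues[rng.choice(open_exits)] += 1
    # Each exit passes exit_rate people per step; the longest queue sets the time.
    longest = max(queues.values())
    return -(-longest // exit_rate)  # ceiling division

def estimate_clearance(trials=1000, seed=0, **params):
    """Repeat the trial many times; return (mean, worst) clearance time."""
    rng = random.Random(seed)
    times = [simulate_evacuation(rng=rng, **params) for _ in range(trials)]
    return statistics.mean(times), max(times)

mean_time, worst_time = estimate_clearance(
    n_people=200, n_exits=4, exit_rate=5, blocked_prob=0.2)
print(mean_time, worst_time)
```

Changing blocked_prob or n_exits corresponds to varying features of the emergency or the building between runs.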
EME : 6
Engineering and building evacuation
• Modelling human factors (vs Monte Carlo simulation)
– Shown to be impossible on any meaningful scale
– Attitudes, prior experience etc. of many people
• Modelling building
– Blueprints and site knowledge
– Build all human-scale features into the model
• Environment analysed
– Ability to change features of emergency, building, response
• Simulation is as simple as possible
• Validation is against evidence
– From fire practices in situ
– From literature, experience, observation
EME : 7
Example: traffic policies
• Safety policy: operational rules that guide agent behaviour so that emergent “designed” SoS-level behaviour does not result in accidents
The belief that numerous independently designed and constructed autonomous systems can work together synergistically and without accident is naïve unless they operate to a higher and consistent set of rules
• Focus on identifying objectives of rule set
• Derive an argument for each showing how a policy (rule set) can mitigate
http://www-users.cs.york.ac.uk/~tpk/issc05a.pdf
EME : 8
Safety case arguments: documenting assurance
• Significant critical systems engineering research into arguing and documenting assurance
– Safety analysis and argumentation
– Dependability, security also now using assurance techniques
– See research by Tim Kelly and others in York’s HISE group
• World leaders in safety argumentation and safety-critical-systems training
• Analysis techniques focus on challenging evidence
– Safety is established by exposure of an argument over evidence to expert scrutiny
– Reveals the extent and limitations to trust in the system
– No system is ever absolutely safe
• Arguments summarised in Goal Structuring Notation
EME : 9
Basic Goal Structuring Notation
See T.P.Kelly, PhD thesis, www.cs.york.ac.uk/ftpdir/reports/99/YCST/05/YCST-99-05.pdf
R. Weaver, PhD thesis, www.cs.york.ac.uk/ftpdir/reports/2004/YCST/01/YCST-2004-01.pdf
and papers by Kelly’s group, www-users.cs.york.ac.uk/~tpk/pubs.html
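A rough sketch of how a GSN argument might be held as a tree, in Python. Only three GSN element kinds are used (goal, strategy, solution); the Node class, the helper function and the example hazards are hypothetical and are not taken from the cited theses. The helper lists goals that have nothing beneath them, i.e. claims not yet backed by any argument or evidence.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    kind: str   # "goal", "strategy" or "solution" (evidence)
    text: str
    children: list = field(default_factory=list)

def undeveloped_goals(node):
    """Return the text of every goal with no supporting strategy, sub-goal or solution."""
    found = []
    if node.kind == "goal" and not node.children:
        found.append(node.text)
    for child in node.children:
        found.extend(undeveloped_goals(child))
    return found

# Hypothetical argument: one top goal, argued hazard by hazard.
top = Node("goal", "System is acceptably safe", [
    Node("strategy", "Argue over each identified hazard", [
        Node("goal", "Hazard H1 is mitigated", [
            Node("solution", "Fault tree analysis report"),
        ]),
        Node("goal", "Hazard H2 is mitigated"),  # no evidence yet
    ]),
])

print(undeveloped_goals(top))  # ['Hazard H2 is mitigated']
```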
EME : 10
Recording a safety argument
EME : 11
Example: command and control
• Hypothetical study of the safety of various aspects of a combined military operation with UAVs
– Identification of emergent hazards
• Safety problems due to complexity rather than component failure
– Agent-based simulation to explore combinations of behaviours
R. D. Alexander’s PhD: http://www.cs.york.ac.uk/ftpdir/reports/2007/YCST/21/YCST-2007-21.pdf
EME : 12
Engineering command and control
• Case study has been used in many safety related analyses
– Well-known components, existing models, etc.
• Careful engineering approach based on conventional simulation design and conventional safety analysis
– Systematic derivation and deviation of hazard vignettes
• Work on how SoS characteristics contribute to hazards
– Uses BDI (beliefs, desires, intentions) for human components
• Multi-agent simulation validated against existing models
• Machine learning used to identify new hazards
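A minimal BDI sketch in Python, to show the shape of the idea: beliefs are updated from percepts, desires are filtered by whether they apply under current beliefs, and the highest-priority applicable desire becomes the intention. The classes, the priorities and the UAV-style desires are assumptions for illustration, not the agent model used in the case study.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

@dataclass
class Desire:
    name: str
    priority: int
    applicable: Callable[[dict], bool]  # does this desire apply given the beliefs?

@dataclass
class BDIAgent:
    desires: list
    beliefs: dict = field(default_factory=dict)
    intention: Optional[str] = None

    def perceive(self, percept: dict) -> None:
        # Beliefs may be wrong or stale; here they simply mirror the latest percepts.
        self.beliefs.update(percept)

    def deliberate(self) -> Optional[str]:
        # Commit to the highest-priority desire that applies under current beliefs.
        options = [d for d in self.desires if d.applicable(self.beliefs)]
        self.intention = max(options, key=lambda d: d.priority).name if options else None
        return self.intention

# Hypothetical desires for a simulated UAV operator.
agent = BDIAgent(desires=[
    Desire("patrol", 1, lambda b: True),
    Desire("return_to_base", 5, lambda b: b.get("fuel_low", False)),
    Desire("avoid_collision", 10, lambda b: b.get("aircraft_nearby", False)),
])

print(agent.deliberate())                 # patrol
agent.perceive({"fuel_low": True})
print(agent.deliberate())                 # return_to_base
agent.perceive({"aircraft_nearby": True})
print(agent.deliberate())                 # avoid_collision
```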
EME : 13
Common features of examples
• Use of existing research and best practice
– Ways to model people (BDI)
– Ways to model environment
– Ways to construct and analyse efficient simulations
• Validation:
– Do models and simulations match the real world?
• Deviational analysis:
– How might something have been overlooked?
• Arguments
– E.g. safety: a risk is as low as reasonably practicable
• within the assumptions of the model or simulation…
Could molecular nanotechnology be assured safe?
EME : 15
Evidence-based engineering (Kelly)
• When we use any engineering technique, we need to know how it affects our ability to justify quality
– Evidence that a design is realistic
• Proven properties of a specification are irrelevant if we implement on an unproven platform
• At nano-scale, we’re talking about unproven physical media
– Simulation is only useful if we can justify its contents
• Emergent properties may be artefacts of simulated environment
• Real environments have many unknown unknowns
• ALARP rules, ok? …
– Doubt (risk…) must be as low as reasonably practicable
http://www.hse.gov.uk/risk/theory/alarp.htm
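As a rough illustration of ALARP, risk is commonly pictured in three regions: unacceptable, tolerable only if reduced as low as reasonably practicable, and broadly acceptable. The Python sketch below assumes that picture; the function name and both thresholds are hypothetical and must be supplied by the caller, they are not regulatory values.

```python
def alarp_region(risk, intolerable_above, negligible_below):
    """Classify an estimated risk (e.g. probability of harm per year).

    Both thresholds are caller-supplied assumptions, not official figures.
    """
    if risk > intolerable_above:
        return "unacceptable"
    if risk > negligible_below:
        return "tolerable only if ALARP"
    return "broadly acceptable"

# Illustrative thresholds only.
for r in (1e-2, 1e-4, 1e-7):
    print(r, alarp_region(r, intolerable_above=1e-3, negligible_below=1e-6))
```

In the middle region the classification alone settles nothing: an argument is still needed that further risk reduction is not reasonably practicable.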
EME : 16
Assurance arguments
• Evidence-based engineering would design arguments of quality, validity etc. alongside product design
• Demonstrable validity, safety, security, dependability …
– It must be possible to convince others
• The need for evidence guides techniques for analysing and quantifying quality attributes
– Directs analysis to unexpected behaviour and state
– Structured brainstorming
– Flaw hypothesis, deviational analysis, What-if, HAZOP ...
• Use expert insight and experience to challenge assumptions
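HAZOP-style deviational analysis can be sketched as crossing a set of guide words with the elements of a design, giving one prompt per pair for experts to challenge. The Python below uses the commonly cited HAZOP guide words; the function name and the example design elements are hypothetical.

```python
from itertools import product

# Commonly used HAZOP guide words.
GUIDE_WORDS = ["NO", "MORE", "LESS", "AS WELL AS", "PART OF", "REVERSE", "OTHER THAN"]

def deviation_prompts(elements, guide_words=GUIDE_WORDS):
    """Return one 'what if' prompt per (design element, guide word) pair."""
    return [f"{gw}: {el}" for el, gw in product(elements, guide_words)]

# Hypothetical design elements, e.g. flows taken from a use case.
prompts = deviation_prompts(["fuel level report", "waypoint command"])
for p in prompts:
    print(p)
```

The prompts only direct attention; judging which deviations are credible and hazardous remains expert work.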
EME : 17
Modelling
• Deviational analysis techniques can be applied to any models, assumptions, …
– Design models in notations such as UML, CSP etc.
• HAZOP + use cases, mutating CSP …
• Assurance needs evidence of modelling quality and relevance
– MDE (meta)model compliance and consistency
– Rigorous extensions to diagram-and-text modelling
– Formal refinement …
• Need to be clear what is being modelled and why
Srivatanakul, http://www.cs.york.ac.uk/ftpdir/reports/YCST-2005-12.pdf
EME : 18
Issues for nano-scale SoS: Goals and buy-in to goals
• Goals in human SoS imply agents with choice
• Buy-in to system goals by component systems
• At nano-scale, are there SoS or component goals or intentions?
• If a property emerges, is an SoS goal met … ?
• We could transfer goals to designer, so that development and assurance need to capture:
– designer’s intention
– ability of SoS and components to achieve intent
• Also, a key to engineering SoS is goals that reflect dependability attributes
• Goals to avoid specific sorts of harm …
EME : 19
Issues for nano-scale SoS: Autonomy of component systems
• Autonomy implies choice
– e.g. individual can revise goals, change communication links
• Soldiers think before detonating global destruction
• Nanites are not autonomous but similar effects arise from
– Probabilistic features of elements & environment
– Accidental mutation or damage
– Spontaneous and variable interaction with environment
• Nanites have high capacity for getting lost or making an undesirable alliance
– Engineering needs to understand and account for these features of nanites
EME : 20
Issues for nano-scale SoS: Environment
• For human SoS, global environment is “known”
– Local operating conditions affect agents’ perception and use of environment
• Weather, terrain, infrastructure operation etc.
• A broken radio or flooded river is a problem that can be understood and worked around
• Nano-scale environment is a real problem
– We do not understand the nano-scale environment
– We do not understand how nanites would interact with their environment
– Nanites cannot devise imaginative solutions to unforeseen scenarios
– Policies, operational guidance etc irrelevant
EME : 21
Nano-scale systems of systems?
• Nano-scale complex emergent systems are SoS
• Despite absence of free-will, many of the issues are the same
• Many of the consequences of inadequate design are similar
– Catastrophic uncontrolled interaction
• Treating nano-scale complex emergent systems as SoS leads us to look at other critical-systems research for engineering inspiration