Post on 05-Jun-2018
Formal Methods meets Machine Learning: Explorations in Cyber-Physical System Design
Sanjit A. Seshia
Professor, UC Berkeley
Joint work with: Jyo Deshmukh, Tommaso Dreossi, Alex Donze, Dorsa Sadigh, Susmit Jha, Xiaoqing Jin, Tomoyuki Kaga, Tomoya Yamaguchi, S. Shankar Sastry
Stanford University, December 4, 2017
Cyber-Physical Systems (CPS): Integration of computation with physical processes, defined by both cyber & physical
Courtesy of Kuka Robotics Corp.
Courtesy of Doug Schmidt
Power generation and distribution
Courtesy of General Electric
Military systems:
E-Corner, Siemens
Transportation (Air traffic control at SFO)
Avionics
Telecommunications
Instrumentation (Soleil Synchrotron)
Daimler-Chrysler
Automotive
Building Systems
Factory automation
[E. A. Lee]
Automotive domain representative of key societal challenges: Smart Cities / Infrastructure, Energy Efficiency, Climate Change, Humans and Automation
Growing Use of Machine Learning/AI in Cyber-Physical Systems
S. A. Seshia 3
Many Safety-Critical Systems
Growing Features, Growing Costs
70 to 100 ECUs in modern luxury cars, close to 100M LOC
Engine control: 1.7M LOC
F-22 Raptor: 1.7M LOC, Boeing 787: 6.5M LOC
Frost & Sullivan: 200M to 300M LOC
Electronics & Software: 35-40% of luxury car cost
[Figure: two charts over the years 1988, 1997, 2002, 2009]
[from J. Deshmukh]
Charette, R., This Car Runs on Code, IEEE spectrum, http://spectrum.ieee.org/transportation/systems/this-car-runs-on-code
High Cost of Failures
Safety-critical: human life at risk
Recalls, production delays, lawsuits, etc.
Toyota UA: $1.2B settlement with DoJ in 2014, lawsuits
Tesla autopilot incidents: reasons still unclear
Formal Methods to the Rescue?
Industry need for higher assurance
Increasing interest in Formal Methods
Formal methods = Mathematical, Algorithmic techniques for modeling, design, analysis
  Specification: WHAT the system must/must not do
  Verification: WHY it meets the spec (or not)
  Synthesis: HOW it meets the spec (correct-by-construction design)
Major success story: Digital circuit design
Can we address the challenges of CPS design?
Formal Methods meets Machine Learning
Machine Learning → Formal Methods
  Greater efficiency, ease of use/applicability
  Formal Inductive Synthesis
Formal Methods → Machine Learning
  Stronger assurances of safety/correctness for learning systems
Further details:
1. S. A. Seshia, Combining Induction, Deduction, and Structure for Verification and Synthesis, Proceedings of the IEEE, November 2015.
2. S. A. Seshia, D. Sadigh, and S. S. Sastry, Towards Verified Artificial Intelligence, July 2016, http://arxiv.org/abs/1606.08514
Outline
Synthesizing Requirements for Closed-Loop Control Systems
  Industrial Tech Transfer to Toyota
Falsification of Deep Learning based CPS
  Context: autonomous driving
Conclusion
Mining Requirements for Closed-Loop Control Systems
[Jin, Donze, Deshmukh, Seshia, HSCC 2013, TCAD 2015; Yamaguchi et al., FMCAD 2016]
Challenges for Verification of Control Systems
Closed-loop setting very complex
  software + physical artifacts
  nonlinear dynamics
  large look-up tables
  large amounts of switching
Requirements Incomplete/Informal
  Specifications often created concurrently with the design!
  Designers often only have informal intuition about what is "good behavior"
  shape recognition
Mining Requirements from Closed-Loop Models 9
Experimental Engine Control Model
Industry Problem: Applying Formal Methods to Legacy Systems
It's working, but I don't understand why!
Our Solution: Requirements Mining
Value added by mining:
  Mined Requirements become useful documentation
  Use for code maintenance and revision
  Use during tuning and testing
Designer's View of Our Solution
Tool extracts properties of closed-loop design using a Simulator
Designer reviews mined requirements
  Settling time is 6.25 ms
  Overshoot is 100 units
  Expressed in Signal Temporal Logic [Maler & Nickovic, 04]
[Figure: response trace annotated with 6.25 ms and 100]
CounterExample Guided Inductive Synthesis (CEGIS)
Find "Tightest" Properties
Settling Time is ??, Overshoot is ??, Upper Bound on x is ??
Are there behaviors that do NOT satisfy these requirements?
Settling Time is 5 ms, Overshoot is 5 KPa, Upper Bound on x is 3.6
Counterexamples
Settling Time is 5.3 ms, Overshoot is 5.1 KPa, Upper Bound on x is 3.8
Experimental Engine Control Model
[Jin, Donze, Deshmukh, Seshia, HSCC13; TCAD15]
Idea of CEGIS goes back to ASPLOS 2006 paper by Solar-Lezama et al.
Experimental Engine Control Model
CounterExample Guided Inductive Synthesis
Find "Tightest" Properties
Settling Time is ??, Overshoot is ??, Upper Bound on x is ??
Are there behaviors that do NOT satisfy these requirements?
Settling Time is 6.3 ms, Overshoot is 5.6 KPa, Upper Bound on x is 4.1
NO
Mined Requirement: Settling Time is 6.3 ms, Overshoot is 5.6 KPa, Upper Bound on x is 4.1
Counterexamples
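The loop on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `simulate` is a hypothetical stand-in for the engine model, the only mined parameter is an upper bound on x, and the falsifier simply scans a fixed list of candidate inputs rather than running an optimizer.

```python
import math

def simulate(u, n=200):
    """Hypothetical closed-loop model: a damped step response whose
    overshoot grows with the input u (assumed to lie in [0, 1])."""
    return [1.0 - (1.0 + u) * math.exp(-0.05 * k) * math.cos(0.3 * k)
            for k in range(n)]

def tightest_upper_bound(traces, delta=1e-3):
    """Synthesis step: smallest bound (plus margin delta) that every
    trace collected so far satisfies."""
    return max(max(trace) for trace in traces) + delta

def falsify(bound, candidate_inputs):
    """Falsification step: return the first input whose trace exceeds
    the bound, or None if no candidate violates it."""
    for u in candidate_inputs:
        if max(simulate(u)) > bound:
            return u
    return None

def mine_upper_bound(initial_inputs, candidate_inputs, max_iters=20):
    """CEGIS loop: alternate synthesis and falsification until the
    falsifier finds no counterexample."""
    traces = [simulate(u) for u in initial_inputs]
    for _ in range(max_iters):
        bound = tightest_upper_bound(traces)
        counterexample = falsify(bound, candidate_inputs)
        if counterexample is None:
            return bound  # the mined requirement
        traces.append(simulate(counterexample))
    raise RuntimeError("no requirement mined within max_iters")

candidates = [i / 10 for i in range(11)]
mined = mine_upper_bound(initial_inputs=[0.0], candidate_inputs=candidates)
```

The mined bound is only as strong as the falsifier: it holds for every behavior the falsifier was able to explore, not for all behaviors of the model.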
Experimental Engine Control Model
CounterExample Guided Inductive Synthesis
Requirement template: Parametric Signal Temporal Logic (PSTL)
Find "Tightest" Properties: Parameter Synthesis (exploits monotonicity)
Are there behaviors that do NOT satisfy these requirements? Optimization-based Falsification
Settling Time is ??, Overshoot is ??, Upper Bound on x is ??
Settling Time is 6.3 ms, Overshoot is 5.6 KPa, Upper Bound on x is 4.1
NO
Mined Requirement: Settling Time is 6.3 ms, Overshoot is 5.6 KPa, Upper Bound on x is 4.1
Counterexamples
Signal Temporal Logic (STL)
Extension of Linear Temporal Logic (LTL) and Metric Temporal Logic (MTL)
Quantitative semantics: satisfaction of a property over a trace given real-valued interpretation
  Greater value means more easily satisfied
  Non-negative satisfaction value corresponds to Boolean satisfaction
Example: For all time points between 60 and 100, the absolute value of x is below 0.1
$\varphi = \mathbf{G}_{[60,100]} (|x| < 0.1)$
[Figure: trace x over t from 0 to 100, with the band from -0.1 to +0.1 marked starting at t = 60]
Quantitative Satisfaction Function for STL
Function $\rho$ that maps STL formula $\varphi$ and a given trace x (valuation of signals) to a numeric value
Example: $\rho(\varphi, x) = \inf_{t \in [60,100]} (0.1 - |x(t)|)$
Quantifies "how much" a trace satisfies a property
  Large positive value: trace easily satisfies
  Small positive value: trace close to violating
  Negative value: trace does not satisfy
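As a concrete illustration of the example above, the satisfaction value can be computed on a sampled trace by taking the minimum margin over the samples in the time window. The function name and the two traces below are made up for illustration, and a sampled trace only approximates the infimum over continuous time.

```python
def robustness_always_abs_below(times, xs, t_lo=60.0, t_hi=100.0, bound=0.1):
    """Approximates inf over t in [t_lo, t_hi] of (bound - |x(t)|)
    using the samples of the trace that fall inside the window."""
    margins = [bound - abs(x) for t, x in zip(times, xs) if t_lo <= t <= t_hi]
    if not margins:
        raise ValueError("trace has no samples in the time window")
    return min(margins)

times = [float(t) for t in range(101)]
# Hypothetical trace that settles well inside the band after t = 50.
settled = [1.0 if t < 50 else 0.02 for t in times]
# Hypothetical trace with a single excursion to 0.15 at t = 80.
ringing = [1.0 if t < 50 else (0.15 if t == 80 else 0.02) for t in times]

rho_settled = robustness_always_abs_below(times, settled)  # positive: satisfied
rho_ringing = robustness_always_abs_below(times, ringing)  # negative: violated
```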
Parametric Signal Temporal Logic (PSTL)
Constants in STL formula replaced with parameters
  Scale parameters
  Time parameters
Examples:
  Between some time and 10 seconds, x remains greater than some value
  After transmission shifts to gear 2, it remains in gear 2 for at least $\tau$ secs
Parameter Synthesis = Find $\delta$-tight values of params (for suitably small $\delta$)
Find "Tightest" Properties
Want the value of $\pi$ corresponding to the "tightest" satisfaction over a set of traces
[Figure: trace x over t from 0 to 100 with bounds 2.9, 3, and 1 000 000 marked; label: Too loose]
Parameter Synthesis
Find "Tightest" Properties
Non-linear optimization problem
  Satisfaction function for STL is non-linear in general
Naive (strawman) approach:
  grid parameter space to precision $\delta$
  evaluate satisfaction value at each point
  pick valuation with smallest satisfaction value
Problem: Exponential number of grid points (in #parameters)
Satisfaction Monotonicity
Find "Tightest" Properties
Example: If upper bound of all signals is 3, any number > 3 is also an upper bound
Satisfaction function monotonic in parameter value
Example: $\rho(\varphi(\pi), x) = \inf_t (\pi - x(t))$
  For all x, $\rho(\varphi(\pi), x)$ is a monotonic function of the parameter $\pi$
Advantage: If monotonic, use binary search over parameter space, otherwise exhaustive search
[Figure: trace over t = 0, 50, 100 with levels 3 and 4 marked]
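For the upper-bound example, monotonicity lets a binary search replace the grid. The sketch below assumes a single parameter, a satisfaction value that is non-decreasing in that parameter, and a bracket whose lower end violates the requirement and whose upper end satisfies it; the function names and traces are illustrative only.

```python
def satisfaction(pi, traces):
    """Worst case over the trace set of inf_t (pi - x(t)) for the
    requirement 'x always stays below pi'."""
    return min(pi - max(trace) for trace in traces)

def delta_tight_upper_bound(traces, lo, hi, delta=1e-3):
    """Binary search for a satisfying parameter value within delta of
    the boundary between violation and satisfaction.
    Assumes satisfaction is non-decreasing in pi."""
    if satisfaction(lo, traces) >= 0 or satisfaction(hi, traces) < 0:
        raise ValueError("need lo violating and hi satisfying")
    while hi - lo > delta:
        mid = (lo + hi) / 2
        if satisfaction(mid, traces) >= 0:
            hi = mid  # mid satisfies: tighten from above
        else:
            lo = mid  # mid violates: raise the lower end
    return hi

traces = [[0.0, 2.5, 2.9, 2.0], [1.0, 3.0, 2.2]]
bound = delta_tight_upper_bound(traces, lo=0.0, hi=1_000_000.0)
```

Starting from the deliberately loose bound 1 000 000, the search needs about 30 evaluations to reach precision 0.001, compared with a billion grid points for the strawman at the same precision.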
Deciding Satisfaction Monotonicity
Find "Tightest" Properties
Need to decide whether: For all x, $\rho(\varphi(\pi), x)$ is a monotonic function of $\pi$
Theorem: Deciding monotonicity of a PSTL formula is undecidable
Use an encoding to satisfiability modulo theories (SMT) solving
  Quantified formulas involving uninterpreted functions, and arithmetic over reals
  linear arithmetic if PSTL predicates are linear
  Solved easily with Z3
Black-Box Falsification Procedure
Are there behaviors that do NOT satisfy these requirements?
[Diagram: input u, system output S(u), Falsification Tool]
Falsification as Optimization
Are there behaviors that do NOT satisfy these requirements?
Solve $\rho^* = \min_u \rho(\varphi, S(u))$
  Leverages quantitative semantics of STL
  Relies on standard numerical optimization methods (e.g. Nelder-Mead)
If $\rho^* < 0$, found falsifying trace!
Nonlinear Optimization Problem, No exact solution, Limited theoretical guarantees
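A minimal sketch of this minimization follows. To stay dependency-free it uses seeded random search instead of Nelder-Mead, and both the system `S` and the requirement "x always stays below `bound`" are hypothetical placeholders for a real simulator and STL formula.

```python
import math
import random

def S(u):
    """Hypothetical black-box system: input u = (gain, freq) produces
    a decaying oscillation as output trace."""
    gain, freq = u
    return [gain * math.sin(0.1 * freq * k) * math.exp(-0.01 * k)
            for k in range(300)]

def rho(trace, bound):
    """Satisfaction value of 'x always stays below bound'."""
    return min(bound - x for x in trace)

def falsify(input_ranges, bound, budget=500, seed=0):
    """Random-search approximation of min_u rho(S(u)); stops early as
    soon as a negative value (a falsifying trace) is found."""
    rng = random.Random(seed)
    best_u, best_rho = None, math.inf
    for _ in range(budget):
        u = tuple(rng.uniform(lo, hi) for lo, hi in input_ranges)
        value = rho(S(u), bound)
        if value < best_rho:
            best_u, best_rho = u, value
        if best_rho < 0:
            break
    return best_u, best_rho

ranges = [(0.0, 2.0), (0.5, 2.0)]  # assumed ranges for gain and freq
u_bad, rho_bad = falsify(ranges, bound=1.5)
u_ok, rho_ok = falsify(ranges, bound=10.0)
```

A negative result is a genuine counterexample, while a positive result only means the search budget found no violation, which is the "limited theoretical guarantees" caveat on the slide.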
Experimental Evaluation Summary [details in TCAD15 paper]
Defined Templates for Common Requirements in Automotive Control (all monotonic PSTL!!)
  Dwell-Time requirements
  Timed/Untimed Safety properties
  Timed Inevitability (bounded liveness)
  Input Profiles: assumptions on shape of input signals
  Control-theoretic requirements on output signals (bounded overshoot/undershoot, settling time, error from reference signal, etc.)
Three Benchmarks
  Simple Simulink Automatic Transmission Model
  Toyota HSCC14 Challenge Air-Fuel Ratio controller
  Toyota Experimental Diesel Engine Airpath controller
Results on Industrial Airpath Controller
Found max overshoot with 7000+ simulations in 13 hours
Attempt to mine maximum observed settling time:
  stops after 4 iterations
  gives answer t_settle = simulation time horizon (shown in trace below)
Experimental Engine Control Model
[Jin, Donze, Deshmukh, Seshia, HSCC 2013]
Mining can expose deep bugs
Uncovered a tricky bug
  Discussion with control designer revealed it to be a real bug
  Root cause identified as wrong value in a look-up table, bug was fixed
Why mining could be useful for bug-finding:
  Can uncover subtle relations that should not hold
  Looking for bugs: Mine for negation of bug
Experimental Engine Control Model
Bug fixed: Settling time successfully mined
[Figure: OLD and NEW traces]
Industrial Case Studies with Toyota
Work with group @ Toyota Japan on enabling software verification by mining specifications on the closed-loop system
Useful in a production setting:
  Finds issues where previous methods fell short!
  Reduced human effort by 70%
[Yamaguchi et al., FMCAD16]
Toyota Unit's Experience with Model Checking
Total work hours: 560 min
[Chart: time per activity (making model, making property, revising property, revising model, executing model checking, mapping counterexample); annotations: 1 trial, 70 min, 40 min, 7 trials]
Making/revising property: 110 min
Mapping counterexample: 280 min, for just 1 module
[Yamaguchi et al., FMCAD16]
Overview of Methodology
1. Pre-condition mining (Breach): Pre-condition for software module
2. Software model checking (SLDV/CBMC): Module level counterexample
3. Simulation-Based Verification (Breach): System level counterexample
[Diagram: closed loop of controller (made of modules, one marked as target), sensor, and plant, with in/out ports; signals Pedal, Brake, WaterTemp, AirTemp]
[Yamaguchi et al., FMCAD16]
From CEGIS to Oracle-Guided Inductive Synthesis
Inductive Synthesis: Learning from Examples (ML)
Formal Inductive Synthesis: Learn from Examples while satisfying a Formal Specification
General Approach: Oracle-Guided Learning
Combine Learner with Oracle (e.g., Verifier) that answers Learner's Queries
[Diagram: LEARNER sends query to ORACLE; ORACLE returns response]
[Jha & Seshia, A Theory of Formal Synthesis via Inductive Learning, 2015, Acta Informatica 2017.]
Falsification of Cyber-Physical Systems with Machine Learning Components
T. Dreossi, A. Donze, and S. A. Seshia. Compositional Falsification of Cyber-Physical Systems with Machine Learning Components, In NASA Formal Methods Symposium, May 2017.
Challenges for Verified AI
System S, Environment E, Specification $\varphi$
Does S || E satisfy $\varphi$?
  YES [+ proof]
  NO [+ counterexample]
Design Correct-by-Construction instead?
Counterexamples, etc. from Rich Signal Spaces?
S. A. Seshia, D. Sadigh, S. S. Sastry. Towards Verified Artificial Intelligence. July 2016. https://arxiv.org/abs/1606.08514.
Problem: Verify Automotive System (CPS) that uses ML-based Perception
Focus:
  Falsification: finding scenarios that violate safety properties
  Test (Data) Generation: generate "interesting" data for training / testing, to improve accuracy
  Deep Neural Networks, given the increasing interest and use in the automotive context.
[Diagram: Sensor Input, Learning-Based Perception, Controller, Plant, Environment]
Automatic Emergency Braking System (AEBS)
[Diagram: Deep Learning-Based Object Detection, AEBS Controller, Plant, Environment]
Goal: Brake when an obstacle is near, to maintain a minimum safety distance
  Controller, Plant, Env models in Matlab/Simulink
Object detection/classification system based on deep neural networks
  Inception-v3, AlexNet, trained on ImageNet
  more recent: squeezeDet, Yolo, trained on KITTI
Our Approach: Use a System-Level Specification
Verify the Deep Neural Network Object Detector
Verify the System containing the Deep Neural Network
Formally Specify the End-to-End Behavior of the System
Controller Plant
Environment
Learning-Based Perception
STL Formula: $\mathbf{G}\,(\mathrm{dist}(\text{ego vehicle}, \text{env object}) > \Delta)$, with $\Delta$ the minimum safety distance
Approach: Simulation-Based Falsification
Challenge: Very High Dimensionality of Input Space!
Standard solution: Use Compositional (Modular) Verification
However: no formal spec. for neural network component!
Compositional Verification without Compositional Specification?!!
Our Approach: Combine Temporal Logic CPS Falsifier with ML Analyzer
[Diagram: CPS Falsifier takes Spec and Model and passes the Region of Uncertainty to the ML Analyzer, which returns interesting sensor data (images); output: Error?]
CPS Falsifier uses abstraction of ML component
  Optimistic analysis: assume ML classifier is always correct
  Pessimistic analysis: assume classifier is always wrong
Difference is the region of uncertainty where output of the ML component "matters"
Compositional: CPS Falsifier and ML Analyzer can be designed and run independently (& communicate)!
Identifying Region of Uncertainty (ROU) for Automatic Emergency Braking System
Perform Optimistic and Pessimistic Analyses on the Deep Neural Network
[Figure: three plots: ML always correct; ML always wrong; potentially unsafe region (ROU) depending on ML component (yellow)]
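The two analyses can be illustrated on a toy braking model. Everything in the sketch below is an assumption made for illustration (the point-mass dynamics, the braking range, the deceleration, and the 2 m safety distance); it is not the Simulink model used in the talk.

```python
def min_distance(d0, v0, detects, brake_range=30.0, decel=8.0,
                 dt=0.05, steps=200):
    """Hypothetical AEBS model: the ego vehicle approaches a static
    obstacle from distance d0 (m) at speed v0 (m/s) and brakes only if
    it is within brake_range and the detector reports the obstacle.
    Returns the closest distance reached over the simulation."""
    d, v, closest = d0, v0, d0
    for _ in range(steps):
        if detects and d <= brake_range:
            v = max(0.0, v - decel * dt)
        d -= v * dt
        closest = min(closest, d)
    return closest

def classify(d0, v0, safe_dist=2.0):
    """Label an initial condition using the two abstractions of the
    ML component."""
    safe_if_ml_correct = min_distance(d0, v0, detects=True) > safe_dist
    safe_if_ml_wrong = min_distance(d0, v0, detects=False) > safe_dist
    if safe_if_ml_wrong:
        return "safe"       # safe even if the ML component always fails
    if not safe_if_ml_correct:
        return "unsafe"     # unsafe even if the ML component is perfect
    return "uncertain"      # region of uncertainty: the ML output matters

labels = {(d0, v0): classify(d0, v0)
          for d0 in (40.0, 200.0) for v0 in (10.0, 30.0)}
```

Only the initial conditions labeled "uncertain" need to be handed to the ML analyzer, which is what keeps the two tools decoupled.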
Machine Learning Analyzer
Systematically Explore ROU in the Image (Sensor) Space
[Diagram: feature space (images), abstraction map, and abstract space A with axes car x-pos, car z-pos, brightness; neural network with a two-valued output]
Systematic Sampling (low-discrepancy sampling)
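One common low-discrepancy scheme is the Halton sequence; the slide does not say which sequence was used, so the choice here is an assumption. The sketch samples a three-dimensional abstract space whose axes stand for car x-pos, car z-pos, and brightness, each normalized to [0, 1] for illustration.

```python
def halton(index, base):
    """Radical-inverse value of a positive integer index in the given
    base (one coordinate of a Halton point)."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        result += fraction * (index % base)
        index //= base
        fraction /= base
    return result

def sample_abstract_space(n, ranges, bases=(2, 3, 5)):
    """Return n low-discrepancy points in the box given by ranges,
    using one prime base per dimension."""
    if len(ranges) > len(bases):
        raise ValueError("need one base per dimension")
    points = []
    for i in range(1, n + 1):
        point = tuple(lo + halton(i, base) * (hi - lo)
                      for base, (lo, hi) in zip(bases, ranges))
        points.append(point)
    return points

# Hypothetical normalized ranges for (car x-pos, car z-pos, brightness).
abstract_ranges = [(0.0, 1.0), (0.0, 1.0), (0.0, 1.0)]
samples = sample_abstract_space(50, abstract_ranges)
```

Each sampled point would then be rendered into an image and passed to the neural network; compared with uniform random sampling, the sequence covers the abstract space more evenly for the same number of images.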
Sample Result
Inception-v3 Neural Network (pre-trained on ImageNet using TensorFlow)
Misclassifications
This misclassification may not be of concern
Corner case image
But this one is a real hazard!
Image Streams
Superimposition of tests on background
Blind spots
Results on squeezeDet NN and KITTI dataset for autonomous driving
[Dreossi, Ghosh, et al., ICML 2017 workshop]
Verifier-Guided Training of Deep Neural Networks
Instance of Oracle-Guided Inductive Synthesis
Oracle is Verifier (CPS+ML Falsifier) used to perform counterexample-guided training of DNNs
Substantially increase accuracy with only few additional examples
[Diagram: DEEP NEURAL NETWORK and FALSIFIER (CPS + ML) in a loop; output: Learned Classifier]
Conclusion: Formal Methods meets Machine Learning
Formal Methods can play an important role in CPS Design with high assurance
  Industrial scale and machine learning pose particular challenges
Machine Learning → Formal Methods
  Formal Inductive Synthesis (of specifications, programs, etc.)
Formal Methods → Machine Learning
  Compositional reasoning about learning-based systems
Towards Verified Learning-based CPS
S. A. Seshia, D. Sadigh, S. S. Sastry. Towards Verified Artificial Intelligence. July 2016. https://arxiv.org/abs/1606.08514.
Challenges, each with its corresponding Principle:
1. Environment (incl. Human) Modeling: Data-Driven, Introspective Environment Modeling
2. Specification: System-Level Specification; Robustness/Quantitative Spec.
3. Learning Systems Complexity: Abstract & Explain
4. Efficient Training, Testing, Verification: Verification-Guided, Adversarial Analysis and Improvisation
5. Design for Correctness: Formal Inductive Synthesis
Exciting Times Ahead!!! Thank you!