
Formal Methods meets Machine Learning: Explorations in Cyber-Physical System Design

Sanjit A. Seshia
Professor, UC Berkeley

Joint work with: Jyo Deshmukh, Tommaso Dreossi, Alex Donze, Dorsa Sadigh, Susmit Jha, Xiaoqing Jin, Tomoyuki Kaga, Tomoya Yamaguchi, S. Shankar Sastry

Stanford University, December 4, 2017

Cyber-Physical Systems (CPS): Integration of computation with physical processes, defined by both cyber & physical

Application domains shown [E. A. Lee]: power generation and distribution, military systems, transportation (air traffic control at SFO), avionics, telecommunications, instrumentation (Soleil Synchrotron), automotive, building systems, factory automation.
Image credits: Kuka Robotics Corp., Doug Schmidt, General Electric, Siemens (E-Corner), Daimler-Chrysler.

Automotive domain representative of key societal challenges:
  Smart Cities / Infrastructure
  Energy Efficiency
  Climate Change
  Humans and Automation

Growing Use of Machine Learning/AI in Cyber-Physical Systems


Many Safety-Critical Systems

Growing Features, Growing Costs

70 to 100 ECUs in modern luxury cars, close to 100M LOC
Engine control: 1.7M LOC
F-22 Raptor: 1.7M LOC; Boeing 787: 6.5M LOC
Frost & Sullivan: 200M to 300M LOC
Electronics & Software: 35-40% of luxury car cost
[Figure: two charts with year axis 1988, 1997, 2002, 2009; from J. Deshmukh]

Charette, R., "This Car Runs on Code", IEEE Spectrum, http://spectrum.ieee.org/transportation/systems/this-car-runs-on-code

High Cost of Failures
  Safety-critical: human life at risk
  Recalls, production delays, lawsuits, etc.
  Toyota UA: $1.2B settlement with DoJ in 2014, lawsuits
  Tesla autopilot incidents: reasons still unclear

Formal Methods to the Rescue?

Industry need for higher assurance: increasing interest in Formal Methods

Formal methods = Mathematical, Algorithmic techniques for modeling, design, analysis
  Specification: WHAT the system must/must not do
  Verification: WHY it meets the spec (or not)
  Synthesis: HOW it meets the spec (correct-by-construction design)

Major success story: Digital circuit design
Can we address the challenges of CPS design?

Formal Methods meets Machine Learning

Machine Learning → Formal Methods
  Greater efficiency, ease of use/applicability
  Formal Inductive Synthesis

Formal Methods → Machine Learning
  Stronger assurances of safety/correctness for learning systems

Further details:
1. S. A. Seshia, Combining Induction, Deduction, and Structure for Verification and Synthesis, Proceedings of the IEEE, November 2015.
2. S. A. Seshia, D. Sadigh, and S. S. Sastry, Towards Verified Artificial Intelligence, July 2016, http://arxiv.org/abs/1606.08514

Outline

Synthesizing Requirements for Closed-Loop Control Systems
  Industrial Tech Transfer to Toyota

Falsification of Deep Learning based CPS
  Context: autonomous driving

Conclusion

Mining Requirements for Closed-Loop Control Systems

[Jin, Donze, Deshmukh, Seshia, HSCC 2013, TCAD 2015; Yamaguchi et al., FMCAD 2016]

Challenges for Verification of Control Systems

Closed-loop setting very complex
  software + physical artifacts
  nonlinear dynamics
  large look-up tables
  large amounts of switching

Requirements Incomplete/Informal
  Specifications often created concurrently with the design!
  Designers often only have informal intuition about what is "good behavior": "shape recognition"

[Figure: Experimental Engine Control Model]

Industry Problem: Applying Formal Methods to Legacy Systems

"It's working, but I don't understand why!"

Our Solution: Requirements Mining

Value added by mining:
  Mined Requirements become useful documentation
  Use for code maintenance and revision
  Use during tuning and testing


Designer's View of Our Solution

Tool extracts properties of closed-loop design using a Simulator
Designer reviews mined requirements
  "Settling time is 6.25 ms"
  "Overshoot is 100 units"
  Expressed in Signal Temporal Logic [Maler & Nickovic, 04]
[Figure: response trace annotated with settling time 6.25 ms and overshoot 100]


CounterExample Guided Inductive Synthesis (CEGIS)

[Diagram, built up over several slides: the CEGIS loop for requirement mining]

Find "Tightest" Properties
  Parametric Signal Temporal Logic (PSTL) template: Settling Time is ??, Overshoot is ??, Upper Bound on x is ??
  Parameter Synthesis (exploits monotonicity)
  Candidate requirement, e.g.: Settling Time is 5 ms, Overshoot is 5 KPa, Upper Bound on x is 3.6

Are there behaviors that do NOT satisfy these requirements?
  Optimization-based Falsification
  Counterexamples: returned to the property-finding step, which produces a refined candidate, e.g.: Settling Time is 5.3 ms, Overshoot is 5.1 KPa, Upper Bound on x is 3.8
  NO: Mined Requirement

[Jin, Donze, Deshmukh, Seshia, HSCC13; TCAD15]
Idea of CEGIS goes back to ASPLOS 2006 paper by Solar-Lezama et al.
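The mining loop can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the tool described in the talk: `simulate` is a hypothetical stand-in for the closed-loop model, the template is the single-parameter "Upper Bound on x is ??", the synthesis step simply takes the largest observed value, and the falsifier is a plain search over a fixed list of candidate inputs rather than an optimizer.

```python
import math

def simulate(u, horizon=10.0, dt=0.1):
    """Hypothetical closed-loop model: input u in [0, 1] -> output trace x(t)."""
    steps = int(round(horizon / dt)) + 1
    return [(1.0 + u) * (1.0 - math.exp(-i * dt) * math.cos(2.0 * i * dt))
            for i in range(steps)]

def synthesize(traces):
    """Tightest pi such that x(t) <= pi holds on every trace seen so far."""
    return max(max(trace) for trace in traces)

def falsify(pi, candidates):
    """Return the candidate input that violates x(t) <= pi the most, or None."""
    worst_u, worst_rho = None, 0.0
    for u in candidates:
        rho = min(pi - x for x in simulate(u))  # robustness of "x <= pi"
        if rho < worst_rho:
            worst_u, worst_rho = u, rho
    return worst_u

def mine_upper_bound(u0, candidates, max_iters=20):
    """CEGIS-style loop: synthesize a bound, try to falsify it, repeat."""
    traces = [simulate(u0)]
    for _ in range(max_iters):
        pi = synthesize(traces)
        counterexample = falsify(pi, candidates)
        if counterexample is None:
            return pi  # no violating behavior found: mined requirement
        traces.append(simulate(counterexample))
    raise RuntimeError("no stable requirement within max_iters")

if __name__ == "__main__":
    inputs = [i / 20 for i in range(21)]
    print("Upper Bound on x is", mine_upper_bound(0.0, inputs))
```

The mined bound is only as strong as the falsifier: it holds for the candidate inputs that were searched, not for every possible behavior.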


Signal Temporal Logic (STL)

Extension of Linear Temporal Logic (LTL) and Metric Temporal Logic (MTL)
Quantitative semantics: satisfaction of a property over a trace given real-valued interpretation
  Greater value: more easily satisfied
  Non-negative satisfaction value corresponds to Boolean satisfaction

Example: "For all time points between 60 and 100, the absolute value of x is below 0.1"
  $\mathbf{G}_{[60,100]}\,(|x| < 0.1)$
[Figure: trace x(t) for t from 0 to 100, with the band from -0.1 to +0.1 marked from t = 60]

Quantitative Satisfaction Function for STL

Function $\rho$ that maps STL formula $\varphi$ and a given trace (valuation of signals) to a numeric value

Example: for $\varphi = \mathbf{G}_{[60,100]}\,(|x| < 0.1)$, $\rho(\varphi, x) = \inf_{t \in [60,100]} (0.1 - |x(t)|)$

Quantifies "how much" a trace satisfies a property
  Large positive value: trace easily satisfies
  Small positive value: trace close to violating
  Negative value: trace does not satisfy
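For a sampled trace, the satisfaction value of the example property reduces to a minimum over the samples that fall in the time window. The sketch below assumes a trace given as parallel lists of time stamps and values; the function name and the sample traces are illustrative, not taken from the talk or from any STL tool.

```python
def robustness_always_abs_below(times, values, t_lo=60.0, t_hi=100.0, bound=0.1):
    """Satisfaction value of 'for all t in [t_lo, t_hi], |x(t)| < bound'.

    On sampled data the infimum over the window becomes a minimum
    of (bound - |x|) over the samples inside the window.
    """
    margins = [bound - abs(x) for t, x in zip(times, values) if t_lo <= t <= t_hi]
    if not margins:
        raise ValueError("trace has no samples in the time window")
    return min(margins)

if __name__ == "__main__":
    times = [float(t) for t in range(101)]
    settled = [1.0 if t < 60 else 0.05 for t in times]   # stays inside the band
    glitch = [0.3 if t == 80 else x for t, x in zip(times, settled)]
    print(robustness_always_abs_below(times, settled))   # positive: satisfied
    print(robustness_always_abs_below(times, glitch))    # negative: violated
```

Samples before t = 60 do not affect the value, which is why the large initial transient in the first trace is harmless.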


Parametric Signal Temporal Logic (PSTL)

Constants in STL formula replaced with parameters
  Scale parameters $\pi$
  Time parameters $\tau$

Examples:
  Between some time $\tau$ and 10 seconds, x remains greater than some value $\pi$
  After transmission shifts to gear 2, it remains in gear 2 for at least $\tau$ secs








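Parameter Synthesis = Find $\delta$-tight values of params (for suitably small $\delta$)

Want the value of $\pi$ corresponding to the "tightest" satisfaction over a set of traces
[Figure: traces x over time 0 to 100, with the values 2.9, 3, and 1 000 000 marked; 1 000 000 is labeled "Too loose"]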

Parameter Synthesis

Non-linear optimization problem
  Satisfaction function for STL is non-linear in general

Naive (strawman) approach:
  grid parameter space to precision $\delta$
  evaluate satisfaction value at each point
  pick valuation with smallest satisfaction value

Problem: Exponential number of grid points (in #parameters)



Satisfaction Monotonicity

If upper bound of all signals is 3, any number > 3 is also an upper bound
Satisfaction function monotonic in parameter value
Example: $\rho(\pi, x) = \inf_t (\pi - x(t))$
  For all x, $\rho(\pi, x)$ is a monotonic function of $\pi$
Advantage: if monotonic, use binary search over parameter space, otherwise exhaustive search
[Figure: traces over time 0 to 100, with the levels 3 and 4 marked]
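When the satisfaction function is monotonic in the parameter, a $\delta$-tight value can be found by bisection. The sketch below does this for the upper-bound example, where the satisfaction value is the minimum of (pi - x(t)) over the trace; the traces, the search interval, and the function names are assumptions chosen for illustration.

```python
def rho_upper_bound(pi, trace):
    """Satisfaction value of 'x(t) <= pi for all t' on a sampled trace."""
    return min(pi - x for x in trace)

def tightest_upper_bound(traces, lo, hi, delta):
    """Bisect for a delta-tight pi that every trace satisfies.

    Relies on monotonicity: if pi is satisfied, so is every larger value.
    """
    def satisfied(pi):
        return all(rho_upper_bound(pi, trace) >= 0 for trace in traces)

    if not satisfied(hi):
        raise ValueError("even the loosest candidate is violated")
    while hi - lo > delta:
        mid = (lo + hi) / 2.0
        if satisfied(mid):
            hi = mid   # mid works, so try something tighter
        else:
            lo = mid   # mid is violated, so the answer lies above it
    return hi

if __name__ == "__main__":
    traces = [[0.0, 1.0, 2.9], [0.5, 3.0, 2.0]]
    print(tightest_upper_bound(traces, lo=0.0, hi=1_000_000.0, delta=1e-3))
```

Starting from the "too loose" bound of 1 000 000, roughly 30 bisection steps narrow the answer to within 0.001 of the true maximum, whereas a grid at the same precision would need about a billion evaluations for this single parameter.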



Deciding Satisfaction Monotonicity

Need to decide whether: for all x, $\rho(\pi, x)$ is a monotonic function of $\pi$

Theorem: Deciding monotonicity of a PSTL formula is undecidable

Use an encoding to satisfiability modulo theories (SMT) solving
  Quantified formulas involving uninterpreted functions, and arithmetic over reals
  linear arithmetic if PSTL predicates are linear
  "Solved easily" with Z3









Black-Box Falsification Procedure

[Diagram: input signal u, black-box system output S(u), Falsification Tool]

Falsification as Optimization

Solve $\rho^* = \min_u \rho(\varphi, S(u))$
  Leverages quantitative semantics of STL
  Relies on standard numerical optimization methods (e.g. Nelder-Mead)

If $\rho^* < 0$, found falsifying trace!

Nonlinear Optimization Problem, No exact solution, Limited theoretical guarantees
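The idea can be illustrated with a small derivative-free search. The sketch below makes several assumptions that are not from the talk: the black-box system is a toy second-order step response with input u = (step amplitude, damping ratio), the property is "x(t) < 2.5 at all times", and the optimizer is a simple compass (coordinate) search used here in place of Nelder-Mead to keep the example dependency-free.

```python
import math

def simulate(u, horizon=20.0, dt=0.05):
    """Hypothetical black box S(u): underdamped step response, u = (amplitude, zeta)."""
    amplitude, zeta = u
    wd = math.sqrt(1.0 - zeta * zeta)  # damped frequency, needs zeta < 1
    trace = []
    for i in range(int(round(horizon / dt)) + 1):
        t = i * dt
        envelope = math.exp(-zeta * t)
        trace.append(amplitude * (1.0 - envelope *
                     (math.cos(wd * t) + zeta / wd * math.sin(wd * t))))
    return trace

def rho(u, bound=2.5):
    """Satisfaction value of 'x(t) < bound for all t' on the trace S(u)."""
    return min(bound - x for x in simulate(u))

def falsify(u0, bounds, step=0.25, tol=1e-3, max_evals=500):
    """Compass search that minimizes rho and stops once rho < 0."""
    u, best, evals = list(u0), rho(u0), 1
    while step > tol and evals < max_evals and best >= 0:
        improved = False
        for i, (lo, hi) in enumerate(bounds):
            for sign in (1.0, -1.0):
                cand = list(u)
                cand[i] = min(hi, max(lo, cand[i] + sign * step * (hi - lo)))
                value = rho(cand)
                evals += 1
                if value < best:
                    u, best, improved = cand, value, True
        if not improved:
            step /= 2.0  # refine the search pattern
    return u, best

if __name__ == "__main__":
    u_star, rho_star = falsify([1.5, 0.55], bounds=[(1.0, 2.0), (0.2, 0.9)])
    print("input:", u_star, "rho:", rho_star, "falsified:", rho_star < 0)
```

A negative result is a concrete counterexample; a non-negative result only means this particular search found no violation, in line with the limited guarantees noted above.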



Experimental Evaluation Summary [details in TCAD15 paper]

Defined Templates for Common Requirements in Automotive Control (all monotonic PSTL!!)
  Dwell-Time requirements
  Timed/Untimed Safety properties
  Timed Inevitability (bounded liveness)
  Input Profiles: assumptions on shape of input signals
  Control-theoretic requirements on output signals (bounded overshoot/undershoot, settling time, error from reference signal, etc.)

Three Benchmarks
  Simple Simulink Automatic Transmission Model
  Toyota HSCC14 Challenge Air-Fuel Ratio controller
  Toyota Experimental Diesel Engine Airpath controller

Results on Industrial Airpath Controller

Found max overshoot with 7000+ simulations in 13 hours
Attempt to mine maximum observed settling time:
  stops after 4 iterations
  gives answer t_settle = simulation time horizon
[Figure: trace showing the mined settling time]

[Jin, Donze, Deshmukh, Seshia, HSCC 2013]

Mining can expose deep bugs

Uncovered a tricky bug
  Discussion with control designer revealed it to be a real bug
  Root cause identified as wrong value in a look-up table, bug was fixed

Why mining could be useful for bug-finding:
  Can uncover subtle relations that should not hold
  Looking for bugs: Mine for negation of bug



Bug fixed: Settling time successfully mined
[Figure: traces labeled OLD and NEW]

Industrial Case Studies with Toyota

Work with group @ Toyota Japan on enabling software verification by mining specifications on the closed-loop system

Useful in a production setting:
  Finds issues where previous methods fell short!
  Reduced human effort by 70%

[Yamaguchi et al., FMCAD16]

Toyota Unit's Experience with Model Checking

Total work hour: 560 min
Making/revising property: 110 min
Mapping counterexample: 280 min for just 1 module
[Chart: work time per activity (Making model, Making property, Revising property, Revising model, Executing model checking, Mapping counterexample); annotations: 1 trial, 70 min, 40 min, 7 trials]

Overview of Methodology

1. Pre-condition mining (Breach): Pre-condition for software module
2. Software model checking (SLDV/CBMC): Module level counterexample
3. Simulation-Based Verification (Breach): System level counterexample
[Diagram: closed loop of controller, sensor, and plant; the controller contains several modules including the target module, with signals Pedal, Brake, WaterTemp, AirTemp between "in" and "out"]


From CEGIS to Oracle-Guided Inductive Synthesis

Inductive Synthesis: Learning from Examples (ML)
Formal Inductive Synthesis: Learn from Examples while satisfying a Formal Specification

[Jha & Seshia, "A Theory of Formal Synthesis via Inductive Learning", 2015, Acta Informatica 2017.]

General Approach: Oracle-Guided Learning
Combine Learner with Oracle (e.g., Verifier) that answers Learner's Queries
[Diagram: LEARNER sends query to ORACLE; ORACLE returns response]

Falsification of Cyber-Physical Systems with Machine Learning Components


T. Dreossi, A. Donze, and S. A. Seshia. Compositional Falsification of Cyber-Physical Systems with Machine Learning Components, In NASA Formal Methods Symposium, May 2017.

Challenges for Verified AI

[Diagram: the verification problem]
  Inputs: System S, Environment E, Specification $\varphi$
  Question: Does S || E satisfy $\varphi$?
  Outputs: YES [+ proof] or NO [+ counterexample]
  Annotated challenges: Design Correct-by-Construction instead? Counterexamples, etc. from Rich Signal Spaces?

S. A. Seshia, D. Sadigh, S. S. Sastry. Towards Verified Artificial Intelligence. July 2016. https://arxiv.org/abs/1606.08514.

Problem: Verify Automotive System (CPS) that uses ML-based Perception

Focus:
  Falsification: finding scenarios that violate safety properties
  Test (Data) Generation: generate "interesting" data for training / testing to improve accuracy
  Deep Neural Networks, given the increasing interest and use in the automotive context

[Diagram: closed loop of Environment, Sensor Input, Learning-Based Perception, Controller, Plant]

Automatic Emergency Braking System (AEBS)

[Diagram: closed loop of Environment, Deep Learning-Based Object Detection, AEBS Controller, Plant]

Goal: Brake when an obstacle is near, to maintain a minimum safety distance
  Controller, Plant, Env models in Matlab/Simulink

Object detection/classification system based on deep neural networks
  Inception-v3, AlexNet, trained on ImageNet
  more recent: squeezeDet, Yolo, trained on KITTI

Our Approach: Use a System-Level Specification

Verify the Deep Neural Network Object Detector
Verify the System containing the Deep Neural Network
Formally Specify the End-to-End Behavior of the System

[Diagram: closed loop of Environment, Learning-Based Perception, Controller, Plant]

STL Formula: $\mathbf{G}\,(\mathrm{dist}(\text{ego vehicle}, \text{env object}) > \Delta)$

Approach: Simulation-Based Falsification

Challenge: Very High Dimensionality of Input Space!

Standard solution: Use Compositional (Modular) Verification

However: no formal spec. for neural network component!

Compositional Verification without Compositional Specification?!!

Our Approach: Combine Temporal Logic CPS Falsifier with ML Analyzer

[Diagram: CPS Falsifier and ML Analyzer, with Spec and Model as inputs; they exchange the Region of Uncertainty and interesting sensor data (images); output: Error?]

CPS Falsifier uses abstraction of ML component
  Optimistic analysis: assume ML classifier is always correct
  Pessimistic analysis: assume classifier is always wrong
  Difference is the region of uncertainty where output of the ML component "matters"

Compositional: CPS Falsifier and ML Analyzer can be designed and run independently (& communicate)!

Identifying Region of Uncertainty (ROU) for Automatic Emergency Braking System

Perform Optimistic and Pessimistic Analyses on the Deep Neural Network
[Figure: three panels: ML always correct; ML always wrong; potentially unsafe region (ROU) depending on ML component (yellow)]
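The optimistic and pessimistic analyses can be illustrated on a toy braking scenario. Everything in this sketch is an assumption made for illustration: a one-dimensional ego vehicle approaching a static obstacle, a fixed sensor range and braking deceleration, and a 2 m safety distance. It is not the Simulink model used in the talk.

```python
def min_distance(d0, v0, detects, dt=0.01, horizon=10.0,
                 sensor_range=30.0, brake_decel=8.0):
    """Closest approach to the obstacle for initial distance d0 and speed v0.

    `detects(d)` abstracts the ML component: does it report the obstacle
    when the true distance is d?
    """
    d, v, closest = d0, v0, d0
    for _ in range(int(round(horizon / dt))):
        if d <= sensor_range and detects(d):
            v = max(0.0, v - brake_decel * dt)  # brake while obstacle is seen
        d -= v * dt
        closest = min(closest, d)
    return closest

def classify(d0, v0, safety_distance=2.0):
    """Label an initial condition as safe, unsafe, or in the region of uncertainty."""
    ok_if_ml_correct = min_distance(d0, v0, lambda d: True) > safety_distance
    ok_if_ml_wrong = min_distance(d0, v0, lambda d: False) > safety_distance
    if ok_if_ml_wrong:
        return "safe"        # safe even when the detector always fails
    if not ok_if_ml_correct:
        return "unsafe"      # unsafe even with a perfect detector
    return "uncertain"       # outcome depends on the ML component

if __name__ == "__main__":
    for v0 in (1.0, 10.0, 25.0):
        print(f"d0=50 m, v0={v0} m/s ->", classify(50.0, v0))
```

Only the "uncertain" initial conditions need to be handed to the ML analyzer, which is what shrinks the search space.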

Machine Learning Analyzer

Systematically Explore ROU in the Image (Sensor) Space
[Diagram: Abstract space A (axes: brightness, car z-pos, car x-pos), Abstraction map, Feature space, Neural network]

Systematic Sampling (low-discrepancy sampling)
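One common low-discrepancy sequence is the Halton sequence, which covers a box more evenly than independent random draws. The sketch below samples a three-dimensional abstract space; the choice of Halton, the parameter ranges, and the function names are assumptions for illustration, since the slide does not say which sequence is used.

```python
def halton(index, base):
    """index-th element (index >= 1) of the van der Corput sequence in `base`."""
    result, fraction = 0.0, 1.0
    while index > 0:
        fraction /= base
        result += fraction * (index % base)
        index //= base
    return result

def sample_abstract_space(n, bounds, bases=(2, 3, 5)):
    """n Halton points scaled to the box given by `bounds` (one (lo, hi) per axis)."""
    if len(bounds) > len(bases):
        raise ValueError("need one coprime base per dimension")
    points = []
    for k in range(1, n + 1):
        points.append(tuple(lo + halton(k, base) * (hi - lo)
                            for base, (lo, hi) in zip(bases, bounds)))
    return points

if __name__ == "__main__":
    # Hypothetical ranges for (brightness, car x-pos, car z-pos).
    bounds = [(0.5, 1.5), (-1.0, 1.0), (5.0, 50.0)]
    for point in sample_abstract_space(5, bounds):
        print(point)
```

Each sampled point would then be rendered into an image and passed to the neural network; that rendering step depends on the image generator and is left out here.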

Sample Result

Inception-v3 Neural Network (pre-trained on ImageNet using TensorFlow)
[Figure: sampled images with misclassifications marked]
This misclassification may not be of concern

Sample Result

Inception-v3 Neural Network (pre-trained on ImageNet using TensorFlow)
[Figure: misclassifications, with a corner case image highlighted]
But this one is a real hazard!

Image Streams

Superimposition of tests on background
Blind spots

Results on squeezeDet NN and KITTI dataset for autonomous driving

[Dreossi, Ghosh, et al., ICML 2017 workshop]

Verifier-Guided Training of Deep Neural Networks

Instance of Oracle-Guided Inductive Synthesis
Oracle is Verifier (CPS+ML Falsifier) used to perform counterexample-guided training of DNNs
Substantially increase accuracy with only a few additional examples
[Diagram: DEEP NEURAL NETWORK, Learned Classifier, FALSIFIER (CPS + ML)]

Conclusion: Formal Methods meets Machine Learning

Formal Methods can play an important role in CPS Design with high assurance
  Industrial scale and machine learning pose particular challenges

Machine Learning → Formal Methods
  Formal Inductive Synthesis (of specifications, programs, etc.)

Formal Methods → Machine Learning
  Compositional reasoning about learning-based systems

Towards Verified Learning-based CPS

S. A. Seshia, D. Sadigh, S. S. Sastry. Towards Verified Artificial Intelligence. July 2016. https://arxiv.org/abs/1606.08514.

Challenges and corresponding Principles:
1. Environment (incl. Human) Modeling: Data-Driven, Introspective Environment Modeling
2. Specification: System-Level Specification; Robustness/Quantitative Spec.
3. Learning Systems Complexity: Abstract & Explain
4. Efficient Training, Testing, Verification: Verification-Guided, Adversarial Analysis and Improvisation
5. Design for Correctness: Formal Inductive Synthesis

Exciting Times Ahead!!! Thank you!
