
Timed Games STRATEGO and Stochastic Priced Timed Gamespeople.cs.aau.dk › ~kgl › GRAZ17 ›...

Date post: 03-Jul-2020
Timed Games and Stochastic Priced Timed Games Synthesis & Machine Learning TIGA STRATEGO Kim G. Larsen – Aalborg University DENMARK
Transcript
Page 1: Timed Games STRATEGO and Stochastic Priced Timed Gamespeople.cs.aau.dk › ~kgl › GRAZ17 › GRAZ5.pdf · 2017-05-30 · 1 =[V 1,V 2] [4.9,25.1] s.t I 1 is m-stable i.e. from any

Timed Games and

Stochastic Priced Timed Games

Synthesis & Machine Learning

TIGA

STRATEGO

Kim G. Larsen

– Aalborg University

DENMARK

Page 2

Overview

Timed Automata: Decidability (regions), Symbolic Verification (zones)

Priced Timed Automata: Decidability (priced regions), Symbolic Verification (priced zones)

Stochastic Timed Automata: Stochastic Semantics, Statistical Model Checking, Stochastic Hybrid Automata

Timed Games & Stochastic Priced Timed Games: Symbolic Synthesis, Reinforcement Learning, Applications

TU Graz, May 2017 Kim Larsen [2]

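[Figure: timeline of the UPPAAL tool family. Tools shown: CLASSIC, CORA, TRON, TIGA, ECDAR, SMC, STRATEGO. Activity labels: Verification, Optimization, Testing, Synthesis, Component, Performance Analysis, Optimal Synthesis. Years shown: 1995, 2001, 2004, 2005, 2010, 2011, 2014.]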

Page 4

Timed Automata & Model Checking

TU Graz, May 2017 Kim Larsen [4]

State: (L1, x=0.81)

Transitions: (L1, x=0.81) -2.1-> (L1, x=2.91) -> (goal, x=2.91)

E<> goal ?

A<> goal ?

A[] ¬L4 ?

Page 5

Timed Game

TU Graz, May 2017 Kim Larsen [5]

Page 6

Timed Game & Synthesis

[Figure: timed game automaton; the label x<=1 appears twice.]

TU Graz, May 2017 Kim Larsen [6]

Page 7

Decidability of Timed Games

TU Graz, May 2017 Kim Larsen [7]

Page 8

Untimed and Timed Games

Reachability / Safety Games

[Figure: game graph with locations 1, 2, 3, 4 and controllable and uncontrollable edges; edge labels shown: x>1, x≤1, x<1, x:=0, x<1, x≤1, x≥2.]

TU Graz, May 2017 [8]

Page 9

Untimed Games

[Figure: game graph with controllable and uncontrollable edges.]

Memoryless Strategy: $F : Q \to E_c$

Winning Run r: $States(r) \cap G \neq \emptyset$

Winning Strategy: $Runs(F) \subseteq WinRuns$

TU Graz, May 2017 [9]

Page 12

Untimed Games

[Figure: game graph with controllable and uncontrollable edges.]

Backwards Fixed-Point Computation

$cPred(X) = \{ q \in Q \mid \exists q' \in X.\ q \xrightarrow{c} q' \}$

$uPred(X) = \{ q \in Q \mid \exists q' \in X.\ q \xrightarrow{u} q' \}$

$\pi(X) = cPred(X) \setminus uPred(X^C)$

Theorem: The set of winning states is obtained as the least fixpoint of the function $X \mapsto \pi(X) \cup Goal$.

TU Graz, May 2017 [12]
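
A minimal Python sketch of the backwards fixed-point computation on this slide, assuming a finite game given as explicit sets of controllable and uncontrollable edges. The names `winning_states`, `c_edges` and `u_edges`, and the four-state example game, are illustrative assumptions and are not taken from the slides or from UPPAAL Tiga.

```python
def winning_states(states, c_edges, u_edges, goal):
    """Least fixpoint of X -> pi(X) | goal, where
    pi(X) = cPred(X) minus uPred(complement of X)."""

    def c_pred(target):
        # States with a controllable edge into `target`.
        return {q for (q, q2) in c_edges if q2 in target}

    def u_pred(target):
        # States with an uncontrollable edge into `target`.
        return {q for (q, q2) in u_edges if q2 in target}

    win = set()  # start from the empty set to obtain the least fixpoint
    while True:
        nxt = (c_pred(win) - u_pred(states - win)) | goal
        if nxt == win:
            return win
        win = nxt


# Hypothetical game: the controller can go 1 -> 2 -> 4,
# but the environment can divert 1 -> 3.
STATES = {1, 2, 3, 4}
C_EDGES = {(1, 2), (2, 4)}
U_EDGES = {(1, 3)}
GOAL = {4}

print(sorted(winning_states(STATES, C_EDGES, U_EDGES, GOAL)))  # prints [2, 4]
```

State 1 is not winning in this example because the uncontrollable edge to state 3 leaves the winning set, which is exactly what subtracting uPred of the complement expresses.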

Page 18

Computing Winning States

TU Graz, May 2017 Kim Larsen [18]

Page 19

Reachability Games: Backwards Fixed-Point Computation

Definitions:

$cPred(X) = \{ q \in Q \mid \exists q' \in X.\ q \xrightarrow{c} q' \}$

$uPred(X) = \{ q \in Q \mid \exists q' \in X.\ q \xrightarrow{u} q' \}$

$Pred_t(X,Y) = \{ q \in Q \mid \exists t.\ q^t \in X \text{ and } \forall s \le t.\ q^s \in Y^C \}$

$\pi(X) = Pred_t[\, X \cup cPred(X),\ uPred(X^C) \,]$

Theorem: The set of winning states is obtained as the least fixpoint of the function $X \mapsto \pi(X) \cup Goal$.

[Figure: illustration of $Pred_t(X,Y)$ relative to the sets X and Y.]

Kim Larsen [19] TU Graz, May 2017
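
The timed predecessor operator can be illustrated in a deliberately simplified setting: a single location whose states are the integer clock values 0..horizon, where delaying by t moves from q to q + t. This discretization and the names `pred_t` and `horizon` are assumptions made for the sketch; the symbolic algorithms cited in these slides operate on zones instead.

```python
def pred_t(x_set, y_set, horizon):
    """Discretized Pred_t(X, Y): states q from which some delay t reaches X
    while every state q + s with s <= t stays outside Y."""
    result = set()
    for q in range(horizon + 1):
        for t in range(horizon - q + 1):
            if q + t in y_set:
                break  # delaying further would pass through Y
            if q + t in x_set:
                result.add(q)  # reached X without touching Y
                break
    return result


# Hypothetical sets on the time line 0..8.
print(sorted(pred_t({5, 6}, {3}, 8)))  # prints [4, 5, 6]
```

Values 0 to 2 are excluded because any delay towards X must pass through the forbidden value 3.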

Page 20

Symbolic On-the-fly Algorithms for Timed Games [CDF+05, BCD+07]

Symbolic version of the on-the-fly MC algorithm for modal mu-calculus (Liu & Smolka 98)

Kim Larsen [20] TU Graz, May 2017

Page 21

UPPAAL Tiga [CDF+05, BCD+07]

Reachability properties:
control: A[ p U q ] (until)
control: A<> q, equivalent to control: A[ true U q ]

Safety properties:
control: A[ p W q ] (weak until)
control: A[] p, equivalent to control: A[ p W false ]

Time-optimality:
control_t*(u,g): A[ p U q ]
u is an upper bound used to prune the search
g is the time to the goal from the current state

TU Graz, May 2017 Kim Larsen [21]

Page 22

UPPAAL Tiga

TU Graz, May 2017 Kim Larsen [22]

DEMO

Page 23

Model Checking (ex Train Gate)

TU Graz, May 2017 Kim Larsen [23]

Property: never two trains at the crossing at the same time

Environment

Controller

Page 24

Synthesis (ex Train Gate)

TU Graz, May 2017 Kim Larsen [24]

Property: never two trains at the crossing at the same time

Environment

Controller: ?

Page 25

Timed Games

TU Graz, May 2017 Kim Larsen [25]

Property: never two trains at the crossing at the same time

Controllable / Uncontrollable

Find a strategy for the controllable actions such that the behaviour satisfies the property.

Controller

Environment

Page 26

TU Graz, May 2017

Synthesis Demo

Kim Larsen [26]

Page 27

A Buggy Brick Sorting Program

First UPPAAL model: Sorting of Lego Boxes

Exercise: Design a controller so that only yellow boxes are pushed out.

[Figure: conveyor belt with black and yellow boxes and a piston. Labels shown: Boxes, Piston, Black, Yellow, Conveyer Belt, Controller, Ken Tindell, MAIN, PUSH, Blck, Yel, remove, eject. Values shown: 9, 18, 81, 90, 99.]

MCD 2001, Twente, Kim G. Larsen

TU Graz, May 2017 [27]

Page 28

Brick Sorting

[Figure: models labelled Piston, Generic Plate, and Controller.]

TU Graz, May 2017 [28]

Page 30

Problem: avoid having the plates falling down

The Chinese Juggling Problem

thanks to Oded Maler

Kim G Larsen

Page 31

Balancing Plates / Timed Automata

A Plate

The Juggler

E[] ¬(Plate1.Bang or Plate2.Bang or …)

Kim G Larsen [31] China Summer 2009

Page 32

Balancing Plates / Time Uncertainty

Strategy

BDD/CDD

Kim G Larsen [32] China Summer 2009

Page 33

Production Cell Overview

Realistic case study described in several formalisms (1994 and later).

Objective: stamp metal plates in a press.

Components: feed belt, two-armed robot, press, and deposit belt.

Kim Larsen [33] TU Graz, May 2017

Page 34

Production Cell in UPPAAL Tiga

TU Graz, May 2017 Kim Larsen [34]

Page 35

Experimental Results

[CDF+05]

[BCD+07]

Kim Larsen [35]TU Graz, May 2017

Page 36

Plastic Injection Molding Machine

Robust and optimal control

Tool Chain

Synthesis: UPPAAL TIGA

Verification: PHAVer

Performance: SIMULINK

40% improvement over existing solutions.

Quasimodo

[CJL+09]

Kim Larsen [36] TU Graz, May 2017

Page 37

Oil Pump Control Problem

R1: stay within safe interval [4.9,25.1]

R2: minimize average/overall oil volume

Kim Larsen [37]TU Graz, May 2017

Quasimodo

Page 38

The Machine (consumption)

Infinite cyclic demand to be satisfied by our control strategy.

P: latency 2 s between state change of pump

F: noise 0.1 l/s

Kim Larsen [38]TU Graz, May 2017

Quasimodo

Page 39

Hybrid Game Model

TU Graz, May 2017 Kim Larsen [39]

Page 40

Abstract Game Model

UPPAAL Tiga offers games of perfect information

Abstract game model such that states only contain information about:

• Volume of oil at the beginning of the cycle

• The ideal volume as predicted by the consumption cycle

• Current time within the cycle

• State of the pump (on/off)

Discrete model

DV, V_rate

V_acctime

Kim Larsen [40]TU Graz, May 2017

Quasimodo

Page 41

Machine (uncontrollable)

Checks whether V under noise gets outside [Vmin+0.1, Vmax-0.1]

Kim Larsen [41]TU Graz, May 2017

Quasimodo

Page 42

Pump (controllable)

Every 1 (one) second

Kim Larsen [42]TU Graz, May 2017

Quasimodo

Page 43

TU Graz, May 2017

Global Approach

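Find some interval $I_1 = [V_1, V_2] \subseteq [4.9, 25.1]$ such that:

• $I_1$ is m-stable, i.e. from any $V_0 \in I_1$ there is a strategy such that, whatever the fluctuation, the volume is always within $[5, 25]$ and at the end within $I_2 = [V_1 + m, V_2 - m]$

• $I_1$ is optimal among all m-stable intervals.

[Figure: volume axis with ticks 0, 5, 10, 15, 20, 25 against a time axis from 0 s to 20 s; intervals I1 and I2 are marked.]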
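
The m-stability condition on this slide can be written out as a single formula. This is a sketch of one possible formalization, not the authors' definition: the symbol $\sigma$ for the strategy, $V(t)$ for the volume at time $t$, and the cycle length $T = 20\,\mathrm{s}$ (read off the time axis of the plot) are assumptions.

```latex
\[
I_1 = [V_1, V_2] \text{ is } m\text{-stable}
\iff
\exists \sigma.\ \forall V_0 \in I_1.\ \forall \text{ fluctuations}.\
\Bigl( \forall t \in [0, T].\ V(t) \in [5, 25] \Bigr)
\;\wedge\; V(T) \in I_2,
\qquad I_2 = [V_1 + m,\ V_2 - m].
\]
```

Since $I_2 \subseteq I_1$ whenever $m \ge 0$, a strategy that works for one cycle can be reapplied at the start of the next cycle.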

Page 44

Tool Chain

Strategy Synthesis: TIGA

Verification: PHAVER

Performance Evaluation: SIMULINK

Guaranteed Correctness and Robustness, with 40% Improvement

Quasimodo

Kim Larsen [44]TU Graz, May 2017

Page 45

Synthesis of Home Automation

TU Graz, May 2017 Kim Larsen [45]

Page 46

What else ?

Timed Games w Partial Observability
• Action-based observation: undecidable [BDMP03]
• Finite observation of states: decidable [CDL+07]

Priced Timed Games
• Acyclic, cost non-zeno: decidable [LTMM02] [BCFL04]
• 1 clock: decidable [BLMR06]
• >2 clocks: undecidable [BBR05, BBM06]
• 2 clocks: open

Energy Games: Several Open Problems

Exponential Observers

Climate Controller in Pig Stables [JRLD07]

CHESS Way [Quasimodo@ESWEEK]

TU Graz, May 2017 Kim Larsen [46]

Page 48

Going to Uppsala – in 1 hour

TU Graz, May 2017 Kim Larsen [48]

[Figure: stochastic timed game for travelling to Uppsala. Branch probabilities shown: 0.9, 0.9, 0.1, 0.1. Delay distributions shown: U[42,45], U[0,35], U[0,20], U[0,140].]

Optimal WC Strategy (2-player): Take bike, WC = 45

Optimal Expected Strategy (1½ player): Take car, E = 16, WC = 140

Optimal Expected Strategy guaranteeing WC <= 60: ?????
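
The numbers on this slide can be checked with a short Python sketch. It assumes that the car option is a mixture of U[0,20] with probability 0.9 and U[0,140] with probability 0.1, and that the bike option is U[42,45]; this pairing is not stated in the transcript and was chosen because it is consistent with the slide's E = 16, WC = 140 and WC = 45. The function name is illustrative.

```python
def expected_and_worst_case(branches):
    """branches: list of (probability, low, high), a mixture of uniform delays.
    Returns (expected delay, worst-case delay)."""
    expected = sum(p * (low + high) / 2.0 for p, low, high in branches)
    worst = max(high for p, low, high in branches if p > 0)
    return expected, worst


# Assumed branch structure (see the lead-in above).
CAR = [(0.9, 0.0, 20.0), (0.1, 0.0, 140.0)]
BIKE = [(1.0, 42.0, 45.0)]

car_e, car_wc = expected_and_worst_case(CAR)
bike_e, bike_wc = expected_and_worst_case(BIKE)
print(f"car:  E = {car_e:.1f}, WC = {car_wc:.0f}")
print(f"bike: E = {bike_e:.1f}, WC = {bike_wc:.0f}")
```

Under these assumptions the car has the lower expectation but a worst case above the 60-minute bound, while the bike meets the bound with a higher expectation, which is the tension the last question on the slide points at.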

Page 49

DEMO

Page 50

[Figure: UPPAAL STRATEGO overview. Nodes: G (Timed Game), σ (Strategy), P (Stochastic Priced Timed Game), G|σ (Timed Automata), P|σ, σ° (optimized Strategy), P|σ° (Stochastic Priced Timed Automata). Edge labels: synthesis, abstraction, φ, minE(cost), maxE(gain).]

Uppaal TIGA:
strategy NS = control: A<> goal
strategy NS = control: A[] safe

Statistical Learning:
strategy DS = minE (cost) [<=10]: <> done under NS
strategy DS = maxE (gain) [<=10]: <> done under NS

Uppaal:
E<> error under NS
A[] safe under NS

Uppaal SMC:
simulate 5 [<=10] {e1, e2} under SS
Pr[<=10] (<> error) under SS
E[<=10;100] (max: cost) under SS

Page 51

Timed Games

TU Graz, May 2017 Kim Larsen [51]

Strategy: Memoryless, deterministic, most permissive.

[Figure: timed game automaton with controllable and uncontrollable edges.]

TIGA

Run:

$\pi = (\mathrm{INIT}, x=0) \xrightarrow{50.1} \xrightarrow{r} (\mathrm{CHOICE}, x=0) \xrightarrow{2.4} \xrightarrow{b} (\mathrm{B}, x=0) \xrightarrow{20.3} \xrightarrow{d} (\mathrm{END}, x=20.3)$

Total time = 50.1 + 2.4 + 20.3 = 72.8

Page 52

Timed Games – Time Bounded Reachability

TU Graz, May 2017 Kim Larsen [52]

Objective: 𝐴⟨⟩(END ∧ time ≤ 210)

Deterministic, memoryless strategy:

Most permissive, memoryless strategy:

[Figure: strategy plots over x and time with axis ticks at 100 and 200 and thresholds at 70 and 90; regions are labelled w, a, b and λ.]

Page 53

Priced Timed Games

TU Graz, May 2017 Kim Larsen [53]

• Cost optimal strategy: take b immediately, WC = 280

"CORA"

COST

Priced Run (delay above each arrow, cost of that delay below):

$\pi = (\mathrm{Init}, x=0) \xrightarrow[0]{50.1} \xrightarrow{r} (\mathrm{CHOICE}, x=0) \xrightarrow[4.8]{2.4} \xrightarrow{b} (\mathrm{B}, x=0) \xrightarrow[60.9]{20.3} \xrightarrow{d} (\mathrm{END}, x=20.3)$

Total cost = 0 + 4.8 + 60.9 = 65.7
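
The arithmetic of the priced run can be reproduced with a small Python sketch. The cost rates 0, 2 and 3 per time unit are not stated in the transcript; they are inferred here from the slide's own numbers (4.8 / 2.4 = 2 and 60.9 / 20.3 = 3), and discrete transitions are assumed to cost nothing. The function name is illustrative.

```python
def run_totals(steps):
    """steps: list of (delay, cost_rate) pairs, one per location visited.
    Returns (total time, total cost), where cost accrues as rate * delay."""
    total_time = sum(delay for delay, _ in steps)
    total_cost = sum(delay * rate for delay, rate in steps)
    return total_time, total_cost


# Delays from the slide; cost rates are inferred assumptions.
RUN = [(50.1, 0.0), (2.4, 2.0), (20.3, 3.0)]

time_total, cost_total = run_totals(RUN)
print(round(time_total, 1), round(cost_total, 1))  # prints 72.8 65.7
```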

Page 54

Priced Timed MDP

TU Graz, May 2017 Kim Larsen [54]

• Cost optimal strategy: take b immediately, WC = 280

• Optimal expected cost strategy: take a immediately, expectation = 160

[Figure: priced timed MDP. Labels shown: UNIFORM[0,100], Controllable, "SMC", COST.]

Page 55

Priced Timed MDP

TU Graz, May 2017 Kim Larsen [55]

Cost optimal strategy: take b immediately, overall = 280

Optimal expected cost strategy: take a immediately, expectation = 160

Minimal expected cost while guaranteeing END is reached within time 210:

Strat.: t>90 (100,w); t>70 (0,a); t<70 (0,b)

= 204

[Figure: priced timed MDP. Labels shown: UNIFORM[0,100], Controllable, "SMC", COST.]

Page 56

Stochastic Strategies for Learning!

TU Graz, May 2017 Kim Larsen [56]

Objective: 𝐴⟨⟩(END ∧ time ≤ 210)

Most permissive, memoryless strategy:

Cost optimal deterministic sub-strategy!

Stochastic Strategies

[Figure: strategy plots over x and time with axis ticks at 100 and 200 and thresholds at 70 and 90; regions are labelled w, a, b and λ.]

Page 57

Reinforcement Learning

TU Graz, May 2017 Kim Larsen [57]

Time Bounded Reachability(G,T)

TIGA

SMC

SMC

Page 58

Learned Strategies

TU Graz, May 2017 Kim Larsen [58]

More plots of runs according to the strategies learned.

[Figure: plots with regions labelled λ, a and b.]

Covariance Matrices

Page 64

Strategies – Representation

TU Graz, May 2017 Kim Larsen [64]

Nondeterministic Strategies: $\sigma^n(\ell, v) \subseteq \Sigma_c \cup \{\lambda\}$

Stochastic Strategies: $\mu^s(\ell, v) : \Sigma_c \cup \{\lambda\} \to [0,1]$

Representations: Covariance Matrices, Splitting, Logistic Regression

𝑅ℓ𝑅ℓ
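
The two strategy types on this slide can be illustrated in Python for a single state (ℓ, v). The sketch below turns a set of permitted actions (a nondeterministic strategy at that state) into a uniform distribution over them (one possible stochastic strategy) and samples from it. The uniform choice, the action names, the reading of λ as "delay", and the function names are all assumptions for illustration; this is not how UPPAAL STRATEGO represents or learns strategies.

```python
import random

LAMBDA = "lambda"  # stands for λ, taken here to mean "delay" (an assumption)


def uniform_stochastic(permitted):
    """Map a nonempty set of permitted actions to a uniform distribution over it."""
    if not permitted:
        raise ValueError("a strategy must permit at least one action")
    actions = sorted(permitted)
    return {action: 1.0 / len(actions) for action in actions}


def sample_action(mu, rng):
    """Draw one action according to the distribution mu (action -> probability)."""
    actions = list(mu)
    weights = [mu[action] for action in actions]
    return rng.choices(actions, weights=weights, k=1)[0]


# Hypothetical state where the nondeterministic strategy permits a, b and λ.
permitted = {"a", "b", LAMBDA}
mu = uniform_stochastic(permitted)
rng = random.Random(0)
print(sample_action(mu, rng))
```

Because the distribution only has support on the permitted actions, any sampled run stays inside the nondeterministic strategy it was derived from.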

Page 65

Stochastic Timed Game

TU Graz, May 2017 Kim Larsen [65]

Page 66

DEMO

Page 67

Safe & Adaptive Cruise Control

TU Graz, May 2017 Kim Larsen [67]

Q1: Find the most permissive strategy ensuring safety.

Q2: Find the optimal sub-strategy that will allow Ego to go as close as possible.

[Figure: two cars, EGO and FRONT.]

Page 68

Two Player Game (simplified)

TU Graz, May 2017 Kim Larsen [68]

Page 69

Discretization

TU Graz, May 2017 Kim Larsen [69]

Discrete

Continuous

Page 70

Front (complete)

TU Graz, May 2017 Kim Larsen [70]

Page 71

No Strategy

TU Graz, May 2017 Kim Larsen [71]

Page 72

Safety Strategy

TU Graz, May 2017 Kim Larsen [72]

Page 74

Safe and Optimal Strategy

TU Graz, May 2017 Kim Larsen [74]

Page 76

Optimal and Safe Strategy

TU Graz, May 2017 Kim Larsen [76]

Page 77

Safety Strategy (Code)

TU Graz, May 2017 Kim Larsen [77]

Page 78

Synthesis of Climate Controllers

TU Graz, May 2017 Kim Larsen [78]

TACAS16

Page 81

From Verification to Synthesis and Optimization

TU Graz, May 2017 Kim Larsen [81]

[Figure labels: 1 ½, 2, 1½.]

Page 82

More Practical Synthesis …

• Zone-based climate control for pig stables

• Profit-optimal, minimal-wear and energy-aware schedules for satellites

• Personalized light control in home automation

• Energy- and comfort-optimal floor heating

• Intelligent Traffic Control

• Safe and optimal car maneuvers

Names shown: Skov, HYDAC, SELUXIT, GOMSpace, Mathias G Sørensen

LASSO: Learning, Analysis, SynthesiS and Optimization of Cyber-Physical Systems

TU Graz, May 2017 [82]

Page 83

Ongoing Work

• Verification of learned strategy (zonification)

• Full correctness of hybrid control obtained from discretization. Combination with Invariance Analysis for Switched Controller Synthesis; Metrics; Verification using SpaceEX;

• From optimal strategy of bounded horizon to optimal infinite strategy.

• Strategies: complexity (space and time) and permissiveness

• On-line synthesis may be too slow for some applications: Satellites > Floor-heating > Traffic > Power Electronics

• Learn Neural Network representation of optimal strategy

TU Graz, May 2017 Kim Larsen [83]

Page 84

LASSO: Learning, Analysis, SynthesiS and Optimization of Cyber-Physical Systems

[Fig 1. The LASSO Framework. Labels shown: μ1, μn, Safety Constraints, Perf. Measures, Model of Physical Comp., Model of Cyber Comp., Unknown, Known, Learning, Analysis, Synthesize, Optimize.]

TU Graz, May 2017

NEXT UPPAAL BRANCHES

PhD/visitors/PostDoc positions available

Contact: [email protected]

Page 85

Thank You !

TU Graz, May 2017 Kim Larsen [85]

Page 86

Exercises (from yesterday)

Exercise 28: Jobshop Scheduling

Look at

https://www.dropbox.com/sh/96tvd7qklf1gcyh/AADFwVJk97L9qtJLkf0VUgtOa?dl=0

In UPPAAL SMC

Coffee Machine

Hammer Nail

Hybrid

… and more

TU Graz, 2017 Kim Larsen [86]

Page 87

Exercises

Download UPPAAL STRATEGO 4.1.20-3 or 4.1.20-4

http://people.cs.aau.dk/~marius/stratego

Experiment with Newspaper

Traffic Uppsala

Cruise

Small Car

See Dropbox

TU Graz, May 2017 Kim Larsen [87]

