Page 1

1

AGENT-BASED MODELING, SIMULATION, AND CONTROL—

SOME APPLICATIONS IN TRANSPORTATION

Montasir Abbas, Virginia Tech

(with contributions from past and present VT-SCORES students, including:

Zain Adam, Sahar Ghanipoor-Machiani, Linsen Chong, and Milos Mladenovic)

Workshop III: Traffic Control

New Directions in Mathematical Approaches for Traffic Flow Management

IPAM

October 27, 2015

Page 2

Presentation Outline

• Agent-based modeling… what? why? and how?
• What is the learning framework? What are the techniques?
• Examples of learning:
  • Controller agents
  • Driver behavior agents
  • Vehicle agents
• What if we don’t incorporate learning?
• Conclusions

2

Page 3

Background

3

Page 4

Learning

• Can we predict a condition or a behavior/response from a wealth of data?
• Can we model and interpret a phenomenon in a state-action framework?
• The same input data can lead to different performance measures, and we are the reason!

4

Page 5

Motivation

[Figure: varying traffic behavior, maneuvers, and naturalistic data; detailed behavioral data trajectories; trained agents; VISSIM simulation via an advanced VISSIM API-agent interface; inset trajectory plot of Y (m) vs. X (m) at t = 0 s (reference time), t = 2 s, and t = 4 s.]

5

Page 6

A Learning Framework

6

[Figure: state diagram S (states 1-6 and other states) connected by state-action transitions; the mapping from states to actions defines the policy P.]

Page 7

Learning Techniques

• Machine Learning
• Q-Learning
• Reinforcement Learning
• Etc.

7

[Figure: the same state-action diagram and policy P as on page 6.]

Page 8

Q-Learning

8

Acting on the environment, receiving rewards, and selecting actions to reach a goal.
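For reference (the slide states the idea but not the rule itself), the standard tabular Q-learning update is

Q(s, a) ← Q(s, a) + α [ r + γ max_a' Q(s', a') - Q(s, a) ]

where α is the learning rate and γ is the discount rate; the values used for the controller agent appear on page 14.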

Page 9

Application: Dilemma Zone Problem

• Application of learning to controllers and to humans
• Controllers making decisions
• Humans learning from mistakes

9

Page 10

To stop or not to stop? That is the question!

10

Page 11

To stop

11

Page 12

To go

12

Page 13

Controller Agent—Learning the Policy

13

Page 14

14

Environment’s State Variables:

Total number of vehicles in DZ

Agent’s Actions:

- End the Green

- Extend the Green

Reward:

Vehicles caught in DZ

Q-learning algorithm parameters:

Learning rate: 0.01

Discount rate: 0.5
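A minimal sketch of this controller agent, assuming an epsilon-greedy policy and a toy environment interface (neither is given on the slide); the state, actions, reward, and parameter values follow the slide, with vehicles caught in the dilemma zone treated as a penalty:

```python
import random
from collections import defaultdict

# Dilemma-zone controller agent sketch. State = total number of vehicles in the
# dilemma zone (DZ); actions and Q-learning parameters follow the slide.
# EPSILON and the reward sign convention are assumptions.
ACTIONS = ("end_green", "extend_green")
ALPHA, GAMMA, EPSILON = 0.01, 0.5, 0.1   # learning rate, discount rate, exploration rate

Q = defaultdict(float)                    # Q[(vehicles_in_dz, action)] -> value estimate

def choose_action(vehicles_in_dz):
    """Epsilon-greedy choice between ending and extending the green."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(vehicles_in_dz, a)])

def update(state, action, vehicles_caught_in_dz, next_state):
    """One Q-learning step; the reward penalizes vehicles caught in the DZ."""
    reward = -vehicles_caught_in_dz
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
```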

Page 15

Off-line and Online Learning

• Find P* with simulation
• Update Q-table with real data

15

Markovian Traffic State Estimation

[Figure: plot with time to max-out (sec) on the x-axis, 0-40.]

Page 16

Human Learning Model (Brain Analogy)

[Figure: brain-analogy diagram of the human learning model: state/action (stop/go), episodic memory, dataset, semantic memory, working memory, memory decay, trained and updated Q-tables, e-greedy selection, procedural memory, propensity, distractions, and emotions.]

Page 17

Dealing with High State Dimensionality
(Naturalistic driving behavior study)*

• Training input: traffic states and actions
• Training output: acceleration and steering
• Input variables discretized using fuzzy sets
• Continuous actions are generated from discrete actions
• Uses all the safety critical events available in training

17

*Safety and Mobility Agent-based Reinforcement-learning Traffic Simulation Add-on Module (SMART SAM)

Page 18

NFACRL Framework

• S_i = the i-th input variable (state variable)
• K = number of input variables
• NM_i = number of fuzzy sets or membership functions for S_i
• M_i^a(i) = the a(i)-th fuzzy set or membership function for the i-th input variable
• R_j = the j-th fuzzy rule
• N = number of fuzzy rules
• λ_j = weight between the j-th fuzzy rule and the critic
• w_qj = weight between the j-th fuzzy rule and action q
• V = critic value
• A_q = output of the q-th action

where i = 1, …, K; a(i) = 1, …, NM_i; j = 1, …, N; and q = 1, …, P

18
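The slide lists the symbols but not the combining equations. A common neuro-fuzzy actor-critic arrangement consistent with these definitions (an assumption, not necessarily the exact NFACRL formulation used here) computes each rule's firing strength from the memberships of its inputs and forms the critic value V and the action outputs A_q as rule-weighted sums:

```python
import numpy as np

def triangular(x, left, center, right):
    """One possible membership function M_i^a(i): a triangular fuzzy set."""
    if x <= left or x >= right:
        return 0.0
    if x <= center:
        return (x - left) / (center - left)
    return (right - x) / (right - center)

def rule_strengths(state, rules, memberships):
    """Firing strength of each fuzzy rule R_j (product inference assumed).

    rules[j] is a list of (variable index i, fuzzy set index a) pairs;
    memberships[i][a] is the membership function for set a of variable S_i.
    """
    return np.array([
        np.prod([memberships[i][a](state[i]) for i, a in rule])
        for rule in rules
    ])

def critic_and_actions(phi, lam, w):
    """Critic value V and action outputs A_q as rule-weighted sums.

    phi: rule firing strengths, shape (N,)
    lam: critic weights lambda_j, shape (N,)
    w:   action weights w_qj, shape (P, N)
    """
    phi, lam, w = np.asarray(phi), np.asarray(lam), np.asarray(w)
    V = float(lam @ phi)   # critic value
    A = w @ phi            # output of action q = 1, ..., P
    return V, A
```

The actor would then select among the A_q (for example, e-greedily), and the weights λ_j and w_qj would be updated from the temporal-difference error.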

Page 19

Applications and Cross Validation

• Test the heterogeneity of the drivers
• Training: used the data from Agent A in training, with its behavioral rules as output
• Validation: used the output rules of Agent A and applied them to driver B
• Heterogeneity of Agents A and B is represented by the degree of accuracy in validation (see the sketch below)

19
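The degree of accuracy reported on page 28 is R-squared; a small self-contained sketch of computing it between the agent output and the naturalistic trace (the function name and array inputs are illustrative):

```python
import numpy as np

def r_squared(naturalistic, agent):
    """Coefficient of determination between a naturalistic trace and the agent output."""
    y = np.asarray(naturalistic, dtype=float)
    y_hat = np.asarray(agent, dtype=float)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Cross-validation idea: train rules on Agent A, apply them to driver B's events,
# and read heterogeneity from how much R-squared drops relative to A's own events.
```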

Page 20

Agent A: Event 1

20

[Figure: two panels comparing the naturalistic trace with the agent. Left: longitudinal action estimation, acceleration (g) vs. time (0.1 s). Right: lateral action estimation, yaw angle vs. time (0.1 s).]

Page 21

Agent A: Event 2

21

[Figure: longitudinal (acceleration, g) and lateral (yaw angle) action estimation vs. time (0.1 s), naturalistic vs. agent.]

Page 22

Driver Agent B

22

[Figure: longitudinal (acceleration, g) and lateral (yaw angle) action estimation vs. time (0.1 s), naturalistic vs. agent.]

Page 23

Driver Agent A: Own Behavior

23

[Figure: longitudinal (acceleration, g) and lateral (yaw angle) action estimation vs. time (0.1 s), naturalistic vs. agent.]

Page 24

Driver B: Own Behavior

24

[Figure: longitudinal (acceleration, g) and lateral (yaw angle) action estimation vs. time (0.1 s), naturalistic vs. agent.]

Page 25

Driver A: Using Behavior from B

25

[Figure: longitudinal (acceleration, g) and lateral (yaw angle) action estimation vs. time (0.1 s), naturalistic vs. agent.]

Page 26

Driver B: Using Behavior from A

• Heterogeneity is clear

26

[Figure: longitudinal (acceleration, g) and lateral (yaw angle) action estimation vs. time (0.1 s), naturalistic vs. agent.]

Page 27

Mega-Agent Behavior

• Mega-Agent behaves as Driver B

27

[Figure: longitudinal (acceleration, g) and lateral (yaw angle) action estimation vs. time (0.1 s), naturalistic vs. agent.]

Page 28

Comparison of Mega-Agent to Cross Validation Result

• Degree of accuracy: R square

28

Event      Agent A          Agent B          Mega
           long     lat     long     lat     long     lat
Event A    0.98     0.967   0.81     0.83    0.98     0.95
Event B    0.82     0.6     0.97     0.92    0.97     0.9

Page 29

But…Why NOT Statistical Modeling?

• Would lead to wrong conclusions!

29

Page 30

Future CV/AV Applications

• Multi-modal applications: modeling, simulation, and optimization
• Accounting for different priorities, including emergency vehicles
• Utilization of the computing capabilities of CV/AV
• Linking arterial control to freeway management scenarios
• Characterizing and changing network performance

30

Page 31

31

Multi-agent System Framework

[Figure: framework diagram. Priority-level (PL) selection: user-controlled (token-based PL selection system), AI-controlled (AI PL selection system based on performance), and high-priority (pre-set PL based on vehicle type). Microscopic simulation framework for system evaluation. Vehicle agents and ABM system, with reservation matrix, revocation-enabled FIFO, trajectory adjustment, and fuel and emission optimization. Road and vehicle characteristics, user and system requirements, system configuration and ABMS rules, and performance measures.]

Page 32

Multi-agent System Framework

32

RD: the required delay for a vehicle after arriving at the intersection until higher-priority vehicles clear all conflict tiles.

Rather than driving at constant speed, coming to a complete stop for a duration of RD, and then resuming speed, a vehicle follows a modified trajectory to delay its arrival by RD.

[Figure: speed-time and distance-time sketches of the modified trajectory, annotated with t1, a1, t2, a2; a "Here I Am" message with State, PI, RD, Time, and t1, a1, t2, a2.]
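A rough numerical check of the idea. The slide names t1, a1, t2, a2 but not their exact roles, so the profile assumed below (decelerate at a1 for t1 seconds, then accelerate at a2 for t2 seconds back to the original speed) is only an illustration, not the presented algorithm:

```python
def achieved_delay(v0, t1, a1, t2, a2, dt=0.01):
    """Approximate delay (s), relative to constant speed v0, of a profile that
    decelerates at a1 (m/s^2) for t1 seconds and then accelerates at a2 for t2
    seconds; with a1*t1 == a2*t2 the vehicle ends back at v0."""
    v, lost, t = v0, 0.0, 0.0
    while t < t1 + t2:
        a = -a1 if t < t1 else a2
        v += a * dt
        lost += (v0 - v) * dt   # distance lost to the constant-speed reference
        t += dt
    return lost / v0            # once back at v0, arrival is late by lost / v0

# Example: v0 = 15 m/s, decelerate at 1 m/s^2 for 4 s, accelerate at 2 m/s^2 for 2 s
# -> distance lost = 0.5*1*4**2 + (1*4)*2 - 0.5*2*2**2 = 12 m, so RD = 12/15 = 0.8 s.
print(round(achieved_delay(15.0, 4.0, 1.0, 2.0, 2.0), 2))  # ~0.8
```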

Page 33

33

Page 34

Negotiating an Intersection

34

Page 35

Experiment Setup

• Simulating high and low priority levels in some approaches
• Tabulated delay values and vehicle trajectories for different approaches

35

[Figures: time-space diagrams (distance vs. time) for Phase 2 and Phase 4, showing individual vehicle trajectories with per-vehicle legend entries.]

Page 36

Experiment Results

• Agents adapt by forming dense platoons to pass through large gaps more efficiently
• Interesting emergent behavior can be observed from simple interaction rules
• Low priority agents are sensitive to traffic demand level
• Frequent EV calls re-synch the EV approach

36

            Phase                                                   % EV,
Scenario    1      2      3      4      5      6      7      8     Ph2
PL          1      2      1      3      1      2      1      3      4
10        200    200    200    200    200    200    200    200      0
11        400    400    400    400    400    400    400    400      0
12        400    600    400    600    400    600    400    600      0
13        400    800    400    800    400    800    400    800      0
14        200    200    200    200    200    200    200    200     10
15        400    400    400    400    400    400    400    400     20
16        400    600    400    600    400    600    400    600     30

[Chart: results by scenario (10-16) with one series per phase (1-8); y-axis 0-250.]

Page 37

Concluding Remarks

• Intelligent agents can capture individual learning, and agent-based modeling can capture the emerging system behavior
• Think state-action framework… it can explain a lot of things
• Win the chess game, not just the next move

37

