Date post: 26-Aug-2020
Page 1

Multi-Agent Reinforcement Learning For Integrated Network (MARLIN) of Adaptive Traffic Signal Controllers

Samah El-Tantawy, Ph.D. Post Doctoral Fellow

Baher Abdulhai, Ph.D., P.Eng. Professor, Dept of Civil Engineering

Page 2

Traffic Lights

• Intended as source of safety and efficiency
• Become source of delay under heavy demand
• How to make them smart, agile and demand responsive?

Page 3

Evolution of “Adaptive” Traffic Signal Control (MARLIN-ATSC: Level 4)

Level 0 • Fixed-Time and Actuated Control • TRANSYT • 1969, UK
Level 1 • Centralized Control, Off-line Optimization • SCATS • 1979, Australia • >50 installations worldwide
Level 2 • Centralized Control, On-line Optimization • SCOOT • 1981, UK • >170 installations worldwide
Level 3 • Distributed Control, Model-Based • OPAC, RHODES • 1992, USA • 5 installations in USA
Level 4 • Distributed Self-Learning Control • MARLIN-ATSC • 2011, Canada

Page 4

Issues with Leading ATSC Technologies?

Centralized
• Expensive
• Not scalable
• Not robust

Model-Based
• Relies on a traffic modelling framework, the accuracy of which is questionable

Curse of Dimensionality
• System complexity grows exponentially with the number of intersections/controllers

Human Intervention Requirements
• Requires highly skilled labour to operate because of the system's complexity
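The curse-of-dimensionality point above can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that every intersection has the same number of discrete states and actions; the defaults of 50 states and 4 actions are hypothetical and do not come from the slides. It compares the size of one centralized table over the joint state-action space with the total size of separate per-intersection tables.

```python
def joint_table_size(n_intersections, states_per=50, actions_per=4):
    """Entries in a single table over the joint state-action space.

    Each intersection contributes states_per * actions_per combinations,
    and the joint space is the product over all intersections.
    """
    return (states_per * actions_per) ** n_intersections


def independent_table_size(n_intersections, states_per=50, actions_per=4):
    """Total entries when every intersection keeps its own table."""
    return n_intersections * states_per * actions_per


if __name__ == "__main__":
    for n in (1, 2, 5, 10):
        print(n, joint_table_size(n), independent_table_size(n))
```

Under these assumed sizes the joint table grows as a power of the number of intersections, while the per-intersection total grows only linearly.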

Page 5

MARLIN: The Technology Solution

Reinforcement Learning

[Diagram: MARLIN sends an action to the environment and receives back an observation (queue lengths) and a reward (savings in delay)]
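The action, observation and reward loop on this slide can be sketched in a few lines. The slide does not name a specific algorithm, so tabular Q-learning is swapped in here as one common choice. The `ToyIntersection` environment, its arrival and discharge rates, and all parameter values are hypothetical stand-ins, not the Paramics testbed or the authors' implementation.

```python
import random
from collections import defaultdict

ACTIONS = (0, 1)  # 0: green for north-south, 1: green for east-west
MAX_QUEUE = 10    # queue lengths are capped so the state space stays small


class ToyIntersection:
    """Hypothetical two-approach intersection used only for illustration."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.queues = [0, 0]

    def observe(self):
        # Observation: queue length on each approach.
        return tuple(self.queues)

    def step(self, action):
        before = sum(self.queues)
        # The approach that receives green discharges up to 3 vehicles.
        self.queues[action] = max(0, self.queues[action] - 3)
        # Random arrivals on both approaches.
        for i in (0, 1):
            arrivals = self.rng.randint(0, 2)
            self.queues[i] = min(MAX_QUEUE, self.queues[i] + arrivals)
        # Reward: reduction in total queue, a stand-in for savings in delay.
        return self.observe(), before - sum(self.queues)


def train(steps=20000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Run epsilon-greedy Q-learning and return the table Q[(state, action)]."""
    rng = random.Random(seed)
    env = ToyIntersection(seed)
    q = defaultdict(float)
    state = env.observe()
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)  # explore
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])  # exploit
        next_state, reward = env.step(action)
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        # Standard Q-learning update toward reward plus discounted best next value.
        q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
        state = next_state
    return q
```

Calling `train()` returns the learned table; the greedy action for an observed pair of queue lengths `s` is `max(ACTIONS, key=lambda a: q[(s, a)])`.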

Page 6

Simulated Testbed: Bay and Front (Downtown Toronto)

[Figure panels: Actual Network; Paramics Model; Traffic Volumes (OD Matrix)]

Page 7

Performance: Average Delay Reduction

[Chart values: 48.9% and 39.5%]

Page 8

MARLIN-ATSC: Game Theory and Network-Wide Coordination

[Figure panels (a) and (b)]

• Collaboration with each adjacent intersection in the neighborhood
• Corridor Synchronization
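One way to picture collaboration with each adjacent intersection is a best-response rule: an agent estimates how each neighbour tends to act and picks the action with the highest expected value summed over all neighbours. The slides give no algorithmic detail, so the sketch below is a generic illustration under that assumption. `NeighbourModel`, `best_response` and the `pair_q` layout are hypothetical names, and the traffic state is left out to keep the example short.

```python
from collections import defaultdict


class NeighbourModel:
    """Counts how often a neighbour took each action (hypothetical helper)."""

    def __init__(self, n_actions):
        self.counts = [1] * n_actions  # start from a uniform prior

    def update(self, action):
        self.counts[action] += 1

    def probabilities(self):
        total = sum(self.counts)
        return [c / total for c in self.counts]


def best_response(pair_q, models, n_actions):
    """Pick the own action with the highest expected value over all neighbours.

    pair_q[nb][(own_action, nb_action)] is the value of a joint action with
    neighbour nb; models[nb] estimates that neighbour's action frequencies.
    """
    def expected_value(own):
        return sum(
            p * pair_q[nb][(own, nb_action)]
            for nb, model in models.items()
            for nb_action, p in enumerate(model.probabilities())
        )

    return max(range(n_actions), key=expected_value)


# Usage: one neighbour, two actions, illustrative joint-action values.
models = {"north": NeighbourModel(2)}
pair_q = {"north": defaultdict(float, {(0, 0): 4.0, (0, 1): 2.0, (1, 0): 0.0, (1, 1): 5.0})}
choice_before = best_response(pair_q, models, 2)
for _ in range(3):
    models["north"].update(1)  # the neighbour is observed choosing action 1
choice_after = best_response(pair_q, models, 2)
```

In this usage example the preferred action shifts once the neighbour is seen to favour action 1, which is the kind of mutual adjustment a coordinated scheme relies on.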

Page 9

Large-scale Simulated Testbed: Downtown Toronto

[Map: % of Savings in Average Delay, MARLIN-IC vs BC, for Area 1, Area 2 and Area 3]

Page 10

Simulation Testing on City of Burlington

Walkers Line & Harvester Rd.
Guelph Line & Harvester Rd.

[Map of Testbed 1 and Testbed 2. Labelled intersections:]
• Walkers Line and Harvester
• Walkers Line and QEW N Ramp
• Walkers Line and QEW S Ramp
• Walkers Line and Fairview
• Harvester and Fraser Dr
• SSR and Harvester
• Guelph Line and Harvester
• Guelph Line and QEW N Ramp
• Guelph Line and QEW S Ramp

Page 11

Effect of MARLIN vs Existing Conditions

Walkers Line (9,134 vehicles)
• MARLIN vs Base Case • Speed Savings: 13-32% • Travel Time Savings: 11-25%
• Savings in CO2 Emission Factors • MARLIN vs Base Case: -13%
• MARLIN-I ~ MARLIN-C: Independent is Enough

Guelph Line (6,578 vehicles)
• MARLIN-C vs Base Case • Speed Savings: 11-25% • Travel Time Savings: 8-21%
• Savings in CO2 Emission Factors • MARLIN vs Base Case: -32%
• MARLIN-C >> MARLIN-I: Coordination is Necessary

Page 12

MARLIN Hardware-In-The-Loop Simulations (HILS) Architecture

[Diagram components:]
• Paramics Modeller
• Controller Interface Device (CID), RS485 to USB
• Traffic Signal Controller
• Industrial Computer

[Diagram links:]
• RS485 - SDLC protocol
• USB - SDLC protocol
• Ethernet - NTCIP protocol

Page 13

MARLIN Field Components

• Ethernet Switch
• Industrial Computer running MARLIN
• Traffic Controller

Page 14

Summary of MARLIN Features

MARLIN (vs limitation of existing systems)
• Self-Learning (vs Human Intervention Requirements)
• Decentralized (vs Centralized)
• Model-Free (vs Model-Based)
• Coordinated (vs Inefficient Coordination)
• Scalable (vs Curse of Dimensionality)
• Pattern Sensitive (vs Prediction Requirement)
• Generic (vs Specific Design)

Page 15

Status

• Research and Lab Testing
• R & D Partnership with PEEK
• Field Operations Testing (City of Burlington)

