Probabilistic Models and Optimization Algorithms for Large-scale Transportation Problems

by

Jing Lu

B.A., New York University (2014)

Submitted to the Sloan School of Management in partial fulfillment of the requirements for the degree of

Doctor of Philosophy in Operations Research

at the

MASSACHUSETTS INSTITUTE OF TECHNOLOGY

February 2020

© Massachusetts Institute of Technology 2020. All rights reserved.

Author: Sloan School of Management, January 17, 2020

Certified by: Carolina Osorio, Associate Professor of Civil and Environmental Engineering, Thesis Supervisor

Accepted by: Patrick Jaillet, Dugald C. Jackson Professor, Department of Electrical Engineering and Computer Science; Co-Director, Operations Research Center

Probabilistic Models and Optimization Algorithms for Large-scale Transportation Problems

by

Jing Lu

Submitted to the Sloan School of Management on January 17, 2020, in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Operations Research

Abstract

This thesis tackles two major challenges of urban transportation optimization problems: (i) high-dimensionality and (ii) uncertainty in both demand and supply. These challenges are addressed from both modeling and algorithm design perspectives.

The first part of this thesis focuses on the formulation of analytical transient stochastic link transmission models (LTM) that are computationally tractable and suitable for large-scale network analysis and optimization. We first formulate a stochastic LTM based on the model of Osorio and Flötteröd (2015). We propose a formulation with enhanced scalability. In particular, the dimension of the state space is linear, rather than cubic, in the link’s space capacity. We then propose a second formulation that has a state space of dimension two; it scales independently of the link’s space capacity. Both link models are validated versus benchmark models, both analytical and simulation-based. The proposed models are used to address a probabilistic formulation of a city-wide signal control problem and are benchmarked versus other existing network models. Compared to the benchmarks, both models derive signal plans that perform systematically better considering various performance metrics. The second model, compared to the first model, reduces the computational runtime by at least two orders of magnitude.

The second part of this thesis proposes a technique to enhance the computational efficiency of simulation-based optimization (SO) algorithms for high-dimensional discrete SO problems. The technique is based on an adaptive partitioning strategy. It is embedded within the Empirical Stochastic Branch-and-Bound (ESB&B) algorithm of Xu and Nelson (2013). This combination leads to a discrete SO algorithm that is both globally convergent and has good small sample performance. The proposed algorithm is validated and used to address a high-dimensional car-sharing optimization problem.

Thesis Supervisor: Carolina Osorio
Title: Associate Professor of Civil and Environmental Engineering

Acknowledgments

Firstly, I would like to thank Carolina Osorio for being not only an advisor, but also a mentor and role model to me during my time at MIT. It has been my pleasure to work with her for the last 5 years. I am constantly inspired by her vision, ambitions, and positivity. Her expertise, her advice, her critical thinking, and her positive attitude towards everything have had an undeniable influence on me. Carolina is always thoughtful and encouraging, which makes our meetings not only fruitful but also enjoyable. She helped to keep me motivated and gave me the courage to keep trying after failures. Without her help, I could not have overcome all the difficulties during this journey.

I would like to thank Professor Patrick Jaillet and Professor Saurabh Amin for serving as the other two members of my thesis committee, and for their useful feedback and insights during my committee meetings and multiple one-on-one meetings. In addition, I would like to thank Professor Saurabh Amin for his help and advice in choosing a career path. I would also like to thank Professor Richard Larson for serving on my general exam committee and for sharing his vision, life experiences, and amazing stories with me during meetings in his fancy office.

Additionally, many thanks to Roberta Pizzinato for being so helpful with scheduling meetings and travel arrangements. I would also like to thank ORC staff Laura Rose and Andrew Carvalho for their administrative assistance.

I am extremely grateful to my colleagues at MIT. I would especially like to thank my fellow labmates in the Osorio lab: Linsen Chong, Tianli Zhou, Chao Zhang, Kevin Zhang, Timothy Tay, and Evan Fields, without whom I would not have gone so far. Special thanks also go to my friends at MIT: Manxi Wu, Haizheng Zhang, Daisy Zhou, Fangzhou Lu, Haihao Lu, Yuchen Wang, Shujing Wang, Yiqun Hu, Shu Ma, Junbin Huang, Li Wang, Rong Yuan, and many others for accompanying me during this long journey. Special thanks also go to my friends outside MIT: Ning Hua, Ju Wang, Xiaoli Xu, Yun Zhang, and many others for supporting me through ups and downs. All of you are really like family to me. Our exploration of fun places and restaurants around the greater Boston area has become an integral part of my life. My thanks also go to the colleagues I worked with during my summer internship at Cruise Automation: John Khawam, Sean Skwerer, Chuoran Wang, Michael McCoy, and many others, for the insightful discussions on the future of autonomous vehicles. I learned a lot during this precious opportunity.

I am much obliged to Professor Joel Spencer, my mentor during my undergraduate years at New York University, who brought me into the world of mathematics and has continuously supported me to this day. Without him, I would not have achieved what I have today.

Last, but certainly not least, I would like to acknowledge and thank my family for their unconditional trust, support, and encouragement. Special thanks must be given to my mother, Jue Sun. This thesis is dedicated to you.

This work is partially supported by the U.S. National Science Foundation under Grant No. 1562912. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.

Contents

1 Introduction
1.1 Motivation and objective
1.1.1 Stochastic traffic flow modeling
1.1.2 Adaptive partitioning strategy for discrete SO problems
1.2 Contributions
1.3 Structure of the Thesis

2 Analytical Probabilistic Link Transmission Model With Linear Complexity
2.1 Introduction
2.2 Link model formulation
2.2.1 Multivariate link model
2.2.2 Univariate link models
2.2.3 Univariate upstream queue model
2.2.4 Univariate downstream queue model
2.2.5 Mixture model
2.3 Validation
2.4 Network analysis
2.4.1 City-scale signal control
2.4.2 Numerical analysis
2.4.3 Comparison to signal plans derived by commercial signal control software
2.5 Conclusions

3 Analytical Probabilistic Link Transmission Model With Constant Complexity
3.1 Introduction
3.2 Past link model formulations
3.3 Proposed link model formulation
3.3.1 Downstream boundary conditions
3.3.2 Upstream boundary conditions
3.4 Validation
3.4.1 Validation versus a stochastic link transmission model simulator
3.4.2 Validation versus a microscopic traffic simulator
3.5 Optimization case study
3.5.1 City-scale signal control
3.5.2 Numerical results
3.6 Conclusions

4 Adaptive Partitioning Strategy for High-Dimensional Discrete Simulation-based Optimization Problems
4.1 Introduction
4.2 ESB&B framework
4.3 Adaptive partitioning strategy
4.3.1 Parallel partition
4.3.2 Hyperplane partition
4.3.3 Adaptive partitioning ESB&B algorithm
4.4 Numerical examples
4.4.1 The Griewank function
4.4.2 The car-sharing fleet assignment problem
4.5 Conclusion

5 Conclusions

A Appendices of Chapter 2
A.1 Estimation of the weight parameter $w$
A.2 Tables of time-average JSD metric

B Appendices of Chapter 3
B.1 Property: $\tau_{DQ}(k) = \lambda_{DQ}(k)$ when $\mu(k) = 0$
B.2 Calculation of $\lim_{\mu(k) \to 0} \tau_{DQ}(k)$ of Equation (3.15)
B.3 Estimation of the scalar coefficients in $\tau_{DQ}(k)$
B.4 Variance of the sojourn time of $DQ(k)$
B.5 Estimation of the scalar coefficients in $\tau_{UQ}(k)$
B.6 Tables of mean absolute differences

List of Figures

2-1 Link dynamics of the multivariate link model
2-2 Model runtime comparison
2-3 Experiment 1: impact of the temporal variation of demand on the distributions, as well as the expected values, of UQ and of DQ
2-4 Experiment 2: impact of the temporal variation of demand on the distributions, as well as the expected values, of UQ and of DQ
2-5 Comparison of the JSD values for the 21 experiments with time-independent demand
2-6 Comparison of the JSD values for the 21 experiments with time-independent demand (zoomed-in results)
2-7 Lausanne city road network (adapted from Dumont and Bert (2006))
2-8 Lausanne network model
2-9 Cumulative distribution functions of the average proportion of time a lane is full considering different initial signal plans
2-10 Cumulative distribution functions of the average lane queue-length considering different initial signal plans
2-11 Cumulative distribution functions of the average trip travel times considering different initial signal plans
2-12 Cumulative distribution functions of the average proportion of time a lane is full
2-13 Cumulative distribution functions of the average lane queue-length
2-14 Cumulative distribution functions of the average trip travel time

3-1 Experiment 1: impact of the temporal variation of demand on the link’s upstream and downstream boundary conditions
3-2 Experiment 2: impact of the temporal variation of demand on the link’s upstream and downstream conditions
3-3 Comparison of the average absolute errors for the 21 experiments with time-varying demand
3-4 Comparison of the computational runtimes for the 21 experiments with time-varying demand
3-5 Microscopic simulation model of a single-lane link
3-6 Comparison of the expected inflow and outflow for the experiment with arrival rate λ = 0.1 veh/sec






3-9 Comparison of the expected inflow and outflow for the experiment with alternating arrival rate between 0.3 veh/sec and 0 veh/sec

3-10 Microscopic simulation model for platoon arrival experiments
3-11 Comparison of the expected inflow and outflow for the tandem link experiment
3-12 Lausanne network model
3-13 Cumulative distribution functions of the average, over all lanes in the network, proportion of time a lane is full considering different initial signal plans
3-14 Average proportion of time lane is full for ILTM signal plans and proposed signal plans
3-15 Cumulative distribution functions of the average proportion of the lane that is occupied by vehicles considering different initial signal plans
3-16 Cumulative distribution functions of the average trip travel time considering different initial signal plans


3-17 Cumulative distribution functions of the average proportion of time a lane is full

3-18 Cumulative distribution functions of the average proportion of the lane that is occupied by vehicles
3-19 Cumulative distribution functions of the average trip travel time

4-1 The ground truth values of $f(x_1, x_2)$ in the current best subregion.
4-2 The sampled solutions of $f(x_1, x_2)$ in the current best subregion.
4-3 Different partitions of the current best subregion.
4-4 The contour plot of two-dimensional Griewank function on [−5, 5] × [−5, 5].
4-5 Objective function estimate of the current iterate across iterations.
4-6 Objective function estimate of the current iterate with 95% confidence interval across iterations of ESBB algorithm (zoomed-in results).
4-7 Objective function estimate of the current iterate with 95% confidence interval across iterations of the proposed algorithm (zoomed-in results).
4-8 Distance between current best solution and the global minimum solution across iterations.
4-9 The path of best solution at current iterate across iterations in the feasible domain of the original ESB&B algorithm.
4-10 The path of best solution at current iterate across iterations in the feasible domain of the original ESB&B algorithm (zoomed-in results).
4-11 Allocation of sampling budget in the feasible domain of the original ESB&B algorithm.
4-12 The path of best solution at current iterate across iterations in the feasible domain of the proposed algorithm.
4-13 The path of best solution at current iterate across iterations in the feasible domain of the proposed algorithm (zoomed-in results).
4-14 Allocation of sampling budget in the feasible domain of the proposed algorithm.
4-15 Objective function estimate of the current iterate across iterations averaged over 50 algorithm runs.
4-16 Objective function estimate of the current iterate across iterations averaged over 50 algorithm runs (zoomed-in results).
4-17 The contour plot of two-dimensional Griewank function on [−1, 9] × [−1, 9].
4-18 Objective function estimate of the current iterate across iterations.
4-19 Objective function estimate of the current iterate with 95% confidence interval across iterations of ESBB algorithm (zoomed-in results).
4-20 Objective function estimate of the current iterate with 95% confidence interval across iterations of the proposed algorithm (zoomed-in results).
4-21 Distance between current best solution and the global minimum solution across iterations.
4-22 Objective function estimate of the current iterate across iterations averaged over 50 algorithm runs.
4-23 Objective function estimate of the current iterate across iterations averaged over 50 algorithm runs (zoomed-in results).
4-24 Zipcar stations in Boston South End neighborhood (Google Maps; 2017)
4-25 The objective function estimate of the current iterate across iterations for low-demand experiment.
4-26 Objective function estimate of the current iterate across iterations (averaged over the 5 algorithm runs) for the low-demand experiment.
4-27 The objective function estimate of the current iterate across iterations for high-demand experiment.
4-28 Objective function estimate of the current iterate across iterations (averaged over the 5 algorithm runs) for the high-demand experiment.

List of Tables

2.1 Transition rate table of UQ.
2.2 Transition rate table of DQ.
2.3 Link parameters
2.4 Parameters for Lausanne case study
3.1 Link parameters
3.2 Link parameters used in the microscopic simulator
3.3 Link parameters used in the proposed model
3.4 Link parameters used in the proposed model
3.5 Average runtime (in min) per iteration of the signal control optimization algorithm
4.1 Performance statistics of the derived final solutions in the low-demand case.
4.2 P-values of the two-sample t-test comparing the solutions derived by ESB&B algorithm and the proposed algorithm with parallel partition in the low-demand experiment.
4.3 P-values of the two-sample t-test comparing the solutions derived by ESB&B algorithm and the proposed algorithm with hyperplane partition in the low-demand experiment.
4.4 P-values of the two-sample t-test comparing the solutions derived by the proposed algorithm with parallel and hyperplane partition in the low-demand experiment.
4.5 Performance statistics of the derived final solutions in the high-demand case.
4.6 P-values of the two-sample t-test comparing the solutions derived by ESB&B algorithm and the proposed algorithm with parallel partition in the high-demand experiment.
4.7 P-values of the two-sample t-test comparing the solutions derived by ESB&B algorithm and the proposed algorithm with hyperplane partition in the high-demand experiment.
4.8 P-values of the two-sample t-test comparing the solutions derived by the proposed algorithm with parallel and hyperplane partition in the high-demand experiment.
A.1 Time-average JSD metric of the UQ distribution. The value NaN denotes cases where the evaluation of the multivariate model exceeded the limit of 40 hours.
A.2 Time-average JSD metric of the DQ distribution. The value NaN denotes cases where the evaluation of the multivariate model exceeded the limit of 40 hours.
B.1 Mean absolute difference $e_{UQ}$ of $P(UQ(k) = \ell)$. The value NaN denotes cases where the evaluation of the multivariate model exceeded the limit of 40 hours. (Starting empty with time-varying demand over time)
B.2 Mean absolute difference $e_{DQ}$ of $P(DQ(k) = 0)$. The value NaN denotes cases where the evaluation of the multivariate model exceeded the limit of 40 hours. (Starting empty with time-varying demand over time)

Chapter 1

Introduction

1.1 Motivation and objective

Uncertainties exist in many aspects of transportation networks, e.g., demand and heterogeneity in people’s behaviors. Such uncertainties have always created challenges in large-scale transportation network modeling, operations, and design. State-of-the-art stochastic models account for one or more of the uncertainties that exist in transportation networks. These models may represent reality more faithfully, but at the cost of model complexity, which can impede their practical implementation and application, especially when computational budgets are limited. As the network increases in size, analysis and decision-making with such stochastic models become increasingly challenging.

In this thesis, we address the challenge of applying stochastic models to large-scale network problems under limited computational resources through the following two approaches: (i) develop computationally efficient analytical probabilistic transportation network models that can be evaluated and optimized by well-developed gradient-based optimization algorithms and (ii) develop efficient optimization algorithms that can solve problems formulated using simulation-based urban mobility models.

1.1.1 Stochastic traffic flow modeling

The vast majority of the literature in the field of analytical traffic flow modeling has focused

on the development and use of deterministic traffic models. There is an increasing interest

in the development of analytical stochastic models. The increase in the quantity, quality and

resolution of traffic data available allows us to validate models that provide a probabilistic

description of traffic. Such models can be used to enhance the reliability and the robustness

of our transportation networks.

Although the research on analytical stochastic traffic modeling is gaining momentum,

many challenges remain to be addressed:

1. Currently, the most popular approach to formulating an analytical stochastic traffic model is to add a stochastic noise term to a deterministic traffic model (e.g., Boel and Mihaylova (2006)). However, the expected traffic dynamics of such models are not guaranteed to be consistent with their deterministic counterparts. Jabari and Liu (2012) argue that, in such approaches, randomness is often applied in an imperfect and incomplete fashion. This can lead to the existence of negative sample paths and hence a misrepresentation of reality.

2. As the size of the network increases, stochastic traffic network models can suffer from the curse of dimensionality. For instance, consider a network of 100 links in which every link has 10 different states: the total number of joint states of the network is $10^{100}$. Computational efficiency is always a major concern for practical implementations of stochastic models (Calvert et al.; 2012).

3. There are also challenges such as correlation among variables. Unlike deterministic models, where one only has to consider the relations among single values, probabilistic models need to consider the dependencies among the values of random variables. Not every combination of the values of random variables is feasible. For instance, consider a link with space capacity ℓ. The random variable that represents the number of vehicles ready to leave the link is upper bounded by the random variable that represents the total number of vehicles on the link, although both random variables have the same support {0, ..., ℓ}. Thus, there is a challenge in developing probabilistic models that account for the correlation between random variables while remaining easy to implement and computationally efficient.
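The counting arguments in items 2 and 3 can be made concrete with a short Python sketch. It is an illustration only and not part of the models developed in this thesis; the function names `joint_state_count` and `feasible_pair_count` are hypothetical. It computes the number of joint states for the 100-link example and, for a link with space capacity ℓ, counts the pairs (vehicles ready to leave, vehicles on the link) that satisfy the upper-bound constraint.

```python
def joint_state_count(num_links, states_per_link):
    """Number of joint network states when every link has the same number of states."""
    return states_per_link ** num_links


def feasible_pair_count(capacity):
    """Count pairs (ready, total) with 0 <= ready <= total <= capacity.

    `ready` stands for the number of vehicles ready to leave the link and
    `total` for the number of vehicles on the link (illustrative names).
    """
    return sum(total + 1 for total in range(capacity + 1))


# Item 2: 100 links with 10 states each.
n_joint = joint_state_count(100, 10)
print(n_joint == 10 ** 100)  # True

# Item 3: both variables have support {0, ..., capacity}, yet only some pairs are feasible.
capacity = 10
n_all_pairs = (capacity + 1) ** 2
n_feasible = feasible_pair_count(capacity)
print(n_feasible, n_all_pairs)  # 66 121
```

For a capacity of 10, only 66 of the 121 value pairs are feasible, which is why a probabilistic link model cannot treat the two random variables as independent without assigning probability to infeasible states.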

In this thesis, we address some of these challenges by formulating probabilistic link transmission models that are (i) consistent with mainstream deterministic traffic flow theory and (ii) scalable and computationally efficient for large-scale network analysis and optimization.

1.1.2 Adaptive partitioning strategy for discrete SO problems

This research is motivated by the car-sharing network design problem we worked on with Zipcar. Over the years, Zipcar has been collecting high-resolution reservation data from its customers in the Boston market. This raises the question of how to fully utilize such rich disaggregated data to help the operators redesign the service system. Zhou et al. (2019) consider the fleet assignment problem of the two-way car-sharing system from the perspective of the service operator, i.e., finding an assignment of a fleet of vehicles across the network of stations that maximizes the expected profit over a given finite time horizon, denoted as the planning period. Instead of using a simplified description of demand and of demand-supply interactions, Zhou et al. (2019) rely on the demand simulator of Fields et al. (2017), which was developed based on the rich high-resolution reservation data, and formulate the fleet assignment problem as a discrete SO problem. Given a fleet assignment across the network, the expected profit is estimated as the average profit over simulation runs. In other words, a closed-form objective function is not available. The dimension of the problem can be in the hundreds, and the decision variables are discrete (i.e., the number of vehicles assigned to each station).
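To make the notion of a simulation-based objective concrete, the following Python sketch estimates the expected profit of a fleet assignment by averaging over simulation runs. It is a minimal sketch under strong assumptions: `simulate_profit` is a hypothetical toy simulator with made-up demand, revenue, and cost values, and it stands in for, but does not reproduce, the demand simulator of Fields et al. (2017).

```python
import random
from statistics import mean


def simulate_profit(assignment, rng):
    """Hypothetical one-run simulator: profit of a fleet assignment for one demand sample."""
    revenue_per_trip = 10.0   # assumed value, for illustration only
    cost_per_vehicle = 4.0    # assumed value, for illustration only
    profit = 0.0
    for vehicles in assignment:
        demand = rng.randint(0, 5)  # assumed toy demand at this station
        profit += revenue_per_trip * min(vehicles, demand) - cost_per_vehicle * vehicles
    return profit


def estimate_expected_profit(assignment, num_runs=1000, seed=0):
    """Sample-average estimate of the expected profit (no closed form is used)."""
    rng = random.Random(seed)
    return mean(simulate_profit(assignment, rng) for _ in range(num_runs))


# Decision vector: number of vehicles assigned to each of three stations.
print(estimate_expected_profit([2, 1, 3]))
```

Each evaluation of the objective costs `num_runs` simulation runs and returns a noisy estimate, which is why the sampling budget is the main measure of computational effort for discrete SO algorithms.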

For discrete SO problems with a large number of feasible points, the existing algorithms focus mainly on asymptotic global convergence to the optimal solution (e.g., Tsai and Fu (2014)). Methods that aim at identifying solutions with good performance within small sampling budgets include the Heuristic Constrained Genetic Algorithm (HCGA) (Tsai and Fu; 2014) and Industrial Strength COMPASS (ISC) (Xu et al.; 2010). Nevertheless, developing discrete SO algorithms that can efficiently tackle high-dimensional problems remains a challenge.

In this thesis, we propose a technique to enhance the computational efficiency of SO algorithms for high-dimensional discrete SO problems. The technique is based on an adaptive partitioning strategy. It is embedded within the ESB&B algorithm of Xu and Nelson (2013). The resulting algorithm has enhanced finite-time (small sampling budget) performance while maintaining global convergence.

1.2 Contributions

The contributions of this thesis are as follows:

Analytical probabilistic link transmission models

In this thesis, we develop two analytical link transmission models: a mixture model (Chapter 2) and a two-dimensional model (Chapter 3). Both are transient probabilistic models that provide a probabilistic description of congestion build-up and dissipation. Thus, they can be used to deliver performance metrics such as the dynamics of the spillback probability, the expectation and variance of the queue length, and the travel time, for sensitivity analysis or for (robust) optimization purposes. Both models can accurately approximate the link’s boundary conditions for realistic traffic situations such as the platoon phenomenon caused by signal control.

The proposed mixture model is based on the model of Osorio and Flötteröd (2015), which is a stochastic formulation of the deterministic link transmission model of Yperman et al. (2007). It tracks the marginal distributions of the upstream and downstream boundary conditions over time, and hence it has a model complexity that is linear in the link’s space capacity. The two-dimensional model further incorporates the idea of a relaxation process. It reduces the model complexity to a constant that no longer depends on the link’s space capacity. This makes the proposed model suitable for large-scale network optimization or situations where a large number of model evaluations is required. Both models are validated versus a simulation-based implementation of the stochastic LTM. They yield significant gains in computational efficiency while preserving accuracy. The mixture model yields accurate distributional approximations of the link’s boundary conditions. The two-dimensional model’s accuracy in approximating the link’s boundary conditions is comparable to that of Osorio and Flötteröd (2015) and of the mixture model. In the case study, we further demonstrate the computational efficiency of the models with a large-scale network signal control problem. The proposed mixture model enables the large-scale network signal control problem to be solved offline, and the two-dimensional model further reduces the computational runtime by two orders of magnitude and hence enables it to be solved in a timely manner.
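As a rough illustration of these complexity classes, the Python sketch below tabulates assumed state counts for a link with space capacity ℓ. The exact counts are assumptions made for illustration: the cubic count assumes a joint distribution over three queue variables, each taking values in {0, ..., ℓ}, and the linear count assumes two marginal distributions over the same support. The function name `state_space_sizes` is hypothetical.

```python
def state_space_sizes(capacity):
    """Assumed state counts for a link with space capacity `capacity` (illustrative)."""
    per_variable = capacity + 1  # each queue variable takes values 0..capacity
    return {
        # Cubic: joint distribution over three queue variables (assumption).
        "joint_three_variables": per_variable ** 3,
        # Linear: two marginal distributions, as tracked by the mixture model.
        "two_marginals": 2 * per_variable,
        # Constant: a state of dimension two, independent of the capacity.
        "two_dimensional": 2,
    }


for capacity in (10, 50, 100):
    print(capacity, state_space_sizes(capacity))
```

Under these assumptions, a capacity of 10 gives 1331 joint states, 22 marginal states, and a constant state of dimension 2, and the gap widens rapidly as the capacity grows.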

Optimization algorithm

In this thesis, we propose a technique to enhance the computational efficiency of SO algorithms for high-dimensional discrete SO problems (Chapter 4). The technique is based on an adaptive partitioning strategy, which can exploit problem-specific structures known a priori, such as the clustering effect in car-sharing fleet assignment problems, to further enhance the algorithm’s efficiency. The proposed adaptive partitioning strategy is integrated in the ESB&B framework (Xu and Nelson; 2013). The combination leads to a general-purpose discrete SO algorithm. The resulting algorithm preserves global convergence under infinite simulation effort and has good small sampling budget performance. The proposed algorithm is validated and used to address a high-dimensional car-sharing optimization problem.

Applications

• Signal control problem. The proposed analytical probabilistic link transmission

models can be used in a variety of network analysis and optimization problems. In

the case studies of this thesis, we address a continuous optimization problem, which

is a fixed-time signal control problem of the Swiss city of Lausanne. The network

consists of 603 links, 902 lanes, and 231 intersections. Among all the intersec-

tions, 17 signalized intersections distributed through the network are to be optimized,

21

which results in a decision variable of dimension 99. This is considered a large-scale

signal control optimization problem (Osorio and Chong; 2015).

The proposed models (the mixture and two-dimensional models) are used to address

this city-wide signal control problem. The proposed models are benchmarked with

a deterministic link transmission model (e.g., intelligent link transmission model

(ILTM) (Himpe et al.; 2016)). The signal plans derived from the proposed mixture

and two-dimensional models have similar performance considering various metrics.

They systematically outperform the initial signal plans, the signal plans derived by the ILTM, and a signal plan proposed by a widely used commercial software package.

Compared to the mixture model, the two-dimensional model reduces the computa-

tional runtime by at least two orders of magnitude.

• Car-sharing network design. The two-way car-sharing fleet assignment problem of Zipcar in Boston is formulated as a discrete SO problem, in which the objective function (i.e., expected profit) has no closed form and can only be estimated via simulation. The proposed adaptive partitioning strategy is used to solve the formulated discrete SO problem. We show that the proposed algorithm has better finite-time performance than the original ESB&B algorithm. We demonstrate the flexibility of the proposed adaptive partitioning strategy in exploiting problem-specific structure known a priori, i.e., the clustering effect based on the locations of the stations. The proposed algorithm with prior knowledge identifies solutions with improved performance faster than the proposed algorithm without prior knowledge.

1.3 Structure of the Thesis

A chapter by chapter description of the thesis follows.

Chapter 2 proposes an analytical probabilistic link transmission model. The proposed

model builds upon the multivariate model of Osorio and Flötteröd (2015). It proposes

a formulation with enhanced scalability. In particular, the proposed model has a


complexity that is linear in the link’s space capacity. The method of this chapter has

been presented and published as:

Lu, J. and Osorio C. (2018). A probabilistic traffic-theoretic network loading

model suitable for large-scale network analysis, Transportation Science 52(6):1509-

1530.

Lu, J. and Osorio C. (2016, Sept 15). A probabilistic traffic theoretic and scal-

able network loading model, European Association for Research in Transportation

(hEART), TU Delft, The Netherlands.

Lu, J. and Osorio C. (2015, Nov 4). Analytical stochastic link transmission model

suitable for large-scale analysis, Proceedings of the 2015 INFORMS Annual Meet-

ing, Philadelphia, Penn, USA.

Chapter 3 proposes a relaxation approximation of the analytical probabilistic link trans-

mission model formulated in Chapter 2. It is a stochastic formulation with a constant

model complexity. This makes it suitable for large-scale network optimization with

tight time budgets. The method of this chapter has been submitted for journal publication. Preliminary results of this method have been presented as:

Lu, J. and Osorio C. (2018, Jun 7). A probabilistic analytical traffic-theoretic net-

work loading model for large-scale network optimization, Proceedings of the 7th

international symposium on dynamic traffic assignment Smart Transportation, Hong

Kong University, HK, China.

Chapter 4 proposes an adaptive partitioning strategy to enhance the computational ef-

ficiency of SO algorithms for high-dimensional discrete SO problems. It is inte-

grated in the ESB&B framework (Xu and Nelson; 2013). The combination leads to

a general-purpose discrete SO algorithm, which preserves global convergence under

infinite simulation efforts and has a good small sample (finite time) performance.

Chapter 5 summarizes this thesis and includes future research directions.

Appendix A contains the appendices of Chapter 2.


Appendix B contains the appendices of Chapter 3.


Chapter 2

Analytical Probabilistic Link Transmission Model With Linear Complexity

This chapter presents an analytical stochastic link transmission model. It is a stochastic

formulation of the link transmission model (LTM), which itself is an operational formula-

tion of Newell’s simplified theory of kinematic waves. The proposed model builds upon

the multivariate model of Osorio and Flötteröd (2015). It proposes a formulation with en-

hanced scalability. In particular, compared to the multivariate model, it has a complexity

that is linear, rather than cubic, in the link’s space capacity. This makes it suitable for large-

scale network analysis. The method of this chapter has been published as: Lu, J. and

Osorio C. (2018). A probabilistic traffic-theoretic network loading model suitable for

large-scale network analysis, Transportation Science 52(6):1509-1530.

2.1 Introduction

This chapter focuses on the formulation of stochastic (i.e., probabilistic) network loading

models of road traffic. The vast majority of the literature in the field of traffic flow theory

has focused on the development of deterministic traffic models. There has recently been renewed interest in the development of analytical stochastic models, which is arguably triggered by both (i) the interest of major transportation agencies around the world in estimating and improving the robustness and reliability of their networks (Transport for London; 2010; U.S. Department of Transportation; 2008) and (ii) the availability of high-resolution traffic data, which enables the validation of more detailed models.

In a transportation network, there are sources of uncertainty both in supply (e.g., weather)

and in demand (e.g., spatial and temporal distribution of travel demand, heterogeneous pop-

ulation of travelers). Recent studies that review sources and modeling approaches to de-

mand and supply uncertainty include Sumalee et al. (2011); Lam et al. (2008). For instance,

in the field of microscopic travel demand modeling, a variety of probabilistic models have

been developed to account for uncertainties in various travel choices such as: departure

time, mode, route, etc. In the field of macroscopic modeling, the variability (or scatter) in

the fundamental diagrams has led the community to develop probabilistic models to better

interpret and fit field data. A review of recent approaches to model, or account for, the

variability in fundamental diagrams is given in Sumalee et al. (2011) and in Jabari et al.

(2014b). For instance, the work of Heidemann (2001) uses a probabilistic non-stationary

(i.e., transient) traffic model to interpret hysteresis loops and the case study in Sumalee

et al. (2011) uses a probabilistic model to improve the fit of a fundamental diagram with

high scatter. Nonetheless, there is a lack of probabilistic traffic models that are both: (i)

consistent with mainstream traditional deterministic traffic flow theoretic models, and (ii)

tractable enough to enable the efficient analysis and optimization of large-scale networks.

The main contribution of this chapter is the formulation of a probabilistic link model that is both (i) consistent with mainstream deterministic traffic flow theory and (ii) computationally tractable enough to enable large-scale network analysis.

Jabari (2012) and Laval and Chilukuri (2014) provide reviews of stochastic traffic flow

theoretic models. Recent formulations include those derived from the variational theory of

Daganzo (2005): e.g., Deng et al. (2013); Laval and Chilukuri (2014); Laval and Castril-

lón (2015). The most popular approach to stochastic traffic modeling is the formulation

of stochastic cell-transmission models (CTMs; e.g., Boel and Mihaylova; 2006; Sumalee

et al.; 2011; Jabari and Liu; 2012). The approach of Boel and Mihaylova (2006) is an ex-

ample of the most common approach to stochastic CTM models in that it adds Gaussian


noise terms to the deterministic formulation. This contributes to model tractability yet does

not guarantee expected (i.e., average) traffic dynamics consistent with the CTM dynamics.

The implications of this are further discussed in Jabari and Liu (2012). The model of Jabari

and Liu (2012) considers stochastic vehicle headways. It allows for a variety of headway

distributions and has a fluid limit approximation that is consistent with the CTM. Boel and

Mihaylova (2006) and Jabari and Liu (2012) are sampling-based approaches, which can

become computationally intensive for large-scale networks. Jabari and Liu (2013) pro-

pose a second-order Gaussian approximation of the model of Jabari and Liu (2012) that

can be evaluated without sampling. The CTM is a space-discretized approximation of the

kinematic wave model (KWM; Lighthill and Whitham (1955); Richards (1956a)), hence a

stochastic CTM formulation does not guarantee consistency with the KWM.

The recent work of Osorio and Flötteröd (2015) extends the model of Osorio et al.

(2011) and proposes a link model that is a stochastic formulation of the deterministic link

transmission model of Yperman et al. (2007), which itself is an operational formulation

of Newell’s simplified theory of kinematic waves (Newell; 1993). The model considers an

isolated link and derives an analytical description of the transient (i.e., time-dependent) dis-

tribution of link boundary conditions. It yields the joint distribution of the link’s upstream

and downstream boundary conditions. Hence, it provides a higher-order (i.e., beyond first-

order) description of within-link dependencies. The model represents the link as a set of

three finite space capacity stochastic queues. For a link with space capacity ℓ, the dimension of the state space of the joint distribution is $\frac{1}{6}(\ell+1)(\ell^2+2\ell+6)$. In other words, the model complexity is of order $O(\ell^3)$.

This chapter formulates a link model with a complexity that is linear, rather than cubic,

in the link’s space capacity, i.e., the proposed model has O(ℓ) complexity. It is therefore

scalable and appropriate for large-scale network analysis. The proposed model is derived

from the model of Osorio and Flötteröd (2015). It is therefore a stochastic formulation of

Newell’s simplified theory of kinematic waves (Newell; 1993).

Section 2.2 formulates the proposed model. The model is validated (Section 2.3) and

used to address a large-scale signal control problem (Section 2.4). Conclusions and a dis-

cussion of ongoing work are presented in Section 2.5. The Appendices contain additional


numerical validation results.

2.2 Link model formulation

2.2.1 Multivariate link model

We outline here the main ideas of the model of Osorio and Flötteröd (2015). Hereafter,

we refer to the Osorio and Flötteröd (2015) model as the multivariate link model. For

a description of how this model relates to Newell’s simplified theory of kinematic waves

or to the operational formulation of Yperman et al. (2007), we refer the reader to Osorio

and Flötteröd (2015). Consider a link with a triangular fundamental diagram, free flow

velocity v, backward wave speed w (negative), flow capacity q, jam density ρ, and link

length L. The process that vehicular traffic flow goes through within the link is described

as follows. Upon entrance to the link, it is delayed by L/v time units. It is then ready

for departure, and enters the physical vehicular queue downstream, if one exists. Upon

departure from the link, there is an additional delay of L/|w| before the newly available

space becomes available upstream of the link. This delay represents the time it takes a

kinematic backward wave to traverse the link. The multivariate model is a continuous-

space discrete-time model, where L/v (resp. L/|w|) is rounded to the integer kfwd (resp.

kbwd).
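As a small illustration of this time discretization, the sketch below converts the two delays into numbers of time intervals. It assumes a time step `delta` and ceiling rounding (the rule that appears in step 3 of the algorithm in Section 2.2.3); the function name and the numerical values are hypothetical.

```python
import math

def lag_intervals(link_length, v, w, delta):
    """Return (k_fwd, k_bwd): the forward and backward delays in time intervals.

    link_length: link length L
    v: free flow velocity (positive)
    w: backward wave speed (negative)
    delta: duration of one time interval
    """
    k_fwd = math.ceil(link_length / (v * delta))       # L/v expressed in intervals
    k_bwd = math.ceil(link_length / (abs(w) * delta))  # L/|w| expressed in intervals
    return k_fwd, k_bwd

# Hypothetical link: L = 200 m, v = 10 m/s, w = -5 m/s, delta = 1 s.
k_fwd, k_bwd = lag_intervals(200.0, 10.0, -5.0, 1.0)
```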

This process is summarized in Figure 2-1. During time interval k, the link has an

expected inflow (resp. outflow) denoted qin(k) (resp. qout(k)). The delay incurred upon

entrance to the link is represented by the lagged inflow queue, denoted LI. In discrete

time, LI can be thought of as a set of kfwd cells. One can think of this delay as if the flow

traveled sequentially from the first to the kfwd-th cell of LI. This last cell of LI is denoted LLI in Figure 2-1. This cell configuration of LI is merely a representation: the multivariate model describes LI in aggregate, i.e., it is not decomposed into individual cells. After this

delay, the flow enters the downstream queue, denoted DQ. The departure of flow from

the link triggers two events: the flow departs DQ (in a network setting, it would enter a

downstream link) and it enters the lagged outflow queue, denoted LO. The purpose of LO


is to capture the kinematic backward wave delay. One can think of this delay as if the newly

available space traveled sequentially from the first to the kbwd-th cell of LO. This last cell

of LO is denoted LLO in Figure 2-1. The multivariate link model accounts for stochasticity

in the link’s arrival and departure processes. Time-dependent (i.e., inhomogeneous, non-

homogeneous) finite-state birth-death processes are assumed. This leads to stochastic link

flows, to stochastic cumulative flows both upstream and downstream of the link, and hence,

to a stochastic description of link states.

[Figure 2-1: Link dynamics of the multivariate link model. Diagram labels: inflow qin(k); Lagged Inflow Queue (LI), lag of L/v time units, cells 1, 2, ..., kfwd, last cell LLI; Downstream Queue (DQ); outflow qout(k); Lagged Outflow Queue (LO), lag of L/|w| time units, cells 1, 2, ..., kbwd, last cell LLO.]

The multivariate model jointly tracks the dynamics between the three queues LI, DQ,

and LO. It also defines the upstream queue, UQ, as:

UQ = LI+DQ+ LO. (2.1)

More specifically, the model is a discrete-time model. We denote LI(t;k) (resp. DQ(t;k), LO(t;k), UQ(t;k)) as the number of vehicles in LI (resp. DQ, LO, UQ) at continuous time t within discrete time interval k of duration δ. The model yields the joint distribution P(LI(t;k), DQ(t;k), LO(t;k), UQ(t;k)). The linear equality (2.1) implies that this four-dimensional joint distribution can be obtained by tracking three of the four variables. The model implementation of Osorio and Flötteröd (2015) tracks LI, DQ and LO. For a given link with space capacity ℓ (which is defined as a rounded version of ρL), the state space is defined by $\{(li, dq, lo) \in [0, \ell]^3 : li + dq + lo \le \ell\}$. The state space dimension is $\frac{1}{6}(\ell+1)(\ell^2+2\ell+6)$.

In this chapter, we propose a formulation with a state space dimension that is linear,

instead of cubic, in the space capacity while still providing a detailed representation of the

within-link dependencies. This enables its use for the efficient analysis and optimization of

large-scale networks.
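To give a feel for the difference between cubic and linear growth, the following sketch evaluates both state space dimensions for a few space capacities. It assumes the multivariate dimension reads (1/6)(ℓ+1)(ℓ²+2ℓ+6), as quoted above, and that a univariate model on {0, ..., ℓ} has ℓ+1 states; the function names are illustrative only.

```python
def multivariate_dim(ell):
    """State space dimension quoted for the multivariate model:
    (1/6) * (ell + 1) * (ell**2 + 2*ell + 6). The product is always divisible by 6."""
    return (ell + 1) * (ell ** 2 + 2 * ell + 6) // 6

def univariate_dim(ell):
    """Number of states of a univariate queue with values 0, 1, ..., ell."""
    return ell + 1

# Compare the two dimensions for a few hypothetical space capacities.
comparison = {ell: (multivariate_dim(ell), univariate_dim(ell)) for ell in (10, 50, 100)}
```

For ℓ = 100, the cubic expression gives 171,801 states against 101 for a univariate model.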

2.2.2 Univariate link models

Hereafter, unless necessary, we drop the time dependency notation and use LI or LI(k) to

denote LI(t;k). We do the same for DQ, LO and UQ. The main insights of the multivariate model that underlie the newly proposed formulation are the following: UQ provides a detailed description of the link’s upstream boundary conditions, while DQ provides a detailed description of the link’s downstream boundary conditions. One approach would be to propose a model that jointly describes (UQ,DQ). This would improve model scalability by going from a three- to a two-dimensional state space. The idea considered in this chapter goes even further: it proposes a univariate (i.e., one-dimensional) state space, which leads to a scalable formulation. We consider the following two independent univariate models.

• One model of UQ. Its purpose is to accurately capture the link’s upstream boundary

conditions.

• One model of DQ. Its purpose is to accurately capture the link’s downstream bound-

ary conditions.

The proposed model is then defined as a mixture of these two independent univariate mod-

els. There is significant dependency between the upstream and the downstream boundary

conditions of a link, as illustrated by Equation (2.1). In other words, there is dependency

between the dynamics of UQ and of DQ. The numerical case studies in Osorio and Flöt-

teröd (2015) analyze this dependency in more detail. The main challenge addressed in this


chapter is therefore to develop independent univariate models of UQ and of DQ while still

capturing the dependency between the link’s upstream and downstream boundary condi-

tions.

Consider an isolated link with space capacity ℓ, an inhomogeneous Poisson arrival pro-

cess with exogenous arrival rate λ(k) (time is indexed by k), and exponentially distributed

service times at the downstream end of the link with exogenous downstream bottleneck

flow capacity µ(k). For this isolated link, Section 2.2.3 (resp. 2.2.4) formulates a uni-

variate model that tracks the distribution of UQ (resp. DQ) over time. Section 2.2.5 then

formulates the proposed mixture model, which combines the UQ and the DQ models.

2.2.3 Univariate upstream queue model

This section formulates a univariate model of UQ. Following the approach in Osorio and

Flötteröd (2015), UQ is modeled as a birth-death process with a finite state space defined

by {uq ∈ [0, ℓ]}. For time interval k of duration δ, the transient probability distribution

of UQ satisfies a system of linear differential equations with solution defined by (see, for

instance, Reibman (1991) for details):

$$P(UQ(t;k)) = P(UQ(0;k))\, e^{t\,Q_{UQ}(k)} \quad \forall\, t \in [0, \delta], \tag{2.2}$$

where P(UQ(0;k)) are the initial conditions at the beginning of time interval k, and

QUQ(k) is the transition rate matrix of UQ.

The initial conditions are given by ensuring continuity at the start of the time interval:

P(UQ(0;k)) = P(UQ(δ;k− 1)). (2.3)

Let P(UQ(k)) denote the UQ distribution at the end of time interval k, i.e., it is a

simplified notation for P(UQ(δ;k)). Equations (2.2) and (2.3) can be combined to obtain

the equation that yields the distribution of UQ at the end of the time interval:

$$P(UQ(k)) = P(UQ(k-1))\, e^{\delta\,Q_{UQ}(k)}. \tag{2.4}$$
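Equation (2.4) requires the product of a row vector and a matrix exponential. The sketch below evaluates it in pure Python by uniformization, a standard technique for the transient analysis of continuous-time Markov chains. This is a choice made here for illustration and is not necessarily the numerical method used in this thesis; the function name, the tolerance and the two-state example are assumptions.

```python
import math

def transient_step(p0, Q, delta, tol=1e-12, max_terms=100000):
    """Return the row vector p0 * exp(delta * Q) by uniformization.

    p0: initial distribution (list of floats)
    Q: transition rate matrix (list of lists, rows sum to zero)
    """
    n = len(p0)
    rate = max(-Q[i][i] for i in range(n))  # uniformization rate
    if rate <= 0.0:
        return list(p0)                     # no transitions possible
    # Discrete-time transition matrix P = I + Q / rate
    P = [[(1.0 if i == j else 0.0) + Q[i][j] / rate for j in range(n)]
         for i in range(n)]
    weight = math.exp(-rate * delta)        # Poisson(rate * delta) pmf at 0
    term = list(p0)                         # p0 * P^0
    result = [weight * x for x in term]
    total_weight = weight
    for m in range(1, max_terms):
        if 1.0 - total_weight <= tol:       # remaining Poisson mass is negligible
            break
        term = [sum(term[i] * P[i][j] for i in range(n)) for j in range(n)]
        weight *= rate * delta / m          # Poisson pmf at m
        total_weight += weight
        result = [r + weight * x for r, x in zip(result, term)]
    return result

# Hypothetical two-state chain with rates 1 (state 0 to 1) and 2 (state 1 to 0).
p_end = transient_step([1.0, 0.0], [[-1.0, 1.0], [2.0, -2.0]], 0.5)
```

For this two-state chain the exact value of the first entry is 2/3 + exp(-1.5)/3, which gives a simple check of the routine. For large values of rate × delta the initial weight underflows, so a production implementation would need a more careful scheme.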


Equation (2.4) states that in order to approximate the transient distribution of UQ, we need to approximate the transition rate matrix QUQ(k). Table 2.1 defines the non-diagonal and non-null elements of the transition rate matrix. For an arbitrary initial state uq of UQ (column 1), the table lists the feasible instantaneous transitions to new states (column 2), the rates at which these transitions take place (column 3), and the conditions on the initial state needed for the transitions to be feasible (column 4). For UQ, there are two types of events that trigger state changes. The first is flow arrival to the link, described in the first row of the table. Arrivals to the link increase the state of UQ (i.e., the new state is uq + 1); this occurs at rate λ(k) and can occur as long as UQ is not full (i.e., there is available space at the upstream end of the link: uq < ℓ). The second type of event is flow departure from UQ, described in the second row of the table. Departures occur at rate µUQ(uq;k) and can occur as long as UQ is not empty (i.e., uq > 0). The diagonal elements of the transition rate matrix, QUQ(k)ss, are derived from the non-diagonal elements by:

$$Q_{UQ}(k)_{ss} = -\sum_{j \neq s} Q_{UQ}(k)_{sj}. \tag{2.5}$$
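The construction of the transition rate matrix from the transition rate table and Equation (2.5) can be sketched as follows. The function name and the numerical rates are hypothetical; the state-dependent service rates are passed in as a list because their approximation is derived separately.

```python
def build_q_uq(ell, lam, mu_uq):
    """Build the (ell+1) x (ell+1) transition rate matrix of UQ.

    ell: space capacity of the link
    lam: arrival rate lambda(k) for the current time interval
    mu_uq: list of length ell+1, mu_uq[uq] is the service rate in state uq
    """
    n = ell + 1
    Q = [[0.0] * n for _ in range(n)]
    for uq in range(n):
        if uq < ell:                  # arrival: uq -> uq + 1, feasible if not full
            Q[uq][uq + 1] = lam
        if uq > 0:                    # departure: uq -> uq - 1, feasible if not empty
            Q[uq][uq - 1] = mu_uq[uq]
        # Eq. (2.5): diagonal element is minus the sum of the off-diagonal row entries
        Q[uq][uq] = -sum(Q[uq][j] for j in range(n) if j != uq)
    return Q

# Hypothetical example with space capacity 2.
Q_example = build_q_uq(2, 0.3, [0.0, 0.1, 0.2])
```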

| initial state | new state | rate | condition |
|---|---|---|---|
| uq | uq + 1 | λ(k) | uq < ℓ |
| uq | uq − 1 | µUQ(uq;k) | uq > 0 |

Table 2.1: Transition rate table of UQ.

Table 2.1 states that the univariate model of UQ depends on two rates: (i) λ(k), which for an isolated link is an exogenous rate, and (ii) µUQ(uq;k). The latter is referred to as the service rate of UQ. It is an endogenous rate. We now formulate its approximation.

Service rate of UQ

Recall from Section 2.2.1 and Figure 2-1 that departures from UQ correspond to flow that leaves the last cell of LO. In Figure 2-1, this last cell is the kbwd-th cell denoted LLO. Therefore, the number of departures from UQ during time interval k is a random variable and it can be expressed as LLO(k)|UQ(k) = uq. Let E[Tm(k)] denote the expected inter-departure time from UQ conditional on there being a total of m departures during time interval k. By definition, the service rate is the inverse of the expected time between consecutive departures. The service rate of UQ conditional on UQ = uq, µUQ(uq;k), can be approximated as follows:

$$\mu_{UQ}(uq;k) \approx \sum_{m=1}^{uq} \frac{1}{E[T_m(k)]}\, P(LLO(k) = m \mid UQ(k) = uq) \tag{2.6}$$

Equation (2.6) approximates µUQ(uq;k) as the mean inverse of expected inter-departure

time from UQ conditional on there being a total of m departures during time interval k. In

order to approximate E[Tm(k)] for m > 0, we use the following property. For a Poisson

process, given that a total number of m arrivals have occurred during a time interval of

duration δ, then the unordered arrival times are independently, uniformly distributed over

the time interval of interest (cf., for instance, Section 2.12.3 of Larson and Odoni (1981)).

Hence, the expected inter-arrival time is δ/m. We approximate the departure process of

UQ as a Poisson process. Therefore, given a total of m departures from UQ during time

interval k, the expected time between consecutive departures is approximated with δ/m.

Equation (2.6) becomes:

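\begin{align}
\mu_{UQ}(uq;k) &\approx \sum_{m=1}^{uq} \frac{m}{\delta}\, P(LLO(k) = m \mid UQ(k) = uq) \tag{2.7}\\
&= \frac{1}{\delta} \sum_{m=0}^{uq} m\, P(LLO(k) = m \mid UQ(k) = uq) \tag{2.8}\\
&= \frac{1}{\delta}\, E[LLO(k) \mid UQ(k) = uq], \tag{2.9}
\end{align}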

where E[LLO(k) | UQ(k) = uq] represents the expected outflow from UQ during time

interval k, given that UQ is in state uq. The expression for this conditional expectation is

derived as follows.


First, assume UQ to be a Poisson process with rate:

$$q_{UQ}(k) = \sum_{r=0}^{k-1} q_{in}(r) - \sum_{r=0}^{k-k_{bwd}-1} q_{out}(r), \tag{2.10}$$

where qin(k) (resp. qout(k)) denotes the instantaneous link inflow (resp. outflow) rates at

the end of time interval k. As in the multivariate model (as well as in its deterministic

counterpart model of Yperman et al. (2007)), we use these instantaneous link inflow and

outflow rates to approximate the expected inflow to (resp. outflow from) the link during

time interval k. In other words, these instantaneous rates qin(k) and qout(k) are held con-

stant throughout the time interval k of duration δ. Equation (2.10) approximates the rate

of the Poisson process UQ as the difference between: (i) all flow that has entered the link

from time interval 0 until the end of time interval k − 1 (this is represented by the first

summation) and (ii) all flow that has left the link from time interval 0 until the end of time

interval k − kbwd − 1 (this is represented by the second summation). Recall that kbwd rep-

resents the number of time intervals needed for a kinematic backward wave to traverse the

link. Therefore, this second summation accounts for this kinematic backward wave delay

by considering all flow that has left the LO queue, and hence has left UQ.
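The two cumulative sums of Equation (2.10) can be written directly from stored inflow and outflow histories. The sketch below assumes the histories are kept as Python lists indexed by time interval, with an empty second sum when k − kbwd − 1 is negative; the function name and data are hypothetical.

```python
def q_uq(k, q_in, q_out, k_bwd):
    """Eq. (2.10): rate of the Poisson process UQ for time interval k.

    q_in, q_out: lists with q_in[r], q_out[r] for r = 0, ..., k-1
    k_bwd: backward wave delay in time intervals
    """
    inflow = sum(q_in[r] for r in range(0, k))            # r = 0, ..., k-1
    # r = 0, ..., k-k_bwd-1; the range is empty while k <= k_bwd
    outflow = sum(q_out[r] for r in range(0, k - k_bwd))
    return inflow - outflow

# Hypothetical histories over five time intervals.
q_in_hist = [1.0, 1.0, 1.0, 1.0, 1.0]
q_out_hist = [0.0, 0.0, 1.0, 1.0, 1.0]
```

With these histories and kbwd = 2, the outflow of interval 2 is only subtracted from interval 5 onward, which reflects the backward wave delay.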

Second, assume LLO and {UQ− LLO} to be two independent Poisson processes. This

simplifying independence assumption neglects the temporal dependency between LLO and

{UQ−LLO}. The numerical validation results of Section 2.3 highlight the small effect this

has on the final model’s accuracy. The rate of LLO is given by:

qLLO(k) = qout(k− kbwd). (2.11)

The term qout(k − kbwd) represents the expected flow that has left the link during time

interval k−kbwd. This leads to UQ being a sum of two independent Poisson processes: LLO

and {UQ− LLO}. Therefore, the conditional distribution of LLO(k) given {UQ(k) = uq}

is binomial with parameters (uq, qLLO(k)/qUQ(k)) (cf., for instance, Section 2.12.4 of

Larson and Odoni (1981)). Hence, the expected number of departures from UQ during


time interval k is approximated with:

$$E[LLO(k) \mid UQ(k) = uq] \approx uq\, \frac{q_{LLO}(k)}{q_{UQ}(k)}. \tag{2.12}$$

The accuracy of this approximation depends on the dependency between LLO and {UQ−

LLO}. In particular, we expect it to decrease as congestion level increases.
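Combining Equations (2.9) and (2.12) gives a service rate that is linear in the state uq. The sketch below computes it in two ways: directly, and through the explicit binomial sum of Equation (2.8), so that the two can be compared. The function names and numbers are hypothetical, and the guard for a non-positive qUQ(k) is an assumption added here, not part of the source.

```python
from math import comb

def binomial_pmf(m, n, p):
    """Probability of m successes out of n trials with success probability p."""
    return comb(n, m) * p ** m * (1.0 - p) ** (n - m)

def service_rate_uq(uq, q_llo, q_uq, delta):
    """Eq. (2.9) with Eq. (2.12): mu_UQ(uq;k) = uq * q_LLO(k) / (delta * q_UQ(k))."""
    if q_uq <= 0.0:
        return 0.0  # assumption: no departures while the rate q_UQ is not positive
    return uq * (q_llo / q_uq) / delta

def service_rate_uq_sum(uq, q_llo, q_uq, delta):
    """Eq. (2.8): explicit sum over the binomial distribution of LLO given UQ = uq."""
    p = q_llo / q_uq
    return sum(m * binomial_pmf(m, uq, p) for m in range(uq + 1)) / delta

# Hypothetical values: uq = 5, q_LLO = 0.2, q_UQ = 0.8, delta = 2.
direct = service_rate_uq(5, 0.2, 0.8, 2.0)
summed = service_rate_uq_sum(5, 0.2, 0.8, 2.0)
```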

In summary, given the rates that fully define the transition rate matrix, namely λ(k) (an exogenous rate for an isolated link) and µUQ(uq;k) (given by Equations (2.9) and (2.12)), the transient probability distribution of UQ is obtained by evaluating Equation (2.4).

Expected link inflow and outflow

Given the univariate model of UQ, we now describe how it can be used to compute the

expected inflow and expected outflow of the link during time interval k. An arrival may

enter the link as long as there is space at the upstream end of the link. This happens with

probability P(UQ(k) < ℓ). Hence, the expected inflow to the link is:

qin(k) = λ(k)P(UQ(k) < ℓ). (2.13)

Note that in a full network model (i.e., if we combine the link model with a node model), a

vehicle in an upstream link that cannot enter its desired downstream link because it is full

would wait at its current location until an available space downstream is allocated to it. In

other words, spillbacks occur with probability P(UQ(k) = ℓ). In this chapter we consider

a single link model, hence vehicles that wish to enter the link while it is full are considered

lost demand. If a model with no losses is desired, then an infinite space-capacity queue can

be inserted upstream of the link to capture vehicles that are waiting to enter the link.
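As a minimal illustration of Equation (2.13), the sketch below computes the expected inflow and the corresponding lost demand rate from a given UQ distribution. The function names and the example distribution are hypothetical.

```python
def expected_inflow(lam, p_uq):
    """Eq. (2.13): q_in(k) = lambda(k) * P(UQ(k) < ell).

    p_uq: list with p_uq[n] = P(UQ(k) = n) for n = 0, ..., ell
    """
    ell = len(p_uq) - 1
    return lam * (1.0 - p_uq[ell])

def lost_demand_rate(lam, p_uq):
    """Rate of arrivals that find the isolated link full: lambda(k) * P(UQ(k) = ell)."""
    ell = len(p_uq) - 1
    return lam * p_uq[ell]

# Hypothetical link with space capacity 2.
inflow = expected_inflow(0.5, [0.5, 0.3, 0.2])
lost = lost_demand_rate(0.5, [0.5, 0.3, 0.2])
```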

Similarly, the probability that there are vehicles ready to leave the link is P(DQ(k) >

0). Thus, the expected outflow from the link is:

qout(k) = µ(k)P(DQ(k) > 0). (2.14)

From Equation (2.4), we obtain the distribution of UQ at the end of time interval k, which allows us to compute qin(k) through Equation (2.13). In order to compute qout(k), we need to compute P(DQ(k) > 0). Nonetheless, in this univariate UQ model, we do not track DQ directly. Let us now describe how it is approximated.

We proceed as above, where we approximate UQ as a sum of independent Poisson

processes, and approximate the distribution of DQ given {UQ = uq} as a binomial with

parameters (uq, qDQ(k)/qUQ(k)), where:

$$q_{DQ}(k) = \sum_{r=0}^{k-k_{fwd}-1} q_{in}(r) - \sum_{r=0}^{k-1} q_{out}(r). \tag{2.15}$$

Equation (2.15) considers the expected flow in DQ as the difference between: (i) the sum

of all of the expected inflows into the link from time 0 to time k − kfwd − 1 (i.e., omitting

the flows that are still in LI) and (ii) the sum of all expected outflows out of the link (i.e.,

outflow from time 0 to time k− 1).

We obtain P(DQ(k) > 0) as follows:

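\begin{align}
P(DQ(k) > 0) &= 1 - P(DQ(k) = 0) \tag{2.16}\\
&= 1 - \sum_{n=0}^{\ell} P(DQ(k) = 0 \mid UQ(k) = n)\, P(UQ(k) = n) \tag{2.17}\\
&\approx 1 - \sum_{n=0}^{\ell} \left(1 - \frac{q_{DQ}(k)}{q_{UQ}(k)}\right)^{n} P(UQ(k) = n), \tag{2.18}
\end{align}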

where the binomial probability mass function was used to derive the last expression. This approximation is accurate when the dependencies among LI, DQ and LO are weak (e.g., uncongested link).
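The chain from Equation (2.15) to the expected outflow of Equation (2.14) can be sketched as below. The function names and numbers are hypothetical, and returning 0 when qUQ(k) is not positive is an assumption added here to avoid a division by zero.

```python
def q_dq(k, q_in, q_out, k_fwd):
    """Eq. (2.15): inflows up to interval k-k_fwd-1 minus outflows up to interval k-1."""
    return sum(q_in[:max(k - k_fwd, 0)]) - sum(q_out[:k])

def prob_dq_positive(p_uq, q_dq_k, q_uq_k):
    """Eq. (2.18): P(DQ(k) > 0) from the UQ distribution and the ratio q_DQ / q_UQ."""
    if q_uq_k <= 0.0:
        return 0.0  # assumption: an empty link has no vehicles ready to depart
    ratio = q_dq_k / q_uq_k
    return 1.0 - sum((1.0 - ratio) ** n * p for n, p in enumerate(p_uq))

def expected_outflow(mu, p_uq, q_dq_k, q_uq_k):
    """Eq. (2.14): q_out(k) = mu(k) * P(DQ(k) > 0)."""
    return mu * prob_dq_positive(p_uq, q_dq_k, q_uq_k)

# Hypothetical UQ distribution for space capacity 2 and ratio q_DQ / q_UQ = 0.5.
p_pos = prob_dq_positive([0.5, 0.3, 0.2], 0.5, 1.0)
outflow = expected_outflow(0.4, [0.5, 0.3, 0.2], 0.5, 1.0)
```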


Marginal distribution of DQ

The univariate UQ model can be used to approximate the entire marginal distribution of

DQ by proceeding as in the derivation of Equation (2.18). For all i ∈ [0, ℓ]:

\begin{align}
P(DQ(k) = i) &= \sum_{n=i}^{\ell} P(DQ(k) = i \mid UQ(k) = n)\, P(UQ(k) = n) \tag{2.19}\\
&\approx \sum_{n=i}^{\ell} \binom{n}{i} \left(\frac{q_{DQ}(k)}{q_{UQ}(k)}\right)^{i} \left(1 - \frac{q_{DQ}(k)}{q_{UQ}(k)}\right)^{n-i} P(UQ(k) = n), \tag{2.20}
\end{align}

where $\binom{n}{i}$ denotes the binomial coefficient. Equation (2.20) is obtained by approximating P(DQ(k) | UQ(k) = n) as a binomial distribution with parameters (n, qDQ(k)/qUQ(k)).
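A direct transcription of Equation (2.20) is given below as a sketch. It takes the UQ distribution and the ratio qDQ(k)/qUQ(k) as inputs; the function name and the example values are hypothetical.

```python
from math import comb

def dq_marginal(p_uq, ratio):
    """Eq. (2.20): approximate marginal distribution of DQ.

    p_uq: list with p_uq[n] = P(UQ(k) = n) for n = 0, ..., ell
    ratio: q_DQ(k) / q_UQ(k), used as the binomial success probability
    """
    ell = len(p_uq) - 1
    return [
        sum(comb(n, i) * ratio ** i * (1.0 - ratio) ** (n - i) * p_uq[n]
            for n in range(i, ell + 1))
        for i in range(ell + 1)
    ]

# Hypothetical UQ distribution for space capacity 2 and ratio 0.5.
p_dq = dq_marginal([0.5, 0.3, 0.2], 0.5)
```

The resulting probabilities sum to one, and 1 − P(DQ(k) = 0) coincides with the value given by Equation (2.18) for the same inputs.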

Algorithm

Algorithm 1 summarizes the numerical evaluation of the UQ model. In the algorithm, we

omit the computation of the marginal distribution of DQ at each time interval k. However,

all the parameters in Equation (2.20) are stored and thus the distribution of DQ for any

time interval k can be computed if needed.
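The steps of the univariate upstream queue algorithm can be assembled into a short runnable sketch. Several choices below are assumptions made for illustration and are not taken from the source: the link starts empty, the matrix exponential is evaluated by uniformization, kbwd is at least 1, and the ratios qLLO/qUQ and qDQ/qUQ are set to zero when qUQ is not positive and clipped to [0, 1]. All names and numerical values are hypothetical.

```python
import math

def expm_step(p0, Q, delta, tol=1e-12, max_terms=100000):
    """Row vector p0 * exp(delta * Q), evaluated by uniformization."""
    n = len(p0)
    rate = max(-Q[i][i] for i in range(n))
    if rate <= 0.0:
        return list(p0)
    P = [[(1.0 if i == j else 0.0) + Q[i][j] / rate for j in range(n)]
         for i in range(n)]
    weight = math.exp(-rate * delta)
    term, total = list(p0), weight
    out = [weight * x for x in term]
    for m in range(1, max_terms):
        if 1.0 - total <= tol:
            break
        term = [sum(term[i] * P[i][j] for i in range(n)) for j in range(n)]
        weight *= rate * delta / m
        total += weight
        out = [o + weight * x for o, x in zip(out, term)]
    return out

def run_uq_model(ell, k_fwd, k_bwd, delta, lam, mu, n_steps):
    """lam[k], mu[k] for k = 1, ..., n_steps (index 0 unused). Returns per-interval results."""
    q_in, q_out = [0.0], [0.0]          # q_in(0), q_out(0): empty link (assumption)
    p_uq = [1.0] + [0.0] * ell          # P(UQ(0) = 0) = 1 (assumption)
    history = []
    for k in range(1, n_steps + 1):
        # step (a): Eq. (2.10), (2.11), (2.15); sums over negative indices are empty
        q_uq = sum(q_in[:k]) - sum(q_out[:max(k - k_bwd, 0)])
        q_llo = q_out[k - k_bwd] if k >= k_bwd else 0.0
        q_dq = sum(q_in[:max(k - k_fwd, 0)]) - sum(q_out[:k])
        # guard (assumption): zero ratios for an empty link, clipped to [0, 1]
        r_llo = min(max(q_llo / q_uq, 0.0), 1.0) if q_uq > 0.0 else 0.0
        r_dq = min(max(q_dq / q_uq, 0.0), 1.0) if q_uq > 0.0 else 0.0
        # steps (b)-(c): Table 2.1 with mu_UQ(uq;k) = uq * r_llo / delta
        Q = [[0.0] * (ell + 1) for _ in range(ell + 1)]
        for uq in range(ell + 1):
            if uq < ell:
                Q[uq][uq + 1] = lam[k]
            if uq > 0:
                Q[uq][uq - 1] = uq * r_llo / delta
            Q[uq][uq] = -sum(Q[uq])     # Eq. (2.5); the diagonal entry is still 0 here
        p_uq = expm_step(p_uq, Q, delta)                 # step (d): Eq. (2.4)
        p_dq_pos = 1.0 - sum((1.0 - r_dq) ** n * p       # step (e): Eq. (2.18)
                             for n, p in enumerate(p_uq))
        q_in.append(lam[k] * (1.0 - p_uq[ell]))          # step (f): Eq. (2.13)
        q_out.append(mu[k] * p_dq_pos)                   # step (f): Eq. (2.14)
        history.append((p_uq, q_in[k], q_out[k]))
    return history

# Hypothetical run: capacity 5, constant demand 0.3 and bottleneck capacity 0.5.
hist = run_uq_model(5, 2, 3, 1.0, [0.0] + [0.3] * 20, [0.0] + [0.5] * 20, 20)
```

Each entry of `hist` holds the UQ distribution at the end of an interval together with the expected inflow and outflow of that interval.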

2.2.4 Univariate downstream queue model

The approach to formulate the univariate DQ model is similar to that used for the univari-

ate UQ model of Section 2.2.3. We model DQ as a birth-death process with finite state

space defined by {dq ∈ [0, ℓ]}. Just as for the UQ model, the transient distribution of DQ,

P(DQ(k)), satisfies an equation of the form (2.4) with initial conditions P(DQ(k−1)) and

transition rate matrix QDQ(k). The non-diagonal and non-null elements of the transition

rate matrix of DQ, QDQ(k) are given in Table 2.2. The first row of Table 2.2 describes

the event of arrivals to DQ, i.e., flow that transitions from LI to DQ (see Figure 2-1). The

second row describes the event of flow departing DQ (i.e., departing the link). The corresponding rate µ(k) is the downstream bottleneck flow capacity and is considered exogenous for an isolated link. The diagonal elements of the transition rate matrix are computed as in Equation (2.5).

Algorithm 1 Algorithm of the univariate upstream queue model

1. set exogenous parameters ρ, v, w, ℓ and δ

2. set arrival and service rates over time, λ(k) and µ(k), for all k = 1, 2, ...

3. compute kfwd = ⌈ℓ/(ρvδ)⌉ and kbwd = ⌈ℓ/(ρ|w|δ)⌉

4. set exogenous initial link conditions: qin(0), qout(0), P(UQ(0)), qUQ(0), qLLO(0), and qDQ(0)

5. set qin(r) = 0 and qout(r) = 0 for r < 0

6. repeat the following for time intervals k = 1, 2, ...

(a) compute qUQ(k), qLLO(k) and qDQ(k) according to Eq. (2.10), (2.11), and (2.15), respectively

(b) for uq = 0, 1, ..., ℓ, compute µUQ(uq;k) according to Eq. (2.9) and (2.12)

(c) form the transition rate matrix QUQ(k) defined in Table 2.1

(d) compute P(UQ(k)) according to Eq. (2.4)

(e) compute P(DQ(k) > 0) according to Eq. (2.18)

(f) compute qin(k) and qout(k) according to Eq. (2.13) and (2.14)

Table 2.2 indicates that the transition rate matrix of DQ is defined by two rates: (i)

an endogenous arrival rate λDQ(k) and (ii) an exogenous service rate µ(k). This table is

simpler than that of the UQ model (Table 2.1) because both rates are state-independent (i.e.,

neither depends on the state dq). In Table 2.1, the service rate of UQ is state-dependent,

i.e., µUQ(uq;k) depends on uq.

For a finite capacity birth-death process with state-independent rates, Morse (1958, Equation (6.13), Chap. 6) provides a closed-form solution of Equation (2.4), which avoids the need to numerically evaluate the matrix exponential. For time interval k of length δ, the DQ distribution at the end of time interval k, P(DQ(k)), is given by:

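\begin{align}
P(DQ(k) = n) &= \sum_{m=0}^{\ell} P(DQ(k-1) = m)\, P_{mn}(\delta) \quad \text{for } 0 \le n \le \ell \tag{2.21a}\\
P_{mn}(\delta) &= P_n(k) + \frac{2\rho(k)^{\frac{n-m}{2}}}{\ell+1} \sum_{s=1}^{\ell} \frac{\mu(k)}{\gamma_s(k)} \left[\sin\left(\frac{sm\pi}{\ell+1}\right) - \sqrt{\rho(k)}\, \sin\left(\frac{s(m+1)\pi}{\ell+1}\right)\right] \notag\\
&\qquad \cdot \left[\sin\left(\frac{sn\pi}{\ell+1}\right) - \sqrt{\rho(k)}\, \sin\left(\frac{s(n+1)\pi}{\ell+1}\right)\right] e^{-\gamma_s(k)\delta} \tag{2.21b}\\
\gamma_s(k) &= \lambda_{DQ}(k) + \mu(k) - 2\sqrt{\lambda_{DQ}(k)\,\mu(k)}\, \cos\left(\frac{s\pi}{\ell+1}\right) \quad \text{for } s = 1, 2, \dots, \ell \tag{2.21c}\\
P_n(k) &= \left(\frac{1-\rho(k)}{1-\rho(k)^{\ell+1}}\right) \rho(k)^{n} \tag{2.21d}\\
\rho(k) &= \frac{\lambda_{DQ}(k)}{\mu(k)}. \tag{2.21e}
\end{align}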

Equation (2.21a) states that the distribution of DQ at the end of time interval k is a convex combination of the distributions Pmn(δ), each of which is defined in Equation (2.21b) as the sum of: (i) the stationary probability of being in state n, which is denoted Pn(k) and defined by Equation (2.21d), and (ii) a time-dependent term with exponential decay. The exponential decay is parameterized by γs(k), which is defined by Equation (2.21c) and is referred to in the queuing literature as the inverse of the relaxation time. In summary, the distribution of DQ is given by (2.21), which depends on two rates: (i) an exogenous service rate µ(k) and (ii) an endogenous arrival rate λDQ(k). We now describe how we approximate this endogenous arrival rate.

| initial state | new state | rate | condition |
|---|---|---|---|
| dq | dq + 1 | λDQ(k) | dq < ℓ |
| dq | dq − 1 | µ(k) | dq > 0 |

Table 2.2: Transition rate table of DQ.
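The closed-form expressions (2.21a) to (2.21e) can be transcribed as in the sketch below. It assumes strictly positive rates with λDQ(k) ≠ µ(k), since the stationary expression (2.21d) is undefined for ρ(k) = 1; the function names and the numerical values are hypothetical.

```python
import math

def dq_transition_probs(ell, lam_dq, mu, delta):
    """Matrix of P_mn(delta) from Eq. (2.21b)-(2.21e).

    Assumes lam_dq > 0, mu > 0 and lam_dq != mu.
    """
    rho = lam_dq / mu                                              # Eq. (2.21e)
    norm = (1.0 - rho) / (1.0 - rho ** (ell + 1))
    stationary = [norm * rho ** n for n in range(ell + 1)]         # Eq. (2.21d)
    gamma = [lam_dq + mu
             - 2.0 * math.sqrt(lam_dq * mu) * math.cos(s * math.pi / (ell + 1))
             for s in range(1, ell + 1)]                           # Eq. (2.21c)

    def bracket(s, i):
        # Bracketed factor of Eq. (2.21b) for mode s and state i
        return (math.sin(s * i * math.pi / (ell + 1))
                - math.sqrt(rho) * math.sin(s * (i + 1) * math.pi / (ell + 1)))

    P = []
    for m in range(ell + 1):
        row = []
        for n in range(ell + 1):
            series = sum(mu / gamma[s - 1] * bracket(s, m) * bracket(s, n)
                         * math.exp(-gamma[s - 1] * delta)
                         for s in range(1, ell + 1))
            row.append(stationary[n]
                       + 2.0 * rho ** ((n - m) / 2.0) / (ell + 1) * series)
        P.append(row)
    return P

def dq_step(p_prev, P):
    """Eq. (2.21a): distribution of DQ at the end of the interval."""
    size = len(p_prev)
    return [sum(p_prev[m] * P[m][n] for m in range(size)) for n in range(size)]

# Hypothetical example: capacity 4, arrival rate 1, service rate 2, delta = 0.7.
P_example = dq_transition_probs(4, 1.0, 2.0, 0.7)
p_next = dq_step([1.0, 0.0, 0.0, 0.0, 0.0], P_example)
```

Two simple sanity checks are that every row of the matrix sums to one and that the matrix reduces to the identity for δ = 0.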

Arrival rate of DQ

The distribution of DQ at the end of time interval k is given by the System of Equations (2.21), which depends on the endogenous rate, λDQ(k). In order to approximate this

rate, we observe that for a queue with finite space capacity ℓ and arrival rate λ, the expected

inflow to the queue is given by: λP(N < ℓ), where N represents the number of vehicles

in the queue. We use this property to obtain the following expression for the arrival rate to

DQ:

$$\lambda_{DQ}(k) \approx \frac{q_{in}(k - k_{fwd})}{P(DQ(k-1) < \ell)}. \tag{2.22}$$

The numerator qin(k−kfwd) represents the expected inflow to the link during time interval

k − kfwd, i.e., this is the flow that is expected to leave the last cell of LI (denoted LLI in

Figure 2-1) and enter DQ during time interval k. The denominator P(DQ(k − 1) < ℓ) is

based on the DQ distribution at the end of time interval k−1, which is the DQ distribution

at the beginning of time interval k.
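Equation (2.22) can be sketched in a few lines of Python. This is an illustration under stated assumptions: the function name is hypothetical, the DQ distribution is assumed to be stored as a list indexed by queue length 0..ℓ, and the guard against a zero denominator is an addition that the text does not discuss.

```python
def dq_arrival_rate(q_in_lagged, dq_dist_prev):
    """Approximate lambda_DQ(k) as in Eq. (2.22).

    q_in_lagged  : expected link inflow q_in(k - k_fwd)
    dq_dist_prev : list with P(DQ(k-1) = n) for n = 0..ell
    """
    ell = len(dq_dist_prev) - 1
    p_not_full = 1.0 - dq_dist_prev[ell]  # P(DQ(k-1) < ell)
    if p_not_full <= 0.0:
        # Added safeguard (assumption): the rate is undefined if DQ is surely full.
        raise ValueError("P(DQ(k-1) < ell) must be positive")
    return q_in_lagged / p_not_full


# Illustrative numbers (assumed): a queue of capacity 2 that is full with
# probability 0.1 and a lagged expected inflow of 0.27 veh/sec.
rate = dq_arrival_rate(0.27, [0.5, 0.4, 0.1])
```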

Expected link inflow and outflow

Given the univariate model of DQ, we now describe how it can be used to compute the expected inflow and expected outflow of the link during time interval k. Recall their definitions given in (3.1) and (3.2). The System of Equations (2.21) yields the marginal distribution of DQ, hence the expected outflow qout(k) (defined by Equation (3.2)) can be directly computed.

In order to compute the expected inflow qin(k) (defined by Equation (3.1)) we need P(UQ(k) < ℓ). However, the univariate DQ model does not track UQ directly. Let us now describe how this probability is approximated.

We express P(UQ(k) < ℓ) as a function of the conditional distribution of {UQ−DQ}

given DQ:

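\begin{align}
P(UQ(k) < \ell) &= 1 - P(UQ(k) = \ell) \tag{2.23}\\
&= 1 - \sum_{n=0}^{\ell} P(UQ(k) = \ell \mid DQ(k) = n)\,P(DQ(k) = n) \tag{2.24}\\
&= 1 - \sum_{n=0}^{\ell} P(UQ(k) - DQ(k) = \ell - n \mid DQ(k) = n)\,P(DQ(k) = n) \tag{2.25}\\
&\approx 1 - \sum_{n=0}^{\ell} p_1(k)^{\ell-n}\,P(DQ(k) = n). \tag{2.26}
\end{align}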

Equation (2.25) is obtained from (2.24) by observing that P(UQ(k) = ℓ | DQ(k) = n)

equals P(UQ(k) − DQ(k) = ℓ − n | DQ(k) = n). Equation (2.26) is obtained by

approximating the conditional distribution of {UQ−DQ} given {DQ = n} with a binomial

distribution with parameters (ℓ− n, p1(k)).

The first parameter of this distribution ℓ − n is derived by observing that the random

variable {UQ − DQ} given {DQ = n} can only take values in [0, ℓ − n]. Let us detail

this. Equation (2.1) implies UQ − DQ ≥ 0. Additionally, by definition UQ ≤ ℓ. Thus,

conditional on DQ = n, we have UQ−DQ ≤ ℓ− n.

Let us now approximate the second parameter of this binomial distribution, p1(k).

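\begin{align}
E[UQ(k)] &= E[DQ(k)] + E[UQ(k) - DQ(k)] \tag{2.27}\\
&= E[DQ(k)] + E\big[E[UQ(k) - DQ(k) \mid DQ(k)]\big] \tag{2.28}\\
&= E[DQ(k)] + \sum_{n=0}^{\ell} E[UQ(k) - DQ(k) \mid DQ(k) = n]\,P(DQ(k) = n) \tag{2.29}\\
&\approx E[DQ(k)] + \sum_{n=0}^{\ell} (\ell - n)\,p_1(k)\,P(DQ(k) = n) \tag{2.30}\\
&= E[DQ(k)] + p_1(k)\,(\ell - E[DQ(k)]). \tag{2.31}
\end{align}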

Equation (2.27) is obtained by adding and subtracting E[DQ(k)] on the right hand side.

The law of total expectation is used in (2.28), and rewritten in more detail in (2.29). Since

{UQ − DQ} conditional on {DQ = n} is approximated as a binomial distribution with

parameters (ℓ−n, p1(k)), then E[UQ−DQ | DQ = n] equals (ℓ−n)p1(k), which leads

to (2.30). The summation is simplified to obtain (2.31), which itself can be rearranged to

obtain the approximation for p1(k):

\[
p_1(k) \approx \frac{E[UQ(k)] - E[DQ(k)]}{\ell - E[DQ(k)]}. \tag{2.32}
\]

In order to evaluate Equation (2.32), E[DQ(k)] can be computed from the marginal distribution of DQ (System of Equations (2.21)) as:

\[
E[DQ(k)] = \sum_{n=0}^{\ell} n\,P(DQ(k) = n), \tag{2.33}
\]

and E[UQ(k)] can be obtained from the approximation of UQ as a Poisson process with rate defined by Equation (2.10), and thus

\[
E[UQ(k)] \approx q_{UQ}(k)\cdot\delta = \left(\sum_{r=0}^{k-1} q_{in}(r) - \sum_{r=0}^{k-k_{bwd}-1} q_{out}(r)\right)\cdot\delta. \tag{2.34}
\]

In summary, P(UQ(k) < ℓ) is approximated by Equation (2.26), with p1(k) given by Equation (2.32) and P(DQ(k) = n) given by the System of Equations (2.21).
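The chain formed by Equations (2.33), (2.32) and (2.26) can be illustrated with a short Python sketch. The function names and the numerical example are assumptions made for illustration. Clipping p1(k) to [0, 1] and returning 0 when DQ is surely full are added safeguards that the derivation above does not mention.

```python
def expected_value(dist):
    """E[DQ(k)] from the marginal distribution, as in Eq. (2.33)."""
    return sum(n * p for n, p in enumerate(dist))


def p1_estimate(e_uq, dq_dist):
    """Binomial parameter p1(k), as in Eq. (2.32)."""
    ell = len(dq_dist) - 1
    e_dq = expected_value(dq_dist)
    if ell - e_dq <= 0.0:
        return 0.0  # added safeguard: DQ is surely full, so UQ - DQ = 0
    p1 = (e_uq - e_dq) / (ell - e_dq)
    return min(1.0, max(0.0, p1))  # added safeguard: keep a valid probability


def prob_uq_not_full(dq_dist, p1):
    """P(UQ(k) < ell), as in Eq. (2.26)."""
    ell = len(dq_dist) - 1
    return 1.0 - sum(p1 ** (ell - n) * p for n, p in enumerate(dq_dist))


# Illustrative numbers (assumed): capacity ell = 2 and E[UQ(k)] = 1.35.
dq_dist = [0.5, 0.3, 0.2]
p1 = p1_estimate(1.35, dq_dist)
p_not_full = prob_uq_not_full(dq_dist, p1)
```

With these numbers E[DQ(k)] = 0.7, so p1(k) = 0.65 / 1.3 = 0.5, and Equation (2.26) gives 1 − (0.25 · 0.5 + 0.5 · 0.3 + 0.2) = 0.525.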


Marginal distribution of UQ

The univariate DQ model can be used to approximate the entire marginal distribution of

UQ, by proceeding similarly as in the derivation of Equation (2.26). For all i ∈ [0, ℓ]:

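\begin{align}
P(UQ(k) = i) &= \sum_{n=0}^{i} P(UQ(k) = i \mid DQ(k) = n)\,P(DQ(k) = n) \tag{2.35}\\
&= \sum_{n=0}^{i} P(UQ(k) - DQ(k) = i - n \mid DQ(k) = n)\,P(DQ(k) = n) \tag{2.36}\\
&\approx \sum_{n=0}^{i} \binom{\ell - n}{i - n}\,p_1(k)^{i-n}\,(1 - p_1(k))^{\ell - i}\,P(DQ(k) = n). \tag{2.37}
\end{align}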

Here, P(DQ(k) = n) is given by the System of Equations (2.21), and p1(k) is given by Equation (2.32). Equation (2.37) is obtained by approximating the distribution of UQ(k) − DQ(k) conditional on DQ(k) = n as a binomial distribution with parameters (ℓ − n, p1(k)).
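As an illustration of Equation (2.37), the following Python sketch builds the approximate marginal distribution of UQ from a given DQ distribution and a given p1(k). The function name and the example values are assumptions; the DQ distribution is assumed to be stored as a list indexed by queue length 0..ℓ.

```python
from math import comb


def uq_distribution(dq_dist, p1):
    """Approximate P(UQ(k) = i) for i = 0..ell, as in Eq. (2.37)."""
    ell = len(dq_dist) - 1
    uq = []
    for i in range(ell + 1):
        total = 0.0
        for n in range(i + 1):
            # Binomial(ell - n, p1) probability of UQ - DQ = i - n, weighted
            # by the probability that DQ = n.
            total += (
                comb(ell - n, i - n)
                * p1 ** (i - n)
                * (1.0 - p1) ** (ell - i)
                * dq_dist[n]
            )
        uq.append(total)
    return uq


# Illustrative numbers (assumed): capacity ell = 2 and p1(k) = 0.5.
uq = uq_distribution([0.5, 0.3, 0.2], 0.5)
```

For this example the result is (0.125, 0.4, 0.475). The probabilities sum to one, and the last entry, P(UQ(k) = ℓ), equals one minus the value that Equation (2.26) yields for the same inputs.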

Algorithm

Algorithm 2 summarizes the numerical evaluation of the DQ model. In the algorithm, we

omit the computation of the marginal distribution of UQ at each time interval k. However,

all the parameters in Equation (2.37) are stored, and thus the distribution of UQ for any

time interval k can be computed if needed.

Algorithm 2 Algorithm of the univariate downstream queue model

4. set exogenous initial link conditions: qin(0), qout(0), P(DQ(0)), qUQ(0), and qLLI(0)

(a) compute qUQ(k) according to Eq. (2.10)

(b) compute λDQ(k) according to Eq. (2.22)

(c) compute P(DQ(k)) according to the System of Equations (2.21)

(d) compute E[UQ(k)] according to Eq. (2.34)

(e) compute E[DQ(k)] according to Eq. (2.33)

(f) compute p1(k) according to Eq. (2.32)

(g) compute P(UQ(k) < ℓ) according to Eq. (2.26)

(h) compute qin(k) and qout(k) according to Eq. (3.1) and (3.2)

2.2.5 Mixture model

Recall that by design the role of UQ is to capture the link’s upstream boundary conditions, while that of DQ is to capture the link’s downstream boundary conditions. In order to capture both the link’s upstream and downstream boundary conditions, while ensuring a model suitable for large-scale network analysis, we propose a link model that is a mixture of the univariate UQ model (formulated in Section 2.2.3) and of the univariate DQ model (formulated in Section 2.2.4). The proposed model is given by:

\begin{align}
P(UQ(k)) &= w\,P_{UQ}(UQ(k)) + (1 - w)\,P_{DQ}(UQ(k)) \tag{2.38}\\
P(DQ(k)) &= w\,P_{UQ}(DQ(k)) + (1 - w)\,P_{DQ}(DQ(k)), \tag{2.39}
\end{align}

where the following notation is used:

PUQ(UQ(k))    UQ distribution from the UQ model (Equation (2.4));

PDQ(UQ(k))    UQ distribution from the DQ model (Equation (2.37));

PUQ(DQ(k))    DQ distribution from the UQ model (Equation (2.20));

PDQ(DQ(k))    DQ distribution from the DQ model (System of Equations (2.21)).

An analytical expression for the weight parameter, w, is derived through insights obtained from a variety of numerical experiments. Its expression is given by:

\[
w(\ell, \mu, k_{fwd}\delta) = e^{-\frac{\ell^{2}}{70\,\mu\,k_{fwd}\delta}}. \tag{2.40}
\]

The experiments compared the performance of the proposed mixture model to that of a

discrete-event simulation model used in Osorio and Flötteröd (2015), which implements the

stochastic link transmission model. It samples individual vehicles. The forward and back-

ward lags are explicitly implemented on each vehicle. A total of 180 experiments were con-

ducted considering combinations of ℓ ∈ {5, 10, 15, . . . , 100}; ρ = λ/µ ∈ {0.25, 0.5, 0.75};

µ ∈ {0.2, 0.4, 0.6}. A more detailed description of the derivation of weight parameter, w,

is given in Appendix A.
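The mixture of Equations (2.38)-(2.39) with the weight of Equation (2.40) can be sketched as follows. This sketch assumes that Equation (2.40) reads w = exp(−ℓ² / (70 µ kfwd δ)), where kfwd δ is the forward lag expressed in seconds. The function names and the example distributions are hypothetical and are not taken from the thesis.

```python
import math


def mixture_weight(ell, mu, fwd_lag_seconds):
    """Weight w, assuming Eq. (2.40) reads exp(-ell^2 / (70 * mu * k_fwd * delta))."""
    return math.exp(-(ell ** 2) / (70.0 * mu * fwd_lag_seconds))


def mix(dist_from_uq_model, dist_from_dq_model, w):
    """Convex combination of two distributions, as in Eqs. (2.38)-(2.39)."""
    return [
        w * a + (1.0 - w) * b
        for a, b in zip(dist_from_uq_model, dist_from_dq_model)
    ]


# Illustrative values (assumed): a short link (ell = 10, forward lag 5 sec)
# and a long link (ell = 100, forward lag 50 sec), both with mu = 0.4 veh/sec.
w_short = mixture_weight(10, 0.4, 5.0)
w_long = mixture_weight(100, 0.4, 50.0)
mixed = mix([0.7, 0.2, 0.1], [0.5, 0.3, 0.2], w_short)
```

Under this reading of the formula, the weight of the UQ model decreases as the link gets longer, so the DQ model dominates the mixture for long links.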

For the mixture model, the expected inflow and outflow, i.e. qin(k) and qout(k), are

obtained according to Equations (3.1) and (3.2) where P(UQ(k) < ℓ) and P(DQ(k) > 0)

are given by (2.38) and (2.39), respectively. Algorithm 3 summarizes the mixture model approach. Notice that steps 7 and 8 of the algorithm are independent of each other and can be run in parallel to further reduce the runtime.

Algorithm 3 Algorithm of the mixture model

4. compute w according to Eq. (2.40)

5. set exogenous initial link conditions: qin(0), qout(0), P(UQ(0)), P(DQ(0)), qUQ(0), qLLO(0), qLLI(0) and qDQ(0)

7. run step 6 of algorithm 4, this yields PUQ(UQ(k)) for all k = 1, 2...

8. run step 6 of algorithm 2, this yields PDQ(DQ(k)) for all k = 1, 2...

9. for any time interval k,

(a) compute PUQ(DQ(k)) according to Eq. (2.20)

(b) compute PDQ(UQ(k)) according to Eq. (2.37)

(c) compute P(UQ(k)) according to Eq. (2.38)

(d) compute P(DQ(k)) according to Eq. (2.39)

(e) compute qin(k) and qout(k) according to Eq. (3.1) and (3.2)
Parameter           Value
v                   0.01 km/sec
w                   −0.005 km/sec
ρ                   200 veh/km
q                   2400 veh/h = 0.67 veh/sec
δ                   0.1 sec
µ(k)                1440 veh/h = 0.4 veh/sec
λ(k)                varies by experiment
ℓ, L, kfwd, kbwd    varies by experiment

Table 2.3: Link Parameters

2.3 Validation

In this section we validate the proposed mixture model in terms of both computational runtime and accuracy. First, we compare the computational runtimes of the proposed model to those of the multivariate model (Osorio and Flötteröd; 2015). We consider a single-lane link with parameters shown in Table 2.3. The link configuration is the same

as that used in Osorio and Flötteröd (2015) except for the service rate. The service rate

of the link is fixed at 0.4 veh/sec for all experiments. The experiments consider different

arrival rates and link lengths (and hence, different space capacities, forward lags and back-

ward lags). We consider a set of three different arrival rates (λ ∈ {0.1, 0.2, 0.3} veh/sec)

and seven different space capacities (ℓ ∈ {10, 20, 30, 40, 60, 80, 100}). The combination

of these values leads to a total of 21 experiments. The considered space capacity values

correspond to link lengths L ∈ {50, 100, 150, 200, 300, 400, 500} (in meters), forward lags

kfwd ∈ {5, 10, 15, 20, 30, 40, 50} and backward lags kbwd ∈ {10, 20, 30, 40, 60, 80, 100}.

Each experiment starts with an empty link at time zero and runs for 250 seconds at which

point the link is ensured to have reached a stationary regime. All experiments are carried

out on a standard laptop machine with Intel Core i7-4700HQ CPU running at 2.40 GHz.

Figure 2-2 compares the runtimes of the mixture model (circles) and of the multivariate model (asterisks). The x-axis shows the space capacity values ℓ. The y-axis displays the average computational runtime (in minutes) on a logarithmic scale. The average is computed over the three experiments with different arrival rate values. The maximum runtime for evaluating an experiment is set to 40 hours: an experiment that has not concluded within 40 hours is terminated. For ℓ = 30 the average runtime of the multivariate model is already 2366 minutes (≈ 39 hours). Hence, for experiments where ℓ > 30, the multivariate model is not evaluated. Figure 2-2 illustrates that the runtime of the multivariate model increases exponentially with ℓ, while for the mixture model the increase appears linear. For the mixture model, the average runtime over all 21 experiments is 0.05 minutes. The maximum average runtime is obtained for ℓ = 100 and is 0.11 minutes. Thus, compared to the multivariate model, the mixture model achieves significant improvements in computational complexity both theoretically and numerically.

Figure 2-2: Model runtime comparison (x-axis: space capacity ℓ; y-axis: runtime in minutes, logarithmic scale; curves: Mixture, Multivariate)

We now compare the multivariate model and the mixture model in terms of their ac-

curacy. In order to evaluate the accuracy of each of these analytical models, we use a

discrete-event simulator of the stochastic link transmission model. The simulator is the

same as that used for validation in Osorio and Flötteröd (2015). It samples individual vehi-

cles, and implements for each vehicle exact forward and backward lags. The arrival process

is a Poisson process. For vehicles at the downstream end of the link, inter-departure times

are independent and identically distributed exponential random variables. The simulated estimates are obtained from 10^6 replications.

First, we consider two experiments with temporal variations in demand and evaluate

the ability of the analytical models to approximate the transient distributions of UQ and

of DQ. For both experiments, ℓ = 10. Experiment 1 has an arrival rate of 0.1 veh/sec

during time [0, 125] seconds, an arrival rate of 0.5 veh/sec during time [125, 175] seconds

and an arrival rate of 0.3 veh/sec during time [175, 300] seconds. This experiment corre-

sponds to step-changes from uncongested to highly-congested (i.e. λ(k) > µ(k)) and then

to congested traffic conditions. Experiment 2 considers first an arrival rate of 0.3 veh/sec

during time [0, 100] seconds, of 0.1 veh/sec during time [100, 200] seconds and then of 0.5

veh/sec during time [200, 300] seconds. This experiment corresponds to step-changes from

congested to uncongested and then to highly-congested traffic conditions. The two experiments are designed such that the highly-congested period (where λ(k) > µ(k)) is too short in Experiment 1 (50 seconds) for the transient distribution to converge to its stationary counterpart, while in Experiment 2 (100 seconds) it is long enough.

Figure 2-3 considers Experiment 1. Each plot of Figure 2-3(a) considers a given time T (in seconds) and displays the distribution of UQ, P(UQ(T)), at time T as given by the mixture model (red squares), the multivariate model (blue diamonds) and the simulated estimates (black crosses). The different plots consider different times: T ∈ {1, 30, 60, 90, 120, 150, 180, 210, 240, 270} seconds. Similarly, each plot of Figure 2-3(b) displays the distribution of DQ, P(DQ(T)), at time T. The simulated estimates are displayed with 95% confidence intervals, which are so narrow that they are barely visible.

Recall that for this experiment, there is a sharp increase in demand at time T = 125

sec and a sharp decrease at time T = 175 sec. The changes in the distributions of UQ and

DQ after time T = 125 seconds and T = 175 seconds are visible for all models. During

time [125, 175], states with higher values of UQ (resp. DQ) have higher probabilities.

After time T = 175, states with higher values of UQ (resp. DQ) have comparably lower

probabilities. Figures 2-3(a) and 2-3(b) show that the dynamics of the simulator are well

approximated by both the mixture and the multivariate models. Additionally, both analyti-

cal models converge, both before T = 125 seconds and after T = 175 seconds, to stationary

distributions that approximate well the simulated distribution.

The plots of Figure 2-3(c) display, respectively, E[UQ(T)] and E[DQ(T)] as a function


of time T . The sharp increase in expectation after time T = 125 seconds and the sharp

decrease after time T = 175 seconds are well approximated by both analytical models.

The stationary values before T = 125 seconds and after T = 175 seconds are also well

approximated.

Note also that for all three models considered here (mixture, multivariate and simula-

tor) their arrival process and their departure process are stochastic. Hence, spillback may

occur even when µ(k) > λ(k). More specifically, the spillback probability is given by

P(UQ(T) = ℓ). For instance, in the right-most plot of the second row of Figure 2-3(a), the

spillback probability is non-zero (i.e., P(UQ(T) = ℓ) > 0).

Experiment 2 considers a sharp decrease in demand at T = 100 seconds and a sharp

increase in demand at T = 200. Figures 2-4(a) and 2-4(b) display, respectively, the distri-

butions of UQ and of DQ as a function of time (i.e., P(UQ(T)) and P(DQ(T))). In this

experiment, we observe a shift in probability mass to states with smaller values of UQ and

DQ during time [100, 200] seconds and a shift in probability mass to states with larger val-

ues of UQ and DQ after time T = 200 seconds. In this experiment, both analytical models

converge to the stationary distribution after each change in demand. The conclusions here

are the same as for the previous experiment: both the stationary and the transient distri-

butions are well approximated by the analytical models. The time-dependent expectations

E[UQ(T)] and E[DQ(T)] are displayed in Figure 2-4(c). Again, the dynamics are well cap-

tured by both analytical models. In summary, for Experiments 1 and 2, the approximations

of both the mixture and the multivariate models are good. The transient and the stationary

distributions are well approximated by both models.

Figure 2-3: Experiment 1: impact of the temporal variation of demand on the distributions, as well as the expected values, of UQ and of DQ. Panels: (a) Distribution of UQ over time, P(UQ(T)); (b) Distribution of DQ over time, P(DQ(T)); (c) Expectation of UQ and of DQ over time, E[UQ(T)] and E[DQ(T)]. Panels (a) and (b) show P(UQ(T) = n) and P(DQ(T) = n) versus n for T ∈ {1, 30, 60, 90, 120, 150, 180, 210, 240, 270}; curves: multivariate, mixture, simulation.

Figure 2-4: Experiment 2: impact of the temporal variation of demand on the distributions, as well as the expected values, of UQ and of DQ. Panels: (a) Distribution of UQ over time, P(UQ(T)); (b) Distribution of DQ over time, P(DQ(T)); (c) Expectation of UQ and of DQ over time, E[UQ(T)] and E[DQ(T)].

We now evaluate the accuracy of the mixture model over a larger set of experiments. We consider the 21 experiments mentioned above. The main goal is to evaluate the loss of accuracy of the mixture model compared to the (less scalable but more accurate) multivariate model. In order to evaluate the accuracy of a given distribution (UQ or DQ), we evaluate its distance to the distribution estimated via simulation with the stochastic LTM simulator described previously and used for validation in Osorio and Flötteröd (2015). Recall that this simulator is an exact implementation of the stochastic LTM. The distance between an analytical distribution (mixture or multivariate) and the simulated distribution is evaluated with the Jensen-Shannon divergence (JSD) metric (Endres and Schindelin; 2003). For a pair of distributions P1 and P2, the JSD metric is defined by:

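\begin{align}
\mathrm{JSD}(P_1 \,\|\, P_2) &= \frac{1}{2} D(P_1 \,\|\, M) + \frac{1}{2} D(P_2 \,\|\, M) \tag{2.41}\\
D(P_1 \,\|\, P_2) &= \sum_{i} P_1(i) \log\frac{P_1(i)}{P_2(i)}, \tag{2.42}
\end{align}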

where D(P1 ∥ P2) is the Kullback-Leibler divergence (KLD) (Kullback and Leibler; 1951) and M = (P1 + P2)/2. Unlike the KLD, the JSD is both symmetric and upper bounded by 1 (when the logarithm is taken in base 2). The lower the JSD value, the smaller the distance between the two distributions, i.e., the higher the accuracy. We define the time-average JSD over the entire time period (i.e., 250 seconds) as the temporal mean of the JSD values, i.e.:

\[
\frac{1}{250}\sum_{T=1}^{250} \mathrm{JSD}(P_1(T) \,\|\, P_2(T)),
\]

where P1(T) and P2(T) are the distributions evaluated at time T.
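The JSD of Equations (2.41)-(2.42) and its temporal mean can be sketched in Python as follows. The function names are illustrative assumptions. The sketch uses base-2 logarithms, so that the JSD lies in [0, 1], and adopts the usual convention that terms with P1(i) = 0 contribute zero to the KLD.

```python
import math


def kld(p, q):
    """Kullback-Leibler divergence D(p || q), Eq. (2.42), with base-2 logs."""
    # Terms with p_i = 0 contribute zero by convention.
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def jsd(p, q):
    """Jensen-Shannon divergence, Eq. (2.41)."""
    m = [0.5 * (pi + qi) for pi, qi in zip(p, q)]
    return 0.5 * kld(p, m) + 0.5 * kld(q, m)


def time_average_jsd(seq_p, seq_q):
    """Temporal mean of the JSD over two sequences of distributions."""
    values = [jsd(p, q) for p, q in zip(seq_p, seq_q)]
    return sum(values) / len(values)


# Illustrative check (assumed data): identical distributions at the first
# time step and disjoint distributions at the second.
avg = time_average_jsd(
    [[0.5, 0.5], [1.0, 0.0]],
    [[0.5, 0.5], [0.0, 1.0]],
)
```

Because the mixture M is positive wherever P1 or P2 is positive, the ratio inside the logarithm is always well defined, which is one practical advantage of the JSD over the KLD when comparing distributions with different supports.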

Since the main goal is to evaluate the accuracy loss of the mixture model compared to

the multivariate model, we will compare the time-average JSD values of the mixture model

(i.e., the time-average JSD distance between the distribution approximated by the mixture

model and the simulated distribution) and the time-average JSD values of the multivariate

model (i.e., the time-average JSD distance between the distribution approximated by the

multivariate model and the simulated distribution). In order to guide us in the interpreta-

tion of the magnitude of the JSD metric, we provide three additional models to compare

the proposed model with: (i) the deterministic LTM (denoted DetDet, which stands for

deterministic arrivals and deterministic departures), (ii) a simulation-based instance of the

LTM with deterministic arrivals and independent exponentially distributed inter-departure

times (denoted DetExp), (iii) a simulation-based instance of the LTM with independent ex-

ponentially distributed inter-arrival times and deterministic inter-departure times (denoted

ExpDet). Since DetDet is a deterministic traffic model, for a given experiment and a

given time, it generates a unique link state (i.e., the distribution has all the probability mass

concentrated in a single state). For the simulation-based models, the distributional estimates are obtained from 10^6 simulation replications. In summary, for a given experiment (out of the 21 experiments), a given model (mixture, multivariate, DetDet, DetExp and ExpDet) and a given distribution (UQ or DQ), we evaluate its distance to the simulated distribution using the time-average JSD metric.

As described above, the simulator consists of the deterministic LTM yet with a proba-

bilistic arrival process and a probabilistic departure process. Hence, the underlying distribu-

tions (of UQ and of DQ) it yields are expected to differ from those of the purely determin-

istic LTM. Thus, the time-average JSD values of DetDet can be interpreted as the effect

of extending the LTM with a given probabilistic arrival process and a given probabilistic

departure process. Similarly, the time-average JSD values of ExpDet (resp. DetExp) can

be interpreted as the effect of extending the LTM with a given probabilistic arrival (resp.

departure) process.

Figure 2-5 displays the time-average JSD values for the 21 experiments described above. The top (resp. bottom) row plots consider the UQ (resp. DQ) distribution. The first column of plots considers the experiments with arrival rate λ(k) = 0.1 veh/sec. The second and third columns consider arrival rate values of 0.2 and 0.3 veh/sec, respectively. Each plot compares 5 models: the mixture model (circles), the multivariate model (asterisks), DetDet (squares), ExpDet (triangles) and DetExp (crosses). Each plot displays the time-average JSD metric (y-axis) as a function of the space capacity (x-axis). Recall that for the multivariate model, the runtimes for the experiments with ℓ > 30 exceed 40 hours and are hence not computed. Figure 2-6 is a zoomed-in version of Figure 2-5. It displays only the mixture, the multivariate and the ExpDet models, which are those with the lowest error values (i.e., their curves mostly overlap along the x-axis in Figure 2-5).

For all plots of Figure 2-5, the time-average JSD values of DetDet and DetExp are significantly higher than those of the other models. In particular, the curves of the three other models (mixture, multivariate and ExpDet) are barely visible along the x-axis. Figure 2-6 presents in more detail the curves of these three models. For P(UQ(T)) (i.e., top row plots), the time-average JSD values of the mixture model are higher than those of the multivariate model and of ExpDet. Yet the values remain very small. For P(DQ(T)) (i.e., bottom row plots), the time-average JSD values of the ExpDet model are higher than those of the mixture and of the multivariate model. For space capacities ℓ ≥ 30, the curve of the mixture model overlaps with the x-axis and is barely visible, which indicates very high accuracy. Recall also that for ℓ > 30, the computation time of the multivariate model exceeds 40 hours, and the model is hence not evaluated. Overall, these experiments indicate that the loss of accuracy of the mixture model compared with the multivariate model is not significant. The numerical time-average JSD values displayed in Figure 2-5 are provided, for all experiments, in Tables 1 and 2 of Appendix B.

In summary, for experiments with both constant and time-varying demand, the mixture

model performs comparably with the multivariate model, while being significantly faster

to evaluate. The gain in computational runtime increases with the space capacity. In particular, for medium-dimensional state spaces (i.e., medium-sized links), the evaluation of the mixture model remains near-instantaneous (i.e., on the order of seconds), while the runtime of the multivariate model increases exponentially.

2.4 Network analysis

In this section, the proposed mixture model is used to address a traffic signal control prob-

lem for the city of Lausanne, Switzerland. Section 2.4.1 formulates the problem and de-

scribes the case study. Section 2.4.2 presents the numerical results and Section 2.4.3 com-

pares the performance of the resulting signal plans to that of a signal plan derived by a

commercial signal control software.

2.4.1 City-scale signal control

We consider the city of Lausanne, Switzerland. The city map is shown in Figure 2-7, and the area of consideration is delimited in white. The network model of a stochastic microscopic simulator is displayed in Figure 2-8. The network consists of 603 links, 902 lanes and 231 intersections. We consider a problem where we determine the signal plans of 17 intersections distributed throughout the city. These 17 intersections are depicted as squares in Figure 2-8. We consider a fixed-time signal control problem. For a review of traffic signal control terminology and formulations, see Appendix A of Osorio (2010). A fixed-time signal plan, also called time-of-day or pre-timed plan, is an off-line pre-determined plan that is periodic during a specific time of day (e.g., evening peak). Fixed-time plans are appropriate for networks with sparse or unreliable real-time data. They are also commonly used by major cities with high and uniformly distributed congestion levels, such as New York City (Osorio et al.; 2015).

Figure 2-5: Comparison of the JSD values for the 21 experiments with time-independent demand. Top row, panels (a)-(c): time-average JSD for P(UQ(T)); bottom row, panels (d)-(f): time-average JSD for P(DQ(T)). Columns: arrival rate λ(k) = 0.1, 0.2 and 0.3 veh/sec. x-axis: space capacity ℓ. Curves: Mixture, Multivariate, DetDet, ExpDet, DetExp.

Figure 2-6: Comparison of the JSD values for the 21 experiments with time-independent demand (zoomed-in results). Top row, panels (a)-(c): time-average JSD for P(UQ(T)); bottom row, panels (d)-(f): time-average JSD for P(DQ(T)). Columns: arrival rate λ(k) = 0.1, 0.2 and 0.3 veh/sec. x-axis: space capacity ℓ. Curves: Mixture, Multivariate, ExpDet.

Figure 2-7: Lausanne city road network (adapted from Dumont and Bert (2006))

We consider a fixed-time signal control problem for the 5:00-5:30pm evening peak. The

signal plans of the 17 intersections are determined jointly. The decision variables are the

green splits (i.e., normalized green times) of the phases of the different intersections. All

other traditional control variables (e.g., cycle times, offsets, stage structure) are assumed

fixed. This leads to a total of 99 endogenous signal phase variables, i.e., the dimension of

the decision vector is 99.


Figure 2-8: Lausanne network model

To formulate the problem, we introduce the following notation.

bd ratio of available cycle time to total cycle time for intersection d;

x vector of green splits;

x(j) green split of signal phase j;

xLB vector of lower bounds for green splits;

D set of intersection indices;

PD(d) set of endogenous signal phase indices of intersection d;

L set of all lanes;

T total number of one-minute time intervals;

N number of lanes, i.e., cardinality of L.

The problem is formulated as follows:

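\[
\min_{x}\; f(x) = \frac{1}{TN}\sum_{i \in L}\sum_{t=1}^{T} P(UQ_i(t; x) = \ell_i) \tag{2.43}
\]

subject to

\begin{align}
\sum_{j \in P_D(d)} x(j) &= b_d, \quad \forall d \in D \tag{2.44}\\
x &\geq x_{LB}. \tag{2.45}
\end{align}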

The decision vector, x, contains the green splits of the signal controlled lanes. The linear equality constraints (2.44) ensure that, for each intersection, the sum of green times equals the available cycle time. Constraint (2.45) ensures lower bounds for the green splits. The objective function (2.43) averages, over time and over all lanes, the spillback probability of each lane. This spillback probability is represented by P(UQi(t; x) = ℓi), which denotes the probability of UQ being full at integer time t under signal plan x. This problem formulation minimizes the spatial and temporal occurrence of spillbacks.
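Once the spillback probabilities of each lane and each time interval are available, the objective function reduces to a plain average. The Python sketch below illustrates this under the assumption that the probabilities P(UQi(t; x) = ℓi) have already been computed by the link model and are stored as a nested list indexed by lane and then by time interval; the function name and the data are hypothetical.

```python
def average_spillback_probability(spillback):
    """Average of P(UQ_i(t; x) = ell_i) over all lanes i and intervals t.

    spillback[i][t] holds the spillback probability of lane i at interval t.
    """
    n_lanes = len(spillback)
    n_intervals = len(spillback[0])
    total = sum(sum(lane_values) for lane_values in spillback)
    return total / (n_lanes * n_intervals)


# Illustrative data (assumed): two lanes and two one-minute intervals.
f_value = average_spillback_probability([[0.0, 0.2], [0.4, 0.2]])
```

In an optimization loop, each candidate signal plan x would change the service rates of the signalized lanes, the link model would be re-evaluated, and this average would be returned to the solver as f(x).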

The above signal control problem has a probabilistic formulation, which is naturally

addressed with probabilistic traffic models. Given the high computation times of the multi-

variate model (cf. Section 2.3), the above problem is only solved with the proposed mixture

model.

Implementation details

The values of the main exogenous parameters of the mixture model are displayed in Ta-

ble 2.4. The decision variables of this problem (the green splits of the signal plans) deter-

mine the downstream flow capacity of the underlying lanes. More specifically, for a signal

controlled lane i, its flow capacity is given by:

\[
\mu_i - \sum_{j \in P_I(i)} x(j)\,s = e_i s, \quad \forall i \in L, \tag{2.46}
\]

where s represents the saturation flow, ei represents the ratio of fixed green time to cycle

time of signalized lane i, PI(i) represents the set of endogenous signal phases of lane i and

L denotes the set of signal controlled lanes.

This chapter formulates a link model. It can be coupled with a probabilistic node model

to formulate a full network model. As is discussed in Section 2.5, the formulation of


probabilistic traffic-theoretic node models is part of ongoing work. In order to limit this

case study to the use of the link model (rather than link and node models), we assume

link demand to be exogenous, i.e., it does not vary with signal plans. Hence, the mixture

model is used to design signal plans that improve within-link traffic dynamics. Across-link

dynamics, or more generally changes in traffic assignment, are not accounted for in this

formulation. The results of this case study show that even with the use of such simplifying

assumptions (e.g., the lack of an endogenous node model), the link model identifies signal

plans with good network-wide performance.

The exogenous arrival rate (or demand rate) for lane i at time-interval k, denoted λi(k),

is computed, prior to optimization, by solving the following linear system of equations:

\[
\lambda_i(k) = \gamma_i + \sum_{j} p_{ji}\,\lambda_j(k), \quad \forall i \in L, \tag{2.47}
\]

where γi denotes the external arrival rate (i.e., the rate of trips that start at lane i) and pji is the turning
probability from lane j to lane i. Both γi and pji are exogenous and time-independent,
hence λ is also exogenous and time-independent. Equation (2.47) states that the arrival rate

of lane i is the sum of the external arrival rate γi to lane i and of the demand that arises

from upstream lanes. Problem (3.37)-(3.39) is solved using the Active-set algorithm of the

fmincon routine of Matlab (MATLAB; 2016).
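In matrix form, the linear system (2.47) reads (I − Pᵀ)λ = γ, where the entry (j, i) of P is the turning probability pji. The following Python sketch illustrates how such a system can be solved. It is a minimal illustration under stated assumptions: the three-lane network, its external arrival rates and its turning probabilities are hypothetical values, and the use of NumPy is an illustrative choice rather than the implementation of this case study.

```python
import numpy as np


def solve_arrival_rates(gamma, P):
    """Solve lambda = gamma + P^T lambda for the lane arrival rates.

    gamma[i] is the external arrival rate of lane i and P[j, i] is the
    turning probability from lane j to lane i (both exogenous).
    """
    gamma = np.asarray(gamma, dtype=float)
    P = np.asarray(P, dtype=float)
    n = gamma.size
    # Rearranged system: (I - P^T) lambda = gamma
    return np.linalg.solve(np.eye(n) - P.T, gamma)


# Hypothetical 3-lane example: lane 0 feeds lane 1 (60%) and lane 2 (30%).
gamma = [100.0, 20.0, 0.0]  # external arrival rates, e.g. in veh/h
P = [
    [0.0, 0.6, 0.3],
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]
lam = solve_arrival_rates(gamma, P)
print(lam)
```

In this hypothetical example, lane 1 receives its own external demand plus 60% of the flow of lane 0, and lane 2 receives 30% of the flow of lane 0.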

2.4.2 Numerical analysis

We solve Problem (3.37)-(3.39) considering four different initial points. Each point is

drawn uniformly randomly from the feasible space (Equations (3.38)-(3.39)). The uniform

sampling is conducted using the code of Stafford (2006). The use of four different initial

points leads to four optimal solutions. In order to evaluate the performance of the various

signal plans (initial and optimal), we use a microscopic traffic simulation model of Lau-

sanne (Dumont and Bert; 2006), which is calibrated for the evening peak period demand

and implemented with the Aimsun simulator (TSS; 2014). Each signal plan is embedded

within the simulator and 50 simulation replications are run. We then compare the cumulative
distribution (obtained over these 50 replications) of the main network performance measures.
Each simulation replication consists of a 15 minute warm-up period, followed by a
30 minute (5:00-5:30pm) simulation period. For a given simulation replication, the objective
function (3.37) is estimated as the average (over all lanes) proportion of time a lane is
full.

Parameter            Value
T                    30 one-minute intervals
N                    902 lanes
δ                    0.1 sec
xLB                  4 sec
v                    50 km/h
w                    −15 km/h
ρ                    200 veh/km
s                    1800 veh/h
µ                    varies by signal plan
λ                    calculated from Equation (2.47)
ℓ, γ, pij, ei, bd    exogenous values obtained from Osorio (2010, Chap. 4)
kfwd                 kfwd = ⌈ℓ/(ρvδ)⌉
kbwd                 kbwd = ⌈ℓ/(ρ|w|δ)⌉

Table 2.4: Parameters for Lausanne case study

Each plot of Figure 2-9 considers one random initial point. Each plot displays two

cumulative distribution curves: one for the initial signal plan and one for the optimal plan

of Problem (3.37)-(3.39). Each curve is the cumulative distribution function (cdf) of the

average proportion of time a lane is full. More specifically, the x-axis displays the average

proportion of time a lane is full. For a given value of x, the y-axis displays the proportion of

simulation replications (out of 50) that have average proportion of time a lane is full smaller

than x. Therefore, the more a cdf curve is shifted to the left, the better the performance of

the corresponding signal plan. The solid curves correspond to the cdf of the initial signal

plans, the dashed curves represent that of the optimal signal plans of Problem (3.37)-(3.39).

As shown in plots 2-9(a)-2-9(d), the cdf curve of each optimal signal plan lies to the
left of that of the corresponding initial plan. In other words, the model yields solutions with a
lower average proportion of time a lane is full.

Figures 2-10 and 2-11 have the same structure as Figure 2-9. Figure 2-10 analyses
the performance of the signal plans in terms of the average lane queue-length (in vehicles).

This average is computed over time and over lanes. The x-axis displays the average lane

queue-length. For a given value of x, the y-axis displays the proportion of simulation

replications (out of 50) that have average lane queue-length smaller than x. As before, the

more these curves are shifted to the left, the better the performance of the corresponding

signal plans. The four plots of Figure 2-10 indicate that, for all initial points, the proposed
optimal signal plans yield lower average lane queue-length. Figure 2-11 analyses the

performance of the signal plans in terms of the average trip travel times (in minutes). The

x-axis displays the average trip travel time. For a given value of x, the y-axis displays the

proportion of simulation replications (out of 50) that have average trip travel times smaller

than x. For all initial points, the proposed optimal signal plans yield lower average trip

travel times.
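The comparison of cdf curves described in this section can be sketched as follows. This is a minimal illustration and not the code used to produce the figures: the function names and the small samples are hypothetical, and the empirical cdf uses the usual "less than or equal to" convention. For two samples of equal size, one cdf lies to the left of the other everywhere exactly when each order statistic of the first sample is no larger than the corresponding order statistic of the second.

```python
import numpy as np


def empirical_cdf(samples):
    """Return sorted values x and the empirical cdf F(x) at those values."""
    x = np.sort(np.asarray(samples, dtype=float))
    F = np.arange(1, x.size + 1) / x.size
    return x, F


def is_left_of(samples_a, samples_b):
    """True if the empirical cdf of a lies to the left of (or on) that of b.

    Assumes equal sample sizes (e.g., 50 replications per signal plan).
    """
    a = np.sort(np.asarray(samples_a, dtype=float))
    b = np.sort(np.asarray(samples_b, dtype=float))
    if a.size != b.size:
        raise ValueError("this sketch assumes equal sample sizes")
    return bool(np.all(a <= b))


# Hypothetical metric values (one per replication) for two signal plans.
initial_plan = [0.030, 0.040, 0.035]
optimal_plan = [0.010, 0.020, 0.015]

x, F = empirical_cdf(optimal_plan)
print(x, F)
print(is_left_of(optimal_plan, initial_plan))
```

Since all three metrics are to be minimized, a plan whose cdf lies to the left is preferred.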

2.4.3 Comparison to signal plans derived by commercial signal control software

In this section, we compare the performance of the optimal signal plans with that of a signal

plan obtained from a widely used commercial signal control software, Synchro (Trafficware; 2011).
For details on how the signal plan for the city of Lausanne is obtained from

Synchro, we refer the reader to Section 5.3 of Osorio and Chong (2015). Note that Synchro,

which is a signal control optimization software based on a deterministic macroscopic traffic

model, does not solve Problem (3.37)-(3.39).

Figures 2-12, 2-13 and 2-14 consider the same performance metrics as before: average

proportion of time a lane is full, average lane queue-length and average trip travel time.

Each figure displays 9 cdf curves. The four dashed (resp. solid thin) curves correspond

to the four initial (resp. optimal) points of the previous analysis. The solid thick curve

corresponds to the signal plan proposed by Synchro. Recall that for each figure, the more a

cdf curve is shifted to the left, the better the performance of the corresponding signal plan.

For all three figures, the four left-most curves are the four plans proposed by the mixture

model. In other words, for all three performance metrics, the proposed plans outperform

all initial plans as well as the Synchro plan. These figures also show that, for all three
metrics, the performance of the initial plans varies significantly, while the performance of
the proposed signal plans is very similar. This illustrates the robustness of the proposed
model to the quality of the initial points. For two metrics, average proportion of time a
lane is full and average lane queue-length, the Synchro plan outperforms 3 of the 4 initial
plans and performs similarly to the fourth plan. For the average trip travel time metric, the
Synchro plan outperforms all 4 initial points.

[Figure 2-9: Cumulative distribution functions of the average proportion of time a lane is full considering different initial signal plans. Panels (a)-(d): initial points 1-4. x-axis: average proportion of time a lane is full; y-axis: cumulative distribution function F(x). Curves: Initial, Optimal.]

[Figure 2-10: Cumulative distribution functions of the average lane queue-length considering different initial signal plans. Panels (a)-(d): initial points 1-4. x-axis: average lane queue-length (in vehicles); y-axis: cumulative distribution function F(x). Curves: Initial, Optimal.]

[Figure 2-11: Cumulative distribution functions of the average trip travel times considering different initial signal plans. Panels (a)-(d): initial points 1-4. x-axis: average trip travel time [min]; y-axis: cumulative distribution function F(x). Curves: Initial, Optimal.]

[Figure 2-12: Cumulative distribution functions of the average proportion of time a lane is full. Curves: Synchro signal plan, Initial signal plan, Proposed signal plan.]

[Figure 2-13: Cumulative distribution functions of the average lane queue-length. Curves: Synchro signal plan, Initial signal plan, Proposed signal plan.]

[Figure 2-14: Cumulative distribution functions of the average trip travel time. Curves: Synchro signal plan, Initial signal plan, Proposed signal plan.]

2.5 Conclusions

This chapter formulates an analytical stochastic link model that is both computationally

tractable and consistent with the kinetic theory of traffic flow. The model is validated

versus stochastic simulation results, using a simulator of the stochastic link transmission

model. Compared to the model of Osorio and Flötteröd (2015), the proposed model has a

complexity that is linear in the link space capacity, rather than cubic. This leads to signif-

icant gains in computational runtimes. Both models provide an accurate approximation of

the distribution of the link’s boundary conditions. The proposed model is used to address

a signal control problem for the city of Lausanne. It yields signal plans that systematically

outperform initial random plans for various performance metrics. The experiments illus-

trate the robustness of the model to the quality of the initial points. The proposed plans also

outperform a signal plan derived by a widely used commercial signal control software.

Ongoing work formulates scalable probabilistic network models. There are two main

challenges to be addressed. First, there is a need to formulate probabilistic and scalable

node models. The probabilistic model of Osorio et al. (2011) includes a two-link node

model that provides a higher-order description of the across-node dependencies. It yields

the joint distribution of the boundary conditions that each link adjacent to a node pro-

vides to the node, i.e., the joint distribution of the upstream link’s downstream boundary

conditions and the downstream link’s upstream boundary conditions. The extension of this

formulation to nodes with multiple upstream and downstream links is part of ongoing work.

Second, there is a need to formulate scalable network models. For a network with n links,

each with space capacity ℓ, directly coupling the proposed link model with the node model
of Osorio et al. (2011) would yield a model complexity in the order of O(ℓ^n). Such a model

is inappropriate for large-scale network analysis. Ongoing work investigates two research

directions. First, we study the use of network decomposition techniques. For instance,

combining the link and node models with the technique of Flötteröd and Osorio (2014)

would lead to a network model with complexity O(sℓ^r), where s is the number of intersections
and r is the maximum number of links adjacent to an intersection. Second, we study

the use of aggregation-disaggregation techniques that address the curse of dimensionality

by providing an aggregate description of network states (Osorio and Yamani; 2017; Osorio

and Wang; 2017).


Chapter 3

Analytical Probabilistic Link

Transmission Model With Constant

Complexity

This chapter presents an analytical probabilistic link transmission model with constant

complexity. It builds upon the model formulated in Chapter 2. It proposes a formulation
that only tracks two key state probabilities over time. Therefore, the state space of the
model is of dimension two and is independent of the link’s space capacity.
This contrasts with the model of Chapter 2, whose state space dimension increases linearly with

the link’s space capacity. The model is thus suitable for large-scale network optimization

with time budgets or real-time optimization problems.

3.1 Introduction

In the field of traffic flow modeling, there is a recent and increased interest in the formulation
of probabilistic models. This is facilitated and motivated by a number of factors, including
the increased availability of urban mobility data, advanced sensing technologies that
enable increased data granularity (i.e., resolution) such that more detailed models can be
calibrated and validated, and enhanced computing capabilities such that more elaborate models
can be evaluated. Additionally, transportation agencies in the US and in Europe have recognized
both the importance and the need to evaluate and to improve network robustness

and reliability metrics (U.S. Department of Transportation; 2008; Transport for London;

2010). This calls for a probabilistic description of network performance.

Calvert et al. (2012) discuss the advantages and disadvantages of both deterministic

and stochastic modeling approaches from both methodological and transportation practice

perspectives. They identify the lack of computational efficiency as one of the main chal-

lenges current stochastic models face. Indeed, compared to their deterministic counterparts,

stochastic models may suffer from the curse of dimensionality and are often computation-

ally inefficient for the analysis, let alone the optimization, of large-scale networks. The goal

of this chapter is to propose an analytical stochastic traffic theoretic model that addresses

these scalability and computational efficiency concerns.

Deterministic traffic model formulations and their solution methods have been extensively
studied, leading to seminal works such as Lighthill and Whitham (1955); Richards
(1956b); Daganzo (1994); Newell (2002). The formulation of their stochastic counterparts
is in the early stages. Detailed reviews of stochastic traffic flow models are provided by

Sumalee et al. (2011); Jabari (2012); Calvert et al. (2012); Laval and Chilukuri (2014) and

Chen et al. (2015). This chapter focuses on analytical (i.e., not simulation-based) formula-

tions. In this research area, recent work has proposed formulations based on the variational

theory of Daganzo (2005) (Deng et al.; 2013; Laval and Chilukuri; 2014; Laval and Cas-

trillón; 2015). The most popular approach to formulate a stochastic traffic model is to

add stochasticity to a specific deterministic traffic flow model. For instance, Boel and Mi-

haylova (2006) formulate a stochastic cell-transmission model (CTM) (Daganzo; 1994) by

adding Gaussian noise terms to the sending and receiving functions of the deterministic

CTM. However, for such approaches, the expected traffic dynamics are not guaranteed to

be consistent with their deterministic CTM counterparts. A detailed discussion of this, including
the existence and implications of negative sample paths, is given in Jabari and Liu

(2012). Rather than adding noise directly to the speed-density relationship, Jabari and Liu

(2012) consider stochastic vehicle headways and Jabari et al. (2014a) consider a stochastic

formulation of Newell’s simplified car-following model (Newell; 2002). Probabilistic as-

sumptions are made at the microscopic level, and macroscopic probabilistic speed-density


relationships are then derived. For analytical models that add Gaussian noise terms to a spe-

cific deterministic model, computational inefficiency can arise due to the need to sample

from high-dimensional Gaussian distributions. The work of Zheng et al. (2018) proposes

a stochastic formulation of the model of Newell (1961). Other approaches to stochastic

modeling include gas-kinetic (Boltzmann-like) models of traffic (Paveri-Fontana; 1975;

Hoogendoorn and Bovy; 2001), aggregated traffic modeling approach (e.g., (sub)region-

based models of Ramezani et al. (2015)) and uncertainty propagation approaches (Sayegh

et al.; 2017).

An alternative approach has been the use of probabilistic queueing theory. Most work

has considered a stationary analysis (Heidemann; 1991, 1994; Heidemann and Wegmann;

1997). Work that considers transient (i.e., non-stationary) analysis includes Olszewski

(1994); Heidemann (2001); van Zuylen and Viti (2003); Viti and Van Zuylen (2010). The majority
of the works on transient analysis consider queues with infinite capacity. In reality,
there is a physical upper bound on the number of vehicles a road can hold, which means that the
queueing capacity is finite. As a consequence, spillback is frequently
observed in congested urban networks. Formulations based on both transient queueing theory
and finite (space) capacity queueing network theory have also been proposed (Osorio

et al.; 2011; Osorio and Flötteröd; 2015; Lu and Osorio; 2018). Transient queueing models

can contribute to provide a probabilistic and analytical description of congestion build-up

and dissipation. However, the formulation of a probabilistic transient model with sufficient

computational efficiency to enable large-scale network analysis and optimization remains

a challenge (Viti and Van Zuylen; 2010).

This chapter tackles this challenge and extends this literature. The model of Osorio and

Flötteröd (2015) is a stochastic formulation of the deterministic link transmission model

of Yperman et al. (2007), which itself is an operational formulation of Newell’s simplified

theory of kinematic waves (Newell; 1993). The model considers a single link with space

capacity ℓ and represents the link as a set of three queues with finite (space) capacity. It

derives the joint transient probability distribution of the link’s upstream and downstream

boundary conditions. For a link with space capacity ℓ, the model complexity is in the order

of O(ℓ^3). The recent work of Lu and Osorio (2018) (cf. Chapter 2) extends the model

of Osorio and Flötteröd (2015) by making it more computationally efficient. Instead of

deriving the joint distribution of the link’s upstream and downstream boundary conditions,

Lu and Osorio (2018) (cf. Chapter 2) yield the marginal distribution of the link’s upstream

boundary conditions and the marginal distribution of the link’s downstream boundary con-

ditions. They provide a simplified description of the spatial and temporal dependencies

between the upstream and the downstream boundary conditions. The model complexity is

in the order of O(ℓ). This reduction in model complexity enhances the computational effi-

ciency of the model. In this chapter, we formulate a model with further enhanced compu-

tational efficiency. The goal is to enable large-scale network optimization to be performed

efficiently. We extend the model of Lu and Osorio (2018) (cf. Chapter 2) and propose a

formulation with constant complexity, i.e., the complexity no longer depends on the link’s

space capacity ℓ.

The chapter is organized as follows. Section 3.2 briefly reviews the past link model

formulations of Lu and Osorio (2018) (cf. Chapter 2) and Osorio and Flötteröd (2015).

In Section 3.3, we motivate and formulate the proposed model. The proposed model is

validated in Section 3.4. It is then used to address a city-wide signal control problem and

is benchmarked versus other methods (Section 3.5). Section 3.6 summarizes the chapter

and discusses ongoing work. The Appendices contain additional equation derivations and

numerical validation results.

3.2 Past link model formulations

An outline of the main ideas of the models of Lu and Osorio (2018) (hereafter referred to
as the mixture model proposed in Chapter 2) and of Osorio and Flötteröd (2015) (hereafter
referred to as the multivariate model) is presented in Section 2.2.1 of Chapter 2.

Lu and Osorio (2018) (cf. Chapter 2) note that the link’s upstream (resp. downstream)

boundary conditions are described by UQ (resp. DQ). The model is formulated as a

mixture of two independent univariate models: a univariate model of UQ and a univariate

model of DQ. The model tracks the full marginal distributions of UQ and of DQ, over

time. The dimension of the state space for the model is 2(ℓ+ 1), i.e., the model complexity


is in the order of O(ℓ). In other words, Lu and Osorio (2018) (cf. Chapter 2) enhance

the scalability of the multivariate model of Osorio and Flötteröd (2015) by formulating a

model with linear, rather than cubic, complexity in the link’s space capacity ℓ.
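To give a sense of scale, the following Python sketch tabulates the state space dimension of the three formulations for a given link space capacity. It is an illustration only: the mixture model dimension 2(ℓ + 1) and the dimension 2 of the formulation proposed in this chapter are taken from the text, whereas the value ℓ^3 used for the multivariate model is an order-of-magnitude placeholder (the text only states that its complexity is O(ℓ^3)). The function name and the example capacity are hypothetical.

```python
def state_space_dimension(ell):
    """State space dimension per link, as a function of space capacity ell."""
    return {
        # Order of magnitude only: the text states O(ell^3), not an exact count.
        "multivariate": ell ** 3,
        # Full marginal distributions of UQ and DQ: 2 * (ell + 1) states.
        "mixture": 2 * (ell + 1),
        # Only P(DQ = 0) and P(UQ = ell) are tracked.
        "proposed": 2,
    }


# Hypothetical link that can hold 20 vehicles.
dims = state_space_dimension(20)
for name, dim in dims.items():
    print(f"{name:>12}: {dim}")
```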

In this chapter, we propose a formulation with a state space of dimension 2. In other

words, the model complexity is now independent of the link’s space capacity ℓ. This leads

to enhanced scalability and improves the ability of these models to be used efficiently for

large-scale network optimization. This proposed formulation is simpler than past formu-

lations, yet as illustrated in Section 3.4, it still captures sufficient dependency between

the link’s upstream and downstream boundary conditions. Hereafter, we use the notation

DQ(k) (resp. LI(k), UQ(k), LO(k)) to denote the state of DQ (resp. LI, UQ, LO) at the

end of time interval k. The notations DQ and DQ(k) are used interchangeably.

3.3 Proposed link model formulation

The main idea underlying the proposed model is that in order to describe the link’s bound-

ary conditions, we do not need to track the full marginal distributions of UQ and of DQ

as in Lu and Osorio (2018) (cf. Chapter 2), let alone track their full joint distribution as in

Osorio and Flötteröd (2015). More specifically, we have identified 2 specific queue states

that are essential to describe these boundary conditions. The first state is DQ = 0, which

describes whether or not there is vehicular flow downstream ready to depart the link. The

second state is UQ = ℓ, which describes whether or not there is road space available at

the upstream end of the link. Intuitively, in a network setting with two links, vehicular

flow can be transmitted from the upstream link to the downstream link if the following two

conditions hold: (i) there is flow at the upstream link ready to depart to the downstream

link (i.e., for the upstream link DQ > 0), and (ii) there is space available at the upstream

end of the downstream link (i.e., for the downstream link UQ < ℓ). Thus, for a given link,

the proposed model approximates only 2 state probabilities: P(DQ = 0) and P(UQ = ℓ).

More formally, for a given time interval k, the expected link inflow is defined as:

qin(k) = λ(k)(1− P(UQ(k) = ℓ)). (3.1)


Equation (3.1) states that vehicles can enter the link as long as there is space available at the

upstream end of the link (i.e., UQ(k) < ℓ), which happens with probability P(UQ(k) <

ℓ) = 1− P(UQ(k) = ℓ). Similarly, the expected link outflow is defined as:

qout(k) = µ(k)(1− P(DQ(k) = 0)). (3.2)

Equation (3.2) states that there are vehicle departures from the link as long as there are

vehicles at the downstream end of the link that are ready for departure (i.e., DQ(k) > 0),

which happens with probability P(DQ(k) > 0) = 1− P(DQ(k) = 0).

The mixture model of Lu and Osorio (2018) (cf. Chapter 2) derives the marginal dis-

tributions of UQ(k) and of DQ(k) at every time step k. However, the only information

needed to compute the dynamics of the link’s boundary conditions are the two probabilities

P(UQ(k) = ℓ) and P(DQ(k) = 0).
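Equations (3.1) and (3.2) can be illustrated with the following Python sketch. It is a minimal example under stated assumptions: the function names and the numerical values of the rates and probabilities are hypothetical and chosen only to show how the two state probabilities enter the expected flows.

```python
def expected_inflow(arrival_rate, p_uq_full):
    """Eq. (3.1): inflow is the arrival rate times P(UQ(k) < ell)."""
    return arrival_rate * (1.0 - p_uq_full)


def expected_outflow(service_rate, p_dq_empty):
    """Eq. (3.2): outflow is the service rate times P(DQ(k) > 0)."""
    return service_rate * (1.0 - p_dq_empty)


# Hypothetical values for one time interval k (rates in veh/sec).
q_in = expected_inflow(arrival_rate=0.5, p_uq_full=0.2)
q_out = expected_outflow(service_rate=0.4, p_dq_empty=0.25)
print(q_in, q_out)
```

No other information about the distributions of UQ(k) and DQ(k) is needed to evaluate these two flows.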

In this chapter, we propose a model that only keeps track of these two key probabilities

over time (i.e., P(UQ(k) = ℓ) and P(DQ(k) = 0)). It improves model scalability by

reducing the dimension of the state space. The proposed model has a state space of dimen-

sion 2. In other words, its complexity is now constant and no longer depends on the space

capacity of the link. Thus, in a network setting, the proposed model linearly scales with the

number of links in the network, independently of link attributes such as link lengths. The

rest of this section is organized as follows. Section 3.3.1 formulates the model of the link’s

downstream boundary conditions P(DQ(k) = 0). Section 3.3.2 formulates the model of

the link’s upstream boundary conditions P(UQ(k) = ℓ). Section 3.3.3 summarizes the

algorithm for the proposed link model.

3.3.1 Downstream boundary conditions

This section formulates the probabilistic model of the link’s downstream boundary condi-

tion P(DQ(k) = 0). Recall from Section 3.2 that the service process of the link consists of

service of vehicles in DQ, and these service times are i.i.d exponential random variables.

Since the service process of the link is the same as the service process of DQ(k), we do not

need to approximate the service process of DQ(k). In other words, service times of DQ(k)


are independent and identically distributed exponential random variables with exogenous

rate µ(k). Thus, we only need to approximate the arrival process of DQ(k).

As described in Section 3.2, upon entering the link, all vehicles enter LI(k), and then

enter DQ(k). Thus, arrivals to DQ(k) consist of departures from LI(k). Hence, no

arrivals to DQ(k) should be rejected, lost or blocked due to a capacity limit of DQ(k).

This leads us to model DQ(k) as an infinite (space) capacity queue.

We approximate the downstream queue during time interval k, DQ(k), as an M/M/1

queue. The arrival process of DQ(k) is approximated as a Poisson process with endoge-

nous rate λDQ(k). To approximate λDQ(k), recall from Section 3.2 that the arrivals to DQ

correspond to flow that leaves the last cell of LI. In Figure 2-1, this cell is the kfwd-th cell,

which is denoted LLI. Thus, flow that enters DQ during time interval k corresponds to

flow that entered the link during time interval k − kfwd. Thus, we approximate the arrival

rate to DQ(k) as the expected flow to enter the link during time interval k− kfwd:

λDQ(k) = qin(k− kfwd). (3.3)

For an M/M/1 queueing system, an exact closed-form expression for the transient

queue-length distribution exists (e.g., Eq. (2.163) of Kleinrock (1975)), however the use of

such an expression requires keeping track of the entire queue-length distribution at every

time step. Since our aim is to track a single state probability, P(DQ(k) = 0), rather than

the full distribution, we do not use the closed-form expression.

The transient behavior of a system has also been studied with relaxation processes,

which describe the return of a disturbed system to its equilibrium state. Prigogine and

Andrews (1960) is among the pioneering works that introduced gas-kinetic models for the

analysis of vehicular traffic flow. Extensions of their work include Paveri-Fontana (1975);

Nelson (1995); Helbing (1997). The notion of relaxation process and relaxation time is

also used in Gazis et al. (1961); Payne (1971); Ross (1988).

In the queueing theory literature, relaxation times have been well studied.
Our study differs from past literature in the following ways. We focus on the approximation
of the relaxation time (or equivalently its inverse) of a single state probability (i.e.,

P(DQ(k) = 0) of Eq. (3.4)), while most papers have studied the relaxation time of the

expected queue-length for infinite (space) capacity single-server queueing systems, such as

in Newell (1982, Chap. 5) and in Odoni and Roth (1983). Our proposed approach allows

for an arbitrary initial state for the queueing system, while most theoretical relaxation time

studies focus on initially empty systems (e.g., Newell (1982, Chapter 5)). There are studies,

based on numerical stochastic simulations, that have considered arbitrary deterministic ini-

tial states, such as in Odoni and Roth (1983). Moreover, most studies consider an isolated

queueing system, whereas our work considers several queueing systems that have coupled

dynamics. For instance, the dynamics of DQ(k) and of UQ(k) are coupled due to the

dependencies between the link’s downstream and upstream boundary conditions.

We introduce the following notation:

P(DQ(k) = 0) probability of DQ(k) being empty at the end of time interval k (which is also

the beginning of time interval k+ 1);

P(DQk = 0) time-interval specific stationary probability of DQ = 0;

τDQ(k) inverse of the relaxation time during time interval k.

We propose the following formulation:

$$P(\mathrm{DQ}(k) = 0) = P(\mathrm{DQ}^k = 0) + \left[ P(\mathrm{DQ}(k-1) = 0) - P(\mathrm{DQ}^k = 0) \right] e^{-\tau_{DQ}(k)\,\delta}. \tag{3.4}$$

Equation (3.4) states that the transient probability P(DQ(k) = 0) at the end of time interval

k is approximated as the sum of a stationary probability (term P(DQk = 0)) and a term

that decays exponentially with time. The latter term is the difference between the initial

condition of time interval k (term P(DQ(k − 1) = 0)) and the corresponding stationary

probability P(DQk = 0). The functional form of Equation (3.4) is inspired by both the

exact closed-form expression of the transient queue-length distribution of an M/M/1/ℓ of

Morse (1958, Chap. 6, Equation (6.13)) as well as by the approximate expression of the

transient spillback probability (also known as the blocking probability) of an M/M/1/ℓ of

Chong and Osorio (2017, Equation (14a)).
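The update of Equation (3.4) over one time interval can be sketched in Python as follows. This is a minimal illustration: the function name, argument names and numerical values are hypothetical, and the stationary probability and the inverse relaxation time are taken as given inputs here.

```python
import math


def update_p_dq_empty(p_prev, p_stat, tau, delta):
    """Eq. (3.4): exponential relaxation of P(DQ(k) = 0).

    p_prev: P(DQ(k-1) = 0), the initial condition of interval k
    p_stat: interval-specific stationary probability of DQ = 0
    tau:    inverse of the relaxation time during interval k
    delta:  length of the time interval
    """
    return p_stat + (p_prev - p_stat) * math.exp(-tau * delta)


# Hypothetical values: start at 0.9 and relax towards 0.4.
p_next = update_p_dq_empty(p_prev=0.9, p_stat=0.4, tau=2.0, delta=0.1)
print(p_next)
```

With tau equal to zero the probability stays at its initial value, and as tau grows it approaches the stationary value, which matches the interpretation of τDQ(k) as a speed of convergence.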

Equation (3.4) contains two endogenous terms, P(DQk = 0) and τDQ(k). We now

present their formulations, starting with that of P(DQk = 0). We define the traffic intensity
of DQ(k) as:

$$\rho_{DQ}(k) = \frac{\lambda_{DQ}(k)}{\mu(k)}. \tag{3.5}$$

We use the following expression to approximate P(DQk = 0):

$$P(\mathrm{DQ}^k = 0) = \begin{cases} 1 - \rho_{DQ}(k) & \text{if } \rho_{DQ}(k) < 1 \qquad (3.6a) \\ 0 & \text{otherwise} \qquad (3.6b) \end{cases}$$

Equation (3.6a) is obtained from the closed-form expression for the stationary queue-

length distribution of an M/M/1 queue (e.g., Gross (2008, Chap. 2, Equation (2.9))).

However, this expression only holds for ρDQ(k) < 1. In our transient setting, the traffic

intensity ρDQ(k) may temporarily exceed 1. Equation (3.6b) allows for this and is defined

such as to ensure continuity of P(DQk = 0) at ρDQ(k) = 1.
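Equations (3.5) and (3.6) can be sketched in Python as follows. This is an illustration under stated assumptions: the function and argument names are hypothetical, and the handling of a zero service rate (treated as an unbounded traffic intensity, so that branch (3.6b) applies) is a choice made for this sketch.

```python
def stationary_p_dq_empty(arrival_rate_dq, service_rate):
    """Eq. (3.6): interval-specific stationary probability of DQ = 0."""
    if service_rate == 0.0:
        # Assumption of this sketch: traffic intensity is unbounded,
        # so the "otherwise" branch (3.6b) applies.
        return 0.0
    rho = arrival_rate_dq / service_rate  # Eq. (3.5)
    if rho < 1.0:
        return 1.0 - rho                  # Eq. (3.6a)
    return 0.0                            # Eq. (3.6b)


# Hypothetical rates (veh/sec): undersaturated, critical and oversaturated.
for lam_dq in (0.3, 1.0, 2.0):
    print(lam_dq, stationary_p_dq_empty(lam_dq, service_rate=1.0))
```

The two branches agree at a traffic intensity of 1, where both give a probability of 0.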

We now present the approximation for τDQ(k). In queueing theory, 1/τDQ(k) is known

as the relaxation time, which measures the time a given performance metric needs to reach

its stationary value. Thus, τDQ(k) measures the speed at which the given performance

metric approaches its stationary value (i.e., a larger τDQ(k) corresponds to a higher speed

of convergence to stationary values). We approximate τDQ(k) as follows:

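\begin{align}
\tau_{DQ}(k) &= \tau_{DQ,1} + \tau_{DQ,2} \tag{3.7a} \\
\tau_{DQ,1} &= \left[ 1 - \alpha_1 e^{-\rho_{DQ}(k)} \right] \times \left[ \frac{\mu(k) \left( 1 - \rho_{DQ}(k) \right)^2}{1 + \rho_{DQ}(k)} \right] \tag{3.7b} \\
\tau_{DQ,2} &= \alpha_2\, \mu(k) \left| \frac{P(\mathrm{DQ}(k-1) = 0) - P(\mathrm{DQ}^k = 0)}{\ell \left( 1 - P(\mathrm{DQ}^k = 0) \right)} \right|^{1/5} \tag{3.7c}
\end{align}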

The terms α1 and α2 in Equation (3.7b) and (3.7c), respectively, are defined as follows:

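$$\alpha_1 = \begin{cases} \alpha_{1,1} & \text{if } P(\mathrm{DQ}(k-1) = 0) > P(\mathrm{DQ}^k = 0) \qquad (3.8a) \\ \alpha_{1,2} & \text{otherwise} \qquad (3.8b) \end{cases}$$

$$\alpha_2 = \begin{cases} \alpha_{2,1} & \text{if } P(\mathrm{DQ}(k-1) = 0) > P(\mathrm{DQ}^k = 0) \qquad (3.9a) \\ \alpha_{2,2} & \text{otherwise} \qquad (3.9b) \end{cases}$$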

where α1,1, α1,2, α2,1 and α2,2 are exogenous scalar coefficients. Equation (3.7a) approximates
τDQ(k) as the sum of two relaxation terms, τDQ,1 and τDQ,2. We now describe how this
formulation is derived. In a nutshell,
the first term τDQ,1, defined by Equation (3.7b), is formulated based on the relaxation time
study of Newell (1982, Chap. 5, Equation (5.6)), while the second term τDQ,2, defined by

Equation (3.7c), is formulated based on insights from numerical simulation experiments.
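The computation of τDQ(k) from Equations (3.7)-(3.9) can be sketched in Python as follows. This is a hedged illustration rather than a reference implementation. The function name and argument names are hypothetical, the values of the α coefficients are exogenous and not given here (the ones in the example are placeholders), the selection of α2 is assumed to mirror that of α1, and the case µ(k) = 0 follows the definition τDQ(k) = λDQ(k) used in this chapter for a red traffic light. The sketch raises an error when the stationary probability equals 1, because Equation (3.7c) then divides by zero and no limit is assumed.

```python
import math


def tau_dq(arrival_rate_dq, service_rate, p_prev, p_stat, ell, alpha1, alpha2):
    """Inverse relaxation time of Eq. (3.7a) for one time interval.

    alpha1 and alpha2 are pairs (alpha_{.,1}, alpha_{.,2}) of exogenous
    coefficients; the caller must supply their values.
    """
    if service_rate == 0.0:
        # Red-light case: DQ is a pure arrival process.
        return arrival_rate_dq
    if p_stat >= 1.0:
        raise ValueError("Eq. (3.7c) is undefined when p_stat equals 1")

    above_stationary = p_prev > p_stat
    a1 = alpha1[0] if above_stationary else alpha1[1]  # Eq. (3.8)
    a2 = alpha2[0] if above_stationary else alpha2[1]  # Eq. (3.9), assumed analogous

    rho = arrival_rate_dq / service_rate               # Eq. (3.5)
    tau_1 = (1.0 - a1 * math.exp(-rho)) * (
        service_rate * (1.0 - rho) ** 2 / (1.0 + rho)
    )                                                  # Eq. (3.7b)
    tau_2 = a2 * service_rate * abs(
        (p_prev - p_stat) / (ell * (1.0 - p_stat))
    ) ** 0.2                                           # Eq. (3.7c)
    return tau_1 + tau_2


# Placeholder coefficients and rates, for illustration only.
tau = tau_dq(0.5, 1.0, p_prev=1.0, p_stat=0.5, ell=2,
             alpha1=(0.5, 0.5), alpha2=(0.1, 0.1))
print(tau)
```

Setting both α pairs to zero recovers the inverse relaxation time of the right-side bracket of Equation (3.7b) alone.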

More specifically, Equation (3.7b) defines τDQ,1 as the product of 2 terms within brack-

ets. The expression within the right-side bracket corresponds to the inverse of the relaxation

time for the expected queue-length of an M/M/1 system of Newell (1982, Chap. 5, Equa-

tion (5.6)):

$$\tau = \frac{(1 - \rho)^2\, \mu}{C_S^2 + C_A^2\, \rho} \tag{3.10}$$

where CA (resp. CS) is the coefficient of variation of the inter-arrival (resp. service) times.

For an M/M/− system, we have CA = CS = 1. The expression within the left-side

bracket is an adjustment term that is based on the following observations.

• The adjustment term should be unit-free. This is based on the observation of Odoni

and Roth (1983) that the relaxation time be scaled in time so that it varies directly

with the units of the arrival or service rates. In other words, two identical queueing

systems measured in different time units should yield the same value of τDQ(k)δ (of

Eq. (3.4) or equivalently Eq. (3.7a)). Thus, τDQ,1 should vary directly with the units

of the arrival or service rates. Since the right-side bracket term varies directly with

the service rate µ, the left-side bracket term should be unit-free.

• We want to be able to model cases where downstream departures are not allowed, which can arise due to the presence of a red traffic light. In this case the service rate reaches 0 (i.e., µ(k) = 0). To allow for this, we define τDQ(k) = λDQ(k) when µ(k) = 0. This is because when µ(k) = 0, DQ(k) becomes a pure arrival process with rate λDQ(k). The only possibility that DQ is empty at the end of time interval k (i.e., DQ(k) = 0) is that DQ is empty at the beginning of time interval k (i.e., DQ(k − 1) = 0) and that there are no arrivals during time interval k of length δ (denoted NDQ(k) = 0). This leads to

$$\begin{aligned}
P(DQ(k)=0) &= P(DQ(k-1)=0)\,P(N_{DQ}(k)=0) && \text{(3.11)}\\
&= P(DQ(k-1)=0)\,e^{-\lambda_{DQ}(k)\delta} && \text{(3.12)}\\
&= P(DQ^k=0) + \left[P(DQ(k-1)=0) - P(DQ^k=0)\right] e^{-\lambda_{DQ}(k)\delta}. && \text{(3.13)}
\end{aligned}$$

Since arrivals to DQ during time interval k follow a Poisson process with rate λDQ(k), the number of arrivals in a time interval of length δ (denoted NDQ(k)) follows a Poisson distribution with parameter λDQ(k)δ, and P(NDQ(k) = 0) = e^{−λDQ(k)δ}. Equation (3.12) is hence obtained from Equation (3.11). By definition (i.e., Eq. (3.5)), ρDQ → ∞ and hence P(DQk = 0) = 0 by Equation (3.6). Equation (3.13) is obtained from Equation (3.12) by adding and subtracting the term P(DQk = 0), which equals zero. Note that Equation (3.13) is of the exact form of Equation (3.4) with τDQ(k) = λDQ(k). Hence, when µ(k) = 0, τDQ(k) should exist and be equal to λDQ(k).

Next, we show that lim_{µ(k)→0} τDQ(k) = λDQ(k). In other words, τDQ(k) is well-defined and continuous in the domain µ(k) ∈ [0,+∞). As µ(k) → 0, τDQ,1, defined in Eq. (3.7b), becomes:

$$\lim_{\mu(k)\to 0} \tau_{DQ,1} = \lambda_{DQ}(k). \tag{3.14}$$

Moreover, as µ(k) → 0, τDQ,2 (Eq. (3.7c)) becomes zero, and hence τDQ(k), which is the sum of the two terms, becomes:

$$\lim_{\mu(k)\to 0} \tau_{DQ}(k) = \lambda_{DQ}(k). \tag{3.15}$$

The calculations of the limits are given in Appendix B.2.
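As a brief sketch of why the limit (3.14) holds (the full calculations are those of Appendix B.2), assume that Eq. (3.5) defines ρDQ(k) = λDQ(k)/µ(k) with λDQ(k) > 0 held fixed. This definition is not restated in this section and is an assumption of the sketch. The time index k is dropped for readability.

```latex
\begin{align*}
\mu\,\frac{(1-\rho_{DQ})^2}{1+\rho_{DQ}}
  &= \mu\,\frac{(1-\lambda_{DQ}/\mu)^2}{1+\lambda_{DQ}/\mu}
   = \frac{(\mu-\lambda_{DQ})^2}{\mu+\lambda_{DQ}}
   \;\xrightarrow[\mu\to 0]{}\; \frac{\lambda_{DQ}^2}{\lambda_{DQ}} = \lambda_{DQ}, \\
1-\alpha_1 e^{-\lambda_{DQ}/\mu}
  &\;\xrightarrow[\mu\to 0]{}\; 1 .
\end{align*}
```

The product of the two factors therefore tends to λDQ, which is Equation (3.14). For τDQ,2, the factor µ multiplies an absolute-value term that stays bounded as P(DQk = 0) → 0, so the product vanishes and Equation (3.15) follows.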

Equation (3.7c) defines τDQ,2. This formulation is based on insights obtained from numerical studies using a simulation-based implementation of the stochastic link transmission model. It corresponds to the benchmark simulator used in Osorio and Flötteröd (2015). This simulator samples individual vehicles, and imposes the forward and backward lags explicitly for each vehicle. A total of 126 simulation experiments are carried out. Each experiment starts with an initial empty state and runs for 300 time units; it has one traffic intensity value for the first 150 time units and another value for the remaining 150 time units.

We use the arrow notation 0.5 → 0.25 to denote an experiment with a traffic intensity that

changes from 0.5 to 0.25. The experiments consider all combinations of traffic intensity

λ/µ ∈ {0.5 → 0.25, 0.75 → 0.25, 1.25 → 0.25, 0.75 → 0.5, 1.25 → 0.5, 1.25 → 0.75},

service rate µ ∈ {0.2, 0.4, 0.6}, and space capacity ℓ ∈ {10, 20, 30, 40, 60, 80, 100}.
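The experimental design above can be enumerated programmatically. The following is a minimal sketch, not the code used in this thesis: the dictionary keys and the function name `build_experiments` are hypothetical, and we assume that the service rate µ is held fixed within an experiment so that the arrival rate in each half is the traffic intensity times µ.

```python
from itertools import product

# Values copied from the text above.
INTENSITY_CHANGES = [(0.5, 0.25), (0.75, 0.25), (1.25, 0.25),
                     (0.75, 0.5), (1.25, 0.5), (1.25, 0.75)]
SERVICE_RATES = [0.2, 0.4, 0.6]
SPACE_CAPACITIES = [10, 20, 30, 40, 60, 80, 100]


def build_experiments():
    """Return one dict per experiment (all combinations of the three sets)."""
    experiments = []
    for (rho_first, rho_second), mu, ell in product(
            INTENSITY_CHANGES, SERVICE_RATES, SPACE_CAPACITIES):
        experiments.append({
            "mu": mu,
            "ell": ell,
            # Assumption: lambda = (lambda / mu) * mu with mu constant.
            "lambda_first": rho_first * mu,
            "lambda_second": rho_second * mu,
            "switch_time": 150,   # intensity changes after 150 time units
            "horizon": 300,       # each run lasts 300 time units
        })
    return experiments


experiments = build_experiments()
print(len(experiments))  # 6 * 3 * 7 combinations
```

The count of 6 × 3 × 7 combinations matches the 126 experiments stated in the text.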

Equation (3.7c) defines τDQ,2 based on the following observations from these simula-

tion experiments.

• We want the relaxation time (or equivalently its inverse) to depend on the space

capacity ℓ. This is because we observe that the inverse of the relaxation time of

P(DQ(k) = 0) is inversely related to ℓ. In other words, as ℓ increases, so does the

time needed to reach stationarity. The parameter ℓ is in the denominator of Equation (3.7c). Thus, as ℓ increases, τDQ,2 decreases, which leads to a longer time to reach stationarity.

• We want the relaxation time to depend on the distance between the initial state (i.e.,

initial probability) and the steady state. In our simulation experiments, we observe a

positive correlation between the inverse of the relaxation time and the absolute differ-

ence between the initial state and the corresponding steady state probability. Odoni

and Roth (1983) have also observed that for an isolated M/M/1 queueing system

and for arbitrary deterministic initial states, the relaxation time of the expected queue

length may depend on the distances between the initial state and steady state. The

term $\left|\frac{P(DQ(k-1)=0)-P(DQ^k=0)}{1-P(DQ^k=0)}\right|$ of Equation (3.7c) represents the normalized absolute

difference between the initial state and the steady state. Thus, when this increases,

so does τDQ,2, and this leads to a higher speed (i.e., lower time) to reach stationarity.

• We want the relaxation time to depend on whether congestion is propagating (i.e., aggravating, building up) or dissipating. The difference in the speed at which traffic propagates or dissipates has been observed experimentally (e.g., Kerner and Rehborn (1996)) and numerically (e.g., Orosz et al. (2009)). Similarly, we have observed in the simulation experiments that it takes a shorter time to reach stationarity

when congestion is building up compared to when it is dissipating. The term α1 of

τDQ,1 (i.e., Eq. (3.7b)) and α2 of τDQ,2 (i.e., Eq. (3.7c)) are introduced to account

for this. They are defined by Equations (3.8) and (3.9) as functions of the initial

state P(DQ(k − 1) = 0) and the corresponding steady state P(DQk = 0). More

specifically, Equations (3.8a) and (3.9a) consider the case of congestion building up,

while Equations (3.8b) and (3.9b) consider the case of congestion dissipation.

A description of how the exogenous scalar parameters α1,1, α1,2, α2,1 and α2,2 are fitted is given in Appendix A.2. In the experiments presented in this chapter, the fitted values are α1,1 = α1,2 = 0.4, α2,1 = 0.6 and α2,2 = 0. Note that α2,1 > α2,2, which means that, all else being equal, τDQ(k) is larger under congestion build-up conditions than under congestion dissipation conditions. In other words, it takes longer to reach stationarity when congestion is dissipating than when it is building up.
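To make the definition of τDQ(k) concrete, the following minimal sketch evaluates Equations (3.7) to (3.9) with the fitted coefficients reported above. It is an illustration rather than the implementation used in this thesis. The function name and argument names are hypothetical, the stationary probability P(DQk = 0) of Equation (3.6) is taken as an input because that equation is not restated here, and ρDQ(k) = λDQ(k)/µ(k) is assumed for Equation (3.5).

```python
import math

# Fitted coefficients reported in the text.
ALPHA_1 = {"build_up": 0.4, "dissipation": 0.4}   # alpha_{1,1}, alpha_{1,2}
ALPHA_2 = {"build_up": 0.6, "dissipation": 0.0}   # alpha_{2,1}, alpha_{2,2}


def tau_dq(mu, lam_dq, p_prev_empty, p_stat_empty, ell):
    """Inverse relaxation time of DQ, Eq. (3.7a) = Eq. (3.7b) + Eq. (3.7c).

    p_prev_empty is P(DQ(k-1) = 0); p_stat_empty is the stationary
    probability P(DQ^k = 0), supplied by the caller.
    """
    if mu == 0.0:
        return lam_dq  # pure arrival process, see Eqs. (3.11)-(3.13)
    if p_stat_empty >= 1.0:
        raise ValueError("Eq. (3.7c) needs P(DQ^k = 0) < 1")
    # Eqs. (3.8)-(3.9): the empty-queue probability decreases during build-up.
    regime = "build_up" if p_prev_empty > p_stat_empty else "dissipation"
    rho = lam_dq / mu  # assumed form of Eq. (3.5)
    tau_1 = ((1.0 - ALPHA_1[regime] * math.exp(-rho))
             * mu * (1.0 - rho) ** 2 / (1.0 + rho))          # Eq. (3.7b)
    gap = abs((p_prev_empty - p_stat_empty)
              / (ell * (1.0 - p_stat_empty)))
    tau_2 = ALPHA_2[regime] * mu * gap ** 0.2                # Eq. (3.7c)
    return tau_1 + tau_2


# Illustrative values only: same distance to stationarity, opposite directions.
tau_build = tau_dq(0.4, 0.2, p_prev_empty=0.8, p_stat_empty=0.5, ell=10)
tau_dissipate = tau_dq(0.4, 0.2, p_prev_empty=0.2, p_stat_empty=0.5, ell=10)
```

With these inputs `tau_build` exceeds `tau_dissipate`, because α2,2 = 0 removes the second term when congestion dissipates.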

3.3.2 Upstream boundary conditions

This section formulates the probabilistic model of the link’s upstream boundary conditions

P(UQ(k) = ℓ). In queueing theory P(UQ(k) = ℓ) is known as the blocking probability of

UQ(k). In traffic flow theory it represents the spillback probability of the link.

Recall from Section 3.2 that the arrival process to the link is assumed to be a Poisson

process with exogenous rate λ(k). Since the arrival process to the link is the same as

the arrival process to UQ(k), the arrival process to UQ(k) is also a Poisson process with

exogenous rate λ(k). Thus, we only need to approximate the service process of UQ(k).

Recall from Section 3.2 that flow that enters UQ sequentially undergoes the following

three phases of service: (i) it is delayed kfwd time intervals (this delay is represented in

Figure 2-1 by LI); (ii) it enters DQ, where it experiences a sojourn (waiting and service)

time; (iii) vehicular flow that leaves the link (i.e., leaves DQ) generates newly available

road space, which becomes available at the upstream end of the link after a delay of kbwd

time intervals (this delay is represented in Figure 2-1 by LO). Once this space becomes

available upstream, the corresponding flow leaves UQ.

Flow departures from UQ correspond to flow departures from the most downstream

cell of LO (which is denoted LLO in Figure 2-1). Let qLLO(k) denote the expected outflow

from LLO during time interval k. It corresponds to vehicular flow that left the link during

time interval k− kbwd, i.e.,:

qLLO(k) = qout(k− kbwd). (3.16)

To approximate P(UQ(k) = ℓ) we consider two cases depending on whether or not

qLLO(k) = 0. Note that at time interval k, qLLO(k) is known since it is defined by expected link outflows from past time intervals (see Eq. (3.16)).

Case qLLO(k) = 0

If qLLO(k) = 0, then the expected outflow from UQ(k) is also zero. This implies that, with

probability 1, there are no departures from UQ(k) (in other words, positive outflow from

UQ(k) occurs with a probability of zero). Thus, UQ(k) is a pure arrival process.

Let N(k) denote the number of attempted new arrivals during time interval k whether

or not they successfully enter UQ. Thus, the number of arrivals that successfully entered

UQ during time interval k is the minimum of N(k) and the available space left, i.e., ℓ −

UQ(k−1). Thus, the number of vehicles in UQ at the end of time interval k (i.e., UQ(k)) is the sum of the number of vehicles in UQ at the beginning of time interval k (i.e., UQ(k−1)) and the number of vehicles that successfully entered UQ:

UQ(k) = UQ(k− 1) + min{N(k), ℓ−UQ(k− 1)}. (3.17)

Therefore, P(UQ(k) = ℓ) can be obtained as follows:

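$$\begin{aligned}
P(UQ(k)=\ell) &= P\big(UQ(k-1) + \min\{N(k),\, \ell - UQ(k-1)\} = \ell\big) && \text{(3.18)}\\
&= \sum_{i=0}^{\ell} P\big(\min\{N(k),\, \ell-i\} = \ell-i \mid UQ(k-1)=i\big)\,P(UQ(k-1)=i) && \text{(3.19)}\\
&= \sum_{i=0}^{\ell} P\big(N(k) \ge \ell-i \mid UQ(k-1)=i\big)\,P(UQ(k-1)=i) && \text{(3.20)}\\
&= \sum_{i=0}^{\ell} P\big(N(k) \ge \ell-i\big)\,P(UQ(k-1)=i) && \text{(3.21)}
\end{aligned}$$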

Equation (3.18) gives P(UQ(k) = ℓ) by substituting in Equation (3.17). Equation (3.19)

is obtained by conditioning on the states of UQ at the beginning of time interval k (i.e.,

UQ(k− 1)). In the conditional probability of Equation (3.19), the equality min{N(k), ℓ−

i} = ℓ − i holds if and only if N(k) ≥ ℓ − i. Thus, Equation (3.20) is obtained. Since

the process of attempted arrivals does not have any dependence on the initial state of the

system, P(N(k) ≥ ℓ − i|UQ(k − 1) = i) = P(N(k) ≥ ℓ − i) and Equation (3.21) is

obtained.

Since the arrival process to the link, which is also the arrival process to UQ(k), is a

Poisson process with rate λ(k), then N(k), the number of attempted arrivals during time

interval k, follows a Poisson distribution with parameter λ(k)δ. Thus, P(N(k) ≥ ℓ − i) is

calculated as follows:

$$P(N(k) \ge \ell-i) = 1 - P(N(k) \le \ell-i-1) = 1 - e^{-\lambda(k)\delta} \sum_{j=0}^{\ell-i-1} \frac{(\lambda(k)\delta)^j}{j!}. \tag{3.22}$$

Equation (3.21) depends on the full marginal distribution of UQ (i.e., it depends on all terms P(UQ(k − 1) = i), ∀i ∈ {0, . . . , ℓ}). However, the proposed model does not track the full distribution of UQ; it only tracks the scalar probability P(UQ(k − 1) = ℓ). Thus, we propose the following approximation for P(UQ(k − 1) = i), 0 ≤ i ≤ ℓ − 1:

$$P(UQ(k-1)=i) = \frac{1 - P(UQ(k-1)=\ell)}{\sum_{j=0}^{\ell-1} f(j,\, q_{UQ}(k-1)\delta)}\; f(i,\, q_{UQ}(k-1)\delta) \tag{3.23a}$$

$$f(i,\, q_{UQ}(k-1)\delta) = \frac{(q_{UQ}(k-1)\delta)^i\, e^{-q_{UQ}(k-1)\delta}}{i!} \tag{3.23b}$$

$$q_{UQ}(k-1) = \sum_{r=0}^{k-2} q_{in}(r) - \sum_{r=0}^{k-k_{bwd}-2} q_{out}(r). \tag{3.23c}$$

Equation (3.23b) gives the probability mass function (pmf) of a Poisson distribution with

parameter qUQ(k− 1)δ. Equation (3.23a) is a normalized and finite support ({0, ..., ℓ− 1})

Poisson distribution (with parameter qUQ(k − 1)δ). The normalization term (the fraction

term) is defined such that:

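$$\begin{aligned}
\sum_{i=0}^{\ell} P(UQ(k-1)=i) &= P(UQ(k-1)=\ell) + \sum_{i=0}^{\ell-1} \frac{1 - P(UQ(k-1)=\ell)}{\sum_{j=0}^{\ell-1} f(j,\, q_{UQ}(k-1)\delta)}\; f(i,\, q_{UQ}(k-1)\delta) && \text{(3.24)}\\
&= P(UQ(k-1)=\ell) + \frac{1 - P(UQ(k-1)=\ell)}{\sum_{j=0}^{\ell-1} f(j,\, q_{UQ}(k-1)\delta)} \sum_{i=0}^{\ell-1} f(i,\, q_{UQ}(k-1)\delta) && \text{(3.25)}\\
&= P(UQ(k-1)=\ell) + 1 - P(UQ(k-1)=\ell) = 1 && \text{(3.26)}
\end{aligned}$$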

Equation (3.23c) defines the expected flow in UQ(k − 1) as the difference between: (i)

aggregated (over time) flow that has entered the link up until the end of time interval k− 2

(first summation) and (ii) aggregated (over time) vehicular flow that has left the link up

until the end of time interval k − kbwd − 2 (second summation). The second summation

accounts for the kinematic backward wave delay.
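The computation for the case qLLO(k) = 0 can be sketched as follows. This is a minimal illustration of Equations (3.21) to (3.23b) under stated assumptions, not the code used in this thesis: the function names are hypothetical, qUQ(k − 1) of Equation (3.23c) is passed in as a number, and the direct factorial-based Poisson terms are only suitable for moderate ℓ (such as the values up to 100 considered in this chapter).

```python
import math


def poisson_pmf(i, mean):
    """Poisson probability mass function, as in Eq. (3.23b)."""
    return mean ** i * math.exp(-mean) / math.factorial(i)


def prob_at_least(n, mean):
    """P(N >= n) for N ~ Poisson(mean), Eq. (3.22)."""
    if n <= 0:
        return 1.0
    return 1.0 - sum(poisson_pmf(j, mean) for j in range(n))


def approx_uq_distribution(p_prev_full, q_uq_prev, delta, ell):
    """Approximate P(UQ(k-1) = i) for i = 0..ell.

    Eq. (3.23a) rescales a Poisson pmf on {0, ..., ell-1} so that it
    carries mass 1 - P(UQ(k-1) = ell); the tracked probability sits at ell.
    """
    mean = q_uq_prev * delta
    weights = [poisson_pmf(i, mean) for i in range(ell)]
    total = sum(weights)
    dist = [(1.0 - p_prev_full) * w / total for w in weights]
    dist.append(p_prev_full)
    return dist


def spillback_prob_no_outflow(p_prev_full, q_uq_prev, lam, delta, ell):
    """P(UQ(k) = ell) when q_LLO(k) = 0, Eq. (3.21)."""
    dist = approx_uq_distribution(p_prev_full, q_uq_prev, delta, ell)
    return sum(prob_at_least(ell - i, lam * delta) * dist[i]
               for i in range(ell + 1))


# Illustrative values only.
p_next = spillback_prob_no_outflow(p_prev_full=0.2, q_uq_prev=3.0,
                                   lam=0.5, delta=1.0, ell=10)
```

Since no vehicle leaves UQ in this case, the resulting spillback probability cannot fall below its previous value, which gives a simple sanity check on any implementation.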

Case qLLO(k) > 0

When qLLO(k) > 0, we account for all three service processes that flow within UQ goes

through, which were mentioned at the start of Section 3.3.2. This leads us to approximate

UQ(k) as an M/G/ℓ/ℓ system. Let us detail this. Denote SDQ(k) the sojourn time (waiting

plus service time) of DQ(k). First, since the service time of UQ(k) is the sum of that of

these three processes, we assume it to be generally distributed. The expected service time,

E[SUQ(k)], is given by:

E[SUQ(k)] = kfwdδ+ E[SDQ(k)] + kbwdδ (3.27)

where E[SDQ(k)] is the expected sojourn time of the DQ(k) system. Second, we approxi-

mate UQ(k) as a multi-server, rather than a single-server, queueing system. This is because

the flow in LI and in LO is served (or processed) simultaneously, rather than sequentially.

We introduce the following notation.

P(UQ(k) = ℓ) probability of UQ(k) being full at the end of time interval k (which is also

the beginning of time interval k+ 1);

P(UQk = ℓ) time-interval specific stationary probability of UQ = ℓ;

τUQ(k) inverse of the relaxation time during time interval k.

If qLLO(k) > 0, then we use the same functional form as for DQ(k) (Eq. (3.4)) to approx-

imate P(UQ(k) = ℓ):

$$P(UQ(k)=\ell) = P(UQ^k=\ell) + \left[P(UQ(k-1)=\ell) - P(UQ^k=\ell)\right] e^{-\tau_{UQ}(k)\delta}. \tag{3.28}$$

Just as for DQ(k) (Eq. (3.4)), the transient probability P(UQ(k) = ℓ) is defined as the sum

of a time-interval specific stationary probability (term P(UQk = ℓ)) and a term that decays

exponentially with time and accounts for the difference between the initial conditions (i.e.,

P(UQ(k − 1) = ℓ)) and the corresponding stationary probability (i.e., P(UQk = ℓ)).

The functional form of Equation (3.28) is inspired by Jagerman (1975, Equation (166))

which expresses the transient blocking probability of an M/M/ℓ/ℓ system as the sum of

the corresponding stationary probability (known as the Erlang-B formula, it is presented

below in Equation (3.29a)) and a term that decays exponentially with time. We consider

generally distributed, rather than Markovian, service times. Nonetheless, Equation (3.28)

uses a similar functional form for the transient probability of an M/M/ℓ/ℓ queue as in

Jagerman (1975, Equation (166)).
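The recursion of Equation (3.28) is a one-line update. The sketch below, with a hypothetical function name and purely illustrative numerical values, shows how repeated application moves the transient probability toward the stationary one at a rate set by the inverse relaxation time.

```python
import math


def relax(p_prev, p_stationary, tau, delta):
    """One step of Eq. (3.28): exponential decay of the gap to stationarity."""
    return p_stationary + (p_prev - p_stationary) * math.exp(-tau * delta)


# Illustrative values only (not taken from the experiments of this chapter).
p = 0.0
trajectory = []
for _ in range(50):
    p = relax(p, p_stationary=0.3, tau=0.1, delta=1.0)
    trajectory.append(p)
```

With constant inputs, the gap to the stationary value shrinks by the factor e^{−τδ} at every step, so a larger τ yields faster convergence. In the model itself, both the stationary probability and τUQ(k) are recomputed at each time interval.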

The stationary probability P(UQk = ℓ) of Equation (3.28) is approximated as follows:

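$$P(UQ^k=\ell) = \frac{\rho_{UQ}(k)^{\ell}/\ell!}{\sum_{n=0}^{\ell} \rho_{UQ}(k)^n/n!} \tag{3.29a}$$

$$\rho_{UQ}(k) = \lambda(k)/\mu_{UQ}(k) \tag{3.29b}$$

$$\mu_{UQ}(k) = 1/\left(k_{fwd}\delta + E[S_{DQ}(k)] + k_{bwd}\delta\right) \tag{3.29c}$$

$$E[S_{DQ}(k)] = \frac{\ell\,\rho_{DQ}(k)^{\ell+1} - (\ell+1)\,\rho_{DQ}(k)^{\ell} + 1}{\mu(k)\,(1-\rho_{DQ}(k)^{\ell})\,(1-\rho_{DQ}(k))}. \tag{3.29d}$$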

The system M/G/ℓ/ℓ has been extensively studied and is known as the Erlang loss model.

Equation (3.29a) is the stationary blocking probability for an M/G/ℓ/ℓ system. It is known as the Erlang-B formula. It was first derived by Erlang (1917) for an M/M/ℓ/ℓ system; Khinchin (1962) later proved that it holds for generally distributed service times with finite expectation. Equation (3.29b) is the definition of the traffic intensity of the M/G/ℓ/ℓ system: it is the ratio of the arrival rate λ(k) to the service rate µUQ(k), i.e., the inverse of the expected service time. The service rate is given by Equation (3.29c), and the expected service time is given by its inverse or, equivalently, by Equation (3.27). Equation (3.29d) approximates the expected sojourn time of DQ(k). The expression corresponds to the closed-form

expression for the expected sojourn time of an M/M/1/ℓ system (e.g., Gross (2008, Chap.

2, Equations (2.48) and (2.51))). Note that Equation (3.29d) uses an M/M/1/ℓ model for

DQ(k), while Section 3.3.1 uses an M/M/1 model. The use of an M/M/1/ℓ model

yields an expected sojourn time that: (i) is well defined for all traffic intensity values (i.e.,

even when λDQ(k) ≥ µ(k)) and (ii) is bounded from above. This is not the case for the

expected sojourn time of an M/M/1 model (i.e., E[SDQ(k)] = 1/(µ(k)−λDQ(k))), which

assumes a traffic intensity strictly smaller than 1 and goes to infinity as the traffic intensity

approaches 1.
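The stationary spillback probability of Equations (3.29a) to (3.29d) can be evaluated as in the following minimal sketch. The function names are hypothetical. Two choices are ours rather than part of the formulation above: the Erlang-B formula is evaluated with its standard recursion, which is algebraically equivalent to Equation (3.29a) and avoids large factorials, and the removable singularity of Equation (3.29d) at ρDQ(k) = 1 is replaced by its limit (ℓ + 1)/(2µ(k)).

```python
import math


def erlang_b(ell, rho_uq):
    """Erlang-B blocking probability, Eq. (3.29a), via the usual recursion."""
    blocking = 1.0
    for n in range(1, ell + 1):
        blocking = rho_uq * blocking / (n + rho_uq * blocking)
    return blocking


def erlang_b_direct(ell, rho_uq):
    """Literal evaluation of Eq. (3.29a), used here only as a cross-check."""
    terms = [rho_uq ** n / math.factorial(n) for n in range(ell + 1)]
    return terms[-1] / sum(terms)


def expected_sojourn_mm1l(mu, rho_dq, ell):
    """Expected sojourn time of an M/M/1/ell queue, Eq. (3.29d)."""
    if abs(rho_dq - 1.0) < 1e-6:
        return (ell + 1) / (2.0 * mu)  # limit of Eq. (3.29d) as rho -> 1
    num = ell * rho_dq ** (ell + 1) - (ell + 1) * rho_dq ** ell + 1.0
    den = mu * (1.0 - rho_dq ** ell) * (1.0 - rho_dq)
    return num / den


def stationary_spillback(lam, mu, rho_dq, ell, k_fwd, k_bwd, delta):
    """P(UQ^k = ell) from Eqs. (3.27) and (3.29a)-(3.29c)."""
    mean_service = (k_fwd * delta
                    + expected_sojourn_mm1l(mu, rho_dq, ell)
                    + k_bwd * delta)               # Eq. (3.27)
    mu_uq = 1.0 / mean_service                     # Eq. (3.29c)
    rho_uq = lam / mu_uq                           # Eq. (3.29b)
    return erlang_b(ell, rho_uq)


# Illustrative inputs; rho_dq = 0.75 is an assumed value.
p_full = stationary_spillback(lam=0.3, mu=0.4, rho_dq=0.75, ell=10,
                              k_fwd=5, k_bwd=10, delta=1.0)
```

For ℓ = 1 the sojourn time reduces to a single exponential service time, so `expected_sojourn_mm1l` returns 1/µ, which is a convenient check of the closed form.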

The endogenous parameter τUQ(k) of Eq. (3.28) represents the inverse of the relaxation

time. As discussed in Section 3.3.1, it measures the speed of convergence to the stationary

value. We approximate τUQ(k) as follows.

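$$\tau_{UQ}(k) = \alpha_3\, \frac{\ell\,\mu_{UQ}(k)\,(1-\rho_{UQ}(k)/\ell)^2}{1 + C_{UQ}(k)^2} \times \frac{1}{\ell} \tag{3.30a}$$

$$C_{UQ}(k) = \sqrt{\mathrm{Var}(S_{UQ}(k))}\,/\,E[S_{UQ}(k)] \tag{3.30b}$$

$$\mathrm{Var}(S_{UQ}(k)) = \frac{\ell\rho_{DQ}(k)^{2\ell+2} - 2\ell\rho_{DQ}(k)^{2\ell+1} + (\ell+1)\rho_{DQ}(k)^{2\ell} - \ell(\ell+1)\rho_{DQ}(k)^{\ell+2} + 2\ell(\ell+1)\rho_{DQ}(k)^{\ell+1} - (\ell^2+\ell+2)\rho_{DQ}(k)^{\ell} + 1}{\mu(k)^2\,(1-\rho_{DQ}(k)^{\ell})^2\,(1-\rho_{DQ}(k))^2}, \tag{3.30c}$$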

where the term α3 in Equation (3.30a) is defined by:

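$$\alpha_3 = \begin{cases} \alpha_{3,1} & \text{if } P(UQ(k-1)=\ell) < P(UQ^k=\ell) \quad \text{(3.31a)} \\ \alpha_{3,2} & \text{otherwise} \quad \text{(3.31b)} \end{cases}$$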

Studies of the relaxation time of an Erlang loss model are limited and mostly focus

on the special case of an M/M/ℓ/ℓ system (e.g., van Doorn and Zeifman (2009)) and

on asymptotic study of the relaxation time of the expected queue length. Our focus is

on the relaxation time of the blocking probability of an Erlang loss model with generally

distributed service times (i.e., M/G/ℓ/ℓ). To derive an approximation of the relaxation

time for such a system, we follow the approach of Roth (1994). Roth assumes that the

functional form of the relaxation time of a multi-server M/M/ℓ system is the same as that

of the single server M/M/1 system and replaces µ with ℓµ. The expression for a single-

server M/M/1 system is derived by Odoni and Roth (1983). We proceed similarly: we

use the same functional form for the relaxation time of an M/G/1 system, and replace µ

with ℓµ.

We first derive the inverse of the relaxation time of an M/G/1 system. Roth (1994) studies the relaxation times of infinite-capacity, single-server queueing systems that are Markovian (including systems in which inter-arrival and service times can be represented as exponential, Erlangian, hyperexponential, and phase-type random variables) and derives a general formula for the inverse relaxation time of such systems, $\tau = \mu(1-\lambda/\mu)^2 / (C_A^2 + C_S^2)$, where CA (resp. CS) is the coefficient of variation of the inter-arrival (resp. service) time. Note that any distribution can be approximated arbitrarily well by a phase-type distribution (Kingman; 1963), and hence, we approximate the relaxation time of an M/G/1 system by that of an M/PH/1 system with CA = 1 and CS equal to the coefficient of variation of the service time of UQ(k), denoted CUQ(k) and given by Equation (3.30b). We then obtain the inverse of the relaxation time of an M/G/ℓ system as:

$$\tau_{M/G/\ell} = \frac{\ell\,\mu_{UQ}(k)\,\big(1 - \lambda(k)/(\ell\,\mu_{UQ}(k))\big)^2}{1 + C_{UQ}(k)^2} = \frac{\ell\,\mu_{UQ}(k)\,(1-\rho_{UQ}(k)/\ell)^2}{1 + C_{UQ}(k)^2}. \tag{3.32}$$

Notice that, up to the coefficient α3, Equation (3.32) is exactly the left factor of Equation (3.30a).

In this way, the inverse relaxation time formula for an M/G/ℓ queueing system is obtained, and we then adjust it for finite capacity. To do so, we follow the idea of Chong and Osorio (2017), in which the relaxation time of an M/M/1/ℓ system is approximated as the product of the relaxation time of an M/M/1 system and the capacity of the queue ℓ. We then approximate the inverse relaxation time of an M/G/ℓ/ℓ system as the inverse of the product of the relaxation time of an M/G/ℓ system and the capacity of the queue. Equation (3.30a) for τUQ(k) of an M/G/ℓ/ℓ queue is thus obtained as the product of τM/G/ℓ and 1/ℓ, scaled by the coefficient α3. The numerator of Equation (3.30b) is the square root of the variance given by Equation (3.30c), and the denominator is given by Equation (3.27). Equation (3.30c) is obtained as follows. Recall that by definition: SUQ(k) = kfwdδ + SDQ(k) + kbwdδ, where kfwdδ and kbwdδ are constant delays. Thus, Var(SUQ(k)) = Var(SDQ(k)). Equation (3.30c) is derived in Appendix B.4 and represents the variance of the sojourn time of an M/M/1/ℓ system.

The proposed expression for τUQ(k), defined by the System of Equations (3.30) has the

following properties:

• Just as for the expression proposed for τDQ(k) (Eq. (3.7)), the expression for τUQ(k) has the same time units as the arrival and service rates. More specifically, Equation (3.30a) has the same units as µUQ(k) (note that ρUQ(k), CUQ(k) and ℓ are unit-free).

• Davis et al. (1995) study non-stationary Erlang loss models with a special focus on Mt/PH/n/n systems. They observe that the inverse of the relaxation time decreases as the variability of the service time increases. In other words, the more variable the service time, the longer it takes to reach stationarity. This holds for the proposed equation. In Equation (3.30a), τUQ(k) is inversely proportional to 1 + CUQ(k)², where CUQ(k) is the coefficient of variation of the service time of UQ(k). Thus, the higher the variability of SUQ(k), the higher CUQ(k), the smaller τUQ(k), and thus the longer it takes to reach stationarity.

• Just as for τDQ(k) (Equation (3.7)), the relaxation time depends on whether con-

gestion is building up or dissipating. Thus, Equation (3.31) defines α3 as a func-

tion of whether congestion is building up (Equation (3.31a)) or dissipating (Equa-

tion (3.31b)).

We use the same 126 simulation experiments described in Section 3.3.1 to fit the ex-

ogenous scalar coefficients α3,1 and α3,2 of Equation (3.31). A description of how these

coefficients are estimated is given in Appendix B.5. The fitted values are α3,1 = 25 and

α3,2 = 7.5. Note that α3,1 > α3,2, which means that when traffic dissipates it takes a longer time to reach stationarity than when traffic builds up.
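The evaluation of τUQ(k) can be sketched as follows, using the fitted coefficients α3,1 = 25 and α3,2 = 7.5. This is an illustration under stated assumptions, not the code used in this thesis: the function names are hypothetical, ρDQ(k) is passed in directly, and the case ρDQ(k) = 1 (where Equations (3.29d) and (3.30c) are of the form 0/0) is rejected rather than handled through a limit.

```python
import math

ALPHA_3 = {"build_up": 25.0, "dissipation": 7.5}  # alpha_{3,1}, alpha_{3,2}


def sojourn_mean_mm1l(mu, rho, ell):
    """Expected sojourn time of an M/M/1/ell queue, Eq. (3.29d)."""
    num = ell * rho ** (ell + 1) - (ell + 1) * rho ** ell + 1.0
    return num / (mu * (1.0 - rho ** ell) * (1.0 - rho))


def sojourn_var_mm1l(mu, rho, ell):
    """Variance of the sojourn time of an M/M/1/ell queue, Eq. (3.30c)."""
    num = (ell * rho ** (2 * ell + 2)
           - 2 * ell * rho ** (2 * ell + 1)
           + (ell + 1) * rho ** (2 * ell)
           - ell * (ell + 1) * rho ** (ell + 2)
           + 2 * ell * (ell + 1) * rho ** (ell + 1)
           - (ell ** 2 + ell + 2) * rho ** ell
           + 1.0)
    return num / (mu ** 2 * (1.0 - rho ** ell) ** 2 * (1.0 - rho) ** 2)


def tau_uq(lam, mu, rho_dq, ell, k_fwd, k_bwd, delta,
           p_prev_full, p_stat_full):
    """Inverse relaxation time of UQ, Eqs. (3.30a)-(3.31)."""
    if rho_dq == 1.0:
        raise ValueError("rho_dq = 1 is not handled in this sketch")
    mean_s_uq = (k_fwd * delta + sojourn_mean_mm1l(mu, rho_dq, ell)
                 + k_bwd * delta)                              # Eq. (3.27)
    mu_uq = 1.0 / mean_s_uq                                    # Eq. (3.29c)
    rho_uq = lam / mu_uq                                       # Eq. (3.29b)
    # Constant lags do not add variance: Var(S_UQ) = Var(S_DQ).
    c_uq = math.sqrt(sojourn_var_mm1l(mu, rho_dq, ell)) / mean_s_uq
    # Eq. (3.31): the spillback probability increases during build-up.
    regime = "build_up" if p_prev_full < p_stat_full else "dissipation"
    tau_mgl = (ell * mu_uq * (1.0 - rho_uq / ell) ** 2
               / (1.0 + c_uq ** 2))                            # Eq. (3.32)
    return ALPHA_3[regime] * tau_mgl / ell                     # Eq. (3.30a)


# Illustrative inputs; rho_dq = 0.75 is an assumed value.
args = dict(lam=0.3, mu=0.4, rho_dq=0.75, ell=10, k_fwd=5, k_bwd=10, delta=1.0)
tau_build = tau_uq(p_prev_full=0.0, p_stat_full=0.1, **args)
tau_dissipate = tau_uq(p_prev_full=0.2, p_stat_full=0.1, **args)
```

All else being equal, the two regimes differ only through α3, so the ratio of `tau_build` to `tau_dissipate` equals 25/7.5.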

Algorithm

Algorithm 4 summarizes the proposed model. Steps 1 through 5 are initialization steps.

Step 6 is carried out iteratively; it yields for each time interval the two key probabilities:

P(DQ(k) = 0) and P(UQ(k) = ℓ), as well as expected link inflow (i.e., qin(k)) and

expected link outflow (i.e., qout(k)). All function evaluations can be done sequentially and

no simultaneous evaluation of a system of equations is required. This makes our algorithm

computationally efficient.

Algorithm 4 Link model algorithm

1. set exogenous link parameters ρ, v, w, ℓ and the duration of each time interval δ

2. compute the forward and backward lags: kfwd = ⌈ℓ/(ρvδ)⌉ and kbwd = ⌈ℓ/(ρ|w|δ)⌉

3. set, for each time interval, the exogenous arrival rates and service rates λ(k) and µ(k), ∀ k = 1, 2, ...

4. set initial link conditions: qin(0), qout(0), qUQ(0), P(UQ(0) = ℓ) and P(DQ(0) = 0)

5. set qin(k) = 0 and qout(k) = 0 for k < 0

6. repeat the following for k = 1, 2, ...:

   (a) compute λDQ(k) and qLLO(k) according to Eq. (3.3) and Eq. (3.16), respectively

   (b) compute ρDQ(k) according to Eq. (3.5)

   (c) compute P(DQk = 0) according to Eq. (3.6)

   (d) compute α1 and α2 according to Eq. (3.8) and Eq. (3.9), respectively

   (e) compute τDQ,1 and τDQ,2 according to Eq. (3.7b) and Eq. (3.7c), respectively

   (f) compute τDQ(k) according to Eq. (3.7a)

   (g) compute P(DQ(k) = 0) according to Eq. (3.4)

   (h) if qLLO(k) = 0:

       i. compute qUQ(k − 1) according to Eq. (3.23c)

       ii. compute f(i, qUQ(k − 1)δ) ∀i ∈ {0, 1, ..., ℓ − 1} according to Eq. (3.23b)

       iii. compute P(N(k) ≥ ℓ − i) ∀i ∈ {0, 1, ..., ℓ} and compute P(UQ(k − 1) = i) ∀i ∈ {0, 1, ..., ℓ − 1} according to Eq. (3.22) and Eq. (3.23a), respectively

       iv. compute P(UQ(k) = ℓ) according to Eq. (3.21)

       else:

       i. compute E[SDQ(k)] and Var(SUQ(k)) according to Eq. (3.29d) and Eq. (3.30c), respectively

       ii. compute E[SUQ(k)] and µUQ(k) according to Eq. (3.27) and Eq. (3.29c), respectively

       iii. compute ρUQ(k) and CUQ(k) according to Eq. (3.29b) and Eq. (3.30b), respectively

       iv. compute P(UQk = ℓ) according to Eq. (3.29a)

       v. compute α3 according to Eq. (3.31)

       vi. compute τUQ(k) according to Eq. (3.30a)

       vii. compute P(UQ(k) = ℓ) according to Eq. (3.28)

   (i) compute qin(k) and qout(k) according to Eq. (3.1) and (3.2), respectively

3.4 Validation

In this section, we evaluate the accuracy and the computational efficiency of the proposed model. We compare its performance to that of two stochastic simulators: (i) the stochastic link transmission model (LTM) simulator (Section 3.4.1), and (ii) the microscopic traffic simulator Aimsun (TSS; 2014) (Section 3.4.2). The stochastic LTM simulator is a discrete-event implementation of a stochastic formulation of the deterministic link transmission model (LTM) (Yperman et al.; 2007). It assumes an inhomogeneous Poisson arrival process at the upstream end of the link and a stochastic departure process at the downstream end of the link. The simulator samples individual vehicles and implements the exact forward and backward lags. The vehicles at the downstream end of the link are served following a first-come first-serve rule. The service times are independent and identically distributed exponential random variables. The simulator was used as a benchmark in the validation experiments in Osorio and Flötteröd (2015) and in Lu and Osorio (2018) (cf. Chapter 2). The microscopic simulator Aimsun (TSS; 2014) is a commercial traffic simulator that relies on disaggregate car-following and lane-changing models for individual vehicles.

3.4.1 Validation versus a stochastic link transmission model simulator

We benchmark the performance of the proposed model versus that of the multivariate model

(Osorio and Flötteröd; 2015) and of the mixture model (Lu and Osorio; 2018) (cf. Chapter

2). The analytical approximations provided by each of the three analytical models are

compared to simulation-based estimates obtained from 106 simulation replications of the

stochastic LTM simulator.

We consider a link with parameters defined in Table 3.1 for all experiments conducted

in Section 3.4.1. First, we consider two experiments with time-varying demand and evalu-

ate the ability of the proposed model to approximate the link’s upstream and downstream

boundary conditions. The space capacity of the link, ℓ, is fixed at 10 for both experiments.

The link is initially empty. Experiment 1 considers a case where traffic conditions change

from uncongested to highly-congested and then to congested. More specifically, it consid-

ers an arrival rate, λ(k), of 0.1 veh/sec during the interval [0, 125] seconds, of 0.5 veh/sec

during the interval [125, 175] seconds and of 0.3 veh/sec during the interval [175, 300] sec-

onds. Experiment 2 considers a case where traffic conditions change from congested to

uncongested and then to highly-congested. It considers an arrival rate of 0.3 veh/sec during

the interval [0, 100] seconds, of 0.1 veh/sec during the interval [100, 200] seconds and of

0.5 veh/sec during the interval [200, 300] seconds.

Table 3.1: Link parameters

Parameter            Value
v                    0.01 km/sec
w                    −0.005 km/sec
ρ                    200 veh/km
q                    2400 veh/h = 0.67 veh/sec
δ                    1 sec
µ(k)                 1440 veh/h = 0.4 veh/sec
λ(k)                 varies by experiment
ℓ, L, kfwd, kbwd     varies by experiment

[Figure 3-1, two panels. Left panel: P(DQ(T) = 0) versus time T. Right panel: P(UQ(T) = L) versus time T. Legend: Sim, Proposed, Multivariate, Mixture.]

Figure 3-1: Experiment 1: impact of the temporal variation of demand on the link’s upstream and downstream boundary conditions

Figure 3-1 considers experiment 1. The left (resp. right) plot considers P(DQ(T) = 0) (resp. P(UQ(T) = ℓ)). For each plot, the x-axis displays the integer time T in seconds and the y-axis displays the corresponding probability. The simulated estimates are displayed as a red line with asterisks, those of the proposed model are the black solid line, those of the multivariate model are the black dot-dashed line, and those of the mixture model are the black dashed line. The simulated estimates are displayed with 95% confidence intervals, which are barely visible.

Recall that in experiment 1, there is a sharp increase in demand at time T = 125 seconds

and a sharp decrease at time T = 175 seconds. All analytical models yield similar temporal

trends for both P(DQ(T) = 0) and P(UQ(T) = ℓ). More specifically, as congestion

increases, we expect P(DQ(T) = 0) to decrease and P(UQ(T) = ℓ) to increase. Similarly,

as congestion decreases, we expect P(DQ(T) = 0) to increase and P(UQ(T) = ℓ) to

decrease. All models exhibit these trends. They all capture the sharp decrease and increase trends of the simulator. The multivariate model yields the most accurate approximation. The proposed model tends to overestimate the stationary spillback probability during the highly congested regime, whereas the mixture model tends to underestimate the stationary spillback probability during both the congested and the highly congested regimes. Overall, the proposed model yields a good approximation of both probabilities P(DQ(T) = 0) and P(UQ(T) = ℓ).

[Figure 3-2, two panels. Left panel: P(DQ(T) = 0) versus time T. Right panel: P(UQ(T) = L) versus time T.]

Figure 3-2: Experiment 2: impact of the temporal variation of demand on the link’s upstream and downstream conditions

The results of experiment 2 are displayed in Figure 3-2. These plots have the same

layout as those of Figure 3-1. The simulated estimates are displayed with 95% confidence

intervals, which are barely visible. Recall that experiment 2 considers a sharp decrease in

demand at T = 100 seconds and a sharp increase at T = 200 seconds. The left plot shows

an increase in P(DQ(T) = 0) after T = 100 seconds and a decrease after T = 200 seconds.

The right plot shows that the spillback probability (P(UQ(T) = ℓ)) decreases after T = 100

seconds and increases after T = 200 seconds. All analytical models capture these sharp

changes in probability mass. The proposed model yields a less accurate approximation of

the stationary spillback probability during the highly congested regime, while the mixture

model yields a less accurate approximation for both the congested and the highly congested

regimes. Overall, all three models approximate well the dynamics of the link’s boundary

conditions for sudden and significant changes in congestion levels.

Next, we benchmark the accuracy of the proposed model over a set of 21 experiments.

All experiments start with an empty link and have a duration of TF = 300 seconds. For each

experiment, the arrival rate changes at 150 seconds. The experiments consider all combi-

nations of the following time-varying arrival rates (λ ∈ {0.2 → 0.1, 0.3 → 0.1, 0.3 →0.2} veh/sec) and space capacities (ℓ ∈ {10, 20, 30, 40, 60, 80, 100}). The space capac-

ity values considered correspond to link lengths L ∈ {50, 100, 150, 200, 300, 400, 500}

(in meters), forward lags kfwd ∈ {5, 10, 15, 20, 30, 40, 50} and backward lags kbwd ∈

{10, 20, 30, 40, 60, 80, 100}. For all experiments the arrival rates are such that congestion

builds up during the first 150 seconds and the link reaches a stationary regime, then the

arrival rate decreases, congestion dissipates and the link reaches a less congested stationary

regime. For each experiment and each model, we set a maximum computation runtime of

40 hours. Model evaluations that have not concluded within the 40 hours are terminated.
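The correspondence between space capacities, link lengths and lags stated above can be checked with the parameters of Table 3.1. The sketch below is illustrative: the function name is hypothetical, the lag expressions are those of step 2 of Algorithm 4 (with |w| in the backward lag), and we assume that the space capacity is the jam density times the link length, ℓ = ρL. Exact rational arithmetic is used so that the ceiling is not affected by floating-point rounding.

```python
from fractions import Fraction
from math import ceil

# Link parameters of Table 3.1, stored exactly.
RHO = Fraction(200)        # jam density, veh/km
V = Fraction("0.01")       # forward wave speed, km/sec
W = Fraction("-0.005")     # backward wave speed, km/sec
DELTA = Fraction(1)        # time interval length, sec


def link_settings(ell):
    """Return (link length in meters, k_fwd, k_bwd) for space capacity ell."""
    length_m = Fraction(ell) / RHO * 1000      # assumes ell = rho * L
    k_fwd = ceil(Fraction(ell) / (RHO * V * DELTA))
    k_bwd = ceil(Fraction(ell) / (RHO * abs(W) * DELTA))
    return int(length_m), k_fwd, k_bwd


space_capacities = [10, 20, 30, 40, 60, 80, 100]
settings = [link_settings(ell) for ell in space_capacities]
for ell, (length_m, k_fwd, k_bwd) in zip(space_capacities, settings):
    print(ell, length_m, k_fwd, k_bwd)
```

Combined with the three time-varying arrival rates, these seven link settings give the 21 experiments.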

The error metric used to evaluate the accuracy of a given analytical model is the average

absolute difference between the simulated estimate and the analytical approximation:

$$e_{DQ} = \frac{1}{T_F} \sum_{T=1}^{T_F} \left| P_A(DQ(T)=0) - P_S(DQ(T)=0) \right| \tag{3.33}$$

$$e_{UQ} = \frac{1}{T_F} \sum_{T=1}^{T_F} \left| P_A(UQ(T)=\ell) - P_S(UQ(T)=\ell) \right|, \tag{3.34}$$

where PA denotes the probability approximated by an analytical model (proposed, mixture

or multivariate) and PS denotes the simulated estimate.
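Both error metrics share the same form and can be computed with a single helper. The following is a minimal sketch with a hypothetical function name; the two input sequences hold the analytical and simulated probabilities for T = 1, ..., TF, and the numbers in the usage line are made up for illustration.

```python
def mean_abs_error(analytical, simulated):
    """Average absolute difference of Eqs. (3.33)-(3.34)."""
    if len(analytical) != len(simulated) or not analytical:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(a - s) for a, s in zip(analytical, simulated))
    return total / len(analytical)


# Made-up probabilities for three time steps.
example_error = mean_abs_error([0.1, 0.2, 0.4], [0.1, 0.3, 0.2])
```

Passing the series of P(DQ(T) = 0) gives eDQ and passing the series of P(UQ(T) = ℓ) gives eUQ.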

Figure 3-3 displays the average absolute difference for the 21 experiments. The top

(resp. bottom) three plots consider the spillback probability P(UQ(T) = ℓ) (resp. P(DQ(T) =

0)). The first, second and third column of plots consider the experiments with arrival rate

0.2 → 0.1 veh/sec, 0.3 → 0.1 veh/sec and 0.3 → 0.2 veh/sec, respectively. Each plot

compares the three models: the proposed model (circles), the mixture model (asterisks)

and the multivariate model (triangles). The x-axis displays the space capacity (i.e., ℓ) and

the y-axis displays the average absolute difference (i.e., eUQ or eDQ). The top three plots,

which consider the spillback probability, have a logarithmic-scaled y-axis. For the experiments with space capacity greater than 30 (i.e., ℓ > 30), the multivariate model does not conclude within 40 hours, thus these runs are terminated and are not displayed in the plots.

[Figure 3-3, six panels; x-axis: space capacity ℓ from 10 to 100; y-axis: average absolute error; legend: Proposed, Mixture, Multivariate. Top row (eUQ, logarithmic y-axis): (a) Arrival rate λ(k) = 0.2 → 0.1 veh/sec; (b) Arrival rate λ(k) = 0.3 → 0.1 veh/sec; (c) Arrival rate λ(k) = 0.3 → 0.2 veh/sec. Bottom row (eDQ): (d) Arrival rate λ(k) = 0.2 → 0.1 veh/sec; (e) Arrival rate λ(k) = 0.3 → 0.1 veh/sec; (f) Arrival rate λ(k) = 0.3 → 0.2 veh/sec.]

Figure 3-3: Comparison of the average absolute errors for the 21 experiments with time-varying demand

The main insights from Figure 3-3 are as follows. For most experiments, the multivari-

ate model gives the lowest errors for both P(UQ(T) = ℓ) and P(DQ(T) = 0), followed

by the mixture model. For the spillback probabilities (i.e., top three plots), both the pro-

posed model and the mixture model have errors that decrease exponentially as the space

capacity ℓ increases. As the space capacity increases, the error of the proposed model in the approximation of P(DQ(T) = 0) (bottom three plots) remains of the same order of magnitude. The numerical values of the errors displayed in Figure 3-3 are provided in Appendix B.6 (Tables B.1 and B.2). The average (over the 21 experiments) eUQ and eDQ of the

proposed model are 0.0006 and 0.0180, respectively, whereas those of the mixture model

are 0.0023 and 0.0073. Compared to the mixture model, the proposed model, on average,

gains accuracy in approximating the upstream boundary conditions and loses accuracy in

approximating the downstream boundary conditions.

Overall, the multivariate model has the highest accuracy, yet is computationally inef-

ficient for large space capacity values. The proposed model and the mixture model have

comparable accuracy. They both perform well for all experiments.

We now compare the multivariate model, the mixture model and the proposed model in

terms of computational runtime. Figure 3-4 compares the runtimes for the 21 experiments.

Figure 3-4(a), 3-4(b) and 3-4(c) consider the experiments with time-varying arrival rate

0.2 → 0.1 veh/sec, 0.3 → 0.1 veh/sec and 0.3 → 0.2 veh/sec, respectively. Figure 3-4(d)

plots the average runtime over all three time-varying arrival rate experiments. For each plot,

the x-axis displays the space capacity ℓ and the y-axis displays the computational runtime

(in seconds). The y-axis is plotted on a logarithmic scale. Each plot considers runtimes of

the three models: proposed (circles), mixture (asterisks) and multivariate (triangles). Since the multivariate model does not conclude within 40 hours for experiments with ℓ > 30, these runs are terminated and their runtimes are not reported. For all three arrival rate experiments (i.e., for Figures 3-4(a), 3-4(b) and

3-4(c)) the following trends hold: the runtime of the multivariate model increases exponen-

tially with ℓ, that of the mixture model increases linearly, while for the proposed model,

the computational runtime appears constant. The average runtime over the 21 experiments

of the proposed model is 0.08 seconds, whereas that of the mixture model is 0.49 seconds.


The average runtime is thus reduced by a factor of about 6 (0.49/0.08), i.e., by close to one order of magnitude.

In summary, for all experiments, the proposed model performs comparably with the mixture and multivariate models in describing the dynamics of the link’s boundary conditions. The gain in computational runtime is significant and increases as the space capacity

increases.

3.4.2 Validation versus a microscopic traffic simulator

We benchmark the proposed model versus the microscopic traffic simulator Aimsun. We

consider a signal-controlled single-lane link, depicted in Figure 3-5, with parameters shown

in Table 3.2. There are two detectors on the link that provide vehicle count data every sec-

ond. We study the traffic dynamics between the entrance and the exit detectors of the link.

The detector denoted entrance detector (resp. exit detector) captures the link’s upstream

(resp. downstream) boundary conditions. The performance metrics that we consider are the expected inflow and the expected outflow, in vehicles per second, each computed as the vehicle count averaged over 2 seconds. We consider four experiments with different demand scenarios. The first three experiments have a constant arrival rate over the simulation period. The arrival rates are 0.1 veh/sec, 0.2 veh/sec and 0.3 veh/sec, respectively. The fourth experiment has a time-varying arrival rate that mimics arrival patterns from an upstream signal-controlled intersection. More specifically, the arrival rate [veh/sec] within every signal cycle (i.e., every 60 seconds) is given by:

$$\lambda(k) = \begin{cases} 0.3 & \text{for } 0 \le k \le 40 \text{ sec} \\ 0 & \text{for } 40 < k \le 60 \text{ sec.} \end{cases} \tag{3.35}$$

For all experiments, we start with an empty link and use a warm-up period of 5 seconds, which is the free-flow travel time for the arrivals to first reach the entrance detector. The performance metrics are estimated from 200 simulation replications.

The part of the link in between the two detectors is described using the proposed analytical model. The corresponding link parameters used in the proposed model are given in Table 3.3. All experiments start with an empty link and have a duration of 250 seconds.

Figure 3-4: Comparison of the computational runtimes for the 21 experiments with time-varying demand. Panels: (a) runtime for time-varying arrival rate λ(k) = 0.2 → 0.1 veh/sec; (b) runtime for time-varying arrival rate λ(k) = 0.3 → 0.1 veh/sec; (c) runtime for time-varying arrival rate λ(k) = 0.3 → 0.2 veh/sec; (d) average runtime over all three time-varying arrival rates. Legend: Proposed, Mixture, Multivariate.

Figure 3-5: Microscopic simulation model of a single-lane link

Parameter                     Value
Link length                   100 m
Entrance detector position    50 m
Exit detector position        100 m
Flow capacity                 2400 veh/h = 0.67 veh/sec
Maximum speed                 36 km/h = 10 m/sec
Downstream control type       fixed-time signal plan
Cycle time                    60 sec
Green phase                   48 sec
Simulation length             5 min = 300 sec
Replications number           200
Detection interval            1 sec
Arrival rate λ(k)             varies by experiment

Table 3.2: Link parameters used in the microscopic simulator

The time-varying service rate [veh/sec] of the link used in the proposed model within every

signal cycle (i.e., every 60 seconds) is given by:

$$\mu(k) = \begin{cases} 0.67 & \text{for } 0 \le k \le 48 \text{ sec} \\ 0 & \text{for } 48 < k \le 60 \text{ sec.} \end{cases} \tag{3.36}$$

The arrival rates used in the proposed model for the four experiments are set exactly the same as in the simulator, i.e., λ(k) = 0.1 veh/sec, λ(k) = 0.2 veh/sec and λ(k) = 0.3 veh/sec for the first three experiments, respectively; for the fourth experiment, λ(k) within every signal cycle is given by Equation (3.35).
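The periodic rates of Equations (3.35) and (3.36) can be sketched as functions of the time k in seconds. The sketch assumes that the position within the cycle is obtained as k modulo 60, so that k = 60 is treated as the start of the next cycle; this boundary convention and the function names are assumptions, not taken from the text.

```python
CYCLE = 60  # signal cycle length [sec]


def arrival_rate(k, on_duration=40, rate=0.3):
    """Eq. (3.35): arrivals at `rate` veh/sec during the first 40 sec of a cycle."""
    return rate if (k % CYCLE) <= on_duration else 0.0


def service_rate(k, green=48, capacity=0.67):
    """Eq. (3.36): service at flow capacity during the 48 sec green phase."""
    return capacity if (k % CYCLE) <= green else 0.0


# Rates over the 250-second duration of the analytical experiments.
lam = [arrival_rate(k) for k in range(250)]
mu = [service_rate(k) for k in range(250)]
```

For the three constant-demand experiments, the arrival rate would simply be a constant list of 0.1, 0.2 or 0.3 veh/sec.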

Parameter    Value
v            0.01 km/sec
w            −0.005 km/sec
ρ            200 veh/km
q            2400 veh/h = 0.67 veh/sec
δ            1 sec
ℓ            10
L            50 m
kfwd         5
kbwd         10
λ(k)         varies by experiment
µ(k)         varies within signal cycle given by Eq. (3.36)

Table 3.3: Link parameters used in the proposed model

Figures 3-6 to 3-9 compare estimates of the expected inflow (computed at the entrance detector) and of the expected outflow (computed at the exit detector) obtained from the simulator to those obtained from the proposed analytical model. For each figure, the left

(resp. right) plot depicts the expected inflow (resp. outflow). Each plot displays, as dotted

lines, 95% confidence intervals of the simulated estimates. The proposed model results are

displayed as a solid line with circles.
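The simulated estimates and their confidence intervals can be sketched as follows. The text does not state how the 95% confidence intervals are constructed; the sketch assumes a normal approximation (mean ± 1.96 standard errors across replications), and the function name and toy counts are hypothetical.

```python
import math
import statistics


def expected_flow_with_ci(counts, window=2, z=1.96):
    """counts[r][s]: vehicles counted in second s of replication r.

    Returns one (mean, lower, upper) tuple per window of `window` seconds,
    where the flow of a replication is its vehicle count averaged over
    the window [veh/sec].
    """
    n_rep = len(counts)
    n_windows = len(counts[0]) // window
    results = []
    for i in range(n_windows):
        flows = [sum(rep[i * window:(i + 1) * window]) / window for rep in counts]
        mean = statistics.mean(flows)
        # Assumed normal-approximation half-width of the interval.
        half = z * statistics.stdev(flows) / math.sqrt(n_rep)
        results.append((mean, mean - half, mean + half))
    return results


# Toy data: 4 replications of 4 seconds each (the experiments use 200 replications).
toy_counts = [[0, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 1], [1, 0, 0, 1]]
summary = expected_flow_with_ci(toy_counts)
```

An analytical estimate would then be compared against the lower and upper bounds of each window.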

Figure 3-6 considers the experiment with a constant arrival rate of 0.1 veh/sec. The

expected inflow (left plot) remains constant around 0.1 veh/sec for the whole experiment.

This indicates that the demand is low enough that the queue does not extend beyond the entrance detector and thus does not constrain the arrivals to it. The expected outflow (right plot) is zero during the red phases of the traffic signal. The outflow pattern

during the first traffic signal cycle differs from the others because the link starts off empty

and there is only a warm-up period of 5 seconds for the arrivals to first reach the entrance

detector. So when the first green phase starts, there is no vehicular queue. For all other

cycles, there is a queue of vehicles waiting to leave the link when the green phase starts.

The outflow pattern shows how the queue gradually dissipates (i.e., the outflow gradually

decreases).

Figures 3-7 and 3-8 display the results for the experiments with constant arrival rates

of 0.2 veh/sec and 0.3 veh/sec, respectively. The expected outflow patterns (right plots of

each figure) are similar to those of the right plot of Figure 3-6. The left plots of Figures 3-7

and 3-8 indicate that as demand increases, the expected inflow may temporarily decrease


due to the vehicular queue extending beyond the entrance detector. This phenomenon is

also captured by the proposed analytical model.

Figure 3-9 considers the experiment with time-varying demand. As illustrated in the left plot, the expected inflow oscillates between 0.3 and 0 veh/sec. In the right plot, the expected outflow pattern after the first signal cycle differs from that of the previous three experiments. This is because the demand pattern and the signal pattern at the downstream end of the link are not synchronized within a signal cycle of 60 seconds. To better explain the pattern, we plot a color bar at the top of the right plot. This color bar consists of two colors: green, which represents the period of time during which the expected inflow is 0.3 veh/sec, shifted by a free-flow link travel time of 5 seconds, and red, which represents the period of time during which the expected inflow is 0 veh/sec, shifted by the same free-flow travel time. In other words, the green bar shows roughly the period with positive arrivals to the downstream queue, whereas the red bar shows the period with no arrivals. In addition, the right plot displays eight vertical grid lines, which represent the times at which the downstream signal phase changes. The phases start with green and then alternate. As before, the expected outflow is zero during the red phases of the traffic signal. The outflow pattern during the first traffic signal cycle is similar to those of the previous three experiments. Note that the green phase ends slightly later than the positive arrival period, which causes a brief decrease in outflow before the red phase. The next green phase starts during a period with no arrivals (i.e., pure departure): the expected outflow first decreases, which shows how the queue gradually dissipates. The subsequent increase in expected outflow is due to the restart of arrivals to the downstream queue, and shows how the queue gradually builds up and stabilizes. This outflow pattern repeats afterwards and is fully captured by the proposed analytical model.

For all experiments with both constant and alternating arrival rates, the approximations

of both the expected inflow and of the expected outflow derived by the proposed model

almost always lie within the 95% confidence interval of the simulated estimates. Thus, the

proposed model accurately captures the dynamics of the boundary conditions.

Figure 3-6: Comparison of the expected inflow and outflow for the experiment with arrival rate λ = 0.1 veh/sec. Left: expected inflow [veh/sec] versus time T; right: expected outflow [veh/sec] versus time T. Legend: Microscopic simulator 95% CI, Proposed.

We now consider an experiment with platoon arrivals. This experiment serves to evaluate the ability of the proposed model, and in particular its use of a time-varying Poisson arrival process, to approximate the link’s boundary conditions under platoon arrival patterns. We consider two tandem single-lane links as shown in Figure 3-10. Both links are

controlled by a fixed-time signal plan with a 60 second cycle time. The cycle is composed

of a single green phase followed by a single red phase. The offset is set to zero, i.e., both

intersections start their green phase at the same time. The duration of the green phase for

the upstream (resp. downstream) intersection is 40 (resp. 48) seconds. The maximum

speed of both links is the same as in Table 3.2. Arrivals from the upstream link to the downstream link form platoons created by the upstream signalized intersection.

Just as in our previous experiments, the downstream link has the entrance and the exit

detectors, which are used to estimate expected inflows and outflows. We study the traffic

dynamics between the entrance and the exit detectors of the downstream link. Since node

modeling is not the focus of this chapter, we use a third detector to estimate the downstream

link’s arrival rate λ(k) of the analytical model. More specifically, λ(k) is estimated by the

flow observed at the detector labeled detector 1 in Figure 3-10 and shifted in time by the

free-flow travel time between detector 1 and the entrance detector (5 seconds). The setting

of detector 1 is the same as the entrance and exit detectors. The parameters of the proposed

analytical model are displayed in Table 3.4. We start with an empty link and a warm-up period of 10 seconds, the free-flow time for the arrivals to first reach the entrance detector, and the performance metrics are estimated from 200 simulation replications.

[Figures 3-7 and 3-8: expected inflow [veh/sec] and expected outflow [veh/sec] versus time T]

Figure 3-9: Comparison of the expected inflow and outflow for the experiment with alternating arrival rate between 0.3 veh/sec and 0 veh/sec

Figure 3-10: Microscopic simulation model for platoon arrival experiments

Parameter    Value
v            0.01 km/sec
w            −0.005 km/sec
ρ            200 veh/km
q            2400 veh/h = 0.67 veh/sec
δ            1 sec
ℓ            10
L            50 m
kfwd         5
kbwd         10
λ(k)         estimates from detector 1
µ(k)         varies within signal cycle given by Eq. (3.36)

Table 3.4: Link parameters used in the proposed model

Figure 3-11: Comparison of the expected inflow and outflow for the tandem link experiment

The left (resp. right) plot of Figure 3-11 shows the expected inflow (resp. outflow). For

both plots, the analytical approximations of both the expected inflow and of the expected

outflow fall within the 95% confidence intervals of the simulated estimates. This shows

the ability of the proposed model to capture the impact of platoon arrivals on the links’

boundary conditions.

In summary, the comparisons to a microscopic traffic simulator indicate that the pro-

posed model can accurately approximate the links’ boundary conditions for realistic traffic


situations such as signalized links and platoon arrival patterns.

3.5 Optimization case study

In this section, we evaluate and benchmark the computational efficiency of the proposed

model. We use it to tackle a signal control optimization problem for the Swiss city of Lau-

sanne. The signal control problem considered is the same as that studied in Lu and Osorio

(2018) (cf. Chapter 2). Section 3.5.1 formulates the problem. Section 3.5.2 compares the

performance of the proposed model with that of three other approaches: (i) the mixture

model of Lu and Osorio (2018) (cf. Chapter 2), (ii) the deterministic intelligent link transmission model (ILTM) (Himpe et al.; 2016), and (iii) a widely used commercial signal control

software. The ILTM model is a network loading model that combines an efficient iterative

link transmission model with the node model of Tampère et al. (2011).

3.5.1 City-scale signal control

The Lausanne network consists of 603 links, 902 lanes and 231 intersections. The network

model of the stochastic microscopic simulator is shown in Figure 3-12. We consider a

fixed-time signal control problem in which we determine the signal plans of 17 intersections

distributed throughout the network (displayed as squares in Figure 3-12). The signal plans

of the 17 intersections are determined jointly. The problem considers the evening peak period 5:00-5:30pm. The decision variables are the green splits of the signal phases of the 17 intersections. All other control variables (e.g., cycle times and offsets) are fixed. This leads to a total of 99 endogenous signal phase variables (i.e., the decision vector is of dimension 99).

Figure 3-12: Lausanne network model

We use the following notation.

$b_d$: ratio of available cycle time to total cycle time for intersection $d$;

$x$: vector of green splits;

$x(j)$: green split of signal phase $j$;

$x_{LB}$: vector of lower bounds for green splits;

$D$: set of intersection indices;

$P_D(d)$: set of endogenous signal phase indices of intersection $d$;

$L$: set of all lanes;

$T$: total number of one-minute time intervals;

$N_1$: number of lanes, i.e., cardinality of $L$.

The problem is formulated as follows:

$$\min_{x} \; f(x) = \frac{1}{T N_1} \sum_{i \in L} \sum_{t=1}^{T} P(UQ_i(t; x) = \ell_i) \tag{3.37}$$

subject to

$$\sum_{j \in P_D(d)} x(j) = b_d, \quad \forall d \in D \tag{3.38}$$

$$x \ge x_{LB}. \tag{3.39}$$

The decision vector, x, consists of the green splits of the signal controlled lanes. Constraint (3.38) ensures that, for every intersection, the sum of the green times equals the available cycle time. Constraint (3.39) imposes lower bounds on the green splits, which are set to 4 seconds in this case study.

P(UQi(t; x) = ℓi) denotes the spillback probability of lane i at integer time t under signal

plan x. Therefore, the objective function is the average (over space and over time) spill-

back probability. The goal is to find a signal plan that minimizes the spatial and temporal

occurrence of spillbacks.
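To make the structure of Problem (3.37)-(3.39) concrete, the following minimal sketch evaluates the objective from a given array of spillback probabilities and checks the two constraints. It does not compute the probabilities themselves, which come from the link model; the function names, the dictionary-based data layout and the toy numbers are all hypothetical.

```python
def average_spillback_probability(p_spill):
    """Eq. (3.37): p_spill[i][t] = P(UQ_i(t; x) = l_i) for lane i and interval t."""
    n_lanes = len(p_spill)
    n_steps = len(p_spill[0])
    return sum(sum(row) for row in p_spill) / (n_lanes * n_steps)


def is_feasible(x, phases_of, b, x_lb, tol=1e-9):
    """Check constraints (3.38)-(3.39) for green splits x (dict: phase -> split)."""
    sums_ok = all(
        abs(sum(x[j] for j in phases) - b[d]) <= tol
        for d, phases in phases_of.items()
    )
    bounds_ok = all(x[j] >= x_lb[j] - tol for j in x)
    return sums_ok and bounds_ok


# Toy instance: 2 lanes, 2 time intervals, 1 intersection with 2 phases.
f_value = average_spillback_probability([[0.0, 0.2], [0.1, 0.3]])
x = {"a": 0.5, "b": 0.4}
feasible = is_feasible(x, {"d1": ["a", "b"]}, {"d1": 0.9}, {"a": 0.05, "b": 0.05})
```

In the case study, the decision vector has dimension 99 and the set D contains the 17 controlled intersections.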

To evaluate the performance of the signal plans proposed by the various methods, we

use a calibrated microscopic model of the Lausanne network, which embeds realistic link

and node models. Thus, when evaluating the performance of a signal plan, the simula-

tor accounts for how route choices and link demand can vary with signal plan changes.

The deterministic ILTM model is also a complete network loading model that embeds the

well-established node model of Tampère et al. (2011). However, the probabilistic analytical models (i.e., the proposed model and the mixture model) are link models. When used for network optimization, they assume that the demand for each link is exogenous and describe the within-link dynamics; they do not describe the across-link (i.e., node) dynamics. Implementation details, including the computation of the exogenous link demand, are

given in Section 4.1 of Lu and Osorio (2018) (cf. Chapter 2). The above problem is solved

with the proposed model and with the mixture model using the interior-point algorithm of

the fmincon routine of Matlab (MATLAB; 2016).

Since the ILTM is a deterministic model, it does not approximate the spillback probabilities that define the objective function (3.37). Thus, we use 2 alternate objective functions for ILTM:

1. The average, over time and over the network lanes, proportion of lane that is occupied by vehicles:

$$f(x) = \frac{1}{T N_1} \sum_{i \in L} \sum_{t=1}^{T} \left( cvn_i^{up}(t) - cvn_i^{down}(t) \right) / \ell_i. \tag{3.40}$$

where $cvn_i^{up}(t)$ (resp. $cvn_i^{down}(t)$) is the cumulative number of vehicles that passes

the upstream (resp. downstream) end of link i up until time t, which is the output

metric produced by the deterministic ILTM model. This objective function reflects

the average saturation degree of each link in the network over time, which is between

0 and 1. A larger objective function value indicates that the network is more likely to be congested and thus that spillback is more likely to occur within the network.

2. The average, over time and over the network lanes, link travel time (given by Eq. (3.41)).

$$f(x) = \frac{1}{T N_1} \sum_{i \in L} \sum_{t=1}^{T} TT_i(t). \tag{3.41}$$

where TTi(t) is the travel time of link i at time t, which is an output metric produced

by the deterministic ILTM model. This objective function reflects average trip travel

time through the network. This is a common choice of objective function for deterministic models. A larger objective function value reflects a longer travel time through the network, which suggests a higher chance of spillback.
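The two alternate objective functions (3.40) and (3.41) can be sketched as plain averages over links and time intervals. The sketch assumes that the cumulative vehicle numbers and link travel times have already been produced by a network loading model; the function names and toy values are hypothetical and do not reproduce any ILTM interface.

```python
def average_occupancy(cvn_up, cvn_down, capacity):
    """Eq. (3.40): mean over links i and times t of (cvn_up - cvn_down) / l_i."""
    n_links = len(cvn_up)
    n_steps = len(cvn_up[0])
    total = 0.0
    for i in range(n_links):
        for t in range(n_steps):
            # Vehicles currently on link i, as a share of its space capacity.
            total += (cvn_up[i][t] - cvn_down[i][t]) / capacity[i]
    return total / (n_links * n_steps)


def average_travel_time(tt):
    """Eq. (3.41): mean over links i and times t of TT_i(t)."""
    return sum(sum(row) for row in tt) / (len(tt) * len(tt[0]))


# Toy instance with 2 links and 2 time intervals.
occ = average_occupancy([[2, 5], [4, 8]], [[0, 1], [2, 3]], [10, 20])
att = average_travel_time([[10.0, 12.0], [20.0, 22.0]])
```

Both functions are smooth in their inputs, which is consistent with the requirement of a derivative-based solver mentioned in the text.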

A natural deterministic approximation of (3.37) would be the average, over all lanes in the

network, proportion of time the lane is full. However, this function is not continuously

differentiable. Thus, we could not use it with the derivative-based interior-point algorithm

used with the other models.

For all methods, the maximum runtime is set to 24 hours. If the algorithm does not

converge to a local optimal solution within the time limit, the algorithm is terminated and

the current iterate is used as the final solution.

Initial point    1        2        3        4
Mixture          145.0    146.1    144.4    149.4
ILTM             10.6     10.7     10.7     10.8
Proposed         1.3      1.3      1.3      1.3

Table 3.5: Average runtime (in min) per iteration of the signal control optimization algorithm

3.5.2 Numerical results

Problem (3.37)-(3.39) for the proposed and mixture model, and Problems (3.40) and (3.41)

with constraints (3.38) and (3.39) for the ILTM model are solved considering four different

initial points. The initial points are drawn uniformly randomly from the feasible region

(Equations (3.38)-(3.39)) using the sampling code of Stafford (2006).
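One simple way to draw a point uniformly from a feasible region of the form (3.38)-(3.39) is sketched below for a single intersection. This is not the sampling code of Stafford (2006); it is a substitute technique that shifts by the lower bounds and samples the remaining slack uniformly on a simplex using normalized exponential variates. The function name and the toy numbers are hypothetical.

```python
import random


def sample_green_splits(b, x_lb, rng=random):
    """Draw x uniformly from {x : sum(x) = b, x >= x_lb} for one intersection."""
    slack = b - sum(x_lb)
    if slack < 0:
        raise ValueError("lower bounds exceed the available cycle time")
    # Normalized i.i.d. exponential variates are uniform on the unit simplex.
    e = [rng.expovariate(1.0) for _ in x_lb]
    total = sum(e)
    return [lb + slack * ei / total for lb, ei in zip(x_lb, e)]


# Toy intersection: available cycle ratio 0.9, three phases, lower bound 0.1 each.
rng = random.Random(0)
x0 = sample_green_splits(0.9, [0.1, 0.1, 0.1], rng)
```

Because constraint (3.38) couples only the phases of a single intersection, sampling each intersection independently in this way would give a uniform draw over the whole feasible region.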

For the proposed model and the ILTM model, all four optimization runs (i.e., one for

each initial point) conclude within the time limit. In fact, they all finish within 2.5 hours

for the proposed model and within 5.5 hours for the ILTM model. For the mixture model,

the algorithms do not converge within the time limit. Table 3.5 compares the average com-

putation time (in minutes) per algorithmic iteration. Each column of Table 3.5 corresponds

to a different initial point. Since we consider 2 different objective functions for ILTM, Table

3.5 displays the average computation time, averaged over the two optimization problems.

Table 3.5 indicates the computation time of all models does not vary significantly across

initial points. For the proposed model, the average runtime per iteration is on the order of 1 minute. For the ILTM model, it is on the order of 10 minutes, and for the mixture model it is on the order of 2.4 hours (i.e., 146 minutes). Thus, the proposed model improves the runtime

by 1 order of magnitude compared to ILTM and by 2 orders of magnitude compared to the

mixture model. Note however, that, unlike the probabilistic models (proposed or mixture)

the ILTM model is a full network model that embeds an endogenous node model. This

added realism comes, of course, with an increase in the computational runtime.
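The orders of magnitude quoted above follow directly from Table 3.5, as the following short computation illustrates. The variable names are illustrative; the numbers are those of the table.

```python
import math

# Average runtime per iteration [min] for initial points 1 to 4 (Table 3.5).
runtime_min = {
    "Mixture": [145.0, 146.1, 144.4, 149.4],
    "ILTM": [10.6, 10.7, 10.7, 10.8],
    "Proposed": [1.3, 1.3, 1.3, 1.3],
}

mean_runtime = {m: sum(v) / len(v) for m, v in runtime_min.items()}
# Runtime ratio of each benchmark to the proposed model.
speedup = {m: mean_runtime[m] / mean_runtime["Proposed"] for m in ("ILTM", "Mixture")}
# Base-10 logarithm of the ratio, i.e., the gain in orders of magnitude.
orders = {m: math.log10(s) for m, s in speedup.items()}
```

The ratios are roughly 8 for ILTM and 112 for the mixture model, which round to 1 and 2 orders of magnitude, respectively.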

We now compare the performance of the derived signal plans. To evaluate the perfor-

mance of a given signal plan, we use a microscopic traffic simulation model of Lausanne

(Dumont and Bert; 2006), which is calibrated for the evening peak period demand. It is im-

plemented in the Aimsun software (TSS; 2014). For a given signal plan, we embed it within


the microscopic simulation software, and evaluate 50 simulation replications. Each repli-

cation consists of a warm-up period of 15 minutes followed by a simulation period of 30

minutes. For each simulation replication, we estimate the objective function (Eq. (3.37)),

which is the average (over lanes) proportion of time (over 30 minutes) a lane is full. For

each signal plan, we construct a cumulative distribution function (cdf) of these 50 objective

function observations.

Each plot of Figure 3-13 considers a different initial signal plan and plots five cdf

curves: one for the initial signal plan (dashed line), one for the solution derived by the

proposed model (solid line), one for the solution derived by the mixture model (dot-dashed

line) and two for the solutions derived by the deterministic ILTM model with different

objective functions (circle-dotted lines). The x-axis displays the objective function realiza-

tions (i.e., average (over all the lanes in the network) proportion of time a lane is full). The

y-axis displays the proportion of the 50 simulation replications that have objective function

realizations smaller than x. Thus, the more a cdf curve is shifted to the left, the better the

performance of the corresponding signal plan.
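The construction of the cdf curves and the "shifted to the left" comparison can be sketched as follows. The sketch uses the usual empirical cdf convention (share of observations less than or equal to x), and the pointwise comparison helper is an illustrative assumption rather than a test used in the thesis; all names and toy values are hypothetical.

```python
def empirical_cdf(observations):
    """Return F with F(x) = share of observations that are <= x."""
    data = sorted(observations)
    n = len(data)
    if n == 0:
        raise ValueError("need at least one observation")

    def cdf(x):
        return sum(1 for value in data if value <= x) / n

    return cdf


def lies_to_the_left(obs_a, obs_b):
    """True if the cdf of plan A is never below that of plan B."""
    cdf_a, cdf_b = empirical_cdf(obs_a), empirical_cdf(obs_b)
    return all(cdf_a(x) >= cdf_b(x) for x in list(obs_a) + list(obs_b))


# Hypothetical objective values from 4 replications (the case study uses 50).
plan_a = [0.020, 0.025, 0.030, 0.035]
plan_b = [0.030, 0.040, 0.045, 0.050]
F_a = empirical_cdf(plan_a)
a_better = lies_to_the_left(plan_a, plan_b)
```

Here plan A would be read as the better plan, since a larger share of its replications has a small objective value at every threshold x.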

For all 4 plots (Fig. 3-13(a)-3-13(d)), the cdf curves of the derived signal plans, from

the mixture model, from the proposed model and from the ILTM model, are to the left of

the initial signal plan. Thus, all models identify signal plans that outperform the initial

signal plans. For all 4 plots, the cdf curves of the signal plans derived by the stochastic models (i.e., the proposed and mixture models) are to the left of those derived by the ILTM model. Thus, the stochastic models identify signal plans that outperform the signal plans derived by the ILTM model (with both objective functions).

Figure 3-13: Cumulative distribution functions of the average, over all lanes in the network, proportion of time a lane is full considering different initial signal plans. Panels (a)-(d): initial points 1 to 4. x-axis: average proportion of time a lane is full; y-axis: cumulative distribution function F(x). Legend: Initial, Proposed, Mixture, ILTM-1, ILTM-2.

Figure 3-14 compares the average proportion of time a lane is full for the ILTM signal plans (with objective (3.40), i.e., ILTM-1) and the proposed plan. Each plot of Figure 3-14 considers a different initial point and each dot in the plot represents a lane in the network. The x-axis (resp. y-axis) displays the proportion of time the lane is full averaged over all 50 simulation replications with the ILTM (resp. proposed) signal plan. A reference line of y = x is displayed in each plot. Note that for both ILTM and the proposed model, even after optimization, there remain lanes with an average that is greater than 0.4, which means that there remain highly congested lanes. Approximately 20% of the lanes have improved performance under the proposed signal plan, while 10% have worse performance. The mean difference in performance per lane between the ILTM signal plans and the proposed plan for each plot is 0.006, 0.007, 0.004 and 0.009, respectively. Thus, on average, the proposed method yields signal plans that mitigate the occurrence of spillbacks. Note that this occurs even though the proposed model, unlike ILTM, is merely a link model that lacks a node model.

Figure 3-15 compares the performance of the signal plans in terms of the average (over

all lanes in the network) proportion of the lane that is occupied by vehicles, which is equivalent to the first objective function used by the ILTM optimization. This figure has a layout similar to that of Figure 3-13. The figure compares the cdf curves of the different signal plans. As

before, the more a cdf curve is shifted to the left, the better its performance (i.e., the higher

the proportion of simulation replications, out of the 50, that have low average proportion of

the lane that is occupied by vehicles). All four plots in Figure 3-15 indicate that all derived

signal plans outperform their corresponding initial signal plans. The derived signal plans

from both probabilistic models (mixture and proposed) have similar performance and they

outperform the signal plans derived by the deterministic ILTM model with both objective

functions.

Figure 3-16 compares the performance of the signal plans in terms of the average trip

travel time, which is equivalent to the second objective function used by the ILTM opti-

mization. For all initial points, all models yield signal plans that outperform the initial

points. The signal plans derived by the proposed model and by the mixture model have sim-

ilar performance and they outperform the signal plans derived by the deterministic ILTM

model with both objective functions.

We compare the performance of the proposed signal plans with that of a signal plan de-

rived by the widely used commercial signal control software Synchro (Trafficware; 2011).

The Synchro software is based on a deterministic macroscopic traffic model, it does not

solve the same optimization problem (3.37)-(3.39). For details on how the Synchro signal

plan is derived, we refer the reader to Section 5.3 of Osorio and Chong (2015). Figures 3-

17, 3-18 and 3-19 display, respectively, the three performance metrics of interest: average

proportion of time a lane is full, average proportion of the lane that is occupied by vehicles

115

00.1

0.20.3

0.40.5

0.60.7

0.8

Average proportion of tim

e lane is full for ILTM

signal plan

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8Average proportion of time lane is full

for proposed signal plan

(a)Initialpoint1

00.1

0.20.3

0.40.5

0.60.7

0.8



signal plan

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

Average proportion of time lane is fullfor proposed signal plan

(b)Initialpoint2

00.1

0.20.3

0.40.5

0.60.7

0.8



signal plan

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8


(c)Initialpoint3

00.1

0.20.3

0.40.5

0.60.7

0.8



signal plan

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8


(d)Initialpoint4

Figure3-14:

Average

proportionoftim

elane

isfullforILT

Msignalplans

andproposed

signalplans

116

Figure 3-15: Cumulative distribution functions of the average proportion of the lane that is occupied by vehicles considering different initial signal plans. Panels: (a) Initial point 1, (b) Initial point 2, (c) Initial point 3, (d) Initial point 4. Axes: x, average proportion of the lane that is occupied by vehicles; cumulative distribution function F(x). Legend: Initial, Proposed, Mixture, ILTM-1, ILTM-2.

Figure 3-16: Cumulative distribution functions of the average trip travel time considering different initial signal plans. Panels: (a) Initial point 1, (b) Initial point 2, (c) Initial point 3, (d) Initial point 4. Axes: average trip travel time [min]; cumulative distribution function F(x). Legend: Initial, Proposed, Mixture, ILTM-1, ILTM-2.

and average trip travel time normalized by free-flow travel time. Each figure contains 9

cdf curves: four black dashed lines for the four initial points, four solid thin lines for the

four solutions derived by the proposed model and one solid thick line for the signal plan

proposed by Synchro. For all figures, the left-most curves are the ones corresponding to

the signal plans derived by the proposed model. In other words, for all three performance

metrics, the signal plans derived by the proposed model outperform both the initial signal

plans and the signal plan derived by Synchro. For each figure, the performance of the four

initial points varies significantly, while that of the proposed signal plans is similar. In other

words, the proposed method is robust to the quality of the initial points.

In summary, compared to the mixture model, the proposed model improves the average

runtime by 2 orders of magnitude, and yields signal plans with either improved or similar

performance. Compared to the deterministic ILTM model, both the proposed and the

mixture model are able to find signal plans with improved performance. This case study illustrates

the scalability and computational efficiency of the proposed model. The proposed

model is suitable for large-scale network analysis and optimization.

3.6 Conclusions

This chapter formulates an analytical probabilistic stochastic model that is scalable and

suitable for large-scale network optimization. The main idea of the proposed model is to

describe the link’s boundary conditions with only two key probabilities instead of tracking

the full marginal, or full joint, distributions. More specifically, while the dimension of

the state space of the models of Osorio and Flötteröd (2015) and of Lu and Osorio (2018)

(cf. Chapter 2) scales cubically and linearly, respectively, with the link’s space capacity,

the proposed model has a constant dimension of 2. Thus, it scales independently of the

link attributes such as the link’s space capacity. This makes it suitable for large-scale

network analysis and optimization. The model is validated versus stochastic simulation

results from a simulation-based implementation of a stochastic link transmission model.

The model’s accuracy is comparable to that of Osorio and Flötteröd (2015) and of Lu and

Osorio (2018) (cf. Chapter 2), while being more computationally efficient. The model is

Figure 3-17: Cumulative distribution functions of the average proportion of time a lane is full. Legend: Synchro signal plan, Initial signal plan, Proposed signal plan.

Figure 3-18: Cumulative distribution functions of the average proportion of the lane that is occupied by vehicles. Legend: Synchro signal plan, Initial signal plan, Proposed signal plan.

Figure 3-19: Cumulative distribution functions of the average trip travel time. Legend: Synchro signal plan, Initial signal plan, Proposed signal plan.

also validated versus a microscopic traffic simulator, and it can accurately approximate the

link’s boundary conditions for realistic traffic situations such as platoon-like arrival. The

proposed model is then used to address a signal control problem for the city of Lausanne

(Switzerland). The derived solutions are benchmarked against those derived by the mixture

model of Lu and Osorio (2018) (cf. Chapter 2) and the ILTM model of Himpe et al. (2016).

The derived signal plans from both the proposed model and the mixture model have similar

performance, considering various performance metrics. They both outperform the initial

plans, the signal plans derived by the ILTM model and a signal plan proposed by a widely used

commercial software. Compared to the model of Lu and Osorio (2018) (cf. Chapter 2), the

proposed model reduces computational runtime by 2 orders of magnitude.

Future work focuses on the formulation of scalable stochastic network models; the goal

is to be able to recover the joint distribution of a path or a network. First, there is a need

to formulate scalable probabilistic node models that are consistent with their deterministic

counterparts. Osorio et al. (2011) includes a two-link probabilistic node model that pro-

vides the dependencies of the links’ boundary conditions across a node. It yields the joint

distribution of the downstream boundary conditions of the upstream link and the upstream

boundary conditions of the downstream link. The extension of this formulation to nodes

with multiple incoming and outgoing links is part of ongoing work. Second, scalable and

computationally efficient network model formulations are required. For a network of

n links, directly coupling the proposed link model with the node model of Osorio et al.

(2011) would yield a complexity of O(2^n), which is not scalable. Possible techniques

to achieve scalability include network decomposition (Flötteröd and Osorio; 2017) and

aggregation-disaggregation (Osorio and Yamani; 2017; Osorio and Wang; 2017).


Chapter 4

Adaptive Partitioning Strategy for High-Dimensional Discrete Simulation-based Optimization Problems

In this chapter, we introduce a technique to enhance the computational efficiency of SO

algorithms for high-dimensional discrete SO problems. The technique is based on an inno-

vative adaptive partitioning strategy. It is integrated with the Empirical Stochastic Branch-

and-Bound (ESB&B) framework proposed by Xu and Nelson (2013). This combination

leads to a general-purpose discrete SO algorithm that is globally convergent and has

good small-sample (finite-time) performance.

4.1 Introduction

Simulation-based optimization (SO), also referred to as optimization via simulation (OvS),

is the optimization of the performance of a stochastic system, where the objective function

and/or constraints can only be estimated through stochastic simulations. Detailed

overviews of the SO literature are provided by Fu (2002); Amaran et al. (2016); Bhosekar

and Ierapetritou (2018). Based on the feasible region structures, Hong and Nelson (2009)

divide SO problems into three categories: continuous SO, ranking and selection, and discrete

SO. For a continuous SO problem, the decision variables are continuous, and methods

developed to tackle such problems include stochastic approximation methods (e.g., Rob-

bins and Monro (1951); Bhatnagar et al. (2011)), direct search methods (e.g., Andradóttir

(2006)), and surrogate-based methods (e.g., Angün et al. (2009); Regis and Shoemaker (2013);

Wang and Ierapetritou (2018)). For ranking and selection problems, all feasible solutions

can be simulated at least once and the (near) best solution is chosen with a given confi-

dence interval (e.g., Kim and Nelson (2006)). A discrete SO problem is one with discrete

decision variables and the number of feasible solutions is usually too large for each one

to be simulated. The existing algorithms mainly focus on global convergence to the optimal

solution(s) asymptotically (e.g., Xu and Nelson (2013); Tsai and Fu (2014)). Methods

that aim at identifying solutions with good performance within small sampling budgets

include the Heuristic Constrained Genetic Algorithm (HCGA) (Tsai and Fu; 2014) and Industrial

Strength COMPASS (ISC) (Xu et al.; 2010). Nevertheless, developing discrete SO

algorithms that can efficiently tackle high-dimensional problems remains a challenge.

Xu and Nelson (2013) propose an Empirical Stochastic Branch-and-Bound (ESB&B)

framework for discrete SO problems based on the nested partitions method of Shi and Ólaf-

sson (2000) and stochastic branch-and-bound (Norkin et al.; 1998). It takes advantage of

the partitioning structure of stochastic branch-and-bound method and empirically estimates

the bounds based on sampled solutions. The ESB&B algorithm also uses improvement

bounds to represent the potential of each subregion to guide the sampling strategy in the

next iteration. The ESB&B algorithm is globally convergent, i.e., given infinite simulation

budget, it converges asymptotically to the global optimum. As mentioned in Xu and Nelson

(2013), there are many valid partitioning strategies; however, a good partitioning strategy

can usually improve the algorithm efficiency. Most partitioning strategies developed in the

literature are generic and heuristic, for example, dividing the feasible region equally into k

subregions along a randomly chosen dimension (Xu and Nelson; 2013).

In this chapter, we propose an innovative adaptive partitioning strategy. It is embedded

within the ESB&B framework (Xu and Nelson; 2013) and forms a globally convergent al-

gorithm for discrete SO problems. The proposed partitioning strategy iteratively divides

the feasible region such that previously sampled solutions with similar performance

are located in the same subregion. It is an adaptive sample-based partitioning

strategy that enhances the small-sample (finite-time) performance of the ESB&B framework

of Xu and Nelson (2013). The proposed partitioning strategy can exploit problem-specific

structures known a priori, such as the clustering effect in car-sharing fleet assignment problems,

to further enhance the algorithm's efficiency without significant modifications.

This chapter is organized as follows. Section 4.2 reviews the ESB&B algorithm by Xu

and Nelson (2013). Section 4.3 presents the proposed adaptive partitioning strategy and the

solution method developed by Dunn (2018). Section 4.4 validates the proposed partitioning

strategy on one synthetic and one real-world discrete SO problem. Section 4.5 concludes

this chapter.

4.2 ESB&B framework

Let us first define the discrete SO problem. The goal is to find x that solves

\max_{x \in \mathcal{X}} \; E[Y(x)] \qquad (4.1)

where x = [x1, ..., xp] are the discrete decision variables, and X is a convex feasible region

that contains a finite but large number of feasible solutions, which can be represented by

constraints of the form:

l_i \le x_i \le u_i, \quad i = 1, \ldots, p \qquad (4.2)

g_j(x) \le 0, \quad j = 1, \ldots, q \qquad (4.3)

l_i, x_i, u_i \in \mathbb{Z}, \quad i = 1, \ldots, p. \qquad (4.4)

The objective function E[Y(x)] is the expected performance at point x, which can only be

estimated by generating observations of Y(x) via simulation.

To solve the discrete SO problems of the above format, Xu and Nelson (2013) propose

the Empirical Stochastic Branch-and-Bound (ESB&B) framework, which converges to the


globally optimal solution(s) asymptotically (i.e., under unlimited sampling budgets and

simulation efforts). The ESB&B algorithm is detailed in Algorithm 5. The algorithm ter-

minates when the total sampling budget is used up. Whenever the algorithm is terminated,

the final solution is the one with the maximum cumulative sample average.
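The selection rule above can be illustrated with a minimal Python sketch. The class name `SampleStats` and its methods are hypothetical helpers introduced here for illustration; they are not part of the ESB&B implementation of Xu and Nelson (2013). The sketch keeps a running sum and count of simulation observations per solution and returns the solution with the maximum cumulative sample average.

```python
from collections import defaultdict


class SampleStats:
    """Running sample averages per solution (illustrative helper, not from the source)."""

    def __init__(self):
        self.count = defaultdict(int)    # number of observations n(x)
        self.total = defaultdict(float)  # sum of observations of Y(x)

    def add(self, x, y):
        """Record one simulation observation y of the solution x."""
        key = tuple(x)
        self.count[key] += 1
        self.total[key] += y

    def mean(self, x):
        """Cumulative sample average of all observations at x."""
        key = tuple(x)
        return self.total[key] / self.count[key]

    def best(self):
        """Solution with the maximum cumulative sample average."""
        return max(self.count, key=lambda k: self.total[k] / self.count[k])


stats = SampleStats()
stats.add((1, 2), 3.0)
stats.add((1, 2), 5.0)
stats.add((0, 0), 4.5)
final_solution = stats.best()
```

Because the averages are cumulative, a solution that is sampled again in a later iteration simply contributes additional observations to the same running mean.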

There are two important steps within each iteration: (i) sampling and bounding (Step

2) and (ii) partitioning (Step 1). For details regarding the sampling and bounding step, we

refer to Section 3 of Xu and Nelson (2013). We discuss the partitioning step in detail.

Partitioning divides the estimated best subregion into a set of smaller subregions that are

disjoint and nonempty. In the ESB&B implementation of Xu and Nelson (2013),

the embedded generic partitioning strategy chooses (either deterministically or randomly)

a variable xi and divides the best subregion along this dimension into ω disjoint

subregions. A detailed description of this partitioning strategy is given by the

online Appendix A of Xu and Nelson (2013).
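As a rough illustration of such a generic split, the Python sketch below divides an integer box along one dimension into at most ω parts of nearly equal size. It is a sketch under stated assumptions: the function name `split_equally` is hypothetical, the subregion is assumed to be described only by the bounds (4.2), the constraints g_j(x) ≤ 0 are ignored, and the exact procedure in Appendix A of Xu and Nelson (2013) may differ.

```python
def split_equally(lower, upper, dim, omega):
    """Split the integer box [lower, upper] along `dim` into at most `omega` disjoint boxes.

    Assumption: the subregion is a plain box; general constraints are not handled.
    """
    lo, hi = lower[dim], upper[dim]
    n_values = hi - lo + 1            # number of integer values along the chosen dimension
    parts = min(omega, n_values)      # cannot create more parts than there are values
    base, extra = divmod(n_values, parts)
    boxes, start = [], lo
    for j in range(parts):
        width = base + (1 if j < extra else 0)  # spread the remainder over the first parts
        end = start + width - 1
        new_lower, new_upper = list(lower), list(upper)
        new_lower[dim], new_upper[dim] = start, end
        boxes.append((tuple(new_lower), tuple(new_upper)))
        start = end + 1
    return boxes


boxes = split_equally((0, 0), (9, 4), dim=0, omega=3)
```

Note that the split depends only on the bounds of the subregion and on ω; none of the previously sampled solutions is used.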

However, there exist many valid partitions for a given subregion. A good partitioning

strategy can help locate the most promising subregion more efficiently, and hence

allocate the sampling budget more efficiently. Let us take the following illustrative

example to discuss the pros and cons of the current generic partitioning strategy used

in ESB&B and potential directions of improvement.

Illustrative example:

Consider the maximization of a 2-dimensional deterministic function f(x1, x2), for

which neither the closed-form nor any structural information about the function is

known. The ground truth of f(x1, x2) in the current best subregion is shown in Fig-

ure 4-1. However, the values of the function can only be known through sampling.

Figure 4-2 shows samples that have already been evaluated in the current best subre-

gion.

We need to further partition this region and sample from each new subregion. Figure

4-3(a) shows the generic partitioning strategy used in the ESB&B framework, which

divides this region into equal parts along the chosen x1 dimension. This partitioning

strategy requires the users to predefine the number of subregions to be divided into

Algorithm 5 ESB&B framework

Step 0. Initialization: set iteration counter k = 0, initial partition P_0 = {X}, best subregion R_0 = X

Step 1. Partitioning:

If the best subregion R_k is a singleton:

(a) set P'_k = P_k

else:

(a) construct a partition P(R_k) of the best subregion

(b) define the new full partition by P'_k = (P_k \ {R_k}) ∪ P(R_k)

(c) denote by X^P the elements of P'_k

Step 2. Sampling and bounding:

2.1 Solution sampling:

i. for each subregion X^P ∈ P(R_k), randomly sample ν_R solutions

ii. if k > 0, for each subregion X^P ∈ P_k \ {R_k}, sample θ(X^P) solutions, where θ(X^P) is computed in Step 2.3 of iteration k − 1 based on information in Φ_{k−1}

iii. aggregate all of the sampled solutions into set S_k, and set Φ_k = Φ_{k−1} ∪ S_k

2.2 Bound estimation:

i. for each x ∈ S_k, simulate ∆n_F replications if x ∉ Φ_{k−1}, and simulate ∆n_A additional replications if x ∈ Φ_{k−1}

ii. for each X^P ∈ P'_k, calculate the estimated upper bound η_{k+1}(X^P)

2.3 Sample allocation: compute the number of solutions to be sampled, θ(X^P), for all X^P ∈ P'_k for iteration k + 1 based on information in Φ_k

Step 3. Updating partition and best subregion:

(a) update the best subregion R_{k+1} = arg max_{X^P ∈ P'_k} η_{k+1}(X^P)

(b) set the partition P_{k+1} = P'_k

(c) set k = k + 1, go to Step 1.

(denoted ω), and it does not utilize any information from the solutions that have

already been sampled. After partitioning, each new subregion gets an equal amount

of sampling budget. Based on the performances of the sampled solutions, a new best

subregion will be selected and further explored. In this case, the middle and right

subregions both have the chance to be selected as the next best subregion as they

both contain peaks of the underlying function. If the right subregion is chosen as

the next best subregion, it can take a while for the algorithm to return to explore the

middle subregion, where the true global maximum is located.

On the other hand, there is another potential partition of the current best subregion

given in Figure 4-3(b). Note that this partition divides the subregion into three parts:

two that contain the underlying function’s basins and one that contains all the peaks.

This partition is better in the sense that it successfully identifies the patterns of the

underlying function. It is obtained by grouping sampled solutions with similar per-

formances in the same subregion.

In Section 4.3, we propose an adaptive partitioning strategy which is defined by the

previously sampled solutions in the subregion.

4.3 Adaptive partitioning strategy

The main idea underlying the proposed partitioning strategy is to find a partition of the

current best subregion Rk in which the sampled solutions with similar performance are

grouped in the same subregion. In this section, we introduce two types of adaptive partitioning

strategies: (i) parallel partition in Section 4.3.1, which applies to any problem, and (ii)

hyperplane partition in Section 4.3.2, which applies when splitting features in the form

of linear combinations of decision variables can be obtained from prior knowledge (e.g.,

the clustering effect in the car-sharing fleet assignment problem).

Figure 4-1: The ground truth values of f(x1, x2) in the current best subregion. Axes: x1 and x2, each ranging from -3 to 3.

Figure 4-2: The sampled solutions of f(x1, x2) in the current best subregion. Axes: x1 and x2, each ranging from -3 to 3.

Figure 4-3: Different partitions of the current best subregion. Panels: (a) A naive partition of the current best subregion. (b) A better partition of the current best subregion. Axes: x1 and x2, each ranging from -3 to 3.

4.3.1 Parallel partition

A parallel partition is one that only contains cuts of the form a^T x < b or a^T x ≥ b, where

a_i ∈ {0, 1} and \sum_{i=1}^{p} a_i = 1. The subregions resulting from a parallel partition take the

form of the intersection of a hyperbox with the best subregion. This family of partitions is

favorable in the sense that it maps directly to the bounds of each decision variable and hence

is easy to interpret and to draw feasible uniform random samples from. A parallel partitioning

strategy can be applied to any discrete SO problem; thus, it is a generic partitioning strategy.

We formulate the search for such a parallel partition as a minimization problem:

\min_{P(R_k)} \; \sum_{X^P \in P(R_k)} \; \sum_{x \in X^P \cap \Phi_k} \left( \bar{Y}(x) - \bar{\bar{Y}} \right)^2 \qquad (4.5)

\text{s.t.} \quad |X^P \cap \Phi_k| \ge N_{\min}, \quad \forall\, X^P \in P(R_k), \qquad (4.6)

|P(R_k)| \le d, \qquad (4.7)

where P(Rk) is a valid parallel partition of the current best subregion Rk, and \bar{Y}(x) is the cumulative sample mean of all n(x) observations at solution x:

\bar{Y}(x) = \frac{1}{n(x)} \sum_{s=1}^{n(x)} Y_s(x). \qquad (4.8)

\bar{\bar{Y}} is the average of the estimated mean performance over all sampled points in subregion X^P:

\bar{\bar{Y}} = \frac{1}{|X^P \cap \Phi_k|} \sum_{x \in X^P \cap \Phi_k} \bar{Y}(x). \qquad (4.9)

Therefore, the objective function is the sum, over all subregions, of the squared deviations of the sample means from the subregion mean. Constraint (4.6) states that each subregion of the partition must contain at least Nmin already sampled solutions; otherwise, problem (4.5) is minimized by a partition in which each subregion contains exactly one sample, and the objective function value is zero. An alternative is to limit the number of subregions, denoted d (Constraint (4.7)). The partition derived by solving problem (4.5) with constraints on the partition structure (e.g., Nmin and/or d), denoted P∗(Rk), is one that optimally groups sampled points with similar performance. Next, we discuss the solution algorithm for the proposed minimization problem (4.5)-(4.7).
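To make the objective (4.5) and the constraints (4.6)-(4.7) concrete, the following Python sketch evaluates a given candidate partition. It is only an illustration: the function names `partition_sse` and `is_feasible` are hypothetical, and each subregion is assumed to be represented simply by the list of cumulative sample means of the sampled solutions it contains.

```python
def partition_sse(groups):
    """Objective (4.5): within-subregion sum of squared deviations.

    `groups` holds one list per subregion; each list contains the cumulative
    sample means of the sampled solutions falling in that subregion.
    """
    total = 0.0
    for means in groups:
        centre = sum(means) / len(means)              # subregion mean, as in (4.9)
        total += sum((m - centre) ** 2 for m in means)
    return total


def is_feasible(groups, n_min=1, d=None):
    """Constraints (4.6) and (4.7): minimum group size and maximum number of subregions."""
    big_enough = all(len(means) >= n_min for means in groups)
    few_enough = d is None or len(groups) <= d
    return big_enough and few_enough


grouped = partition_sse([[1.0, 3.0], [10.0]])   # similar values grouped together
ungrouped = partition_sse([[1.0, 3.0, 10.0]])   # everything in a single subregion
```

In this toy example the two-subregion partition has a much smaller objective value than the single-subregion one, but it violates constraint (4.6) as soon as Nmin = 2, since one subregion holds a single sample.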

The proposed partition problem (4.5)-(4.7) is similar to the underlying optimization

problem of the decision tree model (e.g., Breiman et al. (1984); Dunn (2018)) with deci-

sion variables as input variables (features) and performances as dependent variables. The

decision tree model is mainly used for prediction purposes. The underlying optimization

problem of training a decision tree is to find both a tree-structured split of the training

samples based on input variables and labels for classification (or constant values for re-

gression) of leaf nodes so that the prediction error is minimized, given some constraints on

the tree structures (e.g., minimum leaf size and/or maximum tree depth) and restrictions

against overfitting. The decision tree method CART by Breiman et al. (1984) is a top-down

greedy algorithm that does not guarantee a globally optimal decision tree. A recent work

of Dunn (2018) proposes an algorithm that solves the decision tree problem at least locally

optimally and, by starting from multiple initial points, aims for global optimality.

Dunn (2018) formulates the decision tree problem as a mixed-integer programming problem,

which can be written as follows:

\min_{T} \; R(T) + \alpha |T| \qquad (4.10)

\text{s.t.} \quad N(\ell) \ge N_{\min}, \quad \forall\, \ell \in \mathrm{leaves}(T), \qquad (4.11)

\mathrm{depth}(T) \le D, \qquad (4.12)

where R(T ) represents the prediction error of the tree T made on the training data. |T |

denotes the number of branch nodes in T . It is a representation of the tree complexity.

N(ℓ) represents the number of samples in each leaf node ℓ of the tree T , and hence

constraint (4.11) restricts each leaf node to contain at least Nmin samples. One can also

restrict the maximum depth of the tree (Constraint (4.12)). It sets an upper bound on the

number of leaf nodes (i.e., 2^D). In other words, the goal is to find a decision tree model

that balances the prediction accuracy and the model complexity. The optimal decision tree

method is shown to be robust to noisy training data (noise in both features and labels).

Note that a given tree T creates a unique partition on the feature space and hence a


unique division of the samples. For regression tasks, the constant prediction value of a leaf

node that minimizes the mean squared error is the mean performance of the samples in that

leaf node. Thus, by setting α = 0 and choosing the mean squared error as the error metric

(i.e., R(·)), we retrieve an optimization problem that is equivalent to (4.5). In other words,

the proposed partitioning problem (4.5) is equivalent to the optimal parallel regression

tree optimization (Dunn; 2018, Chapter 4) with complexity penalty coefficient α = 0.

The optimization problem (4.10)-(4.12) can be solved efficiently with the local search

algorithm of Dunn (2018), coupled with multiple starting points, for sample sizes up to hundreds of thousands.

This is more than enough for discrete SO problems, in which samples are usually more

time-consuming to simulate and hence limited in size.
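The link between regression trees and parallel partitions can be sketched in a few lines of Python. Note the swapped-in technique: the sketch below uses a greedy, top-down, CART-style split search, not the optimal tree local search of Dunn (2018), so it only approximates a solution of (4.5)-(4.7). All function names are hypothetical, and each sampled solution is assumed to be given as a tuple of decision variables with its cumulative sample mean.

```python
def sse(values):
    """Sum of squared deviations from the mean (zero for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(points, values, n_min):
    """Best single axis-parallel cut x[dim] < threshold, or None if no cut helps."""
    best, best_cost = None, sse(values)
    for dim in range(len(points[0])):
        # Candidate thresholds: every observed coordinate except the smallest.
        for threshold in sorted({p[dim] for p in points})[1:]:
            left = [v for p, v in zip(points, values) if p[dim] < threshold]
            right = [v for p, v in zip(points, values) if p[dim] >= threshold]
            if len(left) < n_min or len(right) < n_min:
                continue  # would violate the minimum leaf size
            cost = sse(left) + sse(right)
            if cost < best_cost - 1e-12:
                best, best_cost = (dim, threshold), cost
    return best


def greedy_partition(points, values, depth, n_min):
    """Greedy recursive splitting; returns a list of (points, values) leaves."""
    split = None
    if depth > 0 and len(points) >= 2 * n_min:
        split = best_split(points, values, n_min)
    if split is None:
        return [(points, values)]
    dim, threshold = split
    leaves = []
    for keep_left in (True, False):
        side = [(p, v) for p, v in zip(points, values)
                if (p[dim] < threshold) == keep_left]
        leaves += greedy_partition([p for p, _ in side], [v for _, v in side],
                                   depth - 1, n_min)
    return leaves


pts = [(0, 0), (1, 0), (2, 0), (3, 0)]
vals = [1.0, 1.1, 5.0, 5.2]
leaves = greedy_partition(pts, vals, depth=1, n_min=2)
```

Each leaf corresponds to one subregion of the parallel partition: the cuts along the path from the root to the leaf are bounds on individual decision variables. A greedy search of this kind can miss the best partition, which is the motivation for using an optimal tree method instead.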

4.3.2 Hyperplane partition

A hyperplane partition is one that contains cuts of the form a^T x < b or a^T x ≥ b, where

a_i ∈ R. The subregions resulting from a hyperplane partition are polyhedra. The sampling

algorithm currently used, MIX-D, can efficiently sample uniformly from such subregions.

Hence, although more complicated than the parallel partition, this family of partitions is also

practical, especially when problem-specific structures of this form are known a priori.

Assume that variables y_j = a_j^T x for j = 1, ..., q are known in advance to be potentially

important splitting features in addition to the decision variables xi for i = 1, ..., p. Let us take

the car-sharing fleet assignment problem as an example. The car-sharing service provider

wants to find a fleet assignment to stations across the network that maximizes their profit.

Geographically nearby stations often share demands among one another, since customers

are often willing to walk short distances for available vehicles if their target station runs out

of vehicles. Thus, the total number of vehicles in a cluster of nearby stations is potentially

a more important factor that influences the profit generated than the number of vehicles

in each individual station. Therefore, a partition based on the total number of vehicles in

clusters of nearby stations can more effectively divide the feasible region and lead to a more

effective search for subregions with higher profits. This can further improve

finite-time performance and algorithm efficiency. In this example, yj is the total number

of vehicles assigned to the cluster of nearby stations Cj, i.e., y_j = \sum_{i \in C_j} x_i, where xi is the

number of vehicles assigned to station i.

The modification in order to incorporate such a hyperplane cut is minor. We simply treat

yj as candidates that the partitioning algorithm can split on together with all the decision

variables xi. The proposed partitioning strategy, obtained by solving the optimization problem (4.5)-(4.7),

is robust to correlated splitting features such as yj and xi, since it naturally includes a

splitting feature selection process. One other benefit the proposed adaptive partitioning

strategy brings is that each subregion constructed contains at least one previously sampled

solution, which can be used directly to initiate the MIX-D sampling scheme.
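The treatment of yj as additional splitting candidates can be sketched as a simple feature augmentation. In the Python sketch below, the function name `augment_with_cluster_totals` and the example clusters are hypothetical; the only assumption taken from the text is that each yj is the sum of the decision variables xi over a cluster Cj of nearby stations.

```python
def augment_with_cluster_totals(x, clusters):
    """Append y_j = sum of x_i over cluster C_j to the decision vector x.

    `clusters` is a list of index collections, one per cluster C_j
    (illustrative input, assumed to come from prior geographic knowledge).
    """
    totals = tuple(sum(x[i] for i in cluster) for cluster in clusters)
    return tuple(x) + totals


# Four stations grouped into two clusters: stations {0, 1} and stations {2, 3}.
features = augment_with_cluster_totals((2, 0, 3, 1), [(0, 1), (2, 3)])
```

An axis-parallel cut on one of the appended coordinates, such as y1 < b, is a hyperplane cut in the original decision space, since it constrains the sum of several xi at once.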

4.3.3 Adaptive partitioning ESB&B algorithm

The proposed adaptive partitioning ESB&B algorithm is given by Algorithm 6. The parameters

related to the partitioning strategy that need to be specified are the maximum tree

depth D and/or the minimum number of points each subregion contains, Nmin; both are related

to the number of subregions the current best subregion can be divided into. The algorithm

optimally chooses the number of subregions to divide the region into, given the predetermined

parameters D and/or Nmin. Unlike other generic partitioning strategies, the

partition derived from the proposed adaptive partitioning strategy ensures that each new

subregion contains at least one sampled point, or at least Nmin if this parameter is specified.

This benefits the MIX-D sampling scheme in the sense that these interior sampled points

can be used directly to initiate the random walk. The algorithm usually terminates

when the simulation budget is exhausted. We select the final solution x∗ as the one with the

maximum cumulative sample average.

4.4 Numerical examples

In this section, we consider two sets of numerical examples to illustrate the proposed adaptive

partitioning ESB&B algorithm. The first example, the Griewank function, has many local

minima, which makes it a challenging test function. This low-dimensional example illustrates

how the proposed method can escape tricky local minima by adaptively partitioning

Algorithm 6 Adaptive partitioning ESB&B algorithm

Step 0. Initialization:

(a) set iteration counter k = 0, initial partition P_0 = {X}, best subregion R_0 = X

(b) sample uniformly at random the initial n_0 solutions in the feasible region X, simulate ∆n_F replications of each sample, and record them in set Φ_0

(c) set training set Ψ_0 = Φ_0

Step 1. Partitioning:

If the best subregion R_k is a singleton:

(a) set P'_k = P_k

else:

(a) construct a partition P(R_k) of the best subregion using the proposed adaptive partitioning strategy with training set Ψ_k, predetermined parameters D and/or Nmin, and/or splitting features y_j other than the decision variables x_i

(b) define the new full partition by P'_k = (P_k \ {R_k}) ∪ P(R_k)

(c) denote by X^P the elements of P'_k

Step 2. Sampling and bounding:

2.1 Solution sampling:

i. for each subregion X^P ∈ P(R_k), randomly sample ν_R solutions

ii. if k > 0, for each subregion X^P ∈ P_k \ {R_k}, sample θ(X^P) solutions, where θ(X^P) is computed in Step 2.3 of iteration k − 1 based on information in Φ_{k−1}

iii. aggregate all of the sampled solutions into set S_k, and set Φ_k = Φ_{k−1} ∪ S_k

2.2 Bound estimation:

i. for each x ∈ S_k, simulate ∆n_F replications if x ∉ Φ_{k−1}, and simulate ∆n_A additional replications if x ∈ Φ_{k−1}

ii. for all X^P ∈ P'_k, calculate estimates η_{k+1}(X^P)

2.3 Sample allocation: compute the number of solutions to be sampled, θ(X^P), for all X^P ∈ P'_k for iteration k + 1 based on information in Φ_k

Step 3. Updating partition and best subregion:

(a) update the best subregion R_{k+1} = arg max_{X^P ∈ P'_k} η_{k+1}(X^P)

(b) set the partition P_{k+1} = P'_k

(c) set the training set Ψ_{k+1} = {x : x ∈ R_{k+1} ∩ Φ_k}

(d) set k = k + 1, go to Step 1.

the feasible region with problem structure inferred from previously sampled points. The

second example is a real-world car-sharing case study from Zhou et al. (2019), for which the

optimal solutions are not known; it is used to explore the performance of the proposed algorithm

on high-dimensional discrete SO problems. In this example, we also demonstrate

the use of hyperplane partition resulting from additional splitting variables yj, which are

derived from prior knowledge on the geographical locations of the stations. For both ex-

amples, we benchmark the proposed algorithm against the original ESB&B algorithm with

the generic partitioning strategy.

4.4.1 The Griewank function

We consider the Griewank function, which is commonly used as a test case for optimization

algorithms (L. Salemi et al.; 2019). This function has a single global minimum and many

local minima, which makes it a challenging test for optimization algorithms. Figure 4-4

displays the contour plot of the two-dimensional Griewank function on domain [−5, 5] ×

[−5, 5]. The global minimum of the function is at the origin (0, 0) with response value 0

and there are four local minima near the four corners of the domain with response values

0.0086.

We first consider a minimization of the Griewank function with feasible region [−5, 5]×

[−5, 5], where the globally optimal solution (0, 0) is at the center of the feasible region. To

make it a discrete SO problem, the feasible region is divided into a 101 × 101 lattice,

and the function value at each solution is given by the Griewank function plus a normally

distributed noise with mean zero and variance σ2. In this numerical example, we take σ =

0.01, which is comparable to the difference between the local minima (0.0086) and the global

minimum (0). The number of replications for each not previously encountered sample is set at ∆nF = 10, and

that for each previously encountered sample at ∆nA = 2. For both algorithms, we initiate with the same uniformly

randomly sampled pool of size 10 (asterisks in Figure 4-4). At each iteration, the total

sampling budget for subregions other than the current best (denoted νO) is 5, and that

for the current best subregion (denoted |P(Rk)|νR) is 10; the budget limit is 40 iterations.

Thus, at the end of each run, roughly 5% of all feasible solutions are sampled and evaluated.
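The test problem described above can be sketched in Python as follows. This is a sketch under stated assumptions: the Griewank function is taken in its standard form, f(x) = 1 + Σ x_i²/4000 − Π cos(x_i/√i), which is not written out in the text; the lattice spacing of 0.1 follows from placing 101 points per dimension on [−5, 5]; and the function names are hypothetical.

```python
import math
import random


def griewank(x):
    """Standard Griewank function (assumed form; global minimum 0 at the origin)."""
    quadratic = sum(xi * xi for xi in x) / 4000.0
    product = 1.0
    for i, xi in enumerate(x, start=1):
        product *= math.cos(xi / math.sqrt(i))
    return 1.0 + quadratic - product


def lattice_point(i, j, lower=-5.0, step=0.1):
    """Map lattice indices i, j in {0, ..., 100} to a point of [-5, 5] x [-5, 5]."""
    return (lower + step * i, lower + step * j)


def simulate(x, sigma=0.01, rng=random):
    """One noisy observation: Griewank value plus zero-mean normal noise."""
    return griewank(x) + rng.gauss(0.0, sigma)


rng = random.Random(0)
centre = lattice_point(50, 50)
replications = [simulate(centre, rng=rng) for _ in range(10)]  # 10 replications, as for a new sample
sample_mean = sum(replications) / len(replications)
```

With σ = 0.01, a single observation cannot reliably distinguish the global minimum from the nearby local minima, which is why several replications are averaged for each sampled solution.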

Figure 4-4: The contour plot of the two-dimensional Griewank function on [−5, 5] × [−5, 5]. Axes: x1 and x2. The global minimum is marked.

For the generic partitioning scheme used in the original ESB&B algorithm, the current best

subregion Rk is divided into ω = 2 subregions equally along the longest dimension of Rk.

For the proposed adaptive partitioning scheme, the maximum tree depth D is set at 2, i.e.,

Rk can be divided into at most 4 subregions, and the minimum number of sampled points

grouped in one subregion is set at 2, i.e., Nmin = 2. We run each algorithm 50 times.

Figure 4-5 shows the current best estimate of the objective value across iterations of

five randomly selected runs of each algorithm (i.e., five sample paths of each algorithm).

The solid black lines display the results for the proposed algorithm, and the dashed red

lines for the original ESB&B algorithm. As the iteration advances, the current best esti-

mate of the objective value has a general decreasing trend for both algorithms, although

there are temporary increases due to stochasticity. Figure 4-6 (resp. Figure 4-7) plots a

zoomed version of the current best estimate of the objective value with 95% confidence

intervals, so as to clearly show the performance of the ESB&B algorithm (resp. proposed

algorithm) close to the global minimum function value at zero (solid blue line). Note that

at each iteration, the current best solution is simulated for at least 10 replications, since each

newly encountered sample is simulated for ∆nF = 10 replications and, if it is sampled

again, for an additional ∆nA = 2 replications. Only one run of the

ESB&B algorithm ends with an estimated objective value that is not statistically different,

at the 95% confidence level, from the true global minimum value 0. By contrast,

four runs of the proposed algorithm end with estimated objective values that are

statistically indistinguishable from the true global minimum after iteration 15 at the 95%

confidence level.
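The comparison against the true minimum value can be sketched as a confidence-interval check on the replications of the current best solution. This is a hedged illustration: the helper names are hypothetical, and a normal quantile is used for simplicity, whereas with only about 10 replications a Student-t quantile (about 2.262 for 9 degrees of freedom) would give a somewhat wider interval.

```python
from statistics import NormalDist, mean, stdev

def confidence_interval(samples, level=0.95):
    """Two-sided interval for the mean, using a normal quantile (approximation)."""
    n = len(samples)
    center = mean(samples)
    half_width = NormalDist().inv_cdf(0.5 + level / 2.0) * stdev(samples) / n ** 0.5
    return center - half_width, center + half_width

def indistinguishable_from(samples, target=0.0, level=0.95):
    """True if `target` lies inside the confidence interval of the sample mean."""
    lower, upper = confidence_interval(samples, level)
    return lower <= target <= upper

# Hypothetical replications of a solution whose true mean is 0 ...
near_global = [0.004, -0.003, 0.001, 0.006, -0.002, 0.000, 0.003, -0.004, 0.002, 0.001]
# ... and of a solution at a local minimum with a slightly higher mean.
near_local = [v + 0.0074 for v in near_global]
```

In this toy data, the interval of the first sample covers zero while that of the second does not, which mirrors the distinction drawn between runs that end at the global minimum and runs trapped at a local minimum.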

Figure 4-8 shows the distance between the current best solution and the global minimum

solution across iterations. The solid black lines display the results for the proposed algorithm

and the dashed red lines those for the ESB&B algorithm. Four runs of the proposed algorithm

end up with solutions close to the true global minimum at (0, 0), whereas only one run of

the ESB&B algorithm ends up close to (0, 0). Based on these five experiment runs, the

proposed method tends to find solutions that are closer to the global minimum.

[Figure 4-5 plot: horizontal axis "Iteration" (5 to 40); vertical axis "Current estimate of objective value" (-0.05 to 0.2); legend: ESBB, Proposed; reference line "Global minimum objective = 0".]

Figure 4-5: Objective function estimate of the current iterate across iterations.

Figures 4-9 and 4-11 show the experiment results for the first run of the original ESB&B

algorithm, which is a typical run of ESB&B getting trapped at a local minimum solution. In

both figures, the final partition of the feasible region and the contour plot of the Griewank

[Figure 4-6 plot: horizontal axis "Iteration" (5 to 40); vertical axis "Current estimate of objective value" (-0.01 to 0.04); legend: ESBB algorithm, 95% confidence interval, True global objective value.]

Figure 4-6: Objective function estimate of the current iterate with 95% confidence interval across iterations of ESBB algorithm (zoomed-in results).

function are displayed. Figure 4-9 plots, with blue arrows, the path of the current best solution

across iterations in the feasible domain, and Figure

4-10 displays the zoomed-in results. The asterisks mark the initial sampled set

Φ0. The generic partitioning strategy used in the ESB&B algorithm missed the subregion

that contains the global minimum in the first iteration and left it near the boundary of a

subregion. The algorithm gets trapped at the lower-right local minimum. Figure 4-11

displays the sampling budget allocation in the feasible domain at the end of iteration 40,

in which each black dot represents a sampled solution. As discussed, much less sampling

budget is allocated to subregions that have been discarded earlier.

Figures 4-12 and 4-14 show the experiment results for the first run of the proposed algorithm.

As before, the final partition of the feasible region and the contour plot of the

Griewank function are displayed in both figures. Figure 4-12 plots the path of the current

best solution across iterations in the feasible domain of the proposed algorithm, and Figure

4-13 displays the zoomed-in results. Figure 4-14 displays the sampling budget allocation in

[Figure 4-7 plot: horizontal axis "Iteration" (5 to 40); vertical axis "Current estimate of objective value" (-0.01 to 0.04); legend: Proposed algorithm, 95% confidence interval, True global objective value.]

Figure 4-7: Objective function estimate of the current iterate with 95% confidence interval across iterations of the proposed algorithm (zoomed-in results).

[Figure 4-8 plot: horizontal axis "Iteration" (5 to 40); vertical axis "Distance to global optimal solution" (0 to 7); legend: ESBB, Proposed.]

Figure 4-8: Distance between current best solution and the global minimum solution across iterations.

[Figure 4-9 plot: contour map titled "ESBB algorithm"; horizontal axis x1 and vertical axis x2, each from -5 to 5; color scale from 0.2 to 1.8.]

Figure 4-9: The path of best solution at current iterate across iterations in the feasible domain of the original ESB&B algorithm.

[Figure 4-10 plot: contour map titled "ESBB algorithm (zoomed in)"; horizontal axis x1 from 2.5 to 4; vertical axis x2 from -5 to -3.5; color scale from 0.2 to 1.8.]

Figure 4-10: The path of best solution at current iterate across iterations in the feasible domain of the original ESB&B algorithm (zoomed-in results).

[Figure 4-11 plot: contour map titled "ESBB algorithm"; horizontal axis x1 and vertical axis x2, each from -5 to 5; color scale from 0.2 to 1.8.]

Figure 4-11: Allocation of sampling budget in the feasible domain of the original ESB&B algorithm.

the feasible domain of the proposed algorithm. These figures have layouts similar to those of Figures

4-9 and 4-11, respectively. Unlike the generic partitioning strategy, the proposed

adaptive partitioning strategy initially divides the feasible region along x2; note that the

globally optimal solution then lies in the interior of one of the subregions. At iteration 24,

the algorithm escapes the lower-right local minimum and starts to explore the middle subregion.

The final partition of the feasible region successfully identifies the underlying function’s peaks

and basins. As expected, more of the sampling budget is allocated to the subregions that

contain the basins than to those containing the peaks.

[Figure 4-12 plot: contour map titled "Adaptive partitioning ESBB algorithm"; horizontal axis x1 and vertical axis x2, each from -5 to 5; color scale from 0.2 to 1.8.]

Figure 4-12: The path of best solution at current iterate across iterations in the feasible domain of the proposed algorithm.

[Figure 4-13 plot: zoomed-in contour map; horizontal axis x1 from 0 to 4; vertical axis x2 from -5 to 0; color scale from 0.2 to 1.8.]

Figure 4-13: The path of best solution at current iterate across iterations in the feasible domain of the proposed algorithm (zoomed-in results).

[Figure 4-14 plot: contour map; horizontal axis x1 and vertical axis x2, each from -5 to 5; color scale from 0.2 to 1.8.]

Figure 4-14: Allocation of sampling budget in the feasible domain of the proposed algorithm.

An average sample-path performance (over the 50 runs) is constructed for each algorithm;

this performance metric is used to compare against other algorithms, e.g., Xu et al.

(2010, 2013); Xu and Nelson (2013). Figure 4-15 plots the objective value of the estimated

optimal solution across iterations (averaged over 50 algorithm runs) for the ESB&B algorithm

(red line with shaded 95% confidence boundary) and the proposed algorithm (black

line with shaded 95% confidence boundary). Figure 4-16 zooms in on the average performance

close to the global minimum function value at zero. Since this is a minimization

problem, the lower the curve, the better the algorithm performance in terms of estimated

objective value. Note that the curve for the proposed algorithm is lower than that for the

ESB&B algorithm except for iterations 10 to 25, where the two curves overlap. This

indicates that: (i) at an early stage, the proposed algorithm finds solutions with improved

performance faster; (ii) during iterations 10 to 25, both algorithms reach solutions with similar

performance around locally optimal objective values; (iii) as the iteration number increases,

the proposed algorithm continues to find solutions with improved performance, whereas

the ESB&B algorithm mostly gets stuck around the locally optimal solutions. At the

termination of the algorithms (i.e., iteration 40), the proposed algorithm ends up with an average

estimated objective value that is statistically lower than that of the ESB&B algorithm.

Additionally, the proposed algorithm finds the true globally optimal solution (i.e., (0, 0)) in

27 out of 50 runs, whereas the ESB&B algorithm finds it in 3 out of 50 runs. The proposed

algorithm finds a final solution whose mean performance is statistically indistinguishable from the

true optimal value (i.e., 0) in 41 out of 50 runs at significance level 0.05, whereas the

ESB&B algorithm does so in 4 out of 50 runs.
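The averaged sample path with its confidence band can be sketched as below. This is an assumed construction for illustration: the band is a pointwise normal-approximation interval across runs, and the function name and data layout (`paths[r][k]` for run r and iteration k) are hypothetical.

```python
from statistics import NormalDist, mean, stdev

def average_sample_path(paths, level=0.95):
    """Return one (mean, lower, upper) triple per iteration.

    paths[r][k] is the current best objective estimate of run r at iteration k.
    The band is a pointwise normal-approximation interval across runs.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    n_runs = len(paths)
    band = []
    for values in zip(*paths):  # one tuple of values per iteration, across runs
        center = mean(values)
        half_width = z * stdev(values) / n_runs ** 0.5
        band.append((center, center - half_width, center + half_width))
    return band

# Two hypothetical runs over three iterations.
band = average_sample_path([[0.2, 0.1, 0.0], [0.4, 0.1, 0.0]])
```

Two algorithms can then be compared iteration by iteration by checking whether their bands overlap, which is the kind of reading applied to the averaged curves.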

Next, we consider the minimization of the Griewank function over the feasible region [−1, 9] ×

[−1, 9], where the globally optimal solution (0, 0) is no longer at the center of the feasible

region. In this example, the generic partitioning strategy of the ESB&B algorithm does not

produce an initial cut close to the globally optimal solution. Figure 4-17 displays the contour

plot of the two-dimensional Griewank function on the domain [−1, 9] × [−1, 9] together with a

uniformly sampled initial sample set (displayed as asterisks) for both algorithms.

The algorithm parameters are set the same as in the previous experiment. We run each

algorithm 50 times.

Figure 4-18 plots the current best estimate of objective value across iterations of five

randomly selected runs of each algorithm (i.e., five sample paths of each algorithm). As

before, we observe that the current best estimate of the objective value exhibits a decreasing

trend for both algorithms. Figure 4-19 (resp. Figure 4-20) plots the current best estimate

of the objective value with 95% confidence intervals for the ESB&B algorithm (resp. proposed

algorithm) and zooms in on the performance close to the global minimum function value

at zero (solid blue line). Only one run of the ESB&B algorithm ends with an

estimated objective value that is not statistically different, at the 95% confidence level,

from the true global minimum value 0. Meanwhile, all five runs of the proposed algorithm

end with estimated objective values that are statistically indistinguishable from the true global

minimum after iteration 25 at the 95% confidence level.

Figure 4-21 shows the distance between the current best solution and the global minimum

solution across iterations. This figure has a layout similar to that of Figure 4-8. All five runs of the

proposed algorithm end up with solutions close to the true global minimum at (0, 0), whereas

only one run of the ESB&B algorithm ends up close to (0, 0). Based on these five

experiment runs, the proposed method tends to find solutions that are closer to the global

minimum.

Figure 4-22 plots the average estimate of the objective value across iterations for the

ESB&B algorithm (red line with shaded 95% confidence boundary) and the proposed algorithm

(black line with shaded 95% confidence boundary). Figure 4-23 zooms in on the

Figure 4-15: Objective function estimate of the current iterate across iterations averaged over 50 algorithm runs.

Figure 4-16: Objective function estimate of the current iterate across iterations averaged over 50 algorithm runs (zoomed-in results).

[Figure 4-17 plot: contour map; horizontal axis x1 and vertical axis x2, each from -1 to 9; global minimum marked; color scale from 0.2 to 1.8.]

Figure 4-17: The contour plot of two-dimensional Griewank function on [−1, 9] × [−1, 9].

[Figure 4-18 plot: horizontal axis "Iteration" (5 to 40); vertical axis "Current estimate of objective value" (-0.05 to 0.35); legend: ESBB, Proposed; reference line "Global minimum objective = 0".]

Figure 4-18: Objective function estimate of the current iterate across iterations.

[Figure 4-19 plot: horizontal axis "Iteration" (5 to 40); vertical axis "Current estimate of objective value" (-0.01 to 0.04); legend: ESBB algorithm, 95% confidence interval, True global objective value.]

Figure 4-19: Objective function estimate of the current iterate with 95% confidence interval across iterations of ESBB algorithm (zoomed-in results).

average performance close to the global minimum function value at zero. As before, the

curve for the proposed algorithm is lower than that of the ESB&B algorithm. In other

words, the proposed algorithm performs better than the ESB&B algorithm on average.

The initial decrease in the estimated objective value is faster for the proposed algorithm

than for the ESB&B algorithm, which indicates that it finds solutions with

improved performance sooner. At termination (i.e., iteration 40), the proposed algorithm ends up

with an average estimated objective value that is statistically lower than that of the ESB&B

algorithm. Additionally, 35 runs of the proposed algorithm end with the true globally optimal

solution at (0, 0), whereas only 14 runs of the ESB&B algorithm obtain it.

Forty-six runs of the proposed algorithm end with a solution whose mean

performance is statistically indistinguishable from the true optimal value 0 at significance level

0.05, whereas only 14 runs of the ESB&B algorithm achieve this.

[Figure 4-20 plot: horizontal axis "Iteration" (5 to 40); vertical axis "Current estimate of objective value" (-0.02 to 0.04); legend: Proposed algorithm, 95% confidence interval, True global objective value.]

Figure 4-20: Objective function estimate of the current iterate with 95% confidence interval across iterations of the proposed algorithm (zoomed-in results).

[Figure 4-21 plot: horizontal axis "Iteration" (5 to 40); vertical axis "Distance to global optimal solution" (0 to 10); legend: ESBB, Proposed.]

Figure 4-21: Distance between current best solution and the global minimum solution across iterations.

Figure 4-22: Objective function estimate of the current iterate across iterations averaged over 50 algorithm runs.

Figure 4-23: Objective function estimate of the current iterate across iterations averaged over 50 algorithm runs (zoomed-in results).

In summary, the proposed adaptive partitioning ESB&B algorithm is globally convergent

and, at the same time, has an improved finite-time (limited sampling budget) performance:

it finds improved solutions faster, reaches better estimated final objective values,

and has a higher likelihood of finding the true globally optimal solutions.

4.4.2 The car-sharing fleet assignment problem

In this section, we apply the proposed algorithm to a real-world car-sharing fleet assignment

problem. This case study problem is adopted from Zhou et al. (2019). It considers a two-

way car-sharing system from the perspective of the service operator. Essentially, it consists

of finding an assignment of a fleet of vehicles across the network of stations that maximizes

the expected profit over a given finite time horizon, denoted as the planning period. Instead

of using a simplified description of demand and of demand-supply interactions, this case

study relies on a demand simulator of Fields et al. (2017) developed based on the rich

high-resolution reservation data from Zipcar’s Boston market. It is formulated as a discrete

SO problem in Zhou et al. (2019), in which the objective function value (i.e., the expected

profit of a given fleet assignment) can only be obtained via simulation.

The discrete SO problem is formulated as follows:

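\begin{align}
\max_{x} \quad & E[R(x; q_1)] - \sum_{i \in I} c_i x_i \tag{4.13} \\
\text{s.t.} \quad & \sum_{i \in I} x_i \le N \tag{4.14} \\
& x_i \le N_i, \quad \forall\, i \in I \tag{4.15} \\
& x_i \in \mathbb{Z}_{\ge 0}, \quad \forall\, i \in I \tag{4.16}
\end{align}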

where x = [xi] are the decision variables and xi is the number of vehicles assigned to station

i. R(x;q1) is the random variable representing the revenue with fleet assignment x and

exogenous simulation parameter vector (e.g., reservation pricing) q1, ci is the exogenous

cost, over the planning period, of a parking space at station i, N is the total number of

vehicles to assign, Ni is the capacity of station i, and I is the set of all stations. The

objective function (4.13) represents the expected profit for a given fleet assignment x as the

difference between the expected revenue E[R(x;q1)] and the costs ∑_{i∈I} c_i x_i. Estimates of

the expected revenue E[R(x;q1)] can only be obtained via simulation. Constraint (4.14)

bounds the total number of vehicles assigned across all the stations. Constraint (4.15)

bounds the number of vehicles assigned to each individual station i by the capacity of the

station. Finally, the number of vehicles assigned to each station must be a nonnegative

integer (Constraint (4.16)).
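The structure of problem (4.13)-(4.16) can be sketched in a few lines. This is an illustration under stated assumptions: `simulate_revenue` is a hypothetical stand-in for one replication of the demand simulator (which is not available here), and the objective is estimated by a plain sample average over replications.

```python
from statistics import mean

def is_feasible(x, fleet_size, capacities):
    """Check constraints (4.14)-(4.16) for an assignment x (one entry per station)."""
    return (
        all(isinstance(xi, int) and xi >= 0 for xi in x)        # (4.16)
        and all(xi <= cap for xi, cap in zip(x, capacities))    # (4.15)
        and sum(x) <= fleet_size                                 # (4.14)
    )

def estimate_profit(x, costs, simulate_revenue, n_reps):
    """Sample-average estimate of objective (4.13).

    `simulate_revenue(x)` is a hypothetical callable returning the revenue of
    one simulation replication for assignment x.
    """
    revenue = mean(simulate_revenue(x) for _ in range(n_reps))
    return revenue - sum(c * xi for c, xi in zip(costs, x))

# Toy usage with a deterministic placeholder in place of the simulator.
toy_profit = estimate_profit([2, 1, 0], [1.0, 2.0, 3.0], lambda x: 10.0 * sum(x), n_reps=5)
```

Because the revenue term is only observable through replications, every evaluation of a candidate assignment consumes simulation budget, which is what makes the choice of partition matter.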

As discussed in Section 4.3.2, in the car-sharing fleet assignment problem, the total

number of vehicles assigned to a cluster of nearby stations (i.e., yj = ∑_{i∈Cj} xi) may form

a more efficient cut of the feasible region, since nearby stations usually share demand:

customers search nearby stations for substitutes if the target station runs out of

vehicles. Therefore, to address the formulated discrete SO problem (4.13), we apply the

following three algorithms: the original ESB&B algorithm, the proposed algorithm with

parallel partition, and the proposed algorithm with hyperplane partition. The potential

splitting factors yj for hyperplane partition are generated as follows: for each station i,

form the cluster of stations centered at station i with a given radius, and eliminate duplicate clusters.

In this section, we choose a radius of 1, expressed in the distance unit that the simulator uses when

calculating the spillover effect from one station to another.
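The cluster generation step just described can be sketched as follows. The station coordinates, the use of Euclidean distance, and the decision to keep single-station clusters are all assumptions made for illustration; the simulator's own distance measure is not reproduced here.

```python
import math

def candidate_clusters(coords, radius):
    """For each station, collect all stations within `radius` of it; drop duplicates."""
    clusters = []
    seen = set()
    for center in coords:
        members = frozenset(
            j for j, other in enumerate(coords) if math.dist(center, other) <= radius
        )
        if members not in seen:  # eliminate clusters that were already generated
            seen.add(members)
            clusters.append(sorted(members))
    return clusters

def cluster_totals(x, clusters):
    """Splitting factors y_j: total vehicles assigned to the stations of cluster C_j."""
    return [sum(x[i] for i in cluster) for cluster in clusters]

# Three hypothetical stations: two close together and one far away.
clusters = candidate_clusters([(0.0, 0.0), (0.5, 0.0), (3.0, 0.0)], radius=1.0)
totals = cluster_totals([4, 2, 7], clusters)
```

Each resulting total y_j is then a candidate quantity on which a hyperplane cut of the form "y_j at most some threshold" can be placed.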

We consider the fleet assignment problem for Boston's South End area, which contains 23

stations (shown in Figure 4-24), i.e., the decision vector is of dimension 23. Each station

i has a space capacity Ni = 16 and the total number of cars to assign is N = 211. All

exogenous variables of the simulator (e.g., ci) other than the demand level are set the

same as in Zhou et al. (2019). We test the algorithms on one low-demand case and one

high-demand case. The maximum number of algorithm iterations is set to 40. At every

iteration, the number of solutions to be sampled from subregions other than the current best

subregion is set to 10 (i.e., νO = 10), and the total number of solutions to be sampled from

the current best subregion is set to 20 (i.e., |P(Rk)|νR = 20). The number of replications

is set to ∆nF = 5 for each newly encountered sample and to ∆nA = 2 for each previously encountered sample. For all

algorithms, we initialize with 20 uniformly sampled solutions plus one solution

with relatively good performance, which can be considered a warm start. This warm-start

solution is obtained by solving the analytical car-sharing fleet assignment model formulated

as a mixed-integer program (MIP) in Zhou et al. (2019, Eq. (8)-(13)). For the original

ESB&B algorithm, the best subregion is divided into 3 new subregions at each iteration

(i.e., ω = 3). For the proposed algorithm with both parallel partition and hyperplane

partition, the maximum tree depth is set at 2 (i.e., D = 2), and hence the current best

subregion can be divided into at most 4 subregions. The minimum number of sampled

points grouped in one subregion is set at 2, i.e., Nmin = 2. Note that the computational

time spent on simulation per iteration in this experimental setting is roughly 500 seconds, whereas the

proposed partitioning strategy, with either parallel or hyperplane partitions, finishes within

2 seconds, which is negligible by comparison. We run each algorithm five times.

Figure 4-24: Zipcar stations in Boston South End neighborhood (Google Maps; 2017)

Figure 4-25 compares the objective estimate of the current iterate across iterations for

each algorithm run of the low-demand experiment. The x-axis displays the iteration index

and the y-axis displays the performance estimate of the current iterate (i.e., the simulation-based

estimate of the objective function of the best point). The red dashed lines display

the results for the ESB&B algorithm, the black solid lines those for the proposed algorithm with

parallel partition, and the blue asterisk lines those for the proposed algorithm with hyperplane

partition. Figure 4-26 displays the aggregated average performance for each algorithm. It

plots the objective value of the estimated optimal solution at each iteration (averaged over 5

algorithm runs) for the ESB&B algorithm (red line with shaded 95% confidence boundary),

the proposed algorithm with parallel partition (black line with shaded 95% confidence

boundary), and the proposed algorithm with hyperplane partition (blue asterisk line with

shaded 95% confidence boundary). In both figures, we observe the following trends.

Initially, all algorithms start at similar positions, namely the warm-start solution.

As the iterations advance, the proposed algorithm, with either parallel or hyperplane

partition, identifies solutions with improved performance faster than the ESB&B algorithm.

To further evaluate the performances of the solutions derived, we simulate each derived

solution for 50 replications. Table 4.1 displays the performance statistics of the derived

solutions from different algorithm runs. The first and second columns show the algorithm

name and run id. The third and fourth columns show the mean and standard deviation

of profits generated by each derived solution. We conduct a two-sample t-test between

each derived solution from the proposed algorithm (with parallel and hyperplane partitions)

and each derived solution from the ESB&B algorithm. The null hypothesis is that the

average profits generated by the solutions derived from the two algorithms are the same. The

corresponding alternative hypothesis is that the average profit generated by the solution

derived from the proposed algorithm is higher. Tables 4.2 and 4.3 display the p-values

for the 50 tests. The null hypothesis cannot be rejected if the p-value is greater than the

significance level 0.05 (colored red in the tables). The null hypothesis is rejected for all tests

except the one that compares the derived solution from run 4 of the ESB&B algorithm

with that from run 3 of the proposed algorithm with parallel partition. In other words,

at the end of the last iteration, the solutions derived from the proposed

algorithm mostly perform better than those derived from the ESB&B algorithm. To investigate

the added value of hyperplane cuts in the proposed algorithm, we conduct a two-sample

t-test for each solution derived from the proposed algorithm with parallel partition and

each solution derived from the proposed algorithm with hyperplane partition. As before,

our null hypothesis is that the average profits generated by both solutions are the same,

and the alternative hypothesis is that the average profit generated by the solution derived

from the proposed algorithm with hyperplane partition is higher. Table 4.4 displays the

[Figure 4-25 plot: horizontal axis "Iteration" (5 to 40); vertical axis "Estimated profit of current iterate ($)" (6.28 to 6.35, in units of 10^4); legend: ESBB, Proposed-Parallel, Proposed-Hyperplane.]

Figure 4-25: The objective function estimate of the current iterate across iterations for low-demand experiment.

Figure 4-26: Objective function estimate of the current iterate across iterations (averaged over the 5 algorithm runs) for the low-demand experiment.

p-values for all 25 two-sample t-tests. The red cells in the table represent the tests in which

the null hypothesis cannot be rejected at significance level 0.05. Fifteen tests

reject the null hypothesis and suggest that the solution derived by the proposed algorithm with

hyperplane partition generates higher profits.
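A one-sided two-sample comparison of this kind can be sketched from the summary statistics alone. The sketch below is an approximation under stated assumptions: it uses the unpooled standard error and a normal distribution in place of the Student-t distribution (reasonable with 50 replications per solution), and the function name is hypothetical. The inputs are the Table 4.1 values for run 4 of the ESB&B algorithm and run 3 of the proposed algorithm with parallel partition.

```python
from statistics import NormalDist

def one_sided_p_value(mean_a, sd_a, mean_b, sd_b, n):
    """Approximate p-value for the alternative "mean_b > mean_a".

    Both samples have n replications; a normal approximation replaces the
    Student-t distribution (an assumption made to keep the sketch stdlib-only).
    """
    std_error = (sd_a ** 2 / n + sd_b ** 2 / n) ** 0.5
    statistic = (mean_b - mean_a) / std_error
    return 1.0 - NormalDist().cdf(statistic)

# ESB&B run 4 versus proposed algorithm (parallel partition) run 3, 50 replications each.
p = one_sided_p_value(63138.56, 184.42, 63169.15, 180.09, n=50)
reject_null = p < 0.05
```

For this pair the approximate p-value is about 0.2, so the null hypothesis is not rejected, in line with the corresponding entry of Table 4.2.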

Algorithm                                   Run   Mean        Standard deviation
ESB&B algorithm                             1     63066.95    173.73
                                            2     63001.90    174.63
                                            3     63056.80    175.06
                                            4     63138.56    184.42
                                            5     63076.53    174.82
Proposed algorithm, parallel partition      1     63204.83    175.60
                                            2     63220.92    161.94
                                            3     63169.15    180.09
                                            4     63244.88    150.86
                                            5     63223.98    169.94
Proposed algorithm, hyperplane partition    1     63260.53    204.98
                                            2     63265.89    174.31
                                            3     63247.42    168.43
                                            4     63302.07    188.59
                                            5     63322.54    180.93

Table 4.1: Performance statistics of the derived final solutions in the low-demand case.

             Proposed algorithm with parallel partition (run)
ESB&B run    1         2         3         4         5
1            0.0001    0.0000    0.0024    0.0000    0.0000
2            0.0000    0.0000    0.0000    0.0000    0.0000
3            0.0000    0.0000    0.0010    0.0000    0.0000
4            0.0344    0.0098    0.2017*   0.0011    0.0089
5            0.0002    0.0000    0.0052    0.0000    0.0000
* p-value above 0.05 (null hypothesis not rejected).

Table 4.2: P-values of the two-sample t-test comparing the solutions derived by ESB&B algorithm and the proposed algorithm with parallel partition in the low-demand experiment.

Figure 4-27 compares the objective estimate of the current iterate across iterations for

each algorithm run of the high-demand experiment. This figure has a layout similar to that of

Figure 4-25. Figure 4-28 displays the aggregated average performance for each algorithm. It

has a layout similar to that of Figure 4-26. From beginning to end, the ESB&B algorithm

does not find any solution with improved performance in four of its runs. On the other

             Proposed algorithm with hyperplane partition (run)
ESB&B run    1         2         3         4         5
1            0.0000    0.0000    0.0000    0.0000    0.0000
2            0.0000    0.0000    0.0000    0.0000    0.0000
3            0.0000    0.0000    0.0000    0.0000    0.0000
4            0.0011    0.0003    0.0013    0.0000    0.0000
5            0.0000    0.0000    0.0000    0.0000    0.0000

Table 4.3: P-values of the two-sample t-test comparing the solutions derived by ESB&B algorithm and the proposed algorithm with hyperplane partition in the low-demand experiment.

Proposed algorithm with      Proposed algorithm with hyperplane partition (run)
parallel partition (run)     1         2         3         4         5
1                            0.0739*   0.0420    0.1093*   0.0045    0.0007
2                            0.1433*   0.0922*   0.2123*   0.0116    0.0019
3                            0.0099    0.0037    0.0135    0.0002    0.0000
4                            0.3325*   0.2604*   0.4684*   0.0487    0.0109
5                            0.1672*   0.0000    0.2451*   0.0160    0.0030
* p-value above 0.05 (null hypothesis not rejected).

Table 4.4: P-values of the two-sample t-test comparing the solutions derived by the proposed algorithm with parallel and hyperplane partition in the low-demand experiment.

hand, the proposed algorithm with either parallel or hyperplane partition does make improvements

as the iterations advance. The proposed algorithm with hyperplane partition starts to find

better solutions earlier than the one with parallel partition.

To further evaluate the performances of the solutions derived, we simulate each derived

solution for 50 replications. Table 4.5 displays the performance statistics of the derived

solutions from different algorithm runs. This table has a layout similar to that of Table 4.1. We first

conduct two-sample t-tests between each solution derived from the ESB&B algorithm and

each solution derived from the proposed algorithm (with parallel and hyperplane partition).

As before, the null hypothesis is that the average profits generated by the solutions derived from

both algorithms are the same, and the alternative hypothesis is that the solution derived

from the proposed algorithm generates a higher profit. Tables 4.6 and 4.7 display the

p-values for the 50 tests. For all tests, the null hypothesis is rejected at significance level

0.05. In other words, the proposed algorithm ends up with solutions with higher profits.

To validate the added value of hyperplane cuts in the proposed algorithm, we conduct a

[Figure 4-27 plot: horizontal axis "Iteration" (5 to 40); vertical axis "Estimated profit of current iterate ($)" (1.38 to 1.4, in units of 10^5); legend: ESBB, Proposed-Parallel, Proposed-Hyperplane.]

Figure 4-27: The objective function estimate of the current iterate across iterations for high-demand experiment.

Figure 4-28: Objective function estimate of the current iterate across iterations (averaged over the 5 algorithm runs) for the high-demand experiment.

two-sample t-test between each solution derived with parallel partition and each solution

derived with hyperplane partition. The null hypothesis is that the average profits generated

by both solutions are the same. The alternative hypothesis is that the solution derived by the

proposed algorithm with hyperplane partition generates a higher profit. Table 4.8 displays

the p-values for the 25 two-sample t-tests. For all 25 t-tests, we reject the null hypothesis

at significance level 0.05. Hence, the proposed algorithm with hyperplane partition derives

solutions that perform better than those derived by the proposed

algorithm with parallel partition. This clear improvement in the proposed algorithm’s

finite-time performance may arise because, under high-demand conditions, customer spillover

from one station to another happens frequently, and hence a hyperplane cut based on the

total number of vehicles in a cluster of nearby stations may form a more efficient way of

dividing the feasible region.

Algorithm                                   Run   Mean         Standard deviation
ESB&B algorithm                             1     138420.78    177.05
                                            2     138420.06    190.27
                                            3     138390.60    158.24
                                            4     138357.82    171.41
                                            5     138357.76    210.50
Proposed algorithm, parallel partition      1     138669.72    184.94
                                            2     138703.72    166.58
                                            3     138723.02    201.85
                                            4     138751.44    204.59
                                            5     138666.18    206.07
Proposed algorithm, hyperplane partition    1     139502.98    196.23
                                            2     139340.78    206.50
                                            3     139188.80    212.03
                                            4     139468.24    224.51
                                            5     139272.32    195.04

Table 4.5: Performance statistics of the derived final solutions in the high-demand case.

In summary, compared to the ESB&B algorithm, the proposed algorithm can improve

the finite-time performances by exploring the underlying function structure through sam-

pled points and incorporating prior knowledge of the problem-specific structures.

             Proposed algorithm with parallel partition (run)
ESB&B run    1         2         3         4         5
1            0.0000    0.0000    0.0000    0.0000    0.0000
2            0.0000    0.0000    0.0000    0.0000    0.0000
3            0.0000    0.0000    0.0000    0.0000    0.0000
4            0.0000    0.0000    0.0000    0.0000    0.0000
5            0.0000    0.0000    0.0000    0.0000    0.0000

Table 4.6: P-values of the two-sample t-test comparing the solutions derived by ESB&B algorithm and the proposed algorithm with parallel partition in the high-demand experiment.

             Proposed algorithm with hyperplane partition (run)
ESB&B run    1         2         3         4         5
1            0.0000    0.0000    0.0000    0.0000    0.0000
2            0.0000    0.0000    0.0000    0.0000    0.0000
3            0.0000    0.0000    0.0000    0.0000    0.0000
4            0.0000    0.0000    0.0000    0.0000    0.0000
5            0.0000    0.0000    0.0000    0.0000    0.0000

Table 4.7: P-values of the two-sample t-test comparing the solutions derived by ESB&B algorithm and the proposed algorithm with hyperplane partition in the high-demand experiment.

Proposed algorithm with      Proposed algorithm with hyperplane partition (run)
parallel partition (run)     1         2         3         4         5
1                            0.0000    0.0000    0.0000    0.0000    0.0000
2                            0.0000    0.0000    0.0000    0.0000    0.0000
3                            0.0000    0.0000    0.0000    0.0000    0.0000
4                            0.0000    0.0000    0.0000    0.0000    0.0000
5                            0.0000    0.0000    0.0000    0.0000    0.0000

Table 4.8: P-values of the two-sample t-test comparing the solutions derived by the proposed algorithm with parallel and hyperplane partition in the high-demand experiment.

4.5 Conclusion

In this chapter, we propose an adaptive partitioning strategy and combine it with the ESB&B

framework developed by Xu and Nelson (2013). This proposed partitioning strategy is a

sample-driven approach that explores the structure of the underlying objective function

by iteratively dividing the feasible region into subregions such that sampled points within

the same subregion have similar performances. The proposed partitioning strategy can be

integrated with prior knowledge of specific problem structures to form more efficient

hyperplane partitions. The partitioning problem is formulated as an MIP and solved

quickly via the local search algorithm developed by Dunn (2018).

The proposed algorithm embeds the adaptive partitioning strategy within the

ESB&B framework. It is a general-purpose algorithm that converges globally for discrete

SO problems with a finite and convex feasible region, and it can be applied to general SO

problems without significant modification. The proposed algorithm improves

the finite-time performance of the original ESB&B algorithm with the generic partitioning

strategy.
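The idea of grouping sampled points with similar performance can be illustrated with a single greedy axis-parallel cut. This is a simplified stand-in and not the formulation used in this chapter, which solves an MIP via local search: the sketch below merely picks the one cut that minimizes the within-group sum of squared deviations, subject to a minimum group size `n_min`. All names are hypothetical.

```python
def best_parallel_split(points, values, n_min=2):
    """Return (dimension, threshold, score) of the best single axis-parallel cut.

    The score is the total within-group sum of squared deviations of `values`;
    cuts leaving fewer than `n_min` points on either side are skipped.
    Returns None if no admissible cut exists.
    """
    def sse(vals):
        center = sum(vals) / len(vals)
        return sum((v - center) ** 2 for v in vals)

    best = None
    for d in range(len(points[0])):
        coords = sorted(set(p[d] for p in points))
        for lo, hi in zip(coords, coords[1:]):
            threshold = (lo + hi) / 2.0  # cut midway between adjacent coordinates
            left = [v for p, v in zip(points, values) if p[d] <= threshold]
            right = [v for p, v in zip(points, values) if p[d] > threshold]
            if len(left) < n_min or len(right) < n_min:
                continue
            score = sse(left) + sse(right)
            if best is None or score < best[2]:
                best = (d, threshold, score)
    return best

# Four hypothetical sampled points whose performance depends only on the second coordinate.
split = best_parallel_split([(0, 0), (1, 0), (0, 5), (1, 5)], [0.1, 0.1, 0.9, 0.9])
```

In this toy example the cut is placed along the second coordinate, since that is the direction in which the sampled performances differ.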

For SO problems with a tight sampling budget at each iteration, a smart sampling strategy

other than uniform sampling is important. This is a potential research direction

to explore. Currently, the adaptive partitioning strategy supports parallel partitions, and

hyperplane partitions only when the potential hyperplane cuts are known in advance. An

interesting research direction is to generate hyperplane cuts automatically for problems

without such prior knowledge.

Chapter 5

Conclusions

This chapter concludes the thesis by reviewing its main contents and contributions. It also

provides potential directions for future research.

Chapter 2 formulates an analytical stochastic link transmission model that is both

computationally efficient and consistent with Newell’s simplified kinetic theory of traffic

flow. The proposed model builds upon the multivariate model of Osorio and

Flötteröd (2015). The model has a complexity that is linear, rather than cubic, in the

link’s space capacity. This makes the model suitable for large-scale network

analysis.

The model is validated versus a simulation-based implementation of the stochastic

link transmission model. The proposed model yields significant gains in compu-

tational efficiency while preserving accuracy. The proposed model is then used to

address a signal control problem for the city of Lausanne. It yields signal plans that

systematically outperform initial random plans for various performance metrics. The

experiments illustrate the robustness of the model to the quality of the initial points.

The proposed plans also outperform a signal plan derived from a widely used com-

mercial signal control software.

Chapter 3 proposes a relaxation approximation of the stochastic link transmission model

formulated in Chapter 2. It proposes a formulation with a constant model complexity,

whereas the past formulations have a complexity that scales linearly or cubically with

link length. This makes it suitable for large-scale network optimization with time

budgets or real-time optimization problems.

The model is validated versus a simulation-based implementation of the stochastic

link transmission model. Its performance is also benchmarked with other past analyt-

ical formulations. The proposed model yields estimates with comparable accuracy,

while the computational efficiency is enhanced by at least one order of magnitude.

The proposed model is also validated versus a microscopic traffic simulator and it

can accurately approximate the link’s boundary conditions for realistic traffic situ-

ations. The model is then used to address the same city-wide traffic signal control

problem with time budget. Compared to a benchmark probabilistic analytical model,

the proposed model enhances computational efficiency by two orders of magnitude,

while deriving signal plans with similar performance. Compared to a benchmark

deterministic network loading model, the proposed model derives signal plans with

better performance. It also yields signal plans that outperform those obtained from a

widely used commercial signal control software.

Chapter 4 proposes a technique to enhance the computational efficiency of SO algorithms

for high-dimensional discrete SO problems. The technique is based on an adaptive partitioning strategy, which iteratively divides the feasible region along the decision variables into subregions such that previously sampled solutions with similar performance are located in the same subregion. The proposed partitioning

strategy can take on problem-specific structures known a priori to form more ef-

ficient hyperplane cuts. The proposed adaptive partitioning strategy is integrated

in the ESB&B framework by Xu and Nelson (2013). The resulting algorithm is a

general-purpose discrete SO algorithm that converges globally for problems with a

finite and convex feasible region, and it can be applied to deterministic problems

without a significant amount of modification. Two numerical studies show that the proposed algorithm outperforms the original ESB&B algorithm under small sampling budgets (i.e., in finite-time performance). The advantage is greater when prior knowledge of the problem-specific structure is available.

Future research directions The proposed stochastic link transmission model (Chapter 2) and its relaxation approximation (Chapter 3) are link models that describe the within-link dynamics and produce the changes in link states (the link’s boundary conditions), given arrival and departure profiles. Another key component of a complete network loading model is the node model, which describes the between-link dynamics and produces arrival and departure profiles, given the states of its connected links.

In order to formulate a complete probabilistic network model, there is a need to for-

mulate probabilistic and scalable node models. The probabilistic model of Osorio

et al. (2011) includes a tandem-link node model that provides a higher-order descrip-

tion of the across-node dependencies. The extension of this formulation to nodes

with multiple upstream and downstream links remains for future work.

Second, there is a need to formulate scalable network models. For a network with n

links, each with space capacity ℓ, directly coupling the proposed link model of Chap-

ter 3 with the node model of Osorio et al. (2011) would yield a model complexity in

the order of O(ℓn). Such a model is inappropriate for large-scale network analysis.

Potential research directions include, but are not limited to, network decomposition techniques (e.g., Flötteröd and Osorio (2014)) and aggregate-disaggregate techniques (e.g., Osorio and Yamani (2017); Osorio and Wang (2017)).

The proposed algorithm in Chapter 4 for discrete SO problems embeds an adaptive partitioning strategy within the ESB&B framework of Xu and Nelson (2013).

The proposed adaptive partitioning strategy uncovers the structure of the underlying

function by iteratively dividing the feasible region into subregions such that sam-

pled points within the same subregion have similar performances. The resulting

algorithm concentrates the limited computational effort (sampling budgets) on sub-

regions where good solutions appear to be. The proposed partitioning strategy can

also make more efficient hyperplane cuts by using prior knowledge of the problem-

specific structures. An extension of this approach consists of automatic detection of

such hyperplane cuts.

Appendix A

Appendices of Chapter 2

A.1 Estimation of the weight parameter w

This section describes the procedure followed to formulate, and to fit the coefficients of,

the weight parameter of Equation (40). Recall that the goal of the mixture model is to

accurately approximate the upstream and the downstream boundary conditions of the link.

In other words, it should yield an accurate approximation of the distribution of UQ and

of DQ. We consider a single isolated link and conduct a total of 180 experiments with

varied combinations of the space capacity (ℓ ∈ {5, 10, 15, . . . , 100}), the traffic inten-

sity (ρ = λ/µ ∈ {0.25, 0.5, 0.75}) and the service rate (or downstream flow capacity)

(µ ∈ {0.2, 0.4, 0.6}). Each experiment considers a time period of duration 250 seconds.

For each experiment we compare the approximation of the UQ and of the DQ distributions, over time T, to the distributions estimated via stochastic simulation with a discrete-event simulator of the stochastic link transmission model. Based on the results of these experiments, we observed that the parameters that most impact the quality of the approximation are ℓ, µ and $k^{\mathrm{fwd}}\delta$. This led us to formulate the following expression for the weight parameter:

\[
w(\ell, \mu, k^{\mathrm{fwd}}\delta; \beta) = \exp\left(-\frac{\ell^2}{\beta \mu k^{\mathrm{fwd}}\delta}\right), \tag{A.1}
\]

where β is a scalar coefficient. The coefficient is fit such as to minimize, over all 180

experiments, the following error function:

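\[
\frac{1}{2}\left[\frac{1}{250}\sum_{T=1}^{250} \mathrm{JSD}\left(P_1^{UQ}(T) \,\|\, P_2^{UQ}(T)\right)\right] + \frac{1}{2}\left[\frac{1}{250}\sum_{T=1}^{250} \mathrm{JSD}\left(P_1^{DQ}(T) \,\|\, P_2^{DQ}(T)\right)\right], \tag{A.2}
\]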

where $P_1^{UQ}(T)$ (resp. $P_1^{DQ}(T)$) is the UQ (resp. DQ) distribution obtained from the mixture model at time T and $P_2^{UQ}(T)$ (resp. $P_2^{DQ}(T)$) is the UQ (resp. DQ) distribution estimated via stochastic simulation at time T. The two summations of (A.2) consider the error in the UQ distributions and in the DQ distributions, respectively. The fit leads to β = 70, which yields the final weight parameter expression defined in Equation (40).
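As an illustration of Equations (A.1) and (A.2), the following Python sketch evaluates the weight parameter and the error function for a single experiment. It rests on assumptions that are not stated in this appendix: the JSD is computed with the natural logarithm, each distribution is passed as a plain list of probabilities, and the product $k^{\mathrm{fwd}}\delta$ is supplied as one positive number. The function names are hypothetical.

```python
import math


def weight(ell, mu, k_fwd_delta, beta=70.0):
    """Weight parameter of Equation (A.1); beta defaults to the fitted value 70."""
    return math.exp(-ell ** 2 / (beta * mu * k_fwd_delta))


def jsd(p, q):
    """Jensen-Shannon divergence of two discrete distributions (natural log)."""
    def kl(a, b):
        # Terms with a_i = 0 contribute zero to the Kullback-Leibler sum.
        return sum(ai * math.log(ai / bi) for ai, bi in zip(a, b) if ai > 0)

    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def fit_error(uq_model, uq_sim, dq_model, dq_sim):
    """Error function of Equation (A.2) for one experiment.

    Each argument is a list, indexed by time step, of probability vectors."""
    uq_term = sum(jsd(p, q) for p, q in zip(uq_model, uq_sim)) / len(uq_model)
    dq_term = sum(jsd(p, q) for p, q in zip(dq_model, dq_sim)) / len(dq_model)
    return 0.5 * uq_term + 0.5 * dq_term


# Toy usage with two time steps and three-state distributions.
model = [[0.7, 0.2, 0.1], [0.5, 0.3, 0.2]]
simulated = [[0.6, 0.3, 0.1], [0.5, 0.3, 0.2]]
print(fit_error(model, simulated, model, simulated))
print(weight(ell=10, mu=0.4, k_fwd_delta=4.0))
```

Fitting β would then amount to summing `fit_error` over all 180 experiments for each candidate value of β and keeping the minimizer.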

A.2 Tables of time-average JSD metric

Tables A.1 and A.2 display, respectively, the time-average JSD metric of the UQ and DQ

distributions.

Experiment       Time-average JSD of the UQ distribution
λ(k)    ℓ        Mixture   Multivariate   DetDet    DetExp    ExpDet
0.1     10       0.0010    0.0000         0.3539    0.2600    0.0003
0.1     20       0.0012    0.0000         0.3988    0.3177    0.0001
0.1     30       0.0013    0.0000         0.4248    0.3543    0.0001
0.1     40       0.0014    NaN            0.4402    0.3779    0.0000
0.1     60       0.0013    NaN            0.4590    0.4111    0.0000
0.1     80       0.0011    NaN            0.4692    0.4320    0.0000
0.1     100      0.0010    NaN            0.4753    0.4489    0.0000
0.2     10       0.0054    0.0000         0.4261    0.2434    0.0020
0.2     20       0.0068    0.0000         0.4644    0.3080    0.0013
0.2     30       0.0070    0.0000         0.4839    0.3476    0.0008
0.2     40       0.0070    NaN            0.4961    0.3767    0.0005
0.2     60       0.0062    NaN            0.5105    0.4189    0.0003
0.2     80       0.0054    NaN            0.5181    0.4476    0.0001
0.2     100      0.0045    NaN            0.5223    0.4713    0.0001
0.3     10       0.0081    0.0000         0.4654    0.1615    0.0036
0.3     20       0.0206    0.0000         0.5071    0.2387    0.0045
0.3     30       0.0237    0.0000         0.5214    0.2863    0.0041
0.3     40       0.0223    NaN            0.5294    0.3215    0.0032
0.3     60       0.0182    NaN            0.5384    0.3772    0.0016
0.3     80       0.0145    NaN            0.5434    0.4215    0.0008
0.3     100      0.0115    NaN            0.5458    0.4602    0.0004

Table A.1: Time-average JSD metric of the UQ distribution. The value NaN denotes cases where the evaluation of the multivariate model exceeded the limit of 40 hours.

Experiment       Time-average JSD of the DQ distribution
λ(k)    ℓ        Mixture   Multivariate   DetDet    DetExp    ExpDet
0.1     10       0.0007    0.0000         0.1550    0.0486    0.0028
0.1     20       0.0002    0.0000         0.1530    0.0476    0.0028
0.1     30       0.0000    0.0000         0.1486    0.0468    0.0027
0.1     40       0.0000    NaN            0.1466    0.0460    0.0026
0.1     60       0.0000    NaN            0.1401    0.0441    0.0025
0.1     80       0.0000    NaN            0.1335    0.0422    0.0024
0.1     100      0.0000    NaN            0.1270    0.0402    0.0022
0.2     10       0.0030    0.0000         0.2679    0.0527    0.0115
0.2     20       0.0008    0.0000         0.2640    0.0538    0.0112
0.2     30       0.0002    0.0000         0.2584    0.0538    0.0107
0.2     40       0.0000    NaN            0.2528    0.0518    0.0105
0.2     60       0.0000    NaN            0.2414    0.0497    0.0099
0.2     80       0.0000    NaN            0.2301    0.0476    0.0095
0.2     100      0.0000    NaN            0.2188    0.0456    0.0089
0.3     10       0.0077    0.0000         0.3677    0.0347    0.0262
0.3     20       0.0033    0.0000         0.3811    0.0507    0.0217
0.3     30       0.0007    0.0000         0.3760    0.0544    0.0202
0.3     40       0.0002    NaN            0.3680    0.0539    0.0196
0.3     60       0.0000    NaN            0.3512    0.0516    0.0184
0.3     80       0.0000    NaN            0.3343    0.0492    0.0174
0.3     100      0.0000    NaN            0.3173    0.0467    0.0164

Table A.2: Time-average JSD metric of the DQ distribution. The value NaN denotes cases where the evaluation of the multivariate model exceeded the limit of 40 hours.

Appendix B

Appendices of Chapter 3

B.1 Property: τDQ(k) = λDQ(k) when µ(k) = 0

In this section, we derive the property of τDQ(k) when the service rate of DQ(k) becomes

zero. When µ(k) = 0, DQ(k) is a pure (Poisson) arrival process. The only possibility for

DQ being empty at the end of the time interval k (i.e., DQ(k) = 0) is that DQ is empty at

the beginning of the time interval k, which happens with probability P(DQ(k − 1) = 0),

and there is no arrival to DQ during time interval k of length δ. Since arrivals to DQ during time interval k follow a Poisson process with rate λDQ(k), the number of arrivals in a time interval of length δ follows a Poisson distribution with parameter λDQ(k)δ. Thus, no arrival to DQ during time interval k of length δ happens with probability $e^{-\lambda_{DQ}(k)\delta}$. Therefore, we have

\begin{align*}
P(DQ(k) = 0) &= P(DQ(k-1) = 0)\, e^{-\lambda_{DQ}(k)\delta} \tag{B.1} \\
&= P(DQ^{k} = 0) + \left[ P(DQ(k-1) = 0) - P(DQ^{k} = 0) \right] e^{-\lambda_{DQ}(k)\delta}. \tag{B.2}
\end{align*}

Equation (B.2) is obtained from Equation (B.1) by adding outside the bracket and subtracting within the bracket the term $P(DQ^{k} = 0)$. When µ(k) = 0, we have $\rho_{DQ} = \lambda_{DQ}(k)/\mu(k) \gg 1$, and thus $P(DQ^{k} = 0) = 0$ (given by Eq. (3.6b)). Equations (B.2) and (B.1) are equal since adding and subtracting zero does not affect the result. Equation (B.2) is in the exact form of Equation (3.4) with τDQ(k) replaced by λDQ(k). Thus, τDQ(k) should not only exist when µ(k) = 0, but it should also equal λDQ(k).
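Equation (B.1) can be checked with a small Monte Carlo experiment. The Python sketch below is an illustration under stated assumptions, not part of the thesis experiments: the function names, the seed and the parameter values are hypothetical, and the queue is reduced to the two events used in the argument above (DQ empty at the start of the interval, and no Poisson arrival during an interval of length δ).

```python
import math
import random


def prob_empty_after_interval(p_empty_before, arrival_rate, delta):
    """Right-hand side of Equation (B.1) for a pure arrival process (mu = 0)."""
    return p_empty_before * math.exp(-arrival_rate * delta)


def simulate_prob_empty(p_empty_before, arrival_rate, delta,
                        n_runs=100_000, seed=0):
    """Estimate P(DQ(k) = 0) by sampling the two events behind Equation (B.1)."""
    rng = random.Random(seed)
    empty_count = 0
    for _ in range(n_runs):
        starts_empty = rng.random() < p_empty_before
        # The time to the first arrival of a Poisson process is exponential,
        # so "no arrival within delta" means the first arrival comes later.
        no_arrival = rng.expovariate(arrival_rate) > delta
        if starts_empty and no_arrival:
            empty_count += 1
    return empty_count / n_runs
```

Calling `simulate_prob_empty(0.6, 0.3, 1.0)` and `prob_empty_after_interval(0.6, 0.3, 1.0)` should give values that agree up to sampling noise.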

B.2 Calculation of limµ(k)→0 τDQ(k) of Equation (3.15)

In this section, we derive the limit of τDQ(k) given by Equation (3.7) as µ(k) approaches

zero. We first calculate the limits of τDQ,1 (given by Eq. (3.7b)) and τDQ,2 (given by

Eq. (3.7c)) independently as follows:

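\begin{align*}
\lim_{\mu(k) \to 0} \tau_{DQ,1}
&= \lim_{\mu(k) \to 0} \left(1 - \alpha_1 e^{-\rho_{DQ}(k)}\right) \times \frac{\mu(k)\left(1 - \rho_{DQ}(k)\right)^2}{1 + \rho_{DQ}(k)} \tag{B.3} \\
&= \lim_{\mu(k) \to 0} \left(1 - \alpha_1 e^{-\rho_{DQ}(k)}\right) \times \lim_{\mu(k) \to 0} \frac{\mu(k)\left(1 - \rho_{DQ}(k)\right)^2}{1 + \rho_{DQ}(k)} \tag{B.4} \\
&= \lim_{\mu(k) \to 0} \left(1 - \alpha_1 e^{-\lambda_{DQ}(k)/\mu(k)}\right) \times \lim_{\mu(k) \to 0} \frac{\mu(k)^2\left(1 - \rho_{DQ}(k)\right)^2}{\mu(k)\left(1 + \rho_{DQ}(k)\right)} \tag{B.5} \\
&= (1 - 0) \times \lim_{\mu(k) \to 0} \frac{\left(\mu(k) - \mu(k)\rho_{DQ}(k)\right)^2}{\mu(k) + \mu(k)\rho_{DQ}(k)} \tag{B.6} \\
&= 1 \times \lim_{\mu(k) \to 0} \frac{\left(\mu(k) - \lambda_{DQ}(k)\right)^2}{\mu(k) + \lambda_{DQ}(k)} \tag{B.7} \\
&= \frac{\left(0 - \lambda_{DQ}(k)\right)^2}{0 + \lambda_{DQ}(k)} \tag{B.8} \\
&= \lambda_{DQ}(k) \tag{B.9}
\end{align*}

\begin{align*}
\lim_{\mu(k) \to 0} \tau_{DQ,2}
&= \lim_{\mu(k) \to 0} \alpha_2\, \mu(k) \left| \frac{P(DQ(k-1) = 0) - P(DQ^{k} = 0)}{\ell\left(1 - P(DQ^{k} = 0)\right)} \right|^{1/5} \tag{B.10} \\
&= \alpha_2 \left| \frac{P(DQ(k-1) = 0) - P(DQ^{k} = 0)}{\ell\left(1 - P(DQ^{k} = 0)\right)} \right|^{1/5} \lim_{\mu(k) \to 0} \mu(k) \tag{B.11} \\
&= 0 \tag{B.12}
\end{align*}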

Note that τDQ(k) (given by Eq. (3.7a)) is proposed as the sum of two terms τDQ,1 and τDQ,2, and the limit of the sum of two functions is equal to the sum of the limits of the two functions. Therefore, we have

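\begin{align*}
\lim_{\mu(k) \to 0} \tau_{DQ}(k) &= \lim_{\mu(k) \to 0} \tau_{DQ,1} + \lim_{\mu(k) \to 0} \tau_{DQ,2} \tag{B.13} \\
&= \lambda_{DQ}(k) + 0 \tag{B.14} \\
&= \lambda_{DQ}(k) \tag{B.15}
\end{align*}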
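The limit can also be observed numerically. The following Python sketch evaluates the two terms as written in Equations (B.3) and (B.10) for decreasing values of µ(k). It is an illustration only: the values chosen for α1, α2 and for the bracketed ratio of Equation (B.10) are arbitrary placeholders, the ratio is held fixed as in Equation (B.11), and the function names are hypothetical.

```python
import math


def tau_dq_1(mu, lam, alpha_1):
    """First term, as written in Equation (B.3), with rho = lam / mu."""
    rho = lam / mu
    return (1.0 - alpha_1 * math.exp(-rho)) * mu * (1.0 - rho) ** 2 / (1.0 + rho)


def tau_dq_2(mu, alpha_2, ratio):
    """Second term, as written in Equation (B.10); `ratio` is the bracketed
    quantity, treated here as a fixed number."""
    return alpha_2 * mu * abs(ratio) ** (1.0 / 5.0)


lam = 0.3  # arrival rate lambda_DQ(k), placeholder value
for mu in (1e-1, 1e-2, 1e-4, 1e-6):
    total = tau_dq_1(mu, lam, alpha_1=0.4) + tau_dq_2(mu, alpha_2=0.6, ratio=0.5)
    print(f"mu = {mu:.0e}: tau_DQ = {total:.8f}")
```

As µ(k) decreases, the printed values should approach the chosen arrival rate, in line with Equation (B.15).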

B.3 Estimation of the scalar coefficients in τDQ(k)

This appendix describes the procedure to fit the exogenous coefficients (α1,1, α1,2, α2,1 and α2,2) of Equation (3.7). A total of 126 simulation experiments were carried out. Each experiment starts with an empty link and runs for TF = 300 time units; it has one traffic intensity value for the first 150 time units and another value for the remaining 150 time units. We use the arrow notation 0.5 → 0.25 to denote an experiment with a traffic intensity that changes from 0.5 to 0.25. The experiments consider all combinations of traffic intensity λ/µ ∈ {0.5 → 0.25, 0.75 → 0.25, 1.25 → 0.25, 0.75 → 0.5, 1.25 → 0.5, 1.25 → 0.75}, service rate µ ∈ {0.2, 0.4, 0.6}, and space capacity ℓ ∈ {10, 20, 30, 40, 60, 80, 100}. The simulator yields an estimate of P(DQ(k) = 0), denoted PS(DQ(k) = 0), for all k = 1, ..., TF. The coefficients α1,1, α1,2, α2,1 and α2,2 are fit so as to minimize, over all 126 experiments, the error function given by Equation (3.33) and rewritten here:

\[
e_{DQ} = \frac{1}{T_F} \sum_{T=1}^{T_F} \left| P^{A}(DQ(T) = 0) - P^{S}(DQ(T) = 0) \right|, \tag{B.16}
\]

where PS(DQ(T) = 0) is the estimate from the simulator and PA(DQ(T) = 0) is the

analytical approximation obtained from Algorithm 4 with the following adjustments. At

every time step k,

• P(UQ(k) = ℓ) is obtained from the simulator;

• λDQ(k) is obtained from the simulator;

• τDQ(k) is obtained from Equation (3.7), which depends on scalar parameters α1,1,

α1,2, α2,1 and α2,2.

In other words, perfect information about the link’s upstream boundary conditions is assumed in the calculation of PA(DQ(k) = 0). Thus, PA(DQ(k) = 0), and hence the error function eDQ, depends only on the choice of α1,1, α1,2, α2,1 and α2,2. The scalars are estimated jointly and the numerical values obtained are α1,1 = 0.4, α1,2 = 0.4, α2,1 = 0.6 and α2,2 = 0.
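The fitting procedure can be sketched as follows. This appendix does not state which optimization routine was used, so the exhaustive grid search below is an assumption made for illustration. The callable `model` is a hypothetical stand-in for the analytical approximation (Algorithm 4 with the adjustments listed above), and the toy data are invented.

```python
from itertools import product


def mean_abs_error(p_analytical, p_simulated):
    """Equation (B.16): time-average absolute difference of two probability series."""
    return sum(abs(a - s) for a, s in zip(p_analytical, p_simulated)) / len(p_simulated)


def grid_search(model, experiments, candidate_values, n_coefficients):
    """Return the coefficient tuple that minimizes the error summed over experiments.

    model(coefficients, inputs) is a hypothetical callable returning the
    analytical probability series; each experiment is a pair
    (inputs, simulated_series)."""
    best_coeffs, best_error = None, float("inf")
    for coeffs in product(candidate_values, repeat=n_coefficients):
        total = sum(mean_abs_error(model(coeffs, inputs), simulated)
                    for inputs, simulated in experiments)
        if total < best_error:
            best_coeffs, best_error = coeffs, total
    return best_coeffs, best_error


# Toy stand-in model: a linear map of the inputs with two coefficients.
def toy_model(coeffs, inputs):
    return [coeffs[0] * x + coeffs[1] for x in inputs]


toy_experiments = [([0.0, 1.0, 2.0], [0.1, 0.3, 0.5])]
print(grid_search(toy_model, toy_experiments, [0.0, 0.1, 0.2, 0.3], n_coefficients=2))
```

Because all coefficients are searched jointly, the cost grows exponentially with the number of coefficients. A coarse grid followed by a local refinement would be one way to keep four coefficients tractable.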

B.4 Variance of the sojourn time of DQ(k)

In this section, we derive the expression for the variance of the sojourn time of DQ(k).

Recall that we use the sojourn time of an M/M/1/ℓ queue with arrival rate λDQ(k) and

service rate µ(k) to approximate the sojourn time of DQ(k). To make the notation simpler,

hereafter the time index k is dropped. Let ρ = λDQ/µ.

The probability density function of the sojourn time of an M/M/1/ℓ queue, denoted $f_{S_{DQ}}(t)$, is given by Sztrik (2012, Chap. 2.4, Page 34):

\[
f_{S_{DQ}}(t) = \sum_{n=0}^{\ell-1} \mu \frac{(\mu t)^n}{n!} e^{-\mu t}\, \frac{P(DQ = n)}{1 - P(DQ = \ell)} \tag{B.17}
\]

where P(DQ = n) is the steady state probability of DQ.

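We use this probability density function expression to compute $E[S_{DQ}^2]$ as follows.

\begin{align*}
E[S_{DQ}^2] &= \int_0^\infty t^2 f_{S_{DQ}}(t)\, dt \tag{B.18} \\
&= \int_0^\infty t^2 \sum_{n=0}^{\ell-1} \mu \frac{(\mu t)^n}{n!} e^{-\mu t}\, \frac{P(DQ = n)}{1 - P(DQ = \ell)}\, dt \tag{B.19} \\
&= \int_0^\infty t^2 \sum_{n=0}^{\ell-1} \mu \frac{(\mu t)^n}{n!} e^{-\mu t}\, \frac{\left(\frac{1-\rho}{1-\rho^{\ell+1}}\right)\rho^n}{1 - \left(\frac{1-\rho}{1-\rho^{\ell+1}}\right)\rho^{\ell}}\, dt \tag{B.20} \\
&= \frac{1-\rho}{1-\rho^{\ell}} \int_0^\infty t^2 \sum_{n=0}^{\ell-1} \mu \frac{(\mu t)^n}{n!} e^{-\mu t} \rho^n\, dt \tag{B.21} \\
&= \frac{1-\rho}{1-\rho^{\ell}} \sum_{n=0}^{\ell-1} \rho^n \int_0^\infty t\, \frac{(\mu t)^{n+1}}{n!} e^{-\mu t}\, dt \tag{B.22} \\
&= \frac{1-\rho}{1-\rho^{\ell}} \sum_{n=0}^{\ell-1} \rho^n\, \frac{\Gamma(n+3)}{\mu^2\, n!} \tag{B.23} \\
&= \frac{1-\rho}{(1-\rho^{\ell})\mu^2} \sum_{n=0}^{\ell-1} \rho^n (n+2)(n+1) \tag{B.24} \\
&= \frac{-\ell(\ell+1)\rho^{\ell+2} + 2\ell(\ell+2)\rho^{\ell+1} - (\ell+1)(\ell+2)\rho^{\ell} + 2}{\mu^2 (1-\rho^{\ell})(1-\rho)^2} \tag{B.25}
\end{align*}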

Equation (B.20) is obtained from Equation (B.19) by substituting the closed-form expres-

sion of the steady state probability distribution of an M/M/1/ℓ system (see Gross (2008,

Chap. 2, Equation (2.49))), which is given by:

\[
P(DQ = n) = \left(\frac{1-\rho}{1-\rho^{\ell+1}}\right)\rho^n, \quad \forall n \in \{0, ..., \ell\}. \tag{B.26}
\]

The expected value of the sojourn time of DQ is given by (see Eq. (3.29d)):

\[
E[S_{DQ}] = \frac{\ell\rho^{\ell+1} - (\ell+1)\rho^{\ell} + 1}{\mu (1-\rho^{\ell})(1-\rho)} \tag{B.27}
\]

Thus, the variance of the sojourn time of DQ is given by:

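\begin{align*}
\mathrm{Var}(S_{DQ}) &= E[S_{DQ}^2] - E[S_{DQ}]^2 \tag{B.28} \\
&= \frac{\ell\rho^{2\ell+2} - 2\ell\rho^{2\ell+1} + (\ell+1)\rho^{2\ell} - \ell(\ell+1)\rho^{\ell+2} + 2\ell(\ell+1)\rho^{\ell+1} - (\ell^2+\ell+2)\rho^{\ell} + 1}{\mu^2 (1-\rho^{\ell})^2 (1-\rho)^2} \tag{B.29}
\end{align*}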
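The closed-form expressions (B.25), (B.27) and (B.29) can be cross-checked against the finite sums they come from. The Python sketch below does this for ρ ≠ 1. It is a verification aid with hypothetical function names, and it uses the fact, implied by Equation (B.17), that a customer who finds n customers in the queue has an Erlang(n + 1, µ) sojourn time with mean (n + 1)/µ and second moment (n + 1)(n + 2)/µ².

```python
def steady_state_probs(rho, ell):
    """Equation (B.26): steady-state distribution of an M/M/1/ell queue (rho != 1)."""
    norm = (1 - rho) / (1 - rho ** (ell + 1))
    return [norm * rho ** n for n in range(ell + 1)]


def sojourn_moments_by_sum(rho, mu, ell):
    """Mean, second moment and variance of the sojourn time from the finite sums."""
    probs = steady_state_probs(rho, ell)
    accept = 1 - probs[ell]  # probability that an arriving customer is not blocked
    m1 = sum(probs[n] * (n + 1) for n in range(ell)) / (accept * mu)
    m2 = sum(probs[n] * (n + 1) * (n + 2) for n in range(ell)) / (accept * mu ** 2)
    return m1, m2, m2 - m1 ** 2


def sojourn_moments_closed_form(rho, mu, ell):
    """Equations (B.27), (B.25) and (B.29), in that order."""
    m1 = ((ell * rho ** (ell + 1) - (ell + 1) * rho ** ell + 1)
          / (mu * (1 - rho ** ell) * (1 - rho)))
    m2 = ((-ell * (ell + 1) * rho ** (ell + 2)
           + 2 * ell * (ell + 2) * rho ** (ell + 1)
           - (ell + 1) * (ell + 2) * rho ** ell + 2)
          / (mu ** 2 * (1 - rho ** ell) * (1 - rho) ** 2))
    var = ((ell * rho ** (2 * ell + 2) - 2 * ell * rho ** (2 * ell + 1)
            + (ell + 1) * rho ** (2 * ell)
            - ell * (ell + 1) * rho ** (ell + 2)
            + 2 * ell * (ell + 1) * rho ** (ell + 1)
            - (ell ** 2 + ell + 2) * rho ** ell + 1)
           / (mu ** 2 * (1 - rho ** ell) ** 2 * (1 - rho) ** 2))
    return m1, m2, var


print(sojourn_moments_by_sum(0.5, 1.0, 2))
print(sojourn_moments_closed_form(0.5, 1.0, 2))
```

The closed forms involve differences of powers of ρ, so for large ℓ or for ρ close to 1 the finite sums are likely the numerically safer choice.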

B.5 Estimation of the scalar coefficients in τUQ(k)

This section describes the procedure to fit the exogenous coefficients α3,1 and α3,2 of Equation (3.30). The same set of 126 simulation experiments as described in Appendix B.3 is used. The coefficients α3,1 and α3,2 are fit by minimizing, over all 126 experiments, the error function given by Equation (3.34) and rewritten here:

\[
e_{UQ} = \frac{1}{T_F} \sum_{T=1}^{T_F} \left| P^{A}(UQ(T) = \ell) - P^{S}(UQ(T) = \ell) \right|, \tag{B.30}
\]

where PS(UQ(T) = ℓ) is the estimate from the simulator and PA(UQ(T) = ℓ) is the ana-

lytical approximation, which is obtained from Algorithm 4 with the following adjustments.

At every time step k,

• P(DQ(k) = 0) is obtained from the simulator;

• qUQ(k) and qLLO(k) are obtained from the simulator;

• τUQ(k) is obtained from Equation (3.30), which depends on scalar parameters α3,1

and α3,2.

In other words, perfect information about the link’s downstream boundary conditions is assumed in the calculation of PA(UQ(T) = ℓ). Thus, PA(UQ(T) = ℓ), and hence the error function eUQ, depends only on the choice of α3,1 and α3,2. The scalars are estimated jointly and the numerical values obtained are α3,1 = 25 and α3,2 = 7.5.

B.6 Tables of mean absolute differences

Experiment            eUQ
λ(k)         ℓ        Mixture     Multivariate   Proposed
0.2 → 0.1    10       3.11e-3     7.80e-5        1.80e-3
0.2 → 0.1    20       5.75e-5     4.02e-6        4.74e-5
0.2 → 0.1    30       7.57e-7     3.92e-7        7.35e-7
0.2 → 0.1    40       9.25e-12    NaN            3.25e-10
0.2 → 0.1    60       5.20e-17    NaN            8.45e-15
0.2 → 0.1    80       2.41e-22    NaN            1.92e-19
0.2 → 0.1    100      2.49e-28    NaN            4.72e-25
0.3 → 0.1    10       1.48e-2     4.50e-4        2.10e-3
0.3 → 0.1    20       4.67e-3     1.23e-4        1.80e-3
0.3 → 0.1    30       7.03e-4     3.90e-5        4.38e-4
0.3 → 0.1    40       8.02e-5     NaN            6.07e-5
0.3 → 0.1    60       3.11e-7     NaN            2.91e-7
0.3 → 0.1    80       1.82e-13    NaN            3.59e-10
0.3 → 0.1    100      3.08e-17    NaN            8.60e-14
0.3 → 0.2    10       1.87e-2     4.44e-4        4.30e-3
0.3 → 0.2    20       5.19e-3     1.29e-4        2.22e-3
0.3 → 0.2    30       8.36e-4     3.79e-5        5.49e-4
0.3 → 0.2    40       1.00e-4     NaN            7.81e-5
0.3 → 0.2    60       6.59e-7     NaN            2.91e-7
0.3 → 0.2    80       2.04e-13    NaN            4.59e-10
0.3 → 0.2    100      6.07e-17    NaN            1.23e-13

Table B.1: Mean absolute difference eUQ of P(UQ(k) = ℓ). The value NaN denotes cases where the evaluation of the multivariate model exceeded the limit of 40 hours. (Starting empty with time-varying demand.)

Experiment            eDQ
λ(k)         ℓ        Mixture     Multivariate   Proposed
0.2 → 0.1    10       5.00e-3     2.93e-3        4.60e-3
0.2 → 0.1    20       2.76e-3     3.00e-3        4.35e-3
0.2 → 0.1    30       2.25e-3     3.02e-3        4.67e-3
0.2 → 0.1    40       2.51e-3     NaN            5.12e-3
0.2 → 0.1    60       2.63e-3     NaN            5.51e-3
0.2 → 0.1    80       2.68e-3     NaN            5.82e-3
0.2 → 0.1    100      2.74e-3     NaN            6.16e-3
0.3 → 0.1    10       1.52e-2     0.44e-2        0.69e-2
0.3 → 0.1    20       0.73e-2     0.46e-2        1.54e-2
0.3 → 0.1    30       0.40e-2     0.47e-2        1.82e-2
0.3 → 0.1    40       0.33e-2     NaN            1.88e-2
0.3 → 0.1    60       0.40e-2     NaN            1.93e-2
0.3 → 0.1    80       0.42e-2     NaN            1.97e-2
0.3 → 0.1    100      0.42e-2     NaN            2.00e-2
0.3 → 0.2    10       1.62e-2     0.35e-2        1.27e-2
0.3 → 0.2    20       0.69e-2     0.35e-2        1.36e-2
0.3 → 0.2    30       0.36e-2     0.36e-2        1.33e-2
0.3 → 0.2    40       0.27e-2     NaN            1.38e-2
0.3 → 0.2    60       0.32e-2     NaN            1.43e-2
0.3 → 0.2    80       0.34e-2     NaN            1.46e-2
0.3 → 0.2    100      0.34e-2     NaN            1.49e-2

Table B.2: Mean absolute difference eDQ of P(DQ(k) = 0). The value NaN denotes cases where the evaluation of the multivariate model exceeded the limit of 40 hours. (Starting empty with time-varying demand.)

Bibliography

Amaran, S., Sahinidis, N. V., Sharda, B. and Bury, S. J. (2016). Simulation optimization: a review of algorithms and applications, Annals of Operations Research 240(1): 351–380.

Andradóttir, S. (2006). An overview of simulation optimization via random search, Handbooks in operations research and management science 13: 617–631.

Angün, E., Kleijnen, J., den Hertog, D. and Gürkan, G. (2009). Response surface methodology with stochastic constraints for expensive simulation, Journal of the operational research society 60(6): 735–746.

Bhatnagar, S., Hemachandra, N. and Mishra, V. K. (2011). Stochastic approximation algorithms for constrained optimization via simulation, ACM Transactions on Modeling and Computer Simulation (TOMACS) 21(3): 15.

Bhosekar, A. and Ierapetritou, M. (2018). Advances in surrogate based modeling, feasibility analysis, and optimization: A review, Computers & Chemical Engineering 108: 250–267.

Boel, R. and Mihaylova, L. (2006). A compositional stochastic model for real time freeway traffic simulation, Transportation Research Part B 40: 319–334.

Breiman, L., Friedman, J., Olshen, R. and Stone, C. (1984). Classification and regression trees, Wadsworth International Group 37(15): 237–251.

Calvert, S., Taale, H., Snelder, M. and Hoogendoorn, S. (2012). Probability in traffic: a challenge for modelling, 4th International Symposium on Dynamic Traffic Assignment (DTA), Massachusetts, USA.

Chen, X., Li, L. and Shi, Q. (2015). Stochastic Evolutions of Dynamic Traffic Flow, Springer, Berlin Heidelberg.

Chong, L. and Osorio, C. (2017). A simulation-based optimization algorithm for dynamic large-scale urban transportation problems, Transportation Science 52(3): 637–656.

Daganzo, C. (2005). A variational formulation of kinematic waves: basic theory and complex boundary conditions, Transportation Research Part B 39(2): 187–196.

Daganzo, C. F. (1994). The cell transmission model: A dynamic representation of highway traffic consistent with the hydrodynamic theory, Transportation Research Part B 28(4): 269–287.

Davis, J. L., Massey, W. A. and Whitt, W. (1995). Sensitivity to the service-time distribution in the nonstationary Erlang loss model, Management Science 41(6): 1107–1116.

Deng, W., Lei, H. and Zhou, X. (2013). Traffic state estimation and uncertainty quantification based on heterogeneous data sources: a three detector approach, Transportation Research Part B 57: 132–157.

Dumont, A. G. and Bert, E. (2006). Simulation de l’agglomération Lausannoise SIMULO, Laboratoire des voies de circulation, ENAC, Ecole Polytechnique Fédérale de Lausanne. Available at: http://web.mit.edu/osorioc/www/papers/dumont06BertRapport.pdf

Dunn, J. W. (2018). Optimal trees for prediction and prescription, PhD thesis, Massachusetts Institute of Technology.

Endres, D. M. and Schindelin, J. E. (2003). A new metric for probability distributions, IEEE Transactions on Information Theory 49(7): 1858–1860.

Erlang, A. K. (1917). Solution of some problems in the theory of probabilities of significance in automatic telephone exchanges, Post Office Electrical Engineer’s Journal 10(1917-1918): 189–197.

Fields, E., Osorio, C. and Zhou, T. (2017). A data-driven car sharing simulator for inferring latent demand, Technical report, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA. Available at: http://web.mit.edu/osorioc/www/papers/fields17Sim.pdf

Flötteröd, G. and Osorio, C. (2014). Stochastic analytic dynamic queueing network model with spillback, Proceedings of the International Symposium of Dynamic Traffic Assignment (DTA). Available at: http://web.mit.edu/osorioc/www/papers/floOso13Nwks.pdf

Flötteröd, G. and Osorio, C. (2017). Stochastic network link transmission model, Transportation Research Part B 102: 180–209.

Fu, M. C. (2002). Optimization for simulation: Theory vs. practice, INFORMS Journal on Computing 14(3): 192–215.

Gazis, D. C., Herman, R. and Rothery, R. W. (1961). Nonlinear follow-the-leader models of traffic flow, Operations research 9(4): 545–567.

Google Maps (2017). 23 Zipcar stations in Boston South End neighborhood, https://drive.google.com/open?id=1hOvbRIjfZJjF5L3OfoAThiq0\_p8&usp=sharing. Accessed: 2017-09-22.

Gross, D. (2008). Fundamentals of queueing theory, John Wiley & Sons, New York, U.S., chapter 2, pp. 49–103.

Heidemann, D. (1991). Queue length and waiting time distributions at priority intersections, Transportation Research Part B 25(4): 163–174.

Heidemann, D. (1994). Queue length and delay distributions at traffic signals, Transportation Research Part B 28(5): 377–389.

Heidemann, D. (2001). A queueing theory model of nonstationary traffic flow, Transportation Science 35(4): 405–412.

Heidemann, D. and Wegmann, H. (1997). Queueing at unsignalized intersections, Transportation Research Part B 31(3): 239–263.

Helbing, D. (1997). Modeling multi-lane traffic flow with queuing effects, Physica A: Statistical Mechanics and its Applications 242(1-2): 175–194.

Himpe, W., Corthout, R. and Tampère, M. C. (2016). An efficient iterative link transmission model, Transportation Research Part B 92: 170–190.

Hong, L. J. and Nelson, B. L. (2009). A brief introduction to optimization via simulation, Proceedings of the 2009 Winter Simulation Conference (WSC), IEEE, pp. 75–85.

Hoogendoorn, S. P. and Bovy, P. H. (2001). Generic gas-kinetic traffic systems modeling with applications to vehicular traffic flow, Transportation Research Part B 35(4): 317–336.

Jabari, S. E. (2012). A stochastic model of macroscopic traffic flow: Theoretical foundations, PhD thesis, University of Minnesota.

Jabari, S. E. and Liu, H. X. (2012). A stochastic model of traffic flow: Theoretical foundations, Transportation Research Part B 46(1): 156–174.

Jabari, S. E. and Liu, H. X. (2013). A stochastic model of traffic flow: Gaussian approximation and estimation, Transportation Research Part B 47: 15–41.

Jabari, S. E., Zheng, J. and Liu, H. X. (2014a). A probabilistic stationary speed–density relation based on Newell’s simplified car-following model, Transportation Research Part B 68: 205–223.

Jabari, S. E., Zheng, J. and Liu, H. X. (2014b). A probabilistic stationary speed-density relation based on Newell’s simplified car-following model, Transportation Research Part B 68: 205–223.

Jagerman, D. (1975). Nonstationary blocking in telephone traffic, Bell Labs Technical Journal 54(3): 625–661.

Kerner, B. S. and Rehborn, H. (1996). Experimental features and characteristics of traffic jams, Physical Review E 53: R1297–R1300.

Khinchin, A. Y. (1962). Erlang’s formulas in the theory of mass service, Theory of Probability & Its Applications 7(3): 320–325.

Kim, S.-H. and Nelson, B. L. (2006). Selecting the best system, Handbooks in operations research and management science 13: 501–534.

Kingman, J. (1963). Poisson counts for random sequences of events, The Annals of Mathematical Statistics 34(4): 1217–1232.

Kleinrock, L. (1975). Queueing Systems Volume 1: Theory, Wiley-Interscience, New York, NY, USA, chapter 2, p. 77.

Kullback, S. and Leibler, R. A. (1951). On information and sufficiency, The annals of mathematical statistics 22(1): 79–86.

L. Salemi, P., Song, E., Nelson, B. L. and Staum, J. (2019). Gaussian Markov random fields for discrete optimization via simulation: Framework and algorithms, Operations Research 67(1): 250–266.

Lam, W. H., Shao, H. and Sumalee, A. (2008). Modeling impacts of adverse weather conditions on a road network with uncertainties in demand and supply, Transportation research part B: methodological 42(10): 890–910.

Larson, R. C. and Odoni, A. R. (1981). Urban Operations Research, Prentice-Hall, Inc., Englewood Cliffs, New Jersey, USA.

Laval, J. A. and Castrillón, F. (2015). Stochastic approximations for the macroscopic fundamental diagram of urban networks, Transportation Research Procedia, Papers selected for the International Symposium of Transportation and Traffic Theory (ISTTT), Vol. 7, pp. 615–630.

Laval, J. A. and Chilukuri, B. R. (2014). The distribution of congestion on a class of stochastic kinematic wave models, Transportation Science 48(2): 217–224.

Lighthill, M. and Whitham, G. (1955). On kinematic waves. I: Flood movement in long rivers, II: a theory of traffic flow on long crowded roads, Proceedings of the Royal Society of London A: Mathematical, Physical and Engineering Sciences, Vol. 229, The Royal Society, pp. 281–345.

Lighthill, M. and Witham, J. (1955). On kinematic waves II. a theory of traffic flow on long crowded roads, Proceedings of the Royal Society A 229: 317–345.

Lu, J. and Osorio, C. (2018). A probabilistic traffic-theoretic network loading model suitable for large-scale network analysis, Transportation Science 52(6): 1509–1530.

MATLAB (2016). Optimization Toolbox: User’s Guide (R2016a), The Mathworks, Inc., Natick, Massachusetts.

Morse, P. (1958). Queues, inventories and maintenance; the analysis of operational systems with variable demand and supply, Wiley, New York, USA, chapter 6, pp. 59–67.

Nelson, P. (1995). A kinetic model of vehicular traffic and its associated bimodal equilibrium solutions, Transport Theory and Statistical Physics 24(1-3): 383–409.

Newell, C. (1982). Applications of queueing theory, Chapman and Hall, New York, USA, chapter 3, pp. 143–175.

Newell, G. (1993). A simplified theory of kinematic waves in highway traffic, part I: general theory, Transportation Research Part B 27(4): 281–287.

Newell, G. F. (1961). Nonlinear effects in the dynamics of car following, Operations Research 9(2): 209–229.

Newell, G. F. (2002). A simplified car-following theory: a lower order model, Transportation Research Part B 36(3): 195–205.

Norkin, V. I., Pflug, G. C. and Ruszczynski, A. (1998). A branch and bound method for stochastic global optimization, Mathematical programming 83(1-3): 425–450.

Odoni, A. R. and Roth, E. (1983). An empirical investigation of the transient behavior of stationary queueing systems, Operations Research 31(3): 432–455.

Olszewski, P. S. (1994). Modeling probability distribution of delay at signalized intersections, Journal of Advanced Transportation 28(3): 253–274.

Orosz, G., Wilson, R. E., Szalai, R. and Stépán, G. (2009). Exciting traffic jams: nonlinear phenomena behind traffic jam formation on highways, Physical Review E 80(4): 046205.

Osorio, C. (2010). Mitigating network congestion: analytical models, optimization methods and their applications, PhD thesis, Ecole Polytechnique Fédérale de Lausanne.

Osorio, C., Chen, X., Gao, J., Talas, M. and Marsico, M. (2015). On the control of highly congested urban networks with intricate traffic patterns: a New York City case study, Technical report, Department of Civil and Environmental Engineering, Massachusetts Institute of Technology (MIT). Available at: http://web.mit.edu/osorioc/www/papers/osoChenNYCDOTOfflineSO.pdf

Osorio, C. and Chong, L. (2015). A computationally efficient simulation-based optimization algorithm for large-scale urban transportation problems, Transportation Science 49(3): 623–636.

Osorio, C. and Flötteröd, G. (2015). Capturing dependency among link boundaries in a stochastic dynamic network loading model, Transportation Science 49(2): 420–431.

Osorio, C., Flötteröd, G. and Bierlaire, M. (2011). Dynamic network loading: a stochastic differentiable model that derives link state distributions, Transportation Research Part B 45(9): 1410–1423.

Osorio, C. and Wang, C. (2017). On the analytical approximation of joint aggregate queue-length distributions for traffic networks: a stationary finite capacity Markovian network approach, Transportation Research Part B 95: 305–339.

Osorio, C. and Yamani, J. (2017). Analytical and scalable analysis of transient tandem Markovian finite capacity queueing networks, Transportation Science 51(3): 823–840.

Paveri-Fontana, S. (1975). On boltzmann-like treatments for traffic flow: a critical reviewof the basic model and an alternative proposal for dilute traffic analysis, TransportationResearch 9(4): 225–235.

Payne, H. J. (1971). Model of freeway traffic and control, Mathematical Model of PublicSystem: Simulation Council Proceedings Series 1: 51–61.

Prigogine, I. and Andrews, F. C. (1960). A Boltzmann-like approach for traffic flow, Op-erations Research 8(6): 789–797.

Ramezani, M., Haddad, J. and Geroliminis, N. (2015). Dynamics of heterogeneity in urbannetworks: aggregated traffic modeling and hierarchical control, Transportation ResearchPart B 74: 1–19.

Regis, R. G. and Shoemaker, C. A. (2013). Combining radial basis function surrogatesand dynamic coordinate search in high-dimensional expensive black-box optimization,Engineering Optimization 45(5): 529–555.

Reibman, A. (1991). A splitting technique for Markov chain transient solution, in W. J. Stewart (ed.), Numerical solution of Markov chains, Marcel Dekker, Inc, New York, USA, chapter 19, pp. 373–400.

Richards, P. I. (1956a). Shock waves on highways, Operations Research 4(1): 42–51.

Richards, P. I. (1956b). Shock waves on the highway, Operations Research 4(1): 42–51.

Robbins, H. and Monro, S. (1951). A stochastic approximation method, The Annals of Mathematical Statistics pp. 400–407.

Ross, P. (1988). Traffic dynamics, Transportation Research Part B 22(6): 421–435.

Roth, E. (1994). The relaxation time heuristic for the initial transient problem in M/M/k queueing systems, European Journal of Operational Research 72(2): 376–386.

Sayegh, A. S., Connors, R. D. and Tate, J. E. (2017). Uncertainty propagation from the cell transmission traffic flow model to emission predictions: a data-driven approach, Transportation Science 52(6): 1327–1346.

Shi, L. and Ólafsson, S. (2000). Nested partitions method for global optimization, Operations Research 48(3): 390–407.

Stafford, R. (2006). Random vectors with fixed sum. Accessed June 1, 2015. URL: http://www.mathworks.com/matlabcentral/fileexchange/9700

Sumalee, A., Zhong, R. X., Pan, T. L. and Szeto, W. Y. (2011). Stochastic Cell Transmission Model (SCTM): a stochastic dynamic traffic model for traffic state surveillance and assignment, Transportation Research Part B 45(3): 507–533.

Sztrik, J. (2012). Basic queueing theory, University of Debrecen, Debrecen, Hungary, chapter 2.4, pp. 32–37. Accessed July 20, 2018. URL: https://pdfs.semanticscholar.org/848f/a1f48ad9d3edb24b05667f15cfc633eb8f69.pdf

Tampère, C., Corthout, R., Cattrysse, D. and Immers, L. (2011). A generic class of first-order node models for dynamic macroscopic simulations of traffic flows, Transportation Research Part B 45(1): 289–309.

Trafficware (2011). Synchro Studio 8 User Guide, Trafficware, Sugar Land, TX.

Transport for London (2010). Traffic modelling guidelines. Version 3.0, Technical report, Transport for London (TfL).

Tsai, S. C. and Fu, S. Y. (2014). Genetic-algorithm-based simulation optimization considering a single stochastic constraint, European Journal of Operational Research 236(1): 113–125.

TSS (2014). AIMSUN 8.1 Microsimulator Users Manual, Transport Simulation System.

U.S. Department of Transportation (2008). Transportation vision for 2030, Technical report, U.S. Department of Transportation (DOT), Research and Innovative Technology Administration.

van Doorn, E. A. and Zeifman, A. I. (2009). On the speed of convergence to stationarity of the Erlang loss system, Queueing Systems 63(1-4): 241.

van Zuylen, H. J. and Viti, F. (2003). Uncertainty and the dynamics of queues at controlled intersections, International Federation of Automatic Control (IFAC) Proceedings 36(14): 43–48.

Viti, F. and Van Zuylen, H. J. (2010). Probabilistic models for queues at fixed control signals, Transportation Research Part B 44(1): 120–135.

Wang, Z. and Ierapetritou, M. (2018). Constrained optimization of black-box stochastic systems using a novel feasibility enhanced kriging-based method, Computers & Chemical Engineering 118: 210–223.

Xu, J., Nelson, B. L. and Hong, J. (2010). Industrial strength compass: A comprehensive algorithm and software for optimization via simulation, ACM Transactions on Modeling and Computer Simulation (TOMACS) 20(1): 3.

Xu, J., Nelson, B. L. and Hong, L. J. (2013). An adaptive hyperbox algorithm for high-dimensional discrete optimization via simulation problems, INFORMS Journal on Computing 25(1): 133–146.

Xu, W. L. and Nelson, B. L. (2013). Empirical stochastic branch-and-bound for optimization via simulation, IIE Transactions 45(7): 685–698.

Yperman, I., Tampere, C. and Immers, B. (2007). A kinematic wave dynamic network loading model including intersection delays, Transportation Research Board 86th Annual Meeting, Washington DC, USA.

Zheng, F., Jabari, S. E., Liu, H. X. and Lin, D. (2018). Traffic state estimation using stochastic Lagrangian dynamics, Transportation Research Part B 115: 143–165.

Zhou, T., Fields, E. and Osorio, C. (2019). Large-scale data-driven simulation-based car-sharing network design, Submitted to Transportation Research Part B. Available at: http://web.mit.edu/osorioc/www/papers/zhoOsoFieCarSharing.pdf
