Probabilistic Models and Optimization Algorithms for Large-scale Transportation Problems
by
Jing Lu
B.A., New York University (2014)
Submitted to the Sloan School of Management
in partial fulfillment of the requirements for the degree of
Doctor of Philosophy in Operations Research
at the
MASSACHUSETTS INSTITUTE OF TECHNOLOGY
February 2020
© Massachusetts Institute of Technology 2020. All rights reserved.
Author: Sloan School of Management
January 17, 2020
Certified by: Carolina Osorio
Associate Professor of Civil and Environmental Engineering
Thesis Supervisor
Accepted by: Patrick Jaillet
Dugald C. Jackson Professor
Department of Electrical Engineering and Computer Science
Co-Director, Operations Research Center
Probabilistic Models and Optimization Algorithms for Large-scale
Transportation Problems
by
Jing Lu
Submitted to the Sloan School of Management
on January 17, 2020, in partial fulfillment of the
requirements for the degree of
Doctor of Philosophy in Operations Research
Abstract
This thesis tackles two major challenges of urban transportation optimization problems: (i) high-dimensionality and (ii) uncertainty in both demand and supply. These challenges are addressed from both modeling and algorithm design perspectives.
The first part of this thesis focuses on the formulation of analytical transient stochastic link transmission models (LTM) that are computationally tractable and suitable for large-scale network analysis and optimization. We first formulate a stochastic LTM based on the model of Osorio and Flötteröd (2015). We propose a formulation with enhanced scalability. In particular, the dimension of the state space is linear, rather than cubic, in the link’s space capacity. We then propose a second formulation that has a state space of dimension two; it scales independently of the link’s space capacity. Both link models are validated versus benchmark models, both analytical and simulation-based. The proposed models are used to address a probabilistic formulation of a city-wide signal control problem and are benchmarked versus other existing network models. Compared to the benchmarks, both models derive signal plans that perform systematically better considering various performance metrics. The second model, compared to the first model, reduces the computational runtime by at least two orders of magnitude.
The second part of this thesis proposes a technique to enhance the computational efficiency of simulation-based optimization (SO) algorithms for high-dimensional discrete SO problems. The technique is based on an adaptive partitioning strategy. It is embedded within the Empirical Stochastic Branch-and-Bound (ESB&B) algorithm of Xu and Nelson (2013). This combination leads to a discrete SO algorithm that is both globally convergent and has good small sample performance. The proposed algorithm is validated and used to address a high-dimensional car-sharing optimization problem.
Thesis Supervisor: Carolina Osorio
Title: Associate Professor of Civil and Environmental Engineering
Acknowledgments
Firstly, I would like to thank Carolina Osorio for being not only an advisor but also a
mentor and role model to me during my time at MIT. It has been my pleasure to work with her
for the last 5 years. I am constantly inspired by her vision, ambitions, and positivity. Her
expertise, her advice, her critical thinking, and her positive attitude towards everything
have had an undeniable influence on me. Carolina is always thoughtful and encouraging, which
made our meetings not only fruitful but also enjoyable. She helped to keep me motivated
and gave me the courage to keep trying after failures. Without her help, I could not have
overcome all the difficulties during this journey.
I would like to thank Professor Patrick Jaillet and Professor Saurabh Amin for serving
as the other two members of my thesis committee, and for their useful feedback and insights
during my committee meetings and multiple one-on-one meetings. In addition, I would like to
thank Professor Saurabh Amin for his help and advice in choosing a career path. I would
also like to thank Professor Richard Larson for serving on my general exam committee and for
sharing his vision, life experiences, and amazing stories with me during meetings in his fancy
office.
Additionally, many thanks to Roberta Pizzinato for being so helpful with scheduling
meetings and travel arrangements. I would also like to thank ORC staff Laura Rose and
Andrew Carvalho for their administrative assistance.
I am extremely grateful to my colleagues at MIT. I would especially like to thank my fellow
labmates in the Osorio lab: Linsen Chong, Tianli Zhou, Chao Zhang, Kevin Zhang, Timothy
Tay, and Evan Fields, without whom I would not have gone so far. Special thanks also
go to my friends at MIT: Manxi Wu, Haizheng Zhang, Daisy Zhou, Fangzhou Lu, Haihao
Lu, Yuchen Wang, Shujing Wang, Yiqun Hu, Shu Ma, Junbin Huang, Li Wang, Rong Yuan,
and many others for accompanying me during this long journey. Special thanks also go to
my friends outside MIT: Ning Hua, Ju Wang, Xiaoli Xu, Yun Zhang, and many others for
supporting me through ups and downs. All of you are really like family to me. Our exploration
of fun places and restaurants around the greater Boston area has become an integral
part of my life. My thanks also go to the colleagues I worked with during my summer
internship at Cruise Automation, John Khawam, Sean Skwerer, Chuoran Wang, Michael
McCoy, and many others, for the insightful discussions on the future of autonomous vehicles.
I learned a lot from this precious opportunity.
I am much obliged to Professor Joel Spencer, my mentor during my undergraduate years
at New York University, who brought me into the world of mathematics and has continuously
supported me to this day. Without him, I would not have achieved what I have today.
Last, but certainly not least, I would like to acknowledge and thank my family for their
unconditional trust, support, and encouragement. Special thanks must be given to my
mother, Jue Sun. This thesis is dedicated to you.
This work is partially supported by the U.S. National Science Foundation under Grant
No. 1562912. Any opinions, findings, and conclusions or recommendations expressed in
this material are those of the authors and do not necessarily reflect the views of the National
Science Foundation.
Contents
1 Introduction
1.1 Motivation and objective
1.1.1 Stochastic traffic flow modeling
1.1.2 Adaptive partitioning strategy for discrete SO problems
1.2 Contributions
1.3 Structure of the Thesis
2 Analytical Probabilistic Link Transmission Model With Linear Complexity
2.1 Introduction
2.2 Link model formulation
2.2.1 Multivariate link model
2.2.2 Univariate link models
2.2.3 Univariate upstream queue model
2.2.4 Univariate downstream queue model
2.2.5 Mixture model
2.3 Validation
2.4 Network analysis
2.4.1 City-scale signal control
2.4.2 Numerical analysis
2.4.3 Comparison to signal plans derived by commercial signal control software
2.5 Conclusions
3 Analytical Probabilistic Link Transmission Model With Constant Complexity
3.1 Introduction
3.2 Past link model formulations
3.3 Proposed link model formulation
3.3.1 Downstream boundary conditions
3.3.2 Upstream boundary conditions
3.4 Validation
3.4.1 Validation versus a stochastic link transmission model simulator
3.4.2 Validation versus a microscopic traffic simulator
3.5 Optimization case study
3.5.1 City-scale signal control
3.5.2 Numerical results
3.6 Conclusions
4 Adaptive Partitioning Strategy for High-Dimensional Discrete Simulation-based Optimization Problems
4.1 Introduction
4.2 ESB&B framework
4.3 Adaptive partitioning strategy
4.3.1 Parallel partition
4.3.2 Hyperplane partition
4.3.3 Adaptive partitioning ESB&B algorithm
4.4 Numerical examples
4.4.1 The Griewank function
4.4.2 The car-sharing fleet assignment problem
4.5 Conclusion
5 Conclusions
A Appendices of Chapter 2
A.1 Estimation of the weight parameter w
A.2 Tables of time-average JSD metric
B Appendices of Chapter 3
B.1 Property: τDQ(k) = λDQ(k) when µ(k) = 0
B.2 Calculation of lim_{µ(k)→0} τDQ(k) of Equation (3.15)
B.3 Estimation of the scalar coefficients in τDQ(k)
B.4 Variance of the sojourn time of DQ(k)
B.5 Estimation of the scalar coefficients in τUQ(k)
B.6 Tables of mean absolute differences
List of Figures
2-1 Link dynamics of the multivariate link model
2-2 Model runtime comparison
2-3 Experiment 1: impact of the temporal variation of demand on the distributions, as well as the expected values, of UQ and of DQ
2-4 Experiment 2: impact of the temporal variation of demand on the distributions, as well as the expected values, of UQ and of DQ
2-5 Comparison of the JSD values for the 21 experiments with time-independent demand
2-6 Comparison of the JSD values for the 21 experiments with time-independent demand (zoomed-in results)
2-7 Lausanne city road network (adapted from Dumont and Bert (2006))
2-8 Lausanne network model
2-9 Cumulative distribution functions of the average proportion of time a lane is full considering different initial signal plans
2-10 Cumulative distribution functions of the average lane queue-length considering different initial signal plans
2-11 Cumulative distribution functions of the average trip travel times considering different initial signal plans
2-12 Cumulative distribution functions of the average proportion of time a lane is full
2-13 Cumulative distribution functions of the average lane queue-length
2-14 Cumulative distribution functions of the average trip travel time
3-1 Experiment 1: impact of the temporal variation of demand on the link’s upstream and downstream boundary conditions
3-2 Experiment 2: impact of the temporal variation of demand on the link’s upstream and downstream conditions
3-3 Comparison of the average absolute errors for the 21 experiments with time-varying demand
3-4 Comparison of the computational runtimes for the 21 experiments with time-varying demand
3-5 Microscopic simulation model of a single-lane link
3-6 Comparison of the expected inflow and outflow for the experiment with arrival rate λ = 0.1 veh/sec
3-7 Comparison of the expected inflow and outflow for the experiment with arrival rate λ = 0.2 veh/sec
3-8 Comparison of the expected inflow and outflow for the experiment with arrival rate λ = 0.3 veh/sec
3-9 Comparison of the expected inflow and outflow for the experiment with alternating arrival rate between 0.3 veh/sec and 0 veh/sec
3-10 Microscopic simulation model for platoon arrival experiments
3-11 Comparison of the expected inflow and outflow for the tandem link experiment
3-12 Lausanne network model
3-13 Cumulative distribution functions of the average, over all lanes in the network, proportion of time a lane is full considering different initial signal plans
3-14 Average proportion of time lane is full for ILTM signal plans and proposed signal plans
3-15 Cumulative distribution functions of the average proportion of the lane that is occupied by vehicles considering different initial signal plans
3-16 Cumulative distribution functions of the average trip travel time considering different initial signal plans
3-17 Cumulative distribution functions of the average proportion of time a lane is full
3-18 Cumulative distribution functions of the average proportion of the lane that is occupied by vehicles
3-19 Cumulative distribution functions of the average trip travel time
4-1 The ground truth values of f(x1, x2) in the current best subregion.
4-2 The sampled solutions of f(x1, x2) in the current best subregion.
4-3 Different partitions of the current best subregion.
4-4 The contour plot of two-dimensional Griewank function on [−5, 5] × [−5, 5].
4-5 Objective function estimate of the current iterate across iterations.
4-6 Objective function estimate of the current iterate with 95% confidence interval across iterations of ESBB algorithm (zoomed-in results).
4-7 Objective function estimate of the current iterate with 95% confidence interval across iterations of the proposed algorithm (zoomed-in results).
4-8 Distance between current best solution and the global minimum solution across iterations.
4-9 The path of best solution at current iterate across iterations in the feasible domain of the original ESB&B algorithm.
4-10 The path of best solution at current iterate across iterations in the feasible domain of the original ESB&B algorithm (zoomed-in results).
4-11 Allocation of sampling budget in the feasible domain of the original ESB&B algorithm.
4-12 The path of best solution at current iterate across iterations in the feasible domain of the proposed algorithm.
4-13 The path of best solution at current iterate across iterations in the feasible domain of the proposed algorithm (zoomed-in results).
4-14 Allocation of sampling budget in the feasible domain of the proposed algorithm.
4-15 Objective function estimate of the current iterate across iterations averaged over 50 algorithm runs.
4-16 Objective function estimate of the current iterate across iterations averaged over 50 algorithm runs (zoomed-in results).
4-17 The contour plot of two-dimensional Griewank function on [−1, 9] × [−1, 9].
4-18 Objective function estimate of the current iterate across iterations.
4-19 Objective function estimate of the current iterate with 95% confidence interval across iterations of ESBB algorithm (zoomed-in results).
4-20 Objective function estimate of the current iterate with 95% confidence interval across iterations of the proposed algorithm (zoomed-in results).
4-21 Distance between current best solution and the global minimum solution across iterations.
4-22 Objective function estimate of the current iterate across iterations averaged over 50 algorithm runs.
4-23 Objective function estimate of the current iterate across iterations averaged over 50 algorithm runs (zoomed-in results).
4-24 Zipcar stations in Boston South End neighborhood (Google Maps; 2017)
4-25 The objective function estimate of the current iterate across iterations for low-demand experiment.
4-26 Objective function estimate of the current iterate across iterations (averaged over the 5 algorithm runs) for the low-demand experiment.
4-27 The objective function estimate of the current iterate across iterations for high-demand experiment.
4-28 Objective function estimate of the current iterate across iterations (averaged over the 5 algorithm runs) for the high-demand experiment.
List of Tables
2.1 Transition rate table of UQ.
2.2 Transition rate table of DQ.
2.3 Link Parameters
2.4 Parameters for Lausanne case study
3.1 Link parameters
3.2 Link parameters used in the microscopic simulator
3.3 Link parameters used in the proposed model
3.4 Link parameters used in the proposed model
3.5 Average runtime (in min) per iteration of the signal control optimization algorithm
4.1 Performance statistics of the derived final solutions in the low-demand case.
4.2 P-values of the two-sample t-test comparing the solutions derived by ESB&B algorithm and the proposed algorithm with parallel partition in the low-demand experiment.
4.3 P-values of the two-sample t-test comparing the solutions derived by ESB&B algorithm and the proposed algorithm with hyperplane partition in the low-demand experiment.
4.4 P-values of the two-sample t-test comparing the solutions derived by the proposed algorithm with parallel and hyperplane partition in the low-demand experiment.
4.5 Performance statistics of the derived final solutions in the high-demand case.
4.6 P-values of the two-sample t-test comparing the solutions derived by ESB&B algorithm and the proposed algorithm with parallel partition in the high-demand experiment.
4.7 P-values of the two-sample t-test comparing the solutions derived by ESB&B algorithm and the proposed algorithm with hyperplane partition in the high-demand experiment.
4.8 P-values of the two-sample t-test comparing the solutions derived by the proposed algorithm with parallel and hyperplane partition in the high-demand experiment.
A.1 Time-average JSD metric of the UQ distribution. The value NaN denotes cases where the evaluation of the multivariate model exceeded the limit of 40 hours.
A.2 Time-average JSD metric of the DQ distribution. The value NaN denotes cases where the evaluation of the multivariate model exceeded the limit of 40 hours.
B.1 Mean absolute difference eUQ of P(UQ(k) = ℓ). The value NaN denotes cases where the evaluation of the multivariate model exceeded the limit of 40 hours. (Starting empty with time-varying demand over time)
B.2 Mean absolute difference eDQ of P(DQ(k) = 0). The value NaN denotes cases where the evaluation of the multivariate model exceeded the limit of 40 hours. (Starting empty with time-varying demand over time)
Chapter 1
Introduction
1.1 Motivation and objective
Uncertainties exist in many aspects of transportation networks, e.g., in demand and in the
heterogeneity of people’s behavior. Such uncertainties have always created challenges in
large-scale transportation network modeling, operations, and design. State-of-the-art
stochastic models account for one or more of the uncertainties that exist
in transportation networks. These models may represent reality more faithfully, but
at the cost of model complexity, which can impede their practical implementation and
application, especially when computational budgets are limited. As the network increases
in size, analysis and decision-making with such stochastic models become increasingly
challenging.
In this thesis, we address the problem of limited computational resources when applying
stochastic models to large-scale network problems through the following two approaches:
(i) we develop computationally efficient analytical probabilistic transportation network
models that can be evaluated and optimized by well-developed gradient-based optimization
algorithms, and (ii) we develop efficient optimization algorithms that can solve problems
formulated using simulation-based urban mobility models.
1.1.1 Stochastic traffic flow modeling
The vast majority of the literature in the field of analytical traffic flow modeling has focused
on the development and use of deterministic traffic models. There is an increasing interest
in the development of analytical stochastic models. The increase in the quantity, quality and
resolution of traffic data available allows us to validate models that provide a probabilistic
description of traffic. Such models can be used to enhance the reliability and the robustness
of our transportation networks.
Although the research on analytical stochastic traffic modeling is gaining momentum,
many challenges remain to be addressed:
1. Currently, the most popular approach to formulating an analytical stochastic traffic
model is to add a stochastic noise term to a deterministic traffic model (e.g., Boel
and Mihaylova (2006)). However, for such approaches, the expected traffic dynamics
are not guaranteed to be consistent with their deterministic counterparts. Moreover,
Jabari and Liu (2012) argue that, in such approaches, randomness is often applied in an
imperfect and incomplete fashion. This can give rise to negative sample paths
and hence to a misrepresentation of reality.
2. As the size of the network increases, stochastic traffic network models can suffer from
the curse of dimensionality. For instance, consider a network of 100 links in which every
link has 10 different states; the total number of joint states of the network is then 10^100.
Computational efficiency is always a major concern for practical implementations of
stochastic models (Calvert et al.; 2012).
3. There are also challenges such as correlation among variables. Unlike deterministic
models, where one only has to consider the relations among single values,
probabilistic models need to consider the dependencies among the values of random
variables. Not every combination of the values of random variables is feasible. For
instance, consider a link with space capacity ℓ. The random variable that represents
the number of vehicles ready to leave the link is upper bounded by the random variable
that represents the total number of vehicles on the link, although both random
variables have the same support {0, ..., ℓ}. Thus, there is a challenge in developing
probabilistic models that account for the correlation between random variables while
remaining easy to implement and computationally efficient.
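The two counting arguments in the list above can be checked with a few lines of code. The following is a minimal sketch, not part of the proposed models: the function names are hypothetical, and the only inputs are the numbers quoted in the text (100 links, 10 states per link) and an example space capacity ℓ = 10 chosen purely for illustration.

```python
from itertools import product


def joint_state_count(num_links, states_per_link):
    """Number of joint network states if every link has the same number of states."""
    return states_per_link ** num_links


def feasible_pairs(capacity):
    """Pairs (ready_to_leave, total_on_link) on {0, ..., capacity}^2
    that satisfy ready_to_leave <= total_on_link."""
    support = range(capacity + 1)
    return [(r, n) for r, n in product(support, support) if r <= n]


if __name__ == "__main__":
    # Curse of dimensionality: 100 links with 10 states each.
    print(joint_state_count(100, 10) == 10 ** 100)

    # Dependency between variables: both variables share the support
    # {0, ..., ell}, yet only some of the (ell + 1)^2 value pairs are feasible.
    ell = 10  # example capacity, chosen only for illustration
    print(len(feasible_pairs(ell)), "feasible pairs out of", (ell + 1) ** 2)
```

For a capacity of ℓ, only (ℓ + 1)(ℓ + 2)/2 of the (ℓ + 1)^2 value pairs are feasible, which is why a model cannot treat the two variables as independent.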
In this thesis, we address some of these challenges by formulating probabilistic link
transmission models that are (i) consistent with mainstream deterministic traffic flow
theory and (ii) scalable and computationally efficient for large-scale network analysis
and optimization.
1.1.2 Adaptive partitioning strategy for discrete SO problems
This research is motivated by the car-sharing network design problem we worked on with
Zipcar. Over the years, Zipcar has been collecting high-resolution reservation data from its
customers in the Boston market. This raises the question of how to fully utilize such rich
disaggregated data to help operators redesign the service system. Zhou et al. (2019)
consider the fleet assignment problem of a two-way car-sharing system from the perspective
of the service operator, i.e., finding an assignment of a fleet of vehicles across the network
of stations that maximizes the expected profit over a given finite time horizon, referred to
as the planning period. Instead of using a simplified description of demand and of
demand-supply interactions, Zhou et al. (2019) rely on the demand simulator of Fields et al.
(2017), which was developed from the rich high-resolution reservation data, and formulate
the fleet assignment problem as a discrete SO problem. Given a fleet assignment across the
network, the expected profit is estimated as the average profit over simulation runs.
In other words, a closed-form objective function is not available. The dimension of the
problem can be in the hundreds, and the decision variables are discrete (i.e., the number
of vehicles assigned to each station).
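To make the notion of a simulation-based objective concrete, the sketch below estimates the expected profit of a fleet assignment as a sample average over simulation runs. It is only an illustration: `simulate_profit` is a hypothetical toy simulator with made-up revenue, cost, and demand parameters, and it is not the demand simulator of Fields et al. (2017).

```python
import random
from statistics import mean


def simulate_profit(assignment, rng, revenue=10.0, cost=4.0, max_demand=6):
    """Hypothetical toy simulator: one random profit realization.

    Each station draws an integer demand uniformly from {0, ..., max_demand},
    earns `revenue` per served request, and pays `cost` per assigned vehicle.
    """
    profit = 0.0
    for vehicles in assignment:
        demand = rng.randint(0, max_demand)
        profit += revenue * min(vehicles, demand) - cost * vehicles
    return profit


def estimate_expected_profit(assignment, num_runs=1000, seed=0):
    """Sample-average estimate of the expected profit of a fleet assignment."""
    rng = random.Random(seed)
    return mean(simulate_profit(assignment, rng) for _ in range(num_runs))


if __name__ == "__main__":
    # Decision variables are discrete: the number of vehicles at each station.
    for assignment in [(1, 1, 1), (3, 3, 3), (6, 6, 6)]:
        print(assignment, estimate_expected_profit(assignment))
```

Each evaluation of the objective costs `num_runs` simulator calls and returns a noisy estimate, which is what makes the number of evaluated solutions (the sampling budget) the scarce resource in discrete SO.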
For discrete SO problems with a large number of feasible points, existing algorithms
focus mainly on asymptotic global convergence to the optimal solution (e.g., Tsai and
Fu (2014)). Methods that aim to identify solutions with good performance within small
sampling budgets include the Heuristic Constrained Genetic Algorithm (HCGA) (Tsai and
Fu; 2014) and Industrial Strength COMPASS (ISC) (Xu et al.; 2010). Nevertheless, developing
discrete SO algorithms that can efficiently tackle high-dimensional problems remains a
challenge.
In this thesis, we propose a technique to enhance the computational efficiency of SO
algorithms for high-dimensional discrete SO problems. The technique is based on an adaptive
partitioning strategy. It is embedded within the ESB&B algorithm of Xu and Nelson
(2013). The resulting algorithm has enhanced finite-time (small sampling budget) performance
while maintaining global convergence.
1.2 Contributions
The contributions of this thesis are as follows:
Analytical probabilistic link transmission models
In this thesis, we develop two analytical link transmission models: a mixture model
(Chapter 2) and a two-dimensional model (Chapter 3). Both are transient probabilistic models
that provide a probabilistic description of congestion build-up and dissipation. Thus,
they can be used to derive performance metrics, such as the dynamics of the spillback
probability, the expectation and variance of the queue length, and the travel time, for
sensitivity analysis or for (robust) optimization purposes. Both models can accurately
approximate the link’s boundary conditions in realistic traffic situations, such as the
vehicle platoons caused by signal control.
The proposed mixture model is based on the model of Osorio and Flötteröd (2015),
which is a stochastic formulation of the deterministic link transmission model of Yperman
et al. (2007). It tracks the marginal distributions of the upstream and downstream boundary
conditions over time, and hence it has a model complexity that is linear in the link’s
space capacity. The two-dimensional model additionally builds on the idea of a relaxation
process. It reduces the model complexity to a constant that no longer depends on the link’s
space capacity. This makes the proposed model suitable for large-scale network optimization
or for situations where a large number of model evaluations is required. Both models are
validated versus a simulation-based implementation of the stochastic LTM. They yield
significant gains in computational efficiency while preserving accuracy. The mixture model
yields accurate distributional approximations of the link’s boundary conditions. The
two-dimensional model’s accuracy in approximating the link’s boundary conditions is
comparable to that of Osorio and Flötteröd (2015) and of the mixture model. In the case
study, we further demonstrate the computational efficiency of the models on a large-scale
network signal control problem. The proposed mixture model enables the large-scale network
signal control problem to be solved offline, and the two-dimensional model further reduces
the computational runtime by two orders of magnitude and hence enables it to be solved in
a timely manner.
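As a rough illustration of the scaling described above, the following sketch compares how cubic, linear, and constant state-space sizes grow with the link’s space capacity ℓ. The exact counts used here, (ℓ + 1)^3 for a cubic model, 2(ℓ + 1) for two marginal distributions on {0, ..., ℓ}, and 2 for the two-dimensional model, are illustrative assumptions chosen only to match the stated orders of growth; they are not taken from the model formulations.

```python
def state_space_sizes(capacity):
    """Illustrative state-space sizes for a link with the given space capacity.

    Assumed counts (hypothetical, chosen only to match the stated growth orders):
      cubic    -> (capacity + 1) ** 3
      linear   -> 2 * (capacity + 1), two marginals on {0, ..., capacity}
      constant -> 2, independent of the capacity
    """
    n = capacity + 1
    return {"cubic": n ** 3, "linear": 2 * n, "constant": 2}


if __name__ == "__main__":
    for capacity in (10, 50, 100):
        sizes = state_space_sizes(capacity)
        print(capacity, sizes["cubic"], sizes["linear"], sizes["constant"])
```

Under these assumed counts, a link with capacity 100 would require over a million states in the cubic case, about two hundred in the linear case, and two in the constant case.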
Optimization algorithm
In this thesis, we propose a technique to enhance the computational efficiency of SO al-
gorithms for high-dimensional discrete SO problems (Chapter 4). The technique is based
on an adaptive partitioning strategy, which can incorporate problem-specific structures known
a priori, such as the clustering effect in car-sharing fleet assignment problems, to further
enhance the algorithm’s efficiency. The proposed adaptive partitioning strategy is
integrated in the ESB&B framework (Xu and Nelson; 2013). The combination leads to a
general-purpose discrete SO algorithm. The resulting algorithm preserves global
convergence under infinite simulation effort and performs well under small sampling budgets.
The proposed algorithm is validated and used to address a high-dimensional car-sharing
optimization problem.
Applications
• Signal control problem. The proposed analytical probabilistic link transmission
models can be used in a variety of network analysis and optimization problems. In
the case studies of this thesis, we address a continuous optimization problem, which
is a fixed-time signal control problem of the Swiss city of Lausanne. The network
consists of 603 links, 902 lanes, and 231 intersections. Among all the intersec-
tions, 17 signalized intersections distributed through the network are to be optimized,
which results in a decision variable of dimension 99. This is considered a large-scale
signal control optimization problem (Osorio and Chong; 2015).
The proposed models (the mixture and two-dimensional models) are used to address
this city-wide signal control problem. The proposed models are benchmarked with
a deterministic link transmission model (e.g., intelligent link transmission model
(ILTM) (Himpe et al.; 2016)). The signal plans derived from the proposed mixture
and two-dimensional models have similar performance considering various metrics.
They systematically outperform the initial signal plans, the signal plans derived by the
ILTM, and a signal plan proposed by a widely used commercial software package.
Compared to the mixture model, the two-dimensional model reduces the computa-
tional runtime by at least two orders of magnitude.
• Car-sharing network design. The two-way car-sharing fleet assignment problem
of Zipcar in Boston is formulated as a discrete SO problem, in which the objective
function (i.e., expected profit) does not have a closed form and can only be
estimated via simulation. The proposed adaptive partitioning strategy is used to solve
the formulated discrete SO problem. We show that the proposed algorithm has
better finite-time performance than the original ESB&B algorithm. We demonstrate
the flexibility of the proposed adaptive partitioning strategy in incorporating
problem-specific structures known a priori, i.e., the clustering effect based on the locations
of the stations. The proposed algorithm with prior knowledge
identifies solutions with improved performance faster than the proposed algorithm
without prior knowledge.
1.3 Structure of the Thesis
A chapter by chapter description of the thesis follows.
Chapter 2 proposes an analytical probabilistic link transmission model. The proposed
model builds upon the multivariate model of Osorio and Flötteröd (2015). It proposes
a formulation with enhanced scalability. In particular, the proposed model has a
complexity that is linear in the link’s space capacity. The method of this chapter has
been presented and published as:
Lu, J. and Osorio C. (2018). A probabilistic traffic-theoretic network loading
model suitable for large-scale network analysis, Transportation Science 52(6):1509-
1530.
Lu, J. and Osorio C. (2016, Sept 15). A probabilistic traffic theoretic and scal-
able network loading model, European Association for Research in Transportation
(hEART), TU Delft, The Netherlands.
Lu, J. and Osorio C. (2015, Nov 4). Analytical stochastic link transmission model
suitable for large-scale analysis, Proceedings of the 2015 INFORMS Annual Meet-
ing, Philadelphia, Penn, USA.
Chapter 3 proposes a relaxation approximation of the analytical probabilistic link trans-
mission model formulated in Chapter 2. It is a stochastic formulation with a constant
model complexity. This makes it suitable for large-scale network optimization with
tight time budgets. The method of this chapter has been submitted for journal
publication. Preliminary results of this method have been presented as:
Lu, J. and Osorio C. (2018, Jun 7). A probabilistic analytical traffic-theoretic net-
work loading model for large-scale network optimization, Proceedings of the 7th
international symposium on dynamic traffic assignment Smart Transportation, Hong
Kong University, HK, China.
Chapter 4 proposes an adaptive partitioning strategy to enhance the computational ef-
ficiency of SO algorithms for high-dimensional discrete SO problems. It is inte-
grated in the ESB&B framework (Xu and Nelson; 2013). The combination leads to
a general-purpose discrete SO algorithm, which preserves global convergence under
infinite simulation efforts and has a good small sample (finite time) performance.
Chapter 5 summarizes this thesis and includes future research directions.
Appendix A contains the appendices of Chapter 2.
Chapter 2
Analytical Probabilistic Link
Transmission Model With Linear
Complexity
This chapter presents an analytical stochastic link transmission model. It is a stochastic
formulation of the link transmission model (LTM), which itself is an operational formula-
tion of Newell’s simplified theory of kinematic waves. The proposed model builds upon
the multivariate model of Osorio and Flötteröd (2015). It proposes a formulation with en-
hanced scalability. In particular, compared to the multivariate model, it has a complexity
that is linear, rather than cubic, in the link’s space capacity. This makes it suitable for large-
scale network analysis. The method of this chapter has been published as: Lu, J. and
Osorio C. (2018). A probabilistic traffic-theoretic network loading model suitable for
large-scale network analysis, Transportation Science 52(6):1509-1530.
2.1 Introduction
This chapter focuses on the formulation of stochastic (i.e., probabilistic) network loading
models of road traffic. The vast majority of the literature in the field of traffic flow theory
has focused on the development of deterministic traffic models. Interest in the development
of analytical stochastic models has recently been renewed, arguably
triggered by both: (i) the interest of major transportation agencies around the world in
estimating and improving the robustness and reliability of their networks (Transport for
London; 2010; U.S. Department of Transportation; 2008); and (ii) the availability of
high-resolution traffic data, which enables the validation of more detailed models.
In a transportation network, there are sources of uncertainty both in supply (e.g., weather)
and in demand (e.g., spatial and temporal distribution of travel demand, heterogeneous
population of travelers). Recent studies that review sources of, and modeling approaches to,
demand and supply uncertainty include Sumalee et al. (2011) and Lam et al. (2008). For instance,
in the field of microscopic travel demand modeling, a variety of probabilistic models have
been developed to account for uncertainties in various travel choices such as: departure
time, mode, route, etc. In the field of macroscopic modeling, the variability (or scatter) in
the fundamental diagrams has led the community to develop probabilistic models to better
interpret and fit field data. A review of recent approaches to model, or account for, the
variability in fundamental diagrams is given in Sumalee et al. (2011) and in Jabari et al.
(2014b). For instance, the work of Heidemann (2001) uses a probabilistic non-stationary
(i.e., transient) traffic model to interpret hysteresis loops and the case study in Sumalee
et al. (2011) uses a probabilistic model to improve the fit of a fundamental diagram with
high scatter. Nonetheless, there is a lack of probabilistic traffic models that are both: (i)
consistent with mainstream traditional deterministic traffic flow theoretic models, and (ii)
tractable enough to enable the efficient analysis and optimization of large-scale networks.
The main contribution of this chapter is to formulate a probabilistic link model that
is both: (i) consistent with mainstream deterministic traffic flow theory; and (ii)
computationally tractable enough to enable large-scale network analysis.
Jabari (2012) and Laval and Chilukuri (2014) provide reviews of stochastic traffic flow
theoretic models. Recent formulations include those derived from the variational theory of
Daganzo (2005): e.g., Deng et al. (2013); Laval and Chilukuri (2014); Laval and Castril-
lón (2015). The most popular approach to stochastic traffic modeling is the formulation
of stochastic cell-transmission models (CTMs; e.g., Boel and Mihaylova; 2006; Sumalee
et al.; 2011; Jabari and Liu; 2012). The approach of Boel and Mihaylova (2006) is an ex-
ample of the most common approach to stochastic CTM models in that it adds Gaussian
noise terms to the deterministic formulation. This contributes to model tractability yet does
not guarantee expected (i.e., average) traffic dynamics consistent with the CTM dynamics.
The implications of this are further discussed in Jabari and Liu (2012). The model of Jabari
and Liu (2012) considers stochastic vehicle headways. It allows for a variety of headway
distributions and has a fluid limit approximation that is consistent with the CTM. Boel and
Mihaylova (2006) and Jabari and Liu (2012) are sampling-based approaches, which can
become computationally intensive for large-scale networks. Jabari and Liu (2013) pro-
pose a second-order Gaussian approximation of the model of Jabari and Liu (2012) that
can be evaluated without sampling. The CTM is a space-discretized approximation of the
kinematic wave model (KWM; Lighthill and Whitham (1955); Richards (1956a)), hence a
stochastic CTM formulation does not guarantee consistency with the KWM.
The recent work of Osorio and Flötteröd (2015) extends the model of Osorio et al.
(2011) and proposes a link model that is a stochastic formulation of the deterministic link
transmission model of Yperman et al. (2007), which itself is an operational formulation
of Newell’s simplified theory of kinematic waves (Newell; 1993). The model considers an
isolated link and derives an analytical description of the transient (i.e., time-dependent) dis-
tribution of link boundary conditions. It yields the joint distribution of the link’s upstream
and downstream boundary conditions. Hence, it provides a higher-order (i.e., beyond first-
order) description of within-link dependencies. The model represents the link as a set of
three finite space capacity stochastic queues. For a link with space capacity ℓ, the
dimension of the state space of the joint distribution is (1/6)(ℓ + 1)(ℓ² + 2ℓ + 6). In other words, the
model complexity is of order O(ℓ³).
This chapter formulates a link model with a complexity that is linear, rather than cubic,
in the link’s space capacity, i.e., the proposed model has O(ℓ) complexity. It is therefore
scalable and appropriate for large-scale network analysis. The proposed model is derived
from the model of Osorio and Flötteröd (2015). It is therefore a stochastic formulation of
Newell’s simplified theory of kinematic waves (Newell; 1993).
Section 2.2 formulates the proposed model. The model is validated (Section 2.3) and
used to address a large-scale signal control problem (Section 2.4). Conclusions and a dis-
cussion of ongoing work are presented in Section 2.5. The Appendices contain additional
numerical validation results.
2.2 Link model formulation
2.2.1 Multivariate link model
We outline here the main ideas of the model of Osorio and Flötteröd (2015). Hereafter,
we refer to the Osorio and Flötteröd (2015) model as the multivariate link model. For
a description of how this model relates to Newell’s simplified theory of kinematic waves
or to the operational formulation of Yperman et al. (2007), we refer the reader to Osorio
and Flötteröd (2015). Consider a link with a triangular fundamental diagram, free flow
velocity v, backward wave speed w (negative), flow capacity q, jam density ρ, and link
length L. The process that vehicular traffic flow goes through within the link is described
as follows. Upon entrance to the link, it is delayed by L/v time units. It is then ready
for departure, and enters the physical vehicular queue downstream, if one exists. Upon
departure from the link, there is an additional delay of L/|w| before the newly freed
space becomes available at the upstream end of the link. This delay represents the time it takes a
kinematic backward wave to traverse the link. The multivariate model is a continuous-
space discrete-time model, where L/v (resp. L/|w|) is rounded to the integer kfwd (resp.
kbwd).
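The discretization of the two lags can be sketched as follows. This is a minimal illustration, not the implementation used in the thesis: the function name `link_lags` and its argument names are hypothetical, and the use of the ceiling (rather than another rounding rule) is an assumption taken from step 3 of Algorithm 1.

```python
import math

def link_lags(length, v, w, rho_jam, delta):
    """Return (ell, k_fwd, k_bwd) for a link with a triangular fundamental diagram.

    length: link length L; v: free flow velocity; w: backward wave speed (negative);
    rho_jam: jam density; delta: duration of one time interval.
    """
    ell = round(rho_jam * length)                 # space capacity: rounded version of rho * L
    k_fwd = math.ceil(length / (v * delta))       # free-flow lag L / v, in time intervals
    k_bwd = math.ceil(length / (abs(w) * delta))  # backward-wave lag L / |w|, in time intervals
    return ell, k_fwd, k_bwd
```

For instance, a 200 m link with v = 10 m/s, w = -5 m/s, jam density 0.15 veh/m and δ = 5 s would give ℓ = 30, kfwd = 4 and kbwd = 8 under these assumptions.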
This process is summarized in Figure 2-1. During time interval k, the link has an
expected inflow (resp. outflow) denoted qin(k) (resp. qout(k)). The delay incurred upon
entrance to the link is represented by the lagged inflow queue, denoted LI. In discrete
time, LI can be thought of as a set of kfwd cells. One can think of this delay as if the flow
traveled sequentially from the first until the kfwdth cell of LI. This last cell of LI is denoted
LLI in Figure 2-1. This cell configuration of LI is merely a representation: the multivariate
model describes LI in aggregate, i.e., it is not decomposed into individual cells. After this
delay, the flow enters the downstream queue, denoted DQ. The departure of flow from
the link triggers two events: the flow departs DQ (in a network setting, it would enter a
downstream link) and it enters the lagged outflow queue, denoted LO. The purpose of LO
is to capture the kinematic backward wave delay. One can think of this delay as if the newly
available space traveled sequentially from the first until the kbwdth cell of LO. This last cell
of LO is denoted LLO in Figure 2-1. The multivariate link model accounts for stochasticity
in the link’s arrival and departure processes. Time-dependent (i.e., inhomogeneous, non-
homogeneous) finite-state birth-death processes are assumed. This leads to stochastic link
flows, to stochastic cumulative flows both upstream and downstream of the link, and hence,
to a stochastic description of link states.
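[Figure 2-1: Link dynamics of the multivariate link model. Inflow qin(k) enters the Lagged Inflow Queue (LI; lag of L/v time units; cells 1, 2, ..., kfwd, the last denoted LLI), then the Downstream Queue (DQ), and leaves as outflow qout(k); departing flow also enters the Lagged Outflow Queue (LO; lag of L/|w| time units; cells 1, 2, ..., kbwd, the last denoted LLO).]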
The multivariate model jointly tracks the dynamics between the three queues LI, DQ,
and LO. It also defines the upstream queue, UQ, as:
UQ = LI+DQ+ LO. (2.1)
More specifically, the model is a discrete-time model. We denote LI(t;k) (resp. DQ(t;k),
LO(t;k), UQ(t;k)) as the number of vehicles in LI (resp. DQ,LO,UQ) at continuous
time t within discrete time interval k of duration δ. The model yields the joint distribution
P(LI(t;k), DQ(t;k), LO(t;k), UQ(t;k)). The linear equality (2.1) implies that this
four-dimensional joint distribution
can be obtained by tracking three of the four variables. The model implementation of
Osorio and Flötteröd (2015) tracks LI, DQ and LO. For a given link with space capacity ℓ
(which is defined as a rounded version of ρL), the state space is defined by {(li, dq, lo) ∈
[0, ℓ]³ : li + dq + lo ≤ ℓ}. The state space dimension is (1/6)(ℓ + 1)(ℓ² + 2ℓ + 6).
In this chapter, we propose a formulation with a state space dimension that is linear,
instead of cubic, in the space capacity while still providing a detailed representation of the
within-link dependencies. This enables its use for the efficient analysis and optimization of
large-scale networks.
2.2.2 Univariate link models
Hereafter, unless necessary, we drop the time dependency notation and use LI or LI(k) to
denote LI(t;k). We do the same for DQ, LO and UQ. The main insights of the multivariate
model that underlie the newly proposed formulation are the following: UQ provides a
detailed description of the link’s upstream boundary conditions, while DQ provides a
detailed description of the link’s downstream boundary conditions. One approach would be to
propose a model that jointly describes (UQ,DQ). This would improve model scalability
by going from a three- to a two-dimensional state space. The idea considered in this chapter
goes even further: it proposes a univariate (i.e., one-dimensional) state space, which leads
to a scalable formulation. We consider the following two independent univariate models.
• One model of UQ. Its purpose is to accurately capture the link’s upstream boundary
conditions.
• One model of DQ. Its purpose is to accurately capture the link’s downstream bound-
ary conditions.
The proposed model is then defined as a mixture of these two independent univariate mod-
els. There is significant dependency between the upstream and the downstream boundary
conditions of a link, as illustrated by Equation (2.1). In other words, there is dependency
between the dynamics of UQ and of DQ. The numerical case studies in Osorio and Flöt-
teröd (2015) analyze this dependency in more detail. The main challenge addressed in this
chapter is therefore to develop independent univariate models of UQ and of DQ while still
capturing the dependency between the link’s upstream and downstream boundary condi-
tions.
Consider an isolated link with space capacity ℓ, an inhomogeneous Poisson arrival pro-
cess with exogenous arrival rate λ(k) (time is indexed by k), and exponentially distributed
service times at the downstream end of the link with exogenous downstream bottleneck
flow capacity µ(k). For this isolated link, Section 2.2.3 (resp. 2.2.4) formulates a uni-
variate model that tracks the distribution of UQ (resp. DQ) over time. Section 2.2.5 then
formulates the proposed mixture model, which combines the UQ and the DQ models.
2.2.3 Univariate upstream queue model
This section formulates a univariate model of UQ. Following the approach in Osorio and
Flötteröd (2015), UQ is modeled as a birth-death process with a finite state space defined
by {uq ∈ [0, ℓ]}. For time interval k of duration δ, the transient probability distribution
of UQ satisfies a system of linear differential equations with solution defined by (see, for
instance, Reibman (1991) for details):
\[ P(\mathrm{UQ}(t;k)) = P(\mathrm{UQ}(0;k))\, e^{t\,Q_{\mathrm{UQ}}(k)} \quad \forall\, t \in [0, \delta], \tag{2.2} \]
where P(UQ(0;k)) are the initial conditions at the beginning of time interval k, and
QUQ(k) is the transition rate matrix of UQ.
The initial conditions are given by ensuring continuity at the start of the time interval:
P(UQ(0;k)) = P(UQ(δ;k− 1)). (2.3)
Let P(UQ(k)) denote the UQ distribution at the end of time interval k, i.e., it is a
simplified notation for P(UQ(δ;k)). Equations (2.2) and (2.3) can be combined to obtain
the equation that yields the distribution of UQ at the end of the time interval:
\[ P(\mathrm{UQ}(k)) = P(\mathrm{UQ}(k-1))\, e^{\delta\,Q_{\mathrm{UQ}}(k)}. \tag{2.4} \]
Equation (2.4) states that in order to approximate the transient distribution of UQ, we
need to approximate the transition rate matrix QUQ(k). Table 2.1 defines the non-diagonal
and non-null elements of the transition rate matrix. This table considers for an arbitrary
initial state uq of UQ (displayed in column 1), the feasible instantaneous transitions that
can take place to new states (column 2), the rate at which the transitions take place (column
3), and the conditions on the initial states needed for the transitions to be feasible (column
4). For UQ, there are two types of events that trigger state changes. The first is flow arrival
to the link. This is described in the first row of the table. This row states that arrivals to
the link lead to an increase in the state of UQ (i.e., the new state is uq + 1), this occurs
with rate λ(k) and can occur as long as UQ is not full (i.e., there is available space at the
upstream end of the link: uq < ℓ). The second type of event is a flow departure from UQ;
these are described by the second row of the table. They occur at rate µUQ(uq;k) and can
occur as long as UQ is not empty (i.e., uq > 0). The diagonal elements of the transition
rate matrix, QUQ(k)ss, are derived from the non-diagonal elements by:
\[ Q_{\mathrm{UQ}}(k)_{ss} = -\sum_{j \neq s} Q_{\mathrm{UQ}}(k)_{sj}. \tag{2.5} \]
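As a small illustration (an example added here, with ℓ = 2 chosen arbitrarily), the transition rates of Table 2.1 together with Equation (2.5) would give the following transition rate matrix, with states ordered uq = 0, 1, 2:

```latex
\[
Q_{\mathrm{UQ}}(k) =
\begin{pmatrix}
-\lambda(k) & \lambda(k) & 0 \\
\mu_{\mathrm{UQ}}(1;k) & -\bigl(\lambda(k) + \mu_{\mathrm{UQ}}(1;k)\bigr) & \lambda(k) \\
0 & \mu_{\mathrm{UQ}}(2;k) & -\mu_{\mathrm{UQ}}(2;k)
\end{pmatrix}
\]
```

Each row sums to zero, and the matrix is tridiagonal with ℓ + 1 rows, one per state of UQ.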
Table 2.1 states that the univariate model of UQ depends on two rates: (i) λ(k), which
for an isolated link is an exogenous rate, and (ii) µUQ(uq;k). The latter is referred to as
the service rate of UQ. It is an endogenous rate. We now formulate its approximation.
initial state   new state   rate          condition
uq              uq + 1      λ(k)          uq < ℓ
uq              uq − 1      µUQ(uq;k)     uq > 0
Table 2.1: Transition rate table of UQ.
Service rate of UQ
Recall from Section 2.2.1 and Figure 2-1 that departures from UQ correspond to flow that
leaves the last cell of LO. In Figure 2-1, this last cell is the kbwdth cell denoted LLO.
Therefore, the number of departures from UQ during time interval k is a random variable
and it can be expressed as LLO(k)|UQ(k) = uq. Let E[Tm(k)] denote the expected
inter-departure time from UQ conditional on there being a total of m departures during
time interval k. By definition, service rate is the inverse of expected time between consec-
utive departures. The service rate of UQ conditional on UQ = uq, µUQ(uq;k), can be
approximated as follows:
\[ \mu_{\mathrm{UQ}}(uq;k) \approx \sum_{m=1}^{uq} \frac{1}{E[T_m(k)]}\, P(\mathrm{LLO}(k) = m \mid \mathrm{UQ}(k) = uq). \tag{2.6} \]
Equation (2.6) approximates µUQ(uq;k) as the mean inverse of expected inter-departure
time from UQ conditional on there being a total of m departures during time interval k. In
order to approximate E[Tm(k)] for m > 0, we use the following property. For a Poisson
process, given that a total number of m arrivals have occurred during a time interval of
duration δ, then the unordered arrival times are independently, uniformly distributed over
the time interval of interest (cf., for instance, Section 2.12.3 of Larson and Odoni (1981)).
Hence, the expected inter-arrival time is δ/m. We approximate the departure process of
UQ as a Poisson process. Therefore, given a total of m departures from UQ during time
interval k, the expected time between consecutive departures is approximated with δ/m.
Equation (2.6) becomes:
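\begin{align}
\mu_{\mathrm{UQ}}(uq;k) &\approx \sum_{m=1}^{uq} \frac{m}{\delta}\, P(\mathrm{LLO}(k) = m \mid \mathrm{UQ}(k) = uq) \tag{2.7} \\
&= \frac{1}{\delta} \sum_{m=0}^{uq} m\, P(\mathrm{LLO}(k) = m \mid \mathrm{UQ}(k) = uq) \tag{2.8} \\
&= \frac{1}{\delta}\, E[\mathrm{LLO}(k) \mid \mathrm{UQ}(k) = uq], \tag{2.9}
\end{align}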
where E[LLO(k) | UQ(k) = uq] represents the expected outflow from UQ during time
interval k, given that UQ is in state uq. The expression for this conditional expectation is
derived as follows.
First, assume UQ to be a Poisson process with rate:
\[ q_{\mathrm{UQ}}(k) = \sum_{r=0}^{k-1} q_{\mathrm{in}}(r) - \sum_{r=0}^{k-k_{\mathrm{bwd}}-1} q_{\mathrm{out}}(r), \tag{2.10} \]
where qin(k) (resp. qout(k)) denotes the instantaneous link inflow (resp. outflow) rate at
the end of time interval k. As in the multivariate model (as well as in its deterministic
counterpart model of Yperman et al. (2007)), we use these instantaneous link inflow and
outflow rates to approximate the expected inflow to (resp. outflow from) the link during
time interval k. In other words, these instantaneous rates qin(k) and qout(k) are held con-
stant throughout the time interval k of duration δ. Equation (2.10) approximates the rate
of the Poisson process UQ as the difference between: (i) all flow that has entered the link
from time interval 0 until the end of time interval k − 1 (this is represented by the first
summation) and (ii) all flow that has left the link from time interval 0 until the end of time
interval k − kbwd − 1 (this is represented by the second summation). Recall that kbwd rep-
resents the number of time intervals needed for a kinematic backward wave to traverse the
link. Therefore, this second summation accounts for this kinematic backward wave delay
by considering all flow that has left the LO queue, and hence has left UQ.
Second, assume LLO and {UQ− LLO} to be two independent Poisson processes. This
simplifying independence assumption neglects the temporal dependency between LLO and
{UQ−LLO}. The numerical validation results of Section 2.3 highlight the small effect this
has on the final model’s accuracy. The rate of LLO is given by:
qLLO(k) = qout(k− kbwd). (2.11)
The term qout(k − kbwd) represents the expected flow that has left the link during time
interval k−kbwd. This leads to UQ being a sum of two independent Poisson processes: LLO
and {UQ− LLO}. Therefore, the conditional distribution of LLO(k) given {UQ(k) = uq}
is binomial with parameters (uq, qLLO(k)/qUQ(k)) (cf., for instance, Section 2.12.4 of
Larson and Odoni (1981)). Hence, the expected number of departures from UQ during
time interval k is approximated with:
\[ E[\mathrm{LLO}(k) \mid \mathrm{UQ}(k) = uq] \approx uq\, \frac{q_{\mathrm{LLO}}(k)}{q_{\mathrm{UQ}}(k)}. \tag{2.12} \]
The accuracy of this approximation depends on the dependency between LLO and {UQ−
LLO}. In particular, we expect it to decrease as congestion level increases.
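The step from Equation (2.8) to Equations (2.9) and (2.12) can be checked numerically with a short sketch. The function names below are hypothetical, and `share` stands for the ratio qLLO(k)/qUQ(k), assumed here to lie in [0, 1]; the code only illustrates that the binomial sum and the closed form agree.

```python
from math import comb

def service_rate_from_pmf(uq, share, delta):
    """Equation (2.8): (1/delta) * sum_m m * P(LLO = m | UQ = uq), with a binomial pmf."""
    mean = sum(m * comb(uq, m) * share**m * (1.0 - share)**(uq - m)
               for m in range(uq + 1))
    return mean / delta

def service_rate_closed_form(uq, share, delta):
    """Equations (2.9) and (2.12): (1/delta) * uq * q_LLO / q_UQ."""
    return uq * share / delta
```

Both functions return the same value up to floating-point error, so only the closed form needs to be evaluated when the transition rate matrix is assembled.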
In summary, given the rates that fully define the transition rate matrix: λ(k) (an exogenous
rate for an isolated link) and µUQ(uq;k) (given by Equations (2.9) and (2.12)), the transient
probability distribution of UQ is obtained by evaluating Equation (2.4).
Expected link inflow and outflow
Given the univariate model of UQ, we now describe how it can be used to compute the
expected inflow and expected outflow of the link during time interval k. An arrival may
enter the link as long as there is space at the upstream end of the link. This happens with
probability P(UQ(k) < ℓ). Hence, the expected inflow to the link is:
qin(k) = λ(k)P(UQ(k) < ℓ). (2.13)
Note that in a full network model (i.e., if we combine the link model with a node model), a
vehicle in an upstream link that cannot enter its desired downstream link because it is full
would wait at its current location until an available space downstream is allocated to it. In
other words, spillbacks occur with probability P(UQ(k) = ℓ). In this chapter we consider
a single link model, hence vehicles that wish to enter the link while it is full are considered
lost demand. If a model with no losses is desired, then an infinite space-capacity queue can
be inserted upstream of the link to capture vehicles that are waiting to enter the link.
Similarly, the probability that there are vehicles ready to leave the link is P(DQ(k) >
0). Thus, the expected outflow from the link is:
qout(k) = µ(k)P(DQ(k) > 0). (2.14)
From Equation (2.4), we obtain the distribution of UQ at the end of time interval k, which
allows us to compute qin(k) through Equation (2.13). In order to compute qout(k), we need
to compute P(DQ(k) > 0). Nonetheless, in this univariate UQ model, we do not track
DQ directly. Let us now describe how it is approximated.
We proceed as above, where we approximate UQ as a sum of independent Poisson
processes, and approximate the distribution of DQ given {UQ = uq} as a binomial with
parameters (uq, qDQ(k)/qUQ(k)), where:
\[ q_{\mathrm{DQ}}(k) = \sum_{r=0}^{k-k_{\mathrm{fwd}}-1} q_{\mathrm{in}}(r) - \sum_{r=0}^{k-1} q_{\mathrm{out}}(r). \tag{2.15} \]
Equation (2.15) considers the expected flow in DQ as the difference between: (i) the sum
of all of the expected inflows into the link from time 0 to time k − kfwd − 1 (i.e., omitting
the flows that are still in LI) and (ii) the sum of all expected outflows out of the link (i.e.,
outflow from time 0 to time k− 1).
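The three flow quantities of Equations (2.10), (2.11) and (2.15) can be computed from the inflow and outflow histories as sketched below. The function name and the list-based storage are assumptions made for illustration; flows with a negative time index are taken to be zero, as in step 5 of Algorithm 1.

```python
def expected_queue_flows(q_in, q_out, k, k_fwd, k_bwd):
    """Return (q_UQ, q_LLO, q_DQ) for time interval k.

    q_in[r] and q_out[r] hold the expected link inflow and outflow of interval r,
    for r = 0, ..., k - 1 (hypothetical storage layout).
    """
    def cumulative(flows, upper):
        # sum_{r=0}^{upper} flows[r]; an empty sum when upper < 0
        return sum(flows[: max(upper + 1, 0)])

    def lagged(flows, r):
        # flow of interval r, zero before the start of the horizon
        return flows[r] if r >= 0 else 0.0

    q_uq = cumulative(q_in, k - 1) - cumulative(q_out, k - k_bwd - 1)  # Eq. (2.10)
    q_llo = lagged(q_out, k - k_bwd)                                   # Eq. (2.11)
    q_dq = cumulative(q_in, k - k_fwd - 1) - cumulative(q_out, k - 1)  # Eq. (2.15)
    return q_uq, q_llo, q_dq
```

The ratios q_llo / q_uq and q_dq / q_uq are then the binomial parameters used in the approximations of this section.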
We obtain P(DQ(k) > 0) as follows:
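\begin{align}
P(\mathrm{DQ}(k) > 0) &= 1 - P(\mathrm{DQ}(k) = 0) \tag{2.16} \\
&= 1 - \sum_{n=0}^{\ell} P(\mathrm{DQ}(k) = 0 \mid \mathrm{UQ}(k) = n)\, P(\mathrm{UQ}(k) = n) \tag{2.17} \\
&\approx 1 - \sum_{n=0}^{\ell} \left(1 - \frac{q_{\mathrm{DQ}}(k)}{q_{\mathrm{UQ}}(k)}\right)^{n} P(\mathrm{UQ}(k) = n), \tag{2.18}
\end{align}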
where the binomial probability mass function was used to derive the last expression. This
approximation is accurate when the dependencies among LI, DQ and LO are weak (e.g., for an
uncongested link).
Marginal distribution of DQ
The univariate UQ model can be used to approximate the entire marginal distribution of
DQ, by proceeding similarly as in the derivation of Equation (2.18). For all i ∈ [0, ℓ]:
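\begin{align}
P(\mathrm{DQ}(k) = i) &= \sum_{n=i}^{\ell} P(\mathrm{DQ}(k) = i \mid \mathrm{UQ}(k) = n)\, P(\mathrm{UQ}(k) = n) \tag{2.19} \\
&\approx \sum_{n=i}^{\ell} \binom{n}{i} \left(\frac{q_{\mathrm{DQ}}(k)}{q_{\mathrm{UQ}}(k)}\right)^{i} \left(1 - \frac{q_{\mathrm{DQ}}(k)}{q_{\mathrm{UQ}}(k)}\right)^{n-i} P(\mathrm{UQ}(k) = n), \tag{2.20}
\end{align}
where $\binom{n}{i}$ denotes the binomial coefficient. Equation (2.20) is obtained by approximating
P(DQ(k) | UQ(k) = n) as a binomial distribution with parameters (n, qDQ(k)/qUQ(k)).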
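A minimal sketch of Equations (2.18) and (2.20) follows. The function names are hypothetical, and clipping the ratio qDQ(k)/qUQ(k) to [0, 1] is a numerical safeguard assumed here rather than something stated in the text.

```python
from math import comb

def _share(q_dq, q_uq):
    # Binomial parameter q_DQ / q_UQ, clipped to [0, 1] (assumption).
    return min(max(q_dq / q_uq, 0.0), 1.0) if q_uq > 0.0 else 0.0

def dq_marginal_from_uq(p_uq, q_dq, q_uq):
    """Approximate P(DQ(k) = i) for i = 0, ..., ell from P(UQ(k)), as in Eq. (2.20)."""
    ell = len(p_uq) - 1
    share = _share(q_dq, q_uq)
    p_dq = [0.0] * (ell + 1)
    for n, p_n in enumerate(p_uq):
        for i in range(n + 1):
            p_dq[i] += comb(n, i) * share**i * (1.0 - share)**(n - i) * p_n
    return p_dq

def prob_dq_nonempty(p_uq, q_dq, q_uq):
    """Approximate P(DQ(k) > 0), as in Eq. (2.18)."""
    share = _share(q_dq, q_uq)
    return 1.0 - sum((1.0 - share)**n * p_n for n, p_n in enumerate(p_uq))
```

The two functions are consistent: one minus the first entry of the marginal returned by `dq_marginal_from_uq` equals the value of `prob_dq_nonempty`.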
Algorithm
Algorithm 1 summarizes the numerical evaluation of the UQ model. In the algorithm, we
omit the computation of the marginal distribution of DQ at each time interval k. However,
all the parameters in Equation (2.20) are stored and thus the distribution of DQ for any
time interval k can be computed if needed.
2.2.4 Univariate downstream queue model
The approach to formulate the univariate DQ model is similar to that used for the univari-
ate UQ model of Section 2.2.3. We model DQ as a birth-death process with finite state
space defined by {dq ∈ [0, ℓ]}. Just as for the UQ model, the transient distribution of DQ,
P(DQ(k)), satisfies an equation of the form (2.4) with initial conditions P(DQ(k−1)) and
transition rate matrix QDQ(k). The non-diagonal and non-null elements of the transition
rate matrix of DQ, QDQ(k) are given in Table 2.2. The first row of Table 2.2 describes
the event of arrivals to DQ, i.e., flow that transitions from LI to DQ (see Figure 2-1). The
second row describes the event of flow departing DQ (i.e., departing the link). The cor-
responding rate µ(k) is the downstream bottleneck flow capacity and is considered exoge-
nous for an isolated link. The diagonal elements of the transition rate matrix are computed
as in Equation (2.5).
Algorithm 1 Algorithm of the univariate upstream queue model
1. set exogenous parameters ρ, v, w, ℓ and δ
2. set arrival and service rates over time, λ(k) and µ(k), for all k = 1, 2, ...
3. compute kfwd = ⌈ℓ/(ρvδ)⌉ and kbwd = ⌈ℓ/(ρ|w|δ)⌉
4. set exogenous initial link conditions: qin(0), qout(0), P(UQ(0)), qUQ(0), qLLO(0), and qDQ(0)
5. set qin(r) = 0 and qout(r) = 0 for r < 0
6. repeat the following for time intervals k = 1, 2, ...
   (a) compute qUQ(k), qLLO(k) and qDQ(k) according to Eq. (2.10), (2.11), and (2.15), respectively
   (b) for uq = 0, 1, ..., ℓ, compute µUQ(uq;k) according to Eq. (2.9) and (2.12)
   (c) form the transition rate matrix QUQ(k) defined in Table 2.1
   (d) compute P(UQ(k)) according to Eq. (2.4)
   (e) compute P(DQ(k) > 0) according to Eq. (2.18)
   (f) compute qin(k) and qout(k) according to Eq. (2.13) and (2.14)
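The steps of Algorithm 1 can be sketched in a few dozen lines. This is an illustrative sketch under stated assumptions, not the implementation used in the thesis: all function and variable names are hypothetical, the link is assumed to be initially empty, kbwd is assumed to be at least 1, flow ratios are clipped to [0, 1], and the matrix exponential of Equation (2.4) is evaluated by uniformization, a standard technique swapped in here for convenience.

```python
import math

def transient_step(p0, up, down, delta, tol=1e-12):
    """Approximate p0 * exp(delta * Q) for a birth-death generator Q.

    up[s] is the rate of s -> s+1 (zero in the last state) and down[s] the
    rate of s -> s-1 (zero in state 0). Uses uniformization.
    """
    n = len(p0)
    unif = max(u + d for u, d in zip(up, down))
    if unif == 0.0:
        return list(p0)

    def jump(p):
        # One step of the uniformized chain: p * (I + Q / unif).
        out = [0.0] * n
        for s, mass in enumerate(p):
            out[s] += mass * (1.0 - (up[s] + down[s]) / unif)
            if s + 1 < n:
                out[s + 1] += mass * up[s] / unif
            if s > 0:
                out[s - 1] += mass * down[s] / unif
        return out

    weight = math.exp(-unif * delta)  # Poisson(unif * delta) pmf at 0
    result = [weight * x for x in p0]
    term, total, j = list(p0), weight, 0
    while 1.0 - total > tol and j < 10_000:
        j += 1
        term = jump(term)
        weight *= unif * delta / j
        total += weight
        result = [r + weight * x for r, x in zip(result, term)]
    return result

def run_uq_model(lam, mu, ell, k_fwd, k_bwd, delta, q_in0=0.0, q_out0=0.0):
    """Sketch of Algorithm 1; lam[k], mu[k] are the exogenous rates of interval k."""
    def cum(flows, upper):   # sum_{r=0}^{upper} flows[r]; zero if upper < 0
        return sum(flows[: max(upper + 1, 0)])

    def ratio(num, den):     # flow share clipped to [0, 1] (assumption)
        return min(max(num / den, 0.0), 1.0) if den > 0.0 else 0.0

    p_uq = [1.0] + [0.0] * ell            # empty link at time 0 (assumption)
    hist, q_in, q_out = [p_uq], [q_in0], [q_out0]
    for k in range(1, len(lam)):
        q_uq = cum(q_in, k - 1) - cum(q_out, k - k_bwd - 1)       # Eq. (2.10)
        q_llo = q_out[k - k_bwd] if k >= k_bwd else 0.0           # Eq. (2.11)
        q_dq = cum(q_in, k - k_fwd - 1) - cum(q_out, k - 1)       # Eq. (2.15)
        share_llo, share_dq = ratio(q_llo, q_uq), ratio(q_dq, q_uq)
        up = [lam[k]] * ell + [0.0]                               # Table 2.1, row 1
        down = [uq * share_llo / delta for uq in range(ell + 1)]  # Eqs. (2.9), (2.12)
        p_uq = transient_step(p_uq, up, down, delta)              # Eq. (2.4)
        p_dq_pos = 1.0 - sum((1.0 - share_dq) ** n * p
                             for n, p in enumerate(p_uq))         # Eq. (2.18)
        q_in.append(lam[k] * (1.0 - p_uq[ell]))                   # Eq. (2.13)
        q_out.append(mu[k] * max(p_dq_pos, 0.0))                  # Eq. (2.14)
        hist.append(p_uq)
    return hist, q_in, q_out
```

A call such as `run_uq_model([0.5] * 40, [0.4] * 40, ell=10, k_fwd=2, k_bwd=3, delta=1.0)` returns the sequence of UQ distributions together with the expected inflows and outflows. The uniformization step is only suitable when the product of the largest total rate and δ is moderate; for larger values a dedicated matrix exponential routine would be preferable.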
Table 2.2 indicates that the transition rate matrix of DQ is defined by two rates: (i)
an endogenous arrival rate λDQ(k) and (ii) an exogenous service rate µ(k). This table is
simpler than that of the UQ model (Table 2.1) because both rates are state-independent (i.e.,
neither depends on the state dq). In Table 2.1, the service rate of UQ is state-dependent,
i.e., µUQ(uq;k) depends on uq.
For a finite capacity birth-death process with state-independent rates, Morse (1958,
Equation (6.13), Chap. 6) provides a closed-form expression to Equation (2.4), which
avoids the need to numerically evaluate the matrix exponential. For time interval k of
length δ, the DQ distribution at the end of time interval k, P(DQ(k)), is given by:
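\begin{align}
P(\mathrm{DQ}(k) = n) &= \sum_{m=0}^{\ell} P(\mathrm{DQ}(k-1) = m)\, P^{m}_{n}(\delta) \quad \text{for } 0 \le n \le \ell \tag{2.21a} \\
P^{m}_{n}(\delta) &= P_n(k) + \frac{2\rho(k)^{\frac{n-m}{2}}}{\ell+1} \sum_{s=1}^{\ell} \frac{\mu(k)}{\gamma_s(k)} \left[\sin\left(\frac{sm\pi}{\ell+1}\right) - \sqrt{\rho(k)}\, \sin\left(\frac{s(m+1)\pi}{\ell+1}\right)\right] \notag \\
&\qquad \cdot \left[\sin\left(\frac{sn\pi}{\ell+1}\right) - \sqrt{\rho(k)}\, \sin\left(\frac{s(n+1)\pi}{\ell+1}\right)\right] e^{-\gamma_s(k)\delta} \tag{2.21b} \\
\gamma_s(k) &= \lambda_{\mathrm{DQ}}(k) + \mu(k) - 2\sqrt{\lambda_{\mathrm{DQ}}(k)\,\mu(k)}\, \cos\left(\frac{s\pi}{\ell+1}\right) \quad \text{for } s = 1, 2, \ldots, \ell \tag{2.21c} \\
P_n(k) &= \left(\frac{1-\rho(k)}{1-\rho(k)^{\ell+1}}\right) \rho(k)^{n} \tag{2.21d} \\
\rho(k) &= \frac{\lambda_{\mathrm{DQ}}(k)}{\mu(k)}. \tag{2.21e}
\end{align}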
initial state   new state   rate      condition
dq              dq + 1      λDQ(k)    dq < ℓ
dq              dq − 1      µ(k)      dq > 0
Table 2.2: Transition rate table of DQ.
Equation (2.21a) states that the distribution of DQ at the end of time interval k can
be obtained by a convex combination of distributions Pmn(δ), each of which is defined in
Equation (2.21b) as the sum of: (i) the stationary probability of being in state n, which
is denoted Pn(k) and defined by Equation (2.21d), and (ii) a time-dependent term with
exponential decay. The exponential decay is parameterized by γs(k), which is defined by
Equation (2.21c) and is referred to in the queuing literature as the inverse of the relaxation
time. In summary, the distribution of DQ is given by (2.21), which depends on two rates:
(i) an exogenous service rate µ(k) and (ii) an endogenous arrival rate λDQ(k). We now
describe how we approximate this endogenous arrival rate.
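The closed-form transition probabilities of Equations (2.21b)-(2.21e) can be evaluated directly, without a matrix exponential. The following is a minimal NumPy sketch under stated assumptions: the function names `dq_transition_matrix` and `dq_step` are hypothetical, the rates are treated as constant within the interval, and λDQ(k) ≠ µ(k) is assumed because the stationary term (2.21d) is undefined for ρ(k) = 1.

```python
import numpy as np


def dq_transition_matrix(lam, mu, ell, delta):
    """Return the (ell+1) x (ell+1) matrix with entries P_mn(delta), Eq. (2.21b).

    lam, mu: state-independent arrival and service rates (assumes lam != mu).
    """
    rho = lam / mu                                              # Eq. (2.21e)
    n = np.arange(ell + 1)                                      # states 0..ell
    stationary = (1 - rho) / (1 - rho ** (ell + 1)) * rho ** n  # Eq. (2.21d)

    s = np.arange(1, ell + 1)
    theta = s * np.pi / (ell + 1)
    gamma = lam + mu - 2 * np.sqrt(lam * mu) * np.cos(theta)    # Eq. (2.21c)

    # basis[s, n] = sin(s n pi/(ell+1)) - sqrt(rho) sin(s (n+1) pi/(ell+1))
    basis = np.sin(np.outer(theta, n)) - np.sqrt(rho) * np.sin(np.outer(theta, n + 1))
    weights = mu / gamma * np.exp(-gamma * delta)
    transient = np.einsum("s,sm,sn->mn", weights, basis, basis)

    # scale[m, n] = rho ** ((n - m) / 2)
    scale = rho ** ((n[None, :] - n[:, None]) / 2.0)
    return stationary[None, :] + 2.0 / (ell + 1) * scale * transient


def dq_step(prev_dist, lam, mu, ell, delta):
    """Eq. (2.21a): propagate the DQ distribution over one time interval."""
    prev_dist = np.asarray(prev_dist, dtype=float)
    return prev_dist @ dq_transition_matrix(lam, mu, ell, delta)
```

A quick sanity check of such a sketch is that each row of the matrix sums to one and that the matrix reduces to the identity for δ = 0.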
Arrival rate of DQ
The distribution of DQ at the end of time interval k is given by the System of Equations
(2.21), which depends on the endogenous rate, λDQ(k). In order to approximate this
rate, we observe that for a queue with finite space capacity ℓ and arrival rate λ, the expected
inflow to the queue is given by: λP(N < ℓ), where N represents the number of vehicles
in the queue. We use this property to obtain the following expression for the arrival rate to
DQ:
\[
\lambda_{DQ}(k) \approx \frac{q_{in}(k - k_{fwd})}{P(DQ(k-1) < \ell)}. \tag{2.22}
\]
The numerator qin(k−kfwd) represents the expected inflow to the link during time interval
k − kfwd, i.e., this is the flow that is expected to leave the last cell of LI (denoted LLI in
Figure 2-1) and enter DQ during time interval k. The denominator P(DQ(k − 1) < ℓ) is
based on the DQ distribution at the end of time interval k−1, which is the DQ distribution
at the beginning of time interval k.
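As a small illustration of Equation (2.22), the sketch below computes the arrival rate to DQ from a lagged expected inflow and the previous DQ distribution. The function name and the explicit guard against a zero denominator are assumptions made for this example; the source does not state how that case is handled.

```python
import numpy as np


def dq_arrival_rate(q_in_lagged, dq_prev_dist):
    """Eq. (2.22): lambda_DQ(k) ~= q_in(k - k_fwd) / P(DQ(k-1) < ell).

    q_in_lagged: expected link inflow during interval k - k_fwd.
    dq_prev_dist: P(DQ(k-1) = n) for n = 0..ell (last entry is the full state).
    """
    dq_prev_dist = np.asarray(dq_prev_dist, dtype=float)
    prob_not_full = 1.0 - dq_prev_dist[-1]
    if prob_not_full <= 0.0:
        # Hypothetical safeguard: the ratio is undefined if DQ is surely full.
        raise ValueError("P(DQ(k-1) < ell) must be positive")
    return q_in_lagged / prob_not_full
```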
Expected link inflow and outflow
Given the univariate model of DQ, we now describe how it can be used to compute the ex-
pected inflow and expected outflow of the link during time interval k. Recall their definition
given in (3.1) and (3.2). The System of Equations (2.21) yields the marginal distribution of
DQ, hence the expected outflow qout(k) (defined by Equation (3.2)) can be directly com-
puted.
In order to compute the expected inflow qin(k) (defined by Equation (3.1)) we need
P(UQ(k) < ℓ). Nonetheless, in this univariate DQ model, we do not track UQ directly.
Let us now describe how it is approximated.
We express P(UQ(k) < ℓ) as a function of the conditional distribution of {UQ−DQ}
given DQ:
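\begin{align}
P(UQ(k) < \ell) &= 1 - P(UQ(k) = \ell) \tag{2.23} \\
&= 1 - \sum_{n=0}^{\ell} P(UQ(k) = \ell \mid DQ(k) = n)\, P(DQ(k) = n) \tag{2.24} \\
&= 1 - \sum_{n=0}^{\ell} P(UQ(k) - DQ(k) = \ell - n \mid DQ(k) = n)\, P(DQ(k) = n) \tag{2.25} \\
&\approx 1 - \sum_{n=0}^{\ell} p_1(k)^{\ell-n}\, P(DQ(k) = n). \tag{2.26}
\end{align}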
Equation (2.25) is obtained from (2.24) by observing that P(UQ(k) = ℓ | DQ(k) = n)
equals P(UQ(k) − DQ(k) = ℓ − n | DQ(k) = n). Equation (2.26) is obtained by
approximating the conditional distribution of {UQ−DQ} given {DQ = n} with a binomial
distribution with parameters (ℓ− n, p1(k)).
The first parameter of this distribution ℓ − n is derived by observing that the random
variable {UQ − DQ} given {DQ = n} can only take values in [0, ℓ − n]. Let us detail
this. Equation (2.1) implies UQ − DQ ≥ 0. Additionally, by definition UQ ≤ ℓ. Thus,
conditional on DQ = n, we have UQ−DQ ≤ ℓ− n.
Let us now approximate the second parameter of this binomial distribution, p1(k).
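\begin{align}
E[UQ(k)] &= E[DQ(k)] + E[UQ(k) - DQ(k)] \tag{2.27} \\
&= E[DQ(k)] + E[E[UQ(k) - DQ(k) \mid DQ(k)]] \tag{2.28} \\
&= E[DQ(k)] + \sum_{n=0}^{\ell} E[UQ(k) - DQ(k) \mid DQ(k) = n]\, P(DQ(k) = n) \tag{2.29} \\
&\approx E[DQ(k)] + \sum_{n=0}^{\ell} (\ell - n)\, p_1(k)\, P(DQ(k) = n) \tag{2.30} \\
&= E[DQ(k)] + p_1(k)(\ell - E[DQ(k)]). \tag{2.31}
\end{align}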
Equation (2.27) is obtained by adding and subtracting E[DQ(k)] on the right hand side.
The law of total expectation is used in (2.28), and rewritten in more detail in (2.29). Since
{UQ − DQ} conditional on {DQ = n} is approximated as a binomial distribution with
parameters (ℓ−n, p1(k)), then E[UQ−DQ | DQ = n] equals (ℓ−n)p1(k), which leads
to (2.30). The summation is simplified to obtain (2.31), which itself can be rearranged to
obtain the approximation for p1(k):
\[
p_1(k) \approx \frac{E[UQ(k)] - E[DQ(k)]}{\ell - E[DQ(k)]}. \tag{2.32}
\]
In order to evaluate Equation (2.32): E[DQ(k)] can be computed from the marginal
distribution of DQ (Equations (2.21)) as:
\[
E[DQ(k)] = \sum_{n=0}^{\ell} n\, P(DQ(k) = n), \tag{2.33}
\]
and E[UQ(k)] can be obtained from the approximation of UQ as a Poisson process with
rate defined by Equation (2.10), and thus
\[
E[UQ(k)] \approx q_{UQ}(k) \cdot \delta = \left( \sum_{r=0}^{k-1} q_{in}(r) - \sum_{r=0}^{k-k_{bwd}-1} q_{out}(r) \right) \cdot \delta. \tag{2.34}
\]
In summary, P(UQ(k) < ℓ) is approximated by Equation (2.26), with p1(k) given by
Equation (2.32) and P(DQ(k) = n) given by the System of Equations (2.21).
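The chain of approximations (2.33), (2.32) and (2.26) can be sketched in a few lines. This is an illustrative sketch only: the function name is hypothetical, E[UQ(k)] is taken as a given input (in the text it comes from Equation (2.34)), and the clipping of p1(k) to [0, 1] is an added safeguard that the source does not mention.

```python
import numpy as np


def uq_not_full_probability(dq_dist, expected_uq):
    """Return (p1, P(UQ < ell)) from the DQ distribution and E[UQ].

    dq_dist: P(DQ(k) = n) for n = 0..ell.
    expected_uq: E[UQ(k)], assumed to be supplied by the caller.
    """
    dq_dist = np.asarray(dq_dist, dtype=float)
    ell = dq_dist.size - 1
    n = np.arange(ell + 1)

    expected_dq = float(n @ dq_dist)                     # Eq. (2.33)
    denom = ell - expected_dq
    p1 = (expected_uq - expected_dq) / denom if denom > 0 else 0.0  # Eq. (2.32)
    p1 = min(max(p1, 0.0), 1.0)  # assumption: keep p1 a valid probability

    # Binomial(ell - n, p1) puts mass p1 ** (ell - n) on the value ell - n.
    prob_full = float(np.sum(p1 ** (ell - n) * dq_dist))
    return p1, 1.0 - prob_full                           # Eq. (2.26)
```

For instance, with ℓ = 2, a DQ distribution of (0.5, 0.3, 0.2) and E[UQ] = 1.35, Equation (2.32) gives p1 = 0.65/1.3 = 0.5.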
Marginal distribution of UQ
The univariate DQ model can be used to approximate the entire marginal distribution of
UQ, by proceeding similarly as in the derivation of Equation (2.26). For all i ∈ [0, ℓ]:
\begin{align}
P(UQ(k) = i) &= \sum_{n=0}^{i} P(UQ(k) = i \mid DQ(k) = n)\, P(DQ(k) = n) \tag{2.35} \\
&= \sum_{n=0}^{i} P(UQ(k) - DQ(k) = i - n \mid DQ(k) = n)\, P(DQ(k) = n) \tag{2.36} \\
&\approx \sum_{n=0}^{i} \binom{\ell-n}{i-n} p_1(k)^{i-n} (1 - p_1(k))^{\ell-i}\, P(DQ(k) = n), \tag{2.37}
\end{align}
where P(DQ(k) = n) is given by the System of Equations (2.21), and p1(k) is given
by Equation (2.32). Equation (2.37) is obtained by approximating P(UQ(k) − DQ(k) |
DQ(k) = n) as a binomial distribution with parameters (ℓ − n, p1(k)).
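Equation (2.37) translates into a short double loop. The sketch below is one possible reading of it, with a hypothetical function name; it uses only `math.comb` and NumPy.

```python
from math import comb

import numpy as np


def uq_marginal_from_dq(dq_dist, p1):
    """Approximate P(UQ(k) = i), i = 0..ell, following Eq. (2.37).

    dq_dist: P(DQ(k) = n) for n = 0..ell.
    p1: success probability of the binomial approximation, Eq. (2.32).
    """
    dq_dist = np.asarray(dq_dist, dtype=float)
    ell = dq_dist.size - 1
    uq_dist = np.zeros(ell + 1)
    for i in range(ell + 1):
        for n in range(i + 1):  # UQ >= DQ, so only n <= i contributes
            binom_pmf = comb(ell - n, i - n) * p1 ** (i - n) * (1 - p1) ** (ell - i)
            uq_dist[i] += binom_pmf * dq_dist[n]
    return uq_dist
```

The last entry of the returned vector is the approximate spillback probability P(UQ(k) = ℓ), which coincides with the sum appearing in Equation (2.26).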
Algorithm
Algorithm 2 summarizes the numerical evaluation of the DQ model. In the algorithm, we
omit the computation of the marginal distribution of UQ at each time interval k. However,
all the parameters in Equation (2.37) are stored, and thus the distribution of UQ for any
time interval k can be computed if needed.
Algorithm 2 Algorithm of the univariate downstream queue model
1. set exogenous parameters ρ, v, w, ℓ and δ
2. set arrival and service rates over time, λ(k) and µ(k), for all k = 1, 2, ...
3. compute kfwd = ⌈ℓ/(ρvδ)⌉ and kbwd = ⌈ℓ/(ρ|w|δ)⌉
4. set exogenous initial link conditions: qin(0), qout(0), P(DQ(0)), qUQ(0), and qLLI(0)
5. set qin(r) = 0 and qout(r) = 0 for r < 0
6. repeat the following for time intervals k = 1, 2, ...
(a) compute qUQ(k) according to Eq. (2.10)
(b) compute λDQ(k) according to Eq. (2.22)
(c) compute P(DQ(k)) according to the System of Equations (2.21)
(d) compute E[UQ(k)] according to Eq. (2.34)
(e) compute E[DQ(k)] according to Eq. (2.33)
(f) compute p1(k) according to Eq. (2.32)
(g) compute P(UQ(k) < ℓ) according to Eq. (2.26)
(h) compute qin(k) and qout(k) according to Eq. (3.1) and (3.2)
2.2.5 Mixture model
Recall that by design the role of UQ is to capture the link’s upstream boundary conditions,
while that of DQ is to capture the link’s downstream boundary conditions. In order to
capture both the link’s upstream and downstream boundary conditions, while ensuring a
model suitable for large-scale network analysis, we propose a link model that is a mixture
of the univariate UQ model (formulated in Section 2.2.3) and of the univariate DQ model
(formulated in Section 2.2.4). The proposed model is given by:
\begin{align}
P(UQ(k)) &= w\, P_{UQ}(UQ(k)) + (1 - w)\, P_{DQ}(UQ(k)) \tag{2.38} \\
P(DQ(k)) &= w\, P_{UQ}(DQ(k)) + (1 - w)\, P_{DQ}(DQ(k)), \tag{2.39}
\end{align}
where the following notation is used:
PUQ(UQ(k)) UQ distribution from the UQ model (Equation (2.4));
PDQ(UQ(k)) UQ distribution from the DQ model (Equation (2.37));
PUQ(DQ(k)) DQ distribution from the UQ model (Equation (2.20));
PDQ(DQ(k)) DQ distribution from the DQ model (Equations (2.21)).
An analytical expression for the weight parameter, w, is derived through insights obtained
from a variety of numerical experiments. Its expression is given by:
\[
w(\ell, \mu, k_{fwd}\delta) = e^{-\frac{\ell^2}{70 \mu k_{fwd} \delta}}. \tag{2.40}
\]
The experiments compared the performance of the proposed mixture model to that of a
discrete-event simulation model used in Osorio and Flötteröd (2015), which implements the
stochastic link transmission model. It samples individual vehicles. The forward and back-
ward lags are explicitly implemented on each vehicle. A total of 180 experiments were con-
ducted considering combinations of ℓ ∈ {5, 10, 15, . . . , 100}; ρ = λ/µ ∈ {0.25, 0.5, 0.75};
µ ∈ {0.2, 0.4, 0.6}. A more detailed description of the derivation of weight parameter, w,
is given in Appendix A.
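To make the mixture concrete, the sketch below evaluates the weight as read from Equation (2.40), namely exp(−ℓ²/(70 µ kfwd δ)), and combines two distributions as in Equations (2.38)-(2.39). The function names are hypothetical and the reading of the exponent is an assumption based on the printed formula.

```python
import math

import numpy as np


def mixture_weight(ell, mu, k_fwd, delta):
    """Weight w as read from Eq. (2.40): exp(-ell^2 / (70 * mu * k_fwd * delta))."""
    return math.exp(-(ell ** 2) / (70.0 * mu * k_fwd * delta))


def mix_distributions(w, dist_uq_model, dist_dq_model):
    """Eqs. (2.38)-(2.39): convex combination of the two univariate models."""
    dist_uq_model = np.asarray(dist_uq_model, dtype=float)
    dist_dq_model = np.asarray(dist_dq_model, dtype=float)
    return w * dist_uq_model + (1.0 - w) * dist_dq_model
```

Since w lies in (0, 1], the mixture of two probability vectors is again a probability vector; under this reading the weight decreases as the space capacity ℓ grows relative to µ kfwd δ.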
For the mixture model, the expected inflow and outflow, i.e. qin(k) and qout(k), are
obtained according to Equations (3.1) and (3.2) where P(UQ(k) < ℓ) and P(DQ(k) > 0)
are given by (2.38) and (2.39), respectively. Algorithm 3 summarizes the mixture model
approach. Notice that steps 7 and 8 in the algorithm can be run simultaneously and inde-
pendently to further enhance the runtime.
Algorithm 3 Algorithm of the mixture model
1. set exogenous parameters ρ, v, w, ℓ and δ
2. set arrival and service rates over time, λ(k) and µ(k), for all k = 1, 2, ...
3. compute kfwd = ⌈ℓ/(ρvδ)⌉ and kbwd = ⌈ℓ/(ρ|w|δ)⌉
4. compute w according to Eq. (2.40)
5. set exogenous initial link conditions: qin(0), qout(0), P(UQ(0)), P(DQ(0)), qUQ(0), qLLO(0), qLLI(0) and qDQ(0)
6. set qin(r) = 0 and qout(r) = 0 for r < 0
7. run step 6 of Algorithm 4; this yields PUQ(UQ(k)) for all k = 1, 2, ...
8. run step 6 of Algorithm 2; this yields PDQ(DQ(k)) for all k = 1, 2, ...
9. for any time interval k,
(a) compute PUQ(DQ(k)) according to Eq. (2.20)
(b) compute PDQ(UQ(k)) according to Eq. (2.37)
(c) compute P(UQ(k)) according to Eq. (2.38)
(d) compute P(DQ(k)) according to Eq. (2.39)
(e) compute qin(k) and qout(k) according to Eq. (3.1) and (3.2)
Parameter            Value
v                    0.01 km/sec
w                    −0.005 km/sec
ρ                    200 veh/km
q                    2400 veh/h = 0.67 veh/sec
δ                    0.1 sec
µ(k)                 1440 veh/h = 0.4 veh/sec
λ(k)                 varies by experiment
ℓ, L, kfwd, kbwd     varies by experiment
Table 2.3: Link Parameters
2.3 Validation
In this section we validate the proposed mixture model. We compare it with the multivariate
model in terms of both computational runtime and accuracy. First, we compare the
computational runtimes of the proposed model to those of the multivariate model (Osorio
and Flötteröd; 2015). We consider
a single-lane link with parameters shown in Table 2.3. The link configuration is the same
as that used in Osorio and Flötteröd (2015) except for the service rate. The service rate
of the link is fixed at 0.4 veh/sec for all experiments. The experiments consider different
arrival rates and link lengths (and hence, different space capacities, forward lags and back-
ward lags). We consider a set of three different arrival rates (λ ∈ {0.1, 0.2, 0.3} veh/sec)
and seven different space capacities (ℓ ∈ {10, 20, 30, 40, 60, 80, 100}). The combination
of these values leads to a total of 21 experiments. The considered space capacity values
correspond to link lengths L ∈ {50, 100, 150, 200, 300, 400, 500} (in meters), forward lags
kfwd ∈ {5, 10, 15, 20, 30, 40, 50} and backward lags kbwd ∈ {10, 20, 30, 40, 60, 80, 100}.
Each experiment starts with an empty link at time zero and runs for 250 seconds at which
point the link is ensured to have reached a stationary regime. All experiments are carried
out on a standard laptop machine with Intel Core i7-4700HQ CPU running at 2.40 GHz.
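The experimental grid described above can be enumerated programmatically. The sketch below is illustrative: the constant and function names are hypothetical, speeds are converted to m/sec to keep the arithmetic in integers, and the lags are expressed in whole seconds, which is the unit that reproduces the listed kfwd and kbwd values from the parameters of Table 2.3.

```python
from itertools import product

JAM_DENSITY_VEH_PER_KM = 200    # rho
FREE_FLOW_SPEED_M_PER_S = 10    # v = 0.01 km/sec
BACKWARD_SPEED_M_PER_S = 5      # |w| = 0.005 km/sec

ARRIVAL_RATES = [0.1, 0.2, 0.3]                  # veh/sec
SPACE_CAPACITIES = [10, 20, 30, 40, 60, 80, 100]


def link_geometry(ell):
    """Return (length in m, forward lag in sec, backward lag in sec) for capacity ell."""
    length_m = ell * 1000 // JAM_DENSITY_VEH_PER_KM
    return (length_m,
            length_m // FREE_FLOW_SPEED_M_PER_S,
            length_m // BACKWARD_SPEED_M_PER_S)


# One experiment per (arrival rate, space capacity) pair.
experiments = list(product(ARRIVAL_RATES, SPACE_CAPACITIES))
```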
Figure 2-2 compares the runtimes of the mixture model (circles) and of the multivariate
model (asterisks). The x-axis considers the space capacity values ℓ. The y-axis displays
the average computational runtime (in minutes). The average is computed over the three
experiments with three different arrival rate values. The y-axis is plotted on a logarithmic
scale. The maximum runtime for evaluating an experiment is set to be 40 hours. If an
experiment has not concluded within 40 hours, it is terminated. For ℓ = 30 the average
runtime of the multivariate model is already 2366 minutes (≈ 39 hours). Hence, for
experiments where ℓ > 30, the multivariate model is not evaluated. Figure 2-2 illustrates that
the runtime of the multivariate model increases exponentially with ℓ, while for the mixture
model the increase appears linear. For the mixture model, the average runtime over all 21
experiments is 0.05 minutes. The maximum average runtime is obtained for ℓ = 100 and
is 0.11 minutes. Thus, compared to the multivariate model, the mixture model achieves
significant improvements in computational complexity both theoretically and numerically.
[Plot: runtime (min, logarithmic scale) versus space capacity ℓ, for the mixture and multivariate models]
Figure 2-2: Model runtime comparison
We now compare the multivariate model and the mixture model in terms of their ac-
curacy. In order to evaluate the accuracy of each of these analytical models, we use a
discrete-event simulator of the stochastic link transmission model. The simulator is the
same as that used for validation in Osorio and Flötteröd (2015). It samples individual vehi-
cles, and implements for each vehicle exact forward and backward lags. The arrival process
is a Poisson process. For vehicles at the downstream end of the link, inter-departure times
are independent and identically distributed exponential random variables. The simulated
estimates are obtained from 10^6 replications.
First, we consider two experiments with temporal variations in demand and evaluate
the ability of the analytical models to approximate the transient distributions of UQ and
of DQ. For both experiments, ℓ = 10. Experiment 1 has an arrival rate of 0.1 veh/sec
during time [0, 125] seconds, an arrival rate of 0.5 veh/sec during time [125, 175] seconds
and an arrival rate of 0.3 veh/sec during time [175, 300] seconds. This experiment corre-
sponds to step-changes from uncongested to highly-congested (i.e. λ(k) > µ(k)) and then
to congested traffic conditions. Experiment 2 considers first an arrival rate of 0.3 veh/sec
during time [0, 100] seconds, of 0.1 veh/sec during time [100, 200] seconds and then of 0.5
veh/sec during time [200, 300] seconds. This experiment corresponds to step-changes from
congested to uncongested and then to highly-congested traffic conditions. The two experi-
ments are designed such that during the highly-congested period (where λ(k) > µ(k)), the
period is not long enough in Experiment 1 for the transient distribution to converge to its
stationary counterpart, while in Experiment 2 it is a long enough period.
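The two demand profiles can be written down as piecewise-constant functions of time. The sketch below is illustrative: the names are hypothetical, and the intervals are treated as half-open [start, end), an assumption needed to resolve the shared endpoints of the closed intervals given in the text.

```python
SERVICE_RATE = 0.4  # veh/sec, fixed in all experiments

# (start sec, end sec, arrival rate veh/sec)
EXPERIMENT_1 = [(0, 125, 0.1), (125, 175, 0.5), (175, 300, 0.3)]
EXPERIMENT_2 = [(0, 100, 0.3), (100, 200, 0.1), (200, 300, 0.5)]


def arrival_rate(t, profile):
    """Return the arrival rate at time t for a piecewise-constant profile."""
    for start, end, rate in profile:
        if start <= t < end:
            return rate
    raise ValueError(f"time {t} is outside the profile")


def highly_congested_duration(profile, mu=SERVICE_RATE):
    """Total time (sec) during which the arrival rate exceeds the service rate."""
    return sum(end - start for start, end, rate in profile if rate > mu)
```

The highly-congested period lasts 50 seconds in Experiment 1 and 100 seconds in Experiment 2, which is the difference the two experiments are designed to expose.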
Figure 2-3 considers Experiment 1. Each plot of Figure 2-3(a) considers a given time
T (in seconds) and displays the distribution of UQ, P(UQ(T)), at time T as proposed
by: the mixture model (red squares), the multivariate model (blue diamonds) and the
simulated estimates (black crosses). The different plots consider different times: T ∈
{1, 30, 60, 90, 120, 150, 180, 210, 240, 270} seconds. Similarly, each plot of Figure 2-3(b)
displays the distribution of DQ, P(DQ(T)), at time T . The simulated estimates are dis-
played with 95% confidence intervals. These are barely visible.
Recall that for this experiment, there is a sharp increase in demand at time T = 125
sec and a sharp decrease at time T = 175 sec. The changes in the distributions of UQ and
DQ after time T = 125 seconds and T = 175 seconds are visible for all models. During
time [125, 175], states with higher values of UQ (resp. DQ) have higher probabilities.
After time T = 175, states with higher values of UQ (resp. DQ) have comparably lower
probabilities. Figures 2-3(a) and 2-3(b) show that the dynamics of the simulator are well
approximated by both the mixture and the multivariate models. Additionally, both analyti-
cal models converge, both before T = 125 seconds and after T = 175 seconds, to stationary
distributions that approximate well the simulated distribution.
The plots of Figure 2-3(c) display, respectively, E[UQ(T)] and E[DQ(T)] as a function
of time T . The sharp increase in expectation after time T = 125 seconds and the sharp
decrease after time T = 175 seconds are well approximated by both analytical models.
The stationary values before T = 125 seconds and after T = 175 seconds are also well
approximated.
Note also that for all three models considered here (mixture, multivariate and simula-
tor) their arrival process and their departure process are stochastic. Hence, spillback may
occur even when µ(k) > λ(k). More specifically, the spillback probability is given by
P(UQ(T) = ℓ). For instance, in the right-most plot of the second row of Figure 2-3(a), the
spillback probability is non-zero (i.e., P(UQ(T) = ℓ) > 0).
Experiment 2 considers a sharp decrease in demand at T = 100 seconds and a sharp
increase in demand at T = 200. Figures 2-4(a) and 2-4(b) display, respectively, the distri-
butions of UQ and of DQ as a function of time (i.e., P(UQ(T)) and P(DQ(T))). In this
experiment, we observe a shift in probability mass to states with smaller values of UQ and
DQ during time [100, 200] seconds and a shift in probability mass to states with larger val-
ues of UQ and DQ after time T = 200 seconds. In this experiment, both analytical models
converge to the stationary distribution after each change in demand. The conclusions here
are the same as for the previous experiment: both the stationary and the transient distri-
butions are well approximated by the analytical models. The time-dependent expectations
E[UQ(T)] and E[DQ(T)] are displayed in Figure 2-4(c). Again, the dynamics are well cap-
tured by both analytical models. In summary, for Experiments 1 and 2, the approximations
of both the mixture and the multivariate models are good. The transient and the stationary
distributions are well approximated by both models.
[Figure 2-3 panels: (a) Distribution of UQ over time, P(UQ(T)), and (b) Distribution of DQ over time, P(DQ(T)), each plotted versus n for T ∈ {1, 30, 60, 90, 120, 150, 180, 210, 240, 270}; (c) Expectation of UQ and of DQ over time, E[UQ(T)] and E[DQ(T)], versus time T. Legend: multivariate, mixture, simulation.]
Figure 2-3: Experiment 1: impact of the temporal variation of demand on the distributions, as well as the expected values, of UQ and of DQ
[Figure 2-4 panels: (a) Distribution of UQ over time, P(UQ(T)), and (b) Distribution of DQ over time, P(DQ(T)), each plotted versus n for T ∈ {1, 30, 60, 90, 120, 150, 180, 210, 240, 270}; (c) Expectation of UQ and of DQ over time, E[UQ(T)] and E[DQ(T)], versus time T. Legend: multivariate, mixture, simulation.]
Figure 2-4: Experiment 2: impact of the temporal variation of demand on the distributions, as well as the expected values, of UQ and of DQ
We now evaluate the accuracy of the mixture model over a larger set of experiments. We
consider the 21 experiments mentioned above. The main goal is to evaluate the loss of
accuracy of the mixture model compared to the (less scalable but more accurate) multivariate
model. In order to evaluate the accuracy of a given distribution (UQ or DQ), we evaluate
its distance to the distribution estimated via simulation with the stochastic LTM simulator
described previously and used for validation in Osorio and Flötteröd (2015). Recall that
this simulator is an exact implementation of the stochastic LTM. The distance between an
analytical distribution (mixture or multivariate) and the simulated distribution is evaluated
with the Jensen-Shannon divergence (JSD) metric (Endres and Schindelin; 2003). For a
pair of distributions P1 and P2, the JSD metric is defined by:
\begin{align}
JSD(P_1 \parallel P_2) &= \frac{1}{2} D(P_1 \parallel M) + \frac{1}{2} D(P_2 \parallel M) \tag{2.41} \\
D(P_1 \parallel P_2) &= \sum_i P_1(i) \log \frac{P_1(i)}{P_2(i)}, \tag{2.42}
\end{align}
where D(P1 ∥ P2) is the Kullback-Leibler divergence (KLD) (Kullback and Leibler; 1951)
and M = (P1 + P2)/2. Unlike the KLD, the JSD is both symmetric and upper bounded by
1. The lower the JSD value, the smaller the distance between the two distributions, i.e.,
the higher the accuracy. We define the time-average JSD over the entire time period (i.e.,
250 seconds) as the temporal mean of the JSD values, i.e.:
\[
\frac{1}{250} \sum_{T=1}^{250} JSD(P_1(T) \parallel P_2(T)),
\]
where P1(T) and P2(T) are the distributions evaluated at time T.
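The JSD of Equations (2.41)-(2.42) and its temporal mean can be sketched as follows. The function names are hypothetical, and the base-2 logarithm is an assumption chosen here because it is the base for which the JSD is bounded above by 1; terms with P1(i) = 0 are taken to contribute zero.

```python
import numpy as np


def kl_divergence(p, q):
    """Eq. (2.42) with base-2 logarithm; terms with p(i) = 0 contribute zero."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log2(p[mask] / q[mask])))


def jsd(p, q):
    """Eq. (2.41): Jensen-Shannon divergence between two discrete distributions."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    m = 0.5 * (p + q)
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def time_average_jsd(p_series, q_series):
    """Temporal mean of the JSD over paired sequences of distributions."""
    return float(np.mean([jsd(p, q) for p, q in zip(p_series, q_series)]))
```

Because the mixture M is positive wherever P1 or P2 is positive, the ratio inside the logarithm is always well defined, unlike for the plain KLD.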
Since the main goal is to evaluate the accuracy loss of the mixture model compared to
the multivariate model, we will compare the time-average JSD values of the mixture model
(i.e., the time-average JSD distance between the distribution approximated by the mixture
model and the simulated distribution) and the time-average JSD values of the multivariate
model (i.e., the time-average JSD distance between the distribution approximated by the
multivariate model and the simulated distribution). In order to guide us in the interpreta-
tion of the magnitude of the JSD metric, we provide three additional models to compare
the proposed model with: (i) the deterministic LTM (denoted DetDet, which stands for
deterministic arrivals and deterministic departures), (ii) a simulation-based instance of the
LTM with deterministic arrivals and independent exponentially distributed inter-departure
times (denoted DetExp), (iii) a simulation-based instance of the LTM with independent ex-
ponentially distributed inter-arrival times and deterministic inter-departure times (denoted
ExpDet). Since DetDet is a deterministic traffic model, for a given experiment and a
given time, it generates a unique link state (i.e., the distribution has all the probability mass
concentrated in a single state). For the simulation-based models, the distributional
estimates are obtained from 10^6 simulation replications. In summary, for a given experiment
(out of the 21 experiments), a given model (mixture, multivariate, DetDet, DetExp and
ExpDet) and a given distribution (UQ or DQ), we evaluate its distance to the simulated
distribution using the time-average JSD metric.
As described above, the simulator consists of the deterministic LTM yet with a proba-
bilistic arrival process and a probabilistic departure process. Hence, the underlying distribu-
tions (of UQ and of DQ) it yields are expected to differ from those of the purely determin-
istic LTM. Thus, the time-average JSD values of DetDet can be interpreted as the effect
of extending the LTM with a given probabilistic arrival process and a given probabilistic
departure process. Similarly, the time-average JSD values of ExpDet (resp. DetExp) can
be interpreted as the effect of extending the LTM with a given probabilistic arrival (resp.
departure) process.
Figure 2-5 displays the time-average JSD values for the 21 experiments described
above. The top (resp. bottom) row plots consider the UQ (resp. DQ) distribution. The
first column of plots considers the experiments with arrival rate λ(k) = 0.1 veh/sec. The
second and third column consider arrival rate values of 0.2 and 0.3 veh/sec, respectively.
Each plot compares 5 models: the mixture model (circles), the multivariate model (aster-
isks), DetDet (square), ExpDet (triangle) and DetExp (cross). Each plot displays the
time-average JSD metric (y-axis) as a function of the space capacity (x-axis). Recall that
for the multivariate model, the runtimes for the experiments with ℓ > 30 exceed 40 hours
and are hence not computed. Figure 2-6 shows a zoomed-in version of Figure 2-5. It
displays only the mixture, the multivariate and the ExpDet models, which are those with
the lowest error values (i.e., their curves mostly overlap along the x-axis in Figure 2-5).
For all plots of Figure 2-5, the time-average JSD values of DetDet and DetExp are
significantly higher than those of the other models. In particular, the curves of the three
other models (mixture, multivariate and ExpDet) are barely visible along the x-axis. Fig-
ure 2-6 presents in more detail the curves of these three models. For P(UQ(T)) (i.e., top
row plots), the time-average JSD values of the mixture model are higher than those of the
multivariate and of ExpDet. Yet the values remain very small. For P(DQ(T)) (i.e., bottom
row plots), the time-average JSD values of the ExpDet model are higher than those of the
mixture and of the multivariate model. For space capacities ℓ ≥ 30, the curve of the
mixture model overlaps with the x-axis and is barely visible. This indicates very high accuracy.
Recall also that for ℓ > 30, the computation time of the multivariate model exceeds 40 hours
and the model is hence not evaluated. Overall, these experiments indicate that the loss of
accuracy of the mixture model compared with the multivariate model is not significant. The
numerical time-average JSD values displayed in Figure 2-5 are provided, for all experiments, in
Tables 1 and 2 of Appendix B.
In summary, for experiments with both constant and time-varying demand, the mixture
model performs comparably with the multivariate model, while being significantly faster
to evaluate. The gain in computational runtime increases with the space capacity. In par-
ticular, for medium-dimensional state spaces (i.e., medium-sized links), the evaluation of
the mixture model remains instantaneous (i.e., in the order of seconds), while that of the
multivariate model increases exponentially.
[Figure 2-5 panels: time-average JSD versus space capacity ℓ. Panels (a)-(c): P(UQ(T)) for arrival rates λ(k) = 0.1, 0.2 and 0.3 veh/sec; panels (d)-(f): P(DQ(T)) for the same arrival rates. Legend: Mixture, Multivariate, DetDet, ExpDet, DetExp.]
Figure 2-5: Comparison of the JSD values for the 21 experiments with time-independent demand
[Figure 2-6 panels: time-average JSD versus space capacity ℓ. Panels (a)-(c): P(UQ(T)) for arrival rates λ(k) = 0.1, 0.2 and 0.3 veh/sec; panels (d)-(f): P(DQ(T)) for the same arrival rates. Legend: Mixture, Multivariate, ExpDet.]
Figure 2-6: Comparison of the JSD values for the 21 experiments with time-independent demand (zoomed-in results)
2.4 Network analysis
In this section, the proposed mixture model is used to address a traffic signal control
problem for the city of Lausanne, Switzerland. Section 2.4.1 formulates the problem and
describes the case study. Section 2.4.2 presents the numerical results and Section 2.4.3
compares the performance of the resulting signal plans to that of a signal plan derived by a
commercial signal control software.
2.4.1 City-scale signal control
We consider the city of Lausanne, Switzerland. The city map is shown in Figure 2-7, and
the area of consideration is delimited in white. The network model of a stochastic
microscopic simulator is displayed in Figure 2-8. The network consists of 603 links, 902 lanes
and 231 intersections. We consider a problem where we determine the signal plans of 17
intersections distributed throughout the city. These 17 intersections are depicted as squares
in Figure 2-8. We consider a fixed-time signal control problem. For a review of traffic
signal control terminology and formulations, see Appendix A of Osorio (2010). A fixed-time
signal plan, also called time-of-day or pre-timed plan, is an off-line pre-determined plan
that is periodical during a specific time of day (e.g., evening peak). Fixed-time plans are
appropriate for networks with sparse or unreliable real-time data. They are also commonly
used by major cities with high and uniformly distributed congestion levels, such as New
York City (Osorio et al.; 2015).
Figure 2-7: Lausanne city road network (adapted from Dumont and Bert (2006))
We consider a fixed-time signal control problem for the 5:00-5:30pm evening peak. The
signal plans of the 17 intersections are determined jointly. The decision variables are the
green splits (i.e., normalized green times) of the phases of the different intersections. All
other traditional control variables (e.g., cycle times, offsets, stage structure) are assumed
fixed. This leads to a total of 99 endogenous signal phase variables, i.e., the dimension of
the decision vector is 99.
Figure 2-8: Lausanne network model
To formulate the problem, we introduce the following notation.
bd ratio of available cycle time to total cycle time for intersection d;
x vector of green splits;
x(j) green split of signal phase j;
xLB vector of lower bounds for green splits;
D set of intersection indices;
PD(d) set of endogenous signal phase indices of intersection d;
L set of all lanes;
T total number of one-minute time intervals;
N number of lanes, i.e., cardinality of L.
The problem is formulated as follows:
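\[
\min_{x} \; f(x) = \frac{1}{TN} \sum_{i \in \mathcal{L}} \sum_{t=1}^{T} P(UQ_i(t; x) = \ell_i) \tag{2.43}
\]
subject to
\begin{align}
\sum_{j \in \mathcal{P}_D(d)} x(j) &= b_d, \quad \forall d \in \mathcal{D} \tag{2.44} \\
x &\ge x_{LB}. \tag{2.45}
\end{align}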
The decision vector, x, denotes the green splits of the signal controlled lanes. The linear
equality constraints (2.44) ensure that, for each intersection, the sum of green times equals
the available cycle time. Constraint (2.45) ensures lower bounds for the green splits. The
objective function (2.43) averages, over time and over all lanes, the spillback probability of each
lane. This spillback probability is represented by P(UQi(t; x) = ℓi), which denotes the
probability of UQ being full at integer time t under signal plan x. This problem formulation
minimizes the spatial and temporal occurrence of spillbacks.
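The objective (2.43) and the feasibility conditions (2.44)-(2.45) are simple to express once the spillback probabilities are available. The sketch below is illustrative only: the function names and the dictionary-based data layout are assumptions, and the spillback probabilities are taken as a precomputed array rather than evaluated with the mixture model.

```python
import numpy as np


def mean_spillback_probability(spillback_prob):
    """Eq. (2.43): average of P(UQ_i(t; x) = ell_i) over lanes i and times t.

    spillback_prob: array of shape (N lanes, T time intervals).
    """
    spillback_prob = np.asarray(spillback_prob, dtype=float)
    n_lanes, n_intervals = spillback_prob.shape
    return float(spillback_prob.sum() / (n_intervals * n_lanes))


def satisfies_constraints(x, phases_by_intersection, available_ratio, x_lb, tol=1e-9):
    """Check Eq. (2.44) (green splits sum to b_d) and Eq. (2.45) (lower bounds)."""
    x = np.asarray(x, dtype=float)
    sums_ok = all(
        abs(x[list(phases)].sum() - available_ratio[d]) <= tol
        for d, phases in phases_by_intersection.items()
    )
    bounds_ok = bool(np.all(x >= np.asarray(x_lb, dtype=float) - tol))
    return sums_ok and bounds_ok
```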
The above signal control problem has a probabilistic formulation, which is naturally
addressed with probabilistic traffic models. Given the high computation times of the multi-
variate model (cf. Section 2.3), the above problem is only solved with the proposed mixture
model.
Implementation details
The values of the main exogenous parameters of the mixture model are displayed in Ta-
ble 2.4. The decision variables of this problem (the green splits of the signal plans) deter-
mine the downstream flow capacity of the underlying lanes. More specifically, for a signal
controlled lane i, its flow capacity is given by:
\[
\mu_i - \sum_{j \in \mathcal{P}_I(i)} x(j)\, s = e_i s, \quad \forall i \in \mathcal{L}, \tag{2.46}
\]
where s represents the saturation flow, ei represents the ratio of fixed green time to cycle
time of signalized lane i, PI(i) represents the set of endogenous signal phases of lane i and
L denotes the set of signal controlled lanes.
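Rearranging Equation (2.46) gives the flow capacity of a lane explicitly as µi = (ei + Σ x(j)) s, with the sum taken over the endogenous phases of lane i. A minimal sketch, with a hypothetical function name and illustrative argument names:

```python
def lane_flow_capacity(fixed_green_ratio, endogenous_splits, saturation_flow):
    """Eq. (2.46) solved for mu_i: (e_i + sum of x(j) over P_I(i)) * s.

    fixed_green_ratio: e_i, ratio of fixed green time to cycle time.
    endogenous_splits: green splits x(j) of the endogenous phases of lane i.
    saturation_flow: s, in the same flow units as the returned capacity.
    """
    return (fixed_green_ratio + sum(endogenous_splits)) * saturation_flow
```

Written this way, it is apparent that the capacity is linear in the green splits, which is what links the decision vector to the service rate of each signal controlled lane.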
This chapter formulates a link model. It can be coupled with a probabilistic node model
to formulate a full network model. As is discussed in Section 2.5, the formulation of
probabilistic traffic-theoretic node models is part of ongoing work. In order to limit this
case study to the use of the link model (rather than link and node models), we assume
link demand to be exogenous, i.e., it does not vary with signal plans. Hence, the mixture
model is used to design signal plans that improve within-link traffic dynamics. Across-link
dynamics, or more generally changes in traffic assignment, are not accounted for in this
formulation. The results of this case study show that even with the use of such simplifying
assumptions (e.g., the lack of an endogenous node model), the link model identifies signal
plans with good network-wide performance.
The exogenous arrival rate (or demand rate) for lane i at time-interval k, denoted λi(k),
is computed, prior to optimization, by solving the following linear system of equations:
\[ \lambda_i(k) = \gamma_i + \sum_{j} p_{ji}\, \lambda_j(k), \quad \forall i \in L, \tag{2.47} \]
where γi denotes an external arrival rate (i.e., the rate of trips that start at lane i) and pji
is the turning probability from lane j to lane i. Both γi and pji are exogenous and time-independent,
hence λ is also exogenous and time-independent. Equation (2.47) states that the arrival rate
of lane i is the sum of the external arrival rate γi to lane i and of the demand that arises
from upstream lanes. Problem (2.43)-(2.45) is solved using the active-set algorithm of the
fmincon routine of Matlab (MATLAB; 2016).
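As an illustration of how the arrival rates can be obtained from the traffic equations, the sketch below solves the linear system by fixed-point iteration in plain Python. This is an assumption made for brevity rather than the solver used in the thesis; the iteration converges when some flow leaves the network at every pass, and a direct linear solver could be substituted. The function name and the three-lane example are hypothetical.

```python
def arrival_rates(gamma, p, tol=1e-12, max_iter=10_000):
    """Solve lambda_i = gamma_i + sum_j p[j][i] * lambda_j by fixed-point iteration.

    gamma: external arrival rates, one per lane
    p: turning probabilities, p[j][i] is the probability of turning from lane j to lane i
    """
    n = len(gamma)
    lam = list(gamma)
    for _ in range(max_iter):
        new = [gamma[i] + sum(p[j][i] * lam[j] for j in range(n)) for i in range(n)]
        if max(abs(a - b) for a, b in zip(new, lam)) < tol:
            return new
        lam = new
    raise RuntimeError("fixed-point iteration did not converge")


# Hypothetical 3-lane corridor: lane 0 feeds lane 1 (60%), lane 1 feeds lane 2 (50%).
gamma = [100.0, 20.0, 0.0]
p = [[0.0, 0.6, 0.0],
     [0.0, 0.0, 0.5],
     [0.0, 0.0, 0.0]]
lam = arrival_rates(gamma, p)
```

Because γi and pji are time-independent, the system only needs to be solved once, prior to optimization.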
2.4.2 Numerical analysis
We solve Problem (2.43)-(2.45) considering four different initial points. Each point is
drawn uniformly at random from the feasible space (Equations (2.44)-(2.45)). The uniform
sampling is conducted using the code of Stafford (2006). The use of four different initial
points leads to four optimal solutions. In order to evaluate the performance of the various
signal plans (initial and optimal), we use a microscopic traffic simulation model of Lausanne
(Dumont and Bert; 2006), which is calibrated for the evening peak period demand
and implemented with the Aimsun simulator (TSS; 2014). Each signal plan is embedded
within the simulator and 50 simulation replications are run. We then compare the cumulative
distribution (obtained over these 50 replications) of the main network performance measures.
Each simulation replication consists of a 15 minute warm-up period, followed by a
30 minute (5:00-5:30pm) simulation period. For a given simulation replication, the objective
function (2.43) is estimated as the average (over all lanes) proportion of time a lane is
full.

Parameter            Value
T                    30 one-minute intervals
N                    902 lanes
δ                    0.1 sec
xLB                  4 sec
v                    50 km/h
w                    −15 km/h
ρ                    200 veh/km
s                    1800 veh/h
µ                    varies by signal plan
λ                    calculated from Equation (2.47)
ℓ, γ, pij, ei, bd    exogenous values obtained from Osorio (2010, Chap. 4)
kfwd                 kfwd = ⌈ℓ/(ρvδ)⌉
kbwd                 kbwd = ⌈ℓ/(ρ|w|δ)⌉
Table 2.4: Parameters for Lausanne case study
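The two lag parameters kfwd and kbwd of Table 2.4 can be evaluated as in the following minimal sketch. It assumes that ℓ is expressed in vehicles and that δ is converted from seconds to hours so that the units cancel; the space capacity of 20 vehicles is a hypothetical value, and exact rational arithmetic is used only to keep the ceiling free of floating-point rounding.

```python
from fractions import Fraction
from math import ceil


def lag_steps(space_capacity, jam_density, wave_speed_kmh, step_sec):
    """Number of time steps ceil(l / (rho * |speed| * delta)) for a wave to cross the link."""
    step_h = Fraction(step_sec) / 3600  # convert the time step from seconds to hours
    link_length_km = Fraction(space_capacity) / Fraction(jam_density)
    return ceil(link_length_km / (abs(Fraction(wave_speed_kmh)) * step_h))


# rho, v, w and delta follow Table 2.4; the space capacity of 20 vehicles is hypothetical.
delta = Fraction(1, 10)
k_fwd = lag_steps(20, 200, 50, delta)   # forward wave, speed v
k_bwd = lag_steps(20, 200, -15, delta)  # backward wave, speed w
```

For this hypothetical link the forward lag is 72 time steps and the backward lag is 240 time steps.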
Each plot of Figure 2-9 considers one random initial point. Each plot displays two
cumulative distribution curves: one for the initial signal plan and one for the optimal plan
of Problem (2.43)-(2.45). Each curve is the cumulative distribution function (cdf) of the
average proportion of time a lane is full. More specifically, the x-axis displays the average
proportion of time a lane is full. For a given value of x, the y-axis displays the proportion of
simulation replications (out of 50) for which the average proportion of time a lane is full is
smaller than x. Therefore, the more a cdf curve is shifted to the left, the better the performance of
the corresponding signal plan. The solid curves correspond to the cdf of the initial signal
plans, the dashed curves represent that of the optimal signal plans of Problem (2.43)-(2.45).
As shown in Figures 2-9(a)-2-9(d), every cdf curve of an optimal signal plan lies to the
left of that of the corresponding initial plan. In other words, the model yields solutions that have
a lower average proportion of time a lane is full.
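The comparison of cdf curves described above can be sketched as follows. The helper names and the sample values are illustrative assumptions, not simulation outputs; the check states that one curve is "shifted to the left" of another when its cdf is never below the other's.

```python
def empirical_cdf(samples, x):
    """Proportion of replications whose metric value is smaller than x."""
    return sum(1 for value in samples if value < x) / len(samples)


def is_shifted_left(candidate, reference):
    """True if the cdf of candidate is never below the cdf of reference."""
    grid = sorted(set(candidate) | set(reference))
    return all(empirical_cdf(candidate, x) >= empirical_cdf(reference, x) for x in grid)


# Hypothetical metric values from five replications per plan (the case study uses 50).
initial = [0.031, 0.035, 0.040, 0.042, 0.044]
optimal = [0.012, 0.014, 0.015, 0.017, 0.018]
optimal_dominates = is_shifted_left(optimal, initial)
```

Evaluating both step functions at the pooled sample values is sufficient, since neither cdf changes between consecutive values.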
Figures 2-10 and 2-11 have the same structure as Figure 2-9. Figure 2-10 analyses
the performance of the signal plans in terms of the average lane queue-length (in vehicles).
This average is computed over time and over lanes. The x-axis displays the average lane
queue-length. For a given value of x, the y-axis displays the proportion of simulation
replications (out of 50) that have an average lane queue-length smaller than x. As before, the
more these curves are shifted to the left, the better the performance of the corresponding
signal plans. The four plots of Figure 2-10 indicate that, for all initial points, the proposed
optimal signal plans yield a lower average lane queue-length. Figure 2-11 analyses the
performance of the signal plans in terms of the average trip travel times (in minutes). The
x-axis displays the average trip travel time. For a given value of x, the y-axis displays the
proportion of simulation replications (out of 50) that have average trip travel times smaller
than x. For all initial points, the proposed optimal signal plans yield lower average trip
travel times.
2.4.3 Comparison to signal plans derived by commercial signal control software
In this section, we compare the performance of the optimal signal plans with that of a signal
plan obtained from a widely used commercial signal control software, Synchro (Trafficware;
2011). For details on how the signal plan for the city of Lausanne is obtained from
Synchro, we refer the reader to Section 5.3 of Osorio and Chong (2015). Note that Synchro,
which is a signal control optimization software based on a deterministic macroscopic traffic
model, does not solve Problem (2.43)-(2.45).
Figures 2-12, 2-13 and 2-14 consider the same performance metrics as before: average
proportion of time a lane is full, average lane queue-length and average trip travel time.
Each figure displays 9 cdf curves. The four dashed (resp. solid thin) curves correspond
to the four initial (resp. optimal) points of the previous analysis. The solid thick curve
corresponds to the signal plan proposed by Synchro. Recall that for each figure, the more a
cdf curve is shifted to the left, the better the performance of the corresponding signal plan.
For all three figures, the four left-most curves are the four plans proposed by the mixture
model. In other words, for all three performance metrics, the proposed plans outperform
all initial plans as well as the Synchro plan. These figures also show that, for all three
metrics, the performance of the initial plans varies significantly, while the performance of
the proposed signal plans is very similar. This illustrates the robustness of the proposed
model to the quality of the initial points. For two metrics, average proportion of time a
lane is full and average lane queue-length, the Synchro plan outperforms 3 of the 4 initial
plans and performs similarly to the fourth plan. For the average trip travel time metric, the
Synchro plan outperforms all 4 initial points.

[Figure 2-9: Cumulative distribution functions of the average proportion of time a lane is full considering different initial signal plans. Panels (a)-(d): initial points 1-4. x-axis: average proportion of time a lane is full; y-axis: cumulative distribution function F(x); curves: Initial, Optimal.]

[Figure 2-10: Cumulative distribution functions of the average lane queue-length considering different initial signal plans. Panels (a)-(d): initial points 1-4. x-axis: average lane queue-length (in vehicles); y-axis: cumulative distribution function F(x); curves: Initial, Optimal.]

[Figure 2-11: Cumulative distribution functions of the average trip travel times considering different initial signal plans. Panels (a)-(d): initial points 1-4. x-axis: average trip travel time [min]; y-axis: cumulative distribution function F(x); curves: Initial, Optimal.]

[Figure 2-12: Cumulative distribution functions of the average proportion of time a lane is full. x-axis: average proportion of time a lane is full; y-axis: cumulative distribution function F(x); curves: Synchro signal plan, Initial signal plan, Proposed signal plan.]

[Figure 2-13: Cumulative distribution functions of the average lane queue-length. x-axis: average lane queue-length (in vehicles); y-axis: cumulative distribution function F(x); curves: Synchro signal plan, Initial signal plan, Proposed signal plan.]

[Figure 2-14: Cumulative distribution functions of the average trip travel time. x-axis: average trip travel time [min]; y-axis: cumulative distribution function F(x); curves: Synchro signal plan, Initial signal plan, Proposed signal plan.]
2.5 Conclusions
This chapter formulates an analytical stochastic link model that is both computationally
tractable and consistent with the kinetic theory of traffic flow. The model is validated
versus stochastic simulation results, using a simulator of the stochastic link transmission
model. Compared to the model of Osorio and Flötteröd (2015), the proposed model has a
complexity that is linear in the link space capacity, rather than cubic. This leads to signif-
icant gains in computational runtimes. Both models provide an accurate approximation of
the distribution of the link’s boundary conditions. The proposed model is used to address
a signal control problem for the city of Lausanne. It yields signal plans that systematically
outperform initial random plans for various performance metrics. The experiments illus-
trate the robustness of the model to the quality of the initial points. The proposed plans also
outperform a signal plan derived by a widely used commercial signal control software.
Ongoing work formulates scalable probabilistic network models. There are two main
challenges to be addressed. First, there is a need to formulate probabilistic and scalable
node models. The probabilistic model of Osorio et al. (2011) includes a two-link node
model that provides a higher-order description of the across-node dependencies. It yields
the joint distribution of the boundary conditions that each link adjacent to a node pro-
vides to the node, i.e., the joint distribution of the upstream link’s downstream boundary
conditions and the downstream link’s upstream boundary conditions. The extension of this
formulation to nodes with multiple upstream and downstream links is part of ongoing work.
Second, there is a need to formulate scalable network models. For a network with n links,
each with space capacity ℓ, directly coupling the proposed link model with the node model
of Osorio et al. (2011) would yield a model complexity in the order of O(ℓ^n). Such a model
is inappropriate for large-scale network analysis. Ongoing work investigates two research
directions. First, we study the use of network decomposition techniques. For instance,
combining the link and node models with the technique of Flötteröd and Osorio (2014)
would lead to a network model with complexity O(sℓ^r), where s is the number of intersections
and r is the maximum number of links adjacent to an intersection. Second, we study
the use of aggregation-disaggregation techniques that address the curse of dimensionality
by providing an aggregate description of network states (Osorio and Yamani; 2017; Osorio
and Wang; 2017).
Chapter 3
Analytical Probabilistic Link
Transmission Model With Constant
Complexity
This chapter presents an analytical probabilistic link transmission model with constant
complexity. It builds upon the model formulated in Chapter 2. It proposes a formulation
that only tracks two key probability states over time. Therefore, the state space of the
model is of dimension two and is independent of the link’s space capacity.
This contrasts with the model of Chapter 2, whose state space dimension increases linearly with
the link’s space capacity. The model is thus suitable for large-scale network optimization
with time budgets or real-time optimization problems.
3.1 Introduction
In the field of traffic flow modeling, there is a recent and increased interest in the formulation
of probabilistic models. This is facilitated and motivated by a number of factors, including
the increased availability of urban mobility data, advanced sensing technologies that
enable increased data granularity (i.e., resolution) such that more detailed models can be
calibrated and validated, and enhanced computing capabilities such that more elaborate models
can be evaluated. Additionally, transportation agencies in the US and in Europe have recognized
both the importance and the need to evaluate and to improve network robustness
and reliability metrics (U.S. Department of Transportation; 2008; Transport for London;
2010). This calls for a probabilistic description of network performance.
Calvert et al. (2012) discuss the advantages and disadvantages of both deterministic
and stochastic modeling approaches from both methodological and transportation practice
perspectives. They identify the lack of computational efficiency as one of the main chal-
lenges current stochastic models face. Indeed, compared to their deterministic counterparts,
stochastic models may suffer from the curse of dimensionality and are often computation-
ally inefficient for the analysis, let alone the optimization, of large-scale networks. The goal
of this chapter is to propose an analytical stochastic traffic theoretic model that addresses
these scalability and computational efficiency concerns.
Deterministic traffic model formulations and their solution methods have been extensively
studied, leading to seminal works such as Lighthill and Whitham (1955); Richards
(1956b); Daganzo (1994); Newell (2002). The formulation of their stochastic counterparts
is in its early stages. Detailed reviews of stochastic traffic flow models are provided by
Sumalee et al. (2011); Jabari (2012); Calvert et al. (2012); Laval and Chilukuri (2014) and
Chen et al. (2015). This chapter focuses on analytical (i.e., not simulation-based) formula-
tions. In this research area, recent work has proposed formulations based on the variational
theory of Daganzo (2005) (Deng et al.; 2013; Laval and Chilukuri; 2014; Laval and Cas-
trillón; 2015). The most popular approach to formulate a stochastic traffic model is to
add stochasticity to a specific deterministic traffic flow model. For instance, Boel and Mi-
haylova (2006) formulate a stochastic cell-transmission model (CTM) (Daganzo; 1994) by
adding Gaussian noise terms to the sending and receiving functions of the deterministic
CTM. However, for such approaches, the expected traffic dynamics are not guaranteed to
be consistent with their deterministic CTM counterparts. A detailed discussion of this, including
the existence and implications of negative sample paths, is given in Jabari and Liu
(2012). Rather than adding noise directly to the speed-density relationship, Jabari and Liu
(2012) consider stochastic vehicle headways and Jabari et al. (2014a) consider a stochastic
formulation of Newell’s simplified car-following model (Newell; 2002). Probabilistic as-
sumptions are made at the microscopic level, and macroscopic probabilistic speed-density
relationships are then derived. For analytical models that add Gaussian noise terms to a spe-
cific deterministic model, computational inefficiency can arise due to the need to sample
from high-dimensional Gaussian distributions. The work of Zheng et al. (2018) proposes
a stochastic formulation of the model of Newell (1961). Other approaches to stochastic
modeling include gas-kinetic (Boltzmann-like) models of traffic (Paveri-Fontana; 1975;
Hoogendoorn and Bovy; 2001), aggregated traffic modeling approaches (e.g., (sub)region-based
models of Ramezani et al. (2015)) and uncertainty propagation approaches (Sayegh
et al.; 2017).
An alternative approach has been the use of probabilistic queueing theory. Most work
has considered a stationary analysis (Heidemann; 1991, 1994; Heidemann and Wegmann;
1997). Work that considers transient (i.e., non-stationary) analysis includes Olszewski
(1994); Heidemann (2001); van Zuylen and Viti (2003); Viti and Van Zuylen (2010). The majority
of the work on transient analysis considers queues with infinite capacity. In reality,
there is a physical upper bound on the number of vehicles a road can hold, which means the
queueing capacity is finite; as a consequence, spillback is frequently
observed in congested urban networks. Formulations based on both transient queueing theory
and finite (space) capacity queueing network theory have also been proposed (Osorio
et al.; 2011; Osorio and Flötteröd; 2015; Lu and Osorio; 2018). Transient queueing models
can provide a probabilistic and analytical description of congestion build-up
and dissipation. However, the formulation of a probabilistic transient model with sufficient
computational efficiency to enable large-scale network analysis and optimization remains
a challenge (Viti and Van Zuylen; 2010).
This chapter tackles this challenge and extends this literature. The model of Osorio and
Flötteröd (2015) is a stochastic formulation of the deterministic link transmission model
of Yperman et al. (2007), which itself is an operational formulation of Newell’s simplified
theory of kinematic waves (Newell; 1993). The model considers a single link with space
capacity ℓ and represents the link as a set of three queues with finite (space) capacity. It
derives the joint transient probability distribution of the link’s upstream and downstream
boundary conditions. For a link with space capacity ℓ, the model complexity is in the order
of O(ℓ^3). The recent work of Lu and Osorio (2018) (cf. Chapter 2) extends the model
of Osorio and Flötteröd (2015) by making it more computationally efficient. Instead of
deriving the joint distribution of the link’s upstream and downstream boundary conditions,
Lu and Osorio (2018) (cf. Chapter 2) yield the marginal distribution of the link’s upstream
boundary conditions and the marginal distribution of the link’s downstream boundary con-
ditions. They provide a simplified description of the spatial and temporal dependencies
between the upstream and the downstream boundary conditions. The model complexity is
in the order of O(ℓ). This reduction in model complexity enhances the computational effi-
ciency of the model. In this chapter, we formulate a model with further enhanced compu-
tational efficiency. The goal is to enable large-scale network optimization to be performed
efficiently. We extend the model of Lu and Osorio (2018) (cf. Chapter 2) and propose a
formulation with constant complexity, i.e., the complexity no longer depends on the link’s
space capacity ℓ.
The chapter is organized as follows. Section 3.2 briefly reviews the past link model
formulations of Lu and Osorio (2018) (cf. Chapter 2) and Osorio and Flötteröd (2015).
In Section 3.3, we motivate and formulate the proposed model. The proposed model is
validated in Section 3.4. It is then used to address a city-wide signal control problem and
is benchmarked versus other methods (Section 3.5). Section 3.6 summarizes the chapter
and discusses ongoing work. The Appendices contain additional equation derivations and
numerical validation results.
3.2 Past link model formulations
The main ideas of the models of Lu and Osorio (2018) (hereafter referred to
as the mixture model proposed in Chapter 2) and of Osorio and Flötteröd (2015) (hereafter
referred to as the multivariate model) are outlined in Section 2.2.1 of Chapter 2.
Lu and Osorio (2018) (cf. Chapter 2) note that the link’s upstream (resp. downstream)
boundary conditions are described by UQ (resp. DQ). The model is formulated as a
mixture of two independent univariate models: a univariate model of UQ and a univariate
model of DQ. The model tracks the full marginal distributions of UQ and of DQ, over
time. The dimension of the state space for the model is 2(ℓ+ 1), i.e., the model complexity
is in the order of O(ℓ). In other words, Lu and Osorio (2018) (cf. Chapter 2) enhance
the scalability of the multivariate model of Osorio and Flötteröd (2015) by formulating a
model with linear, rather than cubic, complexity in the link’s space capacity ℓ.
In this chapter, we propose a formulation with a state space of dimension 2. In other
words, the model complexity is now independent of the link’s space capacity ℓ. This leads
to enhanced scalability and improves the ability of these models to be used efficiently for
large-scale network optimization. This proposed formulation is simpler than past formu-
lations, yet as illustrated in Section 3.4, it still captures sufficient dependency between
the link’s upstream and downstream boundary conditions. Hereafter, we use the notation
DQ(k) (resp. LI(k), UQ(k), LO(k)) to denote the state of DQ (resp. LI, UQ, LO) at the
end of time interval k. The notations DQ and DQ(k) are used interchangeably.
3.3 Proposed link model formulation
The main idea underlying the proposed model is that in order to describe the link’s bound-
ary conditions, we do not need to track the full marginal distributions of UQ and of DQ
as in Lu and Osorio (2018) (cf. Chapter 2), let alone track their full joint distribution as in
Osorio and Flötteröd (2015). More specifically, we have identified 2 specific queue states
that are essential to describe these boundary conditions. The first state is DQ = 0, which
describes whether or not there is vehicular flow downstream ready to depart the link. The
second state is UQ = ℓ, which describes whether or not there is road space available at
the upstream end of the link. Intuitively, in a network setting with two links, vehicular
flow can be transmitted from the upstream link to the downstream link if the following two
conditions hold: (i) there is flow at the upstream link ready to depart to the downstream
link (i.e., for the upstream link DQ > 0), and (ii) there is space available at the upstream
end of the downstream link (i.e., for the downstream link UQ < ℓ). Thus, for a given link,
the proposed model approximates only 2 state probabilities: P(DQ = 0) and P(UQ = ℓ).
More formally, for a given time interval k, the expected link inflow is defined as:
qin(k) = λ(k)(1− P(UQ(k) = ℓ)). (3.1)
Equation (3.1) states that vehicles can enter the link as long as there is space available at the
upstream end of the link (i.e., UQ(k) < ℓ), which happens with probability P(UQ(k) <
ℓ) = 1− P(UQ(k) = ℓ). Similarly, the expected link outflow is defined as:
qout(k) = µ(k)(1− P(DQ(k) = 0)). (3.2)
Equation (3.2) states that there are vehicle departures from the link as long as there are
vehicles at the downstream end of the link that are ready for departure (i.e., DQ(k) > 0),
which happens with probability P(DQ(k) > 0) = 1− P(DQ(k) = 0).
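Equations (3.1) and (3.2) translate directly into code. The following minimal Python sketch uses hypothetical rates and probabilities; the function names are illustrative and not part of the original formulation.

```python
def expected_inflow(arrival_rate, p_uq_full):
    """Eq. (3.1): q_in(k) = lambda(k) * (1 - P(UQ(k) = l))."""
    return arrival_rate * (1.0 - p_uq_full)


def expected_outflow(service_rate, p_dq_empty):
    """Eq. (3.2): q_out(k) = mu(k) * (1 - P(DQ(k) = 0))."""
    return service_rate * (1.0 - p_dq_empty)


# Hypothetical rates (veh/h) and state probabilities for one time interval.
q_in = expected_inflow(arrival_rate=600.0, p_uq_full=0.25)
q_out = expected_outflow(service_rate=900.0, p_dq_empty=0.5)
```

Only the two scalar probabilities P(UQ(k) = ℓ) and P(DQ(k) = 0) enter these expressions, which is what motivates tracking them alone.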
The mixture model of Lu and Osorio (2018) (cf. Chapter 2) derives the marginal dis-
tributions of UQ(k) and of DQ(k) at every time step k. However, the only information
needed to compute the dynamics of the link’s boundary conditions are the two probabilities
P(UQ(k) = ℓ) and P(DQ(k) = 0).
In this chapter, we propose a model that only keeps track of these two key probabilities
over time (i.e., P(UQ(k) = ℓ) and P(DQ(k) = 0)). It improves model scalability by
reducing the dimension of the state space. The proposed model has a state space of dimen-
sion 2. In other words, its complexity is now constant and no longer depends on the space
capacity of the link. Thus, in a network setting, the proposed model linearly scales with the
number of links in the network, independently of link attributes such as link lengths. The
rest of this section is organized as follows. Section 3.3.1 formulates the model of the link’s
downstream boundary conditions P(DQ(k) = 0). Section 3.3.2 formulates the model of
the link’s upstream boundary conditions P(UQ(k) = ℓ). Section 3.3.3 summarizes the
algorithm for the proposed link model.
3.3.1 Downstream boundary conditions
This section formulates the probabilistic model of the link’s downstream boundary condition
P(DQ(k) = 0). Recall from Section 3.2 that the service process of the link consists of the
service of vehicles in DQ, and that these service times are i.i.d. exponential random variables.
Since the service process of the link is the same as the service process of DQ(k), we do not
need to approximate the service process of DQ(k). In other words, service times of DQ(k)
are independent and identically distributed exponential random variables with exogenous
rate µ(k). Thus, we only need to approximate the arrival process of DQ(k).
As described in Section 3.2, upon entering the link, all vehicles enter LI(k), and then
enter DQ(k). Thus, arrivals to DQ(k) consist of departures from LI(k), and no
arrivals to DQ(k) should be rejected, lost or blocked due to a capacity limit of DQ(k).
This leads us to model DQ(k) as an infinite (space) capacity queue.
We approximate the downstream queue during time interval k, DQ(k), as an M/M/1
queue. The arrival process of DQ(k) is approximated as a Poisson process with endoge-
nous rate λDQ(k). To approximate λDQ(k), recall from Section 3.2 that the arrivals to DQ
correspond to flow that leaves the last cell of LI. In Figure 2-1, this cell is the kfwdth cell,
which is denoted LLI. Thus, flow that enters DQ during time interval k corresponds to
flow that entered the link during time interval k − kfwd. Thus, we approximate the arrival
rate to DQ(k) as the expected flow to enter the link during time interval k− kfwd:
λDQ(k) = qin(k− kfwd). (3.3)
For an M/M/1 queueing system, an exact closed-form expression for the transient
queue-length distribution exists (e.g., Eq. (2.163) of Kleinrock (1975)); however, the use of
such an expression requires keeping track of the entire queue-length distribution at every
time step. Since our aim is to track a single state probability, P(DQ(k) = 0), rather than
the full distribution, we do not use the closed-form expression.
The transient behavior of a system has also been studied with relaxation processes,
which describe the return of a disturbed system to its equilibrium state. Prigogine and
Andrews (1960) were among the first to introduce gas-kinetic models for the
analysis of vehicular traffic flow. Extensions of their work include Paveri-Fontana (1975);
Nelson (1995); Helbing (1997). The notion of relaxation process and relaxation time is
also used in Gazis et al. (1961); Payne (1971); Ross (1988).
In the queueing theory literature, relaxation times have been studied extensively from a
theoretical perspective. Our study differs from past literature in the following ways. We focus on the
approximation of the relaxation time (or equivalently its inverse) of a single probability state (i.e.,
P(DQ(k) = 0) of Eq. (3.4)), while most papers have studied the relaxation time of the
expected queue-length for infinite (space) capacity single-server queueing systems, such as
in Newell (1982, Chap. 5) and in Odoni and Roth (1983). Our proposed approach allows
for an arbitrary initial state for the queueing system, while most theoretical relaxation time
studies focus on initially empty systems (e.g., Newell (1982, Chapter 5)). There are studies,
based on numerical stochastic simulations, that have considered arbitrary deterministic ini-
tial states, such as in Odoni and Roth (1983). Moreover, most studies consider an isolated
queueing system, whereas our work considers several queueing systems that have coupled
dynamics. For instance, the dynamics of DQ(k) and of UQ(k) are coupled due to the
dependencies between the link’s downstream and upstream boundary conditions.
We introduce the following notation:
P(DQ(k) = 0) probability of DQ(k) being empty at the end of time interval k (which is also
the beginning of time interval k+ 1);
P(DQk = 0) time-interval specific stationary probability of DQ = 0;
τDQ(k) inverse of the relaxation time during time interval k.
We propose the following formulation:
\[ P(DQ(k) = 0) = P(DQ_k = 0) + \big[ P(DQ(k-1) = 0) - P(DQ_k = 0) \big]\, e^{-\tau_{DQ}(k)\,\delta}. \tag{3.4} \]
Equation (3.4) states that the transient probability P(DQ(k) = 0) at the end of time interval
k is approximated as the sum of a stationary probability (term P(DQk = 0)) and a term
that decays exponentially with time. The latter term is the difference between the initial
condition of time interval k (term P(DQ(k − 1) = 0)) and the corresponding stationary
probability P(DQk = 0). The functional form of Equation (3.4) is inspired by both the
exact closed-form expression of the transient queue-length distribution of an M/M/1/ℓ of
Morse (1958, Chap. 6, Equation (6.13)) as well as by the approximate expression of the
transient spillback probability (also known as the blocking probability) of an M/M/1/ℓ of
Chong and Osorio (2017, Equation (14a)).
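The recursion of Equation (3.4) can be sketched in a few lines of Python. The stationary probability and the inverse relaxation time are held fixed here for illustration, whereas in the model both are endogenous and vary with k; all numerical values are hypothetical.

```python
from math import exp


def update_p_dq_empty(p_prev, p_stationary, tau, delta):
    """Eq. (3.4): relax P(DQ = 0) from its previous value toward the stationary value."""
    return p_stationary + (p_prev - p_stationary) * exp(-tau * delta)


# Hypothetical values: the probability starts at 0.9 and relaxes toward 0.4.
p = 0.9
trajectory = []
for _ in range(5):
    p = update_p_dq_empty(p, p_stationary=0.4, tau=2.0, delta=0.1)
    trajectory.append(p)
```

With a positive τDQ(k), each step moves the probability strictly closer to its stationary value without overshooting it; with τDQ(k) = 0 the probability stays at its previous value.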
Equation (3.4) contains two endogenous terms, P(DQk = 0) and τDQ(k). We now
present their formulations, starting with that of P(DQk = 0). We define the traffic intensity
of DQ(k) as:
\[ \rho_{DQ}(k) = \frac{\lambda_{DQ}(k)}{\mu(k)}. \tag{3.5} \]
We use the following expression to approximate P(DQk = 0):
\[ P(DQ_k = 0) = \begin{cases} 1 - \rho_{DQ}(k) & \text{if } \rho_{DQ}(k) < 1 \quad \text{(3.6a)} \\ 0 & \text{otherwise} \quad \text{(3.6b)} \end{cases} \]
Equation (3.6a) is obtained from the closed-form expression for the stationary queue-
length distribution of an M/M/1 queue (e.g., Gross (2008, Chap. 2, Equation (2.9))).
However, this expression only holds for ρDQ(k) < 1. In our transient setting, the traffic
intensity ρDQ(k) may temporarily exceed 1. Equation (3.6b) allows for this and is defined
so as to ensure continuity of P(DQk = 0) at ρDQ(k) = 1.
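A minimal Python sketch of Equations (3.5) and (3.6) follows. The handling of a zero service rate is an assumption of this sketch (it is treated as an oversaturated queue), and the function name is illustrative.

```python
def stationary_p_dq_empty(arrival_rate, service_rate):
    """Eqs. (3.5)-(3.6): stationary P(DQ = 0) of an M/M/1 queue, truncated at zero."""
    if service_rate <= 0.0:
        return 0.0  # assumption of this sketch: no service is treated like rho >= 1
    rho = arrival_rate / service_rate  # traffic intensity, Eq. (3.5)
    return 1.0 - rho if rho < 1.0 else 0.0


# Hypothetical rates in veh/h.
p_undersaturated = stationary_p_dq_empty(450.0, 900.0)
p_oversaturated = stationary_p_dq_empty(1200.0, 900.0)
```

The two branches meet at ρDQ(k) = 1, where both give a probability of zero, which is the continuity property noted above.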
We now present the approximation for τDQ(k). In queueing theory, 1/τDQ(k) is known
as the relaxation time, which measures the time a given performance metric needs to reach
its stationary value. Thus, τDQ(k) measures the speed at which the given performance
metric approaches its stationary value (i.e., a larger τDQ(k) corresponds to a higher speed
of convergence to stationary values). We approximate τDQ(k) as follows:
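\[ \tau_{DQ}(k) = \tau_{DQ,1} + \tau_{DQ,2} \tag{3.7a} \]
\[ \tau_{DQ,1} = \left[ 1 - \alpha_1 e^{-\rho_{DQ}(k)} \right] \times \left[ \frac{\mu(k)\,(1 - \rho_{DQ}(k))^2}{1 + \rho_{DQ}(k)} \right] \tag{3.7b} \]
\[ \tau_{DQ,2} = \alpha_2\, \mu(k) \left| \frac{P(DQ(k-1) = 0) - P(DQ_k = 0)}{\ell\,(1 - P(DQ_k = 0))} \right|^{1/5} \tag{3.7c} \]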
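The terms α1 and α2 in Equations (3.7b) and (3.7c), respectively, are defined as follows:
\[ \alpha_1 = \begin{cases} \alpha_{1,1} & \text{if } P(DQ(k-1) = 0) > P(DQ_k = 0) \quad \text{(3.8a)} \\ \alpha_{1,2} & \text{otherwise} \quad \text{(3.8b)} \end{cases} \]
\[ \alpha_2 = \begin{cases} \alpha_{2,1} & \text{if } P(DQ(k-1) = 0) > P(DQ_k = 0) \quad \text{(3.9a)} \\ \alpha_{2,2} & \text{otherwise} \quad \text{(3.9b)} \end{cases} \]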
where α1,1, α1,2, α2,1 and α2,2 are exogenous scalar coefficients. Equation (3.7a) approximates
τDQ(k) as the sum of two relaxation terms: τDQ,1 (defined in Eq. (3.7b))
and τDQ,2 (defined in Eq. (3.7c)). We now describe how this formulation is
derived. In a nutshell, the first term τDQ,1 is formulated based on the relaxation time
study of Newell (1982, Chap. 5, Equation (5.6)), while the second term τDQ,2 is
formulated based on insights from numerical simulation experiments.
More specifically, Equation (3.7b) defines τDQ,1 as the product of 2 terms within brack-
ets. The expression within the right-side bracket corresponds to the inverse of the relaxation
time for the expected queue-length of an M/M/1 system of Newell (1982, Chap. 5, Equa-
tion (5.6)):
\[ \tau = \frac{(1 - \rho)^2 \mu}{C_S^2 + C_A^2 \rho} \tag{3.10} \]
where CA (resp. CS) is the coefficient of variation of the inter-arrival (resp. service) time.
For an M/M/1 system, we have CA = CS = 1, so that Equation (3.10) reduces to the right-side
bracket of Equation (3.7b). The expression within the left-side
bracket is an adjustment term that is based on the following observations.
• The adjustment term should be unit-free. This is based on the observation of Odoni
and Roth (1983) that the relaxation time be scaled in time so that it varies directly
with the units of the arrival or service rates. In other words, two identical queueing
systems measured in different time units should yield the same value of τDQ(k)δ (of
Eq. (3.4) or equivalently Eq. (3.7a)). Thus, τDQ,1 should vary directly with the units
of the arrival or service rates. Since the right-side bracket term varies directly with
the service rate µ, the left-side bracket term should be unit-free.
• We want to be able to model cases where downstream departures are not allowed;
this can arise due to the presence of a red traffic light. This implies that the service
rate reaches 0 (i.e., µ(k) = 0). To allow for this, we define τDQ(k) = λDQ(k)
when µ(k) = 0. This is because when µ(k) = 0, DQ(k) becomes a pure arrival
process with rate λDQ(k). The only way for DQ to be empty at the end of time
interval k (i.e., DQ(k) = 0) is for DQ to be empty at the beginning of time interval
k (i.e., DQ(k − 1) = 0) and for no arrival to occur during time interval k of length δ
(denoted NDQ(k) = 0). This leads to
\begin{align}
P(DQ(k) = 0) &= P(DQ(k-1) = 0)\, P(N_{DQ}(k) = 0) \tag{3.11} \\
&= P(DQ(k-1) = 0)\, e^{-\lambda_{DQ}(k)\delta} \tag{3.12} \\
&= P(DQ^k = 0) + \left[P(DQ(k-1) = 0) - P(DQ^k = 0)\right] e^{-\lambda_{DQ}(k)\delta}. \tag{3.13}
\end{align}
Since arrivals to DQ during time interval k follow a Poisson process with rate λDQ(k),
the number of arrivals in a time interval of length δ (denoted NDQ(k)) follows a Poisson
distribution with parameter λDQ(k)δ, and P(NDQ(k) = 0) = e^{−λDQ(k)δ}. Equation (3.12)
is hence obtained from Equation (3.11). By definition (i.e., Eq. (3.5)), ρDQ(k) → ∞ as
µ(k) → 0, and hence P(DQk = 0) = 0 by Equation (3.6b). Equation (3.13) is
obtained from Equation (3.12) by adding and subtracting the term P(DQk = 0),
which equals zero. Note that Equation (3.13) has exactly the form of Equation (3.4)
with τDQ(k) = λDQ(k). Hence, when µ(k) = 0, τDQ(k) should exist and be equal to
λDQ(k).
Next, we show that τDQ(k) tends to λDQ(k) as µ(k) → 0. In other words, τDQ(k) is
well-defined and continuous in the domain µ(k) ∈ [0,+∞). As µ(k) → 0, τDQ,1, defined
in Eq. (3.7b), becomes:
\[ \lim_{\mu(k) \to 0} \tau_{DQ,1} = \lambda_{DQ}(k). \tag{3.14} \]
Moreover, as µ(k) → 0, τDQ,2 (Eq. (3.7c)) becomes zero, and hence τDQ(k), which
is the sum of the two terms, becomes:
\[ \lim_{\mu(k) \to 0} \tau_{DQ}(k) = \lambda_{DQ}(k). \tag{3.15} \]
The calculations of the limits are given in Appendix B.2.
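As an illustration, the downstream relaxation rate of Equations (3.6) to (3.9) can be sketched in a few lines of Python, which also makes it easy to check the limit of Equation (3.15) numerically. This is a minimal sketch rather than the implementation used in the thesis: the function names are hypothetical, the coefficient values are the fitted values reported in this chapter, the update step reuses the functional form of Equation (3.13), and the handling of the case λDQ(k) = 0 (where the denominator of Equation (3.7c) vanishes) is our assumption.

```python
import math

# Fitted coefficients reported in this chapter: alpha_{1,1}, alpha_{1,2}, alpha_{2,1}, alpha_{2,2}.
ALPHA = {"a11": 0.4, "a12": 0.4, "a21": 0.6, "a22": 0.0}


def stationary_p_empty(lam_dq, mu):
    """Eq. (3.6): M/M/1-based stationary probability that DQ is empty."""
    if mu == 0.0:
        return 0.0  # rho tends to infinity, so Eq. (3.6b) applies
    rho = lam_dq / mu
    return 1.0 - rho if rho < 1.0 else 0.0


def tau_dq(lam_dq, mu, p_prev, ell, alpha=ALPHA):
    """Eqs. (3.7)-(3.9): inverse relaxation time of DQ for one time interval."""
    if mu == 0.0:
        return lam_dq  # pure arrival process, see Eq. (3.13)
    rho = lam_dq / mu
    p_stat = stationary_p_empty(lam_dq, mu)
    building_up = p_prev > p_stat  # Eqs. (3.8a) and (3.9a)
    a1 = alpha["a11"] if building_up else alpha["a12"]
    a2 = alpha["a21"] if building_up else alpha["a22"]
    tau1 = (1.0 - a1 * math.exp(-rho)) * mu * (1.0 - rho) ** 2 / (1.0 + rho)
    if p_stat < 1.0:
        gap = abs(p_prev - p_stat) / (ell * (1.0 - p_stat))
    else:
        gap = 0.0  # assumption: drop the second term when lam_dq = 0
    tau2 = a2 * mu * gap ** 0.2
    return tau1 + tau2


def update_p_empty(lam_dq, mu, p_prev, ell, delta=1.0):
    """Exponential relaxation toward the stationary value, same form as Eq. (3.13)."""
    p_stat = stationary_p_empty(lam_dq, mu)
    tau = tau_dq(lam_dq, mu, p_prev, ell)
    return p_stat + (p_prev - p_stat) * math.exp(-tau * delta)


# As mu shrinks, tau_dq should approach lam_dq = 0.3 (Eq. (3.15)).
for mu in (1e-3, 1e-6, 1e-9):
    print(mu, tau_dq(0.3, mu, 0.5, 10))
```

The explicit branch for µ(k) = 0 avoids a division by zero and returns the limiting value directly, which mirrors the continuity argument made above.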
Equation (3.7c) defines τDQ,2. This formulation is based on insights obtained from
numerical studies using a simulation-based implementation of the stochastic link trans-
mission model. It corresponds to the benchmark simulator used in Osorio and Flötteröd
(2015). This simulator samples individual vehicles, and imposes the forward and backward
lags explicitly for each vehicle. A total of 126 simulation experiments are carried out. Each
experiment starts from an initially empty state and runs for 300 time units; it has one traffic
intensity value for the first 150 time units and another value for the remaining 150 time units.
We use the arrow notation 0.5 → 0.25 to denote an experiment with a traffic intensity that
changes from 0.5 to 0.25. The experiments consider all combinations of traffic intensity
λ/µ ∈ {0.5 → 0.25, 0.75 → 0.25, 1.25 → 0.25, 0.75 → 0.5, 1.25 → 0.5, 1.25 → 0.75},
service rate µ ∈ {0.2, 0.4, 0.6}, and space capacity ℓ ∈ {10, 20, 30, 40, 60, 80, 100}.
Equation (3.7c) defines τDQ,2 based on the following observations from these simula-
tion experiments.
• We want the relaxation time (or equivalently its inverse) to depend on the space
capacity ℓ. This is because we observe that the inverse of the relaxation time of
P(DQ(k) = 0) is inversely related to ℓ. In other words, as ℓ increases, so does the
time needed to reach stationarity. The parameter ℓ is in the denominator of Equation (3.7c).
Thus, as ℓ increases, τDQ,2 decreases, which leads to a longer time to reach
stationarity.
• We want the relaxation time to depend on the distance between the initial state (i.e.,
initial probability) and the steady state. In our simulation experiments, we observe a
positive correlation between the inverse of the relaxation time and the absolute differ-
ence between the initial state and the corresponding steady state probability. Odoni
and Roth (1983) have also observed that for an isolated M/M/1 queueing system
and for arbitrary deterministic initial states, the relaxation time of the expected queue
length may depend on the distance between the initial state and the steady state. The
term $\left|\frac{P(DQ(k-1) = 0) - P(DQ^k = 0)}{1 - P(DQ^k = 0)}\right|$ of Equation (3.7c)
represents the normalized absolute difference between the initial state and the steady state.
Thus, when this term increases, so does τDQ,2, and this leads to a higher speed (i.e., a
shorter time) to reach stationarity.
• We want the relaxation time to depend on whether congestion is propagating (i.e.,
aggravating, building up) or dissipating. The difference in the speed at which traf-
fic propagates or dissipates has been observed experimentally (e.g., Kerner and Re-
hborn (1996)) and numerically (e.g., Orosz et al. (2009)). Similarly, we have ob-
served in the simulation experiments that it takes a shorter time to reach stationarity
when congestion is building up compared to when it is dissipating. The term α1 of
τDQ,1 (i.e., Eq. (3.7b)) and α2 of τDQ,2 (i.e., Eq. (3.7c)) are introduced to account
for this. They are defined by Equations (3.8) and (3.9) as functions of the initial
state P(DQ(k − 1) = 0) and the corresponding steady state P(DQk = 0). More
specifically, Equations (3.8a) and (3.9a) consider the case of congestion building up,
while Equations (3.8b) and (3.9b) consider the case of congestion dissipation.
A description of how the exogenous scalar parameters α1,1, α1,2, α2,1 and α2,2 are fitted
is given in Appendix A.2. In the experiments presented in this chapter, the fitted values are
α1,1 = α1,2 = 0.4, α2,1 = 0.6 and α2,2 = 0. Note that α2,1 > α2,2, which means that,
all else being equal, τDQ(k) is larger under congestion build-up conditions than under
congestion dissipation conditions. In other words, it takes longer to reach stationarity
when congestion is dissipating than when it is building up.
3.3.2 Upstream boundary conditions
This section formulates the probabilistic model of the link’s upstream boundary conditions
P(UQ(k) = ℓ). In queueing theory P(UQ(k) = ℓ) is known as the blocking probability of
UQ(k). In traffic flow theory it represents the spillback probability of the link.
Recall from Section 3.2 that the arrival process to the link is assumed to be a Poisson
process with exogenous rate λ(k). Since the arrival process to the link is the same as
the arrival process to UQ(k), the arrival process to UQ(k) is also a Poisson process with
exogenous rate λ(k). Thus, we only need to approximate the service process of UQ(k).
Recall from Section 3.2 that flow that enters UQ sequentially undergoes the following
three phases of service: (i) it is delayed kfwd time intervals (this delay is represented in
Figure 2-1 by LI); (ii) it enters DQ, where it experiences a sojourn (waiting and service)
time; (iii) vehicular flow that leaves the link (i.e., leaves DQ) generates newly available
road space, which becomes available at the upstream end of the link after a delay of kbwd
time intervals (this delay is represented in Figure 2-1 by LO). Once this space becomes
available upstream, the corresponding flow leaves UQ.
Flow departures from UQ correspond to flow departures from the most downstream
cell of LO (which is denoted LLO in Figure 2-1). Let qLLO(k) denote the expected outflow
from LLO during time interval k. It corresponds to vehicular flow that left the link during
time interval k − kbwd, i.e.:
\[ q_{LLO}(k) = q^{out}(k - k_{bwd}). \tag{3.16} \]
To approximate P(UQ(k) = ℓ), we consider two cases depending on whether or not
qLLO(k) = 0. Note that at time interval k, qLLO(k) is known since it is defined by expected
link outflows from past time intervals (see Eq. (3.16)).
Case qLLO(k) = 0
If qLLO(k) = 0, then the expected outflow from UQ(k) is also zero. This implies that, with
probability 1, there are no departures from UQ(k) (in other words, positive outflow from
UQ(k) occurs with a probability of zero). Thus, UQ(k) is a pure arrival process.
Let N(k) denote the number of attempted new arrivals during time interval k whether
or not they successfully enter UQ. Thus, the number of arrivals that successfully entered
UQ during time interval k is the minimum of N(k) and the available space left, i.e., ℓ −
UQ(k−1). Thus, the number of vehicles in UQ at the end of time interval k (i.e., UQ(k))
is the sum of the number of vehicles in UQ at the beginning of time interval k (i.e., UQ(k−1))
and the number of vehicles that successfully entered UQ:
\[ UQ(k) = UQ(k-1) + \min\{N(k),\ \ell - UQ(k-1)\}. \tag{3.17} \]
Therefore, P(UQ(k) = ℓ) can be obtained as follows:
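\begin{align}
P(UQ(k) = \ell) &= P\big(UQ(k-1) + \min\{N(k), \ell - UQ(k-1)\} = \ell\big) \tag{3.18} \\
&= \sum_{i=0}^{\ell} P\big(\min\{N(k), \ell - i\} = \ell - i \mid UQ(k-1) = i\big)\, P(UQ(k-1) = i) \tag{3.19} \\
&= \sum_{i=0}^{\ell} P\big(N(k) \ge \ell - i \mid UQ(k-1) = i\big)\, P(UQ(k-1) = i) \tag{3.20} \\
&= \sum_{i=0}^{\ell} P\big(N(k) \ge \ell - i\big)\, P(UQ(k-1) = i) \tag{3.21}
\end{align}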
Equation (3.18) gives P(UQ(k) = ℓ) by substituting in Equation (3.17). Equation (3.19)
is obtained by conditioning on the states of UQ at the beginning of time interval k (i.e.,
UQ(k− 1)). In the conditional probability of Equation (3.19), the equality min{N(k), ℓ−
i} = ℓ − i holds if and only if N(k) ≥ ℓ − i. Thus, Equation (3.20) is obtained. Since
the process of attempted arrivals does not have any dependence on the initial state of the
system, P(N(k) ≥ ℓ − i|UQ(k − 1) = i) = P(N(k) ≥ ℓ − i) and Equation (3.21) is
obtained.
Since the arrival process to the link, which is also the arrival process to UQ(k), is a
Poisson process with rate λ(k), then N(k), the number of attempted arrivals during time
interval k, follows a Poisson distribution with parameter λ(k)δ. Thus, P(N(k) ≥ ℓ − i) is
calculated as follows:
\[ P(N(k) \ge \ell - i) = 1 - P(N(k) \le \ell - i - 1) = 1 - e^{-\lambda(k)\delta} \sum_{j=0}^{\ell-i-1} \frac{(\lambda(k)\delta)^j}{j!}. \tag{3.22} \]
Equation (3.21) depends on the full marginal distribution of UQ (i.e., it depends on all
terms P(UQ(k − 1) = i), ∀i ∈ {0, . . . , ℓ}). However, the proposed model does not track
the full distribution of UQ; it only tracks the scalar probability P(UQ(k − 1) = ℓ). Thus,
we propose the following approximation for P(UQ(k − 1) = i), 0 ≤ i ≤ ℓ − 1:
\begin{align}
P(UQ(k-1) = i) &= \frac{1 - P(UQ(k-1) = \ell)}{\sum_{j=0}^{\ell-1} f(j, q_{UQ}(k-1)\delta)}\, f(i, q_{UQ}(k-1)\delta) \tag{3.23a} \\
f(i, q_{UQ}(k-1)\delta) &= \frac{(q_{UQ}(k-1)\delta)^i\, e^{-q_{UQ}(k-1)\delta}}{i!} \tag{3.23b} \\
q_{UQ}(k-1) &= \sum_{r=0}^{k-2} q^{in}(r) - \sum_{r=0}^{k-k_{bwd}-2} q^{out}(r). \tag{3.23c}
\end{align}
Equation (3.23b) gives the probability mass function (pmf) of a Poisson distribution with
parameter qUQ(k− 1)δ. Equation (3.23a) is a normalized and finite support ({0, ..., ℓ− 1})
Poisson distribution (with parameter qUQ(k − 1)δ). The normalization term (the fraction
term) is defined such that:
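\begin{align}
\sum_{i=0}^{\ell} P(UQ(k-1) = i) &= P(UQ(k-1) = \ell) + \sum_{i=0}^{\ell-1} \frac{1 - P(UQ(k-1) = \ell)}{\sum_{j=0}^{\ell-1} f(j, q_{UQ}(k-1)\delta)}\, f(i, q_{UQ}(k-1)\delta) \tag{3.24} \\
&= P(UQ(k-1) = \ell) + \frac{1 - P(UQ(k-1) = \ell)}{\sum_{j=0}^{\ell-1} f(j, q_{UQ}(k-1)\delta)} \sum_{i=0}^{\ell-1} f(i, q_{UQ}(k-1)\delta) \tag{3.25} \\
&= P(UQ(k-1) = \ell) + 1 - P(UQ(k-1) = \ell) = 1 \tag{3.26}
\end{align}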
Equation (3.23c) defines the expected flow in UQ(k − 1) as the difference between: (i)
aggregated (over time) flow that has entered the link up until the end of time interval k− 2
(first summation) and (ii) aggregated (over time) vehicular flow that has left the link up
until the end of time interval k − kbwd − 2 (second summation). The second summation
accounts for the kinematic backward wave delay.
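The pure-arrival case can be summarized computationally as follows. The sketch below chains Equations (3.22), (3.23a), (3.23b) and (3.21); it assumes that qUQ(k − 1) has already been computed from Equation (3.23c) and is passed in as a number. The function names are hypothetical, and the plain factorials used here are only meant for moderate space capacities such as those considered in this chapter (ℓ ≤ 100).

```python
import math


def poisson_pmf(i, x):
    """Eq. (3.23b): Poisson pmf with parameter x evaluated at i."""
    return x ** i * math.exp(-x) / math.factorial(i)


def poisson_tail(n, x):
    """Eq. (3.22): P(N >= n) for N ~ Poisson(x); equals 1 when n <= 0."""
    return 1.0 - sum(poisson_pmf(j, x) for j in range(n))


def approx_uq_distribution(p_full_prev, q_uq_prev, ell, delta=1.0):
    """Eq. (3.23a): truncated, renormalized Poisson over states 0..ell-1,
    with the tracked probability P(UQ(k-1) = ell) appended as the last entry."""
    x = q_uq_prev * delta
    weights = [poisson_pmf(i, x) for i in range(ell)]
    scale = (1.0 - p_full_prev) / sum(weights)
    return [scale * w for w in weights] + [p_full_prev]


def p_full_pure_arrival(p_full_prev, q_uq_prev, lam, ell, delta=1.0):
    """Eq. (3.21): spillback probability when there are no departures from UQ."""
    dist = approx_uq_distribution(p_full_prev, q_uq_prev, ell, delta)
    return sum(poisson_tail(ell - i, lam * delta) * dist[i] for i in range(ell + 1))


# Illustrative inputs only (not taken from the experiments of this chapter).
print(p_full_pure_arrival(p_full_prev=0.1, q_uq_prev=3.0, lam=0.5, ell=10))
```

With λ(k) = 0 the function returns P(UQ(k − 1) = ℓ) unchanged, and for any λ(k) ≥ 0 the result cannot fall below it, as expected for a queue without departures.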
Case qLLO(k) > 0
When qLLO(k) > 0, we account for all three service processes that flow within UQ goes
through, which were mentioned at the start of Section 3.3.2. This leads us to approximate
UQ(k) as an M/G/ℓ/ℓ system. Let us detail this. Denote SDQ(k) the sojourn time (waiting
plus service time) of DQ(k). First, since the service time of UQ(k) is the sum of that of
these three processes, we assume it to be generally distributed. The expected service time,
E[SUQ(k)], is given by:
\[ E[S_{UQ}(k)] = k_{fwd}\delta + E[S_{DQ}(k)] + k_{bwd}\delta \tag{3.27} \]
where E[SDQ(k)] is the expected sojourn time of the DQ(k) system. Second, we approxi-
mate UQ(k) as a multi-server, rather than a single-server, queueing system. This is because
the flow in LI and in LO is served (or processed) simultaneously, rather than sequentially.
We introduce the following notation.
P(UQ(k) = ℓ) probability of UQ(k) being full at the end of time interval k (which is also
the beginning of time interval k+ 1);
P(UQk = ℓ) time-interval specific stationary probability of UQ = ℓ;
τUQ(k) inverse of the relaxation time during time interval k.
If qLLO(k) > 0, then we use the same functional form as for DQ(k) (Eq. (3.4)) to approx-
imate P(UQ(k) = ℓ):
\[ P(UQ(k) = \ell) = P(UQ^k = \ell) + \left[P(UQ(k-1) = \ell) - P(UQ^k = \ell)\right] e^{-\tau_{UQ}(k)\delta}. \tag{3.28} \]
Just as for DQ(k) (Eq. (3.4)), the transient probability P(UQ(k) = ℓ) is defined as the sum
of a time-interval specific stationary probability (term P(UQk = ℓ)) and a term that decays
exponentially with time and accounts for the difference between the initial conditions (i.e.,
P(UQ(k − 1) = ℓ)) and the corresponding stationary probability (i.e., P(UQk = ℓ)).
The functional form of Equation (3.28) is inspired by Jagerman (1975, Equation (166))
which expresses the transient blocking probability of an M/M/ℓ/ℓ system as the sum of
the corresponding stationary probability (known as the Erlang-B formula, it is presented
below in Equation (3.29a)) and a term that decays exponentially with time. We consider
generally distributed, rather than Markovian, service times. Nonetheless, Equation (3.28)
uses a functional form similar to that of the transient blocking probability of an M/M/ℓ/ℓ
queue in Jagerman (1975, Equation (166)).
The stationary probability P(UQk = ℓ) of Equation (3.28) is approximated as follows:
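\begin{align}
P(UQ^k = \ell) &= \frac{\rho_{UQ}(k)^{\ell} / \ell!}{\sum_{n=0}^{\ell} \rho_{UQ}(k)^n / n!} \tag{3.29a} \\
\rho_{UQ}(k) &= \lambda(k) / \mu_{UQ}(k) \tag{3.29b} \\
\mu_{UQ}(k) &= 1 / \left(k_{fwd}\delta + E[S_{DQ}(k)] + k_{bwd}\delta\right) \tag{3.29c} \\
E[S_{DQ}(k)] &= \frac{\ell \rho_{DQ}(k)^{\ell+1} - (\ell+1)\rho_{DQ}(k)^{\ell} + 1}{\mu(k)\,(1 - \rho_{DQ}(k)^{\ell})\,(1 - \rho_{DQ}(k))}. \tag{3.29d}
\end{align}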
The M/G/ℓ/ℓ system has been extensively studied and is known as the Erlang loss model.
Equation (3.29a) is the stationary blocking probability of an M/G/ℓ/ℓ system. It is known as
the Erlang-B formula. It was first derived by Erlang (1917) for an M/M/ℓ/ℓ system; Khinchin
(1962) later proved that it holds for generally distributed service times with finite expectation.
Equation (3.29b) is the definition of the traffic intensity of the M/G/ℓ/ℓ system: it is the ratio
of the arrival rate λ(k) to the service rate µUQ(k). The latter is the inverse of the expected
service time and is given by Equation (3.29c) or, equivalently, by the inverse of Equation (3.27).
Equation (3.29d) approximates the expected sojourn time of DQ(k). The expression corresponds
to the closed-form
expression for the expected sojourn time of an M/M/1/ℓ system (e.g., Gross (2008, Chap.
2, Equations (2.48) and (2.51))). Note that Equation (3.29d) uses an M/M/1/ℓ model for
DQ(k), while Section 3.3.1 uses an M/M/1 model. The use of an M/M/1/ℓ model
yields an expected sojourn time that: (i) is well defined for all traffic intensity values (i.e.,
even when λDQ(k) ≥ µ(k)) and (ii) is bounded from above. This is not the case for the
expected sojourn time of an M/M/1 model (i.e., E[SDQ(k)] = 1/(µ(k)−λDQ(k))), which
assumes a traffic intensity strictly smaller than 1 and goes to infinity as the traffic intensity
approaches 1.
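To make the chain of Equations (3.29a) to (3.29d) concrete, the following Python sketch evaluates the stationary spillback probability. It is an illustration under stated assumptions, not the thesis code: the function names are hypothetical, the Erlang-B value is computed with the standard recursion B(n) = ρB(n−1)/(n + ρB(n−1)) instead of the factorial form (the two are algebraically equivalent, and the recursion avoids large factorials), and a first-principles computation of the M/M/1/ℓ sojourn time is included only as a cross-check of Equation (3.29d). The closed form is a 0/0 expression at ρDQ(k) = 1, so that point is excluded here.

```python
def erlang_b(rho, ell):
    """Eq. (3.29a), evaluated with the recursion B(n) = rho*B(n-1) / (n + rho*B(n-1))."""
    b = 1.0
    for n in range(1, ell + 1):
        b = rho * b / (n + rho * b)
    return b


def mm1l_sojourn_closed_form(rho, mu, ell):
    """Eq. (3.29d): expected sojourn time of an M/M/1/ell queue (requires rho != 1)."""
    num = ell * rho ** (ell + 1) - (ell + 1) * rho ** ell + 1.0
    return num / (mu * (1.0 - rho ** ell) * (1.0 - rho))


def mm1l_sojourn_direct(rho, mu, ell):
    """Cross-check: an accepted arrival finds n vehicles with probability
    proportional to rho**n (n = 0..ell-1) and then stays for n + 1
    exponential services of mean 1/mu."""
    weights = [rho ** n for n in range(ell)]
    return sum(w * (n + 1) for n, w in enumerate(weights)) / (mu * sum(weights))


def stationary_p_full(lam, lam_dq, mu, ell, k_fwd, k_bwd, delta=1.0):
    """Eqs. (3.29a)-(3.29d) chained together (requires mu > 0 and lam_dq != mu)."""
    e_s_dq = mm1l_sojourn_closed_form(lam_dq / mu, mu, ell)   # Eq. (3.29d)
    mu_uq = 1.0 / (k_fwd * delta + e_s_dq + k_bwd * delta)    # Eq. (3.29c)
    return erlang_b(lam / mu_uq, ell)                         # Eqs. (3.29a)-(3.29b)


# Illustrative call with ell = 10, k_fwd = 5, k_bwd = 10 and mu = 0.4 veh/sec.
print(stationary_p_full(lam=0.3, lam_dq=0.3, mu=0.4, ell=10, k_fwd=5, k_bwd=10))
```

The direct computation also shows why the M/M/1/ℓ sojourn time stays bounded for ρDQ(k) > 1: an accepted vehicle never finds more than ℓ − 1 vehicles ahead of it.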
The endogenous parameter τUQ(k) of Eq. (3.28) represents the inverse of the relaxation
time. As discussed in Section 3.3.1, it measures the speed of convergence to the stationary
value. We approximate τUQ(k) as follows.
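\begin{align}
\tau_{UQ}(k) &= \alpha_3\, \frac{\ell\, \mu_{UQ}(k)\, (1 - \rho_{UQ}(k)/\ell)^2}{1 + C_{UQ}(k)^2} \times \frac{1}{\ell} \tag{3.30a} \\
C_{UQ}(k) &= \frac{\sqrt{\mathrm{Var}(S_{UQ}(k))}}{E[S_{UQ}(k)]} \tag{3.30b} \\
\mathrm{Var}(S_{UQ}(k)) &= \Big[\ell \rho_{DQ}(k)^{2\ell+2} - 2\ell \rho_{DQ}(k)^{2\ell+1} + (\ell+1)\rho_{DQ}(k)^{2\ell} - \ell(\ell+1)\rho_{DQ}(k)^{\ell+2} \nonumber \\
&\quad + 2\ell(\ell+1)\rho_{DQ}(k)^{\ell+1} - (\ell^2 + \ell + 2)\rho_{DQ}(k)^{\ell} + 1\Big] \Big/ \Big[\mu(k)^2 (1 - \rho_{DQ}(k)^{\ell})^2 (1 - \rho_{DQ}(k))^2\Big], \tag{3.30c}
\end{align}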
where the term α3 in Equation (3.30a) is defined by:
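\[
\alpha_3 =
\begin{cases}
\alpha_{3,1} & \text{if } P(UQ(k-1) = \ell) < P(UQ^k = \ell) \qquad \text{(3.31a)} \\
\alpha_{3,2} & \text{otherwise} \qquad \text{(3.31b)}
\end{cases}
\]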
Studies of the relaxation time of an Erlang loss model are limited and mostly focus
on the special case of an M/M/ℓ/ℓ system (e.g., van Doorn and Zeifman (2009)) and
on asymptotic study of the relaxation time of the expected queue length. Our focus is
on the relaxation time of the blocking probability of an Erlang loss model with generally
distributed service times (i.e., M/G/ℓ/ℓ). To derive an approximation of the relaxation
time for such a system, we follow the approach of Roth (1994). Roth assumes that the
functional form of the relaxation time of a multi-server M/M/ℓ system is the same as that
of the single server M/M/1 system and replaces µ with ℓµ. The expression for a single-
server M/M/1 system is derived by Odoni and Roth (1983). We proceed similarly: we
use the same functional form for the relaxation time of an M/G/1 system, and replace µ
with ℓµ.
We first derive the inverse of the relaxation time of an M/G/1 system. Roth (1994) studies the
relaxation times of infinite-capacity, single-server queueing systems that are Markovian
(including systems in which inter-arrival and service times can be represented as exponential,
Erlangian, hyperexponential, and phase-type random variables) and obtains a general
inverse relaxation time formula for such systems, $\tau = \frac{\mu (1 - \lambda/\mu)^2}{C_A^2 + C_S^2}$,
where CA (resp. CS) is the coefficient of variation of the inter-arrival (resp. service) time.
Note that any distribution can be approximated arbitrarily well by a phase-type distribution
(Kingman; 1963), and hence we approximate the relaxation time of an M/G/1 system by that of
an M/PH/1 system with CA = 1 and CS equal to the coefficient of variation of the service time
of UQ(k), denoted CUQ(k) and given by Equation (3.30b). We then obtain the inverse of the
relaxation time of an M/G/ℓ system as:
\[ \tau_{M/G/\ell} = \frac{\ell \mu_{UQ}(k) \left(1 - \lambda(k)/(\ell \mu_{UQ}(k))\right)^2}{1 + C_{UQ}(k)^2} = \frac{\ell \mu_{UQ}(k) \left(1 - \rho_{UQ}(k)/\ell\right)^2}{1 + C_{UQ}(k)^2}. \tag{3.32} \]
Notice that Equation (3.32) is exactly the left factor of Equation (3.30a), up to the
coefficient α3.
In this way, we obtain the inverse relaxation time of an M/G/ℓ queueing system, which we
then adjust for finite capacity. To do so, we follow the idea of Chong and Osorio (2017),
in which the relaxation time of an M/M/1/ℓ system is approximated as the product of the
relaxation time of an M/M/1 system and the capacity of the queue ℓ. We thus approximate
the inverse relaxation time of an M/G/ℓ/ℓ system as the inverse of the product of the
relaxation time of an M/G/ℓ system and the capacity of the queue. Equation (3.30a) for
τUQ(k) of an M/G/ℓ/ℓ queue is thus obtained as the product of τM/G/ℓ and 1/ℓ. The numerator
of Equation (3.30b) is the square root of the variance given by Equation (3.30c) and the
denominator is given by Equation (3.27). Equation (3.30c) is obtained as follows. Recall that
by definition: SUQ(k) = kfwdδ + SDQ(k) + kbwdδ, where kfwdδ and kbwdδ are constant
delays. Thus, Var(SUQ(k)) = Var(SDQ(k)). Equation (3.30c) is derived in Appendix B.4
and represents the variance of the sojourn time of an M/M/1/ℓ system.
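The upstream relaxation rate can be sketched in the same way. The Python code below evaluates Equation (3.30c), cross-checks it against a direct computation (the sojourn time of an accepted vehicle that finds n vehicles in an M/M/1/ℓ queue is Erlang distributed with n + 1 phases), and then assembles Equations (3.30a), (3.30b), (3.31) and (3.28). The function names are hypothetical, the default coefficients are the fitted values α3,1 = 25 and α3,2 = 7.5 reported in this chapter, and the closed form is assumed to be evaluated away from ρDQ(k) = 1, where it is a 0/0 expression.

```python
import math


def var_sojourn_closed_form(rho, mu, ell):
    """Eq. (3.30c): variance of the M/M/1/ell sojourn time (requires rho != 1)."""
    l = ell
    num = (l * rho ** (2 * l + 2) - 2 * l * rho ** (2 * l + 1)
           + (l + 1) * rho ** (2 * l) - l * (l + 1) * rho ** (l + 2)
           + 2 * l * (l + 1) * rho ** (l + 1) - (l * l + l + 2) * rho ** l + 1.0)
    return num / (mu ** 2 * (1.0 - rho ** l) ** 2 * (1.0 - rho) ** 2)


def var_sojourn_direct(rho, mu, ell):
    """Cross-check: mixture of Erlang(n + 1, mu) sojourn times with weights
    proportional to rho**n, n = 0..ell-1."""
    w = [rho ** n for n in range(ell)]
    total = sum(w)
    m1 = sum(wn * (n + 1) for n, wn in enumerate(w)) / (mu * total)
    m2 = sum(wn * (n + 1) * (n + 2) for n, wn in enumerate(w)) / (mu ** 2 * total)
    return m2 - m1 ** 2


def tau_uq(lam, mu_uq, var_s_uq, p_prev, p_stat, ell, a31=25.0, a32=7.5):
    """Eqs. (3.30a), (3.30b) and (3.31), using E[S_UQ] = 1 / mu_uq (Eq. (3.29c))."""
    c_uq = math.sqrt(var_s_uq) * mu_uq                  # Eq. (3.30b)
    alpha3 = a31 if p_prev < p_stat else a32            # Eq. (3.31)
    rho_uq = lam / mu_uq                                # Eq. (3.29b)
    tau_mgl = ell * mu_uq * (1.0 - rho_uq / ell) ** 2 / (1.0 + c_uq ** 2)  # Eq. (3.32)
    return alpha3 * tau_mgl / ell                       # Eq. (3.30a)


def update_p_full(p_prev, p_stat, tau, delta=1.0):
    """Eq. (3.28): exponential relaxation of the spillback probability."""
    return p_stat + (p_prev - p_stat) * math.exp(-tau * delta)


# Illustrative values only: ell = 10, mu = 0.4, rho_DQ = 0.75, mu_UQ = 0.05.
var_s = var_sojourn_closed_form(0.75, 0.4, 10)
tau = tau_uq(lam=0.3, mu_uq=0.05, var_s_uq=var_s, p_prev=0.0, p_stat=0.1, ell=10)
print(var_s, tau, update_p_full(0.0, 0.1, tau))
```

Because the forward and backward lags are constants, the variance passed to `tau_uq` is that of the DQ sojourn time alone, as noted above.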
The proposed expression for τUQ(k), defined by the System of Equations (3.30) has the
following properties:
• Just as for the expression proposed for τDQ(k) (Eq. (3.7)), the expression for τUQ(k)
has the same time units as the arrival and service rates. More specifically, Equation (3.30a)
has the same units as µUQ(k) (note that ρUQ(k), CUQ(k) and ℓ are unit-free).
• Davis et al. (1995) study non-stationary Erlang loss models with a special focus
on Mt/PH/n/n systems. They observe that the inverse of the relaxation time de-
creases, as the variability of the service time increases. In other words, the more
variable the service time, the longer it takes to reach stationarity. This holds for the
proposed equation. In Equation (3.30a), τUQ(k) decreases as CUQ(k), the coefficient of
variation of the service time of UQ(k), increases. Thus, the higher
the variability of SUQ(k), the higher CUQ(k), the smaller τUQ(k), and thus the longer
it takes to reach stationarity.
• Just as for τDQ(k) (Equation (3.7)), the relaxation time depends on whether con-
gestion is building up or dissipating. Thus, Equation (3.31) defines α3 as a func-
tion of whether congestion is building up (Equation (3.31a)) or dissipating (Equa-
tion (3.31b)).
We use the same 126 simulation experiments described in Section 3.3.1 to fit the ex-
ogenous scalar coefficients α3,1 and α3,2 of Equation (3.31). A description of how these
coefficients are estimated is given in Appendix B.5. The fitted values are α3,1 = 25 and
α3,2 = 7.5. Note that α3,1 > α3,2, which means that when traffic dissipates it takes a longer
time to reach stationarity than when traffic builds up.
Algorithm
Algorithm 4 summarizes the proposed model. Steps 1 through 5 are initialization steps.
Step 6 is carried out iteratively; it yields, for each time interval, the two key probabilities
P(DQ(k) = 0) and P(UQ(k) = ℓ), as well as the expected link inflow (i.e., qin(k)) and the
expected link outflow (i.e., qout(k)). All function evaluations can be done sequentially, and
no simultaneous evaluation of a system of equations is required. This makes the algorithm
computationally efficient.
Algorithm 4 Link model algorithm
1. set exogenous link parameters ρ, v, w, ℓ and the duration of each time interval δ
2. compute the forward and backward lags: kfwd = ⌈ℓ/(ρvδ)⌉ and kbwd = ⌈ℓ/(ρ|w|δ)⌉
3. set, for each time interval, the exogenous arrival rates and service rates λ(k) and µ(k) for all k = 1, 2, ...
4. set initial link conditions: qin(0), qout(0), qUQ(0), P(UQ(0) = ℓ) and P(DQ(0) = 0)
5. set qin(k) = 0 and qout(k) = 0 for k < 0
6. repeat the following for time intervals k = 1, 2, ...
(a) compute λDQ(k) and qLLO(k) according to Eq. (3.3) and Eq. (3.16), respectively
(b) compute ρDQ(k) according to Eq. (3.5)
(c) compute P(DQk = 0) according to Eq. (3.6)
(d) compute α1 and α2 according to Eq. (3.8) and Eq. (3.9), respectively
(e) compute τDQ,1 and τDQ,2 according to Eq. (3.7b) and Eq. (3.7c), respectively
(f) compute τDQ(k) according to Eq. (3.7a)
(g) compute P(DQ(k) = 0) according to Eq. (3.4)
(h) if qLLO(k) = 0:
i. compute qUQ(k − 1) according to Eq. (3.23c)
ii. compute f(i, qUQ(k − 1)δ) ∀i ∈ {0, 1, ..., ℓ − 1} according to Eq. (3.23b)
iii. compute P(N(k) ≥ ℓ − i) ∀i ∈ {0, 1, ..., ℓ} and compute P(UQ(k − 1) = i) ∀i ∈ {0, 1, ..., ℓ − 1} according to Eq. (3.22) and Eq. (3.23a), respectively
iv. compute P(UQ(k) = ℓ) according to Eq. (3.21)
else:
i. compute E[SDQ(k)] and Var(SUQ(k)) according to Eq. (3.29d) and Eq. (3.30c), respectively
ii. compute E[SUQ(k)] and µUQ(k) according to Eq. (3.27) and Eq. (3.29c), respectively
iii. compute ρUQ(k) and CUQ(k) according to Eq. (3.29b) and Eq. (3.30b), respectively
iv. compute P(UQk = ℓ) according to Eq. (3.29a)
v. compute α3 according to Eq. (3.31)
vi. compute τUQ(k) according to Eq. (3.30a)
vii. compute P(UQ(k) = ℓ) according to Eq. (3.28)
(i) compute qin(k) and qout(k) according to Eq. (3.1) and (3.2), respectively

3.4 Validation
In this section, we evaluate the accuracy and the computational efficiency of the proposed
model. We compare its performance to that of two stochastic simulators: (i) the stochastic
link transmission model (LTM) simulator (Section 3.4.1), and (ii) the microscopic traffic
simulator Aimsun (TSS; 2014) (Section 3.4.2). The stochastic LTM simulator is a discrete-event
implementation of a stochastic formulation of the deterministic link transmission
model (LTM) (Yperman et al.; 2007). It assumes an inhomogeneous Poisson arrival process
at the upstream end of the link and a stochastic departure process at the downstream end
of the link. The simulator samples individual vehicles and implements the exact forward
and backward lags. The vehicles at the downstream end of the link are served following
a first-come first-served rule. The service times are independent and identically distributed
exponential random variables. The simulator was used as a benchmark in the validation
experiments in Osorio and Flötteröd (2015) and in Lu and Osorio (2018) (cf. Chapter 2).
The microscopic simulator Aimsun (TSS; 2014) is a commercial traffic simulator that relies
on disaggregate car-following and lane-changing models for individual vehicles.
3.4.1 Validation versus a stochastic link transmission model simulator
We benchmark the performance of the proposed model versus that of the multivariate model
(Osorio and Flötteröd; 2015) and of the mixture model (Lu and Osorio; 2018) (cf. Chapter
2). The analytical approximations provided by each of the three analytical models are
compared to simulation-based estimates obtained from 10^6 simulation replications of the
stochastic LTM simulator.
We consider a link with parameters defined in Table 3.1 for all experiments conducted
in Section 3.4.1. First, we consider two experiments with time-varying demand and evalu-
ate the ability of the proposed model to approximate the link’s upstream and downstream
boundary conditions. The space capacity of the link ℓ, is fixed at 10 for both experiments.
The link is initially empty. Experiment 1 considers a case where traffic conditions change
from uncongested to highly-congested and then to congested. More specifically, it consid-
ers an arrival rate, λ(k), of 0.1 veh/sec during the interval [0, 125] seconds, of 0.5 veh/sec
during the interval [125, 175] seconds and of 0.3 veh/sec during the interval [175, 300] sec-
onds. Experiment 2 considers a case where traffic conditions change from congested to
uncongested and then to highly-congested. It considers an arrival rate of 0.3 veh/sec during
the interval [0, 100] seconds, of 0.1 veh/sec during the interval [100, 200] seconds and of
0.5 veh/sec during the interval [200, 300] seconds.
Figure 3-1 considers experiment 1. The left (resp. right) plot considers P(DQ(T) = 0)
(resp. P(UQ(T) = ℓ)). For each plot, the x-axis displays the integer time T in seconds and
the y-axis displays the corresponding probability. The simulated estimates are displayed as
a red line with asterisks, those of the proposed model are the black solid line, those of the
multivariate model are the black dot-dashed line, and those of the mixture model are the
black dashed line. The simulated estimates are displayed with 95% confidence intervals,
which are barely visible.

Parameter            Value
v                    0.01 km/sec
w                    −0.005 km/sec
ρ                    200 veh/km
q                    2400 veh/h = 0.67 veh/sec
δ                    1 sec
µ(k)                 1440 veh/h = 0.4 veh/sec
λ(k)                 varies by experiment
ℓ, L, kfwd, kbwd     varies by experiment
Table 3.1: Link parameters

[Figure 3-1: Experiment 1: impact of the temporal variation of demand on the link’s upstream and downstream boundary conditions. Left panel: P(DQ(T) = 0) versus time T; right panel: P(UQ(T) = ℓ) versus time T. Legend: Sim, Proposed, Multivariate, Mixture.]
Recall that in experiment 1, there is a sharp increase in demand at time T = 125 seconds
and a sharp decrease at time T = 175 seconds. All analytical models yield similar temporal
trends for both P(DQ(T) = 0) and P(UQ(T) = ℓ). More specifically, as congestion
increases, we expect P(DQ(T) = 0) to decrease and P(UQ(T) = ℓ) to increase. Similarly,
as congestion decreases, we expect P(DQ(T) = 0) to increase and P(UQ(T) = ℓ) to
decrease. All models exhibit these trends. They all capture the sharp decrease and increase
trends of the simulator. The multivariate model yields the most accurate approximation.
The proposed model tends to overestimate the stationary spillback probability during the
highly congested regime, whereas the mixture model tends to underestimate the stationary
spillback probability during both the congested and the highly congested regimes. Overall,
the proposed model yields a good approximation of both probabilities P(DQ(T) = 0) and
P(UQ(T) = ℓ).

[Figure 3-2: Experiment 2: impact of the temporal variation of demand on the link’s upstream and downstream conditions. Left panel: P(DQ(T) = 0) versus time T; right panel: P(UQ(T) = ℓ) versus time T. Legend: Sim, Proposed, Multivariate, Mixture.]
The results of experiment 2 are displayed in Figure 3-2. These plots have the same
layout as those of Figure 3-1. The simulated estimates are displayed with 95% confidence
intervals, which are barely visible. Recall that experiment 2 considers a sharp decrease in
demand at T = 100 seconds and a sharp increase at T = 200 seconds. The left plot shows
an increase in P(DQ(T) = 0) after T = 100 seconds and a decrease after T = 200 seconds.
The right plot shows that the spillback probability (P(UQ(T) = ℓ)) decreases after T = 100
seconds and increases after T = 200 seconds. All analytical models capture these sharp
changes in probability mass. The proposed model yields a less accurate approximation of
the stationary spillback probability during the highly congested regime, while the mixture
model yields a less accurate approximation for both the congested and the highly congested
regimes. Overall, all three models approximate well the dynamics of the link’s boundary
conditions for sudden and significant changes in congestion levels.
Next, we benchmark the accuracy of the proposed model over a set of 21 experiments.
All experiments start with an empty link and have a duration of TF = 300 seconds. For each
experiment, the arrival rate changes at 150 seconds. The experiments consider all combi-
nations of the following time-varying arrival rates (λ ∈ {0.2 → 0.1, 0.3 → 0.1, 0.3 →0.2} veh/sec) and space capacities (ℓ ∈ {10, 20, 30, 40, 60, 80, 100}). The space capac-
ity values considered correspond to link lengths L ∈ {50, 100, 150, 200, 300, 400, 500}
(in meters), forward lags kfwd ∈ {5, 10, 15, 20, 30, 40, 50} and backward lags kbwd ∈
{10, 20, 30, 40, 60, 80, 100}. For all experiments the arrival rates are such that congestion
builds up during the first 150 seconds and the link reaches a stationary regime, then the
arrival rate decreases, congestion dissipates and the link reaches a less congested stationary
regime. For each experiment and each model, we set a maximum computation runtime of
40 hours. Model evaluations that have not concluded within the 40 hours are terminated.
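The correspondence between the space capacities, link lengths and lags listed above follows from step 2 of Algorithm 4 and the parameters of Table 3.1. The short Python sketch below reproduces it; the function name and the small round-off tolerance inside the ceiling are our own additions, and the link length is taken as L = ℓ/ρ.

```python
import math


def link_geometry(ell, rho_jam=200.0, v=0.01, w=-0.005, delta=1.0):
    """Return (L in metres, k_fwd, k_bwd) for a link holding up to ell vehicles.

    Defaults are the Table 3.1 values: rho_jam in veh/km, v and w in km/sec,
    delta in sec. The 1e-9 tolerance guards the ceiling against round-off.
    """
    length_km = ell / rho_jam
    k_fwd = math.ceil(length_km / (v * delta) - 1e-9)
    k_bwd = math.ceil(length_km / (abs(w) * delta) - 1e-9)
    return 1000.0 * length_km, k_fwd, k_bwd


for ell in (10, 20, 30, 40, 60, 80, 100):
    print(ell, link_geometry(ell))
```

Since |w| is half of v, the backward lag is twice the forward lag for every space capacity considered.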
The error metric used to evaluate the accuracy of a given analytical model is the average
absolute difference between the simulated estimate and the analytical approximation:
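\begin{align}
e_{DQ} &= \frac{1}{T_F} \sum_{T=1}^{T_F} \left| P_A(DQ(T) = 0) - P_S(DQ(T) = 0) \right| \tag{3.33} \\
e_{UQ} &= \frac{1}{T_F} \sum_{T=1}^{T_F} \left| P_A(UQ(T) = \ell) - P_S(UQ(T) = \ell) \right|, \tag{3.34}
\end{align}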
where PA denotes the probability approximated by an analytical model (proposed, mixture
or multivariate) and PS denotes the simulated estimate.
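For completeness, the error metric of Equations (3.33) and (3.34) amounts to a mean absolute difference between two time series of equal length. A minimal Python sketch is given below; the function name is hypothetical and the numbers in the example are made up for illustration, not taken from the experiments.

```python
def mean_abs_error(analytical, simulated):
    """Eqs. (3.33)-(3.34): average absolute gap between the analytical
    approximation and the simulated estimate over T = 1..T_F."""
    if len(analytical) != len(simulated):
        raise ValueError("both series must cover the same time steps")
    return sum(abs(a - s) for a, s in zip(analytical, simulated)) / len(analytical)


# Made-up three-step series, for illustration only.
print(mean_abs_error([0.5, 0.4, 0.3], [0.5, 0.5, 0.1]))
```

The same function serves for both eDQ and eUQ; only the pair of probability series passed to it changes.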
Figure 3-3 displays the average absolute difference for the 21 experiments. The top
(resp. bottom) three plots consider the spillback probability P(UQ(T) = ℓ) (resp. P(DQ(T) =
0)). The first, second and third column of plots consider the experiments with arrival rate
0.2 → 0.1 veh/sec, 0.3 → 0.1 veh/sec and 0.3 → 0.2 veh/sec, respectively. Each plot
compares the three models: the proposed model (circles), the mixture model (asterisks)
and the multivariate model (triangles). The x-axis displays the space capacity (i.e., ℓ) and
the y-axis displays the average absolute difference (i.e., eUQ or eDQ). The top three plots,
which consider the spillback probability, have a logarithmic-scaled y-axis. For the experiments with space capacity greater than 30 (i.e., ℓ > 30), the multivariate model does not conclude within 40 hours, thus these runs are terminated and are not displayed in the plots.

[Figure 3-3: Comparison of the average absolute errors for the 21 experiments with time-varying demand. Panels (a) and (d): arrival rate λ(k) = 0.2 → 0.1 veh/sec; panels (b) and (e): λ(k) = 0.3 → 0.1 veh/sec; panels (c) and (f): λ(k) = 0.3 → 0.2 veh/sec. Legend: Proposed, Mixture, Multivariate.]
The main insights from Figure 3-3 are as follows. For most experiments, the multivari-
ate model gives the lowest errors for both P(UQ(T) = ℓ) and P(DQ(T) = 0), followed
by the mixture model. For the spillback probabilities (i.e., top three plots), both the pro-
posed model and the mixture model have errors that decrease exponentially as the space
capacity ℓ increases. As the space capacity increases, the error in the approximation of P(DQ(T) = 0) (bottom three plots) for the proposed model remains of the same order of magnitude. The numerical values of the errors displayed in Figure 3-3 are provided in Appendix B.6 (Tables B.1 and B.2). The average (over the 21 experiments) eUQ and eDQ of the
proposed model are 0.0006 and 0.0180, respectively, whereas those of the mixture model
are 0.0023 and 0.0073. Compared to the mixture model, the proposed model, on average,
gains accuracy in approximating the upstream boundary conditions and loses accuracy in
approximating the downstream boundary conditions.
Overall, the multivariate model has the highest accuracy, yet is computationally inef-
ficient for large space capacity values. The proposed model and the mixture model have
comparable accuracy. They both perform well for all experiments.
We now compare the multivariate model, the mixture model and the proposed model in
terms of computational runtime. Figure 3-4 compares the runtimes for the 21 experiments.
Figure 3-4(a), 3-4(b) and 3-4(c) consider the experiments with time-varying arrival rate
0.2 → 0.1 veh/sec, 0.3 → 0.1 veh/sec and 0.3 → 0.2 veh/sec, respectively. Figure 3-4(d)
plots the average runtime over all three time-varying arrival rate experiments. For each plot,
the x-axis displays the space capacity ℓ and the y-axis displays the computational runtime
(in seconds). The y-axis is plotted on a logarithmic scale. Each plot considers runtimes of
the three models: proposed (circles), mixture (asterisks) and multivariate (triangles). Since the multivariate model does not conclude within 40 hours for experiments with ℓ > 30, no runtimes are displayed for those experiments. For all three arrival rate experiments (i.e., for Figures 3-4(a), 3-4(b) and
3-4(c)) the following trends hold: the runtime of the multivariate model increases exponen-
tially with ℓ, that of the mixture model increases linearly, while for the proposed model,
the computational runtime appears constant. The average runtime over the 21 experiments
of the proposed model is 0.08 seconds, whereas that of the mixture model is 0.49 seconds.
The average runtime is thus reduced by a factor of approximately six, i.e., by close to one order of magnitude.
In summary, for all experiments, the proposed model performs comparably with the mixture and multivariate models in describing the dynamics of the link’s boundary conditions. The gain in computational runtime is significant and increases as the space capacity
increases.
3.4.2 Validation versus a microscopic traffic simulator
We benchmark the proposed model versus the microscopic traffic simulator Aimsun. We
consider a signal-controlled single-lane link, depicted in Figure 3-5, with parameters shown
in Table 3.2. There are two detectors on the link that provide vehicle count data every sec-
ond. We study the traffic dynamics between the entrance and the exit detectors of the link.
The detector denoted entrance detector (resp. exit detector) captures the link’s upstream
(resp. downstream) boundary conditions. The performance metrics that we consider are the expected inflow and the expected outflow, in vehicles per second, each computed as the vehicle count averaged over 2 seconds. We consider four experiments with different demand scenarios. The first three experiments have a constant arrival rate over the simulation period. The arrival rates are 0.1 veh/sec, 0.2 veh/sec and 0.3 veh/sec, respectively.
The fourth experiment has a time-varying arrival rate that mimics arrival patterns from an
upstream signal controlled intersection. More specifically, the arrival rate [veh/sec] within
every signal cycle (i.e., every 60 seconds) is given by:
\[ \lambda(k) = \begin{cases} 0.3 & \text{for } 0 \le k \le 40 \text{ sec} \\ 0 & \text{for } 40 < k \le 60 \text{ sec.} \end{cases} \tag{3.35} \]
For all experiments, we start with an empty link and use a warm-up period of 5 seconds, which is the free-flow travel time for the arrivals to first reach the entrance detector. The performance metrics are estimated from 200 simulation replications.
The part of the link between the two detectors is described using the proposed analytical model. The corresponding link parameters used in the proposed model are given in Table 3.3. All experiments start with an empty link and have a duration of 250 seconds.
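The values reported in Table 3.3 are mutually consistent under the usual link transmission relations, namely kfwd = L/(vδ), kbwd = L/(|w|δ) and ℓ = ρL. These relations are not restated in this section, so the sketch below should be read as an assumption that happens to reproduce the tabulated values; the function name is illustrative.

```python
def derived_link_parameters(length_km, v_km_s, w_km_s, rho_veh_km, delta_s):
    """Lags and space capacity implied by the assumed LTM relations."""
    k_fwd = round(length_km / (v_km_s * delta_s))       # forward lag, in time steps
    k_bwd = round(length_km / (abs(w_km_s) * delta_s))  # backward lag, in time steps
    capacity = round(rho_veh_km * length_km)            # space capacity, in vehicles
    return k_fwd, k_bwd, capacity


# Values of Table 3.3: L = 50 m, v = 0.01 km/sec, w = -0.005 km/sec,
# rho = 200 veh/km, delta = 1 sec.
print(derived_link_parameters(0.05, 0.01, -0.005, 200, 1))  # (5, 10, 10)
```

The same relations also reproduce the link lengths and lags listed for the experiments of the previous section (e.g., a 100 m link gives kfwd = 10, kbwd = 20 and ℓ = 20).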
[Figure 3-4: Comparison of the computational runtimes for the 21 experiments with time-varying demand. Panels: (a) runtime for time-varying arrival rate λ(k) = 0.2 → 0.1 veh/sec; (b) runtime for λ(k) = 0.3 → 0.1 veh/sec; (c) runtime for λ(k) = 0.3 → 0.2 veh/sec; (d) average runtime over all three time-varying arrival rates. Legend: Proposed, Mixture, Multivariate.]

Figure 3-5: Microscopic simulation model of a single-lane link

Parameter                      Value
Link length                    100 m
Entrance detector position     50 m
Exit detector position         100 m
Flow capacity                  2400 veh/h = 0.67 veh/sec
Maximum speed                  36 km/h = 10 m/sec
Downstream control type        fixed-time signal plan
Cycle time                     60 sec
Green phase                    48 sec
Simulation length              5 min = 300 sec
Replications number            200
Detection interval             1 sec
Arrival rate λ(k)              varies by experiment
Table 3.2: Link parameters used in the microscopic simulator
The time-varying service rate [veh/sec] of the link used in the proposed model within every
signal cycle (i.e., every 60 seconds) is given by:
\[ \mu(k) = \begin{cases} 0.67 & \text{for } 0 \le k \le 48 \text{ sec} \\ 0 & \text{for } 48 < k \le 60 \text{ sec.} \end{cases} \tag{3.36} \]
The arrival rates used in the proposed model for the four experiments are set exactly the same as in the simulator, i.e., λ(k) = 0.1 veh/sec, λ(k) = 0.2 veh/sec, λ(k) = 0.3 veh/sec
for the first three experiments, respectively, and for the fourth experiment, λ(k) within
every signal cycle is given by Equation (3.35).
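The periodic rates of Equations (3.35) and (3.36) can be sketched as follows. The helper name is hypothetical, and mapping an absolute time to its position within the cycle with a modulo is an assumption: it treats the boundary instant k = 60 as the start of the next cycle.

```python
CYCLE = 60  # signal cycle length in seconds


def periodic_rate(k, active_duration, rate, cycle=CYCLE):
    """Rate that equals `rate` during the first part of each cycle and 0 after."""
    phase = k % cycle  # position within the current cycle (assumed convention)
    return rate if phase <= active_duration else 0.0


def arrival_rate(k):
    """Time-varying arrival rate of the fourth experiment, Eq. (3.35)."""
    return periodic_rate(k, active_duration=40, rate=0.3)


def service_rate(k):
    """Time-varying service rate under the fixed-time signal plan, Eq. (3.36)."""
    return periodic_rate(k, active_duration=48, rate=0.67)


# Arrivals stop 8 seconds before the downstream green phase ends.
print(arrival_rate(45), service_rate(45))
```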
Parameter    Value
v            0.01 km/sec
w            −0.005 km/sec
ρ            200 veh/km
q            2400 veh/h = 0.67 veh/sec
δ            1 sec
ℓ            10
L            50 m
kfwd         5
kbwd         10
λ(k)         varies by experiment
µ(k)         varies within signal cycle given by Eq. (3.36)
Table 3.3: Link parameters used in the proposed model

Figures 3-6 to 3-9 compare estimates of the expected inflow (computed at the entrance detector) and of the expected outflow (computed at the exit detector) obtained from the
simulator to those obtained from the proposed analytical model. For each figure, the left
(resp. right) plot depicts the expected inflow (resp. outflow). Each plot displays, as dotted
lines, 95% confidence intervals of the simulated estimates. The proposed model results are
displayed as a solid line with circles.
Figure 3-6 considers the experiment with a constant arrival rate of 0.1 veh/sec. The
expected inflow (left plot) remains constant around 0.1 veh/sec for the whole experiment.
This indicates that the demand is low enough such that the queue-length does not exceed
the entrance detector and thus does not constrain the arrivals to it. The expected outflow (right
plot) shows the zero outflow during the red phases of the traffic signal. The outflow pattern
during the first traffic signal cycle differs from the others because the link starts off empty
and there is only a warm-up period of 5 seconds for the arrivals to first reach the entrance
detector. So when the first green phase starts, there is no vehicular queue. For all other
cycles, there is a queue of vehicles waiting to leave the link when the green phase starts.
The outflow pattern shows how the queue gradually dissipates (i.e., the outflow gradually
decreases).
Figures 3-7 and 3-8 display the results for the experiments with constant arrival rates
of 0.2 veh/sec and 0.3 veh/sec, respectively. The expected outflow patterns (right plots of
each figure) are similar to those of the right plot of Figure 3-6. The left plots of Figures 3-7
and 3-8 indicate that as demand increases, the expected inflow may temporarily decrease
due to the vehicular queue extending beyond the entrance detector. This phenomenon is
also captured by the proposed analytical model.
Figure 3-9 considers the experiment with time-varying demand. As is illustrated in
the left plot, the expected inflow oscillates between 0.3 and 0 veh/sec. In the right plot,
the expected outflow pattern is different from the previous three experiments after the first
signal cycle. This is because the demand pattern and signal pattern at the downstream end
of the link are not synchronized within a signal cycle of 60 seconds. In order to better
explain the pattern, we further plot a color bar at the top of the right plot. This color bar
consists of two colors: green, which represents the period of time during which the expected inflow is 0.3 veh/sec, shifted by the free-flow link travel time of 5 seconds, and red, which represents the period of time during which the expected inflow is 0 veh/sec, shifted by the same free-flow travel time. In other words, the green bar indicates approximately when vehicles arrive at the downstream queue, whereas the red bar indicates the period with no arrivals. The right plot also displays eight vertical grid lines, which represent the times at which the downstream signal phase changes. The phases start with green and then alternate. As before, the expected outflow is zero during
the red phases of the traffic signal. The outflow pattern during the first traffic signal cycle is
similar to those of the previous three experiments. Note that the green phase ends slightly later than the positive arrival period, which causes a brief decrease in outflow before the red phase. The next green phase starts during a no-arrival period (i.e., pure departure), so the expected outflow initially decreases as the queue gradually dissipates. The subsequent increase in expected outflow is due to the restart of arrivals to the downstream queue. This shows how the queue gradually increases and stabilizes.
This outflow pattern repeats afterwards and it is fully captured by the proposed analytical
model.
For all experiments with both constant and alternating arrival rates, the approximations
of both the expected inflow and of the expected outflow derived by the proposed model
almost always lie within the 95% confidence interval of the simulated estimates. Thus, the
proposed model accurately captures the dynamics of the boundary conditions.
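The comparison against the simulated confidence intervals can be sketched as below. How the 95% confidence intervals were constructed is not specified here, so the normal approximation (quantile 1.96) is an assumption, as are the function names and the toy data.

```python
from statistics import mean, stdev

Z_95 = 1.96  # normal-approximation quantile (assumption)


def confidence_interval(samples):
    """Approximate 95% confidence interval of the mean of the replications."""
    center = mean(samples)
    half_width = Z_95 * stdev(samples) / len(samples) ** 0.5
    return center - half_width, center + half_width


def coverage(analytical, replications_by_time):
    """Fraction of time steps whose analytical value lies inside the CI."""
    hits = 0
    for value, samples in zip(analytical, replications_by_time):
        low, high = confidence_interval(samples)
        hits += low <= value <= high
    return hits / len(analytical)


# Toy data: two time steps with four replications each (not thesis data).
simulated = [[0.0, 0.2, 0.1, 0.1], [0.3, 0.5, 0.4, 0.4]]
print(coverage([0.1, 0.9], simulated))  # first value inside its CI, second outside
```

A coverage close to 1 corresponds to the statement that the analytical approximations almost always lie within the simulated confidence intervals.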
[Figure 3-6: Comparison of the expected inflow and outflow for the experiment with arrival rate λ = 0.1 veh/sec. Left: expected inflow [veh/sec] versus time T; right: expected outflow [veh/sec] versus time T. Legend: Microscopic simulator 95% CI, Proposed.]

We now consider an experiment with platoon arrivals. This experiment serves to evaluate the ability of the proposed model, and in particular its use of a time-varying Poisson arrival process, to approximate the link’s boundary conditions under platoon arrival patterns. We consider two tandem single-lane links as shown in Figure 3-10. Both links are
controlled by a fixed-time signal plan with a 60 second cycle time. The cycle is composed
of a single green phase followed by a single red phase. The offset is set to zero, i.e., both
intersections start their green phase at the same time. The duration of the green phase for
the upstream (resp. downstream) intersection is 40 (resp. 48) seconds. The maximum
speed of both links is the same as in Table 3.2. Arrivals from the upstream link to the downstream link form platoons created by the upstream signalized intersection.
Just as in our previous experiments, the downstream link has the entrance and the exit
detectors, which are used to estimate expected inflows and outflows. We study the traffic
dynamics between the entrance and the exit detectors of the downstream link. Since node
modeling is not the focus of this chapter, we use a third detector to estimate the downstream
link’s arrival rate λ(k) of the analytical model. More specifically, λ(k) is estimated by the
flow observed at the detector labeled detector 1 in Figure 3-10 and shifted in time by the
free-flow travel time between detector 1 and the entrance detector (5 seconds). The setting
of detector 1 is the same as the entrance and exit detectors. The parameters of the proposed
analytical model are displayed in Table 3.4. We start with an empty link with a warm-up period of 10 seconds, the free-flow time for the arrivals to first reach the entrance detector, and the performance metrics are estimated from 200 simulation replications.

[Figure 3-7: Comparison of the expected inflow and outflow for the experiment with arrival rate λ = 0.2 veh/sec. Left: expected inflow [veh/sec] versus time T; right: expected outflow [veh/sec] versus time T. Legend: Microscopic simulator 95% CI, Proposed.]

[Figure 3-8: Comparison of the expected inflow and outflow for the experiment with arrival rate λ = 0.3 veh/sec. Left: expected inflow [veh/sec] versus time T; right: expected outflow [veh/sec] versus time T. Legend: Microscopic simulator 95% CI, Proposed.]

[Figure 3-9: Comparison of the expected inflow and outflow for the experiment with alternating arrival rate between 0.3 veh/sec and 0 veh/sec. Left: expected inflow [veh/sec] versus time T; right: expected outflow [veh/sec] versus time T. Legend: Microscopic simulator 95% CI, Proposed.]

Figure 3-10: Microscopic simulation model for platoon arrival experiments

Parameter    Value
v            0.01 km/sec
w            −0.005 km/sec
ρ            200 veh/km
q            2400 veh/h = 0.67 veh/sec
δ            1 sec
ℓ            10
L            50 m
kfwd         5
kbwd         10
λ(k)         estimates from detector 1
µ(k)         varies within signal cycle given by Eq. (3.36)
Table 3.4: Link parameters used in the proposed model

[Figure 3-11: Comparison of the expected inflow and outflow for the tandem link experiment. Left: expected inflow [veh/sec] versus time T; right: expected outflow [veh/sec] versus time T. Legend: Microscopic simulator 95% CI, Proposed.]
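The estimation of the downstream link's arrival rate from detector 1 can be sketched as a simple time shift of the observed flow. The function name is hypothetical, and padding the first time steps with zeros (no flow has yet reached the entrance detector) is an assumption.

```python
def shifted_arrival_rate(detector_counts, lag, interval=1.0):
    """Arrival rate series obtained by delaying the detector 1 flow by `lag` steps.

    detector_counts: vehicle counts per detection interval at detector 1
    lag: free-flow travel time to the entrance detector, in time steps
    interval: detection interval in seconds
    """
    rates = [count / interval for count in detector_counts]
    n = len(rates)
    # Assumed convention: zero arrivals until the first shifted observation.
    return [0.0] * min(lag, n) + rates[: max(n - lag, 0)]


# Toy counts (e.g., averaged over replications), shifted by 5 seconds.
counts = [1, 0, 2, 1, 0, 0, 1]
print(shifted_arrival_rate(counts, lag=5))
```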
The left (resp. right) plot of Figure 3-11 shows the expected inflow (resp. outflow). For
both plots, the analytical approximations of both the expected inflow and of the expected
outflow fall within the 95% confidence intervals of the simulated estimates. This shows
the ability of the proposed model to capture the impact of platoon arrivals on the links’
boundary conditions.
In summary, the comparisons to a microscopic traffic simulator indicate that the pro-
posed model can accurately approximate the links’ boundary conditions for realistic traffic
situations such as signalized links and platoon arrival patterns.
3.5 Optimization case study
In this section, we evaluate and benchmark the computational efficiency of the proposed
model. We use it to tackle a signal control optimization problem for the Swiss city of Lau-
sanne. The signal control problem considered is the same as that studied in Lu and Osorio
(2018) (cf. Chapter 2). Section 3.5.1 formulates the problem. Section 3.5.2 compares the
performance of the proposed model with that of three other approaches: (i) the mixture
model of Lu and Osorio (2018) (cf. Chapter 2), (ii) the deterministic intelligent link transmission model (ILTM) (Himpe et al.; 2016), and (iii) a widely used commercial signal control
software. The ILTM model is a network loading model that combines an efficient iterative
link transmission model with the node model of Tampère et al. (2011).
3.5.1 City-scale signal control
The Lausanne network consists of 603 links, 902 lanes and 231 intersections. The network
model of the stochastic microscopic simulator is shown in Figure 3-12. We consider a
fixed-time signal control problem in which we determine the signal plans of 17 intersections
distributed throughout the network (displayed as squares in Figure 3-12). The signal plans
of the 17 intersections are determined jointly. The problem is a fixed-time signal control
problem for the evening peak period 5:00-5:30pm. The decision variables are the green
splits of the signal phases of the 17 intersections. All other control variables (e.g., cycle
times and offsets) are fixed. This leads to a total of 99 endogenous signal phase variables (i.e., the decision vector is of dimension 99).

Figure 3-12: Lausanne network model

We use the following notation.
bd ratio of available cycle time to total cycle time for intersection d;
x vector of green splits;
x(j) green split of signal phase j;
xLB vector of lower bounds for green splits;
D set of intersection indices;
PD(d) set of endogenous signal phase indices of intersection d;
L set of all lanes;
T total number of one-minute time intervals;
N1 number of lanes, i.e., cardinality of L.
The problem is formulated as follows:
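\[ \min_{x} \; f(x) = \frac{1}{T N_1} \sum_{i \in L} \sum_{t=1}^{T} P(UQ_i(t; x) = \ell_i) \tag{3.37} \]

subject to

\[ \sum_{j \in P_D(d)} x(j) = b_d, \quad \forall d \in D \tag{3.38} \]

\[ x \ge x_{LB}. \tag{3.39} \]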
The decision vector, x, contains the green splits of the signal-controlled lanes. Constraint (3.38)
ensures that, for every intersection, the sum of the green times equals the available cycle
time. Constraint (3.39) sets lower bounds, which are set to 4 seconds in this case study.
P(UQi(t; x) = ℓi) denotes the spillback probability of lane i at integer time t under signal
plan x. Therefore, the objective function is the average (over space and over time) spill-
back probability. The goal is to find a signal plan that minimizes the spatial and temporal
occurrence of spillbacks.
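The structure of Problem (3.37)-(3.39) can be illustrated with a small sketch that evaluates the objective from a table of spillback probabilities and checks the constraints for a candidate green-split vector. The data structures, names and toy values are assumptions; the probabilities themselves would come from the link model and are not computed here.

```python
def average_spillback_probability(spillback):
    """Objective (3.37): spillback[i][t] holds P(UQ_i(t; x) = l_i)."""
    n_lanes = len(spillback)
    n_intervals = len(spillback[0])
    return sum(sum(row) for row in spillback) / (n_lanes * n_intervals)


def is_feasible(x, phases_by_intersection, available_ratio, lower_bound, tol=1e-9):
    """Check constraints (3.38) and (3.39) for a green-split vector x."""
    # Constraint (3.39): every green split respects its lower bound.
    if any(x[j] < lower_bound[j] - tol for j in range(len(x))):
        return False
    # Constraint (3.38): green splits of each intersection sum to b_d.
    return all(
        abs(sum(x[j] for j in phases) - available_ratio[d]) <= tol
        for d, phases in phases_by_intersection.items()
    )


# Toy instance with two intersections and five signal phases.
phases = {0: [0, 1], 1: [2, 3, 4]}
b = {0: 0.8, 1: 0.9}
x = [0.5, 0.3, 0.3, 0.3, 0.3]
print(is_feasible(x, phases, b, lower_bound=[0.05] * 5))
print(average_spillback_probability([[0.0, 0.2], [0.4, 0.6]]))
```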
To evaluate the performance of the signal plans proposed by the various methods, we
use a calibrated microscopic model of the Lausanne network, which embeds realistic link
and node models. Thus, when evaluating the performance of a signal plan, the simula-
tor accounts for how route choices and link demand can vary with signal plan changes.
The deterministic ILTM model is also a complete network loading model that embeds the
well-established node model of Tampère et al. (2011). However, the probabilistic analytical models (i.e., the proposed model and the mixture model) are link models. To use them for network optimization, we assume that the demand for each link is exogenous. The models describe the within-link dynamics but not the across-link (i.e., node) dynamics. Implementation details, including the computation of the exogenous link demand, are
given in Section 4.1 of Lu and Osorio (2018) (cf. Chapter 2). The above problem is solved
with the proposed model and with the mixture model using the interior-point algorithm of
the fmincon routine of Matlab (MATLAB; 2016).
Since the ILTM is a deterministic model, it does not approximate the spillback probabilities that define the objective function (3.37). Thus, we use two alternative objective functions for ILTM:
1. The average, over time and over the network lanes, proportion of the lane that is occupied by vehicles:

\[ f(x) = \frac{1}{T N_1} \sum_{i \in L} \sum_{t=1}^{T} \left( cvn_i^{up}(t) - cvn_i^{down}(t) \right) / \ell_i, \tag{3.40} \]

where cvn_i^{up}(t) (resp. cvn_i^{down}(t)) is the cumulative number of vehicles that have passed the upstream (resp. downstream) end of link i up until time t, which is the output metric produced by the deterministic ILTM model. This objective function reflects the average saturation degree of each link in the network over time, which is between 0 and 1. A larger objective function value indicates that the network is more likely to be congested and thus that spillback is more likely to occur within the network.

2. The average, over time and over the network lanes, link travel time:

\[ f(x) = \frac{1}{T N_1} \sum_{i \in L} \sum_{t=1}^{T} TT_i(t), \tag{3.41} \]

where TT_i(t) is the travel time of link i at time t, which is an output metric produced by the deterministic ILTM model. This objective function reflects the average trip travel time through the network. This is a common choice of objective function for deterministic models. A larger objective function value reflects a longer travel time through the network, which suggests a higher chance of spillback.
A natural deterministic approximation of (3.37) would be the average, over all lanes in the
network, proportion of time the lane is full. However, this function is not continuously
differentiable. Thus, we could not use it with the derivative-based interior-point algorithm
used with the other models.
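The difference between the occupancy objective (3.40) and the indicator-based alternative mentioned above can be sketched as follows. The cumulative vehicle numbers are toy values, and the function names are illustrative.

```python
def occupancies(cvn_up, cvn_down, capacity):
    """Per-lane, per-interval occupancy (cvn_up - cvn_down) / l_i."""
    return [
        [(up - down) / capacity[i] for up, down in zip(cvn_up[i], cvn_down[i])]
        for i in range(len(capacity))
    ]


def average_occupancy(cvn_up, cvn_down, capacity):
    """Objective (3.40): occupancy averaged over lanes and time intervals."""
    values = occupancies(cvn_up, cvn_down, capacity)
    return sum(sum(row) for row in values) / (len(values) * len(values[0]))


def proportion_of_time_full(cvn_up, cvn_down, capacity):
    """Indicator-based alternative: a step function of the cumulative counts,
    hence not continuously differentiable."""
    values = occupancies(cvn_up, cvn_down, capacity)
    full = sum(sum(v >= 1.0 for v in row) for row in values)
    return full / (len(values) * len(values[0]))


# Toy cumulative vehicle numbers for two lanes and three time intervals.
cvn_up = [[2, 4, 6], [1, 2, 3]]
cvn_down = [[0, 0, 2], [1, 1, 1]]
capacity = [4, 2]
print(average_occupancy(cvn_up, cvn_down, capacity))
print(proportion_of_time_full(cvn_up, cvn_down, capacity))
```

The occupancy varies smoothly with the cumulative counts, whereas the indicator only changes when a lane becomes exactly full, which is why it does not suit a derivative-based algorithm.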
For all methods, the maximum runtime is set to 24 hours. If the algorithm does not
converge to a local optimal solution within the time limit, the algorithm is terminated and
the current iterate is used as the final solution.
Initial point    1        2        3        4
Mixture          145.0    146.1    144.4    149.4
ILTM             10.6     10.7     10.7     10.8
Proposed         1.3      1.3      1.3      1.3
Table 3.5: Average runtime (in min) per iteration of the signal control optimization algorithm
3.5.2 Numerical results
Problem (3.37)-(3.39) for the proposed and mixture model, and Problems (3.40) and (3.41)
with constraints (3.38) and (3.39) for the ILTM model are solved considering four different
initial points. The initial points are drawn uniformly at random from the feasible region
(Equations (3.38)-(3.39)) using the sampling code of Stafford (2006).
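For illustration, uniform sampling from a feasible region of the form (3.38)-(3.39) can be sketched per intersection with normalized exponential variates (a flat Dirichlet draw), scaled to the slack left after the lower bounds. This is a stand-in technique and not the sampling code of Stafford (2006); the function name and values are hypothetical.

```python
import random


def sample_green_splits(available_ratio, lower_bounds, rng=random):
    """Draw green splits uniformly from {x >= lower_bounds, sum(x) = available_ratio}."""
    slack = available_ratio - sum(lower_bounds)
    if slack < 0:
        raise ValueError("lower bounds exceed the available cycle ratio")
    # Normalized Exp(1) variates are uniformly distributed on the unit simplex.
    weights = [rng.expovariate(1.0) for _ in lower_bounds]
    total = sum(weights)
    return [lb + slack * w / total for lb, w in zip(lower_bounds, weights)]


# One intersection with three phases, b_d = 0.9 and lower bounds of 0.1.
rng = random.Random(0)
splits = sample_green_splits(0.9, [0.1, 0.1, 0.1], rng)
print(splits, sum(splits))
```

Since the constraints couple only the phases of a single intersection, a full initial point would be obtained by repeating the draw independently for each intersection.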
For the proposed model and the ILTM model, all four optimization runs (i.e., one for
each initial point) conclude within the time limit. In fact, they all finish within 2.5 hours
for the proposed model and within 5.5 hours for the ILTM model. For the mixture model,
the algorithms do not converge within the time limit. Table 3.5 compares the average com-
putation time (in minutes) per algorithmic iteration. Each column of Table 3.5 corresponds
to a different initial point. Since we consider 2 different objective functions for ILTM, Table
3.5 displays the average computation time, averaged over the two optimization problems.
Table 3.5 indicates the computation time of all models does not vary significantly across
initial points. For the proposed model, the average runtime per iteration is on the order of 1 minute. For the ILTM model, it is on the order of 10 minutes, and for the mixture model it is on the order of 2.4 hours (i.e., 146 minutes). Thus, the proposed model improves the runtime by 1 order of magnitude compared to ILTM and by 2 orders of magnitude compared to the
mixture model. Note however, that, unlike the probabilistic models (proposed or mixture)
the ILTM model is a full network model that embeds an endogenous node model. This
added realism comes, of course, with an increase in the computational runtime.
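The orders of magnitude quoted above follow directly from the values of Table 3.5, as the short computation below shows. Only the table entries are taken from the text; the variable names are illustrative.

```python
from math import log10
from statistics import mean

# Average runtime (in minutes) per iteration, from Table 3.5.
runtime_min = {
    "Mixture": [145.0, 146.1, 144.4, 149.4],
    "ILTM": [10.6, 10.7, 10.7, 10.8],
    "Proposed": [1.3, 1.3, 1.3, 1.3],
}

# Mean over the four initial points for each model.
average = {model: mean(values) for model, values in runtime_min.items()}
# Runtime ratio of each benchmark relative to the proposed model.
speedup = {m: average[m] / average["Proposed"] for m in ("ILTM", "Mixture")}
# Base-10 logarithm of the ratio, i.e., the gain in orders of magnitude.
orders = {m: log10(ratio) for m, ratio in speedup.items()}

for model in speedup:
    print(model, round(speedup[model], 1), round(orders[model], 2))
```

The ratio is roughly 8 for ILTM and roughly 112 for the mixture model, i.e., about one and two orders of magnitude, respectively.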
We now compare the performance of the derived signal plans. To evaluate the perfor-
mance of a given signal plan, we use a microscopic traffic simulation model of Lausanne
(Dumont and Bert; 2006), which is calibrated for the evening peak period demand. It is im-
plemented in the Aimsun software (TSS; 2014). For a given signal plan, we embed it within
the microscopic simulation software, and evaluate 50 simulation replications. Each repli-
cation consists of a warm-up period of 15 minutes followed by a simulation period of 30
minutes. For each simulation replication, we estimate the objective function (Eq. (3.37)),
which is the average (over lanes) proportion of time (over 30 minutes) a lane is full. For
each signal plan, we construct a cumulative distribution function (cdf) of these 50 objective
function observations.
Each plot of Figure 3-13 considers a different initial signal plan and plots five cdf
curves: one for the initial signal plan (dashed line), one for the solution derived by the
proposed model (solid line), one for the solution derived by the mixture model (dot-dashed
line) and two for the solutions derived by the deterministic ILTM model with different
objective functions (circle-dotted lines). The x-axis displays the objective function realiza-
tions (i.e., average (over all the lanes in the network) proportion of time a lane is full). The
y-axis displays the proportion of the 50 simulation replications that have objective function
realizations smaller than x. Thus, the more a cdf curve is shifted to the left, the better the
performance of the corresponding signal plan.
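The cdf curves described above are empirical distribution functions of the simulated objective function realizations. A minimal sketch is given below; the function names and sample values are hypothetical, and the usual "less than or equal" convention is assumed.

```python
def empirical_cdf(observations):
    """Sorted observations paired with their cumulative proportions."""
    ordered = sorted(observations)
    n = len(ordered)
    return [(value, (rank + 1) / n) for rank, value in enumerate(ordered)]


def cdf_at(observations, x):
    """Proportion of observations that are less than or equal to x."""
    return sum(obs <= x for obs in observations) / len(observations)


# Toy objective function realizations for two signal plans (not thesis data).
initial_plan = [0.05, 0.06, 0.04, 0.05]
derived_plan = [0.02, 0.03, 0.05, 0.04]
# A curve shifted to the left has a higher cdf value at the same x.
print(cdf_at(initial_plan, 0.035), cdf_at(derived_plan, 0.035))
```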
For all 4 plots (Fig. 3-13(a)-3-13(d)), the cdf curves of the derived signal plans, from
the mixture model, from the proposed model and from the ILTM model, are to the left of
the initial signal plan. Thus, all models identify signal plans that outperform the initial
signal plans. For all 4 plots, the cdf curves of the derived signal plans from the stochastic
models (i.e., the proposed and mixture models) are to the left of those of the signal plans derived from the ILTM model. Thus, the stochastic models can identify signal plans that outperform the signal plans derived by the ILTM model (with both objective functions).
Figure 3-14 compares the average proportion of time a lane is full for the ILTM signal
plans (with objective (3.40), i.e., ILTM-1) and the proposed plan. Each plot of Figure 3-14
considers a different initial point and each dot in the plot represents a lane in the network.
The x-axis (resp. y-axis) displays the proportion of time the lane is full averaged over all
50 simulation replications with the ILTM (resp. proposed) signal plan. A reference line of
y = x is displayed in each plot. Note that for both ILTM and the proposed model, even
after optimization, there remain lanes with an average that is greater than 0.4, which means
that there remain highly congested lanes. Approximately 20% of the lanes have improved performance under the proposed signal plan, while 10% have worse performance. The mean difference in performance per lane between the ILTM signal plans and the proposed plan for each plot is 0.006, 0.007, 0.004 and 0.009, respectively. Thus, on average, the proposed method yields signal plans that mitigate the occurrence of spillbacks. Note that this occurs even though the proposed model, unlike ILTM, is merely a link model that lacks a node model.

[Figure 3-13: Cumulative distribution functions of the average, over all lanes in the network, proportion of time a lane is full considering different initial signal plans. Panels (a)-(d): initial points 1 to 4. x-axis: average proportion of time a lane is full; y-axis: cumulative distribution function F(x). Legend: Initial, Proposed, Mixture, ILTM-1, ILTM-2.]
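The per-lane comparison between the ILTM and the proposed signal plans can be summarized as sketched below. The sign convention (ILTM minus proposed, so that a positive mean favors the proposed plan) is an assumption, as are the function name and the toy values.

```python
def compare_lane_performance(iltm, proposed, tol=0.0):
    """Share of lanes improved or worsened under the proposed plan, and the
    mean per-lane difference (ILTM minus proposed) in proportion of time full."""
    n = len(iltm)
    improved = sum(p < i - tol for i, p in zip(iltm, proposed)) / n
    worse = sum(p > i + tol for i, p in zip(iltm, proposed)) / n
    mean_difference = sum(i - p for i, p in zip(iltm, proposed)) / n
    return improved, worse, mean_difference


# Toy per-lane proportions of time full for four lanes (not thesis data).
iltm_plan = [0.5, 0.2, 0.1, 0.0]
proposed_plan = [0.3, 0.2, 0.2, 0.0]
print(compare_lane_performance(iltm_plan, proposed_plan))
```

Lanes on the reference line y = x count as neither improved nor worse, which is why the two shares need not sum to one.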
Figure 3-15 compares the performance of the signal plans in terms of the average (over
all lanes in the network) proportion of the lane that is occupied by vehicles, which is equiv-
alent to the first objective function used by the ILTM optimization. This figure has a similar
layout as Figure 3-13. The figure compares the cdf curves of the different signal plans. As
before, the more a cdf curve is shifted to the left, the better its performance (i.e., the higher
the proportion of simulation replications, out of the 50, that have low average proportion of
the lane that is occupied by vehicles). All four plots in Figure 3-15 indicate that all derived
signal plans outperform their corresponding initial signal plans. The derived signal plans
from both probabilistic models (mixture and proposed) have similar performance and they
outperform the signal plans derived by the deterministic ILTM model with both objective
functions.
Figure 3-16 compares the performance of the signal plans in terms of the average trip
travel time, which is equivalent to the second objective function used by the ILTM opti-
mization. For all initial points, all models yield signal plans that outperform the initial
points. The signal plans derived by the proposed model and by the mixture model have sim-
ilar performance and they outperform the signal plans derived by the deterministic ILTM
model with both objective functions.
[Figure 3-14: Average proportion of time lane is full for ILTM signal plans and proposed signal plans. Panels (a)-(d): initial points 1 to 4. x-axis: average proportion of time lane is full for ILTM signal plan; y-axis: average proportion of time lane is full for proposed signal plan.]

[Figure 3-15: Cumulative distribution functions of the average proportion of the lane that is occupied by vehicles considering different initial signal plans. Panels (a)-(d): initial points 1 to 4. x-axis: average proportion of the lane that is occupied by vehicles; y-axis: cumulative distribution function F(x). Legend: Initial, Proposed, Mixture, ILTM-1, ILTM-2.]

[Figure 3-16: Cumulative distribution functions of the average trip travel time considering different initial signal plans. Panels (a)-(d): initial points 1 to 4. x-axis: average trip travel time [min]; y-axis: cumulative distribution function F(x). Legend: Initial, Proposed, Mixture, ILTM-1, ILTM-2.]

We compare the performance of the proposed signal plans with that of a signal plan derived by the widely used commercial signal control software Synchro (Trafficware; 2011). The Synchro software is based on a deterministic macroscopic traffic model; it does not solve the same optimization problem (3.37)-(3.39). For details on how the Synchro signal plan is derived, we refer the reader to Section 5.3 of Osorio and Chong (2015). Figures 3-17, 3-18 and 3-19 display, respectively, the three performance metrics of interest: average proportion of time a lane is full, average proportion of the lane that is occupied by vehicles and average trip travel time normalized by free-flow travel time. Each figure contains 9
cdf curves: four black dashed lines for the four initial points, four solid thin lines for the
four solutions derived by the proposed model and one solid thick line for the signal plan
proposed by Synchro. For all figures, the left-most curves are the ones corresponding to
the signal plans derived by the proposed model. In other words, for all three performance
metrics, the signal plans derived by the proposed model outperform both the initial signal
plans and the signal plan derived by Synchro. For each figure, the performance of the four
initial points varies significantly, while that of the proposed signal plans is similar. In other
words, the proposed method is robust to the quality of the initial points.
In summary, compared to the mixture model, the proposed model improves the average runtime by two orders of magnitude and yields signal plans with either improved or similar performance. Compared to the deterministic ILTM model, both the proposed and the mixture model are able to find signal plans with improved performance. This case study illustrates the scalability and computational efficiency of the proposed model. The proposed model is suitable for large-scale network analysis and optimization.
Figure 3-17: Cumulative distribution functions of the average proportion of time a lane is full. Curves: Synchro signal plan, Initial signal plan, Proposed signal plan.
Figure 3-18: Cumulative distribution functions of the average proportion of the lane that is occupied by vehicles. Curves: Synchro signal plan, Initial signal plan, Proposed signal plan.
Figure 3-19: Cumulative distribution functions of the average trip travel time [min]. Curves: Synchro signal plan, Initial signal plan, Proposed signal plan.
3.6 Conclusions
This chapter formulates an analytical stochastic link model that is scalable and suitable for large-scale network optimization. The main idea of the proposed model is to describe the link’s boundary conditions with only two key probabilities instead of tracking the full marginal, or full joint, distributions. More specifically, while the dimension of the state space of the models of Osorio and Flötteröd (2015) and of Lu and Osorio (2018) (cf. Chapter 2) scales cubically and linearly, respectively, with the link’s space capacity, the proposed model has a constant dimension of 2. Thus, it scales independently of link attributes such as the link’s space capacity. This makes it suitable for large-scale network analysis and optimization. The model is validated versus stochastic simulation results from a simulation-based implementation of a stochastic link transmission model. The model’s accuracy is comparable to that of Osorio and Flötteröd (2015) and of Lu and Osorio (2018) (cf. Chapter 2), while being more computationally efficient. The model is also validated versus a microscopic traffic simulator, and it can accurately approximate the link’s boundary conditions for realistic traffic situations such as platoon-like arrivals. The proposed model is then used to address a signal control problem for the city of Lausanne (Switzerland). The derived solutions are benchmarked against those derived by the mixture model of Lu and Osorio (2018) (cf. Chapter 2) and the ILTM model of Himpe et al. (2016). The signal plans derived from the proposed model and from the mixture model have similar performance across various performance metrics. Both outperform the initial plans, the signal plans derived by the ILTM model, and a signal plan proposed by a widely used commercial software. Compared to the model of Lu and Osorio (2018) (cf. Chapter 2), the proposed model reduces computational runtime by two orders of magnitude.
Future work focuses on the formulation of scalable stochastic network models. The goal is to be able to recover the joint distribution of a path or a network. First, there is a need to formulate scalable probabilistic node models that are consistent with their deterministic counterparts. Osorio et al. (2011) includes a two-link probabilistic node model that provides the dependencies of the links’ boundary conditions across a node. It yields the joint distribution of the downstream boundary conditions of the upstream link and the upstream boundary conditions of the downstream link. The extension of this formulation to nodes with multiple incoming and outgoing links is part of ongoing work. Second, scalable and computationally efficient network model formulations are required. For a network of n links, directly coupling the proposed link model with the node model of Osorio et al. (2011) would yield a complexity of O(2^n), which is not scalable. Possible techniques to achieve scalability include network decomposition (Flötteröd and Osorio; 2017) and aggregation-disaggregation (Osorio and Yamani; 2017; Osorio and Wang; 2017).
Chapter 4
Adaptive Partitioning Strategy for High-Dimensional Discrete Simulation-based Optimization Problems
In this chapter, we introduce a technique to enhance the computational efficiency of SO
algorithms for high-dimensional discrete SO problems. The technique is based on an inno-
vative adaptive partitioning strategy. It is integrated with the Empirical Stochastic Branch-
and-Bound (ESB&B) framework proposed by Xu and Nelson (2013). This combination
leads to a general-purpose discrete SO algorithm that is both globally convergent and has
good small sample (finite-time) performance.
4.1 Introduction
Simulation-based optimization (SO), also referred to as optimization via simulation (OvS), is the optimization of the performance of a stochastic system, where the objective function and/or constraints can only be estimated through stochastic simulations. Detailed overviews of the SO literature are provided by Fu (2002); Amaran et al. (2016); Bhosekar and Ierapetritou (2018). Based on the structure of the feasible region, Hong and Nelson (2009) divide SO problems into three categories: continuous SO, ranking and selection, and discrete SO. For a continuous SO problem, the decision variables are continuous, and methods developed to tackle such problems include stochastic approximation methods (e.g., Robbins and Monro (1951); Bhatnagar et al. (2011)), direct search methods (e.g., Andradóttir (2006)), and surrogate-based methods (e.g., Angün et al. (2009); Regis and Shoemaker (2013); Wang and Ierapetritou (2018)). For ranking and selection problems, all feasible solutions can be simulated at least once and the (near) best solution is chosen with a given confidence level (e.g., Kim and Nelson (2006)). A discrete SO problem is one with discrete decision variables in which the number of feasible solutions is usually too large for each one to be simulated. The existing algorithms mainly focus on asymptotic global convergence to the optimal solution(s) (e.g., Xu and Nelson (2013); Tsai and Fu (2014)). Methods that aim at identifying solutions with good performance within small sampling budgets include the Heuristic Constrained Genetic Algorithm (HCGA) (Tsai and Fu; 2014) and Industrial Strength COMPASS (ISC) (Xu et al.; 2010). Nevertheless, developing discrete SO algorithms that can efficiently tackle high-dimensional problems remains a challenge.
Xu and Nelson (2013) propose an Empirical Stochastic Branch-and-Bound (ESB&B) framework for discrete SO problems based on the nested partitions method of Shi and Ólafsson (2000) and stochastic branch-and-bound (Norkin et al.; 1998). It takes advantage of the partitioning structure of the stochastic branch-and-bound method and empirically estimates the bounds based on sampled solutions. The ESB&B algorithm also uses improvement bounds to represent the potential of each subregion and to guide the sampling strategy in the next iteration. The ESB&B algorithm is globally convergent, i.e., given an infinite simulation budget, it converges asymptotically to the global optimum. As mentioned in Xu and Nelson (2013), there are many valid partitioning strategies; however, a good partitioning strategy can usually improve the algorithm’s efficiency. Most partitioning strategies developed in the literature are generic and heuristic, for example, dividing the feasible region equally into k subregions along a randomly chosen dimension (Xu and Nelson; 2013).
In this chapter, we propose an innovative adaptive partitioning strategy. It is embedded within the ESB&B framework (Xu and Nelson; 2013) and forms a globally convergent algorithm for discrete SO problems. The proposed partitioning strategy iteratively divides the feasible region such that previously sampled solutions with similar performance are located in the same subregion. It is an adaptive sample-based partitioning strategy that enhances the small sample (finite-time) performance of the ESB&B framework of Xu and Nelson (2013). The proposed partitioning strategy can also exploit problem-specific structure known a priori, such as the clustering effect in car-sharing fleet assignment problems, to further enhance the algorithm’s efficiency without significant modifications.
This chapter is organized as follows. Section 4.2 reviews the ESB&B algorithm of Xu and Nelson (2013). Section 4.3 presents the proposed adaptive partitioning strategy and the solution method of Dunn (2018) used to compute it. Section 4.4 validates the proposed partitioning strategy on one synthetic and one real-world discrete SO problem. Section 4.5 concludes this chapter.
4.2 ESB&B framework
Let us first define the discrete SO problem. The goal is to find $x$ that solves
\[ \max_{x \in \mathcal{X}} \; E[Y(x)] \tag{4.1} \]
where $x = [x_1, \dots, x_p]$ are the discrete decision variables, and $\mathcal{X}$ is a convex feasible region that contains a finite but large number of feasible solutions, which can be represented by constraints of the form:
\[ l_i \le x_i \le u_i, \quad i = 1, \dots, p \tag{4.2} \]
\[ g_j(x) \le 0, \quad j = 1, \dots, q \tag{4.3} \]
\[ l_i, x_i, u_i \in \mathbb{Z}, \quad i = 1, \dots, p. \tag{4.4} \]
The objective function $E[Y(x)]$ is the expected performance at point $x$, which can only be estimated by generating observations of $Y(x)$ via simulation.
To solve discrete SO problems of the above form, Xu and Nelson (2013) propose the Empirical Stochastic Branch-and-Bound (ESB&B) framework, which converges asymptotically to the globally optimal solution(s) (i.e., under an unlimited sampling budget and simulation effort). The ESB&B algorithm is detailed in Algorithm 5. The algorithm terminates when the total sampling budget is used up. Upon termination, the final solution is the one with the maximum cumulative sample average.
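The selection rule above (return the visited solution with the maximum cumulative sample average) can be illustrated with a small bookkeeping sketch. The class name `SampleStore`, the toy simulator `simulate`, and its noise level are hypothetical and introduced only for illustration; they are not part of the ESB&B implementation of Xu and Nelson (2013).

```python
import random
from collections import defaultdict


class SampleStore:
    """Keeps every simulated observation Y_s(x) for each visited solution x."""

    def __init__(self):
        # Maps a solution (as a tuple) to the list of its observations.
        self.obs = defaultdict(list)

    def add(self, x, values):
        self.obs[tuple(x)].extend(values)

    def mean(self, x):
        # Cumulative sample mean over all observations collected at x.
        ys = self.obs[tuple(x)]
        return sum(ys) / len(ys)

    def best(self):
        # Final solution for a maximization problem:
        # the visited solution with the largest cumulative sample mean.
        return max(self.obs, key=self.mean)


def simulate(x, rng):
    # Hypothetical noisy simulator (assumption): the true mean is
    # -(x1 - 2)^2 - (x2 + 1)^2, observed with Gaussian noise.
    return -(x[0] - 2) ** 2 - (x[1] + 1) ** 2 + rng.gauss(0.0, 0.1)


rng = random.Random(0)
store = SampleStore()
for x in [(0, 0), (2, -1), (3, 1)]:
    store.add(x, [simulate(x, rng) for _ in range(10)])
best_solution = store.best()
```

Keeping all observations per solution, rather than only the latest batch, is what makes the average cumulative when a solution is sampled again in a later iteration.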
Algorithm 5 ESB&B framework
Step 0. Initialization: set iteration counter $k = 0$, initial partition $\mathcal{P}_0 = \{\mathcal{X}\}$, best subregion $R_0 = \mathcal{X}$
Step 1. Partitioning:
  If the best subregion $R_k$ is a singleton:
    (a) set $\mathcal{P}'_k = \mathcal{P}_k$
  else:
    (a) construct a partition of the best subregion, $\mathcal{P}(R_k)$
    (b) define the new full partition by $\mathcal{P}'_k = (\mathcal{P}_k \setminus \{R_k\}) \cup \mathcal{P}(R_k)$
    (c) denote by $\mathcal{X}^P$ the elements of $\mathcal{P}'_k$
Step 2. Sampling and bounding:
  2.1 Solution sampling:
    i. for each subregion $\mathcal{X}^P \in \mathcal{P}(R_k)$, randomly sample $\nu_R$ solutions
    ii. if $k > 0$, for each subregion $\mathcal{X}^P \in \mathcal{P}_k \setminus \{R_k\}$, sample $\theta(\mathcal{X}^P)$ solutions, where $\theta(\mathcal{X}^P)$ is computed in Step 2.3 of iteration $k-1$ based on information in $\Phi_{k-1}$
    iii. aggregate all of the sampled solutions into set $S_k$, and set $\Phi_k = \Phi_{k-1} \cup S_k$
  2.2 Bound estimation:
    i. for each $x \in S_k$, simulate $\Delta n_F$ replications if $x \notin \Phi_{k-1}$, simulate $\Delta n_A$ additional replications if $x \in \Phi_{k-1}$
    ii. for $\mathcal{X}^P \in \mathcal{P}'_k$, calculate the estimated upper bound $\eta_{k+1}(\mathcal{X}^P)$
  2.3 Sample allocation: compute the number of solutions to be sampled, $\theta(\mathcal{X}^P)$, for all $\mathcal{X}^P \in \mathcal{P}'_k$ for iteration $k+1$ based on information in $\Phi_k$
Step 3. Updating partition and best subregion:
  (a) update the best subregion $R_{k+1} = \arg\max_{\mathcal{X}^P \in \mathcal{P}'_k} \{\eta_{k+1}(\mathcal{X}^P)\}$
  (b) partition $\mathcal{P}_{k+1} = \mathcal{P}'_k$
  (c) set $k = k + 1$, go to Step 1.

There are two important steps within each iteration: (i) sampling and bounding (Step 2) and (ii) partitioning (Step 1). For details regarding the sampling and bounding step, we refer to Section 3 of Xu and Nelson (2013). We discuss the partitioning step in detail.
Partitioning divides the estimated best subregion into a set of smaller subregions that are disjoint and nonempty. In the ESB&B implementation of Xu and Nelson (2013), the embedded generic partitioning strategy chooses (either deterministically or randomly) a variable $x_i$ and divides the best subregion along this dimension into $\omega$ disjoint subregions. A detailed description of this partitioning strategy is given in the online Appendix A of Xu and Nelson (2013).
However, there exist many valid partitions for a given subregion. A good partitioning strategy can help locate the most promising subregion more efficiently, and hence allocate the sampling budget more efficiently. Let us take the following illustrative example to discuss the pros and cons of the current generic partitioning strategy used in ESB&B and potential directions of improvement.
Illustrative example:
Consider the maximization of a 2-dimensional deterministic function $f(x_1, x_2)$, for which neither a closed-form expression nor any structural information is known. The ground truth of $f(x_1, x_2)$ in the current best subregion is shown in Figure 4-1. However, the values of the function can only be known through sampling. Figure 4-2 shows the samples that have already been evaluated in the current best subregion.
We need to further partition this region and sample from each new subregion. Figure 4-3(a) shows the generic partitioning strategy used in the ESB&B framework, which divides this region into equal parts along the chosen $x_1$ dimension. This partitioning strategy requires the user to predefine the number of subregions (denoted $\omega$), and it does not use any information from the solutions that have already been sampled. After partitioning, each new subregion gets an equal amount of sampling budget. Based on the performance of the sampled solutions, a new best subregion will be selected and further explored. In this case, the middle and right subregions both have a chance of being selected as the next best subregion, as they both contain peaks of the underlying function. If the right subregion is chosen as the next best subregion, it can take a while for the algorithm to return to explore the middle subregion, where the true global maximum is located.
On the other hand, another potential partition of the current best subregion is given in Figure 4-3(b). Note that this partition divides the subregion into three parts: two that contain the underlying function’s basins and one that contains all the peaks. This partition is better in the sense that it successfully identifies the patterns of the underlying function. It is obtained by grouping sampled solutions with similar performance in the same subregion.
In Section 4.3, we propose an adaptive partitioning strategy which is defined by the
previously sampled solutions in the subregion.
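The generic equal-width strategy discussed in this section can be sketched as follows. This is a minimal sketch that assumes the subregion is a pure integer box (only bound constraints of the form (4.2)); the function name `equal_partition` and the box representation are illustrative and are not taken from Xu and Nelson (2013).

```python
def equal_partition(box, dim, omega):
    """Split an integer box along `dim` into at most `omega` disjoint, nonempty boxes.

    box: list of (lower, upper) inclusive integer bounds, one pair per variable.
    """
    lo, hi = box[dim]
    n_points = hi - lo + 1
    parts = min(omega, n_points)           # cannot create empty subregions
    base, extra = divmod(n_points, parts)  # the first `extra` pieces get one more point
    boxes, start = [], lo
    for j in range(parts):
        width = base + (1 if j < extra else 0)
        sub = list(box)
        sub[dim] = (start, start + width - 1)
        boxes.append(sub)
        start += width
    return boxes


# Split the square [-3, 3] x [-3, 3] into three slabs along the first variable.
slabs = equal_partition([(-3, 3), (-3, 3)], dim=0, omega=3)
```

Note that the cut positions depend only on the bounds of the box and on the user-chosen number of pieces; no sampled performance enters the computation, which is the limitation the illustrative example points out.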
4.3 Adaptive partitioning strategy
The main idea underlying the proposed partitioning strategy is to find a partition of the current best subregion $R_k$ in which the sampled solutions with similar performance are grouped in the same subregion. In this section, we introduce two families of adaptive partitioning strategies: (i) parallel partitions in Section 4.3.1, which apply to any problem, and (ii) hyperplane partitions in Section 4.3.2, which apply when splitting features in the form of linear combinations of decision variables can be obtained from prior knowledge (e.g., the clustering effect in the car-sharing fleet assignment problem).
Figure 4-1: The ground truth values of f(x1, x2) in the current best subregion. (Axes: x1 and x2.)
Figure 4-2: The sampled solutions of f(x1, x2) in the current best subregion. (Axes: x1 and x2.)
Figure 4-3: Different partitions of the current best subregion. (a) A naive partition of the current best subregion. (b) A better partition of the current best subregion. (Axes: x1 and x2.)
4.3.1 Parallel partition
A parallel partition is one that only contains cuts of the form $a^T x < b$ or $a^T x \ge b$, where $a_i \in \{0, 1\}$ and $\sum_{i=1}^{p} a_i = 1$. The subregions resulting from a parallel partition are intersections of a hyperbox with the best subregion. This family of partitions is favorable in the sense that it maps directly to the bounds of each decision variable; hence, it is easy to interpret and to draw feasible uniform random samples from. A parallel partitioning strategy can be applied to any discrete SO problem; thus, it is a generic partitioning strategy.
We formulate the search for such a parallel partition as a minimization problem:
\[ \min_{\mathcal{P}(R_k)} \; \sum_{\mathcal{X}^P \in \mathcal{P}(R_k)} \; \sum_{x \in \mathcal{X}^P \cap \Phi_k} \left( \bar{Y}(x) - \bar{Y} \right)^2 \tag{4.5} \]
\[ \text{s.t.} \quad |\mathcal{X}^P \cap \Phi_k| \ge N_{\min}, \quad \forall\, \mathcal{X}^P \in \mathcal{P}(R_k), \tag{4.6} \]
\[ |\mathcal{P}(R_k)| \le d, \tag{4.7} \]
where $\mathcal{P}(R_k)$ is a valid parallel partition of the current best subregion $R_k$, and $\bar{Y}(x)$ is the cumulative sample mean of all $n(x)$ observations at solution $x$:
\[ \bar{Y}(x) = \frac{1}{n(x)} \sum_{s=1}^{n(x)} Y_s(x). \tag{4.8} \]
$\bar{Y}$ is the average of the estimated mean performances of all sampled points in subregion $\mathcal{X}^P$:
\[ \bar{Y} = \frac{1}{|\mathcal{X}^P \cap \Phi_k|} \sum_{x \in \mathcal{X}^P \cap \Phi_k} \bar{Y}(x). \tag{4.9} \]
Therefore, the objective function is the sum, over all subregions, of the squared deviations from the subregion mean. Constraint (4.6) requires each subregion of the partition to contain at least $N_{\min}$ previously sampled solutions. Without such a constraint, problem (4.5) is trivially minimized by a partition in which each subregion contains exactly one sample, with an objective function value of zero. An alternative is to limit the number of subregions, denoted $d$ (Constraint (4.7)). The partition derived by solving problem (4.5) with constraints on the partition structure (e.g., $N_{\min}$ and/or $d$), denoted $\mathcal{P}^*(R_k)$, is one that optimally groups sampled points with similar performance. Next, we discuss the solution algorithm for the proposed minimization problem (4.5)-(4.7).
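To make the objective (4.5) and the constraint (4.6) concrete, the sketch below evaluates every single axis-parallel cut of a set of sampled solutions and keeps the one with the smallest within-subregion sum of squared deviations. This greedy one-cut search is in the spirit of CART and is used here only as a simple stand-in; it is not the optimal tree method discussed next, and the names `sse` and `best_parallel_cut` are hypothetical.

```python
def sse(values):
    """Sum of squared deviations from the mean: one subregion's term in (4.5)."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)


def best_parallel_cut(samples, n_min):
    """Return (objective, dim, b) for the best single cut x[dim] < b, or None.

    samples: list of (x, ybar) pairs, where x is an integer tuple and
             ybar is the cumulative sample mean observed at x.
    n_min:   minimum number of sampled solutions per subregion (4.6).
    """
    best = None
    p = len(samples[0][0])
    for dim in range(p):
        # Candidate thresholds: every observed coordinate value except the smallest.
        for b in sorted({x[dim] for x, _ in samples})[1:]:
            left = [y for x, y in samples if x[dim] < b]
            right = [y for x, y in samples if x[dim] >= b]
            if len(left) < n_min or len(right) < n_min:
                continue  # violates constraint (4.6)
            obj = sse(left) + sse(right)
            if best is None or obj < best[0]:
                best = (obj, dim, b)
    return best


# Four sampled solutions whose performance depends mainly on the first variable.
samples = [((0, 0), 1.0), ((0, 3), 1.2), ((3, 0), 5.0), ((3, 3), 5.2)]
cut = best_parallel_cut(samples, n_min=2)
```

In this toy data the cut on the first variable separates the low-performing from the high-performing samples, so it has a far smaller objective value than the cut on the second variable.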
The proposed partition problem (4.5)-(4.7) is similar to the underlying optimization problem of the decision tree model (e.g., Breiman et al. (1984); Dunn (2018)) with decision variables as input variables (features) and performances as dependent variables. The decision tree model is mainly used for prediction purposes. The underlying optimization problem of training a decision tree is to find both a tree-structured split of the training samples based on input variables and labels for classification (or constant values for regression) of leaf nodes so that the prediction error is minimized, given some constraints on the tree structure (e.g., minimum leaf size and/or maximum tree depth) and restrictions against overfitting. The decision tree method CART by Breiman et al. (1984) is a top-down greedy algorithm that does not guarantee a globally optimal decision tree. A recent work of Dunn (2018) proposes an algorithm that solves the decision tree problem at least to local optimality and, by starting from multiple initial points, aims at global optimality. The decision tree problem is formulated as a mixed-integer programming problem, which can be written as follows:
\[ \min_{T} \; R(T) + \alpha |T| \tag{4.10} \]
\[ \text{s.t.} \quad N(\ell) \ge N_{\min}, \quad \forall\, \ell \in \text{leaves}(T), \tag{4.11} \]
\[ \text{depth}(T) \le D, \tag{4.12} \]
where $R(T)$ represents the prediction error of the tree $T$ on the training data, and $|T|$ denotes the number of branch nodes in $T$, which represents the tree complexity. $N(\ell)$ represents the number of samples in leaf node $\ell$ of the tree $T$; hence, Constraint (4.11) restricts each leaf node to contain at least $N_{\min}$ samples. One can also restrict the maximum depth of the tree (Constraint (4.12)), which sets an upper bound on the number of leaf nodes (i.e., $2^D$). In other words, the goal is to find a decision tree model that balances prediction accuracy and model complexity. The optimal decision tree method has been shown to be robust to noisy training data (both feature and label noise).
Note that a given tree $T$ creates a unique partition of the feature space and hence a unique division of the samples. For regression tasks, the constant prediction value of a leaf node that minimizes the mean squared error is the mean performance of the samples in that leaf node. Thus, by setting $\alpha = 0$ and choosing the mean squared error as the error metric (i.e., $R(\cdot)$), we retrieve an optimization problem that is equivalent to (4.5). In other words, the proposed partitioning problem (4.5) is equivalent to the optimal parallel regression tree optimization (Dunn; 2018, Chapter 4) with complexity penalty coefficient $\alpha = 0$. The optimization problem (4.10)-(4.12) can be solved efficiently with the local search algorithm of Dunn (2018), coupled with multiple starting points, for sample sizes up to hundreds of thousands. This is more than enough for discrete SO problems, in which samples are usually time-consuming to simulate and hence limited in number.
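The equivalence stated above rests on a short calculation, sketched here for completeness. The symbols $c_\ell$ (the constant predicted in leaf $\ell$), $S_\ell$ (the set of sampled solutions that fall in leaf $\ell$) and $N$ (the total number of training samples) are introduced only for this illustration; $\bar{Y}(x)$ denotes the cumulative sample mean of the observations at solution $x$.

```latex
% Within one leaf, the squared error is a convex quadratic in the constant c_l:
\frac{\partial}{\partial c_\ell} \sum_{x \in S_\ell} \bigl( \bar{Y}(x) - c_\ell \bigr)^2
  = -2 \sum_{x \in S_\ell} \bigl( \bar{Y}(x) - c_\ell \bigr) = 0
\quad \Longleftrightarrow \quad
c_\ell^{*} = \frac{1}{|S_\ell|} \sum_{x \in S_\ell} \bar{Y}(x).

% Substituting the optimal constants and setting alpha = 0 in (4.10):
R(T) + 0 \cdot |T|
  = \frac{1}{N} \sum_{\ell \in \mathrm{leaves}(T)} \; \sum_{x \in S_\ell}
    \bigl( \bar{Y}(x) - c_\ell^{*} \bigr)^2 .
```

Since $1/N$ is a positive constant that does not depend on the tree, minimizing this expression over trees selects the same partition as minimizing the sum of squared deviations in (4.5), with each leaf playing the role of one subregion.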
4.3.2 Hyperplane partition
A hyperplane partition is one that contains cuts of the form $a^T x < b$ or $a^T x \ge b$, where $a_i \in \mathbb{R}$. The subregions resulting from a hyperplane partition are polyhedra. The MIX-D sampling algorithm currently in use can efficiently sample uniformly from such subregions. Hence, although more complicated than parallel partitions, this family of partitions is also practical, especially when problem-specific structures of this form are known a priori.
Assume that variables $y_j = a_j^T x$ for $j = 1, \dots, q$ are known in advance to be potentially important splitting factors in addition to the decision variables $x_i$ for $i = 1, \dots, p$. Let us take the car-sharing fleet assignment problem as an example. The car-sharing service provider wants to find an assignment of the fleet to stations across the network that maximizes its profit. Geographically nearby stations often share demand, since customers are often willing to walk short distances to an available vehicle if their target station runs out of vehicles. Thus, the total number of vehicles in a cluster of nearby stations potentially has a greater influence on the profit than the number of vehicles at each individual station. Therefore, a partition based on the total number of vehicles in clusters of nearby stations can divide the feasible region more effectively and lead to a more effective search for subregions with higher profits. This can further improve the finite-time performance and the efficiency of the algorithm. In this example, $y_j$ is the total number of vehicles assigned to the cluster of nearby stations $C_j$, i.e., $y_j = \sum_{i \in C_j} x_i$, where $x_i$ is the number of vehicles assigned to station $i$.
The modification required to incorporate such hyperplane cuts is minor. We simply treat the $y_j$ as candidates that the partitioning algorithm can split on, together with all the decision variables $x_i$. The proposed partitioning strategy, obtained by solving the optimization problem (4.5)-(4.7), is robust to correlated splitting features such as $y_j$ and $x_i$, since it naturally includes a splitting feature selection process. Another benefit of the proposed adaptive partitioning strategy is that each constructed subregion contains at least one previously sampled solution, which can be used directly to initiate the MIX-D sampling scheme.
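The feature augmentation described above can be sketched in a few lines. The function names `add_cluster_features` and `cut_coefficients`, and the representation of a cluster as a tuple of station indices, are assumptions made for illustration only.

```python
def add_cluster_features(x, clusters):
    """Append y_j = sum of x_i over cluster C_j to the decision vector x.

    x:        tuple with the number of vehicles assigned to each station.
    clusters: list of tuples of station indices (hypothetical clustering).
    """
    return tuple(x) + tuple(sum(x[i] for i in c) for c in clusters)


def cut_coefficients(feature, p, clusters):
    """Coefficient vector a such that a cut on `feature` reads a^T x < b.

    Features 0..p-1 are the decision variables; features p, p+1, ... are
    the cluster totals, in the order given by `clusters`.
    """
    a = [0] * p
    if feature < p:
        a[feature] = 1           # parallel cut on a single decision variable
    else:
        for i in clusters[feature - p]:
            a[i] = 1             # hyperplane cut on a cluster total
    return a


# Four stations grouped into two clusters of nearby stations.
clusters = [(0, 1), (2, 3)]
augmented = add_cluster_features((2, 0, 1, 4), clusters)
```

A partitioning routine can then split on any coordinate of the augmented vector; a cut on a cluster-total coordinate corresponds to a hyperplane in the original decision space, as the coefficient vector returned by `cut_coefficients` shows.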
4.3.3 Adaptive partitioning ESB&B algorithm
The proposed adaptive partitioning ESB&B algorithm is given by Algorithm 6. The parameters related to the partitioning strategy that need to be specified are the maximum tree depth $D$ and/or the minimum number of points in each subregion, $N_{\min}$. Both are related to the number of subregions into which the current best subregion can be divided. Given the predetermined parameters $D$ and/or $N_{\min}$, the algorithm chooses the number of subregions optimally. Unlike other generic partitioning strategies, the partition derived from the proposed adaptive partitioning strategy ensures that each new subregion contains at least one sampled point, or at least $N_{\min}$ sampled points if this parameter is specified. This benefits the MIX-D sampling scheme in the sense that these interior sampled points can be used directly to initiate the random walk. The algorithm usually terminates when the simulation budget is exhausted. We select the final solution $x^*$ as the one with the maximum cumulative sample average.
Algorithm 6 Adaptive partitioning ESB&B algorithm
Step 0. Initialization:
  (a) set iteration counter $k = 0$, initial partition $\mathcal{P}_0 = \{\mathcal{X}\}$, best subregion $R_0 = \mathcal{X}$
  (b) sample uniformly at random the initial $n_0$ solutions in the feasible region $\mathcal{X}$, simulate $\Delta n_F$ replications of each sample, and record them in set $\Phi_0$
  (c) set training set $\Psi_0 = \Phi_0$
Step 1. Partitioning:
  If the best subregion $R_k$ is a singleton:
    (a) set $\mathcal{P}'_k = \mathcal{P}_k$
  else:
    (a) construct a partition of the best subregion, $\mathcal{P}(R_k)$, using the proposed adaptive partitioning strategy with training set $\Psi_k$, predetermined parameters $D$ and/or $N_{\min}$, and/or splitting features $y_j$ other than the decision variables $x_i$
    (b) define the new full partition by $\mathcal{P}'_k = (\mathcal{P}_k \setminus \{R_k\}) \cup \mathcal{P}(R_k)$
    (c) denote by $\mathcal{X}^P$ the elements of $\mathcal{P}'_k$
Step 2. Sampling and bounding:
  2.1 Solution sampling:
    i. for each subregion $\mathcal{X}^P \in \mathcal{P}(R_k)$, randomly sample $\nu_R$ solutions
    ii. if $k > 0$, for each subregion $\mathcal{X}^P \in \mathcal{P}_k \setminus \{R_k\}$, sample $\theta(\mathcal{X}^P)$ solutions, where $\theta(\mathcal{X}^P)$ is computed in Step 2.3 of iteration $k-1$ based on information in $\Phi_{k-1}$
    iii. aggregate all of the sampled solutions into set $S_k$, and set $\Phi_k = \Phi_{k-1} \cup S_k$
  2.2 Bound estimation:
    i. for each $x \in S_k$, simulate $\Delta n_F$ replications if $x \notin \Phi_{k-1}$, simulate $\Delta n_A$ additional replications if $x \in \Phi_{k-1}$
    ii. for all $\mathcal{X}^P \in \mathcal{P}'_k$, calculate estimates $\eta_{k+1}(\mathcal{X}^P)$
  2.3 Sample allocation: compute the number of solutions to be sampled, $\theta(\mathcal{X}^P)$, for all $\mathcal{X}^P \in \mathcal{P}'_k$ for iteration $k+1$ based on information in $\Phi_k$
Step 3. Updating partition and best subregion:
  (a) update the best subregion $R_{k+1} = \arg\max_{\mathcal{X}^P \in \mathcal{P}'_k} \{\eta_{k+1}(\mathcal{X}^P)\}$
  (b) partition $\mathcal{P}_{k+1} = \mathcal{P}'_k$
  (c) training set $\Psi_{k+1} = \{x : x \in R_{k+1} \cap \Phi_k\}$
  (d) set $k = k + 1$, go to Step 1.

4.4 Numerical examples
In this section, we consider two sets of numerical examples to illustrate the proposed adaptive partitioning ESB&B algorithm. The first example, the Griewank function, has many local minima, which makes it a challenging test function. This low-dimensional example illustrates how the proposed method can escape tricky local minima by adaptively partitioning the feasible region with problem structure inferred from previously sampled points. The second example is the real-world car-sharing case study of Zhou et al. (2019), for which the optimal solutions are not known; it is used to explore the performance of the proposed algorithm on high-dimensional discrete SO problems. In this example, we also demonstrate the use of hyperplane partitions resulting from additional splitting variables $y_j$, which are derived from prior knowledge of the geographical locations of the stations. For both examples, we benchmark the proposed algorithm against the original ESB&B algorithm with the generic partitioning strategy.
4.4.1 The Griewank function
We consider the Griewank function, which is commonly used as a test case for optimization algorithms (L. Salemi et al.; 2019). This function has a single global minimum and many local minima, which makes it a challenging test for optimization algorithms. Figure 4-4 displays the contour plot of the two-dimensional Griewank function on the domain [−5, 5] × [−5, 5]. The global minimum of the function is at the origin (0, 0) with response value 0, and there are four local minima near the four corners of the domain, each with response value 0.0086.
We first consider the minimization of the Griewank function over the feasible region [−5, 5] × [−5, 5], where the globally optimal solution (0, 0) is at the center of the feasible region. To make it a discrete SO problem, the feasible region is discretized into a 101 × 101 lattice, and the function value at each solution is given by the Griewank function plus normally distributed noise with mean zero and variance σ². In this numerical example, we take σ = 0.01, which is comparable to the difference between the local minima (0.0086) and the global minimum (0). The number of replications is set to ∆nF = 10 for each newly encountered sample and to ∆nA = 2 for each previously encountered sample. For both algorithms, we start from the same uniformly randomly sampled pool of size 10 (asterisks in Figure 4-4). At each iteration, the total sampling budget for subregions other than the current best (denoted νO) is 5, and that for the current best subregion (denoted |P(Rk)|νR) is 10; the budget limit is 40 iterations. Thus, at the end of each run, roughly 5% of all feasible solutions are sampled and evaluated.
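The test problem described above can be reproduced with the following minimal sketch. It uses the standard definition of the Griewank function; the helper names `lattice_point` and `simulate`, and the indexing of the lattice, are assumptions made for illustration and are not taken from the thesis.

```python
import math
import random


def griewank(x):
    """Standard Griewank function; global minimum 0 at the origin."""
    quad = sum(v * v for v in x) / 4000.0
    prod = math.prod(math.cos(v / math.sqrt(i + 1)) for i, v in enumerate(x))
    return 1.0 + quad - prod


def lattice_point(i, j, lo=-5.0, hi=5.0, n=101):
    """Map lattice indices (i, j) in 0..n-1 to coordinates in [lo, hi]^2."""
    step = (hi - lo) / (n - 1)
    return (lo + i * step, lo + j * step)


def simulate(x, rng, sigma=0.01):
    """One noisy observation: Griewank value plus N(0, sigma^2) noise."""
    return griewank(x) + rng.gauss(0.0, sigma)


rng = random.Random(1)
center = lattice_point(50, 50)   # the global minimum (0, 0)
corner = lattice_point(81, 94)   # lattice point (3.1, 4.4) near one corner
# Ten replications at a newly sampled solution (delta n_F = 10).
replications = [simulate(corner, rng) for _ in range(10)]
```

On this lattice, the value at (3.1, 4.4) is about 0.0086, which matches the local minimum value quoted in the text. With σ = 0.01, a single observation therefore cannot reliably separate such a local minimum from the global one, which is why repeated replications are needed.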
Figure 4-4: The contour plot of two-dimensional Griewank function on [−5, 5] × [−5, 5]. (Axes: x1 and x2; the global minimum is marked.)
For the generic partitioning scheme used in the original ESB&B algorithm, the current best subregion Rk is divided equally into ω = 2 subregions along the longest dimension of Rk. For the proposed adaptive partitioning scheme, the maximum tree depth D is set to 2, i.e., Rk can be divided into at most 4 subregions, and the minimum number of sampled points grouped in one subregion is set to 2, i.e., Nmin = 2. We run each algorithm 50 times.
Figure 4-5 shows the current best estimate of the objective value across iterations of
five randomly selected runs of each algorithm (i.e., five sample paths of each algorithm).
The solid black lines display the results for the proposed algorithm, and the dashed red
lines for the original ESB&B algorithm. As the iteration advances, the current best esti-
mate of the objective value has a general decreasing trend for both algorithms, although
there are temporary increases due to stochasticity. Figure 4-6 (resp. Figure 4-7) plots a
zoomed version of the current best estimate of the objective value with 95% confidence
intervals, so as to clearly show the performance of the ESB&B algorithm (resp. proposed
algorithm) close to the global minimum function value at zero (solid blue line). Note that
at each iteration, the current best solution is simulated for at least 10 replications, since each
non-encountered sample will be simulated for ∆nF = 10 replications, and if it is sampled
again, an additional ∆nA = 2 replications will be simulated. Only one run of the
ESB&B algorithm ends with an estimated objective value that is not statistically different
from the true global minimum value 0 at the 95% confidence level. On the other
hand, four runs of the proposed algorithm end with estimated objective values that are
statistically indistinguishable from the true global minimum after iteration 15 at the 95%
confidence level.
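The check of whether an estimate is statistically distinguishable from the true minimum can be sketched as a confidence-interval test on the replications of the current best solution. This is a minimal sketch, assuming a normal-approximation interval with z = 1.96; with only about 10 replications a Student-t quantile would be more accurate, and the exact interval used in the experiments is not specified here. The function names are hypothetical.

```python
import math
import statistics


def mean_confidence_interval(samples, z=1.96):
    """Return (mean, lower, upper) of a normal-approximation CI for the mean."""
    n = len(samples)
    mean = statistics.fmean(samples)
    half_width = z * statistics.stdev(samples) / math.sqrt(n)
    return mean, mean - half_width, mean + half_width


def indistinguishable_from(samples, target=0.0, z=1.96):
    """True if `target` lies inside the confidence interval of the sample mean."""
    _, lower, upper = mean_confidence_interval(samples, z)
    return lower <= target <= upper
```

A run would be counted as reaching the global minimum value when `indistinguishable_from(replications, 0.0)` holds for the replications of its final solution.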
Figure 4-8 shows the distance between the current best solution and the global minimum
solution across iterations. The solid black lines display the results for the proposed algo-
rithm and dashed red lines for the ESB&B algorithm. Four runs of the proposed algorithm
end up with solutions close to the true global minimum at (0, 0), whereas only one run of
the ESB&B algorithm ends up close to (0, 0). Based on these five experiment runs, the
proposed method tends to find solutions that are closer to the global minimum.
Figure 4-5: Objective function estimate of the current iterate across iterations. [Plot: "2D Griewank Function"; x-axis: iteration; y-axis: current estimate of objective value; legend: ESBB, Proposed, global minimum objective = 0.]
Figures 4-9 and 4-11 show the experiment results for the first run of the original ESB&B
algorithm, which is a typical run of ESB&B getting trapped at a local minimum solution. In
both figures, the final partition of the feasible region and the contour plot of the Griewank
Figure 4-6: Objective function estimate of the current iterate with 95% confidence interval across iterations of the ESB&B algorithm (zoomed-in results). [Plot: "2D Griewank Function"; x-axis: iteration; y-axis: current estimate of objective value; legend: ESBB algorithm, 95% confidence interval, true global objective value.]
function are displayed. Figure 4-9 plots, with blue arrows, the path of the current best solution
across iterations in the feasible domain, and Figure
4-10 displays the zoomed-in results. The asterisk points plotted are the initial sampled set
Φ0. The generic partitioning strategy used in the ESB&B algorithm missed the subregion
that contains the global minimum in the first iteration and left it near the boundary of a
subregion. The algorithm gets trapped at the lower right local minimum. Figure 4-11
displays the sampling budget allocation in the feasible domain at the end of iteration 40,
in which each black dot represents a sampled solution. As discussed, much less sampling
budget is allocated to subregions that have been discarded earlier.
Figures 4-12 and 4-14 show the experiment results for the first run of the proposed algorithm.
As before, the final partition of the feasible region and the contour plot of the
Griewank function are displayed in both figures. Figure 4-12 plots the path of the current
best solution across iterations in the feasible domain of the proposed algorithm, and Figure
4-13 displays the zoomed-in results. Figure 4-14 displays the sampling budget allocation in
Figure 4-7: Objective function estimate of the current iterate with 95% confidence interval across iterations of the proposed algorithm (zoomed-in results). [Plot: "2D Griewank Function"; x-axis: iteration; y-axis: current estimate of objective value; legend: proposed algorithm, 95% confidence interval, true global objective value.]
Figure 4-8: Distance between current best solution and the global minimum solution across iterations. [Plot: "2D Griewank Function"; x-axis: iteration; y-axis: distance to global optimal solution; legend: ESBB, Proposed.]
Figure 4-9: The path of best solution at current iterate across iterations in the feasible domain of the original ESB&B algorithm. [Plot: "ESBB algorithm"; axes x1 and x2.]
Figure 4-10: The path of best solution at current iterate across iterations in the feasible domain of the original ESB&B algorithm (zoomed-in results). [Plot: "ESBB algorithm (zoomed in)"; axes x1 and x2.]
Figure 4-11: Allocation of sampling budget in the feasible domain of the original ESB&B algorithm. [Plot: "ESBB algorithm"; axes x1 and x2.]
the feasible domain of the proposed algorithm. These figures have layouts similar to those of Figures
4-9 and 4-11, respectively. Unlike the generic partitioning strategy, the proposed
adaptive partitioning strategy initially divides the feasible region along x2, so that the
globally optimal solution lies in the interior of one of the subregions. At iteration 24,
the algorithm escapes the lower right local minimum and starts to explore the middle subregion.
The final partition of the feasible region successfully identifies the underlying function’s peaks
and basins. As expected, more sampling budget is allocated to the subregions that
contain the basins than to those containing the peaks.
Figure 4-12: The path of best solution at current iterate across iterations in the feasible domain of the proposed algorithm. [Plot: "Adaptive partitioning ESBB algorithm"; axes x1 and x2.]
Figure 4-13: The path of best solution at current iterate across iterations in the feasible domain of the proposed algorithm (zoomed-in results). [Plot: "Adaptive partitioning ESBB algorithm"; axes x1 and x2.]
Figure 4-14: Allocation of sampling budget in the feasible domain of the proposed algorithm. [Plot: "Adaptive partitioning ESBB algorithm"; axes x1 and x2.]
An average sample-path performance (over the 50 runs) for each algorithm is constructed;
this performance metric is used to compare against other algorithms, e.g., Xu et al.
(2010, 2013); Xu and Nelson (2013). Figure 4-15 plots the objective value of the estimated
optimal solution across iterations (averaged over 50 algorithm runs) for the ESB&B algorithm
(red line with shaded 95% confidence boundary) and the proposed algorithm (black
line with shaded 95% confidence boundary). Figure 4-16 zooms in on the average performance
close to the global minimum function value at zero. Since this is a minimization
problem, the lower the curve, the better the algorithm performance in terms of estimated
objective value. Note that the curve for the proposed algorithm is lower than that for the
ESB&B algorithm, except for iterations 10 to 25, where the two curves overlap. This
indicates that: (i) at an early stage, the proposed algorithm finds improved solutions
faster; (ii) during iterations 10 to 25, both algorithms reach solutions with similar
performance, around locally optimal objective values; (iii) as the iteration number increases,
the proposed algorithm continues to find improved solutions, whereas
the ESB&B algorithm mostly gets stuck around the locally optimal solutions. At the termination
of the algorithms (i.e., iteration 40), the proposed algorithm ends with an average
estimated objective value that is statistically lower than that of the ESB&B algorithm.
Additionally, the proposed algorithm finds the true globally optimal solution (i.e., (0, 0)) in
27 out of 50 runs, whereas the ESB&B algorithm finds it in 3 out of 50 runs. The proposed
algorithm finds a final solution whose mean performance is statistically indistinguishable from the
true optimal value (i.e., 0) in 41 out of 50 runs at significance level 0.05, whereas the
ESB&B algorithm does so in 4 out of 50 runs.
Next, we consider the minimization of the Griewank function over the feasible region [−1, 9] ×
[−1, 9], where the globally optimal solution (0, 0) is no longer at the center of the feasible
region. In this example, the generic partitioning strategy of the ESB&B algorithm does not
produce an initial cut close to the globally optimal solution. Figure 4-17 displays the contour
plot of the two-dimensional Griewank function on the domain [−1, 9] × [−1, 9] together with a
uniformly randomly generated initial sample set (displayed as asterisks) for both algorithms.
The algorithm parameters are set the same as in the previous experiment. We run each
algorithm 50 times.
Figure 4-18 plots the current best estimate of the objective value across iterations for five
randomly selected runs of each algorithm (i.e., five sample paths of each algorithm). As
before, we observe that the current best estimate of the objective value exhibits a decreasing
trend for both algorithms. Figure 4-19 (resp. Figure 4-20) plots the current best estimate
of the objective value with its 95% confidence interval for the ESB&B algorithm (resp. proposed
algorithm) and zooms in on the performance close to the global minimum function value
at zero (solid blue line). Only one run of the ESB&B algorithm ends with an
estimated objective value that is not statistically different
from the true global minimum value 0 at the 95% confidence level. Meanwhile, all five runs of
the proposed algorithm end with estimated objective values that are statistically
indistinguishable from the true global minimum after iteration 25 at the 95% confidence level.
Figure 4-21 shows the distance between the current best solution and the global minimum
solution across iterations. This figure has a layout similar to that of Figure 4-8. All runs of the
proposed algorithm end with solutions close to the true global minimum at (0, 0), whereas
only one run of the ESB&B algorithm ends close to (0, 0). Based on these five
experiment runs, the proposed method tends to find solutions that are closer to the global
minimum.
Figure 4-22 plots the average estimate of the objective value across iterations for the
ESB&B algorithm (red line with shaded 95% confidence boundary) and the proposed al-
gorithm (black line with shaded 95% confidence boundary). Figure 4-23 zooms in on the
Figure 4-15: Objective function estimate of the current iterate across iterations averaged over 50 algorithm runs.
Figure 4-16: Objective function estimate of the current iterate across iterations averaged over 50 algorithm runs (zoomed-in results).
Figure 4-17: The contour plot of the two-dimensional Griewank function on [−1, 9] × [−1, 9]. [Plot: "2D Griewank Function"; axes x1 and x2; global minimum marked.]
Figure 4-18: Objective function estimate of the current iterate across iterations. [Plot: "2D Griewank Function"; x-axis: iteration; y-axis: current estimate of objective value; legend: ESBB, Proposed, global minimum objective = 0.]
Figure 4-19: Objective function estimate of the current iterate with 95% confidence interval across iterations of the ESB&B algorithm (zoomed-in results). [Plot: "2D Griewank Function"; x-axis: iteration; y-axis: current estimate of objective value; legend: ESBB algorithm, 95% confidence interval, true global objective value.]
average performance close to the global minimum function value at zero. As before, the
curve for the proposed algorithm is lower than that of the ESB&B algorithm. In other
words, the proposed algorithm performs better than the ESB&B algorithm on average.
The initial decrease in the estimated objective value is faster for the proposed algorithm
than for the ESB&B algorithm, which indicates that it finds
improved solutions more quickly. At termination (i.e., iteration 40), the proposed algorithm ends
with an average estimated objective value that is statistically lower than that of the ESB&B
algorithm. Additionally, 35 runs of the proposed algorithm end at the true globally optimal
solution (0, 0), whereas only 14 runs of the ESB&B algorithm do so.
Moreover, 46 runs of the proposed algorithm end with a solution whose mean
performance is statistically indistinguishable from the true optimal value 0 at significance level
0.05, whereas only 14 runs of the ESB&B algorithm achieve this.
Figure 4-20: Objective function estimate of the current iterate with 95% confidence interval across iterations of the proposed algorithm (zoomed-in results). [Plot: "2D Griewank Function"; x-axis: iteration; y-axis: current estimate of objective value; legend: proposed algorithm, 95% confidence interval, true global objective value.]
Figure 4-21: Distance between current best solution and the global minimum solution across iterations. [Plot: "2D Griewank Function"; x-axis: iteration; y-axis: distance to global optimal solution; legend: ESBB, Proposed.]
Figure 4-22: Objective function estimate of the current iterate across iterations averaged over 50 algorithm runs.
Figure 4-23: Objective function estimate of the current iterate across iterations averaged over 50 algorithm runs (zoomed-in results).
In summary, the proposed adaptive partitioning ESB&B algorithm is a globally convergent
algorithm that also has improved finite-time (limited sample budget) performance:
it finds improved solutions faster, attains better estimated final objective values,
and has a higher likelihood of finding the true globally optimal solution.
4.4.2 The car-sharing fleet assignment problem
In this section, we apply the proposed algorithm to a real-world car-sharing fleet assignment
problem. This case study problem is adopted from Zhou et al. (2019). It considers a two-
way car-sharing system from the perspective of the service operator. Essentially, it consists
of finding an assignment of a fleet of vehicles across the network of stations that maximizes
the expected profit over a given finite time horizon, denoted as the planning period. Instead
of using a simplified description of demand and of demand-supply interactions, this case
study relies on the demand simulator of Fields et al. (2017), which was developed from rich,
high-resolution reservation data from Zipcar’s Boston market. It is formulated as a discrete
SO problem in Zhou et al. (2019), in which the objective function value (i.e., the expected
profit of a given fleet assignment) can only be obtained via simulation.
The discrete SO problem is formulated as follows:
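\begin{align}
\max_{x} \quad & E[R(x; q_1)] - \sum_{i \in I} c_i x_i \tag{4.13} \\
\text{s.t.} \quad & \sum_{i \in I} x_i \le N \tag{4.14} \\
& x_i \le N_i, \quad \forall\, i \in I \tag{4.15} \\
& x_i \in \mathbb{Z}_{\ge 0}, \quad \forall\, i \in I \tag{4.16}
\end{align}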
where x = [xi] are the decision variables and xi is the number of vehicles assigned to station
i. R(x;q1) is the random variable representing the revenue with fleet assignment x and
exogenous simulation parameter vector (e.g., reservation pricing) q1, ci is the exogenous
cost, over the planning period, of a parking space at station i, N is the total number of
vehicles to assign, Ni is the capacity of station i, and I is the set of all stations. The
objective function (4.13) represents the expected profit for a given fleet assignment x as the
difference between the expected revenue E[R(x;q1)] and the costs ∑_{i∈I} ci xi. The estimates of
the expected revenue E[R(x;q1)] can only be obtained via simulations. Constraint (4.14)
bounds the total number of vehicles assigned across all the stations. Constraint (4.15)
bounds the number of vehicles assigned to each individual station i by the capacity of the
station. Finally, the number of vehicles assigned to each station must be a nonnegative
integer (Constraint (4.16)).
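The formulation (4.13)-(4.16) can be illustrated with a short sketch that checks feasibility and forms a sample-average estimate of the objective. This is a minimal sketch under stated assumptions: `toy_revenue` is a hypothetical stand-in for the demand simulator (it is not the simulator of Fields et al. (2017)), and all function names are illustrative.

```python
import random
import statistics


def is_feasible(x, capacity, fleet_size):
    """Check constraints (4.14)-(4.16) for an assignment x (one entry per station)."""
    return (
        all(isinstance(xi, int) and xi >= 0 for xi in x)  # (4.16)
        and all(xi <= ni for xi, ni in zip(x, capacity))  # (4.15)
        and sum(x) <= fleet_size                          # (4.14)
    )


def estimate_profit(x, cost, simulate_revenue, replications, rng):
    """Sample-average estimate of objective (4.13) over independent replications."""
    revenues = [simulate_revenue(x, rng) for _ in range(replications)]
    return statistics.fmean(revenues) - sum(c * xi for c, xi in zip(cost, x))


def toy_revenue(x, rng):
    """Hypothetical simulator stand-in: concave revenue plus Gaussian noise."""
    return sum(10.0 * xi - 0.2 * xi * xi for xi in x) + rng.gauss(0.0, 1.0)
```

Because the expected revenue is only available through simulation, the quality of the estimate depends on the number of replications assigned to each sampled solution.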
As discussed in Section 4.3.2, in the car-sharing fleet assignment problem, the total
number of vehicles assigned to a cluster of nearby stations (i.e., yj = ∑_{i∈Cj} xi) may form
a more efficient cut of the feasible region, since nearby stations usually share demand:
customers search nearby stations for substitutes if the target station runs out of
vehicles. Therefore, to address the formulated discrete SO problem (4.13), we apply the
following three algorithms: the original ESB&B algorithm, the proposed algorithm with
parallel partition, and the proposed algorithm with hyperplane partition. The potential
splitting factors yj for hyperplane partition are generated as follows: for each station i,
form a cluster of stations centered at station i with a given radius, and eliminate duplicates.
In this section, we choose a radius of 1, in the distance unit that the simulator uses when
calculating the spillover effect from one station to another.
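The generation of candidate splitting factors can be sketched as follows. This is a minimal sketch under stated assumptions: stations are given as planar coordinates, distance is Euclidean (the simulator's actual distance measure may differ), singleton clusters are kept, and the function names are hypothetical.

```python
import math


def station_clusters(coords, radius):
    """Return the distinct clusters C_j, each a frozenset of station indices."""
    clusters = []
    seen = set()
    for cx, cy in coords:
        # All stations within `radius` of the current center station.
        members = frozenset(
            k for k, (x, y) in enumerate(coords)
            if math.hypot(x - cx, y - cy) <= radius
        )
        if members not in seen:  # drop duplicate clusters
            seen.add(members)
            clusters.append(members)
    return clusters


def splitting_factors(x, clusters):
    """Compute y_j as the sum of x_i over the stations i in cluster C_j."""
    return [sum(x[i] for i in cluster) for cluster in clusters]
```

For example, with three stations at (0, 0), (0.5, 0), and (3, 0) and radius 1, the first two stations generate the same cluster, so only two distinct clusters remain.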
We consider the fleet assignment problem for Boston’s South End area, which contains 23
stations, i.e., the decision vector is of dimension 23 (shown in Figure 4-24). Each station
i has a space capacity Ni = 16 and the total number of cars to assign is N = 211. All
exogenous variables of the simulator (e.g., ci) other than the demand level are set the
same as in Zhou et al. (2019). We test the algorithms on one low-demand case and one
high-demand case. The maximum number of algorithm iterations is set to 40. At every
iteration, the number of solutions to be sampled from subregions other than the current best
subregion is set to 10 (i.e., νO = 10), and the total number of solutions to be sampled from
the current best subregion is set to 20 (i.e., |P(Rk)|νR = 20). The number of replications
for each non-encountered sample is set at ∆nF = 5, and that for each encountered sample at
∆nA = 2. For all algorithms, we initialize with 20 uniformly randomly sampled solutions plus one solution
with relatively good performance, which can be considered a warm start. This warm-start
solution is obtained by solving the analytical car-sharing fleet assignment model formulated
as a mixed-integer program (MIP) in Zhou et al. (2019, Eq. (8)-(13)). For the original
ESB&B algorithm, the best subregion is divided into 3 new subregions at each iteration
(i.e., ω = 3). For the proposed algorithm with both parallel partition and hyperplane
partition, the maximum tree depth is set at 2 (i.e., D = 2), and hence the current best
subregion can be divided into at most 4 subregions. The minimum number of sampled
points grouped in one subregion is set at 2, i.e., Nmin = 2. Note that the computational
time spent on simulation per iteration in this experiment setting is roughly 500 seconds, whereas the
proposed partitioning strategy, with either parallel or hyperplane partitions, finishes within
2 seconds, which is negligible by comparison. We run each algorithm five times.
Figure 4-24: Zipcar stations in Boston South End neighborhood (Google Maps; 2017)
Figure 4-25 compares the objective estimate of the current iterate across iterations for
each algorithm run of the low-demand experiment. The x-axis displays the iteration index
and the y-axis displays the performance estimate of the current iterate (i.e., simulation-
based estimate of the objective function of the best point). The red dashed lines display
the results for the ESB&B algorithm, the black solid lines for the proposed algorithm with
parallel partition, and the blue asterisk lines for the proposed algorithm with hyperplane
partition. Figure 4-26 displays the aggregated average performance for each algorithm. It
plots the objective value of the estimated optimal solution at each iteration (averaged over 5
algorithm runs) for the ESB&B algorithm (red line with shaded 95% confidence boundary),
the proposed algorithm with parallel partition (black line with shaded 95% confidence
boundary), and the proposed algorithm with hyperplane partition (blue asterisk line with
shaded 95% confidence boundary). In both figures, we observe the following trends. Initially,
all algorithms start at a similar position, which is the warm-start solution.
As the iterations advance, the proposed algorithm with either parallel or hyperplane
partition identifies solutions with improved performance faster than the ESB&B algorithm.
To further evaluate the performances of the solutions derived, we simulate each derived
solution for 50 replications. Table 4.1 displays the performance statistics of the derived
solutions from different algorithm runs. The first and second columns show the algorithm
name and run id. The third and fourth columns show the mean and standard deviation
of profits generated by each derived solution. We conduct a two-sample t-test between
each derived solution from the proposed algorithm (with parallel and hyperplane partitions)
and each derived solution from the ESB&B algorithm. The null hypothesis is that the
average profits generated by the derived solutions from both algorithms are the same. The
corresponding alternative hypothesis is that the average profit generated by the solution
derived from the proposed algorithm is higher. Tables 4.2 and 4.3 display the p-values
for the 50 tests. The null hypothesis cannot be rejected if the p-value is greater than the
significance level 0.05 (colored red in the tables). The null hypothesis is rejected for all tests,
except for the one that compares the derived solution from run 4 of the ESB&B algorithm
with that from run 3 of the proposed algorithm with parallel partition. In other words,
at the end of the last iteration, the solutions derived from the proposed
algorithm mostly perform better than those derived from the ESB&B algorithm. To investigate
the added value of hyperplane cuts in the proposed algorithm, we conduct a two-sample
t-test for each solution derived from the proposed algorithm with parallel partition and
each solution derived from the proposed algorithm with hyperplane partition. As before,
our null hypothesis is that the average profits generated by both solutions are the same,
and the alternative hypothesis is that the average profit generated by the solution derived
from the proposed algorithm with hyperplane partition is higher. Table 4.4 displays the
Figure 4-25: The objective function estimate of the current iterate across iterations for the low-demand experiment. [Plot: x-axis: iteration; y-axis: estimated profit of current iterate ($), scale ×10^4; legend: ESBB, Proposed-Parallel, Proposed-Hyperplane.]
Figure 4-26: Objective function estimate of the current iterate across iterations (averaged over the 5 algorithm runs) for the low-demand experiment.
p-values for all 25 two-sample t-tests. The red cells in the table represent the tests in which
the null hypothesis cannot be rejected at significance level 0.05. There are 15 tests that
reject the null hypothesis and suggest that the solution derived by the proposed algorithm with
hyperplane partition generates higher profits.
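The one-sided two-sample test can be sketched from the summary statistics alone. This is a minimal sketch under stated assumptions: it uses the unequal-variance (Welch) form of the statistic, which coincides with the pooled form when both samples have the same size, and it approximates the t distribution with roughly 98 degrees of freedom by a standard normal. The exact test variant used for the reported p-values is not stated in the text, and the function name is hypothetical.

```python
import math


def one_sided_p_value(mean_a, sd_a, mean_b, sd_b, n):
    """p-value for H1: mean_b > mean_a, given n replications of each solution."""
    standard_error = math.sqrt((sd_a ** 2 + sd_b ** 2) / n)
    t_stat = (mean_b - mean_a) / standard_error
    # Upper-tail probability of a standard normal: 0.5 * erfc(t / sqrt(2)).
    return 0.5 * math.erfc(t_stat / math.sqrt(2.0))


# ESB&B run 4 versus parallel-partition run 3 (means and standard deviations from Table 4.1).
p = one_sided_p_value(63138.56, 184.42, 63169.15, 180.09, 50)
```

For this pair the sketch gives a p-value of roughly 0.20, in line with the single non-rejected test reported in Table 4.2.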
Algorithm                                   Run   Mean        Standard deviation
ESB&B algorithm                             1     63066.95    173.73
                                            2     63001.90    174.63
                                            3     63056.80    175.06
                                            4     63138.56    184.42
                                            5     63076.53    174.82
Proposed algorithm, parallel partition      1     63204.83    175.60
                                            2     63220.92    161.94
                                            3     63169.15    180.09
                                            4     63244.88    150.86
                                            5     63223.98    169.94
Proposed algorithm, hyperplane partition    1     63260.53    204.98
                                            2     63265.89    174.31
                                            3     63247.42    168.43
                                            4     63302.07    188.59
                                            5     63322.54    180.93
Table 4.1: Performance statistics of the derived final solutions in the low-demand case.

Rows: ESB&B run. Columns: run of the proposed algorithm with parallel partition.
ESB&B run    1         2         3         4         5
1            0.0001    0.0000    0.0024    0.0000    0.0000
2            0.0000    0.0000    0.0000    0.0000    0.0000
3            0.0000    0.0000    0.0010    0.0000    0.0000
4            0.0344    0.0098    0.2017    0.0011    0.0089
5            0.0002    0.0000    0.0052    0.0000    0.0000
Table 4.2: P-values of the two-sample t-test comparing the solutions derived by the ESB&B algorithm and the proposed algorithm with parallel partition in the low-demand experiment.
Figure 4-27 compares the objective estimate of the current iterate across iterations for
each algorithm run of the high-demand experiment. This figure has a layout similar to that of Figure
4-25. Figure 4-28 displays the aggregated average performance for each algorithm. It
has a layout similar to that of Figure 4-26. From the beginning to the end, the ESB&B algorithm
does not find any solution with improved performance for four algorithm runs. On the other
Rows: ESB&B run. Columns: run of the proposed algorithm with hyperplane partition.
ESB&B run    1         2         3         4         5
1            0.0000    0.0000    0.0000    0.0000    0.0000
2            0.0000    0.0000    0.0000    0.0000    0.0000
3            0.0000    0.0000    0.0000    0.0000    0.0000
4            0.0011    0.0003    0.0013    0.0000    0.0000
5            0.0000    0.0000    0.0000    0.0000    0.0000
Table 4.3: P-values of the two-sample t-test comparing the solutions derived by the ESB&B algorithm and the proposed algorithm with hyperplane partition in the low-demand experiment.

Rows: run of the proposed algorithm with parallel partition. Columns: run of the proposed algorithm with hyperplane partition.
Parallel run    1         2         3         4         5
1               0.0739    0.0420    0.1093    0.0045    0.0007
2               0.1433    0.0922    0.2123    0.0116    0.0019
3               0.0099    0.0037    0.0135    0.0002    0.0000
4               0.3325    0.2604    0.4684    0.0487    0.0109
5               0.1672    0.0000    0.2451    0.0160    0.0030
Table 4.4: P-values of the two-sample t-test comparing the solutions derived by the proposed algorithm with parallel and hyperplane partition in the low-demand experiment.
hand, the proposed algorithm with either parallel or hyperplane partition does improve
as the iterations advance. The proposed algorithm with hyperplane partition starts to find
better solutions earlier than the one with parallel partition.
To further evaluate the performances of the solutions derived, we simulate each derived
solution for 50 replications. Table 4.5 displays the performance statistics of the derived
solutions from different algorithm runs. This table has a layout similar to that of Table 4.1. We first
conduct a two-sample t-test between each solution derived from the ESB&B algorithm and
each solution derived from the proposed algorithm (with parallel and hyperplane partition).
As before, the null hypothesis is that the average profits generated by the solutions derived from
both algorithms are the same, and the alternative hypothesis is that the solution derived
from the proposed algorithm generates a higher profit. Tables 4.6 and 4.7 display the p-values
for the 50 tests. For all tests, the null hypothesis is rejected at significance level
0.05. In other words, the proposed algorithm ends with solutions with higher profits.
To validate the added value of hyperplane cuts in the proposed algorithm, we conduct a
Figure 4-27: The objective function estimate of the current iterate across iterations for the high-demand experiment. [Plot: x-axis: iteration; y-axis: estimated profit of current iterate ($), scale ×10^5; legend: ESBB, Proposed-Parallel, Proposed-Hyperplane.]
Figure 4-28: Objective function estimate of the current iterate across iterations (averaged over the 5 algorithm runs) for the high-demand experiment.
two-sample t-test between each solution derived with parallel partition and each solution
derived with hyperplane partition. The null hypothesis is that the average profits generated
by both solutions are the same. The alternative hypothesis is that the solution derived by the
proposed algorithm with hyperplane partition generates a higher profit. Table 4.8 displays
the p-values for the 25 two-sample t-tests. For all 25 t-tests, we reject the null hypothesis
at significance level 0.05. Hence, the proposed algorithm with hyperplane partition ends
with solutions that perform better than those derived by the proposed
algorithm with parallel partition. This clear improvement in the proposed algorithm’s
finite-time performance may arise because, under high-demand conditions, customer spillover
from one station to another happens frequently, and hence a hyperplane cut based on the
total number of vehicles in a cluster of nearby stations may be a more efficient way of
dividing the feasible region.
Algorithm                                   Run   Mean         Standard deviation
ESB&B algorithm                             1     138420.78    177.05
                                            2     138420.06    190.27
                                            3     138390.60    158.24
                                            4     138357.82    171.41
                                            5     138357.76    210.50
Proposed algorithm, parallel partition      1     138669.72    184.94
                                            2     138703.72    166.58
                                            3     138723.02    201.85
                                            4     138751.44    204.59
                                            5     138666.18    206.07
Proposed algorithm, hyperplane partition    1     139502.98    196.23
                                            2     139340.78    206.50
                                            3     139188.80    212.03
                                            4     139468.24    224.51
                                            5     139272.32    195.04
Table 4.5: Performance statistics of the derived final solutions in the high-demand case.
In summary, compared to the ESB&B algorithm, the proposed algorithm can improve
the finite-time performances by exploring the underlying function structure through sam-
pled points and incorporating prior knowledge of the problem-specific structures.
Rows: ESB&B run. Columns: run of the proposed algorithm with parallel partition.
ESB&B run    1         2         3         4         5
1            0.0000    0.0000    0.0000    0.0000    0.0000
2            0.0000    0.0000    0.0000    0.0000    0.0000
3            0.0000    0.0000    0.0000    0.0000    0.0000
4            0.0000    0.0000    0.0000    0.0000    0.0000
5            0.0000    0.0000    0.0000    0.0000    0.0000
Table 4.6: P-values of the two-sample t-test comparing the solutions derived by the ESB&B algorithm and the proposed algorithm with parallel partition in the high-demand experiment.

Rows: ESB&B run. Columns: run of the proposed algorithm with hyperplane partition.
ESB&B run    1         2         3         4         5
1            0.0000    0.0000    0.0000    0.0000    0.0000
2            0.0000    0.0000    0.0000    0.0000    0.0000
3            0.0000    0.0000    0.0000    0.0000    0.0000
4            0.0000    0.0000    0.0000    0.0000    0.0000
5            0.0000    0.0000    0.0000    0.0000    0.0000
Table 4.7: P-values of the two-sample t-test comparing the solutions derived by the ESB&B algorithm and the proposed algorithm with hyperplane partition in the high-demand experiment.

Rows: run of the proposed algorithm with parallel partition. Columns: run of the proposed algorithm with hyperplane partition.
Parallel run    1         2         3         4         5
1               0.0000    0.0000    0.0000    0.0000    0.0000
2               0.0000    0.0000    0.0000    0.0000    0.0000
3               0.0000    0.0000    0.0000    0.0000    0.0000
4               0.0000    0.0000    0.0000    0.0000    0.0000
5               0.0000    0.0000    0.0000    0.0000    0.0000
Table 4.8: P-values of the two-sample t-test comparing the solutions derived by the proposed algorithm with parallel and hyperplane partition in the high-demand experiment.
4.5 Conclusion
In this chapter, we propose an adaptive partitioning strategy and combine it with the ESB&B
framework developed by Xu and Nelson (2013). This proposed partitioning strategy is a
sample-driven approach that explores the structure of the underlying objective function
by iteratively dividing the feasible region into subregions such that sampled points within
the same subregion have similar performances. The proposed partitioning strategy can be
integrated with prior knowledge of specific problem structures to form more efficient hyperplane
partitions. The proposed partitioning problem is formulated as an MIP and solved quickly
and efficiently via the local search algorithm developed by Dunn (2018).
The proposed algorithm combines the proposed adaptive partitioning strategy within the
ESB&B framework. It is a general-purpose algorithm that converges globally for discrete
SO problems with finite and convex feasible region, and it can be applied to general SO
problems without a significant amount of modification. The proposed algorithm improves
the finite-time performances of the original ESB&B algorithm with generic partitioning
strategy.
For SO problems with a tight sampling budget at each iteration, a smart sampling strategy
other than uniform sampling is important; this is a potential research direction
to explore. Currently, the adaptive partitioning strategy supports parallel partitions, as well as
hyperplane partitions only when the potential hyperplane cuts are known in advance. An
interesting research direction is to automatically generate hyperplane cuts for problems
without such prior knowledge.
Chapter 5
Conclusions
This chapter concludes the thesis by reviewing its main contents and contributions. It also
provides potential directions for future research.
Chapter 2 formulates an analytical stochastic link transmission model that is both com-
putationally efficient and consistent with Newell’s simplified kinetic theory of traffic
flow. The proposed model builds upon the multivariate model of Osorio and Flöt-
teröd (2015). The model has a complexity that is linear, rather than cubic, in the
link’s space capacity. This makes the model suitable for large-scale network analy-
sis.
The model is validated versus a simulation-based implementation of the stochastic
link transmission model. The proposed model yields significant gains in compu-
tational efficiency while preserving accuracy. The proposed model is then used to
address a signal control problem for the city of Lausanne. It yields signal plans that
systematically outperform initial random plans for various performance metrics. The
experiments illustrate the robustness of the model to the quality of the initial points.
The proposed plans also outperform a signal plan derived from a widely used com-
mercial signal control software.
Chapter 3 proposes a relaxation approximation of the stochastic link transmission model
formulated in Chapter 2. It proposes a formulation with a constant model complexity,
whereas the past formulations have a complexity that scales linearly or cubically with
link length. This makes it suitable for large-scale network optimization with time
budgets or real-time optimization problems.
The model is validated versus a simulation-based implementation of the stochastic
link transmission model. Its performance is also benchmarked with other past analyt-
ical formulations. The proposed model yields estimates with comparable accuracy,
while the computational efficiency is enhanced by at least one order of magnitude.
The proposed model is also validated versus a microscopic traffic simulator and it
can accurately approximate the link’s boundary conditions for realistic traffic situ-
ations. The model is then used to address the same city-wide traffic signal control
problem with time budget. Compared to a benchmark probabilistic analytical model,
the proposed model enhances computational efficiency by two orders of magnitude,
while deriving signal plans with similar performance. Compared to a benchmark
deterministic network loading model, the proposed model derives signal plans with
better performance. It also yields signal plans that outperform those obtained from a
widely used commercial signal control software.
Chapter 4 proposes a technique to enhance the computational efficiency of SO algorithms
for high-dimensional discrete SO problems. The technique is based on an adaptive
partitioning strategy, which iteratively divides the feasible region along the decision
variables into subregions such that previously sampled solutions with
similar performance are located in the same subregion. The proposed partitioning
strategy can take on problem-specific structures known a priori to form more ef-
ficient hyperplane cuts. The proposed adaptive partitioning strategy is integrated
in the ESB&B framework by Xu and Nelson (2013). The resulting algorithm is a
general-purpose discrete SO algorithm that converges globally for problems with a
finite and convex feasible region, and it can be applied to deterministic problems
without a significant amount of modification. Two numerical studies show that the
proposed algorithm outperforms the original ESB&B algorithm in small sampling
budget (finite-time) performance. The advantage is greater when prior knowledge of
the problem-specific structures is available.
Future research directions The proposed stochastic link transmission model (Chapter 2)
and its relaxation approximation (Chapter 3) are link models that describe the
within-link dynamics and produce the changes in link states (the link’s boundary conditions),
given arrival and departure profiles. Another key component of a complete network
loading model is the node model, which describes the between-link dynamics and
produces arrival and departure profiles, given the states of its connected links.
In order to formulate a complete probabilistic network model, there is a need to for-
mulate probabilistic and scalable node models. The probabilistic model of Osorio
et al. (2011) includes a tandem-link node model that provides a higher-order descrip-
tion of the across-node dependencies. The extension of this formulation to nodes
with multiple upstream and downstream links remains for future work.
Second, there is a need to formulate scalable network models. For a network with n
links, each with space capacity ℓ, directly coupling the proposed link model of Chap-
ter 3 with the node model of Osorio et al. (2011) would yield a model complexity in
the order of O(ℓn). Such a model is inappropriate for large-scale network analysis.
Potential research directions include, but are not limited to, network decomposition
techniques (e.g., Flötteröd and Osorio (2014)) and aggregate-disaggregate techniques
(e.g., Osorio and Yamani (2017); Osorio and Wang (2017)).
The proposed algorithm in Chapter 4 for discrete SO problems embeds an adaptive
partitioning strategy within the ESB&B framework of Xu and Nelson (2013).
The proposed adaptive partitioning strategy uncovers the structure of the underlying
function by iteratively dividing the feasible region into subregions such that sam-
pled points within the same subregion have similar performances. The resulting
algorithm concentrates the limited computational effort (sampling budgets) on sub-
regions where good solutions appear to be. The proposed partitioning strategy can
also make more efficient hyperplane cuts by using prior knowledge of the problem-
specific structures. An extension of this approach consists of automatic detection of
such hyperplane cuts.
Appendix A
Appendices of Chapter 2
A.1 Estimation of the weight parameter w
This section describes the procedure followed to formulate, and to fit the coefficients of,
the weight parameter of Equation (40). Recall that the goal of the mixture model is to
accurately approximate the upstream and the downstream boundary conditions of the link.
In other words, it should yield an accurate approximation of the distribution of UQ and
of DQ. We consider a single isolated link and conduct a total of 180 experiments with
varied combinations of the space capacity (ℓ ∈ {5, 10, 15, . . . , 100}), the traffic inten-
sity (ρ = λ/µ ∈ {0.25, 0.5, 0.75}) and the service rate (or downstream flow capacity)
(µ ∈ {0.2, 0.4, 0.6}). Each experiment considers a time period of duration 250 seconds.
For each experiment we compare the approximation of the UQ and of the DQ distributions,
over time T, to the distributions estimated via stochastic simulation with a
discrete-event simulator of the stochastic link transmission model. Based on the results of
these experiments, we first observed that the parameters that most impact the quality of the
approximation are ℓ, µ and $k^{\mathrm{fwd}}\delta$. This led us to formulate the following
expression for the weight parameter:
\[
w(\ell, \mu, k^{\mathrm{fwd}}\delta; \beta) = e^{-\frac{\ell^{2}}{\beta \mu k^{\mathrm{fwd}} \delta}}, \tag{A.1}
\]
where β is a scalar coefficient. The coefficient is fit such as to minimize, over all 180
experiments, the following error function:
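\[
\frac{1}{2}\left[\frac{1}{250}\sum_{T=1}^{250} \mathrm{JSD}\left(P_1^{UQ}(T) \,\|\, P_2^{UQ}(T)\right)\right]
+ \frac{1}{2}\left[\frac{1}{250}\sum_{T=1}^{250} \mathrm{JSD}\left(P_1^{DQ}(T) \,\|\, P_2^{DQ}(T)\right)\right], \tag{A.2}
\]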
where $P_1^{UQ}(T)$ (resp. $P_1^{DQ}(T)$) is the UQ (resp. DQ) distribution obtained from the
mixture model at time T and $P_2^{UQ}(T)$ (resp. $P_2^{DQ}(T)$) is the UQ (resp. DQ)
distribution estimated via stochastic simulation at time T. The two summations of (A.2)
consider the error in the UQ distributions and in the DQ distributions, respectively. The
fit yields β = 70, which gives the final weight parameter expression defined in
Equation (40).
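As an illustration only, the fitting criterion of this section can be sketched in Python. The function names, the use of base-2 logarithms in the JSD, and the representation of each distribution as a list of probabilities are assumptions made for this sketch; they are not taken from the implementation used in the thesis.

```python
import math


def weight(ell, mu, k_fwd_delta, beta=70.0):
    """Weight of Equation (A.1): exp(-ell^2 / (beta * mu * k_fwd * delta))."""
    return math.exp(-ell ** 2 / (beta * mu * k_fwd_delta))


def jsd(p, q):
    """Jensen-Shannon divergence of two discrete distributions.

    Base-2 logarithms are an assumption of this sketch.
    """
    def kl(a, b):
        # Terms with a zero probability in `a` contribute nothing.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    mid = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, mid) + 0.5 * kl(q, mid)


def fitting_error(uq_model, uq_sim, dq_model, dq_sim):
    """Error of Equation (A.2) for one experiment.

    Each argument is a list over time steps of discrete distributions.
    """
    horizon = len(uq_model)
    uq_term = sum(jsd(p, q) for p, q in zip(uq_model, uq_sim)) / horizon
    dq_term = sum(jsd(p, q) for p, q in zip(dq_model, dq_sim)) / horizon
    return 0.5 * uq_term + 0.5 * dq_term
```

Under these assumptions, β would be chosen by evaluating `fitting_error` for each candidate value over all 180 experiments and keeping the value with the smallest total error.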
A.2 Tables of time-average JSD metric
Tables A.1 and A.2 display, respectively, the time-average JSD metric of the UQ and DQ
distributions.
Experiment      Time-average JSD of the UQ distribution
λ(k)    ℓ       Mixture   Multivariate   DetDet    DetExp    ExpDet
0.1     10      0.0010    0.0000         0.3539    0.2600    0.0003
0.1     20      0.0012    0.0000         0.3988    0.3177    0.0001
0.1     30      0.0013    0.0000         0.4248    0.3543    0.0001
0.1     40      0.0014    NaN            0.4402    0.3779    0.0000
0.1     60      0.0013    NaN            0.4590    0.4111    0.0000
0.1     80      0.0011    NaN            0.4692    0.4320    0.0000
0.1     100     0.0010    NaN            0.4753    0.4489    0.0000
0.2     10      0.0054    0.0000         0.4261    0.2434    0.0020
0.2     20      0.0068    0.0000         0.4644    0.3080    0.0013
0.2     30      0.0070    0.0000         0.4839    0.3476    0.0008
0.2     40      0.0070    NaN            0.4961    0.3767    0.0005
0.2     60      0.0062    NaN            0.5105    0.4189    0.0003
0.2     80      0.0054    NaN            0.5181    0.4476    0.0001
0.2     100     0.0045    NaN            0.5223    0.4713    0.0001
0.3     10      0.0081    0.0000         0.4654    0.1615    0.0036
0.3     20      0.0206    0.0000         0.5071    0.2387    0.0045
0.3     30      0.0237    0.0000         0.5214    0.2863    0.0041
0.3     40      0.0223    NaN            0.5294    0.3215    0.0032
0.3     60      0.0182    NaN            0.5384    0.3772    0.0016
0.3     80      0.0145    NaN            0.5434    0.4215    0.0008
0.3     100     0.0115    NaN            0.5458    0.4602    0.0004
Table A.1: Time-average JSD metric of the UQ distribution. The value NaN denotes cases where the evaluation of the multivariate model exceeded the limit of 40 hours.
Experiment      Time-average JSD of the DQ distribution
λ(k)    ℓ       Mixture   Multivariate   DetDet    DetExp    ExpDet
0.1     10      0.0007    0.0000         0.1550    0.0486    0.0028
0.1     20      0.0002    0.0000         0.1530    0.0476    0.0028
0.1     30      0.0000    0.0000         0.1486    0.0468    0.0027
0.1     40      0.0000    NaN            0.1466    0.0460    0.0026
0.1     60      0.0000    NaN            0.1401    0.0441    0.0025
0.1     80      0.0000    NaN            0.1335    0.0422    0.0024
0.1     100     0.0000    NaN            0.1270    0.0402    0.0022
0.2     10      0.0030    0.0000         0.2679    0.0527    0.0115
0.2     20      0.0008    0.0000         0.2640    0.0538    0.0112
0.2     30      0.0002    0.0000         0.2584    0.0538    0.0107
0.2     40      0.0000    NaN            0.2528    0.0518    0.0105
0.2     60      0.0000    NaN            0.2414    0.0497    0.0099
0.2     80      0.0000    NaN            0.2301    0.0476    0.0095
0.2     100     0.0000    NaN            0.2188    0.0456    0.0089
0.3     10      0.0077    0.0000         0.3677    0.0347    0.0262
0.3     20      0.0033    0.0000         0.3811    0.0507    0.0217
0.3     30      0.0007    0.0000         0.3760    0.0544    0.0202
0.3     40      0.0002    NaN            0.3680    0.0539    0.0196
0.3     60      0.0000    NaN            0.3512    0.0516    0.0184
0.3     80      0.0000    NaN            0.3343    0.0492    0.0174
0.3     100     0.0000    NaN            0.3173    0.0467    0.0164
Table A.2: Time-average JSD metric of the DQ distribution. The value NaN denotes cases where the evaluation of the multivariate model exceeded the limit of 40 hours.
Appendix B
Appendices of Chapter 3
B.1 Property: τDQ(k) = λDQ(k) when µ(k) = 0
In this section, we derive the property of τDQ(k) when the service rate of DQ(k) becomes
zero. When µ(k) = 0, DQ(k) is a pure (Poisson) arrival process. DQ can be empty at the end
of time interval k (i.e., DQ(k) = 0) only if DQ is empty at the beginning of time interval k,
which happens with probability P(DQ(k − 1) = 0), and there is no arrival to DQ during
time interval k of length δ. Since arrivals to DQ during time interval k follow a Poisson
process with rate λDQ(k), the number of arrivals in a time interval of length δ follows a
Poisson distribution with parameter λDQ(k)δ. Thus, no arrival to DQ occurs during time
interval k with probability $e^{-\lambda_{DQ}(k)\delta}$.
Therefore, we have
\begin{align}
P(DQ(k) = 0) &= P(DQ(k-1) = 0)\, e^{-\lambda_{DQ}(k)\delta} \tag{B.1}\\
&= P(DQ_k = 0) + \left[P(DQ(k-1) = 0) - P(DQ_k = 0)\right] e^{-\lambda_{DQ}(k)\delta}. \tag{B.2}
\end{align}
Equation (B.2) is obtained from Equation (B.1) by adding outside the bracket and
subtracting within the bracket the term $P(DQ_k = 0)$. When µ(k) = 0, we have ρDQ =
λDQ(k)/µ(k) ≫ 1, and thus $P(DQ_k = 0) = 0$ (given by Eq. (3.6b)). Equations (B.2) and
(B.1) are equal since adding/subtracting zero does not affect the result. Equation (B.2) is in
the exact form of Equation (3.4), with τDQ(k) replaced by λDQ(k). Thus, τDQ(k) should not
only exist when µ(k) = 0, it should also equal λDQ(k).
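The argument of this section can be checked numerically with a short Python sketch. The function names and the numerical values are hypothetical; the relaxation form follows Equation (B.2) as written here, since Equation (3.4) is not reproduced in this appendix.

```python
import math


def p_empty_no_service(p_prev_empty, lam, delta):
    """Equation (B.1): DQ is empty at the end of the interval only if it was
    empty at the start and no Poisson arrival occurs during the interval."""
    return p_prev_empty * math.exp(-lam * delta)


def p_empty_relaxation(p_prev_empty, p_stationary_empty, tau, delta):
    """Form of Equation (B.2): stationary value plus an exponentially
    decaying gap between the previous and the stationary value."""
    gap = p_prev_empty - p_stationary_empty
    return p_stationary_empty + gap * math.exp(-tau * delta)


# Hypothetical values: with mu(k) = 0 the stationary probability of an
# empty queue is 0, so setting tau = lam should reproduce (B.1).
lam, delta, p_prev = 0.1, 1.0, 0.3
direct = p_empty_no_service(p_prev, lam, delta)
relaxed = p_empty_relaxation(p_prev, 0.0, lam, delta)
```

In this sketch `direct` and `relaxed` coincide, which is the statement that τDQ(k) = λDQ(k) makes the relaxation form exact when µ(k) = 0.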
B.2 Calculation of lim_{µ(k)→0} τDQ(k) of Equation (3.15)
In this section, we derive the limit of τDQ(k) given by Equation (3.7) as µ(k) approaches
zero. We first calculate the limits of τDQ,1 (given by Eq. (3.7b)) and τDQ,2 (given by
Eq. (3.7c)) independently as follows:
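\begin{align}
\lim_{\mu(k)\to 0} \tau_{DQ,1}
&= \lim_{\mu(k)\to 0} \left(1 - \alpha_1 e^{-\rho_{DQ}(k)}\right) \times \frac{\mu(k)\left(1 - \rho_{DQ}(k)\right)^{2}}{1 + \rho_{DQ}(k)} \tag{B.3}\\
&= \lim_{\mu(k)\to 0} \left(1 - \alpha_1 e^{-\rho_{DQ}(k)}\right) \times \lim_{\mu(k)\to 0} \frac{\mu(k)\left(1 - \rho_{DQ}(k)\right)^{2}}{1 + \rho_{DQ}(k)} \tag{B.4}\\
&= \lim_{\mu(k)\to 0} \left(1 - \alpha_1 e^{-\lambda_{DQ}(k)/\mu(k)}\right) \times \lim_{\mu(k)\to 0} \frac{\mu(k)^{2}\left(1 - \rho_{DQ}(k)\right)^{2}}{\mu(k)\left(1 + \rho_{DQ}(k)\right)} \tag{B.5}\\
&= (1 - 0) \times \lim_{\mu(k)\to 0} \frac{\left(\mu(k) - \mu(k)\rho_{DQ}(k)\right)^{2}}{\mu(k) + \mu(k)\rho_{DQ}(k)} \tag{B.6}\\
&= 1 \times \lim_{\mu(k)\to 0} \frac{\left(\mu(k) - \lambda_{DQ}(k)\right)^{2}}{\mu(k) + \lambda_{DQ}(k)} \tag{B.7}\\
&= \frac{\left(0 - \lambda_{DQ}(k)\right)^{2}}{0 + \lambda_{DQ}(k)} \tag{B.8}\\
&= \lambda_{DQ}(k) \tag{B.9}
\end{align}
\begin{align}
\lim_{\mu(k)\to 0} \tau_{DQ,2}
&= \lim_{\mu(k)\to 0} \alpha_2\, \mu(k) \left| \frac{P(DQ(k-1) = 0) - P(DQ_k = 0)}{\ell\left(1 - P(DQ_k = 0)\right)} \right|^{1/5} \tag{B.10}\\
&= \alpha_2 \left| \frac{P(DQ(k-1) = 0) - P(DQ_k = 0)}{\ell\left(1 - P(DQ_k = 0)\right)} \right|^{1/5} \lim_{\mu(k)\to 0} \mu(k) \tag{B.11}\\
&= 0 \tag{B.12}
\end{align}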
Note that τDQ(k) (given by Eq. (3.7a)) is defined as the sum of the two terms τDQ,1 and
τDQ,2, and that the limit of a sum of two functions equals the sum of their limits whenever
both limits exist. Therefore, we have
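\begin{align}
\lim_{\mu(k)\to 0} \tau_{DQ}(k) &= \lim_{\mu(k)\to 0} \tau_{DQ,1} + \lim_{\mu(k)\to 0} \tau_{DQ,2} \tag{B.13}\\
&= \lambda_{DQ}(k) + 0 \tag{B.14}\\
&= \lambda_{DQ}(k) \tag{B.15}
\end{align}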
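The limit derived above can be illustrated numerically. The following Python sketch evaluates the two terms as they appear in (B.3) and (B.10) for a decreasing sequence of service rates. The coefficient values, the probabilities and the function names are hypothetical and serve only to make the sketch runnable.

```python
import math


def tau_dq_1(mu, lam, alpha_1):
    """First term, as written in (B.3), with rho = lam / mu."""
    rho = lam / mu
    return (1.0 - alpha_1 * math.exp(-rho)) * mu * (1.0 - rho) ** 2 / (1.0 + rho)


def tau_dq_2(mu, alpha_2, p_prev_empty, p_stationary_empty, ell):
    """Second term, as written in (B.10)."""
    ratio = (p_prev_empty - p_stationary_empty) / (ell * (1.0 - p_stationary_empty))
    return alpha_2 * mu * abs(ratio) ** 0.2


# Hypothetical inputs: arrival rate 0.1, coefficients 0.4 and 0.6,
# P(DQ(k-1) = 0) = 0.5, stationary probability 0, space capacity 10.
lam = 0.1
values = [
    tau_dq_1(mu, lam, 0.4) + tau_dq_2(mu, 0.6, 0.5, 0.0, 10)
    for mu in (1e-2, 1e-4, 1e-6, 1e-8)
]
```

With these inputs the entries of `values` approach the arrival rate `lam` as the service rate shrinks, in line with (B.15).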
B.3 Estimation of the scalar coefficients in τDQ(k)
This appendix describes the procedure to fit the exogenous coefficients (α1,1, α1,2, α2,1 and
α2,2) of Equation (3.7). A total of 126 simulation experiments were carried out. Each
experiment starts with an empty link and runs for TF = 300 time units, with one traffic
intensity value for the first 150 time units and another for the remaining 150 time units. We use
the arrow notation 0.5 → 0.25 to denote an experiment with a traffic intensity that changes
from 0.5 to 0.25. The experiments consider all combinations of traffic intensity λ/µ ∈
{0.5 → 0.25, 0.75 → 0.25, 1.25 → 0.25, 0.75 → 0.5, 1.25 → 0.5, 1.25 → 0.75}, service
rate µ ∈ {0.2, 0.4, 0.6}, and space capacity ℓ ∈ {10, 20, 30, 40, 60, 80, 100}. The simulator
yields an estimate of P(DQ(k) = 0), denoted PS(DQ(k) = 0), for all k = 1, ..., TF. The
coefficients α1,1, α1,2, α2,1 and α2,2 are fit such as to minimize, over all 126 experiments,
the error function given by Equation (3.33) and rewritten here:
\[
e_{DQ} = \frac{1}{T_F} \sum_{T=1}^{T_F} \left| P^{A}(DQ(T) = 0) - P^{S}(DQ(T) = 0) \right|, \tag{B.16}
\]
where PS(DQ(T) = 0) is the estimate from the simulator and PA(DQ(T) = 0) is the
analytical approximation obtained from Algorithm 4 with the following adjustments. At
every time step k,
• P(UQ(k) = ℓ) is obtained from the simulator;
• λDQ(k) is obtained from the simulator;
• τDQ(k) is obtained from Equation (3.7), which depends on scalar parameters α1,1,
α1,2, α2,1 and α2,2.
In other words, perfect information about the link’s upstream boundary conditions is
assumed in the calculation of PA(DQ(k) = 0). Thus, PA(DQ(k) = 0), and hence the error
function eDQ, depends only on the choice of α1,1, α1,2, α2,1 and α2,2. The scalars are
estimated jointly and the numerical values obtained are
α1,1 = 0.4, α1,2 = 0.4, α2,1 = 0.6 and α2,2 = 0.
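The experimental design and the error function of this section can be sketched in Python as follows. The names are hypothetical, and the analytical and simulated probability series are assumed to be available as plain lists; the sketch does not reproduce Algorithm 4 or the simulator.

```python
from itertools import product

# Traffic intensity steps (first 150 time units, remaining 150 time units).
INTENSITY_STEPS = [
    (0.5, 0.25), (0.75, 0.25), (1.25, 0.25),
    (0.75, 0.5), (1.25, 0.5), (1.25, 0.75),
]
SERVICE_RATES = [0.2, 0.4, 0.6]
SPACE_CAPACITIES = [10, 20, 30, 40, 60, 80, 100]


def experiment_grid():
    """All combinations of intensity step, service rate and space capacity."""
    return list(product(INTENSITY_STEPS, SERVICE_RATES, SPACE_CAPACITIES))


def mean_abs_difference(analytical, simulated):
    """Mean absolute difference of Equation (B.16) over the time horizon."""
    if len(analytical) != len(simulated):
        raise ValueError("both series must cover the same time steps")
    total = sum(abs(a - s) for a, s in zip(analytical, simulated))
    return total / len(analytical)
```

The grid contains 6 × 3 × 7 = 126 experiments, which matches the count stated above. A joint fit of the coefficients would sum `mean_abs_difference` over this grid for each candidate coefficient vector.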
B.4 Variance of the sojourn time of DQ(k)
In this section, we derive the expression for the variance of the sojourn time of DQ(k).
Recall that we use the sojourn time of an M/M/1/ℓ queue with arrival rate λDQ(k) and
service rate µ(k) to approximate the sojourn time of DQ(k). To make the notation simpler,
hereafter the time index k is dropped. Let ρ = λDQ/µ.
The probability density function of the sojourn time of an M/M/1/ℓ queue, denoted
$f_{S_{DQ}}(t)$, is given by Sztrik (2012, Chap. 2.4, Page 34):
\[
f_{S_{DQ}}(t) = \sum_{n=0}^{\ell-1} \mu \frac{(\mu t)^{n}}{n!} e^{-\mu t}\, \frac{P(DQ = n)}{1 - P(DQ = \ell)}, \tag{B.17}
\]
where P(DQ = n) is the steady state probability of DQ.
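We use this probability density function expression to compute $E[S_{DQ}^{2}]$ as follows.
\begin{align}
E[S_{DQ}^{2}] &= \int_{0}^{\infty} t^{2} f_{S_{DQ}}(t)\, dt \tag{B.18}\\
&= \int_{0}^{\infty} t^{2} \sum_{n=0}^{\ell-1} \mu \frac{(\mu t)^{n}}{n!} e^{-\mu t}\, \frac{P(DQ = n)}{1 - P(DQ = \ell)}\, dt \tag{B.19}\\
&= \int_{0}^{\infty} t^{2} \sum_{n=0}^{\ell-1} \mu \frac{(\mu t)^{n}}{n!} e^{-\mu t}\, \frac{\left(\frac{1-\rho}{1-\rho^{\ell+1}}\right)\rho^{n}}{1 - \left(\frac{1-\rho}{1-\rho^{\ell+1}}\right)\rho^{\ell}}\, dt \tag{B.20}\\
&= \frac{1-\rho}{1-\rho^{\ell}} \int_{0}^{\infty} t^{2} \sum_{n=0}^{\ell-1} \mu \frac{(\mu t)^{n}}{n!} e^{-\mu t} \rho^{n}\, dt \tag{B.21}\\
&= \frac{1-\rho}{1-\rho^{\ell}} \sum_{n=0}^{\ell-1} \rho^{n} \int_{0}^{\infty} t\, \frac{(\mu t)^{n+1}}{n!} e^{-\mu t}\, dt \tag{B.22}\\
&= \frac{1-\rho}{1-\rho^{\ell}} \sum_{n=0}^{\ell-1} \rho^{n}\, \frac{\Gamma(n+3)}{\mu^{2}\, n!} \tag{B.23}\\
&= \frac{1-\rho}{(1-\rho^{\ell})\mu^{2}} \sum_{n=0}^{\ell-1} \rho^{n} (n+2)(n+1) \tag{B.24}\\
&= \frac{-\ell(\ell+1)\rho^{\ell+2} + 2\ell(\ell+2)\rho^{\ell+1} - (\ell+1)(\ell+2)\rho^{\ell} + 2}{\mu^{2}(1-\rho^{\ell})(1-\rho)^{2}} \tag{B.25}
\end{align}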
Equation (B.20) is obtained from Equation (B.19) by substituting the closed-form expres-
sion of the steady state probability distribution of an M/M/1/ℓ system (see Gross (2008,
Chap. 2, Equation (2.49))), which is given by:
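\[
P(DQ = n) = \left(\frac{1-\rho}{1-\rho^{\ell+1}}\right)\rho^{n}, \quad \forall n \in \{0, \ldots, \ell\}. \tag{B.26}
\]
The expected value of the sojourn time of DQ is given by (see Eq. (3.29d)):
\[
E[S_{DQ}] = \frac{\ell\rho^{\ell+1} - (\ell+1)\rho^{\ell} + 1}{\mu(1-\rho^{\ell})(1-\rho)} \tag{B.27}
\]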
Thus, the variance of the sojourn time of DQ is given by:
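\begin{align}
\mathrm{Var}(S_{DQ}) &= E[S_{DQ}^{2}] - E[S_{DQ}]^{2} \tag{B.28}\\
&= \frac{\ell\rho^{2\ell+2} - 2\ell\rho^{2\ell+1} + (\ell+1)\rho^{2\ell} - \ell(\ell+1)\rho^{\ell+2} + 2\ell(\ell+1)\rho^{\ell+1} - (\ell^{2}+\ell+2)\rho^{\ell} + 1}{\mu^{2}(1-\rho^{\ell})^{2}(1-\rho)^{2}} \tag{B.29}
\end{align}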
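The closed-form expressions (B.25), (B.27) and (B.29) can be cross-checked numerically against the finite sum of (B.24). The Python sketch below does so; the function names and the test values of ρ, µ and ℓ are hypothetical, and the formulas are valid only for ρ ≠ 1.

```python
def second_moment_sum(rho, mu, ell):
    """E[S^2] from the finite sum of Equation (B.24)."""
    total = sum(rho ** n * (n + 2) * (n + 1) for n in range(ell))
    return (1 - rho) / ((1 - rho ** ell) * mu ** 2) * total


def second_moment_closed(rho, mu, ell):
    """E[S^2] from the closed form of Equation (B.25)."""
    num = (-ell * (ell + 1) * rho ** (ell + 2)
           + 2 * ell * (ell + 2) * rho ** (ell + 1)
           - (ell + 1) * (ell + 2) * rho ** ell
           + 2)
    return num / (mu ** 2 * (1 - rho ** ell) * (1 - rho) ** 2)


def mean_closed(rho, mu, ell):
    """E[S] from Equation (B.27)."""
    num = ell * rho ** (ell + 1) - (ell + 1) * rho ** ell + 1
    return num / (mu * (1 - rho ** ell) * (1 - rho))


def variance_closed(rho, mu, ell):
    """Var(S) from the closed form of Equation (B.29)."""
    num = (ell * rho ** (2 * ell + 2)
           - 2 * ell * rho ** (2 * ell + 1)
           + (ell + 1) * rho ** (2 * ell)
           - ell * (ell + 1) * rho ** (ell + 2)
           + 2 * ell * (ell + 1) * rho ** (ell + 1)
           - (ell ** 2 + ell + 2) * rho ** ell
           + 1)
    return num / (mu ** 2 * (1 - rho ** ell) ** 2 * (1 - rho) ** 2)


# Hypothetical test point: rho = 0.5, mu = 0.4, ell = 10.
rho, mu, ell = 0.5, 0.4, 10
gap_moment = second_moment_sum(rho, mu, ell) - second_moment_closed(rho, mu, ell)
gap_variance = variance_closed(rho, mu, ell) - (
    second_moment_closed(rho, mu, ell) - mean_closed(rho, mu, ell) ** 2
)
```

Both gaps should vanish up to floating-point error. For ℓ = 1 the variance reduces to 1/µ², the variance of a single exponential service time, which provides a further sanity check.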
B.5 Estimation of the scalar coefficients in τUQ(k)
This section describes the procedure to fit the exogenous coefficients α3,1 and α3,2 of
Equation (3.30). The same set of 126 simulation experiments as described in Appendix B.3 is
used. The coefficients α3,1 and α3,2 are fit by minimizing, over all 126 experiments, the
following error function given by Equation (3.34) and rewritten here:
\[
e_{UQ} = \frac{1}{T_F} \sum_{T=1}^{T_F} \left| P^{A}(UQ(T) = \ell) - P^{S}(UQ(T) = \ell) \right|, \tag{B.30}
\]
where PS(UQ(T) = ℓ) is the estimate from the simulator and PA(UQ(T) = ℓ) is the ana-
lytical approximation, which is obtained from Algorithm 4 with the following adjustments.
At every time step k,
• P(DQ(k) = 0) is obtained from the simulator;
• qUQ(k) and qLLO(k) are obtained from the simulator;
• τUQ(k) is obtained from Equation (3.30), which depends on scalar parameters α3,1
and α3,2.
In other words, perfect information about the link’s downstream boundary conditions is
assumed in the calculation of PA(UQ(T) = ℓ). Thus, PA(UQ(T) = ℓ) only depends on
the choice of α3,1 and α3,2 and thus the error function eUQ depends only on α3,1 and α3,2.
The scalars are estimated jointly and the numerical values obtained are α3,1 = 25 and
α3,2 = 7.5.
B.6 Tables of mean absolute differences
Experiment            eUQ
λ(k)         ℓ        Mixture     Multivariate   Proposed
0.2 → 0.1    10       3.11e-3     7.80e-5        1.80e-3
0.2 → 0.1    20       5.75e-5     4.02e-6        4.74e-5
0.2 → 0.1    30       7.57e-7     3.92e-7        7.35e-7
0.2 → 0.1    40       9.25e-12    NaN            3.25e-10
0.2 → 0.1    60       5.20e-17    NaN            8.45e-15
0.2 → 0.1    80       2.41e-22    NaN            1.92e-19
0.2 → 0.1    100      2.49e-28    NaN            4.72e-25
0.3 → 0.1    10       1.48e-2     4.50e-4        2.10e-3
0.3 → 0.1    20       4.67e-3     1.23e-4        1.80e-3
0.3 → 0.1    30       7.03e-4     3.90e-5        4.38e-4
0.3 → 0.1    40       8.02e-5     NaN            6.07e-5
0.3 → 0.1    60       3.11e-7     NaN            2.91e-7
0.3 → 0.1    80       1.82e-13    NaN            3.59e-10
0.3 → 0.1    100      3.08e-17    NaN            8.60e-14
0.3 → 0.2    10       1.87e-2     4.44e-4        4.30e-3
0.3 → 0.2    20       5.19e-3     1.29e-4        2.22e-3
0.3 → 0.2    30       8.36e-4     3.79e-5        5.49e-4
0.3 → 0.2    40       1.00e-4     NaN            7.81e-5
0.3 → 0.2    60       6.59e-7     NaN            2.91e-7
0.3 → 0.2    80       2.04e-13    NaN            4.59e-10
0.3 → 0.2    100      6.07e-17    NaN            1.23e-13
Table B.1: Mean absolute difference eUQ of P(UQ(k) = ℓ). The value NaN denotes cases where the evaluation of the multivariate model exceeded the limit of 40 hours. (Starting empty with time-varying demand)
Experiment            eDQ
λ(k)         ℓ        Mixture     Multivariate   Proposed
0.2 → 0.1    10       5.00e-3     2.93e-3        4.60e-3
0.2 → 0.1    20       2.76e-3     3.00e-3        4.35e-3
0.2 → 0.1    30       2.25e-3     3.02e-3        4.67e-3
0.2 → 0.1    40       2.51e-3     NaN            5.12e-3
0.2 → 0.1    60       2.63e-3     NaN            5.51e-3
0.2 → 0.1    80       2.68e-3     NaN            5.82e-3
0.2 → 0.1    100      2.74e-3     NaN            6.16e-3
0.3 → 0.1    10       1.52e-2     0.44e-2        0.69e-2
0.3 → 0.1    20       0.73e-2     0.46e-2        1.54e-2
0.3 → 0.1    30       0.40e-2     0.47e-2        1.82e-2
0.3 → 0.1    40       0.33e-2     NaN            1.88e-2
0.3 → 0.1    60       0.40e-2     NaN            1.93e-2
0.3 → 0.1    80       0.42e-2     NaN            1.97e-2
0.3 → 0.1    100      0.42e-2     NaN            2.00e-2
0.3 → 0.2    10       1.62e-2     0.35e-2        1.27e-2
0.3 → 0.2    20       0.69e-2     0.35e-2        1.36e-2
0.3 → 0.2    30       0.36e-2     0.36e-2        1.33e-2
0.3 → 0.2    40       0.27e-2     NaN            1.38e-2
0.3 → 0.2    60       0.32e-2     NaN            1.43e-2
0.3 → 0.2    80       0.34e-2     NaN            1.46e-2
0.3 → 0.2    100      0.34e-2     NaN            1.49e-2
Table B.2: Mean absolute difference eDQ of P(DQ(k) = 0). The value NaN denotes cases where the evaluation of the multivariate model exceeded the limit of 40 hours. (Starting empty with time-varying demand)
Bibliography
Amaran, S., Sahinidis, N. V., Sharda, B. and Bury, S. J. (2016). Simulation optimization: a review of algorithms and applications, Annals of Operations Research 240(1): 351–380.
Andradóttir, S. (2006). An overview of simulation optimization via random search, Handbooks in operations research and management science 13: 617–631.
Angün, E., Kleijnen, J., den Hertog, D. and Gürkan, G. (2009). Response surface methodology with stochastic constraints for expensive simulation, Journal of the operational research society 60(6): 735–746.
Bhatnagar, S., Hemachandra, N. and Mishra, V. K. (2011). Stochastic approximation algorithms for constrained optimization via simulation, ACM Transactions on Modeling and Computer Simulation (TOMACS) 21(3): 15.
Bhosekar, A. and Ierapetritou, M. (2018). Advances in surrogate based modeling, feasibility analysis, and optimization: A review, Computers & Chemical Engineering 108: 250–267.
Boel, R. and Mihaylova, L. (2006). A compositional stochastic model for real time freeway traffic simulation, Transportation Research Part B 40: 319–334.
Breiman, L., Friedman, J., Olshen, R. and Stone, C. (1984). Classification and regression trees, Wadsworth International Group 37(15): 237–251.
Calvert, S., Taale, H., Snelder, M. and Hoogendoorn, S. (2012). Probability in traffic: a challenge for modelling, 4th International Symposium on Dynamic Traffic Assignment (DTA), Massachusetts, USA.
Chen, X., Li, L. and Shi, Q. (2015). Stochastic Evolutions of Dynamic Traffic Flow, Springer, Berlin Heidelberg.
Chong, L. and Osorio, C. (2017). A simulation-based optimization algorithm for dynamic large-scale urban transportation problems, Transportation Science 52(3): 637–656.
Daganzo, C. (2005). A variational formulation of kinematic waves: basic theory and complex boundary conditions, Transportation Research Part B 39(2): 187–196.
Daganzo, C. F. (1994). The cell transmission model: A dynamic representation of highway traffic consistent with the hydrodynamic theory, Transportation Research Part B 28(4): 269–287.
Davis, J. L., Massey, W. A. and Whitt, W. (1995). Sensitivity to the service-time distribution in the nonstationary Erlang loss model, Management Science 41(6): 1107–1116.
Deng, W., Lei, H. and Zhou, X. (2013). Traffic state estimation and uncertainty quantification based on heterogeneous data sources: a three detector approach, Transportation Research Part B 57: 132–157.
Dumont, A. G. and Bert, E. (2006). Simulation de l’agglomération Lausannoise SIMULO, Laboratoire des voies de circulation, ENAC, Ecole Polytechnique Fédérale de Lausanne. Available at: http://web.mit.edu/osorioc/www/papers/dumont06BertRapport.pdf
Dunn, J. W. (2018). Optimal trees for prediction and prescription, PhD thesis, Massachusetts Institute of Technology.
Endres, D. M. and Schindelin, J. E. (2003). A new metric for probability distributions, IEEE Transactions on Information Theory 49(7): 1858–1860.
Erlang, A. K. (1917). Solution of some problems in the theory of probabilities of significance in automatic telephone exchanges, Post Office Electrical Engineer’s Journal 10(1917-1918): 189–197.
Fields, E., Osorio, C. and Zhou, T. (2017). A data-driven car sharing simulator for inferring latent demand, Technical report, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA. Available at: http://web.mit.edu/osorioc/www/papers/fields17Sim.pdf.
Flötteröd, G. and Osorio, C. (2014). Stochastic analytic dynamic queueing network model with spillback, Proceedings of the International Symposium of Dynamic Traffic Assignment (DTA). Available at: http://web.mit.edu/osorioc/www/papers/floOso13Nwks.pdf.
Flötteröd, G. and Osorio, C. (2017). Stochastic network link transmission model, Transportation Research Part B 102: 180–209.
Fu, M. C. (2002). Optimization for simulation: Theory vs. practice, INFORMS Journal on Computing 14(3): 192–215.
Gazis, D. C., Herman, R. and Rothery, R. W. (1961). Nonlinear follow-the-leader models of traffic flow, Operations research 9(4): 545–567.
Google Maps (2017). 23 Zipcar stations in Boston South End neighborhood, https://drive.google.com/open?id=1hOvbRIjfZJjF5L3OfoAThiq0_p8&usp=sharing. Accessed: 2017-09-22.
Gross, D. (2008). Fundamentals of queueing theory, John Wiley & Sons, New York, U.S., chapter 2, pp. 49–103.
Heidemann, D. (1991). Queue length and waiting time distributions at priority intersections, Transportation Research Part B 25(4): 163–174.
Heidemann, D. (1994). Queue length and delay distributions at traffic signals, Transportation Research Part B 28(5): 377–389.
Heidemann, D. (2001). A queueing theory model of nonstationary traffic flow, Transportation Science 35(4): 405–412.
Heidemann, D. and Wegmann, H. (1997). Queueing at unsignalized intersections, Transportation Research Part B 31(3): 239–263.
Helbing, D. (1997). Modeling multi-lane traffic flow with queuing effects, Physica A: Statistical Mechanics and its Applications 242(1-2): 175–194.
Himpe, W., Corthout, R. and Tampère, M. C. (2016). An efficient iterative link transmission model, Transportation Research Part B 92: 170–190.
Hong, L. J. and Nelson, B. L. (2009). A brief introduction to optimization via simulation, Proceedings of the 2009 Winter Simulation Conference (WSC), IEEE, pp. 75–85.
Hoogendoorn, S. P. and Bovy, P. H. (2001). Generic gas-kinetic traffic systems modeling with applications to vehicular traffic flow, Transportation Research Part B 35(4): 317–336.
Jabari, S. E. (2012). A stochastic model of macroscopic traffic flow: Theoretical foundations, PhD thesis, University of Minnesota.
Jabari, S. E. and Liu, H. X. (2012). A stochastic model of traffic flow: Theoretical foundations, Transportation Research Part B 46(1): 156–174.
Jabari, S. E. and Liu, H. X. (2013). A stochastic model of traffic flow: Gaussian approximation and estimation, Transportation Research Part B 47: 15–41.
Jabari, S. E., Zheng, J. and Liu, H. X. (2014a). A probabilistic stationary speed–density relation based on Newell’s simplified car-following model, Transportation Research Part B 68: 205–223.
Jabari, S. E., Zheng, J. and Liu, H. X. (2014b). A probabilistic stationary speed-density relation based on Newell’s simplified car-following model, Transportation Research Part B 68: 205–223.
Jagerman, D. (1975). Nonstationary blocking in telephone traffic, Bell Labs Technical Journal 54(3): 625–661.
Kerner, B. S. and Rehborn, H. (1996). Experimental features and characteristics of traffic jams, Physical Review E 53: R1297–R1300.
Khinchin, A. Y. (1962). Erlang’s formulas in the theory of mass service, Theory of Probability & Its Applications 7(3): 320–325.
Kim, S.-H. and Nelson, B. L. (2006). Selecting the best system, Handbooks in operations research and management science 13: 501–534.
Kingman, J. (1963). Poisson counts for random sequences of events, The Annals of Mathematical Statistics 34(4): 1217–1232.
Kleinrock, L. (1975). Queueing Systems Volume 1: Theory, Wiley-Interscience, New York, NY, USA, chapter 2, p. 77.
Kullback, S. and Leibler, R. A. (1951). On information and sufficiency, The annals of mathematical statistics 22(1): 79–86.
L. Salemi, P., Song, E., Nelson, B. L. and Staum, J. (2019). Gaussian Markov random fields for discrete optimization via simulation: Framework and algorithms, Operations Research 67(1): 250–266.
Lam, W. H., Shao, H. and Sumalee, A. (2008). Modeling impacts of adverse weather conditions on a road network with uncertainties in demand and supply, Transportation research part B: methodological 42(10): 890–910.
Larson, R. C. and Odoni, A. R. (1981). Urban Operations Research, Prentice-Hall, Inc., Englewood Cliffs, New Jersey, USA.
Laval, J. A. and Castrillón, F. (2015). Stochastic approximations for the macroscopic fundamental diagram of urban networks, Transportation Research Procedia, Papers selected for the International Symposium of Transportation and Traffic Theory (ISTTT), Vol. 7, pp. 615–630.
Laval, J. A. and Chilukuri, B. R. (2014). The distribution of congestion on a class of stochastic kinematic wave models, Transportation Science 48(2): 217–224.
Lighthill, M. and Whitham, G. (1955). On kinematic waves. I: Flood movement in long rivers, II: a theory of traffic flow on long crowded roads, Proceedings of the Royal Society of London A: Mathematical, Physical and Engineering Sciences, Vol. 229, The Royal Society, pp. 281–345.
Lighthill, M. and Witham, J. (1955). On kinematic waves II. a theory of traffic flow on long crowded roads, Proceedings of the Royal Society A 229: 317–345.
Lu, J. and Osorio, C. (2018). A probabilistic traffic-theoretic network loading model suitable for large-scale network analysis, Transportation Science 52(6): 1509–1530.
MATLAB (2016). Optimization Toolbox: User’s Guide (R2016a), The Mathworks, Inc., Natick, Massachusetts.
Morse, P. (1958). Queues, inventories and maintenance; the analysis of operational systems with variable demand and supply, Wiley, New York, USA, chapter 6, pp. 59–67.
Nelson, P. (1995). A kinetic model of vehicular traffic and its associated bimodal equilibrium solutions, Transport Theory and Statistical Physics 24(1-3): 383–409.
Newell, C. (1982). Applications of queueing theory, Chapman and Hall, New York, USA, chapter 3, pp. 143–175.
Newell, G. (1993). A simplified theory of kinematic waves in highway traffic, part I: general theory, Transportation Research Part B 27(4): 281–287.
Newell, G. F. (1961). Nonlinear effects in the dynamics of car following, Operations Research 9(2): 209–229.
Newell, G. F. (2002). A simplified car-following theory: a lower order model, Transportation Research Part B 36(3): 195–205.
Norkin, V. I., Pflug, G. C. and Ruszczynski, A. (1998). A branch and bound method for stochastic global optimization, Mathematical programming 83(1-3): 425–450.
Odoni, A. R. and Roth, E. (1983). An empirical investigation of the transient behavior of stationary queueing systems, Operations Research 31(3): 432–455.
Olszewski, P. S. (1994). Modeling probability distribution of delay at signalized intersections, Journal of Advanced Transportation 28(3): 253–274.
Orosz, G., Wilson, R. E., Szalai, R. and Stépán, G. (2009). Exciting traffic jams: nonlinear phenomena behind traffic jam formation on highways, Physical Review E 80(4): 046205.
Osorio, C. (2010). Mitigating network congestion: analytical models, optimization methods and their applications, PhD thesis, Ecole Polytechnique Fédérale de Lausanne.
Osorio, C., Chen, X., Gao, J., Talas, M. and Marsico, M. (2015). On the control of highly congested urban networks with intricate traffic patterns: a New York City case study, Technical report, Department of Civil and Environmental Engineering, Massachusetts Institute of Technology (MIT). Available at: http://web.mit.edu/osorioc/www/papers/osoChenNYCDOTOfflineSO.pdf.
Osorio, C. and Chong, L. (2015). A computationally efficient simulation-based optimization algorithm for large-scale urban transportation problems, Transportation Science 49(3): 623–636.
Osorio, C. and Flötteröd, G. (2015). Capturing dependency among link boundaries in a stochastic dynamic network loading model, Transportation Science 49(2): 420–431.
Osorio, C., Flötteröd, G. and Bierlaire, M. (2011). Dynamic network loading: a stochastic differentiable model that derives link state distributions, Transportation Research Part B 45(9): 1410–1423.
Osorio, C. and Wang, C. (2017). On the analytical approximation of joint aggregate queue-length distributions for traffic networks: a stationary finite capacity Markovian network approach, Transportation Research Part B 95: 305–339.
Osorio, C. and Yamani, J. (2017). Analytical and scalable analysis of transient tandem Markovian finite capacity queueing networks, Transportation Science 51(3): 823–840.
Paveri-Fontana, S. (1975). On boltzmann-like treatments for traffic flow: a critical reviewof the basic model and an alternative proposal for dilute traffic analysis, TransportationResearch 9(4): 225–235.
Payne, H. J. (1971). Model of freeway traffic and control, Mathematical Model of PublicSystem: Simulation Council Proceedings Series 1: 51–61.
Prigogine, I. and Andrews, F. C. (1960). A Boltzmann-like approach for traffic flow, Op-erations Research 8(6): 789–797.
Ramezani, M., Haddad, J. and Geroliminis, N. (2015). Dynamics of heterogeneity in urbannetworks: aggregated traffic modeling and hierarchical control, Transportation ResearchPart B 74: 1–19.
Regis, R. G. and Shoemaker, C. A. (2013). Combining radial basis function surrogatesand dynamic coordinate search in high-dimensional expensive black-box optimization,Engineering Optimization 45(5): 529–555.
Reibman, A. (1991). A splitting technique for Markov chain transient solution, in W. J.Stewart (ed.), Numerical solution of Markov chains, Marcel Dekker, Inc, New York,USA, chapter 19, pp. 373–400.
Richards, P. I. (1956a). Shock waves on highways, Operations Research 4(1): 42–51.
Richards, P. I. (1956b). Shock waves on the highway, Operations Research 4(1): 42–51.
Robbins, H. and Monro, S. (1951). A stochastic approximation method, The annals ofmathematical statistics pp. 400–407.
Ross, P. (1988). Traffic dynamics, Transportation Research Part B 22(6): 421–435.
Roth, E. (1994). The relaxation time heuristic for the initial transient problem in M/M/k queueing systems, European Journal of Operational Research 72(2): 376–386.
Sayegh, A. S., Connors, R. D. and Tate, J. E. (2017). Uncertainty propagation from the cell transmission traffic flow model to emission predictions: a data-driven approach, Transportation Science 52(6): 1327–1346.
Shi, L. and Ólafsson, S. (2000). Nested partitions method for global optimization, Operations Research 48(3): 390–407.
Stafford, R. (2006). Random vectors with fixed sum. Accessed June 1, 2015. URL: Http://www.mathworks.com/matlabcentral/fileexchange/9700
Sumalee, A., Zhong, R. X., Pan, T. L. and Szeto, W. Y. (2011). Stochastic Cell Transmission Model (SCTM): a stochastic dynamic traffic model for traffic state surveillance and assignment, Transportation Research Part B 45(3): 507–533.
Sztrik, J. (2012). Basic queueing theory, University of Debrecen, Debrecen, Hungary, chapter 2.4, pp. 32–37. Accessed July 20, 2018. URL: https://pdfs.semanticscholar.org/848f/a1f48ad9d3edb24b05667f15cfc633eb8f69.pdf
Tampère, C., Corthout, R., Cattrysse, D. and Immers, L. (2011). A generic class of first-order node models for dynamic macroscopic simulations of traffic flows, Transportation Research Part B 45(1): 289–309.
Trafficware (2011). Synchro Studio 8 User Guide, Trafficware, Sugar Land, TX.
Transport for London (2010). Traffic modelling guidelines, version 3.0, Technical report, Transport for London (TfL).
Tsai, S. C. and Fu, S. Y. (2014). Genetic-algorithm-based simulation optimization considering a single stochastic constraint, European Journal of Operational Research 236(1): 113–125.
TSS (2014). AIMSUN 8.1 Microsimulator Users Manual, Transport Simulation System.
U.S. Department of Transportation (2008). Transportation vision for 2030, Technical report, U.S. Department of Transportation (DOT), Research and Innovative Technology Administration.
van Doorn, E. A. and Zeifman, A. I. (2009). On the speed of convergence to stationarity of the Erlang loss system, Queueing Systems 63(1-4): 241.
van Zuylen, H. J. and Viti, F. (2003). Uncertainty and the dynamics of queues at controlled intersections, International Federation of Automatic Control (IFAC) Proceedings 36(14): 43–48.
Viti, F. and Van Zuylen, H. J. (2010). Probabilistic models for queues at fixed control signals, Transportation Research Part B 44(1): 120–135.
Wang, Z. and Ierapetritou, M. (2018). Constrained optimization of black-box stochastic systems using a novel feasibility enhanced kriging-based method, Computers & Chemical Engineering 118: 210–223.
Xu, J., Nelson, B. L. and Hong, J. (2010). Industrial strength compass: A comprehensive algorithm and software for optimization via simulation, ACM Transactions on Modeling and Computer Simulation (TOMACS) 20(1): 3.
Xu, J., Nelson, B. L. and Hong, L. J. (2013). An adaptive hyperbox algorithm for high-dimensional discrete optimization via simulation problems, INFORMS Journal on Computing 25(1): 133–146.
Xu, W. L. and Nelson, B. L. (2013). Empirical stochastic branch-and-bound for optimization via simulation, IIE Transactions 45(7): 685–698.
Yperman, I., Tampere, C. and Immers, B. (2007). A kinematic wave dynamic network loading model including intersection delays, Transportation Research Board 86th Annual Meeting, Washington DC, USA.
Zheng, F., Jabari, S. E., Liu, H. X. and Lin, D. (2018). Traffic state estimation using stochastic Lagrangian dynamics, Transportation Research Part B 115: 143–165.
Zhou, T., Fields, E. and Osorio, C. (2019). Large-scale data-driven simulation-based car-sharing network design, Submitted to Transportation Research Part B. Available at: http://web.mit.edu/osorioc/www/papers/zhoOsoFieCarSharing.pdf