Transportation Analytics and Last-mile Same-day Delivery with Local Store Fulfillment
by
Ming Ni

January 5, 2018
A Dissertation submitted to the Faculty of the Graduate School of
the University at Buffalo, State University of New York
in partial fulfillment of the requirements for the
degree of
Doctor of Philosophy
Department of Industrial and Systems Engineering
Copyright by
Ming Ni
2018
Doctoral Committee:
Qing He
Stephen Still Assistant Professor of Civil, Structural and Environmental Engineering and of Industrial and Systems Engineering
Advisor and Chair of Committee

Mark H. Karwan
Praxair Professor of Operations Research, Industrial and Systems Engineering
SUNY Distinguished Teaching Professor
Committee Member

Jose L. Walteros
Morton C. Frank Assistant Professor of Industrial and Systems Engineering
Committee Member

Jing Gao
Assistant Professor of Computer Science and Engineering
Committee Member
ABSTRACT
Social media and online retailing have recently emerged, become increasingly important, and continue to grow. More and more people use social media to share their real lives with the digital world, while at the same time browsing the virtual Internet to buy real products. A huge amount of data is generated in the process. We investigate this data and crowdsourcing in the areas of public transportation and last-mile delivery for online orders, from the perspectives of data analytics and operations optimization.
We first focus on transit flow prediction with crowdsourced social media data. Subway flow prediction under event occurrences is a very challenging task in transit system management. To tackle this challenge, we leverage the power of social media by extracting features from crowdsourced content that gauge the public's willingness to travel. We propose a parametric, convex optimization-based approach that combines the least squares of linear regression with the prediction results of the seasonal autoregressive integrated moving average (SARIMA) model to accurately predict NYC subway flow under sporting events.
The second part of the thesis focuses on the last-mile same-day delivery with store fulfillment problem (SDD-SFP), using real-world data from a national retailer. We propose that retailers can take advantage of their physical local stores to fulfill nearby online orders in a direct-to-consumer fashion on the same day the order is placed. Optimization models and solution algorithms are developed to determine store selection, transportation fleet sizing, and inventory for seasonal supply chain planning. To solve large-scale SDD-SFP instances with real-world datasets, we create an accelerated Benders decomposition approach that integrates the outer mixed-integer programming search tree with local branching, along with optimization-based algorithms for initial lifting constraints.
In the last part of the dissertation, we drill down from the supply chain planning level of SDD-SFP to the supply chain operations level. The aim is to create an exact optimal order fulfillment plan that specifies how to deliver each received customer order. We adopt crowdsourced shipping, which utilizes the spare vehicle capacity of private drivers who execute delivery jobs along their trips, as a delivery option, and define the problem as same-day delivery with crowdshipping and store fulfillment (SDD-CSF). We develop a set of exact solution approaches for order fulfillment in the form of a rolling horizon framework, which repeatedly solves a series of order assignment and delivery planning problems along the timeline in order to construct an optimal fulfillment plan from local stores. Results from numerical experiments derived from real sales data of a retailer, along with algorithmic computational results, are presented.
TABLE OF CONTENTS
Page
LIST OF TABLES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
LIST OF FIGURES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
CHAPTER
1 Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
2 Forecasting the Subway Passenger Flow under Event Occurrences with
Social Media . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.2 Literature Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.3 Dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.4 Hashtag-based Event Identification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.5 Events Characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.6 Prediction Modeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
2.7 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3 Same-Day Delivery Planning with Store Fulfillment . . . . . . . . . . . . . . . . . . . . 28
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
3.2 Literature Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.3 Problem Description and Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.3.1 Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.3.2 SDDSFP Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.3.3 SDDSFP Benders Reformulation . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.3.4 Algorithmic Enhancement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
3.4 Computational results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
3.5 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
4 Crowdsourcing Delivery with Store Fulfillment . . . . . . . . . . . . . . . . . . . . . . . . . . 64
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
4.2 Related Works . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
4.3 Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
4.3.1 Rolling Horizon . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
4.3.2 The Cost Function for Received Orders . . . . . . . . . . . . . . . . . . . . 72
4.3.3 The Cost Function with Forecast Orders and Feedback Control 85
4.4 Case Study . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
4.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
5 Conclusion and Future Research Directions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
5.1 Summary of Findings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
5.2 Future Research Directions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
REFERENCES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
APPENDIX
LIST OF TABLES
Table Page
2.1 Sample Tweets Before Events . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.2 Sample Events and Their Top Hashtags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
3.1 Summary of data sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
3.2 Solving time (Minute) comparison of different solution methods . . . . . . . 57
3.3 The gaps between MIP and different solution methods . . . . . . . . . . . . . . . . 57
3.4 The Improvement of store selection by Algorithm 2 . . . . . . . . . . . . . . . . . . . 57
3.5 Summary of data set of Instance P4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
3.6 Metric comparison of different solution methods for Instance P4 . . . . . . . 59
4.1 The Orders and Stores Inputs for Two Levels of Customer Demand . . . . 99
4.2 The solving time (seconds) of the instances for different levels of cus-
tomer demand . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
LIST OF FIGURES
Figure Page
2.1 System architecture for passenger flow prediction from social media . . . . 7
2.2 Geographic distribution of tweets two hours before the events . . . . . . . . . 11
2.3 Comparisons of passenger flow and number of tweets . . . . . . . . . . . . . . . . . 13
2.4 Average event/nonevent daily passenger flow at Mets-Willets Point sta-
tion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.5 The correlation between tweet rates and passenger flow under events . . 18
2.6 Average passenger flow vs. average tweet rates at Citi Field Station . . 20
2.7 Performance metrics of the prediction models . . . . . . . . . . . . . . . . . . . . . . . . 24
2.8 The distributions of test errors to compare the SVR and OPL . . . . . . . . . 24
2.9 Improvement from ensemble learning from the OPL and SVR . . . . . . . . . 25
3.1 Parallel search trees between the master problem of Benders and orig-
inal MIP problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
3.2 The solution strategy of parallel search tree . . . . . . . . . . . . . . . . . . . . . . . . . . 54
3.3 The solution methodology framework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
3.4 Sensitivity analysis of the store order processing capacity . . . . . . . . . . . . . 60
3.5 Sensitivity analysis of the daily operation cost of own fleet truck . . . . . . 61
3.6 Sensitivity analysis of the package shipping cost by third party carrier . 62
4.1 The hourly summary of order fulfillment plans by the four models . . . . . 96
4.2 The order fulfillment plans and associated costs of the instance P7 by
the four models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
4.3 The solving results of the instances for different levels of customer
demand . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
4.4 Cases of forecast orders gzt and solving results from SDD-CSF models
with feedback control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
Chapter 1
INTRODUCTION
This thesis investigates crowdsourcing for transportation and logistics from different perspectives, including the exploration of crowdsourced social media content for public transportation flow prediction, and crowdsourced shipping with store fulfillment for last-mile same-day delivery. The thesis is organized as follows.
Chapter 2 covers subway flow prediction using crowdsourced social media content. Social media is a rich source of user-generated content. Public attention, opinions, and hot topics can be captured from social media, which provides the ability to predict human-related events. Since social media can be retrieved in real time with relatively small setup and maintenance costs, transportation authorities can treat social media data as another type of sensor for traffic or transit demand.
One challenge is how to extract reliable traffic-related features from big and noisy social media data. The other is how to find a feasible traffic study that fits well with social media data. We aim to use social media information to assist transit flow prediction under special event conditions. Specifically, a short-term subway flow prediction model incorporating crowdsourced features is developed to forecast the incoming transportation flow prior to big events. We propose a prediction model based on convex optimization that combines the least squares of linear regression and the prediction results of SARIMA in the same objective function to accurately predict subway passenger flow.
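The combination can be sketched as a single least-squares fit over a design matrix that stacks the SARIMA baseline forecast and a social-media feature. The following is a minimal pure-Python illustration only; the synthetic data, the feature names, and the plain normal-equations solver are hypothetical stand-ins for the actual formulation developed in Chapter 2:

```python
def solve_linear_system(A, b):
    """Gaussian elimination with partial pivoting (tiny stand-in for a solver)."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def fit_combined(flows, sarima_forecast, tweet_rate):
    """Least squares over [SARIMA forecast, tweet rate, intercept]:
    one convex objective combining both information sources."""
    X = [[s, t, 1.0] for s, t in zip(sarima_forecast, tweet_rate)]
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(3)] for i in range(3)]
    Xty = [sum(r[i] * y for r, y in zip(X, flows)) for i in range(3)]
    return solve_linear_system(XtX, Xty)

def predict(beta, sarima_value, tweet_value):
    return beta[0] * sarima_value + beta[1] * tweet_value + beta[2]
```

On data generated exactly from a linear relation, the fit recovers the coefficients; in practice the SARIMA term carries the recurrent pattern and the tweet-rate term corrects for event surges.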
Chapter 3 introduces a new problem of same-day delivery planning with store fulfillment to capture the current trend of same-day delivery in the omni-channel supply chain for brick-and-mortar retailers.
Electronic retailing has experienced significant growth over the past decade [1]. In the U.S. alone, electronic retail sales doubled over a span of only five years, from $170 billion in 2010 to a staggering $340 billion in 2015 [2]. In 2016, a report by the U.S. Census Bureau situated electronic commerce (e-commerce) at 8.1% of total retail sales nationwide, a notable increase from the 7.3% observed the year before [3]. Furthermore, according to the National Retail Association (NRA), the growth of e-commerce is projected to maintain a steady pace of about 8% to 12% in the forthcoming years [4, 5].
Perhaps the most interesting aspect of these recent trends is that the growth of e-commerce has significantly outpaced the overall growth of retail sales (currently about 3%). Evidence suggests that customers are rapidly gravitating towards the convenience of online shopping, in part because of the increasing number of electronic channels (e-channels) and services offered by online retailers [6]. Indeed, with the emergence of e-commerce giant Amazon and other online platforms like eBay, traditional brick-and-mortar retailers have been forced to pursue an electronic presence to avoid substantial drops in sales [7], which has resulted in a notable increase in the shopping alternatives available to most customers. Successful examples of this phenomenon are traditional retailers like Walmart and Best Buy, who have been able to consolidate profitable online stores, in contrast to once-major players like Circuit City, Radio Shack, and Borders, who failed to transition to this new e-commerce era.
The strong competition for online sales between traditional and online-exclusive
retailers has naturally stimulated the development of novel strategies and services to
attract new customers by promoting convenience and product accessibility [6]. One
notable example is the recent emergence of same-day delivery (SDD) options for some
types of products, which allows customers to have desired items delivered to their
doors only a few hours after the purchase. Online retailers with highly flexible supply chains, like Amazon, have been able to significantly reduce the delivery time of most of their products through their massive fulfillment centers. In contrast, traditional retailers, whose supply chains are largely designed to support the fulfillment of their physical stores, must resort to different alternatives to cope with the operational requirements of implementing SDD.
Based on this fact, we argue that traditional retailers can take advantage of their physical local stores to fulfill nearby online orders in a direct-to-consumer fashion on the same day the order is placed. To construct the SDD network with a robust logistics plan from scratch, we develop optimization models and solution algorithms for store location, transportation channel selection, and inventory management. The problem is named the same-day delivery with store fulfillment problem (SDD-SFP). The proposed solution methodology framework includes Benders decomposition, a store selection algorithm, cut strengthening methods, and parallel search trees.
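A classic Benders loop can be sketched on a toy store-selection instance: the master picks which stores to enable, the subproblem routes a fixed demand through the enabled stores, and the subproblem's dual values generate optimality cuts. This sketch is a hedged illustration only: the instance numbers are hypothetical, and the brute-force master below is a stand-in for the actual MIP search tree with local branching and lifting constraints described in Chapter 3.

```python
import itertools

# Toy instance (all numbers hypothetical): three candidate stores.
FIXED  = [10.0, 14.0, 25.0]   # cost of enabling SDD at store j
UNIT   = [4.0, 2.0, 1.0]      # per-package fulfillment cost at store j
CAP    = [5, 5, 5]            # packages store j can process
DEMAND = 8
STORES = range(3)

def subproblem(y):
    """Serve DEMAND from enabled stores, cheapest first (the LP optimum),
    and return the cost plus dual multipliers for a Benders optimality cut."""
    left, cost, marginal = DEMAND, 0.0, 0.0
    for j in sorted((j for j in STORES if y[j]), key=lambda j: UNIT[j]):
        take = min(left, CAP[j])
        cost += take * UNIT[j]
        if take > 0:
            marginal = UNIT[j]        # dual of the demand constraint
        left -= take
    mu = [max(0.0, marginal - UNIT[j]) for j in STORES]  # capacity-row duals
    return cost, (marginal, mu)

def benders():
    """Benders loop: master proposes stores, subproblem prices the proposal
    and returns a cut; stop when the lower and upper bounds meet."""
    cuts, ub, best = [], float("inf"), None
    while True:
        lb, y_star = float("inf"), None
        for y in itertools.product([0, 1], repeat=3):     # brute-force master
            if sum(CAP[j] * y[j] for j in STORES) < DEMAND:
                continue              # feasibility: capacity must cover demand
            theta = max([0.0] + [pi * DEMAND -
                                 sum(m[j] * CAP[j] * y[j] for j in STORES)
                                 for pi, m in cuts])
            val = sum(FIXED[j] * y[j] for j in STORES) + theta
            if val < lb:
                lb, y_star = val, y
        cost, cut = subproblem(y_star)
        total = sum(FIXED[j] * y_star[j] for j in STORES) + cost
        if total < ub:
            ub, best = total, y_star
        if ub - lb < 1e-9:            # bounds meet: provably optimal
            return best, ub
        cuts.append(cut)
```

On this instance the loop converges in two iterations, enabling stores 1 and 2; the acceleration devices of Chapter 3 address the fact that real instances cannot enumerate the master like this.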
Chapter 4 focuses on the daily operations of same-day delivery with crowdshipping and store fulfillment (SDD-CSF). We drill down from the supply chain planning level of SDD-SFP to the supply chain operations level. This chapter aims to close the last-mile gap between stores and customers. SDD-CSF makes an order fulfillment plan from two aspects: the order sourcing decision and the delivery method selection. The sourcing decision considers both currently received orders and future forecast demand, in order to minimize not only the immediate fulfillment cost but also the expected future cost.

We adopt the new concept of crowdsourced shipping, which utilizes the spare vehicle capacity of private drivers who execute delivery jobs along their trips. The delivery methods therefore include self-operated or carrier-operated truck fleets, as well as store walk-in customers who are willing to deliver packages for others.
Chapter 4 develops a set of exact solution approaches for order fulfillment in the form of a rolling horizon framework. The original dynamic programming problem for currently received orders is mathematically approximated by a mixed-integer linear programming model. The model considers both currently received orders and predicted future demand to make order assignment decisions that minimize the immediate delivery cost plus the resulting expected future cost. The framework repeatedly solves the model along the timeline to construct an optimal fulfillment plan from local stores. With the help of the rolling horizon structure, we also introduce a feedback control system to cope with inaccurate forecasts of future demand.
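The rolling horizon loop above can be sketched as follows. This is a hedged toy, not the chapter's MILP: the option names, the per-package costs, the per-epoch capacities, and the greedy cheapest-option rule are all hypothetical stand-ins, and the feedback step is reduced to a simple proportional correction of the forecast.

```python
# Hypothetical per-package delivery cost for each delivery method.
COST = {"crowd": 5.0, "truck": 8.0, "carrier": 12.0}

def plan_epoch(n_orders, capacity):
    """Assign each order arriving in this epoch to the cheapest option
    that still has capacity (carrier capacity is effectively unlimited)."""
    plan = []
    for _ in range(n_orders):
        option = min((o for o in COST if capacity[o] > 0),
                     key=lambda o: COST[o])
        capacity[option] -= 1
        plan.append(option)
    return plan

def rolling_horizon(arrivals, capacities):
    """Solve one epoch at a time along the timeline, stitching the
    per-epoch plans into a full fulfillment plan."""
    return [plan_epoch(n, dict(cap)) for n, cap in zip(arrivals, capacities)]

def corrected_forecast(raw_forecast, last_actual, last_forecast, gain=0.5):
    """Feedback control: nudge the next forecast by a fraction of the
    most recent forecast error."""
    return raw_forecast + gain * (last_actual - last_forecast)
```

Each epoch is solved with only the information available at that time, which is exactly what makes the feedback correction useful when the demand forecast drifts.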
Finally, we summarize our results and discuss future research opportunities in Chapter 5.
Chapter 2
FORECASTING THE SUBWAY PASSENGER FLOW UNDER EVENT
OCCURRENCES WITH SOCIAL MEDIA
2.1 Introduction
Passenger flow prediction is critical for the planning, management, and operations of public transit systems [8]. The output from the prediction can benefit transit network design, route scheduling, and station crowd regulation operations [9]. The majority of previous studies focus on forecasting day-to-day recurrent passenger flow [10, 11, 12, 13]. However, when it comes to non-recurrent events (e.g., sporting games, concerts, running races), passenger flow prediction becomes a very challenging task because of their irregularity and inconsistency. Very limited methods have been proposed in the literature.
To solve this problem, instead of revising existing methods, we intend to leverage a new kind of data: social media. User-generated content on social media strengthens the linkage and interactions between users while providing a large amount of information. This vast information is able to capture public attention, which is one of the common traits of events.

However, social media data is much more difficult to process than traditional relational data. Several major challenges remain in handling social media data, which is unstructured, noisy, and gigantic, and contains a variety of information. Take Twitter data for example. In 2014 alone, we collected over 29.7 million geo-tagged posts bounded in the New York City area. At the individual post level, a fundamental question of data mining arises: what is the post talking about, and what event
information it contains. Thus, the first challenge (C1), within a transportation context, is how to identify the transportation-related events that each post refers to. Aggregated at the spatial-temporal level, individual geo-tagged posts enable social activity analysis. Transportation authorities can leverage such information to identify hot spots and further indicate passenger flows in the near future for public gatherings. Therefore, the second challenge (C2) is how to develop a method that coordinates social media for forecasting passenger flow, especially under event occurrences.
This chapter aims to address challenges (C1) and (C2). More specifically, under event occurrences, we intend to extract event information from geo-tagged social media data, and to leverage both historical transit data and real-time social media data to forecast future passenger flow at subway stations. The following questions will be investigated: (i) Can social media be used to identify public events in real life? (ii) How can the prediction model be built from the features extracted from social media? To the best of our knowledge, there has been little published research on passenger flow prediction with social media.
This chapter has the following structure. Section 2.2 summarizes related works on recent popular transportation prediction techniques and the uses of social media in transport applications. An overview of the data, including subway passenger flow and social media, is given in Section 2.3. Section 2.4 describes the setup of the event detection approach. Section 2.5 presents a detailed analysis of the relationship between event passenger flow and social media. Section 2.6 presents the technical details of prediction modeling and experiments on real-world datasets. Finally, Section 2.7 provides concluding remarks.
Figure 2.1: System architecture for passenger flow prediction from social media
2.2 Literature Review
There is a vast literature on short-term transportation forecasting [14]. Generally, two groups of approaches have received wide attention, namely parametric and non-parametric techniques.

The common parametric techniques include the autoregressive integrated moving average (ARIMA) model, exponential smoothing [15], and the historical average [16]. In particular, ARIMA has been fully developed for various transportation prediction purposes, including traffic occupancy [17], travel time [18], and traffic flow [19]. Previous research [20, 21] shows that ARIMA performs well for stationary and non-event time series. With the rise of data mining and data science, non-parametric techniques have also been widely adopted recently. Neural networks [22, 23], support vector machines for regression (SVR) [24], and k-nearest neighbors [25] have been used to build traffic volume prediction models for time-series data.
Passenger flow prediction is a subcategory of short-term transportation prediction. Researchers have adopted both kinds of prediction techniques to forecast passenger flow for railways [22, 26, 27], bus stops [28, 29], and subway stations. Specifically for passenger flow prediction at subway stations, there are different prediction levels: whole transit lines [10, 11], one station with passenger transfer flow [12], and one station with entrance and transfer flow [13]. All of them obtained desirable prediction results for typical commuting volumes. However, none of them considers atypical conditions.
Recently, more and more attempts have been made to apply Internet and social media analysis in the domain of transportation [30, 31, 32]. A huge group of people in the online community generates a tremendous amount of content. Chaniotakis and Antoniou [33] proposed a generic methodological framework for collecting and analyzing data from social media. Other researchers took advantage of crowdsourcing these resources to capture incoming non-recurrent events [34], to explain the causes of transport overcrowding [35], to investigate intelligent transportation systems services [36], and to apply deep learning approaches [37, 38]. Studies exploiting this area mainly fall into two applications, traffic detection and traffic prediction, with supervised learning techniques.
In the application of traffic detection, Wanichayapong et al. [39] used syntactic analysis to classify traffic incident information from social media data into spatial categories. Schulz et al. [40] extracted features from part-of-speech tagging and words in Twitter posts and developed classifiers to detect car accident occurrences; they applied spatial and temporal filtering to locate the accidents. Daly [41] built a system called Dublin's Semantic Traffic Annotator and Reasoner that uses natural language processing techniques to analyze social media content in order to capture real-time traffic conditions. Mai and Hranac [42] explored the time and location of related Twitter posts after traffic incidents occurred; they found that for freeway incidents the majority of tweets are posted within 5 hours and 25 miles. Gal-Tzur et al. [43] used Twitter messages sent by transportation authorities to develop classifiers that identify posts related to transportation information; moreover, they presented a keyword-based hierarchical schema to categorize these posts. Chen et al. [44] tried to detect traffic congestion and its location solely from social media data by using topic modeling and hinge-loss Markov random fields. D'Andrea et al. [45] utilized Twitter data and developed a support vector machine model to recognize useful keywords from tweets and detect traffic events in a highway road network. Kumar et al. [46] incorporated social media to detect road hazards through sentiment and language analysis. Most recently, Zhang et al. [47] studied and revealed the characteristics of traffic flow surges near tweet concentrations, defined as clusters of keywords for traffic-related events. Further, Zhang and He proposed analytical models to detect on-site traffic accidents [48] and to decode people's travel behavior with geo-mobility clustering [49].
For traffic prediction, He et al. [50] proposed a long-term traffic prediction models
with social media features for a freeway network in San Francisco Bay area. They
found that there exists a negative correlation between social activity on the web and
traffic activity on the roads. Ni et al. [51] tried to forecast freeway traffic flows under
special event conditions by taking into account information derived from social media.
Lin et al. [52] applied linear regression models for predicting the impact of inclement
weather on freeway speed with the help of social media.
For subway and transit, Collins et al. [53] used sentiment analysis of transit
riders' short messages on social media to measure their satisfaction with transit.
They found that social media posts with sharply increased negative sentiment
indicated transit incidents, such as fires and delays.
The studies above show that there is great potential in using social media to locate
relevant information for transportation applications. However, none of the previous studies
explores the effectiveness of using social media for passenger flow prediction in public
metro transit systems.

Table 2.1: Sample Tweets Before Events

| Type | Start Time | Details | Created at | Text content |
| Baseball game | 2014-05-14 19:10 | Mets vs. Yankees | 2014-05-14 18:22:22 | Checked in CITI field for the yankees vs mets game w yankees mets |
| Tennis games | 2014-08-25 19:00 | US Open 1st round | 2014-08-25 17:49:46 | Im at 2014 usopen tennis championships in flushing ny |
| Baseball + Tennis games | 2014-08-28 19:10 (B), 19:00 (T) | Mets vs. Braves and US Open 2nd round | 2014-08-28 18:29:10 | love this place billy jean king national tennis centre, us open |
2.3 Dataset
This study expands the successful applications of social media data to predicting
passenger volume at a subway station. We focus our study on the Mets-Willets Point
subway station on Line 7 in New York City. The station is selected for two
main reasons. First, Mets-Willets Point is adjacent to not one but two stadiums,
Citi Field and the USTA Billie Jean King National Tennis Center (NTC). Citi Field is
the home stadium of the New York Mets baseball team, and the NTC hosts the annual US
Open grand-slam tennis tournament. Second, sporting events consistently attract public
attention; we observe a substantial volume of social media posts referring to the
events.
We collected the turnstile usage at Mets Willets Point subway station from
Metropolitan Transportation Authority (MTA) [54]. In order to cover various types
of events, the time range is set from April 2014 to October 2014, in which various
events occur nearby.
Figure 2.2: Geographic distribution of tweets two hours before the events. (a) Baseball game; (b) Tennis games; (c) Baseball + Tennis games.
Turnstile devices record passengers passing each turnstile for either entry or exit,
and the counts are reported as four-hour aggregates. In this paper, we aggregate
both entry and exit flows into a total passenger flow, which is of the transit agency's
interest.

We collected Twitter data, known as tweets, as our social media data. A Twitter
message is an online text post by a Twitter user, limited to 140 characters. Tweets were
collected over the same temporal window through the Twitter Streaming API with a
geo-location filter [55]. The spatial bounding box was set to cover only the subway
station and the two stadiums. Because of the location filter, each tweet contains its
geographic coordinates in addition to text content, username, and timestamp. Inside a
post, users can prefix words with a # symbol; such words are called Twitter hashtags. A
hashtag provides a tagging convention that associates tweets with certain topics,
contexts, or events. The aforementioned information from Twitter messages defines a
tweet in this paper.
Fig. 2.2 shows the locations of tweets sent two hours before different types of
events start. As can be seen, tweets were mostly sent from the stadium in which
the event was held. Moreover, different events correspond to different social media
activities and to various levels of public attention. From the social media data
perspective, the characteristics of tweets, such as timestamps, geolocations, text
content, and quantity ratios, lead to such differences. Our objective is to find ways to measure these
differences in social media data and leverage them into prediction models to forecast
subway passenger flow.
2.4 Hashtag-based Event Identification
The events held in the stadiums were well attended. The attendance brings not only
a high volume of passenger flow but also activity on Twitter, as shown in Fig. 2.3.
As one can see, event scenarios generate large spikes of social media activity and
passenger flow at the same time. We assume that the complete schedule of all events is
unknown to transit operators. The Mets-Willets Point subway station serves transit
passengers for two major sporting events: the US Open Tennis Championships and
Major League Baseball home games of the New York Mets. The former was held over a
two-week period in late August and early September, and the latter ran from April to
September 2014. However, after initial examination, we found that other events,
such as concerts and speeches, were held nearby as well. Therefore, we need to
identify the events from social media data.
Instead of detecting the exact topic of the events [56, 57, 58], we examine tweets
within the area and probe whether events involving high social activity exist. To
identify the events correctly, rather than using the complex machinery of
latent-variable topic models (e.g., Latent Dirichlet Allocation [59]),
we employ Twitter hashtags to measure social media activities and provide the
context for them [60].
Figure 2.3: Comparisons of passenger flow and number of tweets. (a) Passenger flow; (b) Number of tweets.
Hashtag extraction is the first step of the proposed event detection algorithm. We
denote by $t$ one of the time intervals, with $t = 1, \ldots, T$, where $T$ is the total number of
four-hour intervals. Let $HL_t = \{H_{t1}, \ldots, H_{tj}, \ldots, H_{tJ_t}\}$ be the list of hashtags during $t$,
where $H_{tj}$ is the $j$th hashtag and $J_t$ is the total number of hashtags labeled by Twitter
users during $t$.

Furthermore, let $M^H \in \mathbb{R}^{T \times S}$ denote the hashtag matrix, where $S$ is the number
of distinct hashtags. Its element $M^H_{t,s}$ corresponds to the occurrence count of the $s$th hashtag in
the $t$th time interval. Since the hashtag matrix covers all $T$ time intervals, $S \geq \max_{t} J_t$.
In the hashtag matrix, the hashtags from all time intervals merge into the
columns, and the various words and phrases depict different aspects of social activity. In
sum, the column names of the hashtag matrix are the hashtags, the rows stand for the
time intervals, and each entry of the matrix is the frequency of a hashtag.
The notation is summarized as follows:

• $t$: the index of time intervals, $t = 1, \ldots, T$.

• $p$: the index of tweets.

• $s$: the index of hashtags.

• $J_t$: the total number of hashtags labeled by Twitter users during $t$.

• $M^H \in \mathbb{R}^{T \times S}$: the hashtag matrix, where $S$ is the total number of hashtags.

• $H_{tj} \in HL_t$: the $j$th hashtag in the list $HL_t$.

• $OC_t$: the occurrence count of each element of $HL_t$ in time interval $t$.

• $TW_{p,s}$: the occurrence of the $s$th hashtag of $HL$ in the $p$th tweet.

The algorithm for event detection by hashtags is given below.
Algorithm 1: Hashtag-based Event Identification
Input:  Tweets within the area
Output: Hashtag matrix $M^H \in \mathbb{R}^{T \times S}$
1  Hashtag extraction:
     $HL_t = \{H_{t1}, \ldots, H_{tj}, \ldots, H_{tJ_t}\}, \; \forall t \in [1, T]$
2  Lexical analysis:
     $HL \equiv \bigcup_{t=1}^{T} HL_t$;
     remove stop words, punctuation, and duplicated strings from $HL$
3  Label all collected tweets by hashtag:
     $TW_{p,s} \equiv$ the occurrence of the $s$th word of $HL$ in the $p$th tweet
4  for each tweet $p = 1, \ldots, P$ do
5      for each word $s$ in $HL$ do
6          append $TW_{p,s}$ as a new column for the $p$th tweet
7      end
8  end
9  Build the hashtag matrix $M^H \in \mathbb{R}^{T \times S}$:
     each row of $M^H$ is the occurrence vector $OC_t \in \mathbb{R}^S$ of the elements of $HL$ in interval $t$
10 for $t = 1, \ldots, T$ do
11     $OC_{t,s} = \sum_{p \in t} TW_{p,s}, \; \forall s$
12     $M^H_t = OC_t$
13 end
14 Peak detection:
15 for $t = 1, \ldots, T$ do
16     rank interval $t$ by $\sum_{s \in S} OC_{t,s}$, from largest to smallest
17     sort $OC_{t,s}$ over $s$ from largest to smallest
18 end
Since different time intervals can involve different hashtags, $M^H$ is naturally a
sparse matrix, in which each column corresponds to the frequency of one hashtag
across the time intervals. By concatenating the hashtag lists $HL_t$ over $t$, we convert
$M^H$ to a full-storage matrix so that the hashtag matrix can be sorted row by row for
the subsequent peak detection.

Moreover, instead of directly using the occurrence of hashtags labeled by Twitter
users, we extract the string vector of hashtags and use it to label the text content
of each tweet. This allows the approach to capture tweets about a similar topic that
carry no hashtags.
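A minimal sketch of this labeling step, assuming tweets arrive as (interval, text) pairs; the function names and data below are illustrative, not from the dissertation:

```python
# Sketch of the hashtag-matrix construction: extract a hashtag vocabulary,
# then label every tweet's text against it, so that tweets on a similar
# topic are counted even when they carry no hashtag themselves.
import re
from collections import Counter

def extract_hashtags(text):
    """Return the lowercased hashtags labeled in a tweet's text."""
    return [h.lower() for h in re.findall(r"#(\w+)", text)]

def build_hashtag_matrix(tweets, num_intervals):
    """Build M^H: rows are four-hour intervals, columns are hashtags."""
    vocab = sorted({h for _, text in tweets for h in extract_hashtags(text)})
    counts = [Counter() for _ in range(num_intervals)]
    for t, text in tweets:
        body = text.lower()
        for s in vocab:              # label tweet text by hashtag strings
            if s in body:
                counts[t][s] += 1
    # dense T x S matrix of hashtag occurrences per interval
    return [[counts[t][s] for s in vocab] for t in range(num_intervals)], vocab

tweets = [(0, "Checked in Citi Field for the game #mets #yankees"),
          (0, "lets go mets tonight"),        # no hashtag, still matched
          (1, "quiet afternoon in Queens")]
MH, vocab = build_hashtag_matrix(tweets, 2)   # MH = [[2, 1], [0, 0]]
```

The second tweet carries no hashtag but still contributes to the "mets" column, which is the point of labeling by hashtag strings rather than by the hashtags alone.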
Finally, we implement peak detection to extract the most frequently occurring hashtags
as event hashtags, which represent social media activity with context. Table 2.2
presents the top three most frequent hashtags. Moreover, we use the sum of all
hashtag occurrences in each time interval to measure the social media activity. A
highly ranked interval indicates that it is under event occurrence.
Table 2.2 shows the various detected events, including the US Open, baseball games,
music shows, running races, etc. To validate the method, we compare the
detection results against the true home game schedule of the New York Mets, which
spans a long time range with a decent number of games. There were 81 game days between
April 2014 and October 2014 for the New York Mets. After eliminating days with missing
Twitter data, 65 game days remain. Since the objective of event detection is to
sense positive events rather than non-events, we evaluate the identification results
with precision, recall, and F-1 score.

The proposed method performs well in identifying the baseball events: the
precision is 98.27%, the recall 87.69%, and the F-1 score 0.9268.
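As a quick check, the reported F-1 score is the harmonic mean of the reported precision and recall:

```python
# The F-1 score is the harmonic mean of precision and recall;
# the values below are the ones reported in the text.
def f1_score(precision, recall):
    return 2 * precision * recall / (precision + recall)

f1 = f1_score(0.9827, 0.8769)   # precision 98.27%, recall 87.69%
# rounds to 0.9268, matching the reported F-1 score
```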
Note that there are two reasons to use event hashtags instead of the quantity
Table 2.2: Sample Events and Their Top Hashtags

| Date | Hour | No. of EH | Top Hashtags |
| 3/31 | 17 to 21 | 65 | mets, openingday, ny |
| 4/5 | 13 to 17 | 306 | mets, reds, baseball |
| 4/9 | 17 to 21 | 34 | amaluna, cirquedusoleil, citifield |
| 5/14 | 17 to 21 | 710 | mets, yankees, subwayseries |
| 5/31 | 9 to 13 | 85 | happiest5k, queens, ny |
| 6/7 | 17 to 21 | 75 | digifestnyc, nyc, selfie |
| 8/25 | 17 to 21 | 437 | usopen, tennis, usopen2014 |
| 8/31 | 13 to 17 | 609 | usopen, mets, tennis |
of tweets directly. First, a high volume of tweets does not necessarily indicate an
event and its attendance. In our observation, conversations between users, commercial
promotions, or information dissemination can also generate a high quantity of tweets.
The proposed hashtag-based method diminishes the effects of these unrelated tweets.
Second, the top event hashtags describe what the event is about, even though the
hashtags might not be formal English words. As can be seen in Table 2.2, different
kinds of events and baseball teams are easily recognized from the top event hashtags.
2.5 Event Characteristics

Different events in the stadiums bring different sizes of audience to the sites, and
the passenger flow at the subway station varies accordingly.

As shown in Fig. 2.4, there are huge differences between event and ordinary transit
traffic in quantity and, more importantly, in variation. This difference inevitably leads to
Figure 2.4: Average event/nonevent daily passenger flow at Mets-Willets Point station

| Time of day | 5:00 | 9:00 | 13:00 | 17:00 | 21:00 | 24:00 + 1 hour |
| Non-Event Volume | 161.88 | 929.07 | 1500.75 | 1949.77 | 2166.36 | 997.72 |
| Event Volume | 241.76 | 1113.89 | 2828.61 | 5087.21 | 8066.04 | 5829.68 |
| Ratio of Non-Event over Event Flow | 66.96% | 83.41% | 53.06% | 38.33% | 26.86% | 17.11% |
Figure 2.5: The correlation between tweet rates and passenger flow under events. (a) Number of tweets vs. passenger flow (correlation 0.6286, R2 0.3952); (b) Number of users vs. passenger flow (correlation 0.6979, R2 0.4870).
the difficulty of transit prediction by traditional time series models (e.g., ARIMA).
On the other hand, Fig. 2.5 plots the number of tweets against passenger flow
under event occurrences in (a), and the number of Twitter users against passenger flow
in (b). As one can see, a linear trend is observed between tweet counts and passenger
flow. The correlation coefficient is above 0.62 and the adjusted R2 value is above 0.39.
The R2 values indicate that the number of users is the more robust predictor. We
therefore believe that there exists a moderate positive correlation between tweet
counts and event passenger flows. This result gives us the confidence to further
explore prediction modeling of event passenger flow with social media.
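As a sanity check on Fig. 2.5: for a one-predictor linear regression, the (unadjusted) R2 equals the squared Pearson correlation, and the reported pairs obey this up to rounding and the adjusted-R2 correction:

```python
# For simple (one-predictor) linear regression, R^2 is the square of the
# Pearson correlation r; the r values below are those reported in Fig. 2.5.
def r_squared(r):
    return r * r

print(round(r_squared(0.6286), 4))  # tweets vs. flow -> 0.3951
print(round(r_squared(0.6979), 4))  # users vs. flow  -> 0.4871
```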
Note that our study is constrained by the availability of geo-tagged tweets. For some
time periods, the number of tweets is very small regardless of the time of day. In
this case, the event identification step, which measures social media activity,
automatically excludes these periods from the correlation study and the following
analysis.
2.6 Prediction Modeling
In this section, we intend to investigate whether or not the content of social media
will assist in forecasting event passenger flow. The first step is to identify the best
time lags for the prediction models.
To quantify the tweets, we define two types of features, the tweet rates, derived
from social media data:
• NTweets(t): Number of event-related tweets at time step t.
• NUsers(t): Number of unique tweet users at time step t.
Because the record time interval of transit passenger flow is four hours, we also
aggregate the tweets data in four-hour intervals. If the predicted passenger flow is
Figure 2.6: Average passenger flow vs. average tweet rates at Citi Field Station. (a) Nonevent; (b) Event. Each panel plots the passenger flow volume against NTweets at lags 0 through 4.
at time t, we shift the tweet rates to earlier hours, $t-1, t-2, \ldots, t-L$, since prediction
requires features available ahead of the passenger flow time. Based on the positive
correlation between tweet rates and passenger flow in Fig. 2.6, we construct a linear
regression (LR) model, where passenger flow is the dependent variable and tweet rates
at different lags are the independent variables.
The highest predictive correlation is achieved when the tweet rates are calculated
one hour prior to the event time range: the adjusted R2 value is 0.616 in the
one-hour-lag case, compared with 0.488 and 0.512 for the zero- and two-hour-lag
cases, respectively. Also, as shown in Fig. 2.6, the curve of tweet rates with a
one-hour lag fits the curve of event passenger flow best, whereas for non-event
passenger flow there is no obvious pattern between tweet rates and passenger flow.
Based on this analysis, we include tweet rates with a one-hour lag in the base
prediction model in the following analysis.
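The lag construction can be sketched as follows; the arrays are synthetic placeholders and `lagged_design` is a hypothetical helper, not code from the dissertation:

```python
# Tweet-rate features are shifted one interval ahead of the flow they
# predict, then fit by ordinary least squares (numpy.linalg.lstsq).
import numpy as np

def lagged_design(ntweets, nusers, lag):
    """Pair flow at step t with tweet rates at step t - lag."""
    n = len(ntweets) - lag
    return np.column_stack([np.ones(n),        # intercept column
                            ntweets[:n],
                            nusers[:n]])

ntweets = np.array([10., 40., 120., 300., 80.])
nusers  = np.array([ 8., 30.,  90., 210., 60.])
flow    = np.array([500., 900., 2500., 6000., 2000.])  # four-hour volumes

lag = 1
X = lagged_design(ntweets, nusers, lag)
y = flow[lag:]                             # flow lags the tweet rates by one step
w, *_ = np.linalg.lstsq(X, y, rcond=None)  # LR coefficients
```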
Next, we implement cross-validation to compare the results of the LR model with two
popular prediction models: average prediction (AVG) and the seasonal autoregressive
integrated moving average (SARIMA). We generate an experiment with 100 runs of
datasets from the event detection result; each run randomly splits the entire dataset
into training (70%) and test (30%) sets.
The prediction performance is evaluated by two metrics, namely Mean Absolute
Percentage Error (MAPE) and Root Mean Square Error (RMSE).
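The two metrics, written out explicitly (these are the standard formulations; the dissertation does not list the formulas):

```python
# Mean Absolute Percentage Error and Root Mean Square Error.
import math

def mape(actual, predicted):
    """Average of |error| relative to the actual value."""
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Square root of the mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

mape([100, 200], [90, 220])   # -> 0.1
rmse([100, 200], [90, 220])   # -> sqrt(250) ~ 15.81
```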
In our experiment with 100 runs, the LR model with tweet rates improves MAPE by
33.08% compared with SARIMA (see Fig. 2.7 for details). Notice that this performance
is achieved by the LR with only two variables. However, the LR model does not capture
the relation between time steps, even though the passenger flow data are time series
in nature.
We compare the R2 values of two models: 1) the tweets-based LR model and 2) the
historical-flow-based SARIMA model. The experiment obtains adjusted R2 values of
0.616 for the LR, 0.400 for the SARIMA, and 0.696 for the combined features of both.
As one can see, around 60% of the event passenger flow variance can be explained by
the variation in the number of tweets, and around 40% of the variance comes from
historical time-series flow data, which include a large portion of day-to-day
recurrent passenger flow and a small portion of the non-recurrent event flow.
The combination of the two methods yields a better R2 value because the LR provides
event-related features while the SARIMA contributes features related to the time
series and routine flow.
Inspired by the above experiment with the two modeling methods, we propose a convex
optimization based approach, called Optimization and Prediction with hybrid Loss
function (OPL), that fuses the LR model and the SARIMA model jointly in the objective
function. The OPL model aims to combine the unique strengths of linear regression on
social media features and of the SARIMA model in time series prediction.
The hypothesis of the proposed model is a parametric linear model, defined as:

$h_w(x) = w_0 x_0 + w_1 x_1 + w_2 x_2 + \cdots + w_n x_n, \quad x_0 = 1$
where $x_i$ is the $i$th feature and $w_i$ its corresponding coefficient. In total, the
experiment runs $m = 100$ times. Each entry of the experiment is one of the four-hour
intervals from the event detection result. Following our experimental design, we
randomly split the $m$ runs into training $m_{train}$ (70%) and test $m_{test}$ (30%) sets. The
two tweet rates, NTweets and NUsers, with a one-hour lag act as the features of the
model.
We construct the total loss function as:

$$J(w, y) = \sum_{j=1}^{m_{train}} \left(y^{(j)} - h_w(x^{(j)})\right)^2 + \alpha \sum_{j=1}^{m_{test}} \left(\hat{y}^{(j)} - h_w(x^{(j)})\right)^2 + \beta \sum_{j=1}^{m_{test}} \left(\hat{y}^{(j)} - y^{*(j)}\right)^2 \qquad (2.1)$$
The idea behind the loss function is to combine the modeling of predictions on
both training and test data with the predictions from the time series model. Equation
(2.1) contains three main parts. The first component is the sum of squared errors
on the training set, the same as in linear regression. The second component
incorporates the prediction directly into the loss function in order to minimize
the squared error on the test data. In addition, to fuse the results of SARIMA, we
add the sum of squared differences between the OPL predictions $\hat{y}^{(j)}$ and the SARIMA
predictions $y^{*(j)}$ to Equation (2.1) as the third component. Here $y^{*(j)}$ plays the
role of a regularizer that balances the whole loss function. Since OPL includes only
two independent variables, trial experiments show that L1 regularization to prevent
overfitting is unnecessary. In sum, OPL adopts the moderately correlated social media
features and incorporates the prediction results from a conventional time series
model.
To minimize Equation (2.1), we first vectorize all variables and coefficients:

$$W \in \mathbb{R}^n, \quad Y \in \mathbb{R}^{m_{train}}, \quad X^{train} \in \mathbb{R}^{m_{train} \times n}, \quad \hat{Y} \in \mathbb{R}^{m_{test}}, \quad X^{test} \in \mathbb{R}^{m_{test} \times n}, \quad Y^* \in \mathbb{R}^{m_{test}}$$
Then, the loss function is transformed into:

$$J(W, \hat{Y}) = \mathrm{tr}\big((Y - X^{train} W^T)(Y - X^{train} W^T)^T\big) + \alpha \cdot \mathrm{tr}\big((\hat{Y} - X^{test} W^T)(\hat{Y} - X^{test} W^T)^T\big) + \beta \cdot (\hat{Y} - Y^*)(\hat{Y} - Y^*)^T$$
Taking partial derivatives of the above equation with respect to $W$ and $\hat{Y}$, respectively,
we get:

$$\nabla_W J(W, \hat{Y}) = \big[(X^{train})^T X^{train} + \alpha (X^{test})^T X^{test}\big] W^T - (X^{train})^T Y^T - \alpha (X^{test})^T \hat{Y}^T = 0 \qquad (2.2)$$

$$\nabla_{\hat{Y}} J(W, \hat{Y}) = \alpha X^{test} W^T - (\alpha + \beta) \hat{Y}^T + \beta Y^{*T} = 0 \qquad (2.3)$$

Then, we use the gradient descent method to solve Equations (2.2) and (2.3) and
find a minimizer. Given Equation (2.1), gradient descent starts from an initial
$(W, \hat{Y})$ and iteratively moves toward a set of values that minimizes the function;
each iteration takes a step in the negative direction of the function gradient.
Because Equation (2.1) is convex, the result of OPL is the global optimum.
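A minimal numerical sketch of this gradient descent, with synthetic data and illustrative values for alpha, beta, and the step size (none of these come from the dissertation):

```python
# Gradient descent on the OPL loss (2.1), alternating updates of W and
# the test predictions Y_hat; data, alpha, beta, and the step size are
# all synthetic/illustrative choices.
import numpy as np

rng = np.random.default_rng(0)
n, m_tr, m_te = 3, 20, 8
X_tr = np.column_stack([np.ones(m_tr), rng.normal(size=(m_tr, n - 1))])
X_te = np.column_stack([np.ones(m_te), rng.normal(size=(m_te, n - 1))])
w_true = np.array([1.0, 2.0, -1.0])
y_tr = X_tr @ w_true
y_star = X_te @ w_true + 0.1 * rng.normal(size=m_te)  # SARIMA stand-in

alpha, beta, step = 1.0, 1.0, 0.01
W = np.zeros(n)
Y_hat = np.zeros(m_te)
for _ in range(3000):
    grad_W = (X_tr.T @ (X_tr @ W - y_tr)
              + alpha * X_te.T @ (X_te @ W - Y_hat))
    grad_Y = alpha * (Y_hat - X_te @ W) + beta * (Y_hat - y_star)
    W -= step * grad_W
    Y_hat -= step * grad_Y

# since (2.1) is convex, descent converges to the global minimum
loss = (np.sum((y_tr - X_tr @ W) ** 2)
        + alpha * np.sum((Y_hat - X_te @ W) ** 2)
        + beta * np.sum((Y_hat - y_star) ** 2))
```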
Figure 2.7: Performance metrics of the prediction models (AVG, SARIMA, LR, KNN, SVR, OPL). (a) MAPE; (b) RMSE.
Figure 2.8: The distributions of test errors comparing the SVR and OPL. (a) MAPE; (b) RMSE.
Figure 2.9: Improvement from ensemble learning over the OPL and SVR. (a) MAPE; (b) RMSE.
In order to benchmark our proposed method against existing popular prediction
approaches, we introduce two nonparametric methods: support vector regression (SVR)
and k-nearest neighbors (KNN). The prediction process uses cross-validation as well.

Fig. 2.7 illustrates that the OPL yields better prediction accuracy than the other
methods. Compared with the LR, the OPL improves MAPE by 11.4%. One can also see
that the SVR delivers desirable prediction performance. The SVR and the OPL have
different characteristics: the SVR is a nonparametric technique that considers tweet
rates only, while the OPL is a parametric method that incorporates the prediction
results from a conventional time series model. A further detailed comparison is
conducted over another 100 randomly generated runs.
Fig. 2.8 depicts the distributions of test errors for both SVR and OPL. While
either method performs relatively well on its own, the distributions are
heterogeneous in both metrics, MAPE and RMSE. The heterogeneity of the error
distributions encourages us to combine the merits of both techniques. Inspired by
the aggregation approach proposed in [61], we implement stacking, an ensemble
learning approach, to merge the prediction results of the SVR and OPL.
$$\hat{Y} = P(X^{train} \mid OPL) \cdot \arg\min_{Y} J(W, Y \mid OPL) + P(X^{train} \mid SVR) \cdot \arg\min_{Y} J(W, Y \mid SVR) \qquad (2.4)$$
We estimate $\hat{Y}$ by Equation (2.4). The weights come from the normalized root mean
square errors on the training data, and the output averages the minimizers of both
the SVR and the OPL.
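One plausible reading of this weighting, normalized inverse training RMSE (an assumption on our part, since the exact normalization is not spelled out), can be sketched as:

```python
# Stacking sketch for Equation (2.4): each model's prediction is weighted
# by the inverse of its training RMSE, normalized so the weights sum to one.
def ensemble(pred_opl, pred_svr, rmse_opl, rmse_svr):
    inv_opl, inv_svr = 1.0 / rmse_opl, 1.0 / rmse_svr
    w_opl = inv_opl / (inv_opl + inv_svr)
    w_svr = inv_svr / (inv_opl + inv_svr)
    return [w_opl * a + w_svr * b for a, b in zip(pred_opl, pred_svr)]

# the lower-RMSE model gets the larger weight (here 0.75 vs. 0.25)
blended = ensemble([1000., 2000.], [1200., 1800.],
                   rmse_opl=100., rmse_svr=300.)
```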
As one can see from Fig. 2.9, the ensemble approach yields better prediction
accuracy than either the OPL or SVR alone. It is worth mentioning that the improvement
over the conventional SARIMA is more than 40%. Notice that the tweet features are
obtained from free, real-time social media data. The results indicate the promising
value of using social media for passenger flow prediction under event conditions.
2.7 Conclusions
In this paper, we have addressed two important questions: whether social media
data can signify public gathering events, and what techniques can model passenger
flow prediction from the features extracted from social media.

First, we exploit social media to detect various events via hashtags. To capture
events precisely, the hashtags from Twitter users are analyzed, tuned, adapted, and
combined with lexical processing techniques and peak detection. Our approach achieves
good performance, with 98.27% precision and 87.69% recall for the baseball games. It
is a simple but efficient method to capture public gathering events with high social
media activity.
Second, we propose a convex optimization model called Optimization and Prediction
with hybrid Loss function (OPL) to fuse the least squares of linear regression
and the prediction results of SARIMA in the same objective function. The OPL hybrid
model takes advantage of the unique strengths of linear regression on social media
features and of the SARIMA model in time series prediction. Among several popular
prediction methods, OPL shows the best results in terms of MAPE and RMSE. In
addition, comparing the distribution of prediction errors of OPL with that of SVR,
a popular nonparametric and nonlinear method, reveals heterogeneous error patterns.
Therefore, an ensemble model is developed to leverage the weighted results of OPL
and SVR jointly, further increasing prediction accuracy and robustness.
Overall, social media data demonstrate their capability for passenger flow prediction
under event conditions. Social media offers a cost-effective way to obtain real-time
traveler-related data and fills the gap between day-to-day passenger flow volume and
abruptly changing non-recurrent event volume. The positive correlation between
passenger flow and social media activity makes social media a significant transit
demand indicator for the public transit system.
In the future, one could further explore the minimum amount of social media use at
an event that still yields respectable accuracy, and how this minimum can be
estimated in order to compute a trust index for the regression result.
Chapter 3
SAME-DAY DELIVERY PLANNING WITH STORE FULFILLMENT
3.1 Introduction
Electronic retailing has grown significantly over the past decade and is predicted
to continue rising. A growing number of traditional brick-and-mortar retailers now
operate online channels, and competition with Internet retailers that sell products
or services solely online is inevitable. E-channels providing same-day delivery (SDD)
intensify the competition with traditional retailers by bringing considerable
convenience and nearly instant accessibility to online shoppers.
One feasible way for traditional retailers to respond to this e-commerce trend is
to offer same-day delivery as well. Our solution is to source fulfillment from more
direct-to-consumer places, such as their retail stores, instead of far-away
fulfillment centers. One outstanding advantage of using stores to fulfill online
orders is the short distance between the retail stores and the consumers. This
benefits not only fulfillment speed and costs but also enables versatile services,
such as store pickup and accessible returns. Nonetheless, a new service comes at a
price, and same-day delivery is no exception, since it introduces additional
operational complexity in the stores.
These difficulties motivate us to study and model store fulfillment of local
online orders within the same day. Essentially, this is the same-day delivery planning
with store fulfillment problem (SDD-SFP). SDD-SFP is characterized by three main
factors: store selection, fleet sizing for delivery transportation, and
multi-commodity distribution.
The store selection can be treated as a facility location problem. A good store
selection reduces order delivery expenses and the total service setup cost of the
stores. The model involves a large number of delivery destinations and potential
facility locations to select from.
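As a reference point, the classical uncapacitated facility location model underlying the store-selection decision can be written as follows (generic notation, not the dissertation's formulation):

```latex
\min \sum_{i \in S} f_i x_i + \sum_{i \in S} \sum_{j \in D} c_{ij} y_{ij}
\quad \text{s.t.} \quad
\sum_{i \in S} y_{ij} = 1 \;\; \forall j \in D, \qquad
y_{ij} \le x_i \;\; \forall i \in S,\, j \in D, \qquad
x_i, y_{ij} \in \{0, 1\},
```

where $x_i = 1$ if candidate store $i$ is opened for SDD fulfillment, $y_{ij} = 1$ if demand point $j$ is served from store $i$, $f_i$ is the setup cost of store $i$, and $c_{ij}$ the delivery cost from $i$ to $j$.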
For order delivery transportation, we incorporate the retailer's own fleet of trucks
and third-party carriers as two delivery channels. The fleet of vehicles serves
delivery requests from different areas. In the problem addressed in this paper,
customer demand can vary dramatically over the time horizon, reflecting practice
during the holiday shopping season. Since the fleet is fixed for the whole period, it
is important to coordinate the fleet size with the demand variations: both the size
of the fleet and the capacity required from the carriers need to be determined.
For the multi-commodity distribution, the inventory of products for online shoppers
needs to be assigned to the selected stores. Since demand varies by area and delivery
option, the distribution requires careful consideration of last-mile delivery costs.
In addition, the model considers the huge scale of the product assortment offered to
online shoppers.
These factors certainly increase the complexity of the problem. Moreover, the
capacities of the delivery channels and the stores' SDD order processing capability
add further difficulty to developing an efficient plan for this practical problem.
Two natural variants of same-day delivery are considered in practice and in the
literature: (1) all orders must be delivered by the end of the day, and (2) all
orders must be delivered within certain periods specified by the consumers. From
the supply chain planning perspective, we adopt the first variant as the definition
of same-day delivery in this paper. It simplifies the modeling and reduces the
computational complexity of delivery transportation, while retaining a feasible
logistics plan in business practice.
The objective of SDD-SFP is to identify a seasonal order fulfillment plan for
delivering local online orders from nearby retail outlets. In particular, this paper
seeks to develop optimization models and algorithms that solve facility location,
transport channel selection, and inventory management jointly to construct a supply
chain logistics plan. The paper consists of the following steps. (1) We formulate a
mixed-integer program for SDD-SFP that captures the trend of same-day delivery in the
omni-channel supply chains of brick-and-mortar retailers. (2) To solve the SDD-SFP,
we propose a Benders decomposition based approach that divides the model into one
master problem, in charge of facility locations and fleet sizing, and subproblems, in
charge of SDD order assignment and delivery channel choices. (3) Furthermore, we
introduce several algorithmic enhancements to the solution method, including a
store-selection algorithm, cut strengthening methods, and parallel search trees.
(4) Finally, we conduct an extensive set of experiments on a real-world national
retailer that demonstrates the value of the proposed solution approach.
This research aims to develop optimization models and solution algorithms for store
location selection, fleet sizing for transportation, and inventory planning, in order
to construct a robust supply chain logistics plan. Our study makes the following key
contributions:
1. This paper introduces a new same-day delivery planning with store fulfillment
problem that captures the current trend of same-day delivery in the omni-channel
supply chains of brick-and-mortar retailers. A mixed-integer programming model
is developed for the SDD-SFP.
2. The solution algorithm uses a Benders decomposition based approach that
divides the model into one master problem, in charge of facility locations and
fleet sizing, and subproblems, in charge of SDD order assignment and delivery
channel choices.
3. This study creates a store-selection algorithm, based on mixed-integer
programming and a combination of factors, that substantially reduces the list of
potential stores and acts as an effective extension to increase computational
efficiency.
4. In order to make the cuts more efficient and accelerate the Benders solving
process, several cut strengthening methods are implemented in the Benders solution
structure, including Pareto-optimal cuts, MIP lifting cuts, and tabu cuts.
5. A customized local branching method is proposed as an additional parallel
search tree that assists the Benders approach in generating initial MIP lifting
cuts and exploring the neighborhood of incoming incumbent solutions from the
master problem.
3.2 Literature Review
SDD-SFP builds upon the uncapacitated facility location problem with consideration
of multiple commodities and multiple plants. The facility location problem (FLP) is
a very popular academic topic, and a large body of research addresses it. Klose and
Drexl [62], ReVelle et al. [63], and Melo et al. [64] present up-to-date surveys of
the general characteristics and recent trends of FLP.
The multi-commodity facility location problem is an important variant of FLP. Warszawski [65] proposed a branch and bound algorithm and a heuristic solution procedure for solving the multi-commodity facility location problem. Geoffrion and Graves [66]
provided a solution procedure based on Benders decomposition and applied it to a real situation of a major food company for the design of multi-commodity distribution facilities. Neebe and Khumawala [67] used the delta and omega simplification rules
adjusted for the multi-commodity case. Barnhart and Sheffi [68] presented a primal-dual heuristic solution approach for large-scale multi-commodity network flow problems. Crainic and Delorme [69] described a dual-ascent-based approach for solving simple multi-commodity location problems with balancing requirements. Aggarwal et al. [70] proposed a general heuristic procedure for multi-commodity integer flows, which can be utilized for solving multi-commodity facility location problems. Lee [71] developed a general model for a capacitated facility location problem incorporating multi-product, multi-type facilities and proposed an optimal solution
algorithm based on the Benders decomposition technique. Lee [72] extended the standard capacitated facility location problem to a generalized multi-product, multi-type capacitated facility location problem with a choice of facility type and presented an effective algorithm based on cross decomposition, which unifies Benders decomposition and Lagrangean relaxation into a single framework. Pirkul [73] developed an efficient heuristic procedure for solving the multi-commodity, multi-plant capacitated facility location problem.
For the same-day delivery part, researchers mainly focus on how to operate the delivery vehicles, modeled as a dynamic pickup and delivery problem. Azi et al. [74] addressed a vehicle routing problem with dynamically arriving customer requests with time windows.
They proposed an adaptive large neighborhood search heuristic to maximize total
expected profits of vehicle operations. Klapp et al. [75] introduced a novel way to look at vehicle operations for same-day delivery, treating the vehicle dispatch problem in a setting with a single vehicle and all unserved requests located on a line. They applied a dynamic programming approach to minimize the expected vehicle operation costs and the penalties for unserved requests. Voccia et al. [76] defined the multi-vehicle dynamic pickup and delivery problem with time constraints as the same-day delivery problem. They utilized a Markov decision process to model the problem and a sample-scenario planning approach to incorporate potential future requests into routing decisions.
Benders decomposition [77] is an exact solution method. In Benders decomposition, the original MIP problem is partitioned into two parts, a master problem and a subproblem, which are typically easier to solve than the original problem. The subproblem is the part of the original MIP with some variables fixed at the values provided by the solution of the master problem. The master problem consists of the remaining variables, with cuts iteratively added as constraints generated from the solutions of the subproblem. The branch-and-cut algorithm and linear programming duality form the theoretical basis of Benders decomposition. In the solving process, the variables of the master problem are first determined, and the subproblem is then solved. If the subproblem is feasible and bounded, an optimality cut is added to the master problem; otherwise, a feasibility cut is added. An upper bound can be computed from feasible subproblems, and a lower bound is obtained when the master problem is solved to optimality. The process terminates when the optimality gap reaches a predefined threshold.
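The iterative bounding loop described above can be sketched in a self-contained toy. Here the subproblem value function, the per-unit fixed cost of 2, and the grid-search master solver are all illustrative stand-ins and are not taken from the SDDSFP model:

```python
# Toy Benders loop over a single master variable f (e.g., a fleet size).
# Master: min 2*f + theta subject to optimality cuts theta >= a + b*f.
# Subproblem: Theta(f) = max(20 - 8f, 10 - 2f, 0); its active piece
# supplies the optimality cut. All numbers are illustrative.

def solve_subproblem(f):
    """Return Theta(f) and the supporting cut (a, b) with Theta >= a + b*f."""
    pieces = [(20.0, -8.0), (10.0, -2.0), (0.0, 0.0)]
    a, b = max(pieces, key=lambda p: p[0] + p[1] * f)
    return a + b * f, (a, b)

def solve_master(cuts, f_grid):
    """Minimize 2*f + theta over the cuts; grid search replaces the MIP solver."""
    best = None
    for f in f_grid:
        theta = max((a + b * f for a, b in cuts), default=0.0)
        cost = 2.0 * f + theta
        if best is None or cost < best[0]:
            best = (cost, f)
    return best  # (lower bound, incumbent f)

def benders(tol=1e-6, max_iter=50):
    f_grid = [i / 10 for i in range(101)]  # f in [0, 10]
    cuts, ub, f = [], float("inf"), 0.0
    lb = -float("inf")
    for _ in range(max_iter):
        theta_true, cut = solve_subproblem(f)
        ub = min(ub, 2.0 * f + theta_true)  # feasible solution -> upper bound
        cuts.append(cut)                    # add the optimality cut
        lb, f = solve_master(cuts, f_grid)  # relaxation -> lower bound
        if ub - lb <= tol:                  # optimality gap check
            break
    return lb, ub

lb, ub = benders()
```

The loop alternates exactly as in the text: fix the master variables, solve the subproblem, harvest a cut, re-solve the master, and stop once the upper and lower bounds meet.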
Benders decomposition-based approaches have been widely used in solving supply chain network design problems. Inspired by Geoffrion and Graves' work [66] applying it to multi-commodity distribution system design, we use Benders decomposition as the base of our solution methodology.
This study is the first of its type to tackle the store fulfillment problem in a collaborative local store network. The intended logistics plan balances the acceleration of order delivery speed against the supply chain costs in terms of facility location and transportation. Most existing studies on store fulfillment focus on alleviating the workload of the fulfillment centers and facilitating inventory rebalancing.
3.3 Problem Description and Formulation
This section first introduces the notation used throughout this chapter. It then formulates the same-day delivery with store fulfillment supply chain planning (SDDSFP) problem as a mixed integer programming model and reformulates it with Benders decomposition as the solution algorithm. Finally, several algorithmic enhancements are designed to accelerate the solving process.
3.3.1 Notation
There are four kinds of indices, representing the time horizon, the store locations, the on-sale products, and the delivery zones. Let T, indexed by t, denote the time horizon of the supply chain plan. Let K, indexed by k, denote the candidate store locations. The on-sale products offered through the fulfillment network are defined as stock keeping units (SKUs), J = {1, . . . , j}. The whole fulfillment region is divided by zip codes, and we call the divided areas delivery zones, represented by Z = {1, . . . , z}.
Each SKU j has an attached retail price Pj. Each store k has an operational setup cost Πk and a daily same-day demand processing capacity CPk. For delivery zone z, Az stands for its area, and Lkz is the center-to-center distance between zone z and store k. Customer demand is defined by Djzt, which denotes the quantity of SKU j requested at delivery zone z on day t. In order to estimate the number of packages to be fulfilled from the SKU demands, we introduce a minimum shopping amount ToC, which can be treated as the least total value of all eligible SKUs in a customer order needed to qualify for same-day shipping.
The transportation cost comes from two sources: the operating cost of the retailer's own delivery fleet and the package shipping cost charged by 3rd party carriers. The own fleet consists of delivery vans or trucks operated by the retailer that work exclusively on same-day orders. The parameters for a vehicle in the fleet include the daily operating cost O, the daily working time length WT, the average moving speed MS, the average stop time ST for carrying a package from the vehicle to its destination, and the package holding capacity HC. The 3rd party carriers, on the other hand, provide urban delivery networks and charge per package shipped. We denote the average shipping cost of one package from store k to delivery zone z as Ckz, and the daily maximum processing capacity of the 3rd party carriers as CS. When an order cannot be fulfilled on the same day by either the own fleet or the carriers, a penalty cost G is incurred for each such order.
The following decision variables are used to formulate the SDDSFP. The variable ak equals 1 when store k is selected for SDD service over the time horizon T, and 0 otherwise. The variable bkz equals 1 if and only if delivery zone z can be sourced from store k throughout T. The variable f represents the number of vehicles used for order fulfillment over the whole time horizon T. From the fulfillment perspective, there are three kinds of SDD orders, represented by xtkz, ytkz and wtkz. The first kind comprises the SDD orders delivered by the own fleet, denoted xtkz, on day t from store k to zone z. Similarly, the SDD orders delivered by carriers are denoted ytkz. The last kind comprises the orders that cannot be fulfilled on the required day t, represented by wtkz. The fulfillment decision is defined by hjtkz, which denotes the quantity of SKU j fulfilled on day t from store k to zone z. Finally, we denote by lt the optimal traveling distance to fulfill SDD orders by the own fleet on day t.

For the sake of simplicity, xtkz, ytkz, wtkz and f are defined as continuous variables even though they are integer-valued in nature. Because the problem is large in scale, with high customer demand and a large product assortment, the effect of these variable types on the planning result is very small according to a pilot numerical experiment.
3.3.2 SDDSFP Formulation
The SDDSFP can be formulated as follows:
\min \Omega = \sum_{k} \Pi_k a_k + T \cdot O \cdot f + \sum_{t,k,z} C_{kz} y_{tkz} + \sum_{t,k,z} G w_{tkz}    (3.1)

s.t.

h_{jtkz} = D_{jzt} b_{kz}    \forall j \in J, t \in T, z \in Z, k \in K    (3.2)

\sum_{k} b_{kz} = 1    \forall z \in Z    (3.3)

\sum_{j,z} h_{jtkz} \le CP_k a_k    \forall t \in T, k \in K    (3.4)

x_{tkz} + y_{tkz} + w_{tkz} = \frac{\sum_{j} h_{jtkz} P_j}{ToC}    \forall t \in T, k \in K, z \in Z    (3.5)

l_t = \sum_{k,z} \Big( \frac{2 L_{kz} x_{tkz}}{HC} + 0.57 \sqrt{x_{tkz} A_z} \Big)    \forall t \in T    (3.6)

f \ge \frac{l_t / MS + ST \sum_{k,z} x_{tkz}}{WT}    \forall t \in T    (3.7)
The objective function (3.1) minimizes the cost of facility locations plus the expected cost of the logistics decisions, which comprise the transportation cost of the own fleet, the transportation cost of 3rd party carriers, and the penalty cost of unfulfilled same-day orders. Constraint (3.2) indicates that the fulfillment decisions are based on the customers' demand and the availability of facilities. Constraint (3.3) enforces that each delivery zone is sourced from exactly one store. Constraint (3.4) makes sure that the fulfillment decisions stay within the limits of store order processing capacity. Constraint (3.5) estimates the quantity of packages from the demands and divides them into three types: SDD orders by own fleet, SDD orders by carriers, and unfulfilled SDD orders. Constraint (3.6) estimates the optimal traveling distance for the vehicles on each day. Based on the predefined operating parameters of a vehicle, the number of vehicles is then approximated by constraint (3.7).
Order Estimation
The way we estimate the number of orders/packages from constraint (3.5) is based on the fulfillment decision hjtkz, the SKU prices Pj, and the minimum shopping amount ToC. The numerator of the right-hand side of (3.5), \sum_j h_{jtkz} P_j, is the total payment amount of all the orders fulfilled on day t from store k to zone z, while the denominator is the minimum shopping amount ToC.
The logic behind this method is to mimic the minimum shopping cart requirement of popular online retailers, such as Amazon.com and Jet.com. On these websites, online shoppers need to order a certain amount of merchandise to qualify for free shipping. In our case, the required amount of merchandise for using the same-day delivery service is the minimum shopping amount ToC. The order estimation assumes that the payment amount of each customer order just reaches the minimum shopping amount. We remark that this method could overestimate the number of orders/packages, since the payment amount of a customer order may be larger than ToC. In the model, the minimum shopping amount can be approximated from historical sales data and the online shopping preferences of local consumers. Moreover, different minimum shopping amounts can be approximated according to location and time, which may increase the accuracy of the order estimation.
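As a concrete illustration of this estimate, the quantities, prices, and ToC value below are invented; only the formula itself, total payment divided by the minimum shopping amount, comes from constraint (3.5):

```python
# Package estimation from constraint (3.5): packages ~ (sum_j h_j * P_j) / ToC.
def estimate_packages(quantities, prices, toc):
    """quantities: fulfilled units h_j per SKU; prices: retail prices P_j."""
    payment = sum(q * p for q, p in zip(quantities, prices))
    return payment / toc

# Three SKUs fulfilled from one store to one zone on one day (made-up data):
n = estimate_packages(quantities=[4, 2, 1], prices=[10.0, 25.0, 35.0], toc=35.0)
# 125 dollars of demand against a 35-dollar minimum -> about 3.6 packages
```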
Vehicle Routing Estimation
The expected routing costs for the own-fleet option come from constraints (3.6) and (3.7). The vehicle routing problem, which determines the exact routes, is itself NP-hard. Since our problem focuses on the supply chain planning phase, we only need the length of the routes and the corresponding costs, rather than the specific origins and destinations, to estimate the fleet size. Continuous approximation modeling is more favorable and suitable in this situation. Continuous approximation models use the area value and the number of customers to estimate the average traveling distance of vehicles with high accuracy. Several studies have applied this technique in supply chain and logistics settings; see [78, 79, 80]. Daganzo [81] proposed a simple and intuitive formula (3.8) for the capacitated vehicle routing problem (CVRP) when the depot is not necessarily located in the area.
CVRP(V_n) = 2 r m + \phi \sqrt{n A}    (3.8)

CVRP(V_n) stands for the total distance of a CVRP instance with m routes, where the average distance between the customers and the depot is r, and n customers are evenly distributed in an area A. The value of \phi is a constant, with \phi = 0.57 for rectilinear distance. In our case, the daily number of customers n approximately equals the number of orders delivered by own vehicles, \sum_{k,z} x_{tkz}. The number of routes m is the number of trips the capacitated vehicles need in order to fulfill all orders, which is \sum_{k,z} x_{tkz} / HC. The center-to-center distance L_{kz} replaces r, and the zone area A_z takes the place of A. Substituting these into (3.8) yields equation (3.6), which estimates the optimal traveling distance to fulfill SDD orders by the own fleet on day t.
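A small sketch of Daganzo's approximation (3.8) with the substitutions used in (3.6); the depot distance, capacity, package count, and zone area below are invented for illustration:

```python
import math

PHI = 0.57  # Daganzo's constant for rectilinear distance

def cvrp_length(r, m, n, area, phi=PHI):
    """Estimated CVRP distance: line-haul term 2*r*m plus local term phi*sqrt(n*A)."""
    return 2.0 * r * m + phi * math.sqrt(n * area)

# 40 packages, vehicle capacity 20 (-> m = 2 routes), depot 8 km from the
# zone center (r = L_kz), zone area 25 km^2 (A_z)
dist = cvrp_length(r=8.0, m=40 / 20, n=40, area=25.0)
```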
One important observation concerns the square-root term \sqrt{x_{tkz}} in (3.6). It may be impossible to solve the MIP to optimality with the square-root term when the problem becomes large. For the sake of simplicity, we calculate a fractional number q from numerical experiments and replace \sqrt{x_{tkz}} by q \cdot x_{tkz}. This is a rough estimation in general, but when q is chosen with caution it can be a good estimate for the planning problem. An alternative option is to apply several integer constraints to approximate the value of the square-root term. In the context of value estimation, however, the accuracy would not improve much, since \sqrt{x_{tkz}} and x_{tkz} must be rounded to the nearest whole numbers within the MIP, and the additional integer constraints dramatically increase the complexity of the model. For the purposes of this study, the fractional number q is the better option, and we adopt it to approximate the square-root term.
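One way to pick the scalar q mentioned above is a least-squares fit of q*x to sqrt(x) over the range of daily package counts expected in the instance; the range [20, 200] below is an assumption for illustration, not taken from the dissertation's data:

```python
import math

# Minimize sum_x (sqrt(x) - q*x)^2  =>  q = sum_x x^(3/2) / sum_x x^2
xs = range(20, 201)  # assumed plausible daily own-fleet package counts
q = sum(math.sqrt(x) * x for x in xs) / sum(x * x for x in xs)

# Sanity check: mid-range, the linear proxy q*x should track sqrt(x)
approx_100 = q * 100  # compare with sqrt(100) = 10
```

The fit weights large x heavily, so in practice q would be re-calibrated whenever the expected demand range shifts.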
To simplify the presentation and the model integration, we define:

\alpha_{kz} = \frac{2 L_{kz} + q \sqrt{A_z} \, HC + HC \cdot MS \cdot ST}{HC \cdot MS \cdot WT}    \forall k \in K, z \in Z    (3.9)

\beta_{tz} = \frac{\sum_{j} D_{jzt} P_j}{ToC}    \forall t \in T, z \in Z    (3.10)

From (3.1) to (3.7), it can be seen that the transient decision variables h_{jtkz} and l_t can be bypassed. By incorporating (3.9) and (3.10), the SDDSFP problem can be formulated as:
\min \Omega = \sum_{k} \Pi_k a_k + T \cdot O \cdot f + \sum_{t,k,z} C_{kz} y_{tkz} + \sum_{t,k,z} G w_{tkz}    (3.11)

s.t.

\sum_{k} b_{kz} = 1    \forall z \in Z    (3.12)

\sum_{j,z} D_{jzt} b_{kz} \le CP_k a_k    \forall t \in T, k \in K    (3.13)

x_{tkz} + y_{tkz} + w_{tkz} = \beta_{tz} b_{kz}    \forall t \in T, k \in K, z \in Z    (3.14)

f \ge \sum_{k,z} \alpha_{kz} x_{tkz}    \forall t \in T    (3.15)
3.3.3 SDDSFP Benders Reformulation
We now introduce the Benders decomposition reformulation of the SDDSFP in order to solve our large-scale problem efficiently.

In the reformulation, we place all the binary decision variables ak, bkz in the master problem, while the continuous variables xtkz, ytkz, wtkz and f belong to the subproblem. In addition, since f is the only subproblem variable not indexed by day t, we update the partition and move f to the master problem. Consequently, the subproblem handles all the time-related variables, while the master problem retains only the variables that are not indexed by day t. This enables us to further divide the subproblem into T independent subproblems. Accordingly, the dimension T is dropped from the decision variables xtkz, ytkz and wtkz in the subproblems. Then, subproblem(i) can be stated as:
\min \Theta_t^{(i)} = \sum_{k,z} \big( C_{kz} y_{kz} + G w_{kz} \big)    (3.16)

s.t.

x_{kz} + y_{kz} + w_{kz} = \beta_{tz} b_{kz}    \forall k \in K, z \in Z    (3.17)

f \ge \sum_{k,z} \alpha_{kz} x_{kz}    (3.18)

where b_{kz} and f are treated as known constants in the subproblems; \Theta_t^{(i)} represents the two kinds of cost on day t, namely the 3rd party shipping cost and the penalty cost for unfulfillable SDD orders; all parameters and decision variables are non-negative.
The master problem is defined as:
\min \Psi = \sum_{k} \Pi_k a_k + T \cdot O \cdot f + \sum_{t} \Theta_t    (3.19)

s.t.

\sum_{k} b_{kz} = 1    \forall z \in Z    (3.20)

\sum_{j,z} D_{jzt} b_{kz} \le CP_k a_k    \forall t \in T, k \in K    (3.21)

\Theta_t \ge \sum_{k,z} \beta_{tz} \lambda_{kz,t}^{(i)} b_{kz} + \mu_t^{(i)} f    \forall t \in T, i \in H    (3.22)
The constraints (3.22) are the Benders cuts generated from the solutions of the subproblems. H is the index set of the cuts generated when the subproblems have optimal solutions. \lambda_{kz,t}^{(i)} is the dual value of constraint (3.17) and \mu_t^{(i)} is the dual value of constraint (3.18) in subproblem(i).
An important observation is that subproblem(i) can always be solved to optimality. Consequently, for this SDDSFP model, no cuts are generated from infeasibility for the master problem, as in typical Benders decomposition.

In sum, the SDDSFP Benders reformulation divides the mixed integer programming problem into one master problem and |H| subproblems, where |H| equals n times the size of T after n iterations. At each iteration n, the current master problem is solved to obtain the values of bkz and f. Given these values, the T subproblems are then solved sequentially to generate T individual optimality cuts, one for each day's customer demand. These T cuts update constraint (3.22) of the master problem. The solving process iterates between the master problem and the subproblems until it reaches a termination criterion, such as a small optimality gap.
The Benders solving procedure is based on the branch-and-cut algorithm. The master problem intentionally moves some constraints of the MIP problem into the subproblems, which makes the master problem much easier to solve than the original MIP. The master problem is thus essentially a relaxation of the original MIP. The subproblem sends parts of these constraints back to the master problem as cuts, tightening the relaxation iteratively based on the incumbent master solution. Finally, with an appropriate termination criterion, the procedure obtains the same optimal solution as the original MIP. This process is known as delayed constraint generation.
In the SDDSFP Benders reformulation, it can be observed that the master problem is a capacitated facility location problem with consideration of fleet cost, while each subproblem is an order assignment problem with multiple commodities, multiple plants and two fulfillment channels. In this case, the interaction between the master problem and the subproblems has a clear physical meaning. The master problem determines the service-available store locations and the fleet size. Through the values of bkz and f, it passes the location and fleet-size decisions to the subproblems. With the available store locations and fleet size, the subproblems decide the order fulfillment plans for each day, including the delivery method (own fleet or 3rd party carrier) and the sourcing decision, i.e., from which store to each destination. Back at the master problem, constraints (3.22) are updated from the subproblems, bringing new requirements for the store locations and fleet size. In sum, the master problem iteratively updates the store selection and fleet size, while the subproblems take charge of the delivery method and sourcing decisions for the customer orders.
3.3.4 Algorithmic Enhancement
In this section, we provide three algorithmic enhancements to accelerate the solution algorithm.
Enhancement of Store Selection
The master problem is a capacitated facility location problem with fleet cost. It uses the binary decision variable ak to represent whether store k is selected by the SDDSFP model or not.
In reality, the list of potential stores could be long due to the existing facility locations. This severely affects the efficiency of solving the model, since the store locations have a great impact on the delivery zone assignment through (3.21). Therefore, a mathematical enhancement of the store selection can be an effective extension that speeds up the solving of the master problem.
We create Algorithm 2, called Enhancement of Store Selection, to substantially reduce the length of the potential store list.
Algorithm 2: Enhancement of Store Selection

Input : the store list K, store setup cost \Pi_k, store processing capacity CP_k, and the daily demand D_{jzt}
Output: the strengthened store selection K_{st}

1 Find the date t_{max} with the maximum total demand;
2 Select stores by solving a simple MIP:
    \min \sum_k \Pi_k a_k    s.t.    \sum_{j,z} D_{jz,t_{max}} \le \sum_k CP_k a_k
  Its solution gives a store list K_{in} with a_k = 1 for all k \in K_{in};
3 Calculate the gap between the total capacity of K_{in} and the maximum total demand on date t_{max}: \Delta CP \equiv \sum_{k \in K_{in}} CP_k - \sum_{j,z} D_{jz,t_{max}};
4 Find the lowest processing capacity CP^0 in K_{in};
5 Build a set of processing capacities S_{CP}: for each k \in K,
  if CP^0 - CP_k \le \Delta CP then
6   add CP_k to S_{CP}
7 end
8 List the stores of K whose capacity is in S_{CP}, denoted K|S_{CP}; list the stores of K_{in} whose capacity is in S_{CP}, denoted K_{in}|S_{CP};
9 Calculate the total distance d_k between all delivery zones and each store;
10 For all k \in K|S_{CP}, calculate a ratio combining the two factors, distance and setup cost:

    ratio(k) = \frac{d_k - \frac{1}{n}\sum_{k \in K|S_{CP}} d_k}{\frac{1}{n}\sum_{k \in K|S_{CP}} d_k} + \frac{\Pi_k - \frac{1}{n}\sum_{k \in K|S_{CP}} \Pi_k}{\frac{1}{n}\sum_{k \in K|S_{CP}} \Pi_k}

11 Apply the above equation to K_{in} and let M be the maximum resulting ratio value;
12 Return the strengthened store selection K_{st} = K_{in} \cup \{k \in K|S_{CP} : ratio(k) \le M\}
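Step 10's score can be sketched directly; the three-store distances and setup costs below are made up, and the score is simply the sum of normalized deviations from the two means, where smaller is better:

```python
def ratio(d, setup, mean_d, mean_setup):
    """Step-10 score: normalized distance deviation plus normalized cost deviation."""
    return (d - mean_d) / mean_d + (setup - mean_setup) / mean_setup

dists = [120.0, 80.0, 100.0]        # total zone-to-store distances d_k
setups = [5000.0, 7000.0, 6000.0]   # setup costs Pi_k
mean_d = sum(dists) / len(dists)
mean_s = sum(setups) / len(setups)
scores = [ratio(d, s, mean_d, mean_s) for d, s in zip(dists, setups)]
# far-but-cheap and near-but-expensive stores trade off against the
# average store, and the normalized deviations sum to zero over the list
```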
Without Algorithm 2, it could be onerous for the Benders master problem to converge to the optimal store selection. From (3.19) to (3.22), the master problem at the root node, without presolving, obtains the same solution as the store list Kin, i.e., the least store setup cost subject only to the store processing capacity. The location decisions change gradually as cuts are added from the subproblems via (3.22). The Benders cuts generated from the subproblem solutions bring transportation cost into the location decision of the master problem. The subproblem solutions largely depend on two factors from the master problem: the store location decision ak and the delivery zone assignment bkz. With a long list of potential stores, making the location decision can be very inefficient when solving the model.
Algorithm 2 simulates the location decision process of the Benders model but simplifies the vehicle routing and the 3rd party carrier delivery into distance-based criteria. First, it takes Kin as the starting point by solving part of the MIP derived from the master problem. It then computes the total distance between all delivery zones and each store to mimic the traveling distance for order delivery. Next, it ranks all other stores by their capacity in order to establish the order for swapping stores from K \ Kin into Kin. In addition, it creates a ratio combining the distance and setup cost as the measure of importance for k in K \ Kin and k in Kin respectively; a smaller ratio is better. Finally, all stores in K \ Kin with ratio values smaller than the largest ratio value in Kin are potential stores to be swapped in. The combination of Kin and these small-ratio stores constitutes the strengthened store list.
The assumption that makes Algorithm 2 valid is that the store setup cost contributes more than the transportation cost to the location decision. Since our problem concerns seasonal planning, this assumption holds in the model. If the time horizon changes and the assumption no longer holds, Algorithm 2 can adapt to the new scenario by using a different starting point Kin. When the logistics cost dominates the location decision, Kin is generated from a simple MIP model of the logistics cost. When the logistics cost and the setup cost contribute equally, Kin is created from a combination of the simple MIP models of the logistics cost and of the setup cost. Adopting different Kin in this way extends the applicability of Algorithm 2.
The results of Algorithm 2 are treated as initial cuts added to the master problem. The cut for the strengthened store list Kst is added as \sum_{k \in K_{st}} a_k = n(K_{st}), and cuts for the non-selected stores are added as a_k = 0, \forall k \in K \setminus K_{st}.
Cut Strengthening
The convergence of the Benders solving algorithm highly depends on the quality of the cuts generated from the subproblems. We implement three methods to strengthen the Benders cuts: subproblem disaggregation, Pareto-optimal cuts and tabu cuts.

In the ordinary Benders decomposition algorithm, the master problem is solved and passes its variable values to the subproblem, which is solved with these variables fixed, generating one cut for the master problem in each iteration. Based on the structure of our model, we retain all time-related variables in the subproblem and further divide it into T independent subproblems; see equations (3.16) to (3.18) for details. In this way, T cuts instead of one can be generated per iteration, reflecting the different quantitative and spatial distributions of customer orders over the T days. The disaggregation of subproblems provides a better estimate of the transportation cost for each daily demand than a single estimate for the whole time horizon. Therefore, the corresponding cuts are generally more effective.
Although the T disaggregated cuts better approximate the transportation assignment, they might, on the other hand, increase the solving time of the master problem per iteration, since T additional constraints are added per iteration instead of one. There is a trade-off between tight cuts and the increasing difficulty of the master problem. For this model, where T is not large, the constraint increase is limited for the master problem. Subproblem disaggregation turns out to be very effective in the numerical experiments.
The T disaggregated subproblems estimate the optimal transportation cost for each daily demand with a fixed delivery zone assignment, but the daily delivery plan can vary. Therefore, there may be multiple dual solution values \lambda_{kz}^{(i)} and \mu^{(i)} for the ith subproblem, and these multiple solutions yield alternative cuts from the ith subproblem. In order to select the strongest cut, we adopt a method derived by Magnanti and Wong.
If (\lambda_{kz}^{(i),(1)}, \mu^{(i),(1)}) and (\lambda_{kz}^{(i),(2)}, \mu^{(i),(2)}) are two dual solutions of subproblem(i) (3.16) to (3.18), then

\sum_{k,z} \beta_{tz} b_{kz} \lambda_{kz}^{(i),(1)} + f \mu^{(i),(1)} = \sum_{k,z} \beta_{tz} b_{kz} \lambda_{kz}^{(i),(2)} + f \mu^{(i),(2)}

When b_{kz}^{\star} and f^{\star} are the final optimal solutions, if

\sum_{k,z} \beta_{tz} b_{kz}^{\star} \lambda_{kz}^{(i),(1)} + f^{\star} \mu^{(i),(1)} \ge \sum_{k,z} \beta_{tz} b_{kz}^{\star} \lambda_{kz}^{(i),(2)} + f^{\star} \mu^{(i),(2)}

then the cut from (\lambda_{kz}^{(i),(1)}, \mu^{(i),(1)}) is said to dominate the cut from (\lambda_{kz}^{(i),(2)}, \mu^{(i),(2)}). Magnanti and Wong called the non-dominated cut the Pareto-optimal cut, corresponding to the point (\lambda_{kz}^{(i),(1)}, \mu^{(i),(1)}).
The Pareto-optimal cut is the strongest cut from subproblem(i). However, the final optimal solution b_{kz}^{\star} and f^{\star} cannot be known in advance while the solving is in process. Magnanti and Wong introduced an alternative, the core point (b'_{kz}, f'), which is a relative interior point of the convex hull of the original SDD-SFP MIP problem.

Based on that, our Pareto-optimal cut is generated as follows. First, subproblem(i) is solved with the variables fixed by the master problem. When subproblem(i) is solved to optimality with value V(\Theta_t^i), a new linear programming problem is set up to obtain the optimal dual values (\lambda_{kz}^{(i),(1)}, \mu^{(i),(1)}). Finally, based on (\lambda_{kz}^{(i),(1)}, \mu^{(i),(1)}), the Pareto-optimal cut is generated and added to the master problem. The LP-based dual subproblem(i) is
\max \Theta_t^i = \sum_{k,z} \beta_{tz} b'_{kz} \lambda_{kz}^{(i)} + f' \mu^{(i)}    (3.23)

\lambda_{kz}^{(i)} \ge C_{kz}    \forall k \in K, z \in Z    (3.24)

\lambda_{kz}^{(i)} \ge \alpha_{kz} \mu^{(i)}    \forall k \in K, z \in Z    (3.25)

V(\Theta_t^i) = \sum_{k,z} \beta_{tz} b_{kz} \lambda_{kz}^{(i)} + f \mu^{(i)}    (3.26)
Here, (3.23), (3.24), and (3.25) are the dual reformulations of (3.16), (3.17), and (3.18), respectively. Equation (3.26) is the constraint that forces the problem to attain the same optimal objective value V(\Theta_t^i) as subproblem(i).
We obtain the core point (b'_{kz}, f') from the linearly relaxed initial master problem; it is an inner point of the feasible region of the master problem. Since the optimal location decision swaps out initially selected stores, following the logic of Algorithm 2, (b'_{kz}, f') tends to lie near the optimal solution (b_{kz}^{\star}, f^{\star}). Note that no matter how (b'_{kz}, f') is selected, the cuts from dual subproblem(i) are valid and contribute to the convergence of the Benders algorithm. When (b'_{kz}, f') is a relative interior point of the convex hull of the problem, it yields Pareto-optimal dual values (\lambda_{kz}^{(i),(1)}, \mu^{(i),(1)}), which further strengthen the T independent cuts from the subproblem disaggregation.
Tabu cuts are the third method to improve the quality of the Benders cuts. A tabu cut eliminates the recurrence of the current incumbent solution in the branch-and-bound tree of the Benders master problem. Blocking some feasible solutions with tabu cuts helps nodes to be fathomed faster and further reduces the size of the search tree.
When the master problem obtains a new incumbent solution, it generates a cut on the binary variables b_{kz}. Given the feasible master solution \bar{b}_{kz}, let \bar{K} = \{k \in K \mid \bar{b}_{kz} = 1\} and \bar{Z} = \{z \in Z \mid \bar{b}_{kz} = 1\}; the tabu cut is defined by

\Delta(b_{kz}, \bar{b}_{kz}) = \sum_{k \in \bar{K}, z \in \bar{Z}} (1 - b_{kz}) + \sum_{k \in K \setminus \bar{K}, z \in Z \setminus \bar{Z}} b_{kz} \ge 1

By adding \Delta(b_{kz}, \bar{b}_{kz}) \ge 1 to the master problem, \bar{b}_{kz} is no longer considered at the following nodes.
The reason for selecting b_{kz} rather than other decision variables is that, first, the values of b_{kz} contain the information of both the store selection a_k and the delivery zone assignment through equation (3.21). In addition, \Delta(b_{kz}, \bar{b}_{kz}) \ge 1 is suitable for limiting changes of binary and integer values rather than continuous variables.
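The tabu cut's left-hand side is just a Hamming distance between a 0/1 assignment and the incumbent, which the following sketch makes explicit; the four-entry vectors stand in for the flattened b_kz matrix and are illustrative only:

```python
def tabu_lhs(b, b_bar):
    """Left-hand side of the tabu cut: Hamming distance between b and b_bar."""
    return sum((1 - bi) if ref == 1 else bi for bi, ref in zip(b, b_bar))

incumbent = [1, 0, 1, 0]                      # flattened incumbent b_kz
lhs_same = tabu_lhs(incumbent, incumbent)     # 0: the incumbent violates >= 1
lhs_flip = tabu_lhs([1, 1, 1, 0], incumbent)  # 1: any other point satisfies it
```

Requiring the left-hand side to be at least 1 therefore excludes exactly the incumbent and nothing else.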
Parallel Search Trees
Our model solving methodology includes not only the first search tree, from Benders decomposition, but also a second search tree of the original MIP model with conditional constraints. The lifting cuts and incumbent updates from the second search tree can result in significant reductions in solution runtime.
The parallel search trees between the Benders master problem and the original MIP problem in our framework are shown in Figure 3.1, where the circled numbers stand for the steps.
We start solving the 1st search tree with the enhancement of store selection and the strengthened Benders cuts. If the gap between the lower and upper bounds keeps shrinking over a certain number of iterations, the 1st search tree keeps running without any additional steps. When it is solved to the end, it reports the optimal solution for the model at step 1 and the whole solving process finishes. When the gap ceases to move for a certain number of iterations, the 1st search tree is put on hold, step 2 is triggered, and the procedure passes the current upper
[Figure 3.1: Parallel search trees between the master problem of Benders and the original MIP problem. The master problem and the original MIP problem exchange the Benders initial incumbent, MIP lifting cuts, Benders master incumbents, better adjacent solutions as new incumbents, and the optimal solution, over steps 1 to 7.]
bound \lceil \Psi_0 \rceil and lower bound \lfloor \Psi_0 \rfloor of the 1st search tree to the 2nd search tree as additional constraints:

\Omega \le \lceil \Psi_0 \rceil    (3.27)

\Omega \ge \lfloor \Psi_0 \rfloor    (3.28)
We then start solving the 2nd search tree with (3.27) and (3.28). If the gap between the lower and upper bounds keeps shrinking over a certain number of iterations, the 2nd search tree keeps running without additional steps. When it is solved to the end, it reports the optimal solution for the model at step 3 and the whole solving process finishes. When the gap ceases to move for a certain number of iterations, the 2nd search tree is terminated, keeping the incumbent solutions and the upper and lower bounds \lceil \Omega_0 \rceil and \lfloor \Omega_0 \rfloor. At step 4, if \lceil \Omega_0 \rceil < \lceil \Psi_0 \rceil, the cut \Psi \le \lceil \Omega_0 \rceil is added to the master problem of the 1st search tree. If \lfloor \Omega_0 \rfloor > \lfloor \Psi_0 \rfloor, the cut \Psi \ge \lfloor \Omega_0 \rfloor is added to the master problem as well. These cuts may help the master problem tighten its bounds and facilitate fathoming the remaining nodes. We call them MIP lifting cuts.
Back at the 1st search tree, we continue to solve it with the MIP lifting cuts. Steps 5, 6 and 7 borrow the idea of the local branching strategy (Fischetti and Lodi 2003) to improve the upper bound of the 1st search tree of Benders decomposition.

The main idea behind local branching is to divide the feasible region of the problem into smaller subregions so that a generic solver can find high-quality solutions effectively. The recent advancement of generic solvers facilitates the implementation of local branching, since they can efficiently solve small instances of hard problems.
We apply the local branching strategy within our parallel search tree solution system. The 2nd search tree adopts additional constraints that divide the solution space into small subregions, from which it can obtain high-quality solutions for each subregion. The 1st search tree keeps branching and cutting and is updated with these solutions to accelerate the Benders process for solving the large-scale problem. To make this work, the following questions are investigated:
• How do we divide the solution space into small subregions for the 2nd search tree?
• What is the strategy for applying the different kinds of subregion solutions (e.g., optimal, infeasible, feasible within the time limit) to the 1st search tree?
The solution space can be partitioned based on the incumbent solution from the 1st search tree. We adopt a method similar to the generation of tabu cuts. Given the incumbent solution b̄kz, let K̄ = {k ∈ K | b̄kz = 1} and Z̄ = {z ∈ Z | b̄kz = 1}. The feasible region can then be divided into two parts by the two inequalities

Δ(bkz, b̄kz) ≤ 𝒦 (3.29)
Δ(bkz, b̄kz) ≥ 𝒦 + 1 (3.30)

where Δ(bkz, b̄kz) = Σ_{k∈K̄, z∈Z̄}(1 − bkz) + Σ_{k∈K\K̄, z∈Z\Z̄} bkz, and 𝒦 is a positive integer that controls the size of the explored neighborhood.
Without loss of generality, we use only the binary variables bkz to illustrate steps 5, 6, and 7 of the parallel search tree method; the approach applies to the other binary and integer variables as well. It can also work for continuous variables by rounding to the nearest integer. It is worth noting that mixing discrete and continuous variables in (3.29) and (3.30) is invalid, because the unit of value change differs. In practice, it is straightforward to derive and keep several sets of (3.29) and (3.30) for different decision variables.
The physical meaning of Δ(bkz, b̄kz) is to count how many bkz change their value from 0 to 1 or from 1 to 0. Adding (3.29) as a constraint to the 2nd search tree limits the number of changed bkz values to at most 𝒦. By imposing an appropriate value of 𝒦, this creates a small subregion near the incumbent solution b̄kz. Inequality (3.30), on the other hand, stands for the mutually exclusive complementary subregion, and the two parts together constitute the whole solution space. Therefore, the 2nd search tree with (3.29) can be solved efficiently by a generic solver, and the optimal solution of the subregion is used to update the incumbent of the 1st search tree.
Since this is an iterative updating process between the two search trees, subregions that have already been examined need not be considered in future updates. In addition, the choice of 𝒦 is arbitrary; the subregion may become large and very hard to solve to optimality. It is therefore crucial to have a balanced strategy for applying the subregion solutions to both search trees.
We propose two kinds of limits in the solving strategy, a time limit and a stall limit, to reduce the solution time of each subregion examination. While the time limit is self-explanatory, the stall limit is the number of iterations over which the optimality gap ceases to move. Under these limits, the 2nd search tree with (3.29) is solved until it reaches optimality or hits the time or stall limit.
Let us now consider the possible solutions after each subregion examination. There are four cases for updating the incumbent of the 1st search tree. First, the subregion examination by the 2nd search tree may yield an optimal and improved solution b̄′kz with Ω(b̄′kz) ≤ Ω(b̄kz). In this case, the new optimal b̄′kz is used to update the incumbent and upper bound of the 1st search tree, and since the subregion has been fully examined, it is ruled out from further consideration by applying (3.30) to the 2nd search tree. Second, when the solving process hits the time or stall limit, it may yield a feasible and improved solution b′kz. Then, for the 1st search tree, we update the incumbent and upper bound with b′kz; for the 2nd search tree, since not every feasible point inside the subregion has been examined, only b′kz is excluded from future iterations by adding the constraint Δ(bkz, b′kz) ≥ 1 instead of cutting off the whole subregion. The third case is the counterpart of the second under the same limit situation: it yields a feasible but unimproved solution. This usually shows that the subregion is too big for the generic solver to find an improved solution within the limits. Our algorithm then divides the subregion into smaller parts and revisits one part immediately by removing (3.29) and adding Δ(bkz, b̄kz) ≤ 𝒦 − ⌈𝒦/2⌉ to the 2nd search tree, while the 1st search tree remains on hold. Finally, the last case covers two possible outcomes: infeasibility, or an optimal but unimproved solution. Both show that no improved solution can be found in the subregion. The algorithm then rules out the subregion from further consideration by applying (3.30) to the 2nd search tree,
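The four cases can be summarized as a small dispatcher; the sketch below is illustrative Python with our own status strings and action labels, not solver code:

```python
import math

def subregion_update(status, improved, K):
    """Dispatch the four outcomes of one subregion examination
    (steps 5-7) to actions for the two search trees.
    status: 'optimal', 'limit' (time/stall hit) or 'infeasible';
    improved: whether the solution beats the current incumbent."""
    if status == 'optimal' and improved:
        # case 1: update incumbent; exclude whole subregion via (3.30)
        return {'tree1': 'update incumbent', 'tree2': 'add delta >= K+1'}
    if status == 'limit' and improved:
        # case 2: update incumbent; exclude only the point found
        return {'tree1': 'update incumbent', 'tree2': 'add delta >= 1'}
    if status == 'limit' and not improved:
        # case 3: subregion too big; shrink it to K - ceil(K/2) and retry
        return {'tree1': 'hold',
                'tree2': 'shrink to delta <= %d' % (K - math.ceil(K / 2))}
    # case 4: infeasible, or optimal but unimproved; rule the subregion out
    return {'tree1': 'resume with old incumbent', 'tree2': 'add delta >= K+1'}
```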
(Diagram: nodes 1-7 showing the subregion constraints Δ(bkz, b̄kz) ≤ 𝒦 and Δ(bkz, b̄kz) ≥ 𝒦 + 1 in the 2nd search tree, and the incumbent updates passed to the 1st search tree for optimal/feasible, improved/unimproved solutions.)
Figure 3.2: The solution strategy of the parallel search trees
while the 1st search tree starts to branch and add cuts again with the non-updated incumbent b̄kz.
The communication process from node 1 to node 7 between the two search trees is shown in Fig. 3.2. First, based on b̄¹kz, a subregion is formed at node 2. Solving it yields an optimal and improved solution b̄²kz. Following the solving strategy, the 1st search tree is updated with b̄²kz, and the constraint is reversed into the form Δ(bkz, b̄¹kz) ≥ 𝒦 + 1 to remove the subregion from future consideration at node 3. Then the 2nd search tree is put on hold and the 1st search tree continues to be solved by the accelerated Benders method. After a certain number of iterations, the 1st search tree obtains a new incumbent solution b̄¹²kz and puts its solving process on hold. The 2nd search tree is activated and a new subregion is formed at node 4. The MIP at node 4 is solved and the next steps follow from the solution and the solving strategy. This process keeps pushing down the upper bounds of both search trees. The parallel search tree approach takes advantage of the efficiency of generic solvers on small MIPs and, at the same time, fully utilizes the existing Benders solving structure. More details can be found in Fig. 3.2.
(Diagram: the master problem, with the Pareto-optimal algorithm and the Benders cut generator over the subproblems, runs in parallel with the original MIP problem enhanced by store selection; the two exchange Benders cuts, tabu cuts, MIP lifting cuts, local branching constraints, an initial core point, and the optimal solution.)
Figure 3.3: The solution methodology framework
In sum, the whole solution methodology framework can be seen in Figure 3.3.
3.4 Computational results
We implement the solution approaches, including the MIP and the Benders method with enhancements, in the generic solver CPLEX 12.6. The original problem, Algorithm 2, and the 2nd search tree use the MIP solver in CPLEX directly; the Benders method is implemented via lazy constraints in CPLEX. Lazy constraints are a set of inequalities that help define the feasible region but are not part of the problem when the solver starts. These inequalities are generated iteratively by the subproblems as Benders cuts, and each Benders cut is added to the current model as soon as it turns out to be violated by the current incumbent solution. The lazy constraint approach has proved to be an efficient way of implementing Benders decomposition (Rubin 2011).
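In miniature, and outside any solver API, the lazy-constraint mechanism amounts to filtering a pool of candidate cuts down to those violated by the incumbent; a real implementation would register such a check as a CPLEX lazy-constraint callback. A hedged Python sketch:

```python
def violated_benders_cuts(incumbent, cut_pool, tol=1e-6):
    """Return only those candidate cuts  sum_j a_j * x_j <= rhs  that are
    violated by the current incumbent solution; only these need to be
    added to the model, which is the essence of the lazy approach.
    incumbent: dict variable -> value; cut_pool: list of (coeffs, rhs)."""
    violated = []
    for coeffs, rhs in cut_pool:
        lhs = sum(a * incumbent[j] for j, a in coeffs.items())
        if lhs > rhs + tol:
            violated.append((coeffs, rhs))
    return violated
```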
All the solving algorithms are coded in the Java programming language using the mathematical programming interface of CPLEX. They run under CentOS Linux on a workstation with an 8-core Intel Xeon E7-4830 CPU at 2.13 GHz and 32 GB of memory.

Table 3.1: Summary of data sets

Instance | SKUs | Time Range | Delivery Zone | Store Input
P1 | 1000 | 48 | 278 | 44
P2 | 3000 | 49 | 287 | 44
P3 | 5000 | 49 | 292 | 44
The customer demand data is derived from the distribution of real online sales of a major US retailer in a metropolitan area. The sales data records common online orders fulfilled by a few remote fulfillment centers located around the country. We generate the test customer demand from this distribution and use it as the demand to be fulfilled by the local stores within the same day.
Three data sets with sampled customer demand were used to test the effectiveness of our algorithm. The parameter settings are summarized in Table 3.1.
Other parameters and variables of the model instances are adapted from industrial practices and reasonable assumptions. We choose the city of Chicago as the target to provide the spatial information of delivery zones and the distances between assumed stores and zones. For all instances, Ckz = 5 ∀k, z, meaning the negotiated package shipping cost via a third-party carrier is 5 dollars; the parameters for the own truck fleet are O = 250, WT = 8 hours, MS = 20 mph, ST = 5 minutes, and HC = 30; the number of stores and the parameters for the store setup costs and capacities are randomly generated. There are 44 stores whose locations are distributed in the area according to demography, like real retail outlets. There are 4 levels of store order processing capacity: 400, 550, 600, and 800, corresponding to 4, 14, 23, and 3 of the 44 stores, respectively. The setup costs
Table 3.2: Solving time (minutes) comparison of different solution methods

Instance | MIP | Ordinary Benders | Benders with Store Selection Algorithm | Benders with all Enhancements
P1 | 5.95 | 52.33 | 0.39 | 0.30
P2 | 3.69 | 240.02 | 1.35 | 1.25
P3 | 36.91 | 240.05 | 62.17 | 1.81

Table 3.3: The gaps between MIP and different solution methods

Instance | MIP | Ordinary Benders | Benders with Store Selection Algorithm | Benders with all Enhancements
P1 | 0.00% | 0.00% | 0.00% | 0.71%
P2 | 0.00% | -0.04% | -0.26% | 0.04%
P3 | 0.00% | -0.28% | -0.41% | 0.39%
of stores are uniformly distributed between 30,000 and 60,000 dollars.
For the solver running parameters, we adopt a 4-hour time limit for all algorithms. Following the argument in Cordeau et al. (2006), we adopt the convention of a 1% optimality gap for integer programming models. Given the errors contained in the data estimates, running the solver to a 1% gap is adequate for this supply chain planning problem.
Table 3.4: The improvement of store selection by Algorithm 2

Instance | Stores in | Stores in by Algo 2 | Stores out w/o Algo 2 | Stores out w/ Algo 2 | % of match
P1 | 44 | 3 | 3 | 3 | 100%
P2 | 44 | 16 | 8 | 8 | 100%
P3 | 44 | 20 | 16 | 16 | 100%
Table 3.5: Summary of data set of Instance P4

Instance | SKUs | Time Range | Delivery Zone | Store Input
P4 | 8339 | 49 | 300 | 44
In the initial testing, the store selection has a huge impact on the solving time. Table 3.2 highlights the runtime differences between ordinary Benders and Benders with Algorithm 2. The reason for this observation is that the initial Benders cuts, which are generated from the initial master problem, might not be able to redefine the feasible region of the model. Since lazy constraints become effective only when some of them are violated by the incumbent integer solution, the convergence of Benders is relatively slow. This observation motivates both the store selection enhancement, which reduces the potential store list, and the MIP initial cuts from the parallel search trees, which generate warm starts to accelerate the Benders approach.
Table 3.4 shows that the proposed Algorithm 2 significantly improves the store input list, shrinking it from 44 stores to a small fraction for all three instances without losing feasibility. Judging by the stores selected after optimization, Algorithm 2 helps the Benders master problem maintain a tight feasible region and provides a good starting point for the subproblems.
As a point of comparison, we run the three instances of Table 3.1 with our different solution methods. The results in Table 3.2 show that Benders with all enhancements outperforms the others; for instance P3, it is 20 times faster than the MIP solved in CPLEX. Table 3.3 compares the solution quality of the different methods using the MIP results as the benchmark. The gaps of all methods satisfy the predefined 1% tolerance.
We then introduce the more complex instance P4 of Table 3.5. Increasing the number of SKUs
Table 3.6: Metric comparison of different solution methods for Instance P4

Metric (Instance P4) | MIP | Ordinary Benders | Benders with Store Selection Algorithm | Benders with all Enhancements
Optimality gap | 1.81% | 1.61% | 51.15% | 1%
Solving time (hours) | 4.01 | 4.00 | 4.00 | 2.57
to 8339, Table 3.6 shows the solving time and the optimality gap of the different solution methods. Only Benders with all enhancements reaches the 1% optimality gap within the 4-hour limit.
We also conduct a sensitivity analysis for some key parameters and variables of the model instance. The practical question the model tries to answer is how to implement a same-day delivery service with local stores. Accordingly, our focus is on the key performance variables: the store order processing capacity CPk, the daily operation cost of an own-fleet truck O, and the package shipping cost by third-party carrier Ckz. We design several experiments that sweep over different values of these variables. The experiments are based on instance P3 and solved to optimality by Benders with the enhancing methods. The results can be seen in Figures 3.4, 3.5 and 3.6.
The results indicate that the fleet size is negatively correlated with O and positively correlated with Ckz, while CPk seems to have no specific effect on the fleet size. The number of selected stores drops from 16 to half of that, 8, when CPk increases from around 600 to 1200; increasing CPk beyond 1200 has a relatively small effect on store selection.
3.5 Conclusions
This study introduces a new same-day delivery planning with store fulfillment
problem to capture the current trend of same-day delivery in omni-channel supply
(a) Order delivery methods analysis: Fl orders, Ca orders, and Total orders vs. average store capacity. (b) Cost analysis: Fl cost, Ca cost, Setup cost, and Total cost vs. average store capacity. (c) Number of selected stores vs. average store capacity.
Figure 3.4: Sensitivity analysis of the store order processing capacity
The default average value of CPk is 583.7.
(a) Order delivery methods analysis: Fl orders, Ca orders, and Total orders vs. daily operation cost of a truck. (b) Cost analysis: Fl cost, Ca cost, and Total transporting cost vs. daily operation cost of a truck. (c) Fleet size: number of trucks vs. daily operation cost of a truck.
Figure 3.5: Sensitivity analysis of the daily operation cost of an own-fleet truck
The default truck operation cost O is 250.00 dollars per day.
(a) Order delivery methods analysis: Fl orders, Ca orders, and Total orders vs. package shipping cost by carrier. (b) Cost analysis: Fl cost, Ca cost, and Total cost vs. package shipping cost by carrier. (c) Fleet size: number of trucks vs. package shipping cost by carrier.
Figure 3.6: Sensitivity analysis of the package shipping cost by third-party carrier
The default shipping cost by carrier Ckz is 5.00 dollars per package.
chain for brick-and-mortar retailers. It develops optimization models and solution algorithms for store location, transportation channel selection, and inventory management in order to construct a robust logistics plan for the supply chain. The solution methodology framework includes Benders decomposition, a store selection algorithm, cut strengthening methods, and parallel search trees.
Our method achieves the best runtime results for three large-scale instances derived from real online customer orders. Our store selection algorithm constrains the number of potential store locations, which effectively improves solving efficiency. The cut strengthening methods ensure high-quality Benders cuts, and the parallel search tree approach integrates the Benders model and the MIP model into one solving framework in which the two models provide warm start points to each other. By exploiting the good performance of generic solvers on small MIP instances, the parallel search tree updates the incumbent solutions quickly and iteratively. In sum, our study provides an intuitive and efficient solution methodology for the newly defined same-day delivery planning with store fulfillment problem. Numerical experiments on the key performance variables are presented as well.
Chapter 4
CROSS SOURCING DELIVERY WITH STORE FULFILLMENT
When one takes a close look at same-day delivery with store fulfillment (SDDSF), how to make good fulfillment decisions is an unavoidable problem.
In Chapter 3, we investigated SDD with store fulfillment from the planning perspective, deriving an implementable plan that covers store location, delivery fleet size, and store inventory assignment based on forecast sales data. When it comes to the daily operations of SDD with store fulfillment, an exact order fulfillment plan rather than a seasonal plan needs to be created.
In this chapter, we drill down SDD with store fulfillment from the supply chain planning level to the supply chain operation level, aiming to create an optimal exact order fulfillment plan that specifies, for each received customer order, which store it is sourced from, which delivery option is preferred, and when it is to be picked up.
4.1 Introduction
Recently, the competition for online sales between traditional and online-exclusive
retailers has changed its focus from expanding product availability to customer promise
and product accessibility. One notable example is the recent emergence of same-day
delivery (SDD) options for some types of products, which allows customers to have
desired items delivered to their doors only a few hours after the purchase. Online
retailers with highly flexible supply chains like Amazon have been able to significantly
reduce the delivery time of most of their products by taking advantage of crossdocking and product consolidation strategies over massive fulfillment centers [82], while
traditional retailers can take advantage of their physical stores to fulfill SDD orders
in a direct-to-consumer fashion by drawing inventory from their retail stores [83].
There are several key advantages of using physical stores to fulfill online orders, including short last-mile delivery distances, full utilization of the existing distribution chain, and versatile accessible services. However, it also introduces major challenges that must be addressed in order to guarantee a successful distribution operation, including order consolidation, order assignment, and delivery method selection.
In order to improve operational efficiency and reduce the transportation cost,
we adopt the concept of crowdsourced shipping that utilizes the extra capacity of
the vehicles from private drivers to execute delivery jobs on trips they would make
anyway [84]. Crowdsourced shipping provides a peer-to-peer transportation system
[85], in which we further divide the private drivers into two groups. The first group contains drivers who are willing to make deliveries and to share their forthcoming trips with retailers; we call this group Information Sharing Drivers (ISDs). The other group consists of random store walk-in customers who happen to be willing to deliver a package that has already been picked and packed in the store; we name this group Occasional Drivers (ODs).
Both ISDs and ODs are willing to use their own vehicles to deliver others' packages in return for a small compensation. The main difference between them is that ISDs express their crowdshipping willingness and share their scheduled trips with the retailer, while ODs do the shipping only occasionally, perhaps because of the convenience of the pickup location, near their workplaces or homes, or on the way to their trip destinations.
This study intends to solve the store-based SDD operational challenges by investigating the decisions of both order assignment and delivery options. There are three available delivery options: a self-operated or carrier-operated truck fleet, ODs, and ISDs. We name the problem same-day delivery with crowdshipping and store fulfillment (SDD-CSF); it aims to create an optimal fulfillment plan for sourcing local online orders from nearby retail stores.
The order delivery operation that we consider is assumed to be conducted by a traditional brick-and-mortar retailer in the following manner. The system keeps receiving customer orders for delivery, and each time horizon consists of several time periods. At the end of each time period, the system consolidates the orders that arrived in the current period with the unfulfilled orders from previous periods, then assigns them to specific stores with preferred delivery methods. We consider three last-mile transport options: truck, ISD, and OD. At the end of the horizon, unfulfilled orders incur a penalty cost.
To model SDD-CSF, we develop a set of exact solution approaches for order fulfillment in the form of a rolling horizon framework. It repeatedly solves a series of order assignment and delivery planning problems along the timeline in order to construct a daily optimal fulfillment plan from local stores. Our study makes the following key contributions:
1. The study presents a rolling horizon framework with an exact solution approach to provide an optimal order fulfillment plan from nearby retail stores.
2. We adopt crowdsourced shipping for SDD and consider two novel types of private drivers: occasional drivers and information sharing drivers.
3. The optimization model incorporates the predicted future demand into the order assignment decision, minimizing the immediate delivery cost plus the resulting expected future cost.
4. The study develops a feedback control system to cope with inaccurate forecasts of future demand in the rolling horizon framework.
5. Various computational experiments are conducted to quantify the benefits by comparing the SDD-CSF model with some myopic operational practices.
4.2 Related Works
This literature review covers three components: last-mile delivery, crowdsourced shipping, and SDD. The SDD-CSF problem deals with the hourly and daily operation of same-day delivery with store fulfillment. It contains two main steps of supply chain operations: order assignment to a specific store, and selection of the last-mile delivery option.
Typically, retailers assign online sales immediately to the closest distribution location that has available stock and deliver the items as a package to the customer as soon as possible [86]. In this case, shipping individual items between locations lacks economies of scale and results in high transportation costs in last-mile deliveries.
The operational limitations of last-mile delivery give rise to an emerging field of research. Xu et al. [87] improve online last-mile delivery by revising the existing plan: the study consolidates possible orders together for delivery and creates a cost-effective fulfillment plan, an order-warehouse assignment, for online retailing. Instead of solving the big NP-hard problem to optimality, they tackle the problem by reassigning customer orders to improve the myopic initial fulfillment plan automatically generated by e-tailers; two heuristics are constructed for this purpose. Acimovic and Graves [88] base fulfillment decisions not only on the current on-hand customer orders but also on an estimate of future orders. They develop a transportation linear programming model to reassign customer orders so as to minimize the immediate and estimated future outbound last-mile delivery costs together. Mahar and Wright [86] improve online fulfillment assignment decisions by deliberately postponing the immediate assignment. They create a framework of policies that uses the postponement to accumulate online orders, which are then assigned to distribution locations based on inventory, shipping, and customer wait costs. That study argues that the right policy of postponing order allocation benefits both the inventory and transportation costs of order fulfillment.
These studies address last-mile delivery in different ways, including order reassignment, accommodation of estimated future orders, and deliberate postponement to accumulate online orders. What they have in common is the goal of achieving economies of scale in order fulfillment. Our study follows this research idea and aims at the same goal. We propose three novel methods: a rolling horizon solving framework to accumulate online orders, estimation and calibration of future demand, and integration of crowdsourced shipping as one of the delivery options.
Crowdsourced shipping (crowdshipping) is facilitated by advances in communication technologies and the flexible resources of the sharing economy [89, 90]. Private drivers turn into occasional couriers who deliver packages in a peer-to-peer fashion for a small compensation. Consequently, in terms of last-mile delivery, retailers may benefit from offering a crowdshipping option for online orders, since crowdsourced shipping has the potential to exploit drivers' existing trips and thereby significantly reduce transportation costs.
Research on crowdshipping is still in its infancy. We divide the past literature into two parts: operation optimization and empirical studies.
Operation optimization focuses on maximizing the efficiency of the drivers while minimizing the operation cost. Archetti et al. [91] investigate the emerging business model in which in-store customers deliver goods ordered by online customers. Each customer can make at most one delivery, and these deliveries cannot be carried out by professional drivers such as package delivery carriers. They model the problem as a mixed integer program minimizing the total transportation cost, combining the cost of professional drivers and the compensation for crowdshipping customers. The willingness of the crowdshipping customers is modeled by the detour ratio, and two kinds of compensation schemes are considered, in which the compensation rate is based either on the distance of the detour or on the travel distance between the store and the shipping address. Arslan et al. [84] discuss the application of crowdshipping through modeling a peer-to-peer transportation platform that receives upcoming information on both delivery tasks and drivers' trips. They focus on optimizing the delivery route and the carried tasks for each individual driver. The routing constraints include the number of stops, the willingness in terms of driving time, delivery time windows, and precedence constraints that enforce pickup before delivery. Wang et al. [92] consider a crowdshipping model over a network of storage facilities with prior knowledge of a large pool of available drivers and their trip information. They model the problem as an assignment optimization problem and further convert it to a min-cost network flow problem that minimizes the total compensation. The origin and destination of each delivery task are predefined, and the compensation is measured by the additional travel distance. Kafle et al. [93] investigate a crowdshipping system combining truck carriers to transship packages and crowdsourced shippers to perform last-mile delivery. The crowdsourced shippers can be cyclists and pedestrians from the general public, and their willingness is represented by bids rather than detour distance. An optimization model determines the crowdsourced shipper assignment, the corresponding pickup points, and the truck schedule to minimize the total bidding price, the truck cost, and the penalty cost for servicing customers outside their desired time windows.
Empirical studies intend to evaluate the performance of existing crowdsourced shipping systems and to explore the public view of and motivation for crowdshipping. Paloheimo et al. [94] examine the application of crowdsourced deliveries for delivering and returning books for a local library. Based on 6 weeks of data, they discuss the impact of the service on the roles of customers, shippers, the library, the local community, and the environment. Devari et al. [85] study people's attitudes toward and motivations for crowdshipping through public survey results. They investigate the impact of crowdshipping through the social network of the traveler, such as co-workers, neighbors, and friends, to ensure speedy and reliable last-mile delivery. Punel and Stathopoulos [95] likewise provide an exploratory examination of the public view of crowdsourced shipping through survey results. They look into the effects of the characteristics of crowdshipping drivers, such as expertise and rating, for different types of delivery tasks.
For the same-day delivery part, researchers mainly examine how to operate the delivery vehicles as a dynamic pickup and delivery problem within the span of a single day. Azi et al. [74], Voccia et al. [76], and Klapp et al. [96] essentially address a vehicle routing problem with dynamic pickup and delivery operations, maximizing the expected number of served stochastic orders while minimizing the vehicle operating cost and the penalties for open orders that remain unserved at the end of the period. Details can be found in the literature review of Chapter 3.
To conclude from the related literature, to the best of our knowledge, the problem of same-day delivery with crowdshipping and store fulfillment has not been fully investigated so far. SDD-CSF combines the study of order assignment with that of last-mile delivery. We include crowdshipping as one option for the last-mile delivery of online retail orders and further divide crowdshippers into two separate groups, Occasional Drivers and Information Sharing Drivers, in consideration of potentially practical schemes. This feature has not been investigated before either.
4.3 Methodology
In this section, we formally formulate SDD-CSF in the form of a dynamic program. We then introduce the rolling horizon as a time-based method for consolidating resources and customer orders. A cost function is formed to model the myopic fulfillment plan for received orders only. Crowdshipping, as one option for last-mile delivery, is investigated and modeled as an important component of the cost function. Further, the solution algorithm for the SDD-CSF dynamic program is discussed. Future orders are incorporated into the cost function so that the fulfillment decision minimizes the cost of both current and estimated future order deliveries. A feedback control method is also developed in the solution algorithm to calibrate the forecast of customer orders.
The proposed SDD-CSF model considers both currently received orders and forecast
orders, and can be treated in dynamic programming form [88]. The state of the
system S is defined by the unfulfilled customer orders, the available transportation
resources, and the inventory in each store. The post-decision Bellman equation can
be expressed as:

V(S) = min_{a ∈ A(S)} { C(a) + V(S^a) }    (4.1)

where V is the value function, a is a decision (an order fulfillment plan), A(S) is the
set of feasible fulfillment plans, C is the cost function for fulfillment plan a, and
S^a is the state reached after applying decision a.
The states evolve based upon newly received customer orders and the decisions of
previous states through the time horizon, which is treated as one day in consideration
of SDD.
Later, we discuss how a linear programming model (CF) is used to approximate the
value function of the future state, V(S^a). Ultimately, the SDD-CSF model combines
CF and C into a single optimization model.
4.3.1 Rolling Horizon
Since orders and transportation resources can be consolidated by time interval, this
study proposes a time-based rolling horizon framework that solves the SDD-CSF
problem repeatedly at each time t within the time horizon T ≡ [0, T].
At time t, previously unfulfilled orders and newly arrived orders are combined as the
received orders input to the model. For an individual order, the model output provides
a series of fulfillment actions specifying the store to be sourced, the selected delivery
option, and the delivery pickup time within [t, T].
Once the operational fulfillment plan is produced by the SDD-CSF model, the retailer
carries out the plan during the interval between t and t + 1. This changes the status
quo of unfulfilled orders, inventory levels, and transportation resources, while new
customer orders arrive and are recorded. At time t + 1, the model runs again with the
new inputs and resources. This process solves a series of SDD-CSF models iteratively
until T.
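The rolling-horizon procedure above can be sketched in Python. This is a minimal illustration only: collect_new_orders and solve_sdd_csf are hypothetical stand-ins for the arrival process and the SDD-CSF model developed later in this chapter, not the actual MILP.

```python
# Toy sketch of the rolling-horizon loop. The two helper functions are
# placeholders, NOT the real order process or optimization model.

def collect_new_orders(t):
    # Toy arrival process: two orders arrive each hour.
    return [f"order_{t}_{n}" for n in range(2)]

def solve_sdd_csf(orders, t, T):
    # Placeholder "plan": fulfill every order except the last one,
    # which rolls over as unfulfilled to time t + 1.
    return {"fulfilled": orders[:-1], "unfulfilled": orders[-1:]}

def rolling_horizon(T):
    unfulfilled, fulfilled = [], []
    for t in range(T):
        received = unfulfilled + collect_new_orders(t)  # consolidate by interval
        plan = solve_sdd_csf(received, t, T)            # solve for [t, T]
        fulfilled += plan["fulfilled"]                  # carry out during [t, t+1]
        unfulfilled = plan["unfulfilled"]               # roll over to t + 1
    return fulfilled, unfulfilled

done, open_orders = rolling_horizon(T=4)
```

The essential structure is the feedback between iterations: whatever the placeholder solver leaves unfulfilled at t becomes part of the received orders at t + 1, exactly as described above.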
4.3.2 The Cost Function for Received Orders
We express the cost function C by integrating the unfulfillment penalty cost and
the delivery costs. The delivery costs can be further broken down into three parts,
for the own fleet (truck), ISDs, and ODs respectively.
The truck cost is defined as the summation of per-package delivery costs for assigned
orders. The per-package delivery cost depends on the distance between the store
location and the order shipping address. The delivery rate is based on the FedEx
SameDay® service [97] with a reasonable corporate discount.
The ISD cost is a compensation proportional to the truck cost, and is assumed to
be much less than the truck cost. Also, we consider a restricted willingness to
detour, measured by the extra travel distance [91]. When the detour distance is
larger than a threshold length, the order is not assigned to that ISD driver.
The ODs come from random store walk-in customers who happen to serve as couriers
for one available package each. We assume the per-package cost of an OD is the
lowest among the three delivery options. Whether an order is fulfilled by ODs
naturally depends on chance: more walk-in customers, higher compensation, and
longer package waiting time lead to a higher probability of order fulfillment by OD.
Unlike the employed truck and scheduled ISDs, ODs have no expected arrival time.
Orders must be picked and packed before any OD can pick them up for delivery.
In sum, order fulfillment by ODs has two unique characteristics:
1. ODs perform delivery with uncertainty. Orders that are assigned to ODs but
cannot be delivered by them may be executed by other delivery options or
remain unfulfilled, which eventually costs much more since the delivery cost
of OD is the lowest.
2. When orders are assigned to be delivered by ODs, the store needs to pack
the orders as soon as possible to make them ready for pickup. The in-store
processing time of an order is assumed to be zero.
We assume that the per-package costs for truck and ISD trips are known, and that
the probability for ODs to pick up a specific package is given by an algorithm
discussed later. Furthermore, each ISD or OD can perform at most one delivery.
We formally define the cost function and related constraints as a mixed-integer
programming formulation.
The formulation divides the unfulfilled orders into two parts, the non-fixed-sourcing
orders and the fixed-sourcing orders, denoted by the sets I and J respectively. The
non-fixed-sourcing orders are those that can be sourced from any available store;
they include orders newly arrived at time t and previously unfulfilled orders that
either have no delivery option or are prepared to be delivered by truck or ISDs.
The fixed-sourcing orders, on the other hand, can be sourced only from the
previously assigned store; they come from unfulfilled orders prepared to be
delivered by ODs.
The reason for this division is the second characteristic of ODs. Once an order is
assigned to an OD, the sourcing store packs it as soon as possible. The sourcing
store is therefore fixed, although the delivery method may still change at t + 1 of
the rolling horizon. In contrast, for orders assigned to truck or ISD that have not
been delivered by the current time t, the fulfillment plan can be fully changed at
t + 1, with respect to both the delivery method and the sourcing store.
Notation
The notation used in the proposed model is summarized as follows.
Indices and sets:
I        set of non-fixed-sourcing orders, indexed by i.
J        set of fixed-sourcing orders, indexed by j.
[t0, T]  set of service hours, indexed by t, where t0 is the current time (start
         point) and T is the end of the horizon; [t0, T] ⊂ T.
K        set of stores, indexed by k.
H        set of information sharing drivers, indexed by h.
Customer demand and order fulfillment related parameters:
d_i       the quantity of demand in order i.
m_{kt}    the forecast number of walk-in customers at store k at hour t.
R_k       the ratio of walk-in customers willing to be occasional drivers at store k.
A_k       the inventory in store k at the beginning of the horizon t0.
n^V_k     the processing (picking and packing) capacity per unit time of store k
          for orders delivered by own vehicles.
n^OD_k    the processing (picking and packing) capacity per unit time of store k
          for orders delivered by ODs.
n^ISD_k   the processing (picking and packing) capacity per unit time of store k
          for orders delivered by ISDs.
θ         the penalty cost for a package that cannot be fulfilled in the same day.
Shipping related parameters:
c^V_{ko}       the shipping cost for order o from store k by own vehicles, o ∈ I ∪ J.
c^OD_{ko}      the shipping cost for order o from store k by occasional drivers, o ∈ I ∪ J.
c^ISD_{ko}     the shipping cost for order o from store k by information sharing
               drivers, o ∈ I ∪ J.
q_{ht}         binary parameter indicating whether driver h is available at time t.
Ω              the coefficient of driver willingness to detour for order fulfillment.
ρ^PLAN_h       the original travel distance of ISD h.
ρ^DETOUR_{hko} the detoured travel distance for ISD h to fulfill order o from store k,
               o ∈ I ∪ J.
p^T_{tko}      the probability for order o to be picked up at store k by an OD between
               time t and the end of the horizon, o ∈ I ∪ J.
λ_{jk}         binary parameter indicating whether fixed-sourcing order j is packed in
               store k, based on the previous assignment.
Decision variables:
a_{ikt}  binary variable: order i is sourced from store k by the own fleet (truck)
         at time t, or not.
b_{ikt}  binary variable: order i is sourced from store k by occasional drivers at
         time t, or not.
e_{ikh}  binary variable: order i is sourced from store k by ISD h, or not.
α_{jkt}  binary variable: fixed-sourcing order j is sourced from store k by own
         vehicles at hour t, or not.
β_{jkt}  binary variable: fixed-sourcing order j is sourced from store k by
         occasional drivers at hour t, or not.
γ_{jkh}  binary variable: fixed-sourcing order j is sourced from store k by ISD h,
         or not.
Myopic Formulation
The model for received SDD orders can be formulated as follows:
min ∑_{ik} ( ∑_t c^V_{ki}·a_{ikt} + ∑_t ( p^T_{tki}·c^OD_{ki} + (1 − p^T_{tki})·θ )·b_{ikt} + ∑_h c^ISD_{ki}·e_{ikh} )
  + ∑_{jk} ( ∑_t c^V_{kj}·α_{jkt} + ∑_t ( p^T_{tkj}·c^OD_{kj} + (1 − p^T_{tkj})·θ )·β_{jkt} + ∑_h c^ISD_{kj}·γ_{jkh} )
  + θ·( |I| − ∑_{ikt} a_{ikt} − ∑_{ikt} b_{ikt} − ∑_{ikh} e_{ikh} )
  + θ·( |J| − ∑_{jkt} α_{jkt} − ∑_{jkt} β_{jkt} − ∑_{jkh} γ_{jkh} )    (4.2)

s.t.
∑_i b_{ikt} + ∑_j β_{jkt} ≤ R_k·m_{kt}    ∀k ∈ K, t ∈ {t0, ..., T}    (4.3)
∑_{kt} a_{ikt} + ∑_{kt} b_{ikt} + ∑_{kh} e_{ikh} ≤ 1    ∀i ∈ I    (4.4)
∑_{kt} α_{jkt} + ∑_{kt} β_{jkt} + ∑_{kh} γ_{jkh} ≤ 1    ∀j ∈ J    (4.5)
∑_t α_{jkt} + ∑_t β_{jkt} + ∑_h γ_{jkh} ≤ λ_{jk}    ∀j ∈ J, k ∈ K    (4.6)
∑_i d_i·( ∑_t a_{ikt} + ∑_t b_{ikt} + ∑_h e_{ikh} ) ≤ A_k    ∀k ∈ K    (4.7)
∑_i a_{ikt} + ∑_j α_{jkt} ≤ n^V_k    ∀k ∈ K, t ∈ {t0, ..., T}    (4.8)
∑_i b_{ikt} + ∑_j β_{jkt} ≤ n^OD_k    ∀k ∈ K, t ∈ {t0, ..., T}    (4.9)
∑_{ih} q_{ht}·e_{ikh} + ∑_{jh} q_{ht}·γ_{jkh} ≤ n^ISD_k    ∀k ∈ K, t ∈ {t0, ..., T}    (4.10)
∑_{ik} e_{ikh} + ∑_{jk} γ_{jkh} ≤ 1    ∀h ∈ H    (4.11)
Ω·ρ^PLAN_h ≥ ρ^DETOUR_{hki}·e_{ikh}    ∀i ∈ I, k ∈ K, h ∈ H    (4.12)
Ω·ρ^PLAN_h ≥ ρ^DETOUR_{hkj}·γ_{jkh}    ∀j ∈ J, k ∈ K, h ∈ H    (4.13)
The objective function (4.2) represents the cost function C of an order fulfillment
plan defined by the decision variables. It contains transportation costs and
unfulfillment penalty costs for both the non-fixed-sourcing orders I and the
fixed-sourcing orders J, and each transportation cost in turn contains the costs of
the three delivery modes. For orders assigned to ODs, a pickup probability represents
the uncertainty of ODs, which is further discussed in the next section.
The constraints of the model show that operations are confined by resources,
including store inventory, store processing capacity, and transportation capacity.
Constraints (4.3) enforce that the number of orders assigned to ODs is no more than
the number of potential ODs. Constraints (4.4) and (4.5) guarantee that at most one
of the three delivery options is chosen for each order. Constraints (4.6) ensure
that the fixed-sourcing orders are sourced only from the previously assigned store.
Constraints (4.7) impose the restriction on store inventory. Constraints (4.8),
(4.9) and (4.10) capture the store processing capacity for the three delivery
options. Constraints (4.11) guarantee that each ISD h is assigned at most one order.
Constraints (4.12) and (4.13) express the detouring willingness of the ISDs.
The willingness of ISDs is expressed through constraints (4.12) and (4.13), which
borrow the idea of flexibility parameters from Archetti et al. [91]. For example,
when Ω = 1.5, the driver is willing to take an order that makes the total driving
distance at most 1.5 times his or her planned travel distance. Accordingly,
ρ^DETOUR_{hko} is defined over three travel segments: from the driver's origin to
the store, from the store to the order shipping address, and from the order shipping
address to the driver's destination. All three segments are known before solving
the model.
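The detour-willingness check behind constraints (4.12) and (4.13) can be sketched as follows. The Euclidean distances and coordinates are illustrative assumptions; the model only requires that the three travel segments be known in advance.

```python
# Sketch of the ISD detour-willingness feasibility check. Coordinates and
# Euclidean distance are illustrative assumptions for exposition.
import math

def euclid(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def detour_distance(origin, store, address, destination):
    # rho_DETOUR: origin -> store -> shipping address -> destination.
    return (euclid(origin, store) + euclid(store, address)
            + euclid(address, destination))

def isd_can_take(origin, destination, store, address, omega=1.5):
    planned = euclid(origin, destination)  # rho_PLAN: the driver's own trip
    detoured = detour_distance(origin, store, address, destination)
    # Feasible pairing only if Omega * rho_PLAN >= rho_DETOUR, as in (4.12).
    return omega * planned >= detoured
```

With Ω = 1.5, a driver heading from (0, 0) to (10, 0) can serve an order whose store and address lie on the way, but not one requiring a long perpendicular detour.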
For the shipping cost rate, we assume that the compensation for ISDs and ODs is
linearly correlated with the per-package truck delivery cost for assigned orders,
formally defined as:

c^ISD_{ko} = ζ^ISD · c^V_{ko}    (4.14)
c^OD_{ko} = ζ^OD · c^V_{ko}    (4.15)
ζ^ISD > ζ^OD    (4.16)

where ζ^ISD and ζ^OD are the compensation factors for ISDs and ODs respectively.
The model is embedded in the rolling horizon [0, T]; all model inputs change
according to the state of the system at the current time t0 ∈ [0, T]. For instance,
the inventory value A_k is updated based on the initial state at time 0 and the
order fulfillment plans from time 0 to t0 − 1. Also, the set of service hours is
redefined as [t0, T]. In general, the rolling horizon framework updates all input
parameters based upon the previous results and sets up a new model for the current
time t0, iteratively until the end of the horizon T.
OD Pickup Probability
We model the OD pickup probability by three factors: the estimated number of
occasional drivers, the pickup probability from one driver for a unit of time, and
the length of the time interval between the order assignment time t ∈ [t0, T] and
the end of the horizon T.
The idea is implemented in three steps. First, the number of occasional drivers
(m_t) is approximated from the historical time-of-day walk-in customers. Second,
the probability that an order at time t is picked up by an OD (p_{tko}) is
approximated from three factors: the total number of orders, the number of ODs, and
the willingness to take order o from store k, which we call the preference (l_{ko}).
Without loss of generality, one can estimate the preference from the shipping
distance, demographic factors at the shipping address, and the delivery compensation.
Finally, based on the preference for an order and the predicted number of ODs from
t to T, the pickup probability (p^T_{tko}) from t to T is formulated by applying a
geometric distribution. The process is defined in Algorithm 1 below.
Algorithm 1. OD Order Pickup Probability
Input
• O_{t0}: the number of unfulfilled orders at the current time t0, with O_{t0} = I ∪ J;
• m_{kt}: the forecast number of walk-in customers for store k at t;
• R_k: the ratio of walk-in customers willing to be occasional drivers at store k;
• n^OD_k: the processing capacity of store k for OD orders.
Output
• p^T_{tko}: the probability for order o to be picked up at store k by an OD between
time t and the end of the horizon, o ∈ I ∪ J.
1. Quantify the preference (weight) l_{ko} of ODs picking up order o based on the
shipping distance and demographic factors at the shipping address, then standardize
the values of l_{ko} into [0, 1]. Using the shipping distance d_{ko} to illustrate
the process:

l_{ko} = { 1.0,  d_{ko} ≤ 5 miles
           0.9,  5 < d_{ko} ≤ 10 miles
           0.8,  10 < d_{ko} ≤ 15 miles
           0.7,  d_{ko} > 15 miles }
2. Calculate the total number of available couriers from ODs at time t, denoted Δ_t,
which is the minimum of the total store processing capacity for OD orders and the
total expected number of OD drivers over all stores:

Δ_t = min( ∑_k n^OD_k , ∑_k R_k·m_{kt} ),  t = t0, ..., T
3. Calculate the probability p_{tko} that order o at time t is picked up by one OD.
Let O_t be the total number of unfulfilled orders at t; then

p_{tko} = { l_{ko}·Δ_t / O_t,  when l_{ko}·Δ_t / O_t ≤ 1
            1.0,               otherwise }    (4.17)
4. Update O_{t+1} from two sources: the expected number of orders fulfilled at t,
F_t, and the forecast orders assigned to t + 1 to be prepared, Ō_{t+1}:

F_t = ∑_{o ∈ O_t} ( (1/|K|)·∑_k p_{tko} )

then

O_{t+1} = O_t − F_t + Ō_{t+1}

where Ō_{t+1} can be treated as the forecast customer orders at time t + 1.
5. The OD pickup probability from t to T at store k, p^T_{tko}:

p^T_{tko} = 1 − ∏_{t'=t}^{T} (1 − p_{t'ko})    (4.18)
Equation (4.18) represents the probability that there is at least one successful
pickup attempt for the order, which is treated as the OD pickup probability from
t to T for order o in the model.
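The core steps of Algorithm 1 can be sketched in Python. The distance-based preference table follows the illustration in Step 1; the function names and toy inputs are assumptions for exposition.

```python
# Condensed sketch of Algorithm 1. Function names and inputs are illustrative.

def preference(d_ko):
    # Step 1: standardized pickup preference from shipping distance (miles).
    if d_ko <= 5:
        return 1.0
    if d_ko <= 10:
        return 0.9
    if d_ko <= 15:
        return 0.8
    return 0.7

def couriers(n_od, R, m_t):
    # Step 2: available OD couriers at time t, capped by the total store
    # processing capacity for OD orders.
    return min(sum(n_od.values()), sum(R[k] * m_t[k] for k in R))

def p_single(l_ko, delta_t, O_t):
    # Step 3: probability that one OD picks up order o at time t, capped at 1.
    return min(1.0, l_ko * delta_t / O_t)

def p_horizon(p_list):
    # Step 5: probability of at least one successful pickup between t and T,
    # the complement of every single-period attempt failing.
    prob_none = 1.0
    for p in p_list:
        prob_none *= (1.0 - p)
    return 1.0 - prob_none
```

For instance, two periods with single-period pickup probability 0.5 each give a horizon probability of 1 − 0.5² = 0.75.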
There are several ways to estimate the order preference l_{ko}. Naturally, customers
may visit nearby physical stores more frequently. Although the number of walk-in
customers varies by hour of the day, the percentage of customers coming from a short
distance can usually be assumed higher than that coming from a more distant area.
This indicates that when order o is located near store k, the value of l_{ko} is
higher. Similarly, l_{ko} can be further adjusted by demographic factors and the
delivery compensation.
It is worth noting some traits of the single-period probability p_{tko}. First, the
summation of p_{tko} over O_t equals the total number of available OD couriers at
time t, which ensures that the number of orders assigned to ODs is around the
expected number of ODs within the store processing capacity. However, because of the
processing capacity limitation, and because the number of orders assigned to ODs is
unknown when Algorithm 1 is applied, p_{tko} is not the final probability after the
order assignment resulting from the model. In the SDD-CSF model, p_{tko} represents
the relative attractiveness of order o to a random OD among all orders in O_t.
After the SDD-CSF model generates a fulfillment plan for the current time t in the
rolling horizon, updated OD pickup probabilities are estimated and used to simulate
the OD pickup process. Part of the orders assigned to ODs are fulfilled according to
the probability, while the remaining ones become fixed-sourcing orders for time
t + 1. The fixed-sourcing orders, denoted by J in the model, are those assigned and
prepared for ODs but not yet fulfilled by the current time.
The simulation process begins by recalculating the order pickup probability for one
driver at time t0, p*_{t0,ko}. The solution of the SDD-CSF model for time t provides
the exact number of orders assigned to ODs at each store, O*_{k,t0}, and the exact
number of walk-in customers, m*_{k,t0}. Then

Δ*_{t0,k} = min( n^OD_k , R_k·m*_{k,t0} ),  k ∈ K

p*_{t0,ko} = { l_{ko}·Δ*_{t0,k} / O*_{k,t0},  when l_{ko}·Δ*_{t0,k} / O*_{k,t0} ≤ 1
               1.0,                           otherwise }
Then, by comparing a random number for each order with p*_{t0,ko}, the order is
simulated as fulfilled or not by ODs.
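The simulation step can be sketched as one Bernoulli draw per assigned order. simulate_od_pickups and the seeded generator are illustrative assumptions; any uniform random source would do.

```python
# Sketch of the OD pickup simulation: each order assigned to ODs is
# fulfilled with probability p*_{t0,ko}; the rest become fixed-sourcing
# orders for t0 + 1. The seed is fixed only for reproducibility.
import random

def simulate_od_pickups(assigned, p_star, seed=0):
    rng = random.Random(seed)
    fulfilled, fixed_sourcing = [], []
    for order in assigned:
        # Compare a uniform random number with the recalculated probability.
        if rng.random() < p_star[order]:
            fulfilled.append(order)
        else:
            fixed_sourcing.append(order)  # carried over to time t0 + 1
    return fulfilled, fixed_sourcing
```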
In sum, Algorithm 1 and its output p^T_{tko} express how the uncertainty of OD
delivery is modeled, which is an essential component of both the OD behavior in the
SDD-CSF model and the order fulfillment simulation of the rolling horizon.
Variant of the Cost Function
In the cost function, the truck cost can be modeled either as the summation of
per-package delivery costs for assigned orders, or as a fixed daily fleet cost. We
discuss the latter variant in this section.
We assume that the self-operated fleet of trucks maintains its size during the
rolling horizon, so the truck operation cost is a constant for the entire rolling
horizon. Based on this, some modifications are introduced to build a mixed integer
programming model with the updated cost function.
For the fixed daily fleet cost, we add the following truck operation related
parameters and decision variables.
Indices and sets:
Z    set of delivery zones, the divided areas of the whole region, indexed by z.
Truck operation related parameters:
v^O  the daily operation cost of one truck from the own fleet.
v^W  the daily working time length of one truck from the own fleet.
v^M  the average moving speed of one truck from the own fleet.
v^T  the average stop time for delivering a package from the truck to its
     destination.
v^H  the package holding capacity of one truck from the own fleet.
Truck operation related decision variables:
f    the number of trucks used for order fulfillment throughout the whole day.
l_t  the optimal travel distance to fulfill orders by the own fleet at hour t.
The new objective function:

min v^O·f + ∑_{ik} ( ∑_t ( p^T_{tki}·c^OD_{ki} + (1 − p^T_{tki})·θ )·b_{ikt} + ∑_h c^ISD_{ki}·e_{ikh} )
  + ∑_{jk} ( ∑_t ( p^T_{tkj}·c^OD_{kj} + (1 − p^T_{tkj})·θ )·β_{jkt} + ∑_h c^ISD_{kj}·γ_{jkh} )
  + θ·( |I| − ∑_{ikt} a_{ikt} − ∑_{ikt} b_{ikt} − ∑_{ikh} e_{ikh} )
  + θ·( |J| − ∑_{jkt} α_{jkt} − ∑_{jkt} β_{jkt} − ∑_{jkh} γ_{jkh} )    (4.19)
The newly added constraints:

l_t = ∑_{(k,z)} 2·ζ_{kz}·( ∑_i μ_{iz}·a_{ikt} + ∑_j μ_{jz}·α_{jkt} ) / v^H
      + ∑_z 0.57·√( ( ∑_{ik} μ_{iz}·a_{ikt} + ∑_{jk} μ_{jz}·α_{jkt} )·Π_z )    ∀t ∈ T    (4.20)

f ≥ ( l_t / v^M + v^T·∑_{kz} ( ∑_i μ_{iz}·a_{ikt} + ∑_j μ_{jz}·α_{jkt} ) ) / v^W    ∀t ∈ T    (4.21)

where ζ_{kz} is the distance from store k to zone z, μ_{iz} (μ_{jz}) indicates
whether order i (j) ships to zone z, and Π_z is the area of zone z.
The objective function (4.19) aims to minimize the last-mile delivery cost and the
unfulfillment penalty cost, as before. The difference is that the truck cost is now
defined by the daily operation cost of the trucks. Constraint (4.20) calculates the
total travel distance for a unit of time. Based on the predefined operation
parameters of the vehicles, the number of trucks is approximated by constraint
(4.21). The purpose of modeling truck operation in this way is to approximate the
vehicle routing problem and estimate the optimal routing travel distance; as a
result, we can estimate the number of trucks needed. Constraints (4.20) and (4.21)
apply Continuous Approximation, which utilizes the zone area and the number of
customers to approximate the average vehicle travel distance with high accuracy.
The rationale of Continuous Approximation can be seen in the Vehicle Routing
Estimation section of Chapter 3.
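The arithmetic behind constraints (4.20) and (4.21) can be sketched as follows, under the assumption that the travel distance decomposes into line-haul round trips plus a 0.57·√(n·A) local-tour term; all numeric inputs are illustrative.

```python
# Sketch of the Continuous Approximation used in (4.20)-(4.21).
# Inputs are toy numbers; the 0.57 * sqrt(n * A) term is the local-tour
# estimate for n stops spread over a zone of area A.
import math

def ca_travel_distance(d_line_haul, n_stops, area, capacity):
    # Line-haul legs: each full truckload makes a round trip to the zone.
    line_haul = 2.0 * d_line_haul * n_stops / capacity
    # Local delivery tour within the zone.
    local = 0.57 * math.sqrt(n_stops * area)
    return line_haul + local

def trucks_needed(dist, n_stops, speed, stop_time, work_hours):
    # As in (4.21): driving time plus per-stop service time, divided by
    # the daily working time of one truck.
    return math.ceil((dist / speed + stop_time * n_stops) / work_hours)
```

With a 10-mile line haul, 20 stops, a 25-square-mile zone, and capacity 10, the estimate is roughly 53 miles, which a single truck at 30 mph with 0.1-hour stops covers well within an 8-hour day.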
4.3.3 The Cost Function with Forecast Orders and Feedback Control
By integrating the cost function into the rolling horizon, we obtain a myopic
fulfillment plan model based on the currently received orders and transportation
resources. However, this model does not take into account upcoming future orders and
resources. In order to minimize not only the immediate fulfillment cost but also the
expected future cost, we further introduce the SDD-CSF model, which leverages
forecasts of future orders and resources. This subsection describes how to model
the cost function for forecast orders, and how to design the feedback function that
copes with inaccurate order predictions.
The Cost Function for Forecast Orders
The dynamic problem is intractable to solve exactly due to the very large state
space and action space A. We use a mixed integer linear programming model to
approximate the future cost function V(S^a), given the forecast of future orders as
input:

V(S^a) ≈ CF(a* | S^a)    (4.22)

In this formulation, CF is the cost function for forecast orders, S^a is the state
evolving from decision a for the current order assignment, and a* is the order
fulfillment plan decision for the forecast future customer orders. Modeling CF
inherits the ideas used in modeling C, yielding a mixed integer linear programming
(MILP) model. CF is the optimal objective value of this MILP, integrating the
projected delivery costs.
We denote the sets, parameters and decision variables of the model as follows.
Demand forecasting related parameters:
Z              set of delivery zones, the divided areas of the whole region, indexed
               by z.
g_{zτ}         the predicted number of future orders originating from zone z at
               time τ.
π_{zτt}        binary parameter indicating whether the predicted future demand
               g_{zτ} can be fulfilled at hour t.
ρ^DETOUR_{hkz} the detoured travel distance for ISD h to fulfill one package from
               store k to delivery zone z.
Demand forecasting related decision variables:
w_{zτkt}       the fulfillment flow for g_{zτ} from store k to zone z by own trucks
               at time t.
x_{zτkt}       the fulfillment flow for g_{zτ} from store k to zone z by occasional
               drivers at time t.
v_{zτkh}       binary variable indicating that one package for g_{zτ} is delivered
               from store k to zone z by ISD h.
We assume the forecast customer orders in a processing horizon are known at the
beginning as g_{zτ}, which represents the number of orders to be placed from zone z
at time τ. The forecast customer orders can be estimated from historical order
records by time series methods and other factors, including weather, events,
holidays, etc. Online customer demand prediction is a well studied area [98], and it
is a separate process from the model. Here, we assume that an order prediction model
is at hand for estimating g_{zτ}. In the next section we introduce a feedback
control system to improve the forecast accuracy by integrating the existing
prediction results with the realized customer orders.
π_{zτt} is an auxiliary parameter based upon g_{zτ}. π_{zτt} is zero for
t ∈ [0, τ − 1] and one for t ∈ [τ, T], denoting that no order fulfillment can occur
before the order arrival at time τ, while fulfillment is open after τ within the
processing horizon:

π_{zτt} = { 0,  when t < τ
            1,  when t ≥ τ }    ∀g_{zτ}
Instead of using binary variables as in the C function, the decision variables in
CF are represented by order flows, i.e., the number of orders sourced from a store
to a target zone. However, the decision variable for ISDs, v_{zτkh}, is still binary
since each ISD can carry only one order and the ISDs are pre-defined and indexed by
h. Therefore, the action a* determines the order sourcing flows for forecast future
customer orders, whereas the action a makes exact fulfillment plans for individual
arrived orders.
The formulation of CF as an MILP:

min ∑_{zτkt} ( c^V_{kz}·w_{zτkt} + c^OD_{kz}·x_{zτkt} + ∑_h c^ISD_{kz}·v_{zτkh} )    (4.23)

s.t.
∑_{zτ} x_{zτkt} ≤ R_k·m_{kt}    ∀k ∈ K, t ∈ T    (4.24)
∑_{zτ} ( ∑_t w_{zτkt} + ∑_t x_{zτkt} + ∑_h v_{zτkh} ) ≤ A_k    ∀k ∈ K    (4.25)
∑_{zτ} w_{zτkt} ≤ n^V_k    ∀k ∈ K, t ∈ T    (4.26)
∑_{zτ} x_{zτkt} ≤ n^OD_k    ∀k ∈ K, t ∈ T    (4.27)
∑_{zτh} q_{ht}·v_{zτkh} ≤ n^ISD_k    ∀k ∈ K, t ∈ T    (4.28)
∑_{zτk} v_{zτkh} ≤ 1    ∀h ∈ H    (4.29)
Ω·ρ^PLAN_h ≥ ρ^DETOUR_{hkz}·v_{zτkh}    ∀z ∈ Z, k ∈ K, h ∈ H    (4.30)
∑_{kt} w_{zτkt} + ∑_{kt} x_{zτkt} + ∑_{kh} v_{zτkh} = g_{zτ}    ∀z ∈ Z, τ ∈ T    (4.31)
∑_k w_{zτkt} + ∑_k x_{zτkt} + ∑_{kh} q_{ht}·v_{zτkh} ≤ π_{zτt}·g_{zτ}    ∀z ∈ Z, τ ∈ T, t ∈ T    (4.32)
w_{zτkt} + x_{zτkt} + ∑_h q_{ht}·v_{zτkh} = 0    ∀z ∈ Z, τ ∈ T, k ∈ K, t = t0    (4.33)
The objective function (4.23) minimizes the last-mile transport cost of forecast
customer orders when assigned for delivery by truck, ISDs or ODs. Constraints (4.24)
ensure that the number of orders assigned to ODs does not exceed the number of
available ODs. Constraints (4.25) ensure that the assignments to each store do not
exceed its inventory. Constraints (4.26), (4.27) and (4.28) enforce the store
processing capacity for order assignment. Constraints (4.29) ensure that an ISD can
take at most one order, and constraints (4.30) require the detour travel distance of
an ISD to be less than the preset maximum distance determined by willingness.
Constraints (4.31) and (4.32) require forecast orders to be met after they arrive at
time τ. For the current time t = t0, no fulfillment plan is needed here since it is
handled by the cost function C, as enforced by constraints (4.33).
It is worth noting that, compared to C, CF does not consider the penalty cost for
order unfulfillment. That is because the model presumes that the store supply and
processing capacity are sufficient to meet demand within the rolling horizon. If
that is not valid, one can select more stores for order fulfillment or scale down
the forecast customer orders in the numerical experiment.
After combining C and CF, the SDD-CSF problem is ultimately modeled as a mixed
integer linear programming approximation. The dynamic programming equation evolves
from (4.1) to

V(S) = min_{a ∈ A(S)} { C(a) + V(S^a) } ≈ min_{a ∈ A(S), a* ∈ A(S^F)} { C(a | S^{a*}) + CF(a* | S^a) }    (4.34)

From equation (4.34), the actions a and a* contribute to both the C and CF
functions. This means that a and a* affect each other by changing the inventory and
transportation resources from the current time t0 to the end of the horizon T.
Therefore, the SDD-CSF model combines C and CF into one cost-integrated and
resource-shared MILP model. Here, we present the SDD-CSF MILP in the truck
per-package cost form.
min ∑_{ik} ( ∑_t c^V_{ki}·a_{ikt} + ∑_t ( p^T_{tki}·c^OD_{ki} + (1 − p^T_{tki})·θ )·b_{ikt} + ∑_h c^ISD_{ki}·e_{ikh} )
  + ∑_{jk} ( ∑_t c^V_{kj}·α_{jkt} + ∑_t ( p^T_{tkj}·c^OD_{kj} + (1 − p^T_{tkj})·θ )·β_{jkt} + ∑_h c^ISD_{kj}·γ_{jkh} )
  + ∑_{zτkt} ( c^V_{kz}·w_{zτkt} + c^OD_{kz}·x_{zτkt} + ∑_h c^ISD_{kz}·v_{zτkh} )
  + θ·( |I| − ∑_{ikt} a_{ikt} − ∑_{ikt} b_{ikt} − ∑_{ikh} e_{ikh} )
  + θ·( |J| − ∑_{jkt} α_{jkt} − ∑_{jkt} β_{jkt} − ∑_{jkh} γ_{jkh} )    (4.35)
s.t.
∑_i b_{ikt} + ∑_j β_{jkt} + ∑_{zτ} x_{zτkt} ≤ R_k·m_{kt}    ∀k ∈ K, t ∈ T    (4.36)
∑_{kt} a_{ikt} + ∑_{kt} b_{ikt} + ∑_{kh} e_{ikh} ≤ 1    ∀i ∈ I    (4.37)
∑_{kt} α_{jkt} + ∑_{kt} β_{jkt} + ∑_{kh} γ_{jkh} ≤ 1    ∀j ∈ J    (4.38)
∑_t α_{jkt} + ∑_t β_{jkt} + ∑_h γ_{jkh} ≤ λ_{jk}    ∀j ∈ J, k ∈ K    (4.39)
∑_i d_i·( ∑_t a_{ikt} + ∑_t b_{ikt} + ∑_h e_{ikh} )
  + ∑_{zτ} ( ∑_t w_{zτkt} + ∑_t x_{zτkt} + ∑_h v_{zτkh} ) ≤ A_k    ∀k ∈ K    (4.40)
∑_i a_{ikt} + ∑_j α_{jkt} + ∑_{zτ} w_{zτkt} ≤ n^V_k    ∀k ∈ K, t ∈ T    (4.41)
∑_i b_{ikt} + ∑_j β_{jkt} + ∑_{zτ} x_{zτkt} ≤ n^OD_k    ∀k ∈ K, t ∈ T    (4.42)
∑_{ih} q_{ht}·e_{ikh} + ∑_{jh} q_{ht}·γ_{jkh} + ∑_{zτh} q_{ht}·v_{zτkh} ≤ n^ISD_k    ∀k ∈ K, t ∈ T    (4.43)
∑_{ik} e_{ikh} + ∑_{jk} γ_{jkh} + ∑_{zτk} v_{zτkh} ≤ 1    ∀h ∈ H    (4.44)
Ω·ρ^PLAN_h ≥ ρ^DETOUR_{hki}·e_{ikh}    ∀i ∈ I, k ∈ K, h ∈ H    (4.45)
Ω·ρ^PLAN_h ≥ ρ^DETOUR_{hkj}·γ_{jkh}    ∀j ∈ J, k ∈ K, h ∈ H    (4.46)
Ω·ρ^PLAN_h ≥ ρ^DETOUR_{hkz}·v_{zτkh}    ∀z ∈ Z, k ∈ K, h ∈ H    (4.47)
∑_{kt} w_{zτkt} + ∑_{kt} x_{zτkt} + ∑_{kh} v_{zτkh} = g_{zτ}    ∀z ∈ Z, τ ∈ T    (4.48)
∑_k w_{zτkt} + ∑_k x_{zτkt} + ∑_{kh} q_{ht}·v_{zτkh} ≤ π_{zτt}·g_{zτ}    ∀z ∈ Z, τ ∈ T, t ∈ T    (4.49)
w_{zτkt} + x_{zτkt} + ∑_h q_{ht}·v_{zτkh} = 0    ∀z ∈ Z, τ ∈ T, k ∈ K, t = t0    (4.50)
The objective function of SDD-CSF (4.35) is essentially the summation of (4.2) from
C and (4.23) from CF. Constraints (4.36) and (4.40)–(4.43) integrate the order
assignments for received orders and forecast orders to enforce the limits on the
number of ODs, store inventory, and store processing capacity respectively. The
other constraints remain the same as in C and CF.
Feedback Control
When a model is created for time t in the rolling horizon, the received orders and
available resources are realized for time t. Therefore, it is possible to verify
whether the predictions of orders and transportation resources from time 0 to t − 1
were accurate. Furthermore, this verification at time t of the predictions over
[0, t − 1] can be quantified as a feedback control system for improving the
forecasts for [t + 1, T] made at time t.
The SDD-CSF model integrates the feedback control system to cope with inaccurate
forecasts of future demand in the rolling horizon framework. Based on the realized
number of online orders, an exponential smoothing method is applied to the data
inputs to adjust the OD pickup probability and the forecast of future orders. In
this way, the rolling horizon solution framework becomes more robust and scalable
even if the initial prediction is inaccurate.
As discussed in the sections The Cost Function for Received Orders and The Cost
Function for Forecast Orders, the forecasts of future demand are provided as input
data at the beginning of the rolling horizon. The forecast customer orders can be
estimated from historical order records by time series methods and other factors,
including weather, events, holidays, etc. The feedback control system adjusts these
prediction input variables for the SDD-CSF model based on the realized data. The
detailed process is given in Algorithm 2.
Algorithm 2. Feedback Control for Forecast orders
Inputs
• [0, T]: set of service hours, indexed by t;
• t0: current time of the rolling horizon;
• K: set of stores, indexed by k;
• O_t: number of received orders at time t ∈ [0, t0];
• g_{zt}: the predicted future demand originating from zone z at hour t ∈ [0, T];
the prediction is made before time 0;
• α^F: the smoothing factor of exponential smoothing for forecast orders, between
0 and 1.
Outputs
• Ψ^F: the smoothed feedback-control factor for forecast orders;
• g*_{zt}: the smoothed future demand originating from zone z at hour t ∈ [t0, T].
Steps
1. Obtain the number of predicted orders at time t, O^F_t, from g_{zt}:

O^F_t = ∑_{z ∈ Z} g_{zt},  t = 0, ..., T
2. Calculate the ratio κ^F_t of the realized number of orders to the forecast
number of orders from the beginning to the current time t0:

κ^F_t = O_t / O^F_t,  t = 0, ..., t0
3. Update Ψ^F at time t0 for forecast orders by

Ψ^F_t = { κ^F_t,                              t = 0
          α^F·κ^F_t + (1 − α^F)·Ψ^F_{t−1},    t = 1, ..., t0 }

and set Ψ^F = Ψ^F_{t0}.
4. Based on the feedback-control smoothing factor for forecast orders, the total
number of forecast orders G for [t0, T] can be updated by

G = Ψ^F · ∑_z ∑_{t=t0}^{T} g_{zt}

5. Define Δ as the difference between G and the original total number of forecast
orders:

Δ = G − ∑_z ∑_{t=t0}^{T} g_{zt}
6. Simulate the forecast orders based on the value of Δ. Create |Δ| random forecast
orders g′_{zt} over the delivery zone set Z and the time set [t0, T].
If Δ ≥ 0, g*_{zt} = g_{zt} ∪ g′_{zt};
else, g*_{zt} = g_{zt} \ (g_{zt} ∩ g′_{zt}).
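Steps 1–4 of Algorithm 2 reduce to an exponentially smoothed ratio of realized to forecast orders, used to rescale the remaining forecast. A sketch with toy inputs (the function names are assumptions):

```python
# Sketch of steps 1-4 of Algorithm 2: exponentially smooth the ratio of
# realized to forecast orders, then rescale the remaining forecast.

def feedback_factor(realized, forecast, alpha):
    # realized[t] / forecast[t] for t = 0..t0, exponentially smoothed with
    # smoothing factor alpha; the final value is Psi^F.
    psi = None
    for o_t, of_t in zip(realized, forecast):
        kappa = o_t / of_t                      # step 2: ratio kappa^F_t
        psi = kappa if psi is None else alpha * kappa + (1 - alpha) * psi
    return psi

def rescaled_total(psi, remaining_forecast):
    # Step 4: G = Psi^F times the total forecast over [t0, T].
    return psi * sum(remaining_forecast)
```

For example, if 10 then 12 orders arrive against a flat forecast of 10 per hour, the smoothed factor with α^F = 0.5 is 1.1, so a remaining forecast of 20 orders is scaled up to 22.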
By applying g*_{zt} in place of g_{zt} in Algorithm 1 and the SDD-CSF model, the
smoothed feedback-control factors are incorporated to improve the forecast results
based on the realized data. In general, with the help of the rolling horizon
solution framework, the SDD-CSF model is able to efficiently adjust the forecast
input parameters to accommodate fluctuations in reality.
4.4 Case Study
The proposed SDD-CSF model seeks to minimize the immediate order fulfillment cost
based upon on-hand and forecast transportation resources, and at the same time
leverages forecasts of future customer orders to reduce the resulting expected
future cost. The SDD-CSF model thus considers both the efficiency and the robustness
of the order fulfillment plan over the whole time horizon. For the sake of
comparison, we create three more MILP models as benchmarks for the SDD-CSF model:
the Conservative, Myopic and Global-optimal models.
The conservative model considers only on-hand transportation resources and currently received customer orders. The myopic model (4.2)–(4.13) considers the forecast transportation resources for currently received customer orders, but not future orders. The global-optimal model is defined with no uncertainty, knowing all transportation resources and daily customer orders with certainty at the beginning of the horizon. The conservative and myopic models represent two common practical industry strategies for order fulfillment, while the global-optimal model provides a lower bound on the operation cost of order fulfillment.
From the perspective of formulation, the conservative and global-optimal models are both derived from the myopic model. For the conservative model, the set of service hours is only the single period [t0, t0] rather than the range [t0, T] used by the others. Therefore, all time-related parameters and variables contain values only for the current time t0, which represents the on-hand transportation resources and the currently received customer orders. For the global-optimal model, there are three differences compared with the myopic model. First, the set of non-fixed-sourcing orders I contains all orders of the horizon instead of only the orders arriving at t0, so the set of fixed-sourcing orders J is always empty. Second, the OD pickup probability is always 1 because of the no-uncertainty assumption. Furthermore, we add an auxiliary parameter ν_{it} to indicate the arrival time τ of order i:
ν_{it} = 0 if t < τ;  ν_{it} = 1 otherwise.
Then the fulfillment plan for order i may only be made after the order arrives, which is enforced by the additional constraints

∑_{k} (a_{ikt} + b_{ikt} + ∑_{h} q_{ht} · e_{ikh}) ≤ ν_{it},  ∀i ∈ I, t = 0, …, T
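A minimal sketch of how this arrival indicator can be constructed, independent of any particular MILP library (the list of arrival times is hypothetical):

```python
def arrival_indicator(arrival_times, T):
    """Build nu[i][t] = 1 if order i (arriving at hour tau_i)
    may be fulfilled at hour t (i.e. t >= tau_i), else 0."""
    return [[1 if t >= tau else 0 for t in range(T + 1)]
            for tau in arrival_times]
```

In the global-optimal model, every fulfillment term for order i at hour t is then bounded above by nu[i][t], so no plan can serve an order before it arrives.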
The customer demand data is collected from Instacart, an American company that provides same-day grocery delivery service. In June 2017, Instacart released 3 million orders for non-commercial use¹. The time horizon for each day is set from 8 am to 7 pm, i.e., 12 hours. All orders placed between 8 pm and 7 am are counted as new orders arriving at 8 am, which is why there is a surge of orders at the beginning of each time horizon.
We choose Chicago, Illinois as the test area to provide spatial information for the assumed local retail stores, ISD shared trips, and customer shipping addresses. The number of walk-in customers is estimated by three factors: general online statistics of supermarket customers, assumed store capacity, and hour of the day. We set 5% as the ratio of walk-in customers willing to be occasional drivers at the store, which means 5 of every 100 walk-in customers would serve as ODs and deliver orders if there are packages ready to be picked up.
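As a small illustration of this OD-supply estimate (the hourly walk-in counts below are hypothetical):

```python
def hourly_od_supply(walkins_per_hour, od_ratio=0.05):
    """Estimate the expected number of occasional drivers per hour
    as a fixed fraction (5% by default) of store walk-in customers."""
    return [round(w * od_ratio) for w in walkins_per_hour]
```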
The delivery rate is based on the FedEx SameDay® service [97] with a reasonable corporate discount. The base rate is 8 US dollars per package; once the distance exceeds 15 miles, the cost increases at 0.2 US dollars per mile of additional travel distance. The compensation factors for ISDs (ζ^{ISD}) and ODs (ζ^{OD}) are, respectively, 0.2 and 0.4, according to the cost setting proposed by Archetti et al. [91]. Furthermore, the penalty for each order unfulfilled at the end of the day is set to 50 US dollars.
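Under one plausible reading of this rate rule (flat base up to 15 miles, then a per-mile surcharge on the excess distance), the per-package truck cost can be sketched as:

```python
def truck_rate(distance_miles, base=8.0, free_miles=15.0, per_mile=0.2):
    """Per-package delivery rate: flat base charge up to free_miles,
    plus a per-mile surcharge on the distance beyond that threshold."""
    extra = max(0.0, distance_miles - free_miles) * per_mile
    return base + extra
```

For example, a 10-mile delivery costs the $8 base, while a 20-mile delivery adds 5 surcharge miles at $0.20 each.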
We implement the solution methodology in CPLEX 12.7.0, applying the CPLEX MIP solver directly to the SDD-CSF model within the rolling horizon framework. All models run under CentOS Linux on a workstation with a 12-core Intel Xeon E7-4830 CPU at 2.13 GHz and 32 GB of memory.
Here, we focus on an instance of relatively small scale to show the differences between the four proposed models. The instance contains 2788 customer orders in two
¹ https://www.instacart.com/datasets/grocery-shopping-2017
[Figure 4.1: The hourly summary of order fulfillment plans by the four models. Panels: (a) the myopic model; (b) the conservative model; (c) the global-optimal model; (d) the SDD-CSF model. Each panel plots the number of orders (truck, OD, ISD, unfulfilled, and received orders) against the time of day over the two days.]
days, and we consider two stores to fulfill these orders. The instance uses the same setting as P7 in Table 4.1.
The solving process for the myopic, conservative, and SDD-CSF models can be described as follows. A model is created at each time t of the rolling horizon and solved exactly and iteratively until the end of the horizon T. For both the myopic and SDD-CSF models, the solution at time t contains the order fulfillment plan from t to T. However, the remaining orders at time t + 1 are treated as input, and the fulfillment plan is updated according to the solution of the model at time t + 1. Therefore, we implement only the current-time part of the exact solution as the finalized order fulfillment plan from the myopic and SDD-CSF models at time t0. The global-optimal model needs to be run only once for the whole horizon.
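This rolling horizon procedure can be sketched as the following loop. The `solve_model` callable is purely a stand-in for the CPLEX solve of the myopic, conservative, or SDD-CSF MILP; only the current hour's decisions are committed, and open orders roll forward:

```python
def rolling_horizon(orders_by_hour, T, solve_model):
    """Iteratively solve at each hour t0 and commit only the
    current-hour decisions; remaining orders roll to t0 + 1.

    solve_model(pending, t0, T) -> (decisions for hour t0,
                                    orders left open for later hours)
    """
    pending, committed = [], []
    for t0 in range(T + 1):
        pending += orders_by_hour.get(t0, [])   # newly arrived orders
        now, pending = solve_model(pending, t0, T)
        committed.append((t0, now))             # finalize hour t0 only
    return committed
```

A global-optimal solve, by contrast, would be a single call with all orders known at t = 0.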
Figure 4.1 shows the hourly summary of the order fulfillment plans solved by the four models. In each panel, there are two time horizons from 8 am to 7 pm. There is a surge of received orders at the beginning of each time horizon, for the reason discussed above. In addition, more orders arrive on day 2 than on day 1.
Let us first focus on the number of hourly fulfilled orders regardless of delivery method. In Figure 4.1, the global-optimal model divides the fulfillment work more evenly over time than the others. The SDD-CSF and conservative models generally follow the order arrival pattern. In contrast, the myopic model lacks foresight of upcoming orders, fulfilling only a small number of orders at the beginning of each horizon and trying to catch up at the end. On day 2, the myopic model has unfulfilled orders at the end of the day, while none of the other models has any.
Each hour's order fulfillment plan is composed of order assignments to three delivery methods. Generally, for a specific order, the OD delivery rate is the lowest and the truck is the most expensive, because of the shipping cost rates (4.14)–(4.16). The global-optimal model fully utilizes the ODs in every hour. The SDD-CSF model tries to use as many ODs as possible; however, at the end of the horizon it assigns fewer orders to ODs because of the correspondingly smaller OD pickup probability. The conservative model intends to fulfill as many orders as its capacity allows, so it assigns the most orders to trucks compared with the others.
The summary of order assignments and associated costs is shown in Figure 4.2. The total cost of the SDD-CSF model is lower than those of the myopic and conservative models. Although
[Figure 4.2: The order fulfillment plans and associated costs of the instance P7 by the four models. Panel (a) shows the order assignments and panel (b) the associated costs (USD), each broken down into truck, OD, ISD, and unfulfilled for the myopic, conservative, global-optimal, and SDD-CSF models.]
the myopic and SDD-CSF models have similar assignment percentages across the three delivery methods, the SDD-CSF model incurs no penalty for order unfulfillment.
In sum, we can conclude that the myopic model manages the delivery timing of order fulfillment poorly, while the conservative model manages the delivery resources poorly. The proposed SDD-CSF model overcomes the shortcomings of both. It incorporates the cost function for future forecast orders to select good timing from the perspective of the whole time horizon, and it uses the cost function for currently received orders to select a good delivery method. Combining the two cost functions, the SDD-CSF model provides a better order fulfillment plan.
Next, we conduct a large number of computational experiments to quantify the benefits of the SDD-CSF model. The Instacart data is sampled to create two levels of customer demand, small and large, over a length of two days, and each level contains ten instances with an incremental number of orders. Two stores serve as input for the small demand level, while five stores serve the large demand level. The details can be seen in
Table 4.1: The Orders and Stores Inputs for Two Levels of Customer Demand
Small Demand Level Large Demand Level
Input Stores (IDs) [9, 90] Input Stores (IDs) [9, 17, 20, 35, 76]
Instance # of Orders Instance # of Orders
P1 1058 P11 1745
P2 1446 P12 2228
P3 1705 P13 2774
P4 1979 P14 3397
P5 2181 P15 3873
P6 2466 P16 4452
P7 2788 P17 4998
P8 3035 P18 5543
P9 3276 P19 6066
P10 3426 P20 6496
Table 4.1.
We incorporate the two levels of customer demand into the four MILP models (the conservative, myopic, global-optimal, and SDD-CSF models) within the rolling horizon framework. One day is treated as a time horizon and is further divided into 12 sequential time periods from 8 am to 7 pm, the 12 working hours discussed before. In sum, the total number of optimization models solved for the comparison is 192 (= 2 × 4 × 2 × 12).
The solution results of the 20 different-sized instances under the four models are shown in Figure 4.3 in terms of total cost. The cost curves follow similar patterns for both the small and the large demand level. When the
[Figure 4.3: The solution results (total cost) of the instances for different levels of customer demand. Panel (a) covers the small demand level (P1–P10) and panel (b) the large demand level (P11–P20), each comparing the myopic, conservative, global-optimal, and SDD-CSF models.]
input orders are small enough, all four models handle fulfillment in the least expensive way, using ODs whenever the store has any, so the costs of the four models are almost the same. When the input orders increase from small to medium numbers, the conservative model costs more than the others. Because surplus delivery resources still exist, the delivery timing is not sensitive enough to affect the cost; the conservative model, however, handles the order surge at the beginning of the horizon with all of its capacity, including the costly truck. When the input orders increase to large numbers, delivery timing becomes very sensitive since the penalty cost of unfulfillment is very high. The myopic model fails to fully utilize the delivery resources at the beginning of the horizons, and the costs of the conservative model come close to those of SDD-CSF since both models are forced to use all the delivery resources to avoid the penalty cost.
Figure 4.3 shows that the cost curve of SDD-CSF is always close to that of the global-optimal model, no matter the size of the input orders. Excluding the global-optimal model, which is impractical for real-world use because of its perfect-information assumption, the SDD-CSF model
[Figure 4.4: Cases of forecast orders g_{zt} and solution results from the SDD-CSF models with feedback control. Panel (a) shows the hourly orders for forecast demand cases C1–C9 together with the realized orders; the bars in panel (b) show the total costs (USD) of the SDD-CSF models with feedback control for each case, alongside the myopic, conservative, and global-optimal results.]
proves to be the most robust and efficient way to make the order fulfillment plan for these 20 different-sized instances.
Table 4.2 summarizes the solving times for the 20 different-sized instances under the four models. The proposed SDD-CSF model needs the longest time for every instance. However, solving even the largest problem within 5 minutes is reasonable for hourly order fulfillment planning.
We create 9 cases of forecast orders g_{zt} to benchmark the performance of the feedback control. The input setting is the same as that of instance P4 in Table 4.1. Cases 1–3 contain far fewer orders than the real input, representing under-forecasting; Cases 7–9 contain more, denoting over-forecasting; and Cases 4–6 are similar, standing for good forecasting. All cases are randomly sampled from the original Instacart order data.
In Figure 4.4, the results of the SDD-CSF models with feedback control remain at a relatively
Table 4.2: The solving time (seconds) of the instances for different levels of customer
demand
Instance # of Orders Myopic Conservative Global-optimal SDD-CSF
P1 1,058 2.06 0.31 1.71 17.63
P2 1,446 3.84 0.67 2.67 28.18
P3 1,705 4.86 0.47 2.89 18.37
P4 1,979 5.64 0.55 3.76 23.96
P5 2,181 6.22 0.87 3.98 29.08
P6 2,466 6.73 0.70 5.60 39.97
P7 2,788 8.50 0.80 6.09 37.30
P8 3,035 11.44 0.85 6.72 44.44
P9 3,276 11.45 1.00 7.63 37.81
P10 3,426 14.05 1.44 8.39 43.74
P11 1,745 4.88 1.18 6.46 47.89
P12 2,228 8.40 1.10 8.71 66.27
P13 2,774 9.96 1.55 12.11 87.00
P14 3,397 18.26 1.36 19.76 89.72
P15 3,873 20.31 1.69 22.85 115.98
P16 4,452 24.32 2.28 30.23 146.35
P17 4,998 26.00 3.25 35.62 141.68
P18 5,543 29.12 2.57 45.75 169.93
P19 6,066 46.50 2.84 48.56 175.17
P20 6,496 74.52 4.75 57.76 238.65
stable level and are better than those of the myopic and conservative models, regardless of the cases and their different patterns.
4.5 Conclusion
In this chapter, we introduce a new problem of same-day delivery with crowdshipping and store fulfillment, which aims to fill the last-mile gap between the local store network and online customers in terms of supply chain operation.
First, we propose a rolling horizon solving framework that accumulates online orders and transportation resources in order to achieve economies of scale, a typical challenge in last-mile delivery. The framework repeatedly solves a series of order assignment and delivery planning problems along the timeline to construct a daily optimal fulfillment plan from local stores.
We develop a set of exact solution approaches for order fulfillment within the rolling horizon framework. The original dynamic programming problem for currently received orders is mathematically approximated by a mixed integer linear programming model. The model considers both currently received orders and the prediction results of future demand to make order assignment decisions that minimize the immediate delivery cost plus the resulting expected future cost.
With the help of the rolling horizon structure, we introduce a feedback control system to cope with inaccurate forecasts of future demand. Based on the realized number of online orders, an exponential smoothing method is applied to the data inputs to adjust the forecast of future orders. The case study shows that the feedback control system makes the rolling horizon solving framework more robust and scalable even if the initial prediction is not accurate.
Additionally, crowdsourced shipping for SDD has been integrated as a delivery option and divided into two types of private drivers: information sharing drivers and occasional drivers. In the setting of store fulfillment, information sharing drivers are like commuters who share scheduled trips to the retailer; this enables the model to assign packages to scheduled trips for compensation with certainty. Occasional drivers can be treated as average store walk-in customers who are willing to take packages for others in exchange for compensation. The big challenge in modeling ODs is that they deliver with uncertainty; therefore, we propose the OD pickup probability and integrate it into the model.
Finally, a case study with various computational experiments is conducted to quantify the benefits by comparing SDD-CSF with some conventional order fulfillment practices. The proposed SDD-CSF achieves good performance and robustness in optimizing order fulfillment planning for different-sized customer demand instances, which are derived from real sales data of a national retailer.
Chapter 5
CONCLUSION AND FUTURE RESEARCH DIRECTIONS
5.1 Summary of Findings
The problems examined in this thesis originate from two perspectives of transportation analytics. Transit flow prediction under events is an actual public transportation problem faced by transportation authorities. Last-mile same-day delivery with store fulfillment is a trending business idea in omni-channel supply chains for brick-and-mortar retailers.
We show how the crowdsourced content of social media can improve event transit flow prediction in Chapter 2. Since social media can be retrieved in real time with relatively small building and maintenance costs, we propose an algorithm and a prediction model that use social media data to assist transportation flow prediction under special event conditions. Among several popular prediction methods, our method shows the best results in terms of mean absolute percentage error. We also exploit social media to detect various events; our approach achieves good performance, with 98.27% precision and 87.69% recall for detecting baseball games.
In Chapter 3, we define the same-day delivery with store fulfillment problem, then list the benefits, the implementation challenges, and our solutions to those challenges. The chapter focuses on solving the supply chain planning problem to provide an optimal seasonal plan for retailers, as a decision-making tool to select stores, prepare inventory, and equip trucks for last-mile same-day delivery. Our solution method achieves the best results in terms of runtime for three large-scale instances derived from real online customer orders. The optimization result provides a practical supply chain plan for retailers to set up the service of same-day delivery with store fulfillment.
In Chapter 4, we follow up on the idea of store fulfillment and build an optimization model for supply chain operation to fill the last-mile gap between the local store network and customers for specific online orders. We adopt a trending concept from the sharing economy to provide a crowdsourced shipping option drawing on average customers. Chapter 4 provides a solution to the problem remaining from Chapter 3. The proposed model achieves good performance and robustness in optimizing order fulfillment planning for different-sized customer demand instances, which are derived from real sales data of a national retailer. The proposed feedback control system also helps the model remain robust and scalable even if the initial prediction inputs are not accurate.
5.2 Future Research Directions
The work in this thesis is based on real-world problems that are solved with real-world data. One trait of our work is that the solutions are not only supported by theory but also implementable for real-world instances. This trait facilitates the practical use of our solutions and the testing of the proposed ideas in the future.
For the social media part, researchers in TransInfo at the University at Buffalo have follow-up works on various topics in the field of transportation, including traffic accident detection, travel behavior inference, and human mobility pattern analysis. Social media can contribute more in the fields of retailing and disaster recovery, since our work shows that crowdsourced content can signify public attention and willingness.
For the last-mile same-day delivery part, the category of products can be taken into consideration for both the supply chain planning and operation models, since groceries, clothing, and gadgets differ greatly in terms of supply chain operations. Another direction for future work is to explore the application of the models to cases at a very large scale. It is a great opportunity to investigate more solution methods and heuristics in order to efficiently solve the models with scalable inputs.
REFERENCES
[1] Vibhanshu Abhishek, Kinshuk Jerath, and Z. John Zhang. Agency selling or reselling? Channel structures in electronic retailing. Management Science, 62(8):2259–2280, 2016.
[2] U.S. Census Bureau. Annual retail trade survey 2015. https://www.census.gov/retail/index.html, 2015. Last accessed: April 6, 2017.
[3] U.S. Census Bureau. Quarterly retail e-commerce sales, 4th quarter 2016. https://www.census.gov/retail/index.html, 2017. Last accessed: April 6, 2017.
[4] Vikram Sehgal and S Mulpuru. Forrester research online retail forecast, 2013to 2018 (us). https://www.forrester.com/report/Forrester+Research+Online+Retail+Forecast+2013+To+2018+US/-/E-RES115941, 2011.
[5] Statista. Retail e-commerce sales in the united states from2015 to 2021. https://www.statista.com/statistics/272391/us-retail-e-commerce-sales-forecast/, 2017. Last accessed: April 6,2017.
[6] Ethan Lieber and Chad Syverson. Online versus offline competition. OxfordHandbook of the Digital Economy, pages 189–223, 2012.
[7] Ali Hortaçsu and Chad Syverson. The ongoing evolution of US retail: A format tug-of-war. The Journal of Economic Perspectives, 29(4):89–111, 2015.
[8] Mu-Chen Chen and Yu Wei. Exploring time variants for short-term passengerflow. Journal of Transport Geography, 19(4):488–498, July 2011.
[9] Samiul Hasan, Christian M. Schneider, Satish V. Ukkusuri, and Marta C. González. Spatiotemporal Patterns of Urban Human Mobility. Journal of Statistical Physics, 151(1-2):304–318, December 2012.
[10] Yu Wei and Mu-Chen Chen. Forecasting the short-term metro passenger flowwith empirical mode decomposition and neural networks. Transportation Re-search Part C: Emerging Technologies, 21(1):148–162, April 2012.
[11] Biao Leng, Jiabei Zeng, Zhang Xiong, Weifeng Lv, and Yueliang Wan. Proba-bility Tree Based Passenger Flow Prediction and Its Application to the BeijingSubway System. Front. Comput. Sci., 7(2):195–203, April 2013.
[12] Yujuan Sun, Guanghou Zhang, and Huanhuan Yin. Passenger Flow Prediction ofSubway Transfer Stations Based on Nonparametric Regression Model. DiscreteDynamics in Nature and Society, 2014:e397154, April 2014.
[13] Yuxing Sun, Biao Leng, and Wei Guan. A novel wavelet-SVM short-time pas-senger flow prediction in Beijing subway system. Neurocomputing, 166:109–121,October 2015.
[14] E.I. Vlahogianni, J.C. Golias, and M.G. Karlaftis. Short-term traffic forecasting:Overview of objectives and methods. Transport Reviews, 24(5):533–557, 2004.
[15] Billy Williams, Priya Durvasula, and Donald Brown. Urban Freeway Traffic FlowPrediction: Application of Seasonal Autoregressive Integrated Moving Averageand Exponential Smoothing Models. Transportation Research Record: Journalof the Transportation Research Board, 1644:132–141, January 1998.
[16] A.G. Hobeika and Chang Kyun Kim. Traffic-flow-prediction Systems Based onUpstream Traffic. In Vehicle Navigation and Information Systems Conference,1994. Proceedings., 1994, pages 345–350, August 1994.
[17] Mohamed S. Ahmed and Allen R. Cook. Analysis of Freeway Traffic Time-series Data by Using Box-Jenkins Techniques. Transportation Research Record,(722):1–9, 1979.
[18] Xiaoyan Zhang and John A. Rice. Short-term Travel Time Prediction. Trans-portation Research Part C: Emerging Technologies, 11(34):187–210, June 2003.
[19] Billy Williams. Multivariate Vehicular Traffic Flow Prediction: Evaluation ofARIMAX Modeling. Transportation Research Record: Journal of the Trans-portation Research Board, 1776:194–200, January 2001.
[20] Billy M. Williams and Lester A. Hoel. Modeling and Forecasting Vehicular TrafficFlow as a Seasonal ARIMA Process: Theoretical Basis and Empirical Results.Journal of Transportation Engineering, 129(6):664–672, 2003.
[21] Sangsoo Lee and Daniel Fambro. Application of Subset Autoregressive Inte-grated Moving Average Model for Short-Term Freeway Traffic Volume Forecast-ing. Transportation Research Record: Journal of the Transportation ResearchBoard, 1678:179–188, January 1999.
[22] Tsung-Hsien Tsai, Chi-Kang Lee, and Chien-Hung Wei. Neural Network BasedTemporal Feature Models for Short-term Railway Passenger Demand Forecast-ing. Expert Systems with Applications, 36(2, Part 2):3728–3736, March 2009.
[23] R. Yasdi. Prediction of Road Traffic using a Neural Network Approach. NeuralComputing & Applications, 8(2):135–142, May 1999.
[24] C.-H. Wu, J.-M. Ho, and D.T. Lee. Travel-time prediction with support vectorregression. IEEE Transactions on Intelligent Transportation Systems, 5(4):276–281, 2004.
[25] Fangce Guo, Rajesh Krishnan, and John Polak. A computationally efficient two-stage method for short-term traffic prediction on urban roads. TransportationPlanning and Technology, 36(1):62–75, February 2013.
[26] Weiwei Gong. ARMA-GRNN for passenger demand forecasting. In 2010 Sixth In-ternational Conference on Natural Computation (ICNC), volume 3, pages 1577–1581, August 2010.
[27] Xiushan Jiang, Lei Zhang, and Xiqun (Michael) Chen. Short-term forecasting ofhigh-speed rail demand: A hybrid approach combining ensemble empirical modedecomposition and gray support vector machine with real-world applications inChina. Transportation Research Part C: Emerging Technologies, 44:110–127,July 2014.
[28] Chun-Hui Zhang, Rui Song, and Yang Sun. Kalman Filter-Based Short-TermPassenger Flow Forecasting on Bus Stop. Journal of Transportation SystemsEngineering and Information Technology, 11(4):154, 2011.
[29] Min Gong, Xiang Fei, Zhi Wang, and Yun Qiu. Sequential Framework for Short-Term Passenger Flow Prediction at Bus Stop. Transportation Research Record:Journal of the Transportation Research Board, 2417:58–66, December 2014.
[30] F. Y. Wang. Scanning the Issue and Beyond: Real-Time Social Transportationwith Online Social Signals. IEEE Transactions on Intelligent TransportationSystems, 15(3):909–914, June 2014.
[31] F. Y. Wang. Scanning the Issue and Beyond: Transportation Games for So-cial Transportation. IEEE Transactions on Intelligent Transportation Systems,16(3):1061–1069, June 2015.
[32] X. Zheng, W. Chen, P. Wang, D. Shen, S. Chen, X. Wang, Q. Zhang, andL. Yang. Big Data for Social Transportation. IEEE Transactions on IntelligentTransportation Systems, 17(3):620–630, March 2016.
[33] E. Chaniotakis and C. Antoniou. Use of Geotagged Social Media in UrbanSettings: Empirical Evidence on Its Potential from Twitter. In 2015 IEEE 18thInternational Conference on Intelligent Transportation Systems (ITSC), pages214–219, September 2015.
[34] Francisco C. Pereira, Filipe Rodrigues, and Moshe Ben-Akiva. Using Data Fromthe Web to Predict Public Transport Arrivals Under Special Events Scenarios.Journal of Intelligent Transportation Systems, 19(3):273–288, July 2015.
[35] F. C. Pereira, F. Rodrigues, E. Polisciuc, and M. Ben-Akiva. Why so manypeople? Explaining Nonhabitual Transport Overcrowding With Internet Data.IEEE Transactions on Intelligent Transportation Systems, 16(3):1370–1379, June2015.
[36] F. Y. Wang, J. J. Zhang, X. Zheng, X. Wang, Y. Yuan, X. Dai, J. Zhang, andL. Yang. Where does AlphaGo go: from church-turing thesis to AlphaGo thesisand beyond. IEEE/CAA Journal of Automatica Sinica, 3(2):113–120, April 2016.
[37] Y. Lv, Y. Duan, W. Kang, Z. Li, and F. Y. Wang. Traffic Flow PredictionWith Big Data: A Deep Learning Approach. IEEE Transactions on IntelligentTransportation Systems, 16(2):865–873, April 2015.
[38] X. Wang, X. Zheng, Q. Zhang, T. Wang, and D. Shen. Crowdsourcing in ITS:The State of the Work and the Networking. IEEE Transactions on IntelligentTransportation Systems, 17(6):1596–1605, June 2016.
[39] N. Wanichayapong, W. Pruthipunyaskul, W. Pattara-Atikom, and P. Chaovalit.Social-based traffic information extraction and classification. In 2011 11th Inter-national Conference on ITS Telecommunications (ITST), pages 107–112, August2011.
[40] Axel Schulz, Petar Ristoski, and Heiko Paulheim. I See a Car Crash: Real-Time Detection of Small Scale Incidents in Microblogs. In Philipp Cimiano, Miriam Fernández, Vanessa Lopez, Stefan Schlobach, and Johanna Völker, editors, The Semantic Web: ESWC 2013 Satellite Events, number 7955 in Lecture Notes in Computer Science, pages 22–33. Springer Berlin Heidelberg, 2013.
[41] Freddy Lécué and Elizabeth Daly. Westland Row Why So Slow? Fusing Social Media and Linked Data Sources for Understanding Real-Time Traffic Conditions. 2013.
[42] Eric Mai and Rob Hranac. Twitter Interactions as a Data Source for Transporta-tion Incidents. 2013.
[43] Ayelet Gal-Tzur, Susan M. Grant-Muller, Tsvi Kuflik, Einat Minkov, Silvio No-cera, and Itay Shoor. The potential of social media in delivering transport policygoals. Transport Policy, 32:115–123, March 2014.
[44] Po-Ta Chen, Feng Chen, and Zhen Qian. Road Traffic Congestion Monitoringin Social Media with Hinge-Loss Markov Random Fields. In 2014 IEEE Inter-national Conference on Data Mining (ICDM), pages 80–89, December 2014.
[45] E. D’Andrea, P. Ducange, B. Lazzerini, and F. Marcelloni. Real-Time Detec-tion of Traffic From Twitter Stream Analysis. IEEE Transactions on IntelligentTransportation Systems, PP(99):1–15, 2015.
[46] Avinash Kumar, Miao Jiang, and Yi Fang. Where Not to Go?: Detecting RoadHazards Using Twitter. In Proceedings of the 37th International ACM SIGIRConference on Research & Development in Information Retrieval, SIGIR ’14,pages 1223–1226, New York, NY, USA, 2014. ACM.
[47] Zhenhua Zhang, Ming Ni, Qing He, Jing Gao, Jizhan Gou, and Xiaoling Li.An Exploratory Study on the Correlation between Twitter Concentration andTraffic Surge. Transportation Research Record: Journal of the TransportationResearch Board, 2553, 2016.
[48] Zhenhua Zhang and Qing He. On-site Traffic Accident Detection with BothSocial Media and Traffic Data. 9th Triennial Symposium on TransportationAnalysis (TRISTAN IX), 2016.
[49] Zhenhua Zhang and Qing He. Exploring Travel Behavior with Social Media: AnEmpirical Study of Abnormal Movements Using High Resolution Tweet Trajec-tory Data. submitted to Transportation Research Part C: Emerging Technologies,2016.
[50] Jingrui He, Wei Shen, Phani Divakaruni, Laura Wynter, and Rick Lawrence. Im-proving Traffic Prediction with Tweet Semantics. In Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence, IJCAI ’13, pages1387–1393, Beijing, China, 2013. AAAI Press.
[51] Ming Ni, Qing He, and Jing Gao. Using social media to predict traffic flow underspecial event conditions. In The 93rd Annual Meeting of Transportation ResearchBoard, 2014.
[52] Lei Lin, Ming Ni, Qing He, Jing Gao, and Adel W. Sadek. Modeling the Impactsof Inclement Weather on Freeway Traffic Speed. Transportation Research Record:Journal of the Transportation Research Board, 2482:82–89, 2015.
[53] Craig Collins, Samiul Hasan, and Satish Ukkusuri. A Novel Transit Rider Sat-isfaction Metric: Rider Sentiments Measured from Online Social Media Data.Journal of Public Transportation, 16(2), June 2013.
[54] Metropolitan Transportation Authority Data Feed.
[55] Twitter Streaming APIs.
[56] Daniel Ramage, Susan Dumais, and Dan Liebling. Characterizing Microblogs with Topic Models. Proceedings of the Fourth International AAAI Conference on Weblogs and Social Media, 2010.
[57] Wouter Weerkamp and Maarten de Rijke. Credibility-inspired Ranking for Blog Post Retrieval. Information Retrieval, 15(3-4):243–277, June 2012.
[58] Mário Cordeiro. Twitter Event Detection: Combining Wavelet Analysis and Topic Inference Summarization. In Doctoral Symposium on Informatics Engineering, DSIE, volume 8, pages 11–16, 2012.
[59] David M. Blei, Andrew Y. Ng, and Michael I. Jordan. Latent Dirichlet Alloca-tion. J. Mach. Learn. Res., 3:993–1022, March 2003.
[60] P. Giridhar, M.T. Amin, T. Abdelzaher, L.M. Kaplan, J. George, and R. Ganti.ClariSense: Clarifying sensor anomalies using social network feeds. In 2014IEEE International Conference on Pervasive Computing and CommunicationsWorkshops (PERCOM Workshops), pages 395–400, March 2014.
[61] Man-Chun Tan, S.C. Wong, Jian-Min Xu, Zhan-Rong Guan, and Peng Zhang.An Aggregation Approach to Short-Term Traffic Flow Prediction. IEEE Trans-actions on Intelligent Transportation Systems, 10(1):60–69, March 2009.
[62] Andreas Klose and Andreas Drexl. Facility location models for distributionsystem design. European Journal of Operational Research, 162(1):4–29, April2005.
[63] C. S. ReVelle, H. A. Eiselt, and M. S. Daskin. A bibliography for some fun-damental problem categories in discrete location science. European Journal ofOperational Research, 184(3):817–848, February 2008.
[64] M. T. Melo, S. Nickel, and F. Saldanha-da-Gama. Facility location and supply chain management – a review. European Journal of Operational Research, 196(2):401–412, July 2009.
[65] Abraham Warszawski. Multi-Dimensional Location Problems. Journal of the Operational Research Society, 24(2):165–179, June 1973.
[66] A. M. Geoffrion and G. W. Graves. Multicommodity Distribution System Design by Benders Decomposition. Management Science, 20(5):822–844, January 1974.
[67] Alan W. Neebe and Basheer M. Khumawala. An Improved Algorithm for the Multi-Commodity Location Problem. The Journal of the Operational Research Society, 32(2):143–149, 1981.
[68] Cynthia Barnhart and Yosef Sheffi. A Network-Based Primal-Dual Heuristic for the Solution of Multicommodity Network Flow Problems. Transportation Science, 27(2):102–117, May 1993.
[69] Teodor Gabriel Crainic and Louis Delorme. Dual-Ascent Procedures for Multicommodity Location-Allocation Problems with Balancing Requirements. Transportation Science, 27(2):90–101, May 1993.
[70] A. K. Aggarwal, M. Oblak, and R. R. Vemuganti. A heuristic solution procedure for multicommodity integer flows. Computers & Operations Research, 22(10):1075–1087, December 1995.
[71] Choong Y. Lee. The Multiproduct Warehouse Location Problem: Applying a Decomposition Algorithm. International Journal of Physical Distribution & Logistics Management, 23(6):3–13, June 1993.
[72] Choong Y. Lee. A cross decomposition algorithm for a multiproduct-multitype facility location problem. Computers & Operations Research, 20(5):527–540, June 1993.
[73] Hasan Pirkul and Vaidyanathan Jayaraman. A multi-commodity, multi-plant, capacitated facility location problem: formulation and efficient heuristic solution. Computers & Operations Research, 25(10):869–878, October 1998.
[74] Nabila Azi, Michel Gendreau, and Jean-Yves Potvin. A dynamic vehicle routing problem with multiple delivery routes. Annals of Operations Research, 199(1):103–112, October 2011.
[75] Mathias Klapp and Alan Erera. The One-dimensional Dynamic Dispatch Waves Problem. 2015.
[76] Stacy A. Voccia, Ann M. Campbell, and Barrett W. Thomas. The Same-Day Delivery Problem for Online Purchases. ResearchGate, October 2015.
[77] J. F. Benders. Partitioning procedures for solving mixed-variables programming problems. Numerische Mathematik, 4(1):238–252, December 1962.
[78] Zuo-Jun Max Shen and Lian Qi. Incorporating inventory and routing costs in strategic location models. European Journal of Operational Research, 179(2):372–389, June 2007.
[79] Niels Agatz, Ann Campbell, Moritz Fleischmann, and Martin Savelsbergh. Time Slot Management in Attended Home Delivery. Transportation Science, 45(3):435–449, December 2010.
[80] Yu-Chung Tsao, Divya Mangotra, Jye-Chyi Lu, and Ming Dong. A continuous approximation approach for the integrated facility-inventory allocation problem. European Journal of Operational Research, 222(2):216–228, October 2012.
[81] Carlos F. Daganzo. The Distance Traveled to Visit N Points with a Maximum of C Stops per Vehicle: An Analytic Model and an Application. Transportation Science, 18(4):331–350, November 1984.
[82] Kyle D. Cattani, Gilvan C. Souza, and Shengqi Ye. Shelf Loathing: Cross Docking at an Online Retailer. Production and Operations Management, 23(5):893–906, May 2014.
[83] Ming Ni, Qing He, Jose Walteros, Xian Liu, and Arun Hampapur. Same Day Delivery Planning with Store Fulfillment. Submitted to Transportation Science.
[84] Alp Arslan, Niels Agatz, Leo G. Kroon, and Rob A. Zuidwijk. Crowdsourced Delivery: A Dynamic Pickup and Delivery Problem with Ad-Hoc Drivers. SSRN Scholarly Paper ID 2726731, Social Science Research Network, Rochester, NY, September 2016.
[85] Aashwinikumar Devari, Alexander G. Nikolaev, and Qing He. Crowdsourcing the last mile delivery of online orders by exploiting the social networks of retail store customers. Transportation Research Part E: Logistics and Transportation Review, 105(Supplement C):105–122, September 2017.
[86] Stephen Mahar and P. Daniel Wright. The value of postponing online fulfillment decisions in multi-channel retail/e-tail organizations. Computers & Operations Research, 36(11):3061–3072, November 2009.
[87] Ping Josephine Xu, Russell Allgor, and Stephen C. Graves. Benefits of Reevaluating Real-Time Order Fulfillment Decisions. Manufacturing & Service Operations Management, 11(2):340–355, December 2008.
[88] Jason Acimovic and Stephen C. Graves. Making Better Fulfillment Decisions on the Fly in an Online Retail Environment. Manufacturing & Service Operations Management, 17(1):34–51, December 2014.
[89] Valentina Carbone, Aurélien Rouquet, and Christine Roussat. Carried away by the crowd: what types of logistics characterise collaborative consumption. Pages 1–21, June 2015.
[90] Martin Savelsbergh and Tom Van Woensel. 50th Anniversary Invited Article – City Logistics: Challenges and Opportunities. Transportation Science, March 2016.
[91] Claudia Archetti, Martin Savelsbergh, and M. Grazia Speranza. The Vehicle Routing Problem with Occasional Drivers. European Journal of Operational Research, 254(2):472–480, October 2016.
[92] Yuan Wang, Dongxiang Zhang, Qing Liu, Fumin Shen, and Loo Hay Lee. Towards enhancing the last-mile delivery: An effective crowd-tasking model with scalable solutions. Transportation Research Part E: Logistics and Transportation Review, 93(Supplement C):279–293, September 2016.
[93] Nabin Kafle, Bo Zou, and Jane Lin. Design and modeling of a crowdsource-enabled system for urban parcel relay and delivery. Transportation Research Part B: Methodological, 99(Supplement C):62–82, May 2017.
[94] Harri Paloheimo, Michael Lettenmeier, and Heikki Waris. Transport reduction by crowdsourced deliveries – a library case in Finland. Journal of Cleaner Production, 132(Supplement C):240–251, September 2016.
[95] Aymeric Punel and Amanda Stathopoulos. Exploratory Analysis of Crowdsourced Delivery Service Through a Stated Preference Experiment. 2017.
[96] Mathias A. Klapp, Alan L. Erera, and Alejandro Toriello. The dynamic dispatch waves problem for same-day delivery. Under review, 2016.
[97] FedEx SameDay. http://www.fedex.com/us/fedex/shippingservices/package/sameday.html, 2017. [Online; accessed 30-November-2017].
[98] Kris Johnson Ferreira, Bin Hong Alex Lee, and David Simchi-Levi. Analytics for an Online Retailer: Demand Forecasting and Price Optimization. Manufacturing & Service Operations Management, 18(1):69–88, November 2015.