Transportation Analytics and Last-mile Same-day Delivery with Local Store Fulfillment
by
Ming Ni

January 5, 2018
A Dissertation submitted to the Faculty of the Graduate School of
the University at Buffalo, State University of New York
in partial fulfillment of the requirements for the
degree of
Doctor of Philosophy
Department of Industrial and Systems Engineering
Copyright by
Ming Ni
2018
Doctoral Committee:
Qing He
Stephen Still Assistant Professor of Civil, Structural and Environmental Engineering and of Industrial and Systems Engineering
Advisor and Chair of Committee

Mark H. Karwan
Praxair Professor of Operations Research, Industrial and Systems Engineering
SUNY Distinguished Teaching Professor
Committee Member

Jose L. Walteros
Morton C. Frank Assistant Professor of Industrial and Systems Engineering
Committee Member

Jing Gao
Assistant Professor of Computer Science and Engineering
Committee Member
ABSTRACT
Social media and online retailing have recently emerged, become increasingly important, and continue to grow. More and more people use social media to share their real lives with the digital world, while at the same time browsing the virtual Internet to buy real products. A huge amount of data is generated in the process. We investigate this data and crowdsourcing in the areas of public transportation and last-mile delivery for online orders, from the perspectives of data analytics and operations optimization.
We first focus on transit flow prediction with crowdsourced social media data. Subway flow prediction under event occurrences is a very challenging task in transit system management. To tackle this challenge, we leverage the power of social media by extracting features from crowdsourced content that gauge the public's willingness to travel. We propose a parametric, convex optimization-based approach that combines the least squares of linear regression with the prediction results of the seasonal autoregressive integrated moving average (SARIMA) model to accurately predict NYC subway flow under sporting events.
The second part of the thesis focuses on the last-mile same-day delivery with store fulfillment problem (SDD-SFP), using real-world data from a national retailer. We propose that retailers can take advantage of their physical local stores to fulfill nearby online orders in a direct-to-consumer fashion on the same day the order is placed. Optimization models and solution algorithms are developed to determine store selection, transportation fleet sizing, and inventory for seasonal supply chain planning. To solve large-scale SDD-SFP instances with real-world datasets, we create an accelerated Benders decomposition approach that integrates the outer mixed-integer programming search tree with local branching, along with optimization-based algorithms for initial lifting constraints.
In the last part of the dissertation, we drill down from the supply chain planning level of SDD-SFP to the supply chain operations level. The aim is to create an exact optimal order fulfillment plan that specifies how to deliver each received customer order. We adopt crowdsourced shipping, which utilizes the spare vehicle capacity of private drivers who execute delivery jobs along their trips, as a delivery option, and define the problem as same-day delivery with crowdshipping and store fulfillment (SDD-CSF). We develop a set of exact solution approaches for order fulfillment in the form of a rolling horizon framework, which repeatedly solves a series of order assignment and delivery planning problems along the timeline in order to construct an optimal fulfillment plan from local stores. Results from numerical experiments derived from real sales data of a retailer, along with algorithmic computational results, are presented.
TABLE OF CONTENTS
Page
LIST OF TABLES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
LIST OF FIGURES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
CHAPTER
1 Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
2 Forecasting the Subway Passenger Flow under Event Occurrences with
Social Media . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.2 Literature Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.3 Dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.4 Hashtag-based Event Identification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.5 Events Characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.6 Prediction Modeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
2.7 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3 Same-Day Delivery Planning with Store Fulfillment . . . . . . . . . . . . . . . . . . . . 28
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
3.2 Literature Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.3 Problem Description and Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.3.1 Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.3.2 SDDSFP Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.3.3 SDDSFP Benders Reformulation . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.3.4 Algorithmic Enhancement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
3.4 Computational results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
3.5 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
4 Crowdsourcing Delivery with Store Fulfillment . . . . . . . . . . . . . . . . . . . . . . . . . . 64
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
4.2 Related Works . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
4.3 Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
4.3.1 Rolling Horizon . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
4.3.2 The Cost Function for Received Orders . . . . . . . . . . . . . . . . . . . . 72
4.3.3 The Cost Function with Forecast Orders and Feedback Control 85
4.4 Case Study . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
4.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
5 Conclusion and Future Research Directions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
5.1 Summary of Findings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
5.2 Future Research Directions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
REFERENCES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
APPENDIX
LIST OF TABLES
Table Page
2.1 Sample Tweets Before Events . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.2 Sample Events and Their Top Hashtags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
3.1 Summary of data sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
3.2 Solving time (Minute) comparison of different solution methods . . . . . . . 57
3.3 The gaps between MIP and different solution methods . . . . . . . . . . . . . . . . 57
3.4 The Improvement of store selection by Algorithm 2 . . . . . . . . . . . . . . . . . . . 57
3.5 Summary of data set of Instance P4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
3.6 Metric comparison of different solution methods for Instance P4 . . . . . . . 59
4.1 The Orders and Stores Inputs for Two Levels of Customer Demand . . . . 99
4.2 The solving time (seconds) of the instances for different levels of cus-
tomer demand . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
LIST OF FIGURES
Figure Page
2.1 System architecture for passenger flow prediction from social media . . . . 7
2.2 Geographic distribution of tweets two hours before the events . . . . . . . . . 11
2.3 Comparisons of passenger flow and number of tweets . . . . . . . . . . . . . . . . . 13
2.4 Average event/nonevent daily passenger flow at Mets-Willets Point sta-
tion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.5 The correlation between tweet rates and passenger flow under events . . 18
2.6 Average passenger flow vs. average tweet rates at Citi Field Station . . 20
2.7 Performance metrics of the prediction models . . . . . . . . . . . . . . . . . . . . . . . . 24
2.8 The distributions of test errors to compare the SVR and OPL . . . . . . . . . 24
2.9 Improvement from ensemble learning from the OPL and SVR . . . . . . . . . 25
3.1 Parallel search trees between the master problem of Benders and orig-
inal MIP problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
3.2 The solution strategy of parallel search tree . . . . . . . . . . . . . . . . . . . . . . . . . . 54
3.3 The solution methodology framework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
3.4 Sensitivity analysis of the store order processing capacity . . . . . . . . . . . . . 60
3.5 Sensitivity analysis of the daily operation cost of own fleet truck . . . . . . 61
3.6 Sensitivity analysis of the package shipping cost by third party carrier . 62
4.1 The hourly summary of order fulfillment plans by the four models . . . . . 96
4.2 The order fulfillment plans and associated costs of the instance P7 by
the four models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
4.3 The solving results of the instances for different levels of customer
demand . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
4.4 Cases of forecast orders gzt and solving results from SDD-CSF models
with feedback control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
Chapter 1
INTRODUCTION
This thesis investigates crowdsourcing for transportation and logistics from different perspectives, including the exploration of crowdsourced social media content for public transportation flow prediction, and crowdsourced shipping with store fulfillment for last-mile same-day delivery. The thesis is organized as follows.
Chapter 2 covers subway flow prediction using crowdsourced social media content. Social media is a rich source of user-generated content. Public attention, opinions, and hot topics can be captured from social media, which provides the ability to predict human-related events. Since social media can be retrieved in real time with relatively small setup and maintenance costs, transportation authorities can treat social media data as another type of sensor for traffic or transit demand.
One challenge is how to extract reliable traffic-related features from big and noisy social media data. The other is how to find a feasible traffic study that fits well with social media data. We aim to use social media information to assist transit flow prediction under special event conditions. Specifically, a short-term subway flow prediction model incorporating crowdsourced features is developed to forecast the incoming transportation flow prior to big events. We propose a prediction model based on convex optimization that combines the least squares of linear regression and the prediction results of SARIMA in the same objective function to accurately predict subway passenger flow.
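The combination can be sketched as a single least-squares fit over a design matrix that stacks the SARIMA baseline forecast and a social-media feature. The following is a minimal pure-Python illustration only; the synthetic data, the feature names, and the plain normal-equations solver are hypothetical stand-ins for the actual formulation developed in Chapter 2:

```python
def solve_linear_system(A, b):
    """Gaussian elimination with partial pivoting (tiny stand-in for a solver)."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def fit_combined(flows, sarima_forecast, tweet_rate):
    """Least squares over [SARIMA forecast, tweet rate, intercept]:
    one convex objective combining both information sources."""
    X = [[s, t, 1.0] for s, t in zip(sarima_forecast, tweet_rate)]
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(3)] for i in range(3)]
    Xty = [sum(r[i] * y for r, y in zip(X, flows)) for i in range(3)]
    return solve_linear_system(XtX, Xty)

def predict(beta, sarima_value, tweet_value):
    return beta[0] * sarima_value + beta[1] * tweet_value + beta[2]
```

On data generated exactly from a linear relation, the fit recovers the coefficients; in practice the SARIMA term carries the recurrent pattern and the tweet-rate term corrects for event surges.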
Chapter 3 introduces a new problem of same-day delivery planning with store fulfillment to capture the current trend of same-day delivery in the omni-channel supply chain for brick-and-mortar retailers.
Electronic retailing has experienced significant growth over the past decade [1]. In the U.S. alone, electronic retail sales doubled over a span of only five years, from $170 billion in 2010 to a staggering $340 billion in 2015 [2]. In 2016, a report by the U.S. Census Bureau situated electronic commerce (e-commerce) at 8.1% of total retail sales nationwide, a notable increase from the 7.3% observed the year before [3]. Furthermore, according to the National Retail Association (NRA), the growth of e-commerce is projected to maintain a steady pace of about 8% to 12% in the forthcoming years [4, 5].
Perhaps the most interesting aspect of these recent trends is that the growth of e-commerce has significantly outpaced the overall growth of retail sales (currently about 3%). Evidence suggests that customers are rapidly gravitating towards the convenience of online shopping, in part because of the increasing number of electronic channels (e-channels) and services offered by online retailers [6]. Indeed, with the emergence of e-commerce giant Amazon and other online platforms like eBay, traditional brick-and-mortar retailers have been forced to pursue an electronic presence to avoid substantial drops in sales [7], which has resulted in a notable increase in the shopping alternatives available to most customers. Successful examples of this phenomenon are traditional retailers like Walmart and Best Buy, who have been able to consolidate profitable online stores, in contrast to once-major players like Circuit City, Radio Shack, and Borders, who failed to transition to this new e-commerce era.
The strong competition for online sales between traditional and online-exclusive
retailers has naturally stimulated the development of novel strategies and services to
attract new customers by promoting convenience and product accessibility [6]. One
notable example is the recent emergence of same-day delivery (SDD) options for some
types of products, which allows customers to have desired items delivered to their
doors only a few hours after the purchase. Online retailers with highly flexible supply chains, like Amazon, have been able to significantly reduce the delivery time of most of their products through their massive fulfillment centers. In contrast, traditional retailers, whose supply chains are largely designed to support the fulfillment of their physical stores, must resort to different alternatives to cope with the operational requirements of implementing SDD.
Based on this fact, we argue that traditional retailers can take advantage of their physical local stores to fulfill nearby online orders in a direct-to-consumer fashion on the same day the order is placed. To construct the SDD network with a robust logistics plan from scratch, we develop optimization models and solution algorithms for store location, transportation channel selection, and inventory management. The problem is named the same-day delivery with store fulfillment problem (SDD-SFP). The proposed solution methodology framework includes Benders decomposition, a store selection algorithm, cut strengthening methods, and parallel search trees.
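A classic Benders loop can be sketched on a toy store-selection instance: the master picks which stores to enable, the subproblem routes a fixed demand through the enabled stores, and the subproblem's dual values generate optimality cuts. This sketch is a hedged illustration only: the instance numbers are hypothetical, and the brute-force master below is a stand-in for the actual MIP search tree with local branching and lifting constraints described in Chapter 3.

```python
import itertools

# Toy instance (all numbers hypothetical): three candidate stores.
FIXED  = [10.0, 14.0, 25.0]   # cost of enabling SDD at store j
UNIT   = [4.0, 2.0, 1.0]      # per-package fulfillment cost at store j
CAP    = [5, 5, 5]            # packages store j can process
DEMAND = 8
STORES = range(3)

def subproblem(y):
    """Serve DEMAND from enabled stores, cheapest first (the LP optimum),
    and return the cost plus dual multipliers for a Benders optimality cut."""
    left, cost, marginal = DEMAND, 0.0, 0.0
    for j in sorted((j for j in STORES if y[j]), key=lambda j: UNIT[j]):
        take = min(left, CAP[j])
        cost += take * UNIT[j]
        if take > 0:
            marginal = UNIT[j]        # dual of the demand constraint
        left -= take
    mu = [max(0.0, marginal - UNIT[j]) for j in STORES]  # capacity-row duals
    return cost, (marginal, mu)

def benders():
    """Benders loop: master proposes stores, subproblem prices the proposal
    and returns a cut; stop when the lower and upper bounds meet."""
    cuts, ub, best = [], float("inf"), None
    while True:
        lb, y_star = float("inf"), None
        for y in itertools.product([0, 1], repeat=3):     # brute-force master
            if sum(CAP[j] * y[j] for j in STORES) < DEMAND:
                continue              # feasibility: capacity must cover demand
            theta = max([0.0] + [pi * DEMAND -
                                 sum(m[j] * CAP[j] * y[j] for j in STORES)
                                 for pi, m in cuts])
            val = sum(FIXED[j] * y[j] for j in STORES) + theta
            if val < lb:
                lb, y_star = val, y
        cost, cut = subproblem(y_star)
        total = sum(FIXED[j] * y_star[j] for j in STORES) + cost
        if total < ub:
            ub, best = total, y_star
        if ub - lb < 1e-9:            # bounds meet: provably optimal
            return best, ub
        cuts.append(cut)
```

On this instance the loop converges in two iterations, enabling stores 1 and 2; the acceleration devices of Chapter 3 address the fact that real instances cannot enumerate the master like this.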
Chapter 4 focuses on the daily operations of same-day delivery with crowdshipping and store fulfillment (SDD-CSF). We drill down from the supply chain planning level of SDD-SFP to the supply chain operations level. This chapter aims to close the last-mile gap between stores and customers. SDD-CSF makes an order fulfillment plan from two aspects: the order sourcing decision and the delivery method selection. The sourcing decision considers both currently received orders and future forecast demand, in order to minimize not only the immediate fulfillment cost but also the expected future cost.

We adopt the new concept of crowdsourced shipping, which utilizes the spare vehicle capacity of private drivers who execute delivery jobs along their trips. The delivery methods therefore include self-operated or carrier-operated truck fleets, as well as store walk-in customers who are willing to deliver packages for others.
Chapter 4 develops a set of exact solution approaches for order fulfillment in the form of a rolling horizon framework. The original dynamic programming problem for currently received orders is mathematically approximated by a mixed-integer linear programming model. The model considers both currently received orders and predicted future demand to make order assignment decisions that minimize the immediate delivery cost plus the resulting expected future cost. The framework repeatedly solves the model along the timeline to construct an optimal fulfillment plan from local stores. With the help of the rolling horizon structure, we also introduce a feedback control system to cope with inaccurate forecasts of future demand.
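The rolling horizon loop above can be sketched as follows. This is a hedged toy, not the chapter's MILP: the option names, the per-package costs, the per-epoch capacities, and the greedy cheapest-option rule are all hypothetical stand-ins, and the feedback step is reduced to a simple proportional correction of the forecast.

```python
# Hypothetical per-package delivery cost for each delivery method.
COST = {"crowd": 5.0, "truck": 8.0, "carrier": 12.0}

def plan_epoch(n_orders, capacity):
    """Assign each order arriving in this epoch to the cheapest option
    that still has capacity (carrier capacity is effectively unlimited)."""
    plan = []
    for _ in range(n_orders):
        option = min((o for o in COST if capacity[o] > 0),
                     key=lambda o: COST[o])
        capacity[option] -= 1
        plan.append(option)
    return plan

def rolling_horizon(arrivals, capacities):
    """Solve one epoch at a time along the timeline, stitching the
    per-epoch plans into a full fulfillment plan."""
    return [plan_epoch(n, dict(cap)) for n, cap in zip(arrivals, capacities)]

def corrected_forecast(raw_forecast, last_actual, last_forecast, gain=0.5):
    """Feedback control: nudge the next forecast by a fraction of the
    most recent forecast error."""
    return raw_forecast + gain * (last_actual - last_forecast)
```

Each epoch is solved with only the information available at that time, which is exactly what makes the feedback correction useful when the demand forecast drifts.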
Finally, we summarize our results and discuss future research opportunities in Chapter 5.
Chapter 2
FORECASTING THE SUBWAY PASSENGER FLOW UNDER EVENT
OCCURRENCES WITH SOCIAL MEDIA
2.1 Introduction
Passenger flow prediction is critical for the planning, management, and operations of public transit systems [8]. The output from the prediction can benefit transit network design, route scheduling, and station crowd regulation operations [9]. The majority of previous studies focus on forecasting day-to-day recurrent passenger flow [10, 11, 12, 13]. However, when it comes to non-recurrent events (e.g., sporting games, concerts, running races), passenger flow prediction becomes a very challenging task because of their irregularity and inconsistency. Very limited methods have been proposed in the literature.
To solve this problem, instead of revising existing methods, we intend to leverage a new kind of data: social media. User-generated content on social media strengthens the linkage and interactions between users while providing a large amount of information. This vast information is able to capture public attention, which is one of the common traits of events.

However, social media data is much more difficult to process than traditional relational data. Several major challenges remain in handling social media data, which is unstructured, noisy, and gigantic, and contains a variety of information. Take Twitter data for example. In 2014 alone, we collected over 29.7 million geo-tagged posts bounded in the New York City area. At the individual post level, a fundamental question of data mining arises: what is the post talking about, and what event
information it contains. Thus, the first challenge (C1), within a transportation context, is how to identify the transportation-related events that each post refers to. Aggregated at the spatial-temporal level, individual geo-tagged posts enable social activity analysis. Transportation authorities can leverage such information to identify hot spots and further indicate passenger flows in the near future for public gatherings. Therefore, the second challenge (C2) is how to develop a method that coordinates social media for forecasting passenger flow, especially under event occurrences.
This chapter aims to address challenges (C1) and (C2). More specifically, under event occurrences, we intend to extract event information from geo-tagged social media data, and to leverage both historical transit data and real-time social media data to forecast future passenger flow at subway stations. The following questions will be investigated: (i) Can social media be used to identify public events in real life? (ii) How can the prediction model be built from the features extracted from social media? To the best of our knowledge, there has been little published research on passenger flow prediction with social media.
This chapter has the following structure. Section 2.2 summarizes related works on recent popular transportation prediction techniques and the uses of social media in transport applications. An overview of the data, including subway passenger flow and social media, is given in Section 2.3. Section 2.4 describes the setup of the event detection approach. Section 2.5 presents a detailed analysis of the relationship between event passenger flow and social media. Section 2.6 presents the technical details of prediction modeling and experiments on real-world datasets. Finally, Section 2.7 provides concluding remarks.
Figure 2.1: System architecture for passenger flow prediction from social media
2.2 Literature Review
There is a vast literature on short-term transportation forecasting [14]. Generally, two groups of approaches have received wide attention, namely parametric and non-parametric techniques.

The common parametric techniques include the autoregressive integrated moving average (ARIMA) model, exponential smoothing [15], and the historical average [16]. In particular, ARIMA has been fully developed for various transportation prediction purposes, including traffic occupancy [17], travel time [18], and traffic flow [19]. Previous research [20, 21] shows that ARIMA performs well for stationary and non-event time series. With the rise of data mining and data science, non-parametric techniques have also been widely adopted recently. Neural networks [22, 23], support vector machines for regression (SVR) [24], and k-nearest neighbors [25] have been used to build traffic volume prediction models for time-series data.
Passenger flow prediction is a subcategory of short-term transportation prediction. Researchers have adopted both kinds of prediction techniques to forecast passenger flow for railways [22, 26, 27], bus stops [28, 29], and subway stations. Specifically for passenger flow prediction at subway stations, there are different prediction levels: whole transit lines [10, 11], one station with passenger transfer flow [12], and one station with entrance and transfer flow [13]. All of them obtained desirable prediction results for typical commuting volumes. However, none of them considers atypical conditions.
Recently, more and more attempts have been made to apply Internet and social media analysis in the domain of transportation [30, 31, 32]. A huge group of people in the online community generates a tremendous amount of content. Chaniotakis and Antoniou [33] proposed a generic methodological framework for collecting and analyzing data from social media. Other researchers took advantage of crowdsourcing these resources to capture incoming non-recurrent events [34], to explain the causes of transport overcrowding [35], to investigate intelligent transportation systems services [36], and to apply deep learning approaches [37, 38]. Studies exploiting this area mainly fall into two applications, traffic detection and traffic prediction, with supervised learning techniques.
In the application of traffic detection, Wanichayapong et al. [39] used syntactic analysis to classify traffic incident information from social media data into spatial categories. Schulz et al. [40] extracted features from part-of-speech tagging and words in Twitter posts and developed classifiers to detect car accident occurrences; they applied spatial and temporal filtering to locate the accidents. Daly [41] built a system called Dublin's Semantic Traffic Annotator and Reasoner that uses natural language processing techniques to analyze social media content in order to capture real-time traffic conditions. Mai and Hranac [42] explored the time and location of related Twitter posts after traffic incidents occurred; they found that for freeway incidents the majority of tweets are posted within 5 hours and 25 miles. Gal-Tzur et al. [43] used Twitter messages sent by transportation authorities to develop classifiers that identify posts related to transportation information; moreover, they presented a keyword-based hierarchical schema to categorize these posts. Chen et al. [44] tried to detect traffic congestion and its location solely from social media data by using topic modeling and hinge-loss Markov random fields. D'Andrea et al. [45] utilized Twitter data and developed a support vector machine model to recognize useful keywords from tweets and detect traffic events in a highway road network. Kumar et al. [46] incorporated social media to detect road hazards through sentiment and language analysis. Most recently, Zhang et al. [47] studied and revealed the characteristics of traffic flow surges near tweet concentrations, defined as clusters of keywords for traffic-related events. Further, Zhang and He proposed analytical models to detect on-site traffic accidents [48] and to decode people's travel behavior with geo-mobility clustering [49].
For traffic prediction, He et al. [50] proposed a long-term traffic prediction models
with social media features for a freeway network in San Francisco Bay area. They
found that there exists a negative correlation between social activity on the web and
traffic activity on the roads. Ni et al. [51] tried to forecast freeway traffic flows under
special event conditions by taking into account information derived from social media.
Lin et al. [52] applied linear regression models for predicting the impact of inclement
weather on freeway speed with the help of social media.
For subway and transit, Collins et al. [53] used sentiment analysis of transit
riders' short messages on social media to measure their satisfaction with transit.
They found that social media posts with sharply increased negative sentiment
indicated transit incidents, such as fires and delays.
The studies above show that there is great potential in using social media to locate
relevant information for transportation applications. However, none of the previous studies
explores the effectiveness of using social media for passenger flow prediction in public
metro transit systems.

Table 2.1: Sample Tweets Before Events

| Type | Start Time | Details | Created at | Text content |
| Baseball game | 2014-05-14 19:10 | Mets vs. Yankees | 2014-05-14 18:22:22 | Checked in CITI field for the yankees vs mets game w yankees mets |
| Tennis games | 2014-08-25 19:00 | US Open 1st round | 2014-08-25 17:49:46 | Im at 2014 usopen tennis championships in flushing ny |
| Baseball + Tennis games | 2014-08-28 19:10 (B), 19:00 (T) | Mets vs. Braves and US Open 2nd round | 2014-08-28 18:29:10 | love this place billy jean king national tennis centre, us open |
2.3 Dataset
This study expands the successful applications of social media data to predicting
passenger volume at a subway station. We focus our study on the Mets-Willets Point
subway station on Line 7 in New York City. The station is selected for two
main reasons. First, Mets-Willets Point is adjacent to not one but two stadiums,
Citi Field and the USTA Billie Jean King National Tennis Center (NTC). Citi Field is
the home stadium of the New York Mets baseball team, and the NTC hosts the annual US
Open grand-slam tennis tournament. Second, sporting events consistently attract public
attention; we observe a substantial volume of social media posts referring to the
events.
We collected the turnstile usage at Mets Willets Point subway station from
Metropolitan Transportation Authority (MTA) [54]. In order to cover various types
of events, the time range is set from April 2014 to October 2014, in which various
events occur nearby.
Figure 2.2: Geographic distribution of tweets two hours before the events. (a) Baseball game; (b) Tennis games; (c) Baseball + Tennis games.
Turnstile devices record passengers passing each turnstile for either entry or exit,
and the counts are reported as four-hour aggregates. In this paper, we aggregate
both entry and exit flows into a total passenger flow, which is of the transit agency's
interest.

We collected Twitter data, known as tweets, as our social media data. A Twitter
message is an online text post by a Twitter user, limited to 140 characters. Tweets were
collected over the same temporal window through the Twitter Streaming API with a
geo-location filter [55]. The spatial bounding box was set to cover only the subway
station and the two stadiums. Because of the location filter, each tweet contains its
geographic coordinates in addition to text content, username, and timestamp. Inside a
post, users can prefix words with a # symbol; such words are called Twitter hashtags. A
hashtag provides a tagging convention that associates tweets with certain topics,
contexts, or events. The aforementioned information from Twitter messages defines a
tweet in this paper.
Fig. 2.2 shows the locations of tweets sent two hours before different types of
events start. As can be seen, tweets were mostly sent from the stadium in which
the event was held. Moreover, different events correspond to different social media
activities and to various levels of public attention. From the social media data
perspective, the characteristics of tweets, such as timestamps, geolocations, text
content, and quantity ratios, lead to such differences. Our objective is to find ways to measure these
differences in social media data and leverage them into prediction models to forecast
subway passenger flow.
2.4 Hashtag-based Event Identification
The events held in the stadiums were well attended. The attendance brings not only
a high volume of passenger flow but also activity on Twitter, as shown in Fig. 2.3.
As one can see, event scenarios generate large spikes of social media activity and
passenger flow at the same time. We assume that the complete schedule of all events is
unknown to transit operators. The Mets-Willets Point subway station serves transit
passengers for two major sporting events: the US Open Tennis Championships and
Major League Baseball home games of the New York Mets. The former was held over a
two-week period in late August and early September, and the latter ran from April to
September 2014. However, after initial examination, we found that other events,
such as concerts and speeches, were held nearby as well. Therefore, we need to
identify the events from social media data.
Instead of detecting the exact topic of the events [56, 57, 58], we examine tweets
within the area and probe whether events involving high social activity exist. To
identify the events correctly, rather than using the complex machinery of
latent-variable topic models (e.g., Latent Dirichlet Allocation [59]),
we employ Twitter hashtags to measure social media activities and provide the
context for them [60].
Figure 2.3: Comparisons of passenger flow and number of tweets. (a) Passenger flow; (b) Number of tweets.
Hashtag extraction is the first step of the proposed event detection algorithm. We
denote by $t$ one of the time intervals, with $t = 1, \ldots, T$, where $T$ is the total number of
four-hour intervals. Let $HL_t = \{H_{t1}, \ldots, H_{tj}, \ldots, H_{tJ_t}\}$ be the list of hashtags during $t$,
where $H_{tj}$ is the $j$th hashtag and $J_t$ is the total number of hashtags labeled by Twitter
users during $t$.

Furthermore, let $M^H \in \mathbb{R}^{T \times S}$ denote the hashtag matrix, where $S$ is the number
of distinct hashtags. Its element $M^H_{t,s}$ corresponds to the occurrence count of the $s$th hashtag in
the $t$th time interval. Since the hashtag matrix covers all $T$ time intervals, $S \geq \max_{t} J_t$.
In the hashtag matrix, the hashtags from all time intervals merge into the
columns, and the various words and phrases depict different aspects of social activity. In
sum, the column names of the hashtag matrix are the hashtags, the rows stand for the
time intervals, and each entry of the matrix is the frequency of a hashtag.
The notation is summarized as follows:

• $t$: the index of time intervals, $t = 1, \ldots, T$.

• $p$: the index of tweets.

• $s$: the index of hashtags.

• $J_t$: the total number of hashtags labeled by Twitter users during $t$.

• $M^H \in \mathbb{R}^{T \times S}$: the hashtag matrix, where $S$ is the total number of hashtags.

• $H_{tj} \in HL_t$: the $j$th hashtag in the list $HL_t$.

• $OC_t$: the occurrence count of each element of $HL_t$ in time interval $t$.

• $TW_{p,s}$: the occurrence of the $s$th hashtag of $HL$ in the $p$th tweet.

The algorithm for event detection by hashtags is given below.
Algorithm 1: Hashtag-based Event Identification
Input:  Tweets within the area
Output: Hashtag matrix $M^H \in \mathbb{R}^{T \times S}$
1  Hashtag extraction:
     $HL_t = \{H_{t1}, \ldots, H_{tj}, \ldots, H_{tJ_t}\}, \; \forall t \in [1, T]$
2  Lexical analysis:
     $HL \equiv \bigcup_{t=1}^{T} HL_t$;
     remove stop words, punctuation, and duplicated strings from $HL$
3  Label all collected tweets by hashtag:
     $TW_{p,s} \equiv$ the occurrence of the $s$th word of $HL$ in the $p$th tweet
4  for each tweet $p = 1, \ldots, P$ do
5      for each word $s$ in $HL$ do
6          append $TW_{p,s}$ as a new column for the $p$th tweet
7      end
8  end
9  Build the hashtag matrix $M^H \in \mathbb{R}^{T \times S}$:
     each row of $M^H$ is the occurrence vector $OC_t \in \mathbb{R}^S$ of the elements of $HL$ in interval $t$
10 for $t = 1, \ldots, T$ do
11     $OC_{t,s} = \sum_{p \in t} TW_{p,s}, \; \forall s$
12     $M^H_t = OC_t$
13 end
14 Peak detection:
15 for $t = 1, \ldots, T$ do
16     rank interval $t$ by $\sum_{s \in S} OC_{t,s}$, from largest to smallest
17     sort $OC_{t,s}$ over $s$ from largest to smallest
18 end
Since different time intervals can involve different hashtags, $M^H$ is naturally a
sparse matrix, in which each column corresponds to the frequency of one hashtag
across the time intervals. By concatenating the hashtag lists $HL_t$ over $t$, we convert
$M^H$ to a full-storage matrix so that the hashtag matrix can be sorted row by row for
the subsequent peak detection.

Moreover, instead of directly using the occurrence of hashtags labeled by Twitter
users, we extract the string vector of hashtags and use it to label the text content
of each tweet. This allows the approach to capture tweets about a similar topic that
carry no hashtags.
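A minimal sketch of this labeling step, assuming tweets arrive as (interval, text) pairs; the function names and data below are illustrative, not from the dissertation:

```python
# Sketch of the hashtag-matrix construction: extract a hashtag vocabulary,
# then label every tweet's text against it, so that tweets on a similar
# topic are counted even when they carry no hashtag themselves.
import re
from collections import Counter

def extract_hashtags(text):
    """Return the lowercased hashtags labeled in a tweet's text."""
    return [h.lower() for h in re.findall(r"#(\w+)", text)]

def build_hashtag_matrix(tweets, num_intervals):
    """Build M^H: rows are four-hour intervals, columns are hashtags."""
    vocab = sorted({h for _, text in tweets for h in extract_hashtags(text)})
    counts = [Counter() for _ in range(num_intervals)]
    for t, text in tweets:
        body = text.lower()
        for s in vocab:              # label tweet text by hashtag strings
            if s in body:
                counts[t][s] += 1
    # dense T x S matrix of hashtag occurrences per interval
    return [[counts[t][s] for s in vocab] for t in range(num_intervals)], vocab

tweets = [(0, "Checked in Citi Field for the game #mets #yankees"),
          (0, "lets go mets tonight"),        # no hashtag, still matched
          (1, "quiet afternoon in Queens")]
MH, vocab = build_hashtag_matrix(tweets, 2)   # MH = [[2, 1], [0, 0]]
```

The second tweet carries no hashtag but still contributes to the "mets" column, which is the point of labeling by hashtag strings rather than by the hashtags alone.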
Finally, we implement peak detection to extract the most frequently occurring hashtags
as event hashtags, which represent social media activity with context. Table 2.2
presents the top three most frequent hashtags. Moreover, we use the sum of all
hashtag occurrences in each time interval to measure the social media activity. A
highly ranked interval indicates that it is under event occurrence.
Table 2.2 shows the various detected events, including the US Open, baseball games,
music shows, running races, etc. To validate the method, we compare the
detection results against the true home game schedule of the New York Mets, which
spans a long time range with a decent number of games. There were 81 game days between
April 2014 and October 2014 for the New York Mets. After eliminating days with missing
Twitter data, 65 game days remain. Since the objective of event detection is to
sense positive events rather than non-events, we evaluate the identification results
with precision, recall, and F-1 score.

The proposed method performs well in identifying the baseball events: the
precision is 98.27%, the recall 87.69%, and the F-1 score 0.9268.
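As a quick check, the reported F-1 score is the harmonic mean of the reported precision and recall:

```python
# The F-1 score is the harmonic mean of precision and recall;
# the values below are the ones reported in the text.
def f1_score(precision, recall):
    return 2 * precision * recall / (precision + recall)

f1 = f1_score(0.9827, 0.8769)   # precision 98.27%, recall 87.69%
# rounds to 0.9268, matching the reported F-1 score
```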
Note that there are two reasons to use event hashtags instead of the quantity
Table 2.2: Sample Events and Their Top Hashtags

| Date | Hour | No. of EH | Top Hashtags |
| 3/31 | 17 to 21 | 65 | mets, openingday, ny |
| 4/5 | 13 to 17 | 306 | mets, reds, baseball |
| 4/9 | 17 to 21 | 34 | amaluna, cirquedusoleil, citifield |
| 5/14 | 17 to 21 | 710 | mets, yankees, subwayseries |
| 5/31 | 9 to 13 | 85 | happiest5k, queens, ny |
| 6/7 | 17 to 21 | 75 | digifestnyc, nyc, selfie |
| 8/25 | 17 to 21 | 437 | usopen, tennis, usopen2014 |
| 8/31 | 13 to 17 | 609 | usopen, mets, tennis |
of tweets directly. First, a high volume of tweets does not necessarily indicate an
event and its attendance. In our observation, conversations between users, commercial
promotions, or information dissemination can also generate a high quantity of tweets.
The proposed hashtag-based method diminishes the effects of these unrelated tweets.
Second, the top event hashtags describe what the event is about, even though the
hashtags might not be formal English words. As can be seen in Table 2.2, different
kinds of events and baseball teams are easily recognized from the top event hashtags.
2.5 Event Characteristics

Different events in the stadiums bring different sizes of audience to the sites, and
the passenger flow at the subway station varies accordingly.

As shown in Fig. 2.4, there are huge differences between event and ordinary transit
traffic in quantity and, more importantly, in variation. This difference inevitably leads to
Figure 2.4: Average event/nonevent daily passenger flow at Mets-Willets Point station

| Time of day | 5:00 | 9:00 | 13:00 | 17:00 | 21:00 | 24:00 + 1 hour |
| Non-Event Volume | 161.88 | 929.07 | 1500.75 | 1949.77 | 2166.36 | 997.72 |
| Event Volume | 241.76 | 1113.89 | 2828.61 | 5087.21 | 8066.04 | 5829.68 |
| Ratio of Non-Event over Event Flow | 66.96% | 83.41% | 53.06% | 38.33% | 26.86% | 17.11% |
Figure 2.5: The correlation between tweet rates and passenger flow under events. (a) Number of tweets vs. passenger flow (correlation 0.6286, R2 0.3952); (b) Number of users vs. passenger flow (correlation 0.6979, R2 0.4870).
the difficulty of transit prediction by traditional time series models (e.g., ARIMA).
On the other hand, Fig. 2.5 plots the number of tweets against passenger flow
under event occurrences in (a), and the number of Twitter users against passenger flow
in (b). As one can see, a linear trend is observed between tweet counts and passenger
flow. The correlation coefficient is above 0.62 and the adjusted R2 value is above 0.39.
The R2 values indicate that the number of users is the more robust predictor. We
therefore believe that there exists a moderate positive correlation between tweet
counts and event passenger flows. This result gives us the confidence to further
explore prediction modeling of event passenger flow with social media.
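As a sanity check on Fig. 2.5: for a one-predictor linear regression, the (unadjusted) R2 equals the squared Pearson correlation, and the reported pairs obey this up to rounding and the adjusted-R2 correction:

```python
# For simple (one-predictor) linear regression, R^2 is the square of the
# Pearson correlation r; the r values below are those reported in Fig. 2.5.
def r_squared(r):
    return r * r

print(round(r_squared(0.6286), 4))  # tweets vs. flow -> 0.3951
print(round(r_squared(0.6979), 4))  # users vs. flow  -> 0.4871
```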
Note that our study is constrained by the availability of geo-tagged tweets. For some
time periods, the number of tweets is very small regardless of the time of day. In
this case, the event identification step, which measures social media activity,
automatically excludes these periods from the correlation study and the following
analysis.
2.6 Prediction Modeling
In this section, we intend to investigate whether or not the content of social media
will assist in forecasting event passenger flow. The first step is to identify the best
time lags for the prediction models.
To quantify the tweets, we define two types of features, the tweet rates, derived
from social media data:
• NTweets(t): Number of event-related tweets at time step t.
• NUsers(t): Number of unique tweet users at time step t.
Because the record time interval of transit passenger flow is four hours, we also
aggregate the tweets data in four-hour intervals. If the predicted passenger flow is
Figure 2.6: Average passenger flow vs. average tweet rates at Citi Field Station. (a) Nonevent; (b) Event. Each panel plots the passenger flow volume against NTweets at lags 0 through 4.
at time t, we shift the tweet rates to earlier hours, $t-1, t-2, \ldots, t-L$, since prediction
requires features available ahead of the passenger flow time. Based on the positive
correlation between tweet rates and passenger flow in Fig. 2.6, we construct a linear
regression (LR) model, where passenger flow is the dependent variable and tweet rates
at different lags are the independent variables.
The highest predictive correlation is achieved when the tweet rates are calculated
one hour prior to the event time range: the adjusted R2 value is 0.616 in the
one-hour-lag case, compared with 0.488 and 0.512 for the zero- and two-hour-lag
cases, respectively. Also, as shown in Fig. 2.6, the curve of tweet rates with a
one-hour lag fits the curve of event passenger flow best, whereas for non-event
passenger flow there is no obvious pattern between tweet rates and passenger flow.
Based on this analysis, we include tweet rates with a one-hour lag in the base
prediction model in the following analysis.
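The lag construction can be sketched as follows; the arrays are synthetic placeholders and `lagged_design` is a hypothetical helper, not code from the dissertation:

```python
# Tweet-rate features are shifted one interval ahead of the flow they
# predict, then fit by ordinary least squares (numpy.linalg.lstsq).
import numpy as np

def lagged_design(ntweets, nusers, lag):
    """Pair flow at step t with tweet rates at step t - lag."""
    n = len(ntweets) - lag
    return np.column_stack([np.ones(n),        # intercept column
                            ntweets[:n],
                            nusers[:n]])

ntweets = np.array([10., 40., 120., 300., 80.])
nusers  = np.array([ 8., 30.,  90., 210., 60.])
flow    = np.array([500., 900., 2500., 6000., 2000.])  # four-hour volumes

lag = 1
X = lagged_design(ntweets, nusers, lag)
y = flow[lag:]                             # flow lags the tweet rates by one step
w, *_ = np.linalg.lstsq(X, y, rcond=None)  # LR coefficients
```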
Next, we implement cross-validation to compare the results of the LR model with two
popular prediction models: average prediction (AVG) and the seasonal autoregressive
integrated moving average (SARIMA). We generate an experiment with 100 runs of
datasets from the event detection result; each run randomly splits the entire dataset
into training (70%) and test (30%) sets.
The prediction performance is evaluated by two metrics, namely Mean Absolute
Percentage Error (MAPE) and Root Mean Square Error (RMSE).
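The two metrics, written out explicitly (these are the standard formulations; the dissertation does not list the formulas):

```python
# Mean Absolute Percentage Error and Root Mean Square Error.
import math

def mape(actual, predicted):
    """Average of |error| relative to the actual value."""
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Square root of the mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

mape([100, 200], [90, 220])   # -> 0.1
rmse([100, 200], [90, 220])   # -> sqrt(250) ~ 15.81
```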
In our experiment with 100 runs, the LR model with tweet rates improves MAPE by
33.08% compared with SARIMA (see Fig. 2.7 for details). Notice that this performance
is achieved by the LR with only two variables. However, the LR model does not capture
the relation between time steps, even though the passenger flow data are time series
in nature.
We compare the R2 values of two models: 1) the tweets-based LR model and 2) the
historical-flow-based SARIMA model. The experiment obtains adjusted R2 values of
0.616 for the LR, 0.400 for the SARIMA, and 0.696 for the combined features of both.
As one can see, around 60% of the event passenger flow variance can be explained by
the variation in the number of tweets, and around 40% of the variance comes from
historical time-series flow data, which include a large portion of day-to-day
recurrent passenger flow and a small portion of the non-recurrent event flow.
The combination of the two methods yields a better R2 value because the LR provides
event-related features while the SARIMA contributes features related to the time
series and routine flow.
Inspired by the above experiment with the two modeling methods, we propose a convex
optimization based approach, called Optimization and Prediction with hybrid Loss
function (OPL), that fuses the LR model and the SARIMA model jointly in the objective
function. The OPL model aims to combine the unique strengths of linear regression on
social media features and of the SARIMA model in time series prediction.
The hypothesis of the proposed model is a parametric linear model, defined as:

$h_w(x) = w_0 x_0 + w_1 x_1 + w_2 x_2 + \cdots + w_n x_n, \quad x_0 = 1$
where $x_i$ is the $i$th feature and $w_i$ its corresponding coefficient. In total, the
experiment runs $m = 100$ times. Each entry of the experiment is one of the four-hour
intervals from the event detection result. Following our experimental design, we
randomly split the $m$ runs into training $m_{train}$ (70%) and test $m_{test}$ (30%) sets. The
two tweet rates, NTweets and NUsers, with a one-hour lag act as the features of the
model.
We construct the total loss function as:

$$J(w, y) = \sum_{j=1}^{m_{train}} \left(y^{(j)} - h_w(x^{(j)})\right)^2 + \alpha \sum_{j=1}^{m_{test}} \left(\hat{y}^{(j)} - h_w(x^{(j)})\right)^2 + \beta \sum_{j=1}^{m_{test}} \left(\hat{y}^{(j)} - y^{*(j)}\right)^2 \qquad (2.1)$$
The idea behind the loss function is to combine the modeling of predictions on
both training and test data with the predictions from the time series model. Equation
(2.1) contains three main parts. The first component is the sum of squared errors
on the training set, the same as in linear regression. The second component
incorporates the prediction directly into the loss function in order to minimize
the squared error on the test data. In addition, to fuse the results of SARIMA, we
add the sum of squared differences between the OPL predictions $\hat{y}^{(j)}$ and the SARIMA
predictions $y^{*(j)}$ to Equation (2.1) as the third component. Here $y^{*(j)}$ plays the
role of a regularizer that balances the whole loss function. Since OPL includes only
two independent variables, trial experiments show that L1 regularization to prevent
overfitting is unnecessary. In sum, OPL adopts the moderately correlated social media
features and incorporates the prediction results from a conventional time series
model.
To minimize Equation (2.1), we first vectorize all variables and coefficients:

$$W \in \mathbb{R}^n, \quad Y \in \mathbb{R}^{m_{train}}, \quad X^{train} \in \mathbb{R}^{m_{train} \times n}, \quad \hat{Y} \in \mathbb{R}^{m_{test}}, \quad X^{test} \in \mathbb{R}^{m_{test} \times n}, \quad Y^* \in \mathbb{R}^{m_{test}}$$
Then, the loss function is transformed into:

$$J(W, \hat{Y}) = \mathrm{tr}\big((Y - X^{train} W^T)(Y - X^{train} W^T)^T\big) + \alpha \cdot \mathrm{tr}\big((\hat{Y} - X^{test} W^T)(\hat{Y} - X^{test} W^T)^T\big) + \beta \cdot (\hat{Y} - Y^*)(\hat{Y} - Y^*)^T$$
Taking partial derivatives of the above equation with respect to $W$ and $\hat{Y}$, respectively,
we get:

$$\nabla_W J(W, \hat{Y}) = \big[(X^{train})^T X^{train} + \alpha (X^{test})^T X^{test}\big] W^T - (X^{train})^T Y^T - \alpha (X^{test})^T \hat{Y}^T = 0 \qquad (2.2)$$

$$\nabla_{\hat{Y}} J(W, \hat{Y}) = \alpha X^{test} W^T - (\alpha + \beta) \hat{Y}^T + \beta Y^{*T} = 0 \qquad (2.3)$$

Then, we use the gradient descent method to solve Equations (2.2) and (2.3) and
find a minimizer. Given Equation (2.1), gradient descent starts from an initial
$(W, \hat{Y})$ and iteratively moves toward a set of values that minimizes the function;
each iteration takes a step in the negative direction of the function gradient.
Because Equation (2.1) is convex, the result of OPL is the global optimum.
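A minimal numerical sketch of this gradient descent, with synthetic data and illustrative values for alpha, beta, and the step size (none of these come from the dissertation):

```python
# Gradient descent on the OPL loss (2.1), alternating updates of W and
# the test predictions Y_hat; data, alpha, beta, and the step size are
# all synthetic/illustrative choices.
import numpy as np

rng = np.random.default_rng(0)
n, m_tr, m_te = 3, 20, 8
X_tr = np.column_stack([np.ones(m_tr), rng.normal(size=(m_tr, n - 1))])
X_te = np.column_stack([np.ones(m_te), rng.normal(size=(m_te, n - 1))])
w_true = np.array([1.0, 2.0, -1.0])
y_tr = X_tr @ w_true
y_star = X_te @ w_true + 0.1 * rng.normal(size=m_te)  # SARIMA stand-in

alpha, beta, step = 1.0, 1.0, 0.01
W = np.zeros(n)
Y_hat = np.zeros(m_te)
for _ in range(3000):
    grad_W = (X_tr.T @ (X_tr @ W - y_tr)
              + alpha * X_te.T @ (X_te @ W - Y_hat))
    grad_Y = alpha * (Y_hat - X_te @ W) + beta * (Y_hat - y_star)
    W -= step * grad_W
    Y_hat -= step * grad_Y

# since (2.1) is convex, descent converges to the global minimum
loss = (np.sum((y_tr - X_tr @ W) ** 2)
        + alpha * np.sum((Y_hat - X_te @ W) ** 2)
        + beta * np.sum((Y_hat - y_star) ** 2))
```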
Figure 2.7: Performance metrics of the prediction models (AVG, SARIMA, LR, KNN, SVR, OPL). (a) MAPE; (b) RMSE.
Figure 2.8: The distributions of test errors comparing the SVR and OPL. (a) MAPE; (b) RMSE.
Figure 2.9: Improvement from ensemble learning over the OPL and SVR. (a) MAPE; (b) RMSE.
In order to benchmark our proposed method against existing popular prediction
approaches, we introduce two nonparametric methods: support vector regression (SVR)
and k-nearest neighbors (KNN). The prediction process uses cross-validation as well.

Fig. 2.7 illustrates that the OPL yields better prediction accuracy than the other
methods. Compared with the LR, the OPL improves MAPE by 11.4%. One can also see
that the SVR delivers desirable prediction performance. The SVR and the OPL have
different characteristics: the SVR is a nonparametric technique that considers tweet
rates only, while the OPL is a parametric method that incorporates the prediction
results from a conventional time series model. A further detailed comparison is
conducted over another 100 randomly generated runs.
Fig. 2.8 depicts the distributions of test errors for both SVR and OPL. While
either method performs relatively well on its own, the distributions are
heterogeneous in both metrics, MAPE and RMSE. The heterogeneity of the error
distributions encourages us to combine the merits of both techniques. Inspired by
the aggregation approach proposed in [61], we implement stacking, an ensemble
learning approach, to merge the prediction results of the SVR and OPL.
$$\hat{Y} = P(X^{train} \mid OPL) \cdot \arg\min_{Y} J(W, Y \mid OPL) + P(X^{train} \mid SVR) \cdot \arg\min_{Y} J(W, Y \mid SVR) \qquad (2.4)$$
We estimate $\hat{Y}$ by Equation (2.4). The weights come from the normalized root mean
square errors on the training data, and the output averages the minimizers of both
the SVR and the OPL.
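One plausible reading of this weighting, normalized inverse training RMSE (an assumption on our part, since the exact normalization is not spelled out), can be sketched as:

```python
# Stacking sketch for Equation (2.4): each model's prediction is weighted
# by the inverse of its training RMSE, normalized so the weights sum to one.
def ensemble(pred_opl, pred_svr, rmse_opl, rmse_svr):
    inv_opl, inv_svr = 1.0 / rmse_opl, 1.0 / rmse_svr
    w_opl = inv_opl / (inv_opl + inv_svr)
    w_svr = inv_svr / (inv_opl + inv_svr)
    return [w_opl * a + w_svr * b for a, b in zip(pred_opl, pred_svr)]

# the lower-RMSE model gets the larger weight (here 0.75 vs. 0.25)
blended = ensemble([1000., 2000.], [1200., 1800.],
                   rmse_opl=100., rmse_svr=300.)
```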
As one can see from Fig. 2.9, the ensemble approach yields better prediction
accuracy than either the OPL or SVR alone. It is worth mentioning that the improvement
over the conventional SARIMA is more than 40%. Notice that the tweet features are
obtained from free, real-time social media data. The results indicate the promising
value of using social media for passenger flow prediction under event conditions.
2.7 Conclusions
In this paper, we have addressed two important questions: whether social media
data can signify public gathering events, and what techniques can model passenger
flow prediction from the features extracted from social media.

First, we exploit social media to detect various events via hashtags. To capture
events precisely, the hashtags from Twitter users are analyzed, tuned, adapted, and
combined with lexical processing techniques and peak detection. Our approach achieves
good performance, with 98.27% precision and 87.69% recall for the baseball games. It
is a simple but efficient method to capture public gathering events with high social
media activity.
Second, we propose a convex optimization model called Optimization and Prediction
with hybrid Loss function (OPL) to fuse the least squares of linear regression
and the prediction results of SARIMA in the same objective function. The OPL hybrid
model takes advantage of the unique strengths of linear regression on social media
features and of the SARIMA model in time series prediction. Among several popular
prediction methods, OPL shows the best results in terms of MAPE and RMSE. In
addition, comparing the distribution of prediction errors of OPL with that of SVR,
a popular nonparametric and nonlinear method, reveals heterogeneous error patterns.
Therefore, an ensemble model is developed to leverage the weighted results of OPL
and SVR jointly, further increasing prediction accuracy and robustness.
Overall, social media data demonstrate their capability for passenger flow prediction
under event conditions. Social media offers a cost-effective way to obtain real-time
traveler-related data and fills the gap between day-to-day passenger flow volume and
abruptly changing non-recurrent event volume. The positive correlation between
passenger flow and social media activity makes social media a significant transit
demand indicator for the public transit system.
In the future, one could further explore the minimum amount of social media use at
an event that still yields respectable accuracy, and how this minimum can be
estimated in order to compute a trust index for the regression result.
Chapter 3
SAME-DAY DELIVERY PLANNING WITH STORE FULFILLMENT
3.1 Introduction
Electronic retailing has grown significantly over the past decade and is predicted
to continue rising. A growing number of traditional brick-and-mortar retailers now
operate online channels, and competition with Internet retailers that sell products
or services solely online is inevitable. E-channels providing same-day delivery (SDD)
intensify the competition with traditional retailers by bringing considerable
convenience and nearly instant accessibility to online shoppers.
One feasible way for traditional retailers to respond to this e-commerce trend is
to offer same-day delivery as well. Our solution is to source fulfillment from more
direct-to-consumer places, such as their retail stores, instead of far-away
fulfillment centers. One outstanding advantage of using stores to fulfill online
orders is the short distance between the retail stores and the consumers. This
benefits not only fulfillment speed and costs but also enables versatile services,
such as store pickup and accessible returns. Nonetheless, a new service comes at a
price, and same-day delivery is no exception, since it introduces additional
operational complexity in the stores.
These difficulties motivate us to study and model store fulfillment of local
online orders within the same day. Essentially, this is the same-day delivery planning
with store fulfillment problem (SDD-SFP). SDD-SFP is characterized by three main
factors: store selection, fleet sizing for delivery transportation, and
multi-commodity distribution.
The store selection can be treated as a facility location problem. A good store
selection reduces order delivery expenses and the total service setup cost of the
stores. The model involves a large number of delivery destinations and potential
facility locations to select from.
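As a reference point, the classical uncapacitated facility location model underlying the store-selection decision can be written as follows (generic notation, not the dissertation's formulation):

```latex
\min \sum_{i \in S} f_i x_i + \sum_{i \in S} \sum_{j \in D} c_{ij} y_{ij}
\quad \text{s.t.} \quad
\sum_{i \in S} y_{ij} = 1 \;\; \forall j \in D, \qquad
y_{ij} \le x_i \;\; \forall i \in S,\, j \in D, \qquad
x_i, y_{ij} \in \{0, 1\},
```

where $x_i = 1$ if candidate store $i$ is opened for SDD fulfillment, $y_{ij} = 1$ if demand point $j$ is served from store $i$, $f_i$ is the setup cost of store $i$, and $c_{ij}$ the delivery cost from $i$ to $j$.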
For order delivery transportation, we incorporate the retailer's own fleet of trucks
and third-party carriers as two delivery channels. The fleet of vehicles serves
delivery requests from different areas. In the problem addressed in this paper,
customer demand can vary dramatically over the time horizon, reflecting practice
during the holiday shopping season. Since the fleet is fixed for the whole period, it
is important to coordinate the fleet size with the demand variations: both the size
of the fleet and the capacity required from the carriers need to be determined.
For the multi-commodity distribution, the inventory of products for online shoppers
needs to be assigned to the selected stores. Since demand varies by area and delivery
option, the distribution requires careful consideration of last-mile delivery costs.
In addition, the model considers the huge scale of the product assortment offered to
online shoppers.
These factors certainly increase the complexity of the problem. Moreover, the
capacities of the delivery channels and the stores' SDD order processing capability
add further difficulty to developing an efficient plan for this practical problem.
Two natural variants of same-day delivery are considered in practice and in the
literature: (1) all orders must be delivered by the end of the day, and (2) all
orders must be delivered within certain periods specified by the consumers. From
the supply chain planning perspective, we adopt the first variant as the definition
of same-day delivery in this paper. It simplifies the modeling and reduces the
computational complexity of delivery transportation, while retaining a feasible
logistics plan in business practice.
The objective of SDD-SFP is to identify a seasonal order fulfillment plan for
delivering local online orders from nearby retail outlets. In particular, this paper
seeks to develop optimization models and algorithms that solve facility location,
transport channel selection, and inventory management jointly to construct a supply
chain logistics plan. The paper consists of the following steps. (1) We formulate a
mixed-integer program for SDD-SFP that captures the trend of same-day delivery in the
omni-channel supply chains of brick-and-mortar retailers. (2) To solve the SDD-SFP,
we propose a Benders decomposition based approach that divides the model into one
master problem, in charge of facility locations and fleet sizing, and subproblems, in
charge of SDD order assignment and delivery channel choices. (3) Furthermore, we
introduce several algorithmic enhancements to the solution method, including a
store-selection algorithm, cut strengthening methods, and parallel search trees.
(4) Finally, we conduct an extensive set of experiments on a real-world national
retailer that demonstrates the value of the proposed solution approach.
This research aims to develop optimization models and solution algorithms for store
location selection, fleet sizing for transportation, and inventory planning, in order
to construct a robust supply chain logistics plan. Our study makes the following key
contributions:
1. This paper introduces a new same-day delivery planning with store fulfillment
problem that captures the current trend of same-day delivery in the omni-channel
supply chains of brick-and-mortar retailers. A mixed-integer programming model
is developed for the SDD-SFP.
2. The solution algorithm uses a Benders decomposition based approach that
divides the model into one master problem, in charge of facility locations and
fleet sizing, and subproblems, in charge of SDD order assignment and delivery
channel choices.
3. This study creates a store-selection algorithm, based on mixed-integer
programming and a combination of factors, that substantially reduces the list of
potential stores and acts as an effective extension to increase computational
efficiency.
4. In order to make the cuts more efficient and accelerate the Benders solving
process, several cut strengthening methods are implemented in the Benders solution
structure, including Pareto-optimal cuts, MIP lifting cuts, and tabu cuts.
5. A customized local branching method is proposed as an additional parallel
search tree that assists the Benders approach in generating initial MIP lifting
cuts and exploring the neighborhood of incoming incumbent solutions from the
master problem.
3.2 Literature Review
SDD-SFP builds upon the uncapacitated facility location problem with consideration
of multiple commodities and multiple plants. The facility location problem (FLP) is
a very popular academic topic, and a large body of research addresses it. Klose and
Drexl [62], ReVelle et al. [63], and Melo et al. [64] present up-to-date surveys of
the general characteristics and recent trends of FLP.
The multi-commodity facility location problem is an important variant of FLP. Warszawski [65] proposed a branch and bound algorithm and a heuristic solution procedure for solving the multi-commodity facility location problem. Geoffrion and Graves [66]
provided a solution procedure based on Benders decomposition and applied it to a real situation of a major food company for the design of multi-commodity distribution facilities. Neebe and Khumawala [67] used the delta and omega simplification rules
adjusted for the multi-commodity case. Barnhart and Sheffi [68] presented a primal-dual heuristic solution approach for large-scale multi-commodity network flow problems. Crainic and Delorme [69] described a dual-ascent-based approach for solving simple multi-commodity location problems with balancing requirements. Aggarwal et al. [70] proposed a general heuristic procedure for multi-commodity integer flows, which can be utilized for solving multi-commodity facility location problems. Lee [71] developed a general model for a capacitated facility location problem incorporating multi-product, multi-type facilities and proposed an optimal solution
algorithm based on the Benders decomposition technique. Lee [72] extended the standard capacitated facility location problem to a generalized multi-product, multi-type capacitated facility location problem with a choice of facility type and presented an effective algorithm based on cross decomposition, which unifies Benders decomposition and Lagrangean relaxation into a single framework. Pirkul [73] developed an efficient heuristic procedure for solving the multi-commodity, multi-plant capacitated facility location problem.
For the same-day delivery part, researchers mainly focus on how to operate the delivery vehicles, modeled as a dynamic pickup and delivery problem. Azi et al. [74] addressed a vehicle routing problem with dynamically arriving customer requests with time windows.
They proposed an adaptive large neighborhood search heuristic to maximize total
expected profits of vehicle operations. Klapp et al. [75] introduced a novel way to look at vehicle operations for same-day delivery, treating the vehicle dispatch problem in a setting with a single vehicle and all unserved requests located on a line. They applied a dynamic programming approach to minimize the expected vehicle operation costs and the penalties for unserved requests. Voccia et al. [76] defined the multi-vehicle dynamic pickup and delivery problem with time constraints as the same-day delivery problem. They utilized a Markov decision process to model the problem and a sample-scenario planning approach to incorporate potential future requests into routing decisions.
Benders decomposition [77] is an exact solution method. In Benders decomposition, the original MIP problem is partitioned into two parts, a master problem and a subproblem, which are typically easier to solve than the original problem. The subproblem is the part of the original MIP with some variables fixed at the values provided by the solution of the master problem. The master problem consists of the remaining variables, with cuts iteratively added as constraints generated from the solutions of the subproblem. The branch-and-cut algorithm and linear programming duality form the theoretical basis of Benders decomposition. In the solving process, the variables of the master problem are first determined, and the subproblem is then solved. If the subproblem is feasible and bounded, an optimality cut is added to the master problem; otherwise, a feasibility cut is added. An upper bound can be computed from feasible subproblems, and a lower bound is obtained when the master problem is solved to optimality. The process terminates when the optimality gap reaches a predefined threshold.
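The iterative bounding loop described above can be sketched in a self-contained toy. Here the subproblem value function, the per-unit fixed cost of 2, and the grid-search master solver are all illustrative stand-ins and are not taken from the SDDSFP model:

```python
# Toy Benders loop over a single master variable f (e.g., a fleet size).
# Master: min 2*f + theta subject to optimality cuts theta >= a + b*f.
# Subproblem: Theta(f) = max(20 - 8f, 10 - 2f, 0); its active piece
# supplies the optimality cut. All numbers are illustrative.

def solve_subproblem(f):
    """Return Theta(f) and the supporting cut (a, b) with Theta >= a + b*f."""
    pieces = [(20.0, -8.0), (10.0, -2.0), (0.0, 0.0)]
    a, b = max(pieces, key=lambda p: p[0] + p[1] * f)
    return a + b * f, (a, b)

def solve_master(cuts, f_grid):
    """Minimize 2*f + theta over the cuts; grid search replaces the MIP solver."""
    best = None
    for f in f_grid:
        theta = max((a + b * f for a, b in cuts), default=0.0)
        cost = 2.0 * f + theta
        if best is None or cost < best[0]:
            best = (cost, f)
    return best  # (lower bound, incumbent f)

def benders(tol=1e-6, max_iter=50):
    f_grid = [i / 10 for i in range(101)]  # f in [0, 10]
    cuts, ub, f = [], float("inf"), 0.0
    lb = -float("inf")
    for _ in range(max_iter):
        theta_true, cut = solve_subproblem(f)
        ub = min(ub, 2.0 * f + theta_true)  # feasible solution -> upper bound
        cuts.append(cut)                    # add the optimality cut
        lb, f = solve_master(cuts, f_grid)  # relaxation -> lower bound
        if ub - lb <= tol:                  # optimality gap check
            break
    return lb, ub

lb, ub = benders()
```

The loop alternates exactly as in the text: fix the master variables, solve the subproblem, harvest a cut, re-solve the master, and stop once the upper and lower bounds meet.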
Benders decomposition-based approaches have been widely used in solving supply chain network design problems. Inspired by Geoffrion and Graves' work [66] applying it to multi-commodity distribution system design, we use Benders decomposition as the base of our solution methodology.
This study is the first of its type to tackle the store fulfillment problem in a collaborative local store network. The intended logistics plan balances the acceleration of order delivery speed against the supply chain costs in terms of facility location and transportation. Most existing studies on store fulfillment focus on alleviating the workload of the fulfillment centers and facilitating inventory rebalancing.
3.3 Problem Description and Formulation
This section first introduces the notation used throughout this chapter. It then formulates the same-day delivery with store fulfillment supply chain planning (SDDSFP) problem as a mixed integer programming model and reformulates it with Benders decomposition as the solution algorithm. Finally, several algorithmic enhancements are designed to accelerate the solving process.
3.3.1 Notation
There are four kinds of indices, representing the time horizon, the store locations, the on-sale products, and the delivery zones. Let T, indexed by t, denote the time horizon of the supply chain plan. Let K, indexed by k, denote the candidate store locations. The on-sale products offered through the fulfillment network are defined as stock keeping units (SKUs), J = {1, . . . , j}. The whole fulfillment region is divided by zip codes, and we call the divided areas delivery zones, represented by Z = {1, . . . , z}.
Each SKU j has an attached retail price Pj. Each store k has an operational setup cost Πk and a daily same-day demand processing capacity CPk. For delivery zone z, Az stands for its area, and Lkz is the center-to-center distance between zone z and store k. Customer demand is defined by Djzt, which denotes the quantity of SKU j requested at delivery zone z on day t. In order to estimate the number of packages to be fulfilled from the SKU demands, we introduce a minimum shopping amount ToC, which can be treated as the least total value of all eligible SKUs in a customer order needed to qualify for same-day shipping.
The transportation cost comes from two sources: the operating cost of the retailer's own delivery fleet and the package shipping cost charged by 3rd party carriers. The own fleet consists of delivery vans or trucks operated by the retailer that work exclusively on same-day orders. The parameters for a vehicle in the fleet include the daily operating cost O, the daily working time length WT, the average moving speed MS, the average stop time ST for carrying a package from the vehicle to its destination, and the package holding capacity HC. The 3rd party carriers, on the other hand, provide urban delivery networks and charge per package shipped. We denote the average shipping cost of one package from store k to delivery zone z as Ckz, and the daily maximum processing capacity of the 3rd party carriers as CS. When an order cannot be fulfilled on the same day by either the own fleet or the carriers, a penalty cost G is incurred for each such order.
The following decision variables are used to formulate the SDDSFP. The variable ak equals 1 when store k is selected for SDD service over the time horizon T, and 0 otherwise. The variable bkz equals 1 if and only if delivery zone z can be sourced from store k throughout T. The variable f represents the number of vehicles used for order fulfillment over the whole time horizon T. From the fulfillment perspective, there are three kinds of SDD orders, represented by xtkz, ytkz and wtkz. The first kind comprises the SDD orders delivered by the own fleet, denoted xtkz, on day t from store k to zone z. Similarly, the SDD orders delivered by carriers are denoted ytkz. The last kind comprises the orders that cannot be fulfilled on the required day t, represented by wtkz. The fulfillment decision is defined by hjtkz, which denotes the quantity of SKU j fulfilled on day t from store k to zone z. Finally, we denote by lt the optimal traveling distance to fulfill SDD orders by the own fleet on day t.

For the sake of simplicity, xtkz, ytkz, wtkz and f are defined as continuous variables even though they are integer-valued in nature. Because the problem is large in scale, with high customer demand and a large product assortment, the effect of these variable types on the planning result is very small according to a pilot numerical experiment.
3.3.2 SDDSFP Formulation
The SDDSFP can be formulated as follows:
\min \Omega = \sum_{k} \Pi_k a_k + T \cdot O \cdot f + \sum_{t,k,z} C_{kz} y_{tkz} + \sum_{t,k,z} G w_{tkz}    (3.1)

s.t.

h_{jtkz} = D_{jzt} b_{kz}    \forall j \in J, t \in T, z \in Z, k \in K    (3.2)

\sum_{k} b_{kz} = 1    \forall z \in Z    (3.3)

\sum_{j,z} h_{jtkz} \le CP_k a_k    \forall t \in T, k \in K    (3.4)

x_{tkz} + y_{tkz} + w_{tkz} = \frac{\sum_{j} h_{jtkz} P_j}{ToC}    \forall t \in T, k \in K, z \in Z    (3.5)

l_t = \sum_{k,z} \Big( \frac{2 L_{kz} x_{tkz}}{HC} + 0.57 \sqrt{x_{tkz} A_z} \Big)    \forall t \in T    (3.6)

f \ge \frac{l_t / MS + ST \sum_{k,z} x_{tkz}}{WT}    \forall t \in T    (3.7)
The objective function (3.1) minimizes the cost of facility locations plus the expected cost of the logistics decisions, which comprise the transportation cost of the own fleet, the transportation cost of 3rd party carriers, and the penalty cost of unfulfilled same-day orders. Constraint (3.2) indicates that the fulfillment decisions are based on the customers' demand and the availability of facilities. Constraint (3.3) enforces that each delivery zone is sourced from exactly one store. Constraint (3.4) makes sure that the fulfillment decisions stay within the limits of store order processing capacity. Constraint (3.5) estimates the quantity of packages from the demands and divides them into three types: SDD orders by own fleet, SDD orders by carriers, and unfulfilled SDD orders. Constraint (3.6) estimates the optimal traveling distance for the vehicles on each day. Based on the predefined operating parameters of a vehicle, the number of vehicles is then approximated by constraint (3.7).
Order Estimation
The way we estimate the number of orders/packages from constraint (3.5) is based on the fulfillment decision hjtkz, the SKU prices Pj, and the minimum shopping amount ToC. The numerator of the right-hand side of (3.5), \sum_j h_{jtkz} P_j, is the total payment amount of all the orders fulfilled on day t from store k to zone z, while the denominator is the minimum shopping amount ToC.
The logic behind this method is to mimic the minimum shopping cart requirement of popular online retailers, such as Amazon.com and Jet.com. On these websites, online shoppers need to order a certain amount of merchandise to qualify for free shipping. In our case, the required amount of merchandise for using the same-day delivery service is the minimum shopping amount ToC. The order estimation assumes that the payment amount of each customer order just reaches the minimum shopping amount. We remark that this method could overestimate the number of orders/packages, since the payment amount of a customer order may be larger than ToC. In the model, the minimum shopping amount can be approximated from historical sales data and the online shopping preferences of local consumers. Moreover, different minimum shopping amounts can be approximated according to location and time, which may increase the accuracy of the order estimation.
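As a concrete illustration of this estimate, the quantities, prices, and ToC value below are invented; only the formula itself, total payment divided by the minimum shopping amount, comes from constraint (3.5):

```python
# Package estimation from constraint (3.5): packages ~ (sum_j h_j * P_j) / ToC.
def estimate_packages(quantities, prices, toc):
    """quantities: fulfilled units h_j per SKU; prices: retail prices P_j."""
    payment = sum(q * p for q, p in zip(quantities, prices))
    return payment / toc

# Three SKUs fulfilled from one store to one zone on one day (made-up data):
n = estimate_packages(quantities=[4, 2, 1], prices=[10.0, 25.0, 35.0], toc=35.0)
# 125 dollars of demand against a 35-dollar minimum -> about 3.6 packages
```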
Vehicle Routing Estimation
The expected routing costs for the own-fleet option come from constraints (3.6) and (3.7). The vehicle routing problem, which determines the exact routes, is itself NP-hard. Since our problem focuses on the supply chain planning phase, we only need the length of the routes and the corresponding costs, rather than the specific origins and destinations, to estimate the fleet size. Continuous approximation modeling is more favorable and suitable in this situation. Continuous approximation models use the area value and the number of customers to estimate the average traveling distance of vehicles with high accuracy. Several studies have applied this technique in supply chain and logistics settings; see [78, 79, 80]. Daganzo [81] proposed a simple and intuitive formula (3.8) for the capacitated vehicle routing problem (CVRP) when the depot is not necessarily located in the area.
CVRP(V_n) = 2 r m + \phi \sqrt{n A}    (3.8)

CVRP(V_n) stands for the total distance of a CVRP instance with m routes, where the average distance between the customers and the depot is r, and n customers are evenly distributed in an area A. The value of \phi is a constant, with \phi = 0.57 for rectilinear distance. In our case, the daily number of customers n approximately equals the number of orders delivered by own vehicles, \sum_{k,z} x_{tkz}. The number of routes m is the number of trips the capacitated vehicles need in order to fulfill all orders, which is \sum_{k,z} x_{tkz} / HC. The center-to-center distance L_{kz} replaces r, and the zone area A_z takes the place of A. Substituting these into (3.8) yields equation (3.6), which estimates the optimal traveling distance to fulfill SDD orders by the own fleet on day t.
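A small sketch of Daganzo's approximation (3.8) with the substitutions used in (3.6); the depot distance, capacity, package count, and zone area below are invented for illustration:

```python
import math

PHI = 0.57  # Daganzo's constant for rectilinear distance

def cvrp_length(r, m, n, area, phi=PHI):
    """Estimated CVRP distance: line-haul term 2*r*m plus local term phi*sqrt(n*A)."""
    return 2.0 * r * m + phi * math.sqrt(n * area)

# 40 packages, vehicle capacity 20 (-> m = 2 routes), depot 8 km from the
# zone center (r = L_kz), zone area 25 km^2 (A_z)
dist = cvrp_length(r=8.0, m=40 / 20, n=40, area=25.0)
```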
One important observation concerns the square-root term \sqrt{x_{tkz}} in (3.6). It may be impossible to solve the MIP to optimality with the square-root term when the problem becomes large. For the sake of simplicity, we calculate a fractional number q from numerical experiments and replace \sqrt{x_{tkz}} by q \cdot x_{tkz}. This is a rough estimation in general, but when q is chosen with caution it can be a good estimate for the planning problem. An alternative option is to apply several integer constraints to approximate the value of the square-root term. In the context of value estimation, however, the accuracy would not improve much, since \sqrt{x_{tkz}} and x_{tkz} must be rounded to the nearest whole numbers within the MIP, and the additional integer constraints dramatically increase the complexity of the model. For the purposes of this study, the fractional number q is the better option, and we adopt it to approximate the square-root term.
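One way to pick the scalar q mentioned above is a least-squares fit of q*x to sqrt(x) over the range of daily package counts expected in the instance; the range [20, 200] below is an assumption for illustration, not taken from the dissertation's data:

```python
import math

# Minimize sum_x (sqrt(x) - q*x)^2  =>  q = sum_x x^(3/2) / sum_x x^2
xs = range(20, 201)  # assumed plausible daily own-fleet package counts
q = sum(math.sqrt(x) * x for x in xs) / sum(x * x for x in xs)

# Sanity check: mid-range, the linear proxy q*x should track sqrt(x)
approx_100 = q * 100  # compare with sqrt(100) = 10
```

The fit weights large x heavily, so in practice q would be re-calibrated whenever the expected demand range shifts.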
To simplify the presentation and the model integration, we define:

\alpha_{kz} = \frac{2 L_{kz} + q \sqrt{A_z} \, HC + HC \cdot MS \cdot ST}{HC \cdot MS \cdot WT}    \forall k \in K, z \in Z    (3.9)

\beta_{tz} = \frac{\sum_{j} D_{jzt} P_j}{ToC}    \forall t \in T, z \in Z    (3.10)

From (3.1) to (3.7), it can be seen that the transient decision variables h_{jtkz} and l_t can be bypassed. By incorporating (3.9) and (3.10), the SDDSFP problem can be formulated as:
\min \Omega = \sum_{k} \Pi_k a_k + T \cdot O \cdot f + \sum_{t,k,z} C_{kz} y_{tkz} + \sum_{t,k,z} G w_{tkz}    (3.11)

s.t.

\sum_{k} b_{kz} = 1    \forall z \in Z    (3.12)

\sum_{j,z} D_{jzt} b_{kz} \le CP_k a_k    \forall t \in T, k \in K    (3.13)

x_{tkz} + y_{tkz} + w_{tkz} = \beta_{tz} b_{kz}    \forall t \in T, k \in K, z \in Z    (3.14)

f \ge \sum_{k,z} \alpha_{kz} x_{tkz}    \forall t \in T    (3.15)
3.3.3 SDDSFP Benders Reformulation
We now introduce the Benders decomposition reformulation of the SDDSFP in order to solve our large-scale problem efficiently.

In the reformulation, we place all the binary decision variables ak, bkz in the master problem, while the continuous variables xtkz, ytkz, wtkz and f belong to the subproblem. In addition, since f is the only subproblem variable not indexed by day t, we update the partition and move f to the master problem. Consequently, the subproblem handles all the time-related variables, while the master problem retains only the variables that are not indexed by day t. This enables us to further divide the subproblem into T independent subproblems. Accordingly, the dimension T is dropped from the decision variables xtkz, ytkz and wtkz in the subproblems. Then, subproblem(i) can be stated as:
\min \Theta_t^{(i)} = \sum_{k,z} \big( C_{kz} y_{kz} + G w_{kz} \big)    (3.16)

s.t.

x_{kz} + y_{kz} + w_{kz} = \beta_{tz} b_{kz}    \forall k \in K, z \in Z    (3.17)

f \ge \sum_{k,z} \alpha_{kz} x_{kz}    (3.18)

where b_{kz} and f are treated as known constants in the subproblems; \Theta_t^{(i)} represents the two kinds of cost on day t, namely the 3rd party shipping cost and the penalty cost for unfulfillable SDD orders; all parameters and decision variables are non-negative.
The master problem is defined as:
\min \Psi = \sum_{k} \Pi_k a_k + T \cdot O \cdot f + \sum_{t} \Theta_t    (3.19)

s.t.

\sum_{k} b_{kz} = 1    \forall z \in Z    (3.20)

\sum_{j,z} D_{jzt} b_{kz} \le CP_k a_k    \forall t \in T, k \in K    (3.21)

\Theta_t \ge \sum_{k,z} \beta_{tz} \lambda_{kz,t}^{(i)} b_{kz} + \mu_t^{(i)} f    \forall t \in T, i \in H    (3.22)
The constraints (3.22) are the Benders cuts generated from the solutions of the subproblems. H is the index set of the cuts generated when the subproblems have optimal solutions. \lambda_{kz,t}^{(i)} is the dual value of constraint (3.17) and \mu_t^{(i)} is the dual value of constraint (3.18) in subproblem(i).
An important observation is that subproblem(i) can always be solved to optimality. Consequently, for this SDDSFP model, no cuts are generated from infeasibility for the master problem, as in typical Benders decomposition.

In sum, the SDDSFP Benders reformulation divides the mixed integer programming problem into one master problem and |H| subproblems, where |H| equals n times the size of T after n iterations. At each iteration n, the current master problem is solved to obtain the values of bkz and f. Given these values, the T subproblems are then solved sequentially to generate T individual optimality cuts, one for each day's customer demand. These T cuts update constraint (3.22) of the master problem. The solving process iterates between the master problem and the subproblems until it reaches a termination criterion, such as a small optimality gap.
The Benders solving procedure is based on the branch-and-cut algorithm. The master problem intentionally moves some constraints of the MIP problem into the subproblems, which makes the master problem much easier to solve than the original MIP. The master problem is thus essentially a relaxation of the original MIP. The subproblem sends parts of these constraints back to the master problem as cuts, tightening the relaxation iteratively based on the incumbent master solution. Finally, with an appropriate termination criterion, the procedure obtains the same optimal solution as the original MIP. This process is known as delayed constraint generation.
In the SDDSFP Benders reformulation, it can be observed that the master problem is a capacitated facility location problem with consideration of fleet cost, while each subproblem is an order assignment problem with multiple commodities, multiple plants and two fulfillment channels. In this case, the interaction between the master problem and the subproblems has a clear physical meaning. The master problem determines the service-available store locations and the fleet size. Through the values of bkz and f, it passes the location and fleet-size decisions to the subproblems. With the available store locations and fleet size, the subproblems decide the order fulfillment plans for each day, including the delivery method (own fleet or 3rd party carrier) and the sourcing decision, i.e., from which store to each destination. Back at the master problem, constraints (3.22) are updated from the subproblems, bringing new requirements for the store locations and fleet size. In sum, the master problem iteratively updates the store selection and fleet size, while the subproblems take charge of the delivery method and sourcing decisions for the customer orders.
3.3.4 Algorithmic Enhancement
In this section, we provide three algorithmic enhancements to accelerate the solution algorithm.
Enhancement of Store Selection
The master problem is a capacitated facility location problem with fleet cost. It uses the binary decision variable ak to represent whether store k is selected by the SDDSFP model or not.
In reality, the list of potential stores could be long due to the existing facility locations. This severely affects the efficiency of solving the model, since the store locations have a great impact on the delivery zone assignment through (3.21). Therefore, a mathematical enhancement of the store selection can be an effective extension that speeds up the solving of the master problem.
We create Algorithm 2, called Enhancement of Store Selection, to substantially reduce the length of the potential store list.
Algorithm 2: Enhancement of Store Selection

Input : the store list K, store setup cost \Pi_k, store processing capacity CP_k, and the daily demand D_{jzt}
Output: the strengthened store selection K_{st}

1 Find the date t_{max} with the maximum total demand;
2 Select stores by solving a simple MIP:
    \min \sum_k \Pi_k a_k    s.t.    \sum_{j,z} D_{jz,t_{max}} \le \sum_k CP_k a_k
  Its solution gives a store list K_{in} with a_k = 1 for all k \in K_{in};
3 Calculate the gap between the total capacity of K_{in} and the maximum total demand on date t_{max}: \Delta CP \equiv \sum_{k \in K_{in}} CP_k - \sum_{j,z} D_{jz,t_{max}};
4 Find the lowest processing capacity CP^0 in K_{in};
5 Build a set of processing capacities S_{CP}: for each k \in K,
  if CP^0 - CP_k \le \Delta CP then
6   add CP_k to S_{CP}
7 end
8 List the stores of K whose capacity is in S_{CP}, denoted K|S_{CP}; list the stores of K_{in} whose capacity is in S_{CP}, denoted K_{in}|S_{CP};
9 Calculate the total distance d_k between all delivery zones and each store;
10 For all k \in K|S_{CP}, calculate a ratio combining the two factors, distance and setup cost:

    ratio(k) = \frac{d_k - \frac{1}{n}\sum_{k \in K|S_{CP}} d_k}{\frac{1}{n}\sum_{k \in K|S_{CP}} d_k} + \frac{\Pi_k - \frac{1}{n}\sum_{k \in K|S_{CP}} \Pi_k}{\frac{1}{n}\sum_{k \in K|S_{CP}} \Pi_k}

11 Apply the above equation to K_{in} and let M be the maximum resulting ratio value;
12 Return the strengthened store selection K_{st} = K_{in} \cup \{k \in K|S_{CP} : ratio(k) \le M\}
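Step 10's score can be sketched directly; the three-store distances and setup costs below are made up, and the score is simply the sum of normalized deviations from the two means, where smaller is better:

```python
def ratio(d, setup, mean_d, mean_setup):
    """Step-10 score: normalized distance deviation plus normalized cost deviation."""
    return (d - mean_d) / mean_d + (setup - mean_setup) / mean_setup

dists = [120.0, 80.0, 100.0]        # total zone-to-store distances d_k
setups = [5000.0, 7000.0, 6000.0]   # setup costs Pi_k
mean_d = sum(dists) / len(dists)
mean_s = sum(setups) / len(setups)
scores = [ratio(d, s, mean_d, mean_s) for d, s in zip(dists, setups)]
# far-but-cheap and near-but-expensive stores trade off against the
# average store, and the normalized deviations sum to zero over the list
```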
Without Algorithm 2, it could be onerous for the Benders master problem to converge to the optimal store selection. From (3.19) to (3.22), the master problem at the root node, without presolving, obtains the same solution as the store list Kin, i.e., the least store setup cost subject only to the store processing capacity. The location decisions change gradually as cuts are added from the subproblems via (3.22). The Benders cuts generated from the subproblem solutions bring transportation cost into the location decision of the master problem. The subproblem solutions largely depend on two factors from the master problem: the store location decision ak and the delivery zone assignment bkz. With a long list of potential stores, making the location decision can be very inefficient when solving the model.
Algorithm 2 simulates the location decision process of the Benders model but simplifies the vehicle routing and the 3rd party carrier delivery into distance-based criteria. First, it takes Kin as the starting point by solving part of the MIP derived from the master problem. It then computes the total distance between all delivery zones and each store to mimic the traveling distance for order delivery. Next, it ranks all other stores by their capacity in order to establish the order for swapping stores from K \ Kin into Kin. In addition, it creates a ratio combining the distance and setup cost as the measure of importance for k in K \ Kin and k in Kin respectively; a smaller ratio is better. Finally, all stores in K \ Kin with ratio values smaller than the largest ratio value in Kin are potential stores to be swapped in. The combination of Kin and these small-ratio stores constitutes the strengthened store list.
The assumption that makes Algorithm 2 valid is that the store setup cost contributes more than the transportation cost to the location decision. Since our problem concerns seasonal planning, this assumption holds in the model. If the time horizon changes and the assumption no longer holds, Algorithm 2 can adapt to the new scenario by using a different starting point Kin. When the logistics cost dominates the location decision, Kin is generated from a simple MIP model of the logistics cost. When the logistics cost and the setup cost contribute equally, Kin is created from a combination of the simple MIP models of the logistics cost and of the setup cost. Adopting different Kin in this way extends the applicability of Algorithm 2.
The results of Algorithm 2 are treated as initial cuts added to the master problem. The cut for the strengthened store list Kst is added as \sum_{k \in K_{st}} a_k = n(K_{st}), and cuts for the non-selected stores are added as a_k = 0, \forall k \in K \setminus K_{st}.
Cut Strengthening
The convergence of the Benders solving algorithm highly depends on the quality of the cuts generated from the subproblems. We implement three methods to strengthen the Benders cuts: subproblem disaggregation, Pareto-optimal cuts and tabu cuts.

In the ordinary Benders decomposition algorithm, the master problem is solved and passes its variable values to the subproblem, which is solved with these variables fixed, generating one cut for the master problem in each iteration. Based on the structure of our model, we retain all time-related variables in the subproblem and further divide it into T independent subproblems; see equations (3.16) to (3.18) for details. In this way, T cuts instead of one can be generated per iteration, reflecting the different quantitative and spatial distributions of customer orders over the T days. The disaggregation of subproblems provides a better estimate of the transportation cost for each daily demand than a single estimate for the whole time horizon. Therefore, the corresponding cuts are generally more effective.
Although the T disaggregated cuts better approximate the transportation assignment, they might, on the other hand, increase the solving time of the master problem per iteration, since T additional constraints are added per iteration instead of one. There is a trade-off between tight cuts and the increasing difficulty of the master problem. For this model, where T is not large, the constraint increase is limited for the master problem. Subproblem disaggregation turns out to be very effective in the numerical experiments.
The T disaggregated subproblems estimate the optimal transportation cost for each daily demand with a fixed delivery zone assignment, but the daily delivery plan can vary. Therefore, there may be multiple dual solution values \lambda_{kz}^{(i)} and \mu^{(i)} for the ith subproblem, and these multiple solutions yield alternative cuts from the ith subproblem. In order to select the strongest cut, we adopt a method derived by Magnanti and Wong.
If (\lambda_{kz}^{(i),(1)}, \mu^{(i),(1)}) and (\lambda_{kz}^{(i),(2)}, \mu^{(i),(2)}) are two dual solutions of subproblem(i) (3.16) to (3.18), then

\sum_{k,z} \beta_{tz} b_{kz} \lambda_{kz}^{(i),(1)} + f \mu^{(i),(1)} = \sum_{k,z} \beta_{tz} b_{kz} \lambda_{kz}^{(i),(2)} + f \mu^{(i),(2)}

When b_{kz}^{\star} and f^{\star} are the final optimal solutions, if

\sum_{k,z} \beta_{tz} b_{kz}^{\star} \lambda_{kz}^{(i),(1)} + f^{\star} \mu^{(i),(1)} \ge \sum_{k,z} \beta_{tz} b_{kz}^{\star} \lambda_{kz}^{(i),(2)} + f^{\star} \mu^{(i),(2)}

then the cut from (\lambda_{kz}^{(i),(1)}, \mu^{(i),(1)}) is said to dominate the cut from (\lambda_{kz}^{(i),(2)}, \mu^{(i),(2)}). Magnanti and Wong called the non-dominated cut the Pareto-optimal cut, corresponding to the point (\lambda_{kz}^{(i),(1)}, \mu^{(i),(1)}).
The Pareto-optimal cut is the strongest cut from subproblem(i). However, the final optimal solution b_{kz}^{\star} and f^{\star} cannot be known in advance while the solving is in process. Magnanti and Wong introduced an alternative, the core point (b'_{kz}, f'), which is a relative interior point of the convex hull of the original SDD-SFP MIP problem.

Based on that, our Pareto-optimal cut is generated as follows. First, subproblem(i) is solved with the variables fixed by the master problem. When subproblem(i) is solved to optimality with value V(\Theta_t^i), a new linear programming problem is set up to obtain the optimal dual values (\lambda_{kz}^{(i),(1)}, \mu^{(i),(1)}). Finally, based on (\lambda_{kz}^{(i),(1)}, \mu^{(i),(1)}), the Pareto-optimal cut is generated and added to the master problem. The LP-based dual subproblem(i) is
\max \Theta_t^i = \sum_{k,z} \beta_{tz} b'_{kz} \lambda_{kz}^{(i)} + f' \mu^{(i)}    (3.23)

\lambda_{kz}^{(i)} \ge C_{kz}    \forall k \in K, z \in Z    (3.24)

\lambda_{kz}^{(i)} \ge \alpha_{kz} \mu^{(i)}    \forall k \in K, z \in Z    (3.25)

V(\Theta_t^i) = \sum_{k,z} \beta_{tz} b_{kz} \lambda_{kz}^{(i)} + f \mu^{(i)}    (3.26)
Here, (3.23), (3.24), and (3.25) are the dual reformulations of (3.16), (3.17), and (3.18), respectively. Equation (3.26) is the constraint that forces the problem to attain the same optimal objective value V(\Theta_t^i) as subproblem(i).
We obtain the core point (b'_{kz}, f') from the linearly relaxed initial master problem; it is an inner point of the feasible region of the master problem. Since the optimal location decision swaps out initially selected stores, following the logic of Algorithm 2, (b'_{kz}, f') tends to lie near the optimal solution (b_{kz}^{\star}, f^{\star}). Note that no matter how (b'_{kz}, f') is selected, the cuts from dual subproblem(i) are valid and contribute to the convergence of the Benders algorithm. When (b'_{kz}, f') is a relative interior point of the convex hull of the problem, it yields Pareto-optimal dual values (\lambda_{kz}^{(i),(1)}, \mu^{(i),(1)}), which further strengthen the T independent cuts from the subproblem disaggregation.
Tabu cuts are the third method to improve the quality of the Benders cuts. A tabu cut eliminates the recurrence of the current incumbent solution in the branch-and-bound tree of the Benders master problem. Blocking some feasible solutions with tabu cuts helps nodes to be fathomed faster and further reduces the size of the search tree.
When the master problem obtains a new incumbent solution, it generates a cut on the binary variables b_{kz}. Given the feasible master solution \bar{b}_{kz}, let \bar{K} = \{k \in K \mid \bar{b}_{kz} = 1\} and \bar{Z} = \{z \in Z \mid \bar{b}_{kz} = 1\}; the tabu cut is defined by

\Delta(b_{kz}, \bar{b}_{kz}) = \sum_{k \in \bar{K}, z \in \bar{Z}} (1 - b_{kz}) + \sum_{k \in K \setminus \bar{K}, z \in Z \setminus \bar{Z}} b_{kz} \ge 1

By adding \Delta(b_{kz}, \bar{b}_{kz}) \ge 1 to the master problem, \bar{b}_{kz} is no longer considered at the following nodes.
The reason for selecting b_{kz} rather than other decision variables is that, first, the values of b_{kz} contain the information of both the store selection a_k and the delivery zone assignment through equation (3.21). In addition, \Delta(b_{kz}, \bar{b}_{kz}) \ge 1 is suitable for limiting changes of binary and integer values rather than continuous variables.
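The tabu cut's left-hand side is just a Hamming distance between a 0/1 assignment and the incumbent, which the following sketch makes explicit; the four-entry vectors stand in for the flattened b_kz matrix and are illustrative only:

```python
def tabu_lhs(b, b_bar):
    """Left-hand side of the tabu cut: Hamming distance between b and b_bar."""
    return sum((1 - bi) if ref == 1 else bi for bi, ref in zip(b, b_bar))

incumbent = [1, 0, 1, 0]                      # flattened incumbent b_kz
lhs_same = tabu_lhs(incumbent, incumbent)     # 0: the incumbent violates >= 1
lhs_flip = tabu_lhs([1, 1, 1, 0], incumbent)  # 1: any other point satisfies it
```

Requiring the left-hand side to be at least 1 therefore excludes exactly the incumbent and nothing else.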
Parallel Search Trees
Our model solving methodology includes not only the first search tree, from Benders decomposition, but also a second search tree of the original MIP model with conditional constraints. The lifting cuts and incumbent updates from the second search tree can result in significant reductions in solution runtime.
The parallel search trees between the Benders master problem and the original MIP problem in our framework are shown in Figure 3.1, where the circled numbers stand for the steps.
We start solving the 1st search tree with the enhancement of store selection and the strengthened Benders cuts. If the gap between the lower and upper bounds keeps shrinking over a certain number of iterations, the 1st search tree keeps running without any additional steps. When it is solved to the end, it reports the optimal solution for the model at step 1 and the whole solving process finishes. When the gap ceases to move for a certain number of iterations, the 1st search tree is put on hold, step 2 is triggered, and the procedure passes the current upper
[Figure 3.1: Parallel search trees between the master problem of Benders and the original MIP problem. The master problem and the original MIP problem exchange the Benders initial incumbent, MIP lifting cuts, Benders master incumbents, better adjacent solutions as new incumbents, and the optimal solution, over steps 1 to 7.]
bound \lceil \Psi_0 \rceil and lower bound \lfloor \Psi_0 \rfloor of the 1st search tree to the 2nd search tree as additional constraints:

\Omega \le \lceil \Psi_0 \rceil    (3.27)

\Omega \ge \lfloor \Psi_0 \rfloor    (3.28)
We then start solving the 2nd search tree with (3.27) and (3.28). If the gap between the lower and upper bounds keeps shrinking over a certain number of iterations, the 2nd search tree keeps running without additional steps. When it is solved to the end, it reports the optimal solution for the model at step 3 and the whole solving process finishes. When the gap ceases to move for a certain number of iterations, the 2nd search tree is terminated, keeping the incumbent solutions and the upper and lower bounds \lceil \Omega_0 \rceil and \lfloor \Omega_0 \rfloor. At step 4, if \lceil \Omega_0 \rceil < \lceil \Psi_0 \rceil, the cut \Psi \le \lceil \Omega_0 \rceil is added to the master problem of the 1st search tree. If \lfloor \Omega_0 \rfloor > \lfloor \Psi_0 \rfloor, the cut \Psi \ge \lfloor \Omega_0 \rfloor is added to the master problem as well. These cuts may help the master problem tighten its bounds and facilitate fathoming the remaining nodes. We call them MIP lifting cuts.
Back at the 1st search tree, we continue to solve it with the MIP lifting cuts. Steps 5, 6 and 7 borrow the idea of the local branching strategy (Fischetti and Lodi 2003) to improve the upper bound of the 1st search tree of Benders decomposition.

The main idea behind local branching is to divide the feasible region of the problem into smaller subregions so that a generic solver can find high-quality solutions effectively. The recent advancement of generic solvers facilitates the implementation of local branching, since they can efficiently solve small instances of hard problems.
We apply the local branching strategy within our parallel search tree solution system. The 2nd search tree adopts additional constraints that divide the solution space into small subregions, from which it can obtain high-quality solutions for each subregion. The 1st search tree keeps branching and cutting and is updated with these solutions to accelerate the Benders process for solving the large-scale problem. To make this work, the following questions are investigated:
• How do we divide the solution space into small subregions for the 2nd search tree?
• What is the strategy for applying the different kinds of subregion solutions (e.g., optimal, infeasible, feasible within the time limit) to the 1st search tree?
The solution space can be partitioned based on the incumbent solution from the 1st search tree. We adopt a method similar to the generation of tabu cuts. Given the incumbent solution b̄kz, let K̄ = {k ∈ K | b̄kz = 1} and Z̄ = {z ∈ Z | b̄kz = 1}. The feasible region can then be divided into two parts by the two inequalities

Δ(bkz, b̄kz) ≤ 𝒦 (3.29)
Δ(bkz, b̄kz) ≥ 𝒦 + 1 (3.30)

where Δ(bkz, b̄kz) = Σ_{k∈K̄, z∈Z̄}(1 − bkz) + Σ_{k∈K\K̄, z∈Z\Z̄} bkz, and 𝒦 is a positive integer that controls the size of the explored neighborhood.
Without loss of generality, we use only the binary variables bkz to illustrate steps 5, 6, and 7 of the parallel search tree method; the approach applies to the other binary and integer variables as well. It can also work for continuous variables by rounding to the nearest integer. It is worth noting that mixing discrete and continuous variables in (3.29) and (3.30) is invalid, because the unit of value change differs. In practice, it is straightforward to derive and keep several sets of (3.29) and (3.30) for different decision variables.
The physical meaning of Δ(bkz, b̄kz) is to count how many bkz change their value from 0 to 1 or from 1 to 0. Adding (3.29) as a constraint to the 2nd search tree limits the number of changed bkz values to at most 𝒦. By imposing an appropriate value of 𝒦, this creates a small subregion near the incumbent solution b̄kz. Inequality (3.30), on the other hand, stands for the mutually exclusive complementary subregion, and the two parts together constitute the whole solution space. Therefore, the 2nd search tree with (3.29) can be solved efficiently by a generic solver, and the optimal solution of the subregion is used to update the incumbent of the 1st search tree.
Since this is an iterative updating process between the two search trees, subregions that have already been examined need not be considered in future updates. In addition, the choice of 𝒦 is arbitrary; the subregion may become large and very hard to solve to optimality. It is therefore crucial to have a balanced strategy for applying the subregion solutions to both search trees.
We propose two kinds of limits in the solving strategy, a time limit and a stall limit, to reduce the solution time of each subregion examination. While the time limit is self-explanatory, the stall limit is the number of iterations over which the optimality gap ceases to move. Under these limits, the 2nd search tree with (3.29) is solved until it reaches optimality or hits the time or stall limit.
Let us now consider the possible solutions after each subregion examination. There are four cases for updating the incumbent of the 1st search tree. First, the subregion examination by the 2nd search tree may yield an optimal and improved solution b̄′kz with Ω(b̄′kz) ≤ Ω(b̄kz). In this case, the new optimal b̄′kz is used to update the incumbent and upper bound of the 1st search tree, and since the subregion has been fully examined, it is ruled out from further consideration by applying (3.30) to the 2nd search tree. Second, when the solving process hits the time or stall limit, it may yield a feasible and improved solution b′kz. Then, for the 1st search tree, we update the incumbent and upper bound with b′kz; for the 2nd search tree, since not every feasible point inside the subregion has been examined, only b′kz is excluded from future iterations by adding the constraint Δ(bkz, b′kz) ≥ 1 instead of cutting off the whole subregion. The third case is the counterpart of the second under the same limit situation: it yields a feasible but unimproved solution. This usually shows that the subregion is too big for the generic solver to find an improved solution within the limits. Our algorithm then divides the subregion into smaller parts and revisits one part immediately by removing (3.29) and adding Δ(bkz, b̄kz) ≤ 𝒦 − ⌈𝒦/2⌉ to the 2nd search tree, while the 1st search tree remains on hold. Finally, the last case covers two possible outcomes: infeasibility, or an optimal but unimproved solution. Both show that no improved solution can be found in the subregion. The algorithm then rules out the subregion from further consideration by applying (3.30) to the 2nd search tree,
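The four cases can be summarized as a small dispatcher; the sketch below is illustrative Python with our own status strings and action labels, not solver code:

```python
import math

def subregion_update(status, improved, K):
    """Dispatch the four outcomes of one subregion examination
    (steps 5-7) to actions for the two search trees.
    status: 'optimal', 'limit' (time/stall hit) or 'infeasible';
    improved: whether the solution beats the current incumbent."""
    if status == 'optimal' and improved:
        # case 1: update incumbent; exclude whole subregion via (3.30)
        return {'tree1': 'update incumbent', 'tree2': 'add delta >= K+1'}
    if status == 'limit' and improved:
        # case 2: update incumbent; exclude only the point found
        return {'tree1': 'update incumbent', 'tree2': 'add delta >= 1'}
    if status == 'limit' and not improved:
        # case 3: subregion too big; shrink it to K - ceil(K/2) and retry
        return {'tree1': 'hold',
                'tree2': 'shrink to delta <= %d' % (K - math.ceil(K / 2))}
    # case 4: infeasible, or optimal but unimproved; rule the subregion out
    return {'tree1': 'resume with old incumbent', 'tree2': 'add delta >= K+1'}
```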
(Diagram: nodes 1-7 showing the subregion constraints Δ(bkz, b̄kz) ≤ 𝒦 and Δ(bkz, b̄kz) ≥ 𝒦 + 1 in the 2nd search tree, and the incumbent updates passed to the 1st search tree for optimal/feasible, improved/unimproved solutions.)
Figure 3.2: The solution strategy of the parallel search trees
while the 1st search tree starts to branch and add cuts again with the non-updated incumbent b̄kz.
The communication process from node 1 to node 7 between the two search trees is shown in Fig. 3.2. First, based on b̄¹kz, a subregion is formed at node 2. Solving it yields an optimal and improved solution b̄²kz. Following the solving strategy, the 1st search tree is updated with b̄²kz, and the constraint is reversed into the form Δ(bkz, b̄¹kz) ≥ 𝒦 + 1 to remove the subregion from future consideration at node 3. Then the 2nd search tree is put on hold and the 1st search tree continues to be solved by the accelerated Benders method. After a certain number of iterations, the 1st search tree obtains a new incumbent solution b̄¹²kz and puts its solving process on hold. The 2nd search tree is activated and a new subregion is formed at node 4. The MIP at node 4 is solved and the next steps follow from the solution and the solving strategy. This process keeps pushing down the upper bounds of both search trees. The parallel search tree approach takes advantage of the efficiency of generic solvers on small MIPs and, at the same time, fully utilizes the existing Benders solving structure. More details can be found in Fig. 3.2.
(Diagram: the master problem, with the Pareto-optimal algorithm and the Benders cut generator over the subproblems, runs in parallel with the original MIP problem enhanced by store selection; the two exchange Benders cuts, tabu cuts, MIP lifting cuts, local branching constraints, an initial core point, and the optimal solution.)
Figure 3.3: The solution methodology framework
In sum, the whole solution methodology framework can be seen in Figure 3.3.
3.4 Computational results
We implement the solution approaches, including the MIP and the Benders method with enhancements, in the generic solver CPLEX 12.6. The original problem, Algorithm 2, and the 2nd search tree use the MIP solver in CPLEX directly; the Benders method is implemented via lazy constraints in CPLEX. Lazy constraints are a set of inequalities that help define the feasible region but are not part of the problem when the solver starts. These inequalities are generated iteratively by the subproblems as Benders cuts, and each Benders cut is added to the current model as soon as it turns out to be violated by the current incumbent solution. The lazy constraint approach has proved to be an efficient way of implementing Benders decomposition (Rubin 2011).
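In miniature, and outside any solver API, the lazy-constraint mechanism amounts to filtering a pool of candidate cuts down to those violated by the incumbent; a real implementation would register such a check as a CPLEX lazy-constraint callback. A hedged Python sketch:

```python
def violated_benders_cuts(incumbent, cut_pool, tol=1e-6):
    """Return only those candidate cuts  sum_j a_j * x_j <= rhs  that are
    violated by the current incumbent solution; only these need to be
    added to the model, which is the essence of the lazy approach.
    incumbent: dict variable -> value; cut_pool: list of (coeffs, rhs)."""
    violated = []
    for coeffs, rhs in cut_pool:
        lhs = sum(a * incumbent[j] for j, a in coeffs.items())
        if lhs > rhs + tol:
            violated.append((coeffs, rhs))
    return violated
```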
All the solving algorithms are coded in the Java programming language using the mathematical programming interface of CPLEX. They run under CentOS Linux on a workstation with an 8-core Intel Xeon E7-4830 CPU at 2.13 GHz and 32 GB of memory.

Table 3.1: Summary of data sets

Instance | SKUs | Time Range | Delivery Zone | Store Input
P1 | 1000 | 48 | 278 | 44
P2 | 3000 | 49 | 287 | 44
P3 | 5000 | 49 | 292 | 44
The customer demand data is derived from the distribution of real online sales of a major US retailer in a metropolitan area. The sales data records common online orders fulfilled by a few remote fulfillment centers located around the country. We generate the test customer demand from this distribution and use it as the demand to be fulfilled by the local stores within the same day.
Three data sets with sampled customer demand were used to test the effectiveness of our algorithm. The parameter settings are summarized in Table 3.1.
Other parameters and variables of the model instances are adapted from industrial practices and reasonable assumptions. We choose the city of Chicago as the target to provide the spatial information of delivery zones and the distances between assumed stores and zones. For all instances, Ckz = 5 ∀k, z, meaning the negotiated package shipping cost via a third-party carrier is 5 dollars; the parameters for the own truck fleet are O = 250, WT = 8 hours, MS = 20 mph, ST = 5 minutes, and HC = 30; the number of stores and the parameters for the store setup costs and capacities are randomly generated. There are 44 stores whose locations are distributed in the area according to demography, like real retail outlets. There are 4 levels of store order processing capacity: 400, 550, 600, and 800, corresponding to 4, 14, 23, and 3 of the 44 stores, respectively. The setup costs
Table 3.2: Solving time (minutes) comparison of different solution methods

Instance | MIP | Ordinary Benders | Benders with Store Selection Algorithm | Benders with all Enhancements
P1 | 5.95 | 52.33 | 0.39 | 0.30
P2 | 3.69 | 240.02 | 1.35 | 1.25
P3 | 36.91 | 240.05 | 62.17 | 1.81

Table 3.3: The gaps between MIP and different solution methods

Instance | MIP | Ordinary Benders | Benders with Store Selection Algorithm | Benders with all Enhancements
P1 | 0.00% | 0.00% | 0.00% | 0.71%
P2 | 0.00% | -0.04% | -0.26% | 0.04%
P3 | 0.00% | -0.28% | -0.41% | 0.39%
of stores are uniformly distributed between 30,000 and 60,000 dollars.
For the solver running parameters, we adopt a 4-hour time limit for all algorithms. Following the argument in Cordeau et al. (2006), we adopt the convention of a 1% optimality gap for integer programming models. Given the errors contained in the data estimates, running the solver to a 1% gap is adequate for this supply chain planning problem.
Table 3.4: The improvement of store selection by Algorithm 2

Instance | Stores in | Stores in by Algo 2 | Stores out w/o Algo 2 | Stores out w/ Algo 2 | % of match
P1 | 44 | 3 | 3 | 3 | 100%
P2 | 44 | 16 | 8 | 8 | 100%
P3 | 44 | 20 | 16 | 16 | 100%
Table 3.5: Summary of data set of Instance P4

Instance | SKUs | Time Range | Delivery Zone | Store Input
P4 | 8339 | 49 | 300 | 44
In the initial testing, the store selection has a huge impact on the solving time. Table 3.2 highlights the runtime differences between ordinary Benders and Benders with Algorithm 2. The reason for this observation is that the initial Benders cuts, which are generated from the initial master problem, might not be able to redefine the feasible region of the model. Since lazy constraints become effective only when some of them are violated by the incumbent integer solution, the convergence of Benders is relatively slow. This observation motivates both the store selection enhancement, which reduces the potential store list, and the MIP initial cuts from the parallel search trees, which generate warm starts to accelerate the Benders approach.
Table 3.4 shows that the proposed Algorithm 2 significantly improves the store input list, shrinking it from 44 stores to a small fraction for all three instances without losing feasibility. Judging by the stores selected after optimization, Algorithm 2 helps the Benders master problem maintain a tight feasible region and provides a good starting point for the subproblems.
As a point of comparison, we run the three instances of Table 3.1 with our different solution methods. The results in Table 3.2 show that Benders with all enhancements outperforms the others; for instance P3, it is 20 times faster than the MIP solved in CPLEX. Table 3.3 compares the solution quality of the different methods using the MIP results as the benchmark. The gaps of all methods satisfy the predefined 1% tolerance.
We then introduce the more complex instance P4 of Table 3.5. Increasing the number of SKUs
Table 3.6: Metric comparison of different solution methods for Instance P4

Metric (Instance P4) | MIP | Ordinary Benders | Benders with Store Selection Algorithm | Benders with all Enhancements
Optimality gap | 1.81% | 1.61% | 51.15% | 1%
Solving time (hours) | 4.01 | 4.00 | 4.00 | 2.57
to 8339, Table 3.6 shows the solving time and the optimality gap of the different solution methods. Only Benders with all enhancements reaches the 1% optimality gap within the 4-hour limit.
We also conduct a sensitivity analysis for some key parameters and variables of the model instance. The practical question the model tries to answer is how to implement a same-day delivery service with local stores. Accordingly, our focus is on the key performance variables: the store order processing capacity CPk, the daily operation cost of an own-fleet truck O, and the package shipping cost by third-party carrier Ckz. We design several experiments that sweep over different values of these variables. The experiments are based on instance P3 and solved to optimality by Benders with the enhancing methods. The results can be seen in Figures 3.4, 3.5 and 3.6.
The results indicate that the fleet size is negatively correlated with O and positively correlated with Ckz, while CPk seems to have no specific effect on the fleet size. The number of selected stores drops from 16 to half of that, 8, when CPk increases from around 600 to 1200; increasing CPk beyond 1200 has a relatively small effect on store selection.
3.5 Conclusions
This study introduces a new same-day delivery planning with store fulfillment
problem to capture the current trend of same-day delivery in omni-channel supply
(a) Order delivery methods analysis: Fl orders, Ca orders, and Total orders vs. average store capacity. (b) Cost analysis: Fl cost, Ca cost, Setup cost, and Total cost vs. average store capacity. (c) Number of selected stores vs. average store capacity.
Figure 3.4: Sensitivity analysis of the store order processing capacity
The default average value of CPk is 583.7.
(a) Order delivery methods analysis: Fl orders, Ca orders, and Total orders vs. daily operation cost of a truck. (b) Cost analysis: Fl cost, Ca cost, and Total transporting cost vs. daily operation cost of a truck. (c) Fleet size: number of trucks vs. daily operation cost of a truck.
Figure 3.5: Sensitivity analysis of the daily operation cost of an own-fleet truck
The default truck operation cost O is 250.00 dollars per day.
(a) Order delivery methods analysis: Fl orders, Ca orders, and Total orders vs. package shipping cost by carrier. (b) Cost analysis: Fl cost, Ca cost, and Total cost vs. package shipping cost by carrier. (c) Fleet size: number of trucks vs. package shipping cost by carrier.
Figure 3.6: Sensitivity analysis of the package shipping cost by third-party carrier
The default shipping cost by carrier Ckz is 5.00 dollars per package.
chain for brick-and-mortar retailers. It develops optimization models and solution algorithms for store location, transportation channel selection, and inventory management in order to construct a robust logistics plan for the supply chain. The solution methodology framework includes Benders decomposition, a store selection algorithm, cut strengthening methods, and parallel search trees.
Our method achieves the best runtime results for three large-scale instances derived from real online customer orders. Our store selection algorithm constrains the number of potential store locations, which effectively improves solving efficiency. The cut strengthening methods ensure high-quality Benders cuts, and the parallel search tree approach integrates the Benders model and the MIP model into one solving framework in which the two models provide warm start points to each other. By exploiting the good performance of generic solvers on small MIP instances, the parallel search tree updates the incumbent solutions quickly and iteratively. In sum, our study provides an intuitive and efficient solution methodology for the newly defined same-day delivery planning with store fulfillment problem. Numerical experiments on the key performance variables are presented as well.
Chapter 4
CROSS SOURCING DELIVERY WITH STORE FULFILLMENT
When one takes a close look at same-day delivery with store fulfillment (SDDSF), how to make good fulfillment decisions is an unavoidable problem.
In Chapter 3, we investigated SDD with store fulfillment from the planning perspective, deriving an implementable plan that covers store location, delivery fleet size, and store inventory assignment based on forecast sales data. When it comes to the daily operations of SDD with store fulfillment, an exact order fulfillment plan rather than a seasonal plan needs to be created.
In this chapter, we drill down SDD with store fulfillment from the supply chain planning level to the supply chain operation level, aiming to create an optimal exact order fulfillment plan that specifies, for each received customer order, which store it is sourced from, which delivery option is preferred, and when it is to be picked up.
4.1 Introduction
Recently, the competition for online sales between traditional and online-exclusive
retailers has changed its focus from expanding product availability to customer promise
and product accessibility. One notable example is the recent emergence of same-day
delivery (SDD) options for some types of products, which allows customers to have
desired items delivered to their doors only a few hours after the purchase. Online
retailers with highly flexible supply chains like Amazon have been able to significantly
reduce the delivery time of most of their products by taking advantage of crossdocking and product consolidation strategies over massive fulfillment centers [82], while
traditional retailers can take advantage of their physical stores to fulfill SDD orders
in a direct-to-consumer fashion by drawing inventory from their retail stores [83].
There are several key advantages of using physical stores to fulfill online orders, including short last-mile delivery distances, full utilization of the existing distribution chain, and versatile accessible services. However, it also introduces major challenges that must be addressed in order to guarantee a successful distribution operation, including order consolidation, order assignment, and delivery method selection.
In order to improve operational efficiency and reduce the transportation cost,
we adopt the concept of crowdsourced shipping that utilizes the extra capacity of
the vehicles from private drivers to execute delivery jobs on trips they would make
anyway [84]. Crowdsourced shipping provides a peer-to-peer transportation system
[85], in which we further divide the private drivers into two groups. The first group contains drivers who are willing to make deliveries and to share their forthcoming trips with retailers; we call this group Information Sharing Drivers (ISDs). The other group consists of random store walk-in customers who happen to be willing to deliver a package that has already been picked and packed in the store; we name this group Occasional Drivers (ODs).
Both ISDs and ODs are willing to use their own vehicles to deliver others' packages in return for a small compensation. The main difference between them is that ISDs express their crowdshipping willingness and share their scheduled trips with the retailer, while ODs do the shipping only occasionally, perhaps because of the convenience of the pickup location, near their workplaces or homes, or on the way to their trip destinations.
This study intends to solve the store-based SDD operational challenges by investigating the decisions of both order assignment and delivery options. There are three available delivery options: a self-operated or carrier-operated truck fleet, ODs, and ISDs. We name the problem same-day delivery with crowdshipping and store fulfillment (SDD-CSF); it aims to create an optimal fulfillment plan for sourcing local online orders from nearby retail stores.
The order delivery operation that we consider is assumed to be conducted by a traditional brick-and-mortar retailer in the following manner. The system keeps receiving customer orders for delivery, and each time horizon consists of several time periods. At the end of each time period, the system consolidates the orders that arrived in the current period with the unfulfilled orders from previous periods, then assigns them to specific stores with preferred delivery methods. We consider three last-mile transport options: truck, ISD, and OD. At the end of the horizon, unfulfilled orders incur a penalty cost.
To model SDD-CSF, we develop a set of exact solution approaches for order fulfillment in the form of a rolling horizon framework. It repeatedly solves a series of order assignment and delivery planning problems along the timeline in order to construct a daily optimal fulfillment plan from local stores. Our study makes the following key contributions:
1. The study presents a rolling horizon framework with an exact solution approach to provide an optimal order fulfillment plan from nearby retail stores.
2. We adopt crowdsourced shipping for SDD and consider two novel types of private drivers: occasional drivers and information sharing drivers.
3. The optimization model incorporates the predicted future demand into the order assignment decision, minimizing the immediate delivery cost plus the resulting expected future cost.
4. The study develops a feedback control system to cope with inaccurate forecasts of future demand in the rolling horizon framework.
5. Various computational experiments are conducted to quantify the benefits by comparing the SDD-CSF model with some myopic operational practices.
4.2 Related Works
This literature review covers three components: last-mile delivery, crowdsourced shipping, and SDD. The SDD-CSF problem deals with the hourly and daily operation of same-day delivery with store fulfillment. It contains two main steps of supply chain operations: order assignment to a specific store, and selection of the last-mile delivery option.
Typically, retailers assign online sales immediately to the closest distribution location that has available stock and deliver the items as a package to the customer as soon as possible [86]. In this case, shipping individual items between locations lacks economies of scale and results in high transportation costs in last-mile deliveries.
The operational limitations of last-mile delivery give rise to an emerging field of research. Xu et al. [87] improve online last-mile delivery by revising the existing plan: the study consolidates possible orders together for delivery and creates a cost-effective fulfillment plan, an order-warehouse assignment, for online retailing. Instead of solving the big NP-hard problem to optimality, they tackle the problem by reassigning customer orders to improve the myopic initial fulfillment plan automatically generated by e-tailers; two heuristics are constructed for this purpose. Acimovic and Graves [88] base fulfillment decisions not only on the current on-hand customer orders but also on an estimate of future orders. They develop a transportation linear programming model to reassign customer orders so as to minimize the immediate and estimated future outbound last-mile delivery costs together. Mahar and Wright [86] improve online fulfillment assignment decisions by deliberately postponing the immediate assignment. They create a framework of policies that uses the postponement to accumulate online orders, which are then assigned to distribution locations based on inventory, shipping, and customer wait costs. That study argues that the right policy of postponing order allocation benefits both the inventory and transportation costs of order fulfillment.
These studies address last-mile delivery in different ways, including order reassignment, accommodation of estimated future orders, and deliberate postponement to accumulate online orders. What they have in common is the goal of achieving economies of scale in order fulfillment. Our study follows this research idea and aims at the same goal. We propose three novel methods: a rolling horizon solving framework to accumulate online orders, estimation and calibration of future demand, and integration of crowdsourced shipping as one of the delivery options.
Crowdsourced shipping (crowdshipping) is facilitated by advances in communication technologies and the flexible resources of the sharing economy [89, 90]. Private drivers turn into occasional couriers who deliver packages in a peer-to-peer fashion for a small compensation. Consequently, in terms of last-mile delivery, retailers may benefit from offering a crowdshipping option for online orders, since crowdsourced shipping has the potential to exploit drivers' existing trips and thereby significantly reduce transportation costs.
Research on crowdshipping is still in its infancy. We divide the past literature into two parts: operation optimization and empirical studies.
Operation optimization focuses on maximizing the efficiency of the drivers while minimizing the operation cost. Archetti et al. [91] investigate the emerging business model in which in-store customers deliver goods ordered by online customers. Each customer can make at most one delivery, and these deliveries cannot be carried out by professional drivers such as package delivery carriers. They model the problem as a mixed integer program minimizing the total transportation cost, combining the cost of professional drivers and the compensation for crowdshipping customers. The willingness of the crowdshipping customers is modeled by the detour ratio, and two kinds of compensation schemes are considered, in which the compensation rate is based either on the distance of the detour or on the travel distance between the store and the shipping address. Arslan et al. [84] discuss the application of crowdshipping through modeling a peer-to-peer transportation platform that receives upcoming information on both delivery tasks and drivers' trips. They focus on optimizing the delivery route and the carried tasks for each individual driver. The routing constraints include the number of stops, the willingness in terms of driving time, delivery time windows, and precedence constraints that enforce pickup before delivery. Wang et al. [92] consider a crowdshipping model over a network of storage facilities with prior knowledge of a large pool of available drivers and their trip information. They model the problem as an assignment optimization problem and further convert it to a min-cost network flow problem that minimizes the total compensation. The origin and destination of each delivery task are predefined, and the compensation is measured by the additional travel distance. Kafle et al. [93] investigate a crowdshipping system combining truck carriers to transship packages and crowdsourced shippers to perform last-mile delivery. The crowdsourced shippers can be cyclists and pedestrians from the general public, and their willingness is represented by bids rather than detour distance. An optimization model determines the crowdsourced shipper assignment, the corresponding pickup points, and the truck schedule to minimize the total bidding price, the truck cost, and the penalty cost for servicing customers outside their desired time windows.
Empirical studies intend to evaluate the performance of existing crowdsourced shipping systems and to explore the public view of and motivation for crowdshipping. Paloheimo et al. [94] examine the application of crowdsourced deliveries for delivering and returning books for a local library. Based on 6 weeks of data, they discuss the impact of the service on the roles of customers, shippers, the library, the local community, and the environment. Devari et al. [85] study people's attitudes toward and motivations for crowdshipping through public survey results. They investigate the impact of crowdshipping through the social network of the traveler, such as co-workers, neighbors, and friends, to ensure speedy and reliable last-mile delivery. Punel and Stathopoulos [95] likewise provide an exploratory examination of the public view of crowdsourced shipping through survey results. They look into the effects of the characteristics of crowdshipping drivers, such as expertise and rating, for different types of delivery tasks.
For the same-day delivery part, researchers mainly examine how to operate the delivery vehicles as a dynamic pickup and delivery problem within the span of a single day. Azi et al. [74], Voccia et al. [76], and Klapp et al. [96] essentially address a vehicle routing problem with dynamic pickup and delivery operations, maximizing the expected number of served stochastic orders while minimizing the vehicle operating cost and the penalties for open orders that remain unserved at the end of the period. Details can be found in the literature review of Chapter 3.
To conclude from the related literature, to the best of our knowledge, the problem of same-day delivery with crowdshipping and store fulfillment has not been fully investigated so far. SDD-CSF combines the study of order assignment with that of last-mile delivery. We include crowdshipping as one option for the last-mile delivery of online retail orders and further divide crowdshippers into two separate groups, Occasional Drivers and Information Sharing Drivers, in consideration of potentially practical schemes. This feature has not been investigated before either.
4.3 Methodology
In this section, we formally formulate SDD-CSF in the form of a dynamic program. We then introduce the rolling horizon as a time-based method for consolidating resources and customer orders. A cost function is formed to model the myopic fulfillment plan for received orders only. Crowdshipping, as one option for last-mile delivery, is investigated and modeled as an important component of the cost function. Further, the solution algorithm for the SDD-CSF dynamic program is discussed. Future orders are incorporated into the cost function so that the fulfillment decision minimizes the cost of both current and estimated future order deliveries. A feedback control method is also developed in the solution algorithm to calibrate the forecast of customer orders.
The proposed SDD-CSF model considers both currently received orders and forecast
orders, and can be treated in dynamic programming form [88]. The state of the
system S is defined by the unfulfilled customer orders, the available transportation
resources, and the inventory in each store. The post-decision Bellman equation can
be expressed as:

V(S) = min_{a ∈ A(S)} { C(a) + V(S^a) }    (4.1)

where V is the value function, a is a decision (an order fulfillment plan), A(S) is the
set of feasible fulfillment plans, C is the cost function for fulfillment plan a, and
S^a is the state reached after applying decision a.
The states evolve based upon newly received customer orders and the decisions of
previous states through the time horizon, which is treated as one day in consideration
of SDD.
Later, we discuss how a linear programming model (CF) is used to approximate the
value function of the future state, V(S^a). Ultimately, the SDD-CSF model combines
CF and C into a single optimization model.
4.3.1 Rolling Horizon
Since orders and transportation resources can be consolidated by time interval, this
study proposes a time-based rolling horizon framework that solves the SDD-CSF
problem repeatedly at each time t within the time horizon T ≡ [0, T].
At time t, previously unfulfilled orders and newly arrived orders are combined as the
received orders input to the model. For an individual order, the model output provides
a series of fulfillment actions specifying the store to be sourced, the selected delivery
option, and the delivery pickup time within [t, T].
Once the operational fulfillment plan is produced by the SDD-CSF model, the retailer
carries out the plan during the interval between t and t + 1. This changes the status
quo of unfulfilled orders, inventory levels, and transportation resources, while new
customer orders arrive and are recorded. At time t + 1, the model runs again with the
new inputs and resources. This process solves a series of SDD-CSF models iteratively
until T.
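The rolling-horizon procedure above can be sketched in Python. This is a minimal illustration only: collect_new_orders and solve_sdd_csf are hypothetical stand-ins for the arrival process and the SDD-CSF model developed later in this chapter, not the actual MILP.

```python
# Toy sketch of the rolling-horizon loop. The two helper functions are
# placeholders, NOT the real order process or optimization model.

def collect_new_orders(t):
    # Toy arrival process: two orders arrive each hour.
    return [f"order_{t}_{n}" for n in range(2)]

def solve_sdd_csf(orders, t, T):
    # Placeholder "plan": fulfill every order except the last one,
    # which rolls over as unfulfilled to time t + 1.
    return {"fulfilled": orders[:-1], "unfulfilled": orders[-1:]}

def rolling_horizon(T):
    unfulfilled, fulfilled = [], []
    for t in range(T):
        received = unfulfilled + collect_new_orders(t)  # consolidate by interval
        plan = solve_sdd_csf(received, t, T)            # solve for [t, T]
        fulfilled += plan["fulfilled"]                  # carry out during [t, t+1]
        unfulfilled = plan["unfulfilled"]               # roll over to t + 1
    return fulfilled, unfulfilled

done, open_orders = rolling_horizon(T=4)
```

The essential structure is the feedback between iterations: whatever the placeholder solver leaves unfulfilled at t becomes part of the received orders at t + 1, exactly as described above.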
4.3.2 The Cost Function for Received Orders
We express the cost function C by integrating the unfulfillment penalty cost and
the delivery costs. The delivery costs can be further broken down into three parts,
for the own fleet (truck), ISDs, and ODs respectively.
The truck cost is defined as the summation of per-package delivery costs for assigned
orders. The per-package delivery cost depends on the distance between the store
location and the order shipping address. The delivery rate is based on the FedEx
SameDay® service [97] with a reasonable corporate discount.
The ISD cost is a compensation proportional to the truck cost, and is assumed to
be much less than the truck cost. Also, we consider a restricted willingness to
detour, measured by the extra travel distance [91]. When the detour distance is
larger than a threshold length, the order is not assigned to that ISD driver.
The ODs come from random store walk-in customers who happen to serve as couriers
for one available package each. We assume the per-package cost of an OD is the
lowest among the three delivery options. Whether an order is fulfilled by ODs
naturally depends on chance: more walk-in customers, higher compensation, and
longer package waiting time lead to a higher probability of order fulfillment by OD.
Unlike the employed truck and scheduled ISDs, ODs have no expected arrival time.
Orders must be picked and packed before any OD can pick them up for delivery.
In sum, order fulfillment by ODs has two unique characteristics:
1. ODs perform delivery with uncertainty. Orders that are assigned to ODs but
cannot be delivered by them may be executed by other delivery options or
remain unfulfilled, which eventually costs much more since the delivery cost
of OD is the lowest.
2. When orders are assigned to be delivered by ODs, the store needs to pack
the orders as soon as possible to make them ready for pickup. The in-store
processing time of an order is assumed to be zero.
We assume that the per-package costs for truck and ISD trips are known, and that
the probability for ODs to pick up a specific package is given by an algorithm
discussed later. Furthermore, each ISD or OD can perform at most one delivery.
We formally define the cost function and related constraints as a mixed-integer
programming formulation.
The formulation divides the unfulfilled orders into two parts, the non-fixed-sourcing
orders and the fixed-sourcing orders, denoted by the sets I and J respectively. The
non-fixed-sourcing orders are those that can be sourced from any available store;
they include orders newly arrived at time t and previously unfulfilled orders that
either have no delivery option or are prepared to be delivered by truck or ISDs.
The fixed-sourcing orders, on the other hand, can be sourced only from the
previously assigned store; they come from unfulfilled orders prepared to be
delivered by ODs.
The reason for this division is the second characteristic of ODs. Once an order is
assigned to an OD, the sourcing store packs it as soon as possible. The sourcing
store is therefore fixed, although the delivery method may still change at t + 1 of
the rolling horizon. In contrast, for orders assigned to truck or ISD that have not
been delivered by the current time t, the fulfillment plan can be fully changed at
t + 1, with respect to both the delivery method and the sourcing store.
Notation
The notation used in the proposed model is summarized as follows.
Indices and sets:
I        set of non-fixed-sourcing orders, indexed by i.
J        set of fixed-sourcing orders, indexed by j.
[t0, T]  set of service hours, indexed by t, where t0 is the current time (start
         point) and T is the end of the horizon; [t0, T] ⊂ T.
K        set of stores, indexed by k.
H        set of information sharing drivers, indexed by h.
Customer demand and order fulfillment related parameters:
d_i       the quantity of demand in order i.
m_{kt}    the forecast number of walk-in customers at store k at hour t.
R_k       the ratio of walk-in customers willing to be occasional drivers at store k.
A_k       the inventory in store k at the beginning of the horizon t0.
n^V_k     the processing (picking and packing) capacity per unit time of store k
          for orders delivered by own vehicles.
n^OD_k    the processing (picking and packing) capacity per unit time of store k
          for orders delivered by ODs.
n^ISD_k   the processing (picking and packing) capacity per unit time of store k
          for orders delivered by ISDs.
θ         the penalty cost for a package that cannot be fulfilled in the same day.
Shipping related parameters:
c^V_{ko}       the shipping cost for order o from store k by own vehicles, o ∈ I ∪ J.
c^OD_{ko}      the shipping cost for order o from store k by occasional drivers, o ∈ I ∪ J.
c^ISD_{ko}     the shipping cost for order o from store k by information sharing
               drivers, o ∈ I ∪ J.
q_{ht}         binary parameter indicating whether driver h is available at time t.
Ω              the coefficient of driver willingness to detour for order fulfillment.
ρ^PLAN_h       the original travel distance of ISD h.
ρ^DETOUR_{hko} the detoured travel distance for ISD h to fulfill order o from store k,
               o ∈ I ∪ J.
p^T_{tko}      the probability for order o to be picked up at store k by an OD between
               time t and the end of the horizon, o ∈ I ∪ J.
λ_{jk}         binary parameter indicating whether fixed-sourcing order j is packed in
               store k, based on the previous assignment.
Decision variables:
a_{ikt}  binary variable: order i is sourced from store k by the own fleet (truck)
         at time t, or not.
b_{ikt}  binary variable: order i is sourced from store k by occasional drivers at
         time t, or not.
e_{ikh}  binary variable: order i is sourced from store k by ISD h, or not.
α_{jkt}  binary variable: fixed-sourcing order j is sourced from store k by own
         vehicles at hour t, or not.
β_{jkt}  binary variable: fixed-sourcing order j is sourced from store k by
         occasional drivers at hour t, or not.
γ_{jkh}  binary variable: fixed-sourcing order j is sourced from store k by ISD h,
         or not.
Myopic Formulation
The model for received SDD orders can be formulated as follows:
min ∑_{ik} ( ∑_t c^V_{ki}·a_{ikt} + ∑_t ( p^T_{tki}·c^OD_{ki} + (1 − p^T_{tki})·θ )·b_{ikt} + ∑_h c^ISD_{ki}·e_{ikh} )
  + ∑_{jk} ( ∑_t c^V_{kj}·α_{jkt} + ∑_t ( p^T_{tkj}·c^OD_{kj} + (1 − p^T_{tkj})·θ )·β_{jkt} + ∑_h c^ISD_{kj}·γ_{jkh} )
  + θ·( |I| − ∑_{ikt} a_{ikt} − ∑_{ikt} b_{ikt} − ∑_{ikh} e_{ikh} )
  + θ·( |J| − ∑_{jkt} α_{jkt} − ∑_{jkt} β_{jkt} − ∑_{jkh} γ_{jkh} )    (4.2)

s.t.
∑_i b_{ikt} + ∑_j β_{jkt} ≤ R_k·m_{kt}    ∀k ∈ K, t ∈ {t0, ..., T}    (4.3)
∑_{kt} a_{ikt} + ∑_{kt} b_{ikt} + ∑_{kh} e_{ikh} ≤ 1    ∀i ∈ I    (4.4)
∑_{kt} α_{jkt} + ∑_{kt} β_{jkt} + ∑_{kh} γ_{jkh} ≤ 1    ∀j ∈ J    (4.5)
∑_t α_{jkt} + ∑_t β_{jkt} + ∑_h γ_{jkh} ≤ λ_{jk}    ∀j ∈ J, k ∈ K    (4.6)
∑_i d_i·( ∑_t a_{ikt} + ∑_t b_{ikt} + ∑_h e_{ikh} ) ≤ A_k    ∀k ∈ K    (4.7)
∑_i a_{ikt} + ∑_j α_{jkt} ≤ n^V_k    ∀k ∈ K, t ∈ {t0, ..., T}    (4.8)
∑_i b_{ikt} + ∑_j β_{jkt} ≤ n^OD_k    ∀k ∈ K, t ∈ {t0, ..., T}    (4.9)
∑_{ih} q_{ht}·e_{ikh} + ∑_{jh} q_{ht}·γ_{jkh} ≤ n^ISD_k    ∀k ∈ K, t ∈ {t0, ..., T}    (4.10)
∑_{ik} e_{ikh} + ∑_{jk} γ_{jkh} ≤ 1    ∀h ∈ H    (4.11)
Ω·ρ^PLAN_h ≥ ρ^DETOUR_{hki}·e_{ikh}    ∀i ∈ I, k ∈ K, h ∈ H    (4.12)
Ω·ρ^PLAN_h ≥ ρ^DETOUR_{hkj}·γ_{jkh}    ∀j ∈ J, k ∈ K, h ∈ H    (4.13)
The objective function (4.2) represents the cost function C of an order fulfillment
plan defined by the decision variables. It contains transportation costs and
unfulfillment penalty costs for both the non-fixed-sourcing orders I and the
fixed-sourcing orders J, and each transportation cost in turn contains the costs of
the three delivery modes. For orders assigned to ODs, a pickup probability represents
the uncertainty of ODs, which is further discussed in the next section.
The constraints of the model show that operations are confined by resources,
including store inventory, store processing capacity, and transportation capacity.
Constraints (4.3) enforce that the number of orders assigned to ODs is no more than
the number of potential ODs. Constraints (4.4) and (4.5) guarantee that at most one
of the three delivery options is chosen for each order. Constraints (4.6) ensure
that the fixed-sourcing orders are sourced only from the previously assigned store.
Constraints (4.7) impose the restriction on store inventory. Constraints (4.8),
(4.9) and (4.10) capture the store processing capacity for the three delivery
options. Constraints (4.11) guarantee that each ISD h is assigned at most one order.
Constraints (4.12) and (4.13) express the detouring willingness of the ISDs.
The willingness of ISDs is expressed through constraints (4.12) and (4.13), which
borrow the idea of flexibility parameters from Archetti et al. [91]. For example,
when Ω = 1.5, the driver is willing to take an order that makes the total driving
distance at most 1.5 times his or her planned travel distance. Accordingly,
ρ^DETOUR_{hko} is defined over three travel segments: from the driver's origin to
the store, from the store to the order shipping address, and from the order shipping
address to the driver's destination. All three segments are known before solving
the model.
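The detour-willingness check behind constraints (4.12) and (4.13) can be sketched as follows. The Euclidean distances and coordinates are illustrative assumptions; the model only requires that the three travel segments be known in advance.

```python
# Sketch of the ISD detour-willingness feasibility check. Coordinates and
# Euclidean distance are illustrative assumptions for exposition.
import math

def euclid(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def detour_distance(origin, store, address, destination):
    # rho_DETOUR: origin -> store -> shipping address -> destination.
    return (euclid(origin, store) + euclid(store, address)
            + euclid(address, destination))

def isd_can_take(origin, destination, store, address, omega=1.5):
    planned = euclid(origin, destination)  # rho_PLAN: the driver's own trip
    detoured = detour_distance(origin, store, address, destination)
    # Feasible pairing only if Omega * rho_PLAN >= rho_DETOUR, as in (4.12).
    return omega * planned >= detoured
```

With Ω = 1.5, a driver heading from (0, 0) to (10, 0) can serve an order whose store and address lie on the way, but not one requiring a long perpendicular detour.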
For the shipping cost rate, we assume that the compensation for ISDs and ODs is
linearly correlated with the per-package truck delivery cost for assigned orders,
formally defined as:

c^ISD_{ko} = ζ^ISD · c^V_{ko}    (4.14)
c^OD_{ko} = ζ^OD · c^V_{ko}    (4.15)
ζ^ISD > ζ^OD    (4.16)

where ζ^ISD and ζ^OD are the compensation factors for ISDs and ODs respectively.
The model is embedded in the rolling horizon [0, T]; all model inputs change
according to the state of the system at the current time t0 ∈ [0, T]. For instance,
the inventory value A_k is updated based on the initial state at time 0 and the
order fulfillment plans from time 0 to t0 − 1. Also, the set of service hours is
redefined as [t0, T]. In general, the rolling horizon framework updates all input
parameters based upon the previous results and sets up a new model for the current
time t0, iteratively until the end of the horizon T.
OD Pickup Probability
We model the OD pickup probability by three factors: the estimated number of
occasional drivers, the pickup probability from one driver for a unit of time, and
the length of the time interval between the order assignment time t ∈ [t0, T] and
the end of the horizon T.
The idea is implemented in three steps. First, the number of occasional drivers
(m_t) is approximated from the historical time-of-day walk-in customers. Second,
the probability that an order at time t is picked up by an OD (p_{tko}) is
approximated from three factors: the total number of orders, the number of ODs, and
the willingness to take order o from store k, which we call the preference (l_{ko}).
Without loss of generality, one can estimate the preference from the shipping
distance, demographic factors at the shipping address, and the delivery compensation.
Finally, based on the preference for an order and the predicted number of ODs from
t to T, the pickup probability (p^T_{tko}) from t to T is formulated by applying a
geometric distribution. The process is defined in Algorithm 1 below.
Algorithm 1. OD Order Pickup Probability
Input
• O_{t0}: the number of unfulfilled orders at the current time t0, with O_{t0} = I ∪ J;
• m_{kt}: the forecast number of walk-in customers for store k at t;
• R_k: the ratio of walk-in customers willing to be occasional drivers at store k;
• n^OD_k: the processing capacity of store k for OD orders.
Output
• p^T_{tko}: the probability for order o to be picked up at store k by an OD between
time t and the end of the horizon, o ∈ I ∪ J.
1. Quantify the preference (weight) l_{ko} of ODs picking up order o based on the
shipping distance and demographic factors at the shipping address, then standardize
the values of l_{ko} into [0, 1]. Using the shipping distance d_{ko} to illustrate
the process:

l_{ko} = { 1.0,  d_{ko} ≤ 5 miles
           0.9,  5 < d_{ko} ≤ 10 miles
           0.8,  10 < d_{ko} ≤ 15 miles
           0.7,  d_{ko} > 15 miles }
2. Calculate the total number of available couriers from ODs at time t, denoted Δ_t,
which is the minimum of the total store processing capacity for OD orders and the
total expected number of OD drivers over all stores:

Δ_t = min( ∑_k n^OD_k , ∑_k R_k·m_{kt} ),  t = t0, ..., T
3. Calculate the probability p_{tko} that order o at time t is picked up by one OD.
Let O_t be the total number of unfulfilled orders at t; then

p_{tko} = { l_{ko}·Δ_t / O_t,  when l_{ko}·Δ_t / O_t ≤ 1
            1.0,               otherwise }    (4.17)
4. Update O_{t+1} from two sources: the expected number of orders fulfilled at t,
F_t, and the forecast orders assigned to t + 1 to be prepared, Ō_{t+1}:

F_t = ∑_{o ∈ O_t} ( (1/|K|)·∑_k p_{tko} )

then

O_{t+1} = O_t − F_t + Ō_{t+1}

where Ō_{t+1} can be treated as the forecast customer orders at time t + 1.
5. The OD pickup probability from t to T at store k, p^T_{tko}:

p^T_{tko} = 1 − ∏_{t'=t}^{T} (1 − p_{t'ko})    (4.18)
Equation (4.18) represents the probability that there is at least one successful
pickup attempt for the order, which is treated as the OD pickup probability from
t to T for order o in the model.
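The core steps of Algorithm 1 can be sketched in Python. The distance-based preference table follows the illustration in Step 1; the function names and toy inputs are assumptions for exposition.

```python
# Condensed sketch of Algorithm 1. Function names and inputs are illustrative.

def preference(d_ko):
    # Step 1: standardized pickup preference from shipping distance (miles).
    if d_ko <= 5:
        return 1.0
    if d_ko <= 10:
        return 0.9
    if d_ko <= 15:
        return 0.8
    return 0.7

def couriers(n_od, R, m_t):
    # Step 2: available OD couriers at time t, capped by the total store
    # processing capacity for OD orders.
    return min(sum(n_od.values()), sum(R[k] * m_t[k] for k in R))

def p_single(l_ko, delta_t, O_t):
    # Step 3: probability that one OD picks up order o at time t, capped at 1.
    return min(1.0, l_ko * delta_t / O_t)

def p_horizon(p_list):
    # Step 5: probability of at least one successful pickup between t and T,
    # the complement of every single-period attempt failing.
    prob_none = 1.0
    for p in p_list:
        prob_none *= (1.0 - p)
    return 1.0 - prob_none
```

For instance, two periods with single-period pickup probability 0.5 each give a horizon probability of 1 − 0.5² = 0.75.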
There are several ways to estimate the order preference l_{ko}. Naturally, customers
may visit nearby physical stores more frequently. Although the number of walk-in
customers varies by hour of the day, the percentage of customers coming from a short
distance can usually be assumed higher than that coming from a more distant area.
This indicates that when order o is located near store k, the value of l_{ko} is
higher. Similarly, l_{ko} can be further adjusted by demographic factors and the
delivery compensation.
It is worth noting some traits of the single-period probability p_{tko}. First, the
summation of p_{tko} over O_t equals the total number of available OD couriers at
time t, which ensures that the number of orders assigned to ODs is around the
expected number of ODs within the store processing capacity. However, because of the
processing capacity limitation, and because the number of orders assigned to ODs is
unknown when Algorithm 1 is applied, p_{tko} is not the final probability after the
order assignment resulting from the model. In the SDD-CSF model, p_{tko} represents
the relative attractiveness of order o to a random OD among all orders in O_t.
After the SDD-CSF model generates a fulfillment plan for the current time t in the
rolling horizon, updated OD pickup probabilities are estimated and used to simulate
the OD pickup process. Part of the orders assigned to ODs are fulfilled according to
the probability, while the remaining ones become fixed-sourcing orders for time
t + 1. The fixed-sourcing orders, denoted by J in the model, are those assigned and
prepared for ODs but not yet fulfilled by the current time.
The simulation process begins by recalculating the order pickup probability for one
driver at time t0, p*_{t0,ko}. The solution of the SDD-CSF model for time t provides
the exact number of orders assigned to ODs at each store, O*_{k,t0}, and the exact
number of walk-in customers, m*_{k,t0}. Then

Δ*_{t0,k} = min( n^OD_k , R_k·m*_{k,t0} ),  k ∈ K

p*_{t0,ko} = { l_{ko}·Δ*_{t0,k} / O*_{k,t0},  when l_{ko}·Δ*_{t0,k} / O*_{k,t0} ≤ 1
               1.0,                           otherwise }
Then, by comparing a random number for each order with p*_{t0,ko}, the order is
simulated as fulfilled or not by ODs.
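The simulation step can be sketched as one Bernoulli draw per assigned order. simulate_od_pickups and the seeded generator are illustrative assumptions; any uniform random source would do.

```python
# Sketch of the OD pickup simulation: each order assigned to ODs is
# fulfilled with probability p*_{t0,ko}; the rest become fixed-sourcing
# orders for t0 + 1. The seed is fixed only for reproducibility.
import random

def simulate_od_pickups(assigned, p_star, seed=0):
    rng = random.Random(seed)
    fulfilled, fixed_sourcing = [], []
    for order in assigned:
        # Compare a uniform random number with the recalculated probability.
        if rng.random() < p_star[order]:
            fulfilled.append(order)
        else:
            fixed_sourcing.append(order)  # carried over to time t0 + 1
    return fulfilled, fixed_sourcing
```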
In sum, Algorithm 1 and its output p^T_{tko} express how the uncertainty of OD
delivery is modeled, which is an essential component of both the OD behavior in the
SDD-CSF model and the order fulfillment simulation of the rolling horizon.
Variant of the Cost Function
In the cost function, the truck cost can be modeled either as the summation of
per-package delivery costs for assigned orders, or as a fixed daily fleet cost. We
discuss the latter variant in this section.
We assume that the self-operated fleet of trucks maintains its size during the
rolling horizon, so the truck operation cost is a constant for the entire rolling
horizon. Based on this, some modifications are introduced to build a mixed integer
programming model with the updated cost function.
For the fixed daily fleet cost, we add the following truck operation related
parameters and decision variables.
Indices and sets:
Z    set of delivery zones, the divided areas of the whole region, indexed by z.
Truck operation related parameters:
v^O  the daily operation cost of one truck from the own fleet.
v^W  the daily working time length of one truck from the own fleet.
v^M  the average moving speed of one truck from the own fleet.
v^T  the average stop time for delivering a package from the truck to its
     destination.
v^H  the package holding capacity of one truck from the own fleet.
Truck operation related decision variables:
f    the number of trucks used for order fulfillment throughout the whole day.
l_t  the optimal travel distance to fulfill orders by the own fleet at hour t.
The new objective function:

min v^O·f + ∑_{ik} ( ∑_t ( p^T_{tki}·c^OD_{ki} + (1 − p^T_{tki})·θ )·b_{ikt} + ∑_h c^ISD_{ki}·e_{ikh} )
  + ∑_{jk} ( ∑_t ( p^T_{tkj}·c^OD_{kj} + (1 − p^T_{tkj})·θ )·β_{jkt} + ∑_h c^ISD_{kj}·γ_{jkh} )
  + θ·( |I| − ∑_{ikt} a_{ikt} − ∑_{ikt} b_{ikt} − ∑_{ikh} e_{ikh} )
  + θ·( |J| − ∑_{jkt} α_{jkt} − ∑_{jkt} β_{jkt} − ∑_{jkh} γ_{jkh} )    (4.19)
The newly added constraints:

l_t = ∑_{(k,z)} 2·ζ_{kz}·( ∑_i μ_{iz}·a_{ikt} + ∑_j μ_{jz}·α_{jkt} ) / v^H
      + ∑_z 0.57·√( ( ∑_{ik} μ_{iz}·a_{ikt} + ∑_{jk} μ_{jz}·α_{jkt} )·Π_z )    ∀t ∈ T    (4.20)

f ≥ ( l_t / v^M + v^T·∑_{kz} ( ∑_i μ_{iz}·a_{ikt} + ∑_j μ_{jz}·α_{jkt} ) ) / v^W    ∀t ∈ T    (4.21)

where ζ_{kz} is the distance from store k to zone z, μ_{iz} (μ_{jz}) indicates
whether order i (j) ships to zone z, and Π_z is the area of zone z.
The objective function (4.19) aims to minimize the last-mile delivery cost and the
unfulfillment penalty cost, as before. The difference is that the truck cost is now
defined by the daily operation cost of the trucks. Constraint (4.20) calculates the
total travel distance for a unit of time. Based on the predefined operation
parameters of the vehicles, the number of trucks is approximated by constraint
(4.21). The purpose of modeling truck operation in this way is to approximate the
vehicle routing problem and estimate the optimal routing travel distance; as a
result, we can estimate the number of trucks needed. Constraints (4.20) and (4.21)
apply Continuous Approximation, which utilizes the zone area and the number of
customers to approximate the average vehicle travel distance with high accuracy.
The rationale of Continuous Approximation can be seen in the Vehicle Routing
Estimation section of Chapter 3.
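The arithmetic behind constraints (4.20) and (4.21) can be sketched as follows, under the assumption that the travel distance decomposes into line-haul round trips plus a 0.57·√(n·A) local-tour term; all numeric inputs are illustrative.

```python
# Sketch of the Continuous Approximation used in (4.20)-(4.21).
# Inputs are toy numbers; the 0.57 * sqrt(n * A) term is the local-tour
# estimate for n stops spread over a zone of area A.
import math

def ca_travel_distance(d_line_haul, n_stops, area, capacity):
    # Line-haul legs: each full truckload makes a round trip to the zone.
    line_haul = 2.0 * d_line_haul * n_stops / capacity
    # Local delivery tour within the zone.
    local = 0.57 * math.sqrt(n_stops * area)
    return line_haul + local

def trucks_needed(dist, n_stops, speed, stop_time, work_hours):
    # As in (4.21): driving time plus per-stop service time, divided by
    # the daily working time of one truck.
    return math.ceil((dist / speed + stop_time * n_stops) / work_hours)
```

With a 10-mile line haul, 20 stops, a 25-square-mile zone, and capacity 10, the estimate is roughly 53 miles, which a single truck at 30 mph with 0.1-hour stops covers well within an 8-hour day.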
4.3.3 The Cost Function with Forecast Orders and Feedback Control
By integrating the cost function into the rolling horizon, we obtain a myopic
fulfillment plan model based on the currently received orders and transportation
resources. However, this model does not take into account upcoming future orders and
resources. In order to minimize not only the immediate fulfillment cost but also the
expected future cost, we further introduce the SDD-CSF model, which leverages
forecasts of future orders and resources. This subsection describes how to model
the cost function for forecast orders, and how to design the feedback function that
copes with inaccurate order predictions.
The Cost Function for Forecast Orders
The dynamic problem is intractable to solve exactly due to the very large state
space and action space A. We use a mixed integer linear programming model to
approximate the future cost function V(S^a), given the forecast of future orders as
input:

V(S^a) ≈ CF(a* | S^a)    (4.22)

In this formulation, CF is the cost function for forecast orders, S^a is the state
evolving from decision a for the current order assignment, and a* is the order
fulfillment plan decision for the forecast future customer orders. Modeling CF
inherits the ideas used in modeling C, yielding a mixed integer linear programming
(MILP) model. CF is the optimal objective value of this MILP, integrating the
projected delivery costs.
We denote the sets, parameters and decision variables of the model as follows.
Demand forecasting related parameters:
Z              set of delivery zones, the divided areas of the whole region, indexed
               by z.
g_{zτ}         the predicted number of future orders originating from zone z at
               time τ.
π_{zτt}        binary parameter indicating whether the predicted future demand
               g_{zτ} can be fulfilled at hour t.
ρ^DETOUR_{hkz} the detoured travel distance for ISD h to fulfill one package from
               store k to delivery zone z.
Demand forecasting related decision variables:
w_{zτkt}       the fulfillment flow for g_{zτ} from store k to zone z by own trucks
               at time t.
x_{zτkt}       the fulfillment flow for g_{zτ} from store k to zone z by occasional
               drivers at time t.
v_{zτkh}       binary variable indicating that one package for g_{zτ} is delivered
               from store k to zone z by ISD h.
We assume the forecast customer orders in a processing horizon are known at the
beginning as g_{zτ}, which represents the number of orders to be placed from zone z
at time τ. The forecast customer orders can be estimated from historical order
records by time series methods and other factors, including weather, events,
holidays, etc. Online customer demand prediction is a well studied area [98], and it
is a separate process from the model. Here, we assume that an order prediction model
is at hand for estimating g_{zτ}. In the next section we introduce a feedback
control system to improve the forecast accuracy by integrating the existing
prediction results with the realized customer orders.
π_{zτt} is an auxiliary parameter based upon g_{zτ}. π_{zτt} is zero for
t ∈ [0, τ − 1] and one for t ∈ [τ, T], denoting that no order fulfillment can occur
before the order arrival at time τ, while fulfillment is open after τ within the
processing horizon:

π_{zτt} = { 0,  when t < τ
            1,  when t ≥ τ }    ∀g_{zτ}
Instead of using binary variables as in the C function, the decision variables in
CF are represented by order flows, i.e., the number of orders sourced from a store
to a target zone. However, the decision variable for ISDs, v_{zτkh}, is still binary
since each ISD can carry only one order and the ISDs are pre-defined and indexed by
h. Therefore, the action a* determines the order sourcing flows for forecast future
customer orders, whereas the action a makes exact fulfillment plans for individual
arrived orders.
The formulation of CF as an MILP:

min ∑_{zτkt} ( c^V_{kz}·w_{zτkt} + c^OD_{kz}·x_{zτkt} + ∑_h c^ISD_{kz}·v_{zτkh} )    (4.23)

s.t.
∑_{zτ} x_{zτkt} ≤ R_k·m_{kt}    ∀k ∈ K, t ∈ T    (4.24)
∑_{zτ} ( ∑_t w_{zτkt} + ∑_t x_{zτkt} + ∑_h v_{zτkh} ) ≤ A_k    ∀k ∈ K    (4.25)
∑_{zτ} w_{zτkt} ≤ n^V_k    ∀k ∈ K, t ∈ T    (4.26)
∑_{zτ} x_{zτkt} ≤ n^OD_k    ∀k ∈ K, t ∈ T    (4.27)
∑_{zτh} q_{ht}·v_{zτkh} ≤ n^ISD_k    ∀k ∈ K, t ∈ T    (4.28)
∑_{zτk} v_{zτkh} ≤ 1    ∀h ∈ H    (4.29)
Ω·ρ^PLAN_h ≥ ρ^DETOUR_{hkz}·v_{zτkh}    ∀z ∈ Z, k ∈ K, h ∈ H    (4.30)
∑_{kt} w_{zτkt} + ∑_{kt} x_{zτkt} + ∑_{kh} v_{zτkh} = g_{zτ}    ∀z ∈ Z, τ ∈ T    (4.31)
∑_k w_{zτkt} + ∑_k x_{zτkt} + ∑_{kh} q_{ht}·v_{zτkh} ≤ π_{zτt}·g_{zτ}    ∀z ∈ Z, τ ∈ T, t ∈ T    (4.32)
w_{zτkt} + x_{zτkt} + ∑_h q_{ht}·v_{zτkh} = 0    ∀z ∈ Z, τ ∈ T, k ∈ K, t = t0    (4.33)
The objective function (4.23) minimizes the last-mile transport cost of forecast
customer orders when assigned for delivery by truck, ISDs or ODs. Constraints (4.24)
ensure that the number of orders assigned to ODs does not exceed the number of
available ODs. Constraints (4.25) ensure that the assignments to each store do not
exceed its inventory. Constraints (4.26), (4.27) and (4.28) enforce the store
processing capacity for order assignment. Constraints (4.29) ensure that an ISD can
take at most one order, and constraints (4.30) require the detour travel distance of
an ISD to be less than the preset maximum distance determined by willingness.
Constraints (4.31) and (4.32) require forecast orders to be met after they arrive at
time τ. For the current time t = t0, no fulfillment plan is needed here since it is
handled by the cost function C, as enforced by constraints (4.33).
It is worth noting that, compared to C, CF does not consider the penalty cost for
order unfulfillment. That is because the model presumes that the store supply and
processing capacity are sufficient to meet demand within the rolling horizon. If
that is not valid, one can select more stores for order fulfillment or scale down
the forecast customer orders in the numerical experiment.
After combining C and CF, the SDD-CSF problem is ultimately modeled as a mixed
integer linear programming approximation. The dynamic programming equation evolves
from (4.1) to

V(S) = min_{a ∈ A(S)} { C(a) + V(S^a) } ≈ min_{a ∈ A(S), a* ∈ A(S^F)} { C(a | S^{a*}) + CF(a* | S^a) }    (4.34)

From equation (4.34), the actions a and a* contribute to both the C and CF
functions. This means that a and a* affect each other by changing the inventory and
transportation resources from the current time t0 to the end of the horizon T.
Therefore, the SDD-CSF model combines C and CF into one cost-integrated and
resource-shared MILP model. Here, we present the SDD-CSF MILP in the truck
per-package cost form.
min ∑_{ik} ( ∑_t c^V_{ki}·a_{ikt} + ∑_t ( p^T_{tki}·c^OD_{ki} + (1 − p^T_{tki})·θ )·b_{ikt} + ∑_h c^ISD_{ki}·e_{ikh} )
  + ∑_{jk} ( ∑_t c^V_{kj}·α_{jkt} + ∑_t ( p^T_{tkj}·c^OD_{kj} + (1 − p^T_{tkj})·θ )·β_{jkt} + ∑_h c^ISD_{kj}·γ_{jkh} )
  + ∑_{zτkt} ( c^V_{kz}·w_{zτkt} + c^OD_{kz}·x_{zτkt} + ∑_h c^ISD_{kz}·v_{zτkh} )
  + θ·( |I| − ∑_{ikt} a_{ikt} − ∑_{ikt} b_{ikt} − ∑_{ikh} e_{ikh} )
  + θ·( |J| − ∑_{jkt} α_{jkt} − ∑_{jkt} β_{jkt} − ∑_{jkh} γ_{jkh} )    (4.35)
s.t.
∑_i b_{ikt} + ∑_j β_{jkt} + ∑_{zτ} x_{zτkt} ≤ R_k·m_{kt}    ∀k ∈ K, t ∈ T    (4.36)
∑_{kt} a_{ikt} + ∑_{kt} b_{ikt} + ∑_{kh} e_{ikh} ≤ 1    ∀i ∈ I    (4.37)
∑_{kt} α_{jkt} + ∑_{kt} β_{jkt} + ∑_{kh} γ_{jkh} ≤ 1    ∀j ∈ J    (4.38)
∑_t α_{jkt} + ∑_t β_{jkt} + ∑_h γ_{jkh} ≤ λ_{jk}    ∀j ∈ J, k ∈ K    (4.39)
∑_i d_i·( ∑_t a_{ikt} + ∑_t b_{ikt} + ∑_h e_{ikh} )
  + ∑_{zτ} ( ∑_t w_{zτkt} + ∑_t x_{zτkt} + ∑_h v_{zτkh} ) ≤ A_k    ∀k ∈ K    (4.40)
∑_i a_{ikt} + ∑_j α_{jkt} + ∑_{zτ} w_{zτkt} ≤ n^V_k    ∀k ∈ K, t ∈ T    (4.41)
∑_i b_{ikt} + ∑_j β_{jkt} + ∑_{zτ} x_{zτkt} ≤ n^OD_k    ∀k ∈ K, t ∈ T    (4.42)
∑_{ih} q_{ht}·e_{ikh} + ∑_{jh} q_{ht}·γ_{jkh} + ∑_{zτh} q_{ht}·v_{zτkh} ≤ n^ISD_k    ∀k ∈ K, t ∈ T    (4.43)
∑_{ik} e_{ikh} + ∑_{jk} γ_{jkh} + ∑_{zτk} v_{zτkh} ≤ 1    ∀h ∈ H    (4.44)
Ω·ρ^PLAN_h ≥ ρ^DETOUR_{hki}·e_{ikh}    ∀i ∈ I, k ∈ K, h ∈ H    (4.45)
Ω·ρ^PLAN_h ≥ ρ^DETOUR_{hkj}·γ_{jkh}    ∀j ∈ J, k ∈ K, h ∈ H    (4.46)
Ω·ρ^PLAN_h ≥ ρ^DETOUR_{hkz}·v_{zτkh}    ∀z ∈ Z, k ∈ K, h ∈ H    (4.47)
∑_{kt} w_{zτkt} + ∑_{kt} x_{zτkt} + ∑_{kh} v_{zτkh} = g_{zτ}    ∀z ∈ Z, τ ∈ T    (4.48)
∑_k w_{zτkt} + ∑_k x_{zτkt} + ∑_{kh} q_{ht}·v_{zτkh} ≤ π_{zτt}·g_{zτ}    ∀z ∈ Z, τ ∈ T, t ∈ T    (4.49)
w_{zτkt} + x_{zτkt} + ∑_h q_{ht}·v_{zτkh} = 0    ∀z ∈ Z, τ ∈ T, k ∈ K, t = t0    (4.50)
The objective function of SDD-CSF (4.35) is essentially the summation of (4.2) from
C and (4.23) from CF. Constraints (4.36) and (4.40)–(4.43) integrate the order
assignments for received orders and forecast orders to enforce the limits on the
number of ODs, store inventory, and store processing capacity respectively. The
other constraints remain the same as in C and CF.
Feedback Control
When a model is created for time t in the rolling horizon, the received orders and
available resources are realized for time t. Therefore, it is possible to verify
whether the predictions of orders and transportation resources from time 0 to t − 1
were accurate. Furthermore, this verification at time t of the predictions over
[0, t − 1] can be quantified as a feedback control system for improving the
forecasts for [t + 1, T] made at time t.
The SDD-CSF model integrates the feedback control system to cope with inaccurate
forecasts of future demand in the rolling horizon framework. Based on the realized
number of online orders, an exponential smoothing method is applied to the data
inputs to adjust the OD pickup probability and the forecast of future orders. In
this way, the rolling horizon solution framework becomes more robust and scalable
even if the initial prediction is inaccurate.
As discussed in the sections The Cost Function for Received Orders and The Cost
Function for Forecast Orders, the forecasts of future demand are provided as input
data at the beginning of the rolling horizon. The forecast customer orders can be
estimated from historical order records by time series methods and other factors,
including weather, events, holidays, etc. The feedback control system adjusts these
prediction input variables for the SDD-CSF model based on the realized data. The
detailed process is given in Algorithm 2.
Algorithm 2. Feedback Control for Forecast orders
Inputs
• [0, T]: set of service hours, indexed by t;
• t0: current time of the rolling horizon;
• K: set of stores, indexed by k;
• O_t: number of received orders at time t ∈ [0, t0];
• g_{zt}: the predicted future demand originating from zone z at hour t ∈ [0, T];
the prediction is made before time 0;
• α^F: the smoothing factor of exponential smoothing for forecast orders, between
0 and 1.
Outputs
• Ψ^F: the smoothed feedback-control factor for forecast orders;
• g*_{zt}: the smoothed future demand originating from zone z at hour t ∈ [t0, T].
Steps
1. Obtain the number of predicted orders at time t, O^F_t, from g_{zt}:

O^F_t = ∑_{z ∈ Z} g_{zt},  t = 0, ..., T
2. Calculate the ratio κ^F_t of the realized number of orders to the forecast
number of orders from the beginning to the current time t0:

κ^F_t = O_t / O^F_t,  t = 0, ..., t0
3. Update Ψ^F at time t0 for forecast orders by

Ψ^F_t = { κ^F_t,                              t = 0
          α^F·κ^F_t + (1 − α^F)·Ψ^F_{t−1},    t = 1, ..., t0 }

and set Ψ^F = Ψ^F_{t0}.
4. Based on the feedback-control smoothing factor for forecast orders, the total
number of forecast orders G for [t0, T] can be updated by

G = Ψ^F · ∑_z ∑_{t=t0}^{T} g_{zt}

5. Define Δ as the difference between G and the original total number of forecast
orders:

Δ = G − ∑_z ∑_{t=t0}^{T} g_{zt}
6. Simulate the forecast orders based on the value of Δ. Create |Δ| random forecast
orders g′_{zt} over the delivery zone set Z and the time set [t0, T].
If Δ ≥ 0, g*_{zt} = g_{zt} ∪ g′_{zt};
else, g*_{zt} = g_{zt} \ (g_{zt} ∩ g′_{zt}).
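Steps 1–4 of Algorithm 2 reduce to an exponentially smoothed ratio of realized to forecast orders, used to rescale the remaining forecast. A sketch with toy inputs (the function names are assumptions):

```python
# Sketch of steps 1-4 of Algorithm 2: exponentially smooth the ratio of
# realized to forecast orders, then rescale the remaining forecast.

def feedback_factor(realized, forecast, alpha):
    # realized[t] / forecast[t] for t = 0..t0, exponentially smoothed with
    # smoothing factor alpha; the final value is Psi^F.
    psi = None
    for o_t, of_t in zip(realized, forecast):
        kappa = o_t / of_t                      # step 2: ratio kappa^F_t
        psi = kappa if psi is None else alpha * kappa + (1 - alpha) * psi
    return psi

def rescaled_total(psi, remaining_forecast):
    # Step 4: G = Psi^F times the total forecast over [t0, T].
    return psi * sum(remaining_forecast)
```

For example, if 10 then 12 orders arrive against a flat forecast of 10 per hour, the smoothed factor with α^F = 0.5 is 1.1, so a remaining forecast of 20 orders is scaled up to 22.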
By applying g*_{zt} in place of g_{zt} in Algorithm 1 and the SDD-CSF model, the
smoothed feedback-control factors are incorporated to improve the forecast results
based on the realized data. In general, with the help of the rolling horizon
solution framework, the SDD-CSF model is able to efficiently adjust the forecast
input parameters to accommodate fluctuations in reality.
4.4 Case Study
The proposed SDD-CSF model seeks to minimize the immediate order fulfillment cost
based upon on-hand and forecast transportation resources, and at the same time
leverages forecasts of future customer orders to reduce the resulting expected
future cost. The SDD-CSF model thus considers both the efficiency and the robustness
of the order fulfillment plan over the whole time horizon. For the sake of
comparison, we create three more MILP models as benchmarks for the SDD-CSF model:
the Conservative, Myopic and Global-optimal models.
The conservative model considers only on-hand transportation resources and currently received customer orders. The myopic model (4.2)–(4.13) considers the forecast transportation resources for currently received customer orders, but not future orders. The global-optimal model is defined with no uncertainty, knowing all transportation resources and daily customer orders with certainty at the beginning of the horizon. The conservative and myopic models represent two common practical industry strategies for order fulfillment, while the global-optimal model provides a lower bound on the operation cost of order fulfillment.
From the perspective of formulation, the conservative and global-optimal models are both derived from the myopic model. For the conservative model, the set of service hours is only the single period [t0, t0] rather than the range [t0, T] used by the others. Therefore, all time-related parameters and variables contain values only for the current time t0, which represents the on-hand transportation resources and the currently received customer orders. For the global-optimal model, there are three differences compared with the myopic model. First, the set of non-fixed-sourcing orders I contains all orders of the horizon instead of only the orders arriving at t0, so the set of fixed-sourcing orders J is always empty. Second, the OD pickup probability is always 1 because of the no-uncertainty assumption. Furthermore, we add an auxiliary parameter ν_{it} to indicate the arrival time τ of order i:
ν_{it} = 0 if t < τ;  ν_{it} = 1 otherwise.
Then the fulfillment plan for order i may only be made after the order arrives, which is enforced by the additional constraints

∑_{k} (a_{ikt} + b_{ikt} + ∑_{h} q_{ht} · e_{ikh}) ≤ ν_{it},  ∀i ∈ I, t = 0, …, T
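A minimal sketch of how this arrival indicator can be constructed, independent of any particular MILP library (the list of arrival times is hypothetical):

```python
def arrival_indicator(arrival_times, T):
    """Build nu[i][t] = 1 if order i (arriving at hour tau_i)
    may be fulfilled at hour t (i.e. t >= tau_i), else 0."""
    return [[1 if t >= tau else 0 for t in range(T + 1)]
            for tau in arrival_times]
```

In the global-optimal model, every fulfillment term for order i at hour t is then bounded above by nu[i][t], so no plan can serve an order before it arrives.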
The customer demand data is collected from Instacart, an American company that provides same-day grocery delivery service. In June 2017, Instacart released 3 million orders for non-commercial use¹. The time horizon for each day is set from 8 am to 7 pm, i.e., 12 hours. All orders placed between 8 pm and 7 am are counted as new orders arriving at 8 am, which is why there is a surge of orders at the beginning of each time horizon.
We choose Chicago, Illinois as the test area to provide spatial information for the assumed local retail stores, ISD shared trips, and customer shipping addresses. The number of walk-in customers is estimated by three factors: general online statistics of supermarket customers, assumed store capacity, and hour of the day. We set 5% as the ratio of walk-in customers willing to be occasional drivers at the store, which means 5 of every 100 walk-in customers would serve as ODs and deliver orders if there are packages ready to be picked up.
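As a small illustration of this OD-supply estimate (the hourly walk-in counts below are hypothetical):

```python
def hourly_od_supply(walkins_per_hour, od_ratio=0.05):
    """Estimate the expected number of occasional drivers per hour
    as a fixed fraction (5% by default) of store walk-in customers."""
    return [round(w * od_ratio) for w in walkins_per_hour]
```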
The delivery rate is based on the FedEx SameDay® service [97] with a reasonable corporate discount. The base rate is 8 US dollars per package; once the distance exceeds 15 miles, the cost increases at 0.2 US dollars per mile of additional travel distance. The compensation factors for ISDs (ζ^{ISD}) and ODs (ζ^{OD}) are, respectively, 0.2 and 0.4, according to the cost setting proposed by Archetti et al. [91]. Furthermore, the penalty for each order unfulfilled at the end of the day is set to 50 US dollars.
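Under one plausible reading of this rate rule (flat base up to 15 miles, then a per-mile surcharge on the excess distance), the per-package truck cost can be sketched as:

```python
def truck_rate(distance_miles, base=8.0, free_miles=15.0, per_mile=0.2):
    """Per-package delivery rate: flat base charge up to free_miles,
    plus a per-mile surcharge on the distance beyond that threshold."""
    extra = max(0.0, distance_miles - free_miles) * per_mile
    return base + extra
```

For example, a 10-mile delivery costs the $8 base, while a 20-mile delivery adds 5 surcharge miles at $0.20 each.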
We implement the solution methodology in CPLEX 12.7.0, applying the CPLEX MIP solver directly to the SDD-CSF model within the rolling horizon framework. All models run under CentOS Linux on a workstation with a 12-core Intel Xeon E7-4830 CPU at 2.13 GHz and 32 GB of memory.
Here, we focus on an instance of relatively small scale to show the differences between the four proposed models. The instance contains 2788 customer orders in two
¹ https://www.instacart.com/datasets/grocery-shopping-2017
[Figure 4.1: The hourly summary of order fulfillment plans by the four models. Panels: (a) the myopic model; (b) the conservative model; (c) the global-optimal model; (d) the SDD-CSF model. Each panel plots the number of orders (truck, OD, ISD, unfulfilled, and received orders) against the time of day over the two days.]
days, and we consider two stores to fulfill these orders. The instance uses the same setting as P7 in Table 4.1.
The solving process for the myopic, conservative, and SDD-CSF models can be described as follows. A model is created at each time t of the rolling horizon and solved exactly and iteratively until the end of the horizon T. For both the myopic and SDD-CSF models, the solution at time t contains the order fulfillment plan from t to T. However, the remaining orders at time t + 1 are treated as input, and the fulfillment plan is updated according to the solution of the model at time t + 1. Therefore, we implement only the current-time part of the exact solution as the finalized order fulfillment plan from the myopic and SDD-CSF models at time t0. The global-optimal model needs to be run only once for the whole horizon.
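This rolling horizon procedure can be sketched as the following loop. The `solve_model` callable is purely a stand-in for the CPLEX solve of the myopic, conservative, or SDD-CSF MILP; only the current hour's decisions are committed, and open orders roll forward:

```python
def rolling_horizon(orders_by_hour, T, solve_model):
    """Iteratively solve at each hour t0 and commit only the
    current-hour decisions; remaining orders roll to t0 + 1.

    solve_model(pending, t0, T) -> (decisions for hour t0,
                                    orders left open for later hours)
    """
    pending, committed = [], []
    for t0 in range(T + 1):
        pending += orders_by_hour.get(t0, [])   # newly arrived orders
        now, pending = solve_model(pending, t0, T)
        committed.append((t0, now))             # finalize hour t0 only
    return committed
```

A global-optimal solve, by contrast, would be a single call with all orders known at t = 0.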
Figure 4.1 shows the hourly summary of the order fulfillment plans solved by the four models. In each panel, there are two time horizons from 8 am to 7 pm. There is a surge of received orders at the beginning of each time horizon, for the reason discussed above. In addition, more orders arrive on day 2 than on day 1.
Let us first focus on the number of hourly fulfilled orders regardless of delivery method. In Figure 4.1, the global-optimal model divides the fulfillment work more evenly over time than the others. The SDD-CSF and conservative models generally follow the order arrival pattern. In contrast, the myopic model lacks foresight of upcoming orders, fulfilling only a small number of orders at the beginning of each horizon and trying to catch up at the end. On day 2, the myopic model has unfulfilled orders at the end of the day, while none of the other models has any.
Each hour's order fulfillment plan is composed of order assignments to three delivery methods. Generally, for a specific order, the OD delivery rate is the lowest and the truck is the most expensive, because of the shipping cost rates (4.14)–(4.16). The global-optimal model fully utilizes the ODs in every hour. The SDD-CSF model tries to use as many ODs as possible; however, at the end of the horizon it assigns fewer orders to ODs because of the correspondingly smaller OD pickup probability. The conservative model intends to fulfill as many orders as its capacity allows, so it assigns the most orders to trucks compared with the others.
The summary of order assignments and associated costs is shown in Figure 4.2. The total cost of the SDD-CSF model is lower than those of the myopic and conservative models. Although
[Figure 4.2: The order fulfillment plans and associated costs of the instance P7 by the four models. Panel (a) shows the order assignments and panel (b) the associated costs (USD), each broken down into truck, OD, ISD, and unfulfilled for the myopic, conservative, global-optimal, and SDD-CSF models.]
the myopic and SDD-CSF models have similar assignment percentages across the three delivery methods, the SDD-CSF model incurs no penalty for order unfulfillment.
In sum, we can conclude that the myopic model manages the delivery timing of order fulfillment poorly, while the conservative model manages the delivery resources poorly. The proposed SDD-CSF model overcomes the shortcomings of both. It incorporates the cost function for future forecast orders to select good timing from the perspective of the whole time horizon, and it uses the cost function for currently received orders to select a good delivery method. Combining the two cost functions, the SDD-CSF model provides a better order fulfillment plan.
Next, we conduct a large number of computational experiments to quantify the benefits of the SDD-CSF model. The Instacart data is sampled to create two levels of customer demand, small and large, over a length of two days, and each level contains ten instances with an incremental number of orders. Two stores serve as input for the small demand level, while five stores serve the large demand level. The details can be seen in
Table 4.1: The Orders and Stores Inputs for Two Levels of Customer Demand
Small Demand Level Large Demand Level
Input Stores (IDs) [9, 90] Input Stores (IDs) [9, 17, 20, 35, 76]
Instance # of Orders Instance # of Orders
P1 1058 P11 1745
P2 1446 P12 2228
P3 1705 P13 2774
P4 1979 P14 3397
P5 2181 P15 3873
P6 2466 P16 4452
P7 2788 P17 4998
P8 3035 P18 5543
P9 3276 P19 6066
P10 3426 P20 6496
Table 4.1.
We incorporate the two levels of customer demand into the four MILP models (the conservative, myopic, global-optimal, and SDD-CSF models) within the rolling horizon framework. One day is treated as a time horizon and is further divided into 12 sequential time periods from 8 am to 7 pm, the 12 working hours discussed before. In sum, the total number of optimization models solved for the comparison is 192 (= 2 × 4 × 2 × 12).
The solution results of the 20 different-sized instances under the four models are shown in Figure 4.3 in terms of total cost. The cost curves follow similar patterns for both the small and the large demand level. When the
[Figure 4.3: The solution results (total cost) of the instances for different levels of customer demand. Panel (a) covers the small demand level (P1–P10) and panel (b) the large demand level (P11–P20), each comparing the myopic, conservative, global-optimal, and SDD-CSF models.]
input orders are small enough, all four models handle fulfillment in the least expensive way, using ODs whenever the store has any, so the costs of the four models are almost the same. When the input orders increase from small to medium numbers, the conservative model costs more than the others. Because surplus delivery resources still exist, the delivery timing is not sensitive enough to affect the cost; the conservative model, however, handles the order surge at the beginning of the horizon with all of its capacity, including the costly truck. When the input orders increase to large numbers, delivery timing becomes very sensitive since the penalty cost of unfulfillment is very high. The myopic model fails to fully utilize the delivery resources at the beginning of the horizons, and the costs of the conservative model come close to those of SDD-CSF since both models are forced to use all the delivery resources to avoid the penalty cost.
Figure 4.3 shows that the cost curve of SDD-CSF is always close to that of the global-optimal model, no matter the size of the input orders. Excluding the global-optimal model, which is impractical for real-world use because of its perfect-information assumption, the SDD-CSF model
[Figure 4.4: Cases of forecast orders g_{zt} and solution results from the SDD-CSF models with feedback control. Panel (a) shows the hourly orders for forecast demand cases C1–C9 together with the realized orders; the bars in panel (b) show the total costs (USD) of the SDD-CSF models with feedback control for each case, alongside the myopic, conservative, and global-optimal results.]
proves to be the most robust and efficient way to make the order fulfillment plan for these 20 different-sized instances.
Table 4.2 summarizes the solving times for the 20 different-sized instances under the four models. The proposed SDD-CSF model needs the longest time for every instance. However, solving even the largest problem within 5 minutes is reasonable for hourly order fulfillment planning.
We create 9 cases of forecast orders g_{zt} to benchmark the performance of the feedback control. The input setting is the same as that of instance P4 in Table 4.1. Cases 1–3 contain far fewer orders than the real input, representing under-forecasting; Cases 7–9 contain more, denoting over-forecasting; and Cases 4–6 are similar, standing for good forecasting. All cases are randomly sampled from the original Instacart order data.
In Figure 4.4, the results of the SDD-CSF models with feedback control remain at a relatively
Table 4.2: The solving time (seconds) of the instances for different levels of customer
demand
Instance # of Orders Myopic Conservative Global-optimal SDD-CSF
P1 1,058 2.06 0.31 1.71 17.63
P2 1,446 3.84 0.67 2.67 28.18
P3 1,705 4.86 0.47 2.89 18.37
P4 1,979 5.64 0.55 3.76 23.96
P5 2,181 6.22 0.87 3.98 29.08
P6 2,466 6.73 0.70 5.60 39.97
P7 2,788 8.50 0.80 6.09 37.30
P8 3,035 11.44 0.85 6.72 44.44
P9 3,276 11.45 1.00 7.63 37.81
P10 3,426 14.05 1.44 8.39 43.74
P11 1,745 4.88 1.18 6.46 47.89
P12 2,228 8.40 1.10 8.71 66.27
P13 2,774 9.96 1.55 12.11 87.00
P14 3,397 18.26 1.36 19.76 89.72
P15 3,873 20.31 1.69 22.85 115.98
P16 4,452 24.32 2.28 30.23 146.35
P17 4,998 26.00 3.25 35.62 141.68
P18 5,543 29.12 2.57 45.75 169.93
P19 6,066 46.50 2.84 48.56 175.17
P20 6,496 74.52 4.75 57.76 238.65
stable level and are better than those of the myopic and conservative models, regardless of the cases and their different patterns.
4.5 Conclusion
In this chapter, we introduce a new problem of same-day delivery with crowdshipping and store fulfillment, which aims to fill the last-mile gap between the local store network and online customers in terms of supply chain operation.
First, we propose a rolling horizon solving framework that accumulates online orders and transportation resources in order to achieve economies of scale, a typical challenge in last-mile delivery. The framework repeatedly solves a series of order assignment and delivery planning problems along the timeline to construct a daily optimal fulfillment plan from local stores.
We develop a set of exact solution approaches for order fulfillment within the rolling horizon framework. The original dynamic programming problem for currently received orders is mathematically approximated by a mixed integer linear programming model. The model considers both currently received orders and the prediction results of future demand to make order assignment decisions that minimize the immediate delivery cost plus the resulting expected future cost.
With the help of the rolling horizon structure, we introduce a feedback control system to cope with inaccurate forecasts of future demand. Based on the realized number of online orders, an exponential smoothing method is applied to the data inputs to adjust the forecast of future orders. The case study shows that the feedback control system makes the rolling horizon solving framework more robust and scalable even if the initial prediction is not accurate.
Additionally, crowdsourced shipping for SDD has been integrated as a delivery option and divided into two types of private drivers: information sharing drivers and occasional drivers. In the setting of store fulfillment, information sharing drivers are like commuters who share scheduled trips to the retailer; this enables the model to assign packages to scheduled trips for compensation with certainty. Occasional drivers can be treated as average store walk-in customers who are willing to take packages for others in exchange for compensation. The big challenge in modeling ODs is that they deliver with uncertainty; therefore, we propose the OD pickup probability and integrate it into the model.
Finally, a case study with various computational experiments is conducted to quantify the benefits by comparing SDD-CSF with some conventional order fulfillment practices. The proposed SDD-CSF achieves good performance and robustness in optimizing order fulfillment planning for different-sized customer demand instances, which are derived from real sales data of a national retailer.
Chapter 5
CONCLUSION AND FUTURE RESEARCH DIRECTIONS
5.1 Summary of Findings
The problems examined in this thesis originate from two perspectives of transportation analytics. Transit flow prediction under events is an actual public transportation problem faced by transportation authorities. Last-mile same-day delivery with store fulfillment is a trending business idea in omni-channel supply chains for brick-and-mortar retailers.
We show how the crowdsourced content of social media can improve event transit flow prediction in Chapter 2. Since social media can be retrieved in real time with relatively small building and maintenance costs, we propose an algorithm and a prediction model that use social media data to assist transportation flow prediction under special event conditions. Among several popular prediction methods, our method shows the best results in terms of mean absolute percentage error. We also exploit social media to detect various events; our approach achieves good performance, with 98.27% precision and 87.69% recall for detecting baseball games.
In Chapter 3, we define the same-day delivery with store fulfillment problem, then list the benefits, the implementation challenges, and our solutions to those challenges. The chapter focuses on solving the supply chain planning problem to provide an optimal seasonal plan for retailers, as a decision-making tool to select stores, prepare inventory, and equip trucks for last-mile same-day delivery. Our solution method achieves the best results in terms of runtime for three large-scale instances derived from real online customer orders. The optimization result provides a practical supply chain plan for retailers to set up the service of same-day delivery with store fulfillment.
In Chapter 4, we follow up on the idea of store fulfillment and build an optimization model for supply chain operation to fill the last-mile gap between the local store network and customers for specific online orders. We adopt a trending concept from the sharing economy to provide a crowdsourced shipping option drawing on average customers. Chapter 4 provides a solution to the problem remaining from Chapter 3. The proposed model achieves good performance and robustness in optimizing order fulfillment planning for different-sized customer demand instances, which are derived from real sales data of a national retailer. The proposed feedback control system also helps the model remain robust and scalable even if the initial prediction inputs are not accurate.
5.2 Future Research Directions
The work in this thesis is based on real-world problems that are solved with real-world data. One trait of our work is that the solutions are not only supported by theory but also implementable for real-world instances. This trait facilitates the practical use of our solutions and the testing of the proposed ideas in the future.
For the social media part, researchers in TransInfo at the University at Buffalo have follow-up works on various topics in the field of transportation, including traffic accident detection, travel behavior inference, and human mobility pattern analysis. Social media can contribute more in the fields of retailing and disaster recovery, since our work shows that crowdsourced content can signify public attention and willingness.
For the last-mile same-day delivery part, the category of products can be taken into consideration for both the supply chain planning and operation models, since groceries, clothing, and gadgets differ greatly in terms of supply chain operations. Another direction for future work is to explore the application of the models to cases at a very large scale. It is a great opportunity to investigate more solution methods and heuristics in order to efficiently solve the models with scalable inputs.
REFERENCES
[1] Vibhanshu Abhishek, Kinshuk Jerath, and Z. John Zhang. Agency selling or reselling? Channel structures in electronic retailing. Management Science, 62(8):2259–2280, 2016.
[2] U.S. Census Bureau. Annual retail trade survey 2015. https://www.census.gov/retail/index.html, 2015. Last accessed: April 6, 2017.
[3] U.S. Census Bureau. Quarterly retail e-commerce sales, 4th quarter 2016. https://www.census.gov/retail/index.html, 2017. Last accessed: April 6, 2017.
[4] Vikram Sehgal and S Mulpuru. Forrester research online retail forecast, 2013to 2018 (us). https://www.forrester.com/report/Forrester+Research+Online+Retail+Forecast+2013+To+2018+US/-/E-RES115941, 2011.
[5] Statista. Retail e-commerce sales in the united states from2015 to 2021. https://www.statista.com/statistics/272391/us-retail-e-commerce-sales-forecast/, 2017. Last accessed: April 6,2017.
[6] Ethan Lieber and Chad Syverson. Online versus offline competition. OxfordHandbook of the Digital Economy, pages 189–223, 2012.
[7] Ali Hortaçsu and Chad Syverson. The ongoing evolution of US retail: A format tug-of-war. The Journal of Economic Perspectives, 29(4):89–111, 2015.
[8] Mu-Chen Chen and Yu Wei. Exploring time variants for short-term passengerflow. Journal of Transport Geography, 19(4):488–498, July 2011.
[9] Samiul Hasan, Christian M. Schneider, Satish V. Ukkusuri, and Marta C. González. Spatiotemporal Patterns of Urban Human Mobility. Journal of Statistical Physics, 151(1-2):304–318, December 2012.
[10] Yu Wei and Mu-Chen Chen. Forecasting the short-term metro passenger flowwith empirical mode decomposition and neural networks. Transportation Re-search Part C: Emerging Technologies, 21(1):148–162, April 2012.
[11] Biao Leng, Jiabei Zeng, Zhang Xiong, Weifeng Lv, and Yueliang Wan. Proba-bility Tree Based Passenger Flow Prediction and Its Application to the BeijingSubway System. Front. Comput. Sci., 7(2):195–203, April 2013.
[12] Yujuan Sun, Guanghou Zhang, and Huanhuan Yin. Passenger Flow Prediction ofSubway Transfer Stations Based on Nonparametric Regression Model. DiscreteDynamics in Nature and Society, 2014:e397154, April 2014.
[13] Yuxing Sun, Biao Leng, and Wei Guan. A novel wavelet-SVM short-time pas-senger flow prediction in Beijing subway system. Neurocomputing, 166:109–121,October 2015.
[14] E.I. Vlahogianni, J.C. Golias, and M.G. Karlaftis. Short-term traffic forecasting:Overview of objectives and methods. Transport Reviews, 24(5):533–557, 2004.
[15] Billy Williams, Priya Durvasula, and Donald Brown. Urban Freeway Traffic FlowPrediction: Application of Seasonal Autoregressive Integrated Moving Averageand Exponential Smoothing Models. Transportation Research Record: Journalof the Transportation Research Board, 1644:132–141, January 1998.
[16] A.G. Hobeika and Chang Kyun Kim. Traffic-flow-prediction Systems Based onUpstream Traffic. In Vehicle Navigation and Information Systems Conference,1994. Proceedings., 1994, pages 345–350, August 1994.
[17] Mohamed S. Ahmed and Allen R. Cook. Analysis of Freeway Traffic Time-series Data by Using Box-Jenkins Techniques. Transportation Research Record,(722):1–9, 1979.
[18] Xiaoyan Zhang and John A. Rice. Short-term Travel Time Prediction. Trans-portation Research Part C: Emerging Technologies, 11(34):187–210, June 2003.
[19] Billy Williams. Multivariate Vehicular Traffic Flow Prediction: Evaluation ofARIMAX Modeling. Transportation Research Record: Journal of the Trans-portation Research Board, 1776:194–200, January 2001.
[20] Billy M. Williams and Lester A. Hoel. Modeling and Forecasting Vehicular TrafficFlow as a Seasonal ARIMA Process: Theoretical Basis and Empirical Results.Journal of Transportation Engineering, 129(6):664–672, 2003.
[21] Sangsoo Lee and Daniel Fambro. Application of Subset Autoregressive Inte-grated Moving Average Model for Short-Term Freeway Traffic Volume Forecast-ing. Transportation Research Record: Journal of the Transportation ResearchBoard, 1678:179–188, January 1999.
[22] Tsung-Hsien Tsai, Chi-Kang Lee, and Chien-Hung Wei. Neural Network BasedTemporal Feature Models for Short-term Railway Passenger Demand Forecast-ing. Expert Systems with Applications, 36(2, Part 2):3728–3736, March 2009.
[23] R. Yasdi. Prediction of Road Traffic using a Neural Network Approach. NeuralComputing & Applications, 8(2):135–142, May 1999.
[24] C.-H. Wu, J.-M. Ho, and D.T. Lee. Travel-time prediction with support vectorregression. IEEE Transactions on Intelligent Transportation Systems, 5(4):276–281, 2004.
[25] Fangce Guo, Rajesh Krishnan, and John Polak. A computationally efficient two-stage method for short-term traffic prediction on urban roads. TransportationPlanning and Technology, 36(1):62–75, February 2013.
[26] Weiwei Gong. ARMA-GRNN for passenger demand forecasting. In 2010 Sixth In-ternational Conference on Natural Computation (ICNC), volume 3, pages 1577–1581, August 2010.
[27] Xiushan Jiang, Lei Zhang, and Xiqun (Michael) Chen. Short-term forecasting ofhigh-speed rail demand: A hybrid approach combining ensemble empirical modedecomposition and gray support vector machine with real-world applications inChina. Transportation Research Part C: Emerging Technologies, 44:110–127,July 2014.
[28] Chun-Hui Zhang, Rui Song, and Yang Sun. Kalman Filter-Based Short-TermPassenger Flow Forecasting on Bus Stop. Journal of Transportation SystemsEngineering and Information Technology, 11(4):154, 2011.
[29] Min Gong, Xiang Fei, Zhi Wang, and Yun Qiu. Sequential Framework for Short-Term Passenger Flow Prediction at Bus Stop. Transportation Research Record:Journal of the Transportation Research Board, 2417:58–66, December 2014.
[30] F. Y. Wang. Scanning the Issue and Beyond: Real-Time Social Transportationwith Online Social Signals. IEEE Transactions on Intelligent TransportationSystems, 15(3):909–914, June 2014.
[31] F. Y. Wang. Scanning the Issue and Beyond: Transportation Games for So-cial Transportation. IEEE Transactions on Intelligent Transportation Systems,16(3):1061–1069, June 2015.
[32] X. Zheng, W. Chen, P. Wang, D. Shen, S. Chen, X. Wang, Q. Zhang, andL. Yang. Big Data for Social Transportation. IEEE Transactions on IntelligentTransportation Systems, 17(3):620–630, March 2016.
[33] E. Chaniotakis and C. Antoniou. Use of Geotagged Social Media in UrbanSettings: Empirical Evidence on Its Potential from Twitter. In 2015 IEEE 18thInternational Conference on Intelligent Transportation Systems (ITSC), pages214–219, September 2015.
[34] Francisco C. Pereira, Filipe Rodrigues, and Moshe Ben-Akiva. Using Data Fromthe Web to Predict Public Transport Arrivals Under Special Events Scenarios.Journal of Intelligent Transportation Systems, 19(3):273–288, July 2015.
[35] F. C. Pereira, F. Rodrigues, E. Polisciuc, and M. Ben-Akiva. Why so manypeople? Explaining Nonhabitual Transport Overcrowding With Internet Data.IEEE Transactions on Intelligent Transportation Systems, 16(3):1370–1379, June2015.
[36] F. Y. Wang, J. J. Zhang, X. Zheng, X. Wang, Y. Yuan, X. Dai, J. Zhang, andL. Yang. Where does AlphaGo go: from church-turing thesis to AlphaGo thesisand beyond. IEEE/CAA Journal of Automatica Sinica, 3(2):113–120, April 2016.
[37] Y. Lv, Y. Duan, W. Kang, Z. Li, and F. Y. Wang. Traffic Flow PredictionWith Big Data: A Deep Learning Approach. IEEE Transactions on IntelligentTransportation Systems, 16(2):865–873, April 2015.
[38] X. Wang, X. Zheng, Q. Zhang, T. Wang, and D. Shen. Crowdsourcing in ITS:The State of the Work and the Networking. IEEE Transactions on IntelligentTransportation Systems, 17(6):1596–1605, June 2016.
[39] N. Wanichayapong, W. Pruthipunyaskul, W. Pattara-Atikom, and P. Chaovalit.Social-based traffic information extraction and classification. In 2011 11th Inter-national Conference on ITS Telecommunications (ITST), pages 107–112, August2011.
[40] Axel Schulz, Petar Ristoski, and Heiko Paulheim. I See a Car Crash: Real-Time Detection of Small Scale Incidents in Microblogs. In Philipp Cimiano, Miriam Fernández, Vanessa Lopez, Stefan Schlobach, and Johanna Völker, editors, The Semantic Web: ESWC 2013 Satellite Events, number 7955 in Lecture Notes in Computer Science, pages 22–33. Springer Berlin Heidelberg, 2013.
[41] Freddy Lécué and Elizabeth Daly. Westland Row Why So Slow? Fusing Social Media and Linked Data Sources for Understanding Real-Time Traffic Conditions. 2013.
[42] Eric Mai and Rob Hranac. Twitter Interactions as a Data Source for Transporta-tion Incidents. 2013.
[43] Ayelet Gal-Tzur, Susan M. Grant-Muller, Tsvi Kuflik, Einat Minkov, Silvio No-cera, and Itay Shoor. The potential of social media in delivering transport policygoals. Transport Policy, 32:115–123, March 2014.
[44] Po-Ta Chen, Feng Chen, and Zhen Qian. Road Traffic Congestion Monitoringin Social Media with Hinge-Loss Markov Random Fields. In 2014 IEEE Inter-national Conference on Data Mining (ICDM), pages 80–89, December 2014.
[45] E. D’Andrea, P. Ducange, B. Lazzerini, and F. Marcelloni. Real-Time Detec-tion of Traffic From Twitter Stream Analysis. IEEE Transactions on IntelligentTransportation Systems, PP(99):1–15, 2015.
[46] Avinash Kumar, Miao Jiang, and Yi Fang. Where Not to Go?: Detecting RoadHazards Using Twitter. In Proceedings of the 37th International ACM SIGIRConference on Research & Development in Information Retrieval, SIGIR ’14,pages 1223–1226, New York, NY, USA, 2014. ACM.
[47] Zhenhua Zhang, Ming Ni, Qing He, Jing Gao, Jizhan Gou, and Xiaoling Li.An Exploratory Study on the Correlation between Twitter Concentration andTraffic Surge. Transportation Research Record: Journal of the TransportationResearch Board, 2553, 2016.
[48] Zhenhua Zhang and Qing He. On-site Traffic Accident Detection with BothSocial Media and Traffic Data. 9th Triennial Symposium on TransportationAnalysis (TRISTAN IX), 2016.
[49] Zhenhua Zhang and Qing He. Exploring Travel Behavior with Social Media: AnEmpirical Study of Abnormal Movements Using High Resolution Tweet Trajec-tory Data. submitted to Transportation Research Part C: Emerging Technologies,2016.
[50] Jingrui He, Wei Shen, Phani Divakaruni, Laura Wynter, and Rick Lawrence. Im-proving Traffic Prediction with Tweet Semantics. In Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence, IJCAI ’13, pages1387–1393, Beijing, China, 2013. AAAI Press.
[51] Ming Ni, Qing He, and Jing Gao. Using social media to predict traffic flow underspecial event conditions. In The 93rd Annual Meeting of Transportation ResearchBoard, 2014.
[52] Lei Lin, Ming Ni, Qing He, Jing Gao, and Adel W. Sadek. Modeling the Impactsof Inclement Weather on Freeway Traffic Speed. Transportation Research Record:Journal of the Transportation Research Board, 2482:82–89, 2015.
[53] Craig Collins, Samiul Hasan, and Satish Ukkusuri. A Novel Transit Rider Sat-isfaction Metric: Rider Sentiments Measured from Online Social Media Data.Journal of Public Transportation, 16(2), June 2013.
[54] Metropolitan Transportation Authority Data Feed.
[55] Twitter Streaming APIs.
[56] Daniel Ramage, Susan Dumais, and Dan Liebling. Characterizing Microblogs with Topic Models. Proceedings of the Fourth International AAAI Conference on Weblogs and Social Media, 2010.
[57] Wouter Weerkamp and Maarten de Rijke. Credibility-inspired Ranking for Blog Post Retrieval. Information Retrieval, 15(3-4):243–277, June 2012.
[58] Mário Cordeiro. Twitter Event Detection: Combining Wavelet Analysis and Topic Inference Summarization. In Doctoral Symposium on Informatics Engineering, DSIE, volume 8, pages 11–16, 2012.
[59] David M. Blei, Andrew Y. Ng, and Michael I. Jordan. Latent Dirichlet Alloca-tion. J. Mach. Learn. Res., 3:993–1022, March 2003.
[60] P. Giridhar, M.T. Amin, T. Abdelzaher, L.M. Kaplan, J. George, and R. Ganti.ClariSense: Clarifying sensor anomalies using social network feeds. In 2014IEEE International Conference on Pervasive Computing and CommunicationsWorkshops (PERCOM Workshops), pages 395–400, March 2014.
[61] Man-Chun Tan, S.C. Wong, Jian-Min Xu, Zhan-Rong Guan, and Peng Zhang.An Aggregation Approach to Short-Term Traffic Flow Prediction. IEEE Trans-actions on Intelligent Transportation Systems, 10(1):60–69, March 2009.
[62] Andreas Klose and Andreas Drexl. Facility location models for distributionsystem design. European Journal of Operational Research, 162(1):4–29, April2005.
[63] C. S. ReVelle, H. A. Eiselt, and M. S. Daskin. A bibliography for some fun-damental problem categories in discrete location science. European Journal ofOperational Research, 184(3):817–848, February 2008.
[64] M. T. Melo, S. Nickel, and F. Saldanha-da-Gama. Facility location and supply chain management – a review. European Journal of Operational Research, 196(2):401–412, July 2009.
[65] Abraham Warszawski. Multi-Dimensional Location Problems. Journal of the Operational Research Society, 24(2):165–179, June 1973.
[66] A. M. Geoffrion and G. W. Graves. Multicommodity Distribution System Design by Benders Decomposition. Management Science, 20(5):822–844, January 1974.
[67] Alan W. Neebe and Basheer M. Khumawala. An Improved Algorithm for the Multi-Commodity Location Problem. The Journal of the Operational Research Society, 32(2):143–149, 1981.
[68] Cynthia Barnhart and Yosef Sheffi. A Network-Based Primal-Dual Heuristic for the Solution of Multicommodity Network Flow Problems. Transportation Science, 27(2):102–117, May 1993.
[69] Teodor Gabriel Crainic and Louis Delorme. Dual-Ascent Procedures for Multicommodity Location-Allocation Problems with Balancing Requirements. Transportation Science, 27(2):90–101, May 1993.
[70] A. K. Aggarwal, M. Oblak, and R. R. Vemuganti. A heuristic solution procedure for multicommodity integer flows. Computers & Operations Research, 22(10):1075–1087, December 1995.
[71] Choong Y. Lee. The Multiproduct Warehouse Location Problem: Applying a Decomposition Algorithm. International Journal of Physical Distribution & Logistics Management, 23(6):3–13, June 1993.
[72] Choong Y. Lee. A cross decomposition algorithm for a multiproduct-multitype facility location problem. Computers & Operations Research, 20(5):527–540, June 1993.
[73] Hasan Pirkul and Vaidyanathan Jayaraman. A multi-commodity, multi-plant, capacitated facility location problem: formulation and efficient heuristic solution. Computers & Operations Research, 25(10):869–878, October 1998.
[74] Nabila Azi, Michel Gendreau, and Jean-Yves Potvin. A dynamic vehicle routing problem with multiple delivery routes. Annals of Operations Research, 199(1):103–112, October 2011.
[75] Mathias Klapp and Alan Erera. The One-dimensional Dynamic Dispatch Waves Problem. 2015.
[76] Stacy A. Voccia, Ann M. Campbell, and Barrett W. Thomas. The Same-Day Delivery Problem for Online Purchases. ResearchGate, October 2015.
[77] J. F. Benders. Partitioning procedures for solving mixed-variables programming problems. Numerische Mathematik, 4(1):238–252, December 1962.
[78] Zuo-Jun Max Shen and Lian Qi. Incorporating inventory and routing costs in strategic location models. European Journal of Operational Research, 179(2):372–389, June 2007.
[79] Niels Agatz, Ann Campbell, Moritz Fleischmann, and Martin Savelsbergh. Time Slot Management in Attended Home Delivery. Transportation Science, 45(3):435–449, December 2010.
[80] Yu-Chung Tsao, Divya Mangotra, Jye-Chyi Lu, and Ming Dong. A continuous approximation approach for the integrated facility-inventory allocation problem. European Journal of Operational Research, 222(2):216–228, October 2012.
[81] Carlos F. Daganzo. The Distance Traveled to Visit N Points with a Maximum of C Stops per Vehicle: An Analytic Model and an Application. Transportation Science, 18(4):331–350, November 1984.
[82] Kyle D. Cattani, Gilvan C. Souza, and Shengqi Ye. Shelf Loathing: Cross Docking at an Online Retailer. Production and Operations Management, 23(5):893–906, May 2014.
[83] Ming Ni, Qing He, Jose Walteros, Xian Liu, and Arun Hampapur. Same Day Delivery Planning with Store Fulfillment. Submitted to Transportation Science.
[84] Alp Arslan, Niels Agatz, Leo G. Kroon, and Rob A. Zuidwijk. Crowdsourced Delivery: A Dynamic Pickup and Delivery Problem with Ad-Hoc Drivers. SSRN Scholarly Paper ID 2726731, Social Science Research Network, Rochester, NY, September 2016.
[85] Aashwinikumar Devari, Alexander G. Nikolaev, and Qing He. Crowdsourcing the last mile delivery of online orders by exploiting the social networks of retail store customers. Transportation Research Part E: Logistics and Transportation Review, 105(Supplement C):105–122, September 2017.
[86] Stephen Mahar and P. Daniel Wright. The value of postponing online fulfillment decisions in multi-channel retail/e-tail organizations. Computers & Operations Research, 36(11):3061–3072, November 2009.
[87] Ping Josephine Xu, Russell Allgor, and Stephen C. Graves. Benefits of Reevaluating Real-Time Order Fulfillment Decisions. Manufacturing & Service Operations Management, 11(2):340–355, December 2008.
[88] Jason Acimovic and Stephen C. Graves. Making Better Fulfillment Decisions on the Fly in an Online Retail Environment. Manufacturing & Service Operations Management, 17(1):34–51, December 2014.
[89] Valentina Carbone, Aurélien Rouquet, and Christine Roussat. Carried away by the crowd: what types of logistics characterise collaborative consumption. Pages 1–21, June 2015.
[90] Martin Savelsbergh and Tom Van Woensel. 50th Anniversary Invited Article – City Logistics: Challenges and Opportunities. Transportation Science, March 2016.
[91] Claudia Archetti, Martin Savelsbergh, and M. Grazia Speranza. The Vehicle Routing Problem with Occasional Drivers. European Journal of Operational Research, 254(2):472–480, October 2016.
[92] Yuan Wang, Dongxiang Zhang, Qing Liu, Fumin Shen, and Loo Hay Lee. Towards enhancing the last-mile delivery: An effective crowd-tasking model with scalable solutions. Transportation Research Part E: Logistics and Transportation Review, 93(Supplement C):279–293, September 2016.
[93] Nabin Kafle, Bo Zou, and Jane Lin. Design and modeling of a crowdsource-enabled system for urban parcel relay and delivery. Transportation Research Part B: Methodological, 99(Supplement C):62–82, May 2017.
[94] Harri Paloheimo, Michael Lettenmeier, and Heikki Waris. Transport reduction by crowdsourced deliveries – a library case in Finland. Journal of Cleaner Production, 132(Supplement C):240–251, September 2016.
[95] Aymeric Punel and Amanda Stathopoulos. Exploratory Analysis of Crowdsourced Delivery Service Through a Stated Preference Experiment. 2017.
[96] Mathias A. Klapp, Alan L. Erera, and Alejandro Toriello. The dynamic dispatch waves problem for same-day delivery. Under review, 2016.
[97] FedEx SameDay. http://www.fedex.com/us/fedex/shippingservices/package/sameday.html, 2017. [Online; accessed 30-November-2017].
[98] Kris Johnson Ferreira, Bin Hong Alex Lee, and David Simchi-Levi. Analytics for an Online Retailer: Demand Forecasting and Price Optimization. Manufacturing & Service Operations Management, 18(1):69–88, November 2015.