A Quality Value Chain Network: Linking Supply Chain Quality to Customer Lifetime Value*
Qiuping Yu, Kelley School of Business, Indiana University, [email protected]
Masha Shunko, Foster School of Business, University of Washington, [email protected]
Shawn Mankad, Johnson Graduate School of Management, Cornell University, [email protected]
We develop the concept of a quality value chain network to analyze the impact of supply chain quality (SCQ) on
customer lifetime value (CLV). We apply our framework to a rich dataset from a major restaurant
chain, using text analysis of complaints to measure SCQ, a two-stage least squares (2SLS) model with
instruments to assess the impact of SCQ on customer experience, and a structural model of consumer
purchasing behavior to link customer experience to CLV. We consider the impact on CLV of customer
experience not only at the focal store but also at adjacent stores. Accounting for this network effect
significantly improves the model’s performance in predicting customer behavior and in quantifying
financial returns. We identify the profile of the most valuable customers and provide insights on which
SCQ issues the supply chain should focus on and which restaurants should be prioritized for supply chain
improvements.
1. Introduction
According to the Forrester report on customer life-cycle marketing, 72% of surveyed companies
list customer experience improvement as their number one priority (Forrester 2016). Customer
experience improvements, ranging from improving general product and service delivery to providing
customized campaigns and services for individual customers, can lead to increased customer loyalty
and higher revenues. According to a McKinsey & Company study (Shital Chheda and Roggenhofer
2017), a successful improvement has to focus not only on the front-end touch-points but also on
back-end operations and processes to be sustainable. One of the earlier conceptual frameworks in
the literature that links internal operational processes to customer satisfaction, loyalty and thus
profitability is the service profit chain framework proposed by Heskett and Schlesinger (1997). This
framework focuses on a firm’s internal service attributes such as employee quality and satisfaction.
Anderson and Mittal (2000) extend this chain framework to include both product and service
attributes, which they refer to as a satisfaction profit chain.
The idea of a profit chain was originally developed within the boundaries of one firm. In practice,
firms rarely operate in isolation, and many firms rely on their supply chain to provide products
* This paper is supported by the National Science Foundation under Grant No. 1633158
or services. In many cases, firms may carry a brand name (e.g., Starbucks, McDonald’s) such that
customer experience provided by one firm may affect customer perception of the quality of another
firm that carries the same brand name (see Figure 1). We enrich the classic profit chain concept
through two dimensions: (1) We extend the service profit chain upstream to include the supply
chain attributes that may have a direct impact on the product/service attributes that, in turn, have
their impact on customer experience; and (2) We include customer experience at all firms that carry
the same brand name to address potential competition and/or reputation spillover effects. Several
profit chain studies have included multiple firms or outlets in the profit chain framework (Bowman
and Narayandas 2004, Loveman 1998). To the best of our knowledge, however, our framework is
broader than the previous works on the profit chain and appears to be the first that considers all of
the stages in the supply chain and includes manufacturers, distributors, and a network of retailers
and their individual end consumers. Our study is also more granular than the previous value chain
studies and appears to be the first value chain study that differentiates the impact of customer
experience on their purchasing behavior at the individual customer level. Moreover, unlike most
of the value chain studies, we consider not only customers’ current purchasing behavior but also
their future purchasing behavior by measuring customer lifetime value (CLV). From the application
perspective, our paper appears to be the first comprehensive value chain study using field data
from a major fast food restaurant chain. It generates numerous unique managerial insights as we
elaborate below.
The extended profit chain framework, which we refer to as the quality value chain network,
can be used to create value from the individual firm’s perspective, the brand’s perspective, and
the supply chain orchestrator’s perspective. Given that supply chain operations are not directly
observable to customers, we propose a two-phase methodology to link the supply chain quality to
the CLV (see Figure 1). In the first phase, we identify how the supply chain can create value for
customers by improving supply chain quality, which can lead to an improved customer experience.
In the second phase, we identify how improved customer experience at all firms within the brand
creates future value for all supply chain players by increasing CLV. This methodology provides
numerous managerial insights. For the firms, it identifies customers who have the highest response
to customer experience improvement, allowing the firms to target their campaigns. For the supply
chain, it identifies which supply chain quality issues have the biggest impact on customer experience
and, thus, on the CLV, allowing supply chain management to focus on the most critical issues.
Meanwhile, we identify firms that have the highest response to customer experience improvements,
allowing the supply chain to create tailored supply chain management strategies that lead to
profit maximization via prioritizing the right firms with the right customers and the right network
features. A recent McKinsey & Company study (Maynes and Rawson 2016) notes that many
Figure 1: The supply chain network under consideration consists of multiple firms that carry the same brand name such that operations of one firm may affect the customer behavior at another firm through brand reputation.
customer experience initiatives fail due to companies’ inability to link particular initiatives to
generated value and to a lack of focus on the initiatives that bring the highest return. Our framework
and two-phase methodology, as described above, help to address these common practical issues in
the industry.
We apply our methodology to a large dataset from the fast-food industry, namely, a national
restaurant chain that operates multiple franchises in different locations that carry the same brand
name. In the first phase, our focus is on how the supply chain can create a valuable customer
experience. Customer experience is affected by multiple dimensions; in our application, an important
determinant of customer experience is the quality of served food. Kim et al. (2009) studied the
impact of various product and service components on customer satisfaction in dining facilities at
public universities using web surveys of students and found that food quality was the most important
predictor of customer satisfaction and of intent to return. Hwang and Zhao (2010) made
similar observations and identified food quality and taste as among the main drivers of customer
satisfaction.
Because food quality depends largely on the management of supply chain quality (e.g., quality
and timeliness of delivered ingredients), we analyze the impact of different dimensions of supply
chain quality on customer experience to inform the firm as to which aspects of supply chain quality
are critical to improving customer experience. In our data sample, all restaurants are serviced by
the same distributor and the same set of suppliers; hence, the quality of the distributor (e.g., the
distributor’s overall fill rate) is fixed at the distributor level. The quality of the distributor-restaurant
link, however, may differ across restaurants, as it depends on many factors (e.g., the distributor’s
prioritization rules, the relationship between the distributor and the restaurant, the distance between
the distributor and the restaurant, and the experience of the restaurant manager). Thus, we consider the
supply chain quality from each restaurant’s perspective.
We conduct a text analysis of the complaints filed by restaurants against their distributors and
suppliers to extract information about supply chain quality. Through manual examination of a
training set of complaints, we identify three main supply issues — Freshness, Packaging, and
Delivery — and their associated keywords. We then use the Latent Dirichlet Allocation topic
model to refine the keywords and to categorize our complaints according to these three issues. We
then evaluate how each of these issues affects customers’ experience in the downstream restaurants,
using a linear regression model that accounts for the potential endogeneity issues. Our results show
that issues related to Freshness have the largest impact on customer experience among all supply
chain issues. In particular, we find that one fewer Freshness-related complaint per month can
improve the customer experience score by about 0.44 (on a scale from 0 to 10). In our application,
most complaints related to Freshness are associated with over-ripeness of products and come from
one supplier, which suggests an immediate managerial action: improve handling and reduce lead time
on products from that supplier.
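To illustrate the kind of endogeneity correction mentioned above, the following is a minimal two-stage least squares sketch. It runs on synthetic data only: the single instrument, the variable names, the data-generating process, and the effect size of -0.5 are all assumptions made for illustration, and none of them reflects the paper's instruments, specification, or estimates.

```python
import random

def mean(values):
    return sum(values) / len(values)

def ols_intercept_slope(x, y):
    """One-regressor OLS of y on x; returns (intercept, slope)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope

def two_stage_least_squares(y, x, z):
    """2SLS with one endogenous regressor x and one instrument z."""
    # Stage 1: project the endogenous regressor on the instrument.
    a1, b1 = ols_intercept_slope(z, x)
    x_hat = [a1 + b1 * zi for zi in z]
    # Stage 2: regress the outcome on the stage-1 fitted values.
    _, beta = ols_intercept_slope(x_hat, y)
    return beta

# Synthetic data; every number below is hypothetical.
random.seed(0)
n = 5000
true_effect = -0.5  # arbitrary illustrative effect of complaints on experience
z = [random.gauss(0, 1) for _ in range(n)]  # instrument
u = [random.gauss(0, 1) for _ in range(n)]  # unobserved confounder
complaints = [0.8 * zi + ui + random.gauss(0, 0.5) for zi, ui in zip(z, u)]
experience = [true_effect * c + 2.0 * ui + random.gauss(0, 0.5)
              for c, ui in zip(complaints, u)]

_, beta_ols = ols_intercept_slope(complaints, experience)
beta_2sls = two_stage_least_squares(experience, complaints, z)
print(f"OLS slope:  {beta_ols:.2f}")
print(f"2SLS slope: {beta_2sls:.2f}")
```

In this synthetic setup the confounder biases the naive OLS slope, while the instrumented estimate should land near the assumed effect. Standard errors need a 2SLS-specific formula and are omitted here.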
In the second phase, we explore how customer experience with a network of restaurants affects
CLV at a focal restaurant. Note that customers’ decision to stay engaged with a restaurant that they
visited may depend not only on their experience at this focal restaurant but also on the reputation
of the brand and/or customer experiences at other locations that carry the same brand name. For
example, customers may be less likely to return to the visited restaurant if the overall
customer experience at adjacent outlets of the same brand is better, because they may
prefer to go to a better location. We refer to this phenomenon as the competition effect. In contrast,
given that all of the restaurants share the same brand name, good overall customer experience of
adjacent restaurants may improve the reputation of the brand and customers will be more likely to
keep coming to the visited focal restaurant. We refer to this effect as the reputation spillover effect.
We combine the two effects into the network effect. If better overall customer experience at adjacent
restaurants reduces customers’ churn rate at the focal restaurant, the network effect is positive,
and the reputation spillover effect dominates. Otherwise, the network effect is negative, and the
competition effect dominates. Customers’ perception of the quality of the neighboring restaurants
may be acquired through one or both of the following channels: word-of-mouth through friends and
family, or personal experience (e.g., personal visits to the adjacent stores). We build our model in
a way that allows us to separate the network effects by channel.
In our setting, restaurants have a non-contractual relationship with their customers. Thus,
restaurants cannot directly observe whether customers have churned, i.e., stopped engaging with
the restaurants, nor can restaurants observe the latent arrival process of the churned customers,
had they not churned. To uncover the latent information about customers’ churn rate and arrival
process and, thus, to measure CLV, we propose a latent attrition model that is a variation of the
BG/NBD model (Fader et al. 2005a). Our latent attrition model extends the BG/NBD model
to account for the impact of customer experience reported at all of the restaurants in the network
on customers’ purchasing decisions. We show that both customer experience at the focal
store and the perceived neighbor quality have a significant impact on customers’ purchasing
decisions. Incorporating the perceived neighbor quality significantly improves the model’s
performance, in both goodness of fit and forecasting accuracy. We show that better overall
perceived neighbor quality reduces customers’ churn rate at the focal store regardless of whether
the customers have visited the adjacent stores, which implies that the network effect is positive
and that the reputation spillover effect dominates the competition effect. We also show that the
reputation spillover effect is even stronger relative to the competition effect for customers who have
visited the neighbor stores than for those who have not.
Through counterfactual studies, we demonstrate how each individual restaurant and the supply
chain orchestrator can use our two-phase methodology and results to prioritize investment in
customer experience and supply chain improvements to maximize their return on investment (ROI).
Note that, for each individual customer, our latent attrition model in the second phase allows us
to quantify the exact incremental number of future transactions that the customer can generate at
the focal store if her experience at the focal store improves. We show that the incremental number
of future transactions is the highest among the customers whose last visit to the restaurant was
neither too long ago nor too recent and whose number of total transactions with the firm is neither
too high nor too low. Hence, firms should focus their customer experience improvement initiatives
(e.g., targeted campaigns) on the customers who fit the profile above. From the supply chain
orchestrator’s perspective, we note that improving the supply chain quality for a focal restaurant
not only generates additional transactions at the focal restaurant but may also lead
to more sales at the adjacent stores due to the network effect. The supply chain orchestrator thus
should prioritize the supply chain improvement for the store that can generate the largest total
incremental number of transactions in the entire network of restaurants. In particular, we show that
the supply chain orchestrator should prioritize the restaurants that have high quality neighbors and
a high percentage of customers with the profile identified above. It is worth noting that a recent
Forbes report suggests personalization of service as one of the top ten trends in improving customer
experience in 2017 (Hyken 2016). The results of our study help companies to design personalized
marketing or operational campaigns in both business-to-business (B2B) and business-to-consumer
(B2C) settings, as we describe above.
2. Literature Review
We study the impact of supply chain quality on customer experience and consequently, on the
CLV accounting for the network effect among multiple firms. To this end, our research is related to
several streams of literature: quality and profitability, customer lifetime value, quality competition,
and supply chain quality.
Quality and Profitability. Heskett and Schlesinger (1997) developed the service profit chain
framework that established a chained relationship between service quality, customer satisfaction,
customer loyalty, and eventually, financial growth and profitability. The satisfaction profit chain
framework extends the service profit chain by including performance attributes beyond service
attributes (Anderson and Mittal 2000). Empirical studies that support each of these links are seen
in the literature. We refer interested readers to Zeithaml (2000) and, more recently, to Kumar et al.
(2013) for an extensive review. A number of recent empirical works in the operations management
literature also have examined the impact of quality on customer behavior or sales (Aksin et al.
2013, Batt and Terwiesch 2015, Cachon et al. 2013, Kesavan et al. 2014, Lu et al. 2013, Musalem
et al. 2016, Tan and Netessine 2014, Yu et al. 2017), and the impact of loyalty on growth in the app
economy (Mendelson and Moon 2017). For the link between customer satisfaction and behavior,
although there are several studies that report a significant positive relationship (Bolton 1998), some
studies show that satisfaction has no significant impact on customer loyalty (Verhoef 2003). Based
on the reality of marketplace practices, the link between satisfaction and profitability also seems to
be weak (Kumar et al. 2013). Most of the research in this stream, however, does not differentiate
the impact of customer satisfaction across different customers, which could lead to large variance
and a seemingly weak overall relationship between customer satisfaction and profitability. Kumar
et al. (2013) explicitly call for research that systematically explores how customer satisfaction or,
more broadly, experience affects profitability differently for different customers, which will allow
firms to focus on customers who have the highest response to experience and, thus, improve the
return on investment.
Our research fills this gap by investigating how customer experience affects CLV differently
for customers with different transaction patterns. Not only does our approach appear to be
much more granular in capturing the heterogeneous impact of customer experience across different
customers compared to the previous works in profit chains, but our framework also is broader
and appears to be the first to consider all of the stages in the supply chain, including suppliers,
distributors, and a network of retailers and their end consumers. From the application perspective,
our work appears to be the first comprehensive value chain network study that uses field data from
a major fast-food restaurant chain. It generates numerous unique managerial insights.
Customer Lifetime Value. The CLV metric is a forward-looking metric that takes into account
the dynamic nature of customer behavior and allows firms to develop personalized strategies for
customers with different transaction patterns to maximize return on investment. It is generally
acknowledged that CLV is preferable to other metrics (such as past profitability or customer churn
rate) for measuring a customer’s value to the firm (Kumar and Reinartz 2016). Multiple
approaches for calculating CLV have been proposed. We focus on those for non-contractual
settings, as in our application, in which the firm does not directly observe when
customers churn. One of the earliest models, the Pareto/NBD model of Schmittlein et al.
(1987), serves as a building block for many follow-up models. To simplify the model estimation,
Fader et al. (2005a) proposed the BG/NBD model, which allows customers to become inactive only
immediately after a purchasing transaction rather than at any point in time, as in
Schmittlein et al. (1987). This simplification significantly reduces the computational time required for
estimation. One limitation of these frequently used latent attrition models, including the
Pareto/NBD and BG/NBD models, is that they do not account for the impact of attributes, such as
quality or customer experience, on CLV.
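As a rough illustration of the BG/NBD mechanics described above, the sketch below evaluates the individual-level likelihood of a purchase history, conditional on a customer's purchase rate and dropout probability, and the implied probability that the customer is still active. The full model also places gamma and beta heterogeneity distributions over these two parameters, which is omitted here. The `dropout_prob` function and its coefficients are purely hypothetical: they show one way a covariate such as customer experience could enter and are not the specification used in this paper.

```python
import math

def bgnbd_likelihood(lam, p, x, t_x, T):
    """BG/NBD likelihood of one customer's history, conditional on (lam, p).

    lam: purchase rate while active; p: dropout probability after each purchase;
    x: number of repeat purchases; t_x: time of last purchase; T: end of window.
    """
    # Still active at T: survived x dropout draws, no purchase in (t_x, T].
    active = (1 - p) ** x * lam ** x * math.exp(-lam * T)
    # Dropped out immediately after the x-th purchase (impossible when x == 0).
    dropped = (p * (1 - p) ** (x - 1) * lam ** x * math.exp(-lam * t_x)
               if x > 0 else 0.0)
    return active + dropped

def prob_active(lam, p, x, t_x, T):
    """Probability the customer is still active at T, given the history."""
    active = (1 - p) ** x * lam ** x * math.exp(-lam * T)
    return active / bgnbd_likelihood(lam, p, x, t_x, T)

def dropout_prob(focal_exp, neighbor_exp, b0=1.0, b_focal=-0.3, b_neighbor=-0.1):
    """HYPOTHETICAL logit link from experience scores to dropout probability.

    The functional form and coefficients are illustrative assumptions only.
    """
    index = b0 + b_focal * focal_exp + b_neighbor * neighbor_exp
    return 1.0 / (1.0 + math.exp(-index))

# A long silence after the last purchase lowers the probability of being active.
print(prob_active(0.1, 0.2, 3, 10.0, 12.0))
print(prob_active(0.1, 0.2, 3, 10.0, 50.0))
```

Because dropout can occur only right after a purchase, a customer with no repeat purchases is treated as active with probability one in this conditional form.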
Ho et al. (2006) extend the above latent attrition models by proposing an analytical model
that accounts for the impact of customer satisfaction on CLV in non-contractual settings. Aflaki
and Popescu (2014) also explore the impact of service quality on CLV analytically. The authors,
however, focus on a contractual setting and incorporate observations from behavioral theories in
their model. Afeche et al. (2015) extend the works above by considering the capacity constraint
whereby the service quality is a function of the allocated capacity. While all of the works above
are analytical, Braun et al. (2015) extend the BG/NBD model in an empirical study in which
they explore how service quality affects CLV, using data from an online marketplace for freelance
writing services. The service quality in their study is measured by a panel of reviewers hired by
the firm. In our application, however, we focus on customer experience as reported by customers
themselves via an online consumer survey. According to Parasuraman et al. (1985), there is a
significant gap between self-reported customer experience and quality metrics estimated internally by
the firm. In the customer relationship management literature, service quality is generally
considered an antecedent of customer satisfaction or experience (Cronin Jr and Taylor 1992). The link
between customer experience and purchase intention is often much stronger than that between
service quality and purchase intention. This may explain why including service quality does not
significantly improve the goodness of fit or the forecasting accuracy of the model in Braun et al.
(2015) relative to their base model without service quality, whereas including customer experience
significantly improves model performance in our case. Knox and van Oest (2014) study
the impact of customer complaints and firm recoveries on customer future purchasing behavior
using data from an Internet and catalog retailer. The analysis in both Braun et al. (2015) and Knox
and van Oest (2014) is limited to one firm. We take one step further by exploring how customer
experience in a network of firms affects CLV at the focal store, for which we explicitly account
for the potential competition and reputation spillover effect across different firms in the network.
Our results show that it is important to include the network effect in the customer purchasing
model, as it significantly improves the model performance in terms of both its goodness of fit and
its forecasting accuracy. We show that the reputation spillover effect dominates the competition
effect among the firms in the same neighborhood that share the same brand name. By accounting
for such network effect, we can quantify the total ROI from improving the supply chain quality
of a focal firm in a more holistic manner. In particular, we can quantify not only the associated
incremental value generated at the focal firm, but also the significant amount of incremental value
generated at the adjacent firms through the network effect. Without accounting for the network
effect, the supply chain orchestrator would underestimate the ROI from improving the supply
chain quality of a focal firm. In this respect, our work is also related to the literature on quality
competition.
Quality Competition. There has been a stream of analytical papers (Aksoy-Pierson et al. 2013,
Allon and Federgruen 2009, Cachon and Harker 2002, Cohen and Whang 1997, Gans 2002) that
explore the impact of quality competition on customer behavior and equilibrium outcomes. More
recently, there is an emerging body of empirical research on quality competition. For example, Allon
et al. (2011) empirically study how waiting time performance affects different firms’ market shares
and price decisions using firm-level data from the hamburger drive-through fast-food industry.
The authors show that, in the fast-food industry, customers trade off price and waiting time and
attribute a very high cost to the time that they spend waiting. Moving beyond waiting time as
a proxy for service quality, Guajardo et al. (2015) empirically study how service and product
attributes affect customer demand in the presence of competition using product-level monthly sales
data from the US automobile industry. One of their main findings is that service attributes, which
jointly determine service quality, become more important in determining consumer demand when
products exhibit lower quality. Buell et al. (2016) empirically examine the relationship between
the level of service quality competition and customer defection rates, using customer-level data
from the retail banking sector over a five-year period. They find that the level of service quality
competition affects customer defection from an incumbent firm to a competitor, and that the
impact is moderated by whether the incumbent firm has a high or low service quality position in
the market. Note that none of the works above considers the impact of quality on customers’
future behavior or accounts for the potentially different impact of quality across customers
at the individual level. We fill this gap and contribute to this stream of literature by
looking at a network of firms that can jointly affect CLV. Our approach not only allows each firm
to identify the customers with the highest future incremental value from their improved experience
in a B2C setting but also allows the supply chain orchestrator to identify the firms that generate
the highest return on investment from their supply chain quality improvement in a B2B setting.
Supply Chain Quality. There are multiple approaches and frameworks for assessing the
performance of supply chains. Kaynak and Hartley (2008) summarize the literature on quality
measurement and frameworks within the supply chain context. In our study, we focus on the attributes
that have a direct impact on the availability and taste of the menu items at the restaurant:
freshness of delivered products, timeliness of delivery, and packaging. Coyle et al. (2008) list on-time
delivery and quality of delivered goods as the top two supply chain performance attributes used
in real industry practice. Our attributes also are among the top attributes identified for supplier
quality in Bowman and Narayandas (2004) and Lehmann and O’Shaughnessy (1974). It is worth
noting that Bowman and Narayandas (2004) adapt the service profit chain framework to the B2B
market. The authors explore how vendor effort affects vendor quality and, thus, business buyers’
satisfaction, and eventually, vendor’s profitability from its business buyers.
We enrich this framework by extending it to include individual end consumers. Moreover, in
our analysis, we systematically identify the causal relationship between supply chain quality and
customer experience, using novel instrumental variables, while the links in Bowman and Narayandas
(2004) are demonstrated mostly through correlations. Beyond the profit chain framework, there
is a substantial body of analytical research that explores the impact of supplier quality measured
by various metrics (e.g., inventory availability, delivery time, and product quality) on customer
behavior or aggregated demand (Cachon and Lariviere 2001, Cohen and Whang 1997, Gans 2002,
Olsen and Parker 2008). There also are recent empirical studies that explore the impact of supplier
quality on a firm’s behavior in B2B settings with a focus on inventory availability, e.g., Craig et al.
(2016).
3. Empirical Setting and Data Description
We describe our framework while applying it to a nationwide restaurant chain in the United States
that operates multiple franchise outlets that carry the same brand name. Each outlet operates as
an independent firm. The dataset, acquired through the Wharton Customer Analytics Initiative
(WCAI), captures customer transaction data, customer responses to the satisfaction survey, and
supply chain data.
3.1. Customer Transaction and Survey Data
Recall that the main objective of our study is to explore the impact of supply chain quality on
customer experience and thus eventually on CLV. To eliminate the potential effects of confounding
variables and to control for factors that are external to supply chain quality or customer experience,
but may affect customers’ purchasing behavior, we focus on four ZIP-code areas that have the
highest number of restaurants that carry the brand name and that have similar demographic profiles.
This sample selection controls for, among other variables, market heterogeneity, which
could confound our analysis due to different levels of brand presence, and customer heterogeneity
due to different demographics. All four ZIP-code areas are in the same city and together include
36 restaurants that carry the brand name of interest. To better measure customer
experience in the restaurants, we select the 28 restaurants that have at least 50 completed customer
surveys (see footnote 1). These restaurants received, on average, 265 survey responses (full summary
statistics are provided in Table 2).
For customers who pay with a credit card or have a loyalty account, we observe their
transactions over time through a unique customer ID, and we limit our attention to such customers.
We sampled 3,000 such customers in total from the selected restaurants. For each
selected customer, we then extract all of their transactions at any restaurant that carries the same
brand name, even if it is not among the 28 selected restaurants. For each transaction, we observe
the corresponding transaction ID, customer ID, restaurant ID, transaction time, and transaction
amount. Among the selected restaurants, on average, customers make 2.2 visits to one restaurant
during the observation window of July 1, 2013 to July 2, 2015 with the number of visits ranging
from 1 to 75.
All customers are invited to complete an online customer satisfaction survey following each
purchasing transaction in return for a free treat. The survey consists of questions regarding
customer experience at the restaurant, including overall experience/satisfaction and
more detailed items such as satisfaction with food quality, speed of service, and cleanliness.
For each question, customers rate their experience from 0 (not at all satisfied) to 10
(very satisfied). For each completed survey, in addition to the customer’s responses to the questions
above, we also observe the time when the survey was completed and the ID of the restaurant with
which the survey is affiliated.
In this study, we use the answers to the question “How likely are you to recommend this
restaurant?” to determine customer experience at restaurants. This question is considered to be of great
importance in determining the overall perceived quality in the service and retail industries, and is
1 We explore the robustness of our main insights to this sample selection procedure in Section 6.
used in practice to determine the Net Promoter Score (NPS). Notably, NPS is known to be a
good predictor of revenue (Reichheld 2003). In addition, we analyze the answers to the questions
“Please rate the following: Cleanliness” and “Please rate the following: Comfort and Atmosphere”
for controls. Summary statistics of the survey data are provided in Table 2.
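For readers unfamiliar with the Net Promoter Score, the sketch below shows the standard industry aggregation of 0 to 10 "likelihood to recommend" ratings: the percentage of promoters (ratings of 9 or 10) minus the percentage of detractors (ratings of 0 to 6). The function name and the sample ratings are made up for illustration, and this aggregation is not necessarily how the survey answers enter the analysis in this paper.

```python
def net_promoter_score(ratings):
    """NPS from 0-10 ratings: percent promoters (9-10) minus percent detractors (0-6)."""
    if not ratings:
        raise ValueError("ratings must be non-empty")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100.0 * (promoters - detractors) / len(ratings)

sample_ratings = [10, 10, 9, 3]  # hypothetical survey answers
print(net_promoter_score(sample_ratings))  # prints 50.0
```

Ratings of 7 or 8 count toward the denominator only, so the score ranges from -100 to 100.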
3.2. Supply Chain Data
All restaurants in our selected sample are serviced by one distributor that procures from 26
suppliers. This supply chain structure allows us to focus on the supply chain quality from the restaurants’
perspective. Although the quality at the distributor level is fixed, the quality received by
each restaurant from the distributor varies, and we capture this variation at the restaurant level.
There are many dimensions to quality in fast-food restaurants that affect customer experience.
The quality of the ingredients for the items listed on the menu is one of the important factors (Min
and Min 2011), and it depends on the quality of the supply chain that supports the restaurant.
As explained earlier, we measure the supply chain quality from the restaurants’ perspective, which
varies over time and across restaurants. When restaurants encounter problems with their deliveries
and/or supplies, they file complaints with the call center of the supply chain orchestrator.
Complaints serve as a mechanism to obtain a replacement and/or credit for damaged or unusable supplies.
The call center records the complaint, investigates it, and assigns it to a responsible
party.
Using analysis of complaints’ texts, we identify the main quality issues that appear frequently
in the complaints. To identify the main issues involved and to categorize the complaints by these
issues, we first manually go through a training set of complaints and identify the main supply
chain issues (Freshness, Packaging, and Delivery) and their associated keywords. For example, the
complaint “Store received 1 case of Chips that has all bags open at the top seals,” indicates that the
restaurant has encountered a problem with Packaging and that the words “bag” and “seals” may
be associated with the Packaging topic. We then use a form of probabilistic topic modeling based
on the Latent Dirichlet Allocation model (LDA), which is a hierarchical Bayesian model developed
in computer science of how content is structured within text (Blei 2012), to refine our topic and
keyword associations and to categorize all complaints. The main idea of this method is as follows:
Each complaint is modeled as a collection of words drawn from different topic distributions, whereby
the proportion of each topic that comprises a complaint is given by P(topic|complaint). Each topic is
defined as a multinomial distribution over the observed set of words within all complaints, which
is given by P(word|topic) (Blei 2012). The goal of LDA is to infer both probability distributions,
modeled as latent variables, from the observed word counts in each complaint.
Recently the LDA model has been successfully used in business applications to assess product
quality from user-generated content on websites (Abrahams et al. 2015, Tirunillai and Tellis 2014).
In contrast to these prior works, we estimate the LDA model after specifying an informative prior
distribution of keywords for each topic, which allows us to pre-specify topics of interest, thus
integrating managerial intuition with statistical modeling of the text. To form the prior we manually
went through a training set of complaints and identified main topics and associated keywords – an
approach that has been successfully used in previous text mining works (see, for example, Hu and
Liu (2004) and Wallach et al. (2009)). To the best of our knowledge, this is the first application
of such analysis in supply chain management literature, and one of the first applications in the
management literature. In Appendix A Table 7 we summarize the words by topic included in the
prior and the updated list of words by topic after applying the LDA model.
In general, each complaint may include multiple topics. For example: “Case was dented and
crushed. As per caller the case was on the bottom of cases. Cucumbers are broken,” mentions
keywords associated with Freshness and with Packaging issues. To associate each complaint
with one main topic, we assign the complaints to the most likely topic using the estimate of
P(topic|complaint). In the example above, the complaint is classified as Packaging issue, which is
the main quality problem. In our application, complaints are logged by trained call center employ-
ees, and we find that most complaints are short (an average of 24 words, 151 characters) and
focused, making association with one issue fairly straightforward.
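The seeded topic model and the most-likely-topic assignment described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the estimation code used in the paper: it uses a small collapsed Gibbs sampler, and the function name `seeded_lda`, the hyperparameters `alpha`, `beta`, and `boost`, and the toy complaints and seed words are all hypothetical.

```python
import numpy as np

def seeded_lda(docs, seeds, n_iter=200, alpha=0.5, beta=0.01, boost=5.0, seed=0):
    """Collapsed Gibbs sampler for LDA with an informative (seeded) word prior.

    docs  : list of tokenised complaints (lists of words)
    seeds : dict mapping topic name -> list of seed keywords (hypothetical)
    Returns P(topic | complaint) and the most likely topic per complaint.
    """
    rng = np.random.default_rng(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    topics = list(seeds)
    K, V = len(topics), len(vocab)

    # Informative prior: small mass `beta` on every word, extra `boost` on seeds.
    eta = np.full((K, V), beta)
    for k, topic in enumerate(topics):
        for w in seeds[topic]:
            if w in word_id:
                eta[k, word_id[w]] += boost
    eta_sum = eta.sum(axis=1)

    tokens = [[word_id[w] for w in doc] for doc in docs]
    n_dk = np.zeros((len(docs), K))   # topic counts per complaint
    n_kw = np.zeros((K, V))           # word counts per topic
    n_k = np.zeros(K)                 # total words per topic
    z = []
    for d, doc in enumerate(tokens):  # initialise assignments from the prior
        z_d = []
        for w in doc:
            k = rng.choice(K, p=eta[:, w] / eta[:, w].sum())
            z_d.append(k)
            n_dk[d, k] += 1; n_kw[k, w] += 1; n_k[k] += 1
        z.append(z_d)

    for _ in range(n_iter):
        for d, doc in enumerate(tokens):
            for i, w in enumerate(doc):
                k = z[d][i]
                n_dk[d, k] -= 1; n_kw[k, w] -= 1; n_k[k] -= 1
                # Full conditional of the topic assignment for this token.
                p = (n_dk[d] + alpha) * (n_kw[:, w] + eta[:, w]) / (n_k + eta_sum)
                k = rng.choice(K, p=p / p.sum())
                z[d][i] = k
                n_dk[d, k] += 1; n_kw[k, w] += 1; n_k[k] += 1

    theta = (n_dk + alpha) / (n_dk + alpha).sum(axis=1, keepdims=True)
    labels = [topics[k] for k in theta.argmax(axis=1)]  # most likely topic
    return theta, labels

# Toy complaints and seed keywords (illustrative only).
toy_docs = [
    ["chips", "bags", "open", "seals", "dented"],
    ["cucumbers", "overripe", "moldy", "spoiled"],
    ["driver", "late", "missing", "case"],
]
toy_seeds = {
    "Packaging": ["bags", "seals", "dented"],
    "Freshness": ["overripe", "moldy", "spoiled"],
    "Delivery": ["driver", "late", "missing"],
}
theta, labels = seeded_lda(toy_docs, toy_seeds)
print(labels)
```

The only difference from a standard LDA sampler in this sketch is the matrix `eta`: replacing it with a constant recovers the uninformed prior.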
We validate our seeded LDA approach using three quality metrics: the average topic coherence
(Mimno et al. 2011), Hellinger distance (Blei and Lafferty 2009), and entropy (Hall et al. 2008).
Please see Appendix C for detailed definitions of the metrics. Table 1 shows that the seeded LDA
method is preferred to the standard LDA topic model with uninformed prior according to all three
quality metrics (higher values are preferred for coherence and Hellinger distance measures, and
lower values are preferred for the entropy measure).
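Two of these metrics have standard textbook forms that can be sketched briefly. The definitions below are the usual Hellinger distance between two discrete distributions and the Shannon entropy of one distribution; how the paper aggregates them over topics and complaints is given in Appendix C and may differ, so the function names and the use of natural logarithms are assumptions.

```python
import numpy as np

def hellinger(p, q):
    """Hellinger distance between two discrete distributions (0 = identical, 1 = disjoint)."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return float(np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)))

def entropy(p):
    """Shannon entropy (natural log) of a discrete distribution; zero-probability cells are skipped."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

# Well-separated topics are far apart; a complaint dominated by one topic has low entropy.
print(hellinger([0.9, 0.1, 0.0], [0.0, 0.1, 0.9]))
print(entropy([0.9, 0.05, 0.05]), entropy([1 / 3, 1 / 3, 1 / 3]))
```

Under these definitions, a larger distance between topic-word distributions indicates more distinct topics, and a lower entropy of P(topic|complaint) indicates a sharper assignment, which matches the direction of preference stated above.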
Method                       Topic Coherence   Hellinger Distance   Entropy
Seeded LDA                   -1206.724         0.627                3.525
LDA with Uninformed Prior    -1387.492         0.555                3.769
Table 1: Quality metrics for the topic modeling show that the seeded LDA results are preferable to topic modeling with an uninformed prior.
After each complaint is associated with one issue, we count the number of complaints per issue
for each restaurant and use these counts as measures of supply chain quality. In Table 2, we present
the descriptive statistics of all variables included in the supply chain analysis. We include the age
of the restaurant calculated in days from opening until the end of the observation window, and the
distance from the restaurant to its distributor (in miles), which we will use later for controls. To
compute the distance between a given restaurant and its distributor, we first find their geographic
coordinates using Google Maps Geolocation API based on the provided addresses. We then use the
coordinates to find the driving distance between the restaurant and its distributor using Google
Maps Distance Matrix API. We compute the distances between restaurants using the same approach,
and we assume that these distances are symmetric (i.e., the distance between A and B is the same
as between B and A). We use the distances to identify restaurants that
are within a close proximity (specified by parameter d) from a focal restaurant in our second phase
analysis.
Statistic                                          N    Mean      St. Dev.   Min      Max
Average score for “Recommend Restaurant” (1)       28   8.762     0.541      7.305    9.732
Average score for “Cleanliness/Comfort” (2)        28   9.090     0.410      8.138    9.759
Total number of surveys filled                     28   264.536   243.929    53       878
Complaints regarding Freshness issues (3)          28   5.786     13.815     0        57
Complaints regarding Packaging issues (3)          28   2.786     3.881      0        14
Complaints regarding Delivery issues (3)           28   0.357     0.989      0        5
Distance from restaurant to distributor (miles)    28   131.021   25.624     39.900   138.000
Restaurant age (days)                              28   3,848     2,095      765      7,759
Table 2: Restaurant-level summary statistics for the selected sample. Notes: (1) For each restaurant, compute the average score for the “Recommend Restaurant” question, then report the descriptive statistics across restaurants; (2) For each restaurant, compute the average score for the “Cleanliness” question and the “Comfort and Atmosphere” question, take the average of the two, then report the descriptive statistics across restaurants; (3) For each restaurant, compute the total number of complaints on a specific topic over the whole observation time window, then report the descriptive statistics across restaurants.
4. Model
Given that supply chain quality is not directly observable to customers, to link supply chain quality
to CLV, we conduct the analysis in two phases. In the first phase, we explore how supply chain
quality affects customer experience at the downstream restaurants using a linear regression model.
In the second phase, we propose a latent attrition model to capture how customer experience at the
focal restaurant and the average customer experience reported for the adjacent restaurants
affect CLV at the focal restaurant.
4.1. Impact of Supply Chain Quality on Customer Experience
In this section, we first describe our variables and present a regression model to identify which
operational and supply chain characteristics drive customer experience at restaurants. We then
discuss potential endogeneity issues and present a two-stage least squares (2SLS) approach with
instrumental variables to address such issues.
4.1.1. Variables and Linear Model Our dependent variable is the level of customer experience
provided at the restaurant. To capture customer experience at the restaurant, we use the
responses to the “How likely are you to recommend this restaurant?” question from the customer
surveys, denoted RECOMMENDjl. In particular, for each restaurant j, we compute a rolling average
(see footnote 2) from time l to time l + τ. We set the time window length to τ = 84 days (see
footnote 3). We average the responses over a time window to capture a potential delay between the
occurrence of an issue at the restaurants and customer survey completion. We regress RECOMMENDjl
on the following time-variant characteristics of the restaurants. First, we include three
variables counting the number of complaints placed by restaurant j on day l on each of the three
supply chain issues identified through text analysis ($COMPL^{Fresh}_{jl}$, $COMPL^{Package}_{jl}$,
and $COMPL^{Delivery}_{jl}$). To control for the general quality
level of the restaurant due to factors other than supply chain quality, we add the average score
given to the restaurant on Cleanliness and Comfort in the customer survey (AVGCLEANjl) to
the model specification (see footnote 4). For example, this control variable should capture whether
restaurants have more experienced or more talented managers, which may affect the frequency of
complaints and the quality of the relationship between the distributor/suppliers and the
restaurant. Similarly, restaurants may have better staff, which creates a better atmosphere for
customers and hence influences their customer experience; this too should be controlled for by
AVGCLEANjl. We also combine other controls in vector Cj: To control for potential demographic and
socioeconomic differences between restaurants’ locations, we control for the ZIP code where the
restaurant is located; in addition, we control for the restaurant’s age in days from opening until
the end of the observation window, rescaled by 1/1000.
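$$
\begin{aligned}
RECOMMEND_{j,l+\tau} = {} & \beta_0 + \beta_{Fresh}\, COMPL^{Fresh}_{jl} + \beta_{Delivery}\, COMPL^{Delivery}_{jl} \\
& + \beta_{Packaging}\, COMPL^{Package}_{jl} + \beta_{Ability}\, AVGCLEAN_{jl} \\
& + \beta_{Controls}\, C_j + \varepsilon_{jl}.
\end{aligned}
\tag{1}
$$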
4.1.2. 2SLS Model Using an OLS model as specified above may lead to biased estimators
due to potential endogeneity issues. Although supply chain quality can affect customer experi-
ence, such customer experience also may directly affect the number of supply chain complaints
2 Note that we use the average response to the question “How likely are you to recommend this restaurant?” rather than the Net Promoter Score to measure customer experience. This is because we find that the continuous average metric predicts customers’ future purchasing behavior significantly better than the Net Promoter Score based on our second phase analysis, which is consistent with the results in Pingitore et al. (2007).
3 To select the time window τ, we ran the model with different rolling horizons ranging from 1 to 98 days in increments of 7 days, and selected the final model based on the best fit according to the adjusted R-squared measure.
4 In a study of students’ preference of fast food restaurants by Knutson (2000), cleanliness was ranked as the number one driving force of students’ choice.
filed by the restaurants. This can make the three variables that count the number of supply
chain complaints in (1) endogenous. For example, high customer experience may lead
to higher sales at the restaurant level. This implies that the restaurant may have more money to
invest in better management that collaborates better with the supply chain members and causes
supply chain quality to increase. Although we intend to control for this problem by including the
proxy for quality of management using the AVGCLEANjl variable, we are uncertain whether
AVGCLEANjl can fully capture the management capability of the restaurants. Meanwhile, low
customer experience leads to frustrated customers who complain more to the restaurants, which
may trigger a higher number of complaints in the supply chain. Although we observe that very few
supply chain complaints (2%) in our data mention that they originate from the customers, we
cannot fully exclude the associated potential endogeneity issues. To this end, it is worth noting
that the Wu-Hausman test rejects the hypothesis that the three counts of
supply chain complaints are exogenous variables with p < 0.01.
To address the above endogeneity issues, we use a 2SLS approach with three instrumental
variables. To be valid, the instruments have to satisfy two conditions: relevance (correlated
with the endogenous variable, i.e., number of complaints) and exclusion (uncorrelated with the
error term) (Wooldridge 2001). To this end, we use the following three instrumental variables: the
distance from the distributor to the restaurant (DISTj), the adoption of the point-of-sale (POS)
system at the restaurants, captured with a binary variable (POSjl), and the number of restaurants
serviced by the distributor (STORENUMl).
(1) The distance from the distributor to the restaurant (DISTj): If a restaurant is located far
from the distributor, it can be scheduled last on the route and hence may incur frequent delivery
delays, or the boxes for the restaurant may have a higher risk of getting damaged because they are
stuck at the bottom of the delivery truck. As a result, the distance from the distributor is
likely to be correlated with the number of supply chain complaints and thus satisfies the
relevance condition. Because the distance between the restaurant and its distributor should have
no direct impact on customer experience, the exclusion condition is satisfied.
(2) The adoption of the point-of-sale (POS) system at the restaurants, captured with a binary
variable (POSjl): POSjl = 1 if the POS has been implemented at restaurant j at time l, and 0
otherwise. The POS system makes the customer transaction data readily available, and the data are
shared at all stages involved in the supply chain in our application. Thus, the adoption of the
POS system helps to better inform the supply chain partners about consumer demand so that they
can better plan their supplies. This establishes the relevance of POSjl. However, because POS has
no impact on the customer interaction process, POS implementation should have no direct impact on
customer experience and thus satisfies the exclusion condition.
(3) The number of restaurants serviced by the distributor (STORENUMl): Although there is only one
distributor in our selected sample, the number of restaurants it services changes over time. When
the distributor has to deal with more restaurants, the supply chain quality may suffer from such
issues as slower processing of orders or availability problems. This, however, should have no
direct impact on customer experience.
To further validate the three instruments proposed above, we confirm that the instruments pass
the relevance condition using F-tests, which indicate that they are not weak instruments. We also
provide evidence that the three instruments satisfy the exclusion condition using an F-test with
the instruments on the residuals from the final model. Full results from the first stage and the
detailed results of the two tests mentioned above are presented in Appendix D.
In the first stage of the 2SLS approach, we estimate the endogenous independent variables
$COMPL^{cat}_{jl}$ for cat ∈ {Fresh, Package, Delivery} using the instruments defined above and
control variables Cj. In the second stage, we replace the observed counts of complaints in (1)
with the predicted number of complaints obtained from Stage 1. We then estimate the model
specified in (1) with robust errors (Wooldridge 2001).
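The two stages can be illustrated with a short numerical sketch. The code below is not the paper's estimation code: it uses synthetic data with one endogenous regressor, two instruments, and one control, and all names (`two_stage_least_squares`, `simulate`, the coefficient values) are assumptions chosen for illustration. It also reports only point estimates; the robust standard errors mentioned above are not computed here.

```python
import numpy as np

def two_stage_least_squares(y, x_endog, x_exog, z):
    """Return 2SLS coefficients ordered as [constant, endogenous..., exogenous...]."""
    n = len(y)
    const = np.ones((n, 1))
    # Stage 1: regress the endogenous regressors on the instruments and controls.
    w = np.hstack([const, z, x_exog])
    gamma, *_ = np.linalg.lstsq(w, x_endog, rcond=None)
    x_hat = w @ gamma
    # Stage 2: replace the endogenous regressors with their fitted values.
    x2 = np.hstack([const, x_hat, x_exog])
    beta, *_ = np.linalg.lstsq(x2, y, rcond=None)
    return beta

def ols(y, x):
    """Ordinary least squares with an intercept, for comparison."""
    x1 = np.hstack([np.ones((len(y), 1)), x])
    beta, *_ = np.linalg.lstsq(x1, y, rcond=None)
    return beta

def simulate(n=20000, seed=0):
    """Synthetic data in which an unobserved factor u drives both x and y."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=(n, 2))   # instruments (relevant, excluded from y)
    c = rng.normal(size=(n, 1))   # exogenous control
    u = rng.normal(size=n)        # unobserved confounder
    x = 0.8 * z[:, 0] + 0.5 * z[:, 1] + 0.3 * c[:, 0] + u + 0.5 * rng.normal(size=n)
    y = 1.0 - 2.0 * x + 0.5 * c[:, 0] + 2.0 * u + rng.normal(size=n)
    return y, x.reshape(-1, 1), c, z

y, x, c, z = simulate()
beta_2sls = two_stage_least_squares(y, x, c, z)
beta_ols = ols(y, np.hstack([x, c]))
print("2SLS effect of x:", round(float(beta_2sls[1]), 3))  # true effect is -2.0
print("OLS effect of x: ", round(float(beta_ols[1]), 3))   # biased by the confounder
```

In this synthetic setting the OLS coefficient is pulled toward zero by the confounder, while the 2SLS coefficient stays close to the true value.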
4.2. Impact of Customer Experience on Customer Purchasing Behavior
In this section, we propose a latent attrition model to study the impact of customer experience on
customers’ future purchasing behavior and describe our estimation strategy.
4.2.1. Customer Purchasing Behavior Model To standardize notation, we use j ∈
{1, 2, ..., J} to index the restaurants. Customers of a given restaurant j are indexed by i ∈
{1, 2, ..., Nj}. Customer i’s transactions at a restaurant j are indexed by k ∈ {1, 2, ..., Kij},
and tijk represents the time of transaction k made by customer i in restaurant j. Customers have a
non-contractual relationship with the restaurants. While customer i is active with a restaurant, he
or she makes transactions according to a Poisson process with rate λi. We allow the arrival rate λi
to vary across different customers. To account for such heterogeneity, we let the arrival rate follow
a gamma distribution. In particular, λi is a realization of λ ∼ Gamma(aλ, bλ), where aλ and bλ are
the shape and rate parameters of the gamma distribution, respectively.
After each transaction, a customer may decide to stop visiting the restaurant and become per-
manently inactive, i.e. churn. We denote customer i’s probability of churning after his or her kth
transaction at restaurant j with pijk, given by
$$p_{ijk} = 1 - \exp(-\theta_i y_{ijk}). \tag{2}$$
θi captures customers’ specific heterogeneity due to the unobservable variables beyond the factors
in yijk, (e.g., customers may react to the same experience differently, customers may see different
value in the brand). We let θ∼Gamma(aθ, bθ), where aθ and bθ are the shape and rate parameters
of the gamma distribution, respectively. We assume that θ and λ are independent. yijk is a function
of factors that may affect customers’ churn rate. Customers’ probability of churning at a restaurant
may depend not only on their experience with the focal restaurant but also on the perceived quality
of the adjacent restaurants. For example, if the perceived quality of the adjacent restaurants is
high, customers may become more likely to churn at the focal restaurant and to decide to visit the
adjacent restaurants instead. We refer to such effect as the competition effect. Because all of the
restaurants in our application share the same brand name, good perceived quality of the adjacent
restaurants may improve the quality reputation of the brand and thus reduce customers’ churn
rate at the focal restaurant. We refer to such effect as the reputation spillover effect. We define
the network effect as the combined competition and reputation spillover effects. To operationalize
the impact of customer experience at the focal restaurant and the network effect on customers’ churn
rates at the focal restaurant, we let qijk denote customer i’s experience at restaurant j by the end
of his or her kth transaction there. We define qdjk as the average perceived quality of the adjacent
restaurants, which are within d miles from the focal restaurant j, at the time of transaction k. Note
that customers’ perception of the adjacent restaurant’s quality can be acquired through
word-of-mouth and/or personal experience with visiting a neighboring restaurant. Thus, the network effect
for customers who have visited the adjacent restaurants may be different from that for customers
who have not visited any of the adjacent restaurants. To this end, we let the binary variable δdijk
indicate whether or not customer i has visited any of the restaurants adjacent to restaurant j by
the time of transaction k. In particular, we have δdijk = 1 if the customer has visited any of the
adjacent restaurants, and 0 otherwise. Finally, customers’ churn rate also may be different across
different ZIP codes due to such factors as different customer demographics, strength of the brand
in the area, or competition from other brands. We let Zipj be a vector of dummies that indicates
the ZIP code where restaurant j is located.
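To capture all of the factors above, we let yijk be given as follows:
$$\log(y_{ijk}) = \alpha_{FQ}\, q_{ijk} + \alpha_{NQ}\, q^d_{jk} + \alpha_d\, \delta^d_{ijk} + \alpha_{int}\, q^d_{jk}\, \delta^d_{ijk} + \boldsymbol{\alpha}_{ZIP} \cdot \mathrm{Zip}_j. \tag{3}$$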
Note that including the interaction term qdjkδdijk in (3) allows us to differentiate the network effect
through the two different channels (i.e., word-of-mouth and personal visits) noted above. In par-
ticular, the coefficient αNQ captures the network effect through the word-of-mouth channel, while
αint quantifies the additional network effect through the channel of customers’ personal visits.
Finally, it is important to note that, by including the ZIP code level fixed effect and explicitly
characterizing individual customers’ heterogeneity, our model allows us to elicit the causal impact
of customer experience at the focal store and the network effect on their purchasing behavior in a
clean manner.
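Equations (2) and (3) combine into a simple mapping from experience scores to a churn probability, sketched below. The coefficient values, the value of θi, and the function name `churn_probability` are hypothetical placeholders chosen only to show the mechanics; they are not estimates from the paper.

```python
import math

# Hypothetical coefficients, for illustration only.
ALPHA = {"FQ": -0.6, "NQ": -0.4, "d": 4.5, "int": -0.6}

def churn_probability(theta_i, q_focal, q_adj, visited_adj, zip_effect=0.0, alpha=ALPHA):
    """Churn probability after a transaction, following equations (2) and (3).

    theta_i     : customer-specific heterogeneity term
    q_focal     : experience at the focal restaurant (q_ijk)
    q_adj       : average perceived quality of adjacent restaurants (q^d_jk)
    visited_adj : 1 if the customer has visited an adjacent restaurant, else 0
    zip_effect  : the ZIP-code fixed effect, alpha_ZIP . Zip_j
    """
    log_y = (alpha["FQ"] * q_focal
             + alpha["NQ"] * q_adj
             + alpha["d"] * visited_adj
             + alpha["int"] * q_adj * visited_adj
             + zip_effect)
    y = math.exp(log_y)                  # equation (3)
    return 1.0 - math.exp(-theta_i * y)  # equation (2)

# Better focal experience lowers the churn probability when alpha_FQ < 0.
for q in (7.0, 8.0, 9.0):
    print(q, churn_probability(theta_i=50.0, q_focal=q, q_adj=8.5, visited_adj=0))
```

Because y enters through an exponential, any coefficient sign translates directly into the direction of the effect on churn: a negative coefficient lowers y and therefore lowers the churn probability.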
4.2.2. Estimation Strategy We use the maximum likelihood estimation method to estimate
the parameters in the customer purchasing model described above. We construct the likelihood
function as follows: (1) We first compute the likelihood of observing the sequence of transactions for
each given customer at a given restaurant; (2) we then compute the likelihood for all the customers
of each given focal restaurant; and (3) finally, we compute the total likelihood function for all the
selected restaurants in the given market.
Before we derive the likelihood function, we first describe how we construct the key variables (i.e.,
qijk and qdjk) in the purchasing behavior model. Recall that qijk represents customer i’s experience
at restaurant j at the time of transaction k. Similar to our analysis in the first phase, we use cus-
tomers’ response to the question “How likely are you to recommend this restaurant?” to measure
customer experience. To this end, it is important to note that, although we can link customers’
surveys to a particular restaurant, we cannot link customers’ surveys to the corresponding
transactions. Moreover, most customers do not complete surveys for all of their transactions. Thus,
instead of measuring customer experience at the transactional level, we do so at the restaurant
level. In particular, we let qj indicate the average recommendation score collected from all surveys
completed for restaurant j and set qijk = qj. We include surveys collected over the full observation
period to measure customer experience for the following two reasons: (1) our results show that
the average recommendation scores are stationary over time for all the selected restaurants based
on the augmented Dickey-Fuller (ADF) test; and (2) this allows us to include more surveys
to measure customer experience, which can potentially reduce the bias introduced by a low num-
ber of completed surveys. To explore the robustness of our main insights to our measurement of
customer experience, we consider an alternative model whereby we measure customer experience
using surveys completed up to the time of each customer’s last transaction; see Section 6.
We next construct the variable qdjk, which is the average perceived customer experience across all
of the adjacent restaurants within distance d from restaurant j. We index the focal restaurant j’s
adjacent restaurants as $j_m$ with $m \in \{1, 2, \ldots, M^d_j\}$, where $M^d_j$ is the total
number of adjacent restaurants for the focal restaurant j within distance d. We let $q_{j_m}$ be
the average recommendation score collected from all surveys for the adjacent restaurant $j_m$. To
this end, we have
$$q^d_{jk} = \frac{1}{M^d_j} \sum_{m=1}^{M^d_j} q_{j_m}.$$
Based on our conversation with the managers at the restaurant chain, we let the distance threshold
d= 1 mile in our estimation. To demonstrate the robustness of our main insights to our choice of
the value for d, we explore several alternative models with different values for d (d ∈ {1, 1.5, 2}).
Our results show that, although the main insights are consistent across all these models, the model
with d= 1 best explains the data based on the Akaike information criterion (AIC; Burnham and
Anderson (2003)). Hence, we focus on the case with d= 1 throughout the paper.
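The construction of the neighborhood variable can be sketched as follows, given a symmetric matrix of pairwise driving distances and the restaurant-level average recommendation scores. The function name `neighbor_average`, the inclusive threshold (distance ≤ d), and the choice to return NaN for a restaurant with no neighbors are assumptions; the paper does not specify these details.

```python
import numpy as np

def neighbor_average(dist, q, d=1.0):
    """Average score of the restaurants within distance d of each focal restaurant.

    dist : (J, J) symmetric matrix of pairwise distances in miles
    q    : length-J vector of average recommendation scores q_j
    Returns a length-J vector; NaN where a restaurant has no neighbor within d.
    """
    dist = np.asarray(dist, dtype=float)
    q = np.asarray(q, dtype=float)
    adjacent = (dist <= d) & ~np.eye(len(q), dtype=bool)  # exclude the focal restaurant
    counts = adjacent.sum(axis=1)                         # M^d_j
    sums = adjacent.astype(float) @ q
    return np.where(counts > 0, sums / np.maximum(counts, 1), np.nan)

# Three hypothetical restaurants: 0 and 1 are 0.5 miles apart, 1 and 2 are 0.8 miles apart.
dist = [[0.0, 0.5, 3.0],
        [0.5, 0.0, 0.8],
        [3.0, 0.8, 0.0]]
q = [8.0, 9.0, 7.0]
print(neighbor_average(dist, q, d=1.0))  # [9.0, 7.5, 9.0]
```

Re-running the same function with d = 1.5 or d = 2 gives the inputs for the alternative models compared by AIC.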
We are now ready to construct the likelihood function. Consider customer i who had Kij
transactions at restaurant j in the period (0, T], with the transactions occurring at
$\mathbf{t}_{ij} = (t_{ij1}, t_{ij2}, \ldots, t_{ijK_{ij}})$. Meanwhile, his or her corresponding
visiting trajectory to other restaurants is characterized by
$\boldsymbol{\delta}^d_{ij} = (\delta^d_{ij1}, \ldots, \delta^d_{ijK_{ij}})$. His or her experience
at the focal restaurant and the perceived quality of the adjacent restaurants during the past Kij
transactions are given by $\mathbf{q}_{ij} = (q_{ij1}, \ldots, q_{ijK_{ij}})$ and
$\mathbf{q}^d_j = (q^d_{j1}, \ldots, q^d_{jK_{ij}})$, respectively. We let Lij be the likelihood of
observing the sequence of transactions of customer i at restaurant j starting at the moment tij1.
To this end, we have
$$
\begin{aligned}
& L_{ij}(K_{ij}, \mathbf{t}_{ij}, \boldsymbol{\delta}^d_{ij}, \mathbf{q}_{ij}, \mathbf{q}^d_j, \mathrm{Zip}_j, T \mid \alpha_{FQ}, \alpha_{NQ}, \alpha_d, \alpha_{int}, \boldsymbol{\alpha}_{ZIP}, \lambda_i, \theta_i) \\
& \quad = \lambda_i^{K_{ij}-1} \exp\!\big(-\lambda_i (t_{ijK_{ij}} - t_{ij1})\big)
\left( \prod_{m=1}^{K_{ij}-1} (1 - p_{ijm}) \right)
\Big( p_{ijK_{ij}} + (1 - p_{ijK_{ij}}) \exp\!\big(-\lambda_i (T - t_{ijK_{ij}})\big) \Big).
\end{aligned}
\tag{4}
$$
Then, for a randomly chosen customer, we compute the corresponding expected likelihood function
by taking expectation of the likelihood function defined in (4) over the random variables λ and θ.
This is characterized in the following lemma (all proofs and derivations are provided in Appendix
B).
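Lemma 1. The expected likelihood function for a randomly chosen customer is given by
$$
\begin{aligned}
& E_{\lambda,\theta}\big[ L_{ij}(K_{ij}, \mathbf{t}_{ij}, \boldsymbol{\delta}^d_{ij}, \mathbf{q}_{ij}, \mathbf{q}^d_j, \mathrm{Zip}_j, T \mid \alpha_{FQ}, \alpha_{NQ}, \alpha_d, \alpha_{int}, \boldsymbol{\alpha}_{ZIP}, \lambda_i, \theta_i) \big] \\
& \quad = \frac{\Gamma(a_\lambda + K_{ij} - 1)}{\Gamma(a_\lambda)}
\, \frac{b_\lambda^{a_\lambda}}{\big(b_\lambda + t_{ijK_{ij}} - t_{ij1}\big)^{a_\lambda + K_{ij} - 1}}
\left( \frac{b_\theta}{b_\theta + Y_{ij,K_{ij}-1}} \right)^{a_\theta} \\
& \qquad \times \left( 1 - \left( \frac{b_\theta + Y_{ij,K_{ij}-1}}{b_\theta + Y_{ijK_{ij}}} \right)^{a_\theta}
\left( 1 - \left( \frac{b_\lambda + t_{ijK_{ij}} - t_{ij1}}{b_\lambda + T - t_{ij1}} \right)^{a_\lambda + K_{ij} - 1} \right) \right),
\end{aligned}
\tag{5}
$$
where $Y_{ijK} = \sum_{k=1}^{K} y_{ijk}$.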
Finally, we denote the total log-likelihood function across all customers and restaurants in the
chosen neighborhood as LL(ω), with ω being the set of parameters to be estimated, given by
ω = (αFQ, αNQ, αd, αint, αZIP, aλ, bλ, aθ, bθ). We then have
$$LL(\omega) = \sum_{j=1}^{J} \sum_{i=1}^{N_j} \log\big( E_{\lambda,\theta}[L_{ij}] \big). \tag{6}$$
We then estimate the parameters ω by maximizing the log-likelihood function given in (6) using the
nonlinear optimization solver Knitro in MATLAB. To construct the confidence interval for each
parameter, we use non-replacement subsampling; see Horowitz et al. (2001).
5. Estimation Results
In this section, we summarize the results from both phases of our framework. First, we identify
which supply chain issues are most important for determining customer experience at the restau-
rants. Next, we show the impact of customer experience on customer future purchasing behavior.
5.1. Impact of Supply Chain Quality on Customer Experience
We present the estimation results from the first phase of our analysis in Table 3.
The four estimation results differ in how the idiosyncratic noise is modeled, which can affect
the statistical validity of the overall estimation (Arora 1973). Specifically, the error structure in
the first model includes a component specific to restaurants (εjl = ζjl + ξj), whereas the second
model includes a temporal component (εjl = ζjl+ ξl). Each error specification induces a correlation
structure, so that every pair of observations from the same group (either by restaurant or time) are
correlated. The third model, which is the most conservative, computes standard errors using the
so-called “sandwich covariance” estimator (Fitzmaurice et al. 2012), which produces statistically
consistent standard errors even in the presence of autocorrelation and heteroskedasticity. Finally,
the last column shows results from a standard linear model. The results across the three 2SLS
models are consistent both in terms of the signs and the magnitudes of the estimates. To this end,
we show that supply chain issues related to Freshness of products and ingredients have a significant
impact on the customer experience, as they directly affect the availability and taste of menu items.
Moreover, we show that reducing Freshness complaints by one per month can improve customer experience
at the restaurant by 13.052/30 ≈ 0.44. The supply chain quality issues related to Packaging and Delivery,
however, do not seem to affect customer experience at the restaurant significantly. This result is
intuitive: Freshness issues have a direct impact on the taste of final product served to the customers
and, hence, have a higher impact on the customer experience at the store.
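The arithmetic behind this figure can be spelled out as follows. The sketch assumes a 30-day month, uses the Freshness estimate from column (3) of Table 3, and relies on the linearity of model (1), in which complaints are counted per day; the notation $\hat{\beta}_{Fresh}$ for the estimated coefficient is introduced here for illustration.

```latex
\Delta RECOMMEND
  \approx \hat{\beta}_{Fresh} \times \Delta COMPL^{Fresh}
  = (-13.052) \times \left(-\tfrac{1}{30}\right)
  \approx 0.435 .
```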
Our approach of classifying the complaints using text analysis and including separate categories
in the model can help management to identify which supply chain issues need to be addressed
first and to quantify the impact of such issues on customer experience. Once the issue is identified,
the company may dig deeper into this particular category. In our application, for example, 79% of
the Freshness complaints are associated with one supplier, and 65% of the Freshness complaints
are associated with over-ripening of fruits and vegetables. The supply chain can work with the
particular supplier and focus on remedial actions to prevent over-ripening, such as improving
forecasting and decreasing the lead time.
This modeling phase allows us to quantify the impact of supply chain issues on customer
experience. The next phase identifies how valuable customer experience improvement is for different
customers with different transaction patterns.
5.2. Impact of Customer Experience on Customer Purchasing Behavior
The novel and unique feature of our customer purchasing behavior model is that we account for
the impact not only of customer experience in the focal restaurant but also of the network
effect on customers’ purchasing decisions. We refer to the full model described in Section 4.2 as the
Dependent variable: RECOMMENDjl
                        2SLS (EX)     2SLS (Time)   2SLS (Robust SE)   OLS
                        (1)           (2)           (3)                (4)
Freshness complaints    −16.841∗∗∗    −13.528∗∗∗    −13.052∗           −0.167∗∗∗
                        (1.298)       (2.592)       (7.691)            (0.036)
Packaging complaints    −3.254        −11.172       −9.007             −0.080
                        (2.459)       (11.990)      (18.167)           (0.054)
Delivery complaints     1.024         3.317         −4.941             0.018
                        (0.729)       (3.582)       (37.723)           (0.167)
Manager Ability         1.268∗∗∗      1.280∗∗∗      1.285∗∗∗           1.241∗∗∗
                        (0.008)       (0.016)       (0.026)            (0.006)
ZIP Code 1              −0.348∗∗∗     −0.338∗∗∗     −0.346∗∗∗          −0.231∗∗∗
                        (0.100)       (0.024)       (0.047)            (0.011)
ZIP Code 2              −0.161∗       −0.165∗∗∗     −0.174∗∗∗          −0.218∗∗∗
                        (0.095)       (0.021)       (0.049)            (0.010)
ZIP Code 3              −0.175∗       −0.133∗∗∗     −0.144∗            −0.046∗∗∗
                        (0.099)       (0.047)       (0.082)            (0.011)
Restaurant Age          0.012         0.009∗∗       0.010              −0.011∗∗∗
                        (0.017)       (0.004)       (0.008)            (0.002)
Constant                −2.512∗∗∗     −2.549∗∗∗     −2.670∗∗∗          −2.359∗∗∗
                        (0.108)       (0.163)       (0.228)            (0.051)
Observations            20,233        20,233        20,233             20,233
R2                      0.602         0.411         0.587              0.729
Adjusted R2             0.602         0.389         0.587              0.729
Note: ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01. Standard errors in parentheses.
Table 3: Impact of supply chain quality on customer experience.
network effect model. To assess the value of including customer experience at the focal restaurant
and that at the neighboring restaurants, in characterizing customers’ purchasing behavior, we also
consider the following two models: (1) the model whereby we do not account for the impact of
customer experience in any restaurants, which we refer to as the benchmark model ; and (2) the
model whereby we account only for the impact of customer experience at the focal restaurant but
not the network effect, which we refer to as the focal experience model. We report the estimation
results of all three models in Table 4 (see footnote 5).
Description                                     Notation   Benchmark       Focal Experience   Network Effect
                                                           Model           Model              Model
Shape parameter of λ                            aλ         0.35            0.32               0.33
                                                           (0.33,0.37)     (0.30,0.35)        (0.31,0.35)
Rate parameter of λ                             bλ         1.22            1.17               1.18
                                                           (1.15,1.29)     (1.10,1.24)        (1.12,1.24)
Shape parameter of θ                            aθ         0.85            0.89               0.92
                                                           (0.82,0.88)     (0.85,0.92)        (0.88,0.95)
Rate parameter of θ                             bθ         0.73            0.01               0.00
                                                           (0.63,0.78)     (0.00,0.01)        (0.00,0.00)
Effect of focal customer experience             αFQ                        −0.58              −0.62
                                                                           (-0.65,-0.52)      (-0.71,-0.56)
Effect of neighborhood customer experience      αNQ                                           −0.38
                                                                                              (-0.48,-0.25)
Effect of visiting at least one adjacent        αd                                            4.54
restaurant                                                                                    (2.87,8.13)
Interaction effect                              αint                                          −0.57
                                                                                              (-0.99,-0.38)
Fixed effect of ZIP Code 2                      νZip2      0.10            0.38               0.59
                                                           (0.04,0.16)     (0.31,0.45)        (0.48,0.68)
Fixed effect of ZIP Code 3 (a)                  νZip3      0.06            0.32               0.40
                                                           (0.01,0.12)     (0.26,0.39)        (0.34,0.47)
Table 4: Estimation results for the customer purchasing behavior models: Estimates of the parameters and the corresponding 95% confidence intervals.
(a) Because ZIP Codes 1 and 4 were statistically similar, we eliminated the control for the fourth ZIP Code.
Note that aλ/bλ represents the average weekly rate of transactions, which is around 0.3
transactions per week across all models. The most managerially relevant parameters are αFQ, αNQ, αint
and αd, and the results are consistent across the three models. As one would expect, αFQ has a
negative sign, implying that better customer experience at the focal restaurant reduces customers’
probability of churning. Interestingly, αNQ also has a negative sign. It implies that better customer
experience at the adjacent restaurants of the same brand does not lure away customers from the
5 Note that in all three models reported in Table 4, we omit ZIP Code 4 as a control. However, it is important to note that we also have considered the models whereby we include ZIP Code 4 along with ZIP Codes 2 and 3 as controls. Our results show that the coefficient for ZIP Code 4 is not significantly different from 0 in all of the three models and that including ZIP Code 4 as a control does not improve the goodness of fit of these models. Thus, to improve the estimation efficiency of the models, we focus on the ones for which we omit ZIP Code 4 as a control.
focal restaurant but, instead, enhances the focal restaurant’s ability to keep customers. This may
be because good overall customer experience offered by the adjacent restaurants leads to a good
reputation of the brand in the area and hence, reassures customers of their restaurant choice.
This result indicates that the reputation spillover effect dominates the competition effect. More-
over, negative αint implies that better overall customer experience of adjacent restaurants reduces
customers’ churn rate at the focal restaurant even more for customers who have visited adjacent
restaurants compared to those who have not. This shows that the reputation spillover effect can be
enhanced through customers’ personal visits to the adjacent restaurants. Note that the parameter
αNQ captures the network effect through the channel of word-of-mouth, while the parameter αint
measures the additional network effect through personal visits to the adjacent stores. Finally, as
expected, we have αd > 0, which indicates that customers who have visited adjacent restaurants
have a higher churn rate at the focal restaurant as compared to those who have not. This may be
because customers who explore multiple stores have more options and thus tend to be less loyal to
the focal store.
Model assessment and selection. In this section, we compare the three customer purchasing
behavior models identified above, based on AIC and on the prediction accuracy of each model.
We normalize the AIC of the Benchmark Model to 0 and report the AIC of all models relative to
the benchmark model in Table 5. The model with the lowest AIC best explains the data. Thus,
our results show that the network effect model best explains the data among all the three models,
while the benchmark model is the worst in explaining the observed customer purchasing behavior.
This indicates that both customer experience at the focal restaurant and the network effect are
informative additions to the benchmark model and help to explain the data significantly better.
We next assess the predictive accuracy of the various models that we consider. Following Braun
et al. (2015), we use the probability that a customer will make 0 transactions in the focal
restaurant in the next $t^\dagger$ weeks after the observation period $T$ (conditional on his or her
transaction history) as our variable of interest. We denote it as
$P(X(t^\dagger) = 0 \mid K_{ij}, \mathbf{t}_{ij}, \boldsymbol{\delta}^d_{ij}, \mathbf{q}_{ij}, \mathbf{q}^d_j, \mathrm{Zip}_j, T)$, where
$(K_{ij}, \mathbf{t}_{ij}, \boldsymbol{\delta}^d_{ij}, \mathbf{q}_{ij}, \mathbf{q}^d_j, \mathrm{Zip}_j, T)$ characterizes the corresponding past
transaction pattern and experience. We derive its expression in Appendix B, see (23).
To evaluate the prediction accuracy of our models, we divide the data into two subsets, with one
for training the model and the other for testing. In particular, the testing subset includes all of the
transactions in the last eight weeks of our observation window (from April 27, 2015 to June 22,
2015), while we use the remaining data to train our model. We first estimate our models by using
the training data. Using these estimates, for any given customer in the training set, we can compute
his or her predicted probability of making 0 transactions in the future eight weeks based on his
Figure 2: Probability of making 0 transactions in the future eight weeks. (Horizontal axis: observed probability; vertical axis: predicted average probability.)
Models                    AIC      RMSE     MAE
Benchmark Model           0        0.112    0.080
Focal Experience Model    -12.0    0.120    0.079
Network Effect Model      -16.4    0.090    0.059
Table 5: Model selection measures
or her transaction history. We next compute the corresponding observed probability of making 0
transactions in the future eight weeks using the testing data. To do so, we first divide the customers
into 15 groups based on their predicted probability of making 0 transactions in the future eight
weeks. In particular, group g includes all customers with the predicted probability from (g−1)/15
to g/15, for g = 1,2, · · · ,15. For each group, we calculate the observed average probability of
making 0 transactions in the future eight weeks using the testing data. We then compare the
observed average probability to the average predicted probability of making 0 transactions in the
future eight weeks for all 15 groups. We measure the prediction accuracy of our models in terms
of this probability using two commonly used metrics: mean absolute error (MAE)
and root-mean-square error (RMSE). We report the results in Table 5.
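As an illustration only, the grouping-and-comparison procedure described above can be sketched in Python. The function name `calibration_errors`, the half-open group boundaries, and the equal weighting of non-empty groups are assumptions made for this sketch; the text does not specify how boundary cases or empty groups are handled.

```python
import math

def calibration_errors(predicted, observed_zero, n_groups=15):
    """Compare predicted and observed P(0 transactions) by predicted-probability group.

    predicted     -- per-customer predicted probability of 0 future transactions
    observed_zero -- 1 if the customer made 0 transactions in the test window, else 0
    Returns (MAE, RMSE) computed across the non-empty groups.
    """
    pred_sum = [0.0] * n_groups
    obs_sum = [0.0] * n_groups
    count = [0] * n_groups
    for p, z in zip(predicted, observed_zero):
        # Assumption: group g covers [(g-1)/n, g/n); p = 1 is placed in the last group.
        g = min(int(p * n_groups), n_groups - 1)
        pred_sum[g] += p
        obs_sum[g] += z
        count[g] += 1
    # Average predicted minus average observed probability, per non-empty group.
    diffs = [pred_sum[g] / count[g] - obs_sum[g] / count[g]
             for g in range(n_groups) if count[g] > 0]
    mae = sum(abs(d) for d in diffs) / len(diffs)
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return mae, rmse

# Hypothetical inputs, for illustration only.
mae, rmse = calibration_errors([0.1, 0.1, 0.9, 0.9], [0, 0, 1, 1])
print(mae, rmse)
```

Weighting each group by its number of customers would be an equally plausible reading and would change the reported values when group sizes differ.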
Based on both RMSE and MAE, we observe that the network effect model predicts customers’
future purchasing behavior the most accurately among all the three models, which is consistent with
our modeling choice according to AIC. In Figure 2, we demonstrate how accurately the network
effect model can predict the probability of making 0 transactions in the future eight weeks by
plotting the predicted probability against the observed probability.
The above analysis shows that accounting for the network effect is important for explaining and
predicting customers’ future purchasing behavior. The model with network effect performs signifi-
cantly better in terms of both its goodness of fit and prediction accuracy compared to the models
without the network effect. We also show that improving customer experience at one restaurant
not only improves customers’ intent to stay at the focal restaurant but also can benefit
its adjacent restaurants through the network effect. With the improved forecasting accuracy
and the additional insights regarding the network effect, our model can help the supply chain and
firms’ managers to estimate the financial return of supply chain quality and customer experience
improvements more accurately, as will become clear in Section 7.
6. Robustness Studies
To check the robustness of our customer purchasing behavior estimates, we study several alternative
models below.
Rolling Average Recommendation Score. Recall that we measure customer experience at
a restaurant by using all of the surveys collected in the entire observation time horizon in our
main network effect model. Thus, customer experience at a restaurant is constant over time in
these models. To explore the robustness of our main insights to the above choice of customer
experience measurement, we consider an alternative network effect model whereby we measure
customer experience at a given time using the rolling average of the recommendation scores from
the surveys collected up to that time. We refer to this model as the rolling network effect model.
This model allows us to capture the variation of customer experience over time, if any. We find
that our main insights continue to hold based on this alternative model (see Table 6). However,
based on AIC, it does not explain the observed customer purchasing behavior as well as our main
network effect model does.
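The difference between the two experience measures can be illustrated with a short Python sketch. It assumes that "rolling average" means the mean of all recommendation scores collected up to a given week (a fixed-length window would be another possible reading); the function names and the survey data are hypothetical.

```python
def static_experience(surveys):
    """Main-model measure: mean score over all surveys in the observation window."""
    return sum(score for _, score in surveys) / len(surveys)

def rolling_experience(surveys, week):
    """Rolling measure: mean score over surveys collected up to and including `week`.

    Returns None when no survey has been collected yet.
    """
    scores = [score for survey_week, score in surveys if survey_week <= week]
    return sum(scores) / len(scores) if scores else None

# Hypothetical (week, recommendation score) pairs for one restaurant.
surveys = [(1, 8), (3, 10), (5, 9)]
print(static_experience(surveys))       # constant over time
print(rolling_experience(surveys, 1))   # uses only the week-1 survey
print(rolling_experience(surveys, 3))   # uses the surveys from weeks 1 and 3
```

Under this reading, the rolling measure converges to the static one by the end of the observation window.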
Customer-based Network. Recall that we define the network of restaurants based on geographic
proximity in our main network effect model. We next consider an alternative model whereby we
define the network of restaurants based on customers’ personal visits. In particular, for any given
customer, we find all restaurants that the customer visited (regardless of their location) and include
them in this customer’s personal network. The perceived customer experience of the “adjacent”
restaurants to the focal restaurant now becomes the average customer experience aggregated across
all restaurants except the focal one visited by the customer. We refer to this model as the customer-
based network effect model. Again, we find that the results of this alternative model are consistent
with our main insights (see Table 6). However, our main network effect model better explains the
observed customer purchasing behavior than does the customer-based network effect model based
on AIC.
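A minimal sketch of the customer-based network measure described above, assuming each restaurant's customer experience is summarized by a single score; the function name, restaurant identifiers, and scores are hypothetical.

```python
def personal_network_experience(visited, focal, experience):
    """Average experience across the restaurants a customer visited, excluding the focal one.

    visited    -- set of restaurant ids the customer visited (any location)
    focal      -- id of the focal restaurant
    experience -- dict mapping restaurant id to its customer experience score
    Returns None if the customer visited no restaurant other than the focal one.
    """
    others = [r for r in visited if r != focal]
    if not others:
        return None
    return sum(experience[r] for r in others) / len(others)

# Hypothetical scores for three restaurants.
experience = {"A": 8.0, "B": 9.0, "C": 7.0}
print(personal_network_experience({"A", "B", "C"}, "A", experience))
print(personal_network_experience({"A"}, "A", experience))
```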
Potential Sample Selection Issue. Among the restaurants in the four ZIP Code areas we
focus on, we select the ones that collected at least 50 surveys in our observation window. To confirm
that our results are not subject to sampling bias due to the elimination of restaurants with a low
number of surveys, we estimate our network effect model using the data from only the two ZIP
Codes (ZIP Codes 1 and 4) where we kept most of the restaurants. We eliminate only one restaurant
out of 10 in ZIP Code 1 and one out of eight restaurants in ZIP Code 4. For ease of reference, we
refer to this model as the subsample network effect model. Our main insights continue to hold based
Notation   Rolling Network   Customer-based         Subsample Network
           Effect Model      Network Effect Model   Effect Model
aλ         0.35              0.33                   0.459
           (0.33,0.37)       (0.30,0.35)            (0.414,0.507)
bλ         1.22              1.18                   1.323
           (1.16,1.28)       (1.11,1.25)            (1.218,1.466)
aθ         0.89              0.92                   0.881
           (0.85,0.92)       (0.86,0.95)            (0.852,0.912)
bθ         0.03              0.00                   0.00
           (0.00,0.75)       (0.00,0.01)            (0.00,0.00)
αFQ        −0.19             −0.59                  −0.492
           (-0.23,-0.15)     (-0.69,-0.54)          (-0.556,-0.401)
αNQ        −0.17             −0.11                  −0.357
           (-0.22,-0.10)     (-0.21,-0.02)          (-0.491,-0.229)
αd         1.83              0.66                   8.271
           (0.79,3.17)       (-0.13,1.57)           (-2.144,14.38)
αint       −0.26                                    −1.013
           (-0.42,-0.14)                            (-1.719,0.2)
νZip2      0.27              0.40
           (0.17,0.35)       (0.33,0.47)
νZip3      0.27              0.34
           (0.22,0.36)       (0.28,0.41)
νZip4                                               −0.053
                                                    (-0.135,0.038)
Table 6: Estimation results of all of the alternative network effect models that characterize the impact of customer experience on customer purchasing behavior.
on this alternative model (see Table 6), which reduces the concern about the potential sample
selection issue.
7. Counterfactual Studies
We have characterized the mechanism by which the supply chain quality affects customer expe-
rience and, thus, customers’ purchasing behavior through the two-phase methodology presented
above. We next describe how individual restaurants and the supply chain orchestrator can use
our methodology and results to prioritize investment in customer experience and supply chain
improvements to maximize their ROI. For example, a firm may run a marketing campaign on a
constrained budget and have to choose which customers to include in the campaign to maximize the
ROI. Similarly, the supply chain orchestrator may not be able to make supply chain improvements
for all restaurants and will have to choose, for example, which restaurant to serve first or which
restaurants to serve with the freshest supplies to maximize the ROI.
At the individual restaurant level, improving customer experience may take the form of a targeted
marketing or service campaign (e.g., sending special offers or providing premium services to specific
customers). Assuming that the cost of an improvement is the same across customers (e.g., it costs
the same to send an offer to each customer), the ROI for a restaurant from improving the experience
of any given customer is proportional to the additional future revenue that the customer generates
at the restaurant due to her improved experience. Given that the transaction amount has little
variation across different transactions and customers in our application, the additional revenue
from a given customer at a restaurant and, hence, the CLV, is proportional to the future number
of transactions that the customer generates at the restaurant.
At the supply chain level, improving the supply chain quality for a given restaurant may take
the form of fixing a particular supply chain issue. For example, if a certain restaurant incurs
Packaging issues because their shipment is at the bottom of the truck due to the route schedule, the
supply chain orchestrator may reduce the risk of this problem by adding cushioning to the bottom
shipments. Assuming that the cost of fixing such a problem does not vary across restaurants, the
ROI for the supply chain orchestrator from improving the supply chain quality of a given restaurant
is proportional to the incremental number of future transactions generated in the entire network
of restaurants due to the supply chain improvement. Note that improving the supply chain quality
for one restaurant will not only generate additional transactions at the focal restaurant, it also
may generate more revenue for the adjacent stores due to the network effect. Our network effect
model allows us to quantify the number of incremental transactions generated through both the
focal restaurant and the other restaurants in the network.
7.1. Expected Incremental Number of Future Transactions
We next derive the expected incremental number of future transactions that we need for our
ROI approximation. Consider customer $i$ who had $K_{ij}$ transactions at restaurant $j$ during the
observation period $[0, T]$ with the transactions occurring at $\mathbf{t}_{ij} = (t_{ij1}, t_{ij2}, \cdots, t_{ijK_{ij}})$. The factors
that may affect the customer’s churn rate at the end of each of the $K_{ij}$ past transactions are
captured by $\mathbf{y}_{ij} = (y_{ij1}, y_{ij2}, \cdots, y_{ijK_{ij}})$. Recall that $\mathbf{y}_{ij}$ is a function of customer experience at
the focal restaurant $\mathbf{q}_{ij}$, the perceived quality of the adjacent restaurants $\mathbf{q}^d_j$, customers’ visiting
trajectory to the adjacent restaurants $\boldsymbol{\delta}^d_{ij}$, and the ZIP code controls $\mathrm{Zip}_j$, see (2). For convenience
of notation, we let $Y_{ijK_{ij}} = \sum_{k=1}^{K_{ij}} y_{ijk}$. Customers’ past transaction pattern and corresponding
experience are fully captured by $(K_{ij}, \mathbf{t}_{ij}, \boldsymbol{\delta}^d_{ij}, \mathbf{q}_{ij}, \mathbf{q}^d_j, \mathrm{Zip}_j, T)$. We then characterize the conditional
probability that customer $i$ is still active at restaurant $j$ by the end of time period $T$, denoted as
$P^A_{ijT}$, in Lemma 2.
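Lemma 2. For customer $i$ with a past transaction pattern and experience characterized by
$(K_{ij}, \mathbf{t}_{ij}, \boldsymbol{\delta}^d_{ij}, \mathbf{q}_{ij}, \mathbf{q}^d_j, \mathrm{Zip}_j, T)$, the probability that she is still active at restaurant $j$ by the end of
time period $T$ is given by
$$P^A_{ijT} = \left(1 - \left(\frac{b_\lambda + T - t_{ij1}}{b_\lambda + t_{ijK_{ij}} - t_{ij1}}\right)^{a_\lambda + K_{ij} - 1}\left(1 - \left(\frac{b_\theta + Y_{ijK_{ij}}}{b_\theta + Y_{i,j,K_{ij}-1}}\right)^{a_\theta}\right)\right)^{-1}. \tag{7}$$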
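To make the expression in (7) concrete, the following Python sketch evaluates it for a single customer. The function name `prob_active` and all input values are hypothetical; in particular, the churn factors `y` are taken as given numbers here, since their functional form in (2) is not restated in this section, and the parameter values are purely illustrative.

```python
def prob_active(t, y, T, a_lam, b_lam, a_th, b_th):
    """Probability of still being active at time T, following the form of (7).

    t -- transaction times t_1 <= ... <= t_K within [0, T]
    y -- churn-rate factors y_1, ..., y_K (treated as given inputs in this sketch)
    """
    K = len(t)
    Y_K = sum(y)            # cumulative factor over all K transactions
    Y_prev = sum(y[:-1])    # cumulative factor over the first K - 1 transactions
    time_ratio = (b_lam + T - t[0]) / (b_lam + t[-1] - t[0])
    churn_ratio = (b_th + Y_K) / (b_th + Y_prev)  # requires b_th + Y_prev > 0
    inner = time_ratio ** (a_lam + K - 1) * (1.0 - churn_ratio ** a_th)
    return 1.0 / (1.0 - inner)

# Illustrative values only: three transactions, observation window ends at T.
params = dict(t=[1.0, 10.0, 20.0], y=[1.0, 1.0, 1.0],
              a_lam=0.35, b_lam=1.22, a_th=0.89, b_th=0.03)
for T in (20.0, 30.0, 100.0):
    print(T, prob_active(T=T, **params))
```

Holding the transaction history fixed, the probability falls as $T$ moves further past the last transaction, which matches the intuition that a long silence signals churn.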
Figure 3: Transactions sequence for a customer i in restaurant j. (Timeline: the observed horizon ends at T, the end of the observation period, and is followed by a forecasted horizon of length t†.)
Next we derive the expected number of transactions for customer $i$ at restaurant $j$ in the
future $t^\dagger$ periods conditional on her past transaction pattern and experience, characterized by
$(K_{ij}, \mathbf{t}_{ij}, \boldsymbol{\delta}^d_{ij}, \mathbf{q}_{ij}, \mathbf{q}^d_j, \mathrm{Zip}_j, T)$. We denote this as $X_{ij}(t^\dagger)$. To visualize the sequence of the events,
we illustrate the timeline in Figure 3. For simplicity of notation, we use a $\dagger$ superscript to denote
measures associated with the forecasting horizon $t^\dagger$. In particular, we let $y^\dagger_{ijk}$ denote the function
of factors that may affect customer $i$’s churn rate at restaurant $j$ at the time of the customer’s $k$th
transaction after the end of the observation period $T$. Note that $y^\dagger_{ijk}$ is identical to $y_{ijk}$ given in
(2) with the modification that the subscript $k$ in $y^\dagger_{ijk}$ refers to the $k$th transaction after the end of
the observation time period $T$ rather than the $k$th transaction during the observation period $[0, T]$.
For ease of exposition, similar to $Y_{ijx}$ defined above, we let $Y^\dagger_{ijx} = \sum_{k=1}^{x} y^\dagger_{ijk}$. The expected number of
future transactions for customer $i$ after time $T$ is characterized in Lemma 3, below. For simplicity,
we assume that customers’ future experience at the focal restaurant and the perceived quality
of the adjacent restaurants stays the same as that of the last transaction during the observation
period $[0, T]$, i.e., $y^\dagger_{ijx} = y_{ijK_{ij}}$ for $x \in \mathbb{Z}^+$.6
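Lemma 3. For customer $i$ whose past transaction pattern and experience are characterized by
$(K_{ij}, \mathbf{t}_{ij}, \boldsymbol{\delta}^d_{ij}, \mathbf{q}_{ij}, \mathbf{q}^d_j, \mathrm{Zip}_j, T)$, the expected number of transactions in the future $t^\dagger$ periods is given by
$$E[X_{ij}(t^\dagger) \mid K_{ij}, \mathbf{t}_{ij}, \boldsymbol{\delta}^d_{ij}, \mathbf{q}_{ij}, \mathbf{q}^d_j, \mathrm{Zip}_j, T] = P^A_{ijT} \sum_{x=1}^{\infty} \left(\frac{Y_{ijK_{ij}} + b_\theta}{Y_{ijK_{ij}} + Y^\dagger_{ijx} + b_\theta}\right)^{a_\theta} B\left(\frac{t^\dagger}{t^\dagger + b_\lambda + T - t_{ij1}}; x, a_\lambda + K_{ij} - 1\right), \tag{8}$$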
6 Because our results show that the average recommendation scores are stationary over time for all selected restaurants based on the augmented Dickey-Fuller (ADF) test, this is a reasonable assumption for our application.
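where $B\left(\frac{t^\dagger}{t^\dagger + b_\lambda + T - t_{ij1}}; x, a_\lambda + K_{ij} - 1\right)$ is the cumulative distribution function of the beta
distribution with parameters $x$ and $a_\lambda + K_{ij} - 1$ evaluated at $\frac{t^\dagger}{t^\dagger + b_\lambda + T - t_{ij1}}$.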
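The series in (8) can be evaluated numerically by truncating the sum. The sketch below is one way to do so under stated assumptions: the function name, the truncation point `x_max`, and all inputs are hypothetical; the probability of being active is passed in as a number; and future churn factors are held constant, so that $Y^\dagger_{ijx} = x \cdot y$, as in the simplifying assumption above. To stay within the Python standard library, the beta CDF $B(p; x, a)$ is computed through the standard identity that it equals the probability of at least $x$ events under a negative binomial (gamma-mixed Poisson) count with shape $a$ and $p = t^\dagger/(t^\dagger + b)$.

```python
import math

def expected_future_transactions(p_active, Y_K, y_future, t_future, T, t_first, K,
                                 a_lam, b_lam, a_th, b_th, x_max=2000):
    """Truncated evaluation of the series in (8).

    p_active -- probability the customer is active at T (e.g. from (7))
    Y_K      -- cumulative churn factor over the observed transactions
    y_future -- churn factor assumed for every future transaction
    """
    a = a_lam + K - 1
    p = t_future / (t_future + b_lam + T - t_first)
    log_pmf = a * math.log(1.0 - p)   # log P(N = 0) for the negative binomial count
    cdf = 0.0
    total = 0.0
    for x in range(1, x_max + 1):
        cdf += math.exp(log_pmf)      # cdf is now P(N <= x - 1)
        tail = max(0.0, 1.0 - cdf)    # P(N >= x), equal to the beta CDF B(p; x, a)
        survive = ((Y_K + b_th) / (Y_K + x * y_future + b_th)) ** a_th
        total += survive * tail
        # Move from P(N = x - 1) to P(N = x): multiply by (a + x - 1) * p / x.
        log_pmf += math.log(a + x - 1) + math.log(p) - math.log(x)
    return p_active * total

# Illustrative inputs only: 28 past transactions over 100 weeks, 156-week horizon.
print(expected_future_transactions(p_active=0.8, Y_K=28.0, y_future=1.0,
                                   t_future=156.0, T=100.0, t_first=1.0, K=28,
                                   a_lam=0.35, b_lam=1.22, a_th=0.89, b_th=0.03))
```

A useful sanity check is the no-churn case (`y_future=0`, `p_active=1`), in which the sum reduces to the mean of the negative binomial count, $(a_\lambda + K_{ij} - 1)\, t^\dagger / (b_\lambda + T - t_{ij1})$.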
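For the ROI approximation, we next derive the expected incremental number of future
transactions that a customer can generate at a focal restaurant as her experience at the
focal restaurant or that at the corresponding adjacent restaurant changes. Consider a
hypothetical customer $i$ whose past transaction pattern and experience are characterized by
$(K_{ij}, \mathbf{t}_{ij}, \boldsymbol{\delta}^d_{ij}, \mathbf{q}_{ij}, \mathbf{q}^d_j, \mathrm{Zip}_j, T)$, as described above. We denote her incremental number of
transactions in the future $t^\dagger$ periods at the focal restaurant $j$ from improving her experience there by $\Delta\mathbf{q}_{ij}$
as $G_{FQ}(\Delta\mathbf{q}_{ij} \mid K_{ij}, \mathbf{t}_{ij}, \boldsymbol{\delta}^d_{ij}, \mathbf{q}_{ij}, \mathbf{q}^d_j, \mathrm{Zip}_j, T)$, where $\Delta\mathbf{q}_{ij}$ represents the vector of changes to customer
$i$’s experience at the focal restaurant $j$ over time and is given by $\Delta\mathbf{q}_{ij} = (\Delta q_{ij1}, \Delta q_{ij2}, \cdots, \Delta q_{ijK_{ij}})$.
We have
$$G_{FQ}(\Delta\mathbf{q}_{ij} \mid K_{ij}, \mathbf{t}_{ij}, \boldsymbol{\delta}^d_{ij}, \mathbf{q}_{ij}, \mathbf{q}^d_j, \mathrm{Zip}_j, T) = E[X_{ij}(t^\dagger) \mid K_{ij}, \mathbf{t}_{ij}, \boldsymbol{\delta}^d_{ij}, \mathbf{q}_{ij}, \mathbf{q}^d_j, \mathrm{Zip}_j, T] - E[X_{ij}(t^\dagger) \mid K_{ij}, \mathbf{t}_{ij}, \boldsymbol{\delta}^d_{ij}, \mathbf{q}_{ij} - \Delta\mathbf{q}_{ij}, \mathbf{q}^d_j, \mathrm{Zip}_j, T]. \tag{9}$$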
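Similarly, we denote the incremental number of transactions that customer $i$ generates in the future
$t^\dagger$ periods at the focal restaurant $j$ if the average perceived quality at the adjacent restaurants
increases by $\Delta\mathbf{q}^d_j$ as $G_{NQ}(\Delta\mathbf{q}^d_j \mid K_{ij}, \mathbf{t}_{ij}, \boldsymbol{\delta}^d_{ij}, \mathbf{q}_{ij}, \mathbf{q}^d_j, \mathrm{Zip}_j, T)$, where $\Delta\mathbf{q}^d_j$ represents the vector of
changes in the average perceived customer experience of the adjacent restaurants over time given
by $\Delta\mathbf{q}^d_j = (\Delta q_{j1}, \Delta q_{j2}, \cdots, \Delta q_{jK_{ij}})$. We then have
$$G_{NQ}(\Delta\mathbf{q}^d_j \mid K_{ij}, \mathbf{t}_{ij}, \boldsymbol{\delta}^d_{ij}, \mathbf{q}_{ij}, \mathbf{q}^d_j, \mathrm{Zip}_j, T) = E[X_{ij}(t^\dagger) \mid K_{ij}, \mathbf{t}_{ij}, \boldsymbol{\delta}^d_{ij}, \mathbf{q}_{ij}, \mathbf{q}^d_j, \mathrm{Zip}_j, T] - E[X_{ij}(t^\dagger) \mid K_{ij}, \mathbf{t}_{ij}, \boldsymbol{\delta}^d_{ij}, \mathbf{q}_{ij}, \mathbf{q}^d_j - \Delta\mathbf{q}^d_j, \mathrm{Zip}_j, T]. \tag{10}$$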
7.2. Counterfactual Results
We will now use the expected incremental number of future transactions in response to change
in customer experience derived above to (a) identify the customers who can generate the highest
incremental value due to their improved customer experience for the focal restaurant (thus, the
focal restaurant can target such customers with its marketing or service campaigns to maximize its
ROI) and (b) identify the restaurants that can deliver the highest incremental value for the entire
network of restaurants from a supply chain improvement (thus, the supply chain orchestrator can
select such restaurants to invest in first).
Identify customers with the highest incremental value. We start with the identification of
the customers who can generate the highest incremental value for a given restaurant. We consider an
observation window of 100 weeks, i.e., T = 100, and customers with the number of past transactions
from 0 to 60. Following the terminology used in the literature, we refer to the time of a customer’s
last transaction as her recency, while we refer to the customer’s total number of transactions as her
frequency. We focus on restaurants in ZIP code 4 and customers who have not visited any of the
corresponding adjacent restaurants.7 For any such customer with a given frequency and recency,
using (9), we can compute her incremental number of transactions in the future three years (156
weeks),8 when we improve the customer’s experience at the focal restaurant from 8 to 9, conditional
on the average perceived customer experience across the adjacent restaurants’ being 8.9 We present
our results using a two-dimensional contour plot, as seen in Figure 4.
Figure 4: Iso-curves for incremental expected number of future transactions for customers in ZIP Code 4 who have not visited adjacent restaurants when customer experience of focal restaurant changes from 8 to 9 and the network quality equals 8. (Horizontal axis: recency, time of last transaction in weeks; vertical axis: frequency, number of past transactions; color bar from 0.02 to 0.16.)
Each dot in Figure 4 represents a hypothetical customer with a particular frequency and recency.
All of the customers on a given iso-curve share the same expected incremental number of future
transactions at the focal restaurant, which is given in the color bar on the right side of the plot.
Note that, if a customer falls between two iso-curves, her incremental expected number of future
transactions falls between the values associated with the two corresponding iso-curves. For example,
the (red) dot represents a customer who visited the restaurant 28 times during the observation
7 The same logic applies to restaurants in other ZIP codes and customers who have visited the adjacent restaurants.
8 Based on our conversation with the managers, we choose to use the incremental number of transactions in the future three years instead of the classic discounted expected transactions (DET), proposed in Fader et al. (2005b), for better forecasting reliability. Our qualitative insights, however, continue to hold if we use DET.
9 Note that we consider a customer experience of 8 points as a benchmark. This is because most of the restaurants in our application have a customer experience score of 8 or above.
period of 100 weeks and whose last visit was in week 90.10 For such a customer, the expected
number of future transactions at the focal restaurant over the next three years
increases by 0.158 if her experience at the focal restaurant changes from 8 to 9. Based
on the values associated with the iso-curves, the customers whose recency-and-frequency profiles
fall inside the lightest area (yellow regions) respond most strongly to an improvement in their
experience at the focal restaurant. Hence, the focal restaurant should prioritize these customers
in its marketing efforts, e.g., by sending them targeted coupons and special offers.
To develop more intuition about the results presented in Figure 4, we next demonstrate how a
customer’s incremental value changes with her frequency conditional on recency. To this end, we
plot the incremental number of future transactions as a function of frequency (Kij) for two selected
recency values (tijKij = 80 and 95); see Figure 5. First, note the initial increasing trend that we
observe for both values of recency. Low frequency implies that the customer makes few transactions
with the firm and hence, the number of future transactions in the next three years is low. Even
if such a customer is exposed to increased customer experience, which will reduce the probability
of churning, the incremental number of transactions will be low. As the frequency increases, the
customer is likely to make more transactions in the next three years if she stays active, and, thus,
improvement in customer experience will lead to a higher incremental improvement in the number
of future transactions. Note now the decreasing trend that we observe on the curve with relatively
low recency (i.e., tijKij = 80). When the frequency is high, but the last visit was a long time ago
(in our case 20 weeks ago), it is likely that such a customer is no longer an active customer of
the restaurant, and, hence, a customer experience improvement will not lead to a high incremental
number of transactions. As frequency gets even higher, it is even more likely that the customer is
no longer active and, thus, we observe the decreasing trend. When the recency is high (tijKij = 95),
the customer is likely to be active with the restaurant regardless of her frequency. Hence, the
decreasing trend is not present. We observe a similar pattern regarding how customers’ incremental
value changes with respect to recency for fixed frequencies.
Identify restaurants with the highest incremental value. We next demonstrate how the
supply chain orchestrator can use our methodology and results to prioritize the supply chain
improvement across different restaurants to maximize its ROI. Note that improving customer expe-
rience at a focal restaurant not only generates an incremental number of future transactions at
the focal restaurant but also improves the revenue of the adjacent restaurants through the net-
work effect. In particular, the total incremental number of transactions generated in the network
10 Note that the average customer frequency that we observe in the data is 28 in our observation window of around 100 weeks. Thus, we choose the frequency 28 for a representative customer.
Figure 5: The expected incremental number of transactions in the next three years that customers, who have not visited any restaurants in the neighborhood of 1 mile, generate at a focal restaurant from ZIP Code 4, as their experience at the focal restaurant improves from 8 to 9, conditional on the average perceived customer experience of the adjacent restaurants’ being 8. (Horizontal axis: frequency, number of past transactions; vertical axis: incremental future number of transactions; one curve each for time of last transaction at week 80 and week 95.)
of restaurants by improving customer experience at a focal restaurant is the sum of the incremen-
tal number of transactions generated at the focal restaurant and that generated at the adjacent
restaurants.
Recall that, for any given customer of a particular frequency and recency at a focal restaurant, we
can quantify the incremental number of future transactions that the customer generates at the focal
restaurant as her experience at the focal restaurant improves, using (9). We have demonstrated
our results in Figure 4. We can thus obtain the total incremental number of transactions generated
at a focal restaurant by summing over the incremental number of future transactions across all of
the customers at the focal restaurant.
We next quantify the total incremental number of future transactions generated at the adja-
cent restaurants by improving the focal restaurant’s customer experience. Mathematically, this is
equivalent to the total incremental number of future transactions generated at a focal restaurant
by improving the average customer experience of its adjacent restaurants, which we can quantify
by using (10). For each customer with given recency and frequency, we present the incremental
number of future transactions at the focal store as the perceived average customer experience of the
adjacent stores increases from 8 to 9 (conditional on customer experience of the focal store’s being
8), as seen in Figure 6. Note that Figure 6a shows the results for customers who have visited only
the focal restaurant but none of its adjacent restaurants, while Figure 6b shows the results for those
who have visited the focal restaurant and at least one of its adjacent restaurants. The incremental
value shown in Figure 6a is purely through the network effect due to word-of-mouth, while the
incremental value shown in Figure 6b is the total network effect through both the word-of-mouth
channel and customers’ personal experience channel.
Consider a customer with 28 total transactions at the focal restaurant in the past 100 weeks
who made her last transaction at the focal restaurant in week 90 (indicated by the red dot). If the
customer has not visited any of the adjacent stores, she may generate 0.086 additional transactions
at the focal restaurant as the average perceived customer experience of the adjacent restaurants
improves from 8 to 9. As noted earlier, the network effect is even larger if the customer has visited
the adjacent restaurants. The corresponding number of additional transactions at the focal restaurant
is 0.297, which is 245 percent higher than in the case when she has not visited any of the adjacent
restaurants. Our results above underscore the importance of accounting for the network effect for
the supply chain orchestrator. In particular, the supply chain orchestrator may underestimate the
ROI from improving a focal restaurant’s supply chain quality if it does not account for
the network effect.
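The 245 percent figure follows directly from the two per-customer values reported above, as the short check below shows; the variable names are chosen for this illustration only.

```python
no_visit_gain = 0.086   # reported above: customer has not visited adjacent restaurants
visited_gain = 0.297    # reported above: customer has visited adjacent restaurants

# Relative increase of the visited case over the not-visited case.
relative_increase = (visited_gain - no_visit_gain) / no_visit_gain
print(f"{relative_increase:.0%}")
```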
Figure 6: The expected number of future transactions for customers at a focal restaurant in ZIP Code 4 when the average customer experience of its adjacent restaurants increases from 8 to 9. (a) For customers who have not visited neighbors (color bar from 0.02 to 0.16). (b) For customers who have visited neighbors (color bar from 0.05 to 0.3). (Horizontal axis: recency, time of last transaction in weeks; vertical axis: frequency, number of past transactions.)
We have described how we quantify the total incremental value for the entire restaurant chain
through improving customer experience at a focal restaurant. To illustrate how this total incremen-
tal value changes with the perceived customer experience of the adjacent restaurants, we consider
the following hypothetical restaurant network. The network is comprised of two restaurants located
in ZIP Code 4. Each restaurant has only the other restaurant as its neighbor within 1 mile. We
assume that both restaurants have 5,000 customers who had 28 visits in the past 100 weeks and
made their last transaction at the 90th week. All of these customers have visited only one restaurant
by the end of the observation period. For ease of reference, we refer to one of the restaurants as
the focal restaurant and the other as the adjacent restaurant. For any fixed perceived customer
experience of the adjacent restaurant, we calculate the total incremental number of transactions
generated in the focal restaurant and that generated in the adjacent restaurant as customer expe-
rience at the focal restaurant improves from 8 to 9. In Figure 7, we present the total incremental
number of transactions generated at the focal restaurant, that generated at the adjacent restaurant,
and that generated in the entire network of restaurants as a function of the perceived customer
experience at the adjacent restaurant.
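As a back-of-the-envelope illustration of this aggregation, the sketch below combines the per-customer figures reported earlier for the representative customer (28 visits, last visit in week 90, no visits to the other restaurant) at the single point where the adjacent restaurant's experience equals 8. It assumes that each of the two restaurants has 5,000 such customers; the variable names are hypothetical, and the result is a rough check rather than a value read from Figure 7.

```python
customers_per_restaurant = 5000

# Per-customer gains reported earlier for the representative customer.
gain_at_focal = 0.158      # focal experience improves from 8 to 9 (Figure 4 example)
gain_at_adjacent = 0.086   # neighbor's experience improves from 8 to 9 (Figure 6a example)

focal_total = customers_per_restaurant * gain_at_focal
adjacent_total = customers_per_restaurant * gain_at_adjacent
network_total = focal_total + adjacent_total

print(f"focal: {focal_total:.0f}, adjacent: {adjacent_total:.0f}, network: {network_total:.0f}")
```

Under these assumptions the focal restaurant gains about 790 transactions, the adjacent restaurant about 430, and the network about 1,220 over the three-year horizon, so ignoring the adjacent restaurant would miss roughly a third of the total.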
Figure 7 shows that improving customer experience at the focal restaurant generates not only
an incremental number of transactions at the focal store itself, but also additional transactions for
the adjacent restaurant. Moreover, the incremental value for the focal restaurant and that for the
adjacent restaurant (from the improved customer experience at the focal restaurant) both increase
with the perceived customer experience of the adjacent restaurant due to the dominating reputation
spillover effect. Hence, firms should prioritize restaurants with adjacent restaurants that provide
higher customer experience.
Figure 7: Total incremental number of transactions for the networks from ZIP Code 4 as customer experience at the focal restaurant changes from 8 to 9. (Horizontal axis: customer experience at adjacent restaurants; vertical axis: incremental number of future transactions. Series: expected total incremental number of transactions in the future 3 years at the focal restaurant, at the adjacent restaurant, and within the restaurant chain.)
We have demonstrated that, for a fixed ZIP Code market, the supply chain orchestrator should
prioritize serving the restaurant with adjacent restaurants of higher quality. We next show how
our model helps the supply chain orchestrator to prioritize its service across different ZIP Code
markets.
We consider a second network which is identical to the network described above, with one
modification: Both restaurants in this network are located in ZIP Code 2 instead of ZIP Code 4.
We plot the total incremental number of future transactions (generated at the entire network of
the restaurants) as a function of the perceived customer experience at the adjacent restaurant (for
both networks) in Figure 8. Our results show that the ROI is higher from improving customer
experience at the focal restaurant from ZIP Code 4 than that from ZIP Code 2. This is because,
based on our estimation results in Section 5.2, customers of restaurants in ZIP Code 4 have a lower
churn rate compared to those of restaurants in ZIP Code 2 on average. Thus, our model also can
be used by the supply chain orchestrator to identify the right market to prioritize based on the
unobservable market characteristics.
[Figure 8 (line plot). Horizontal axis: Customer Experience at Adjacent Restaurants (0 to 10). Vertical axis: Incremental Number of Future Transactions (0 to 3500). Series: expected total incremental number of transactions within the restaurant chain in the future 3 years (i) at Zip 4 and (ii) at Zip 2.]
Figure 8: Total incremental number of transactions for the two networks described above when the customer experience at the focal restaurant changes from 8 to 9.
In summary, our counterfactual study demonstrates how a focal restaurant can use our model
to quantify the incremental value from improving an individual customer’s experience, and how
we can help the supply chain orchestrator to quantify the total incremental value by improving
a focal restaurant’s customer experience. Based on our results, we show that the focal restaurant
should prioritize customers whose last visit to the restaurant was neither too long ago nor too
recent and whose total number of past transactions with the restaurant is neither too high nor
too low, for its marketing campaign or premium services. Moreover, the supply chain orchestrator
should prioritize the restaurants with a high percentage of customers with the transaction pattern
characterized above and the restaurants with high quality neighbors.
8. Conclusion
In this study, we explore a quality value chain network which includes all of the stages in a supply
chain, namely, manufacturers, distributors, and a network of firms that share the same brand
and their individual end consumers. Compared to the classic profit chain concept proposed in
Heskett and Schlesinger (1997), we take a more holistic view in our framework by considering (1)
the impact of vertically-related supply chain partners on customer experience at the downstream
retailer firms, and (2) the impact from the horizontally-related firms through a common brand on
customer purchasing behavior at a focal firm. In particular, we develop a two-phase framework
and methodology to manage the quality value chain network. In the first phase, we identify the
operational factors (in our application, supply chain quality issues) that have the largest impact
on customer experience at the firms. In the second phase, we explore how customer experience at
a network of firms that share the same brand affects customers' future purchasing behavior at a
focal firm. As a result, our framework provides insights regarding (1) which operational initiatives
need to be prioritized for improvement to have the largest impact on customer experience, (2)
which firms, when improved, provide the largest ROI for the brand in the B2B setting, and (3)
which customers respond most strongly to customer experience improvement and can be selected
for targeted campaigns in the B2C setting.
In our application to a major fast food restaurant chain, we identify three main supply chain
issues, i.e., Freshness, Packaging, and Delivery. Among these issues, Freshness has the largest
impact on customer experience at the restaurants. In particular, reducing one complaint related
to Freshness per month can improve customer experience score by 0.44 points out of 10. In regard
to how customer experience affects customer purchasing behavior, we show that it is important
to consider the network effect, which significantly improves the model performance both in terms
of goodness-of-fit and forecasting accuracy. We further show that the reputation spillover effect
among the adjacent stores dominates the competition effect. Namely, higher customer experience
offered by adjacent stores may improve customers’ loyalty to the focal restaurant instead of luring
away customers from the focal store.
Based on the mechanism that we established for how a customer's experience affects her
purchasing behavior, we then conduct a counterfactual analysis. For any given customer, we quantify
the incremental number of transactions that the customer will generate, based on her past
transaction patterns, if the customer's experience at the focal restaurant improves. We show that, to
optimize ROI, the firm should focus its customer experience improvement efforts on the customers
whose last visit to the restaurant was neither too long ago nor too recent and whose number of
total transactions with the firm is neither too high nor too low. At the restaurant level, the firm
should prioritize the supply chain improvement for the restaurants that have neighbors that offer
better customer experience and a high percentage of customers with the profile identified above.
It is worth mentioning that the ROI from supply chain improvement at a focal restaurant includes
both the incremental value generated directly at the focal restaurant and the incremental value
generated at its adjacent stores due to the network effect. We demonstrate that the incremental value
generated at the adjacent stores due to the network effect can be a significant portion of the total
incremental value generated within the entire restaurant chain. This implies that it is important
to consider the network effect to quantify the financial return more accurately and that the results
without the network effect can be considerably misleading.
Acknowledgments
The authors gratefully acknowledge WCAI and an anonymous data sponsor for providing the data and for
their tremendous support for this research.
References
Abrahams, Alan S, W. Fan, G. Alan Wang, Z. J. Zhang, J. Jiao. 2015. An integrated text analytic framework
for product defect discovery. Production and Operations Management 24(6) 975–990.
Afeche, Philipp, M. Araghi, O. Baron. 2015. Customer acquisition, retention, and queueing-related service
quality: Optimal advertising, staffing, and priorities for a call center .
Aflaki, Sam, I. Popescu. 2014. Managing retention in service relationships. Management Science 60(2)
415–433.
Aksin, Zeynep, B. Ata, S. M. Emadi, C.-L. Su. 2013. Structural estimation of callers’ delay sensitivity in
call centers. Management Science 59(12) 2727–2746.
Aksoy-Pierson, Margaret, G. Allon, A. Federgruen. 2013. Price competition under mixed multinomial logit
demand functions. Management Science 59(8) 1817–1835.
Allon, Gad, A. Federgruen. 2009. Competition in service industries with segmented markets. Management
Science 55(4) 619–634.
Allon, Gad, A. Federgruen, M. Pierson. 2011. How much is a reduction of your customers’ wait worth? an
empirical study of the fast-food drive-thru industry based on structural estimation methods.
Manufacturing & Service Operations Management 13(4) 489–507.
Anderson, Eugene W, V. Mittal. 2000. Strengthening the satisfaction-profit chain. Journal of service research
3(2) 107–120.
Arora, Swarnjit. 1973. Error components regression models and their applications. Annals of Economic and
Social Measurement, Volume 2, number 4 . NBER, 451–461.
Batt, Robert J., C. Terwiesch. 2015. Waiting patiently: An empirical study of queue abandonment in an
emergency department. Management Science 61(1) 39–59.
Blei, David M. 2012. Probabilistic topic models. Communications of the ACM 55(4) 77–84.
Blei, David M, J. D. Lafferty. 2009. Topic models. Text mining: classification, clustering, and applications
10(71) 34.
Bolton, Ruth N. 1998. A dynamic model of the duration of the customer’s relationship with a continuous
service provider: The role of satisfaction. Marketing science 17(1) 45–65.
Bowman, Douglas, D. Narayandas. 2004. Linking customer management effort to customer profitability in
business markets. Journal of Marketing Research 41(4) 433–447.
Braun, Michael, D. A. Schweidel, E. Stein. 2015. Transaction attributes and customer valuation. Journal of
Marketing Research 52(6) 848–864.
Buell, Ryan W, D. Campbell, F. X. Frei. 2016. How do customers respond to increased service quality
competition? Manufacturing & Service Operations Management 18(4) 585–607.
Burnham, Kenneth P, D. R. Anderson. 2003. Model selection and multimodel inference: a practical
information-theoretic approach. Springer Science & Business Media.
Cachon, Gerard P, S. Gallino, M. Olivares. 2013. Does adding inventory increase sales? evidence of a scarcity
effect in us automobile dealerships. Columbia Business School Research Paper No. 13-60 .
Cachon, Gerard P, P. T. Harker. 2002. Competition and outsourcing with scale economies. Management
Science 48(10) 1314–1333.
Cachon, Gerard P, M. A. Lariviere. 2001. Turning the supply chain into a revenue chain. Harvard Business
Review March.
Cohen, Morris A, S. Whang. 1997. Competing in product and service: a product life-cycle model. Management
science 43(4) 535–545.
Coyle, John J, C. Langley, B. Gibson, R. Novack, E. Bardi. 2008. Supply Chain Management: A Logistics
Perspective. Cengage Learning. URL https://books.google.com/books?id=TbuZnzcXGysC.
Craig, Nathan, N. DeHoratius, A. Raman. 2016. The impact of supplier inventory service level on retailer
demand. Manufacturing & Service Operations Management 18(4) 461–474.
Cronin Jr, J Joseph, S. A. Taylor. 1992. Measuring service quality: a reexamination and extension. The
journal of marketing 56 55–68.
Fader, Peter S, B. G. Hardie, K. L. Lee. 2005a. Counting your customers the easy way: An alternative to
the Pareto/NBD model. Marketing Science 24 275–284.
Fader, Peter S, B. G. Hardie, K. L. Lee. 2005b. RFM and CLV: Using iso-value curves for customer base
analysis. Journal of Marketing Research 42(4) 415–430.
Fitzmaurice, Garrett M, N. M. Laird, J. H. Ware. 2012. Applied longitudinal analysis, vol. 998. John Wiley
& Sons.
Forrester. 2016. The Customer Life-Cycle Marketing Playbook For 2016. Tech. rep., Forrester.
Gans, Noah. 2002. Customer loyalty and supplier quality competition. Management Science 48(2) 207–221.
Guajardo, Jose A, M. A. Cohen, S. Netessine. 2015. Service competition and product quality in the us
automobile industry. Management Science 62(7) 1860–1877.
Hall, David, D. Jurafsky, C. D. Manning. 2008. Studying the history of ideas using topic models. Proceedings
of the conference on empirical methods in natural language processing . Association for Computational
Linguistics, 363–371.
Heskett, James L, W. E. Sasser Jr., L. A. Schlesinger. 1997. The Service Profit Chain: How Leading Companies Link
Profit and Growth to Loyalty, Satisfaction, and Value. New York: Free Press.
Ho, Teck-Hua, Y.-H. Park, Y.-P. Zhou. 2006. Incorporating satisfaction into customer value analysis: Optimal
investment in lifetime value. Marketing Science 25(3) 260–277.
Horowitz, Joel L, J. Heckman, E. Leamer. 2001. Handbook of econometrics .
Hu, Minqing, B. Liu. 2004. Mining and summarizing customer reviews. Proceedings of the tenth ACM
SIGKDD international conference on Knowledge discovery and data mining . ACM, 168–177.
Hwang, Jinsoo, J. Zhao. 2010. Factors influencing customer satisfaction or dissatisfaction in the restaurant
business using answertree methodology. Journal of Quality Assurance in Hospitality & Tourism 11(2)
93–110.
Hyken, Shep. 2016. Ten Customer Service And Customer Experience Trends For 2017. Tech. rep., Forbes.
Kaynak, Hale, J. L. Hartley. 2008. A replication and extension of quality management into the supply chain.
Journal of Operations Management 26(4) 468–489. Special Issue: Research in Supply Chain Quality.
Kesavan, Saravanan, V. Deshpande, H. S. Lee. 2014. Increasing sales by managing congestion in self-service
environments: Evidence from a field experiment .
Kim, Woo Gon, C. Y. N. Ng, Y. soon Kim. 2009. Influence of institutional DINESERV on customer
satisfaction, return intention, and word-of-mouth. International Journal of Hospitality Management
28(1) 10–17.
Knox, George, R. van Oest. 2014. Customer complaints and recovery effectiveness: A customer base approach.
Journal of Marketing 78(5) 42–57.
Knutson, Bonnie J. 2000. College students and fast food. Cornell Hotel and Restaurant Administration
Quarterly 41(3) 68–74.
Kumar, V, I. Dalla Pozza, J. Ganesh. 2013. Revisiting the satisfaction–loyalty relationship: empirical
generalizations and directions for future research. Journal of Retailing 89(3) 246–262.
Kumar, V., W. Reinartz. 2016. Creating enduring customer value. Journal of Marketing 80(6) 36–68.
Lehmann, Donald R, J. O’shaughnessy. 1974. Difference in attribute importance for different industrial
products. The Journal of Marketing 36–42.
Loveman, Gary W. 1998. Employee satisfaction, customer loyalty, and financial performance: an empirical
examination of the service profit chain in retail banking. Journal of Service Research 1(1) 18–31.
Lu, Yina, A. Musalem, M. Olivares, A. Schilkrut. 2013. Measuring the effect of queues on customer purchases.
Management Science 59(8) 1743–1763.
Maynes, Joel, A. Rawson. 2016. Linking the customer experience to value. Tech. rep., McKinsey & Company.
Mendelson, Haim, K. Moon. 2017. Growth and customer loyalty: Evidence from the app economy. Stanford
University Graduate School of Business Working Paper .
Mimno, David, H. M. Wallach, E. Talley, M. Leenders, A. McCallum. 2011. Optimizing semantic coherence
in topic models. Proceedings of the Conference on Empirical Methods in Natural Language Processing .
Association for Computational Linguistics, 262–272.
Min, Hokey, H. Min. 2011. Benchmarking the service quality of fast-food restaurant franchises in the usa: A
longitudinal study. Benchmarking: An International Journal 18(2) 282–300.
Musalem, Andres, M. Olivares, A. Schilkrut. 2016. Retail in high definition: Monitoring customer assistance
through video analytics. Working Paper .
Olsen, Tava Lennon, R. P. Parker. 2008. Inventory management under market size dynamics. Management
Science 54(10) 1805–1821.
Parasuraman, Anantharanthan, V. A. Zeithaml, L. L. Berry. 1985. A conceptual model of service quality
and its implications for future research. the Journal of Marketing 41–50.
Pingitore, Gina, N. Morgan, L. Rego, A. Gigliotti, J. Meyers. 2007. The single-question trap: The net
promoter score has limitations in predicting financial performance. Marketing Research 19 9–13.
Reichheld, Frederick F. 2003. The one number you need to grow. Harvard business review 81(12) 46–55.
Schmittlein, David C, D. G. Morrison, R. Colombo. 1987. Counting your customers: Who-are they and what
will they do next? Management science 33(1) 1–24.
Shital Chheda, Ewan Duncan, S. Roggenhofer. 2017. Putting customer experience at the heart of next-
generation operating models. Tech. rep., McKinsey & Company.
Tan, Tom Fangyun, S. Netessine. 2014. When does the devil make work? an empirical study of the impact
of workload on worker productivity. Management Science 60(6) 1574–1593.
Tirunillai, Seshadri, G. J. Tellis. 2014. Mining marketing meaning from online chatter: Strategic brand
analysis of big data using latent dirichlet allocation. Journal of Marketing Research 51(4) 463–479.
Verhoef, Peter C. 2003. Understanding the effect of customer relationship management efforts on customer
retention and customer share development. Journal of marketing 67(4) 30–45.
Wallach, Hanna M, D. M. Mimno, A. McCallum. 2009. Rethinking lda: Why priors matter. Advances in
neural information processing systems. 1973–1981.
Wooldridge, Jeffrey M. 2001. Econometric Analysis of Cross Section and Panel Data, MIT Press Books,
vol. 1. The MIT Press.
Yu, Qiuping, G. Allon, A. Bassamboo. 2017. How do delay announcements shape customer behavior? An
empirical study. Management Science 63(1) 1–20.
Zeithaml, Valarie A. 2000. Service quality, profitability, and the economic worth of customers: what we know
and what we need to learn. Journal of the academy of marketing science 28(1) 67–85.
Appendix A: Supply Chain Data Analysis
| Topic | Keywords in the Prior | Keywords after LDA |
|---|---|---|
| Freshness | Apple*, Areas, Avocado, Bad, Baking, Black, Blue, Bread, Breast, Brown, Browning, Caramel, Carmel, Cheese, Chewy, Chicken, Package Item Y, Collapse, Cookies, Crunchy, Cucumbers, Dark, Decay, Degrees, Diet, Discolored, Do not like, Dots, Dough, Exp, Expired, Fatty, Flavor, Frozen, Gray, Green, Grey, Gristle, Hard, Hour*, Lettuce, Life, Like, Looks, Mold*, Mushy, Onion*, Overripe, Patties, Peppers, Pink, Plain, Proofing, Red*, Ripe, Rise, Rotten, Salty, Shelf, Shelf life, Shiny, Shrink, Sliced, Slimy, Smell, Soft, Soggy, Sour, Spoiled, Spotted, Spreading, Stale, Sticky, Stored, Taste, Temperature, Thermometer, Tomato*, Too Dark, Transparent, Tuna, Underripe*, Watery, Wet, White, Wrinkled*, Yellow | Apples, Avocado, Black, Breast, Brown, Cheese, Chicken, Cookies, Crunchy, Cucumbers, Dark, Degrees, Discolored, Dots, Frozen, Green, Hour, Hours, Lettuce, Looks, Mold, Moldy, Mushy, Onions, Overripe, Patties, Peppers, Pink, Red, Ripe, Rotten, Slimy, Smell, Soft, Soggy, Spotted, Sticky, Stored, Temperature, Thermometer, Tomato*, Tuna, Underripe*, Watery, Wet, White, Yellow |
| Packaging | Bag*, Bloated*, Box*, Busted, Case, Casebags, Cases, Chips, Package Item Y, Cookies, Cooler, Crushed, Cut, Damage*, Degrees, Package Item X*, Edges, Empty, Filled, Fold, Frame, Full, Handle, Holes, Inconsistent, Ink, Large, Leaking, Length, Lids, Missing, Only, Open, Overfill, Package, Seal, Seam, Short, Small, Smashed, Storage, Tabs, Tapeflaps, Tear, Temperature, Thermometer, Thick, Thin, Torn, Underfill, Underweight, Weight | Bag*, Bloated*, Box*, Busted, Case*, Chips, Cookies, Cooler, Cut, Damaged, Degrees, Package Item X*, Filled, Frame, Large, Leaking, Missing, Open, Seal, Seam, Short, Small, Smashed, Storage, Tapeflaps, Temperature, Thermometer, Thick, Thin, Torn |
| Delivery | Car, Date, Degrees, Deliver*, Driver, Exp, Hour, Late, Outside, Present, Saturday, Thermometer, Time, Truck | Car, Date, Degrees, Deliver*, Driver, Hour, Late, Outside, Present, Saturday, Thermometer, Time, Truck |
Table 7: Keywords in the prior (second column) and after the LDA application (third column) for topic modeling that identify main supply chain quality issues. Notes: * implies different forms of the word, e.g., Spot* includes the words “spot,” “spotty,” “spots,” “spotted,” and so forth. Meanwhile, to protect the confidentiality of the data provider, we do not list any trademarked terms and have replaced two such terms with “Package Item X” and “Package Item Y”, respectively.
Appendix B: Proofs
Before we begin the proofs for the lemmas in the paper, we would like to provide a few useful formulas for
all the proofs below.
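• Recall that we have $\lambda \sim \mathrm{Gamma}(a_\lambda, b_\lambda)$, where $a_\lambda$ and $b_\lambda$ are the shape and rate parameters, respectively. Thus, the pdf of $\lambda$, denoted as $f_\lambda(\lambda \mid a_\lambda, b_\lambda)$, is given as follows:
$$f_\lambda(\lambda \mid a_\lambda, b_\lambda) = \frac{b_\lambda^{a_\lambda}}{\Gamma(a_\lambda)} \lambda^{a_\lambda - 1} \exp(-\lambda b_\lambda) \tag{11}$$
• Meanwhile, we have $\theta \sim \mathrm{Gamma}(a_\theta, b_\theta)$, where $a_\theta$ and $b_\theta$ are the shape and rate parameters, respectively. Thus, the pdf of $\theta$, denoted as $f_\theta(\theta \mid a_\theta, b_\theta)$, is given as follows:
$$f_\theta(\theta \mid a_\theta, b_\theta) = \frac{b_\theta^{a_\theta}}{\Gamma(a_\theta)} \theta^{a_\theta - 1} \exp(-\theta b_\theta) \tag{12}$$
• Using the moment generating function of a gamma distributed random variable, we have
$$E_\lambda[e^{c\lambda}] = \frac{b_\lambda^{a_\lambda}}{(b_\lambda - c)^{a_\lambda}} \quad \text{and} \quad E_\theta[e^{c\theta}] = \frac{b_\theta^{a_\theta}}{(b_\theta - c)^{a_\theta}} \tag{13}$$
• The gamma function $\Gamma(k)$ and the lower incomplete gamma function $\gamma(k, x)$ are defined as follows:
$$\Gamma(k) = \int_0^\infty t^{k-1} e^{-t}\,dt \quad \text{and} \quad \gamma(k, x) = \int_0^x t^{k-1} e^{-t}\,dt \tag{14}$$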
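As a quick sanity check on the shape/rate convention used in (11) to (13), the following minimal Python sketch compares the closed-form moment generating function in (13) with a numerical integral of $e^{cx}$ against the density in (11). The function names, the integration grid, and the parameter values are illustrative assumptions and are not taken from the paper.

```python
import math

def gamma_pdf(x, shape, rate):
    """Gamma density with shape/rate parameterization, as in (11)-(12)."""
    return math.exp(shape * math.log(rate) - math.lgamma(shape)
                    + (shape - 1.0) * math.log(x) - rate * x)

def mgf_closed_form(c, shape, rate):
    """E[exp(c * X)] from (13); finite only for c < rate."""
    if c >= rate:
        raise ValueError("the MGF is finite only for c < rate")
    return (rate / (rate - c)) ** shape

def mgf_numeric(c, shape, rate, upper=200.0, steps=200_000):
    """Midpoint-rule approximation of the integral of exp(c*x) * pdf(x) on (0, upper)."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += math.exp(c * x) * gamma_pdf(x, shape, rate)
    return total * h

# Illustrative values only (hypothetical, not estimates from the paper).
shape, rate, c = 2.5, 1.5, -0.7
closed = mgf_closed_form(c, shape, rate)
numeric = mgf_numeric(c, shape, rate)
```

The proofs below only ever evaluate (13) at negative arguments such as $c = -Y_{ijK_{ij}}$, so the condition $c < b$ holds automatically there.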
Note that all of the proofs below are inspired by, or are special cases of, the proofs in Braun et al. (2015).
Proof of Lemma 1 in Section 4.2.2.
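Consider a customer $i$ whose transaction pattern during the observation period $[0, T]$ is characterized
by $\Omega_{ij} = (K_{ij}, t_{ij}, \delta^d_{ij}, q_{ij}, q^d_j, \mathrm{Zip}_j, T)$. Recall that we have $p_{ijk} = 1 - \exp(-\theta y_{ijk})$. Thus, we can simplify the
likelihood function (4) by replacing $p_{ijk}$ with $1 - \exp(-\theta y_{ijk})$. We get
$$\begin{aligned}
L_{ij}(\Omega_{ij} \mid \lambda, \theta, \cdot) ={}& \lambda^{K_{ij}-1} \exp\left(-\lambda(t_{ijK_{ij}} - t_{ij1}) - \theta Y_{i,j,K_{ij}-1}\right) \\
&- \lambda^{K_{ij}-1} \exp\left(-\lambda(t_{ijK_{ij}} - t_{ij1}) - \theta Y_{ijK_{ij}}\right) \\
&+ \lambda^{K_{ij}-1} \exp\left(-\lambda(T - t_{ij1}) - \theta Y_{ijK_{ij}}\right),
\end{aligned}$$
where $Y_{ijk} = \sum_{m=1}^{k} y_{ijm}$. Note that $\lambda$ and $\theta$ are independent. By taking the expectation over the random
variables $\lambda$ and $\theta$, we have
$$\begin{aligned}
E_{\lambda,\theta}[L_{ij}] ={}& E_\lambda\left[\lambda^{K_{ij}-1} e^{-\lambda(t_{ijK_{ij}} - t_{ij1})}\right] E_\theta\left[e^{-\theta Y_{i,j,K_{ij}-1}}\right]
- E_\lambda\left[\lambda^{K_{ij}-1} e^{-\lambda(t_{ijK_{ij}} - t_{ij1})}\right] E_\theta\left[e^{-\theta Y_{ijK_{ij}}}\right] \\
&+ E_\lambda\left[\lambda^{K_{ij}-1} e^{-\lambda(T - t_{ij1})}\right] E_\theta\left[e^{-\theta Y_{ijK_{ij}}}\right].
\end{aligned} \tag{15}$$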
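In particular, we have
$$\begin{aligned}
E_\lambda\left[\lambda^{K_{ij}-1} e^{-\lambda(t_{ijK_{ij}} - t_{ij1})}\right]
&= \int_0^\infty \lambda^{K_{ij}-1} e^{-\lambda(t_{ijK_{ij}} - t_{ij1})} f_\lambda(\lambda \mid a_\lambda, b_\lambda)\,d\lambda \\
&= \frac{b_\lambda^{a_\lambda}}{\Gamma(a_\lambda)} \int_0^\infty \lambda^{a_\lambda + K_{ij} - 2} e^{-\lambda(b_\lambda + t_{ijK_{ij}} - t_{ij1})}\,d\lambda \\
&= \frac{b_\lambda^{a_\lambda}}{\Gamma(a_\lambda)\,(b_\lambda + t_{ijK_{ij}} - t_{ij1})^{a_\lambda + K_{ij} - 1}} \int_0^\infty \Lambda^{a_\lambda + K_{ij} - 2} e^{-\Lambda}\,d\Lambda \\
&= \frac{\Gamma(a_\lambda + K_{ij} - 1)\, b_\lambda^{a_\lambda}}{\Gamma(a_\lambda)\,(b_\lambda + t_{ijK_{ij}} - t_{ij1})^{a_\lambda + K_{ij} - 1}}.
\end{aligned} \tag{16}$$
Note that we obtain the third equality in (16) by the change of variables $\Lambda = \lambda(b_\lambda + t_{ijK_{ij}} - t_{ij1})$, and
we obtain the last equality based on the definition of the $\Gamma(k)$ function given in (14). Following the same
logic, we get
$$E_\lambda\left[\lambda^{K_{ij}-1} e^{-\lambda(T - t_{ij1})}\right] = \frac{\Gamma(a_\lambda + K_{ij} - 1)\, b_\lambda^{a_\lambda}}{\Gamma(a_\lambda)\,(b_\lambda + T - t_{ij1})^{a_\lambda + K_{ij} - 1}}. \tag{17}$$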
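Based on the moment generating function of a gamma distributed random variable given in (13), we
have
$$E_\theta\left[e^{-\theta Y_{i,j,K_{ij}-1}}\right] = \frac{b_\theta^{a_\theta}}{(b_\theta + Y_{i,j,K_{ij}-1})^{a_\theta}} \quad \text{and} \quad E_\theta\left[e^{-\theta Y_{ijK_{ij}}}\right] = \frac{b_\theta^{a_\theta}}{(b_\theta + Y_{ijK_{ij}})^{a_\theta}}. \tag{18}$$
Combining (15), (16), (17) and (18), we have
$$\begin{aligned}
&E_{\lambda,\theta}\left[L_{ij}(\Omega_{ij} \mid \alpha_{FQ}, \alpha_{NQ}, \alpha_d, \alpha_{int}, \alpha_{ZIP}, \lambda, \theta)\right] \\
&\quad = \frac{\Gamma(a_\lambda + K_{ij} - 1)}{\Gamma(a_\lambda)} \cdot \frac{b_\lambda^{a_\lambda}}{(b_\lambda + t_{ijK_{ij}} - t_{ij1})^{a_\lambda + K_{ij} - 1}} \left(\frac{b_\theta}{b_\theta + Y_{i,j,K_{ij}-1}}\right)^{a_\theta} \\
&\qquad \times \left(1 - \left(\frac{b_\theta + Y_{i,j,K_{ij}-1}}{b_\theta + Y_{ijK_{ij}}}\right)^{a_\theta} \left(1 - \left(\frac{b_\lambda + t_{ijK_{ij}} - t_{ij1}}{b_\lambda + T - t_{ij1}}\right)^{a_\lambda + K_{ij} - 1}\right)\right).
\end{aligned} \tag{19}$$
Q.E.D.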
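The algebra leading to (19) can be checked numerically. The minimal Python sketch below evaluates the closed form (19) and, separately, the three products in (15) using (16) to (18), so the two can be compared. The function names, argument names, and numeric inputs are illustrative assumptions; `Y_prev` stands for $Y_{i,j,K_{ij}-1}$ and `Y_last` for $Y_{ijK_{ij}}$.

```python
import math

def marginal_likelihood(K, t_first, t_last, T, Y_prev, Y_last,
                        a_lam, b_lam, a_th, b_th):
    """Closed form (19) for the likelihood averaged over lambda and theta."""
    n = a_lam + K - 1
    lead = math.exp(math.lgamma(n) - math.lgamma(a_lam)
                    + a_lam * math.log(b_lam)
                    - n * math.log(b_lam + t_last - t_first))
    survive = (b_th / (b_th + Y_prev)) ** a_th
    churn_ratio = ((b_th + Y_prev) / (b_th + Y_last)) ** a_th
    time_ratio = ((b_lam + t_last - t_first) / (b_lam + T - t_first)) ** n
    return lead * survive * (1.0 - churn_ratio * (1.0 - time_ratio))

def marginal_likelihood_by_terms(K, t_first, t_last, T, Y_prev, Y_last,
                                 a_lam, b_lam, a_th, b_th):
    """Same quantity assembled term by term from (15), using (16)-(18)."""
    n = a_lam + K - 1

    def e_lam(span):   # (16) with span = t_last - t_first, (17) with span = T - t_first
        return math.exp(math.lgamma(n) - math.lgamma(a_lam)
                        + a_lam * math.log(b_lam) - n * math.log(b_lam + span))

    def e_th(Y):       # (18)
        return (b_th / (b_th + Y)) ** a_th

    return (e_lam(t_last - t_first) * e_th(Y_prev)
            - e_lam(t_last - t_first) * e_th(Y_last)
            + e_lam(T - t_first) * e_th(Y_last))

# Hypothetical customer history and hyperparameters, for illustration only.
args = dict(K=4, t_first=1.0, t_last=20.0, T=30.0, Y_prev=2.1, Y_last=2.9,
            a_lam=1.2, b_lam=3.0, a_th=0.8, b_th=5.0)
closed_form = marginal_likelihood(**args)
term_by_term = marginal_likelihood_by_terms(**args)
```

Working with `math.lgamma` rather than the gamma function itself is a design choice to avoid overflow when $K_{ij}$ is large.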
Proof of Lemma 2 in Section 7.2.
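Consider a customer $i$ whose transaction pattern during the observation period $[0, T]$ is characterized
by $\Omega_{ij} = (K_{ij}, t_{ij}, \delta^d_{ij}, q_{ij}, q^d_j, \mathrm{Zip}_j, T)$. We next derive the probability that such a customer is still active at
restaurant $j$ at the end of time period $T$, which is denoted as $P^A_{ijT}$. To do so, we first characterize this
probability conditional on a given $\lambda$ and $\theta$, which we denote as $P(A \mid \lambda, \theta, \Omega_{ij})$.
Note that after the last transaction, customer $i$ either churns with probability $p_{ijK_{ij}}$ or is still active but
has not visited the focal restaurant $j$ since the last transaction, with probability $(1 - p_{ijK_{ij}})\, e^{-\lambda(T - t_{ijK_{ij}})}$. To
this end, we have
$$P(A \mid \lambda, \theta, \Omega_{ij}) = \frac{(1 - p_{ijK_{ij}})\, e^{-\lambda(T - t_{ijK_{ij}})}}{p_{ijK_{ij}} + (1 - p_{ijK_{ij}})\, e^{-\lambda(T - t_{ijK_{ij}})}}. \tag{20}$$
To get $P^A_{ijT}$, we need to take the expectation of $P(A \mid \lambda, \theta, \Omega_{ij})$ over the joint posterior distribution of $\lambda$ and $\theta$.
We denote the joint posterior pdf of $\lambda$ and $\theta$ as $f^{pos}_{\lambda,\theta}(\lambda, \theta \mid a_\lambda, b_\lambda, a_\theta, b_\theta)$. Based on Bayes’ rule, we have
$$f^{pos}_{\lambda,\theta}(\lambda, \theta \mid a_\lambda, b_\lambda, a_\theta, b_\theta) = P(\lambda, \theta \mid \Omega_{ij}) = \frac{L_{ij}(\Omega_{ij} \mid \lambda, \theta, \cdot)\, f_\lambda(\lambda \mid a_\lambda, b_\lambda)\, f_\theta(\theta \mid a_\theta, b_\theta)}{E_{\lambda,\theta}[L_{ij}(\Omega_{ij} \mid \lambda, \theta, \cdot)]}. \tag{21}$$
We now have
$$P^A_{ijT} = P(A \mid \Omega_{ij}) = \int_0^\infty \int_0^\infty P(A \mid \lambda, \theta, \Omega_{ij})\, f^{pos}_{\lambda,\theta}(\lambda, \theta \mid a_\lambda, b_\lambda, a_\theta, b_\theta)\,d\lambda\,d\theta.$$
Combining (20), (21), (11), (12) and (15), we get
$$P^A_{ijT} = \left(1 - \left(\frac{b_\lambda + T - t_{ij1}}{b_\lambda + t_{ijK_{ij}} - t_{ij1}}\right)^{a_\lambda + K_{ij} - 1} \left(1 - \left(\frac{b_\theta + Y_{ijK_{ij}}}{b_\theta + Y_{i,j,K_{ij}-1}}\right)^{a_\theta}\right)\right)^{-1}.$$
The detailed algebra to obtain the equality above is similar to that in the proof of Lemma 1. Q.E.D.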
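To illustrate how the closed form for $P^A_{ijT}$ behaves, the minimal Python sketch below evaluates it for a hypothetical customer and for the same customer observed over a longer window, so that the time since the last visit grows. The function name, argument names, and all numeric inputs are assumptions made for illustration; `Y_prev` stands for $Y_{i,j,K_{ij}-1}$ and `Y_last` for $Y_{ijK_{ij}}$.

```python
def prob_active(K, t_first, t_last, T, Y_prev, Y_last,
                a_lam, b_lam, a_th, b_th):
    """Posterior probability of still being active at time T (Lemma 2 closed form)."""
    n = a_lam + K - 1
    time_ratio = ((b_lam + T - t_first) / (b_lam + t_last - t_first)) ** n
    churn_ratio = ((b_th + Y_last) / (b_th + Y_prev)) ** a_th   # >= 1 when Y_last >= Y_prev
    return 1.0 / (1.0 - time_ratio * (1.0 - churn_ratio))

# Hypothetical inputs: same purchase history, two different observation horizons.
history = dict(K=4, t_first=1.0, t_last=20.0, Y_prev=2.1, Y_last=2.9,
               a_lam=1.2, b_lam=3.0, a_th=0.8, b_th=5.0)
p_recent = prob_active(T=22.0, **history)   # last visit 2 periods before T
p_stale = prob_active(T=60.0, **history)    # last visit 40 periods before T
```

Under these assumptions the probability falls as the gap $T - t_{ijK_{ij}}$ widens, which is the recency effect the counterfactual analysis relies on.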
Proof of Lemma 3 in Section 7.1.
Consider a customer $i$ whose transaction pattern during the observation period $[0, T]$ is characterized
by $\Omega_{ij} = (K_{ij}, t_{ij}, \delta^d_{ij}, q_{ij}, q^d_j, \mathrm{Zip}_j, T)$. We next derive her expected number of transactions in the future
$t^\dagger$ period (after the end of the observation horizon $T$) conditional on her transaction pattern $\Omega_{ij}$ for given
parameters $\lambda$ and $\theta$. Let $X(t^\dagger)$ be the number of transactions in the future $t^\dagger$ periods. We have
$$E[X(t^\dagger) \mid \lambda, \theta, \Omega_{ij}] = P(A \mid \lambda, \theta, \Omega_{ij}) \cdot E[X(t^\dagger) \mid \lambda, \theta, \Omega_{ij}, A],$$
where $A$ represents the event that customer $i$ is still active at restaurant $j$ at time $T$. Recall that the first term
$P(A \mid \lambda, \theta, \Omega_{ij})$ is given by (20). The second term $E[X(t^\dagger) \mid \lambda, \theta, \Omega_{ij}, A]$ is the expected number of transactions
in the future $t^\dagger$ period conditional on $\lambda$, $\theta$, $\Omega_{ij}$ and event $A$.
To derive the second term, we let $\tau + T$ be the time when customer $i$ becomes inactive at restaurant $j$.
We next derive the pdf of $\tau$, which will be useful later. We let $X(t)$ be the number of transactions during
the time period from $T$ to $T + t$. Note that we have
$$P(\tau > t) = \sum_{k=1}^{\infty} P(\tau > t \mid X(t) = k)\, P(X(t) = k),$$
where $P(\tau > t \mid X(t) = k)$ is the probability that the customer survives all $k$ transactions. Thus,
we have $P(\tau > t \mid X(t) = k) = e^{-\theta Y^\dagger_{ijk}}$, where $Y^\dagger_{ijk} = \sum_{m=0}^{k} y^\dagger_{ijm}$. Recall that $y^\dagger_{ijm}$ is identical to $y_{ijm}$ given in (3),
with the modification that the subscript $m$ in $y^\dagger_{ijm}$ refers to the $m$th transaction after time period $T$ rather
than during the observation period $[0, T]$. Meanwhile, given that $X(t)$ follows a shifted Poisson process, we
have $P(X(t) = k) = \frac{(\lambda t)^{k-1} e^{-\lambda t}}{(k-1)!}$. Thus, we have
$$P(\tau > t) = \sum_{k=1}^{\infty} \frac{(\lambda t)^{k-1} e^{-\lambda t}}{(k-1)!}\, e^{-\theta Y^\dagger_{ijk}}.$$
We denote the pdf of $\tau$ as $f_\tau(\tau)$, which can be derived by differentiating $1 - P(\tau > t)$ with respect to $t$. It is
given by
$$f_\tau(\tau) = e^{-\lambda \tau} \sum_{k=1}^{\infty} \frac{(\lambda \tau)^{k-1} e^{-\theta Y^\dagger_{ijk}}}{(k-1)!} \left(\lambda - \frac{k-1}{\tau}\right).$$
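We next derive $E[X(t^\dagger) \mid \lambda, \theta, \Omega_{ij}, A]$. In particular, we have
$$\begin{aligned}
E[X(t^\dagger) \mid \lambda, \theta, \Omega_{ij}, A]
&= E[X(t^\dagger) \mid \lambda, \theta, \Omega_{ij}, A, \tau > t^\dagger]\, P(\tau > t^\dagger) + \int_0^{t^\dagger} E[X(t^\dagger) \mid \lambda, \theta, \Omega_{ij}, A, \tau \le t^\dagger]\, f_\tau(\tau)\,d\tau \\
&= \lambda t^\dagger \left(\sum_{k=1}^{\infty} \frac{(\lambda t^\dagger)^{k-1} e^{-\lambda t^\dagger}}{(k-1)!}\, e^{-\theta Y^\dagger_{ijk}}\right) + \int_0^{t^\dagger} \lambda \tau f_\tau(\tau)\,d\tau \\
&= \left(\sum_{k=1}^{\infty} \frac{(\lambda t^\dagger)^{k} e^{-\lambda t^\dagger}}{(k-1)!}\, e^{-\theta Y^\dagger_{ijk}}\right) + \sum_{k=1}^{\infty} \frac{e^{-\theta Y^\dagger_{ijk}}}{(k-1)!} \int_0^{t^\dagger} (\lambda \tau)^{k} \left(\lambda - \frac{k-1}{\tau}\right) e^{-\lambda \tau}\,d\tau \\
&= \sum_{k=1}^{\infty} \frac{\gamma(k, \lambda t^\dagger)}{\Gamma(k)}\, e^{-\theta Y^\dagger_{ijk}}.
\end{aligned}$$
To this end, we have
$$E[X(t^\dagger) \mid \lambda, \theta, \Omega_{ij}] = \frac{(1 - p_{ijK_{ij}})\, e^{-\lambda(T - t_{ijK_{ij}})}}{p_{ijK_{ij}} + (1 - p_{ijK_{ij}})\, e^{-\lambda(T - t_{ijK_{ij}})}} \cdot \sum_{k=1}^{\infty} \frac{\gamma(k, \lambda t^\dagger)}{\Gamma(k)}\, e^{-\theta Y^\dagger_{ijk}}.$$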
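Thus, we have
$$\begin{aligned}
E[X(t^\dagger) \mid \Omega_{ij}]
&= \int_0^\infty \int_0^\infty E[X(t^\dagger) \mid \lambda, \theta, \Omega_{ij}]\, f^{pos}_{\lambda,\theta}(\lambda, \theta \mid a_\lambda, b_\lambda, a_\theta, b_\theta)\,d\lambda\,d\theta \\
&= P^A_{ijT} \sum_{x=1}^{\infty} \left(\frac{Y_{ijK_{ij}} + b_\theta}{Y_{ijK_{ij}} + Y^\dagger_{ijx} + b_\theta}\right)^{a_\theta} B\left(\frac{t^\dagger}{t^\dagger + b_\lambda + T - t_{ij1}};\, x,\, a_\lambda + K_{ij} - 1\right),
\end{aligned} \tag{22}$$
where $B\left(\frac{t^\dagger}{t^\dagger + b_\lambda + T - t_{ij1}};\, x,\, a_\lambda + K_{ij} - 1\right)$ is the cumulative distribution function of the beta distribution with
parameters $x$ and $a_\lambda + K_{ij} - 1$ evaluated at $\frac{t^\dagger}{t^\dagger + b_\lambda + T - t_{ij1}}$. We refer the readers to equation (23) in Appendix
A of Braun et al. (2015) for the detailed algebra to obtain the second equality above. Q.E.D.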
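The conditional series $\sum_{k \ge 1} \frac{\gamma(k, \lambda t^\dagger)}{\Gamma(k)} e^{-\theta Y^\dagger_{ijk}}$ that appears in the proof can be evaluated directly for fixed $\lambda$ and $\theta$. The minimal Python sketch below does so with a truncated sum, using the standard identity that $\gamma(k, x)/\Gamma(k)$ for a positive integer $k$ equals the probability that a Poisson variable with mean $x$ is at least $k$. The function names, the truncation point `k_max`, the callable `cum_y` (a stand-in for $Y^\dagger_{ijk}$), and the numeric values are all illustrative assumptions.

```python
import math

def reg_lower_gamma_int(k, x):
    """gamma(k, x) / Gamma(k) for a positive integer k, i.e. P(Poisson(x) >= k)."""
    term = math.exp(-x)          # Poisson pmf at j = 0
    cdf = 0.0
    for j in range(k):           # accumulate P(Poisson(x) <= k - 1)
        cdf += term
        term *= x / (j + 1)
    return max(0.0, 1.0 - cdf)

def expected_transactions_given_active(lam, theta, horizon, cum_y, k_max=500):
    """Truncated series sum_k gamma(k, lam*horizon)/Gamma(k) * exp(-theta * Y_k)."""
    total = 0.0
    for k in range(1, k_max + 1):
        total += reg_lower_gamma_int(k, lam * horizon) * math.exp(-theta * cum_y(k))
    return total

# Hypothetical cumulative covariate term: each future transaction adds 1 to Y.
cum_y = lambda k: float(k)
no_churn = expected_transactions_given_active(2.0, 0.0, 3.0, cum_y)
with_churn = expected_transactions_given_active(2.0, 0.3, 3.0, cum_y)
```

When $\theta = 0$ the weights $e^{-\theta Y^\dagger_{ijk}}$ all equal one and the series reduces to $\lambda t^\dagger$; a positive $\theta$ discounts every term and lowers the total.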
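Derivation of the posterior probability of a given customer making zero transactions in the future
$t^\dagger$ period, $P(0 \mid t^\dagger, \cdot)$, defined in Section 5.2.
Consider a customer $i$ whose transaction pattern during the observation period $[0, T]$ is characterized by
$\Omega_{ij} = (K_{ij}, t_{ij}, \delta^d_{ij}, q_{ij}, q^d_j, \mathrm{Zip}_j, T)$. Let the number of transactions in the future $t^\dagger$ period be $X_{ij}(t^\dagger)$. Note that the
probability that customer $i$ makes zero transactions in the future $t^\dagger$ period equals the sum of the probability
that customer $i$ has become inactive by the end of the observation period $T$ and the probability that customer $i$ is
still active but does not make any transaction within the next $t^\dagger$ period after the observation period $T$.
To this end, we have
$$P(X_{ij}(t^\dagger) = 0 \mid \lambda, \theta, \Omega_{ij}) = 1 - P(A \mid \lambda, \theta, \Omega_{ij}) + P(A \mid \lambda, \theta, \Omega_{ij})\, e^{-\lambda t^\dagger},$$
where $P(A \mid \lambda, \theta, \Omega_{ij})$ is given in (20) above.
The posterior expected probability of making zero transactions in the future $t^\dagger$ period, $P(X_{ij}(t^\dagger) = 0 \mid \Omega_{ij})$, is
then given by
$$\begin{aligned}
P(X_{ij}(t^\dagger) = 0 \mid \Omega_{ij})
&= \int_0^\infty \int_0^\infty P(X_{ij}(t^\dagger) = 0 \mid \lambda, \theta, \Omega_{ij})\, f^{pos}_{\lambda,\theta}(\lambda, \theta \mid a_\lambda, b_\lambda, a_\theta, b_\theta)\,d\lambda\,d\theta \\
&= C_0 (G_1 + G_2),
\end{aligned} \tag{23}$$
where $f^{pos}_{\lambda,\theta}(\lambda, \theta \mid a_\lambda, b_\lambda, a_\theta, b_\theta)$ is the joint posterior distribution of $\lambda$ and $\theta$ given in (21). The algebra for the
second equality is similar to that in the proof of Lemma 1. Moreover, $C_0$, $G_1$ and $G_2$ are given by
$$C_0 = \frac{(b_\lambda + t_{ijK_{ij}} - t_{ij1})^{a_\lambda + K_{ij} - 1}\, (b_\theta + Y_{i,j,K_{ij}-1})^{a_\theta}}{\Gamma(a_\lambda + K_{ij} - 1)\, \Gamma(a_\theta)} \cdot \left(1 - \left(\frac{b_\theta + Y_{i,j,K_{ij}-1}}{b_\theta + Y_{ijK_{ij}}}\right)^{a_\theta} \cdot \left(1 - \left(\frac{b_\lambda + t_{ijK_{ij}} - t_{ij1}}{b_\lambda + T - t_{ij1}}\right)^{a_\lambda + K_{ij} - 1}\right)\right)^{-1},$$
$$G_1 = \left(\frac{\Gamma(a_\theta)}{(Y_{i,j,K_{ij}-1} + b_\theta)^{a_\theta}} - \frac{\Gamma(a_\theta)}{(Y_{ijK_{ij}} + b_\theta)^{a_\theta}}\right) \frac{\Gamma(a_\lambda + K_{ij} - 1)}{(b_\lambda + t_{ijK_{ij}} - t_{ij1})^{a_\lambda + K_{ij} - 1}},$$
$$G_2 = \frac{\Gamma(a_\theta)}{(Y_{ijK_{ij}} + b_\theta)^{a_\theta}} \cdot \frac{\Gamma(a_\lambda + K_{ij} - 1)}{(T - t_{ij1} + t^\dagger + b_\lambda)^{a_\lambda + K_{ij} - 1}}.$$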
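The expression $C_0(G_1 + G_2)$ can be sanity-checked at its two extremes: with a future horizon of zero it should equal one, and as the horizon grows it should approach the probability that the customer is already inactive. The minimal Python sketch below evaluates the three terms literally. The function name, the argument `t_future` (the future horizon entering $G_2$), and all numeric inputs are illustrative assumptions; `Y_prev` stands for $Y_{i,j,K_{ij}-1}$ and `Y_last` for $Y_{ijK_{ij}}$.

```python
import math

def prob_zero_transactions(t_future, K, t_first, t_last, T, Y_prev, Y_last,
                           a_lam, b_lam, a_th, b_th):
    """Posterior probability of zero transactions over the next t_future periods."""
    n = a_lam + K - 1
    span = b_lam + t_last - t_first
    g_lam, g_th = math.gamma(n), math.gamma(a_th)
    inner = 1.0 - (span / (b_lam + T - t_first)) ** n
    c0 = span ** n * (b_th + Y_prev) ** a_th / (g_lam * g_th)
    c0 /= 1.0 - ((b_th + Y_prev) / (b_th + Y_last)) ** a_th * inner
    g1 = (g_th / (Y_prev + b_th) ** a_th
          - g_th / (Y_last + b_th) ** a_th) * g_lam / span ** n
    g2 = (g_th / (Y_last + b_th) ** a_th) * g_lam / (T - t_first + t_future + b_lam) ** n
    return c0 * (g1 + g2)

# Hypothetical customer history and hyperparameters, for illustration only.
history = dict(K=4, t_first=1.0, t_last=20.0, T=30.0, Y_prev=2.1, Y_last=2.9,
               a_lam=1.2, b_lam=3.0, a_th=0.8, b_th=5.0)
p_now = prob_zero_transactions(0.0, **history)
p_year = prob_zero_transactions(12.0, **history)
p_far = prob_zero_transactions(1e9, **history)
```

The gamma functions in $C_0$, $G_1$ and $G_2$ cancel in the product, so an implementation meant for large $K_{ij}$ could work with the ratios directly instead of calling `math.gamma`.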
Appendix C: Validation of Text Analysis Approach.
To validate the seeded LDA results, we calculate three quality metrics of the estimated topic-keyword
probability distribution.
The first quality metric is the average topic coherence, defined by Mimno et al. (2011), as
$$\mathrm{Coherence} = \frac{1}{O_\kappa} \sum_{\kappa=1}^{O_\kappa} \sum_{u=2}^{p} \sum_{v=1}^{u-1} \log \frac{D(w^\kappa_u, w^\kappa_v) + 1}{D(w^\kappa_v)}, \tag{24}$$
where $(w^\kappa_1, \ldots, w^\kappa_p)$ is the list of the $p$ most probable words in topic $\kappa$, $O_\kappa$ is the number of topics, $D(w)$ is the
number of complaints containing the word $w$, and $D(w, w')$ is the number of complaints containing both
$w$ and $w'$. The general idea behind this measure is to gauge the interpretability of each topic based on
co-occurrences of its keywords. Topics with larger topic coherence have been shown to be more interpretable
by human judges (Mimno et al. 2011).
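The coherence measure in (24) can be computed from document frequencies alone. The following minimal Python sketch applies it to a tiny made-up corpus; the function name, the toy complaints, and the candidate keyword lists are hypothetical and are not drawn from the paper's data. It assumes every listed keyword occurs in at least one document, so that $D(w^\kappa_v) > 0$.

```python
import math

def topic_coherence(top_words_per_topic, documents):
    """Average over topics of sum_{u>v} log((D(w_u, w_v) + 1) / D(w_v)), as in (24)."""
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents containing all of the given words.
        return sum(1 for d in doc_sets if all(w in d for w in words))

    per_topic = []
    for words in top_words_per_topic:
        score = 0.0
        for u in range(1, len(words)):
            for v in range(u):
                score += math.log((doc_freq(words[u], words[v]) + 1) / doc_freq(words[v]))
        per_topic.append(score)
    return sum(per_topic) / len(per_topic)

# Toy tokenized complaints (hypothetical).
complaints = [
    ["lettuce", "brown", "slimy"],
    ["lettuce", "slimy", "smell"],
    ["box", "torn", "leaking"],
    ["box", "crushed", "torn"],
    ["truck", "late", "driver"],
]
coherent = topic_coherence([["lettuce", "slimy"], ["box", "torn"]], complaints)
mixed = topic_coherence([["lettuce", "torn"], ["box", "late"]], complaints)
```

Keyword lists whose members appear together in the same complaints score higher than lists that mix words from unrelated complaints.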
The second quality metric, Hellinger Distance, is the average distance between topics in the topic-keyword
probability distribution (Blei and Lafferty 2009). The idea here is that topics that are further apart contain
keywords that are more distinct and have less overlap, which tends to result in better interpretability.
Hence, methods with a higher Hellinger Distance are preferred.
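As an illustration, the sketch below computes the average pairwise Hellinger distance between topic-keyword distributions using the standard definition $H(p, q) = \sqrt{\tfrac{1}{2}\sum_w (\sqrt{p_w} - \sqrt{q_w})^2}$, which ranges from 0 to 1. Whether the paper uses this exact normalization is an assumption, and the function names and toy distributions are hypothetical.

```python
import math
from itertools import combinations

def hellinger(p, q):
    """Hellinger distance between two discrete distributions on the same vocabulary."""
    return math.sqrt(0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q)))

def average_hellinger(topic_word_probs):
    """Mean Hellinger distance over all unordered pairs of topics."""
    pairs = list(combinations(topic_word_probs, 2))
    return sum(hellinger(p, q) for p, q in pairs) / len(pairs)

# Toy topic-keyword distributions over a 4-word vocabulary (hypothetical).
distinct_topics = [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]]
overlapping_topics = [[0.4, 0.3, 0.2, 0.1], [0.3, 0.3, 0.2, 0.2]]
far = average_hellinger(distinct_topics)
near = average_hellinger(overlapping_topics)
```

Topics with disjoint keyword support reach the maximum distance of 1, while topics that share most of their probability mass score close to 0.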
Finally, the third quality metric is the entropy of the topic-keyword probability distribution, defined as
$$\mathrm{Entropy} = -\frac{1}{O_\kappa} \sum_{\kappa=1}^{O_\kappa} \sum_{w} P(w \mid \text{Topic } \kappa) \log\left(P(w \mid \text{Topic } \kappa)\right), \tag{25}$$
where $\kappa$ indexes the topics and $w$ the words (Hall et al. 2008). Higher entropy values indicate that the topic
distributions are more evenly spread over the keywords. Thus, lower entropy values are desired for more distinct
topics and greater interpretability.
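A minimal Python sketch of the entropy in (25) follows. The function name and the toy distributions are hypothetical, and terms with zero probability are skipped under the usual convention that $0 \log 0 = 0$.

```python
import math

def average_topic_entropy(topic_word_probs):
    """Average Shannon entropy (natural log) of the topic-keyword distributions, as in (25)."""
    total = 0.0
    for probs in topic_word_probs:
        total -= sum(p * math.log(p) for p in probs if p > 0.0)
    return total / len(topic_word_probs)

# Toy distributions over a 4-word vocabulary (hypothetical).
flat = average_topic_entropy([[0.25, 0.25, 0.25, 0.25]])
peaked = average_topic_entropy([[0.85, 0.05, 0.05, 0.05]])
```

A topic that concentrates its mass on a few keywords has lower entropy than one that spreads it evenly, which is why lower values are preferred here.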
Table 1 shows that the seeded LDA method is preferred to the standard LDA topic model with uninformed
prior according to all three quality metrics.
Appendix D: Validity of Instrumental Variables
Here we provide detailed estimation results to show that our instruments are valid and satisfy the two
necessary and sufficient conditions of relevance and exclusion (Wooldridge 2001).
We begin by assessing the relevance condition, which states that the instruments must be correlated with
the endogenous variables (complaint counts in our case). In the first stage of the 2SLS model, we use linear
regression to explain the endogenous variables $COMPL^{cat}_{jl}$ for $cat \in \{Fresh, Package, Delivery\}$ using the
instruments (DIST, POS, STORENUM) and controls $C_j$ as independent variables. Estimation results from
these regressions are shown in Table 8. It is important to note that at least one of the instruments is
statistically significant when explaining each endogenous variable. To rigorously check the relevance condition, we
perform the so-called weak instruments test, a partial F-test with the null hypothesis that, given the control
variables, the instrument effects are all equal to zero. Table 9 shows that we reject the null hypothesis in
each case, providing statistical evidence that the relevance condition is satisfied.
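The partial F-test described above compares a restricted first-stage regression (controls only) with the full one (controls plus instruments) through $F = \frac{(RSS_r - RSS_f)/q}{RSS_f/(n - k)}$, where $q$ is the number of instruments and $k$ the number of regressors in the full model. The minimal Python sketch below illustrates this on simulated data only; the function names, the simulated variables, and the coefficient values are assumptions and have no connection to the paper's dataset or estimates.

```python
import random

def ols_rss(X, y):
    """Residual sum of squares from least squares, solved via the normal equations."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    return sum((yi - sum(b * x for b, x in zip(beta, row))) ** 2
               for row, yi in zip(X, y))

def partial_f_statistic(controls, instruments, y):
    """F statistic for H0: all instrument coefficients are zero, given the controls."""
    n, q = len(y), len(instruments[0])
    restricted = [[1.0] + list(c) for c in controls]           # intercept + controls
    full = [r + list(z) for r, z in zip(restricted, instruments)]
    rss_r, rss_f = ols_rss(restricted, y), ols_rss(full, y)
    return ((rss_r - rss_f) / q) / (rss_f / (n - len(full[0])))

# Simulated data (hypothetical): one control, three instruments.
rng = random.Random(0)
n = 500
controls = [[rng.gauss(0, 1)] for _ in range(n)]
instruments = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(n)]
relevant = [0.5 * z[0] + c[0] + rng.gauss(0, 1) for c, z in zip(controls, instruments)]
irrelevant = [c[0] + rng.gauss(0, 1) for c in controls]
f_relevant = partial_f_statistic(controls, instruments, relevant)
f_irrelevant = partial_f_statistic(controls, instruments, irrelevant)
```

In practice a statistics package would report the same statistic together with its p-value; the hand-rolled solver here only keeps the example free of external dependencies.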
The exclusion condition states that instruments should not be correlated with the error term from the
second stage of the 2SLS model. As such, we use linear regression to explain the residuals from the second
stage of the 2SLS model using the instruments as independent variables. Table 10 shows the full
estimation results and that the instruments are not correlated with the residuals, since none of the instruments
are statistically significant individually or collectively through the F-test, thus providing evidence that the
exclusion condition is not violated.
| Dependent variable: | $COMPL^{Fresh}$ (1) | $COMPL^{Delivery}$ (2) | $COMPL^{Packaging}$ (3) |
|---|---|---|---|
| Zip 1 | −0.008∗∗∗ (0.002) | −0.001∗∗∗ (0.0005) | −0.002 (0.001) |
| Zip 2 | 0.006∗∗∗ (0.002) | −0.0004 (0.0004) | 0.001 (0.001) |
| Zip 3 | −0.007∗∗∗ (0.002) | −0.001 (0.0005) | 0.002 (0.001) |
| Restaurant age | 0.002∗∗∗ (0.0004) | 0.0003∗∗∗ (0.0001) | 0.0004∗ (0.0003) |
| Manager ability | 0.003∗∗∗ (0.001) | 0.0003 (0.0002) | 0.002∗∗ (0.001) |
| Distance | 0.0001∗∗∗ (0.00003) | 0.00001∗ (0.00001) | 0.00004∗ (0.00002) |
| POS implementation flag | 0.010∗∗∗ (0.002) | 0.001 (0.0005) | 0.001 (0.002) |
| Number of stores per distributor | 0.0002∗∗∗ (0.00004) | −0.00000 (0.00001) | 0.0001∗∗ (0.00003) |
| Constant | −0.091∗∗∗ (0.018) | −0.004 (0.004) | −0.037∗∗∗ (0.012) |
| Observations | 20,233 | 20,233 | 20,233 |
| R2 | 0.004 | 0.001 | 0.001 |
| Adjusted R2 | 0.004 | 0.001 | 0.001 |
| F Statistic (df = 8; 20224) | 10.053∗∗∗ | 3.146∗∗∗ | 2.450∗∗ |

Note: standard errors in parentheses. ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01
Table 8: Estimation results from the first stage of the 2SLS model.
| Endogenous variable | Statistic | p-value |
|---|---|---|
| $COMPL^{Fresh}$ | 9.738 | < 0.01 |
| $COMPL^{Packaging}$ | 3.330 | 0.0187 |
| $COMPL^{Delivery}$ | 3.760 | 0.0103 |

Table 9: Results from the weak instruments test, establishing that the proposed instruments satisfy the relevance condition.
| Dependent variable: | Residual |
|---|---|
| Distance | 0.000 (0.0005) |
| POS implementation flag | 0.000 (0.036) |
| Number of stores per distributor | −0.000 (0.001) |
| Constant | −0.000 (0.201) |
| Observations | 20,233 |
| R2 | 0.000 |
| Adjusted R2 | −0.0001 |
| Residual Std. Error | 1.671 (df = 20229) |
| F Statistic | 0.000 (df = 3; 20229) |

Note: standard errors in parentheses. ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01
Table 10: Estimation results from the regression to check the exclusion condition of the 2SLS model.