Effect of Location and Assortment on Category Consideration,
Learning, and Choice
Bhoomija Ranjan∗, Paul Ellickson and Mitchell Lovett
June 30, 2016
Job Market Paper Preliminary, do not cite without consulting
authors first
Abstract
Retailers aim to maximize profits given the constraints of space
and existing infrastructure. They frequently face the problem of
department management, to overhaul the assortments and locations
of multiple product categories sharing a common area in the store.
Although category and departmental resets are frequently performed
by retailers, an empirical analysis of the effects of these
department layout changes has not been done. To analyze this
question, we exploit a “natural” experiment of a large-scale
departmental reset of dairy in a supermarket store location, with
two other stores acting as controls. With a rare dataset of floor
plans and category planograms, we characterize 11 reset treatments
related to location and assortment changes. Analyzing both
aggregate and household-level purchase data, we present descriptive
evidence that the reset made a significant (2.6%) improvement in
sales. We find that the changes affect purchase probabilities
through the channel of attention/consideration, and induce learning
among customers. We then specify a structural model of demand that
incorporates multi-category consideration, learning, and choice
at the individual-level. The model enables us to leverage the
exogenous variation in location and assortments to identify the
effects on attention/consideration and choice. Preliminary
results indicate that the location of the category within the store
layout has a significant effect on consideration. We find that
being adjacent to popular categories has a negative effect on
consideration among customers who have tried the category earlier,
but has a much larger positive effect among customers who have
never bought the category, thus inducing trial. Our learning
estimates also indicate that consumers’ perceptions of category
match values are positively biased on average, which leads them to
try the product, but far fewer individuals make it a regular
feature of their ongoing shopping basket.
∗Doctoral Student in Marketing, Simon School of Business,
University of Rochester, Email:
[email protected]. A version of this paper will
also serve as the first essay in my dissertation. We would like to
thank Bowen Luo and Austin Stone for excellent research assistance.
We are grateful to Ron Goettler, Garett Johnson, Avery Haviv &
Yufeng Huang for their support and insightful comments through
the project. We also thank Marketing seminar participants at the
University of Rochester for their meaningful discussions. All
remaining errors are our own.
1 Introduction
A central challenge facing modern ‘brick and mortar’ retailers is
how to maximize sales given the physical constraints of their space
and display infrastructure. To improve space allocation, they
“reset” the layout within the store, including whole categories or
departments. According to the Food Marketing Institute, the
average supermarket makes ∼ $12 per square foot of selling area per
week. Even a small 0.5% increment in sales due to better space
optimization is significant compared to the average industry profit
margin of 1.5%. The retailer has to decide the space allocation for
a set of categories within a common location in the store, such as
the produce section or dairy section. This task of department
management not only involves deciding what assortment each category
should carry (category management), but also where they should be
displayed within the physical context of the store itself (layout
planning). Not only are the profit implications of department
management important, but ease-of-navigating the store is a major
factor for 92% of consumers in choosing their primary store (FMI,
2012[10]). Hence, department management is a vital problem for the
retailer.
The narrower problem of category assortment management has been the
subject of a large number of marketing studies. The seminal work by
Dreze, Hoch and Purk (1994)[13] analyzes an experiment involving 60
stores in the Dominick’s Finer Foods supermarket chain that
investigates the efficient allocation of space within a category
and what strategies to use for optimal assortment. They find that
categories are over-allocated in terms of shelf space, and
prescribe removing lower-performing SKUs from the shelves. This
result is also corroborated by a number of later studies
(Broniarczyk, et al. (1998)[3], Boatwright & Nunes (2001)[2],
Gourville & Soman (2005)[12], Chernev (2005)[5]), and a Nielsen
report indicates that 40% of retailers implemented this basic
suggestion (Nielsen, 2010[22]). Later work (Kahn & Wansink
2004)[16] clarified that it is perceived variety that has a
positive effect on consumer perceptions and sales, and that not all
SKUs contribute to the perception of variety. Attention also shapes
consumers’ perceptions of assortment and choice in a category.
Broniarczyk, et al. (1998)[3] point out that the availability of a
consumer’s favorite product(s) attracts their attention and
moderates the negative effect that a reduction in SKUs has on
assortment perceptions. Chandon, et al. (2009)[4] use eye-tracking
experiments to show that in-store marketing factors, such as the
number and position of facings, and the horizontal and vertical
positions of items, have a significant effect on attracting
consumer attention within a category display. These existing papers
analyze single categories alone, or several categories on an
independent category-by-category basis.
However, no category is an island. Category locations, assortments,
and merchandizing can have effects on neighboring categories as
well. Dreze, Hoch & Purk (1994)[13] and Bezawada, et al.
(2009)[1] both find that bringing known complements closer together
increases the sales of each, and Hong, Misra & Vilcassim
(2015)[14] find that categories have positive effects of assortment
on own sales, and negative effects on neighbors’ sales when they
are neither complements nor substitutes. The authors argue that
these results can be explained through categories competing for the
consumer’s attention, and corroborate this theory with online
eye-tracking experiments. However, barring Dreze, Hoch & Purk
(1994)[13], the field studies do not have exogenous sources of
location variation.
In this paper, we investigate how category location, assortment,
and merchandizing shape consumers’ choices of the category and its
neighbors. We are aided in this enterprise by a unique “natural”
experiment of a departmental reset performed by a major retailer in
one of her stores. The layout change involved a major
reorganization of 15 dairy categories. The reset was instigated by
an unrelated technology upgrade that was not specific to dairy,
meaning that the timing and store selection for this reset is
plausibly exogenous. The basic idea for identifying the desired
location and assortment effects is to take advantage of this
(conditionally) exogenous variation in locations and assortments
induced by the reset, and compare the outcomes with control stores
that did not undergo any changes. The essential identification
strategy utilizes a “diff-in-diff” and “diff-in-diff-in-diff”
approach.
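The comparison behind a diff-in-diff can be sketched in a few lines. This is a minimal illustration only: the function name `diff_in_diff` and all sales figures are hypothetical, and the sketch compares raw means rather than the regression specification with controls used in the analysis.

```python
from statistics import mean


def diff_in_diff(treat_pre, treat_post, control_pre, control_post):
    """Difference-in-differences of mean outcomes.

    Each argument is a list of outcomes (e.g., daily category sales).
    The change in the control store stands in for what would have
    happened in the treatment store without the reset.
    """
    treat_change = mean(treat_post) - mean(treat_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treat_change - control_change


# Hypothetical daily sales, for illustration only (not the paper's data).
effect = diff_in_diff(
    treat_pre=[100.0, 102.0, 98.0],
    treat_post=[108.0, 110.0, 106.0],
    control_pre=[90.0, 92.0, 88.0],
    control_post=[93.0, 95.0, 91.0],
)
print(effect)
```

A triple difference would apply the same function separately to categories that did and did not receive a given reset treatment, and then difference the two results.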
Our analysis exploits a unique dataset that includes the
department’s physical layout, as well as planograms that provide
information on the exact location of every item in the department.
These documents allow us to construct data, for example, on what
SKUs moved/changed arrangement, the adjacency of categories, and
what type of display each category was contained in. We
characterize the reset changes faced by the dairy categories
through a set of 11 ‘reset treatments’ that affect location and
assortment. We use this information, along with individual and
aggregate sales data, to analyze the natural experiment.
Our experimental analyses strongly indicate that location
influences category choice, which we theorize as operating through
attention and consideration. For instance, our results show that
moving to high(low)-traffic locations improves (reduces) category
purchase probabilities. Similarly, increasing facings and the
visual space the category occupies also increases category
purchase. These results tie in well with previously established
results in the literature (e.g., Chandon, et al. (2009)[4]). We
also find differences between the short- and long-term effects of
the reset that are directionally consistent with consumer
learning.
We then develop a structural model of consumer category purchase
decisions that involves consideration, choice, and learning. This
model is developed to explain the patterns observed in the data
and to capture the underlying consumer processes that drive them.
Such a model can help shed light on the cross-category and
inter-temporal dynamics observed in the data. In our model, we
build on the existing literature and hypothesize that consumers
evaluate categories once they ‘pay attention’ to them. We assume
that consumers have a passive attention process that shapes
consideration. The level of attention is influenced by the
category’s location, assortment, and the characteristics of its
neighbors, among other factors. Hence, we integrate into a
structural model the adjacency effects noted by Hong, Misra &
Vilcassim (2015)[14].
Our model follows a “consider-then-choose” framework, shared by
many papers in the consideration set literature. This general
framework incorporates the idea that consumers’ effective choice
sets do not include all the brands/SKUs in the category. Rather,
they restrict their attention to a smaller set of items, the
consideration set, from which they ultimately make their choice.
The literature contains many different assumptions regarding what
makes consumers consider certain items, but not others.
Attention-focused explanations include advertising (Goeree
2008[11], Draganska & Klapper 2011[8], Terui, Ban & Allenby
2011[27]), in-store display and feature ads (Nierop, et al.
2010)[28], and shelf space and position (Dehmamy & Otter
2015)[7]. By excluding
the variables that influence attention from the choice process,
the model can separately identify the consideration and choice
processes.1 Our identification focuses on location as being
excluded from consumption utility.
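The role of the exclusion restriction can be illustrated with a minimal two-stage sketch. The logistic forms, the function names, and the two index arguments are assumptions made purely for illustration; they are not the specification estimated in this paper.

```python
import math


def logistic(v):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-v))


def purchase_probability(attention_index, utility_index):
    """P(buy) = P(consider) * P(buy | consider).

    attention_index: hypothetical index of attention shifters; location
        enters here and nowhere else (the exclusion restriction).
    utility_index: hypothetical index of consumption utility, driven by
        variables such as price, assortment, and quality expectations.
    """
    p_consider = logistic(attention_index)
    p_buy_given_consider = logistic(utility_index)
    return p_consider * p_buy_given_consider
```

In this sketch, a location change moves only `attention_index`, so holding the utility inputs fixed, any shift in purchase probability after the move is attributed to consideration.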
We further differentiate from the existing literature by modeling
consideration and choice among multiple categories, rather than a
single one. In that sense, our model also relates to the literature
on multi-category purchases. This literature mostly focuses on
correlations between categories, without accounting for their
in-store locations. Manchanda, Ansari & Gupta (1999)[18]
study the cross-price effects of two sets of known complementary
categories (laundry detergent & fabric softener, cake mix &
cake frosting) and point out that correlations might exist between
two non-complementary categories (they call it co-incidence),
because these categories might be bought in the same shopping trip.
Others in this literature focus on cross-category price effects
with aggregate data (Song & Chintagunta 2006[25]) and consider
the role of budget constraints (e.g., Mehta 2007[19] and Mehta
& Ma 2012[20]). Thus, we integrate the multi-category
literature with the consideration set literature.
Finally, because we find evidence of learning and dynamics in the
preferences over time, our structural model also incorporates
consumer learning in the form of a Bayesian learning model (Erdem
and Keane, 1996[9]; Narayanan & Manchanda, 2009[21]; Shin,
Misra & Horsky, 2012[24]). However, in our model, each consumer
learns about each category’s match quality through their own
consumption experiences. Integrating learning into the complete
model allows a rich inter-temporal response to location changes.
Such changes can increase attention to and consideration of a
category, which for newcomers can induce trial, a short-term
positive effect. However, trial leads to consumer learning, which
can increase or decrease purchases in the long-term, depending on
the distribution of the “true” quality in the consumer population.
Having both consideration and learning influence the choice
process enables the structural model to predict flexible dynamic
patterns of response to the reset, including persistent decreases
in category purchase incidence due to lower attention or worse
assortment; or an initial increase in purchases due to increased
attention, followed by a decrease due to learning that the category
is worse than expected; or a small initial increase in purchases
due to rising consideration, followed by further increases,
suggesting the category is better than expected.
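The learning dynamics described above can be sketched with the standard normal-normal Bayesian update. This is an assumed textbook form used only for illustration; the function name, the parameter values, and the noiseless signals are hypothetical and are not estimates from this paper.

```python
def update_belief(prior_mean, prior_var, signal, signal_var):
    """One step of normal-normal Bayesian updating.

    The posterior mean is a precision-weighted average of the prior
    mean and the consumption signal; the posterior variance shrinks
    with every signal received.
    """
    post_var = 1.0 / (1.0 / prior_var + 1.0 / signal_var)
    post_mean = post_var * (prior_mean / prior_var + signal / signal_var)
    return post_mean, post_var


# Hypothetical consumer whose prior (1.0) is above the true match
# quality (0.0); each purchase delivers a signal equal to the truth.
mean_belief, var_belief = 1.0, 1.0
belief_path = []
for _ in range(3):
    mean_belief, var_belief = update_belief(mean_belief, var_belief, 0.0, 1.0)
    belief_path.append(mean_belief)
print(belief_path)
```

With an upward-biased prior, the expected quality falls toward the truth after each trial, which corresponds to the pattern of an initial increase in purchases followed by a decline.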
We apply our structural model to a subsample of 2,100 individuals.
Our preliminary results support our theory that attention
influences category consideration. We find that there is an
“attention-stealing” effect if a category is placed next to a
high-frequency popular category. This result corroborates the
findings of Hong, Misra & Vilcassim (2015)[14]. However, if a
consumer has never bought a category, then its chances of being
tried by the consumer will be significantly improved by placing it
next to a category that the consumer already purchases frequently.
Therefore, we find a dual role played by categories, sometimes
bringing attention to their neighbors, and at other times stealing
attention from them, depending on the consumer’s prior knowledge of
the category. Apart from this, we also find significant effects of
assortment on consumption utility. Finally, our results indicate an
important role for learning, but only among a subset of consumers
with relatively little experience in a category. We find that
consumers are generally biased upward from their true match
qualities, which encourages them to try new categories upon
consideration. However, post-trial, many of these consumers realize
that the (match) quality is not as high as they expected, leading
to lower retention. This is consistent with the motivation for a
reset: that there is untapped potential that can be exploited by
shifting attention to existing categories. Understanding which
categories have this potential and which do not can assist managers
in planning future resets. Hence, these structural model results
form the basis of our planned counterfactual analyses, which can
guide the retailer in choosing how to improve the department
layout.
1Dehmamy & Otter (2015)[7] also make an argument that purchase
incidence vs. quantity purchases can be used to identify the
consideration process.
The rest of the document is organized as follows. Section 2
provides a conceptual framework for our analysis and structural
model. We present our data in Section 3, and our experimental
analysis and preliminary evidence in Section 4. Section 5 lays out
our structural model. We discuss the details of the structural
model and estimation in Sections 6 and 7. Finally, Section 8
concludes.
2 Conceptual Framework
Our work highlights the role of category location and assortment in
influencing consumer attention, category choice and learning about
category quality. Figure 1 presents the framework we use for our
analysis and structural model.
Figure 1: Model Conceptual Framework
We conceptualize consumers’ shopping trips to involve considering
and choosing among multiple categories. Our focus is entirely on
the level of the category (i.e., we do not model brand choices
within the category). The key variables we study are assortment,
location, habit (state dependence), quality expectations, and
price. We explain the role of each variable in our conceptual
framework based on a three-stage view of the category choice
process.
The first stage is consideration. Following the literature on
category assortment effects (Chandon, et al., 2009[4]), attention is
a central influence on consideration.2 An individual considers a
category only if the level of attention the category obtains is
sufficiently high.
We conceptualize attention as a passive process that is influenced
by assortment, location, in-store promotions, and habit (state
dependence) (see arrows pointing to consideration/attention).
Assortment can influence attention through the size of the
category. For instance, the number of facings assigned to a
category proxies for the total visual space that the category
occupies in the store. If a category occupies a large area
visually, it may be more likely to attract consumers’ attention
(Chandon et al., 2009[4]). Similarly, a greater number of brands and
items can produce more or less visual attention due to numerosity
perception biases related to subgrouping (Krishna, 2008[17]).
Location influences attention through the type of display case,
the orientation of the category relative to the general store
traffic patterns, and the adjacent categories. For instance,
consumers may notice categories put in high-traffic areas or in a
certain style of display case. Adjacent categories can generate
attention spillovers, whereby a high-frequency category that
obtains a large amount of attention steals attention from
neighboring categories. Alternatively, such a category might lead
to positive spillovers to neighbors, especially ones that are not
known as well. In this case, while visiting the shelf for one
category, the individual notices (perhaps for the first time) an
adjacent display. Relatedly, shopping habits shape attention
through greater awareness and memory for the category as well as
repeated patterns in how the individual generally moves through the
retail environment. Finally, in-store promotions can shape
attention. In our context, the prominent type is shopper club
discounts, which are labelled in the aisle.
The second stage of our process is choice given consideration. We
follow the literature on consideration (Draganska & Klapper
2011[8]) and take consideration to mean that the individual will
make a conscious choice about the option (category) that is
considered, which could be to not buy from the category. During
this process, the consumer examines prices, recalls category
quality expectations, and evaluates any observable attributes
(e.g., the assortment of brands and items). We represent the inputs
to this process via the arrows pointing into choice (i.e.,
assortment, price, and quality expectations). We note that an
important distinction in our framework is that location does not
directly influence the utility of a choice, but rather influences
only attention. We argue that this is on the surface a good
assumption for products like those we study in the dairy
department.
The third stage in the framework is the post-purchase process that
involves consumption and the resultant changes in consumer
expectations and habits. Purchase and consumption generate
experiences that provide signals of the true quality and cause the
consumer to update their expectations about the category quality
(e.g., Shin, Misra, & Horsky 2012[24]). In addition, purchase
and consumption shape memory and habit that can lead the category
to receive greater attention in future shopping visits. For
example, having visited a category in recent days could trigger
memory and resultant attention to the category as the individual
walks through the department containing that category.
2An alternate way of modeling this would be a more “rational
expectations” type approach where consumers decide to consider only
if the expected payoff from choice is high enough (e.g. Seiler,
2013[23]). We choose to focus on the passive attention approach
both because of the broad support for visual attention being
central in the existing literature and because in our context we
have limited price variation needed to identify such a search
model.
3 Data
The data for this project comes from a major U.S. supermarket
chain. Due to confidentiality requirements, the name of the chain
and some details of the data are hidden. The data and analysis
focus on a reset of the dairy department in a particular store of
the chain that occurred in the last week of August, 2015. In
addition, we obtain data on two other stores, which were selected
by the chain as the best “control” stores.
The dairy department in this chain consists of 22 product
categories. These account for ∼ 10% of daily store sales, and have
1550 unique SKUs across stores. The categories and their daily
dairy share are presented in Table 1. Yogurt, milk and shredded
cheese are the main revenue generators in dairy. The product
categories are defined by the chain for operational reasons, but
they largely follow definitions based on substitution. For
example, the categories “Milk,” “Yogurt,” “Hummus,” and “Butter &
Margarine” contain close substitutes. However, others do not, with
“Pasta & Sauce” being the most problematic because the prices
differ markedly between pasta and sauce, and these two sets of
products are complements in many usage situations. Despite these
concerns, throughout we use the product categories defined by the
chain for consistency with the data provider.
                          Avg. Daily                         Avg. Daily                             Avg. Daily
                          Dairy                              Dairy                                  Dairy
Category                  Share (%)   Category               Share (%)   Category                   Share (%)
Butter & Margarine                    Hummus                  2.34       Plant Based Beverages       2.27
Chunk Cheese               6.54       Juice                   6.95       Refrigerated Baked Goods    3.14
Cold Cuts                  7.17       Lunchables              1.08       Shredded Cheese            10.68
Cottage Cheese & Ricotta   2.70       Milk                   14.92       Sliced Cheese               2.22
Cream Cheese               2.67       Pasta & Sauce           0.76       Sour Cream & Dips           2.33
Dairy Creamer              6.11       Pepperoni Rack          0.99       Yogurt                     16.51
Desserts                   1.60       Pepperoni Stick Rack    0.42
Table 1: Descriptive statistics of dairy categories
The data we obtained for this research consist of three distinct
datasets. The first contains floor plans and planograms for the
entire dairy department of all three stores. This rare dataset
provides a detailed accounting of every inch of shelf-space in the
dairy department, allowing us to track where items (i.e.,
stock-keeping units or SKUs) were located, how many facings (the
visual space of an item on the shelf) items have, and how much the
items moved during the reset. The second dataset contains the
census of primary shoppers from the individual-level shopper-club
data for the three stores. This shopper club information contains
visit-level sales data for every purchase each individual made over
the sample period. The third dataset contains daily aggregate
point-of-sale (POS) data on unit and dollar sales for each store.
In the remainder of this section, we discuss each of these datasets
in turn.
3.1 Location Data
The location data contains detailed floor plans and planograms for
the dairy department for the treatment and control locations. These
floor plans and planograms provide a snapshot of how dairy was
arranged within the respective stores between May 1 - Dec 31, 2015.
The reset occurred during the last week of August, which provides
us approximately 120 days before and after the reset when we know
the exact floor layout.
We are limited in exactly what we are able to describe regarding
the reset, but we can say that one part of the floor layout change
involved replacing a central 40 ft. refrigerated coffin-style
cheese bar with a 56 ft. regular cheese bar. At the same time as
this change, many other category shelf-space and location changes
occurred. Figure 2 shows an example store floor plan without any
category indicators.
Figure 2: Changes in treatment store dairy layout before and after
reset
The floor layout plans are very useful, but don’t provide the
detailed data on SKUs. We also obtained planograms for each of the
categories. A planogram is a map of how items are arranged within
the category. Figure 3 shows a sample planogram. Each (colored) box
represents a facing where a SKU is displayed. The planogram also
records Item IDs (hidden in the figure) that can be linked back to
the respective SKUs.
Together, the planograms and floor plans provide two important
dimensions that are not available in standard scanner datasets.
First, they provide rich location information about every item of
every category in dairy. We are able to attach to each location
which locations are its neighbors (i.e., adjacency), and what type
of display case that location uses (e.g., side case, wall case,
cheese bar). Further, we can characterize where on the shelf each
item is. Second, they provide information about facings. With this
information, we can characterize not only the number of items and
brands in a category, but also the number of facings.
3.2 Individual-Level Data
The individual-level data is a shopper-club panel of dairy
purchases for 61,509 consumers who constitute the primary consumers
of the three store locations - Treatment, Control A and Control B3.
The data includes the date and store visited, categories purchased,
and units and dollar amounts spent in the process. The panel data
customers constitute ∼ 30% of the stores’ daily traffic and ∼ 80%
of daily dairy sales during May-Dec 2015.
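The primary-store definition used to build this panel (the store in the chain where a consumer spent the most over the past 52 weeks) can be sketched as follows. The function name and the input layout are hypothetical, and the sketch assumes the transactions passed in have already been restricted to the relevant 52-week window.

```python
from collections import defaultdict


def primary_store(transactions):
    """Return the store with the largest total spend.

    transactions: iterable of (store_id, dollars) pairs for one
        consumer, assumed to cover only the past 52 weeks.
    Returns None if the consumer has no transactions.
    """
    totals = defaultdict(float)
    for store_id, dollars in transactions:
        totals[store_id] += dollars
    if not totals:
        return None
    return max(totals, key=totals.get)
```

For example, a consumer with several small trips to one store and a single large trip to another is assigned to whichever store has the larger dollar total, not the larger visit count.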
We further augment this panel data with historical dairy purchases
for the same individuals going back one year to April 2014. Using
this panel data, we construct standard state-dependence measures
of recency (such as weeks since last category purchase or weeks
since last shopping trip) and frequency (such as the number of
purchases within the category in the past 180 days). We also
record if the
individual has purchased at all within a category since April 2014.
Table 2 shows descriptive statistics for these measures for the
dairy categories. Yogurt is the most purchased as well as the most
recently purchased category on average, whereas Pickles &
Salads is the least recently purchased category on average.
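As a minimal sketch, the recency and frequency measures for a single consumer-category pair could be computed as below. The function name, the argument layout, and the treatment of never-buyers (recency returned as `None`) are assumptions for illustration, not a description of the code used for this paper.

```python
from datetime import date


def state_dependence(purchase_dates, visit_date, window_days=180):
    """Recency and frequency measures as of a given store visit.

    purchase_dates: dates on which the consumer bought the category.
    Returns (weeks_since_last, n_recent, ever_bought), using only
    purchases strictly before visit_date.
    """
    past = [d for d in purchase_dates if d < visit_date]
    ever_bought = bool(past)
    # Recency: weeks elapsed since the most recent prior purchase.
    weeks_since_last = (visit_date - max(past)).days / 7 if past else None
    # Frequency: number of prior purchases inside the look-back window.
    n_recent = sum(1 for d in past if (visit_date - d).days <= window_days)
    return weeks_since_last, n_recent, ever_bought


# Hypothetical purchase history, for illustration only.
history = [date(2014, 10, 1), date(2015, 5, 1), date(2015, 5, 15)]
measures = state_dependence(history, date(2015, 5, 29))
print(measures)
```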
3.3 Aggregate Data
The aggregate data consists of daily SKU-level sales for the dairy
categories of the treatment and control locations from May 1 -
Dec 31, 2015. The data has information on SKU attributes such as
brand and size, and records daily unit and dollar sales for each
SKU. We describe our operationalization of price and promotion
variables below.
3A consumer’s primary store is defined as the store in the
supermarket chain where she spends the maximum dollar amount in the
past 52 weeks.
                           Mean weeks since     Mean no. of purchases     % of customers
Category                   last purchase (sd)   in past 6 months (sd)     never bought*
Butter & Margarine          9.03 (12.76)         5.29 (5.00)               32.61
Chunk Cheese               13.25 (15.89)         3.96 (4.58)               40.68
Cold Cuts                  15.16 (18.00)         3.38 (4.04)               44.96
Cottage Cheese & Ricotta   17.12 (17.87)         2.77 (3.80)               53.10
Cream Cheese               16.29 (17.12)         2.52 (2.98)               47.14
Desserts                   21.72 (20.10)         1.94 (3.20)               61.56
Hummus                     17.76 (18.83)         2.68 (3.47)               62.49
Lunchables                 23.78 (22.26)         2.39 (4.12)               84.03
Pasta & Sauce              25.86 (20.68)         1.16 (1.89)               81.76
Pickles & Salads           27.58 (22.86)         1.24 (1.99)               79.78
Refrigerated Baked Goods   16.90 (16.93)         2.58 (3.30)               51.53
Shredded Cheese             9.10 (13.21)         6.54 (6.46)               33.30
Sliced Cheese              18.08 (19.08)         2.43 (3.21)               60.78
Sour Cream & Dips          13.49 (15.78)         3.34 (3.55)               42.58
Yogurt                      7.82 (13.05)         9.71 (9.20)               31.66
* Proportion of customers with no purchase in the category from
April 2014 to the start of the data
Table 2: Descriptive statistics of recency and frequency measures
3.3.1 Prices
Constructing SKU-specific prices is straightforward using the above
information4. The primary challenge at the SKU-level is that we
must normalize prices to $/oz. or $/fl.oz. for comparison across
heterogeneous unit sizes within category. However, constructing
category prices introduces significant challenges.
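The SKU-level price construction can be sketched as follows, combining the normalization to $/oz. with the imputation rule given in the footnote on SKU prices (days with no units sold take the price from the next day with positive sales). The function name and the input layout are hypothetical; days with no later positive-sales day are left as `None` here, which is an assumption since the paper does not state how that case is handled.

```python
def unit_prices(daily_sales, size_oz):
    """Daily price per ounce for one SKU.

    daily_sales: list of (units, dollars) tuples in date order.
    size_oz: package size in ounces (or fluid ounces).
    Days with zero units take the price of the next day with positive
    sales; trailing zero-sales days stay None.
    """
    prices = [
        dollars / units / size_oz if units > 0 else None
        for units, dollars in daily_sales
    ]
    next_price = None
    # Walk backwards so each gap is filled from the following sale day.
    for i in range(len(prices) - 1, -1, -1):
        if prices[i] is None:
            prices[i] = next_price
        else:
            next_price = prices[i]
    return prices


# Hypothetical 32 oz. SKU with two zero-sales days in the middle.
example = unit_prices([(2, 8.0), (0, 0.0), (0, 0.0), (4, 12.0), (0, 0.0)], 32)
print(example)
```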
The retailer follows an every-day low price (EDLP) strategy,
resulting in little variation in SKU-level gross prices over
time. The retailer does provide some shopper club discounts, and
consumers sometimes use coupons, resulting in small variation in
net prices (gross price minus the discount). Figure 4 shows the
weighted average gross and net prices in the yogurt category in the
Treatment and Control A stores during the sample period5. The
average gross (net) price of yogurt changes post-reset by less than
0.4 (0.5) cents. We also note that much of the change in the
treatment store, where the variation is larger, reflects assortment
changes, which is not a meaningful source of price variation for
estimating price sensitivity.
Despite the limited variation in price, and the fact that it is not
central to our study, we conservatively include price to control
for potential omitted variable bias.6 To produce price series that
aggregate over SKU prices to the category level, we combine two
suggestions offered by Manchanda, et al. (1999)[18].7 Specifically,
for the visits with purchase, we use the price paid as the category
price; otherwise we use the individual-specific weighted price,
where the weights are the share of brands ever bought by the
individual (since April 2014). If the individual has never
purchased the category, we use the weighted net price calculated
from the aggregate data. Although we include price as a control in
order to be conservative, we do not interpret the estimated
coefficients as price sensitivity per se.
(a) Treatment Store (b) Control A Store
Figure 4: Comparison of gross and net price in yogurt
4For days that a SKU sold no units, its price was imputed with the
price from the next day with positive sales. On two days, Oct.
10-11, 2015, the treatment store had a refrigerator malfunction,
which was highly unusual for the store. Hence, we dropped these two
days from our sample.
5We first calculate average prices for each subcategory within the
category, and the weights are chosen to be the sub-category’s share
of category sales. The prices remain very similar if we do this
two-step process at the brand or SKU-level.
6However, we strongly caution against a causal interpretation of
our price sensitivity estimates for three reasons. First, unlike
the typical variation sought to estimate price sensitivity, the
primary variation we obtain with the above measure is across
individuals, rather than over time within individual. Second, the
pricing measure subsumes into the prices the individual differences
in product preferences. For instance, individuals who consistently
buy items that are priced above the average price of the category
due to item-level preferences could appear to have positive
preferences for price. Third, as mentioned earlier, some category
definitions (e.g., Pasta & Sauce) are not consistent with a product
substitution definition for the category. As a result, price
sensitivity may be confounded with complementarity leading to more
price-seeking consumers.
7They offer two alternative approaches: an individual-weighted
average price of brands, where the weights are the share of brands
bought by each individual, and an approach that uses as the
category price the price paid when an item is purchased, and
otherwise it uses the weighted average price.
3.3.2 Promotions
The retailer does not employ display advertising for dairy products
and follows an EDLP strategy. However, there are periods when the
shopper club card or coupons provide discounts, which we can
identify based on differences between the reported gross and net
prices. The data do not report what portion of the monetary
discounts are in the form of promotions (shopper club discounts) or
coupons. For the purposes of the present analysis, we interpret
these monetary discounts as shopper club discounts.8 Because such
shopper club discounts on items are marked with stickers, they may
draw consumers’ attention towards the category, and, of course, the
discount itself can increase the net utility of purchase through
price.
8Even if these monetary discounts were all through coupon
redemptions, it does not pose much of a problem. Instead of our
current assumption that the promotional stickers catch the
consumer’s eye and attract their attention, we would need to assume
that all consumers are aware of the coupons for each category. The
presence of a coupon then attracts their attention towards the
category. This is identical to the assumptions made for display and
feature in standard marketing models.
We define two variables related to these promotions: (a) whether
the category has any promotion, and (b) average promotion depth for
promoted items. Table 3 shows the descriptive statistics of
category promotions. A category is likely to have at least one SKU
on promotion ∼ 42% of the days during our sample period. While on
promotion, ∼ 4% of the SKUs have a price reduction of ∼ 12.5% from
their regular price.
                                                          Min   Median     Mean      Max
Propensity of promotion across categories                0.02     0.42     0.49     0.99
Conditional on promotion, mean promotion depth
  (as % of SKU price)                                    0.12    12.53    13.89    94.34
Conditional on promotion, percent of SKUs on promotion   0.33     4.23     8.09    68.18
Table 3: Descriptive statistics of Category Promotions
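As a rough illustration of how the two promotion variables could be derived from gross and net prices, the Python sketch below computes, for one category on one day, whether any SKU is discounted, the mean discount depth among discounted SKUs, and the share of SKUs discounted. The records, the function name `promotion_summary`, and the convention of measuring depth relative to the gross price are assumptions made for illustration and are not taken from the retailer's data.

```python
from statistics import mean

# Hypothetical SKU records for one category-day: (sku, gross_price, net_price).
records = [
    ("sku_a", 4.00, 4.00),
    ("sku_b", 2.50, 2.00),  # 20% below gross price
    ("sku_c", 3.00, 2.70),  # 10% below gross price
    ("sku_d", 1.99, 1.99),
]


def promotion_summary(rows, tol=1e-9):
    """Return (any_promo, mean_depth_pct, pct_skus_promoted) for one category-day.

    Assumption: a SKU counts as promoted when its net price is below its
    gross price, and depth is the discount as a percent of the gross price.
    """
    depths = [
        100.0 * (gross - net) / gross
        for _, gross, net in rows
        if gross - net > tol
    ]
    any_promo = len(depths) > 0
    mean_depth = mean(depths) if any_promo else 0.0
    pct_promoted = 100.0 * len(depths) / len(rows)
    return any_promo, mean_depth, pct_promoted


any_promo, mean_depth, pct_promoted = promotion_summary(records)
print(any_promo, round(mean_depth, 2), round(pct_promoted, 2))
```

Averaging the first output over days would give a category's propensity of promotion, while the second and third correspond to the conditional depth and conditional share of SKUs summarized in Table 3.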
3.3.3 Other controls
Apart from the variables mentioned above, we also need to control
for seasonality and daily store traffic measures. We control for
the latter with the number of daily store-level transactions and
weekend dummies. For capturing seasonality, we allow for category-
specific month trends. Also, we construct a measure of how much
monthly category sales deviate from the annual average.9 This helps
us to control for non-linear time-trends in category sales.
4 Experimental Analysis and Preliminary Evidence
In this section we focus on “non-structural” evidence related to
the influence of the reset. In particular, we argue that the reset
serves as a kind of ‘natural experiment’ that we can use to measure
the causal effect of the reset and the related category-level
treatments. In this section, we first describe an initial analysis
employing a simple diff-in-diff for the total reset effect and the
corresponding results. We then describe the category-level reset
treatments, the appropriate diff-in-diff-in-diff, and the results
from this detailed analysis. Finally, we present preliminary
evidence consistent with the idea that consumers are learning and
changing their preferences after the reset initiates consideration
for categories that were previously not considered as often.
4.1 Diff-In-Diff Analysis
In this analysis, we aim to measure the average effect of the reset
on store sales. The diff-in-diff analysis compares the change in
dairy sales in the treatment store with that of the control stores:
9This measure is constructed from aggregate monthly category sales
data between April 2014 and May 2015.
\[ \text{Sales}_{st} = \alpha_s + \alpha_{t>T} + \alpha_{s,t>T} + X_{st}\beta + \varepsilon_{st}, \qquad (1) \]
where the subscript $s$ denotes store, $t$ denotes time, and $T$
the time of the reset. $\text{Sales}_{st}$ stands for dollar sales of
dairy in store $s$ at time $t$.10 The store fixed effect $\alpha_s$
controls for store-level differences between treatment and control
locations. The time fixed effect $\alpha_{t>T}$ captures the (post-pre)
difference in sales common across stores. $X_{st}$ contains other time
and store-varying covariates including week effects, day-of-week
effects, and measures of daily traffic. Finally, $\alpha_{s,t>T}$
measures the difference-in-difference estimate of the effect of the
layout reset on dairy sales.
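To make the logic of Equation 1 concrete, the following Python sketch computes a difference-in-difference from cell means. It relies on the standard result that, in a two-store, two-period design without the covariates in X, the interaction coefficient equals the treated store's post-minus-pre change less the control store's change. All sales figures and names below are hypothetical, and the sketch omits the week, day-of-week, and traffic controls used in the paper.

```python
from statistics import mean

# Hypothetical daily dairy sales in dollars, keyed by (store, period).
sales = {
    ("treatment", "pre"): [12400.0, 12600.0, 12500.0],
    ("treatment", "post"): [13900.0, 14100.0, 14000.0],
    ("control", "pre"): [9900.0, 10100.0, 10000.0],
    ("control", "post"): [11100.0, 11300.0, 11200.0],
}


def diff_in_diff(cells):
    """Post-minus-pre change in the treated store, net of the control change."""
    cell_means = {key: mean(values) for key, values in cells.items()}
    treated_change = cell_means[("treatment", "post")] - cell_means[("treatment", "pre")]
    control_change = cell_means[("control", "post")] - cell_means[("control", "pre")]
    return treated_change - control_change


did = diff_in_diff(sales)
print(did)
```

In this toy example the treated store gains 1500 and the control store gains 1200, so the common post-reset shift is differenced out and only the remaining 300 is attributed to the reset.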
The two major threats to validity for this analysis are selection
on the timing of the treatment by the store and selection of the
actual store that was treated. Neither the timing nor the choice of
treatment store for the reset was driven by predictions about future
demand (i.e., selection). The timing was driven by an external
calendar related to implementing new software that managed
operational aspects not specific to the dairy category. Many
different stores were in line for this IT implementation, and this
store was chosen for this timing essentially “at random.” The store
was selected to reset the dairy department (other stores reset other
departments). Although this choice was not randomly assigned,
conversations with managers indicate that it was a function of
experimentation as much as forecasted opportunity.
The retailer identified the two control stores based on their own
assessment of best match. The two control locations, in fact,
closely resemble the treatment store in key observables including
the time since the store opened/was remodeled (within the last 15
years), competing supermarkets within a 5-mile radius, whether the
store has an in-house pharmacy, the number of aisles within the
store, and the categories adjoining the dairy department. Since the stores
are within a 10-mile radius of one another, they also share similar
market conditions and customer demographics. The primary consumers
going to these stores also share similar characteristics both in
terms of dairy expenditure per shopping trip and time between
visits (see appendix A). Hence, the evidence suggests that the
stores serve as effective control stores to treat the reset in the
treatment store as a natural experiment. In other words, in the
absence of this treatment, we would expect the stores to follow
similar demand trends.
The data also appear consistent with these arguments for the
validity of the “natural” experiment. Figure 5
shows the trends in average daily11
dairy sales across the treatment and control locations. The black
line in between July and October denotes the time of the reset. The
treatment and control locations show very similar time series
patterns in dairy sales before reset. This gives credence to the
common trends assumption for performing the diff-in-diff analysis.
The treatment store is bigger in size, accounting for the higher
level of sales. However, these time-invariant differences between
stores will be absorbed via store fixed effects, and do not pose a
problem to the diff-in-diff analysis. The small increase in the
treatment store following the reset is
10We also conduct this analysis using individual-level data.
11Daily average constructed as total weekly sales of dairy divided by
the number of days. This smooths out the day-to-day variation in
sales. If the week included national holidays when the
treatment/control locations were closed, the number of days is
adjusted accordingly.
difficult to see from the figure, but apparent upon closer
inspection.
Figure 5: Daily sales in dairy across locations
Table 4 shows the results from the diff-in-diff analysis12. The
average increase in sales is $325 per day, which accounts for ∼
2.6% of daily dairy sales at the treatment store pre-reset. Thus,
the reset appears to have led to a practically meaningful increase
in sales. Note also that an unreported analogous diff-in-diff on
individual-level data provides a similar, statistically significant
2.6% increase in spending per individual (aggregating across
visits).
                                Estimate   Std. Error
Post-Reset                       1173.68       479.53   *
Treatment Store                  -347.61       135.71   **
No. of Transactions                 3.93         0.08   ***
Post-Reset*Treatment Store        325.03       152.15   *
Regressors include store, week and day fixed effects, and no. of
daily transactions. Significance: * - 5% , ** - 1%, *** - 0.1%
Adjusted R-squared: 0.9361
Table 4: Difference-in-Difference Analysis on Aggregate Store
Data
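As a quick consistency check on the reported magnitudes, the arithmetic below backs out what the interaction estimate in Table 4 implies. Because the 2.6% figure is rounded, the implied baseline is only approximate, and the ratio of the estimate to its standard error is computed here purely for illustration.

```latex
\[
\text{implied pre-reset daily dairy sales} \approx \frac{325.03}{0.026} \approx \$12{,}500,
\qquad
\frac{325.03}{152.15} \approx 2.14 .
\]
```

A ratio of about 2.14 is in line with the single star (5% level) attached to the Post-Reset*Treatment Store estimate.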
4.2 Diff-in-Diff-in-Diff Analysis
The diff-in-diff analysis can reveal the average treatment effect
from the reset, but it does not provide information about the
various kinds of location and assortment changes that took place as
part of the reset. For this second analysis design, we use a
diff-in-diff-in-diff (DDD) approach. Before we present the details
of the model, we first introduce the category-level reset
treatments related to location and assortment.
12Note that this analysis is performed with aggregate store-level
dairy sales data alone.
4.2.1 The Category-level Reset Treatments
The reset affected each of the 22 dairy categories in the treatment
store differently.13 For instance, Yogurt increased its space
allocation from 24 ft. to 36 ft., but remained in the same wall-fed
display case. Desserts & Toppings, on the other hand, moved to
the opposite side of the aisle, changed display cases and added 15
SKUs to its assortment. Three other categories (Milk, Egg Beaters,
and Juice) had no changes in assortment or location. In fact, the
majority of changes occurred to the categories in the Wall Case,
Cheese Bar, and Side Case, which include 15 categories. The other
seven categories were in locations that are physically more distant
or quite different in appearance (e.g., a pepperoni rack at the end
of the cheese bar) and none of these others moved. Hence, we focus on
the 15 categories for the DDD analysis.14
We categorize 11 broad types of reset actions/treatments
implemented by the retailer at the category-level. We use the
following specifications for defining each reset action -
(i) Moved - If the center of the category moved from its original
position
(ii) Rearranged - Whether an item (SKU) within the category was
displaced horizontally and/or vertically. No. of rearrangements
counts the no. of SKUs rearranged within the category.
(iii) Changed Display - Whether the category’s display case was
changed to the cheese bar, wall case or a side case. Note that one
can move a category without changing its display case, but not vice
versa.
(iv) Split Category - If two categories shared the same location
earlier, and were separated post-reset
(v) Changed Orientation - If the category was moved from left to
right (L → R) to the outer section of the dairy department, or right
to left (R → L) towards the inner section of dairy, with respect to
the direction of traffic
(vi) ∂Items, ∂Brands - No. of items (SKUs)/brands changed in the
category post-reset. The former is equal to #(Items Added) +
#(Items Deleted) in the category. A similar definition is used for
∂Brands. No brands were removed post-reset, hence the number of
brands changed equals the number of brands added to the
category.
(vii) ∂Facings - Difference between the no. of facings allocated to
the category post-reset and pre-reset. This proxies for changes in
space allocation to the category.
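The assortment-related definitions in (vi) and (vii) can be sketched in Python as follows. This is a minimal illustration under the assumption that each planogram is available as a mapping from SKU to brand and number of facings; the SKUs, brands, and the function name `assortment_changes` are hypothetical.

```python
# Hypothetical planogram snapshots for one category: SKU -> (brand, facings).
before = {
    "sku1": ("BrandA", 3),
    "sku2": ("BrandA", 2),
    "sku3": ("BrandB", 4),
}
after = {
    "sku1": ("BrandA", 4),
    "sku3": ("BrandB", 4),
    "sku4": ("BrandC", 2),
    "sku5": ("BrandC", 1),
}


def assortment_changes(pre, post):
    """Compute the item, brand, and facing changes between two planograms."""
    items_added = set(post) - set(pre)
    items_deleted = set(pre) - set(post)
    brands_pre = {brand for brand, _ in pre.values()}
    brands_post = {brand for brand, _ in post.values()}
    return {
        # Items changed = items added + items deleted.
        "d_items": len(items_added) + len(items_deleted),
        # Brands changed = brands added + brands removed.
        "d_brands": len(brands_post - brands_pre) + len(brands_pre - brands_post),
        # Facings change = post-reset facings minus pre-reset facings.
        "d_facings": sum(f for _, f in post.values()) - sum(f for _, f in pre.values()),
    }


changes = assortment_changes(before, after)
print(changes)
```

In this toy category two SKUs are added and one is deleted, one brand is added, and total facings rise from 9 to 11.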
Table 5 shows the descriptive statistics of these actions. The
reset actions from Table 5 can be characterized as changes in
location (columns 3-10) and assortment (columns 11-13).
Changes in location include moving, changing the display case and
orientation, and splitting the category into two or more
sub-categories. As mentioned in section 2, we argue that location
does not generate utility from purchase directly and instead
affects only consideration. We argue that in our setting there are
four ways location is likely to
13Since dairy requires use of cooling infrastructure like wall-fed
freezers, the reset was contained within the same 120ft X 40ft area
of the store, and did not disrupt the arrangement of other
categories.
14We do use the eliminated 7 dairy categories to control for
whether the individual’s shopping trip took them near the dairy
department.
influence consumers’ consideration. First, the new location can
have different contextual cues to remind the shopper to consider
buying, such as type of display case and orientation to traffic.
Second, moving categories to high-traffic areas such as the wall
case or the central cheese bar may attract attention from more
consumers than lower traffic areas (and vice versa). Third, a
location change can confuse regular customers of a category who
then have to find the category in its new location. Such confusion
could reduce consideration. Fourth, moving a category to a new
location can change its neighbors. If there are spillovers in
attention between neighboring categories, this again changes
consumers’ consideration of the category in question. This last
change is not explicitly in our treatments, but rather captured in
covariates, which we discuss shortly.
The last three treatments in Table 5 pertain to changes in category
assortment, which as we mentioned previously are conceptualized to
affect both consumption utility and consideration (attention).
Adding or deleting SKUs (or brands) from the category changes the
choice set faced by consumers, and hence, the utility they derive
from category consumption. Assortment changes can also mechanically
change the total visual space of the category (number of facings)
and change the perception of it (number of brands and items).
The linear independence of columns 3-13 of Table 5 indicates that
all 11 treatments can be separately identified from one
another. For instance, four categories (Butter & Margarine,
Shredded Cheese, Sour Cream & Dips, and Yogurt) moved to a
different location on the same wall-fed case, without changes in
display case or orientation. Chunk Cheese moved to a different
style of display case without a change in orientation, and nine
categories moved to a different display case and changed
orientation (see Table 5).
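The identification argument amounts to a rank condition on the category-by-treatment matrix: the treatments are separately identified only if its columns are linearly independent. The Python sketch below checks this with exact Gaussian elimination. The two small matrices are hypothetical stand-ins and do not reproduce Table 5.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a matrix via Gauss-Jordan elimination with exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below the current rank with a nonzero entry.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate this column from every other row.
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


# Hypothetical rows = categories; columns = (moved, changed display, changed orientation).
treatments = [
    [1, 0, 0],  # moved only
    [1, 1, 0],  # moved and changed display case
    [1, 1, 1],  # moved, changed display case and orientation
    [0, 0, 0],  # untouched
]
# Hypothetical counterexample: every moved category also changed display case.
collinear = [[1, 1], [1, 1], [0, 0]]

print(matrix_rank(treatments), matrix_rank(collinear))
```

The first matrix has full column rank, so its three treatments could be separated. In the second, the two columns always coincide, so their effects could not be told apart.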
4.2.2 DDD Analysis Approach
We use a modified DDD analysis approach to measure the effect of
each reset action (treatment). We generalize Equation 1 to
\[ y_{ijst} = \alpha_i + \alpha_j + \alpha_s + \alpha_{t>T} + \alpha_{j,s} + \alpha_{j,t>T} + \alpha_{s,t>T} + \sum_{k=1}^{11} \tau_{k,j,s,t>T}\,\gamma_k + X_{ijst}\beta + \varepsilon_{jst}, \qquad (2) \]
where $i$ indicates individual, $j$ stands for category, and
$k = 1, \ldots, 11$ stands for treatment type. Therefore,
$\tau_{k,j,s,t>T}$ is a dummy that is 1 if category $j$ in the
treatment store underwent treatment $k$ during the reset. The
dependent variable $y_{ijst}$ is, in turn, a measure of purchase
incidence, purchase quantity, or churn (defined later).
In the above DDD specification, we control for store-, category-
and post-reset fixed effects, and combinations of these. These
fixed effects absorb time-invariant unobservables that affect
categories in each store, store-invariant unobservables affecting
categories post-reset, and so on. The effect $\alpha_{s,t>T}$ picks up
unobservables that affect the treatment store in the post-reset
period, i.e., are common across all categories. In addition, we
include individual fixed effects $\alpha_i$. $X_{ijst}$ includes
category-, time- and store-varying covariates including controls for
category-month-of-year effects, day-of-week effects, measures of
daily traffic, year-ago seasonality controls, seasonal time trends,
state dependence variables for recency and frequency of purchase
within category, category-weekend/holiday effects, and
category-specific controls for price, promotions, and depth of
promotions.
Finally, we estimate the treatment effects $\gamma_k$ by comparing the
categories with $\tau_k = 1$ and $\tau_k = 0$ in the treatment store
post-reset. We modify the standard DDD design because, rather than
allowing category-specific treatment effects $\alpha_{j,s,t>T}$, we
assume that the category-specific treatment effects are constrained
to be the sum of the treatment effects arising from the reset.
Formally, we assume that category-specific treatment effects
are
\[ \alpha_{j,s,t>T} = \sum_{k=1}^{11} \tau_{k,j,s,t>T}\,\gamma_k, \qquad j = 1, \ldots, J. \qquad (3) \]
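The restriction in Equation 3 says that a category's total reset effect is the sum of the coefficients on the treatments it received. The Python sketch below illustrates this additivity using a subset of the purchase-incidence coefficients reported in column (1) of Table 6. The two treatment profiles and the names `category_x` and `category_y` are hypothetical and do not correspond to actual rows of Table 5.

```python
# Purchase-incidence coefficients (gamma_k) for a subset of treatments,
# taken from column (1) of Table 6.
gamma = {
    "display_to_wall_case": 0.026,
    "display_to_side_case": -0.026,
    "split_category": -0.035,
    "orientation_L_to_R": -0.042,
    "orientation_R_to_L": -0.001,
}

# Hypothetical treatment dummies (tau_k) for two illustrative categories.
profiles = {
    "category_x": {"display_to_wall_case": 1, "orientation_R_to_L": 1},
    "category_y": {
        "display_to_side_case": 1,
        "split_category": 1,
        "orientation_L_to_R": 1,
    },
}


def implied_effect(tau, coefs):
    """Sum of tau_k * gamma_k over the treatments a category received."""
    return sum(coefs[name] * value for name, value in tau.items())


effects = {name: implied_effect(tau, gamma) for name, tau in profiles.items()}
for name, effect in effects.items():
    print(name, round(effect, 3))
```

Under these assumed profiles, a move to the wall case with a right-to-left reorientation nets out to roughly +0.025, whereas a split category moved to the side case with a left-to-right reorientation accumulates to roughly -0.103.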
Because our study focuses on location we also include in Xijst
variables that are related to adjacency and not part of the DDD
treatments, per se, but do have variation induced by the reset. We
conceptualize the adjacency effects on purchase likelihood as
arising from attention and consideration. We term the categories that
an individual visits regularly high-frequency categories for that
individual15. Consumers may be more likely to try a category if it
is placed next to a category they visit often. Because the 11
treatments that we specify above do not account for adjacency
effects, we add four variables to capture adjacency at the
individual level: (1) is category j next to a high-frequency
category j′ for consumer i, (2) has individual i never purchased
from category j, which is next to high-frequency category j′ for
consumer i, (3) how many categories surrounding j are on promotion,
and (4) how many categories not surrounding j are on promotion.
Together, we use these measures to assess the effect of adjacency.
To measure these adjacency effects we need to be able to separate
category complementarities, which are interactions between
categories, from adjacency. Note that this separation is possible
because we have variation in the location of the categories so that
the categories have constant benefit from complementarities, but
within individual the location of those categories varies across
stores and over time.16
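The first two adjacency variables can be sketched in Python as follows, using the definition that a category is high-frequency for a shopper when her purchase propensity in it exceeds her median propensity across dairy categories. The propensities, the linear shelf order, and the rule that "never bought" means a propensity of zero are assumptions made for illustration only.

```python
from statistics import median

# Hypothetical purchase propensities for one shopper (share of trips with a purchase).
propensity = {
    "Yogurt": 0.40,
    "Butter & Margarine": 0.30,
    "Chunk Cheese": 0.10,
    "Sour Cream & Dips": 0.05,
    "Desserts & Toppings": 0.00,
}

# Hypothetical left-to-right shelf order; neighbors are the immediate left/right categories.
shelf_order = [
    "Desserts & Toppings",
    "Yogurt",
    "Butter & Margarine",
    "Chunk Cheese",
    "Sour Cream & Dips",
]


def neighbors(category, order):
    """Categories immediately to the left and right of `category`."""
    i = order.index(category)
    return [order[j] for j in (i - 1, i + 1) if 0 <= j < len(order)]


def high_frequency(props):
    """Categories whose propensity exceeds the shopper's median propensity."""
    cutoff = median(props.values())
    return {c for c, p in props.items() if p > cutoff}


def adjacency_features(category, props, order):
    """Adjacency dummy and its interaction with never having bought the category."""
    hf = high_frequency(props)
    adjacent_hf = any(n in hf for n in neighbors(category, order))
    never_bought = props[category] == 0.0  # assumption: zero propensity = never bought
    return {
        "adj_to_high_freq": int(adjacent_hf),
        "never_bought_x_adj": int(never_bought and adjacent_hf),
    }


for cat in shelf_order:
    print(cat, adjacency_features(cat, propensity, shelf_order))
```

Because the high-frequency set is shopper-specific, the same shelf position can be "adjacent to a high-frequency category" for one shopper and not for another, which is what gives these variables individual-level variation.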
We also include controls for initial location and assortment
including variables for the type of case and for the number of
items, brands, and facings in the category before the reset (in the
control stores, these have the same value in the post-reset
period). Including these variables implies that our treatment
effects are measured as the effect of changes in location and
assortment.
Now that we have described the model formulation, we discuss the
threats to validity for this DDD approach. The exogenous timing
assumption and use of (valid) control stores addresses many of the
potential concerns. However, the additional difference not only
produces additional controls in the form of fixed effects as described
above, but also involves additional demands because the treatment
effects are assigned at the category-store level rather than at the
store level. As long as the treatment was not assigned to
categories in
15Category c is a high-frequency category for individual i if her
propensity to buy c exceeds her median propensity to buy in the
dairy categories. If i has no purchases in dairy in the previous
year or has fewer than 5 shopping trips, we define six categories -
Butter & Margarine, Chunk Cheese, Cold Cuts, Shredded Cheese,
Sour Cream & Dips and Yogurt - as high-frequency categories for
i. Since these are high-revenue categories, the average individual
is more likely to consider these categories for
purchase.
16Complementarities are sometimes evaluated by including
cross-price or cross-promotional variables. If the estimated
cross-price coefficients were negative, we would infer the presence
of complementarities. In our context, since the variation in prices
is minimal, we do not estimate a full set of cross-price
coefficients, but include the cross-category promotions for
non-adjacent categories.
anticipation of a specific store-time-category level demand shock,
the treatment effects, τk, are conditionally exogenous. The main
remaining “issue” related to such anticipated shocks is the
prediction of unmet demand. Such unmet demand could be tapped
by the category-level treatments. However, philosophically, such
untapped demand is exactly the cause for our treatment effects. Our
theoretical view is that the unmet demand is due to consumers not
considering products from which they would otherwise obtain
sufficient utility to choose to buy them. If there are no
such opportunities, then there is no room for the reset to generate
benefit. Hence, we view our estimates as conditional on the nature
of the untapped potential.17
Another potential concern for the assortment-related treatments
could be if the retailer has pre-existing contracts with major
brands to allocate space to them. If such contracts exist, then one
can question whether the retailer has autonomy to experiment with
the category assortments. However, the retailer has a policy of
accepting no slotting allowances, and claims complete control over
space allocation and assortment decisions within the stores.
4.2.3 DDD Results
Table 6 shows the results of the DDD analysis.18 We estimate
Equation 2 with three dependent variable specifications - category
purchase incidence (Column 1), log purchase quantity (Column 2) and
churn (Column 3). Category purchase incidence records a binary
outcome - whether individual i bought category j during shopping
visit (s, t). The second outcome records the quantity purchased. We
normalized all the units to ounces or fluid ounces. Using the log
enables us to express the treatment effect as a % increase in
quantity purchased. Finally, we define individual i as a churned
customer for category j if i has not purchased j in the past 13
weeks (91 days).19
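The churn indicator can be sketched in Python as below. The 91-day window comes from the definition above; the exact window boundaries (inclusive start, excluding the visit day itself) and the treatment of shoppers with no purchase history as churned are assumptions of this sketch, and the dates are hypothetical.

```python
from datetime import date, timedelta

CHURN_WINDOW_DAYS = 91  # 13 weeks, per the definition in the text


def is_churned(purchase_dates, visit_date, window_days=CHURN_WINDOW_DAYS):
    """True if no category purchase falls in the window before visit_date.

    Assumption: the window covers the `window_days` days up to, but not
    including, the visit date.
    """
    window_start = visit_date - timedelta(days=window_days)
    return not any(window_start <= d < visit_date for d in purchase_dates)


# Hypothetical purchase history for one shopper in one category.
history = [date(2015, 5, 1)]

print(is_churned(history, date(2015, 7, 1)))   # purchase 61 days earlier
print(is_churned(history, date(2015, 7, 31)))  # purchase exactly 91 days earlier
print(is_churned(history, date(2015, 9, 1)))   # purchase 123 days earlier
```

Changing `window_days` to 182 gives the 26-week variant mentioned in the robustness check.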
The results in Table 6 show patterns that are consistent with our
primary thesis. Because in most cases the three dependent variables
present a common picture (i.e., direction of effect), we focus our
interpretation on the probability of purchase and discuss the other
measures primarily when they differ meaningfully. First, across all
specifications, the reset treatment effects support the theory that
location influences consumer attention and consideration. Moving an
item reduces its probability of purchase. The number of items
rearranged within the category reduces the propensity to purchase.
Splitting categories into sub-categories also has a negative
effect. These effects suggest that the reset can break existing
shopping habits and increase the difficulty of finding the category
or preferred items within the category (reducing the probability of
purchase). Further, these effects reflect the negative side of a
reset that constrains how frequently a retailer wants to adjust a
department.
Second, switching display cases or orientation changes the
contextual cues of the category. Moving the category to
high-traffic areas (the wall case or the cheese bar) increases the
purchase propensity and quantity. Moving to a low-traffic area (the
side
17We consider this conditioning as similar to the conditioning that
every study on advertising faces in terms of conditioning on the
kind of advertisement creatives that were used. Of course, if
better ads were used this would lead to higher advertising
elasticities, but this fact doesn’t invalidate the measurement of
advertising elasticities, and neither should it invalidate the
measurement of store resets.
18Only the focal parameter estimates are reported here. Refer to
Appendix B for remaining estimates.
19We try a similar specification with a
cap of 26 weeks (182 days), and results remained similar.
                                          (1) Purchase   (2) Log Purchase   (3) Churn
                                              Incidence      Quantity
[first row: label and estimates missing]
                                           (0.001)        (0.003)        (0.002)
Log # of Items Rearranged                  -0.007***      -0.019***       0.001
                                           (0.001)        (0.002)        (0.001)
Changed Display to Cheese Bar               0.026***       0.060***      -0.033***
                                           (0.002)        (0.007)        (0.004)
Changed Display to Wall Case                0.026***       0.068***      -0.112***
                                           (0.002)        (0.005)        (0.003)
Changed Display to Side Case               -0.026***      -0.078**        0.001
                                           (0.002)        (0.006)        (0.004)
Split Category                             -0.035***      -0.094***       0.161***
                                           (0.003)        (0.008)        (0.003)
Changed Orientation L → R                  -0.042***      -0.106***       0.052***
                                           (0.003)        (0.010)        (0.005)
Changed Orientation R → L                  -0.001          0.003         -0.040***
                                           (0.002)        (0.005)        (0.003)
∂Log Brands                                -0.083***      -0.210*         0.042***
                                           (0.006)        (0.019)        (0.012)
∂Log Items                                  0.094***       0.275***       0.065***
                                           (0.007)        (0.022)        (0.012)
∂Log Facings                                0.051***       0.133***      -0.217**
                                           (0.003)        (0.010)        (0.005)
Adjacency
Adjacent to high frequency category        -0.005***      -0.014***       0.075***
                                           (0.000)        (0.000)        (0.000)
Never Bought* Adj. to high freq. cat.       0.004***       0.007***      -0.280***
                                           (0.000)        (0.000)        (0.000)
No. of adj. cat. on promo.                  0.001***       0.003***       0.000
                                           (0.000)        (0.000)        (0.000)
No. of non-adj. cat. on promo.              0.000          0.000          0.000
                                           (0.000)        (0.000)        (0.000)

Location & Assortment Size before reset     X              X              X
State Dependence                            X              X              X
Category-price controls                     X              X              X
Category-promotion controls                 X              X              X
Category-promotion depth controls           X              X              X
Category-control store fixed effects        X              X              X
Category-treatment store fixed effects      X              X              X
Category-weekend fixed effects              X              X              X
Category-month fixed effects                X              X              X
Individual fixed effects                    X              X              X

No. of Individuals                          65,019         65,019         65,019
No. of Observations                         15,297,720     15,297,720     15,297,720
Adjusted R-squared                          0.4219         0.4469         0.2367
∂log Brands, ∂log Items, ∂log Facings have all been defined as
log(Items/Brands/Facings after reset) - log(Items/Brands/Facings
before reset). Robust standard errors in parentheses. Significance:
* - 5% , ** - 1%, *** - 0.1%
Table 6: Difference-in-Difference-in-Difference Analysis
case) reduces purchase propensities and quantity. Similarly,
changing the orientation of categories from right to left and
moving them to the interior of the dairy department (again, a high
traffic area) decreases churn, whereas changing orientation from
left to right and moving to the outer part of the dairy area
decreases the purchase propensity and quantity, and increases
churn.
Third, considering the treatments that dealt with changes in
assortment, a one-log-point increase in category size (∂Log Facings)
increases the probability of category purchase incidence by 5
percentage points (roughly 0.05 percentage points for a 1% increase).
Similarly, a one-log-point change in assortment (∂Log Items) raises
the probability of purchase by 9 percentage points (roughly 0.09
percentage points for a 1% change), but also increases the
probability of churn. Combining this result
with that of purchase incidence and quantity indicates that,
although the assortment changes are beneficial for frequent buyers,
they increase the uncertainty for infrequent buyers and push
them to non-purchase. Changing the brands within the assortment has
a detrimental effect across specifications.
Fourth, the adjacency effects also reveal a nuanced influence of
neighboring categories. Recall that moving a category’s location
also automatically changes its neighbors. Being adjacent to a
high-frequency category reduces the probability of purchase by 0.5
percentage points (correspondingly a 1.4% decrease in quantity and
7.5% increase in churn). This points towards an attention-stealing
story, where more popular neighbors steal customers’ attention away
from the category (see Hong et al., 2016 [14] for a related result).
However, if the individual has never bought in the category, then
having an adjacent high frequency category increases the chances of
trial vis-a-vis being next to a low-frequency category. This
implies that consumers are more willing to try, but note that once
the consumer picks this category, she is less likely to pick it the
next time (due to the dominant negative effect), pointing to lower
retention and purchase frequency post-trial. Similarly, a
category’s probability of purchase increases if an adjacent
category has a discount/promotion, but is unaffected by the
discounts in non-adjacent categories. This result suggests that
adjacency is important to capturing attention, and that the
adjacency effects are not simply proxying for category
complementarities.
The above results suggest that location influences choice, which we
theorize as operating through attention and consideration.
Further, these influences are practically important in magnitude.
However, the last set of results about adjacency point to a
distinction between short and long-term effects. In the next set of
analyses, we explore whether the patterns in the data are
consistent with learning being initiated by the reset.
4.3 Evidence of Learning
To shed initial light on the post-purchase stage of our conceptual
framework, we wish to investigate whether the reset induced
learning among consumers. Traditionally, shifts in market share and
changes in purchase probabilities are taken as evidence of
learning. We therefore plot the evolution of purchase propensities
over time across categories.20 Figure 6 shows this evolution for
two categories, Sour Cream & Dips and Yogurt, across the
treatment vs. average of the control stores. The black line in the
center denotes the day of the layout reset. Sour Cream & Dips
is a much lower share category than Yogurt, and hence many
individuals would not have tried it in the past. However, because
it
20Purchase propensity is calculated from the data by dividing the
number of purchases in the category by the total number of store
visits for each month in the sample period.
was moved and split from another category, it now occupies a
different location that will be adjacent to new consumers’
high-frequency categories. Thus, the reset brought it new attention
from many consumers, inducing them to try it, significantly
changing its purchase probabilities compared to the control
locations. In particular, the purchase probabilities narrow the gap
over the months after the reset. On the other hand, yogurt is a
high-share category and many more consumers have ample experience
in the category. Unsurprisingly, the reset appears to induce less
change (though there is still a slight rise in purchase right after
the reset) in Yogurt’s purchase probabilities versus the control
stores. Similar plots for each category are available in Appendix
C. These patterns suggest there may be different short-term and
long-term effects of the reset on categories.
(a) Pickles & Salads (b) Yogurt
Figure 6: Pickles & Salads and Yogurt purchase probabilities
across stores
We implement the above idea in the DDD context by looking at the
reset effects on purchase probabilities over time. If learning is
occurring in our context, we should find long-term effects of the
reset actions on purchase probabilities that are different from the
short-term effects. To test this hypothesis, we repeat DDD
specification (1) but add a separate long-term effect for each
reset action. In this case, we define short-term as the first month
after the reset (Sept. 2015), and long-term as the three months
after that (Oct-Dec 2015). Table 7 shows the results of this
analysis. All the reset actions have significant effects on
purchase probabilities in the long term, and some have significant
differences from the short-term effects (we report the differences
in the table). The effect of being moved worsens with time, perhaps
suggesting that consumers don’t return to prior consideration
levels for the moved categories and instead find new substitutes.
Having the category split improves with time, suggesting the newer
prominence and assortment of a split category eventually builds up
habit (state dependence) or expectations around the higher quality.
Interestingly, the effect of the difference in log brands becomes
more negative over time, suggesting that the long-term effect of
changes in brands is negative. Finally, the effect of the difference
in log items increases with time, suggesting after some time
consumers learn that they like the new variety. The reset effects become
significantly stronger in the long term for the assortment-related
variables (i.e., for changes in Items and Brands). Since the
assortment resets affect consumers’ utility of consumption, this
evidence is also consistent with learning. Overall, we take these
preliminary investigations as indicative
of the need to model the dynamic effects of the reset through
learning.
Purchase Incidence                         Short-Term     Long-Term
                                           Effects        Differences from short-term
[first row: label and estimates missing]
                                           (0.001)        (0.002)
Log # of Items Rearranged                  -0.006***       0.000
                                           (0.001)        (0.001)
Changed Display to Cheese Bar               0.026***      -0.001
                                           (0.004)        (0.004)
Changed Display to Wall Case                0.028***      -0.003
                                           (0.003)        (0.003)
Changed Display to Side Case               -0.025***       0.000
                                           (0.003)        (0.003)
Split Category                             -0.041***       0.009*
                                           (0.004)        (0.0064)
Changed Orientation L → R                  -0.044***       0.005
                                           (0.005)        (0.004)
Changed Orientation R → L                   0.001         -0.002
                                           (0.003)        (0.003)
∂Log Brands                                -0.068***      -0.017*
                                           (0.009)        (0.009)
∂Log Items                                  0.076***       0.021*
                                           (0.010)        (0.009)
∂Log Facings                                0.055***      -0.006
                                           (0.005)        (0.005)

Adjacency controls                          X
Location & Assortment Size before reset     X
State Dependence                            X
Category-price controls                     X
Category-promotion controls                 X
Category-promotion depth controls           X
Category-control store fixed effects        X
Category-treatment store fixed effects      X
Category-weekend fixed effects              X
Category-month fixed effects                X
Individual fixed effects                    X
No. of Individuals                          65,019
No. of Observations                         15,297,720
Adjusted R-squared                          0.4219
∂log Brands, ∂log Items, ∂log Facings have all been defined as
log(Items/Brands/Facings after reset) - log(Items/Brands/Facings
before reset). Robust standard errors in parentheses. Significance: *
- 5% , ** - 1%, *** - 0.1%
Table 7: Difference-in-Difference-in-Difference Analysis with
short-term and long-term changes
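Since Table 7 reports long-term effects as differences from the short-term effects, the implied long-term levels can be recovered by adding the two columns. The worked arithmetic below does this for the three treatments with significant differences, assuming the long-term level is simply the short-term estimate plus the reported difference.

```latex
\begin{align*}
\text{Split Category:} \quad & -0.041 + 0.009 = -0.032 \\
\partial\text{Log Brands:} \quad & -0.068 + (-0.017) = -0.085 \\
\partial\text{Log Items:} \quad & \phantom{-}0.076 + 0.021 = \phantom{-}0.097
\end{align*}
```

Read this way, the split penalty shrinks by roughly a fifth in the long term, while the brand and item effects each grow in magnitude by about a quarter.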
4.4 Summary
The descriptive analyses above demonstrate that the layout reset
has an overall positive effect on total sales and on
individual-level category purchase incidence. The evidence
indicates that the influence of the reset operates through a
combination of location and assortment effects. As discussed
earlier, we theorize that these location effects do not affect
choice utility directly, but instead shape attention, and, as a
result, the likelihood of considering a product. Thus, our evidence
on location effects, including the influence of the reset actions
and the adjacency results, supports an attention effect. We also
theorize that assortment effects directly influence both
attention/consideration and category preferences (i.e., changing
the individual’s perceived quality of the category). We find
evidence supporting assortment effects as well. Further, we
provide preliminary evidence that some of these location and
assortment effects change over time and that the aggregate purchase
probabilities change in ways consistent with learning. Such
patterns could arise from consumers learning their preferences due
to trial induced by the location and assortment changes from the
reset. As a collection, these results are consistent with our
three-stage conceptual framework. In the next section, we will
develop a model of consumer demand that can capture the
consideration, choice, and learning processes that this DDD
evidence supports.
5 Model
We propose that consumers consider and purchase multiple categories
in the same shopping trip. In our model, consumers have limited
attention, and location (among other factors, including habit)
shapes that attention. More attention on a category implies a
higher likelihood of consideration. There are no upper or lower
limits on the number of categories consumers can consider.
We envision the following scenario: consumers walk down the dairy
aisles and, if a category attracts their attention, they stop and
think about whether to buy in the category or not. We call this
process of attracting attention ‘consideration.’ In this scenario,
consumers are not aware of prices while walking down the aisle.
Prices are revealed once they consider the category. Hence, this is
a passive model of attention, where consumers do not plan attention
allocation.21
Once a consumer considers a category, they exert effort to make
optimal choices based on the information they have at the time,
including prices and expectations about their ‘match-value’ with
the category. After purchase, consumers gain new information about
the quality of the category through their consumption experience,
and, as a result, update their expectations about the quality of
the category, thereby shaping future purchases.
21In other words, it is not a “rational” model of search where
consumers decide optimally whether the anticipated benefits of
effortful search/consideration are worth the costs (à la Seiler,
2013[23]). The important modeling distinction is whether consumers
make consideration “choices” based on expectations about the net
utility for the category. In Seiler’s paper, consumers have
expectations on prices and evaluate their expected utility from
consumption to decide whether to search in the category. This is
equivalent to saying that consumers are aware of prices (and
perhaps other features of the category) initially. In our context,
it is improbable that consumers are aware of prices in all dairy
categories, particularly ones that they don’t regularly visit.
Further, given the EDLP strategy, prices don’t vary much so the
incentive to shop for prices is limited. As a result, we formulate
a model of passive attention rather than active search.
Thus, in our model, increasing attention can increase the chance of
purchase, which can shift expectations either up or down so that
the net effect of attention shifters can be short-lived or develop
a long-term benefit that is larger than the initial trial
bump.
Formally, an individual $i \in \{1, \ldots, N\}$ goes to a store
$s \in \{1, \ldots, S\}$ at “time” $t \in \{1, \ldots, T_i\}$, where
“time” is an actual store visit. The individual can consider any of
the $j \in \{1, \ldots, J\}$ categories in the dairy department. If
they consider the category, then $C_{ijts} = 1$ (otherwise 0), and
the consumer then makes an optimal (myopic) choice of whether to
purchase from the category, $y_{ijts} = 1$, or not (0). On purchase,
she receives an experience signal upon consumption, which she uses
to update her beliefs about category quality. These beliefs form
the basis for her future choices. We now develop each aspect of the
model in the following sections.
5.1 Consideration
We posit that the first-stage decision of whether to consider
category j depends on whether j attracts individual i’s
attention22. At the category level, following Nierop et al.
(2010)[28], we write a multivariate probit (MVP) model of
consideration:
\[
\begin{aligned}
C^*_{ijts} &= X_{ijts}\alpha + \varepsilon_{ijts}, \quad j = 1, \ldots, J \\
C_{ijts} &= I[C^*_{ijts} > 0] \\
\vec{\varepsilon}_{its} &= (\varepsilon_{i1ts}, \ldots, \varepsilon_{iJts})' \sim N(0, \Sigma).
\end{aligned}
\tag{4}
\]
The variables Xijts measure the effect of category j on individual
i’s attention. These variables include a range of
individual-specific and category-specific controls for how
recently, and/or frequently, i purchases j. These variables follow
roughly those used in the DDD analysis including category-specific
controls for promotion, size, assortment, and seasonality, and
controls for store traffic and weekend effects. In addition, Xijts
includes our focal shifters of consideration, which we discuss
shortly. The unobserved stochastic shocks ~εits are distributed
multivariate normal with covariance matrix Σ, measuring the
unobserved correlations across categories. If consideration utility
C∗ijts is positive, we say that i considers j23.
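To make the mechanics of (4) concrete, the following Python sketch simulates consideration indicators for a single shopping trip. It is only an illustration under stated assumptions: the three categories, the covariates (an intercept and a promotion dummy), the coefficient values, and the correlation matrix are all hypothetical and are not estimates from this paper, and the helper names `cholesky` and `draw_consideration` are introduced here for exposition.

```python
import math
import random


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a small positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for r in range(n):
        for c in range(r + 1):
            partial = sum(lower[r][k] * lower[c][k] for k in range(c))
            if r == c:
                lower[r][c] = math.sqrt(matrix[r][r] - partial)
            else:
                lower[r][c] = (matrix[r][c] - partial) / lower[c][c]
    return lower


def draw_consideration(x_rows, alpha, corr, rng):
    """One draw of (C_1, ..., C_J): C_j = 1 if x_j'alpha + eps_j > 0."""
    lower = cholesky(corr)
    z = [rng.gauss(0.0, 1.0) for _ in corr]
    # Correlated shocks eps = L z, so that Var(eps) equals corr.
    eps = [sum(lower[j][k] * z[k] for k in range(j + 1)) for j in range(len(corr))]
    latent = [
        sum(x * a for x, a in zip(row, alpha)) + e for row, e in zip(x_rows, eps)
    ]
    return [1 if value > 0 else 0 for value in latent]


# Hypothetical inputs: 3 categories, covariates = (intercept, promotion dummy).
x_rows = [[1.0, 1.0], [1.0, 0.0], [1.0, 0.0]]
alpha = [-0.5, 0.2]
corr = [[1.0, 0.3, 0.0], [0.3, 1.0, 0.2], [0.0, 0.2, 1.0]]

rng = random.Random(0)
draws = [draw_consideration(x_rows, alpha, corr, rng) for _ in range(20000)]
share_considered = [sum(d[j] for d in draws) / len(draws) for j in range(3)]
joint_01 = sum(d[0] * d[1] for d in draws) / len(draws)
print(share_considered, joint_01)
```

Because the diagonal of the correlation matrix is one, each simulated share should be close to the standard normal CDF evaluated at the category's index, and a positive off-diagonal element makes the corresponding pair of categories more likely to be considered together than independence would imply.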
The setup in (4) above allows the consumer to consider multiple
categories simultaneously. The Xijts include layout-specific
dummies for type of display case, orientation of category j,
whether j is adjacent to categories that i purchases frequently,
and whether j’s adjacent categories are on promotion, individual
state dependence related variables for time since last store visit
and last category purchase, and controls for seasonality (linear
category-month trends and monthly sales deviation from the annual
average for the category) and store traffic (no. of daily store
transactions and a weekend dummy). This creates relationships
between categories based on spatial proximity and individual
purchasing patterns.
22Honka, Hortacsu, and Vitorino (2014)[15] use survey data to
distinguish awareness/attention from consideration. In our
setting, we cannot distinguish between these two empirically and
use the terms interchangeably.
23Note that $C^*_{ijts}$ can be multiplied by any positive
constant, and the outcome $C_{ijts}$ remains
unchanged. Hence, the coefficients at the category consideration
level are identified up to scale, and we need to normalize the
diagonal elements of Σ to 1, making it a correlation instead of a
covariance matrix.
Economically, two categories are complements when their cross-price
effects on demand are negative, i.e., a price increase in one
category lowers demand in the other. Classical examples of
complements include laundry detergent and fabric softener, cake mix
and cake frosting, and so forth. Complementarity between categories
depends directly on the categories’ characteristics, and is
independent of the spatial distance between them in stores.
However, spatial correlations can enhance/detract from the baseline
complementarity between categories through the channel of consumer
attention. For instance, consumers might be more/less likely to
buy from categories adjacent to their most frequently purchased
categories. Or, having a promotion in a category may pull
consumers’ attention away from adjacent categories. Our model
allows for these relationships to exist across the J categories
through the X matrix. All other unobserved cross-category
correlations, including complementarity, are captured in the
off-diagonal elements of the correlation matrix Σ.24
5.2 Consumption Utility & Learning
Moving from the consideration stage, individual i then chooses
whether to buy within the considered categories, given her current
belief of category quality Qijt. In a learning framework similar to
Shin, Misra & Horsky (2012)[24], consumers update their quality
beliefs for each category j over successive category purchases.
After every category purchase, consumers receive a quality signal
arising from their consumption experience. These quality signals
are centered around the true match quality between individual i and
category j, $Q_{ij}$,
\[
Q^{E}_{ijt} \sim N(Q_{ij}, \sigma^2_{E,j}), \tag{5}
\]
where $\sigma^2_{E,j}$ is the variance of the experience signals.
Initially, consumer i starts from an initial quality belief
$Q_{ij,0}$ about the true mean quality $Q_{ij}$,
\[
Q_{ij,0} \sim N(\mu_{ij,0}, \sigma^2_{ij,0}), \tag{6}
\]
where $\mu_{ij,0}$ and $\sigma^2_{ij,0}$ are the mean and variance
of the initial beliefs about category j’s quality for individual i
at time 0. In the Bayesian learning setup, the prior beliefs at
time t are the same as the posterior beliefs at time t − 1, which
are distributed as
\[
Q_{ij,t-1} \sim N(\mu_{ij,t-1}, \sigma^2_{ij,t-1}). \tag{7}
\]
We assume that quality beliefs at time t − 1 are normally
distributed, so that posterior beliefs at time t also remain normal
after Bayesian updating. At time t, individual i goes to store s
and, conditional on consideration, her utility from category j is
given as
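\[
\begin{aligned}
U^*_{ijts} &= Q_{ij,t-1} + W_{ijts}\beta + \omega_{ijts}, \quad j \in \{k : C_{ikts} = 1\} \\
\vec{\omega}_{its} &= (\omega_{i1ts}, \ldots, \omega_{iJts})' \sim N(0, \ ).
\end{aligned}
\tag{8}
\]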
24Technically, complementarity operates at the level of consumption
utility, not consideration. In practice, the correlation among
unobservables in the consideration and choice stages cannot be
identified separately without other restrictive assumptions. As a
result, we allow it only at the consideration stage, which also
proxies for the choice stage.
$W_{ijts}$ includes variables that affect the consumption utility
of the category, such as price and the number of unique SKUs and
brands in the category. Since i can buy multiple categories in the
same shopping trip, the utility structure across categories
$U^*_{ijts}$ again follows a multivariate probit structure. The
stochastic error terms for all J categories ($\vec{\omega}_{its}$)
are distributed multivariate normal with the covariance matrix in
(8), which requires identification restrictions similar to those
placed on Σ above. However, it may be difficult (or impossible) to
identify cross-category correlations in both the consideration
stage and the purchase incidence stage (Nierop et al., 2010[28]).
Hence, we restrict this covariance matrix to be diagonal, which
reduces (8) above to a series of independent binary probits.
Given the information set at the end of period t− 1 from (7),
Qij,t−1 is a stochastic variable. We assume that individuals are
myopic and risk-neutral and hence, base their decision to buy/not
buy on the current expected utility with respect to quality
beliefs:
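\[
\begin{aligned}
E_{ij,t-1}(U^*_{ijts}) &= E_{i,t-1}(Q_{ij,t-1}) + W_{ijts}\beta + \omega_{ijts}, \quad j \in \{k : C_{ikts} = 1\} \\
&= \mu_{ij,t-1} + W_{ijts}\beta + \omega_{ijts} \\
y_{ijts} &= I[E_{ij,t-1}(U^*_{ijts}) > 0].
\end{aligned}
\tag{9}
\]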
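The consumer buys in the category if expected consumption utility
$E_{ij,t-1}(U^*_{ijts})$ is positive. Following DeGroot (1971)[6],
we can write out the posterior mean and variance at time t in terms
of the initial mean and variance of the quality beliefs:
\[
\mu_{ij,t} = \sigma^2_{ij,t}\left[\frac{\mu_{ij,0}}{\sigma^2_{ij,0}} + \frac{\sum_{\tau=1}^{t} y_{ij,\tau}\, Q^{E}_{ij\tau}}{\sigma^2_{E,j}}\right],
\qquad
\sigma^2_{ij,t} = \left[\frac{1}{\sigma^2_{ij,0}} + \frac{\sum_{\tau=1}^{t} y_{ij,\tau}}{\sigma^2_{E,j}}\right]^{-1}. \tag{10}
\]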
Shin, Misra & Horsky (2012)[24] rewrite the above in terms of the
perception bias $\nu_{ij,t} = \mu_{ij,t} - Q_{ij}$, the difference
between the mean quality belief at time t and the match quality,
and the signal noise $\eta^{E}_{ijt} = Q^{E}_{ijt} - Q_{ij}$. Thus,
\[
\mu_{ij,t} = Q_{ij} + \frac{(\sigma^2_{E,j}/\sigma^2_{ij,0})\,\nu_{ij,0} + \sum_{\tau=1}^{t} y_{ij,\tau}\,\eta^{E}_{ij\tau}}{(\sigma^2_{E,j}/\sigma^2_{ij,0}) + \sum_{\tau=1}^{t} y_{ij,\tau}}. \tag{11}
\]
As the number of purchases of category j,
$\sum_{\tau=1}^{t} y_{ij,\tau}$, increases, $\mu_{ij,t}$ tends to the
match quality $Q_{ij}$ as the resultant bias term tends to zero.
After the Bayesian updating from Equation 10, $\mu_{ij,t}$ acts as
the mean belief for i at the (t + 1)th shopping trip.
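The updating logic can be checked numerically. The Python sketch below applies the standard normal-normal update to a sequence of experience signals (one per purchase occasion) and verifies that it coincides with the form written in terms of the initial perception bias and the signal noise. The function names and all numerical values (true match quality, initial mean, initial variance) are hypothetical choices for illustration; the signal variance is set to 1, as in the estimation.

```python
import random


def posterior_update(mu0, var0, signals, signal_var=1.0):
    """Normal-normal update of the quality belief after the observed signals."""
    precision = 1.0 / var0 + len(signals) / signal_var
    post_var = 1.0 / precision
    post_mean = post_var * (mu0 / var0 + sum(signals) / signal_var)
    return post_mean, post_var


def mean_via_bias(true_q, mu0, var0, signals, signal_var=1.0):
    """Same posterior mean, written with perception bias nu0 and signal noise."""
    ratio = signal_var / var0
    nu0 = mu0 - true_q
    noise = [s - true_q for s in signals]
    return true_q + (ratio * nu0 + sum(noise)) / (ratio + len(signals))


rng = random.Random(1)
true_q, mu0, var0 = 1.0, -0.5, 2.0  # hypothetical values
# One signal per purchase occasion, centered on the true match quality.
signals = [rng.gauss(true_q, 1.0) for _ in range(200)]

# Mean belief after 0, 1, 10, and 200 purchases.
path = [posterior_update(mu0, var0, signals[:n])[0] for n in (0, 1, 10, 200)]
final_var = posterior_update(mu0, var0, signals)[1]
print(path, final_var)
```

With no purchases the belief stays at its initial mean; as purchases accumulate, the mean belief moves toward the true match quality and the posterior variance shrinks, which is the sense in which the initial bias is learned away.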
For defining the initial conditions, we let the means and variances
of the initial perception bias $\nu_{ij,0}$ and uncertainty
$\sigma^2_{ij,0}$ be functions of past purchases $NP_{ij}$ and
whether they have never bought the category $NB_{ij}$. These
variables are defined at the time of individual i’s first visit to
the store. The mean and variance of the initial perception bias are
\[
\nu_{ij,0} = \delta^{Q}_{0,j} + \delta^{Q}_{1}\, NB_{ij}
\]
Note that µij,0 = Qij + νij,0, and hence, defining (12) for the
initial belief mean µij,0 or the initial perception bias νij,0 is
equivalent.
Finally, to close the model, we assume the distribution of true
match qualities Qij to be normal for each category. Hence,
\[
Q_{ij} \sim N(Q_j, \sigma^2_{Q_j}), \quad j = 1, \ldots, J. \tag{13}
\]
5.3 Identification
In this section, we informally discuss the identification of the
above model. Broadly, the discussion covers three questions. First,
how do we identify the effects of the location- and
assortment-related variables? Second, how do we separate
consideration from learning? Finally, given our learning model, how
do we identify the learning-related parameters?
Our main research questions relate to measuring the effect of
location and assortment variables on consumer choices. As discussed
previously, our design leverages the reset timing, which was
assigned randomly due to IT software changes. We do not repeat
these arguments here, but note that the same essential arguments
provide exogenous variation in assortment and location across
individuals in this structural analysis. The main difference
relates to the non-linear model components. We discuss the critical
aspects of these model components next.
In our data, consideration and choice are not separately observed
by the researcher (i.e., consideration is unobserved). To properly
identify these functions, we require a valid excluded variable that
enters consideration and not utility. Location is an ideal
exclusion restriction. In-store location trivially satisfies this
requirement: it does not affect the utility of consumption, and it
does not affect the trade-off between outside good consumption and
inside good consumption, where the outside good is disposable
income to be spent on other products and services. Hence, location
influences choices only through consideration, which, in our model,
is determined by attention.25 With location as a natural exclusion,
we can identify the two processes. While this is our primary
exclusion restriction, we also assume that consumers do not observe
category prices without considering the category and that category
quality expectations do not influence consideration. We argue that
these assumptions are consistent with a passive attention process.
We do include variables related to state dependence (reflecting
habits and shopping patterns) and promotions (reflecting
attention-grabbing in-store display) in the consideration process
(and not in the utility/choice process). However, we make these
modeling choices to ensure consistency between the theoretical
basis of our passive attention model and the variables that should
be included in such a process, rather than to obtain identification
of our model.
Regarding the learning model, we need to identify the
population-level true qualities, the variance across individuals of
the true qualities, and the functions for the initial mean bias and
initial uncertainty, respectively,
$\{Q_j, \sigma^2_{Q_j}, \nu_{ij,0}, \sigma^2_{ij,0}\}$. First, to
obtain this (limited) set of learning parameters, following Shin,
Misra, & Horsky (2012)[24], we have
25As noted previously, although we prefer to model consideration as
a function of attention, we cannot rule out that some or all of the
location effects could also be explained by a rational model of
search. However, a comparison of these two approaches is beyond the
scope of the current paper.
already fixed the variance of the experience signals,
$\sigma^2_{E,j}$, to 1. For the population mean of
the true qualities, estimates are based on the “long-run” average
propensity to purchase in the category. The variance of the true
quality distribution across individuals is estimated from the
variation across individuals in this long-run average propensity to
purchase in the category. For the initial conditions parameters, we
use the long series of past purchases in our panel that are prior
to our focal estimation period. For the initial mean bias, we use
variation across categories in the average initial purchase
propensity, as well as the average difference in the initial
propensity to purchase for those that had never bought (vs. bought)
the category in the pre-estimation period. For the initial
uncertainty, technically two data features are used in estimation.
First, the pace of change to shift from the initial propensity to
the long-run propensity is related to the number of purchases and
whether the individual never bought in the category. Second, the
uncertainty induces extra noise in the choice process (see Equation
12), so that as more purchases are made, choices become more
predictable. Qualitatively both of these features help with
estimating uncertainty. We note that, because of the need for
information about “long-run” vs. initial propensities to purchase,
we restricted our data for the structural analysis to the subset of
individuals with 5 or more visits during the data period. This
removes around 30% of the individuals, but individuals who visit
less often are less important to overall sales.
6 Estimation
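The full set of parameters for the Bayesian consideration and
learning model is given as
$\Theta = \{\alpha, \Sigma, \beta, \{Q_j, \sigma^2_{Q_j}\}_{j=1}^{J}, \delta^{Q}, \delta^{\sigma}\}$.
Hence, the category consideration and choice stages are now given
as
\[
\begin{aligned}
C^*_{its} &= X_{its}\alpha + \varepsilon_{its}, \quad \varepsilon_{its} \sim N(0, \Sigma) \\
E_{ij,t-1}[U^*_{ijts}] &= Q_{ij} + \frac{(\sigma^2_{E,j}/\sigma^2_{ij,0})\,\nu_{ij,0} + \sum_{\tau=1}^{t-1} y_{ij,\tau}\,\eta^{E}_{ij,\tau}}{(\sigma^2_{E,j}/\sigma^2_{ij,0}) + \sum_{\tau=1}^{t-1} y_{ij,\tau}} + W_{ijts}\beta + \omega_{ijts}.
\end{aligned}
\]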
Given the above model, the data likelihood is written as
L(y|X,W,Θ) = N∏ i=1
∫ Qi1
Ti τ=1)
) dηEijtdQi1 . . . dQiJ .
In the maximum likelihood paradigm, calculating the above
likelihood function would require summing over each of the
$2^J - 1$ possible consideration sets for each individual, making
it unwieldy. Moreover, it would require simulating and integrating
over the experience signal noises, $\eta^{E}_{ijt}$, a
high-dimensional integral. However, in the Bayesian paradigm, if we
draw the latent consideration ($C^*_{ijts}$) and purchase
($U^*_{ijts}$) utilities (Tanner & Wong, 1987[26]), we need to make
only J comparisons. Further, the $\eta^{E}_{ijt}$ can be drawn from
a prior distribution of N(0, 1). These simplifications reduce the
demands at each
computational iteration and also allow the data to inform the
distributions and the consideration sets that we integrate over,
which can dramatically improve computational performance. For
these reasons, we chose to estimate the model using a Markov Chain
Monte Carlo (MCMC) method. For the purposes of the current study,
we also fix Σ to be a diagonal matrix, which significantly reduces
computation costs.26
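The data augmentation idea can be illustrated for a single probit equation. The Python sketch below draws a latent utility from a unit-variance normal distribution truncated to the region implied by an observed binary outcome, using the inverse-CDF method. This is a simplified stand-in, not the sampler of Appendix D: it assumes the outcome of the equation is observed, whereas in the full model consideration is unobserved when no purchase occurs. The function name `draw_latent` and the index value of -0.4 are hypothetical.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def draw_latent(mean, outcome, rng):
    """Draw z ~ N(mean, 1) truncated to z > 0 if outcome == 1, else z <= 0.

    Uses the inverse-CDF method; adequate for moderate values of `mean`,
    while extreme tails would call for a dedicated truncated-normal sampler.
    """
    cut = STD_NORMAL.cdf(-mean)  # P(z <= 0)
    if outcome == 1:
        u = rng.uniform(cut, 1.0)
    else:
        u = rng.uniform(0.0, cut)
    # Clamp away from 0 and 1 so that inv_cdf stays finite.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    return mean + STD_NORMAL.inv_cdf(u)


rng = random.Random(2)
index = -0.4  # hypothetical value of the linear index
draws_buy = [draw_latent(index, 1, rng) for _ in range(5000)]
draws_no = [draw_latent(index, 0, rng) for _ in range(5000)]
print(sum(draws_buy) / len(draws_buy), sum(draws_no) / len(draws_no))
```

Conditional on the latent utilities, the regression coefficients have standard conjugate updates, which is what makes this augmentation attractive relative to enumerating consideration sets.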
To complete the Bayesian model, we need to specify our prior. We
assume standard conjugate distributions for the regression
coefficients α and β, and for the initial condition parameter
vectors, δQ and δσ, and for each of the J true quality population
means in the vector, Qj. We use diffuse priors for these normally
distributed parameters. For the precisions of the true quality
population means, we draw each precision $\sigma^2_{Q_j}$ using
a gamma distribution parameterized as a scaled chi-squared
distribution with a scale of 1 and J + 3 degrees of freedom. Thus,
the joint posterior likelihood of the parameters conditional on the
data is
L(Θ|X,W, y) ∝ L(y|X,W,Θ)π(Θ)
∝
Ti−1 t=1 α, β))
(
)
Ti−1 t=1 β)L(Cijt|X,α))
(
)
] π(Θ). (14)
We estimate the model with a Metropolis-in-Gibbs MCMC sampler. In
Appendix D, we provide the details of the full conditional
posterior distributions and the sampling algorithm. For the
estimation of the parameters, we discard the first 50,000
iterations of the sampling chain as burn-in, and keep the next
100,000 iterations for analysis from which we retain every 10th
draw. We inspect the iteration plots to determine that the sampler
converged to the stationary posterior distribution.
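The burn-in and thinning scheme described above implies 10,000 retained draws. The small Python sketch below makes the bookkeeping explicit; the constant and function names are hypothetical, and the convention that the 10th, 20th, and so on post-burn-in iterations are the ones kept is an assumption.

```python
BURN_IN = 50_000   # iterations discarded as burn-in
KEPT = 100_000     # iterations run after burn-in
THIN = 10          # retain every 10th post-burn-in draw


def retained_indices(total_iterations, burn_in=BURN_IN, thin=THIN):
    """Zero-based iteration indices kept after burn-in and thinning."""
    return range(burn_in + thin - 1, total_iterations, thin)


kept = retained_indices(BURN_IN + KEPT)
print(len(kept))  # 10000
```

Posterior means and standard deviations such as those reported in the results tables would then be computed over these retained draws only.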
We note that we subsample our data to accomplish two goals. First,
our learning model requires a reasonable number of observations per
individual due to the individual level heterogeneity. As a result,
we subsample to include only individuals with at least 5 shopping
trips. This criterion reduced the sample of individuals to 42,108.
Second, due to the computation time, estimating the model on the
full sample using the Bayesian estimation technique is prohibitive.
Instead we use approximately a 5% sample of 2,100 individuals. This
sample was drawn randomly from the set of those with at least 5
shopping trips.
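The two-step subsampling can be sketched as follows in Python. The panel here is synthetic, and the function and variable names are hypothetical; only the rule (at least 5 shopping trips, then a random draw of roughly 5%) comes from the text. With 42,108 eligible individuals, an exact 5% draw would be about 2,105 households, so the 2,100 used in estimation is an approximate 5% sample.

```python
import random


def subsample_households(trip_counts, min_trips=5, fraction=0.05, seed=0):
    """Keep households with >= min_trips visits, then draw a random fraction."""
    eligible = sorted(h for h, n in trip_counts.items() if n >= min_trips)
    size = round(fraction * len(eligible))
    # Sampling without replacement, seeded for reproducibility.
    return eligible, random.Random(seed).sample(eligible, size)


# Synthetic panel: household id -> number of store visits (illustration only).
rng = random.Random(3)
trip_counts = {hh: rng.randint(1, 20) for hh in range(1000)}

eligible, sample = subsample_households(trip_counts)
print(len(eligible), len(sample))
```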
26This choice of fixing Σ to diagonal was determined after
estimating a model without learning with the full correlation
matrix. In that model we found that all of the marginal
distributions for the correlations covered zero.
Parameter                                     Posterior   Posterior
                                              Mean        Std. Dev.
Adjacency:
  Adjacent to high-frequency category         -0.185      0.09
  Never Bought*Adj. to high-freq. cat.        0.414       0.05
  Adjacent category on promotion              -0.010      0.03
Category Size & Promotion:
  Log no. of facings                          0.085       0.08
  Promotion dummy                             0.160       0.04
  Promotion depth                             2.543       0.72
State Dependence:
  Purchased last month                        0.115       0.05
  Log weeks since last visit                  0.386       0.02
  Log weeks since last purchase               -0.693      0.03
Location:
  Coffin Case: Outside                        0.369       0.13
  Coffin Case: Inside                         0.355       0.13
  Cheese Bar: Outside                         0.192       0.11
  Cheese Bar: Inside                          0.183       0.09
  Wall-fed case                               0.172       0.08
  Side case                                   0.000       -
Other controls:
  Weekend                                     0.215       0.03
  Log no. of daily transactions               -0.185      0.10
  Monthly sales dev. from annual avg./1000    0.068       0.01
  School Start dummy                          -0.016      0.05
  Category fixed effects                      X
  Category-month trends                       X
Table 8: Consideration stage estimates
We now present results from a preliminary formulation of our model
that does not yet incorporate the effects of the full set of reset
treatments (discussed in Section 4.2) on the attention variables.
We therefore view the current results as a proof of concept, rather
than final results. The estimates are reported in Tables 8 and 9,
and correspond to the parameters related to consideration and
utility, respectively. We begin our discussion with consideration
and then turn to the parameters related to utility and
learning.
7.1 Consideration Model Parameters
We begin with our focal variables related to adjacency. We find
that adjacency plays an important role in consideration. Moreover,
our results are consistent with the analyses of section 4.2.3.
First, being adjacent to a category tha