SPATIAL DIFFERENTIATION IN THE SUPERMARKET INDUSTRY
A. YESIM ORHUN
May 21, 2005
Haas School of Business, University of California at Berkeley
Abstract. This paper investigates the positioning choice of strategic firms and infers the tradeoff between demand and competition factors in their optimal decisions. It shows that controlling for spatially correlated unobservable common factors as well as pre-existing systematic differences across retailers is essential for explaining the spatial positioning decisions in the supermarket industry.
1. Introduction
Observed decisions of firms and consumers reveal valuable information about
the underlying factors in their decision making. To this end, purchase decisions of
consumers and pricing decisions of firms are extensively studied in the marketing
literature. Positioning decisions, such as where to open a store, how to design
a new product, or which combination of alternatives to offer, are all made before
the pricing game takes place and demand is realized, and thus reflect the firms’
knowledge and expectations of these factors. When positioning a product, each
firm in the market has to consider the proximity to the consumer base of inter-
est and the perceived distance from competing alternatives. These factors will
determine the extent of price competition and demand conditions. Although the
positioning decision is a key marketing strategy which incorporates information
on the sensitivity of demand and competition to the change in product attributes,
this source of information has not been empirically explored.

Preliminary, comments appreciated. Address for correspondence: Haas School of Business, University of California at Berkeley, 545 Student Services Building #1900, Berkeley, CA 94720-1900. Email: [email protected].

In this paper I present a model that recovers the relative importance of com-
petition and demand factors and their sensitivity to geographic distance in the
supermarket industry. Spatial positioning is a major differentiating factor in this
industry, as in many retail industries. Consumers’ shopping trips usually originate
from their residences, and distance to the store is one of the major determinants
of their choice of where to shop. In this industry, the favorable demand conditions
translate to being close to the customer base of interest, due to travel costs of
consumers. Therefore “distance” to different segments of consumers, as well as
other competitors, is well defined. Thus studying location choice of supermarkets
as a form of positioning is advantageous in terms of the alignment of what the
econometrician can observe and what the firms observe.
In the model each firm simultaneously makes a spatial positioning decision with
expectations of the decisions of the competitors. The decision that maximizes
expected payoffs trades off being close to consumers of interest and avoiding pos-
sible competition. Firms are assumed to have some private information about
their payoffs across locations. The model also investigates details specific to the
supermarket industry which prove to be important in understanding the spatial
differentiation outcomes. In this industry, pre-existing systematic differences
at the brand level are another source of differentiation. These preexisting differ-
ences are found to be significant in their effect on firms’ spatial positioning choices,
providing evidence of consumer segmentation as well as asymmetric competition
intensities. Since these preexisting differentiating factors are independently de-
cided at the national level, the econometrician can condition on this information
in order to isolate the spatial positioning decisions. The empirical results illustrate
the importance of allowing for differences across firms. The competition intensity
is greater between the firms of the same type than between those firms of different
types. Competition among different types of firms is softened by the fact that they
target different demographic segments.
Moreover, the assumption that the information sets of the econometrician and
of the firms are aligned is relaxed by introducing location specific unobservables to
the model. Some examples of unobservable factors are major road intersections,
zoning laws and traffic flow. These factors are observed by the firms at the
time of the location decision, but are not observed by the econometrician.
In the case of a positive location shock, such as easy parking amenities and major
road intersections, a model that does not allow for such unobservables will wrongly
attribute the high number of supermarkets to low competition intensity between
these supermarkets. The results show that introducing some common information
to the game, in the form of location specific shocks, is important for the supermar-
ket industry. The competition effect is stronger when unobservables are included in
the model. The location specific unobservables are found to be spatially positively
correlated.
Next I summarize related literature, and present the data. I proceed with the
discrete location choice model and the payoff formulations, followed by estimation
and discussion of results.
2. Literature Review
In order to study the implied demand and competition conditions from the lo-
cation choice of retailers, one needs to define an equilibrium that leads to the
observed outcomes. In this section, I outline the discrete location choice models in
the literature. The discrete product type choice literature stems from a literature
that estimates entry equilibrium models, which are models of binary decisions in
nature. Bresnahan and Reiss (1990) examine the decisions of potential entrants
in monopoly automobile dealer markets, and find that differentiation increases
profit margins of the duopoly more than competition decreases them. Berry (1992)
estimates a game where firms make simultaneous entry decisions with complete
information about their competitors. The estimation of this model hinges on
the fact that the number of entrants is unique, even if the identities of the entrants
may not be. This study brings problems of estimating complete information discrete
choice games to our attention, such as dimensionality problems due to complicated
regions of integration, as well as potential multiplicity of equilibria. Bresnahan and
Reiss (1991) detail the empirical estimation of discrete games and further discuss
the multiplicity of equilibria and dimensionality problems in different models of
simultaneous entry choice with complete information.
The transition from the binary strategic choice to multiple strategic choice proves
to be challenging due to the complete information setup. The first step in this di-
rection comes with Mazzeo (2002), who extends entry models to allow for firm
differentiation by incorporating entry and quality choice in the motel industry.
This study also compares estimates from simultaneous and sequential complete
information models and finds the results to be very similar. However, due to the
complete information assumption, introducing more than three levels of product
choice proves infeasible. This dimensionality problem is addressed by Seim (2004),
who estimates the location choice of video rental stores. In her model, symmet-
ric firms make a nested decision about entry and location simultaneously, where
each firm has private information about its profitability in a particular location.
The asymmetric information assumption turns the discrete actions of competitors into
smooth location choice probabilities, and thus both simplifies computation and reduces
the regions of possible multiplicity of equilibria.1 This framework allows her to
estimate a game where firms have many discrete choices.
The use of the asymmetric information assumption makes advances in product
type choice research feasible. Watson (2002) models eyeglass retailers’ variety
choice after an entry decision, and Einav (2003) estimates the discrete timing choice
of movies. Einav’s sequential-move game setting guarantees a unique equilibrium in
pure strategies. The solution method for this model can also allow for heterogeneity
across firms, and thus for asymmetric probabilities of timing choice.
In studying a social planner’s decision of locations of gasoline retailers, Chan,
Padmanabhan, and Seetharaman (2003) estimate a very flexible model which incorporates
the pricing game as well as the location choice. Their model does not involve
a multiple agent discrete choice game, since the location choices are made by a
social planner instead of competing firms. Gowrisankaran and Krainer (2004) are
able to estimate a flexible model of strategic location choice, similar in setup to
Seim (2004), by substituting equilibrium probabilities from data, instead of solving
for them.

1 It should be pointed out that asymmetric information does not theoretically rule out multiple equilibria. This particular study finds empirical evidence that equilibrium choice probabilities are unique.

This paper concentrates on estimating the strategic tradeoff between demand
and competition factors in the location choice of firms. Product type choice is
conditional on entry, as in Einav (2003). The model is one in which firms choose
locations simultaneously and have private information. Furthermore, the model
allows for location specific common knowledge that is unobservable to the econo-
metrician, as well as pre-existing asymmetries between firms.
3. The Data
Due to the advances in Geographic Information Systems and the initiatives of
the U.S. Census Bureau to make cartographic resources available, data on firms
and consumers can be matched to geography. This allows researchers to construct
a realistic picture of the distribution of consumers with different characteristics.
Moreover, the distance between any pair of consumer and firm, as well as between
any pair of competitors can be calculated. The data on the grocery stores in 25
metropolitan areas in the U.S. are provided by TDLinx. The data include the
street addresses, size in square feet, brand name, number of employees, number of
checkouts, and average weekly revenues of all grocery stores in these markets.2 The
data on consumers are obtained from the 2000 U.S. Census, and include population,
average per capita income, number of households, household income distribution,
family type and average upper quartile rent at the census block group level. Once
the addresses of supermarkets and population weighted centroids of census block
groups are geocoded onto latitude and longitude axes using cartography software,
the distance3 between any pair of locations can be calculated taking the Earth’s
ellipsoid surface into account. Using the FFIEC Geocoding System, supermarkets are
assigned to the census block group they fall into.

2 I concentrate on the location choice of supermarkets, which are defined as grocery stores with more than 100,000 dollars of weekly revenue, and are also categorized in the data as supermarkets.

3 The distance between two locations $(lon_a, lat_a)$ and $(lon_b, lat_b)$ can be found as
$$d_{a,b} = 3956 \cdot 2 \cdot \arcsin\left(\min\left(1, \sqrt{\sin^2(dlat/2) + \cos(lat_a) \cdot \cos(lat_b) \cdot \sin^2(dlon/2)}\right)\right)$$
where $dlon = lon_a \cdot \pi/180 - lon_b \cdot \pi/180$ and $dlat = lat_a \cdot \pi/180 - lat_b \cdot \pi/180$.

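
The distance formula in footnote 3 can be sketched in a few lines of Python. This is only an illustration, not the code used in the paper: the function name `great_circle_miles` and the argument order are assumptions, and the latitudes inside the cosine terms are converted to radians, which the footnote leaves implicit.

```python
import math

EARTH_RADIUS_MILES = 3956.0  # radius constant that appears in footnote 3


def great_circle_miles(lon_a, lat_a, lon_b, lat_b):
    """Haversine distance in miles between two points given in decimal degrees.

    Hypothetical helper: name and signature are illustrative only.
    """
    lat_a_rad = math.radians(lat_a)
    lat_b_rad = math.radians(lat_b)
    dlat = lat_a_rad - lat_b_rad
    dlon = math.radians(lon_a) - math.radians(lon_b)
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat_a_rad) * math.cos(lat_b_rad) * math.sin(dlon / 2) ** 2)
    # min(1, ...) guards against rounding pushing the argument above 1
    return EARTH_RADIUS_MILES * 2 * math.asin(min(1.0, math.sqrt(h)))
```

Applying the function to every pair of block group centroids would give the distance matrix that the later band and weighting constructions rely on.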
In Figure 1, the population weighted centroids and boundaries of locations of a
sample market are mapped. Note that the population weighted centroids might
be far from the geographic centroids, especially for peripheral locations.
Figure 1: Centroids in Hayward
Distribution of consumer demographics over the map is obtained by matching the
U.S. Census data to the centroid coordinates of census block groups. These data
are used to describe a block group’s own characteristics, as well as the weighted
characteristics of neighboring block groups. Figure 2 displays an example of de-
mographic data assignment to the block groups.
Figure 2: Demographic Assignment at the Census Block Group Level
Furthermore, data on the total area of each block group and retailer density are
collected in order to proxy for unobserved zoning rules. The table below summarizes
the variables in the data set.

Summary Statistics at the Block Group Level

Variable                            Mean        Std. Dev.    Min      Max
Population                          1,930.23    14,248.45    401      824,562
Income per Capita                   24,860.8    13,107.12    4,748    124,114
Area of Block Group (mi^2)          2.52        10.96        0.015    295.31
Density                             6,623.34    11,456.13    4.28     432,638.8
Upper quartile rent                 973.79      425.89       138      2001
Number of Retail Establishments     13.51       24.52        0        290
Number of Supermarkets observed     0.13        0.39         0        4

I identify 53 isolated markets, with considerable variation in size, as seen in the
table below. An isolated market is a city or bundle of cities that is at least 8 miles
from any other area of settlement with a supermarket.

Market Level Variation in Size and Competitors

Variable                            Mean     Std. Dev.    Min    Max
Number of Block Groups              66.09    64.74        8      338
Total Number of Supermarkets        8.03     5.20         2      22

4. Discrete Location Choice Model
The strategic location choice is assumed to be simultaneous among competitors.4
The unit of location choice is a census block group, and the locations are indexed
by $l \in \{1, 2, \ldots, L_m\}$ in each market $m$. It is assumed that firms have private
information about their own location specific profitability shocks, following Seim
(2004).
4 The asymmetric information assumption does not guarantee uniqueness of equilibrium in simultaneous discrete choice games. In my estimation, I do not run into uniqueness problems for the region of parameters I am searching over. The data have considerable variation, and the equilibrium conjectures are always the same for different starting values. Seim (2004) also finds evidence for empirical uniqueness in her estimation.

Let the set of players in market $m$ be $j = 1, 2, \ldots, J_m$ and the discrete action space
for player $j$ be $A^m_j$. Market superscripts are suppressed since markets represent
independent games. Let $\eta^j$ be the vector of private information of firm $j$ over the
discrete action space. Given the actions of all players, $a = \{a_1, \ldots, a_j, \ldots, a_{J_m}\}$, the
payoff for player $j$ is given as
$$\pi_j(a) = \beta X_{a_j} + \alpha C_{a_j}(a_{-j}) + \eta^j_{a_j} \tag{1}$$
where $X_{a_j}$ is a vector of location specific demand and cost shifters, which includes
own and some weighting of other locations’ demographic factors, incorporating the
effects of distance on demand due to consumer travel costs. $C_{a_j}(a_{-j})$ is the measure
of competitive intensity, defined by the number of competitors in the particular
location as well as some weighting of competitors in other locations. Demand and
competition factors are assumed to be additively separable. Construction of $X_{a_j}$
and $C_{a_j}(a_{-j})$ will be detailed in the next section. The private information of firm
$j$, denoted $\eta^j_{a_j}$, is i.i.d. across locations and firms, and is assumed to follow a type I
extreme value distribution.
Each player, in equilibrium, will choose a location that optimizes its expected
payoff, given its self-confirming beliefs about other players’ actions based on the
distribution of private information, which is known to all players. Thus $a(\cdot)$ is a
pure strategy Bayesian Nash equilibrium if, for each player $j$,
$$a_j(\eta^j) \in \arg\max_{a_j} \left\{ E[\pi_j(a_j)] = \int_{\eta^{-j}} \pi_j\big(a_j(\eta^j), a_{-j}(\eta^{-j}), \eta^j_{a_j}\big) \Pr(\eta^{-j}) \, d\eta^{-j} \right\} \tag{2}$$
In other words, a firm will choose action $a_j$ only if its expected payoff as a
function of the specific draw of $\eta^j_{a_j}$ is thus maximized. Since the private information
of other firms only affects a firm’s payoff through others’ optimal strategies, the
expected profits can be expressed as
$$E[\pi_j(a_j, a_{-j}, \eta^j_{a_j})] = \sum_{a_{-j}} \Pr(a_{-j} \mid X, J, \beta, \alpha) \cdot \pi_j(a_j, a_{-j}, \eta^j_{a_j})$$
or, passing the expectation of others’ actions through the payoff function, as
$$E[\pi_j(a_j, a_{-j}, \eta^j_{a_j})] = \beta X_{a_j} + \alpha E[C_{a_j}(a_{-j})] + \eta^j_{a_j}.$$
Due to the distributional assumption on the private information, the probability
that the expected profits at a given location are weakly greater than expected profits
at other locations simplifies to
$$\Pr(a_j) = \frac{\exp\big(\beta X_{a_j} + \alpha E[C_{a_j}(a_{-j})]\big)}{\sum_{a'_j \in A_j} \exp\big(\beta X_{a'_j} + \alpha E[C_{a'_j}(a_{-j})]\big)} \tag{3}$$
The expectation of the number of competitors across locations is based on the
beliefs of others’ optimal strategies, which means $E[C_{a'_j}(a_{-j})]$ is a function of
$\Pr(a_{-j})$. Therefore the optimal response of a firm depends on its beliefs of the
other firms’ location choice probabilities. Players form these beliefs the same way
that the econometrician does, as in the logit formulation above. This is due to
the asymmetric information assumption that aligns the information set of the
econometrician with the other firms’. Moreover, since all firms are assumed to be
symmetric up to their private information, the probability of any firm choosing
a particular location is the same for all firms.5 Therefore the probability of each
location choice is described by the logit formulation in Equation 3. This constitutes
a mapping in which each equation is of the form $\Pr(a_j) = f(\Pr(a_{-j}))$, and the fixed
point solution to this system of $L_m$ nonlinear equations for each market gives
us the $L_m$ vector of reduced form equilibrium location probabilities
$\Pr(a_j) = g(X, J_m, \beta, \alpha)$ of any firm. Then the probability of observing an outcome
$a = \{a_1, \ldots, a_j, \ldots, a_{J_m}\}$ can be written as
$$\Pr(a) = \prod_{j=1}^{J_m} \Pr(a_j \mid X, J_m, \beta, \alpha) \tag{4}$$
where $\Pr(a_j \mid X, J_m, \beta, \alpha)$ is the equilibrium probability of a firm choosing action
$a_j$.
5. Payoff Specifications and Estimation
In the game formulated above, the firms maximize their expected payoffs as
a function of their beliefs of competitors’ strategies. In this section, I detail this
payoff function under different assumptions, and show how the payoff function and
the game structure are taken to the data. The payoff function needs to capture
the sensitivity of the demand and competition factors to distance. Moreover, the
payoff function should be flexible in order to handle different numbers of location
alternatives in different markets. To this end, one could use bands of distance
and assume that the effect of characteristics of locations within a band of distance
is homogeneous, as in Seim (2004). This approach identifies differing effects
of demographics and competitors for discrete bands of distance. Alternatively, one
could use continuous distance measures to weight the effect of characteristics of a
neighboring location on the location at hand. A flexible weighting function can be
estimated to describe the distance sensitivity at a finer level.

5 The symmetry assumption is relaxed below by allowing for discrete types of supermarkets. However, the equilibrium solution concept remains the same.

5.1. Discrete Bands Approach. Distance sensitivity of competition and demand
effects is captured by allowing these measures to have different coefficients
for different bands of distance. It is assumed that the rivals within the same band
of distance to the firm exert the same competitive pressure on the firm. However,
rivals in different bands of distance to the firm potentially have different effects.
Similarly, the effects of demographics are modeled as homogeneous within a band,
and allowed to differ across bands.
Bands are constructed by choosing cutoff distances. Thus bands $b = \{1, 2, \ldots, B\}$
can be described by the cutoff distance set $d_b = \{0, d_1, d_2, \ldots, d_{B-1}\}$, where band $b$
around a location includes all the locations between $d_{b-1}$ and $d_b$ miles from the location
at hand. Following the structure of Equation 1, such a payoff function for firm $j$
can be written as
$$\pi_j(a) = \sum_{b=1}^{B} \beta_b Z^b_{a_j} + \sum_{b=1}^{B} \alpha_b \sum_{l=1}^{L_m} M^b_{a_j l} h_l + \eta^j_{a_j} \tag{5}$$
where $Z^b_{a_j}$ denotes the demographic data of the locations within band $b$ of the
location choice of firm $j$. $L_m$ is the total number of locations in the market, $h_l$ is
the number of rivals in location $l$, and $M^b_{a_j l}$ equals 1 if location $l$ is within band
$b$ of the location of choice $a_j$; in other words, $M^b_{a_j l} = 1\{d_{b-1} \le d_{a_j l} < d_b\}$. Therefore,
$\sum_{l=1}^{L_m} M^b_{a_j l} h_l$ denotes the total number of competitors in the locations within band
$b$ of the location choice of firm $j$. This payoff is realized after the simultaneous
location choice of firms. However, firms maximize expected profits based on their
beliefs about where competitors will locate, since they do not know $h_l$ before it is
realized. The expected profits of firm $j$ if it chooses $a_j$ can be expressed as
$$E[\pi_j(a_j)] = \sum_{b=1}^{B} \beta_b Z^b_{a_j} + (J_m - 1) \sum_{b=1}^{B} \alpha_b \sum_{l=1}^{L_m} M^b_{a_j l} p_l + \eta^j_{a_j} \tag{6}$$
where $p_l$ is the probability that a firm will choose location $l$, and $(J_m - 1)$ is the
total number of competitors in market $m$. Thus, the realized number of competitors,
$h_l$, is replaced by the expected number of competitors, $(J_m - 1) p_l$. The symmetric
probability of any firm choosing location $l$ is the probability that expected profits
are greatest in that location. Due to the distributional assumption on the private
information, this probability is
$$p_l = \frac{\exp\left(\sum_{b=1}^{B} \beta_b Z^b_l + (J_m - 1) \sum_{b=1}^{B} \alpha_b \sum_{k=1}^{L_m} M^b_{lk} p_k\right)}{\sum_{i=1}^{L_m} \exp\left(\sum_{b=1}^{B} \beta_b Z^b_i + (J_m - 1) \sum_{b=1}^{B} \alpha_b \sum_{k=1}^{L_m} M^b_{ik} p_k\right)} \tag{7}$$
Given parameters $\{\beta, \alpha\}$, we can numerically find the fixed point solution to the
mapping $p = f(p)$ in order to obtain the vector of reduced form equilibrium conjectures
$p^*_l(X, \alpha, \beta, J_m)$ that depend only on the data and parameters. Once we solve
this system of $L_m$ equations in each market, the estimation consists of maximizing
the following log-likelihood
$$LL = \sum_{m=1}^{M} \sum_{l=1}^{L_m} y_l \ln p^*_l \tag{8}$$
where $y_l$ is the observed number of entrants in a location within a market, and $p^*_l$
is the equilibrium conjecture we solved for.
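
The fixed point in Equation 7 and the log-likelihood contribution in Equation 8 can be sketched for a single market as follows. This is a minimal illustration under stated assumptions, not the estimation code of the paper: the function names, the array layout (`Z` of shape bands × locations × variables, `M` of shape bands × locations × locations), the damped iteration and the tolerance are all choices made here for the example.

```python
import numpy as np


def solve_conjectures(Z, M, beta, alpha, n_firms,
                      tol=1e-10, max_iter=10_000, damping=0.5):
    """Iterate p = f(p) for one market (a sketch of Equation 7).

    Z: (B, L, K) band-level demographics, M: (B, L, L) band indicators,
    beta: (B, K), alpha: (B,), n_firms: J_m. All names are hypothetical.
    """
    n_locations = M.shape[1]
    demand = np.einsum("blk,bk->l", Z, beta)        # sum_b beta_b Z^b_l
    p = np.full(n_locations, 1.0 / n_locations)     # start from uniform beliefs
    for _ in range(max_iter):
        # (J_m - 1) * sum_b alpha_b * sum_k M^b_{lk} p_k
        rivals = (n_firms - 1) * np.einsum("b,blk,k->l", alpha, M, p)
        v = demand + rivals
        v = v - v.max()                             # stabilise the exponentials
        p_new = np.exp(v) / np.exp(v).sum()
        if np.max(np.abs(p_new - p)) < tol:
            return p_new
        p = damping * p_new + (1.0 - damping) * p   # damped update (assumption)
    raise RuntimeError("fixed-point iteration did not converge")


def log_likelihood(y, p_star):
    """One market's contribution to Equation 8: sum_l y_l * ln p*_l."""
    return float(np.sum(np.asarray(y) * np.log(p_star)))
```

An outer optimizer would call `solve_conjectures` for every market at each trial value of the parameters and sum the `log_likelihood` terms. Simple iteration is not guaranteed to converge for every parameter value, which is why the sketch raises an error instead of returning an unconverged vector.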
This procedure necessitates identifying the block group centroids that fall in the
distance rings, or bands, highlighted with different colors in Figure 3. Note that
the membership depends on the distances between centroids of locations. The store
counts, and demographics for each band are constructed using these neighborhood
definitions.
Figure 3: Band Construction
The first band includes the location of interest, at the bull’s eye, as well as locations
that are within $d_1$ of it. The third band includes all the locations that are farther
away than distance $d_2$ from the location of interest. One could potentially get finer
definitions of distance by using more bands; however, the number of parameters to be
estimated increases by the number of distance sensitive variables for each additional
band.
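
The band membership indicators can be built directly from a matrix of centroid distances. The sketch below is illustrative only: the function name `band_indicators` and the array layout are assumptions, and the outermost band is treated as open-ended, in line with the description of the third band above.

```python
import numpy as np


def band_indicators(dist, cutoffs):
    """Return M of shape (B, L, L) with M[b, i, k] = 1 if k is in band b of i.

    dist: (L, L) matrix of centroid distances in miles.
    cutoffs: increasing inner cutoffs, e.g. (0.5, 2.0) yields B = 3 bands.
    Hypothetical helper, not taken from the paper.
    """
    dist = np.asarray(dist, dtype=float)
    edges = np.concatenate(([0.0], np.asarray(cutoffs, dtype=float), [np.inf]))
    lower = edges[:-1, None, None]   # d_{b-1} for each band
    upper = edges[1:, None, None]    # d_b for each band (infinite for the last)
    return ((dist[None] >= lower) & (dist[None] < upper)).astype(float)


# Three hypothetical centroids on a line, at 0, 0.3 and 3 miles.
positions = np.array([0.0, 0.3, 3.0])
example_dist = np.abs(positions[:, None] - positions[None, :])
example_M = band_indicators(example_dist, cutoffs=(0.5, 2.0))
```

Every pair of locations falls in exactly one band, so the indicators sum to one across bands. In this small example no pair of centroids is between 0.5 and 2 miles apart, so the second band is empty for every location.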
In some cases, there might not be any locations in the second band of a particular
location, although there are locations in the first and third bands. Moreover,
locations with higher percentages of area outside the band may be included in the
band, and others that need to be included may be excluded. These discretization
errors may not be random, since the magnitude of the errors is positively correlated
with the size of the location and the sizes of its neighboring locations, and the size of
a location is correlated with population density, since it is endogenously defined
by the U.S. Census Bureau. For small enough location sizes, the discretization
approach generates band definitions that are close to boundary inclusion, just as
uniform-sized locations would. When the location sizes are bigger, more
errors are possible due to discretization.
Although we cannot observe the population distribution over the location’s area,
using population weighted centroids rather than geographic centroids of locations
increases the precision with which we can identify consumers’ distances to a location.
Errors due to discretization are minimized when location definitions are as
small as possible; therefore I use the census block groups as decision units rather
than census tracts.6
5.2. Continuous Distance Weighting Approach. The problems related to dis-
cretization can be circumvented by directly using distance measures to weigh the
effect of a variable on profits. Otherwise similar in its assumptions, the model
below allows the weights of characteristics of locations to be flexible functions of
the locations’ distances to the location of interest. The expected payoff for firm j
can be written as
$$E[\pi_j(a_j)] = \sum_{l=1}^{L_m} \beta(d_{a_j l}) X_l + (J_m - 1) \sum_{l=1}^{L_m} \alpha(d_{a_j l}) p_l + \eta^j_{a_j} \tag{9}$$
where $L_m$ is the total number of locations in the market, $X_l$ are the demographic
variables for location $l$, $d_{a_j l}$ is the distance between location choice $a_j$ and any
other location indexed by $l$, and $p_l$ is the probability that a firm will choose location
$l$.
In this approach, the effect of each location’s attributes on the attractiveness
of the location of interest is weighted by its distance to that location. A flexible
polynomial for the β(d) function can be estimated within the system.
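The probability of any firm choosing location $l$ is
$$p_l = \frac{\exp\left(\sum_{k=1}^{L_m} \beta(d_{lk}) X_k + (J_m - 1) \sum_{k=1}^{L_m} \alpha(d_{lk}) p_k\right)}{\sum_{i=1}^{L_m} \exp\left(\sum_{k=1}^{L_m} \beta(d_{ik}) X_k + (J_m - 1) \sum_{k=1}^{L_m} \alpha(d_{ik}) p_k\right)} \tag{10}$$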
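
To make the distance weighting concrete, the sketch below evaluates the term inside the exponential of Equation 10 for a single demographic variable. It assumes, purely for illustration, that $\beta(d)$ and $\alpha(d)$ are ordinary polynomials in distance with coefficients given in increasing order of power; the function names and the coefficient values used in any call are hypothetical, and the functional form actually estimated in the paper may differ.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def poly_weights(dist, coeffs):
    """Evaluate theta(d) = coeffs[0] + coeffs[1]*d + ... on a distance matrix."""
    return P.polyval(np.asarray(dist, dtype=float), coeffs)


def choice_index(dist, X, p, beta_coeffs, alpha_coeffs, n_firms):
    """Deterministic part of the payoff at each location (inside exp in Eq. 10).

    dist: (L, L) distances, X: (L,) one demographic variable,
    p: (L,) current conjectures. All names are illustrative.
    """
    beta_w = poly_weights(dist, beta_coeffs)      # beta(d_lk)
    alpha_w = poly_weights(dist, alpha_coeffs)    # alpha(d_lk)
    return beta_w @ X + (n_firms - 1) * (alpha_w @ p)


def choice_probabilities(index):
    """Logit probabilities implied by a vector of payoff indices."""
    v = index - np.max(index)
    return np.exp(v) / np.exp(v).sum()
```

Iterating `choice_probabilities(choice_index(...))` on its own output until the conjectures stop changing would give a fixed point of Equation 10, in the same way as for the discrete bands model.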
The discrete bands approach provides a computational advantage over
continuous distance weighting if the number of bands is relatively small.
On the other hand, continuous weighting by distance is better in circumstances
where discretization of bands induces bias in the estimation. Moreover, having
the distance weighting function over the whole range of distances provides us with
interesting insights from local changes in the function that might be lost due to
averaging and the particular choice of cutoff in the discrete bands approach.
6 Results at the census tract level are available on request.

The following extensions address two facts that characterize the supermarket
industry. The first fact is that the firms may have some common information about
the attractiveness of a location that the econometrician may not. The second fact
is that the supermarkets are differentiated in other dimensions than spatial, which
then can affect their spatial positioning decisions.
5.3. Location Specific Unobservables. The models above assume that all the
demand and cost conditions that make a specific location preferable to supermarkets
are observed by the researcher. If this assumption fails, however, the estimated
competition effects will be biased. Including location specific errors in
the model introduces some extra correlation of strategies across players, and it
might be interpreted as adding common information to the game or accounting for
location specific omitted variables. It might also be of interest to allow for spatial
correlation in the unobservables.
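In this light, the expected payoff function in the discrete bands model can be written
as
$$E[\pi_j(a_j)] = \sum_{b=1}^{B} \beta_b Z^b_{a_j} + (J_m - 1) \sum_{b=1}^{B} \alpha_b \sum_{l=1}^{L_m} M^b_{a_j l} p_l + \sum_{b=1}^{B} \sigma_b \sum_{l=1}^{L_m} M^b_{a_j l} \varepsilon_l + \eta^j_{a_j} \tag{11}$$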
where $p_l$ is the probability of any firm choosing location $l$ and $\varepsilon$ is drawn from
$N(0, I)$. Similarly, for the distance weighting model, the expected payoff function
is
$$E[\pi_j(a_j)] = \sum_{l=1}^{L_m} \beta(d_{a_j l}) X_l + (J_m - 1) \sum_{l=1}^{L_m} \alpha(d_{a_j l}) p_l + \sum_{l=1}^{L_m} \sigma(d_{a_j l}) \varepsilon_l + \eta^j_{a_j} \tag{12}$$
By including $\sigma(d_{a_j l}) \varepsilon_l$ instead of just $\sigma \varepsilon_{a_j}$, I investigate spatial correlation among
the location specific unobservables.
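Introducing common location unobservables requires solving for the equilibrium
conjectures for a set of simulated $\varepsilon$ in order to integrate out the distribution of $\varepsilon$
in the Maximum Likelihood estimation. This approach assumes that $\varepsilon$ is uncorrelated
with the explanatory variables. For each simulation $r = 1, 2, \ldots, R$ of the
unobservable vector over all locations, the probability that a firm chooses location $l$
is
$$p^r_l = \frac{\exp\left(\sum_{b=1}^{B} \beta_b Z^b_l + (J_m - 1) \sum_{b=1}^{B} \alpha_b \sum_{k=1}^{L_m} M^b_{lk} p^r_k + \sum_{b=1}^{B} \sigma_b \sum_{k=1}^{L_m} M^b_{lk} \varepsilon^r_k\right)}{\sum_{i=1}^{L_m} \exp\left(\sum_{b=1}^{B} \beta_b Z^b_i + (J_m - 1) \sum_{b=1}^{B} \alpha_b \sum_{k=1}^{L_m} M^b_{ik} p^r_k + \sum_{b=1}^{B} \sigma_b \sum_{k=1}^{L_m} M^b_{ik} \varepsilon^r_k\right)} \tag{13}$$
for the discrete bands approach, and
$$p^r_l = \frac{\exp\left(\sum_{k=1}^{L_m} \beta(d_{lk}) X_k + (J_m - 1) \sum_{k=1}^{L_m} \alpha(d_{lk}) p^r_k + \sum_{k=1}^{L_m} \sigma(d_{lk}) \varepsilon^r_k\right)}{\sum_{i=1}^{L_m} \exp\left(\sum_{k=1}^{L_m} \beta(d_{ik}) X_k + (J_m - 1) \sum_{k=1}^{L_m} \alpha(d_{ik}) p^r_k + \sum_{k=1}^{L_m} \sigma(d_{ik}) \varepsilon^r_k\right)} \tag{14}$$
for the distance weighting approach.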
This system of equations for all firms is solved numerically for each simulation $r$.
Then these $R$ sets of conjectures are averaged, thus integrating out the distribution
of $\varepsilon$ for a given set of parameters $(\beta, \alpha, \sigma)$. The log-likelihood taken to estimation
is
$$LL = \sum_{m=1}^{M} \sum_{l=1}^{L_m} y_l \ln\left(\int p^*_l(\varepsilon) f(\varepsilon) \, d\varepsilon\right) \tag{15}$$
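
The simulation step behind Equation 15 can be sketched for one market as follows: draw the location shocks, solve for the conjectures given each draw, average the conjectures over draws, and take logs. This is a hedged illustration rather than the paper's C implementation. The names `solve_with_shock` and `simulated_log_likelihood`, the precomputed matrices `W_alpha` (standing for $\sum_b \alpha_b M^b$) and `W_sigma` (standing for $\sum_b \sigma_b M^b$), the number of draws and the damped iteration are all assumptions.

```python
import numpy as np


def solve_with_shock(demand, W_alpha, shock, n_firms, tol=1e-10, max_iter=10_000):
    """Fixed point of the choice probabilities for one draw of the unobservables.

    demand: (L,) observed demand index, W_alpha: (L, L) competition weights,
    shock: (L,) weighted unobservable at each location.
    """
    n_locations = demand.shape[0]
    p = np.full(n_locations, 1.0 / n_locations)
    for _ in range(max_iter):
        v = demand + (n_firms - 1) * (W_alpha @ p) + shock
        v = v - v.max()
        p_new = np.exp(v) / np.exp(v).sum()
        if np.max(np.abs(p_new - p)) < tol:
            return p_new
        p = 0.5 * (p + p_new)                      # damped update (assumption)
    raise RuntimeError("fixed-point iteration did not converge")


def simulated_log_likelihood(y, demand, W_alpha, W_sigma, n_firms,
                             n_draws=200, seed=0):
    """Average the conjectures over draws of eps ~ N(0, I), then sum y_l * ln(.)."""
    rng = np.random.default_rng(seed)
    n_locations = demand.shape[0]
    p_bar = np.zeros(n_locations)
    for _ in range(n_draws):
        eps = rng.standard_normal(n_locations)
        p_bar += solve_with_shock(demand, W_alpha, W_sigma @ eps, n_firms)
    p_bar /= n_draws
    return float(np.sum(np.asarray(y) * np.log(p_bar)))
```

Fixing the seed keeps the same draws across evaluations of the likelihood, which keeps the simulated objective a deterministic function of the parameters for a derivative-free optimizer. Each draw is solved independently of the others.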
Finding the fixed-point solution to the set of equations for the equilibrium conjectures
can be time consuming. Estimations which have a large number of independent
calculations within a minimization routine are very suitable for parallelization.
To gain orders of magnitude in speed, I parallelized the algorithm in C
and ran it on the Datastar at the San Diego Supercomputer Center. The algorithm
is generally parallelized over the simulations of $\varepsilon$, and in cases where I am interested
in models without the unobservable, it is parallelized over the markets. The costs of
communication outweigh the benefits of double parallelization (over both $R$ and
$M$). The minimization routine is a downhill simplex method.
5.4. Asymmetric Types. In the supermarket industry, geographic positioning
segments consumers according to their proximity and travel costs. Preexisting
brand level differences at the time of location choice further segment them according
to their tradeoff between distance and quality and price concerns. This
may result in relative geographic targeting of consumer segments by different types
of firms. Moreover, if preexisting differences play a role in differentiation, competition
across firms that are more differentiated will be softer.
In order to investigate these effects, we can allow for discrete types of firms, which
results in asymmetries in location probabilities across different types. When predetermined
systematic differences across players are denoted as types $t = 1, 2, \ldots, T$,
the discrete bands expected payoff function of firm $j$ of type $t$ can be written as
$$E[\pi^t_j(a_j)] = \sum_{b=1}^{B} \beta^t_b Z^b_{a_j} + \sum_{s=1}^{T} (J^m_s - I_{t=s}) \sum_{b=1}^{B} \alpha^{ts}_b \sum_{l=1}^{L_m} M^b_{a_j l} p^s_l + \eta^j_{a_j} \tag{16}$$
where $I_{t=s}$ equals one if $t = s$, and $J^m_s$ is the number of firms of type $s$ in the
market. The probability of a firm of type $s$ choosing location $l$ is denoted by $p^s_l$.
The coefficient $\alpha^{ts}_b$ captures the effect of competitors of type $s$ within band $b$ on
the profits of a type $t$ firm. Similarly, for the distance weighting model, the expected
payoff function is
$$E[\pi^t_j(a_j)] = \sum_{l=1}^{L_m} \beta^t(d_{a_j l}) X_l + \sum_{s=1}^{T} (J^m_s - I_{t=s}) \sum_{l=1}^{L_m} \alpha^{ts}(d_{a_j l}) p^s_l + \eta^j_{a_j} \tag{17}$$
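The probability of a firm of type $t$ choosing location $l$ can be expressed as
$$p^t_l = \frac{\exp\left(\sum_{b=1}^{B} \beta^t_b Z^b_l + \sum_{s=1}^{T} (J^m_s - I_{t=s}) \sum_{b=1}^{B} \alpha^{ts}_b \sum_{k=1}^{L_m} M^b_{lk} p^s_k\right)}{\sum_{i=1}^{L_m} \exp\left(\sum_{b=1}^{B} \beta^t_b Z^b_i + \sum_{s=1}^{T} (J^m_s - I_{t=s}) \sum_{b=1}^{B} \alpha^{ts}_b \sum_{k=1}^{L_m} M^b_{ik} p^s_k\right)} \tag{18}$$
or
$$p^t_l = \frac{\exp\left(\sum_{k=1}^{L_m} \beta^t(d_{lk}) X_k + \sum_{s=1}^{T} (J^m_s - I_{t=s}) \sum_{k=1}^{L_m} \alpha^{ts}(d_{lk}) p^s_k\right)}{\sum_{i=1}^{L_m} \exp\left(\sum_{k=1}^{L_m} \beta^t(d_{ik}) X_k + \sum_{s=1}^{T} (J^m_s - I_{t=s}) \sum_{k=1}^{L_m} \alpha^{ts}(d_{ik}) p^s_k\right)} \tag{19}$$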
The equilibrium conjectures $p^{t*}$ are found by solving $L_m \times T$ equations.
The log-likelihood taken to the estimation is
$$LL = \sum_{m=1}^{M} \sum_{l=1}^{L_m} \sum_{t=1}^{T} y^t_l \ln p^{t*}_l \tag{20}$$
where $y^t_l$ is the observed number of entrants of type $t$ in location $l$ within a
market.
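
With several types, the fixed point is a stacked system of $L_m \times T$ equations, one probability vector per type. The sketch below illustrates this for one market. The function name, the array layout (`demand` of shape types × locations, `W` of shape types × types × locations × locations holding $\sum_b \alpha^{ts}_b M^b$ for each pair of types) and the damped iteration are assumptions made for the example.

```python
import numpy as np


def solve_type_conjectures(demand, W, n_by_type, tol=1e-10, max_iter=10_000):
    """Solve for p[t, l], the probability that a type-t firm picks location l.

    demand: (T, L) demand index per type, W: (T, T, L, L) competition weights
    of type s on type t, n_by_type: (T,) number of firms of each type.
    All names are hypothetical.
    """
    n_types, n_locations = demand.shape
    n_by_type = np.asarray(n_by_type, dtype=float)
    # Number of rivals of type s faced by a type-t firm: J_s - 1{t = s}
    rivals = n_by_type[None, :] - np.eye(n_types)
    p = np.full((n_types, n_locations), 1.0 / n_locations)
    for _ in range(max_iter):
        competition = np.einsum("ts,tslk,sk->tl", rivals, W, p)
        v = demand + competition
        v = v - v.max(axis=1, keepdims=True)
        p_new = np.exp(v)
        p_new = p_new / p_new.sum(axis=1, keepdims=True)
        if np.max(np.abs(p_new - p)) < tol:
            return p_new
        p = 0.5 * (p + p_new)                      # damped update (assumption)
    raise RuntimeError("fixed-point iteration did not converge")
```

The log-likelihood in Equation 20 would then sum `y[t, l] * np.log(p_star[t, l])` over types and locations, given an array `y` of observed entrant counts by type.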
6. Results
In the estimation, the demand shifters such as population density and average
income are assumed to be distance sensitive.7 On the other hand, upper quartile
rent and retailer density of a particular location only affect the profits for that
location.8 I operationalize the discrete bands approach with 3 bands, which are
defined by cutoffs at 0.5 and 2 miles.9
Table 1 shows the estimates from the discrete bands model. The parameters of
Equation 7 are reported in the first column, and the second column reports the
results of Equation 13, which introduces common location unobservables. Let’s
concentrate on the first column. The effect of population density on profits is
positive and decreases with distance to the store location. In fact, there is no
significant effect of population density from the 3rd band, which includes locations
farther than 2 miles. The fact that the effect decreases with distance reflects
consumer travel costs. The effect of income per capita in the immediate band is
negative; however, in the second and third bands it is positive and decreasing with
distance. The effect of upper quartile rent in the location of choice is negative,
but insignificant. A location with very few or no retail establishments is much less
likely to be chosen, possibly reflecting unobserved zoning rules.10
As expected, the effects of competitors are negative and decreasing in magnitude
with distance. In the second specification, which introduces location specific
unobservables to the estimation, the effects of competitors become more pronounced
in every band. This is because the model can now attribute clusters or the
absence of supermarkets to unobservables. For example, take a case where
7 I do not find significant effects of education, family size, or travel time to work.
8 I have also used restaurant density and business density instead of retailer density and got very similar coefficients.
9 Results for distance cutoffs (1, 2.5), (0.5, 2) and (0.5, 1.5, 2.5) are available upon request. Results are not very sensitive to cutoff choice.
10 Results using density of large retailers (comparable to supermarkets) are very similar.
there is a positive unobservable for a given location, say an intersection of major
avenues with a parking facility nearby. When the model is forced to attribute
a higher number of supermarkets in this location to observed demographics only,
this will dilute the effect of demographics and the competitive effect will be
underestimated. The difference in results provides evidence that allowing for some
information common to the players is important. The unobservables show a significant
variance and positive spatial correlation. I do not include retail density as an
explanatory variable in this specification, since the location specific unobservables
capture zoning rules and more. The effect of rent on profits becomes significant and
more negative in this specification.11
Table 2 displays the estimates of the continuous distance weighting approach.
The first column reports the results of the model in Equation 10, and the second
column reports the results of the model with location specific unobservables, as
in Equation 14. For each distance sensitive parameter, I use
$\theta(d) = \theta_1/d + \theta_2/d^2$ or
$\theta(d) = \theta_1/d + \theta_2/d^2 + \theta_3/d^3$
to approximate the effect of distance on the weighting of
the variable. This functional form is intuitive and flexible. This approach avoids
discrepancies due to ad hoc discretization of neighborhoods and provides a more
detailed picture of distance varying effects.
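The weighting function can be evaluated directly from the reported coefficients. The Python sketch below does this for the competition coefficients in column (1) of Table 2. The function name and the grid of distances are illustrative choices, and the code assumes that $d$ is measured in the same units as in the estimation.

```python
def theta(d, coefs):
    """Evaluate theta(d) = coefs[0]/d + coefs[1]/d**2 + ... for a distance d > 0."""
    return sum(c / d ** n for n, c in enumerate(coefs, start=1))

# Competition coefficients on 1/d, 1/d^2 and 1/d^3 from column (1) of Table 2.
competition = (-0.5213, 0.0850, -0.0063)

for d in (0.5, 1.0, 2.0, 3.0):
    print(f"d = {d:.1f}: theta = {theta(d, competition):.4f}")
```

Because the terms are powers of $1/d$, the weight is undefined at $d = 0$, so any application has to adopt some convention for a location's distance to itself.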
The results are in the same direction as the discrete bands model results. In
Display 1, estimated weighting functions in two specifications are graphed as a
function of distance for demand and competition factors. As before, the effect
of a competitor is negative and decreasing with distance. Allowing for location
specific unobservables increases this effect in magnitude. The effect of population
density is positive and decreasing with distance. The effects of income per capita
are negative for locations that are very close, and positive and decreasing with
distance for the rest. The effect of rent is negative as expected.
The results of the model with asymmetric types of players are presented in Table
3. The first column of Table 3 shows the results of the model presented in Equation
11 To the degree that explanatory variables are correlated with the unobservable factors, the parameters on the explanatory variables will reflect the effect of this correlation with the unobservable factors, not just the effect of population for example. Then the estimates need to be viewed in this light.
17. Supermarkets are divided into two types. Type A supermarkets are brand
names such as Whole Foods, Andronico’s, Nob Hill and Trader Joe’s, as well as specialty
supermarkets; Type B supermarkets are Safeway/Vons, Albertsons, Smart and
Final, Grocery Outlet, Food4Less and other nonbranded stores. For estimation
purposes, it is assumed that the competitive pressure of a type t supermarket
on a type s supermarket is the same as that of a type s on a type t, in other
words $\alpha^{ts} = \alpha^{st}$. It is also assumed that supermarkets of the same type exert
the same competitive pressure on each other, regardless of their type, so
$\alpha^{tt} = \alpha^{ss}$. This method differentiates between within-type and across-type competition
effects. Moreover, supermarkets are allowed to differ in their sensitivities to income
per capita of their consumers. The specification presented in the second column
introduces the variance of income in a location as an explanatory variable that
each type is allowed to respond differently to. This aims to capture the effect of
distribution of income, rather than just the mean of income. Consider two locations
with the same mean income per capita. Let one have a unimodal distribution
of income per capita, with a small variance around the mean. Let the other
have a bimodal distribution, reflecting a mix of poor and rich consumers. The
second location might attract two entrants of different types, whereas the first
location may attract none or one. In this scenario, if the distribution of income is
omitted, then the competitive effect between supermarkets of different types might
be underestimated.
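The two-location example can be made concrete with a few made-up numbers. In the Python sketch below, both income vectors are hypothetical and chosen only so that the means coincide while the variances differ sharply, which is the feature the second specification is meant to pick up.

```python
import numpy as np

# Hypothetical incomes per capita (in thousands) of residents in two locations.
unimodal = np.array([48, 49, 50, 50, 51, 52])   # tightly clustered around the mean
bimodal = np.array([20, 22, 24, 76, 78, 80])    # a mix of poor and rich consumers

for name, incomes in (("unimodal", unimodal), ("bimodal", bimodal)):
    print(f"{name}: mean = {incomes.mean():.1f}, variance = {incomes.var():.1f}")
```

A specification that includes only mean income cannot distinguish these two locations; adding the variance lets each type respond to the spread as well.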
The results for the two specifications are graphed in Display 2. The effect of
population density is positive and decreasing with distance as before. The income
per capita sensitivities of the two types are quite different. Type B firms prefer low
income locations, and the effect of income decreases sharply with distance. For
Type A firms the effect of income is positive and generally decreases with distance.
The results of the first specification show a considerable difference in within-type
competition intensities and across-type intensities. Competition between firms of
the same type is much fiercer for all distances, although the effect decreases sharply
within the first mile of proximity. The competition between firms of different
types is softer and decreases slowly with distance. This competition effect must
be interpreted as the result of consumer choice among supermarkets, as a function
of the substitutability or complementarity of competitors. The estimates from the
second specification are similar, with the competition across types being fiercer.
The comparison can be seen more clearly in the last graph of Display 2. This is in
line with the earlier intuition that distribution of income might be a determinant
of location choice of different types.
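The within-type and across-type comparison can be reproduced numerically from the point estimates. The Python sketch below evaluates the two competition weighting functions from column (1) of Table 3 at a few distances. The function name and the distance grid are illustrative assumptions, and standard errors are ignored.

```python
def weight(d, a1, a2):
    """Distance weighting a1/d + a2/d**2 for a distance d > 0."""
    return a1 / d + a2 / d ** 2

# Point estimates on 1/d and 1/d^2 from column (1) of Table 3.
within = (-0.5982, 0.0103)
across = (-0.2209, 0.0152)

for d in (0.5, 1.0, 2.0, 3.0):
    w, a = weight(d, *within), weight(d, *across)
    print(f"d = {d}: within-type {w:.4f}, across-type {a:.4f}")
```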
7. Discussion
This paper investigates the geographic differentiation of supermarkets, taking
into account other differentiating factors that are pre-existing at the time of the
location choice. The contribution of the results is to provide evidence that such pre-
existing differences matter in spatial differentiation decisions. The supermarkets
are categorized into types outside the model. There may be alternative ways of
categorizing supermarkets, such as by size and variety or by type of
pricing scheme, that this paper does not yet explore. To that extent, the results that
point to the importance of such pre-existing differences might be understated.
The competition effect is modeled as the effect of an additional competitor on the
firm’s profits. By construction, this effect will include any possible spill-over effects.
The supermarket industry, unlike other retail industries such as restaurants, is not
expected to exhibit large positive clustering externalities. Moreover,
allowing for location specific unobservables in the model prevents misattribution
of unobservables to competition intensity. However, results should be interpreted
keeping the model in mind. For example, when we allow for asymmetric competi-
tion intensities within and across different types of supermarkets, we find that the
within type competition is tougher than across type competition. One can inter-
pret these results as pre-existing differences creating asymmetric substitutability
between different types of supermarkets for the consumer. These results, to some
extent, may also be due to the complementarity of supermarkets of different types,
which may induce positive externalities of locating together.
The payoff function is designed to capture the tradeoff between being close to
favorable demand conditions and being far from competitors. The additively
separable formulation of these effects is prominent in the literature. The competitive
effect is a linear, additive function of the expected number of competitors. This
formulation assumes that the effect of an additional firm is constant, regardless of
the number of existing firms. The fixed point solution to the system of conjectures
is greatly aided by this assumption. If one were to allow for a more flexible payoff
function, the estimation would not be feasible under these assumptions. Two studies
deal with this problem. Einav (2003) models a flexible payoff function under a
sequential discrete choice assumption. The game structure does not require solving
for the fixed point; it is solved by backwards induction. He estimates one
parameter of this flexible payoff function, where other parameters are substituted
in. I do not have the order of entry information in my data set. Therefore, in order to
estimate a sequential choice model, I would have to randomize over the order of
choice with respect to some parametric distribution. However, with a more com-
plicated payoff function, where I am interested in estimating all the parameters
from the location choice data, I do not find it feasible to randomize over order of
choice and to estimate the parameters of the model. The second study that is able
to estimate a more flexible payoff function is Gowrisankaran and Krainer (2005).
That paper uses the same simultaneous discrete choice game setup; however, instead
of solving for the equilibrium beliefs, it substitutes the actual actions of competitors in
the market for the beliefs about competitors’ location choices.
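One way to see why the linear competition term simplifies the problem is that the expected payoff then depends on rivals' strategies only through the expected number of competitors. The Python sketch below illustrates this under assumptions that are not taken from the paper: the number of rivals choosing a location is treated as binomial, and the concave alternative `alpha * log(1 + N)` is a hypothetical example of a more flexible payoff.

```python
from math import comb, log

def expected_effect(n_rivals, p, effect):
    """E[effect(N)] when N ~ Binomial(n_rivals, p) rivals choose the location."""
    return sum(comb(n_rivals, n) * p ** n * (1 - p) ** (n_rivals - n) * effect(n)
               for n in range(n_rivals + 1))

n_rivals, p, alpha = 4, 0.3, -0.5          # hypothetical values

# Linear effect: the full expectation equals alpha times the expected count.
linear_full = expected_effect(n_rivals, p, lambda n: alpha * n)
linear_shortcut = alpha * n_rivals * p

# Concave effect: plugging in the expected count no longer gives the expectation.
concave_full = expected_effect(n_rivals, p, lambda n: alpha * log(1 + n))
concave_shortcut = alpha * log(1 + n_rivals * p)

print(linear_full, linear_shortcut)
print(concave_full, concave_shortcut)
```

With the linear form the two numbers in the first line coincide, so only expected counts need to be tracked; with the concave form they differ, and the whole distribution of rival choices would have to be carried through the computation.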
8. Conclusion
I estimate a simultaneous location choice game of asymmetric information in
order to gain insight into the tradeoff between demand and competition factors in
the supermarket industry, and to recover the sensitivities of these factors to distance
due to consumer travel costs. In the model, firms face uncertainty regarding the
number of competitors they will face, since each firm possesses private information
on its profitability across locations. Each firm maximizes its expected profits,
but might face ex-post regret due to the risk in payoffs caused by uncertainty in
the optimal strategies of competitors, which is a well suited description of decision
making in the real world.
The model is applied to the supermarket industry where location choice is a
major differentiating factor, and other pre-existing differentiating factors can be
controlled for. The findings indicate that competition intensity decreases signifi-
cantly with distance, which gives firms the incentive to avoid locating in proximity.
On the other hand, the demand effects also weaken with distance, giving firms a
counter incentive to locate close to favorable demand conditions. Empirical re-
sults also show that these two counteracting incentives are traded off differently
by different types of supermarkets. For a fixed distance, the competition intensity
is greater between firms of the same type than between firms of different types.
Also, results indicate that firms of different types are targeting different segments
of consumers. This result highlights the importance of considering all pre-existing
differentiating dimensions when modelling geographic position choices of firms.
Furthermore, this paper illustrates how controlling for unobservable location spe-
cific factors can explain clustering or thinning of supermarkets which otherwise
would be misattributed to competition intensity by the model.
This research is a step towards incorporating positioning decisions of firms in
marketing strategy models. This study shows that positioning decisions of firms
can convey valuable information on the demand and competition conditions in
the market. Combining demand and supply models with product location choice
models is a natural direction for future research.
References
[1] Steve Berry. Estimation of a model of entry in the airline industry. Econometrica, 60:889–917, February 1992.
[2] Timothy Bresnahan and Peter Reiss. Entry in monopoly markets. The Review of Economic Studies, 57(4):531–553, October 1990.
[3] Timothy Bresnahan and Peter Reiss. Empirical models of discrete games. Journal of Econometrics, 48:57–81, 1991.
[4] Liran Einav. Not all rivals look alike: Estimating an equilibrium model of the release date timing game. Working Paper, Stanford University, June 2003.
[5] Gautam Gowrisankaran and John Krainer. The welfare consequences of ATM surcharges: Evidence from a structural entry model. Working Paper, Olin School of Business, Washington University, November 2004.
[6] Michael Mazzeo. Product choice and oligopoly market structure. RAND Journal of Economics, 3(2):1–22, Summer 2002.
[7] Katja Seim. An empirical model of entry with endogenous product-type choices. Working Paper, Graduate School of Business, Stanford University, February 2004.
[8] Randal Watson. Product variety and competition in the retail market for eyeglasses. Working Paper, University of Texas, Austin, March 2004.
Table 1: Results of Discrete Bands Approach

                                     (1)           (2)
Population density, band 1          0.5214        0.9731
                                   (0.1594)*     (0.2517)*
Population density, band 2          0.2297        0.5672
                                   (0.0952)*     (0.1288)*
Population density, band 3          0.103         0.0962
                                   (0.094)       (0.208)
Income per Capita, band 1          -0.1239       -0.0811
                                   (0.0439)*     (0.034)*
Income per Capita, band 2           0.2371        0.2422
                                   (0.0992)*     (0.1041)*
Income per Capita, band 3           0.0459        0.0069
                                   (0.0208)*     (0.0047)
Upper quartile rent                -0.0735       -0.0868
                                   (0.0523)      (0.0363)*
Retail density                      0.473
                                   (0.166)*
Competition, band 1                -2.4675       -3.532
                                   (0.2612)*     (0.2238)*
Competition, band 2                -0.1245       -0.439
                                   (0.058)*      (0.1318)*
Competition, band 3                -0.1883       -0.2489
                                   (0.0916)*     (0.054)*
σ1                                                1.762
                                                 (0.7551)*
σ2                                                1.183
                                                 (0.4358)*
σ3                                                1.107
                                                 (0.314)*
LLF                                 1744.56       1738.19
Table 2: Results of Distance Weighting Approach

                                     (1)           (2)
Population density, 1/d             0.1962        0.2683
                                   (0.0793)*     (0.0581)*
Population density, 1/d^2          -0.0121       -0.0109
                                   (0.0051)*     (0.0031)*
Income per Capita, 1/d              0.1420        0.1120
                                   (0.0670)*     (0.0440)*
Income per Capita, 1/d^2           -0.0169       -0.0138
                                   (0.0080)*     (0.0052)*
Upper quartile rent                -0.0580       -0.0910
                                   (0.0276)*     (0.0494)*
Retail density                      0.336
                                   (0.1074)*
Competition, 1/d                   -0.5213        0.6715
                                   (0.1480)*     (0.1508)*
Competition, 1/d^2                  0.0850        0.0442
                                   (0.0196)*     (0.0099)*
Competition, 1/d^3                 -0.0063       -0.0021
                                   (0.0015)*     (0.0005)*
σ1, 1/d                                           0.2894
                                                 (0.1181)*
σ2, 1/d^2                                        -0.0197
                                                 (0.0059)*
LLF                                 1749.77       1740.16
Display 1
[Graphs: Graph of Competition Coefficient; Graph of Population Density Coefficient; Graph of Income per Capita Coefficient]
Table 3: Results of Distance Weighting Approach with Asymmetric Types

                                        (1)           (2)
Population density, 1/d                0.228         0.1813
                                      (0.0417)*     (0.0705)*
Population density, 1/d^2             -0.0107       -0.0096
                                      (0.0044)*     (0.0038)*
Upper quartile rent                   -0.0720       -0.0933
                                      (0.0854)      (0.0468)*
Retail Density                         0.6201        0.6953
                                      (0.173)*      (0.323)*
Income per Capita - type A, 1/d        0.1993        0.2634
                                      (0.0978)      (0.1085)*
Income per Capita - type A, 1/d^2     -0.0147       -0.0241
                                      (0.006)*      (0.011)*
Income per Capita - type B, 1/d        0.0225        0.0218
                                      (0.0263)      (0.0158)
Income per Capita - type B, 1/d^2     -0.0073       -0.0034
                                      (0.0055)      (0.0021)
Within-type Competition, 1/d          -0.5982       -0.6247
                                      (0.2550)*     (0.2193)*
Within-type Competition, 1/d^2         0.0103        0.0235
                                      (0.0042)*     (0.0106)*
Across-type Competition, 1/d          -0.2209       -0.3527
                                      (0.0774)*     (0.1255)*
Across-type Competition, 1/d^2         0.0152        0.0168
                                      (0.0067)*     (0.0074)*
Variance of Income - type A                          0.3863
                                                    (0.2886)
Variance of Income - type B                          0.6419
                                                    (0.2745)*
LLF                                    1749.32       1741.81
Display 2
[Graphs: Population Density Coefficient; Income Coefficients; Competition Coefficients from (1); Competition Coefficients from (2); Across type competition comparison]