
SPATIAL DIFFERENTIATION IN THE SUPERMARKET INDUSTRY

A. YESIM ORHUN

May 21, 2005

Haas School of Business, University of California at Berkeley

Abstract. This paper investigates the positioning choice of strategic firms and infers the tradeoff between demand and competition factors in their optimal decisions. It shows that controlling for spatially correlated unobservable common factors as well as pre-existing systematic differences across retailers is essential for explaining the spatial positioning decisions in the supermarket industry.

1. Introduction

Observed decisions of firms and consumers reveal valuable information about

the underlying factors in their decision making. To this end, purchase decisions of

consumers and pricing decisions of firms are extensively studied in the marketing

literature. Positioning decisions such as where to open a retailer, how to design

a new product, which combination of alternatives to offer, are all decided before

the pricing game takes place, and demand is realized, and thus reflect the firms’

knowledge and expectations of these factors. When positioning a product, each

firm in the market has to consider the proximity to the consumer base of inter-

est and the perceived distance from competing alternatives. These factors will

determine the extent of price competition and demand conditions. Although the

positioning decision is a key marketing strategy which incorporates information

Preliminary, comments appreciated. Address for correspondence: Haas School of Business, University of California at Berkeley, 545 Student Services Building #1900, Berkeley, CA 94720-1900. Email: [email protected].


on the sensitivity of demand and competition to the change in product attributes,

this source of information has not been empirically explored.

In this paper I present a model that recovers the relative importance of com-

petition and demand factors and their sensitivity to geographic distance in the

supermarket industry. Spatial positioning is a major differentiating factor in this

industry, as in many retail industries. Consumers’ shopping trips usually originate

from their residences, and distance to the store is one of the major determinants

of their choice of where to shop. In this industry, the favorable demand conditions

translate to being close to the customer base of interest, due to travel costs of

consumers. Therefore “distance” to different segments of consumers, as well as

other competitors, is well defined. Thus studying location choice of supermarkets

as a form of positioning is advantageous in terms of the alignment of what the

econometrician can observe and what the firms observe.

In the model each firm simultaneously makes a spatial positioning decision with

expectations of the decisions of the competitors. The decision that maximizes

expected payoffs trades off being close to consumers of interest and avoiding pos-

sible competition. Firms are assumed to have some private information about

their payoffs across locations. The model also investigates details specific to the

supermarket industry which prove to be important in understanding the spatial

differentiation outcomes. In this industry, pre-existing systematic differences

at the brand level are another source of differentiation. These preexisting differ-

ences are found to be significant in their effect on firms’ spatial positioning choices,

providing evidence of consumer segmentation as well as asymmetric competition

intensities. Since these preexisting differentiating factors are independently de-

cided at the national level, the econometrician can condition on this information

in order to isolate the spatial positioning decisions. The empirical results illustrate

the importance of allowing for differences across firms. The competition intensity

is greater between the firms of the same type than between those firms of different

types. Competition among different types of firms is softened by the fact that they

target different demographic segments.


Moreover, the assumption that the information sets of the econometrician and

of the firms are aligned is relaxed by introducing location specific unobservables to

the model. Some examples of unobservable factors are major road intersections,

zoning laws and traffic flow. These factors are observed by the firms at the

time of the location decision, but they are not observed by the econometrician.

In the case of a positive location shock, such as easy parking amenities and major

road intersections, a model that does not allow for such unobservables will wrongly

attribute the high number of supermarkets to low competition intensity between

these supermarkets. The results show that introducing some common information

to the game, in the form of location specific shocks, is important for the supermar-

ket industry. The competition effect is stronger when unobservables are included in

the model. The location specific unobservables are found to be spatially positively

correlated.

Next I summarize related literature, and present the data. I proceed with the

discrete location choice model and the payoff formulations, followed by estimation

and discussion of results.

2. Literature Review

In order to study the implied demand and competition conditions from the lo-

cation choice of retailers, one needs to define an equilibrium that leads to the

observed outcomes. In this section, I outline the discrete location choice models in

the literature. The discrete product type choice literature stems from a literature

that estimates entry equilibrium models, which are models of binary decisions in

nature. Bresnahan and Reiss (1990) examine the decisions of potential entrants

in monopoly automobile dealer markets, and find that differentiation increases

profit margins of the duopoly more than competition decreases them. Berry (1992)

estimates a game where firms are making simultaneous decisions of entry with com-

plete information about their competitors. The estimation of this model hinges on

the fact that the number of entrants is unique, even if the identity of the entrants may

not be. This study brings problems of estimating complete information discrete

choice games to our attention, such as dimensionality problems due to complicated


regions of integration, as well as potential multiplicity of equilibria. Bresnahan and

Reiss (1991) detail the empirical estimation of discrete games and further discuss

the multiplicity of equilibria and dimensionality problems in different models of

simultaneous choice of entry with complete information.

The transition from the binary strategic choice to multiple strategic choice proves

to be challenging due to the complete information setup. The first step in this di-

rection comes with Mazzeo (2002), who extends entry models to allow for firm

differentiation by incorporating entry and quality choice in the motel industry.

This study also compares estimates from simultaneous and sequential complete

information models and finds the results to be very similar. However, due to the

complete information assumption, introducing more than three levels of product

choice proves infeasible. This dimensionality problem is addressed by Seim (2004),

who estimates the location choice of video rental stores. In her model, symmet-

ric firms make a nested decision about entry and location simultaneously, where

each firm has private information about its profitability in a particular location.

The asymmetric information assumption turns the discrete actions of competitors into

smooth location choice probabilities, and thus both simplifies computation and reduces

the regions of possible multiplicity of equilibria.¹ This framework allows her to

estimate a game where firms have many discrete choices.

The use of the asymmetric information assumption makes advances in product

type choice research feasible. Watson (2002) models eyeglass retailers’ variety

choice after an entry decision, and Einav (2003) estimates the discrete timing choice

of movies. Einav’s sequential-move game setting guarantees a unique equilibrium in

pure-strategies. The solution method to this model can also allow for heterogeneity

in firms, thus asymmetric probabilities of timing choice.

In studying a social planner’s decision of locations of gasoline retailers, Chan,

Padmanabhan, and Seetharaman (2003) estimate a very flexible model which

incorporates the pricing game as well as the location choice. Their model does not involve

a multiple agent discrete choice game, since the location choices are made by a

¹ It should be pointed out that asymmetric information does not theoretically rule out multiple equilibria. This particular study finds empirical evidence that equilibrium choice probabilities are unique.


social planner instead of competing firms. Gowrisankaran and Krainer (2004) are

able to estimate a flexible model of strategic location choice, similar in setup to

Seim (2004), by substituting equilibrium probabilities from data, instead of solving

for them.

This paper concentrates on estimating the strategic tradeoff between demand

and competition factors in the location choice of firms. Product type choice is

conditional on entry, as in Einav (2003). The model is one in which firms choose

locations simultaneously and have private information. Furthermore, the model

allows for location specific common knowledge that is unobservable to the econo-

metrician, as well as pre-existing asymmetries between firms.

3. The Data

Due to the advances in Geographic Information Systems and the initiatives of

the U.S. Census Bureau to make cartographic resources available, data on firms

and consumers can be matched to geography. This allows researchers to construct

a realistic picture of the distribution of consumers with different characteristics.

Moreover, the distance between any pair of consumer and firm, as well as between

any pair of competitors can be calculated. The data on the grocery stores in 25

metropolitan areas in the U.S. are provided by TDLinx. The data include the

street addresses, size in square feet, brand name, number of employees, number of

checkouts, and average weekly revenues of all grocery stores in these markets.² The

data on consumers are obtained from the 2000 U.S. Census and include population,

average per capita income, number of households, household income distribution,

family type, and average upper quartile rent at the census block group level. Once

the addresses of supermarkets and population weighted centroids of census block

groups are geocoded to latitude and longitude coordinates using cartography software,

the distance between any pair of locations³ can be calculated taking the Earth’s ellipsoid surface into

² I concentrate on the location choice of supermarkets, which are defined as grocery stores with more than 100,000 dollars of weekly revenue, and are also categorized in the data as supermarkets.

³ The distance between two locations $(lon_a, lat_a)$ and $(lon_b, lat_b)$ can be found as

\[ d_{a,b} = 3956 \cdot 2 \cdot \arcsin\left(\min\left(1, \sqrt{\sin^2(dlat/2) + \cos(lat_a) \cdot \cos(lat_b) \cdot \sin^2(dlon/2)}\right)\right) \]

where $dlon = lon_a \cdot \pi/180 - lon_b \cdot \pi/180$ and $dlat = lat_a \cdot \pi/180 - lat_b \cdot \pi/180$.


account. Using the FFIEC Geocoding System, supermarkets are assigned to the

census block group they fall into.
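The distance formula in footnote 3 can be sketched in a few lines of Python. This is a minimal illustration rather than the code used for the paper: the function name `distance_miles` and the argument order are assumptions, coordinates are taken to be in decimal degrees, and the latitudes inside the cosine terms are converted to radians, which the footnote leaves implicit.

```python
import math

EARTH_RADIUS_MILES = 3956.0  # radius constant that appears in footnote 3


def distance_miles(lon_a, lat_a, lon_b, lat_b):
    """Great-circle distance in miles between two points given in decimal degrees.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # Degrees to radians; the footnote writes this conversion as x * pi / 180.
    lat_a_rad = math.radians(lat_a)
    lat_b_rad = math.radians(lat_b)
    dlat = lat_a_rad - lat_b_rad
    dlon = math.radians(lon_a) - math.radians(lon_b)

    # Term under the square root in the footnote's formula.
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat_a_rad) * math.cos(lat_b_rad) * math.sin(dlon / 2) ** 2)

    # min(1, .) guards against rounding pushing the arcsin argument above 1.
    return EARTH_RADIUS_MILES * 2 * math.asin(min(1.0, math.sqrt(h)))
```

With this constant, two centroids on the same meridian that are one degree of latitude apart are 3956 · π/180 miles apart, which is roughly 69 miles.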

In Figure 1, the population weighted centroids and boundaries of locations of a

sample market are mapped. Note that the population weighted centroids might

be far from the geographic centroids, especially for peripheral locations.

Figure 1: Centroids in Hayward

Distribution of consumer demographics over the map is obtained by matching the

U.S. Census data to the centroid coordinates of census block groups. These data

are used to describe a block group’s own characteristics, as well as the weighted

characteristics of neighboring block groups. Figure 2 displays an example of de-

mographic data assignment to the block groups.

Figure 2: Demographic Assignment at the Census Block Group Level


Furthermore, data on the total area of each block group and retailer density are

collected in order to proxy for unobserved zoning rules. The table below summa-

rizes the variables in the data set.

Summary Statistics at the Block Group Level

                                     Mean        Std. Dev.    Min      Max
Population                           1,930.23    14,248.45    401      824,562
Income per Capita                    24,860.8    13,107.12    4,748    124,114
Area of Block Group (mi²)            2.52        10.96        0.015    295.31
Density                              6,623.34    11,456.13    4.28     432,638.8
Upper quartile rent                  973.79      425.89       138      2,001
Number of Retail Establishments      13.51       24.52        0        290
Number of Supermarkets observed      0.13        0.39         0        4

I identify 53 isolated markets, with considerable variation in size, as seen in the

table below. An isolated market is a city or bundle of cities that is at least 8 miles

from any other area of settlement with a supermarket.

Market Level Variation in Size and Competitors

                                     Mean     Std. Dev.    Min    Max
Number of Block Groups               66.09    64.74        8      338
Total Number of Supermarkets         8.03     5.20         2      22

4. Discrete Location Choice Model

The strategic location choice is assumed to be simultaneous among competitors.⁴

The unit of location choice is a census block group, and the locations are indexed

by $l = 1, 2, \dots, L_m$ in each market $m$. It is assumed that firms have private

information about their own location specific profitability shocks, following Seim

(2004).

⁴ The asymmetric information assumption does not guarantee uniqueness of equilibrium in simultaneous discrete choice games. In my estimation, I do not run into uniqueness problems for the region of parameters I am searching over. The data have considerable variation, and so the equilibrium conjectures are always the same for different starting values. Seim (2004) also finds evidence for empirical uniqueness in her estimation.


Let the set of players in market $m$ be $j = 1, 2, \dots, J_m$ and the discrete action space for player $j$ be $A^m_j$. Market superscripts are suppressed since markets represent independent games. Let $\eta^j$ be the vector of private information of firm $j$ over the discrete action space. Given the actions of all players, $a = \{a_1, \dots, a_j, \dots, a_{J_m}\}$, the payoff for player $j$ is given as

\[ \pi_j(a) = \beta X_{a_j} + \alpha C_{a_j}(a_{-j}) + \eta^j_{a_j} \tag{1} \]

where $X_{a_j}$ is a vector of location specific demand and cost shifters, which includes own and some weighting of other locations’ demographic factors, incorporating the effects of distance on demand due to consumer travel costs. $C_{a_j}(a_{-j})$ is the measure of competitive intensity, defined by the number of competitors in the particular location as well as some weighting of competitors in other locations. Demand and competition factors are assumed to be additively separable. Construction of $X_{a_j}$ and $C_{a_j}(a_{-j})$ will be detailed in the next section. The private information of firm $j$, denoted $\eta^j_{a_j}$, is i.i.d. across locations and firms, and is assumed to follow a type I extreme value distribution.

Each player, in equilibrium, will choose a location that optimizes its expected payoff, given its self-confirming beliefs about other players’ actions based on the distribution of private information, which is known to all players. Thus $a(\cdot)$ is a pure strategy Bayesian Nash equilibrium if, for each player $j$,

\[ a_j(\eta^j) \in \arg\max_{a_j} \left\{ E[\pi_j(a_j)] = \int_{\eta^{-j}} \pi_j\big(a_j(\eta^j), a_{-j}(\eta^{-j}), \eta^j_{a_j}\big) \Pr(\eta^{-j}) \, d\eta^{-j} \right\} \tag{2} \]

In other words, a firm will choose action $a_j$ only if its expected payoff as a function of the specific draw of $\eta^j_{a_j}$ is thus maximized. Since the private information of other firms only affects a firm’s payoff through others’ optimal strategies, the expected profits can be expressed as $E[\pi_j(a_j, a_{-j}, \eta^j_{a_j})] = \sum_{a_{-j}} \Pr(a_{-j} \mid X, J, \beta, \alpha) \cdot \pi_j(a_j, a_{-j}, \eta^j_{a_j})$ or, passing the expectation of others’ actions through the payoff function, as $E[\pi_j(a_j, a_{-j}, \eta^j_{a_j})] = \beta X_{a_j} + \alpha E[C_{a_j}(a_{-j})] + \eta^j_{a_j}$.


Due to the distributional assumption on the private information, the probability that the expected profits at a given location are weakly greater than the expected profits at other locations simplifies to

\[ \Pr(a_j) = \frac{\exp\left(\beta X_{a_j} + \alpha E[C_{a_j}(a_{-j})]\right)}{\sum_{a'_j \in A_j} \exp\left(\beta X_{a'_j} + \alpha E[C_{a'_j}(a_{-j})]\right)} \tag{3} \]

The expectation of the number of competitors across locations is based on the beliefs of others’ optimal strategies, which means $E[C_{a'_j}(a_{-j})]$ is a function of $\Pr(a_{-j})$. Therefore the optimal response of a firm depends on its beliefs of the other firms’ location choice probabilities. Players form these beliefs the same way that the econometrician does, as in the logit formulation above. This is due to the asymmetric information assumption, which aligns the information set of the econometrician with that of the other firms. Moreover, since all firms are assumed to be symmetric up to their private information, the probability of any firm choosing a particular location is the same for all firms.⁵ Therefore the probability of each location choice is described by the logit formulation in Equation 3. This constitutes a mapping in which each equation is of the form $\Pr(a_j) = f(\Pr(a_{-j}))$, and the fixed point solution to this system of $L_m$ nonlinear equations for each market gives us the $L_m$ vector of reduced form equilibrium location probabilities $\Pr(a_j) = g(X, J_m, \beta, \alpha)$ of any firm. Then the probability of observing an outcome $a = \{a_1, \dots, a_j, \dots, a_{J_m}\}$ can be written as

\[ \Pr(a) = \prod_{j=1}^{J_m} \Pr(a_j \mid X, J_m, \beta, \alpha) \tag{4} \]

where $\Pr(a_j \mid X, J_m, \beta, \alpha)$ is the equilibrium probability of a firm choosing action $a_j$.

5. Payoff Specifications and Estimation

In the game formulated above, the firms maximize their expected payoffs as

a function of their beliefs of competitors’ strategies. In this section, I detail this

⁵ The symmetry assumption is relaxed below by allowing for discrete types of supermarkets. However, the equilibrium solution concept remains the same.


payoff function under different assumptions, and show how the payoff function and

the game structure are taken to the data. The payoff function needs to capture

the sensitivity of the demand and competition factors to distance. Moreover, the

payoff function should be flexible in order to handle different numbers of location

alternatives in different markets. To this end, one could use bands of distance

and assume that the effect of characteristics of locations within a band of

distance is homogeneous, as in Seim (2004). This approach identifies differing effects

of demographics and competitors for discrete bands of distance. Alternatively, one

could use continuous distance measures to weight the effect of characteristics of a

neighboring location on the location at hand. A flexible weighting function can be

estimated to describe the distance sensitivity at a finer level.

5.1. Discrete Bands Approach. The distance sensitivity of competition and demand effects is captured by allowing these measures to have different coefficients for different bands of distance. It is assumed that the rivals within the same band of distance to the firm exert the same competitive pressure on the firm. However, rivals in different bands of distance to the firm potentially have different effects. Similarly, the effects of demographics are modelled homogeneously within a band and allowed to differ across bands.

Bands are constructed by choosing cutoff distances. Thus bands $b = 1, 2, \dots, B$ can be described by the cutoff distance set $\{0, d_1, d_2, \dots, d_{B-1}\}$, where band $b$ around a location includes all the locations between $d_{b-1}$ and $d_b$ miles from the location at hand, with $d_0 = 0$ and the last band unbounded above. Following the structure of Equation 1, such a payoff function for firm $j$ can be written as

\[ \pi_j(a) = \sum_{b=1}^{B} \beta_b Z^b_{a_j} + \sum_{b=1}^{B} \alpha_b \sum_{l=1}^{L_m} M^b_{a_j l} h_l + \eta^j_{a_j} \tag{5} \]

where $Z^b_{a_j}$ denotes the demographic data of the locations within band $b$ of the location choice of firm $j$. $L_m$ is the total number of locations in the market, $h_l$ is the number of rivals in location $l$, and $M^b_{a_j l}$ equals 1 if location $l$ is within band $b$ of the location of choice $a_j$; in other words, $M^b_{a_j l} = 1\{d_{b-1} \le d_{a_j l} < d_b\}$. Therefore, $\sum_{l=1}^{L_m} M^b_{a_j l} h_l$ denotes the total number of competitors in the locations within band $b$ of the location choice of firm $j$. This payoff is realized after the simultaneous location choice of firms. However, firms maximize expected profits based on their beliefs about where competitors will locate, since they do not know $h_l$ before it is realized. The expected profits of firm $j$ if it chooses $a_j$ can be expressed as

\[ E[\pi_j(a_j)] = \sum_{b=1}^{B} \beta_b Z^b_{a_j} + (J_m - 1) \sum_{b=1}^{B} \alpha_b \sum_{l=1}^{L_m} M^b_{a_j l} p_l + \eta^j_{a_j} \tag{6} \]

where $p_l$ is the probability that a firm will choose location $l$, and $(J_m - 1)$ is the total number of competitors in market $m$. Thus the realized number of competitors, $h_l$, is replaced by the expected number of competitors, $(J_m - 1) p_l$. The symmetric probability of any firm choosing location $l$ is the probability that expected profits are greatest in that location. Due to the distributional assumption on the private information, this probability is

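\[ p_l = \frac{\exp\left(\sum_{b=1}^{B} \beta_b Z^b_l + (J_m - 1) \sum_{b=1}^{B} \alpha_b \sum_{k=1}^{L_m} M^b_{lk} p_k\right)}{\sum_{i=1}^{L_m} \exp\left(\sum_{b=1}^{B} \beta_b Z^b_i + (J_m - 1) \sum_{b=1}^{B} \alpha_b \sum_{k=1}^{L_m} M^b_{ik} p_k\right)} \tag{7} \]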

Given parameters $\{\beta, \alpha\}$, we can numerically find the fixed point solution to the mapping $p = f(p)$ in order to obtain the vector of reduced form equilibrium conjectures $p^*_l(X, \alpha, \beta, J_m)$ that depend only on the data and parameters. Once we solve this system of $L_m$ equations in each market, the estimation consists of maximizing the following log-likelihood

\[ LL = \sum_{m=1}^{M} \sum_{l=1}^{L_m} y_l \ln p^*_l \tag{8} \]

where $y_l$ is the observed number of entrants in a location within a market, and $p^*_l$ is the equilibrium conjecture we solved for.
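The fixed point of Equation 7 and one market's contribution to Equation 8 can be sketched as follows. This is an illustrative sketch under stated assumptions, not the routine used in the estimation: the function names are hypothetical, each band's demographic term $Z^b_l$ is collapsed to a single number, `cutoffs` holds only the interior cutoffs $d_1, \dots, d_{B-1}$, and the damped successive approximation is one possible way to solve $p = f(p)$, which the text only says is done numerically.

```python
import math


def band_of(d, cutoffs):
    """0-based band of distance d, given interior cutoffs [d_1, ..., d_{B-1}]."""
    for b, cut in enumerate(cutoffs):
        if d < cut:
            return b
    return len(cutoffs)  # last band: at or beyond the final cutoff


def choice_probs(p, Z, bands, beta, alpha, n_firms):
    """One evaluation of the mapping f(p) on the right-hand side of Equation 7."""
    n_loc = len(Z)
    index = []
    for l in range(n_loc):
        demand = sum(beta[b] * Z[l][b] for b in range(len(beta)))
        rivals = sum(alpha[bands[l][k]] * p[k] for k in range(n_loc))
        index.append(demand + (n_firms - 1) * rivals)
    top = max(index)  # subtract the maximum before exponentiating, for stability
    weights = [math.exp(v - top) for v in index]
    total = sum(weights)
    return [w / total for w in weights]


def solve_conjectures(Z, dist, cutoffs, beta, alpha, n_firms,
                      damping=0.5, tol=1e-12, max_iter=10_000):
    """Damped successive approximation of the fixed point p = f(p)."""
    n_loc = len(Z)
    bands = [[band_of(dist[l][k], cutoffs) for k in range(n_loc)]
             for l in range(n_loc)]
    p = [1.0 / n_loc] * n_loc  # start from uniform conjectures
    for _ in range(max_iter):
        f_p = choice_probs(p, Z, bands, beta, alpha, n_firms)
        if max(abs(a - b) for a, b in zip(f_p, p)) < tol:
            return f_p
        p = [(1 - damping) * a + damping * b for a, b in zip(p, f_p)]
    raise RuntimeError("fixed-point iteration did not converge")


def log_likelihood(y, p_star):
    """One market's contribution to Equation 8: sum over l of y_l * ln(p*_l)."""
    return sum(y_l * math.log(p_l) for y_l, p_l in zip(y, p_star))
```

In an estimation loop, `solve_conjectures` would be called once per market for every trial value of the parameters, and the market contributions from `log_likelihood` would be summed before being handed to the optimizer.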

This procedure necessitates identifying the block group centroids that fall in the

distance rings, or bands, highlighted with different colors in Figure 3. Note that

the membership depends on the distances between centroids of locations. The store

counts, and demographics for each band are constructed using these neighborhood

definitions.


Figure 3: Band Construction

The first band includes the location of interest, at the bull’s eye, as well as locations that are within $d_1$ of it. The third band includes all the locations that are farther than distance $d_2$ from the location of interest. One could potentially get finer definitions of distance by using more bands; however, the number of parameters to be estimated increases by the number of distance sensitive variables for each additional band.

In some cases, there might not be any locations in the second band of a particular location, although there are locations in the first and third bands. Moreover, locations with higher percentages of area outside the band may be included in the band, and others that should be included may be left out. These discretization errors may not be random, since the magnitude of the errors is positively correlated with the size of the location and the sizes of its neighboring locations, and the size of a location is correlated with population density, since it is endogenously defined by the U.S. Census Bureau. For small enough location sizes, the discretization approach generates band definitions that are close to boundary inclusion, just as uniform-sized locations would. When the location sizes are bigger, more errors are possible due to discretization.

Although we cannot observe the population distribution over the location’s area, using population weighted centroids rather than geographic centroids of locations increases the precision with which we can identify consumers’ distances to a location. Errors due to discretization are minimized when location definitions are as small as possible; therefore I use the census block groups as decision units rather than census tracts.⁶

5.2. Continuous Distance Weighting Approach. The problems related to dis-

cretization can be circumvented by directly using distance measures to weigh the

effect of a variable on profits. Otherwise similar in its assumptions, the model

below allows the weights of characteristics of locations to be flexible functions of

the locations’ distances to the location of interest. The expected payoff for firm j

can be written as

\[ E[\pi_j(a_j)] = \sum_{l=1}^{L_m} \beta(d_{a_j l}) X_l + (J_m - 1) \sum_{l=1}^{L_m} \alpha(d_{a_j l}) p_l + \eta^j_{a_j} \tag{9} \]

where $L_m$ is the total number of locations in the market, $X_l$ are the demographic variables for location $l$, $d_{a_j l}$ is the distance between location choice $a_j$ and any other location indexed by $l$, and $p_l$ is the probability that a firm will choose location $l$.

In this approach, the effect of each location’s attributes on the attractiveness

of the location of interest is weighted by its distance to that location. A flexible

polynomial for the β(d) function can be estimated within the system.

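The probability of any firm choosing location $l$ is

\[ p_l = \frac{\exp\left(\sum_{k=1}^{L_m} \beta(d_{lk}) X_k + (J_m - 1) \sum_{k=1}^{L_m} \alpha(d_{lk}) p_k\right)}{\sum_{i=1}^{L_m} \exp\left(\sum_{k=1}^{L_m} \beta(d_{ik}) X_k + (J_m - 1) \sum_{k=1}^{L_m} \alpha(d_{ik}) p_k\right)} \tag{10} \]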
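One evaluation of the right-hand side of Equation 10 might look as follows. The sketch assumes polynomial weighting functions for both $\beta(d)$ and $\alpha(d)$ (the text mentions a flexible polynomial only for $\beta(d)$), a single demographic variable per location, and hypothetical function names; it is meant to show the structure of the calculation rather than the specification that was estimated.

```python
import math


def poly(coeffs, d):
    """Evaluate c0 + c1*d + c2*d**2 + ... at distance d using Horner's rule."""
    value = 0.0
    for c in reversed(coeffs):
        value = value * d + c
    return value


def choice_probs_continuous(p, X, dist, beta_coeffs, alpha_coeffs, n_firms):
    """Right-hand side of Equation 10 for a given vector of conjectures p.

    X[k] is a single demographic variable of location k, dist[l][k] is the
    distance between locations l and k, and beta_coeffs / alpha_coeffs are the
    polynomial coefficients of the two weighting functions (an assumption).
    """
    n_loc = len(X)
    index = []
    for l in range(n_loc):
        demand = sum(poly(beta_coeffs, dist[l][k]) * X[k] for k in range(n_loc))
        rivals = sum(poly(alpha_coeffs, dist[l][k]) * p[k] for k in range(n_loc))
        index.append(demand + (n_firms - 1) * rivals)
    top = max(index)  # subtract the maximum before exponentiating, for stability
    weights = [math.exp(v - top) for v in index]
    total = sum(weights)
    return [w / total for w in weights]
```

Iterating this function on its own output, for example with the same damping scheme one would use for the discrete bands model, gives the equilibrium conjectures. Setting `alpha_coeffs` to `[0.0]` switches off the strategic term and reduces the expression to a plain multinomial logit in the distance-weighted demographics.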

The discrete bands approach provides an advantage in computation with respect to continuous distance weighting if the number of bands is relatively small. On the other hand, continuous weighting by distance is better in circumstances where discretization of bands induces biases in the estimation. Moreover, having the distance weighting function over the whole range of distances provides us with interesting insights from local changes in the function that might be lost due to averaging and the particular choice of cutoff in the discrete bands approach.

⁶ Results at the census tract level are available on request.


The following extensions address two facts that characterize the supermarket

industry. The first fact is that the firms may have some common information about

the attractiveness of a location that the econometrician may not. The second fact

is that the supermarkets are differentiated in dimensions other than the spatial one, which

can in turn affect their spatial positioning decisions.

5.3. Location Specific Unobservables. The models above assume that all the

demand and cost conditions that make a specific location preferable to supermarkets

are observed by the researcher. However, if this assumption fails,

the competition effects will be biased. Including location specific errors in

the model introduces some extra correlation of strategies across players, and it

might be interpreted as adding common information to the game or accounting for

location specific omitted variables. It might also be of interest to allow for spatial

correlation in the unobservables.

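In this light, the expected payoff function in the discrete bands model can be written as

\[ E[\pi_j(a_j)] = \sum_{b=1}^{B} \beta_b Z^b_{a_j} + (J_m - 1) \sum_{b=1}^{B} \alpha_b \sum_{l=1}^{L_m} M^b_{a_j l} p_l + \sum_{b=1}^{B} \sigma_b \sum_{l=1}^{L_m} M^b_{a_j l} \varepsilon_l + \eta^j_{a_j} \tag{11} \]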

where $p_l$ is the probability of any firm choosing location $l$ and $\varepsilon$ is drawn from $N(0, I)$. Similarly, for the distance weighting model, the expected payoff function is

\[ E[\pi_j(a_j)] = \sum_{l=1}^{L_m} \beta(d_{a_j l}) X_l + (J_m - 1) \sum_{l=1}^{L_m} \alpha(d_{a_j l}) p_l + \sum_{l=1}^{L_m} \sigma(d_{a_j l}) \varepsilon_l + \eta^j_{a_j} \tag{12} \]

By including $\sigma(d_{a_j l}) \varepsilon_l$ instead of just $\sigma \varepsilon_{a_j}$, I investigate spatial correlation among

the location specific unobservables.

Introducing common location unobservables requires solving for the equilibrium conjectures for a set of simulated $\varepsilon$ in order to integrate out the distribution of $\varepsilon$ in the Maximum Likelihood estimation. This approach assumes that $\varepsilon$ is uncorrelated with the explanatory variables. For each simulation $r = 1, 2, \dots, R$ of the unobservable vector over all locations, the probability of location choice of firm $j$ is

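\[ p^r_l = \frac{\exp\left(\sum_{b=1}^{B} \beta_b Z^b_l + (J_m - 1) \sum_{b=1}^{B} \alpha_b \sum_{k=1}^{L_m} M^b_{lk} p^r_k + \sum_{b=1}^{B} \sigma_b \sum_{k=1}^{L_m} M^b_{lk} \varepsilon^r_k\right)}{\sum_{i=1}^{L_m} \exp\left(\sum_{b=1}^{B} \beta_b Z^b_i + (J_m - 1) \sum_{b=1}^{B} \alpha_b \sum_{k=1}^{L_m} M^b_{ik} p^r_k + \sum_{b=1}^{B} \sigma_b \sum_{k=1}^{L_m} M^b_{ik} \varepsilon^r_k\right)} \tag{13} \]

for the discrete bands approach, and

\[ p^r_l = \frac{\exp\left(\sum_{k=1}^{L_m} \beta(d_{lk}) X_k + (J_m - 1) \sum_{k=1}^{L_m} \alpha(d_{lk}) p^r_k + \sum_{k=1}^{L_m} \sigma(d_{lk}) \varepsilon^r_k\right)}{\sum_{i=1}^{L_m} \exp\left(\sum_{k=1}^{L_m} \beta(d_{ik}) X_k + (J_m - 1) \sum_{k=1}^{L_m} \alpha(d_{ik}) p^r_k + \sum_{k=1}^{L_m} \sigma(d_{ik}) \varepsilon^r_k\right)} \tag{14} \]

for the distance weighting approach.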

This system of equations for all firms is solved numerically for each simulation $r$. Then these $R$ sets of conjectures are averaged, thus integrating out the distribution of $\varepsilon$ for a given set of parameters $(\beta, \alpha, \sigma)$. The log-likelihood taken to estimation is

\[ LL = \sum_{m=1}^{M} \sum_{l=1}^{L_m} y_l \ln\left( \int p^*_l(\varepsilon) f(\varepsilon) \, d\varepsilon \right) \tag{15} \]
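The simulation step behind Equation 15 can be sketched for a single market as below. Several things here are assumptions made for illustration: the non-strategic part of the payoff index is passed in precomputed as `base`, the competition and unobservable weights are passed as matrices `alpha_w[l][k]` and `sigma_w[l][k]` (which would hold $\alpha_b$ and $\sigma_b$ for the band containing the pair, or $\alpha(d_{lk})$ and $\sigma(d_{lk})$ in the distance weighting model), the function names are hypothetical, and damped iteration stands in for whatever solver was actually used.

```python
import math
import random


def softmax(index):
    """Logit choice probabilities for a vector of payoff indices."""
    top = max(index)
    weights = [math.exp(v - top) for v in index]
    total = sum(weights)
    return [w / total for w in weights]


def solve_conjectures(base, alpha_w, n_firms, damping=0.5, tol=1e-12, max_iter=10_000):
    """Fixed point of p_l = softmax_l(base_l + (J - 1) * sum_k alpha_w[l][k] * p_k)."""
    n_loc = len(base)
    p = [1.0 / n_loc] * n_loc
    for _ in range(max_iter):
        f_p = softmax([base[l] + (n_firms - 1)
                       * sum(alpha_w[l][k] * p[k] for k in range(n_loc))
                       for l in range(n_loc)])
        if max(abs(a - b) for a, b in zip(f_p, p)) < tol:
            return f_p
        p = [(1 - damping) * a + damping * b for a, b in zip(p, f_p)]
    raise RuntimeError("fixed-point iteration did not converge")


def simulated_probs(base, alpha_w, sigma_w, n_firms, n_draws, seed=0):
    """Average the equilibrium conjectures over n_draws vectors eps ~ N(0, I)."""
    rng = random.Random(seed)
    n_loc = len(base)
    avg = [0.0] * n_loc
    for _ in range(n_draws):
        eps = [rng.gauss(0.0, 1.0) for _ in range(n_loc)]
        # Shift each location's index by its weighted sum of common shocks.
        shifted = [base[l] + sum(sigma_w[l][k] * eps[k] for k in range(n_loc))
                   for l in range(n_loc)]
        p_r = solve_conjectures(shifted, alpha_w, n_firms)
        avg = [a + p / n_draws for a, p in zip(avg, p_r)]
    return avg


def simulated_log_likelihood(y, base, alpha_w, sigma_w, n_firms, n_draws, seed=0):
    """One market's contribution to Equation 15, with the integral simulated."""
    p_bar = simulated_probs(base, alpha_w, sigma_w, n_firms, n_draws, seed)
    return sum(y_l * math.log(p_l) for y_l, p_l in zip(y, p_bar))
```

Fixing the seed keeps the same draws of $\varepsilon$ across trial parameter values, a common choice in simulated likelihood work so that the objective does not jump between evaluations. Each draw is solved independently of the others, so this loop over draws is the natural place to parallelize.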

Finding the fixed-point solution to the set of equations for the equilibrium conjectures can be time consuming. Estimations that have a large number of independent calculations within a minimization routine are very suitable for parallelization. In order to gain orders of magnitude in speed, I parallelized the algorithm in C and ran it on the Datastar at the San Diego Supercomputer Center. The algorithm is generally parallelized over the simulations of $\varepsilon$, and in cases where I am interested in models without the unobservable, it is parallelized over the markets. The costs of communication outweigh the benefits of double parallelization (over both $R$ and $M$). The minimization routine is a downhill simplex method.

5.4. Asymmetric Types. In the supermarket industry, geographic positioning segments consumers according to their proximity and travel costs. Preexisting brand level differences at the time of location choice further segment them according to their tradeoff between distance and quality and price concerns. This may result in relative geographic targeting of consumer segments by different types of firms. Moreover, if preexisting differences play a role in differentiation, competition across firms that are more differentiated will be softer.

In order to investigate these effects, we can allow for discrete types of firms, which results in asymmetries in location probabilities across different types. When predetermined systematic differences across players are denoted as types $t = 1, 2, \dots, T$, the discrete bands expected payoff function of firm $j$ of type $t$ can be written as

\[ E[\pi^t_j(a_j)] = \sum_{b=1}^{B} \beta^t_b Z^b_{a_j} + \sum_{s=1}^{T} (J^m_s - I_{t=s}) \sum_{b=1}^{B} \alpha^{ts}_b \sum_{l=1}^{L_m} M^b_{a_j l} p^s_l + \eta^j_{a_j} \tag{16} \]

where $I_{t=s}$ equals one if $t = s$, and $J^m_s$ is the number of firms of type $s$ in the market. The probability of a firm of type $s$ choosing location $l$ is denoted by $p^s_l$. The coefficient $\alpha^{ts}_b$ captures the effect of competitors of type $s$ within band $b$ on the profits of a type $t$ firm. Similarly, for the distance weighting model, the expected payoff function is

\[ E[\pi^t_j(a_j)] = \sum_{l=1}^{L_m} \beta^t(d_{a_j l}) X_l + \sum_{s=1}^{T} (J^m_s - I_{t=s}) \sum_{l=1}^{L_m} \alpha^{ts}(d_{a_j l}) p^s_l + \eta^j_{a_j} \tag{17} \]

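The probability of a firm of type $t$ choosing location $l$ can be expressed as

\[ p^t_l = \frac{\exp\left(\sum_{b=1}^{B} \beta^t_b Z^b_l + \sum_{s=1}^{T} (J^m_s - I_{t=s}) \sum_{b=1}^{B} \alpha^{ts}_b \sum_{k=1}^{L_m} M^b_{lk} p^s_k\right)}{\sum_{i=1}^{L_m} \exp\left(\sum_{b=1}^{B} \beta^t_b Z^b_i + \sum_{s=1}^{T} (J^m_s - I_{t=s}) \sum_{b=1}^{B} \alpha^{ts}_b \sum_{k=1}^{L_m} M^b_{ik} p^s_k\right)} \tag{18} \]

or

\[ p^t_l = \frac{\exp\left(\sum_{k=1}^{L_m} \beta^t(d_{lk}) X_k + \sum_{s=1}^{T} (J^m_s - I_{t=s}) \sum_{k=1}^{L_m} \alpha^{ts}(d_{lk}) p^s_k\right)}{\sum_{i=1}^{L_m} \exp\left(\sum_{k=1}^{L_m} \beta^t(d_{ik}) X_k + \sum_{s=1}^{T} (J^m_s - I_{t=s}) \sum_{k=1}^{L_m} \alpha^{ts}(d_{ik}) p^s_k\right)} \tag{19} \]

The equilibrium conjectures $p^{t*}$ are found by solving $L_m \times T$ equations.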

The log-likelihood taken to the estimation is

\[ LL = \sum_{m=1}^{M} \sum_{l=1}^{L_m} \sum_{t=1}^{T} y^t_l \ln p^{t*}_l \tag{20} \]

where $y^t_l$ is the observed number of entrants of type $t$ in location $l$ within a market.
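With types, the fixed point involves one probability vector per type. The sketch below shows the structure of that $L_m \times T$ system for one market under the same simplifying assumptions as before: the type-specific demand index is precomputed as `base[t][l]`, the competition weights are supplied as `alpha_w[t][s][l][k]` (standing in for $\alpha^{ts}_b$ or $\alpha^{ts}(d_{lk})$), the function names are hypothetical, and damped iteration is one possible solver rather than the one used in the paper.

```python
import math


def softmax(index):
    """Logit choice probabilities for a vector of payoff indices."""
    top = max(index)
    weights = [math.exp(v - top) for v in index]
    total = sum(weights)
    return [w / total for w in weights]


def solve_type_conjectures(base, alpha_w, n_by_type,
                           damping=0.5, tol=1e-12, max_iter=10_000):
    """Solve for p[t][l], the probability that a type-t firm chooses location l.

    base[t][l]          : demand index of type t at location l
    alpha_w[t][s][l][k] : effect on a type-t firm at l of a type-s rival at k
    n_by_type[s]        : number of firms of type s in the market
    """
    n_types, n_loc = len(base), len(base[0])
    p = [[1.0 / n_loc] * n_loc for _ in range(n_types)]
    for _ in range(max_iter):
        f_p = []
        for t in range(n_types):
            index = []
            for l in range(n_loc):
                rivals = 0.0
                for s in range(n_types):
                    # A firm does not compete with itself: subtract one for own type.
                    n_rivals = n_by_type[s] - (1 if s == t else 0)
                    rivals += n_rivals * sum(alpha_w[t][s][l][k] * p[s][k]
                                             for k in range(n_loc))
                index.append(base[t][l] + rivals)
            f_p.append(softmax(index))
        gap = max(abs(a - b) for row_f, row_p in zip(f_p, p)
                  for a, b in zip(row_f, row_p))
        if gap < tol:
            return f_p
        p = [[(1 - damping) * a + damping * b for a, b in zip(row_p, row_f)]
             for row_p, row_f in zip(p, f_p)]
    raise RuntimeError("fixed-point iteration did not converge")


def log_likelihood(y, p_star):
    """One market's contribution to Equation 20: sum over t, l of y[t][l] * ln p*[t][l]."""
    return sum(y[t][l] * math.log(p_star[t][l])
               for t in range(len(y)) for l in range(len(y[t])))
```

A quick consistency check on such a routine is to give two types identical demand indices and identical competition weights: the solved probability vectors of the two types should then coincide.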

6. Results

In the estimation, the demand shifters such as population density and average income are assumed to be distance sensitive.⁷ On the other hand, upper quartile rent and retailer density of a particular location only affect the profits for that location.⁸ I operationalize the discrete bands approach with 3 bands, which are defined by cutoffs at 0.5 and 2 miles.⁹

Table 1 shows the estimates from the discrete bands model. The parameters of

Equation 7 are reported in the first column, and the second column reports the

results of Equation 13, which introduces common location unobservables. I first

concentrate on the first column. The effect of population density on profits is

positive and decreases with distance to the store location. In fact, there is no

significant effect of population density from the 3rd band, which includes locations

farther than 2 miles. The fact that the effect decreases with distance reflects

consumer travel costs. The effect of income per capita in the immediate band is

negative; in the second and third bands, however, it is positive and decreasing with

distance. The effect of upper quartile rent in the location of choice is negative,

but insignificant. A location with very few or no retail establishments is much less

likely to be chosen, possibly reflecting unobserved zoning rules. 10

As expected, the effects of competitors are negative and decreasing in magni-

tude with distance. In the second specification which introduces location specific

unobservables to the estimation, the effects of competitors get more pronounced

in every band. This is due to the fact that the model can attribute clusters or

non-existence of supermarkets to unobservables. For example, take a case where

7 I do not find significant effects of education, family size, or travel time to work.

8 I have also used restaurant density and business density instead of retailer density and got very similar coefficients.

9 Results for distance cutoffs (1, 2.5), (0.5, 2) and (0.5, 1.5, 2.5) are available upon request. Results are not very sensitive to cutoff choice.

10 Results using density of large retailers (comparable to supermarkets) are very similar.


there is a positive unobservable for a given location, say an intersection of major

avenues with a parking facility nearby. When the model is forced to attribute

a higher number of supermarkets in this location to observed demographics only,

this will dilute the effect of demographics and the competitive effect will be un-

derestimated. The difference in results provides evidence that allowing for some

common information of players is important. The unobservables show a significant

variance and positive spatial correlation. I do not include retail density as an ex-

planatory variable, since the location specific unobservables capture zoning rules

and more. The effect of rent on profits becomes significant and more negative in

this specification. 11

Table 2 displays the estimates of the continuous distance weighting approach.

The first column reports the results of the model in Equation 10 and the second

column reports the results of the model with location specific unobservables, as

in Equation 14. For each distance sensitive parameter, I use $\theta(d) = \theta_1/d + \theta_2/d^2$ or

$\theta(d) = \theta_1/d + \theta_2/d^2 + \theta_3/d^3$ to approximate the effect of distance on the weighting of

the variable. This functional form is intuitive and flexible. This approach avoids

discrepancies due to ad hoc discretization of neighborhoods and provides a more

detailed picture of distance varying effects.
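
As a rough illustration of how such a weighting function behaves, the sketch below evaluates the inverse-polynomial form at a few distances using the column (1) point estimates reported in Table 2. The helper name and the chosen distances are hypothetical, the form is read as θ(d) = θ1/d + θ2/d² (+ θ3/d³), and it is only defined for d > 0.

```python
import numpy as np

def weight(d, thetas):
    """Inverse-polynomial weight: thetas[0]/d + thetas[1]/d**2 + ... for d > 0."""
    d = np.asarray(d, dtype=float)
    return sum(th / d ** (n + 1) for n, th in enumerate(thetas))

# Column (1) point estimates from Table 2
population = (0.1962, -0.0121)
income = (0.1420, -0.0169)
competition = (-0.5213, 0.0850, -0.0063)

d = np.array([0.1, 0.5, 1.0, 2.0])  # distances in miles, chosen for illustration
for name, thetas in [("population", population),
                     ("income", income),
                     ("competition", competition)]:
    print(name, weight(d, thetas).round(4))
```

With these estimates the income weight is negative at 0.1 miles and positive from 0.5 miles outward, which matches the sign pattern described for the discrete bands model.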

The results are in the same direction as the discrete bands model results. In

Display 1, estimated weighting functions in two specifications are graphed as a

function of distance for demand and competition factors. As before, the effect

of a competitor is negative and decreasing with distance. Allowing for location

specific unobservables increases this effect in magnitude. The effect of population

density is positive and decreasing with distance. The effects of income per capita

are negative for locations that are very close, and positive and decreasing with

distance for the rest. The effect of rent is negative as expected.

The results of the model with asymmetric types of players are presented in Table

3. The first column of Table 3 shows the results of the model presented in Equation

11 To the degree that explanatory variables are correlated with the unobservable factors, the parameters on the explanatory variables will reflect the effect of this correlation with the unobservable factors, not just the effect of population, for example. The estimates need to be viewed in this light.


17. Supermarkets are divided into two types. Type A supermarkets are brand

names such as Whole Foods, Andronico’s, Nob Hill, Trader Joe’s and specialty

supermarkets; Type B supermarkets are Safeway/Vons, Albertsons, Smart and

Final, Grocery Outlet, Food4Less and other nonbranded stores. For estimation

purposes, it is assumed that the competitive pressure of a type t supermarket

on a type s supermarket is the same as that of a type s on a type t; in other

words, αts = αst. It is also assumed that supermarkets of the same type exert

the same competitive pressure on each other, whichever type that is, so αtt =

αss. This method differentiates between within-type and across-type competition

effects. Moreover, supermarkets are allowed to differ in their sensitivities to income

per capita of their consumers. The specification presented in the second column

introduces the variance of income in a location as an explanatory variable that

each type is allowed to respond differently to. This aims to capture the effect of

distribution of income, rather than just the mean of income. Consider two locations

with the same mean income per capita. Let one have a unimodal distribution

of income per capita, with a small variance around the mean. Let the other

have a bimodal distribution, reflecting a mix of poor and rich consumers. The

second location might attract two entrants of different types, whereas the first

location may attract none, or one. In this scenario, if the distribution of income is

omitted, then the competitive effect between supermarkets of different types might

be underestimated.

The results for the two specifications are graphed in Display 2. The effect of

population density is positive and decreasing with distance as before. The income

per capita sensitivities of the two types are quite different. Type B firms prefer low

income locations, and the effect of income decreases sharply with distance. For

Type A firms the effect of income is positive and generally decreases with distance.

The results of the first specification show a considerable difference in within-type

competition intensities and across-type intensities. Competition between firms of

the same type is much fiercer for all distances, although the effect decreases sharply

within the first mile of proximity. The competition between firms of different

types is softer and decreases slowly with distance. This competition effect must


be interpreted as the result of consumer choice among supermarkets, as a function

of the substitutability or complementarity of competitors. The estimates from the

second specification are similar, with the competition across types being fiercer.

The comparison can be seen more clearly in the last graph of Display 2. This is in

line with the earlier intuition that distribution of income might be a determinant

of location choice of different types.
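
The two symmetry restrictions (αts = αst and αtt = αss) mean that, at any distance, the full matrix of competition effects is pinned down by one within-type and one across-type function. A minimal sketch of that structure follows, using the column (1) point estimates from Table 3. The helper names are hypothetical, and the inverse-polynomial form is assumed from the 1/d and 1/d² labels in the table.

```python
import numpy as np

# Column (1) point estimates from Table 3: coefficients on 1/d and 1/d^2
WITHIN = (-0.5982, 0.0103)
ACROSS = (-0.2209, 0.0152)

def inv_poly(d, thetas):
    """thetas[0]/d + thetas[1]/d**2 for a scalar distance d > 0."""
    return sum(th / d ** (n + 1) for n, th in enumerate(thetas))

def alpha_matrix(d, n_types=2):
    """alpha[t, s] at distance d under alpha_ts = alpha_st and alpha_tt = alpha_ss."""
    a = np.full((n_types, n_types), inv_poly(d, ACROSS))
    np.fill_diagonal(a, inv_poly(d, WITHIN))
    return a

for d in (0.5, 1.0, 2.0):
    print(d)
    print(alpha_matrix(d).round(4))
```

At each of these distances the diagonal (within-type) entries are more negative than the off-diagonal (across-type) entries, in line with the comparison drawn in the text.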

7. Discussion

This paper investigates the geographic differentiation of supermarkets, taking

into account other differentiating factors that are pre-existing at the time of the

location choice. The contribution of the results is to provide evidence that such pre-

existing differences matter in spatial differentiation decisions. The supermarkets

are categorized into types outside the model. There may be alternative ways of

categorizing supermarkets, such as with respect to size and variety or type of

pricing scheme that this paper does not yet explore. To this end, the results that

point to the importance of such pre-existing differences might be understated.

The competition effect is modeled as the effect of an additional competitor on the

firm’s profits. By construction, this effect will include any possible spill-over effects.

The supermarket industry, unlike other retail industries such as restaurants, is not

expected to exhibit large positive clustering externalities. Moreover,

allowing for location specific unobservables in the model prevents misattribution

of unobservables to competition intensity. However, results should be interpreted

keeping the model in mind. For example, when we allow for asymmetric competi-

tion intensities within and across different types of supermarkets, we find that the

within type competition is tougher than across type competition. One can inter-

pret these results as pre-existing differences creating asymmetric substitutability

between different types of supermarkets for the consumer. These results, to some

extent, may also be due to the complementarity of supermarkets of different types,

which may induce positive externalities of locating together.


The payoff function is designed to capture the tradeoff between being close to

favorable demand conditions and being far from competitors. The separably ad-

ditive formulation of these effects is prominent in the literature. The competitive

effect is a linear, additive function of the expected number of competitors. This

formulation assumes that the effect of an additional firm is constant, regardless of

the number of existing firms. The fixed point solution to the system of conjectures

is greatly aided by this assumption. If one were to allow for a more flexible payoff

function, the estimation would not be feasible under these assumptions. Two studies

deal with this problem. Einav (2003) models a flexible payoff function under a

sequential discrete choice assumption. The game structure does not require solving

for a fixed point; the game is solved by backward induction. He estimates one

parameter of this flexible payoff function, where other parameters are substituted

in. I do not have the order of entry information in my data set. Then, in order to

estimate a sequential choice model, I would have to randomize over the order of

choice with respect to some parametric distribution. However, with a more com-

plicated payoff function, where I am interested in estimating all the parameters

from the location choice data, I do not find it feasible to randomize over order of

choice and to estimate the parameters of the model. The second study that is able

to estimate a more flexible payoff function is Gowrisankaran and Krainer (2005).

Their paper uses the same simultaneous discrete choice game setup but, instead

of solving for the equilibrium beliefs, substitutes the actual actions of competitors in

the market for beliefs about competitors’ location choices.

8. Conclusion

I estimate a simultaneous location choice game of asymmetric information in

order to gain insight on the tradeoff between demand and competition factors in

the supermarket industry, and to recover sensitivities of these factors to distance

due to consumer travel costs. In the model, firms face uncertainty regarding the

number of competitors they will face, since each firm possesses private information

on its profitability across locations. Each firm maximizes its expected profits

but may face ex-post regret due to the risk in payoffs caused by uncertainty in


the optimal strategies of competitors, which is a well suited description of decision

making in the real world.

The model is applied to the supermarket industry where location choice is a

major differentiating factor, and other pre-existing differentiating factors can be

controlled for. The findings indicate that competition intensity decreases signifi-

cantly with distance, which gives firms the incentive to avoid locating in proximity.

On the other hand, the demand effects also weaken with distance, giving firms a

counter incentive to locate close to favorable demand conditions. Empirical re-

sults also show that these two counteracting incentives are traded off differently

by different types of supermarkets. For a fixed distance, the competition intensity

is greater between firms of the same type than between firms of different types.

Also, results indicate that firms of different types are targeting different segments

of consumers. This result highlights the importance of considering all pre-existing

differentiating dimensions when modelling geographic position choices of firms.

Furthermore, this paper illustrates how controlling for unobservable location spe-

cific factors can explain clustering or thinning of supermarkets which otherwise

would be misattributed to competition intensity by the model.

This research is a step towards incorporating positioning decisions of firms in

marketing strategy models. This study shows that positioning decisions of firms

can convey valuable information on the demand and competition conditions in

the market. Combining demand and supply models with product location choice

models is a natural direction for future research.


References

[1] Steve Berry. Estimation of a model of entry in the airline industry. Econometrica, 60:889–917, February 1992.

[2] Timothy Bresnahan and Peter Reiss. Entry in monopoly markets. The Review of Economic Studies, 57(4):531–553, October 1990.

[3] Timothy Bresnahan and Peter Reiss. Empirical models of discrete games. Journal of Econometrics, 48:57–81, 1991.

[4] Liran Einav. Not all rivals look alike: Estimating an equilibrium model of the release date timing game. Working Paper, Stanford University, June 2003.

[5] Gautam Gowrisankaran and John Krainer. The welfare consequences of ATM surcharges: Evidence from a structural entry model. Working Paper, Olin School of Business, Washington University, November 2004.

[6] Michael Mazzeo. Product choice and oligopoly market structure. RAND Journal of Economics, 3(2):1–22, Summer 2002.

[7] Katja Seim. An empirical model of entry with endogenous product-type choices. Working Paper, Graduate School of Business, Stanford University, February 2004.

[8] Randal Watson. Product variety and competition in the retail market for eyeglasses. Working Paper, University of Texas, Austin, March 2004.


Table 1: Results of Discrete Bands Approach

| Variable | (1) | (2) |
|---|---|---|
| Population density, band 1 | 0.5214 (0.1594)* | 0.9731 (0.2517)* |
| Population density, band 2 | 0.2297 (0.0952)* | 0.5672 (0.1288)* |
| Population density, band 3 | 0.103 (0.094) | 0.0962 (0.208) |
| Income per Capita, band 1 | -0.1239 (0.0439)* | -0.0811 (0.034)* |
| Income per Capita, band 2 | 0.2371 (0.0992)* | 0.2422 (0.1041)* |
| Income per Capita, band 3 | 0.0459 (0.0208)* | 0.0069 (0.0047) |
| Upper quartile rent | -0.0735 (0.0523) | -0.0868 (0.0363)* |
| Retail density | 0.473 (0.166)* | |
| Competition, band 1 | -2.4675 (0.2612)* | -3.532 (0.2238)* |
| Competition, band 2 | -0.1245 (0.058)* | -0.439 (0.1318)* |
| Competition, band 3 | -0.1883 (0.0916)* | -0.2489 (0.054)* |
| σ1 | | 1.762 (0.7551)* |
| σ2 | | 1.183 (0.4358)* |
| σ3 | | 1.107 (0.314)* |
| LLF | 1744.56 | 1738.19 |


Table 2: Results of Distance Weighting Approach

| Variable | (1) | (2) |
|---|---|---|
| Population density, 1/d | 0.1962 (0.0793)* | 0.2683 (0.0581)* |
| Population density, 1/d^2 | -0.0121 (0.0051)* | -0.0109 (0.0031)* |
| Income per Capita, 1/d | 0.1420 (0.0670)* | 0.1120 (0.0440)* |
| Income per Capita, 1/d^2 | -0.0169 (0.0080)* | -0.0138 (0.0052)* |
| Upper quartile rent | -0.0580 (0.0276)* | -0.0910 (0.0494)* |
| Retail density | 0.336 (0.1074)* | |
| Competition, 1/d | -0.5213 (0.1480)* | 0.6715 (0.1508)* |
| Competition, 1/d^2 | 0.0850 (0.0196)* | 0.0442 (0.0099)* |
| Competition, 1/d^3 | -0.0063 (0.0015)* | -0.0021 (0.0005)* |
| σ1, 1/d | | 0.2894 (0.1181)* |
| σ2, 1/d^2 | | -0.0197 (0.0059)* |
| LLF | 1749.77 | 1740.16 |


Display 1

[Figure panels: Graph of Competition Coefficient; Graph of Population Density Coefficient; Graph of Income per Capita Coefficient]


Table 3: Results of Distance Weighting Approach with Asymmetric Types

| Variable | (1) | (2) |
|---|---|---|
| Population density, 1/d | 0.228 (0.0417)* | 0.1813 (0.0705)* |
| Population density, 1/d^2 | -0.0107 (0.0044)* | -0.0096 (0.0038)* |
| Upper quartile rent | -0.0720 (0.0854) | -0.0933 (0.0468)* |
| Retail Density | 0.6201 (0.173)* | 0.6953 (0.323)* |
| Income per Capita - type A, 1/d | 0.1993 (0.0978) | 0.2634 (0.1085)* |
| Income per Capita - type A, 1/d^2 | -0.0147 (0.006)* | -0.0241 (0.011)* |
| Income per Capita - type B, 1/d | 0.0225 (0.0263) | 0.0218 (0.0158) |
| Income per Capita - type B, 1/d^2 | -0.0073 (0.0055) | -0.0034 (0.0021) |
| Within-type Competition, 1/d | -0.5982 (0.2550)* | -0.6247 (0.2193)* |
| Within-type Competition, 1/d^2 | 0.0103 (0.0042)* | 0.0235 (0.0106)* |
| Across-type Competition, 1/d | -0.2209 (0.0774)* | -0.3527 (0.1255)* |
| Across-type Competition, 1/d^2 | 0.0152 (0.0067)* | 0.0168 (0.0074)* |
| Variance of Income - type A | | 0.3863 (0.2886) |
| Variance of Income - type B | | 0.6419 (0.2745)* |
| LLF | 1749.32 | 1741.81 |


Display 2

[Figure panels: Population Density Coefficient; Income Coefficients; Competition Coefficients from (1); Competition Coefficients from (2); Across type competition comparison]
