7/28/2019 Internet vs TV 2
Internet vs. TV Advertising: A Brand-Building
Comparison
Michaela Draganska
The Wharton School
Wesley R. Hartmann
Stanford GSB
Gena Stanglein
Google
Abstract
A key issue for media planners determining the share of their advertising budgets to spend on Internet advertising is whether Internet advertising can build brands as effectively as television advertising. To address this question, we extend traditional brand-message recall measurement to facilitate comparisons between Internet formats and television. Specifically, we supplement brand-message surveys conducted during the campaign with a set of pre-campaign surveys to control for pre-existing brand knowledge, and use a matching procedure to ensure the pre-campaign sample is comparable to the in-flight one.
For our analysis, we use a rich data set comprising 20 campaigns across multiple industries ranging from consumer packaged goods to telecommunications. We find substantial cross-brand variation in pre-existing knowledge as well as variation across advertising formats. In particular, individuals exposed to Internet display ads have significantly lower levels of pre-existing brand knowledge than television viewers. Such differences in initial conditions suggest biases in comparisons between Internet and television ads, and possibly a more general failure of the brands to establish lasting associations among individuals shifting media consumption from TV to the Internet. Incorporating these pre-existing differences between media formats results in brand-lift measures for Internet ads that are statistically indistinguishable from comparable television lift measures.
Keywords: advertising, display, television, Internet.
The authors would like to thank Oscar Mitnik and Amogh Vasekar for valuable assistance, as well as Rawley Cooper, Brent Davis and Scott McKinley at Nielsen for their help in executing the study. Draganska and Hartmann served as consultants during the survey design and administration phases of the project.
Email: [email protected], [email protected], [email protected]
1 Introduction
Over the past decade advertising expenditures have shifted from traditional media to the
Internet. In 2011, online advertising in the United States alone reached $32 billion and is
projected to reach $62 billion by 2016 (eMarketer, February 2012 report). Internet portals
determined to use their inventory to substitute for traditional advertising formats have turned
to quantitative metrics to illustrate the advantages of online advertising. They are armed
with the ability to readily observe behavioral responses on the web, such as click-through
rates, and to conduct large-scale online experiments to provide the most accurate measure of
the effectiveness of the ads in driving consumer purchasing decisions (Lewis & Reiley 2011,
Goldfarb & Tucker 2011).
Nevertheless, many advertisers still hesitate to shift spending from television campaigns
to the Internet, pointing to the established role of TV advertising in building brands. The
solid experimental evidence quantifying the behavioral response to Internet advertising does
not seem to be a sufficient reason, because no direct comparison to the effectiveness of TV
as a brand-building medium is available. In general, TV experiments are costly and thus not
scalable for widespread application to allow for a comparative study of the effectiveness of
online and offline campaigns. Older experimental studies on TV advertising, most notably
by Lodish, Abraham, Kalmenson, Livelsberger, Lubetkin, Richardson & Stevens (1995a) and
Lodish, Abraham, Kalmenson, Livelsberger, Lubetkin, Richardson & Stevens (1995b), do not
have Internet data. Moreover, given the typical sample sizes of such experiments, attempts
to detect a significant effect of advertising on sales frequently fail for lack of statistical power (Lewis & Reiley 2011), so these studies offer little guidance here.
For that reason and because of the perception that TV advertising is the main medium
for brand building, the metrics typically used to assess its effects are brand awareness and
preference. The rationale is that, although direct links to eventual purchase are sometimes
possible, brand advertising on television is primarily aimed at influencing the mindset of a
customer who may purchase anytime within a reasonably long horizon (Assmus, Farley &
Lehmann 1984). By contrast, the effect of online advertisements has been measured mostly
on outcomes such as click-through rates and generated sales.
A few recent studies have questioned the emphasis on sales measures and have pointed to
the brand-building potential of the Internet (Briggs & Hollis 1997, Dreze & Hussherr 2003).
In one of the earliest studies of online advertising, Briggs & Hollis (1997) show that banner
ads can also have an effect on brand awareness and image, even in the absence of a behavioral
response such as a click-through. Using eye-tracking devices in conjunction with a large-scale
survey, Dreze & Hussherr (2003) find that consumers avoid looking at banners, but there is
still an effect on brand recall measures, suggesting a pre-attentive level of processing. This
research implies that attitudinal measures may be more appropriate not just for assessing
the effectiveness of TV commercials but that of online advertising as well.
To date, however, no field data have been available to enable media planners to conduct
an apples-to-apples comparison of the advertising effectiveness of online and offline media in
terms of creating brand awareness and establishing brand associations. This paper seeks to
fill this gap and measure Internet advertising performance, specifically the performance of various non-search advertising formats, according to the metrics advertisers have historically
relied on for their television campaigns.1
We have a unique data set of 20 advertising campaigns spanning a wide variety of product
categories and industries. In addition to TV commercials, we have data for Internet banner,
rich media and video ads. The advertising campaigns use the online and TV advertising
formats concurrently, and the effect of the commercials is assessed using the same brand-
recall measure (ability of respondent to correctly link creative to brand) for all advertising
formats, thus providing the data for a valid comparison of the effectiveness of the different
media.
1At the start of 2011, Google, The ARF, Nielsen, Stanford, and Wharton collaborated on an initiative to enhance the media planning and buying process. The goal was to quantify cross-media ad-format effectiveness and derive the relative impact of ad formats. The first phase of this project was a pilot measuring the brand cut-through (i.e., ability of consumers to correctly link a brand to a creative) of ad formats across ad campaigns.
The performance of an ad campaign is gauged relative to a baseline and is referred to
as lift. One could posit, as is common in the industry, that absent advertising consumers
would randomly associate brands with commercials. However, especially for mature brands,
assuming consumers do not have any pre-existing brand knowledge due to exposure to past
advertising, word of mouth, or other experiences with the brand is naive. In addition, for us
to compare advertising effects for a given campaign across formats, potential customers who
extensively use the Internet and those who predominantly watch TV need to have the same
level of pre-existing familiarity with the brand. If the existing stock of past advertising differs
by media behavior and is thus dependent on the type of ad format, to which an individual is
likely to be exposed, using a constant baseline across formats would no longer yield a valid
comparison.
To account for such potential disparities in the pre-existing familiarity with the brand
across media, we have modified the traditional television recall methodology to include a pre-
campaign survey to obtain the initial conditions of the advertising stocks for consumers
with different media-consumption habits. We avoid a testing bias by employing a repeated cross-section design rather than a true panel; that is, we measure pre-campaign brand recall
and in-flight (during the campaign) brand recall for separate sets of consumers.
We ensure the comparability of the pre-campaign and in-flight survey groups by employ-
ing a nearest-neighbor matching procedure (Abadie & Imbens 2012). This technique allows
us to select only those individuals from the pre-campaign sample who exhibit media con-
sumption behavior similar to that of the individuals surveyed during the campaign. Having
this pre-campaign measure for an equivalent group gives us a much more accurate baseline
to establish the lift of a campaign relative to assuming random guessing as is typically done
in the industry.
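The matching step can be sketched as follows. This is a minimal illustration of one-to-one nearest-neighbor matching on media-consumption covariates, not the full Abadie & Imbens estimator; the covariates (weekly TV hours, weekly Internet hours) and all numbers are hypothetical.

```python
import math

def nearest_neighbor_match(inflight, pre):
    """For each in-flight respondent, return the index of the pre-campaign
    respondent whose media-consumption covariates are closest in
    Euclidean distance."""
    matches = []
    for x in inflight:
        dists = [math.dist(x, p) for p in pre]
        matches.append(dists.index(min(dists)))
    return matches

# Hypothetical covariates: (weekly TV hours, weekly Internet hours)
inflight = [(20.0, 5.0), (2.0, 30.0)]
pre = [(19.0, 6.0), (3.0, 28.0), (40.0, 1.0)]
idx = nearest_neighbor_match(inflight, pre)  # [0, 1]

# Baseline recall computed only over the matched pre-campaign respondents
pre_recall = [1, 0, 1]  # 1 = correct brand-message link
baseline = sum(pre_recall[i] for i in idx) / len(idx)  # 0.5
```

In the paper's application, the covariates would be each panelist's observed media behavior, and the matched subsample's recall rate serves as the campaign-specific baseline.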
We find substantial differences in the pre-existing levels of brand knowledge both across
campaigns and across advertising formats. In particular, respondents who were exposed
predominantly to the Internet formats had a lower level of pre-existing brand knowledge
than TV viewers. Our analysis further reveals that ignoring the initial conditions results
in different conclusions regarding the relative effectiveness of TV versus online formats.
Comparing the impact of the three online formats - banner ads, rich media, and video - to
commercials aired on TV using the traditional measure of in-flight brand recall, we find TV is
superior to the Internet. Upon adjusting for the pre-existing differences in brand knowledge
by format, however, we find that Internet ad performance is statistically indistinguishable
from TV.
In the next section, we provide some background on the traditional brand-recall measures
and define the conditions under which the methodology can be interpreted causally. Section
3 explains the data-collection procedure and provides a description of the variables used in
the analysis. We proceed by outlining our empirical strategy in section 4 and then present
the findings in section 5. Section 6 concludes with directions for future research.
2 Traditional Recall Methodology
A long-established practice in advertising research is to survey individuals who were exposed
to an ad the previous day to determine the extent to which they recall the ad message, the
brand, and can link the message to the corresponding brand (Rossiter & Bellman 2005).
Although these attitudinal measures only approximate the effect on purchase behavior that
advertisers are ultimately interested in, they have gained wide acceptance and usage. In
an early study, Wells (1964) compared the different recognition, recall and rating scales
employed in practice and concluded that recall scores, which reflect the advertisement's
ability to register the sponsor name and to deliver a meaningful message to the consumer,
are particularly trustworthy. More recently, Krishnan & Chakravarti (1999) review existing
memory tests for assessing advertising effectiveness and underscore their value across a wide
range of advertising objectives.
The ad message could inform the viewer about the existence or functional attributes of
the brand, or establish non-functional brand associations. It produces memory traces about
brand-specific and message-specific information, about the product category, evaluative re-
actions, and brand identification (Hutchinson & Moore 1984). For the message to have an
effect, the consumer needs to know which brand is being advertised. Empirical studies have
shown that this is a nontrivial task, as only about 40% of consumers who have viewed a
commercial recall the sponsor of the message (Franzen 1994). Establishing brand-message
links therefore is a critical input to brand building. In the present research, we focus specif-
ically on the question of whether a respondent prompted with a description of the ad can
recall the brand.
Message-recall studies have a few reasons for selecting individuals who viewed the ad the
day before. First, the goal is to assess the creative's ability to link the brand and message,
and not necessarily to assess the quality of the message itself. For example, one could
imagine assessing recall several days after the individuals view the ad to see whether the
ad sticks. This measure, however, says more about the memorability of the message than
the creative's ability to help the brand cut through and get viewers' attention. Second,
exposure is traditionally inferred based on self-reported viewing of a TV program during which a commercial was aired (opportunity to see). That is, respondents have been required
to recall their program viewership to establish exposure. Doing so for a longer period of
time can result in too much error. Passive measurement of exposure, for example, through
a meter installed on the TV set or through other tracking devices, can resolve this problem,
but such measurement is not available at a large enough scale for all advertising.
To describe the traditional methodology, we begin by introducing some notation. Let
Ys be an indicator for whether respondent s can correctly select the brand from a multiple-
choice list after being prompted with a description of the advertising message. Xs is an
indicator for whether respondent s was exposed to the ad. Finally, let ys0 be a probabilistic
assessment of how well respondent s could have guessed the associated brand before the ad
was run. Then the traditional recall methodology defines ∆ = E[Y − y0 | X = 1] to be the
expected lift among the exposed population in linking the brand with the message. The
estimator for ∆ is

∆̂ = Σ{s|Xs=1} (Ys − ys0) ws,   (1)
where the summation conditions on respondents who have been exposed to the ad and ws
weighs respondents based on how representative they are of the entire population exposed
to the ad.2 If respondents are randomly drawn from the exposed population, there is no
need for the weight ws. This term arises here because of selection issues market research
organizations have in recruiting their panels.
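As a concrete illustration, equation (1) is a weighted sum of individual lifts over the exposed sample. The respondent data below are invented for illustration, and the weights are assumed to sum to one over the exposed respondents.

```python
def lift(exposed):
    """Estimator of equation (1): sum over exposed respondents s of
    (Y_s - y_s0) * w_s, where Y_s is the 0/1 brand-recall indicator,
    y_s0 the respondent's baseline probability of linking brand and
    message, and w_s a representativeness weight."""
    return sum((Y - y0) * w for Y, y0, w in exposed)

# Hypothetical exposed respondents: (Y_s, y_s0, w_s)
sample = [(1, 0.25, 0.4), (0, 0.25, 0.3), (1, 0.40, 0.3)]
delta_hat = lift(sample)  # 0.4*0.75 - 0.3*0.25 + 0.3*0.60 = 0.405
```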
The baseline ys0 captures consumers' past interactions with the brand and provides a
measure of the extent to which its advertising has established an association between brand
and message. One might expect successful brands to already have a reasonably high baseline
association with the message because the message is probably related to associations they
have previously communicated. On the other hand, a new brand may have no pre-existing
associations that could be tied to the message, resulting in a small baseline.
In practice, ys0 is typically a predetermined constant, such as the success rate of guessing
at random, that is the same for all respondents. Obtaining a more accurate measure of the
baseline ys0 by establishing an initial condition for the campaign is important both for the lift
measurement above as well as for providing the advertiser with information about how well
past campaigns have imprinted a brand image. Furthermore, if ∆ is to be compared across ad
formats, recognizing that individuals exposed to different formats may systematically differ
in their level of pre-campaign associations is critical.
The traditional recall methodology can be characterized as trying to measure a treatment effect on the treated population (Heckman, Ichimura & Todd 1997, Imbens 2004). This
2The canonical message-recall measure focuses on assessing a single airing of an ad. Yet practitioners often group together multiple ads in a day as well as ads aired across multiple weeks of the campaign. Nevertheless, the estimator in equation (1) is still applied, but the meaning may change because responses for ads later in the campaign could involve more campaign exposures than those for ads earlier in the campaign. Practitioners have attempted to account for multiple campaign exposures by considering the build and/or decay in the brand associations throughout the campaign. With enough surveys, one could repeat the above analysis at each point in time, but more often the researcher tries to estimate how the responses vary with how far along the campaign is in terms of either time or total exposures.
terminology arises from the focus on only measuring the effect for those individuals who
were exposed, that is, conditioning on X = 1 in ∆ = E[Y − y0 | X = 1].
The primary challenge to a causal interpretation of recall studies is the establishment
of a control condition. Because the same individual cannot be simultaneously exposed and
unexposed, measuring ys0 for a respondent who is exposed is typically impossible. To clarify
the problem, we separate the lift measure, ∆ = E[Y − y0 | X = 1], into two independent
expectations: ∆ = E[Y | X = 1] − E[y0 | X = 1]. The first component of this expression,
the probability of correctly identifying the brand if an individual was exposed to the ad
campaign, E[Y|X = 1], can be easily obtained from observed recall and exposure data. The
latter component, E[y0|X = 1], requires assessing the control outcomes, that is assessing
whether a respondent would have correctly linked the creative to the brand without seeing
the ad. This measurement requires experimentation and presents a particular challenge
for media such as television. In section 4 we propose a method to obtain this measure by
augmenting the traditional methodology described above with a pre-campaign survey. Before
we proceed, we first introduce the data set in the next section.
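The decomposition above suggests a simple sample-analog estimator: estimate E[Y | X = 1] from the in-flight sample and E[y0 | X = 1] from pre-campaign respondents matched to the exposed group. The sketch below illustrates this with fabricated 0/1 recall indicators.

```python
def lift_with_baseline(inflight_recall, matched_pre_recall):
    """Sample analog of delta = E[Y | X=1] - E[y0 | X=1]:
    in-flight recall among the exposed minus the recall rate of a
    matched, eventually-exposed pre-campaign sample, which stands in
    for the unobservable counterfactual baseline."""
    mean = lambda xs: sum(xs) / len(xs)
    return mean(inflight_recall) - mean(matched_pre_recall)

# Hypothetical 0/1 recall indicators (1 = correct brand-message link)
inflight = [1, 1, 0, 1, 0]   # surveyed during the campaign
pre = [1, 0, 0, 1, 0]        # matched pre-campaign respondents
delta = lift_with_baseline(inflight, pre)  # 0.6 - 0.4 = 0.2
```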
3 Data
3.1 Data Collection and Variables
The data-collection effort employs Nielsen's TV Brand Effect panel. This panel consists of
a large number of participants who reveal their advertising exposures across Internet and
television formats and answer creative and brand-recall survey questions on rewardtv.com.
The panel consists of more than six million registered members, with a weekly average
of 26,000 participants. On average, a panelist would visit the rewardtv.com site 1.5 times a
week and take 1.7 surveys per visit. Approximately 83% of panelists are new each month.
Because we rarely observe the same individual for longer stretches of time, the panel is best
considered a repeated cross section, which limits our ability to make before-after comparisons
of the same individual. However, repeatedly asking a respondent about the same ad and set
of brands may lead to conditioning effects (testing bias), so not having a long time series is
not necessarily a negative feature in this setting.
Nielsen recruits panelists across various Internet portals and sites and through word of
mouth. To maximize daily participation, the site provides a lot of entertainment content,
along with sweepstakes, auctions, and discounts. The incentives are soft, though, thus
ensuring a high turnover and minimizing the potential for conditioning effects. Nielsen
conducts periodic checks to ensure the panelists exhibit the same TV viewing and Internet
usage behavior as other Nielsen panelists, and uses weights to ensure the representativeness
of each surveyed individual. The first survey for new panelists is eliminated to allow for a
training/experimentation period, and any abnormal participation and response patterns are
carefully examined.
In addition to TV commercials, we investigate three online formats - banner ads, rich
media, and video. Banner ads are ads with or without animation with which the user
cannot interact. Examples include overlays on video content, companion banners, wallpapers,
and skyscrapers. Video is any streaming video, pre-roll, post-roll, or in-roll. Rich media are any ads with which the mouse can interact without necessarily activating a click-through,
such as expandable ads, interactive game ads, and corner peels.
To record online ad exposure, online ad creatives are tagged and then linked to the
panelists via cookies on their computers. Provided cookies are not erased and the user does
not change computers, Internet exposures are complete for the duration of the campaign
irrespective of an individual logging in on rewardtv.com. Television exposure is inferred
when a respondent logs in to rewardtv.com and states that on the preceding day, she watched
a program that is known to have run an advertisement from the campaign (opportunity to
see). For TV exposures, we thus do not observe exposures an individual may have had prior
to logging onto rewardtv.com.
When an individual logging in is identified as having been exposed to an ad, she is
presented with a description of a scene from a commercial (for an example, see Table 1).
This description often comes in the form of a question assessing whether the respondent can
recall the creative. Next, the respondent is asked to indicate which of four listed brands the
commercial was for.
Table 1: Example for creative-recall and brand-recall questions
In a commercial during this show, who spoke directly to the camera and said, "I just bought stock... you just saw me buy stock," as he sat at a computer keyboard?
Well-spoken baby who eventually spat up all over the place Monkey wearing a custom-tailored suit and a fine silk tie
Simple peasant from the past who came from a rural village
Alien from outer space who did not speak earth language
What was this a commercial for?
E Trade
TD Ameritrade
Scottrade
Charles Schwab
Questions are asked in the same way for all formats. Brand recall, however, is only
measured conditional on creative recall in the case of TV, as opposed to rich media, video,
and banner ads, where all responses are recorded. To keep the data comparable, we retain
individuals who answered the creative-recall question correctly for all formats. The sample
sizes by format and campaign are reported in Table 2.
We collected data for 20 advertising campaigns run in 2011 across several industries: tele-
com, food and beverage, beauty, financial services, and pharmaceuticals. For confidentiality
reasons, we cannot share the brand names that were advertised, but Table 3 gives some
information about each campaign and the brand advertised. We see that the campaigns
vary considerably in terms of duration, with the shortest campaign being four weeks and the
Table 2: Sample sizes for in-flight sample (first number) and unmatched pre-campaign sample (second number) by survey question format and campaign.

banner rich media video TV
campaign 1 307/1723 93/3419 92/1695 1893/1729
campaign 2 225/2313 36/2339 141/2312 3909/2327
campaign 3 334/919
campaign 4 157/3562 721/3546
campaign 5 78/1199 2338/1199
campaign 6 146/826 239/804 366/807
campaign 7 468/1023 90/1031 83/1068 2518/3348
campaign 8 2269/5966
campaign 9 78/1225 959/1255
campaign 10 189/1400 245/1320 84/1409 1955/2935
campaign 11 258/2135 467/2123 131/2123 3875/3396
campaign 12 820/1980 407/1083
campaign 13 75/957 352/964
campaign 14 53/1277 380/1254
campaign 15 87/971
campaign 16 73/1271 2426/1602
campaign 17 1108/438
campaign 18 1386/1570 57/1425 3658/1453
campaign 19 2118/2241
campaign 20 53/1723 36/3419 648/1729
longest, 36 weeks.
Table 3: Duration of advertising campaigns, penetration of advertised brand in its respective product category, and share of TV GRPs for the four quarters prior to current campaign.

TV weeks online weeks penetration TV GRP share
campaign 1 12 15 0.33 0.48
campaign 2 8 25 0.33 0.47
campaign 3 8 8 0.08 0.36
campaign 4 8 8 0.15 0.00
campaign 5 10 10 new 0.16
campaign 6 19 19 0.02 0.32
campaign 7 32 32 0.12 0.22
campaign 8 36 36 0.19 0.55
campaign 9 12 12 0.21 1.00
campaign 10 27 27 0.17 0.45
campaign 11 12 30 new 0.00
campaign 12 12 30 0.03 0.68
campaign 13 4 4 0.19 0.15
campaign 14 6 6 0.35 0.30
campaign 15 4 4 0.01 0.43
campaign 16 14 14 0.23 0.44
campaign 17 8 8 0.12 0.16
campaign 18 14 14 0.12 0.08
campaign 19 7 7 0.36 0.31
campaign 20 11 11 0.36 0.27
The percentage of US households buying a certain CPG brand or using a service (the
brand's penetration) varies widely across campaigns: we have a new brand (campaign
11), a new line extension (campaign 5), along with several category leaders with a high
penetration of more than 30% (campaigns 1, 2, 14, 19 and 20). The level of advertising
in the four quarters prior to the current campaign also exhibits substantive variation: from
non-existent (campaigns 4 and 11) to 100% of the TV GRPs in the category for campaign
9.
3.2 Recall Measures: In-Flight Sample
The brand-recall analysis as described in section 2 consists only of respondents correct or
incorrect associations of the brand with the message. To collect these data, Nielsen deploys
surveys while an ad campaign is running. When individuals report they have viewed a TV
program that aired a commercial for the focal campaign or when they have visited a web page
featuring an online ad, they are presented with the brand-recall question. Table 4 displays
the average of the responses from the in-flight survey by campaign and format. Because
these estimates do not include an adjustment for a baseline response, they are calculated as
in equation (1), except that ys0 is set to zero:
Σ{s|Xs=1} Ys ws.
Table 4: Percentage of correct linkages of brand and creative across formats and campaigns for
all individuals surveyed in-flight. Standard deviations are reported in parentheses.
banner rich media video TV
campaign 1 0.40 (0.49) 0.30 (0.46) 0.39 (0.49) 0.40 (0.49)
campaign 2 0.39 (0.49) 0.24 (0.43) 0.41 (0.49) 0.37 (0.48)
campaign 3 0.53 (0.50)
campaign 4 0.37 (0.48) 0.35 (0.48)
campaign 5 0.15 (0.35) 0.31 (0.46)
campaign 6 0.44 (0.50) 0.43 (0.50) 0.41 (0.49)
campaign 7 0.35 (0.48) 0.34 (0.48) 0.44 (0.50) 0.42 (0.49)
campaign 8 0.79 (0.41)
campaign 9 0.85 (0.36) 0.78 (0.42)
campaign 10 0.51 (0.50) 0.50 (0.50) 0.80 (0.40) 0.68 (0.47)
campaign 11 0.36 (0.48) 0.36 (0.48) 0.59 (0.49) 0.49 (0.50)
campaign 12 0.58 (0.49) 0.48 (0.50)
campaign 13 0.38 (0.49) 0.18 (0.39)
campaign 14 0.48 (0.50) 0.84 (0.37)
campaign 15 0.34 (0.47)
campaign 16 0.60 (0.49) 0.48 (0.50)
campaign 17 0.55 (0.50)
campaign 18 0.46 (0.50) 0.55 (0.50) 0.55 (0.50)
campaign 19 0.53 (0.50)
campaign 20 0.39 (0.49) 0.49 (0.51) 0.38 (0.49)
Looking at the average brand recall rates in Table 4, we see many substantial brand-
message links. There is also substantial variation both across formats and campaigns. Al-
though the numbers in the table cannot be directly interpreted as a lift measure because
the baseline has not been removed, we can subtract the one traditionally used in practice,
ys0 = 0.25, from the reported numbers to get an estimate of the lift. It is notable that
although many campaigns have a positive lift, quite a few format-campaign combinations
(e.g., banners in campaign 5, rich media in campaign 2, and TV in campaign 13) are below
the baseline of 0.25. These numbers could be indicative of a poor campaign that broke previously established brand-message links or, as we will explore with our initial-conditions methodology, cases in which the baseline should actually be lower.
To formally assess the differences between recall rates for Internet formats and television,
we aggregate across campaigns. Table 5 reports the results of comparing the average recall
rates for campaigns that used Internet formats to the recall rates for TV for these campaigns.
For campaigns that ran some banner ads, the average brand-message recall of banners is 0.45,
whereas it is 0.50 for TV ads, with the difference having a p-value of 0.01. Similarly, among
the campaigns running rich media, the recall is 0.37 for rich media, but significantly greater
at 0.46 for TV. The video ads' recall is significantly greater than TV's (0.50 versus 0.44) in
those campaigns airing some video ads. Based on these data, we might therefore conclude
that TV outperforms banner ads and rich media in terms of brand recall, whereas video
outperforms TV.
Table 5: Comparison of average recall rates for Internet formats vs. TV across campaigns in in-flight sample. Campaigns that do not use a given online format were excluded.

avg. recall t-stat p-value
banner 0.45 -3.24 0.01
TV 0.50
rich media 0.37 -3.92 0.00
TV 0.46
video 0.50 2.88 0.00
TV 0.44
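The format-versus-TV comparisons in Table 5 are differences in two recall proportions. A simple unweighted version of such a test can be sketched as follows; the paper's actual comparisons carry survey weights, and the sample sizes here are invented.

```python
import math

def two_proportion_t(p1, n1, p2, n2):
    """t statistic for the difference between two recall proportions,
    using the unpooled (Welch-style) standard error for 0/1 outcomes."""
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) / se

# Hypothetical counts: banner recall 0.45 on 2,000 responses,
# TV recall 0.50 on 20,000 responses
t = two_proportion_t(0.45, 2000, 0.50, 20000)  # negative: banner below TV
```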
3.3 Recall Measures: Pre-Campaign Sample
For this research project, we augmented the in-flight data collection with a set of surveys,
which were deployed before the advertising campaign was run, to account for pre-existing
differences in respondents' abilities to link the brand and message. As we describe in sec-
tion 4, these pre-campaign surveys can be used to measure more accurately the lift the ad
campaign provides relative to an initial condition than by simply assuming that, absent
advertising, consumers would randomly guess.
Table 6: Percentage of correct linkages of brand and creative across formats and campaigns in pre-campaign survey sample. Standard deviations are reported in parentheses.

banner rich media video TV
campaign 1 0.38 (0.48) 0.39 (0.49) 0.39 (0.49) 0.37 (0.48)
campaign 2 0.35 (0.48) 0.35 (0.48) 0.38 (0.48) 0.36 (0.48)
campaign 3 0.49 (0.50)
campaign 4 0.27 (0.44) 0.29 (0.45)
campaign 5 0.18 (0.39) 0.17 (0.37)
campaign 6 0.26 (0.44) 0.25 (0.44) 0.27 (0.45)
campaign 7 0.48 (0.50) 0.43 (0.49) 0.47 (0.50) 0.53 (0.50)
campaign 8 0.54 (0.50)
campaign 9 0.41 (0.49) 0.50 (0.50)
campaign 10 0.45 (0.50) 0.41 (0.49) 0.47 (0.50) 0.47 (0.50)
campaign 11 0.10 (0.30) 0.09 (0.29) 0.10 (0.31) 0.08 (0.28)
campaign 12 0.43 (0.50) 0.40 (0.49)
campaign 13 0.16 (0.37) 0.17 (0.38)
campaign 14 0.31 (0.46) 0.34 (0.47)
campaign 15 0.23 (0.42)
campaign 16 0.30 (0.46) 0.34 (0.47)
campaign 17 0.33 (0.47)
campaign 18 0.17 (0.38) 0.18 (0.39) 0.19 (0.39)
campaign 19 0.27 (0.44)
campaign 20 0.24 (0.43) 0.23 (0.42) 0.26 (0.44)
Preliminary examination of the average pre-campaign brand-recall rates in Table 6 re-
veals that the recall rates vary substantially across campaigns and that large deviations from
a random guess rate of ys0 = 0.25 are present. As expected, the correct linkages for the new
products (campaigns 5 and 11) are quite low. In line with our intuition, the pre-existing
brand knowledge for campaign 5, which is a line extension, is somewhat higher than for the
entirely new brand in campaign 11. Campaign 18, which has a low share of TV GRPs (8%),
is also characterized by a low level of creative-brand association. By contrast, campaigns
with a relatively high penetration and share of TV GRPs have higher creative-brand associ-
ations. We do not have enough data to fully document a relationship between the campaign
characteristics and the probability of correctly linking a creative to a brand, but sufficient
evidence exists to suggest that subsequent analyses should account for, and possibly attempt
to explain, the presence of systematic variation.
Table 7: Comparison of average recall rates for Internet formats vs. TV across campaigns in pre-campaign survey sample. Campaigns that do not use a given online format were excluded.

avg. recall t-stat p-value
banner 0.31 -3.82 0.01
TV 0.33
rich media 0.32 -4.78 0.00
TV 0.35
video 0.30 -0.93 0.39
TV 0.31
One notable difference in the pre-campaign recall rates reported in Table 6, relative to
the in-flight recall rates in Table 4, is that much less variation is present across formats.
This lack of variation is to be expected because the differences across formats in Table 6
are only in the question asked, not in the respondents' past or future exposure to a given
format (the questions were asked before the campaign had begun, so the respondents could
not have been exposed to the ad).
Table 7 reports a direct comparison of average recall for each Internet format to that for
TV. Both banners and rich media perform slightly worse relative to TV (a difference of -0.02
for banners and -0.03 for rich media), whereas video is statistically indistinguishable from
TV.
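The pairwise comparisons in Tables 7 and beyond are simple two-sample tests on binary recall indicators. As a minimal sketch of such a test (Welch's t-statistic on synthetic 0/1 recall data, not the study's respondents; sample sizes and rates are illustrative only):

```python
import math
import random

def welch_t(x, y):
    """Welch's t-statistic comparing mean recall between two samples of 0/1 indicators."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    # Unbiased sample variances of the two groups.
    vx = sum((xi - mx) ** 2 for xi in x) / (nx - 1)
    vy = sum((yi - my) ** 2 for yi in y) / (ny - 1)
    return (mx - my) / math.sqrt(vx / nx + vy / ny)

# Illustrative synthetic recall indicators (NOT the study's data):
random.seed(0)
banner = [1 if random.random() < 0.31 else 0 for _ in range(2000)]
tv = [1 if random.random() < 0.33 else 0 for _ in range(2000)]
print(round(welch_t(banner, tv), 2))
```

In practice the survey weights would enter the means and variances; the unweighted version above only illustrates the shape of the comparison.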
3.4 Comparison of In-Flight and Pre-Campaign Samples
For the summary statistics of the pre-campaign sample to be considered a valid baseline
to calculate the lift of a campaign, we need to ensure the respondents included in the pre-
campaign sample are comparable to the ones surveyed during the campaign. This may not
be the case, however, for a number of reasons. First, we are only interested in the effect of the
campaign on the exposed individuals, and therefore respondents who are not exposed should
receive a weight of zero in our analysis. Survey respondents in the pre-campaign sample
by definition have not been exposed to the ad at the time they are surveyed, but their
subsequent exposures (if any) have been recorded. We can therefore examine their various media exposures to verify whether they eventually saw an ad in the focal campaign and format. Table
8 reports the percentage of the pre-campaign sample eventually exposed to an ad in the focal
format and campaign. Although this percentage is quite high for TV, many pre-campaign
respondents in the Internet formats were never exposed to the campaign. By contrast, all
in-flight respondents have by definition been exposed. To make the samples comparable, we
thus need to focus only on individuals who were eventually exposed (Xs = 1).
Table 8: Percentage of pre-campaign sample who are eventually exposed to the focal format (i.e., were asked about the respective format).

              banner   rich media   video    TV
campaign 1     0.70       0.47       0.54   0.83
campaign 2     0.63       0.41       0.49   0.88
campaign 3                                  0.96
campaign 4     0.59                         0.94
campaign 5     0.53                         0.94
campaign 6     0.69       0.46              0.87
campaign 7     0.70       0.48       0.40   0.85
campaign 8                                  0.98
campaign 9     0.63                         0.94
campaign 10    0.53       0.64       0.41   0.89
campaign 11    0.50       0.73       0.46   0.82
campaign 12    0.76                         0.86
campaign 13                          0.60   0.91
campaign 14    0.78                         0.91
campaign 15                                 0.98
campaign 16    0.50                         0.98
campaign 17                                 1.00
campaign 18    0.84       0.24              0.84
campaign 19                                 0.99
campaign 20    0.67       0.64              0.94
A second issue is the extent to which the exposed pre-campaign sample and the in-flight
sample are similar in terms of exposures to the different advertising formats. As can be seen
by looking at the averages for both groups reported in Table 9, even those respondents who
were eventually exposed to an ad in the focal campaign have a different rate of exposure
than the respondents included in the in-flight sample. In general, those in the pre-campaign
group have a much higher exposure to TV relative to the in-flight group.
A likely explanation for these differences in media exposures is the different sampling time frames. Whereas the in-flight surveys were collected for the entire duration of the campaign (anywhere between 4 and 36 weeks), the pre-campaign measures were typically collected within a week. To obtain the necessary sample size to
ensure we would have an adequate group of individuals who are eventually exposed to the
focal campaign, the selection of respondents had to be much more aggressive, thus yielding a
potentially different sample. For example, the high TV exposures among the pre-campaign
sample could be attributed to a greater number of professional survey takers that might
have overstated TV exposure rates in order to earn more points on rewardtv.com. Using
a matching methodology, we remove these outliers and create a sample comparable to the
in-flight group.
Table 9: Average number of exposures to different ad formats by campaign. Comparison of pre-campaign (left column) and in-flight (right column) samples.

              banner           rich media       video            TV
              pre     in-fl    pre     in-fl    pre     in-fl    pre     in-fl
campaign 1    5.61    4.74     3.22    2.94     2.45    1.98     10.75   2.87
campaign 2    3.79    3.08     2.08    1.68     2.87    2.04     16.67   3.94
campaign 3                                                       21.96   13.7
campaign 4    4.89    6.27                                       7.34    1.48
campaign 5    3.19    3.41                                       13.91   6.23
campaign 6    5.7     3.72     3.44    3.82                      4.29    2.31
campaign 7    6.58    5.6      3.4     2.82     6.81    4.03     11.51   6.41
campaign 8    1       2.57                                       1.56    11.49
campaign 9    5.79    3.68     1.85    1.67                      11.45   2.65
campaign 10   7.26    9.47     5.48    6.46     3.65    3.48     17.16   2.51
campaign 11   3.28    8.78     4.39    4.49     3.57    4.38     10.65   2.5
campaign 12   4.18    5.55                                       4.23    1.46
campaign 13                                     4.25    4.5      4.33    2.56
campaign 14   2.31    3.02                                       2.95    2.7
campaign 15                                                      1.85    1.23
campaign 16   2.33    1.84                                       11.54   8.55
campaign 17                                                      5       2.2
campaign 18   9.91    7.76     4.8     3.23                      22.89   15.81
campaign 19                                                      2.17    12.02
campaign 20   6.12    5.22     8.63    6.77                      11.31   9.42
4 Matching Methodology
Given our pre-campaign survey data, we conceptualize lift as
∆ = E[Ys1 − Ys0 | Xs = 1] ,
where Ys1 indicates correct association of the message and brand during the campaign by
respondent s and Ys0 indicates correct association before the campaign.3 Numbering the sur-
veys before the campaign as {1,...,S0} and those during the campaign as {S0 + 1,...,S1 + S0},
we would ideally measure
∆ = Σ_{s: s>S0, Xs=1} Ys1 ws1 − Σ_{s: s≤S0, Xs=1} Ys0 ws0 ,    (2)
where the weights ws1 and ws0 ensure the surveyed in-flight and pre-campaign individuals
are representative of the population of exposed individuals. We cannot, however, estimate
the above equation because we do not observe ws0; that is, the weights are only calculated
for the individuals surveyed during the campaign. Furthermore, the analysis and discussion
in section 3 indicate the pre-campaign group is systematically different from the in-flight
group for which we observe the weights. To prune the non-representative pre-campaign
respondents, we employ a matching procedure that restricts the analysis to each in-flight
survey and its nearest-neighbor from the pre-campaign group.
4.1 The Matching Estimator
We match each in-flight respondent s surveyed during the campaign with a set Ms of pre-survey respondents based on a set of variables Zs that we describe below. Then we estimate the following:

∆ = Σ_{s: s>S0, Xs=1} Σ_{m∈Ms} (Ys1 − Y0m) ws1 / |Ms| .    (3)
3 Our notation for s equates surveys and respondents. Given the sampling approach described in section 3, a given respondent could potentially fill out multiple surveys in the repeated cross section. We currently cannot separate such cases to treat them specially.
In the above expression, Ms is the set of pre-campaign respondents who are matched to in-flight respondent s, and Y0m indicates whether the mth matched pre-survey respondent correctly recalled the brand. We divide by the number of matched respondents, |Ms|, such that the total weight for each in-flight respondent s is equal to that respondent's reported weight, ws1.
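Equation (3) translates directly into code; a minimal sketch under the assumption that the match sets Ms have already been formed (all variable names are hypothetical, not from the study):

```python
def adjusted_lift(inflight, matches):
    """Matching estimator of equation (3).

    inflight: list of (y1, w1) pairs -- in-flight recall indicator and survey weight.
    matches:  list of lists; matches[s] holds the pre-campaign recall
              indicators y0 of the respondents matched to in-flight survey s.
    """
    total = 0.0
    for (y1, w1), m in zip(inflight, matches):
        # Each matched pre-survey respondent shares the in-flight weight w1 equally,
        # so the total weight attached to survey s remains w1.
        total += sum(y1 - y0 for y0 in m) * w1 / len(m)
    return total

# Toy example: two in-flight surveys with weights summing to 1.
inflight = [(1, 0.5), (0, 0.5)]
matches = [[0, 1], [0]]  # the first survey has two tied matches
print(adjusted_lift(inflight, matches))  # 0.25
```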
The assumption underlying this estimator is
E[Y0 | s > S0, Z] = E[Y0 | s ≤ S0, Z] .    (4)
In words, we assume that conditional on the matching variables, Zs, the expected response
to the pre-campaign survey is invariant to whether the individual was surveyed before or
after the campaign began. The assumption therefore guarantees our estimator removes any
systematic sampling differences between the pre-campaign and in-flight groups.
4.2 Matching Variables
Given our goal is to compare advertising effectiveness across various media formats, we
decided to focus on media consumption as the most relevant descriptor of the surveyed
individuals. The matching variables Zs include the total number of campaign exposures for
each of the three Internet formats, as well as the total number of TV exposures across all
campaigns in our data.
The Internet formats provide valuable match variables because they are passively ob-
served and thus do not suffer from self-reporting issues. Furthermore, they are highly re-
flective of the type of individual. Specifically, exposure to the campaign's advertisements signifies the individual is in the campaign's target, and the number of exposures provides a
measure of the intensity of viewership of the targeted medium.
We do not match on television exposures within the campaign because they are not
passively observed. A pre-campaign respondent could have been exposed to TV even if we
do not observe TV exposure. However, we include the total television exposures across all
campaigns so that we are matching on a measure of television viewership intensity. The
number of total television exposures also helps us separate out individuals who might take many surveys, because reported television exposures give the respondent the opportunity to take more surveys. Such individuals are down-weighted in Nielsen's estimate of each in-flight respondent's weight, but we need to match on this characteristic to ensure similar down-weighting of pre-campaign respondents who might have reported many exposures.
We use a nearest-neighbor matching approach (Abadie, Drukker, Herr & Imbens 2004,
Abadie & Imbens 2012) in which we find at least one pre-campaign survey to match to each
in-flight survey. As Abadie & Imbens (2012) show, allowing individual observations to be
used as a match more than once lowers the bias of the estimates.
We seek exact matches on the campaigns passively observed Internet exposures and
allow the overall television exposures to sort among ties in terms of shortest distance. If ties
are still present, we include all tied matches, which accounts for |Ms| in equation (3) being
greater than 1 for in-flight survey s. As per equation (3), we include the additional matches
based on their share of the total matches to s. If no exact match exists, we find the nearest
neighbor in terms of the distance between the two vectors Zs for the in-flight survey and Zs for the pre-campaign survey.
Our procedure worked well. For the TV format question, we are able to match exactly
96% of the in-flight respondents on the passively observed Internet exposures. Note that we
exclude any pre-campaign respondents that do not match in-flight respondents, because our
in-flight respondent weights sum to form the true distribution of exposed individuals. For
banners the percentage is 84%, followed by video at 75% and rich media at 69%.
4.3 Causal Interpretation
Because pre-campaign surveys are conducted well before most of the in-flight surveys (given
that some campaigns last 4-5 months), time-varying unobservables could make a causal
interpretation difficult. Moreover, although matching ensures pre-campaign and in-flight
respondents are comparable in terms of ad exposure and media consumption over the entire
time frame of the data, it cannot make up for the time gap between the two surveys. Because
our goal is to compare exposures across different media formats, our primary concern arises
from time-varying unobservables that differ based on the media format to which a respondent
is exposed.
One source of time-varying unobservables we know exists is unobserved television ex-
posures. Due to the inability to passively measure television exposures, we only observe a
subset of the actual exposures to TV ads. However, in trying to assess whether Internet
formats can build brands comparably to television, unobserved television exposures would
likely overstate television effects relative to Internet effects. This overstatement is likely to
occur, because we should expect individuals exposed to television to watch more television
on average than individuals exposed to the Internet, giving television-exposed individuals
relatively more unobserved exposures to the ad campaign.
Other sources of time-varying unobservables include other non-advertising marketing activity by the firm or its competitors. For example, in-store displays do not include messages that would increase association of the message with a brand, but they could increase the salience of the brand in the mind of the customer and therefore increase the focal brand's choice in random guessing. We have no a priori reason to believe television- or Internet-intensive media consumers should see a firm's non-advertising marketing activity at a systematically higher or lower rate. Competitors are likely to target their marketing activity at the same targets as those chosen by the focal brand, and these competitive actions could lead to systematically higher or lower levels of associations as the time since the pre-campaign survey increases. If
competitive advertising creates biases in favor of one format over another, we should expect
these biases to be increasing with time since the pre-survey. We therefore consider our effects
separately for different progressions of our campaigns, measured as the number of previous
exposures respondents have to the campaign.
5 Findings
We discuss the results from the above matching procedure in the context of two separate yet related research questions. First, by examining the pre-campaign brand recall of the exposed population, we can evaluate whether past advertising or brand experiences have led to a divergence in brand associations between Internet- and television-intensive targets. We find that banner- and rich-media-intensive targets have systematically lower levels of brand recall, which suggests past advertising was either insufficient or less effective for Internet media. Second, the pre-campaign brand-recall measures derived after the matching procedure serve as the baseline in our lift measures. The matched pre-campaign sample allows for a more accurate measure of the campaign lift, and it ensures that it is more comparable across formats because it takes into account any pre-existing cross-format differences in brand knowledge.
5.1 Existing Brand Knowledge across Formats
As some consumers have shifted their media consumption away from television toward vari-
ous online formats, a concern arises as to whether brand-building activities can be transferred
easily across formats. Before we examine the effectiveness of various ad platforms, we con-
sider the lasting effects of past campaigns. Specifically, we measure pre-campaign brand
knowledge separately by the media format to which a respondent is eventually exposed
(and presumably favors). Although our data do not allow us to infer why, for instance, an
Internet-exposed individual may have had less pre-campaign knowledge of the brand than
a television-exposed individual, two explanations for the difference in baseline brand knowl-
edge are possible: (i) brands may have devoted fewer past exposures to the Internet formats
the individual views, or (ii) past Internet exposures had less persistent effects.
We are able to assess pre-campaign associations by exposure format because we observe
pre-campaign respondents eventual exposures to the campaign. Table 10 reports the initial
conditions based on the matched pre-campaign surveys. These initial conditions differ from
the ones reported in Table 6 in that they reflect the responses for only those individuals who
are exposed to the format-campaign combination and are matched to an in-flight respondent.
Table 10: Percent of correct brand associations before each campaign in the matched pre-campaign sample (standard deviations in parentheses).

               banner        rich media    video         TV
campaign 1     0.31 (0.46)   0.34 (0.48)   0.32 (0.47)   0.30 (0.46)
campaign 2     0.26 (0.44)   0.24 (0.44)   0.35 (0.48)   0.32 (0.47)
campaign 3                                               0.32 (0.47)
campaign 4     0.36 (0.48)                               0.36 (0.48)
campaign 5     0.24 (0.43)                               0.08 (0.28)
campaign 6     0.24 (0.43)   0.23 (0.42)                 0.14 (0.35)
campaign 7     0.42 (0.49)   0.38 (0.49)   0.50 (0.50)   0.62 (0.49)
campaign 8                                               0.59 (0.49)
campaign 9     0.30 (0.46)                               0.63 (0.48)
campaign 10    0.53 (0.50)   0.45 (0.50)   0.47 (0.50)   0.66 (0.47)
campaign 11    0.07 (0.25)   0.05 (0.21)   0.12 (0.32)   0.07 (0.25)
campaign 12    0.42 (0.49)                               0.42 (0.49)
campaign 13                                0.11 (0.32)   0.13 (0.34)
campaign 14    0.32 (0.47)                               0.66 (0.47)
campaign 15                                              0.14 (0.35)
campaign 16    0.22 (0.42)                               0.41 (0.49)
campaign 17                                              0.18 (0.39)
campaign 18    0.16 (0.36)   0.43 (0.50)                 0.16 (0.37)
campaign 19                                              0.20 (0.40)
campaign 20    0.11 (0.32)   0.08 (0.27)                 0.16 (0.37)
The primary change in Table 10 relative to Table 6 is that substantial variation in brand
recall now exists across formats within a campaign. For example, campaign 7 has a TV
baseline of 0.62, but the baseline is 0.5 or less for the three Internet formats. Alternatively,
campaign 2 has a high baseline on video and TV at 0.35 and 0.32, respectively, but is close
to 0.25 for banners and rich media.
Although the campaign-by-campaign measures are illustrative, our focus is on the aver-
ages across campaigns and within format, where the aggregated sample sizes allow us more
conclusive inference. Table 11 compares average pre-campaign brand recall for each Inter-
net format to the average pre-campaign brand recall for TV. It also compares the matched
estimates with the unmatched estimates.

Table 11: Difference across formats in the percentage of correct brand associations in the pre-campaign sample. Comparison between matched and unmatched samples. Asterisk denotes a significant difference at the 5% level.

              unmatched   matched   exact matches
banner           0.31       0.28        84%
TV               0.33       0.36        97%
rich media       0.32       0.26        69%
TV               0.35       0.35        96%
video            0.30       0.32        75%
TV               0.31       0.30        97%

For campaigns running banner ads, we see that
the initial condition for those exposed to banner ads dropped to 0.28 with matching, which
is significantly lower than the 0.36 for TV. Rich-media matched initial conditions are also
significantly lower than TV at 0.26. Video is indistinguishable from TV in both the matched
and unmatched samples. We suspect video and TV may be similar, because many of the
video ads were for online viewership of episodes from television series (e.g., through Hulu).
The banner and rich-media differences from TV are worth considering. The fact that the target audience for the online ad campaigns has a lower level of existing brand knowledge than the target audience exposed to TV suggests advertisers' efforts to reach this population have been ineffective thus far. The TV population is more familiar with the brand message and is thus better able to correctly link the commercial to the corresponding brand. This finding could be the result of insufficient or ineffective past advertising to Internet-intensive media viewers.
5.2 Comparison of Advertising Lift across Formats
The metric we use to compare the performance of the different advertising formats is the
campaign lift, calculated as the difference in the brand-recall measure between the matched
pre-campaign sample and the in-flight sample (see equation (3)). Table 12 reports the lift by
campaign and format. We observe a dramatic effect across all formats for the new brand in campaign 12. Similarly, we find large and significant effects for campaign 21 (banners, rich media, and TV), campaign 19 (banners and TV), campaign 15 (banners and TV), campaign 7 (banners and TV), and campaign 6 (TV and video). Some campaigns have a much greater banner lift than TV (e.g., 9 and 16). Rich media provides the highest lift in campaign 20. Video outperforms other formats in campaigns 10 and 13. These differences suggest more exploration is needed when data become available for a larger number of campaigns in order to establish a relationship between campaign characteristics and the effectiveness of the media vehicles.

Table 12: Adjusted lift by campaign and format. Asterisk denotes significance at the 5% level.

               banner   rich media   video     TV
campaign 1      0.10      -0.05       0.07    0.11
campaign 2      0.13       0.00       0.06    0.05
campaign 3                                    0.21
campaign 4      0.00                         -0.01
campaign 5     -0.09                          0.23
campaign 6      0.20       0.20               0.27
campaign 7     -0.08      -0.04      -0.05   -0.20
campaign 8                                    0.20
campaign 9      0.54                          0.15
campaign 10    -0.02       0.05       0.33    0.02
campaign 11     0.30       0.31       0.48    0.43
campaign 12     0.17                          0.06
campaign 13                           0.27    0.05
campaign 14     0.15                          0.18
campaign 15                                   0.20
campaign 16     0.38                          0.07
campaign 17                                   0.37
campaign 18     0.30       0.12               0.39
campaign 19                                   0.33
campaign 20     0.28       0.41               0.22

Table 13: Aggregate lift comparison across formats.

              avg. lift   t-value   p-value
banner           0.17       1.56      0.12
TV               0.14
rich media       0.12       0.38      0.71
TV               0.10
video            0.19       1.60      0.11
TV               0.14
Comparing the average performance of the advertising formats across the campaigns, we
find all Internet formats perform slightly better than TV, with video having the highest
relative lift at 0.05, banners at 0.03, and rich media at 0.01. However, the p-values for video
and banners are only 0.11 and 0.12. Thus, accounting for the differences in pre-existing
brand knowledge by format leads to a different inference regarding the relative performance
of TV versus the online advertising formats. When only in-flight recall rates are compared, TV appears to be the most impactful medium; adjusting for the initial conditions, Internet formats perform just as well or perhaps even better.
One question that arises in comparing in-flight lift across campaigns is whether respon-
dents were exposed the same number of times across campaigns at the time they are surveyed.
Table 14 reports the average number of exposures to the surveyed format for each Internet
versus TV comparison. Exposure rates among the banner- and rich-media-exposed/surveyed respondents are significantly greater than TV exposures for the same campaigns. Recall, however, that not all TV exposures are observed. Video exposures are also greater than TV exposures, though the difference is not statistically significant.

Table 14: Average number of exposures to focal format vs. TV at the time a respondent takes a survey in the focal format.

              exposures   t-value   p-value
banner           2.57       4.91      0.00
TV               1.86
rich media       2.58       3.78      0.00
TV               1.86
video            2.24       1.40      0.16
TV               2.01

Table 15 reports the difference in lift between TV and each Internet format separately by the total number of exposures to the campaign. Once we condition on exposures, we do not see any format performing systematically better. Even for a given exposure level, only one comparison yields a significant result (banner lift 0.23 greater than TV at six exposures). Overall, this finding suggests that the ability of Internet exposures to produce lift measures comparable to those of TV is not due to systematically greater numbers of exposures to the campaign.

Table 15: Adjusted lift by number of exposures prior to survey for the pairwise comparison to TV.

                     banner   rich media   video
1 exposure            0.01      -0.04       0.06
2 exposures           0.01       0.07       0.08
3 exposures           0.03       0.03      -0.08
4 exposures          -0.02      -0.03      -0.08
5 exposures          -0.14      -0.18      -0.18
6 exposures           0.23       0.13      -0.07
avg. diff. in lift    0.03       0.01       0.05

Note: * denotes significance at the 10% level, ** significance at the 5% level.
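The exposure-stratified comparison in Table 15 amounts to computing the weighted lift separately within each exposure count. A schematic sketch with hypothetical inputs (y0_bar stands in for the mean recall of a survey's matched pre-campaign respondents):

```python
from collections import defaultdict

def lift_by_exposures(records):
    """records: iterable of (n_exposures, y1, y0_bar, w1) per in-flight survey.

    Returns {n_exposures: weighted lift within that exposure stratum}.
    """
    lift = defaultdict(float)
    for n, y1, y0_bar, w1 in records:
        # Same estimator as equation (3), accumulated stratum by stratum.
        lift[n] += (y1 - y0_bar) * w1
    return dict(lift)

rows = [(1, 1, 0.5, 0.5), (1, 0, 0.0, 0.5), (2, 1, 0.0, 1.0)]
print(lift_by_exposures(rows))  # {1: 0.25, 2: 1.0}
```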
6 Conclusions
In this research, we propose a methodology for establishing a format-specific baseline to
assess the lift in brand recall due to an advertising campaign. We supplement the in-flight
brand-message surveys with a set of pre-campaign surveys and match the pre-campaign
respondents to those eventually exposed to the campaign in order to control for pre-existing
brand knowledge. The rich data set we have, tracking the response to TV and Internet
advertising for 20 campaigns across a variety of industries, provides us with comparable
measures to assess the relative performance of the different advertising formats.
We find a systematically lower level of brand knowledge among individuals who are
surveyed about banner and rich media. Without a format-specific baseline, a researcher
might therefore draw the wrong conclusion and ascribe too much importance to TV's effect
on brand recall. Once the difference in pre-existing knowledge is taken into account, there
is no significant difference in the effectiveness of TV and Internet ads in terms of correct
brand identification. This result underscores the importance of pre-campaign surveys and
our matching methodology for comparing ad performance across media formats.
The goal of our research was to assess the widely held belief that TV outperforms Internet
formats as a brand-building platform, and we therefore focused on head-to-head comparisons
of TV to the Internet formats. Nevertheless, as advertisers decide how to use these various
formats, knowledge of the complementarities between media will be important. Researchers
in marketing have long explored the potential synergies in multimedia communications (see,
e.g., Naik & Raman (2003) or Dijkstra, Buijtels & van Raaij (2005) for recent examples), but the empirical study of the phenomenon in a field setting is still challenging. Studies
that randomly vary TV and Internet pulses across geographic markets may be best suited
to disentangle the optimal combination and sequencing of ad formats. This more detailed
analysis was not possible in our context, where most advertising was at the national level, so
a focus on brands involved in geo-targeted campaigns may be the most promising approach.
Another avenue for future research would be to investigate more formally the link between
category characteristics and effectiveness of different types of campaigns. Our study and the
existing literature on advertising point to a number of potentially relevant brand and category
factors such as the maturity level of the category, the stage of the product life cycle (new
introduction versus established brand), and the amount of previous advertising, possibly as
share of voice in the category. In addition, the type of consumer decision making in the product category (whether it is a low-involvement or a high-involvement process) will also likely play a role in determining which media format will be most effective.
Finally, our research can be extended by practitioners to include cost measures in comparing the relative performance across ad formats and guiding media budget allocation decisions. As of now, online advertising still appears to be more cost-effective. We anticipate, though, that once the brand-building potential of Internet formats has been firmly established, the prices for online advertising will increase to reflect their relative performance.
References
Abadie, A., Drukker, D., Herr, J. & Imbens, G. (2004). Implementing matching estimators for average treatment effects in Stata, The Stata Journal 4(3): 290–311.

Abadie, A. & Imbens, G. (2012). Bias-corrected matching estimators of average treatment effects, Journal of Business and Economic Statistics 29(1): 1–11.

Assmus, G., Farley, J. U. & Lehmann, D. (1984). How advertising affects sales: Meta-analysis of econometric results, Journal of Marketing Research 21(1): 65–74.

Briggs, R. & Hollis, N. (1997). Advertising on the web: Is there response before click-through?, Journal of Advertising Research pp. 33–45.

Dijkstra, M., Buijtels, H. & van Raaij, F. (2005). Separate and joint effects of medium type on consumer responses: A comparison of television, print, and the Internet, Journal of Business Research 58(3): 377–386.

Dreze, X. & Hussherr, F.-X. (2003). Internet advertising: Is anybody watching?, Journal of Interactive Marketing 17(4): 8–23.

Franzen, G. (1994). Advertising Effectiveness: Findings from Empirical Research, NTC Publications, Henley-on-Thames, U.K.

Goldfarb, A. & Tucker, C. (2011). Online advertising, Advances in Computers, Vol. 81, Elsevier.

Heckman, J., Ichimura, H. & Todd, P. (1997). Matching as an econometric evaluation estimator: Evidence from evaluating a job training programme, Review of Economic Studies 64: 605–654.

Hutchinson, W. & Moore, D. (1984). Issues surrounding the examination of delay effects of advertising, in T. Kinnear (ed.), Advances in Consumer Research, Vol. 11, Provo, UT: Association for Consumer Research, pp. 650–655.

Imbens, G. (2004). Nonparametric estimation of average treatment effects under exogeneity: A review, The Review of Economics and Statistics 86(1): 4–29.

Krishnan, S. & Chakravarti, D. (1999). Memory measures for pretesting advertisements: An integrative conceptual framework and a diagnostic template, Journal of Consumer Psychology 8(1): 1–37.

Lewis, R. & Reiley, D. (2011). Does retail advertising work?, Technical report, Yahoo! Research.

Lodish, L. M., Abraham, M., Kalmenson, S., Livelsberger, J., Lubetkin, B., Richardson, B. & Stevens, M. E. (1995a). How advertising works: A meta-analysis of 389 real world split cable TV advertising experiments, Journal of Marketing Research 32: 125–139.

Lodish, L. M., Abraham, M., Kalmenson, S., Livelsberger, J., Lubetkin, B., Richardson, B. & Stevens, M. E. (1995b). A summary of fifty-five in-market experimental estimates of the long-term effects of advertising, Marketing Science 14(3): G133–140.

Naik, P. & Raman, K. (2003). Understanding the impact of synergy in multimedia communications, Journal of Marketing Research 40(4): 375–388.

Rossiter, J. & Bellman, S. (2005). Marketing Communications: Theory and Applications, Pearson Education.

Wells, W. (1964). Recognition, recall and rating scales, Journal of Advertising Research 4(3): 2–8.