market intelligence session 6 kerlander soup case, survey design

Post on 18-Jan-2016


Market Intelligence Session 6

Kerlander Soup Case, Survey Design

Kerlander Soup

• What situation is Kerlander facing – what motivated research?

• What Action Should Kerlander Take Based on the Research Findings?


Action Alternatives

regular creamy extra creamy

A (recommendation)

B

C

D

E

F

G* (status quo)


Focus Groups

• Appropriate use?
• Implementation?
• How they used insights?

Focus group insight

Purchase decision is a function of:

• Taste

Construct validity

Purchase decision is a function of:

• Taste (creaminess)
• Packaging
• Price
• Ingredients
• Nutrition
• Availability
• Awareness
• Label
• Shelf position
• Promotions
• Etc.

Construct validity

• Focus group: direction of bias?


Imagine they confirmed that taste was main driver of purchase…

• What do you think of taste test?


Reliability

          Time 1   Time 2   Time 3
Soup A      1        2        1
Soup B      2        1        3
Soup C      3        5        2
Soup D      4        3        5
Soup E      5        4        4


Reliability: Correlations – Spearman’s Rho

          Time 1   Time 2   Time 3
Time 1      1
Time 2               1
Time 3                        1
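The empty cells of the matrix can be filled in from the rank table. A minimal sketch, assuming the ranks contain no ties (true for the five soups as tabulated), so the shortcut formula rho = 1 - 6·Σd² / (n(n² - 1)) applies. The function name `spearman_rho` is illustrative, not part of the case.

```python
from itertools import combinations

# Ranks of soups A-E at each tasting, copied from the reliability table.
ranks = {
    "Time 1": [1, 2, 3, 4, 5],
    "Time 2": [2, 1, 5, 3, 4],
    "Time 3": [1, 3, 2, 5, 4],
}

def spearman_rho(x, y):
    """Spearman's rho for two untied rankings of the same items."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(x, y))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Print the lower triangle of the correlation matrix, one pair per line.
for first, second in combinations(ranks, 2):
    rho = spearman_rho(ranks[first], ranks[second])
    print(f"{first} vs {second}: {rho:.2f}")
```

For this table the three coefficients come out at 0.60 (Time 1 vs Time 2), 0.80 (Time 1 vs Time 3) and 0.10 (Time 2 vs Time 3).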

What would this indicate?

          Time 1   Time 2   Time 3
Time 1      1
Time 2      0.85     1
Time 3      0.14     0.08     1

Burnout: Times 1 and 2 agree (0.85), but Time 3 correlates with neither, which suggests respondents had tired by the third tasting.

What would this indicate?

          Time 1   Time 2   Time 3
Time 1      1
Time 2      0.04     1
Time 3      0.11     0.91     1

Learning: Time 1 correlates with neither later tasting, while Times 2 and 3 agree (0.91), which suggests respondents were still learning the task during the first tasting.

Reliability

• Missed opportunity to test reliability of data using correlations

• Because of the 15 tastings, there is reason to suspect the data may not be reliable!

Imagine we know data are reliable…

• How do we decide which option to recommend? What stats should we look at?

Preference Data: Descriptive Stats

Taste Test      Mean Rank
KerRegular        3.40
FishDelight       2.75
KerCreamy         2.40
Cape Cod          2.90
KerExtra          3.55

Preference Data: Descriptive Stats

                Mean    Median   Mode
KerRegular      3.40     4.5      5
FishDelight     2.75     2.0      2
KerCreamy       2.40     3.0      3
Cape Cod        2.90     3.0      4
KerExtra        3.55     4.5      5

Any problems with these?

Do segments exist?

[Figure: preference plots for Kerlander Regular, Kerlander Creamy, and Kerlander Extra Creamy]

Totally Disaggregate Choice Modeling

• Alternative to central tendency (useful when segments exist)

• Taking 1 respondent at a time, what would they choose in a given context (with certain options on market)?

• Forecast market share from there
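The three steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the case's own data: the four respondents and the helper names `first_choice` and `simulate_shares` are hypothetical, and each respondent is assumed to buy whichever available product they ranked best (1 = most preferred).

```python
from collections import Counter

# Hypothetical rank data, one dict per respondent (1 = most preferred).
respondents = [
    {"KR": 1, "FD": 2, "KC": 3, "CC": 4, "KEC": 5},
    {"KR": 5, "FD": 4, "KC": 3, "CC": 2, "KEC": 1},
    {"KR": 4, "FD": 1, "KC": 2, "CC": 3, "KEC": 5},
    {"KR": 5, "FD": 3, "KC": 2, "CC": 4, "KEC": 1},
]

def first_choice(ranks, on_market):
    """Product this respondent would pick among those on the market."""
    return min(on_market, key=lambda product: ranks[product])

def simulate_shares(respondents, on_market):
    """Forecast share (in percent) of each product in one scenario."""
    counts = Counter(first_choice(r, on_market) for r in respondents)
    n = len(respondents)
    return {product: 100 * counts[product] / n for product in on_market}

# Scenario with only the regular soup versus one that adds the creamy soup.
status_quo = simulate_shares(respondents, ["KR", "FD", "CC"])
with_creamy = simulate_shares(respondents, ["KR", "FD", "KC", "CC"])
print(status_quo)
print(with_creamy)
```

Because each respondent is handled separately, a polarized sample (some love a soup, some hate it) shows up directly in the shares instead of being averaged away.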


How to do it by hand

Summary Share Simulation Results

Scenario      Kerlander   Fisherman's   Kerlander   Cape   Kerlander      Kerlander
              Regular     Delight       Creamy      Cod    Extra Creamy   Total
A                -           40            25        35        -             25
B                -           55            -         20        25            25
C                -           40            25        10        25            50
D                30          10            25        35        -             55
E                30          25            -         20        25            55
F                30          10            25        10        25            80
G (current)      30          25            -         45        -             30

How to decide between options?

• Cannibalization?
• Can we introduce all 3?
• If disaggregate choice modeling indicates a tie, what should you do?

External validity

• How much confidence do you have in using these data to make the recommendation for Kerlander?

• What, if anything, can you do to test whether confidence in these data is warranted?

External validity

• Can you use the model to forecast the current (known) market shares?
• Additional information:
  – Current market share: Kerlander (45%); Fisherman’s Delight (10%); Cape Cod (45%)
• Preference data: impute the brand chosen by each subject in a three-brand market (KR, FD, CC) by looking to see which of the three is highest ranked.
• Example three-brand imputed purchase:

            KR   FD   KC   CC   KEC
  Subj 1     1    2    3    4    5    -- buy KR
  Subj 4     5    4    3    2    1    -- buy CC

• Of the subjects shown:
  – KR 6/20 (30%)
  – FD 5/20 (25%)
  – CC 9/20 (45%)

External validity

            Observed Frequency    Expected Frequency
Ker Reg         6 (30%)           (45%) × 20 = 9
Fish Del        5 (25%)           (10%) × 20 = 2
Cape Cod        9 (45%)           (45%) × 20 = 9

Null hypothesis? Which statistic?

External validity

Chi-square p < .05. Implication?
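A minimal sketch of the goodness-of-fit calculation for the 20 subjects tabulated in the observed/expected table, assuming the null hypothesis is that imputed choices follow the current market shares. With exactly 2 degrees of freedom the chi-square upper-tail probability has the closed form exp(-x/2), so no statistics library is needed. For these 20 subjects alone the statistic is 5.5 and p is about 0.064, just short of the .05 threshold, so the p < .05 on the slide presumably reflects the full sample rather than this subset; that reading is an assumption. The expected count of 2 for Fisherman's Delight is also below the usual rule-of-thumb minimum of 5.

```python
import math

# Imputed first choices (observed) and known market shares, from the slides.
observed = {"Ker Reg": 6, "Fish Del": 5, "Cape Cod": 9}
market_share = {"Ker Reg": 0.45, "Fish Del": 0.10, "Cape Cod": 0.45}

n = sum(observed.values())
expected = {brand: market_share[brand] * n for brand in observed}

# Pearson chi-square statistic: sum of (observed - expected)^2 / expected.
chi_sq = sum(
    (observed[brand] - expected[brand]) ** 2 / expected[brand]
    for brand in observed
)
df = len(observed) - 1

# Closed form valid only for df == 2: P(X > x) = exp(-x / 2).
assert df == 2
p_value = math.exp(-chi_sq / 2)

print(f"chi-square = {chi_sq:.2f}, df = {df}, p = {p_value:.3f}")
```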

Direction of bias?

• Can we make sense of it?
  – Distribution
  – Price
  – Awareness

Kerlander Soup Takeaways

• Levels of measurement:
  – Can’t take mean of ordinal data
• Measures of central tendency are misleading when segments exist
  – Disaggregate Choice Modeling Approach
• Examine cues to data quality
  – Reliability: repeatability, consistent results (missed opportunity to look at correlations)
  – External Validity: generalizability to larger population (does model forecast current mkt share?)
  – Construct Validity: measures what it purports to measure (soup pref not just based on taste)
• Why are forecasts not accurate?
  – Distribution
  – Price
  – Awareness

Survey Research, Measurement

Representative sampling: The beginning

• 1916: Literary Digest starts to poll Americans to predict voting behavior

• Becomes the go-to source for presidential and other election predictions

1936 presidential election

Prediction
• Landon: 57%
• Roosevelt: 43%

They were confident: “if past experience is a criterion, the country will know to within a fraction of 1% the actual popular vote of 40 Million voters”

Results
• Landon: 38%
• Roosevelt: 62%

Prediction error: 19 percentage points

1936 Presidential Election

• Sample size: 2.4 million
• Sampling method: created mailing list of 10 million names (1 of 4 voters) pulled from:
  – telephone directory, lists of magazine subscribers, rosters of clubs and associations, automobile registry
  – Sent mock ballot to return to magazine

1936 Presidential Election

• Sampling bias
  – Selection bias: sample list slanted toward middle and upper class voters.
  – Nonresponse bias: 2.4 out of 10 million responses (1/4). People who respond to surveys are different from those who don’t

But as one star falls, another rises

• Gallup also did a poll in 1936
• Sample size: 5,000
• Sampling method: representative sample
• Prediction:
  – Roosevelt 56%; Landon 44%
• Introduced modern era of public opinion polls
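The contrast between the two polls can be made concrete with the textbook margin-of-error formula. A minimal sketch, assuming simple random sampling (which neither poll actually used), the worst case p = 0.5, and a 95% confidence level; the sample sizes are the ones quoted on the slides and the function name `margin_of_error` is illustrative.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of a 95% confidence interval for a proportion."""
    return z * math.sqrt(p * (1 - p) / n)

# Sample sizes as quoted on the slides.
digest = margin_of_error(2_400_000)
gallup = margin_of_error(5_000)

print(f"Literary Digest: +/- {100 * digest:.2f} percentage points")
print(f"Gallup:          +/- {100 * gallup:.2f} percentage points")
```

Random sampling error for 2.4 million responses would be well under a tenth of a percentage point, so a 19-point miss cannot be explained by sample size. It points to bias in who was sampled and who responded, which a larger sample does not remove.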

But 12 years later…

Common Pitfalls when writing questions

• Complex questions
• Ambiguous questions
• Leading questions
• Loaded questions
• Double-barreled questions

Complex questions

• Vague
• Lacks context
• Use simple, ordinary words and wording
• No jargon
• Literacy
  – Reading level of average American: 7th-8th grade

Complex

• Vague
• Lacks context
• Whose point of view?
• Jargon

Ambiguous questions

• Avoid words or phrases with multiple meanings
• Specify the context of the question
• Watch for similar spellings or pronunciations of key words
• Be direct about what you're asking
• Back translation

Leading questions

• Leading the respondent to a particular answer
• Introduces bias
• There should not be a “right” or “wrong” answer
• Data will be unreliable

VW example


Loaded questions

• Emotionally charged
• Heated topic
• Do: make all answers equally acceptable
• Don’t: induce social pressure

Double-barreled questions

• Asking 2 questions at once


Key to success in survey writing

The ability to anticipate:
1. The cognitive processes of the respondent
2. The analytical work that might be performed on responses
3. The “so what” (decision value) attached to statistical results
4. The likely distribution of responses under different wording conditions

How often do you exercise?

A.
• 0 times a week
• 1 time a week
• 2 times a week
• 3 or more times a week

B.
• 0-1 times a week
• 2-3 times a week
• 4-5 times a week
• 6 or more times a week

A. How much would you be willing to pay for an ice cream maker?
1. $0-10
2. $11-20
3. $21-30
4. $31-40
5. more than $40

B. How much would you be willing to pay for an ice cream maker?
1. $0-25
2. $26-50
3. $51-75
4. $76-100
5. more than $100

A. I always recycle

totally disagree                   totally agree
       1       2       3       4       5

B. I sometimes recycle

totally disagree                   totally agree
       1       2       3       4       5

C. I never recycle

totally disagree                   totally agree
       1       2       3       4       5

Creating the Survey: Question Order

• Put difficult or sensitive questions well into the interview
• Demographics questions typically last
• Usually funnel questions general to specific
  – Use product category?
  – Use Brand X?
  – Do you like Brand X?
  – Why?

Creating the Survey: Types of Scales

• Some Examples
  – Likert scale (Agree-Disagree)
  – Other rating scales
  – Semantic Differential (opposites)
  – Rankings
  – Constant sum
  – Purchase Intent

Likert scale (agree-disagree)

• Ask respondents the extent to which they agree or disagree with a statement (usually a 5 or 7 point scale)

Now we would like to find out your impressions about Colgate Combo. Please indicate your opinion below.

                Strongly              Neither Agree              Strongly
                Disagree              Nor Disagree               Agree
Expensive          1          2            3            4          5
Convenient         1          2            3            4          5
High Quality       1          2            3            4          5
Appealing          1          2            3            4          5

Can also be on sliding scale for more gradations

Other rating scales

• Can have 5 or 7 point scales other than agreement:

Not at all important                   Important
       1       2       3       4       5

Other option:

Very unimportant                   Very important
      -2      -1       0       1       2

Unipolar vs. bipolar scales?

• In general, unipolar usually better than bipolar
  – Less mentally taxing; only have to consider 1 attribute instead of 2
  – More streamlined with fewer choices (5-7 vs. 11)
  – Many bipolar scales actually only measure 1 dimension. Ex: “Not at all important” vs. “very unimportant”
• Exception: semantic differential

Semantic Differential (opposites)

• Typically bipolar adjectives, endpoints only are labeled
• Typically 7 or 11 point scale (usually coded -3 to 3, -5 to 5)
• Typically treated as interval scales

This website looks:

boring      -3  -2  -1  0  1  2  3   fun
amateur     -3  -2  -1  0  1  2  3   professional
complex     -3  -2  -1  0  1  2  3   simple

Semantic differential vs Likert?

• SDS advantages
  – Can tap specific emotional responses
  – 1 question with multiple scales, less to read
• SDS drawbacks
  – Not always easy to find pairs (opposite of fun?)
  – Takes longer to answer
    • 2 steps: (1) direction (+ or -), (2) magnitude
    • Scale endpoints change each time so need to recalibrate
    • More #s/options so harder to choose
  – May strongly think not A but not B, so they mark midpoint even though they’re extreme on A

Pictorial Scale


Rankings

Constant sum

• Often used to measure importance, e.g., allocate 100 points across various features based on importance
• Generally do not want to have more than 5-7 features or the allocation process gets too hard.

Example: How important are the following attributes to you in choosing a laptop computer? Please allocate 20 points across the various features

_____ warranty
_____ battery life
_____ screen size
_____ processing speed
_____ price
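A constant-sum answer is only usable if the points add up to the stated total, so responses are typically checked before analysis. A minimal sketch, assuming the 20-point laptop question above; the respondent's allocation and the helper name `check_constant_sum` are hypothetical.

```python
def check_constant_sum(allocation, total):
    """Return each feature's share of the points, or raise if they do not add up."""
    allocated = sum(allocation.values())
    if allocated != total:
        raise ValueError(f"allocated {allocated} points, expected {total}")
    return {feature: points / total for feature, points in allocation.items()}

# Hypothetical answer to the laptop question (20 points to allocate).
answer = {
    "warranty": 2,
    "battery life": 6,
    "screen size": 3,
    "processing speed": 4,
    "price": 5,
}

shares = check_constant_sum(answer, total=20)
for feature, share in shares.items():
    print(f"{feature}: {share:.0%}")
```

Converting points to shares makes answers comparable across surveys that use different totals (20 points here, 100 points in the general description).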

Purchase intent

• Statement of likelihood of making a future purchase

• Example: “How likely are you to purchase Colgate Combo in the future?”

Not at all likely                                  Very likely
       1       2       3       4       5       6       7

Quadrant Analysis

• Useful when liking/preference for company, brand, is based on various attributes.
• Gives you a big-picture view of your strengths and weaknesses
• Follow up with more specific quant analysis
• Steps
  – Generate list of key attributes from focus groups, previous surveys, etc.
  – Ask consumers 2 questions on survey (rating scales so you can take the mean)
    • Rate importance of attributes (importance)
    • Rate company/brand on attributes (evaluation)

Importance

• How important is each of the following financial service attributes to you?

not at all                                         extremely
     1       2       3       4       5       6       7

______ Flexible business hours
______ Simple paperwork
______ Flexible payment plans

Evaluation

• Please rate Merrill Lynch on each of the listed attributes using the scale below.

poor                                               excellent
     1       2       3       4       5       6       7

______ Flexible business hours
______ Simple paperwork
______ Flexible payment plans

Quadrant Analysis

• Steps (continued)
  – Take the means (unless segments?)
  – Label axes and plot attributes in 2 dimensional space

[Figure: Financial Services Attributes. Importance (low to high) plotted against mean performance rating (1 poor, 2 fair, 3 good, 4 v. good, 5 excellent). Attributes plotted: flexible payment plans; accuracy of product info; ease of scheduling an appt; ability of consultant to answer questions; simple paperwork; flexible business hours; convenient office locations; ability to obtain product info.]

Importance and Performance Evaluation for one brand.

[Figure: Quadrant Analysis. Importance (low to high) plotted against performance (poor to strong). Quadrant labels: “Areas for Improvement”, “Lowest Priority”, “Possible Overkill”, “Keep up the Good Work”.]
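Once the means are taken, assigning each attribute to a quadrant is a pair of threshold comparisons. A minimal sketch under stated assumptions: the mean ratings are hypothetical, the cut points sit at the midpoint of a 1-7 scale (analysts often use the grand means instead), and the mapping of labels to corners follows the standard importance-performance layout, since the slide residue does not show which label sits where.

```python
def quadrant(importance, performance, importance_cut, performance_cut):
    """Name the quadrant an attribute falls into."""
    high_importance = importance >= importance_cut
    strong = performance >= performance_cut
    if high_importance:
        return "Keep up the Good Work" if strong else "Areas for Improvement"
    return "Possible Overkill" if strong else "Lowest Priority"

# Hypothetical mean ratings: (importance, performance), both on 1-7 scales.
means = {
    "Flexible business hours": (3.1, 5.8),
    "Simple paperwork": (6.2, 2.9),
    "Flexible payment plans": (5.9, 6.1),
}

CUT = 4.0  # assumed cut point: midpoint of the 1-7 scale
labels = {
    attribute: quadrant(imp, perf, CUT, CUT)
    for attribute, (imp, perf) in means.items()
}
for attribute, label in labels.items():
    print(f"{attribute}: {label}")
```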

Single Brand Evaluation

Have to be careful, though: if there is a minimum level that customers expect, it could be a priority.

What else to know before making changes?

• Perceptions versus reality?
  – If perceptions, creating better awareness is key
• Cost of improving different attributes?
• Possibility of changing importance instead?

Competitive analysis variant

• Comparing your brand to 1 competitor
• Main difference: Evaluation ratings will be relative
  – Need to take difference scores (focal brand – competitor)
  – X axis will be negative (left) to positive (right), 0 (no difference) in middle

[Figure: Quadrant Analysis: Competitive Analysis. Importance (low to high) plotted against performance relative to the competitor (significantly worse to significantly better). Attributes plotted: freq of communication; total time to resolve problem; technician’s knowledge of my needs; qual of replacement parts; efficiency of service call handling; effectiveness of customer training; response time; timeliness of invoicing for services.]

Performance Evaluation is the difference between one brand and another brand.

[Figure: Competitive Analysis Variant. Same attributes, with the horizontal axis marked “significantly worse than competitor”, “at parity”, and “significantly better than competitor”, and regions labeled “priorities for improvement”, “pre-emption opportunities”, and “competitive strengths”.]

• How important to you is each of the following car attributes?

not at all                                         extremely
     1       2       3       4       5       6       7

Sporty Styling    _______   _______
Handling          _______   _______
Cost              _______   _______
Comfort           _______   _______
Sound System      _______   _______

• Please rate the following vehicles on each of the listed attributes using the scale below.

poor                                               excellent
     1       2       3       4       5       6       7

                  Toyota Camry   Chevy Corvette
Sporty Styling      _______         _______
Handling            _______         _______
Cost                _______         _______
Comfort             _______         _______
Sound System        _______         _______

• Note: you would take the means, but let’s plot one person’s data

[Figure: Quadrant Analysis: Competitive Analysis. Importance (low to high) plotted against performance relative to the competitor (significantly worse to significantly better), for one respondent.]

Performance Evaluation is the difference between one brand and another brand.
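The plotting coordinates for one respondent follow directly from the two survey questions. A minimal sketch with hypothetical ratings, assuming the Toyota Camry is the focal brand and the Chevy Corvette the competitor (the slides do not say which is which): the x value is the difference score and the y value is the stated importance.

```python
# Hypothetical ratings from one respondent, all on 1-7 scales.
importance = {"Sporty Styling": 6, "Handling": 5, "Cost": 7, "Comfort": 4, "Sound System": 2}
camry = {"Sporty Styling": 3, "Handling": 4, "Cost": 6, "Comfort": 6, "Sound System": 4}
corvette = {"Sporty Styling": 7, "Handling": 6, "Cost": 2, "Comfort": 3, "Sound System": 4}

# Difference score = focal brand minus competitor (Camry assumed focal).
# Negative values plot left of centre (worse), positive values right (better).
points = {
    attribute: (camry[attribute] - corvette[attribute], importance[attribute])
    for attribute in importance
}

for attribute, (difference, weight) in points.items():
    print(f"{attribute}: difference {difference:+d}, importance {weight}")
```

With survey data from many respondents, the same subtraction would be applied to the mean ratings rather than to one person's answers.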
