
© 2016 Sawtooth Software, Inc. | www.sawtoothsoftware.com

Webinar

Learning to Use the MaxDiff Typing Tool


Background on typing tools

How the naïve Bayes MaxDiff typing tool works

Practical use of Sawtooth Software’s MaxDiff typing tool

Using the software

Agenda


BACKGROUND

Section 1


We have a set of cases (e.g. respondents) with given segment membership assignments

Now we want to predict segment membership for new cases not included in the original segmentation exercise

This is a job for “supervised learning”

Unsupervised learning, e.g. cluster analysis, creates group memberships in the absence of a dependent (supervising) variable

Supervised learning identifies rules or equations that predict group membership when a supervising variable (group membership) is available

We want to apply segment assignments from a learning sample (or original sample tagged with segment membership) to a new sample of cases/respondents

The assignment


Linear-in-parameters models

Discriminant analysis

Logit

Machine learning methods

Nearest neighbor analysis

Tree-based methods

Tree ensembles (random forests)

Naïve Bayes classifiers

Other supervised learning methods (support vector machines, etc.)

Some supervised learning methods


Discriminant analysis has a categorical dependent variable (DV) and some number of independent variables (IVs), which may be metric or categorical predictors

To make a typing tool, let segment membership be the DV and search for a model in which a small number of IVs predicts segment membership well

One set of outputs is Fisher’s linear discriminant functions

One linear function per level in the dependent variable

Compute the values of these functions with data from a given new respondent

Predicted segment is the one corresponding to the function with the largest value

Discriminant analysis


Age (in years), gender (1=male, 2=female), and score on a 5-point rating scale (RS) predict membership in 3 segments

Functions

DF1: 14.3 - 0.4(Age) + 1.5(Gender) + 0.8 (RS)

DF2: -8.0 + 0.3(Age) + 0.9(Gender) - 0.4 (RS)

DF3: 1.8 + 0.1(Age) -0.1(Gender) + 0.5 (RS)

Example: 30-year-old Mr. Jones, who gives a 2 on the rating question

DF1: 14.3 - 0.4(30) + 1.5(1) + 0.8(2) = 5.4

DF2: -8.0 + 0.3(30) + 0.9(1) - 0.4(2) = 1.1

DF3: 1.8 + 0.1(30) - 0.1(1) + 0.5(2) = 5.7

DF3 has the largest value, so Mr. Jones is predicted to belong to segment 3

Discriminant analysis example
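The arithmetic on this slide can be checked with a short sketch. The coefficients are the ones shown above; the names `FUNCTIONS`, `discriminant_scores`, and `predict_segment` are illustrative assumptions, not part of any Sawtooth Software tool.

```python
# Coefficients from the slide: (intercept, age, gender, rating score).
FUNCTIONS = {
    1: (14.3, -0.4, 1.5, 0.8),
    2: (-8.0, 0.3, 0.9, -0.4),
    3: (1.8, 0.1, -0.1, 0.5),
}


def discriminant_scores(age, gender, rs):
    """Evaluate each segment's linear discriminant function for one respondent."""
    return {
        seg: b0 + b_age * age + b_gender * gender + b_rs * rs
        for seg, (b0, b_age, b_gender, b_rs) in FUNCTIONS.items()
    }


def predict_segment(age, gender, rs):
    """Return the segment whose function has the largest value."""
    scores = discriminant_scores(age, gender, rs)
    return max(scores, key=scores.get)


# Mr. Jones: 30 years old, male (gender = 1), rating of 2.
scores = discriminant_scores(30, 1, 2)
print({seg: round(val, 1) for seg, val in scores.items()})
print(predict_segment(30, 1, 2))
```

Running it prints the three function values, rounded to one decimal, followed by the number of the predicted segment.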


One can also use a polytomous multinomial logit (MNL) with segment assignment as the DV and a small number of IVs

As with discriminant analysis, assign respondent to the segment with the largest value for the linear function

With MNL you can also use values resulting from the linear functions and the logit choice rule to compute the probability of segment membership

This can help us distinguish

Core segment members

Peripheral members

“Fence sitters”

Logit
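A minimal sketch of the logit choice rule mentioned above: exponentiate each segment's linear-function value and divide by the sum. The function name and the example values are hypothetical; they are chosen only to contrast a core member with a fence sitter.

```python
import math


def membership_probabilities(values):
    """Logit choice rule: exponentiate each linear-function value and normalize."""
    top = max(values.values())  # subtract the max for numerical stability
    expd = {seg: math.exp(v - top) for seg, v in values.items()}
    total = sum(expd.values())
    return {seg: e / total for seg, e in expd.items()}


# Hypothetical linear-function values for two respondents.
core = membership_probabilities({"A": 4.0, "B": 0.5, "C": -1.0})
fence_sitter = membership_probabilities({"A": 1.2, "B": 1.1, "C": -2.0})
print(core)
print(fence_sitter)
```

The first respondent gets a probability above 0.9 for segment A, while the second is split almost evenly between A and B, which is the kind of case a hard assignment alone would hide.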


Stepwise search to find the variables that most discriminate group membership

Tree identifies the variables and the rules that classify respondents best into known segments

Now we can apply those classification rules to new cases/respondents

Classification trees


If a respondent reports “4” or higher for Q17, is over 44 years old, and reports “5” for Q8, classify as Segment B

Classification trees

Total sample: 40% A, 25% B, 35% C
    Q17 < 4: 16% A, 14% B, 70% C
        Q10 < 3: 3% A, 1% B, 92% C
        Q10 > 2: 38% A, 55% B, 7% C
    Q17 > 3: 60% A, 15% B, 25% C
        Age 45+: 22% A, 48% B, 30% C
            Q8 = 5: 10% A, 86% B, 4% C
            Q8 < 5: 28% A, 5% B, 67% C
        Age < 45: 80% A, 15% B, 5% C
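The classification rules implied by this tree can be written as a few nested conditions. This sketch assumes one reading of the slide's layout: Q17 splits first, Q10 splits the Q17 < 4 branch, age splits the Q17 > 3 branch, and Q8 splits the 45+ branch. Each leaf is assigned its largest segment. The function name `classify` is hypothetical.

```python
def classify(q17, age, q8, q10):
    """Assign the modal segment of the leaf a respondent falls into."""
    if q17 < 4:
        # Leaves under Q17 < 4: Q10 < 3 is 92% C, Q10 > 2 is 55% B.
        return "C" if q10 < 3 else "B"
    if age < 45:
        # Leaf: 80% A.
        return "A"
    # Leaves under age 45+: Q8 = 5 is 86% B, otherwise 67% C.
    return "B" if q8 == 5 else "C"


# The example rule: Q17 of 4, over 44 years old, Q8 of 5.
print(classify(q17=4, age=50, q8=5, q10=3))
```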


Grow an ensemble of partially-informed trees (use a random subset of variables for each tree and for each node in each tree)

Run each new observation through each tree in the forest

Assign respondent to the modal prediction of the forest

Random forests
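The last two steps (run a case through every tree, then take the modal prediction) can be sketched as follows. Growing the trees is not shown; the three one-rule "trees" here are hypothetical stand-ins, and `forest_predict` is an illustrative name.

```python
from collections import Counter


def forest_predict(trees, case):
    """Run one case through every tree and return the most common prediction."""
    votes = Counter(tree(case) for tree in trees)
    return votes.most_common(1)[0][0]


# Three hypothetical, deliberately simple trees, each using a different variable.
trees = [
    lambda r: "C" if r["q17"] < 4 else "A",
    lambda r: "B" if r["age"] >= 45 else "A",
    lambda r: "B" if r["q8"] == 5 else "C",
]

print(forest_predict(trees, {"q17": 4, "age": 50, "q8": 5}))
```

With an even split of votes, `Counter.most_common` returns whichever tied label was seen first, so a real forest would normally use an odd or large number of trees.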


Classifying pets

[Figure: two-dimensional map with a Friendly/Hateful axis and a Smart/Stupid axis]

Cats and dogs

[Figure: the same Friendly/Hateful by Smart/Stupid map]

Nearest neighbor analysis

[Figure: the same map with individual cases plotted as points labeled c, d, and h]

[Figure: the same plot with one additional point labeled N]
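A minimal sketch of nearest neighbor classification on a two-dimensional map like the one in these figures: find the k labeled points closest to the new point and assign the most common label among them. The coordinates, the choice of k = 3, and the function name `knn_predict` are all hypothetical.

```python
import math
from collections import Counter


def knn_predict(labeled_points, new_point, k=3):
    """Label a new point by majority vote among its k nearest labeled points."""
    nearest = sorted(labeled_points, key=lambda p: math.dist(p[0], new_point))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical (x, y) positions on the two axes, with labels c, d, and h.
points = [
    ((0.8, 0.7), "d"), ((0.7, 0.9), "d"), ((0.9, 0.5), "d"),
    ((-0.6, 0.8), "c"), ((-0.7, 0.6), "c"),
    ((-0.5, -0.7), "h"),
]

print(knn_predict(points, (0.6, 0.6)))
```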


HOW THE NAÏVE BAYES CLASSIFIER WORKS

Section 2


Naïve Bayes classifiers use conditional probabilities from Bayes’ Theorem (specifically the posterior probabilities) to identify most likely group membership from a set of input variables

It is “naïve” because it assumes that the input variables are independent of one another within each group, so their conditional probabilities can simply be multiplied together

Ideal for MaxDiff

Naïve Bayes classifiers
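In symbols, the general naïve Bayes posterior can be written as below. The notation is an assumption introduced here for illustration: s indexes segments, P(s) is the prior (for example, the segment size), and x_1, ..., x_K are the observed input variables.

```latex
P(s \mid x_1, \dots, x_K)
  = \frac{P(s) \prod_{k=1}^{K} P(x_k \mid s)}
         {\sum_{s'} P(s') \prod_{k=1}^{K} P(x_k \mid s')}
```

The predicted group is the s with the largest posterior probability.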


Our tool uses a type of Naïve Bayes to

1. Identify a small set of MaxDiff questions that do a good job of classifying respondents from an existing MaxDiff-based segmentation database into their known segments

Likely these will be questions few or no respondents actually answered

It can also use non-MaxDiff “auxiliary” variables to improve its predictions

2. Classify new respondents into those segments using the small set of MaxDiff questions

Sawtooth Software MaxDiff Typing Tool


With the utilities of existing segment members in hand we can predict how the average member in any of the segments should answer any possible MaxDiff question and any possible combination of MaxDiff questions

Here’s the Bayesian part: we can also calculate the likelihood that a respondent giving any pattern of responses to a given set of MaxDiff questions belongs to any of the segments

How it works
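A rough sketch of both ideas on this slide, under stated assumptions. The slides do not give the formula the tool uses, so this example assumes a common best-worst logit rule: the best pick is a logit choice on the utilities and the worst pick is a logit choice on the negated utilities, treated as separate choices. Segment sizes act as priors. All names and numbers are hypothetical.

```python
import math


def answer_probability(utils, items, best, worst):
    """P(best, worst | segment) for one MaxDiff set, under the assumed logit rule."""
    p_best = math.exp(utils[best]) / sum(math.exp(utils[i]) for i in items)
    p_worst = math.exp(-utils[worst]) / sum(math.exp(-utils[i]) for i in items)
    return p_best * p_worst


def segment_posteriors(segments, answers):
    """segments: {name: (size, {item: utility})}; answers: [(items, best, worst)]."""
    joint = {}
    for name, (size, utils) in segments.items():
        likelihood = size  # prior: the segment's share of the sample
        for items, best, worst in answers:
            # Naive Bayes step: multiply the per-question probabilities.
            likelihood *= answer_probability(utils, items, best, worst)
        joint[name] = likelihood
    total = sum(joint.values())
    return {name: value / total for name, value in joint.items()}


# Two hypothetical segments with opposite preferences over items 1 to 3.
segments = {
    "A": (0.5, {1: 2.0, 2: 0.0, 3: -2.0}),
    "B": (0.5, {1: -2.0, 2: 0.0, 3: 2.0}),
}
# One question showing items 1, 2, 3; respondent picks 1 as best, 3 as worst.
posteriors = segment_posteriors(segments, [((1, 2, 3), 1, 3)])
print(posteriors)
```

A respondent who picks item 1 as best and item 3 as worst matches segment A's utilities, so nearly all of the posterior probability goes to A.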


The tool searches for a subset of MaxDiff questions that correctly classifies the greatest number of existing respondents to the correct segments using a fast swapping procedure

The search procedure has a random starting point, so we typically use many (50, 100, etc.) starting points so that we can arrive at a near optimal solution

The user indicates how many questions of how many items each to include in each search

You can also choose to focus on success predicting particular segments

How it works
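The slides do not spell out the swapping procedure, so the following is only a generic sketch of a random-restart swap search, not the tool's actual algorithm. It starts from a random subset, repeatedly swaps one member for an outside candidate whenever that raises the score, and keeps the best result across many starting points. In the typing tool the candidates would be MaxDiff questions and the score would be the number of existing respondents classified correctly; here both are hypothetical stand-ins.

```python
import random


def swap_search(candidates, k, score, n_starts=50, seed=0):
    """Random-restart swapping search for the k-subset with the highest score."""
    rng = random.Random(seed)
    best_set, best_score = None, float("-inf")
    for _ in range(n_starts):
        current = rng.sample(candidates, k)  # random starting point
        current_score = score(current)
        improved = True
        while improved:
            improved = False
            for pos in range(k):
                for cand in candidates:
                    if cand in current:
                        continue
                    trial = current[:pos] + [cand] + current[pos + 1:]
                    trial_score = score(trial)
                    if trial_score > current_score:  # keep only improving swaps
                        current, current_score = trial, trial_score
                        improved = True
        if current_score > best_score:
            best_set, best_score = sorted(current), current_score
    return best_set, best_score


# Toy objective: pick the 3 candidates with the largest total value.
chosen, value = swap_search(list(range(10)), k=3, score=sum, n_starts=5)
print(chosen, value)
```

With a real classification objective the search can stall in local optima, which is why multiple starting points matter.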


Now we can ask the small set of MaxDiff questions to new respondents

New respondents will answer with one of the possible patterns of responses and we know which patterns most likely belong to which segments, so we can assign new respondents accordingly

How it works


EXPERIENCE USING THE MAXDIFF TYPING TOOL

Section 3


Percent correct predictions

[Chart: percent correct predictions (y-axis, 50% to 80%) by number of tasks (x-axis, 2 to 10), with one series for each number of items per task (2, 3, 4, 5)]


Percent correct predictions

[Chart: overall hit rate (y-axis, 35.0% to 60.0%) by number of questions (x-axis, 2 to 8), with one series each for 2, 3, 4, and 5 items]


Fiendishly clever naïve Bayes classifier

[Chart: correct classifications (y-axis, 150 to 400) by number of sets (x-axis, 2 to 10), with one series each for 2, 3, 4, and 5 items]


USING THE SOFTWARE

Section 4


Overview of Using MaxDiff Typing Tool

1. Conduct a full MaxDiff study to develop a segmentation (usually via latent class or cluster analysis)

2. Create a Typing Tool MaxDiff Questionnaire using Typing.EXE

3. Field the Typing Tool MaxDiff Questionnaire among new respondents

4. Assign new respondents to previous segments using Classifying.EXE


Step 1: Conduct a full MaxDiff study to develop a segmentation (usually via latent class or cluster analysis)

Often you have a full MaxDiff questionnaire with 12-36 items (where each item appears 2+ times)

You have developed a segmentation you like, usually via Latent Class or clustering on HB scores

You also have estimated HB scores for reporting, TURF, or other simulations


Client Wants a Typing Tool

The client really likes the segmentation scheme and wants to be able to assign respondents from new surveys to those same segments with a high degree of accuracy


Step 2: Create a Typing Tool MaxDiff Questionnaire using Typing.EXE

Input files (can create using Excel or a text editor such as Notepad or WordPad):

Typing.txt: Contains raw HB scores on all the MaxDiff items, segment membership, and (optionally) other survey variables that are highly predictive of segment membership (e.g. age, company size, intended usage)

Segmentscores.txt: Contains segment sizes and raw aggregate (pooled) logit scores for each of the segments on all the MaxDiff items

Params.txt: Contains control parameters that tell Typing.EXE what to expect in the input files and what to do


Input Files (in Notepad)


More Info on Params.txt


Launch Typing.exe!

Fun, fun! The command prompt (the DOS prompt)

Luckily, you don’t need to remember any DOS commands…

Just double-click the LaunchCommandPrompt file, type “Typing”, and press the ENTER key

(Software Demo)


Output File Gives Typing Tool

Log.txt (open with Notepad or Wordpad)

Items to show in MaxDiff typing Questionnaire (e.g. show items 23, 18, and 14 in task #1)


Step 3: Field the Typing Tool MaxDiff Questionnaire among new respondents

Remember that you can export the MaxDiff design to .CSV, modify it (to insert the typing tool questionnaire design), and re-import it into Lighthouse Studio (you’ll need to “fool” the Designer by telling it to allow designs without connectivity)

Now you’re using Lighthouse Studio’s MaxDiff questions, but with your typing tool questionnaire!


Step 4: Assign new respondents to previous segments using Classifying.EXE

Input files (can create using Excel or your favorite text editor like Notepad or Wordpad):

Respdata.txt: Contains respondent answers to the typing tool questionnaire, plus (optionally) responses to additional survey variables that could help assign people into the right segments

Segdata.txt: A file containing average segment MaxDiff scores and average segment responses to the optional survey questions used for classification


Input Files (in Notepad)

Respdata.txt, one record per respondent:

Respondent#, #Survey_Variables, #MaxDiff_Sets, SurveyVariable_Values, #Items_In_Set1, Items_in_Set1, Best_Item_Set1, Worst_Item_Set1, etc.

Segdata.txt:

Line 1: #Segments, #Survey_Variables, #MaxDiff_Items

Line 2: #Levels_for_Survey_Variables

Line 3: Segment_Size_Seg1, Survey_Variable_Probabilities_Seg1, Seg1_Logit_Scores

Line 4: Segment_Size_Seg2, Survey_Variable_Probabilities_Seg2, Seg2_Logit_Scores


Output File Gives Segment Assignments

Respclass.txt (open with Notepad or Wordpad)

Respondent#, Probability_of_Membership, Segment_Assignment


Classification Look-Up Table

Often clients don’t want to have to come back to you to run Classifying.exe each time they collect new respondents with the typing questionnaire

They want a lookup table that tells them the segment prediction given ANY possible combination of answers to the typing questionnaire

For our 4-set, 3-items-at-a-time questionnaire (plus the optional survey question with a 2-category response), each set has 3 x 2 = 6 possible best/worst pairs, so there are just 6x6x6x6x2 = 2592 possible ways that a respondent could answer the typing tool questionnaire plus the survey question


Create the Lookup Table

Create a file of 2592 “respondents” (respdata.txt) who represent all 2592 ways to answer the typing tool questionnaire plus additional survey question

Run Classifying.exe to generate the segment assignment for each of those 2592 “respondents”

Deliver classification lookup table to client
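The first step, enumerating every possible answer pattern, can be sketched as follows. The item numbers for set 1 are the example from the Log.txt slide; the other three sets, the variable names, and the function name are hypothetical. Writing the patterns out in the Respdata.txt layout is left aside, since the exact delimiter is not shown in the slides.

```python
from itertools import permutations, product

# Hypothetical typing questionnaire: 4 sets of 3 items each.
SETS = [(23, 18, 14), (5, 9, 2), (11, 7, 20), (3, 16, 12)]
SURVEY_LEVELS = (1, 2)  # one extra survey question with a 2-category response


def all_answer_patterns(sets, survey_levels):
    """Yield every (survey_answer, ((best, worst), ...)) combination."""
    # Ordered pairs of distinct items: 3 x 2 = 6 best/worst pairs per set.
    per_set = [list(permutations(items, 2)) for items in sets]
    for survey_answer in survey_levels:
        for picks in product(*per_set):
            yield survey_answer, picks


patterns = list(all_answer_patterns(SETS, SURVEY_LEVELS))
print(len(patterns))  # 6 * 6 * 6 * 6 * 2 = 2592
```

Each pattern would become one “respondent” record in the file passed to Classifying.exe.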


QUESTIONS?


Bryan Orme

President

Bryan@sawtoothsoftware.com

www.sawtoothsoftware.com

+1 801 477 4700

@sawtoothsoft

Keith Chrzan

SVP, Sawtooth Analytics

keith@sawtoothsoftware.com


Lyon, David W. (2016) Naïve Bayes Classifiers: Or, How to Classify via MaxDiff without Doing MaxDiff, paper presented at the Sawtooth Software Conference, Park City.

Orme, Bryan and Rich Johnson (2009) A Procedure for Classifying New Respondents into Existing Segments Using Maximum Difference Scaling, available at: http://www.sawtoothsoftware.com/download/techpap/typing_tools_mrmag.pdf

References
