
The game experiments - Researching how gaming techniques can be used to improve the quality of feedback from online research

Jon Puleston, GMI

Deborah Sleep, Engage Research

ESOMAR Best Methodological Paper, Congress, Amsterdam, September 2011

BACKGROUND

Over the past four years, GMI and Engage Research have been on a journey together, exploring ways of improving the feedback from online surveys by making them more engaging for respondents. Over this period we have conducted over 100 research-on-research experiments with a range of end-user clients and explored a wide variety of techniques.

Over the course of these experiments we learnt how important the design and ergonomic flow of surveys was, and how dropout could be reduced by making surveys more engaging consumer experiences. We learnt how valuable imagery was in online surveys as a means to stimulate the imagination, trigger the memory and encourage more enthusiastic participation in a survey, especially when there was an element of humour to the imagery used. We looked at how we could improve the language used in surveys, and saw how, when it was more friendly in tone, this could be used to develop a better relationship with respondents and encourage more feedback.

We applied thinking borrowed from the world of NLP, social psychology and qualitative research, and discovered how dramatic an impact such techniques as projection and imaginary scenario planning could have on the willingness of respondents to complete survey tasks, and on the effort they were prepared to put into them. We also tested a number of more creative questioning techniques, and found that when we made them more game-like they could improve click count, reduce straight-lining effects and improve respondent satisfaction levels.

We began to notice the impact that humour in general could play in surveys. Segueing questions with a humorous animation effect seemed to re-set respondents' concentration levels and improve feedback to follow-on questions.


FIGURE 1

In fact we observed that introducing any fun or game-based mechanic tended to provoke very positive reactions from survey respondents.

For example, in one experiment, the use of the phrase "we challenge you" stimulated a three-fold increase in the number of ads being recalled. In another, applying a two-minute time limit resulted in ten times as much feedback as the previous version; while in a third, incorporating the words "can you guess" extended the time respondents spent considering a question from ten seconds to two minutes. Even small changes to question design, such as having emoticon stamps appear when the answer was clicked, seemed to encourage greater consideration in response choice.

When we asked respondents why they had written more and spent more time thinking when answering these types of questions, the answer was simple: "it was more fun!"

What we were slowly discovering was the impact that gamification could have on improving response levels to surveys.

Moving into the game space with our thinking

As a result of this discovery, we decided last year to explore the idea of game-play in greater depth. We examined the theory behind game-play, and looked at how it was being used in other fields, with the aim of discovering how we could integrate this thinking more effectively into our surveys.

The concept of "gamification", we quickly discovered, was already sweeping across the marketing communication industry, and being discussed in marketing departments, advertising agencies and even governments around the globe. We found countless examples of how it was being applied to encourage compliance and active participation in all manner of different activities.

Market research, we felt, had great potential to successfully incorporate this concept, because many surveys could already be seen as games, just boring, badly designed ones.

THE GAME EXPERIMENTS

To begin our research, we analyzed what made a game, and conducted a number of small-scale experiments to see how we could make different survey questions more game-like in nature.

A game is essentially any thinking activity that we do for fun. There is often very little that intrinsically differentiates game activity from real work. The difference tends to lie in either the application of an imaginary framework, or the application of often arbitrary rules.

Take, for example, "Golf the game" vs. "Golf the task".

Golf as a task is to carry around a 10kg bag of sticks for 6km, whilst hitting a ball into holes in the ground along the route. What transforms it into a game is the otherwise arbitrary challenge of getting the ball into the holes in the minimum number of touches.

This explains why Twitter raised itself above all the other ten-a-penny forums on the block. By applying an abstract rule that only 140 characters were allowed in a post, it was instantly transformed into a game.

Rules and game-play mechanics can be thought of as enthusiasm enzymes, the little twists that make the ordinary extraordinary and therefore more fun.

Experiment 1: Applying an imaginary framework to a question to turn it into a game

We discovered this thinking could be very easily applied to re-engineer simple questions.

The question "what is your favourite meal?" was transformed into a game by placing it in a more imaginary personal framework that made the question more fun to answer: "imagine you are on death row and have to choose your last meal."

Question = the typical answer was three words, e.g. "Steak & chips"

Game = we got paragraphs, like this: "Scallops with black pudding and cream, rib eye steak with chips and a dolce latte cream sauce; stinking bishop (cheese) with 1960 port (year of my birth). Wine would have to be Chateau Lafite 1st Cru Pauillac 2000. I would skip pudding course, I would not want indigestion!"

Gamifying the question by this simple shift in emphasis resulted in more focused respondents and far richer data. We more than quadrupled the average word-count.

Experiment 2: Applying rules to questions to turn them into a game

Adding abstract rules to a question can have the same effect. For example, rather than asking people to describe themselves in a question phrased like this, "How would you describe yourself?", we asked them the question with a word restriction rule and a projective twist: "In exactly seven words, how would your friends describe you?"

Question = Eighty-two percent gave an answer, average 2.4 descriptors

Game = Ninety-eight percent gave an answer, average 4.5 descriptors

The addition of the rule made this boring task more fun, and also encouraged more thought about the answer.

These two simple tricks, applied with sufficient imagination, could be used to transform just about any question in an online survey. But we were only just beginning.

Experiment 3: Turning questions into mini-quests

 


Taking things further, we looked at the evolutionary roots of game-play, and learnt how our motivation to play games is closely linked to our hunter-gatherer heritage. Games were an important part of honing and refining skills used in hunting and foraging quests. These quests could last several days, requiring extended periods of concentration. To accommodate this, the brain developed ways of suppressing other demands on it, such as hunger, pain and tiredness. Time-references were also dulled.

Many of the most successful games tap into this kind of mind-state, to the point where they can become dangerously addictive. World of Warcraft is a good example, where players can spend days doing menial tasks just to gain points to make battles marginally easier.

One of the main challenges in online research is working out how to extend respondents' concentration spans, so as to maintain data quality through lengthy surveys.

We thought one solution might be to turn questions more into "quests". Again, we found this could be done with a simple twist.

As an example, a question might be worded: "How much do you like each of the following music artists?"

The quest version might be as follows: "Imagine you are in charge of your own private radio station, where the DJs play just the music you like. You will be shown a series of artists, and we want you to build up a play-list by deciding how much each artist should be played."

Question = 83 artists evaluated

Game = 148 artists evaluated

Essentially this is the same task — to identify favourite artists — but this has now become a quest, and in "quest mode" we found respondents evaluated roughly twice as many artists as when the question was phrased more traditionally. The data also showed a halving of the incidence of neutral or uncertain answers, this being another indirect measure of focus.

Turning a question into a quest is essentially about applying an imaginary purpose to the task, one that makes it more important, fun and involving for the respondent.

A further example of this approach comes from an early experiment. One group of respondents were shown an ad and asked simply what they thought of it. A second were presented with this scenario:

"Imagine you work for an advertising agency. One of your key client's rival brands has just released a new ad, and you are about to see a sneak preview before anyone else. You have to report back to the agency what you thought of it."

Question = Average 14 words per respondent

Game = Average 53 words per respondent, and over 85% of respondents wrote over 20 words

Experiment 4: Adding competitive elements

Deriving similarly from our distant past, in which only the fittest survived, is the near-Pavlovian way we respond to opportunities and challenges.

Knowing how competitive games tap into this, we explored how we could add a more competitive framework to questions. We had already observed the power of adding the phrase "we challenge you …" In the same vein, we experimented with time limits, such as telling respondents they had only two minutes to answer.

So, taking the basic question, "name all your favourite foods", we changed this by incorporating a quest-type scenario with a time constraint, thus: "You have an opportunity to go to a supermarket with an unlimited budget and buy all your favourite foods. The catch is, you only have two minutes."

The number of foods listed increased from an average of seven items, when asked as a simple question, to over thirty when in this competitive quest mode.

Question = Six food types listed

Game = 35 food types listed

What was interesting about this experiment was that although no actual time constraint or conditionality was placed on this question, over 90% of respondents in game mode listed over 20 foods, underlining how universal this instinct to play games is.

Experiment 5: Guess work

Another simple twist that adds a competitive game element to a question is to ask respondents to guess answers.

For example, instead of asking which brands of deodorant they recall, the question can be worded: "How many brands of deodorant can you guess?"

Question = Two brands, 15 seconds thought

Game = Five guesses, two minutes thought

This is a technique that we have now used successfully in a number of surveys as a replacement for top-of-mind brand awareness questions.

Experiment 6: Scenario planning game technique

Another skill that evolution has primed us to develop, and one we play games to practice, is scenario planning. The ability to predict the outcome of different situations can be said to separate man from the rest of the animal kingdom.

We found ways of asking questions that tapped into respondents' natural willingness to scenario-plan. For example, a typical question might be "Which of these characteristics would you use to describe this brand?" As a scenario planning game, the wording could be reframed something like this: "We would like you to try and turn some brands into people using these human characteristics."

Question = Three characteristics selected and 12 seconds spent on task

Game = Six characteristics selected and 43 seconds spent on task

All we are doing here is asking people to use their imagination more, which is the basis of so many simple games we play.

 


Asking people to do more thinking

These examples lead to a more general point about questioning technique. A great many surveys are built around asking questions that require only simple, low-level thought, such as "What is your favourite colour?" or "How much do you agree or disagree with this statement?"

Most games are much more intellectually involving than this, and indeed, the more involving they are, the more popular they tend to be. Word-searches, chess, Scrabble and The Times crossword all have many adherents, and all require far more advanced mental effort than answering the average survey question.

It is our contention that we are simply not asking respondents to do enough interesting thinking in a typical survey.

We have found projective, predictive and imagination-based questions to be highly successful, and they all encourage greater thought. So, instead of asking respondents their favourite colour, they could be asked to design a colour scheme for their room from a palette of colours, or to predict the colour they think would work best in a particular situation, or to guess which colour other people might like. All these involve more interesting thinking, and, as a result, are perhaps indirectly more "game-like".

Experiment 7: The pizza experiment

A classic example of the reliance on low-level thought is conjoint analysis, where respondents are asked a repetitive set of questions about which choices they prefer. The aim of this is to work out which features of a product are the most important to consumers, but anyone who has ever had to undergo this experience knows that you go blind to the choices after the first few repetitions, and it is frankly dull as dishwater for respondents to do!

We thought about how we could change this into a more game-like process, and this resulted in our first proper game, which we called "Evolution: Survival of the fittest product".

In traditional conjoint analysis, respondents are asked to pick out the features of the product that are important to them, and are then presented with a series of interlocked choices, with different features turned on or off.

In our game, we set up a quest for the respondents and asked them to take part in a series of three surveys, telling them that they were about to open their own pizza restaurant, and that they had to design some pizzas they thought would sell well.

In the first part we gave them a fixed range of ingredients, and also asked them to come up with a name for their pizza that would be popular. To add a competitive element, we told them that the pizza they designed would go into a virtual restaurant, where they and other respondents could go and choose those they would like to eat, making it a mini-competition to see who could design the best pizza.

The next part, the virtual restaurant stage, was effectively a conjoint comparison process, where the competing combinations had been intelligently designed by the respondents themselves.

The results showed a massive difference in the way the second, "conjoint-style" stage was viewed. Conjoint-style question batteries typically score three out of ten in respondent satisfaction data, due to boredom, but "gamifying" it made it exciting for the respondents, since they now had something at stake in the process. Over 90% of those who designed a pizza returned to participate in the second-stage evaluation.

 


In the third stage we wanted to see how much more intellectual "work" respondents were prepared to do in this game mindset. So we analysed the results, as in traditional conjoint, to find which ingredients and features had the most influence on selection. This kind of information is normally only fed back to clients, but in this next stage of the experiment, we shared it with respondents. We debriefed them in detail on what was the most popular pizza, and which factors appeared to influence this, and then asked them to go away and design a better one — in evolutionary terms, a second-generation mutant variant that they thought would do better than the first, hence the name of the game.
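As an illustration of this analysis step, here is a minimal sketch in Python of one simple way to estimate ingredient influence from choice data; the data structure and the chosen-versus-shown ratio are our illustrative assumptions, not the actual model used in the experiment:

    from collections import Counter

    # Each trial records the pizzas shown (as ingredient sets) and which one
    # the respondent picked in the virtual restaurant.
    trials = [
        {"shown": [{"pepperoni", "mushroom"}, {"ham", "pineapple"}], "chosen": 0},
        {"shown": [{"pepperoni", "olive"}, {"mushroom", "ham"}], "chosen": 0},
        {"shown": [{"ham", "pineapple"}, {"pepperoni", "mushroom"}], "chosen": 1},
    ]

    shown, chosen = Counter(), Counter()
    for t in trials:
        for pizza in t["shown"]:
            shown.update(pizza)                  # times each ingredient was offered
        chosen.update(t["shown"][t["chosen"]])   # times it was on the winning pizza

    # "Influence" here is simply the share of offers that ended in a win.
    for ingredient in sorted(shown):
        print(f"{ingredient:10s} chosen {chosen[ingredient] / shown[ingredient]:.0%} of the time")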

These were then tested against the original pizzas using a completely different set of respondents, to see which were most popular. We asked them to imagine they were setting up a restaurant and pick the pizzas they wanted to put on their menu.

Results = Respondents wanted to put 52% of the second-generation pizzas created on their menu vs. 34% of the first-generation pizzas.

Of the 100 people who agreed to take part in this experiment, 87 went on to take all three surveys. This compares with the less than 50% who would typically complete all of a three-survey series, and the feedback was overwhelmingly positive. All bar one respondent said they would like to do more surveys like this.

A typical comment: "I loved it, the whole thing was fun to do (though my diet was nearly ruined as it made me hungry!). It was challenging at times but really made me think and that is never a bad thing. Thank you!"

What we hope this experiment has demonstrated is that by making a task more fun, competitive and intellectually more challenging, you can be rewarded with much higher levels of loyalty in participation, and much richer and more valuable feedback.

Adding in reward mechanics

At the heart of most games are mechanisms for rewarding participants, and point systems for marking achievement. We next looked at how we could incorporate these into surveys.

Most survey technology is not built around points systems, and building these into a survey using standard logic features is extremely time-consuming, so we first had to re-engineer our survey technology – which we will discuss in a later part of this paper. We built several specialist game-style question formats to record points and display respondent scores.

With these in place, we played around with predictive questioning games, where respondents were rewarded with points for predicting the correct answer, and market trading games, where they were rewarded if other people agreed with their opinions.

Experiment 8: Point scoring game

As an example of the first type, we set up a simple experiment where we asked people to predict the most popular emotions respondents as a whole would associate with different situations and different brands, winning points for making the correct predictions. We then compared their answers to a standard version of the same question, in which they were asked which emotional associations they personally felt.

Question = Average answer time per option 8 seconds

Game = Average answer time per option 12 seconds

 


With this reward mechanism in place, respondents spent upwards of 50% more time making their decisions, compared to the standard question. As a standard, non-mandatory question, respondents would select two or three word associations on average before giving up, whereas with the predictive game, we found they wanted to click on as many choices as they were allowed until they got the right answers, happily making upwards of ten selections given the chance. At the same time, their enjoyment rating rocketed, with 90% saying they enjoyed answering the predictive set of questions, compared with 50% for the standard version.
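A minimal sketch of how such a prediction-scoring rule could work, assuming (our assumption, since the paper does not specify the rule) that a pick earns a point when it falls among the modal answers of the sample:

    from collections import Counter

    def score_predictions(picks, sample_answers, top_n=3):
        """Award one point per pick that is among the top_n most popular answers."""
        top = {emotion for emotion, _ in Counter(sample_answers).most_common(top_n)}
        return sum(1 for pick in picks if pick in top)

    # Illustrative data: what the sample as a whole answered for one brand.
    sample_answers = ["happy", "happy", "excited", "calm", "happy", "excited", "calm"]
    print(score_predictions(["happy", "bored", "calm"], sample_answers))  # -> 2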

As would be expected, given the change in emphasis from the respondent's own emotions to their estimation of the population as a whole, there were noticeable differences in the character of some of the answers, which is an issue we will discuss later on. But purely as an experiment in engaging respondents and encouraging more active participation in a survey, this was the most powerful technique we had experimented with so far.

Experiment 9: Market trading game

We then trialled market trading games, where respondents were given a trading pot to stake on their opinions, and these proved equally popular.

We devised a game where respondents had to decide what they thought about a topic, and then bet whether the market would agree or disagree with them. They could bet however much they wished on their opinion, and would win or lose their stake according to whether the market agreed or disagreed with their viewpoint. The game was to see how much money they could win over the course of a set of questions.
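The settlement logic we describe is simple enough to sketch; the 0.5 threshold and the names below are our illustrative assumptions rather than the production implementation:

    def settle_bet(pot, stake, respondent_agrees, market_agreement_share):
        """Win the stake if the market sides with the respondent, lose it otherwise.

        market_agreement_share is the fraction of all respondents who agreed
        with the statement; the 'market' agrees when this exceeds 0.5.
        """
        if not 0 < stake <= pot:
            raise ValueError("stake must be positive and no larger than the pot")
        market_agrees = market_agreement_share > 0.5
        return pot + stake if market_agrees == respondent_agrees else pot - stake

    pot = settle_bet(pot=100, stake=20, respondent_agrees=True,
                     market_agreement_share=0.64)
    print(pot)  # 120: the market agreed, so the respondent wins the stake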

Question = Average answer time per option 2.7 seconds

Game = Average answer time per option 6.2 seconds

In this game scenario, respondents' decision-making time more than doubled, and when asked about the inclusion of these types of predictive games in surveys, 95% of respondents told us they enjoyed them ("it was great fun!") and would like to do more surveys with these types of questions.

It was clear to us after these two experiments that point scoring was a very powerful gaming mechanic to integrate into surveys.

EXPLORING THE SECONDARY APPLICATION OF GAMES

We also looked at how we could use games in a wider variety of ways within surveys: as warm-up exercises, to help respondents think about issues, and to encourage cross-participation in a series of research studies.

Experiment 10: Using games as warm-up exercises

We set up an experiment to see how effective a warm-up game could be at putting respondents in the right frame of mind to take part in a creative co-creation exercise.

We designed a game where respondents had to think up "silly uses" for things. To encourage more active participation, we showed examples, such as figure 2.

 


FIGURE 2

Although we were being silly, this gave respondents the important message that there should be no rules or boundaries to their thinking. It also demonstrated our commitment to the idea, as researchers, by showing that we were prepared to make fools of ourselves (thanks to Steve, our model!).

To test the impact of this, we then asked respondents to perform a standard lateral thinking task, and come up with ideas for uses for a brick. We then compared their answers to those of a control cell who had been asked to do this blind. The group that had played the warm-up game generated 30% more ideas, but perhaps more importantly, when a third group rated how imaginative they considered the two groups' ideas to be, the quality of those generated by the warm-up group was judged higher (see figure 3).

FIGURE 3, QUALITY OF IDEAS

Experiment 11: Using games to encourage cross-participation in surveys

We topped and tailed a series of four surveys with an ongoing series of creative mini-games, where we asked respondents to do things like complete lines of limericks, write punch-lines to cartoons and summarise films in as few words as possible, and then shared the most entertaining answers in the following survey.

In this experiment, 80% of respondents took part in all four waves (the average cross-completion rate of four surveys is less than 50%). There were some other creative elements in these surveys that made them more fun, so we cannot present the results of this experiment as anything other than anecdotal evidence, but we are now working with OMD in the UK, who run a weekly closed panel omnibus survey, to test this technique over a long-term period, to see if it improves overall panel retention. We aim to present these results in a future paper.

FIGURE 4

There are two main ways in which surveys can be "gamified". What we have examined so far in this paper has been the reframing of question wording to be more game-like. The second way is to do the same for the answering process itself. (See figure 4.)

We had already done a lot of groundwork in this area in our previous research experiments. In our efforts to make surveys more engaging, we explored a wide range of more creative ways to ask conventional questions. Our evidence showed how more creative questioning techniques, when used in the right way, could reduce straight-lining, improve the quality of answers and radically improve respondent enjoyment levels. We have published several papers and case studies on this.1)

Much of this early work was focused on improving the ergonomics of question design and the visual appeal of the question. But we wanted to investigate ways of making the answer process just a bit more "fun", so we conducted a series of small experiments looking at the impact of getting respondents to answer questions in slightly more playful ways.

We tested a range of more fun selection processes, adding small rewards in the form of a piece of animation, or a visual or sound effect.

We found that these simple additions brought about small but measurable changes in response quality and respondent satisfaction levels: encouraging one or two extra clicks, and a little more focus.

 


FIGURE 5

Figure 5 illustrates combining a more playful question format with a selection-stamp process. Overall, this resulted in an 18% increase in time spent by respondents answering the set of questions (one of the key benchmark measures of respondent attention) and 15% more click activity.

FIGURE 6

We found that transposing a bank of sliders, making them resemble a mixing desk, increased the time respondents spent answering the question by over 50%, because the format encouraged them to "play" with the answers, tweak and refine their choices (see figure 6).

We then went on to look at the impact of more elaborate game-style questioning techniques.

Our survey design team created a range of game-style question formats, which we used in a series of experiments to see how they affected the way respondents answered the questions. These formats ranged from simple guessing-game and quiz question formats, through to more elaborate videogame-style questioning interfaces.

Guessing game question component

 


FIGURE 7

In this question, respondents guessed an answer, and were rewarded with instant feedback in the form of a tick or cross (see figure 7). The question incorporated a life system, in which respondents had a number of "lives", and lost one for each wrong guess. The question was programmed to recognize spelling variants.
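The core mechanic can be sketched in a few lines of Python; the fuzzy matching via difflib is our assumption about one way to tolerate spelling variants, not a description of the actual survey engine, and the answer list is invented:

    import difflib

    ANSWERS = ["lynx", "sure", "dove", "right guard", "nivea"]  # illustrative list

    def check_guess(guess, found, lives):
        """Return (matched answer or None, remaining lives)."""
        match = difflib.get_close_matches(guess.strip().lower(), ANSWERS,
                                          n=1, cutoff=0.75)
        if match and match[0] not in found:
            found.add(match[0])
            return match[0], lives       # correct: show a tick, keep all lives
        return None, lives - 1           # wrong: show a cross, lose a life

    found, lives = set(), 3
    for guess in ["Linx", "axe", "Dove"]:
        hit, lives = check_guess(guess, found, lives)
        print(f"{guess} -> {hit or 'wrong'} ({lives} lives left)")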

Over 80% of respondents said they really enjoyed this question format, and we found respondents were prepared to do three times the amount of work compared with a straightforward "top of mind" recall question.

Predictive/quiz question component

We then created a variant of the guessing question for standard option-selection questions, with an integrated point-scoring system and a final score-ometer to tell respondents how they performed. (See figure 8.)

FIGURE 8

The consumer reaction was almost universally positive, with over 90% of respondents enjoying this question format. Specific examples of this question format can be seen in some of the client case studies later in this paper.

Videogame-style question formats

No paper on survey gaming techniques would be complete without this. We tested two experimental videogame-style question formats: a snowboarding game where respondents skied down a hill and passed through gates to indicate their selection, and a "space invader" game where they fired at their choices.

To test the snowboarding game, we asked respondents whether they agreed or disagreed with a range of statements. We split the sample, half being presented with binary choices, and the other half with four-point range scales. We also made some of the questions more complicated than others, effectively incorporating "not" clauses, and we experimented with the speed with which respondents passed through the gates and had to make decisions. (See figure 9.)

FIGURE 9

Eighty percent of respondents said they enjoyed this question in its simplest format, but this dropped to 73% for the more difficult variant. We also measured significant discrepancies between the answers to positively and negatively worded choices, seeing an average 14% difference in scores. This raised a big question about using this more creative question format in real surveys for conventional question tasks, but it could potentially have a really interesting role in conducting implicit-association-style tests, where more instant answers are sought from respondents. We are working on a re-engineered version for this as a follow-up experiment.

The space invader style game was loved by nearly everyone. It contained no time limits, with respondents simply having to aim and fire at choices. (See figure 10.)

FIGURE 10

Eighty-five percent of respondents found answering this question fun, and the answers generated seemed consistent with the same questions posed in more traditional ways. A small group of respondents thought it was slightly "facile", so using this type of question in a real survey would have to be thought about carefully, but this more fun style of question is potentially ideal for youth research.

The problem with both these question formats is that although they have the appearance of games, they lack proper mechanisms for rewarding skill and performance, which is what makes real games fun. So we thought: imagine if the options fight back! This kind of thinking is where we are heading next with our experimentation.

STAGE 3 OF THESE EXPERIMENTS: APPLICATION AND VALIDATION

There is no doubt that gamifying is a powerful idea that can transform how online research is conducted. However, when talking to people about this, we were confronted with three common questions:

 


1. How can we adapt these ideas for practical research projects?

2. What types of people will respond to this process?

3. What impact will this have on the character of the data?

To answer these, we teamed up with a broad cross-section of our clients, and looked at how we could apply these gaming techniques to some of their real research problems. We then studied the data generated to see how it might have been affected by the game-style mechanisms.

We don't yet have all the answers, and these experiments have raised many more questions. As a result, we are currently running a large-scale global experiment to assess some of the more serious questions that people have raised about the impact gamification has on respondent behaviour.

QUESTION 1: HOW CAN WE ADAPT THESE IDEAS FOR PRACTICAL RESEARCH PROJECTS?

Nearly every one of the techniques explored in this paper could have practical application in real-life research situations. There are several levels on which they can be applied. Rethinking the wording of questions to make them more fun and game-like is something that could be done in nearly any survey. Other techniques, like point-scoring mechanics, are more specialist ingredients and really only have a practical role in certain research situations.

The consideration of when and where to use these techniques is still very much to be explored, and there are a number of factors to consider along the way – not least, understanding the impact that a technique will have on the character of the data.

Here are some examples of how we have adapted and applied some of these techniques to real-life research situations, and a summary of the issues we have faced along the way.

CASE STUDY 1: GAMIFYING THE MINTEL BRAND CATEGORY RESEARCH TRACKERS

 


FIGURE 11

In this experiment we looked at how we could apply points-based reward mechanisms and predictive market trading games to the brand category research we conduct on behalf of Mintel.

We have worked very closely with Mintel over the last two years to redesign their brand category research, to make it more engaging for respondents. Before the game-style experiments, we had already achieved significant improvements in click-rates, respondent satisfaction levels and completion rates. But we struggled to make certain parts of these research studies more engaging, as they involved respondents having to answer long series of repetitive questions about the attributes of competing brands. We therefore looked at how we could convert these parts into games.

One of the key questions in these surveys involves asking respondents to predict the future of these brands. We gamified this by asking people to make their predictions in competition with the "market", and we gave them a gambling budget. If the market agreed with their predictions, they would win the amount of money that they gambled (and vice versa), and the game was to see how much money they could make. (See figure 12.)

FIGURE 12

 


This resulted in respondents spending more than double the time considering their answer to this question. The answers seemed to be roughly consistent with asking the question in a standard format, but with a slight (4%) shift to the negative.

FIGURE 13

The value of this gambling game was that it gave an extra dimension to the result: the relative amount respondents were prepared to gamble, a measure of confidence.

We then identified two other predictive games we could play within the survey.

In the second, respondents were asked to identify which brands most possessed certain characteristics, such as being innovative. We turned this into a predictive game, asking "Which brands do you think other respondents would think of as being most innovative?" and awarding points for identifying the three most selected brands.

Here we encountered a problem. Although respondents much preferred answering the question this way, investing 100% more time, there was a strong learning element at work that had an immediate impact on the answers. Respondents learnt which brands did well on earlier questions, and picked these brands subsequently, radically skewing the answers. To overcome this, we did not reveal which ones they got correct after each answer, but told them how well they were doing only after every few questions, and this seemed to restore more consistency.
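A sketch of this fix, assuming a simple buffering scheme (our illustration of the idea, not the production code): points are banked silently, and only a running total is revealed at intervals, so respondents get no per-question signal to learn from.

    class DeferredScorer:
        """Bank points quietly; reveal a running total only every few questions."""

        def __init__(self, reveal_every=3):
            self.reveal_every = reveal_every
            self.total = 0
            self.answered = 0

        def record(self, points):
            self.total += points
            self.answered += 1
            if self.answered % self.reveal_every == 0:
                return f"After {self.answered} questions you have {self.total} points"
            return None  # silence: no per-question right/wrong feedback

    scorer = DeferredScorer()
    for points in [1, 0, 1, 1, 0, 0]:
        message = scorer.record(points)
        if message:
            print(message)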

FIGURE 14

Even so, this still resulted in some quite significant differences in the results compared to asking the question in a standard way; as shown in figure 14, differences of upwards of 20% were measured in some answers.

The third way in which we used this predictive game technique was on a question where respondents had to pick words they associated with each brand from a list of twenty choices. We asked respondents instead to predict what other people would say, and rewarded them with points for picking the most popular choices.

 


FIGURE 15

Again, respondents invested 100% more time in answering the game-format question, but the results were significantly different, particularly when comparing attributes that centred on experiential factors such as user-friendliness and reliability. As a standard question, respondents were much less likely to choose attributes that were based on personal experience if they had not experienced the product themselves, but were happier to make assumptions about these experiential factors when making predictions. (See figure 15.)

There are two ways of looking at this difference. The first is to argue that this predictive game technique has limited application because of the way it affects answers; the second is to see it as shedding new light and a new perspective on the research, in this case learning more about the perceived user-friendliness and reliability of brands. This has led to a hybrid version, half predictive and half standard, that we are experimenting with next.

What is not in doubt is how successful these predictive gaming mechanisms were at making the survey more fun for respondents and improving the amount of thought put into answering questions. Like-for-like answer consideration times for the gamified question improved by over 100%; over 90% of respondents said they really enjoyed the gamified version of this survey, compared with the standard grid-and-tickbox version; and the verbatim feedback was 100% positive. The survey achieved a 93% completion rate, and when asked, 97% of respondents said they would like to do more surveys like this. (See figure 16.)

FIGURE 16

Sample = 400 respondents (200 per cell)

 


CASE STUDY 2: GAMIFYING A MEDIA TOUCHPOINT RESEARCH STUDY FOR AMS

FIGURE 17

AMS media are one of the leading UK-based independent media planning agencies. With their guidance, we undertook an experiment into gamifying a media Touchpoint study.

For anyone not familiar with a media Touchpoint study, respondents are normally asked a highly repetitive set of questions about their consumption of, and attitudes towards, different media, with often more than twenty media touchpoints being evaluated in a single survey lasting upwards of twenty minutes. Typical respondent feedback is that they find this type of survey boring to complete. In addition, they often find questions asking whether they were influenced by advertising in different media intangible and difficult to answer.

Our approach was to try to turn this evaluation process into a fun quest, and increase the thinking involved in the task.

Rather than simply asking respondents about their media consumption, we presented them with this quest scenario: "Imagine you work for an advertising agency and have to plan an advertising campaign with a £1 million budget to reach a really important target audience … people like you!"

We then reframed the questions to be more decision-making in nature. Instead of beginning by asking them which media they consumed, we phrased the question thus: "The first thing you will need to assess is how effective each of these media are at reaching you. How often do you consume these media?"

We also made the task more elaborate, requiring more thinking on behalf of the respondents: giving them an advertising budget, telling them the relative value of each medium, and asking them to buy advertising slots (see figure 18).
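This budgeting mechanic implies a simple affordability check in the survey logic. A sketch, with invented media prices (the paper does not publish the actual slot values):

    MEDIA_PRICES = {"tv": 250_000, "radio": 40_000, "press": 60_000,
                    "online": 25_000, "outdoor": 80_000}   # illustrative prices
    BUDGET = 1_000_000

    def validate_plan(slots):
        """slots maps a medium to the number of advertising slots bought."""
        unknown = set(slots) - set(MEDIA_PRICES)
        if unknown:
            raise ValueError(f"unknown media: {unknown}")
        spend = sum(MEDIA_PRICES[m] * n for m, n in slots.items())
        return spend, spend <= BUDGET

    spend, affordable = validate_plan({"tv": 2, "online": 12, "radio": 4})
    print(f"£{spend:,} spent; within budget: {affordable}")  # £960,000; True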

 


FIGURE 18

To make the repetitive evaluative parts more fluid to get through and answer, we used a range of more creative question formats, including scrolling grids that we had developed and pioneered through some of our earlier research.

Results

In the quest version, we found an all-round improvement in the level of attention, and a significant improvement in enjoyment levels and completion rates.

- Twenty percent longer spent answering like-for-like questions compared with control media Touchpoint surveys set up in the traditional way.

- The average selection of the "no exposure to media" option dropped from 23% to 14% (an indirect measure of the depth of thought put into answering the question).

- Enjoyment rating increased from 61% to 82%, even though the survey took four minutes longer.

- Average completion rate moved from 78% to 95%.

CASE STUDY 3: GAMIFYING INSURANCE SURVEYS WITH ALLIANZ INSURANCE USING A VARIETY OF QUESTION REFRAMING TECHNIQUES

FIGURE 19

One of the most common remarks we heard about gaming techniques is that they might work well for fun survey topics, but would probably be less use with drier subjects. To challenge this perception, we set about applying our thinking to what many would regard as one of the most boring research topics in the world – insurance!

 


The approach we adopted was a basket of question reframing techniques, rewording some of the typical insurance survey questions to make them more fun and human. So, for example, to find out which brands they liked, we asked "Which brand would you most want to have as a sponsor of your local football team?" and "Which brands would you invite to a party?" We also included a brand guessing game, a predictive game to get people thinking about the emotions people felt when buying insurance (used as a precursor to more interrogative questions on the topic), a two-minute rant to get their opinions off their chest, and a product design game (similar to the pizza experiment described above) where respondents had to design their perfect insurance policy. Throughout this, we shared with respondents some of the issues and dilemmas faced by insurance companies in designing products. The aim was to begin by getting respondents to think about the topic in a very light way, win their trust, let them voice their frustrations about buying insurance in a fun way, and then challenge them to come up with solutions. (See figure 20.)

FIGURE 20

Results

- This is still a work-in-progress project, but having conducted two experiments so far, we have achieved an average 87% enjoyment rating. An average of 72% rated the survey four stars or higher for enjoyment.

- Ninety-three percent of those who took part in these studies said they would like to take part in follow-on studies.

- We achieved 87% and 90% completion rates for the two studies.

The feedback again was almost universally positive (see figure 21).

FIGURE 21

 


CASE STUDY 4: GAMIFYING BRAND IMAGERY RESEARCH WITH KIMBERLY CLARK

FIGURE 22

All brand managers want to know what consumers think of their brands. The problem is that consumers are not always as keen as we might like to spend their time telling us what they think. And who can blame them?! We give them lengthy batteries of dull and often rather meaningless statements and a boring 5-point scale – it's hardly surprising that these questions suffer from a lack of respondent engagement.

Working with Kimberly Clark, we set about exploring a fresh way of conducting brand imagery research by making the experience more fun and playful for respondents.

First off, we set the scene with a warm-up game designed to get respondents thinking more laterally about brands as though they were human beings. This took the form of a simple quiz, where we described a brand in human terms and respondents had to guess which brand it was. For example: "I shop at Sainsbury's, but my shoes are from Clarks; I like to go to the cinema and museums, and take my holidays camping in France with my family, where we go on nice long walks. Which TV channel am I?"

We then asked respondents to think about two particular brands, and presented them with a range of words and phrases representing human characteristics, one brand at a time, asking whether the brand would display this trait or not. This wasn't very far from a traditional style of question, but preceded by the warm-up exercise, and humanized slightly with a fun intro, we found respondents embraced this process in the spirit of a game. (See figure 23.)

 


FIGURE 23

Respondents had no problem doing this, but whilst it gave us some interesting information, it tended not to separate the brands very well.

So we then used the same attributes in a slightly different way, using two game approaches:

- A guessing game – which brand did people think would be most like the attribute?

- A market trading game – each respondent had a pot of £100, and had to bet £20 on the brand they thought most people would have chosen in relation to the attribute, winning or losing £20 on each bet they made.

Using these gaming techniques, we found the brands starting to be pulled apart more successfully. This was partly down to the forced-choice mechanism, but also because the bigger brand in the choice pair was simply being chosen more.

The next stage involved showing a series of images, grouped by theme, along with two brand logos. Respondents were invited to select the image they most associated with each brand. We told respondents this was a game because, although this was little different to a conventional survey question, we have seen that gaming is as much a mindset as anything. Many games are very simple, but draw people in because they are attractively presented. This question was more akin to a qualitative projective technique, but its visual appeal qualified it as a game for our purpose, and examining the feedback, it was clear that most respondents treated it in this spirit. We went on to ask people to explain a few of their image choices for us, and at the end, invited them to take part in the slightly more intellectual exercise of writing their own human portrait of each brand.

Here we began to see the emergence of some real brand differentiation, and themes behind that differentiation. Respondents' explanations of their choices, combined with qualitative and semiotic analysis of this data, all suggested a far richer vein of brand imagery emerging from these simple game-play techniques.

Results

- This type of approach has the potential to help us delve deeper into brand imagery.

- Respondents found this way of thinking about brands enjoyable.

- Ninety-seven percent of those who started the survey completed it.

- Seventy-eight percent of respondents rated the survey four stars or higher.

- Ninety percent said they felt they were fully engaged in the survey.

- Ninety-three percent said they would like to do more surveys like this.

FIGURE 24

We will continue to explore this area as a potential way to engage with respondents to deliver richer brand imagery insight.

CASE STUDY 5: GAMIFYING A CO-CREATION PROCESS WITH HEINZ

Building on initial successes with the pizza experiment, we teamed up with Heinz to conduct a similar creative quest task to devise some new recipe ideas. This was a multi-stage task.

FIGURE 25

When inviting people to take part, we framed the whole thing as a fun challenge.

We started out with a couple of warm-up games – an example is shown in figure 26. Although this is pretty close to a classic awareness question – "Which varieties of Heinz soup do you know of, even if only by name?" – the visual appearance (ticks for correct and crosses for incorrect answers) and, finally, the feedback on "the ones you missed" make it much more like a game.

FIGURE 26

The average respondent made eight guesses and named five varieties, spending two minutes and forty seconds playing the game. How many traditional awareness questions can attract this much thinking time?

After a couple of such games, we invited respondents to take part in the challenge of creating a recipe using Heinz soup as an ingredient. To give them some context, we initially shared some client-generated ideas with them, and asked them to pick the top five that they thought other people would have chosen. (See figure 27.)

We then asked them to select ingredients for their recipe, tell us what they would make, and give the dish a name – again, presenting the screens to look engaging. To make the naming process fun, we visualized their answers on a poster. (See figure 28.)

 


FIGURE 27

FIGURE 28

The second stage of the challenge involved getting respondents to review a selection of other respondents' suggested recipes, alongside some of the original client recipes, and getting them to vote for those they'd most like to make – with four respondent dishes emerging in the top five and being selected by 50% or more of the sample.

The research team has already trialled making some of the respondent recipes – see figure 29. The next stage of the experiment will involve trying to persuade respondents to make the recipes too. We will report back on progress of this.

 


FIGURE 29

Results

- Eighty-five percent cross-participation in the two studies.

We will report back on:

- Respondent enjoyment;

- Quality and quantity of ideas emerging from the process, and the client view on these.

QUESTION 2: WHO RESPONDS TO GAME MECHANICS?

In short, nearly everyone.

In the twenty or so different game mechanisms we have experimented with so far, the best have achieved over 95% active participation, and the lowest 80%. In every case, the average completion time has been raised, in tandem with average consumer enjoyment levels. We have so far not found a single case where a gaming mechanism has failed to improve the volume of data.

We don't have any detailed demographic data yet, as most of these experiments have been conducted on relatively small samples of 100-200 per cell. We are addressing this now with some larger-scale experiments.

QUESTION 3: WHAT IMPACT DOES IT HAVE ON THE CHARACTER OF THE DATA?

This is probably the main question researchers have. It is clear that gamification adds involvement and entertainment, but does it really provide reliable and useful data?

Undertaking a macro analysis of all the experiments we have conducted so far, it is clear that applying gaming mechanics to questions can change the character of answers. There are three factors at play here:

1. Effects caused by changes to the question and how it is interpreted

Making questions more game-like often requires adaptations to the wording of questions and option choices. These can impact how respondents interpret the question and the answer they are likely to give. If, for example, a question about how much you liked a sport was changed from a standard Likert liking scale to more fun options such as "I would fight for the channel changer to switch over if this was on TV" / "I would cancel going to a wedding if this was on TV", then clearly the data will be different. For understandable reasons, it's not the same question.

Does this change make the data collected any less useful? If you were totally happy with the quality and character of answers you were getting from the traditional question format, you probably would not want to change it anyway. But if you wanted to encourage respondents to give more consideration to their answers, and get them to use the full extent of the answer scale, then you would have a rationale for change. So this question can only be answered in the context of the objectives of the research.

If the question was part of a tracking study, asked in a particular way many times, then any change in the question is likely to disrupt the data. If, however, in this sports question example, the commissioner of the research was a TV company interested in what people like to watch on TV, the data could arguably be seen as more pertinent.

Most gamification techniques by their nature shift the focus of the question, either from the general to the specific or, in the case of projection and predictive questioning, from the personal to the general. These different shifts need to be validated in some way.

In the case of predictive questions, a lot of work has been done in this area by the likes of Brainjuicer to demonstrate that the data from this approach is actually potentially more valuable. This is not going to be universally the case for all game question techniques, and this underlines the need to conduct proper piloting when using these more creative techniques. Simple things like wording changes can be explored very easily through small-sample pilot studies.

2. Effects caused by changing the attitude and mindset of respondents

Trying to win!

When we play games our instinct is to try to win them, and this may change respondents' approach to answering questions. We have found some evidence, with certain predictive question mechanics, of answers being adapted to achieve the goal of winning in a way that could potentially corrupt the data. In other cases, like guessing games, you can find respondents clicking on random options if they don't know the answer, in the hope of winning.

At the same time, we did find ways to rebalance these effects through careful redesign of the game mechanic: in the case of predictive questions, by not telling respondents what they got right or wrong until they had answered all the options; and with guessing games, simply by limiting the number of guesses. But it is certainly an issue to be very aware of when integrating game mechanics into surveys, and it again points to the need for effective piloting and testing of any creative new game technique.

Does fun encourage flippancy?

There is also the suspicion that game play will encourage more flippant answers. This may well be the case in some circumstances, but we have not found any evidence to that effect so far. In fact, we have consistently observed quite the opposite: when people are playing survey games, they appear to take what they are doing more seriously. These effects can be seen when game-play mechanics are used to stimulate open-ended feedback – they invariably lead to richer, better thought-out answers. When used in association with tick selections, they can encourage more items to be picked. When used with long repetitive question sets, they can encourage respondents to concentrate for longer periods, and respondents seem willing to answer more questions.

In head-to-head experiments we have consistently measured improvements in consideration time ranging from 20% to over 100%. See the examples in figure 30, taken from our Mintel game experiment.

 


FIGURE 30

Looking in more detail at how the character of data changes – again, outside of the effects caused by changes of question emphasis, such as question reframing and predictive question techniques – there is no evidence to indicate that the character of data is affected simply by making questions more fun. The primary difference we have observed is simply more data.

In the example illustrated in figure 31, we used a stamp-effect question format to get feedback on the emotions people associated with shopping. You can see that the basic pattern of answers is similar, but there is a lot more click activity. The respondents spent 30% more time answering this specific question.

FIGURE 31

FIGURE 32

Sample = 360 respondents (180 per cell)

 


Shifts in mindset

Entering a game thinking mode can, however, shift peoples’ mindset somewhat and this can impact on the answers they give.

We asked respondents to write a list of the best places to go on holiday, as shown in figure 33. In one cell, to make the question more fun, we asked them to imagine they were the editor of a travel magazine and then asked the same question. As travel editors they gave us 60% more feedback, but their answers were different: they were more ambitious and diverse in nature, with more people suggesting New Zealand over Spain, for example.

FIGURE 33

This could in fact be seen as one of the great strengths of game-play techniques: they can encourage people to think in different ways about something. But it also highlights the need for effective piloting of questioning techniques, as small changes like this can make a big difference to the data.

3. Effects caused by changes to the design and layout of questions

Again, small differences in the way a question is designed can have a measurable impact on the answers. This issue is not specifically confined to gamified questions, and a lot of work has been done to investigate the impact of more creative questioning approaches (Puleston and Sleep, 2008).

Within the confines of this research, we have been able to explore the impact of using imagery to emphasise slider choices and emoticons to emphasise tick selections. As outlined above, this can create significantly more clicks, but without a measurable shift in the character of the data.

We used images of glasses of water to visually reinforce the option choice, as shown in figure 34. The average number of drinking instances reported changed from 2.78 without the images to 2.93 with them; on a sample of 350, this was not statistically significant.

FIGURE 34
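To illustrate how such a check works, the sketch below re-runs the comparison as a two-sample t-test from summary statistics using scipy. The means and overall sample size are from figure 34; the standard deviation and the even split across cells are our assumptions, as neither is reported here.

```python
from scipy.stats import ttest_ind_from_stats

# Re-running the figure 34 significance check from summary statistics.
# The two cell means come from the paper; the standard deviation (1.5
# drinking instances) and the even 175/175 split of the 350 respondents
# are assumptions, as the paper reports neither.
result = ttest_ind_from_stats(
    mean1=2.78, std1=1.5, nobs1=175,  # cell without images
    mean2=2.93, std2=1.5, nobs2=175,  # cell with images
)
print(f"t = {result.statistic:.2f}, p = {result.pvalue:.2f}")
# Under these assumptions p is about 0.35, well above the conventional
# 0.05 threshold, which is consistent with the paper's conclusion.
```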

However, when using more visual option choices, as in figure 35, there will inevitably be differences in the selections made, due to the nature of the visuals shown and the fact that the position of options cannot be randomised when asked in this way.

 


In this case, option choices for specific items varied by more than 50% due to layout factors. So this is another consideration to bear in mind when thinking about being more creative in the way questions are laid out.(2)

FIGURE 35
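For contrast, with a standard list question this kind of position effect is usually controlled by randomising the option order independently for each respondent. A minimal sketch of that control, using our own illustrative helper rather than anything from the paper:

```python
import random

def randomised_options(options, respondent_id):
    """Return answer options in a per-respondent random order.

    Seeding the generator with the respondent id keeps the order stable
    if the same respondent revisits the question, while varying it
    across the sample so that position bias averages out."""
    rng = random.Random(respondent_id)  # deterministic per respondent
    shuffled = list(options)
    rng.shuffle(shuffled)
    return shuffled

# Example: two respondents see the same options in different orders.
print(randomised_options(["Paris", "Rome", "Berlin", "Madrid"], respondent_id=7))
print(randomised_options(["Paris", "Rome", "Berlin", "Madrid"], respondent_id=8))
```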

CONCLUSION

“If I could marry one market research technique it would be game-play. It would be a fun marriage and I think a fruitful one.”

Gamification is a very powerful technique; in fact, it is the most powerful technique we have come across in our four years of research into improving the feedback from online surveys.

However, there is no escaping the fact that it is a creative solution. There are no out-of-the-box techniques other than perhaps

point-based scoring mechanisms. Most of the other ways to gamify surveys require copywriting and design skills, as well as

the technical expertise needed for the effective delivery of games in surveys.

Think of gamification like advertising. Good advertising can help sell more products, but bad advertising won't. Likewise, a well-executed game in a survey can be transformative and result in better feedback, but a badly executed one won't have the same impact. It is not just about the games you insert in a survey; it is also about the whole tone and language used when communicating with respondents. It is about thinking of a survey as a piece of creative communication.

In addition, gamification will often affect the data, and the necessary change in interpretation is an area we feel we have not

yet begun to properly explore.

AFTERTHOUGHTS

Making it so it’s all about me!

Another way of looking at survey game mechanisms is to see them as primarily about making questions relevant to “me”. Most

of these techniques effectively anchor the question and answering process to the individual respondent.

Take the question wording: “What music would you play on your private radio station?” It is more game-like in nature, and

arouses more quest-like instincts in us, but its success ultimately lies in the way it relates the task to the respondent

themselves.

You could call this the “hyper-personalization” of questions.

Research as a two-way conversation

 


The work we have done to explore the value of giving feedback to respondents, whether in the form of points for getting a question right or as personalised feedback on the data gathered as a kind of "personality test", highlights how one-dimensional most surveys are.

For respondents, most surveys are like communicating into a vacuum: they dutifully answer the questions asked of them, and their responses disappear into the ether, never to be heard of again.

As much as anything, a points score says, "we hear you". It is a return response, which is the thing most often lacking in online research. We contend that the reason so few people carry on doing online surveys is the sense that they get nothing back from them.

An example of the effectiveness of making respondents feel they are listened to came from one of the questions we developed for Allianz, where we asked respondents to predict which emotions people most associate with buying insurance. For the follow-up question, rather than simply asking, "Why did you choose that?", we phrased it as an echo question: "It is interesting that you chose that option. A lot of other respondents did too. Can you explain why…" This form of the question comes across as part of a two-way conversation with the researcher, and it resulted in twice as much feedback as when we asked the question in a more one-way fashion.
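A minimal sketch of how such an echo follow-up might be scripted is shown below, selecting the conversational wording only when the respondent's choice genuinely was popular. The threshold, option names and example data are our illustrative assumptions, not the actual logic used in the Allianz survey.

```python
# Sketch of an echo follow-up: reflect the respondent's answer back
# before asking "why", but only claim agreement when it is actually
# true of the data. The threshold and shares below are illustrative.

POPULARITY_THRESHOLD = 0.30  # assumed cut-off for "a lot of others chose this"

def follow_up_text(chosen_option, option_shares):
    """Build the open-ended follow-up wording for one respondent.

    chosen_option -- the option this respondent picked
    option_shares -- dict mapping option -> share of respondents picking it
    """
    if option_shares.get(chosen_option, 0.0) >= POPULARITY_THRESHOLD:
        return (f"It is interesting that you chose '{chosen_option}'. "
                "A lot of other respondents did too. Can you explain why?")
    # Plain one-way phrasing when the echo claim would not be honest.
    return f"Why did you choose '{chosen_option}'?"

# Example with made-up shares from earlier respondents.
shares = {"anxiety": 0.42, "relief": 0.18, "boredom": 0.11}
print(follow_up_text("anxiety", shares))   # echo phrasing
print(follow_up_text("boredom", shares))   # fallback phrasing
```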

Incentives to concentrate

In a single phrase, this is what game-play offers. At the point when respondents might normally feel their concentration

slipping, and feel unmotivated to sustain it, games can encourage people to focus and think about a topic.

Many survey tasks are already games – just very badly designed!

A lot of the techniques described in this paper to gamify surveys come quite naturally to many qualitative researchers. Indeed, any qualitative researchers reading this paper may well be smirking at its findings, having used similar techniques for years in their own arena. What we have been able to do is quantify their benefits.

Within online research, the main problem is not the concept so much as the execution. Many of the questions we ask in surveys are already games in theoretical terms, just really boring ones!

It is our belief that, with an ever-increasing range of fun activities competing for online attention, surveys must radically improve to survive in the marketplace.

Notes on methodology

Nearly all the experiments quoted here were conducted using simple test-and-control-cell techniques with small samples of around 100 to 200 respondents per cell. As a result, we do not wish to present the results of any of these experiments as anything other than descriptive, anecdotal evidence about the impact of game play. These experiments have very much been a general exploration of the topic. We are now in the process of conducting more robust, larger-sample trials of all the main gaming question techniques we have pioneered, and we aim to report back on the results of these trials in a future paper.
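To put that caveat in rough numbers, the sketch below uses statsmodels to estimate the smallest standardised effect that cells of this size can reliably detect; the alpha and power settings are conventional defaults rather than values from our experiments.

```python
from statsmodels.stats.power import TTestIndPower

# Minimum detectable standardised effect size (Cohen's d) for two-cell
# experiments of the sizes used here, at the conventional alpha = 0.05
# and 80% power; these settings are standard defaults, not the paper's.
analysis = TTestIndPower()
for n_per_cell in (100, 150, 200):
    d = analysis.solve_power(effect_size=None, nobs1=n_per_cell,
                             alpha=0.05, power=0.80, ratio=1.0)
    print(f"n = {n_per_cell} per cell -> minimum detectable d ~ {d:.2f}")
# Cells of 100-200 respondents can only reliably detect effects of
# roughly d = 0.28-0.40, which is why these results are best read as
# directional rather than definitive.
```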

FOOTNOTES

 


1. See Puleston, J. and Sleep, D. (2008) Measuring the value of respondent engagement, ESOMAR Panel Research

Conference.

2. We do not want to dismiss this factor; it is simply that, with the small sample sizes used in these exploratory experiments, any observable changes fell within the error limits of the sample. We are in the process of conducting large-sample validation experiments that may shed more light on this.

THE AUTHORS

Jon Puleston is VP Innovation, GMI, United Kingdom.

Deborah Sleep is Director, Engage Research, United Kingdom.

© Copyright ESOMAR 2011

ESOMAR

Eurocenter 2, 11th floor, Barbara Strozzilaan 384, 1083 HN Amsterdam, The Netherlands

Tel: +31 20 664 2141, Fax: +31 20 664 2922

www.warc.com

All rights reserved including database rights. This electronic file is for the personal use of authorised users based at the subscribing company's office location. It may not be reproduced, posted on intranets, extranets

or the internet, e-mailed, archived or shared electronically either within the purchaser’s organisation or externally without express written permission from Warc.

 
