

www.elsevier.com/locate/intcom

Interacting with Computers 19 (2007) 429–437

To do or not to do: Differences in user experience and retrospective judgments depending on the presence or absence of instrumental goals

Marc Hassenzahl a,*, Daniel Ullrich b

a Economic Psychology and Human–Computer Interaction, University of Koblenz-Landau, Fortstraße 7, 76829 Landau, Germany

b Darmstadt University of Technology, Social Psychology and Decision Making, Alexanderstraße 10, 64283 Darmstadt, Germany

Received 2 February 2007; received in revised form 30 April 2007; accepted 14 May 2007

Available online 21 May 2007

Abstract

Recently, Human–Computer Interaction (HCI) started to focus on experiential aspects of product use, such as affect or hedonic qualities. One interesting question concerns the way a particular experience is summarized into a retrospective value judgment about the product. In the present study, we specifically explored the relationship between affect, mental effort and spontaneity experienced while interacting with a storytelling system and retrospective judgments of appeal. In addition, we studied differential effects of the presence or absence of instrumental goals. In general, active instrumental goals affected not only the experience per se (for example, by inducing mental effort) but also the way subsequent retrospective judgments were formed. We discuss the implications of our findings for the practice of product evaluation in HCI specifically, and more general aspects, such as the role of affect in product evaluations and the importance of usage mode compatibility (i.e., a compatibility of the way one ought to and actually does approach a product).

© 2007 Elsevier B.V. All rights reserved.

Keywords: User experience; Affect; Evaluation; User satisfaction; Task; Instrumental goals; Context-dependency; Goal-mode; Action-mode

1. Introduction

Imagine using a Web site to look for a Christmas present. During your quest, you are likely to experience variations in the quality of the service. You will encounter usability problems as well as good suggestions by the Web site’s recommendation system. And with a bit of luck, this particular usage episode will end with a perfect present and the successful completion of the checkout procedure.

The subjective representation of such a usage episode is an experience. During an experience, the brain continuously constructs an affective commentary on the current state of affairs (e.g., Kahneman, 1999). In other words, it is a stream of valuable and not so valuable moments with a definite beginning (e.g., the opening of a good bottle of wine) and ending (e.g., the last sip of the wine).

During an experience individuals can probe their momentary state. By that, the state becomes conscious. Moreover, individuals are able to make summary judgments about the overall quality of an experience and elements of it in retrospect (Kahneman, 1999; Hassenzahl and Sandweg, 2004). In the field of Human–Computer Interaction (HCI) any subjective product evaluation provided by users after a product trial is an instance of retrospective summary evaluation. For users/consumers those evaluations are the basis for further communicating about the experience and the product. Moreover, evaluations will be memorized and, by that, will presumably guide future behavior.

0953-5438/$ - see front matter © 2007 Elsevier B.V. All rights reserved.

doi:10.1016/j.intcom.2007.05.001

* Corresponding author. Tel.: +49 6341 280261.

E-mail addresses: [email protected] (M. Hassenzahl), ullrich@psychologie.tu-darmstadt.de (D. Ullrich).


Often, researchers and practitioners in the field of HCI take retrospective summary evaluations as direct indicators of product quality. Their implicit assumption is that retrospective evaluations are simple averages or sums of all the moments encountered during the experience. Research, however, reveals a more complex picture. In a well-known study on colonoscopy (Redelmeier and Kahneman, 1996), for example, patients were asked to continuously indicate the intensity of pain experienced during the procedure. After the procedure they provided a retrospective judgment about how painful the procedure was in general. Those judgments were best predicted by a peak-end rule, i.e., an average of the peak intensity and the end intensity of pain experienced during the episode. Similar results were found for less extreme situations, such as retrospective summaries of the affect experienced while watching a movie (Frederickson and Kahneman, 1993). In the context of HCI, Hassenzahl and Sandweg (2004) studied the relation between experience and retrospective product evaluations. Participants of a usability test worked through a series of independent tasks. After each task they were asked to rate the mental effort experienced during the task. Effort served as a proxy for encountered usability problems (Hassenzahl, 2000). Subsequently, they were asked to rate the product’s usability. As expected, more effort led to less favorable evaluations of the product. However, the effort experienced in the last task was a better predictor of the participants’ evaluation than any other indicator (e.g., average of all effort measures, peak/end, variance, trend). Thus, the retrospective usability evaluation was related to, but not identical with, the experience participants had. It was rather constructed on the spot, based on available information, either derived directly from the target (e.g., by taking salient moments from the experience into account) or from general knowledge about and attitudes toward the target (see Kardes et al., 2004).
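The candidate summaries mentioned above (simple average, peak-end, last moment) can be made concrete with a minimal sketch. The function name `summarize` and the ratings are hypothetical and purely illustrative; they are not data from any of the cited studies.

```python
from statistics import mean


def summarize(ratings):
    """Return candidate summary statistics for a sequence of momentary ratings."""
    if not ratings:
        raise ValueError("ratings must not be empty")
    peak = max(ratings)   # most intense moment of the episode
    end = ratings[-1]     # final moment of the episode
    return {
        "mean": mean(ratings),         # the 'simple average' assumption
        "peak_end": (peak + end) / 2,  # peak-end rule: average of peak and end
        "end": end,                    # last-moment summary
    }


# Hypothetical effort ratings for five consecutive tasks (illustrative only)
efforts = [40, 90, 60, 30, 20]
print(summarize(efforts))
```

For this made-up episode the three summaries diverge considerably, which is why it matters which of them a retrospective judgment actually tracks.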

A further aspect that impacts retrospective product evaluations is the situation. Hassenzahl et al. (2002) found tasks to impact the way participants formed overall judgments of how appealing a number of Web sites were. Participants were either given tasks to accomplish with the Web sites (e.g., to find a particular piece of information) or the instruction to just “have fun with the Web sites.” For the task-group, subjective perceptions of usability (e.g., simple–complex) were strongly correlated with appeal (e.g., good–bad) (partial correlation = .87), which was not apparent for the fun-group (partial correlation = −.10). In the former case, usability as a product attribute was predictive of the product’s overall appeal, in the latter not. Depending on the context, the basis of the product evaluation changed.

The present paper examines the relation between measures taken during an experience and retrospectively. Specifically, we let participants assess the valence of their momentary feelings (i.e., affect) and exerted mental effort during the interaction with a digital storytelling system based on the art-E-fact platform (Iurgel, 2004a,b; Spierling, 2005). Affect was included because of its general importance to judgmental processes. Spontaneous good or bad feelings often serve as a basis for judgments on various aspects of tasks or products (Schwarz and Clore, 1983; see Pham, 2004, for an overview). Experienced mental effort is the amount of energy a user has to activate to meet perceived task demands (Arnold, 1999). It is a predictor for encountered usability problems (Hassenzahl, 2000) and thus related to experienced barriers in goal attainment. After the experience, participants were asked to assess their spontaneity while interacting with the product (I considered my actions carefully – I decided spontaneously for actions), the product’s overall appeal and to give positive and negative comments on the product/the experience. In addition, we varied the situation by either inducing an external instrumental goal for the interaction with the product or not. Providing an external goal was expected to change the way experiential measures relate to retrospective measures.

2. Method

2.1. Participants

Thirty individuals (13 female, median age = 26.5, min = 20, max = 51) participated in the study. The majority were students of the Darmstadt University of Technology or employees of the Computer Graphics Center Darmstadt (ZGDV). They received no compensation for participation.

2.2. Stimulus product: art-E-fact

Art-E-fact is a generic platform for the creation of interactive stories developed by the Computer Graphics Center Darmstadt (ZGDV) (Iurgel, 2004a,b; Spierling, 2005; see www.art-e-fact.org for further information). It consists of an authoring tool for the generation of stories (e.g., story-line, virtual characters, interaction rules) and a set-up for viewing and interacting with the story. In the present configuration, the story was displayed by a data projector onto the wall (2 by 1.50 m). During the experiment, participants were standing in front of the wall, watching and interacting with the story.

We used a story generated for a museum environment. While the story’s plot about an art theft unfolded, users were presented with a number of different paintings, background information on artists, the analysis of art and art in general. The story was driven mainly by dialogues between two synthetic characters: a reporter and a knowledgeable art professor. All characters spoke with synthesized voices, with matching gestures and facial expressions.

At particular turns of the story, participants could interact with the system. The basic interaction mechanism was pointing. A camera-based recognition system captured the hand movement and extrapolated where users pointed. This mechanism was used to activate specific functions, introduced and explained by the characters, such as skipping or repeating parts of the story or getting additional information about particular paintings. In addition, the participants could use purpose-built tools, such as a magnifying glass, to enlarge portions of pictures or to reveal hidden layers of paint.

2.3. Independent variable: instrumental goal

A major objective of the present study was to explore differences in the user experience and relations to retrospective judgments depending on the situation. Specifically, the situation was varied by either letting participants experience the system freely at their own pace (no-goal condition) or by asking them to find answers to a number of particular questions with the help of the system (goal condition). Finding particular pieces of information was selected as the instrumental goal (i.e., task) because acquiring information about art was supposed to be a major outcome of the implemented story. Note that the present experiment requires a product that can be experienced either with or without active instrumental goals. Most products studied in HCI are utilitarian systems, built for a particular purpose. Art-E-fact, however, can be used either exploratively, for the sake of the story, or as a source of information about paintings. This twofold nature makes it a particularly interesting stimulus product.

2.4. Experiential measures

2.4.1. Mental effort

Mental effort can be defined as the amount of energy a user has to activate to meet the perceived task demands (Arnold, 1999). This energy depends on objective demands of the task and the individuals’ preparedness and coping strategies. Thus, mental effort is subjective and always tied to a task, i.e., meaningful instrumental goals. The present study used the “Subjective Mental Effort Questionnaire” (SMEQ, Zijlstra, 1993; Eilers et al., 1986), a simple rating scale ranging from 0 (hardly effortful) to 220 (exceptionally effortful). Studies showed this measure to be related to subjective ratings of product usability, appeal (Hassenzahl, 2001) and time spent to correct usability problems (Hassenzahl, 2000). Unpublished data further showed a substantial correlation between the number of usability problems observed by a usability expert and the SMEQ rating (averaged r = .47).

In the present study, we were interested in an experiential rather than a retrospective measure of mental effort. To achieve this, we measured mental effort three times throughout the usage episode (see Section 2.6). The internal consistency of the three separate measurements was satisfactory (no-goal: Cronbach’s α = .78; goal: Cronbach’s α = .84). A 2 × 3 analysis of variance with situation (no-goal, goal) as between-subjects factor, time (1st, 2nd, 3rd measurement) as within-subjects factor and mental effort as dependent variable showed no significant main effect of time or time by goal interaction. We averaged all three measurements, following the logic of Kahneman (1999), to obtain an “objective” experiential measure of mental effort.
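The two steps described above (checking internal consistency across the three measurements, then averaging them per participant) can be sketched as follows. This is a minimal illustration using the standard formula for Cronbach’s α; the function names and the ratings are hypothetical, and the study’s raw data are not reproduced here.

```python
from statistics import mean, variance


def cronbach_alpha(occasions):
    """Cronbach's alpha for k measurement occasions.

    occasions: list of k lists, each holding the scores of the same
    n participants (same order) at one measurement occasion.
    """
    k = len(occasions)
    if k < 2:
        raise ValueError("need at least two measurement occasions")
    # Per-participant sum across occasions
    totals = [sum(scores) for scores in zip(*occasions)]
    # Sum of the variances of the individual occasions
    occasion_variance = sum(variance(scores) for scores in occasions)
    return k / (k - 1) * (1 - occasion_variance / variance(totals))


def experiential_score(occasions):
    """Average the repeated measurements for each participant."""
    return [mean(scores) for scores in zip(*occasions)]


# Hypothetical SMEQ ratings (0-220) of four participants at three occasions
smeq = [
    [60, 90, 120, 40],   # 1st measurement
    [70, 85, 130, 50],   # 2nd measurement
    [55, 100, 110, 45],  # 3rd measurement
]
print(round(cronbach_alpha(smeq), 2))
print(experiential_score(smeq))
```

Averaging is only defensible when the repeated measurements behave like parallel indicators, which is what the consistency check is meant to support.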

2.4.2. Affect

Affect can be defined “as a neurophysiological state that is consciously accessible as a simple non-reflective feeling [. . . This feeling] is an integral blend of hedonic (pleasure–displeasure) and arousal (sleepy–activated) values” (Russell, 2003, p. 147). For the present study, we restricted our analysis to valence – that is, the pleasure and displeasure experienced during interaction with the system – as one of the key dimensions of virtually all models of affect (Russell, 2003). Valence was measured with the “Self Assessment Manikin” (Bradley and Lang, 1994). Similar to mental effort, three separate measurements were taken. The internal consistency was satisfactory (no-goal: Cronbach’s α = .90; goal: Cronbach’s α = .81). A 2 × 3 analysis of variance with situation (no-goal, goal) as between-subjects factor, time (1st, 2nd, 3rd measurement) as within-subjects factor and affect (valence) as dependent variable showed a significant main effect of time, F(2, 56) = 3.68, p < 0.05, best described as a U-shape, i.e., a dip in affect for the mid measurement. However, no significant time by goal interaction emerged. Thus, we averaged all three measurements to obtain an experiential measure of affect.

2.5. Retrospective measures

2.5.1. Evaluation

Seven seven-point items with bipolar verbal anchors were used to capture the retrospective evaluation of the system. The scale is a part of the AttrakDiff2-questionnaire (Hassenzahl et al., 2003) and was already used in a number of different studies (e.g., Hassenzahl, 2001) to measure the appeal of interactive products. Example items are good–bad, inviting–rejecting or likable–disagreeable. The internal consistency of the scale was satisfactory (no-goal: Cronbach’s α = .95; goal: Cronbach’s α = .94). Evaluation (i.e., appeal) was calculated as the average of all seven items.

2.5.2. Acquired knowledge

To measure the amount of acquired knowledge, participants had to answer four questions after finishing the story. All required information was provided in the main story-line; thus, individual choices regarding different story threads or following up additional information cannot be held responsible for differences in the acquired knowledge. The four questions were: (1) “Which was the university James [a reporter] visited to gather information about the stolen painting?” (2) “Name the two artists whose paintings were on display,” (3) “What was Guardi’s [a Venetian artist] profession?,” and (4) “What was the name of the television program the professor wanted to watch?”. Questions were designed to ensure that participants could not rely on previous knowledge.


2.5.3. Spontaneity

Hassenzahl (2003; Hassenzahl, Kekez, & Burmester, 2002) argued that users can be either in goal- or action-mode while interacting with an interactive product. In the goal-mode the current instrumental goal has a certain importance and determines more or less all activities. The system is “a means to an end.” Besides other aspects, individuals tend to plan ahead, i.e., to consider their actions carefully with respect to active goals. In the action-mode the current activity determines goals “on the fly;” goals are “volatile.” Using the system is an “end in itself.” Individuals rather spontaneously decide their actions, because goals will be made up in the course of the interaction instead of guiding the action as in goal-mode.

We tap into participants’ differences in mode by letting them rate their felt “spontaneity” during the experience in retrospect on a seven-point scale with the bipolar verbal anchors I considered my actions carefully – I decided spontaneously for actions, taken from Hassenzahl and colleagues (2002).

2.5.4. Positive/negative comments

In a debriefing interview, participants were asked to comment freely on positive and negative aspects of the system. Specifically, we asked them to think of their experience with the product and to report three particularly positive and three particularly negative aspects of the product or their experience.

2.6. Procedure

Participants were led separately into the laboratory. During the experiment, they stood on a mark approximately 2.50 m in front of the image displayed by a data projector in the size of 2 by 1.50 m. The laboratory was shaded, the display remaining the only light source.

The experimenter started with a short introduction to art-E-fact’s main features, such as using one’s finger as a pointing device and the resulting interaction. After a very brief introduction to the story, participants in the no-goal condition were instructed to just have fun with the story and to do with the product whatever they liked. Participants in the goal condition were told to explore the story and to find answers to four questions on story details and artists’ biography (see Section 2.5.2). After participants indicated that they had understood all instructions, the story was started.

During the experiment, the story was interrupted three times at scene-transitions to let participants assess their momentary affective state (see Section 2.4.2) and the mental effort experienced right at that moment (see Section 2.4.1). Scene-transitions provided “natural” moments for interruptions, thereby keeping the impact of the interruption itself on the current affective state at a minimum (e.g., Adamczyk and Bailey, 2004). The questionnaires were filled in at the position where participants interacted with the system. The first measurement took about 45 s, due to a short introduction given by the experimenter. The remaining two measurements took about 15 s each. On average, participants interacted with the story for 34 min, with a minimum of 25 min and a maximum of 45 min.

After having gone through the story, participants took a 5 min break. After this, they were first asked to assess their spontaneity while using the system (see Section 2.5.3). Subsequently, participants answered the four open knowledge acquisition questions (see Section 2.5.2). For the goal group these questions were already known, and participants simply wrote down the information they remembered from interacting with the story. For the no-goal group the questions were new. Finally, each participant evaluated the product (see Section 2.5.1). The experiment ended with a debriefing interview. Participants were asked to point out up to three particularly positive and three particularly negative aspects of the product or their experience (see Section 2.5.4).

2.7. Predictions

Although the present study is mainly explorative, some predictions concerning the different experiential and retrospective variables can be made. First, the average number of correct answers given to the four knowledge acquisition questions should be higher in the goal compared to the no-goal condition. This is due to the explicit setting of the knowledge acquisition goal and the fact that the questions were known in advance (i.e., during interaction).

Second, mental effort should be higher in the goal compared to the no-goal condition. Mental effort is closely tied to the attainment of instrumental goals, i.e., to tasks (Arnold, 1999). If no goal is made salient, goal-directed behavior and general problems with accomplishing goals (expressed as mental effort) become less likely.

Third, spontaneity should be higher in the no-goal condition compared to the goal condition. Hassenzahl and colleagues (2002) argued that a “have fun”-instruction (as in the no-goal condition) results in an action mode, whereas the explicit setting of goals (as in the goal condition) results in a goal mode. One particular difference between both modes is the amount of spontaneity users experience during interaction.

In addition, particular relations between the different variables were expected. First, mental effort should be related to knowledge acquisition because effort is tied to goal-directed behavior and the number of correctly answered knowledge acquisition questions is the actual measure for goal attainment. Second, the basis of product evaluation (i.e., appeal) should vary with the situation: in the goal condition, absence of mental effort and good knowledge acquisition are supposed to be important predictors of the overall product evaluation. In the no-goal condition, however, both should lose their importance. Third, affect should be related to all variables, emphasizing the importance of affective responses as input for any judgmental process.


3. Results and discussion

3.1. Experiential and retrospective measures

Table 1 (column 2) shows the overall mean and standard deviation of each measure. On the whole, interacting with art-E-fact was experienced as somewhat effortful, but accompanied by positive affect. The system was evaluated as rather appealing. Knowledge acquisition was fair and the participants used the system in a noticeably spontaneous way.

Table 1 (columns 3 and 4) further shows the mean and standard deviation of each measure for the no-goal and goal condition. Significant differences between no-goal and goal condition emerged for mental effort and knowledge acquisition (see Table 1, column 5), with a higher level of mental effort and better knowledge acquisition in the goal condition. Both were expected and can be taken as evidence for the successful manipulation of the situation. The finding that the induction of the knowledge acquisition goal led to an increase in mental effort underlines the close relationship between mental effort and goal-directed behavior. It further gives an insight into the way usability problems were experienced. As long as participants in both conditions used the same features of the product for approximately the same amount of time, both groups were equally likely to encounter usability problems. However, those “objective” usability problems led to an increase in effort only for those participants with an externally imposed goal (to acquire information).
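The group comparison for mental effort in Table 1 can be approximately reproduced from the reported means and standard deviations alone. The sketch below assumes a standard pooled-variance t-test for two independent groups (the paper reports t(28) but does not spell out the formula); the function name `pooled_t` is hypothetical. Small deviations from the published values are to be expected because the inputs are rounded.

```python
from math import sqrt


def pooled_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Independent-samples t statistic (pooled variance) from summary statistics.

    Returns (t, degrees of freedom); t is positive when group b has the higher mean.
    """
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_b - mean_a) / standard_error, df


# Mental effort (SMEQ) as reported in Table 1: no-goal vs. goal, N = 15 each
t_effort, df_effort = pooled_t(61.12, 34.47, 15, 96.23, 42.41, 15)
print(f"t({df_effort}) = {t_effort:.2f}")
```

With equal group sizes the pooled-variance and unequal-variance (Welch) statistics coincide, so the choice between them does not affect this particular check.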

Table 1
Mean (standard deviation), t-tests and homogeneity of variances for all measures

| Measure (scale) | Overall | No-goal (N = 15) | Goal (N = 15) | t(28) | Levene F |
| Mental effort (SMEQ) (0–220) | 78.67 (41.96) | 61.12 (34.47) | 96.23 (42.41) | 2.49* | .23 |
| Affect–valence (SAM) (1–9) | 6.37 (1.40) | 6.52 (1.33) | 6.23 (1.50) | −.56 | .04 |
| Evaluation (APPEAL) (1–7) | 5.17 (1.10) | 5.18 (1.12) | 5.15 (1.11) | −.07 | .01 |
| Knowledge acquisition (0–4) | 2.03 (1.50) | 1.33 (1.35) | 2.73 (1.34) | 2.86** | .02 |
| Spontaneity (I considered my actions carefully – I decided spontaneously for actions) (1–7) | 5.60 (1.61) | 5.67 (1.40) | 5.53 (1.85) | −0.22 | .78 |

* p < .05.
** p < .01.

Table 2
Factor loadings for the no-goal and goal condition

| | No-goal (N = 15), component 1 | No-goal (N = 15), component 2 | Goal (N = 15), component 1 |
| Experiential | | | |
| Mental effort | | −.752 | −.890 |
| Spontaneity (I considered my actions carefully – I decided spontaneously for actions) | .866 | | −.553 |
| Affect–valence | .572 | .341 | .943 |
| Retrospective | | | |
| Knowledge acquisition | | .918 | .472 |
| Evaluation (APPEAL) | .952 | | .891 |
| Eigenvalue | 2.11 | 1.55 | 3.03 |
| Explained variance (%) | 42 | 31 | 60 |

Notes: Principal components analysis; Extraction criterion: Eigenvalue >1, Varimax rotation, all loadings <.300 were suppressed.

To further explore the relationships among the experiential (mental effort, affect, spontaneity) and retrospective measures (evaluation, knowledge acquisition), we performed two separate principal components analyses (PCA, extraction criterion: Eigenvalue >1, Varimax rotation) on each condition (no-goal, goal). Gorsuch (1997) recommends at least a ratio of three independent observations for each included variable, which was given in the present study. To rule out the possibility that differences in the emerging patterns of relations among variables are due to large differences in the variability of those variables, tests of the homogeneity of variance were performed for each variable (see Table 1, column 6). No significant differences in variances emerged.

Table 2 (columns 2 and 3) shows the result of the PCA for the no-goal condition. Two components with an Eigenvalue greater than 1 were extracted. Both components together explained 73% of the total variance. Positive affect and spontaneity, together with the retrospective evaluation of the product, formed one component; low mental effort, positive affect and knowledge acquisition the other component. This pattern disappeared in the goal condition (Table 2, column 4): only one component was extracted (60% explained variance) with loadings of all variables.

Several interesting differences between the patterns of the two conditions (no-goal, goal) emerged. Most strikingly, in the goal condition experienced mental effort was highly related to the retrospective evaluation of the system (direct r = −.80, p < .001); a relation which was reduced in the no-goal condition (direct r = −.47, n.s.). This supports the notion of a context-dependent evaluation process, i.e., an “on the spot” construction of the evaluation. In other words, participants ground their evaluation on variables predictive for a given context, such as mental effort in the context of the goal condition.
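The significance statements attached to the two effort–evaluation correlations can be checked with the usual t-transformation of a Pearson correlation, t = r·√(n − 2)/√(1 − r²) with n − 2 degrees of freedom. The sketch below is an illustration under assumptions: it takes N = 15 per condition from Table 1, uses the rounded correlations from the text, and hard-codes two-tailed critical t values for 13 degrees of freedom from a standard t table. The function name is hypothetical.

```python
from math import sqrt

# Two-tailed critical t values for df = 13, taken from a standard t table
# (an assumption of this sketch, not values reported in the paper).
CRITICAL_T_DF13 = {0.05: 2.160, 0.001: 4.221}


def t_for_correlation(r, n):
    """t statistic for testing a Pearson correlation against zero (df = n - 2)."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)


for label, r in [("goal", -0.80), ("no-goal", -0.47)]:
    t = t_for_correlation(r, n=15)
    below_001 = abs(t) > CRITICAL_T_DF13[0.001]
    below_05 = abs(t) > CRITICAL_T_DF13[0.05]
    print(f"{label}: r = {r:.2f}, t(13) = {t:.2f}, p < .001: {below_001}, p < .05: {below_05}")
```

Under these assumptions the goal-condition correlation clears the .001 threshold while the no-goal correlation does not reach the .05 level, in line with the reported "p < .001" and "n.s.".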

A further interesting difference concerned the relation of spontaneity to evaluation. Whereas in the no-goal condition experienced spontaneity was highly predictive for a favorable evaluation (direct r = .76, p < 0.001), this relation disappeared in the goal condition (r = −.33, n.s.). Remarkably, the relation changed its direction (sign): whereas in the no-goal condition more spontaneity was related to more favorable evaluations, in the goal condition less spontaneity (i.e., more planning) tended to go along with a more favorable evaluation. This underlines the importance of the compatibility between the mode (goal, action) of the individual (i.e., actual mode) and the requirements of the situation at hand (i.e., appropriate mode).

Not surprisingly, affect is context-dependent just as evaluation is. In the no-goal condition positive affect was associated with high spontaneity, whereas in the goal condition affect was rather associated with mental effort. This illustrates the notion that experiences can vary considerably and still be summarized into a simple good/bad-judgment (Kahneman, 1999). Affect is the psychological currency that makes the comparison of qualitatively different experiences possible (Russell, 2003). In line with this view, affect is in general the single best predictor for the retrospective product evaluation. A stepwise regression with mental effort, affect, spontaneity, and knowledge acquisition as predictors and evaluation as criterion resulted in a model with affect as a single predictor (β = 0.69, R² = 0.46, F = 25.51, p < 0.001). Thus, in the present study, affective states during the experience were, if not caused by, at least attributed to the product. This resulted in a clear relationship between affect and evaluation of the product.
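The reported regression statistics can be cross-checked against each other. In a regression with a single predictor, the standardized coefficient equals the Pearson correlation, so R² and F follow from the coefficient and the sample size. The sketch below rests on assumptions that the paper does not state explicitly: that the coefficient of 0.69 is standardized, that the full sample of N = 30 entered the model, and that the reported R² of 0.46 is the adjusted value. The function name is hypothetical.

```python
def single_predictor_stats(beta, n):
    """R^2, adjusted R^2 and F for a regression with one standardized predictor."""
    r_squared = beta ** 2                                 # beta equals r with one predictor
    adjusted = 1 - (1 - r_squared) * (n - 1) / (n - 2)    # adjustment for one predictor
    f_value = r_squared / (1 - r_squared) * (n - 2)       # F with (1, n - 2) df
    return r_squared, adjusted, f_value


# Assumed inputs: standardized coefficient 0.69, N = 30 participants
r2, adj_r2, f_value = single_predictor_stats(beta=0.69, n=30)
print(f"R^2 = {r2:.2f}, adjusted R^2 = {adj_r2:.2f}, F(1, 28) = {f_value:.2f}")
```

Under these assumptions the adjusted R² rounds to the reported 0.46 and F comes out close to the reported 25.51; the small gap is plausibly due to the rounding of the coefficient.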

The principal components analysis of the goal condition showed knowledge acquisition to be, albeit weakly, associated with low mental effort, positive affect, planning and a positive evaluation of the product. In the no-goal condition knowledge acquisition was dissociated from evaluation, simply because knowledge acquisition had not been explicitly mentioned and thus was not used as a basis for the evaluation of the product. Interestingly, high knowledge acquisition in the no-goal condition remained related to low mental effort (they form their own component).

3.2. Positive/negative comments

In the debriefing interview, participants made 93 positive and negative comments on the product or their experience. The authors classified the comments into eight broad categories (see Table 3), addressing particular features of the product (e.g., pointing gesture), content (e.g., the featured story) as well as high-level attributes such as the fun derived from using the product. Eighty-four percent of comments (78 of 93) could be categorized this way. The remaining 16% were not classifiable and were omitted from further analyses.

Table 3
Comment categories, their definition and an example

Category     Definition: comments refer to ...                                  Example [#participant-number]
Voice        ... the synthesized voice and its intelligibility                  “the voice is not intelligible and ridiculous” [#12]
Interaction  ... the opportunity of interacting differently with the product    “options: you may learn more but you haven’t have to” [#6]
Tools        ... the used tools (e.g., magnifying glass)                        “tools had malfunctions” [#3]
Pointing     ... the product’s gesture-based pointing feature                   “pointing is inaccurate and exhausting” [#12]
Characters   ... the displayed characters                                       “characters are static and artificial” [#30]
Story        ... the story plot                                                 “the story is vivid” [#12]
Novelty      ... the product’s novelty and innovation                           “a new type of knowledge transfer” [#4]
Fun          ... the experienced fun                                            “it’s playful to use the program” [#27]

Table 4 shows the absolute and relative frequencies for comments in the different categories overall, separated by positive and negative comments as well as for each condition. The most frequently mentioned category was “voice” (26%, 20 of 78), referring to, for example, voice synchronicity and speed. The categories “novelty” and “fun” received the fewest mentions (4% each, 3 of 78).

A simple way to examine differences between the goal and the no-goal condition is to analyze the relationship (i.e., correlation) between category rankings. The more frequent the comments in a particular category, the lower its corresponding rank. A low correlation between the rankings of both groups hints at differences between groups, whereas a high correlation indicates correspondence.

Goal and no-goal rankings (computed on the total number of comments in the category) did not correlate significantly, ρ = .39, n.s., n = 8. Participants in the goal condition commented more frequently on “interaction” and “story” issues, whereas participants in the no-goal condition commented more frequently on “tools”. We further analyzed differences between conditions for positive and negative comments separately. Participants in both conditions agreed on the negative aspects of the product, ρ = .88, p < .01, n = 8, but disagreed on the positive aspects, ρ = .51, n.s., n = 8. As expected, comments of participants in the goal condition revolved around goal attainment, i.e., interaction with the product and problems with understanding the synthesized voices, whereas participants in the no-goal condition had no preferred issue: their comments were spread more evenly over all categories.

Table 4
Absolute frequencies (%) of comment categories

             Overall                        Goal                         No-goal
Category     Total    Positive  Negative    Total  Positive  Negative    Total  Positive  Negative
Voice        20 (26)  1         19          11     -         11          9      1         8
Interaction  15 (19)  13        2           10     9         1           5      4         1
Tools        12 (15)  3         9           3      1         2           9      2         7
Pointing     11 (14)  4         7           5      1         4           6      3         3
Characters   8 (10)   2         6           3      1         2           5      1         4
Story        6 (8)    5         1           5      4         1           1      1         -
Novelty      3 (4)    3         -           1      1         -           2      2         -
Fun          3 (4)    3         -           1      1         -           2      2         -

Total        78       34        44          39     18        21          39     16        23
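The ranking comparison described in Section 3.2 can be sketched with the per-category totals from Table 4. The code below is a minimal illustration, not the authors' procedure: it assumes Spearman's rank correlation computed as the Pearson correlation of average ranks, with tied categories sharing the mean rank. Because the original handling of ties is not described, the result need not match the reported coefficient exactly. All function names are hypothetical.

```python
from statistics import mean

# Total comments per category, transcribed from Table 4.
CATEGORIES = ["Voice", "Interaction", "Tools", "Pointing",
              "Characters", "Story", "Novelty", "Fun"]
GOAL_TOTALS = [11, 10, 3, 5, 3, 5, 1, 1]
NO_GOAL_TOTALS = [9, 5, 9, 6, 5, 1, 2, 2]


def average_ranks(counts):
    """Rank counts so the most frequent category gets rank 1; ties share the mean rank."""
    order = sorted(range(len(counts)), key=lambda i: -counts[i])
    ranks = [0.0] * len(counts)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next category has the same count (a tie).
        while end + 1 < len(order) and counts[order[end + 1]] == counts[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of the 1-based positions pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the average ranks (assumed tie handling)."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


rho = spearman_rho(GOAL_TOTALS, NO_GOAL_TOTALS)
print(f"rho = {rho:.2f}")
```

The same two functions could be applied to the positive and negative columns separately to mirror the follow-up comparison in the text.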

M. Hassenzahl, D. Ullrich / Interacting with Computers 19 (2007) 429–437 435

4. Summary and conclusion

Having an active instrumental goal or not impacts not only the experience of an interactive product per se, but also subsequent retrospective judgments. In the present study, active instrumental goals made barriers to their attainment salient, as indicated by an increase in mental effort. In addition, mental effort was negatively related to affect, and both acted as input to retrospective product evaluations (i.e., appeal). Goal attainment itself, approximated by the number of correct answers given to a series of questions (i.e., knowledge acquisition), was also related to product evaluation, albeit not as strongly. Interestingly, the more spontaneous participants felt during interaction, the more effort and the more negative affect they experienced. In addition, spontaneity was related to a reduced appeal of the product.

If no instrumental goal was active, experience and the relation of experiential variables to retrospective product evaluations changed. Now, spontaneity was experienced as positive and was highly related to a positive product evaluation. Mental effort and knowledge acquisition still formed a group of goal-related aspects; however, both were dissociated from product evaluation. Insofar as mental effort is related to pragmatic quality perceptions (i.e., instrumental quality, usability and utility of a product), this conceptually replicates the findings of Hassenzahl et al. (2002), with pragmatic quality being predictive of product appeal in a situation with active instrumental goals, but not in a situation where participants were told to have fun.

The apparently profound impact of active instrumental goals on the way people experience and evaluate an interactive product is especially relevant for HCI, as long as “the task” is one of the pivotal elements of product evaluation (Rubin, 1994). Tasks induce instrumental goals and focus participants on their attainment. Accordingly, the product is evaluated in terms of its capability to support goal attainment (i.e., its usability). However, the present study shows that this is not necessarily the only way products can be experienced and judged. Without external instrumental goals given, participants change their set of criteria, as demonstrated by the different roles mental effort and spontaneity played in the goal compared to the no-goal condition. This highlights how the setting of an evaluation impacts its results, an effect broadly called demand characteristics (Orne, 1969). Cordes (2002) provided a striking example of demand characteristics in the context of usability testing, an effect he dubbed the “I know it can be done or you wouldn’t have asked me to do it” bias. He showed participants in a standard usability test setting to be six times more persistent in accomplishing tasks if they were told that they should not assume “that the product can perform each task that we are going to ask you to do” (p. 413). Given this instruction, they were not only more persistent; they also rated the same tasks as being 14 times less difficult. The present study shows that the mere evaluation technique itself, i.e., a strong focus on tasks, alters the set of criteria used and may thus impact the result of the corresponding evaluation. Note that in the present case, appeal did not differ between conditions. However, this may simply be the consequence of the balanced nature of the product studied. In other words, art-E-fact may be relatively appropriate for both types of use, a goal-oriented and an action-oriented one. One can easily imagine other interactive products that have to cater for both types of use. In this case standard usability testing may be severely limited.

On a more general level, the study demonstrated not only the context-dependency of overall evaluative judgments (e.g., Hassenzahl, 2003), but also the pervasive role of affect for retrospective judgments and the importance of mode compatibility. In both conditions (goal, no-goal), affect was related to product evaluations. However, in the no-goal condition, affect was the only variable with loadings on both components (spontaneity/evaluation versus mental effort/knowledge acquisition). This underlines the central role of affect in user experience. Interaction with a product, whether goal-oriented or not, is inevitably accompanied by affect. In both conditions, for example, mental effort was accompanied by negative affect, i.e., experienced as negative. In the present case, the measurement of affect was experiential, i.e., measures were momentary snapshots of affective states taken from the interaction. Experiential affect is related to but not identical with product evaluation. Only experiential affect attributed to the product will be relevant for a subsequent evaluative judgment. This becomes very apparent in the no-goal condition, where only the proportion of affect related to spontaneity was relevant for evaluation, but not affect related to mental effort. In other words, individuals base their evaluations on affective experience, but not without a further stage of “editing,” in which affect has to be related (i.e., attributed) to the product to be used. The experiential character of affect in the present study distinguishes it from studies such as Zhang and Li’s (2004). They showed affective quality to be a predictor of usability, utility and ultimately of behavioral intentions. Affective quality (Russell, 2003), however, is a construct already closely tied to an object, namely its perceived ability to change affect. The present study extends these findings by showing how people selectively base their evaluative judgments on affective experience during product use.

Spontaneity during interaction was valued very differently. Whereas in the no-goal condition spontaneity was valued in terms of a positive relation to product evaluation, the opposite was found in the goal condition. Hassenzahl (2003) argued for two modes of using an interactive product, a goal-mode and an action-mode. In this concept, aspects of user experience appropriate for one mode may be inappropriate for the other. In other words, the same experiential quality, such as the spontaneous interaction with a product, may be appreciated in one mode but spurned in the other. The present study supports this notion. The same level of spontaneity was either experienced as appropriate, i.e., accompanied by positive affect and a favorable product evaluation, or as inappropriate, i.e., accompanied by negative affect and an unfavorable product evaluation. This points to the importance of what we call mode compatibility, i.e., a compatibility of the way one ought to and actually does approach a product.

The present study is certainly limited in scope. Obviously, only a single product was studied, with a limited number of people, which may render the broader generalization of results problematic. Note, however, the exploratory nature of the study and the unique opportunity to study an experimental interactive product explicitly designed for both a goal-oriented (i.e., knowledge acquisition) and an action-oriented mode (i.e., story, playful exploration). Future studies may further explicate and specify the relationship between experience and retrospective product evaluations as well as its moderation by different usage situations. These would surely be important contributions to the growing field of user experience research (Hassenzahl and Tractinsky, 2006).

Acknowledgements

We are grateful to Dr. Stefan Göbel, Anja Hoffmann, Ido Iurgel, and the whole team of the Computer Graphics Center Darmstadt (ZGDV) for supporting the present study. The art-E-fact project “Generic Platform for the Creation of Interactive Art Experience in Mixed Reality” was funded by the European Union in the context of the Fifth Framework Program (IST-2001-37924). For further information visit: www.art-e-fact.org.

References

Adamczyk, P.D., Bailey, B.P., 2004. If not now, when? The effects of interruption at different moments within task execution. In: Proceedings of the CHI 04 Conference on Human Factors in Computing Systems. ACM, New York, pp. 271–278.

Arnold, A.G., 1999. Mental effort and evaluation of user interfaces: a questionnaire approach. In: Bullinger, H.-J., Ziegler, J. (Eds.), Proceedings of the HCII ’99 international conference on human–computer interaction, vol. 1. Lawrence Erlbaum, Mahwah, NJ, pp. 1003–1007.

Bradley, M.M., Lang, P.J., 1994. Measuring emotion: the self-assessment manikin and the semantic differential. Journal of Behavior Therapy and Experimental Psychiatry 25, 49–59.

Cordes, E.R., 2002. Task-selection bias: a case for user-defined tasks. International Journal of Human–Computer Interaction 13, 411–420.

Eilers, K., Nachreiner, F., Hänecke, K., 1986. Entwicklung und Überprüfung einer Skala zur Erfassung subjektiv erlebter Anstrengung. Zeitschrift für Arbeitswissenschaft 40, 215–224.

Frederickson, B.L., Kahneman, D., 1993. Duration neglect in retrospective evaluations of affective episodes. Journal of Personality and Social Psychology 65, 45–55.

Gorsuch, R.L., 1997. Exploratory factor analysis: its role in item analysis. Journal of Personality Assessment 68, 532–560.

Hassenzahl, M., 2000. Prioritising usability problems: data-driven and judgement-driven severity estimates. Behaviour & Information Technology 19, 29–42.

Hassenzahl, M., 2001. The effect of perceived hedonic quality on product appealingness. International Journal of Human–Computer Interaction 13, 479–497.

Hassenzahl, M., 2003. The thing and I: understanding the relationship between user and product. In: Blythe, M., Overbeeke, C., Monk, A.F., Wright, P.C. (Eds.), Funology: From Usability to Enjoyment. Kluwer, Dordrecht, pp. 31–42.

Hassenzahl, M., Burmester, M., Koller, F., 2003. AttrakDiff: Ein Fragebogen zur Messung wahrgenommener hedonischer und pragmatischer Qualität. In: Ziegler, J., Szwillus, G. (Eds.), Mensch & Computer 2003. Interaktion in Bewegung. B.G. Teubner, Stuttgart, Leipzig, pp. 187–196.

Hassenzahl, M., Kekez, R., Burmester, M., 2002. The importance of a software’s pragmatic quality depends on usage modes. In: Luczak, H., Cakir, A.E., Cakir, G. (Eds.), Proceedings of the 6th international conference on Work With Display Units (WWDU 2002). ERGONOMIC Institut für Arbeits- und Sozialforschung, Berlin, pp. 275–276.

Hassenzahl, M., Sandweg, N., 2004. From mental effort to perceived usability: transforming experiences into summary assessments. In: Proceedings of the CHI 04 Conference on Human Factors in Computing Systems. Extended abstracts. ACM, New York, pp. 1283–1286.

Hassenzahl, M., Tractinsky, N., 2006. User experience – a research agenda [Editorial]. Behavior & Information Technology 25, 91–97.

Iurgel, I., 2004a. From another point of view: art-E-fact. In: Göbel, S., Spierling, U., Hoffmann, A., Iurgel, I., Schneider, O., Dechau, J., Feix, A. (Eds.), Technologies for Interactive Digital Storytelling and Entertainment: Second International Conference, TIDSE 2004, Darmstadt, Germany, June 24–26. Springer, Berlin, Heidelberg, pp. 26–35.

Iurgel, I., 2004b. Narrative dialogues for educational installations. In: Brna, P. (Ed.), Proceedings of the third conference on narrative and interactive learning environments, NILE 2004, August 10–13, pp. 9–16.

Kahneman, D., 1999. Objective happiness. In: Kahneman, D., Diener, E., Schwarz, N. (Eds.), Well-being: The Foundations of Hedonic Quality. Sage, New York, pp. 3–25.

Kardes, F.R., Posavac, S.S., Cronley, M.L., 2004. Consumer inference: a review of processes, bases, and judgment contexts. Journal of Consumer Research 14, 230–256.

Orne, M.T., 1969. Demand characteristics and the concept of quasi-controls. In: Rosenthal, R., Rosnow, R.L. (Eds.), Artifact in Behavioral Research. Academic Press, New York, pp. 143–179.

Pham, M.T., 2004. The logic of feeling. Journal of Consumer Psychology 14, 360–369.

Redelmeier, D., Kahneman, D., 1996. Patients’ memories of painful medical treatments: real-time and retrospective evaluations of two minimally invasive procedures. Pain 116, 3–8.

Rubin, J., 1994. Handbook of Usability Testing: How to Plan, Design, and Conduct Effective Tests. Wiley, New York.

Russell, J.A., 2003. Core affect and the psychological construction of emotion. Psychological Review 110, 145–172.

Schwarz, N., Clore, G.L., 1983. Mood, misattribution, and judgments of well-being: informative and directive functions of affective states. Journal of Personality and Social Psychology 45, 513–523.

Spierling, U., 2005. Interactive digital storytelling: towards a hybrid conceptual approach. Paper presented at DIGRA 2005, Simon Fraser University, Burnaby, BC, Canada. Available at: <http://www.digra.org/dl/db/06278.24521.pdf/>.

Zhang, P., Li, N., 2004. Love at first sight or sustained effect? The role of perceived affective quality on user’s cognitive reactions to information technology. In: Proceedings of the Twenty-Fifth International Conference on Information Systems (ICIS), pp. 283–295.

Zijlstra, R., 1993. Efficiency in Work Behaviour. A Design Approach for Modern Tools. Delft University Press, Delft.