

Planning Support System For User Experience Evaluation

by

Bjarni Kristján Leifsson

Thesis of 60 ECTS credits submitted to the School of Computer Science at Reykjavík University in partial fulfillment

of the requirements for the degree of Master of Science (M.Sc.) in Software Engineering

May 2018

Supervisor:

Marta Kristín Lárusdóttir, Supervisor
Associate Professor, Reykjavík University, Iceland

Mohammad Hamdaqa, Co-supervisor
Assistant Professor, Reykjavík University, Iceland

Examiner:

Guðlaugur Stefán Egilsson, Examiner
Software Developer and Consultant, Kolibri, Iceland

Hannes Högni Vilhjálmsson, Examiner
Associate Professor, Reykjavík University, Iceland


Copyright
Bjarni Kristján Leifsson

May 2018


Planning Support System For User Experience Evaluation

Bjarni Kristján Leifsson

May 2018

Abstract

As technology advances, the challenges facing software development expand. One of those challenges is the requirement from users that software be fast, clear and easy to use. Through UX evaluation, feedback is gathered that is valuable for further development of the software. The evaluations provide insight into all aspects of the software: a detailed description that provides the groundwork for fixing user problems and improving the user experience. Although research has shown that developers understand the importance of UX evaluations and their vital role in software development, evaluation is still one of the first things to be cut out of the project plan. The stated reasons are that it is time-consuming and expensive, using valuable development time and some of the limited resources. We believe that it is possible to assist developers in conducting UX evaluations by providing them with a framework and a tool that utilizes the most important factors that affect evaluation planning. The tool provides a questionnaire based on these factors, to reduce time and cost and manage complexity. For this project we compared and analyzed four different frameworks that support developers in planning UX evaluations (UXE), identifying similarities and differences between them. We concluded which factors need to be considered when planning a UX evaluation and suggest a framework including these factors. We identified 26 such factors, which were used to create and implement the UXE framework, and 52 questions that generate a plan for UX evaluations. There is no planning support system (PSS) that helps developers to plan UX evaluations. In this project we built a prototype, the UXE PSS, based on the UXE framework, to help developers to plan UX evaluations. Preliminary evaluations of the UXE PSS show that the UXE framework can help with planning, broaden the understanding of what is needed for an evaluation, reduce time and cost, and manage the complexity of the process. Furthermore, we conclude that further research, including more testing, is required, and we believe this work provides a basis for that.


Planning Support System For User Experience Evaluation

Bjarni Kristján Leifsson

May 2018

Útdráttur

The challenges facing software development evolve alongside technological progress. One such challenge is the demand, made by the users of software, that it be fast, easy to understand and easy to use. One way to gain assurance that user requirements are met is to conduct User Experience (UX) evaluations. Such evaluations provide insight into all aspects of the software in the form of detailed descriptions that lay the groundwork for fixing problems that users experience, extending the usefulness of the software and increasing the user's positive experience of it. Although research has shown that developers recognize the importance of user evaluations and their role in software development, they are the first thing developers remove from their project plans. The stated reasons are that evaluations are too time-consuming and costly: they take too much time away from development and consume too much of its limited resources. Our view is that it is possible to assist developers in planning user evaluations by providing them with a framework and an interactive software tool. The tool uses every element of the framework and should simplify the planning of user experience evaluations (UXE) and reduce the time and cost of carrying them out. In carrying out this project we compared four frameworks that support UXE, identified what the frameworks have in common and where they differ, and analyzed which factors must be considered when planning such evaluations. We identified 26 such factors, which were used to design and implement the UXE framework for user experience evaluations, together with a questionnaire of 52 questions that shape a plan for a user experience evaluation. A planning support system (PSS) was also built on the framework produced in the project (the UXE framework). The tool assists software practitioners in planning user experience evaluations. Initial results show that the UXE framework helps with planning user experience evaluations by deepening the understanding of what such evaluations require, thereby reducing the time, cost and complexity of the process. Furthermore, we conclude that further research, with accompanying experiments, is needed, and we believe this work provides a basis for it.


Acknowledgements

This thesis would not have been possible without the inspiration and support of a number of wonderful individuals.

First and foremost I would like to thank my supervisors, Dr. Marta Kristín Lárusdóttir and Dr. Mohammad Hamdaqa, for their guidance throughout my master's studies. This work would not have been possible without their help.

I would like to thank Ingibergur Sindri Stefnisson, Jón Hjörtur Brjánsson and Sigurgrímur Unnar Ólafsson for their valuable feedback on this thesis.

I would like to thank Dr. Anna Ingólfsdóttir and Dr. Marcel Kyas for their pep talks and guidance when I really needed them.

I would like to thank my parents, Aðalheiður Bjarnadóttir and Leifur Kristjánsson, for all their support throughout my studies at Reykjavík University. I truly appreciate everything that you have done for me.

Finally, I would like to thank my daughter Anja Björk Bjarnadóttir for her support, patience and understanding; I'm looking forward to spending more time with you.


Contents

Acknowledgements

Contents

List of Figures

List of Tables

1 Introduction
  1.1 Goal and Research Questions
  1.2 Research Contributions
  1.3 Thesis Organization

2 Related Work
  2.1 Evaluation Process
  2.2 User Experience (UX)
  2.3 UX Evaluation Methods
  2.4 Usability and UX Evaluation Tools
  2.5 Planning Support System (PSS)

3 Approach
  3.1 List of Evaluation Frameworks
    3.1.1 The RAMES Framework
    3.1.2 The Evaluation Framework by Kwahk and Han
    3.1.3 The Common Industry Format
    3.1.4 SQuaRE
  3.2 Framework Analysis
  3.3 List of Factors
    3.3.1 Roles
    3.3.2 Activities
    3.3.3 Materials
    3.3.4 Environment and Equipment
    3.3.5 System
  3.4 Framework Creation and Implementation
  3.5 Evaluation Plan
  3.6 Framework Evaluation

4 The UXE PSS Framework
  4.1 Roles
    4.1.1 System Users
    4.1.2 Evaluators
    4.1.3 Observers
    4.1.4 Technical Writers
    4.1.5 Recipients
  4.2 Activities
    4.2.1 The purpose
    4.2.2 Schedules
    4.2.3 Methodology
    4.2.4 Analysis
    4.2.5 Decision
  4.3 Materials
    4.3.1 Evaluation Material
    4.3.2 Data Collection
    4.3.3 Result
    4.3.4 Decisions
    4.3.5 Resource Cost
  4.4 Environment and Equipment
    4.4.1 Settings
    4.4.2 Physical Ambient Condition
    4.4.3 Social Conditions
    4.4.4 Data Collection
    4.4.5 Data Analysis
  4.5 System
    4.5.1 Description
    4.5.2 Type
    4.5.3 Stage
    4.5.4 Part
    4.5.5 Required Equipment
    4.5.6 Risk and Issues

5 The UXE PSS
  5.1 Quick Overview
  5.2 Design Considerations
  5.3 Three Layered Application
  5.4 Presentation Layer
  5.5 Business Layer
  5.6 Data Layer

6 Evaluation of UXE PSS
  6.1 Evaluation Overview
  6.2 Usability Goals
  6.3 User Experience Goals
  6.4 The User Evaluations
  6.5 First User Evaluation
    6.5.1 Evaluation Plan
    6.5.2 The Participants
    6.5.3 The Evaluation Process
    6.5.4 The Results
  6.6 Second User Evaluation
    6.6.1 Evaluation Plan
    6.6.2 The Participants
    6.6.3 The Evaluation Process
    6.6.4 The Results
  6.7 Evaluation Limitations

7 Conclusion and Future Work
  7.1 Thesis Summary
  7.2 Evaluation Summary
  7.3 Research Contribution
  7.4 Conclusion
  7.5 Future Work

Bibliography

A Questions

B User Evaluation Text Information
  B.1 First Evaluation
  B.2 Second Evaluation


List of Figures

3.1 Research approach.
3.2 Diagram of framework to questionnaire.
3.3 The process of making an evaluation plan with the planner.
3.4 How stating the number of users in an evaluation can affect other questions.
5.1 A user requesting the list of plans for a project: the presentation layer sends a request to the business layer, which sends a query to the data layer; the data layer responds with the current information about the request, and the response is relayed back the same way.
5.2 The Model contains business logic, the Controller interacts with the Model to create data for the View, and the View renders content to the user and relays user commands to the Controller.
5.3 MVC architecture implemented in the three-layer architecture.
5.4 The front page of the UXinn application.
5.5 How the list of projects is presented to users.
5.6 How the list of plans and feedback is presented to users.
5.7 An example of the questions that are presented to users.
5.8 How the answers to the questions are represented in the evaluation plan.
5.9 An HTTP POST login request sent to the server with username and password. The server issues a JSON Web Token that is then used for access control and validation.[49]
5.10 A client sending a request to the application through Nginx as a reverse proxy and pm2 as a load balancer. Applications 1, 2 and 3 represent three instances of the UXinn API.
5.11 The yellow sector of the diagram stores the language option and the categories within the system: Roles, Activities, Materials, Environment and System.
5.12 The red sector of the diagram stores information from the feedback: the feedback questions and the checkbox and radio choices of each question.
5.13 The green sector of the diagram stores information from the UXE PSS framework: its questions and the options from checkboxes, radio buttons and dropdown menus.
5.14 The purple sector of the diagram stores information about projects, plans and plan types. This information is attached to a user id.
5.15 The blue sector of the diagram stores the content of the application, where to find pictures, and information about collaborators in the project, e.g. the developers of the application. The orange sector stores the information on each user in the system: hashed password, email, roles and other needed user information.
6.1 An overview of a user giving feedback on the evaluation he conducted.


List of Tables

3.1 Comparing the frameworks, showing the Roles category, its factors and characteristics; an x marks that the framework includes the characteristic.
3.2 Differences between the frameworks for the background characteristic in the system user factor.
3.3 The list of 26 factors, grouped into the categories included in the framework.
3.4 The factors defined for the Roles category, with a description of each characteristic within each factor.
3.5 The factors defined for the Activities category, with a description of each characteristic within each factor.
3.6 The factors defined for the Materials category, with a description of each characteristic within each factor.
3.7 The factors defined for the Environment and equipment category, with a description of each characteristic within each factor.
3.8 The factors defined for the System category, with a description of each characteristic within each factor.
3.9 Questions in each characteristic for the System user factor.
3.10 Questions in each characteristic for the Schedules factor.
3.11 Questions in each characteristic for the Methodology factor.
3.12 Questions in each characteristic for the Resource cost factor.
3.13 Proposal of the evaluation plan in the UXE PSS framework.
4.1 An example of the characteristics of the System Users factor in the Roles category.
4.2 An example of the characteristics of the Evaluators factor in the Roles category.
4.3 An example of the characteristics of the Observers factor in the Roles category.
4.4 An example of the characteristics of the Technical Writers factor in the Roles category.
4.5 An example of the characteristics of the Recipients factor in the Roles category.
4.6 An example of the characteristics of The purpose factor in the Activities category.
4.7 An example of the characteristics of the Schedules factor in the Activities category.
4.8 An example of the characteristics of the Methodology factor in the Activities category.
4.9 An example of the characteristics of the Analysis factor in the Activities category.
4.10 An example of the characteristics of the Decision factor in the Activities category.
4.11 An example of the characteristics of the Evaluation material factor in the Materials category.
4.12 An example of the characteristics of the Data collection factor in the Materials category.
4.13 An example of the characteristics of the Result factor in the Materials category.
4.14 An example of the characteristics of the Decisions factor in the Materials category.
4.15 An example of the characteristics of the Resource cost factor in the Materials category.
4.16 An example of the characteristics of the Settings factor in the Environment and equipment category.
4.17 An example of the characteristics of the Physical ambient condition factor in the Environment and equipment category.
4.18 An example of the characteristics of the Social conditions factor in the Environment and equipment category.
4.19 An example of the characteristics of the Data collection factor in the Environment and equipment category.
4.20 An example of the characteristics of the Data analysis factor in the Environment and equipment category.
4.21 An example of the characteristics of the Description factor in the System category.
4.22 An example of the characteristics of the Type factor in the System category.
4.23 An example of the characteristics of the Stage factor in the System category.
4.24 An example of the characteristics of the Part factor in the System category.
4.25 An example of the characteristics of the Required equipment factor in the System category.
4.26 An example of the characteristics of the Risk and Issues factor in the System category.
5.1 List of the main characteristics of a REST service.
6.1 List of the targets for our effectiveness, efficiency and satisfaction goals.
6.2 List of tasks in the user evaluations. The first nine tasks were used in the first evaluation, all eleven tasks in the second evaluation, and task #4 in the third evaluation.
6.3 Evaluation plan for conducting the first user evaluation of the UXE PSS.
6.4 Whether each user finished each task with correct data, marked with a ✓ or left empty if not.
6.5 The task effectiveness for solving task 4: Complete a new plan for a project.
6.6 The resources used for completing each task (both correct and incorrect data).
6.7 Evaluation plan for conducting the second user evaluation of the UXE PSS.
6.8 Whether each user finished each task with correct data, marked with a ✓ or left empty if not.
6.9 The task effectiveness for solving task 4: Complete a new plan for a project.
6.10 The resources used for completing each task (both correct and incorrect data).
6.11 The resources used for completing each task (both correct and incorrect data).
A.1 The 16 questions defined for factors in the Roles category.
A.2 The 8 questions defined for factors in the Activities category.
A.3 The 14 questions defined for factors in the Materials category.
A.4 The 6 questions defined for factors in the Environment and equipment category.
A.5 The 8 questions defined for factors in the System category.


Chapter 1

Introduction

When developing software, it is essential that the software be easy to use and relatively bug-free, and that it meets the requirements of the user before the product is launched on the open market. The most common way of trying to fulfill these requirements is to conduct user tests. These tests are known by many names, such as product testing, design testing and usability testing. Although most developers understand the importance of user tests in software creation, they are one of the first things pulled out of the software development lifecycle; the main reasons are that they use valuable development time and are considered time-consuming.[1][2] To estimate the usability of new software, the developers need to see things from the user's perspective and understand how the user will use the product. To this end, the best way to get the user's perspective is to conduct a User Experience (UX) evaluation.[3]

The purpose of the UX evaluation is to gather feedback on the software from the users; it gives needed insight into all aspects of the flaws and a detailed description of them. Hence the information can be used to fix the problem cases and improve the usability and the user experience.[3] Some frameworks for UX evaluation exist, e.g. RAMES[4]; such frameworks collect information about several factors that affect the evaluation planning process manually, and then rely on UX evaluation experts to do the actual planning and decision making. There is no digital tool that combines the frameworks we identified to help evaluators conduct a UX evaluation efficiently by recommending the best direction for planning the evaluation.

It is common when developing software to follow a process that is used to structure, plan and control the life cycle of developing information systems. Many such methods have evolved over the past years, and each of them has its strengths and weaknesses. Most of them share the same vision when it comes to software development: analyzing, gathering requirements, devising a plan, and evaluating.

As there are significant advantages and disadvantages among the methods, the best-fitted solution to a problem often depends on the problem's type. By understanding the problem, we can effectively plan how we are going to address it and then come up with a solution. It is a common understanding that planning is good practice when developing software, and an evaluation plan is a vital document that should be produced every time. This plan should explain the process of how the software is being tested, with step-by-step instructions. It can then easily be referenced at any later date, should that be needed, and it will give insights and vital information about what happened while the evaluation was conducted. However, a fair number of things need to be taken into consideration when making an evaluation plan: it should be clear and as concise as possible, and simple to understand and navigate for anyone who wants to review it.


What ends up in the evaluation plan depends on the type of software and the method used. The plan must be tailored to the needs of each evaluation, be straightforward and contain all the necessary information. Essentially, three main subjects must be included in such a plan and should be a priority: what is being tested, how the evaluation is conducted, and how the outcome will be defined. All other information is based on the evaluation itself.

1.1 Goal and Research Questions

One of the easiest ways to ensure that the evaluation plan is correct and meets the requirements is to start with a template. That will keep the plan structured, help to make sure that everything is included, and help ensure that the evaluation plan is robust.

The goal of this thesis is to assist UX evaluators in planning a user evaluation by providing them with a framework that utilizes the most important factors that affect evaluation planning, and a tool that provides a questionnaire based on these factors, enabling the evaluators to efficiently plan their evaluations in accordance with a specific method or approach and within time and budget limits.

Specifically, this thesis tries to answer the following research questions:

1. What factors are important for planning UX evaluations?

2. How usable is a planning support system based on these important factors?

1.2 Research Contributions

This thesis provides the following original contributions:

1. Evaluated a number of UX evaluation frameworks.

2. Identified a set of factors that must be taken into consideration when conducting a UX evaluation.

3. Built a framework based on the selected factors that can help UX evaluators to plan a UX evaluation in order to:

• Reduce time

• Reduce cost

• Manage complexity

4. Built a prototype of the UXE PSS application.

5. Evaluated the prototype of the UXE PSS application.

1.3 Thesis Organization

The rest of this thesis is organized as follows: In Chapter 2 we introduce related work, which creates context and helps the reader understand what this work is about. In Chapter 3 we present the approach of this thesis. In Chapter 4 we present


the User Experience Evaluation Planning Support System (UXE PSS) framework. In Chapter 5 we present the implementation of the UXE PSS application, which is based on the UXE PSS framework. In Chapter 6 we present the initial evaluation of the framework. Finally, in Chapter 7 we summarize Chapters 2 to 6 and give our conclusions and future work.


Chapter 2

Related Work

In this study we identified several frameworks that are relevant as related work: RAMES, a framework supporting user-centered evaluation in research and practice [4]; Kwahk and Han's methodology for evaluating the usability of audiovisual consumer electronic products [5]; the ISO/IEC 25062 standard, better known as the Common Industry Format (CIF) [6]; and ISO/IEC 25066, better known as Systems and software Quality Requirements and Evaluation (SQuaRE) [7].

2.1 Evaluation Process

Evaluation is a process used to critically examine a product with research methods and practices that measure changes in the product. The purpose of the evaluation is to increase the knowledge about one or several aspects of a product by collecting and analyzing information, in order to improve its effectiveness and the decision making around it.[8]

There are many reasons why we would want to conduct an evaluation: we may want to improve the design of a product or the implementation of software, or we may want to measure the success of a product to be sure that we are fulfilling the needs of the customer. Ultimately we want to identify and improve every aspect of the product to meet specific goals.

Evaluations can be divided into two categories: formative and summative.[9] In formative evaluation, the product is evaluated during the development cycle, or in the early stage of implementation, to provide valuable information for improving the product and for progress monitoring that measures whether we are achieving our goals. In summative evaluation, the product is evaluated when it is well established and thought to be ready for production; it tells whether the product should be adopted, continued, or modified for improvements. Both methods are recommended for use as often as possible, as they provide developers with valuable feedback for necessary modifications (formative) as well as a recurring review of progress on goals and objectives (summative).

Making a plan is an essential part of the evaluation process, and each evaluation must be well planned and carefully executed. The plan provides information that will guide us in improving the product during the development cycle. It sets out the details of the evaluation: what is to be evaluated, how it will be executed, what information is collected, when to conduct the evaluation, how the information will be analyzed, and how we will report the results. A sketch of these elements as a simple data structure is given below.
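To make these elements concrete, the following is a minimal TypeScript sketch of the fields such a plan could carry. The type and field names are our own illustration, not the thesis's model or the structure used in the UXE PSS.

interface EvaluationPlan {
  subject: string;           // what is to be evaluated
  method: string;            // how the evaluation will be executed
  dataCollected: string[];   // what information is collected
  scheduledFor: Date;        // when the evaluation is conducted
  analysisApproach: string;  // how the information will be analyzed
  reporting: string;         // how the results will be reported
}

// Hypothetical example of a filled-in plan.
const plan: EvaluationPlan = {
  subject: "Checkout flow of a web shop",
  method: "Think-aloud user test",
  dataCollected: ["task completion", "task time", "observer notes"],
  scheduledFor: new Date("2018-03-01"),
  analysisApproach: "Compare task effectiveness across participants",
  reporting: "Written report delivered to the development team",
};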


2.2 User Experience (UX)

User experience (UX) is a popular term in the technology industry. We often hear people refer to UX when talking about user interfaces, like websites or mobile applications. However, UX is much more than that; we encounter it everywhere. It is about how humans interact with things and the experiences they get from that interaction, from the way we interact with software to using a ketchup bottle. The success of a product is based on how people experience it; people often judge their experience by whether they feel it is easy to use, pleasant, or of any value to them.

Don Norman, co-founder of the Nielsen Norman Group, is known for inventing the term "user experience". In the 1990s he described it as:

"User experience encompasses all aspects of the end user's interaction with the company, its services, and its products."[10]

Norman explained in his own words, when asked why the term was even coined:

"I invented the term because I thought Human Interface and usability were too narrow: I wanted to cover all aspects of the person's experience with a system, including industrial design, graphics, the interface, the physical interaction, and the manual."[11]

UX describes the user's interaction with a product and how they feel about it, and its emphasis is on the human side of human-computer interaction (HCI). It is about the product and its context of use, about understanding how users operate and discovering how products fit into their lives. The ISO definition [12] (ISO 9241-210 (2010)) defines user experience (UX) as:

"a person's perceptions and responses that result from the use or anticipated use of a product, system or service."

User experience is an important aspect when it comes to the success or failure of a product. However, UX is commonly confused with usability, which is described in the ISO definition [12] (ISO 9241-210 (2010)) as:

"The extent to which a product, system or service can be used by specified users to achieve specified goals with effectiveness, efficiency, and satisfaction in a specified context of use."

There, effectiveness is defined in terms of accuracy and completeness: users achieve specific goals, with the focus on the completeness and accuracy of achieving those goals, regardless of the effort or time spent solving a task. Efficiency is the ratio of effectiveness to the resources used, where resources represent, e.g., time or mouse clicks for a task. Satisfaction is defined as "freedom from discomfort, and positive attitudes towards the use of the product", probed by questions like "How satisfied is the user?" and "How do you feel after using the system?". Furthermore, according to Peter Morville in the article "User Experience Design"[13], UX is about more than usability; there are several facets of user experience when it comes to delivering a successful product. He explained that we need to consider whether the product is useful, easy to use (usable), desirable, has content that is easy to find (findable), is accessible to people with disabilities, is credible (users trust and believe what they are told), and is valuable for the customer.
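Written out as formulas (our shorthand, not notation taken from the standard):

\[
\text{effectiveness} = \frac{\text{goals achieved completely and accurately}}{\text{goals attempted}},
\qquad
\text{efficiency} = \frac{\text{effectiveness}}{\text{resources used}}
\]

For example, a participant who correctly completes four of five tasks has an effectiveness of 80%; if that took 10 minutes, the efficiency is 80% / 10 min = 8% per minute.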


2.3 UX Evaluation Methods

Over the years, user experience researchers have developed many methods for testing and validating their ideas. These range from well-known lab-based user studies to other forms of testing like user testing, the think-aloud method, A/B testing, questionnaires, and heuristic evaluation.

User testing[14] is the process where the evaluators watch or track the participants while they use the product to see if the product is usable. One of the best ways to understand how the users experience the product is to conduct a user test. It is flexible for collecting various information about the user and relatively easy to combine with other testing techniques. The focus of user testing is to obtain feedback from users by introducing them to tasks that they solve, asking them questions and interacting with their feedback in real time. The main strength of user testing is being able to monitor the participants while they use the product, with the opportunity to ask questions about what they are thinking, what they are doing, and why.

The think-aloud method[15] is one form of user testing. It involves one observer and one participant. The participant is asked to say out loud whatever comes into their mind as they perform a predefined set of tasks. The observer takes notes of what the participant says and does, without attempting to interpret the participant's actions when solving the tasks, especially where the participant encounters any difficulty. The evaluation is often recorded so that it is possible to go back and refer to what exactly happened in the evaluation. Kuusela and Paul[16] claim that the method can be divided into two types of experimental procedure. The first is the concurrent method, where information is collected during the task performance. The second is retrospective, where the information is collected after the task performance.

A/B testing[17] is a process for when the developers are at a crossroads choosing between two competing versions of a product. A/B testing shows one of the two versions to participants at random; information is collected, analyzed and reviewed to determine which version is more effective. This method is valuable when there is a need to detect differences in a product, comparing everything from the colors of buttons to rewrites of large sites. However, the method only finds the most valuable solution among the given versions. A minimal sketch of the mechanics is given below.
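The following is a hedged TypeScript illustration (ours, not from the thesis) of the two core mechanics of A/B testing: stable random assignment of users to variants, and tallying outcomes per variant.

type Variant = "A" | "B";

// Deterministically assign a user to variant A or B by hashing the user id,
// so the same user always sees the same version.
function assignVariant(userId: string): Variant {
  let hash = 0;
  for (const ch of userId) hash = (hash * 31 + ch.charCodeAt(0)) >>> 0;
  return hash % 2 === 0 ? "A" : "B";
}

// Tally how often each variant was shown and how often it "converted",
// so the two versions can be compared afterwards.
const results = { A: { shown: 0, converted: 0 }, B: { shown: 0, converted: 0 } };

function record(userId: string, converted: boolean): void {
  const v = assignVariant(userId);
  results[v].shown += 1;
  if (converted) results[v].converted += 1;
}

record("user-1", true);
record("user-2", false);
console.log(results); // compare converted/shown between A and B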

Questionnaires[18] (surveys) are an efficient and cost-effective method that presents a set of questions aiming to extract a significant amount of information from the participant. The goal of the questions is to understand the user of the product and their experience. There are many ways to use a questionnaire: in an evaluation, on a website, or sent out with tools like SurveyMonkey or Google Forms. However, when using a questionnaire we need to have a specific goal in mind to get useful feedback. Additionally, specialized questionnaires have been made to measure the user experience of using software; the most frequently used is AttrakDiff.[19][20]
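As an illustration of how questionnaire answers are typically summarized (our own sketch, not a procedure from the thesis), the snippet below aggregates 1-5 Likert-scale responses into a mean score per question:

// One answered question from one participant, on a 1-5 Likert scale.
type Response = { questionId: number; score: 1 | 2 | 3 | 4 | 5 };

// Compute the mean score per question across all participants.
function meanScores(responses: Response[]): Map<number, number> {
  const sums = new Map<number, { total: number; count: number }>();
  for (const r of responses) {
    const s = sums.get(r.questionId) ?? { total: 0, count: 0 };
    s.total += r.score;
    s.count += 1;
    sums.set(r.questionId, s);
  }
  const means = new Map<number, number>();
  for (const [id, s] of sums) means.set(id, s.total / s.count);
  return means;
}

const scores = meanScores([
  { questionId: 1, score: 4 },
  { questionId: 1, score: 5 },
  { questionId: 2, score: 2 },
]);
console.log(scores.get(1)); // 4.5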

Heuristic evaluation[21] is an informal usability inspection method[22] that is used to identify problems in the design of user interfaces. Using the heuristic evaluation method is beneficial at the early stage of design. In a heuristic evaluation an expert inspects the user interface according to a predefined set of heuristics, so no users are involved in the evaluation. It only requires a number of experts who review the UI and compare it against usability heuristics; this comparison results in a list of potential usability problems. Heuristic evaluation can reduce the complexity and time of evaluation. One of the best-known sets of guidelines used for heuristic evaluation is the "10 Usability Heuristics for User Interface Design"[23] by the founder of heuristic evaluation, Jakob Nielsen. However, heuristic evaluation should not replace other usability methods, since the problems it identifies differ from those that can be found using other methods.


Exploratory testing[24] is about discovering, investigating and learning. It is an evaluation technique that emphasizes the personal freedom and responsibility of the evaluator to continually optimize the quality of the work by treating test-related learning, test design, test execution, and test result interpretation as mutually supportive activities that may run in parallel throughout the development of the project.[25] This means that test cases are not necessarily created beforehand, but rather on the fly. Exploratory testing is a hands-on approach that needs minimum planning and maximum test execution.[26] However, this thesis does not focus on this technique.

2.4 Usability and UX Evaluation Tools

Although there is a variety of evaluation methods that can be used, it is important to understand the different types that are available and when to use each of them. As the evaluation process and the measurement of results have been conducted manually over the years, it has been costly to perform an evaluation: it consumes a lot of valuable time and resources from the development of the software, since some of the evaluation methods involve questionnaires that are presented to users or guidelines for expert reviews. Furthermore, the data collected and processed in the evaluation needs to be presented in a reliable way, so that it can be understood and replicated. One way to reduce the time and resources is to develop an automated tool that helps with the evaluation process.

The Website Analysis and Measurement Inventory [27][28] (WAMMI) questionnaire was first developed in 1996, after an unsuccessful attempt to modify the SUMI [29] questionnaire. WAMMI is described as a tool that helps with measuring user satisfaction by asking visitors to compare their expectations with their experience on the website. It is based on 20 standardized statements and a unique database that is used to reference the users' results against existing values from over 320 surveys. WAMMI claims to be able to benchmark websites, portals or intranets against others to see whether the scores are good or bad, as they track performance over time to see if there are any improvements from the users. WAMMI is a simple tool to use: it only needs to be configured and defined on their secure server, which then provides the user with a URL that is placed in a prominent place on the website. After the survey has collected between 40 and 200 responses, it generates a report that is sent electronically to the owner of the website. One of the essential parts of the WAMMI report is the five key scales it focuses on: Attractiveness, Controllability, Efficiency, Helpfulness, and Learnability. Furthermore, WAMMI is a scientific analytics service that claims a high response rate of 10-20%.

The Programmsystem zur kommunikationsergonomischen Untersuchung rechnerunterstützter Verfahren (PROKUS) by Zülch and Stowasser [30] is a computer system developed for evaluation procedures and for carrying out usability evaluations in different evaluation situations. The authors describe it as an evaluation method with guidelines, applicable to market research, conformity tests, quality tests and comparative tests. The tool is based on a catalog of questions that are filled in by an expert during the evaluation procedure. One of the elements in PROKUS is the exercise database, which consists of different sets of scenarios containing several questions. These questions are described as being based on every thinkable checklist, standard, guideline, etc., which must be transformed into the terminology of evaluation, e.g., into questions with answer possibilities. With the assistance of the questions from the catalog, the expert can evaluate the computer system in an individually adjusted way, which is documented so that it can be reproduced. The purpose of PROKUS is to design evaluation procedures according to different evaluation environments,


e.g., level of detail or resources, and to carry out different procedures in usability evaluation, as the complexity of an evaluation can range from expert reviews to acceptance tests and usability tests.

The Diagnostic Recorder for Usability Measurement [31] (DRUM) has been in development since 1990 and is focused on helping evaluators to organize and analyze user-centered evaluations. DRUM considers many stages of a user-centered evaluation, from selecting users for the evaluation to analyzing the results. It is a system that offers video recording of an evaluation, which is a considerable advantage for usability evaluation. Recording video clips of users while they perform tasks in the evaluation can provide compelling evidence for the designers and developers of the system, and insightful information about specific problems. DRUM gives the evaluator the option to review the video clips of the evaluation and manually define starting and ending points for each task. It processes the information into logs and performs several measurements, including the completion of a task, efficiency (effectiveness divided by the resources used, cf. Section 2.2) and the productive periods of the evaluation.

The motivation behind all the tools mentioned above is to assist with evaluation in a structured way. However, their focus regarding the process of evaluation differs: e.g., DRUM focuses on video recordings of the evaluation that are then analyzed; PROKUS has more methods that can be used to evaluate, which helps with structuring the evaluation; while WAMMI is a questionnaire presented on the website in the form of a link that the users of the website are asked to answer to describe their experience. As many types of methods can be used for conducting a user-centered evaluation, the purpose of this study is somewhat similar to those tools, e.g., the motivation is to assist the evaluators. However, the main difference is that we aim to assist the evaluators in planning the evaluation by asking a series of questions that lead to what needs to be considered in order to conduct the evaluation, e.g., Roles, Activities, Materials, Environment and Equipment, and System.

2.5 Planning Support System (PSS)

Harris and Batty [32] defined the idea of planning support systems (PSS) by combining methods and models into a system that was used to support planning. In general, a PSS is a sub-class of decision support systems (DSS) and is regarded as a system dedicated to assisting a person with planning. In their opinion, a single PSS can be described with three sets of components: the purpose of the planning and what to achieve, the methods of the planning process, and the transformation of data into information. Furthermore, Brail and Klosterman [33] more recently described PSS as information technology used by planners to undertake their professional responsibilities, and suggested that PSS has evolved into a framework.


Chapter 3

Approach

The goal of this thesis is to assist evaluators with planning a UX evaluation by providing them with a framework that uses several factors and a questionnaire to plan their evaluation. To achieve this goal, we devised the approach presented in Figure 3.1. Our research approach is divided into three phases. In Phase I, we perform an analysis of four frameworks, looking for similarities and differences, and come up with a list of factors that can affect the outcome of an evaluation. In Phase II, we use these factors to create and implement the UXE PSS framework, and come up with a set of questions, based on the characteristics within each factor, that generate a plan for a UX evaluation. Finally, in Phase III, we evaluate the work.

Figure 3.1: Research approach.

In the following sections, we explain the list of evaluation frameworks that we chose and how they were analyzed to arrive at a set of 26 factors, which we then use to create and implement the framework. Then we show an example of the evaluation plan and briefly explain how we evaluate the framework.


3.1 List of Evaluation Frameworks

As mentioned in our research approach, the first step is to identify several frameworks that can be used for evaluation, e.g. Atlas [34], Rita [35] and ResQue [36]. In our work, we identified four frameworks that we found best suited for this study: RAMES, a framework supporting user-centered evaluation in research and practice [4]; Kwahk and Han's methodology for evaluating the usability of audiovisual consumer electronic products [5]; the ISO/IEC 25062 standard, better known as the Common Industry Format (CIF) [6]; and ISO/IEC 25066, better known as Systems and software Quality Requirements and Evaluation (SQuaRE) [7].

3.1.1 The RAMES Framework

The motivation behind the creation of RAMES[4] is similar to our goals. RAMES was developed to support evaluators in planning, comparing and documenting user-centered evaluations in a structured way. It consists of 22 elements divided into five categories: Roles, Activities, Materials, Environments, and System. The framework can be used to plan informal evaluations, where the evaluators for example evaluate wire-frame sketches of a system, or formal evaluations, checking whether all aspects that could affect the outcome have been considered. This framework is fundamental to our research and lays the foundation for our work. The author describes the framework as providing a set of concepts that stakeholders, users, professionals, and managers can refer to and use as their basis for interaction. This is both valuable and important for learning and using consistent concepts for user-centered evaluation.

3.1.2 The Evaluation Framework by Kwahk and Han

The framework developed by Kwahk and Han[5] is focused on audio-visual consumer electronics products. It is divided into three groups: design variables, context variables, and dependent variables. The design variables describe the hardware elements of the product, such as the control panel, connectors, etc. The context variables have four categories: the user, the product, the activities and the environment. Each of the categories has its own specific characteristics, which we consider since it can help with expanding the framework to plan for more than software. The third group is the dependent variables, which is split into three categories: usability dimensions, usability measures, and evaluation techniques. This allows the evaluators to describe the context of use for the product, e.g., the specifics of the test. However, in this study we focus more on software evaluation than on hardware.

3.1.3 The Common Industry Format

The Common Industry Format [6] (CIF) for usability test reports describes the format for reporting the results of an evaluation. It focuses on summative evaluation, which uses usability metrics to describe how usable a product is when used in a given setting. The focus of a CIF report is good practice in usability evaluation and having sufficient information for a usability specialist to judge the validity of the results. The information in a CIF report should make the evaluation easy to replicate if needed, producing substantially the same results. Satisfaction must also be measured; this is an essential aspect of any evaluation. The information should be easy to understand and sufficient to make it possible to replicate the evaluation.


3.1.4 SQuaRE

Systems and software product Quality Requirements and Evaluation, also known as SQuaRE [7], is a part of the Common Industry Format (CIF). It describes the context of use and specifies the content of both high-level and detailed descriptions of use for an existing, intended, implemented or deployed system. The description includes information about the users, the stakeholders, the characteristics of each user group, the goals, the tasks and the environment in which the system is used. The description of the context of use can apply to hardware systems, products or services. The framework finds it essential to gather and analyze information on the context, so that it is easy to understand and then describe the context of the future system. The description also provides many collections of data that are relevant for the analysis, specification, design, and evaluation of a system.

3.2 Framework Analysis

The focus of our analysis is to look for similarities and differences between the frameworks mentioned in Section 3.1, and to add new factors that need to be considered as well. To do this we used the Six Honest Serving Men[37] approach, where we form questions from the poem by Rudyard Kipling:

I keep six honest serving-men
(They taught me all I knew);
Their names are What and Why and When
And How and Where and Who.[38]

By answering questions on What, Why, When, How, Where, and Who, we identified 26 factors and 42 characteristics, divided into 5 categories, that need to be considered when planning a UX evaluation. We then used the result to form 52 questions that ensure we collect the needed information for each of the characteristics. See Figure 3.2; an illustrative data model of this hierarchy is sketched after the figure.

Figure 3.2: Diagram of framework to questionnaire.
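To make the mapping in figure 3.2 concrete, the sketch below shows one possible way to model the category-factor-characteristic-question structure in code. This is a minimal illustration under our own naming, not the actual UXinn implementation; TypeScript is used here and in the later sketches.

```typescript
// A minimal sketch of the framework-to-questionnaire mapping: each category
// holds factors, each factor holds characteristics, and each characteristic
// owns the questions that collect its information. Names and sample data are
// illustrative, not taken from the UXinn code base.

interface Characteristic {
  name: string;        // e.g., "#Users"
  questions: string[]; // the questions that collect this characteristic
}

interface Factor {
  name: string; // e.g., "System Users"
  characteristics: Characteristic[];
}

interface Category {
  name: "Roles" | "Activities" | "Materials" | "Environment and equipment" | "System";
  factors: Factor[];
}

// One factor from the Roles category, following table 3.9:
const systemUsers: Factor = {
  name: "System Users",
  characteristics: [
    { name: "#Users", questions: ["How many users will be included in the evaluation?"] },
    { name: "User types", questions: ["Who is the end user of the system?"] },
  ],
};
```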

If we look at table 3.1, we see that the frameworks define characteristics for four of the five factors in the Roles category, and there are some differences and similarities between the frameworks: e.g., all the frameworks agree that the number of users (#Users), the user types and the background should be defined for a UX evaluation, while not all agree that the selection of users should be defined.



Category: Roles

Factor              Characteristic   Frameworks including it
System Users        #Users           RAMES, Kwahk, CIF, SQuaRE
                    User types       RAMES, Kwahk, CIF, SQuaRE
                    Background       RAMES, Kwahk, CIF, SQuaRE
                    Intended Users   three of the four
                    User selection   two of the four
Evaluators          #Evaluators      three of the four
                    Background       three of the four
                    Requirement      none
Observers           #Observers       three of the four
                    Obligation       three of the four
                    Background       two of the four
Technical writers   Background       none
Recipients          Background       three of the four

Table 3.1: Comparing the frameworks for the Roles category, showing for each characteristic which (or how many) of the frameworks include it.

However, none of the compared frameworks addresses the role of a technical writer for the report. In our opinion, it should be included to simplify getting relevant information from the right person, so that anyone who needs to know whom to contact can find out. The best example would be a stakeholder who needs specific information about some part of the report: it would then be clear whom to contact.

Characteristic: Background (System Users factor)

RAMES:  What is the background of the participant.
Kwahk:  Age, gender, race/nationality, occupation/career, marital status, income, expenditures, ethnic background, cultural background, vision, hearing, speech clarity, tactile capability, body size, mobility, health, metabolic requirements, domain knowledge, product usage experience, and many more.
CIF:    Gender, age, education, occupation/role, professional experience, computer experience, product experience.
SQuaRE: Table of participants by characteristics.

Table 3.2: Showing differences between the frameworks for the background characteristic in the System Users factor.

Although all the frameworks agree that background information about the system user should be defined, they have their differences. In table 3.2 we see the difference between the frameworks when collecting background information for the system user. RAMES[4] and SQuaRE[7] are relatively similar in defining the background information, CIF[6] is very specific about what should be collected, while Kwahk[5] has an extensive and detailed list to select from. However, there are a few characteristics that all the frameworks define; those characteristics can be found in CIF[6]. To create the framework, we need to consider all similarities and differences between the four frameworks for each factor, as we have described.



3.3 List of Factors

From our analysis we found 26 factors that need to be considered when planning a UX evaluation. We then divided those factors into the categories devised by the RAMES framework[4] (see table 3.3): Roles, Activities, Materials, Environment and equipment, and System.

Roles: System Users, Evaluators, Observers, Technical writers, Recipients.
Activities: The purpose, Schedules, Methodology, Analysis, Decisions.
Materials: Evaluation material, Data collection, Result, Decisions, Resource cost.
Environment and equipment: Settings, Physical ambient condition, Social conditions, Data collection, Data analysis.
System: Description, Type, Stage, Part, Required equipment, Risk and issues.

Table 3.3: The list of 26 factors, grouped into the five categories included in the framework.

In the following sections, we briefly discuss the factors in the categories and the characteristics that are needed. We explain these categories and factors in more detail in chapter 4.

3.3.1 Roles

In the Roles category, we found five factors that are needed when planning a UX evaluation (see table 3.4): System Users, Evaluators, Observers, Technical writers, and Recipients.

Category: Roles

Factor              Characteristic   Description
System Users        #Users           Describes how many users are involved in the evaluation.
                    User types       Description of the type.
                    Background       Describes the background of the end users.
                    Intended Users   Description of the intended end users of the system.
                    User selection   Description of how the users were selected and how they fit with the intended end users.
Evaluators          #Evaluators      Describes how many evaluators are involved in the evaluation.
                    Background       Describes the background of the evaluator.
                    Requirement      Describes what knowledge is needed to conduct the evaluation.
Observers           #Observers       Describes how many observers are involved in the evaluation.
                    Obligation       Describes what the observer is doing in the evaluation.
                    Background       Describes the background of the observer.
Technical writers   Background       Describes the contact information.
Recipients          Background       Describes the contact information.

Table 3.4: Showing the factors defined for the Roles category with a description of each characteristic within each factor.



When conducting a UX evaluation, it is essential to decide whether any participants are included, as some methods need them while others, e.g., Heuristic Evaluation[39], do not. The Evaluators factor, however, is needed regardless of what method is used, since the role of the evaluator is part of every evaluation we conduct. The role of the observer, on the other hand, is optional. If we choose to include observers in the evaluation, it is good to describe their background and their obligations in the evaluation. The role of the technical writer is to write the report; this can be the evaluator or another person. There are two main reasons we find it essential to include this factor: the skill of technical writing, and knowing whom to contact if further information is needed. E.g., if the recipients want to know more about the results in the report, it is clear whom they need to contact.

3.3.2 Activities

Many activities need to be described before conducting an evaluation.[4] There must be a clear purpose for the evaluation, and we must decide when to schedule meetings with the participants, estimate the time of the evaluation, choose what method to use, decide how to analyze the data, and state what the results will be used for and who has the authority to make decisions from the results. For that we defined five factors: The purpose, Schedules, Methodology, Analysis, and Decisions (see table 3.5).

Category: Activities

Factor        Characteristic   Description
The purpose   The objective    Describes the purpose of the evaluation.
              The approach     Describes how we are going to do the evaluation.
Schedules     Time and date    Describes the time and date of the evaluation.
              Duration         Describes the estimated time of the evaluation.
Methodology   Method           Describes the method used for the evaluation.
Analysis      Approach         Describes the process of analyzing the data from the evaluation.
Decisions     General          Describes what the results will be used for and who makes the decision from the results.

Table 3.5: Showing the factors defined for the Activities category with a description of each characteristic within each factor.

There might be a need to analyze the impact of changes that have been made in the design, and a formal user test with five participants is needed.[40] As five participants are needed, meetings must be scheduled. Furthermore, the duration needs to be estimated so that the participants know how long the meeting will take and the evaluators know whether they have back-to-back meetings. When planning the evaluation, we should describe how the data will be analyzed, what the procedure is for deciding how to react to the results, and what the results will be used for. The analysis might be a group activity or done by the evaluator alone.

3.3.3 Materials

Many materials need to be defined when planning an evaluation: we need to decide what material is used for the users (participants) and the evaluators, what data is collected and how the data is presented, who has the authority to make the decisions and how to respond to the results.[4] Then there are the resources needed for the evaluation. There might be



a need for staff training, preparation of the evaluation, or compensation for the participants, and we need to know the hourly rate of the staff conducting the evaluation. For this, we have defined the factors: Evaluation material, Data collection, Result, Decisions and Resource cost (see table 3.6).

Category: Materials

Factor               Characteristic   Description
Evaluation material  For users        Describes the material used for users.
                     For evaluators   Describes the material used for evaluators.
Data collection      Collection       Describes what data is collected.
                     Metrics          Describes what metrics are used for measurement.
Result               Description      Describes how the result will be presented.
Decisions            General          Describes how the decision will be described and who is responsible for making decisions.
Resource cost        Knowledge        Describes if any training or external help is needed.
                     Preparation      Describes if any preparation is needed for the evaluation.
                     Compensation     Describes if participants get compensation for participating.
                     Rates            Describes the hourly rates of the staff.

Table 3.6: Showing the factors defined for the Materials category with a description of each characteristic within each factor.

The material for the participants can be a number of tasks together with the relevant information they need to solve them; we might then have a survey to get feedback. The material for evaluators might be a set of questions to ask the participants while they solve each task, a script to follow in the evaluation, or paper to collect information. In the data collection, we describe what we are collecting in the evaluation and what metrics we use for measuring the data. The Result factor then describes how we will present the results of the evaluation. It could be a detailed report for a client or a test report that is used by the development team.

3.3.4 Environment and Equipment

When considering the environment and equipment for an evaluation, we need to define the location and describe the environment,[4] as the evaluation could be in the field or in controlled settings, and the description states where the user and the evaluator will meet. We need to describe the physical ambient conditions,[5] e.g., a room with no ventilation could make the participant frustrated and eager to finish the evaluation quickly, and note any social conditions that might affect the user.



Category: Environment and equipment

Factor                      Characteristic   Description
Settings                    Location         Describes where the evaluation is conducted.
                            Description      Describes the environment of the evaluation.
Physical ambient condition  Information      Describes the ambient conditions of the environment.
Social conditions           Information      Describes the social conditions that could affect the performance of the participant.
Data collection             Approach         Describes the equipment used for collecting data.
Data analysis               Approach         Describes how the data will be analyzed.

Table 3.7: Showing the factors defined for the Environment and equipment category with a description of each characteristic within each factor.

Then we need to consider what equipment will be used to collect and analyze the data from the evaluation; e.g., the data could be collected with voice recordings and with notes taken with pen and paper.[4] For this, we defined the factors: Settings, Physical ambient condition, Social conditions, Data collection and Data analysis (see table 3.7).

3.3.5 System

The system is the focus of the evaluation; hence we need to describe general information about the system, which type it is, in what stage it is, what part is evaluated, whether any equipment is required to conduct the evaluation, and whether there are any risks and issues that need to be considered. For this we defined six factors: Description, Type, Stage, Part, Required equipment, and Risk and issues (see table 3.8).

Category: System

Factor              Characteristic   Attribute example
Description         General          Describes general things about the system, e.g., software that helps students to plan an evaluation, name, version, interaction, or where the system is stored.
Type                General          Describes the type of system, e.g., web application.
Stage               Form of design   Describes the stage of the software, e.g., running prototype.
Part                Description      Describes which part is being evaluated, e.g., creating a new project and planning an evaluation with the UXinn software.
Required equipment  List             Describes the equipment used in the evaluation, e.g., the system will be running on the evaluator's computer.
Risk and issues     Description      Describes any risks and issues that might occur, e.g., low battery on the computer, the system will crash, the participant might not show up.

Table 3.8: Showing the factors defined for the System category with a description of each characteristic within each factor.

In the description, we might describe software that is intended to help students plan an evaluation. This system might be a mobile application or web application in a prototype stage. The part that we want to evaluate is the participant making a plan for software that he is developing, and a computer will be needed to conduct the evaluation. When conducting an evaluation some risks or issues might occur, e.g., the



battery life of the computer could be low, some of the participants might not show up, the system is hosted externally, or the connection could break.

3.4 Framework Creation and Implementation

In this section, we discuss the creation of the framework; we describe the implementation in chapter 5. To use the factors described in section 3.3, we came up with characteristics that enable the framework to collect the relevant information from the user for each characteristic. However, the characteristics vary between the factors, as different information must be collected for each of them (see tables 3.4, 3.5, 3.6, 3.7 and 3.8).

Since many important factors need to be considered when planning an evaluation, we generated a set of questions that collect the relevant information, which then enables the framework to generate an evaluation plan for the system user (see figure 3.3).

Looking at the System Users factor (see table 3.4), we see that there are many characteristics to be considered, as the number of participants varies between evaluations; it might be 2 or 4 that participate. However, many researchers find it enough to have 5 participants in an evaluation, as this is claimed to be enough for finding 80% of the usability problems in the system.[40]

Figure 3.3: Showing the process of making an evaluation plan with the planner.

When the evaluator is making a plan for the evaluation, he must know the user type (end user) of the system, the background of the participants, what information he wants to collect, whether gender, age or occupation are relevant, and whether the participant fits into a specific user group that is relevant to the system. By defining these things and formalizing who the intended users of



the system are and how they were selected, we have enough information to validate that the selected participants match the intended users, and we can be confident that the results of the report are relevant to the needs of the end user. To extract the information needed for each of the characteristics of the System Users factor, we formalized sets of questions (see table 3.9).

Characteristic   Question
#Users           How many users will be included in the evaluation?
User types       Who is the end user of the system?
Background       What is the background of the participant?
                 What background information will be collected?
                 Is the participant representative of a user group?
                 Describe the user group.
Intended users   Who are the intended users of the system?
User selection   How are the users selected?

Table 3.9: Questions for each characteristic of the System Users factor.

As evaluators plan the evaluation and meet with the participants, they need to know when the evaluation will be conducted and how long it might take. There might be a need to perform the evaluation over 2 or 3 days, since it can be hard to plan meetings with five participants on the same day. The relevant information that needs to be considered is the estimated time that each evaluation takes, when it will be conducted and the time frame of the evaluation. For that, we defined the set of questions seen in table 3.10.

Characteristic   Question
Time and date    What are the dates of the evaluation?
                 At what time?
Duration         What is the estimated time of each evaluation?

Table 3.10: Questions for each characteristic of the Schedules factor.

Knowing which method will be used when planning an evaluation is essential, as the evaluation can be conducted using a described evaluation method, e.g., the Think aloud method [15], Heuristic evaluation [39], or a procedure that is focused on a collection of tasks. The question in table 3.11 has a selection list that depends on whether any participants are involved in the evaluation.

Characteristic   Question
Method           What evaluation method will be used?

Table 3.11: Questions for each characteristic of the Methodology factor.

Estimating the cost of the evaluation is often hard to do [41]. The framework defines a set of questions (see table 3.12) that need to be considered when it comes to cost assessment. The questions focus on whether any training is needed to conduct the evaluation, ask the planner to estimate the preparation time of the evaluation, and cover the compensation for the participants and the hourly rate of each person involved in the evaluation process.



Characteristic   Question
Knowledge        Are there any skills that need to be trained?
                 What is the estimated time for training the skills?
Preparation      How much time is estimated to prepare the evaluation?
Compensation     What is the value of the compensation for the participant?
Rates            What is the hourly rate of the evaluator?
                 What is the hourly rate of the observer?
                 What is the estimated cost for the technical writer?

Table 3.12: Questions for each characteristic of the Resource cost factor.

By now we have seen that many things need to be considered when conducting an evaluation. The framework defines 26 factors with 42 characteristics that the planner needs to consider when planning an evaluation, and for those we have defined 52 questions (see Appendix A). To utilize the questions efficiently, we define rules based on the answers from the planner, which can then be used to decide whether a set of questions is relevant for the evaluation plan.

Figure 3.4: Showing how stating the number of users in evaluation can affect other questions.

In figure 3.4 we show an example where the number of participants (users) affects what needs to be presented to the planner of the evaluation. Consider two scenarios: if no users are involved in the evaluation, the framework can present only methods that involve no users and skip the question about compensation cost. If users are involved, the framework presents methods that include users and asks for the compensation cost. Therefore, we can classify the questions in the framework to guide the planner efficiently and present only what is needed, as sketched below.
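A minimal sketch of such an answer-dependent rule, assuming each question carries an optional guard over the earlier answers; the shapes and names (Question, appliesWhen, relevantQuestions) are hypothetical and not taken from the UXinn code base.

```typescript
// A minimal sketch of answer-dependent question filtering: a question with a
// guard is shown only when the guard holds for the answers given so far.

interface Question {
  id: string;
  text: string;
  // Guard: return true if the question applies given the answers so far.
  appliesWhen?: (answers: Map<string, string>) => boolean;
}

const questions: Question[] = [
  { id: "numUsers", text: "How many users will be included in the evaluation?" },
  {
    id: "compensation",
    text: "What is the value of the compensation for the participant?",
    // Only relevant when at least one user participates.
    appliesWhen: (a) => Number(a.get("numUsers") ?? 0) > 0,
  },
];

function relevantQuestions(answers: Map<string, string>): Question[] {
  return questions.filter((q) => q.appliesWhen?.(answers) ?? true);
}

// With no participants, the compensation question is skipped:
console.log(relevantQuestions(new Map([["numUsers", "0"]])).map((q) => q.id));
// -> ["numUsers"]
```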

3.5 Evaluation Plan

The goal of the evaluation plan is to support evaluators when planning, comparing and documenting a UX evaluation in a structured way. To do this, we investigated several frameworks, selected those found to be relevant to this study (see section 3.1), analyzed and compared them to find which factors need to be considered, and defined characteristics for those factors so that we can ask the planner a set of questions that generates the plan for the evaluation (see table 3.13).



Roles
R1. Users: #Users, User types, Background, Intended users, User selection.
R2. Evaluators: #Evaluators, Background, Requirement.
R3. Observers: #Observers, Obligation, Background.
R4. Technical writers: Background.
R5. Recipients: Background.

Activities
A1. The purpose: The objective, The approach.
A2. Schedules: Time and date, Duration.
A3. Methodology: Method.
A4. Analysis: Approach.
A5. Decision: General.

Materials
M1. Evaluation material: For users, For evaluators.
M2. Data collection: Collection, Metrics.
M3. Result: Description.
M4. Decisions: General.
M5. Resource cost: Knowledge, Preparation, Compensation, Rates.

Environment and equipment
E1. Settings: Location, Description.
E2. Physical ambient condition: Information.
E3. Social conditions: Information.
E4. Data collection: Approach.
E5. Data analysis: Approach.

System
S1. Description: General.
S2. Type: General.
S3. Stage: Form of design.
S4. Part: Description.
S5. Required equipment: List.
S6. Risk and issues: Description.

Table 3.13: Proposal of the evaluation plan in the UXE PSS framework.

3.6 Framework Evaluation

When planning the framework evaluation, we used the UXE PSS framework to prepare the evaluation. We then performed two evaluations with students studying Computer Science at Reykjavík University. The details of the evaluation process will be presented in chapter 6.



Chapter 4

The UXE PSS Framework

For our proposed framework, the UXE PSS (User Experience Evaluation Planning Support System), we compared four frameworks to identify similarities and differences, coming up with a list of factors that need to be considered and dividing them into five categories. To identify the categories, we looked at the compared frameworks described in chapter 3.

4.1 Roles

There are many roles that we need to consider when conducting a UX evaluation. An evaluation can either involve users, also known as participants, who are asked to perform specific tasks, or be an expert evaluation without users, depending on the method. Then there are other roles: the person conducting the evaluation, possibly a dedicated person observing the evaluation, the technical writer of the report, and the recipients. By comparing the frameworks, we came up with five factors for Roles: System Users, Evaluators, Observers, Technical writers and Recipients. For each factor, we defined characteristics that we describe in the following sections.

4.1.1 System Users

All four frameworks have the users factor. System users are the participants that take part in the evaluation, and they play a vital role in it. They can be the actual users of the system, the expected users, or surrogates that play the role of users of the system. Therefore it is important to include system users in the framework.

Characteristic   Example
#Users           5.
User types       Students.
Background       In their Computer Science study. Expertise level, age, gender. Yes, learning how to conduct a user evaluation.
Intended Users   Students learning how to conduct a user centered evaluation.
User selection   We select users that are learning how to conduct UX evaluation from Reykjavík University.

Table 4.1: An example for the characteristics of the System Users factor in the Roles category.



Knowing how many users are involved in the evaluation is essential. That gives an indication of which methods can be used for the evaluation; e.g., if we have no participants in the evaluation, we might use Heuristic evaluation[39] rather than the Think aloud method[15], which requires participants. The type of users describes the occupation in a way and helps us understand what background is needed. The background gives excellent insight into whether the user fits the system that is being evaluated and gives an opportunity to interpret the results. If we encounter any problems, e.g., if the result of a task differs between users, we could perhaps see if the background gives some indication of why. By describing the intended users, we get a better understanding of the system and an overview of how to select the users. By describing the selection of the users, we see if we are meeting the fundamental characteristics and if they fit the evaluation (see table 4.1). All of the characteristics contain essential information for the evaluation plan and the report. Hence we consider the System Users factor and all its characteristics to be important and to belong in the framework.

4.1.2 Evaluators

It is vital for any evaluation to have an evaluator; hence we include the evaluator in the Roles category. The evaluator has the role of analyzing the data. Therefore, he has to have expertise in the area so that the description of the result is correct. The evaluator does not only plan and conduct the evaluation; he also has other activities, which depend on the chosen evaluation methods.

Characteristic   Example
#Evaluators      1.
Background       John Doe, [email protected], experienced evaluator, has conducted several evaluations.
Requirement      How to perform think aloud user testing.

Table 4.2: An example for the characteristics of the Evaluators factor in the Roles category.

One of the critical roles when conducting an evaluation is the evaluator. How many evaluators are needed depends on the evaluation. For each evaluator, background information is needed, since it is essential to know whether he has the required expertise for the evaluation. For the same reason, we found it important to list the requirements for the role; by doing that we can ensure that the background fits the requirements of the evaluation. If the background does not meet the requirements, we know that there is a need to train for that skill or to find a more suitable person for the role. An example of each characteristic of the evaluator role can be seen in table 4.2.

4.1.3 Observers

It is common to have observers when conducting an evaluation. The role of that person is not necessarily to interact with the participants, but to observe what happens. It is known that in some cases an IT professional and representatives from the client observe an empirical evaluation while it is conducted.[4] Sometimes the observers write down notes during the evaluation and take part in discussing its outcome. By including an observer in the evaluation, we can produce valuable information for deciding how to react to the result of the evaluation. Hence the role of the observer is included in the framework.



Characteristic   Example
#Observers       1.
Obligation       Taking notes on user actions.
Background       Student.

Table 4.3: An example for the characteristics of the Observers factor in the Roles category.

Knowing the number of observers and describing the obligations of the role is important. It helps to define the role and what is expected of the person. Although not all of the frameworks agree that the background of the observer is necessary, one can argue that to be able to obtain similar results, all information is needed; e.g., if the observer is a psychologist, he might notice a behavioral detail in the participant that a developer would not. For this reason, we include all the characteristics described for the observer. An example can be seen in table 4.3.

4.1.4 Technical Writers

Writing a technical report is part of the evaluation process. This report is a document written to present results from an evaluation. As the report is meant to clarify the idea, demonstrate results and promote a particular point of view, writing a good technical report of any kind is a vital skill to have, particularly in engineering, programming, architecture and design. However, not everyone has the writing skill, and it is hard to develop. Hence the possible need to get a person who has the skill to write the evaluation report is something that needs to be considered. Therefore, we include it in the framework.

Characteristic   Example
Background       Same as evaluator.

Table 4.4: An example for the characteristics of the Technical writers factor in the Roles category.

We find the option of defining a technical writer important, as this skill is hard to develop and some might need help from a person who works as a technical writer for a living. As seen in table 4.4, the information required is the contact information and his or her expertise. In our opinion, the report should state who is responsible for the writing, and that information should be accessible. However, none of the frameworks that we compared addressed the need for a technical writer. One possible reason could be that they expect the evaluator to write the report.

4.1.5 Recipients

The result of the evaluation needs to be presented to someone; hence it is crucial to define the recipients of the report[42]. They can be a client, a company, a team within a company or the evaluator himself. The recipients are the ones who decide what action to take after reviewing the report.

Characteristic   Example
Background       Same as evaluator.

Table 4.5: An example for the characteristics of the Recipients factor in the Roles category.



The background information of the recipients of the results is necessary, as it provides the information about the company or the person to contact for any relevant reason that may present itself during or after the evaluation (see table 4.5). Hence, we find that the recipients should be stated in the framework for two reasons: the contact information is essential, and it states who is responsible for making decisions.

4.2 Activities

Evaluation is a process used to research something. When conducting an evaluation, it is vital to plan the activities in the evaluation so that it can be replicated. That includes stating the purpose of the evaluation, when it is planned to be conducted, the estimated time of each evaluation, the procedure of the evaluation and more. We describe the factors and their characteristics in the following sections. They are: The purpose, Schedules, Methodology, Analysis and Decision.

4.2.1 The purpose

The first thing we ask when planning an evaluation is: why are we doing this evaluation? There are two primary evaluation purposes: formative evaluation and summative evaluation. The formative purpose is to evaluate for improvements and learning and to see if any changes are needed. The summative purpose is accountability and decisions about whether the software should be expanded or discontinued. The purpose factor in the framework describes why the evaluation is conducted.

Characteristic   Example
The objective    Analyze the impact and changes that have occurred after changing the user interface.
The approach     To construct a formal user test with ten users of the software.

Table 4.6: An example for the characteristics of The purpose factor in the Activities category.

By defining the objective, we state what we are doing and why, e.g., analyzing the impact or changes that have occurred after changing the user interface. By defining the approach, we describe how we are going to achieve the objective of the evaluation, e.g., constructing a formal user test with ten users of the software; this can be seen in table 4.6.

4.2.2 Schedules

There are many vital things in an evaluation, and scheduling it is one of them. The performance of the evaluation can depend on various things, and knowing the time and date, the duration and the location is vital for carrying out the evaluation.

Characteristic   Example
Time and date    27-29 November, between 10 am and 3 pm.
Duration         60 minutes.

Table 4.7: An example for the characteristics of the Schedules factor in the Activities category.



Another thing about the schedule is the duration of the evaluation. This has a few benefits for both the participant and the evaluator: e.g., both know beforehand what to expect when arriving for the evaluation, and it allows us to estimate the cost of the evaluation. E.g., with a duration of 60 minutes, 5 participants and a staff hourly rate of 10.000 ISK, we can define estimated cost = duration × participants × hourly rate = 1 hour × 5 × 10.000 ISK = 50.000 ISK. An example of the characteristics is shown in table 4.7.
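A minimal sketch of that calculation, assuming a single hourly rate covers the staff involved; the function name is ours, for illustration only.

```typescript
// Estimated cost = duration (hours) x number of participants x hourly rate.
// A minimal sketch, assuming one staff member per session; amounts in ISK.
function estimatedCost(durationHours: number, participants: number, hourlyRate: number): number {
  return durationHours * participants * hourlyRate;
}

// 60-minute sessions, 5 participants, 10.000 ISK per hour -> 50.000 ISK.
console.log(estimatedCost(1, 5, 10_000)); // 50000
```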

4.2.3 Methodology

The evaluation methodology is a process that helps the evaluators better understand the steps needed to do a quality evaluation. The process of the method helps the evaluator learn what he needs to know to determine the level of performance, product, or skill that is needed.

Characteristic   Example
Method           Formal user tests are conducted by having the users perform 9 tasks, timing the tasks while they solve them, and asking them if they have any comments on the tasks.

Table 4.8: An example for the characteristics of the Methodology factor in the Activities category.

The process that we follow for collecting information in an evaluation depends on the sets of tasks and questions that the participants have been assigned. By defining the method (see table 4.8), it is easier to replicate the evaluation if needed, and it gives a better overview of what has to be done for the evaluation.

4.2.4 Analysis

The analysis is one of the critical factors of any evaluation. It creates value and meaning from the evaluation data that we collect. Reporting the subsequent results of the evaluation is an essential step in documenting the findings, and it helps to describe the process of the analysis.

Characteristic   Example
Approach         The data from the interview was gathered by recording, by taking notes and by asking the participants questions. The data is analyzed by the evaluator and compared to the usability goals.

Table 4.9: An example for the characteristics of the Analysis factor in the Activities category.

There are many ways data analysis can be done. It can be performed by a group of people or a single evaluator, as seen in table 4.9. It is clear that when conducting an evaluation and collecting information, some analysis is needed, so it should be included in the framework.

4.2.5 Decision

There are sets of decisions that must be made in each evaluation: when planning, when conducting and when presenting results. The recipients of the results or another stakeholder often decide what



action to take. The decision can be to quit working on the software or to redesign it according to the outcome of the evaluation.

Characteristic   Example
General          The evaluator takes care of the procedure for deciding how to react to the result.

Table 4.10: An example for the characteristics of the Decision factor in the Activities category.

When getting results from an evaluation and deciding what the next step should be, it is essential to know who has the authority to make that decision. In table 4.10 we show an example where the evaluator takes care of the procedure for deciding how to react to the result from the evaluation and whom to contact for more information on the evaluation if needed.

4.3 Materials

There are many kinds of material that we might use in an evaluation. For the participants we might prepare information about some particular thing that would allow them to solve a task; the tasks are part of the material as well. For the evaluator, we can have predefined information that we want to collect and questions to ask the participants. This is just an example of what material can be defined for both participants and evaluators. More materials need to be defined for an evaluation, for instance, how we are collecting data, how the result will be described and who is responsible for the decisions. The Materials category is vital within the framework. We describe the factors and their characteristics in the following sections. They are: Evaluation material, Data collection, Result, Decisions and Resource cost.

4.3.1 Evaluation Material

It is common to use user tasks in an evaluation, which the participants are asked to perform. For that, we need to prepare information that allows them to solve the tasks. The information we collect has to be defined: how many tasks are to be solved, and how we are going to collect background information about the participants.

Characteristic   Example
For users        9 tasks defined by the evaluator, information on the subject where the user can find relevant information to solve the tasks; background is gathered with questions defined by the evaluator.
For evaluators   A set of questions to ask the participant, an introduction text for the evaluation, paper to collect information on tasks.

Table 4.11: An example for the characteristics of the Evaluation material factor in the Materials category.

Many materials can be presented to the participants. The example in table 4.11 shows just a few things that could be defined as evaluation material. Another could be a consent form for the participant. Furthermore, two types of evaluation could be used regarding the



information gathered for users and evaluators. If we do not include the users, we would need different material for evaluators, e.g., heuristics, guidelines, and standards.

4.3.2 Data Collection

One of the critical things that we do in an evaluation is collecting data. This data can be both quantitative and qualitative, on the usability and the experience of the user. This can all be confusing, as there are many types of dimensions when collecting data, e.g., objective and subjective dimensions.

When collecting data, we need to think about what metrics are used and why we are collecting them. This can help us see if we achieve the goal of the evaluation or if we need to redesign something in the software according to the analysis of the collected data.

Characteristic   Example
Collection       User tasks, with questions.
Metrics          Time completing each task.

Table 4.12: An example for the characteristics of the Data collection factor in the Materials category.

There are two characteristics that we want to include in the framework: what data we are going to collect and what metrics we will use for that data (see table 4.12).

4.3.3 Result

One of the materials produced in an evaluation is the report. The result in the report depends on a variety of factors, such as the purpose of the evaluation, what method is used and what data is collected. The result described in a report can be the average time on each task, how many tasks were completed, problems encountered when performing tasks, or answers to questions from debriefings before and after the evaluation.

Characteristic   Example
Description      The results are presented in a testing report.

Table 4.13: An example for the characteristics of the Result factor in the Materials category.

It is clear that after conducting an evaluation, we will have some results. These results have to be described somehow. We find it fitting to have a description of that in the framework (see table 4.13).

4.3.4 Decisions

When conducting an evaluation and analyzing the gathered data, we produce results on which it is important to make some decision. Sometimes stakeholders make this decision, sometimes the team or the evaluator.

Characteristic   Example
General          The evaluator is responsible for responding to the results from the evaluation.

Table 4.14: An example for the characteristics of the Decisions factor in the Materials category.



As seen in the example in table 4.14, we have decided that the evaluator makes the decision. We find it suitable to have this stated in the framework, so that it is easy to find the needed information about who is responsible if stakeholders or anyone else look for information later in the development process of the software.

4.3.5 Resource Cost

It is vital that we know how much the evaluation might end up costing. Therefore, we find it essential to have the Resource cost factor within the framework. When conducting an evaluation, we need to estimate the resources that are available for the evaluation and look into what is required to be able to do it.

Characteristic   Example
Knowledge        The evaluator needs to learn the think aloud method.
Preparation      Analyze and collect data from other evaluations of the software, estimated 10 hours.
Compensation     Participants will get store credit in Kringlan for the amount of 10.000 ISK.
Rates            Evaluator's rate: 15.000 ISK; observer's rate: 10.000 ISK.

Table 4.15: An example for the characteristics of the Resource cost factor in the Materials category.

Many resources are needed for an evaluation. E.g., we might have to look at already existing data, or get a team or an external evaluator that we need to pay. The first characteristic, knowledge (see table 4.15), defines whether any skills need to be trained for the staff or whether any external help is needed. For the preparation, we need to estimate how much time it might take to prepare the evaluation; this includes making all the plans, gathering any data needed and so forth. The compensation describes what, if anything, the participants receive for participating. The last characteristic is the rates: the hourly rates for the observers, evaluators or any other person involved in the evaluation process. With those rates, we can calculate the estimated cost of conducting the evaluation. We found it essential to have Resource cost in the framework to give an estimation of the cost of an evaluation.

4.4 Environment and Equipment

The environment in which the evaluation is conducted is an important aspect. It can be in controlled settings, the home of the participants and other places. These places can affect the outcome of the evaluation, and so can the equipment used.[4] Therefore we found that both environment and equipment should be in the framework. We defined five factors that we describe in the following sections: Settings, Physical ambient condition, Social conditions, Data collection and Data analysis.

4.4.1 Settings

The evaluation can be conducted in various settings. It can take place in the field, at the workplace of the participant, also known as "in real settings", or it can be held in a usability



laboratory, also known as "in controlled settings". We include Settings in the framework to state the location of the evaluation and give a description of the place where it will be held.

Characteristic   Example
Location         Reykjavík University, Menntavegur 1.
Description      In controlled settings.

Table 4.16: An example for the characteristics of the Settings factor in the Environment and equipment category.

In the table above we have an example of what the location and description of the Settings factor could be. We find this important to include in the framework, as it can help in many ways, e.g., if we need to replicate the evaluation or instruct the participant on where the evaluation will take place.

4.4.2 Physical Ambient Condition

In the paper on their evaluation framework, Kwahk and Han [5] described that the physical ambient condition factor can affect the performance of the participant in the evaluation, and that performance is known to deteriorate in abnormal environmental conditions, e.g., when it is dark, cold or noisy. We agree with them on the issue and include the factor in the framework.

Characteristic   Example
Information      The room has no ventilation, has some noise, and the temperature is 29 °C.

Table 4.17: An example for the characteristics of the Physical ambient condition factor in the Environment and equipment category.

In table 4.17 we describe a room that has no ventilation, has some noise in the background, and where the temperature is 29 °C. This is an excellent example of how the performance of the participant could be affected: he is hot, with no fresh air, and there is noise to distract him. This could lead to him wanting to finish the evaluation and get out of the environment. We find it useful to describe the ambient conditions because they could affect the results of the evaluation.

4.4.3 Social Conditions

In the paper on their evaluation framework, Kwahk and Han [5] described that the social conditions factor can affect the performance of the participants, like the ambient conditions. We agree that social conditions can affect the performance of the participant and include the factor in the framework.

Characteristic   Example
Information      There are interruptions at the workplace; the participant might be influenced by the workplace.

Table 4.18: An example for the characteristics of the Social conditions factor in the Environment and equipment category.



In table 4.18 we describe a scenario where the evaluation is conducted in real settings at the workplace of the participant. There is some interruption from the co-workers and the stakeholder. This could affect the performance of the participant and is vital information to have in the analysis of the data collected from that evaluation. Therefore we find it useful to describe the social conditions in the framework.

4.4.4 Data Collection

Various equipment can be used to collect data in an evaluation, e.g., video recording, voice recording, and written notes. We find it essential to describe the data collection factor, and therefore we include it in the framework.

Characteristic   Example
Approach         Voice recording and timer on an iPhone 7, pen and paper.

Table 4.19: An example for the characteristics of the Data collection factor in the Environment and equipment category.

In table 4.19 above we describe that the equipment for collecting data in the evaluation is a voice recorder, pen and paper for taking notes, and a mobile device to time each task. We find this important to know for each evaluation, since the data we collect during the evaluation is what we analyze for the results.

4.4.5 Data Analysis

There are various ways to analyze the collected data. For that reason, we find it essential to include the Data analysis factor. We could be using some software tool that supports what we are doing, have our own spreadsheet, or even use pen and paper.

Characteristic   Example
Approach         The data will be analyzed by taking notes from recordings.

Table 4.20: An example for the characteristics of the Data analysis factor in the Environment and equipment category.

In the example in table 4.20 we describe that the data will be analyzed by listening to recordings from the evaluation and taking notes afterward. As this is an essential factor in an evaluation, we include it in the framework.

4.5 System

System evaluation is an ongoing process throughout the development cycle, and it offers higher value when conducted early, in the design stage. The evaluation of the system intends to collect information about the status of the project as a whole and any other details that would be helpful in guiding the development team toward their goal. It can help to see if the need for the project is met, by giving a general description of the system and its intended context. As there are many types of systems in various stages of design, it is crucial to have a System category that describes the system as a whole. To get the needed information



for the category, we defined six factors and describe them in the following sections. They are: Description, Type, Stage, Part, Required equipment, and Risk and issues.

4.5.1 Description

There are many types of characteristics when it comes to systems. Therefore we need to give general information about the system and its intended context; see the example in table 4.21. As we develop a system, there can be many versions, and if we ever need to backtrack a decision from an evaluation, it is good to have stated the version of the system in each evaluation.

Characteristic   Example
General          Software that helps students to plan an evaluation, UXinn, 1.0.0, web application.

Table 4.21: An example for the characteristics of the Description factor in the System category.

However, we have only mentioned a few of the possible characteristics that fit the general description, and it is clear that describing the system is vital both for the framework and for the evaluation process, for many reasons including the ones already mentioned.

4.5.2 Type

Many things can influence what evaluation methods can be used and what kind of metrics can be used when collecting data. The type of the system describes in which field the system will be used and its primary functionality. Hence we use the Type factor.

Characteristic   Example
General          Web application.

Table 4.22: An example for the characteristics of the Type factor in the System category.

In the example in table 4.22 we have Web application defined as the type. However, other things could have been described there, such as games, business systems, operating systems or database software.

4.5.3 Stage

The software can be in many stages when evaluated. It could be a low fidelity prototype showing the first design of the system, a detailed design of a prototype, or even a finished product. The stage of the software can set constraints on how we can conduct the evaluation and should be described.

Characteristic   Example
Form of design   Running prototype.

Table 4.23: An example for the characteristics of the Stage factor in the System category.

The form of design (see table 4.23) describes the status of the software, as there are many stages of design. Therefore we find it important that the stage be stated in the framework.



4.5.4 Part

When conducting an evaluation, we focus on exploring some part of the hardware or system, and we need to describe what that is. We might be testing how the participant experiences the use of a remote control for switching channels on a TV, or we might ask him to use an application that has been developed for mobile devices.

Characteristic   Example
Description      Creating a new project and planning an evaluation with the UXinn software.

Table 4.24: An example for the characteristics of the Part factor in the System category.

Sometimes, however, we might focus on a small part of the system that is ready for evaluation, as we describe in table 4.24. It needs to be clear what part of the system the evaluation focuses on.

4.5.5 Required Equipment

Various equipment can be required to conduct an evaluation; it depends on what is being evaluated. If we are evaluating a music application, we might need headphones and a computer. If it is a mobile application, we would use a mobile device instead of the computer. Another example depends on the stage of the software: in a detailed design stage we might use a computer for the interface, or we might be using paper prototypes.

Characteristic   Example
List             The system will be running on the evaluator's computer.

Table 4.25: An example for the characteristics of the Required equipment factor in the System category.

In table 4.25 we describe that the evaluator will provide a computer for the software that is being evaluated. The user will, therefore, use that computer to perform the user tasks in the evaluation. Hence we find it necessary to state what equipment is required to conduct the evaluation.

4.5.6 Risk and Issues

When conducting an evaluation, some risks and issues might occur. Perhaps the computer that you are using to evaluate the software shuts down, or one participant does not show up. Various risks or issues might happen before or during the evaluation; being aware of them can be helpful, and they should be considered.

Characteristic   Example
Description      Low battery on the computer, the system will crash, the participant might not show up.

Table 4.26: An example for the characteristics of the Risk and issues factor in the System category.



As some risks can occur in the evaluation (see table 4.26), a good example would be the system or the computer shutting down because of an error. We could then prepare for that and know how to address the issue. There might even be too much workload for the participant to finish; we could then have a prepared procedure to handle that issue.


34

Chapter 5

The UXE PSS

In this chapter, we describe the design considerations and the three-layered service application containing a presentation layer, a business layer, and a data layer. First, we give an overview of the UXinn application, which is based on the UXE PSS (User Experience Evaluation Planning Support System).

5.1 Quick Overview

To study the planning of UX evaluations we needed to develop an application, which was given the name UXinn (User Experience Innovation). The difference between the framework and the application is that the framework is focused on what needs to be considered when planning a UX evaluation, while the application is a questionnaire that is focused on collecting the needed information for the UXE PSS framework. UXinn is a software application that enables software developers to prepare user tests based on the UXE PSS framework. The reason for not using a questionnaire tool like SurveyMonkey is that such tools impose limits on further research, e.g., on collecting data. Since we have centralized data collection from many different sources, this data can be explored. The application also adds the option of expanding the project in other directions that might be interesting to explore.

The intended users were students in computer science, because if we have an application that is understood by students, it is most likely also going to be understood by the IT professionals and researchers that would benefit from using the software in the future. The user of the software receives instructions on the intended evaluation and then provides feedback on that test after completion. This accelerates the organization of user testing.

By introducing feedback on the tests into the system, we are preparing the opportunity to develop a research tool that can be used to answer all sorts of questions, e.g., what method of evaluation is used the most. All data from the users is stored centrally. Over time, the information gained from the use of test methods and the users' assessments of the effectiveness of the tests performed could be used further. For example, it would be possible to develop a machine learning mechanism to learn how many users are needed for various tests, and then give feedback on what is found to be the optimal number of users to have in a user evaluation for any type of product testing, e.g., for the types website, wire-frame, and detailed design. However, in this study, we do not focus on further use of the data.


5.2 Design Considerations

To study how to design the application, we needed to look into how to develop the product. The fundamental structure of the application is that it has three layers: the presentation layer (Web), the application programming interface (API) and the data layer for storing information. The application is designed for browsers, so the API has to be secure; it is described in section 5.5. The presentation layer is described in section 5.4. The data is stored in the data storage layer, which is described in section 5.6.

5.3 Three-Layered Application

In software engineering, multi-tier architecture, also known as n-tier architecture (a tier is referred to as a layer), is a client-server architecture where presentation, application processing, and data management are physically separated. The most widespread use of this architecture is the three-layer architecture, which is well known in software development. See figure 5.1.

Figure 5.1: The figure shows the user getting a list of plans for a project. The presentation layer sends a request to the business layer, which then sends a query to the data layer, which responds with the current information about the request; the response is relayed back the same way.


The presentation layer is the topmost level of the application, where information is displayed to the user while he is browsing the application; it communicates with the business layer and receives the information that is then displayed to the user. This is the layer that the user can access directly. The business layer contains all the application's functionality. It processes requests from the presentation layer by getting information from the data layer. The data layer includes the database; it encapsulates the database and exposes the data to the business layer [43]. By using a three-layered architecture, we can change any of the three layers without affecting the other layers.
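To make the separation concrete, the following is a minimal sketch in JavaScript of the flow in figure 5.1. It is illustrative only, not the actual UXinn code: the module and function names are ours, and the database is replaced by an in-memory object so the example is self-contained.

    // data-layer.js -- encapsulates storage and exposes data to the business layer.
    // A real implementation would query the database instead of this object.
    const plansByProject = { 1: [{ id: 11, name: 'First evaluation plan' }] };

    exports.findPlansByProject = async (projectId) => plansByProject[projectId] || [];

    // business-layer.js -- contains the application functionality; it processes
    // requests from the presentation layer by asking the data layer for data.
    const data = require('./data-layer');

    exports.listPlansForProject = (projectId) => data.findPlansByProject(projectId);

    // presentation layer -- requests the plans and displays them to the user
    // (reduced to a direct call here; in UXinn this request travels over HTTP).
    const business = require('./business-layer');
    business.listPlansForProject(1).then((plans) => console.log(plans));

Because each layer only talks to the layer directly below it, swapping the in-memory object for a real database changes data-layer.js alone, which is the property described above.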

5.4 Presentation Layer

For the presentation layer, we use the MVC (Model-View-Controller) software architectural pattern for implementing the user interface. See Figure 5.2. The MVC architecture divides the application into three interconnected parts. This is done to separate the internal representations of information from the ways that information is presented to and accepted from the user [44], [45].

Figure 5.2: The Model contains the business logic, the Controller interacts with the Model to create data for the View, and the View renders content to the user and relays user commands to the Controller.

The example in figure 5.2 describes what happens when a user opens a browser. He is shown a view; the controller sends initial data to the view, which then renders it for the user. When the user interacts with the view, it sends a request to the controller, which then sends a request to the model. The model interacts with the API (business layer) with a request for an update. From there the model sends the response from the API to the controller, which then sends data to the view. See Figure 5.3 for how MVC interacts with the three-layer architecture.


Figure 5.3: The MVC architecture implemented in the three-layer architecture

The MVC design pattern decouples these key components, which allows the developer to reuse code and supports efficient parallel development. An example of parallel development is that one developer can work on the view while a second developer works on the controller logic and a third developer focuses on the business logic in the model. The use of this pattern has grown and become more popular over the years for developing web applications, mobile, desktop, and other clients [46]. Many popular programming languages like C#, Java, and others have MVC frameworks that are used in web applications today. That includes AngularJS. As Jain et al. described in their article "AngularJS: A Modern MVC Framework in JavaScript", AngularJS is an innovative approach for extending HTML that makes much sense for web development, since it works well both for quick prototyping projects and for large-scale applications, especially since it has a large community and Google behind it [47]. After researching the framework and exploring the community behind it, we decided to use AngularJS, especially since Google is heavily invested in the framework.

We designed the UXinn application as a single page application (SPA). By developing the application as an SPA we minimize the need to refresh the whole page, since we send all the code (JS, HTML, CSS) to the browser when the user loads the page.
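As a minimal sketch of how these pieces fit together in AngularJS 1.x (the controller name and the API URL below are illustrative, not taken from the UXinn code), a controller asks the API for the plans of a project and the view re-renders the list without a page refresh:

    // app.js -- AngularJS 1.x module and controller (names are illustrative)
    angular.module('uxinn', [])
      .controller('PlanListController', ['$http', function ($http) {
        var vm = this;
        vm.plans = [];
        // the controller asks the model/API for data; the view updates itself
        $http.get('/api/projects/1/plans').then(function (response) {
          vm.plans = response.data;
        });
      }]);

    <!-- index.html -- the view; AngularJS re-renders the list when vm.plans changes -->
    <div ng-app="uxinn" ng-controller="PlanListController as vm">
      <ul>
        <li ng-repeat="plan in vm.plans">{{ plan.name }}</li>
      </ul>
    </div>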

The UXinn application has a front page where the user can find information about UXE PSS, UXinn, and the team involved in developing the product, how to contact the team, and the options to register or log in. See Figure 5.4. When the user is logged in to the system, he can make as many projects as he needs and make an evaluation plan in each project. He is then walked through a set of 52 questions from the five elements, Roles, Activities, Material, Environment, and System, that are defined in the UXE PSS framework. When a user has made a plan within UXinn, he can get a detailed summary of his answers, which is then presented to him on a single page, giving him an overview of the evaluation that has been prepared. For each evaluation plan, a feedback option is generated so he can give feedback on his evaluation. This gives the user better insights into how things went in the evaluation, which can later be used to improve the planning of UX evaluations. Furthermore, this can be utilized in the future to gain better insights into how user evaluation is conducted and to give the user a suggestion loop based on that information. We, however, do not discuss that option further.

Figure 5.4: Showing the front page of the UXinn application.

When the user has registered and logged in, he is directed to a dashboard. On the dashboard, he can make a new project and see a list of the projects that he owns. See Figure 5.5. For each project, he can add a new evaluation plan that he is about to conduct. With each plan, we make a feedback option, where the user can give feedback on the evaluation. See Figure 5.6. From that, he can later see what went wrong or what was right in the evaluation, and take that into account in his next plan.

Figure 5.5: Showing how the list of projects is presented to users.


Figure 5.6: Showing how the list of plans and feedback is presented to users.

When the user has made the plan, he is introduced to a set of questions about the five elements of UXE PSS: Roles, Activities, Material, Environment and System. See Figure 5.7. After he has answered the set of questions on those elements he can get an overview of the plan he just made. See Figure 5.8.

Figure 5.7: Showing an example of the questions that are presented to users.


Figure 5.8: Showing how the results of the questions are represented in the evaluation plan.

When the user decides to give feedback, he is presented with a set of questions about the five elements of UXE PSS and the evaluation he conducted, in the same manner as when he was preparing the plan for the evaluation. Then, after he has answered the questions about the evaluation, he can get an overview of his answers in the same way as the evaluation plan presents it.

5.5 Business Layer

For the business layer, we decided to use a representational state transfer (REST), or RESTful, web service. It is a way to provide synergy between computer systems on the internet. By using REST, we allow a requesting system to access and change textual representations of Web resources by using a uniform and predefined set of stateless operations. The main characteristics of a REST service are listed in Table 5.1.

Client / Server
Stateless
Cacheable
Layered System
Code on demand

Table 5.1: List of the main characteristics of a REST service.

Technically, REST does not depend on HTTP. However, most REST services are implemented using it, and the reason for that is that HTTP fits very well. An example of an HTTP request can be seen in Figure 5.9. Roy Fielding, one of the principal authors of the HTTP specification, described REST's effect on scalability [48]:


"REST’s client-server separation of concerns simplifies component implemen-tation, reduces the complexity of connector semantics, improves the effective-ness of performance tuning, and increases the scalability of pure server compo-nents."

For the API we decided to use Node.js. It is an open-source, cross-platform runtime environment for server-side and networking applications. Node.js uses an event-driven, non-blocking I/O model that makes it lightweight and efficient, and it has one of the largest ecosystems of open source libraries in the world [49]. The application is written in JavaScript and can be run on OS X, Microsoft Windows, Linux, and FreeBSD.
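To illustrate the stateless, resource-oriented style described above, here is a minimal REST endpoint using only the Node.js core http module (the route and the response body are illustrative, not part of the UXinn API):

    // server.js -- a minimal stateless REST endpoint in plain Node.js
    const http = require('http');

    http.createServer((req, res) => {
      // Each request is self-contained: nothing is remembered between calls.
      if (req.method === 'GET' && req.url === '/api/health') {
        res.writeHead(200, { 'Content-Type': 'application/json' });
        res.end(JSON.stringify({ status: 'ok' }));
      } else {
        res.writeHead(404);
        res.end();
      }
    }).listen(3000);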

Figure 5.9: An HTTP POST login request sent to the server with username and password. The server issues a JSON Web Token that is then used for access control and validation [49].

Security is one of the things that needs to be addressed when the presentation layer is communicating with the business layer. Two things need attention: first, authentication — how the user is authenticated and how passwords are stored; secondly, the security of transferring information between the user and the server. For authentication, we use JSON Web Tokens (JWT), built on an open standard (RFC 7519) that defines a compact and self-contained way of securely transmitting information between a user and an API [50]. The information transferred can be trusted because it is digitally signed: JWTs can be signed using a secret key or a public/private key pair using RSA. RSA is one of the first public-key cryptosystems, developed by Rivest, Shamir, and Adleman, and is widely used for secure data transfers. There are two keys in RSA: a public key that is the encryption key, and a second, decryption key, which is kept private [51]. The most common way of using JWT is for authentication. Once the user is logged in, he is assigned a token that is then included in requests to the API (see Figure 5.9), which allows the user to access the routes, services, and resources that are permitted within that token. A JSON Web Token consists of three parts separated by dots, typically looking like xxx.yyy.zzz, representing the Header, the Payload and a Signature. This is explained in detail on the JWT website (www.jwt.io) [52].
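A minimal sketch of issuing and verifying such a token, assuming the jsonwebtoken npm package (the thesis does not name the JWT library, and the function names here are ours):

    const jwt = require('jsonwebtoken');

    const SECRET = process.env.JWT_SECRET; // secret key used to sign the tokens

    // Called after the username/password check succeeds (as in figure 5.9).
    function issueToken(userId) {
      // produces the header.payload.signature (xxx.yyy.zzz) structure
      return jwt.sign({ sub: userId }, SECRET, { expiresIn: '1h' });
    }

    // Called on every API request before a protected route is allowed.
    function verifyToken(token) {
      return jwt.verify(token, SECRET); // throws if signature or expiry is invalid
    }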


For storing passwords, we use the bcrypt algorithm, which is slow enough that brute-force attacks are not a feasible option. It salts the password, which helps against rainbow-table attacks [53]. From bcrypt, we get a hash of the password, and that hash is what we store in the database.
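As a sketch, assuming the bcrypt npm package (the thesis names the algorithm, not a specific library), hashing on registration and checking on login look like this:

    const bcrypt = require('bcrypt');

    async function hashPassword(plainPassword) {
      // 10 salt rounds: the cost factor that keeps brute forcing slow;
      // the generated salt is embedded in the resulting hash string
      return bcrypt.hash(plainPassword, 10);
    }

    async function checkPassword(plainPassword, storedHash) {
      // re-hashes the attempt with the salt from storedHash and compares
      return bcrypt.compare(plainPassword, storedHash);
    }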

The communication between the server and the user has to be secure. For that, we use SSL (Secure Sockets Layer), which is the standard security technology for an encrypted link between a server and a browser. It ensures that all data passed between the server and the browser is private. SSL is a known industry standard and is widely used for protecting transactions between websites and users.

For the testing environment we use Mocha. Mocha is a feature-rich JavaScript test framework that runs on Node.js. It was created to be a simple, extensible and fast testing suite [54]. Mocha is used for unit and integration testing and is suitable for Behavior Driven Development (BDD). We feel that having it in the application will help with future development of the UXinn application.
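A minimal Mocha test in the BDD style, run with the mocha command-line tool; the validated helper is hypothetical, not taken from the UXinn code (the plan name is borrowed from the evaluation tasks in chapter 6):

    const assert = require('assert');

    // the unit under test: a tiny helper, purely for illustration
    function isValidPlanName(name) {
      return typeof name === 'string' && name.trim().length > 0;
    }

    describe('isValidPlanName', function () {
      it('accepts a non-empty plan name', function () {
        assert.strictEqual(isValidPlanName('Cooking 101'), true);
      });
      it('rejects a blank plan name', function () {
        assert.strictEqual(isValidPlanName('   '), false);
      });
    });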

So far we have described the business layer in development. For a production environment, we looked into how that can be done. Since we are using Node.js, it is fitting to have it managed by PM2, an advanced production process manager for Node.js [55]. PM2 gives significant oversight of what is happening within the application. It allows starting and restarting the application as a daemon, monitoring console logging, setting up multiple instances for load balancing, and handling clustering of the application. By adding PM2 as a load balancer (see Figure 5.10) we can quickly add a new instance of the application to the process manager, such that requests get routed to the other instances of the same application. Combining that with Nginx ("engine x") in front allows fine tuning and an actual understanding of what is happening in the application. Nginx is a free, open-source, high-performance HTTP server, known for its stability, rich feature set and low resource consumption [56].

Figure 5.10: Showing a client sending a request to the application through Nginx as a reverse proxy and PM2 as a load balancer. Applications 1, 2 & 3 represent three instances of the UXinn API.

In Figure 5.10 we can see the client interacting with the business layer (the UXinn API, "Application 1, 2, 3"), which has a reverse proxy server and a load balancer in front of it. The request from the client is sent to the proxy, which relays it on to the load balancer, which then directs it to an instance of the application. When we described PM2 and said that we could start up new instances, this is what we were referring to; here we see three instances of the application, and the load balancer directs each request to one of them according to the load on the system. If an application is a high-traffic site, the first step in increasing performance is to put a reverse proxy server in front of the Node.js server (see Figure 5.10); that is one of the core use cases for Nginx. Since Node.js is single-threaded and uses non-blocking I/O that can support tens of thousands of concurrent operations, using this setup can be very useful. By using PM2 as a load balancer and Nginx as a reverse proxy for the application, we can quickly scale the product up and handle more than tens of thousands of concurrent connections [56].
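A minimal PM2 sketch of running the three instances from figure 5.10 (the file name, the app name and the entry-point script are illustrative):

    // ecosystem.config.js -- PM2 process file (names are illustrative)
    module.exports = {
      apps: [{
        name: 'uxinn-api',
        script: 'server.js',   // the Node.js entry point of the API
        instances: 3,          // three instances, as in figure 5.10
        exec_mode: 'cluster'   // PM2 load-balances requests across them
      }]
    };

Starting this with "pm2 start ecosystem.config.js" launches and supervises the three instances; Nginx then sits in front of them as the reverse proxy.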

5.6 Data Layer

For the database, we use PostgreSQL, also known as Postgres. Postgres is an object-relational database management system (ORDBMS) with an emphasis on extensibility and standards compliance. Its primary functions are to store data securely and return that data in response to requests. The database schema has six sectors, which are marked with yellow, red, green, blue, orange and purple.

The yellow sector (in Figure 5.11) holds the shared tables, such as rames_category, which has the name of each category and information about it, e.g., Roles, and the languages, so there is an option to add more languages to the application. The categories are then used to categorize the questions according to the UXE PSS factors, both for planning and for giving feedback.

Figure 5.11: The yellow sector of the diagram stores the language options and the categories within the system, such as Roles, Activities, Material, Environment and System.

The red sector (in Figure 5.12) is the feedback. It is connected to the reports, or as a user will know them, the plans; for each plan we want to collect feedback into the database, such that the information can both help the user remember how that evaluation went and open the possibility for research. The feedback holds four tables: feedback_reports_info, which contains the answers from the user; question_feedback_questions, which holds the information about the questions that we want feedback on; and both question_feedback_checkbox_choices and question_feedback_radio_choices, which hold the options for each feedback question.


Figure 5.12: The red sector of the diagram stores the information from the feedback, the feedback questions, and the checkbox and radio choices of each question.

The green sector (in Figure 5.13) is the plan. It is connected to the feedback and to the reports that are known to the user as plans; the user can have many plans in each project. The sector has these tables: reports_info, which stores the answers to the questions from UXE PSS; rames_info, which contains the information about each question and its explanation; rames_questions, which contains the questions used to make the plan, with a suggestion for each question such that the user can get better insight into what is expected; and question_radio_choices, question_dropdown_choices and question_checkbox_choices, which hold the answer options.

Figure 5.13: The green sector of the diagram stores the information from the UXE PSS framework, its questions, and the options from checkboxes, radio buttons and dropdown menus.

The purple sector (in Figure 5.14) is for the projects and plans. For each project and report the user is set as the owner; a project belongs to one user and can have many reports. The tables in this sector are project, which contains the name of the project, the description of the project, dates, and the id of the user, and reports, which holds the information for the plan and the answers to the questions for the evaluation, with the feedback attached to it.

Figure 5.14: The purple sector of the diagram stores information about projects, plans and the type of plan. This information is attached to the user id.

The blue sector (in Figure 5.15) has three standalone tables, contactus, rames_picture and collaborators, which are for content and for contacting the owners of the application. E.g., contactus holds a message that can be sent from the presentation layer, the collaborators are added to the database so they get displayed in the application, and rames_picture saves the URLs of pictures that can be displayed in the SPA.


Figure 5.15: The blue sector of the diagram stores information about content in the application, where to find pictures, and information about the collaborators in the project, e.g., the developers of the application. The orange sector holds the information on the users in the system: hashed password, email, roles and other needed information on each user.

The orange sector (in Figure 5.15) holds essential information about the user. It stores the hash from the hash function, which is then used to validate the user. Each user has a unique id that is used for the ownership of the projects and reports. It also stores information about the roles that users have within the application, meaning whether a user is a typical user or has other privileges; such a role can be administrator.
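As a minimal sketch of how the business layer could talk to this schema — assuming the node-postgres (pg) package, which the thesis does not name, and simplified column names — creating a project for a user and reading his plans back could look like this:

    const { Pool } = require('pg');
    const pool = new Pool(); // connection settings come from environment variables

    // purple sector: a project belongs to one user (the owner)
    async function createProject(userId, name, description) {
      const result = await pool.query(
        'INSERT INTO project (user_id, name, description) VALUES ($1, $2, $3) RETURNING id',
        [userId, name, description]
      );
      return result.rows[0].id;
    }

    // "reports" is what the user knows as a plan; one project can have many
    async function listPlans(projectId) {
      const result = await pool.query(
        'SELECT * FROM reports WHERE project_id = $1',
        [projectId]
      );
      return result.rows;
    }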


Chapter 6

Evaluation of UXE PSS

This chapter presents each user evaluation that was conducted. First, we explain the evaluation overview, our usability goals, and our user experience goals. Then, for each user evaluation, the UXE PSS framework is used as described in chapter 4 to plan the evaluation, where we define each of the factors of the five categories: Roles, Activities, Material, Environment and System. Then we describe the participants, the method of the evaluation and the results.

6.1 Evaluation Overview

In this study, we conducted two formal evaluations with the same four participants, who are in their Computer Science studies at Reykjavík University. When conducting the evaluations we used the Customer Interview process [57], which is described in section 6.4. We used the UXE PSS framework as described in Chapter 4 to plan the evaluations; the plans can be seen in Tables 6.3 and 6.7. Our purpose was to get better insights and to measure usability goals from the data we gathered in the evaluations. For data gathering we focused on having the participants solve predefined tasks, and we collected information from the participants by getting feedback while they solved the tasks, asking questions, timing the tasks, recording the completion of each task, collecting comments, and debriefing. We used a voice recorder, a timer, pen and paper for collecting the information. For the participants to solve each task, we prepared an informative text that they got beforehand (see Appendix B). That allowed us to expect a certain solution to each of the tasks and helped us in the analysis process.

The focus in both of the evaluations was to have the participants make an evaluation plan for a project that was made up (Task 4). However, in the second evaluation, we added a feedback questionnaire to the system that can be used to collect information about how the system user felt about the generated plan. This allows the system to collect information to analyze in the future to build a recommender system that can simplify the process of planning.

6.2 Usability Goals

For the usability goals we used the ISO definition [12] (ISO 9241-210 (2010)). The user was to complete a predefined task with the correct data inserted. As a measurement of accuracy and completeness, the percentage of users completing each task with correct data can be used. However, if a user completes a task with incorrect data, one has to evaluate how serious that is and how much attention the measurement should get. In our evaluation, it is not serious to insert incorrect data, but then the task was not completed with the correct data as one would expect.


For this study, we have defined measurable goals for effectiveness, efficiency, and satisfaction in table 6.1.

Effectiveness
80% of users using the system can make a new project.
80% of users using the system can find information on UXE PSS.
80% of users using the system can add information to a project.
80% of users using the system can edit information on a project.
80% of users using the system can delete a project.
80% of users using the system can complete a new plan for a project.
80% of users using the system can find the plan they created.
80% of users using the system can edit a plan.
80% of users using the system can delete a plan.

Efficiency
80% of users using the system take less than 3 minutes to make a project.
80% of users using the system take less than 3 minutes to find information about UXE PSS.
80% of users using the system take less than 60 minutes to make an evaluation plan.

Satisfaction
80% of users using the system find it easy to use the software.
80% of users using the system find it easy to navigate within the software.
80% of users using the system will likely return to the software in the future.

Table 6.1: List of our goals for effectiveness, efficiency and satisfaction.

6.3 User Experience Goals

Considering what we mean when making our goals for user experience (UX): UX is about how the user feels about using a product. For our goals, we wanted to focus on three things: Confidence, Completion and Self-actualization. The main reason for choosing these experiences is that we wanted to help the user of the system feel that he has the opportunity to cope with the situation and to be capable and effective rather than feeling ineffective, and to give him the sense that he is making something meaningful and has more control when planning a UX evaluation.

When users are planning an evaluation, they often do not know what they are looking for or how they are supposed to conduct the evaluation. By asking the user the questions that are defined in the UXE PSS framework, we believe that they will gain more confidence in their evaluation. After the user has answered these questions, a plan is presented that they can then follow for their evaluation, meaning they will feel more capable of completing the evaluation; this will also help them learn and give them the fulfillment of conducting the evaluation.

6.4 The User Evaluations

In this study, we conducted two user evaluations. Both of the evaluations are formative: we had four students whom we monitored while they solved tasks and provided feedback, which we then used to improve the application and the framework. By using formative evaluation, we identify the parts of the application that need to be clearer. In both evaluations, we gathered information by asking about the participants' background, having them solve tasks, and gathering feedback on how they felt about the process. After collecting this information, we analyzed the data


for further information. We used the Customer Interview process, which is a part of the Google design sprint process [57], for our user evaluations; it is split into five acts.

Act 1 - Friendly Welcome.

It is important that we thank the user for participating in the evaluation and explain that we are not testing him and that we are only looking to improve the product. We explain that his honest feedback is essential and will help. Then we let him know that this is an informal interview and that we will be asking some questions, again stressing that this is not to test him but to get a better understanding of our product. We make clear that there is nothing right or wrong, explain the next acts and make sure that he feels comfortable.

Act 2 - Context Questions.

For context, we ask open-ended questions that give us background information on the participant. Here it is important that we do not focus on the prototype, but on gaining an understanding of how, or whether, the product fits the user. This is so we can get better insight into the potential end user of the product from the user's perspective.

Act 3 - Introduce the Prototype.

When we introduce the product, there are some things that we need to be aware of. The user has a natural way of responding to situations: e.g., he might not want to hurt feelings, and he might feel that he is being tested. This can lead to a biased result, and that needs to be avoided by easing the user before everything starts: explaining that we are not testing him, that something might not work, and that there are no rights and wrongs in the evaluation. Here we ask the user to think aloud, telling us what he is thinking, how he feels, what he is trying to do and why. We must be sure that the user knows that if he gets confused or stuck, he is allowed to ask for help, and that we want to hear his thoughts.

Act 4, Tasks & Nudges.

To get the needed information we observe the participant while giving him assignments, commonly referred to as tasks. First, we need to define the general user goals that the end user may have by asking what the main reason for using the software is. In this study it is to plan a user evaluation, so we need to build our tasks to address this goal. The defined tasks are listed in Table 6.2. In order for the participant to solve the tasks we designed a relevant information text that was given to him (see Appendix B). The user is then asked to complete the tasks, and we nudge him in the right direction if needed. When nudging the user, we need to be careful not to instruct him while he is solving the task. We might ask some questions about how he feels about the product, why he chose to do one thing instead of something else, or what he is thinking.


#    Task
1    Ask the user to browse the system, register and log in.
2    Find information on Roles, Activities, Materials, Environment and Software.
3    Make a project.
4    Make a plan.
5    Edit the plan, since it changed.
6    Edit the project description.
7    Find the plan that you just made.
8    Add a new plan and give it the name "Cooking 101".
9    Delete the plan "Cooking 101".
10   Make feedback on your evaluation.
11   Find the feedback that you just made.

Table 6.2: List of tasks in the user evaluations. The first nine tasks were used in the first evaluation, in the second evaluation we used all eleven tasks, and for the third evaluation we used task #4.

Act 5 - Quick Debrief.

By now the user has given information on his background and browsed the product while completing tasks. He has solved each task while thinking aloud, describing his thoughts, how he felt and why he approached things the way he did. With the quick debrief we gather the learnings and key takeaways from the whole process by asking him, e.g., "How does this product compare to what you do now?". This information is important for the analysis, as we are asking for the user's insights after he has walked through the system.

In the five-act interview, we gathered the data through voice recordings, taking notes, asking questions and actively observing the users solving the tasks. Each interview took roughly an hour, and we had at least a 30-minute break between interviews. From there we analyzed the data and tried to identify patterns. After only two participants we started to see a pattern, and it was confirmed by the next two users. As we saw that we were getting the same information from all four participants, we decided to address the issues and then conduct another evaluation.

6.5 First User Evaluation

In the first user evaluation, we planned to have five participants in their Computer Science studies from Reykjavik University. However, we ended up with four participants, as one did not show up for the evaluation. Our purpose was to construct a formal user test with those participants such that we would get better insights and could measure usability goals from the data we gathered. In this evaluation, we asked the participants to make a plan for a new project. See Figure 5.7.

We planned to meet with each of the participants at Reykjavik University over a period of two days, for about one hour each, in a controlled environment. Our evaluation plan can be found in section 6.5.1. In section 6.5.2 we describe the participants. In section 6.5.3 we describe the process of our evaluation methods. Finally, in section 6.5.4 we present the results of our user evaluation.


6.5.1 Evaluation Plan

Roles
R1. System Users: 4 students, in their Computer Science studies; expertise level, age, gender; students learning how to conduct a user evaluation. We select users that are learning how to conduct UX evaluation at Reykjavik University.
R2. Evaluators: 1 — Bjarni Kristján Leifsson, graduate student in Software Engineering; how to perform research with UX evaluation.
R3. Observers: No observer.
R4. Technical writers: Same as evaluator.
R5. Recipients: Supervisors.

Activities
A1. The Purpose: Construct a formal user test with four users in the field of Computer Science, measure usability goals and gather insights on how to improve the UXE PSS software from these tests.
A2. Schedules: We plan to meet the users at Reykjavík University over two days; 60 minutes each.
A3. Methodology: Formal user tests are conducted by having the users perform nine tasks, timing the tasks while they solve them and asking them if they have any comments on the tasks.
A4. Analysis: 9 user tasks are defined by the evaluator, with information on the subject where the user can find relevant information to solve the tasks. A background is gathered with questions defined by the evaluator; a set of questions to ask the participant, an introduction text for the evaluation, and paper to collect information on the tasks.
A5. Decision: The evaluator takes care of the procedure for deciding how to react to the results.

Materials
M1. Evaluation material: 9 user tasks defined by the evaluator for the UXE PSS. A background is gathered with questions defined by the evaluator. Debriefing at the end of the evaluation with questions defined by the evaluator. A paper with an information text that describes a product that has been developed; the text contains information that the user needs to solve a task.
M2. Data collection: (a) Feedback while conducting tasks. (b) Data gathered by questions. (c) Time to solve each task. (d) Whether tasks are solved or not. (e) Comments on how UXE PSS could be improved.
M3. Results: The results are presented in a testing report.
M4. Decisions: The evaluator is responsible for responding to the results of the evaluation report.
M5. Resource cost: Research; does not apply.

Environment and Equipment
E1. Settings: Reykjavík University, Menntavegur 1; in controlled settings.
E2. Physical ambient conditions: Does not apply.
E3. Social conditions: Does not apply.
E4. Data collection: Voice recording and timer on an iPhone 7, pen and paper.
E5. Data analysis: The data will be analyzed by taking notes from the recordings, gathering notes both from the recordings and from the evaluation, and calculating the average time of solving a task.

System
S1. Description: Software that helps students to plan an evaluation; UXE PSS, version 1.1.0.
S2. Type: Web application used to plan a UX evaluation and gain better insights into the process. A research tool that can be used to answer questions, e.g., what methods are mostly used, how many users are typically used for conducting evaluations.
S3. Stage: Running prototype.
S4. Part: Creating a new project and planning a UX evaluation with the UXE PSS framework.
S5. Required equipment: The system will be running on the evaluator's computer.
S6. Risk and issues: Low battery on computer, system might crash, participant might not show up.

Table 6.3: Evaluation plan for conducting the first user evaluation of UXE PSS.


6.5.2 The Participants

In our first evaluation of UXE PSS, we had 4 participants. They are all Computer Science students at Reykjavík University; three of the participants are in their Master's studies and one in his Bachelor's studies. The participants were chosen by us from our working environment within Reykjavík University. Their fields of focus vary across Artificial Intelligence, Game Development, Algorithms and Human-Computer Interaction. One of the participants has some working experience in planning and conducting user evaluations of a product. However, all participants have experience with planning a user evaluation from their studies.

Looking at the experience from their studies, they have all taken courses on usability testing and on planning and conducting a test. All but one explained that they did usability testing in their final project, and the one who had not is still in his Bachelor's studies. However, all participants have used user evaluation for a product they have been developing in their studies, e.g., in game development, where their user evaluations gave them useful and vital information that then improved the game for the user. From that experience, they stated that they saw the importance of user evaluation in the development cycle.

6.5.3 The Evaluation Process

For our evaluation we used a method called Customer Interviews, which is a part of the Google design sprint process [57], where a five-act interview is used to conduct the user evaluation as described in section 6.4.

6.5.4 The Results

In this section the results of the usability evaluations are described.

Effectiveness

In table 6.4 we can see whether each task was solved by the user with correct data, marked with a ✓ or left empty otherwise, where ✓ is considered to be 100% in both Quantity and Quality.

Task     1   2   3   4   5   6   7   8   9

User 1   ✓   ✓   ✓       ✓   ✓   ✓   ✓   ✓
User 2   ✓   ✓   ✓   ✓   ✓   ✓   ✓   ✓   ✓
User 3   ✓   ✓   ✓       ✓   ✓   ✓   ✓   ✓
User 4   ✓   ✓   ✓   ✓   ✓   ✓   ✓   ✓   ✓

Table 6.4: This shows whether each user finished each task with correct data, marked with ✓ or empty if not.

From table 6.4 we can see that we achieved all but one of our effectiveness goals (see table 6.1) with correct data. The goal we did not achieve is that we expected 80% of the users of the system to complete a new plan for a project with correct data. One explanation for not achieving that goal could be that the users had not planned an evaluation in this much detail before. Another explanation could be that the information text (see Appendix B) they got was not direct and leading, so they might have gotten confused. The users commented that they felt it was much information to read, and that not being acquainted with it affected how long it took them to solve the task. They also said that the information in the application that should be helping them (referring to the help button) should be, quoting, "dumbed down and have some examples as well". That could also explain the trouble they had. See Table 6.5.

User    Completed the task           Quantity   Quality   Task effectiveness

U1      Yes, with incorrect data     100        50        50%
U2      Yes, with the correct data   100        100       100%
U3      Yes, with incorrect data     100        50        50%
U4      Yes, with the correct data   100        100       100%
Total                                                     75%

Table 6.5: The task effectiveness for solving task 4: Complete a new plan for a project.

The task effectiveness is measured according to the definition by Bevan and Macleod [58]. Lárusdóttir et al. used this way of calculating the effectiveness in "A case study of software replacement", where they introduced a new software system for evaluation by replacing an old one with a new one. Macleod, Bowden and Bevan [59] further explain this method. They describe effectiveness both quantitatively and qualitatively, where the quantitative part is "the proportion of task goals represented in the output of a task which have been attempted" and the qualitative part is "the degree to which the task goals represented in the output have been achieved". From that, we can define the effectiveness as:

Task effectiveness = (Quantity × Quality) / 100% (6.1)

In the evaluation, the quantitative effectiveness is defined as 100% if the user succeeded in completing the task, regardless of whether the data was correct or incorrect, and 0% otherwise. The qualitative effectiveness is defined as 100% if the user succeeded in completing the task with correct data, 50% with incorrect data, and 0% otherwise. This is per Nielsen's calculation of the success rate [60].
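As a worked example, applying equation 6.1 to the numbers in Table 6.5 (the helper function is ours, purely for illustration):

    // Task effectiveness per equation 6.1: (Quantity * Quality) / 100
    function taskEffectiveness(quantity, quality) {
      return (quantity * quality) / 100;
    }

    // U1 and U3 completed task 4 with incorrect data (Quality 50),
    // U2 and U4 with correct data (Quality 100), as in Table 6.5.
    const scores = [
      taskEffectiveness(100, 50),   // U1 -> 50
      taskEffectiveness(100, 100),  // U2 -> 100
      taskEffectiveness(100, 50),   // U3 -> 50
      taskEffectiveness(100, 100),  // U4 -> 100
    ];

    const total = scores.reduce((sum, s) => sum + s, 0) / scores.length;
    console.log(total + '%'); // 75% -- the total in Table 6.5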

Nielsen recommends using a simple usability measure: the user success rate, which he defines as the percentage of tasks that users complete correctly. However, it says nothing about why users fail or how well they perform the tasks. Nielsen also explains that success rates are easy to collect and report statistically, and that user success is the bottom line of usability. A good example from Nielsen:

Let's say, for example, that the users' task is to order twelve yellow roses to be delivered to their mothers on their birthday. True task success would mean just that: Mom receives a dozen roses on her birthday. If a test user leaves the site in a state where this will occur, we can certainly score the task as a success. If the user fails to place any order, we can just as easily determine the task a failure.

However, Nielsen pointed out that there are other possibilities as well: what if the user orders twelve yellow tulips, twenty-four yellow roses or some other deviant bouquet? Such cases can be defined as some degree of failure; for instance, although the user ordered the twelve yellow tulips, he also added another bouquet with it, and that was not the task.

So if a user does not perform a task as it is specified, we could be strict and score it as a failure (0%). By doing so, we would be using a simple model where users either do everything correctly or they fail the task. However, Nielsen grants partial credit for a partially successful task, as it seems unreasonable to give the same score to a user who partially fulfills the task and a user who did nothing.


From the example above where Nielsen described a flower order, he also explained how we could rate the success of that task:

In the flower example, we might give 80% credit for placing a correct order, but omitting the gift message; 50% credit for (unintentionally) ordering the wrong flowers or having them delivered on the wrong date; and only 25% credit for having the wrong delivery address. Of course, the precise numbers would depend on a domain analysis.

Efficiency

As we described at the beginning of Chapter 6, efficiency is the ratio of effectiveness to the resources used. To measure the resources expended we first defined efficiency goals (see table 6.1) and then timed each user; if he completed the task, we calculated the average time. See Table 6.6. Timing tasks is often used as a measurement of quality of use [61]. In this case, we only timed task 4, "Complete a new plan for a project". We had set our goal that 80% would complete the task within 60 minutes; every user was well within that, the longest time being 48 minutes. The main reason for timing only that one particular task is that each user had no problem completing the other tasks, which took much less time than we expected. The result is therefore based on the time of all completed tasks, including those with both correct and incorrect data, with unfinished tasks left out. This is very suitable, especially if all the tasks were completed with 100% effectiveness.

Task          Average time (Stdev, N)

Make a plan   37:00 (07:00, 4)

Table 6.6: The resources used for completing the task (both correct and incorrect data).

In Bevan and Macleod [58] the user efficiency is defined as:

User efficiency = Task effectiveness / Task time (6.2)

By using this method, we can combine the accuracy, the completeness and the resources expended into a single measurement. However, more evaluations are needed to combine measurements and compare the user efficiency for each task; this is shown in section 6.6.4. Lárusdóttir et al. [61] described the advantage of using this method as: "The advantage of the efficiency measure is that it provides a more general measure of the usage of the system by trading off the quantity and quality of the task completion against time of completion."
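As a worked example of equation 6.2 with the numbers above (the unit, percent per minute, is our choice for illustration):

    // User efficiency per equation 6.2: task effectiveness divided by task time.
    function userEfficiency(effectivenessPercent, taskTimeMinutes) {
      return effectivenessPercent / taskTimeMinutes;
    }

    // First evaluation, task 4: 75% effectiveness (Table 6.5) in 37 minutes
    // (Table 6.6); second evaluation: 87.5% (Table 6.9) in about 19.5 minutes
    // (Table 6.10).
    console.log(userEfficiency(75, 37).toFixed(2));     // ~2.03 % per minute
    console.log(userEfficiency(87.5, 19.5).toFixed(2)); // ~4.49 % per minute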

Satisfaction & User Experience

To measure satisfaction and user experience, we asked the users a few questions after all tasks were completed, as described in section 6.5.3. All the users found the system easy to use and had no problem navigating within the system. All but one said that they are very likely to use the system in the future, one saying "This will help me in my final project"; the one who was not sure said that he could see that if he needed to conduct a user evaluation this would definitely be useful and help a lot, but he was unsure if he would use it again. All commented that they would feel more confident conducting a user evaluation and that this gave them a good overview for conducting one; they also felt that the software would help them be capable, efficient and in control when planning a UX evaluation. When the users were asked whether there was something that they would like to add or change in the system, all pointed out that they see the system as part of a bigger picture and said it would be good if they could add information to each plan. The information they wanted to see was about the users they met, how those users answered questions, and information about the tasks, such as time and other relevant notes. They also said that it would be good to be able to set up tasks, questions, and other related information in the system with each plan, so that it could generate a report and some analysis of their work.

6.6 Second User Evaluation

In the second user evaluation, we planned to have four participants in their Computer Science studies from Reykjavik University, all four of whom we had met in the first evaluation. Our purpose is the same as in the first evaluation; see section 6.5. In this evaluation, we ask the participants to make a plan and then give feedback on that plan. See Figure 6.1. The feedback is a questionnaire that asks the system user how he felt about, and for any comments on, the plan that was generated by the system. In order for the participants to answer the feedback questions we added extra information to the informative text from the first evaluation (see Appendix B.2). By adding feedback to the system we open the opportunity to build a research tool that collects data, which can then be used to answer questions such as how many users are most often included in user testing, where it is most often conducted, and more. Minor fixes had to be made between the evaluations: the participants lost all their data if they changed from one view to another within the system, there were some spelling errors, and the buttons needed a small redesign.

Figure 6.1: An overview of a user giving feedback on the evaluation he conducted.

We planned to meet with each of the participants at Reykjavik University over a period of two days, for about one hour each, in a controlled environment. Our evaluation plan can be found in section 6.6.1. In section 6.6.2 we describe the participants. In section 6.6.3 we describe the process of our evaluation methods. Finally, in section 6.6.4 we present the results of our user evaluation.


6.6.1 Evaluation Plan

Roles
R1. System Users: 4 students, in their Computer Science studies; expertise level, age, gender; students learning how to conduct a user evaluation. We select users that are learning how to conduct UX evaluation at Reykjavik University.
R2. Evaluators: 1 — Bjarni Kristján Leifsson, graduate student in Software Engineering; how to perform research with UX evaluation.
R3. Observers: No observer.
R4. Technical writers: Same as evaluator.
R5. Recipients: Supervisors.

Activities
A1. The Purpose: Construct a formal user test with four users in the field of Computer Science, measure usability goals and gather insights on how to improve the UXE PSS from these tests.
A2. Schedules: We plan to meet the users at Reykjavík University over two days; 60 minutes each.
A3. Methodology: Formal user tests are conducted by having the users perform 11 tasks, timing the tasks while they solve them and asking them if they have any comments on the tasks.
A4. Analysis: 11 user tasks are defined by the evaluator, with relevant information to solve the tasks. A background is gathered with questions defined by the evaluator; a set of questions to ask the participant, an introduction text for the evaluation, and paper to collect information on the tasks.
A5. Decision: The evaluator takes care of the procedure for deciding how to react to the results.

Materials
M1. Evaluation material: 11 user tasks defined by the evaluator for the UXE PSS. A background is gathered with questions defined by the evaluator. Debriefing at the end of the evaluation with questions defined by the evaluator. A paper with an information text that describes a product that has been developed; the text contains information that the user needs to solve a task.
M2. Data collection: (a) Feedback while conducting tasks. (b) Data gathered by questions. (c) Time to solve each task. (d) Whether tasks are solved or not. (e) Comments on how UXE PSS could be improved.
M3. Results: The results are presented in a testing report.
M4. Decisions: The evaluator is responsible for responding to the results of the evaluation report.
M5. Resource cost: Research; does not apply.

Environment and Equipment
E1. Settings: Reykjavík University, Menntavegur 1; in controlled settings.
E2. Physical ambient conditions: Does not apply.
E3. Social conditions: Does not apply.
E4. Data collection: Voice recording and timer on an iPhone 7, pen and paper.
E5. Data analysis: The data will be analyzed by taking notes from the recordings, gathering notes both from the recordings and from the evaluation, and calculating the average time of solving a task.

System
S1. Description: Software that helps students to plan an evaluation; UXE PSS, version 1.2.0.
S2. Type: Web application used to plan a UX evaluation and gain better insights into the process. A research tool that can be used to answer questions, e.g., what methods are mostly used, how many users are typically used for conducting evaluations.
S3. Stage: Running prototype.
S4. Part: Creating a new project and planning a UX evaluation with the UXE PSS framework. Giving feedback on the evaluation plan made.
S5. Required equipment: The system will be running on the evaluator's computer.
S6. Risk and issues: Low battery on computer, system might crash, participant might not show up.

Table 6.7: Evaluation plan for conducting the second user evaluation of UXE PSS.


6.6.2 The Participants

The participants were the same as we described in section 6.5.2.

6.6.3 The Evaluation Process

The process of the evaluation was the same as we described in section 6.5.3.

6.6.4 The Results

In this section we use the same methods as in section 6.5.4 and describe the results of the usability evaluations.

Effectiveness

In table 6.8 we can see whether each task was solved by the user with correct data, marked with a ✓ or left empty otherwise, where ✓ is considered to be 100% in both Quantity and Quality.

Task     1   2   3   4   5   6   7   8   9   10   11

User 1   ✓   ✓   ✓   ✓   ✓   ✓   ✓   ✓   ✓   ✓    ✓
User 2   ✓   ✓   ✓   ✓   ✓   ✓   ✓   ✓   ✓   ✓    ✓
User 3   ✓   ✓   ✓       ✓   ✓   ✓   ✓   ✓   ✓    ✓
User 4   ✓   ✓   ✓   ✓   ✓   ✓   ✓   ✓   ✓   ✓    ✓

Table 6.8: This shows whether each user finished each task with correct data, marked with ✓ or empty if not.

From table 6.8 we can see that we achieved all but one of our effectiveness goals (see table 6.1) with correct data. The goal we did not achieve is that we expected 80% of the users of the system to complete a new plan for a project with correct data. The explanation for not achieving that goal was that one user had more information in his answer than was expected, qualifying as 50% in quality; see Table 6.9.

User    Completed the task           Quantity   Quality   Task effectiveness

U1      Yes, with the correct data   100        100       100%
U2      Yes, with the correct data   100        100       100%
U3      Yes, with incorrect data     100        50        50%
U4      Yes, with the correct data   100        100       100%
Total                                                     87.5%

Table 6.9: The task effectiveness for solving task 4: Complete a new plan for a project.

We explained in section 6.5.4 how we calculated the task effectiveness with formula 6.1, where the quantitative effectiveness is defined as 100% if the user succeeded in completing the task regardless of whether the data was correct or incorrect, and 0% otherwise; the qualitative effectiveness is defined as 100% if the user succeeded in completing the task with correct data, 50% with incorrect data, and 0% otherwise. In the first evaluation we had a success rate of 75%, and now we see an improvement to 87.5%, which is a visible indication of improvement from using the application.


Efficiency

As we described at the beginning of Chapter 6, efficiency is the ratio of effectiveness to the resources used. To measure the resources expended we first defined efficiency goals (see table 6.1) and then timed each user; if he completed the task, we calculated the average time, see Table 6.10, as described in section 6.5.4. In this case, we timed all tasks but the first one, since the users were already familiar with the application. We achieved all our goals for efficiency; e.g., for task 4 we had set our goal that 80% would complete the task within 60 minutes, and every user was well within that, the longest time being 24 minutes. Each user had no problem completing each of the tasks, and it took much less time than we expected. The result is therefore based on the time of all completed tasks, including those with both correct and incorrect data, with unfinished tasks left out. This is very suitable, especially if all the tasks were completed with 100% effectiveness.

Task                                  Average time (Stdev, N)

Exploring the application             -     (-, -)
Find information on UXE PSS           00:08 (00:05, 4)
Make a project                        00:28 (00:07, 4)
Make a plan                           19:28 (04:15, 4)
Edit the plan                         00:59 (00:07, 4)
Edit project description              00:38 (00:21, 4)
Find the plan that you made           00:26 (00:22, 4)
Add a new plan and give it a name     00:23 (00:12, 4)
Delete the plan you just made         00:04 (00:02, 4)
Make feedback on your evaluation      03:42 (00:17, 4)
Find the feedback that you just made  00:06 (00:02, 4)

Table 6.10: The resources used for completing each task (both correct and incorrect data).
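To illustrate how the rows above can be produced from raw timings, the sketch below parses per-user "mm:ss" times and computes the average and sample standard deviation. The four timings are hypothetical, chosen only so the output matches the "Make a project" row; the real per-user data is not reproduced in the thesis.

```typescript
// Parse "mm:ss" into seconds.
function toSeconds(mmss: string): number {
  const [m, s] = mmss.split(":").map(Number);
  return m * 60 + s;
}

// Format seconds back into zero-padded "mm:ss".
function toMmSs(totalSeconds: number): string {
  const m = Math.floor(totalSeconds / 60);
  const s = Math.round(totalSeconds % 60);
  return `${String(m).padStart(2, "0")}:${String(s).padStart(2, "0")}`;
}

// Mean and sample standard deviation (divisor N - 1).
function meanAndStdev(values: number[]): { mean: number; stdev: number } {
  const mean = values.reduce((a, b) => a + b, 0) / values.length;
  const variance =
    values.reduce((acc, v) => acc + (v - mean) ** 2, 0) / (values.length - 1);
  return { mean, stdev: Math.sqrt(variance) };
}

// Hypothetical timings for "Make a project", one entry per user (N = 4).
const raw = ["00:20", "00:28", "00:28", "00:36"].map(toSeconds);
const { mean, stdev } = meanAndStdev(raw);
console.log(`${toMmSs(mean)} (${toMmSs(stdev)}, ${raw.length})`); // 00:28 (00:07, 4)
```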

In section 6.5.4 we described equation 6.2, defined by Bevan and Macleod, for user efficiency. In the first evaluation we applied the same method to the task times as shown in Table 6.10; Table 6.11 compares the time spent on task 4 between the two evaluations.

Task: Make a plan   Average time (Stdev, N)

First evaluation    37:00 (07:00, 4)
Second evaluation   19:28 (04:15, 4)

Table 6.11: The resources used for completing task 4: Make a plan (both correct and incorrect data).

By using this method, we can combine the accuracy, the completeness and the resources expended in a single measurement. Looking at the resources spent, in the first evaluation the participants solved the task on average in 39 minutes with a standard deviation of 7 minutes, while in the second evaluation it took on average 19 minutes and 28 seconds with a standard deviation of 4 minutes and 15 seconds (see Table 6.11). That is a significant improvement between the first and second evaluation. We asked the participants how they felt now compared with the first evaluation. All said that they felt more confident in what they were doing; they were familiar with the process and felt they understood it better.
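A minimal sketch of that combined measurement, assuming equation 6.2 takes the form user efficiency = effectiveness / task time, as in Bevan and Macleod; the inputs below are the success rates and average times reported for task 4, and the function name is illustrative.

```typescript
// User efficiency (assumed form of equation 6.2): effectiveness per minute.
function userEfficiency(effectivenessPct: number, minutes: number): number {
  return effectivenessPct / minutes;
}

// Task 4 ("Make a plan") in the two evaluations.
const first = userEfficiency(75, 39);              // ~1.9 % per minute
const second = userEfficiency(87.5, 19 + 28 / 60); // ~4.5 % per minute
console.log({ first, second });
```

Even on this rough measure, efficiency more than doubled between the evaluations, since both components, effectiveness and time, improved.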


Satisfaction & User Experience

To measure satisfaction and user experience, we asked the users a few questions after all tasks were completed, as described in section 6.5.3. Compared with what the users said in the first evaluation, all found the system easy to use and had no problem navigating within it. All users said that they would be likely to use the system in the future if they were conducting a user evaluation. The one user who in the first evaluation said he would not be likely to use the system in the future changed his mind, explaining that now that he was more familiar with the system, he found it easier. All users commented that the software would help them and that they would feel more confident when conducting a user evaluation using it. They also felt that the software would help them perform a better evaluation and have better control when planning a UX evaluation. When users were asked whether there was something they would like to add or change in the system, all pointed out that they see the system as part of a bigger picture and said it would be good if they could add information to each plan. The information they wanted to see was about the users they met, how the users answered questions, and information about tasks, such as time and other relevant notes. They also said that it would be good to be able to set up tasks, questions and other related information in the system with each plan, so that it would help them in making a report.

6.7 Evaluation Limitations

There are some limitations to this study that need to be mentioned. The measurements and information available from the evaluations offer limited data for measuring a reliable outcome. This means that several evaluations, with several different participants, need to be conducted before enough data is gathered to be considered reliable. Also, only one evaluator planned, conducted and analyzed the data from the evaluations, so external validation is needed. However, there is some external validity for this project: all of the compared frameworks have internal validity from their own studies or are well-known standards for evaluations.


Chapter 7

Conclusion and Future Work

The purpose of this thesis was to find a way to assist user experience evaluators in planning UX evaluations by providing them with a framework that captures the most important factors affecting UX evaluation planning, and a tool that provides a questionnaire based on these factors, enabling user experience evaluators to efficiently plan their evaluations in accordance with a specific UX method or approach and within time and budget limits.

7.1 Thesis Summary

In Chapter 2 we introduced related work that helped to create context and a better understanding of the topic. We described how we identified several frameworks that were suited to our research. An evaluation process is used to critically examine a product with research methods and practices that measure changes in the product; the purpose of an evaluation is to increase the knowledge about one or several aspects of the product, and having an evaluation plan is an essential part of the evaluation process that must be carefully executed. We explained a few user experience and UX evaluation methods, such as user testing, the think aloud method, A/B testing, questionnaires and heuristic evaluation. We introduced some usability and UX evaluation tools and their motivation: the Website Analysis and Measurement Inventory (WAMMI) [27] [28], the Programmsystem zur kommunikationsergonomischen Untersuchung rechnerunterstützter Verfahren (PROKUS) by Zülch and Stowasser [30], and the Diagnostic Recorder for Usability Measurement (DRUM) [31]. Finally, we explained the idea of a planning support system (PSS).

In Chapter 3 we described our approach, which is divided into three phases. In Phase I we performed an analysis of four frameworks, RAMES [4], Kwahk and Han [5], CIF [6] and SQuaRE [7], looking for similarities and differences, and came up with a list of factors that can affect the outcome of an evaluation. In Phase II we used these factors to create and implement the UXE PSS framework and came up with a set of questions, based on the characteristics within each factor, that generates a plan for a UX evaluation. Finally, in Phase III we evaluated our work.

In Chapter 4 we described the framework proposed in this research, named UXE PSS: its categories (Roles, Activities, Materials, Environment and equipment, and System), the factors included in each category, the characteristics included in each factor, and an example for each.

In Chapter 5 we described the development of UXinn (User Experience Innovation) and how the application is intended to enable software developers to prepare and perform user tests based on the UXE PSS framework. We described the intended users, students in computer science, the design considerations, the software architecture and the fundamental structure of the application: the presentation layer (Web), the application programming interface (API), and the data layer.
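As a rough illustration of that three-layer split, the sketch below shows a single API-layer endpoint sitting between the web client and the data layer. The route path, payload shape and storage function are assumptions made for illustration; the actual UXinn API is not reproduced here.

```typescript
import express from "express";

const app = express();
app.use(express.json());

// The presentation layer (the AngularJS web client) calls the API layer...
app.post("/api/projects/:projectId/plans", async (req, res) => {
  const { projectId } = req.params;
  const { name, answers } = req.body; // answers to the framework questions
  // ...which delegates persistence to the data layer.
  const plan = await savePlan(projectId, { name, answers });
  res.status(201).json(plan);
});

// Stand-in for the data layer; a real implementation would write to a database.
async function savePlan(projectId: string, plan: { name: string; answers: unknown }) {
  return { id: Date.now().toString(36), projectId, ...plan };
}

app.listen(3000);
```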

In Chapter 6 we described the evaluation of the framework. That included explaining our usability and user experience goals, and how they are measured and used in the evaluation; how we met the users for the evaluations; and that we used the Customer Interview process that is part of the Google design sprint process [57]. We showed how we used UXE PSS to plan the evaluation by defining each of the factors in the framework. We described the participants, the methods, what data was gathered and how we analyzed the data. We then presented the results from both of the evaluations that were conducted.

7.2 Evaluation Summary

In the first evaluation we met all but one of our effectiveness goals; we only had a 75% success rate on completing task 4. This task is the most important one in the evaluation: it focuses on making a plan for a made-up project that has to be evaluated. In task 4 the participants were asked to make an evaluation plan based on text information that they got in the evaluation (see Appendix B); the reason for the missed goal was that a user got confused and entered incorrect inputs. Since we only had four users in our evaluation instead of five, this can have some impact, but we feel that we captured most of the information from the evaluation. It took on average 39 minutes, with a standard deviation of 7 minutes, to solve task 4. Our efficiency goals were met, as were our user experience and satisfaction goals. In the next evaluation, we focused on fixing some minor issues and adding a new feature that allows the user to give feedback on the plan he previously made.

In the second evaluation we met all of our effectiveness goals; we had a success rate of 87.5%, which is 12.5 points higher than in the first evaluation. That is a visible indication of improvement in using the application between the evaluations; however, at least part of this improvement is due to the learning effect. The reason for not achieving a 100% success rate in this evaluation was not that the user did not finish the task, but that he provided too much information, which is not necessarily a bad thing. It took on average 19 minutes and 28 seconds, with a standard deviation of 4 minutes and 15 seconds, to solve task 4. Our efficiency goals were met, as were our user experience and satisfaction goals.

7.3 Research Contribution

This thesis provided the following original contributions:

1. Evaluated a number of UX evaluation frameworks.

2. Identified a set of factors that must be taken into consideration when conducting a UXevaluation.

3. Built a framework based on the selected factors that can help UX evaluators to plan a UX evaluation in order to:

• Reduce time

• Reduce cost

• Manage complexity


4. Built a prototype of the UXE PSS application.

5. Evaluated the prototype of the UXE PSS application.

7.4 Conclusion

In this thesis we try to answer the following questions:

1. What factors are important for planning UX evaluations?

2. How usable is a planning support system based on these important factors?

In order to answer these questions, we identified several frameworks, described in Chapter 3, that can be used for evaluation and chose the four frameworks that we found best suited to our research. Then, to find the similarities and differences between the frameworks, we performed an analysis that helped us identify the 26 factors and 42 characteristics, divided into 5 categories, that need to be taken into consideration when conducting a UX evaluation (see Table 3.3). Furthermore, to collect the needed information for the characteristics, we came up with a set of 52 questions (see Appendix A).
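To show how these pieces fit together, here is a minimal sketch of the category, factor, characteristic and question hierarchy as TypeScript types, mirroring the tables in Appendix A; the type and field names are our own illustration, not the actual UXE PSS schema.

```typescript
// One question put to the planner, e.g. "How many users will be included?".
interface Question {
  text: string;
}

// A characteristic groups the questions describing one aspect of a factor.
interface Characteristic {
  name: string; // e.g. "#Users"
  questions: Question[];
}

// A factor is one of the 26 identified planning factors, e.g. "System Users".
interface Factor {
  name: string;
  characteristics: Characteristic[];
}

// The five top-level categories of the UXE framework.
interface Category {
  name: "Roles" | "Activities" | "Materials" | "Environment and equipment" | "System";
  factors: Factor[];
}

// A fragment of the Roles category, taken from Table A.1:
const roles: Category = {
  name: "Roles",
  factors: [
    {
      name: "System Users",
      characteristics: [
        {
          name: "#Users",
          questions: [{ text: "How many users will be included in the evaluation?" }],
        },
      ],
    },
  ],
};
console.log(roles.factors[0].characteristics[0].questions[0].text);
```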

In order to assess the effectiveness of the generated plan, we performed two user evaluations and used the UXE PSS ourselves. All the information in this research is based on this process and shows how it can be utilized. There was a significant improvement between the first and second evaluation: the time to prepare an evaluation plan improved by about 20 minutes on average. The anticipated reason for this improvement is the learning effect, since the participants had the same text in both evaluations and were using the system for the second time. We asked the participants to compare how they felt about the process between the evaluations. All participants said they felt more confident and familiar with the process, had a better understanding of the UX evaluation requirements, and felt they could perform a better evaluation and have better control when conducting a user evaluation.

We asked the participants what information they would like to see in the system. They said they would like to add information to each plan: information about the users they met, how the users answered questions, and information about tasks, such as time and other relevant notes. Furthermore, they wanted to set up tasks, questions and other related information in the system, and then be able to add information to that in order to generate a report with the results from the evaluation.

This thesis introduced a framework that can be used to assist with planning an evaluation. Preliminary results show that it can help with planning, broaden the understanding of less experienced evaluators, reduce time and cost, and manage the complexity of the process. Furthermore, we find that future research is needed that includes more students and more tests.

7.5 Future Work

There are many possible ways of taking this work one step further. The first is to improve the current implementation. For example, it would be good to look at the user interface, as the current design needs improvement. The business rules for the questions could also be explored further, and there is a need for clear information text that explains each of the questions to the system user; having an example for each question would help the system user understand it. Finally, it is crucial to address the limitations of this study, since all data collection, analysis and preparation was done by one person, and we only had four participants in the evaluations.

Then there is the possibility of adding new features to the application. It would be interesting to explore how to help planners with evaluating by allowing them to define each task, question, metric and other related things, depending on the method used. Three features need to be considered for that. First, we would have to build a catalog of methods and what is required when using them; e.g., for a heuristic evaluation we could set up a list of heuristics that need to be addressed, or for the think aloud method, where there is a need for tasks to be defined, informative texts, a questionnaire, and more. Secondly, we would have to build a catalog of metrics that are used to generate statistical analyses from the inputs of the evaluation, e.g., the success rate used in this study. Finally, we could implement a generator that analyzes the information on the evaluation and generates a preliminary report for the planner.
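As a sketch of what the proposed method catalog could look like, the snippet below models one entry per method together with the artifacts it requires, so the planner could be prompted for each of them. The shape and the two entries are illustrative assumptions, not an implemented feature.

```typescript
// A hypothetical catalog entry: an evaluation method and what it requires.
interface MethodCatalogEntry {
  method: string;
  requires: string[];
}

const catalog: MethodCatalogEntry[] = [
  { method: "Heuristic evaluation", requires: ["list of heuristics"] },
  { method: "Think aloud", requires: ["tasks", "informative texts", "questionnaire"] },
];

// The planner would be prompted for each required artifact of the chosen method.
for (const entry of catalog) {
  console.log(`${entry.method}: define ${entry.requires.join(", ")}`);
}
```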

Since we did not use existing software like Google Forms or SurveyMonkey, the framework can be extended to enable exploring the collected data by applying machine learning. Gathering data on how well each evaluation plan actually turns out, and storing that data, could be a good basis for learning from it and making suggestions to users based on it.

A suggestion system based on the previous experiences of system users is another way to use machine learning. The data sets established by storing previous experiences could be used to make suggestions for each of the questions within the system, based on previous answers from users; e.g., the system could start by suggesting how many users are recommended for an evaluation. However, to be able to use machine learning with the system, we would first have to collect data from a quite extensive number of users.

There are various ways to take this project further, and we consider this work a good basis for further development of the tool. The requirements of users and the results from user evaluations of the tool could guide how to approach that future work.


Bibliography

[1] J. Gothelf, Five Effective Ways for Usability Testing to Play Nice with Agile, http://blog.usabilla.com/5-effective-ways-for-usability-testing-to-play-nice-with-agile, (Accessed on 04/12/2017), Jun. 2011.

[2] H. Alveraz, How user testing fits into agile development | UserTesting Blog, https://www.usertesting.com/blog/2016/06/07/user-testing-agile-development, (Accessed on 04/11/2017), Jun. 2016.

[3] M. K. Larusdottir, “User centred evaluation in experimental and practical settings”, PhD thesis, KTH Royal Institute of Technology, 2012.

[4] M. K. Larusdottir, J. Gulliksen, and N. Hallberg, “RAMES – Framework Supporting User-Centred Evaluation in Research and Practice”, Submitted for review.

[5] J. Kwahk and S. H. Han, “A methodology for evaluating the usability of audiovisual consumer electronic products”, Applied Ergonomics, vol. 33, no. 5, pp. 419–431, 2002, ISSN: 0003-6870. DOI: https://doi.org/10.1016/S0003-6870(02)00034-0. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0003687002000340.

[6] Wichansky.CIFUTR, http://www.idemployee.id.tue.nl/g.w.m.rauterberg/lecturenotes/common-industry-format.pdf, (Accessed on 04/28/2018).

[7] ISO/IEC 25066:2016 - Systems and software engineering – Systems and software Quality Requirements and Evaluation (SQuaRE) – Common Industry Format (CIF) for Usability – Evaluation Report, https://www.iso.org/standard/63831.html, (Accessed on 04/28/2018).

[8] M. Q. Patton, Qualitative evaluation and research methods. SAGE Publications, Inc., 1990.

[9] Expert Evaluation | CS4760, HU4628 & CS5760: Human-Computer Interactions & Usability, https://cs4760.csl.mtu.edu/2018/lectures/expert-evaluation/, (Accessed on 05/11/2018).

[10] The Definition of User Experience (UX), https://www.nngroup.com/articles/definition-user-experience/, (Accessed on 05/10/2018),

[11] interface design, https://www.peterme.com/index112498.html, (Accessed on 05/10/2018).

[12] ISO 9241-210:2010 - Ergonomics of human-system interaction – Part 210: Human-centred design for interactive systems, https://www.iso.org/standard/52075.html, (Accessed on 11/22/2017),

[13] P. Morville, User Experience Design, http://semanticstudios.com/user_experience_design/, (Accessed on 05/01/2018),


[14] J. Rubin and D. Chisnell, Handbook of usability testing: how to plan, design, and conduct effective tests. John Wiley & Sons, 2008.

[15] M. Van Someren, Y. Barnard, and J. Sandberg, “The think aloud method: a practical approach to modelling cognitive processes”, 1994.

[16] H. Kuusela and P. Pallab, “A comparison of concurrent and retrospective verbal protocol analysis”, The American Journal of Psychology, vol. 113, no. 3, p. 387, 2000.

[17] A/B Testing, https://www.optimizely.com/optimization-glossary/ab-testing/, (Accessed on 05/18/2018),

[18] M. Schrepp, A. Hinderks, and J. Thomaschewski, “Applying the user experience questionnaire (UEQ) in different evaluation scenarios”, in International Conference of Design, User Experience, and Usability, Springer, 2014, pp. 383–392.

[19] AttrakDiff, http://attrakdiff.de/index-en.html, (Accessed on 06/03/2018),

[20] M. Hassenzahl, M. Burmester, and F. Koller, “AttrakDiff: A questionnaire to measureperceived hedonic and pragmatic quality”, in Mensch & Computer, 2003, pp. 187–196.

[21] J. Nielsen and R. Molich, “Heuristic evaluation of user interfaces”, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, ACM, 1990, pp. 249–256.

[22] ——, “Heuristic evaluation of user interfaces”, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, ACM, 1990, pp. 249–256.

[23] 10 Heuristics for User Interface Design: Article by Jakob Nielsen, https://www.nngroup.com/articles/ten-usability-heuristics/, (Accessed on 05/18/2018).

[24] J. Bach, Exploratory testing explained, 2003.

[25] Defining Exploratory Testing « Cem Kaner, J.D., Ph.D. http://kaner.com/?p=46, (Accessed on 05/27/2018),

[26] What is Exploratory testing in software testing?, http://istqbexamcertification.com/what-is-exploratory-testing-in-software-testing/, (Accessed on 05/27/2018).

[27] J. Kirakowski, N. Claridge, and R. Whitehand, “Human centered measures of success in web site design”, in Proceedings of the Fourth Conference on Human Factors & the Web, 1998.

[28] WAMMI - Home, http://www.wammi.com/, (Accessed on 05/07/2018),

[29] J. Kirakowski and M. Corbett, “SUMI: The software usability measurement inventory”, British Journal of Educational Technology, vol. 24, no. 3, pp. 210–212, 1993.

[30] G. Zülch and S. Stowasser, “Usability evaluation of user interfaces with the computer-aided evaluation tool PROKUS”, MMI-Interaktiv, vol. 3, pp. 1–17, 2000.

[31] M. Macleod and R. Rengger, “The development of DRUM: A software tool for video-assisted usability evaluation”, People and Computers, pp. 293–293, 1993.

[32] B. Harris and M. Batty, “Locational models, geographic information and planning support systems”, Journal of Planning Education and Research, vol. 12, no. 3, pp. 184–198, 1993.


[33] R. K. Brail and R. E. Klosterman, Planning support systems: Integrating geographicinformation systems, models, and visualization tools. ESRI, Inc., 2001.

[34] Atlas Test Plan Template, https://www.atlascode.com/wp-content/uploads/2017/08/Atlas_Software_Testing_Plan_Template.pdf,(Accessed on 01/01/2018),

[35] S. Charfi, H. Ezzedine, and C. Kolski, “RITA: A framework based on multi-evaluation techniques for user interface evaluation: Application to a transport network supervision system”, in 2013 International Conference on Advanced Logistics and Transport, May 2013, pp. 263–268. DOI: 10.1109/ICAdLT.2013.6568470.

[36] P. Pu, L. Chen, and R. Hu, “A User-centric Evaluation Framework for Recommender Systems”, in Proceedings of the Fifth ACM Conference on Recommender Systems, ser. RecSys ’11, Chicago, Illinois, USA: ACM, 2011, pp. 157–164, ISBN: 978-1-4503-0683-6. DOI: 10.1145/2043932.2043962. [Online]. Available: http://doi.acm.org/10.1145/2043932.2043962.

[37] M. Q. Patton, “Six Honest Serving Men for Evaluation”, Studies in Educational Evaluation, vol. 14, no. 3, pp. 301–30, 1988.

[38] Rudyard Kipling, http://wiki.c2.com/?RudyardKipling, (Accessed on05/31/2018),

[39] Heuristic Evaluations and Expert Reviews | Usability.gov, https://www.usability.gov/how-to-and-tools/methods/heuristic-evaluation.html, (Accessed on 04/01/2018),

[40] L. Faulkner, “Beyond the five-user assumption: Benefits of increased sample sizes inusability testing”, Behavior Research Methods, Instruments, & Computers, vol. 35,no. 3, pp. 379–383, 2003.

[41] How To Estimate a UX Project – UX Mastery, https://uxmastery.com/how-to-estimate-a-ux-project/, (Accessed on 04/22/2018),

[42] Identify primary intended users | Better Evaluation, https://www.betterevaluation.org/en/plan/frame/identify_primary_intended_users, (Accessed on 05/18/2018).

[43] T. Marston, What is the 3-Tier Architecture?, http://www.tonymarston.net/php-mysql/3-tier-architecture.html, (Accessed on 12/14/2017),

[44] The DCI Architecture: A New Vision of Object-Oriented Programming, http://www.artima.com/articles/dci_vision.html, (Accessed on 11/24/2017),

[45] S. Burbeck, “Applications programming in Smalltalk-80: How to use Model-View-Controller (MVC)”, Jan. 1992.

[46] What Are The Benefits of MVC?, http://blog.iandavis.com/2008/12/what-are-the-benefits-of-mvc/, (Accessed on 11/24/2017),

[47] N. Jain, P. Mangal, and D. Mehta, “AngularJS: A modern MVC framework in JavaScript”, Journal of Global Research in Computer Science, vol. 5, no. 12, pp. 17–23, 2015.

[48] Fielding Dissertation: CHAPTER 5: Representational State Transfer (REST), http://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm, (Accessed on 11/24/2017),

[49] Node.js, https://nodejs.org/en/, (Accessed on 11/24/2017),


[50] RFC 7519 - JSON Web Token (JWT), https://tools.ietf.org/html/rfc7519, (Accessed on 11/25/2017),

[51] What is RSA algorithm (Rivest-Shamir-Adleman)? - Definition from WhatIs.com, http://searchsecurity.techtarget.com/definition/RSA, (Accessed on 12/25/2017).

[52] JSON Web Tokens - jwt.io, https://jwt.io/, (Accessed on 11/25/2017),

[53] 1999 USENIX Annual Technical Conference, June 6-11, 1999, Monterey, California,USA, https://www.usenix.org/legacy/events/usenix99/provos/provos_html/index.html, (Accessed on 11/26/2017),

[54] Mocha - the fun, simple, flexible JavaScript test framework, https://mochajs.org/, (Accessed on 11/26/2017),

[55] PM2 - Advanced Node.js process manager, http://pm2.keymetrics.io/,(Accessed on 11/27/2017),

[56] NGINX | High Performance Load Balancer, Web Server, & Reverse Proxy, https://www.nginx.com/, (Accessed on 11/28/2017),

[57] J. Knapp, J. Zeratsky, and B. Kowitz, Sprint: How to Solve Big Problems and Test New Ideas in Just 5 Days. Simon & Schuster, 2016, ISBN: 9781501140808. [Online]. Available: https://books.google.is/books?id=BWlJjgEACAAJ.

[58] N. Bevan and M. Macleod, “Usability measurement in context”, Behaviour & Information Technology, vol. 13, no. 1-2, pp. 132–145, 1994.

[59] M. MacLeod, R. Bowden, N. Bevan, and I. Curson, “The MUSiC performance measurement method”, Behaviour & Information Technology, vol. 16, no. 4-5, pp. 279–293, 1997.

[60] Success Rate: The Simplest Usability Metric, https://www.nngroup.com/articles/success-rate-the-simplest-usability-metric/,(Accessed on 11/17/2017),

[61] M. K. Lárusdóttir and S. E. Ármannsdóttir, “A case study of software replacement”, inProceedings of the international conference on software development, 2005, pp. 129–140.


Appendix A

Questions

Here we have the list of questions in the framework.

Factor             Characteristics  Questions

System Users       #Users           How many users will be included in the evaluation?
                   User types       Who is the end user of the system?
                   Background       What is the background of the user?
                                    What background information will be collected?
                                    Is the user representative for a user group?
                                    Describe the user group.
                   Intended Users   Who are the intended users for the system?
                   User selection   How will the users be selected?

Evaluators         #Evaluators      How many evaluators will be included in the evaluation?
                   Background       What is the background of the evaluator?
                   Requirement      What are the requirements for the evaluator to conduct the evaluation?

Observers          #Observers       How many observers will be included in the evaluation?
                   Obligation       What are the obligations of the observer?
                   Background       What is the background of the observer?

Technical writers  Background       What is the background of the technical writers?

Recipients         Background       What is the background of the recipients?

Table A.1: Showing the 16 questions defined for factors in the Roles category.

Factor       Characteristics  Questions

The purpose  The objective    What is the purpose of the evaluation?
             The approach     How are you going to achieve the purpose of the evaluation?

Schedules    Time and date    What is the date of the evaluation?
                              At what time?
             Duration         What is the estimated time on each evaluation?

Methodology  Method           What evaluation method will be used?

Analysis     Approach         How will the data be analyzed?

Decision     General          What will the results be used for?

Table A.2: Showing the 8 questions defined for factors in the Activities category.


Factor               Characteristics  Questions

Evaluation material  For users        What evaluation material will be used for users?
                     For evaluators   What evaluation material will be used for evaluators?

Data collection      Collection       Is the data qualitative, quantitative or both?
                     Metrics          What data is being collected?

Result               Description      How will the results be described?

Decisions            General          How will the decisions be described?
                                      Who has the authority to make the decisions?

Resource cost        Knowledge        Are there any skills that need to be trained?
                                      What is the estimated time for training the skills?
                     Preparation      How much time is estimated to prepare the evaluation?
                     Compensation     What is the value of the compensation for the participant?
                     Rates            What is the hourly rate of the evaluator?
                                      What is the hourly rate of the observer?
                                      What is the estimated cost for the technical writer?

Table A.3: Showing the 14 questions defined for factors in the Materials category.

Factor                      Characteristics  Questions

Settings                    Location         Where will the evaluation be conducted?
                            Description      In what kind of environment will the evaluation be conducted?

Physical ambient condition  Information      What are the ambient conditions of the environment?

Social conditions           Information      Are there any social conditions that affect the performance of the participant?

Data collection             Approach         What equipment will be used to gather the data?

Data analysis               Approach         What equipment will be used for analyzing the data?

Table A.4: Showing the 6 questions defined for factors in the Environment and equipment category.

Factor              Characteristics  Questions

Description         General          What is the name of the system?
                                     What is the version of the system?
                                     What is the interaction style of the system?

Type                General          What is the type of the system?

Stage               Form of design   What is the status of the design?

Part                Description      What part of the system is being evaluated?

Required equipment  List             What equipment will be used to use the system during the evaluation?

Risk and issues     Description      Are there any risks or issues that might occur in the evaluation?

Table A.5: Showing the 8 questions defined for factors in the System category.


Appendix B

User Evaluation Text Information

This appendix contains the information that the participants got when solving the tasks in each of the evaluations.

B.1 First Evaluation

The cooking team in Iceland has recently been developing a project that helps all those who want to learn to cook, to cook properly. The project is a website with the title "Cooking done right"; a short description would be, e.g., "Proper working methods in the kitchen". You have reached the point where you need to conduct user tests on the product, testing the interface of recipes and teaching videos. For that, you need to plan a user evaluation. You plan to have five users, three women and two men, all of whom want to improve their cooking skills.

For the evaluation, you have one observer and one evaluator. The observer is an intern in training; the evaluator has five years of experience and a Master's degree in Computer Science. The observer will take notes of what he sees and hears and will also help in analyzing the results of the evaluation. The results will later be handed over to the Food Counsellors, who will then make decisions on the next step in the development. You plan to meet the participants in your office, and you estimate that this will take two days.

You will use the think aloud method and have ten tasks for the user to solve; while the users solve the tasks, you have some questions on how they are thinking about solving a task. The user evaluation will be recorded on an iPhone 6, the timer is on the phone, and you will take notes with pen and paper while asking the user some questions. You will then explain the results in a report and decide how to react to them. In the evaluation, you want to have some background knowledge of the users; for that you will have a few questions for them to answer, to give you a better overview of their knowledge related to the project.

For supporting material, you will give them a paper with information that helps them understand how to solve each task. By doing this you can gather feedback from the user while he conducts each task, ask questions, time each task and mark the tasks as solved or not; at the end you will ask the user how he felt about the product. As described, you will conduct the evaluation at your office. After the evaluation you will gather your information from the recordings and notes to calculate the average time on each task and various other measurements for analyzing the results.

The product is an early version, 1.0.0, of a product that helps everyone who wants to learn to cook better food with the right tools. The user will use a computer that your company owns to solve each task.


It is estimated that each evaluation will take about 45 minutes to perform, and after discussion with the individuals it will take place on 5 and 6 November 2017. The material that will be used includes, e.g., user tasks, user instructions and other material, together with a sheet and pen, a computer, a phone to make sound recordings and a clock to take time. The data is expected to be qualitative and to be collected by observing usability problems, time on task, completion of each task, and user experience and satisfaction; the data will then be collected in Excel.

B.2 Second Evaluation

After the evaluation and all the meetings, you want to give feedback on how you felt about the current evaluation. You found that using UXE PSS worked very well. The plan worked rather well for all factors, and you found that the Users part worked very well for you. Stating the purpose worked well, and analyzing the data and making decisions worked very well. Next time you will try to have six users in your evaluation.