TRANSCRIPT
The Impact of Alternative Search Mechanisms on the Effectiveness of
Knowledge Retrieval
Ryan C. LaBrie
Department of Information Systems
W. P. Carey School of Business
Arizona State University
Dissertation Defense Presentation
Thursday 27 May 2004
10:00 a.m. – Noon, BA 253
2/54
Today’s Agenda
Motivation
Literature Review
Model and Hypotheses
Method
Results
Discussion
3/54
Why Study Knowledge Management?
75% of organizational assets are now intangible (knowledge) assets.
- Alan Greenspan
The 500 largest firms in the United States had intangible assets valued at $7.3 trillion (70% of their total value).
These 500 companies employ more than 21.6 million employees and generate over US$ 6.1 trillion in revenue.
(Stone, 2002)
4/54
Why Knowledge Management?
An Arthur Andersen study revealed:
In 1978 the balance sheet explained 95% of the market value of the firms.
In 1998 the balance sheet explained only 28% of the market value of those firms.
Today, the balance sheet explains less than 15% of the market value of the average firm (Stanfield, 2002).
Impending requirements of the Sarbanes-Oxley Act
Legislation requiring improved access to documentation and other knowledge objects.
5/54
Knowledge Management Research
Alavi & Leidner (2001) provide a KM research framework
creation – “storage/retrieval” – transfer – application
This dissertation focuses specifically on the retrieval aspect of knowledge management research addressing the following question posed by Alavi & Leidner:
“What retrieval mechanisms are most effective in enabling knowledge retrieval?”
6/54
Research Question
Does the cognitive loading of search mechanisms impact the effectiveness of knowledge retrieval?
7/54
The Data, Information, Knowledge Hierarchy
[Figure: the Data → Information → Knowledge pyramid]
8/54
What is Data?
Definition of data:
Sets of symbols not necessarily understood by, found meaningful by, or causing a change of state in the destination (Meadow & Yuan, 1997).
Connotes: raw facts, direct observations
Example: 65
9/54
What is Information?
Definition of information:
A set of symbols that does have meaning or significance to the recipient (Meadow & Yuan, 1997).
Connotes: structure, definition, semantics
Example: Age: 65
10/54
What is Knowledge?
Definition of knowledge:
The accumulation and integration of information received and processed by a recipient (Meadow & Yuan, 1997).
Information bound together by structure, assumptions, justification, and process (Edgington et al., in press).
Connotes: relationships, models, and usage
Example: At age 65 you may retire from work as a United States federal employee.
11/54
The Difference between Data and Knowledge Retrieval
Blair (2002) outlines five major differences (additional work by Blair identifies 12):

Data Retrieval | Knowledge Retrieval
Direct (“I want to know X”) | Indirect (“I want to know about X”)
Necessary relation between a formal query and the representation of a satisfactory answer | Probabilistic relation between a formal query and the representation of a satisfactory answer
Criterion of success = correctness | Criterion of success = utility
Speed dependent on the time of physical access | Speed dependent on the number of logical decisions the searcher must make
Scales easily | Does not scale easily
12/54
Keyword Retrieval Mechanisms
Keyword Phrase | # of Articles
System Design (original query) | 3
Systems Design | 10
System Designs | 0
Systems Designs | 1
System Analysis and Design | 0
Systems Analysis and Design | 4
Information System Design | 2
Information Systems Design | 8
MIS Systems Design | 1
Participative System Design | 1
System Design Methods | 1
Expert Systems Design | 1
Impact and Socio-Technical Systems Design | 1
TOTAL | 34
Gorla & Walker (1998)
Ambiguity biggest problem
Call for a controlled vocabulary (classification)

LaBrie & St. Louis (2003)
Mirrored Gorla & Walker results with a different data set.
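The fragmentation above can be sketched in code: an exact-match keyword query sees only one of the thirteen phrasings, while even crude normalization (a stand-in for the controlled vocabulary Gorla & Walker call for) collapses several variants. The mini-corpus and matching rules below are invented for illustration only.

```python
# Sketch of the keyword-fragmentation problem (hypothetical mini-corpus):
# an exact-match query sees only one of the phrasings in the table above,
# which is how 34 relevant articles end up scattered across 13 variants.

def exact_match(query, documents):
    """Return documents whose keyword field equals the query exactly."""
    return [d for d in documents if d["keywords"] == query]

def normalized_match(query, documents):
    """Crude normalization - lowercase and strip trailing 's' characters
    from each token - standing in for a controlled vocabulary."""
    def norm(phrase):
        return tuple(t.lower().rstrip("s") for t in phrase.split())
    return [d for d in documents if norm(d["keywords"]) == norm(query)]

docs = [
    {"title": "A", "keywords": "System Design"},
    {"title": "B", "keywords": "Systems Design"},
    {"title": "C", "keywords": "System Designs"},
    {"title": "D", "keywords": "Risk Management"},
]

print(len(exact_match("System Design", docs)))       # literal phrasing only
print(len(normalized_match("System Design", docs)))  # variants collapsed
```

A real controlled vocabulary maps variants to one classification term rather than rewriting strings, but the effect on result counts is the same.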
13/54
Did Classifications Improve Retrieval Effectiveness?
LaBrie & St. Louis (2003)
Classifications mirror keyword results
Classification schemes are almost immediately outdated and require updating
Weber (2003)
Ended 15 years of ISRL classification usage in MIS Quarterly
Number of times a keyword was used | Count | Percent of Total
1 | 1375 | 76.77%
2 | 234 | 13.07%
>2 | 182 | 10.16%
Total | 1791 | 100.00%

Number of times a category was used | Count | Percent of Total
1 | 275 | 44.87%
2 | 111 | 18.10%
>2 | 227 | 37.03%
Total | 613 | 100.00%
14/54
The Human Factor
Knowledge management and KM retrieval involve people as part of the equation.
People have cognitive limitations
• Cognitive processing size limitations
People have inherent cognitive retrieval processes
• Recall versus recognition retrieval
People have thresholds for decision-making
• Effort versus accuracy
People have varying degrees of experience (or previous knowledge)
15/54
Recall and Recognition
Recall: retrieval mechanism with no cues or hints to help the individual make a decision or judgment (Driscoll, 2000).
Example: What does the word esoteric mean?

Recognition: involves a set of pre-generated stimuli presented to the individual for a decision or judgment (Driscoll, 2000).
Example: Which of the following words is the best synonym for esoteric?
• Essential
• Mystical
• Terrific
• Evident
16/54
Milestones in Recall & Recognition Research
Ebbinghaus’ Forgetting Function (1885)
Beginning of the recall research; exponential decay function.
Miller’s Magical Number Seven (1956)
There is a cognitive limit (7 +/- 2) on how much information we can remember and use.
Simon’s Hierarchical Systems (1962, 1974)
Humans store information hierarchically.
Bower et al.’s Recall versus Recognition Experiments (1969)
Recall is a serial process; recognition is a parallel process.
Anderson’s Adaptive Character of Thought – ACT-R (1995, 1997)
Retrieval of information is facilitated if it is organized hierarchically.
Theory of the way declarative and procedural knowledge interact in complex cognitive processes.
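Ebbinghaus' forgetting function is commonly modeled as exponential decay; the short sketch below illustrates why uncued recall degrades so quickly. The strength parameter is illustrative only, not a value fitted from the 1885 study.

```python
import math

# Ebbinghaus-style forgetting curve, R = exp(-t / S): R is the fraction
# retained, t is hours since learning, S is memory strength. S = 24 is an
# illustrative value, not a parameter from Ebbinghaus' own data.

def retention(t_hours, strength=24.0):
    return math.exp(-t_hours / strength)

# Uncued recall decays steeply, which is one motivation for
# recognition-based (cued) interfaces such as a visual hierarchy.
for t in (0, 24, 72):
    print(f"after {t:3d} h: {retention(t):.2f}")
```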
17/54
Effort versus Accuracy
The saliency of effort is much greater than that of accuracy.

Todd & Benbasat (1992)
Effort and decision quality with decision aids
Vessey (1991, 1994)
Cognitive fit, information presentation (graphs vs. tables), and decision making
Galletta et al. (1996)
Effort and accuracy in spreadsheet evaluations
Speier & Morris (2003)
Query interface design on decision-making performance
18/54
The Role of Experience
Dhaliwal & Benbasat (1996), Gregor & Benbasat (1999), Mao & Benbasat (2000)
Provide theoretical support that experience plays a role in KB/IS usage
Markus (2001)
Differentiates between novice and expert users in her theory of knowledge reuse
19/54
Reexamining the Research Question
Does the cognitive loading (recall or recognition, small to large sets, user experience) of search mechanisms (keyword or visual) impact the effectiveness (accuracy, timeliness, work effort, satisfaction) of knowledge retrieval?
20/54
Research Model (Figure 6, p.35)
[Figure 6: box-and-arrow research model]
Retrieval Method (Keyword or Visual) → Retrieval Effectiveness (Accuracy, Time, Work Effort, and Satisfaction): H1.1–H1.4
Result Set Size (Small, Medium, and Large) → Retrieval Effectiveness: H2.1–H2.4
User Experience (High or Low) → Retrieval Effectiveness: H3.1–H3.4
Retrieval Method × Result Set Size → Retrieval Effectiveness: H4.1–H4.4
Retrieval Method × User Experience → Retrieval Effectiveness: H5.1–H5.4
21/54
Measuring the Variables
Automatically, electronically by the system
Captures time and articles
Measuring accuracy
Expert raters using a pseudo-Delphi method (Buckley, 1995)
Measured both omitted relevant (Type I errors) and included irrelevant (Type II errors) articles
Validated survey instruments captured work effort and satisfaction data
Work Effort: NASA-TLX (Hart & Staveland, 1988)
• Mental, Physical, Temporal, Frustration, Performance, Effort
Satisfaction: End-User Computing Satisfaction (Doll & Torkzadeh, 1988)
• Content, Accuracy, Format, Ease of Use, Timeliness
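As a sketch of the accuracy scoring, the function below counts omitted relevant and included irrelevant articles for one subject's result list. The slides do not give the exact scoring formula, so the penalized-accuracy rule here is an assumption, one plausible reading of "omitted relevant and included irrelevant" errors.

```python
# Sketch of per-subject accuracy scoring. Raters counted omitted relevant
# and included irrelevant articles; the exact scoring formula is not given
# on the slide, so the penalized rule below is an assumption.

def score_retrieval(retrieved, relevant):
    retrieved, relevant = set(retrieved), set(relevant)
    omitted = relevant - retrieved    # relevant articles the subject missed
    included = retrieved - relevant   # irrelevant articles the subject kept
    hits = retrieved & relevant
    # Assumed rule: correct inclusions minus false inclusions, scaled by
    # the number of relevant articles, floored at zero.
    accuracy = max(0.0, (len(hits) - len(included)) / len(relevant))
    return {"omitted": len(omitted), "included": len(included),
            "accuracy": accuracy}

# A subject who returns 4 of the 10 relevant articles and nothing else:
result = score_retrieval(retrieved=[1, 2, 3, 9], relevant=range(1, 11))
print(result)
```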
22/54
Summary of Hypotheses (Main Effects)
H1: Retrieval Method to Retrieval Effectiveness | Keyword (Recall) | Visual (Recognition)
H1.1: Accuracy | Less | More
H1.2: Time | Faster | Slower
H1.3: Work Effort | More | Less
H1.4: Satisfaction | Lower | Higher

H2: Result Set Size to Retrieval Effectiveness | Smaller Result Sets | Larger Result Sets
H2.1: Accuracy | Higher | Lower
H2.2: Time | Faster | Slower
H2.3: Work Effort | Less | More
H2.4: Satisfaction | Higher | Lower

H3: Experience on Retrieval Effectiveness | Less Experience | More Experience
H3.1: Accuracy | Lower | Higher
H3.2: Time | Slower | Faster
H3.3: Work Effort | More | Less
H3.4: Satisfaction | Lower | Higher
23/54
Summary of Hypotheses (Interaction Effects)
H4: Retrieval Method and Result Set Size to Retrieval Effectiveness
H4.1: Accuracy | As result set size grows the difference in accuracy increases
H4.2: Time | As result set size grows the difference in time increases
H4.3: Work Effort | As result set size grows the difference in work effort increases
H4.4: Satisfaction | As result set size grows the difference in satisfaction increases

H5: Retrieval Method and User Experience to Retrieval Effectiveness
H5.1: Accuracy | As user experience increases the difference in accuracy decreases
H5.2: Time | As user experience increases the difference in time increases
H5.3: Work Effort | As user experience increases the difference in work effort decreases
H5.4: Satisfaction | As user experience increases the difference in satisfaction decreases (such that a very experienced user may prefer the keyword search interface over the visual hierarchy search interface)
24/54
H4: Retrieval Method – Results Set Size Interactions
[Figure: four hypothesized interaction plots, each with Keyword and Visual lines over result set size (Small, Medium, Large): Accuracy, Time, Work Effort, and Satisfaction]
25/54
H5: Retrieval Method – User Experience Interactions
[Figure: four hypothesized interaction plots, each with Keyword and Visual lines over user experience (Low, High): Accuracy, Time, Work Effort, and Satisfaction]
26/54
Method
A 2 x 3 x 2 mixed-factor, repeated-measures (on one factor) analysis of variance (ANOVA) experimental design.

Independent Variables
Retrieval Method (keyword or visual – between subjects)
Result Set Size (small, medium, and large – within subjects)
User Experience (low or high – between subjects)

Dependent Variable: Retrieval Effectiveness
Accuracy, Time, Work Effort, Satisfaction
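The design can be made concrete with a small sketch: two between-subjects factors, one within-subjects (repeated) factor, and marginal cell means of the kind that feed the main-effect hypotheses. The observations below are fabricated for illustration; the actual analysis was a full repeated-measures ANOVA.

```python
# Sketch of the 2 x 3 x 2 mixed design. Retrieval method and user
# experience are between-subjects factors; result set size is the
# repeated (within-subjects) factor. All accuracy values are fabricated.

from statistics import mean

METHODS = ("keyword", "visual")        # between subjects
SIZES = ("small", "medium", "large")   # within subjects (repeated)
EXPERIENCE = ("low", "high")           # between subjects

# Each subject contributes one observation per result-set size:
# (method, experience, size, accuracy)
observations = [
    ("keyword", "low", "small", 0.30), ("keyword", "low", "medium", 0.15),
    ("keyword", "low", "large", 0.10), ("visual", "high", "small", 0.45),
    ("visual", "high", "medium", 0.20), ("visual", "high", "large", 0.18),
]

def cell_mean(method=None, experience=None, size=None):
    """Mean accuracy over observations matching the given factor levels
    (None = marginalize over that factor)."""
    vals = [a for m, e, s, a in observations
            if method in (None, m) and experience in (None, e)
            and size in (None, s)]
    return mean(vals)

# Marginal means correspond to the main-effect hypotheses H1-H3;
# significance testing is left to the ANOVA itself.
print(cell_mean(method="visual"))
print(cell_mean(size="small"))
```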
27/54
Method (cont.)
Laboratory Experiment Characteristics:
Task setting: Each subject was randomly assigned to one of two search interfaces (keyword or visual)
• Controls for carry-over effect
Task setting: Each subject was requested to perform search tasks on three different scenarios
• See slide deck appendix for scenarios
Incentives included a $100 cash prize for the highest level of accuracy
Subjects: Four sessions – 3 graduate sections, 1 undergraduate (seniors) section
28/54
The Search Tasks
Three randomized search scenarios
Controls for learning effect
Manipulated to return various result set sizes
Large “System Design” scenario
• 20% of the data set (120 correct responses)
Medium “User Acceptance” scenario
• 6-7% of the data set (40 correct responses)
Small “Risk Management” scenario
• 1-2% of the data set (10 correct responses)
29/54
Demonstration of Search Interfaces
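The live demonstration cannot be reproduced here, but the contrast between the two interfaces can be sketched: keyword search is a recall task (the user must produce terms), while the tree view is a recognition task (the user picks from displayed nodes). The hierarchy, keyword index, and article IDs below are invented.

```python
# Sketch contrasting the two search interfaces. The hierarchy, keyword
# index, and article IDs are invented stand-ins for the real system.

hierarchy = {
    "IS Research": {
        "Development": {"System Design": [101, 102], "Risk Management": [103]},
        "Behavioral": {"User Acceptance": [104, 105]},
    }
}

index = {"design": [101, 102], "risk": [103], "acceptance": [104, 105]}

def keyword_search(term):
    """Recall task: returns nothing unless the user produces a matching term."""
    return index.get(term.lower(), [])

def browse(tree, path):
    """Recognition task: the user descends the displayed hierarchy node by
    node, choosing among visible labels instead of generating terms."""
    node = tree
    for label in path:
        node = node[label]
    return node

print(keyword_search("Design"))                                   # hit
print(keyword_search("designing"))                                # recall miss
print(browse(hierarchy, ["IS Research", "Development", "System Design"]))
```

The recall miss above is the cognitive-loading difference in miniature: the keyword interface returns nothing for a near-variant term, while the tree view always shows the user every label that exists.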
30/54
Summary of Results
N = 75 | Accuracy | Time (sec) | Work Effort | Satisfaction
Visual | 28.93% | 1080 | 63.77 | 14.84
Keyword | 17.67% | 902 | 53.67 | 15.61

Graduate Students Only
N = 42 | Accuracy | Time (sec) | Work Effort | Satisfaction
Visual | 31.21% | 1297 | 64.24 | 14.80
Keyword | 20.68% | 1056 | 56.36 | 15.86
31/54
Accuracy Hypothesis – 1.1
Knowledge management systems that employ a visual tree-view hierarchy search interface will produce more accurate results than knowledge management systems that employ a text-based keyword search interface.

[Figure 13a – Search Interface – Accuracy (Hypothesized); Figure 13b – Search Interface – Accuracy (Actual): mean accuracy by retrieval method (keyword vs. visual)]
32/54
Accuracy Hypothesis – 2.1
As result set size increases retrieval accuracy will decrease.

[Figure 13c – Result Set Size – Accuracy (Hypothesized); Figure 13d – Result Set Size – Accuracy (Actual): mean accuracy for small, medium, and large result sets]
33/54
Troubles Measuring Experience
Bedard (1989) and Gregor & Benbasat (1999) discuss theoretical support for experience as a valid variable for both direct and moderating effects in knowledge-based research.
They also note the difficulties in finding a generally accepted definition and a method for operationalizing the concept.
34/54
Factor Analysis of Experience Items
Rotated Component Matrix (a)

Item | Component 1 | Component 2 | Component 3
SDExpert | .144 | .873 | -.040
UAExpert | -.021 | .883 | .105
RMExpert | -.048 | .836 | .089
Graduate | .861 | -.087 | -.243
Educ | .792 | -.124 | -.168
ISProf | .738 | .143 | .226
ISDeg | .733 | .165 | .256
UseISBig | .011 | -.113 | .862
UseISSml | .012 | .235 | .746

Extraction Method: Principal Component Analysis. Rotation Method: Varimax with Kaiser Normalization. (a) Rotation converged in 4 iterations.

• Content Experience (loaded on by SDExpert, UAExpert, RMExpert)
• Educational Experience (loaded on by ISDeg, ISProf, Educ, Graduate)
• Search Experience (loaded on by UseISBig, UseISSml)
35/54
Accuracy Hypothesis – 3.1
Subjects with more (educational) experience will have higher retrieval accuracy than subjects with less (educational) experience.

[Figure 13e – Experience – Accuracy (Hypothesized); Figure 13f – Experience – Accuracy (Actual): mean accuracy for undergraduate vs. graduate subjects]
36/54
Top 10 Accuracy Results*
Rank | Accuracy Rate | Search Interface | Education | Email
1 | 72% | Visual | Undergraduate | [email protected]
2 | 64% | Visual | Graduate | ***Anonymous***
3 | 63% | Keyword | Graduate | [email protected]
4 | 58% | Visual | Undergraduate | [email protected]
5 | 57% | Visual | Graduate | [email protected]
6 | 52% | Visual | Graduate | [email protected]
7 | 50% | Visual | Graduate | [email protected]
8 | 45% | Keyword | Graduate | ***Anonymous***
9 | 43% | Visual | Graduate | [email protected]
10 | 41% | Visual | Graduate | [email protected]
* Mean Accuracy Rate = 24%
37/54
Time Hypothesis – 1.2
Users of knowledge management systems that employ visual tree-view retrieval interfaces will spend more time searching than users of knowledge management systems that employ text-based keyword retrieval interfaces.

[Figure 14a – Search Interface – Time (Hypothesized); Figure 14b – Search Interface – Time (Actual): mean time by retrieval method]
38/54
Time Hypothesis – 2.2
As result set size increases time will increase.
Note: Directional for the visual search interface only; the keyword interface remained flat across the scenarios.

[Figure 14c – Result Set Size – Time (Hypothesized); Figure 14d – Result Set Size – Time (Actual): mean time for small, medium, and large result sets]
39/54
Time Hypothesis – 3.2
Subjects with more (educational) experience will perform searches faster than subjects with less (educational) experience.

[Figure 14e – Experience – Time (Hypothesized); Figure 14f – Experience – Time (Actual): mean time for undergraduate vs. graduate subjects]
40/54
Work Effort Hypothesis – 1.3
Knowledge management systems that employ visual tree-view hierarchy search interfaces will use less work effort than knowledge management systems that employ text-based keyword search interfaces.

[Figure 15a – Search Interface – Effort (Hypothesized); Figure 15b – Search Interface – Effort (Actual): mean work effort by retrieval method]
41/54
Work Effort Hypothesis – 2.3
As result set size increases work effort will increase.

[Figure 15c – Result Set Size – Effort (Hypothesized); Figure 15d – Result Set Size – Effort (Actual): mean work effort for small, medium, and large result sets]
42/54
Work Effort Hypothesis – 3.3
Subjects with more (educational) experience will use less effort than subjects with less (educational) experience.

[Figure 15e – Experience – Effort (Hypothesized); Figure 15f – Experience – Effort (Actual): mean work effort for undergraduate vs. graduate subjects]
43/54
Satisfaction Hypothesis – 1.4
Knowledge management systems that employ visual tree-view hierarchy search interfaces will have a higher degree of satisfaction than knowledge management systems that employ text-based keyword search interfaces.

[Figure 16a – Search Interface – Satisfaction (Hypothesized); Figure 16b – Search Interface – Satisfaction (Actual): mean satisfaction by retrieval method]
44/54
Satisfaction Hypothesis – 2.4
As result set size increases satisfaction will decrease.

[Figure 16c – Result Set Size – Satisfaction (Hypothesized); Figure 16d – Result Set Size – Satisfaction (Actual): mean satisfaction for small, medium, and large result sets]
45/54
Satisfaction Hypothesis – 3.4
Subjects with more (educational) experience will be more satisfied than subjects with less (educational) experience.

[Figure 16e – Experience – Satisfaction (Hypothesized); Figure 16f – Experience – Satisfaction (Actual): mean satisfaction for undergraduate vs. graduate subjects]
46/54
Summary of Hypotheses Results (Main Effects)
Hypothesis 1: Retrieval Method to Retrieval Effectiveness – Main Effect
H1.1: Accuracy Supported
H1.2: Time Supported
H1.3: Work Effort Not Supported – significant in opposite direction
H1.4: Satisfaction Not Supported
Hypothesis 2: Result set Size to Retrieval Effectiveness – Main Effect
H2.1: Accuracy Supported
H2.2: Time Directional – for the Visual search interface only
H2.3: Work Effort Not Supported
H2.4: Satisfaction Not Supported
Hypothesis 3: (Educational) Experience to Retrieval Effectiveness – Main Effect
H3.1: Accuracy Supported
H3.2: Time Not Supported – significant in opposite direction
H3.3: Work Effort Not Supported
H3.4: Satisfaction Not Supported
47/54
Summary of Hypotheses Results (Interaction Effects)
Hypothesis 4: Retrieval Method and Result set Size to Retrieval Effectiveness – Interaction Effect
H4.1: Accuracy Not Supported
H4.2: Time Not Supported
H4.3: Work Effort Not Supported
H4.4: Satisfaction Not Supported
Hypothesis 5: Retrieval Method and (Educational) Experience to Retrieval Effectiveness – Interaction Effect
H5.1: Accuracy Not Supported
H5.2: Time Not Supported
H5.3: Work Effort Not Supported
H5.4: Satisfaction Not Supported
48/54
Revised Model (Figure 17, p.123)
[Figure 17: Retrieval Method (Keyword or Visual), Result Set Size (Small, Medium, Large), and Educational Experience (Graduate or Undergraduate) each linked to Retrieval Effectiveness (Accuracy, Time, Work Effort, and Satisfaction)]
49/54
Repeated Measures Analysis
 | Accuracy | Time (sec) | Work Effort | Satisfaction
grand mean | 22.96% | 321.88 | 58.41 | 15.21
Retrieval Method F (p) | 10.79 (0.002) | 2.83 (0.097) | 11.17 (0.001) | 1.20 (0.277)
  keyword | 17.41% | 294.48 | 53.33 | 15.60
  visual | 28.51% | 349.27 | 63.49 | 14.83
Educational Experience F (p) | 3.09 (0.083) | 19.02 (0.000) | 1.38 (0.245) | 0.10 (0.758)
  undergraduate | 19.98% | 250.46 | 56.61 | 15.10
  graduate | 25.96% | 393.29 | 60.21 | 15.32
Result Set Size F (p) | 81.14 (0.000) | 0.95 (0.389) | 35.29 (0.000) | 1.19 (0.306)
  large (1) | 14.85% | 334.23 | 51.30 | 15.53
  medium (2) | 16.31% | 338.98 | 63.87 | 15.02
  small (3) | 37.72% | 292.42 | 60.06 | 15.08
Pairwise comparisons (t-test sig.)
  1|2 | 0.467 | 0.899 | 0.000 | 0.159
  2|3 | 0.000 | 0.262 | 0.014 | 0.866
  1|3 | 0.000 | 0.212 | 0.000 | 0.215
50/54
Findings and Implications
Cognitive loading does have a significant effect on retrieval effectiveness with respect to accuracy and work effort.
50+% accuracy gains for approximately one minute more of the searcher’s time.
Satisfaction between the two search mechanisms was virtually identical.
Work effort turned out to be significantly greater in the visual interface, opposite of the hypothesized direction.
Retrieval accuracy can be improved for a limited cost using current technology.
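The accuracy-for-time tradeoff can be checked from the per-scenario means reported in the repeated-measures analysis (slide 50):

```python
# Back-of-the-envelope check of the tradeoff, using the per-scenario
# means from the repeated-measures analysis (slide 50).
keyword_acc, visual_acc = 0.1741, 0.2851
keyword_sec, visual_sec = 294.48, 349.27

relative_gain = (visual_acc - keyword_acc) / keyword_acc   # ~64% relative
extra_seconds = visual_sec - keyword_sec                   # ~55 s per scenario

print(f"{relative_gain:.0%} more accurate for {extra_seconds:.0f} extra seconds")
```

With these means the visual interface's relative accuracy gain is roughly 64% at a cost of just under a minute per scenario, consistent with the "50+%" claim above.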
51/54
Limitations
Lack of a validated experience measure
With a better measure, experience may turn out to be significant
Demographic questions focused on the workplace may have confounded the measure
Homogeneity of subjects on experience measures
Experiment based on an academic literature data source
May not generalize to other knowledge retrieval settings
Size of the data set
600+ knowledge objects with some 2700+ relationships might be considered a rather small knowledge repository
52/54
Avenues for Future Research
Use of an alternative knowledge object as a data source
Move to a field study
Possibilities include a corporation involved in a KM initiative looking to improve its retrieval effectiveness in areas such as lessons learned or areas of expertise
Add a combined (keyword + visual) search interface
The optimal solution is probably not an either/or scenario but rather a combined one
Related: testing the fit of different interfaces to user preferences
53/54
Future Research (cont.)
Allowing for searching across multiple hierarchies
keywords and authors
expertise and location
Investigating additional visual search interfaces
This tree-view hierarchy should not be considered the only valid visual search interface
Testing the scalability of visual search mechanisms
Quantifying the benefits of improved retrieval effectiveness
What are the economic implications?
55/54
Appendix: System Design Scenario (Large Result Set – p.78)
In this scenario, suppose you are a manager in a sizable information technology department within a large corporation. Assume your department is responsible for a large number of internal IT application development projects. Many of your projects tend to have problems that you would like to see alleviated. You believe that many of these problems could have been avoided by better design during the information systems development process. Before moving forward with any new projects you would like to learn about ways to more effectively design information systems.
56/54
Appendix: User Acceptance Scenario (Medium Result Set – p.79)
In this scenario, suppose you are working on a project for a self-monitoring healthcare application in which you must implement a new computer system that patients will need to use in their home. You would like to learn more about what causes end-users to accept new information systems.
57/54
Appendix: Risk Management Scenario (Small Result Set – p. 80)
In this scenario, suppose you are a senior manager of an IT organization in a company that is heavily dependent on the use of computing technology. Due to the recent floods of computer viruses, Internet worms, and hackers trying to gain access to customer records your systems have come under scrutiny by the top management team and board of directors. Because of this, you decide to seek a greater understanding of how to mitigate risk in information systems to better safeguard against these dependencies.