TRANSCRIPT
The Impact of Alternative Search Mechanisms on the Effectiveness of
Knowledge Retrieval
Ryan C. LaBrie
Department of Information Systems
W. P. Carey School of Business
Arizona State University
Dissertation Defense Presentation
Thursday 27 May 2004
10:00 a.m. – Noon, BA 253
2/54
Today’s Agenda
Motivation
Literature Review
Model and Hypotheses
Method
Results
Discussion
3/54
Why Study Knowledge Management?
75% of organizational assets are now intangible (knowledge) assets.
- Alan Greenspan
The 500 largest firms in the United States had intangible assets valued at $7.3 trillion (70% of their total value).
These 500 companies employ more than 21.6 million employees and generate over US$ 6.1 trillion in revenue.
(Stone, 2002)
4/54
Why Knowledge Management?
An Arthur Andersen study revealed:
In 1978 the balance sheet explained 95% of the market value of the firms.
In 1998 the balance sheet explained only 28% of the market value of those firms.
Today, the balance sheet explains less than 15% of the market value of the average firm (Stanfield, 2002).
Impending requirements of the Sarbanes-Oxley Act
Legislation requiring improved access to documentation and other knowledge objects.
5/54
Knowledge Management Research
Alavi & Leidner (2001) provide a KM research framework
creation – “storage/retrieval” – transfer – application
This dissertation focuses specifically on the retrieval aspect of knowledge management research addressing the following question posed by Alavi & Leidner:
“What retrieval mechanisms are most effective in enabling knowledge retrieval?”
6/54
Research Question
Does the cognitive loading of search mechanisms impact the effectiveness of knowledge retrieval?
7/54
The Data, Information, Knowledge Hierarchy
[Figure: the Data → Information → Knowledge pyramid]
8/54
What is Data?
Definition of data:
Sets of symbols not necessarily understood by, found meaningful by, or causing a change of state in the destination (Meadow & Yuan, 1997).
Connotes: raw facts, direct observations
Example: 65
9/54
What is Information?
Definition of information:
A set of symbols that does have meaning or significance to the recipient (Meadow & Yuan, 1997).
Connotes: structure, definition, semantics
Example: Age: 65
10/54
What is Knowledge?
Definition of knowledge:
The accumulation and integration of information received and processed by a recipient (Meadow & Yuan, 1997).
Information bound together by structure, assumptions, justification, and process (Edgington et al., in press).
Connotes: relationships, models, and usage
Example: At age 65 you may retire from work as a United States federal employee.
11/54
The Difference between Data and Knowledge Retrieval
Blair (2002) outlines five major differences (additional work by Blair identifies 12):

Data Retrieval | Knowledge Retrieval
Direct (“I want to know X”) | Indirect (“I want to know about X”)
Necessary relation between a formal query and the representation of a satisfactory answer | Probabilistic relation between a formal query and the representation of a satisfactory answer
Criterion of success = correctness | Criterion of success = utility
Speed dependent on the time of physical access | Speed dependent on the number of logical decisions the searcher must make
Scales easily | Does not scale easily
12/54
Keyword Retrieval Mechanisms
Keyword Phrase | # of Articles
System Design (original query) | 3
Systems Design | 10
System Designs | 0
Systems Designs | 1
System Analysis and Design | 0
Systems Analysis and Design | 4
Information System Design | 2
Information Systems Design | 8
MIS Systems Design | 1
Participative System Design | 1
System Design Methods | 1
Expert Systems Design | 1
Impact and Socio-Technical Systems Design | 1
TOTAL | 34
Gorla & Walker (1998)
Ambiguity biggest problem
Call for a controlled vocabulary (classification)

LaBrie & St. Louis (2003)
Mirrored Gorla & Walker results with a different data set.
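The fragmentation above can be sketched in code: an exact-match keyword query sees only one of the thirteen phrasings, while even crude normalization (a stand-in for the controlled vocabulary Gorla & Walker call for) collapses several variants. The mini-corpus and matching rules below are invented for illustration only.

```python
# Sketch of the keyword-fragmentation problem (hypothetical mini-corpus):
# an exact-match query sees only one of the phrasings in the table above,
# which is how 34 relevant articles end up scattered across 13 variants.

def exact_match(query, documents):
    """Return documents whose keyword field equals the query exactly."""
    return [d for d in documents if d["keywords"] == query]

def normalized_match(query, documents):
    """Crude normalization - lowercase and strip trailing 's' characters
    from each token - standing in for a controlled vocabulary."""
    def norm(phrase):
        return tuple(t.lower().rstrip("s") for t in phrase.split())
    return [d for d in documents if norm(d["keywords"]) == norm(query)]

docs = [
    {"title": "A", "keywords": "System Design"},
    {"title": "B", "keywords": "Systems Design"},
    {"title": "C", "keywords": "System Designs"},
    {"title": "D", "keywords": "Risk Management"},
]

print(len(exact_match("System Design", docs)))       # literal phrasing only
print(len(normalized_match("System Design", docs)))  # variants collapsed
```

A real controlled vocabulary maps variants to one classification term rather than rewriting strings, but the effect on result counts is the same.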
13/54
Did Classifications Improve Retrieval Effectiveness?
LaBrie & St. Louis (2003)
Classifications mirror keyword results
Classification schemes are almost immediately outdated and require updating
Weber (2003)
Ended 15 years of ISRL classification usage in MIS Quarterly
Number of times a keyword was used | Count | Percent of Total
1 | 1375 | 76.77%
2 | 234 | 13.07%
>2 | 182 | 10.16%
Total | 1791 | 100.00%

Number of times a category was used | Count | Percent of Total
1 | 275 | 44.87%
2 | 111 | 18.10%
>2 | 227 | 37.03%
Total | 613 | 100.00%
14/54
The Human Factor
Knowledge management and KM retrieval involve people as part of the equation.
People have cognitive limitations
• Cognitive processing size limitations
People have inherent cognitive retrieval processes
• Recall versus recognition retrieval
People have thresholds for decision-making
• Effort versus accuracy
People have varying degrees of experience (or previous knowledge)
15/54
Recall and Recognition
Recall: retrieval mechanism with no cues or hints to help the individual make a decision or judgment (Driscoll, 2000).
Example: What does the word esoteric mean?

Recognition: involves a set of pre-generated stimuli presented to the individual for a decision or judgment (Driscoll, 2000).
Example: Which of the following words is the best synonym for esoteric?
• Essential
• Mystical
• Terrific
• Evident
16/54
Milestones in Recall & Recognition Research
Ebbinghaus’ Forgetting Function (1885)
Beginning of the recall research; exponential decay function.
Miller’s Magical Number Seven (1956)
There is a cognitive limit (7 +/- 2) on how much information we can remember and use.
Simon’s Hierarchical Systems (1962, 1974)
Humans store information hierarchically.
Bower et al.’s Recall versus Recognition Experiments (1969)
Recall is a serial process; recognition is a parallel process.
Anderson’s Adaptive Character of Thought – ACT-R (1995, 1997)
Retrieval of information is facilitated if it is organized hierarchically.
Theory of the way declarative and procedural knowledge interact in complex cognitive processes.
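Ebbinghaus' forgetting function is commonly modeled as exponential decay; the short sketch below illustrates why uncued recall degrades so quickly. The strength parameter is illustrative only, not a value fitted from the 1885 study.

```python
import math

# Ebbinghaus-style forgetting curve, R = exp(-t / S): R is the fraction
# retained, t is hours since learning, S is memory strength. S = 24 is an
# illustrative value, not a parameter from Ebbinghaus' own data.

def retention(t_hours, strength=24.0):
    return math.exp(-t_hours / strength)

# Uncued recall decays steeply, which is one motivation for
# recognition-based (cued) interfaces such as a visual hierarchy.
for t in (0, 24, 72):
    print(f"after {t:3d} h: {retention(t):.2f}")
```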
17/54
Effort versus Accuracy
The saliency of effort is much greater than that of accuracy.

Todd & Benbasat (1992)
Effort and decision quality with decision aids
Vessey (1991, 1994)
Cognitive fit, information presentation (graphs vs. tables), and decision making
Galletta et al. (1996)
Effort and accuracy in spreadsheet evaluations
Speier & Morris (2003)
Query interface design on decision-making performance
18/54
The Role of Experience
Dhaliwal & Benbasat (1996), Gregor & Benbasat (1999), Mao & Benbasat (2000)
Provide theoretical support that experience plays a role in KB/IS usage
Markus (2001)
Differentiates between novice and expert users in her theory of knowledge reuse
19/54
Reexamining the Research Question
Does the cognitive loading (recall or recognition, small to large sets, user experience) of search mechanisms (keyword or visual) impact the effectiveness (accuracy, timeliness, work effort, satisfaction) of knowledge retrieval?
20/54
Research Model (Figure 6, p.35)
[Figure 6: box-and-arrow research model]
Retrieval Method (Keyword or Visual) → Retrieval Effectiveness (Accuracy, Time, Work Effort, and Satisfaction): H1.1–H1.4
Result Set Size (Small, Medium, and Large) → Retrieval Effectiveness: H2.1–H2.4
User Experience (High or Low) → Retrieval Effectiveness: H3.1–H3.4
Retrieval Method × Result Set Size → Retrieval Effectiveness: H4.1–H4.4
Retrieval Method × User Experience → Retrieval Effectiveness: H5.1–H5.4
21/54
Measuring the Variables
Automatically, electronically by the system
Captures time and articles
Measuring accuracy
Expert raters using a pseudo-Delphi method (Buckley, 1995)
Measured both omitted relevant (Type I errors) and included irrelevant (Type II errors) articles
Validated survey instruments captured work effort and satisfaction data
Work Effort: NASA-TLX (Hart & Staveland, 1988)
• Mental, Physical, Temporal, Frustration, Performance, Effort
Satisfaction: End-User Computing Satisfaction (Doll & Torkzadeh, 1988)
• Content, Accuracy, Format, Ease of Use, Timeliness
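As a sketch of the accuracy scoring, the function below counts omitted relevant and included irrelevant articles for one subject's result list. The slides do not give the exact scoring formula, so the penalized-accuracy rule here is an assumption, one plausible reading of "omitted relevant and included irrelevant" errors.

```python
# Sketch of per-subject accuracy scoring. Raters counted omitted relevant
# and included irrelevant articles; the exact scoring formula is not given
# on the slide, so the penalized rule below is an assumption.

def score_retrieval(retrieved, relevant):
    retrieved, relevant = set(retrieved), set(relevant)
    omitted = relevant - retrieved    # relevant articles the subject missed
    included = retrieved - relevant   # irrelevant articles the subject kept
    hits = retrieved & relevant
    # Assumed rule: correct inclusions minus false inclusions, scaled by
    # the number of relevant articles, floored at zero.
    accuracy = max(0.0, (len(hits) - len(included)) / len(relevant))
    return {"omitted": len(omitted), "included": len(included),
            "accuracy": accuracy}

# A subject who returns 4 of the 10 relevant articles and nothing else:
result = score_retrieval(retrieved=[1, 2, 3, 9], relevant=range(1, 11))
print(result)
```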
22/54
Summary of Hypotheses (Main Effects)
H1: Retrieval Method to Retrieval Effectiveness | Keyword (Recall) | Visual (Recognition)
H1.1: Accuracy | Less | More
H1.2: Time | Faster | Slower
H1.3: Work Effort | More | Less
H1.4: Satisfaction | Lower | Higher

H2: Result Set Size to Retrieval Effectiveness | Smaller Result Sets | Larger Result Sets
H2.1: Accuracy | Higher | Lower
H2.2: Time | Faster | Slower
H2.3: Work Effort | Less | More
H2.4: Satisfaction | Higher | Lower

H3: Experience on Retrieval Effectiveness | Less Experience | More Experience
H3.1: Accuracy | Lower | Higher
H3.2: Time | Slower | Faster
H3.3: Work Effort | More | Less
H3.4: Satisfaction | Lower | Higher
23/54
Summary of Hypotheses (Interaction Effects)
H4: Retrieval Method and Result Set Size to Retrieval Effectiveness
H4.1: Accuracy | As result set size grows the difference in accuracy increases
H4.2: Time | As result set size grows the difference in time increases
H4.3: Work Effort | As result set size grows the difference in work effort increases
H4.4: Satisfaction | As result set size grows the difference in satisfaction increases

H5: Retrieval Method and User Experience to Retrieval Effectiveness
H5.1: Accuracy | As user experience increases the difference in accuracy decreases
H5.2: Time | As user experience increases the difference in time increases
H5.3: Work Effort | As user experience increases the difference in work effort decreases
H5.4: Satisfaction | As user experience increases the difference in satisfaction decreases (such that a very experienced user may prefer the keyword search interface over the visual hierarchy search interface)
24/54
H4: Retrieval Method – Results Set Size Interactions
[Figure: four hypothesized interaction plots, each with Keyword and Visual lines over result set size (Small, Medium, Large): Accuracy, Time, Work Effort, and Satisfaction]
25/54
H5: Retrieval Method – User Experience Interactions
[Figure: four hypothesized interaction plots, each with Keyword and Visual lines over user experience (Low, High): Accuracy, Time, Work Effort, and Satisfaction]
26/54
Method
A 2 x 3 x 2 mixed-factor, repeated-measures (on one factor) analysis of variance (ANOVA) experimental design.

Independent Variables
Retrieval Method (keyword or visual – between subjects)
Result Set Size (small, medium, and large – within subjects)
User Experience (low or high – between subjects)

Dependent Variable: Retrieval Effectiveness
Accuracy, Time, Work Effort, Satisfaction
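The design can be made concrete with a small sketch: two between-subjects factors, one within-subjects (repeated) factor, and marginal cell means of the kind that feed the main-effect hypotheses. The observations below are fabricated for illustration; the actual analysis was a full repeated-measures ANOVA.

```python
# Sketch of the 2 x 3 x 2 mixed design. Retrieval method and user
# experience are between-subjects factors; result set size is the
# repeated (within-subjects) factor. All accuracy values are fabricated.

from statistics import mean

METHODS = ("keyword", "visual")        # between subjects
SIZES = ("small", "medium", "large")   # within subjects (repeated)
EXPERIENCE = ("low", "high")           # between subjects

# Each subject contributes one observation per result-set size:
# (method, experience, size, accuracy)
observations = [
    ("keyword", "low", "small", 0.30), ("keyword", "low", "medium", 0.15),
    ("keyword", "low", "large", 0.10), ("visual", "high", "small", 0.45),
    ("visual", "high", "medium", 0.20), ("visual", "high", "large", 0.18),
]

def cell_mean(method=None, experience=None, size=None):
    """Mean accuracy over observations matching the given factor levels
    (None = marginalize over that factor)."""
    vals = [a for m, e, s, a in observations
            if method in (None, m) and experience in (None, e)
            and size in (None, s)]
    return mean(vals)

# Marginal means correspond to the main-effect hypotheses H1-H3;
# significance testing is left to the ANOVA itself.
print(cell_mean(method="visual"))
print(cell_mean(size="small"))
```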
27/54
Method (cont.)
Laboratory Experiment Characteristics:
Task setting: Each subject was randomly assigned to one of two search interfaces (keyword or visual)
• Controls for carry-over effect
Task setting: Each subject was requested to perform search tasks on three different scenarios
• See slide deck appendix for scenarios
Incentives included a $100 cash prize for the highest level of accuracy
Subjects: Four sessions – 3 graduate sections, 1 undergraduate (seniors) section
28/54
The Search Tasks
Three randomized search scenarios
Controls for learning effect
Manipulated to return various result set sizes
Large “System Design” scenario
• 20% of the data set (120 correct responses)
Medium “User Acceptance” scenario
• 6-7% of the data set (40 correct responses)
Small “Risk Management” scenario
• 1-2% of the data set (10 correct responses)
29/54
Demonstration of Search Interfaces
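The live demonstration cannot be reproduced here, but the contrast between the two interfaces can be sketched: keyword search is a recall task (the user must produce terms), while the tree view is a recognition task (the user picks from displayed nodes). The hierarchy, keyword index, and article IDs below are invented.

```python
# Sketch contrasting the two search interfaces. The hierarchy, keyword
# index, and article IDs are invented stand-ins for the real system.

hierarchy = {
    "IS Research": {
        "Development": {"System Design": [101, 102], "Risk Management": [103]},
        "Behavioral": {"User Acceptance": [104, 105]},
    }
}

index = {"design": [101, 102], "risk": [103], "acceptance": [104, 105]}

def keyword_search(term):
    """Recall task: returns nothing unless the user produces a matching term."""
    return index.get(term.lower(), [])

def browse(tree, path):
    """Recognition task: the user descends the displayed hierarchy node by
    node, choosing among visible labels instead of generating terms."""
    node = tree
    for label in path:
        node = node[label]
    return node

print(keyword_search("Design"))                                   # hit
print(keyword_search("designing"))                                # recall miss
print(browse(hierarchy, ["IS Research", "Development", "System Design"]))
```

The recall miss above is the cognitive-loading difference in miniature: the keyword interface returns nothing for a near-variant term, while the tree view always shows the user every label that exists.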
30/54
Summary of Results
N = 75 | Accuracy | Time (sec) | Work Effort | Satisfaction
Visual | 28.93% | 1080 | 63.77 | 14.84
Keyword | 17.67% | 902 | 53.67 | 15.61

Graduate Students Only
N = 42 | Accuracy | Time (sec) | Work Effort | Satisfaction
Visual | 31.21% | 1297 | 64.24 | 14.80
Keyword | 20.68% | 1056 | 56.36 | 15.86
31/54
Accuracy Hypothesis – 1.1
Knowledge management systems that employ a visual tree-view hierarchy search interface will produce more accurate results than knowledge management systems that employ a text-based keyword search interface.

[Figure 13a – Search Interface – Accuracy (Hypothesized); Figure 13b – Search Interface – Accuracy (Actual): mean accuracy by retrieval method (keyword vs. visual)]
32/54
Accuracy Hypothesis – 2.1
As result set size increases retrieval accuracy will decrease.

[Figure 13c – Result Set Size – Accuracy (Hypothesized); Figure 13d – Result Set Size – Accuracy (Actual): mean accuracy for small, medium, and large result sets]
33/54
Troubles Measuring Experience
Bedard (1989) and Gregor & Benbasat (1999) discuss theoretical support for experience as a valid variable for both direct and moderating effects in knowledge-based research.
They also note the difficulties in finding a generally accepted definition and a method for operationalizing the concept.
34/54
Factor Analysis of Experience Items
Rotated Component Matrix (a)

Item | Component 1 | Component 2 | Component 3
SDExpert | .144 | .873 | -.040
UAExpert | -.021 | .883 | .105
RMExpert | -.048 | .836 | .089
Graduate | .861 | -.087 | -.243
Educ | .792 | -.124 | -.168
ISProf | .738 | .143 | .226
ISDeg | .733 | .165 | .256
UseISBig | .011 | -.113 | .862
UseISSml | .012 | .235 | .746

Extraction Method: Principal Component Analysis. Rotation Method: Varimax with Kaiser Normalization. (a) Rotation converged in 4 iterations.

• Content Experience (loaded on by SDExpert, UAExpert, RMExpert)
• Educational Experience (loaded on by ISDeg, ISProf, Educ, Graduate)
• Search Experience (loaded on by UseISBig, UseISSml)
35/54
Accuracy Hypothesis – 3.1
Subjects with more (educational) experience will have higher retrieval accuracy than subjects with less (educational) experience.

[Figure 13e – Experience – Accuracy (Hypothesized); Figure 13f – Experience – Accuracy (Actual): mean accuracy for undergraduate vs. graduate subjects]
36/54
Top 10 Accuracy Results*
Rank | Accuracy Rate | Search Interface | Education | Email
1 | 72% | Visual | Undergraduate | [email protected]
2 | 64% | Visual | Graduate | ***Anonymous***
3 | 63% | Keyword | Graduate | [email protected]
4 | 58% | Visual | Undergraduate | [email protected]
5 | 57% | Visual | Graduate | [email protected]
6 | 52% | Visual | Graduate | [email protected]
7 | 50% | Visual | Graduate | [email protected]
8 | 45% | Keyword | Graduate | ***Anonymous***
9 | 43% | Visual | Graduate | [email protected]
10 | 41% | Visual | Graduate | [email protected]
* Mean Accuracy Rate = 24%
37/54
Time Hypothesis – 1.2
Users of knowledge management systems that employ visual tree-view retrieval interfaces will spend more time searching than users of knowledge management systems that employ text-based keyword retrieval interfaces.

[Figure 14a – Search Interface – Time (Hypothesized); Figure 14b – Search Interface – Time (Actual): mean time by retrieval method]
38/54
Time Hypothesis – 2.2
As result set size increases time will increase.
Note: Directional for the visual search interface only; the keyword interface remained flat across the scenarios.

[Figure 14c – Result Set Size – Time (Hypothesized); Figure 14d – Result Set Size – Time (Actual): mean time for small, medium, and large result sets]
39/54
Time Hypothesis – 3.2
Subjects with more (educational) experience will perform searches faster than subjects with less (educational) experience.

[Figure 14e – Experience – Time (Hypothesized); Figure 14f – Experience – Time (Actual): mean time for undergraduate vs. graduate subjects]
40/54
Work Effort Hypothesis – 1.3
Knowledge management systems that employ visual tree-view hierarchy search interfaces will use less work effort than knowledge management systems that employ text-based keyword search interfaces.

[Figure 15a – Search Interface – Effort (Hypothesized); Figure 15b – Search Interface – Effort (Actual): mean work effort by retrieval method]
41/54
Work Effort Hypothesis – 2.3
As result set size increases work effort will increase.

[Figure 15c – Result Set Size – Effort (Hypothesized); Figure 15d – Result Set Size – Effort (Actual): mean work effort for small, medium, and large result sets]
42/54
Work Effort Hypothesis – 3.3
Subjects with more (educational) experience will use less effort than subjects with less (educational) experience.

[Figure 15e – Experience – Effort (Hypothesized); Figure 15f – Experience – Effort (Actual): mean work effort for undergraduate vs. graduate subjects]
43/54
Satisfaction Hypothesis – 1.4
Knowledge management systems that employ visual tree-view hierarchy search interfaces will have a higher degree of satisfaction than knowledge management systems that employ text-based keyword search interfaces.

[Figure 16a – Search Interface – Satisfaction (Hypothesized); Figure 16b – Search Interface – Satisfaction (Actual): mean satisfaction by retrieval method]
44/54
Satisfaction Hypothesis – 2.4
As result set size increases satisfaction will decrease.

[Figure 16c – Result Set Size – Satisfaction (Hypothesized); Figure 16d – Result Set Size – Satisfaction (Actual): mean satisfaction for small, medium, and large result sets]
45/54
Satisfaction Hypothesis – 3.4
Subjects with more (educational) experience will be more satisfied than subjects with less (educational) experience.

[Figure 16e – Experience – Satisfaction (Hypothesized); Figure 16f – Experience – Satisfaction (Actual): mean satisfaction for undergraduate vs. graduate subjects]
46/54
Summary of Hypotheses Results (Main Effects)
Hypothesis 1: Retrieval Method to Retrieval Effectiveness – Main Effect
H1.1: Accuracy Supported
H1.2: Time Supported
H1.3: Work Effort Not Supported – significant in opposite direction
H1.4: Satisfaction Not Supported
Hypothesis 2: Result set Size to Retrieval Effectiveness – Main Effect
H2.1: Accuracy Supported
H2.2: Time Directional – for the Visual search interface only
H2.3: Work Effort Not Supported
H2.4: Satisfaction Not Supported
Hypothesis 3: (Educational) Experience to Retrieval Effectiveness – Main Effect
H3.1: Accuracy Supported
H3.2: Time Not Supported – significant in opposite direction
H3.3: Work Effort Not Supported
H3.4: Satisfaction Not Supported
47/54
Summary of Hypotheses Results (Interaction Effects)
Hypothesis 4: Retrieval Method and Result set Size to Retrieval Effectiveness – Interaction Effect
H4.1: Accuracy Not Supported
H4.2: Time Not Supported
H4.3: Work Effort Not Supported
H4.4: Satisfaction Not Supported
Hypothesis 5: Retrieval Method and (Educational) Experience to Retrieval Effectiveness – Interaction Effect
H5.1: Accuracy Not Supported
H5.2: Time Not Supported
H5.3: Work Effort Not Supported
H5.4: Satisfaction Not Supported
48/54
Revised Model (Figure 17, p.123)
[Figure 17: Retrieval Method (Keyword or Visual), Result Set Size (Small, Medium, Large), and Educational Experience (Graduate or Undergraduate) each linked to Retrieval Effectiveness (Accuracy, Time, Work Effort, and Satisfaction)]
49/54
Repeated Measures Analysis
 | Accuracy | Time (sec) | Work Effort | Satisfaction
grand mean | 22.96% | 321.88 | 58.41 | 15.21
Retrieval Method F (p) | 10.79 (0.002) | 2.83 (0.097) | 11.17 (0.001) | 1.20 (0.277)
  keyword | 17.41% | 294.48 | 53.33 | 15.60
  visual | 28.51% | 349.27 | 63.49 | 14.83
Educational Experience F (p) | 3.09 (0.083) | 19.02 (0.000) | 1.38 (0.245) | 0.10 (0.758)
  undergraduate | 19.98% | 250.46 | 56.61 | 15.10
  graduate | 25.96% | 393.29 | 60.21 | 15.32
Result Set Size F (p) | 81.14 (0.000) | 0.95 (0.389) | 35.29 (0.000) | 1.19 (0.306)
  large (1) | 14.85% | 334.23 | 51.30 | 15.53
  medium (2) | 16.31% | 338.98 | 63.87 | 15.02
  small (3) | 37.72% | 292.42 | 60.06 | 15.08
Pairwise comparisons (t-test sig.)
  1|2 | 0.467 | 0.899 | 0.000 | 0.159
  2|3 | 0.000 | 0.262 | 0.014 | 0.866
  1|3 | 0.000 | 0.212 | 0.000 | 0.215
50/54
Findings and Implications
Cognitive loading does have a significant effect on retrieval effectiveness with respect to accuracy and work effort.
50+% accuracy gains for approximately one minute more of the searcher’s time.
Satisfaction between the two search mechanisms was virtually identical.
Work effort turned out to be significantly greater in the visual interface, opposite of the hypothesized direction.
Retrieval accuracy can be improved for a limited cost using current technology.
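The accuracy-for-time tradeoff can be checked from the per-scenario means reported in the repeated-measures analysis (slide 50):

```python
# Back-of-the-envelope check of the tradeoff, using the per-scenario
# means from the repeated-measures analysis (slide 50).
keyword_acc, visual_acc = 0.1741, 0.2851
keyword_sec, visual_sec = 294.48, 349.27

relative_gain = (visual_acc - keyword_acc) / keyword_acc   # ~64% relative
extra_seconds = visual_sec - keyword_sec                   # ~55 s per scenario

print(f"{relative_gain:.0%} more accurate for {extra_seconds:.0f} extra seconds")
```

With these means the visual interface's relative accuracy gain is roughly 64% at a cost of just under a minute per scenario, consistent with the "50+%" claim above.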
51/54
Limitations
Lack of a validated experience measure
With a better measure, experience may turn out to be significant
Demographic questions focused on the workplace may have confounded the measure
Homogeneity of subjects on experience measures
Experiment based on an academic literature data source
May not generalize to other knowledge retrieval settings
Size of the data set
600+ knowledge objects with some 2700+ relationships might be considered a rather small knowledge repository
52/54
Avenues for Future Research
Use of an alternative knowledge object as a data source
Move to a field study
Possibilities include a corporation involved in a KM initiative looking to improve its retrieval effectiveness in areas such as lessons learned or areas of expertise
Add a combined (keyword + visual) search interface
The optimal solution is probably not an either/or scenario but rather a combined one
Related: testing the fit of different interfaces to user preferences
53/54
Future Research (cont.)
Allowing for searching across multiple hierarchies
keywords and authors
expertise and location
Investigating additional visual search interfaces
This tree-view hierarchy should not be considered the only valid visual search interface
Testing the scalability of visual search mechanisms
Quantifying the benefits of improved retrieval effectiveness
What are the economic implications?
55/54
Appendix: System Design Scenario (Large Result Set – p.78)
In this scenario, suppose you are a manager in a sizable information technology department within a large corporation. Assume your department is responsible for a large number of internal IT application development projects. Many of your projects tend to have problems that you would like to see alleviated. You believe that many of these problems could have been avoided by better design during the information systems development process. Before moving forward with any new projects you would like to learn about ways to more effectively design information systems.
56/54
Appendix: User Acceptance Scenario (Medium Result Set – p.79)
In this scenario, suppose you are working on a project for a self-monitoring healthcare application in which you must implement a new computer system that patients will need to use in their home. You would like to learn more about what causes end-users to accept new information systems.
57/54
Appendix: Risk Management Scenario (Small Result Set – p. 80)
In this scenario, suppose you are a senior manager of an IT organization in a company that is heavily dependent on the use of computing technology. Due to the recent floods of computer viruses, Internet worms, and hackers trying to gain access to customer records your systems have come under scrutiny by the top management team and board of directors. Because of this, you decide to seek a greater understanding of how to mitigate risk in information systems to better safeguard against these dependencies.