Charles L.A. Clarke, School of Computer Science, University of Waterloo, Canada
Elaine G. Toms, Faculty of Management, Dalhousie University, Halifax, Canada
Luanne Freund, Faculty of Information Studies, University of Toronto, Canada
Modeling Task-Genre Relationships for IR in
the Workplace
The 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. August 15-19, 2005, in Salvador, Brazil.
2
Workplace IR
How do I phrase this so I don't get 2,000 responses? If I look for certain words, I know I'm going to get thousands of responses and they don't mean anything when I sift through them.
There’s lots of great information out there. We just don't know how to find it yet. We don't know how to make it easy to find yet.
I really just don't have time to read through 10 articles to find one that's good. First off I just don't have that kind of attention span and secondly I've just got lots and lots of other stuff to do.
3
Approach: Contextual IR
[Diagram: Traditional IR vs. Contextual IR. Traditional IR: query (bag of words) and info. Contextual IR: interaction between searcher (work domain, work task, info need) and information (author, work domain, work task, problem, purpose).]
Which Factors?
• work tasks
• information tasks
• document genre
4
Research Questions
• Do discernible relationships exist between work tasks, information tasks and document genres in a specific work domain?
• If so, what are these relationships, and are there broader factors underlying the patterns of association between these variables?
5
Methods: Work Domain
Software Engineering
• large multinational hi-tech company
• software services consulting group
• wide range of work activities: assessment, troubleshooting, implementation, system migration, project management, etc.
• heavy reliance upon digital information sources
6
Methods: Dataset
Internal Intellectual Capital Database
• Documents submitted and meta-tagged by consultants
• Tags: document type & task (purpose)
• 5,800 pairs of tags for analysis
genres (17): cookbook, discussion, lecture / lab, presentation, schedule, sales kit, reading material, source code, tools, etc.
tasks: work (20): architecture, debugging, installation, configuration, deployment, implementation, proof of concept, project management, testing, etc.
tasks: informational (16): compare, educate, document, guide, demonstrate, index, support, market, methodology, etc.
7
Methods: Analysis
Correspondence Analysis
• exploratory method used to identify patterns of association between variables
• generalization of PCA to contingency tables with multiple categories for each variable
• maps vectors of row and column profiles in multi-dimensional space – using Chi-Square distance
• calculates inertia - measure of dispersion - for each row and column
• uses best-fitting planes to reduce dimensionality of solution
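The first steps listed above (row profiles, chi-square distance, inertia) can be illustrated with a minimal sketch. The counts are made up for illustration and are not the study's data; the function names are hypothetical, and the final step, projecting onto best-fitting planes (usually done with a singular value decomposition), is left out. The sketch assumes no row or column of the table sums to zero.

```python
import math

# Toy contingency table with made-up counts (NOT the study's data).
# Rows are genres, columns are tasks; labels are borrowed from the slides.
GENRES = ["cookbook", "presentation", "source code"]
TASKS = ["installation", "educate", "debugging"]
COUNTS = [
    [30, 5, 10],
    [4, 40, 2],
    [8, 3, 35],
]


def masses(table):
    """Return (row masses, column masses): marginal totals divided by n."""
    n = sum(sum(row) for row in table)
    row_mass = [sum(row) / n for row in table]
    col_mass = [sum(row[j] for row in table) / n for j in range(len(table[0]))]
    return row_mass, col_mass


def row_profiles(table):
    """Divide each row by its own total so rows of different sizes compare."""
    return [[x / sum(row) for x in row] for row in table]


def chi_square_distance(p, q, col_mass):
    """Distance between two profiles, weighting each column by 1 / its mass."""
    return math.sqrt(sum((a - b) ** 2 / c for a, b, c in zip(p, q, col_mass)))


def row_inertias(table):
    """Inertia per row: mass times squared distance to the average profile."""
    row_mass, col_mass = masses(table)
    profiles = row_profiles(table)
    return [
        m * chi_square_distance(p, col_mass, col_mass) ** 2
        for m, p in zip(row_mass, profiles)
    ]


if __name__ == "__main__":
    n = sum(sum(row) for row in COUNTS)
    inertias = row_inertias(COUNTS)
    for genre, value in zip(GENRES, inertias):
        print(f"{genre:>12}: inertia {value:.4f}")
    # Total inertia times n equals the Pearson chi-square statistic.
    print(f"total inertia {sum(inertias):.4f}, chi-square {n * sum(inertias):.2f}")
```

The column masses double as the average row profile, which is why `row_inertias` measures each row's distance to `col_mass`. A row with high inertia has a genre mix that departs strongly from the overall distribution.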
8
Results: Genre Distribution
Significant Relationship: Genre & Task (χ² = 5878.968, df = 612, p < .001)
[Figure: Select Work Tasks: Genre Distribution. Series: Configuration, Development, Project Management; vertical axis 0% to 40%.]
9
Results: Genre Distribution
[Figure: Select Information Tasks: Genre Distribution. Series: compare, document, example; vertical axis 0% to 40%.]
Significant Relationship: Genre & Task (χ² = 5878.968, df = 612, p < .001)
10
Correspondence Map: Dimensions 1 & 2
[Figure: correspondence map. Axis “Work Activities”: engineering vs. consulting. Axis “Information Goals”: doing; low level vs. learning; high level.]
11
Correspondence Map: Dimensions 1 & 4
[Figure: correspondence map. Axis “Work Activities”. Axis “Information Goals”: demonstrating; interactive vs. fact-finding; static.]
12
Summary – Patterns of Association
Information goals: Doing (“how to”, low-level); Learning (“why?”, high-level); Fact-Finding (“what?”, static); Demonstrating (“show me”, interactive)
Work role: Software engineering
• Doing: integration, installation, tool, cookbook, demo
• Learning: architecture, capacity planning, guide, website
• Fact-Finding: administrate, security, design docs, roadmap, standards
• Demonstrating: test, debugging, performance tuning, source code, tool, demo
Work role: Consulting
• Doing: project management, engagement summary, schedule
• Learning: product presentation, technical info, lecture/lab, presentation
• Fact-Finding: project management, index, schedule, legal material
• Demonstrating: discovery session, competitive evaluation, methods, discussion, technical info
13
Genre Clusters
• reusables
• low-level technical
• product maintenance
• high-level generic
• educational
Meta-genres?
14
Discussion
Info-centric perspective on work tasks
– Significant relationship: task & genre
– Micro-relationships (specific tasks & genres): general relationships exist; moderated by roles and information goals
– Macro-relationships: suggest factors for hypothesis-testing for engineering domain; enterprise search
Key relevance criteria for engineers: “task applicability”
To what extent does genre reflect this?
15
This research is supported by an IBM Centre for Advanced Studies (Toronto) fellowship to the first and second authors, and an SSHRC and Canada Research Chairs Program grant to the second author. We would like to thank Julie Waterhouse, IBM, and the many software services consultants who contributed their valuable time to the project.
thank you