Dynamic Context-Sensitive PageRank for Expertise Mining
2nd Int. Conf. on Social Informatics (SocInfo'10), 27-29 October 2010, Austria
http://www.infosys.tuwien.ac.at/staff/dschall/
29 Oct. 2010
Daniel Schall
Vienna University of Technology
Presentation Outline
• Overview
• Motivation
• Human-Provided Services (HPS) Crowdsourcing Example
• Human Interaction Metrics
• Dynamic Skill and Activity-based PageRank (DSARank)
• Experiments and Conclusion
• Open dynamic ecosystems
  – People and software services integrated into evolving “solutions”
• Communications and coordination
  – “Anytime-anywhere” pervasive infrastructures and mobility
• Mass collaboration
  – Knowledge sharing and social interaction
• Crowdsourcing
  – Human computation on the Web
Overview Paradigm: human and service interactions
[Figure legend: software service, user, human/service interaction]
• BPEL4People/WS-HT
  – User-driven versus modeled tasks in workflow
• Crowdsourcing
  – Human Intelligence Tasks (e.g., Amazon Mechanical Turk)
  – No collaboration link between humans
Motivation: Human computation/SOA
Modeling of human interactions in dynamic service-oriented systems
Reputation mechanism and expertise ranking in large-scale systems
[Figure labels: process flow, Web services, people activity/human task, knowledge sharing platform, requester, tasks (task1, task2, task3, task4)]
[Figure labels: Definition, Discovery, HPS, Interactions]
Schall et al. (2008), Unifying Human and Software Services in Web-Scale Collaborations, IEEE Computer
Human Provided Service: Crowdsourcing Example
Overview Metrics
• Classification of Metrics
Schall (2009), Human Interactions in Mixed Systems - Architecture, Protocols, and Algorithms (PhD Thesis)
Challenges
• How to find the most relevant expert?
• How to calculate the expertise of people in an automated manner?
• How to account for changing interests and the skill level in different fields of interest?
My Approach
• Dynamic Skill and Activity-based PageRank
• Interaction mining using link-intensity weights
• Personalization based on interaction context
• Aggregated importance using query terms
• (1) Logging interactions
• (2) Create interaction graph (offline)
• (3) Aggregate ranking results based on preferences (online)
Discovery and Ranking
Expert Seeker (e.g., Crowdsourcing engine)
Schall (2009), Human Interactions in Mixed Systems - Architecture, Protocols, and Algorithms
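Steps (1) and (2) can be sketched as follows. This is a minimal illustration, not the HPS implementation: the log record layout (sender, receiver, context tag), the function name `build_interaction_graph`, and the use of raw interaction counts as link weights are all assumptions made for the example.

```python
from collections import Counter


def build_interaction_graph(log):
    """Aggregate logged interactions into weighted, context-tagged links.

    log: iterable of (sender, receiver, context_tag) records (hypothetical layout).
    Returns {(sender, receiver, context_tag): count}. The count stands in for
    the link weight, which is an assumption of this sketch.
    """
    # Self-interactions carry no information about other users, so skip them.
    return dict(Counter((s, r, tag) for s, r, tag in log if s != r))


# Step (1): a tiny, made-up interaction log.
log = [
    ("user1", "user2", "WS-Addressing"),
    ("user1", "user2", "WS-Addressing"),
    ("user1", "user3", "WS-Policy"),
    ("user2", "user2", "WS-Policy"),  # self-interaction, ignored
]

# Step (2): build the interaction graph offline.
graph = build_interaction_graph(log)
```

Keeping the context tag in the edge key means the same log can later be sliced per context without logging anything twice.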
Ranking Algorithm: Random surfer model
[Figure: Web graph with example link-following probabilities 1/2 and 1/3. Legend: node, surfer, Web link]
With a certain probability, I will jump (“teleport”) to a random Web page.
Page et al. (1999), The PageRank Citation Ranking: Bringing Order to the Web.
$$PR(u) = \alpha \sum_{v \in inlinks(u)} \frac{PR(v)}{|outlinks(v)|} + (1 - \alpha)\,\frac{1}{N}$$
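The random surfer model can be sketched with a short power iteration. This is an illustrative sketch, not code from the talk: the name `alpha` for the probability of following a link (so `1 - alpha` is the teleport probability), the default value 0.85, the fixed iteration count, and the even redistribution of rank from pages without out-links are assumptions.

```python
def pagerank(outlinks, alpha=0.85, iterations=50):
    """Basic PageRank by power iteration.

    outlinks: {page: [pages it links to]}.
    alpha: probability of following a link; 1 - alpha is the teleport probability.
    """
    nodes = sorted(set(outlinks) | {t for ts in outlinks.values() for t in ts})
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iterations):
        # Teleport term: every page receives (1 - alpha) / N.
        new_rank = {u: (1.0 - alpha) / n for u in nodes}
        for v in nodes:
            targets = outlinks.get(v, [])
            if targets:
                # Follow one of v's out-links uniformly at random.
                share = alpha * rank[v] / len(targets)
                for u in targets:
                    new_rank[u] += share
            else:
                # Page without out-links: spread its rank over all pages.
                share = alpha * rank[v] / n
                for u in nodes:
                    new_rank[u] += share
        rank = new_rank
    return rank


web = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(web)
```

In this toy graph, page `c` receives links from both `a` and `b`, so it ends up ranked above `b`.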
Ranking Algorithm: Behavior model
[Figure: interaction graph with users 1 to 6 and weighted links w1,2, w1,3, and w2,4. Legend: document, user, link]
I will contact User 2 depending on the link weight w1,2. The link weight is based on strength and intensities of interactions.
I will contact some other user. For example, to start a new collaboration by relaying a message.
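The behavior model differs from the random surfer only in how the next contact is chosen: in proportion to link weight instead of uniformly. A minimal sketch under stated assumptions follows; the function name `intensity_rank`, the edge dictionary layout, and the parameter `alpha` (the probability of contacting a linked user rather than some other user) are hypothetical and not taken from the paper.

```python
def intensity_rank(weights, alpha=0.85, iterations=50):
    """Weighted PageRank over an interaction graph.

    weights: {(v, u): w} where w is the interaction intensity from user v to user u.
    With probability alpha a user contacts a neighbor in proportion to the
    link weight; otherwise some other user is contacted uniformly at random.
    """
    edges = {e: w for e, w in weights.items() if w > 0}
    nodes = sorted({x for edge in edges for x in edge})
    n = len(nodes)
    # Total outgoing intensity per user, used to normalize link weights.
    out_total = {v: 0.0 for v in nodes}
    for (v, _u), w in edges.items():
        out_total[v] += w
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iterations):
        # Users without outgoing links pass their rank on uniformly.
        dangling = sum(rank[v] for v in nodes if out_total[v] == 0.0)
        new_rank = {u: (1.0 - alpha) / n + alpha * dangling / n for u in nodes}
        for (v, u), w in edges.items():
            new_rank[u] += alpha * rank[v] * w / out_total[v]
        rank = new_rank
    return rank


# User 1 interacts three times as intensely with User 2 as with User 3.
interactions = {(1, 2): 3.0, (1, 3): 1.0, (2, 4): 1.0, (3, 1): 1.0, (4, 1): 1.0}
scores = intensity_rank(interactions)
```

Because w1,2 is larger than w1,3 in this made-up example, User 2 receives more of User 1's importance than User 3 does.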
Ranking Algorithm: Interaction context
• Users interact in different contexts with different intensities
[Figure: interaction intensity per context. Context 1 (e.g., topic = WS-Addressing), context 2 (e.g., topic = WS-Policy)]
• Personalize ranking (i.e., expertise) for different contexts
Context-dependent DSARank
• (1) Identify context of interactions (“tags”)
• (2) Select relevant links and people
• (3) Create weighted subgraph (for context)
• (4) Perform mining
[Figure: context 1 subgraph with users 1 to 4 and links w1,2, w1,3, w2,4, yielding User 1’s expertise in context 1; context 2 subgraph with users 1, 3, 4 and links w1,3, w1,4, yielding User 1’s expertise in context 2]
$$DSA(u; C') = \sum_{c \in C'} w_c \, DSA_c(u), \qquad p(u) = w_1 p_1(u) + \dots + w_n p_n(u)$$
Calculated offline
E.g., p(u) = w1 IIL(u) + w2 availability(u)
Combined online based on preferences
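The split between offline and online work on this slide can be sketched as two functions: one ranks users inside a single context subgraph (steps 1 to 4), the other combines the per-context results using preference weights. This is a sketch under assumptions, not the DSARank implementation: the edge tuple layout, the function names `rank_in_context` and `combine_contexts`, the parameter `alpha`, and the normalization of preference weights are all hypothetical.

```python
def rank_in_context(edges, context, personalization=None, alpha=0.85, iterations=50):
    """Offline part: rank users within one context.

    edges: list of (sender, receiver, context_tag, weight) tuples (assumed layout).
    personalization: optional {user: p(u)}; uniform over the subgraph if omitted.
    """
    # (1) + (2): keep only links tagged with the requested context.
    sub = {}
    for v, u, tag, w in edges:
        if tag == context and w > 0:
            sub[(v, u)] = sub.get((v, u), 0.0) + w
    nodes = sorted({x for edge in sub for x in edge})
    if not nodes:
        return {}
    # Normalize p(u) over the users that appear in this context.
    raw = {u: (personalization or {}).get(u, 0.0) for u in nodes}
    total = sum(raw.values())
    p = {u: (raw[u] / total if total > 0 else 1.0 / len(nodes)) for u in nodes}
    # (3): weighted subgraph, with total outgoing weight per user.
    out_total = {v: 0.0 for v in nodes}
    for (v, _u), w in sub.items():
        out_total[v] += w
    # (4): mining by power iteration.
    rank = dict(p)
    for _ in range(iterations):
        dangling = sum(rank[v] for v in nodes if out_total[v] == 0.0)
        new_rank = {u: ((1.0 - alpha) + alpha * dangling) * p[u] for u in nodes}
        for (v, u), w in sub.items():
            new_rank[u] += alpha * rank[v] * w / out_total[v]
        rank = new_rank
    return rank


def combine_contexts(per_context, preferences):
    """Online part: weighted sum of per-context ranks for the contexts of interest."""
    total = sum(preferences.values())
    if total <= 0:
        raise ValueError("preference weights must sum to a positive value")
    users = {u for c in preferences for u in per_context.get(c, {})}
    return {
        u: sum(w / total * per_context.get(c, {}).get(u, 0.0)
               for c, w in preferences.items())
        for u in users
    }


# Made-up edges mirroring the two context subgraphs on the slide.
edges = [
    (1, 2, "context1", 2.0), (1, 3, "context1", 1.0), (2, 4, "context1", 1.0),
    (1, 3, "context2", 1.0), (1, 4, "context2", 3.0),
]
per_context = {c: rank_in_context(edges, c) for c in ("context1", "context2")}
combined = combine_contexts(per_context, {"context1": 0.5, "context2": 0.5})
```

Only `combine_contexts` has to run at query time, which is why the per-context ranks can be precomputed and cached.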
Results
• Real dataset (Email)
• High interaction intensity reveals key people
• Best informed users

ID    Rank (DSA)   Rank (PR)   Intensity Level
37    1            21          7.31
...
253   4            170         2.07
347   5            282         1.39
(see paper for detailed experiment results)
Conclusion
• Crowdsourcing gains popularity
  – Amazon Mechanical Turk
  – Recognition from scientific community
• Human-Provided Services
  – Supporting versatile crowdsourcing scenarios
• Context-sensitive expertise
  – Important in collaborative crowd environments
  – Based on topic-sensitive interaction mining
Thanks for your attention!
Daniel Schall
Vienna University of Technology
http://www.infosys.tuwien.ac.at/staff/dschall/
http://en.wikipedia.org/wiki/The_Turk