cse509 lecture 6
Upload: web-science-research-group-at-institute-of-business-administration-karachi-pakistan
Post on 13-May-2015
Lecture 6 of CSE509: Web Science and Technology Summer Course
Arjumand Younus
Web Science Research Group
Institute of Business Administration (IBA)
CSE509: Introduction to Web Science and Technology
Lecture 6: Social Information Retrieval
Last Time…
Transition from Web 1.0 to Web 2.0
Social Media Characteristics
Part I: Theoretical Aspects
- Social Networks as a Graph
- Properties of Social Networks
Part II: Getting Hands-On Experience on Social Media Analytics
- Twitter Data Hacks
Part III: Example Researches
August 13, 2011
Today
Role of Today’s Web: Changing the way Information Needs are Satisfied
Social Search
Research Case by Microsoft Research: What do People Ask their Social Networks
Techniques for Influence Analysis in Social Networks
Role of Today’s Web
Marketing Tool
Information Finding Tool
Media Tool
New Dimensions in Search with The Social Web
Information Overload
- Search engines don’t always hold the answers that users are looking for
Smart Search (CNN Money)
- “The Web, they say, is leaving the era of search and entering one of discovery. What’s the difference? Search is what you do when you’re looking for something. Discovery is when something wonderful that you didn’t know existed, or didn’t know how to ask for, finds you.”
What does that mean for search engines? Will they be left behind?
Social Search
Takes into account the “social graph” of the person initiating the query
Search activity in which users pose a question to their social networks
Search systems using statistical analytics over traces left behind by others
Conducting a search over an existing database of content previously provided by other users, such as searching over the collection of public Twitter posts or searching through an archive of questions and answers
Social Search Benefits
Reduced impact of link spam, owing to less reliance on the link structure of Web pages
Increased relevance, since each result is selected by users
Web page relevance judged from the reader’s perspective rather than the author’s
More current results through constant feedback
Improvements achieved by social search have not been quantified so far
What Do People Ask Social Networks?
Meredith Ringel Morris, MSR
Jaime Teevan, MSR
Katrina Panovich, MIT
Questions about People’s Questions
What questions do people ask?
- How are the questions phrased?
- What are the question types and topics?
- Who asks which questions and why?
Which questions get answered?
- How is answer speed and utility perceived?
- What are people’s motivations for answering?
Survey of Asking via Status Messages
Survey content
- Used a status message to ask a question?
  Frequency of asking, question type, responses received
  Provide an example
- Answered a status message question?
  Why or why not?
  Provide an example
624 participants
- Focus on Facebook and Twitter behavior
Questions: Types
Type                %     Example
Recommendation      29%   Building a new playlist – any ideas for good running songs?
Opinion             22%   I am wondering if I should buy the Kitchen-Aid ice cream maker?
Factual             17%   Anyone know a way to put Excel charts into LaTeX?
Rhetorical          14%   Why are men so stupid?
Invitation          9%    Who wants to go to Navya Lounge this evening?
Favor               4%    Need a babysitter in a big way tonight… anyone??
Social connection   3%    I am hiring in my team. Do you know anyone who would be interested?
Offer               1%    Could any of my friends use boys size 4 jeans?
Questions: Topics
Topic            %     Example
Technology       29%   Anyone know if WOW works on Windows 7?
Entertainment    17%   Was seeing Up in the theater worth the money?
Home & Family    12%   So what’s the going rate for the tooth fairy?
Professional     11%   Which university is better for Masters? Cornell or Georgia Tech?
Places           8%    Planning a trip to Whistler in the off-season. Recommendation on sites to see?
Restaurants      6%    Hanging in Ballard tonight. Dinner recs?
Current events   5%    What is your opinion on the recent proposition that was passed in California?
Shopping         5%    What’s a good Mother’s Day gift?
Philosophy       2%    What would you do if you had a week to live?
Missing: Health, Religion, Politics, Dating, and Finance
Questions: Who Asks What
[Figure: question types (Recommendation, Opinion, Factual, Rhetorical, Invitation, Favor, Social connection, Offer) and topics (Technology, Entertainment, Home & Family, Professional, Places, Restaurants, Current events, Shopping, Philosophy) annotated by who asks them: men, women, old, young. The mapping itself is not recoverable from the transcript.]
Questions: Motives for Asking
Motive                     %       Example
Trust                      24.8%   I trust my friends more than I trust strangers.
Subjective                 21.5%   Search engine can provide data but not an opinion.
Thinks search would fail   15.2%   I’m pretty sure a search engine couldn’t answer a question of that nature.
Audience                   14.9%   Friends with kids, first hand real experience.
Connect                    12.4%   I wanted my friends to know I was asking the question.
Speed                      6.6%    Quick response time, no formalities.
Context                    5.4%    Friends know my tastes.
Tried search               5.4%    I tried searching and didn’t get good results.
Easy                       5.4%    Didn’t want to look through multiple search results.
Quality                    4.1%    Human-vetted responses.
Answers: Speed and Utility
94% of questions received an answer
Answer speed
- A quarter in 30 minutes, almost all in a day
- People expected faster, but satisfied with speed
- Shorter questions got more useful responses
Answer utility
- 69% of responses helpful
Answers: Speed and Utility
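[Figure: question types (Recommendation, Opinion, Factual, Rhetorical, Invitation, Favor, Social connection, Offer) and topics (Technology, Entertainment, Home & Family, Professional, Places, Restaurants, Current events, Shopping, Philosophy) annotated by answer outcome. Legend: Fast, Unhelpful, No correlation. The mapping itself is not recoverable from the transcript.]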
Answers: Motives for Answering
Motive           %      Example
Altruism         37.0   Just trying to be helpful.
Expertise        31.9   If I’m an expert in the area.
Question         15.4   Interest in the topic.
Relationship     13.7   If I know and like the person.
Connect          13.5   Keeps my network alive.
Free time        12.3   Boredom/procrastination.
Social capital   10.5   I will get help when I need it myself.
Obligation       5.4    A tit-for-tat.
Humor            3.7    Thinking I might have a witty response.
Ego              3.4    Wish to seem knowledgeable.
Motives for Not Answering
- Don’t know the answer
- Private topic
- Question impersonal
Answers About People’s Questions
The questions people ask
- Short, directed to “anyone”
- Subjective questions on acceptable topics
- Social relationships important motivators
The questions that get answered
- Fast, helpful responses, related to length and type
- Answers motivated by altruism and expertise
Enhancing Search using Social Network Features
Recency Crawling and Ranking
- Identification of Hot Topics on Social Web [YQG+11]
News in the Making
- Trend analysis
- Event detection
Real-Time Search
Information Diffusion and Influence Analysis
Community Detection
Opinion Mining
Nodes, Ties and Influence
Importance of Nodes
Not all nodes are equally important
Centrality Analysis
- Find out the most important nodes in one network
Commonly-used Measures
- Degree Centrality
- Closeness Centrality
- Betweenness Centrality
- Eigenvector Centrality
Degree Centrality
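The importance of a node is determined by the number of nodes adjacent to it
The larger the degree, the more important the node is
Only a small number of nodes have high degrees in many real-life networks
Degree Centrality: $C_D(v_i) = d_i = \sum_j A_{ij}$, where $d_i$ is the degree of node $v_i$ and $A$ is the adjacency matrix
Normalized Degree Centrality: $C'_D(v_i) = d_i / (n - 1)$, where $n$ is the number of nodes
For node 1, degree centrality is 3; normalized degree centrality is 3/(9-1) = 3/8.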
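A minimal sketch of the degree computation in Python. The slide's figure did not survive extraction, so the 9-node edge list below is a hypothetical graph, chosen only so that node 1 has three neighbours as in the example; the function names are illustrative as well.

```python
from collections import defaultdict

# Hypothetical 9-node undirected graph (NOT the slide's figure),
# chosen so that node 1 has exactly three neighbours.
EDGES = [(1, 2), (1, 3), (1, 4), (2, 3), (3, 4), (4, 5), (4, 6),
         (5, 6), (5, 7), (5, 8), (6, 7), (7, 8), (7, 9)]


def build_adjacency(edges):
    """Map each node to the set of its neighbours."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


def degree_centrality(adj):
    """Degree centrality: the number of adjacent nodes."""
    return {node: len(neighbours) for node, neighbours in adj.items()}


def normalized_degree_centrality(adj):
    """Degree divided by the maximum possible degree, n - 1."""
    n = len(adj)
    return {node: len(neighbours) / (n - 1) for node, neighbours in adj.items()}


adj = build_adjacency(EDGES)
print(degree_centrality(adj)[1])             # 3
print(normalized_degree_centrality(adj)[1])  # 0.375, i.e. 3/8
```

Dividing by n - 1 makes scores comparable across networks of different sizes, since n - 1 is the largest degree a node can have in a simple graph.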
Closeness Centrality
“Central” nodes are important, as they can reach the whole network more quickly than non-central nodes
Importance measured by how close a node is to other nodes
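Average Distance: $D_{avg}(v_i) = \frac{1}{n-1} \sum_{j \neq i} g(v_i, v_j)$, where $g(v_i, v_j)$ is the shortest-path distance between $v_i$ and $v_j$
Closeness Centrality: $C_C(v_i) = \left[ \frac{1}{n-1} \sum_{j \neq i} g(v_i, v_j) \right]^{-1} = \frac{n-1}{\sum_{j \neq i} g(v_i, v_j)}$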
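A small Python sketch of closeness, assuming the usual definition: the inverse of a node's average shortest-path distance to all other nodes. The edge list is a hypothetical 9-node graph rather than the slide's figure, and distances come from a plain breadth-first search, which is enough for unweighted networks.

```python
from collections import defaultdict, deque

# Hypothetical 9-node undirected graph (NOT the slide's figure).
EDGES = [(1, 2), (1, 3), (1, 4), (2, 3), (3, 4), (4, 5), (4, 6),
         (5, 6), (5, 7), (5, 8), (6, 7), (7, 8), (7, 9)]


def build_adjacency(edges):
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


def bfs_distances(adj, source):
    """Shortest-path (hop) distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def closeness_centrality(adj, node):
    """(n - 1) / sum of distances; 0.0 if some node is unreachable."""
    dist = bfs_distances(adj, node)
    n = len(adj)
    if len(dist) < n:
        return 0.0  # assumption: treat disconnected graphs as closeness 0
    return (n - 1) / sum(dist.values())


adj = build_adjacency(EDGES)
print(round(closeness_centrality(adj, 4), 3))  # 0.615 (8/13)
print(round(closeness_centrality(adj, 3), 3))  # 0.471 (8/17)
```

In this hypothetical graph, node 4 sits between the two halves of the network and therefore scores higher than node 3.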
Closeness Centrality Example
Node 4 is more central than node 3
Betweenness Centrality
Node betweenness counts the number of shortest paths that pass through a node
Nodes with high betweenness are important in communication and information diffusion
Betweenness Centrality: $C_B(v_i) = \sum_{s \neq v_i \neq t,\ s < t} \frac{\sigma_{st}(v_i)}{\sigma_{st}}$
$\sigma_{st}$: the number of shortest paths between s and t
$\sigma_{st}(v_i)$: the number of shortest paths between s and t that pass through $v_i$
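One way to compute betweenness on an unweighted, undirected network is Brandes' algorithm, sketched below in Python. This is an illustrative choice rather than the method used in the lecture, and the edge list is again a hypothetical 9-node graph. For each source s, a breadth-first search counts shortest paths, and a backward pass accumulates each node's share of the paths between s and every target t.

```python
from collections import defaultdict, deque

# Hypothetical 9-node undirected graph (NOT the slide's figure).
EDGES = [(1, 2), (1, 3), (1, 4), (2, 3), (3, 4), (4, 5), (4, 6),
         (5, 6), (5, 7), (5, 8), (6, 7), (7, 8), (7, 9)]


def build_adjacency(edges):
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


def betweenness_centrality(adj):
    """Unnormalized betweenness, counting each unordered pair (s, t) once."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        sigma = {v: 0 for v in adj}   # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        preds = {v: [] for v in adj}  # predecessors on shortest paths
        order = []                    # nodes in non-decreasing distance
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
                if dist[w] == dist[u] + 1:
                    sigma[w] += sigma[u]
                    preds[w].append(u)
        # Backward pass: push path dependencies from far nodes towards s.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for u in preds[w]:
                delta[u] += sigma[u] / sigma[w] * (1.0 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was visited from both endpoints, so halve.
    return {v: c / 2.0 for v, c in score.items()}


bc = betweenness_centrality(build_adjacency(EDGES))
print(round(bc[4], 6))  # 15.0: every path from {1,2,3} to {5,...,9} crosses node 4
print(bc[9])            # 0.0: a leaf lies on no shortest path between others
```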
Betweenness Centrality Example
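[Figure: worked betweenness example on a sample network, not recoverable from the transcript. Only the legend survives: the number of shortest paths between s and t, and the number of shortest paths between s and t that pass through vi.]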
Eigenvector Centrality
One’s importance is determined by the importance of one’s friends
If one has many important friends, one should be important as well.
The centrality corresponds to the principal eigenvector of the adjacency matrix A, that is, the solution x of $A x = \lambda x$ for the largest eigenvalue $\lambda$.
A variant of this eigenvector centrality is the PageRank score.
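The leading eigenvector can be approximated by power iteration: repeatedly replace each node's score with the sum of its neighbours' scores, then renormalize. The Python sketch below assumes a connected, non-bipartite undirected graph (otherwise plain power iteration may not converge). The edge list, iteration cap, and tolerance are hypothetical choices for illustration.

```python
import math
from collections import defaultdict

# Hypothetical 9-node undirected graph (NOT the slide's figure).
EDGES = [(1, 2), (1, 3), (1, 4), (2, 3), (3, 4), (4, 5), (4, 6),
         (5, 6), (5, 7), (5, 8), (6, 7), (7, 8), (7, 9)]


def build_adjacency(edges):
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


def eigenvector_centrality(adj, max_iter=1000, tol=1e-10):
    """Power iteration on the adjacency matrix, L2-normalized each step."""
    x = {v: 1.0 for v in adj}
    for _ in range(max_iter):
        # One multiplication by A: each node sums its neighbours' scores.
        new = {v: sum(x[u] for u in adj[v]) for v in adj}
        norm = math.sqrt(sum(val * val for val in new.values()))
        new = {v: val / norm for v, val in new.items()}
        if max(abs(new[v] - x[v]) for v in adj) < tol:
            return new
        x = new
    return x  # best estimate if the tolerance was not reached


scores = eigenvector_centrality(build_adjacency(EDGES))
for node in sorted(scores, key=scores.get, reverse=True):
    print(node, round(scores[node], 3))
```

A node with few but well-connected neighbours can outrank a node with more but peripheral neighbours, which is the difference from plain degree centrality.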
Weak and Strong Ties
In practice, connections are not of the same strength
Interpersonal social networks are composed of strong ties (close friends) and weak ties (acquaintances)
Strong ties and weak ties play different roles for community formation and information diffusion
Strength of Weak Ties (Granovetter, 1973)
- Occasional encounters with distant acquaintances can provide important information about new opportunities for job search
Connections in Social Media
• Social Media allows users to connect to each other more easily than ever
  - One user might have thousands of friends online
  - Who are the most important ones among your 300 Facebook friends?
• Imperative to estimate the strengths of ties for advanced analysis
  - Analyze network topology
  - Learn from User Profiles and Attributes
Learning from Network Topology
Bridges connecting two different communities are weak ties
An edge is a bridge if its removal results in disconnection of its terminal nodes
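[Figure: two example networks, not recoverable from the transcript. In the first, e(2,5) is a bridge; in the second, e(2,5) is NOT a bridge.]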
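The bridge test can be sketched directly from the definition: ignore the edge and check whether its two endpoints can still reach each other. The two small graphs below are hypothetical stand-ins for the slide's figures (two triangles joined by e(2,5), with and without an extra detour); they are not the original networks.

```python
from collections import defaultdict, deque

# Hypothetical graphs (NOT the slide's figures): two triangles joined by (2, 5).
TWO_COMMUNITIES = [(1, 2), (1, 3), (2, 3), (4, 5), (4, 6), (5, 6), (2, 5)]
# Same graph plus a detour 3-7-6 that offers a second route between the triangles.
WITH_DETOUR = TWO_COMMUNITIES + [(3, 7), (7, 6)]


def build_adjacency(edges):
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


def is_bridge(adj, u, v):
    """True if removing edge (u, v) leaves u and v disconnected."""
    seen = {u}
    queue = deque([u])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if {x, y} == {u, v}:
                continue  # pretend the edge has been removed
            if y not in seen:
                seen.add(y)
                queue.append(y)
    return v not in seen


print(is_bridge(build_adjacency(TWO_COMMUNITIES), 2, 5))  # True
print(is_bridge(build_adjacency(WITH_DETOUR), 2, 5))      # False
```

This check costs one graph traversal per edge; dedicated linear-time bridge-finding algorithms exist for testing all edges at once.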
“shortcut” Bridge
Bridges are rare in real-life networks
Alternatively, one can relax the definition by checking whether the distance between the two terminal nodes increases if the edge is removed
The larger the distance, the weaker the tie is
Example:
- d(2,5) = 4 if e(2,5) is removed
- d(5,6) = 2 if e(5,6) is removed
- e(5,6) is a stronger tie than e(2,5)
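The relaxed definition amounts to a shortest-path query with one edge masked out. The Python sketch below uses a hypothetical 7-node graph built to reproduce the distances quoted above; it is not the slide's figure. A true bridge would yield an infinite distance.

```python
import math
from collections import defaultdict, deque

# Hypothetical graph (NOT the slide's figure): two triangles {1,2,3} and
# {4,5,6} joined by edge (2, 5) and by the detour 3-7-6.
EDGES = [(1, 2), (1, 3), (2, 3), (4, 5), (4, 6), (5, 6),
         (2, 5), (3, 7), (7, 6)]


def build_adjacency(edges):
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


def distance_without_edge(adj, u, v):
    """Hop distance from u to v when edge (u, v) is ignored.

    Returns math.inf if the edge is a true bridge.
    """
    dist = {u: 0}
    queue = deque([u])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if {x, y} == {u, v}:
                continue  # skip the edge under test
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist.get(v, math.inf)


adj = build_adjacency(EDGES)
print(distance_without_edge(adj, 2, 5))  # 4, via 2-3-7-6-5
print(distance_without_edge(adj, 5, 6))  # 2, via 5-4-6
```

A larger detour means the edge is closer to being a bridge, hence a weaker tie: here e(2,5) is weaker than e(5,6).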
Neighborhood Overlap
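Tie Strength can be measured based on neighborhood overlap; the larger the overlap, the stronger the tie is
$overlap(v_i, v_j) = \frac{|N_i \cap N_j|}{|N_i \cup N_j| - 2}$, where $N_i$ is the set of neighbors of $v_i$
The -2 in the denominator is to exclude $v_i$ and $v_j$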
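A short Python sketch of neighborhood overlap, assuming the common form: shared neighbours divided by the size of the combined neighbourhood minus 2. The subtraction removes the two endpoints, which appear in each other's neighbour sets when they are connected. The graph is a hypothetical example, not the slide's figure.

```python
from collections import defaultdict

# Hypothetical graph (NOT the slide's figure): two triangles {1,2,3} and
# {4,5,6} joined by edge (2, 5) and by the detour 3-7-6.
EDGES = [(1, 2), (1, 3), (2, 3), (4, 5), (4, 6), (5, 6),
         (2, 5), (3, 7), (7, 6)]


def build_adjacency(edges):
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


def neighborhood_overlap(adj, u, v):
    """Overlap of the neighbourhoods of two ADJACENT nodes u and v."""
    if v not in adj[u]:
        raise ValueError("u and v must share an edge")
    shared = len(adj[u] & adj[v])
    combined = len(adj[u] | adj[v]) - 2  # exclude u and v themselves
    if combined == 0:
        return 0.0  # assumption: neither endpoint has any other neighbour
    return shared / combined


adj = build_adjacency(EDGES)
print(neighborhood_overlap(adj, 2, 5))  # 0.0: no common friends, a weak tie
print(neighborhood_overlap(adj, 1, 2))  # 0.5: node 3 is shared
```

The edge between the two triangles scores zero, matching the intuition that ties spanning communities are weak.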
Learning from Profiles and Interactions
Twitter: one can follow others without the followee’s confirmation
- The real friendship network is determined by how frequently two users talk to each other, rather than by the follower-followee network
- The real friendship network is more influential in driving Twitter usage
Strengths of ties can be predicted accurately based on various information from Facebook
- Friend-initiated posts, messages exchanged in wall posts, number of mutual friends, etc.
Learning numeric link strength by maximum likelihood estimation
- User profile similarity determines the strength
- Link strength in turn determines user interaction
- Maximize the likelihood based on observed profiles and interactions