Microblogs: Information and Social Network Huang Yuxin

Upload: cecily-miles

Post on 13-Jan-2016


TRANSCRIPT

Page 1: Microblogs: Information and Social Network Huang Yuxin

Microblogs: Information and Social Network

Huang Yuxin

Page 2:

Millions of users in Microblogs

• By July 2009, Twitter had attracted 41 million users.

• By March 2011, Twitter had grown to 175 million users.

• The number of registered IDs on Sina Microblog had reached 100 million by March 2011.

Page 3:

People can publish posts and share information on Microblogs

Page 4:

Social network in Microblogs

Page 5:

What information can we extract from Microblogs?

• Plain text
• User reference (in 1/2 of posts)
• Hashtag (in 1/9 of posts)
• Retweet
• Emoticons
• Shortened URL (resource) (in 1/2 of posts)
• Time
• Users’ geographic info
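The features listed above can be pulled out of a post with a few regular expressions. The following is a minimal sketch, not the method used in the talk: the patterns, the `extract_features` name, and the "RT @user" retweet convention are all assumptions made for illustration, and real posts need more careful tokenization.

```python
import re

# Hypothetical patterns for illustration only.
MENTION = re.compile(r"@(\w+)")
HASHTAG = re.compile(r"#(\w+)")
URL = re.compile(r"https?://\S+")
RETWEET = re.compile(r"^RT @(\w+)")
EMOTICON = re.compile(r"[:;=][-']?[)(DPp]")


def extract_features(text):
    """Return the basic entities found in one post."""
    urls = URL.findall(text)
    # Drop URLs first so that "#fragment" or "@" inside a link is not counted.
    stripped = URL.sub(" ", text)
    retweet = RETWEET.match(text)
    return {
        "mentions": MENTION.findall(stripped),
        "hashtags": HASHTAG.findall(stripped),
        "urls": urls,
        "retweet_of": retweet.group(1) if retweet else None,
        "emoticons": EMOTICON.findall(stripped),
    }


post = "RT @alice: Big #earthquake near the coast :-( http://t.co/abc #news"
features = extract_features(post)
```

Post time and geographic info are not in the text itself; they would come from the post's metadata, which this sketch does not model.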

Page 6:

Basic features from the text of Twitter

Tiny URL

Users Post Time

Emoticons

Hashtag

Mention (User reference)

Page 7:

What is Twitter (WWW 2010)

• People who are more active tend to have more followers
• The case is different for people with very high popularity (because they are celebrities)

Page 8:

Small World

• Average Path length of Twitter: 4.12
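For reference, an average path length can be computed with a breadth-first search from every node. This is a minimal sketch on a toy follow graph, assuming the figure means the mean shortest-path distance over reachable ordered pairs; the slide does not say how 4.12 was measured, and `average_path_length` is a hypothetical helper name.

```python
from collections import deque


def average_path_length(adj):
    """Mean shortest-path length over all reachable ordered pairs.

    adj maps each user to the list of users they follow (assumed layout).
    """
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nxt in adj.get(node, ()):
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        total += sum(dist.values())
        pairs += len(dist) - 1  # exclude the source itself
    return total / pairs if pairs else 0.0


# Toy graph: a follows b, b follows c, c follows a.
toy = {"a": ["b"], "b": ["c"], "c": ["a"]}
toy_length = average_path_length(toy)
```

Running one search per node is only practical on small graphs; at Twitter scale the distances would have to be estimated from a sample of source nodes.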

Page 9:

Reciprocity?

(Whole dataset)

• 77.9% of user pairs with any link between them are connected one-way.

• And 67.6% of users are not followed by any of their followings.

• The rate of reciprocity is higher in Asian countries than in America.

• (WWW 2010)

(Part of active users)

• 72.4% of the users in Twitter follow more than 80% of their followers

• 80.5% of users have 80% of the users they are following follow them back

• (WSDM 2010)

The difference in conclusions between these two papers is caused by different data extraction methods.
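The two whole-dataset statistics can be sketched on an edge list as follows. This is an illustration under assumptions, not either paper's procedure: an edge `(u, v)` is taken to mean "u follows v", the second ratio is computed only over users who follow at least one account, and `reciprocity_stats` is a hypothetical name.

```python
def reciprocity_stats(edges):
    """Return (share of linked pairs that are one-way,
    share of users not followed back by anyone they follow)."""
    edge_set = {(u, v) for u, v in edges if u != v}

    # Unordered pairs that have at least one link between them.
    pairs = {frozenset(edge) for edge in edge_set}
    one_way = 0
    for pair in pairs:
        u, v = tuple(pair)
        if ((u, v) in edge_set) != ((v, u) in edge_set):
            one_way += 1

    # Group followings by user.
    followings = {}
    for u, v in edge_set:
        followings.setdefault(u, set()).add(v)
    not_followed_back = sum(
        1
        for u, targets in followings.items()
        if not any((v, u) in edge_set for v in targets)
    )
    return one_way / len(pairs), not_followed_back / len(followings)


# a and b follow each other; a follows c; d follows a.
toy_edges = [("a", "b"), ("b", "a"), ("a", "c"), ("d", "a")]
one_way_share, no_followback_share = reciprocity_stats(toy_edges)
```

Because both ratios depend on which users and edges are sampled, running the same code on a full crawl and on a set of active users can give very different numbers.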

Page 10:

Celebrities and Popular Topics

Page 11:
Page 12:

Users’ participation in topics

• A topic can only attract a certain group of users

Page 13:

Content types on Twitter

• Daily Chatter
• Conversations
• Sharing Information
• Reporting and Spreading News

Page 14:

Understanding following behavior: statistics reported in a paper

• Why we follow: professional interest, technology, tone of presentation, keeping up with friends

• Why we unfollow: too many posts in general, too much status/personal info, spam, duplicate posts.

Page 15:

Interesting Research Topics on Twitter

• Vertical Search on Twitter (partial indexing + time-sensitive information retrieval)

• Static Topic Detection (topic model)
• Burst Event Detection (topic specific)
• Topic-Biased Expert Recommendation (graph feature + activeness + textual feature)
• Cascading Feature Analysis (network structure + topic spreading behavior on different topics)

Page 16:

Related Works

Page 17:

People I need to follow vs. Content I need to know

[Figure labels: TWEET, Listen]

Page 18:

People I need to follow vs. Content I need to know

• An active publisher may have an interest in many topics

• My page is always filled with the latest chatter, which has little value

• I may only need to subscribe to certain topics of an author

• Can we automatically classify one’s content and filter out the irrelevant posts?

Page 19:

Topics spread through the network

[Figure: the label "EARTHQUAKE" repeated four times across the diagram]

Page 20:

Detecting hot topics with community

• Temporal features of keywords
• Hot topics are biased toward a group of users or a certain time period

• Retweet trees and social networks, together with users’ expertise, can all participate in the model training
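One simple temporal feature of a keyword is how far its count in a time window rises above its usual level. The sketch below flags windows whose count exceeds the mean by a chosen number of standard deviations. It is only an assumed baseline, not the model proposed here: `bursty_windows`, the fixed-size windows, and the default threshold are illustrative choices, and it ignores the community and network signals mentioned above.

```python
from collections import Counter
from statistics import mean, pstdev


def bursty_windows(timestamps, window, threshold=2.0):
    """Return start times of windows where a keyword's count is unusually high.

    timestamps: times (in seconds) of posts containing the keyword.
    window: window size in seconds.
    """
    if not timestamps:
        return []
    buckets = Counter(int(t // window) for t in timestamps)
    first, last = min(buckets), max(buckets)
    # Include empty windows so quiet periods lower the baseline.
    series = [buckets.get(i, 0) for i in range(first, last + 1)]
    mu, sigma = mean(series), pstdev(series)
    return [
        (first + i) * window
        for i, count in enumerate(series)
        if count > mu + threshold * sigma
    ]


# One mention per hour for ten hours, plus 19 extra mentions in hour 5.
hour = 3600
times = [h * hour for h in range(10)] + [5 * hour + i for i in range(1, 20)]
bursts = bursty_windows(times, hour)
```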

Page 21:

Topic Model with Network Regularization (WWW 08)

[Figure labels: e.g. coauthor network; Document d; 1, 2, ..., k; keyword list]

O(C,G) = L(C) + R(G,C)
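The slide does not define the two terms of O(C,G). As a rough sketch, assume L(C) is a topic-model loss on the collection C (for example a negative log-likelihood) and R(G,C) penalizes linked documents in the network G whose topic distributions differ. Both function names, the squared-difference form, and the edge weights below are assumptions for illustration rather than the exact formulation of the cited paper.

```python
def network_regularizer(topic_dist, edges):
    """Half the weighted sum, over edges, of squared differences
    between the topic distributions of the two linked documents."""
    total = 0.0
    for u, v, weight in edges:
        total += weight * sum(
            (pu - pv) ** 2 for pu, pv in zip(topic_dist[u], topic_dist[v])
        )
    return 0.5 * total


def regularized_objective(loss, topic_dist, edges):
    """O(C,G) = L(C) + R(G,C), with L(C) supplied as a loss to minimize."""
    return loss + network_regularizer(topic_dist, edges)


# Three documents over two topics; d1 is linked to d2 and d3.
dists = {"d1": [1.0, 0.0], "d2": [0.0, 1.0], "d3": [1.0, 0.0]}
links = [("d1", "d2", 1.0), ("d1", "d3", 1.0)]
penalty = network_regularizer(dists, links)
objective = regularized_objective(3.0, dists, links)
```

Here only the d1 to d2 link contributes to the penalty, since d1 and d3 already share the same topic distribution.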

Page 22:

?????

Page 23:

Rumors have attracted much attention

Page 24:
Page 25:

Intuitions

• Rumors spread furiously and cause heated discussion

• Rumors tend to be controversial (some people spread them, others argue against them)

• The source of a rumor (celebrities? nobodies?)
• Maybe a study of the spreading of a particular rumor would be interesting.
• Will celebrities clarify the truth?

Page 26:

Challenges

• How to differentiate rumors from personal views

• Most of the comments are subjective (expression of feelings)

Page 27:

Rumors vs. meaningless topics

Page 28:

Suggestions and ideas are very welcome