Using Tag Recommendations to Homogenize Folksonomies in Microblogging Environments

Eva Zangerle, Wolfgang Gassler and Günther Specht

Upload: evazangerle

Post on 12-May-2015



Outline

• Motivation

• Approach

• Ranking Methods

• Evaluation

• Future Directions

• Conclusion


Hashtags

• Tags for Tweets

• (Manual) Categorization of conversations

• Follow streams of conversation

• Indicator for certain topic or audience


Motivation

• Only 20% of tweets contain hashtags

• Hashtags can be chosen freely

– #socinfo2011? #socinfo11? #socinfo? all?

– Synonymous hashtags

– Heterogeneity

– Search capability limited

– Which stream to follow?



Motivation

Proposed Solution: Hashtag Recommendations


Goals

• Recommend suitable hashtags while the user is entering a tweet

• Encourage use of hashtags

– Improve search capabilities

– Better categorization

• Fight heterogeneity

– Avoid use of synonymous hashtags


Our Approach in a Nutshell

• Based on a set of existing tweets

• Analysis of entered tweet

• Analysis of dataset

• Recommendations based on hashtags within similar messages


Approach - Workflow

User enters message

Retrieve 500 most similar messages

Retrieve candidate set of hashtags

Ranking of Hashtags

Top-k Recommendations


Crawled Dataset

• Crawled July 2010 – April 2011

• 18,731,800 messages in total

• 3,753,927 messages containing hashtags

– about 20%

– used as dataset for evaluation

• 5,968,571 hashtags → average of 1.6 hashtags per message

• 585,140 distinct hashtags

– 502,172 hashtags occurred fewer than 5 times
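
The ratios on this slide can be re-derived from the raw counts. The short check below uses only the figures stated above; the variable names are illustrative, and the share of rare hashtags is a derived value that the slide itself does not state.

```python
# Counts taken directly from the "Crawled Dataset" slide.
total_messages = 18_731_800
tagged_messages = 3_753_927
hashtag_occurrences = 5_968_571
distinct_hashtags = 585_140
rare_hashtags = 502_172  # distinct hashtags seen fewer than 5 times

# Share of messages that carry at least one hashtag ("about 20%").
tagged_share = tagged_messages / total_messages

# Average hashtags per tagged message ("avg of 1.6").
avg_per_tagged = hashtag_occurrences / tagged_messages

# Derived here, not stated on the slide: share of distinct hashtags in the long tail.
rare_share = rare_hashtags / distinct_hashtags

print(f"{tagged_share:.1%} tagged, {avg_per_tagged:.2f} per tagged message, "
      f"{rare_share:.1%} of distinct hashtags are rare")
```

Roughly 86% of the distinct hashtags sit in the long tail, which is the heterogeneity the recommendation approach is meant to reduce.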


Longtail Distribution


Hashtags per Tweet


Candidate Set Generation

• Find tweets most similar to the user's tweet

• Cosine similarity of tf/idf weighted term vectors

• Take 500 most similar tweets

• Extract hashtags from these tweets

• Next step: ranking the hashtags
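
The candidate-set steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the tokenizer, the smoothed idf formula, and the function names (`tokenize`, `tfidf_vectors`, `candidate_hashtags`) are assumptions, and a corpus of millions of tweets would need an inverted index rather than the brute-force scan used here.

```python
import math
import re
from collections import Counter

HASHTAG = re.compile(r"#\w+")
TOKEN = re.compile(r"\w+")

def tokenize(text):
    """Lowercase word tokens; hashtags are removed before tokenizing."""
    return TOKEN.findall(HASHTAG.sub(" ", text.lower()))

def tfidf_vectors(docs):
    """Return ({term: weight} per tokenized document, idf table)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed idf (an assumption; the slides do not specify the variant).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]
    return vectors, idf

def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def candidate_hashtags(query, tweets, n_similar=500):
    """Return (hashtag, similarity) pairs from the n most similar tweets."""
    vectors, idf = tfidf_vectors([tokenize(t) for t in tweets])
    # Terms unseen in the corpus get weight 0 in the query vector.
    q = {t: tf * idf.get(t, 0.0) for t, tf in Counter(tokenize(query)).items()}
    scored = sorted(((cosine(q, v), i) for i, v in enumerate(vectors)), reverse=True)
    top = [(s, i) for s, i in scored[:n_similar] if s > 0]
    return [(tag.lower(), s) for s, i in top for tag in HASHTAG.findall(tweets[i])]

tweets = [
    "Great keynote at the conference #socinfo2011",
    "Lunch was tasty #food",
    "conference keynote slides online #socinfo2011 #slides",
]
candidates = candidate_hashtags("watching the conference keynote", tweets, n_similar=2)
print(candidates)
```

Each candidate keeps the similarity of the tweet it came from, so the ranking step can reuse it.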


Basic Ranking Methods

Input: Set of candidate hashtags (from 500 similar tweets)

Output: Ranked candidate list → top k shown

1. SimRank

– Use similarity measure of tweets for ranking (tf/idf cosine similarity)

– The higher the similarity of the tweets, the higher the ranking of the corresponding hashtags

2. TimeRank

– Recency of usage of the hashtag

– The more recently a hashtag has been used, the higher the ranking within the candidate hashtags
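
A rough sketch of these two rankings follows. It assumes each candidate is a hypothetical `(hashtag, similarity, timestamp)` triple, that a hashtag found in several similar tweets is scored by its best (highest-similarity or most recent) occurrence, and that timestamps are integers such as Unix seconds; the slides do not specify any of these details.

```python
def sim_rank(candidates):
    """Rank hashtags by the highest similarity of any tweet that contains them."""
    best = {}
    for tag, sim, _ts in candidates:
        best[tag] = max(best.get(tag, 0.0), sim)
    # Descending by score; ties broken alphabetically for determinism.
    return sorted(best, key=lambda t: (-best[t], t))

def time_rank(candidates):
    """Rank hashtags by their most recent use among the similar tweets."""
    latest = {}
    for tag, _sim, ts in candidates:
        latest[tag] = max(latest.get(tag, ts), ts)
    return sorted(latest, key=lambda t: (-latest[t], t))

# (hashtag, similarity of source tweet, timestamp of source tweet)
candidates = [
    ("#a", 0.9, 100),
    ("#b", 0.5, 300),
    ("#a", 0.2, 200),
    ("#c", 0.7, 50),
]
by_similarity = sim_rank(candidates)
by_recency = time_rank(candidates)
print(by_similarity[:2], by_recency[:2])
```

Taking the top k of either list yields the recommendations shown to the user.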


Basic Ranking Methods

Input: Set of candidate hashtags (from 500 similar tweets)

Output: Ranked candidate list → top k shown

3. RecCountRank

– Count number of occurrences for each hashtag within candidate list

– The more similar tweets feature the hashtag, the higher the rank of the hashtag

4. PopRank

– Global popularity of the hashtag within the whole dataset

– The more popular a hashtag is overall, the higher its ranking
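
Both count-based rankings reduce to frequency tables. The sketch below assumes the candidate list holds one entry per (similar tweet, hashtag) pair and that `global_counts` is a precomputed table of hashtag frequencies over the whole dataset; both names are illustrative.

```python
from collections import Counter

def rec_count_rank(candidate_tags):
    """Rank hashtags by how often they occur within the candidate list."""
    counts = Counter(candidate_tags)
    return sorted(counts, key=lambda t: (-counts[t], t))

def pop_rank(candidate_tags, global_counts):
    """Rank candidate hashtags by their frequency in the whole dataset."""
    unique = set(candidate_tags)
    return sorted(unique, key=lambda t: (-global_counts.get(t, 0), t))

candidate_tags = ["#a", "#b", "#a", "#c", "#a", "#b"]
global_counts = {"#a": 10, "#b": 500, "#c": 40}  # made-up illustrative counts

by_rec_count = rec_count_rank(candidate_tags)
by_popularity = pop_rank(candidate_tags, global_counts)
print(by_rec_count, by_popularity)
```

The two orders can differ sharply: a hashtag that dominates the similar tweets may be rare globally, and the reverse.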


Hybrid Ranking Methods

• Based on 4 basic ranking methods

• $\mathrm{hybridrank}(r_1, r_2) = \alpha \cdot r_1 + (1 - \alpha) \cdot r_2$

• Hybrid ranking computed for all possible combinations of basic ranking methods
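
The weighted combination can be sketched as below. The slides give only the formula, so two points are assumptions: the basic scores are min-max normalized to [0, 1] before mixing (raw similarities and raw counts live on very different scales), and a hashtag missing from one score table contributes 0 for that method.

```python
def normalize(scores):
    """Min-max normalize a {hashtag: score} table to the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {t: 1.0 for t in scores}
    return {t: (s - lo) / (hi - lo) for t, s in scores.items()}

def hybrid_rank(r1, r2, alpha=0.5):
    """Order hashtags by alpha * r1 + (1 - alpha) * r2 on normalized scores."""
    n1, n2 = normalize(r1), normalize(r2)
    combined = {
        t: alpha * n1.get(t, 0.0) + (1 - alpha) * n2.get(t, 0.0)
        for t in set(n1) | set(n2)
    }
    return sorted(combined, key=lambda t: (-combined[t], t))

similarity = {"#a": 0.9, "#b": 0.5, "#c": 0.1}   # e.g. SimRank scores
popularity = {"#a": 10, "#b": 500, "#c": 40}     # e.g. PopRank counts

balanced = hybrid_rank(similarity, popularity, alpha=0.5)
only_first = hybrid_rank(similarity, popularity, alpha=1.0)
only_second = hybrid_rank(similarity, popularity, alpha=0.0)
print(balanced, only_first, only_second)
```

Setting alpha to 1 or 0 recovers the first or second basic ranking. If "all possible combinations" means unordered pairs, the four basic methods give six hybrids to evaluate.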


Evaluation

Randomly select tweet t from dataset

Remove hashtags from t

Use t as input for recommendation algorithm

Compute hashtag recommendations for t

Use proposed ranking methods

Compare top-k recommendations
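
A minimal sketch of the per-tweet scoring in this evaluation loop is given below, assuming the comparison is made against the hashtags that were removed from t and measured with the standard recall@k and precision@k definitions. The helper names are illustrative, and the recommender itself is left out.

```python
import re

HASHTAG = re.compile(r"#\w+")

def strip_hashtags(tweet):
    """Split a tweet into its hashtag-free text and its set of held-out hashtags."""
    held_out = {t.lower() for t in HASHTAG.findall(tweet)}
    text = " ".join(HASHTAG.sub(" ", tweet).split())
    return text, held_out

def recall_at_k(recommended, relevant, k):
    """Fraction of the held-out hashtags found in the top-k recommendations."""
    hits = len(set(recommended[:k]) & relevant)
    return hits / len(relevant) if relevant else 0.0

def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommendations that are held-out hashtags."""
    hits = len(set(recommended[:k]) & relevant)
    return hits / k

text, held_out = strip_hashtags("Great keynote #SocInfo2011 today #slides")
# Stand-in for the recommender's ranked output on `text`.
recommended = ["#socinfo2011", "#food", "#talk", "#slides", "#x"]

print(recall_at_k(recommended, held_out, 5), precision_at_k(recommended, held_out, 5))
```

Precision@5 is bounded by the number of held-out hashtags: a tweet with a single hashtag can reach at most 0.2, so low precision values are expected when tweets carry 1.6 hashtags on average.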


Evaluation

• Dataset

– 3,753,927 messages

– 5,968,571 hashtags

– 585,140 distinct hashtags

• Test run

– 10,000 randomly chosen tweets (max. 5 hashtags)

– Retweets excluded


Recall - Basic Methods


Top-5 recommendations enough?


Recall@5 - Hybrid Methods


Precision@5


Development of Recall Values


Future Directions

• Social Graph

• User's Timeline

• Realtime Recommendations

• Real User Tests


Conclusion

• Motivation

• Hashtag Recommendations

• Simple, straightforward approach

• Promising results
