Transcript
Page 1: Investigating the Impact of the Blogosphere: Using PageRank to Determine the Distribution of Attention

Investigating the impact of the blogosphere

Using PageRank to determine the distribution of attention

Lars Kirchhoff | Axel Bruns | Thomas Nicolai

18.10.2007

Page 2:

What are the questions?

► Is the impact of the blogosphere different to other forms of online media?

► How is PageRank distributed across the blogosphere?

► Does it indicate the existence of measurable, visible effects of blogs on the overall mediasphere?

► Has there been a growth in the impact of the blogosphere on the Web over the two years analysed here?

Page 3:

What have we done?

2005

► ~15m profiles from blogger.com

► ~8.871m unique blog URLs extracted

► Retrieved Google PageRank

2006

► Same profiles

► But slightly more unique blog URLs extracted (~8.888m)

► Retrieved Google PageRank
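
The extraction step above (profiles reduced to a set of unique blog URLs) could be sketched roughly as follows. This is only an illustration: the slides do not say how URLs were compared, so the normalisation rules and the names `normalize_blog_url` and `unique_blog_urls` are assumptions.

```python
from urllib.parse import urlsplit


def normalize_blog_url(url):
    """Reduce a blog URL to a comparable key (hypothetical rules, not the authors')."""
    url = url.strip()
    if "://" not in url:
        url = "http://" + url  # assume HTTP when the scheme is missing
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    # Ignore trailing slashes so "example.com/" and "example.com" match.
    return host + parts.path.rstrip("/")


def unique_blog_urls(profiles):
    """profiles: iterable of lists of blog URLs, one list per profile."""
    seen = set()
    for blog_urls in profiles:
        for url in blog_urls:
            key = normalize_blog_url(url)
            if key:
                seen.add(key)
    return seen
```

A stricter or looser notion of "unique" would change the totals, so the counts on the slide should not be expected to follow from these particular rules.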

Page 4:

Why PageRank?

► Available for almost any web page

► Easy to gather

► Global property that takes the whole web into account

► Search is the most common way to look for information
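
To illustrate why PageRank is a global property, here is a minimal sketch of the published textbook formulation, computed by power iteration over a link graph. It is not the value Google reports, whose exact computation is not public; the function name, the damping factor of 0.85, and the iteration count are assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration.

    links: dict mapping each page to the list of pages it links to.
    Returns a dict of scores that sum to 1.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages without outgoing links is spread evenly.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for page, targets in links.items():
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank
```

Every score depends on the scores of the pages linking in, which in turn depend on their own inbound links, so a change anywhere in the graph can shift values everywhere.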

Page 5:

What do we have?

[Figure: Blogosphere PageRank Distribution 2005. Number of blogs (logarithmic axis, 1 to 10,000,000) by PageRank (0 to 10); series: Distribution 2005.]
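
A distribution like the one charted here can be tabulated by counting blogs per PageRank value. The sketch below assumes the retrieved values are held in a dict from blog URL to an integer PageRank between 0 and 10; the function name and the data layout are hypothetical.

```python
from collections import Counter


def pagerank_distribution(pagerank_by_blog):
    """Return a list of 11 counts: the number of blogs at PageRank 0..10."""
    counts = Counter(pagerank_by_blog.values())
    # Values outside 0..10 (for example None for "no value") are ignored.
    return [counts.get(pr, 0) for pr in range(11)]
```

Because the counts span several orders of magnitude, plotting them on a logarithmic axis, as the chart does, keeps the high-PageRank bars visible.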

Page 6:

What do we have?

[Figure: Blogosphere PageRank Distribution 2006. Number of blogs (logarithmic axis, 1 to 10,000,000) by PageRank (0 to 10); series: Distribution 2005, Distribution 2006.]

Page 7:

What has happened?

[Figure: Increase and Decrease (%) of PageRank from 2005 to 2006. Percent (0 to 120) by PageRank (0 to 10).]
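
One possible reading of this chart is the 2006 blog count at each PageRank value expressed as a percentage of the 2005 count, so that 100 means no change. The transcript does not confirm the formula, so the sketch below is an assumption, as is the name `relative_counts`.

```python
def relative_counts(dist_2005, dist_2006):
    """Express each 2006 count as a percentage of the matching 2005 count.

    Both arguments are lists of counts indexed by PageRank value.
    Entries with a 2005 count of zero are reported as None.
    """
    result = []
    for old, new in zip(dist_2005, dist_2006):
        result.append(None if old == 0 else 100.0 * new / old)
    return result
```

Under this reading, values below 100 mark PageRank levels that lost blogs between the two measurements.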

Page 8:

What does this mean?

► Strong decline at PageRank 1-2 and 7-10

► Lower end: effect of focus on Blogger?
    ► Blogger as sandbox: high attrition?

► Higher end: shrinking A-list?
    ► Churn away from Blogger?
    ► Harder to achieve high PageRank in a larger, more diverse Web?

► Need to track trajectories
    ► e.g. how many low-PR blogs rose from 2005 to 2006?
    ► e.g. how many PR7+ blogs survived from 2005 to 2006?
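
The trajectory questions above could be answered by joining the two yearly measurements on the blog URL. The sketch below is illustrative only: the function name, the dict layout, and the cut-off of PageRank 2 for "low PR" are assumptions, while the PR7+ threshold comes from the slide.

```python
def track_trajectories(pr_2005, pr_2006, low_max=2, high_min=7):
    """Compare per-blog PageRank between two years.

    pr_2005, pr_2006: dicts mapping blog URL to integer PageRank.
    low_max is a hypothetical cut-off for "low PR"; high_min=7 is PR7+.
    """
    common = pr_2005.keys() & pr_2006.keys()
    # Low-PR blogs in 2005 that were measured again in 2006.
    low = [b for b in common if pr_2005[b] <= low_max]
    risen = [b for b in low if pr_2006[b] > pr_2005[b]]
    # PR7+ blogs in 2005; a blog missing in 2006 counts as not surviving.
    high_2005 = [b for b in pr_2005 if pr_2005[b] >= high_min]
    survived = [b for b in high_2005 if pr_2006.get(b, -1) >= high_min]
    return {
        "low_2005": len(low),
        "low_risen": len(risen),
        "high_2005": len(high_2005),
        "high_survived": len(survived),
    }
```

Treating a blog that disappears from the 2006 data as "not survived" is a design choice; it could equally be reported as a separate category.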

Page 9:

What are the limitations?

► Coarse values

► Algorithm is not entirely known

► Updates to Google PageRank occur at random intervals

Page 10:

What next?

► Use more blogs

► Measure PageRank more frequently

► Use other indicators/measures (Alexa, Technorati, Bloglines)

► Discuss different metrics

