Upload: dasan

Post on 05-Jan-2016

The $25 Billion Eigenvector

How does Google do Pagerank?

The Imaginary Web Surfer:

• Starts at any page,
• Randomly goes to a page linked from the current page,
• Randomly goes to any web page from a dangling page,
• … except sometimes (e.g. 15% of the time), goes to a purely random page.
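The surfer's rules can be written as a single matrix. The following is one common formulation, not necessarily the exact one used in these slides. It assumes n pages, c_j outgoing links on page j, and a symbol p = 0.85 for the probability of following a link (the slides only state the complementary 15%).

```latex
A = p\,S + \frac{1-p}{n}\, e\, e^{T},
\qquad
S_{ij} =
\begin{cases}
  1/c_j & \text{if page } j \text{ links to page } i, \\
  1/n   & \text{if page } j \text{ is dangling } (c_j = 0), \\
  0     & \text{otherwise,}
\end{cases}
```

Here e is the vector of all ones. Every column of A sums to 1, and the Pagerank vector x satisfies A x = x.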

A tiny web: who should get the highest rank?

[Figure: a tiny web of ten pages labeled A through J, with links between them]

The associated stochastic matrix:

0.0150 0.0150 0.0150 0.0150 0.0150 0.0150 0.4400 0.0150 0.0150 0.2983
0.4400 0.0150 0.0150 0.0150 0.0150 0.0150 0.0150 0.0150 0.0150 0.0150
0.0150 0.2983 0.0150 0.0150 0.0150 0.0150 0.0150 0.0150 0.0150 0.0150
0.0150 0.2983 0.8650 0.0150 0.0150 0.0150 0.0150 0.0150 0.0150 0.0150
0.4400 0.0150 0.0150 0.8650 0.0150 0.8650 0.0150 0.0150 0.0150 0.0150
0.0150 0.2983 0.0150 0.0150 0.8650 0.0150 0.0150 0.0150 0.0150 0.0150
0.0150 0.0150 0.0150 0.0150 0.0150 0.0150 0.0150 0.8650 0.0150 0.0150
0.0150 0.0150 0.0150 0.0150 0.0150 0.0150 0.0150 0.0150 0.8650 0.2983
0.0150 0.0150 0.0150 0.0150 0.0150 0.0150 0.0150 0.0150 0.0150 0.2983
0.0150 0.0150 0.0150 0.0150 0.0150 0.0150 0.4400 0.0150 0.0150 0.0150
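The entries of this matrix can be reproduced from the link structure alone. The sketch below assumes the pages are numbered 1 to 10 in the order of the matrix columns, and that the out-link lists are the ones read off the entries larger than 0.0150 in each column. The names `out_links` and `google_matrix` are illustrative, not from the slides.

```python
# Out-links per page, read off the non-background entries of each column.
out_links = {
    1: [2, 5],
    2: [3, 4, 6],
    3: [4],
    4: [5],
    5: [6],
    6: [5],
    7: [1, 10],
    8: [7],
    9: [8],
    10: [1, 8, 9],
}


def google_matrix(out_links, damping=0.85):
    """Build the dense column-stochastic matrix as a list of rows.

    Assumes every page has at least one out-link (no dangling pages).
    """
    n = len(out_links)
    base = (1.0 - damping) / n            # 0.15 / 10 = 0.015 for this web
    A = [[base] * n for _ in range(n)]
    for j, targets in out_links.items():
        for i in targets:
            # Page j spreads its damped weight evenly over its out-links.
            A[i - 1][j - 1] += damping / len(targets)
    return A


A = google_matrix(out_links)
column_sums = [sum(A[i][j] for i in range(10)) for j in range(10)]
```

Each column sums to 1 up to rounding, which is what makes the matrix stochastic: a page with one link contributes 0.015 + 0.85 = 0.865, with two links 0.015 + 0.425 = 0.44, and with three links 0.015 + 0.2833 ≈ 0.2983.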

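How is y^(k+1) = A x^k performed?

[Figure: the same ten-page web, pages A through J]

The links are stored column by column: connection lists the pages that each page links to, and end_j is the position in connection of the last link of page j.

connection = [2 5 3 4 6 4 5 6 5 1 10 7 8 1 8 9]
end = [2 5 6 7 8 9 11 12 13 16]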

How is y^(k+1) = A x^k performed?

1. y^(k+1) = (0.15/n) e, where e is the vector of all 1's (this assumes the components of x^k sum to 1)
2. start = 1
3. for j = 1, …, n
   a) col_tot = end_j − start + 1
   b) for i = start, …, end_j
      • ii = connection_i
      • y^(k+1)_ii = y^(k+1)_ii + (0.85/col_tot) · x^k_j
   c) start = end_j + 1
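The loop above can be sketched in Python as follows. This is a direct translation under two assumptions: the components of x sum to 1, and every page has at least one outgoing link (true for the ten-page example). The function name `pagerank_step` and the `damping` parameter are illustrative; the arrays keep the 1-based numbering of the slides, so indices are shifted by one when accessing Python lists.

```python
# Link data from the slide, numbered from 1.
connection = [2, 5, 3, 4, 6, 4, 5, 6, 5, 1, 10, 7, 8, 1, 8, 9]
end = [2, 5, 6, 7, 8, 9, 11, 12, 13, 16]


def pagerank_step(x, connection, end, damping=0.85):
    """Return y = A x without ever forming the dense matrix A."""
    n = len(end)
    y = [(1.0 - damping) / n] * n            # step 1: random-jump share
    start = 1                                # step 2
    for j in range(1, n + 1):                # step 3
        col_tot = end[j - 1] - start + 1     # number of out-links of page j
        for i in range(start, end[j - 1] + 1):
            ii = connection[i - 1]           # a page that page j links to
            y[ii - 1] += damping / col_tot * x[j - 1]
        start = end[j - 1] + 1
    return y


# One iteration, starting with equal components.
x0 = [1.0 / 10] * 10
x1 = pagerank_step(x0, connection, end)
```

The work per product is proportional to the number of links (16 here) plus n, rather than n² for the dense matrix, which is the point of storing only the link lists.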

[Figures: twelve bar charts of the components of the rank vector, vertical axis from 0 to 0.35. Panel titles: Start with equal components; One iteration; Two iterations; Three iterations; Four iterations; Five iterations; Six iterations; Seven iterations; Eight iterations; Nine iterations; Ten iterations; The Eigenvector]
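That the iterates settle on an eigenvector can be checked numerically. The sketch below assumes the connection and end arrays from the earlier slide, equal starting components, and a hypothetical stopping tolerance; the names `apply_A` and `stationary_vector` are illustrative.

```python
connection = [2, 5, 3, 4, 6, 4, 5, 6, 5, 1, 10, 7, 8, 1, 8, 9]
end = [2, 5, 6, 7, 8, 9, 11, 12, 13, 16]


def apply_A(x, damping=0.85):
    """Compute A x from the link lists (assumes sum(x) == 1, no dangling pages)."""
    n = len(end)
    y = [(1.0 - damping) / n] * n
    start = 0
    for j in range(n):
        targets = connection[start:end[j]]   # pages that page j+1 links to
        for page in targets:
            y[page - 1] += damping * x[j] / len(targets)
        start = end[j]
    return y


def stationary_vector(tol=1e-12, max_steps=1000):
    """Power iteration until successive iterates differ by less than tol (1-norm)."""
    n = len(end)
    x = [1.0 / n] * n
    for step in range(1, max_steps + 1):
        y = apply_A(x)
        change = sum(abs(a - b) for a, b in zip(x, y))
        x = y
        if change < tol:
            return x, step
    raise RuntimeError("power iteration did not converge")


x, steps = stationary_vector()
# Largest component of A x - x; close to zero when x is an eigenvector
# for eigenvalue 1.
residual = max(abs(a - b) for a, b in zip(apply_A(x), x))
```

Because the damping factor is 0.85, the error shrinks by at least that factor per step, so the loop stops well before `max_steps` for this small example.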


[U,G] = surfer('http://google.com', 100)

[Figure: plot with both axes running from 0 to 100, presumably the link structure G of the 100 pages]

[Figures: six plots titled "Pagerank Power Iteration", horizontal axis from 0 to 120. Panels: 1 step; 2 steps; 3 steps; 4 steps; 5 steps; the limit. The vertical axis reaches about 0.03 in the early steps and 0.09 in the limit]

And the winners are…

'http://www.loc.gov/standards/iso639-2'
'http://www.sil.org/iso639-3'
'http://www.loc.gov/standards/iso639-5'
'http://purl.org/dc/elements/1.1'
'http://purl.org/dc/terms'
'http://purl.org/dc'
'http://creativecommons.org/licenses/by/3.0'
'http://i.creativecommons.org/l/by/3.0/88x31.png'
'http://www.nlb.gov.sg'
'http://purl.org/dcpapers'
'http://www.nl.go.kr'
'http://purl.org/dcregistry'
'http://www.kc.tsukuba.ac.jp/index_en.html'