
Post on 15-Feb-2016

Page 1: Application of Ensemble Models in Web Ranking

Application of Ensemble Models in Web Ranking

Homa B. Hashemi, Nasser Yazdani, Azadeh Shakery, Mahdi Pakdaman Naeini

School of Electrical and Computer Engineering, University of Tehran

Page 2: Application of Ensemble Models in Web Ranking

Information Explosion

Page 3: Application of Ensemble Models in Web Ranking
Page 4: Application of Ensemble Models in Web Ranking

Web Challenges

Huge size of information: 25 billion pages

Proliferation and dynamic nature: new pages are created continuously; new links are created at a rate of 25% per week

Heterogeneous content: HTML, text, audio, …

Page 5: Application of Ensemble Models in Web Ranking

Search Engine as a Tool

http://seo-related.com/

Page 6: Application of Ensemble Models in Web Ranking

Inside a Search Engine: Crawling, Indexing, Ranking

Page 7: Application of Ensemble Models in Web Ranking

Inside a Search Engine: Crawling, Indexing, Ranking

Page 8: Application of Ensemble Models in Web Ranking

Ranking Approaches

Content-based (query dependent): TF, IDF, BM25, classical IR, …

Connectivity-based (web): PageRank, HITS, …

Page 9: Application of Ensemble Models in Web Ranking

Our General Framework

Query → Retrieval models → List 1, List 2, …, List N → Ensemble model → Final list
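The framework can be read as a small pipeline: each retrieval model scores the documents for a query, and an ensemble function fuses the N score lists into one final list. The following is a minimal sketch under that reading. The function names, the toy scores, and the choice of a plain sum as the ensemble are all assumptions made for illustration, not details taken from the slides.

```python
from typing import Callable, Dict, List

Scores = Dict[str, float]  # document id -> score from one retrieval model


def rank_with_ensemble(
    query: str,
    retrieval_models: List[Callable[[str], Scores]],
    ensemble: Callable[[List[Scores]], Scores],
) -> List[str]:
    """Run every retrieval model on the query, fuse the score lists, and sort."""
    score_lists = [model(query) for model in retrieval_models]  # List 1 .. List N
    fused = ensemble(score_lists)  # ensemble model
    # Final list: highest fused score first, ties broken by document id.
    return sorted(fused, key=lambda doc: (-fused[doc], doc))


# Hypothetical stand-ins for two retrieval models (fixed toy scores).
def toy_content_model(query: str) -> Scores:
    return {"D1": 0.9, "D2": 0.4, "D3": 0.1}


def toy_link_model(query: str) -> Scores:
    return {"D1": 0.1, "D2": 0.2, "D3": 0.95}


def sum_ensemble(score_lists: List[Scores]) -> Scores:
    """Add the scores each document received; a missing score counts as 0."""
    docs = set().union(*score_lists)
    return {d: sum(s.get(d, 0.0) for s in score_lists) for d in docs}


final_list = rank_with_ensemble(
    "web ranking", [toy_content_model, toy_link_model], sum_ensemble
)
```

Keeping the ensemble as a plain function argument makes it easy to swap in a different combination rule without touching the retrieval models.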

Page 10: Application of Ensemble Models in Web Ranking

Simple Ensemble Models

Sum rule: add the (normalized) values of the different methods

Product rule: multiply the (normalized) values of the different methods

Borda rule: combine the rankings
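The three simple rules can be sketched in a few lines. The slides do not say which normalization is used, so min-max scaling to [0, 1] is an assumption here, as is treating a document missing from a list as having score 0. The Borda variant shown (a list of n documents awards n points to its top document, n-1 to the next, and so on) is one common formulation, not necessarily the one the authors used.

```python
import math
from typing import Dict, List

Scores = Dict[str, float]  # document id -> raw score from one method


def min_max_normalize(scores: Scores) -> Scores:
    """Rescale one method's scores to [0, 1]; a constant list maps to 0.0."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def sum_rule(score_lists: List[Scores]) -> Scores:
    """Add the normalized scores each document received."""
    normalized = [min_max_normalize(s) for s in score_lists]
    docs = set().union(*normalized)
    return {d: sum(n.get(d, 0.0) for n in normalized) for d in docs}


def product_rule(score_lists: List[Scores]) -> Scores:
    """Multiply the normalized scores each document received."""
    normalized = [min_max_normalize(s) for s in score_lists]
    docs = set().union(*normalized)
    return {d: math.prod(n.get(d, 0.0) for n in normalized) for d in docs}


def borda_rule(score_lists: List[Scores]) -> Dict[str, int]:
    """Combine rankings only: position p (0-based) in a list of n docs earns n - p points."""
    points: Dict[str, int] = {}
    for scores in score_lists:
        ranking = sorted(scores, key=lambda d: (-scores[d], d))
        for position, doc in enumerate(ranking):
            points[doc] = points.get(doc, 0) + len(ranking) - position
    return points


# Toy scores from two hypothetical methods.
content = {"D1": 4.0, "D2": 0.0, "D3": 3.0, "D4": 1.0}
link = {"D1": 0.0, "D2": 1.0, "D3": 0.75, "D4": 0.25}

summed = sum_rule([content, link])
multiplied = product_rule([content, link])
borda = borda_rule([content, link])
```

In this toy data D3 is never ranked first by either method, yet it wins under all three rules because it scores consistently well in both lists.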

Page 11: Application of Ensemble Models in Web Ranking

Complicated Ensemble Models

OWA (Ordered Weighted Averaging)

Click-through data

SVM: use the distance from the discriminating hyperplane as the measure of a page's relevance to a specific query
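For a linear separator, the distance idea reduces to simple arithmetic: the signed distance of a feature vector x from the hyperplane w·x + b = 0 is (w·x + b) / ‖w‖, and documents can be sorted by it. The sketch below assumes each document is described by a vector of scores from the base ranking methods. The weights `w`, the bias `b`, and the feature values are hypothetical placeholders standing in for a trained model; no SVM training is shown.

```python
import math
from typing import Dict, List, Sequence


def signed_distance(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Signed distance of x from the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


def rank_by_distance(
    w: Sequence[float], b: float, features: Dict[str, Sequence[float]]
) -> List[str]:
    """Order documents from the most positive distance to the most negative."""
    return sorted(features, key=lambda d: (-signed_distance(w, b, features[d]), d))


# Hypothetical hyperplane, standing in for a trained linear SVM.
w, b = [3.0, 4.0], -2.0
# Hypothetical per-document features, e.g. (content score, link score).
features = {"D1": [1.0, 1.0], "D2": [0.0, 0.0], "D3": [0.5, 0.25]}

ranking = rank_by_distance(w, b, features)
```

A document far on the positive side of the hyperplane is treated as more relevant than one close to it, which turns a binary classifier into a ranking function.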

Page 12: Application of Ensemble Models in Web Ranking

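OWA operator

$$F(a_1, a_2, \ldots, a_n) = \sum_{j=1}^{n} w_j b_j$$

where $b_j$ is the $j$-th largest of the values $a_1, \ldots, a_n$ and $w = (w_1, \ldots, w_n)$ is the weight vector.

[Weight definitions for $w_1, \ldots, w_n$ and an accompanying value (printed as "3.0") are not recoverable from the transcript.]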
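A minimal sketch of the OWA operator, assuming the standard definition: the input scores are sorted in descending order and then combined with position weights that sum to 1. The weight vectors below are hypothetical examples chosen for illustration; they are not the weights used in the presentation.

```python
from typing import Sequence


def owa(values: Sequence[float], weights: Sequence[float]) -> float:
    """Ordered weighted average: weights apply to sorted positions, not to sources."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    ordered = sorted(values, reverse=True)  # b_1 >= b_2 >= ... >= b_n
    return sum(w * b for w, b in zip(weights, ordered))


# Scores that three hypothetical ranking methods gave to one document.
scores = [0.2, 0.9, 0.5]

emphasize_best = owa(scores, [0.5, 0.25, 0.25])
only_max = owa(scores, [1.0, 0.0, 0.0])  # behaves like max
only_min = owa(scores, [0.0, 0.0, 1.0])  # behaves like min
```

Because the weights attach to rank positions, the choice of weight vector moves the operator between an optimistic combination (close to max) and a pessimistic one (close to min), with the plain average in between.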

Page 13: Application of Ensemble Models in Web Ranking

Simulated Click-Through Data

How can we use user behavior?

80% of user clicks are related to the query

Click-through data

Page 14: Application of Ensemble Models in Web Ranking

Simulated Click-Through Data (example)

L(a): 1. D1  2. D3  3. D2  4. D4  5. D5  6. D6

L(b): 1. D1  2. D4  3. D7  4. D9  5. D2  6. D8

Page 15: Application of Ensemble Models in Web Ranking

Simulated Click-Through Data (example)

L(a): 1. D1  2. D3  3. D2  4. D4  5. D5  6. D6

L(b): 1. D1  2. D4  3. D7  4. D9  5. D2  6. D8

Interleaved results L(a,b): 1. D1  2. D4  3. D3  4. D7  5. D2  6. D9  7. D5  8. D8  9. D6
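One merging rule that reproduces the interleaved list L(a,b) on this slide is to alternate between the two rankings, starting with L(a), and to take from each list its highest-ranked document that has not been placed yet. The sketch below implements that rule. Whether the authors used exactly this procedure is an assumption; the function name is hypothetical.

```python
from typing import List


def interleave(list_a: List[str], list_b: List[str]) -> List[str]:
    """Alternate between two rankings, starting with list_a and skipping
    documents that are already in the merged list."""
    merged: List[str] = []
    seen = set()
    sources = [iter(list_a), iter(list_b)]
    exhausted = [False, False]
    turn = 0  # 0 -> take from list_a, 1 -> take from list_b
    while not all(exhausted):
        if not exhausted[turn]:
            for doc in sources[turn]:
                if doc not in seen:
                    seen.add(doc)
                    merged.append(doc)
                    break
            else:
                # The iterator ran out without yielding a new document.
                exhausted[turn] = True
        turn = 1 - turn
    return merged


l_a = ["D1", "D3", "D2", "D4", "D5", "D6"]
l_b = ["D1", "D4", "D7", "D9", "D2", "D8"]

l_ab = interleave(l_a, l_b)
# l_ab == ["D1", "D4", "D3", "D7", "D2", "D9", "D5", "D8", "D6"]
```

If one list runs out first, the loop keeps drawing from the other, so no document from either ranking is dropped.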

Page 16: Application of Ensemble Models in Web Ranking

Simulated Click-Through Data (example)

L(a): 1. D1  2. D3  3. D2  4. D4  5. D5  6. D6

L(b): 1. D1  2. D4  3. D7  4. D9  5. D2  6. D8

Interleaved results L(a,b): 1. D1 (First)  2. D4  3. D3  4. D7  5. D2 (Second)  6. D9  7. D5 (Third)  8. D8  9. D6

Page 17: Application of Ensemble Models in Web Ranking


Page 18: Application of Ensemble Models in Web Ranking

Experimental Datasets

LETOR benchmark (English): Microsoft Research Asia, 2007

DotIR benchmark (Persian): Iran Telecommunication Research Center (ITRC), 2009

Page 19: Application of Ensemble Models in Web Ranking

LETOR Benchmark – p@k

[Two charts of precision at k (p@k) against k, vertical axis from 0 to 0.35. First chart: TF-IDF, BM25, HITS, PageRank, Borda. Second chart: product, Normal_sum, Sum, Weighted_Sum, SVM_Linear, SVM_RBF, OWA, SimClick.]
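Precision at k (p@k) is the fraction of the top k results that are relevant. A minimal sketch follows; the ranking and the relevance judgments are toy values invented for the example, not benchmark data.

```python
from typing import List, Set


def precision_at_k(ranking: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the first k ranked documents that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranking[:k]
    # Dividing by k (not by len(top_k)) penalizes rankings shorter than k.
    return sum(1 for doc in top_k if doc in relevant) / k


# Toy ranking and hypothetical relevance judgments.
ranking = ["D1", "D4", "D3", "D7", "D2"]
relevant = {"D1", "D2", "D5"}

p_at_1 = precision_at_k(ranking, relevant, 1)
p_at_3 = precision_at_k(ranking, relevant, 3)
p_at_5 = precision_at_k(ranking, relevant, 5)
```

Plotting this value for k = 1, 2, 3, … for each ranking method gives curves of the kind the p@k charts report.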

Page 20: Application of Ensemble Models in Web Ranking

LETOR Benchmark – MAP

[Bar chart of MAP, vertical axis from 0 to 0.25, for: TF-IDF, BM25, HITS, PageRank, Borda, product, Normal_sum, Sum, Weighted_Sum, Weighted_Normal_Sum, SVM_Linear, SVM_RBF, OWA, SimClick.]
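Mean average precision (MAP) averages, over all queries, the average precision of each query's ranking. The sketch below assumes the common convention of dividing by the number of relevant documents for the query; evaluation tools differ in such details, so this is not necessarily the exact variant behind the chart. The rankings and judgments are toy values.

```python
from typing import List, Set, Tuple


def average_precision(ranking: List[str], relevant: Set[str]) -> float:
    """Mean of the precision values at each rank where a relevant document appears."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for position, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / position  # precision at this cut-off
    return total / len(relevant)


def mean_average_precision(runs: List[Tuple[List[str], Set[str]]]) -> float:
    """Average the per-query average precision over all (ranking, relevant) pairs."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Two toy queries with hypothetical relevance judgments.
runs = [
    (["D1", "D4", "D2"], {"D1", "D2"}),  # AP = (1/1 + 2/3) / 2
    (["D7", "D5"], {"D5"}),              # AP = (1/2) / 1
]

map_score = mean_average_precision(runs)
```

Unlike p@k, MAP rewards placing every relevant document early, which makes it a single-number summary suited to bar-chart comparisons across methods.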

Page 21: Application of Ensemble Models in Web Ranking

DotIR Benchmark – p@k

[Two charts of precision at k (p@k) for k = 1 to 16, vertical axis from 0 to 0.9. First chart: TF-IDF, BM25, HITS, PageRank, Borda. Second chart: product, Normal_sum, Sum, SVM_RBF, OWA, SimClick.]

Page 22: Application of Ensemble Models in Web Ranking

DotIR Benchmark – MAP

[Bar chart of MAP, vertical axis from 0 to 0.6, for: TF-IDF, BM25, HITS, PageRank, Borda, product, Normal_sum, Sum, Weighted_Sum, Weighted_Normal_Sum, SVM_Linear, SVM_RBF, OWA, SimClick.]

Page 23: Application of Ensemble Models in Web Ranking

Summary

Motivation:
Ranking algorithms play an important role
Content-based and connectivity-based algorithms have low precision

Solution:
Use different ensemble models to combine ranking algorithms based on learning

Results:
The LETOR benchmark was used for evaluation
More research is needed on the newly built DotIR collection

Page 24: Application of Ensemble Models in Web Ranking


Page 25: Application of Ensemble Models in Web Ranking

References

Ali Mohammad Zareh Bidoki, Pedram Ghodsnia, Nasser Yazdani, “A3CRank: An Adaptive Ranking Method Based on Connectivity, Content and Click-through Data”, Information Processing and Management, 2010.

Ali Mohammad Zareh Bidoki, “Combination of Documents Features Based on Simulated Click-through Data”, ECIR 2009.

Page 26: Application of Ensemble Models in Web Ranking

Thank You

Any Questions?