Other IR Models



Page 1: Other IR Models

Other IR Models

[Figure: a taxonomy of IR models, organized by user task]

User Task:
Retrieval (ad hoc, filtering) and Browsing

Classic Models: Boolean, vector, probabilistic

Set Theoretic: fuzzy, extended Boolean

Algebraic: generalized vector, latent semantic indexing, neural networks

Probabilistic: inference network, belief network

Structured Models: non-overlapping lists, proximal nodes

Browsing: flat, structure guided, hypertext

Page 2: Other IR Models

Another Vector Model: Motivation

1. Index terms have synonyms. [Use thesauri?]

2. Index terms have multiple meanings (polysemy). [Use restricted vocabularies or more precise queries?]

3. Index terms are not independent; think “phrases”. [Use combinations of terms?]

Page 3: Other IR Models

Latent Semantic Indexing/Analysis

Basic Idea: Keywords in a query are just one way of specifying the information need. One really wants to specify the key concepts rather than words.

Assume a latent semantic structure underlying the term-document data that is partially obscured by exact word choice.

Page 4: Other IR Models

LSI In Brief

Map terms into a lower-dimensional space (via SVD) to remove "noise" and force clustering of similar words.

Pre-process the corpus to create the reduced vector space.

Match queries to docs in the reduced space.

Page 5: Other IR Models

SVD for Term-Doc Matrix

Decompose the t x d term-document matrix X by singular value decomposition:

$$X = T_0\, S_0\, D_0^T \qquad (t \times d) = (t \times m)(m \times m)(m \times d)$$

where m is the rank of X (<= min(t, d)), T_0 is the orthonormal matrix of eigenvectors of the term-term correlation X X^T, S_0 is the diagonal matrix of singular values, and D_0 is the orthonormal matrix of eigenvectors of the doc-doc correlation X^T X (it enters the factorization transposed).

Page 6: Other IR Models

Reducing Dimensionality

Order the singular values in S_0 by size, keep the k largest, and delete the other rows/columns in S_0, T_0 and D_0 to form the rank-k approximation

$$\hat{C} = T\, S\, D^T$$

The approximate model is the rank-k model with the best possible least-squares fit to X.

Pick k large enough to fit the real structure, but small enough to eliminate noise; usually ~100-300.
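A minimal NumPy sketch of slides 5-6 (the toy matrix X and the choice k = 2 are my own; the slides define only the factorization):

```python
import numpy as np

# Toy term-document matrix X (t terms x d documents), e.g. raw term counts.
X = np.array([
    [1, 0, 1, 0],   # "ship"
    [0, 1, 0, 1],   # "boat"
    [1, 1, 0, 0],   # "ocean"
    [0, 0, 1, 1],   # "voyage"
], dtype=float)

# Full SVD: X = T0 @ diag(S0) @ D0t, with T0 and D0t orthonormal.
T0, S0, D0t = np.linalg.svd(X, full_matrices=False)

# Keep the k largest singular values to form the rank-k approximation C_hat.
k = 2
C_hat = T0[:, :k] @ np.diag(S0[:k]) @ D0t[:k, :]
```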

Page 7: Other IR Models

Computing Similarities in LSI

How similar are two terms? The dot product between two row vectors of $\hat{C}$.

How similar are two documents? The dot product between two column vectors of $\hat{C}$.

How similar are a term and a document? The value of an individual cell of $\hat{C}$.
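Continuing the sketch above with the reduced matrix C_hat:

```python
# Term-term similarities: dot products between rows of C_hat.
term_sims = C_hat @ C_hat.T      # (t x t)

# Doc-doc similarities: dot products between columns of C_hat.
doc_sims = C_hat.T @ C_hat       # (d x d)

# Term i vs. document j: the value of an individual cell.
sim_term0_doc2 = C_hat[0, 2]
```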

Page 8: Other IR Models

Query Retrieval

As before, treat the query as a short document: make it column 0 of $\hat{C}$. Row 0 of the doc-doc similarity matrix $\hat{C}^T \hat{C}$ then provides the rank of the docs with respect to the query.
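A sketch following that recipe, continuing the example above (the query vector is my own):

```python
# Query as a term-frequency column, appended as column 0 before the SVD.
q = np.array([1, 0, 1, 0], dtype=float)   # query mentions "ship" and "ocean"
Xq = np.column_stack([q, X])

Tq, Sq, Dqt = np.linalg.svd(Xq, full_matrices=False)
Cq_hat = Tq[:, :k] @ np.diag(Sq[:k]) @ Dqt[:k, :]

# Row 0 of the doc-doc similarity matrix scores every document vs. the query.
scores = (Cq_hat.T @ Cq_hat)[0, 1:]       # drop the query's self-similarity
ranking = np.argsort(-scores)             # best-matching documents first
```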

Page 9: Other IR Models

LSI Issues

Requires access to the corpus to compute the SVD. How to compute it efficiently for the Web?

What is the right value of k?

Can LSI be used for cross-language retrieval?

The size of the corpus is limited: "one student's reading through high school" (Landauer 2002).

Page 10: Other IR Models

Other Vector Model: Neural Network

Basic idea:

A 3-layer neural net: query terms, document terms, documents.

Signal propagation based on the classic similarity computation.

Tune the weights.

Page 11: Other IR Models

Neural Network Diagram

from Wilkinson and Hingston, SIGIR 1991

[Figure: query term nodes (k_a, k_b, k_c) link to the matching document term nodes (k_1, ..., k_t), which link to document nodes (d_1, ..., d_j, d_j+1, ..., d_N)]

Page 12: Other IR Models

Computing Document Rank

Weight from the query to document term k_i:

$$W_{iq} = \frac{w_{iq}}{\sqrt{\sum_i w_{iq}^2}}$$

Weight from document term k_i to document d_j:

$$W_{ij} = \frac{w_{ij}}{\sqrt{\sum_i w_{ij}^2}}$$
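A sketch of one propagation pass under these definitions (the function and array names are mine; the weights w could be tf-idf values):

```python
import numpy as np

def document_rank(w_q, w):
    """Query terms -> document terms -> documents, one forward pass.

    w_q: (t,) query term weights; w: (t, N) document term weights.
    """
    W_q = w_q / np.sqrt(np.sum(w_q ** 2))                   # query-side weights
    W = w / np.sqrt(np.sum(w ** 2, axis=0, keepdims=True))  # document-side weights
    # Each document node sums the signals arriving from its term nodes,
    # which reproduces the classic cosine similarity ranking.
    return W_q @ W
```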

Page 13: Other IR Models

Probabilistic Models

Principle: Given a user query q and a document d in the collection, estimate the probability that the user will find d relevant. (How?)

The user rates a retrieved subset. The system uses the rating to refine the subset. Over time, the retrieved subset should converge on the relevant set.

Page 14: Other IR Models

Computing Similarity I

$$\mathrm{sim}(d_j, q) = \frac{P(R \mid d_j)}{P(\bar{R} \mid d_j)} = \frac{P(d_j \mid R)\, P(R)}{P(d_j \mid \bar{R})\, P(\bar{R})}$$

(the second equality is Bayes' rule)

where
P(R | d_j) is the probability that document d_j is relevant to query q,
P(R̄ | d_j) is the probability that d_j is non-relevant to query q,
P(d_j | R) is the probability of randomly selecting d_j from the set R of relevant documents,
P(R) is the probability that a randomly selected document is relevant.

Since P(R) and P(R̄) do not depend on the document, ranking by sim(d_j, q) reduces to ranking by P(d_j | R) / P(d_j | R̄).

Page 15: Other IR Models

Computing Similarity II

Assuming independence of the index terms:

$$\mathrm{sim}(d_j, q) \sim \sum_{i=1}^{t} w_{i,q}\, w_{i,j} \left( \log\frac{P(k_i \mid R)}{1 - P(k_i \mid R)} + \log\frac{1 - P(k_i \mid \bar{R})}{P(k_i \mid \bar{R})} \right)$$

where P(k_i | R) is the probability that index term k_i is present in a document randomly selected from R, and P(k_i | R̄) is the same probability for the non-relevant documents.

Page 16: Other IR Models

Initializing Probabilities

Assume constant probabilities for the index terms:

$$P(k_i \mid R) = 0.5$$

Assume the distribution of index terms in the non-relevant documents matches the overall collection distribution:

$$P(k_i \mid \bar{R}) = \frac{df_i}{N}$$

where df_i is the document frequency of k_i and N is the number of documents.

Page 17: Other IR Models

Improving Probabilities

Assumptions: approximate the probability given relevance by the fraction of the V documents retrieved so far that contain index term k_i:

$$P(k_i \mid R) = \frac{V_i}{V}$$

Approximate the probabilities given non-relevance by assuming the documents not yet retrieved are non-relevant:

$$P(k_i \mid \bar{R}) = \frac{df_i - V_i}{N - V}$$
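A compact sketch of slides 15-17 together (the binary incidence matrix, the names, and the eps smoothing are my additions; the slides' weights w_{i,q}, w_{i,j} are treated as binary):

```python
import numpy as np

def prob_rank(D, q, retrieved=None, eps=1e-6):
    """Classic probabilistic model: score every document against query q.

    D: (N, t) binary term-incidence matrix; q: (t,) binary query vector;
    retrieved: indices of the user-approved subset V, or None on pass one.
    """
    N, t = D.shape
    df = D.sum(axis=0)                        # document frequency per term
    if retrieved is None:                     # initial estimates (slide 16)
        p_rel = np.full(t, 0.5)
        p_non = df / N
    else:                                     # refined estimates (slide 17)
        V = len(retrieved)
        V_i = D[retrieved].sum(axis=0)
        p_rel = V_i / V
        p_non = (df - V_i) / (N - V)
    weight = (np.log((p_rel + eps) / (1 - p_rel + eps))
              + np.log((1 - p_non + eps) / (p_non + eps)))
    return (D * q) @ weight                   # sim(d_j, q) for every j
```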

Page 18: Other IR Models

Classic Probabilistic Model Summary

Pros:
ranking is based on an assessed probability
can be approximated without user intervention

Cons:
really needs the user to determine the set V
ignores term frequency
assumes independence of terms

Page 19: Other IR Models

Probabilistic Alternative: Bayesian (Belief) Networks

A graphical structure to represent the dependence between variables, in which the following holds:

1. a set of random variables for the nodes
2. a set of directed links
3. a conditional probability table for each node, indicating its relationship with its parents
4. a directed acyclic graph

Page 20: Other IR Models

Belief Network Example

[Figure: Burglary and Earthquake both point to Alarm; Alarm points to JohnCalls and MaryCalls]

P(B) = .001    P(E) = .002

B | E | P(A)
T | T | .95
T | F | .94
F | T | .29
F | F | .001

A | P(J)
T | .90
F | .05

A | P(M)
T | .70
F | .01

from Russell & Norvig

Page 21: Other IR Models

Belief Network Example (cont.)

Probability of a false notification: the alarm sounded and both people called, but there was no burglary and no earthquake. Using the CPTs from the previous slide:

$$P(J \land M \land A \land \lnot B \land \lnot E) = P(J \mid A)\, P(M \mid A)\, P(A \mid \lnot B, \lnot E)\, P(\lnot B)\, P(\lnot E) = .9 \times .7 \times .001 \times .999 \times .998 \approx .00062$$
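A sketch that evaluates this joint directly from the CPTs (the dictionary encoding is mine):

```python
# CPTs from the slide (Russell & Norvig's burglary network).
P_B, P_E = 0.001, 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}   # P(Alarm | B, E)
P_J = {True: 0.90, False: 0.05}                      # P(JohnCalls | Alarm)
P_M = {True: 0.70, False: 0.01}                      # P(MaryCalls | Alarm)

# P(J, M, A, ~B, ~E) = P(J|A) P(M|A) P(A|~B,~E) P(~B) P(~E)
p = P_J[True] * P_M[True] * P_A[(False, False)] * (1 - P_B) * (1 - P_E)
print(p)   # ~0.00062
```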

Page 22: Other IR Models

Inference Networks for IR

Random variables are associated with documents, index terms and queries.

An edge from a document node to a term node increases the belief in that term.

[Figure: document node d_j points to index term nodes k_1, k_2, ..., k_i, ..., k_t; the term nodes feed a keyword query node q and, through AND/OR nodes, a Boolean query node q_1, combined in q_2; the query nodes feed the information need node I]

Page 23: Other IR Models

Computing Rank in Inference Networks for IR

(Same network as on the previous slide.) q is the keyword query, q_1 is the Boolean query, and I is the information need.

The rank of a document is computed as P(q ∧ d_j):

$$P(q \land d_j) = \sum_{k} P(q \mid k)\, P(k \mid d_j)\, P(d_j)$$

where the sum ranges over the possible states k of the index term nodes.

Page 24: Other IR Models

Where do probabilities come from? (Boolean Model)

Uniform priors on documents:

$$P(d_j) = \frac{1}{N}$$

Only the terms in the document are active:

$$P(k_i \mid d_j) = \begin{cases} 1 & \text{if } g_i(d_j) = 1 \\ 0 & \text{otherwise} \end{cases}$$

The query is matched to keywords a la the Boolean model:

$$P(q \mid k) = \begin{cases} 1 & \text{if } \exists\, q_{cc} : (q_{cc} \in q_{dnf}) \land (\forall_{k_i},\, g_i(k) = g_i(q_{cc})) \\ 0 & \text{otherwise} \end{cases}$$

where g_i(x) is the 0/1 state of index term k_i in x, and q_dnf is the disjunctive normal form of the query.
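A brute-force sketch for tiny vocabularies, enumerating all 2^t term states k (the function and variable names are mine; q_dnf holds full-length term-state tuples, one per conjunctive component):

```python
import itertools
import numpy as np

def boolean_inference_rank(D, q_dnf):
    """P(q AND d_j) per document, using the Boolean-model probabilities above.

    D: (N, t) binary incidence matrix; q_dnf: set of term-state tuples,
    e.g. {(1, 0, 1)} for one conjunctive component over t = 3 terms.
    """
    N, t = D.shape
    scores = np.zeros(N)
    for k in itertools.product([0, 1], repeat=t):     # all index-term states
        p_q_k = 1.0 if k in q_dnf else 0.0            # P(q | k)
        for j in range(N):
            p_k_d = 1.0 if k == tuple(D[j]) else 0.0  # P(k | d_j)
            scores[j] += p_q_k * p_k_d / N            # times P(d_j) = 1/N
    return scores
```

Since P(k | d_j) puts all its mass on the document's own term state, the sum collapses: a document scores 1/N exactly when its incidence vector matches one of the query's conjunctive components.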

Page 25: Other IR Models

Belief Network Formulation

Different network topology.

Does not consider each document individually.

Adopts a set-theoretic view.