2008 © ChengXiang Zhai China-US-France Summer School, Lotus Hill Inst., 2008 1
Formal Retrieval Frameworks
ChengXiang Zhai (翟成祥), Department of Computer Science
Graduate School of Library & Information Science
Institute for Genomic Biology, Statistics
University of Illinois, Urbana-Champaign
http://www-faculty.cs.uiuc.edu/~czhai, czhai@cs.uiuc.edu
Outline
• Risk Minimization Framework [Lafferty & Zhai 01, Zhai & Lafferty 06]
• Axiomatic Retrieval Framework [Fang et al. 04, Fang & Zhai 05, Fang & Zhai 06]
Risk Minimization Framework
Risk Minimization: Motivation
• Long-standing IR Challenges
– Improve IR theory
• Develop theoretically sound and empirically effective models
• Go beyond the limited traditional notion of relevance (independent, topical relevance)
– Improve IR practice
• Optimize retrieval parameters automatically
• Statistical language models (SLMs) are very promising tools …
– How can we systematically exploit SLMs in IR?
– Can SLMs offer anything hard/impossible to achieve in traditional IR?
Long-Standing IR Challenges
• Limitations of traditional IR models
– Strong assumptions on “relevance”
• Independent relevance
• Topical relevance
– Can we go beyond this traditional notion of relevance?
• Difficulty in IR practice
– Ad hoc parameter tuning
– Can’t go beyond “retrieval” to support info. access in general
More Than “Relevance”
[Figure: a pure relevance ranking vs. the desired ranking, which also accounts for redundancy and readability]
Retrieval Parameters
• Retrieval parameters are needed to
– model different user preferences
– customize a retrieval model according to different queries and documents
• So far, parameters have been set through empirical experimentation
• Can we set parameters automatically?
Systematic Applications of Language Models to IR
• Many different variants of language models have been developed, but are there many more models to be studied?
• Can we establish a road map for exploring language models in IR?
Two Main Ideas of the Risk Minimization Framework
• Retrieval as a decision process
• Systematic language modeling
Idea 1: Retrieval as Decision-Making (A More General Notion of Relevance)
Given a query,
- Which documents should be selected? (D)
- How should these docs be presented to the user? (π)
Choose: (D, π)
[Figure: presentation choices: a ranked list? an unordered subset? clustering?]
Idea 2: Systematic Language Modeling
Documents → Document Language Models (DOC MODELING)
Query → Query Language Model (QUERY MODELING)
User → Loss Function (USER MODELING)
→ Retrieval Decision: ?
Generative Model of Document & Query [Lafferty & Zhai 01b]
User U (partially observed) → query model θ_Q: p(θ_Q | U) → observed query q: p(q | θ_Q, U)
Source S (partially observed) → document model θ_D: p(θ_D | S) → observed document d: p(d | θ_D, S)
Relevance R (inferred): p(R | θ_Q, θ_D)
Applying Bayesian Decision Theory [Lafferty & Zhai 01b, Zhai 02, Zhai & Lafferty 06]
Observed: query q, user U, doc set C, source S. Hidden: θ.
Possible choices: (D_1, π_1), (D_2, π_2), …, (D_n, π_n), each incurring a loss L(D, π, θ).
The Bayes risk of a choice (D, π) is its expected loss; RISK MINIMIZATION selects
(D*, π*) = argmin_{D,π} ∫ L(D, π, θ) p(θ | q, U, C, S) dθ
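The decision rule above can be sketched with a tiny discrete example (all names and numbers here are illustrative, not from the slides): enumerate the candidate actions, compute each action's Bayes risk under the posterior over the hidden state, and pick the minimizer.

```python
# Minimal sketch of risk minimization over a discrete parameter space.
# The loss table and posterior are hypothetical, chosen only to illustrate.

def bayes_risk(loss, posterior):
    """Expected loss of one action: sum over theta of L(action, theta) * p(theta | obs)."""
    return sum(loss[theta] * p for theta, p in posterior.items())

def min_risk_action(actions, posterior):
    """Pick the action with the smallest Bayes risk."""
    return min(actions, key=lambda a: bayes_risk(actions[a], posterior))

# Posterior over a hidden state theta given (q, U, C, S), e.g. "relevant" vs not.
posterior = {"rel": 0.7, "nonrel": 0.3}
# Loss of each presentation choice under each value of theta.
actions = {
    "show": {"rel": 0.0, "nonrel": 1.0},   # showing a non-relevant doc costs 1
    "skip": {"rel": 1.0, "nonrel": 0.0},   # skipping a relevant doc costs 1
}
best = min_risk_action(actions, posterior)  # risk 0.3 for "show" vs 0.7 for "skip"
```

With a richer action space (rankings, clusterings) the same rule applies; only the loss function changes.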
Benefits of the Framework
• Systematic exploration of retrieval models (covering almost all the existing retrieval models as special cases)
• Derive general retrieval principles (risk ranking principle)
• Automatic parameter setting
• Go beyond independent-relevance (subtopic retrieval)
Special Cases of Risk Minimization
• Set-based models (choose D) → Boolean model
• Ranking models (choose π)
  – Independent loss
    • Relevance-based loss → probabilistic relevance model / generative relevance theory, two-stage LM
    • Distance-based loss → vector-space model, KL-divergence model
  – Dependent loss
    • MMR loss → subtopic retrieval model
    • MDR loss → subtopic retrieval model
Case 1: Two-stage Language Models
Generative model: p(θ_Q | U), q ~ p(q | θ_Q, U); p(θ_D | S), d ~ p(d | θ_D, S)
Loss function: l(d, θ_Q, θ_D) = 0 if θ_Q = θ_D, and c otherwise
Risk ranking formula:
R(d; q, U) ≈ p(θ̂_D | d, S) · p(q | θ̂_D, U) =rank p(q | θ̂_D, U),
where p(q | θ̂_D, U) = Π_i [ (1−λ) p(q_i | θ̂_D) + λ p(q_i | U) ]
Stage 1: compute θ̂_D (Dirichlet prior smoothing)
Stage 2: compute p(q | θ̂_D, U) (mixture model)
→ two-stage smoothing
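The two stages above can be sketched in a few lines (toy vocabulary and counts, not from the slides; the user model is approximated here by the collection model, which is one common assumption):

```python
import math

# Sketch of two-stage smoothing scoring. Stage 1: Dirichlet-prior smoothing
# of the document model. Stage 2: mix with a user/background query model.
# All counts and probabilities below are illustrative.

def dirichlet_model(doc_counts, coll_prob, mu=2000.0):
    """Stage 1: smoothed document model p(w | theta_d)."""
    dlen = sum(doc_counts.values())
    return lambda w: (doc_counts.get(w, 0) + mu * coll_prob[w]) / (dlen + mu)

def two_stage_score(query, doc_counts, coll_prob, mu=2000.0, lam=0.5):
    """log p(q | d): Stage 2 mixes the smoothed doc model with p(w | U) ~ p(w | C)."""
    p_d = dirichlet_model(doc_counts, coll_prob, mu)
    return sum(math.log((1 - lam) * p_d(w) + lam * coll_prob[w]) for w in query)

coll_prob = {"robot": 0.01, "arm": 0.02, "the": 0.2}
d1 = {"robot": 5, "arm": 1, "the": 10}
d2 = {"the": 16}
q = ["robot", "arm"]
assert two_stage_score(q, d1, coll_prob) > two_stage_score(q, d2, coll_prob)
```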
Case 2: KL-divergence Retrieval Models
Generative model: as above (p(θ_Q | U), q; p(θ_D | S), d)
Loss function: l(d, θ_Q, θ_D) = c · Δ(θ_Q, θ_D), with Δ(θ_Q, θ_D) = D(θ_Q || θ_D)
Risk ranking formula: R(d; q) =rank −D(θ̂_Q || θ̂_D)
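Ranking by −D(θ̂_Q || θ̂_D) can be sketched directly (toy models, not from the slides): at ranking time the query-model entropy is constant across documents, so only the cross-entropy part matters.

```python
import math

# Sketch of KL-divergence ranking. Score a document by
# sum_w p(w | theta_Q) * log p(w | theta_D), which is rank-equivalent to
# -D(theta_Q || theta_D). The distributions below are illustrative and
# the doc models are assumed already smoothed (no zero probabilities).

def kl_score(query_model, doc_model):
    return sum(p_q * math.log(doc_model[w])
               for w, p_q in query_model.items() if p_q > 0)

theta_q = {"robot": 0.6, "application": 0.4}
theta_d1 = {"robot": 0.3, "application": 0.2, "other": 0.5}
theta_d2 = {"robot": 0.05, "application": 0.05, "other": 0.9}
assert kl_score(theta_q, theta_d1) > kl_score(theta_q, theta_d2)
```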
Case 3: Aspect Generative Model of Document & Query
User U → θ_Q = (θ_1, …, θ_k): p(θ_Q | U); query q: p(q | θ_Q, U)
Source S → θ_D = (θ_1, …, θ_k): p(θ_D | S); document d: p(d | θ_D, S)
PLSI: p(d | θ_D, S) = Π_{i=1..n} Σ_{a=1..A} p(w_i | a) p(a | θ_D), where d = w_1 … w_n
LDA: p(d | S) = ∫ [ Π_{i=1..n} Σ_{a=1..A} p(w_i | a) p(a | θ) ] Dir(θ) dθ
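The PLSI-style likelihood above is straightforward to compute for fixed aspect models (the aspect-word distributions and mixing weights below are toy values, not from the slides; the LDA integral is omitted):

```python
import math

# Sketch of the PLSI document likelihood:
# log p(d | theta) = sum_i log( sum_a p(w_i | a) p(a | theta) ).

def plsi_log_likelihood(doc, aspect_word, theta):
    ll = 0.0
    for w in doc:
        # small floor stands in for smoothing of unseen words
        ll += math.log(sum(aspect_word[a].get(w, 1e-12) * theta[a] for a in theta))
    return ll

aspect_word = {
    "robotics": {"robot": 0.5, "weld": 0.3, "arm": 0.2},
    "finance":  {"stock": 0.6, "bank": 0.4},
}
doc = ["robot", "weld", "robot"]
better = plsi_log_likelihood(doc, aspect_word, {"robotics": 0.8, "finance": 0.2})
worse = plsi_log_likelihood(doc, aspect_word, {"robotics": 0.1, "finance": 0.9})
assert better > worse   # the robotics-heavy mixture explains the doc better
```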
Optimal Ranking for Independent Loss
Decision space = {rankings}. Assume sequential browsing and an independent loss:
L(π, θ) = Σ_{j=1..N} s_j · l(θ_{π_j})   (s_j: a position weight)
π* = argmin_π ∫ L(π, θ) p(θ | q, U, C, S) dθ
Independent risk = independent scoring: the optimal π* ranks documents in ascending order of the expected risk
r(d | q, U, C, S) = ∫ l(θ) p(θ | d, q, U, C, S) dθ
(the “risk ranking principle” [Zhai 02])
Automatic Parameter Tuning
• Retrieval parameters are needed to
– model different user preferences
– customize a retrieval model to specific queries and documents
• Retrieval parameters in traditional models
– EXTERNAL to the model, hard to interpret
– Parameters are introduced heuristically to implement “intuition”
– No principles to quantify them, must set empirically through many experiments
– Still no guarantee for new queries/documents
• Language models make it possible to estimate parameters…
The Way to Automatic Tuning ...
• Parameters must be PART of the model!
– Query modeling (explain difference in query)
– Document modeling (explain difference in doc)
• De-couple the influence of a query on parameter setting from that of documents
– To achieve stable setting of parameters
– To pre-compute query-independent parameters
Parameter Setting in Risk Minimization
Query → Query Language Model: estimate query model parameters
Documents → Document Language Models: estimate doc model parameters
User → Loss Function: set user model parameters
Generative Relevance Hypothesis [Lavrenko 04]
• Generative Relevance Hypothesis:
  – For a given information need, queries expressing that need and documents relevant to that need can be viewed as independent random samples from the same underlying generative model
• A special case of risk minimization when document models and query models are in the same space
• Implications for retrieval models: “the same underlying generative model” makes it possible to
  – Match queries and documents even if they are in different languages or media
  – Estimate/improve a relevant document model based on example queries or vice versa
Risk minimization can easily go beyond independent relevance…
Aspect Retrieval
Query: What are the applications of robotics in the world today?
Find as many DIFFERENT applications as possible.
Example Aspects:
A1: spot-welding robotics
A2: controlling inventory
A3: pipe-laying robots
A4: talking robot
A5: robots for loading & unloading memory tapes
A6: robot [telephone] operators
A7: robot cranes
…
Aspect judgments (documents × aspects A1 A2 A3 … Ak):
d1: 1 1 0 0 … 0 0
d2: 0 1 1 1 … 0 0
d3: 0 0 0 0 … 1 0
…
dk: 1 0 1 0 … 0 1
Must go beyond independent relevance!
Evaluation Measures
• Aspect Coverage (AC): measures per-doc coverage
  – #distinct-aspects / #docs
  – Equivalent to the “set cover” problem, NP-hard
• Aspect Uniqueness (AU): measures redundancy
  – #distinct-aspects / #aspects
  – Equivalent to the “volume cover” problem, NP-hard
• Example (accumulated counts over a ranking d1, d2, d3 with aspect vectors 0001001, 0101100, 1000101):
  #doc:       1        2        3      …
  #asp:       2        5        8      …
  #uniq-asp:  2        4        5
  AC:         2/1=2.0  4/2=2.0  5/3=1.67
  AU:         2/2=1.0  4/5=0.8  5/8=0.625
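The accumulated AC/AU computation above is easy to mechanize (the judgments here are a made-up example, with each doc given as a set of aspect ids):

```python
# Sketch of Aspect Coverage (AC) and Aspect Uniqueness (AU) at each rank.

def ac_au(ranked_docs):
    """Return a list of (AC, AU) after each rank position."""
    seen, total_aspects, out = set(), 0, []
    for i, aspects in enumerate(ranked_docs, start=1):
        total_aspects += len(aspects)          # all aspect occurrences so far
        seen |= set(aspects)                   # distinct aspects so far
        ac = len(seen) / i                     # #distinct-aspects / #docs
        au = len(seen) / total_aspects         # #distinct-aspects / #aspects
        out.append((ac, au))
    return out

ranked = [{1, 2}, {2, 3, 4}, {1, 5}]
res = ac_au(ranked)
assert res[:2] == [(2.0, 1.0), (2.0, 0.8)]
```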
Dependent Relevance Ranking
• In general, the computation of the optimal ranking is NP-hard
• A general greedy algorithm
– Pick the first document according to INDEPENDENT relevance
– Given that we have picked k documents, evaluate the CONDITIONAL relevance of each candidate document
– Choose the document that has the highest conditional relevance value
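The general greedy algorithm above can be sketched as follows; `conditional_score` is a hypothetical hook into which any MMR/MDR-style conditional relevance function could be plugged (the toy scorer here just penalizes already-covered aspects):

```python
# Sketch of greedy ranking under dependent relevance.

def greedy_rerank(candidates, conditional_score, k):
    """Pick k docs; each pick maximizes relevance conditioned on earlier picks."""
    picked, pool = [], list(candidates)
    while pool and len(picked) < k:
        best = max(pool, key=lambda d: conditional_score(d, picked))
        picked.append(best)
        pool.remove(best)
    return picked

# Illustrative data: aspect sets and independent-relevance scores per doc.
docs = {"d1": {1, 2}, "d2": {1, 2, 3}, "d3": {4}}
rel = {"d1": 0.9, "d2": 0.8, "d3": 0.5}

def score(d, picked):
    covered = set().union(*(docs[p] for p in picked)) if picked else set()
    return rel[d] - 0.3 * len(docs[d] & covered)   # penalty per covered aspect

assert greedy_rerank(docs, score, 2) == ["d1", "d3"]
```

Note how the second pick is `d3`, not the more relevant but redundant `d2`.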
Loss Function L(θ_{k+1} | θ_1 … θ_k)
Given already-ranked docs d_1 … d_k (with known models θ_1 … θ_k), which d_{k+1} comes next?
• Maximal Marginal Relevance (MMR): the best d_{k+1} is novel & relevant
  – Relevance: Rel(θ_{k+1}); Novelty/Redundancy: Nov(θ_{k+1} | θ_1 … θ_k)
• Maximal Diverse Relevance (MDR): the best d_{k+1} is complementary in coverage
  – Aspect coverage distributions p(a | θ_i)
Maximal Marginal Relevance (MMR) Models
• Maximizing aspect coverage indirectly through redundancy elimination
• Conditional-Rel. = novel + relevant
• Elements
– Redundancy/Novelty measure
– Combination of novelty and relevance
A Mixture Model for Redundancy
A new document’s words are generated from a two-component mixture:
  P(w | Background) (collection), with weight 1−λ
  P(w | Old) (reference document), with weight λ
λ = ? → Maximum Likelihood estimate via Expectation-Maximization
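The EM estimate of λ for this two-component mixture can be sketched directly (toy distributions, not from the slides): a document well explained by the reference (“Old”) model gets a high λ, i.e. it is largely redundant.

```python
# Sketch of EM for the redundancy mixture: each token of the new document is
# generated from p(w|Old) with probability lam, else from p(w|Background).

def em_lambda(doc, p_old, p_bg, iters=50):
    lam = 0.5
    for _ in range(iters):
        # E-step: posterior that each token came from the Old (reference) model
        z = [lam * p_old[w] / (lam * p_old[w] + (1 - lam) * p_bg[w]) for w in doc]
        # M-step: lam = expected fraction of tokens drawn from the Old model
        lam = sum(z) / len(z)
    return lam

p_old = {"robot": 0.5, "weld": 0.4, "the": 0.1}   # reference document model
p_bg = {"robot": 0.05, "weld": 0.05, "the": 0.9}  # collection model
redundant = ["robot", "weld", "robot"]            # mostly explained by Old
novel = ["the", "the", "the"]                     # mostly background
assert em_lambda(redundant, p_old, p_bg) > em_lambda(novel, p_old, p_bg)
```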
Cost-based Combination of Relevance and Novelty
l(d_{k+1} | q, d_1 … d_k, {θ_i}) = c_2 · p(Rel | θ_{k+1})(1 − p(New | θ_{k+1})) + c_3 · (1 − p(Rel | θ_{k+1}))
=rank p(Rel | θ_{k+1}) · (ρ + p(New | θ_{k+1})),  where ρ = (c_3 − c_2)/c_2 and c_3 > c_2 > 0
≈rank p(q | θ̂_{k+1}) · (ρ + p(New | θ̂_{k+1}))
      [relevance score]  [novelty score]
Maximal Diverse Relevance (MDR) Models
• Maximizing aspect coverage directly through aspect modeling
• Conditional-rel. = complementary coverage
• Elements
– Aspect loss function
– Generative Aspect Model
Aspect Generative Model of Document & Query
User U → θ_Q = (θ_1, …, θ_k): p(θ_Q | U); query q: p(q | θ_Q, U)
Source S → θ_D = (θ_1, …, θ_k): p(θ_D | S); document d: p(d | θ_D, S)
PLSI: p(d | θ_D, S) = Π_{i=1..n} Σ_{a=1..A} p(w_i | a) p(a | θ_D), where d = w_1 … w_n
LDA: p(d | S) = ∫ [ Π_{i=1..n} Σ_{a=1..A} p(w_i | a) p(a | θ) ] Dir(θ) dθ
Aspect Loss Function
l(d_{k+1} | q, d_1 … d_k, {θ_i}) = Δ(θ_Q, θ_{1,…,k+1}) = D(θ_Q || θ_{1,…,k+1})
where p(a | θ_{1,…,k+1}) = (1 − λ) · (1/k) Σ_{i=1..k} p(a | θ_i) + λ · p(a | θ_{k+1})
(generative model as before: p(θ_Q | U), q; p(θ_D | S), d)
Risk ranking: prefer the d_{k+1} minimizing D(θ̂_Q || θ̂_{1,…,k+1})
Aspect Loss Function: Illustration
Desired coverage: p(a | Q)
“Already covered”: p(a | θ_1) … p(a | θ_{k−1})
New candidate: p(a | θ_k), which may be non-relevant, redundant, or perfect
Combined coverage:
p(a | θ_{1,…,k}) = (1 − λ) · (1/(k−1)) Σ_{i=1..k−1} p(a | θ_i) + λ · p(a | θ_k)
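The selection step this loss induces can be sketched over two aspects (all distributions here are toy values): combine the already-covered distribution with each candidate and prefer the candidate whose combined coverage is closest, in KL divergence, to the desired p(a|Q).

```python
import math

# Sketch of the aspect-loss selection step.

def kl(p, q):
    """D(p || q) over a shared aspect set."""
    return sum(pa * math.log(pa / q[a]) for a, pa in p.items() if pa > 0)

def combined(covered_avg, candidate, lam=0.5):
    """Combined coverage: mix the average of covered docs with the candidate."""
    return {a: (1 - lam) * covered_avg[a] + lam * candidate[a] for a in covered_avg}

desired = {"A1": 0.5, "A2": 0.5}                 # p(a | Q)
covered_avg = {"A1": 0.9, "A2": 0.1}             # average of p(a | theta_1..k-1)
cand_redundant = {"A1": 0.9, "A2": 0.1}          # covers what is already covered
cand_complement = {"A1": 0.1, "A2": 0.9}         # complementary coverage
loss_r = kl(desired, combined(covered_avg, cand_redundant))
loss_c = kl(desired, combined(covered_avg, cand_complement))
assert loss_c < loss_r   # the complementary candidate is preferred
```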
Risk Minimization: Summary
• Risk minimization is a general probabilistic retrieval framework
– Retrieval as a decision problem (=risk min.)
– Separate/flexible language models for queries and docs
• Advantages
– A unified framework for existing models
– Automatic parameter tuning due to LMs
– Allows for modeling complex retrieval tasks
• Lots of potential for exploring LMs…
• For more information, see [Zhai 02]
Future Research Directions
• Modeling latent structures of documents
– Introduce source structures (naturally suggest structure-based smoothing methods)
• Modeling multiple queries and clickthroughs of the same user
– Let the observation include multiple queries and clickthroughs
• Collaborative search
– Introduce latent interest variables to tie similar users together
• Modeling interactive search
Axiomatic Retrieval Framework
Most of the following slides are from Hui Fang’s presentation
Traditional Way of Modeling the Relevance
Query → QRep; Document → DRep → Relevance?
Rel ≈ Sim(DRep, QRep): Vector Space Models [Salton et al. 75, Salton et al. 83, Salton et al. 89, Singhal 96]
Rel ≈ P(R=1 | DRep, QRep): Probabilistic Models [Fuhr et al. 92, Lafferty et al. 03, Ponte et al. 98, Robertson et al. 76, Turtle et al. 91, van Rijsbergen et al. 77]
(parameters tuned on a test collection)
• No way to predict the performance and identify the weaknesses
• Sophisticated parameter tuning
No Way to Predict the Performance
(e.g., the pivoted normalization formula)
S(Q,D) = Σ_{t∈Q∩D} c(t,Q) · log((N+1)/df(t)) · (1 + log(1 + log(c(t,D)))) / ((1−s) + s·|D|/avdl)
Sophisticated Parameter Tuning
S(Q,D) = Σ_{t∈Q∩D} log((N − df(t) + 0.5)/(df(t) + 0.5)) · ((k1 + 1)·c(t,D)) / (c(t,D) + k1·((1−b) + b·|D|/avdl)) · ((k3 + 1)·c(t,Q)) / (k3 + c(t,Q))
“k1, b and k3 are parameters which depend on the nature of the queries and possibly on the database; k1 and b default to 1.2 and 0.75 respectively, but smaller values of b are sometimes advantageous; in long queries k3 is often set to 7 or 1000.” [Robertson et al. 1999]
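The Okapi BM25 formula quoted above can be sketched directly (the collection statistics below are toy values):

```python
import math

# Sketch of Okapi BM25 with the usual default parameters.

def bm25(query, doc, N, df, avdl, k1=1.2, b=0.75, k3=1000.0):
    dlen = sum(doc.values())
    score = 0.0
    for t in set(query) & set(doc):
        qtf, tf = query.count(t), doc[t]
        idf = math.log((N - df[t] + 0.5) / (df[t] + 0.5))
        tf_part = ((k1 + 1) * tf) / (tf + k1 * ((1 - b) + b * dlen / avdl))
        qtf_part = ((k3 + 1) * qtf) / (k3 + qtf)
        score += idf * tf_part * qtf_part
    return score

N, avdl = 1000, 10
df = {"robot": 20, "the": 900}
d1 = {"robot": 3, "the": 7}
d2 = {"the": 10}
assert bm25(["robot"], d1, N, df, avdl) > bm25(["robot"], d2, N, df, avdl)
```

Note the quoted caveat in code form: the IDF factor goes negative once df(t) > (N + 1)/2, which is exactly the weakness the axiomatic analysis later exposes.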
High Parameter Sensitivity
[Figure: retrieval performance varies sharply with the parameter value]
Hui Fang’s Thesis Work [Fang 07]
Propose a novel axiomatic framework, where relevance is directly modeled with term-based constraints
– Predict the performance of a function analytically[Fang et al., SIGIR04]
– Derive more robust and effective retrieval functions [Fang & Zhai, SIGIR05, Fang & Zhai, SIGIR06]
– Diagnose weaknesses and strengths of retrieval functions [Fang & Zhai, under review]
Traditional Way of Modeling the Relevance
Query → QRep; Document → DRep → Relevance?
Rel ≈ Sim(DRep, QRep): Vector Space Models
Rel ≈ P(R=1 | DRep, QRep): Probabilistic Models
(parameters tuned on a test collection)
Axiomatic Approach to Relevance Modeling
Query → QRep; Document → DRep → Rel(Q,D), constrained by Constraint 1, Constraint 2, …, Constraint m
(1) Predict performance
(2) Develop more robust functions
(3) Diagnose weaknesses: Collection (constraint 1), Collection (constraint 2), …, Collection (constraint m)
We are here.
Part 1: Define retrieval constraints [Fang et al. SIGIR 2004]
• Pivoted Normalization Method
S(Q,D) = Σ_{w∈Q∩D} (1 + ln(1 + ln(c(w,d)))) / ((1−s) + s·|d|/avdl) · c(w,q) · ln((N+1)/df(w))
• Dirichlet Prior Method
S(Q,D) = Σ_{w∈Q∩D} c(w,q) · ln(1 + c(w,d)/(μ·p(w|C))) + |q| · ln(μ/(μ+|d|))
• Okapi Method
S(Q,D) = Σ_{w∈Q∩D} ln((N − df(w) + 0.5)/(df(w) + 0.5)) · ((k1+1)·c(w,d)) / (c(w,d) + k1·((1−b) + b·|d|/avdl)) · ((k3+1)·c(w,q)) / (k3 + c(w,q))
All three combine term frequency (TF), inverse document frequency (IDF), and document length normalization. Empirical observations in IR also include the alternative TF transformation 1 + ln(c(w,d)) and parameter sensitivity.
Research Questions
• How can we formally characterize these necessary retrieval heuristics?
• Can we predict the empirical behavior of a method without experimentation?
Term Frequency Constraints (TFC1)
TF weighting heuristic I: Give a higher score to a document with more occurrences of a query term.
• TFC1: Let q be a query with only one term w.
  If |d1| = |d2| and c(w,d1) > c(w,d2),
  then f(d1,q) > f(d2,q).
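A constraint like TFC1 can also be checked empirically against a candidate scoring function by generating the document pairs the constraint quantifies over (a brute-force sketch; the scorer `f` below is a made-up illustration, not one of the slides' formulas):

```python
import math

# Illustrative scorer: damped TF, normalized by document length.
def f(doc, q):
    w = q[0]
    return math.log(1 + doc.count(w)) / len(doc)

def satisfies_tfc1(score_fn, w="w", filler="x", length=10):
    """Check TFC1 on all equal-length doc pairs up to the given length."""
    for c1 in range(1, length):
        for c2 in range(c1 + 1, length + 1):
            d1 = [w] * c2 + [filler] * (length - c2)   # more occurrences of w
            d2 = [w] * c1 + [filler] * (length - c1)   # same length, fewer w
            if not score_fn(d1, [w]) > score_fn(d2, [w]):
                return False
    return True

assert satisfies_tfc1(f)
```

The same scaffolding, with different generated pairs, checks the remaining constraints.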
Term Frequency Constraints (TFC2)
TF weighting heuristic II: Favor a document with more distinct query terms.
• TFC2: Let q be a query and w1, w2 be two query terms.
  Assume idf(w1) = idf(w2) and |d1| = |d2|.
  If c(w1,d2) = c(w1,d1) + c(w2,d1),
  with c(w2,d2) = 0, c(w1,d1) > 0, c(w2,d1) > 0,
  then f(d1,q) > f(d2,q).
Term Discrimination Constraint (TDC)
IDF weighting heuristic: Penalize the words popular in the collection; give higher weights to discriminative terms.
Query: SVM Tutorial. Assume IDF(SVM) > IDF(Tutorial).
Doc 1 contains more occurrences of the more discriminative term “SVM”; Doc 2 contains more occurrences of “Tutorial”.
Then f(Doc 1) > f(Doc 2).
Term Discrimination Constraint (Cont.)
• TDC: Let q be a query and w1, w2 be two query terms.
  Assume |d1| = |d2| and idf(w1) > idf(w2).
  If c(w1,d1) > c(w1,d2),
  c(w1,d1) + c(w2,d1) = c(w1,d2) + c(w2,d2),
  and c(w,d1) = c(w,d2) for all other words w,
  then f(d1,q) > f(d2,q).
Length Normalization Constraints (LNCs)
Document length normalization heuristic: Penalize long documents (LNC1); avoid over-penalizing long documents (LNC2).
• LNC1: Let q be a query.
  If for some word w′ ∉ q, c(w′,d2) = c(w′,d1) + 1,
  but for other words w, c(w,d2) = c(w,d1),
  then f(d1,q) ≥ f(d2,q).
• LNC2: Let q be a query and k > 1.
  If |d1| = k·|d2| and c(w,d1) = k·c(w,d2) for all words w,
  then f(d1,q) ≥ f(d2,q).
TF-LENGTH Constraint (TF-LNC)
TF-LN heuristic: Regularize the interaction of TF and document length.
• TF-LNC: Let q be a query with only one term w.
  If c(w,d1) > c(w,d2) and |d1| = |d2| + c(w,d1) − c(w,d2),
  then f(d1,q) > f(d2,q).
Analytical Evaluation
Retrieval Formula TFCs TDC LNC1 LNC2 TF-LNC
Pivoted Norm. Yes Conditional Yes Conditional Conditional
Dirichlet Prior Yes Conditional Yes Conditional Yes
Okapi (original) Conditional Conditional Conditional Conditional Conditional
Okapi (modified) Yes Conditional Yes Yes Yes
Term Discrimination Constraint (TDC)
IDF weighting heuristic: Penalize the words popular in the collection; give higher weights to discriminative terms.
Query: SVM Tutorial. Assume IDF(SVM) > IDF(Tutorial).
Doc 1: … SVM SVM SVM Tutorial Tutorial …
Doc 2: … Tutorial SVM SVM Tutorial Tutorial …
f(Doc 1) > f(Doc 2)
Benefits of Constraint Analysis
• Provide an approximate bound for the parameters
– A constraint may be satisfied only if the parameter is within a particular interval.
• Compare different formulas analytically without experimentation
– When a formula does not satisfy the constraint, it often indicates non-optimality of the formula.
• Suggest how to improve the current retrieval models
– Violation of constraints may pinpoint where a formula needs to be improved.
Benefits 1: Bounding Parameters
• Pivoted Normalization Method: LNC2 ⇒ s < 0.4
[Figure: average precision vs. s; the optimal s (for average precision) lies below 0.4]
Benefits 2: Analytical Comparison
• Okapi Method: its IDF factor ln((N − df(w) + 0.5)/(df(w) + 0.5)) is negative when df(w) is large, which violates many constraints.
[Figure: average precision vs. s or b for Pivoted and Okapi, on a keyword query and a verbose query]
Benefits 3: Improving Retrieval Formulas
• Modified Okapi Method: replace Okapi’s IDF factor ln((N − df(w) + 0.5)/(df(w) + 0.5)) with ln((N+1)/df(w)).
Make Okapi satisfy more constraints; expected to help verbose queries.
[Figure: average precision vs. s or b for Pivoted, Okapi, and Modified Okapi, on a keyword query and a verbose query]
Axiomatic Approach to Relevance Modeling
Query → QRep; Document → DRep → Rel(Q,D), constrained by Constraint 1, Constraint 2, …, Constraint m
(1) Predict performance
(2) Develop more robust functions
(3) Diagnose weaknesses: Collection (constraint 1), Collection (constraint 2), …, Collection (constraint m)
We are here.
Part 2: Derive new retrieval functions [Fang & Zhai SIGIR05, Fang & Zhai SIGIR06]
Basic Idea of Axiomatic Approach
[Figure: a function space containing candidate functions S1, S2, S3; retrieval constraints C1, C2, C3 carve out the region containing our target function]
Function Space
D = d1, d2, …, dn; Q = q1, q2, …, qm; S: Q × D → ℝ
Define the function space inductively:
• Primitive weighting function (f): scores a one-term query against a one-term document, S(Q, D) = f(q, d)
• Query growth function (h): adding a term q to the query, S(Q ∪ {q}, D) = S(Q, D) + h(q, D, Q)
• Document growth function (g): adding a term d to the document, S(Q, D ∪ {d}) = S(Q, D) + g(d, Q, D)
(The slide illustrates these cases with one-word queries and documents such as “cat”, “dog”, and “big”.)
Derivation of New Retrieval Functions
existing function S(Q,D) → decompose into (f, g, h) → generalize to function families (F, G, H) → constrain with C1, C2, C3 to obtain (f′, g′, h′) → assemble a new function S′(Q,D)
Representative Derived Function
S(Q,D) = Σ_{t∈Q∩D} c(t,Q) · (N/df(t))^0.35 · c(t,D) / (c(t,D) + s + s·|D|/avdl)
          [QTF]       [IDF]                 [TF with length normalization]
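The derived function above is simple to implement (a sketch with toy collection statistics; the function name is the common shorthand for this derived formula, used here only as a label):

```python
# Sketch of the representative derived axiomatic function (often labeled
# F2-EXP): QTF * (N/df)^0.35 * tf/(tf + s + s*|D|/avdl).

def f2exp(query, doc, N, df, avdl, s=0.5):
    dlen = sum(doc.values())
    score = 0.0
    for t in set(query) & set(doc):
        idf = (N / df[t]) ** 0.35          # always positive, unlike Okapi's IDF
        tf_norm = doc[t] / (doc[t] + s + s * dlen / avdl)
        score += query.count(t) * idf * tf_norm
    return score

N, avdl = 1000, 10
df = {"robot": 20, "the": 900}
d1 = {"robot": 3, "the": 7}
d2 = {"the": 10}
assert f2exp(["robot"], d1, N, df, avdl) > f2exp(["robot"], d2, N, df, avdl)
```

Note the bounded TF factor tf/(tf + …) < 1, one source of the flatter parameter sensitivity reported next.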
The derived function is less sensitive to the parameter setting
[Figure: average precision vs. parameter value; the axiomatic model’s curve is flatter and better]
Adding Semantic Term Matching
Query: dog
“Training puppies is not always easy: it requires work. Puppies should be touched and held from birth, although only briefly and occasionally until their eyes and ears open. Otherwise the puppy may become vicious.”
“A book is a collection of paper with text, pictures, usually bound together along one edge within covers. A book is also a literary work or a main division of such a work. A book produced in electronic format is known as an e-book.”
General Approach to Semantic Term Matching
• Select semantically similar terms
• Expand the original query with the selected terms
  dog → dog 1, puppy 0.5, doggy 0.5, hound 0.5, bone 0.1
Key challenge: How to weight selected terms?
The proposed axiomatic approach provides guidance on how to weight terms appropriately.
Effectiveness of Semantic Term Matching

                                     ROBUST04           ROBUST05
                                     MAP      P@20      MAP      P@20
Syntactic term matching (baseline)   0.248    0.352     0.192    0.379
Semantic term matching               0.302    0.399     0.292    0.502
                                     (+21.8%)           (+51.0%)
Axiomatic Approach to Relevance Modeling
Query → QRep; Document → DRep → Rel(Q,D), constrained by Constraint 1, Constraint 2, …, Constraint m
(1) Predict performance
(2) Develop more robust functions
(3) Diagnose weaknesses: Collection (constraint 1), Collection (constraint 2), …, Collection (constraint m)
We are here.
Part 3: Diagnostic evaluation for IR models
[Fang & Zhai, under review]
Existing evaluation provides little explanation for the performance differences

           trec8   wt-2g   fr88-89
Pivoted    0.244   0.288   0.218
Dirichlet  0.257   0.302   0.202

A retrieval function applied to a test collection (a query such as “dog” against docs Doc1 … Docn) yields a single number, e.g. MAP = 0.25. How to diagnose weaknesses and strengths of retrieval functions?
Relevance-Preserving Perturbations
Perturb term statistics in documents while keeping relevance status.
Document scaling perturbation c_D(d, d, K): concatenate every document with itself K times.
Summary of PerturbationsSummary of Perturbations
• Relevance addition
• Noise addition
• Internal term growth
• Document scaling
• Relevant document concatenation
• Non-relevant document concatenation
• Noise deletion
• Document addition
• Document deletion
Length Scaling Test
1. Identify the aspect to be diagnosed: does the retrieval function over-penalize long documents?
2. Choose an appropriate relevance-preserving perturbation: c_D(d, d, K)
3. Perform the test and interpret the results: Dirichlet over-penalizes long documents!
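The mechanics of the test can be sketched as follows (toy documents and a made-up scorer, not the slides' experiment): apply the scaling perturbation, which leaves relevance unchanged, and check whether the ranking produced by a scoring function survives it.

```python
# Sketch of the document scaling perturbation c_D(d, d, K) and a ranking
# stability check. A length-robust function should keep the ranking stable.

def scale_doc(doc_counts, K):
    """Concatenate a document with itself K times (term counts scale by K+1)."""
    return {w: c * (K + 1) for w, c in doc_counts.items()}

def ranking(score_fn, docs, q):
    return sorted(docs, key=lambda d: score_fn(docs[d], q), reverse=True)

def length_scaling_test(score_fn, docs, q, K=4):
    before = ranking(score_fn, docs, q)
    after = ranking(score_fn, {d: scale_doc(c, K) for d, c in docs.items()}, q)
    return before == after          # True -> ranking unchanged under scaling

# Illustrative scorer: raw TF with a simple additive length penalty.
score = lambda doc, q: sum(doc.get(t, 0) for t in q) / (1 + sum(doc.values()))
docs = {"d1": {"robot": 2, "x": 2}, "d2": {"robot": 1}}
ok = length_scaling_test(score, docs, ["robot"])
```

In the real diagnostic, the comparison is over MAP on a judged collection rather than a single ranking.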
Summary of Diagnostic TestsSummary of Diagnostic Tests
• Length variation sensitivity tests
– Length variance reduction test
– Length variance amplification test
– Length scaling test
• Term noise resistance tests
– Term noise addition test
• TF-LN Balance Tests
– Single query term growth
– Majority query term growth
– All query term growth
Identifying the weaknesses makes it possible to improve the performance

        trec8                  wt2g                   fr88-89
        MAP    #RRel  P@20     MAP    #RRel  P@20     MAP    #RRel  P@20
Dir.    0.257  2838   0.397    0.302  1875   0.372    0.207  741    0.185
M.D.    0.262  2874   0.415    0.321  1930   0.395    0.224  811    0.191
Piv.    0.244  2826   0.402    0.288  1924   0.369    0.223  822    0.206
M.P.    0.256  2848   0.411    0.316  1940   0.392    0.230  867    0.202
Axiomatic Framework: Summary
• A new way of examining and developing retrieval models
• Facilitate analytical study of retrieval models
• Applicable to the development of all kinds of ranking functions
• Limitation:
– Constraints can be subjective
– Not constructive (thus must rely on other techniques to reduce the search space)
• Combined with machine learning?
Lecture 4: Key Points
• Retrieval problem can be generally formalized as a statistical decision problem
– Nicely incorporate generative models into a retrieval framework
– Serve as a road map for exploring new retrieval models
– Make it easier to model complex retrieval problems (interactive retrieval)
• Axiomatic framework makes it possible to analyze a retrieval function without experimentation
– Facilitate theoretical study of retrieval models (“impossibility theorem”?)
– Offer a general methodology for thinking about and improving retrieval models
Readings
• The risk minimization paper:
– http://sifaka.cs.uiuc.edu/czhai/riskmin.pdf
• Hui Fang’s thesis:
– http://www.cs.uiuc.edu/techreports.php?report=UIUCDCS-R-2007-2847
Discussion
• Risk minimization for multimedia retrieval
– Add generative models of images and video to the framework
– Unifying multimedia with text as a common language
• Axiomatic approaches
– Constraints for ranking multimedia information items
– Add constraints to a statistical learning framework (e.g., add constraints as prior or regularization)