Cheshire at GeoCLEF 2008: Text and Fusion Approaches for GIR

Ray R. Larson

School of Information

University of California, Berkeley

GeoCLEF 2008 -- Aarhus September 18, 2008

Motivation

In previous GeoCLEF evaluations we found very mixed results in using various methods of query expansion, attempts at explicit geographic constraints, etc.

Last year we decided to try just our “basic” retrieval method, i.e., logistic regression with blind feedback

The goal was to establish baseline data that we can use to test selective additions in later experiments

Motivation

Because the “baselines” worked well last year, we decided to continue with them and begin testing “fusion” approaches for combining the results of different retrieval algorithms

This was due in part to Neuchatel’s use of fusion approaches with good results and our previous use of fusion approaches in earlier CLEF tasks

Experiments

TD, TDN, and TDN Fusion for Monolingual English, German, Portuguese (9 runs)

TD, TDN, and TDN Fusion for Bilingual X to English, German, and Portuguese (18 runs)

Monolingual

Run Name          Task                      Characteristics    MAP
BERKGCMODETD      Monolingual German        TD auto            0.2295 *
BERKGCMODETDN     Monolingual German        TDN auto           0.205
BERKMODETDNPIV    Monolingual German        TDN auto fusion    0.2292
BERKGCMOENTD      Monolingual English       TD auto            0.2652
BERKGCMOENTDN     Monolingual English       TDN auto           0.2001
BERKMOENTDNPIV    Monolingual English       TDN auto fusion    0.2685 *
BERKGCMOPTTD      Monolingual Portuguese    TD auto            0.217
BERKGCMOPTTDN     Monolingual Portuguese    TDN auto           0.1741
BERKMOPTTDNPIV    Monolingual Portuguese    TDN auto fusion    0.2310 *


Bilingual

Run Name            Task                             Characteristics    MAP
BERKGCBIENDETD      Bilingual English->German        TD auto            0.215
BERKGCBIENDETDN     Bilingual English->German        TDN auto           0.1682
BERKBIENDETDNPIV    Bilingual English->German        TDN auto fusion    0.2251 *
BERKGCBIPTDETD      Bilingual Portuguese->German     TD auto            0.195
BERKGCBIPTDETDN     Bilingual Portuguese->German     TDN auto           0.1108
BERKBIPTDETDNPIV    Bilingual Portuguese->German     TDN auto fusion    0.1912
BERKGCBIDEENTD      Bilingual German->English        TD auto            0.2274
BERKGCBIDEENTDN     Bilingual German->English        TDN auto           0.1894
BERKBIDEENTDNPIV    Bilingual German->English        TDN auto fusion    0.2304 *
BERKGCBIPTENTD      Bilingual Portuguese->English    TD auto            0.1886
BERKGCBIPTENTDN     Bilingual Portuguese->English    TDN auto           0.154
BERKBIPTENTDNPIV    Bilingual Portuguese->English    TDN auto fusion    0.2101
BERKGCBIDEPTTD      Bilingual German->Portuguese     TD auto            0.1346
BERKGCBIDEPTTDN     Bilingual German->Portuguese     TDN auto           0.126
BERKBIDEPTTDNPIV    Bilingual German->Portuguese     TDN auto fusion    0.1488
BERKGCBIENPTTD      Bilingual English->Portuguese    TD auto            0.1913
BERKGCBIENPTTDN     Bilingual English->Portuguese    TDN auto           0.1762
BERKBIENPTTDNPIV    Bilingual English->Portuguese    TDN auto fusion    0.2074 *


TDN Fusion

NewWt = (B * piv) + (A * (1 - piv)), with piv = 0.29

A: result of TD Logistic Regression with Blind Feedback

B: result of TDN Okapi BM-25

A and B are normalized using MinMax to [0:1] before being combined into the final result
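The fusion step above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Cheshire implementation: the function names, the dictionary representation of a result list, and the choice to score a document missing from one list as 0.0 are all hypothetical. Only the formula, the MinMax normalization, and piv = 0.29 come from the slide.

```python
from typing import Dict, List, Tuple


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale raw retrieval scores to the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # Assumption: a list with identical scores carries no ranking signal.
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def fuse(a_scores: Dict[str, float],
         b_scores: Dict[str, float],
         piv: float = 0.29) -> List[Tuple[str, float]]:
    """Combine two result lists as NewWt = (B * piv) + (A * (1 - piv)).

    a_scores: TD logistic regression with blind feedback (result A)
    b_scores: TDN Okapi BM-25 (result B)
    """
    a = min_max_normalize(a_scores)
    b = min_max_normalize(b_scores)
    fused = {}
    for doc in set(a) | set(b):
        # Assumption: a document absent from one list contributes 0.0 there.
        fused[doc] = b.get(doc, 0.0) * piv + a.get(doc, 0.0) * (1.0 - piv)
    # Highest fused weight first.
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    result_a = {"d1": 2.0, "d2": 1.0, "d3": 0.0}
    result_b = {"d1": 0.0, "d2": 5.0, "d3": 10.0}
    for doc, weight in fuse(result_a, result_b):
        print(doc, round(weight, 3))
```

With piv = 0.29 the logistic regression result (A) carries 71% of the final weight, so in the toy input the document ranked first by A stays first after fusion.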


Results

Fusion of logistic regression with blind feedback and Okapi BM-25 produced most of our best-performing runs, although the improvement was not always dramatic

With a single algorithm, use of the Narrative is counter-productive. Using Title and Description provides better results with these algorithms

Does blind feedback accomplish some of the geographic expansion explicit in the narrative?

Comparison of Berkeley Results 2006, 2007-2008

Task                               MAP 2006    MAP 2007    MAP 2008    Pct. Diff '07-'08
Monolingual English                0.250       0.264       0.268*      1.493
Monolingual German                 0.215       0.139       0.230       39.565
Monolingual Portuguese             0.162       0.174       0.231*      24.675
Bilingual English -> German        0.156       0.090       0.225*      60.000
Bilingual English -> Portuguese    0.126       0.201       0.207*      2.899

* using fusion
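The Pct. Diff column is consistent with expressing the 2007-to-2008 gain as a percentage of the 2008 MAP rather than of the 2007 MAP. That reading is inferred from the numbers in the table and is not stated on the slide. The function name below is hypothetical. A short check, assuming that definition:

```python
def pct_diff(map_2007: float, map_2008: float) -> float:
    """Gain from 2007 to 2008 as a percentage of the 2008 value (assumed definition)."""
    return (map_2008 - map_2007) / map_2008 * 100.0


# (task, MAP 2007, MAP 2008) taken from the comparison table.
rows = [
    ("Monolingual English", 0.264, 0.268),
    ("Monolingual German", 0.139, 0.230),
    ("Monolingual Portuguese", 0.174, 0.231),
    ("Bilingual English -> German", 0.090, 0.225),
    ("Bilingual English -> Portuguese", 0.201, 0.207),
]

if __name__ == "__main__":
    for task, m07, m08 in rows:
        print(f"{task}: {pct_diff(m07, m08):.3f}")
```

Under this definition the German figure of 39.565 means that about 40% of the 2008 score was absent in 2007. Measured against the 2007 baseline instead, the same change would be a much larger relative increase.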


What happened in 2007 German?

We speculated last year that it was:

No decompounding? 2006 used Aitao Chen’s decompounding (no)

Worse translation? Possibly: different MT systems were used. But they were the same for 2007 and 2008, so no

Incomplete stoplist? Was it really the same? (yes)

Was stemming the same? (yes)

Why did German work better for us in 2008?

That was all speculation, but…

It REALLY helps if you include the entire database

Our 2007 German runs did not include any documents from the SDA collection!

What Next?

Finally start adding back true geographic processing and test where and why (and if) results are improved

Get decompounding working with German
