Japanese MLR

Post on 31-Dec-2015



International/JP MLR Issues

• Have to do more with less data
  – Blending different languages?
• Can’t necessarily filter adult content
• May need new/different features
• Different types of queries
  – English/Bracket/Phrase/etc.
• Metrics designed for English
  – China has lots more spam
  – Japan has much less spam
  – Germany looks 10-20% ahead of Google by DCG
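The "ahead of Google by DCG" comparison can be illustrated with a minimal sketch of discounted cumulative gain. The slides do not say which DCG variant, cutoff, or judgment scale was used, so the log2 discount, the cutoff `k`, and the gain values below are assumptions chosen only for illustration.

```python
import math

def dcg(gains, k=5):
    """Discounted cumulative gain over the top-k results.

    gains: relevance grades in ranked order (the scale is an assumption).
    Uses the common gain / log2(rank + 1) discount; other variants exist.
    """
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains[:k]))

def relative_difference(score, reference):
    """Relative difference of `score` against `reference`, e.g. 0.10 for +10%."""
    return (score - reference) / reference

# Hypothetical judged result lists for one query on two engines.
ours = [3, 2, 0]
theirs = [2, 3, 0]
lead = relative_difference(dcg(ours), dcg(theirs))
```

A positive `lead` means the first ranking places higher-graded results earlier. A figure such as "10-20% ahead" would presumably be this kind of ratio averaged over many queries.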

JP MLR vs. English MLR

            Kanji/Hiragana   Katakana   Latin (Romaji)
Baseline    7.2              7.6        9.2
JP MLR      +4%              +2%        +1%
EN MLR      0%               +1%        +3%
Google      +3%              +4%        +6%
Examples    277              231        96
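To make the table easier to read, the sketch below converts the percentage rows into implied scores and pools them across the three query groups. It assumes that each percentage is a relative change against the Baseline row and that the baseline is a per-query average score, so that weighting by the Examples row is meaningful. Neither assumption is stated in the slides, and the dictionary keys are illustrative names.

```python
# Values transcribed from the table above; the key names are illustrative.
TABLE = {
    "kanji_hiragana": {"baseline": 7.2, "jp_mlr": 0.04, "en_mlr": 0.00,
                       "google": 0.03, "examples": 277},
    "katakana":       {"baseline": 7.6, "jp_mlr": 0.02, "en_mlr": 0.01,
                       "google": 0.04, "examples": 231},
    "latin":          {"baseline": 9.2, "jp_mlr": 0.01, "en_mlr": 0.03,
                       "google": 0.06, "examples": 96},
}

def implied_score(group, system):
    """Score implied by applying a system's relative change to the baseline."""
    row = TABLE[group]
    return row["baseline"] * (1.0 + row[system])

def pooled_relative_change(system):
    """Example-weighted relative change across all three query groups."""
    base = sum(r["baseline"] * r["examples"] for r in TABLE.values())
    new = sum(r["baseline"] * (1.0 + r[system]) * r["examples"]
              for r in TABLE.values())
    return new / base - 1.0
```

Under these assumptions, `pooled_relative_change("jp_mlr")` comes out higher than `pooled_relative_change("en_mlr")`, because the Kanji/Hiragana and Katakana groups, where JP MLR gains most, contain most of the examples.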

Different features important for JP

• http://internal.inktomi.com/~lukeb/FeatureImportance.html

• “Linkflux”

• How early the query word appears in the document

• Whether the first word of the query appears in the title
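The last two features can be sketched as follows. The function names and the normalization are hypothetical; the slides do not describe how the features were actually computed. Matching is done on characters rather than whitespace tokens, since Japanese text is not space-delimited.

```python
def first_occurrence_fraction(term, document):
    """Position of the first occurrence of `term`, as a fraction of document length.

    Returns 0.0 when the term opens the document and 1.0 when it is absent.
    """
    pos = document.find(term)
    if pos < 0 or not document:
        return 1.0
    return pos / len(document)

def first_query_word_in_title(query_words, title):
    """True if the first word of the (already segmented) query occurs in the title."""
    return bool(query_words) and query_words[0] in title

# Usage with a hypothetical segmented query and page.
words = ["東京", "ホテル"]
early = first_occurrence_fraction(words[0], "東京タワー")
in_title = first_query_word_in_title(words, "東京のホテル一覧")
```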

New features for JP

• Query Word Length very important

• Query type important

• Phonetic url match

• Future:
  – vcano match
  – Matching segmented chunks
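A rough sketch of how the query-side features above might be derived. Everything here is an assumption made for illustration: the script buckets follow the column grouping of the results table, the code point ranges are the standard Unicode Hiragana, Katakana, and CJK Unified Ideographs blocks, and `TOY_ROMAJI` is a deliberately tiny hypothetical mapping, not a real romanization table.

```python
from collections import Counter

def char_script(ch):
    """Classify one character by Unicode block."""
    cp = ord(ch)
    if 0x3040 <= cp <= 0x309F:
        return "hiragana"
    if 0x30A0 <= cp <= 0x30FF:
        return "katakana"
    if 0x4E00 <= cp <= 0x9FFF:
        return "kanji"
    if ch.isascii() and ch.isalpha():
        return "latin"
    return "other"

def query_bucket(query):
    """Assign a query to the majority bucket among its classified characters."""
    merged = {"hiragana": "kanji_hiragana", "kanji": "kanji_hiragana",
              "katakana": "katakana", "latin": "latin"}
    counts = Counter(merged[s] for s in map(char_script, query) if s in merged)
    return counts.most_common(1)[0][0] if counts else "other"

def query_word_lengths(words):
    """Character length of each word in an already segmented query."""
    return [len(w) for w in words]

# Hypothetical toy mapping; a real system would need a full kana table.
TOY_ROMAJI = {"ト": "to", "ヨ": "yo", "タ": "ta"}

def phonetic_url_match(query, url):
    """True if the toy romanization of `query` appears in `url`."""
    try:
        romaji = "".join(TOY_ROMAJI[ch] for ch in query)
    except KeyError:
        return False  # character outside the toy table
    return bool(romaji) and romaji in url.lower()
```

For example, `phonetic_url_match("トヨタ", "http://www.toyota.co.jp/")` holds under the toy table, while any query containing a character outside the table simply reports no match.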
