Japanese MLR

Upload: ramona-robles

Post on 31-Dec-2015



TRANSCRIPT

Page 1: Japanese MLR

Japanese MLR

Page 2: Japanese MLR

International/JP MLR Issues

• Have to do more with less data
  – Blending different languages?
• Can’t necessarily filter adult
• May need new/different features
• Different types of queries
  – English/Bracket/Phrase/etc
• Metrics designed for English
  – China has lots more spam
  – Japan has much less spam
  – Germany looks 10-20% ahead of Google by DCG
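The DCG comparison above can be illustrated with a minimal sketch. The slides do not say which DCG variant or cutoff was used, so the log2 rank discount, the cutoff `k=5`, and the judgment lists below are all assumptions made for illustration.

```python
import math


def dcg(relevances, k=5):
    """Discounted cumulative gain over the top-k graded judgments.

    Assumed variant: rel_i / log2(i + 1), with ranks starting at 1.
    """
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances[:k], start=1))


def relative_gain(dcg_a, dcg_b):
    """Percentage by which DCG A exceeds DCG B."""
    return 100.0 * (dcg_a - dcg_b) / dcg_b


# Hypothetical graded judgments for one query (higher = more relevant).
engine_a = [3, 2, 3, 0, 1]
engine_b = [2, 3, 0, 1, 0]
print(relative_gain(dcg(engine_a), dcg(engine_b)))
```

A statement such as "10-20% ahead by DCG" would correspond to `relative_gain` averaged over a query set, under these assumptions.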

Page 3: Japanese MLR

JP MLR vs. English MLR

            Kanji/Hiragana   Katakana   Latin (Romaji)
Baseline    7.2              7.6        9.2
JP MLR      +4%              +2%        +1%
EN MLR      0%               +1%        +3%
Google      +3%              +4%        +6%
Examples    277              231        96
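The three column groups in the table split queries by script. The slides do not say how queries were assigned to a group, so the sketch below is one possible rule based on Unicode block ranges. The function name `script_bucket` and the precedence order (kanji/hiragana first, then katakana, then Latin) are assumptions.

```python
def script_bucket(query):
    """Assign a query to a script group (hypothetical bucketing rule)."""
    has_katakana = False
    has_latin = False
    for ch in query:
        code = ord(ch)
        # Hiragana block (U+3040-U+309F) or CJK Unified Ideographs (U+4E00-U+9FFF).
        if 0x3040 <= code <= 0x309F or 0x4E00 <= code <= 0x9FFF:
            return "kanji_hiragana"
        # Katakana block (U+30A0-U+30FF).
        if 0x30A0 <= code <= 0x30FF:
            has_katakana = True
        elif ch.isascii() and ch.isalpha():
            has_latin = True
    if has_katakana:
        return "katakana"
    if has_latin:
        return "latin"
    return "other"


for q in ["東京 ホテル", "ホテル", "hotel"]:
    print(q, script_bucket(q))
```

Mixed-script queries are the ambiguous case; a different precedence order would move them between columns.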

Page 4: Japanese MLR

Different features important for JP

• http://internal.inktomi.com/~lukeb/FeatureImportance.html

• “Linkflux”

• How soon the word appears in the document

• Is the first word in query in the title
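Two of the features listed above are simple enough to sketch. The version below assumes the query, document, and title have already been split into tokens (Japanese text would need a segmenter first). The function names and the choice to normalize position by document length are illustrative assumptions, not the definitions used in the actual feature set.

```python
def first_occurrence_ratio(query_terms, doc_tokens):
    """How soon any query term appears, as a fraction of document length.

    Returns 0.0 when a term is the first token and 1.0 when no term occurs.
    """
    terms = set(query_terms)
    for position, token in enumerate(doc_tokens):
        if token in terms:
            return position / len(doc_tokens)
    return 1.0


def first_query_word_in_title(query_terms, title_tokens):
    """True when the first query word appears anywhere in the title."""
    return bool(query_terms) and query_terms[0] in title_tokens


query = ["tokyo", "hotel"]
print(first_occurrence_ratio(query, ["cheap", "hotel", "in", "tokyo"]))
print(first_query_word_in_title(query, ["tokyo", "travel", "guide"]))
```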

Page 5: Japanese MLR

New features for JP

• Query Word Length very important

• Query type important

• Phonetic url match

• Future:
  – vcano match
  – Matching segmented chunks
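A rough sketch of two of the new features, query word length and phonetic URL match, follows. The slides give no definitions, so everything here is an assumption: word length is taken as mean characters per whitespace-separated word, and the phonetic match romanizes a katakana query with a deliberately tiny, hypothetical lookup table and checks whether the result occurs in the URL. A real system would need a full kana table and handling for long vowels and small kana.

```python
# Hypothetical, deliberately incomplete katakana-to-romaji table.
KANA_TO_ROMAJI = {
    "ト": "to", "ヨ": "yo", "タ": "ta",
    "ホ": "ho", "ン": "n", "ダ": "da",
}


def mean_word_length(query):
    """Mean number of characters per whitespace-separated word."""
    words = query.split()
    return sum(len(w) for w in words) / len(words) if words else 0.0


def romanize(text):
    """Romanize text, or return None if any character is not in the table."""
    parts = []
    for ch in text:
        if ch not in KANA_TO_ROMAJI:
            return None
        parts.append(KANA_TO_ROMAJI[ch])
    return "".join(parts)


def phonetic_url_match(query, url):
    """True when the romanized query occurs in the lowercased URL."""
    romaji = romanize(query)
    return bool(romaji) and romaji in url.lower()


print(mean_word_length("東京 ホテル"))
print(phonetic_url_match("トヨタ", "http://www.toyota.example/"))
```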