Presentation at Joint PIA Workshop at UMAP 2014
DESCRIPTION
CNGL's Dr. Rami Ghorab presented research on multilingual search personalisation at the joint PIA workshop at UMAP 2014: 'Does Personalisation Benefit Everyone in the Same Way? Multilingual Search Personalisation for English vs. Non-English Users'. The research paper, accessible at http://bit.ly/1qjRyY5, is authored by M. Rami Ghorab, Séamus Lawless, Alexander O'Connor and Vincent Wade.
TRANSCRIPT
Does Personalisation Benefit Everyone in the Same Way?
M. Rami Ghorab
Postdoc, School of Computer Science & Statistics,
Trinity College Dublin
Today’s Web
Monolingual & Multilingual Users
Searching across Multilingual Content
• Diverse linguistic backgrounds
• Different language capabilities
• Different language preferences
We want to personalise search, given these characteristics
• Various languages.
• Relevant content – in which language?
• User Modelling
– Search interests (keywords) that span multiple languages.
– Grouped into language fragments.
• Adapting Results in Multilingual Web Search
– Merging and re-ranking the results.
– Translating where necessary.
Extending Personalisation into the Multilingual Dimension
Personalised Multilingual Information Retrieval (PMIR)
User Modelling
Attributes: Native Language, Familiar Languages, Preferred Language
Structure
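The user model described on these slides (language attributes plus search-interest keywords grouped into per-language fragments) could be represented roughly as follows. This is a minimal sketch, not the authors' implementation: the class name `UserModel`, the method `add_interest`, the use of language codes as fragment keys, and keyword counts as weights are all assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class UserModel:
    """Hypothetical user model: language attributes plus interest fragments."""
    native_language: str
    familiar_languages: List[str]
    preferred_language: str
    # One fragment per language code; each maps a keyword to its count.
    fragments: Dict[str, Counter] = field(default_factory=dict)

    def add_interest(self, language: str, keywords: List[str]) -> None:
        """Record search-interest keywords in the fragment for `language`."""
        self.fragments.setdefault(language, Counter()).update(keywords)


# Illustrative data only: a French speaker who also searches in English.
user = UserModel(native_language="fr",
                 familiar_languages=["en", "de"],
                 preferred_language="fr")
user.add_interest("en", ["solar", "energy", "solar"])
user.add_interest("fr", ["énergie", "solaire"])
```

Keeping one fragment per language means each result list can later be compared only against keywords in its own language.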
Result Lists (English, French, German)
Ranked separately against keywords in the User Model fragment (textual similarity)
Re-ranked Result Lists (English, French, German)
Merged & Translated List
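The result-adaptation pipeline above (re-rank each language's list against the matching user-model fragment, then merge) can be sketched as below. The slides only say "textual similarity", so the bag-of-words cosine similarity used here is an assumption, as is merging by similarity score. The function names `cosine`, `rerank` and `merge` are hypothetical, and the translation step is left out.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rerank(results, fragment):
    """Re-rank one language's result list against that language's fragment."""
    scored = [(cosine(Counter(text.lower().split()), fragment), text)
              for text in results]
    # sorted() is stable, so ties keep their original (baseline) order.
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


def merge(reranked_by_lang):
    """Merge the re-ranked lists by score (an assumed merging strategy)."""
    merged = [(score, lang, text)
              for lang, scored in reranked_by_lang.items()
              for score, text in scored]
    return sorted(merged, key=lambda item: item[0], reverse=True)


# Illustrative data only.
fragments = {
    "en": Counter({"solar": 2, "energy": 1}),
    "fr": Counter({"énergie": 1, "solaire": 1}),
}
result_lists = {
    "en": ["football scores today", "solar energy panels"],
    "fr": ["énergie solaire en France", "recettes de cuisine"],
}
reranked = {lang: rerank(docs, fragments[lang])
            for lang, docs in result_lists.items()}
merged = merge(reranked)
```

In a full system, a translation step would run on the merged list for results that are not in a language the user reads.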
Research Question - Revisited
Would multilingual search personalisation algorithms achieve the same degree of improvement for all search queries, regardless of query language?
Experiment - Objectives
• Evaluate the retrieval effectiveness of the multilingual search personalisation algorithms (User Modelling and Result Adaptation).
• Determine whether the algorithms achieve the same degree of effectiveness for users who have different language preferences (examine English vs. Non-English users).
Experiment - Setup
Phase 1: User Participation
• Sign up – language preferences.
• Two search topics.
• Use baseline multilingual Web search.
• Submit findings about topic.
Phase 2: Result Pooling
• Last query reserved for testing.
• Construct the user models.
• Generate various result lists.
Phase 3: Relevance Judgments
• 4-point scale of relevance
(not relevant / somewhat relevant / relevant / very relevant)
Phase 4: Evaluation
• Metric: Mean Average Precision (MAP).
• Measures effectiveness of each algorithm across all test queries.
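The MAP metric named in Phase 4 can be computed from graded relevance judgments along the following lines. This is a sketch under stated assumptions: the 4-point scale is encoded as 0..3, grades of 2 or higher count as relevant, and average precision is normalised by the number of relevant results found within the cut-off. The slides do not say how the authors binarised or normalised, and the function names are hypothetical.

```python
def average_precision(grades, k, threshold=2):
    """AP@k for one ranked list of relevance grades (0 = not relevant .. 3 = very relevant)."""
    hits = 0
    precision_sum = 0.0
    for rank, grade in enumerate(grades[:k], start=1):
        if grade >= threshold:  # assumed binarisation of the 4-point scale
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(runs, k, threshold=2):
    """Mean of AP@k over the ranked lists of all test queries."""
    return sum(average_precision(g, k, threshold) for g in runs) / len(runs)


# Illustrative judgments for two test queries (not data from the study).
runs = [
    [3, 0, 2, 0, 1],
    [0, 2, 0, 0, 3],
]
map_at_5 = mean_average_precision(runs, k=5)
```

Comparing this score for a personalised list against the score for the baseline list gives an improvement figure at each cut-off.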
Experiment - Results
MAP Improvements over Baseline for various result list positions (cut-off points @5..@20)
Understanding the Results
Baseline (non-personalised) Precision Scores
List Position   English   Non-English   % English over Non-English
P@5             0.58      0.45          29.15%
P@10            0.55      0.49          11.54%
P@15            0.51      0.45          14.46%
P@20            0.50      0.48          3.71%
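The "% English over Non-English" column reads as a relative difference between the two baseline precision scores. A short sketch of that arithmetic follows, assuming the formula (English − Non-English) / Non-English × 100, which the slide does not state. The helper name `relative_gain` is hypothetical.

```python
# Baseline precision scores from the table: cut-off -> (English, Non-English).
baseline = {
    5: (0.58, 0.45),
    10: (0.55, 0.49),
    15: (0.51, 0.45),
    20: (0.50, 0.48),
}


def relative_gain(english, non_english):
    """Percentage by which the English score exceeds the Non-English score."""
    return (english - non_english) / non_english * 100


gains = {k: round(relative_gain(en, non_en), 2)
         for k, (en, non_en) in baseline.items()}
```

Recomputing from the two-decimal scores gives values close to, but not identical to, the slide's percentages (about 28.9% against 29.15% at P@5, for example). That is presumably because the slide's figures were derived from unrounded scores.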
Conclusion & Future Work
• Does personalisation benefit everyone in the same way?
– No.
– Multilingual search adaptation algorithms work differently with users of different language preferences/capabilities.
• Recommendation
– Personalised Search systems should adopt different personalisation strategies for certain languages or groups of languages.
• Future Work
– Concept-based user models (multilingual ontology or web taxonomy).
Thank You
This research is supported by the Science Foundation Ireland (Grant 12/CE/I2267)
as part of the Centre for Next Generation Localisation (www.cngl.ie) at Trinity College, Dublin.