![Page 1: Experimental Components for the Evaluation of Interactive Information Retrieval Systems Pia Borlund Dawn Filan 3/30/04 610:551](https://reader035.vdocument.in/reader035/viewer/2022081503/56649d605503460f94a4214e/html5/thumbnails/1.jpg)
Experimental Components for the Evaluation of Interactive Information Retrieval Systems
Pia Borlund
Dawn Filan
3/30/04
610:551
The Goal
• To evaluate IR systems in a way that is as close to the actual information-seeking process as possible, while remaining in a controlled environment.
Research Questions
• Can simulated information needs be substituted for real information needs?
• What makes a good simulated situation with reference to semantic openness and types of topics of the simulated situations?
Hybrid Evaluation Model
• Increased demand
  – Relevance revolution
  – Cognitive revolution
  – Interactive revolution
• Combine two main approaches
  – System-driven approach (controlled)
  – Cognitive user-centered approach (realism)
The Experimental Setting
3 components:
• The involvement of potential users as test persons
• The application of dynamic and individual information needs
• Use of dynamic relevance judgements
Ideal IIR Setting
• Real users who state personal information needs to the system and judge the relevance of the retrieved documents under controlled circumstances.
• Use of “simulated work task”
• Must be under controlled circumstances so that results can be compared across systems and user groups.
Simulated Work Task
• Triggers and develops a simulated information need by allowing for user interpretations of the situation.
• Platform against which situational relevance is measured.
• 2 variants applied:
  – Complete need applied (sim 1)
  – Only situation applied (sim 2)
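As a rough illustration of the two variants, the sketch below models a simulated work task as a situation description plus an optional request, and shows what a test person would be given under each variant. The class name `SimulatedWorkTask`, the field names, the `present` function, and the "Indicative request" label are all hypothetical and chosen only for this example; the slides do not specify any implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SimulatedWorkTask:
    """A simulated work task (hypothetical structure for illustration)."""
    situation: str                            # the work task situation text
    indicative_request: Optional[str] = None  # the stated need, if any


def present(task: SimulatedWorkTask, variant: str) -> str:
    """Return the text shown to a test person for the given variant.

    "sim1": situation plus request (the complete need is applied).
    "sim2": situation only.
    """
    if variant == "sim1":
        if task.indicative_request is None:
            raise ValueError("sim1 needs a request in addition to the situation")
        return f"{task.situation}\n\nIndicative request: {task.indicative_request}"
    if variant == "sim2":
        return task.situation
    raise ValueError(f"unknown variant: {variant!r}")


if __name__ == "__main__":
    # Placeholder text only; no real task wording is reproduced here.
    task = SimulatedWorkTask("<situation text>", "<request text>")
    print(present(task, "sim1"))
    print(present(task, "sim2"))
```

Keeping the request optional lets the same task definition serve both variants, so the only thing that changes between groups is what is shown.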
Situational Relevance
• User-centered, realistic, and dynamic measure of relevance
• Judgements are not based on the request or query, but rather relate to the person’s requirements and mental state at the time of the retrieval
• Assessed continuously and interactively during the session
Relevance (Schamber, Eisenberg, and Nilan)
• Multidimensional cognitive concept whose meaning is dependent on users’ perceptions of information and their information needs
• Dynamic concept that depends on users’ judgements of quality of the relationship between information and the information need
• Complex but systematic and measurable concept if approached conceptually and operationally from the user’s perspective
Meta-Evaluation
• Should simulated work tasks be recommended as a component of the experimental setting for evaluating IIR systems?
Meta-Evaluation Questions
• Whether real information needs can be substituted with simulated information needs through the application of simulated work task situations
• Whether the variants of the simulated task make any difference to the test persons’ treatment of the information need
• What characterizes a good simulated work task in terms of how tailored the task should be to the user
Test Setting
• Full-text online system using TREC data and a probabilistic retrieval engine
• Search activity and relevance scores were logged
• 24 users from various academic backgrounds and education levels
• Asked to prepare a personal information need
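The logging described above could be organised along the following lines. This is only a sketch of one possible log structure: the names `SearchSessionLog` and `RelevanceJudgement`, the fields, and the use of an integer relevance score are assumptions made for illustration, since the slides do not describe the log format or the relevance scale.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RelevanceJudgement:
    doc_id: str
    score: int  # numeric relevance score; the scale is an assumption


@dataclass
class SearchSessionLog:
    """Search activity and relevance scores for one test person and one task."""
    person_id: str
    task_label: str  # e.g. "training", "real", or a simulated task label
    queries: List[str] = field(default_factory=list)
    judgements: List[RelevanceJudgement] = field(default_factory=list)

    def log_query(self, query: str) -> None:
        self.queries.append(query)

    def log_judgement(self, doc_id: str, score: int) -> None:
        self.judgements.append(RelevanceJudgement(doc_id, score))

    def mean_score(self) -> Optional[float]:
        """Average relevance score, or None if nothing has been judged yet."""
        if not self.judgements:
            return None
        return sum(j.score for j in self.judgements) / len(self.judgements)


if __name__ == "__main__":
    # Placeholder identifiers and values, not data from the study.
    log = SearchSessionLog(person_id="P01", task_label="real")
    log.log_query("<query text>")
    log.log_judgement("<doc id>", 2)
    print(log.mean_score())
```

Keeping one log per person and task makes it straightforward to compare sessions on real needs with sessions on simulated needs afterwards.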
Testing Procedure
• Brief questionnaire
• Introduction
• Explanation of the test person’s role
• Demo of the system
• Execution of 6 search tasks (training, real, 4 simulated tasks)
• Post-search interview
Conclusions
• One can substitute real information needs with simulated information needs through the application of simulated work tasks
• One can mix simulated and real information needs
• Treatment of the information need did not differ between the group that received both the work task and the request and the group that received only the work task.