

Using the User’s Mental Model to Guide the Integration of Information Space into Information Need

Charles Cole and John E. Leide
Graduate School of Library and Information Studies, McGill University, 3459 McTavish Street, Montreal, Quebec, Canada H3A 1Y1. E-mail: [email protected]; [email protected]

The study reported here tested the efficacy of an information retrieval system output summary and visualization scheme for undergraduates taking a Vietnam War history course who were in Kuhlthau’s Stage 3 of researching a history essay. The visualization scheme consisted of (a) the undergraduate’s own visualization of his or her essay topic, drawn by the student on the bottom half of a sheet of paper, and (b) a visualization of the information space (determined by index term counting) on the top half of the same page. To test the visualization scheme, students enrolled in a Vietnam War history course were randomly assigned to either the visualization scheme group, who received a high recall search output, or the nonvisualization group, who received a high precision search output. The dependent variable was the mark awarded the essay by the course instructor. There was no significant difference between the mean marks for the two groups. We were pleasantly surprised with this result given the bad reputation of high recall as a practical search strategy. We hypothesize that a more proactive visualization system is needed that takes the student through the process of using the visualization scheme, including steps that induce student cognition about task–subject objectives.

Introduction

An undergraduate taking a social science course attends course lectures, takes notes, then goes to the library to read the books and articles suggested by the course instructor. But all these are subsidiary tasks, enabling the student to learn the subject and then put this learning on display. In history and other social science and humanities subjects, the vehicle for putting learning on display is usually the term paper or research essay. Constructing an essay is a learned art. An instructor might settle for a descriptive paper when the student first arrives from high school, but as the student progresses from sophomore to junior to senior year, there is increasing pressure to demonstrate evidence of critical thinking in the essay, i.e., that the student has researched her topic area thoroughly, has synthesized it into a message (an informed opinion, as Hellstern, Scott, and Garrison (1998) call it), and has inserted the message into an identifiable structure for effective communication to the instructor. At the center of this message is the essay thesis or argument statement, which provides both energy and force to the research essay and a synthesizing mechanism that enables the undergraduate to bring together disparate data and facts from different types of sources, time periods, and even different subject disciplines.

The undergraduate does not start off researching the essay with a thesis in hand; rather, the thesis grows out of the student’s own exploration of the topic area through background reading and thinking about the topic. The student must then back up the thesis with relevant information before inserting it into a suitable essay format and handing it in to the instructor. One can view the research essay as something constructed in stages. We have previously identified a thesis-like formulation stage as Stage 4 of the undergraduate’s process of constructing the essay (Cole, 2001; Cole, Cantero, & Sauve, 1998), which we based on Kuhlthau (1991, 1993).

Kuhlthau gives a start-to-finish view of a wide variety of information users, including undergraduates, whom the instructor has given an assignment to write and research. Kuhlthau’s Information Search Process (ISP) model divides assignment research and preparation into six stages. Stage 1 is the initiation stage, where the student is handed the assignment by the instructor; it is at this point, according to the ISP model, that the student “recognize[s] his or her need for information.” In Stage 2, the selection stage, students “identify and select the topic for the assignment.” In Stage 3, Kuhlthau’s exploration stage, students “investigate information on the general topic to extend his or her personal understanding.” In Stage 4, the formulation stage, students “form a focus from the information.” In Stage 5, the collection stage, students “gather information related to the focused topic.” In Stage 6, the presentation stage, students “complete the search and prepare the assignment to present findings” (Kuhlthau, 1991, pp. 366–368).

Received October 15, 2001; revised April 22, 2002; accepted July 3, 2002

© 2003 Wiley Periodicals, Inc.

JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, 54(1):39–46, 2003

For an undergraduate constructing a thesis or argument-type research essay, Stage 4 is the stage where the student formulates the thesis or argument statement for the essay. Stage 4 is a difficult stage for a domain novice to navigate, with Kuhlthau (1991, pp. 366, 369) finding that 50% of subjects in her studies never achieved focus. The way the undergraduate explores the material and issues in Stage 3, just before Stage 4, is, we believe, a crucial and necessary condition for the focusing Stage 4 that follows (Kennedy, Cole, & Carter, 1999). In this article we wish to overcome the general problem of the frequent disconnect between an undergraduate’s exploration of their topic in Stage 3 and the necessity of bringing the essay researching task into focus via a well-anchored thesis statement in Stage 4.

In previous studies we found that course instructors are aware of this disconnect: to encourage their students to select a topic and explore it, they require that the essay be handed in in stages. For example, a topic statement and bibliography must be handed in a month or so before the finished version of the essay is due. Despite this encouragement, however, when we interviewed these students immediately after they handed in, or discussed with the instructor, their topic and bibliography, we observed that undergraduates can name the topic but that the topic is not anchored or meaningfully defined by the student (Cole, 2001). In Cole (2001, p. 177), we concluded that for successful essay construction to occur there must be another stage between Stage 3 and Stage 4 where students define their essay topic. With this definition, they can then effectively, from the point of view of constructing a successful essay, (1) find and process information while they are reading background material, and (2) pass on through to Stage 4, the focusing stage of their ISP. We believe the IR system must facilitate topic definition in its design, but how can it do this?

Visualizing the Information Space to Induce the Undergraduate to Define Their Topic

A major challenge in research for digital libraries is to design and implement gateway mechanisms for information retrieval (IR) systems that communicate the structure of a subject area to domain novice users such as undergraduates who are in Stage 3 of their information search process, exploring an unfamiliar topic area to gain a personal understanding of it. For this type of user, using an IR system is quite different from a user using the system to locate “known items” (Cutter’s (1876/1904) first object; cf. also, Leide, Cole, Large, Beheshti, & Brooks, submitted for publication), or from domain experts who already know the structure of the literature and can thus effectively explore their topic area (i.e., they explore the topic area so that they can eventually formulate a thesis for their research activity; Cole, Mandelblatt, & Stevenson, 2002).

The specific problem of domain novice users of IR systems is that they do not know the structure of the topic area and find it difficult to situate themselves inside the issues relevant to the topic area. Classification schemes traditionally perform this role of situating the user inside the structure of both the topic and the larger subject area in which the user is interested. The user herself, of course, is not being situated inside the topic structure; rather, the user is situating her view of her topic inside this structure. We associate this situating process with defining the topic. We therefore assume that the classification scheme represents the structure of a topic area to the user, and that by situating his or her place inside this structure the classification scheme facilitates the domain novice user’s definition of his or her topic and thus the identification of his or her own information need.

Traditionally, classification/indexing systems have used number/letter coded word-lists to represent the classification scheme. The problem with such word-lists is that domain novice users of IR systems “see” the words rather than “seeing them as” concepts that have meaning to them, and this is a major reason why word-list classification schemes were underused as gateways to subject topics. The problems associated with word retrieval accessing methods have not significantly decreased in recent years (Hawkins, 1999, p. 88; Rorvig & Lunin, 1999, p. 790); in fact, with the advent of the Internet, word-based accessing problems may have increased (e.g., relevance and set magnitude problems are exacerbated by the Internet’s sheer size and lack of structure). As a result, information retrieval appears to be shifting away from purely word-based classification schemes, which represent the subject area as a logical system developed by one person (like Cutter or Dewey) or an institution (like the Library of Congress), towards visualized representations of an information space, which use space, proximity of objects, and different sizes and textures to convey the structure of the subject area.

In Cole, Mandelblatt, and Stevenson (2002), we analyzed five visualization schemes: the Cat-a-cone Display scheme of Hearst (1997), the Network Display scheme of Rose and Belew (1991), the Scatter Display scheme of White and McCain (1998), the Map Display scheme of Chen, Houston, Sewell, and Schatz (1998), and the Hybrid Display scheme of Small (1999) (see also, Lin, 1997). The analysis centered on the efficacy of the five schemes in visually representing the structure of a subject area information space to a domain novice user of the visualization (such as an undergraduate). The five schemes attempt to represent an objective view of an information space. The cat-a-cone, for example, does this by taking the categories and subcategories of conventional classification schemes, creating a hierarchy of broader, related, and narrower terms, and putting the hierarchy within a three-dimensional cone. As another example, the scatter display starts from notables in a research area and plots them around an x- and y-axis by the strength of their associations with each other.


The major problem with these five visualization schemes is that although domain experts can make, via their already existing schema of their subject area, the leap to understand these visualizations and to use them as analytical tools, domain novices such as undergraduates do not possess a structural schema of their subject area, so they are not anchored in a familiar environment when they explore the information space contained in the visualization. Being anchored is similar to a “You are here” locating device on a cartographic map: it defines the user’s position vis-à-vis where she has been and where she wants to go on the map. A road map reader knows where her car is in relation to starting and end locations, or else she cannot read the map. It is through this anchoring that the domain novice is able to interpret a visualization, much like a map reader reads and understands a cartographic visualization (Cole et al., 2002).

Anchoring the Domain Novice User

The central problem of a visualized information space for domain novices is anchoring the user in the so-called information space for the topic area of his or her research essay when this domain novice user is in Stage 3, the exploration stage of researching an essay topic. In information retrieval research and practice (e.g., Taylor, 1968), the so-called anchoring centers on the concept of the user’s information need. When accessing an IR system, the user types in a query to the system, which is supposed to represent her information need. However, the user’s information need does not indicate where he or she is located in the topic space but rather where she would like to be after using the information indicated by the IR system’s output list of citations. In effect, the user’s query is supposed to represent the user’s goal position, not the start position. The domain novice user exploring the essay topic space knows neither the start position nor the goal position, and therefore can neither formulate an effective query nor effectively judge an item listed in the IR system output list for its relevance to the essay topic.

To anchor herself in the essay topic space so that she can begin exploring it, the domain novice user must rely on bits and pieces of diverse knowledge, including both perceptual and higher order cognitive knowledge, which elsewhere we call the user’s tacit knowledge about the topic being explored (Cole & Mandelblatt, 2000). At the most basic, fundamental level, to gain meaning from the experience of viewing a visualization of an information space, an undergraduate reading this visualization must see a pattern in the visualization; she does this by grouping edges, lines, spaces (and different word-label concepts), and seeing this grouping as some sort of object. Let us call a pattern a meaning object. In cognitive-based visual perception research, the location of the constituent parts of a meaning object must be recognized by the viewer’s visual system as belonging together and forming the object. The constituent parts are organized together via three coordinate visual systems:

(1) The object-centered coordinate system specifies an object part’s location in relation to other parts of the object.

(2) The environment-centered coordinate system specifies the locations of the parts to each other relative to the environment around the object.

(3) The viewer-centered coordinate visual system specifies the locations of an object relative to the viewer (Farah, 2000, p. 72).

For an undergraduate in an exploration phase of essay construction to be able to perceive an IR system-created visualization of an information space, we make the assumption that the undergraduate must use some or all of the above typology of three coordinating systems to recognize the visualization as a meaning object.

In Cole et al. (2002), we argued that reading a visualization of an information space is similar to reading a cartographic visualization (a map) of a geographic space. However, there are important differences. First, a cartographic map has a relationship to the physical reality it represents, which helps the map reader coordinate locations of lines, spaces, colors, etc., into a meaning object, using her environment-centered coordinate system. The visualizations of information spaces do not have any relationship to a physical reality. Consequently, the undergraduate does not have access to the environment-centered coordinate system (from the above typology).

Second, the map reader is inundated with maps from primary school, and her exposure to the same maps always representing the same things is frequent enough to give these maps meaning. In effect, the map becomes useful because the map reader has an ingrained cognitive representation of the map that can be immediately imagined when a friend says he is flying from New York to London. On the other hand, an undergraduate viewing a visualization of an information space is not familiar with it; indeed, the visualization schemes analyzed in Cole et al. (2002) were deliberately conceptualized so that they would change as research shifted in a subject area over time. Consequently, with these particular visualizations, the undergraduate does not have access to the viewer-centered coordinate system either (from the above typology).

Therefore, the undergraduate confronted with deriving meaning from visualizations of information space is forced to rely on his object-centered coordinate system only, putting the undergraduate at a severe disadvantage, similar to that of what is called an apperceptive agnosic. In an experiment by Landis, Coraves, Bensen, and Hebben (1982), using an apperceptive agnosic subject, the subject was unable to group the lines and spaces of the word “THIS” into the word, but could only misidentify each letter as the separate numbers “7,” “4,” “1,” and “5,” which together form the number “7415.” Without an organizing schema, the undergraduate is like an apperceptive agnosic, unable to group “local elements of the visual field into more global objects or patterns” (Farah, 2000, p. 59).

The purpose of the following study is to begin to explore how an IR system can activate the undergraduate’s viewer-centered coordinate system, allowing him or her to perceive the dots, circles, shapes, and spaces contained in the IR system visualization of the undergraduate’s essay topic space as a meaning object. The visualized index/catalog does this by stimulating the undergraduate user’s formation of a device for locating herself in the topic space, so that she can then begin to put together a representation of the global aspect of the topic space in terms of both individual issues or concepts in the topic space and the relational structure of these issues.

The Study

The present article reports the results of a 1999 study of 38 Concordia University undergraduates who were required, as part of a Vietnam War history course, to seek information for a 12–15-page research paper. For purposes of the study, the course instructor allowed the author to meet with each student in the class and to conduct a search of a history database for information on the student’s essay topic. Concordia University is an urban university with 20,000 full-time students; it awards doctorates and offers a full range of university programs (except law and medicine).

The purpose of the study was to begin to test, in a naturalistic setting, the concepts underlying the visualization of an information space shown in Figure 1, especially its use when the student is in Stage 3, the exploration stage, of researching her essay. We wanted to know: Do undergraduates researching a class essay have a mental representation or expectation of what they are looking for when using the system? Can they draw the mental representation on paper using various sizes of circles, distances between circles, circle clustering, etc., to indicate relationships between the concepts? Is the student’s own mental representation of the topic space useful to the student as he or she reads and tries to understand the system’s objective summarization and visualization of the system’s high recall citation output?

The hypothesis of the study was that students who, at the time of the intervention, are estimated to be in the exploration phase (Stage 3) of researching their essay, and who received a visualization of their essay topic information space based on a high recall search strategy, will receive a higher mean mark for their essay than a similar group of students from the same class who received a high precision search strategy without an accompanying visualization.

The study was conducted in a naturalistic setting where the actual mark of the student could be negatively or positively affected by the intervention. For this reason, the study design was evaluated and approved by the Concordia University Ethics Committee.

Methodology

At the beginning of the winter term, 1999, the first author received a list of all the students registered in a third-year Vietnam War history course given by Concordia University’s history department. The 38 registered students were randomly assigned to one of two groups: one group was labeled the high recall plus visualization group, and the comparison group was labeled the high precision group. The timing of the study intervention was designed to coincide with a course requirement that the students hand in a bibliography and their essay argument on March 2, 1999, which was roughly 1 month before the essay hand-in date of March 30, 1999. Because of our experience with how undergraduates schedule their essay writing, we assumed their argument statement would be more like a topic statement than an actual thesis statement, and that on March 2, at their appointment with the course instructor, they would be in Stage 3 of their ISP.
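The article does not say how the random assignment was carried out. As a minimal sketch, assuming a simple shuffle-and-split of the class list (the function name, group labels, placeholder IDs, and seed are all illustrative and not the authors' procedure), the allocation of 38 students to two groups could look like this:

```python
import random


def assign_groups(student_ids, seed=None):
    """Randomly split a class list into two groups of near-equal size.

    Hypothetical helper: the study reports random assignment but not the
    mechanism, so a shuffle-and-split is assumed here.
    """
    rng = random.Random(seed)      # a seed makes the allocation reproducible
    shuffled = list(student_ids)   # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        "high_recall_plus_visualization": shuffled[:half],
        "high_precision": shuffled[half:],
    }


# 38 placeholder IDs standing in for the registered students.
groups = assign_groups(range(1, 39), seed=1999)
```

An equal split gives 19 students per group. The article does not report the initial group sizes, only that 16 students ended up in the high recall group after some students could not be contacted.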

Before the intervention, the students were contacted by telephone to arrange an interview time for around the date of their topic discussion session with the course instructor. Five students could not be contacted. Thirty-three interviews took place from March 1 to March 5, 1999. The students were interviewed individually in a room in the Concordia Library designed especially for information retrieval teaching.

The students began the intervention by reading and signing a consent form, and then the interview started. For the high precision group, the student was first asked to write down four or five key words or phrases that outlined their essay topic. For the high recall group, the student was asked to write down four or five key words or phrases that outlined their essay topic, and to convert these words and phrases into a visualization. The students in the high recall group were given the following instructions when drawing their visualization:

Take the key words and phrases you have just listed and convert them into circles. The important words and phrases should be drawn as big circles, the less important ones as medium or smaller circles. Take care to put circles together you feel are associated, and put circles you don’t think are associated farther away.

An example of a high recall group student’s mental representation of her essay topic space is shown in the bottom half of Figure 1.

FIG. 1. Example of student filled-in map (lower half) serving as guidepost for reading system map (upper half).

The student was then asked to sit beside the author to read the computer screen as he searched America: History and Life (for the United States and Canada) and/or Historical Abstracts (for the rest of the world). Both databases are put out by ABC-CLIO, and for topics like the Vietnam War they overlap to a great extent.

For the students in the high recall with accompanying visualization group, the search strategy consisted of spending a few minutes looking for a citation the student liked, then expanding the search strategy outward to a selection of search terms that would achieve a system output of about 200 citations. For the students in the high precision group, the search strategy consisted of spending a few minutes finding a citation the student liked, then sticking closely to that strategy, selecting search terms that produced a total citation output of around 10 citations. Each student session lasted 15–20 minutes. At the end of the session, the students were told they would receive both a disk version and a print version of their search output at the beginning of their next Vietnam War history class.

At the beginning of the next class, the author handed out the search output in sealed envelopes. Included in the envelope for the students in the high recall search strategy group, who received a visualization, was a short message that appeared on a separate page preceding the visualization:

Please examine the representation of your essay topic that you created, comparing it with an objective representation of your topic area (created by counting index terms). Do you agree with it? Then examine relationships of concepts to establish a context for your essay topic.

The marks the students received for their essay were used to evaluate the effectiveness of the two search strategy interventions. It was decided beforehand that all students who received an intervention but did not receive a mark from the course instructor (because they did not hand in their essay, or for whatever other reason) would be given a 0. The significance of the differences between mean marks for the intervention group, the comparison group, and a third group made up of the five students who could not be contacted and thus did not receive any intervention was tested using ANOVA.
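To make the analysis step concrete, the following sketch computes a one-way ANOVA F statistic by hand after applying the zero-fill rule described above. The marks are invented placeholders, not the study's data, and the function names are hypothetical. A statistics package would normally be used to obtain the p-value from the F statistic and its degrees of freedom.

```python
def zero_fill(marks):
    """Replace missing marks (None) with 0, as decided before the study."""
    return [0 if m is None else m for m in marks]


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of numbers."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    # Between-group sum of squares: group size times the squared distance
    # of each group mean from the grand mean.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    # Within-group sum of squares: squared distance of each value
    # from its own group mean.
    ss_within = sum(
        (x - sum(g) / len(g)) ** 2 for g in groups for x in g
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Invented marks for illustration only (None = essay not handed in).
high_recall = zero_fill([72, 65, None, 80])
high_precision = zero_fill([70, 68, 75, None])
no_intervention = [78, 82, 74]  # this group handed in essays, so no zero-fill

f_stat, df_b, df_w = one_way_anova_f(
    [high_recall, high_precision, no_intervention]
)
```

In groups this small, a single zero-filled mark pulls the group mean down sharply and widens the within-group spread, which is worth keeping in mind when reading the reported results.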

The Accompanying Visualization

The accompanying visualization, an example of which is shown in Figure 1, was created for each of the 16 students in the high recall with accompanying visualization group by taking the high recall search strategy citation output that resulted from searching one of two history databases available on compilation CD-ROMs: Historical Abstracts (1971–1998) and America: History and Life (1964–1998). Both databases have index terms or descriptors attached to each record by professional indexers. High recall was defined as an output producing about 200 records, so this was the citation output we aimed at when selecting search terms for each student in the high recall search strategy group. We listed and counted the number of times each index term appeared in the total citation output for each student and created categories of circle sizes. The two or three most important index terms, which appeared 150 to 200 times in the high recall citation output, were given the largest circle category. The next largest circle category was for index terms appearing around 100 times. Two much smaller circle sizes denoted index terms that appeared between 10 and 50 times and under 10 times, respectively. The different size circles were placed on the visualization according to the author’s knowledge of the relationships between the concepts denoted by the index terms. Lines were drawn connecting the index term circles in the top half of the diagram in Figure 1 to the student’s own mental representation of the topic area in the lower half of the diagram. The first author is not an expert in the Vietnam War; therefore, the visualization represents an educated appraisal of the index relationships only. Each visualization took 2 to 3 hours to produce using the Macintosh drawing program.
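The counting procedure above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' tooling (the visualizations were drawn by hand): the records are invented, each index term is counted once per record, and the cut-offs of 150, 75, and 10 are assumed boundaries chosen to cover the gaps between the ranges the text reports.

```python
from collections import Counter


def count_index_terms(records):
    """Count the number of records in which each index term appears."""
    counts = Counter()
    for terms in records:
        counts.update(set(terms))  # assumption: one count per record
    return counts


def circle_category(count):
    """Map a term count to one of four circle sizes.

    The text gives 150-200, about 100, 10-50, and under 10; the exact
    boundaries used here (150, 75, 10) are assumptions.
    """
    if count >= 150:
        return "largest"
    if count >= 75:
        return "large"
    if count >= 10:
        return "small"
    return "smallest"


# Invented output of 200 records, each a list of index terms.
records = (
    [["Vietnam War", "Antiwar movements"]] * 100
    + [["Vietnam War", "Press"]] * 40
    + [["Vietnam War"]] * 55
    + [["Tet Offensive"]] * 5
)
counts = count_index_terms(records)
sizes = {term: circle_category(n) for term, n in counts.items()}
```

Placement of the circles and the connecting lines is not automated here; in the study it relied on the author's judgment of how the concepts relate.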

In the present study, the system visualization was created by the author counting index terms from the undergraduate’s high recall citation output and representing these counts in visualization-like form with small circles, large circles, and connecting lines. While the visualization is unsophisticated, we believe the message it communicates to the domain novice subjects in our study is not that different from the visualizations of information space produced by using citation analysis techniques. We discuss this observation further in Cole et al. (2002).

Results

The mean marks for the two groups of students who received the two interventions, plus a third group of five students who handed in an essay but did not receive any intervention because they were not available for the study, are given in Table 1.

The difference in mean marks between the three groups is not significant, using a t-test (p = 0.414). Although a large difference was found between the mean marks of students receiving some intervention versus the group of five students in the Essay But No Intervention Group, the difference was not statistically significant (between the high recall group and no intervention group, p = 0.292; between the high precision group and no intervention group, p = 0.227).

JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY—January 1, 2003 43

The difference in mean marks between the high recall with visualization group and high precision group was also not significant, using a t-test (p = 0.778), which indicates that neither strategy was better than the other at the moment when the intervention was given.
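As a rough check on the size of this difference, a t statistic can be computed from the means and standard errors reported in Table 1 alone. The sketch below uses Welch's unequal-variance form, which is an assumption: the article does not say which variant of the t-test was used, and the standard errors in Table 1 are rounded. The function name is hypothetical.

```python
import math


def welch_t_from_summary(mean_a, se_a, n_a, mean_b, se_b, n_b):
    """Welch's t statistic and approximate degrees of freedom, computed
    from group means and standard errors of the mean."""
    var_a, var_b = se_a ** 2, se_b ** 2  # squared standard errors
    t = (mean_a - mean_b) / math.sqrt(var_a + var_b)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (var_a + var_b) ** 2 / (var_a ** 2 / (n_a - 1) + var_b ** 2 / (n_b - 1))
    return t, df


# Values from Table 1: high precision group vs. high recall plus visualization group.
t, df = welch_t_from_summary(71.0588, 4.8, 17, 69.1250, 4.8, 16)
print(f"t = {t:.3f}, df = {df:.1f}")
```

With the Table 1 values, the statistic is roughly 0.285 on about 31 degrees of freedom: the difference between the two means is less than a third of its own standard error, which is in line with the non-significant result reported here.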

Discussion

The results of the study are inconclusive, neither affirming nor denying the usefulness of visualizations of the topic space for the high recall group. We find it useful that, given the negative reputation of high recall searches, the mean mark of the high recall group did not suffer from the high recall search given them. However, because the sample sizes were small for each of the two intervention groups, the decision to assign a 0 mark to subjects who did not hand in their essay introduces some unknown variance into the analysis. (The decision to assign a 0 was made to evaluate the effect of receiving high recall output on the student's motivation to complete and hand in the essay, which we assumed might be negative.)

In evaluating the study, there are two issues we have to confront. First, there is a distinction between a front-end visualization, where the user is shown a visualization of the whole world of knowledge from which he selects the particular part he wishes to explore, and a visualization based on the system's interim output, the results of the system's matching process that comes after the user inputs his topic into the IR system. In our previously discussed second article (Cole et al., 2002), where five visualization schemes were evaluated for a domain novice user, the Small (1999) Hybrid Display Scheme was the only one of the schemes studied that provided the user with the first gateway option; the other four schemes required the user to query the system first, thus limiting the visualization to that user's subject domain. There is a difference between these two options (Brooks & Campbell, 1999), which we are investigating elsewhere (Leide, Large, Beheshti, Brooks, & Cole, in press). In the present study we presented the undergraduate in the high recall group with only the slice of the information space that was pertinent to her essay topic after she had defined the topic space she wanted to see and typed the query into the system. In effect, for the high recall group in the study, the visualization of the information space shown to the user was the interim display output.

The interim display output by definition requires the user to examine individual records from the output to determine the effectiveness of the search terms and strategy, leading to possible modification of the strategy and/or search terms. A limitation of the present study is that the subjects in the high recall plus visualization group received the visualization of their interim display output a week after they had left the IR system, giving them no time to modify their vision of their topic space through the usual route of looking at the system's citation output to see if the output was relevant to their information need. This relevance judging process might have had considerable influence on the student's own mental model of their essay topic. A future study is needed to specifically test this assumption, i.e., whether or not judging the interim results list for relevance significantly affects the undergraduate's conception of the topic space, and/or positively affects the undergraduate's mark.

In the normal development of the visualization scheme presented to the students in our study, the student should have the chance to interact with the visualization before deciding to produce the final output and taking it away. Here, the student did not get to do this. We do not know if the undergraduates in the high recall intervention group even opened the envelope they were given containing the visualization, or if they did, whether they understood what they were supposed to do with it. The visualizations might have been more effective if we had sat down with them and shown them how to use the visualization. (To do this with the high recall group, and to meet the concerns of the Ethics Committee, we would also have had to devise a second and comparable quality interaction with the high precision group.) We believe the student's interaction with the visualization is probably a crucial part of making visualizations work, and the fact that in this study the undergraduate did not have the opportunity to engage in this interaction militated against the visualization having a positive effect on the student's essay mark. In future phases of the research we will build this interaction into the study.

The second issue we have to confront in evaluating this study is the linkage of visualization production with a high recall search strategy. Because a high recall search strategy and a visualization of the system's interim output may necessarily be the same thing (given that the visualization is to facilitate exploration and the user's associative thinking), in a sense, the high recall search strategy we used was a method of creating the visualization of the information space. For instance, what purpose would visualizing the high precision search strategy results serve unless the visualization also showed the user possible gateways outside the high precision information space? Therefore, in this study we were really testing the efficacy of a visualization of the undergraduate's information space against a high precision word list. If we had wanted to test a high recall search strategy against a high precision search strategy, both should have remained word lists. The visualization for the high recall group confused this issue. A more effective testing of the hypothesis would have been to include a third group, one that received only high recall strategy assistance and no visualization. If this third group had received the same mean mark as the high recall strategy with visualization group, then the study would also have said something interesting about the viability of visualizations of topic spaces as a method of facilitating a user's identification of her information need.

TABLE 1. Mean marks for essay for three groups.

Group                                            Mean mark (in %)   Std. error
High recall plus visualization group (N = 16)    69.1250            4.8
High precision group (N = 17)                    71.0588            4.8
Essay but no intervention group (N = 5)          56.6000            14.4
Total (N = 38)                                   68.3421            3.5

A third evaluation issue arose during the study itself. Were the undergraduates in our study actually in Stage 3 of their ISP at the time of the intervention? This issue handicapped the efficacy of the visualization tested here because the assumptions underlying the study design were contingent on the subjects in the study being in Stage 3 of their ISP, the exploration stage of researching their essay topic. We have previously discussed the difficulty of correctly diagnosing the student's ISP stage (Cole, 2001; Cole, Cantero, & Ungar, 2000). Although we timed this study to coincide with the students' session with their course instructor, when they were required to have already selected their topic and argument statement, it became apparent during the interviews that some students had not yet selected their argument statement. (In the individual case, this may have been influenced by the amount of the student's prior knowledge of the essay topic before starting the course.) They were still dealing with topic identification and selection, which is the objective of Stage 2. It is our impression that students who are selecting their topic may not benefit from a high recall search strategy because topic selection should probably be based on the student's life-long interests, i.e., their basic philosophy about how the world and human nature work. This is an internally located and satisfied phase of the student's research process, and outside information from an IR system may confuse this Stage 1–2 process.

A study such as the one described in this article has several benefits, however, beginning with the requirement that the student think in terms of her own mental model of the information space and produce it for the system to consider and use to create its visualization of the information space. This was the focus of our study: by actualizing on paper the undergraduate's own mental representation of her topic area, is this user's mental model an effective guidepost to comprehend and interpret the objective representation of topic knowledge created by the IR system? Although inconclusive, the results of the study suggest that a visualization of a high recall search strategy does not appear to have a negative influence on the students' marks.

Conclusion

In our study of 38 undergraduates researching an essay for a Vietnam War history course, we randomly assigned all the students in the class to two groups: one group received a high recall search strategy intervention with accompanying visualization, and the comparison group received a high precision search strategy intervention. There was no significant difference in the mean marks of the essays awarded the two groups by the course instructor.

In future research, we will look more closely at the ambiguous stage when the student believes she has selected her topic but has not yet defined it. This may in fact be an important difference between Stage 2 and Stage 3. Part of the reason we believe it is important to distinguish between Stage 2 and Stage 3 is that undergraduates who are identifying and selecting their essay topic in Stage 2 of their ISP should perhaps have an entirely different experience with the IR system than in Stage 3. Researching and writing an undergraduate essay is dependent on inputs of energy. Kuhlthau's hypothesis is that the student may select a topic based on a consideration of their own interests, in Stage 1 of the ISP, requiring internal introspection and internally sourced energy inputs. Stage 2 may require a more internally sourced input than external (i.e., from the IR system and its database), while Stage 3 may require an external source of information accessed from an interactive IR system in a first part, for purposes of exploring the structure and issues of the topic information space, leading to, again, a more internally focused interaction with the IR system in a second part of Stage 3 (perhaps this should be called Stage 3.5), where the system facilitates the user's definition of her topic. In future research, we will try to determine if this is, in fact, the case.

At this point in our research, we believe the user's information need changes constantly over the course of the ISP. When an undergraduate is in Stage 3 (perhaps 3.5), the information need has to do with defining the essay topic. Therefore, the visualizations shown to this user by the IR system should activate the undergraduate's viewer-centered coordinate system, allowing him or her to perceive the dots, circles, shapes and spaces contained in the visualization as a meaning object. The visualized index/classification scheme does this by stimulating the undergraduate to form her own concept model of the subject area, which, like a cartographic map reader invoking her familiarity with cartographic visualizations, anchors and guides her search strategy through the information space so that she can integrate the objective structure of the topic space into her definition of her topic, to a certain degree determining it.

Acknowledgments

For the first author, the research for this article was funded by the Social Sciences and Humanities Research Council of Canada (SSHRC), Fellowship Award No. 756-97-0278, and Standard Research Grant, File No. 410-97-0366. The authors also wish to thank the Concordia University history students who participated in the study.


References

Brooks, M., & Campbell, J. (1999). Interactive graphical queries for bibliographic search. Journal of the American Society for Information Science, 50, 814–825.

Chen, H., Houston, A.L., Sewell, R.R., & Schatz, B.R. (1998). Internet browsing and searching: User evaluations of category map and concept space techniques. Journal of the American Society for Information Science, 49(7), 582–603.

Cole, C. (2001). Intelligent information retrieval: Part IV. Testing the timing of two information retrieval devices in a naturalistic setting. Information Processing & Management, 37, 163–182.

Cole, C., & Mandelblatt, B. (2000). Using Kintsch's discourse comprehension theory to model the user's coding of an informative message from an enabling information retrieval system. Journal of the American Society for Information Science, 51, 1033–1046.

Cole, C., Cantero, P., & Sauve, D. (1998). Intelligent information retrieval: Part II: Diagnosing information need: Uncertainty expansion: A prototype of a diagnostic IR tool. Information Processing and Management, 34(6), 721–737.

Cole, C., Cantero, P., & Ungar, A. (2000). The development of a diagnostic-prescriptive tool for undergraduates seeking information for a social science/humanities assignment. III. Enabling devices. Information Processing & Management, 36, 481–500.

Cole, C., Mandelblatt, B., & Stevenson, J. (2002). Visualizing a high recall search strategy output for undergraduates in an exploration stage of researching a term paper. Information Processing & Management, 38, 37–54.

Cutter, C.A. (1876/1904). Rules for a dictionary catalog (3rd ed). Washington, DC: Government Printing Office.

Farah, M.J. (2000). The cognitive neuroscience of vision. Malden, MA: Blackwell.

Hawkins, D.T. (1999). Information visualization: Don't tell me, show me. Online, 23(Jan/Feb), 88–90.

Hearst, M.A. (1997). Interfaces for searching the web. Scientific American, 276(3), 68–72.

Hellstern, M., Scott, G.M., & Garrison, S.M. (1998). The history student writer's manual. Upper Saddle River, NJ: Prentice Hall.

Kennedy, L., Cole, C., & Carter, S. (1999). The false focus in online searching: The particular case of undergraduates seeking information for course assignments in the humanities and social sciences. Reference & User Services Quarterly, 38, 267–273.

Kuhlthau, C.C. (1991). Inside the search process: Information seeking from the user's perspective. Journal of the American Society for Information Science, 42(5), 361–371.

Kuhlthau, C.C. (1993). Seeking meaning: A process approach to library and information services. Norwood, NJ: Ablex Publishing.

Landis, T., Graves, R., Benson, F., & Hebben, N. (1982). Visual recognition through kinaesthetic mediation. Physiological Medicine, 12, 515–531.

Leide, J.E., Cole, C., Large, A., Beheshti, J., & Brooks, M. (Submitted for publication). Virtualizing Cutters collocation 2nd object for the Internet.

Leide, J.E., Large, A., Beheshti, J., Brooks, M., & Cole, C. (in press). Visualization schemes for domain novices exploring a topic space: The navigation classification scheme. Information Processing & Management.

Lin, X. (1997). Visualization displays for information retrieval. Journal of the American Society for Information Science, 48(1), 40–54.

Rorvig, M., & Lunin, L.F. (1999). Introduction and overview: Visualization, retrieval, and knowledge. Journal of the American Society for Information Science, 50, 790–793.

Rose, D.E., & Belew, R.K. (1991). A connectionist and symbolic hybrid for improving legal research. International Journal of Man–Machine Studies, 35, 1–33.

Small, H. (1999). Visualizing science by citation mapping. Journal of the American Society for Information Science, 50, 799–813.

Taylor, R.S. (1968). Question-negotiation and information seeking in libraries. RO, 29(3), 178–194.

White, H.D., & McCain, K.W. (1998). Visualizing a discipline: An author co-citation analysis of information science, 1972–1995. Journal of the American Society for Information Science, 49(4), 327–355.
