human information retrieval - j.warner-2010-9780262013444

199
8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444 http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 1/199 Julian Warner Human Information Retrieval

Upload: rmbxxx

Post on 10-Apr-2018

216 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 1/199

Julian Warner

Human Information Retrieval

Page 2: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 2/199

Human Inormation Retrieval

Page 3: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 3/199

History and Foundations o Inormation Science

Edited by Michael Buckland and Jonathan Furner

Human Inormation Retrieval , by Julian Warner

Page 4: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 4/199

Human Inormation Retrieval

 Julian Warner

The MIT Press

Cambridge, Massachusetts

London, England

Page 5: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 5/199

© 2010 Massachusetts Institute o Technology

All rights reserved. No part o this book may be reproduced in any orm by anyelectronic or mechanical means (including photocopying, recording, or inorma-tion storage and retrieval) without permission in writing rom the publisher.

For inormation about special quantity discounts, please email [email protected]

This book was set in Sabon by The MIT Press.Printed and bound in the United States o America.

Library o Congress Cataloging-in-Publication Data

Warner, Julian, 1955–Human inormation retrieval / Julian Warner.

p. cm.—(History and oundations o inormation science)Includes bibliographical reerences and index.ISBN 978-0-262-01344-4 (alk. paper)

1. Inormation retrieval. I. Title.ZA3075.W375 2010025.04—dc22

200901012010 9 8 7 6 5 4 3 2 1

Page 6: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 6/199

Contents

Acknowledgments vii

1 Introduction 1

2 Selection Power and Selection Labor 17

3 Description and Search Labor 33

4 A Labor Theoretic Approach 53

5 Retrieval rom Full Text 73

6 A Semantics or Retrieval rom Full Text 95

7 A Syntactics or Retrieval rom Full Text 113

8 Semantics and Syntactics or Retrieval rom Full Text 131

9 Conclusion 147

Postscript 157

Notes 163

Supplementary Readings 169

Bibliography 173

Index 185

Page 7: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 7/199

Page 8: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 8/199

Acknowledgments

The John Campbell Trust supported a presentation on the labor theoretic

approach to inormation retrieval at the Annual Meeting o the American

Society or Inormation Science and Technology, Providence, RI,

November 2004. I received hospitality and assistance rom the Research

Centre or the Social Sciences (RCSS), University o Edinburgh, where

most o the articles rom which material has been derived were origi-

nally written during study leave rom the Queen’s University o Belast,

February–July 2005.I am indebted to the unnumbered students at the Queen’s University

o Belast and Indiana University who interactively contributed to the

development o the research themes expounded by taking courses in

Inormation Policy, Communicating Electronically, and Computer

Mediated Communication, 2001–2007.

Michael Buckland, Elisabeth Davenport, and Madeleine Coyle also

gave me support and encouragement.

Chapters 2–4 are adapted, respectively, rom the ollowing series o articles:

Warner, J. 2007. Selection power and selection labor or inormation retrieval.  Journal o the American Society or Inormation Science and Technology 58 (7):915–923.

Warner, J. 2007. Description and search labor or inormation retrieval. Journal o the American Society or Inormation Science and Technology. 58 (12): 1783–1790.

Warner, J. 2008. A labor theoretic approach to inormation retrieval. Journal o the American Society or Inormation Science and Technology 59 (5): 731–741.

Chapter 2 also incorporates material rom the ollowing article:

Warner, J. 2000. In the catalogue ye go or men: evaluation criteria or inorma-tion retrieval systems. Aslib Proceedings 52 (2): 76–82.

Page 9: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 9/199

Chapters 5–8 incorporate material rom the ollowing articles:

Warner, J. 2007. Analogies between linguistics and inormation theory. Journal o the American Society or Inormation Science and Technology 58 (3): 309–321.

Warner, J. 2007. Linguistics and inormation theory: analytic advantages. Journal o the American Society or Inormation Science and Technology 58 (2): 275–285.

They also include some material published in the ollowing book chapter.

Warner, J. 2008. Materializing communication concepts: linearity and surace inlinguistics and inormation theory. In Exploration o space, technology, and spa-tiality: Interdisciplinary perspectives, ed. P. Turner, S. Turner, and E. Davenport.,196–213. Hershey: Inormation Science Reerence Press.

viii Acknowledgments

Page 10: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 10/199

1Introduction

Inormation retrieval is o high contemporary signifcance, diusing into

ordinary discourse and everyday practice. Recently inormation retrieval

has changed rapidly, particularly through the inuence o Internet search

engines. Practical understanding o how to use systems has been in

advance o uller theoretic understanding, particularly when concerned

with transormations o meaning. The breadth o the area—the variety o 

disciplines that have and potentially could contribute—may have inhib-

ited the development o understanding. A deep divide remains betweencentrally relevant activities and their associated cultures, between library

and inormation science and Internet search engines. The limited theoreti-

cal development implies a need or a deeper—more comprehensive and

inclusive—understanding o inormation retrieval, which should reveal

structure and underlying patterns in a complex, dierentially understood,

and apparently chaotic area. Ideally, a deeper understanding or theory

should be congruent with practical understanding and everyday prac-

tice. It should also be explicitly articulated, yielding knowledge which canbe returned to the real world to inorm deliberate intervention in system

design and use rather than unconsciously reproducing patterns o activity.

The overall intention is to develop a labor theoretic approach to inor-

mation retrieval. The immediate concern in this chapter involves the initial

components or this approach. This chapter also reviews existing evalu-

ative traditions and indicates the possibility or synthesis within a labor

theoretic approach. Chapter 2 introduces and develops crucial concepts

o selection power and selection labor, both as concepts and activities in

themselves and or the relation between them.

Page 11: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 11/199

2 Chapter 1

Labor, choice, and technology are undamental to human experience.

In the Judeo-Christian tradition, once out o Eden we are condemned to

labor and compelled to choose. Technology may have been noticed lessexplicitly, but it is equally pervasive in post-Edenic experience, both as

agrarian and industrial—or productive—and inormation technologies.

Physical and mental labor are usually considered separately rom each

other, but acknowledgment o the mental components o physical labor

and physical elements in mental labor have moved recently toward syn-

thesis, and an emerging view o intelligence concerns “quality o our

bodies as much as our minds” (Gosden 2003, 31–33, 119). Mental or

inormational labor has been recognized as both an independent activ-ity and an adjunct to obtaining physical control over the environment

(Webster 2002, 15). Types o mental labor have been dierentiated,

with semantic labor distinguished rom syntactic labor (Warner 2005a).

Classically rom Aristotle, choice, or deliberation, is the product o men-

tal labor. Late-twentieth-century developments in inormation technol-

ogy constitute a revolution in the mechanization o mental labor (Minsky

1967, 2), embodied in the computer as a universal inormation machine

that developed rom mid- and late-nineteenth-century antecedents in

special-purpose inormation machines. Both productive and inormation

technologies are human constructions—the products o human physical

and intellectual labor drawing upon natural resources and preexisting

human constructions (Warner 2004, 5–35). An understanding o inor-

mation retrieval constructed rom labor, choice, and technology promises

to be deeply rooted in human experience and to oer a radical depth o 

understanding.Power in explanation can be demonstrated by the ability to absorb ele-

ments rom previous models as special cases o the new model, indicative

o the history o a true science, while discarding those elements that have

obstructed understanding.

Existing Models

Inormation science has developed existing evaluative models that shouldbe absorbed into the new model and oer some elements or synthesis

and advancement, with diusion into computer science, librarianship, and

indexing. Some discussions o the inormation society are concerned with

Page 12: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 12/199

Introduction 3

inormational or mental labor, and this oers a more indirect resource

that can also be absorbed.

Inormation and Computer Science

The dominant tradition or evaluating inormation retrieval systems in

inormation science emerged nearly simultaneously and partly independ-

ently in the United States and in the United Kingdom during the early- to

mid-1950s (Ellis 1996, 1–22) and has since diused to and been partly

absorbed by computer science. The techniques developed or selection

and ordering o reerences and documents have served as exemplars,

or demonstrations o possibility, or the increasingly dominant Internetsearch engines, with some elements o more direct transer or inheritance

in search algorithms derived rom inormation retrieval research (Ellis

and Vasconcelos 1999, 8). There were also parallel developments in com-

mercial search services, largely independent in concept but drawing on

common technologies, which also served as exemplars and demonstrated

commercial easibility and technical possibility.

The inormation and computer science tradition has not always been

explicit about its own values or examined its own assumptions; with-

out ull notice, it has sometimes departed pragmatically rom its initial

assumptions. It can, however, be broadly characterized as query trans-

ormation, with the query articulated verbally in advance o searching

and then transormed by a system into a set o records (Heine 1980). The

central value o query transormation can be summarized as a system

that should deliver all and, insoar as possible, only all records relevant

to a given query. Retrieved records are assessed or their relevance to thequery and beore generating measures o perormance, including preci-

sion and recall. The adopted methodology induces a bias toward fxing

and possibly reiying relevance, reducing it rom a concept to a relation

between query and document (Ellis 1984, 28–29; 1992; 1996, 11–20).

Bibliographic systems, rather than ull text, have been the dominant—

although not exclusive—subject o study. Humanly assigned indexing has

tended to be received as a given, leaving the rationale or such indexes—

and their associated labor and costs—unexplored. There has also beenan implicit teleology that aims or a perect system. Evaluation has

become an end in itsel, sometimes obstructing understanding (Ellis 1984;

1992).

Page 13: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 13/199

4 Chapter 1

The relation between inormation technology and the research tra-

dition in inormation and computer science could be characterized as

repression, wherein the repressed reemerges but not at a ully consciouslevel. Repression is discernible in the insistence that the retrieval proc-

esses created and studied are independent o their particular technologi-

cal instantiation while simultaneously allowing procedures to be strongly

determined by contemporary technological possibilities. For instance,

the stress on query transormation corresponded to the batch processing

embodied in the technology o the 1950s. The theoretical legacy o query

transormation has proved difcult to adapt to modern systems, which do

not necessarily demand a verbally articulated query in advance o search-ing and which can be—and are—used interactively. Critiques argued

that the assumption o the necessity or a verbally articulated query was

intratheoretic (Heine 1980) rather than intrinsic to inormation seeking;

this argument has been substantiated by changes in practice enabled partly

by subsequent technological developments. Reemergence o the repressed

can be ound in the late articulation and still-limited acknowledgment

o the identity between primitive operations o inormation retrieval

and logic or computation. Analysis has revealed that the potential trans-

ormations or inormation retrieval on written records or descriptions

are variations on primitive operations o sorting or partitioning and the

transormation o one symbol into another (Buckland and Plaunt 1994).

This can be a regarded as a special case o the known potential or reduc-

ing mathematical and logical operations on an object language to the

writing, erasure, and substitution o symbols (Ramsey 1925/1990, 165–

174) and also corresponding to the primitive computational operations(Warner 1994, 102–103). The paradigm o query transormation can be

regarded as largely but not entirely exhausted, becoming increasingly dis-

tant rom the empirical reality o interactive and distributed systems (Ellis

and Vasconcelos 1999, 8), exposing its rigidity i the original distinctions

are retained, or surviving by ad hoc modifcations to its theoretical base

and thereby losing relevance in the frst direction and internal intellectual

coherence in the other.

Two paradigms—the cognitive and the physical—have been distin-guished in inormation retrieval research, but they share the assump-

tion o the value o delivering relevant records (Ellis 1984, 19; Belkin

Page 14: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 14/199

Introduction 5

and Vickery 1985, 114). For the purposes o discussion here, they can be

considered as a single heterogeneous paradigm, linked but not united by

this common assumption. The value placed on query transormation isdissonant with common practice, where users may preer to explore an

area and may value ully inormed exploration. Some dissenting research

discussions have been more congruent with practice, advocating explora-

tory capability—the ability to explore and make discriminations between

representations o objects—as the undamental design principle or inor-

mation retrieval systems.

We can acknowledge the utility o techniques developed or selection

and ordering o reerences and documents—in both the experimental tra-dition and commercial practice—and simultaneously recognize that these

techniques are derived rom known undamental computational opera-

tions. The techniques have been realized in the special-purpose tools and

machines used or inormation retrieval at the beginning o the research

tradition in the 1950s and by the programmed universal inormation

machine o the modern computer. We can ully acknowledge technology

rather than repress it and still make a distinction between techniques and

values in order to preserve and carry orward what may be valuable rom

inormation retrieval research in inormation and computer science.

Librarianship and Indexing

Compared with the research tradition developed in inormation science

and subsequently diused to computer science, the historical antecedents

or understanding inormation retrieval in librarianship and indexing are

ar longer but less widely inuential today. They have tended to be lessexplicit about their evaluative criteria and aims or inormation retrieval

systems, and ar less concerned with producing measures o eective-

ness. In contrast to inormation and computer science, they have been

associated with the technologies o writing and printing and have had

a pronounced preerence or direct human description o inormation

objects. Although less immediately pronounced, we can discern a similar

pattern o repression regarding technology. The need or descriptions less

extensive than the documents described, imposed by storage constraintso inscribed media, and or direct human intervention in the creation o 

these descriptions, connected with the technical characteristics o writing

Page 15: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 15/199

6 Chapter 1

and printing, have tended to be universalized and treated as i they were

independent o their dominant technological realizations (Wilson 2001).

The inormation and computer science tradition probably inherited theassumed need or brie index descriptions directly rom existing inorma-

tion products and not rom theories that inormed the construction o 

those products (Cleverdon 1962; Cleverdon, Mills, and Keen 1966). A

urther limitation o library studies involves its ocus on training in the

use o inormation retrieval systems, which oten concentrates on the level

o system commands rather than understanding their value in communi-

cation (Roberts 1989). Disturbing evidence suggests that ormal inor-

mation retrieval systems are marginal in communication (particularlyscholarly communication), especially in the sense o inormation, topic,

or subject retrieval rather than document identifcation (known author or

title) and supply (Bath University Library 1971; Smithson 1994).

Two valuable elements are carried orward rom librarianship and

indexing. The frst is a partly implicit stress on selection power, conceived

as bibliographic control in librarianship (Wilson 1968) and implied by

valuing the discriminatory power o index terms in indexing but made

ully explicit here. The second is an acknowledgment o the role o direct

human intellectual labor in creating selection power, transorming into a

uller understanding distinguished rom specifc technological constraints

and their partly covert inuence on theory and practice.

Inormation Society Discussions

Inormation society discussions have given some rather limited attention

to inormation retrieval. For instance, Lyotard comments:It is reasonable to suppose that the prolieration o inormation-processingmachines is having, and will continue to have, as much an eect on the circulationo learning as did advancements in human circulation (transportation systems)and later, in the circulation o sounds and visual images (the media). (1984, 4)

Other comments remain similarly unocused, recognizing the signifcance

o inormation retrieval but not providing ull research or intellectual con-

text or its consideration. In particular, some inormation society discus-

sions have treated technology unsatisactorily (Webster 2002), possiblydue to wariness about being stigmatized as technologically determinist

(Wilson 1996a), and there has been a limited understanding o unda-

Page 16: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 16/199

Introduction 7 

mental computer operations. An analytically valuable category o inor-

mational labor has begun to be distinguished by some writers (Webster

2002, 15); this book will adopt and urther dierentiate that distinction,acknowledging the possibility o transerring some orms o mental labor

to inormation technology.

Summary

Dierent elements rom the inormation retrieval tradition developed in

inormation science rom librarianship, indexing, and inormation soci-

ety discussions will be selected and carried orward. The utility o the

techniques developed by inormation retrieval research—but not theassociated value o query transormation—is acknowledged, with the rec-

ollection that techniques are variations on primitive computational trans-

ormations. Selection power is adopted rom librarianship and indexing

as the primary value and the role o direct human labor is both substanti-

ated and critiqued. Inormational labor transormed into mental labor to

incorporate its historical antecedents is derived rom inormation soci-

ety discussions. Technology is restored, not repressed, and understand-

ing o the types o mental labor transerable to inormation technology is

inormed by the distinction between semantic and syntactic mental labor.

A synthesis o existing approaches is envisaged, producing a set o con-

cepts and categories that are simultaneously simpler and more powerul

than the query transormation o classic inormation-retrieval research,

more explicit and discriminating than librarianship and indexing (par-

ticularly regarding the signifcance and costs o human mental labor), and

uller and more technologically inormed than inormation society discus-sions.

This book adopts an inclusive understanding o inormation retrieval

systems, developed rom common understandings and conveyed by osten-

sive exemplifcation rather than restrictive defnition. In particular, the

common antithesis between experimental and operational systems is dis-

solved. The real source o contrast between the types o system likely has

been dierent orms o the description process, particularly the experi-

mental preerence or machine generation rather than human selection o index terms and or non-Boolean searches or those descriptions. When

made explicit, the basis or the distinctions between experimental and

Page 17: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 17/199

8 Chapter 1

operational inormation retrieval systems appears theoretically weak. The

distinction is being increasingly eroded in practice, with operational sys-

tems possibly selecting records or documents by directly Boolean opera-tions but ordering retrieved documents on the basis o other indicators.

I the proposed model is to be regarded as a scientifc advance, it must

have a dual aspect that comprehends empirical reality and selectively

absorbs existing models. Empirical reality should be explained as ully,

powerully, and as parsimoniously as possible. The pervasive presence o 

labor, choice, and technology in inormation retrieval practice promises

a strong degree o correspondence to empirical reality. Human labor is

immediately present as the description labor o cataloging, classifcation,and database description. Choice has been persistently embodied in prac-

tice and, more recently, increasingly recognized theoretically and valued

as both selection rom retrieval results and the fltering o inormation.

Diused rom the 1950s, modern inormation technologies, to which

aspects o human mental labor can be and increasingly are transerred,

are now pervasive in inormation retrieval.

The reader can now anticipate this book’s approach and structure.

The encompassing theories adapted or development and the mode o 

presentation derives rom the human sciences. The author understands

human history as a cumulative progression: human labor acting upon

the naturally given and humanly modifed environment as the primary

source or progressive change developed by, but not exclusive to, Marx.

Distinctions rom Ferdinand de Saussure’s Course in General Linguistics 

are adapted or understanding transormations o meaning in ull-textretrieval. Material rom inormation theory and computer science is also

understood rom the perspective o the human sciences. Inormation the-

ory is utilized to comprehend patterns o occurrence and recurrence o 

words and phrases. The discussion is inormed throughout by an under-

standing o the computational process derived directly rom automata

theory, or the theory o computation.

Characteristic o the human sciences, the mode o presentation is prima-

rily discursive and aims to obtain broad intelligibility. Logically expressedschemata are introduced, accompanied by diagrams more oten encoun-

tered in the ormal sciences but intended here to reveal the underlying

Page 18: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 18/199

Introduction 9

rigor and economy o the discursively expressed argument, parallel to the

discourse rather than independent points o departure. Methodologically,

the book deliberately excludes some complexity, particularly regardingull-text retrieval, to enable analytic concentration on central or inescap-

able issues; it also excludes simplifed assumptions that can be alse or

distorting.

Each chapter develops cumulatively rom the preceding chapters.

Chapter 2: Selection Power and Selection Labor

Selection power—the ability to make inormed choices between objects

or representations o objects—is argued rom a number o perspectivesas the primary aim o inormation retrieval systems. Similar principles

or retrieval are adduced rom partly independent discourses, such as the

value placed upon an index term’s discriminatory power in discussions

o indexing and the concept o bibliographic control as mastery over

written and published records (UNESCO/Library o Congress 1950, 1).

Commonly used systems embody acilities to enhance selection power,

and the survival o such systems in the market or inormation services

testifes to the perceived utility o selection power (Swanson 1980, 128).

Ordinary discourse comments rom consumers o such systems also value

control and selection. The concept o exploratory capability developed

by critiques o the dominant research tradition is urther transormed

into selection power, providing more precise and generically applicable

analysis. Thereore, understandings embodied in some signifcant schol-

arly discourses, in practice and in ordinary discourse, precede theoretical

articulation, which continues to value query transormation above selec-tion power in the dominant research tradition. The etymology o intel-

ligence (inter-legere: to choose between) implies a link between selection

power and both deep and ordinary discourse, or widely diused aspects

o human experience (Stevens 1998, 66). Values or inormation retrieval

are brought into accord with processes by replacing query transormation

with selection power.

Selection power is conceived as a undamental concept that is open to

elucidation but not urther decomposition into more primitive entities. Itis understood as a quality o human consciousness that can be assisted

or rustrated by the system’s capacity or exploration but is not inher-

Page 19: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 19/199

10 Chapter 1

ent in the system itsel. Under certain historical conditions and levels o 

technological development, selection power is produced by activities such

as cataloguing, classifcation, description o objects or databases, andsearching catalogs and databases, all o which can all be comprehended

and understood as selection labor. Thus, selection power is not conceived

abstractly, apart rom real world circumstances, but operates in relation

to considerations o human labor (particularly mental labor), the costs o 

that labor, and the possibility o transerring particular orms o mental

labor to inormation technology, now primarily computational technolo-

gies. A undamental proposition is developed: selection power is produced

by selection labor.Selection labor is characterized as a orm o mental labor and theoreti-

cal minima are established or a given collection o objects. The separation

o selection labor into description and search labor with the premodern 

technologies o writing and printing on paper is noted. Similarly, the

author acknowledges that description and search activities reconverge

with computer-based, or modern, technologies and also acknowledges the

possibility o sustaining analytical distinctions between them.

Chapter 3: Description and Search Labor

Selection labor separates historically into description and search labor

and can be analytically decomposed. The activities o description and

searching are then more ully characterized empirically as components

o selection labor. As orms o mental labor, description and search labor

participate in the conditions or labor and mental labor. Concepts and dis-

tinctions that apply to physical and mental labor are indicated, introduc-ing the necessity o labor or survival, the idea o technology as a human

construction, and the possibility o transerring human labor—including

mental labor—to technology. Distinctions specifc to mental labor, par-

ticularly between semantic and syntactic labor, are introduced. The high

cost o human mental labor is also indicated.

Exempliied by cataloging, classiication, and database description,

description labor is more ormally understood as the labor involved in

transorming objects into searchable descriptions; it includes interpreta-tion. Search labor is understood as the human labor expended in search-

ing systems. Direct human labor has diminished progressively or both

Page 20: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 20/199

Introduction 11

description and search labor, and its syntactic aspects have transerred

to technology eectively compelled by the high relative costs o direct

human labor compared to machine processes.

Chapter 4: A Labor Theoretic Approach

The labor theoretic approach to inormation retrieval that inormed

chapters 2 and 3 is made ully explicit in chapter 4. The labor theoretic

approach has qualities usually desired or a theory that couples com-

prehensiveness with economy— parsimony, power, and fnal simplicity.

Although the ocus is on the computational mode, it acknowledges inher-

itances rom orality and literacy, and the theory can also comprehendoral and written modes. The labor theoretic approach absorbs library

and inormation and Internet activities into a common schema within the

computational mode; dierent aspects o the schema become prominent

or each set o activities. The schema developed within the theory is eco-

nomical, explicitly reduced to a short sequence o clauses, and also rep-

resented in diagrammatic orm that includes elements o iconicity. A very

powerul analysis results rom making ully explicit the dynamics that

are strongly implicit in current inormation activities. Once grasped, the

theory becomes simple.1

Chapter 5: Retrieval rom Full Text

The labor theoretic approach can account or the existence o ull text

retrieval, precisely locating signifcant changes in description and search

labor and processes. However, its analytic power is more ully demon-

strated in accounting or the existence o changes and precisely speciyingtheir location than by a uller understanding o those same qualitative

changes. As evidenced by the provision and use o phrase searching, practi-

cal understanding o transormations o word meaning and the requency

o word sequences has tended to run ahead o theoretical understanding

and articulation. Sources acknowledged in retrieval as relevant to under-

standing word meanings have exposed a misleading concept o language

as a nomenclature but have not ully articulated a positive account o the

production o meaning in written language. Thereore, uller and delib-erately articulated understanding requires urther development. Ideally,

urther development should remain consistent with the labor theoretic

Page 21: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 21/199

12 Chapter 1

approach, incorporate existing sources recognized as revealing, and coin-

cide with practical understandings; but it also should develop dialectically

rom these bases and may draw upon urther material.To obtain deeper insight into inescapable issues o semantics and syn-

tax—the production o meaning rom written language and replication

and dierences in words and words sequence—requires urther theoreti-

cal contexts. Thereore, the principal sources selected are Saussurean lin-

guistics (or understanding semantics) and inormation theory (or insight

into syntactics understood as patterns o replication and dierence in

written language), both continually inormed by the theory o computa-

tion. Selection power is understood to be inescapably produced by humanselection labor that modulates over time.

Chapter 6: A Semantics or Retrieval rom Full Text

This chapter is concerned with developing a semantics or written lan-

guage that incorporates crucial distinctions or understanding retrieval

rom ull text. Established and materially rooted categories in Saussurean

linguistics, the syntagma and paradigm—the linear sequence o utterance

and the network o associations a word acquires outside a particular syn-

tagma—are adapted to a largely unprecedented purpose: an account o 

the production o meaning rom written language. The inheritance o pat-

terns or the production o meaning and the occurrence and recurrence o 

words and phrases rom oral and written language are also acknowledged

and placed in dialogic encounter with the possibilities o computation.

Chapter 7: A Syntactics or Retrieval rom Full TextUnderstood as the occurrence and recurrence o patterns, the syntactics o 

written language are equally relevant to understanding retrieval rom ull

text. Material elements rom inormation theory—the message and mes-

sages or selection rom a source that correspond directly to the categories

chosen rom Saussurean linguistics—are adapted to gain understanding

o patterns in written discourse, which then can be directly and humanly

exploited in searching. The theory o computation continues to inorm the

understanding o patterns o occurrence and recurrence, particularly oroperations that eectively include cutting o words rom a line o writing,

consistent with the undamental computational operations o writing,

erasing, and substituting symbols.

Page 22: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 22/199

Introduction 13

   A  p  p  r  o  a

  c   h  e  s  t  o   i  n   f  o  r  m  a  t   i  o  n  r  e  t  r   i  e  v  a

   l

   C   h  a  p  t  e  r   1 .

   I  n  t  r  o   d  u  c  t   i  o  n

   C  h  a  p  t  e  r  F  o  u  r .   A l  a  b  o  r  t  h  e  o  r  e  t i  c  a  p  p  r  o  a  c  h

   C  h  a  p  t  e  r   N i  n  e .   C  o  n  c l  u  s i  o  n

   P o s t s c r i p t

   S  e   l  e  c  t   i  o

  n

  p  o  w  e  r

   S  e   l  e  c  t   i  o  n   l  a   b  o  r

   C   h  a  p  t  e  r   2 .

   S  e   l  e  c  t   i  o  n

  p  o  w  e  r  a  n   d

  s  e   l  e  c  t   i  o  n

   l  a   b  o  r

   D  e  s  c  r   i  p  t   i  o  n   l  a   b  o  r

   S  e  a  r  c   h   l  a   b  o  r

   C   h  a  p  t  e  r   3 .

   D  e  s  c  r   i  p  t   i  o  n

  a  n   d  s  e  a  r  c   h

   l  a   b  o  r

   D  e  s  c  r   i  p  t   i  o  n

   l  a   b  o  r  s  e  m  a  n  t   i  c

   D  e  s  c  r   i  p  t   i  o  n

   l  a   b  o  r  s  y  n  t  a  c  t   i  c

   S  e  a  r  c   h   l  a   b  o  r

  s  e  m  a  n  t   i  c

   S  e  a  r  c   h   l  a   b  o  r

  s  y  n  t  a  c  t   i  c

   D  e  s  c  r   i  p  t   i  o  n

  p  r  o  c  e  s  s  e  s

  s  y  n  t  a

  c  t   i  c

   S  e  a  r  c   h   l  a   b  o  r

  s  e  m  a  n  t   i  c

   C   h  a  p  t  e  r   5 .

   R  e  t  r   i  e  v  a   l

   f  r  o  m    f  u

   l   l  t  e

  x  t

   C  h  a  p  t  e  r   E i  g  h  t .  S  e   m  a  n  t i  c  s  a  n  d  s  y  n  t  a  c  t i  c  s  f  o  r  r  e  t  r i  e  v  a l  f  r  o   m  f  u l l  t  e  x  t

   D  e  s  c  r   i  p  t   i  o  n

  p  r  o  c  e  s  s  e  s

  s  y  n  t  a

  c  t   i  c

   S  e  a  r  c   h   l  a   b  o  r

  s  e  m  a  n  t   i  c

   C   h  a  p  t  e  r   6 .

   A  s  e  m  a  n  t   i  c  s

   f  o  r  r  e  t  r   i  e  v  a

   l   f  r  o  m    f  u

   l   l

  t  e  x  t

   D  e  s  c  r   i  p  t   i  o  n

  p  r  o  c  e  s  s  e  s

  s  y  n  t  a

  c  t   i  c

   S  e  a  r  c   h   l  a   b  o  r

  s  e  m  a  n  t   i  c

   C   h  a  p  t  e  r   7 .

   A  s  y  n  t  a  c  t   i  c  s

   f  o  r  r  e  t  r   i  e  v  a

   l   f  r  o  m    f  u

   l   l

  t  e  x  t

   T  a   b   l  e   1 .   1

   S  t  r  u  c  t  u  r  e

  o   f  t   h  e   b  o  o   k

Page 23: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 23/199

14 Chapter 1

Chapter 8: Semantics and Syntactics or Retrieval rom Full Text

This chapter ocuses on bringing semantics and syntactics back to the

examples o retrieval given in chapter 5 and then testing and demonstrat-ing their analytic advantages. A uller example o retrieval is also intro-

duced and considered rom the developed perspective.

Chapter 9: Conclusion

The conclusion explores the implications o semantics and syntactics

developed or preexisting theories o inormation retrieval, or the labor

theoretic approach, and or the practical evolution o Internet search

engines.

Postscript

The postscript addresses the changes in what it means to be human that

arise rom current developments in inormation retrieval.

A diagram confrms the structure o the book, indicating the topics cov-

ered by each chapter in relation to the categories o the labor theoretic

approach developed and adopted (see table 1.1). The absence o a cat-

egory does not mean that the category has been discarded in later chap-

ters, but rather that it either continues to be incorporated into the theory

developed without urther signifcant modifcation, or that it is no longer

highly signifcant in real world practice or has been analytically excluded

to enable clarity o attention. For instance, selection power is incorpo-

rated into the urther theory developed, while semantic description labor

and syntactic search labor have diminished in real world practice orretrieval rom ull text. Accordingly, both are analytically excluded rom

the later chapters. The diagram oers a guide or placing individual chap-

ters within the overall argument.

Page 24: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 24/199

Introduction 15

Ay, in the catalogue ye go or men;As hounds, and greyhounds, mongrels, spaniels, curs,Shoughs, water-rugs, and demi-wolves, are cleptAll by the name o dogs: the valu’d fleDistinguishes the swit, the slow, the subtle,The housekeeper, the hunter, every oneAccording to the git which bounteous NatureHath in him clos’d; whereby he does receiveParticular addition, rom the billThat writes them all alike;

—Shakespeare. Macbeth. c.1606. III.i.91Macbeth’s questioning o the murderers (Shakespeare, 1606/1988, 77–78)indicates the value historically attached to subtlety o distinctions in thelanguage or lexicon o inormation retrieval systems. In this respect, thepassage anticipates the principle ormulated in modern discussions o indexing and classifcation—the value o an index term lies in its discrimi-natory power—and is consistent with the valuing o selection power (seechapter 2).

Commentaries have glossed “valu’d” as an adjective derived rom thenoun (value) and not as the participle o the verb (Shakespeare 1606/1988,77), implying that values attached to objects are analogous to attributesin modern databases and index terms in inormation retrieval systems. Itcould also be read as being valued or valuable, giving “[p]articular addi-tion” or added value. A “fle” (Shakespeare 1606/1988, 142) is a list or roll,a highly linear technological orm.

While regarded as an image o order (Shakespeare 1606/1988, 77), thepassage also contains elements o disorder: interactions between the breedsare listed in tension with the hierarchy rom “dogs” to breeds o dogs—“shoughs” (shag-haired dogs) lead to “water-rugs” (rough-haired water

dogs). The transition rom the irst-named type (“hounds”) to the last(“demi-wolves”) represents a move rom domestic to semieral, with a hinto lycanthropy implied by the analogy with types o men. The impositiono hierarchy also associates closely with political tyranny.

Box 1.1Historical valuing o selection power

Page 25: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 25/199

Page 26: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 26/199

2Selection Power and Selection Labor

Selection Power

Selection power is understood as the human ability to make inormed

choices between objects or representations o objects. It is adopted here

as the primary value or aim or inormation retrieval systems, in contrast

to the stress o query transormation in the experimental research tradi-

tion. Selection power is not arbitrarily asserted; its epistemological and

ontological status is clarifed, its content conveyed through exemplifca-tion, and its value supported by the presence o analogous concepts in

separate scholarly and ordinary discourses.

Defnition and Elucidation

Like other undamental concepts, selection power may be difcult to

defne without becoming tautological. The difculty o defnition could

suggest the concept’s signifcance. In the classic sense o decomposition

into more primitive concepts, defnition is deliberately avoided and wouldbe difcult given the undamental nature o the principle, but the term is

still elucidated and a reusal o explication similarly avoided.

Questions regarding the classic practice o defnition as decomposing a

concept into known entities appear somewhat playully in literary sources,

partly by implication in linguistics and most explicitly in philosophical

texts. When Pip is inducted into written literacy in Great Expectations,

Dickens plays upon the inevitable circularity o defnitions:

“Your sister’s a master-mind. A master-mind.”“What’s that?” I asked, in some hope o bringing him to a stand. But Joe wasreadier with his defnition than I had expected, and completely stopped me byarguing circularly, and answering with a fxed look, “Her.”

Page 27: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 27/199

18 Chapter 2

“And I ain’t a master-mind.” (1861/1946, 50)

The implied association between the perceived need or defnitions

and written literacy has been more ormally stated, with the commonlyencountered insistence on preliminary defnitions regarded as a product

o the cultural transition rom orality to literacy (Goody and Watt 1968).

A structuralist perspective in linguistics implies a fnally circular under-

standing o meaning. In The Course in General Linguistics, Saussure

regarded linguistic signs as obtaining meaning rom their negative di-

erences rom other signs within a network o signs (1916/1983; Culler

1988, 52). Most radically (and seemingly independently), Wittgenstein

admitted the impossibility o defning primitive signs by urther decom-

position and advocated elucidation rather than defnition:

The meaning o primitive signs can be explained by elucidations. Elucidationsare propositions which contain the primitive signs. They can, thereore, only beunderstood when the meaning o these signs are already known. (1922/1981, §3.263)

Wittgenstein continues to pursue rigorous logical development regarding

the possibilities o combining primitive signs. Thereore, the commitment

here to a logical structure and fnally ormal and discursive presenta-

tion does not imply adherence to logical positivism, where empirical

propositions must correspond to sense impressions (Ayer 1936/1980).

Technically, this book oers elucidations rather than defnitions.

The primitive signs to which Wittgenstein reers correspond to atomic

acts also distinguished in the Tractatus Logico-Philosophicus and “From

the existence or non-existence o an atomic act we cannot iner the exist-

ence or non-existence o another” (Wittgenstein 1922/1981, § 2.062).Selection power is understood as an atomic act or primitive term not

amenable to urther decomposition, and elucidations rather than defni-

tions are oered. Furthermore, we cannot iner rom selection power that

other atomic acts exist. The process o elucidation here will begin with

an example and then indicate concepts analogous to selection power in

independent scholarly and ordinary discourses, implicitly acknowledging

that practical understanding o constructing and using inormation sys-

tems has preceded theoretical articulation. The discussion will rise romthe concrete to the abstract.

Page 28: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 28/199

Selection Power and Selection Labor 19

Example

A researcher might wish to distinguish the private individual Samuel

Langhorne Clemens rom the author Mark Twain. For this purpose,a valuable system would not conate the individual’s two distinguish-

able aspects but rather allow dierentiation. Later, the same researcher

might seek inormation that combines Mark Twain and Samuel Clemens

as a single entity. Thereore, an inormation retrieval system should be

capable o both dierentiating and linking occurrences o these dier-

ent names. Originally conceived as fctional in a double sense (Warner

2000), the example does have real historical roots. Collections o copy-

right proceedings index Twain’s copyright disputes under his legal name(Clemens) without providing a link to his pen name (Copyright Decisions

1909). A generic search or Twain and Clemens as a single entity across

dierent sources would have to adapt the terminology already used to

search the particular source in use. While index terms can oer discrimi-

nations and links between related subjects, selection power is considered

characteristic o human consciousness, derivable rom but not inhering in

semiotic products.

The relation between combined identity (Mark Twain and Samuel

Clemens as author and private individual) and separate identities (Mark

Twain [author] and Samuel Clemens [private individual]) corresponds to

the genus: species relation, with public: not public as the dierentiating

actor. Figure 2.1 illustrates this relationship. The genus: species relation

occurs repeatedly in indexing languages (or instance, in thesaural rela-

tions between broader and narrower terms and in the relation o indexing

terms taken rom a controlled vocabulary to the language o discourse,particularly as generic scope contrasted with specifcity). For ormal logic,

the species: genus is analogous to material implication (p isamembero q

has similar truth conditions to p implies q), although the variables or

material implication could denote objects rather than classes (Bell 1937,

volume 2, 491). Material implication has been the most productive and

most difcult o logical relations (Quine 1937/1953, 84).

Discrimination between Twain (author) and Clemens (private individ-

ual) could be obtained by: direct serial reading o relevant texts, where the searcher expends labor

in reading and discrimination, possibly creating an index;

Page 29: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 29/199

20 Chapter 2

algorithmic and computer-conducted transormations on texts, where

the searcher is required to eliminate alse recalls and retrieve all relevant

instances, both processes complicated by inevitable inconsistencies in the

language o discourse; or human assignment o index terms and reerences to sections o dis-

course, or interpretation by the searcher.

At each stage, the searcher’s selection power increases, her/his labor

decreases, and the description labor and processes (either transerred to

technology or embodied in a human indexer) increase.

Dierent orms o graphic representation—pictorial, handwritten, or

printed—oer dierent possibilities or algorithmic transormation (see

fgure 2.2). Curiously, the standard orm o computer representation o 

written language (or instance, ASCII code, which appears more fnished

and retains less specifc production traces than handwriting) is more ame-

nable to algorithmic transormation because the process (keyboarding) is

less congealed in the product, as storage in the orm o computer memory.

Further transormation into more ully graphic representation—rather

than directly encoded representation embodied in dierent fle ormats—

would require a more ully congealed process and would complicatealgorithmic transormations.

Thus, selection power operates eectively or an aspect o inorma-

tion retrieval practice oten considered separately rom other aspects,

Figure 2.1Example o a genus-species relation.

Genus 

Difference 

Species 

public not public

Mark Twain / Samuel Clemens

(public author and

private individual)

Mark Twain

(author)

Samuel Clemens

(private individual)

Page 30: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 30/199

Selection Power and Selection Labor 21

analogous to cataloging in distinction rom classifcation or character-

ized as data contrasted with subject or topic retrieval. In addition to the

value o unifed description, recognition o the common principle cata-

loging shares with classifcation and subject determination restores theo-

retical signifcance and value to it, congruent with the oten dominant

actual use o inormation retrieval systems or identiying, recalling, and

retrieving known items or the works o a given author (Smithson 1994;Shneiderman 2003, 54).

Scholarly and Ordinary Discourses

Dierent, and partly independent, scholarly discourses have implicitly

endorsed selection power as a design principle or inormation retrieval

systems. Ordinary discourse discussions o inormation retrieval also

value that concept.

The interconnected felds o librarianship and indexing have endorsed

as central aims—in dierent ways and not always explicitly—concepts

analogous to or necessary components o selection power. Within librar-

ianship, bibliographic control was seminally defned in the post-1945

period as “mastery over written and published records” (UNESCO/ 

Library o Congress 1950, 1); it is strongly analogous to selection power.

Without bibliography, the “records o civilization would be an uncharted

chaos o miscellaneous contributions to knowledge, unorganized and

inapplicable to human needs” (UNESOC/Library o Congress 1950, viii).

At that stage o technological development, the creation o records and

indexes or bibliographic organization required direct human intervention

Figure 2.2Contrasting representations o Samuel Clemens / Mark Twain. Source: Railton2005.

Page 31: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 31/199

22 Chapter 2

or labor. UNESCO regarded itsel as born into “appalling post-war bibli-

ographic chaos” (Murra 1951, 47), and the distribution o responsibility

to the national agencies required to produce national bibliographies andallied works on a shared model was conceived as both the remedy or the

chaos and the path toward universal bibliographic control. The growth

o WorldCat (WorldCat 2007) indicates a more recent and less explicitly

noticed movement toward universal bibliographic control—particularly

regarding monographic literature rather than journal articles—prompted

more by internal dynamisms in the process than by imposition and by the

possibility o sharing the costs o human description labor by distribut-

ing the products o that labor as catalog records. Later sophisticated dis-cussions more directly concerned with selection power also distinguished

bibliographic control rom bibliographic organization, with organization

as the means o control (Wilson 1968), reinorcing the sense o selec-

tion power as a property o human consciousnesses enabled by—but not

directly inhering in—organization imposed on data.

Indexing values index terms or their discriminatory power, similar to

classical logic’s concern with dierentiae. Discriminatory power could

also be regarded as an essential component or organizational actor o 

enabling discriminatory control or selection power. The technological

constraints o written and printed documents, and the need to reduce the

cognitive labor o the searcher, compelled brieer and more concise index

descriptions than the object described —with some exceptions, such as

concordances.

Cybernetics, emerging in its modern orm during the immediate post-

1945 period concurrently with the ormalization o bibliographic con-trol and partly concerned with inormation technologies envisaged or

enhancing bibliographic control, also emphasized control and navigation

(Wiener 1954). Cybernetics was understood to embrace the “complex o 

ideas” represented by

the study o language … the study o messages as a means o controlling machin-ery and society, the development o computing machines, and other such autom-ata, certain reections upon psychology and the nervous system, and a tentativenew theory o scientifc method. (1954, 15)

In the subsequent development o cybernetics, emphasis on control did

not always separate human and machine discrimination. The etymology

Page 32: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 32/199

Selection Power and Selection Labor 23

o cybernetics—given by Wiener, who coined the term in testimony to

the Greek kubernē t ē s or steersman, understanding the steersman prima-

rily as a human control mechanism (1954, 15), and through its link toCybernesia (the pilots’ estival held in honor o Theseus’ navigation to

Athens) (Plutarch 100/1914)—points to a deeper level o selection in col-

lective human experience.

A dispute within philosophy (Giambattista Vico’s critique o Aristotle)

oers a strong and highly signifcant analogue to exploratory capability.

Aristotle’s philosophy involved a systematic method o enquiry or clas-

siying an object, in addition to providing a direct and indirect source

or subsequent understandings o genus, species, specifc dierence, syn-onymy, and equivalence. An enquirer was required to ask a series o ques-

tions—Does the thing exist? What is it? How big is it? What is its quality?

Vico subjected this method o enquiry to an incisive critique:

Aristotle’s Categories and Topics are completely useless i one wants to fnd some-thing new in them. One turns out to be a Llull or Kircher and becomes like a manwho knows the alphabet, but cannot arrange the letters to read the great book o nature. But i these tools were considered the indices and ABC’s o inquiries about

our problem [o certain knowledge] so that we might have it ully surveyed, noth-ing would be more ertile or research. (1710/1988, 100–101)

The last clause o the critique deserves emphasis: “nothing would be more

ertile or research.” While retaining some o Aristotle’s techniques, Vico

avoided his rigidity, transorming it into a systematic and eective means

or enhancing knowledge o an object. Analogously, while rejecting a

query’s rigid transormation into a set o records assumed as desirable in

inormation retrieval research, similar techniques can explore the domain

o discourse covered by the inormation retrieval system. This process o categorization would be somewhat similar to collecting documents under

the terms o a humanly assigned metalanguage when describing them,

which can then be exploited in searching. Thus, classifcation schemes

(and their analogues in thesaural relations among indexing terms) become

valuable exploratory devices.

Dickens’s Hard Times provides another supporting analogue, although

the fction also enacts a critique o Platonic and Aristotelian philosophy.

The logical distinctions exemplifed in Bitzer’s defnition o a horse—

“Quadruped. Graminivorous. Forty teeth, namely twenty-our grinders,

Page 33: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 33/199

24 Chapter 2

our eye-teeth, and twelve incisive . . . Age known by marks in mouth.”

(1854/1989, 6) (a description that does resemble eighteenth- and nine-

teenth-century taxonomies or the horse [Linne 1792; Donovan 1820],themselves inuenced by the Aristotelian method o defnition by genus

and species)—are presented as harsh. Outside the restricting enclosure o 

the town, a dierent metaphor or knowledge is discernible:

They walked on across the felds and down the shady lanes, sometimes gettingover a ragment o a ence so rotten that it dropped at a touch o the oot, some-times passing near a wreck o bricks and beams overgrown with grass, mark-ing the site o deserted works. They ollowed paths and tracks, however slight.(Dickens 1854/1989, 353)

The value o an inormation system could lie in the ability it oers dis-

criminatingly to ollow “paths and tracks, however slight.”

The etymology o intelligence oers urther support or the signif-

cance o selection power. Traced to its Latin orm (inter-legere: to choose

rom or between things), intelligence is strongly analogous to selection

power, implying deliberate choice rather than domination by brute needs.

Plutarch’s account o the ormation o the Roman military legion is better

known:

When the city was built, in the frst place, Romulus divided all the multitudethat were o age to bear arms into military companies, each company consistingo three thousand ootmen and three hundred horsemen. Such a company wascalled a “legion,” because the warlike were selected out o all. (1914, 123)

Thus, division or dierentiation o individuals represents a urther char-

acteristic o man in the polis in its initial realization as the city-state. The

current discussion conceives intelligence as a quality o human conscious-

ness rather than inhering in the objects dierentiated.

Comments on inormation systems by contemporary ordinary discourse

are highly signifcant but difcult to produce as evidence. Evaluative crite-

ria may be given by implication rather than ully and directly articulated.

Yet a searcher who complains that it is difcult to control the number

o records retrieved invokes a principle o discriminatory power. More

explicitly, one spoken response to a presentation o the value o selection

power was, “that’s the basis [an enhanced capacity or inormed choice]

on which people use systems anyway” (Warner 2000, 79). Not inuenced

directly by the query transormation tradition, extradisciplinary written

Page 34: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 34/199

Selection Power and Selection Labor 25

comments can be regarded as embodying ordinary discourse concepts.

For instance, a sociological study o communication in philosophy, which

stresses the importance o direct oral communication between signifcantphilosophers, criticizes literature discovery accomplished by “index-

ing and abstracting services (whether in printed media or electronically

online), which overload the channels rather than ocusing them” (Collins

1998, 45). Thereore, ordinary discourse concepts—although elusive—

support the value o selection power.

The query transormation tradition o classic inormation-retrieval

research also values discriminatory power, but or a distinguishable end

that can be absorbed into selection power. The current argument valuesthe discriminatory power o an index term or its potential to directly

increase human selection and discrimination. Classic inormation-

retrieval research, by contrast, has valued discriminatory power as a

actor that assists automatic partitioning o documents or records (Van

Rijsbergen 1979, 29, 136). The valuing o discriminatory power by clas-

sic inormation-retrieval research constitutes partly independent support

or the value given to it here. The distinction in the ends or which it is

valued reveals that this aspect o classic inormation-retrieval research

can be absorbed into selection power as a special and incomplete case.

The absorption o particular aspects o query transormation urther

indicates that the research tradition as a whole can be conceptually and

operationally subsumed into selection power as a special case within a

more encompassing theory. Thereore, this book will not pursue oppo-

sition with query transormation, which served as one dialectical point

o departure or asserting the value o selection power (Wilson 1996c;Warner 2000).

Particularly since the mid-1990s, studies within inormation science

have advocated selection, evaluation, and fltering as appropriate aims

or inormation retrieval, rather than recall o all and only all relevant

documents (Wilson 1996b; 1996c). The concept still needs to be ully

operationalized (Griesbaum 2000). In relation to real-world considera-

tions o human labor, the costs and value o that labor and labor del-

egated to technologies—the subsequent development o a labor theoreticapproach to inormation retrieval—can be regarded as operationaliza-

tion o selection power.

Page 35: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 35/199

26 Chapter 2

Summary

Analogous concepts in partly separate scholarly discourses support the

value o selection power. Librarianship held a comparable concept in bib-liographic control, indexing endorsed the discriminatory power o a term

(a crucial actor in obtaining control), and cybernetics valued control and

navigation. In addition, the value o selection power was supported by

disputes within philosophy;

a fctional critique o hierarchical classifcations;

the etymology o intelligence;

and more specifcally in relation to inormation retrieval systems, ordi-nary discourse comments on inormation retrieval.

The concept o selection power was elucidated and understood as the

human aculty or discrimination; we endorse it here as the primary

design and evaluation principle or inormation systems. As a special

case o implementation, selection power absorbed query transormation,

appropriate in certain circumstances and compelled by certain orms o 

technology (or instance by the batch-processing methods o the 1950s).

Thus, the frst proposition or developing a labor theoretic approach to

inormation retrieval asserts the value o selection power both as a prop-

erty o human consciousness and as a primitive term, open to elucidation

but not urther decomposition.

Selection Power and Selection Labor

The relation between selection power and selection labor is not con-ceived either ahistorically or independently o technology. For instance,

in primarily oral societies, orms o recitation not equivalent to verbatim

repetition but still dependent on individual memory within a communal

context were crucial to knowledge preservation by renewal (Goody and

Watt 1968). The Icelandic law-speaker provides a transitional orm that

both inherits elements rom orality and anticipates characteristics o writ-

ten literacy. The law-speaker was required to recite the law and to answer

queries on legal and parliamentary procedures by oral pronouncementsbased on his memory o the law (Njal 1280/1960, 306–308). From a

modern perspective, the law-speaker could be regarded as an inormation

Page 36: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 36/199

Selection Power and Selection Labor 27 

Figure 2.3 Alþing in session. W. G. Collingwood.

Lawspeakers are characteristic o oral societies, and the Icelandic law-speaker is o relatively late date and well-documented, with some concur-rent and developing elements o written literacy. The lawspeaker wouldrecite the law to the members o the annual assembly, or Alþing, as a lin-ear spoken utterance, reciting one third o the laws at each meeting (Short2008).

Selection power is evidenced in dialogic questioning and response, withevidence or the possibility o questioning separately rom the recitation o the law. For instance, in Njal’s Saga, the lawspeaker is consulted or confr-mation o an aspect o the law.

Flosi asked i this were the law, but Eyjol replied that he did not know or certainand said that the Law-Speaker would have to settle that point. Thorkel Geitissonwent on their behal and told the Law-Speaker the situation, and asked i there wereany legal basis or Mord’s submission.“There are more great lawyers alive today than I thought,” replied Skapti. “I can tellyou that this is so precisely correct that not a single objection can be raised against it.But I had thought that I was the only person who knew this specialty o the law nowthat Njal is dead, or to the best o my knowledge he was the only other man whoknew it.” (Njal 1280/1960, 308).

Box 2.1An Inormation System Embodied in a Single Individual

Page 37: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 37/199

28 Chapter 2

system embodied in an individual. With increased social complexity andthe growth o both documents and indexes to documents, direct mental

labor in memory and recall is transerred to sources outside the human

body and mind—exosomatic resources. Variously conceived, the knowl-

edgeable person may remain signifcant to inormation seeking and oer

a readiness and ocus o response difcult to obtain rom more ormal-

ized inormation systems. In a development concordant with other ea-

tures o secondary orality (Ong 1982), selection labor may be reemerging

as a single category, concentrated in searcher labor.Although acknowledged as a orm o labor, even i not subjectively

experienced as such, our primary concern will not ocus directly on the

mental labor o memory, recall, and response, but rather on the techno-

logical orms that now absorb the cognitive burden o memory and recall

and also on the mental labor involved in their construction and searching.

In premodern practice (that is, written and printed orms distinguished

rom computer-based, or modern systems), the organization o documen-

tary materials required physical labor and, most signifcantly, the distinc-

tion between description and search labor was relatively clear. We will

Selection labor can be discovered in the lawspeaker’s mental work o memorization and recall, the communicative labor o recitation, and inthe attention o the auditors. There is no direct analogue to products o description labor or metadata, urther suggesting that these are historicallyspecifc developments o written literacy.

Technology is maniested in a natural object adapted to a human pur-pose, the rock ace used as a sounding board or the voice o the law-speaker:

All those things which labour merely separates rom their immediate connection

with their environment are objects o labour spontaneously provided by nature,such as fsh caught and separated rom their natural element, namely water, timberelled in virgin orests, and ores extracted rom their veins. (Marx 1867/1976, 284).

Thereore, selection power, selection labor, and technology are discerniblein a primarily oral inormation system and have persisted across writtenliterate and computational modes, indicating their centrality to inorma-tion retrieval.

Box 2.1(Continued)

Page 38: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 38/199

Selection Power and Selection Labor 29

carry this distinction orward as an analytical distinction, while recog-

nizing the difculty o substantive separation. For example, a searcher

requesting a list o documents in chronological order eectively instan-tiates at the point o searching what once would have been a orm o 

description labor, or process. The contrast is traceable back to the fxity

o writing technologies compared with the possible uidity o compu-

tation (Warner 2001, 33–46). The concern here is with the theoretical

possibilities or and constraints on selection labor, conceived as incorpo-

rating both description and search labor.

Theoretical minima or selection labor can be derived rom serial pos-

sibilities. I items are examined serially and without regress, selectionlabor increases with the number o objects examined in the collection. An

unchanged principle or discrimination is assumed, rather than a conver-

sational or dialogic alteration o the principle or discrimination, and this

would be closely analogous to the batch processing practiced historically.

An absence o meaningul organization o the discriminated objects and

a conation and simultaneous occurrence o description and search labor

are also implied. I the choice between objects examined is reduced to a

binary contrast between acceptance and nonacceptance, the unit o labor

is somewhat analogous with the classic understanding o the bit.

Imposing organization upon a collection o objects separates descrip-

tion rom search labor, with work invested in organization or descrip-

tion reducing labor and enhancing power in searching. Selection labor

is then distributed between description and searching. I description and

organization can partition objects into appropriate sets, Shannon’s or-

mula or the inormation o a source would indicate that the number o objects discriminated by search labor would rise more than the quantity

o search labor:

I there are N possibilities [or the choice o messages rom a source], all equallylikely, the amount o inormation is given by log

2N . . . . I it were possible to choose

questions which always had the eect o subdividing into two equal groups, itwould be possible to isolate, in twenty questions, one object rom approximately1,000,000 possibilities. (1968/1993, 214–215)

On this basis, the number o choices—broadly, units o labor—requiredto discriminate between c.1000 and c.1,000,000 such possibilities (which

could correspond to objects, documents, or records or documents) would

Page 39: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 39/199

30 Chapter 2

double. The search could be conducted either deterministically or nonde-

terministically with human intervention at intervals, both with unaltered

criteria during the process. The labor invested in description correspondsto a capital cost that is not incurred or each iteration o searching.

These slightly abstract considerations are helpul in establishing the-

oretical constraints or the labor associated with selection and also or

enorcing the point that labor can be distributed between description and

searching but cannot be eliminated. They also have some more practical

resonances. The purposes o description may require semantic primitives

that are difcult to isolate and may not exist, particularly or human or

social discourse. An approach to theoretical limits can be made in otheraspects o inormation theory, such as reducing redundancy in messages,

but they are diicult to obtain ully (Shannon 1948/1993, 39; Verdú

and McLaughlin 2000). Biological classifcations might oer the closest

analogies to reduction to atomic acts or to a perectly organized source,

but distinguishing species rom variety can be problematic (Darwin

1859/1968, 104–108). Despite these reservations about the possibility

o approaching theoretical limits on the eectiveness o descriptions, the

Many years ago, when comparing, and seeing others compare, the birdsrom the separate islands o the Galapagos Archipelago, both one withanother, and with those rom the American mainland, I was much struckby how entirely vague and arbitrary is the distinction between species andvarieties. On the islets o the little Madeira group there are many insectswhich are characterized as varieties in Mr. Wollaston’s admirable work, butwhich it cannot be doubted would be ranked as distinct species by manyentomologists. Even Ireland has a ew animals, now generally recorded asvarieties, but which have been ranked as species by some zoologists. . . .From these remarks it will be seen that I look at the term species, as onearbitrarily given or the sake o convenience to a set o individuals closelyresembling each other, and that it does not essentially dier rom the termvariety, which is given to less distinct and more uctuating orms. Theterm variety, again, in comparison with mere individual dierences, is also

applied arbitrarily, and or mere convenience sake.—Charles Darwin. The Origin o Species by Means o Natural Selection or The

Preservation o Favoured Races in the Struggle or Lie. 1859. (1859/1968, 104–108).

Box 2.2Darwin on the Difculty o Establishing Classifcations

Page 40: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 40/199

Selection Power and Selection Labor 31

potential that the increase in the number o objects discriminated will be

much greater than the increase in the search labor required or discrimi-

nation might provide an underlying explanation or the possibilities o scaling oered by practical devices such as thesauri in working inorma-

tion systems.

Thereore, the second proposition or developing a labor theoretic

approach to inormation retrieval is that selection labor produces selec-

tion power, both or an oral speaker and with certain orms o exosomatic

technologies, which tend to be adopted under historical circumstances

o increased social complexity. Selection labor’s production o selection

power and the decomposition o selection labor into description andsearch labor is very clearly exemplifed within written literacy, but con-

tinues in modifed orm with modern inormation technologies.

Conclusion

This chapter addressed some undamental issues and developed some

propositions. Inormation retrieval research was reviewed and selec-

tion power was endorsed as the primary value or aim or inormation

retrieval. Selection power was received as a primitive term whose content

was elucidated, and value supported, rom analogous concepts in partly

independent discourses and also rom its embodiment in inormation

retrieval practice. Selection power was produced by selection labor. An

understanding o inormation retrieval has begun to absorb the concept

o inormational or mental labor. There is a congruence o values with

processes or inormation retrieval, particularly through the commonidea o selection.

For the urther development o a labor theoretic approach to inor-

mation retrieval, the concept o selection labor requires urther dier-

entiation to develop and exempliy the distinction between description

and search labor and explore the possibility o transerring direct human

labor to technology. Semantic mental labor will be ully distinguished

rom syntactic mental labor, dierentiating mental labor rom the proc-

esses that can be abstracted rom it and transerred to technology, androm the products that can result rom mental labor. These topics will be

the concern o the next chapter.

Page 41: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 41/199

Page 42: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 42/199

3Description and Search Labor

Introduction

The previous chapter characterized selection labor as a orm o mental

labor and established theoretical minima or a given collection o objects.

It also noted that the premodern technologies o writing and printing

on paper separated selection labor into description and search labor.

Similarly, the discussion acknowledged the reconvergence o descrip-

tion and search activities with computer-based, or modern, technologies,together with the possibility o sustaining analytical distinctions between

them. The activities o description and searching still require more ully

empirical characterization as components o selection labor.

This chapter will develop and elucidate the concepts o description and

search labor. As mental labor, description and search labor participate in

the conditions or other orms o mental labor. Thereore, the chapter will

introduce distinctions between types o mental labor and their dierent

possibilities or transer to technology. The perspective established onmental labor and technology is then used to understand description and

search labor and their relation to selection labor.

Concepts o Mental Labor

Labor and Mental Labor

For both Genesis and Marx, labor—productive work in nature—is a un-

damental condition o human lie, imposed by the necessity or survival.

The labour process . . . is purposeul activity aimed at the production o use-val-ues. It is an appropriation o what exists in nature or the requirements o man.

Page 43: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 43/199

34 Chapter 3

It is the universal condition or the metabolic interaction (Stowechsel ) betweenman and nature, the everlasting nature imposed condition o human existence,and it is thereore independent o every orm o that existence, or rather it is com-

mon to all orms o society in which human beings live. (Marx 1867/1976, 290)

Mental labor is usually conceived as an adjunct to enhancing control over

the physical environment, but it also can be considered as an activity in

itsel, with possibilities or mechanization explored (Minsky 1967, 2).

Agrarian, industrial, and inormation technologies can be regarded as

human constructions, the product o human labor on natural resources

and preexisting human-made products, rather than naturally or objec-

tively given. In Marx’s work, the understanding o technology as a humanconstruction receives its ullest expression in a classic passage rom the

Grundrisse, whose themes implicitly inorm the treatment o technology

in Capital (Marx 1867/1976).

Nature builds no machines, no locomotives, railways, electric telegraphs, sel-acting mules etc. These are products o human industry; natural material trans-ormed into organs o the human will over nature, or o human participationin nature. They are organs o the human brain, created by the human hand ; thepower o knowledge, objectifed. The development o fxed capital indicates to

what degree general social knowledge has become a direct orce o production.(Marx 1858/1973, 706)

Although Marx mentions automatic devices involving control mecha-

nisms (“sel-acting mules” [automatic spinning machines]), the passage

ocuses on industrial technologies common during his historical period.

Regarding inormation technology as a human construction and con-

cerned with the transormation o signals rather than natural resources

(Warner 2004, 5–35), indicates the possibility o a similar transer o 

human mental labor to technology. For both industrial and inormation

technology, transerring direct human labor to technology can speed proc-

esses and enable previously impossible activities.

The possibility o transerring aspects o direct human labor to technol-

ogy enables a dynamic between human labor and its technological prod-

ucts, in which orms o direct human work are progressively transerred

to technological processes. The dynamic is compelled by the historical

search or greater control over the environment and accelerated by theinnovatory dynamic o capitalism, including the impulse rom reduced

immediate costs o labor transerred to technology. From the perspective

Page 44: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 44/199

Description and Search Labor 35

indicated here, the dynamic can apply to mental labor as an independent

activity, not simply an adjunct to control the physical environment. The

social division o labor (or instance, between master and slave, or intel-lectual and clerical labor) can anticipate the division o labor between

human labor and machine process.

We can distinguish dierent types o labor, adopting Marx’s distinction

o universal and communal labor and applying it to inormational labor,

processes, and products.

We must distinguish here, incidentally, between universal labour and communallabour. They both play their part in the production process, and merge into one

another, but they are each dierent as well. Universal labour is all scientifc work,all discovery and invention. It is brought about partly by the cooperation o mennow living, but partly also by building on earlier work. Communal labour, how-ever, simply involves the direct cooperation o individuals. (1894/1981, 199)

As they merge into each other, universal and communal labor are not

encountered in pure orm and can be embodied in the products o labor,

but their distribution can vary signifcantly. Communal labor should also

be understood to include individual labor. In this context, we can regard

the machine aspects o inormation technology primarily as productso universal labor, although constructed, activated, and renewed by com-

munal labor;

the design and writing o programs as communal labor developed rom

universal labor and its products, such as understandings o the algorith-

mic process and programming languages; and

human description o inormation objects as primarily involving com-

munal labor, guided by universal labor and embodied in codes or descrip-tion.

Communal labor has immediate costs—possibly matched by a wage—

in required human energy; in contrast, the costs o universal labor

have been absorbed over time. As ormulated by Marx, the distinction

between communal labor and universal labor, involves mental labor—

“Universal labour is all scientiic work, all discovery and invention”

(Marx 1894/1981, 199)—and also can be applied directly to the modern

maniestations o mental labor and its products.

Labor, process, and product can be distinguished explicitly—process

and product separated rom an originally undierentiated labor. For

Page 45: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 45/199

36 Chapter 3

instance, in unrecorded oral speech, labor and process are not necessar-

ily distinguished and the product disappears with its creation. Once the

categories are separated, labor can be understood as direct human work

and process as activity abstracted rom labor, with the possibility o trans-

er to technology. This partly nominal distinction o labor rom process,

which arose originally rom unease at applying the term labor to inorganic

activities, also has substantive implications: compared to direct human

labor, greater rigidity and exactness in the process may arise partly rom

preliminary ormalization. Products—in this context, semiotic products

such as catalog records—result rom direct human labor and technologi-cal processes.

Figure 3.1 diagrams the distinction between labor, process, and product.

Labor is exclusively human—indicated by the darkly shaded intersection

o Human and Labor—while the white box indicates the absence o inter-

section between labor and machine. In contrast, process can be human or

machine. Human labor can be ormalized as a process (intermediate grey

section) and ully ormalized processes can be transerred to technology

(darker grey section). The making o products involves both labor andmachine processes that may congeal in the product, requiring deliberate

investigation to discover their roles in the making o the product.

An object or description can be read in as a precondition or biblio-

graphic labor, processes, and products (to the let o the diagram). The

object or description could be a natural object or, more characteristi-

cally, a product o human mental and productive labor, such as a written

document. Under premodernity, labor—embodied in objects or descrip-

tion—was usually received as a given, both in theory and in bibliographic

description practice. Under modernity, ull text searching can reactivate

Figure 3.1Labor, process, and product.

Human Machine

Labor

Process }Product

Page 46: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 46/199

Description and Search Labor 37 

the labor embodied in objects and transorm it into a resource or exploi-

tation.

We can now introduce distinctions specifc to mental labor that are notcaptured by historical concern with physical and productive labor.

Distinctions within Mental Labor

The social division o intellectual labor is more amiliar than separation

between the aspects o mental tasks, frst as a division between the priest-

hood and those engaged in more directly productive labor (Childe 1956),

and developing with the dierentiation o disciplines and discourse com-

munities (Goody and Watt 1968).The nineteenth century witnessed deliberate division o the aspects o 

mental tasks between people working either within organizations or on

extensive projects, in an addition to the social division o mental labor. A

theoretical understanding o the possibilities surrounding the division and

mechanization o mental labor also began to emerge. At a level that medi-

ates between practice and theory, the production o logarithm tables was

determined amenable to manuacture by subdividing the tasks involved

and dividing the human labor those tasks required. Inspired by Adam

Smith’s work on the division o labor, Prony conceived:

l’expédient de mettre ses logarithmes en manuacture comme les épingles [theexpedient o putting his logarithms into manuacture as i they were pins].(Babbage 1835/1963, 193)

The actual process o producing the tables revealed an interesting and

potentially disturbing eature: the relative accuracy o clerical and intel-

lectual human labor.Persons [nine tenths with no knowledge o arithmetic beyond addition and sub-traction] were usually ound more correct in their calculations, than those whopossessed a more extensive knowledge o the subject. (Babbage 1835/1963, 195).

Thereore, considerations o meaning or human computers can distract

attention rom simple computational operations. At a more deliberately

theoretical level, Babbage, similarly inuenced by and adapting Adam

Smith on the division o labor in manuacturing, noted the possibility or

the division o mental labor.

The division o labour is no less applicable to mental productions than to those inwhich material bodies are concerned. (Babbage 1835/1963, 379)

Page 47: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 47/199

38 Chapter 3

Mental labor has a material aspect, particularly regarding the use o 

exosomatic technologies, and it is this material aspect that gives the pos-

sibility o mechanization.Mechanical mental labor contrasts with mental labor belonging to “the

domain o the understanding, requiring the intervention o reasoning”

(Babbage 1989, 246) and closely involving considerations o meaning.

For Babbage, one impulse to the design and construction o computing

machines involved transerring the mechanical part o the mathemati-

cian’s labor to automatic machinery (1989, 246–247). A similar impulse

can be discerned in the diusion by adoption o modern inormation

technologies, a process also inuenced by a desire to avoid the costs o direct human labor or, in Marx’s terms, o communal labor.

The amiliar—and deeply embedded—distinction o semantics rom

syntax can be used to dierentiate semantic rom syntactic mental labor.

Semantic labor is concerned with transormations motivated by the

meaning or signifcance o symbols, while syntactic labor is determined

only by the orm o symbols, operating on them in their aspect as signals.

Semantic labor requires direct human involvement; by contrast, origi-

nally human syntactic labor can be transerred to inormation technol-

ogy, where it becomes a machine process. Direct human labor has high

costs; under modern conditions, mental labor transerred to technology

likely will decrease costs. The distinction o semantic labor rom syntactic

labor has an analogue in ordinary discourse, and the transer o syntactic

mental labor to technology occurs in everyday practice, suggesting the

robustness and wide applicability o the distinction, despite its only recent

theoretical articulation (Warner 2005a).The ollowing literary example, occurring chronologically close to and

within the same broad cultural context as Boole’s ormalization o logic

(1854), conceived as the laws o thought, conveys the distinction between

semantic and syntactic mental labor:

“My other piece o advice, Copperfeld,” said Mr. Micawber, “you know. Annualincome twenty pounds, annual expenditure nineteen nineteen and six, result hap-piness. Annual income twenty pounds, annual expenditure twenty pounds oughtand six, result misery. The blossom is blighted, the lea is withered, the God o daygoes down upon the dreary scene, and—and in short you are or ever oored. As Iam!” (Dickens 1850/1981, 175)

Page 48: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 48/199

Description and Search Labor 39

In this passage, the substitution o a semantically congruent but syntac-

tically dissonant result parodies the syntactic process o calculation—a

curious and revealing inversion o modern experience with computeroperations on written text or inormation retrieval and spell checking.

In contrast to the syntactically generated result it implicitly replaces, the

semantic result does not generalize to other similar syntactic procedures.

In nineteenth century practice, the exosomatic technologies o writing

would have assisted human mental labor, and both semantic and syntac-

tic labor would have involved continuous human intervention. Following

the mechanization o mental labor during the late twentieth century, syn-

tactic labor could be transerred to inormation technology—operatingdeterministically between intervals o human intervention and opening

up and revealing a distinction between semantic and syntactic mental

labor. Excepting processes already ormalized or known to be ormaliz-

able, attempts to transer human mental labor to inormation technology

usually reveal the complexity and intractability o semantic labor.

The endorsement o strongly analogous distinctions—implicit in a

highly signifcant copyright judgment by the U.S. Supreme Court involv-

ing modern inormation technologies—supports the distinction between

semantic and syntactic labor and also between labor and process. In Feist 

v. Rural , the court denied copyright protection to telephone white pages:

Nor can Rural claim originality in its coordination and arrangement o acts. Thewhite pages do nothing more than list Rural’s subscribers in alphabetical order.This arrangement may, technically speaking, owe its origin to Rural; no one dis-putes that Rural undertook the task o alphabetizing the names itsel. But there isnothing remotely creative about arranging names alphabetically in a white pages

directory. It is an age-old practice, frmly rooted in tradition and so commonplacethat it has come to be expected as a matter o course. . . . It is not only unoriginal,it is practically inevitable. (Feist 1991, 363)

Age-old practice corresponds to syntactic labor and the alphabetization

o names corresponds to a syntactic process, delegated to the orms o 

inormation technology current in the 1980s. Slightly obscured within the

Court’s judgment—and not ully brought into contrast with its explicit

reversal o the labor theory o copyright—is a reerence to “writings

which are to be protected . . . [as] the ruits o intellectual labor, embodiedin the orm o books, prints, engravings, and the like” (1991, 346). The

intellectual labor embodied in the protected writings is closely analogous

Page 49: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 49/199

40 Chapter 3

to semantic labor. While the judgment has independent interest, its imme-

diate value in this context is as a wider public analogue to the distinctions

o syntactic and semantic mental labor and o human labor rom machineprocess, thus supporting their validity.

We can now summarize the understanding o the conditions or mental

labor. Human labor, including mental labor, can be transerred to technol-

ogy, undergoing transormation into a machine process. Human mental

labor can be semantic or syntactic in character, directly motivated by con-

siderations o meaning or reduced to pattern-governed transormations.

Only syntactic, not semantic, labor can be transerred to inormation

technology. As orms o mental labor, description labor and search laborparticipate in the conditions or mental labor.

Description and Search Labor

The historical separation (and current reconvergence) o description labor

and search labor rom selection labor supported the proposition that

selection labor can be distributed signifcantly between description labor

and search labor, but its overall sum could not diminish below certain

theoretically established limits. The understanding o description labor

and search labor as orms o mental labor supports the urther proposi-

tion that the syntactic aspects o description and searching, but not their

semantic components, can be transerred to technology.

Description Labor

Thus, description labor is understood as one component o selectionlabor. As mental labor, description labor can have a material and exo-

somatic aspect and semantic and syntactic components. We will consider

description labor in a more directly empirical and contemporary ashion

than selection power and selection labor, but the historical emergence o 

description labor in inormation systems with the emergence o written

literacy is acknowledged here. We also acknowledge equally the possibil-

ity that description labor can be reduced, transerred to technology as

process, and absorbed by selection labor (with selection labor becoming amore substantive category). The labor embodied in documents described

is accepted largely as a given and not ully explored, but the contrast

Page 50: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 50/199

Description and Search Labor 41

between the technology o writing on paper, demanding separate descrip-

tion, and computation that enables automatically generated syntactic

descriptions is ully incorporated into the schema developed. Humandescription labor includes interpretation as well as apparently more sim-

ple orms o description, classically understood as embodied in the activi-

ties o both classifcation and cataloging.

Description labor is understood ostensively and empirically as the

work involved in processes such as cataloging, classifcation, indexing,

and database description. More analytically, although still consistently,

description labor is conceived as the work involved in transorming

objects (documents, images, or people) into searchable descriptions thatwill assist subsequent retrieval.1 Two aspects o the transormation o 

objects or description can be distinguished: the description o objects and

the assembling o these descriptions into searchable lists or indexes. These

aspects may merge into each other in practice, but they can be separated

analytically. Description labor aims implicitly to increase selection power

and can have the urther eect o reducing labor expended in searching.

The object or description (see fgure 3.2) is both physical document

and ideational text—inormation as both object and potential knowledge

(Buckland 1991; Blair 2002). As text, the object or description is the

product o an extended period o human mental labor, with periods o 

high intensity: “writing a book is a horrible, exhausting struggle” (Orwell

1946/1970, 29). The purchase price o the document may not ully reect

the duration, intensity, and real costs o that labor.

The separation o syntax rom semantics with the advent o written

language (Warner 2005a, 557–563) may have continuing implications orthe relative eectiveness o applying syntactic, or pattern-transorming,

procedures to verbal and nonverbal graphic signs. Nonverbal signs can

be subjected to syntactic or pattern-based transormations, with those

transormations realized as labor or as machine process, but it has been

difcult to endow transormations with semantic signifcance, either or

similarity and dierence between signs or or establishing orders or dis-

play. Technically, it would not be difcult to transorm a digital photo-

graph into a searchable description (or instance, Google’s advancedimage search unction enables searching by a range o colors and fle types

[Google 2007b]) and to establish measures o similarity between images.

Page 51: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 51/199

42 Chapter 3

Figure 3.2Object or description.

Page 52: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 52/199

Description and Search Labor 43

However, excepting sets o images produced under highly controlled

conditions and thereby eectively selected or their ormal similarity to

one another, creating meaningul searchable descriptions or measureso similarity has been difcult. Images, including photographs, received

as iconic signs with a motivated resemblance rom signifed to signifer

could lend themselves to iconic representation or display and retrieval,

possibly reducing image size to enable scanning. Google uses indexing

and retrieval based primarily on verbal descriptions o images, not on

matching between the graphic images; iconic modes o representation are

used to display retrieved results (Google 2007b). In contrast, written ver-

bal signs that embody the distinction between semantics and syntax aremore amenable to syntactic operations, which can partially incorporate

semantic signifcance.

A bibliographic record (see fgure 3.3) represents one maniestation o 

the product o description labor and processes, potentially open to ur-

ther transormations. Enhanced selection power, or search control, is

made possible by reducing the diversity o language in related objects or

description to canonical or standard orms. In many instances, categoriz-

ing objects implicitly or explicitly involves the relationship between indi-

vidual (or species) and genus, gathering together items related in meaning.

Collecting items related in meaning but diverse in pattern oten involves

human semantic labor. Under premodern conditions, enhanced control

generally involves abstraction and loss o richness. Characteristically, the

description is brieer than the object described and also is the product o 

less extensive and less intense labor, although the costs o that labor may

be recovered more directly.Examples o signifcant inormation systems can reveal the reduction o 

direct human labor and the increased ullness o descriptions. In late-nine-

teenth- and early-twentieth-century practice, Palmer’s Index to the Times

Newspaper was created by direct human labor and assisted by current

technologies, particularly writing and printing. For instance, “Mad Dogs,”

a Times subheading that introduced a short paragraph beginning, “The

inhabitants o Shefeld . . . ,” would be transormed into the index entry

“Mad dogs in Shefeld,” and fled under “M” in the list o entries or thatvolume o the index (Morrison 1986, 189–192; Times 1885, 6; Palmer

1885, 60). The index, as a product and as a whole, can be read to reveal

Page 53: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 53/199

44 Chapter 3

Figure 3.3Product o description labor and processes. Source: OCLC. WorldCat 2008.

GET THIS ITEM

 Availability: Check the catalogs in your library.

Libraries worldwide that own item: 15

Connect to the catalog at your library

ExternalResources:

Cite This Item

FIND R ELATED

More Like This: Search for versions with same title and author  | Advanced options ...

Find Items About: Childe, V. Gordon (max: 29)

Title: Man makes himself /

 Author(s): Childe, V. Gordon 1892-1957. (Vere Gordon),

Publication: Nottingham : Spokesman,

 Year: 2003

Description: 244 p. : ill. ; 22 cm.

Language: English

Series: New thinker’s library ; no. 2;

Standard No: ISBN: 0851246494; 9780851246499

SUBJECT(S)

Descriptor: Civilization, Ancient. Archaeology.

Prehistoric peoples.Progress.

Note(s): Reprint of the 4th ed./ Includes bibliographical references (p. 239-240)and index.

Class Descriptors: LC: CB311

Responsibility: by V. Gordon Childe ; with a new foreword by Mark Edmonds.

 Vendor Info: YBP Library Services Baker & Taylor (YANK BKTY) 42.50 Status: active

Document Type: Book

Entry: 20040730

Update: 20071103

 Accession No: OCLC: 56060701

Database: WorldCat

Man makes himself /

V Gordon Childe

2003

English Book 244 p. : ill. ; 22 cm.Nottingham : Spokesman, ; ISBN: 0851246494 9780851246499

Page 54: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 54/199

Description and Search Labor 45

that the description process was strongly syntactic, although conducted

by human labor rather than by machine, with some semantic interven-

tion. The storage constraints o the technologies compel the productiono descriptions more concise than the objects described. Technologies also

have to be used in a primarily nondeterministic mode with continuous

human intervention, particularly or construction rather than printing

o indexes. Other inormation systems o the time (most obviously the

construction o the British Museum Catalog ) involved substantial human

semantic intervention in the making and listing o descriptions—o mate-

rial by and about signifcant authors—although the subject approach was

not ully endorsed (Roberts 1977, 14).These contrasting descriptive practices have modern descendants that

can be embodied within the same inormation system. Google’s advanced

search descends rom the practice o indexing newspapers by syntactic

transormations on object-language. Further semantic work is not applied

to the objects described, although they may contain deliberate descrip-

tive elements (or instance, in the application o metadata). Descriptions

not available or direct inspection are automatically generated rom the

verbal objects themselves and urther generate indexes to those descrip-

tions. Direct human clerical labor that was involved in nineteenth-century

practice is now delegated to modern inormation technologies that oper-

ate syntactically and deterministically. The need or brieer descriptions

imposed by storage constraints has also been removed, but the searcher’s

labor remains intense.

WorldCat embodies both practices. Guided by codes developed rom

their historical antecedents, descriptions are created by human semanticwork, with the aim o creating more systematic descriptions than those

given by transcription o the verbal objects. These systematic descriptions

aim to increase the searcher’s selection power and reduce labor. Elements

o syntactic labor can also be ound, transormed into a machine proc-

ess. For instance, WorldCat automatically transorms humanly created

descriptions into searchable indexes. Descriptions themselves increasingly

incorporate inormation derived by syntactic processes rom the verbal

objects taken or description.Some common trends over time can be discerned in these processes o 

description. There is a decrease in the direct human labor involved in the

Page 55: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 55/199

46 Chapter 3

Description labor, processes, and products can be realized and discerned inactivities other than cataloging and indexing. For instance, descriptions o people or the purposes o controlling travel and citizenship have intensi-fed since the developments in global communications (both physical trans-port and message transmission) in the mid- to late-19th century. Identitydescriptions o people and descriptions o documents may share a similarbut more explicitly conceived aim to identiy and dierentiate objects ordescriptions (or instance, individuals as citizens). Jules Verne gives a gentlyironic account o the process o acquiring a passport in Paris in 1859:

He then went to the preecture o the Seine—or Jacques ‘the lord mayor’s parlour’—where he boldly requested a passport or the British Isles. His description was takendown by an old, myopic clerk whom the progress o civilization would one dayreplace by an ofcially designated photographer. (1992, 6–7).

Photographs are a more eectively translingual but not culturally invariantmeans or recognizing identity than verbal descriptions.

Prior to the use o photographic images or establishing identity, or inthe empirical absence o an identiying image rom a particular situation,other behavioral characteristics were used to identiy individuals or to di-erentiate them rom other individuals with whom they were or could be

conused. In Thomas Hardy’s “The Three Strangers,” an escaped prisoner,who had been due to be hanged, is identifed by his “musical bass voice thati you heard it once you’d never mistake as long as you lived.” (1888/1976,34). In O. Henry’s “The Theory and the Hound,” a wie murderer—whoseverbal description could encompass his companion even though “[t]heybore little resemblance one to the other in detail,”—is dierentiated by hishighly angry response to the investigating sheri’s deliberate cruelty to ananimal: “I never yet saw a man that was over ond o horses and dogs butwhat was cruel to women” (1910/1993, 401, 406). Both narratives revealan ordinary discourse understanding o the crucial value ascribed to an

identiying or discriminatory variable.In modern practice, description processes in many situations are del-

egated largely to technology or the production o images, requiringhuman semantic labor or recognition or comparison o image and person.Consider, or instance, the making o images and their interpretation byairport security systems.

Box 3.1Nineteenth-Century Identity Description Practices

description o objects, and, more intensely, in the compilation o indexes

rom these descriptions. In particular, syntactic components o description

are being transerred rom human labor to machine process. Descriptions

are becoming more ull and exact, enabled but not compelled by the

Page 56: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 56/199

Description and Search Labor 47 

reduction in technologically imposed storage constraints. Fullness and

exactness can be distinguished analytically and substantively rom dis-

criminatory power and inormativeness; they may not enhance selectionpower, but they may aid specifcity in searching.

The social division o mental labor between clerical and intellec-

tual roles has been transormed progressively into the division o labor

between human work and machine processes, with syntactic labor dele-

gated to technology. While contrasts between late-nineteenth-century and

current practice arise rom the development and adoption o inormation

technologies—the progressive transormation o communal labor into the

products o primarily universal labor—continuities ollow rom the pre-dominantly symbolic nature o the signs described (Warner 2005b).

A particular eature that combines continuity and change involves

transormation rom social division o labor into division o labor between

human and machine, with syntactic labor transerred to machine process.

This pattern might have implications or the nature o expertise in search-

ing inormation systems. At one extreme, understanding the orces that

determine a system’s construction may not correlate with eective search-

ing o systems.2 The changed relation between language o discourse and

language o representation between premodern and modern inormation

systems is acutely relevant: a semantically assigned language of representa-

tion need not be the entry vocabulary, although later recourse could be made

to it or its generic scope. Prima acie, expertise in the language o discourse

likely will be stronger among domain than inormation specialists.

A ew other writers have considered the costs o human description

labor involved in creating records or catalogs, although not ully withinthe conceptual ramework developed here, and these considerations can

provide empirical data to inorm the argument. The costs associated with

creating a catalog that meets WorldCat standards have been estimated

at about US$40 (Hayes 2000, 76). This amount represents the cost o 

human semantic description labor, and the use o WorldCat records by

participating libraries represents distribution o the products o labor.

Online Computer Library Center (OCLC) accounts show no value, either

or WorldCat or or OCLC’s direct costs or creating the records (Hayes2000, 76). Failure to recognize the cost o direct and accumulated intellec-

tual labor embodied in catalog records and catalogs represents a “serious

Page 57: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 57/199

48 Chapter 3

mis-measurement o phenomena o undamental economic importance”

(Hayes 2000, 73).

In summary, description labor can be semantic or syntactic in character.Semantic description labor is exclusively and directly human, in accord-

ance with the understanding o human mental labor developed in this

chapter and the possibilities o its transer to technology. In contrast, syn-

tactic description labor can be delegated to technology, which transorms

it rom organic labor into machine process.

Search Labor

Search labor was contradistinguished rom description labor; bothdescription and search labor are regarded as components o selection

labor. Search labor is also a orm o mental labor, containing material,

semantic, and syntactic aspects. Within the schema developed, selection

labor distributes signifcantly between description and search labor and

its syntactic aspects are transerred to technology. Search labor might

have emerged predominantly as search expertise in premodern systems,

reecting the substantial investment o human semantic labor in descrip-

tion processes. A signal and revealing exception to the emergence o labor

primarily as expertise is represented by the arduous work and accumulat-

ing expertise required or searching or the sources necessary to construct

a subject bibliography, attempting exhaustivity and encountering biblio-

graphic scatter (Greg 1959; Bradord 1948/1971).3 For analytical clarity

and with substantive justifcation, search labor is the reverse, obverse, or

mirror image o description labor.

Search labor is reected in its material aspect in the psychomotor skillsrequired to operate a keyboard, mouse, or other interace devices. The

progressive naturalization o inormation technologies and improved

interace resulting rom system design have diminished the apparent

complexity o this aspect o searching. Both perceived and inherent com-

plexities have diminished, although human desires may evolve alongside

technology. The recent relative stability o interace technologies may

indicate that the late-twentieth-century revolution in the mechanization

o mental labor resulted in a teleological state, at least temporarily.Syntactic processes can be distinguished rom syntactic aspects o 

searching. Derived rom processes previously conducted directly by

Page 58: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 58/199

Description and Search Labor 49

humans, syntactic processes now have been transerred to machines. In

contrast, syntactic aspects o searching involve human understanding

o syntactic processes. Syntactic processes—activities such as orderingretrieved records or eliminating duplicate records—increasingly are trans-

erred to technology. With premodern technologies, ordering records by

date or by author would have been accomplished at the point o retrieval

by description labor or direct human syntactic labor. Modern search

technologies can invoke dierent orderings, transerring human labor to

machine processes.

Compared with syntactic processes, the syntactic aspects o search-

ing correspond to human understanding o syntax. Specifc componentsor understanding include Boolean logical combinations, their computa-

tional derivatives, their realization in system commands, and the likely

eects o specifc combinations. Many consider Boolean logic difcult to

grasp, although its difculty may have been exaggerated.4 The adoption

o computational technologies has resulted in some public understand-

ing o Boolean logic. Boolean operators are characteristically realized

in contrasting system commands, although the underlying commonality

should be recalled. Early studies indicated that the number o system com-

mands used had little eect on system perormance (Barraclough 1977).

However, dierences in commands made it difcult or searchers to adapt

rom one system to another. Similar to the material aspects o searching,

syntactic aspects have improved through simpler system design, and also

through coevolving consciousness.

Semantic components center on translating a topic into a searchable

query, the most complex aspect o retrieval (Roberts 1977, 15; 1989).The process may remain diicult, but its nature—the social distribu-

tion o expertise and the distribution o labor between description and

searching—may be changing due to alterations in description processes.

Semantic understanding and expertise with premodern technologies

involved understanding the type and particular characteristics o descrip-

tion labor applied to those documents, including the language o subject

representation. With modern systems, understanding the documents’ lan-

guage o discourse and the eects o generating automatic representationso those documents may become more signifcant, likely involving a social

redistribution o expertise toward those ully amiliar with the language

o discourse.

Page 59: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 59/199

50 Chapter 3

A urther level o semantic understanding involves knowledge o the

overall system o inormation system production and o the role or posi-

tion o particular inormation systems used within that system. The termsystem reers to an interacting set o components rather than a deliber-

ately coordinated or planned system (Roberts 1977). The diusion o 

Internet search engines is changing both the overall system and the nature

o expertise.

A signiicant contrast between description and search labor lies

between the presence o the object or description and the customary

initial absence o the object in searching. Presence inorms description

and absence combined with desire to restore presence motivates search-ing. Objects or documents recalled in search can be received as exemplar

documents and used (possibly iteratively) to modiy the search regarding

choice o terms rom the language o representation, where it exists and

can be exploited, and the language o discourse. The presence and absence

o the object creates a particular orm o abstraction or disconnection or

machine description processes and human searching o the products o 

those processes. In machine description, words are characteristically torn

or abstracted rom their context, understood specifcally as the line o 

writing. They may be considered apart rom their lines o writing by a

human searcher, but machine processes restore their various lines o writ-

ing during retrieval. They are then made more ully present and subjected

again to human semantic interpretation, inormed by their line o writing

and their broader context (see chapters 5, 6, 7, and 8 or uller discussion

o these patterns). Theoretically, it is possible to reconstruct documents

rom ull-text descriptions, at least as linear sequences o words. Thus,the contrast between the presence and absence o the object described ini-

tially distinguishes description rom searching: presence inorms descrip-

tion and absence motivates searching. The contrast tends to produce a

mirror image o description in searching.

Thereore, search labor parallels description labor, and search labor is

either semantic or syntactic in character. Semantic search labor is inescap-

ably and directly human, while syntactic search labor can be transerred

to technology, where it becomes a machine process. Obtained without aProcrustean ftting o data to theory and without disguising their con-

trasts, the parallelism between search labor and description labor rein-

Page 60: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 60/199

Description and Search Labor 51

orces the analogies between them. It also strengthens the possibility o 

interchanging human labor and machine process between description and

searching within the encompassing activity o selection labor.

Dynamic or Transer to Technology

The dynamic that compels the transer o syntactic labor to technology is

connected with the costs o direct human labor, and was acutely noted in

a proleptic remark by Norbert Wiener.

the automatic machine . . . is the precise economic equivalent o slave labor. Anylabor which competes with slave labor must accept the economic conditions o 

slave labor. (1954, 162)

As semantic description labor decreases, semantic labor may transer rom

description to searching and search labor may increase. Thus, the process

o searching may remain permanently intractable at the semantic level.

Conclusion

This chapter elucidated the concepts o description and search labor and

also provided parallel distinctions or the work involved in description

and searching. It established the possibility o transerring human mental

labor to technology. The discussion characterized semantic labor as irre-

ducibly human and syntactic mental labor as transerable to technology

as machine process. Description labor involves cataloging, classifcation,

and database description, with separate semantic and syntactic aspects.

Search labor strongly parallels description labor. Selection labor, or the

total labor involved in inormation retrieval, is amenable to signifcantdistribution between description and searching, and its syntactic aspects

can be transerred to technology.

A urther stage in the development o a labor theoretic approach to

inormation retrieval involves reviewing and synthesizing all the elements

o our argument—rom selection power to selection, description, and

search labor—and ully introducing the costs o direct human labor as

the dynamic that compels the transer o human syntactic labor to tech-

nology. The parsimony and power o the overall argument and its corre-spondence to real-world decision practice will also be revealed.

Page 61: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 61/199

Page 62: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 62/199

4A Labor Theoretic Approach

Introduction

Although the previous chapters established the essential components

o a labor theoretic approach to inormation retrieval, a synthesis is

still needed. The synthesis should reveal the economy and power o the

approach by emphasizing the coherence and mutual relation o elements

such as selection power and selection labor. Emerging rom the costs o 

direct human mental labor and a common desire to avoid those costs,the practices o major inormation systems oer an empirical illustration

o the dynamic that transers human mental labor to machine processes.

Decision practices o major inormation systems embody and conorm to

the strongly determining orces already identifed. This chapter will con-

sider the value o the labor theoretic approach and suggest the continuing

intractability o inormation retrieval.

A ull synthesis o the labor theoretic approach must recapitulate its

discursive development and lay bare its logical structure and progression.The limited number o concepts and activities distinguished reveals the

economy o the approach, but its power lies in its ability to comprehend

signiicant empirical developments in major inormation systems and

selectively absorb preexisting theories.

The argument presented in the previous two chapters was a logical, but

not reductionist, approach. Concepts such as selection power and selec-

tion labor were abstracted rom everyday practice—rom Marx’s “real lie

process” (1858/1973, 706)—rather than imposed theoretically. Thus ar,

we have preserved and incorporated the empirical richness o inormation

retrieval into a discursively expressed cohesive structure. This chapter

Page 63: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 63/199

54 Chapter 4

will introduce ormal (symbolic) logic to expose the underlying rigor o 

the argument. Connectives rom ormal logic are ully sufcient to cap-

ture relations between the “atomic act[s]” distinguished (Wittgenstein1922/1981, § 2.062); atomic acts are understood to include both con-

cepts and activities (or instance, selection power as a concept and selec-

tion labor as an activity). By embedding ormal logical exposition in a

summary discursive recapitulation and complementing it with a diagram,

we will sustain intelligibility or readers amiliar with ordinary discourse

but not ully conversant with ormalism. Thus, complementing the dis-

cursive development with a more ormal presentation should clariy and

reveal the rigor o the argument and also maintain strong correspondenceto reality.

Synthesis

The frst concept we treated was selection power, received as an atomic

act or primitive term that is open to elucidation but not defnition in

the sense o decomposition into more primitive terms. Selection power

was understood as human aculty or discrimination, which could be aug-

mented by technologies but still remain a quality o human conscious-

ness. Thereore, our frst proposition asserted selection power’s value or

inormation retrieval.

Labor was regarded as comparably undamental to selection or intel-

ligence and an inescapable condition o human existence. Labor was ur-

ther understood to include mental as well as physical labor, both as an

adjunct to physical or productive labor and as a separate activity. A spe-cifc concern involved selection labor as a orm o mental labor. Thus,

selection labor could also be asserted as a primitive activity, although

inescapably imposed by the need or selection power and not implicitly or

explicitly sought ater as a good.

We have developed a central proposition connecting the primitive con-

cept and activity—that selection labor produces selection power under a

range o historical conditions: emerging when orality transormed into

literacy; ully realized with literacy, or premodern, inormation technolo-gies; and continuing, although transormed, with modern technologies.

This proposition has empirical correlates in historical and current descrip-

tion processes or documents, objects, and people. Under premodern con-

Page 64: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 64/199

A Labor Theoretic Approach 55

ditions, selection labor decomposed clearly and strongly into description

and search labor.

The production o selection power by selection labor is analogous tomaterial implication in ormal logic, with selection power implying selec-

tion labor.

Selection power➞ selection labor.

Making an explicit analogy that links the production o selection power

by selection labor with the relation o material implication in ormal logic

adds value to the discursive exposition. The relation o material implica-

tion remains valid or selection labor without selection power

1

and thencaters to the case, met in practice, that selection labor does not produce

selection power when unhelpul or inappropriate selection labor has been

applied. In material implication, the frst place given to selection power

also suggests that it is possible to hide or disguise selection labor rom

view behind selection power (or instance, congealed in the products o 

selection labor). Disguising selection labor also corresponds to its theo-

retical neglect, despite costs and signifcance (Hayes 2000), and may be

partly responsible or that neglect. In the practical use o a particularsystem, a searcher might have to determine the nature o selection labor

or process applied rom the selection power aorded; or instance, vari-

ants and unreconciled orms o a single author’s name could indicate the

absence o eective human description labor or machine processes in rec-

onciling variants.

Some urther propositions illuminated the relation o selection labor

to description and search labor and the possibilities o their transer to

technology. Concepts isolated and distinctions made corresponded toreal-world practice, suggesting their robustness. Labor was understood to

include mental labor.

Labor includes mental labor.

Mental labor shared some characteristics with physical and productive

labor. Direct human mental labor could be transerred to technology, and

technology itsel was regarded as a human construction.

Human mental labor can be transerred to inormation technology.

In the intervals between direct human intervention, ully transerred

human mental labor becomes a machine process.

Page 65: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 65/199

56 Chapter 4

Some categories developed primarily or physical and productive labor

explicitly recognized elements o mental labor. According to Marx, uni-

versal labor—“all scientifc work, all discovery and invention”—dieren-tiates rom communal labor: universal labor is “brought about partly by

the cooperation o men now living, but partly also by building on earlier

work,” while communal labor “simply involves the direct cooperation o 

individuals” (1894/1981, 199). Labor might originate as communal labor,

but it is transormed into universal labor as labor-created knowledge di-

uses.

Labor separates into communal labor and universal labor.

Communal labor includes individual labor. In both hardware and sot-

ware, inormation technology results primarily rom universal labor.

Human descriptions o inormation objects (records) result predomi-

nantly rom communal labor.

Distinctions o labor, process, and product, understood close to their

ordinary discourse senses, could also be applied to mental labor and

its products. In oral speech, particularly under orality that preceded

the development o written language, labor includes the entire activityo communication and process and product are not ully separated. In

written language, the product is separate rom labor and the possibility

emerges o ormalizing the processes involved in making that product.

Under modernity, processes can be conducted automatically. Thereore,

originally undierentiated labor separates into labor, process, and prod-

uct, with associated changes in the meanings o the constituent terms, or:

Labor separates into labor, process, and product.

We can apply analytic distinctions between labor, process, and product to

modern inormation retrieval activities and systems, with labor as human

labor, process as a ormalized process and oten delegated to technology,

and product as the outcome o labor and process (or instance, records or

index descriptions).

We then introduced distinctions specifc to mental labor. Mental labor

could be semantic or syntactic, linking a widely held distinction between

levels o analysis to mental labor. Semantic and syntactic mental laborcould be regarded as separating rom undierentiated mental labor.

Human mental labor separates into semantic labor and syntactic labor.

Page 66: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 66/199

A Labor Theoretic Approach 57 

A distinction then could be made between semantic and syntactic mental

labor or both premodernity and modernity. Semantic mental labor cor-

responded to the work involved in conducting transormations on signsmotivated by their meaning, and syntactic labor corresponded to trans-

ormations reduced to motivation rom patterns. Semantic mental labor

was irreducibly and directly human.

Semantic labor➞ human labor.

Expressing the relation between semantic labor and human labor as mate-

rial implication allows or the possibility that human labor is more exten-

sive than semantic labor, incorporating syntactic aspects.

2

In contrast tosemantic labor, syntactic labor could be transerred to technology, and

labor would become process.

Syntactic labor can be transerred to inormation technology (labor

becomes process).

A technological process was a priori syntactic in character, operating on

patterns and not directly on meaning. In premodern historical practice,

syntactic labor was delegated to human clerical labor; under modern

conditions, it is increasingly transerred to inormation technology as a

machine process.

Transormations o Selection Labor

Selection power remained a relatively stable primitive or atomic act as

a quality o human consciousness; selection labor, although persistently

present, modulated with historical transormations in inormation tech-

nology. In developing the proposition that selection power is producedby selection labor, we ocused primarily on transormations o selection

labor as a orm o mental labor understood by the distinctions established

within labor and mental labor.

When orality emerged into literacy, description labor in inormation

systems included the cognitive labor o memory and recall as well as the

bodily and communicative labor o public oral expression. Embodied in a

socially designated individual, the Icelandic law-speaker exemplifes such

an inormation system (Njal 1280/1960, 306–308). In oral speech, theprocess o production and the product (audible speech) disappeared with

the process, leaving no trace outside the memory o the auditors; process

Page 67: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 67/199

58 Chapter 4

and product did not ully separate rom labor. A distinction o syntac-

tics rom semantics could be recovered but is not strongly marked in oral

societies.For inormation systems under written literacy or premodernity, selec-

tion labor progressively separated into description and search labor, and

search labor emerged as the work involved in traversing or searching the

products o description labor.

Selection labor separates into description labor and search labor.

Analytically and substantively, selection labor represents the sum o 

description and search labor ollowing separation. The historical sepa-ration o description and search labor rom selection labor supports the

proposition that selection labor can be distributed between description

and searching. Separating products rom labor introduces the possibility

o distributing the products o description labor (catalog records, cata-

logs, and bibliographies); human labor congeals in those products. We

can also delegate aspects o description processes to clerical human labor,

with that labor assisted by technologies associated with writing as well as

by special-purpose inormation machines that produce and sort writtenutterances.

Under modernity, the possibility and actuality o transerring syntactic

labor to technology sharpens the distinction between semantic and syn-

tactic labor or both description and search labor.

Description labor separates into description labor semantic and descrip-

tion labor syntactic.

Search labor separates into search labor semantic and search labor syn-tactic.

The semantic aspects o description and search labor, including the under-

standing o syntactic processes, remain irreducibly and directly human.

Description labor semantic➞ human labor.

Search labor semantic➞ human labor.

Previously conducted by direct human work, we can now transer the syn-

tactic aspects o description and search processes to inormation technol-ogies, predominantly to the computer as a universal inormation machine

programmed to unction as an appropriate special-purpose inormation

machine.

Page 68: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 68/199

A Labor Theoretic Approach 59

Description labor syntactic can be transerred to inormation technology

(labor becomes process).

Search labor syntactic can be transerred to inormation technology(labor becomes process).

Thus, description and search labor are more ully revealed as both seman-

tic and syntactic in character.

Summary

Our synthesis revealed a pattern o progressive development. Originally

undierentiated categories such as labor and selection labor separate intodierent aspects. The pattern revealed is essentially additive, congruent

with a broader conception that human history involves the cumulative

development o human capacities (Hobsbawm 1998, 41).

Figure 4.1 illustrates the central propositions underlying this discursive

exposition as a sequence o summary statements 3. The order o the main

section in the discursive developments o the labor theoretic approach,

both in the ull expositions in the earlier chapters and recapitulated here,

corresponds to the sequence o statements and can be mapped to it. Thelogical connective o material implication is used to relate concepts and

activities distinguished, where it adds value to understandings obtainable

rom ordinary discourse. Material implication was classically the most

difcult and productive o the logical connectives, and it emerges here as

a highly signifcant crux, particularly illuminating the production o selec-

tion power by selection labor.

To aid intelligibility, we present these central propositions in a diagram

restricted to the primitive terms, central proposition, and transormationso selection labor (see fgure 4.2). The diagram also incorporates support-

ing propositions but does not show them explicitly. Selection power is

given priority by its placement at the head o the diagram, where it rests

on selection labor. The submerged position o selection labor is congru-

ent with its congealing and disguise in its products and with its theoreti-

cal neglect. When exposed, selection labor signifcantly supports selection

power. I the line indicating the distribution o selection labor (between

description and searching) is understood as the beam o a balance or

scales, selection labor must be considered as a hinged or potentially bro-

ken beam. For instance, i description labor is not helpul to the searcher,

Page 69: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 69/199

60 Chapter 4

the quantity o description labor could increase without reducing search

labor. The analogy o beam or scales remains valuable or giving iconic

orm to the idea that labor can be distributed between description and

search.In its current development, the argument is slightly static, and no

dynamic compels the transormation o relations between categories rom

“can be” to “is.”

Figure 4.1Summary o synthesis.

Primitive terms 

    Selection power

    Selection labor

Central proposition

    Selection power → Selection labor

Further propositions 

  Labor includes mental labor

  Human mental labor can be transferred to Information technology  Labor separates into Communal labor and Universal labor  Labor separates into Labor, process, and product

  Human mental labor separates into Semantic labor and Syntactic labor  Semantic labor → Human labor  Syntactic labor can be transferred to Information Technology

(labor becomes process)

Transformations of selection labor 

  Selection labor separates into Description labor and Search labor

  Description labor separates into Description labor semantic and Description

labor syntactic  Description labor semantic →Human labor  Description labor syntactic can be transferred to Information Technology

(labor becomes process)

  Search labor separates into Search labor semantic and Search labor syntactic  Search labor semantic →Human labor  Search labor syntactic can be transferred to Information Technology

(labor becomes process)

Page 70: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 70/199

A Labor Theoretic Approach 61

Transormation into a Dynamic

A dynamism that compels the transer o human syntactic labor to tech-

nology is ound in the conditions o direct human labor, including its

costs and elements o drudgery.4 Discerning a widespread preerence, par-

ticularly under capitalism, or reducing the costs o processes by abridg-ing direct human labor is a specifc and historically rooted position. It

is distinguishable rom Zip’s universalized principle o least eort or o 

resistance to labor (1936). The equivalence o automatic machine to slave

labor (Wiener 1954, 162), a condition not oten adopted voluntarily,

should also be recalled. “Can be” relations between categories are trans-

ormed into “is.”

Syntactic labor can be transerred to inormation technology (labor

becomes process.)

is transormed into

Syntactic labor is transerred to inormation technology (labor becomes

 process).

The transer would apply to both syntactic description and to syntactic

search labor.

Description labor syntactic is transerred to inormation technology(labor becomes process).

Search labor syntactic is transerred to inormation technology labor

(labor becomes process).

Figure 4.2Representation o synthesis

Selection power

Selection labor

Description

Semantic labor

Syntactic labor

or process

Searching

Semantic labor

Syntactic labor

or process

Page 71: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 71/199

62 Chapter 4

Thus, direct human labor primarily becomes semantic labor, and the

transormation applies similarly to description and search labor— both

aspects o selection labor.The concepts o selection power and labor, and o semantic and syntac-

tic labor and processes—all components o the model developed—have

been abstracted out rom historical and current inormation practice

and rom ordinary discourse, rather than imposed. In such a process o 

abstraction, there is an analogy with the construction o models o the

computational process, not as ends themselves, but as aids to understand-

ing. Ater grasping the undamental matters, we can bring understanding,

“which we could never obtain while immersed in inessential detail anddistraction” (Minsky 1967, 2–3), back to the practical world. Automata

theory develops models o the computational process partly as ends in

themselves rather than as an endeavor that might inorm practical under-

standing and has oten become increasingly technical (Boolos and Jerey

1989). The challenge here is to show that the determining orces identifed

or inormation systems really do inuence real-world systems.

Decision Practice

Although producers o major inormation systems would not ormulate

their policies and activities according to the concepts identifed here, their

activities are constrained and guided by the determining orces we have

discussed. The eect o the determining orces can vary with the inten-

tions and the producer’s market.

A common and largely unvarying characteristic o the movements pro-duced by motivating orces involves transerring direct human labor to

technological processes. Under premodern conditions, the description

processes (or instance, creation o indexes rom records) were accom-

plished by human syntactic labor. Today those tasks are accomplished

most requently by technology, strongly implying a desire to avoid the

costs o direct human labor. Description processes used previously to cre-

ate certain description products in fxed media (or instance, dierent

orders or indexes or records) are eectively moved rom description topotential instantiation in searching, economizing on product creation and

enabling a greater variety o orders.

Page 72: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 72/199

A Labor Theoretic Approach 63

In contrast to premodern practices, systems with syntactically based

description processes—human labor transerred to technology—have

prolierated, as exemplifed by Internet search engines. Primarily syntac-tic and machine processes increasingly generate the descriptions used or

searching the texts o works or citations between works on both Amazon.

com (Amazon 2007) and Google.com (2007a, b, c). (Amazon’s “search

inside this book” acility oers a concordance or the particular text,

and Google’s descriptions or indexes provide weighted concordances;

machine processes produce both systems syntactically). Thereore, human

labor embodied in the written text o the documents described provides

a resource or description. Description processes can aid specifcity inretrieval, but the very number and diversity o results obtained may trans-

er selection labor to the searcher. The links or traces let by semantically

guided explorations o resources can be exploited to determine the order

o reerences in retrieval and could be regarded as products o a orm

o unpaid semantic description labor. The increasing dominance o the

Google search engine, which might appear to contradict assertion o the

prolieration o syntactically based systems, can be regarded as the diu-

sion o an eective syntactic system, with some monopolistic orces aiding

its dominance. The market or inormation services has been regarded as

acting as a regulative mechanism that eliminates or corrects errors; a bet-

ter mechanism o correction remains undetermined (Swanson 1980, 128).

Although syntactic processes or selecting and ordering documents may

be open to unlimited variations derived rom primitive computational

operations, only some o these variations may be interesting or useul.

One task or inormation retrieval research would involve attempting tounderstand the eectiveness o Google, including the eects o syntac-

tic transormations on the semantic interpretation o words and phrases;

subsequent chapters will address these issues.

Appropriately exploited or searching, the greatly increased scope o 

syntactically generated descriptions can enhance human capacities. For

example, using Google’s advanced book search unction to fnd the exact

phrase “power o knowledge objectifed” yields a range o discussions

o Marx’s (1859/1973) conception o technology: “organs o the humanbrain, created by the human hand : the power o knowledge, objectifed”

(706) (Google 2007c). Such a search requires understanding patterns o 

Page 73: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 73/199

64 Chapter 4

diusion within the primary literature as well as the searching process.

Regarding primary literature, Marx’s phrase comes rom the dominant

English language translation o the Grundrisse, published in 1973 andnot yet subject to processes o secondary diusion that might have ena-

bled discussion o Marx’s theme o technology as a human construction

without using the signifcant phrase. For searching, the understanding

that phrase searching oten exclusively yields tokens o the intended type

could have been obtained by reiterated searches on dierent phrases.5 

Theoretically, we can explain the correspondence o tokens to type rom

the perspective o inormation theory regarding transition probabilities

between letters and also between words, understood as cohesive groupso letters (Shannon 1951/1993). Chapter 7 discusses the understanding

o the structure o written language obtainable rom inormation theory.

Semantic description processes bypassed in the discovery o documents

still may be exploited to obtain copies o those documents rom librar-

ies or bookshops, using their own humanly and semantically constructed

records. The enhancement o human capacities itsel creates partly novel

difculties—selection rom an enlarged body o potentially relevant and

now more readily discoverable primary literature, and understanding how

to exploit syntactically generated descriptions. Developments in descrip-

tion practices have increased and slightly transormed search labor.

Distribution o the products o semantic description labor (or instance,

catalog records) emerged in premodernity and continues in modernity with

elements o continuity and modulation. Thus, the costs o human seman-

tic labor required to produce descriptions are eectively shared, although

semantic labor itsel is not directly divided. WorldCat , or instance,contains the products o semantic description labor as catalog records.

Records are created by communal labor, guided by the universal labor

embodied in codes, understandings, and precedents or document descrip-

tion. The products o communal labor (catalog records) are distributed

to participating libraries, which use syntactic and primarily technology-

based processes to incorporate them into their own institutional catalogs

(WorldCat 2007). Amazon.com also uses products o human semantic

description labor as records or books, although Amazon’s standards areless ull and exacting than those required by WorldCat (WorldCat 2007;

Page 74: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 74/199

A Labor Theoretic Approach 65

Amazon 2007). In contrast to the specifcity obtainable rom syntactically

generated descriptions, deliberately human-assigned descriptions may

oer generic power and control. For some systems (or instance, Internetsearch engines) semantic description labor used to assign metadata cannot

necessarily be exploited directly and separately in searching. Distribution

o description products reveals continuity with premodernity, and modu-

lation emerges when machine processes supplant clerical labor to inte-

grate products into intrainstitutional catalogs. We can characterize these

practices as distribution o the products o human semantic description

labor. The costs o that labor are thereby shared and we can understand

the dominance o WorldCat in simple economic terms.Inheritance o the absence o precise coding or oral speech and non-

written graphic orms implies a limitation on the eective extension o 

established syntactic procedures to such orms. For searching and retrieval

but not inspection o results, Google’s image search relies primarily on

descriptions generated rom written verbal orms discovered in proxim-

ity to the images (Google 2007b). Other characteristics o computer-held

images (or instance, type o fle) amenable to automatic exploitation also

can be searched. Retrieved results are presented iconically; the reduced

size o the display images enables urther scanning. The contrast between

the treatment o symbolic and iconic graphic signs implies a connection

between labor and epistemology, where written language embodies accu-

mulated or universal human encoding labor (or instance, standardized

orthography and clearly marked boundaries between words).

Here, written communications present one kind o problem and voice com-munications another—generally considered more difcult. To sort out thesecond type requires a “word-spotting” capacity—essentially a computerprogram that can distinguish between spoken words in a multitude o lan-guages . . . Admiral Bobby Ray Inman, a longtime director o the Americancodebreaking and Comint eorts as chie o the National Security Agency,once admitted in public that word spotting or voice systems remained a

dream. “I have wasted more US taxpayers’ dollars trying to do that,” hesaid, “than [on] anything else in my intelligence career.” (Powers 2005, 22)

Box 4.1Difculty o Analyzing or Segmenting Oral Speech

Page 75: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 75/199

66 Chapter 4

The determining orces isolated by theoretical discussion have emerged in

major inormation systems as highly signifcant inuences on real-world

developments. Technological processes are a priori syntactic in character.The most compelling aspect o the dynamic observed, determined by rela-

tive costs, has been the transer o syntactic processes rom direct human

labor to humanly constructed and adopted technology. Although not nec-

essarily applied widely, human semantic description labor persists both

as an inheritance and or its current value. Where semantic description

labor had value in enhancing selection power and reducing search labor,

its absence tends to transer selection labor to the searcher. One writer has

suggested that a theory o bibliographic searching may replace a theory o bibliographic description (Wilson 2001). A task or inormation retrieval

research might involve expertise and strategies that can assist exploitation

o syntactically generated descriptions. Once understood, strategies may

not be complex: consider, or instance, the contrasting results characteris-

tically obtained rom word and phrase searching.

The orces identifed in the model contain no intrinsic dynamism that

compels transerring human semantic selection labor rom description to

searching, but they may have that eect in many contexts. In a relatively

closed and controlled system, the act o consumption o material may

be assured and patterns o consumption—one epistemological basis or

the choice o mode o description—can be anticipated (Shera 1952/1965;

1961). Even then, assigning objects to categories can be problematic and

highly inconsistent. More open systems cannot assure the act o con-

sumption, and patterns o consumption are difcult to anticipate: “i you

watch it [display o Google queries] long enough, the dierent queriesshow how diverse the world is” (Weisman 2002). Thus, practices that

reduce human labor in description and transer work to searching may

prolierate.

Decision Variation

Decision practices reveal evident commonalities, variation, and an under-

lying value. Evident commonality lies in transerring established descrip-

tion processes to technology. Variation occurs primarily in two loci: theextent o syntactic description processes applied—what proportion o the

original document to capture and represent—and the presence or absence

Page 76: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 76/199

A Labor Theoretic Approach 67 

o the products o semantic description labor and the type o product.

Some level o human semantic search labor is inescapable. An underlying

value, embodied in real-world systems and inuencing practice, is selec-tion power.

Dierent aspects o variation have dierent costs and eects. Although

not without cost, expanding the scope o syntactic description processes

has limited costs and may enhance human capacities. In contrast, seman-

tic description labor is costly. The presence o the products o semantic

description labor may enhance selection power and reduce search labor

and their absence can limit selection power and incur high levels o search

labor.

Decision Considerations

We can construct decision considerations rom the contrasting costs and

eects o the two aspects o variation. The low costs o syntactic descrip-

tion processes and the possible enhancement o human capacities avor

their instantiation. Restrictions on the scope o machine description proc-

esses may be only a transitional stage, with developments inhibited by

inherited belies and attitudes, particularly in library rather than broader

inormation contexts. The primary consideration in making decisions con-

cerns semantic description labor and the associated distribution o human

semantic labor between description and searching. Because description

labor is more amenable to deliberate control than search labor, the specifc

consideration would cascade rom deciding whether to apply semantic

description labor at all, to determining which type (including exhaustivity

o description), to resolving whether to draw upon any existing productso semantic description labor. Thereore, decision considerations relate

primarily and directly to human labor, secondarily to machine processes,

and only slightly to the materials consumed in the products o labor and

processes.

Value o a Labor Theoretic Approach

We identifed a historical dynamism or inormation practice that dis-tinguishes modes o inormation: primarily oral, written literate (pre-

modern), and computer technologies (modern). Excepting the recently

Page 77: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 77/199

68 Chapter 4

dominant but currently eroding inormation retrieval research tradition,

the value o selection power was relatively consistent. In contrast, selec-

tion labor emerged experientially and as an empirically observable prac-tice (written literacy), then divided urther into description and search

labor, beore reconverging as a single substantive category. Historical and

empirical considerations reinorced the validity o the proposition that

selection power is produced by selection labor, most clearly when inor-

mation has exosomatic and material aspects. Further consideration o the

value o the labor theoretic approach should include:

its ormal qualities as a theory,

its relation to the objects described,

its connection to ordinary discourse and common experience, and

its absorption o preexisting theories.

Formally, the theory has simplicity and economy and can be reduced

to a brie series o statements—qualities usually valued strongly in a

theory. In relation to objects described, the theory adopted an expansive

understanding o inormation retrieval systems, accompanied by osten-

sive exemplifcation rather than restrictive defnition. The theory was able

to comprehend systems in oral, written (premodern), and computational

(modern) modes. We identifed a dynamic o change and possibilities or

deliberate and inormed intervention in modern systems. Thereore, we

can consider the theory as comprehensive and powerul in relation to the

activities it takes as its object. The combination o simplicity and economy

with comprehensiveness and power makes the theory parsimonious, pos-

sibly representing a fnal reduction to essential and inescapable elements.The comprehension o real-world practice is matched by closeness to

ordinary discourse and common experience. The concepts o selection

power and selection labor and the distinction o semantic rom syntac-

tic labor have analogues in ordinary discourse conceptions and everyday

inormation practices. For instance, the analytical distinctions introduced

and employed between labor, process, and product all emerge rom real

historical transormations o lived categories. Rooting in ordinary expe-

riences should give the theory urther qualities o robustness and wideapplicability.

We have selectively absorbed preexisting theories rom librarianship,

indexing, and classic inormation-retrieval research and also absorbed a

Page 78: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 78/199

A Labor Theoretic Approach 69

concept rom inormation society discussions. Selection power was taken

rom librarianship and indexing, but the preerence o those activities and

disciplines or human description was transormed into a uller under-standing o the distribution o human semantic labor between description

and searching, and recognition o the possibility o transerring orms o 

labor to technology. Query transormation—implicitly valued in classical

inormation-retrieval research—was incorporated into selection power

both practically and theoretically as a special case, similarly produced

by selection labor. Inormational labor, recognized in inormation society

discussions, was dierentiated into semantic and syntactic mental labor,

and human labor was distinguished rom machine processes. Technologyhas not been repressed but rather has been incorporated explicitly into

the theory. Thereore, we have achieved a synthesis o preexisting theo-

ries that adopts their elements o value and discards aspects that obstruct

understanding.

The relation between the labor theoretic approach and preexisting

theories has a partial and revealing analogy to revolutions in mathemat-

ics. Classically, revolutions in mathematics have changed interpretation

(verbal metalanguage) while retaining orm (symbolic object-language),

thus economizing on the accumulated intellectual labor (universal labor)

embodied in known transormations in object-language:

And in thus preserving the orm while modiying the interpretation, I am ollow-ing the great school o mathematical logicians who, in virtue o a series o star-tling defnition, have saved mathematics rom the sceptics, and provided a rigiddemonstration o its propositions. (Ramsey 1925/1990, 219) 

In this instance, we have brought the interpretation into rigorous accord

with essential or inescapable components o orm or practice, particularly

through the idea o selection. Selection inherent in the process o retrieval

is matched by the value placed upon selection, and the addition o power

as the undamental value or inormation retrieval. Thereore, the value o 

selection power is congruent with the process o selection.

The theory then meets a rigorous test or knowledge, as “an ideal repro-

duction o the external world serviceable or cooperative action thereon”

(Childe 1956, 54). The theory has elements o idealization in its abstrac-tion out o common activities rom empirical reality, but the abstraction

gives enhanced understanding o those activities (or instance, the distinc-

tion between semantic and syntactic mental labor and their relation to

Page 79: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 79/199

70 Chapter 4

technology). Distinctions established have been returned to the external

world, yielding an enhanced understanding o major inormation systems

and identiying a dynamic or change. The theory also serves coopera-tive action, enabling guided and deliberate intervention in the dynamic

(or instance, the distribution o semantic labor between description and

searching) rather than simply allowing reproduction o the dynamic that

is less than ully conscious.

 Just as the owl o Minerva takes ight at evening, understanding grows

toward the end o a process. In this instance, human description labor

has been ully identifed as a category complemented and challenged by

syntactic machine description processes. For uture development, partic-ular aspects o the labor theoretic approach may obtain special signi-

icance. For instance, the enhancement o human capacities enabled by

the increased scope o description processes also raises the issue o how

best to determine the most eective ways o exploiting the descriptions

enabled. Existing research themes could be transormed and carried or-

ward. For instance, recognition o the multiaceted nature o relevance,

previously understood to require dierent methods o human descrip-

tion (Wilson 1973), now could be conceived as how to exploit syntacti-

cally generated descriptions along the dimensions o relevance identifed.

Crucially, the labor theoretic approach can absorb possible uture devel-

opments, and uller empirical richness is possible within its established

conceptual ramework.

As computational technologies become increasingly diused there is a

dissolution o the gestalt o the computer (Rosenberg 1974), in the partic-

ular sense o extravagant expectations o the computational process ando the transer o human intelligence to technology. The implicit transer

o human judgment to technological processes in query transormation

may have partly motivated the growth and diusion o the classical inor-

mation retrieval tradition. The experience o the practice o computation

is reducing the contrast between mystical practice—in the sense o pro-

posals or processes—and restrained theory; the theory o computation

itsel incorporates ordinary discourse or natural conceptions o the com-

putable. The revolution in the mechanization o mental labor (Minsky1967, 2) can now be seen as a late-twentieth-century revolution that has

stabilized at the syntactic level o computational transormations and the

interace to systems.

Page 80: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 80/199

A Labor Theoretic Approach 71

Conclusion

At the semantic level, the process o inormation retrieval may remainar more enduringly intractable and, despite technological and system

developments, not amenable to teleological transormation. One mid-

eighteenth-century comment, by a lexicographer who relied partly on

his own accumulated knowledge or making a monolingual dictionary

(Boswell 1791/1980, 131–132),6 still gives the most convincing account

o the process o inormation retrieval:

When frst I engaged in this work, I resolved to leave neither words nor things

unexamined, and pleased mysel with a prospect o the hours which I shouldrevel away in easts o literature, the obscure recesses o northern learning, whichI should enter and ransack, the treasures with which I expected every search intothose neglected mines to reward my labour, and the triumph with which I shoulddisplay my acquisitions to mankind. . . . But these were the dreams o a poetdoomed at last to wake a lexicographer. I soon ound that it is too late to look orinstruments, when the work calls or execution, and that whatever abilities I hadbrought to my task, with those I must fnally perorm it. To deliberate whenever Idoubted, to enquire whenever I was ignorant, would have protracted the under-taking without end, and, perhaps, without much improvement; or I did not fndby my frst experiments, that what I had not o my own was easily to be obtained:I saw that one enquiry only gave occasion to another, that book reerred to book,that to search was not always to fnd, and to fnd was not always to be inormed;and that thus to persue perection, was, like the frst inhabitants o Arcadia, tochace the sun, which, when they had reached the hill where he seemed to rest, wasstill beheld at the same distance rom them.I then contracted my design, determining to confde in mysel, and no longer tosolicit auxiliaries, which produced more incumbrance than assistance: by this Iobtained at least one advantage, that I set limits to my work, which would in time

be fnished, though not completed. (Johnson 1755/1982, 21–22)Mental labor is both made explicit and alluded to in repeated analogies

between literature searching and physical mining. The Aristotelian notion

o deliberation—“to deliberate whenever I doubted”—is also critiqued

implicitly. The contrast between “fnished” and “completed,” a subtle

distinction between the meanings o words made in a lexicographic con-

text, has relevance here: this particular sequence o chapters ends, but the

research agenda is not complete.

Page 81: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 81/199

Page 82: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 82/199

5Retrieval rom Full Text

Introduction

The remaining chapters ocus on retrieval rom ull text, which is under-

stood as a urther transormation o selection labor enabled by modern

inormation technologies; particular aspects o selection labor identifed

by the labor theoretic approach rise to prominence. Specifcally, the quan-

titative expansion in the scope o description process has transormed the

possibilities or search labor. Description processes remain syntactic andmachine-based, while search labor includes directly semantic elements

(or instance, the choice o words or search) and semantic understanding

o syntactic components (or instance, the logical combination o search

terms and understanding o the eects o description processes on the

meanings attached to words). This chapter proceeds rom the concrete

to the abstract, rom examples o retrieval rom ull text to discussions

inormed by scholarly sources accepted as relevant to understanding the

semantics o written language, particularly regarding the idea o languageas a nomenclature and the insufciency o this conception o language.

The relation o the labor theoretic approach to ull text retrieval is pre-

cisely specifed. Subsequent chapters develop a uller and more positive

understanding o the semantics o written language (understood as the

transormations o meaning in the processes o description and search-

ing) and o syntax (understood as patterns o dierence and replication in

words and sequences o words).

Examples

Unquiet cooccurrences

Page 83: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 83/199

74 Chapter 5

Quotidian metamorphise

Happiest legumens

The ragment shown above represents a novel literary orm in which each

line combines a unique co-occurrence on Google (2004).1 The primary

explicit ocus o composition centers on the combinatorial possibilities o 

the signifer or expression, although the signifed or meaning reemerges

orceully in thematic resonances containing unique combination and

transormation o terms. Composition in this instance also involved a

more deliberate search or a pun that would allude to hapax legomenon,

the scholarly term or a word that occurs only once in a classical or bibli-cal corpus. Because the implied Boolean AND characteristically retrieves

a number o documents rom a large corpus, discovering unique co-occur-

rences—even those devoid o semantic interest—is difcult.

The contrast between the retrieval results expected rom combining

individual search terms with AND and OR would be relatively well-

known and understood, both rom the historical experience rom the

1970s o searching systems based primarily but not exclusively on the

representation o single terms taken rom source documents, and alsorom the theory o computation. Resemblance in the initial ordering o 

reerences would partly disguise the empirical dierences in the retrieved

sets obtainable rom search engines using a phrase rather than an all the

word search, but the dierences would be discernible by the number o 

documents retrieved.

Experience in searching newspaper databases—regarded as more

ordinary than scientiic or technical discourse—and as anticipating

the issues encountered when other orms o ordinary written discoursewould become subject to computer-based retrieval also revealed the mul-

tivalency—multiple values or meanings, that single words acquire when

detached rom their syntagmas. Retrieval recovers the syntagma and

reveals the variety o syntagmatic occurrences. For instance, searching or

the topic fnancing o university libraries in 1993 with a query equivalent

to university AND library AND fnance retrieved unintended documents:

THE GUARDIAN Copyright (C) Guardian Newspapers Ltd,1984,1985,1986,1987,1988,1989,1990,1991,1992,1993

>get universityGET UNIVERSITY

3678 ITEMS RETRIEVED SINCE 1FEB92

Page 84: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 84/199

Retrieval rom Full Text 75

>pick library

PICK LIBRARY

143 ITEMS RETRIEVED SINCE 1FEB92>pick fnance

PICK FINANCE8 ITEMS RETRIEVED SINCE 1FEB92

>context 1

CONTEXT 1

1....

GDN 23 Jan 93 Latest Positions: Sex by Vatsyayana (1770)....Kama is love, Sutra is art. The inamous treatise on the art o love was part o a trio o works; the others were the Artha Shastra, one o the world’s earliest man-uals o statecrat, written by Kautilya, chie minister o Chandragupta Maurya,and the Dharma Shastra. . . . Then as now, there was no romance withoutFINANCE.....Later, he became a proessor at the Hindu UNIVERSITY o Benares and a direc-tor o the college o music. In 1954, he moved to Madras to become the director

o the Adyar LIBRARY.

None o the query terms received in the retrieved passage are used in

senses ar removed rom their ordinary discourse denotations and conno-

tations. Other retrieved results suggested a university as a site or transi-

tion rom adolescence to adulthood as well as a place o learning—a more

ormal sense consistent with discussions o aboutness in inormation

retrieval research. Similarly but not precisely parallel, a search or public

discussion o the Research Assessment Exercise (RAE)—the evaluation o publicly unded research in the United Kingdom—in 1997 supplemented

the phrase research assessment exercise with the acronym RAE combined

with research to retrieve documents that did not mention the ull phrase

(Warner 1997, 264–265).

(Research Assessment Exercise) OR (RAE AND research)Did Englishmen eat each other? In the 150 anniversary year o Sir John Franklin’sdeath, Tom Pow considers why his ailed endeavour overshadowed the greater

achievement o the Orcadian explorer John RAE . . . . Anyone who has visitedthe Scott Polar RESEARCH Institute and seen the fnal copperplate letters, writ-ten with all hope gone, with breath averted so paper would not reeze. (Pow1997)

Page 85: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 85/199

76 Chapter 5

The logical operators used in searching the newspaper databases continue

to Internet search engines, although expressed in dierent ways. Some

learning rom experience is revealed in the use o phrase searching.In contrast to combining search terms by AND, phrase searching

tends to recall a limited number o documents that contain thematic or

semantic similarities to the intended subject or signifed in the query. For

instance, searching or the phrase direct semantic ratifcation2 recalls a

restricted number o documents on the topic o direct semantic ratifca-

tion, oten concerned with contrasts between orality and literacy and

beginning to address issues concerned with electronic communication

(table 5.1). Similarly, searching or the phrase Jack Kennedy was a riend o mine retrieves a limited set o documents concerned with or deliber-

ately alluding to the vice-presidential debate between Lloyd Bentsen and

Dan Quayle. The discussions recalled seldom noticed the rhetorical eec-

tiveness o the contrast between the deliberate inormality ( Jack Kennedy,

implying the personal riendship claimed) and the ormality o Senator 

(alluding to Quayle’s oicial role) in Bentsen’s continuation, Senator,

you’re no Jack Kennedy (Google 2004). The variable verbal orms given

or other parts o Bentsen’s remarks could be ascribed to the inexactness

o unchecked verbal memory.3 Thus, phrase searching is used as a com-

pensatory mechanism or the mutivalency o separate words. A recent

innovation by Amazon.com, which displays statistically improbable

phrases rom the text o documents and uses them to make interdocu-

ment connections—reveals similar practical rather than necessarily theo-

retical understanding (Amazon.com 2005).

Search statement Retrieved items

“direct semantic rarifcation” c.83

direct AND semantic ANDratifcation

c.8700

direct OR semantic OR ratifcation c.28000K

Source: Google 2004.

Table 5.1 Eects o dierent search strings

Page 86: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 86/199

Retrieval rom Full Text 77 

Practical Understanding and Theoretical Knowledge

Understanding theoretically—rather than just experiencing practically or

exploiting—the dierences between term and phrase searching requiresmovement beyond the theory o computation to linguistics, specifcally

to Saussure or an account o signifcation in written language (Saussure

1916/1983), and to inormation theory or the possibilities and con-

straints on the combination o words into phrases (Shannon 1948/1993).

Comprehending patterns revealed by phrase searching might illuminate

why search engines have been perceived as eective. For instance, retriev-

ing documents exclusively on a topic corresponds to a principle intended

but not always achieved purpose o humanly assigned index or meta-languages and to an aim o classical inormation retrieval, particularly

when oriented toward precision. With phrase searching, some expertise

and mental labor has been transerred rom description to searching.

Synthesizing aspects o the seldom-connected felds o Saussurean linguis-

tics and inormation theory could be acilitated by establishing analo-

gies between their principal concepts: between the syntagma (the linear

sequence o utterance) rom linguistics and the message rom inormation

theory and the paradigm (the network o associations) and the messages

or selection.

This reverses a commonly assumed relation between theoretical knowl-

edge, practical experience, and understanding and gives practical under-

standing priority over theory. Philosophical antecedents to this reversal

are ound in Marx’s insistence on the value o rising rom the abstract

to the concrete (1858/1973, 101) and in Vico’s preerence or practical

understanding over philosophical universals (1710/1988, 62). Givingpriority to practical understanding may be particularly appropriate or

human communication, since the ability to communicate characteristi-

cally precedes analysis o communication. Vico’s comments on his own

account o the transition rom primary orality to literacy, regarded as the

coevolution o spoken and written language, are particularly apposite:

All this seems more reasonable than what Julius Caesar Scaliger and FranciscoSánchez have said with regard to the Latin language, reasoning rom the prin-

ciples o Aristotle, as i the peoples that invented the languages must frst havegone to school with him! (1744/1976, 153 § 455)

Similarly, practical understanding may have preceded theoretical analysis

regarding the transition rom paper-written to electronic literacy.

Page 87: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 87/199

78 Chapter 5

The additional value o theory lies in the enhanced understanding o 

practice and in its comprehension and generalization o experientially and

empirically obtained understandings. The choice o particular theories(or instance, Saussurean over Chomskyan linguistics) can be supported

by experience, with a dialectic relation between practice and choice o 

theory. In this instance, some particularly strong theories are employed.

Mediated by semiotics, Saussurean linguistics provides the most sophis-

ticated and powerul method known or analyzing signifcation (Warner

1994, 9–15). Inormation theory (Shannon 1948/1993) sets limits or the

transmission o messages o comparable standing and signifcance to the

limits established or computation by automata theory (Minsky 1967).The strength o theories is such that they may assist in establishing that

the emerging elements o stability in practice correspond to a teleological

statue or plateau that only could be transcended by disproving the theo-

ries. Generally recognized as particularly strong, automata theory estab-

lished undamental possibilities and limits or computation (Cockshott

and Michaelson 2007). Inormation theory gives a model o communica-

tion that identifes the undamental entities required or signal transmis-

sion. Inormation theory has not been disproven, although understanding

o its legitimate application to message and signal transmission has been

obscured by its analogical interpretation to include semantic understand-

ing (Tidline 2004). Saussurean linguistics can yield a very powerul,

although not previously developed, account o transormation or mean-

ing or ull-text description and searching. From the perspective o autom-

ata theory, ailure to progress in practical applications despite repeated

eorts may strongly imply that processes are computationally intractable.The theories here may reveal the sources o the continuing intractabil-

ity o inormation retrieval, in both its ordinary discourse and its techni-

cal sense, and even identiy the sources o possible noncomputability that

may underlie its intractability. Contemplating moving beyond the current,

relatively stable elements o practice might then require a reutation o 

frmly embedded and not-yet-disproved theories. In particular, human

semantic labor, possibly invested in search rather than description, may

remain inescapable to counter the intractability o the process o retrieval.This chapter adopts a rigorous understanding o knowledge and theory

as knowledge. One powerul understanding o knowledge is that it must

be communicable and useul, amenable to translation into successul

Page 88: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 88/199

Retrieval rom Full Text 79

action. Knowledge continues to be understood as “an ideal reproduction

o the external world serviceable or cooperative action thereon” (Childe

1956, 54). A distinction o knowledge rom inormation, independentlydeveloped in subsequent literature (Blair 2002, 1020–1021), is implied.

Childe’s “ideal reproduction” is understood to include a degree o ide-

alization and also the active participation o receivers in reproducing

knowledge. In this context, idealization is realized in the abstraction and

simplifcation o elements o language and message transmission; active

participation and the useul nature o knowledge emerge in the utility

o the abstractions chosen or inorming subsequent use and design o 

inormation retrieval systems. The knowledge embodied in inormationtechnology is acknowledged but treated as a given and as a basis or the

development o urther knowledge. The embodied knowledge is received

in objectifed orm, as material technology (Warner 2004, 5–35).

Conceptions o Language

The conception o language implicit in the searches recounted above was

concisely articulated by Saussure in his Course in General Linguistics,

frst published in 1916—ar in advance o the possibilities o discovering

the use o words oered by modern technologies or inormation retrieval

(1916/1983, 65).

For some people, a language reduced to its essentials, is a nomenclature: a list o terms corresponding to list o things. For example, Latin would be represented as:

Figure 5.1Source: Saussure 1916/1983, 65.

Page 89: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 89/199

80 Chapter 5

In one instance, library was conceived primarily as reerring to a physi-

cal place or inormation-related expenditures. Similarly, university was

understood spatially, and fnance represented a defnable, related activ-ity. Saussure critiqued the view o language as a nomenclature, arguing

that a word obtains its value rom a “network o orever negative dier-

ences” (Culler 1988, 52). Experience with inormation retrieval systems

has tended to confrm Saussure’s proleptic insight, particularly regarding

his view that we cannot understand language ully as a nomenclature.

We can trace the persistence o the ordinary discourse notion o lan-

guage as a nomenclature, possibly experientially eroded now by modern

inormation retrieval, to inculcation into written literacy. In Saussure’sexemplifcation o this conception o language, equivalence is indicated

between words and pictures o objects, corresponding to a child’s alpha-

betic primer. The relation between word and image is known to break

down when tested by more abstract entities— how could the Research

Assessment Exercise be drawn?—but we may still assume a similar

relation o equivalence between word and concept. In a similar way to

Saussure’s exemplifcation o language as a nomenclature, Augustine’s

description o the acquisition o language implies a relation o equiva-

lence between spoken sounds and visually perceived objects:

I can remember that time, and later on I realized that I had learnt to speak. It wasnot my elders who showed the words by some set system o instruction, in the waythat they taught me to read not long aterwards; but, instead, I taught mysel byusing the intelligence which you, my God, gave to me. For when I tried to expressmy meaning by crying out and making various sounds and movements, so that mywishes should be obeyed, I ound that I could not convey all that I meant or make

mysel understood by everyone whom I wished to understand me. So my memoryprompted me. I noticed that people would name some object and then turn towardswhatever it was that they had named. I watched them and understood that thesound they made when they wanted to indicate that particular thing was the namewhich they gave to it, and their actions clearly showed what they meant, or there isa kind o universal language, consisting o the expressions o the ace and eyes, ges-tures and tones o voice, which can show whether a person means to ask or some-thing and get it, or reuse it and have nothing to do with it. So, by hearing wordsarranged in various phrases and constantly repeated, I gradually pieced togetherwhat they stood or, and when my tongue had mastered the pronunciation, I began

to express my wishes by means o them. In this way I made my wants known to myamily and they made theirs known to me, and I took a urther step into the stormylie o human society, although I was still subject to the authority o my parents andthe will o my elders. (498/1961, 29 § I.8)

Page 90: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 90/199

Retrieval rom Full Text 81

“When I use a word,” Humpty Dumpty said, in a rather scornul tone, “it means justwhat I choose it to mean—neither more nor less.”

“The question is,” said Alice, “whether you can make words mean so many di-erent things.”

“The question is,” said Humpty Dumpty, “which is to be master—that’s all.”

—Lewis Carroll. Through the Looking Glass, and What Alice Found There. (Carroll1871/1998, 183–186)

Box 5.1Multivalency o Words

Page 91: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 91/199

82 Chapter 5

Augustine ocused explicitly on the acquisition o oral language, but

we can discern some inuence rom written literacy, particularly in the

word to object relation. Words are not necessarily distinguishable unitso oral utterance, in their aspect as signals, even when those utterances

are inuenced by written language. The culturally signifcant artiact o 

the monolingual dictionary both reinorces and undercuts the concept

o language as nomenclature. Rather, it reinorces the concept by repre-

senting relations o equivalence between defned words and their defni-

tions, although characteristically between verbal orms rather than word

and image; it also undercuts the concept, particularly in more extensive

dictionaries, by revealing the variety o signifcation that can attach todefned terms. Even extensive and historical dictionaries might still imply

stable synchronic states o a language, showing clear transitions between

historical uses o words and sharply dierentiated, defnable senses while

assigning uses to specifc single categories.

Construction o an equivalent to a nomenclature—a deliberately made

language—has been a recurrent response to the perceived inconsistency o 

the relation between word and object or meaning in ordinary as well as

scholarly and scientifc discourse. In philosophy, Wilkins proposed a real

character—intended to “signife things, and not words”—that aimed to

achieve clarity in argumentation (1668/1968, 21). In indexing, controlled

vocabularies are intended as terms with stable meanings that label con-

cepts more consistently than the language o discourse. Less commonly,

a search or a nomenclature has become embedded in the language o 

discourse. A common but not universal assumption in the construction

o a nomenclature argues that concepts and objects can be known inde-pendently o language and that agreement exists regarding undamental

notions. The excluded multivalency and indeterminacy o ordinary dis-

course tends to reemerge in the complexity o deliberately constructed

language and also in the difculty o assigning objects to categories. The

theoretical sophistication o discussions about deliberately constructed

languages has varied (Gardin 1973, 142), particularly regarding the gran-

ularity o isolation o concepts and the relation o concepts isolated to the

language o discourse.

Page 92: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 92/199

Retrieval rom Full Text 83

Retrospect and Prospect

We can distinguish a retrospect and a prospect regarding ull-text retrieval.The retrospect comprises the theoretical development to this point—the

labor theoretic approach and the urther sources adduced in this chapter.

In contrast, the prospect involves developing a uller understanding o the

emerging qualitative changes isolated by the retrospect, not just account-

ing or their existence. It will be constituted by the subsequent chapters.

Retrospect

The labor theoretic approach can explain the existence o ull-textretrieval as a real-world practice. Its emergence can be understood as a

product o the desire to exploit the enhanced selection power made pos-

sible by the quantitatively increased storage and processing capacities o 

modern, humanly constructed inormation technologies. Selection power

continues to be implicitly and strongly valued, and it remains produced

by selection labor, with aspects o that labor (particularly description

labor) transerred to machine processes. Thereore, the value implicitly

attached to selection power and its production by selection labor reveals

consistency with the labor theoretic approach. The signifcance o selec-

tion power (and the value attached to it) and the inescapability o selec-

tion labor are urther confrmed by systems that oer enhanced selection

power but require orms o selection labor—systems currently entering

widespread existence and use.

The underlying dynamism or change can be identifed precisely. Change

has emerged principally rom the reduced costs o syntactic machineprocesses and the increased amount o resources that can be incorpo-

rated economically into those processes. The labor theoretic approach

and the theories in which it is embedded can explain how substitution o 

machine processes or direct human labor reduces the costs o processes.

The enhancement o human capacities that emerges rom repeated sub-

stitutions is also explicable, i less requently recognized, particularly by

the labor theory o value. Thereore, the immediate origins or change are

consistent with a broader emphasis on the signifcance o changes in thematerial basis o being or existence to practice, and with a more specifc

view o recent developments in inormation technologies as a revolution

in the mechanization o mental labor.

Page 93: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 93/199

84 Chapter 5

We can locate the dynamism or change in relation to the categories

oered by the labor theoretic approach. The primary location or imme-

diate change lies in the intersection between syntactic machine processesand the description o objects (lower let section, table 5.2). The enhanced

scope or machine description o objects obtained has had secondary

eects on other aspects o inormation retrieval. The common source or

change has been mediated by contrasting and sometimes antagonistic

cultures o library and inormation science and Internet search engines,

resulting in dierent secondary eects and practices.

For library and inormation cultures, the dynamic has owed predomi-

nantly in two directions: toward the intersection between human seman-tic labor and the making and distribution o description products, and

also toward invoking machine syntactic processes in searching (repre-

sented by the upper let quarter and the transition between the upper and

lower right quarters o table 5.2). Producers o library and inormation

resources have been reluctant to exploit the potentialities o describing

the ull-text o documents, although this was identifed earlier as a pos-

sibly transitional stage that was no longer compelled by the storage or

processing constraints o technologies. Thereore, only limited excursion

has occurred into human semantic search labor, both in practice and in

Description Searching

Human

semanticlabor

Continuity and modulation

(distribution o products o semantic description labor).Generic capacity.

Consequential (rom increased

scope o machine syntacticdescription practices) increasein possibilities or exploitation.Need to understand productiono meaning and patterns o occurrence, in order humanly toexploit.

Machinesyntactic

process

Increase in scope o resources covered. Strong

continuities in descriptiontechniques.

Invoking o machine processes(copying and sorting) rather

than direct human syntacticlabor/without syntactic laborconducted.

Table 5.2 Distribution o human labor and machine processes betweendescription and searching under modernity

Page 94: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 94/199

Retrieval rom Full Text 85

theory. Experimental inormation retrieval, which developed partly rom

library and inormation cultures, has had a marginally direct eect on

practice and correspondingly limited secondary real-world eects owingrom the primary dynamism. The changes resulting rom the mediation o 

the dynamics or change within the library and inormation cultures—the

distribution o products o description labor and the invoking o machine

processes or ordering results—are historically embedded and can be

regarded as a quantitative expansion o existing practices.

For Internet search engines, change has occurred principally but not

exclusively within the intersection between machine syntactic processes

and description. Quantitative expansion in the potential scope o syntacticdescription progress has created some qualitatively signifcant but poorly

understood changes in human semantic search labor. A consequential

and highly signifcant dynamism ows toward the intersection o human

semantic labor and searching, increasing the possibilities or exploiting

resources (the change occurs in the lower let quarter o table 5.2, and

consequential eects appear in the upper right quarter). From the sys-

temic perspective developed in studies o the role o inormation services,

changes in one aspect o a system that have unintended consequences in

another aspect would be understood as a amiliar phenomenon (Roberts

1997). Semantic description labor and the more ormalized invoking o 

machine processes or search ordering and combination developed in and

associated with the cultural milieu o library and inormation science are

usually bypassed, but not always. The potential or exploitation o ull-

text description is much less embedded historically and less ully under-

stood than developments in library and inormation science. It representsa qualitative change, not just quantitative expansion.

The persistence o underlying cultures and their inuence on practice

illustrates that practice and particularly theoretical consciousness can

be underdetermined by the material basis o being: “[t]he tradition o 

the dead generations weighs like a nightmare on the minds o the liv-

ing” (Marx 1852/1973, 146). The material basis o being, mediated by

technology, directly aects practice in the sense o methods or construct-

ing working systems and practical understanding o how to exploit thosesystems. In contrast, theory can be slow to change and dominated by its

inheritance rom previous modes o thought. Thus, practice can precede

theory.

Page 95: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 95/199

86 Chapter 5

Studying practice can yield a highly specifc ocus or theoretical under-

standing. The elements o practice can be arranged along a continuum

that ranges rom contingent to inescapable. Contingent elements areunderstood as those that vary across contrasting cultures and practices.

Conversely, inescapable elements strongly inluence both library and

inormation and Internet cultures and practices. Isolating the inescapa-

ble—and attempting to understand it—may yield the basis or a synthesis

o the contrasting cultures. Any such synthesis likely will require a painul

eort o deliberate articulation; even then, synthesis will remain incom-

plete at this stage.

Contingent Elements

It is possible to group signifcant contingent or variable elements into

meaningul categories that reveal underlying analogies between con-

trasting cultures. Human description labor and its strongly semantic ele-

ments appear in the library and inormation sphere. The products o that

labor—bibliographic records and the distinctions embodied in dierent

felds o those records—can be exploited directly in searching. In addi-

tion, human description labor is highly and unquestioningly valued in

theoretical discussions. Documents discovered by Internet search engines

may contain humanly assigned metadata, but it is not necessarily avail-

able or direct exploitation. A signifcant development, both analogous

and contrasting to human description labor, involves exploitation o the

traces let by semantically motivated human searching, which produces

eects analogous to those obtained rom description labor (or instance,

grouping o material).Generic capacity—grouping material—occurs in both cultures but it

is produced by contrasting methods. Library and inormation practices

characteristically oer generic capacity by incorporating the products o 

human description labor (or instance, subject descriptors or canonical

orms or authors’ names). Evidenced by Google Scholar (2008), a con-

trasting method involves a computational collation o an article’s variants

through similarities in pattern. The sense o generic capacity common to

both cultures is the minimal sense o gathering together; the more elab-orate, humanly assigned structures are developed more requently by

library and inormation practices. The ordering o results represents one

orm o gathering, and practices in both cultures oer intelligible—or at

Page 96: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 96/199

Retrieval rom Full Text 87 

least explicable—orderings and discernible analogies. For instance, rank-

ing results by the number o links to Internet pages may be analogous to

ordering records o books by the number o libraries that hold them, orordering journal articles by the number o citations received: the various

orderings all unction as surrogate indicators o diusion or adoption.

Variable practices are intended to coner value, and the persistence o 

some variable elements in practice across contrasting cultures implies the

possibility o value or signifcance transcending their particular method

o instantiation, whether by human semantic description, search labor, or

computational transormations. The market eect continues to unction

as a selection mechanism, progressively dierentiating synchronic valuerom inherited practice. The particular orms o value conerred might be

better understood by contrasting them with inescapable elements, laying

bare the real reasons or their continued existence.

Inescapable Elements

Selection operations are inescapable and intrinsic to the understanding

o inormation retrieval we have developed here; the particular selection

operations may vary, although more in theory than in real-world practice.

Experimental inormation retrieval began by critiquing the eectiveness

o Boolean searching, and developed orms o retrieval it characterized

as non-Boolean. However, directly Boolean operations are very common

in real-world practice or the selection but not necessarily the ordering o 

retrieved documents and records. They also correspond to undamental

computational operations. Contrary to criticisms made in experimental

retrieval, and possibly attracted by the transparency and intelligibilityoered by Boolean operations, the dominant market eect is toward

selection o Boolean systems. The crucial dimension or urther analysis

will involve abstracting the word or other segments o utterance rom

writing, which is realized in Boolean searching but may not be confned

to it. Thereore, we can assume Boolean operations or our subsequent

analysis, ensuring its relevance to real-world practice without signifcantly

restricting its theoretical scope.

Inescapable elements emerge as the residue let behind ollowing theseparation o variable elements. Under modernity, selection operations

are intrinsic to inormation retrieval, and all these operations can be gen-

erated rom the undamental computational operations necessary or

Page 97: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 97/199

88 Chapter 5

writing, erasing, and substituting symbols. Selection operates on data and

inevitably encounters semantics and syntax—meanings and patterns—o 

that data or medium. Thus, both semantics and syntax are also inescapa-ble. When the selection medium is written language, semantics undergoes

a urther translation into a more specifc concern regarding production o 

meaning rom written language and syntax becomes issues surrounding

the replication and dierence o words and sequences o words. These

inescapable elements do not simply coexist but engage dialectically: com-

putational procedures or selection that correspond to modernity interact

with the production o meaning and patterns o replication and dierence

that embody strong inheritances rom orality and rom written literacy, orpremodernity.

The dialectical interaction o computational procedures or selec-

tion with the semantics and syntax o written language—understood as

patterns—is most ully and directly realized in Internet search engines,

although more in practice than in consciousness or theory. The most dis-

tinctive, novel, and strongly emerging aspects o Internet search engines

is not ull-text retrieval itsel—consider Biblical concordances—or the

underlying processes that support selection, but rather the scope and

extent o data covered, with variety merging into global heterogene-

ity. Enabled by the transer o description process rom direct human

labor to machine and also by the reduced constraints o machine stor-

age limitations, a quantitative change yields qualitative eects. In addi-

tion, search engines may be approaching a teleological state or plateau

and may embody the possibilities approached by other systems. Systems

that employ semantic description labor directly, and humanly, encounterpatterns o production o meaning rom written language, along with a

subsequent dialectic between the descriptions and computational proce-

dures or selection. Thereore, inescapable elements transcend the divide

between the library and inormation and Internet cultures, although they

occur dierentially in the more novel, interesting, and possibly histori-

cally inevitable occurrences embodied in Internet search engines.

Contingent and Inescapable ElementsBy separating the inescapable rom the variable or contingent, we can

now progressively isolate the set o activities that should orm the ocus

or our prospective attention. For analytical purposes, we begin by

Page 98: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 98/199

Retrieval rom Full Text 89

excluding highly variable elements: human description labor and prac-

tices that depend on the presence o the products o human description

labor (including exerting generic capacity and certain orms o order-ing) rom the library and inormation sphere, and semantic search labor

adapted to description, syntactically produced generic capacity, and the

orderings o records rom the Internet sphere. We can still include one

slightly variable element, Boolean logic or selection, while recognizing

the possibility o other computational selection procedures and also rec-

ognizing that the critical actor is abstraction o the segment o utterance

rom the line o writing, not Boolean logic itsel. We can then ocus our

attention on the interaction o the inescapable elements; Internet searchengines currently oer the ullest and possibly teleological realization o 

that interaction. Table 5.3 summarizes the argument’s progression and

indicates the ocus o subsequent attention, where “+” indicates strongly

present, “-” indicates largely absent, and “+ / -” indicates both presence

and absence in signifcant but not necessarily comparable proportions

within the practices o that culture. The inescapable elements themselves

are poorly understood and require theoretical understanding, which to

date partially tends to ocus on the variable practices. Studying the ines-

capable, in its ullest realization, also may reveal the particular orms o 

value derived rom the variable elements analytically excluded.

The signifcance o the particular inescapable elements identifed can

be strongly supported by analogies to orality emerging into literacy and

a contrast with written literacy, or premodernity, as a dual triangulation

eect.

Under orality emerging into literacy, selection labor concentrated pri-marily in memorization and communication o a linear spoken utter-

ance, and selection power emerged as dialogue between human speakers.

The primary analogy o orality emerging into literacy with modernity is

that the entire linear utterance—now written rather than spoken—and

the human labor embodied in it is potentially available or exploitation.

Although the real possibilities o direct comprehension are constrained by

the extent o available utterances, a compensating addition o computa-

tional procedures or selection has developed. However, those proceduresinvolve tearing the word or other segment o utterance rom its place in a

linear utterance. The absence rom modernity o an immediately human

respondent in dialogue can be partially compensated or by the dialogic

Page 99: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 99/199

90 Chapter 5

Practices Libraryandinorma-tionpractices

Internetpractices

 

   C  o  n   t   i  n  g  e  n   t

Variable Descriptionlabor

Human semanticdescription labor

+ / - + / -

Semantic

search labor asdescription

- +

Genericcapacityoered bydescriptions

Generic capacityrom humandescription labor

+ + / -

Generic capacitysyntacticallygenerated in

description

- +

Orderingo retrievalresults

Intelligibleorders

+ + Experimentalinormationretrieval

Identical orders - -

Selectionoperations

Booleanoperations orselection

+ + -

   I  n  e  s  c  a  p  a   b   l  e

Common Selectionoperations Computationalprocedures orselection

+ + +

Semantic Production o  meaning romwritten language

+ + +

Syntacticor patternbased

Frequency o occurrenceo words andphrases

+ + +

Table 5.3Contingent and inescapable elements in modern inormation retrieval

Page 100: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 100/199

Retrieval rom Full Text 91

character o system interrogation, where queries can be revised on the

basis o system responses. Selection labor—distinguished rom selection

processes—has begun to reemerge as a single category, although concen-trated in searching rather than in description. The availability o the entire

utterance or exploitation, the dialogic character o system interrogation,

the concentration o selection labor as a single activity are then analogous

between orality emerging into literacy and modernity.

Description products as records or metadata ormed by human

semantic labor on preexisting utterances were not part o inormation

retrieval under orality emerging into literacy and can be regarded as prac-

tices developed under written literacy, with an oten disguised connec-tion to the constraints o the technologies o written literacy. Semantic

human description labor and its products oten are substantively absent

or bypassed in searching under modernity. The human syntactic descrip-

tion labor o premodernity itsel is transormed into machine description

processes. The description products o modernity can then be regarded as

partly specifc to that era.

Thereore, substantive analogies that run across particular technologi-

cal eras support the more methodological exclusion o human semantic

description labor and its products and the ocus on the linear utterance

and procedures or selection rom linear utterances in the subsequent con-

sideration o ull text retrieval (see table 5.4).

Table 5.4 Inheritances and transormations rom orality and literacy

  Selection labor Linear utterance Dialogic character

Oralityemerging intoliteracy

Concentrated inmemorization andcommunication

Directlypresented overtime

Humanrespondent

Writtenliteracy orpremodernity

Development o description laborand products

Can be read Absence o humanrespondent

Computationalmodes ormodernity

Concentrated insearching

Available orexploitation, bycomputational(nonlinear)procedures

Dialogic charactero systeminterrogation

Page 101: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 101/199

92 Chapter 5

Thus, the labor theoretic approach is both comprehensive and, to a cer-

tain extent, analytically revealing or ull-text retrieval. Comprehensiveness

was revealed by understanding that ull-text retrieval is a product o thedesire or selection power, enabled by modern inormation technolo-

gies. Human selection labor remains inescapable or producing selec-

tion power, and selection labor is concentrated increasingly in searching.

Analytic power was revealed by precisely locating the primary dynamism

or change and its secondary eects as a quantitative change in description

processes that give rise to qualitative changes in human semantic search

labor. Thereore, particular aspects o inormation retrieval identifed by

the labor theoretic approach have assumed particular importance (seechapter 4 conclusion). Within the identifed locations or change, contin-

gent and variable practices could be separated rom inescapable elements,

both in a synchronic analysis o current practices and rom a diachronic

perspective on historical inheritances. The labor theoretic approach was

comprehensive in identiying the inescapability o selection labor and ana-

lytically revealing or isolating the ocus or prospective attention, assist-

ing the separation o inescapable rom contingent elements.

For our purposes here, technological constraints on the quantitative

scope o descriptions, oten received as universals, have already been

identifed as transcended or no longer immediately relevant; we can ana-

lytically exclude them rom the prospective discussion.

Prospect

We can maintain consistency with established theories or enveloping

context as well as the specifc approach to inormation retrieval, witha continuum between them. With regard to the broader context, the

signifcance o the material basis o being to the making o human his-

tory, to consciousness, and to the possibilities o retrieval continues to

be acknowledged. The material basis o being emerges specifcally in the

materiality o communication, particularly as a line o writing extended

across a surace. Current transitions in inormation technologies are

understood as a rematerialization and not a dematerialization o commu-

nication. With direct reerence to the labor theoretic approach, selectionpower continues as the primary value and remains inescapably produced

by selection labor, although human description labor is transormed

extensively into a machine process and selection labor is concentrated

Page 102: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 102/199

Retrieval rom Full Text 93

in searching. The labor theoretic approach identifed the signifcant loci

or change. Practical understanding remains valuable, and the articulated

theory is deliberately congruent with the practical understanding o howto use systems eectively, particularly accounting or the perceived eec-

tiveness and widespread use o phrase searching.

Congruent with a ocus on the inescapable elements o inormation

retrieval, some methodological restrictions are imposed. Human descrip-

tion labor and its products are deliberately and analytically excluded;

similarly excluded are their analogs in generic capacity and the order-

ing o records or documents produced computationally by Internet search

engines. Boolean operations or selection, highly common across systemsand including the crucial actor o cutting a segment o written utterance

rom its line, are assumed. Methodological exclusion o the contingent

enables ocusing on the inescapable elements common to library and

inormation and Internet cultures. At a urther level o analysis, such

exclusion could reveal the particular value added by contingent elements,

thus explaining the transcultural analogs.

The selection o sources or understanding qualitative changes in the

possibilities or semantic search labor is guided frst by their promise o 

analytical insight. Existing sources acknowledged as relevant to under-

standing inormation retrieval, such as the work o Augustine and o 

Saussure, remain relevant and can be incorporated into the uller under-

standing developed. In their current development, these resources have

been exploited primarily or the negative antithesis to the nomenclatur-

ism evidenced by Augustine and critiqued by Saussure, rather than a posi-

tive account o the production o meaning rom written language. Thedanger o extreme relativism that might ollow rom a purely negative

antithesis was emblematized in the ragile authority o Humpty Dumpty

(box 5.1).

We require a uller understanding that moves beyond predominantly

antithetical contrasts to language as nomenclature and constructs a

positive account o signifcation, or the production o meaning in writ-

ten language. Understanding can be developed by a dialectic with exist-

ing sources recognized as relevant, particularly Saussure and Shannon.Aspects o their work can be synthesized, stressing materiality o com-

munication and its current rematerialization and also stressing the signif-

cance o practical understanding already recognized in the labor theoretic

Page 103: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 103/199

94 Chapter 5

approach. Selection o elements o materiality rom existing sources

ensures theoretical consistency, which itsel promises the urther possibil-

ity o fnal simplicity and analytic value.

Conclusion

We used the labor theoretic approach to identiy speciic changes in

description process and search labor located precisely within the cat-

egories oered by that approach; urther development can be obtained

by dialectical departure rom sources already recognized as relevant. We

still need to develop a uller understanding o the inescapable elementso semantics and syntax common to both library and inormation and

Internet cultures. The understanding can assist the theoretical integra-

tion that might provide a basis or inormed practical action. We aim to

develop a positive account o the production o meaning and the occur-

rence requency o words and phrases, which can be exploited deliber-

ately in searching and may also suggest teleological limitations on system

development. While the account o meaning and o requency will be con-

gruent with the practical understanding revealed in known and widely

diused patterns o exploitation, it should yield uller and more ully

articulated understanding. Enhanced understanding should yield value

or library and inormation and Internet discourse and practice. The lim-

ited sophistication o discussions detected here could be enhanced. In par-

ticular, theoretical constraints or uture developments could be detected

rom the analogy o the syntagma with the message and the paradigm with

the messages or selection. Linguistics could contribute an understandingo the role o the syntagma and paradigm in signifcation and inormation

theory o the combinatorial possibilities o the messages or selection into

the message.

We can now anticipate subsequent chapters concerned with ull-text

retrieval. Drawing on Saussurean linguistics, chapter 6 provides a materi-

ally embedded account o the production o meaning rom written lan-

guage. Chapter 7 discusses the recurrence requency o words and o 

phrases. Chapter 8 synthesizes the insights obtained rom both linguisticsand inormation theory, particularly directly inormed by an understand-

ing o the computational process.

Page 104: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 104/199

6A Semantics or Retrieval rom Full Text

Introduction

This chapter ocuses primarily on semantics and draws specifcally rom

Saussurean linguistics to describe the production o meaning in written

language; it also provides a basis or understanding transormations o 

meaning in ull-text retrieval. Meaning in written language will be under-

stood as the product o human mental labor through interaction with the

syntagma (the linear sequence o utterance) and the paradigm (the net-work o associations a word acquires outside its context in discourse).

Syntagma and paradigm are undamental to Saussurean linguistics and

congruent with concepts rom inormation theory: message and message

or selection. We will use the concepts rom inormation theory in the

next chapter to account or the requency o occurrence and recurrence

o units and combinations o units in written language, principally words

and multiword sequences. In addition, syntagma and paradigm are con-

gruent with the material basis or algorithmic operations on the line o writing or description, searching, and retrieval. The congruence o un-

damental categories implies that an account o the production o mean-

ing, derived rom Saussurean linguistics, may be particularly revealing or

understanding ull-text retrieval. While the account o the semantics o 

written language presented here is consistent with Saussurean linguistics,

it must be extrapolated—not simply received.

Saussurean linguistics has inuenced inormation science primarily

through its incorporation into semiotics, briey anticipated by Saussure

(1916/1983, 15–17; Warner 1994, 9). Adopted by semioticians Saussure’s

distinctions regarding the syntagma as conceived in binary contrast to

Page 105: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 105/199

96 Chapter 6

the paradigm, and the signifer, sign, and signifed are valuable or under-

standing and analyzing signifcation in written language. Further devel-

oped here, Saussure’s distinctions yield an understanding o the eects o automatic indexing operations or ull-text retrieval on the language o 

discourse, where words are characteristically wrenched rom their syn-

tagmas, eectively released into a paradigm (a network o associations)—

where they can acquire a multiplicity o signifeds—and restored to a

variety o syntagmas in retrieval.

Within inormation science, inormation retrieval research has given

some attention to linguistics, but it has ocused more on its mathematical

and ormalized aspects (represented by Zip’s work on the distributiono word orms in linguistic corpora and by its interest in syntactic struc-

tures and computational transormations which has some analogies to

Chomsky’s linguistics) (Zip 1936; Montgomery 1972; Sparck Jones and

Kay 1973). The resistance o Saussurean linguistics to mathematical or

logical ormalism (Harris 1987, xiv) may account or its limited adoption

within inormation retrieval research. Recent data obtained rom citation

searching also indicate that Chomsky is more prominent than Saussure in

English-language scholarship, with very limited intersection between cita-

tions to Saussure and to Chomsky (see table 6.1).

The primary motivation or inormation science’s recourse to linguis-

tics has been its search or eective transormational procedures on writ-

ten language to aid automatic indexing and translation. Here, we aim to

develop understandings that will assist direct human intervention; those

understandings also may indicate the possibilities and limitations o or-

malization.

Citation made in 2004 in the Social Science CitationIndex and in the Arts & Humanities Citation Index

Saussure (all works) 64

Chomsky (linguistics not politics) 500Saussure and Chomsky 6

Table 6.1 Intersection o interests in Saussure and Chomsky

Page 106: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 106/199

A Semantics or Retrieval rom Full Text 97 

Syntagma and Paradigm

For Saussure, the interaction between syntagma and paradigm—associa-tive relations—was crucial to understanding language. Syntagmatic and

paradigmatic relations constituted linguistic structure and determined

how the language unctioned (Saussure 1916/1983, 126). In contrast to

the “interpenetration o morphology, syntax and lexicology”:

Only the distinction earlier drawn between syntagmatic relations and associa-tive relations suggests a classifcation which is indispensable, and which ulfls therequirements or any grammatical systematisation. (Saussure 1916/1983, 135)/ 

In a development rom the indispensability o the distinction, Saussureinsisted:

Everything in a given linguistic state should be explicable by reerence to a theoryo syntagmas and a theory o associations. (Saussure 1916/1983, 135).

From the perspective o modernity inuenced by computational technolo-

gies, transormations on language involved in retrieval should be equally

explicable by the reerence to the syntagma and paradigm. We have

already indicated the possibility o signifcant explication, understandingthat creation o a ull-text index involves tearing a word rom its syn-

tagma and releasing it into the paradigm. We also understand retrieval as

reentering a variety o syntagmas.

In his Course in General Linguistics, Saussure is more concerned with

paradigm and language than syntagma and utterance (1916/1983; Harris

1987, 125). Prioritizing language and paradigm implies objectifcation

o human activity. In earlier neglected and difcult work on anagrams,

the syntagma and its patterns o variation—conceived at various levelso granularity—received preerential attention (Starobinski 1979). In this

context, the usual order o mention and consideration—paradigm beore

syntagma—is reversed, thus preserving the probable order in human his-

tory o activity and perception. The individual utterance is given priority

over language as a whole (Vološinov 1929/1986), and the syntagma is

regarded as an abstraction rom its prior practical instantiation in oral

speech occurring over time and written language extending across space.

We can regard the paradigm, particularly as a network o associations, as

a urther abstraction, produced by the variety o syntagmatic occurrences

o words. Finally, we will investigate the analytic value o the interaction

Page 107: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 107/199

98 Chapter 6

between syntagma and paradigm in understanding signifcation in writ-

ten language. The discussion here accepts a degree o objectifcation o 

language but acknowledges and reers to the congealing o language inits written orm. We urther assume the removal o communication rom

direct semantic ratifcation—separation o the utterance rom its place

and situation o production and also rom the possibility o questioning

its producer.

Syntagma

Saussure considered linearity an inescapable and undamental aspect

o language, crucial to the conception o the syntagma. Linearity ol-lowed rom the spoken nature o language, and the “spoken word alone

constitute[d]” the object o study o linguistics (1916/1983, 24–25).

The linguistic signal, being auditory in nature, has a temporal aspect, and hencecertain temporal characteristics: (a) it occupies a certain temporal space, and (b)this space is measured in just one dimension: it is a line. (1916/1983, 69–70)

The principle o linearity is comparably signifcant as the “frst law” o 

linguistics, the arbitrary nature o the linguistic sign (Saussure 1916/1983,

68–70).

The whole mechanism o linguistic structure depends upon it [linearity]. Unlikevisual signals (e.g. ships’ ags) which can exploit more than one dimension simul-taneously, auditory signals have available to them only the linearity o time.The elements o such signals are presented one ater another: they orm a chain.(Saussure 1916/1983, 70)

Thereore, linearity is undamental to the concept o the syntagma.

Words as used in discourse, strung together one ater another, enter into rela-tions based on the linear character o languages. Linearity precludes the possibil-ity o uttering two words simultaneously. They must be arranged consecutively inspoken sequence. Combinations based on sequentiality may be called syntagmas.(Saussure 1916/1983, 121)

Some traces o objectifcation, particularly in the view o words existing

prior to their instantiation in discourse, can be discerned in this passage.

For Saussure, the “spoken word alone constitute[d]” the object o study

o linguistics (1916/1983, 24–25), although perception and understanding

o the spoken word has been inuenced by models in written language,

both consciously and unconsciously (Harris 1987, 78). Both explicit and

implicit models o written language appear in Saussure’s treatment o lin-

Page 108: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 108/199

A Semantics or Retrieval rom Full Text 99

earity, an inescapable and undamental aspect o language inherent in the

production o syntagmatic sequences. Saussure’s idea that the linearity o 

the sounds o speech is most clearly evident when speech is transcribed towriting is highly revealing, exposing the hidden premise “o Saussurean

linguistics . . . that the spoken word is ‘invisibly’ organized on exactly the

same lines as the ‘visible’ organization o the written word” (Harris 1987,

77–78). Implicit inuences rom written language about the understand-

ing o linearity can be discovered in metaphors o spatial extension—“a

line, a continuous ribbon o sound” (Saussure 1916/1983, 102)—or the

temporal linearity o the spoken signal. The auditory signal is conceived

implicitly as abstracted rom its producer and its visible supports andpartly objectifed.

The medium o discourse in which the syntagma is realized has ur-

ther, more materially grounded, and empirically detectable eects.

Biomechanics limit the combination o distinguishable sounds in the con-

tinuous or analog medium o oral speech. Saussure noted, “reedom to

link sound types in succession is limited by the possibility o combining

the right articulatory movements,” and suggested, “[t]o account or what

happens in these combinations, we need a science which treats combina-

tions rather like algebraic equations” (Saussure 1916/1983, 51).1 By con-

trast in written language, with its more discrete alphabet o symbols and

potentially digital rather than analog nature, restrictions upon transitions

between symbols are not inherent in the medium. However, restrictions

can be imposed or certain communicative purposes, such as providing an

analog to oral speech or incorporating redundancy to enable the recon-

struction o messages disturbed by noise (Warner 2003).Other analogies and contrasts can be made between the realization o 

the syntagma in oral and written discourse. The syntagma is extended in

time in oral speech and in space in written language, although this con-

trast can be qualifed. Perception—including reading—o written lan-

guage may take place over time. Written utterances, such as private and

published correspondence and issues o a journal, may take place succes-

sively and be related syntagmatically to each other. A urther transorma-

tion o the relation between space and time occurs with the written text o a computer program: the program spreads out in text space and the com-

puting process occurs over time (MacKenzie 2001, 38; Dijkstra 1968).

Page 109: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 109/199

100 Chapter 6

The linearity o speech becomes more evident when transormed into

written orm:

This eature appears immediately when they are represented in writing, and aspatial line o graphic signs is substituted or a succession o sounds in time.(Saussure 1916/1983, 70)

The treatment o linearity includes explicit analogies with written lan-

guage and implicit inuences rom writing. The particular conception o 

time as a sequence, with strong analogies to extension in space, is only one

conception or time (Harris 1987, 77), which could itsel be reinorced

and made to appear natural by the diusion o graphic representations o 

temporal sequences ounded on the analogy o time with spatial linearityand directionality. Crucially or the purposes here, written language can

be understood as a line sequentially composed o letters and other marks,

including punctuation marks, and grouped into words divided rom one

another by spaces.

The linear materiality o writing is acutely revealed by an established

pedagogic technique or exempliying the distinction between syntagma

and paradigm. Words are cut rom a sheet o written paper,2 a proc-

ess more complex than it might frst appear. First, the syntagma can be

isolated by cutting the line o writing as a single, physically continuous

ribbon rom paper with conventionally arranged writing on one side

or surace o the sheet o paper. Forming a continuous material ribbon

requires a zigzag cut. In modern practice, the semiotic sequence o the line

o writing is already broken at the end o each line, and these breaks are

retained in the continuous ribbon. Historically earlier orms o writing—

such as the boustrophedon, or the way the oxdrawn plow moves—wouldhave yielded a simpler line o writing or cutting, semiotically continuous

and more readily materially isolated. Once the syntagma has been cut as

a continuous material ribbon, units o the paradigm can then be isolated

and removed rom it by cutting words rom the line o writing. In the

modern line o writing, words characteristically have boundaries, indi-

cated by spaces or punctuation marks. Earlier orms o writing, including

the boustrophedon, did not necessarily separate words and appeared as

an unbroken line o writing more strongly analogous to the continuity o oral speech extended over time. The technique is rooted in the materiality

o premodernity: the writing on the paper is the object o labor and scis-

Page 110: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 110/199

A Semantics or Retrieval rom Full Text 101

sors are the instrument o labor. The operations o cutting can be based

upon visually detectable patterns alone, without reerence to meaning or

semantics. Cutting can be accomplished computationally and assimilatedto modernity by replacing human labor with a syntactic machine process

that enables greater speed and a much more extended object o labor as

the line o writing.

For our purposes here, explicit concentration on the line o writing

replaces the analogy between the extension o oral speech in time and

writing in space, with its partly covert premises. The line o writing is

understood as materially embodied and extended across a surace, giving

the possibility o moving backward as well as orward along the line. Inlanguage ully detached rom its producer, the line o writing provides the

closest practical realization o the orm rom which the concept o the

syntagma can be abstracted.

The constituents o the understanding o the syntagma, word, and dis-

course demand consideration, including contrasts between written and

spoken discourse.

Word

The word is an essential constituent o the syntagma; its defnition has

been disputed repeatedly in linguistics. For Saussure, “what a word is usu-

ally taken to be does not correspond to our notion o a concrete unit”

(1916/1983, 103–104):

To convince onesel o this, it suices to consider the singular orm cheval  (“horse”) and its plural chevaux (“horses”). It is commonly said that these aretwo orms o the same word. But, taking each as a whole, it is clear that we aredealing with two quite distinct items, as regards both meaning and sound.

Understanding o the word can be sharpened by considering this passage.

As sounds and, more clearly evidenced, as sequences o letters, there is

both resemblance (the letters, cheva) and dierentiation (l , ux), but not

the identity o a concrete unit between the two words. For meaning, or

semantics, there is contrast between singleness and plurality, but also

commonality o the constituents o reerence (the horse as one horse and

as one o more than one horse). Saussure continues to acknowledge theabsence o “immediately perceptible entities” in language, but concludes

by regarding the word as “a unit which compels recognition by the mind”

Page 111: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 111/199

102 Chapter 6

that “has a central role in the linguistic mechanism” (1916/1983, 105

and 109). Saussure recognizes but does not ully ormally defne words as

units o written utterance.From a materialist perspective, the word o written language could

be regarded, in part, as a historically developed division o the written

syntagma. Divisions between words would be understood as historically

introduced between units that compel recognition by the mind. In this

understanding, the word would be a possible origin or the perception o 

the paradigm, particularly or the paradigm as the network o associa-

tions created by the occurrence o identical or related word orms in a

variety o syntagmas, with diering meanings or signifeds.

Beyond the Word

Saussure understood the notion o the syntagma as applying “not only to

words, but to groups o words, and to complex units o every size and kind

(compound words, derivative orms, phrases, sentences)” (1916/1983,

122). Units larger than single words—compounds and exional orms (il a

été, or he has been) (1916/1983, 104)—are then recognized. For Saussure,

the word was an element o language and the combination o words in

discourse would normally belong to speech. Like the word, some complex

units (particularly compound words and derivatives) belonged to lan-

guage—“to the language, and not to speech, must be attributed all types

o syntagmas constructed on regular patterns” (1916/1983, 123). More

extensive syntagmas, such as the phrase and the sentence, were parts o 

speech, although there was no clear boundary separating language or

communal usage rom speech, marked by the reedom o the individual.For Saussure, the “characteristic o speech is reedom o combination”

(122), in a curious anticipation o the terminology used to describe the

selection o messages rom the source in inormation theory. Similarly,

with “sentences . . . it is diversity which is predominant” (105). The syn-

tagma then includes multiword sequences.

Paradigm

Saussure characteristically placed the paradigm (or, “associative rela-tions”) in a binary contrast with the syntagma (1916/1983, 123). The

paradigm can be regarded as produced by and abstracted rom the experi-

ence o the syntagma.

Page 112: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 112/199

A Semantics or Retrieval rom Full Text 103

The paradigm has been understood classically in a triple sense:

as the vertical axis, counterposed to the horizontal syntagma—a spa-

tial rather than immediately temporal extension; as the collection o units or members o an associative group that can

be substituted or one another in the syntagma, while remaining syntacti-

cally or semantically acceptable or cognate; and

possibly most inluentially, as the network o associations a word

can acquire when considered apart rom the syntagma, in accord with

Saussure’s primary understanding.

Outside the context o discourse, words having something in common are associ-ated together in the memory. (Saussure 1916/1983, 121)

The idea o memory as a network o associations entered modern cog-

nitive science through the idea o the semantic network, without any

discernible indebtedness to Saussure but possibly prompted by similar

inuences rom language. The originally complex semantic networks gen-

erally have reduced to simpler and more computationally tractable tree

structures that incorporate genus: species relations (Johnson-Laird 1988,

328–330).

Syntagmatic relations hold in praesentia; in contrast, associative rela-

tions obtain in absentia when the line o writing is considered alone. The

syntagma immediately introduces the idea o a fxed sequence that con-

tains a specifc number o elements. An associative group lacks a fxed

order and may have an indefnite number o elements. Some associative

orms, such as the exional paradigm o the cases o a noun or verb, may

have a limited but not necessarily precisely agreed upon number o ele-

ments (Saussure 1916/1983, 122–124).

Representation o the paradigm on surace—as a diagram, rather than

simply as line—is oten essential or exposition; representations impose

both pattern-based and semantic cohesion (Saussure 1916/1983, 125).

Both the possibility o distributing cut pieces on a surace and the estab-

lished orm o representing the paradigm as a diagram on a plane imply

that linearity alone is not sufcient to represent the paradigm in an imme-

diately intelligible way.The paradigm has been regarded as ormed by units carved rom the

syntagma (Barthes 1984, 121). The metaphorical orce o carving out  

Page 113: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 113/199

104 Chapter 6

should be observed, as it may help understanding o the strength and

extent o words released into multivalency when cut rom the syntagma.

Formation o the paradigm dividing the syntagma also has a historicalanalog in the introduction o word divisions into a previously undivided

line o writing. A urther common metaphor or the relation o the syn-

tagma and paradigm involves weaving—the oral or written text as the

woven product that is ormed rom the line o oral speech extended over

time and o writing across space. In particular, written words removed

rom their syntagma can be regarded as torn rom their line o writing

and embedded in the abric o the text.

From a rigorous materialist perspective, the paradigm can be regardedas generated rom the syntagma, corresponding to the probable historical

order o perception and the known historical order o the development

o written language. In the pedagogic technique or exempliying the dis-

tinction o syntagma rom paradigm—cutting lines o writing rom paper

and isolating words rom the line—the syntagma (the line o writing) can

be cut repeatedly along its vertical axis, isolating words or elements o 

the paradigm. The possibility o cutting depends partly on word bounda-

[T]he mandrake . . . gives a cry when it torn up; this cry can drive those who hearit mad. We read in Shakespeare (Romeo and Juliet , IV, iii):

And shrieks like mandrake torn out o the earth,That living mortals, hearing them, run mad …

In German, “mandrake” is Alraune; earlier it was Alruna, a word that comes origi-nally rom “rune,” which stood or “whisper” or “buzz.” Hence (according to Skeat),it meant a “mystery . . . a writing, because written characters were regarded as a mys-tery known to the ew.” More simply, perhaps, the idea o a visible mark standing ora sound baed the Nordic mind, and therein laid the mystery.

The physician Dioscorides (2nd century A. D.) identifed the mandrake with thecircea, or herb o Circe, o which we read in the tenth book o the Odyssey:

At the root it was black, but its ower was like milk. Moly the gods call it, and ithard or mortal man to dig: but the gods are all-powerul. (Borges, 1974, 96-98).

The tearing o a word rom its line o writing could be compared to tearing

up a mandrake rom the earth. Borges also revealed some direct associationbetween the mandrake and the written word.

Box 6.1Tearing o a Word rom Its Syntagm

Page 114: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 114/199

A Semantics or Retrieval rom Full Text 105

ries marked by spaces. By contrast, oral speech does not necessarily have

intervals o silence (analogous to spaces in writing) between words, and

it does not constitute a product that can be segmented, with the cut seg-ments retaining material existence. Oral speech has no direct equivalent

to the surace on which words can be distributed. Cutting the line o writ-

ing implies a disruption o linearity; the cut pieces can be gathered on a

preexisting surace. Identical cut words rom dierent syntagmas, possi-

bly contrasting in meaning rom their original syntagmas, can be regarded

as the origin or the concept o the paradigm, particularly in the sense o a

network o associations. Thereore, the paradigm can be regarded as pro-

duced by and abstracted rom the experience o the syntagma.The cut words can be distributed on a surace in ways that materi-

ally embody each o the triple senses o the paradigm. First, in the least

deliberately imposed organization, cut pieces can be allowed to all onto

a surace as they are cut rom above, similar to the material experience

o cutting paper with scissors above a desk surace. Cutting along the

vertical axis and allowing words to all onto a surace corresponds to the

frst sense o paradigm as the vertical axis, counterposed to the horizon-

tal syntagma, with a spatial rather than an immediately temporal exten-

sion in real material orm. The technology o premodernity—paper and

scissors—requires human material and syntactic labor. Under modernity,

it could be conducted computationally, but the result would yield limited

semantic interest.

With greater imposed organization, cut words could be grouped by sim-

ilarity in patterns (or instance, by opening sequences o letters o words),

resulting in pattern-based organization, where distinguished groups o words might constitute collections o grammatical variants (or instance,

cheval , chevaux). Such organization captures one aspect o the second

sense o paradigm—the collection o units or members o an associative

group that can be substituted or one another in the syntagma and still

remain syntactically or semantically acceptable or cognate. This captures

only that aspect o the associative sense o the paradigm that corresponds

to similarity in patterns, not the semantic or even grammatical relations

between elements dissimilar in pattern (consider is, are). Organization bypattern also captures the paradigm’s material basis, considered as derived

Page 115: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 115/199

106 Chapter 6

rom the recurrence o identical word orms in dierent systems, with

contrasting meanings. Under premodernity, this orm o organization

could be imposed by human syntactic labor; under modernity, it can begenerated computationally and is commonly implemented or retrieval

rom ull text.

Finally, cut words can be grouped by reerring to considerations

o meaning, including similarity, opposition, and other connections.

Grouping by meaning—semantic considerations—corresponds to an

inuential sense o paradigm as the network o associations acquired by

a word considered apart rom its syntagma, congruent with Saussure’s

understanding:

Outside the context o discourse, words having something in common are associ-ated together in the memory. (1916/1983, 121)

Grouping by reerence to meaning is a orm o semantic organization—

drawing on human understanding and memory—and requires human

semantic labor. The necessary presence o understanding and memory

indicates a link with semantic networks but also reveals a contrast; the

greater complexity possible in grouping cut pieces suggests that rela-tions o meaning may be difcult to capture in computationally tracta-

ble models. Under premodernity, similar instruments and objects o labor

(hand, scissors, and paper) could be used or semantic organization and

were employed or syntactic organization. Under modernity, grouping by

meaning cannot be implemented immediately by computation and would

require continual direct human semantic labor, using a materially di-

erent instrument and object o labor. For orms o syntactic organiza-

tion, the change in instrument (rom scissors and hand to computationaloperations) and object o labor (rom paper to electronic storage) enables

greatly increased speed and scope.

Demonstrating the material embodiment o dierent senses o para-

digm enables isolation o the specifc sense o paradigm or retrieval rom

ull text, without human semantic intervention in description. The pri-

mary sense is o the varied meanings acquired by a word in dierent syn-

tagmas. This sense has a material basis, with contrasting instruments and

objects o labor between premodernity and modernity. It embodies one

aspect o the second sense o paradigm, reerring to units or members

o an associative group, specifcally to an associative group o patterns

Page 116: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 116/199

A Semantics or Retrieval rom Full Text 107 

identical or similar to one another. It also corresponds to an aspect o 

the strongest sense o paradigm as the associations a word can acquire

when considered outside a specifc context in discourse or in a syntagma.In retrieval, the syntagma o search words are restored, and the diversity

o meanings likely will exceed expectation. The searcher implicitly may

have held a nomenclaturist model o meaning. In summary, search and

retrieval reveals the distribution o specifc units o the paradigm in sys-

tems; the human searcher can recover varied meanings, whose diversity

can encounter a prior conception o meaning, possibly more restricted

than intended.

We can also reveal the sense o paradigm uncaptured. Words connectedin meaning but lacking similar patterns (part o the primary or ullest

sense o paradigm) are not immediately recovered. Human semantic labor

could be used to link such words together in either description or search-

ing. This chapter has methodologically excluded human semantic descrip-

tion labor rom the consideration o retrieval rom ull text, but it has

exposed one highly signifcant source or its potential additional value, o 

linking semantically connected but syntactically unrelated units together.

Syntagma and Paradigm and the Production o Meaning in Written

Language

Interaction between syntagma and paradigm can be made into a crucial

element in a powerul semiotic model o signifcation. For Saussure:

In its place in a syntagma, any unit acquires its value simply in opposition to what

precedes, or to what ollows, or to both. (1916/1983, 121)Although value is a difcult and multivalent term in Saussurean linguistics

(Harris, 1987, 118–123), it can be regarded in this passage as correspond-

ing to meaning, or signifed. The primary concern here is with the recep-

tion (rather than the production) o written language, separated rom the

possibility o immediate dialog with its producer. In receiving the written

syntagma, the choice o attaching value to the word as signifer3 would

characteristically be guided by the signifeds or values attached to words

preceding and ollowing syntagma. Dierent languages would dier in

the extent and position—beore or ater—o the relevant syntagma. Mark

Twain suggested, “Whenever the literary German dives into a sentence,

Page 117: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 117/199

108 Chapter 6

that is the last you are going to see o him till he emerges on the other side

o his Atlantic with his verb in his mouth” (1889/1996, 280). Saussure’s

understanding o the syntagma may have been inuenced (partly uncon-sciously) by models in western and Romance languages.

Word Meaning

The model o signifcation that incorporates interaction between syntagma

and paradigm can be applied to both the production o meaning rom

the individual written word and the sequence. Saussure both endorsed

and developed the structuralist principle that distinguished units obtain

their signifcance through their dierence rom proximate units—“Inthe language itsel, there are only dierences.” (1916/1983, 118)—and

applied this principle to understanding the meaning o words. Regarding

the meaning o words as produced by mutual dierentiations contrasts

acutely with the ordinary view o language as a nomenclature. While

opposing the related assumption in classic inormation-retrieval research

that the word is a unit o meaning, it does correspond to the experience

o searching ull-text systems. In a dialectical development rom Saussure,

the surrounding syntagma guides the choice o signifed rom the para-

digm that attaches to the word in written discourse.

Conceptions o language, implicitly but not necessarily explicitly

accepted in ordinary discourse through their embodiment in the mono-

lingual dictionary, can urther illuminate the relation between word orms

and their meanings. In a monolingual dictionary, words characteristically

have many defnitions attached to them, and the number o defnitions

is inuenced by the depth and purpose o treatment. The relation wasacutely noticed by the frst modern English lexicographer:

names . . . have oten many ideas, but ew ideas have many names. (Johnson1755/1982, 15)

From the frst part o this remark, “names . . . oten have many ideas,”

the relation between a word orm and its meanings can be regarded as

a one to many relation. Thereore multivalency, or having many values,

is confrmed by the monolingual dictionary as a condition o a word’s

meanings, with the meanings indicated by defnitions given and ully dis-

played by illustrative quotations. The second part o the remark—“ew

ideas have many names”—implies that i ideas are extended to include

Page 118: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 118/199

A Semantics or Retrieval rom Full Text 109

meanings, a meaning likely will be unique—recoverable rom its illustra-

tive display and open to elucidation, but not necessarily easily subject to

decomposition—unless the meaning is deliberately compounded romsimpler ideas.

On the model o singularity o meaning, synonymy—equivalence in

signifeds between words—may occur only temporarily as a symptom o 

structural change. The observation that names have many ideas but ideas

have ew names was itsel made in connection with questioning the exist-

ence o ull synonymy:

Words are seldom exactly synonimous; a new term was not introduced, but

because the ormer was thought inadequate. (Johnson 1755/1982, 15)

Similarly, Saussure noted, “a language dislikes maintaining two signals

or a single idea” (1916/1983, 162). Saussure oten viewed language as

an objectifed autonomous orce (“a language dislikes . . .”), in contrast to

the earlier view o language as a product o human activity—“a new term

was not introduced.” The idea o absence o ull synonymy is consistent

with, or even a reasonable extension o, the idea that a word’s meanings

are produced by dierence rom other meanings in the language.Thereore, word meaning is regarded as produced by human mental

labor on the interaction o syntagma with paradigm. Words characteristi-

cally have many meanings and meanings may be singular, indicated by

defnitions but not necessarily easily urther decomposed or reducible to

simple equivalences.

Phrase

A crucial contrast emerges when considering the signifcation o syntag-mas more extensive than the word, such as the phrase or the sentence.

Similar orces o human mental labor acting on the interaction o syn-

tagma and paradigm continue to apply, but their eects are dierent in

the extended syntagma. Since the choice o signifeds is restrained by the

urther syntagma included, extending the syntagma beyond the word

would be expected to reduce the multivalency o the constituent words.

Indeterminacy— dierent interpretations—could remain. Again, ull-text

phrase searching would experientially confrm the reduction o multiva-

lency and the persistence o indeterminacy.

Page 119: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 119/199

110 Chapter 6

Multivalency and Indeterminacy

The account o the mechanism or the production o meaning rom writ-

ten language can be made to yield a distinction between multivalency andindeterminacy. Multivalency (that is, having many values or meanings) is

a condition o words used as ordinary written discourse. It is most acutely

revealed when a word is torn rom its syntagma— eectively released into

paradigm—and reinserted into a number o syntagmas, rom which we

can legitimately iner dierent meanings. In a particular syntagma, the

multivalency o a particular word is restrained by determining its mean-

ing rom the meanings attached to proximate words in the surrounding

line o utterance. Indeterminacy, or difculty in reaching defniteness ininterpretation, may still exist as a remaining possibility. Once ormulated,

the distinction o multivalency rom indeterminacy is simple and analyti-

Would you convey my compliments to the purist who reads your proos and tell him

or her that I write a sort o broken-down patois which is something like the way aSwiss waiter talks, and that when I split an infnitive, God damn it, I split it so it willstay split and when I interrupt the velvety smoothness o my more or less literatesyntax with a ew sudden words o bar-room vernacular, that is done with the eyeswide open and the mind relaxed and attentive.

—Raymond Chandler. Letter to Edward Weeks, January 18, 1948 (Chandler,1962/1997, p.77).

The language used by Chandler is intended to give an appearance o preci-sion, with restrained anger conveyed by the repetition o “split” and itsassonance with the “it” o “God damn it.” On a syntactic level, an appro-

priate oral reading would emphasize the separateness o words and theirboundaries, or splits rom one another, in the crucial passage. The mean-ing o crucial words, “velvety,” “vernacular,” and “attentive” is controlledby their syntagm—or instance, “bar-room vernacular”—and is not idio-syncratic. Indeterminacy has been deliberately avoided at the levels o expression and o meaning. Remaining indeterminacy could be regardedas a maniestation o the central paradox o writing, that its appearance o exactness is continually betrayed by the possibility o dierent interpreta-tions (McKenzie, 1990).

Multivalency, in distinction rom indeterminacy, is still potentially pres-

ent, with many values or meanings or crucial words possible in other syn-tagms. The potential or multivalency could be empirically confrmed by asearch o ull text resources.

Box 6.2Indeterminacy and Multivalency

Page 120: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 120/199

A Semantics or Retrieval rom Full Text 111

cally powerul, particularly or understanding ull-text retrieval, but sel-

dom explicitly made. It could be expressed economically in terms o the

interaction o more undamental and materially rooted categories, with-out distorting those categories.

Conclusion

Thereore, we can state summarily the account o the production o mean-

ing in human consciousness or the reception o written language already

developed. In written language, word meaning is regarded as produced

rom human mental action—semantic labor—at the intersection o syn-tagma and paradigm. The experience o meaning, as distinguished rom

the mechanism or the production o meaning, is understood as an event

in consciousness. The concept o meaning is not reduced to that o defni-

tion or paraphrase, and the meaning o a particular word is not identifed

with any acceptable defnition or paraphrase given. Meaning is regarded

as strongly analogous to the signifed. The concept o the signifed has

the conceptual advantage o an established place in a series o systematic

distinctions, particularly dierentiated rom signifer and sign, while the

concept o meaning obtains greater richness rom its resonance with ordi-

nary discourse.

The conception o meaning developed is consistent with the existing

sources recognized as relevant to understanding inormation retrieval,

but also develops dialectically rom them. The idea o language as nomen-

clature, with a one-to-one relation o word to object, continues to be

rejected while the counter notion o meaning as produced by dierenceis accepted. In a dialectical development, the idea that language is not a

nomenclature is no longer let as the partly unexplored term o a contrast

or antithesis. The mechanism or selection rom dierentiating meanings

is given a defnite and materially specifc orm o human mental labor

at the intersection o the syntagma with the paradigm. Possibly assisted

by the ocus on the material aspects o communication, the dangers o 

mysticism and extreme relativism, present in existing sources, have been

avoided. The particular nature o the mechanism or selection rom dier-ent meanings identifed also has a signifcant analytic value rom ull-text

retrieval, rom its elements o congruence, with the current rematerializa-

Page 121: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 121/199

112 Chapter 6

tion o communication, in which computational operations o cutting are

applied to the line o writing.

The conception o meaning developed here is consistent with the under-lying presumptions o the labor theoretic approach but also translates

them into more specifc terms. Meaning as an event in consciousness is

consistent with elucidation—rather than defnition by decomposition—as

the only way o giving verbal understandings o atomic acts or primi-

tive terms. Meaning itsel might be a primitive term. In accord with the

underlying stress on the signifcance o the material basis o being, the

mechanism that produces meaning—the intersection o syntagma with

paradigm—has a specifc material orm.In relation to classic inormation-retrieval research, relevance has not

been identifed with similarity in meaning or even seen as a simple unc-

tion o meaning. The assumption o the value o delivering all (and pos-

sibly, only all) the relevant records is not resurrected. Rather, meaning is

regarded as one highly signifcant ocus or criterion or selection, pos-

sibly more undamental than relevance. Considered apart rom and ater

meaning, relevance might connect more directly with the ordering o doc-

uments or records in retrieval, analytically excluded rom the discussion

here.

Page 122: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 122/199

7A Syntactics or Retrieval rom Full Text

Introduction

This chapter ocuses on understanding written language at the syntactic

level. The understanding o syntax adopted here reers to patterns, not

directly to grammar, and is continuous and consistent with the under-

standing o syntax in the concepts o syntactic labor and machine proc-

esses, which were developed earlier as part o the labor theoretic approach.

The continuity o the understanding o syntax implies the possibility thatmachine processes operate on the patterns isolated and the urther pos-

sibility o revealing the basis or established computational operations,

including those used in retrieval rom ull text. The specifc concern is with

the units o written language and their combination—the letter or charac-

ter, the word, and the multiword sequence. We are urther concerned with

the requency o recurrence o units and their combination, particularly

o identical multiword sequences. The area o study rom which the anal-

ysis will be derived involves inormation theory, rigorously interpretedand applied to written language. The stress on the materiality o com-

munication and the primary ocus on written language will be sustained.

In contrast to Saussurean linguistics, inormation theory ocuses exclu-

sively on communication in its aspect as signals and not on its meaning. It

can be adapted to yield particular insight into patterns o replication and

dierence in sequences o signals. Written language is understood as the

message o inormation theory. Inormation theory oers a particularly

revealing analysis o the units o written language and their combination,

but it is seldom exploited or understanding the structure o written lan-

guage in relation to retrieval.

Page 123: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 123/199

114 Chapter 7 

The model o communication in inormation theory emerged rom

a mathematical culture that valued the precision and the possibility o 

mathematical development, obtainable rom the abstraction o unda-mental concepts and entities rom everyday reality. The undamental

components o the model were identifed by Claude Shannon during the

late 1930s, and the model itsel was developed and substantiated through

cryptography during the 1941–1945 war.1 Including both the model

o communication and its mathematical development, the theory was

fnally made ully public in 1948 in a two-part article, “A Mathematical

Theory o Communication” (Shannon 1948/1993). At a biographical

level, Shannon had been interested in coding systems, including Morsecode, during his adolescence (Liversidge 1993, xxii). On a cultural level,

coding systems or transorming written language, including Morse code

and other telegraphic codes, had prolierated during the late nineteenth

century, partially resulting rom the need to transmit messages over the

geopolitical space created by the western expansion o the United States

and its increased links with Europe (Warner 1993). Thus, practical under-

standing o coding preceded theory.

Data indicate a very limited intersection between citations to Saussure

and to Shannon and that very ew documents captured by Google (2005)

include both the phrase mathematical theory o communication and the

word syntagm (see table 7.1). Data is only suggestive, rather than conclu-

sive, but it strongly confrms that the conjunction o inormation theory

rom the article “A Mathematical Theory o Communication” with the

concept o the syntagma, pursued here, is almost entirely unanticipated.

Including its strong inuence on the early development o inormationscience (Roberts 1976), inormation theory has been widely diused but

not necessarily well understood since its ormulation by Shannon in 1948

(1948/1993). Particularly in the humanities and the social sciences but

not within inormation theory (Verdú and McLaughlin, 2000), Warren

Weaver’s interpretation in a highly inuential introduction to the mono-

graphic publication, The Mathematical Theory o Communication, has

been received mostly as explication, with limited recourse to Shannon’s

own arguments (Shannon 1956/1993; Tidline 2004). Weaver’s introduc-tion includes the suggestion that inormation theory can provide a model

or understanding human communication at the levels o meaning and

Page 124: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 124/199

A Syntactics or Retrieval rom Full Text 115

the eects o messages on their recipients and that it need not be confned

to signal transmission (Weaver 1949; Fiske 1990).

The theory embodied in the article “A mathematical theory o commu-

nication” is valued directly, rather than analogically, or its scope. Undercertain specifed conditions, it is considered applicable to a variety o sys-

tems previously considered as separate entities. Since 1948, and particu-

larly since the late 1970s, the model became relevant to the design as well

as to the understanding o telecommunications and data storage systems.

It currently is regarded as still valid and as setting undamental limits or

inormation or signal transmission (Verdú and McLaughlin 2000). The

model o communication in inormation theory has antecedents traceable

to Aristotle (Sperber and Wilson 1986, 5–6), and it resembles Saussure’s

less deliberately abstracted speech circuit (Saussure 1916/1983, 11–13).

In its original intention, however, it was restricted primarily to communi-

cation as signals and not concerned with the understanding o messages.

This chapter preserves the level o concern with signals, and this helps

clariy connections and contrasts between inormation theory and linguis-

tics. In semiotic terms, linguistics was concerned with signifer, sign, and

signifed, both as complexes and at the levels o signifer and o signifed.The primary concern here involves the relation o signifer to signifed and

the inuence o the interaction o syntagma and paradigm on the realiza-

Citation made in 2004 in the Social Science Citation Index andin the Arts & Humanities Citation Index

Saussure (all works) 64

Shannon (all works) 122

Saussure and Shannon 1

Reerences on Google (5 May 2005)

Mathematical theory o communication 53,500Syntagm 10,300

Mathematical theory o communication AND syntagm 14

Table 7.1 Intersection o interests in Saussure and Shannon

Page 125: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 125/199

116 Chapter 7 

tion o that relation in written language. Some attention to complexes o 

signifer, sign, and signifed was given when considering speech as marked

by reedom o combination, where a partial anticipation o the combina-torial concerns o inormation theory was detected. Although Saussurean

linguistics ocused explicitly on spoken language its concepts could be

and were applied to the understanding o written sequences (in chapter

6). In contrast, inormation theory ocuses exclusively on communication

as signals or on the signifer, and does not consider the relation between

signals and meaning (see fgure 7.1) (Shannon 1948/1993, 5). In its origi-

nal ormulation, inormation theory was also limited to discrete rather

than continuous signals, although it recognized that continuous signals,such as oral speech, can be transormed into discrete orm (Shannon

1948/1993, 7–8, 50, 75).

This chapter examines the application o inormation theory to written

language in its aspect as signals, and develops rom a suggestion—made

by Pierce (1980) in one o the ew but not only less technical expositions

o inormation theory (Cherry 1978; Shannon 1968/1993)—that, in

applying inormation theory to written language, we may be

pushing a little beyond the mechanical constraints o language and getting at theamount o choice that language aords. This idea suggests views concerning theuse and unction o language, but it does not establish them. (Pierce 1980, 124)

Inormation theory is directly, rather than analogically, applied: the line

o written language would be one instance o the message o inormation

theory and can be revealingly understood rom an inormation theory

perspective.

Analogies are established between undamental concepts rom lin-guistics and inormation theory (that is, between the paradigm and the

messages or selection and the syntagma and the message). The analytic

advantages and interest o the analogies established ollow rom the con-

Expression Signifier Signal

Linguistics Relation Sign

Content Signified

} Information theory

Figure 7.1Levels o analysis or linguistics and inormation theory.

Page 126: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 126/199

A Syntactics or Retrieval rom Full Text 117 

vergence o previously unrelated and deeply contrasting traditions. “A

conuence o undamental quantities always is intriguing” (Berger 2000,

661) and may yield insights not obtainable rom either discourse in isola-tion. The intention, both in the direct treatment o inormation theory

and or the analogies with concepts rom linguistics, remains an enhanced

understanding o the condition within which human choice operates,

rather than implementing deterministic computations, beyond reveal-

ing the basis or Boolean operations. The emphasis on human choice is

consistent with this book’s treatment o semantics, but contrasts with

the dominant deterministic approach to the use o inormation theory in

retrieval.

Messages or Selection and Message

In inormation theory, the message and messages or selection are un-

damental and interrelated components o a coherent and comprehen-

sive model o communication; communication is understood primarily

as the transmission o signals across telecommunication channels. In the

model o communication developed or inormation theory, an inorma-

tion source chooses rom messages or selection and then combines them

into a message. In its technical and deliberately intratheoretic sense, inor-

mation is a measure o reedom o choice in selecting messages rom the

source. Measures o inormation can be derived directly rom the mes-

sage i the stability o the statistical characteristics o messages is present

or assumed across the samples analyzed. The message is then passed to

a transmitter, which operates on the message to produce a signal ortransmission beore sending it across a communication channel; noise in

the communication channel is assumed. The communication channel is

linked to a receiver, which operates on the signal to transorm it into a

message that can be passed to the destination. The inormation source,

transmitter, receiver, and destination can be a combination o human and

technology. For instance, the inormation source could be a printer that

selects messages rom a type ont, or a person who composes a telegram

and implicitly chooses rom the lexicon o the language, in accord withthe combinative constraints o syntactically acceptable telegraphic mes-

sages. For a telegraphic message, the transmitter would be a combination

Page 127: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 127/199

118 Chapter 7 

o telegraph operator and equipment, transorming the message into a

signal or transmission; the receiver undertakes an inverse process, pass-

ing the message received to the destination (Shannon 1948/1993) (fgure7.2). The examples o printing and o telegraphic communication indicate

the empirical actuality o the message corresponding to an object central

to the understanding o the syntagma—the sequence o written language.

The concepts o the messages or selection and o the message rom

inormation theory can be materially illustrated in a similar way to the

distinction o syntagma rom paradigm, i the process o cutting paper is

considered as reversed in temporal sequence and the cut words are assem-

bled into the line o writing. The cut words correspond to the messagesor selection rom a source, conceived in this illustration at the level o 

granularity o the word rather than the character. The assembled line o 

writing corresponds to the message, which then can be transormed into

a signal or transmission. I the assembled message requires a deliberately

intended specifc meaning, the selection o messages must incorporate

semantic considerations. I there were no concern to produce a particu-

lar meaning or the message, selection could be conducted on the basis

o patterns alone. The reversal o temporal sequence implies a degree

o ontological priority or messages or selection over the message, and,

accordingly, we will treat them frst.

Transmitter Receiver Destination

Signal

Message Message

Informationsource

Noisesource

Receivedsignal

Figure 7.2Model o communication in inormation theory.

Page 128: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 128/199

A Syntactics or Retrieval rom Full Text 119

Messages or Selection

In inormation theory, the messages or selection are the messages exist-

ing in the source rom which the inormation source selects to compose amessage. Classically, the messages or selection are conceived at the level

o granularity o the individual character or letter o the alphabet o char-

acters or selection (Shannon 1948/1993). Restricting the lowest level

o granularity at which the messages or selection are conceived to the

individual character avoids the necessity o direct attention to such ea-

tures as intercharacter or mosaic dierentiation and redundancy but still

acknowledges the existence and unction o these eatures (Cherry 1978).

The level o granularity chosen is consistent with the understanding o themessage adopted—orthographically acceptable written sequences rom

the English language lexicon, compounded o letters and other characters

rom the messages or selection.

Messages or selection can be conceived as held in a reservoir or scat-

tered across a surace, in contrast to the surace organization oten con-

sidered essential to representing the paradigm. Within inormation theory,

the lack o organization implied in a reservoir or scattering across a sur-

ace is analogous to the ormalized concept o entropy. Real historical ana-

logues to a store, or reservoir, o messages or selection can be considered

at increasing levels o organization within the messages or selection and

not yet concatenated into a linear message, although possibly anticipat-

ing the relative distribution o the messages or selection in the message.

Clarity in exposition can be obtained by restriction to a single alphabet

(in this instance, the lower case Roman alphabet), although the concept o 

an alphabet could extend to other sets o characters. For instance, a rela-tively unorganized reservoir could be conceived as a container that holds

equal numbers o individual letters. Combinations blindly chosen or ran-

domly generated rom the reservoir might be difcult to orm into cohe-

sive units or words in a given language, even on a purely pattern-based

and computationally realizable basis, and a residue o unusable letters

might remain ater the ormation o units (or instance, qqq, xxx, zzz). At

a urther level o organization, the distribution o letters in the reservoir

could correspond to the relative distribution o letters in messages o theintended language; blind selections would then be more amenable to con-

catenation into lexically cohesive units or words, and there need only be

a limited, possibly nil, residue o unusable characters. (Scrabble® players

Page 129: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 129/199

120 Chapter 7 

should recognize a close correspondence with their modes o blind selec-

tion, combination into units, and ending with a limited residue). A still

closed but partially organized reservoir could be constructed by organ-izing characters into categories, such as consonant and vowel, with blind

selection rom dierent categories possible: Countdown letter games

could be understood as selection rom categorized but eectively unlim-

ited reservoirs (Countdown 2007). The collection o moveable type in a

printer’s shop could be regarded as an organized reservoir o letters open

to deliberate selection o individual letters, with the number o letters in

each box o the overall store anticipating their relative numerical distri-

bution in message sequences (ligatures are excluded or the sake o ana-lytical clarity but could be understood as requently recurring sequences

o individual letters). A surace with some pattern-based organization

imposed would be the sending mechanism or the Cooke and Wheatstone

telegraph, rom which messages were selected by pointing needles at par-

ticular characters within an array. From a restricted alphabet o twenty

letters, each character could be selected by pointing two o the fve nee-

dles; characters arranged primarily by reerence to the order o the alpha-

bet (Ohlman 1996, 713–714; Science Museum 2007). A keyboard could

also be regarded as a surace or selection—the historical transition rom

alphabetic arrangement to qwerty confguration was intended to ensure

that type bars, linked to keys in a manual typewriter, did not conict in

their movements and jam (Ohlman 1996, 632). Proposals or reorm

rom qwerty toward a more efcient keyboard could be understood as

reorganization deliberately governed by requency o occurrence o indi-

vidual messages or selection in anticipated messages. In these instances,the messages or selection were conceived as syntactically or pattern-

dierentiated units, without immediate semantic signifcance. Individual

messages only unction semantically as words in special cases (I , a) and

only when considered as part o the message.

The parallel between the increasing organization o messages or selec-

tion and o the components o the paradigm (Paradigm, chapter 6) is only

partial. The organization imposed on the messages or selection is

conceived primarily at the level o the character,

restricted to pattern-based or syntactic characteristics, and

stops short o semantic associations.

Page 130: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 130/199

A Syntactics or Retrieval rom Full Text 121

Semantic coherence could be imposed by direct human semantic inter-

vention in the selection o specifc characters rom messages or selection

or combination into a linear message, consisting o semantically cohesivesequences o messages as well as lexically acceptable units.

Messages or selection are analogous to the paradigm, particularly in

its second sense o the collection o units or members o an associative

group rom which selection can be made to orm a syntagma. Again, the

paradigm has a semantic connotation when the messages or selection

are syntactically conceived units. Reservations about the analogy arise in

three respects:

the assumption o preexisting messages or selection to the message,

the level o granularity at which they are conceived, and

the comprehensiveness o the messages or selection compared to the

selective nature o the paradigm.

First, messages or selection are explicitly regarded as preexisting their

combination in the message or transmission, in a urther objectifcation

o communication. In contrast, particularly or the interpretation here, the

paradigm derived rom the syntagma and existed partly in absentia and asan analytic construct. Thereore, inormation theory receives the products

o human semiotic labor as objective existents. A real historical analogue

to the independent existence o messages or selection can be ound in

a printer’s ont o separate metal types, where ligatures and the relative

number o tokens in each category anticipated probable combinations

and distributions in the message to be composed. Second, the messages

or selection in inormation theory are conceived primarily at the level o 

granularity o the individual character; the character is regarded as a unit

o an alphabet o characters that includes all characters admissible to the

message. The constituents o the paradigm, understood as a collection o 

associated units, are primarily words—limited in number compared to

the ull vocabulary o the language. Finally, in contradistinction to the

inclusiveness o the messages or selection, selection into the paradigm,

based on associations in meaning or grammatical variations o a word or

multiword sequence, is implied. Thereore, the analogy between the mes-sages or selection and the paradigm must be qualifed.

The concern here will be with the messages or selection, understood

in an acceptable simplifcation as the individual and separate characters

Page 131: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 131/199

122 Chapter 7 

o the Roman alphabet; the message is understood as syntactically and

orthographically acceptable written sequences rom the English language

lexicon. The level o granularity at which the messages or selection areconceived stops at the individual character, without giving direct attention

to such eatures as intercharacter or mosaic dierentiation and redun-

dancy, although the existence and unction o these eatures is recognized

(Cherry 1978; Warner 2003, 551). Thus, the message in inormation the-

ory is understood as constructed rom individual messages or selection

and combined into a linear sequence or message.

MessageIn inormation theory, the message emerges in two orms: the message

or sending, passed rom the inormation source to the transmitter, and

the message reconstructed by the receiver rom the received signal and

passed to the destination. Classically, inormation theory was concerned

with ensuring a close correspondence between the message sent and the

message received, accepting that a signal would be perturbed by noise in

the communication channel. The message is identical with the signal or

certain uses o writing—handwritten, typed, or printed orms used or

communication over distance—and neither transmitter nor receiver oper-

ates signifcantly on the message or signal. Selection errors and noise (or

instance, orthographic mistakes and printing ailures) are still possible,

and some suggest that redundancy in written language was deliberately

introduced to enable reconstruction o the message intended to be sent

(Warner 2003). For other historically subsequent uses o writing, such as

telegraphy and e-mail, the transmitter and receiver operate on the mes-sage and received signal, and technology displaces direct human labor

and intervention over time.

We are concerned here with the message, not the transormations

between message and signal. There are two methods or assimilating

this concern to the model o communication in inormation theory. The

ull model could be retained, an identity between the message sent, and

the message received assumed, with the transmitter and decoder work-

ing without error and the signal not perturbed by noise. Alternatively,the transmitter and decoder could be eliminated rom the model and the

signal regarded as the message in transmission or as the message as sig-

Page 132: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 132/199

A Syntactics or Retrieval rom Full Text 123

nal. Conceptually, eliminating the transmitter and decoder is preerable

because it reduces the number o entities and corresponds to real historical

developments, in which practical understandings o coding precede theo-retical articulation, unctions, and entities distinguished in inormation

theory—such as the message and signal—separate rom a previous lack o 

dierentiation (Warner 2003). For the purposes o the current discussion,

the message is understood as a single entity and no dierentiation is made

between the message or sending, the signal, and the message as recon-

structed by the receiver; the transmitter and decoder are eliminated rom

the model. Although signifcant, these conceptual considerations do not

aect consideration o the message’s characteristics (including statisticalcharacteristics). We will ocus on choosing rom messages or selection to

construct the message and the implications o this process or understand-

ing the structure o the message

Particularly on the understanding adopted, the message is both analo-

gous with and can be dierentiated rom the syntagma. Shannon’s reer-

ences to the message as a sequence are reminiscent o Saussure’s insistence

that sequentiality orms the syntagma (Shannon 1948/1993, 6). The ter-

minological similarity suggests conceptual congruence. A partly implicit

rather than ully explicit assumption, rather than a principle, o linearity

is similarly present in the model o communication in inormation theory,

with communication proceeding in time rom messages or selection to the

destination. Linearity is implicitly observed in telecommunication prac-

tices or oral messages, presumably independently o Saussure’s insight

but inuenced by similar considerations. I it is to remain ordinarily intel-

ligible, an oral message must be received by the human destination in thesame sequence as the message or transmission; graphic messages (includ-

ing written language) can be received by the destination in parts over time,

as long as an adequate correspondence to the message or transmission is

fnally reconstructed. Later modifcations o the model incorporated eed-

back loops that allowed modifcation o linearity. In contrast to the con-

cern o linguistics with levels o signifer, sign, and signifed, inormation

theory ocuses exclusively on the signifer, or expression. Even when lin-

guistics anticipated the combinatorial perspective o inormation theoryin its view o speech as marked by reedom o combination, it was con-

cerned primarily with complexes o signifer, sign, and signifed, not the

Page 133: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 133/199

124 Chapter 7 

signifer alone. Thereore, both the message and the syntagma can reer to

a common object—the sequence o written language—with an underly-

ing assumption or principle o linearity, albeit to dierent aspects o thatobject (the message to expression alone and the syntagma incorporating

expression and content and the relation between them). 

Once messages or selection are accepted as individual characters o 

the Roman alphabet, the message o written English can be understood

at interconnected levels o granularity, dierentiated as character or let-

ter, word, and multiple or multiword sequences. In some special cases,

an individual character demarcated by spaces or a punctuation mark

and space can unction as a word (the space itsel can be considered asa character or message or selection, corresponding to the real material

experience o selecting messages rom a collection o moveable type or

a keyboard). Both the word and the multiword sequence can be more

ully understood rom the perspective o inormation theory, specifcally

by treating the line o writing as the message o inormation theory.

Message and Messages or Selection and Occurrence and Recurrence o 

Units

From the interaction between messages or selection and the message, we

can construct an understanding o the word and multiword sequence that

is consistent with and complementary to understandings in Saussurean

linguistics, but more ully developed. The intersection o syntagma and

paradigm was transormed into a crucial element or a model o the pro-

duction o meaning or written language, embracing and dierentiating

the word and phrase. Analogously, the combination o the messages orselection into the message can yield a syntactic or pattern-based under-

standing o the word and multiword sequence. The distinction between

the levels o analysis o semantics rom syntax is maintained and paral-

lel and complementary understandings are developed. The messages or

selection continue to be understood as individual characters, and the line

o writing as compounded rom the messages or selection, with the word

as a demarcated group o characters and the multiword sequence as a lin-

ear concatenation o such groups. An inormal, rather than arithmeticallyexpressed or statistically ormulated, but still sufcient understanding

Page 134: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 134/199

A Syntactics or Retrieval rom Full Text 125

o the requency o recurrence o words, and particularly o multiword

sequences, can be made to ollow rom these understandings.

Word

Shannon’s own brie but highly illuminating explorations o the structure

o written language, wherein the line o writing is eectively understood

as the message o inormation theory, yield a defnition o the word as “a

cohesive group o letters [o printed English] with strong internal statisti-

cal inuences” (Shannon 1951/1993, 197–198). This conception o the

word is consistent with the Saussurean understanding o the word as a

unit that compels recognition by the mind, but is more historically andmedium-specifc and less tautological. Shannon’s defnition o the word is

specifc to the written and printed word, not the oral medium, although it

may have some application to handwritten utterances regarded as com-

posed o discrete characters rather than as a continuous or broken line.

While Shannon’s defnition o the word has been largely neglected since

its publication in 1951, beginning signs o its adoption in connection with

the computational parsing o language have appeared,2 and it is poten-

tially highly signifcant. Most undamentally, Shannon’s defnition is con-

gruent with an emphasis on the material basis or communication and its

concentration on humanly and potentially automatically detectable pat-

terns.

Each element o the deinition can also be understood in a speciic

material sense, strongly related to linearity. Group can be understood as

spatially grouped, particularly within the line o writing, and separated

rom contiguous words within the line by the space; the space itsel isrecognized as a character rather than simply the absence o a character.

Cohesive can be understood in both a related and urther developed sense

o  group—related, as the cohesion or mutual stickiness implied by the let-

ters grouping together between spaces; and urther developed, as cohesion

between units o the word, or letters, which exhibit transitions between

units, acceptable within that written language. Strong internal statistical 

inuences imply that particular letters co-occur with one another, includ-

ing transition probabilities between two individual letters but extendingbeyond immediately contiguous transitions to longer sequences.

Page 135: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 135/199

126 Chapter 7 

Thereore, the deinition o the word meets the rigorous test or

knowledge as “an ideal reproduction o the external world serviceable

or cooperative action thereon” (Childe 1956, 54). For instance, specifccooperative actions could include isolation o words rom written dis-

course or ull-text indexing conducted either by human clerical labor or

machine process. Clerical labor eectively instantiated an understanding

o the word strongly analogous to Shannon’s defnition, or instance or

producing biblical concordances or nineteenth-century indexes or news-

papers (Palmer 1885) beore its ormal defnition. Clerical and automatic

computational processes have used a similarly implicit understanding o 

the word and also have recognized the signifcance o the space; how-ever, such understandings seem to have developed largely independently

o the relevant theory. Thus, practical understanding has occurred partly

in advance o theoretical ormulation and diusion, but practice could be

inormed by the encompassing theory.

The defnition o the word as “a cohesive group o letters with strong

internal statistical inuences” is consistent with the already adopted con-

ception o the message or selection as the individual characters o the

Roman alphabet, and o the message as compounded rom messages or

selection. It also incorporates qualities specifc to the message: cohesive-

ness and strong internal statistical inuences. The defnition o word can

also provide a basis or a consistent understanding o the multiword

sequence.

Multiword Sequence

As revealed by testing human subjects, prediction possibilities oered bythe written-language message connect the word to and dierentiate it

rom the multiword sequence (Shannon 1951/1993; Moradi, Grsymala-

Busse, and Roberts 1998). From prediction testing, Shannon concluded

that statistical eects extending up to one hundred letters (and thereore

beyond the separate word) reduced entropy, the amount o inormation

conveyed, to the order o one bit per letter or ordinary literary English.

Understood as the negative o entropy, redundancy was roughly 75 per-

cent; subsequent studies have yielded similar but not identical estimates(Moradi, Grsymala-Busse, and Roberts 1998). The space was revealed as

almost entirely redundant in sequences o one or more words (Shannon

1951/1993, 198). Real-world coding systems have displayed a practical

Page 136: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 136/199

A Syntactics or Retrieval rom Full Text 127 

understanding o the relative but incomplete redundancy o the space.

While early orms o alphabetic written language do not mark word

boundaries, the absence o the space symbol rom Morse code requiredthe introduction o urther telegraphic transmission codes to prevent con-

usion over letter boundaries and word-endings in the message received

(Warner 1993, 310–312). The space is a recent introduction to spell check-

ers or word processing systems. The reduced text produced by reducing

redundancy can be considered an encoded version o the original:

Roughly speaking, ideal prediction collapses the probabilities o various symbolsto a small group more than any other translating operation involving the same

number o letters which is instantaneously reversible. (Shannon 1951/1993, 204)

Analogously, shorthand—diused in the late nineteenth century and con-

sidered here as a mapping rom ull to reduced written sequences, even i 

primarily intended or the transcribing o oral speech—reduced redun-

dancy in the ull sequence and was reversible. Reduced redundancy might

introduce errors in reversing, particularly i the shorthand message was

transmitted to a destination dierent rom the inormation source and

not used or private recollection over time, inormed by knowledge o the

circumstances o production o the original utterance.3

Shannon’s experiments did not ully distinguish syntactic, or pattern-

based, predictions rom the semantic predictions o human subjects,

introducing (possibly partly unconsciously) issues o meaning that are

oreign to inormation theory’s primary concern with expression. Thus,

we can distinguish syntactic rom semantic prediction and ocus atten-

tion on the syntactic, or pattern-based, level already partially embodied

in text compression programs. The possibilities o prediction are greatestwith high redundancy and low entropy or subsequent characters, when

sequences immediately below the word are considered: or instance, in

written English the sequence strengt can normally only be ollowed by h.

These predictive possibilities would be consistent with Shannon’s under-

standing o the word as “a cohesive group o letters with strong internal

statistical inuences” (Shannon 1951/1993, 197–198). Compared with

prediction rom a combination o syntactic and semantic considerations,

prediction based on purely syntactic eatures may cover shorter sequencesand accordingly yield a lower value or redundancy and a higher entropy

or written English.

Page 137: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 137/199

128 Chapter 7 

From the perspective o inormation theory, thereore, the multiword

sequence o written language can be understood as a linear concatena-

tion o units weakly correlated with one another; the units themselves—understood as words—are internally cohesive. This conception o the

multiword sequence is consistent with other aspects o the application

o inormation theory to written language, speciically incorporating

Shannon’s defnition o the word and building on the implicit assump-

tion o linearity. Consistency with the defnition o the word ensures ur-

ther consistency with the conception o the messages or selection which

were understood as individual characters o the Roman alphabet. This

understanding was incorporated into the defnition o the word. Coupledwith eective real-world reerence and, in this instance, computational

applicability, internal theoretical consistency provides some assurance o 

the validity o the defnitions o the word and multiword sequence. The

understanding o the multiword sequence o written language is also sim-

ple, economic, and apparently novel, revealing simplicity and economy in

the brevity o the understanding given. This impression o novelty is sup-

ported by its logical dependence on Shannon’s conception o the word,

itsel not widely adopted despite its potential value.

The conception o the multiword sequence o written language as a

linear concatenation o units weakly correlated with one another has

other conceptual advantages, particularly through its correlation with

other independently knowable eatures o the development o written

language and related and independent theoretical perspectives. In rela-

tion to the historical development o written language, the space can be

regarded as historically introduced and persisting where transition pos-sibilities between two adjacent letters would otherwise be extensive and

the potential or prediction o the second o the two letters rom the frst

to be low, particularly or purely syntactically based prediction. Ater

a space, the initial character o a word likely will be difcult to predict

syntactically and will have low redundancy (some orms o writing—or

instance, Biblical Hebrew—have indicated the presence o vowels at the

beginning o words, but not within them). From the related perspective o 

critiques o the analogical application o inormation theory to semanticissues, the greatest reedom o choice at a semantic level occurs in the

selection o the next word, where constraints at the syntactic level are

weakest, reinorcing the evidence or the inapplicability o inormation

Page 138: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 138/199

A Syntactics or Retrieval rom Full Text 129

theory concepts directly to semantic issues. The conception o the mul-

tiword sequence is both consistent with and appropriately dierentiated

rom independently developed concepts in Saussurean linguistics. Theweak correlations between units correspond to the Saussurean view o 

speech as marked by reedom o combination. In contrast to conceptions

o the extended syntagma and the examples given o extended syntagmas,

semantic cohesiveness is not necessarily implied by an understanding o 

the multiword sequence in terms derived rom inormation theory, which

is limited to the level o expression. (It could apply to automatically gen-

erated sequences or which semantic cohesiveness was not attempted.)

The conception o the multiword sequence has analytic value or under-standing retrieval rom ull text and instrumental value or constructing

specifc queries when semantic cohesiveness is directly humanly imposed.

In this understanding, a phrase that distinctively characterizes the topic

desired is highly likely to occur only within relevant documents and this

can be exploited in searching.

Dierentiating the multiword sequence rom the word as a weakly

correlated concatenation o units that are themselves internally cohesive

can yield a crucial theoretical insight into the requency o recurrence o 

identical multiword sequences. I the units (words) o the linear multi-

word sequence correlate only weakly with one another, it is theoretically

possible to recognize identical multiword sequences as highly improbable

unless one is copied rom the other. This conception o the multiword

sequence also has considerable analytic power or understanding retrieval

rom ull text—it can yield a theoretical explanation o the experientially

encountered inrequency o recurring multiword sequences, even in verylarge corpora. The relative inrequency o identical phrases or extended

multiword sequences, even in large corpora, can then be understood as

consistent with the weak correlations between the concatenated units.

Summary

We derived a precise, computationally implementable understanding o 

the word and multiword sequence, realized in computational practice, by

considering written language as the message o inormation theory. Wealso established implications or the requency o occurrence o identical

words and multiword sequences.

Page 139: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 139/199

130 Chapter 7 

Conclusion

The specifc interest o the analogies established between concepts rompreviously largely unconnected areas or ull-text retrieval—Saussurean

linguistics and inormation theory—lies in their distinctive possible con-

tributions. Saussurean linguistics can give a basis or understanding and

modeling the semantic eects o interactions between syntagma and para-

digm, and the transormations o the signifed or meaning in indexing,

searching, and retrieval. Inormation theory gave insight into syntax or

patterns, including a computationally implementable defnition o the

word and multiword sequence.For the labor theoretic approach, we provided a basis or understand-

ing the eects o the current transormation o selection labor toward

searching. Emphasis on the material basis o communication, undamen-

tal to the approach, has been sustained. The next chapter will theorize

more ully the partly experientially and inductively acquired understand-

ings o patterns o retrieval (given or the opening examples in chapter 4).

The value o theory resides in the additional understanding it oers.

Page 140: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 140/199

8Semantics and Syntactics or Retrieval rom

Full Text

Introduction

This chapter is concerned with bringing the models o semantics and syn-

tactics or ull-text retrieval, developed in the previous two chapters, into

dialogue with examples o retrieval rom ull text. To acilitate under-

standing o the material presented, we will maintain a strong parallelism

with the structure o the previous chapters at macro-, intermediate, and

micro-levels. At the macro-level, the material basis or communicationis prioritized in the sequence o discussion, ully revealing the parallels

between retrieval rom ull text and the semantics and syntactics already

developed. At an intermediate level, we will maintain the sequence o 

examples given in chapter 5 and the progress rom considering separate

words to searching or phrases, adding a uller example. At a micro-

level, each example is considered frst in relation to the specifc semantics

developed, then in connection with syntactics, and fnally regarding other

relevant considerations. We careully preserve the already developed di-erences between semantics and syntactics and exploit their parallelism.

First, we must establish the correspondence between the already devel-

oped semantics and syntactics and retrieval rom ull text, primarily

through reerence to material basis.

Full-Text Description, Semantics, and Syntactics

Recurrent experiences in retrieval rom descriptions o the ull-text doc-

uments created by algorithmic transormations on the language o the

documents themselves—both or transormations o meaning and or re-

Page 141: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 141/199

132 Chapter 8

quency o occurrence o words and phrases—can be understood rom the

semantics and syntactics already developed. The semantics derived rom

Saussure can contribute to a model or understanding transormations aseects on the meaning or signifed o document and query terms. The syn-

tactics developed rom inormation theory can give an understanding o 

the relative requency o documents recalled by word and phrase search-

ing. The congruences emerging suggest a validation o each approach.

Contributions are obtained rom both linguistics and inormation theory,

and each discipline makes a distinctive contribution.1

Full-Text DescriptionWe can derive a suggestive indication, i not confrmation, o the corre-

spondence between language o discourse and language o representation

or ull-text indexing o syntagma and paradigm as well as the message

and messages or selection rom a urther congruence between pedagogic

techniques. We illustrated the relation o syntagma to paradigm by cut-

ting words rom a sheet o written paper. We could also illustrate the dis-

tinction between messages or selection and the message by considering

the process o cutting reversed in temporal sequence. Similarly, the eects

o ull-text indexing can be demonstrated by cutting words rom a sheet

o paper and creating an ordered index rom the cut words.2 Both the cut-

ting and the creation o an alphabetically ordered index can be conducted

syntactically, working solely on patterns. The simplicity and materiality

o the common method o exposition compels recognition o the corre-

spondence and indicates its signifcance.

Computer operations represented by the pedagogic technique substi-tute a machine process or direct human labor and speed the task, pos-

sibly also accomplishing it more exactly. Although humanly constructed

technology enables computer operations (Warner 2004, 5–35), the gestalt

o the computer (Rosenberg 1974)—in the sense o a mysterious object

held within relatively enclosed communities—may have disguised the

simplicity o primitive transormations. The diusion o technologies has

rendered that aspect o the gestalt less potent, although it may remain

necessary to reconstruct particular operations rom the behavior o thesystem or rom discursive account because we lack access to the source

code.

Page 142: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 142/199

Semantics and Syntactics or Retrieval rom Full Text 133

Automata studies—a signifcant theoretical aspect o computer sci-

ence—can contribute an understanding o the primitive computational

operations possible on written language, also discovered independentlyin inormation retrieval. Computational operations can be understood as

the writing, erasure, and substitution o symbols (Turing 1937; Herken

1995). Analogously or inormation retrieval rom written documents,

operations can be reduced to sorting or partitioning, which can be gener-

ated rom writing, erasure and substitution, and the substitution o one

symbol or another (Buckland and Plaunt 1994). The Boolean operators

AND, OR, and NOT, amiliarly used in commercial and Internet inorma-

tion retrieval systems, correspond to the primitive logical connectives,3 themselves independently developed rom the primarily iconic (rather

than notational) modes used or automata theory (Minsky 1967, x;

Warner 2001, 47–72). Dierent systems express Boolean operators as di-

erent commands, and the algorithms used in the experimental tradition

also derive rom the primitive operations.

A Semantics or Retrieval rom Full Text

Saussure’s remark that “Everything in a given linguistic state should be

explicable by reerence to a theory o syntagmas and a theory o associa-

tions” (Saussure 1916/1983, 135) can be developed to cover the transor-

mations o meaning that result rom algorithmic operations on language,

as well as the production o meaning; this possibility will be indicated

here. The associations or paradigm is understood in the strict sense o the

possibilities o occurrence in the syntagma. By algorithmically creating a

searchable description o a written document or ull-text indexing, unitso the syntagma are detached rom their particular syntagmatic occur-

rences and eectively released into the paradigm (the paradigm is con-

ceived as an analytic construct and in absentia). The level o granularity

into which the syntagma is cut and the units are detached can vary, but the

word, as a ormal or pattern-based actualization o the word, has become

a dominant unit in practice and will be the minimal unit considered here.

Detaching units rom the syntagma would cover a number o technically

dierentiated approaches, including serial searching o written language.Thereore, searching operates on the units detached rom the syntagma,

in eect on units belonging to a particular paradigm or network o asso-

Page 143: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 143/199

134 Chapter 8

ciations. The minimal units isolated in description can be combined in

searching (or instance, as direct Boolean combinations or as phrase- or

sequence-searching, itsel understandable as an application o the BooleanAND). Retrieval (the instantiation o the search statement) reinserts

detached units into the syntagma, a reverse transormation o a paradigm

into a multitude o syntagmatic occurrences. The signifed, implied by the

uller syntagma revealed by this reversal, may not correspond to the signi-

fed intended by the search statement. An incomplete mental representa-

tion o the networks o associations or search terms may have been held,

possibly due to the tendency to receive words rom language as terms

rom a nomenclature.In summary o the process o description, units o expression are cut

rom their uller syntagma or sequence, where they may have had a defnite

signifed, and released into a paradigm, where they become multivalent

and admit a number o signifeds. In searching, a mental representation o 

the intended signifed or the signifer, possibly received as having a single

meaning (or as a univocal complex o signifer and signifed), may be held.

Retrieval restores the syntagmatic occurrences o search terms, where the

signifeds represented by a single signifer may begin to constitute and even

exhaust the variety contained within the particular paradigm. Thereore,

linguistics can contribute a sophisticated understanding o the interaction

between signifer and signifed, enorced by the movement rom syntagma

to paradigm in description and rom paradigm to syntagma in searching

and retrieval, or computational and direct human operations on written

language in ull-text representation and retrieval.4

A Syntactics or Retrieval rom Full Text

The message and the messages or selection are as undamental to inor-

mation theory as the syntagma and paradigm are to Saussurean linguistics.

I the analogy o the message with the syntagma—and o the messages or

selection with the paradigm—is accepted, inormation theory can be used

to develop a theoretical understanding o the relative number o records

or documents retrieved by dierent orms o searching, particularly the

contrast between searching using word and multiword sequences. A moresophisticated ormal understanding o the word, corresponding to its

practical implementation, was derived rom inormation theory.

Page 144: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 144/199

Semantics and Syntactics or Retrieval rom Full Text 135

As currently developed, automata theory and experience o inorma-

tion retrieval systems can continue to contribute an understanding o 

the eects o dierent logical combinations on the number o recordsretrieved, particularly or searches using combinations o separate words.

Exemplifcation and Discussion

We can now reconsider the examples given previously, in chapter 5, or

searching on separate words and on phrases.

WordThe difculty o fnding two uniquely co-occurring words in the docu-

ments indexed by Google (Unquiet cooccurrences / Quotidian meta-

morphise / Happiest legumens) can be understood rom linguistics,

inormation theory, and the experience accumulated by using inormation

retrieval systems (Google 2004). Linguistics, particularly through the idea

o the paradigm as a mentally held network o associations, can indicate

the difculty o excluding semantic considerations rom the choice o 

terms or searching or unique co-occurrences, comparable to the dif-

culty o humanly ormulating a random sequence5—the repressed signi-

fed can reemerge. The operational implementation o the word—broadly,

a sequence o characters between spaces or space and punctuation mark—

is consistent with the conception derived rom inormation theory (that

is, a cohesive group o letters with strong internal statistical inuences).

The rarity o combinations o two such cohesive units, linked by AND,

resulting in a single document can be understood in terms o the wide dis-tribution o the cohesive units across documents. In addition, the rarity o 

combinations o two cohesive units resulting in a single document can be

understood as continuous with but still contrasting to the historical expe-

rience o using inormation retrieval systems with documents recalled

by simple Boolean combinations, particularly with reducing the number

o documents retrieved by combining sets with AND. In a more deliber-

ately theoretical, rather than directly experiential, perspective, combining

entities assumed to be independently distributed with AND is analogousto multiplying odds or ractions. Fuller index descriptions and possibly

transormations o the ull texts o documents are now available in his-

torical (including recent historical) experience.

Page 145: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 145/199

136 Chapter 8

Operating on ull-text representations, the eects o searching using

individual words and Boolean combinations o separate words, (univer-

sity AND library AND fnance) OR (RAE AND research), can be sim-ilarly understood. From the model o meaning already established, the

restricted syntagma o the separate word allows a multiplicity o signifeds

to attach themselves to the signifers when they are eectively released

into their paradigm; the retrieved extended syntagmas exhibit the multi-

valency acquired. The ordinary discourse meaning o words used as search

terms was not stretched in either instance, and their use in retrieved syn-

tagmas was not highly indeterminate. Initially, the inquirer may have held

a nomenclaturistic mental model o query terms as univocal complexes o signifer and signifed. The number o tokens o word types retrieved testi-

fes to the preexistence o words as cohesive and discrete units in the lan-

guage or representation. Using the technical aordances o the time, the

syntagma was restricted to the paragraph or the Boolean combination

university AND library AND fnance, but the search recalled the Queen’s

Honors List (syntactically marked as a single paragraph), indicating the

difculty o defning the paragraph. Inormal or experientially acquired

and weakly theoretical understandings o the difculties posed by the

multivalency o separate words, embodied in acilities or restricting mul-

tivalency by the provision o proximity operators, again preceded their

theoretical articulation.

Phrase

The eects o phrase searching—or example, direct semantic ratifca-

tion, Jack Kennedy was a riend o mine, Research Assessment Exercise (Google 2004)—can be conversely and still consistently understood.

From a linguistic perspective, the extension o the syntagma in the search

and in the retrieved string beyond the separate word reduces the potential

multivalency o the individual words (or the signifeds) that legitimately

can be attached to them as signifers, in so ar as they are accorded sep-

arate signifeds. In the examples given, the search phrase is a relatively

univocal (rather than potentially multivalent) union o signifer and sig-

nifed, although the possibility o indeterminacy remains. Phrases couldeven be regarded as approaching semantic and possibly historically spe-

cifc types o tokens retrieved ( Jack Kennedy was a riend o mine) rather

Page 146: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 146/199

Semantics and Syntactics or Retrieval rom Full Text 137 

than remaining simply syntactic or pattern-based types. Even in large cor-

pora, the relative inrequency o identical multiword sequences can be

explained rom the understanding o the sequence developed rom inor-mation theory—a weakly correlated linear concatenation o individually

cohesive units. It is possible to make an analogy with cryptanalytic proce-

dures, where repeated multiword sequences may be recognized, although

possibly with access to the process as well as to the product o commu-

nication and directly to the place o the sequence within the encompass-

ing utterance.6 The utterance also emerges as more signifcant than the

separate word, matching its possible ontological priority. Inormal or

experientially developed understandings o the eects o phrase search-ing—both as process o search, as with Google.com, and description, as

with Amazon.com (Google 2005; Amazon 2005)—or both relative sta-

bility o meanings and inrequency o occurrence have preceded their the-

oretical explanation.

Fuller example

A uller example can advance the dialogue between practical understand-

ing and the theoretical knowledge developed. In the ollowing extended

sequence, stood or occurs in two deliberately contrasting meanings or

senses, which could be paraphrased as “represented” and “put up with”:

“A college widow stood or something in those days. In act, she stood or

 plenty” (McLeod 1932). In its frst occurrence, stood or demands selec-

tion o represented , in the sense implied by the classic semiotic defni-

tion o the sign as something standing or something (aliquid stat pro

aliquo), and in its subsequent occurrence, to put up with. The dierentsyntagmatic occurrences compel selection o contrasting senses rom the

paradigmatic possibilities. The sequence stood or is regarded as belong-

ing to the language, not to speech; rom the Saussurean understanding o 

the word as a unit that compels recognition by the mind, it could also be

regarded as a word. From the combinatorial perspective o inormation

theory, the sequence stood or can also be regarded as a word i the space

is treated as a character (Shannon 1948/1993; 1951/1993).

In its original flmic mode, by oral delivery, the particular selectionrom the paradigm is guided by the surrounding syntagma, but the proc-

ess o selection is difcult to ormalize in logical or computational terms.

Page 147: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 147/199

138 Chapter 8

Investigation could recover the paradigm or stood or, although the

number o senses may be indeterminate and open to dierent imposed

meaningul orders (or instance, by historical development or requencyo occurrence o dierent senses distinguished). Dierent surrounding

syntagmatic sequences or stood or could compel similar or dissimilar

selections. Sequences might be difcult to divide into mutually exclusive

groups, demanding dierent choices: or example, what substitutions or

college widow and something or or she and or plenty could compel sim-

ilar or dissimilar selections rom the paradigm as an association o signi-

feds or stood or? Even a mechanistic simplifcation and objectifcation

o the process o selection—where the horizontal axis o the syntagmadetermines choice rom the vertical paradigm—reveals a complex and

computationally intractable process (see table 8.1). Even with a restricted

example, human understanding o language appears to exceed logical

and computational orm—and is not necessarily reducible to such orm—

although human understandings can still create logical products.

A paradigm or stood or could be deliberately constructed by drawing

on mental representations and associations in individual memory, while

recognizing that these representations are historically and socially con-

structed, by drawing on reerence sources, such as dictionaries, and on the

collections o syntagmas derived rom sources covered by Internet search

engines. A historical perspective on the possibility o reconstructing

the paradigm rom external sources would reveal continuities and con-

trasts: continuities in orms o signifcation and contrasts in the type and

quantity o direct human labor needed to recover the paradigm. Current

inormation technologies, including Internet search engines, oer greatly

Syntagma

Lexicon1

Lexicon2

stood or Lexicon3

Lexicon4

Drawn romLexicon

l..n

Paradigmor

selectiono sensesor stoodor, sd

l..n

Drawn romLexicon

l..n

Table 8.1 Choice o signifeds or the term stood or 

Page 148: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 148/199

Semantics and Syntactics or Retrieval rom Full Text 139

enhanced possibilities or recovering syntagmatic occurrences rom the

line o writing.

An incomplete exional paradigm (stood or–stand or–standing or)can be reconstructed rom mental representations guided by historically

accumulated grammatical understandings. The infnitive orm, [to] stand 

or, which becomes the entry or head term or dictionaries, has no neces-

sary special priority in the paradigm. As Saussure noted with regard to

nouns:

As ar as language-users are concerned, the nominative is not in any sense the“irst” case in the declension: the orms may be thought o in any variety o 

orders, depending on circumstances. (1916/1983, 124–125)

The need to reconstruct the paradigm implies that an incomplete men-

tal representation o it may have been held on the semantic (rather than

a syntactic or grammatical) level, possibly due to thinking o language

as a nomenclature, with a single or highly restricted number o senses

belonging to a word. Reconstructing the paradigm rom syntagmatic

occurrences confrms the existence o the paradigm in absentia and the

syntagma in praesentia (Saussure 1916/1983, 122).

Monolingual and historical dictionaries o the English language reveal

the diversity o senses that can be gathered under the verb to stand as

headword. Johnson’s subsequently inuential approach to dictionary-

making involved being empirical and inclusive, but he also used scholastic

distinctions to inorm dierentiations between senses and the organiza-

tion o defnitions (1755/1982). Stand or is distinguished as a separate

subentry within the entry or to stand , but neither to represent nor to put 

up with are distinguished as senses (Johnson 1755/1996; see fgure 8.1).The Oxord English Dictionary also distinguishes stand or as a subentry,

dierentiating a greater variety o senses, including to “represent by way

o symbol or sign” (the earliest occurrence o this sense is given as 1612)

and to “put up with” (traced to 1896) (Oxord 2005). The senses o stand 

or distinguished primarily by Johnson are assimilated to senses o stand ,

with similar deinitions and including similar illustrative quotations.

Retrieval o stand or and stood or in defnitions and illustrative quota-

tion or other headwords—a orm o retrieval that would have involvedconsiderable direct human labor or Johnson and Murray7 or their assist-

ants (Johnson 1755/1982; Murray 1978)—reveals a urther diversity

Page 149: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 149/199

140 Chapter 8

 Stand for as entry term

To STAND for . To propose one’s self a candidate.

How many stand for consulships?—— three; but ’tis thought of every one Coriolanus

will carry it. Shakespeare.

If they were jealous that Coriolanus had a design on their liberties when he stood for 

the consulship, it was but just that they should give him a repulse. Dennis.

To STAND for . To maintain; to profess to support.

Those which stood for the presbytery thought their cause had more sympathy with the

discipline of Scotland, than the hierarchy of England. Bacon.

Freedom we all stand for . Ben. Johnson.

 Stand for and stood for as syntagms in definitions .

To ANSWER.

To be equivalent to; to stand for something else.

A feast is made for laughter, and wine maketh merry: but money answereth all things.

Bible Eccl. x. 19.

APPE'LLATIVE. n.s. [appelativum, Lat.]

Words and names are either common or proper. Common names are such as stand for 

universal ideas, or a whole rank of beings, whether general or special. These are called

appellatives. Watts’s Logick .

OXSTALL. n.s. [ox and stall.]

A stand for oxen.

STILLING. n.s. [from still.]

1. The act of stilling.

2. A stand for casks.

Figure 8.1Reconstruction o paradigm or stand or and stood or. Source: Johnson1755/1996.

Page 150: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 150/199

Semantics and Syntactics or Retrieval rom Full Text 141

o senses, including a sense strongly analogous to to represent as a sign 

or Johnson (under appellative), as well as semantically dissonant senses

with orthographically compatible signifers in the defnition o nouns (orinstance, oxstall and stilling ) (Johnson 1775/1996; a verbally similar def-

nition o stilling is given by Oxord 2005).

A collection o syntagmatic occurrences o the written orms stand or 

and stood or can be obtained by an Internet search. In comparison to

historical lexicographic enterprises, the Internet reduces direct human

labor and alters the principle o selection. Literary sources no longer are

avored, although selection still is confned to written orms. Selection is

Figure 8.2Collection o syntagmatic occurrences o stand or and stood or. Source: Google2005.

Stand for Children

Building a powerful, grassroots citizen voice to ensure all children have the opportunity to

grow up healthy, educated, and safe.

www.stand.org/ - 21k - 13 Apr 2005 - Cached - Similar pages

We Stand for Peace & Justice

... I stand for democracy and autonomy. I don’t think the US or any other country ... I stand

for internationalism. I oppose any nation spreading an ever ...www.zmag.org/wspj/index.cfm - 12k - 13 Apr 2005 - Cached - Similar pages

BBC NEWS | England | Gloucestershire | New Kingsholm stand for ...

Work on a new £8m stand for Gloucester Rugby Club will start in summer, ready for next

year's season.

news.bbc.co.uk/1/hi/england/gloucestershire/4428393.stm - 33k - 13 Apr 2005 - Cached -Similar pages

Lenin and Trotsky - what they really stood for

... what they really stood for . By Alan Woods and Ted Grant. Buy Online! Buy online fromWellred Books! Contents. A Note from the Authors ...www.marxist.com/LeninAndTrotsky/ - 3k - Cached - Similar pages

Crosses stood for women as well as babies - The Observer - Viewpoint

Crosses stood for women as well as babies, , The Observer, a newspaper of University of 

Notre Dame.www.ndsmcobserver.com/news/2004/10/13/Viewpoint/

Crosses.Stood.For.Women.As.Well.As.Babies-751052.shtml - 44k - Cached - Similar pages

Page 151: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 151/199

142 Chapter 8

 stood for something 

The struggle for the GJF, A People's Foundation

... Justice and the Greensboro Massacre by Marty Nathan, MD & Paul C. Bermanzohn, MD.In Greensboro, North Carolina, an anti-Klan march and educational ...

www.gjf.org/Klansmen.html - 31k - Cached - Similar pages

Volume 36 - continued

... In our team journal, there is an anonymous quote: "Live your life so your children can tell their children you stood for something wonderful." ...

www.globalvolunteers.org/LINK/jumpvol36.htm - 12k - Cached - Similar pages

 Netflix Customer E-mail 1/11/05

... America used to be a country that stood for something, ya know? ... used tobe a time when America, and American companies stood for something good. ...

www.manuelsweb.com/netflix/0105email.htm - 14k - Cached - Similar pages

U2 Vertigo tour 

... agree with a lot of his positions but at least he stood for something. ...I mean, how can you respect someone just because he stood for something? ...

flyertalk.com/forum/showthread. php?goto=lastpost&t=420174 - 101k - Cached - Similar pages

Hugh Duke 2nd Devons remembered

... and which he used of a fellow-officer who fell before him; “He stood for something very precious to me—for an England of my dreams, made up of honest, ...

www.warchronicle.com/50th_div/ soldierstory_wwii/duke.htm - 12k - Cached - Similar pages

 stood for plenty

Horse Feathers (1932)

... In fact, she stood for plenty. Frank: There's nothing wrong between me andthe college widow. Wagstaff: There isn't, huh? Then you're crazy to fool ...

www.filmsite.org/hors.html - 23k - Cached - Similar pages

tsujigiri: 01/02/2005 - 01/08/2005

... In fact, she stood for plenty. Lindsey Graham questioning Alberto Gonzales islike a doberman questioning Groucho Marx. Graham relies heavily on weird ...

tsujigiri.blogspot.com/ 2005_01_02_tsujigiri_archive.html - 47k - Cached - Similar pages

tsujigiri: 02/06/2005 - 02/12/2005

... A college widow stood for something in those days. In fact, she stood for plenty."And they who is not us shall perish"; Hoax: IQ in Blue vs Red States ...tsujigiri.blogspot.com/ 2005_02_06_tsujigiri_archive.html - 25k - Cached - Similar pages

[ More results from tsujigiri.blogspot.com ]Fort Weyr 

... He did become a Candidate and Stood for plenty of clutches. He finally methis lifemate, bronze Gehenth, and graduated from weyrlinghood, ...www.darkfort.co.uk/fort.php?p=persona2/ display_char.php&cname=K_sin&user=guest - 10k -

Cached - Similar pages

2004 Pow Wow Page

... Like last year, the "POW" in "POW-WOW" stood for Plenty of Water, but thatdidn't stop the Rangers and Commanders of the West Central Section from being ...

www.njwcsrr.org/NJWCS%20Pow%20Wow%202004.htm - 18k - Cached - Similar pages

Student Doctor Network Forums - Physical Medicine & Rehabilitation ...

... Answering “because I heard PM&R stood for plenty of money and relaxation” is

probably not going to earn you many points. ...forums.studentdoctor.net/showthread.php?t=187469 - 89k - Cached - Similar pages

Figure 8.3Collection o syntagmatic occurrences o stood or something and stood or plenty. Source: Google 2005.

Page 152: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 152/199

Semantics and Syntactics or Retrieval rom Full Text 143

understood to occur on the basis o congruence o characters between

query and document strings, with a urther algorithmically generated

ordering o reerences based on such eatures as number o links andrecency. Inclusion o the context or the query string in the display o 

reerences embodies a practical understanding o the syntagma’s value in

reducing multivalency. Instances o stand or and stood or are difcult to

assign exclusively to the sense distinguished lexicographically, although

a sense o to represent legitimate interests seems numerically dominant,

and stand or is also present in its semiotic sense (see fgures 8.2 and 8.3).8 

Indeterminacy can remain, and multivalency has been reduced. Stand or 

also occurs with stand as a noun. The display is indicative o the com-plexity that might conront a lexicographer. The witty clarity o the ini-

tial contrast—stood or something … stood or plenty— is confrmed as

imposed rather than emerging.

I the extent o the syntagma or matching is increased to stood or

something and stood or plenty, multivalency o items retrieved are ur-

ther reduced, although plenty still occurs as use and mention (fgure 8.4).

The number o occurrences o each multiword sequence and o docu-

ments retrieved is also urther reduced, explicable by an inormation

theory perspective on the occurrence and recurrence o words and multi-

word sequences.

The wit o the utterance, written but intended or oral delivery, stems

rom the clarity o the contrast between the two senses and the rapid tran-

sition rom one to the other. The association between the two senses may

have its origins in the unconscious (Freud 1905/1976, 215–238), but their

mutual contrast is deliberately intellective. The contrast between the wordsenses also embodies reservations about attempts to construct a semantic

unity or even strong coherence or a word:

The various contexts o usage or any one particular word are thought o as orm-ing a series o circumscribed, sel-contained utterances all pointed in the samedirection. In actual act, this is ar rom true . . . Contexts do not stand side by sidein a row, as i unaware o one another, but are in a state o constant tension, orincessant interaction and conict. (Vološinov 1929/1986, 80)

The wit arises rom the tension between the two senses and is lost whentheir order is reversed. Experience with ull-text retrieval has tended to

oer empirical confrmation o the variety o unpredicted syntagmas or

words.

Page 153: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 153/199

144 Chapter 8

Extremes o relation between expression and content can be iden-

tifed, and a progression can be traced between them. At one extreme,

the sequence cut can have a multitude o signiieds: or Vološinov,“Multiplicity o meanings is the constitutive eature o [the primitive]

word ” (1929/1986, 101). In this understanding, the primitive word was

not distinguished rom the utterance or abstracted rom its place between

speakers. For an individual speaker, brevity and multivalency can reduce

labor in the production o utterances, without demanding compensating

labor in interpretation by that speaker, particularly i there is a power rela-

tion over the hearer, such as the slave, or “instrumentum vocale” (Marx

1867/1976, 303n–304n). In a speech circuit, or with the circulation o linear written messages, speakers are also hearers and have an interest

in reducing the work in disambiguating multivalency in interpretation,

bringing it into balance with work in production. Multivalency can still

remain, particularly when the word is extracted rom the speech circuit or

written message. At an antithetical extreme, a signifed could have many

signiiers, although this would demand unnecessary labor both rom

source and destination. Saussure strongly questioned the existence o ull

synonymy, a maniestation o a signifed with more than one signifers

(1916/1983, 162). A nomenclature with a unitary relation o signifer to

signifed would occupy an intermediate position between two extremes.

The uller example has revealed that human understanding and compo-

sition o written language may be highly intractable computationally, even

given sharp dierentiations between alternative senses and with compre-

hension and selection reduced to a rather objectifed orm. Derived rom

a monolingual dictionary, the paradigm or the infnitive orm (stand or)may give a sense o organization, clarity, and strictly limited number, and

also o words or messages or selection that await combination in the

syntagma or message. The monolingual dictionary has been regarded as

a practical anticipation o the Saussurean model o language; the head-

word and defnition are analogous to signifer and signifed, and the net-

work image o the semantics o a language is exemplifed by the mutual

defnition o words. The collections o syntagmas revealed the diversity

o speech, which may have been disguised by the appearance o fnishedorder given by the collection o paradigms, and they indirectly testifed to

the abstracting and ordering eorts o lexicographers and theorists. From

Page 154: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 154/199

Semantics and Syntactics or Retrieval rom Full Text 145

a Saussurean perspective, the sequence stood or something could be

regarded as partly belonging to the language—rom its regular pattern—

and partly to speech, while stood or plenty is clearly part o speech. Forinormation theory, the contrasts in the relative requency o occurrence

o the two dierent messages would be consistent with the strength o the

semantic, rather than syntactic, associations between internally cohesive

units.

Summary

Thus, our review and analysis o examples has indicated the value o 

understanding signifcation in written language and transormations onwritten language rom the interaction between syntagma and paradigm.

The understandings already developed imply limits to logical or compu-

tational ormalization. The syntagma has been consistently present as a

real existent, while the paradigm has tended to be realized as an analytic

construct. Practical understanding o the value o the extended syntagma,

both or searching documents and or displaying reerences, has thus ar

largely preceded any theoretical articulation.

Conclusion

We have derived specifc analytic advantages rom the congruences identi-

fed between linguistics and inormation theory. Linguistics has provided

a model or describing the relation between signifer and signifed, when

the signifer—as word or multiword sequence—is extracted rom the syn-

tagma, released into its paradigm, and reinserted in a variety o syntagmasby computational and human operations on written language or inor-

mation retrieval. The most distinctive and valuable contribution has been

an enhanced understanding o the shiting relation between expression

and content.

We developed an account o the production o meaning in written lan-

guage as interaction between the syntagma and paradigm. We approached,

but did not ully exhaust, the limits o choice oered by written language

and its computability on a semantic level. The preerence implied bycommon patterns o interactive use or nondeterministic modes o use

o systems could be strongly supported. The convergence o Saussurean

Page 155: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 155/199

146 Chapter 8

linguistics and inormation theory, inormed by an understanding o the

primitive computational operations, suggests constraints on determinis-

tic selection rom written language relevant to inormation retrieval. Atthis stage o conceptual development, constraints can be stated only as

provisional generalizations, not confrmed. To create index descriptions,

deterministic operations on written language may involve tearing words

or other syntactically dierentiated units rom the syntagma and releas-

ing them into paradigmatic multivalency. Multivalency may emerge most

clearly in syntagmas retrieved by Boolean operations in retrieval, but it

could remain present with other algorithmic procedures or description

and retrieval. Extending the syntagma cut in description or searching canreduce the potential or multivalency. A preerence or nondeterminis-

tic modes in searching can be supported theoretically as well as expe-

rientially. The human capacity or choice—earlier connected with the

etymology o intelligence (inter-legere, or to choose between) (Scholarly

and Ordinary Discourses, chapter 2)—exceeded the possibility o com-

putational modeling, particularly or combining and understanding the

syntagma. Inormation theory yielded an understanding o the relatively

requent occurrence o word and multiword sequences, extracted rom

the message.

The respective domains o linguistics and inormation theory have been

sustained, although some convergence emerged rom the dynamics o 

each discourse: in the linguistic view o speech, as marked by reedom

o combination, and in the elements o semantic selection by the source

and interpretation by the destination in inormation theory; Shannon’s

own experiments on language did not separate semantic rom syntacticprocesses (1951/1993). Insights into ull-text retrieval have been obtained

rom the correspondence o the syntagma and paradigm, and the mes-

sage and messages or selection, to processes o representation in ull text

retrieval, which could not have been obtained rom any o the discourses

in isolation. The computational transormations possible on written lan-

guage continued to be understood rom automata theory, with an inde-

pendent and consistent account given by inormation retrieval experience

and research. Practice and theory have been brought into a dialecticalrelation to each other.

Page 156: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 156/199

9Conclusion

Introduction

This chapter will review the semantics and syntactics already developed

in relation to:

preexisting theories relevant to the inormation retrieval (outlined in

chapter 1),

the labor theory approach, and

existing and emerging real-world practices.

Since the labor theoretic approach and the understanding o semantics

and syntactics are both relatively novel, the theories developed in this

book may not have explicitly inormed real-world developments. I the

theories have validity, however, real-world systems inevitably are both

constrained and enabled by considerations connected with human labor

(its costs, the ability o human semantic labor directly to address the level

o meaning, and the possibly o transerring syntactic labor to technol-ogy) and also by inherited modes o production o meaning rom written

language and the nature o the computational process. We will indicate a

potential synthesis o library and inormation science with Internet cul-

tures that emerge in practice beore being articulated in theory. Finally,

this chapter will reafrm the value o selection power and the merging

possibilities or enhanced selection.

Theories

The order o consideration or theories and practices preserves the order

o presentation throughout the book, beginning with the still-dominant

tradition o classic inormation-retrieval research.

Page 157: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 157/199

148 Chapter 9

Classic Inormation-Retrieval Research

The development o experimental inormation retrieval systems usu-

ally has occurred independent o linguistic theory. In the early 1970s areview concluded, “The most striking act to emerge rom the literature,

however, is the difculty o marrying linguistic techniques and retrieval

objectives” (Sparck Jones and Kay 1973, 197). Linguistically, very crude

procedures seemed to work quite well or retrieval (understood primarily

as the transormation o a query into a set o records) and it was unclear

what more sophisticated procedures could contribute (Sparck Jones and

Kay 1973, 197). Similarly, indexing solutions given to selection prob-

lems rom natural language owed very little to linguistics (Gardin 1973,140). Prior to more widely diused interest in metalanguages during the

late 1980s, only library schools had accepted that “[humanly assigned]

retrieval terminologies . . . [were] worthy o sustained study and research”

(Roberts 1989, 103). An understanding o languages o description—par-

ticularly those produced by algorithmic transormations on the language

o discourse—had to be constructed rom linguistics rather than imported

as an established product. In this context, this book has addressed the

 Janus-like character o inormation—amiliarly regarded as acing both

the technical world o bytes and data compression and the social world

o language and meaning (Gregory 2005)—that also requires, equally sig-

nifcant but less ully addressed, understanding rom the human and dis-

cursive as well as the mathematical and computational sciences

The value o classic experimental inormation retrieval research to cur-

rent practice and understandings is urther reduced by the understand-

ings developed. We must question the historically inherited preoccupationwith the word as a unit, and with the statistical distribution o word orms

(Zip 1936). The received idea o the word as unit o meaning should

be abandoned fnally and replaced with a more sophisticated model o 

signiication. The degree o success o inormation retrieval based on

the word unit, and the continuing utility o individual word-based tech-

niques, results rom the coincidence o the language’s semantic compo-

nent, even i understood as an associative paradigm with syntactically

separated sequences o the message. The utility or retrieval o combin-ing established techniques with developing understandings o the stability

o meaning and the requency o occurrence and applying them to more

Page 158: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 158/199

Conclusion 149

extended sequences rom the syntagma and the message—to the phrase

and the multiword sequence—has been demonstrated.

Less explicit assumptions o inormation retrieval research also demandreconsideration. The view that an index description must be brieer than

the document described can be regarded as a historically specifc product

o the storage constraints imposed by print and early modern computer

technologies. Rather neglected considerations o the depth o indexing

could be revisited. Most strongly, the assumption o deterministic modes

o interrogating systems, which correspond to the batch processing o the

1950s, can be replaced by a theoretically supported preerence or nonde-

terminism. The preerence or nondeterminism concurs with advocacy o selection power over query transormation as a design principle or inor-

mation retrieval systems, particularly when selection power is conceived

as property o human consciousness rather than o system processes.

Library and Inormation Science

From the perspective o library and inormation science, exposing

humanly assigned metadata as a historically specifc development o pre-

modernity—not ully present in orality emerging into literacy but persist-

ing into modernity as a renewed inheritance—was crucially relevant or

the labor theoretic approach and the practice o inormation retrieval.

Such recognition enables theoretical separation o the current value o 

metadata rom its persistence as an inheritance o modernity. In a synthe-

sis that combines recognition o humanly assigned metadata as a specifc

historical phase with the semantics developed, such metadata could be

regarded as gathering together dierent units rom the syntagma, yield-ing generic power (or in some less requently occurring applications,

dierentiating identical units o expression) and thus enhancing the pos-

sibilities o specifcity in searching (fgure 9.1). We suggested here that

humanly assigned metadata could maintain its value or generic capacity

(that is, collecting together disparate items or aspects o items rom the

original discourse). Currently, the separation o value rom inheritance

has occurred more requently in practice than in theory, which has been

constrained by intellectual inheritance and may require urther transor-mation (Wilson 2001).

Page 159: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 159/199

150 Chapter 9

Signifier Sign Signified

Signifier-sign-signified

Signifier /

MarkTwainSign Signified / Mark Twain

Signifier-sign-signified /

The author Mark Twain

Signifier /SamuelClemens

Sign Signified /

Samuel Clemens

Signifier-sign-signified /The private individual

Samuel Clemens

A metalanguage describing written documents taken as an object-language would treat

the entire complex of signifier, sign, and signified as its signified.

Metalanguage

Object-language

This pattern might be known as a double articulation.

If the metalanguage were algorithmically generated from the object-language it

could contain the phrases, Mark Twain and Samuel Clemens, and search could be highly

specific (but could only link the two aspects of the single individual together by listing

those terms).

Metalanguage

Object-language

Metalanguage

Object-language

If the metalanguage were purely humanly assigned (as would have been the case with

paper-based indexing), it might not distinguish between Mark Twain and Samuel

Clemens, but gather all occurrences of these terms under the heading, Mark Twain,

giving generic power in searching (but losing specificity).

Figure 9.1Meta- and object-language.

Page 160: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 160/199

Conclusion 151

Consistent with the recognition o humanly assigned metadata as a spe-

cifc historical phrase and corresponding to existing amiliar (i inormal)

practice, a transormed relation between assigned categories and languageo discourse could occur. In searching, chronology or temporal sequence

could change rom frst using metadata and then considering the language

o the original discourse to searching frst on descriptions syntactically

generated rom the language o discourse and then exploiting the prod-

ucts o human semantic description labor ound in association with the

desired language o discourse, possibly or their generic power. An anal-

ogy exists with the established (but oten inormal) practice o identiying

an item o interest and then looking at items categorized close to it—browsing enabled by collocation or human description labor. The tempo-

ral sequence o search corresponds to that o description, reinorcing the

analogy between description and searching, and is analogous to the real

historical process that moved rom linear utterance to brieer descriptions

o utterances. Giving attention frst to the language o discourse— the ull

text o a work—exploits a resource characteristically produced by human

semantic labor o greater duration and intensity than metadata, prima

acie making it the richer resource. A crucial value or human semantic

description labor—the gathering together o items related in meaning but

dissimilar in pattern—is also exposed. Practice increasingly realizes this

orm o value.

Inormation Society

The concept o inormational labor was adopted rom inormation soci-

ety discussions, with the intention o urther dierentiating the concept.A category o mental rather than inormational labor was developed, and

substantial conceptual advantages have accrued. The concept o mental 

labor implies continuity with what existed beore modern inormation 

technologies, themselves produced by mental and physical labor. A sharper

distinction between human and machine processing is implied, contrast-

ing mental (specifcally human) with inormation (human and machine).

Within human mental labor, valuable distinctions between semantic and

syntactic labor could result rom coupling the concept o mental laborwith established and historically warranted distinctions between levels

o analysis. The dierentiation o semantic rom syntactic labor contains

Page 161: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 161/199

152 Chapter 9

a distinction between human labor and potentially machine processes:

human syntactic labor potentially is transerable to technology, where it

becomes a machine process. These conceptual advantages imply that theconcept o mental labor is preerable to inormational labor. The value

o the urther distinction made between semantic and syntactic mental

labor was revealed by its ability to meaningully comprehend real world

practice.

Labor Theoretic Approach

The labor theoretic approach remained analytically revealing and suf-

cient to comprehend retrieval rom ull text, even when the activity o human description labor (which prompted development o the approach)

was analytically excluded. Comprehension was revealed by the undamen-

tal proposition that selection power is produced by selection labor, which

becomes concentrated in searching as human semantic labor. Distinct rom

labor, syntactic description and search processes could be transerred to

technology. Analytically, gathering together items related in meaning but

not strongly connected in pattern was revealed as a unique unction o 

semantic labor not easily (i at all) obtainable syntactically, which could

have value in enhancing selection power. Attention to the costs o human

labor in the labor theoretic approach reassumes relevance i the analytical

exclusion o human description labor is removed, accounting or the dis-

tribution o the products o human description labor and the transerence

o labor to searching, and thus implying the continuing development o 

these practices. The semantics developed or retrieval rom ull text also

may be relevant to the labeling decisions made by a describer concernedwith meaning o a document.

While continuing to concede that changes in practice precede theoreti-

cal explanation and that practice may embody deep orces and constraints

that are acknowledged but not explained away by theory, the value o the

theories developed can be isolated and recognized. Theoretically under-

standing a dynamic oers the possibility o inormed intervention in prac-

tice, particularly in the areas o reedom let by the determining orces.

Transerence o syntactic processes to machine may be inevitable, and ityields benefts or selection power. Since the distinctive value o human

semantic search labor in description and searching—giving generic

power—has been isolated, it can be deliberately exploited.

Page 162: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 162/199

Conclusion 153

Internet Cultures and Practice

Practical systems encompassing objects without humanly semantically

assigned metadata or without direct and independent access to that meta-data (where it exists) have prolierated. Systems inherit patterns o signif-

cation rom orality as well as rom written literacy, suggesting historically

deeper and socially wider roots compared to humanly assigned metadata.

Patterns o signifcation commonly encounter with computational proc-

esses and a crucial tension exists between the linearity o writing and cut-

ting rom a line. Through both direct use and the secondary eects o their

use, practical systems mediate to eects outside inormation retrieval.

Practical understanding o appropriate and market-accepted tech-niques or inormation retrieval rom written language has preceded

and seemingly is independent rom the theoretical elements derived here

rom Saussurean linguistics and inormation theory. However, analogies

between practices and theory and the possibility o more deliberately and

theoretically inormed developments could be established. The assumption

that relevance is a unction o the meaning o verbal orms can be qualifed

by recognizing the inuence o such eatures as quality, with consensual

notions o quality implemented in page-ranking techniques and the sig-

nifcance o socially distributed knowledge embodied in such techniques

anticipated by social epistemology (Shera 1952/1965). Google’s practice

o displaying the surrounding line o search terms in retrieved reerences

embodies an understanding o the value o the immediate syntagma in ini-

tial (but not necessarily fnal) disambiguation (Google 2005). The address

o documents might indicate the place o utterance. Indicating the place

o utterance is congruent with (but in the absence o stated addresseesdoes not completely ulfll) Vološinov’s, rather than Saussure’s, insistence

that a word obtains its meaning rom its position in a dialogue between

speakers (Vološinov 1929/1986, 65–123). The possibility o exact phrase

searching on Google.com and the use o statistically improbable phrases

by Amazon.com corresponds to theoretical recognition o the distinc-

tiveness o content and the relative inrequency o occurrence o specifc

sequences o more than one word. Searching in Google can exploit these

eatures; or Amazon.com, the eatures are presented as products o com-putationally generated descriptions that demand contrasting distributions

and types o direct human mental labor or eective use. The need or and

Page 163: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 163/199

154 Chapter 9

the possibility o humanly reiterated and refned searches suggest that the

human capacity or choice resists reduction to computational models or

logical ormulations.1

From the perspective o automata studies, directhuman choice corresponds technically to a classic sense o nondetermin-

ism, similar to human intervention in moving a halted Turing machine

to another state. Diusion (market adoption) o techniques strongly sug-

gests that inormation system consumers perceive their value (Swanson

1980).

We may have reached a practice plateau—a teleological or fnal stage

rom the theoretical considerations developed. Denying that a teleological

stage has been reached would involve reuting some particularly strongtheories, rejecting the account o signifcation derived rom Saussure,

denying inormation theory and its application to the syntactics o writ-

ten language, and reuting automata theory. The strength o the individual

theories is reinorced by their mutual convergence and by their closeness

to real-world practice.

Conclusion

The gestalt o the computer (Rosenberg 1974), understood here in the

sense o computational models and simulacra or human intelligence or

capacity or choice, is urther but indirectly eroded. As early as 1956,

Shannon had disassociated himsel rom the uncritical analogical exten-

sion o inormation theory, while not excluding possibilities or develop-

ment:

[the] basic results o the subject [inormation theory] are aimed in a very specifcdirection, a direction that is not necessarily relevant to such felds as psychol-ogy, economics, and other social sciences. . . . I personally believe that many o the concepts o inormation theory will prove useul in these other felds—and,indeed, some results are already quite promising—but the establishing o suchapplications is not a trivial matter o translating words to a new domain, butrather the slow tedious process o hypothesis and experimental verifcation. I,or example, the human being acts in some situations like an ideal decoder, this isan experimental and not a mathematical act, and as such must be tested under awide variety o experimental situations. (1956/1993, 462)

The decoder would act to transorm a received signal into a message; this

process can be both directly humanly perormed (or instance, by a Morse

Page 164: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 164/199

Conclusion 155

code telegrapher) and computationally modeled and executed. In contrast,

human activity in semantic selection rom a source and interpretation

at the destination exceeds computational modeling. Despite Shannon’sreservations—seldom directly cited—inormation theory became a main

source or communication studies (Fiske 1990, 1), uncritically assimilat-

ing Shannon’s mathematical theory o communication to Weaver’s inter-

pretation (Weaver 1949; Tidline 2004). The gestalt o the computer is

acutely maniested in cognitive science, with its insistence that “theories

o the mind should be expressed in a orm that can be modeled in a com-

puter program” (Johnson-Laird 1988, 52), without recourse to intuition,

and in the inuential linguistic modeling o the comprehension o utter-ances as a ormal system or automaton (Sperber and Wilson 1986). Even

a signifcant critique o cognitive science, or the absence o intentionality

rom computational systems, saw “no reason in principle why we couldn’t

give a machine the capacity to understand English, since in an important

sense our bodies with our brains are precisely such machines,” although

not while “the operation o the machine is defned solely in terms o com-

putational processes over ormally defned elements” (Searle 1980, 422).

We can detect a trace o a Cartesian inheritance in such comments: when

the mind is detached rom the body “the Cartesians situate the human

spirit in the pineal gland, as i it were an observatory” (Vico 1710/1988,

88). Semiotic phenomena may become difcult to interpret when they

are separated rom physical human presence, and operation over or-

mally dierentiated elements is intrinsic to a defnition o an inormation

machine (Minsky 1967) that has not yet been supplanted (Herken 1995;

Cockshott and Michaelson 2007). Drawing on the convergence o inor-mation theory and Saussurean linguistics, the argument here indicates a

strong theoretical possibility that human mental labor is irreducible to

computational models in the production and comprehension o written

utterances.

Thereore, it is possible subtly but signifcantly to shit the perspec-

tive on communication and computational models and their embodi-

ments in modern inormation technologies. Rather than being received

as substitutes or or simulacra o human intelligence, such technologiescan be regarded as the products o human mental and productive labor,

building on historically accumulated understandings and technologies.

Page 165: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 165/199

156 Chapter 9

Human mental labor can be distinguished rom the processes derived

and abstracted rom that labor that can be modeled computationally.

Abstraction o processes rom mental labor has intensifed since the earlyto mid-nineteenth century, and Babbage’s concern or mental labor and

Boole’s ormalization o logic led to Turing’s model o the computational

process (Babbage 1963, 1989; Boole 1854; Turing 1937). With modern

inormation technologies, transorming human mental labor into a tech-

nologically enacted process may enhance the exactness o procedures

(Warner 2001, 33–46), and that exactness may emerge as both increased

control over data and possibly as rigidity. A nondeterministic mode o 

use o systems—involving multiple direct human interventions and incor-porating semantic considerations—can in some circumstances retain

enhanced control without introducing rigidity.

Page 166: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 166/199

Postscript

Introduction

In this book, “human” reers explicitly to the human sciences, the disci-

plinary bases in which this book is rooted. Implicit to this point, “human”

also reers to modes o human being that now can be made explicit. A

crucial distinction can be made or modes o being, most signifcantly

counterpositioning being ully human against brute existence. The tech-

nological mode o inormation provides a secondary distinction withinbeing ully human, a distinction made ully explicit in the preceding

chapters. Thus, we now can develop a crucial question or the central

transormations in being that are occurring in connection with modern

inormation retrieval.

Modes o Being

Schematically, being ully human in a political economy and civil societycan be distinguished rom brute existence outside society—both as a his-

torical transition and or individuals once civil society exists. Discourse

evolves and then contributes reciprocally to being ully human. A capac-

ity or selection is also undamental to the contrast with brute existence

(Vico 1725/2002, 33–34). Exemplifed here in connection with legal codes,

inormation retrieval developed alongside civil society and represents one

part o the discourse essential to being ully human. Inormation retrieval

can be conducted orally and also in written literate and computational

modes. Selection in inormation retrieval represents another expression

o the human capacity or selection. The contrast o ull humanity with

Page 167: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 167/199

158 Postscript 

brute existence outside society has inormed this work, rather than being

directly addressed within it. Contemporary changes in technological

modes and inormation retrieval are conceived as an overlay on beingully human, not as a radical transormation that compares to the transi-

tion rom brute existence; an enhancement o human capacities can be

detected.

Distinctions within being ully human made by technological mode

directly inormed this book’s discussions. We traced and distinguished

orality’s emergence into literacy, written literacy, and computational

modes and acknowledged the transitional orms that occurred between

modes; each subsequent era received, encountered, and possibly modi-fed the inheritances rom previous modes. For instance, written literacy

received inheritances rom orality emerging into literacy, and the com-

putational mode received inheritances rom written literacy, including its

transormations o orality emerging into literacy. Thus, we understood

dierent technological modes as transitions or additions within being

ully human in civil society.

Human history is conceived here as material progress, with associated

changes in consciousness. The conception is consistent with a perspective

on history derived partly rom Marx, which identifes

the one element o directional change in human aairs which is observable andobjective, irrespective or our subjective or contemporary wishes and value-judg-ments, namely the persistent and increasing capacity o the human species to con-trol the orces o nature by means o manual and mental labour, technology andthe organization o production. (Hobsbawm 1998, 41)

The reality o this directional change is:

demonstrated by the growth o the human population o the globe throughouthistory, without signifcant set-backs, and the growth—particularly in the pastew centuries—o production and productive capacity. (Hobsbawm 1998, 41)

The orces or change were understood as human physical and mental

labor on the natural and humanly modifed environment. Our concern

here has ocused directly on mental labor and its products and also on the

transer o mental labor to modern inormation technologies.

The labor theory o value, which can be assimilated to this conceptiono history, has classically been more concerned with physical or productive

labor than with mental labor. By ocusing on the substitution o machine

Page 168: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 168/199

Postscript 159

processes or human labor and some methodological preerence or con-

tinuous quantitative developments, the theory can obscure the possibility

o radical qualitative change. Empirically, qualitative eects could emergerom repeated quantitative substitutions, and some discussions have rec-

ognized the possibility o radical change. Marx, or instance, noted that

industrial technology did not simply save or substitute or human physi-

cal labor but expanded human powers and activities:

or, with the help o machinery, human labour perorms actions and creates thingswhich without it would be absolutely impossible o accomplishment (Marx1858/1973, 389)

Similarly, inormation and communication technologies need not simplysubstitute or human mental labor but can extend human mental capaci-

ties. Developing considerations rom the labor theory o value has the

particular advantage o incorporating technology as machinery.

A parallel concern with changes in the conditions and possibilities or

being in relation to mental labor can be derived rom a historical perspec-

tive on communication. Specifcally in relation to the history o writing

and its eects on consciousness, the position ormulated by Ong (1982;

1986) or the transition rom orality and literacy—that writing is a tech-

nology that restructures thought—can be transormed and made both

more extensive and more precise. Ong’s position represents a particular

instance o a more general proposition that “all new intellectual tools

restructure thought” (Harris 1989), leading to a more precisely ormu-

lated and satisying question:

how does this innovation make possible or oster orms o thought which were

previously difcult or impossible? (Harris 1989, 166)

The concern with orms o thought is analogous to the possibilities or

mental labor.

A similarly precise question, connected with the conception o history

as cumulative development and with the idea o additions within ully

human being, can now be ormulated and addressed or inormation

retrieval: What central transormations in being are occurring in connec-

tion with modern inormation retrieval?

The question can be understood as a particularized instantiation and

transormation o the more general question posed by Eric Hobsbawm in

the conception o history as material progress:

Page 169: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 169/199

160 Postscript 

For, like it or not—and there are plenty o historians who don’t like it—there isone central question in history which cannot be avoided, i only because we allwant to know the answer to it. Namely: how did humanity get rom caveman

to space-traveller, rom a time when we were scared by sabre-toothed tigers to atime when we are scared by nuclear explosions—that is scared not by the hazardso nature but by those we have created ourselves? (1998, 40)

The specifc orce or change in inormation retrieval has been identifed

throughout this book as the transer o human labor to technology. In

a transormation o Hobsbawm’s question, we are concerned here with

the eects o that transer, possibly emerging rom repeated quantitative

substitutions.

The historical schema developed here—distinguishing orality emerginginto literary, written literacy or premodernity, and computational modes

or modernity—remains relevant or understanding eects on being and

consciousness. The immediate ocus is on the recent, still current, and

continuing transition rom the premodernity o written literacy to the

computational mode o modernity. Eects within the proessional prac-

tices o inormation retrieval can be distinguished rom eects or users

o inormation retrieval systems. Due to greater social diusion and the

larger number o individuals who use (rather than construct) systems, the

outside eects have greater signifcance or collective human being.

Enhanced Selection Power

A crucial eect o recent practical developments involves the enhanced

potential or recalling diverse material. While that potential has emerged

rom within inormation retrieval through repeated quantitative substitu-

tions, we now can describe it as a qualitative transormation or users andcollective being. Without the semantic labor o comprehending a complex

bibliographic system, the syntactic drudgery o transcription and copying,

or the physical work o searching through collections, everyday users may

be approaching the condition o subject bibliographers in query ormula-

tion and selection rom search results, but with some transormations and

additions in their semantic search labor. Subject bibliographies were clas-

sically regarded as highly valuable or privileged orms, yielding richness

without urther investigative labor. The richness obtainable, both beoreand ater current transitions, may be accompanied by uncertainty as to

whether the boundaries o investigation have been reached. An enhance-

ment has occurred in human capacities—understood as an overlay rather

Page 170: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 170/199

Postscript 161

than as transormation in human existence, and brought about by the

continuing desire or selection power. The enhancement brings certain

penalties: a change (and under certain circumstances an increase) in searchlabor in the use o retrieval systems, and at urther remove an accentua-

tion o the burden o the knowable.

Conclusion

The provision o enhanced selection power and its theoretical valuing

provides a deep eect—the restoration o man as an artifcer and the rec-

ognition o the subtlety o the processes involved in inormation retrieval.Rather than being subjected to a retrieval process beyond immediate con-

trol, a searcher is presented with an enhanced capacity or choice and,

with certain systems, or making sets rom material recalled. The new

and historically unprecedented potential or enhanced orms o knowing

existing textual material then can be explored productively. For instance,

directly relevant to understanding language (although with eects or

inormation retrieval), the unrivalled opportunity o ull text database or

exploring the semantic mutability o written word orms with dierent

contexts now can be pursued. Rather than “Waiting or Godot [while ail-

ing] to grasp what is now within reach” (Swanson 1988, 97), we can begin

to explore the potential or improving human interaction with recorded

knowledge, thus becoming more ully human.

Page 171: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 171/199

Page 172: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 172/199

Notes

Chapter 1

1. One student’s comment on the schema within the labor theoretic approachwas: “It is the simplicity the schema brings to the reader that will aid them inincreasing their knowledge in this area as the schema can always be rememberedin its entirety due to its excellent minimalist structure.” (Written comments roma student taking Inormation Policy, Queen’s University Belast, 2007.)

Chapter 3

1. Database management systems could be absorbed into the perspective con-structed, and the commonality and contrast they have with inormation retrievalsystems—the use o descriptions but with dierent schema or their construc-tion—will be indicated. The basis or the distinction between inormation re-trieval and database management systems also appears theoretically weak whenmade explicit.

The labor theoretic approach can reveal commonalities and a dierence be-tween inormation retrieval and database management systems (central concerns

o the related subjects, although they are disjunct disciplines or communities o study, o inormation science and inormation systems (Ellis, Allen, and Wilson1999)).

Substantial commonalities are discernible. Databases may be similarly intendedto enlarge the selection power o the searcher. Selection power is also created byselection labor, disaggregated into search and description labor. Syntactic aspectso description and searching have been transerred to technology, and commontechnology is subject to similar possibilities and constraints o computability. Asimilarity in costs could be assumed.

A dierence would lie in the use o dierent schemas or description, with da-

tabase theory and practice dominated by the entity-attribute model. The model istraceable to distinctions classically made in philosophy and in ormal and math-ematical logic, in contrast to the emergence o description schemas or docu-ment description rom more practical experience. Common, or at least strongly

Page 173: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 173/199

164 Notes

analogous, concerns could be ound within the dierent description schemas—consider, or instance, the apparent parallel between normalization o names andreduction o authors’ names to canonical orms.

The distinction o attribute rom entity has been questioned in philosophy andwas parodied by Lewis Carroll, who had a scholarly career as the symbolic logi-cian under his given name, Charles Dodgson (see fgure N3.1).

The distinction has also been difcult to sustain in database practice.Substantial commonalities and a signifcant contrast crucial to distinction be-

tween inormation retrieval and database management systems have been iso-

lated and clarifed.

2. I am indebted to a student or this observation.

3. W. W. Greg’s comments ater completing A Bibliography o the English Printed Drama to the Restoration, the ‘‘product o a lietime o study,’’ are worth recallingwith regard to the prolonged labor o searching, the accumulation o expertise,the subjectively experienced limitations o that expertise in ully comprehendingthe topic sought, and the modifcation o originally held intentions though inter-action with the material discovered:

lest anyone should think that looking back on my work I eel any complacency

over the manner o its execution, I here admit that I can hear the caustic criticwho ever sits like a amiliar imp at my elbow maintaining that my problem in

Figure N3.1Parody o entity-attribute distinction. Source: Carroll 1865/1998, 58.

Page 174: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 174/199

Notes 165

writing this introduction has been threeold: frst to discover what in act I havedone, next why I did it, and lastly how best it may be deended.

The “decision to limit the work to printed plays was one o convenience orexpediency” (1959, v).

4. Some riddles inormally anticipate the Boolean operators, particularly AND(consider the Anglo-Saxon examples in Hamer 1970, 95–107), but also OR andNOT.

Chapter 4

1. Selection power ➞ Selection labor would be valid in truth-table terms, where

Selection power is False and Selection labor is True; or NOT-Selection powerAND Selection labor; ¬ Selection power ^ Selection labor. Diagrammatically, therelation between Selection labor and Selection power could be represented as:

2. Semantic labor➞Human labor is true or NOT-Semantic labor AND Humanlabor, but the number o instances likely would decrease with modern technolo-gies, and the quantity o human labor (which is not semantic labor) would dimin-ish accordingly. Diagrammatically, the relation between Human labor, Semanticlabor, and Syntactic labor or process could be represented as:

3. The assertion sign ( ), which can be read verbally as “it is true that …” is de-liberately omitted rom the assertion o the primitive terms, selection power andselection labor, and other propositions in the summary sequence. Its omission isin accord with the redundancy theory o truth itsel traceable to at least the sev-enteenth century and Honoré Fabri (Vico 1710/1988, 145-146), which receivedclear modern expression by Wittgenstein:

Frege’s assertion sign ‘ ’ is logically altogether meaningless; in Frege (and Russell)

it only shows that these authors hold as true the propositions marked in thisway.

Selection labor

Selectionpower

Human labor

Semantic

labor

Syntactic labor

or process

Page 175: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 175/199

166 Notes

‘ ’ belongs thereore to the propositions no more than does the number o the proposition. A proposition cannot possibly assert o itsel that it is true.(1922/1981, § 4.442)

Denying the possibility that a proposition can assert its own truth is consis-tent with a view o logic and technology as human constructions rather thanautonomous existents and emphasizes selection power as a property o humanconsciousness assisted by, but not inhering in, system design.

4. The possibility o signifcantly diering costs or wage rates or comparableorms o direct human labor, particularly between dierent geopolitical regions,is acknowledged but not directly addressed.

5. Thirty our tokens o the intended type, or reerence to the passage in theGrundrisse, were recalled, with no unintended recalled descriptions (Google

2007c).

6. Johnson did visit libraries at Oxord as part o his work on the Dictionaryo the English Language. Some twenty years ater the publication o that work, Johnson also wrote an entry or the title page o another work attributed to him:An Account o an Attempt to Ascertain the Longitude at Sea, by an Exact Theoryo the Variation o the Magnetical Needle, “in the great Catalogue … with hisown hand” (Boswell 1791/1980, 190, 194).

Chapter 5

1. A close analog to a Googlewhack (see http://www.googlewhack.com/).

2. “Direct semantic ratifcation” is a slightly technical term—originally rom an-thropology (Goody and Watt 1968, 29)—that reers, in part, to the possibility o dialog between communicators, present in interactive oral communication andrestricted or delayed with written communication. This dialog is reemerging inelectronic communication as still absent at the point o production o utterancesbut with dialogic response characteristically much more rapid than response orwritten communication (Warner 2004, 57–67).

3. As Ambrose Bierce noted in his The Devil’s Dictionary.

Quotation, n. The act o repeating erroneously the words o another. The wordserroneously repeated.

Intent on making his quotation truer,He sought the page inallible o Brewer,Then made a solemn vow that he would beCondemned eternally. Ah, me, ah, me! (1906/1989, 260).

Chapter 6

1. Inormation theory was not available to Saussure in the early twentieth cen-tury, beore its ormalization in 1948 (Shannon 1948/1993), but would be “a

Page 176: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 176/199

Notes 167 

science which treats combination rather like algebraic equations” and would un-derstand the linking o “sound types” (Saussure 1916/1983, 51) or other symbolsin succession as transition possibilities and probabilities.

2. I am indebted to Luigina Ciolf o the University o Limerick or inorming meo this pedagogic technique.

3. For an explication and exemplifcation o the distinction o signifer, sign, andsignifed, and their role in signifcation when considered in relation to writtenlanguage, see Warner 1994, 9–15.

Chapter 7

1. Cryptography is understood as the transormation o an identifed and inter-cepted signal by the receiver into a message or passing to the destination. ForShannon, “[]rom the point o view o the cryptoanalyst, a secrecy system . . .[was] almost identical with a noisy communication system” and the cryptogramwas “analogous to the perturbed signal” (1949/1993, 113).

2. A search or the string “cohesive group o letters with strong” in May 2007retrieved fve documents captured by Google, seven by Google Scholar, and ourby Google Book Search, with some duplication between Google and GoogleScholar—a version o Shannon’s own article was recalled by Google (Google2007a, b, c).

3. Text messaging also embodies a practical, rather than theoretically inormed,understanding o the possibility o reducing redundancy, while drawing on mod-els in oral speech. Consider, or instance, the sequence Thnx 4 inrmtn thry.

Chapter 8

1. The analytic advantages that might ollow directly or linguistics and inor-mation theory are not our primary concern here, and we will address them onlysummarily.

For linguistics, we have confrmed the syntagm as a primary existent and theparadigm as an analytic construct while considering computational transorma-tions on written language, as well as the direct experience o oral and writtenutterances. In the priority given to language over the utterance, and in the un-derstanding o the paradigm as a primary existent, we have detected elements o “abstract objectivism” (Vološinov 1929/1986, 79–80) in Saussure but have notrejected central Saussurean concepts. We urther demonstrated the analytic valueo distinguishing between syntagm and paradigm or inormation retrieval romull text, indicating its validity and strongly suggesting that Saussurean linguisticsare relevant in inormation science.

The original orm o inormation theory, including its restriction to expression,has been careully preserved, although elements o potentially semantic selectionand interpretation have emerged in Shannon’s own model.

Page 177: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 177/199

168 Notes

For the conjunction o linguistics and inormation theory, we dierentiatedthe issues o the signifed (linguistics) rom the relative requency (inormationtheory) o sequences, i not ully substantively, then analytically. The difculty o 

substantively separating issues o expression and content rom requency, and theimplied increased convergence between concepts rom linguistic and inormationtheory, may reect a common historical inheritance or practical implementationo communication that preceded theoretical abstraction, particularly or writtenlanguage.

2. I am indebted to Howard Rosenbaum o Indiana University or a communica-tion o this pedagogic technique.

3. Some commentators, notably Wittgenstein, would qualiy this use o primi-tive:

When we have rightly introduced the logical signs, the sense o all their com-binations has been already introduced with them: thereore not only “ p v q” butalso “~(p v ~q)”, etc. etc. We should then already have introduced the eect o all possible combinations o brackets; and it would then have become clear thatthe proper general primitive signs are not “ p v q”, ”( x) . x”, etc., but the mostgeneral orm o their combinations. (1922/1981, §5.46).

4. A student’s comment on this analysis was, “The thematic concepts are pow-erul and systematic tools, which reach almost all the acets o the inormationretrieval system.” (Written comments rom a student taking Communicating Elec-

tronically, Queen’s University Belast, 2002.)5. The difculty or inability o the human mind to generate random sequenceswas exploited by Shannon in a machine that could anticipate human choices incalling heads or tails (1953/1993; Sloane and Wyner 1993, xvii).

6. Consider the traces o the process o production implied in Mark Twain’sfctional comments in A Connecticut Yankee in King Arthur’s Court , on receivingMorse code telegraphy in enciphered orm, “and then came a click that was as a-miliar to me as a human voice; or Clarence had been my own pupil” (1889/1996,479), and the synchronic access to the process o communication when intercept-ing signals or deciphering Enigma-produced messages in the 1939–1945 war.

7. James Murray was the original editor o the New English Dictionary, whichbecame known as the Oxord English Dictionary (Murray 1978).

8. Figures give lightly edited selections or the opening reerences retrieved romGoogle 2005.

Chapter 9

1. A comparable development can be detected in translation systems that increas-

ingly present selections o words or human choice (Pym 2003). The congruencebecomes particularly clear i translation is conceived as choosing complexes o signifer and signifed rom paradigms o the target language to assign to syntag-mas in the source text.

Page 178: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 178/199

Supplementary Readings

Readings that urther develop or similarly expound the approach developed inthis work are limited in number by the relative novelty o the themes expoundedor both the labor theoretic approach and or the application o Saussurean lin-guistics and Shannon’s inormation theory to understanding retrieval rom writ-ten discourse.

Some signifcant sources that have been adapted to the themes developed hereare indicated.

The most productive way o deepening understanding o themes and conceptsexpounded would be dialectic exploration and testing them against a variety o 

widely available inormation systems.

Chapter 2: Selection Power and Selection Labor

For the dialogue between query transormation and selection power:

Warner, J. 2000. “In the Catalogue Ye Go or Men: Evaluation Criteria or In-ormation Retrieval Systems.” Aslib Proceedings 52: 76–82. Also InormationResearch. July 4, 1999. Available at: http://inormationr.net/ir/4-4/paper62.html.

For a critique o the value o providing all the relevant inormation, placed in a

uller intellectual context:Wilson, P. 1996b. “Interdisciplinary Research and Inormation Overload.” Li-brary Trends 45: 192–203.

Chapter 3: Description and Search Labor

For a discussion o the costs o creating descriptions and o the monetary valuesattached to the products o description labor:

Hayes, R. M. 2000. “Assessing the Value o a Database Company. In The Web o Knowledge: A Festschrit in Honor o Eugene Garfeld, ed. B. Cronin and H. B.Atkins, Medord, NJ: Inormation Today, Inc., 73–84.

The text o the judgment o the United Supreme Court in Feist v. Rural can beread.

Page 179: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 179/199

170 Supplementary Readings

Feist. 1991. Feist Publications, Inc. v. Rural Tel. Service Co., Inc. 499 U.S. 340.

The cited narratives involving the determination o identity can be instructivelyread.

Hardy, T. 1888/1976. “The Three Strangers.” In Wessex Tales. Introduction andnotes by F. B. Pinion. London: Macmillan, 13–36.

Henry, O. 1910/1993. “The Theory and the Hound.” In Selected Stories, ed. withan introduction by G. Davenport. New York: Penguin Books, 398–406.

Chapter 5: Retrieval rom Full Text

Vološinov’s discussion o the single all-meaning word used by prehistoric man

and its “[m]ultiplicity o meanings” (1986, 100–101) orms a more ormal coun-terpart to Lewis Carroll and Humpty Dumpty:

Vološinov, V. N. 1986. Marxism and the Philosophy o Language. First publishedin 1929. Trans. by Ladislav Matejka and I. R. Titunik. New York and London,Seminar Press, 80–106 (particularly, 100–102).

Chapter 6: A Semantics or Retrieval rom Full Text

Syntagmatic and associative (paradigmatic) relations as discussed by Saussure:

Saussure, F. de. 1983. Course in General Linguistics, Part Two, Chapters V–VII.First published 1916. C. Bally and A. Sechehaye, eds., with the collaboration o A.Riedlinger. Trans. and annotated by R. Harris. London: Duckworth, 121–136.

The pedagogic task o cutting words rom a sheet o written paper (frst cutting acontinuous ribbon and then individual words) in order to grasp the distinction o syntagm rom paradigm, is a valuable learning exercise.

For a valuable commentary on Saussure:

Harris, R. 1987. Reading Saussure: A Critical Commentary on the Cours de Lin- guistique Générale. London: Duckworth.

Chapter 7: A Syntactics or Retrieval rom Full Text

In a rather neglected piece originally published in the Encyclopedia Britannica, Claude Shannon gives a nontechnical and widely intelligible account o inorma-tion theory:

Shannon, C. E. 1993. “Inormation Theory.” In Claude Elwood Shannon: Col-lected Papers, ed. N. J. A. Sloane and Aaron D. Wyner. First published in 1968.Piscataway, NJ: IEEE Press, 212–220.

Nontechnical readers can still proft rom reading Shannon’s original paper onprediction and entropy in printed English:

Page 180: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 180/199

Supplementary Readings 171

Shannon, C. E. 1993. Prediction and Entropy o Printed English. Claude Elwood Shannon: Collected Papers, ed. N. J. A. Sloane and Aaron D. Wyner. First pub-lished in 1951. Piscataway, NJ: IEEE Press, 194–208.

Chapter 8: Semantics and Syntactics or Retrieval rom Full Text

Roland Barthes gives a very clear account o crucial semiotic distinctions:

Barthes, R. 1984. “Elements o Semiology.” In Writing Degree Zero & Elementso Semiology, R. Barthes, ed. Trans. A. Lavers and C. Smith. London: JonathanCape, 75–172.

For an exposition o the distinction o signifer, sign, and signifed:

Warner, J. 1994. From Writing to Computers. London: Routledge, 915.The relation o object- to metalanguage and the signifcance o the costs o hu-man description labor could be considered in relation to proposals or the Se-mantic Web.

Berners-Lee, T., J. Hendler, and O. Lassila. 2001. “The Semantic Web: A New Formo Web Content That is Meaningul to Computers Will Unleash a Revolutiono New Possibilities.” Scientifc American, May 2001. Available at: http://www.sciam.com/article.cm?articleID=00048144-10D2-1C70-84A9809EC588EF21.

Page 181: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 181/199

Page 182: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 182/199

Bibliography

Amazon.com 2005. “Amazon.com Statistically Improbable Phrases.” RetrievedMay 2, 2005, rom http://www.amazon.com/.

Amazon. 2007. “Amazon.com.” Retrieved June 27, 2007, rom http://www.ama-zon.com/.

Aristotle. The Ethics o Aristotle: The Nicomachean Ethics. Translated by J. A.K. Thomson, revised with notes and appendices by Hugh Tredennick, introduc-tion and bibliography by Jonathan Barnes. London: Penguin Group, 1976. Firstpublished 323 BC.

———. The Politics and The Constitution o Athens. Stephen Everson, ed. Cam-bridge: Cambridge University Press, 1996. First published 323 BC.

Augustine. Conessions. Translated with an introduction by R. S. Pine-Cofn. Lon-don: Penguin Group, 1961. First published 398 BC.

Ayer, A. J. Language, Truth and Logic. Harmondsworth, England: Penguin Books,1980. First published 1936 by Victor Gollancz.

Bacon, F. “O studies.” In The Essays. Edited with an introduction by J. Pitcher.Harmondsworth, England: Penguin Books, 1985. First published 1625, printedby Iohn Haviland or Hanna Barret and Richard Whitaker.

Babbage, C. On the Economy o Machinery and Manuactures, 4th edition. Lon-don: Charles Knight, 1835. In Reprints o Economic Classics. A. M. Kelley: NewYork, 1963

———. Science and Reorm: Selected Works o Charles Babbage. Edited with anintroduction by A. Hyman. Cambridge: Cambridge University Press, 1989.

Barraclough, E. D. “Online Searching in Inormation Retrieval,” Journal o Docu-mentation, Vol. 33 (1977), pp. 220–238.

Barthes, R. “Elements o Semiology.” Writing Degree Zero & Elements o Semi-ology, 75–172. Translated by A. Lavers and C. Smith, London: Jonathan Cape,

1984.Bath University Library. Inormation Requirements o Researchers in the Social Sciences, R. Barthes, ed. Bath: Bath University Library, 1971.

Page 183: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 183/199

174 Bibliography

Belkin, N. J. and A. Vickery. Interaction in Inormation Systems: A Review o Research rom Document Retrieval to Knowledge-Based Systems (Library and Inormation Research Report 35). London: British Library, 1985.

Bell, E. T. Men o Mathematics. London: Penguin Books, 1937.

Berger, T. “Lossy Source Coding.” In Inormation Theory: 50 Years o Discov-ery, edited by S. Verdú and S. W. McLaughlin, 649–679. Piscataway: Wiley-IEEEPress, 2000.

Bergin, T. G. and M. H. Fisch. “Introduction.” In G. Vico. The New Science o Giambattista Vico. Unabridged translation o the 3rd edition (1744) with the ad-dition o “Practice o the New Science,” xix–xlv. Translated by T. G. Bergin andM. H. Fisch. Ithaca, NY: Cornell University Press, 1976.

Berners-Lee, T., J. Hendler, and O. Lassila. 2001. “The Semantic Web: A NewForm o Web Content that is Meaningul to Computers Will Unleash a Revolutiono New Possibilities.” Scientifc American, (May 2001). Available at: http://www.sciam.com/article.cm?articleID=00048144-10D2-1C70-84A9809EC588EF21.

Bierce, A. The Enlarged Devil’s Dictionary. Researched and edited by E. J. Hop-kins. Preace by J. M. Myers. London: Penguin Books, 1989. First published 1906by Doubleday, Page, & Company as The Cynics’s Word Book.

Blair, D. C. “Knowledge Management: Hype, Hope, or Help?”   Journal o theAmerican Society or Inormation Science and Technology, Vol. 53 (2002), pp.1019–1028.

Bochenski, I. M. History o Formal Logic. Translated by I. Thomas. Indiana:Notre Dame, 1961.

Boole, G. An Investigation o the Laws o Thought. On Which are Founded theMathematical Theories o Logic and Probabilities. London: Walton and Maberly,1854.

Boolos, G. S. and Jerey, R. C. Computability and Logic, 3rd edition. Cambridge:Cambridge University Press, 1989.

Borges, J. L. The Book o Imaginary Beings. J. L. Borges with M. Guerrero. Re-vised, enlarged and translated by N. T. di Giovanni in collaboration with theauthor. London: Penguin Books, 1974.

Boswell, J. Lie o Johnson. Edited by R. W. Chapman and revised by J. D. Flee-man. Introduction by P. Rogers. Oxord: Oxord University Press, 1980. Firstpublished 1791, printed by Henry Baldwin, or Charles Dilly.

Bradord, S. C. Documentation, 2nd edition. High Wycombe, Bucks: UniversityMicroflms or the College o Librarianship Wales, 1971. First published 1948 byC. Lockwood.

Buckland, M. “Inormation as Thing,” Journal o the American Society or Inor-mation Science, Vol. 42 (1991), pp. 351–360.

Buckland, M. and C. Plaunt. “On the Construction o Selection Systems,” LibraryHi Tech, Vol. 12 (1994), pp. 15–28.

Page 184: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 184/199

Bibliography 175

Carroll, L. “Alice’s Adventures in Wonderland.” In L. Carroll, Alice’s Adventuresin Wonderland and  Through the Looking-Glass and What Alice Found There, edited with an introduction and notes by H. Haughton, 1–110. London: Penguin

Books, 1998. First published 1865 by Macmillan and Co..———. “Through the Looking Glass, and What Alice Found There.” In L. Car-roll, Adventures in Wonderland and  Through the Looking-Glass and What AliceFound There, edited with an introduction and notes by H. Haughton, 111–246.London: Penguin Books, 1998. First published 1872 by Macmillan and Co.

Chandler, R. Raymond Chandler Speaking. D. Gardiner and K. S. Walker, eds.Introduction by P. Skenazy. Berkeley: University o Caliornia Press, 1997. Firstpublished 1962 by Houghton Miin.

Cherry, C. On Human Communication: A Review, A survey, and A Criticism. 3rd

ed. Cambridge, Mass: MIT Press, 1978.Childe, V. G. Society and Knowledge. London: George Allen & Unwin, 1956.

Cleverdon, C. W. ASLIB Cranfeld Research Project: Report on the Testing and Analysis o an Investigation into the Comparative Efciency o Indexing Systems.Cranfeld: College o Aeronautics, 1962.

Cleverdon, C., J. Mills, and M. Keen. Factors Determining the Perormance o Indexing Systems. Cranfeld: College o Aeronautics, 1966.

Cockshott, P. and G. Michaelson. “Are There New Models o Computation? Re-ply to Wegner and Eberbach,” The Computer Journal, Vol. 50 (2007), pp. 232–247.

Collins, R. The Sociology o Philosophies: A Global Theory o Intellectual Change. Cambridge, Mass: Belknap Press o Harvard University Press, 1998.

Copyright Decisions. Decisions o the United States Courts Involving Copyright and Literary Property 1789–1909. Washington: Copyright Ofce, Library o Congress, 1909.

Culler, J. Saussure. London: Fontana Press, 1988.

Darwin, C. The Origin o Species by Means o Natural Selection or The Preserva-

tion o Favoured Races in the Struggle or Lie. London: Penguin Group, 1968. Firstpublished 1859 by John Murray.

Davis, M. “Inuences o Mathematical Logic on Computer Science.” In The Uni-versal Turing Machine: A Hal Century Survey, edited by R. Herken, 315–326.Oxord: Oxord University Press, 1988.

Davis, M. The Universal Computer: The Road rom Leibniz to Turing. NewYork: W. W. Norton, 2000.

Dickens, C. The Personal History o David Copperfeld . Oxord: Oxord Univer-sity Press, 1981. First published 1850 by Bradbury & Evans.

———. Hard Times. Oxord and New York: Oxord University Press, 1989. Firstpublished 1854 by Bradbury & Evans.

Page 185: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 185/199

176 Bibliography

———. Great Expectations. London: Cumberlege, 1946. First published 1861 byChapman and Hall.

Dijkstra, E. W. “Letters to the Editor: Go To Statement Considered Harmul (Let-ters to the Editor),” Communications o the ACM, Vol. 11 (1968), pp.147–148.

Donovan, E. The Natural History o British Quadrapeds; Consisting o Coloured Figures, Accompanied with Scientifc and General Descriptions, o All the SpeciesThat Are Known to Inhabit the British Isles . . . and Also Such as Are Clearly Au-thenticated to Have Been Originally Indigenous, But Are Now Extirpated, or Be-come Extremely Rare; the Whole Arranged in Systematic Order, Ater the Mannero Linnaeus. London: Printed or the Author, and or F. C. and J. Rivington, 1820.

Ellis, D. “Theory and Explanation in Inormation Retrieval Research,” Journal o Inormation Science, Vol. 8 (1984), pp. 25–38.

———. “The Physical and Cognitive Paradigms in Inormation Retrieval Re-search,” Journal o Documentation, Vol. 48 (1992), pp. 45–64.

———. Progress and Problems in Inormation Retrieval . London: Library As-sociation, 1996.

Ellis, D., D. Allen, T. Wilson. “Inormation Science and Inormation Systems:Conjunct Subjects Disjunct Disciplines,” Journal o the American Society or In-ormation Science, Vol. 50 (1999), pp. 1095–1107.

Ellis, D. and A. Vasconcelos. “Ranganathan and the Net: Using Facet Analysis toSearch and Organise the World Wide Web,” Aslib Proceedings, Vol. 51 (1999),pp. 3–10.

Feist Publications, Inc. v. Rural Telephone Service Co., Inc., 499 U.S. 340(1991).

Fiske, J. Introduction to Communication Studies. 2nd edition. London: Routledge,1990.

Freud, S. Jokes and Their Relation to the Unconscious. Translated and edited by J.Strachey and A. Richards. Harmondsworth, England: Penguin Books, 1976. Firstpublished 1905 by Deuticke.

Gardin, J-C. “Document Analysis and Linguistic Theory,” Journal o Documenta-tion, Vol . 29 (1973), pp. 137–168.

Gardner, M. Logic Machines and Diagrams, 2nd edition. Brighton: Harvester,1983.

Goody, J. and I. Watt, “The Consequences o Literacy.” In Literacy in Traditional Societies, edited by J. Goody, 27–68. Cambridge: Cambridge University Press,1968.

Google. 2004. Google Advanced Search. Retrieved December 17, 2004, romhttp://www.google.co.uk/advanced_search?hl=en.

———. 2005. Google Advanced Search. Retrieved May 5, 2005, rom http://www.google.co.uk/advanced_search?hl=en.

———. 2007a. Google. Retrieved June 27, 2007, rom http://www.google.co.uk/ advanced_search?hl=en.

Page 186: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 186/199

Bibliography 177 

———. 2007b. Google Advanced Image Search. Retrieved June 27, 2007, romhttp://images.google.com/advanced_image_search?hl=en.

———2007c. Google Book Search. Retrieved July 25, 2007, rom http://books.google.co.uk/advanced_book_search

———. 2008. Google Advanced Scholar. Retrieved June 2, 2008, rom http:// scholar.google.com/advanced_scholar_search

Gosden, C. Prehistory: A Very Short Introduction. Oxord: Oxord UniversityPress, 2003.

Greg, W. W. A Bibliography o the English Printed Drama to the Restoration,Volume IV. London: printed or the Bibliographical Society at the UniversityPress, Oxord, 1959.

Gregory, V. L. “Review o J. Warner. Humanizing Inormation Technology,” LibraryResources & Technical Services, Vol. 49 (2005), pp. 59–60.

Griesbaum, J. 2000. Evaluierung Hybrider Suchsysteme im WWW. iDiplomarbe-it, Universität Konstanz Inormationswissenschat [Evaluation o Hybrid InternetSearch Systems]. Retrieved February 21, 2007, rom http://www.in.uni-konstanz.de/~griesbau/fles/evaluierung_hybrider_suchsysteme_im_www.pd.

Hardy, T. “The Three Strangers.” In Wessex Tales, 13–36. Introduction and notesby F. B. Pinion. London: Macmillan, 1976. First published 1888 by MacMillan.

Hamer, R. A Choice o Anglo-Saxon Verse. London: Faber and Faber, 1970.

Harris, R. Reading Saussure: A Critical Commentary on the Cours de LinguistiqueGénérale. London: Duckworth, 1987.

———. “How does writing restructure thought?” Language and Communica-tion, Vol. 9 (1989), pp. 99–106.

———. Signs o Writing . Routledge: London and New York, 1995.

Hayes, R. M. “Assessing the Value o a Database Company.” The Web o Knowl-edge: A Festschrit in Honor o Eugene Garfeld, edited by B. Cronin and H. B.Atkins, 73–84. Medord, NJ: Inormation Today, Inc., 2000.

Heine, M. H. “The ‘Question’ as a Fundamental Variable in Inormation Sci-

ence.” Theory and Application o Inormation Research (Proceedings o the Sec-ond International Research Forum on Inormation Science, 3–6 August 1977,Royal School o Librarianship, Copenhagen), edited by O. Harbo and L. Kajberg,137–145. London: Mansell, 1980.

Henry, O. “The Theory and the Hound.” In Selected Stories, edited with an in-troduction by G. Davenport, 398–406. New York: Penguin Books, 1993. Firstpublished 1910 by Doubleday, Page & Co..

Herken, R., ed. The Universal Turing Machine: A Hal-Century Survey, 2nd edi-tion. Wien and New York: Springer-Verlag, 1995.

 Johnson, S. “Preace to the Dictionary.” In Johnson’s Dictionary: A Modern Se-lection, edited by E. L. McAdam and G. Milne, 3–29. London and Basingstoke:Macmillan, 1982a. First published 1755, printed by W. Strahan, or J. and P.Knapton [etc.].

Page 187: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 187/199

178 Bibliography

———. Johnson’s Dictionary: A Modern Selection, edited by E. L. McAdam andG. Milne, eds. London and Basingstoke: Macmillan, 1982b. First published 1755,printed by W. Strahan, or J. and P. Knapton [etc.].

———. A Dictionary o the English Language [Electronic Resource]: on CD-ROM. Ed. A. McDermott. Cambridge: Cambridge University Press, 1996.

 Johnson-Laird, P. N. The Computer and the Mind: An Introduction to CognitiveScience. London: Fontana, 1988.

Linné, C. The Animal Kingdom, or Zoological System, o the Celebrated SirCharles Linnaeus. Class I. Mammalia: Containing a Complete Systematic De-scription, Arrangement, and Nomenclature, o All the Known Species and Variet-ies. . . . Edinburgh: Printed or A. Strahan, T. Cadell, 1792.

Liversidge, A. “Profle o Claude Shannon.” In Claude Elwood Shannon: Collect-ed Papers, edited by N. J. A. Sloane and Aaron D. Wyner, xix–xxxiii. Piscataway,NJ: IEEE Press, 1993.

Lyotard, J-F. The Postmodern Condition: a Report on Knowledge. Translatedby G. Bennington and B. Massumi. Minneapolis: University o Minnesota Press,1984.

MacKenzie, D. Mechanizing Proo: Computing, Risk, and Trust . Cambridge,Mass: MIT Press, 2001.

Marx, K. “The Eighteenth Brumaire o Louis Bonaparte.” In K. Marx. Surveysrom Exile, edited and introduced by David Fernbach, 143–249. Harmonds-worth, Eng: Penguin Books in association with New Let Review, 1973. Firstpublished 1852 in Die Revolution.

———. Grundrisse: Foundations o the Critique o Political Economy  (RoughDrat). Translated with a oreword by Martin Nicolaus. London: Penguin Booksin association with New Let Review, 1973. Written c.1858 and frst published1939 by Verlag ür Fremdsprachige Literatur.

———. Capital: A Critique o Political Economy, Volume One. Introduced by E.Mandel and translated by B. Fowkes. London and New York: Penguin Books inassociation with New Let Review, 1976. First published 1867 by O. Meissner

and L. W. Schmidt.———. Capital: A Critique o Political Economy, Volume Three. Introduced byE. Mandel and translated by B. Fowkes. London and New York: Penguin Booksin association with New Let Review, 1981. First published 1894 by Meissner.

McKenzie, D. F. “Speech-Manuscript-Print,” The Library Chronicle o the Uni-versity o Texas, Vol. 20 (1990), pp. 87–109.

McLeod, N. S., J. Perelman, and W. B Johnston. Horse Feathers. Hollywood:Paramount, 1932.

Minsky, M. Computation: Finite and Infnite Machines. Englewood Clis, NJ:Prentice-Hall, 1967.

Montgomery, C. A. “Linguistics and Inormation Science,” Journal o the Ameri-can Society or Inormation Science, Vol. 23 (1972), pp. 195–219.

Page 188: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 188/199

Bibliography 179

Moradi, H., J. W. Grsymala-Busse, and J. A. Roberts. “Entropy o English Text:Experiments with Humans and a Machine Learning System Based on RoughSets,” Inormation Sciences, Vol. 104 (1998), pp. 31–47.

Morrison, D. “Index to The Times,” Journal o Documentation, Vol. 42 (1986),pp. 189–195.

Murra, K. O. “History o Some Attempts to Organize Bibliography Internation-ally.” In Bibliographic Organization: Papers Presented Beore the Fiteenth An-nual Conerence o the Graduate Library School July 24–29, 1950, edited by J. H.Shera and M. E. Egan 24–53. Chicago: University o Chicago Press, 1951.

Murray, K. M. E. Caught in the Web o Words: James A. H. Murray and the Ox-ord English Dictionary. New Haven: Yale University Press, 1978.

Njal. Njal’s Saga. Translated with an introduction by M. Magnusson and H. Páls-son. Harmondsworth: Penguin Books, 1960. First published 1280.

Ong, W. J. Orality and Literacy: the Technologizing o the Word . London andNew York: Methuen, 1982.

———. “Writing is a Technology That Restructures Thought.” In The WrittenWord: Literacy in Transition (Wolson College Lectures 1985), edited by G. Bau-mann, 23–50. Clarendon Press: Oxord, 1986.

Orwell, G. “Why I Write.” In G. Orwell. Collected Essays, Journalism and Letters, Volume I, edited by S. Orwell and I. Angus, 23–30. Harmondsworth, England:Penguin Books, 1970. First published 1946 in Gangrel .

Oxord Dictionaries. Oxord English Dictionary, C. Soanes and A. Stevenson,eds. Oxord University Press, 2005. Also available at http://dictionary.oed.com.

Palmer, S. Palmer’s Index to The Times: Spring Quarter. Richmond House, Shep-perton on Thames: Samuel Palmer, 1885.

Pierce, J. R. An Introduction to Inormation Theory: Symbols, Signals and Noise,2nd edition. New York: Dover Publications, 1980.

Plutarch. The Rise and Fall o Athens: Nine Greek Lives by Plutarch. Trans-lated with an introduction by I. Scott-Kilvert. Harmondsworth, England: Penguin

Books, 1960. First published in 100 BC.Pow, T. “Did Englishmen Eat Each Other?: In the 150th Anniversary Year o Sir  John Franklin’s Death, Tom Pow Considers Why His Failed Endeavour Over-shadowed the Greater Achievements o the Orcadian Explorer John Rae,” Scot-land, January 26, 1997, p. 8.

Powers, T. 2005. “Black Arts,” New York Review o Books, May 12, 2005, pp.21–25.

Pym, A. “Redefning Translation Competence in an Electronic Age,” Meta, Vol.48 (2003), pp. 481-497.

Quine, W. V. O. “New Foundations or Mathematical Logic.” In W. V. O. Quine.From a Logical Point o View: 9 Logico-Philosophical Essays. Cambridge: Har-vard University Press, 1953. First published in 1937.

Page 189: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 189/199

180 Bibliography

Railton, S. 2005. “Mark Twain in His Times”. Retrieved March 17, 2005, romhttp://etext.lib.virginia.edu/railton/index2.html.

Ramsey, F. P. “The Foundations o Mathematics In F. P. Ramsey. Philosophical Pa- pers, edited by D. H. Mellor, 164–224. Cambridge: Cambridge University Press,1990. First published 1925 in Proceedings o the London Mathematical Society.

Roberts, N. “Social Considerations Towards a Defnition o Inormation Science,” Journal o Documentation Vol. 32 (1976), pp. 249–257.

———. “Communication and the Bibliographical System o the Social Sciences.”In Use o Social Sciences Literature, edited by N. Roberts, 1–27. London: But-terworths, 1977.

———. “Online: In an Educational Cul-de-Sac?” Education or Inormation, Vol.7 (1989), pp. 101–106.

Rosenberg, V. 1974. “The Scientifc Premises o Inormation Science,” Journal o the American Society or Inormation Science, Vol. 25 (1974), pp. 263–269.

Saussure, F. de. Course in General Linguistics. Edited by C. Bally and A. Seche-haye with the collaboration o A. Riedlinger. Translated and annotated by R.Harris. London: Duckworth, 1983. First published 1916 by Payot.

Searle, J. R. “Minds, Brains and Programs,” The Behavioral and Brain Sciences, Vol. 3 (1980), pp. 417–457.

Shakespeare, W. Macbeth. The Arden Edition o the Works o William Shake-

speare edited by K. Muir. London and New York: Routledge, 1988.Shannon, C. E. “A Mathematical Theory o Communication.” In Claude Elwood Shannon: Collected Papers, edited by N. J. A. Sloane and A. D. Wyner, 5–83. Pis-cataway: IEEE Press, 1993. First published 1948 Bell System Technical Journal .

———. “Communication Theory o Secrecy Systems.” In Claude Elwood Shan-non: Collected Papers, edited by N. J. A. Sloane and A. D. Wyner, 84–143. Piscat-away: IEEE Press, 1993. First published 1949 in Bell System Technical Journal .

———. “Prediction and entropy o printed English.” In Claude Elwood Shannon:Collected Papers, edited by N. J. A. Sloane and A. D. Wyner, 194–208. Piscataway,

NJ: IEEE Press, 1993. First published 1951 in Bell System Technical Journal .———. “A Mind-Reading (?) Machine.” In Claude Elwood Shannon: Collected Papers, edited by N. J. A. Sloane and A. D. Wyner, 688–689. Piscataway, NJ: IEEEPress, 1993.

———. “The Bandwagon.” In Claude Elwood Shannon: Collected Papers, editedby N. J. A. Sloane and A. D. Wyner, 462. Piscataway: IEEE Press, 1993. Firstpublished 1956 in Institute o Radio Engineers, Transactions on InormationTheory.

———. “Inormation theory.” In Claude Elwood Shannon: Collected papers, ed-

ited by N. J. A. Sloane and A. D. Wyner, 212–220. Piscataway, NJ: IEEE Press,1993. First published 1968 in Encyclopedia Britannica.

Shera, J. H. “Foundations o a Theory o Bibliography.” In J. H. Shera. Librariesand the Organization o Knowledge, edited by D. J. Foskett, 18–33. Hamden,Connecticut: Archon Books, 1965.

Page 190: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 190/199

Bibliography 181

———. Social Epistemology, General Semantics, and Librarianship. D. J. Foskett,ed., 12–17. Hamden, Connecticut: Archon Books, 1965.

Shneiderman, B. Leonardo’s Laptop: Human Needs and the New Computing Technologies. Cambridge, Mass: MIT Press, 2003.

Short, W. R. 2008. “Norse Laws and Legal Procedures.” Retrieved January 15,2008 rom, http://www.hurstwic.org/history/articles/society/text/laws.htm. 

Sloane, N. J. A. and A. D. Wyner. “Biography o Claude Elwood Shannon.” InClaude Elwood Shannon: Collected Papers, edited by N. J. A. Sloane and A. D.Wyner, xi–xvii. Piscataway: IEEE Press, 1993.

Smithson, S. “Inormation Retrieval Evaluation in Practice: A Case Study Ap-proach,” Inormation Processing and Management, Vol. 30 (1994), pp. 205–221.

Sparck Jones, K. and M. Kay. Linguistics and Inormation Science. New York andLondon: Academic Press, 1973.

Sperber, D. and D. Wilson. Relevance: Communication and Cognition. Oxord:Basil Blackwell, 1986.

Starobinski, J. Words Upon Words: The Anagrams o Ferdinand de Saussure.Translated by O. Emmet. New Haven and London: Yale University Press, 1979.

Stevens, A. Ariadne’s Clue: A Guide to the Symbols o Mankind . London: AllenLane The Penguin Press, 1998.

Swanson, D. R. 1980. “Libraries and the Growth o Knowledge.” In The Role o Libraries in the Growth o Knowledge, edited by D. R. Swanson, 112–136. Chi-cago: University o Chicago Press.

———. 1988. “Historical Note: Inormation Retrieval and the Future o an Illu-sion,” Journal o the American Society or Inormation Science, Vol. 39 (1988),pp. 92–98.

Tidline, T. J. “Working in Shannon’s Shadow: Mistaken Identity and PersistentEntropy o Inormation Concepts.” Poster presented at the Annual Meeting o the American Society or Inormation Science and Technology, Providence, Rhode

Island, November 2004.Times 1885. The Times (London). 1885.

Turing, A. M. “On Computable Numbers, With an Application to the Ent-scheidungsproblem,” Proceedings o the London Mathematical Society, Vol. 42(1937), pp. 230–265.

Twain, M. A Connecticut Yankee in King Arthur’s Court . First published 1889.New York and Oxord: Oxord University Press, 1996.

UNESCO/Library o Congress Bibliographic Survey. Bibliographical Services,Their Present State and Possibilities o Improvement. Report Prepared as a Work-

ing Paper or an International Conerence on Bibliography. Washington, 1950.Van Rijsbergen, C. J. Inormation Retrieval , 2nd edition. London: Butterworth-Heinemann, 1979.

Page 191: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 191/199

182 Bibliography

Verdú, S. and S. W. McLaughlin. Inormation Theory: 50 Years o Discovery.New York: IEEE Press, 2000.

Verne, J. Backwards to Britain. Translated by J. Valls-Russell. Edinburgh andNew York: Chambers, 1992. First published as Voyage à Reculons en Angleterreet en Ecosse, 1989.

Vico, G. On the Most Ancient Wisdom o the Italians: Unearthed rom theOrigins o the Latin Language.. Translated with introduction and notes by L.M. Palmer. Ithaca: Cornell University Press, 1988. First published 1710, Ex Ty-pographia Felicis Mosca.

———. The First New Science. Edited and translated by L. Pompa. Cambridge:Cambridge University Press, 2002. First published 1725, Per F. Mosca.

———. The New Science o Giambattista Vico. Unabridged translation o thethird edition (1744) with the addition o “Practice o the New Science” Trans-lated by T. G. Bergin and M. H. Fisch. Ithaca, NY and London: Cornell UniversityPress, 1976. First published 1744, Nella stamperia Muziana, a spese di G. e S.Elia.

———. The Autobiography o Giambattista Vico. Translated rom the Italian byM. H. Fisch and T. G. Bergin. Ithaca: Cornell University Press, 1990. First pub-lished 1818, Presso Porcelli.

Vološinov, V. N. Marxism and the Philosophy o Language. Translated by L.Matejka and I. R. Titunik. New York and London: Seminar Press, 1986. First

published 1929 by Priboi.Warner, J. “Writing and Literary Work in Copyright: A Binational and Histori-cal Analysis,” Journal o the American Society or Inormation Science, Vol. 44(1993), pp. 307–321.

———. From Writing to Computers. London and New York: Routledge, 1994.

———. “The Public Reception o the Research Assessment Exercise 1996,” AslibProceedings, Vol. 49 (1997), pp. 263–276.

———. 2000. “In the Catalogue Ye Go or Men: Evaluation Criteria or Inorma-tion Retrieval Systems,” Aslib Proceedings, Vol. 52 (2000), pp. 76–82. Also romInormation Research, Vol. 4 (July 1999). Available at http://inormationr.net/ ir/4-4/paper62.html.

———. Inormation, Knowledge, Text . Lanham, MD: Scarecrow Press, 2001.

———. “Inormation and Redundancy in the Legend o Theseus,”   Journal o Documentation, Vol. 59 (2003), pp. 540–557.

———. Humanizing Inormation Technology. Lanham, MD: Scarecrow Press,2004.

———. “Labor in Inormation Systems,” Annual Review o Inormation Science

and Technology, Vol. 39 (2005a), pp. 551–573.———. “An Inormation Dynamic: Technologies or the Reproduction o WrittenUtterances,” Aslib Proceedings, Vol. 57 (2005b), pp. 412–423.

Page 192: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 192/199

Bibliography 183

Weaver, W. “Recent Contributions to the Mathematical Theory o Communica-tion.” In The Mathematical Theory o Communication, edited by C. E. Shannonand W. Weaver, 1–28. Urbana: University o Illinois Press, 1949.

Webster, F. Theories o the Inormation Society, 2nd edition. London and NewYork: Routledge, 2002.

Weisman, R. “Talking Search Technology: Eric Schmidt, Chairman and CEO,Google Inc.,” Boston Globe, February 4, 2002, p. D1.

Wiener, N. The Human Use o Human Beings: Cybernetics and Society. Revisededition. New York: De Capo Press, 1954.

Wilkins, J. An Essay Towards a Real Character and a Philosophical Language.Menston, England: Scolar Press, 1968. First published in 1668.

Wilson, P. 1968. Two Kinds o Power: An Essay on Bibliographical Control .Berkeley and Los Angeles: University o Caliornia Press, 1968.

———. “Situational Relevance,” Inormation Storage and Retrieval, Vol. 9(1973), pp. 457–471.

———. “Review o F. Webster. Theories o the Inormation Society,” College and Research Libraries, Vol. 57 (1996a), pp. 487–489.

———. “Interdisciplinary Research and Inormation Overload,” Library Trends, Vol. 45 (1996b.), pp. 192–203.

———. “Some Consequences o Inormation Overload and Rapid Conceptual

Change.” In Inormation Science: From the Development o the Discipline toSocial Interaction, edited by J. Olaisen, E. Munch-Petersen, and P. Wilson, 21–34.Oslo: Scandinavian University Press, 1996c.

———. “Review o E. Svenonius. The Intellectual Foundations o InormationOrganization,” College and Research Libraries, Vol. 62 (2001), pp. 203–204.

Wittgenstein, L. Tractatus Logico-Philosophicus. London and New York: Rout-ledge and Kegan Paul, 1981. First published 1922 by Routledge & Kegan Paul.

WorldCat. 2007. WorldCat. OCLC. Retrieved June 27, 2007, rom http://frst-search.uk.oclc.org/.

Zip, G. K. The Psycho-Biology o Language: An Introduction to Dynamic Phi-lology. London: George Routledge, 1936.

Page 193: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 193/199

Page 194: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 194/199

Index

Note: b stands or box;  or fgure; n or note; t or table.

Airport security systems, 46bAlgorithmic transormations, 20,

131–132, 133Alphabets, 119Amazon.com, 63, 64–65, 76, 137, 153Anagrams, 97

Aristotle, 2, 23, 71, 115Assertion sign, 165–166n3Atomic acts, 18, 54Augustine, 80, 82, 93Automata theory, 8, 62, 70, 78, 133,

135Ayer, Alred J., 18

Babbage, Charles, 37, 38, 156Barthes, Roland, 103–104

Bath University Library, 6Belkin, Nicholas J., and Alina Vickery,

4–5Bell, Eric T., 19Bentsen, Lloyd, 76Berger, Toby, 117Biblical concordances, 88, 126Bibliographic control, 9, 21–22Bibliographic records, 43, 44, 86Bibliographic searching theory, 66

Bibliographic systems, 3Bibliographies, subject, 48, 160Bierce, Ambrose, 166ch5n3Biological classifcations, 30

Blair, David C., 41, 79Boolean logic, 38, 49, 89, 156Boolean operators, 8, 74, 87, 133, 134Boolos, George, and Richard C.

 Jerey, 62Borges, Jorge L., 104bBoustrophedon, 100Bradord, Samuel C., 48British Museum Catalog , 45

Brute existence, 157Buckland, Michael K., 41Buckland, Michael K., and Christian

Plaunt, 4, 133

Carroll, Lewis, 81b, 93, 164n1Cataloging, 8, 45, 47–48, 64. See also 

WorldCat Chandler, Raymond, 110bChange, dynamism or, 83–84, 85

Cherry, Colin, 116, 119, 122Childe, V. Gordon, 37, 69, 79, 126Choice, 2, 8, 146Chomsky, Noam, 96Ciolf, Luigina, 167ch6n2Classifcation, 8, 23, 30Clemens, Samuel Langhorne. See 

Twain, MarkCleverdon, Cyril W., J. Mills, and

Michael Keen, 6

Cockshott, Paul and Greg Michaelson,78, 155

Coding systems, 114, 126–127, 168n6Cognitive science, 155

Page 195: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 195/199

186 Index

Collins, Randall, 25Communal labor, 35, 56Communication, materiality o, 92

Computation theory, 8, 62, 70, 78,133, 135

Computers. See Inormationtechnologies

Concordances, 22, 88, 126Controlled vocabularies, 82Cooke and Wheatstone telegraph, 120Copyright, 39Countdown, 120Cryptography, 114

Culler, Jonathan D., 18, 80Cybernetics, 22–23

Darwin, Charles, 30Database description, 8Database management systems,

163–164ch3n1Defnitions. See ElucidationsDeliberation, 71Descartes, René, 155Description labor, 8, 40–48, 50, 57, 58

in modernity, 10–11, 70, 91, 92Dickens, Charles, 17–18, 23–24,

38–39Dictionaries, 139, 144. See also 

 Johnson, SamuelDijkstra, Edsger W., 99Direct semantic ratifcation, 76Discourse, language o, 47, 82

Dodgson, Charles. See Carroll, LewisDonovan, Edward, 24

Ellis, David, 3, 4Ellis, David, and Ana C. Vasconcelos,

3, 4Elucidations, 17–18E-mail, 122Epistemology, 65Evaluation, 3

Fabri, Honoré, 165n3Feist v. Rural , 39–40Fiske, John, 115, 155

Formal logic, 54Frege, 165–166n3Freud, Sigmund, 143

Full text retrieval, 73–94

Gardin, Jean-Claude, 82, 148Generic capacity, 86–87, 106Genus-species relations, 19, 20, 103Goody, Jack and Ian Watt, 18, 26, 37Google, 63, 135, 137, 153

advanced search, 45, 63Google Scholar, 86image search, 41, 43, 65

search examples, 76, 141, 142 Gosden, Chris, 2Graphic representations, 20, 21 Greg, Walter W., 48Gregory, Vicki L., 148Griesbaum, Joachim, 25Grouping material, 86–87, 106

Hardy, Thomas, 46bHarris, Roy, 96, 97, 98, 99, 100, 107,

159Hayes, Robert M., 48, 55Hebrew, Biblical, 128Heine, Michael H., 3, 4Henry, O., 46bHerken, Rol, 133, 155History, human, 8, 59, 158–159Hobsbawm, Eric, 59, 158–159Human, elucidated, 157–158

Human history, 8, 59, 158–159Humpty Dumpty, 81b, 93

Icelandic law-speakers, 26, 27–28b, 57Images

production, 46bsearching, 41, 43, 65

Indeterminacy, 110–111Indexing, 3, 5–6, 7, 9, 19, 22, 132,

148, 149

Inormation, Janus-like character, 148Inormational labor. See Mental laborInormation science, 2–3, 84–85Inormation society, 3

Page 196: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 196/199

Index 187 

Inormation technologies, 4, 6–7, 8,58–59, 83, 159

computer programs, 99

gestalt o the computer, 70, 132, 154,155

and labor, 34, 35, 38, 39, 40, 47Inormation theory, 8, 77, 78,

113–115, 116, 117, 145–146. Seealso Shannon, Claude

Inman, Bobby Ray, 65bIntellectual labor. See Mental laborIntelligence, 2, 9, 24Internet search engines, 1, 3, 50, 63,

76, 85, 88, 89, 138–139. See also Google

 Johnson, Samuel, 71, 108–109,139–141

 Johnson-Laird, Philip N., 103, 155

Keyboards, 120Knowledge, 78–79

Labor, 2. See also Communal labor;Description labor; Mental labor;Physical labor; Search labor;Selection labor; Semantic labor;Syntactic labor; Universal labor

Labor theoretic approach, 1, 11,53–71, 92, 152

Languageconceptions o, 79–82

linearity, 98–99, 100, 128as nomenclature, 73, 80, 82, 111Law-speakers, Icelandic, 26, 27–28b,

57Lexicography. See Dictionaries;

 Johnson, SamuelLibrarianship, 5–6, 21, 67, 84–85Linearity, 123

o language, 98–99, 100, 128Linguistics. See Chomsky, Noam;

Saussure, Ferdinand de; Zip,George K.

Linné, Carl von, 24Literacy, 54, 58, 77, 89, 91

Liversidge, Anthony, 114Logarithm tables, 37Logic

Boolean, 38, 49, 89, 156ormal, 54

Lyotard, Jean-Francois, 6

Marx, Karl, 8, 77, 85, 144, 158on labor, 33–34, 35, 56on technology, 28b, 34, 63–64,

159Material implication, 19, 59Mathematics, revolutions in, 69

McKenzie, Donald F., 99, 110bMcLeod, Norman Z., 137Meaning o words. See SemanticsMemory, 103Mental labor, 2, 33–37, 71, 151–152

distinctions within, 31, 37–40mechanization o, 70, 156

Message, 12, 117, 122–124and syntagma, 95, 123, 134

Messages or selection, 12, 117,118–122, 124–125, 128

and paradigm, 95, 134Metadata, 148, 149, 150 Minsky, Marvin L., 2, 34, 62, 70, 78,

133, 155Modernity. See Inormation

technologiesMontgomery, Christine A., 96Moradi, Hamid, Jerzy W. Grsymala-

Busse, and James A. Roberts, 126Morrison, Doreen, 43Morse code, 114, 127Multivalency o words, 74, 81b, 108,

110–111, 136, 146Multiword sequences, 126–129Murra, Katherine O., 22Murray, James, 139

Newspapers

databases, 74–75indexes, 43–45, 126

Njal’s Saga, 26, 27b, 57Nondeterminism, 149

Page 197: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 197/199

188 Index

OCLC (Online Computer LibraryCenter), 47. See also WorldCat 

Ohlman, Herbert, 120

Ong, Walter J., 28, 159Orality, 54, 56, 57–58, 65b, 77, 89.

See also Law-speakers, IcelandicOrwell, George, 41Oxord English Dictionary, 139

Palmer’s Index to the TimesNewspaper, 43–45, 126

Paradigm, 12, 97, 102–107, 137–139,140 

elucidated, 95, 103, 133and messages or selection, 77, 95,

120–121, 134Passports, 46bPhotographs, 43, 46bPhrase searching, 64, 76, 77, 109, 129,

136–137Physical labor, 2, 10Pierce, John R., 116Plutarch, 23, 24Powers, Thomas, 65bPrediction, 126, 127, 128Premodernity. See LiteracyPrinters’ onts, 120, 121Programs, computer, 99Prony, Gaspard de, 37

Quayle, Dan, 76Query transormation, 3, 4, 5, 7, 25,

69, 70Quine, Willard van O., 19qwerty, 120

RAE, 75–76Ramsey, Frank P., 4, 69Random sequences, 135Relevance, 70, 112Representation, language o, 47Research Assessment Exercise (RAE),

75–76Roberts, Norman, 6, 45, 49, 50, 85,

114, 148Rosenbaum, Howard, 168n2Rosenberg, Victor, 70, 132, 154

Saussure, Ferdinand de, 8, 12, 18, 77,79–80, 93, 95–103, 108, 123–124,145–146

and inormation theory, 99,166ch6n1

on nouns, 139and Shannon, 114, 115ton synonymy, 109, 144

Science Museum, 120Scrabble, 120Search algorithms, 3Search engines. See Internet search

engines

Search labor, 10–11, 29–30, 40,48–51, 58, 73

Searle, John R., 155Selection labor, 1, 28–31, 58, 59, 68,

91, 92elucidated, 10

Selection power, 1, 6, 15b, 17–27, 59,68, 69, 92, 149

elucidated, 9–10, 17–18enhanced, 160–161

Semantic labor, 2, 38, 39–40, 85Semantics, 12, 88, 95–112, 131–132Semiotics, 95–96Shakespeare, William, 15b, 104bShannon, Claude, 29, 30, 64, 114,

116, 118, 119, 154–155. See also Inormation theory

on language, 125, 146and Saussure, 114, 115t, 123

on spaces, 126–127, 137Shera, Jesse H., 66, 153Shneiderman, Ben, 21Short, William R., 27Shorthand, 127Signals, 115–116, 117–118Signifed, 74, 111, 134Signifer, 74, 116, 123, 134Slave labor analogy, 51, 61Smith, Adam, 37

Smithson, Steve, 6, 21Spaces between words, 105, 127, 128Sparck Jones, Karen, and Martin Kay,

96, 148Sperber, Dan, 115, 155

Page 198: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 198/199

Index 189

Starobinski, Jean, 97Stevens, Anthony, 9Structure o book, 13t, 14

Subject bibliographies, 48Swanson, Don R., 9, 63, 154, 161Symbolic logic. See Formal logicSynonymy, 109, 144Syntactic labor, 2, 38, 39–40Syntactics, 12, 73, 88, 113–130,

131–132elucidated, 113machine process, 84, 85

Syntagma, 12, 97–100, 103, 133, 141,

142, 143elucidated, 95, 102and message, 77, 95, 123, 134

Syntax. See Syntactics

Technology, 2, 28b, 34. See also Inormation technologies

Telegraphy, 117–118, 120, 122,127

Text messaging, 167ch7n3Tidline, Tonyia J., 78, 114, 155Times Newspaper, Palmer’s Index to

the, 43–45, 126Translation systems, 168ch9n1Turing, Alan M., 133, 156Twain, Mark, 19, 20, 21, 107–108,

150, 168n6

UNESCO/Library o Congress, 9,

21–22Universal labor, 35, 56Utterance, 89, 137, 143, 144, 151

Value, labor theory o, 83, 158–159Van Rijsbergen, 25Verdú, Sergio, and Steven W.

McLaughlin, 30, 114, 115Verne, Jules, 46bVico, Giambattista, 23, 77, 155, 157

Vocabularies, controlled, 82Voice communication. See OralityVolosinov, Valentin N., 97, 143, 144,

153

Warner, Julian(1993), 114, 127(1994), 78, 95

(1997), 75(2000), 19, 24, 25(2001), 29, 133, 156(2003), 99, 122, 123(2004), 2, 4, 34, 79, 132(2005a), 38, 41(2005b), 47

Weaver, Warren, 114–115, 155Weaving analogy, 104Webster, Frank, 2, 6, 7

Weisman, Robert, 66Wiener, Norbert, 22, 23, 51, 61Wilkins, John, 82Wilson, Patrick, 6, 22, 25, 66, 70, 149Wittgenstein, Ludwig, 18, 54, 165–

166n3, 168n3Word

elucidated, 101–102, 125–126, 127,128, 144, 148

searching by, 134, 135–136WorldCat , 22, 44, 45, 47, 64, 65. See

also CatalogingWriting, 159

Zip, George K., 61, 96, 148

Page 199: Human Information Retrieval - J.warner-2010-9780262013444

8/8/2019 Human Information Retrieval - J.warner-2010-9780262013444

http://slidepdf.com/reader/full/human-information-retrieval-jwarner-2010-9780262013444 199/199