LIS 677 1
Indexing with a Controlled Vocabulary: Basic Concepts
LIS 677 2
Indexing: Topics Covered
The “concept triangle”
The five-axiom theory of indexing
The indexing process
LIS 677 4
The Referent
“The referent is everything about which a meaningful statement can be made.”
For example, about a certain table many statements can be made concerning the material of which it is made, its price, purpose, producer, weight, the structure of its surface, etc.
LIS 677 5
The Concept
“We define the concept as the sum of the essential statements that can be made about a referent.”
Essential statements are those which contribute to the characterization of the referent itself.
Inessential statements are those which do not contribute to the characterization of the referent itself.
LIS 677 6
Kinds of Concepts
General concepts
The general concept describes a class of interrelated referents.
For example: metal, oxidation, information
Individual concepts
The individual concept is one to which no meaningful conceptual feature can be added.
For example: Albert Einstein; Fritz the Cat.
LIS 677 7
General vs. Individual Concepts in Indexing
“It is the task of subject indexes to provide access to documents or text passages relevant to general concepts.”
“An information system which works quite well for individual concepts, may totally fail when it is required to manage general concepts too.”
LIS 677 8
The Mode of Expression
Lexical expressions
linear strings of characters commonly agreed upon to express concepts or concept connections
Non-lexical expressions
linear strings of characters by which concepts or concept relations are expressed and upon which no firm agreement has been made
LIS 677 9
Forms of Expression & Indexing
Lexical expressions require little indexing work
Often appear in Identifier fields rather than in Descriptor fields of database records
Non-lexical expressions require indexing work
Non-lexical expressions exhibit ambiguity and multiplicity
LIS 677 10
Concepts & Expressions
Individual concepts are almost always expressed lexically
General concepts are almost always expressed non-lexically
In natural, uncontrolled language there is an unlimited multitude of non-lexical, paraphrasing expressions for concepts
Multiplicity & ambiguity of natural language expressions are largely restricted to general concepts
LIS 677 11
Five-Axiom Theory of Indexing
Definability
Order
Sufficient degree of order
Representational predictability
Representational fidelity
LIS 677 12
Axiom of Definability
The compilation of information relevant to a topic can be delegated (to a skilled specialist or a programmed search mechanism) only to the extent to which the inquirer can define the topic in terms of concepts and concept relations.
LIS 677 13
Axiom of Order
Any compilation of information relevant to a topic is an order-creating process.
Order is defined as the meaningful proximity of the parts of a whole at a foreseeable place.
LIS 677 14
Axiom of Sufficient Degree of Order
The demands made on the degree of order increase as the size of the collection and/or the frequency of the searches and/or the specificity of the searches increases.
LIS 677 15
Axiom of Representational Predictability
The completeness of any search for documents relevant to a topic of interest depends on the predictability of the modes of expression for concepts in the search file.
Successful searches require a language with predictable modes of expression for concepts.
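As a rough illustration of this axiom, the sketch below compares a free-text search with a descriptor search over three toy records about the same general concept. The records, the descriptor OXIDATION as assigned here, and all function names are hypothetical and were invented for this example; they are not taken from the lecture.

```python
# Toy records (hypothetical). Each has natural-language text and
# an indexer-assigned descriptor from a controlled vocabulary.
RECORDS = [
    {"id": 1, "text": "oxidation of iron in moist air", "descriptors": {"OXIDATION"}},
    {"id": 2, "text": "rusting of steel bridges", "descriptors": {"OXIDATION"}},
    {"id": 3, "text": "reaction of copper with oxygen", "descriptors": {"OXIDATION"}},
]


def free_text_search(term):
    """Return ids of records whose text contains the term (case-insensitive)."""
    return {r["id"] for r in RECORDS if term.lower() in r["text"].lower()}


def descriptor_search(descriptor):
    """Return ids of records that carry the given descriptor."""
    return {r["id"] for r in RECORDS if descriptor in r["descriptors"]}


def recall(retrieved, relevant):
    """Share of the relevant records that the search actually found."""
    return len(retrieved & relevant) / len(relevant)


# All three records deal with oxidation, however the authors phrased it.
relevant = {1, 2, 3}
free_text_recall = recall(free_text_search("oxidation"), relevant)
descriptor_recall = recall(descriptor_search("OXIDATION"), relevant)
print(free_text_recall, descriptor_recall)
```

The free-text search finds only the record whose author happened to use the word "oxidation", while the descriptor search finds all three, because the searcher can predict how the concept is expressed in the search file.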
LIS 677 16
Axiom of Representational Fidelity
The precision of any search for documents relevant to a topic of interest depends on the fidelity with which the modes of expression for concepts can be expressed in the system’s language.
LIS 677 17
The Indexing Process
Step 1: Determine the essence of a document
Step 2: Represent this essence with sufficient degrees of predictability and fidelity
LIS 677 18
Importance of Categories
“The predictability of essence selection is markedly enhanced when the indexers have an orientation to conceptual categories.”
For example, in some chemistry databases, all descriptors belong to the following categories:
MATTER
LIVING ENTITY
APPARATUS
PROCESS
In ERIC, the nine Descriptor Groups serve as categories.
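One way to picture category-oriented indexing is a vocabulary in which every descriptor belongs to exactly one category, so that an indexer can see which categories a document's indexing does not yet cover. In the sketch below the four category names come from the slide; the descriptors, their category assignments, and the function names are hypothetical.

```python
# The four categories named on the slide.
CATEGORIES = {"MATTER", "LIVING ENTITY", "APPARATUS", "PROCESS"}

# Hypothetical descriptors, each assigned to exactly one category.
VOCABULARY = {
    "ETHANOL": "MATTER",
    "YEAST": "LIVING ENTITY",
    "FERMENTER": "APPARATUS",
    "FERMENTATION": "PROCESS",
}


def group_by_category(descriptors):
    """Group assigned descriptors by category; reject terms outside the vocabulary."""
    grouped = {category: [] for category in sorted(CATEGORIES)}
    for descriptor in descriptors:
        if descriptor not in VOCABULARY:
            raise ValueError(f"{descriptor!r} is not a descriptor in the vocabulary")
        grouped[VOCABULARY[descriptor]].append(descriptor)
    return grouped


def uncovered_categories(descriptors):
    """List the categories for which no descriptor has been assigned."""
    grouped = group_by_category(descriptors)
    return [category for category, terms in grouped.items() if not terms]


indexing = ["YEAST", "FERMENTATION", "ETHANOL"]
print(uncovered_categories(indexing))  # ['APPARATUS']
```

An uncovered category is not an error in itself. It is a prompt for the indexer to ask whether the document says anything essential in that category.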
LIS 677 19
Natural Language Indexing
“Natural language expressions, as derived from original texts, can only in the case of individual concepts lead to an information system of adequate quality and survival power.”
The specificity of natural language expressions is compromised by their lack of predictability.
LIS 677 20
Importance of “Cutter’s Rule”
Precise and complete searches require that the most specific descriptors that the vocabulary provides be chosen for the indexing of a subject.
A query with a specific descriptor must not retrieve concepts that are more general than the search descriptor.
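A minimal sketch of how both halves of the rule might look in a hierarchical vocabulary. The broader-term table, the descriptors, the document ids, and the function names are all hypothetical. The search below also returns records indexed with narrower descriptors; that is an assumption made for this example and goes beyond the slide, which only states that more general concepts must not be retrieved.

```python
# Hypothetical hierarchy: each descriptor maps to its broader term (None at the top).
BROADER = {
    "FURNITURE": None,
    "TABLES": "FURNITURE",
    "COFFEE TABLES": "TABLES",
}


def depth(descriptor):
    """Count the broader terms above a descriptor; deeper means more specific."""
    levels = 0
    while BROADER[descriptor] is not None:
        descriptor = BROADER[descriptor]
        levels += 1
    return levels


def most_specific(candidates):
    """Indexing side: of the descriptors that apply, choose the most specific one."""
    return max(candidates, key=depth)


def is_same_or_narrower(descriptor, query):
    """True if descriptor equals the query or lies below it in the hierarchy."""
    while descriptor is not None:
        if descriptor == query:
            return True
        descriptor = BROADER[descriptor]
    return False


def search(index, query):
    """Search side: never return records indexed with a broader descriptor."""
    return sorted(doc for doc, d in index.items() if is_same_or_narrower(d, query))


index = {
    "doc-a": "FURNITURE",
    "doc-b": "TABLES",
    # A document about coffee tables gets the most specific applicable descriptor.
    "doc-c": most_specific(["FURNITURE", "TABLES", "COFFEE TABLES"]),
}
print(search(index, "TABLES"))  # ['doc-b', 'doc-c']
```

Had the coffee-table document been indexed only with FURNITURE, a search for TABLES would have missed it, and a search for FURNITURE would have returned it alongside every other kind of furniture.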