simpson family trees - ischools of...simpson family trees because the simpson family is known...


Post on 17-Sep-2020


Chapter 6
Describing Relationships and Structures

Robert J. Glushko, Matthew Mayernik, Alberto Pepe, Murray Maloney

6.1. Introduction
6.2. Describing Relationships: An Overview
6.3. The Semantic Perspective
6.4. The Lexical Perspective
6.5. The Structural Perspective
6.6. The Architectural Perspective
6.7. The Implementation Perspective
6.8. Relationships in Organizing Systems
6.9. Key Points in Chapter Six

6.1 Introduction

We consider a family to be a collection of people affiliated by some connections, such as common ancestors or a common residence. The Simpson family includes a man named Homer and a woman named Marge, the married parents of three sibling children, a boy named Bart and two girls, Lisa and Maggie. This magical family speaks many languages, but most often uses the language of the local television station. In the English-speaking Simpson family, the boy describes his parents as his father and mother and his two siblings as his sisters. In the Spanish-speaking Simpson family he refers to his parents as su padre y su madre and his sisters are las hermanas. In the Chinese Simpson family the sisters refer to each other according to their relative ages: Lisa, the elder, as jiě jie and Maggie, the younger, as mèi mei.

Simpson Family Trees

Because the Simpson family is known throughout the world, the Simpson family tree is often used to teach kinship terms to language learners.

• A website for teaching Spanish
• A website for teaching French
• A website for teaching German

Kinship relationships are ubiquitous and widely studied, and the names and significance of kinship relations like “is parent of” or “is sibling of” are familiar ones, making kinship a good starting point for understanding relationships in organizing systems. An organizing system can make use of existing relationships among resources, or it can create relationships by applying organizing principles to arrange the resources. Organizing systems for digital resources or digital description resources are the most likely to rely on explicit relationships to enable interactions with the resources.

In a classic book called Data and Reality, William Kent defines a relationship as an association among several things, with that association having a particular significance. “The things being associated,” the components of the relationship, are people in kinship relationships but more generally can be any type of resource (Chapter 4), when we relate one resource instance to another. When we describe a resource (Chapter 5), the components of the relationship are a primary resource and a description resource. If we specify sets of relationships that go together, we are using these common relationships to define resource types or classes, which more generally are called categories (Chapter 7). We can then use resource types as one or both of the components of a relationship when we want to further describe the resource type or to assert how two resource types go together to facilitate our interactions with them.

We begin with a more complete definition of relationship and introduce five perspectives for analyzing them: semantic, lexical, structural, architectural, and implementation. We then discuss each perspective, introducing the issues that each emphasizes, and the specialized vocabulary needed to describe and analyze relationships from that point of view. We apply these perspectives and vocabulary to analyze the most important types of relationships in organizing systems.

The Discipline of Organizing


6.2 Describing Relationships: An Overview

The concept of a relationship is pervasive in human societies in both informal and formal senses. Humans are inescapably related to generations of ancestors, and in most cases they also have social networks of friends, co-workers, and casual acquaintances to whom they are related in various ways. We often hear that our access to information, money, jobs, and political power is all about “who you know,” so we strive to “network” with other people to build relationships that might help us expand our access. In information systems, relationships between resources embody the organization that enables finding, selection, retrieval, and other interactions.

Most organizing systems are based on many relationships to enable the system to satisfy some intentional purposes with individual resources or the collection as a whole. In the domain of information resources, common resources include web pages, journal articles, books, datasets, metadata records, and XML documents, among many others. Important relationships in the information domain that facilitate purposes like finding, identifying, and selecting resources include “is the author of,” “is published by,” “has publication date,” “is derived from,” “has subject keyword,” “is related to,” and many others.

When we talk about relationships we specify both the resources that are associated and a name or statement about the reason for the association. Just identifying the resources involved is not enough because several different relationships can exist among the same resources; the same person can be your brother, your employer, and your landlord. Furthermore, for many relationships the directionality or ordering of the participants in a relationship statement matters; the person who is your employer gives a paycheck to you, not vice versa. Kent points out that when we describe a relationship we sometimes use whole phrases, such as “is-employed-by,” if our language does not contain a single word that expresses the meaning of the relationship.

Core Concepts Edition


Navigating This Chapter

In this chapter, we analyze relationships from several different perspectives:

Semantic perspective
The semantic perspective is the most essential one; it characterizes the meaning of the association between resources. (§6.3)

Lexical perspective
The lexical perspective focuses on how the conceptual description of a relationship is expressed using words in a specific language. (§6.4)

Structural perspective
The structural perspective analyzes the actual patterns of association, arrangement, proximity, or connection between resources. (§6.5)

Architectural perspective
The architectural perspective emphasizes the number and abstraction level of the components of a relationship, which together characterize its complexity. (§6.6)

Implementation perspective
The implementation perspective considers how the relationship is implemented in a particular notation and syntax and the manner in which relationships are arranged and stored in some technology environment. (§6.7)

6.3 The Semantic Perspective

To describe relationships among resources, we need to understand what the relations mean. This semantic perspective is the essence of relationships and explains why the resources are related, relying on information that is not directly available from perceiving the resources. In our Simpson family example, we noted that Homer and Marge are related by marriage, and also by their relationship as parents of Bart, Lisa, and Maggie, and none of these relationships are directly perceivable. This means that “Homer is married to Marge” is a semantic assertion, but “Homer is standing next to Marge” is not.

Semantic relationships are commonly expressed with a predicate with one or more arguments. A predicate is a verb phrase template for specifying properties of objects or a relationship among objects. In many relationships the predicate is an action or association that involves multiple participants that must be of particular types, and the arguments define the different roles of the participants.

We can express the relationship between Homer and Marge Simpson using a predicate(argument(s)) syntax as follows:

is-married-to (Homer Simpson, Marge Simpson)

The sequence, type, and role of the arguments are an essential part of the relationship expression. The sequence and role are explicitly distinguished when predicates that take two arguments are expressed using a subject-predicate-object syntax that is often called a triple because of its three parts:

Homer Simpson → is-married-to → Marge Simpson

However, we have not yet specified what the “is-married-to” relationship means. People can demonstrate their understanding of “is-married-to” by realizing that alternative and semantically equivalent expressions of the relationship between Homer and Marge might be:

Homer Simpson → is-married-to → Marge Simpson
Homer Simpson → is-the-husband-of → Marge Simpson
Marge Simpson → is-married-to → Homer Simpson
Marge Simpson → is-the-wife-of → Homer Simpson

Going one step further, we could say that people understand the equivalence of these different expressions of the relationship because they have semantic and linguistic knowledge that relates some representation of “married,” “husband,” “wife,” and other words. None of that knowledge is visible in the expressions of the relationships so far, all of which specify concrete relationships about individuals and not abstract relationships between resource classes or concepts. We have simply pushed the problem of what it means to understand the expressions into the mind of the person doing the understanding.

We can be more rigorous and define the words used in these expressions so they are “in the world” rather than just “in the mind” of the person understanding them. We can write definitions about these resource classes:

• The conventional or traditional marriage relationship is a consensual lifetime association between a husband and a wife, which is sanctioned by law and often by religious ceremonies;

• A husband is a male lifetime partner considered in relation to his wife; and

• A wife is a female lifetime partner considered in relation to her husband.

Definitions like these help a person learn and make some sense of the relationship expressions involving Homer and Marge. However, these definitions are not in a form that would enable someone to completely understand the Homer and Marge expressions; they rely on other undefined terms (consensual, law, lifetime, etc.), and they do not state the relationships among the concepts in the definitions. Furthermore, for a computer to understand the expressions, it needs a computer-processable representation of the relationships among words and meanings that makes every important semantic assumption and property precise and explicit. We will see what this takes starting in the next section.
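The gap between what a person understands and what a program sees can be illustrated with a minimal Python sketch. It stores the four Homer and Marge expressions as subject-predicate-object triples; the `Triple` type and the `literal_match` helper are hypothetical names introduced here for illustration, not part of any particular system.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """A subject-predicate-object statement (hypothetical helper type)."""
    subject: str
    predicate: str
    obj: str


# The four expressions that a person recognizes as semantically equivalent.
statements = {
    Triple("Homer Simpson", "is-married-to", "Marge Simpson"),
    Triple("Homer Simpson", "is-the-husband-of", "Marge Simpson"),
    Triple("Marge Simpson", "is-married-to", "Homer Simpson"),
    Triple("Marge Simpson", "is-the-wife-of", "Homer Simpson"),
}


def literal_match(a: Triple, b: Triple) -> bool:
    """Compare two triples field by field, with no semantic knowledge."""
    return a == b


first = Triple("Homer Simpson", "is-married-to", "Marge Simpson")
second = Triple("Marge Simpson", "is-the-wife-of", "Homer Simpson")

# Without explicit knowledge about "married" and "wife", the program
# treats these as two unrelated statements.
print(literal_match(first, second))
print(len(statements))
```

To a program that compares only the literal strings, the set holds four distinct statements; nothing in the data says that they describe the same marriage.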

6.3.1 Types of Semantic Relationships

In this discussion we will use entity type, class, concept, and resource type as synonyms. Entity type and class are conventional terms in data modeling and database design, concept is the conventional term in computational or cognitive modeling, and we use resource type when we discuss organizing systems. Similarly, we will use entity occurrence, instance, and resource instance when we refer to one thing rather than to a class or type of them.

There is no real consensus on how to categorize semantic relationships, but these three broad categories are reasonable for our purposes:

Inclusion Relationship
One entity type contains or is comprised of other entity types; often expressed using “is-a,” “is-a-type-of,” “is-part-of,” or “is-in” predicates.

Attribution Relationship
Asserting or assigning values to properties; the predicate depends on the property: “is-the-author-of,” “is-married-to,” “is-employed-by,” etc.

Possession Relationship
Asserting ownership or control of a resource; often expressed using a “has” predicate, such as “has-serial-number-plate.”

All of these are fundamental in organizing systems, both for describing and arranging resources themselves, and for describing the relationships among resources and resource descriptions.


6.3.1.1 Inclusion

There are three different types of inclusion relationships: class inclusion, meronymic inclusion, and topological inclusion. All three are commonly used in organizing systems.

Class inclusion is the fundamental and familiar “is-a,” “is-a-type-of,” or “subset” relationship between two entity types or classes where one is contained in and thus more specific than the other more generic one.

Meat → is-a → Food

A set of interconnected class inclusion relationships creates a hierarchy, which is often called a taxonomy.

Meat → is-a → Food
Dairy Product → is-a → Food
Cereal → is-a → Food
Vegetable → is-a → Food
Beef → is-a → Meat
Pork → is-a → Meat
Chicken → is-a → Meat
Ground Beef → is-a → Beef
Steak → is-a → Beef
...

A visual depiction of the taxonomy makes the class hierarchy easier to perceive. See Figure 6.1, A Partial Taxonomy of Food.

Each level in a taxonomy subdivides the class above it into sub-classes, and each sub-class is further subdivided until the differences that remain among the members of each class no longer matter for the interactions the organizing system needs to support. We discuss the design of hierarchical organizing systems in §7.3, “Principles for Creating Categories.”

All of the examples in the current section have expressed abstract relationships between classes, in contrast to the earlier concrete ones about Homer and Marge, which expressed relationships between specific people. Homer and Marge are instances of classes like “married people,” “husbands,” and “wives.” When we make an assertion that a particular instance is a member of a class, we are classifying the instance. Classification is a class inclusion relationship between an instance and a class, rather than between two classes. (We discuss Classification in detail in Chapter 8.)

Homer Simpson → is-a → Husband


Figure 6.1. A Partial Taxonomy of Food.

A partial taxonomy of food distinguishes the categories of prepared food from meat, distinguishes chicken, beef, and pork as subcategories of meat, and distinguishes ground beef and steak as subcategories of beef.

This is just the lowest level of the class hierarchy in which Homer is located at the very bottom; he is also a man, a human being, and a living organism (in cartoon land, at least). You might now remember the bibliographic class inclusion hierarchy we discussed in §4.3.2; a specific physical item like your dog-eared copy of Macbeth is also a particular manifestation in some format or genre, and this expression is one of many for the abstract work.

instance → is-member-of → class
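The food taxonomy listed earlier in this section can be sketched as a simple child-to-parent mapping. This is only an illustration under the assumption that each class has a single parent; the names `IS_A`, `ancestors`, and `is_a` are hypothetical and chosen for this example.

```python
from __future__ import annotations

# Child class -> parent class, copied from the "is-a" statements in the text.
# Assumption: every class has exactly one parent (a strict hierarchy).
IS_A = {
    "Meat": "Food",
    "Dairy Product": "Food",
    "Cereal": "Food",
    "Vegetable": "Food",
    "Beef": "Meat",
    "Pork": "Meat",
    "Chicken": "Meat",
    "Ground Beef": "Beef",
    "Steak": "Beef",
}


def ancestors(cls: str) -> list[str]:
    """Return the more generic classes above `cls`, nearest first."""
    chain = []
    while cls in IS_A:
        cls = IS_A[cls]
        chain.append(cls)
    return chain


def is_a(cls: str, candidate: str) -> bool:
    """True if `candidate` is a more generic class somewhere above `cls`."""
    return candidate in ancestors(cls)


print(ancestors("Steak"))     # ['Beef', 'Meat', 'Food']
print(is_a("Steak", "Food"))  # True
print(is_a("Pork", "Beef"))   # False
```

Walking up the mapping mirrors how each level of the taxonomy places a class inside a more generic one; classifying an instance would add one more link from the instance to its most specific class.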


Part-whole inclusion or meronymic inclusion is a second type of inclusion relationship. It is usually expressed using “is-part-of,” “is-partly,” or with other similar predicate expressions. Winston, Chaffin, and Herrmann identified six distinct types of part-whole relationships. Their meaning subtly differs depending on whether the part is separately identifiable and whether the part is essential to the whole.

• Component-Object is the relationship type when the part is a separate component that is arranged or assembled with other components to create a larger resource. In §4.1.1.1, “Resources with Parts,” we used as an example the component-object relationship between an engine and a car:
The Engine → is-part-of → the Car
The components of this type of part-whole relationship need not be physical objects; “Germany is part of the European Union” expresses a component-object relationship. What matters is that the component is identifiable on its own as an integral entity and that the components follow some kind of patterned organization or structure when they form the whole. Together the parts form a composition, and the parts collectively form the whole. A car that lacks the engine part will not work.

• Member-Collection is the part-whole relationship type where “is-part-of” means “belongs-to,” a weaker kind of association than component-object because there is no assumption that the component has a specific role or function in the whole.
The Book → is-part-of → the Library
The members of the collection exist independently of the whole; if the whole ceases to exist the individual resources still exist.

• Portion-Mass is the relationship type when all the parts are similar to each other and to the whole, unlike either of the previous types where engines are not tires or cars, and books are not like record albums or libraries.
The Slice → is-part-of → the Pie

• Stuff-Object relationships are most often expressed using “is-partly” or “is-made-of” and are distinguishable from component-object ones because the stuff cannot be separated from the object without altering its identity. The stuff is not a separate ingredient that is used to make the object; it is a constituent of it once it is made.
Wine → is-partly → Alcohol

• Place-Area relationships exist between areas and specific places or locations within them. Like members of collections, places have no particular functional contribution to the whole.
The Everglades → are-part-of → Florida


• Feature-Activity is a relationship type in which the components are stages, phases, or subactivities that take place over time. This relationship is similar to component-object in that the components in the whole are arranged according to a structure or pattern.
Overtime → is-part-of → a Football Game

A seventh type of part-whole relationship called Phase-Activity was proposed by Storey.

• Phase-Activity is similar to feature-activity except that the phases do not make sense as standalone activities without the context provided by the activity as a whole.
Paying → is-part-of → Shopping

Topological, Locative and Temporal Inclusion is a third type of inclusion relationship between a container, area, or temporal duration and what it surrounds or contains. It is most often expressed using “is-in” as the relationship. However, the entity that is contained or surrounded is not a part of the including one, so this is not a part-whole relationship.

The Vatican City → is-in → Italy
The meeting → is-in → the afternoon

6.3.1.2 Attribution

In contrast to inclusion expressions that state relationships between resources, attribution relationships assert or assign values to properties for a particular resource. In Chapter 5 we used “attribute” to mean “an indivisible part of a resource description” and treated it as a synonym of “property.” We now need to be more precise and carefully distinguish between the type of the attribute and the value that it has. For example, the color of any object is an attribute of the object, and the value of that attribute might be “green.”

Some frameworks for semantic modeling define “attribute” very narrowly, restricting it to expressions with predicates with only one argument to assert properties of a single resource, distinguishing them from relationships between resources or resource types that require two arguments:

Martin the Gecko → is-small
Martin the Gecko → is-green

However, it is always possible to express statements like these in ways that make them into relationships with two arguments:

Martin → has-size → small
Martin → has-skin-color → green


Another somewhat tricky aspect of attribution relationships is that from a semantic perspective, there are often many different ways of expressing equivalent attribute values.

Martin → has-size → 6 inches
Martin → has-size → 152 mm

Both statements express the idea that Martin is small. However, many implementations of attribution relationships treat the attribute values literally. This means that unless we can process these two statements using another relationship that expresses the conversion of inches to mm, the two statements could be interpreted as saying different things about Martin’s size.

Finally, we note that we can express attribution relationships about other relationships, like the date a relationship was established. Homer and Marge Simpson’s wedding anniversary is an attribute of their “is-married-to” relationship.
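A minimal sketch of the conversion problem, assuming attribute values are stored as (number, unit) pairs: a literal comparison reports the two sizes as different, while a comparison that first applies the inch-to-millimeter conversion treats them as the same. The names `MM_PER_UNIT`, `to_mm`, and `same_size`, and the 1 mm tolerance, are assumptions made for this illustration.

```python
# Conversion factors to millimeters (1 inch is exactly 25.4 mm).
MM_PER_UNIT = {"mm": 1.0, "inches": 25.4}


def to_mm(value: float, unit: str) -> float:
    """Convert a (value, unit) measurement to millimeters."""
    return value * MM_PER_UNIT[unit]


def same_size(a: tuple, b: tuple, tolerance_mm: float = 1.0) -> bool:
    """Treat two measurements as equal if they differ by at most the tolerance.

    The tolerance is an assumption: 6 inches is 152.4 mm, which the text
    rounds to 152 mm.
    """
    return abs(to_mm(*a) - to_mm(*b)) <= tolerance_mm


size_in_inches = (6, "inches")
size_in_mm = (152, "mm")

print(size_in_inches == size_in_mm)           # False: literal comparison
print(same_size(size_in_inches, size_in_mm))  # True: after unit conversion
```

The conversion table plays the role of the additional relationship the paragraph calls for; without it, the two statements about Martin cannot be reconciled.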

6.3.1.3 Possession

A third distinct category of semantic relationships is that of possession. Possession relationships can seem superficially like part-whole ones:

Bob → has → a car
A car → has → wheels

However, in the second of these relationships “has” is an elliptical form of “has as a part,” expressing a part-whole relationship rather than one of possession.

The concept of possession is especially important in institutional organizing systems, where questions of ownership, control, responsibility and transfers of ownership, control, and responsibility can be fundamental parts of the interactions they support. However, possession is a complex notion, inherently connected to societal norms and conventions about property and kinship, making it messier than institutional processes might like.

Possession relationships also imply duration or persistence, and are often difficult to distinguish from relationships based on habitual location or practice. Miller and Johnson-Laird illustrate the complex nature of possession relationships with this sentence, which expresses three different types of them:

He owns an umbrella but she’s borrowed it, though she doesn’t have it with her.


6.3.2 Properties of Semantic Relationships

Semantic relationships can have numerous special properties that help explain what they mean and especially how they relate to each other. In the following sections we briefly explain those that are most important in systems for organizing resources and resource descriptions.

6.3.2.1 Symmetry

In most relationships the order in which the subject and object arguments are expressed is central to the meaning of the relationship. If X has a relationship with Y, it is usually not the case that Y has the same relationship with X. For example, because “is-parent-of” is an asymmetric relationship, only the first of these relationships holds:

Homer Simpson → is-parent-of → Bart Simpson (TRUE)
Bart Simpson → is-parent-of → Homer Simpson (NOT TRUE)

In contrast, some relationships are symmetric or bi-directional, and reversing the order of the arguments of the relationship predicate does not change the meaning. As we noted earlier, these two statements are semantically equivalent because “is-married-to” is symmetric:

Homer Simpson → is-married-to → Marge Simpson
Marge Simpson → is-married-to → Homer Simpson

We can represent the symmetric and bi-directional nature of these relationships by using a double-headed arrow:

Homer Simpson ⇔ is-married-to ⇔ Marge Simpson
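One way to sketch symmetry in code is to declare which predicates are symmetric and to check the reversed statement only for those. The `SYMMETRIC` set and the `holds` function below are hypothetical names for this illustration; only one direction of the marriage is stored.

```python
# Predicates declared symmetric (an explicit assumption of this sketch).
SYMMETRIC = {"is-married-to"}

# Stored facts as (subject, predicate, object) tuples.
facts = {
    ("Homer Simpson", "is-married-to", "Marge Simpson"),
    ("Homer Simpson", "is-parent-of", "Bart Simpson"),
}


def holds(subject: str, predicate: str, obj: str) -> bool:
    """True if the statement is stored, or its reverse is stored
    and the predicate is declared symmetric."""
    if (subject, predicate, obj) in facts:
        return True
    return predicate in SYMMETRIC and (obj, predicate, subject) in facts


print(holds("Marge Simpson", "is-married-to", "Homer Simpson"))  # True
print(holds("Bart Simpson", "is-parent-of", "Homer Simpson"))    # False
```

Because “is-parent-of” is not in the symmetric set, reversing its arguments yields a statement that does not hold, matching the TRUE and NOT TRUE pair above.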

6.3.2.2 Transitivity

Transitivity is another property that can apply to semantic relationships. When a relationship is transitive, if X and Y have a relationship, and Y and Z have the same relationship, then X also has the relationship with Z. Any relationship based on ordering is transitive, which includes numerical, alphabetic, and chronological ones as well as those that imply qualitative or quantitative measurement. Because “is-taller-than” is transitive:

Homer Simpson → is-taller-than → Bart Simpson
Bart Simpson → is-taller-than → Maggie Simpson

implies that:

Homer Simpson → is-taller-than → Maggie Simpson


Inclusion relationships are inherently transitive, because just as “is-taller-than” is an assertion about relative physical size, “is-a-type-of” and “is-part-of” are assertions about the relative sizes of abstract classes or categories. An example of transitivity in part-whole or meronymic relationships is: (1) the carburetor is part of the engine, (2) the engine is part of the car, (3) therefore, the carburetor is part of the car.

Transitive relationships enable inferences about class membership or properties, and allow organizing systems to be more efficient in how they represent them since transitivity enables implicit relationships to be made explicit only when they are needed.
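The idea of making implicit relationships explicit only when needed can be sketched as a search over stored facts. Only the two direct “is-part-of” statements are stored; the third is derived on demand. The function name `infer` is a hypothetical choice, and the sketch assumes the predicate passed in has been declared transitive.

```python
# Only the direct part-whole statements are stored.
facts = {
    ("carburetor", "is-part-of", "engine"),
    ("engine", "is-part-of", "car"),
}


def infer(predicate: str, start: str, goal: str) -> bool:
    """Follow chains of `predicate` links from `start`; True if `goal` is reached.

    Assumes `predicate` is transitive, so a chain of links implies a direct one.
    """
    seen = set()
    frontier = [start]
    while frontier:
        current = frontier.pop()
        for subject, pred, obj in facts:
            if subject == current and pred == predicate and obj not in seen:
                if obj == goal:
                    return True
                seen.add(obj)
                frontier.append(obj)
    return False


# The carburetor-car statement is never stored, yet it can be inferred.
print(("carburetor", "is-part-of", "car") in facts)  # False
print(infer("is-part-of", "carburetor", "car"))      # True
print(infer("is-part-of", "car", "carburetor"))      # False
```

Storing two facts and deriving the third reflects the efficiency argument in the text: the implied statement costs nothing until someone asks for it.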

6.3.2.3 Equivalence

Any relationship that is both symmetric and transitive is an equivalence relationship; “is-equal-to” is obviously an equivalence relationship because if A=B then B=A and if A=B and B=C, then A=C. Other relationships can be equivalent without meaning “exactly equal,” as is the relationship of “is-congruent-to” for all triangles.

We often need to assert that a particular class or property has the same meaning as another class or property or that it is generally substitutable for it. We make this explicit with an equivalence relationship.

Sister (English) ⇔ is-equivalent-to ⇔ Hermana (Spanish)

6.3.2.4 Inverse

For asymmetric relationships, it is often useful to be explicit about the meaning of the relationship when the order of the arguments in the relationship is reversed. The resulting relationship is called the inverse or the converse of the first relationship. If an organizing system explicitly represents that:

Is-child-of → is-the-inverse-of → Is-parent-of

We can then conclude that:

Bart Simpson → is-child-of → Homer Simpson
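A small sketch of how an explicit inverse declaration lets a system derive the reversed statement. The `INVERSE_OF` table and `with_inverses` function are hypothetical names; the stored fact is the “is-parent-of” statement from the symmetry discussion.

```python
# Explicit inverse declarations, recorded in both directions.
INVERSE_OF = {
    "is-child-of": "is-parent-of",
    "is-parent-of": "is-child-of",
}


def with_inverses(facts: set) -> set:
    """Return the facts plus every statement implied by an inverse declaration."""
    derived = set(facts)
    for subject, predicate, obj in facts:
        if predicate in INVERSE_OF:
            derived.add((obj, INVERSE_OF[predicate], subject))
    return derived


stored = {("Homer Simpson", "is-parent-of", "Bart Simpson")}
expanded = with_inverses(stored)

print(("Bart Simpson", "is-child-of", "Homer Simpson") in expanded)  # True
```

Unlike a symmetric predicate, an inverse pair changes the predicate name when the arguments swap, which is why the declaration has to name both relationships.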


6.3.3 OntologiesWe now have described types and properties of semantic relationships in enough detail to return to the challenge we posed earlier: what information is required to fully understand relationships? This question has been asked and debated for decades and we will not pretend to answer it to any extent here. However, we can sketch out some of the basic parts of the solution.Let us begin by recalling that a taxonomy captures a system of class inclusion relationships in some domain. But as we have seen, there are a great many kinds of relationships that are not about class inclusion. All of these other types of relationships represent knowledge about the domain that is potentially nee­ded to understand statements about it and to make sense when more than one domain of resources or activities comes together.For example, in the food domain whose partial taxonomy appears in Figure 6.2, A Partial Ontology of Food., we can assert relationships about properties of classes and instances, express equivalences about them, and otherwise enhance the representation of the food domain to create a complex network of relation­ships. In addition, the food domain intersects with food preparation, agriculture, commerce, and many other domains. We also need to express the relationships among these domains to fully understand any of them. Grilling → is-a-type-of → Food Preparation Temperature → is-a-measure-of → Grilling Hamburger → is-equivalent-to → Ground Beef Hamburger → is-prepared-by → Grilling Hamburger Sandwich → is-a-type-of → Prepared Food Rare → is-a → State of Food Preparation Well-done → is-a → State of Food Preparation Meat → is-preserved-by → Freezing Thawing → is-the-inverse-of → Freezing ...In this simple example we see that class inclusion relationships form a kind of backbone to which other kinds of relationships attach. 
We also see that there are many potentially relevant assertions that together represent the knowledge that just about everyone knows about food and related domains. A network of relationships like these creates a resource that is called an ontology. A visual depiction of the ontology illustrates this idea that it has a taxonomy as its conceptual scaffold. (See Figure 6.2, A Partial Ontology of Food.)

Figure 6.2. A Partial Ontology of Food.

A partial ontology of food overlays the taxonomy of food with statements that make assertions about categories, instances, and relationships in the food domain. Example statements might be that “Grilling is a type of food preparation,” that “Meat is preserved by freezing,” and that “Hamburger is equivalent to ground beef.”

Ontologies are essential parts of some organizing systems, especially information-intensive ones where the scope and scale of the resources require an extensive and controlled description vocabulary. (See §5.3 The Process of Describing Resources (page 188).) The most extensive ontology ever created is Cyc, born in 1984 as an artificial intelligence research project. Three decades later, the latest version of the Cyc ontology contains several hundred thousand terms and millions of assertions that interrelate them.
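The assertions in the food example can be held in a very simple data structure. The following is a minimal sketch, assuming each assertion is stored as a (subject, predicate, object) triple; the triple representation, the helper names, and the decision to treat both "is-a" and "is-a-type-of" as class inclusion are illustrative assumptions, not something the chapter prescribes.

```python
from collections import defaultdict

# Assertions copied from the food example, stored as
# (subject, predicate, object) triples. This representation is an assumption.
TRIPLES = [
    ("Grilling", "is-a-type-of", "Food Preparation"),
    ("Temperature", "is-a-measure-of", "Grilling"),
    ("Hamburger", "is-equivalent-to", "Ground Beef"),
    ("Hamburger", "is-prepared-by", "Grilling"),
    ("Hamburger Sandwich", "is-a-type-of", "Prepared Food"),
    ("Rare", "is-a", "State of Food Preparation"),
    ("Well-done", "is-a", "State of Food Preparation"),
    ("Meat", "is-preserved-by", "Freezing"),
    ("Thawing", "is-the-inverse-of", "Freezing"),
]


def index_by_subject(triples):
    """Group (predicate, object) pairs under each subject."""
    index = defaultdict(list)
    for subject, predicate, obj in triples:
        index[subject].append((predicate, obj))
    return index


def taxonomy_backbone(triples):
    """Keep only the assertions treated here as class inclusion."""
    return [t for t in triples if t[1] in ("is-a", "is-a-type-of")]


by_subject = index_by_subject(TRIPLES)
backbone = taxonomy_backbone(TRIPLES)
print(by_subject["Hamburger"])
print(len(backbone), "of", len(TRIPLES), "assertions are class inclusion")
```

Separating the class inclusion triples from the rest mirrors the idea that the taxonomy is the backbone and the other assertions attach to it.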


6.4 The Lexical Perspective

The semantic perspective for analyzing relationships is the fundamental one, but it is intrinsically tied to the lexical one because a relationship is always expressed using words in a specific language. For example, we understand the relationships among the concepts or classes of “food,” “meat,” and “beef” by using the words “food,” “meat,” and “beef” to identify progressively smaller classes of edible things in a class hierarchy.

The connection between concepts and words is not so simple. In the Simpson family example with which we began this chapter, we noted with “father” and “padre” that languages differ in the words they use to describe particular kinship relationships. Furthermore, we pointed out that cultures differ in which kinship relationships are conceptually distinct, so that languages like Chinese make distinctions about the relative ages of siblings that are not made in English.

This is not to suggest that an English speaker cannot notice the difference between his older and younger sisters, only that this distinction is not lexicalized (captured in a single word) as it is in Chinese. This “missing word” in English from the perspective of Chinese is called a lexical gap. Exactly when a lexical gap exists is sometimes tricky, because it depends on how we define “word.” For instance, “polar bear” and “sea horse” are not lexicalized as single words, but each is a single meaning-bearing unit because we do not decompose and reassemble its meaning from the two separate words. These “lexical gaps” differ from language to language, whereas “conceptual gaps” (the things we cannot think of or directly experience, like the pull of gravity) may be innate and universal.
We revisit this issue as “linguistic relativity” in Chapter 7.

Earlier in this book we discussed the naming of resources (§4.4.2 The Problems of Naming (page 158)) and the design of a vocabulary for resource description (§5.3.1.3 Scope, Scale, and Resource Description (page 193)), and we explained how increasing the scope and scale of an organizing system made it essential to be more systematic and precise in assigning names and descriptions. We need to be sure that the terms we use to organize resources capture the similarities and differences between them well enough to support our interactions with them. After our discussion about semantic relationships in this chapter, we now have a clearer sense of what is required to bring like things together, keep different things separate, and satisfy any other goals for the organizing system.

For example, if we are organizing cars, buses, bicycles, and sleds, all of which are vehicles, there is an important distinction between vehicles that are motorized and those that are powered by human effort. It might also be useful to distinguish vehicles with wheels from those that lack them. Not making these distinctions leaves an unbalanced or uneven organizing system for describing the semantics of the vehicle domain. However, only the “motorized” concept is lexicalized in English, which is why we needed to invent the “wheeled vehicle” term in the second case.

Simply put, we need to use words effectively in organizing systems. To do that, we need to be careful about how we talk about the relationships among words and how words relate to concepts. There are two different contexts for those relationships.

• First, we need to discuss relationships among the meanings of words (§6.4.1) and the most commonly used tool for describing them (§6.4.2).

• Second, we need to discuss relationships among the forms of words (§6.4.3 Relationships among Word Forms (page 244)).

6.4.1 Relationships among Word Meanings

There are several different types of relationships among word meanings. Not surprisingly, in most cases they parallel the types of relationships among concepts that we described in §6.3 The Semantic Perspective (page 228).

6.4.1.1 Hyponymy and Hyperonymy

When words encode the semantic distinctions expressed by class inclusion, the word for the more specific class in this relationship is called the hyponym, while the word for the more general class to which it belongs is called the hypernym. George Miller suggested an exemplary formula for defining a hyponym as its hypernym preceded by adjectives or followed by relative clauses that distinguish it from its co-hyponyms, mutually exclusive subtypes of the same hypernym.

hyponym = {adjective+} hypernym {distinguishing clause+}

For example, robin is a hyponym of bird, and could be defined as “a migratory bird that has a clear melodious song and a reddish breast with gray or black upper plumage.” This definition does not describe every property of robins, but it is sufficient to differentiate robins from bluebirds or eagles.
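Miller's formula can be read as a small template for composing definitions. The sketch below is only an illustration of that reading; the function name `define_hyponym`, the use of "that" to introduce the clauses, and the use of "and" to join them are assumptions made here to reproduce the robin example.

```python
def define_hyponym(hypernym, adjectives=(), clauses=()):
    """Compose a definition as {adjective+} hypernym {distinguishing clause+}."""
    phrase = " ".join(list(adjectives) + [hypernym])
    if clauses:
        # Joining clauses with "and" after "that" is an illustrative choice.
        phrase += " that " + " and ".join(clauses)
    article = "an" if phrase[0].lower() in "aeiou" else "a"
    return f"{article} {phrase}"


robin = define_hyponym(
    "bird",
    adjectives=["migratory"],
    clauses=[
        "has a clear melodious song",
        "a reddish breast with gray or black upper plumage",
    ],
)
print(robin)
```

A co-hyponym such as bluebird or eagle would share the hypernym "bird" and differ only in the adjectives and clauses supplied.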

6.4.1.2 Metonymy

Part-whole or meronymic semantic relationships have lexical analogues in metonymy, when an entity is described by something that is contained in or otherwise part of it. A country’s capital city or a building where its top leaders reside is often used as a metonym for the entire government: “The White House announced today...” Similarly, important concentrations of business activity are often metonyms for their entire industries: “Wall Street was bailed out again...”


6.4.1.3 Synonymy

Synonymy is the relationship between words that express the same semantic concept. The strictest definition is that synonyms “are words that can replace each other in some class of contexts with insignificant changes of the whole text’s meaning.” This is an extremely hard test to pass, except for acronyms or compound terms like “USA,” “United States,” and “United States of America” that are completely substitutable.

Most synonyms are not absolute synonyms, and instead are considered propositional synonyms. Propositional synonyms are not identical in meaning, but they are equivalent enough that substituting one for the other will not change the truth value of the sentence. This weaker test lets us treat words as synonyms even though their meanings subtly differ. For example, if Lisa Simpson can play the violin, then because “violin” and “fiddle” are propositional synonyms, no one would disagree with an assertion that Lisa Simpson can play the fiddle.

An unordered set of synonyms is often called a synset, a term first used by the WordNet “semantic dictionary” project started in 1985 by George Miller at Princeton. Instead of using spelling as the primary organizing principle for words, WordNet uses their semantic properties and relationships to create a network that captures the idea that words and concepts are an inseparable system. Synsets are interconnected by both semantic relationships and lexical ones, enabling navigation in either space.
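The substitution test for propositional synonyms can be illustrated with a toy synset table. This is a minimal sketch, not WordNet's actual data model or API; the synset identifiers and the two-entry vocabulary are hypothetical.

```python
# Each synset is an unordered set of words that share one concept.
# The identifiers and the tiny vocabulary are hypothetical.
SYNSETS = {
    "violin.instrument": frozenset({"violin", "fiddle"}),
    "usa.country": frozenset(
        {"USA", "United States", "United States of America"}
    ),
}


def synonyms(word):
    """Return every other word that shares a synset with `word`."""
    found = set()
    for members in SYNSETS.values():
        if word in members:
            found |= members - {word}
    return found


def substitute(sentence, word):
    """Produce variants of `sentence` with `word` swapped for each synonym."""
    return sorted(sentence.replace(word, s) for s in synonyms(word))


variants = substitute("Lisa Simpson can play the violin", "violin")
print(variants)
```

Whether each variant keeps the truth value of the original sentence is a judgment the table cannot make; it only records which words have been declared interchangeable.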

6.4.1.4 Polysemy

We introduced the lexical relationship of polysemy, when a word has several different meanings or senses, in the context of problems with names (§4.4.2.2 Homonymy, Polysemy, and False Cognates (page 160)). For example, the word “bank” can refer to a river bank, a money bank, a bank shot in basketball or billiards, an aircraft maneuver, and other concepts.


6.4.1.5 Antonymy

Antonymy is the lexical relationship between two words that have opposite meanings. Antonymy is a very salient lexical relationship, and for adjectives it is even more powerful than synonymy. In word association tests, when the probe word is a familiar adjective, the most common response is its antonym; a probe of “good” elicits “bad,” and vice versa. Like synonymy, antonymy is sometimes exact and sometimes more graded.

Contrasting or binary antonyms are used in mutually exclusive contexts where one or the other word can be used, but never both. For example, “alive” and “dead” can never be used at the same time to describe the state of some entity, because the meaning of one excludes or contradicts the meaning of the other.

Other antonymic relationships between word pairs are less semantically sharp because they can sometimes appear in the same context as a result of the broader semantic scope of one of the words. “Large” and “small,” or “old” and “young” generally suggest particular regions on size or age continua, but “how large is it?” or “how old is it?” can be asked about resources that are objectively small or young.

6.4.2 Thesauri

The words that people naturally use when they describe resources reflect their unique experiences and perspectives, and this means that people often use different words for the same resource and the same words for different ones. Guiding people when they select description words from a controlled vocabulary is a partial solution to this vocabulary problem (§4.4.2.1 The Vocabulary Problem (page 158)) that becomes increasingly essential as the scope and scale of the organizing system grows. A thesaurus is a reference work that organizes words according to their semantic and lexical relationships. Thesauri are often used by professionals when they describe resources.

Thesauri have been created for many domains and subject areas. Some thesauri are very broad and contain words from many disciplines, like the Library of Congress Subject Headings (LOC-SH) used to classify any published content. Other commonly used thesauri are more focused, like the Art and Architecture Thesaurus (AAT) developed by the Getty Trust and the Legislative Indexing Vocabulary developed by the Library of Congress.


We can return to our simple food taxonomy to illustrate how a thesaurus annotates vocabulary terms with lexical and semantic relationships. The class inclusion relationships of hypernymy and hyponymy are usually encoded using BT (“broader term”) and NT (“narrower term”):

Food BT Meat (Food is the broader term for Meat)
Beef NT Meat (Beef is a narrower term for Meat)

The BT and NT relationships in a thesaurus create a hierarchical system of words, but a thesaurus is more than a lexical taxonomy for some domain because it also encodes additional lexical relationships for the most important words. Many thesauri emphasize the cluster of relationships for these key words and de-emphasize the overall lexical hierarchy.
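Because BT and NT are inverses of each other, only one direction needs to be recorded. The sketch below assumes the food statements mean that Food is the broader term for Meat and that Beef is a narrower term for Meat; the dictionary layout and function names are illustrative, not taken from any thesaurus standard.

```python
from collections import defaultdict

# Broader-term (BT) links stored as term -> its broader term.
# The reading of the food example encoded here is an assumption.
BROADER = {"Meat": "Food", "Beef": "Meat"}


def narrower_terms(broader):
    """Derive NT links by inverting the BT links."""
    narrower = defaultdict(list)
    for term, parent in broader.items():
        narrower[parent].append(term)
    return dict(narrower)


def broader_chain(term, broader):
    """Walk BT links upward from `term` to the top of the hierarchy."""
    chain = []
    while term in broader:
        term = broader[term]
        chain.append(term)
    return chain


NARROWER = narrower_terms(BROADER)
print(NARROWER)
print(broader_chain("Beef", BROADER))
```

A fuller thesaurus entry would add further relationship types for key terms alongside these hierarchical links, as the paragraph above notes.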

6.4.3 Relationships among Word Forms

The relationships among word meanings are critically important. Whenever we create, combine, or compare resource descriptions we also need to pay attention to relationships between word forms. These relationships begin with the idea that all natural languages create words and word forms from smaller units. The basic building blocks for words are called morphemes and can express semantic concepts (when they are called root words) or abstract concepts like “pastness” or “plural.” The analysis of the ways by which languages combine morphemes is called morphology.

Simple examples illustrate this:

“dogs” = “dog” (root) + “s” (plural)
“uncertain” = “certain” (root) + “un” (negation)
“denied” = “deny” (root) + “ed” (past tense)

Morphological analysis of a language is heavily used in text processing to create indexes for information retrieval. For example, stemming (discussed in more detail in Chapter 10) is morphological processing which removes prefixes and suffixes to leave the root form of words. Similarly, simple text processing applications like hyphenation and spelling correction solve word form problems using roots and rules because this approach is more scalable and robust than solving them using word lists. Many misspellings of common words (e.g., “pain”) are words of lower frequency (e.g., “pane”), so adding “pane” to a list of misspelled words would occasionally identify it incorrectly. In addition, because natural languages are generative and create new words all the time, a word list can never be complete; for example, when “flickr” occurs in text, is it a misspelling of “flicker” or the correct spelling of the popular photo-sharing site?
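To make the idea of rule-based stemming concrete, here is a deliberately naive sketch that handles the three example words from this section. The rule tables and the minimum-length guard are assumptions invented for illustration; production stemmers use much larger and more carefully ordered rule sets.

```python
# Hypothetical rule tables, tried in order. Each suffix rule is
# (suffix to remove, text to put in its place).
SUFFIX_RULES = [("ied", "y"), ("ed", ""), ("s", "")]
PREFIXES = ["un"]


def naive_stem(word):
    """Strip at most one known prefix and one known suffix from `word`."""
    for prefix in PREFIXES:
        # Require a few remaining letters so very short words are left alone.
        if word.startswith(prefix) and len(word) > len(prefix) + 2:
            word = word[len(prefix):]
            break
    for suffix, replacement in SUFFIX_RULES:
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            word = word[: -len(suffix)] + replacement
            break
    return word


for w in ["dogs", "uncertain", "denied"]:
    print(w, "->", naive_stem(w))
```

Rules this crude misfire easily: the same prefix rule turns “under” into “der,” which is one reason practical systems pair rules with exception handling.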


6.4.3.1 Derivational Morphology

Derivational morphology deals with how words are created by combining morphemes. Compounding, putting two “free morphemes” together as in “batman” or “catwoman,” is an extremely powerful mechanism. The meaning of some compounds is easy to understand when the first morpheme qualifies or restricts the meaning of the second, as in “birdcage” and “tollbooth.” However, many compounds take on new meanings that are not as literally derived from the meaning of their constituents, like “seahorse” and “batman.”

Other types of derivations using “bound” morphemes follow more precise rules for combining them with “base” morphemes. The most common types of bound morphemes are prefixes and suffixes, which usually create a word of a different part-of-speech category when they are added. Familiar English prefixes include “a-,” “ab-,” “anti-,” “co-,” “de-,” “pre-,” and “un-.” Among the most common English suffixes are “-able,” “-ation,” “-ify,” “-ing,” “-ity,” “-ize,” “-ment,” and “-ness.” Compounding and adding prefixes or suffixes are simple mechanisms, but very complex words like “unimaginability” can be formed by using them in combination.

6.4.3.2 Inflectional Morphology

Inflectional mechanisms change the form of a word to represent tense, aspect, agreement, or other grammatical information. Unlike derivation, inflection never changes the part-of-speech of the base morpheme. The inflectional morphology of English is relatively simple compared with other languages.

6.5 The Structural Perspective

The structural perspective analyzes the association, arrangement, proximity, or connection between resources without primary concern for their meaning or the origin of these relationships. We take a structural perspective when we define a family as “a collection of people” or when we say that a particular family like the Simpsons has five members. Sometimes all we know is that two resources are connected, as when we see a highlighted word or phrase that is pointing from the current web page to another. At other times we might know more about the reasons for the relationships within a set of resources, but we still focus on their structure, essentially merging or blurring all of the reasons for the associations into a single generic notion that the resources are connected.


Stop and Think: Kevin Bacon Numbers

See http://oracleofbacon.org/ for a web-based demonstration of “Kevin Bacon Numbers,” which measure the average degrees of separation among more than 2.6 million actors in more than 1.9 million movies. Its name reflects the parlor game “Six Degrees of Kevin Bacon,” a pun on “six degrees of separation” that is often associated with Travers and Milgram's work; the game relies on the remarkable variety of Bacon's roles, and hence the number of fellow actors in his movies (two actors in the same movie have one degree of separation). Bacon's Bacon Number is 2.994, but it turns out that more than 300 actors are closer to the center of the movie universe than Bacon. Try some famous actors and see if their Bacon Numbers are greater or smaller than Bacon's. (Hint: older actors have been in more movies.)

Travers and Milgram conducted a now-famous study in the 1960s involving the delivery of written messages between people in the midwestern and eastern United States. If a person did not know the intended recipient, he was instructed to send the message to someone that he thought might know him. The study demonstrated what Travers and Milgram called the “small world problem,” in which any two arbitrarily selected people were separated by an average of fewer than six links.

It is now common to analyze the number of “degrees of separation” between any pair of resources. For example, Markoff and Sengupta describe a 2011 study using Facebook data that computed the average “degree of separation” of any two people in the Facebook world to be 4.74.
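Degrees of separation are commonly computed as shortest path lengths in a graph. The following is a minimal sketch using breadth-first search on a hypothetical five-person acquaintance network; the names, the network, and the function names are invented for illustration and are unrelated to the studies cited above.

```python
from collections import deque

# A hypothetical acquaintance network: each person maps to the people they know.
NETWORK = {
    "Ann": {"Bob", "Cara"},
    "Bob": {"Ann", "Dev"},
    "Cara": {"Ann", "Dev"},
    "Dev": {"Bob", "Cara", "Eli"},
    "Eli": {"Dev"},
}


def degrees_of_separation(network, start, goal):
    """Return the fewest links between two people, or None if unconnected."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        person, distance = queue.popleft()
        for neighbor in network.get(person, ()):
            if neighbor == goal:
                return distance + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, distance + 1))
    return None


def average_separation(network, center):
    """Average distance from `center` to every other reachable person."""
    distances = []
    for other in network:
        if other == center:
            continue
        d = degrees_of_separation(network, center, other)
        if d is not None:
            distances.append(d)
    return sum(distances) / len(distances) if distances else 0.0


print(degrees_of_separation(NETWORK, "Ann", "Eli"))
print(average_separation(NETWORK, "Ann"))
```

An average computed from one person's position, as in `average_separation`, is the same kind of statistic as a Bacon Number, only on a far smaller graph.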

Many types of resources have internal structure in addition to their structural relationships with other resources. Of course, we have to remember (as we discussed in §4.3 Resource Identity (page 152)) that we often face arbitrary choices about the abstraction and granularity with which we describe the parts that make up a resource and whether some combination of resources should also be identified as a resource. This is not easy when you are analyzing the structure of a car with its thousands of parts, and it is even harder with information resources where there are many more ways to define parts and wholes. However, an advantage for information resources is that their internal structural descriptions are usually highly “computable,” something we consider in depth in Chapter 10, Interactions with Resources.


Business Structures

Management science is constantly reevaluating different structures for organizations. Many large businesses are organized similarly near the top, with a board of directors, a chief executive officer, and other executives who manage the vice presidents or directors of various business units. Within and across these business units, however, there are significant variations in how a business can organize its people.

Management strategies are built around the style of organization the business has chosen. These organizational choices reflect the CEO’s management philosophy, the industry, regulatory requirements, operating scale, and other factors. Strict hierarchies are a traditional approach, with a tree structure leading from the lowest level worker directly up to the CEO. The strict management hierarchy at Foxconn is needed to enable close oversight of large numbers of low level employees in the manufacturing industry, with workers organized by physical location.

Other firms have a matrix structure in which an employee can be working on multiple projects, and reporting to a different manager for each one. A consulting firm’s matrix structure might emphasize an employee’s functional role (e.g., “process engineering consultant”) and disassociate it from the employee’s home location, which is why consultants spend so much time traveling on airplanes from project to project.

6.5.1 Intentional, Implicit, and Explicit Structure

In the discipline of organizing we emphasize “intentional structure” created by people or by computational processes rather than accidental or naturally occurring structures created by physical and geological processes. We acknowledged in §1.5 that there is information in the piles of debris left after a tornado or tsunami and in the strata of the Grand Canyon. These structural patterns might be of interest to meteorologists, geologists, or others but because they were not created by an identifiable agent following one or more organizing principles, they are not our primary focus.

Some organizing principles impose very little structure. For a small collection of resources, co-locating them or arranging them near each other might be sufficient organization. We can impose two- or three-dimensional coordinate systems on this “implicit structure” and explicitly describe the location of a resource as precisely as we want, but we more naturally describe the structure of resource locations in relative terms. In English we have many ways to describe the structural relationship of one resource to another: “in,” “on,” “under,” “behind,” “above,” “below,” “near,” “to the right of,” “to the left of,” “next to,” and so on. Sometimes several resources are arranged or appear to be arranged in a sequence or order and we can use positional descriptions of structure: a late 1990s TV show described the planet Earth as the “third rock from the Sun.”

We pay most attention to intentional structures that are explicitly represented within and between resources because they embody the design or authoring choices about how much implicit or latent structure will be made explicit. Structures that can be reliably extracted by algorithms become especially important for very large collections of resources whose scope and scale defy structural analysis by people.

Stop and Think: Intentional, implicit/explicit structure

Find a map of the states (or provinces or other divisions) in your country. You probably think of some set of these as members of a collection. Other than their literal arrangement (e.g., “x is next to y, y is east of z”), how could you describe their relationships to each other within the collection? Are these relationships based on natural or unintentional properties or intentional ones? Example: in the United States, California, Oregon, and Washington are considered the “West Coast” and the Pacific Ocean determines their western boundaries. Some of the borders between the states are natural, determined by rivers, and other borders are more intentional and arbitrary.

6.5.2 Structural Relationships within a Resource

We almost always think of human and other animate resources as unitary entities. Likewise, many physical resources like paintings, sculptures, and manufactured goods have a material integrity that makes us usually consider them as indivisible. For an information resource, however, it is almost always the case that it has or might have had some internal structure or sub-division of its constituent data elements.

In fact, since all computer files are merely encodings of bits, bytes, characters and strings, all digital resources exhibit some internal structure, even if that structure is only discernible by software agents. Fortunately, the once inscrutable internal formats of word processing files are now much more interpretable after they were replaced by XML in the last decade.

When an author writes a document, he or she gives it some internal organization with its title, section headings, typographic conventions, page numbers, and other mechanisms that identify its parts and their significance or relationship to each other. The lowest level of this structural hierarchy, usually the paragraph, contains the text content of the document. Sometimes the author finds it useful to identify types of content like glossary terms or cross-references within the paragraph text. Document models that mix structural description with content “nuggets” in the text are said to contain mixed content.
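Mixed content is easy to see once a paragraph is parsed. The sketch below uses Python's standard `xml.etree.ElementTree` module on a hypothetical paragraph; the element names `<para>`, `<term>`, and `<xref>` and the `target` attribute value are invented for illustration and do not come from any particular schema.

```python
import xml.etree.ElementTree as ET

# A hypothetical paragraph with mixed content: running text interleaved
# with content-oriented child elements.
PARAGRAPH = (
    '<para>A <term>thesaurus</term> organizes words; see '
    '<xref target="sec-6.4.2">Thesauri</xref> for details.</para>'
)


def walk_mixed_content(element):
    """Yield (kind, text) pieces of an element in document order."""
    if element.text:
        yield ("text", element.text)
    for child in element:
        yield (child.tag, "".join(child.itertext()))
        if child.tail:
            # Text that follows a child element is stored on that child.
            yield ("text", child.tail)


para = ET.fromstring(PARAGRAPH)
pieces = list(walk_mixed_content(para))
for kind, text in pieces:
    print(f"{kind:>5}: {text!r}")

terms = [text for kind, text in pieces if kind == "term"]
```

Because the glossary term and the cross-reference are marked as elements, software can list them without having to guess from the wording of the paragraph.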


In data-intensive or transactional domains, document instances tend to be homogeneous because they are produced by or for automated processes, and their information components will appear predictably in the same structural relationships with each other. These structures typically form a hierarchy expressed in an XML schema or word processing style template. XML documents describe their component parts using content-oriented elements like <ITEM>, <NAME>, and <ADDRESS>, that are themselves often aggregate structures or containers for more granular elements. The structures of resources maintained in databases are typically less hierarchical, but the structures are precisely captured in database schemas.

In more qualitative, less information-intensive and more experience-intensive domains, we move toward the narrative end of the Document Type Spectrum, and document instances become more heterogeneous because they are produced by and for people. (See the sidebar, The Document Type Spectrum (page 138) in §4.2.1.) The information conveyed in the documents is conceptual or thematic rather than transactional, and the structural relationships between document parts are much weaker. Instead of precise structure and content rules, there is usually just a shallow hierarchy marked up with word processing or HTML tags like <HEAD>, <H1>, <H2>, and <LIST>.

The internal structural hierarchy in a resource is often extracted and made into a separate and familiar description resource called the “table of contents” to support finding and navigation interactions with the primary resource. In a printed media context, any given content resource is likely to only be presented once, and its page number is provided in the table of contents to allow the reader to locate the chapter, section or appendix in question. In a hypertext media context, a given resource may be a chapter in one book while being an appendix in another.
Some tables of contents are created as a static structural description, but others are dynamically generated from the internal structures whenever the resource is accessed. In addition, other types of entry points can be generated from the names or descriptions of content components, like selectable lists of tables, figures, maps, or code examples.

Identifying the components and their structural relationships in documents is easier when they follow consistent rules for structure (e.g., every non-text component must have a title and caption) and presentation (e.g., hypertext links in web pages are underlined and change cursor shapes when they are “moused over”) that reinforce the distinctions between types of information components. Structural and presentation features are often ordered on some dimension (e.g., type size, line width, amount of white space) and used in a correlated manner to indicate the importance of a content component.

Many indexing algorithms treat documents as “bags of words” to compute statistics about the frequency and distribution of the words they contain while ignoring all semantics and structure. In Chapter 10, Interactions with Resources, we contrast this approach with algorithms that use internal structural descriptions to retrieve more specific parts of documents.

Structural Metadata

Structural metadata, in the form of a schema for a database or document, describes a class of information resources, and may also prescribe grammatical details of inclusion and attribution relationships among the components. For example, the chapters of this book contain four levels of subsections. Each of those sections contains a title, some paragraphs and other text blocks, and subordinate sections. The textual content of the paragraphs includes highlighted terms and phrases that are defined in situ and referenced again in the glossary and index; there are also bibliographic citations that are reflected in the bibliography and index. We can discover these characteristics of a book through observation, but we could also examine its structural metadata, in its schema.

Structural metadata allows us to describe and prescribe relations among database tables, within the chapters of a book, or among parts in an inventory management system. The schema for HTML, for example, informs us that the <A> element can be used to signal a hypertext link-end; whether that link-end is an anchor or a target, or both, depends on the combination of values assigned to attributes. In HTML, the optional REL attribute may contain a value that signals the purpose of a hypertext link, and any HTML element may include a CLASS attribute value that may be used as a CSS selector for the purposes of formatting or dynamic interactions.

The usefulness of any given schema is often a function of the precision with which we may make useful statements based upon the descriptions and prescriptions it offers. Institutional schemas tend to be more prescriptive and restrictive, stressing professional orthodoxy and conformance to controlled vocabularies. Schemas for the information content in social and informal applications tend to be less prescriptive. Whether and how we use structural metadata is a tradeoff. Structural metadata is essential to enable quality control and maintenance in information collection and publishing processes, but someone has to do the work to create it.
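As a rough illustration of how a table of contents can be generated dynamically from a document's internal structure, the sketch below walks nested sections in a small XML document. The `<book>`, `<section>`, `<title>`, and `<para>` element names and the dotted numbering scheme are assumptions made for this example rather than the markup of any real publishing system.

```python
import xml.etree.ElementTree as ET

# A hypothetical document whose nesting of <section> elements is its structure.
DOCUMENT = """
<book>
  <section><title>Introduction</title>
    <para>Families are collections of people.</para>
  </section>
  <section><title>The Structural Perspective</title>
    <section><title>Structure within a Resource</title>
      <para>Documents have parts.</para>
    </section>
    <section><title>Structure between Resources</title>
      <para>Pages link to pages.</para>
    </section>
  </section>
</book>
"""


def table_of_contents(element, prefix=""):
    """Return (number, title) entries by walking nested <section> elements."""
    entries = []
    for position, section in enumerate(element.findall("section"), start=1):
        number = f"{prefix}{position}"
        title = section.findtext("title", default="(untitled)")
        entries.append((number, title))
        # Recurse so that subsections are numbered under their parent.
        entries.extend(table_of_contents(section, prefix=number + "."))
    return entries


toc = table_of_contents(ET.fromstring(DOCUMENT))
for number, title in toc:
    print(number, title)
```

The same traversal could collect figures, tables, or code examples instead of sections, which is how the other entry points mentioned in this section might be produced. A bag-of-words index of the same document would discard exactly the nesting that this function relies on.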

6.5.3 Structural Relationships between Resources

Many types of resources have “structural relationships” that interconnect them. Web pages are almost always linked to other pages. Sometimes the links among a set of pages remain mostly within those pages, as they are in an e-commerce catalog site. More often, however, links connect to pages in other sites, creating a link network that cuts across and obscures the boundaries between sites.
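The distinction between links that stay within a site and links that cross site boundaries can be computed from a page's markup. The following sketch uses Python's standard `html.parser` and `urllib.parse` modules; the page fragment, the URLs, and the rule that "same site" means "same host name" are hypothetical choices made for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href and rel values of every <a> element on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            attributes = dict(attrs)
            if attributes.get("href"):
                self.links.append((attributes["href"], attributes.get("rel")))


def classify_links(page_url, html):
    """Split a page's links into same-site and cross-site destinations."""
    collector = LinkCollector()
    collector.feed(html)
    site = urlparse(page_url).netloc
    internal, external = [], []
    for href, _rel in collector.links:
        target = urljoin(page_url, href)  # resolve relative links
        if urlparse(target).netloc == site:
            internal.append(target)
        else:
            external.append(target)
    return internal, external


# Hypothetical page fragment and URLs, for illustration only.
PAGE = (
    '<a href="/catalog/item2">Next item</a> '
    '<a href="https://example.org/review" rel="nofollow">Review</a>'
)
internal, external = classify_links(
    "https://shop.example.com/catalog/item1", PAGE
)
print(internal)
print(external)
```

Running this over every page of a site would yield the link network described above, with the external list marking where that network crosses site boundaries.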


Stop and Think: Structural Metadata for a Course Syllabus

Analyze the structure of your syllabus for this course. What are its structural elements and some of the rules that specify how they are organized? Remember, think in terms of structural elements and not presentational elements. How does this structural schema compare to those of other course syllabi? What kinds of interactions would be enabled if all of your courses used the same syllabus schema?

The links between documents can be analyzed to infer connections between the authors of the documents. Using the pattern of links between documents to understand the structure of knowledge and of the intellectual community that creates it is not a new idea, but it has been energized as more of the information we exchange with other people is on the web or otherwise in digital formats. An important function in Google’s search engine is the PageRank algorithm, which calculates the relevance of a page in part using the number of links that point to it while giving greater weight to links from pages that are themselves linked to often.

Web-based social networks enable people to express their connections with other people directly, bypassing the need to infer the connections from links in documents or other communications.
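The link-weighting idea described above can be illustrated with a simplified, textbook-style power iteration. This is a minimal sketch and not Google's actual implementation: the function name `pagerank`, the damping value of 0.85, the iteration count, and the four-page example web are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate link-based importance scores by repeated redistribution.

    links maps each page to the list of pages it links to.
    The damping value and iteration count are illustrative assumptions.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page receives a small baseline share regardless of its links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page divides its current score evenly among its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score over all pages.
                for other in pages:
                    new_rank[other] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical four-page web: A, B, and C all link to D; D links back to A.
example_links = {"A": ["D"], "B": ["D"], "C": ["D"], "D": ["A"]}
scores = pagerank(example_links)
```

In this toy web, D scores highest because three pages link to it, and A outranks B and C even though it has only one incoming link, because that link comes from the highly ranked D.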

6.5.3.1 Hypertext Links

The concept of read-only or follow-only structures that connect one document to another is usually attributed to Vannevar Bush in his seminal 1945 essay titled “As We May Think.” Bush called it associative indexing, defined as “a provision whereby any item may be caused at will to select immediately and automatically another.” The “item” connected in this way was for Bush most often a book or a scientific article. However, the anchor and destination of a hypertext link can be a resource of any granularity: a single point or character, a paragraph, a document, or any part of the resource to which the ends of the link are connected. The anchor and destination of a web link are its structural specification, but we often need to consider links from other perspectives. (See the sidebar, Perspectives on Hypertext Links (page 253)).

Theodor Holm Nelson, in a book intriguingly titled Literary Machines, renamed associative indexing as hypertext decades later, expanding the idea to make it a writing style as well as a reading style. Nelson urged writers to use hypertext to create non-sequential narratives that gave choices to readers, using a novel technique for which he coined the term transclusion.

Transclusion

The inclusion, by hypertext reference, of a resource or part of a resource into another resource is called transclusion. Transclusion is normally performed automatically, without user intervention. The inclusion of images in web documents is an example of transclusion. Transclusion is a frequently used technique in business and legal document processing, where re-use of consistent and up-to-date content is essential to achieve efficiency and consistency.

At about the same time, and without knowing about Nelson’s work, Douglas Engelbart’s Augmenting Human Intellect described a future world in which professionals equipped with interactive computer displays utilize an information space consisting of cross-linked resources.

In the 1960s, when computers lacked graphic displays and were primarily employed to solve complex mathematical and scientific problems that might take minutes, hours or even days to complete, Nelson’s and Engelbart’s visions of hypertext-based personal computing may have seemed far-fetched. In spite of this, by 1968, Engelbart and his team demonstrated a human-computer interface that included the mouse, hypertext, and interactive media, along with a set of guiding principles.

Hypertext links are now familiar structural mechanisms in information applications because of the World Wide Web, proposed in 1989 by Tim Berners-Lee and Robert Cailliau. They invented the methods for encoding and following hypertext links using the now popular HyperText Markup Language (HTML). The resources connected by HTML’s hypertext links are not limited to text or documents. Selecting a hypertext link can invoke a connected resource that might be a picture, video, or interactive application.

Since browsers made them familiar, hypertext links have been used in other computing applications as structure and navigation mechanisms.


Perspectives on Hypertext Links

A lexical perspective on hypertext links concerns the words that are used to signal the presence of a link or to encode its type. In web contexts, the words in which a structural link is embedded are called the anchor text. More generally, rhetorical structure theory analyzes how different conventions or signals in texts indicate relationships between texts or parts of them, like the subtle differences in polarity among “see,” “see also,” and “but see” as citation signals.

Many hypertext links in web pages are purely structural because they lack explicit representation of the reason for the relationship. When it is evident, this semantic property of the link is called the link type.

An architectural perspective on links considers whether links are one-way or bi-directional. When a bi-directional link is created between an anchor and a destination, it is as though a one-way link that can be followed in the opposite direction is automatically created. Two one-way links serve the same purpose, but the return link is not automatically established when the first one is created. A second architectural consideration is whether to employ binary links, connecting one anchor to one destination, or n-ary links, connecting one anchor to multiple types of destinations. (See §6.6)

A “front end” or “surface” implementation perspective on hypertext links concerns how the presence of the link is indicated in a user interface; this is called the “link marker”; underlining or coloring of clickable text are conventional markers for web links. A “back end” implementation issue is whether links are contained or embedded in the resources they link or whether they are stored separately in a link base. (See §6.7)

6.5.3.2 Analyzing Link Structures

We can portray a set of links between resources graphically as a pattern of boxes and links. Because a link connection from one resource to another need not imply a link in the opposite direction, we distinguish one-way links from explicitly bi-directional ones.

A graphical representation of link structure is shown on the left panel of Figure 6.3, Representing Link Structures. For a small network of links, a diagram like this one makes it easy to see that some resources have more incoming or outgoing links than other resources. However, for most purposes we leave the analysis of link structures to computer programs, and there it is much better to represent the link structures more abstractly in matrix form. In this matrix the resource identifiers on the row and column heads represent the source and destination of the link. This is a full matrix because not all of the links are symmetric; a link from resource 1 to resource 2 does not imply one from 2 to 1.


Figure 6.3. Representing Link Structures.

The structure of links between web resources can be represented graphically or in a matrix. The matrix representation is a more abstract one that can be analyzed by computers.

A matrix representation of the same link structure is shown on the right panel of Figure 6.3, Representing Link Structures. This representation models the network as a directed graph in which the resources are the vertices and the relationships are the edges that connect them. We now can apply graph algorithms to determine many useful properties. A very important property is reachability, the “can you get there from here” property. Other useful properties include the average number of incoming or outgoing links, the average distance between any two resources, and the shortest path between them.
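As a rough illustration of the graph properties just listed, the following sketch stores a small link structure as a matrix and computes link counts, reachability, and a shortest path with breadth-first search. The four-resource matrix is hypothetical (it is not the one in Figure 6.3), and the function names are chosen only for this example.

```python
from collections import deque

# Hypothetical link matrix for four resources (not the one in Figure 6.3).
# matrix[i][j] == 1 means resource i+1 links to resource j+1.
matrix = [
    [0, 1, 1, 0],  # resource 1 links to 2 and 3
    [0, 0, 1, 0],  # resource 2 links to 3
    [1, 0, 0, 0],  # resource 3 links to 1
    [0, 0, 1, 0],  # resource 4 links to 3
]


def out_degrees(m):
    """Number of outgoing links per resource (row sums)."""
    return [sum(row) for row in m]


def in_degrees(m):
    """Number of incoming links per resource (column sums)."""
    return [sum(row[j] for row in m) for j in range(len(m))]


def shortest_path(m, source, target):
    """Breadth-first search; returns a list of 1-based resource numbers or None."""
    start, goal = source - 1, target - 1
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk back through the recorded predecessors to rebuild the path.
            path = []
            while node is not None:
                path.append(node + 1)
                node = previous[node]
            return path[::-1]
        for neighbor, linked in enumerate(m[node]):
            if linked and neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


def reachable(m, source, target):
    """The "can you get there from here" property."""
    return shortest_path(m, source, target) is not None
```

Because the matrix is not symmetric, resource 4 can reach resource 2 by following links through 3 and 1, but nothing can reach resource 4, which has no incoming links.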

6.5.3.3 Bibliometrics, Shepardizing, Altmetrics, and Social Network Analysis

Information scientists began studying the structure of scientific citation, now called bibliometrics, nearly a century ago to identify influential scientists and publications. This analysis of the flow of ideas through publications can identify “invisible colleges” of scientists who rely on each other’s research, and recognize the emergence of new scientific disciplines or research areas. Universities use bibliometrics to evaluate professors for promotion and tenure, and libraries use it to select resources for their collections.


The expression of citation relationships between documents is especially nuanced in legal contexts, where the use of legal cases as precedents makes it essential to distinguish precisely where a new ruling lies on the relational continuum between “Following” and “Overruling” with respect to a case it cites. The analysis of legal citations to determine whether a cited case is still good law is called Shepardizing because lists of cases annotated in this way were first published in the late 1800s by Frank Shepard, a salesman for a legal publishing company.

The links pointing to a web page might be thought of as citations to it, so it is tempting to make the analogy to consider Shepardizing the web. But unlike legal rulings, web pages aren’t always persistent, and only courts have the authority to determine the value of cited cases as precedents, so Shepard-like metrics for web pages would be tricky to calculate and unreliable.

Nevertheless, the web’s importance as a publishing and communication medium is undeniable, and many scholars, especially younger ones, now contribute to their fields by blogging, Tweeting, leaving comments on online publications, writing Wikipedia articles, giving MOOC lectures, and uploading papers, code, and datasets to open access repositories. Because the traditional bibliometrics pay no attention to this body of work, alternative metrics or “altmetrics” have been proposed to count these new venues for scholarly influence.

Facebook’s valuation is based on its ability to exploit the structure of a person’s social network to personalize advertisements for people and “friends” to whom they are connected. Many computer science researchers are working to determine the important characteristics of people and relationships that best identify the people whose activities or messages influence others to spend money.

6.6 The Architectural Perspective

The architectural perspective emphasizes the number and abstraction level of the components of a relationship, which together characterize the complexity of the relationship. We will briefly consider three architectural issues: degree (or arity), cardinality, and directionality.

These architectural concepts come from data modeling and they enable relationships to be described precisely and abstractly, which is essential for maintaining an organizing system that implements relationships among resources. Application and technology lifecycles have never been shorter, and vast amounts of new data are being created by increased tracking of online interactions and by all the active resources that are now part of the Internet of Things. Organizing systems built without clear architectural foundations cannot easily scale up in size and scope to handle these new requirements.


6.6.1 Degree

The degree or arity of a relationship is the number of entity types or categories of resources in the relationship. This is usually, though not always, the same as the number of arguments in the relationship expression.

Homer Simpson (husband) ⇔ is-married-to ⇔ Marge Simpson (wife)

is a relationship of degree 2, a binary relationship between two entity types, because the “is-married-to” relationship as we first defined it requires one of the arguments to be of entity type “husband” and one of them to be of type “wife.”

Now suppose we change the definition of marriage to allow the two participants in a marriage to be any instance of the entity type “person.” The relationship expression looks exactly the same, but its degree is now unary because only one entity type is needed to instantiate the two arguments:

Homer Simpson (person) ⇔ is-married-to ⇔ Marge Simpson (person)

Some relationships are best expressed as ternary ones that involve three different entity types. An example that appears in numerous data modeling books is one like this:

Supplier → provides → Part → assembled-in → Product

It is always possible to represent ternary relationships as a set of binary ones by creating a new entity type that relates to each of the others in turn. This new entity type is called a dummy in modeling practice.

Supplier → provides → DUMMY
Part → provided-for → DUMMY
DUMMY → assembled-in → Product

This transformation from a sensible ternary relationship to three binary ones involving a DUMMY entity type undoubtedly seems strange, but it enables all relationships to be binary while still preserving the meaning of the original ternary one. Making all relationships binary makes it easier to store relationships and combine them to discover new ones.
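The DUMMY transformation can be sketched in a few lines. This is only an illustration under stated assumptions: the function names `reify_ternary` and `recover_ternary`, the generated node labels such as `DUMMY-1`, and the placeholder values in the usage line are all hypothetical. The predicate names come from the example above.

```python
import itertools

# Generates a fresh label for each intermediate node (labels are hypothetical).
_node_numbers = itertools.count(1)


def reify_ternary(supplier, part, product):
    """Replace one ternary fact with three binary triples that share a new node."""
    node = f"DUMMY-{next(_node_numbers)}"
    return [
        (supplier, "provides", node),
        (part, "provided-for", node),
        (node, "assembled-in", product),
    ]


def recover_ternary(triples):
    """Rebuild (supplier, part, product) facts by joining on the shared node."""
    suppliers = {o: s for s, p, o in triples if p == "provides"}
    parts = {o: s for s, p, o in triples if p == "provided-for"}
    products = {s: o for s, p, o in triples if p == "assembled-in"}
    return [(suppliers[n], parts[n], products[n]) for n in sorted(products)]


# Placeholder values standing in for a real supplier, part, and product.
example = reify_ternary("Supplier S", "Part P", "Product X")
```

Joining the three binary triples on the shared node recovers the original ternary fact, which is the sense in which the transformation preserves its meaning.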


6.6.2 Cardinality

The cardinality of a relationship is the number of instances that can be associated with each entity type in a relationship. At first glance this might seem to be degree by another name, but it is not.

Cardinality is easiest to explain for binary relationships. If we return to Homer and Marge, the binary relationship that expresses that they are married husband and wife is a one-to-one relationship because a husband can only have one wife and a wife can only have one husband (at a time, in monogamous societies like the one in which the Simpsons live).

In contrast, the “is-parent-of” relationship is one-to-many, because the meaning of being a parent makes it correct to say that:

Homer Simpson → is-parent-of → Bart AND Lisa AND Maggie

As we did with the ternary relationship in §6.6.1 Degree (page 256), we can transform this more complex relationship architecture to a set of simpler ones by restricting expressions about being a parent to the one-to-one cardinality.

Homer Simpson → is-parent-of → Bart
Homer Simpson → is-parent-of → Lisa
Homer Simpson → is-parent-of → Maggie

The one-to-many expression brings all three of Homer’s children together as arguments in the same relational expression, making it more obvious that they share the same relationship than in the set of separate and redundant one-to-one expressions.
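The two forms of the parent relationship, and a simple cardinality check, might be sketched as follows. The function names `expand`, `group`, and `is_one_to_one` are assumptions made for this illustration; the names and predicates come from the examples above.

```python
from collections import defaultdict


def expand(subject, predicate, objects):
    """Turn one one-to-many expression into separate one-to-one triples."""
    return [(subject, predicate, obj) for obj in objects]


def group(triples):
    """Collect one-to-one triples back into one-to-many expressions."""
    grouped = defaultdict(list)
    for subject, predicate, obj in triples:
        grouped[(subject, predicate)].append(obj)
    return dict(grouped)


def is_one_to_one(triples, predicate):
    """True if no subject has two objects and no object has two subjects."""
    by_subject, by_object = defaultdict(set), defaultdict(set)
    for subject, pred, obj in triples:
        if pred == predicate:
            by_subject[subject].add(obj)
            by_object[obj].add(subject)
    return all(len(v) <= 1 for v in by_subject.values()) and all(
        len(v) <= 1 for v in by_object.values()
    )


parent_triples = expand("Homer Simpson", "is-parent-of", ["Bart", "Lisa", "Maggie"])
marriage_triples = [
    ("Homer Simpson", "is-married-to", "Marge Simpson"),
    ("Marge Simpson", "is-married-to", "Homer Simpson"),
]
```

Here `group(parent_triples)` brings the three children back together under one subject and predicate, and `is_one_to_one` holds for the marriage triples but not for the parent triples.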

6.6.3 Directionality

The directionality of a relationship defines the order in which the arguments of the relationship are connected. A one-way or uni-directional relationship can be followed in only one direction, whereas a bi-directional one can be followed in both directions.

All symmetric relationships are bi-directional, but not all bi-directional relationships are symmetric. (See §6.3.2.1 Symmetry (page 236).) A relationship between a manager and an employee that he manages is “employs,” a different meaning than the “is-employed-by” relationship in the opposite direction. As in this example, the relationship is often lexicalized in only one direction.
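One way to picture the difference between symmetric and merely bi-directional relationships is a small table of inverse names. This is a sketch under assumptions: the tables `SYMMETRIC` and `INVERSES`, the function `with_inverses`, and the placeholder names "Manager" and "Employee" are invented for illustration.

```python
# Symmetric relationships keep the same name in both directions.
SYMMETRIC = {"is-married-to"}

# Bi-directional but asymmetric relationships need a different name going back.
INVERSES = {"employs": "is-employed-by"}


def with_inverses(triples):
    """Add the return direction for every relationship that has one."""
    result = set(triples)
    for subject, predicate, obj in triples:
        if predicate in SYMMETRIC:
            result.add((obj, predicate, subject))
        elif predicate in INVERSES:
            result.add((obj, INVERSES[predicate], subject))
    return result


stored = [
    ("Manager", "employs", "Employee"),
    ("Homer Simpson", "is-married-to", "Marge Simpson"),
]
both_directions = with_inverses(stored)
```

A relationship whose name appears in neither table stays one-way, because nothing tells the program how to follow it in the opposite direction.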


6.7 The Implementation Perspective

Finally, the implementation perspective on relationships considers how a relationship is realized or encoded in a technology context. The implementation perspective contrasts strongly with the conceptual, structural, and architectural perspectives, which emphasize the meaning and abstract structure of relationships. The implementation perspective is a superset of the lexical perspective, because the choice of the language in which to express a relationship is an implementation decision. However, most people think of implementation as all of the decisions about technological form rather than just about the choice of words.

In this book we focus on the fundamental issues and challenges that apply to all organizing systems, and not just on information-intensive ones that rely extensively on technology. Even with this reduced scope, there are some critical implementation concerns about the notation, syntax, and deployment of the relationships and other descriptions about resources. We briefly introduce some of these issues here and then discuss them in detail in Chapter 9, The Forms of Resource Descriptions.

6.7.1 Choice of Implementation

The choice of implementation determines how easy it is to understand and process a set of relationships. For example, the second sentence of this chapter is a natural language implementation of a set of relationships in the Simpson family:

The Simpson family includes a man named Homer and a woman named Marge, the married parents of three sibling children, a boy named Bart and two girls, Lisa and Maggie.

A subject-predicate-object syntax makes the relationships more explicit:

Example 6.1. Subject-predicate syntax

Homer Simpson → is-married-to → Marge Simpson
Homer Simpson → is-parent-of → Bart
Homer Simpson → is-parent-of → Lisa
Homer Simpson → is-parent-of → Maggie
Marge Simpson → is-married-to → Homer Simpson
Marge Simpson → is-parent-of → Bart
Marge Simpson → is-parent-of → Lisa
Marge Simpson → is-parent-of → Maggie
Bart Simpson → is-a → Boy
Lisa Simpson → is-a → Girl
Maggie Simpson → is-a → Girl
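To suggest why the explicit syntax is easier to process than the natural language sentence, here is a minimal sketch that stores the triples of Example 6.1 and answers simple questions about them. It assumes that each person is identified by one consistent full name (so "Bart" becomes "Bart Simpson"), and the function names are hypothetical.

```python
# Triples from Example 6.1, using full names as identifiers throughout
# (an assumption made so that every mention of a person matches exactly).
TRIPLES = [
    ("Homer Simpson", "is-married-to", "Marge Simpson"),
    ("Homer Simpson", "is-parent-of", "Bart Simpson"),
    ("Homer Simpson", "is-parent-of", "Lisa Simpson"),
    ("Homer Simpson", "is-parent-of", "Maggie Simpson"),
    ("Marge Simpson", "is-married-to", "Homer Simpson"),
    ("Marge Simpson", "is-parent-of", "Bart Simpson"),
    ("Marge Simpson", "is-parent-of", "Lisa Simpson"),
    ("Marge Simpson", "is-parent-of", "Maggie Simpson"),
    ("Bart Simpson", "is-a", "Boy"),
    ("Lisa Simpson", "is-a", "Girl"),
    ("Maggie Simpson", "is-a", "Girl"),
]


def objects_of(triples, subject, predicate):
    """Everything the subject is related to by the predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def subjects_of(triples, predicate, obj):
    """Everything related to the object by the predicate."""
    return [s for s, p, o in triples if p == predicate and o == obj]


def siblings_of(triples, person):
    """Infer siblings: other people who share at least one parent."""
    found = []
    for parent in subjects_of(triples, "is-parent-of", person):
        for child in objects_of(triples, parent, "is-parent-of"):
            if child != person and child not in found:
                found.append(child)
    return found
```

The sibling relationship is never stated in the triples, yet it can be derived by combining two "is-parent-of" statements, which is much harder to do reliably from the original sentence.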


In the following example of a potential XML implementation syntax, we emphasize class inclusion relationships by using elements as containers, and the relationships among the members of the family are expressed explicitly through references, using XML’s ID, IDREF, and IDREFS attribute types:

Example 6.2. An XML implementation syntax

<Family name="Simpson">
  <Parents children="Bart Lisa Maggie">
    <Father name="Homer" spouse="Marge" />
    <Mother name="Marge" spouse="Homer" />
  </Parents>
  <Children parents="Homer Marge" >
    <Boy name="Bart" siblings="Lisa Maggie" />
    <Girl name="Lisa" siblings="Bart Maggie" />
    <Girl name="Maggie" siblings="Bart Lisa" />
  </Children>
</Family>
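A program can follow the references in Example 6.2 once the document is parsed. The sketch below uses Python's standard `xml.etree.ElementTree` module, which does not validate ID and IDREF attribute types, so the lookup table is built by hand; treating `name` as the identifier, and the helper names `resolve` and `dangling_references`, are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

FAMILY_XML = """
<Family name="Simpson">
  <Parents children="Bart Lisa Maggie">
    <Father name="Homer" spouse="Marge" />
    <Mother name="Marge" spouse="Homer" />
  </Parents>
  <Children parents="Homer Marge" >
    <Boy name="Bart" siblings="Lisa Maggie" />
    <Girl name="Lisa" siblings="Bart Maggie" />
    <Girl name="Maggie" siblings="Bart Lisa" />
  </Children>
</Family>
"""

root = ET.fromstring(FAMILY_XML.strip())

# Index each person element by its name, playing the role of an ID lookup.
people = {person.get("name"): person for container in root for person in container}


def resolve(person_name, attribute):
    """Follow a space-separated reference attribute to the elements it names."""
    references = people[person_name].get(attribute, "").split()
    return [people[ref] for ref in references]


def dangling_references():
    """List (name, attribute, target) for references that point at nothing."""
    problems = []
    for name, person in people.items():
        for attribute in ("spouse", "siblings"):
            for ref in person.get(attribute, "").split():
                if ref not in people:
                    problems.append((name, attribute, ref))
    return problems
```

The element names carry the class inclusion (Bart is inside Children and is a Boy), while the attribute values carry the explicit references that `resolve` follows.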

None of the models we have presented so far in this chapter represents the complexities of modern families that involve multiple marriages and children from more than one marriage, but they are sufficient for our limited demonstration purposes.

6.7.2 Syntax and Grammar

The syntax and grammar of a language consist of the rules that determine which combinations of its words are allowed and are thus grammatical or well-formed. Natural languages are substantially similar in having nouns, verbs, adjectives and other parts of speech, but they differ greatly in how they arrange them to create sentences. Conformance to the rules for arranging these parts makes a sentence syntactically compliant but does not mean that an expression is semantically comprehensible; the classic example is Chomsky’s anomalous sentence:

Colorless green ideas sleep furiously

Any meaning this sentence has is odd, difficult to visualize, and outside of readily accessible experience, but anyone who knows the English language can recognize that it follows its syntactic rules, as opposed to this sentence, which breaks them and seems completely meaningless:

Ideas colorless sleep furiously green
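The separation between syntactic well-formedness and meaning can be mimicked with a toy checker that looks only at parts of speech. Everything here is an assumption for illustration: the five-word lexicon, the single sentence rule, and the function name `is_well_formed` are far simpler than any real grammar of English.

```python
import re

# Toy lexicon covering only the five words in the two example sentences.
LEXICON = {
    "colorless": "ADJ",
    "green": "ADJ",
    "ideas": "NOUN",
    "sleep": "VERB",
    "furiously": "ADV",
}

# One toy sentence rule: optional adjectives, a noun, a verb, an optional adverb.
SENTENCE_RULE = re.compile(r"(ADJ )*NOUN VERB( ADV)?")


def is_well_formed(sentence):
    """Check word order against the toy rule, ignoring meaning entirely."""
    tags = [LEXICON.get(word) for word in sentence.lower().split()]
    if None in tags:
        return False  # a word outside the toy lexicon
    return SENTENCE_RULE.fullmatch(" ".join(tags)) is not None
```

The checker accepts the first sentence and rejects the second without knowing anything about what ideas, colors, or sleep are.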


6.7.3 Requirements for Implementation Syntax

The most basic requirement for implementation syntax is that it can represent all the expressions that it needs to express. For the examples in this chapter we have used an informal combination of English words and symbols (arrows and parentheses) that you could understand easily, but this simple language is incapable of expressing most of what we readily say in English. However, the expressive benefit of natural language only accrues to people; a more restrictive and formal syntax is easier for computers to understand.

A second consideration is that the implementation can be understood and used by its intended users. We can usually express a relationship in different languages while preserving its meaning, just as we can usually implement the same computing functionality in different programming languages. From a semantic perspective these three expressions are equivalent:

My name is Homer Simpson
Mon nom est Homer Simpson
Mein Name ist Homer Simpson

However, whether these expressions are equivalent for someone reading them depends on which languages they understand.

An analogous situation occurs with the implementation of web pages. HTML was invented as a language for encoding how web pages look in a browser, and most of the tags in HTML represent the simple structure of an analogous print document. Representing paragraphs, list items and numbered headings with <P> and <LI> and <Hn> makes using HTML so easy that school children can create web pages. However, the “web for eyes” implemented using HTML is less efficient and practical for computers that need to treat content as product catalogs, orders, invoices, payments, and other business transactions and information that can be analyzed and processed. This “web for computers” is best implemented using domain-specific vocabularies in XML.

6.8 Relationships in Organizing Systems

In the previous sections, as we surveyed the five perspectives on analyzing relationships, we mentioned numerous examples where relationships had important roles in organizing systems. In this final section we examine three contexts for organizing systems where relationships are especially fundamental: the Semantic Web and Linked Data, bibliographic organizing systems, and situations involving system integration and interoperability.


Wikipedia Info Boxes

Wikipedia encourages authors to augment their articles with “info boxes” that organize sets of structured information generically relevant to the type of resource that is the subject of the article. These three examples show parts of the info boxes for “Company,” “Song,” and “City.”

(Collage of screenshots by R. Glushko.)

6.8.1 The Semantic Web and Linked Data

In a classic 2001 paper, Tim Berners-Lee laid out a vision of a Semantic Web in which all information could be shared and processed by automated tools as well as by people. The essential technologies for making the web more semantic and relationships among web resources more explicit are applications of XML, including RDF (§5.2.2.4 Resource Description Framework (RDF) (page 184)), and OWL (§6.3.3 Ontologies (page 238)). Many tools have been developed to support more semantic encoding, but most still require substantial expertise in semantic technologies and web standards.

More likely to succeed are applications that aim lower, not trying to encode all the latent semantics in a document or web page. For example, some wiki and blogging tools contain templates for semantic annotation, and Wikipedia has thousands of templates and “info boxes” to encourage the creation of information in content-encoded formats.

The “Linked Data” movement is an extension of the Semantic Web idea to reframe the basic principles of the web’s architecture in more semantic terms. Instead of the limited role of links as simple untyped relationships between HTML documents, links between resources described by RDF can serve as the bridges between islands of semantic data, creating a Linked Data network or cloud.

6.8.2 Bibliographic Organizing Systems

Much of our thinking about relationships in organizing systems for information comes from the domain of bibliographic cataloging of library resources and the related areas of classification systems and descriptive thesauri. Bibliographic relationships provide an important means to build structure into library catalogs.

Bibliographic relationships are common among library resources. Smiraglia and Leazer found that approximately 30% of the works in the Online Computer Library Center (OCLC) WorldCat union catalog have associated derivative works. Relationships among items within these bibliographic families differ, but the average family size for those works with derivative works was found to be 3.54 items. Moreover, “canonical” works that have strong cultural meaning and influence, such as “the plays of William Shakespeare” and The Bible, have very large and complex bibliographic families.

6.8.2.1 Tillett’s Taxonomy

Barbara Tillett, in a study of 19th and 20th-century catalog rules, found that many different catalog rules have existed over time to describe bibliographic relationships. She developed a taxonomy of bibliographic relationships that includes equivalence, derivative, descriptive, whole-part, accompanying, sequential or chronological, and shared characteristic. These relationship types span the relationship perspectives defined in this chapter: equivalence, derivative, and descriptive are semantic types; whole-part and accompanying are part semantic and part structural types; sequential or chronological are part lexical and part structural types; and shared characteristic is a part semantic and part lexical type.

6.8.2.2 Resource Description and Access (RDA)

Many cataloging researchers have recognized that online catalogs do not do a very good job of encoding bibliographic relationships among items, both due to catalog display design and to the limitations of how information is organized within catalog records. Author name authority databases, for example, provide information for variant author names, which can be very important in finding all of the works by a single author, but this information is not held within a catalog record. Similarly, MARC records can be formatted and displayed in web library catalogs, but the data within the records are not available for re-use, re-purposing, or re-arranging by researchers, patrons, or librarians.

The Resource Description and Access (RDA) next-generation cataloging rules are attempting to bring together disconnected resource descriptions to provide more complete and interconnected data about works, authors, publications, publishers, and subjects. RDA uses RDF to assert relationships among bibliographic materials.

6.8.2.3 RDA and the Semantic Web

The move in RDA to encode bibliographic data in RDF stems from the desire to make library catalog data more web-accessible. As web-based data mash-ups, application programming interfaces (APIs), and web searching are becoming ubiquitous and expected, library data are becoming increasingly isolated. The developers of RDA see RDF as the means for making library data more widely available online.

In addition to simply making library data more web accessible, RDA seeks to leverage the distributed nature of the Semantic Web. Once rules for describing resources, and the relationships between them, are declared in RDF syntax and made publicly available, the rules themselves can be mixed and mashed up. Creators of information systems that use RDF can choose elements from any RDF schema. For example, we can use the Dublin Core metadata schema (which has been aligned with the RDF model) and the Friend of a Friend (FOAF) schema (a schema to describe people and the relationships between them) to create a set of metadata elements about a journal article that goes beyond the standard bibliographic information. RDA’s process of moving to RDF is well underway.
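Mixing elements from Dublin Core and FOAF in one description might look like the following sketch, which writes the statements out by hand in the line-based N-Triples syntax for RDF. The namespace addresses are the published ones for Dublin Core terms, FOAF, and RDF; the article and author identifiers under example.org, the title, the author name, and the helper `to_ntriples` are hypothetical, and no RDF library is assumed.

```python
DCTERMS = "http://purl.org/dc/terms/"
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical identifiers for a journal article and its two authors.
ARTICLE = "http://example.org/articles/1"
AUTHOR_A = "http://example.org/people/a"
AUTHOR_B = "http://example.org/people/b"

# Objects written with surrounding double quotes are literal text values;
# every other object is treated as the identifier of another resource.
statements = [
    (ARTICLE, DCTERMS + "title", '"A Hypothetical Article"'),
    (ARTICLE, DCTERMS + "creator", AUTHOR_A),
    (ARTICLE, DCTERMS + "creator", AUTHOR_B),
    (AUTHOR_A, RDF_TYPE, FOAF + "Person"),
    (AUTHOR_A, FOAF + "name", '"Author A"'),
    (AUTHOR_A, FOAF + "knows", AUTHOR_B),
]


def to_ntriples(triples):
    """Write each statement as one N-Triples line ending in ' .'"""
    lines = []
    for subject, predicate, obj in triples:
        rendered = obj if obj.startswith('"') else f"<{obj}>"
        lines.append(f"<{subject}> <{predicate}> {rendered} .")
    return "\n".join(lines)


document = to_ntriples(statements)
```

The bibliographic statements (title, creator) and the social ones (one author knows the other) sit side by side in the same description, which is the kind of mixing that shared RDF vocabularies make possible.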

6.8.3 Integration and Interoperability

Integration is the controlled sharing of information between two (or more) business systems, applications, or services within or between firms. Integration means that one party can extract or obtain information from another one; it does not imply that the recipient can make use of the information.

Interoperability goes beyond integration to mean that systems, applications, or services that exchange information can make sense of what they receive. Interoperability can involve identifying corresponding components and relationships in each system, transforming them syntactically to the same format, structurally to the same granularity, and semantically to the same meaning.

For example, an Internet shopping site might present customers with a product catalog whose items come from a variety of manufacturers who describe the same products in different ways. Likewise, the end-to-end process from customer ordering to delivery requires that customer, product and payment information pass through the information systems of different firms. Creating the necessary information mappings and transformations is tedious or even impossible if the components and relationships among them are not formally specified for each system.

In contrast, when these models exist as data or document schemas or as classes in programming languages, identifying and exploiting the relationships between the information in different systems to achieve interoperability or to merge different classification systems can often be completely automated. Because of the substantial economic benefits to governments, businesses, and their customers of more efficient information integration and exchange, efforts to standardize these information models are important in numerous industries. Chapter 10, Interactions with Resources, will dive deeper into interoperability issues, especially those that arise in business contexts.
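The product catalog example can be made concrete with a small sketch of the three kinds of transformation: syntactic (text to numbers), structural (flattening a nested price), and semantic (kilograms to grams). Both manufacturer record layouts, the field names, and the common target format are invented for illustration.

```python
# Two hypothetical manufacturers describe the same kind of product differently.
maker_a = {"sku": "A-100", "name": "Steel bolt", "weight_g": 12, "price_usd": "0.40"}
maker_b = {
    "PartNo": "B-7",
    "Description": "Steel bolt",
    "WeightKg": 0.012,
    "Price": {"amount": 0.45, "currency": "USD"},
}


def from_maker_a(record):
    """Map manufacturer A's layout to the common catalog format."""
    return {
        "id": record["sku"],
        "title": record["name"],
        "weight_grams": float(record["weight_g"]),
        # Syntactic transformation: the price arrives as text.
        "price_usd": float(record["price_usd"]),
    }


def from_maker_b(record):
    """Map manufacturer B's layout to the common catalog format."""
    # Structural transformation: the price is nested one level deeper.
    price = record["Price"]
    if price["currency"] != "USD":
        raise ValueError("currency conversion is outside this sketch")
    return {
        "id": record["PartNo"],
        "title": record["Description"],
        # Semantic transformation: kilograms and grams measure the same thing.
        "weight_grams": round(record["WeightKg"] * 1000.0, 3),
        "price_usd": float(price["amount"]),
    }


catalog = [from_maker_a(maker_a), from_maker_b(maker_b)]
```

Writing such mappings by hand is only feasible when each source format is formally specified; with schemas on both sides, mappings of this kind are the part that can often be generated automatically.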


6.9 Key Points in Chapter Six

• A relationship is “an association among several things, with that association having a particular significance.” (See §6.1 Introduction (page 225))

• Just identifying the resources involved is not enough because several different relationships can exist among the same resources. (See §6.2 Describing Relationships: An Overview (page 227))

• Most relationships between resources can be expressed using a subject-predicate-object model. (See §6.3 The Semantic Perspective (page 228) and §6.7.1 Choice of Implementation (page 258))

• For a computer to understand relational expressions, it needs a computer-processable representation of the relationships among words and meanings that makes every important semantic assumption and property precise and explicit. (See §6.3 The Semantic Perspective (page 228))

• Three broad categories of semantic relationships are inclusion, attribution, and possession. (See §6.3.1 Types of Semantic Relationships (page 230))

• A set of interconnected class inclusion relationships creates a hierarchy called a taxonomy. (See §6.3.1.1 Inclusion (page 231))

• Classification is a class inclusion relationship between an instance and a class. (See §6.3.1.1 Inclusion (page 231))

• Ordering and inclusion relationships are inherently transitive, enabling inferences about class membership and properties. (See §6.3.2.2 Transitivity (page 236))

• Class inclusion relationships form a framework to which other kinds of relationships attach, creating a network of relationships called an ontology. (See §6.3.3 Ontologies (page 238))

• When words encode the semantic distinctions expressed by class inclusion, the more specific class is called the hyponym; the more general class is the hypernym. (See §6.4.1.1 Hyponymy and Hyperonymy (page 241))


• Morphological analysis of how words in a language are created from smaller units is heavily used in text processing. (See §6.4.3 Relationships among Word Forms (page 244))

• Many types of resources have internal structure in addition to their structural relationships with other resources. (See §6.5.2 Structural Relationships within a Resource (page 248))

• Using the pattern of links between documents to understand the structure of knowledge and the structure of the intellectual community that creates it is an idea that is nearly a century old. (See §6.5.3 Structural Relationships between Resources (page 250))

• Many hypertext links are purely structural because there is no explicit representation of the reason for the relationship. (See the sidebar, Perspectives on Hypertext Links (page 253))

• The architectural perspective on resources emphasizes the number and abstraction level of the components of a relationship; three important issues are degree, cardinality, and directionality. (See §6.6 The Architectural Perspective (page 255))

• The essential technologies for making the web more semantic and relationships among web resources more explicit are XML, RDF, and OWL. (See §6.8.1 The Semantic Web and Linked Data (page 261))

• Much of our thinking about relationships in organizing systems for information comes from the domain of bibliographic cataloging of library resources and the related areas of classification systems and descriptive thesauri. (See §6.8.2 Bibliographic Organizing Systems (page 261))

• The Resource Description and Access (RDA) next-generation cataloging rules are attempting to bring together disconnected resource descriptions. (See §6.8.2.2 Resource Description and Access (RDA) (page 262))

• Integration is the controlled sharing of information between two (or more) business systems, applications, or services within or between firms. (See §6.8.3 Integration and Interoperability (page 263))

• Interoperability goes beyond integration to mean that systems, applications, or services that exchange information can make sense of what they receive. (See §6.8.3 Integration and Interoperability (page 263))
