

Brain and Language 68, 165–171 (1999). Article ID brln.1999.2075, available online at http://www.idealibrary.com

An Integrated Model of Meaning and Sense Activation and Disambiguation

Peter Dixon and Leslie C. Twilley

University of Alberta, Edmonton, Alberta, Canada

A simulation of meaning and sense selection is presented. In the model, distributed representations are used for both unrelated meanings and related meaning senses; however, related meaning senses are assumed to be correlated. Related meaning senses are assumed to be learned associatively in similarly correlated contexts. Our model is sufficient to account for the contrasting results obtained in priming paradigms by Whitney, McKay, Kellas, and Emerson (1985) and Tabossi (1988). © 1999 Academic Press

Key Words: Lexical ambiguity; meaning senses; priming.

Many words have multiple unrelated meanings, and virtually all words have multiple meaning senses; that is, related meanings that are appropriate in different contexts. Consequently, the mechanism that allows the lexical system to select an appropriate interpretation of a word in context is a critical component of the language processing system. In previous research, we have demonstrated how a simple system based on independent contributions of perceptual and contextual information suffices to account for much of the evidence on the activation of unrelated homograph meanings (Twilley & Dixon, in press). Here, we demonstrate how the same approach can be used to account for the activation and selection of related word senses.

Address correspondence and reprint requests to Peter Dixon, Department of Psychology, University of Alberta, Edmonton, AB, T6G 2E9 Canada. E-mail: [email protected].

0093-934X/99 $30.00. Copyright © 1999 by Academic Press. All rights of reproduction in any form reserved.

Two contrasting sets of priming results were considered. Whitney, McKay, Kellas, and Emerson (1985) measured the activation of word meanings using a Stroop color-naming task. First, subjects listened to sentences ending in a prime word with multiple meaning senses (e.g., “The child touched the rabbit”); then, after a variable delay, a target word (e.g., “FUR”) was presented visually and subjects named the color in which the word was printed. It was assumed that the activation of the prime word’s meaning would prime the target word’s meaning when they were related, which in turn would interfere with naming the target’s color. The amount of interference (relative to a control with unrelated primes and targets) provided an index of the extent to which the prime word in context activated a meaning sense related to the target. Our view is that a word such as “rabbit” has multiple, overlapping interpretations: a quick rabbit, a timid rabbit, a pet rabbit, rabbit meat, and so on. Whitney et al. (1985) presented two types of targets: core targets were words that were related to many different interpretations of the prime (e.g., “FUR”); peripheral targets were words that were related to only a relatively specific interpretation (e.g., “HOP”). The prime word was presented in contexts that were appropriate for one interpretation or the other (see Table 1). The results (reproduced in Fig. 1) indicated that the pattern of interference varied with the interstimulus interval between the prime and the target. For core targets, interference was found at all delays regardless of context. Peripheral targets also showed interference at all delays, but only when the context was appropriate; peripheral targets with inappropriate contexts showed interference only with immediate presentation.

TABLE 1
Example Stimuli from Whitney et al. (1985)

Condition                    Sentence                           Target

Core target
  Appropriate (prime)        The child touched the rabbit.      FUR
  Appropriate (control)      The child touched the present.     FUR
  Inappropriate (prime)      The child chased the rabbit.       FUR
  Inappropriate (control)    The child chased the butterfly.    FUR

Peripheral target
  Appropriate (prime)        The child chased the rabbit.       HOP
  Appropriate (control)      The child chased the butterfly.    HOP
  Inappropriate (prime)      The child touched the rabbit.      HOP
  Inappropriate (control)    The child touched the present.     HOP

FIG. 1. Stroop interference effects obtained by Whitney et al. (1985) (left panel) and corresponding simulated priming effects (right panel).

A different pattern of results was obtained by Tabossi (1988) using a similar priming paradigm. In her study, the prime word was heard in context followed immediately by the visual presentation of the target; a lexical decision was made to the target word. Activation of the prime meaning was assessed by comparing the lexical decision time for the target word to that obtained with a control word (see Table 2). The sentence contexts were intended to be related to a property of one sense of the prime word; thus, the conditions were similar to the peripheral-target condition of Whitney et al. (1985) with a delay of 0. However, Tabossi attempted to generate contexts that clearly predicted a particular feature of that sense of the prime, which was not always true in the materials used by Whitney et al. (1985). Consequently, the contextual constraint can be regarded as stronger or more compelling in her task. In keeping with this interpretation, Tabossi found stronger effects of context, and priming was only obtained when the prime was presented in a context that supported the target sense; these results are shown in Fig. 2.

TABLE 2
Example Stimuli from Tabossi (1988)

Condition                    Sentence                                     Target

Appropriate (prime)          The girl was pricked by the rose.            THORN
Appropriate (control)        The girl was pricked by the wasp.            THORN
Inappropriate (prime)        The girl smelled the perfume of the rose.    THORN
Inappropriate (control)      The girl heard the buzz of the wasp.         THORN

FIG. 2. Lexical decision results obtained by Tabossi (1988) (left panel) and corresponding simulated lexical access times (right panel).

These different patterns of results can both be explained using an elaboration of the model we have developed for homograph meanings (Twilley & Dixon, in press). In our approach, distributed representations over large sets of microfeatures are used for word meanings, and a particular meaning can thus be written as a vector of activations, $w$. Vector representations are also used for the perceptual information that determines the identity of the word, $p$, and the contextual information that determines the appropriate choice of meaning, $c$. We assume that the input to the meaning representation is a linear combination of the sum of perceptual and contextual information, that is, $M(p + c)$, where $M$ is a matrix of connection weights. The pattern of activation is simply this input, scaled to the range 0 to 1, that is, $w = f[M(p + c)]$, where $f$ is a logistic scaling function. The extent to which a pattern of activation, $w$, represents a particular word meaning, $t_i$, is indexed by the dot product

$$a = \frac{w \cdot t_i}{|t_i|}.$$
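As a minimal sketch of the activation rule and the match index just described, the following NumPy fragment computes a pattern of activation from a weight matrix and the summed perceptual and contextual input, then scores it against a meaning pattern. The function names, the toy weight matrix, and the input vectors are hypothetical and chosen only for illustration; the division by the length of the meaning vector is an assumption about how the dot product is normalized.

```python
import numpy as np

def logistic(x):
    """Logistic scaling function f, mapping any real input into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def activation(M, p, c):
    """Pattern of activation w = f[M(p + c)] over the meaning microfeatures."""
    return logistic(M @ (p + c))

def match(w, t):
    """Index of how well w represents meaning t: dot product scaled by |t| (assumed)."""
    return float(w @ t) / float(np.linalg.norm(t))

# Toy example: 4 input features, 3 meaning microfeatures (all values hypothetical).
M = np.array([[ 2.0, -1.0,  0.5,  0.0],
              [-1.0,  2.0,  0.0,  0.5],
              [ 0.0,  0.0,  1.0, -1.0]])
p = np.array([1.0, 0.0, 0.0, 0.0])   # perceptual input
c = np.array([0.0, 0.0, 1.0, 0.0])   # contextual input
t = np.array([1.0, 0.0, 1.0])        # a meaning pattern over the microfeatures

w = activation(M, p, c)   # every element lies strictly between 0 and 1
a = match(w, t)           # larger values mean w is closer to meaning t
```

Because perceptual and contextual inputs are simply summed before the weights are applied, either source alone can push the pattern toward a meaning, which is the sense in which their contributions are independent.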

We assume that simple incremental learning algorithms are used to learn the connection weights that associate a word in context with a particular meaning (e.g., Kohonen, 1977; Rumelhart, Hinton, & Williams, 1986). Under such circumstances, the form of the learned connection matrix is well known and can be written as

$$M = f^{-1}(T)\,(X^{\top}X)^{-1}X^{\top}, \tag{1}$$

where the columns of the matrix $T$ are the patterns of activation corresponding to each meaning sense, and the columns of $X$ indicate the combinations of perceptual and contextual features that should produce each of those senses (i.e., the patterns $p + c$ for each meaning).
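The closed form in Eq. (1) can be checked numerically. The sketch below builds a small, hypothetical set of input patterns and target patterns, computes the weight matrix from the equation, and confirms that each training input reproduces its target pattern. The specific matrices are invented for illustration, and the binary targets are softened to 0.1 and 0.9 (an assumption made here so that the inverse logistic stays finite).

```python
import numpy as np

def logistic(x):
    """Logistic scaling function f."""
    return 1.0 / (1.0 + np.exp(-x))

def logit(y):
    """Inverse of the logistic function, f^{-1}; requires 0 < y < 1."""
    return np.log(y / (1.0 - y))

def learned_weights(T, X):
    """Eq. (1): M = f^{-1}(T) (X^T X)^{-1} X^T, for X with independent columns."""
    return logit(T) @ np.linalg.inv(X.T @ X) @ X.T

# Columns of X: hypothetical p + c input patterns, one column per meaning.
X = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [1.0, 1.0, 1.0]])

# Columns of T: target activation patterns, one column per meaning.
# Present/absent features are coded 0.9/0.1 rather than 1/0 (assumption).
T = np.array([[0.9, 0.9, 0.1],
              [0.9, 0.1, 0.1],
              [0.1, 0.9, 0.1],
              [0.1, 0.1, 0.9]])

M = learned_weights(T, X)
recalled = logistic(M @ X)   # each input column should reproduce its target column
```

The check works because $MX = f^{-1}(T)$ whenever the columns of $X$ are linearly independent, so applying $f$ recovers $T$ exactly. Note that the first two input columns share features, yet recall is still exact.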

The same approach is used for related meaning senses, except that both the contexts and meanings are correlated. (Because the representations are assumed to be made up of microfeatures that are either present or absent, correlation is essentially an index of the extent to which the features present in two representations overlap.) In other words, we assume that different meaning senses have different, but overlapping representations in terms of features and that the contexts in which the two senses are appropriate have similar representations with overlapping features. However, as long as the different contexts are not identical, the connection weights described by Eq. (1) can be learned through simple associations, just as unrelated, nonoverlapping meanings can be learned. As a consequence, the model makes specific predictions concerning the extent to which different word senses should be active purely on the basis of the assumption that representations are distributed and reflect an associative learning history. Those predictions vary, of course, with the correlational structure of the contexts that are encountered during learning and the correlation among the word meanings themselves.
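To make the notion of overlapping binary representations concrete, the following sketch constructs two present/absent feature patterns that share a chosen fraction of their active features and then measures both the overlap and the Pearson correlation. The helper names, the pattern size, and the number of active features are hypothetical. For sparse patterns like these, the correlation is close to, though not identical to, the raw proportion of shared features.

```python
import numpy as np

def overlapping_patterns(n_features, n_active, overlap):
    """Two binary patterns with n_active features each, sharing round(overlap * n_active)."""
    shared = int(round(overlap * n_active))
    a = np.zeros(n_features)
    b = np.zeros(n_features)
    a[:n_active] = 1.0
    # b reuses the last `shared` active features of a, then adds fresh ones.
    start = n_active - shared
    b[start:start + n_active] = 1.0
    return a, b

def overlap_fraction(a, b):
    """Proportion of the features active in a that are also active in b."""
    return float(a @ b) / float(a.sum())

# Two hypothetical senses over 1000 microfeatures, 40 active each, 20% shared.
sense1, sense2 = overlapping_patterns(n_features=1000, n_active=40, overlap=0.20)

shared = overlap_fraction(sense1, sense2)
r = np.corrcoef(sense1, sense2)[0, 1]   # Pearson correlation of the two patterns
```

The same construction can be applied to context patterns, so that senses with overlapping features are paired with contexts that overlap to a similar degree.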

In order to simulate the priming results of Whitney et al. (1985) and Tabossi (1988) using Eq. (1), we assumed a lexicon with four meanings: two different senses of the prime word, a target meaning, and an unrelated control meaning. In the simulation of the Whitney et al. results, the two prime senses were assumed to overlap 20% and a peripheral target was assumed to overlap 15% with one of those senses. Core targets were assumed to be more related to features that different senses have in common and were assumed to share half of the features common to the two senses. The contexts in which each of these meanings was appropriate were assumed to be similarly related. These correlations determine the learned matrix of connection weights [as in Eq. (1)]. Using this weight matrix, one can predict the pattern of activation that would occur when the system is presented with contextual information along with perceptual information for a particular word. The perceptual input was assumed to be transient, so that it was strong within 300 ms of a word’s presentation and then would gradually decay thereafter; the form of this transient activation was taken from Twilley and Dixon (in press). Lexical access time for the target meaning was simulated in the model by identifying the time after the presentation of the target stimulus at which its meaning (as measured by the dot product) reached threshold. Priming effects were simulated by presenting the target at various delays after the prime and its context and simply adding the inputs; priming was the difference between lexical access time for related and control conditions.
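The threshold-crossing procedure can be illustrated with a heavily simplified sketch. It is not the simulation reported here: contextual input is omitted, each word is coded by its own input unit (so that Eq. (1) reduces to $M = f^{-1}(T)$), and the transient `gain` function, the `prime_strength` scaling, the threshold, and the feature patterns are all hypothetical choices made for illustration. The sketch only shows how access time is read off as the first moment the match index reaches threshold and how priming is computed as a difference in access times.

```python
import numpy as np

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

def gain(t_ms, tau=150.0):
    """Hypothetical transient input strength: 0 at onset, peaks at tau ms, then decays."""
    t = np.maximum(t_ms, 0.0)
    return (t / tau) * np.exp(1.0 - t / tau)

def pattern(active, n=20):
    """Binary microfeature pattern with the given features present."""
    v = np.zeros(n)
    v[list(active)] = 1.0
    return v

def drive(p):
    """Net input a word sends to each feature: logit of its pattern softened to 0.1/0.9."""
    soft = 0.1 + 0.8 * p
    return np.log(soft / (1.0 - soft))

def access_time(target, prime, delay, threshold, prime_strength=0.5, t_max=600):
    """First time (ms after target onset) at which the target match reaches threshold."""
    for t in range(t_max + 1):
        # Inputs are simply added; the prime was presented `delay` ms before the target.
        net = gain(t) * drive(target) + prime_strength * gain(t + delay) * drive(prime)
        w = logistic(net)
        if w @ target / np.linalg.norm(target) >= threshold:
            return t
    return None  # threshold not reached within t_max ms

target = pattern(range(0, 10))
related = pattern(range(8, 18))    # shares 2 of its 10 features with the target
control = pattern(range(10, 20))   # shares no features with the target

def priming(delay, threshold=2.2):
    """Access time after a control prime minus access time after a related prime."""
    return (access_time(target, control, delay, threshold)
            - access_time(target, related, delay, threshold))
```

Calling `priming(0)` or `priming(300)` gives the simulated advantage in milliseconds for a related prime at that delay. In this toy setup the advantage arises only because the related prime shares features with the target; reproducing the context effects discussed in the text would require adding the contextual input and the learned, correlated weights.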

As shown in Fig. 1, the simulated results are comparable to those obtained by Whitney et al. (1985): with immediate presentation, priming is found in all conditions, and with delayed presentation, priming is found for core targets and for peripheral targets only with an appropriate context. As in the Whitney et al. (1985) results, there is little priming for peripheral targets when the context is inappropriate at anything other than zero delay. The correlation between the simulated and observed effects was .82. These results are similar to those predicted by exhaustive access models of meaning activation. In such a theoretical framework, it is assumed that all meaning senses are activated initially, followed by a stage in which one sense is selected based on the context. Although there is no such sequence of stages in the model we developed, it nevertheless provides an accurate match to results. Thus, we conclude that results such as those of Whitney et al. (1985) do not require a model with sequential access and selection stages and that the time course of sense activation can be explained simply on the assumption that the strength of perceptual information varies over time.

The contrasting pattern of results obtained by Tabossi is also predicted by the model if, in keeping with the design of Tabossi’s materials, stronger context and priming relationships are assumed. The simulation results are shown in Fig. 2; the correlation between the simulated and observed values was .96. In order to reproduce the stronger prime–target relationship in Tabossi’s materials, the target was assumed to overlap 20% with the related meaning sense of the prime (rather than 15%), and, in order to generate stronger contextual constraint, the strength of the contextual input was increased by a similar amount. With these changes, the contextual input succeeds in suppressing the inappropriate meaning and the inappropriate context produces little priming. However, all other characteristics of the simulation remain unchanged. In particular, the activation of the different meaning senses depends directly on the assumed learning history that determines the connection weights.

The overall approach embodied in the present model represents a departure from many earlier approaches to meaning senses. Often researchers have accounted for the activation of different senses of a word in context by making qualitative distinctions among different kinds of meaning features that might be activated when a word is encountered. For example, Barsalou (1982) distinguished between context-dependent and context-independent features: context-dependent features would be activated by the context, independent of the word, while context-independent features would be activated simply by the word itself. In contrast, we make no qualitative, a priori distinctions among features; instead, we assume that features come to be associated with a word or the surrounding context purely as a function of the word’s learning history. The present theoretical approach also highlights a continuum between distinct, unrelated meanings and related meaning senses, as do other models using similar distributed representations (e.g., Kawamoto, 1993). However, we believe the present approach is unique in explicitly identifying a relationship between meaning sense activation and the correlational structure of the contexts in which a word appears, as in Eq. (1).

In sum, we feel that part of the model’s appeal is that it allows quantitative predictions on the basis of only a meager set of assumptions concerning the representation and processing of word meanings. The representational assumption is simply that word senses are correlated and learned in a variety of correlated contexts; the processing assumption reduces essentially to the idea that the perceptual input is transient. These assumptions entail that the activation of both distinct word meanings and related senses is determined by the correlational structure of the contexts in which words appear. Further, when the implications of these minimal assumptions are developed, the patterns of priming effects observed, for example, by Whitney et al. (1985) and Tabossi (1988) can be readily interpreted in terms of that correlational structure. Thus, the present approach embodies a simple solution to the problem of selecting meaning senses as well as unrelated word meanings.

REFERENCES

Barsalou, L. W. 1982. Context-independent and context-dependent information in concepts. Memory & Cognition, 10, 82–93.

Kawamoto, A. H. 1993. Nonlinear dynamics in the resolution of lexical ambiguity: A parallel distributed processing account. Journal of Memory and Language, 32, 474–516.

Kohonen, T. 1977. Associative memory: A system-theoretic approach. Berlin: Springer-Verlag.

Rumelhart, D. E., Hinton, G. E., & Williams, R. J. 1986. Learning internal representations by error propagation. In D. E. Rumelhart & J. L. McClelland (Eds.), Parallel distributed processing: Explorations in the microstructure of cognition. Cambridge, MA: MIT Press.

Tabossi, P. 1988. Effects of context on the immediate interpretation of unambiguous nouns. Journal of Experimental Psychology: Learning, Memory, and Cognition, 14, 153–162.

Twilley, L. C., & Dixon, P. in press. Meaning resolution processes for words: A parallel independent model. Psychonomic Bulletin & Review.

Whitney, P., McKay, T., Kellas, G., & Emerson, W. A., Jr. 1985. Semantic activation of noun concepts in context. Journal of Experimental Psychology: Learning, Memory, and Cognition, 11, 126–135.