

328 Int. J. Cont. Engineering Education and Life-Long Learning, Vol. 21, No. 4, 2011

Copyright © 2011 Inderscience Enterprises Ltd.

The representation of polysemy through vectors: some building blocks for constructing models and applications with LSA

Guillermo Jorge-Botana, José A. León* and Ricardo Olmos
Departamento de Psicología Básica, Facultad de Psicología, Universidad Autónoma de Madrid, Campus de Cantoblanco, 28049 – Madrid, Spain
Fax: +34-91-497-52-15
*Corresponding author

Inmaculada Escudero
Departamento de Psicología Evolutiva y de la Educación, Facultad de Psicología, UNED, C/ Juan del Rosal, 10, 28040 – Madrid, Spain

Abstract: The multiplicity of word meanings has long preoccupied researchers in linguistics, psychology and computational linguistics. In this paper, we review how LSA represents polysemous words, explain some biases that affect meaning generation, and review some constraint-satisfaction models that introduce dynamic mechanisms into the equation. The idea behind these models is to take the amalgamated word vector from LSA and embed it in its discourse and semantic context, so that a dynamic mechanism selects its appropriate features. To illustrate our arguments, we present some networks, providing evidence that polysemous words have separate representations for each sense only in the presence of the linguistic context in which they occur. We also present an example of how these mechanisms help to support visual heuristic searches in visual information retrieval interfaces (VIRIs).

Keywords: latent semantic analysis; LSA; polysemy; lexical representation; linguistic context; lexical semantics; predication; discourse comprehension; psycholinguistics; ambiguity; semantic networks; visual information retrieval interfaces; VIRIs.

Reference to this paper should be made as follows: Jorge-Botana, G., León, J.A., Olmos, R. and Escudero, I. (2011) ‘The representation of polysemy through vectors: some building blocks for constructing models and applications with LSA’, Int. J. Continuing Engineering Education and Life-Long Learning, Vol. 21, No. 4, pp.328–342.


Biographical notes: Guillermo de Jorge-Botana obtained his PhD in Cognitive Science from Universidad Autónoma de Madrid. He has experience in the fields of computational linguistics, latent semantic analysis, voice and self service technologies, psycholinguistic models, event-related brain potentials and cognition. He works in the technology sector, programming and designing applications for self service and interactive voice response (IVR) platforms at Prosodie Ibérica.

José A. León is a Professor of Text Comprehension, Knowledge Acquisition and Reading Comprehension at Universidad Autónoma de Madrid. His work involves the theoretical study of cognitive processes in comprehension, memory and language, as well as the application of cognitive principles to educational practice. His current research covers a variety of topics including text comprehension, understanding and evaluating inferences, neurocognitive processes, developing automatic scoring technology and computational models using latent semantic analysis.

Ricardo Olmos obtained his PhD in Behavioral Science Methodology and Reading Comprehension from Universidad Autónoma de Madrid. He has research experience in computational linguistics (the latent semantic analysis model of computational linguistics). He is an Associate Professor teaching Psychometrics at IE University, and last year gave a degree-level Introduction to Psychometrics course at Universidad Autónoma de Madrid. His fields of expertise are statistics, psychometrics and computational linguistics.

Inmaculada Escudero is a Professor at UNED. She has participated in many public and privately-funded research projects in the fields of technology, reading and comprehension, usability, computational linguistics and sociolinguistics.

1 Introduction

The phenomenon of polysemy is ubiquitous in human language. In fact, no word is purely monosemous, since even the word most closely identified with a single meaning takes on different nuances depending on the context and the community that uses it. For this reason, the study and categorisation of polysemy is a matter of degree, ranging from the simplest case with a shared etymological meaning to the most extreme case where two meanings share the same word due to chance or phonetic convergence. Various authors have therefore supported classification in the form of a continuum. At one end, the meanings are completely independent of one another (homonymy) and at the other they are clearly related (metonymy); in the middle, we would find metaphors (Klepousniotou, 2002). Metonymy and polysemy are partners, and metaphors are a weaker version of homonymy. A good criterion for identifying whether ambiguity involves independent or related meanings is that offered by Ahrens (1998): two meanings belong to different senses if one of them is not an instance of the other that has been formed via metonymic or meronymic extensions (not the case for metaphors), if the instances of both meanings cannot inherit from the same class of nouns, and if both meanings cannot appear in the same contexts. Conversely, two meanings are different aspects of a single concept (Ahrens calls them meaning facets) if they are formed using metonymic or meronymic extensions, if they can inherit from the same noun classes and if they do appear in similar contexts. In fact, one of the tasks facing a lexicographer is to distinguish when the usage of a term modulates its meaning and when it configures a new one. To this end, lexicographers have proposed several tests to clarify the issue, although none has proved definitive (see Kilgarriff, 1997). The problem is that in a dictionary it is difficult to define when the usage of a word pertains to a meaning other than those already established, in other words, when we come across cases of homonymy or polysemy.

But what may make sense for the purposes of linguistic classification may not be entirely relevant to explaining the phenomenon, if we do this using a vector model where these boundaries are not established a priori. Moreover, if our true aim is to discuss how ambiguity is dealt with by humans, the principle of parsimony should lead us to start from scratch without our model assuming that such classifications are firm (until required) – this is the case with LSA models. Since vector space models such as LSA have produced good results simulating lexical representation in humans (Landauer and Dumais, 1997; Kintsch and Mangalath, in press), they may help to define objective indices and throw some light on theories of lexical ambiguity. In fact, the same algorithms (applied to LSA) used to explain the extraction of different senses of a word in interaction with lexical context, have been used to explain the basic mechanism for extracting the sense of a metaphor (Kintsch, 2000, 2001; Kintsch and Bowles, 2002).

In LSA, each term is represented using a single vector that contains all information pertaining to the contexts where it appears. Each vector can be thought of as a box containing meanings. The most salient meaning is the most frequent in the reference corpus (the corpus used to train LSA) followed by other less frequent meanings. The key lies in the fact that in the retrieval phase, one meaning might be activated and another inhibited in a single representational space, using the context (Kintsch, 2001; Kintsch and Mangalath, in press). Furthermore, this might occur even where we find a very high correlation between the meanings (as in weak polysemy), if both senses have little in common (metaphors) or even if both senses are completely orthogonal (as in homonymy). The challenge is to find a mechanism to bias the flexible representation of LSA or another vector-space model toward the pertinent meaning. This is the mechanism that must operate on the LSA representation if we seek a plausible meaning: differential activation of certain patterns of the vector. We will start, nevertheless, with the raw vector representation in the absence of any such mechanism.

2 Polysemy in vectors

Polysemy is defined primarily in terms of ambiguity. A polysemous word has not one meaning but many, and depending on the context, some of them are concealed. If we were asked to sketch the abstract concept of polysemy, we might imagine dispersed colours, many different levels of depth, or perhaps a blended drink. As in a blended drink, the vectors of polysemous words have many features (Figure 1), or scores in many dimensions (more dimensions than monosemous words).


Figure 1 Two possible vectors of polysemous and monosemous words respectively

Figure 2 Coordinates of the dimensions of ‘noray’ in three of its conditions (see online version for colours)

Notes: We can see that the 0–30 condition has the most extreme coordinates in the latent semantic space, then the 30–0 condition and last the 15–15 condition, which has predominantly intermediate coordinates.

Source: Taken from Jorge-Botana et al. (submitted)

[Figure 2 plot: coordinate value (y-axis, −0.1 to 0.1) by dimension (x-axis ticks from 1 to 239) for the 0–30, 15–15 and 30–0 conditions]

In a recently submitted paper (Jorge-Botana et al., submitted), we carried out a simulation task on a semantic vector space trained with the Spanish corpus LexEsp (Sebastián-Gallés et al., 2000). The simulation consisted of giving a made-up word (‘noray’) several forms along the polysemy-monosemy continuum. To do so, texts were created with two possible meanings of the made-up word: ‘noray’ as a sport and ‘noray’ as an old district demarcation. Several corpora were trained, in which ‘noray’ appeared in the texts with different proportions of the two meanings. For example, in one corpus ‘noray’ appears in 25 documents as a sport and in five as an old district demarcation; in another, it appears in 30 documents as a sport and never as a demarcation; in another, it appears in 30 documents as an old district demarcation and never as a sport, and so on. We analysed the vector of the artificial word in each condition in terms of the (standardised) score in each dimension, that is, whether the scores in a given dimension were extreme or close to the average. This analysis supported the hypothesis that ambiguous words have the most differentiated and weakest representation, the opposite of non-ambiguous words, whose definition is precisely located in certain dimensions, where they score higher (Figure 2).

To check objectively that the differences in the shape of the vector across conditions are significant, we ranked the three conditions within each dimension: 1 for the largest coordinate, 2 for the second largest and 3 for the smallest. The results (Figure 3) show that the monosemous conditions had significantly more extreme distributions. In 52% of the dimensions the 0–30 condition had the highest coordinate. The 30–0 condition had the highest coordinate in 32% of the dimensions (and the lowest in 41%), and the 15–15 condition in only 16% (with an intermediate coordinate in 50%). The monosemous conditions (0–30 and 30–0), then, lead to more extreme scores on the dimensions than the polysemous condition (15–15). Polysemy attenuates the dimensions where the word might stand out, with a smoothing caused by its multiple meanings.
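The ranking step can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `rank_shares` and the toy four-dimensional vectors are hypothetical, the inputs are taken to be already standardised, and "largest coordinate" is read here as largest absolute value, which the text does not state explicitly.

```python
import numpy as np

def rank_shares(vectors):
    """Share of dimensions in which each condition holds rank 1, 2, ... (1 = most extreme)."""
    names = list(vectors)
    # Rows are conditions, columns are dimensions. Extremeness is taken as
    # absolute value, which is an assumption of this sketch.
    scores = np.abs(np.vstack([vectors[name] for name in names]))
    order = np.argsort(-scores, axis=0)    # per dimension: condition indices, largest first
    ranks = np.argsort(order, axis=0) + 1  # per dimension: rank held by each condition
    n = len(names)
    return {name: [float(np.mean(ranks[i] == r)) for r in range(1, n + 1)]
            for i, name in enumerate(names)}

# Hypothetical toy vectors, not data from the study.
toy = {
    "0-30": np.array([0.9, -0.8, 0.1, 0.7]),
    "30-0": np.array([0.5, 0.9, -0.3, 0.2]),
    "15-15": np.array([0.1, 0.2, 0.2, 0.4]),
}
shares = rank_shares(toy)
print(shares)
```

In this toy case the balanced condition never takes first rank, which mirrors the direction of the reported result without reproducing its figures.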

Figure 3 The monosemous conditions have more extreme distributions. Polysemy attenuates the dimensions (see online version for colours)

Source: Taken from Jorge-Botana et al. (submitted)


3 Some indirect measures of polysemy

There are several ways to measure the ambiguity of a word: we can check how many senses a word has in a lexical dictionary, or we can ask individuals to judge each word’s degree of ambiguity in terms of familiarity, imaginability, etc. If we have a huge corpus, we can easily extract some indirect indices based on the entropy of the distribution of a word across documents. This type of measure can be a good predictor of the ambiguity of words. Such measures are based on information theory: the information that a word carries about the documents in which it appears depends on how its occurrences are spread over all the documents. The philosophy is as follows: a word that occurs in all the documents, such as the word ‘thing’, provides little information about the contexts where it appears. A word that is focalised in a few contexts (or one), such as ‘surgery’, provides more information about the contexts where it is found. This is an indirect measure of ambiguity, and some studies have demonstrated good correlations with polysemy. Jorge-Botana et al. (submitted) computed the so-called log-entropy [see Nakov et al. (2001) for a review] to measure the context focalisation of words, and found significant differences between two sets of polysemous and monosemous words.
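As a minimal sketch of such an index, the block below computes the entropy-based global weight in its commonly used form, g_i = 1 + Σ_j p_ij log p_ij / log n, with p_ij = tf_ij / gf_i, and the log-entropy cell weight log(1 + tf_ij) · g_i. The source does not spell out the exact formula used in the study, so this form, the function names and the toy counts are assumptions; every term is assumed to occur at least once and the corpus to contain more than one document.

```python
import numpy as np

def entropy_global_weight(counts):
    """Global weight per term: 1 = concentrated in one document, 0 = spread evenly over all."""
    counts = np.asarray(counts, dtype=float)        # terms x documents, raw frequencies
    n_docs = counts.shape[1]
    p = counts / counts.sum(axis=1, keepdims=True)  # p_ij = tf_ij / gf_i
    # Convention 0 * log(0) = 0: zeros are mapped to log(1) = 0 before multiplying.
    plogp = p * np.log(np.where(p > 0, p, 1.0))
    return 1.0 + plogp.sum(axis=1) / np.log(n_docs)

def log_entropy(counts):
    """Log-entropy weighted matrix: log(1 + tf_ij) * g_i."""
    counts = np.asarray(counts, dtype=float)
    return np.log1p(counts) * entropy_global_weight(counts)[:, None]

# Hypothetical counts: row 0 behaves like 'thing', row 1 like 'surgery'.
counts = [[2, 2, 2, 2],
          [0, 5, 0, 0]]
weights = entropy_global_weight(counts)
print(weights)
```

A word spread evenly over all documents gets a weight near 0 (maximum entropy), and a word confined to one document gets a weight of 1, which is the sense in which a higher global weight means lower entropy.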

Figure 4 Mean global weight for polysemic and monosemic words (see online version for colours)

Note: The higher the global weight the lower the entropy.

Source: Taken from Jorge-Botana et al. (submitted)

[Figure 4 plot: global weight (y-axis, 0.30 to 0.80) for the monosemy and polysemy word sets]

Polysemous words have significantly more entropy. McDonald and Shillcock (2001) found a correlation between polysemy and their measure of context distinctiveness (a measure based on entropy). In fact, these kinds of measures try to quantify concepts such as ‘context distinctiveness’ (the fact that a word is useful for discriminating a context) or ‘contextual diversity’ (the fact that a word represents many distinct contexts). Adelman et al. (2006) defined contextual diversity as the raw number of documents in which a word appears in a divided corpus. Furthermore, they calculated the ratio of contextual diversity to word frequency as an index of the clustering of a word. Measuring indices related to the phenomenon of polysemy in an objective manner will help us to understand the theory of meaning, and to find ways of dealing with technological issues such as obtaining confidence indices in the information retrieval industry (see US patent 6,256,629 B1).
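The two indices attributed to Adelman et al. (2006) are simple enough to sketch directly from the description above. The function names and the toy count matrix are hypothetical, and each word is assumed to occur at least once.

```python
import numpy as np

def contextual_diversity(counts):
    """Number of documents in which each word appears at least once."""
    return (np.asarray(counts) > 0).sum(axis=1)

def diversity_to_frequency(counts):
    """Ratio of contextual diversity to total word frequency (clustering index)."""
    counts = np.asarray(counts, dtype=float)
    return contextual_diversity(counts) / counts.sum(axis=1)

# Hypothetical counts: both words occur 8 times, spread over 4 documents or 1.
counts = [[2, 2, 2, 2],
          [0, 8, 0, 0]]
diversity = contextual_diversity(counts)
ratio = diversity_to_frequency(counts)
print(diversity, ratio)
```

Two words with the same frequency can thus differ sharply on the ratio: the lower value belongs to the word whose occurrences are clustered in few documents.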

4 How does this kind of representation affect the extraction of meaning?

Such a representation has some consequences. The first was identified by Deerwester et al. (1990) and affects the extraction of meaning. These authors state that although the phenomenon of synonymy is faithfully represented by LSA simulations, the same is not true of polysemy. A term, even if it has more than one meaning, is still represented by a single vector, which has certain coordinates. Its several meanings are represented as an average of those meanings, weighted according to the frequency of the contexts where the term is found. Deerwester and his colleagues provide the key: “If none of the real meanings is like the average meaning” it may create a bias in the representation, producing an entity that does not match any actual term usage. If we seek to extract a list of this vector’s related neighbours, the bias of such a blended representation would present many obstacles. In two of our previous studies, we considered unwanted effects that could be present in vector-space models when extracting different meanings for a polysemous word, either without any context word or after adding a context word by vector sum (Jorge-Botana et al., 2010, 2009). We considered two problems concerning the system’s approach to processing polysemous words. The first is predominant meaning inundation: when extracting meaning from a word, it is possible that only the predominant meaning emerges if other meanings are not sufficiently well represented in the semantic space. This may even occur if the word is accompanied by a context word (adding the two vectors) that is not powerful enough to filter out the dominant meaning. The other is imprecise definition: when the polysemous word is accompanied by a context word, and even when the context word coincides with the dominant meaning of the polysemous word (in the semantic space), the meaning extracted may be very general. The result is that the meaning extracted from the polysemous word is related to the context word but is not sufficiently precise, due to a bias toward the general sense of the predominant meaning.
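A toy illustration of the amalgam and of predominant meaning inundation follows. It is a sketch under strong assumptions: the two senses are idealised as orthogonal unit vectors in a two-dimensional space, the word vector is modelled as their frequency-weighted mean, and the "weak" context word is a hypothetical short vector along the minority sense. None of the values come from a trained LSA space.

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def amalgam(sense_vectors, frequencies):
    """Frequency-weighted average of the sense vectors (assumed model of the word vector)."""
    weights = np.asarray(frequencies, dtype=float)
    senses = np.asarray(sense_vectors, dtype=float)
    return (weights[:, None] * senses).sum(axis=0) / weights.sum()

# Idealised senses of the made-up word 'noray'.
sport = np.array([1.0, 0.0])
demarcation = np.array([0.0, 1.0])

noray = amalgam([sport, demarcation], [25, 5])  # dominant meaning: sport
weak_context = 0.3 * demarcation                # hypothetical weak context word
summed = noray + weak_context                   # context added by simple vector sum

balanced = amalgam([sport, demarcation], [15, 15])

print(cosine(summed, sport), cosine(summed, demarcation))
print(cosine(balanced, sport), cosine(balanced, demarcation))
```

Even after the minority-sense context is added, the summed vector stays closer to the dominant sense, which is the inundation effect. The balanced vector is equally distant from both senses, an instance of an average that resembles neither real meaning.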

The second consequence is in the field of experimentation, and concerns how we explain the advantage of monosemous words in experimental tasks such as evocation of sense and the difficulty of association between polysemous and other words without a mediated context. In contrast with theories that propose different entries for each meaning of an ambiguous word, Jorge-Botana et al. (submitted) used a corpus data study to propose and explain that a single representation model like LSA might account for some empirical phenomena whilst conserving the initial principle: static entries for terms, with weighting based on the moment of acquisition. In this study, we used LSA to examine Piercey and Joordens’ (2000) ‘efficient then inefficient’ model, which notes that ambiguous words have an advantage in generating non-specific activation, but this advantage is lost in reading as activation is channelled toward only certain patterns.

5 Construction-integration, predication, dependency trees

In light of what we have discussed above, we could say that the probability of extracting the desired meaning from a term is strongly dependent on how representative each meaning is. In other words, there is a bias toward the dominant meaning of a polysemous term, or at least some average based on it, even if this term is found within a context.

Figure 5 Construction-integration network of the sentence ‘the band played a waltz’ (see online version for colours)

Note: The connections are the cosines between each word.

Source: Taken from Kintsch et al. (1999)

The reason for this is that LSA is not a process theory, but rather a static representation of one kind of knowledge (the knowledge drawn from the training corpus). To simulate the construction of meaning, we have to manage the network of information supplied by LSA and exploit the representations of terms as well as those of context (see Burgess, 2000). Such a procedure has been used to simulate some dynamic processes – for example Kintsch (1998) adapted his construction-integration model with LSA as the lexical representation. Each word of the lexicon was represented with an LSA vector. Imagine a sentence ‘the band played a waltz’ [see details in Kintsch et al. (1999)]. The mechanism is as follows: first, a network representing the sentence is constructed with the three content words from the sentence and the neighbourhood of each. All the words are connected to each other with links whose strengths are the cosines between them.


Second, in the integration phase, the most strongly activated words are the most relevant in the context of the sentence. Irrelevant content is removed while relevant content is retained.
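The two phases can be sketched as follows. This is a simplified illustration, not the published implementation: the two-dimensional toy vectors (a "music" axis and a "jewellery" axis), the choice of neighbours, the clipping of negative cosines to zero and the normalisation by the maximum activation are all assumptions of the sketch.

```python
import numpy as np

def cosine_matrix(vectors):
    """Construction phase: link strengths are the cosines between all pairs of nodes."""
    v = np.asarray(vectors, dtype=float)
    unit = v / np.linalg.norm(v, axis=1, keepdims=True)
    return unit @ unit.T

def integrate(links, max_iter=100, tol=1e-6):
    """Integration phase: spread activation until the pattern stops changing."""
    w = np.clip(links, 0.0, None)  # assumption: negative cosines count as no link
    activation = np.ones(w.shape[0])
    for _ in range(max_iter):
        new = w @ activation
        new = new / new.max()      # keep the largest activation at 1
        converged = np.abs(new - activation).max() < tol
        activation = new
        if converged:
            break
    return activation

# Hypothetical toy space: axis 0 = music, axis 1 = jewellery.
nodes = ["band", "played", "waltz", "orchestra", "ring"]
vectors = [[0.7, 0.7],   # 'band' is ambiguous between the two axes
           [1.0, 0.1],
           [1.0, 0.0],
           [1.0, 0.0],   # neighbour consistent with the sentence
           [0.0, 1.0]]   # neighbour of 'band' that the sentence does not support
activation = dict(zip(nodes, integrate(cosine_matrix(vectors))))
print(activation)
```

After settling, the neighbour that agrees with the sentence context ends with higher activation than the one tied only to the unsupported sense of 'band'.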

One year later, this mechanism was adapted to simulate the comprehension of predication structures and predicative metaphors (Kintsch, 2000, 2001; Kintsch and Bowles, 2002). Again, a net is implemented but this time concerning a predication (e.g., ‘my lawyer is a shark’). The procedure is as follows: the predicate is identified (‘shark’), the semantic neighbours closest to the predicate are extracted, the cosines between each of the chosen neighbours of the predicate and the argument (‘lawyer’) are calculated to draw up a network (as in construction-integration). The network is run and left to settle into a stable state (Figure 6). In short, the aim is to objectively locate those semantic neighbours in the vicinity of the predicate which are also pertinent to the argument. The final step is to calculate the vector with the vector sum of the predicate plus the argument, plus the terms that receive most activation in the network. Kintsch (2000) shows the algorithm at work and checks the final meaning of predications such as ‘the bridge collapsed’ and ‘the plan collapsed’. He even investigates the difference between metaphors that are simple and difficult to understand based on the predication parameters (Kintsch and Bowles, 2002).
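A reduced version of this procedure can be sketched as below. Note the simplification: instead of running a network until it settles, the sketch takes the m nearest neighbours of the predicate and keeps the k with the highest cosine to the argument. The function name, the parameters m and k, and the three-dimensional toy space (axes read as aggression, sea and law) are hypothetical.

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def predication(space, predicate, argument, m=4, k=2):
    """Simplified predication: predicate + argument + the k predicate neighbours
    (out of its m nearest) that are most related to the argument."""
    p, a = space[predicate], space[argument]
    others = [w for w in space if w not in (predicate, argument)]
    # Step 1: the m nearest neighbours of the predicate.
    neighbours = sorted(others, key=lambda w: cosine(space[w], p), reverse=True)[:m]
    # Step 2: of those, the k most related to the argument
    # (a stand-in for the settling network).
    selected = sorted(neighbours, key=lambda w: cosine(space[w], a), reverse=True)[:k]
    # Step 3: vector sum of predicate, argument and selected neighbours.
    vector = p + a + sum(space[w] for w in selected)
    return vector, selected

# Hypothetical toy space; axes: aggression, sea, law.
space = {
    "shark": np.array([0.6, 0.8, 0.0]),
    "lawyer": np.array([0.3, 0.0, 0.95]),
    "vicious": np.array([1.0, 0.0, 0.1]),
    "aggressive": np.array([0.9, 0.1, 0.2]),
    "fish": np.array([0.0, 1.0, 0.0]),
    "fin": np.array([0.1, 1.0, 0.0]),
    "court": np.array([0.0, 0.0, 1.0]),
}
vector, selected = predication(space, "shark", "lawyer")
print(selected)
```

In this toy space 'shark' on its own is closer to 'fish' than to 'vicious', while the predication vector reverses that order, in line with the "vicious but not fish-like" reading of the metaphor.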

Figure 6 The most strongly activated nodes in the predicative metaphor ‘my lawyer is a shark’ (see online version for colours)

Note: The final definition: vicious but not fish-like.

Source: Taken from Kintsch (2008)


More recently, models have tried to take into account higher syntactical structures, for instance adapting the predication algorithms to syntactical dependencies (Kintsch, 2008).

6 Visualising nets of words

There are many examples of the application of LSA to semantic engines. For example, many researchers have obtained positive results using LSA to emulate human graders (Haley et al., 2005, 2007), and some of these findings have been translated into commercial grading systems or e-learning platforms, where LSA permits comparison of the semantic similarity between a student essay and a golden essay (an essay written by an expert). For example, Wade-Stein and Kintsch (2000) built Summary Street as a robust and practical educational tool. Some papers from a previous feature in this journal discussed this kind of application; the goal of this paper is to describe the role of polysemy in vector-space models and its consequences for models and applications. We will therefore discuss directly an example in which the representation of polysemy is an important issue in the engine. Our example relates to visualisation. The meaning produced by a technique such as LSA could be used in a visual information retrieval interface (VIRI) supported by a lexical context, allowing users to visually recognise the information needed rather than entering a search query. Such methods support the search process, providing a graphic overview of a semantic domain using related words that appear on the screen, and helping users to know what information can be retrieved using the interface. The Quintura search engine (http://www.quintura.com) provides a good model for understanding the process. The user writes the search key-word and the system retrieves a net of words that represent the possible words in context. This is useful as a heuristic tool for helping users to constrain the search and identify related topics. The process can be performed on classroom computers, but this kind of system must take care with the combinations of words suggested in the context that the VIRI covers.

If a key-word provided by the user is polysemous, combining it with another in the VIRI might give rise to the biases mentioned above (especially predominant meaning inundation). In two recent papers, we provided some examples of this bias and showed what happens if we apply the predication algorithm (Kintsch, 2001) to correct such effects (Jorge-Botana et al., 2010, 2009). In the latter study, we used several methods on a diagnostic corpus to fine-tune the retrieval of meaning. The same methods were used in the former, but this time on a Spanish colloquial corpus, LexEsp (Sebastián-Gallés et al., 2000). Here, our goal was to obtain a net in which two of the possible meanings of a polysemous word were represented: those which were coherent with two contexts (two words) that are added. For example, in one of the nets we wanted to draw¹ (in the form of a sorted list of semantic neighbours) the meaning of the word ‘partido’² (match/party) followed separately by the word ‘fútbol’ (football) or by the word ‘nacionalista’ (nationalist). Before extracting such lists, the vectors of both structures were calculated by means of a simple sum of vectors (partido+nacionalista) or using the predication algorithm (partido[nacionalista]). The words in the lists for both structures were combined into a single net. The general purpose was to check whether any meaning was strong enough to bias the final net in the first condition (partido+nacionalista), and to see what happens if we apply the predication algorithm (partido[nacionalista]) to avoid such a bias.


In the baseline condition, ‘partido’ in isolation, only the general meanings referring to a political party are extracted (Figure 7). Only three or four nodes were assigned terms that correspond to football topics (the zone separated by the line in Figure 7); all others contained political terms, an effect we have referred to as predominant meaning inundation: the predominant meaning of partido is political.

Figure 7 ‘Partido’ in isolation (see online version for colours)

Note: The line separates the dominant sense of the word (political topics) from the other (sports-related topics).

Introducing the two contexts, but using the simple vector sum, the sports-related topics make gains against political topics (zone separated by line in Figure 8). Introducing the football context reduces the predominant meaning inundation effect but fails to eliminate it completely.


Figure 8 ‘Partido’ with the two context words in the vector sum condition (see online version for colours)

Notes: The line separates the political topics of the word from the sports-related topics. Political topics still seem to be dominant.

Source: Taken from Jorge-Botana et al. (2010)

Introducing the two contexts with the predication algorithm, the sports-related meaning enjoys a surge relative to terms related to the argument ‘nacionalista’. There is now no sign of the predominant meaning inundation effect mentioned earlier (see the two zones separated by the line in Figure 9), and the vast majority of political terms were concerned with nationalism, such as regions with nationalist parties. The predication algorithm has managed not only to retrieve the elements of partido with a political meaning, but also to retrieve some terms referring to a specific political subsector.


Figure 9 ‘Partido’ with the two context words in the predication algorithm condition (see online version for colours)

Notes: The line separates the political topics of the word from the sports-related topics. Both topics seem equally feasible.

7 Conclusions

LSA is an important automatic method that has been used in many fields including, for instance, computational models of cognitive processes, information retrieval, e-learning systems, hypertext analysis, call routing and network visualisation. In this paper, we have described some properties of the vectors that represent polysemous words in LSA, which can be very useful for tuning this kind of application. To summarise, all meanings of a word are represented by the same vector, weighted according to the frequency with which the word appears in each meaning. The vector is thus an amalgam of meanings. This has consequences for the retrieval of meaning, because a procedure is needed to retrieve only the meaning that is coherent with a given context, avoiding the biases from the others. We have reviewed some issues related to the causes of such biases. The first concerned the form of an LSA polysemous vector and the distributional properties of the meanings. There are means of measuring automatically, albeit indirectly, whether the distribution of a vector is focused on a few or on many contexts. These measures could be important in setting a confidence level for deciding whether a word represents the context in which it appears. Knowledge of this confidence level could be useful in commercial applications. The second concerned some biases that arise when a meaning is retrieved, and some of the models that have been used to simulate the way that humans avoid them. Like humans, machines could use these procedures to obtain better results – for example, in word sense visualisation. This leads us to our third point: visualisation shows that using a procedure that takes such biases into account leads to more plausible networks.
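The idea of an amalgam vector summarised in this section can be illustrated with a small idealised sketch. It assumes, purely for illustration, that each sense has its own vector and that the word vector is their frequency-weighted average; the two-dimensional sense vectors and the counts 70 and 30 are hypothetical and are not taken from any corpus.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def amalgam(senses, freqs):
    """Frequency-weighted average of sense vectors (idealised word vector)."""
    total = sum(freqs)
    dim = len(senses[0])
    return [sum(f / total * s[i] for s, f in zip(senses, freqs)) for i in range(dim)]

political = [1.0, 0.0]  # hypothetical 'political party' sense
sport = [0.0, 1.0]      # hypothetical 'game/match' sense

# Hypothetical usage counts: the political sense is the more frequent one.
partido = amalgam([political, sport], [70, 30])

sim_political = cosine(partido, political)
sim_sport = cosine(partido, sport)
```

With these invented counts the single vector lies closer to the more frequent sense, which is the kind of bias that a context-sensitive retrieval procedure has to correct.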

Acknowledgements

This work was funded by the Spanish Ministry of Education via Grant PSI 2009-31932.

References

Adelman, J.S., Brown, G.D.A. and Quesada, J.F. (2006) ‘Contextual diversity, not word frequency, determines word naming and lexical decision times’, Psychological Science, Vol. 17, pp.814–823.

Ahrens, K., Chang, L-L., Chen, K-J. and Huang, C-R. (1998) ‘Meaning representation and meaning instantiation for Chinese nominals’, Computational Linguistics and Chinese Language Processing, Vol. 3, No. 1, pp.45–60.

Burgess, C. (2000) ‘Theory and operational definitions in computational memory models: a response to Glenberg and Robertson’, Journal of Memory and Language, Vol. 43, pp.402–408.

Deerwester, S., Dumais, S.T., Furnas, G.W., Landauer, T.K. and Harshman, R. (1990) ‘Indexing by latent semantic analysis’, Journal of the American Society for Information Science, Vol. 41, pp.391–407.

Haley, D.T., Thomas, P., De Roeck, A. and Petre, M. (2005) ‘A research taxonomy for latent semantic analysis-based educational applications’, Technical Report No. 2005/09, The Open University.

Haley, D.T., Thomas, P., Petre, M. and De Roeck, A. (2007) ‘Seeing the whole picture: comparing computer assisted assessment systems using LSA-based systems as an example’, Technical Report No. 2007/07, Open University – Department of Computing.

Jorge-Botana, G., Olmos, R. and León, J.A. (submitted) ‘Monitoring the penalization/advantage of lexical ambiguity in vector model representations’, Cognition.

Jorge-Botana, G., León, J.A., Olmos, R. and Hassan-Montero, Y. (2010) ‘Visualizing polysemy using LSA and the predication algorithm’, Journal of the American Society for Information Science and Technology, Vol. 61, No. 8, pp.1706–1724.

Jorge-Botana, G., Olmos, R. and León, J.A. (2009) ‘Using LSA and the predication algorithm to improve extraction of meanings from a diagnostic corpus’, Spanish Journal of Psychology, Vol. 12, No. 2, pp.424–440.

Kilgarriff, A. (1997) ‘I don’t believe in word senses’, Computers and the Humanities, Vol. 31, No. 2, pp.91–113.

Kintsch, W. (2008) ‘Symbol systems and perceptual representations’, M. de Vega, A.M. Glenberg and A.C. Graesser (Eds.): Symbols and Embodiment: Debates on Meaning and Cognition, pp.145–164, Oxford University Press, Oxford, UK.


Kintsch, W. and Bowles, A. (2002) ‘Metaphor comprehension: what makes a metaphor difficult to understand?’, Metaphor and Symbol, Vol. 17, pp.249–262.

Kintsch, W., Patel, V. and Ericsson, K.A. (1999) ‘The role of long-term working memory in text comprehension’, Psychologia, Vol. 42, pp.186–198.

Kintsch, W. (1998) ‘The representation of knowledge in minds as machines’, International Journal of Psychology, Vol. 33, No. 6, pp.411–420.

Kintsch, W. and Mangalath, P. (in press) ‘The construction of meaning’, Topics in Cognitive Science.

Kintsch, W. (2000) ‘Metaphor comprehension: a computational theory’, Psychonomic Bulletin and Review, Vol. 7, pp.257–266.

Kintsch, W. (2001) ‘Predication’, Cognitive Science, Vol. 25, pp.173–202.

Klepousniotou, E. (2002) ‘The processing of lexical ambiguity: homonymy and polysemy in the mental lexicon’, Brain and Language, Vol. 81, Nos. 1–3, pp.205–223.

Landauer, T.K. and Dumais, S.T. (1997) ‘A solution to Plato's problem: the latent semantic analysis theory of the acquisition, induction, and representation of knowledge’, Psychological Review, Vol. 104, pp.211–240.

McDonald, S. and Shillcock, R. (2001) ‘Rethinking the word frequency effect: the neglected role of distributional information in lexical processing’, Language and Speech, Vol. 44, pp.295–323.

Nakov, P., Popova, A. and Mateev, P. (2001) ‘Weight functions impact on LSA performance’, Paper presented at Recent Advances in Natural Language Processing – RANLP 2001, Tzigov Chark, Bulgaria.

Piercey, C.D. and Joordens, S. (2000) ‘Turning an advantage into a disadvantage: ambiguity effects in lexical decision versus reading tasks’, Memory & Cognition, Vol. 28, pp.657–666.

Sebastián-Gallés, N., Martí, M.A., Carreiras, M. and Cuetos, F. (2000) ‘LEXESP: una base de datos informatizada del español’, Universitat de Barcelona, Barcelona.

U.S. Pat. Num. 6,256,629 (2001) ‘A method and apparatus for measuring the degree of polysemy in polysemous words’, (with J. van Santen), July 3.

Wade-Stein, D. and Kintsch, E. (2004) ‘Summary street: interactive computer support for writing’, Cognition and Instruction, Vol. 22, No. 3, pp.333–362.

Notes

1 All these networks were drawn with the viewer application developed by the SCImago research group, which is also being used for visual representation of scientific co-citation networks. See Jorge-Botana et al. (2010) for more details.

2 Partido in Spanish has several meanings, the commonest being ‘political party’ and ‘game/match’ (only in the sense of playing competitively).