Language Sciences, Volume 14, Number 4, pp. 579-605, 1992 0388-0001/92 $5.00+.00 Printed in Great Britain © 1993 Pergamon Press Ltd
Choice in Lexis: Computer Generation of Lexis as Most Delicate Grammar
Marilyn Cross
Wollongong University
ABSTRACT
A model and implementation of lexis for text generation is described that has the goal of providing motivated choice within the lexical component. The model uses the Systemic-Functional approach to language and develops lexis as most delicate grammar, building the implementation on the USC/ISI Nigel grammar. The differentiation between choices in the network is based on paradigmatic features that have formal linguistic consequences. The detailed examination of material transformation Processes in the implementation demonstrates the impact of lexical feature choices on grammatical roles. It is also shown that collocation within register may be captured through the cross-linkage of preselection from grammatical functions to paradigmatic features in the network.
LEXIS IN GENERATION
The term lexis will be used to refer to “the resources of the vocabulary . . . covering both the static organisation of vocabulary and the process of lexical choice” (Matthiessen 1988a). Lexis contrasts with the more widely used term ‘lexicon’ (cf. Mel’čuk and Polguère 1987; Nirenburg and Raskin 1987), which tends to cover vocabulary in much the same way as in lexicography, treating it from the three aspects of semantics, syntax and phonology. The term lexicon will be reserved for the organisation of vocabulary from a lexicographical position and also in extracts from authors referred to in this paper.
Until recently, lexis has received relatively little attention from generationists: “in most of the generation systems, lexical selection could not be a primary concern due to the overwhelming complexity of the generation problem itself” (Pustejovsky and Nirenburg 1987). However, generationists are bound to treat lexis in some manner and, according to Cumming (1986), that treatment is very varied. One aspect of that
variation has been the creation of a division between grammar and lexis on the basis of closed and open systems:
“. . . open-class items are not only conceptually different from closed-class items but are
processed differently as well. Closed class items have no epistemological status other than
procedural attachments to conceptual and discourse information.”
(Pustejovsky and Nirenburg 1987, 205)
Their position can be countered by taking a metafunctional interpretation of lexis which diminishes the differences between open-class and closed-class items (Matthiessen 1988~).
The original conception of the Nigel grammar was to run with a separate lexicon that generated open-class lexical items and for grammatical items to be generated through the grammar module (Mann 1983). The separation of grammatical from lexical items and the treatment of the lexicon as a ‘list’ is a common paradigm in computational linguistics. However, not every generationist has subscribed to the division of grammatical words from lexical words (Matthiessen 1988a). In McKeown’s TEXT, the dictionary component covers both the open-class and closed-class lexical items (McKeown 1985). Patten’s (1988) generator has no separate lexicon, but outputs both grammatical and lexical words as a result of the choices taken in both the semantic and grammatical networks. Ward (1988) departs from the multi-stage model of generation (McDonald et al. 1987), using a single semantic network to represent both linguistic knowledge and world knowledge. Words are selected through cumulative activation of the network.
The task of choosing appropriately from the open class of lexical items has been tackled in various ways. Some form of discrimination net is quite commonly used in which selection tests are applied to pick the appropriate word (Goldman 1975; Pustejovsky and Nirenburg 1987). In Babel (Goldman 1975) verb senses are given ‘defining characteristics’ that enable differentiation of one lexical verb from another. For example, to differentiate drink from eat, which both have the underlying concept INGEST, the object has the property FLUID for drink but not for eat.
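The drink/eat example can be sketched as a very small discrimination net. This is only an illustration of the idea, not a reconstruction of Babel: the data layout, the function name `select_verb` and the property `EDIBLE` are assumptions introduced here; only INGEST, FLUID, drink and eat come from the text.

```python
# A toy discrimination net: each concept maps to an ordered list of
# (selection test, word) pairs; the first test that succeeds picks the word.
# The layout and the property name EDIBLE are assumptions for illustration.
DISCRIMINATION_NET = {
    "INGEST": [
        (lambda obj: "FLUID" in obj, "drink"),  # object has the property FLUID
        (lambda obj: True, "eat"),              # fallback when FLUID is absent
    ],
}


def select_verb(concept, object_properties):
    """Return the first word whose selection test accepts the object."""
    for test, word in DISCRIMINATION_NET[concept]:
        if test(object_properties):
            return word
    raise LookupError(f"no word found for concept {concept!r}")


print(select_verb("INGEST", {"FLUID"}))   # drink
print(select_verb("INGEST", {"EDIBLE"}))  # eat
```

Because the tests are arbitrary predicates, nothing in such a net prevents lexical, grammatical and other constraints from being mixed in one list.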
One problem with discrimination nets is the tendency to include constraints that originate from heterogeneous sources. In the example from Babel, not only is there information based on lexical constraints, but also information that is grammatically motivated, for example, constraints based on beneficiary and tense. Thus, the basis of the discrimination net tends to be somewhat eclectic. Of interest is the demonstration by Matthiessen (1988~) that the discrimination nets of Goldman that are based on ideational meaning fit readily into an extension of the dispositive process network.
In the generators discussed, the emphasis has been on selecting the ‘correct’ lexical item. Hovy (1985; 1987a) has been concerned to differentiate between lexical items that have some degree of synonymy in texts. In an exploration of how to ‘slant’ texts,
Hovy uses a lexicon in which items are tagged for affect (1985) and in which the selection of lexical items is based on rhetorical goals that mediate between a speaker’s pragmatic goals and a text’s style (1987a).
PROLEGOMENA TO A MODEL FOR LEXIS
Lexis as Closed or Open System
In order to examine the case for a cline between grammar and lexis, the traditional treatment of lexis as an open system will be discussed and contrasted with the treatment of grammar as a closed system.
A closed system is a series of terms where the list of terms is exhaustive, each term excludes all the others and if a new term is added, at least one of the previous terms undergoes a change of meaning, so that in effect a new system replaces the old (Halliday 1961). From this perspective, grammar is characterised by closed relations where there is a choice among a fixed number of possibilities: for example, in the personal pronoun set, class membership is closed; a new pronoun is much less likely to be added (Halliday 1985b). If the lexis does not readily behave as a closed system, one alternative is to view it as an open system. For lexis, class membership is open and extendable. Even so, the addition of a new lexical item does result in a shift in the organisation of the relationships existing between the members of the set:
“. . . (a linguistic) move has a repercussion upon the whole system. . . . The changes in values
which result may be, in any particular circumstance, negligible, or very serious, or of moderate
importance.” (Saussure 1906, 88)
As an example, the addition of the lexical item AIDS to the set of sexually transmitted diseases does not come into a lexical void but fits into a system of disease types, contrasting with the other members of the set. Thus, one basis for meaning for both grammatical and lexical items is their value in a paradigmatic system:
“. . . a system in which all the elements fit together, and in which the value of any one element
depends on the simultaneous coexistence of all the others.” (Saussure 1906, 113)
The work of computational linguists building ontologies for the lexicons of machine readable dictionaries attests to this viewpoint (Byrd et al. 1987; Wilks et al. 1989). If the basis for relating the items of closed and open sets is one of paradigmatic relations, the distinction between grammatical and lexical items is not one of kind but one of quantity. Halliday (1961) has maintained that the difference between grammar and lexis is in terms of delicacy in the system network.
Lexis as Most Delicate Grammar
It is three decades ago that Halliday remarked:
“The grammarian’s dream . . . is to turn the whole of linguistic form into grammar, hoping to
show that lexis can be defined as ‘most delicate grammar’.”
(Halliday 1961, 267)
The feasibility of the task of “extending a lexicogrammatical network in delicacy
so as to turn it into a device for the description and generation of. . . lexical items”
(Hasan 1987, 185) was explored for some material processes by Hasan (1985, 1987).
Hasan (1987) explored how processes “whose completion results in gain/loss of
access to things” (p. 187), for example, the lexical items gather, collect and
accumulate might be described as part of the transitivity region of grammar. The systems that
led to the generation of the lexical items were treated as more delicate choices within
the material process category. The choice of features in the lexical extension of the
grammatical networks not only influenced lexical choice but also had contingent
structural effects (cf. Fawcett 1989).
The critical theoretical tenet underpinning the grammarian’s dream is the scale
of delicacy, which enables grammar to be extended out to lexis. Delicacy enables
linguistic description to be made at varying degrees of abstraction, beginning with the
most abstract or primary degree of delicacy and continuing along the scale until that
level of abstraction is reached that is sufficient for the descriptive task (Halliday
1976a, 62-7). Two assumptions that spring from this tenet are that grammar and lexis
are unified on a continuum and that lexical items may be distinguished by grammatical
criteria. The result of the grammarian’s dream is that the division between lexis and
grammar disappears and the resources merge.
Collocation
Firth (1957a) first introduced into descriptive linguistics the idea of meaning by
collocation, in which part of the meaning of a word is given by its habitual association
with other words. Firth maintained that meaning by collocation was “an abstraction
at the syntagmatic level” and was not concerned with the conceptual approach to word
meaning (Firth 1957a, 196). Following the discussion of Halliday (1966), one may
suggest that the range of collocation is correlated with different grammatical forms:
“. . . it is not to say there is no interrelation between structural and collocational patterns; but . . .
their interdependence can be regarded as mutual rather than as one-way (and) it will be more
clearly displayed by a form of statement which first shows grammatical and lexical restrictions
separately and then brings them together.” (Halliday 1966, 152)
How the lexical restrictions may be captured in a representation is another question, but it is a problem that may be addressed on a large scale and empirically now that machine based corpora and dictionaries are available (Velardi and Pazienza 1989; Wilks et al. 1988, 1989; Small et al. 1988). As collocation is a relationship between lexical items that co-varies with grammatical patterns (Halliday 1966), it needs to be accounted for in a model of lexis as most delicate grammar.
Lexical Restrictions and Register
Register theory deals with the interrelationship between linguistic variation and types of audience and situation (Bateman and Paris, 1989a) in a principled linguistic way:
“Types of situation differ from one another, broadly speaking in three respects: first, as regards
what is actually taking place [field]; secondly, as regards what part the language is playing
[mode]; and, thirdly, as regards who is taking part [tenor]. These three variables, taken together,
determine the range within which meanings are selected and the forms which are used for their
expression. . . What the theory of register does is to attempt to uncover the general principles
which govern this variation, so that we can begin to understand what situational factors
determine what linguistic features. . ” (Halliday 1978, 31-2)
One observation that may be made from the examination of collocation is that register plays a part in the selectional restriction of lexis. Register is relevant to the treatment of lexis both for text understanding and generation. For robustness, a generator should provide the flexibility to deal with heterogeneous texts. Pragmatically, once the register of a text genre is defined, the range of choice available for those texts is circumscribed. The narrowing of choice begins at the level of text structure and continues right through to the lexis. At the level of lexis, it seems almost trite to claim that in texts about the environment, for example, the probability of lexical items such as hippopotamus or love occurring would be relatively low. However, the implications for the organisation of lexis for generation are profound (see Wilks et al. 1989 and Small et al. 1988 on lexical ambiguity). Not only is the choice of items restricted, but by extension, the collocational sets into which the items enter are limited. Thus, for example, if one takes the lexical item water in pedagogical texts on the water cycle, and a range of three lexical items either side of water, the significant collocational set (Sinclair 1966) would include cycle, fresh, supply, evaporate, vapour, land, area, fall, snow, drain, rain, lots, turn, little, drop, cloud, big. The pervasiveness of water in this domain may be compared with the distribution of the item in a medical domain where the few occurrences relate to its role as a medium for disease, for example, water-borne hepatitis (Hobbs 1986).
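The span of three items either side of water can be made concrete with a short counting routine. The sketch below only gathers raw co-occurrence counts; it does not apply any significance test, so it is a first step towards a significant collocational set rather than the set itself. The function name `collocates` and the sample sentence are invented for illustration and are not taken from the exemplar texts.

```python
from collections import Counter


def collocates(tokens, node, span=3):
    """Count the items found within `span` positions either side of `node`."""
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == node:
            left = tokens[max(0, i - span):i]
            right = tokens[i + 1:i + 1 + span]
            counts.update(left + right)
    return counts


# Invented sample sentence, not taken from the exemplar texts.
tokens = "the sun heats the sea and water evaporates from the sea".split()
print(collocates(tokens, "water"))
```

Running the same routine over corpora from two registers would show the contrast described above: a large set of collocates for water in water-cycle texts and a small one in medical texts.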
Given that register predicts what forms may occur, it may be seen as selecting both
the range of the lexis and the collocational set. It is possible then, to reflect this in the lexis. The logical outcome is to limit or filter the range of lexis available to the generator, for example, the subworld division of Nirenburg and Raskin (1987) for machine translation and Pustejovsky and Nirenburg (1987) for generation. Even where the goal is to provide a lexical resource that covers a heterogeneity of text types (Calzolari and Picchi 1988; Pazienza and Velardi 1987), there may be some account taken of register, for example, the utilisation of field descriptors such as ‘engineering’ in the LDOCE (Wilks et al. 1988). The solution to build a different lexis for each project has the usual limitation of scalability (Wilks et al. 1988, 1989).
A Model for Lexis
A model of lexis as most delicate grammar extends grammar along the scale of delicacy out to lexis. Its affinity is with the multi-stage models of generation, such as Mann (1983), McKeown (1985) and Patten (1988). The treatment of lexis as most delicate grammar has its closest counterpart in Patten (1988) and follows the general model described by Matthiessen (1988a).
In the model, lexis is perforce subdivided by grammatical class. The description of the lexis for verbs extends the network for Processes (Halliday 1985; Hasan 1987). Similarly, the description of the lexis for nominals is made after the choice is made for the nominal group (Halliday 1976b). The majority of workers in lexis make a primary division in the lexis between nouns and verbs (Miller, 1985; Dahlgren et al., 1989).
IMPLEMENTATION
Implementation of the Model of Lexis in HORACE
The implementation of the model of lexis forms part of a larger project (called HORACE) that attempts to develop an explicit model of text production using the methodology of computer generation (Cross 1991). The model is based on the Systemic-Functional theory of language and utilises the Nigel grammar (Mann 1983; Mann and Matthiessen 1983; Matthiessen 1988b,c). The location of the lexicogrammatical resource in the architecture of the system is given in the following figure (Fig. 1).
As register-based generation, the implementation has taken the theoretical approach of Patten (1988) and Bateman and Paris (1989a,b). The feature-based lexical networks are metalinguistic (Goldman 1975; Hobbs et al. 1987; Pazienza and Velardi 1987) rather than lexical taxonomic (Eggins et al. 1987). Because lexis is represented on a continuum with grammar, there is no need to incorporate grammatical information
[Figure 1: architecture diagram. Legible labels: Blackboards; Knowledge sources; Control strategy; Stages; Semiotic; Lexicogrammar; Lexicogrammatical structure; Realization statements; Structure builder.]
Figure 1. Lexicogrammatical Resource in the Architecture of HORACE.
into the lexical items. With lexis as most delicate grammar, the ontology of the lexis is theoretically and practically embedded in the grammar and has no need of extralinguistic justification (cf. Hobbs et al. 1987; Dahlgren et al. 1989). Moreover, there is an a priori grammatical division (Miller 1985; Ritchie et al. 1987) rather than the theoretically unjustified overlap presented in some ontologies (Dahlgren and McDowell 1986; Dahlgren et al. 1989). Finally, the model of lexis as most delicate grammar attenuates the distinction between the closed systems and the open systems of lexis rendering both as:
“. . . a system in which all the elements fit together, and in which the value of any one element
depends on the simultaneous coexistence of all the others.”
(Saussure 1906, 113)
The issue of a lexis that not only takes account of denotative meaning but also connotative/affective meaning and the textual function of vocabulary is one that has been raised by Matthiessen:
“Any lexical choice will potentially reflect all three metafunctions. Ideationally it represents
some phenomenon by classifying it in relation to a lexical taxonomy. Interpersonally, it is a
positioning in terms of formality, attitude and so on. Textually, it relates to previous discourse
by repeating, varying, generalising, or summarising a previous item.”
(1988a, 8)
In HORACE the model of lexis is built primarily using the ideational meanings of
the lexical item. It is this aspect of meaning that is reflected in the register of environ-
mental texts which provides the base for the project. However, where there is contrast
between vocabulary items on the basis of tenor, then the constraint mechanism is
invoked which enables extra-network determination of choice (Matthiessen 1987;
Cross 1991). Thus, vocabulary items which are indistinguishable in terms of ideational
meaning are differentiated by means of interpersonal meaning. The model adopted is
one in which interpersonal meanings combine with the ideational lexis rather than
make independent contributions (Matthiessen 1988a).
The grammar is extended using the system network convention, equivalent to an
acyclic directed graph. The divisions in the network are based on the word meanings
and contrasts that exist in the exemplar texts. The nodes of the network are para-
digmatic (Halliday 1985). The actual lexical items are generated through realisation
statements attached to the nodes of the network. The extension of the grammatical
networks to lexis covers lexis for Process, Thing, Quality and Circumstance, which
in the nomenclature of grammatical class are verbs, nouns, adjectives and adverbs,
respectively.
Tucker and Fawcett (1991) who have also been implementing lexis as most delicate
grammar in the COMMUNAL project (Fawcett and Tucker 1990) recently made the
point that if the lexical networks are separated from the more grammatical part of the
network, then the lexical networks may appear to be simple taxonomies. Their defence
of the COMMUNAL lexical networks holds equally well for the HORACE networks.
The lexical networks in HORACE are more than taxonomies of semantic features
because of the interdependence of the lexical networks with the grammatical networks.
The interconnections, together with the power of preselection (see ‘Consequences of
Choice in the Transformation Network’) between networks for process, thing, quality
and circumstance (Cross, 1991) bring theoretical organisation to what have been
labelled ‘selectional restrictions’ in other approaches to the lexicon (Pustejovsky and
Nirenburg, 1987).
Organisation of the Material Process Network
In Nigel, four types of processes are identified: material, mental, verbal and
relational (Halliday, 1985). The majority of processes in the environmental texts are
material and relational, so the part of the implementation described here is a fragment
of the material network. There will be some brief reference to the lexical network built for nominals.
The starting point for the extension of the material process network is where Nigel terminates its description of material processes. In the material process network the gate EFFECTIVE-MATERIAL is the entry condition for the system DOING-TYPE with a choice between dispositive and creative types of processes. The material processes that occur in the texts may be grouped into six major categories: transformation, behaviour, motion, occurrence, dispositive and creative, with their linking into the Nigel network as follows (Fig. 2).
[Figure 2: network diagram. Legible node labels: Other-material; Occurrence; meteorological; Motion.]
Key:
There is a system x/y with entry condition a (if a, then either x or y).
There are two systems a/b and x/y, ordered in dependence so that a/b has entry condition m and x/y has entry condition a (if m, then either a or b, and if a, then either x or y).
There are two simultaneous systems m/n and x/y, both having entry condition a (if a, then both either m or n and, independently, either x or y).
There is a system x/y with compound entry condition, conjunction of a and c (if both a and c, then either x or y).
There is a system x/y with two possible entry conditions, a or d (if either a or d, then either x or y).
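The conventions in the key can be read as entry conditions on systems. The following is a minimal sketch of that reading, assuming a representation in which a system is a pair of an entry condition and its terms; the functions `satisfied`, `traverse` and `first` are hypothetical names introduced here and are not part of Nigel or HORACE.

```python
def satisfied(condition, selected):
    """Check an entry condition against the set of features selected so far.

    A condition is a feature name, or a tuple ("and", c1, c2, ...) for a
    conjunction, or ("or", c1, c2, ...) for a disjunction.
    """
    if isinstance(condition, str):
        return condition in selected
    operator, *parts = condition
    results = [satisfied(part, selected) for part in parts]
    if operator == "and":
        return all(results)
    if operator == "or":
        return any(results)
    raise ValueError(f"unknown operator {operator!r}")


def traverse(systems, initial, choose):
    """Enter every system whose entry condition holds; pick one term from each."""
    selected = set(initial)
    pending = list(systems)
    progress = True
    while progress:
        progress = False
        for system in list(pending):
            entry, terms = system
            if satisfied(entry, selected):
                selected.add(choose(terms))
                pending.remove(system)
                progress = True
    return selected


def first(terms):
    """Stand-in for a motivated chooser: always take the first term."""
    return terms[0]


dependent = [("m", ("a", "b")), ("a", ("x", "y"))]
simultaneous = [("a", ("m", "n")), ("a", ("x", "y"))]
conjunctive = [(("and", "a", "c"), ("x", "y"))]
disjunctive = [(("or", "a", "d"), ("x", "y"))]

print(sorted(traverse(dependent, {"m"}, first)))     # ['a', 'm', 'x']
print(sorted(traverse(simultaneous, {"a"}, first)))  # ['a', 'm', 'x']
print(sorted(traverse(conjunctive, {"a"}, first)))   # ['a']
print(sorted(traverse(disjunctive, {"d"}, first)))   # ['d', 'x']
```

In the conjunctive case the system x/y is not entered because only a has been selected, which is the behaviour the key describes for a compound entry condition.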
Transformation in the Material Process Network
The transformation processes are those in which the medium undergoes a transformation of some kind. The transformation may be a metamorphosis in which the medium changes from one state to another, for example, evaporate, in which the medium changes from liquid to gas. Alternatively, the medium may undergo a change in state but not be metamorphosed, for example, heat, where the medium changes temperature but not
form. Transformation may also be a change in physical form, for example, break in
which the wholeness of the physical form is breached, or a change in composition,
for example, desalinate where a substance is subtracted from the medium. In brief,
the subtypes in transformation are:
change-intensity           increase/decrease attribute
metamorphosis              change from one state to another
change-in-physical-form    change outward physical form
change-in-composition      addition/subtraction substance
Transformation is exemplified in the texts through the lexical items heat, cool,
evaporate, condense, transpire, increase, decrease, dry, melt, freeze, absorb,
desalinate, compact, crack and break. Examples from the texts are:
E1. When the sea is heated by the sun
E2. from where it (water) evaporates again.
E3. The icebergs melt in the sea
E4. In some desert regions sea water is desalinated
E5. There it is compacted into ice
For each of the processes, the transformation is subtly different. For cool, the medium
changes temperature, but does not undergo a metamorphosis from one form to
another, as in evaporate, or a change in the outward physical form, for example,
break, or a change in composition, for example, desalinate. The process cool may
be contrasted with the process heat: the former is a decrease in temperature, the
latter an increase in temperature. Comparing compact with heat and cool, the
former differs from heat and cool, in that the medium undergoes a change in
physical form. However, if one pairs compact with expand, then the pair is linked
by the contrast of increase/decrease, as are cool and heat. Using the feature
<change-intensity> to represent increase/decrease, it might be suggested that
<change-intensity> contributes to heat, cool, compact and expand. The differentiation
between the pairs may be described as <thermal-change> versus <volume-change>.
For the distinctions described so far, the network may be diagrammed as
follows (Fig. 3).
Figure 3. Partial Network for Transformation-1.
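The distinctions drawn so far can be sketched as feature bundles with a lexical item attached to each, loosely in the spirit of realisation statements attached to network nodes. This is an assumption-laden illustration rather than the HORACE implementation: the dictionary layout and the function `lexify` are invented here, and the feature names <thermal-decrease> and <volume-increase> are assumed by analogy with <thermal-increase> and <volume-decrease>, which the paper does use.

```python
# Feature bundles and the lexical item each one realises. The names
# "thermal-decrease" and "volume-increase" are assumed by analogy with
# "thermal-increase" and "volume-decrease".
REALISATIONS = {
    frozenset({"thermal-change", "thermal-increase"}): "heat",
    frozenset({"thermal-change", "thermal-decrease"}): "cool",
    frozenset({"volume-change", "volume-increase"}): "expand",
    frozenset({"volume-change", "volume-decrease"}): "compact",
}


def lexify(selected):
    """Return the item whose feature bundle is contained in the selection."""
    if "change-intensity" not in selected:
        raise LookupError("<change-intensity> has not been selected")
    matches = [item for bundle, item in REALISATIONS.items() if bundle <= selected]
    if len(matches) != 1:
        raise LookupError(f"expected one lexical item, found {matches}")
    return matches[0]


path = {"transformation", "change-intensity", "thermal-change", "thermal-decrease"}
print(lexify(path))  # cool
```

The point of the sketch is that the lexical item falls out of the features chosen on the way through the network; no separate lexicon lookup is involved.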
As was argued previously and is shown in the network above, <change-in-physical-form> and <volume-decrease> are features that will eventually lead to the lexical item compact. If the range of possible lexical items is extended to cover break and crack, then it may be argued that part of the meaning of all of these items is <change-in-physical-form>, as it is for compact. Break and crack do not share the feature <volume-decrease> with compact. Further differentiation may be captured by positing the opposing features <wholeness-breached> and <wholeness-unbreached>. Break and crack share the feature <wholeness-breached> and compact requires the feature <wholeness-unbreached>. With the pair break and crack a point has been reached where there is some degree of synonymy, at least in the register under examination. Constructing some text-based examples, the following are possible:
E6. Water can seep through some rocks such as . . . granite which is cracked through weathering.
E7. Water can seep through some rocks such as . . . granite which is broken through weathering.
The problem is how to differentiate between cracked and broken. One possible solution is to introduce another system that leads to one-item classes. To differentiate between break and crack, the features <division-specified> and <division-unspecified> might be introduced, in the sense that break can mean a division into pieces, whereas crack does not necessarily involve a division. The network would then be expanded as follows (Fig. 4).
[Figure 4: network diagram. Legible node labels: Transformation; Change intensity; No change intensity; Volume change; No volume change; Volume decrease; Phenomenon specified; Phenomenon unspecified; Deformation specified; Change in physical form; No change in physical form; Wholeness unbreached.]
Figure 4. Partial Network for Transformation-2.
This solution bases differentiation in the network, prior to the choice of the lexical item (cf. Ward 1988). If the approach is taken to its logical conclusion, then every word would have a corresponding node in the network. For a limited register, this is feasible (Patten 1988). However, there may be places where ideationally items appear to be synonymous in a given context. In such a case, one solution is to use the interpersonal to guide the choice.
As was mentioned previously, there is a subgroup of transformation processes in which the medium undergoes a metamorphosis. Take for example the pair evaporate
and condense. For evaporate the medium begins with an initial form and changes to a final gaseous form. For condense, the reverse occurs: the medium begins with an initial gaseous form and changes to a final liquid form. However, there is more. The metamorphoses involve a change in thermal energy, so that part of the meaning of evaporate is an increase in thermal energy and part of the meaning of condense is a decrease in thermal energy. Including another contrastive pair in the network, the processes melt and freeze also involve an increase in thermal energy and a decrease in thermal energy, respectively. The metamorphosis is different: melt is a change from an initial solid form to a final liquid form and freeze, an initial liquid form to a solid form. The network for the metamorphosis processes discussed is as follows (Fig. 5):
Expanding the contrasts in the network, the transformational process dry up may be linked and indeed, is linked with evaporate in the texts:
E8. The sun’s heat causes water to evaporate from oceans and lakes. This means that the water dries up becomes vapour and disappears into the air.
[Figure 5: network diagram. Legible node labels: Final gas; Change in physical form; No change in physical form.]
Figure 5. Partial Network for Transformation-3.
For dry up, the initial form of the medium is specified as liquid but the final form of the medium is left unspecified, and the role of <thermal-increase> cannot be inferred. The lexical item can be accounted for by introducing the features <final-form-specified> and <final-form-unspecified> and including the feature <no-change-intensity>.
Evaporate also contrasts with transpire: both involve the change of the medium from liquid to vapour, but for the latter, the role of thermal energy is not specified and the process is restricted to plants. Introducing the features <plant-specified> and <plant-unspecified> differentiates between the meanings.
E9. Transpiration means the giving off of water vapour by plant leaves.
Dissolve is also part of metamorphosis, sharing with melt the change from an initial form of solid to a final form of liquid, but not the increase in thermal energy. Moreover, for dissolve a liquid contact material is specified. The extension of the metamorphosis network to take into account dry up, transpire and dissolve is as follows (Fig. 6).
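The contrasts among the metamorphosis processes can be laid out as feature bundles and compared pairwise. The bundles below are read off the prose description only; the feature spellings (for example `initial-liquid` and `contact-liquid-specified`) and the function `contrast` are assumptions made for this sketch and may not match the labels used in the actual network.

```python
# Feature bundles read off the prose description of the metamorphosis
# processes; the feature spellings are assumptions made for this sketch.
FEATURES = {
    "evaporate": {"initial-liquid", "final-gas", "thermal-increase",
                  "plant-unspecified"},
    "condense": {"initial-gas", "final-liquid", "thermal-decrease"},
    "melt": {"initial-solid", "final-liquid", "thermal-increase"},
    "freeze": {"initial-liquid", "final-solid", "thermal-decrease"},
    "dry up": {"initial-liquid", "final-form-unspecified",
               "no-change-intensity"},
    "transpire": {"initial-liquid", "final-gas", "plant-specified"},
    "dissolve": {"initial-solid", "final-liquid", "contact-liquid-specified"},
}


def contrast(first, second):
    """Return (shared, only in first, only in second) feature sets."""
    a, b = FEATURES[first], FEATURES[second]
    return a & b, a - b, b - a


shared, only_melt, only_dissolve = contrast("melt", "dissolve")
print(sorted(shared))         # ['final-liquid', 'initial-solid']
print(sorted(only_melt))      # ['thermal-increase']
print(sorted(only_dissolve))  # ['contact-liquid-specified']
```

Each pair of items shares part of its bundle and differs in at least one feature, which is the paradigmatic contrast the network is intended to capture.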
The final lexical items to be built into the network are absorb and desalinate. Both of these processes fall into the category of change-in-composition, with absorb involving the addition of a substance and desalinate the subtraction of a substance. Further distinctions are that for absorb, the medium loses its separate identity and for desalinate the subtracted substance must be salt. The change-in-composition part of
[Figure 6: network diagram. Legible node labels: Transformation; Metamorphosis; Change in physical form; No change in physical form.]
Figure 6. Partial Network for Transformation-4.
the network and the full network for transformative material processes are given in
the following two figures (Figs 7, 8).
[Figure 7: network diagram. Legible node labels: Transformation; Change intensity; Metamorphosis; Change in physical form; Change in composition; substance.]
Figure 7. Partial Network for Transformation-5.
[Figure 8: network diagram. Legible node labels: Transformation; Change intensity; No change intensity; Metamorphosis; No metamorphosis; Initial solid; Change in physical form; No change in physical form; Wholeness unbreached; Change in composition; substance. Legible lexical items: crack, evaporate, transpire, condense, dissolve, melt, compact.]
Figure 8. Network for Transformative Material Processes.
In order to demonstrate the full power of the lexical networks in HORACE, the realisation statements for the network extension for transformative material processes will be developed in the section ‘Consequences of Choice in the Transformation Network’.
Grammatical Status of Transformational Processes
Transformational processes come in a variety of guises. They can occur as either
middle or effective processes (Halliday 1985a). For the latter, they can occur in both
the active and passive voice. They can also occur in causative constructions. Some
transformational processes can occur as material-processual things. Not every trans-
formational process can occur in every possible form: for example, evaporate occurs
in all categories, condense occurs as a middle process or in a causative construction,
while transpire only occurs as a material-processual thing, that is, transpiration.
Absorb and desalinate only occur as effective processes.
The consequences for the network of the different possibilities for transformational
processes are twofold, affecting the entry conditions to the network and the exit to the
lexicalisation. The entry conditions for the network must cover the possibility for both
material processes and material-processual things. This means that the entry conditions
for the network are disjunctive: either <material> or <material-processual>
are possible entry features. Because the network has entry conditions in both the rank
of clause and group-phrases, realisation statements need either to be rank neutral or
captured in systems with entry conditions differentiated for rank. The lexicalisation
of the items also needs to be carried out separately. Take for example, the feature
<evaporate> which is part of the entry conditions for two gates, one of which would
lead to lexification of the Process as evaporate and the other to the lexification of the
Thing as evaporation. The gates for lexification re-introduce the differentiation
between process and thing. The joint entry conditions for the lexification of evaporate
are <material> and <evaporate>, which may be contrasted with the joint entry
conditions for evaporation, namely, <material-processual> and <evaporate>.
One problem that is handled at the lexification level is the impossibility of a middle form of absorb or desalinate. The features <absorb> and <desalinate> appear in the network but can only be lexified as a verbal stem if the feature <effective-material> has been selected.
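The gating described above can be pictured with a small Python sketch. It is illustrative only: the feature names come from the text, but the function `lexify`, the set `EFFECTIVE_ONLY` and the assumption that <effective-material> is selected alongside <material> are hypothetical and are not HORACE's actual representation.

```python
from typing import Optional, Set

# Hypothetical: stems that the text says have no middle form.
EFFECTIVE_ONLY = {"absorb", "desalinate"}


def lexify(selected: Set[str], stem: str, noun: Optional[str] = None) -> Optional[str]:
    """Return the lexical item licensed by the selected features, or None."""
    if stem not in selected:
        return None
    # Gate 1 (clause rank): <material> AND <stem> lexifies the Process.
    if "material" in selected:
        if stem in EFFECTIVE_ONLY and "effective-material" not in selected:
            return None  # no middle form of absorb or desalinate
        return stem
    # Gate 2 (group rank): <material-processual> AND <stem> lexifies the Thing.
    if "material-processual" in selected and noun is not None:
        return noun
    return None
```

Under these assumptions, the same feature <evaporate> yields evaporate when it enters with <material> and evaporation when it enters with <material-processual>, while <absorb> yields nothing unless <effective-material> is also present.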
The alternatives considered and rejected for transformational processes were to provide separate networks for material-effective transformational processes, for material-middle transformational processes and for material-processual things. The separate networks would have allowed more specificity in the realisation statements, but would have meant essentially duplicating the network three times and losing the relationship between processes and processual nouns. It is doubtful whether the network as it is represented offers a solution that is theoretically justifiable. Ideally, an extended network should be developed that not only maintains the commonality of meaning between the processes and processual nouns, but also respects the divisions of rank. However, the time and effort required for this development are not commensurate with the scope of the research, although it is important to draw attention to this fact in order to avoid such lacunae in any large scale generation project. The structural consequences of choices in the network are explored in the next section, which details the realisations for transformation.
Consequences of Choice in the Transformation Network
In HORACE, the extension of the grammatical networks out to the lexis has provided a theoretically justified means of building choice into the lexis and of generating lexical items. In the material process network, for example, the extension of the network enables choices to be made between various features that ultimately determine the realisation of one particular material process rather than another. However, it is also possible to expand the networks to capture the concurrent effects of lexical choices on associated grammatical functions. For example, choices made in the network for material processes will in their turn constrain choices to be made for the medium and agents of those processes (cf. Matthiessen’s (1990) discussion of collocation of medium and range with material processes). This is where the formal consequences of differentiation in the lexical network make their presence felt (Hasan 1987).
In the work of Hasan (1987), the realisations in the fully extended form of lexis as most delicate grammar affect not only the production of disposal processes but also serve to constrain the choices available to other grammatical functions within the sphere of influence of those processes. In Hasan’s words: “options have consequences: they are justified by what they ‘do’” (Hasan 1987, 185).
The realisation statements of the disposal network are the means for expressing those constraints. If the realisations for the disposal network are examined, omitting the initial choices of <material> and <action> and of the benefactive potential for more equal comparison with the HORACE networks, it will be found that the majority of the realisations govern functions other than the process. Approximately one-fifth of the realisations constrain the process; the remainder constrain the medium, the agent, the cooperant and the complement. Of the latter, fewer than one-third are concerned with syntactic operations of constituency, namely, the operations of insertion, conflation and ordering.
The majority of the constraints on the medium, agent, cooperant and complement are concerned with predetermination, expressed through the operations of preselection and subcategorisation. The medium, for example, may preselect or be subcategorised as alienable object, divisible, singular, plural or noncount, indicating high degree of extent or any degree of vastness, solid or liquid, where the configuration of predetermined choices depends on the process selected. Thus, the full power of the network lies not only in the consequences of the choices for the grammatical function under focus, in this case the process, but also in the influence of the choices on associated functions.
The critical event in transformation processes is that the medium is transformed in some way. The medium starts out in one form or state and ends up in another. It is these changes that are reflected in the predeterminations expressed through the realisation statements. To explicate the next part of the discussion, the following two figures give the top ontology of the thing network and the more delicate part of the thing network that specifies water forms (Figs 9, 10).
[Figure 9 diagram not reproduced; legible feature labels include Organic/Inorganic, Living/Nonliving, Flora/Fauna, Natural thing/Artefact, Dimensional/Nondimensional, Thing spatial/Thing temporal, Thing moving/Thing immobile and Thing presence/Thing absence.]
Figure 9. Ontology of the Thing Network.
In the metamorphosis part of the network, the choice of the initial state of the Medium, reflected in the feature choices <initial-solid>, <initial-liquid> and <initial-gas>, predetermines the selection from the Thing network for Medium. Correspondingly, the Medium will preselect for the features <solid>, <liquid> and <gas> in the NATURE-MATERIAL-TYPE system in the Thing network (Cross 1991).
[Figure 10 diagram not reproduced; legible feature labels include Solid, Liquid, Gas, Ice form, Water form, Water vapour form and Nature material specified.]
Figure 10. Partial Thing Network: Water Forms.
(system
:name METAMORPHOSIS-INITIAL-FORM
:inputs
(METAMORPHOSIS)
:outputs
((0.3333334 INITIAL-SOLID
(PRESELECT MEDIUM SOLID))
(0.3333334 INITIAL-LIQUID
(PRESELECT MEDIUM LIQUID))
(0.3333334 INITIAL-GAS
(PRESELECT MEDIUM GAS)))
:metafunction IDEATIONAL)
Examples of processes that will be affected by the preselections are dissolve and melt for the preselection of a solid medium, evaporate, transpire, dry-up and freeze for the preselection of a liquid Medium and condense for the preselection of a gaseous medium. Not only will the medium of processes be constrained, but also the mediums of material-processual nominals. Thus the preselection of liquid for medium will permit evaporation of water or moisture, but not of ice or water-vapour.
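The effect of this preselection on the Medium can be sketched in Python as a simple compatibility check. The two tables below are assumptions built only from the examples in the text; they stand in for the path through the process network and the NATURE-MATERIAL-TYPE choice in the Thing network, and are not the HORACE data structures.

```python
# Assumed toy tables: which initial form each process selects,
# and which material nature each candidate Medium carries.
INITIAL_FORM = {
    "dissolve": "solid", "melt": "solid",
    "evaporate": "liquid", "transpire": "liquid",
    "dry-up": "liquid", "freeze": "liquid",
    "condense": "gas",
}
THING_NATURE = {
    "ice": "solid",
    "water": "liquid",
    "moisture": "liquid",
    "water-vapour": "gas",
}


def medium_allowed(process: str, thing: str) -> bool:
    """True if the thing carries the feature that the process preselects for Medium."""
    return THING_NATURE[thing] == INITIAL_FORM[process]
```

With these tables, `medium_allowed("evaporate", "water")` holds while `medium_allowed("evaporate", "ice")` does not, mirroring the contrast between evaporation of water and evaporation of ice.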
The metamorphosed state of the medium also exerts some influence on predetermination for the thing network. Those predeterminations come into effect if what may be called a circumstantial complement is present. If the text includes something for the medium to ‘transform into’, then the preselections may be for the features <solid>, <liquid> or <gas> in the thing network, dependent primarily on the choice of the features <final-solid>, <final-liquid> or <final-gas> for the metamorphosed state. Such preselections enable realisations such as freeze into ice, dissolve or melt into liquid, condense into water droplets, evaporate or transpire into water-vapour. The circumstantial Complement is permitted a variety of realisations as long as the features <solid>, <liquid> or <gas> are part of its systemic path selection for Thing.
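A conditional preselection of this kind can be sketched as follows. This is a hedged Python illustration: the table `FINAL_FORM` is inferred from the example realisations in the text, and representing the Complement's systemic path as a set of feature names is an assumption made for the sketch.

```python
from typing import Optional, Set

# Assumed from the example realisations (freeze into ice, condense into water droplets, ...).
FINAL_FORM = {
    "freeze": "solid",
    "dissolve": "liquid", "melt": "liquid", "condense": "liquid",
    "evaporate": "gas", "transpire": "gas",
}


def complement_ok(process: str, complement_path: Optional[Set[str]]) -> bool:
    """Check a circumstantial Complement against the final-state preselection.

    complement_path is the set of features on the Complement's path through
    the Thing network, or None when the text has no such Complement.
    """
    if complement_path is None:
        return True  # nothing to constrain: the preselection never applies
    return FINAL_FORM[process] in complement_path
```

The membership test reflects the point that the Complement may be realised in many ways, provided the required feature lies somewhere on its path.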
Further predeterminations in the metamorphosis part of the network are possible. If the feature <evaporate> has been selected and the source of the Medium is present in the text, then that source, as the function Spacelocative, preselects for the features <water-mass> or <water-body>:
(system
:name EVAPORATE-SPACELOCATIVE-SPECIFICATION
:inputs
(AND EVAPORATE-LEX-VERB-NONPHORIC-PLACE AWAY-FROM-MOTION)
:outputs
((0.9 WATER-SPACELOCATIVE-SPECIFIED)
(0.1 WATER-SPACELOCATIVE-UNSPECIFIED))
:metafunction IDEATIONAL)
(system
:name EVAPORATE-SPACELOCATIVE-TYPE
:inputs
(WATER-SPACELOCATIVE-SPECIFIED)
:outputs
((0.5 OCEANIC-SPACELOCATIVE-SPECIFIED
(PRESELECT SPACELOCATIVE > MINIRANGE > THING WATER-MASS))
(0.5 NONOCEANIC-SPACELOCATIVE-SPECIFIED
(PRESELECT SPACELOCATIVE > MINIRANGE > THING WATER-BODY)))
:metafunction IDEATIONAL)
The preselections capture the idea that the source of the process evaporate is usually
some body of water. Similarly, the material-processual nominal evaporation pre-
selects for a source that is a body of water.
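How the two systems chain together can be sketched in Python. Several things here are assumptions rather than facts about Nigel or HORACE: the numeric values are read as relative weights for a random choice among outputs, the entry condition of the first system is taken as already satisfied, and the names `SYSTEMS`, `choose` and `traverse` are hypothetical.

```python
import random

PATH = ("SPACELOCATIVE", "MINIRANGE", "THING")  # function path named in the PRESELECT statements

# Each output: (weight, feature, preselection or None).
SYSTEMS = {
    "EVAPORATE-SPACELOCATIVE-SPECIFICATION": [
        (0.9, "WATER-SPACELOCATIVE-SPECIFIED", None),
        (0.1, "WATER-SPACELOCATIVE-UNSPECIFIED", None),
    ],
    "EVAPORATE-SPACELOCATIVE-TYPE": [
        (0.5, "OCEANIC-SPACELOCATIVE-SPECIFIED", (PATH, "WATER-MASS")),
        (0.5, "NONOCEANIC-SPACELOCATIVE-SPECIFIED", (PATH, "WATER-BODY")),
    ],
}


def choose(system, rng):
    """Pick one output of a system, treating the numbers as relative weights (assumed)."""
    outputs = SYSTEMS[system]
    weights = [weight for weight, _, _ in outputs]
    _, feature, preselection = rng.choices(outputs, weights=weights, k=1)[0]
    return feature, preselection


def traverse(rng):
    """Walk both systems; return the features chosen and any preselections made."""
    preselections = {}
    feature, _ = choose("EVAPORATE-SPACELOCATIVE-SPECIFICATION", rng)
    selected = [feature]
    # The TYPE system is entered only through WATER-SPACELOCATIVE-SPECIFIED.
    if feature == "WATER-SPACELOCATIVE-SPECIFIED":
        feature, (function_path, thing_feature) = choose("EVAPORATE-SPACELOCATIVE-TYPE", rng)
        selected.append(feature)
        preselections[function_path] = thing_feature
    return selected, preselections
```

Calling `traverse(random.Random(0))` returns the chosen features together with a dictionary that is empty when the source is left unspecified and otherwise maps the Spacelocative path to WATER-MASS or WATER-BODY.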
In contrast, the process transpire requires flora as its source, realised by the Spacelocative preselecting the feature <flora-ext>. This preselection captures the constraint that transpiration comes from plants of some kind. The processual-nominal transpiration also preselects for the feature <flora-ext> if the source of the transpiration is present in the text. There is also the possibility that the source of the transpiration may be realised through the grammatical function agent in an effective-material clause, in which case the agent preselects for <flora-ext> in the Thing network. The examples of transpiration in the texts avoid the direct use of agent for the source of the moisture by introducing the source in conjunction with another process. In the example from T5, the sources plants and plant leaves are linked respectively to used and giving off:
T5: 21. //Some of the moisture [used by plants] soon returns to the air through transpiration.//
23. //Transpiration means the giving off of water vapour by plant leaves.//
In T7, the source plants is introduced in conjunction with absorbed:
T7: 18. //Some soaks into the soil/
19. /where it is absorbed by plants/
20. /and partly returned to the air through transpiration.//
In the change-in-physical-form part of the transformation network, Medium preselects for the feature <solid> in the Thing network, reflecting the linguistic fact that solid rather than liquid or gaseous Things compact, break and crack. Within that same subnetwork, the transformed medium, realised by the grammatical function Spacelocative, preselects for the feature <thing-part> in the thing network, capturing the linguistic fact that things break or crack into pieces.
The final consequence of the choices made in transformation occurs in the change-in-composition part of the network where, for the feature <desalinate>, medium preselects for the feature <water-form-nonext>, expressing the strong probability that the medium will be water in its liquid form.
In summary, the development of the realisation statements for the material transformation network has demonstrated the full consequences of choices in the lexical network and, in particular, the predictive power of those choices. The grammatical function with greatest predictive power is the medium, followed by the grammatical representation of the transformed medium. This finding accords with Hasan’s (1987) work on the material disposal lexical network, which also found that the medium was the most highly constrained grammatical function. It also bears out the observation by Halliday that in the ergative representation of transitivity, it is “the Process and the Medium that form the nucleus of the English clause” (Halliday 1985a, 147). Further research on the detail of the realisations in the lexical networks would test the extent of the predetermination as described in Halliday’s subsequent observation that “this nucleus then determines the range of options that are available to the rest of the clause” (ibid.).
Collocation as Preselection within Register
The discussion of realisations for transformations has justified the cross-linkage of lexis through preselection by grammatical functions. Hence, given certain choices within the network for process, preselections are validated for, minimally, the medium and may extend to other participants such as transformed medium and circumstance. The cross-linkage through grammatical functions between paradigmatic features in the network has the ultimate effect of determining collocations. However, rather than limiting collocation to a listing of co-occurring lexical items, the preselection between grammatical functions and paradigmatic features enables linking to be made at the level of potential rather than expression. Given preselection of a certain feature in the network, the ultimate linkage might be between a number of lexical items, all of which share the preselected feature. Moreover, because the grammatical function itself is embedded in a potential, the ultimate lexification of that function may be a number of items. For example, preselection of the feature <liquid> by Medium under the aegis of the feature <initial-liquid> in the METAMORPHOSIS-INITIAL-FORM system is equivalent to collocating dry up, evaporate, transpire and freeze with liquid or water.
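The difference between listing collocations and deriving them from a shared feature can be sketched in Python. The two small tables are assumptions drawn from the examples in the text, and `collocations` is a hypothetical helper; the point is only that the pairs are computed from the preselected feature rather than stored as a list.

```python
from itertools import product

# Assumed toy data: the feature each process makes Medium preselect,
# and the features carried by a few candidate things.
PROCESS_PRESELECTS = {
    "dry up": "liquid", "evaporate": "liquid",
    "transpire": "liquid", "freeze": "liquid",
    "melt": "solid", "dissolve": "solid",
    "condense": "gas",
}
THING_FEATURES = {
    "liquid": {"liquid"},
    "water": {"liquid"},
    "ice": {"solid"},
    "water-vapour": {"gas"},
}


def collocations(feature):
    """Derive process/thing collocations from one shared paradigmatic feature."""
    processes = sorted(p for p, f in PROCESS_PRESELECTS.items() if f == feature)
    things = sorted(t for t, fs in THING_FEATURES.items() if feature in fs)
    return list(product(processes, things))
```

For <liquid>, the sketch pairs each of dry up, evaporate, freeze and transpire with liquid and with water; adding a new liquid thing to the table would extend every such collocation without any process entry being touched.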
Let me extend the argument to collocation between thing and epithet and/or classifier, a commonly exemplified syntagm, viz. strong argument (Halliday 1966), manly breast (Firth 1957, 196) or even womanly breast. In the matter cycle register, if the feature <cycle-nonext> is selected in the thing network, which will be lexified as cycle, and an epithet is required, then that epithet is likely to be regular, constant or continuous. Thus the epithet would preselect the features <quality-regularity-specified> or <quality-interruptability-specified>. Similarly, the range of collocation is limited if a classifier is required. The possibilities for a classifier are natural, or some kind of matter, for example, water cycle, nitrogen cycle or carbon cycle. Thus the preselections for classifier would be <quality-causality-specified> from the quality network and, from the thing network, the features <elements-ext> or <compounds-ext>.
Thus, it has been possible to demonstrate that collocation within register may be
captured through the cross-linkage of preselection from grammatical functions to
paradigmatic features in the network. Preselection at the lexical end of the gram-
matical continuum has the ultimate effect of establishing collocation between lexical
items.
CONCLUSIONS
In this paper, the rationale and implementation of a model of lexis based on most delicate grammar have been presented. With the goal of a lexical component that provides motivated choice, the approach has been to extend the grammatical networks out to lexis. The differentiation between choices in the network is based on paradigmatic features that have linguistic consequences. The detailed examination of the transformation processes has demonstrated the impact of lexical feature choices on participants in the process and circumstances surrounding the process. It has also been shown that collocation of lexical items within a register may be handled through preselections when the lexical networks are developed for all grammatical classes.
Tasks for future research would include the implementation of finer differentiation based on large-scale collocational data, a more detailed examination of apparent synonymy within the register and the exploration of how the textual metafunction might be incorporated into the model of lexis.
NOTES
1. Address correspondence to: Marilyn Cross, Wollongong University, Telecommunications Software Research Centre, New South Wales 2500, Australia.
REFERENCES
Bateman, J. A. and C. L. Paris
1989a “Constraining the Deployment of Lexicogrammatical Resources during Text Generation: toward a Computational Instantiation of Register Theory,” Paper presented at 16th International Systemics Workshop, Helsinki.
1989b “Phrasing a Text in Terms the User can Understand,” Proceedings of the 11th International Joint Conference on Artificial Intelligence, Detroit, Michigan, pp. 1511-7.
Byrd, R. J., N. Calzolari, M. S. Chodorow, J. L. Klavans, M. S. Neff and O. A. Rizk
1987 “Tools and Methods for Computational Lexicology,” Computational Linguistics 13. 219-40.
Calzolari, N. and E. Picchi
1988 “Acquisition of Semantic Information from an On-Line Dictionary,” Proceedings of the 12th International Conference on Computational Linguistics, pp. 87-9, Budapest.
Church, K. W. and P. Hanks
1989 “Word Association Norms, Mutual Information and Lexicography,” Proceedings of 27th Annual Meeting of the Association for Computational Linguistics, pp. 76-83.
Cross, M.
1991 Choice in Text: A Systemic-Functional Approach to Computer Modelling
of Variant Text Production. Ph.D thesis, Sydney: Macquarie University.
Cumming, S.
1986 “The Lexicon in Text Generation,” USC Technical Report ISI/RR-
86-168.
Dahlgren, K. and J. McDowell
1986 “Kind Types in Knowledge Representation,” Proceedings 11th International Conference on Computational Linguistics, pp. 216-21, Bonn.
Dahlgren, K., J. McDowell and E. P. Stabler
1989 “Knowledge Representation for Commonsense Reasoning with Text,”
Computational Linguistics 15. 149-70.
Eggins, S., J. R. Martin and P. Wignell
1987 “The Discourse of Geography: Ordering and Explaining the Experiential
World,” Working Papers in Linguistics, University of Sydney, Writing
Project Report, 5. 25-65.
Fawcett, R. P.
1987 “The Semantics of Clause and Verb in English,” in New Developments in Systemic Linguistics, pp. 130-83, M. A. K. Halliday and R. P. Fawcett (eds), London: Pinter.
Fawcett, R. P. and G. H. Tucker
1990 “Demonstration of Genesys: A Very Large, Semantically Based Systemic Functional Generator,” Proceedings of 13th International Conference on Computational Linguistics, Helsinki, 1. 47-9.
Firth, J. R.
1957 “Modes of Meaning,” in Papers in Linguistics 1934-1951, pp. 190-215, London: Oxford University Press.
Goldman, N.
1975 “Conceptual Generation,” in Conceptual Information Processing, pp. 289-374, R. S. Schank (ed.), Amsterdam: North-Holland.
Halliday, M. A. K.
1961 “Categories of the Theory of Grammar,” Word, 17.
1966 “Lexis as a Linguistic Level,” in In Memory of J. R. Firth, pp. 148-62,
C. E. Bazell, J. C. Catford, M. A. K. Halliday and R. H. Robins (eds),
London: Longmans.
1976a “Theory,” in Halliday: System and Function in Language, pp. 33-98,
G. R. Kress (ed.), London: O.U.P.
1976b “Text as Semantic Choice in Social Contexts,” in Grammars and
Descriptions, pp. 176-225, T. A. Van Dijk and J. S. Petofi (eds), Berlin:
de Gruyter.
1978 Language as Social Semiotic: the Social Interpretation of Language and Meaning, London: Edward Arnold.
1985a An Introduction to Functional Grammar, London: Edward Arnold.
1985b Spoken and Written Language, Geelong, Victoria: Deakin University Press.
Hasan, R.
1985 “Lending and Borrowing: from Grammar to Lexis,” in The Cultivated Australian: Festschrift in Honour of Arthur Delbridge, pp. 55-67, J. E. Clark (ed.), Hamburg: Helmut Buske.
1987 “The Grammarian’s Dream: Lexis as most Delicate Grammar,” in New Developments in Systemic Linguistics, pp. 184-211, M. A. K. Halliday and R. P. Fawcett (eds), London: Pinter.
Hobbs, J. R.
1986 “Sublanguage and Knowledge,” in Analysing Language in Restricted Domains: Sublanguage Description and Processing, pp. 53-68, R. Grishman and R. Kittredge (eds), New Jersey: Lawrence Erlbaum.
Hobbs, J. R., W. Croft, T. Davies, D. Edwards and K. Laws
1987 “Commonsense Metaphysics and Lexical Semantics,” Computational Linguistics 13. 241-50.
Hovy, E. H.
1985 “Putting Affect into Text,” Proceedings of 8th Conference of Cognitive Science, pp. 669-75, Amherst.
1987a “Some Pragmatic Decision Criteria in Generation,” in Natural Language Generation: New Results in Artificial Intelligence, Psychology and Linguistics, pp. 3-17, G. Kempen (ed.), Dordrecht: Martinus Nijhoff.
1987b “Interpretation in Generation,” Proceedings American Association of Artificial Intelligence, pp. 545-9.
McDonald, D. D., M. M. Vaughan and J. D. Pustejovsky
1987 “Factors Contributing to Efficiency in Natural Language Generation,” in Natural Language Generation: New Results in Artificial Intelligence, Psychology and Linguistics, pp. 159-82, G. Kempen (ed.), Dordrecht: Martinus Nijhoff.
McIntosh, A.
1966 “Patterns and Ranges,” in Patterns of Language, pp. 183-99, A. McIntosh and M. A. K. Halliday (eds), London: Longmans.
McKeown, K. R.
1985 Text Generation, Cambridge: Cambridge University Press.
Mann, W. C.
1983 “An Overview of the Penman Text Generation System,” USC Technical Report ISI/RR83-114.
Mann, W. C. and C. M. I. M. Matthiessen
1983 “Nigel: a Systemic Grammar for Text Generation,” USC Technical
Report ISI/RR83-105.
Matthiessen, C. M. I. M.
1985 “The Systemic Framework in Text Generation: Nigel,” in Systemic
Perspectives on Discourse, pp. 96-118, J. D. Benson and W. S. Greaves
(eds), Vol. 1, New Jersey: Ablex.
1988a What’s in Nigel: Lexicogrammatical Cartography. USC/ISI documentation.
1988b Documentation for the First Release of the Nigel Grammar. USC/ISI documentation.
1988c “Lexico(grammatical) Choice in Text Generation,” in Natural Language Generation in Artificial Intelligence and Computational Linguistics, pp. 242-92, C. L. Paris, W. R. Swartout and W. C. Mann (eds), Boston: Kluwer Academic.
n.d. Lexicogrammatical Cartography: English Systems, Draft, 1990, University
of Sydney.
Mel’čuk, I. A. and A. Polguère
1987 “A Formal Lexicon in the Meaning-Text Theory (or How to do Lexica with Words),” Computational Linguistics 13. 261-75.
Miller, G. A.
1985 “Dictionaries of the Mind,” Proceedings of the 23rd Annual Meeting
of Association for Computational Linguistics, Chicago: University of
Chicago, pp. 305-14.
Nirenburg, S. and I. Nirenburg
1988 “A Framework for Lexical Selection in Natural Language Generation,” Proceedings of the 12th International Conference on Computational Linguistics, pp. 471-75, Budapest.
Nirenburg, S. and V. Raskin
1987 “The Subworld Concept Lexicon and the Lexicon Management System,” Computational Linguistics 13. 276-89.
Patten, T.
1988 Systemic Text Generation as Problem Solving, Cambridge: Cambridge
University Press.
Pazienza, M. T. and P. Velardi
1987 “A Structured Representation of Word-Senses for Semantic Analysis,”
Proceedings of the 3rd International Conference on Computational
Linguistics, Copenhagen, pp. 249-57.
Pustejovsky, J. and S. Nirenburg
1987 “Lexical Selection in the Process of Text Generation,” Proceedings of
the 25th Annual Meeting of Association for Computational Linguistics,
Stanford, pp. 201-6.
Ritchie, G. D., S. G. Pulman, A. W. Black and G. J. Russell
1987 “A Computational Framework for Lexical Description,” Computational Linguistics 13. 290-307.
Saussure, F. de
1906-11 Course in General Linguistics, Translated by R. Harris 1983, London: Duckworth.
Small, S. L., G. W. Cottrell and M. K. Tanenhaus
1988 Lexical Ambiguity Resolution: Perspectives from Psycholinguistics, Neuropsychology and Artificial Intelligence, San Mateo: Morgan Kaufmann.
Tucker, G. H. and R. P. Fawcett
n.d. Modelling Lexis in a Computational Systemic-Functional Grammar, Draft Paper, 1991.
Velardi, P. and M. T. Pazienza
1989 “Computer Aided Interpretation of Lexical Cooccurrences,” Proceedings of the 27th Annual Meeting of the Association for Computational Linguistics, pp. 185-92.
Wilks, Y., D. Fass, C. Guo, J. E. McDonald, T. Plate and B. M. Slator
1988 “Machine Tractable Dictionaries as Tools and Resources for Natural Language Processing,” Proceedings of the 12th International Conference on Computational Linguistics, pp. 750-5, Budapest.
1989 “A Tractable Machine Dictionary as Resource for Computational Semantics,” in Computational Lexicography for Natural Language Processing, pp. 193-228, B. Boguraev and T. Briscoe (eds), London: Longman.