Word Grammar (Richard Hudson)


Word Grammar

New Perspectives on a Theory of Language Structure



Word Grammar

New Perspectives on a Theory of Language Structure

edited by Kensei Sugayama and Richard Hudson

continuum


Continuum
The Tower Building, 11 York Road, London SE1 7NX
15 East 26th Street, New York, NY 10010

All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage or retrieval system, without prior permission in writing from the publishers.

First published 2006

British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library.

ISBN: 0-8264-8645-2 (hardback)

Library of Congress Cataloguing-in-Publication Data
To come

Typeset by BookEns Ltd, Royston, Herts.
Printed and bound in Great Britain by MPG Books Ltd, Bodmin, Cornwall

© Kensei Sugayama and Richard Hudson 2005


The problem of the word has worried general linguists for the best part of a century.
- P. H. Matthews


Contents

Contributors
Preface
   Kensei Sugayama

Introduction

1. What is Word Grammar?
   Richard Hudson
   1. A Brief Overview of the Theory
   2. Historical Background
   3. The Cognitive Network
   4. Default Inheritance
   5. The Language Network
   6. The Utterance Network
   7. Morphology
   8. Syntax
   9. Semantics
   10. Processing
   11. Conclusions

Part I
Word Grammar Approaches to Linguistic Analysis: Its explanatory power and applications

2. Case Agreement in Ancient Greek: Implications for a theory of covert elements
   Chet Creider and Richard Hudson
   1. Introduction
   2. The Data
   3. The Analysis of Case Agreement
   4. Non-Existent Entities in Cognition and in Language
   5. Extensions to Other Parts of Grammar
   6. Comparison with PRO and pro
   7. Comparison with Other PRO-free Analyses
   8. Conclusions

3. Understood Objects in English and Japanese with Reference to Eat and Taberu: A Word Grammar account
   Kensei Sugayama
   1. Introduction
   2. Word Grammar
   3. Eat in English
   4. Taberu in Japanese
   5. Conclusion

4. The Grammar of Be To: From a Word Grammar point of view
   Kensei Sugayama
   1. Introduction and the Problem
   2. Category of Be
   3. Modal Be in Word Grammar
   4. Morphological Aspects
   5. Syntactic Aspects
   6. Semantics of the Be To Construction
   7. Should To be Counted as Part of the Lexical Item?
   8. A Word Grammar Analysis of the Be To Construction
   9. Conclusion

5. Linking in Word Grammar
   Jasper Holmes
   1. Linking in Word Grammar: The syntax-semantics principle
   2. The Event Type Hierarchy: The framework; event types; roles and relations
   3. Conclusion

6. Word Grammar and Syntactic Code-Mixing Research
   Eva Eppler
   1. Introduction
   2. Constituent Structure Grammar Approaches to Intra-Sentential Code-Mixing
   3. A Word Grammar Approach to Code-Mixing
   4. Word Order in Mixed and Monolingual 'Subordinate' Clauses
   5. Summary and Conclusion

7. Word Grammar Surface Structures and HPSG Order Domains
   Takafumi Maekawa
   1. Introduction
   2. A Word Grammar Approach
   3. An Approach in Constructional HPSG: Ginzburg and Sag 2000
   4. A Linearization HPSG Approach
   5. Concluding Remarks

Part II
Towards a Better Word Grammar

8. Structural and Distributional Heads
   Andrew Rosta
   1. Introduction
   2. Structural Heads
   3. Distributional Heads
   4. That-Clauses
   5. Extent Operators
   6. Surrogates versus Proxies
   7. Focusing Subjuncts: just, only, even
   8. Pied-piping
   9. Degree Words
   10. Attributive Adjectives
   11. Determiner Phrases
   12. The type of Construction
   13. Inside-out Interrogatives
   14. 'Empty Categories'
   15. Coordination
   16. Correlatives
   17. Dependency Types
   18. Conclusion

9. Factoring Out the Subject Dependency
   Nikolas Gisborne
   1. Introduction
   2. Dimensions of Subjecthood
   3. The Locative Inversion Data
   4. Factored Out Subjects
   5. Conclusions

Conclusion
   Kensei Sugayama

Author Index
Subject Index


Contributors

RICHARD HUDSON is Professor Emeritus of Linguistics at University College London. His research interest is the theory of language structure; his main publications in this area are about the theory of Word Grammar, including Word Grammar (1984, Oxford: Blackwell); English Word Grammar (1990, Oxford: Blackwell) and a large number of more recent articles. He has also taught sociolinguistics and has a practical interest in educational linguistics.
Website: www.phon.ucl.ac.uk/home/dick/home.htm
Email: dick@linguistics.ucl.ac.uk

KENSEI SUGAYAMA, Professor of English Linguistics at Kobe City University of Foreign Studies. Research interests: English Syntax, Word Grammar, Lexical Semantics and General Linguistics. Major publications: 'More on unaccusative Sino-Japanese complex predicates in Japanese' (1991). UCL Working Papers in Linguistics 3; 'A Word-Grammatic account of complements and adjuncts in Japanese' (1994). Proceedings of the 15th International Congress of Linguists; 'Speculations on unsolved problems in Word Grammar' (1999). The Kobe City University Journal 50.7; Scope of Modern Linguistics (2000, Tokyo: Eihosha); Studies in Word Grammar (2003, Kobe: Research Institute of Foreign Studies, KCUFS).
Email: ken@inst.kobe-cufs.ac.jp

CHET CREIDER, Professor and Chair, Department of Anthropology, University of Western Ontario, London, Ontario, Canada. Research interests: morphology, syntax, African languages. Major publications: Structural and Pragmatic Factors Influencing the Acceptability of Sentences with Extended Dependencies in Norwegian (1987, University of Trondheim Working Papers in Linguistics 4); The Syntax of the Nilotic Languages: Themes and variations (1989, Berlin: Dietrich Reimer); A Grammar of Nandi (1989, with J. T. Creider, Hamburg: Helmut Buske); A Grammar of Kenya Luo (1993, ed.); A Dictionary of the Nandi Language (2001, with J. T. Creider, Köln: Rüdiger Köppe).
Email: creider@uwo.ca

ANDREW ROSTA, Senior Lecturer, Department of Cultural Studies, University of Central Lancashire, UK. Research interests: all aspects of English grammar.
Email: a.rosta@v21.me.uk

NIKOLAS GISBORNE is a lecturer in the Department of Linguistics and English Language at the University of Edinburgh. His research interests are in lexical semantics and syntax, and their interaction in argument structure.


Website: www.englang.ed.ac.uk/people/nik.html
Email: n.gisborne@ed.ac.uk

JASPER W. HOLMES is a self-employed linguist who has worked with many large organizations on projects in lexicography, education and IT. Teaching and research interests include syntax and semantics, lexical structure, corpuses and other IT applications (linguistics in computing, computing in linguistics), language in education and in society, the history of English and English as a world language. His publications include 'Synonyms and syntax' (1996, with Richard Hudson, And Rosta, Nik Gisborne). Journal of Linguistics 32; 'The syntax and semantics of causative verbs' (1999). UCL Working Papers in Linguistics 11; 'Re-cycling in the encyclopedia' (2000, with Richard Hudson), in B. Peeters (ed.) The Lexicon-Encyclopedia Interface (Amsterdam: Elsevier); 'Constructions in Word Grammar' (2005, with Richard Hudson) in Jan-Ola Östman and Mirjam Fried (eds) Construction Grammars: Cognitive Grounding and Theoretical Extensions (Amsterdam: Benjamins).
Email: jasper.holmes@gmail.com

EVA EPPLER, Senior Lecturer in English Language and Linguistics, School of Arts, University of Roehampton, UK. Research interests: morpho-syntax of German and English, syntax-pragmatics interface, code-mixing, bilingual processing and production, sociolinguistics of multilingual communities. Recent main publication: '"... because dem Computer brauchst' es ja nicht zeigen.": because + German main clause word order', International Journal of Bilingualism 8.2 (2004), pp. 127-44.
Email: evieppler@hotmail.com

TAKAFUMI MAEKAWA, PhD student, Department of Language and Linguistics, University of Essex. Research interests: Japanese and English syntax, Head-Driven Phrase Structure Grammar and lexical semantics. Major publication: 'Constituency, Word Order and Focus Projection' (2004). The Proceedings of the 11th International Conference on Head-Driven Phrase Structure Grammar, Center for Computational Linguistics, Katholieke Universiteit Leuven, August 3-6.
Email: maekawa@btinternet.com


Preface

This volume comes from a three-year (April 2002-March 2005) research project on Word Grammar supported by the Japan Society for the Promotion of Science, the goal of which was to bring together Word Grammar linguists whose research has been carried out in this framework but whose approaches to it reflect differing perspectives on Word Grammar (henceforth WG). I gratefully acknowledge support for my work in WG from the Japan Society for the Promotion of Science (grant-in-aid Kiban-Kenkyu C (2), no. 14510533, April 2002-March 2005). The collection of papers was planned both to introduce readers to this theory and to include a diversity of languages to which the theory is shown to be applicable, along with critique from different theoretical orientations.

In September 1994 Professor Richard Hudson, the founder of Word Grammar, visited Kobe City University of Foreign Studies to give a lecture on WG as part of his lecturing trip to Japan. His talks centred on advances in WG at that time, and refreshed our understanding of the theory. Professor Hudson has been writing in a very engaging and informative way on the world linguistics scene for about half a century.

Word Grammar is a theory of language structure which Richard Hudson, now Emeritus Professor of Linguistics at University College London, has been building since the early 1980s. It is still changing and improving in detail, yet the main ideas remain the same. These ideas themselves developed out of two other theories that he had tried: Systemic Grammar (now known as Systemic Functional Grammar), due to Michael Halliday, and then Daughter-Dependency Grammar, his own invention.

Word Grammar fills a gap in the study of dependency theory. Dependency theory may not belong to the mainstream in the Western world, especially not in America, but it is gaining more and more attention, which it certainly deserves. In Europe, dependency has been better known since the French linguist Lucien Tesniere's work in the 1950s (cf. Hudson, this volume); I mention here only France, Belgium, Germany and Finland. Dependency theory now also rules Japan in the shape of WG. Moreover, the notion of head, the central idea of dependency, has been introduced into virtually all modern linguistic theories. In most grammars, dependency and constituency are used simultaneously. However, this carries the risk of making these grammars too powerful. WG's challenge is to eliminate constituency from grammar except in coordinate structures, although certain dependency grammars, especially the German ones, refuse to accept constituency for coordination.

Richard Hudson's first book was the first attempt to write a generative (explicit) version of Systemic Grammar (English Complex Sentences: An Introduction to Systemic Grammar, North Holland, 1971); and his second book was about Daughter-Dependency Grammar (Arguments for a Non-transformational Grammar, University of Chicago Press, 1976). As the latter title indicates, Chomsky's transformational grammar was very much 'in the air', and both books accepted his goal of generative grammar but offered other ideas about sentence structure as alternatives to his mixture of function-free phrase structure plus transformations. In the late 1970s, when Transformational Grammar was immensely influential, Richard Hudson abandoned Daughter-Dependency Grammar (in spite of its drawing a rave review by Paul Schachter in Language 54, 348-76). His exploration of various general ideas that hadn't come together became an alternative coherent theory called Word Grammar, first described in the 1984 book Word Grammar and subsequently improved and revised in the 1990 book English Word Grammar. Since then the details have been worked out much better and there is now a workable notation and an encyclopedia available on the internet (cf. Hudson 2004). The newest version of Word Grammar is now on its way (Hudson in preparation).

The time span between the publication of Richard Hudson's Word Grammar (1984) and this volume is more than two decades (21 years to be precise). The intervening years have seen impressive developments in this theory by the WG grammarians, as well as in other competing linguistic theories such as the Minimalist Programme, Head-driven Phrase Structure Grammar (HPSG), Generalized Phrase Structure Grammar (GPSG), Lexical Functional Grammar (LFG), Construction Grammar and Cognitive Grammar.

Here are the main ideas, most of which come from the latest version of the WG homepage (Hudson 2004), together with an indication of where they came from:

• It is monostratal - only one structure per sentence, no transformations. (From Systemic Grammar.)

• It uses word-word dependencies - e.g. a noun is the subject of a verb. (From John Anderson and other users of Dependency Grammar, via Daughter-Dependency Grammar; a reaction against Systemic Grammar, where word-word dependencies are mediated by the features of the mother phrase.)

• It does not use phrase structure - e.g. it does not recognize a noun phrase as the subject of a clause, though these phrases are implicit in the dependency structure. (This is the main difference between Daughter-Dependency Grammar and Word Grammar.)

• It shows grammatical relations/functions by explicit labels - e.g. 'subject' and 'object'. (From Systemic Grammar.)

• It uses features only for inflectional contrasts - e.g. tense and number but not transitivity. (A reaction against excessive use of features in both Systemic Grammar and current Transformational Grammar.)

• It uses default inheritance as a very general way of capturing the contrast between 'basic' or 'underlying' patterns and 'exceptions' or 'transformations' - e.g. by default, English words follow the word they depend on, but exceptionally subjects precede it; a particular case 'inherits' the default pattern unless it is explicitly overridden by a contradictory rule. (From Artificial Intelligence.)

• It views concepts as prototypes rather than 'classical' categories that can be defined by necessary and sufficient conditions. All characteristics (i.e. all links in the network) have equal status, though some may for pragmatic reasons be harder to override than others. (From Lakoff and early Cognitive Linguistics, supported by work in sociolinguistics.)

• It presents language as a network of knowledge, linking concepts about words, their meanings, etc. - e.g. twig is linked to the meaning 'twig', to the form /twig/, to the word-class 'noun', etc. (From Lamb's Stratificational Grammar, now known as Neurocognitive Linguistics.)

• In this network there are no clear boundaries between different areas of knowledge - e.g. between 'lexicon' and 'grammar', or between 'linguistic meaning' and 'encyclopedic knowledge'. (From early Cognitive Linguistics - and the facts.)

• In particular, there is no clear boundary between 'internal' and 'external' facts about words, so a grammar should be able to incorporate sociolinguistic facts - e.g. that the speaker of jazzed is an American. (From Sociolinguistics.)
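The default-inheritance idea in the list above is easy to make concrete. The following is a minimal sketch of my own, not WG's actual formalism (the class, the property labels and the lookup rule are invented for illustration): categories form an 'isa' hierarchy, a property is looked up from the nearest ancestor that declares it, and a specific category such as 'subject' overrides the general word-order default for dependents.

```python
# Minimal sketch of default inheritance over an 'isa' hierarchy.
# Illustrative only: the class and property names are invented for
# this example and are not part of the WG formalism.

class Node:
    def __init__(self, name, isa=None, **props):
        self.name = name
        self.isa = isa          # single parent in the 'isa' hierarchy
        self.props = props      # locally declared properties

    def get(self, prop):
        """Walk up the hierarchy; the nearest declaration wins, so a
        default is inherited unless a more specific node overrides it."""
        node = self
        while node is not None:
            if prop in node.props:
                return node.props[prop]
            node = node.isa
        raise KeyError(prop)

# Default: an English dependent follows the word it depends on ...
dependent = Node('dependent', position='after head')
# ... but subjects exceptionally precede it.
subject = Node('subject', isa=dependent, position='before head')
# Objects declare nothing locally, so they inherit the default.
obj = Node('object', isa=dependent)

print(obj.get('position'))      # -> after head
print(subject.get('position'))  # -> before head
```

The point of the sketch is the lookup rule: nothing is copied downwards, so adding one exceptional node is all it takes to model an 'exception' to a 'basic' pattern.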

In this theory, word-word dependency is a key concept, upon which the syntax and semantics of a sentence build. Dependents of a word are subcategorized into two types, i.e. complements and adjuncts. These two types of dependents play an important role in this theory of grammar.
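To make the dependency idea concrete, here is a hypothetical encoding (my own data layout, not WG notation) of the labelled word-word dependencies for the sentence I cycled to UCL, which Hudson discusses in his chapter below. The whole syntactic structure is a flat set of links between individual words, with no phrase nodes; the complement/adjunct typing of each link is my own illustrative guess.

```python
# Hypothetical flat encoding of WG-style labelled dependencies for
# 'I cycled to UCL'. The relation names follow the text; the data
# layout and the complement/adjunct typing are my own illustration.

sentence = ['I', 'cycled', 'to', 'UCL']

# (dependent, relation, head, dependent-type)
dependencies = [
    ('I',   'subject',      'cycled', 'complement'),
    ('to',  'post-adjunct', 'cycled', 'adjunct'),
    ('UCL', 'complement',   'to',     'complement'),
]

# No phrase nodes: every word except the root depends on one other word.
heads = {dep: head for dep, _, head, _ in dependencies}
root = [w for w in sentence if w not in heads]
print(root)  # -> ['cycled']
```

Note that phrases ('to UCL', for instance) are recoverable from the links but are never stored as units, which is exactly the sense in which phrases are 'implicit' in a dependency structure.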

Let me give you a flavour of the syntax and semantics in WG, as shown in Figure 1:

Figure 1


Contributors to this volume are primarily WG grammarians across the world who participated in the research organized by myself, and I am also grateful for being able to include critical work by Maekawa of the University of Essex, who is working in a different paradigm.

All the papers here manifest what I would characterize as the theoretical potential of WG, exploring how powerful WG is in offering analyses of linguistic phenomena in various languages. The papers we have collected come from varying perspectives (formal, lexical-semantic, morphological, syntactic, semantic) and include work on a number of languages, including English, Ancient Greek, Japanese and German. Phenomena studied include verbal inflection, case agreement, extraction, constructions, code-mixing, etc.

The papers in this volume span a variety of topics, but there is a common thread running through them: the claim that word-word dependency is fundamental to our analysis and understanding of language. The collection starts with a chapter on WG by Richard Hudson which serves to introduce the newest version of WG. The subsequent chapters are organized into two sections:

Part I: Word Grammar Approaches to Linguistic Analysis: its explanatory power and applications
Part II: Towards a Better Word Grammar

Part I contains six chapters, which contribute to recent developments in WG and explore how powerful WG is in analyzing linguistic phenomena in various languages. They deal with formal, lexical, morphological, syntactic and semantic matters. In this way, these papers give a varied picture of the possibilities of WG.

In Chapter 2, Creider and Hudson provide a theory of covert elements, which is a hot issue in linguistics. Since WG has hitherto denied the existence of any covert elements in syntax, it has to deal with claims such as the one that covert case-bearing subjects are possible in Ancient Greek. As the authors say themselves, their solution is tantamount to an acceptance of some covert elements in syntax, though in every case the covert element can be predicted from the word on which it depends. The analysis given is interesting because the argument is linked to dependency. It is more sophisticated than the simple and undefined Chomskyan notion of the PRO element.

In Chapter 3, Sugayama joins Creider and Hudson in detailing an analysis of understood objects in English and Japanese, albeit at the level of semantics rather than syntax. He studies an interesting contrast between English and Japanese concerning understood objects. Unlike English and most other European languages, Japanese is unique in allowing its verbs to miss out their complements on the condition that the speaker assumes that they are known to the addressee. The reason seems to be that in the semantic structure of the sentences there has to be a semantic argument which should be, but is not, mapped onto syntax as a syntactic complement. The author offers a WG solution that is an improvement on Hudson's (1990) account.

Sugayama shares with the preceding chapter an in-depth lexical-semantic analysis in order to address the relation between a word and the construction. In Chapter 4, he attempts to characterize the be to construction within the WG framework. He shows that a morphological, syntactic and semantic analysis of be in the be to construction provides evidence for the category of be in this construction. Namely, be is an instance of a modal verb in terms of morphology and syntax, while the sense of the whole construction is determined by the sense of 'to'.

In Chapter 5, Holmes, in a very original approach, develops an account of the linking of syntactic and semantic arguments in the WG framework. Under the WG account, both thematic and linking properties are determined at both the specific and the general level. This is obviously an advantage.

In Chapter 6, Eppler draws on experimental studies of code-mixing and successfully extends WG to an original and interesting area of research. Constituent-based models have difficulties accounting for mixing between SVO and SOV languages like English and German; a dependency (WG) approach is imperative here, since a word's requirements do not project to larger units like phrasal constituents. The Null Hypothesis, formulated in WG terms, then assumes that each word in a switched dependency satisfies the constraints imposed on it by its own language. The material is taken from English/German conversations of Jewish refugees in London.

Maekawa continues the sequence in this collection towards more purely theoretical studies. In Chapter 7, he looks at three different approaches to the asymmetries between main and embedded clauses with respect to the elements in the left periphery of a clause: the dependency-based approach within WG, the Constructional HPSG approach, and the Linearization HPSG analysis. Maekawa, an HPSG linguist, argues that the approaches within WG and Constructional HPSG have some problems in dealing with the relevant facts, but that Linearization HPSG provides a straightforward account of them. Maekawa's analysis suggests that linear order should be independent, to a considerable extent, of combinatorial structure such as dependency or phrase structure.

Following these chapters are more theoretical chapters which help to improve the theory and clarify which research questions must be taken up next.

Part II contains two chapters that examine two key theoretical concepts in WG, head and dependency. They are intended to help us progress a few steps forward in revising and improving the current WG, together with Hudson (in preparation).

The notion of head is a central one in most grammars, so it is natural that it is discussed and challenged by WG and other theorists. In Chapter 8, Rosta distinguishes between two kinds of head and claims that every phrase has both a distributional head and a structural head, although he agrees that normally the same word is both the distributional and the structural head of a phrase. Finally, Gisborne's Chapter 9 challenges Hudson's classification of dependencies. The diversification of heads (different kinds of dependency) plays a role in WG as well. Gisborne argues for a more fine-grained account of dependencies than Hudson's 1990 model. He focuses on a review of the subject-of dependency, distinguishing between two kinds of subjects, which seems promising. Gisborne's thesis is that word order is governed not only by syntactic information but also by discourse-presentational facts.

I hope this short overview will suggest to the prospective reader that our attempt at introducing a dependency-based grammar was successful.

By means of this volume, we hope to contribute to the continuing cooperation between linguists working in WG and those working in other theoretical frameworks. We look forward to future volumes that will further develop this cooperation.

The editors gratefully acknowledge the work and assistance of all those contributors whose papers are incorporated in this volume, including one non-WG linguist who contributed a paper from his own theoretical viewpoint and helped shape the volume you see here.

Last but not least, neither the research in WG nor the present volume would have been possible without the general support of both the Japan Society for the Promotion of Science and the Daiwa Anglo-Japanese Foundation, whose assistance we gratefully acknowledge here. In addition, we owe a special debt of gratitude to Jenny Lovel for assisting with the preparation of this volume in her normal professional manner. We alone accept responsibility for all errors in the presentation of data and analyses in this volume.

Kensei Sugayama

References

Hudson, R. A. (1971), English Complex Sentences: An Introduction to Systemic Grammar. Amsterdam: North Holland.
— (1976), Arguments for a Non-transformational Grammar. Chicago: University of Chicago Press.
— (1984), Word Grammar. Oxford: Blackwell.
— (1990), English Word Grammar. Oxford: Blackwell.
— (2004, July 1 - last update), 'Word Grammar'. Available: www.phon.ucl.ac.uk/home/dick/wg.htm (Accessed: 18 April 2005).
— (in preparation), Advances in Word Grammar. Oxford: Oxford University Press.
Pollard, C. and Sag, I. A. (1987), Information-Based Syntax and Semantics. Stanford: CSLI.
Schachter, P. (1978), 'Review of Arguments for a Non-Transformational Grammar'. Language, 54, 348-76.
Sugayama, K. (ed.) (2003), Studies in Word Grammar. Kobe: Research Institute of Foreign Studies, KCUFS.
Tesnière, L. (1959), Éléments de Syntaxe Structurale. Paris: Klincksieck.


Introduction


1 What is Word Grammar?

RICHARD HUDSON

Abstract
The chapter summarizes the Word Grammar (WG) theory of language structure under the following headings: 1. A brief overview of the theory; 2. Historical background; 3. The cognitive network: 3.1 Language as part of a general network; 3.2 Labelled links; 3.3 Modularity; 4. Default inheritance; 5. The language network; 6. The utterance network; 7. Morphology; 8. Syntax; 9. Semantics; 10. Processing; and 11. Conclusions.

1. A Brief Overview of the Theory

Word Grammar (WG) is a general theory of language structure. Most of the work to date has dealt with syntax, but there has also been serious work in semantics and some more tentative explorations of morphology, sociolinguistics, historical linguistics and language processing. The only areas of linguistics that have not been addressed at all are phonology and language acquisition (but even here see van Langendonck 1987). The aim of this article is breadth rather than depth, in the hope of showing how far-reaching the theory's tenets are.

Although the roots of WG lie firmly in linguistics, and more specifically in grammar, it can also be seen as a contribution to cognitive psychology; in terms of a widely used classification of linguistic theories, it is a branch of cognitive linguistics (Lakoff 1987; Langacker 1987; 1990; Taylor 1989). The theory has been developed from the start with the aim of integrating all aspects of language into a single theory which is also compatible with what is known about general cognition. This may turn out not to be possible, but to the extent that it is possible it will have explained the general characteristics of language as 'merely' one instantiation of more general cognitive characteristics.

The overriding consideration, of course, is the same as for any other linguistic theory: to be true to the facts of language structure. However, our assumptions make a great deal of difference when approaching these facts, so it is possible to arrive at radically different analyses according to whether we assume that language is a unique module of the mind, or that it is similar to other parts of cognition. The WG assumption is that language can be analysed and explained in the same way as other kinds of knowledge or behaviour unless there is clear evidence to the contrary. So far this strategy has proved productive and largely successful, as we shall see below.


As the theory's name suggests, the central unit of analysis is the word, which is central to all kinds of analysis:

• Grammar. Words are the only units of syntax (section 8), as sentence structure consists entirely of dependencies between individual words; WG is thus clearly part of the tradition of dependency grammar dating from Tesniere (1959; Fraser 1994). Phrases are implicit in the dependencies, but play no part in the grammar. Moreover, words are not only the largest units of syntax, but also the smallest. In contrast with Chomskyan linguistics, syntactic structures do not, and cannot, separate stems and inflections, so WG is an example of morphology-free syntax (Zwicky 1992: 354). Unlike syntax, morphology (section 7) is based on constituent structure, and the two kinds of structure are different in other ways too.

• Semantics. As in other theories, words are also the basic lexical units where sound meets syntax and semantics; but in the absence of phrases, words also provide the only point of contact between syntax and semantics, giving a radically 'lexical' semantics. As will appear in section 9, a rather unexpected effect of basing semantic structure on single words is a kind of phrase structure in the semantics.

• Situation. We shall see in section 6 that words are the basic units for contextual analysis (in terms of deictic semantics, discourse or sociolinguistics).

Words, in short, are the nodes that hold the 'language' part of the human network together. This is illustrated by the word cycled in the sentence I cycled to UCL, which is diagrammed in Figure 1.

Figure 1


Table 1 Relationships in cycled

related concept C           relationship of C to cycled   notation in diagram
the word I                  subject                       -
the word to                 post-adjunct                  '>a'
the morpheme {cycle}        stem                          straight downward line
the word-form {cycle+ed}    whole                         curved downward line
the concept 'ride-bike'     sense                         straight upward line
the concept 'event e'       referent                      curved upward line
the lexeme CYCLE            cycled isa CYCLE              triangle resting on CYCLE
the inflection 'past'       -                             -
me                          speaker                       'speaker'
now                         time                          'time'

As can be seen in this diagram, cycled is the meeting point for ten relationships, which are detailed in Table 1. These relationships are all quite traditional (syntactic, morphological, semantic, lexical and contextual), and traditional names are used where they exist, but the diagram uses notation which is peculiar to WG. It should be easy to imagine how such relationships can multiply to produce a rich network in which words are related to one another as well as to other kinds of element, including morphemes and various kinds of meaning. All these elements, including the words themselves, are 'concepts' in the standard sense; thus a WG diagram is an attempt to model a small part of the total conceptual network of a typical speaker.
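Read as data, Table 1 is a tiny fragment of such a network: ten labelled links radiating from the single word token cycled. The encoding below is my own sketch (a plain list of relation-target pairs, not WG notation), and rendering both the lexeme row and the inflection row as 'isa' links is my reading of the table, since the token instantiates both CYCLE and 'past'.

```python
# The ten relationships of the token 'cycled' (Table 1) as labelled
# links in a network fragment. The list-of-pairs layout is my own
# illustration; treating both the lexeme and the inflection rows as
# 'isa' links is an assumption, not something the table states.

cycled_links = [
    ('subject',      "the word I"),
    ('post-adjunct', "the word to"),
    ('stem',         "the morpheme {cycle}"),
    ('whole',        "the word-form {cycle+ed}"),
    ('sense',        "the concept 'ride-bike'"),
    ('referent',     "the concept 'event e'"),
    ('isa',          "the lexeme CYCLE"),
    ('isa',          "the inflection 'past'"),
    ('speaker',      "me"),
    ('time',         "now"),
]

# One node, ten outgoing links: the word as the hub of the network.
print(len(cycled_links))  # -> 10
```

A list (rather than a dictionary keyed by relation) is deliberate here, because one node can carry the same relation twice, as with the two 'isa' links.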

2. Historical Background

The theory described in this article is the latest in a family of theories which have been called 'Word Grammar' since the early 1980s (Hudson 1984). The present theory is very different in some respects from the earliest one, but the continued use of the same name is justified because we have preserved some of the most fundamental ideas - the central place of the word, the idea that language is a network, the role of default inheritance, the clear separation of syntax and semantics, the integration of sentence and utterance structure. The theory is still changing and further changes are already identifiable (Hudson, in preparation).

As in other theories, the changes have been driven by various forces - new data, new ideas, new alternative theories, new personal interests; and by the influence of teachers, colleagues and students. The following brief history may be helpful in showing how the ideas that are now called 'Word Grammar' developed during my academic life.

The 1960s. My PhD analysis of Beja used the theory being developed by Halliday (1961) under the name 'Scale-and-Category' grammar, which later turned into Systemic Functional Grammar (Butler 1985; Halliday 1985). I spent the next six years working with Halliday, whose brilliantly wide-ranging analyses impressed me a lot. Under the influence of Chomsky's generative grammar (1957, 1965), reinterpreted by McCawley (1968) as well-formedness conditions, I published the first generative version of Halliday's Systemic Grammar (Hudson 1970). This theory has a very large network (the 'system network') at its heart, and networks also loomed large at that time in the Stratificational Grammar of Lamb (1966; Bennett 1994). Another reason why stratificational grammar was important was that it aimed to be a model of human language processing - a cognitive model.

The 1970s. Seeing the attractions of both valency theory and Chomsky's subcategorization, I produced a hybrid theory which was basically Systemic Grammar, but with the addition of word-word dependencies under the influence of Anderson (1971); the theory was called 'Daughter-Dependency Grammar' (Hudson 1976). Meanwhile I was teaching sociolinguistics and becoming increasingly interested in cognitive science (especially default inheritance systems and frames) and the closely related field of lexical semantics (especially Fillmore's Frame Semantics 1975, 1976). The result was a very 'cognitive' textbook on sociolinguistics (Hudson 1980a, 1996a). I was also deeply influenced by Chomsky's 'Remarks on nominalization' paper (1970), and in exploring the possibilities of a radically lexicalist approach I toyed with the idea of 'pan-lexicalism' (1980b, 1981): everything in the grammar is 'lexical' in the sense that it is tied to word-sized units (including word classes).

The 1980s. All these influences combined in the first version of Word Grammar (Hudson 1984), a cognitive theory of language as a network which contains both 'the grammar' and 'the lexicon' and which integrates language with the rest of cognition. The semantics follows Lyons (1977), Halliday (1967-8) and Fillmore (1976) rather than formal logic, but even more controversially, the syntax no longer uses phrase structure at all in describing sentence structure, because everything that needs to be said can be said in terms of dependencies between single words. The influence of continental dependency theory is evident but the dependency structures were richer than those allowed in 'classical' dependency grammar (Robinson 1970) - more like the functional structures of Lexical Functional Grammar (Kaplan and Bresnan 1982). Bresnan's earlier argument (1978) that grammar should be compatible with a psychologically plausible parser also suggested the need for a parsing algorithm, which has led to a number of modest Natural Language Processing (NLP) systems using WG (Fraser 1985, 1989, 1993; Hudson 1989; Shaumyan 1995). These developments provided the basis for the next book-length description of WG, 'English Word Grammar' (EWG, Hudson 1990). This attempts to provide a formal basis for the theory as well as a detailed application to large areas of English morphology, syntax and semantics.

The 1990s. Since the publication of EWG there have been some important changes in the theory, ranging from the general theory of default inheritance, through matters of syntactic theory (with the addition of 'surface structure', the virtual abolition of features and the acceptance of 'unreal' words) and morphological theory (where 'shape', 'whole' and 'inflection' are new), to details of analysis, terminology and notation. These changes will be described below. WG has also been applied to a wider range of topics than previously:

• lexical semantics (Gisborne 1993, 1996, 2000, 2001; Holmes 2004; Hudson and Holmes 2000; Hudson 1992, 1995, forthcoming; Sugayama 1993, 1996, 1998);

• morphology (Creider 1999; Creider and Hudson 1999);
• sociolinguistics (Hudson 1996a, 1997b; Eppler 2005);
• language processing (Hudson 1993a, b, 1996b; Hiranuma 1999, 2001).

Most of the work done since the start of WG has applied the theory to English, but it has also been applied to the following languages: Tunisian Arabic (Chekili 1982); Greek (Tzanidaki 1995, 1996a, b); Italian (Volino 1990); Japanese (Sugayama 1991, 1992, 1993, 1996; Hiranuma 1999, 2001); Serbo-Croatian (Camdzic and Hudson 2002); and Polish (Gorayska 1985).

The theory continues to evolve, and at the time of writing a 'Word Grammar Encyclopedia' which can be downloaded via the WG website (www.phon.ucl.ac.uk/home/dick/wg.htm) is updated in alternate years.

3. The Cognitive Network

3.1 Language as part of a general network

The basis for WG is an idea which is quite uncontroversial in cognitive science:

The idea is that memory connections provide the basic building blocks through which our knowledge is represented in memory. For example, you obviously know your mother's name; this fact is recorded in your memory. The proposal to be considered is that this memory is literally represented by a memory connection, ... That connection isn't some appendage to the memory. Instead, the connection is the memory. ... all of knowledge is represented via a sprawling network of these connections, a vast set of associations. (Reisberg 1997: 257-8)

In short, knowledge is held in memory as an associative network (though we shall see below that the links are much more precisely defined than the unlabelled 'associations' of early psychology and modern connectionist psychology). What is more controversial is that, according to WG, the same is true of our knowledge of words, so the sub-network responsible for words is just a part of the total 'vast set of associations'. Our knowledge of words is our language, so our language is a network of associations which is closely integrated with the rest of our knowledge.

However uncontroversial (and obvious) this view of knowledge may be in general, it is very controversial in relation to language. The only part of language which is widely viewed as a network is the lexicon (Aitchison 1987: 72), and a fashionable view is that even here only lexical irregularities are stored in an associative network, in contrast with regularities which are stored in a fundamentally different way, as 'rules' (Pinker and Prince 1988). For example, we have a network which shows for the verb come not only that its meaning is 'come' but that its past tense is the irregular came, whereas regular past tenses are handled by a general rule and not stored in the network. The WG view is that exceptional and general patterns are indeed different, but that they can both be accommodated in the same network because it is an 'inheritance network' in which general patterns and their exceptions are related by default inheritance (which is discussed in more detail in section 4). To pursue the last example, both patterns can be expressed in exactly the same prose:

(1) The shape of the past tense of a verb consists of its stem followed by -ed.
(2) The shape of the past tense of come consists of came.

The only difference between these rules lies in two places: 'a verb' versus come, and 'its stem followed by -ed' versus came. Similarly, they can both be incorporated into the same network, as shown in Figure 2 (where the triangle once again shows the 'isa' relationship by linking the general concept at its base to the specific example connected to its apex):

Figure 2
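Rules (1) and (2) can also be given a computational reading. The sketch below is my own illustration in plain Python, not WG's network notation: a shape stored on a specific concept such as 'COME: past' overrides the default built from the stem, exactly as in Figure 2.

```python
# Minimal sketch of default inheritance for rules (1) and (2).
# Node names and attributes are illustrative, not WG formalism.

class Node:
    def __init__(self, name, isa=None, **facts):
        self.name = name
        self.isa = isa        # the model this node inherits from
        self.facts = facts    # locally stored (possibly exceptional) facts

    def get(self, attr):
        """Inherit by default: a local fact wins; otherwise climb the isa chain."""
        if attr in self.facts:
            return self.facts[attr]
        if self.isa is not None:
            return self.isa.get(attr)
        raise KeyError(attr)

def past_shape(word):
    # Rule (2): a stored exceptional shape overrides ...
    try:
        return word.get("shape")
    except KeyError:
        # ... rule (1): by default the shape is the stem followed by -ed.
        return word.get("stem") + "ed"

verb = Node("verb")
past = Node("verb: past", isa=verb)
walk = Node("WALK: past", isa=past, stem="walk")
come = Node("COME: past", isa=past, stem="come", shape="came")

print(past_shape(walk))   # walked  (inherited default)
print(past_shape(come))   # came    (exception blocks the default)
```

Both rules live in one structure, mirroring the WG claim that 'the lexicon' and 'the rules' need no formal separation.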

Once the possibility is accepted that some generalizations may be expressed in a network, it is easy to extend the same treatment to the whole grammar, as we shall see in later examples. One consequence, of course, is that we lose the formal distinction between 'the lexicon' and 'the rules' (or 'the grammar'), but this conclusion is also accepted outside WG in Cognitive Grammar (Langacker 1987) and Construction Grammar (Goldberg 1995). The only parts of linguistic analysis that cannot be included in the network are the few general theoretical principles (such as the principle of default inheritance).

3.2 Labelled links

It is easy to misunderstand the network view because (in cognitive psychology) there is a long tradition of 'associative network' theories in which all links have just the same status: simple 'association'. This is not the WG view, nor is it the view of any of the other theories mentioned above, because links are classified and labelled - 'stem', 'shape', 'sense', 'referent', 'subject', 'adjunct' and so on. The classifying categories range from the most general - the 'isa' link - to categories which may be specific to a handful of concepts, such as 'goods' in the framework of commercial transactions (Hudson forthcoming). This is a far cry from the idea of a network of mere 'associations' (such as underlies connectionist models). One of the immediate benefits of this approach is that it allows named links to be used as functions, in the mathematical sense of Kaplan and Bresnan (1982: 182), which yield a unique value - e.g. 'the referent of the subject of the verb' defines one unique concept for each verb. In order to distinguish this approach from the traditional associative networks we can call these networks 'labelled'.

Even within linguistics, labelled networks are controversial because the labels themselves need an explanation or analysis. Because of this problem some theories avoid labelled relationships, or reduce labelling to something more primitive: for example, Chomsky has always avoided functional labels for constituents such as 'subject' by using configurational definitions, and the predicate calculus avoids semantic role labels by distinguishing arguments in terms of order.

There is no doubt that labels on links are puzzlingly different from the labels that we give to the concepts that they link. Take the small network in Figure 2 for past tenses. One of the nodes is labelled 'COME: past', but this label could in fact be removed without any effect because 'COME: past' is the only concept which isa 'verb: past' and which has came as its shape. Every concept is uniquely defined by its links to other concepts, so labels are redundant (Lamb 1996, 1999: 59). But the same is not true of the labels on links, because a network with unlabelled links is a mere associative network which would be useless in analysis. For example, it is no help to know that in John saw Mary the verb is linked, in some way or other, to the two nouns and that its meaning is linked, again in unspecified ways, to the concepts 'John' and 'Mary'; we need to know which noun is the subject, and which person is the see-er. The same label may be found on many different links - for example, every word that has a sense (i.e. virtually every word) has a link labelled 'sense', every verb that has a subject has a 'subject' link, and so on. Therefore the function of the labels is to classify the links as same or different, so if we remove the label we lose information. It makes no difference whether we show these similarities and differences by means of verbal labels (e.g. 'sense') or some other notational device (e.g. straight upwards lines); all that counts is whether or not our notation classifies links as same or different. Figure 3 shows how this can be done using first conventional attribute-value matrices and second, the WG notation used so far.

Figure 3

This peculiarity of the labels on links brings us to an important characteristic of the network approach which allows the links themselves to be treated like the concepts which they link - as 'second-order concepts', in fact. The essence of a network is that each concept should be represented just once, and its multiple links to other concepts should be shown as multiple links, not as multiple copies of the concept itself. Although the same principle applies generally to attribute-value matrices, it does not apply to the attributes themselves. Thus there is a single matrix for each concept, and if two attributes have the same value this is shown (at least in one notation) by an arc that connects the two value-slots. But when it comes to the attributes themselves, their labels are repeated across matrices (or even within a single complex matrix). For example, the matrix for a raising verb contains within it the matrix for its complement verb; an arc can show that the two subject slots share the same filler but the only way to show that these two slots belong to the same (kind of) attribute is to repeat the label 'subject'.

In a network approach it is possible to show both kinds of identity in the same way: by means of a single node with multiple 'isa' links. If two words are both nouns, we show this by an isa link from each to the concept 'noun'; and if two links are both 'subject' links, we put an isa link from each link to a single general 'subject' link. Thus labelled links and other notational tricks are just abbreviations for a more complex diagram with second-order links between links. These second-order links are illustrated in Figure 4 for car and bicycle, as well as for the sentence Jo snores.

Figure 4

This kind of analysis is too cumbersome to present explicitly in most diagrams, but it is important to be clear that it underlies the usual notation because it allows the kind of analysis which we apply to ordinary concepts to be extended to the links between them. If ordinary concepts can be grouped into larger classes, so can links; if ordinary concepts can be learned, so can links. And if the labels on ordinary concepts are just mnemonics which could, in principle, be removed, the same is true of the labels on almost all kinds of link. The one exception is the 'isa' relationship itself, which reflects its fundamental character.

3.3 Modularity

The view of language as a labelled network has interesting consequences for the debate about modularity: is there a distinct 'module' of the mind dedicated exclusively to language (or to some part of language such as syntax or inflectional morphology)? Presumably not if a module is defined as a separate 'part' of our mind and if the language network is just a small part of a much larger network. One alternative to this strong version of modularity is no modularity at all, with the mind viewed as a single undifferentiated whole; this seems just as wrong as a really strict version of modularity. However there is a third possibility. If we focus on the links, any such network is inevitably 'modular' in the much weaker (and less controversial) sense that links between concepts tend to cluster into relatively dense sub-networks separated by relatively sparse boundary areas.

Perhaps the clearest evidence for some kind of modularity comes from language pathology, where abilities are impaired selectively. Take the case of Pure Word Deafness (Altmann 1997: 186), for example. Why should a person be able to speak and read normally, and to hear and classify ordinary noises, but not be able to understand the speech of other people? In terms of a WG network, this looks like an inability to follow one particular link-type ('sense') in one particular direction (from word to sense). Whatever the reason for this strange disability, at least the WG analysis suggests how it might apply to just this one aspect of language, while also applying to every single word: what is damaged is the general relationship 'sense', from which all particular sense relationships are inherited. A different kind of problem is illustrated by patients who can name everything except one category - e.g. body-parts or things typically found indoors (Pinker 1994: 314). Orthodox views on modularity seem to be of little help in such cases, but a network approach at least explains how the non-linguistic concepts concerned could form a mental cluster of closely-linked and mutually defining concepts with a single super-category. It is easy to imagine reasons why such a cluster of concepts might be impaired selectively (e.g. that closely related concepts are stored close to each other, so a single injury could sever all their sense links), but the main point is to have provided a way of unifying them in preparation for the explanation.

In short, a network with classified relations allows an injury to apply to specific relation types so that these relations are disabled across the board. The approach also allows damage to specific areas of language which form clusters with strong internal links and weak external links. Any such cluster or shared linkage defines a kind of 'module' which may be impaired selectively, but the module need not be innate: it may be 'emergent', a cognitive pattern which emerges through experience (Karmiloff-Smith 1992; Bates et al. 1998).

4. Default Inheritance

Default inheritance is just a formal version of the logic that linguists have always used: true generalizations may have exceptions. We allow ourselves to say that verbs form their past tense by adding -ed to the stem even if some verbs don't, because the specific provision made for these exceptional cases will automatically override the general pattern. In short, characteristics of a general category are 'inherited' by instances of that category only 'by default' - only if they are not overridden by a known characteristic of the specific case. Common sense tells us that this is how ordinary inference works, but default inheritance only works when used sensibly. Although it is widely used in artificial intelligence, researchers treat it with great caution (Luger and Stubblefield 1993: 386-8). The classic formal treatment is Touretzky (1986).

Inheritance is carried by the 'isa' relation, which is another reason for considering this relation to be fundamental. For example, because snores isa 'verb' it automatically inherits all the known characteristics of 'verb' (i.e. of 'the typical verb'), including, for example, the fact that it has a subject; similarly, because the link between Jo and snores in Jo snores isa 'subject' it inherits the characteristics of 'subject'. As we have already seen, the notation for 'isa' consists of a small triangle with a line from its apex to the instance. The base of the triangle which rests on the general category reminds us that this category is larger than the instance, but it can also be imagined as the mouth of a hopper into which information is poured so that it can flow along the link to the instance.

The mechanism whereby default values are overridden has changed during the last few years. In EWG, and also in Fraser and Hudson (1992), the mechanism was 'stipulated overriding', a system peculiar to WG; but since then this system has been abandoned. WG now uses a conventional system in which a fact is automatically blocked by any other fact which conflicts and is more specific. Thus the fact that the past tense of COME is came automatically blocks the inheritance of the default pattern for past tense verbs. One of the advantages of a network notation is that this is easy to define formally: we always prefer the value for 'R of C' (where R is some relationship, possibly complex, and C is a concept) which is nearest to C (in terms of intervening links). For example, if we want to find the shape of the past tense of COME, we have a choice between came and comed, but the route to came is shorter than that to comed because the latter passes through the concept 'past tense of a verb'. (For detailed discussions of default inheritance in WG, see Hudson 2000a, 2003b.)
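The 'nearest value wins' idea is easy to state procedurally. In this sketch (my own illustration; the table entries are not a full grammar), the search climbs the isa chain from the concept and stops at the first stored shape, so the irregular came, found at distance 0, beats the regular pattern, which is only reachable via 'verb: past' at distance 1:

```python
# Sketch of specificity as distance: prefer the value for 'shape' that
# is reachable from the concept through the fewest isa links.

isa = {"COME: past": "verb: past", "WALK: past": "verb: past", "verb: past": "verb"}
stored_shape = {"COME: past": "came", "verb: past": "<stem>+ed"}

def inherit_shape(concept):
    distance = 0
    node = concept
    while node is not None:
        if node in stored_shape:
            # First hit on the way up = shortest route = most specific fact.
            return stored_shape[node], distance
        node = isa.get(node)
        distance += 1
    return None, distance

print(inherit_shape("COME: past"))   # ('came', 0)
print(inherit_shape("WALK: past"))   # ('<stem>+ed', 1)
```

Because the climb stops at the first stored fact, the more specific value automatically blocks the default without any stipulated overriding.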

Probably the most important question for any system that uses default inheritance concerns multiple inheritance, in which one concept inherits from two different concepts simultaneously - as 'dog' inherits, for example, both from 'mammal' and from 'pet'. Multiple inheritance is allowed in WG, as in unification-based systems and the programming language DATR (Evans and Gazdar 1996); it is true that it opens up the possibility of conflicting information being inherited, but this is a problem only if the conflict is an artefact of the analysis. There seem to be some examples in language where a form is ungrammatical precisely because there is an irresoluble conflict between two characteristics; for example, in many varieties of standard English the combination *I amn't is predictable, but ungrammatical. One explanation for this strange gap is that the putative form amn't has to inherit simultaneously from aren't (the negative present of BE) and am (the I-form of BE); but these models offer conflicting shapes (aren't, am) without any way for either to override the other (Hudson 2000a). In short, WG does allow multiple inheritance, and indeed uses it a great deal (as we shall see in later sections).
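The amn't gap can be sketched in the same style. In this fragment (an illustration of the reasoning, not Hudson's formalism), a node that inherits shapes from two models, neither of which is more specific than the other, ends up with an irresoluble conflict and no form at all:

```python
# Sketch of multiple inheritance with an irresoluble conflict: the
# putative amn't inherits a shape from both of its models, and since
# neither model outranks the other, no shape can be chosen.

models = {"amn't": ["aren't", "am"]}
stored_shape = {"aren't": "aren't", "am": "am"}

def inherited_shape(concept):
    shapes = {stored_shape[m] for m in models.get(concept, []) if m in stored_shape}
    if len(shapes) > 1:
        return None          # conflicting values and no way to choose
    return shapes.pop() if shapes else None

print(inherited_shape("amn't"))  # None: predictable but ungrammatical
```

The gap thus falls out of the inheritance mechanism itself rather than needing a special stipulation.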

5. The Language Network

According to WG, then, language is a network of concepts. The following more specific claims flesh out this general idea.

First, language is part of the same general conceptual network which contains many concepts which are not part of language. What distinguishes the language area of this network from the rest is that the concepts concerned are words and their immediate characteristics. This is simply a matter of definition: concepts which are not directly related to words would not be considered to be part of language. As explained in section 3.3, language probably qualifies as a module in the weak sense that the links among words are denser than those between words and other kinds of concept, but this does not mean that language is a module in the stronger sense of being 'encapsulated' or having its own special formal characteristics. This is still a matter of debate, but we can be sure that at least some of the characteristics of language are also found elsewhere - the mechanism of default inheritance and the isa relation, the notion of linear order, and many other formal properties and principles.

As we saw in Table 1, words may have a variety of links to each other and to other concepts. This is uncontroversial, and so are most of the links that are recognized. Even the traditional notions of 'levels of language' are respected in as much as each level is defined by a distinct kind of link: a word is linked to its morphological structure via the 'stem' and 'shape' links, to its semantics by the 'sense' and 'referent' links, and to its syntax by dependencies and word classes. Figure 5 shows how clearly the traditional levels can be separated from one another. In WG there is total commitment to the 'autonomy' of levels, in the sense that the levels are formally distinct.

Figure 5

The most controversial characteristic of WG, at this level of generality, is probably the central role played by inheritance (isa) hierarchies. Inheritance hierarchies are the sole means available for classifying concepts, which means that there is no place for feature-descriptions. In most other theories, feature-descriptions are used to name concepts, so that instead of 'verb' we have '[+V, -N]' or (changing notation) '[Verb: +, Noun: -, SUBCAT: <NP>]' or even 'S/NP'. This is a fundamental difference because, as we saw earlier, the labels on WG nodes are simply mnemonics and the analysis would not be changed at all if they were all removed. The same is clearly not true where feature-descriptions are used, as the name itself contains crucial information which is not shown in any other way. In order to classify a word as a verb in WG we give it an isa link to 'verb'; we do not give it a feature-description which contains that of 'verb'.

The most obviously classifiable elements in language are words, so in addition to specific, unique, words we recognize general 'word-types'; but we can refer to both simply as 'words' because (as we shall see in the next section) their status is just the same. Multiple inheritance allows words to be classified on two different 'dimensions': as lexemes (DOG, LIKE, IF, etc.) and as inflections (plural, past, etc.). Figure 6 shows how this cross-classification can be incorporated into an isa hierarchy. The traditional word classes are shown on the lexeme dimension as classifications of lexemes, but they interact in complex ways with inflections. Cross-classification is possible even among word-classes; for example, English gerunds (e.g. Writing in Writing articles is fun.) are both nouns and verbs (Hudson 2000b), and in many languages participles are probably both adjectives and verbs.

Figure 6

Unlike other theories, the classification does not take words as the highest category of concepts - indeed, it cannot do so if language is part of a larger network. WG allows us to show the similarities between words and other kinds of communicative behaviour by virtue of an isa link from 'word' to 'communication', and similar links show that words are actions and events. This is important in the analysis of deictic meanings which have to relate to the participants and circumstances of the word as an action.

This hierarchy of words is not the only isa hierarchy in language. There are two more for speech sounds ('phonemes') and for letters ('graphemes'), and a fourth for morphemes and larger 'forms' (Hudson 1997b; Creider and Hudson 1999), but most important is the one for relationships - 'sense', 'subject' and so on. Some of these relationships belong to the hierarchy of dependents which we shall discuss in the section on syntax, but there are many others which do not seem to comprise a single coherent hierarchy peculiar to language (in contrast with the 'word' hierarchy). What seems much more likely is that relationships needed in other areas of thought (e.g. 'before', 'part-of') are put to use in language.

To summarize, the language network is a collection of words and word-parts (speech-sounds, letters and morphemes) which are linked to each other and to the rest of cognition in a variety of ways, of which the most important is the 'isa' relationship which classifies them and allows default inheritance.

6. The Utterance Network

A WG analysis of an utterance is also a network; in fact, it is simply an extension of the permanent cognitive network in which the relevant word tokens comprise a 'fringe' of temporary concepts attached by 'isa' links, so the utterance network has just the same formal characteristics as the permanent network. For example, suppose you say to me 'I agree.' My task, as hearer, is to segment your utterance into the two words I and agree, and then to classify each of these as an example of some word in my permanent network (my grammar). This is possible to the extent that default inheritance can apply smoothly; so, for example, if my grammar says that I must be the subject of a tensed verb, the same must be true of this token, though as we shall see below, exceptions can be tolerated. In short, a WG grammar can generate representations of actual utterances, warts and all, in contrast with most other kinds of grammar which generate only idealized utterances or 'sentences'. This blurring of the boundary between grammar and utterance is very controversial, but it follows inevitably from the cognitive orientation of WG.

The status of utterances has a number of theoretical consequences both for the structures generated and for the grammar that generates them. The most obvious consequence is that word tokens must have different names from the types of which they are tokens; in our example, the first word must not be shown as I if this is also used as the name for the word-type in the grammar. This follows from the fact that identical labels imply identity of concept, whereas tokens and types are clearly distinct concepts. The WG convention is to reserve conventional names for types, with tokens labelled 'w1', 'w2' and so on through the utterance. Thus our example consists of w1 and w2, which isa 'I' and 'AGREE: pres' respectively. This system allows two tokens of the same type to be distinguished; so in I agree I made a mistake, w1 and w3 both isa 'I'. (For simplicity WG diagrams in this chapter only respect this convention when it is important to distinguish tokens from types.)
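The w1, w2, ... convention amounts to generating a fresh token node for each word of the utterance, each with an isa link to a type. A minimal sketch (my own; the type labels here are just strings standing in for grammar concepts):

```python
# Sketch of token naming: each utterance word becomes a temporary node
# 'w1', 'w2', ... with an isa link to its type, so repeated words stay
# distinct tokens of a single type.

def tokenize(utterance):
    return [{"token": f"w{i}", "isa": word}
            for i, word in enumerate(utterance.split(), start=1)]

for t in tokenize("I agree I made a mistake"):
    print(t["token"], "isa", t["isa"])
# w1 and w3 are distinct tokens which both isa the type I
```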

Another consequence of integrating utterances into the grammar is that word types and tokens must have characteristics such that a token can inherit them from its type. Obviously the token must have the familiar characteristics of types - it must belong to a lexeme and a word class, it must have a sense and a stem, and so on. But the implication goes in the other direction as well: the type may mention some of the token's characteristics that are normally excluded from grammar, such as characteristics of the speaker, the addressee and the situation. This allows a principled account of deictic meaning (e.g. I refers to the speaker, you to the addressee and now to the time of speaking), as shown in Figure 1 and Table 1. Perhaps even more importantly, it is possible to incorporate sociolinguistic information into the grammar, by indicating the kind of person who is a typical speaker or addressee, or the typical situation of use.

Treating utterances as part of the grammar has two further effects which are important for the psycholinguistics of processing and of acquisition. As far as processing is concerned, the main point is that WG accommodates deviant input because the link between tokens and types is guided by the rather liberal 'Best Fit Principle' (Hudson 1990: 45ff): assume that the current token isa the type that provides the best fit with everything that is known. The default inheritance process which this triggers allows known characteristics of the token to override those of the type; for example, a misspelled word such as mispelled can isa its type, just like any other exception, though it will also be shown as a deviant example. There is no need for the analysis to crash because of an error. (Of course a WG grammar is not in itself a model of either production or perception, but simply provides a network of knowledge which the processor can exploit.) Turning to learning, the similarity between tokens and types means that learning can consist of nothing but the permanent storage of tokens minus their utterance-specific content.

These remarks about utterances are summarized in Figure 7, which speculates about my mental representation for the (written) 'utterance' Yous mispelled it. According to this diagram, the grammar supplies two kinds of utterance-based information about w1:

• that its referent is a set whose members include its addressee;
• that its speaker is a 'northerner' (which may be inaccurate factually, but is roughly what I believe to be the case).

It also shows that w2 is a deviant token of the type 'MISSPELL: past'. (The horizontal line below 'parts' is short-hand for a series of lines connecting the individual letters directly to the morpheme, each with a distinct part name: part 1, part 2 and so on.)

Figure 7

7. Morphology

As explained earlier, the central role of the word automatically means that the syntax is 'morphology-free'. Consequently it would be fundamentally against the spirit of WG to follow transformational analyses in taking Jo snores as Jo 'tense' snore. A morpheme for tense is not a word in any sense, so it cannot be a syntactic node. The internal structure of words is handled almost entirely by morphology. (The exception is the pattern found in clitics, which we return to at the end of this section.)

The WG theory of inflectional morphology has developed considerably in the last few years (Creider and Hudson 1998; Hudson 2000a) and is still evolving. In contrast with the views expressed in EWG, I now distinguish sharply between words, which are abstract, and forms, which are their concrete (visible or audible) shapes; so I now accept the distinction between syntactic words and phonological words (Rosta 1997) in all but terminology. The logic behind this distinction is simple: if two words can share the same form, the form must be a unit distinct from both. For example, we must recognize a morpheme {bear} which is distinct from both the noun and the verb that share it (BEARnoun and BEARverb). This means that a word can never be directly related to phonemes and letters, in contrast with the EWG account where this was possible (e.g. Hudson 1990: 90: 'whole of THEM = <them>'). Instead, words are mapped to forms, and forms to phonemes and letters. A form is the 'shape' of a word, and a phoneme or letter is a 'pronunciation' or 'spelling' of a form. In Figure 7, for example, the verb MISSPELL has the form {misspell} as its stem (a kind of shape), and the spelling of {misspell} is <misspell>.

In traditional terms, syntax, form and phonology define different 'levels of language'. As in traditional structuralism, their basic units are distinct: words, morphemes and phoneme-type segments; and as in the European tradition, morphemes combine to define larger units of form which are still distinct from words. For example, {misspell} is clearly not a single morpheme, but it exists as a unit of form which might be written {mis+spell} - two morphemes combining to make a complex form - and similarly for {mis+spell+ed}, the shape of the past tense of this verb. Notice that in this analysis {...} indicates forms, not morphemes; morpheme boundaries are shown by '+'.

Where does morphology, as a part of the grammar, fit in? Inflectional morphology is responsible for any differences between a word's stem - the shape of its lexeme - and its whole - the complete shape. For example, the stem of misspelled is {misspell}, so inflectional morphology explains the extra suffix. Derivational morphology, on the other hand, explains the relations between the stems of distinct lexemes - in this case, between the lexemes SPELL and MISSPELL, whereby the stem of one is contained in the stem of the other. The grammar therefore contains the following 'facts':

• the stem of SPELL is {spell};
• the stem of MISSPELL is {mis+spell};
• the 'mis-verb' of a verb has a stem which contains {mis} + the stem of this verb;


• the whole of MISSPELL: past is {mis+spell+ed};
• the past tense of a verb has a whole which contains its stem + {ed}.

In more complex cases (which we cannot consider here) the morphological rules can handle vowel alternations and other departures from simple combination of morphemes.

A small sample of a network for inflectional morphology is shown in Figure 8. This diagram shows the default identity of whole and stem, and the default rule for plural nouns: their shape consists of their stem followed by {s}. No plural need be stored for regular nouns like DUCK, but for GOOSE the irregularity must be stored. According to the analysis shown here, geese is doubly irregular, having no suffix and having an irregular stem whose vowel positions (labelled here simply '1' and '2') are filled by (examples of) <e> instead of the expected <o>. In spite of the vowel change the stem of geese isa the stem of GOOSE, so it inherits all the other letters, but had it been suppletive a completely new stem would have been supplied.

Figure 8
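The default-inheritance logic at work here can be mimicked in a short Python sketch. This is a rough analogue under my own assumptions, not the WG network notation itself: a most-specific-wins search up an inheritance chain, with the plural default stated once and GOOSE storing its exception.

```python
# Minimal sketch of multiple default inheritance for plural-noun morphology:
# DUCK inherits the default rule (whole = stem + {s}), while GOOSE stores
# its irregular whole, which overrides the inherited default.

class Node:
    def __init__(self, parent=None, **facts):
        self.parent = parent
        self.facts = facts          # locally stored (possibly exceptional) facts

    def get(self, attr):
        # Search from the most specific node upward; the first value found
        # wins, so stored exceptions override inherited defaults.
        node = self
        while node is not None:
            if attr in node.facts:
                value = node.facts[attr]
                return value(self) if callable(value) else value
            node = node.parent
        raise KeyError(attr)

# The general category: a plural noun's whole is its stem followed by {s}.
plural_noun = Node(whole=lambda n: n.get("stem") + "s")

duck_plural = Node(parent=plural_noun, stem="duck")              # regular: nothing extra stored
goose_plural = Node(parent=plural_noun, stem="goose", whole="geese")  # irregular: whole stored

print(duck_plural.get("whole"))   # ducks  (inherited default)
print(goose_plural.get("whole"))  # geese  (stored exception wins)
```

Note that nothing is stored for DUCK beyond its stem, exactly as in the network: regularity is simply the absence of a stored exception.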


This analysis is very similar to those which can be expressed in terms of 'network morphology' (Brown et al. 1996), which is also based on multiple default inheritance. One important difference lies in the treatment of syncretism, illustrated by the English verb's past participle and passive participle, which are invariably the same. In network morphology the identity is shown by specifying one and cross-referring to it from the other, but this involves an arbitrary choice: which is the 'basic' one? In WG morphology, in contrast, the syncretic generalizations are expressed in terms of 'variant' relations between forms; for example, the past participle and passive participle both have as their whole the 'en-variant' of their stem, where the en-variant of {take} is {taken} and that of {walk} is {walked}. The en-variant is a 'morphological function' which relates one form (the word's stem) to another, allowing the required combination of generalization (by default a form's en-variant adds {ed} to a copy of the form) and exceptionality.
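A morphological function of this kind amounts to a mapping with a default clause plus stored exceptions. A minimal sketch (the exception table is invented for illustration):

```python
# Hedged sketch of a WG-style 'morphological function': the en-variant maps
# a stem-form to another form, with a default (add {ed}) and stored
# exceptions. Both the past participle and the passive participle can then
# be defined as "the en-variant of the stem", with no arbitrary
# cross-reference from one participle to the other.

EN_VARIANT_EXCEPTIONS = {"take": "taken", "give": "given"}  # illustrative entries

def en_variant(stem: str) -> str:
    # By default a form's en-variant adds {ed} to a copy of the form;
    # a stored exception overrides the default.
    return EN_VARIANT_EXCEPTIONS.get(stem, stem + "ed")

print(en_variant("walk"))  # walked
print(en_variant("take"))  # taken
```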

As derivational morphology is responsible for relationships between lexemes, it relates one lexeme's stem to that of another by means of exactly the same apparatus of morphological functions as is used in inflectional morphology - indeed, some morphological functions may be used both in inflection and in derivation (for example, the one which is responsible for adding {ing} is responsible not only for present participles but also for nominalizations such as flooring). Derivational morphology is not well developed in WG, but the outlines of a system are clear. It will be based on abstract lexical relationships such as 'mis-verb' (relating SPELL to MISSPELL) and 'nominalization' (relating it to SPELLING); these abstract relations between words are realized, by default, by (relatively) concrete morphological functions, so, for example, a verb's nominalization is typically realized by the ing-variant of that verb's stem. Of course, not all lexical relationships are realized by derivational morphology, in which related lexemes are partly similar in morphology; the grammar must also relate lexemes where morphology is opaque (e.g. DIE - KILL, BROTHER - SISTER). The network approach allows us to integrate all these relationships into a single grammar without worrying about boundaries between traditional sub-disciplines such as derivational morphology and lexical semantics.

I said at the start of this section that clitics are an exception to the generally clear distinction between morphology and syntax. A clitic is a word whose realization is an affix within a larger word. For example, in He's gone, the clitic 's is a word in terms of syntax, but its realization is a mere affix in terms of morphology. Clitics are atypical because typical words are realized by an entire word-form; but the exceptionality is just a matter of morphology. In the case of 's, I suggest that it isa the word 'BE: present, singular' with the one exceptional feature that its whole isa the morpheme {s} - exactly the same morpheme as we find in plural nouns, other singular verbs and possessives. As in other uses, {s} needs to be part of a complete word-form, so it creates a special form called a 'host-form' to combine it with a suitable word-form to the left.

In more complex cases ('special clitics' - Zwicky 1977) the position of the clitic is fixed by the morphology of the host-form and conflicts with the demands of syntax, as in the French example (3), where en would follow deux (*Paul mange deux en) if it were not attached by cliticization to mange, giving a single word-form en mange.

(3) Paul en mange deux.
    Paul of-them eats two
    'Paul eats two of them.'

Once again we can explain this special behaviour if we analyze en as an ordinary word EN whose shape (whole) is the affix {en}. There is a great deal more to be said about clitics, but not here. For more detail see Hudson (2001) and Camdzic and Hudson (2002).

8. Syntax

As in most other theories, syntax is the best developed part of WG, which offers explanations for most of the 'standard' complexities of syntax such as extraction, raising, control, coordination, gapping and agreement. However the WG view of syntax is particularly controversial because of its rejection of phrase structure. WG belongs to the family of 'dependency-based' theories, in which syntactic structure consists of dependencies between pairs of single words. As we shall see below, WG also recognizes 'word-strings', but even these are not the same as conventional phrases.

A syntactic dependency is a relationship between two words that are connected by a syntactic rule. Every syntactic rule (except for those involved in coordination) is 'carried' by a dependency, and every dependency carries at least one rule that applies to both the dependent and its 'parent' (the word on which it depends). These word-word dependencies form chains which link every word ultimately to the word which is the head of the phrase or sentence; consequently the individual links are asymmetrical, with one word depending on the other for its link to the rest of the sentence. Of course in some cases the direction of dependency is controversial; in particular, published WG analyses of noun phrases have taken the determiner as head of the phrase, though this analysis has been disputed and may turn out to be wrong (Van Langendonck 1994; Hudson 2004). The example in Figure 9 illustrates all these characteristics of WG syntax.

A dependency analysis has many advantages over one based on phrase structure. For example, it is easy to relate a verb to a lexically selected preposition if they are directly connected by a dependency, as in the pair consists of in Figure 9; but it is much less easy (and natural) to do so if the preposition is part of a prepositional phrase. Such lexical interdependencies are commonplace in language, so dependency analysis is particularly well suited to descriptions which focus on 'constructions' - idiosyncratic patterns not covered by the most general rules (Holmes and Hudson 2005). A surface dependency analysis (explained below) can always be translated into a phrase structure by building a phrase for each word consisting of that word plus the phrases of all the words that depend on it (e.g. a sentence; of a sentence; and so on); but


Figure 9

dependency analysis is much more restrictive than phrase-structure analysis because of its total flatness. Because one word can head only one phrase it is impossible to build a dependency analysis which emulates a VP node or 'unary branching'. This restrictiveness is welcome, because it seems that such analyses are never needed.

In contrast, the extra richness of dependency analysis lies partly in the labelled dependency links, and partly in the possibility of multiple dependencies. In a flat structure, in contrast with phrase structure, it is impossible to distinguish co-dependencies (e.g. a verb's subject and object) by configuration, so labels are the only way to distinguish them. There is clearly a theoretical trade-off between phrase structure and labelled functions: the more information is given in one, the less needs to be given in the other. The general theory of WG is certainly compatible with phrase structure - after all, we undoubtedly use part-whole structures in other areas of cognition, and they play an important role in morphology - but it strongly favours dependency analysis because labelled links are ubiquitous in the cognitive network, both in semantics and elsewhere. If knowledge is generally organized in terms of labelled links, why not also in syntax? But if we do use labelled links (dependencies) in syntax, phrase structure is redundant.

Syntactic structures can be much more complex than the example in Figure 9. We shall briefly consider just three kinds of complication: structure-sharing, coordination and unreal words. Structure-sharing is found when one word depends on more than one other word - i.e. when it is 'shared' as a dependent. The notion is familiar from modern phrase-structure analyses, especially Head-driven Phrase Structure Grammar (HPSG) (Pollard and Sag 1994: 19), where it is described as 'the central explanatory mechanism', and it is the main device in WG which allows phrases to be discontinuous. (In



recognizing structure-sharing, WG departs from the European tradition of dependency analysis which generally allows only strictly 'projective', continuous structures such as Figure 9.) Figure 10 illustrates two kinds of structure-sharing - in raising (you shared by have and been) and in extraction (what shared by have, been, looking and at). The label 'x<' means 'extractee', and 's' means 'sharer' (otherwise known as 'xcomp' or 'incomplement').

Figure 10

This diagram also illustrates the notion 'surface structure' mentioned above. Each dependency is licensed by the grammar network, but when the result is structure-sharing, just one of these dependencies is drawn above the words; the totality of dependencies drawn in this way constitutes the sentence's surface structure. In principle any of the competing dependencies could be chosen, but in general only one choice is compatible with the 'geometry' of a well-formed surface structure, which must be free of 'tangling' (crossing dependencies - i.e. discontinuous phrases) and 'dangling' (unintegrated words). There are no such constraints on the non-surface dependencies. (For extensive discussion of how this kind of analysis can be built into a parsing algorithm, see Hudson 2000c; for a comparison with phrase-structure analyses of extraction, see Hudson 2003c.)

The second complication is coordination. The basis of coordination is that conjuncts must share their 'external' dependencies - dependencies (if any) to words outside the coordination. The structure of the coordination itself (in terms of 'conjuncts' and 'coordinators') is analyzed in terms of 'word-strings', simple undifferentiated strings of words whose internal organization is described in terms of ordinary dependencies. A word-string need not be a phrase, but can consist of two (or more) mutually independent phrases as in the example of Figure 11, where the coordination and conjuncts are bounded by brackets: {[...] [...]}.

Unreal words are the WG equivalent of 'empty categories' in other theories. Until recently I have rejected such categories for lack of persuasive evidence; for example, my claim has always been that verbs which appeared to have no subject really didn't have any subject at all. So an imperative (Hurry!) had no subject, rather than some kind of covert subject. However I am now convinced that, for at least some languages, this is wrong. The evidence comes from case-agreement between subjects and predicatives (WG sharers) in


Figure 11

languages such as Icelandic and Ancient Greek (Hudson 2003a); and the conclusion is that some words have no realization (Creider and Hudson, this volume). In this new analysis, therefore, an imperative verb does have a subject: the word you. This is the ordinary word you, with its ordinary meaning, but exceptionally, it is unrealized because this is what imperative verbs require of their subjects. As Creider and I show, unrealized words may explain a wide range of syntactic facts.

This discussion of syntax merely sets the scene for many other syntactic topics, all of which now have reasonably well-motivated WG treatments: word order, agreement, features, case-selection, 'zero' dependents. The most important point made is probably the claim that in syntax the network approach to language and cognition in general leads naturally to dependency analysis rather than to phrase structure.

9. Semantics

As in any other theory, WG has a compositional semantics in which each word in a sentence contributes some structure that is stored as its meaning. However, these meanings are ordinary concepts which, like every other concept, are defined by a network of links to other concepts. This means that there can be no division between 'purely linguistic' meaning and 'encyclopedic' meaning. For instance the lexemes APPLE and PEAR have distinct senses, the ordinary concepts 'apple' and 'pear', each linked to its known characteristics in the network of general knowledge. It would be impossible to distinguish them merely by the labels 'apple' and 'pear' because (as we saw in section 3.2) labels on concepts are just optional mnemonics; the true definition of a concept is provided by its various links to other concepts. The same is true of verb meanings: for example, the sense of EAT is defined by its relationships to other concepts such as 'put', 'mouth', 'chew', 'swallow' and 'food'. The underlying view of meaning is thus similar to Fillmore's Frame Semantics, in which lexical meanings are defined in relation to conceptual 'frames' such as the one for 'commercial transaction' which is exploited by the definitions of 'buy', 'sell' and so on. (See Hudson forthcoming for a WG analysis of commercial transaction verbs.)

Like everything else in cognition, WG semantic structures form a network with labelled links like those that are widely used in Artificial Intelligence. As in Jackendoff's Conceptual Semantics (1990), words of all word classes contribute the same kind of semantic structure, which in WG is divided into 'sense' (general categories) and 'referent' (the most specific individual or category


referred to). The contrast between these two kinds of meaning can be compared with the contrast in morphology (section 7) between stem and whole: a word's lexeme provides both its stem and its sense, while its inflection provides its whole and its referent. For example, the word dogs is defined by a combination of the lexeme DOG and the inflection 'plural', so it is classified as 'DOG: plural'. Its lexeme defines the sense, which is 'dog', the general concept of a (typical) dog, while its inflection defines the referent as a set with more than one member. As in other theories the semantics cannot identify the particular set or individual which a word refers to on a particular occasion of use, and which we shall call simply 'set-s'; this identification process must be left to the pragmatics. But the semantics does provide a detailed specification for what that individual referent might be - in this case, a set, each of whose members is a dog. One WG notation for the two kinds of meaning parallels that for the two kinds of word-form: a straight line for the sense and the stem, which are both retrieved directly from the lexicon, and a curved line for the referent and the shape, which both have to be discovered by inference. The symmetry of these relationships can be seen in Figure 12.

Figure 12

The way in which the meanings of the words in a sentence are combined is guided by the syntax, but the semantic links are provided by the senses themselves. Figure 13 gives the semantic structure for Dogs barked, where the link between the word meanings is provided by 'bark', which has an 'agent' link (often abbreviated 'er' in WG) to its subject's referent. If we call the particular


act of barking that this utterance refers to 'event-e', the semantic structure must show that the agent of event-e is set-s. As with nouns, verb inflections contribute directly to the definition of the referent, but a past-tense inflection does this by limiting the event's time to some time ('t1') that preceded the moment of speaking ('now'). Figure 13 shows all these relationships, with the two words labelled 'w1' and 'w2'. For the sake of simplicity the diagram does not show how these word tokens inherit their characteristics from their respective types.

Figure 13
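The kind of network that Figure 13 depicts might be encoded roughly as follows. This is an assumed encoding for illustration only, not WG notation: each node is a dictionary of labelled links.

```python
# Rough sketch of the semantic network for "Dogs barked": word tokens point
# to senses and referents, and labelled links tie the referents together.
# Node and link names follow the discussion above; the dict encoding itself
# is my own assumption.

semantics = {
    "w1": {"lexeme": "DOG", "inflection": "plural",
           "sense": "dog", "referent": "set-s"},
    "w2": {"lexeme": "BARK", "inflection": "past",
           "sense": "bark", "referent": "event-e"},
    "set-s": {"isa": "set", "member-type": "dog"},
    "event-e": {"isa": "bark", "agent": "set-s",
                "time": "t1"},   # t1 precedes 'now'
}

# The agent link supplied by the verb's sense connects the two referents:
print(semantics["event-e"]["agent"])  # set-s
```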

The analysis of Dogs barked illustrates an important characteristic of WG semantic structures. A word's 'basic' sense - the one that is inherited from its lexeme - is modified by the word's dependents; and the result of this modification is a second sense, more specific than the basic sense but more general than the referent. This intermediate sense contains the meaning of the head word plus its dependent, so in effect it is the meaning of that phrase. In contrast with the syntax, therefore, the semantic structure contains a node for each phrase, as well as nodes for the individual words - in short, a phrase structure. Moreover, this phrase structure must be strictly binary because there are reasons for believing that dependents modify the head word one at a time, each defining a distinct concept, and that the order of combining may correspond roughly to the bracketing found in conventional phrase structure. For example, although subjects and objects are co-dependents, subjects seem to modify the concepts already defined by objects, rather than the other way round, so Dogs chase cats defines the concepts 'chase cats' and 'dogs chase cats', but not 'dogs chase' - in short, a WG semantic structure contains something like a VP node. This step-wise composition of word meanings is called 'semantic phrasing'.

This brief account of WG semantics has described some of the basic ideas,


but has not been able to illustrate the analyses that these ideas permit. In the WG literature there are extensive discussions of lexical semantics, and some explorations of quantification, definiteness and mood. However it has to be said that the semantics of WG is much less well researched than its syntax.

10. Processing

The main achievements on processing are a theory of parsing and a theory of syntactic difficulty; but current research is focused on a general theory of cognitive processing in which language processing falls out as a particular case (Hudson, in preparation). In this theory, processing is driven by a combination of spreading activation, default inheritance and binding. Like any other psychological model it needs to be tested, and one step towards this has been taken by building two computer systems called WGNet++ (see www.phon.ucl.ac.uk/home/WGNet/wgnet++.htm) and Babbage (www.babbagenet.org) for experimenting with complex networks.

The most obvious advantage of WG for a parser, compared with transformational theories, is the lack of freely-occurring 'invisible' words (in contrast with the unrealized words discussed above, which can always be predicted from other realized words such as imperative verbs); but the dependency basis also helps by allowing each incoming word to be integrated with the words already processed without the need to build (or rebuild) higher syntactic nodes.

A very simple algorithm guides the search for dependencies in a way that guarantees a well-formed surface structure (in the sense defined in section 8): the current word first tries to 'capture' the nearest non-dependent word as its dependent, and if successful repeats the operation; then it tries to 'submit' as a dependent to the nearest word that is not part of its own phrase (or, if unsuccessful, to the word on which this word depends, and so on recursively up the dependency chain); and finally it checks for coordination. (More details can be found in Hudson 2000c.) The algorithm is illustrated in the following sequence of 'snapshots' in the parsing of Short sentences make good examples, where the last word illustrates the algorithm best. The arrows indicate syntactic dependencies without the usual labels; and it is to be understood that the semantic structure is being built simultaneously, word by word. The structure after ':-' is the output of the parser at that point.

(4)
a. w1 = short.      No progress  :- w1.
b. w2 = sentences.  Capture      :- w1 w2.
c. w3 = make.       Capture      :- w1 w2 w3.
d. w4 = good.       No progress  :- w1 w2 w3, w4.
e. w5 = examples.   Capture      :- w4 w5.
f.                  Submit       :- w1 w2 w3 (w4 <-) w5.
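A much-simplified version of the capture/submit loop can be written out as follows. This is a sketch under my own assumptions: the toy grammar, the word-class assignments and the non-recursive 'submit' step (which checks only the nearest parentless word, not the whole dependency chain) are all simplifications of the algorithm described above, and coordination is omitted.

```python
# Simplified sketch of incremental dependency parsing by 'capture' and
# 'submit'. A toy grammar states which word classes may depend on which;
# 'parent' maps each word's index to the index of the word it depends on.

CLASS = {"short": "adj", "sentences": "noun", "make": "verb",
         "good": "adj", "examples": "noun"}          # assumed toy lexicon
MAY_DEPEND_ON = {("adj", "noun"), ("noun", "verb")}  # assumed toy grammar

def parse(words):
    parent = {}   # word index -> index of its parent
    stack = []    # indices of words with no parent yet, left to right
    for i, w in enumerate(words):
        # Capture: repeatedly take the nearest parentless word to the left
        # as a dependent, as long as the grammar allows it.
        while stack and (CLASS[words[stack[-1]]], CLASS[w]) in MAY_DEPEND_ON:
            parent[stack.pop()] = i
        # Submit: try to depend on the nearest available word to the left
        # (simplified: just the top of the stack); else remain parentless.
        if stack and (CLASS[w], CLASS[words[stack[-1]]]) in MAY_DEPEND_ON:
            parent[i] = stack[-1]
        else:
            stack.append(i)
    return parent

deps = parse(["short", "sentences", "make", "good", "examples"])
print(deps)  # {0: 1, 1: 2, 3: 4, 4: 2}
```

As in snapshot f above, the last word does the most work: examples first captures good and then submits to make.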


surface structure. For example, subject-raising in He has been working is shown by non-surface subject links from both been and working to he. Once the surface structure is in place, these extra dependencies can be inferred more or less mechanically (bar ambiguities), with very little extra cost to the parser.

The theory of syntactic complexity (Hudson 1996b) builds on this incremental parsing model. The aim of the parser is to link each word as a dependent to some other word, and this link can most easily be established while both words are still active in working memory. Once a word has become inactive it can be reconstructed (on the basis of the meaning that it contributed), but this is costly. The consequence is that short links are always preferred to long ones. This gives a very simple basis for calculating the processing load for a sentence (or even for a whole text): the mean 'dependency distance' (calculated as the number of other words between a word and the word on which it depends). Following research by Gibson (1998) the measure could be made more sophisticated by weighting intervening words, but even the simple measure described here gives plausible results when applied to sample texts (Hiranuma 2001). It is also supported by a very robust statistic about English texts: that dependency links tend to be very short. (Typically 70 per cent of words are adjacent to the word on which they depend, with 10 per cent variation in either direction according to the text's difficulty.)
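The measure itself is trivial to compute; a sketch (the word-index to parent-index map is an assumed representation of a dependency analysis, here the one for Short sentences make good examples):

```python
# Sketch of the mean 'dependency distance' measure: for each word, count
# the other words intervening between it and its parent, then average
# over the sentence.

def mean_dependency_distance(parents: dict) -> float:
    # A word at index i depending on a word at index j has
    # abs(i - j) - 1 intervening words.
    distances = [abs(i - j) - 1 for i, j in parents.items()]
    return sum(distances) / len(distances)

# "Short sentences make good examples" (word indices 0-4):
# short->sentences, sentences->make, good->examples, examples->make.
deps = {0: 1, 1: 2, 3: 4, 4: 2}
print(mean_dependency_distance(deps))  # 0.25
```

Only examples is non-adjacent to its parent (good intervenes), which is why the mean is so low; this matches the robust statistic quoted above that most English words are adjacent to the word they depend on.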

11. Conclusions

WG addresses questions from a number of different research traditions. As in formal linguistics, it is concerned with the formal properties of language structure; but it also shares with cognitive linguistics a focus on how these structures are embedded in general cognition. Within syntax, it uses dependencies rather than phrase structure but also recognizes the rich structures that have been highlighted in the phrase-structure tradition. In morphology it follows the European tradition which separates morphology strictly from syntax, but also allows exceptional words which (thanks to cliticization) contain the forms of smaller words. And so on through other areas of language. Every theoretical decision is driven by two concerns: staying true to the facts of language, and providing the simplest possible explanation for these facts. The search for new insights is still continuing, and more cherished beliefs may well have to be abandoned; but the most general conclusion so far seems to be that language is mostly very much like other areas of cognition.

References

Aitchison, Jean (1987), Words in the Mind: An Introduction to the Mental Lexicon. Oxford: Blackwell.
Altmann, Gerry (1997), The Ascent of Babel: An Exploration of Language, Mind and Understanding. Oxford: Oxford University Press.
Anderson, John (1971), 'Dependency and grammatical functions'. Foundations of Language, 7, 30-7.
Bates, Elizabeth, Elman, Jeffrey, Johnson, Mark, Karmiloff-Smith, Annette, Parisi,


Domenico and Plunkett, Kim (1998), 'Innateness and emergentism', in William Bechtel and George Graham (eds), A Companion to Cognitive Science. Oxford: Blackwell, pp. 590-601.
Bennett, David (1994), 'Stratificational Grammar', in Ronald Asher (ed.), Encyclopedia of Language and Linguistics. Oxford: Elsevier, pp. 4351-56.
Bresnan, Joan (1978), 'A realistic transformational grammar', in Morris Halle, Joan Bresnan and George Miller (eds), Linguistic Theory and Psychological Reality. Cambridge, MA: MIT Press, pp. 1-59.
Brown, Dunstan, Corbett, Greville, Fraser, Norman, Hippisley, Andrew and Timberlake, Alan (1996), 'Russian noun stress and network morphology'. Linguistics, 34, 53-107.
Butler, Christopher (1985), Systemic Linguistics: Theory and Application. London: Arnold.
Camdzic, Amela and Hudson, Richard (2002), 'Clitics in Serbo-Croat-Bosnian'. UCL Working Papers in Linguistics, 14, 321-54.
Chekili, Ferid (1982), 'The Morphology of the Arabic Dialect of Tunis' (Unpublished doctoral dissertation, University of London).
Chomsky, Noam (1957), Syntactic Structures. The Hague: Mouton.
— (1965), Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.
— (1970), 'Remarks on nominalization', in Roderick Jacobs and Peter Rosenbaum (eds), Readings in Transformational Grammar. London: Ginn, pp. 184-221.
Creider, Chet (1999), 'Mixed categories in Word Grammar: Swahili infinitival nouns'. Linguistica Atlantica, 21, 53-68.
Creider, Chet and Hudson, Richard (1999), 'Inflectional morphology in Word Grammar'. Lingua, 107, 163-87.
— (Ch. 2 in this volume), 'Case agreement in Ancient Greek: implications for a theory of covert elements'.
Eppler, Eva (2005), 'The Syntax of German-English Code-switching' (Unpublished doctoral dissertation, UCL).
Evans, Roger and Gazdar, Gerald (1996), 'DATR: a language for lexical knowledge representation'. Computational Linguistics, 22, 167-216.
Fillmore, Charles (1975), 'An alternative to checklist theories of meaning'. Proceedings of the Berkeley Linguistics Society, 1, 123-31.
— (1976), 'Frame semantics and the nature of language'. Annals of the New York Academy of Sciences, 280, 20-32.
Fraser, Norman (1985), 'A Word Grammar Parser' (Unpublished doctoral dissertation, University of London).
— (1989), 'Parsing and dependency grammar'. UCL Working Papers in Linguistics, 1, 296-319.
— (1993), 'Dependency Parsing' (Unpublished doctoral dissertation, UCL).
— (1994), 'Dependency Grammar', in Ronald Asher (ed.), Encyclopedia of Language and Linguistics. Oxford: Elsevier, pp. 860-4.
Fraser, Norman and Hudson, Richard (1992), 'Inheritance in Word Grammar'. Computational Linguistics, 18, 133-58.
Gibson, Edward (1998), 'Linguistic complexity: locality of syntactic dependencies'. Cognition, 68, 1-76.
Gisborne, Nikolas (1993), 'Nominalisations of perception verbs'. UCL Working Papers in Linguistics, 5, 23-44.
— (1996), 'English Perception Verbs' (Unpublished doctoral dissertation, UCL).
— (2000), 'The complementation of verbs of appearance by adverbs', in Ricardo Bermudez-Otero, David Denison, Richard Hogg and C. McCully (eds), Generative


Theory and Corpus Studies: A Dialogue from 10 ICEHL. Berlin: Mouton de Gruyter, pp. 53-75.
— (2001), 'The stative/dynamic contrast and argument linking'. Language Sciences, 23, 603-37.
Goldberg, Adele (1995), Constructions: A Construction Grammar Approach to Argument Structure. Chicago: University of Chicago Press.
Gorayska, Barbara (1985), 'The Semantics and Pragmatics of English and Polish with Reference to Aspect' (Unpublished doctoral dissertation, UCL).
Halliday, Michael (1961), 'Categories of the theory of grammar'. Word, 17, 241-92.
— (1967-8), 'Notes on transitivity and theme in English'. Journal of Linguistics, 3, 37-82, 199-244; 4, 179-216.
— (1985), An Introduction to Functional Grammar. London: Arnold.
Hiranuma, So (1999), 'Syntactic difficulty in English and Japanese: A textual study'. UCL Working Papers in Linguistics, 11, 309-21.
— (2001), 'The Syntactic Difficulty of Japanese Sentences' (Unpublished doctoral dissertation, UCL).
Holmes, Jasper (2004), 'Lexical Properties of English Verbs' (Unpublished doctoral dissertation, UCL).
Holmes, Jasper and Hudson, Richard (2005), 'Constructions in Word Grammar', in Jan-Ola Ostman and Mirjam Fried (eds), Construction Grammars: Cognitive Grounding and Theoretical Extensions. Amsterdam: Benjamins, pp. 243-72.
Hudson, Richard (1964), 'A Grammatical Analysis of Beja' (Unpublished doctoral dissertation, University of London).
— (1970), English Complex Sentences: An Introduction to Systemic Grammar. Amsterdam: North-Holland.
— (1976), Arguments for a Non-transformational Grammar. Chicago: University of Chicago Press.
— (1980a), Sociolinguistics. Cambridge: Cambridge University Press.
— (1980b), 'Constituency and dependency'. Linguistics, 18, 179-98.
— (1981), 'Panlexicalism'. Journal of Literary Semantics, 10, 67-78.
— (1984), Word Grammar. Oxford: Blackwell.
— (1989), 'Towards a computer-testable Word Grammar of English'. UCL Working Papers in Linguistics, 1, 321-39.
— (1990), English Word Grammar. Oxford: Blackwell.
— (1992), 'Raising in syntax, semantics and cognition', in Iggy Roca (ed.), Thematic Structure: Its Role in Grammar. The Hague: Mouton, pp. 175-98.
— (1993), 'Do we have heads in our minds?', in Greville Corbett, Scott McGlashen and Norman Fraser (eds), Heads in Grammatical Theory. Cambridge: Cambridge University Press, pp. 266-91.
— (1995), Word Meaning. London: Routledge.
— (1996a), Sociolinguistics (2nd edition). Cambridge: Cambridge University Press.
— (1996b), 'The difficulty of (so-called) self-embedded structures'. UCL Working Papers in Linguistics, 8, 283-314.
— (1997a), 'The rise of auxiliary DO: verb non-raising or category-strengthening?'. Transactions of the Philological Society, 95, 41-72.
— (1997b), 'Inherent variability and linguistic theory'. Cognitive Linguistics, 8, 73-108.
— (1998), English Grammar. London: Routledge.
— (2000a), '*I amn't'. Language, 76, 297-323.
— (2000b), 'Gerunds and multiple default inheritance'. UCL Working Papers in Linguistics, 12, 303-35.
— (2000c), 'Discontinuity'. Traitement Automatique des Langues, 41, 15-56.

Page 50: Word Grammar (Richard Hudson)

WHAT IS WORD GRAMMAR? 31

- (2001), 'Cities in Word Grammar'. UCL Working Papers in Linguistics, 13, 293-4.— (2003a), 'Case-agreement, PRO and structure-sharing'. Research in Language, 1, 7-33.— (2003b), 'Mismatches in default inheritance', in Elaine Francis and Laura Michaelis

(eds), Linguistic Mismatch: Scope and Theory. Stanford: CSLI, pp. 269-317.- (2003c), 'Trouble on the left periphery'. Lingua, 113, 607-42.— (2004), 'Are determiners heads?'. Functions of Language,, 11, 7-43.— (forthcoming), 'Buying and selling in Word Grammar', in Jozsef Andor and Peter

Pelyvas (eds), Empirical, Cognitive-Based Studies In The Semantics-PragmaticsInterface. Oxford: Elsevier.

— (in preparation) Advances in Word Grammar. Oxford: Oxford University Press.Hudson, Richard and Holmes, Jasper (2000) 'Re-cycling in the Encyclopedia', in Bert

Peeters (ed. ), The Lexicon/Encyclopedia Interface. Oxford: Elsevier, pp. 259-90.Jackendoff, Ray (1990), Semantic Structures. Cambridge, MA: MIT Press.Kamiloff-Smith, Annette (1992), Beyond Modularity: A developmental perspective on

cognitive science. Cambridge, Mass.: MIT Press.Kaplan, Ron and Bresnan, Joan (1982), 'Lexical-functional Grammar: a formal system

for grammatical representation', in Joan Bresnan (ed. ), The Mental Representation ofGrammatical Relations. Cambridge, MA: MIT Press, pp. 173-281.

Kreps, Christian (1997), 'Extraction, Movement and Dependency Theory' (Unpub-lished doctoral dissertation, UCL).

Lakoff, George (1987), Women, Fire and Dangerous Things: What Categories Revealabout the Mind. Chicago: University of Chicago Press.

Lamb, Sidney (1966), An Outline of Stratificational Grammar. Washington, DC:Georgetown University Press.

— (1999), Pathways of the Brain: The Neurocognitive Basis of Language. Amsterdam:Benjamins.

Langacker, Ronald (1987), Foundations of Cognitive Grammar I. Theoretical Prerequisites.Stanford: Stanford University Press.

— (1990), Concept, Image and Symbol. The Cognitive Basis of Grammar. Berlin: De Gruyter.Luger, George and Stubblefield, William (1993), Artificial Intelligence. Structures and

Strategies for Complex Problem Solving. Redwood City, CA: Benjamin/CummingsPub. Co.

Lyons, John (1977), Semantics. Cambridge: Cambridge University Press.McCawley, James (1968), 'Concerning the base component of a transformational

grammar'. Foundations of Language, 4, 243-69.Pinker, Steven (1994), The Language Instinct. Harmondsworth: Penguin Books.Pinker, Steven and Prince, Alan (1988), 'On language and connectionism: Analysis of a

Parallel Distributed Processing model of language acquisition'. Cognition, 28, 73-193.Pollard, Carl and Sag, Ivan (1994), Head-driven Phrase Structure Grammar. Chicago:

University of Chicago Press.Reisberg, Daniel (1997), Cognition. Exploring the Science of the Mind. New York: W. W.

Norton.Robinson, Jane (1970), 'Dependency structures and transformational rules'. Language,

46, 259-85.Rosta, Andrew (1994), 'Dependency and grammatical relations'. UCL Working Papers

in Linguistics, 6, 219-58.— (1996), 'S-dependency'. UCL Working Papers in Linguistics, 8, 387-421.— (1997), English Syntax and Word Grammar Theory. Unpublished doctoral dissertation,

UCL, London.Shaumyan, Olga (1995), Parsing English with Word Grammar. Imperial College London

MSc Thesis.

Page 51: Word Grammar (Richard Hudson)

32 WORD GRAMMAR: PERSPECTIVES ON LANGUAGE STRUCTURE

Sugayama, Kensei (1991), 'More on unaccusative Sino-Japanese complex predicates inJapanese'. UCL Working Papers in Linguistics, 3, 397-415.

— (1992), 'A word-grammatic account of complements and adjuncts in Japanese(interim report)'. Kobe City University Journal, 43, 89-99.

— (1993), 'A word-grammatic account of complements and adjuncts in Japanese'.Proceedings of the 15th International Congress of Linguistics, Vol. 2, Universite Laval,pp. 373-6.

— (1996), 'Semantic structure of eat and its Japanese equivalent taberu: a Word-Grammatic account'. Translation and Meaning, 4, 193-202.

Taylor, John (1989), Linguistic Categorisation: An Essay in Cognitive Linguistics. Oxford:Oxford University Press.

Tesniere, Lucien (1959), Elements de Syntaxe Structurale. Paris: Klincksieck.Touretzky, David (1986), The Mathematics of Inheritance Systems. Los Altos, CA: M.

Kaufmann Publishers.Tzanidaki, Dimitra (1995), 'Greek word order: towards a new approach'. UCL Working

Papers in Linguistics, 7, 247-77.— (1996a), 'Configurationality and Greek clause structure'. UCL Working Papers in

Linguistics, 8, 449-84.— (1996b), 'The Syntax and Pragmatics of Subject and Object Position in Modern

Greek' (Unpublished doctoral dissertation, UCL).van Langendonck, Willy (1987), Word Grammar and child grammar'. Belgian Journal

of Linguistics, 2, 109-32.— (1994), 'Determiners as heads?'. Cognitive Linguistics, 5, 243-59.Volino, Max (1990), 'Word Grammar, Unification and the Syntax of Italian Clitics'

(Unpublished doctoral dissertation, Edinburgh University).Zwicky, Arnold (1977), On Clitics. Bloomington: IULC.— (1992), 'Some choices in the theory of morphology', in Robert Levine (ed. ), Formal

Grammar: Theory and Implementation. Oxford: Oxford University Press, pp. 327-71.


Part I

Word Grammar Approaches to Linguistic Analysis:
Its explanatory power and applications


2. Case Agreement in Ancient Greek: Implications for a theory of covert elements

CHET CREIDER AND RICHARD HUDSON

Abstract

In Ancient Greek a predicative adjective or noun agrees in case with the subject of its clause, even if the latter is covert. This provides compelling evidence for 'empty' (i.e. covert) elements in syntax, contrary to the tradition in WG theory. We present an analysis of empty elements which exploits a feature unique to WG, the separation of 'existence' propositions from propositions dealing with other properties; an empty word has the property of not existing (or, more technically, a quantity of 0). We contrast this theory with the Chomskyan PRO and the Head-driven Phrase Structure Grammar (HPSG) 'potential' SUBCAT list.

1. Introduction

Case agreement in Ancient Greek1 has attracted a small but varied set of treatments in the generative tradition (Andrews 1971; Lecarme 1978; Quicoli 1982). In this literature the problems were framed and solved in transformational frameworks. In the present chapter we wish to consider the data from the point of view of the problems they pose for a theory of case assignment and phonologically empty elements in a modern, declarative framework - Word Grammar (WG; Hudson 1990). We present an analysis of empty elements which exploits a feature unique to WG, the separation of existence propositions from propositions dealing with other properties; and we contrast it with earlier WG analyses in which these 'empty' elements are simply absent, and with Chomskyan analyses in terms of PRO, a specific linguistic item which is always covert. The proposed analysis is similar in some respects to the one proposed by Pollard and Sag (1994) for HPSG.

2. The Data

We confine our attention to infinitival constructions. The infinitive in Ancient Greek is not inflected for person, number or case and hence, when predicate adjectives and predicate nominals appear as complements of infinitives, it is necessary to account for the case of these elements. One purpose of this discussion is to show that traditional grammars are right to explain the case of predicates in terms of agreement with the subject, but this analysis works most naturally if we also assume some kind of 'null' subject for some infinitives. The examples that support the null subject take the accusative case, and are discussed in section 2.1; there are well-known exceptions which are traditionally described in terms of 'attraction' and which are discussed in section 2.2.

2.1 Accusative subjects

Traditional grammars of Greek state that the subject of an infinitive takes the accusative case.

Examples are usually given of the accusative plus infinitive construction, as in the following:

(1) ekeleuon autous poreuesthai
    they-ordered them(acc) to-proceed
    'they ordered that they should proceed' (Smyth 1956: 260, X. A. 4.2.12)

(2) phe:si tous andras apelthein
    s/he-says the(acc) men(acc) to-go-away
    's/he says that the men went away' (Goodwin 1930: 196)

A partial syntactic analysis of (2) is shown in Figure 1. In this analysis the infinitive is a dependent (object) of the main verb, and it has a dependent (subject) which bears the accusative case. We assume a standard analysis in which the definite determiner is a head with respect to a 'determiner phrase'.

Figure 1

Since the subject is accusative, elements (predicate nouns and adjectives) which agree with it are accusative (in contrast with the nominative case found when the subject is nominative):

(3) a. Klearkhos phugas e:n
       Clearchus(nom) exile(nom) was (contrast phugada, 'exile(acc)')
       'Clearchus was an exile.' (X. A. 1.1.9)

    b. nomizo: gar huma:s emoi einai kai patrida kai philous
       I-think for you(acc) me(dat) to-be and fatherland(acc) and friends(acc)
       'for I think you are to me both fatherland and friends' (X. A. 1.3.6)


The agreement of a predicative with the subject can be conveniently diagrammed as in Figure 2. (In words, whatever a verb's subject and predicative may be, their case must be the same.)

However, note that the predicative may be accusative even when the accusative subject is itself absent:

(4) philanthro:pon einai dei
    humane(acc) to-be must
    '(one) must be humane' (I. 2.15)

(5) oud' ara po:s e:n en pantess' ergoisi dae:mona
    not then in-any-way was in all works skilled(acc)
    pho:ta genesthai
    man(acc) to-become
    '(one) could not then in any way become a man skilled in all works' (H. Il. 23.670-71)

This can also be true even when there is a coreferential element in the higher clause:

(6) exarkesei soi turannon genesthai
    it-will-suffice you(dat) king(acc) to-become
    'it will be enough for you to become king' (Liddell and Scott 1971, P. Alc. 2.141a)

Figure 2

Figure 3


A partial structure for (6) is presented in Figure 3. Such examples raise urgent questions about the status of 'understood' subjects. If an understood subject is simply one which is 'understood' in the semantics but entirely absent from the syntax, it is hard to explain the case of the predicative in these examples. We return to these questions below.

When the subject of the infinitive is identical to that of the main verb, it is normally not expressed:

(7) all' hod' ane:r ethelei peri panto:n emmenai allo:n
    but this man(nom) he-wishes above all(gen) to-be others(gen)
    'but this man wishes to be above all others' (H. Il. 1.287)

Agreeing elements may nevertheless appear in the accusative:

(8) enth' eme men pro:tisth' hetaroi lissonto
    then me(acc) then first-of-all companions(nom) they-begged
    epeesin turo:n ainumenous ienai palin
    words(dat) cheeses(gen) taking(acc-pl) to-go back
    'thereupon my companions then first of all begged me with words to take (i.e. that they might take) some of the cheeses (and) to depart' (H. Od. 9.224-5)

Examples like (8) are hard to explain without assuming some kind of accusative subject for the infinitive with which the predicative participle ('taking') can agree, as hinted at in Figure 4; but of course there is no overt accusative subject.

Figure 4

In situations of emphasis, an infinitival subject may be expressed (Chantraine 1953: 312; Kühner and Gerth 1955, 2: 30-31; Smyth 1956: 439). When expressed it appears in the accusative case:

(9) ego: men toinun eukhomai prin tauta epidein huph' humo:n
    I(nom) as-for therefore pray (that) before these-things to-see by you
    genomena murias eme ge kata te:s ge:s orguias
    having-become 10,000 me(acc) indeed under the earth fathoms
    genesthai
    to-become
    'for my part, therefore, I pray that before I see these things having been brought about by you, I may be ten thousand fathoms under the earth' (Kühner and Gerth 1955: 31, X. A. 7.1.30)


(10) hoi Aiguptioi enomizon heo:utous pro:tous genesthai
     the(nom) Egyptians(nom) they-thought themselves(acc) first(acc) to-be
     panto:n anthro:po:n
     all(gen) human-beings(gen)
     'the Egyptians used to think they were the first of all human beings' (Kühner and Gerth 1955: 31, Hdt. 2.2)

(11) to:n d' allo:n eme phe:mi polu propheresteron einai
     of-those others me(acc) I-say by-far better(acc) to-be
     'but of those others I say I am better by far' (H. Od. 8.221)

(Other examples for Homeric Greek in Il. 7.198, 13.269, 20.361 - Chantraine 1953: 312.)

The emphasis need not be strong, as the following example, with unstressed clitic pronoun, shows:

(12) kai te me phe:si makhe: Tro:essi are:gein
     and in-fact me(acc) s/he-says battle(dat) Trojans(dat) to-help
     'and she says that I help the Trojans in battle' (H. Il. 1.521)

When the infinitive is used in exclamations with an overt subject, the latter appears in the accusative:

(13) eme pathein tade
     me(acc) to-suffer this
     'That I should suffer this!' (A. Eum. 837)

These examples with overt accusative subjects strongly support the traditional rule that infinitives have accusative subjects, so the question is how to allow this generalization to extend to infinitives which appear to have no subject at all, in order to explain the accusative cases found on predicatives in infinitival clauses in examples such as (4) and (5).

2.2 Non-accusative subjects

Greek provides a number of interesting alternatives to the possibilities given in section 2.1. These are traditionally discussed under two headings, although the process is the same in both cases:

Very many of the verbs which take an infinitive also have a personal object with them, which stands in the case that the verb requires... When adjectival or substantival predicate complements are added to the infinitive, these stand either, by attraction, in the same case as the personal object, or, where the attraction is disregarded, in the accusative. (Kühner and Gerth 1955: 24)3

But when the subject of the governing verb is at the same time also the subject of the infinitive, the subject of the infinitive is... omitted, and when adjectival or substantival predicate complements stand with the infinitive, these are put into the nominative by attraction. (ibid.: 29)4


In short, the predicative of an infinitive may have a case which is 'attracted' to that of a nominal in the higher clause, whether its object or its subject. Examples:

(14) emoi de ke kerdion eie: seu aphamartouse: khthona
     me(dat) but would better it-be you(gen) losing(dat) earth(acc)
     dumenai
     to-go(beneath)
     'but for me it would be better losing you to die' (H. Il. 6.410-11)

(15) dokeo: he:mi:n Aigine:teo:n deesthai ton theon khre:sai
     I-think us(dat) Aeginetans(gen) to-beg the(acc) god(acc) to-advise
     timo:re:te:ro:n genesthai
     helpers(gen) to-become
     'I think the god has advised us to beg the Aeginetans to become (our) helpers' (Hdt. 5.80)

Examples of attraction show that some infinitives do not have accusative subjects, but they do not undermine the generalization that many do. The analysis of attraction is tangential to our present concern, but is easily accomplished via 'structure-sharing',5 where the higher nominal doubles up as subject of the infinitival clause - for example, in (15) the genitive noun 'Aeginetans' is not only the complement of the higher verb 'beg' but also subject of the lower infinitive. The proposed structure for this remarkably complicated sentence is shown in Figure 5.

Figure 5
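The effect of structure-sharing can be illustrated with a small sketch (our own, purely illustrative encoding, not part of the WG formalism): if the two dependency roles point at one and the same word node rather than at copies of it, the shared word's case is automatically visible from both clauses.

```python
# Illustrative sketch (assumed encoding, not the authors' notation):
# 'structure-sharing' means one and the same word token fills roles in
# two clauses at once. In (15), 'Aeginetans' is both complement of
# 'beg' (deesthai) and subject of the lower infinitive (genesthai),
# so its genitive case is necessarily the same for both.

word = {'form': 'Aigine:teo:n', 'case': 'genitive'}

beg = {'lexeme': 'deesthai', 'complement': word}
become = {'lexeme': 'genesthai', 'subject': word}  # same object, not a copy

# Because the two roles point at one node, the case is shared by construction:
print(beg['complement'] is become['subject'])  # True
print(become['subject']['case'])               # genitive
```

The design point is that sharing is identity of nodes, not copying plus an agreement rule: nothing extra is needed to keep the two cases in step.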

This analysis easily explains why the lower nominal predicate 'helpers' has the same case as this shared nominal, but it does not help with examples where even the higher nominal is merely implicit, as in (16). (We give an explanation for examples of this type in section 6.)

(16) ethelo: de toi e:pios einai
     I-wish but you(dat) kind(nom) to-be
     'but I wish to be kind to you' (H. Il. 8.40)


According to Chantraine (1953: 313) the relative frequency of attraction increased from Homer forward into Attic authors, and in the Attic period it appears to have been obligatory in cases like (16), i.e. there are no examples like (8) in Attic Greek. This may have been the reason traditional grammars discuss attraction under two headings, one for nominative cases and the other for oblique cases.

3. The Analysis of Case Agreement

First, note that morphological case, unlike gender or number, is a purely morpho-syntactic property, so it is available to words and not to their meanings. One consequence is that it is independent of reference.

(17) I saw him yesterday while he was on his way to the beach.

In (17) the three pronouns share a common set of semantic features (gender and number) and have a common referent, but occur respectively in the 'objective', 'subjective' and 'possessive' cases (to use the terms of traditional grammar). So far as we know, semantic co-reference between a nominal and a pronoun never triggers case agreement in the latter, though it often requires agreement for number and gender. A further consequence is that the only possible 'target' for case agreement is a word (or other syntactic element); this rules out a semantic account of case agreement, however attractive such an account might be for number and gender.

Thus, faced with examples such as (18=6), where an infinitive has an accusative predicative but no overt subject, we cannot explain the predicative's case merely by postulating a subject argument in the semantic structure without a supporting syntactic subject.

(18) exarkesei soi turannon genesthai
     it-will-suffice you(dat) king(acc) to-become
     'it will be enough for you to become king'

The argument X in the semantic structure 'X becoming a king' cannot by itself carry the accusative case; there must also be some kind of accusative subject nominal in the syntactic structure. Nor does a control analysis help, because the controlling nominal is the pronoun soi, 'to you', which is dative; as expected, its case has nothing to do with that of coreferential nominals.

The analysis seems to show that a specifically syntactic subject must be present in order to account for the accusative case seen on the predicate nominal. We accept this conclusion, though somewhat reluctantly, because it conflicts with the stress on concreteness which we consider an important principle of Word Grammar. We are sceptical about the proliferation of empty nodes in Chomskyan theories, and have always felt that the evidence for such nodes rested heavily on theory-internal assumptions which we did not share. In contrast, the evidence from case agreement strikes us as very persuasive, so we now believe that syntactic structure may contain some 'null' elements which are not audible or visible, such as a case-bearing subject in Ancient Greek infinitival clauses; this is the analysis that we shall develop in the rest of this chapter. (For a fuller statement of this argument and conclusion, see Hudson 2003.)

Fortunately, it seems that this evidence is supported by completely independent data from other languages. For example, Welsh soft mutation and agreement are much easier to explain if we allow syntactic subjects to be inaudible (Borsley 2005). Welsh mutation applies to verb dependents which are separated from the verb. Normally these are objects, as in (19):

(19) Gweles (i) gi.
     saw-1SG (I) dog
     'I saw a dog.'

Here gi is the mutated form of ci, 'dog', whose form shows it to be object rather than subject even when the optional subject i is omitted. Conversely, however, subjects are also mutated if they are delayed, as in (20):

(20) a. Mae ci yn yr ardd.
        is dog in the garden
        'A dog is in the garden.'

b. Mae yn yr ardd gi.

In sentence (a), ci is in the unmutated form expected of a subject, but it is mutated in (b) because it has been separated from the verb mae. The generalization seems to be that if a subject or object dependent is separated from the verb, it is mutated; but this generalization presupposes that there is always a syntactic subject in examples like (19), even when no subject is audible. The same conclusion also simplifies the treatment of verb agreement, which is confined to verbs whose subject is a personal pronoun. In (19), the suffix {es} on gweles can be said to agree with a first-person singular subject even when this is covert. In short, inaudible subjects make the grammar of Welsh simpler and more explanatory, a possibility which we assume occurs to naive learners of the language as well as to linguists.

In the rest of this chapter we explore the notion 'null element' within the theoretical framework of Word Grammar. What exactly does it mean to say that an element is 'null' in a cognitive theory of language which maximizes the similarities between language and other kinds of cognition? Having introduced the relevant ideas we shall contrast our view of null elements with the more familiar ideas about elements such as PRO and pro, as well as with other proposals from the WG and HPSG traditions.

4. Non-Existent Entities in Cognition and in Language

One of the rather obvious facts about everyday knowledge is that we know things about entities which we know not to exist. For example, we know that Father Christmas brings presents, wears a red coat and has a beard; but we also know that he doesn't exist. How can we know the characteristics of a non-existent object? The answer must be that 'existence' is somehow separable from other kinds of characteristic. However, there is a serious danger of an internal contradiction, because it is also clear that the concept of Father Christmas does exist, complete with the links to beards, red coats and presents, even for those of us who know he does not exist. How can this contradiction be avoided?

One possible answer follows from a basic assumption of Word Grammar: that tokens and types are distinct concepts, each with a separate representation in the mind (Hudson 1984: 24; Hudson 1990: 31-2). Tokens exist in the realm of ongoing experience, while types exist in permanent knowledge; in other words, roughly speaking, tokens are represented in working memory and types in long-term memory. For example, when we see a bird, we assume that we introduce a new concept to represent it in our minds, a token concept which is distinct from every permanent concept we have for birds or bird species. Having introduced this distinct concept we can then classify it (e.g. it's a robin), notice unusual features and remember it. None of this is possible if tokens and types share the same mental nodes.

Another difference between tokens and types is that tokens are part of our ongoing experience; in short, they are 'real', whereas types are merely memories and may even be fictional. For example, we have a permanent concept for Father Christmas, complete with a list of attributes. This concept is just like those for other people except that we know he's not real; in other words, we know we will never meet a token of Father Christmas (even if we do meet tokens of people pretending to be tokens of Father Christmas). This contrast between real and unreal types can be captured in WG by an attribute which we call 'quantity' (the 'quantitator' of Hudson 1990: 23). If every token of experience has a quantity of 1, then real and unreal types can be distinguished by their quantities - 1 for real and 0 for unreal. If Father Christmas has a quantity of 0, then any putative token of Father Christmas would be (highly) exceptional.

The example of Father Christmas is rather isolated because most of our concepts are based firmly on experience. However, there is an even more important role for the quantity variable, which is in inherited attributes. For example, the default person has a foot on each leg, but some unfortunate individuals have lost a foot. The quantity variable provides a mechanism for stating the default (each leg has one foot), and then for overriding it in the case of specified individuals (e.g. the pantomime character Long John Silver has no foot on his left leg). Potentially inheritable attributes of this kind are a common part of experience and provide an important role for the quantity variable. Strictly speaking, quantity is a function like any other, but it plays such a basic role that it is convenient to abbreviate it as a mere number on the node itself. Using this notation (together with the triangles standardly used in WG notation for the 'is-a' relation) the facts about feet may be shown in a diagram such as Figure 6. In prose, a typical person has one left leg, which has one foot; but although Long John Silver has the expected number of left legs (shown as a mere dot, which inherits the default quantity), this leg has zero feet.


Figure 6
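The mechanism at work in Figure 6 is ordinary default inheritance with overriding. As an informal illustration (our own sketch with invented names, not the authors' notation), the same logic can be written as a tiny inheritance network in which attribute look-up climbs the 'is-a' chain unless a node states its own overriding value:

```python
# Hypothetical sketch (not the WG formalism): a minimal default-inheritance
# network. Defaults are stated once on a type and inherited by instances
# via 'is-a', but any node may override them locally - e.g. Long John
# Silver's left leg has zero feet, overriding the default of one foot.

class Node:
    def __init__(self, name, isa=None, **attrs):
        self.name = name
        self.isa = isa        # the node this one inherits from ('is-a')
        self.attrs = attrs    # locally stated (possibly overriding) values

    def get(self, attr):
        """Look up an attribute, climbing the is-a chain for defaults."""
        node = self
        while node is not None:
            if attr in node.attrs:
                return node.attrs[attr]
            node = node.isa
        return None

leg = Node('left leg', feet=1)                                  # default
silvers_leg = Node("Long John Silver's leg", isa=leg, feet=0)   # override
typical_leg = Node("a typical person's leg", isa=leg)

print(typical_leg.get('feet'))  # 1 (inherited default)
print(silvers_leg.get('feet'))  # 0 (default overridden)
```

The point of the sketch is only that a quantity of 0 is an overriding value like any other, stated on the exceptional node rather than built into the look-up procedure.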

Returning to the analysis of grammatical structure, we see no reason to recognize fictional words as such - after all, how could one learn them? (Notice that we learn about Father Christmas via verbal and visual representations, for which there is no parallel in language.) In other words, we see no justification for lexical items such as the Chomskyan PRO which are inherently inaudible. However, we do see an important role for dependents that are merely potential (like a missing left foot). For example, take syntactic objects. On the one hand, we know that they are typically nouns, that they typically express the patient or theme of the action, that they follow the verb, and so on; it is essential to state all these properties just once, at the highest level (i.e. as a characteristic of the typical verb or even of the typical word). But on the other hand, we also know that some verbs require an object, others merely allow one, and others again refuse one. It is essential to be able to separate these statements of 'existence' from all the other properties of objects, and the obvious mechanism is the quantity variable introduced above. The default object may have a quantity which is compatible with either 1 or 0, but this default is overridden for obligatorily transitive or intransitive verbs. Similar remarks apply to subjects, the topic of this chapter, but first we must distinguish two different kinds of 'null' dependent.

On the one hand, there are dependents that are optional in the semantics as well as in the syntax; for example, many verbs allow a beneficiary (e.g. make her a cake), but in the absence of a syntactic beneficiary there is no reason to assume a semantic one. In a sentence such as She made a cake, therefore, there is no beneficiary dependent although one is possible. In this case, therefore, the dependent itself has quantity 0. In many other cases, on the other hand, the null dependent does contribute to the semantics; for instance, even when DRINK has no object (He drank till he fell asleep), its semantic structure certainly includes some liquid which by default is alcohol. In this case, we assume that the syntax does contain an object noun, an ordinary noun such as ALCOHOL complete with its ordinary meaning; but exceptionally, it has no realization in form - no stem or fully inflected form.

Null subjects in English are always of this second type: specific words which have their usual meaning but which are deprived of any form because their form's quantity is 0. This quantity varies with the verb's inflectional category; e.g. finite verbs generally have a subject, but imperatives (a kind of finite) normally have an unrealized YOU. (See section 5.3 for more discussion.) A notation for unrealized words in a written example would help to distinguish them conceptually from PRO; we suggest square brackets round the ordinary orthography for the missing word. Using this notation, we might write an English imperative as follows:

(21) [You] hurry up!

The relevant grammar is in Figure 7, which includes the very general default tendency for words to have a realization.

Figure 7

In the case of Ancient Greek infinitives and participles, what distinguishes those with overt subjects from those without is simply the quantity of the subject's realization. Even if an infinitive has no overt subject, it still has a subject, and this subject is an ordinary word (probably a personal pronoun) which has the full range of inheritable syntactic and semantic properties. And crucially, it has a case which may trigger case agreement in a predicate. The relevant part of the grammar is sketched in Figure 8. According to this diagram, a verb's subject is normally nominative and has optional realization - in other words, Greek is a (so-called) pro-drop language. (We discuss this point further in the next section.) However, infinitives override the default pattern by demanding the accusative case, so even 'null' subjects of infinitives have the accusative case.


Figure 8

To summarize, then, we are proposing an attribute 'quantity' which controls the way in which we map stored concepts to items of experience, as types to tokens. Any item of experience has the value 1 for this attribute, so it will only match stored concepts which have the same value. In the case of words, what allows us to experience them is their realization, so by definition the quantity for the realization of a word token is 1. However, the grammar of a language allows the default 1 for realization to be overridden in the case of dependents of specific types of words, such as infinitives. But although these words have no realization, they do have all the other properties expected of them, including grammatical properties such as case. In Greek, it is the case of unrealized subjects that explains the agreement patterns described in the first section.
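As an informal illustration of this summary (again our own sketch, with invented names, not the WG formalism), a word token can carry a case attribute even when its realization has quantity 0, so an unrealized accusative subject still feeds case agreement on a predicative:

```python
# Illustrative sketch (assumed representation, not the authors'): a word
# may have a case even when its realization has quantity 0, so an
# unrealized accusative subject can still trigger case agreement.

class Word:
    def __init__(self, lexeme, case, realized=1):
        self.lexeme = lexeme
        self.case = case
        self.realized = realized  # quantity of the realization: 1 or 0

def agrees(predicative, subject):
    """Case agreement inspects only the syntactic case attribute,
    ignoring whether the subject is overtly realized."""
    return predicative.case == subject.case

# Null subject of a Greek infinitive: accusative, but unrealized, as in (18).
null_subject = Word('pronoun', case='accusative', realized=0)
predicative = Word('turannon', case='accusative')  # 'king(acc)'

print(agrees(predicative, null_subject))  # True
```

The design choice mirrors the text: realization quantity and case are independent attributes of the same word node, so setting the first to 0 leaves the second available to the agreement rule.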

5. Extensions to Other Parts of Grammar

Since the proposed system applies equally to our knowledge of Father Christmas and to the subjects of Greek infinitives, it would not be surprising if it turned out to be relevant in other parts of grammar as well. The following list suggests a number of other areas where 'understood' elements can be handled in the same way.

5.1 Null subjects of tensed verbs in 'pro-drop' languages

Whereas English requires tensed verbs to have a realized subject, pro-drop languages allow it to be unrealized. This is helpful in Ancient Greek, where predicatives have the nominative case in tensed clauses even when the subject is unrealized, and similarly a virtual subject is as likely as an overt one to 'attract' the predicative of a lower clause to its nominative case. A relevant example is (22)=(16), which we noted above as an outstanding problem for a 'structure-sharing' analysis of attraction. If we assume that the main verb ethelo:, 'I wish', has a nominative (but unreal) pronoun as its subject, the nominative on the lower predicative is as expected because this unreal pronoun is also the subject of the lower clause:

(22) ethelo: de toi e:pios einai
I-wish but you(dat) kind(nom) to-be
'but I wish to be kind to you' (H. Il. 8.40)

The subject-verb agreement on the verb is easy to explain if there is always a subject, real or unreal. Without this assumption, however, a rule of agreement does not extend easily to examples where there is no overt higher subject.

5.2 'Object pro-drop'

(23) ou dei tois paidotribais enkalein oud' ekballein ek to:n poleo:n
not necessary the(dat) trainers(dat) to-accuse nor to-banish from the(gen) cities(gen)
'it is not necessary to accuse the trainers nor to banish them from the cities' (P. G. 460 d.)

This phenomenon, less common than 'subject pro-drop' but very common in Greek, is traditionally analyzed under the rubric of 'object-sharing' and has no agreed modern analysis. Treating the 'omitted' object as unrealized provides a natural and simple account. Note that the traditional shared object analysis would incorrectly associate the dative case with the object of ekballein (normally accusative in this context).

5.3 Subjects of imperatives

In languages where these are usually absent, such as English, the identity of the unreal subject is very clear: as we assumed above, it must be the pronoun you for second-person imperatives, and we for first-person plural ones. This is clear not only from the meaning but also from the choice of pronoun in the tag question:

(24) Hurry up, will you?
(25) Let's go now, shall we?

Moreover, where a language offers a choice between intimate and distant second-person pronouns (such as the French pair tu and vous), the same choice applies, with the same social consequences, to imperatives even though there is no overt pronoun (e.g. Viens! or Venez! for 'Come!'). Without unrealized pronouns as subject it is hard to extend the rule for choosing pronouns so that it applies to the choice of imperative forms as well; but with unreal pronouns the choice of pronoun automatically triggers the correct agreement on the verb.

5.4 Complements of certain definite pronouns in English

The argument here rests on the assumption that 'pluralia tantum' such as trousers and scales are singular in meaning but plural in syntax; the assumption has been challenged (Wierzbicka 1988) but we still find it especially plausible for examples such as scales (plural) contrasting with balance (singular). The relevant datum is that the choice between this and these matches the syntactic number when the complement noun is overt (so this balance but these scales), but the same choice is made even when there is no overt complement:

(26) I need some scales to weigh myself on, but these (*this) are (*is) broken.

If pluralia tantum really are singular in meaning, we cannot explain this choice in terms of meaning, and the most attractive explanation is that the choice is forced in the same way as in the overt case, by the presence of an unrealized example of the noun scales (or trousers or whatever). We might also consider extending this explanation to another curious fact about the demonstrative pronouns, which is that the singular form can only refer to a thing:

(27) Do you take this !!(woman) to be your lawfully wedded wife?

The explanation would be that only one unrealized noun is possible in the singular: the noun thing. The analyses that we are suggesting are of course controversial and may be wrong, but if they are correct then they show that the unrealized word may be a specific lexical noun rather than a general-purpose pronoun as in the earlier examples.

5.5 Complements of certain verbs such as auxiliaries in English

This covers the territory of so-called 'VP deletion' but also other kinds of anaphoric ellipsis:

(28) I don't know whether I'm going to finish my thesis this year, but I may.
(29) I may finish my thesis this year, but I don't know for sure.

If the complement of may is allowed to be unrealized, then may in (28) actually has a complement verb, whose properties are (more or less) copied from its antecedent (namely, (I) finish my thesis this year); and similarly know in (29) has an object which would have been realized as whether I'll finish my thesis this year. This analysis combines the flexibility of a purely semantic analysis with the ability of a syntactic analysis to accommodate syntactic detail such as extraction out of an elided complement:

(30) OK, you didn't enjoy that book, but here's one which you will.

If will has no syntactic complement at all, the extraction of which in (30) is very hard to explain; but if its complement is an unrealized enjoy, the rest of the syntactic structure can be exactly as for here's one which you will enjoy.6

In all these examples the omitted element is redundant and easy to recover, so the option of leaving it unsaid obviously helps both the speaker and the hearer. The familiar functional pressure to minimize effort thus explains why the choice between 'realized' and 'unrealized' exists in the grammar. On the other hand, it does not explain why languages allow it in different places - e.g. why some languages allow tensed verbs to have a null subject while others do not. This variation must be due to different ways of resolving the conflict between this functional pressure and others which push in the opposite direction, such as the pressure to make syntactic relations reliably identifiable (whether by word order as in English or by inflectional morphology as in Greek).

6. Comparison with PRO and pro

The analysis that we are proposing is different from the more familiar ones which invoke null pronouns such as PRO and pro, and we believe that the differences are important:

• PRO and pro are special pronouns which combine the peculiarity of always being covert with the equally exceptional property of covering all persons, numbers and genders. The fact that they are exceptional in two such major respects should arouse suspicion. In contrast, our unrealized pronouns are the ordinary pronouns - he, me, us, and so on - which just happen to be unrealized. Even if we count this as an exceptional feature it is their only exceptional feature, in contrast with the double exceptionality of PRO and pro.

• In our account, a pronoun may be realized for emphasis in contexts where it would normally be unrealized; this accounts in a simple way for the examples in (9) to (12) where the pronoun is emphatic. If the covert pronoun is always PRO, why should it always alternate with an ordinary pronoun?

• Our unrealized words need not be pronouns, unlike PRO and pro. As explained in the previous section, this allows us to extend the same explanation to other kinds of unexpressed words, such as unrealized common nouns acting as complement of a pronoun/determiner such as this, or virtual complements of verbs such as auxiliary verbs. In other words, our proposal subsumes null subjects under a much broader analysis which covers ellipsis in general.

• Unrealized words are identified by the quantity feature which applies outside language (e.g. to Father Christmas and feet) as well as inside. In contrast, the difference between PRO or pro and other words is specific to language, involving (presumably) the absence of a phonological entry. Any explanation which involves machinery that is available independently of language is preferable to one which involves special machinery.

• In the standard analysis with PRO and pro the difference between these two is important because both abstract 'Case' and surface case are supposed to be impossible for PRO but obligatory for pro; this contrast is also claimed to correlate with the contrast between subjects of non-finite and finite verbs. More recently, PRO has been claimed to have a special 'null' case (Chomsky and Lasnik 1993). The empirical basis for these claims was undermined long ago (e.g. Sigurdsson 1991), and our analysis does not recognize the distinction between PRO and pro. Unrealized pronouns all take case (or lack it) just like realized ones in the language concerned.

These differences between our proposal and the PRO/pro system all seem to favour our proposal.

7. Comparison with Other PRO-free Analyses

In this section we compare our proposal with two other approaches to null elements, neither of which invokes a 'covert' element such as PRO. The first approach is in the WYSIWYG spirit of earlier versions of Word Grammar, where it was assumed that null elements were simply absent. This assumption was only workable because of the possibility of structure-sharing. In this analysis, the missing subject is specified as (i.e. supplied by) the subject of the higher verb (see Hudson 1990: 235ff for details). For Greek, as we indicated in section 2, this approach is adequate for the cases traditionally described under the rubric of 'attraction', but it fails for the default situation, where the subject of the infinitive (and other elements dependent on the lower verb) displays accusative case. On the early WG assumptions, the only possible analysis is simply to stipulate, for the case where there is no infinitival subject, that predicates of infinitives are accusatives. But this approach fails to be explanatory: why should these elements bear the accusative case rather than the general default nominative? (Contrast the principled explanation of Figure 8 and accompanying text.)

Moreover, this no-null-element, stipulative analysis suffers from an even graver defect: the relation 'subject' is a collecting point for a large number of different patterns in semantics and morphology as well as syntax (Keenan 1976). A verb's subject is the nominal that has the following properties (among others):

• its referent is the 'active argument' of the verb as defined by the latter's lexical entry - for instance, with RUN/TREKHO: it is the runner, with FALL/PIPTO: it is the faller, with LIKE/PHILEO: it is the liker, and so on;

• in English, it typically stands before the verb;
• it is the typical antecedent of a reflexive object of the same verb;
• the verb agrees with it;
• in English, it is obligatory if the verb is tensed;
• it is also the verb's object if the verb is passive;
• in Greek, its case is typically nominative.

As soon as some nominal is defined as the verb's subject, it immediately inherits all these characteristics en bloc. But in the absence of a subject there is nothing to bring them all together. For example, if himself is the object of hurt it is tied anaphorically to the hurter via the 'subject' link, but if there is no subject this link disappears. And yet the fact is that the anaphoric relations are exactly the same regardless of whether or not there is an overt subject; for example, the 'understood' subject of hurt in Don't hurt yourself! binds yourself in exactly the same way as the overt one in You may hurt yourself.

The analysis that we are proposing solves these problems by moving towards the standard view that every verb does indeed have a subject, whether or not this is overt. Similar problems face the earlier WG approach in other areas of grammar, and can be solved in the same way. In section 5 we outlined a range of phenomena that seem to call for analysis in terms of unrealized words, and which more traditional WG analyses have treated in terms of dependents that are simply absent.

Another attempt to handle null subjects without invoking PRO is proposed by Pollard and Sag (1994: 123-45) in the framework of HPSG. As with the early WG analysis just described, this proposal applies only where syntactic structure-sharing is not possible. They propose a structure for 'Equi' verbs such as the following for try (ibid.: 135):

CAT | SUBCAT <NP1, VP[inf, SUBCAT <NP1>]>

The infinitive's subject is the italicized 'NP' in its 'SUBCAT' (valency) list. This NP merely indicates the need for a subject, and would normally be 'cancelled' (satisfied) by unification with an NP outside the VP; for example, in They worked hard the verb needs a subject, which is provided by they. However at least the intention of this entry is to prevent the need from being satisfied, so that the infinitive's subject remains unrealized, as in our proposed WG analysis. Moreover, this unrealized subject in the SUBCAT list may carry other characteristics which are imposed both by the infinitive and by try; for example, the subscripts in the entry for try show that it must be coreferential with the subject of try - i.e. a sentence such as They try to work hard has only one meaning, in which they are the workers as well as the try-ers. Most importantly for the analysis of Greek, the unrealized subject can carry whatever case may be imposed on it by the infinitive (Henniss 1989). Consequently it can be the target of predicative case agreement, so Ancient Greek case-agreement would be no problem.
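The cancellation idea can be illustrated with a small sketch. This is a deliberate miniaturization of HPSG's unification-based SUBCAT mechanism, not Pollard and Sag's notation: the function name, dictionary layout and index values are all ours, and the point is only that an uncancelled requirement can survive on the infinitive, coindexed with the higher subject.

```python
# Toy sketch of SUBCAT cancellation (our simplification, not HPSG's AVMs).

def saturate(subcat, overt_args):
    """Cancel SUBCAT requirements against overt arguments, left to right.
    Whatever is left over remains as unrealized (uncancelled) valency."""
    remaining = list(subcat)
    for arg in overt_args:
        if remaining and remaining[0]["cat"] == arg["cat"]:
            # Unification in miniature: the requirement picks up the
            # argument's index, so case/agreement facts can follow it.
            remaining[0]["index"] = arg["index"]
            remaining.pop(0)
    return remaining

they = {"cat": "NP", "index": 1}
# TRY subcategorizes for a subject NP and an infinitival VP whose own
# subject NP is coindexed (index 1) with TRY's subject.
vp_inf = {"cat": "VP[inf]", "index": 2,
          "subcat": [{"cat": "NP", "index": 1}]}

# 'They try to work hard': try's own SUBCAT is fully cancelled...
leftover = saturate([{"cat": "NP"}, {"cat": "VP[inf]"}], [they, vp_inf])
print(leftover)          # [] -- try is saturated
# ...but the infinitive's own subject requirement is never cancelled:
print(vp_inf["subcat"])  # [{'cat': 'NP', 'index': 1}] -- the covert subject
```

Because the uncancelled NP is an ordinary valency element, any case imposed on it by the infinitive is available for predicative agreement, which is the point at issue for Greek.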

This approach is clearly very similar to ours. In both theories:

• the infinitive's subject is an ordinary noun(-phrase) rather than a special pronominal (PRO or pro);

• the subject's status (overt or covert) is handled by a separate mechanism from its other properties;

• the subject's properties include those inherited from the infinitive;
• the possibility of null realization is determined by the head, rather than inherent in the unrealized nominal.

However there are also significant differences between the two proposals.

• The null NP in HPSG is purely schematic, so all null subjects have the same syntax (bar any specific syntactic demands imposed by the infinitive). They are also schematic in their semantics, in spite of the coreference restriction, because reference is distinct from semantics (e.g. the winner may be coreferential with someone I met last night, but these phrases obviously have different semantic structures). In contrast, WG null subjects are ordinary lexical nouns and pronouns.

• So far as we can see, the HPSG machinery for distinguishing overt and covert valents does not appear to generalize beyond language; and indeed, many advocates of HPSG might argue that it should not do so. In contrast, we showed in section 4 that our proposal does; it can explain the 'non-occurrence' of Father Christmas in just the same way as that of the subject of an infinitive.

Whether or not the proposals differ in terms of specifically linguistic analyses remains to be seen.

8. Conclusions

The most important conclusion is that where there is strong empirical evidence for null elements, they can easily be included even in a 'surfacist' grammar such as WG. This can be done by exploiting the existing WG machinery for determining 'quantity', a variable which guides the user in applying knowledge to experience; for example, one of the properties that we attribute to Father Christmas is zero quantity - i.e. we expect no tokens in experience. In these terms, a 'null word' is an ordinary word whose realization has the quantity 0 - an unrealized word. This (or something like it) is generally available in cognition both for distinguishing fact and fiction and for cases where an expected attribute is exceptionally absent, so it comes 'for free', and it is preferable to inventing special linguistic inaudibilia such as PRO or pro.

References to classical works

Aeschylus, Eumenides (A. Eum.)
Herodotus (Hdt)
Homer, Iliad (H. Il.)
Homer, Odyssey (H. Od.)
(LSJ: see Liddell & Scott in References)
Isocrates (I.)
Plato, Alcibiades (P. Alc.)
Plato, Gorgias (P. G.)
Xenophon, Anabasis (X. A.)

References

Andrews, A. (1971), 'Case agreement of predicate modifiers in Ancient Greek'. Linguistic Inquiry, 2, 127-51.
Borsley, R. (2005), 'Agreement, mutation and missing NPs in Welsh'. Available: http://privatewww.essex.ac.uk/~rborsley/Agreement-paper.pdf (Accessed: 19 April 2005).
Chantraine, P. (1953), Grammaire Homérique. Vol 2. Paris: Klincksieck.
Chomsky, N. and Lasnik, H. (1993), 'The theory of principles and parameters', in J. Jacobs, A. v. Stechow, W. Sternefeld and T. Venneman (eds), Syntax: An International Handbook of Contemporary Research. Berlin: Walter de Gruyter, pp. 506-69.
Goodwin, W. W. (1930), Greek Grammar (rev. Charles Burton Gulick). Boston: Ginn.
Henniss, K. (1989), '"Covert" subjects and determinate case: evidence from Malayalam', in J. Fee and K. Hunt (eds), Proceedings of the West Coast Conference on Formal Linguistics. Stanford: CSLI, pp. 167-75.
Hudson, R. (1984), Word Grammar. Oxford: Blackwell.
— (1990), English Word Grammar. Oxford: Blackwell.
— (2003), 'Case-agreement, PRO and structure sharing'. Research in Language, 1, 7-33.
Keenan, E. L. (1976), 'Towards a universal definition of "subject"', in Charles Li (ed.), Subject and Topic. New York: Academic Press, pp. 303-33.
Kühner, R. and Gerth, B. (1955), Ausführliche Grammatik der griechischen Sprache. Leverkusen: Gottschalksche Verlagsbuchhandlung.
Lecarme, J. (1978), 'Aspects Syntaxiques des Complétives du Grec' (Unpublished doctoral dissertation, University of Montreal).
Liddell, H. G. and Scott, R. (1971), A Greek-English Lexicon (9th edn, rev. H. Jones and R. McKenzie, suppl. by E. Barber). Oxford: Clarendon Press.
Pollard, C. J. and Sag, I. A. (1994), Head-Driven Phrase Structure Grammar. Chicago: University of Chicago Press.
Quicoli, A. C. (1982), The Structure of Complementation. Ghent: Story-Scientia.
Sigurdsson, H. (1991), 'Icelandic case-marked PRO and the licensing of lexical arguments'. Natural Language & Linguistic Theory, 9, 327-63.
Smyth, H. W. (1956), Greek Grammar (rev. G. Messing). Cambridge, MA: Harvard University Press.
Wierzbicka, A. (1988), The Semantics of Grammar. Amsterdam: Benjamins.

Notes

1 By Ancient Greek we mean the Greek of early epic poetry ('Homeric Greek') down to the Attic prose of the 5th and 4th centuries B.C.E.

2 A list of abbreviated references to classical authors can be found at the end of this paper.

3 Very many of the verbs which take the infinitive also take a personal object which stands in the case that the verb requires... If the infinitive also has an adjectival or nominal predicate, this stands in the same case as the personal object by (an) attraction, or in the absence of attraction, in the accusative.

4 However if the subject of the governing verb is at the same time the subject of the infinitive, the subject of the infinitive is omitted, and if adjectival or nominal predicates accompany the infinitive, these are put in the nominative by attraction.

5 This felicitous term is taken from the work of Pollard and Sag (1994), but the analysis was worked out in full detail for English infinitives (and other constructions) in Hudson (1990: 235-9).

6 We owe this point to Andrew Rosta.

3 Understood Objects in English and Japanese with Reference to Eat and Taberu: A Word Grammar account

KENSEI SUGAYAMA

Abstract
The author argues that there is a semantic difference in the suppressed object between eat and its Japanese equivalent taberu. Then what kind of semantic structure would the Japanese verb taberu have? This chapter is an attempt to answer this question in the framework of Word Grammar (Hudson 1984, 1990, 1998, 2005).

1. Introduction

Unlike English and perhaps most other European languages, Japanese allows its transitive verbs to miss out their complements (e.g. subject, object) on the condition that the speaker assumes that they are known to the addressee.1 This is instantiated by the contrast in (1) and (2):

(1) A: mo keeki-wa yaki-mashita-ka
already cake-TP baked-Q
'Did you bake the cake?'

B: hai, yaki-mashita
yes baked
'Yes, (I) baked it'

(2) A: *mo, yaki-mashita-ka
already baked-Q (* unless the object is situationally recovered)
'Did you bake it?' (intended meaning)

The following sentences are also possible in Japanese.2

(3) hyah! ugoita
Interj moved
'Ouch! It moved.'

(4) kondo-wa yameyoo
next-time-TP stop
'I won't do it again.'

(5) kanojo-wa yubiwa-o oita
she-TP-Sb ring-Ob put
'She put the ring there'

Sentences (3), (4) and (5) are colloquial and quite often used in the standard spoken Japanese. In this sense they are not marked sentences. In (3) only the subject is left out, while in (4) both the subject and object are left out as shown in the word-for-word translation. Sentence (5) involves the transitive verb oita, a past tense form of oku 'put', which corresponds to put in English. Oku is a three-place predicate which semantically requires three arguments [agent, theme, place]. These three arguments are mapped syntactically to subject, object and place adverbial, respectively. Quite interestingly, (5) shows that the place element, which is also considered to be a complement (or adjunct-complement) of the verb oku, is optional when it is known to the addressee, which is virtually impossible with its counterpart put in English. Although these complements are in fact missed out (i.e. unexpressed or ungrammaticalized), the addressee eventually will come to an appropriate interpretation of each sentence where unexpressed complements are supplied semantically or pragmatically and they are no doubt given full interpretation. Why is this possible? A possible answer comes from the assumption that in the semantic structure of the sentences above, there has to be a semantic argument which should be, but is not actually, mapped onto the syntactic structure (i.e. grammaticalized as a syntactic complement in my terms).

Turning to English, on the other hand, it is possible to leave indefinite objects suppressed for semantically limited verbs such as eat, drink, read, etc.3 Thus, following Hudson (2005), the syntactic and semantic structure of John ate will be something like the one shown in Figure 1.

Figure 1

The links between syntactic and semantic structures in Figure 1 are shown by the vertical solid and curved lines. The categories enclosed by single quotation marks (e.g. 'John', 'John ate') are concepts which are part of the sentence's semantic structure; the numbers are arbitrary. Detailed explanation about syntactic and semantic dependencies will be given in the next section.

But this kind of semantic structure does not seem to be a viable one for the Japanese verb taberu, because, as I will argue later, there is a semantic difference in the semantic feature of the suppressed object between eat and taberu, which does not seem to be properly reflected in the semantic structure of those two verbs in Word Grammar (WG). Then what kind of semantic structure will the Japanese verb taberu, the Japanese equivalent of eat, have? This chapter is an attempt to answer this question in the framework of Word Grammar.

The rest of the chapter is organized in the following way. Section 2 introduces Word Grammar and deals with the relevant notions used in WG to deal with the problem of a covert object. Section 3 discusses the analysis of an intransitive use of the eat type verbs. Section 4 discusses the Japanese verb taberu, an equivalent of eat in English. It also discusses the interpretation of taberu which lacks an overt object, using the syntactic and semantic analysis in WG. Section 5 offers my own account of how taberu is more adequately described in the semantic structure in WG.

2. Word Grammar

Before continuing any further, let us first have a brief look at the basic framework of WG and its main characteristics. WG, which is fully developed and formalized in Hudson (1984, 1990), subsequently revised by Rosta (1997), is to be taken as a lexicalist grammatical theory because the word is central - hence the name of the theory, basically making no reference to any grammatical unit larger than a word.4 In his recent comparison of WG with Head-driven Phrase Structure Grammar (HPSG), Hudson (1995b: 4) gives a list of the common characteristics between the two theoretically different grammars, some relevant ones of which are repeated here for convenience:

(6) a. both (i.e. WG and HPSG) include a rich semantic structure parallel with the syntactic structure;
b. both are monostratal;
c. both make use of inheritance in generating structures;
d. neither relies on tree geometry to distinguish grammatical functions;
e. both include contextual information about the utterance event (e.g. the identities of speaker and hearer) in the linguistic structure.

In WG, syntactic structure is based on grammatical relations within a general framework of dependency theory rather than on constituent structure. So a grammatical relation is defined as a dependency relation between the head and its dependents, which include complements and adjuncts. In this framework, the syntactic head of a sentence, as well as its semantic head, is therefore a finite verb on which its dependents such as subject, object and so forth depend. To take a very simple example, the grammatical analysis of Vera lives in Altrincham can be partially shown by the diagram in Figure 2. 5 Each arrow in this diagram shows a dependency between words and it points from a head to one of its dependents, but it is most important here that there are no phrases or clauses.6

Figure 2

Thus in terms of dependency, lives is the root of the sentence on which Vera and in depend as a subject and a complement respectively. In turn, in is the head of Altrincham. Semantically, 'Vera', which is a referent of Vera, is linked as a live-er to a semantic concept 'Vera lives in Altrincham', an instance of 'live' and a referent of lives at once, and 'Altrincham', which is a referent of Altrincham, is also linked to 'Vera lives in Altrincham' as a place. The curved lines connecting in and Altrincham mean that the referent of in and that of Altrincham are the same (i.e. 'Altrincham'). 'Live-er' and 'place' are names of a semantic relation. A convenient way to diagram the model-instance relation is by using a triangle with its base along the general category (= model) and its apex pointing at the member (= instance), with an extension line to link the two. In Figure 2, then, the diagram shows the relation between the sense of the word lives, 'live', and its instance 'Vera lives in Altrincham'.

A WG grammar generates a semantic structure which parallels the syntactic structure described above. The parallels are in fact very close, as in Figure 2. Virtually every word is linked to a single element of the semantic structure, and the dependency relations between the words are typically matched by one of the relations between their semantic concepts: dependency (shown as a line with a point). The familiar distinction between 'referent' and 'sense' is used in much the same way as in other linguistic theories. Therefore, in WG a word's sense is understood to be some general category (e.g. the typical chair), while its referent is some particular instance of this category.
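The phrase-free dependency analysis of Vera lives in Altrincham can be encoded as plain data. The sketch below is our own illustration, not official WG notation; the relation labels follow the text, but the tuple and dictionary layout is invented for the example.

```python
# Illustrative encoding of the dependency and semantic links described for
# 'Vera lives in Altrincham'. The data layout is ours, not WG notation.

sentence = ["Vera", "lives", "in", "Altrincham"]

# Syntactic dependencies: (head, dependent, grammatical relation).
# Note that there are no phrase or clause nodes at all - only words.
dependencies = [
    ("lives", "Vera", "subject"),
    ("lives", "in", "complement"),
    ("in", "Altrincham", "complement"),
]

# Semantic links: relations of the verb's sense filled by word referents.
# 'in' and 'Altrincham' share the referent 'Altrincham''.
semantics = {
    "live-er": "Vera'",
    "place": "Altrincham'",
}

# The root is the one word that depends on nothing.
root = [w for w in sentence
        if not any(dep == w for _, dep, _ in dependencies)]
print(root)  # ['lives']
```

The computation at the end makes the text's point mechanical: every word except lives occurs as a dependent, so lives is the root of the sentence.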

Apart from the diagram we have seen, how are the syntactic and semantic structure of a sentence represented in WG? WG consists of an unordered set of propositions called facts. All WG propositions have just two arguments and a relator between them, and all relations can be theoretically reduced to a single one represented as '=', although for convenience 'isa' and 'has' are also used. Hudson (1990: 256), for instance, gives a fairly complete lexical entry for eat in terms of propositions (or facts). Some of them considered to be most important for the present purpose are given in (7):

(7) a. EAT isa verb.
b. sense of EAT = eat
c. EAT has [0-1] object.
d. referent of object of EAT = eat-ee of sense of it

These facts are self-explanatory except for a few technical expressions: 'A isa B' means 'A is an instance of B' and [0-1] in front of object means 'at least 0 and at most 1' (i.e. 0 or 1 in this particular case). An element in italics is meant to be the antecedent of the pronoun.
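Facts such as those in (7) lend themselves to a direct encoding. The sketch below is purely illustrative (the dictionary layout and function name are ours, not Hudson's notation); it shows how an 'isa' chain can be walked for inheritance and how the [0-1] restriction on the object can be checked mechanically.

```python
# Illustrative encoding (not Hudson's notation) of the WG facts in (7):
# lexical knowledge as small propositions plus 'isa' inheritance.

facts = {
    "isa": {"EAT": "verb", "verb": "word"},   # EAT isa verb (isa word)
    "sense": {"EAT": "eat"},                  # sense of EAT = eat
    "object-quantity": {"EAT": (0, 1)},       # 'EAT has [0-1] object'
}

def isa_chain(category):
    """Collect a category and everything it inherits from via 'isa'."""
    chain = [category]
    while chain[-1] in facts["isa"]:
        chain.append(facts["isa"][chain[-1]])
    return chain

print(isa_chain("EAT"))              # ['EAT', 'verb', 'word']
lo, hi = facts["object-quantity"]["EAT"]
print(lo <= 0 <= hi, lo <= 1 <= hi)  # both fine: 0 or 1 object
print(lo <= 2 <= hi)                 # False: two objects are excluded
```

The [0-1] bound is what makes the intransitive use of eat grammatical: zero objects satisfies the same stored fact that one object does.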

Propositions in (7) partially represent the syntactic and semantic structures of John ate potatoes diagrammed in Figure 3, with eat replaced by its past tense form ate.

Figure 3

In Figure 3, as mentioned earlier, an item enclosed by single quotation marks represents a concept in the semantic structure. In this case, 'Fred', 'potatoes', etc. are referents, whereas 'ate' and 'potato' are senses. X-er, X-ee, etc., labelled on a semantic dependency, are the semantic relations which complements are considered to bear to their head.

This outline of WG brings us now to the syntactic and semantic analysis of the English verb eat.

3. Eat in English

Let us consider those English transitive verbs that optionally appear without their object. Examples of such verbs, among others, include dress, shave, lose, win, eat and read, as in (8a) to (8f):

(8) a. William dressed/shaved Andrew.
b. Manchester United won/lost the game.
c. John read the book/ate the shepherd's pie.
d. William dressed/shaved.
e. Manchester United won/lost.
f. John ate/read.

While these six verbs will take part in the same syntactic alternation, the intransitive verbs are obviously interpreted in different ways. In the examples (8d)-(8f), we can identify three verb classes according to difference in interpretation of the unexpressed object. The paired sentences in (9) illustrate these differing interpretations.

(9) a. William shaved = William shaved himself
b. Manchester United won = Manchester United won the game
c. John ate = John ate something (edible) or other

For the shave type verbs, the object, if omitted, is interpreted as being coreferent with the subject or the subject's specific body part. For the win type verbs, the surface intransitive verb form signals a severe narrowing of the range of possible referents of the implicit object, roughly speaking, to 'a specific game' recoverable from the context. For the eat type verbs, the intransitive form means a lack of commitment by the speaker to the referent of the object.

With the eat type verbs, the identity of the referent of the unexpressed object may be non-specific, i.e. literally unknown to the speaker, because the sentences in (10) do make sense.

(10) a. I saw Oliver eating, but I don't know what he was eating
b. When I peeked into Oliver's room, he was reading; now I wonder what he was reading

In both sentences in (10), the identity of what was eaten/read is asked in the second part. This implies that the patient (or eat-ee/read-ee) argument of eat/read, which may be grammaticalized as the object at surface structure, does not have to be definite.

There is other evidence that supports the indefiniteness of the suppressed object of eat. Consider the following dialogues:

(11) A: What happened to my scones?
     B: *The dog ate.

(12) A: Did you eat your kippers?
     B: *Yes, I ate.

In both (11) and (12), speaker B cannot reply to speaker A by using ate without its object. What (11) and (12) suggest is that eat, when its object is suppressed, cannot have its null object referring to an element in the previous discourse, which, as I will explain very shortly, is in fact possible in Japanese. What the ungrammaticality of the utterances by speaker B indicates is that the understood object has to be indefinite if eat is used as an intransitive verb.7

Here is another interesting piece of evidence supporting this claim. Observe the following dialogue:

(13) A: I'm starving, let's eat.
     B: What would you like to eat?
     A: Doesn't matter, anything, I'm just so hungry.

When speaker A first uses the intransitive eat, it is clear that he/she does not have a definite object (or the referent of a definite object) in mind, and is just expressing his/her desire to consume something or other. As our previous arguments predict, this is exactly a case where the intransitive eat should appear, because there is no antecedent available in this context. However, the semantic structure of eat in this example is considered to have the patient argument, as the lexical semantics of eat requires two arguments whether its object is definite or not. Then the question arises why this argument does not appear at surface structure.

Now let us reconsider a WG representation of Fred ate, the diagram of which is repeated here for convenience.

Figure 4

By now it is clear that this semantic representation is inadequate for eat (ate). It is not difficult to see that the important semantic information of the suppressed object is missing from Figure 4. In Fred ate, there is no object, which implies that it should be indefinite. Regrettably, this important semantic information does not seem to be incorporated in Figure 4. Therefore, my proposal is that we have to revise the diagram so that it can be enriched with the semantic information of the unexpressed object; accordingly, a more accurate diagram will be something like the one in Figure 5.

4. Taberu in Japanese

Let us now turn to a Japanese counterpart of eat, taberu. The picture of taberu, an equivalent of eat in Japanese, is quite different from that of English eat,

Figure 5


which we have just discussed. As stated in section 1, complements are usually missed out in Japanese as long as they are accessible to the speaker and the addressee (or recoverable) in the context. This generalization applies to the verb taberu in Japanese.

Before analyzing the structure of taberu, which can be used with the suppressed complement as in (14) and (15), let us consider what kind of grammatical structure WG would give to taberu. Like English eat, taberu in Japanese takes two arguments in its semantic structure, the agent (eat-er), which is realized as subject, and the patient (eat-ee), which is realized as object. Thus WG gives the syntactic and semantic structures of Shota ga ringo o tabeta 'Shota ate apples' as diagrammed in Figure 6.

Figure 6

In passing, one of the advantages of using WG to analyze the syntactic structure of Japanese is that it becomes quite easy to explain the phenomenon of 'free word order as long as the parent is at the end' in Japanese. As is well known, Japanese is a verb-final language, which implies that the order of the subject, object and other dependents is not fixed as long as they are before the verb, which is at the end of a sentence. Thus Shota ga ringo o tabeta has an alternative version, ringo o Shota ga tabeta. The rather free order of the two complements in the sentence is explained in WG by saying that these two elements are co-dependents of the head taberu; therefore the order of the complements is free as long as they precede the head.
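The dependency-based word-order rule just described can be sketched procedurally. The following is only an illustrative sketch, not part of the WG formalism: a sentence is licensed if its head is final and all of its dependents precede it, so any permutation of the pre-head co-dependents is allowed.

```python
# Sketch of the word-order rule discussed above: in Japanese, the head
# (verb) is final, and its co-dependents may occur in any order before it.
# The dependency analysis of 'Shota ga ringo o tabeta' is illustrative.

def licensed(words, head, dependents):
    """True if the head is sentence-final and every dependent precedes it."""
    return words[-1] == head and all(
        words.index(d) < words.index(head) for d in dependents)

deps = ["Shota ga", "ringo o"]  # co-dependents of the head tabeta

print(licensed(["Shota ga", "ringo o", "tabeta"], "tabeta", deps))  # True
print(licensed(["ringo o", "Shota ga", "tabeta"], "tabeta", deps))  # True (free order)
print(licensed(["Shota ga", "tabeta", "ringo o"], "tabeta", deps))  # False (dependent after head)
```

Both orders of the two complements pass the same single constraint, which is the WG explanation in miniature: no separate rule is needed for each permutation.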

As stated above, complements of taberu can be missed out provided that they are recoverable from the context. The following examples illustrate this point.

(14) hayaku taberoquick eat'Eat it quick'
     hayaku tabero
     quick eat
     'Eat it quick'

(15) moo tabe-mashita-ka?
     already ate-Q
     'Did you eat it already?'


In both sentences above the definite object is apparently suppressed. Presumably the suppressed objects can be expressed as (definite) pronouns without any change in meaning, as in the following sentences:

(16) hayaku sore-o tabero
     quick it-Ob eat
     'Eat it quick'

(17) moo sore-o tabemashita-ka?
     already it-Ob ate-Q
     'Did you eat it already?'

In the last section I argued that the suppressed object of eat is indefinite because it cannot refer to its antecedent even when it is available in the preceding context. Interestingly enough, the opposite is true with taberu. Consider the following dialogues corresponding to (11) and (12), which we discussed in the last section:

(18) A: watashino sukohn-wa doo-shimashi-ta?
        my scones-TP how-did
        'What happened to my scones?'
     B: inu-ga tabeta
        dog-Sb ate
        'The dog ate them'

(19) A: kippahzu-wa tabeta?
        kippers-TP ate
        'Did you eat your kippers?'
     B: ee tabeta
        yes ate
        'Yes, I ate them'

In (18) and (19), the definite object referring to an element in the previous context is left out in B's utterance. In (19), the subject referring to the speaker is also missing in B's utterance. These cases show that the suppressed object of taberu is definite. In contrast, when the object of taberu is indefinite, there are in fact cases where it has to be expressed as an indefinite noun, as in (20):

(20) A: himana toki-wa nani-o shite-imasu-ka?
        spare time-TP what-Ob do-ing-Q
        'What do you do in your spare time?'
     B: taitei nanika tabete-imasu
        usually something eat-ing
        'Usually I eat'

These arguments make it very clear that WG should represent the syntactic and semantic structures of inu-ga tabeta in (18) as diagrammed in Figure 7.

As I stated in section 2, Hudson (1995b) claims that one of the key characteristics of WG is 'including contextual information about the utterance event' (e.g. the identities of a speaker and a hearer) in the linguistic structure. However, as it stands, the syntactic and semantic structures in Figures


Figure 7

1 and 4 do not seem to include as much contextual information as he suggests a WG analysis does. The revisions I have made in this section surely contribute to increasing the contextual information in the grammatical representation in WG.

5. Conclusion

Considering the fact mentioned above, that definite objects are deletable given a proper context in Japanese, it seems reasonable to add the following rule to the grammar of Japanese to capture the proper semantic structure of taberu:

(21) Knower of eat-ee of sense of taberu = addressee of it.

Taking into account the arguments above, I conclude that the syntax of missing complements in Japanese can be given a more satisfactory description by introducing a parameter of 'default definiteness'. I do not want to enter into details now but simply suggest that to distinguish between complements and adjuncts in Japanese one needs such a parameter. To put it differently, by default the definiteness of a covert complement is [+definite] and that of a covert adjunct is [+/-definite], as in (22):

(22) • Covert complement of verb = definite
     • Covert adjunct of verb = indefinite
     • Knower of referent of complement of verb = addressee of it
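Stated procedurally, (22) amounts to a pair of defaults keyed to the grammatical function of the covert dependent, with the knower condition layered on top. The following Python fragment is only an illustrative sketch of that parameter; the function and key names are my own, not WG notation:

```python
# Illustrative sketch of the 'default definiteness' parameter in (22).
# In Japanese, a covert complement defaults to definite and a covert
# adjunct to indefinite.

DEFAULTS = {"complement": "definite", "adjunct": "indefinite"}

def covert_definiteness(function):
    """Default definiteness of an unexpressed dependent, by function."""
    return DEFAULTS[function]

# The suppressed object of taberu in (18)/(19) is a complement, so by
# default it is interpreted as definite, with its referent assumed to
# be known to the addressee, as rule (21) requires.
print(covert_definiteness("complement"))  # definite
print(covert_definiteness("adjunct"))     # indefinite
```

The point of casting (22) as a default rather than a fixed rule is that it can be overridden in context, exactly as (20B) overrides it by spelling out an indefinite object.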

References

Allerton, D. J. (1982), Valency and the English Verb. London: Academic Press.
Cote, S. A. (1996), 'Grammatical and Discourse Properties of Null Elements in English'. (Unpublished doctoral dissertation, University of Pennsylvania).
Fillmore, Ch. J. (1986), 'Pragmatically controlled zero anaphora'. BLS, 12, 95-107.
Groefsema, M. (1995), 'Understood arguments: A semantic/pragmatic approach'. Lingua, 96, 139-61.
Haegeman, L. (1987a), 'The interpretation of inherent objects in English'. Australian Journal of Linguistics, 7, 223-48.
— (1987b), 'Register variation in English'. Journal of English Linguistics, 20, (2), 230-48.
Halliday, M. A. K. and Hasan, R. (1976), Cohesion in English. London: Longman.
Hudson, R. A. (1984), Word Grammar. Oxford: Blackwell.
— (1990), English Word Grammar. Oxford: Blackwell.
— (1992), 'Raising in syntax, semantics and cognition', in Roca, I. (ed.), Thematic Structure: Its Role in Grammar. Berlin: Mouton de Gruyter, pp. 175-98.
— (1994), 'Word Grammar', in Asher, R. E. (ed.), The Encyclopedia of Language and Linguistics. Vol. 9. Oxford: Pergamon Press Ltd, pp. 4990-3.
— (1995a), 'Really bare phrase-structure = dependency structure'. Eigo Goho Bunpoh Kenkyu (Studies in English Language Usage and English Language Teaching), 17, 3-17.
— (1995b), HPSG without PS? Ms.
— (1995c), Word Meaning. London: Routledge.
— (1996, October 28), 'Summary: Watch'. (LINGUIST List 7.1525). Available: http://linguistlist.org/issues/7/7-1525.html (Accessed: 21 April 2005).
— (1998), English Grammar. London: Routledge.
— (2000), '*I amn't'. Language, 76, (2), 297-323.
— (2005), 'An Encyclopedia of English Grammar and Word Grammar'. (Word Grammar). Available: www.phon.ucl.ac.uk/home/dick/wg.htm (Accessed: 21 April 2005).
Kilby, D. (1984), Descriptive Syntax and the English Verb. London: Croom Helm.
Lambrecht, K. (1996, October 30), 'Re: 7.1525, Sum: Watch'. (LINGUIST List 7.1534). Available: http://linguistlist.org/issues/7/7-1534.html (Accessed: 21 April 2005).
Langacker, R. W. (1990), Concept, Image and Symbol. Berlin: Mouton de Gruyter.
Larjavaara, M. (1996, November 3), 'Disc: Watch'. (LINGUIST List 7.1552). Available: http://linguistlist.org/issues/7/7-1552.html (Accessed: 21 April 2005).
Lehrer, A. (1970), 'Verbs and deletable objects'. Lingua, 25, 227-53.
Levin, B. (1993), English Verb Classes and Alternations. Chicago: University of Chicago Press.
Massam, D. (1987), 'Middle, tough, and recipe context constructions in English'. NELS, 18, 315-32.
— (1992), 'Null objects and non-thematic subjects'. Journal of Linguistics, 28, (1), 115-37.
Massam, D. and Roberge, Y. (1989), 'Recipe context null objects in English'. Linguistic Inquiry, 20, 134-9.
Quirk, R., Greenbaum, S., Leech, G. N. and Svartvik, J. (1985), A Comprehensive Grammar of the English Language. Harlow: Longman.
Rispoli, M. (1992), 'Discourse and the acquisition of eat'. Journal of Child Language, 19, 581-95.
Rizzi, L. (1986), 'Null objects in Italian and the theory of pro'. Linguistic Inquiry, 17, 501-57.
Roberge, Y. (1991), 'On the recoverability of null objects', in D. Wanner and D. A. Kibbee (eds), New Analyses in Romance Linguistics. Amsterdam: John Benjamins, pp. 299-312.
Rosta, A. (1994), 'Dependency and grammatical relations'. UCL Working Papers in Linguistics, (6), 219-58.
— (1997), 'English Syntax and Word Grammar Theory'. (Unpublished doctoral dissertation, University of London).
Sperber, D. and Wilson, D. (1995), Relevance (2nd edn). Oxford: Blackwell.
Sugayama, K. (1993), 'A Word-Grammatic account of complements and adjuncts in Japanese', in A. Crochetière, J.-C. Boulanger and C. Ouellon (eds), Actes du XVe Congrès International des Linguistes, Vol. 2. Sainte-Foy, Québec: Les Presses de l'Université Laval, pp. 373-76.
— (1994), 'Eigo no "missing objects" ni tsuite (Notes on missing objects in English)'. Eigo Goho Bunpoh Kenkyu (Studies in English Language Usage and English Language Teaching), 1, 91-104.
— (1999), 'Speculations on unsolved problems in Word Grammar'. Kobe City University Journal, 50, (7), 5-24.
Thomas, A. L. (1979), 'Ellipsis: the interplay of sentence structure and context'. Lingua, 47, 43-68.

Notes

1 Notice here that Word Grammar, in the framework of which the arguments are developed, assumes that complements include subject as well as object, contrary to most of the phrase-structure-based theories. For details about the distinction between complements and adjuncts in Japanese, see Sugayama (1993, 1999).

2 The following symbols for grammatical markers are used in the gloss: TP (Topic), Sb (Subject), Ob (Object), Q (Question marker). The symbol 0 is used for a zero pronominal in examples when necessary.

3 For a complete list of some 50 verbs having this feature, together with references, see Levin (1993: 33). Lehrer (1970) also gives a similar list. Unspecified Object Alternation in Levin's terms (Levin, 1993: 33), which applies to eat, embroider, hum, hunt, fish, iron, knead, knit, mend, milk, mow, nurse, pack, paint, play, plough, polish, read, recite, sew, sculpt, sing, sketch, sow, study, sweep, teach, type, vacuum, wash, weave, whittle, write, etc., has the following pattern:
a. Mike ate the cake.
b. Mike ate. (= Mike ate a meal or something one typically eats.)

4 For some recent advances in WG, see Hudson (1992, 1995a, 1995b, 1998, 2000, 2005) and Rosta (1994, 1997).

5 Notice here that C stands for a complement, rather than a complementizer, in the diagram.

6 No phrases or clauses are assumed in WG except for some constructions (e.g. coordinate structures).

7 Things are not so straightforward. Surprisingly, younger children seem to have a different grammar in which (9) and (10) are grammatical and are actually used. Rispoli (1992) looked at the acquisition of Direct Object omissibility with eat in young children's natural production. In terms of GB, eat is one of the many English verbs with which the internal argument can be saturated. He found that at an earlier stage of development children frequently omitted a Direct Object with eat when the understood object referred to something in the discourse context, as in this exchange between a parent (P) and a child (C):


(i) Child (2;7) (Talking about a pencil)
    P: Well I see you already ate the eraser off of it. That's one of the first things you hadta do.
    C: I eat. (four times)
    P: I know you ate the eraser, so you don't need a candy bar now.

(Rispoli 1992: 590)


4. The Grammar of Be To: From a Word Grammar point of view1

KENSEI SUGAYAMA

Abstract
This chapter is an attempt to characterize the be to construction within a Word Grammar framework. First (section 2), a concise account of previous studies of the category of be in the construction is followed by a description of be in Word Grammar (section 3). Section 4, section 5 and section 6 then present a morphological, syntactic and semantic discussion of the be to construction, respectively. Section 7 gives a detailed discussion of the question whether be to is a lexical unit or not. The analysis is theoretically framed in section 8, where it is shown how Word Grammar offers a syntactico-semantic approach to the construction.

1. Introduction and the Problem

In contemporary English there is a construction instantiated by the sentences in (1):2

(1) a. You are to report here at 6 a.m.
    b. What am I to do?
    c. I am to leave tomorrow.
    d. That young boy was to become President of the United States.

I shall call this construction the be to construction.3 Previous studies have analyzed be in this construction in three different ways: (i) as a modal (e.g. Huddleston 1980); (ii) as part of 'be + to', analyzed as a quasi-auxiliary; (iii) as intermediate between the two, a semi-auxiliary.

These approaches, however, did not give enough evidence to justify their analyses. In this chapter I will argue that be in this construction is an instance of modal verb and that 'be + to' is not a lexical (and possibly not even a syntactic) unit, as it is often treated in reference grammars.4

My argument is within the framework of a theory called Word Grammar, hence called a Word Grammar account of the problem, and is based on the fact that there is ample evidence supporting the claim that there is a syntactic and semantic gap between the two elements, i.e. be and to. In the following sections, I will provide a characterization of the be to construction within a Word Grammar framework, as outlined above in the abstract.


2. Category of Be

Before we go into our analysis, let us have a brief look at what characteristics the modal be shares with other auxiliary verbs, e.g. can and have.

The table below presents the modal be, the prototypical modal can and the prototypical perfective auxiliary have with respect to the 30 criteria used in Huddleston (1980) to characterize auxiliaries and modals.

Table 1 Characteristics of be, can and have (columns: BE, CAN (modal), HAVE (perfect))

1 Non-catenative use
2 Inversion
POLARITY
3 Negative forms
4 Precedes not
5 Emphatic positive
STRANDING
6 So/neither tags
7 SV order, verb given
8 SV order, verb new
9 Complement fronting
10 Relativized complement
11 Occurrence with DO
12 Contraction
POSITION OF PREVERBS
13 Precedes never
14 Precedes epistemic adverb
15 Precedes subject quantifier
INDEPENDENCE OF CATENATIVE
16 Temporal discreteness
17 Negative complement
18 Relation with subject
TYPE OF COMPLEMENTATION
19 Base-complement
20 to-complement
21 -en complement
22 -ing complement
INFLECTIONAL FORMS
23 3rd Singular
24 -en form
25 -ing form
26 Base-form
27 Past Tense
28 Unreal mood: protasis
29 Unreal mood: apodosis
30 Unreal mood: tentative

[The +, - and R cell values for the three verbs are not reproduced here.]
NB: R means that the verb has the given property but under restricted conditions.


Clearly, at the outset we can say that be in the current construction shares quite a lot of features with a typical modal like can. In what follows I concentrate on the extent to which this claim holds.

3. Modal Be in Word Grammar

In this section, we look at what Word Grammar says about aspects of be. Word Grammar (WG), which is fully developed and formalized in Hudson (1984, 1990), is to be taken as a lexicalist grammatical theory because the word is central - hence the name of the theory - and it makes essentially no reference to any grammatical unit larger than the word.

WG uses a small triangle to show the model-instance relation: the general category (i.e. the model) sits on the base, the apex points at the member (i.e. the instance), and an extension line links the two.

Now, let us see in more detail the properties of be as in (1), to examine the claim that it should be categorized as a modal in WG terms. Consider the diagram in Figure 1.

Figure 1 BEto in Word Grammar

What Hudson (1996) basically claims in WG, using the model-instance relation, is the following:

• word is an independent entity in grammar;
• verb is an instance of word;
• auxiliary verb is an instance of verb;
• modal verb is an instance of auxiliary verb, along with other instances (e.g. HAVEs, DOs and BE);
• be in this construction (represented as BEto in Figure 1) is an instance both of modal verb and of BE.

This analysis implies that BEto may have inherited the characteristics of modal verbs and at the same time those of BE, by Hudson's Inheritance Principle, although they are not always necessarily inherited:

Inheritance Principle (final version, Hudson 2005):
If fact F contains C, and C' is an instance of C, then it is possible to infer a second fact F' in which C' replaces C, provided that:
a. F does not contain "is an instance of ...", and
b. there is no other fact which contradicts F and which can also be inherited by C'.

The idea of 'contradicting' can be spelt out more precisely, but the idea here should be clear. In a nutshell, the Inheritance Principle says that a fact about one concept C can be inherited by any instance C' of C unless it is contradicted by another, more specific fact about C'.
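The default-inheritance mechanism behind the Inheritance Principle is easy to make concrete. Below is a minimal sketch, not part of WG's formal machinery: the node names and facts are illustrative, and the hierarchy is simplified to single inheritance, whereas BEto in Figure 1 actually has two models (modal verb and BE). A fact stored on a category is inherited down the isa-chain unless a more specific fact overrides it.

```python
# Minimal sketch of default inheritance in an isa-hierarchy.
# Node names and stored facts are illustrative, not Hudson's grammar.

network = {
    "word":      {"isa": None,        "facts": {}},
    "verb":      {"isa": "word",      "facts": {"word_class": "verb"}},
    "auxiliary": {"isa": "verb",      "facts": {"accepts_nt": True}},
    "modal":     {"isa": "auxiliary", "facts": {"tensed_only": True,
                                                "complement": "bare infinitive"}},
    "BE_to":     {"isa": "modal",     "facts": {"complement": "to-infinitive"}},
}

def inherit(node, attribute):
    """Walk up the isa-chain; the most specific fact wins (default inheritance)."""
    while node is not None:
        facts = network[node]["facts"]
        if attribute in facts:       # a specific fact overrides inherited ones
            return facts[attribute]
        node = network[node]["isa"]  # otherwise look at the model
    return None

print(inherit("BE_to", "complement"))   # to-infinitive (local fact overrides 'modal')
print(inherit("BE_to", "tensed_only"))  # True (inherited from 'modal')
```

The override mirrors the argument of this chapter: BEto inherits tensedness and negative contraction from modal verb by default, while its own fact about taking a to-infinitive (section 5.2) blocks the default bare-infinitive complementation.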

In sum, WG analyzes be in the be to construction as an instance of modal verb and BE, allowing it to inherit characteristics from both heads in the model-instance hierarchy in Figure 1.

4. Morphological Aspects

I claim that be in (1) should be considered a modal because it shares most of its properties with a prototypical modal in morphology, syntax and semantics. This claim is reflected in Figure 1 by the fact that BEto has a multiple head, thus inheriting features from both heads, i.e. modal verb and BE. However, a further semantic analysis of the construction shows that the sense of the construction is derived from the sense of the infinitive clause rather than that of be. Let us start by having a look at the morphological characteristics of be in the construction and its similarities with other modals.

4.1 Like modals

Consider the following examples:

• In Standard English a modal is always tensed - i.e. either present or past.

This is compatible with the behaviour presented by be in the be to construction, as in (2):

(2) *It is a shame for John to be to leave here tomorrow. [Warner]
    *To be to leave is sad. [Pullum & Wilson]
    *I expect him to be to leave. [Pullum & Wilson]
    *He could be to leave. [Pullum & Wilson]
    *She might be to win the prize.
    *I don't like our being to leave tomorrow. [Warner]
    *I am being to put in a lot of overwork these days. [Seppanen]
    *I have always been to work together. [Seppanen]
    *Don't be to leave by midnight. [Pullum & Wilson]
    *Be to leave it till later. [Seppanen]


Clearly each of the examples in (2) shows that be in this construction cannot be non-finite. The presence of tense is what be shares with modal verbs.

• Only (tensed) auxiliary verbs accept -n't

Be in this construction allows negative contraction, which is shown only by the tensed auxiliaries; this again implies that be is a modal, because auxiliaries include modals.

(3) Her novel wasn't to win that year.

4.2 Unlike modals

• Be has a distinct -s form controlled by subject-verb agreement.

(4) {Her novel is/They were/I am/She was/We are} to win the prize.

5. Syntactic Aspects

The second aspect is related to syntax.

5.1 Like modals

• Only (tensed) auxiliary verbs allow a dependent not to follow them.

Be in this construction also has this feature, as in (5).

(5) He is not to leave this room. [They are/*get not tired.]

• It must share its subject with the following verb (i.e. it is a raising verb)5

The behaviour of be is the same as that of the typical raising verb seem, as in (6).

(6) He is to go/*He is (for) John to go
    He can speak/*He can (for) John speak
    *Mary seemed (for) John to be enjoying himself

• Voice-neutral in many circumstances

This again strongly suggests that be in the be to construction is a raising verb.

(7) You are to take this back to the library at once ~ This is to be taken back to the library at once.6

• It cannot co-occur with other modals

This feature is critical in that two members of the same category of modal verbs cannot appear consecutively. (8) shows that might and be to belong in the same category of modal verbs.


(8) *She might be to win the prize.

• It can precede perfective/progressive/passive auxiliary

When be appears with a perfective, progressive or passive auxiliary verb, it always appears in the left-most slot reserved for modal verbs, i.e. immediately before these auxiliaries.

(9) Her novel {was to have won/was to be going on display/was to be considered} that year.

• It is an operator, i.e. it has the NICE properties [though Code only works if the same verb is given in the previous clause, as in (11)]

It has the NICE properties, and is therefore an operator in the sense of Quirk et al. (1985); the NICE properties characterize auxiliaries and are also shared by modals.

(10) Was her novel to win the prize? Mine was.
(11) *Joe's novel would win a prize that year. Mine wasn't.

5.2 Unlike modals

• It takes a to-infinitive rather than a bare infinitive as its complement.

(12) That young boy was *(to) become President of the United States.

6. Semantics of the Be To construction

There exists an array of meanings this be to has: arrangement, obligation, predestined future, 'future in the past', possibility, purpose ('to be intended to') and hypothetical condition.

It must be noted here that be in this construction has both epistemic and non-epistemic meanings, which is again a diagnostic that typical modals show. The sentences in (13)-(19) involve non-epistemic instances of the construction:

(13) She is to see the dean tomorrow at 4 p.m.
(14) You are to sit down and keep quiet.
(15) You're to marry him within the next six months.
(16) Their daughter is to be married soon. [Quirk et al.]
(17) They are to be married in June. [OALD6]
(18) The Prime Minister is to get a full briefing on the release of the hostages next week.
(19) Ministers are to reduce significantly the number of examinations taken by pupils in their first year in the sixth form as the result of an official review to be published later this week. The review will recommend dismantling the modular system of assessment that is at the heart of the new sixth-form curriculum. [The Times]


Although be to has several different meanings, its basic (or core) meaning can be stated as follows:

• The agent has been set or scheduled to do something by some external (outside) force, and is thus obliged. However, the agent's commitment to the obligation is left open.

Here the key points are the arrangeability of the event described and the openness of the agent's commitment to the obligation. The first point is most easily detected when the construction occurs with an event that cannot be arranged. Consider the following examples:

(20) ?The sun is to rise at 5.15 a.m. tomorrow morning.
(21) The sun will rise at 5.15 a.m. tomorrow morning.
(22) You are to take these four times a day.

A straightforward example of the use of be to in a context where an event cannot be arranged can be found in (20). (20) is odd in comparison with both of the other sentences, for the reason I am about to make clear: the fact that the sun's rising is normally not (indeed cannot be) arranged is the reason why (20) is low in acceptability.7 In contrast, (21) is all right, with will implying the speaker's subjective prediction. One can also utter a sentence like (22), which depicts an arrangeable event.

One might take it for granted that be to necessarily implies the arrangement of an event, but that would be to miss the more general point that there is no need to express the agent, as in (23) or (24):

(23) There's to be an official inquiry. [Quirk et al.]
(24) Regional accents are still acceptable but there is to be a blitz on incorrect grammar. [COBUILD2]

What is needed for the non-epistemic meaning of the be to construction is that the sentence expresses an arrangeable event or activity.

There is another use of be to representing 'predestined future', as in (25)-(30):

(25) They are to stay with us when they arrive. [CLD]
(26) You are to be back by 10 o'clock. [Quirk et al.]
(27) A clean coal-fired power plant is to be built at Bilsthorpe Colliery. [COBUILD]
(28) You are to take this back to the library at once ~ This is to be taken back to the library at once.
(29) 'They are to be seen and displayed on walls and floors both in museums and domestically.' [UK written, COBUILD WB]
(30) I've also learned that in these difficult times it truly is important that we're all thinking together about what is to be done and how best to move. [US spoken]

All these examples assert the speaker's high certainty, at the speech time, of the event happening in the (near) future.


Related to this usage type is 'future in the past', a case where the speech time is transferred to some point in the past.

(31) a. After dinner they were to go to a movie. [COBUILD3]
     b. Then he received a phone call that was to change his life... [COBUILD4]
(32) He was eventually to end up in the bankruptcy court. [Quirk et al.]
(33) The meeting was to be held the following week. [Quirk et al.]
(34) Her novel was to win the prize.
(35) Worse was to follow.
(36) This episode was to be a taste of what was to come in the following couple of weeks.

Different or varied meanings such as 'compulsion', 'plan', 'destiny', etc. can derive from the core meaning according to the context it appears in, as in (37a) and (37b). In (37), the part before and/as is the same in both sentences. Nevertheless the interpretation of this part at the level of sentence meaning is quite different. Where does this difference come from? The context, more precisely the following context in this particular case, is responsible for this difference. What is interesting is that the meaning (sense) of be to is determined by the following context.

(37) a. You aren't to marry him, and that's an order.
     b. You aren't to marry him, as I read it in the cards.

(37a) is obviously interpreted as an order, while (37b) has an epistemic predictive sense. It is manifested quite clearly above that the array of connotations is pragmatically determined.

On the other hand, the be to has epistemic meanings, illustrated in (38)-(42).

(38) Such an outcome is to be expected.
(39) These insects are to be found in NSW. [Huddleston 1980: 66; Seppanen]

Furthermore, it can be used in conditionals in English, as in (40)-(42).

(40) And the free world has reacted quickly to this momentous process and must continue to do so if it is to help and influence events. [ICE-GB: S1B-054 #17:1:B]
(41) the system is totally dependent on employee goodwill if it is to produce good information. [ICE-GB: W2A-016 #118:1]
(42) However, in nerves regeneration is essential if there is to be a satisfactory functional outcome. [ICE-GB: W2A-026 #15:1]

There is arguably a clear-cut distinction between be to and the epistemic modals in their use in conditionals. It is practically ruled out, or catalogued as a performance error, for speakers of English to select an epistemic modal for the protasis of a conditional, even though the meaning of this modal is conceptually quite compatible with the functioning of either part (protasis or apodosis) of a conditional. The contrast in (43) serves as a most relevant observation to explain this phenomenon.


(43) a. ??If it may rain, you should take your umbrella.
     b. If it is possible that it will rain, you should take your umbrella.

According to Lyons (1977: 805-86), 'conditional clauses are incompatible with subjective epistemic modal expressions'. In (43a), may in the protasis if it may rain reflects a figment of the speaker's imagination and merely expresses possibility as non-factual, which is in conflict with the other possible world created by if. By contrast, the possibility expressed by the non-modal expression in an acceptable utterance like (43b) refers to possibility as actuality, independent of the speaker; this possibility is categorically asserted and is therefore factual.

In passing, non-modal expressions can express modal-like meanings, as in (44) and (45):

(44) a. It's your duty to visit your ailing parents.
     b. You ought to visit your ailing parents.

(45) a. Jessica is possibly at home now.
     b. Jessica may be at home.

In the end, there are of course differences between be to and modal verbs. Still, be in (1) shares enough properties with modal verbs to be categorized as a modal; and although it may be so categorized, the sense of the be to construction is best considered to be the existence of a situation in which the event is represented by the VP in the infinitive. The modal-like meanings of the construction are derived from the sense of to rather than that of be, as is attested in the following section.

7. Should To Be Counted as Part of the Lexical Item?

Let me take pieces of evidence one by one to argue for my proposal.

• Inversion

In a yes-no question, what moves to the front is not be to but be, which suggests that be behaves like a modal (operator), with to being an infinitive marker.

(46) He should go.
     Should he go?

(47) He ought to go.
     *Ought to he go?
     Ought he to go?

(48) He is to go.
     *Is to he go?
     Is he to go?

• VP fronting (Gazdar et al., 1982)

The impossibility of VP fronting, as in (49f), shows that be itself isn't a modal. If it were, it would behave like will in (49b):


(49) a. *... and went he
     b. ... and go he will
     c. ... and going he is
     d. ... and gone he has
     e. ... and taken by Sandy he was
     f. *... and to go he is
     g. *... and to go he wants
     h. *... and be going he will
     i. *... and have gone he will
     j. ... and being evasive he was

• Be may be separated from to.

Therefore, be to isn't a syntactic unit.

(50) We are, I believe, to start tomorrow.
(51) The most severe weather is yet/still to come. [Quirk et al.: 143]
(52) He was eventually to end up in the bankruptcy court. [Quirk et al.: 218]

• To may be omitted in the tag.

If be to were a unit, to would have to be retained in the tag.

(53) He was to have gone, wasn't he?

• Unlike with ought to and have to, to doesn't have to be retained in be to when the VP that follows to is deleted.

Since deletion of the VP after to appears always to be possible whether the relevant verb is a modal or not, as in (55), this contrast tells us nothing about the category of the item before to. What (54) does imply, however, is that the VP deletion is dependent on be in be to, rather than on to, which in turn suggests that be is a modal in its own right.

(54) Bill is to leave at once, and Alice is (to) also. [McCawley]
     Bill has to leave at once, and Alice has *(to) also. [McCawley]
     We don't save as much money these days as we {ought (to)/used to}. [Quirk et al.: 909]

(55) I've never met a Klingon, and I wouldn't want to. [Pullum, 1982: 185]

• Unlike ought to and have to, there is no phonetic fusion with be to.

Though it is not fully clear whether the examples in (56) are zeugmatic or not, the to-infinitive may be coordinated with a wide range of conjuncts of different categories (Warner 1993). My informants, however, say that they are all right without a zeugmatic reading. If this is the case, the to-infinitive is an independent unit in be to and there has to be a syntactic gap between be and to.


(56) He was new to the school and to be debagged the following day.
     The old man is an idiot and to be pitied.
     You are under military discipline and to take orders only from me.
     You are a mere private and not to enter this mess without permission.
     He was an intelligent boy and soon to eclipse his fellows.

If this is the case and there is a one-to-one relation between syntax and semantics, as is maintained in WG, the to-infinitive is semantically as well as functionally an independent element in the be to construction, and therefore be to is obviously not a syntactic unit, although there has to be some grammatical relation between the two elements (i.e. be and the to-infinitive). What the to-infinitives in (56) have in common seems to be the function of predication; otherwise they could not be coordinated with the conjuncts before and in (56). This implies that be is an ordinary predicative be, followed by an infinitival clause with to of predicative function.

Sentences in (57) and (58) give evidence supporting this predicative function of the infinitive clause, because no be is found in the examples in (57). Nevertheless, the NPs in (57) and (58) all express predication, although they are NPs as a whole in syntactic terms:

(57) Prudential to float its Egg online bank
     Woman to head British Library
     Teeth to be farmed
     Naked swim teacher to sue
     Vestey grandson to stand trial
     Hayward Gallery to be refurbished and extended
     Tendulkar to stand down as captain

(58) ... the quaint aspects of working class life to be found in many major novelists [Palmer]

This predicative analysis of the to-infinitive is also supported by (59a), where the to-infinitive is part of a small clause with the preposition with. It is a well-attested fact that with heads a small clause expressing predication, as in (59b). Therefore, the to-infinitive functions as a predicative in (59a).

(59) a. ... in a consultation paper agreed with Dublin to be released at the... [COBUILD]

b. With Peter the referee we might as well not play the match. [Aarts 1992: 42]

All these arguments show that be to is not a lexical unit.

8. A Word Grammar Analysis of the Be To Construction

In this section, I will show how a WG analysis makes it possible to give the same syntactic and semantic structure to the epistemic and non-epistemic meanings of the construction, based on the evidence that be is an instance of a raising verb in both cases. As far as I know, there has been very little research, if any, on the mapping between the semantic and syntactic structures of core and


marginal modals, including the present construction. In this sense, my approach is quite a valuable one. In these linguistic circumstances, I suggest that the question to be asked is: what must be the mapping between the semantic and syntactic structure of what is represented by the be to construction? I now present an answer framed in WG terms. Before giving a detailed analysis, let us have a quick look at WG in a nutshell.

In WG, a syntactic structure is based on grammatical relations within a general framework of dependency theory, rather than on constituent structure. Accordingly, a grammatical relation is defined as a dependency relation between the head and its dependents, which include complements and adjuncts (alias modifiers). In this framework, the syntactic head of a sentence, as well as its semantic head, is therefore a finite verb on which its dependents such as subject, object and so forth depend. To take a very simple example, the grammatical analysis of you are reading a Word Grammar paper is partially shown by the diagram in Figure 2.

Figure 2 Syntax and semantics in WG

Each arrow in this diagram shows a dependency between words and points from a head to one of its dependents, but what is most important here is that there are no phrases or clauses in the sense of constituency grammars. Thus, in terms of dependency, are is the root of the sentence (represented by a bold arrow), on which you and reading depend as subject and sharer (a kind of complement),8 respectively. In turn, a is the head of paper; Grammar depends on paper, Word depends on Grammar, and so on. Turning to the semantics of this sentence, 'you', the referent of you, is linked as a read-er (i.e. agent) to the semantic concept 'you read a Word Grammar paper', an instance of 'read'.9
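The dependency analysis just described can also be rendered as a small data structure. The sketch below is purely illustrative (the keys `head` and `relation` are my own labels, not official WG notation), but it captures the word-to-word links of Figure 2:

```python
# Illustrative sketch of the WG dependency analysis of
# "you are reading a Word Grammar paper".
# Each word records only its head and the grammatical relation to it;
# the root ("are") has no head. Labels are mine, not WG's own notation.

sentence = {
    "you":     {"head": "are",     "relation": "subject"},
    "are":     {"head": None,      "relation": "root"},
    "reading": {"head": "are",     "relation": "sharer"},
    "a":       {"head": "reading", "relation": "object"},
    "paper":   {"head": "a",       "relation": "complement"},
    "Grammar": {"head": "paper",   "relation": "dependent"},
    "Word":    {"head": "Grammar", "relation": "dependent"},
}

def dependents(word):
    """All words that depend directly on `word`."""
    return sorted(w for w, d in sentence.items() if d["head"] == word)

print(dependents("are"))  # ['reading', 'you']
```

Because each word stores only a link to its head, phrases never need to be represented as objects in their own right: a 'phrase' is recoverable as a word plus everything that depends on it, directly or indirectly.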

The curved (vertical) lines point to the referent of a word. '-Er' and '-ee' are names of semantic relations, or thematic roles. The small triangle representing the


model-instance relation is the same as in earlier figures. A convenient way to diagram the model-instance relation is by using a triangle with its base along the general category (= model) and its apex pointing at the member (= instance), with an extension line to link the two. In Figure 2, then, the diagram shows the relation between the sense of the word read, 'read', and its instance 'you read a WG paper'.

Based on the arguments in the preceding sections, the diagrams in (60a) and (60b) offer a view of the syntax and semantics of the be to construction, which has both epistemic and non-epistemic senses. Translated into WG schematic features, this means that in the syntax of both senses it has a raising structure, represented as the main subject functioning as both the subject of be and that of to or the infinitival verb.

Thus the WG configuration posits by and large the same semantic structure for both senses of the construction, which is headed by the semantic concept 'Be'. The detailed analysis of the epistemic structure is given in (60a):

What this diagram shows about the semantic structure is:

• the epistemic sense is an instance of modality;
• the sense of aren't is 'be' (neglecting the negation);
• 'be' is an instance of epistemic modality;
• 'be' has a proposition as a dependent.

Similarly in (60b), containing the non-epistemic sense of be to:

(60) a.


(60) b.

Here 'be' is an instance of non-epistemic modality, because (60b) means that some event is arranged or planned, without any sense of the speaker's judgement on the proposition embedded in the utterance (sentence). In both cases, the meaning (i.e. sense) of 'be' needs as a dependent a proposition which expresses an activity or event, which is the sense of the verb.

Abstracting away from the technical markers, the diagram in (60c) represents a WG analysis of the coordinate structure in (56). This diagram schematizes the very idea that the same predicative function (a dependency relation) holds between was and the first conjunct new to the school, enclosed in square brackets, on the one hand, and between was and the second conjunct to be debagged ... on the other, at the same time.

(60) c.


9. Conclusion

In this chapter, I have shown that a morphological, syntactic and semantic analysis of be in the be to construction provides evidence for the category of be proposed here. Namely, be is an instance of a modal verb in terms of morphology and syntax, while the sense of the whole construction is determined by the sense of 'to'. The analysis also explains why be to does not constitute a lexical unit. Finally, the WG account presented here gives the same syntactic and semantic structures to the construction, reducing the complexity of the mapping between the two levels of structure of the modal-like expression be to.

References

Aarts, Bas (1992), Small Clauses in English. Berlin: Mouton de Gruyter.
Bybee, John (1994), The Evolution of Grammar. Chicago: The University of Chicago Press.
Celce-Murcia, Marianne and Larsen-Freeman, Diane (1999), The Grammar Book. Boston, MA: Heinle & Heinle.
Collins Cobuild English Language Dictionary for Advanced Learners (1995), Glasgow: HarperCollins Publishers.
Collins Cobuild English Language Dictionary for Advanced Learners (2001), Glasgow: HarperCollins Publishers.
Collins Cobuild Advanced Learner's English Dictionary (2003), Glasgow: HarperCollins Publishers.
Gazdar, Gerald, Pullum, Geoffrey K. and Sag, Ivan A. (1982), 'Auxiliaries and related phenomena in a restrictive theory of grammar'. Language, 58, 591-638.
Greenbaum, Sidney, Leech, Geoffrey N. and Svartvik, Jan Lars (eds) (1980), Studies in English Linguistics for Randolph Quirk. London: Longman.
Huddleston, Rodney (1976), 'Some theoretical issues in the description of the English verb'. Lingua, 40, 331-383.
— (1980), 'Criteria for auxiliaries and modals', in Greenbaum, Sidney, et al. (eds), Studies in English Linguistics for Randolph Quirk. London: Longman, pp. 65-78.
Hudson, Richard A. (1984), Word Grammar. Oxford: Blackwell.
— (1990), English Word Grammar. Oxford: Blackwell.
— (1996), A Word Grammar Encyclopedia (Version of 7 October 1996). University College London.
— (2005, February 17 - last update), 'An Encyclopedia of English Grammar and Word Grammar' (Word Grammar). Available: www.phon.ucl.ac.uk/home/dick/wg.htm (Accessed: 21 April 2005).
Kreidler, Charles W. (1998), Introducing English Semantics. London: Routledge.
Lampert, Gunther and Lampert, Martina (2000), The Conceptual Structure(s) of Modality. Frankfurt am Main: Peter Lang.
Lyons, John (1977), Semantics. Cambridge: Cambridge University Press.
McCawley, James D. (1988), The Syntactic Phenomena of English. Chicago: University of Chicago Press.
Napoli, Donna Jo (1989), Predication Theory. Cambridge: Cambridge University Press.
Palmer, Frank Robert (1990), Modality and the English Modals (2nd edn). Harlow: Longman.
— (2001), Mood and Modality. Cambridge: Cambridge University Press.
Perkins, Michael R. (1983), Modal Expressions in English. London: Frances Pinter.


Pullum, Geoffrey K. (1982), 'Syncategorematicity and English infinitival to'. Glossa, 16 (2), 181-215.
Pullum, Geoffrey K. and Wilson, Deirdre (1977), 'Autonomous syntax and the analysis of auxiliaries'. Language, 53, 741-88.
Quirk, Randolph, Greenbaum, Sidney, Leech, Geoffrey N. and Svartvik, Jan Lars (1985), A Comprehensive Grammar of the English Language. Harlow: Longman.
Seppanen, A. (1979), 'On the syntactic status of the verb be to in Present-day English'. Anglia, 97, 6-26.
Sugayama, Kensei (1996), 'Semantic structure of eat and its Japanese equivalent taberu: a Word-Grammatic account', in Barbara Lewandowska-Tomaszczyk and Marcel Thelen (eds), Translation and Meaning, Part 4. Maastricht: Universitaire Pers Maastricht, pp. 193-202.
— (1998), 'On be in the be to Construction', in Yuzaburo Murata (ed.), Grammar and Usage in Contemporary English. Tokyo: Taishukan, pp. 169-77.
Warner, Anthony R. (1993), English Auxiliaries. Cambridge: Cambridge University Press.

Notes

1 This is a revised and expanded version of my paper of the same title read at the International Conference 'Modality in Contemporary English' held in Verona, Italy on 6-8 September 2001. I am most grateful for the comments from the audience at the conference. Remaining errors are, however, entirely my own. The analysis reported here was partially supported by grants from the Daiwa Anglo-Japanese Foundation (Ref: 02/2030). Their support is gratefully acknowledged.

2 Here we shall not take into account similar sentences such as (i), which are considered to have a different grammatical structure from the one we are concerned with in this chapter.
  (i) My dream is to visit Florence before I die.

3 The idea of construction here is quite a naive one, different from the technical definition used in Goldberg's Construction Grammar.

4 Palmer (1990: 164), among others, claims that 'is to' is formally a modal verb.
5 Be to can be used in the there construction as in (i).

  (i) Regional accents are still acceptable but there is to be a blitz on incorrect grammar. [COBUILD2]

This suggests that be is a raising verb, because there is no semantic relation between there and 'be to'.

Example (i) is construed as a counter-example to the view that this construction involves subject control, as in (ii).
  (ii) Mary is [PRO to leave by 5]. [Napoli]

6 The possibility sense is found only in the passive, so there is no active counterpart for These insects are to be found in NSW. [Huddleston 1980: 66; Seppanen]

7 Quite exceptionally, it could be arranged by God or other supernatural beings. Otherwise it cannot be.

8 Sharer is a grammatical relation in Word Grammar.
9 Here we disregard the tense and aspect of the sentence.


5 Linking in Word Grammar

JASPER HOLMES

Abstract

In this chapter I shall develop an account of the linking of syntactic and semantic arguments in the Word Grammar (WG) framework. The WG account is shown to have some of the properties of role-based approaches and some of the properties of class-based approaches.

1. Linking in Word Grammar: The syntax semantics principle

1.1 Introduction

Any description of linguistic semantics must be able to account for the way in which words and their meanings combine in sentences. Clearly, this presupposes an account of the regular relationships between syntactic and semantic structures: a description of the mechanisms involved in linking.

The search for an adequate account of linking has two further motivations: it makes it possible to explain the syntactic argument-taking properties of words (and therefore obviates the need for valency lists or other stipulative representations of subcategorization facts); and it provides a framework for dealing with words whose argument-taking properties vary regularly with the word's meaning (many such cases are treated below and in the work of other writers in the field of lexical semantics, including Copestake and Briscoe 1996; Croft 1990; Goldberg 1995; Lemmens 1998; Levin 1993; Levin and Rappaport Hovav 1995; Pustejovsky 1995; and Pustejovsky and Boguraev 1996).

Levin and Rappaport Hovav provide yet another reason to seek an account of argument linking: that it is an intrinsic part of the structure of language. In their introduction, they make the following claim:

To the extent that the semantic role of an argument is determined by the meaning of the verb selecting it, the existence of linking regularities supports the idea that verb meaning is a factor in determining the syntactic structure of sentences. The striking similarities in the linking regularities across languages suggest that they are part of the architecture of language. (1995: 1, my emphasis)

Of course, it is not the meanings of verbs alone that are relevant in determining semantic structure. It should also be clear that I do not share Levin and


Rappaport Hovav's conviction of the similarities across languages in the details of argument linking. However, I accept readily that the fact of argument linking, and the mechanism that controls it, must be shared across languages.

The linking regularities that we seek are generalizations over correspondences between syntactic and semantic relationships. In the WG framework (Hudson 1984, 1990, 1994, 2004; Holmes 2005), they take the form of specializations or refinements of the Syntax Semantics Principle (SSP) (Hudson 1990: 132). This is represented schematically in Figure 1 and given in prose in (1). The SSP, as shown here, corresponds to the bijection principle of Lexical Functional Grammar (Bresnan 1982) and to the projection principles and θ-criterion of Government and Binding Theory (GB) (Chomsky 1981: 36, 38).

Figure 1 Syntax Semantics Principle

(1) Syntax Semantics Principle (SSP): A word's dependent refers to an associate of its sense.1

Specific linking rules for specific relationships link classes of syntactic dependency with classes of semantic associate. These classes gather together the relevant syntactic and semantic properties. By way of exemplification, I begin with the structure associated with the indirect object relationship. I go on to discuss the properties of objects and subjects.
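Read procedurally, such linking rules amount to a lookup pairing each class of syntactic dependency with a class of semantic associate of the word's sense. A minimal sketch with an invented rule table (the pairings anticipate the indirect-object discussion below; they are illustrative, not an exhaustive WG rule set):

```python
# Minimal sketch of SSP-style linking: each class of syntactic
# dependency is paired with the semantic associate of the head's
# sense that the dependent refers to. Table and names are
# illustrative only, not WG's official inventory.

LINKING_RULES = {
    "subject":         "er",
    "object":          "ee",
    "indirect object": "recipient",
}

def semantic_role(dependency):
    """Which associate of the head's sense does this dependent refer to?"""
    return LINKING_RULES[dependency]

# E.g. the referent of GIVE's indirect object is the recipient of Giving:
print(semantic_role("indirect object"))  # recipient
```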

1.2 Indirect objects

Figure 2 shows some of the syntactic and semantic structure that needs to be associated lexically with GIVE: the verb has a subject (s in the diagram), an object (o in the diagram) and an indirect object (io in the diagram), all of which are nouns. The sense of the verb, Giving, is an event with an 'er' (the referent of the subject; the properties of 'ers' and 'ees' are discussed shortly), an 'ee' (the referent of the object), a recipient (the referent of the indirect object) and a result,


an example of Having which shares its arguments with its parent. The giver has agentive control over the event, being in possession of the givee beforehand and willing the transfer of possession. The givee and the recipient are more passive participants: the former undergoes a change of possession, but nothing else; the latter simply takes possession of the givee. Being a haver presupposes other properties (centrally humanity), but those are not shown here.

Figure 2 GIVE

Clearly, not all this information is specific to GIVE. Volitional involvement, control and instigation are semantic properties associated with many other subject relationships, even those of verbs that have no (indirect) objects; the passive role and affectedness of the object also apply in many other cases; and many other verbs can appear with indirect objects, with similar semantic properties. Levin provides the following two groups of verbs permitting indirect objects (1993: 45-49), distinguished from each other by semantic properties (those in (2) alternate, according to Levin's analysis, with constructions with the preposition TO, those in (3) with constructions with FOR). The question of alternation, as well as the difference between the two groups, is dealt with shortly.

(2) ADVANCE, ALLOCATE, ALLOT, ASK, ASSIGN, AWARD, BARGE, BASH, BAT, BEQUEATH, BOUNCE, BRING, BUNT, BUS, CABLE, CARRY, CART, CATAPULT, CEDE, CHUCK, CITE, CONCEDE,


DRAG, DRIVE, E-MAIL, EXTEND, FAX, FEED, FERRY, FLICK, FLING, FLIP, FLOAT, FLY, FORWARD, GIVE, GRANT, GUARANTEE, HAND, HAUL, HEAVE, HEFT, HIT, HOIST, HURL, ISSUE, KICK, LEASE, LEAVE, LEND, LOAN, LOB, LUG, MAIL, MODEM, NETMAIL, OFFER, OWE, PASS, PAY, PEDDLE, PHONE, PITCH, POSE, POST, PREACH, PROMISE, PULL, PUNT, PUSH, QUOTE, RADIO, READ, REFUND, RELAY, RENDER, RENT, REPAY, ROLL, ROW, SATELLITE, SCHLEP, SELL, SEMAPHORE, SEND, SERVE, SHIP, SHOOT, SHOVE, SHOW, SHUTTLE, SIGN, SIGNAL, SLAM, SLAP, SLIDE, SLING, SLIP, SMUGGLE, SNEAK, TAKE, TEACH, TELECAST, TELEGRAPH, TELEPHONE, TELEX, TELL, THROW, TIP, TOSS, TOTE, TOW, TRADE, TRUCK, TUG, VOTE, WHEEL, WILL, WIRE, WIRELESS, WRITE, YIELD.

(3) ARRANGE, ASSEMBLE, BAKE, BLEND, BLOW, BOIL, BOOK, BREW, BUILD, BUY, CALL, CARVE, CASH, CAST, CATCH, CHARTER, CHISEL, CHOOSE, CHURN, CLEAN, CLEAR, COMPILE, COOK, CROCHET, CUT, DANCE, DESIGN, DEVELOP, DIG, DRAW, EARN, EMBROIDER, FASHION, FETCH, FIND, FIX, FOLD, FORGE, FRY, GAIN, GATHER, GET, GRILL, GRIND, GROW, HACK, HAMMER, HARDBOIL, HATCH, HIRE, HUM, IRON, KEEP, KNIT, LEASE, LEAVE, LIGHT, MAKE, MINT, MIX, MOLD, ORDER, PAINT, PHONE, PICK, PLAY, PLUCK, POACH, POUND, POUR, PREPARE, PROCURE, PULL, REACH, RECITE, RENT, RESERVE, ROAST, ROLL, RUN, SAVE, SCRAMBLE, SCULPT, SECURE, SET, SEW, SHAPE, SHOOT, SING, SLAUGHTER, SOFTBOIL, SPIN, STEAL, STITCH, TOAST, TOSS, VOTE, WASH, WEAVE, WHISTLE, WHITTLE, WIN, WRITE.

The set of verbs that can take an indirect object, of either kind, is in principle unlimited in size, since it is possible to extend it in one of two ways. First, membership is open to new verbs which refer to appropriate activities:

(4) We radioed/phoned/faxed/emailed/texted/SMSed them the news.
(5) We posted/mailed/couriered/FedExed™ them the manuscript.
(6) Boil/coddle/microwave/Breville™ me an egg.

Second, and even more tellingly, existing verbs can be used with indirect objects, with novel meanings contributed by the semantics of the indirect object:

(7) The colonel waggled her his bid with his ears.
(8) Dust me the chops with flour.

Examples like (7) and (8) are acceptable to the extent that the actions they profile can be construed as having the appropriate semantic properties. For example, a bottle of beer can be construed as having been prepared for someone if it has been opened for them to drink from, but a door is not construed as prepared when it has been opened for someone to pass through:


(9) Open me a bottle of pils/*the door.

It is clear from this, and from the fact, noted by Levin (1993: 4-5) with respect to the middle construction, that speakers make robust judgements about the meanings of unfamiliar verbs in constructions on the basis of the construction's meaning (see also (10)), that the meaning of the construction must be represented in a schematic form in the mind of the language user.

(10) Flense me a whale.

This schematic representation must pair the semantic properties of the construction with its syntactic and formal (phonological/graphological) properties. Goldberg (2002) provides a powerful further argument for treating constructions as symbolic units in this way. This argument, which she traces to Chomsky (1970) and to Williams (1991) (where it is called the 'target syntax argument'), holds that where the properties of supposedly derived structures (here the creative indirect objects) match those of non-derived ones (here the lexically selected indirect objects), the generalization over the two sorts of structure is most effectively treated as an argument structure construction in its own right.

In English, which does not have a rich inflectional morphology, the formal properties of the indirect object relationship are limited to the fact that personal pronouns in the indirect object position appear in their default form ({me} not {I}), which is also true of direct objects. Other languages show more variation, marking the presence of an indirect object in the form of the verb, as in (11) from Indonesian (Shibatani 1996: 171), or assigning different case to nouns in indirect object position than those functioning as direct objects, as in German (12).

(11) Saya membunuh-kan Ana lipas.
     I kill-BEN [name] centipede
     'I killed a centipede for Ana.'

(12) a. Gib ihr/*sie Blumen.
        give her flowers
        'Give her flowers.'
     b. Küss *ihr/sie.
        kiss her
        'Kiss her.'

Some syntactic properties of the indirect object (in English) are given by Hudson (1992). These include the possibility of merger with the subject in passive constructions (13), the obligatoriness of direct objects in indirect object constructions (14) and the indirect object's position immediately following the verb (15).

(13) She was given some flowers.
(14) We gave (her) *(some flowers).
(15) a. We gave her some flowers/*some flowers her.
     b. We sent her some flowers over/her over some flowers/*some flowers her over/*over her some flowers.


The semantic property common to all indirect objects is that they refer to havers: in the case of the verbs taking 'dative' indirect objects in (2), the result of the verb's sense is that the referent of the indirect object comes into possession of something; in the case of those taking 'benefactive' indirect objects in (3), the verb profiles an act of creating or preparing something intended to be given to the referent of the indirect object.

Figure 3 Some verbs have indirect objects

Figure 3 shows the various properties associated with indirect objects. First, the diagram shows that indirect objects are nouns, and that it is verbs, and more particularly verbs with objects, that have indirect objects: Ditransitive, the category of verbs with indirect objects, isa Transitive, the category of verbs with direct objects. This is enough by itself to represent the fact that the direct object is obligatory with indirect objects (14), but the object relationship is nevertheless also shown in the ditransitive structure, since it appears in the word order rule (indirect objects precede objects). The referent of the object also appears in the semantic structure, along with that of the indirect object, since without it the semantic structure cannot be interpreted. I show the two referents as coarguments of the result of the verb's sense, though the semantics is worked out more clearly in the discussion of Figure 4. The fact that indirect objects can merge with subjects in passive constructions is dealt with in the following section.

Indirect objects may have one of two slightly different semantic structures, each associated with a separate category of ditransitive verbs. In both, the referents of the two dependents are 'er' and 'ee' of a Having, but the role of that Having differs somewhat between the two. The two structures are given in Figure 4.


Figure 4 Two kinds of indirect object

Ditransitive/1 is exemplified in (16):

(16) We baked her a cake.

The sense of the verb isa Making, and its result (therefore) isa Being (is a state) and the argument of that Being is the referent of the direct object: baking a cake results in that cake's existence, baking a potato results in that potato's being ready. The Having that connects the referents of the two arguments is the purpose of the verb's sense: the purpose of the baking of the cake is that it should belong to her (the referent of the indirect object). This concept is connected to the sense of the verb by the beneficiary relationship (labelled ben/fy). Ditransitive/2 (17) has as its sense a Giving event, which straightforwardly has as its result the Having that connects the referents of the two arguments. The referent of the indirect object is connected to the sense of the verb by the recipient relationship.

(17) We passed her a parcel.

Once these two semantic structures are established, they can be used in the treatment of the relationship between the indirect object construction and constructions with the prepositions TO and FOR. Simply, TO has the same sense as Ditransitive/2 and FOR the same as Ditransitive/1 (with some differences: see (18)). This synonymy can, though it need not, be treated as a chance occurrence: no explanation is necessary for the relationship between constructions with indirect objects and those with TO. The case of FOR and the difference seen in (18) certainly supports the idea that the two constructions converge on a single


meaning by chance, since the two meanings are in fact different. The use of the indirect object to refer to the beneficiary of an act of preparation is only possible where the prepared item is prepared so it can be owned (or consumed) by the beneficiary; this constraint does not apply to beneficiary FOR.

(18) a. Open a bottle of pils/the door for me.
     b. Open me a bottle of pils/*the door.

The pattern in Figure 3 (and Figure 4) represents a symbolic relationship. Lexical structures include specifications of the meanings of individual lexemes and of classes of lexemes defined by common properties of all sorts. A lexeme has a form and a range of syntactic properties which identify the syntactic pole of the symbolic relationship; it also has a sense, which provides the connection to a range of semantic properties. Similarly, inflectional and other classes of lexemes share formal, syntactic and semantic properties. And similarly, syntactic dependencies are associated with a range of formal and syntactic properties (chiefly constraints on the elements at either end of the dependency) and semantic properties (represented in the semantic relationship between the meanings of the two elements). Figure 5 shows, by way of an example, partial lexical structures for the lexeme OPEN, the inflectional category Past and the indirect object relationship.

Figure 5 Schematic representation of OPEN, Past, indirect object

The pattern in Figure 3 (and Figure 4) is a generalization over verbs taking indirect objects. A verb appearing in a construction with an indirect object instantiates the more general model. The model represents the properties of the construction in the same way as a lexeme represents the properties of a particular word. In the case of a novel use of the construction (19), the fact that the sentence conforms to the formal properties entails that it also conforms to the semantic properties of the construction. In fact the construction can also be used to constrain the set of verbs that may take an indirect object, since only


those verbs that can conform to the properties of the construction can appear in it: *Skate me a half-pipe/*Run me a mile, etc.

(19) Waggle me your bid.

Examples like (19) represent cases of multiple inheritance: the verb instantiates both WAGGLE (from which it gets its form and much of its meaning) and Ditransitive (from which it gets the indirect object and concomitant semantic properties). This is the same mechanism that mediates verbal inflection: the past tense of a verb inherits from the verb's lexeme and from the category Past at the same time.
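This kind of multiple inheritance maps naturally onto class systems. The sketch below uses Python's multiple inheritance to mimic the idea; the class and attribute names are my own shorthand, not WG notation:

```python
# Sketch of WG-style multiple inheritance. A creative ditransitive
# use of WAGGLE inherits its form from the lexeme and its argument
# structure from Ditransitive. Names are illustrative only.

class Transitive:
    dependents = ("subject", "object")

class Ditransitive(Transitive):  # Ditransitive isa Transitive
    dependents = Transitive.dependents + ("indirect object",)

class WAGGLE:
    form = "waggle"

class WaggleDitransitive(WAGGLE, Ditransitive):
    """'Waggle me your bid': form from WAGGLE, arguments from Ditransitive."""

w = WaggleDitransitive()
print(w.form, w.dependents)
```

The creative use in (19) corresponds to defining `WaggleDitransitive` on the fly: nothing has to be stipulated beyond the two parents it inherits from.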

Because of this possibility, it is not necessary to include all of the structure in the diagrams in the lexical specification even of a verb like GIVE, since (when it is used ditransitively) the relevant properties follow from the general properties of ditransitive verbs. These verbs, whose use with an indirect object seems unexceptional compared to those like (19), probably are lexically associated with the indirect object construction. GIVE, for example, might be separated into two sub-types, one of which isa Ditransitive, and the other of which takes TO as a complement. By contrast, a verb like ACCORD, that never appears without an indirect object, inherits all the properties of Ditransitive.

Figure 6 shows a part of the lexical structure of ACCORD and GIVE. All cases of ACCORD have indirect objects, so the whole category is subsumed under Ditransitive. GIVE, by contrast, is divided into two subcategories: one which isa Ditransitive, and one which isn't (this category has TO as a complement). The diagram also shows that creative use of the indirect object as in (19) can be mediated by a contextual (= non-lexical) specialization of the relevant lexeme, which inherits also from the 'inflectional' category Ditransitive.

Figure 6 ACCORD, GIVE, Waggle me your bid


Some of the features of the ditransitive model are nevertheless often repeated or overridden in the structures associated with verbs that are specializations of it. For example, Lending and Loaning are special in that their result is temporary, Donating because the recipient is a charitable organization, and Denying in that the intention is that the recipient should not receive from the givee. These specializations of/divergences from the model must be represented in the individual lexical structures of the verbs concerned.

A classification hierarchy consisting of classes defined by properties that distinguish them from other categories is a commonplace in many approaches to knowledge representation (and elsewhere). In linguistics the idea is found in the work of structuralist semanticists (Weisgerber 1927; Trier 1931; Cruse 1986), among others.

1.3 Objects

Biber et al. (1999: 126-8) give a number of syntactic properties for English objects, as follows (the properties assigned to objects and to subjects are all taken from Biber et al.; some details may be disputed, but the general point remains the same):

a. found with transitive verbs only
b. is characteristically an NP, but may be a nominal clause
c. is in accusative case (when a pronoun)
d. typically follows immediately after the VP (though there may be an intervening indirect object)2
e. may correspond to the subject in passive paraphrases

• The first two syntactic properties refer to the classes of the words at either end of the object relationship: some verbs (the transitive verbs) lexically select an object; the objects themselves are generally nouns.

• The third property concerns the form of the object: when it is a pronoun, it takes the 'accusative' form (what I have above called the default form).

• The fourth property concerns its relative position in the sentence: objects generally follow their parents, and only a limited set of other dependents of the parent may intervene (any number of predependents of the object may intervene) (20). Biber et al. note that indirect objects may come between the object and the parent (21); this possibility is also open to particles (22).

(20) Philly filleted (*skillfully) the fish.
(21) We gave her a new knife.
(22) She threw away the old one.

• The final syntactic property refers to passive constructions. Under the WG analysis (see Hudson 1990: 336-53), the subject of a passive verb is at the same time its object (23) (or indirect object (24)), the merger of dependents being licensed by the passive construction itself.


(23) The camel hair coat was given to Cathy.
(24) Cathy was given the camel hair coat.

Figure 7 shows how these syntactic properties can be represented in a lexical structure.

Figure 7 Syntactic properties of objects

The parent in an object relationship isa Verb and the dependent isa Noun. In this way, the object relationship defines a class of transitive verbs (verbs that have objects). Verbs that only appear in transitive constructions inherit all properties from this class (DEVOUR isa Transitive, just as ACCORD isa Ditransitive). The word order properties are represented by the next relationship: the form of the dependent is the next of that of the parent. The diagram also shows the category Ditransitive (see Figure 4), where the word order properties are somewhat different (the form of the object is the next of that of the indirect object). Also represented in the diagram is the class of passive verbs (the category Passive). These verbs are defined by their formal properties: the form of a passive verb consists of its base plus a suitable ending (not shown). There are two classes of passive verb: one, which also isa Transitive, in which the subject is merged with the object, and one, which also isa Ditransitive, in which the subject is merged with the indirect object.

The full lexical structure of the object relationship must also include its semantic properties. In line with the approach outlined above for indirect


objects, the semantic properties of the object are related to its syntax through a specialization of the SSP. Biber et al. (1999: 126-8) also identify a range of possible semantic relationships that correspond with the object relationship (see a-g), and the lexical semantic representation of the object relationship should be general over all of these.

a. affected (bake a potato)
b. resultant (bake a cake)
c. locative (swim the Ohio)
d. instrumental (kick your feet)
e. measure (weigh 100 tons)
f. cognate object (laugh a sincere laugh)
g. eventive (have a snooze)

Properties a, b and d can be quite straightforwardly collected under a general treatment, in terms of their force-dynamic properties: in each case, the sense of the verb has a result which is a further event having the referent of the object as an argument (when you bake a potato, the potato becomes soft and edible; when you bake a cake, the cake comes into existence; when you kick your feet, the feet move). This is represented in Figure 8: a verb's object refers to the 'er' of the result of the verb's sense. This two-stage relationship is further represented in a direct relationship between the verb's sense and the referent of its object, labelled 'ee'.

Figure 8 Affected/effected objects

Notice that a similar conflation of a two-stage relationship into a direct one was used above in the semantic structure of indirect objects. In fact, when a verb has an indirect object, the recipient relationship overrides the 'ee' relationship in being assigned to the 'er' of the result, in much the same way as the word order properties of the indirect object override those of the object. This is


determined in the semantics by the nature of the resulting state: where this state isa Being, its 'er' is the 'ee' of the verb's sense; where it isa Having, its 'er' is the recipient of the verb's sense, rather than its 'ee', and the 'ee' of the verb's sense is the same as the 'ee' of the result (see Figure 4 above).

'Locational' objects, as in c, do not refer to affected arguments, but to parts of a path. The example in c defines the beginning and end of the path (on opposite sides of the river), but other examples may profile the beginning (25a), middle (25b) or end (25c) of the path.

(25) a. The express jumped the rails. (from Biber et al. (1999: 127))
b. nny vaulted the horse.
c. Elly entered the room.

The set of verbs that can appear with an object of this kind is (naturally) limited to those that can refer to a motion event and in this sense the 'locative' object is lexically selected by its parent. Notice also that the verb (often) determines which part or parts of the path may be profiled by such an object. Because of this, these arguments must appear in the lexical structures of quite specific categories (at the level of the lexeme or just above). The relevant categories are subsumed under Transitive, since the syntactic properties are the same as those of the affected/effected objects, but it is arguable whether they need to be collected under a category 'locative object verb'. This category is justified to the extent that generalizations can be made over the relevant constructions.

There seems to be little semantically in common between locative objects and affected/effected objects, though there is some relationship. For example, Dowty's (1991) incremental theme is a property of both kinds of object: the event in both cases is bounded by the theme:

(26) a. Barry baked a potato/*potatoes in five minutes.
b. Sammy swam the Ohio/*rivers in five minutes.

When the sense of the verb is an unbounded event, a measure expression can be used to define a bounded path: Sammy swam five miles. It is not entirely clear that arguments of this sort are indeed objects. Some certainly are not; in (27) the object of pushed is the pea and not the measure expression.

(27) Evans pushed the pea five miles with his nose.

The 'measure' objects are also confined to a limited class of verbs, by which they are semantically selected (weigh five tons, measure five furlongs). They also have little in common semantically with the other types of object, since their semantics is so heavily constrained by the verb.

'Cognate' objects (28, 29) are also associated with a very small class of verbs (Levin gives 47 (1993: 95-6), out of a total of 3107 verbs). They have something in common semantically with effected objects, but the semantics is constrained by the verb, which may also go so far as to select a particular lexeme. Levin notes:


Most verbs that take cognate objects do not take a wide range of objects. Often they only permit a cognate object, although some verbs will take as object anything that is a hyponym of a cognate object (1993: 96).

The verb and its object refer jointly to a performance of some kind.

(28) She sang a sweet song.
(29) Deirdre died a slow and painful death.

'Eventive' objects are confined to an even smaller class, the 'light' verbs. In these cases the event structure is determined by the verb, but the details of the semantics are supplied by the noun. In light verb constructions with HAVE, the object refers to an event (have a bath/meal/billiards match); light DO, in contrast, refers to an affective/effective event, the precise nature of which is determined by the semantics of the (affected) object:

(30) a. I'll do the beds, ['dig them/make them up']
b. I'll do the potatoes, ['peel them']
c. I'll do the cake, ['bake it']

Figure 9 collects together the various semantic properties of objects. The category Transitive is the same as appeared in Figure 7: it is the locus of the syntactic properties of objects (these are represented schematically here):

• The majority of objects are subsumed under the affective/effective category. In the diagram this is represented by the semantic concept Making.

• I show two subcategories: Making' (as in The cold made our lips blue) and Creating are schematic for the senses of the affective object verbs and the effective object verbs respectively.

• Making is schematic for all affective/effective events, and as such provides a sense for 'light' DO (shown as DO/light in the diagram). 'Light' HAVE is shown as a simple transitive verb that corefers with its object (the shared referent being an event).

• The set of verbs taking 'locative' objects is represented by a class having Moving' as its sense. This concept, which is a subcategory of ordinary Moving, subsumes cases of moving with respect to some landmark. The landmark appears in the semantic structure (labelled lm).

• The types of Moving' are classified here according to whether the landmark is construed as the middle of a path (Passing), an obstacle (Traversing), an end point (Entering) or a source (Leaving).

• Finally, the diagram shows that some nouns which are objects refer to Measurements, and they define a property of the 'er' of their parent's sense.

The lexical structure given in Figure 9 integrates the syntactic properties identified above (Figure 7) with the semantic properties of the various types of object. Figure 9 is schematic for all 'transitive constructions' in that verbs with objects inherit (some of) their properties from the category Transitive (usually


Figure 9 Semantic properties of objects

by way of one of the subclasses) and nouns that are objects inherit some of their properties from the category that fills the relevant slot in the structure (perhaps also by way of one of its subclasses: the diagram does not show inheritance relationships between the object noun in the most general case and those in the subcases, but these relationships are nevertheless implicit in the inheritance structure).

1.4 Subjects

Biber et al. (1999: 123-5) give the following syntactic properties for English subjects:

a. found with all types of verbs
b. is characteristically an NP, but may be a nominal clause
c. is in nominative case (when a pronoun and in a finite clause)
d. characteristically precedes the VP, except in questions where it follows, except where the subject is a Wh word itself
e. determines the form of present tense verbs (and of past tense BE)
f. may correspond to a by phrase in passive paraphrases

• Again, the first two syntactic properties concern the classes of words that participate in the relationship: verbs have subjects, which are generally nouns. Any verb may have a subject, so the class of 'subject verbs' is less constrained than the class of transitive verbs. It is perhaps for this reason that the semantic roles played by subjects are so much more diverse (see below). All tensed verbs have subjects, so the class Tensed is shown as a subset of the subject verbs. (See Figure 10.)

• The 'nominative' form of personal pronouns consists of the five words I, SHE, HE, WE and THEY, which are subcases of the relevant pronouns that are used only in subject position. (See Figure 10.)


Figure 10 Some syntactic properties of subjects

• The word order properties of subjects are slightly more complicated. Generally the subject precedes its parent, but some subjects follow their parents and in many of these cases the referent of the verb is questioned (the construction forms a yes/no question); these cases are represented in the subclass of subject verbs Inverted. The word order properties of Wh questions are determined in part by the lexical properties of the category Wh (schematic over Wh words). This category is always the extractee (x< in the diagram) of its parent and so precedes it. Where the Wh word is not the subject of the verb, the verb and subject are also inverted (the complement of Wh isa Inverted).

Figure 11 Word order properties of subjects


• Subject-verb agreement is a property of the categories participating in the subject relationship. Present verbs (Present is a subcase of Tensed) must have the same agreement value as their subjects. Those with the agreement singular have a form consisting of their base plus an {s}. Notice that this requires that the pronouns I and YOU have agreement plural (or have no agreement value) (I like / she likes). Subject-verb agreement is dealt with at length by Hudson (1999).

Figure 12 Subject-verb agreement
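The agreement pattern just described can also be stated procedurally. The following toy sketch (the lexicon and agreement values are my own invented encoding, not part of the WG analysis) gives present-tense verbs the form base + {s} exactly when their subject's agreement value is singular, with I and YOU assigned plural agreement so that I like but she likes falls out as described:

```python
# Toy encoding of the subject-verb agreement rule described above.
# The mini-lexicon of agreement values is invented for illustration;
# note I and YOU are treated as plural, as the text requires.
AGREEMENT = {"I": "plural", "you": "plural", "she": "singular",
             "he": "singular", "it": "singular", "we": "plural",
             "they": "plural"}

def present_form(base, subject):
    """A present verb's form is base + {s} iff its subject agrees singular."""
    if AGREEMENT[subject] == "singular":
        return base + "s"
    return base

print(present_form("like", "I"))    # like
print(present_form("like", "she"))  # likes
```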

• The final syntactic property is more properly semantic in WG: just as there is overlap between the semantics of the indirect object relationship and that of the preposition TO, so there is considerable overlap between the semantics of the subject relationship and that of the preposition BY.

The semantic properties of subjects are explored more fully in the following section, but some general remarks can be made here. Biber et al. (1999: 123-5) give the following possible semantic roles for subjects:

a. agent/willful initiator (She kicked a bottle cap at him)
b. external causer (The wind blew the plane off course)
c. instrument (Tactics can win you these games)
d. with stative verbs:
• recipient (I know it, She could smell petrol)
• source (You smell funny)
• positioner (She sat against a wall)
e. affected (It broke, An escapee drowned)
f. local (The first floor contains sculptures)
g. eventive (A post mortem examination will take place)
h. empty (It rained)

• The first three roles (a-c) can be collected together by virtue of the force-dynamic properties they share: agents, causes and instruments all precede the event in the force-dynamic chain.
• I argue below that affected subjects (e) are similarly controlled by the force-dynamic structures of the verbs that take them.
• The semantic roles played by the subjects of stative verbs are chiefly determined by the lexical (semantic) structure of the individual lexeme, though some semantic classification is possible (see Figure 13).
• 'Local' and 'eventive' subjects are controlled by the lexical structures of the verbs that take them.
• Since every verb can have a subject, the number of different semantic roles open to the referents of subjects is limited only by the number of different event types denoted by verbs. This can be seen particularly clearly in the case of 'dummy' subjects (h).

Figure 13 Semantic properties of subjects


Figure 13 collects together the possible semantic roles associated with the subject relationship, and relates them symbolically to the syntactic properties identified above (given schematically in the diagram). The various semantic types of subject are glossed by the 'er' relationship introduced above. A full account of this relationship and of the 'ee' relationship linked with objects is provided in the following section. Four kinds of stative predicate are shown, covering the three possibilities under (d) and the 'local' subjects in (f). Some of these semantic classes are dealt with in more detail in following chapters; each makes different requirements of its 'er'. A class of 'eventive verbs' is also included; these corefer with their subjects.

1.5 Three linking rules

On the basis of the above discussion we can construct general linking rules for the three relationships subject, object and indirect object. These linking rules link sets of syntactic properties, associated with the relevant dependency class, with sets of semantic properties, associated with classes of semantic association.

The linking rule for subjects is given in Figure 14 and in prose in (31). The rule pairs the syntactic relationship subject with the semantic relationship 'er'. The former gathers together the syntactic properties of subjects (Figures 10-12) and the latter the semantic properties associated with them (Figure 13).

Figure 14 Subject linking rule

(31) A word's subject refers to the 'er' of its sense.

A linking rule for objects is given in Figure 15 and in (32). The rule pairs the syntactic relationship object with the semantic relationship 'ee' (this is the pattern given above for DO/light, which is followed by most transitive verbs). The former gathers together the syntactic properties of objects (Figure 7) and the latter the semantic properties associated with them (Figure 8).


Figure 15 Object linking rule

(32) A word's object refers to the 'ee' of its sense.

Finally, abstracting away from Figure 4 (and using 'beneficiary' as schematic over recipients and beneficiaries) gives us the following linking rule for indirect objects. This rule gathers together, in the two associations indirect object and beneficiary respectively, the syntactic and semantic properties of indirect object constructions, as identified above.

Figure 16 Indirect object linking rule

(33) A word's indirect object refers to the beneficiary of its sense.
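Taken together, (31)-(33) amount to a small mapping from syntactic dependencies to semantic relationships. A minimal sketch, in an ad hoc dictionary encoding of my own rather than WG notation:

```python
# Minimal sketch of the three linking rules (31)-(33): each dependency
# type is paired with the semantic relationship its referent fills.
LINKING = {
    "subject": "er",                   # (31): subject refers to the 'er'
    "object": "ee",                    # (32): object refers to the 'ee'
    "indirect object": "beneficiary",  # (33): indirect object, beneficiary
}

def semantic_role(dependency):
    """Look up the semantic relationship linked to a dependency."""
    return LINKING[dependency]

print(semantic_role("indirect object"))  # beneficiary
```

As the text goes on to note, the exact content of 'er' and 'ee' is then filled in by the event type hierarchy rather than by this table itself.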

Now, semantic relationships like recipient or beneficiary are quite straightforwardly understood in terms of more complex semantic structures: if a concept C has a result which is an example of Having, then that result's first


argument is the recipient of C. The relationships 'er' and 'ee', however, which are linked to subject and object respectively, are less straightforward, so the status of the subject and object linking rules is at least open to question. In the second part of this chapter, I address the outstanding issues, providing more detailed linking rules for subjects and objects.

2. The Event Type Hierarchy: The framework; event types; roles and relations

2.1 The framework

In the first part of this chapter I sketched a linking mechanism within the WG framework, based on generalizations over grammatical relations (specializations of the Syntax Semantics Principle). The details are fleshed out in this part.

The linking regularities presented above consist of symbolic structures which link specific syntactic relationships (subject, object, indirect object, etc.) with specific semantic relationships ('er', 'ee', recipient, etc.). The syntactic relationships are identified by a set of word-level (syntactic, morphological, phonological, etc.) properties which, by default, are inherited by all cases of the dependency: unless otherwise specified, subjects precede their parents and determine their form, objects follow and permit no intervening codependents, and so on. The semantic relationships are identified by a set of concept-level (thematic, force-dynamic, etc.) properties, which likewise constitute the default model for the relationship. The syntactic and semantic properties taken together constitute the lexical structure of the relevant relationship, and can be seen as a gestalt.
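Default inheritance of this kind, where the general model supplies a property unless a more specific category overrides it, can be sketched as a simple lookup chain. This is a toy encoding of my own (the property names and values are invented for illustration):

```python
# Toy default-inheritance network: a category's properties are found by
# walking up its isa chain; more specific entries override the defaults.
NETWORK = {
    "dependency": {"position": "after parent"},               # default model
    "subject":    {"isa": "dependency",
                   "position": "before parent"},              # override
    "object":     {"isa": "dependency"},                      # inherits default
}

def lookup(category, prop):
    """Return the most specific value of prop, walking up the isa chain."""
    while category is not None:
        entry = NETWORK[category]
        if prop in entry:
            return entry[prop]
        category = entry.get("isa")
    return None

print(lookup("subject", "position"))  # overridden: before parent
print(lookup("object", "position"))   # inherited default: after parent
```

The same walk-up-and-override logic models the semantic side ('er', 'ee', recipient) as well as word order.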

As I argue above, semantic relationships like recipient and result are quite straightforwardly understood in terms of more complex semantic structures. The relationships 'er' and 'ee', however, which are linked to subject and object respectively, are less straightforward. An account is provided here in which the properties of 'ers' and 'ees' are defined by a hierarchy of event types (notice that an event type (Having) played a role in the definition of result and recipient).

Since most of the event types are defined by a single exceptional argument relationship, and since the linking regularities are still stated in terms of single roles, the WG approach outlined here combines the properties of role-based and class-based approaches. The linking regularities presented above are generalizations over the linking properties of all subjects, objects, etc. While each syntactic dependency always maps onto the same semantic argument, the exact nature of the role played by that argument is determined by the wider conceptual structure associated with the parent's sense (as represented partly by its event type). The distinction between words and constructions is an emergent property of the network structure.

The categories in the event type hierarchy are defined by their semantic (conceptual) properties, including force-dynamic properties (but not including aspectual properties: see Holmes (2005: 176-211)). Many of the event types function as the senses of words, though some do not. The categories support a


number of associations (more at the more specific levels), including those mentioned in the linking regularities. The roles of those arguments are defined by the rest of the conceptual structure associated with the lexical category.

2.2 Event types

Figure 17 shows the event type hierarchy. The various types are shown, but most of their properties are not (they are given in the following diagrams). The category at the top of the hierarchy is labelled Predicate; this is not an entirely satisfactory name for this concept, but it has the benefit of subsuming both states and events. The names of the concepts in the hierarchy are intended to be the senses of lexical words, and for this reason it is perhaps surprising that no readily useable term exists for the highest category, though it might be argued that this concept does not have much use as an element in the normal use of language. The event type hierarchy should more properly be called the predicate type hierarchy.

Figure 17 Predicate type hierarchy

Predicates are divided into states (State) and events (Event), the latter consisting of a series of (more or less transient) states. The most general category, Predicate, is shown with a single argument, labelled 'er', and this association is inherited (implicitly) by the two subclasses. The states are divided into Being and Having; the latter and some of the former have a second argument, labelled 'ee'. Further properties of these categories are explored shortly. The events include processes like Laughing and Yawning as well as the further categories Becoming and Affecting. The first of these is telic (it has a result which is a state); the second has an 'ee' as well as an 'er'. Affecting


includes transitive processes like Pushing and Beating ('hitting' not 'defeating') as well as the category Making, which subsumes two further categories: Creating, which is telic since its result is an example of Being (or Existing), and Making', which is telic in that its result isa Becoming and the result of this second event is a state.
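The top of the hierarchy just described can be sketched as a class tree, with the 'er' argument introduced at Predicate and 'ee' added lower down. This is a simplified rendering of my own; only a handful of the categories of Figure 17 are shown:

```python
# Simplified sketch of the predicate type hierarchy of Figure 17.
# Arguments are class attributes inherited (and extended) down the tree.

class Predicate:            # topmost category: every predicate has an 'er'
    arguments = ("er",)

class State(Predicate): pass
class Event(Predicate): pass

class Having(State):        # states with a second argument
    arguments = ("er", "ee")

class Affecting(Event):     # transitive processes: adds an 'ee' (patient)
    arguments = ("er", "ee")

class Making(Affecting): pass   # telic affective events (have a result)
class Creating(Making): pass    # result isa Being: effected objects
class Becoming(Event): pass     # telic; shares its 'er' with its result

print(Creating.arguments)           # ('er', 'ee'), inherited via Affecting
print(issubclass(Creating, Event))  # True
```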

Figure 18 shows in more detail the properties of the states. Being defines a property of its 'er'. For example, Big functions as the size of its 'er' (Drunk is also shown as an example of the way in which the semantic network represents all aspects of meaning). Other subcases of Being include Feeling, which subsumes psychological states (see Figure 19), and At, which subsumes locations (see Figure 20). The inclusion of the traditional semantic roles theme and actor pre-empts the discussion of the difference between Being and Having in Figure 21 and of the relationship between the argument positions and traditional semantic roles in the following section.

Figure 18 Hierarchy of states

Figure 19 shows in more detail the properties of Feeling. This category subsumes one- and two-argument psychological states. In both cases the 'er' must be sentient. One of each kind of state is shown as an example. A single semantic relationship is shown for each; this stands for a fuller characterization of the words' meanings which would include for example the relationship between Happy and Smiling (the 'er' of Happy is often the 'er' of Smiling too).

Figure 20 shows the properties of At, the category subsuming locations, and the sense of AT. The 'ee' of At defines the place of its 'er', which is therefore understood as the theme of a state defined by the 'ee'. For this reason, the 'ee' is also shown as the Landmark (see Figure 20). Two subcases of At are shown, In and On, the senses of the prepositions IN and ON respectively. These two differ from At in that the place of the 'er' is not the same as the 'ee', but is rather the


Figure 19 Feeling

same as the place of a part of the 'ee'. In the case of In, this part is the interior; in the case of On, it is the surface. The diagram also shows that Containing and Supporting are the converses of In and On respectively (if a is in b then b contains a; if a is on b then b supports a). These facts are integral parts of the meanings of the prepositions.

Figure 20 At


Figure 21 shows the properties of Having, the sense of HAVE. As I have shown in Figure 18, the arguments of Having and those of Being have different properties. In the case of Having the 'er' is also its actor and the 'ee' its theme (see section 2.3); in the case of Being the 'er' is the theme, and the 'ee', if there is one, is a landmark, or plays some other role (in the case of the psychological states it is often called a stimulus). Figure 21 shows that Supporting and Containing are subcases of Having (subsumed under a general category labelled 'Locating').

This explains why these categories assign their arguments in the opposite way to the corresponding concepts On and In, which inherit their argument structure from Being (by way of At). It may also help to explain the way in which some languages use verbs corresponding to English BE and HAVE with different sets of verbs in perfect constructions and perhaps also explain the relationship between passive and perfect constructions even in English. This possibility needs to be explored in future work.

Figure 21 Having

The correspondence between Being and Having also suggests an alternative to the most usual analyses for verbs like GIVE and the indirect object (see Holmes 2005: 46-54). It is often claimed that the more specific semantics of indirect objects overrides the usual principle that the 'ee' of a causative event is assigned as the 'er' of its result (the gift, which is the 'ee' of Giving, is the 'ee' rather than the 'er' of the result, if this is to be a case of Having). However, it is also possible that the result of Giving is instead a case of Being (more specifically, it isa At), which would preserve the default arrangement. This would also provide a means of describing the contrast between verbs like GIVE and those like EQUIP (rare in English) that show the opposite linking arrangement. This view is supported by the prepositions that are used with these verbs. GIVE selects TO (in the absence of an indirect object), which in


other constructions refers to a path terminating in a location; EQUIP selects WITH, which has Having as its sense.

This suggestion is sketched in Figure 22. The result of Giving isa At; its 'er' (the thing located) is the 'ee' of Giving and its 'ee' (the location) is the recipient. The result of Equipping isa Having; its 'er' (the possessor) is the 'ee' of Equipping and its 'ee' is the 'equipment'.

Figure 22 Giving, Equipping
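The contrast sketched in Figure 22 can be made concrete with two small structures. This is an ad hoc encoding of my own of the analysis in the text, not WG notation:

```python
# Ad hoc encoding of Figure 22: GIVE's result isa At (its 'er' is the
# gift, its 'ee' the recipient); EQUIP's result isa Having (its 'er' is
# the possessor, its 'ee' the equipment).
GIVING = {
    "ee": "gift",
    "result": {"isa": "At", "er": "gift", "ee": "recipient"},
}
EQUIPPING = {
    "ee": "possessor",
    "result": {"isa": "Having", "er": "possessor", "ee": "equipment"},
}

# In both cases the verb's 'ee' is the 'er' of its result, so the
# default arrangement described in the text is preserved.
assert GIVING["result"]["er"] == GIVING["ee"]
assert EQUIPPING["result"]["er"] == EQUIPPING["ee"]
print("default arrangement preserved in both structures")
```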

Figure 23 shows the properties of the non-states. Event inherits the 'er' relationship from the Predicate category, and passes it down to the subclasses. Becoming has additionally a result which is a state which shares its 'er'; the class is telic and provides the semantic schema for unaccusative constructions. Dying is shown as an example (see Figure 27). Affecting has additionally an 'ee', which is a patient. Pushing is shown as an example (see Figure 25). Making represents telic affective events (it has a result). Two subclasses of Making are shown. Creating provides the model for effective constructions and Making' for causative (affective) ones. In both cases it is the 'ee' that functions as the 'er' of the result. Killing is shown as an example of Making' (see Figure 26).

Figure 23 Events


Figure 24 Yawning isa Event

Figure 25 Pushing isa Affecting

Figure 26 Killing isa Making'


Figure 27 Dying isa Becoming

2.3 Semantic roles and semantic relationships

I have given above a hierarchical classification of predicate types defined by their properties (see Figure 17, Figure 18, Figure 23). Note that the senses of particular words (not just verbs: prepositions and adjectives refer to events, as do some nouns like DESTRUCTION, WEDDING, etc.) are arranged in the same hierarchy since they simply instantiate the more general predicate types. The properties of the predicate types determine the number and nature of the semantic relationships associated with these senses and the linking of those associations to syntactic dependencies; alternatively, the number and nature of semantic associations and the linking of those associations determines the position of the sense in the predicate type hierarchy.

In the first part I provided linking regularities that link more or less schematic semantic associations with more or less schematic syntactic ones. There I gave linking rules for subject, object and indirect object as well as the more general Syntax Semantics Principle (SSP). The semantic associations referred to in these rules are the same as those supported by the various predicate types. In fact the linking rules themselves form part of this hierarchy, appearing at the highest relevant level.

As noted above, semantic associations like recipient are fairly straightforwardly characterized in terms of other semantic relationships (in terms of their meanings) but 'er' and 'ee', the two relationships involved in subject and object linking, are not. The 'ers' and 'ees' of particular events (or event classes) are instantiations of the more general 'er' and 'ee' that appear in the linking regularities (note that the 'er' of Predicate (Figure 23) is the most general one there is, so this is the locus of the subject linking rule and all other 'ers' are instantiations of this one). The properties of 'ers' and 'ees' of more specific categories are determined at the appropriate level in the predicate type hierarchy and it is here that the most semantic information is found.
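The isa hierarchy with default inheritance described here can be pictured as a small network in which specific predicate types inherit 'er' and 'ee' properties from more general ones unless they declare something more specific themselves. The following sketch is purely illustrative: the class name `PredicateType`, the single `local` attribute and the role labels are my own simplification, not the official WG notation.

```python
# Illustrative sketch of default inheritance in a predicate-type
# hierarchy. Names and data structures are invented for this example.

class PredicateType:
    def __init__(self, name, parent=None, **overrides):
        self.name = name
        self.parent = parent     # the 'isa' link to a more general type
        self.local = overrides   # properties declared at this level

    def role(self, label):
        """Look up a role, walking up the isa chain (default
        inheritance): the most specific declaration wins."""
        if label in self.local:
            return self.local[label]
        if self.parent is not None:
            return self.parent.role(label)
        return None

# The most general category: every Predicate has an 'er'.
predicate = PredicateType('Predicate', er='argument')
# Non-states inherit the 'er' slot but fill it with an actor by default.
event = PredicateType('Event', predicate, er='actor')
# Affecting adds an 'ee' (a patient); its subclasses inherit both roles.
affecting = PredicateType('Affecting', event, ee='patient')
pushing = PredicateType('Pushing', affecting)

print(pushing.role('er'))  # inherited from Event
print(pushing.role('ee'))  # inherited from Affecting
```

The point of the sketch is only that a specific sense like Pushing need not restate its roles: they fall out of its position in the hierarchy, exactly as the text describes.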

In the preceding section I defined the semantics of these relationships by relating them to named thematic roles (actor, patient, theme, landmark), but this begs the question in the absence of a fuller semantic definition of these roles. Indeed, as discussed below, once the thematic roles have definitions, it may no longer be necessary, or desirable, to keep the relationships agent, theme etc. in lexical structure.3



A number of problems with thematic roles have been identified in the literature. The most immediate practical difficulty is that different writers (and even different works by the same writer) use the same terms with different meanings; this is a particular problem for the terms Goal, Patient and Theme (see below). But there is also the non-monotonicity of argument linking ((34)-(36) are from Davis and Koenig (2000: 58)), which leads to proposals like Jackendoff's (1990) hierarchical argument linking.

(34) a. Mary owns many books.
     b. This book belongs to Mary.

(35) a. We missed the meaning of what he said.
     b. The meaning of what he said escaped/eluded us.

(36) a. Oak trees plague/grace/dot the hillsides.
     b. The hillsides boast/sport/feature oak trees.

A further, theoretical, difficulty (raised by Dowty 1991) is the open-ended nature of the set of roles to be used. Goldberg considers this only an empirical problem, since in principle the set of thematic roles need not be finite, the nature of the roles being determined by the set of predicate types recognized in the language:

[P]hrasal constructions that capture argument structure generalizations have argument roles associated with them; these often correspond roughly to traditional thematic roles... At the same time, because they are defined in terms of the semantic requirements of particular constructions, argument roles in this framework are more specific and numerous than traditional thematic roles. (2002: 342)

Since the semantic relationships supported by the senses of words instantiate (isa) those of more general categories, the senses of different words (or constructions) may elaborate the more general models in different ways, so that the set of thematic roles at the more specific levels can be very large indeed. In Figure 23, I used the thematic role actor as schematic over the first arguments of all non-states (including processes (37) and causative (38) and unaccusative (39) events).

(37) a. The flag fluttered in the breeze.
     b. The tourist yawned.
     c. The flag distracted the tourist.
     d. Perry pushed a pea with his nose.

(38) a. Perry pushed a pea to Peterborough.
     b. The flag angered the tourist.
     c. The judges made a cake.
     d. Perry opened a bottle.

(39) a. The pea vanished.
     b. The ice melted.
     c. The band disbanded.

Trask defines actor as 'that argument NP exercising the highest degree of independent action in the clause' (1993: 6), noting that this is a simple extension of the category agent to fit other kinds of subject-linked arguments. This extension covers verbs referring to changes undergone by their single argument (unaccusative verbs), whose arguments therefore may have few or no agentive properties (note, however, that some are agents (39c)). In fact, the actors of other one- or two-argument events are also not agents ((37a), (37c), (38b)).

Agency is a property of some actors, determined by the thematic properties of the event, so the thematic role agent ('the semantic role borne by an NP which is perceived as the conscious instigator of an action', ibid.: 11) is not called for. Actor, then, corresponds roughly to Dowty's (1991) proto-agent: it is defined by properties like volitional involvement, causal instigation etc., but not all cases share all these properties. Dowty's proto-agent wills the event, is sentient, causes an event or change of state, moves and has independent existence; the WG treatment presented here accepts all of these but the fourth, movement.

Patient ('the semantic role borne by an NP which expresses the entity undergoing an action', Trask 1993: 202) is schematic over the second argument of transitive events. Affecting, which is the most general such event, subsumes processes (like pushing a pea or patting a dog) and causative events (like pushing a pea to Peterborough or angering a tourist). The patient is the affected (or effected) argument, even in some of the transitive processes. Processes have a temporal profile that consists of a set of repeated events. These events may themselves be causative (Pushing consists of a set of repeated causative actions on an object), though they may also be states (Patting consists of a set of repeated locative states), in which case the patient is the theme of the state (see below).

Dowty's (1991) proto-patient undergoes a change of state, is an incremental theme, is causally affected by another participant, does not move and does not have independent existence. Again, the WG analysis accepts all these but the fourth, concerning movement. The incremental theme is a product of the aspectual structure of affective events (see Holmes 2005).

States have themes, and some have actors. Actors of states share the properties of those of non-states. The theme is the argument that the state is predicated of (theme is also used with a similar meaning as the name of a discourse function, where it contrasts with rheme, as topic does with comment). Trask gives 'an entity which is in a state or a location or which is undergoing motion' (1993: 278), a definition which subsumes some patients, as defined above; Trask also notes that the terms theme and patient are used more or less interchangeably. However, in the current framework the two are separate: patients undergo some affective/effective process or change; themes have some stable property. Locative states also have a landmark: the argument whose position defines that of the theme.

The above definitions of the thematic roles are given in terms of semantic properties. For example, an actor wills the event, is sentient, causes an event or change of state and has independent existence. These semantic properties of actor are shown in Figure 28.



Figure 28 Actor

In the linking framework outlined here, syntactic associations are linked to semantic ones in a regular way (subjects refer to 'ers', objects to 'ees', indirect objects to beneficiaries, etc.), and those semantic associations are defined by (structural) semantic properties. The relationships 'er' and 'ee' are defined by the categories of the predicate type hierarchy, and linked there to the various properties of actors, patients, themes and landmarks, like those in Figure 28. It is an empirical question whether it is necessary to keep hold of the relationships actor, patient, etc.: theoretically the 'ers' of non-states could simply be linked directly to the structure shown in Figure 28 without the mediation of the actor relationship.

The contrast between Having and Being (the former has an actor-er and a theme-ee, the latter a theme-er and in some cases a landmark-ee; see Figure 18) demonstrates that 'er' and 'ee' are distinct from the thematic roles. This separation of properties is found in other frameworks also. For example, in Goldberg's (2002) Construction Grammar the lexical structures of grammatical constructions are separated from those of specific words. Semantic relationships like Actor, Theme, etc. (participant roles), which are supported by the senses of words, instantiate the argument roles of phrasal constructions (these correspond to my 'er' and 'ee'), which are therefore schematic over them. The separation, in lexical structure and in the structures of sentences (constructs), of the two argument structures allows different verbs to elaborate different constructions differently: the argument structure of the construction may add or take away participant roles from the verb, or vice versa.

The WG framework, however, represents the distinction differently: rather than being properties of two different kinds of elements, the participant roles and the argument roles are simply different kinds of association supported by the same elements (events). In Holmes (2005) I show this property of the WG framework to be crucial in the treatment of specific examples, since it becomes clear there that both words and constructions may select both argument and participant roles.

Since the participant roles are defined in terms of sets of default properties, it is possible for more than one argument of a verb's sense to fit the bill for one or other participant role. This is the case for the verbs SPRAY and LOAD. As is well known, these two verbs can be used with objects referring to a thing or substance moved or to the place it is moved to. These two possibilities reflect two ways of interpreting the roles of the participants (of choosing which participant best fits the patient model, and is therefore linked to 'ee' and thence to object).
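Because participant roles are bundles of default properties, 'best fit' can be thought of as counting how many of the defining properties each argument shares, in the spirit of Dowty-style proto-role property counting. The sketch below is illustrative only: the property names and the two property sets assigned to the LOAD arguments are my own inventions, not an analysis from the text.

```python
# Illustrative sketch of 'best fit' to the patient model. The property
# inventory and the per-argument property sets are invented for this
# example; they stand in for the default properties discussed above.

PATIENT_PROPERTIES = {'undergoes_change', 'causally_affected',
                      'incremental_theme'}

def patient_score(argument_properties):
    """How many patient-defining properties does this argument share?"""
    return len(PATIENT_PROPERTIES & argument_properties)

# Two participants of 'Larry loaded the lorries with (the) lollies':
lorries = {'undergoes_change', 'causally_affected'}   # the place filled
lollies = {'causally_affected', 'undergoes_motion'}   # the stuff moved

# Either argument shares some patient properties, so either can be
# construed as patient, linked to 'ee' and thence to object -- which is
# one way of picturing the locative alternation.
print(patient_score(lorries))  # 2
print(patient_score(lollies))  # 1
```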

In these cases the lexical properties of the syntactic relationship (here object) can be added to those of the verb. Where the two are not in conflict, they are simply merged. For example, since LOAD does not select either of its non-subject arguments as an incremental theme, this property is assigned to the object-linked argument by the semantics of the 'ee' relationship (40) (the mechanics of this example are discussed in Holmes 2005: 206ff).

(40) a. Larry loaded *(the) lorries with (the) lollies in 2 hours.
     b. Larry loaded *(the) lollies on (the) lorries in 2 hours.

When there is a conflict between the lexical properties of the construction and those of the verb, the construct is (usually) rendered incoherent. The two examples in (41) are unacceptable because the lexical structure of POUR specifies that the 'ee' of its sense is a liquid (that is how the manner of pouring is defined) and that of COVER specifies that the 'ee' of the sense ends up underneath something. These two requirements clash with the semantics of the construction.

(41) a. *Polly poured the pot with water.
     b. *Corrie covered the quilt over the baby.

3. Conclusion

In the first part of this chapter I sketched the linking mechanisms of WG. Syntactic and semantic associative relationships participate in symbolic relationships: syntactic dependencies have meanings, which serve to determine the interpretations of compositional structures, as well as to constrain the possibilities for composition. Just as the (default) properties of syntactic associations are given in terms of a network of related concepts and properties surrounding the dependency class they instantiate, so are the (default) properties of semantic associations.

In the second part I distinguished two kinds of semantic association: participant roles, which carry thematic content; and argument roles, which are determined by the force-dynamic properties of the event class.



References

Biber, Douglas, Johansson, Stig, Leech, Geoffrey, Conrad, Susan and Finegan, Edward (1999), Longman Grammar of Spoken and Written English. Harlow, Essex: Longman.
Bresnan, Joan W. (1982), The Mental Representation of Grammatical Relations. Cambridge, MA: MIT Press.
Chomsky, Noam (1970), 'Remarks on nominalization', in Roderick A. Jacobs and Peter S. Rosenbaum (eds), Readings in English Transformational Grammar. Waltham, MA: Ginn and Company, pp. 184-221.
— (1981), Lectures on Government and Binding. Dordrecht: Foris.
Copestake, Ann and Briscoe, Ted (1996), 'Semi-productive polysemy and sense extension', in James Pustejovsky and Branimir Boguraev (eds), Lexical Semantics: the Problem of Polysemy. Oxford: Clarendon Press, pp. 15-68.
Croft, William (1990), 'Possible verbs and the structure of events', in Savas L. Tsohatzidis (ed.), Meanings and Prototypes: Studies in Linguistic Categorization. London: Routledge, pp. 48-73.
Cruse, David A. (1986), Lexical Semantics. Cambridge: Cambridge University Press.
Davis, Anthony R. and Koenig, Jean-Pierre (2000), 'Linking as constraints on word classes in a hierarchical lexicon'. Language, 76, 56-91.
Dowty, David R. (1991), 'Thematic proto-roles and argument selection'. Language, 67, 547-619.
Goldberg, Adele E. (1995), Constructions: a Construction Grammar Approach to Argument Structure. Chicago: University of Chicago Press.
— (2002), 'Surface generalizations'. Cognitive Linguistics, 13, 327-56.
Holmes, Jasper W. (2005), 'Lexical Properties of English Verbs' (unpublished doctoral dissertation, University of London).
Hudson, Richard A. (1984), Word Grammar. Oxford: Blackwell.
— (1990), English Word Grammar. Oxford: Blackwell.
— (1992), 'So-called "double objects" and grammatical relations'. Language, 68, 251-76.
— (1994), 'Word Grammar', in Ronald Asher (ed.), The Encyclopedia of Language and Linguistics. Oxford: Pergamon Press, pp. 4990-93.
— (1999), 'Subject-verb agreement in English'. English Language and Linguistics, 3, 173-207.
— (2004, 1 July - last update), 'Word Grammar', [Word Grammar]. Available: www.phon.ucl.ac.uk/home/dick/wg.htm (Accessed: 18 April 2005).
Jackendoff, Ray S. (1990), Semantic Structures. Cambridge, MA: MIT Press.
Lemmens, Maarten (1998), Lexical Perspectives on Transitivity and Ergativity: Causative Constructions in English. Amsterdam: J. Benjamins.
Levin, Beth (1993), English Verb Classes and Alternations: a Preliminary Investigation. Chicago: University of Chicago Press.
Levin, Beth and Rappaport Hovav, Malka (1995), Unaccusativity: at the Syntax-Lexical Semantics Interface. Cambridge, MA: MIT Press.
Pustejovsky, James (1995), The Generative Lexicon. Cambridge, MA: MIT Press.
— (2001), 'Type construction and the logic of concepts', in Pierette Bouillon and Federica Busa (eds), The Language of Word Meaning. Cambridge: Cambridge University Press, pp. 91-123.
Pustejovsky, James and Boguraev, Branimir (1996), 'Introduction: lexical semantics in context', in James Pustejovsky and Branimir Boguraev (eds), Lexical Semantics: the Problem of Polysemy. Oxford: Clarendon Press, pp. 1-14.
Shibatani, Masayoshi (1996), 'Applicatives and benefactives: a cognitive account', in Masayoshi Shibatani and Sandra A. Thompson (eds), Grammatical Functions: their Form and Meaning. Oxford: Clarendon Press, pp. 157-94.
Trask, Robert L. (1993), A Dictionary of Grammatical Terms in Linguistics. London: Routledge.
Trier, Jost (1931), Der Deutsche Wortschatz im Sinnbezirk des Verstandes. Von den Anfängen bis zum 13. Jahrhundert. Heidelberg: Winter.
Weisgerber, Leo (1927), 'Die Bedeutungslehre - ein Irrweg der Sprachwissenschaft'. Germanisch-Romanische Monatsschrift, 15, 161-83.
Williams, Edwin (1991), 'Meaning categories of NPs and Ss'. Linguistic Inquiry, 22, 584-7.

Notes

1 The SSP given here turns out not to be able to account for all cases of linking. It is revised in Holmes (2005: 44).

2 Note that in Biber et al. the VP category subsumes the 'verbal complex' (main verb and any auxiliaries), but not any complements or other postdependents of the verb.

3 Of course, if a particular speaker knows the words ACTOR and THEME (as metalinguistic terms), then they must have these relationships in their lexicon, since they are (or should be!) the meanings of the relevant terms.


6 Word Grammar and Syntactic Code-Mixing Research

EVA EPPLER

Abstract

This chapter aims to show that WG is preferable to other linguistic theories for the study of bilingual speech. Constituent-based models have difficulties accounting for intra-sentential code-mixing because the notions of government and functional categories are too powerful and rule out naturally occurring examples. Properties of WG which make this syntactic theory particularly well suited for code-mixing research are the central role of the word, the dependency analysis, and several consequences of the view of language as a network which is integrated with the rest of cognition. A qualitative and quantitative analysis of because and weil clauses shows that code-mixing patterns can be studied productively in WG.

1. Introduction

Intra-sententially CODE-MIXED data, i.e. utterances constructed from words from more than one language, pose an interesting problem for syntactic research as two grammars interact in one utterance. Based on a German/English bilingual corpus,1 I will show in section 2 of this chapter that constraints on code-switching formulated within Phrase Structure Grammar frameworks (Government and Binding, Principles and Parameters, Minimalism) are too restrictive in that they rule out naturally occurring examples of mixing.

In section 3 I will discuss aspects of WG that make it particularly well suited for the syntactic analysis of intra-sententially mixed data. WG facilitates the full syntactic analysis of sizeable corpora and allows us to formulate hypotheses on code-switching which can subsequently be tested on data. All findings are supported by quantitative data.

As the word order contrast between German and English is most marked in subordinate clauses, I focus on examples of this construction type in section 4. I will show that code-mixing patterns can be studied productively in terms of WG: WG rules determining the word order in German/English mixed clauses hold in relation to my corpus and are supported by evidence from other corpora. The main section of this chapter focuses on because and weil clauses. A comparison of the mixed and monolingual clauses reveals that German/English bilinguals who engage in code-mixing recognize and utilize structural congruence at the syntax-pragmatics interface. They predominantly mix in a construction type in which the word order contrast between German (SOV) and English (SVO) is neutralized.

2. Constituent Structure Grammar Approaches to Intra-Sentential Code-Mixing

The question underlying grammatical code-switching research is whether there are syntactic constraints on code-mixing. Some of the hypotheses on intra-sentential code-switching have been formulated in informal frameworks of traditional grammatical notions; others are derived from assumptions underlying specific modern syntactic theories. In this section I will review the main phrase structure grammar approaches to code-mixing and show that the constraints formulated within them do not account for the data.

DiSciullo, Muysken and Singh (1986) propose to constrain code-switching by government, the traditional assumption behind X-bar theory. They initially used the Chomsky (1981: 164) formulation of government: 'α governs γ in [β ... γ ... α ... γ ...], where α = X0, and α and γ are part of the same maximal projection'. The X-bar assumption that syntactic constituents are endocentric is important for the formulation and working of the government constraint. Heads not only project their syntactic features onto the constituent they govern, but also their language index. The language index is assumed to be something specified in the lexicon (DiSciullo et al. 1986: 6), since the lexicon is a language-specific collection of elements. For code-switching purposes the Government Constraint was formalized in DiSciullo et al. (1986: 6) as *[Xp Yq], where X governs Y, and p and q are language indices. The nodes in a tree must dominate elements drawn from the same language when there is a government relation holding between them.
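The Government Constraint can be paraphrased procedurally: collect every (governor, governed) pair in the sentence and flag any pair whose language indices differ. A minimal sketch under my own assumptions (pairs of word-index tuples rather than a full X-bar tree, which a real implementation would traverse):

```python
# Minimal sketch of the Government Constraint *[Xp Yq] as a filter over
# (governor, governed) pairs carrying language indices. The tuple
# format is invented for illustration.

def government_violations(pairs):
    """Return the pairs whose language indices differ, i.e. the
    switches the constraint rules out."""
    return [(x, y) for (x, p), (y, q) in pairs if p != q]

# Example (1): 'so you have eine Übersicht' -- the English verb 'have'
# governs its German object 'Übersicht'.
pairs = [
    (('have', 'en'), ('Übersicht', 'de')),  # governed object, switched
    (('have', 'en'), ('you', 'en')),        # subject, same language
]

# The filter flags the attested switch in example (1), i.e. the
# constraint wrongly rules it out:
print(government_violations(pairs))  # [('have', 'Übersicht')]
```

The sketch makes the empirical problem concrete: any naturally occurring switch inside a government relation, like example (1), lands in the violation list.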

The Government Constraint predicts that ungoverned elements, such as discourse markers, tags, exclamations, interjections and many adverbs, can easily be switched. This prediction is also supported by my data (see also Eppler 1999) and most other bilingual corpora. However, the Government Constraint also predicts that switches between verbs and their objects and/or clausal complements, and switches between prepositions and their NP complements, are ungrammatical. Examples violating these predictions from my corpus are:

(1) *TRU: so [/] so you have eine Übersicht. (Jen2.cha, line 133)
                              an overview

(2) *DOR: I wonder, wem sie nachgradt. (Jen2.cha, line 1531)
                    whom she takes after

or in the other direction, i.e. from a German verb to an English clausal complement:

(3) *MEL: ich hab(e) gedacht there is going to be a fight. (Jen1.cha, line 987)
          I have thought



(4) *TRU: der hat über faith + healing gesprochen. (Jen2.cha, line 2383)
          he has about                 spoken

The original inclusion of functional categories in the class of governors ruled out code-switches which are also documented in my data, e.g. between complementizers and clauses that depend on them, as in (5):

(5) TRU: to buy yourself in means that +... (Jen1.cha, lines 977ff)
    DOR: du kannst dich nochmal einkaufen.
         you can yourself once more buy in

and the domain of government was too large. The above formulation of the government constraint includes the whole maximal projection and thus, for example, bans switching between verbs and location adverbs, again contrary to the evidence. Therefore a limited definition of government, involving only the immediate domain of the lexical head, including its complements but not its modifiers/adjuncts, was adopted and the Government Constraint was rephrased (Muysken 1989) in terms of L-marking:

*[Xp Yq], where X L-marks Y, and p and q are language indices

Muysken and collaborators thus shifted from an early and quite general definition of government to the more limited definition of L-marking in their formulation of the Government Constraint. L-marking restricts government to the relation between a lexical head and its immediate complements. Even the modified version of the government constraint in terms of L-marking is empirically not borne out, as we see from the following example:

(6) TRU: das ist painful. (Jen3.cha, line 1879)
         this is

Muysken (2000: 25) identifies two main reasons why the Government Constraint, even in its revised form, is inadequate. The main reason is that CATEGORIAL EQUIVALENCE2 undoes the effect of the government restriction. The Government Constraint is furthermore assumed to insufficiently acknowledge the crucial role functional categories are supposed to play in code-mixing.

Functional categories feature prominently in several approaches to code-mixing. Joshi, for example, proposes that 'Closed class items (e.g. determiners, quantifiers, prepositions, possessive, Aux, Tense, helping verbs, etc.) cannot be switched' (1985: 194). Myers-Scotton and Jake (1995: 983) assume that in mixed constituents, all SYSTEM MORPHEMES3 that have grammatical relations external to their head constituent (i.e. participate in the sentence's thematic grid) will come from the language that sets the grammatical frame in the unit of analysis (CP). And Belazi, Rubin and Toribio (1994) propose the Functional Head Constraint.

Their model is embedded in the principles and parameters approach. Belazi, Rubin and Toribio (1994) propose to restrict code-mixing by the feature-checking process of f-selection. In Belazi, Rubin and Toribio's model, language is a feature4 of FUNCTIONAL heads that needs checking like all other features. The Functional Head Constraint (Belazi, Rubin and Toribio 1994: 228) is formulated as follows:

The language feature of the complement F-selected by a functional head, like all other relevant features, must match the corresponding feature of that functional head.

Code-switching between a lexical head and its complement proceeds unimpeded in this model.

Because many inflectional morphemes were treated as independent functional heads in the principles and parameters approach, Belazi, Rubin and Toribio (1994) subsume the FREE MORPHEME CONSTRAINT5 (Sankoff and Poplack 1981) under their functional head constraint: switching is disallowed between an inflectional morpheme and a word-stem. A counter-example to this restriction from my corpus would be:

(7) *DOR: wir suffer-n da alle. (Jen2.cha, line 904)
          we suffer INFL MP all

Like all researchers working on Spanish/English and Arabic/French code-mixing, Belazi, Rubin and Toribio (1994) have to deal with the different placement of adjectives pre- or post-modifying nouns in the language pairs they are working on. Their data indicate that switching is possible when the adjectives and nouns obey the grammars of the languages from which they are drawn. This leads them to supplement the Functional Head Constraint with the WORD-GRAMMAR INTEGRITY COROLLARY,6 which states that 'a word of language X, with grammar G, must obey grammar G' (Belazi, Rubin and Toribio 1994: 232).

Like the Government Constraint, the Functional Head Constraint rules out switches between complementizers and their clausal complements. Therefore example (5) provides counter-evidence to this constraint. It also rules out switches between infinitival to and its verbal complement, examples of which are also attested in my corpus.

(8) *LIL: you don't need to wegwerfen. (Jen2.cha, line 2555)
                            throw away

The Functional Head Constraint furthermore rules out switches between determiners (including quantifiers and numerals) and nouns. As nouns are the most frequently borrowed or switched word class, counterexamples abound in my and many other corpora.

MacSwan (1999: 188), working within the minimalist framework, also assumes that code-switching within a PF component is not possible. This PF Disjunction Theorem amounts to the same effect as the Free Morpheme Constraint (Sankoff and Poplack 1981) and the various restrictions on switching between stems and morphologically bound inflectional material. Examples (7) and (9) are therefore clear violations of the PF Disjunction Theorem.

(9) *DOR: sie haben einfach nicht ge#bother-ed. (Ibron.cha, lines 1012, 14)
          they have simply not

The minimalist framework he is working in forces MacSwan (1999) to preserve constituent structure, but he acknowledges the advantages of a system of lexicalized parameters for the analysis of code-switching.

In this section I reviewed approaches to code-mixed data that crucially depend on constituency structure/maximal projections (DiSciullo, Muysken and Singh 1986) and functional categories (Belazi, Rubin and Toribio 1994). I showed that these constraints and models are too restrictive in that they rule out naturally occurring examples of intra-sentential code-mixing. The 'government constraints' (DiSciullo, Muysken and Singh 1986; Muysken 1989) were found to be too restrictive when tested against natural language data because the government domain was too large. Models, approaches and constraints based on functional categories (Joshi 1985; Myers-Scotton 1993; Belazi, Rubin and Toribio 1994) fall short of accounting for the data available and are unsatisfactory because none of the definitions of functional categories that have been offered (in terms of function words, closed class items, system morphemes or non-thematicity) work. They either define fuzzy categories where a sharp distinction would be needed, or they conflict with the data. Complementizers and determiners, the two most commonly quoted examples of functional categories, provide most of the counterexamples.

For these reasons a syntactic theory that rejects constituency structure and does not recognize functional categories (Hudson 2000) seems an interesting and promising option to explore. In the next section I will review other aspects/characteristics of WG which are perceived to make this theory of sentence structure more suitable for the analysis of (monolingual and) code-mixed data than other theories.

3. A Word Grammar Approach to Code-Mixing

The main reason why I chose WG for the syntactic analysis of my data is that this theory of sentence structure takes the word as the central unit of analysis. In WG, syntactic structures are analysed in terms of dependency relations between single words,7 a parent and a dependent. Phrases are defined by dependency structures which consist of a word plus the phrases rooted in any of its dependents. In other words, WG syntax does not use phrase structure in describing sentence structure, because everything that needs to be said about sentence structure can be formulated in terms of dependencies between single words. For intra-sententially switched data this is seen as an advantage over other syntactic theories because each parent only determines the properties of its immediate dependent. Language-specific requirements are thus satisfied if the particular pair of words, i.e. the parent and the dependent, satisfy them. A word's requirements do not project to larger units like maximal projections/phrasal constituents. If we want to formulate constraints on code-switching within WG, they have to be formulated for individual types of dependency relations. Because they do not affect larger units, they are less likely to be too restrictive than constraints affecting whole phrasal constituents. One of the main problems of constituency-based models, i.e. over-generalization through phenomena like government chains, therefore cannot occur in a WG approach to code-mixing.
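Because WG constraints are stated over single parent-dependent pairs, a code-mixing analysis reduces to checking each dependency on its own; nothing propagates up to a larger constituent. The sketch below illustrates this under my own assumptions: the dictionary records for words and the single word-order requirement are invented simplifications, not the WG notation itself.

```python
# Illustrative sketch: in a WG-style analysis each dependency is
# checked on its own, so a 'switched' dependency is well formed as
# long as the parent and dependent each satisfy their own language's
# requirements. Data format and the one check used are invented here.

def check_dependency(parent, dependent, relation):
    """Each word imposes requirements only on its immediate partner.
    Here we check just the parent's word-order requirement, if any:
    does the dependent stand after the parent when it should?"""
    order_ok = (parent.get('dependent_after', True)
                == (dependent['pos'] > parent['pos']))
    return {'relation': relation,
            'switched': parent['lang'] != dependent['lang'],
            'ok': order_ok}

# Example (6): 'das ist painful' -- German copula, English complement.
ist = {'form': 'ist', 'lang': 'de', 'pos': 1, 'dependent_after': True}
painful = {'form': 'painful', 'lang': 'en', 'pos': 2}

result = check_dependency(ist, painful, 'complement')
print(result['switched'], result['ok'])  # True True
```

The switched complement dependency comes out as well formed, since only the immediate word pair is consulted; there is no larger projection whose language index could be violated.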

The central role of the word in WG moreover means that words are not only the largest units of WG syntax, but also the smallest. In contrast with Chomskyan linguistics, syntactic structures do not, and cannot, separate stems and inflections. Furthermore, at least as far as overt words are concerned, WG rejects the notion of functional category. Hudson (2000) shows that this notion is problematic, because it has never been defined coherently and because all the individual categories that have been given as examples (e.g. complementizers) raise serious problems. For the same reasons, constraints on intra-sentential code-switching based on functional categories (Joshi 1985; Belazi, Rubin and Toribio 1994) and models of code-switching that crucially depend on the distinction between system and content morphemes (Myers-Scotton 1993) run into serious empirical difficulties (see section 2). Because WG is an example of 'morphology-free syntax' (Zwicky 1992: 354) which rejects the notion of functional categories, a WG approach to intra-sentential code-switching cannot over-emphasize the role of inflectional morphemes.

Words being the only and central unit of analysis in Word Grammar furthermore benefits code-mixing research in a purely pragmatic way. The majority of research in this area is based on sizable natural language corpora. Because the only units that need to be processed in WG are individual words, and larger units are built by dependency relations between two words which can be looked at individually, a WG approach to intra-sentential code-mixing requires less analysis than constituency-based models. This facilitates the analysis of large-ish corpora. Eppler (2004), for example, is based on a WG analysis of a 22,000 word corpus.

For intra-sententially mixed sentences a dependency analysis is furthermore seen as an advantage over phrase structure grammar frameworks because it highlights the functional relations between words (from the same or different languages) rather than code-switch points. Constituency-based models describe and/or constrain intra-sentential code-switching by disallowing switches between, for example, PP and NP (see section 2). A WG analysis would note a switched complement relation which is grammatical if the preposition and the determiner/(pro-)noun involved in it satisfy the constraints imposed on them by their own language. To start to understand what is going on in intra-sentential code-switching, it seems more beneficial to gain an insight into which syntactic relations are frequently or rarely switched, rather than to increase our knowledge about points in sentences where switching does not occur.

Another characteristic of WG is that dependency analyses have a totally flat structure. A single, completely surface structure analysis (with extra dependencies being drawn below the sentence-words) is seen as benefiting WG over other theories of language structure for code-mixing research: linguists working on code-mixing during times when Chomskyan frameworks still stressed the difference between surface and deep structure did not know what to do with D-structure, because code-switching clearly seems to be a surface structure phenomenon. Romaine (1989: 145) concludes her discussion of the government constraint with the statement 'data such as these [code-mixing data] have no bearing on abstract principles such as government [...] because code-switching sites are properties of S-structure, which are not base-generated and therefore not determined by X-bar theory'. This problem does not emerge when one works with WG because of its totally flat, i.e. surface, analysis. A syntactic theory that shares properties of the linguistic phenomenon under investigation appears to be preferable to other syntactic theories; i.e. for a surface-structure phenomenon like code-mixing, a syntactic model that allows a single, completely surface analysis seems to be well suited.

Other aspects of WG which make this theory of sentence structure more suitable for the analysis of code-mixed data than other theories are derived from the WG view of language as a network which contains both the grammar and the lexicon and which integrates language with the rest of cognition.

This cognitive view of language as a labelled network has consequences for a controversial issue in psycholinguistic bilingualism research: the lexicons debate, i.e. whether bilinguals' lexical items/lemmas are stored in one or two lexicons. The network idea offers the advantage of viewing a bilingual's two languages as sub-networks, with denser links between lexical items from the same language and looser connections between lexical items from different languages.

This view of the bilingual lexicon(s), in combination with the multiple default inheritance system which WG operates on, could possibly have enormous benefits for writing a psycholinguistically realistic grammar of a bilingual. The following exploration is just a sketchy idea as to how this could work and requires fleshing out, but the basic idea seems to work. Default inheritance allows us to build a maximally efficient system for bilinguals by locating the shared properties of words which 'belong' to different languages higher up the is-a hierarchy and the language-specific properties lower down in this hierarchy. English come and German kommen, for example, are both verbs (is-a verb). They therefore share certain characteristics: they have a similar meaning ('move towards'), they both have tense (present or past), they have a subject and the subject tends to be a pre-dependent noun, etc. All these generalizable facts about German and English verbs can be located fairly high up in the is-a hierarchy. The features in which our two example words differ, for example that they have a different form (/kʌm/ and /kɔmən/ respectively), and that German kommen, when it is the complement of a subordinating conjunction or an auxiliary/modal, would be placed in clause final position, would be lower in the is-a hierarchy. Because of the way default inheritance works, characteristics of a general category are 'inherited' by instances of that category only if they are not overridden by a more specific (e.g. language-specific) characteristic. A fact located lower down in the inheritance hierarchy of entities or relations takes priority over one located above it. Thus we could maximize the bilingual system by allowing generalization by default inheritance and ensure that the language-specific properties would automatically override the general pattern. For bilinguals this system would have the advantage that the grammatical system of a Castilian/Catalan bilingual, for example, would have fewer overriding/blocking language-specific properties listed than that of a German/English bilingual.8
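The default inheritance mechanism just described can be sketched in a few lines of code. The hierarchy, property names and values below are my own illustrative stand-ins for the WG network, not Hudson's notation: shared facts sit high up on 'verb', language-specific facts sit lower down, and a fact stored on a more specific node wins.

```python
# A toy sketch of default inheritance over an is-a hierarchy, illustrating
# the bilingual lexicon idea above. All node and property names are
# invented stand-ins for the WG network, not Hudson's own notation.

HIERARCHY = {
    # Shared, generalizable facts sit high up the hierarchy ...
    "verb": {"isa": [], "props": {"has_tense": True,
                                  "subject": "pre-dependent noun"}},
    "english-verb": {"isa": ["verb"], "props": {}},
    # ... and language-specific facts sit lower down, overriding defaults.
    "german-verb": {"isa": ["verb"],
                    "props": {"position_under_subordinator": "clause-final"}},
    "come": {"isa": ["english-verb"],
             "props": {"form": "/kʌm/", "sense": "move towards"}},
    "kommen": {"isa": ["german-verb"],
               "props": {"form": "/kɔmən/", "sense": "move towards"}},
}

def inherit(concept, prop):
    """Most specific value wins: a fact on the concept itself takes
    priority over anything inherited from higher up the is-a hierarchy."""
    node = HIERARCHY[concept]
    if prop in node["props"]:
        return node["props"][prop]
    for parent in node["isa"]:
        value = inherit(parent, prop)
        if value is not None:
            return value
    return None

print(inherit("kommen", "has_tense"))  # shared default inherited from 'verb'
print(inherit("kommen", "form"))       # language-specific fact, stored low down
```

Note how a Castilian/Catalan pair would simply store fewer language-specific entries in the lower layers than a German/English pair, exactly as suggested above.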

WG furthermore aims at integrating all aspects of language into a single theory which is also compatible with what is known about general cognition; that is, language can be analyzed and explained in the same way as other kinds of knowledge or behaviour. For example, it is widely acknowledged that code-mixing is influenced by social and psychological factors (Muysken 2000), and a syntactic model that allows us to incorporate this kind of information is better suited to describe language contact phenomena than theories that deal exclusively with language. Knowledge of more than one language, and the use of more than one language in one sentence, can be analyzed and explained in the same way as knowledge of one language and monolingual language use. In other words, code-mixing is not seen as 'deviant'. Because WG aims to explain and analyze language in the same way as other kinds of social and psychological knowledge or behaviour, it is perceived to be more suitable for research into bilingualism than other models of syntax.

The WG view of language as a network of associations which is closely integrated with the rest of our knowledge lends itself particularly well to code-mixing research for another reason. It is a well accepted fact in this research paradigm that adult bilinguals know, first of all, which language the words they use belong to. Second, they know when to code-switch and when not to (code-switching as a MARKED or UNMARKED choice,9 for example, Myers-Scotton and Jake 1995), or when they should be in MONOLINGUAL SPEECH MODE or when they can go into BILINGUAL MODE (Grosjean 1995). Third, bilinguals also know which mixing patterns are acceptable in their speech community and which are not (SMOOTH versus FLAGGED code-switching,10 for example, Poplack and Meechan 1995). This knowledge about language use is obviously closely integrated with other types of (social) knowledge, and a syntactic theory that views language as a part of the total associative network is clearly more suitable to explain these phenomena than other theories.

Viewing language as a sub-network (responsible for words) which is just a part of the total associative network creates another advantage of WG for the research paradigm under discussion in this chapter. This benefit is related to the fact that most code-mixing research is based on natural language corpora.11

In contrast with most other kinds of grammar which generate only idealized utterances or sentences, WG grammar can generate representations of actual utterances. A WG analysis of an utterance is also a network; it is simply an extension of the permanent cognitive network in which the relevant word tokens comprise a fringe of temporary concepts attached by 'is-a' links; so the utterance network has just the same formal characteristics as the permanent network. This blurring of the boundary between grammar and utterance is quite controversial, but it follows from the cognitive orientation of WG. For work based on natural speech data it is seen as another crucial advantage of WG over other theories which can only generate syntactic structures for sentences. From the examples quoted so far, it is obvious that the audio data this study is based on are transcribed as utterances, i.e. units of conversational structure. For the grammatical analysis, however, I assume that conversational speech consists of the instantiation of linguistic units, i.e. sentences. In other words, every conversational utterance is taken to be a token of a particular type of linguistic unit, the structural features of that unit being defined by the grammatical rules of either German or English. When using a WG approach to code-mixed data, one does not have to 'edit' the corpus prior to linguistic analysis. Any material that cannot be taken as a token of either a German or English word-form can be left in the texts, but if it cannot be linked to other elements in the utterance via a relationship of dependency, it is not included in the syntactic analysis. That is, all the words in a transcribed utterance that are related to other words by syntactic relationships constitute the sentences the grammatical analysis is based on. As far as I am aware, WG is the only syntactic theory that can (and wants to) generate representations of actual utterances, and facilitates the grammatical analysis of natural speech data without prior editing.

Another consequence of integrating utterances into the grammar is that a word token must be able to inherit from its type. Obviously the token must have the typical features of its type - it must belong to a lexeme and a word class, it must have a sense and a stem, and so on. But the implication goes in the other direction as well: the type may mention some of the token's characteristics that are normally excluded from grammar, such as characteristics of the speaker, the addressee and the situation. For example, we can say that the speaker is a German/English bilingual and so is the addressee; the situation thus allows code-mixing. This aspect of WG theory thus allows us to incorporate sociolinguistic information into the grammar, by indicating the kind of person who is a typical speaker or addressee, or the typical situation of use.

Treating utterances as part of the grammar has further effects which are important for the psycholinguistics of processing. The main point here is that WG accommodates deviant input because the link between tokens and types is guided by the 'Best Fit Principle' (Hudson 1990: 45ff): assume that the current token is-a the type that provides the best fit with everything that is known. The default inheritance process which this triggers allows known characteristics of the token to override those of the type. Let's take the deviant word /bʌsə/ in the following example:

(10a) *TRU: xxx and warum waren keine bus(s)e [%pho: bʌsə]?
            why were there no buses            Jen3.cha, line 331

/bʌsə/ is phonologically deviant for German (Busse is pronounced /bʊsə/), and morphologically deviant for English, because the English plural suffix is -(e)s, not -e. Although this word is deviant,12 it can is-a its type, just like any other exception. But it will be shown as a deviant example. There is no need for the analysis to crash because of an 'error'. The replies to *TRU's question clearly show that the conversation does not crash:


(10b) *LIL: xxx [>] wegen einer bombe.
            because of a bomb
      *MEL: xxx [>] a bomb scare.      Jen3.cha, lines 332-333

This is obviously a big advantage of WG for natural speech data.

Another characteristic of natural speech data - and code-mixed data in particular - is that they are inherently variable. Most syntactic theories aim at describing and explaining regularized and standardized linguistic data and therefore disregard inherent variability. Hudson (1997) outlines how a prototype-based network theory that is based on default inheritance and uses entrenchment, like WG, can incorporate variation.

One of the strengths of the network approach is that it allows links to have different 'strength'; these are an essential ingredient of the model of spreading activation, and are highly relevant to quantitative work. Hudson (1997) stipulates that a language user who observes variation will arrive at generalizations about this variation. Each part of a variable network structure has some degree of 'entrenchment' which reflects the experiences of the person concerned. The degree of entrenchment of a concept can be presented as a probability of that concept being preferred to any relevant alternatives. This is presented for word-final variable t/d loss in Figure 1, where the figures13 in angled brackets present the probabilities.

Figure 1 (Hudson 1997: Figure 5)

This analysis of variation is declarative and non-procedural and requires just two elementary operations: pattern-matching and default inheritance. Speakers and hearers need to know that alternative forms can be used instead of the basic form, and in a real life context the choice between them is influenced by the linguistic and social context. Figure 2 just hints at how these extra variables could be introduced.
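To make the idea concrete, here is one way entrenchment values could drive the choice between variants. The forms, baseline probabilities and context boost below are invented for illustration; they are not the figures from Hudson (1997).

```python
# A toy model of entrenchment-driven variant choice for word-final t/d
# deletion. The forms, baseline probabilities and the context adjustment
# are invented for illustration; they are not Hudson's (1997) figures.
import random

# Baseline entrenchment of each variant, expressed as a preference probability.
VARIANTS = {"mist": 0.7, "mis'": 0.3}   # full form vs. t/d-deleted form

def choose(variants, before_vowel):
    """Sample a variant; a following vowel boosts the full form, standing
    in for the linguistic-context effects hinted at in Figure 2."""
    weights = dict(variants)
    if before_vowel:
        weights["mist"] *= 1.5          # illustrative context adjustment
    total = sum(weights.values())
    r = random.uniform(0, total)
    for form, weight in weights.items():
        r -= weight
        if r <= 0:
            return form
    return form                          # guard against float rounding

print(choose(VARIANTS, before_vowel=True))
```

Over many utterances the sampled frequencies approach the entrenchment values, which is what makes such a declarative network directly comparable with quantitative corpus counts.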


Figure 2 (Hudson 1997: Figure 6)

This model of inherent variability is possible because WG assumes that linguistic concepts are closely linked to non-linguistic concepts and carry quantitatively different entrenchment values. The reason why I find the proposed model so appealing is because it is a model of competence - not performance. Inherent variability is generally (rightly or wrongly) associated with performance, and to my knowledge there is no other model that presents variability and sociolinguistic information as part of a speaker's competence. I believe that linguistic variation that is influenced by social factors is part of every speaker's competence, and a (more fleshed out) model of how speakers exploit their sociolinguistic competence is therefore required within linguistic theory.

In the following main section of this chapter I will present a quantitative/variationist and qualitative analysis of monolingual and code-mixed subordinate clauses. As none of the syntactic restrictions on code-switching proposed in the literature hold absolutely and universally, several recent studies in the field (Mahootian and Santorini 1996; MacSwan 1999; Eppler 1999) have reverted to the null hypothesis. I take the same approach. Formulated in WG terms, the null hypothesis assumes that each word in a switched dependency satisfies the constraints imposed on it by its own language.
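As a rough illustration, the null hypothesis can be phrased as a check over a single dependency: each of the two words consults only its own language's constraints. The rule table and representation below are my own toy fragment, not a WG formalism.

```python
# A minimal sketch of the null hypothesis as formulated above: a switched
# dependency is acceptable if each word satisfies the constraints its own
# language imposes on it. The rule table is an invented toy fragment.

# Per-language constraints on a word, keyed by (language, word class, relation).
RULES = {
    ("english", "subordinator", "complement"): "dependent-after",
    ("german", "finite-verb", "complement-of-subordinator"): "head-before",
}

def word_ok(lang, wclass, relation, head_pos, dep_pos):
    """Check one word's own-language constraint; positions are linear indices."""
    rule = RULES.get((lang, wclass, relation))
    if rule == "dependent-after":
        return dep_pos > head_pos
    if rule == "head-before":
        return head_pos < dep_pos
    return True  # no constraint recorded for this word

def switched_dependency_ok(head, dep, head_pos, dep_pos):
    """head/dep are (language, word class, relation) triples; both words
    must satisfy their own language's constraint."""
    return (word_ok(*head, head_pos, dep_pos) and
            word_ok(*dep, head_pos, dep_pos))

# English 'because' heading a German finite verb to its right: each word
# checks only its own language's constraint, so the switch is licensed.
print(switched_dependency_ok(
    ("english", "subordinator", "complement"),
    ("german", "finite-verb", "complement-of-subordinator"),
    head_pos=3, dep_pos=7))  # True
```

The point of the sketch is that no constraint ever mentions the language of the *other* word in the dependency, which is exactly what distinguishes the null hypothesis from constraints like the functional head constraint.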

Subordination was chosen as an area of investigation because the two languages in contact in this particular situation, German and English, display surface word order differences: English subordinate clauses are SVO whereas German subordinate clauses are SOV. The contrasting word order rules for English and German, stated in Word Grammar rules, are:

E1) In English any verb follows its subject but precedes all its other dependents. This holds true for main as well as subordinate clauses and gives rise to SVO order in both clause types.

E2) Subordinators, e.g. because, require a following finite verb as their complement. A word's complement generally follows it.14

For German the most relevant rules15 concerning word order in main and subordinate clauses are:

G1) A default finite verb follows one of its dependents but precedes all other dependents. This gives rise to verb second (V2) word order in German main clauses.

G2) A finite verb selected by a lexical subordinator/complementizer takes all its non-verb dependents to the left, i.e. it is a 'late'16 verb.

G3) Subordinators/complementizers, e.g. daß, select a 'late' finite verb as their complement.17 According to G2, finite 'late' verbs follow all their non-verb dependents.

An example illustrating rules G1-G3 would be:

(11) Ich glaube nicht, daß wir die Dorit schon gekannt haben
     I think not       that we Dorit already known have
                                                Jen3.cha, line 83

The utterance initial main clause displays V2 word order. The finite auxiliary haben, which depends on the subordinator daß, on the other hand, is in clause final position, following all other constituents including non-finite verbs like gekannt. In English, finite verbs in subordinate clauses do not behave differently from finite verbs in main clauses. Therefore we do not have to override the default rule E1 in the 'is-a' hierarchy of grammar rules. Because German finite verbs depending on a subordinator take a different word order position than 'independent' finite verbs, we need a more specific rule (G2) that overrides the default rule (G1) in the cases stated, i.e. finite verbs selected by German subordinators.
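The override relation between G1 and G2, and the uniformity of E1, can be sketched as a small function; the string descriptions are my own paraphrases of the rules, not WG notation.

```python
# A toy sketch of rules E1, G1 and G2 as default inheritance: the more
# specific German rule G2 overrides the default G1 when the finite verb
# depends on a subordinator; English E1 needs no override. The returned
# descriptions are my own paraphrases, not WG notation.

def verb_position(language, under_subordinator):
    """Describe where a finite verb stands relative to its dependents."""
    if language == "english":
        # Rule E1 holds in main and subordinate clauses alike (SVO).
        return "after subject, before all other dependents"
    if language == "german":
        if under_subordinator:
            # Rule G2: a 'late' verb follows all non-verb dependents (SOV).
            return "after all non-verb dependents (clause-final)"
        # Default rule G1: verb second in main clauses.
        return "after exactly one dependent (verb second)"
    raise ValueError(f"no rules recorded for {language!r}")

# Example (11): main-clause 'glaube' vs. 'haben' selected by 'daß'.
print(verb_position("german", under_subordinator=False))
print(verb_position("german", under_subordinator=True))
print(verb_position("english", under_subordinator=True))  # E1 throughout
```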

The pre-minimalism constituent-based models discussed in section 2 all have difficulties accounting for mixing between SVO and SOV languages because of the opposite setting of the branching parameter. I will show in the next section that this code-mixing pattern can be studied productively in terms of WG.

4. Word Order in Mixed and Monolingual 'Subordinate' Clauses

Code-switching between main and subordinate clauses was chosen as a research area for several reasons. First, it is interesting from a syntactic point of view. If German-English bilinguals want to code-switch subordinate clauses, they need to resolve the problem of English being SVO whereas German finite verbs depending on subordinating conjunctions generally being placed in clause-final position (SOV).18 How this word order contrast is resolved is relevant to the underlying question in all grammatical code-switching research, i.e. whether there are syntactic constraints on code-mixing. Second, the code-switched corpus contains a considerable number of switches between main and subordinate clauses (37), not including the 27 switches involving because discussed in more detail below. Third, code-switching at clause boundaries has attracted much attention in the research area.

As complementizers are one of the most commonly quoted examples of word classes that are functional categories in constituent-based models of syntax, the government and functional head constraints discussed in section 2 all rule out switching between C and the remainder of the CP. Gumperz (1982) also proposes that subordinate conjunctions must always be in the same code as the conjoined sentence. Sankoff and Poplack (1981: 34), on the other hand, observe that in their Spanish/English corpus subordinate conjunctions tend to remain in the language of the head element on which they depend. Bentahila and Davies' (1983) corpus of Arabic/French yields numerous examples of switches at various types of clause boundary: switches between main clauses and subordinate clauses, switching between complementizers and the clauses they introduce, and examples where the conjunction is in a different language from both clauses.

Although my corpus also contains switches at all the points discussed by Bentahila and Davies (1983), my data largely support Gumperz' (1982) 'constraint', that is, subordinate conjunctions (apart from because) tend to be in the language of the subordinate clause that depends on them, and not the head element on which they depend. Examples illustrating switches between main and various types of subordinate clauses in both directions are:

(12) *MEL: ich hab(e) gedacht, there is going to be a fight.
           I have thought                    Jen1.cha, line 987

(13) *MEL: I forgot, dass wir alle wieder eine neue partie angefangen haben.
                     that we all again a new game started have
                                             Jen1.cha, line 2541

(14) *TRU: die mutter wird ihr gelernt haben, how to keep young.
           her mother would her taught have   Jen1.cha, line 2016

(15) *DOR: wenn du short bist, you wouldn't talk.
           when you are
     *DOR: aber wenn man geld hat, you talk.
           but when one money has             Jen3, line 581-2

(16) *TRU: er schreibt fuenfzehn, if you leave it in your hand.
           he counts fifteen                  Jen2.cha, line 932

(17) *LIL: das haengt davon ab, what 'nasty' is(t).
           that depends on                    Jen2.cha, line 1062

Note that the null hypothesis is borne out in examples (12)-(17) and in the vast majority of monolingual and mixed dependencies19 in the German-English corpus. The WG rules determining the word order in main and subordinate clauses also hold. These findings are furthermore supported by the quantitative analysis of 1,350 monolingual and 690 mixed dependency relations in a 2,000 word monolingual sample corpus and a 7,000 word code-mixed corpus (see Eppler 2004).


This study particularly focuses on because and weil clauses. Several researchers (Gardner-Chloros 1991; Salmons 1990; Treffers-Daller 1994; Bolle 1995; Boumans 1998) studying code-mixing between SVO and SOV languages noticed that the clauses depending on switched conjunctions are frequently not SOV but V2. The conjunction in these examples, furthermore, is frequently the causal conjunction because, parce que or omdat. This led Boumans (1998: 121) to hypothesize that '... it is possible that foreign conjunctions do not trigger verb-final in Dutch and German clauses simply because they are used in functions that require main clause order'. He, however, found it 'hardly feasible to examine this hypothesis in relation to the published examples because these are for the most part presented out of context' (Boumans 1998: 121). I will show that a fully (LIDES20) transcribed corpus of German and English data allows us to verify this hypothesis.

Both types of analysis, qualitative structural and quantitative distributional, are considered to be necessary for a comprehensive description of the data, because different structural patterns are used to different degrees and for different purposes. The variation in the data can best be described quantitatively; the qualitative analysis provides an explanation for the structural patterns found. This combination of methodologies furthermore enables us to address Muysken's (2000: 29) statement that '... we do not yet know enough about the relation between frequency distributions of specific grammatical patterns in monolingual speech data and properties of the grammar to handle frequency in bilingual data'. I will compare the because- and weil-clauses in mixed utterances with monolingual German and English examples and show that we do know enough about the syntax and pragmatics of this construction to explain both the frequency distribution of causal conjunctions and the use of verb second (rather than verb final) word order.

4.1 The empirical issues

4.1.1 ASYMMETRY BETWEEN CONJUNCTIONS OF REASON

The distribution of German and English subordinators/complementizers in the corpus is approximately 60:40, which is in accordance with the general distribution of word tokens from the two languages in the data. If, however, we focus on because and the translation equivalent from the same word class, the subordinating causal conjunction weil, we get a very different picture. The corpus yields twice as many tokens of the English subordinator as it does of weil (see Table 1). A typical use of because, especially for speaker DOR, is:

(18) DOR: es war unsere [...] Schuld because man fühlt sich
          it was our fault                  one feels
          mit den eigenen Leuten wohler.
          with the own people happier.      Ibron.cha, line 221

Because in the above example can be argued to be a single lexical item inserted in otherwise German discourse. This particular usage of the English causal subordinator is not restricted to speaker DOR:

(19) LIL: because er ist ein aufbrausender Irishman.
                  he is a hot-blooded        Jen1.cha, line 389

Because also enters syntactic relations where the word on which it depends is English (eat) and its dependent is German (schmeckt), as in:

(20) DOR: eat it with der Hand-! because das schmeckt ganz anders.
                      the hand           it tastes very differently
                                             Ibron.cha, line 2214

or vice versa, e.g. because has a German head verb (habe) but an English complement (know):

(21) MEL: ich hab's nicht einmal gezählt because I know I'm going to lose.
          I have it not even counted         Jen1.cha, line 881

The German subordinator of reason, weil, on the other hand, only enters into monolingual dependency relations:

(22) DOR: dann ist sie, weil sie so unglücklich war, dort gestorben.
          then has she, because she so unhappy was, there died
                                             Ibron.cha, line 1002

So there is not only an asymmetry in the number of tokens each subordinator yields, but also in the language distribution of the immediate syntactic relations which because and weil enter into, i.e. their main clause head verb and the subordinate dependent verb. The results are summarized in Table 1.

Table 1: Language of head and dependent of because and weil

            headE-depE   headE-depG   headG-depG   headG-depE   total
because         86            5           16            6        113
weil             0            0           59            0         59

The phenomenon of single lexical item subordinate conjunctions in other language contexts is not uncommon in the code-mixing literature.21 As far as directionality of the switch is concerned, the situation in my corpus is in sharp contrast with the findings of Clyne (1973) who studies German/English code-mixing among the Jewish refugee community in Melbourne, Australia. He reports that 'the words transferred from German to English are mainly conjunctions (denn, ob, und, weil, wie, wo)' (Clyne 1973: 104). The corpus from the refugee community in London also shows a high propensity for switching conjunctions; however, the vast majority of them are English conjunctions in otherwise German discourse. Lexical transfer of the same word class thus seems to work in the opposite direction in two bilingual communities with a very similar sociolinguistic profile mixing the same language pair.

To rule out the possibility that English because is used in place of another German causal conjunction, I will now look at the other possibilities. Da is another causal subordinator, thus producing the identical word order effects to weil, but normally used in more formal contexts. The whole corpus yields only one example of German da used as a subordinating conjunction. This token is embedded in formal discourse and was produced by a speaker who does not use the mixed code as a discourse mode. Denn is a causal coordinating conjunction. It was used once by a speaker from the group recordings (not DOR) and three times by a speaker in a more formal setting. Denn has increasingly gone out of use in colloquial German (Pasch 1997; Uhmann 1998); however, since it is used by my informants, we need to consider it as a possible translation equivalent of because. This possibility is interesting because it involves word order issues: as a coordinating conjunction, denn always takes V2 order in the clause following it. The relations between weil and denn will be discussed further in section 4.2.2 on word order.

4.1.2 VERB SECOND WORD ORDER AFTER BECAUSE AND WEIL

Examples (18)-(20) also demonstrate the structural feature under investigation: German finite verbs occur in main clause word order position in subordinate clauses introduced by because. In actual fact not one German finite verb depending on because is in clause final position (as in monolingual German subordinate clauses with an overt German subordinator; see example 22).

Furthermore, not all finite dependent verbs follow their subject. Some of them follow fronted indirect objects as in (23), others follow adverbials as in (24):

(23) DOR: because dem Computer brauchst' es nicht zeigen.
                  the computer need you it not show.
                                             Jen2.cha, line 729

(24) LIL: is' wahr -? because bei mir hat schon +...22
          it's true           at my place has already
                                             Jen1.cha, line 298

The word order in subordinate clauses after because is summarized in Table 2.

Table 2: Word order in subordinate clauses after because

            dependent English        dependent German
                 SVX              SVX      XVS      SOV
because           92               15        6        0


What are supposed to be German dependent verbs occur in second position after because, which shows that because, at least for my informants, has not taken over the syntactic characteristics of the German subordinating conjunction weil, which requires its dependent verbs to be clause final.

Let us now take a closer look at this subordinator. Table 1 illustrates that weil only has German complements. According to the rules of standard German (rules G2 & G3), finite verbs depending on an overt subordinator should follow all their dependents, i.e. be clause final. This is not borne out in the corpus. Note, however, that 58 per cent of dependent verbs are in final position after weil, whereas none is in this position after because. Table 3 summarizes the position of the dependent finite verb in weil clauses from my corpus. In order to see whether verb second after weil is a parochial convention of my data or not, I also give the distribution of V2 and Vf from several other corpora of monolingual spoken German23 for comparison.

Table 3: Verb position after weil, partly based on Uhmann (1998: 98)

                        Vf      V2       Vf       V2
Eppler (2004)           34      25       58%      42%
BYU (Vienna)            62      11       85%      15%
Farrar (1998) BYU     1147     517       69%      31%
Schlobinski (1992)      74      22       77%      23%
Uhmann (1998)           24      19       56%      44%
Dittmar (1997)          99      29       77.3%    22.7%

Table 3 shows that between 15 per cent and 44 per cent of dependent verbs in these corpora are not in final position. So weil+V2 word order is not just a peculiarity of the German spoken by my bilingual informants.

We thus have two problems to solve: 1) the asymmetrical distribution of because and weil in the corpus; and 2) the word order variation in both mixed and monolingual causal clauses introduced by because and weil. In the next section I will suggest possible solutions to these two problems.

4.2 Possible explanations

4.2.1 FOR THE ASYMMETRY OF BECAUSE AND WEIL

The frequencies with which because and weil occur in dependency relations (summarized in Table 1) suggest that for the asymmetry between because and weil a probabilistic perspective is required.

Fourteen out of the sixteen tokens of because in an otherwise German context were produced by one speaker (DOR). This is even more significant if we remember that this speaker is German dominant. The data from this speaker only contain seven tokens of the German subordinator weil (and no denn). Because thus seems to replace weil for specific uses in the speech of this speaker. This use of the causal conjunctions is also to be found among the close-knit network of bilinguals who use the mixed code as a discourse mode (speakers TRU, MEL and LIL); but there is no significant asymmetrical relation between because and weil in the rest of the corpus.

Reasons for the discrepancy between the British and Australian corpora will have to remain speculative for the moment; I will, however, come back to this point at the end of section 4.2.2. Why German-speaking Jewish refugees in Australia incorporate German conjunctions into their English - while the directionality of lexical transfer is reversed among the same speakers in Britain - could be due to the Australian corpus having been collected approximately 20 years before the London corpus. Michael Clyne collected data from this speech community in the 1970s. My corpus was collected in 1993. An additional two decades of exposure to English for the London-based refugees may be a possible explanation for this discrepancy. Data from German-American dialects that have been in contact with English for up to two centuries support this assumption. See example (25) from Salmons (1990: 472):

(25) Almost jedes mol is Suppe gewen, because mir han kei
     almost every time is soup been          we have no

     Zeit khat fer Supper recht essen.
     time had for supper properly to eat

Treffers-Daller (1994: 192-5) discusses (25) and (26) and suggests analyzing the conjunctions in these two examples as coordinators. For monolingual English, Schleppegrell (1991: 323) argues that 'a characterisation of all because clauses as subordinate clauses [...] is inadequate'. The possibility of a paratactic24 function for because will be discussed in the next section.

Gardner-Chloros's (1991) French/Alsatian data also offer an interesting example of two Alsatian clauses linked by a French causal conjunction.

(26) Un noh isch de Kleinmann nunter, parce que ich hab
     and now is the Kleinmann down     because   I have

     mi dort mue melde.
     myself there must check in

The German verbs selected by the English and French conjunctions in examples (25) and (26) follow just one dependent, in these cases their subjects. I will discuss the not strictly causal/subordinating use of English because, German weil and French parce que in the next section.

4.2.2 V2 AFTER BECAUSE AND WEIL

The clearest result of the quantitative analysis presented in Table 2 is that all German finite verbs in clauses after because are in second position and none in clause-final position.

The Word Grammar rules stated in section 3 account for the empirical data because English subordinators only require finite verbs as their complements (rule E2). German subordinators (rule G3), on the other hand, provide a specific context that requires dependent verbs to take all their dependents to the left. As because is an English subordinator which does not specify that its complement has to be a clause-final verb, we get main clause word order (SVO in monolingual English or V2 in mixed utterances).

Supporting evidence for this interpretation comes from the six instances where the finite verb follows a dependent other than its subject (cf. examples 23-24 and 27 below).

(27) DOR: I lost because # dreimal gab sie mir drei Könige.
                           three times gave she me three kings

Jenl. cha, line 817

In the above example the verb is in second position, but the clause is clearly not SVO. The finite verb is preceded by an adverbial but followed by the subject. In other words, the clause displays the V2 order expected in German main clauses.

But how do we know that because and the because-clause are used in a restrictive subordinating way in examples (23), (24) and (27)? This question needs to be addressed because research conducted by, amongst others, Rutherford (1970), Schleppegrell (1991) and Sweetser (1990) casts doubt on the characterization of all because-clauses as subordinate clauses. Especially in spoken discourse, because can be used in a variety of non-subordinating and not strictly causal functions.

Several criteria have been proposed to distinguish between restrictive (i.e. subordinating25) and non-restrictive because-clauses (Rutherford 1970). In sentences containing restrictive because clauses, yes/no questioning of the whole sentence is possible; pro-ing with so or neither covers the entire sentence; they can occur inside a factive nominal; and if another because clause were added, the two causal clauses would occur in simple conjunction. In semantic terms the main and the subordinate clause form one complex proposition, and the because-clause provides the cause or reason for the proposition itself. This causal relationship is one of 'real-world' causality (Sweetser 1990: 81). Chafe (1984) asserts that restrictive because clauses have a reading that presupposes the truth of the main clause and asserts only the causal relation between the clauses. These clauses tend to have a commaless intonational pattern.

I will now apply these characteristics to some of the causal clauses introduced by because in the corpus cited so far. Utterance (27) passes all of Rutherford's (1970) syntactic criteria for restrictive because-clauses. The main and because-clauses form one complex proposition with a reading in which 'her giving the speaker three kings' is the real-world reason for the speaker losing the game of cards. The truth of the sentence-initial clause is presupposed and the causal relation between the two clauses is asserted. These properties of (27) speak for a restrictive analysis. The intonational contour of the utterance, however, displays a short pause after the conjunction.26 Note furthermore that the causal clause in (27) contains a pre-posed constituent that triggers inversion, i.e. a main clause phenomenon (Green 1976). So there are indicators for a restrictive/subordinate reading but also syntactic and intonational clues that point to a non-restrictive/epistemic reading in which the speaker's knowledge causes the conclusion. The latter interpretation suggests non-subordination, which would justify the V2 word-order pattern.

Example (18), repeated here with more context (to facilitate the interpretation) and prosodic information as (28), contains the English conjunction because but is otherwise lexified with German words:

(28) DOR: wir waren nie mit richtige Engländer zusammen.
          'we never mixed with "real" English people'

DOR: man hätte können # man hat nicht wollen.
     'we could have # but we didn't want to'

DOR: es war unsere [...] Schuld.
     it was our fault

because man fühlt sich mit den eigenen Leuten wohler.
        one feels oneself with the own people better

Ibronxha, line 217-22

This example passes none of Rutherford's (1970) 'tests'. The intonational drop before the conjunction, which intonationally separates the two clauses, also suggests a non-subordinate analysis for (28). A restrictive reading of the whole construction is awkward if not unacceptable: feeling relaxed in the company of fellow compatriots is not the cause or reason for feeling guilty. The non-restrictive reading in which the because clause provides the reason why the speaker said 'it was our own fault' is far more plausible. The because clause, furthermore, indicates an interpretative link between clauses that are several utterances apart: the last utterance in (28) provides a 'long-distance' reason for the first utterance in this sequence. Schleppegrell (1991: 333) calls these uses of because 'broad-scope thematic links'. They can only be identified when a corpus provides the relevant context for the example. The wider context also identifies the clause preceding the causal clause as presupposed and thematic. The information provided in the causal clause is new and asserted.

The analysis so far suggests that because is used in non-restrictive and non-subordinating functions in code-mixed utterances in my corpus. Without repeating them, I will now briefly discuss the other examples in which because introduces a clause with predominantly German lexical items (examples 19-20 and 23-24). Example (19) is a response to a preceding wh-question and thus an independent utterance; the information presented in the reply is not informationally subordinated, it forms the focus of the discourse and provides new information (Schleppegrell 1991: 331). Example (20) has two intonational contours. The intonational rise and the verb-first order mark the initial clause as a command or suggestion, i.e. an independent proposition; the following because clause then represents an elaboration of that proposition. The content of the causal clause is therefore not presupposed. Example (20) displays all the characteristics of an 'epistemic' (Sweetser 1990) because, which indicates 'elaboration and continuation in non-subordinating and non-causal contexts' (Schleppegrell 1991: 323). The because clause in example (23) is preceded by a short pause, contains a main clause phenomenon (extraction), and is reflexive on the previous discourse; finally, the because clause in (24) follows a rising intonation of the initial tag, and again explicitly mentions the speaker's knowledge state ('it's true').

We can conclude that those clauses in which because has a German (V2) verb as its complement display more characteristics of 'non-restrictive' (Rutherford 1970) clauses and should therefore be analyzed as paratactic rather than subordinating. The Word Grammar rules formulated in section 3 still account for the data because if because is not analyzed as a subordinator, the default rule G1 is not overridden and G2 and G3 do not get activated.
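The interaction of these rules can be sketched as a small default-inheritance check. The rule names (G1, G3, E2) follow section 3 of the chapter; the dictionary encoding of conjunctions is purely an illustrative device of this sketch:

```python
# Hypothetical sketch of the default-inheritance logic described above.

def expected_verb_position(conjunction):
    """Expected position of the finite verb in the clause introduced by
    `conjunction`: the German default (rule G1) is verb-second (V2);
    a German subordinator (rule G3) overrides this default and selects a
    clause-final ('late') verb; an English subordinator such as because
    (rule E2) only requires a finite complement, so the default survives."""
    if conjunction["language"] == "German" and conjunction["subordinator"]:
        return "Vf"  # rule G3 overrides the V2 default
    return "V2"      # default rule G1 is not overridden

restrictive_weil = {"form": "weil", "language": "German", "subordinator": True}
because = {"form": "because", "language": "English", "subordinator": False}
# Non-restrictive weil is not analyzed as a subordinator (it behaves like
# coordinating denn), so G3 is never activated:
paratactic_weil = {"form": "weil", "language": "German", "subordinator": False}

print(expected_verb_position(restrictive_weil))  # Vf
print(expected_verb_position(because))           # V2
print(expected_verb_position(paratactic_weil))   # V2
```

The point of the sketch is that nothing special has to be said about mixed clauses: because simply never carries the feature that activates G3.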

The analysis of the code-mixed data discussed so far indicates that the predominantly German clauses introduced by because fulfil functions that are not strictly causal but rather epistemic, broad-scope thematic link, etc. This distinct usage is also reflected in their structural and intonational patterns. We can therefore assume that we are dealing with non-restrictive because, which is non-subordinate and thus triggers main clause (V2) word order.

However, we also need to consider the monolingual data. The monolingual German data from my corpus are more worrying at first sight. Like because, weil was traditionally analyzed as a subordinating conjunction with causal meaning which takes a finite verb as its complement. These grammar rules are not absolutely adhered to by my informants and monolingual speakers of German. Only 58 per cent of verbs depending on weil in the speech of my informants are in clause-final/'late' position. Table 3 shows, furthermore, that in corpora of similar, i.e. southern, varieties of German only 56-85 per cent (with an average of approximately 67 per cent) of the subordinate clauses introduced by weil are grammatical according to the rules for monolingual German as stated in section 3.

The recent German literature on weil constructions (Günthner 1993, 1996; Pasch 1997; Uhmann 1998), however, suggests an explanation for the monolingual German data and opens up the possibility of an interesting interpretation of the mixed data. There is agreement among the above-named researchers that a) there is considerable variation in the use of weil + V2 or weil + Vf; b) weil + V2 is most frequent in southern German dialects; and c) weil clauses with verb-final placement and weil clauses with main clause (V2) word order are found to show systematic but not absolute differences. In a nutshell, the analysis for German weil is similar to the analysis proposed for English because: there are two types of weil clauses, one strictly subordinating one, and several non-restrictive paratactic uses. The factor that best seems to account for the data is the information structure of the construction. If pragmatics and syntax, which in German is a much clearer indicator than in English, fail to provide clear criteria as to which type of weil-construction we are dealing with, intonation can once again help to disambiguate. Example (29) from my corpus illustrates epistemic weil + V2:

(29) LIL: sie hat sich gedacht, die [/] die muss doch Wien kennenlernen,
          'She thought she needs to get to know Vienna'
          weil die Eltern sind beide aus Wien.
          because the parents are both from Vienna

JenS. cha, line 107-8


Note that in (29) weil could be replaced by the German coordinating conjunction denn. Pasch (1997) and Uhmann (1998) agree that the non-restrictive weil seems to take the position of Standard German denn in the system of conjunctions of reason in colloquial German.

In the analysis so far it has been established that there are 'restrictive' and 'non-restrictive' because clauses in English and 'restrictive' and 'non-restrictive' weil clauses in German. A cross-linguistic comparison of these clause types revealed that they share many of their discourse-pragmatic, syntactic and intonational characteristics. My informants use both clause types from both languages in monolingual contexts. In addition to this, they employ because in code-mixed contexts. They treat English because as the translation equivalent of the non-restrictive weil+V2 or denn. Their linguistic competence tells them that these constructions are equivalent in syntax and pragmatic content.

This was demonstrated for the quoted examples and also holds true for the because followed by weil+V2 examples not reproduced in this chapter. Furthermore, if we apply this analysis to the quantitative asymmetry found in the corpus between the two conjunctions because and weil and add the 21 tokens of because+V2 to the weil tokens, this asymmetry shrinks to a figure (80 weil : 120 because) which is in line with the general language distribution in the corpus. In addition to the syntactic and pragmatic reasons for using this 'congruence approach' (Sebba 1998: 1) to switching at clause boundaries, my informants may also be dialectally predisposed to the weil+V2 construction because all of them are L1 speakers of a southern German variety.
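The token arithmetic behind this reclassification can be made explicit. In the sketch below, the weil total (59) is my reading of Table 3 (34 Vf + 25 V2), and the overall because total (141) is inferred from the resulting 80 : 120 figure; both are assumptions of the illustration rather than figures stated as such in the text:

```python
# Corpus token arithmetic for the because/weil asymmetry (illustrative).
weil_tokens = 34 + 25      # all weil tokens (Vf + V2, from Table 3)
because_tokens = 120 + 21  # all because tokens (inferred from the 80:120 figure)
because_V2 = 21            # because with a German V2 complement

# Treating because + V2 as the translation equivalent of non-restrictive
# weil + V2 (or denn) moves those 21 tokens into the weil-type column:
weil_type = weil_tokens + because_V2
because_type = because_tokens - because_V2
print(f"{weil_type} weil-type : {because_type} because")
```

Run as above, this reproduces the 80 : 120 distribution quoted in the text.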

I will now briefly return to the discrepancy between the Australian (Clyne 1973) and London corpora mentioned in sections 4.1.1 and 4.2.1. The question was why German-speaking Jewish refugees in Australia incorporate German conjunctions into their English, while the directionality of lexical transfer is reversed among the same speakers in Britain. I hypothesized that duration of language contact may have something to do with it. At the time of data collection, German-speaking refugees in Australia had been mixing German and English for approximately 30 years. In London, on the other hand, these two languages had been in contact for more than half a century when I collected my data. Another situation where we can witness long-term contact between the two languages under investigation is German-American dialects. Note, furthermore, that example (25) from these data (Salmons 1990) also has main clause word order after because.

The development in Pennsylvania German (Louden 2003) is particularly interesting in this respect. Louden (2003) illustrates the causal conjunction paradigm in Pennsylvania German (PG) data from the 19th century onwards. In the second half of this century he found the standard German distribution of weil + verb final and dann (< Germ. denn) + V2. In data from the beginning of the 20th century PG still has verbs depending on weil in final position; dann, however, has been replaced by fer (< Engl. for) + V2. In modern sectarian PG weil is backed up with (d)ass, a historical merger of dass with comparative als, and fer (originally dann < Germ. denn) has been replaced with because + V2.

This development is interesting for several reasons: PG, in the late 19th, early 20th century went through a phase that mirrors present-day English in the distribution between because and for. In modern Pennsylvania German, weil does not seem to be able to function as a subordinator in its own right any longer and it has to be backed up by another complementizer to trigger verb-final placement. This supports rule G2 (section 3), which implicitly proposes a subordinate feature on lexical complementizers. Modern PG seems to have lost this feature and therefore needs to be 'backed up' by another subordinator to trigger verb-final word order.

Dann in modern PG, on the other hand, after having gone through the stage of fer (< Engl. for), is eventually replaced by because, as in my data. This development not only backs up the speculation voiced in section 4.2.1, i.e. that the discrepancy between my and Clyne's (1973) German-English corpora might be due to prolonged language contact, but also the qualitative analysis presented in section 4.2.2.

The WG stipulation of a subordinate feature on German complementizers (rule G3) is furthermore supported by data from another language contact situation with word order contrasts in subordinate clauses: French-Dutch contact in Brussels. The most frequently borrowed subordinator in Brussels Dutch is tandis que. Treffers-Daller (1994: 191) observes that the Dutch equivalent of tandis que, terwijl, is rarely used in her corpus. In those cases that do occur, the Dutch conjunction is followed by the Dutch complementizer dat. Like weil in Pennsylvania German, Brussels Dutch terwijl may also have lost the subordinate feature and require an overt complementizer to trigger verb-final placement.

5. Summary and Conclusion

In section 2 of this chapter I illustrated why syntactic constraints on intra-sentential code-mixing formulated within Phrase Structure Grammar frameworks are empirically not borne out. They are too restrictive because the domain of government, i.e. maximal projections or constituents, was too large, and because of the problematic distinction between lexical and functional categories.

In section 3 I discussed the advantages of WG over other linguistic theories for code-mixing research. They are seen to be:

• Word Grammar requires less analysis than constituency-based models because the only units that need to be processed are individual words. Larger units are built by dependency relations between two words which can be looked at individually.

• As syntactic structure consists of dependencies between pairs of single words, constraints on code-mixing are less prone to over-generalization than constraints involving notions of government and constituency.

• Word Grammar allows a single, completely surface analysis (with extra dependencies drawn below the sentence-words). Code-mixing seems to be a surface-structure phenomenon, so this property of WG fits the data.


• Knowledge of language is assumed to be a particular case of more general types of knowledge. Word Grammar accommodates sophisticated sociolinguistic information about speakers and speech communities. This is important for language contact phenomena that are influenced by social and psychological factors.

• In contrast with most other syntactic theories, Word Grammar recognizes utterances.

• WG is a competence model which can handle inherent variability.

I do not claim that the present work illuminates theories of language structure, but it confronts a linguistic theory, Word Grammar, with statistical data, and shows that this theory of language structure can be successfully and illuminatingly used for the analysis of monolingual and code-mixed constructions. The WG formulation of the null hypothesis is borne out with just a handful of exceptions, and the WG rules determining word order in monolingual German or English and code-mixed clauses also hold.

The investigation of word order in subordinate clauses, furthermore, shows that the null hypothesis seems to be correct even in cases where we would expect restrictions on code-switching due to surface word order differences between the two grammars involved in mixing. The quantitative analysis of monolingual and code-mixed because and weil clauses revealed that a) the core group of informants favour the English causal conjunction because over German weil or denn; the use of weil and denn is restricted to monolingual German contexts, and because is also used to introduce mixed utterances; b) the word order in weil clauses varies between verb final, as required in subordinate clauses, and verb second, the main clause order; the coordinating conjunction denn only occurs once and with main clause order, as expected; mixed clauses introduced by because invariably have verb second structure. Independent research on the syntactic, intonational, semantic and pragmatic properties of monolingual because and weil clauses has shown that these properties cluster to form two main types of causal clauses: restrictive and non-restrictive (Rutherford 1970). The qualitative analysis of the monolingual causal clauses in the corpus revealed that they also fall into these two types and that the mixed utterances introduced by because predominantly have the grammatical properties of non-restrictive clauses. Thus Boumans' (1998: 121) hypothesis that 'foreign conjunctions do not trigger verb-final in German clauses simply because they are used in functions that require main clause order' could be verified. The quantitative analysis of because and weil clauses has furthermore demonstrated how frequency distributions of a specific grammatical pattern in monolingual speech data can be combined with our knowledge about syntactic and pragmatic properties of grammars to handle frequency in bilingual data (Muysken 2000).

The WG analysis of German (and Dutch) lexical subordinators having a 'subordinate' feature which triggers verb-final placement was furthermore supported by data from two other language contact situations (Pennsylvania German and Brussels Dutch) in which certain subordinators seem to have lost this feature and therefore to require 'backing up' from overt complementizers.


References

Belazi, H. M., Rubin, E. J. and Toribio, A. J. (1994), 'Code switching and X-bar theory: The functional head constraint'. Linguistic Inquiry, 25, 221-37.

Bentahila, A. and Davies, E. E. (1983), 'The syntax of Arabic-French code-switching'. Lingua, 59, 301-30.

Bolle, J. (1995), 'Mengelmoes: Sranan and Dutch language contact', in Papers from the Summer School Code-switching and Language Contact. Ljouwert/Leeuwarden: Fryske Akademie, pp. 290-4.

Boumans, L. (1998), The Syntax of Codeswitching: Analysing Moroccan Arabic/Dutch Conversations. Tilburg: Tilburg University Press.

Chafe, W. L. (1984), 'How people use adverbial clauses'. Berkeley Linguistics Society, 10, 437-49.

Chomsky, N. (1981), Lectures on Government and Binding. Dordrecht: Foris.

Clyne, M. G. (1973), 'Thirty years later: Some observations on "Refugee German" in Melbourne', in H. Scholler and J. Reidy (eds), Lexicography and Dialect Geography, Festgabe for Hans Kurath. Wiesbaden: Steiner, pp. 96-106.

— (1987), 'Constraints on code-switching: How universal are they?'. Linguistics, 25, 739-64.

DiSciullo, A.-M., Muysken, P. and Singh, R. (1986), 'Government and code-mixing'. Journal of Linguistics, 22, 1-24.

Eppler, E. (1999), 'Word order in German-English mixed discourse'. UCL Working Papers in Linguistics, 11, 285-308.

— (2004), ''... Because dem Computer brauchst' es ja nicht zeigen': Because + German main clause word order'. International Journal of Bilingualism, 8, 127-43.

— German/English LIDES database <http://talkbank.org/data/LIDES/Eppler.zip>.

Gardner-Chloros, P. (1991), Language Selection and Switching in Strasbourg. Oxford: Clarendon Press.

Green, G. M. (1976), 'Main clause phenomena in subordinate clauses'. Language, 52, 382-97.

Grosjean, F. (1995), 'A psycholinguistic approach to codeswitching', in P. Muysken and L. Milroy (eds), One Speaker, Two Languages. Cambridge: Cambridge University Press, pp. 259-75.

Gumperz, J. J. (1982), Discourse Strategies. Cambridge: Cambridge University Press.

Günthner, S. (1993), ''... weil - man kann es ja wissenschaftlich untersuchen': Diskurspragmatische Aspekte der Wortstellung in weil-Sätzen'. Linguistische Berichte, 143, 37-59.

— (1996), 'From subordination to coordination?'. Pragmatics, 6, 323-56.

Hudson, R. A. (1980), Sociolinguistics. Cambridge: Cambridge University Press.

— (1990), English Word Grammar. Oxford: Blackwell.

— (1997), 'Inherent variability and linguistic theory'. Cognitive Linguistics, 8, 73-108.

— (2000), 'Grammar without functional categories', in R. Borsley (ed.), The Nature and Function of Syntactic Categories. New York: Academic Press, pp. 7-35.

Joshi, A. K. (1985), 'Processing of sentences with intrasentential code-switching', in D. Dowty, L. Karttunen and A. M. Zwicky (eds), Natural Language Parsing. Cambridge: Cambridge University Press, pp. 190-205.

Lehmann, Ch. (1988), 'Towards a typology of clause linkage', in J. Haiman and S. Thompson (eds), Clause Combining in Grammar and Discourse. Amsterdam/Philadelphia: John Benjamins, pp. 181-226.

Louden, M. L. (2003), 'Subordinate clause structure in Pennsylvania German'. FGLS/SGL Joint Meeting. London, 2003.

MacSwan, J. (1999), A Minimalist Approach to Intrasentential Codeswitching. New York and London: Garland.

Mahootian, S. and Santorini, B. (1996), 'Code-switching and the complement/adjunct distinction'. Linguistic Inquiry, 27, 3, 464-79.

Muysken, P. (1989), 'A unified theory of local coherence in language contact', in P. Nelde (ed.), Language Contact and Conflict. Brussels: Centre for the Study of Multilingualism, pp. 123-9.

— (2000), Bilingual Speech: A Typology of Code-Mixing. Cambridge: Cambridge University Press.

Myers-Scotton, C. (1993), Duelling Languages: Grammatical Structure in Code-Switching. Oxford: Oxford University Press.

Myers-Scotton, C. and Jake, J. L. (1995), 'Matching lemmas in a bilingual language competence and production model: Evidence from intrasentential code switching'. Linguistics, 33, 981-1024.

Pasch, R. (1997), 'Weil mit Hauptsatz - Kuckucksei im denn-Nest?'. Deutsche Sprache, 25, 252-71.

Poplack, S. (1980), 'Sometimes I'll start a sentence in Spanish y termino en español'. Linguistics, 18, 581-618.

Poplack, S. and Meechan, M. (1995), 'Orphan categories in bilingual discourse: A comparative study of adjectivization strategies in Wolof/French and Fongbe/French'. Language Variation and Change, 7, 2, 169-94.

Romaine, S. (1989), Bilingualism. Malden, Mass.: Blackwell.

Rutherford, W. E. (1970), 'Some observations concerning subordinate clauses in English'. Language, 46, 97-115.

Salmons, J. (1990), 'Bilingual discourse marking: Code switching, borrowing, and convergence in some German-American dialects'. Linguistics, 28, 453-80.

Sankoff, D. and Poplack, S. (1981), 'A formal grammar for code-switching'. Papers in Linguistics, 14, 3-46.

Schleppegrell, M. J. (1991), 'Paratactic because'. Journal of Pragmatics, 16, 323-37.

Schlobinski, P. (1992), 'Nexus durch weil', in P. Schlobinski (ed.), Funktionale Grammatik und Sprachbeschreibung. Opladen: Westdeutscher Verlag, pp. 315-44.

Scotton, C. M. (1990), 'Code-switching and borrowing: Interpersonal and macro-level meaning', in R. Jacobson (ed.), Codeswitching as a Worldwide Phenomenon. New York: Peter Lang, pp. 85-110.

Sebba, M. (1998), 'A congruence approach to the syntax of codeswitching'. International Journal of Bilingualism, 2, 1-19.

Sweetser, E. (1990), From Etymology to Pragmatics: Metaphorical and Cultural Aspects of Semantic Structure. Cambridge: Cambridge University Press.

Thorne, J. P. (1986), 'Because', in D. Kastovsky and A. Szwedek (eds), Linguistics across Historical and Geographical Boundaries (Vol. 12). Berlin: Mouton, pp. 1063-6.

Treffers-Daller, J. (1994), Mixing Two Languages: French-Dutch Contact in a Comparative Perspective. Berlin: de Gruyter.

Uhmann, S. (1998), 'Verbstellungsvariation in weil-Sätzen'. Zeitschrift für Sprachwissenschaft, 17, 92-139.

Zwicky, A. (1992), 'Some choices in the theory of morphology', in R. Levine (ed.), Formal Grammar: Theory and Implementation. Oxford: Oxford University Press, pp. 327-71.


Notes

1 The corpus was collected in 1993 from German-speaking Jewish refugees residing in London. All transcripts are available on <http://talkbank.org/data/LIDES/Eppler.zip>.

2 Categorial equivalence is 'when the switched element has the same status in the two languages, is morphologically encapsulated, shielded off by a functional element from the matrix language, or could belong to either language' (Muysken 2000: 31).

3 Myers-Scotton and Jake (1995: 985) define system morphemes as morphemes that do not participate in the thematic structure of a sentence, i.e. they are specified as [-theta-role assigner/receiver]. A second feature characteristic of 'most' system morphemes is the feature [+Quantification]. A morpheme has a plus setting for quantification within the Matrix Language Frame model if it restricts possible referents of a lexical category. Myers-Scotton and Jake (1995: 985) give tense and aspect as examples for [+Q]. Tense and aspect restrict the possible reference of predicates (i.e. verbs and adjectives). Prototypical system morphemes are inflections and most function words.

4 The WG approach of incorporating different syntactic properties of WORDS in isa-hierarchies seems more economical and convincing.

5 The Free Morpheme Constraint (Sankoff and Poplack 1981: 5) prohibits switching between a bound morpheme (pre- or suffix) and a lexical form unless the latter has been phonologically integrated into the language of the bound morpheme.

6 Note the similarity of this corollary with the WG null hypothesis this study is based on.

7 Constituency analysis is applied only to coordinate structures.

8 This system implies that code-mixing ought to be less frequent among typologically quite different language pairs.

9 According to the theory of markedness (Scotton 1990: 90), speakers know that for a particular conventionalized exchange in their community, a certain code choice will be the unmarked realization of an expected rights and obligations set between participants. They also know that other possible choices are more or less marked because they are indexical of other than the expected rights and obligations set.

10 Smooth code-switches are unmarked by false starts, hesitations, lengthy pauses, etc.; flagged switches are accompanied by discourse markers and other editing phenomena (Poplack 1980).

11 A database is particularly important for studies of codes that do not have 'native' speakers who can provide fairly reliable grammaticality judgements. A corpus is also an essential test for the constraints on and models of code-mixing.

12 An alternative analysis of this example would be that it is ambiguous, i.e. it conforms to two different models. The stem conforms to the English phonological model and the suffix conforms to the German plural suffix; i.e. it is a morphologically integrated borrowed stem.

13 The figures for individual cases need not be the same; cases of lexical diffusion would seem to suggest the contrary (Hudson 1980: 168ff). And presumably the entrenchment value for the general rule in such cases could be different from all the individual rules.

14 Default inheritance rules apply to the few English constructions in which the complement comes before the head.

15 These rules are not intended to cover scrambling, double infinitive constructions and other well-known word order intricacies of German.


16 The term 'late' was chosen instead of 'final' because finite dependent auxiliaries in double infinitive constructions can be followed by their non-finite dependents; cf. endnote 15.

17 Support for this analysis comes from the fact that German subordinate clauseslacking a subordinator/complementizer are V2 (or verb initial). Cf.:

Sie sagte, sie kennen Doris    vs.    Sie sagte, daß sie Doris kennen
She said they know Doris              She said that they Doris know

According to G3, it is only subordinators/complementizers that select 'late' finite verbs. So if a verb depends directly on another verb (kennen directly depending on sagte and not daß), the default rule need not be overridden.

18 Exceptions to this rule are extraposition and double-infinitive constructions.

19 The null hypothesis is violated in five tokens of two construction types: word-order violations of objects and negatives (see Eppler 2004).

20 The data this study is based on are transcribed in the LIDES (Language Interaction Data Exchange) system. More information on the transcription system can be found on <www.ling.lancs.ac.uk/staff/mark/lipps/>.

21 See for example Clyne (1987), Gardner-Chloros (1984), Salmons (1990), Treffers-Daller (1994).

22 Example (24) is an incomplete subordinate clause. This does not affect the analysis because the word-order position of the relevant finite dependent verb is clear.

23 Since all my informants are from Vienna, I used only examples from the ten Viennese informants for the Brigham Young University (BYU) corpus. Farrar (1998) counted all occurrences of weil in the speakers of southern German dialects from the BYU corpus. Schlobinski's (1992) data are standard Bavarian; and the Uhmann (1998) corpus is 'alemannisch-bairisch'.

24 Lehmann (1988) suggests that for clauses that are linked in a relationship of sociation rather than dependency, 'parataxis' is a more appropriate term than 'coordination'.

25 Two clauses (X and Y) have been defined as being in a subordination relationship 'if X and Y form an endocentric construction with Y as the head' (Lehmann 1988: 182).

26 Note that in the English literature, Rutherford (1970) and Thorne (1986), the comma intonation is assumed to precede the conjunction. Schleppegrell (1991: 333) mentions the possibility of because followed by a pause.


7 Word Grammar Surface Structures and HPSG Order Domains*

TAKAFUMI MAEKAWA

Abstract
In this chapter, we look at three different approaches to the asymmetries between main and embedded clauses with respect to the elements in the left periphery of a clause: the dependency-based approach within Word Grammar (Hudson 2003), the Constructional Head-driven Phrase Structure Grammar (HPSG) approach along the lines of Ginzburg and Sag (2000), and the Linearization HPSG analysis by Chung and Kim (2003). We argue that the approaches within WG and Constructional HPSG have some problems in dealing with the relevant facts, but that Linearization HPSG provides a straightforward account of them. This conclusion suggests that linear order should be independent to a considerable extent from combinatorial structure, such as dependency or phrase structure.

1. Introduction

There are two ways to represent the relationship between individual words: DEPENDENCY STRUCTURE and PHRASE STRUCTURE. The former is a pure representation of word-word relationships, while the latter includes the additional information that words are combined to form constituents. If all the work can be done just by means of the relationship between individual words, phrase structure is redundant and hence dependency structure is preferable to it. It would therefore be worth considering whether all the work can really be done with just dependencies. We will look from this perspective at certain linear order asymmetries between main clauses and subordinate clauses. One example of such asymmetries can be seen in the contrast of (1) and (2). The former shows that a topic can precede a fronted wh-element in a main clause:

(1) a. Who had ice-cream for supper?
    b. For supper who had ice-cream?

(2) illustrates, however, that this is not possible in an embedded clause:

(2) a. Who had ice-cream for supper is unclear.
    b. *For supper who had ice-cream is unclear.


It is clear that main clauses are different from subordinate clauses with respect to the possibility of topicalization. It has been noted by a number of researchers that elements occurring in the left periphery of the clause, such as interrogative and relative pronouns, topic and focused elements, show such linear order asymmetries (see Haegeman 2000; Rizzi 1997; and works cited therein).

The purpose of this overview chapter is to take a critical look at the current treatment of such asymmetries within the frameworks of WORD GRAMMAR (WG) and HEAD-DRIVEN PHRASE STRUCTURE GRAMMAR (HPSG), and ask how they should be represented in the grammar. We compare the WG approach developed in Hudson (2003; see also Hudson 1995, 1999) with two relatively recent versions of HPSG: what can be called CONSTRUCTIONAL HPSG, in which grammars include hierarchies of phrase types (Sag 1997; Ginzburg and Sag 2000), and so-called LINEARIZATION-BASED HPSG (or LINEARIZATION HPSG), in which linear order is independent to a considerable extent from phrase structure and is analysed in terms of a separate level of 'ORDER DOMAINS' (Pollard et al. 1994; Reape 1994; Kathol 2000, etc.).1 It will be argued that the WG and Constructional HPSG approaches have some problems, but that Linearization HPSG can provide a straightforward account of the facts.

The organization of this chapter is as follows. In the next section we consider how a WG approach might accommodate the asymmetry between main and subordinate wh-interrogatives. Section 3 then looks at a Constructional HPSG analysis along the lines of Ginzburg and Sag (2000). In section 4 we shall outline a Linearization HPSG analysis developed by Chung and Kim (2003). In the final section, we offer some concluding remarks.

2. A Word Grammar Approach

Before looking at the WG analysis of the phenomenon under discussion, we should briefly outline how word order, wh-constructions and extraction are treated in WG. In WG, word order is controlled by two kinds of rule: general rules that control the geometry of dependencies, and word-order rules that control the order of a word in relation to other word(s): its LANDMARK(S) (Hudson 2005). In simple cases a word's landmark is its PARENT: the word it depends on. In cases where a word has more than one parent, only the 'higher' parent becomes its landmark (the PROMOTION PRINCIPLE; see Hudson 2003a). For example, let us consider the sentence It was raining. The raised subject it depends on two verbs, was and raining, so it has two parents. In this case, the eligible landmark is was. This is because raining depends on was, so the latter is the 'higher' of the two. In WG notation, It was raining is represented as shown below.

(3)

Page 166: Word Grammar (Richard Hudson)

WG SURFACE STRUCTURES AND HPSG ORDER DOMAINS 147

The fact that it is the subject of the two verbs is indicated by the two arrows labelled 's' (subject). A further arrow indicates that raining is a 'SHARER' of was, so named since it shares the subject with the parent verb. In the notation adopted here, the dependencies that do not provide landmarks are drawn below the words. Therefore, one of the subject arrows, the one from raining to it, is drawn below the words. We thus pick out a sub-set of the total dependencies of a sentence and draw them above the words. This sub-set is called the SURFACE STRUCTURE. Word-order rules apply to it, and determine the positioning of a word in relation to its landmark or landmarks. Thus, the surface structure comprises the dependencies which are relevant for determining word order. One word-order rule specifies that a subject normally precedes its landmark, and another specifies that a sharer normally follows its landmark, as illustrated by the representation in (3).

Among the rules that control the surface structure, the NO-TANGLING PRINCIPLE is the most important for our purposes: dependency arrows in the surface structure must not tangle.2 This principle excludes the ungrammatical sentence (4b):

(4) a. He lives on green peas.
    b. *He lives green on peas.

The dependency structures of this pair are shown in (5):

(5)

(5b) includes tangling of the arrows. Its ungrammaticality is predicted by the No-Tangling Principle.
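The No-Tangling Principle amounts to a projectivity check on dependency arcs, and it is easy to state computationally. The following Python fragment is our own illustration, not part of WG's formal apparatus: words are numbered by position, each dependency is a (parent, dependent) pair, and two arcs tangle when their spans properly interleave.

```python
def tangled(arcs):
    """Return True if any two dependency arcs cross (tangle).

    Each arc is a (parent, dependent) pair of word positions.
    Arcs that merely share an endpoint do not count as crossing.
    """
    spans = [tuple(sorted(arc)) for arc in arcs]
    for i, (a, b) in enumerate(spans):
        for c, d in spans[i + 1:]:
            if a < c < b < d or c < a < d < b:
                return True
    return False

# (5a) 'He lives on green peas': he(0) lives(1) on(2) green(3) peas(4)
#      arcs lives->he, lives->on, on->peas, peas->green: no tangling
print(tangled([(1, 0), (1, 2), (2, 4), (4, 3)]))   # False

# (5b) '*He lives green on peas': he(0) lives(1) green(2) on(3) peas(4)
#      now the arc lives->on crosses the arc peas->green
print(tangled([(1, 0), (1, 3), (3, 4), (4, 2)]))   # True
```

The pairwise crossing test (a < c < b < d or c < a < d < b) is the standard definition of non-projectivity in dependency parsing; WG's principle restricts it to the arcs placed in surface structure.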

Let us turn to the WG treatment of wh-interrogatives. Consider the dependency structure of What happened?, for example. As in the case of an ordinary subject such as it in (3), the grammatical function of the wh-pronoun what to the verb happened is subject. Therefore, what depends on happened, and this situation can be represented as follows.

(6)

Page 167: Word Grammar (Richard Hudson)

148 WORD GRAMMAR: PERSPECTIVES ON LANGUAGE STRUCTURE

On the other hand, Hudson (1990: 361-82; 2003) argues that the verb is a complement of the wh-pronoun and thus depends on it:

(7)

The evidence for the headedness of the wh-pronoun includes the following phenomena (Hudson 2003). First, the pronoun can occur without the verb in sluicing constructions:

(8) a. Pat: I know he's invited a friend. Jo: Oh, who [has he invited]?
    b. I know he's invited a friend, but I'm not sure who [he's invited].

Second, the pronoun is what is selected by the higher verb. In (9) wonder and sure require a subordinate interrogative clause as their complement. For a clause to be a subordinate interrogative, the presence of either a wh-pronoun or whether or if is required.

(9) a. I wonder *(who) came.
    b. I'm not sure *(what) happened.

Third, the pronoun selects the verb's characteristics, such as finiteness and whether or not it is inverted. (10) illustrates that why selects either a finite or an infinitive verb as its complement, but when only selects a finite verb:

(10) a. Why/when are you glum?
     b. Why/*when be glum?

(11) indicates that why selects an inverted verb as its complement whereas how come selects a non-inverted verb:

(11) a. Why are you so glum?
     b. *Why you are so glum?
     c. *How come are you so glum?
     d. How come you are so glum?

(12) illustrates that what, who and when select a to-infinitive, but why does not:

(12) I'm not so sure what/who/when/*why to visit.

Hudson (2003) argues that all of these phenomena are easily accounted for if the wh-pronoun is a parent of the next verb. In the framework of WG, therefore, there is no reason to rule out either of (6) and (7); the sentence is syntactically ambiguous. Thus, in What happened?, what and happened depend on each other, and the dependency structure may be either of (13a) and (13b):

Page 168: Word Grammar (Richard Hudson)

WG SURFACE STRUCTURES AND HPSG ORDER DOMAINS 149

(13) a.

b.

Thus, wh-interrogatives may involve a mutual dependency. In (13b), happened is the parent and the dependency labelled 's' is put in surface structure. In (13a), however, what is the parent, and the dependency labelled 'c' (complement) is put in surface structure.

Finally, we outline how extraction is dealt with in WG. Let us consider (14a), with a preposed adjunct in sentence-initial position:

(14) a. Now we need help.
     b. We need help now.

The preposed adjunct now would otherwise follow its parent need, as in (14b), but here it precedes it. This situation is represented in WG by adding an extra dependency, 'EXTRACTEE', to now.

(15)

The arrow from need to now is labelled 'x<, >a', which means 'an adjunct which would normally be to the right of its parent (">a") but which in this case is also an extractee ("x<")'. Thus the adjunct now is to the left of the parent verb need.

With this background in mind, let us now turn to the asymmetry between main and subordinate clauses in question: adverb-preposing is not possible in subordinate interrogatives although it is possible in main interrogatives.

(16) a. Now what do we need?
     b. *He told us now what we need.

As stated above, a wh-pronoun and its parent are mutually dependent. In (16a)

Page 169: Word Grammar (Richard Hudson)

150 WORD GRAMMAR: PERSPECTIVES ON LANGUAGE STRUCTURE

do is the complement of what whereas what is the extractee of do. Thus, the dependency structure for (16a) would be either of (17a) and (17b). In the former, what is the parent, and the dependency labelled 'c' from what to do is put in surface structure. In the latter, however, do is the parent and the dependency labelled 'x<' from do to what is put in surface structure. The preposed adjunct now is labelled 'x<, >a', and precedes its parent do. As the diagram shows, the 'x<, >a' arrow from do to now tangles with the vertical arrow in (17a). Thus, it violates the No-Tangling Principle. On the other hand, there is no tangling in (17b), so it is the only correct WG analysis of (16a).

(17) a.

b.

Let us turn to the subordinate wh-interrogative in (16b). In (16b) what is the object and the extractee of need, while need is the complement of what. It has the structure represented in (18). What is the clause's subordinator, and it has to be the parent of the subordinate clause. The dependency labelled 'c' should be put in surface structure, since if the arrow labelled 'x<, o' were in the surface structure, what would have two parents and violate the NO-DANGLING PRINCIPLE: words should not have more than one parent in surface structure (Hudson 2005). As the diagram shows, the arrow from need to now is tangled with the one from told to what. Unlike the main clause case in (17), it has no alternative structure, so (16b) is ungrammatical.

(18)

Page 170: Word Grammar (Richard Hudson)

WG SURFACE STRUCTURES AND HPSG ORDER DOMAINS 151

Thus, WG can capture the linear order asymmetries of main and subordinate clauses in terms of dependencies in surface structure and general principles on dependencies.
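The two surface-structure principles at work here, No-Tangling and No-Dangling, can be combined into a single well-formedness check on a candidate surface structure. The sketch below is our own illustrative encoding, not WG notation: word positions and arcs are assumptions, and the 'vertical arrow' to the root is modelled as crossed by any arc that spans over the root word.

```python
def valid_surface(arcs, root):
    """Check a candidate surface structure: no word has two parents
    (No-Dangling), no arc spans over the root word (tangling with the
    vertical arrow), and no two arcs cross (No-Tangling)."""
    dependents = [d for _, d in arcs]
    if len(dependents) != len(set(dependents)):   # some word has two parents
        return False
    spans = [tuple(sorted(arc)) for arc in arcs]
    for lo, hi in spans:
        if lo < root < hi:                        # crosses the vertical arrow
            return False
    for i, (a, b) in enumerate(spans):
        for c, d in spans[i + 1:]:
            if a < c < b < d or c < a < d < b:    # two arcs tangle
                return False
    return True

# 'Now what do we need?': now(0) what(1) do(2) we(3) need(4)
# (17b): do is root; surface arcs do->what, do->we, do->need, do->now
print(valid_surface([(2, 1), (2, 3), (2, 4), (2, 0)], root=2))  # True
# (17a): what is root; the arc do->now spans over it, i.e. tangles
# with the vertical arrow, so this candidate is rejected
print(valid_surface([(1, 2), (2, 3), (2, 4), (2, 0)], root=1))  # False
```

Under this toy encoding, (16a) is grammatical because at least one candidate surface structure passes the check, while for (16b) no candidate does, mirroring the argument in the text.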

Although the WG analysis looks successful in accommodating the asymmetry between main and subordinate clauses, there are some weaknesses. As surveyed above, the WG approach states that adjunct preposing is possible out of main wh-interrogatives because the preposed adjunct avoids violation of the No-Tangling Principle, due to the fact that it is a co-dependent of the wh-element (Hudson 2003: 636). The argument along these lines would suggest that extraction is allowed as long as it does not violate the No-Tangling Principle. However, there are cases in which extraction out of an embedded wh-interrogative is excluded although it does not violate the No-Tangling Principle. The data come from the SUBJECT-AUXILIARY INVERSION (SAI) structures illustrated by (19).3

(19) Under no circumstances would I go into the office during the vacation.

In WG, a preposed operator of an SAI clause, such as under no circumstances in (19), is a kind of extractee (Hudson 2005), so we should expect it to behave like a preposed adjunct. As expected, SAI operators cannot be extracted out of subordinate wh-interrogatives, as illustrated by the following examples. A WG approach would suggest that this is due to the No-Tangling Principle.

(20) a. *Lee wonders under no circumstances at all why would Robin volunteer.

     b. *I wonder only with great difficulty on which table would she put the big rock.
(Chung and Kim 2003)

With this in mind, let us consider the main wh-interrogative clause. A WG approach would predict that preposing of an SAI operator is possible out of main wh-interrogatives, because it should not involve violation of the No-Tangling Principle, as in the case of adjunct preposing. However, it is actually ungrammatical.

(21) a. *In no way, why would Robin volunteer?
     b. *Only with great difficulty on which table would she put the big rock?

(Chung and Kim 2003)

Here the preposed SAI operator precedes the wh-element. Note that the situation is completely parallel to the case of adjunct preposing like (16a), which is repeated here.

(22) Now what do we need?

As we have seen, (22) is grammatical because it does not violate the No-Tangling Principle. However, the sentences in (21) are ungrammatical though they do not violate the same principle. This makes the analysis in terms of the No-Tangling Principle less plausible.

As we saw at the outset of this section, word order is controlled by two kinds of rule in WG: general rules, such as the No-Tangling Principle, that control the geometry of dependencies; and word-order rules that control the order of a word in relation to its landmark or landmarks (Hudson 2005). Someone might suggest that the No-Tangling Principle is simply irrelevant in (21) and that a word-order rule could exclude the ill-formed order. However, there are some problems with this approach. Let us suppose that WG has a rule which excludes the OP(ERATOR) < WH order. It is natural to suggest that the same rule could apply not only to main clauses but also to subordinate clauses. As predicted, subordinate clauses with the same elements in the same order as (21) are ungrammatical. This is actually illustrated by (20) above. Now we should recall that they can also be excluded by the No-Tangling Principle; a preposed operator is extracted from the subordinate wh-interrogative clause. The situation is entirely parallel to (16b). A question arises: for which reason are the sentences in (20) excluded, by the No-Tangling Principle or by a word-order rule? If we took the first option, then the ungrammaticality of (20) would be accounted for by the No-Tangling Principle, whereas that of the corresponding main clauses would be explained by a word-order rule. If we took the second option, then both main and subordinate clauses would be excluded by a word-order rule. It is clear that we cannot take the first option: it forces the word-order rule to refer to main clauses. Note that WG does not have a unit larger than the word, so it does not recognize clauses (Hudson 2005). It does not, therefore, have a way to distinguish main and subordinate clauses, apart from the assumption that the latter has a subordinator and a parent outside of the clause (Hudson 1990: 375-6).
It is, then, impossible for WG rules to refer to any clause. What about the second option, a word-order rule approach, where each ill-formed case is excluded by a rule which bans a particular word order? It could indeed account for the ungrammaticality of the OP < WH order in both main and subordinate clauses. However, this option also has a problem, to which we turn in the following paragraph.

Consider the following pair, which shows another asymmetry between main and subordinate clauses:

(23) a. *Why, in no way would Robin volunteer?
     b. Lee wonders why under no circumstances would Robin volunteer.

A wh-extractee precedes a preposed negative operator in the main clause in (23a), and it is ungrammatical. However, the same order is allowed in the subordinate clause, as in (23b). There is clearly an asymmetry between a main and a subordinate clause. The same sort of asymmetry can be observed in the case of a wh-extractee and a topic extractee as well. In (24), a wh-extractee precedes a topic extractee in main clauses, and the results are ungrammatical:


(24) a. *To whom, a book like this, would you give?
(Koizumi 1995)
     b. *For what kind of jobs during the vacation would you go into the office?
(Baltin 1982)

On the other hand, the same permutation of a wh-element and a topic is allowed in subordinate clauses, as in (25):

(25) a. the man to whom, liberty, we could never grant.
     b. ?I wonder to whom this book, Bill should give.
(Chung and Kim 2003)
     c. I was wondering for which job, during the vacation, I should go into the office.

Here we have yet another asymmetry between a main and a subordinate clause. Our observation in (23)-(25) indicates that word order which is grammatical in subordinate clauses is ungrammatical in main clauses. Note that the No-Tangling Principle cannot exclude the ungrammatical cases, since they are all in main clauses. Therefore, the only option we have is to specify word-order rules to exclude the ill-formed cases. Now the same problem as with (21) and (20) arises again. Such word-order rules would have to state that they apply to a main clause but not to a subordinate clause. However, it is impossible for WG rules to refer to a clause, since WG does not recognize any unit larger than the word. Thus, we cannot adopt a word-order rule approach, either.

We have pointed out that the No-Tangling Principle is not effective enough to accommodate the cases of preposing of an SAI operator, another asymmetry between main and subordinate wh-interrogatives. Recall that the most important assumption for a WG approach is that the wh-pronoun is the parent of the subordinate wh-interrogative. We should note that this assumption itself is not without problems. Consider the examples in (26a) and (26b); the former is cited by Hudson himself as problematic data for his analysis (Hudson 1990: 365):4

(26) a. Which students have failed is unclear.
     b. Who shot themselves is unclear.

In the WG treatment of wh-pronouns, which and who are not only the subjects of have and shot, respectively, but also the subject of is. A verb should agree in number with its subject, so have/shot and is should both agree with which/who. Which in (26a) should share its plurality with students, since the former is a determiner of the latter; who in (26b) should share its plurality with themselves, since the former is the antecedent of the latter. This does not explain the morphology of the copula verb in both sentences, which requires a singular subject. The analysis would predict sentences like the following to be grammatical:

(27) a. *Which students have failed are unclear.
     b. *Who shot themselves are unclear.


The copular verb is are, not is, agreeing with its subject which in (a) and who in (b). These sentences are, however, ungrammatical. Thus, the assumption that the wh-pronoun is the parent of the subordinate interrogative has a weakness.

We should also note that there are some cases where an extractee is allowed to precede the complementizer. The following examples are from Ross (1986):

(28) a. Handsome though Dick is, I'm still going to marry Herman.
     b. The more that you eat, the less that you want.

In (28a), the first clause is the subordinate clause, and the adjective handsome, a complement of is, is in front of the complementizer though. In (28b) the more, which is an object of eat and want, is followed by the complementizer that.5 It would be natural to assume the fronted elements in these examples to be extractees in WG's terms; but if so, the dependency arrow from the verb to the extractee would tangle with the vertical arrow to the complementizer, and hence the resulting structure in (29) violates the No-Tangling Principle.6

(29)

It seems, then, that a WG approach to the asymmetry between main and subordinate wh-interrogatives has some problems.

3. An Approach in Constructional HPSG: Ginzburg and Sag 2000

We will now consider how adjunct preposing in main and subordinate wh-interrogatives might be accommodated within the framework of HPSG. In HPSG, lexical and phrasal descriptions are formulated in terms of FEATURE STRUCTURES like (30):

(30)


The value of the feature PHONOLOGY (PHON) represents the phonological information of a sign. The value of SYNTAX-SEMANTICS (SYNSEM) is of type synsem, a feature structure containing syntactic and semantic information. The SLASH feature represents information about long-distance dependencies, which we will consider further below. The value of LOCAL (LOC) contains the subset of syntactic and semantic information shared in long-distance dependencies. The syntactic properties of a sign are represented under the path SYNSEM|LOC|CAT(EGORY). The HEAD value contains information standardly shared between a phrase and its head, such as part of speech. The semantic properties of a sign are represented under SYNSEM|LOC|CONT(ENT). The value of ARG-ST (ARGUMENT-STRUCTURE) is a list of synsem objects corresponding to the dependents which a lexical item selects for, including certain types of adverbial phrases (Abeille and Godard 1997; Bouma et al. 2001; Kim and Sag 2002; van Noord and Bouma 1994; Przepiorkowski 1999a, 1999b).

Sag (1997) and Ginzburg and Sag (2000) hypothesize that a rich network of phrase-structure constructions with associated constraints is part of the grammars of natural languages. The hierarchies allow properties that are shared between different phrasal types to be spelled out just once. The portion of the hierarchy which will be relevant to adjunct preposing is represented in (31).

(31)

Phrases are classified along two dimensions: clausality and headedness. The clausality dimension distinguishes various kinds of clauses. Clauses are subject to the constraint that they convey a message. Core clauses are one subtype, which are defined not to be modifiers and to be headed by finite verbal forms or the auxiliary to. The headedness dimension classifies phrases on the basis of their head-dependent properties, i.e. whether they are headed or not, what kind of daughters they have, etc. A general property of headed phrases (hd-ph) is the presence of a head daughter, and this phrasal type is constrained as follows:

(32) Generalized Head Feature Principle (GHFP)
hd-ph: [SYNSEM /[1]] → ... H[SYNSEM /[1]] ...

The GENERALIZED HEAD FEATURE PRINCIPLE (GHFP) states that the SYNSEM value of the mother of a headed phrase and that of its head daughter should be identical by default. A subtype of hd-ph, head-filler-phrase (hd-fill-ph), is associated with the following constraint:

(33) hd-fill-ph:

This constraint requires the following properties. First, the head daughter must be a verbal projection. Second, one member of the head daughter's SLASH set is identified with the LOCAL value of the filler daughter. Third, any other elements in the head daughter's SLASH must constitute the SLASH value of the mother. Ginzburg and Sag (2000) treat topicalization constructions as a subtype of hd-fill-ph, and posit a type topicalization-clause (top-cl). It is also assumed to be a subtype of core-cl. The type top-cl is subject to a construction-particular constraint which takes the following form:

(34) top-cl:

Topicalized clauses have an independent ([IC +]) finite clause as head daughter. Consider (35) for example:

(35) a. Problems of this sort, our analysis would never account for.
     b. *She subtly suggested [problems of this sort, our analysis would never account for].
(Ginzburg and Sag 2000: 50)

The topicalized sentence in (35a) is an independent clause (i.e. [INDEPENDENT-CLAUSE (IC) +]), hence its head daughter our analysis would never account for has [IC +]. A clause has the [IC −] specification in an embedded environment, and hence the embedded clause in (35b) is [IC −].


Topicalization of such a clause is ruled out by (34). The filler daughter of the topicalized clause is constrained to be [WH {}], the effect of which is to prevent any wh-words from appearing as the filler or as an element contained within the filler. The constraints introduced above are unified to characterize the topicalized clause constructions.

Given the above constraints, a sentence with a preposed adjunct will have something like the following structure (Bouma et al. 2001; Kim and Sag 2002):

(36)

As noted above, certain types of adverbial phrases are selected by the verbal head and listed in the ARG-ST list, along with true arguments. Thus, adjunct preposing and standard cases of topicalization can be given a unified treatment. The ARG-ST of the verb visit thus contains an adverbial element, whose synsem is specified as a gap-ss. Gap-ss, a subtype of synsem, is specified to have a nonempty value for the feature SLASH. Its LOC value corresponds to its SLASH value, as indicated by the shared value [1]. The ARGUMENT REALIZATION PRINCIPLE ensures that all arguments, except for a gap-ss, are realized on the appropriate valence list (i.e. SUBJ(ECT), COMP(LEMENT)S or SP(ECIFIER)), and hence are selected by a head. Note that in (36) the gap-ss in the ARG-ST list of visit does not appear in a COMPS list. The nonempty


SLASH value is incorporated into the verb's SLASH value.7 The verb's SLASH value is projected upwards in a syntactic tree from the head daughter to the mother, due to the GHFP. The termination of this transmission, which is effected by subtypes of the hd-fill-ph constructions, occurs at an appropriate point higher in the tree: a dislocated constituent specified as [LOC [1]] combines with the head that has the property specified in the constraint for hd-fill-ph in (33).
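The SLASH mechanism just described, with lexical introduction of a gap, default percolation from head daughter to mother via the GHFP, and termination by the head-filler construction, can be sketched as set-passing. This is our own simplification (plain Python sets standing in for sets of LOCAL values), not Ginzburg and Sag's formalism.

```python
def ghfp_mother(head_slash):
    """GHFP default: the mother shares the head daughter's SLASH value."""
    return set(head_slash)

def head_filler(filler_loc, head_slash):
    """hd-fill-ph: the filler's LOCAL value must match a member of the
    head's SLASH set; the mother's SLASH is whatever remains."""
    if filler_loc not in head_slash:
        raise ValueError("filler does not match any gap in SLASH")
    return head_slash - {filler_loc}

# 'For supper, who had ice-cream __': the verb's adverbial gap-ss puts
# its LOCAL value into SLASH, which percolates up and is bound by the filler.
slash = {"for-supper"}        # introduced by the gap-ss argument of the verb
slash = ghfp_mother(slash)    # VP inherits the verb's SLASH
slash = ghfp_mother(slash)    # S inherits the VP's SLASH
slash = head_filler("for-supper", slash)
print(slash)                  # set() -- the long-distance dependency is terminated
```

The key design point mirrored here is that percolation is the default case (one function applied at every headed node), while termination is the marked case handled by a dedicated construction.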

Now we can consider how this approach might accommodate the asymmetry between main and subordinate wh-interrogatives. The data observed in the last section can be summarized as (37):

(37) Distribution of SAI operator, wh-element and topic (based on Chung and Kim 2003)

                Main clause      Embedded clause
    TOP < WH    ok (16a)         * (16b)
    WH < TOP    * (24)           ok (25)
    OP < WH     * (21)           * (20)
    WH < OP     * (23a)          ok (23b)

We will begin with the asymmetry in terms of the interaction of a topic and a wh-element. The relevant data are repeated here for convenience, with labels and brackets added:

(38) a. [S1 Now [S2 what do we need]]?
     b. *He told us [S1 now [S2 what we need]].

S1 is composed of the topic filler and the clausal head, S2. S2 of the two sentences in (38) is of the type ns-wh-int-cl. What we need to do is to check the compatibility of a clause of the type top-cl and one of the type ns-wh-int-cl, with the latter being the head of the former. We saw above that clauses of the type top-cl are constrained by various constraints; the unification of the constraints is represented as follows:

(39)

Of note here is that the LOC value of the mother [2] is shared with that of the head, due to the GHFP (32). This means that the head daughter, in this case a clause of the type ns-wh-int-cl, should have a finite verb as its head and its IC value is +. According to the hierarchy in (31), a clause of this type is characterized as the unification of core-cl, int-cl, hd-ph, hd-fill-ph, wh-int-cl and ns-wh-int-cl. The following structure is the result of the unification, but is simplified, with the details irrelevant to the discussion omitted:

(40)

The shared value between the features IC and INV(ERTED) guarantees that if a clause of this type is inverted ([INV +]) then its IC value is +, that is, it appears in a main clause; if it is uninverted ([INV −]) then it should be in an embedded clause ([IC −]). The S2 of (38a), the head daughter of the whole clause, is inverted, so its INV value is +, and hence its IC value is +. This satisfies the requirement stated in (39) that the head daughter of a topicalization construction is an independent clause.

The S2 in (38b) is an instance of ns-wh-int-cl as in the previous case, but it is not inverted (i.e. [INV −]) in this case. The S2 should then be specified as [IC −], due to constraint (40). As we saw above, the head daughter of a topicalization construction should be [IC +]. This is the reason why the embedded interrogative does not allow topicalization. Under Ginzburg and Sag's (2000) analysis, the asymmetry between main and subordinate wh-interrogatives in terms of adjunct preposing is thus due to a conflict between the requirements of the topicalization construction and the embedded interrogative construction: the former requires [IC +] while the latter is specified as [IC −].
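The interaction of the two constraints, top-cl demanding an [IC +] head daughter and ns-wh-int-cl identifying its IC and INV values, can be restated as a simple compatibility check. The encoding below is our own illustration of the reasoning, with booleans standing in for the +/− feature values.

```python
def ns_wh_int_cl(inverted):
    """ns-wh-int-cl identifies IC with INV: inverted clauses are
    independent ([IC +]), uninverted ones embedded ([IC -])."""
    return {"IC": inverted, "INV": inverted}

def topicalization_ok(head_clause):
    """top-cl requires its head daughter to be an independent clause,
    i.e. [IC +]."""
    return head_clause["IC"]

# (38a) 'Now what do we need?' -- inverted, hence [IC +]: topic licensed
print(topicalization_ok(ns_wh_int_cl(inverted=True)))    # True
# (38b) '*He told us now what we need' -- uninverted, [IC -]: topic blocked
print(topicalization_ok(ns_wh_int_cl(inverted=False)))   # False
```

The check makes the shape of the account explicit: the asymmetry falls out of feature identity (IC = INV) plus one construction-particular requirement, with no reference to linear order at all.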

We will move on to the data problematic for a WG approach. Let us first consider how Ginzburg and Sag's (2000) approach might deal with the asymmetry in terms of the order WH < TOPIC. As we observed in (24), the WH < TOPIC order is ungrammatical in main clauses. The data are repeated in (41), with square brackets and labels added:

(41) a. *[S1 To whom, [S2 a book like this, would you give?]]
     b. *[S1 For what kind of jobs [S2 during the vacation would you give into the office?]]

As we observed in (25), however, the same order is acceptable in subordinate clauses. The data is repeated in (42):

(42) a. the man [S1 to whom, [S2 liberty, we could never grant]]
     b. ?I wonder [S1 to whom [S2 this book, Bill should give.]]
     c. I was wondering [S1 for which job, [S2 during the vacation, I should give into the office.]]


160 WORD GRAMMAR: PERSPECTIVES ON LANGUAGE STRUCTURE

In (41) and (42), S1 is an instance of ns-wh-int-cl, and its head daughter S2 is of the type top-cl, so what we need to do is to check the compatibility of top-cl as a head of ns-wh-int-cl. (40) states that the CAT value of ns-wh-int-cl should be shared by that of its head, top-cl in this case. The v and clausal specifications are compatible with those of top-cl. S1 in the sentences in (41) has [IC +] since it is a main clause. Its clausal head S2, therefore, should also have [IC +], according to (40). This indicates that the feature structure description given for the head daughter does not violate the constraint for top-cl in (39). Thus, their analysis makes the wrong prediction that the sentences in (41) are grammatical.

Since subordinate interrogatives cannot appear independently, S1 in (42) has the [IC -] specification, and so does its head daughter top-cl. As we saw above, however, top-cl has [IC +]. Therefore, the ungrammaticality of (42) is predicted; we have the wrong prediction again.

We will next turn to the interaction of the preposed operator of SAI clauses and a wh-element. As we observed in (20) and (21), the OP < WH order is excluded in both main and subordinate clauses. We also observed in (23) that the WH < OP order is excluded in main clauses, but grammatical in subordinate clauses. The relevant data is repeated here in (43) and (44), with square brackets and labels added for expository purposes:

(43) a. *[S1 In no way, [S2 why would Robin volunteer]]?
     b. *I wonder [S1 only with great difficulty [S2 on which table would she put the big rock]].

(44) a. *[S1 Why, [S2 in no way would Robin volunteer]]?
     b. Lees wonders [S1 why [S2 under no circumstances would Robin volunteer]].

It is not clear exactly what sort of constraints preposed operators must satisfy in Ginzburg and Sag's (2000) system, but it is clear that the S2 in (43a, b) and the S1 in (44a, b) are clauses of the type ns-wh-int-cl. Therefore, they should at least satisfy constraint (40). As we saw above, this constraint guarantees that the clause of this type is inverted ([INV +]) if it is in a main clause ([IC +]) and that it is uninverted ([INV -]) if it is in an embedded clause ([IC -]). All the occurrences of ns-wh-int-cl in (43) and (44) are inverted, so they all should be independent ([IC +]), and that means they cannot appear in subordinate clauses. This correctly predicts that (43b) is ungrammatical, but it is problematic for (44b); we have here an example of a clause of the type ns-wh-int-cl which appears in a subordinate clause ([IC -]) but is inverted ([INV +]). Nothing in Ginzburg and Sag's (2000) constraints rules out the (a) examples in (43) and (44).

It seems, then, that an approach to the asymmetry between main and subordinate wh-interrogatives within the framework of Ginzburg and Sag (2000) has some problems.

4. A Linearization HPSG Approach

The analysis of English left peripheral elements given by Chung and Kim (2003) is based on a version of HPSG known as linearization-based HPSG.


In this framework, word order is determined not at the level of the local tree, but in a separate level of 'order domains', an ordered list of elements that contain at least phonological and categorial information (see, e.g., Pollard et al. 1994; Reape 1994; and Kathol 2000). The list can include elements from several local trees. Order domains are given as the value of the attribute DOM(AIN). At each level of syntactic combination, the order domain of the mother category is computed from the order domains of the daughter constituents. The domain elements of a daughter may be COMPACTED to form a single element in the order domain of the mother, or they may just become elements in the mother's order domain. In the latter case the mother has more domain elements than the daughters. For example, let us consider the following representation for the sentence Is the girl coming? (Borsley and Kathol 2000):

(45)

The VP is coming has two daughters and its domain contains two elements, one for is and one for coming. The top S node also has two daughters, but its order domain contains three elements. This is because the VP's domain elements have just become elements in the S's order domain, whereas those of the NP are compacted into one single domain element, which ensures the continuity of the NP. Discontinuity is allowed if the domain elements are not compacted: is and coming are discontinuous in the order domain of the S.
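The mechanics of compaction can be sketched in a few lines of code. The following is a toy illustration of my own (the data structures and function names are invented for exposition, not the formalism of Reape or Kathol): domain elements are modelled as (phonology, category) pairs.

```python
# Toy sketch of order-domain formation: a daughter's domain may be COMPACTED
# into a single continuous element, or its elements may be passed up intact.

def compact(domain, category):
    """COMPACT a daughter's domain into a single continuous element."""
    phonology = " ".join(phon for phon, _ in domain)
    return [(phonology, category)]

# 'Is the girl coming?': the NP's two elements are compacted, so the NP stays
# continuous; the VP's elements ('is', 'coming') are passed up individually,
# so they may end up discontinuous in the S domain.
np_domain = compact([("the", "Det"), ("girl", "N")], "NP")
vp_domain = [("is", "V"), ("coming", "V")]

# The S domain interleaves the fronted auxiliary with the compacted NP:
s_domain = [vp_domain[0]] + np_domain + [vp_domain[1]]
print(s_domain)   # [('is', 'V'), ('the girl', 'NP'), ('coming', 'V')]
```

The point of the sketch is only that discontinuity of is and coming is possible precisely because the VP's elements were not compacted, while the NP's were.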

The notable feature of Chung and Kim's (2003) analysis is that each element of a clausal order domain is uniquely marked for the region that it belongs to (Borsley and Kathol 2000; Kathol 2000, 2002; Penn 1999). The positional assignment is determined by the following constructional constraints:

(46) a.
     b.
     c.


Wh-elements are assigned to position 3 in main clauses, and those in embedded (interrogative and relative) clauses are put in position 1. Topic elements are always assigned to position 2, and the operators are always assigned to position 3.⁸ Thus, left peripheral elements in English have the following distributions:

(47) Distribution of English left peripheral elements (Chung and Kim 2003)

                      Marker field (1)   Topic field (2)   Focus field (3)
     Main clause                         TOP               WH/OP
     Embedded clause  WH/COMP            TOP               OP

An embedded wh-phrase competes for position 1 with a complementizer. This competition accounts for the fact that these two elements never co-occur in English (cf. Chomsky and Lasnik 1977). They further assume the TOPOLOGICAL LINEAR PRECEDENCE CONSTRAINT, a linear precedence constraint which is imposed on the elements in order domains:

(48) Topological Linear Precedence Constraint
     1 < 2 < 3

(48) states that the elements in position 1 should precede those in position 2, which should in turn precede those in position 3.
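The positional assignment in (47) and the constraint in (48) are mechanical enough to be checked programmatically. The following is an illustrative encoding of my own (the dictionary and function names are not Chung and Kim's notation); strict increase also captures the competition for a single position, since two elements in the same position fail the check.

```python
# Hypothetical encoding of the positional assignment in (47) and the
# Topological Linear Precedence Constraint (48): 1 < 2 < 3.

POSITION = {
    ("WH", "main"): 3,  ("WH", "embedded"): 1,
    ("TOP", "main"): 2, ("TOP", "embedded"): 2,
    ("OP", "main"): 3,  ("OP", "embedded"): 3,
    ("COMP", "embedded"): 1,
}

def licensed(elements, clause_type):
    """True iff the left-peripheral elements, in surface order, occupy
    strictly increasing positions; equal positions (competition for a
    single slot) also rule the order out."""
    positions = [POSITION[(e, clause_type)] for e in elements]
    return all(a < b for a, b in zip(positions, positions[1:]))

print(licensed(["TOP", "WH"], "main"))      # True:  (16a)
print(licensed(["TOP", "WH"], "embedded"))  # False: 2 before 1
print(licensed(["WH", "TOP"], "embedded"))  # True:  position 1 before 2
print(licensed(["OP", "WH"], "main"))       # False: both compete for 3
print(licensed(["WH", "OP"], "embedded"))   # True:  1 before 3
```

Under these assumptions the function reproduces the ok/* pattern summarized in (49) below.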

Now let us consider how this approach might accommodate the asymmetry between main and subordinate wh-interrogatives. The summary of the relevant data given in (37) is repeated here in (49):

(49) Distribution of SAI operator, wh-element and topic

                 Main clause       Embedded clause
     TOP < WH    ok  (16a)         *   (16b)
     WH < TOP    *   (24)          ok  (25)
     OP < WH     *   (21)          *   (20)
     WH < OP     *   (23a)         ok  (23b)

As introduced above, Chung and Kim's approach assumes that a topic is in position 2 and a wh-element is in position 3 in main clauses. This accounts for the grammaticality of the TOP < WH order in main clauses, since it has the following representation:

(50)



This order domain does not violate the Topological Linear Precedence Constraint in (48), and hence accounts for the grammaticality of (16a), repeated here for convenience:

(51) Now what do we need?

Let us turn to embedded clauses, where the TOP < WH order is ungrammatical. (46c) states that wh-elements are assigned to position 1 in embedded clauses, whereas topic elements are always in position 2, no matter whether the clause is embedded or not. Thus, the TOP < WH order in embedded clauses leads to the following order domain.

(52)

(52) violates the Topological Linear Precedence Constraint since its DOM element marked 2 precedes that marked 1. This explains the ungrammaticality of (16b).

(53) *He told us now what we need.

Thus, Chung and Kim's (2003) approach can accommodate the asymmetry between main and embedded clauses with respect to a topic and a wh-element.

The fact that WH < TOP is excluded from main clauses is accounted for along the same lines. This linear order leads to the following order domain:

(54)

Here, the element with 3 precedes that with 2, which violates (48); this accounts for the ungrammaticality of the sentences in (24).

(55) a. *To whom, a book like this, would you give?
     b. *For what kind of jobs during the vacation would you give into the office?

For embedded clauses, on the other hand, (46a) and (46c) require that a topic should be in position 2 and a wh-phrase in position 1, respectively. The resulting order is (56):

(56)


This conforms to constraint (48), which correctly predicts the grammaticality of the WH < TOP order in embedded clauses, illustrated by (25), which is repeated below:

(57) a. the man to whom, liberty, we could never grant
     b. ?I wonder to whom this book, Bill should give.
     c. I was wondering for which job, during the vacation, I should give into the office.

Constraint (46b) states that wh-elements and operators are both assigned to position 3 in main clauses. This accounts for the ungrammaticality of the WH < OP and the OP < WH orders in main clauses: the competition for a single position between these two elements entails that they cannot co-occur.

(58) a. *In no way, why would Robin volunteer?
     b. *Only with great difficulty on which table would she put the big rock?

(59) *Why, in no way would Robin volunteer?

A wh-phrase is assigned to position 1 in embedded clauses while operators are assigned to position 3, embedded or not. This accounts for the grammaticality of the WH < OP order, since its order domain has the 1 < 3 linear order.

(60) Lees wonders why under no circumstances would Robin volunteer.

The OP < WH order is correctly excluded since it entails 3 < 1, which violates (48).

(61) a. *Lees wonders under no circumstances at all why would Robin volunteer.
     b. *I wonder only with great difficulty on which table would she put the big rock.

Thus, the linearization-based HPSG approach of Chung and Kim (2003) can provide an account for all the relevant data, including those problematic for an approach in Word Grammar and for the framework of Ginzburg and Sag (2000).

Another advantage of Chung and Kim's (2003) approach is that it can also predict the grammaticality facts with respect to TOP < OP and OP < TOP. The positional assignment represented in (47) predicts that a topic precedes an operator in both main and embedded clauses, and it also predicts, with the Topological Linear Precedence Constraint (48), the ungrammaticality of the OP < TOP order in both types of clauses. This is borne out by the following examples, which illustrate that TOP < OP is no problem but OP < TOP is ungrammatical in main clauses (62) and in embedded clauses (63):

(62) a. To John, nothing would we give.
     b. *Nothing, to John, would we give.

(63) a. He said that beans, never in his life, had he been able to stand.
     b. *He said that never in his life, beans, had he been able to stand.


5. Concluding Remarks

In this chapter, we have looked at three different approaches to the asymmetries between main and embedded clauses with respect to the elements in the left periphery of a clause. We compared the dependency-based approach developed within WG (Hudson 2003) with the Constructional HPSG approach along the lines of Ginzburg and Sag (2000) and the Linearization HPSG analysis by Chung and Kim (2003), and argued that the approaches within WG and Constructional HPSG have some problems in dealing with the relevant facts, but that Linearization HPSG provides a straightforward account of them.

As we discussed at the outset of this chapter, dependency structure is simpler than phrase structure in that the former only includes information on the relationship between individual words, while the latter involves additional information about constituency. Other things being equal, simpler representations are preferable to more complex representations. This might lead to the conclusion that WG is potentially superior to HPSG. We have shown, however, that both the dependency-based analysis in WG and the constituency-based analysis in Constructional HPSG are not satisfactory in accounting for the linear order facts. These two frameworks follow the traditional distinction between the rules for word order and the rules defining the combinations of elements.⁹ We should note, however, that the rules for word order are applied to local trees in Constructional HPSG and to dependency arrows in WG. Sisters must be adjacent in Constructional HPSG, whereas in WG the parent and its dependent can only be separated by elements that directly or indirectly depend on one of them. This means that the linear order is still closely tied to the combinatorial structure. That these frameworks cannot accommodate certain linear order facts suggests that neither dependency structure nor phrase structure is appropriate as the locus of linear representation. We saw above that the linearization HPSG analysis gives a satisfactory account of the linear order of elements in the left periphery. This conclusion suggests that we need to separate linear order from combinatorial mechanisms more radically than the traditional separation of the rules does.

References

Abeillé, Anne and Godard, Danièle (1997), 'The syntax of French negative adverbs', in Danielle Forget, Paul Hirschbühler, France Martineau, and Maria L. Rivero (eds), Negation and Polarity: Syntax and Semantics. Amsterdam: John Benjamins, pp. 1-17.

Baltin, Mark (1982), 'A landing site for movement rules'. Linguistic Inquiry, 13, 1-38.

Borsley, Robert D. (2004), 'An approach to English comparative correlatives', in Stefan Müller (ed.), Proceedings of the HPSG04 Conference. Stanford: CSLI Publications, pp. 70-92.

Borsley, Robert D. and Kathol, Andreas (2000), 'Breton as a V2 language'. Linguistics, 38, 665-710.

Borsley, Robert D. and Przepiorkowski, Adam (eds) (1999), Slavic in Head-Driven Phrase Structure Grammar. Stanford: CSLI Publications.

Bouma, Gosse, Malouf, Rob and Sag, Ivan A. (2001), 'Satisfying constraints on extraction and adjunction'. Natural Language and Linguistic Theory, 19, 1-65.

Chomsky, Noam and Lasnik, Howard (1977), 'Filters and control'. Linguistic Inquiry, 8, 425-504.

Chung, Chan and Kim, Jong-Bok (2003), 'Capturing word order asymmetries in English left-peripheral constructions: A domain-based approach', in Stefan Müller (ed.), Proceedings of the 10th International Conference on Head-Driven Phrase Structure Grammar. Stanford: CSLI Publications, pp. 68-87.

Ginzburg, Jonathan and Sag, Ivan A. (2000), Interrogative Investigations. Stanford: CSLI Publications.

Haegeman, Liliane (2000), 'Inversion, non-adjacent inversion and adjuncts in CP'. Transactions of the Philological Society, 98, 121-60.

Hudson, Richard A. (1990), English Word Grammar. Oxford: Blackwell.

— (1995), 'HPSG without PS?'. Available: www.phon.ucl.ac.uk/home/dick/unpub.htm. (Accessed: 21 April 2005).

— (1999), 'Discontinuity'. Available: www.phon.ucl.ac.uk/home/dick/discont.htm. (Accessed: 21 April 2005).

— (2003), 'Trouble on the left periphery'. Lingua, 113, 607-42.

— (2005, February 17 - last update), 'An Encyclopedia of English Grammar and Word Grammar' (Word Grammar). Available: www.phon.ucl.ac.uk/home/dick/wg.htm. (Accessed: 21 April 2005).

Kathol, Andreas (2000), Linear Syntax. Oxford: Oxford University Press.

— (2002), 'Linearization-based approach to inversion and verb-second phenomena in English', in Proceedings of the 2002 LSK International Summer Conference Volume II: Workshops on Complex Predicates, Inversion, and OT Phonology, pp. 223-34.

Kim, Jong-Bok and Sag, Ivan A. (2002), 'Negation without head-movement'. Natural Language and Linguistic Theory, 20, 339-412.

Koizumi, Masatoshi (1995), 'Phrase Structure in Minimalist Syntax'. (Unpublished doctoral dissertation, MIT).

van Noord, Gertjan and Bouma, Gosse (1994), 'Adjuncts and the processing of lexical rules', in Fifteenth International Conference on Computational Linguistics (COLING '94), pp. 250-6.

Penn, Gerald (1999), 'Linearization and WH-extraction in HPSG', in R. D. Borsley and A. Przepiorkowski (eds), Slavic in Head-Driven Phrase Structure Grammar. Stanford: CSLI Publications, pp. 149-82.

Pollard, Carl, Kasper, Robert and Levine, Robert (1994), Studies in Constituent Ordering: Towards a Theory of Linearization in Head-Driven Phrase Structure Grammar. Research Proposal to the National Science Foundation, Ohio State University.

Pollard, Carl and Sag, Ivan A. (1994), Head-Driven Phrase Structure Grammar. Chicago: University of Chicago Press.

Przepiorkowski, Adam (1999a), 'On complements and adjuncts in Polish', in R. D. Borsley and A. Przepiorkowski (eds), Slavic in Head-Driven Phrase Structure Grammar. Stanford: CSLI Publications, pp. 183-210.

— (1999b), 'On case assignment and "adjuncts as complements"', in Gert Webelhuth, Jean-Pierre Koenig and Andreas Kathol (eds), Lexical and Constructional Aspects of Linguistic Explanation. Stanford: CSLI Publications, pp. 231-45.

Reape, Michael (1994), 'Domain union and word order variation in German', in John Nerbonne, Klaus Netter and Carl J. Pollard (eds), German in Head-Driven Phrase Structure Grammar. Stanford: CSLI Publications, pp. 151-98.

Rizzi, Luigi (1997), 'On the fine structure of the left periphery', in Liliane Haegeman (ed.), Elements of Grammar. Dordrecht: Kluwer Academic Publishers, pp. 281-337.

Ross, John R. (1986), Infinite Syntax! New Jersey: Ablex Publishing Corporation.

Sag, Ivan A. (1997), 'English relative clause constructions'. Journal of Linguistics, 33, 431-84.

Webelhuth, Gert, Koenig, Jean-Pierre and Kathol, Andreas (eds) (1999), Lexical and Constructional Aspects of Linguistic Explanation. Stanford: CSLI Publications.

Notes

* I would like to thank Bob Borsley and Kensei Sugayama for their helpful comments. Any errors are those of the author.

1 For comparison of WG with an earlier version of HPSG (Pollard and Sag 1994), see Hudson (1995).

2 In the current version of WG (Hudson 2005), the No-Tangling Principle has been replaced with ORDER CONCORD, whose effects are essentially the same as for its predecessor. In this chapter we will refer to the No-Tangling Principle.

3 The examples in the rest of this section are cited from Haegeman (2000) unless otherwise indicated.

4 (26b) was provided for me by Bob Borsley (p.c.).

5 (28b) is not acceptable to some speakers (Borsley 2004).

6 The data in (28) could be accommodated in WG if we assumed a dependency relation between the complementizer and the extractee (Borsley (p.c.); and Sugayama (p.c.)). Needless to say, however, an argument along these lines would need to clarify the nature of this apparently ad hoc grammatical relation.

7 This amalgamation of the SLASH values is due to the SLASH-Amalgamation Constraint (Ginzburg and Sag 2000: 169):

(i)

8 See Kathol (2002) for an alternative analysis of English clausal domains.

9 In constituency-based grammars such as HPSG, these two rule-types are LINEAR PRECEDENCE RULES and IMMEDIATE DOMINANCE RULES.


Part II

Towards a Better Word Grammar


8. Structural and Distributional Heads

ANDREW ROSTA

Abstract
Heads of phrases are standardly diagnosed by both structural and distributional criteria. This chapter argues that these criteria often conflict and that the notion 'head of a phrase' is in fact a conflation of two wholly distinct notions, 'structural head' (SH) and 'distributional head' (DH). The SH is the root of the phrase and is diagnosed by structural criteria (mainly, word order and ellipsis). Additionally, the distribution of the phrase may be conditioned by one or more words in the phrase: these are DHs. The SH is often a DH, but there are many English constructions in which a DH is not the SH and is instead a word subordinated within the phrase. The chapter discusses a variety of these constructions, including: that-clauses; pied-piping; degree words; attributive adjectives; determiners; just, only, even; not, almost, never, all but; the type-of construction; coordination; correlatives; adjuncts; subjects; empty categories.

1. Introduction

The central contention of this chapter is that a number of constructions in English oblige us to recognize that the distribution of a phrase may be determined by a word subordinated within the phrase, rather than, as is commonly taken for granted, by the structural head of the phrase - i.e. the highest lexical node in the phrase. By a phrase's 'distribution' is meant the range of environments - positions - in which it can occur. An example of such a construction is pied-piping (discussed in section 8), as in (1a). The root of the phrase in the midst of which throng of admirers is in, but it is by virtue of containing which that it occupies its position before the inverted auxiliary, for (1a) alternates with (1b), but not with (2a-2b).

(1) a. In the midst of which throng of admirers was she finally located?
    b. Which throng of admirers was she finally located in the midst of?

(2) a. *In the midst of this throng of admirers was she finally located.
    b. *This throng of admirers was she finally located in the midst of.

The head of a phrase is normally understood to be defined, and hence diagnosed, by both structural and distributional criteria. But the notion 'head of a phrase' is in fact a conflation of two wholly distinct notions: the distributional, or 'external', head, and the structural, or 'internal', head. These two types of head are


explained in sections 2-3. Although I use the term 'phrase' in a mostly theory-neutral way, it is important to realize that it doesn't entail the Phrase Structure Grammar notion that phrases are nonlexical nodes. In Word Grammar (WG), which is the grammatical model that serves as a framework for the discussion of grammatical analysis in this chapter, all nodes are lexical, and WG defines a phrase as a word plus all the words that are subordinate to it.¹ (A word's 'subordinates' are its 'descendants' in the syntactic tree, the nodes below it; its 'superordinates' are its 'ancestors', the nodes above it.) The words in a sentence comprise all the nodes of a tree, and every subtree of the sentence tree is a phrase.

2. Structural Heads

A phrase's structural head (henceforth 'SH') is, as stated above, to be defined as the highest lexical node in the phrase. In a model such as Word Grammar, in which all nodes are lexical, the SH is, therefore, the root of the phrase's tree structure. For determining which word is the root of a phrase, the principal diagnostic is word order. Take the phrase eat chocolate: if chocolate is the root then there cannot be a dependency between eat and a word that follows chocolate; and if eat is the root then there cannot be a dependency between chocolate and a word that precedes eat. The test shows that eat is the root of eat chocolate:

(3) a. *Do Belgian eat chocolate. ['Do eat Belgian chocolate.']
    b. *Do your eat chocolate. ['Do eat your chocolate.']

(4) a. Eat chocolate today,b. Don't eat chocolate.

These restrictions follow from the general (and probably inviolable) principle of grammar that requires phrases to be continuous: no parts of a phrase can be separated from one another by an element that is not itself contained within the phrase. The principle is discussed further in section 15. Diagrammatically, the principle can conveniently be captured as a prohibition against a branch in the syntactic tree structure crossing another, as in (5).

(5)
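The no-crossing prohibition is easy to state as an algorithm. The following minimal sketch is my own illustration (the integer word positions and dependency pairs are hypothetical annotations, not WG notation): two dependencies cross exactly when one end of the second falls strictly inside the span of the first while its other end falls outside.

```python
# A minimal continuity (projectivity) check over dependencies, each given as
# a pair of word positions.

def crosses(dep_a, dep_b):
    """True iff the two dependency arcs cross (tangle)."""
    (i, j), (k, l) = sorted(dep_a), sorted(dep_b)
    return i < k < j < l or k < i < l < j

def continuous(dependencies):
    """True iff no two dependencies in the sentence cross."""
    return not any(crosses(a, b)
                   for a in dependencies for b in dependencies if a != b)

# '*Do Belgian eat chocolate' (3a): with do=0, Belgian=1, eat=2, chocolate=3,
# 'Belgian' would depend on 'chocolate' while 'do' governs 'eat', so the
# arcs (0,2) and (1,3) cross:
print(continuous([(0, 2), (2, 3), (1, 3)]))   # False

# 'Do eat Belgian chocolate': do=0, eat=1, Belgian=2, chocolate=3; no arcs cross:
print(continuous([(0, 1), (1, 3), (2, 3)]))   # True
```

The crossing test is the procedural counterpart of the diagrammatic prohibition in (5).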

3. Distributional Heads

The distribution of a word (or phrase) is the range of grammatical environments it can occur in. In the broadest sense, this includes a word's co-occurrence with both its dependents, e.g. the fact that eat can occur with an object (eat chocolate), and its regent, e.g. the fact that eat can be complement of an auxiliary (will eat). (The term 'regent' is used in this chapter as the converse of 'dependent'.) But in the narrower and more usual sense employed here, a word's distribution essentially concerns what it can be a dependent of. 'Distribution' in the latter sense contrasts with 'Valency' (or 'Selection'), which concerns what a word can


be regent of. As a first approximation, we can therefore say that the distribution of X is the product of rules stating that such and such a regent permits or requires a dependent belonging to a category that X belongs to. But the topic of this chapter is such that instead of that first approximation we need, at least pretheoretically, to formulate this in terms of the notion 'distributional head': when a word permits or requires a dependent of category X, it permits or requires a dependent that is a phrase whose distributional head is of category X.

Models of syntax have generally held that something is a distributional head (henceforth 'DH') if and only if it is a SH - in other words, that a phrase has just one sort of head, and that this single head determines both the structural and the distributional properties of the phrase. But my first aim is to show that the two sorts of head must be distinguished. Normally the two sorts of head coincide, so that one word is both SH and DH of a phrase - i.e. the root of a phrase determines its distribution. This is generally known as 'endocentricity'. But a fair number of constructions in English suggest that that norm cannot be exceptionless. (And, as we will see later, once we acknowledge that the norm has exceptions, there is reason to question whether it is in fact even much of a norm at all.) In these constructions, the SH is not the DH. This is exocentricity. But the constructions involve a very particular kind of exocentricity: in them, the DH is subordinate to the SH. That is, the distribution of the phrase is determined not by the root of the phrase but by a word subordinated more or less deeply within the phrase. I will call this phenomenon 'hypocentricity', since the DH is below the SH in the tree.

Although the notion 'structural head', defined as the root of a phrase, has a role in the formal analysis of hypocentricity, the notion 'distributional head' does not, and is purely descriptive. This is because it turns out that a phrase may have many distributional heads. This can be illustrated as follows. Section 11 argues that in 'determiner phrases' (i.e. 'noun phrases' in the traditional sense), the determiner is SH and the noun is DH. This is illustrated in (6a), where, as in subsequent examples, small capitals indicate the SH and italics the DH. And section 8 argues that in pied-piping in wh-relative clauses, the wh-word is DH, so SH and DH are as indicated in (6b-6c). But in (6c) the locus of the DH also follows the pattern of (6a), giving (6d), where there is one SH, the, and two DHs, news and which.

(6) a. [THE news] had just reached us
    b. [NEWS of which] had just reached us
    c. [THE news of which] had just reached us
    d. [THE news of which] had just reached us

In the formal analysis of hypocentricity introduced in section 4 and presented in full in section 6, a phrase's DHs are defined relationally, relative to the SH.

So, although I said in section 1, in framing the discussion of hypocentricity, that the notion 'head of phrase' is a conflation of two sorts of phrasal head, the structural and the distributional head, it would be more accurate to say that the traditional notion 'head of a phrase' remains valid, but that it is defined by structural criteria, as the phrase root, and, contrary to what is usually thought,


not by distributional criteria. The distribution of a phrase may be conditioned by categorial properties of its (structural) head, but it may equally well be conditioned instead by categorial properties of words more or less deeply subordinated within the phrase. As we will see in section 15 and section 17, once the criteria for identifying the phrasal head are solely structural and not distributional, we are led to transmogrify the familiar WG analyses of the structure of many constructions into radically new but more satisfactory forms.

In the following sections I discuss a number of constructions where there is prima facie reason to think that they might be hypocentric. It is beyond the scope of this chapter to agonize over the details of the structure of each construction, so by and large my identification of the SH in each construction will rest more on prima facie plausibility than on detailed argumentation.

4. That-Clauses

In a that-clause, the SH is that, which explains why it must be at the extreme left edge of the clause. But the DH of the that-clause is the finite complement of that. The evidence for this is that the clausal complement of certain verbs, such as require and demand, must be subjunctive.² So the DH is the subjunctive word; it is the presence of the subjunctive word that satisfies the selectional requirements of require/demand:

(7) a. I require [THAT she be/*is here on time].
    b. I demand [(THAT) she give/*gives an immediate apology].

A satisfactory analysis of this phenomenon is provided in Rosta (1994, 1997) (from whose terminology I deviate in this chapter without further comment). That is defined (in its lexical entry) as 'surrogate' of its complement, the finite word. As a general rule, every word is also surrogate of itself; so the finite word is surrogate of itself. Require/demand select for a complement that is surrogate of a finite word. Since the surrogates of a finite word are itself and that (if it is complement of that), the selectional requirements of require/demand can be satisfied by that or by a finite word. Surrogacy accounts for some hypocentric constructions, but not all. We return to this point in section 6.
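The surrogacy rule lends itself to a small mechanical illustration. The following toy model is my own (the class and function names are invented for exposition, not Rosta's formalism): a verb that selects for a subjunctive surrogate is satisfied either by that, whose complement is subjunctive, or by the subjunctive word itself.

```python
# Toy model of surrogacy: every word is surrogate of itself, and 'that' is
# additionally surrogate of its (finite) complement.

class Word:
    def __init__(self, form, mood=None, complement=None):
        self.form, self.mood, self.complement = form, mood, complement

def surrogate_of(word):
    """The words that 'word' counts as surrogate of."""
    sources = [word]                      # every word is surrogate of itself
    if word.form == "that" and word.complement is not None:
        sources.append(word.complement)   # 'that' is surrogate of its complement
    return sources

def satisfies_subjunctive_selection(complement):
    """Does this complement satisfy the selection of require/demand?"""
    return any(w.mood == "subjunctive" for w in surrogate_of(complement))

be = Word("be", mood="subjunctive")
is_ = Word("is", mood="indicative")

print(satisfies_subjunctive_selection(Word("that", complement=be)))   # True:  (7a)
print(satisfies_subjunctive_selection(be))                            # True:  (7b) without 'that'
print(satisfies_subjunctive_selection(Word("that", complement=is_)))  # False: '*that she is'
```

The sketch mirrors the prose above: the selectional requirement is checked against the set of words the complement is surrogate of, not against the complement alone.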

5. Extent Operators

I adopt 'extent operator' as an ad hoc term to cover such items as all but, more than, other than, almost, not, never, which do not necessarily form a natural grammatical class but do at least have certain shared properties that warrant their being discussed together here. Reasons will be given why, when an extent operator modifies a predicative word, as in (8a-8f), or a number word, as in (9a-9f), the extent operator appears to be SH and the number or predicative word to be DH.

(8) a. She had [ALL but expired].
    b. My rent has [MORE than doubled].
    c. She was [OTHER than proud of herself].


    d. My rent has [ALMOST doubled].
    e. Her having [NOT had a happy childhood], he was inclined to be patient with her.
    f. Her having [NEVER seen the sea before], this was a real treat.

(9) a. [MORE than thirty] went.
    b. [ALMOST thirty] went.
    c. [BARELY thirty] went.
    d. [OVER/UNDER thirty people] went.
    e. [NOT many] know that.
    f. [NOT two minutes] had elapsed before the bell rang.

The identification of the DH is probably not very controversial, but the justification for it is most apparent in (8a, b, d, e, f), where auxiliary have requires a past participle as its complement, and it is the DH that satisfies this requirement. Note also that, as demonstrated by (10a-10b), verbal number inflection is triggered by the number of the DH rather than by the SH or the meaning of the whole phrase:

(10) a. [MORE than one] is/*are.
     b. [LESS/FEWER than two] are/*is.

More controversial is the identification of the SH, and the evidence for this will now be presented.

First of all, there is the evidence of meaning: the bracketed phrases could all be described as 'semantically exocentric'. For instance, (8b, 8d) don't refer to an event of doubling, and (9a-9d) don't refer to a quantity of 30. Rather, the meanings are roughly thus:

(8b, 8d): 'My rent has increased by a factor that is more/slightly less than 2'
(8c): 'She was in a state that is other than a state of being proud of herself'
(9a-d): 'a set whose cardinality is a number more than/almost/barely/over/under thirty'
(9e): 'a set whose cardinality is a number that is not many'
(9f): 'a set (of minutes) whose cardinality is a number that is not two', or 'a period that is not two minutes'

There is no prior theoretical reason to suppose that the 'semantic head' should in general be the SH rather than the DH; if anything, one would expect the DH to align with the semantic head, given that it is the DH that seems to be the more visible from outside the phrase. But it remains the case that in these constructions the extent operator is closest to being the semantic head. For example all but in (8a) might be taken to mean 'an act that stops slightly short of outright expiring', and in (9d), over/under might be taken to mean 'a number - a place in number space - that is over/under thirty', just as under the table means 'the place under the table' in (11a-c).3

(11) a. Under the table is an obvious place to hide.
     b. Let's paint under the table black.
     c. The pupils always cover under the table with chewing gum.


The second kind of evidence for the identification of the SH comes from certain of the extent operators' more core variants that have a nominal complement, notably all but NP, more than NP and over/under NP. It is easy to demonstrate that in these variants, all, more and over/under is both SH and DH. For instance, in (12a) we have me rather than I and are rather than am because the subject is all rather than me/I.4

(12) a. All but me/*I are/*am to come.
     b. More than me/*I are/*am here.
     c. Over/Under us/*we seems/*seem unsuitable for the storage of radioactive waste.

The third and most telling sort of evidence comes from ellipsis. Ellipsis involves the deletion of the phonological content of some syntactic structure, and it seems to operate rather as if (the phonology of) a branch of the syntactic tree were snipped off. Thus if the phonological content of one node is deleted, then so must be the phonological content of all nodes subordinate to it.5 So, if we have established that a branch links two nodes, X and Y, and X's phonology remains when Y's is deleted, it must follow that Y is subordinate to X. And we find that with certain extent operators, including not, nonstandard never (meaning 'not' rather than 'nowhen') and almost, their phonology can remain when the phonology of the DH is deleted. (Words whose phonology is deleted are enclosed in braces.)

(13) a. %I would prefer that you not be so rude.
     b. %I know you want to do it, but try to not.
     c. Would I do it? I wouldn't not {do it}.
     d. We'll make him not.
     e. I know it's unmanly to flinch, but how can you stand there and not {flinch}?
     f. You can't go out without knickers - not {go out without knickers} and still stay decent.
     g. %She never {stole your cigarette lighter}. ['She didn't']
     h. %Did she do it? No, but she almost.

(13g) is dialectal, and I have also marked (13a, b, h) as subject to variant judgements, because some speakers reject them, but all of (13a-13h) are acceptable for some speakers, and that is what matters here. The conclusion is that the deleted DH is subordinate to the extent operator, which is therefore the SH.6

We have established, then, that the internal structure of these phrases is as shown in (14a-14b). This raises two questions. The first, which is addressed in section 6, concerns the structure of (15a-15b): how can structures (14a-14b) be reconciled with the fact that it is perished that satisfies the selectional requirements of had?

(14) a. all but perished
     b. almost perished


(15) a. She had all but perished.
     b. She had almost perished.

The second question concerns the structure of (16a-16b). Other things being equal, we would expect (16a-16b) to be ungrammatical due to illicit word order, as diagrammed in (17a-17b), while we would expect (18a-18b), whose word order remedies the apparent illicitness of (17a-17b), to be grammatical.

(16) a. I know she all but perished.
     b. I know she almost perished.

(17) a. I know she all but perished.
     b. I know she almost perished.

(18) a. *I know all but she perished. ['She all but perished.']
     b. *I know almost she perished. ['She almost perished.']

It seems, then, that (16a-16b) must involve something along the lines of obligatory 'leftwards-extraposition' of the subject; the subject moves from its ordinary position and ends up as a subordinate of the extent operator, as diagrammed in (19a-19b).7 We return to this matter in section 17.2, where a far more satisfactory solution is provided.

(19) a. I know she all but perished.
     b. I know she almost perished.

6. Surrogates versus Proxies

In section 4 we observed that require/demand requires as its complement the surrogate of a finite word. This requirement is satisfied by the finite word itself, (20a), by clausal that, (20b), and by extent operators, (20c). (The list is not exhaustive.)

(20) a. She demanded he go.
     b. She demanded that he go.
     c. She demanded he almost go.


But as (21a-21c) show, the complement of clausal that can be a finite word, or an extent operator, but not another that. The same pattern holds for complements of extent operators, (22a-22f). (Structurally, almost in (21c) and (22d) occurs in the position where the complement of but and that is expected. For this reason, I conclude that almost (rather than perished) is indeed the complement of but and that. And, as pointed out in section 5, almost is the semantic head in almost perished: it means 'an event near to being an event of perishing'.) Hence it cannot be the case that the selectional requirements of that or of extent operators are such that any surrogate of a finite word will satisfy them.

(21) a. I know that she went.
     b. *I know that that she went.
     c. I know that she almost went.

(22) a. I know that she almost perished.
     b. *I know she almost that perished.
     c. I know that she almost almost perished.
     d. I know that she all but almost perished.
     e. I know that she almost all but perished.
     f. Anybody not not happy should raise their hands now.

(20-22) show that there are two types of hypocentric phrase. In one type, the SH is surrogate of the DH, and the SH can be that, an extent operator, or the DH, the finite word. In the other type, the SH can be an extent operator or the DH, but not that. To capture this pattern, which, as we will see in later sections, generalizes across many diverse constructions, we need to posit a subtype of the Surrogate relation, which I will call 'Proxy'. So, whereas require/demand select for a complement that is surrogate of a subjunctive word, that selects for a complement that is proxy of a finite word. Likewise for extent operators: almost and but (in all but) select for a complement that is proxy of a finite word (or of whatever other sorts of word extent operators can modify).

In general, the format for selectional rules will be not (23a), but rather (23b-23c). Rules of type (23a) seem to be surprisingly scarce: I am currently aware of only one instance, which is discussed in section 16 (in examples (73a-73c)).


(23) a. the complement of X is a word of category Y
     b. the complement of X is proxy of a word of category Y
     c. the complement of X is surrogate of a word of category Y

Given that rules of form (23b-23c) are more cumbersome than the rules of form (23a) that we are accustomed to, we will introduce an abbreviating equivalent for (23b-23c), and say that X 'targets' Y for its complement (but, implicitly, will accept Y's surrogate or proxy in lieu of Y); Y is X's 'complement target'.

The key rules defining Surrogate and Proxy are (24a-24c):

(24) a. If X is proxy of Y, then X is surrogate of Y.
     b. X is proxy of X.
     c. If X is surrogate of Y, and Y is surrogate of Z, then X is surrogate of Z.

More specific rules of the grammar define what is surrogate or proxy of what. The grammar defines that as surrogate (but not proxy) of its finite target, and it defines extent operators as proxy of the modified word.
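Rules (24a-24c) amount to a small closure system, and it can be helpful to see them run. The following Python sketch is mine, not part of the WG formalism: the function name and the toy lexicon are invented for illustration. The code computes the full Surrogate relation from base Proxy and Surrogate facts by applying reflexivity of Proxy (24b), Proxy-implies-Surrogate (24a), and transitivity of Surrogate (24c).

```python
# Hedged sketch: the Surrogate/Proxy rules (24a-24c) as relation closure.
# The base facts below are a toy lexicon invented for illustration.

def surrogates(words, proxy_facts, surrogate_facts):
    """Return the full Surrogate relation as a set of (X, Y) pairs."""
    # (24b): every word is proxy of itself
    proxy = set(proxy_facts) | {(w, w) for w in words}
    # (24a): every proxy pair is also a surrogate pair
    surr = set(surrogate_facts) | proxy
    # (24c): transitive closure of Surrogate
    changed = True
    while changed:
        changed = False
        for (x, y) in list(surr):
            for (y2, z) in list(surr):
                if y == y2 and (x, z) not in surr:
                    surr.add((x, z))
                    changed = True
    return surr

# Toy facts: 'that' is surrogate (but not proxy) of the finite verb 'went';
# the extent operator 'almost' is proxy of the word it modifies.
words = {'that', 'went', 'almost'}
surr = surrogates(words,
                  proxy_facts=[('almost', 'went')],
                  surrogate_facts=[('that', 'went')])

assert ('went', 'went') in surr      # every word is surrogate of itself
assert ('almost', 'went') in surr    # proxies are surrogates (24a)
assert ('that', 'went') in surr      # that is surrogate of its finite target
assert ('that', 'almost') not in surr
```

On this modelling, a selectional rule of form (23c) reduces to a membership test: X's complement satisfies 'surrogate of a word of category Y' just in case the computed relation contains a pair linking the complement to some word of category Y.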

Informally, we can distinguish between different degrees of hypocentricity. In its strongest form, phrases [X[Y]] and [Y] are in free variation: i.e. [Z[X[Y]]] is possible if and only if [Z[Y]] is possible. The analysis for this sort of case is that X is proxy of Y, and the complement of Z must be proxy of Y. Extent operators are examples of strong hypocentricity. In a weaker form of hypocentricity, phrases [X[Y]] and [Y] are in free variation only in certain environments, e.g. [Z[Y]] alternates with [Z[X[Y]]], but [W[Y]] does not alternate with *[W[X[Y]]]. The analysis for this sort of case is that X is surrogate of Y, and the complement of Z must be surrogate of Y. That clauses are an example of this weaker form. But there is potentially a still weaker form of hypocentricity, 'quasihypocentricity', in which [Z[X[Y]]] does not alternate with [Z[Y]] at all, but nonetheless Z and Y are sensitive to each other's presence; for instance, it might be that X is eligible to be complement of Z only if Y is complement of X, or it might be that the presence of Z is a condition for Y's inflectional form, or vice versa. Some (somewhat rarefied) examples of quasihypocentricity are discussed in sections 17.2-3, where the analysis of quasihypocentricity is elaborated on a little.

7. Focusing Subjuncts: just, only, even

Just, only and even, called 'focusing subjuncts' in Quirk et al. (1985), behave rather like extent operators with regard to hypocentricity. The DH of even Sophy in (25a) is Sophy, and it is to the DH that the inflectional morphology is sensitive, as (25b) shows. The DH is complement target of the subjunct, and the subjunct is proxy of its complement target:

(25) a. [EVEN Sophy] would.
     b. Even I/*me am/*is.


The identification of the focusing subjunct as the SH needs some justification. If the subjunct is not SH, then the structure is one of (26a-26b). (Classing focusing subjuncts as adverbials, as Quirk et al. (1985) do, implies (26b).)

(26) a. Even Sophy would.
     b. Even Sophy would.

(26a) fails to account for why the subjunct must be at the extreme edge of the focused phrase. With the subjunct as SH, as in (25a), the ungrammaticality of (27a) is predicted by the crossing branches. But no such prediction is made with structure (26a), as in (27b):

(27) a. *Sophy even's parents would.
     b. Sophy even's parents would.

(26b), with the subjunct as dependent of the verb, incorrectly predicts that (28-29) should be ungrammatical. In (28a-28b) there are crossing branches. In (29a-29b), Edgar does not occupy the position immediately following its regent (gave and will), even though that normally results in ungrammaticality, as with (30a-30b). But the incorrect predictions vanish if the structures are as in (31-32).

(28) a. She stepped in only two puddles.
     b. Pictures only of eyelashes were recommended.

(29) a. She gave even Edgar flowers.
     b. Will even Edgar relent?

(30) a. *She gave today Edgar flowers.
     b. *Will today Edgar relent?

(31) a. She stepped in only two puddles.
     b. Pictures only of eyelashes were recommended.

(32) a. She gave even Edgar flowers.
     b. Will even Edgar relent?


8. Pied-piping

The roots of the bracketed phrases in (33a-33b) are under and on. But the phrases occupy their position before the inverted auxiliary by virtue of containing a negative element, no, or an interrogative wh-word, which. The root of the bracketed phrase in (33c) is on or should, depending on which is subordinate to which in one's analysis, but the DH is which: the rule for finite wh-relative clauses is that they contain a subject or topic phrase that contains a relative wh-word. In contrast to certain other hypocentric constructions, the semantic head of the phrase appears to be the DH, the wh-word, since semantically it is the relationship of binding or equation that connects the relative clause to the modificand.

(33) a. [UNDER no circumstances] would she consent.
     b. [ON the corner of which streets] should we meet?
     c. streets [on the corner of which we should meet]

As with the inside-out interrogative construction discussed in section 13, the SH in pied-piping is surrogate of the DH, and - in relative clause pied-piping at least - the surrogate relation is long-distance, i.e. there is no upper limit to the number of nodes that can be on the path up the tree from DH to SH.

9. Degree Words

In the phrases bracketed in (34a-34d), the degree words (too, more, as) modify the adjectives that are the distributional head. If the DH were also the SH, we would expect complements of the degree word to appear on the same side of the adjective as the degree word does, as in the ungrammatical (35a-35d), in order to avoid the sort of crossing branches diagrammed in (36).

(34) a. This is [TOO heavy for him to lift].
     b. He is [TOO tough to shed the odd tear].
     c. She is [MORE sophisticated than him].
     d. She is [AS sophisticated as him].

(35) a. *This is too for him to lift heavy.
     b. *He is too to shed the odd tear tough.
     c. *She is more than him sophisticated.
     d. *She is as as him sophisticated.

(36) He is too tough to shed the odd tear.

A standard solution to this problem would have the adjective as SH and the complement of the degree word obligatorily extraposed. But as far as I am aware, this solution is motivated solely by the lack of any mechanism for handling hypocentricity, and not by independent evidence. By purely structural and not distributional criteria, it is the degree word that is prime candidate for being SH. Accordingly, we take the modificand to be complement target of the degree word, and the degree word to be surrogate of its complement target.

10. Attributive Adjectives

The same kind of argument that suggests that degree words are structurally superordinate to the adjectives they modify also suggests that attributive adjectives are structurally superordinate to the common nouns they modify. In (37a), to read is complement of easy, but if book is SH then we would expect the word order to be impossible, (37b). The word order favours easy as SH, (37c). Again as with degree words, a construction-specific rule of obligatory extraposition is an alternative solution.

(37) a. an [EASY book to read]
     b. an [EASY book to read]
     c. an [EASY book to read]

The nonextrapositional analysis copes well when a degree word modifies an attributive adjective, for it correctly predicts that a complement of the degree word must follow the noun modified by the adjective, as in (38):

(38) a more sophisticated person than him

To rule out (39), the extrapositional analysis would have to posit that than him first extraposes to become some sort of postdependent of sophisticated, in line with the extraposition rule that applies to degree words, and then extraposes again to become a postdependent of the noun, in line with the extraposition rule that applies to attributive adjectives.

(39) *a more sophisticated than him person

The suggested analysis is that attributive adjectives are not adjuncts. Rather, they take the proxy of a noun as their complement, and the adjective is proxy of its complement. Further evidence for this analysis comes from ellipsis:

(40) She chose an easy puzzle and he chose a difficult {puzzle}.

11. Determiner Phrases

It was suggested in section 10 that attributive adjectives target a noun for their complement, the resulting phrase having the adjective as SH and the noun as DH. In this section I present other examples of hypocentric NPs (or DPs).


Hudson (2004) notes the contrast (41a-41b). It is the presence of an instance of the lexeme WAY that allows the noun phrase to be an adverbial. Hence way is the DH:

(41) a. She did it this way.
     b. *She did it this manner.

Generalizing beyond this example, the DH of the determiner phrase is the noun that is its complement target. Evidence for this comes from extraposition out of subjects, for only dependents of the DH can extrapose out of a subject. For example, (42a) can yield (42b) by extraposition, since the extraposee is a dependent of statement, the DH of the subject phrase. But in (42c), the DH of the subject phrase is author, so extraposition of a dependent of statement is ungrammatical. On the assumption, justified below, that the determiner is SH, (42d) shows that it is not the case that dependents of the actual subject (the SH, those) can extrapose. And the apparent exception presented by the grammaticality of (42e), where a dependent of statement is extraposed but at first glance the DH would seem to be sort, in fact serves to confirm that, as argued below in section 12, sort of phrases are hypocentric, so statement is DH of sort of statement and of the whole subject phrase.

(42) a. A statement that denies the allegation has been released.
     b. A statement _ has been released [that denies the allegation].
     c. *The author of a statement _ has been arrested [that denies the allegation].
     d. *Those _ have been released [statements that deny the allegation].
     e. A curious sort of statement _ has been released [that denies the allegation].

But it is the determiner that is SH. The evidence for this comes from word order and ellipsis. The word order evidence is that the determiner always occurs at the extreme left of the phrase, a fact that follows automatically if the noun is subordinate to the determiner but that would be unexplained if the determiner is subordinate to the noun. The ellipsis evidence is that the noun but not the determiner can be deleted. Thus (43a-43b) are synonymous, whereas (44a-44b) are not, and there is no reason to suppose that any determiner is present in (44b).

(43) a. These scales are not working properly.
     b. These {scales} are not working properly.

(44) a. This milk is off.
     b. Milk is off.

Indeed, the ellipsis can perfectly well apply to way, as in (45a), the phonological visibilia of a syntactic structure whose lexical content is given in (45b).

(45) a. You do it your way and I'll do it mine.
     b. You do it you's way and I will do it me's way.

There is not much in the way of argument against treating the determiner as SH. Van Langendonck (1994) argues that treating the determiner as head in this book/that book fails to capture the analogy with the adjective phrases this big/that big. But section 9 has argued that in this big/that big, the SH is this/that, and big is DH, so the analogy is captured.

There are other examples that can be used to make the point made by (41a-41b), but none that are quite so convincing. The verb crane appears to require X's neck as its object, but it's hard to prove that this is not merely the consequence of the verb's meaning, which specifies that the cranee is the neck of the craner. A somewhat more convincing example is (46): wreak for many speakers requires an object whose DH is havoc, and no synonym will suffice in its stead.

(46) The storm will wreak the usual havoc/%devastation.

The same appears to hold for cognate objects:

(47) a. She smiled her usual smile/*grin.
     b. She slept a deep and placid sleep/*slumber/*somnolence/*kip.

But the contrast in (48a-48c) shows havoc and cognate objects to be unlike way adverbials, and makes it hard to maintain that the presence of havoc and smile in (48b-48c) is a syntactic requirement.8

(48) a. *She did it something that fell short of a wholly sensible way.
     b. The storm will wreak something that falls short of outright havoc.
     c. She smiled something that fell short of the sweet smile we had come to expect from her.

The relationship between determiner and noun is analogous to that between clausal that and finite word. Just as multiple that is ungrammatical (See that (*that) she does), so are multiple determiners: the (*the) book, or, more plausibly, *a my book ['a book of mine']. Clausal that takes the proxy of a finite word as its complement, and is surrogate of its complement. Likewise, the determiner takes the proxy of a common noun as its complement, and is surrogate of its complement.

This analysis predicts that (49) should be ungrammatical. I am not sure that that prediction is correct, though.

(49) ?She did it the opposite of a sensible way.

12. Sort of Phrases

The bracketed phrase in (50a) is hypocentric. The SH and DH are as shown. The SH is proxy of the DH. The of in this construction takes as its complement the proxy of a common noun. Since determiners are surrogate but not proxy of their complement target, this rules out (50b) (at least as an instance of this hypocentric construction).

(50) a. these [TYPES/KINDS/SORTS/VARIETIES/MANNERS/CLASSES of dog]
     b. *these types of a dog

Some evidence for the hypocentricity of this construction has already been given in section 11. Further evidence is as follows.


First, there is the grammaticality of (51a-51b). The adverbial function of the noun phrase is licensed by virtue of having way as its DH:

(51) a. Do it the usual sort of way.
     b. Do it the same kind of way you always do.

Second, dog in (50) needn't have the coerced mass interpretation that it gets in There was dog all over the road. Normally, a noun can receive a count interpretation only if it is the complement target of a determiner; bare, determinerless nouns must receive a mass interpretation.9 If types in (50) is proxy of dog, then dog can be complement target of these and hence receive a count interpretation.

It seems that in type of X, type is optionally rather than obligatorily proxy of X. (52a) is ambiguous between a reading equivalent to (52b), with cake receiving a count interpretation, and a reading equivalent to (52c), with cake receiving a mass interpretation. When it receives the count interpretation, cake (and type) is complement target of a determiner (presumably a), and type is proxy of cake. When it receives the mass interpretation, type is not proxy of cake, and the only complement target of a is type.

(52) a. A strange type of cake was on display.
     b. A cake of a strange type was on display.
     c. Cake of a strange type was on display.

The third and last piece of evidence for the identification of the DH is as follows. (53a) is paraphrasable as (53b), (54a) as (54b), and (56a) as (56b-56c).10 But (55a) is trickier to paraphrase. (55b) is ungrammatical for some reason.11 (55c)/(56b) is a possible paraphrase of (55a), but it is ambiguous, because it also paraphrases (55b). The only unambiguous paraphrase of (55a) is (55d). And in (55d) we find that these agrees in number with cakes but not type. Hence cakes is DH of type of cakes.

(53) a. (a) cake of this type          (54) a. cake of these types
     b. this type of cake                   b. these types of cake

(55) a. cakes of this type             (56) a. cakes of these types
     b. *this type of cakes                 b. these types of cakes
     c. these types of cakes                c. these types of cake
     d. %these type of cakes

13. Inside-out Interrogatives

The italicized phrases in (57a-57f) are instances of what I will call the 'inside-out interrogative' construction.

(57) a. She always chooses nobody can ever guess which item from the menu.
     b. It was hidden in the middle of nobody could tell where.
     c. She's been going out with I've no idea who.
     d. She managed to escape nobody was able to fathom how.
     e. She smokes goodness only knows how many cigarettes a day.
     f. The drug makes you you can never be sure how virile.


The construction is functionally motivated by the impossibility of relativizing out of an interrogative clause, as in (58), so from a functional if not a structural perspective it ought to be seen as a kind of relative clause.

(58) *in the middle of somewhere_z nobody could tell which place _z was.

By all appearances these phrases have the internal structure of a clause that itself contains an interrogative clause in which sluicing has occurred, as in (59a-59f). This is why the structure is 'inside-out': (57b) means something like if not (59b) then at least 'It was hidden in the middle of a place such that nobody could tell which place it was'.

(59) a. Nobody can ever guess which item from the menu {she always chooses}.
     b. Nobody could tell where {it was hidden in the middle of}.
     c. I've no idea who {she's been going out with}.
     d. Nobody was able to fathom how {she managed to escape}.
     e. Goodness only knows how many cigarettes a day {she smokes}.
     f. You can never be sure how virile {the drug makes you}.

But inside-out interrogative phrases have a distribution equivalent to that of the interrogative wh-word they contain, as in (60a-60f) - setting aside for a moment the shift to question-meaning.

(60) a. She always chooses which item from the menu?
     b. It was hidden in the middle of where?
     c. She's been going out with who?
     d. She managed to escape how?
     e. She smokes how many cigarettes a day?
     f. The drug makes you how virile?

To put it another way, this in (61a-61f) can be replaced by an inside-out interrogative to yield (57a-57f).

(61) a. She always chooses this item from the menu.
     b. It was hidden in the middle of this place.
     c. She's been going out with this person.
     d. She managed to escape this way.
     e. She smokes this many cigarettes a day.
     f. The drug makes you this virile.

The SH of the inside-out interrogative is the root of the clause, i.e. a proxy of a finite word. The essence of the construction is that this proxy of a finite word is licensed to be surrogate of an interrogative wh-word that is complement of (a subordinate of) the finite word. Since the SH is surrogate of the wh-word, the SH is also surrogate of whatever the wh-word is surrogate of. The key surrogacy relations in (57a-57f) are indicated by the dotted arrows in (62a-62f), which assume that where, who and manner (but not degree) how are the phonological expression of what is, syntactically, which place, which body (meaning 'person', as in somebody) and which way. The SH of the bracketed phrases is in small capitals.


(62) a. She always chooses [nobody CAN ever guess which item from the menu].
     b. It was hidden in the middle of [nobody COULD tell which place].
     c. She's been going out with [I'VE no idea which body].
     d. She managed to escape [nobody WAS able to fathom which way].
     e. She smokes [goodness only KNOWS how many cigarettes a day].
     f. The drug makes you [you CAN never be sure how virile].

Section 11 explains why which is surrogate of item/place/body/way. Section 9 explains why how is surrogate of many and virile. Section 10 explains why many is surrogate of cigarettes (on the assumption that many is some kind of attributive adjective). Because how is surrogate of many, and many is surrogate of cigarettes, how is surrogate of cigarettes. Hence, in (62a, b, c, e), the SH in small capitals is surrogate of a noun, thus satisfying the requirement of chooses, of and with for a complement that is surrogate of a noun. In (62d), the SH is surrogate of way, which makes it eligible to function as a manner adverbial. In (62f), the SH in small capitals is surrogate of an adjective, thus satisfying the requirement of makes for a complement that is surrogate of an adjective.12

14. 'Empty Categories'

WG has so far not embraced the empty categories so beloved of other models, chiefly Transformational Grammar. But there is no fundamental incompatibility between WG and empty categories, if empty categories are taken to be phonologyless words. As I will briefly detail below, empty categories would be a beneficial enhancement to WG, so it is worth considering how they would work in WG. As also explained below, though, they do raise a certain problem, but this problem is solvable by means of the Proxy relation, though not by means of hypocentricity. This is why empty categories warrant a short section in this chapter.

Since all nodes in WG are words, the WG counterpart of empty categories would be a word, an instance of the lexical item '<e>', which is phonologyless and has the semantic property of expressing a variable.13

(Phonologyless words are notated within angle brackets. )Positing <e> affords both better analyses of the data, and significant

simplifications to the overall model. The principal simplification comes if

Page 207: Word Grammar (Richard Hudson)

Traditional WG makes a distinction between dependencies - dependencytokens, that is, not dependency types - that form branches of the sentence tree,and dependencies that don't. For example the object dependency from eat towhat in (63b) doesn't form a branch in the tree. Word order rules apply only todependencies that form branches. (64b) is ungrammatical because the indirectobject (every child in the class) must follow its regent, give. (64c) is ungrammaticalbecause the indirect object must precede the direct object (a gold star). But(64d-64e) are grammatical, even though the indirect object does not followgiven in (64d) and does not precede the direct object in (64e), because theindirect object is not a branch dependent of given.

(64) a. The teacher will give every child in the class a gold star.b. *The teacher will every child in the class give a gold star.c. *The teacher will give a gold star every child in the class.d. Every child in the class was given a gold star.e. Also given a gold star were all the children in the class.

But the < e > analysis allows us to do away with the distinction between branchand nonbranch dependencies: with the sole exception of Binder, alldependencies are branches. The syntactic structure of a sentence is just atree with labelled branches, supplemented by nonbranch relations of typesBinder, Surrogate and Proxy. (Even the branch labels are potentiallyredundant, given that a branch is distinguished from its siblings by its position. )Thus, the whole apparatus of syntactic structure can be significantly simplified,for the price of merely one extra lexical item among thousands.

On the assumption that unbound <e> is interpreted as 'something/someone', we are then in a position to posit structures for (65a-65d)14 that yieldthe meaning that the sentences actually have. Furthermore, the presence of< e > in (65c-65d) provides a way to capture the fact that even though there isno overt or deleted object of keep or subject of alive, semantically the object ofkeep is still understood to be the subject of alive. ((65c) is the structure onewould have if unbound < e > is added to otherwise orthodox WG. (65d) is thestructure I am proposing. )

OBJECT

188 WORD GRAMMAR: PERSPECTIVES ON LANGUAGE STRUCTURE

syntactically bound <e> occurs in positions where, in a transformationalmodel, traces (or other bound empty categories) occur. This would give us thesort of structure shown in (63a), in contrast to the traditional WG analysisshown in (63b). ('Binder' is, needless to say, a syntactical relation between aword and another word that binds it. )

(63) a. What did she say he had been hoping to eat <e>?
b. What did she say he had been hoping to eat?
[The OBJECT and BINDER arcs of the original diagrams are not reproduced.]

Page 208: Word Grammar (Richard Hudson)

STRUCTURAL AND DISTRIBUTIONAL HEADS 189

The main snag with <e> has to do with the phenomenon of connectivity, whereby traces have to have the categorial properties of their binder, i.e. of what they're traces of. An adjective leaves an adjectival trace, a noun leaves a nominal trace, and so forth. This is so that the trace can satisfy the categorial selectional requirements imposed on the position the trace occupies. For example, the subject-raising verb wax requires an adjectival complement. So if <e> is complement of wax, as in (66), it must somehow count as an adjective such as wroth.

(66) How wroth did she wax <e>?

If we had to introduce invisible words of every conceivable word class, and add rules requiring them to agree in word class with their binder, then this would be very much the opposite of a simplification to the grammar. But the Proxy relation provides a simple solution, if <e> is proxy of its binder. The selectional requirements of wax are that it takes a complement that is surrogate of an adjective, and this requirement is satisfied in (66), since (i) how is binder of <e> and hence <e> is proxy of how; (ii) being a degree word, how is surrogate of wroth; and (iii) <e> is therefore surrogate of wroth.
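The three-step inference (i)-(iii) can be modelled as relation composition over word pairs. This is a minimal sketch, not the chapter's own formalism: the function name and the encoding of relations as sets of pairs are my own, and the two rules are a paraphrase of (i)-(iii) above.

```python
# Assumed rules, paraphrasing (i)-(iii) in the text:
#   binder(X, E)                  =>  proxy(E, X)
#   proxy(E, X) & surrogate(X, W) =>  surrogate(E, W)

def close_surrogates(binder, surrogate):
    """Derive proxy facts and extended surrogate facts from binder/surrogate facts."""
    proxy = {(e, x) for (x, e) in binder}      # (i): <e> is proxy of its binder
    derived = set(surrogate)
    for (e, x) in proxy:                       # (ii)+(iii): compose through proxy
        derived |= {(e, w) for (x2, w) in surrogate if x2 == x}
    return proxy, derived

# (66) 'How wroth did she wax <e>?'
binder = {("how", "<e>")}                      # how binds <e>
surrogate = {("how", "wroth")}                 # degree word how is surrogate of wroth
proxy, surr = close_surrogates(binder, surrogate)
assert ("<e>", "how") in proxy
assert ("<e>", "wroth") in surr                # wax's selectional requirement is met
```

The fixed-point here is trivial (one composition step) because, as the text notes, the chain runs binder-to-proxy-to-surrogate in a single pass.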

In all the other cases discussed in this chapter where the Proxy relation is to be found, it occurs in a hypocentric phrase, where the SH is proxy of the DH, which is subordinate to the SH. But this clearly does not apply to the proxy relation holding between <e> and its binder. The conclusion to be drawn from this is that rather than the Surrogate and Proxy relations being merely convenient ways to formalize hypocentricity, they are in fact fundamental, and hypocentricity is merely a convenient label for phrases whose root is surrogate of one of its subordinates.

15. Coordination

Since its beginnings, WG has analyzed coordination as an exception to major and otherwise exceptionless principles. The first exception is that whereas the rest of syntax consists solely of dependencies between lexical nodes, i.e.

(65) a. She was reading <e>.
b. Thou shalt not kill <e>...
c. ... but need'st not strive officiously to keep <e> alive.
d. ... but need'st not strive officiously to keep <e> <e> alive.
[The SUBJECT and BINDER arcs of the original diagrams are not reproduced.]


words, coordinate structures employ nonterminal, nonlexical nodes, which are linked to other nodes not by dependencies but by part-whole relations. The nonterminal nodes are of types Conjunct and Coordination. For example, in (67), the coordination node, marked by curly brackets, is mother of two conjunct nodes, marked by angle brackets, and of and. The first conjunct node is mother of Sophy and of roses, and the second is mother of Edgar and of tulips.

(67) Give {<[Sophy] [roses]> and <[Edgar] [tulips]>}.

Coordination is thus an exception to the principle of Node Lexicality, which requires all nodes to be lexical.

The second exception is that branches in the tree can cross only where there is coordination, as shown in (68):

(68) He thinks she made her excuses and left.

Latterly, WG has handled this exception by doing away with a No Crossing Branches principle, and replacing it with a principle of 'Precedence Concord', which states that if X is a subordinate of Y, and Y is a subordinate of Z, then the precedence of X relative to Z must be the same as the precedence of Y relative to Z; so if X precedes Z then so must Y, and if X follows Z then so must Y. To this principle, (67-68) are not exceptions. But nor is (69) an exception to it either, and (69) is ungrammatical, due to the crossing branches. Hence a principle of No Crossing Branches is still required, and (68) is still an exception to it.
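The Precedence Concord principle as just stated can be sketched as a small check over word positions. This is a hedged illustration only: the function and the toy data are mine, not part of the WG formalism, and `subordinate` is supplied by hand as a set of (subordinate, superordinate) pairs.

```python
def precedence_concord_ok(positions, subordinate):
    """Check Precedence Concord: if X is a subordinate of Y and Y is a
    subordinate of Z, X must be on the same side of Z as Y is."""
    for (x, y) in subordinate:
        for (y2, z) in subordinate:
            if y2 != y:
                continue
            # Compare which side of Z each of X and Y falls on.
            if (positions[x] < positions[z]) != (positions[y] < positions[z]):
                return False
    return True

# 'He thinks she made her excuses and left': she and excuses are
# subordinates of made, which is a subordinate of thinks.
pos = {"he": 0, "thinks": 1, "she": 2, "made": 3, "excuses": 5}
subs = {("she", "made"), ("made", "thinks"), ("excuses", "made")}
assert precedence_concord_ok(pos, subs)

# A violating configuration: X follows Z although X's head Y precedes Z.
pos2 = {"x": 3, "y": 0, "z": 2}
subs2 = {("x", "y"), ("y", "z")}
assert not precedence_concord_ok(pos2, subs2)
```

Note that, as the text goes on to argue, this check alone does not rule out (69), which is why a No Crossing principle is still needed alongside it.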

Third, coordination is an exception to the principle of 'Branch Uniqueness', which requires each dependency between a word and its dependents to be of a different type. Hence a word cannot have more than one subject or more than one object, and so on.15 But in WG's analysis of (67), give has two indirect objects and two direct objects. It is easy enough to reformulate Branch Uniqueness so that it applies only to dependents that aren't conjoined with each other, but that just raises the question of why there should be such an exception.

Other things being equal, WG's model of grammar would be both simpler and more plausible if the principles of Node Lexicality, No Crossing and Branch Uniqueness were exceptionless. This calls for a wholly different analysis of coordination.16 I propose that coordinations are hypocentric phrases whose SH is the conjunction. Each conjunct is a DH. The conjuncts are dependents of the conjunction, and the conjunction is proxy of its dependents.


(69) *Give students tulips of linguistics. ['Give students of linguistics tulips.']


(70) She ate [apples AND oranges].

At a stroke, the exceptions to Node Lexicality and Branch Uniqueness are eradicated. There are no nonlexical nodes. Branch Uniqueness is preserved, because ate in (70) has only one object, namely and. As for No Crossing, and the analysis of (68), we return to this in section 17.2, which provides an analysis that does not violate No Crossing.

The obvious, glaring objection to this analysis of coordination comes from complex coordination, as in (71a), where the conjuncts appear not to be single phrases. (As pointed out in Hudson (1976), the position of the correlative shows that (71a) cannot be derived by deletion from (71b).) But the objection can be turned on its head, and (71a) can be taken as evidence that Sophy tulips is in fact a single phrase. Section 17.4 provides a sketch of how this could be.

(71) a. Give both Sophy tulips and Edgar roses.
b. *Give both Sophy tulips and give Edgar roses.

16. Correlatives

Another instance of hypocentricity in coordination arises with correlatives (both, either, neither). The correlative's position at the extreme edge of the phrase follows if it is the SH. The conjunction is complement of the correlative, and the correlative is proxy of the conjunction.

(72) a. She eats [BOTH apples and oranges].
b. She eats [EITHER apples or oranges].
c. She eats [NEITHER apples nor oranges].

Correlatives are one of the very rare instances, mentioned in section 6, of words whose complement is their complement target rather than a proxy of their target. This can be seen from the ungrammaticality of (73a) in contrast to (73b-73c). In (73a), the complement of both is or, which is proxy of each and: this is ungrammatical, because the complement of both must be and.

(73) a. *Find both [[Alice and Bill] or [Carol and Dave]].
b. Find (either) both Alice and Bill or both Carol and Dave.
c. Find Alice and Bill or Carol and Dave.

17. Dependency Types

In WG, dependencies are of different types, such as Subject and Object. In section 14 I suggested that these types could be reduced to labels on branches


in the sentence tree. But in this section I will argue that branches are unlabelled, and that there is no distinction between branches and dependencies; and so-called 'dependency types' are in fact lexical items in their own right. Thus, instead of X being subject of Y, there is a word, an instance of the lexical item <PREDICATION>, that has two dependents, X (the subject) and Y (the predicate). These words, which take over the job of grammatical relations ('GRs'), I will call 'GR-words'. GR-words belong to a class of function words characterized, in part, by phonologylessness.

This proposal is relevant to this chapter for two reasons. First, the phrases whose root is a GR-word are strongly or weakly hypocentric. And second, many of the other analyses made elsewhere in the chapter converge on the GR-word analysis as a more or less inescapable conclusion.

17.1 Adjuncts

Semantically, adjuncts are the converse of complements, in that whereas X is a semantic argument of Y when X is a complement of Y, Y is a semantic argument of X when X is an adjunct of Y. For instance, in She snoozed during the interval, the snoozing is the theme argument of 'during'. A natural logical corollary of the fact that the modificand is an argument of the modifier is that modification is recursive: after one modifier has been added to the modificand, another modifier (of the same type or another type) can always be added. This is, of course, because a predicate's argument 'place' ('attribute') can have only one 'filler' ('value'), but the filler of one argument place can also be the filler of many others. We would therefore predict that recursibility is a default property of adjunction. One cannot rule out, a priori, the possibility of special rules prohibiting adjunct recursion in certain constructions, but it is hard to imagine what systemic or functional motivation there could be for such a prohibition. The null hypothesis is therefore that all adjuncts are recursible.17

If adjuncts were simply dependents of the word they modify, then the principle of Branch Uniqueness ought to make them irrecursible. I propose instead that Adjunct is not a dependency but rather a GR-word. Rather than X being adjunct of Y, X and Y are dependents of an Adjunction GR-word; X is the modifier dependent and Y is the modificand dependent. Adjunction is a word class; the words it contains are the different kinds of adjuncts, such as <manner-adverbial>, <depictive>, and so forth. Adjunction phrases are hypocentric: the adjunction (the SH) is proxy of its first dependent, the modificand (the DH). This can be seen from (74), where it is dozed that satisfies the requirement of had for a past participle as its complement target:

(74) She had [<ADJUNCTION> [dozed off] [during the interval]].

The adjunction serves as the locus of constructional meaning. For example, (75a) has the meaning (75b) and the structure (75c), <depictive> being an adjunction. It is the word <depictive> that adds the meaning 'while', i.e. that


the relationship between her going to bed and her being agitated is that the former occurs during the latter.

(75) a. She went to bed agitated.
b. She went to bed while (she was) agitated.
c. She [<depictive> [went to bed] [agitated]].

A further merit of adjunctions is that they explain what Hudson (1990) calls 'semantic phrasing'. For example, (76a-76b) are not synonymous. (76a) says that what happens provocatively is her undressing slowly, while (76b) says that what happens slowly is her undressing provocatively. This nuance of meaning is reflected directly in the structure, (77a-77b).

(76) a. She undressed slowly provocatively.
b. She undressed provocatively slowly.

(77) a. She <adjunction> <adjunction> undressed slowly provocatively.
b. She <adjunction> <adjunction> undressed provocatively slowly.
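The scope contrast in (77) can be modelled by nesting adjunction nodes, the outer adjunction scoping over the inner one. The tuple encoding and the paraphrasing function below are my own illustrative devices, not the chapter's notation; they simply make the two nestings and their distinct readings explicit.

```python
# Each adjunction node pairs a modificand with a modifier:
# ("adjunction", modificand, modifier). Nesting order mirrors scope.
a77a = ("adjunction", ("adjunction", "undressed", "slowly"), "provocatively")
a77b = ("adjunction", ("adjunction", "undressed", "provocatively"), "slowly")

def gloss(node):
    """Paraphrase an adjunction tree: the outer modifier scopes over the
    inner event, mirroring the 'semantic phrasing' of (76a-76b)."""
    if isinstance(node, str):
        return node
    _, modificand, modifier = node
    return f"[{gloss(modificand)}] happens {modifier}"

# (76a): what happens provocatively is her undressing slowly.
assert gloss(a77a) == "[[undressed] happens slowly] happens provocatively"
# (76b): what happens slowly is her undressing provocatively.
assert gloss(a77b) == "[[undressed] happens provocatively] happens slowly"
```

The point carried over from the text is that the meaning difference falls out of the bracketing alone, with no appeal to dependency types.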

Noun+noun premodification structures present a conundrum soluble only by means of adjunctions. The conundrum rests on the difficulty of reconciling evidence from word order with evidence from ellipsis. Ellipsis, as in (78), demonstrates that the modifying noun cannot be a dependent of the modified noun, since the modified noun can delete while the modifying noun remains:

(78) On one reading, it receives a count interpretation and on the other reading it receives a mass interpretation.

We might for a moment suppose that the modifying noun is like an attributive adjective, and its complement is a proxy of the modified noun (cf. section 10). In this case, (79a-79b) would have the indicated dependency structure. Their ambiguity would then hinge not on the dependency structure but on the complement targets, shown in (80-81) by dotted arrows pointing to the complement target.

(79) a. old clothes bag

b. work clothes bag

(80) a. old clothes bag ['bag for old clothes']

b. work clothes bag ['bag for work clothes']


(81) a. old clothes bag ['old bag for clothes']

b. work clothes bag ['work bag for clothes', 'clothes bag for work']

But then we find that this analysis falls foul of word order evidence. The structure given to (82a) fails to rule out (82b):

(82) a. revenge kitchen implement attack ['revenge attack with kitchen implement']
b. *kitchen revenge implement attack ['revenge attack with kitchen implement']

But the traditional WG analysis, where the modifying noun is a dependent of the modified noun, makes the right predictions here, as shown in (83a-83b), even though it is incompatible with the ellipsis facts:

(83) a. revenge kitchen implement attack ['revenge attack with kitchen implement']
b. kitchen revenge implement attack ['revenge attack with kitchen implement']
[The dependency arrows of the original diagrams are not reproduced.]

The solution is to be found if this construction involves an adjunction, '<n+n>'. This adjunction allows its second dependent to delete, as in (84). It gives the structures in (85-86). And these structures succeed in excluding (87b) as a No Crossing violation.

(84) On one reading, it receives a count interpretation and on the other reading it receives a [<n+n> [mass] [interpretation]].

(85) a. [<n+n> [old [clothes]] [bag]] ['bag for old clothes']
b. [<n+n> [<n+n> [work] [clothes]] [bag]] ['bag for work clothes']

(86) a. [old [<n+n> [clothes] [bag]]] ['old bag for clothes']
b. [<n+n> [work] [<n+n> [clothes] [bag]]] ['work bag for clothes', 'clothes bag for work']

(87) a. <n+n> revenge <n+n> <n+n> kitchen implement attack ['revenge attack with kitchen implement']


b. *<n+n> <n+n> <n+n> kitchen revenge implement attack ['revenge attack with kitchen implement']

17.2 Subjects

Conjoined predicates, as in (88a), present a problem. If the structure is as in (88b), then No Crossing is violated. If the structure is as in (88c) or (88d), then some kind of rule of leftwards extraposition of subjects is required:

(88) a. He thinks she made her excuses and left.
b. He thinks she made her excuses and left.
c. He thinks she made her excuses and left.
d. He thinks she <e> made her excuses and <e> left.
[The dependency arcs distinguishing (88a-88d) are not reproduced.]

As we saw in section 5, exactly the same problem arises with extent operators:

(89) a. He knows she all but perished.
b. He knows she all but perished.
c. He knows she all but perished.
d. He knows she all but <e> perished.
[The dependency arcs distinguishing (89a-89d), parallel to (88a-88d), are not reproduced.]

But under the GR-word analysis, the problem evaporates. The GR-word <predication> has two dependents,18 the first corresponding to the subject and the second corresponding to the predicate:19



(90) a. He thinks <predication> she made her excuses and left.

b. He knows <predication> she all but perished.

<Predication> phrases are quasihypocentric in the sense defined in section 6. A phrase [X [<predication> [Y] [Z]]] does not freely alternate with [X [Z]]. In nontechnical and atheoretical terms, predicative phrases of category C do not freely alternate with nonpredicative phrases of category C. But X and Z are nevertheless sensitive to one another's presence, as can be seen from (91a-91b). The complement of auxiliary have must be a <predication> whose second dependent is proxy of a past participle. The complement of wax must be a <predication> whose second dependent is proxy of an adjective.

(91) a. She had [<PREDICATION> <e> perished].
b. She waxed [<PREDICATION> <e> wroth].

That <predication> is not proxy of its second dependent can be seen from the fact that one <predication> cannot be second dependent of another, i.e. that multiple subjects cannot occur.

(92) a. *She he went.
b. *<predication> She <predication> he went.

As it stands, the analysis makes it look coincidental that it is only the second ('predicate') dependent of <predication> and not the first ('subject') that has DH-like properties. Therefore the grammar should perhaps formally accord the second dependent in this construction a special status. Let us therefore call <predication> the 'guardian' of its second dependent, the metaphor being that the second dependent is a legal minor and its intercourse with its superordinates must always be mediated by its guardian. And let us add rules (93a-93b); (93b) replaces (24c):

(93) a. If X is surrogate of Y, then X is guardian of Y.
b. If X is guardian of Y, and Y is guardian of Z, then X is guardian of Z.

In this case, the complement of auxiliary have must be a <predication> that is guardian of a past participle, and the complement of wax must be a <predication> that is surrogate of an adjective.
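Rules (93a-93b) amount to a relational closure: seed the Guardian relation from Surrogate, then close it under transitivity. The sketch below is my encoding, not the chapter's notation, and the example treats the extent operator all but as surrogate of the participle, taking the Proxy case as a limiting case of Surrogate as the chapter suggests.

```python
def guardians(surrogate, guardian):
    """Compute the Guardian relation licensed by rules (93a-93b)."""
    g = set(guardian) | set(surrogate)   # (93a): every surrogate is a guardian
    changed = True
    while changed:                       # (93b): transitive closure, to a fixed point
        changed = False
        new = {(x, z) for (x, y) in g for (y2, z) in g if y == y2} - g
        if new:
            g |= new
            changed = True
    return g

# 'She had all but perished': <predication> is guardian of its second
# dependent, the extent operator all but, which is surrogate of perished.
g = guardians(surrogate={("all but", "perished")},
              guardian={("<predication>", "all but")})
assert ("<predication>", "perished") in g   # have's requirement is satisfied
```

The design point is that have need only ask for a <predication> that is guardian of a past participle; the chain through intervening words falls out of the closure.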

17.3 Topics and finiteness

Topics are phrases, like white chocolate in (94), that have been moved to the position immediately preceding the preverbal subject.



(94) White chocolate, I can't help gorging myself on _.

Like Subject and Predicate, both the topic and the 'comment' phrases can be a coordination.

(95) a. White chocolate, she keeps on giving me and I can't help gorging myself on.

b. Both white chocolate and Cheshire cheese, I can't help gorging myself on.

As with <predication>, these facts motivate a GR-word for the topic-comment structure, its first dependent being the topic and its second the comment. The second dependent of this GR-word is a <predication> that is guardian of a verb or auxiliary.

(96) [<'topic-comment'> [White chocolate], [<predication> [I] [can't help gorging myself on]]].

Topics occur only in finite clauses. On the unproblematic assumption that the structure of It is is (97), it can also be maintained that all finite clauses contain topics.

(97) [<'topic-comment'> [it]z [<predication> [<e>z] [is]]]

<Topic-comment> can therefore be equated with finiteness: a finite clause is one that contains <topic-comment>, which we could equally well call <finite>.

I leave for future investigation issues about the relationship between <finite> and mood and tense, about verbal inflection, and about whether mood exists as a grammatical category in English.20 At any rate, it is clear that <finite> phrases are at most weakly hypocentric. If Indicativity and Subjunctivity are subtypes of <finite>, then <finite> phrases are not hypocentric at all, since know can select for a surrogate of <indicative>, require for a surrogate of <subjunctive> and insist for a surrogate of <finite>. If, on the other hand, the mood distinctions are located lower down within the <finite> phrase, then know/require/insist will select for a surrogate of <finite>, but know and require will further stipulate that their complement must be guardian of wherever the appropriate mood distinction is located.

17.4 Complements

An inescapable corollary of the proposed analysis of coordination is that conjuncts are complete phrases.21 In this section I sketch how this must work, though the sketch is of a solution strategy rather than an analysis worked through in detail.


(98a) shows that - uncontroversially - eat cheese is a complete phrase. But (98b) shows that core is a complete phrase too. So in a verb+object construction, V+O is a complete phrase, but so too is V on its own. How can this be?

(98) a. She will [[eat cheese] and [drink wine]].
b. She will [[core] and [peel]] the apples.

The answer has to be that there is an extra GR-word present, whose function is like that of an X' node, uniting into a single phrase its two separate dependents, V and O. This gives the structures in (99a-99c).

(99) a. She will [[<X'> eat cheese] and [<X'> drink wine]].
b. She will <X'> [[core] and [peel]] the apples.
c. She will <X'> eat [cheese and bread].

Similarly, (100a) shows that Sophy is a complete phrase in (100d), and (100b) shows that roses is a complete phrase in (100d). But (100c) shows also that Sophy roses is a complete phrase in (100d).

(100) a. She will give Sophy and Edgar roses.
b. She will give Sophy roses and tulips.
c. She will give Sophy roses and Edgar tulips.
d. She will give Sophy roses.

There must therefore be an additional GR-word present - let's call it '<ditransitive>'. (100a-100d) must have structures (101a-101d).

(101) a. She will <X'> give <ditransitive> Sophy and Edgar roses.
b. She will <X'> give <ditransitive> Sophy roses and tulips.
c. She will <X'> give <ditransitive> Sophy roses and <ditransitive> Edgar tulips.
d. She will <X'> give <ditransitive> Sophy roses.


(102a) shows that roses today is a phrase. Today is an adjunct, but not of roses: it is not the roses that occur today but rather their being given. Today must be an adjunct of a GR-word that marks the second object, as in (102b). (103a) therefore has structure (103b).

(102) a. She will give Sophy roses today and tulips tomorrow.

b. She will <X'> give <ditr> Sophy <adj> <GR> roses today and <adj> <GR> tulips tomorrow.

(103) a. She will give Sophy roses.

b. She will <X'> give <ditr> Sophy <GR> roses.

With the exception of <transitive>, the GR-words involved in complementation would be guardians rather than surrogates or proxies, since the GR-words are not freely omissible in the way that surrogates and proxies are. As for the kind of hypocentricity, if any, involved with <X'>, I leave this for future investigation.

18. Conclusion

This chapter has demonstrated the existence - and indeed the prevalence - of hypocentricity, the syntactic phenomenon whereby the distribution of a phrase is determined not by the root of the phrase but by a word subordinate to the phrase root. Hypocentricity comes in different 'strengths'. In the strongest form of hypocentricity, a phrase with a given SH is in free variation with a version of the phrase with the SH absent. These are the hypocentric constructions that involve the Proxy relation. The instances discussed in this chapter involve (i) 'extent operators' like almost, not and all but; (ii) 'focusing subjuncts' like even; (iii) attributive adjectives; (iv) the type-of construction; (v) coordinating conjunctions; (vi) correlatives like both; and (vii) adjunctions, which are the invisible words that link adjuncts to their modificands. Apart from coordination, these could all be called 'modifier constructions'. In addition it has been suggested that invisible bound variables are proxy of their binder, even though, exceptionally, the binder would not be a subordinate of its proxy.

In hypocentricity of 'intermediate strength', a phrase with a given SH is in distributional alternation with a version of the phrase with the SH absent, but the variation is limited to certain environments. These are the hypocentric constructions that involve the Surrogate relation. The instances discussed in this chapter involve (i) clausal that; (ii) inside-out interrogative clauses, which behave like clausal determiners; (iii) pied-piping; (iv) degree words; and (v) determiners.

In the weakest form of hypocentricity, there is no distributional alternation, but the DH is nevertheless sensitive to material external to the hypocentric



phrase. These are the hypocentric constructions that involve the Guardian relation. The instances discussed in this chapter involve the invisible GR-words <predication>, which is the root of the subject+predicate construction, and <finite>, which is the root of the topic+comment construction, and various other GR-words that form the structural basis of complementation.

The relations Proxy and Surrogate are initially motivated as mechanisms that provide an analysis for constructions that cannot otherwise be satisfactorily handled by WG. Once this mechanism is admitted, it opens the way - or the floodgates - for a series of increasingly radical (and increasingly sketchy and programmatic) analyses of coordination and of grammatical relations, which aim to simplify WG by drastically reducing the range of devices from which syntactic structure is constituted, while still remaining consistent with WG's basic tenets. The devices that are done away with are (i) exceptions to the principle of Node Lexicality, i.e. nonlexical phrasal nodes, which orthodox WG uses for coordination; (ii) exceptions to the No Crossing principle barring crossing branches in the sentence tree; (iii) dependencies that are not associated with branches in the sentence tree; and, perhaps, (iv) dependency types tout court. In their most extreme form, these changes result in a syntactic structure consisting of nothing but words linked by unlabelled branches forming a tangle-free tree, supplemented by Binder, Proxy, Surrogate and Guardian relations. While I believe the necessity for Proxy and Surrogate relations is demonstrated fairly securely by the earlier sections of the chapter, their extended application in the analysis of coordination, empty variables and grammatical relations is of a far more speculative nature. But my aim in discussing these analyses in this chapter has been to point out how they are possible within a WG model and why they are potentially desirable.

References

Cormack, Annabel and Breheny, Richard (1994), 'Projections for functional categories'. UCL Working Papers in Linguistics, 6, 35-62.

Hudson, Richard (1976), 'Conjunction reduction, gapping and right node raising'. Language, 52, 535-62.

— (1990), English Word Grammar. Oxford: Blackwell.

— (2004), 'Are determiners heads?'. Functions of Language, 11, 7-42.

Jaworska, Ewa (1986), 'Prepositional phrases as subjects and objects'. Journal of Linguistics, 22, 355-74.

Payne, John (1993), 'The headedness of noun phrases: Slaying the nominal hydra', in Greville G. Corbett, Norman M. Fraser and Scott McGlashan (eds), Heads in Grammatical Theory. Cambridge: Cambridge University Press, pp. 114-39.

Quirk, Randolph, Greenbaum, Sidney, Leech, Geoffrey N. and Svartvik, Jan Lars (1985), A Comprehensive Grammar of the English Language. Harlow: Longman.

Rosta, Andrew (1994), 'Dependency and grammatical relations'. UCL Working Papers in Linguistics, 6, 219-58.

— (1997), 'English Syntax and Word Grammar Theory'. (Unpublished doctoral dissertation, University of London.)

Van Langendonck, Willy (1994), 'Determiners as Heads?'. Cognitive Linguistics, 5, 243-59.


Notes

1 In standard Phrase Structure Grammar, lexical nodes are terminal and nonlexical nodes are nonterminal. If a nonterminal node is defined as one that contains others, then in WG all nodes are terminal. But this terminology is a bit misleading, since in a WG tree structure, terminal nodes are ones that have no subordinates. Hence it is more perspicuous to define WG as maintaining that all nodes are lexical.

2 This problem posed by require was pointed out by Payne (1993); cf. also Cormack and Breheny (1994).

3 See Jaworska (1986) for examples of such nonpredicative prepositions.
4 These judgements are for conservative Standard English. Admittedly there is the famous line 'The boy stood on the burning deck, whence all but he had fled' (Felicia Hemans, 'Casabianca'), but that could be a solecism induced by the register, hypercorrectively, and by the disconcerting unfamiliarity of the word-sequence him had in contrast to he had. It is also true that for many speakers of contemporary English, the rules for the incidence of personal pronouns' subjective forms, especially in less colloquial registers, seem to be pretty much of the make-them-up-as-you-go-along or when-in-any-doubt-use-the-subjective-form sort.

5 It has to be admitted that this claim flies in the face of a certain amount of evidence to the contrary, notably determiner-complement ellipsis, as in (i), and pseudogapping, as in (ii). (In both examples the elided material is shown in parentheses.)
(i) Do students of Lit tend to be brighter than those (students) of Lang?
(ii) She will do her best to bring the food, as will he (do his best to bring me) wine.

6 (13f) raises its own analytical curiosities, which I won't investigate further here. Logically, the structure is 'not [[go out without knickers] and [still stay decent]]'; that is, the ellipsis is of one conjunct. Egregiously unexpected though such a phenomenon is, I find that the analogous structure in (i) is, also surprisingly, acceptable.
(i) Nobody likes to complain, but she should [[(complain)] and [be the happier for it]]. (The parenthesized conjunct is elided.)
7 The arrows below the words represent dependencies that don't form branches in the tree structure. (See section 14 on the eradication of such dependencies.)
8 More generally, I would maintain that open class lexemes are invisible to syntax and hence that selectional rules cannot refer to them. Only word classes are visible to syntax and can be involved in selectional rules. WG has always held that lexemes are word classes, but my contention is that this is true only of closed class lexemes, in that closed class lexemes are word classes that are associated with particular phonological forms, whereas open class lexemes are morphological stems (which is why processes of derivational morphology, which output stems, can output only open class lexemes). Way (and a few other similar words, such as place) would be a subclass of Common Noun.

9 More precisely, the rule is that by default, nouns receive a mass interpretation, but the complement of certain determiners, such as an, must be proxy of a noun that receives a count interpretation.

10 I don't know why (56c) is grammatical. Cake has a count interpretation, so ought to be complement target of a determiner, but if it is complement target of these then it ought to agree in number with these.

11 I suggest that the reason is that common nouns take plural inflection only when complement of a plural determiner. (This supposes that bare plurals are complement of a phonologically invisible plural an.) In (55b) there is no plural determiner present to trigger the plural inflection on cakes.


12 This is an approximation. Make actually requires a complement that is surrogate of a predicative word. In the analysis of predicativity given in section 17.2, <predicative> would be complement of make, and the surrogate of virile would be dependent of <predicative>.

13 By 'empty category', I mean traces and suchlike, that have a privileged syntactic status, and are empty not only of phonological content but also of ordinary lexical content. As noted in section 5, it is also possible for ordinary words in particular environments to lack phonology; cf. also Creider and Hudson (this volume).

14 (65b-65c) from Arthur Hugh Clough's 'The latest decalogue'.
15 Besides coordination, adjuncts appear to constitute an exception to Branch Uniqueness. A word can have more than one adjunct - indeed, that is part of the definition of the Adjunct relation. But section 17.1 provides an analysis of adjuncts that removes the exception to Branch Uniqueness.

16 A further problem with the WG analysis of coordination is that it cannot easily accommodate the fact that the definitional boundary between coordination and subordination is gradient rather than clearcut as one would expect were coordination and subordination handled by fundamentally different mechanisms. (See Rosta 1997 for a full demonstration of this point.)

17 This null hypothesis stands up extremely well to the data, but there are some constructions where a dependent is irrecursible but is not subject to selectional restrictions and is not an argument of the modificand, and hence has selectional and semantic properties more typical of adjuncts than of complements. But such dependents are best seen as atypical complements. One example is the result dependent in the resultative construction, e.g. soggy in (i). Another is the indirect object me in (ii).
(i) She sneezed the hankie soggy.
(ii) Fry me some bacon.
Another example is bare relative clauses (BRCs), in those dialects in which BRCs are not recursible.
(iii) %The book [I'd been asking for _] [she finally bought me _] turned out to be crap.
Another respect in which BRCs are unlike adjuncts is that, as (iv) shows, they don't extrapose, though this point should be taken as suggestive rather than conclusive, since it is not clear how consistently extraposability distinguishes adjuncts from complements or other nonadjuncts.
(iv) [That book _] has arrived [*(that) you ordered _].

18 In the absence of evidence to the contrary, GR-words are assumed to precede their dependents, since this is the default order for English. Indeed, for English it may eventually turn out to be an exceptionless principle that dependents follow their regent. But there are plenty of exceptions that are hard to explain away, such as too 'also' ([[She] [too]] went), degree enough, possessive 's ([[Sophy]'s [father]]), and nonfinal dependents of conjunctions.

19 A corollary of this analysis is that the 'inverted subject' in subject-auxiliary inversion is not in fact a subject. Rather, it is an object (of the auxiliary) that has not been raised to subject position.

20 The data to be accounted for is summarized in (i-ix).
(i) though she is/was/%be/*were/goes/%go/went mad
(ii) if she is/was/%be/were/goes/%go/went mad
(iii) insist that she is/was/be/*were/goes/go/went mad
(iv) require that she *is/*was/be/were/goes/go/went mad
(v) know that she is/was/*be/*were/goes/*go/went mad
(vi) She would, *is/%was/*?be/%were/*goes/*go/*went she mad.
(vii) She would, is/%was/*?be/%were/*goes/*go/*went you to.
(viii) I would prefer that you not be/%be/*is/*is.
(ix) She almost *be/*be/is/%is.

21 In determining what sorts of phrase can be coordinated, it is important to factor out the extraneous but distorting effects of Right Node Raising-type operations, which delete the phonology of part of one conjunct. See Rosta (1997).


9 Factoring Out the Subject Dependency

NIKOLAS GISBORNE

Abstract
This chapter offers a revision to the English Word Grammar (EWG) model by factoring out different kinds of dependency. This is because the information encoded in the EWG model of dependencies is not organized at the appropriate level of granularity. It is not enough to say, for example, that by default the referent of a subject is the agent of the event denoted by the verb.

1. Introduction

The English Word Grammar model treats dependencies as asymmetrical syntactic relations (Hudson 1990: 105-8), where the critical information is the asymmetry and the relative ordering of head and dependent. Hudson (1990: 120-1) goes on to treat grammatical relations as a particular subclass of dependency relation, and to identify certain semantic roles as being prototypically linked to certain grammatical relations. For the English Word Grammar model, therefore, dependencies are labelled asymmetrical syntactic relations which are also triples of semantic information, syntactic relation information and word-order information bound together by default inheritance. The theory is syntactically minimalist: all syntactic phenomena are analyzed in terms of dependency relations and the categorization of the words that the dependencies relate to.

The result is a highly restrictive model of grammar, where all relationships are strictly local. It differs from other lexicalist frameworks, such as Lexical Functional Grammar (LFG), in that there are not different domains of structure which represent different kinds of information. All grammatically relevant information for WG is read off the lexicon and the dependency information. Within this theory of grammar, Hudson (1990) uses an inventory of dependencies which is pretty much what you find making up the set of grammatical relations in both traditional grammar and classical transformational grammar.
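The two mechanisms just described - dependencies as labelled, asymmetric word-to-word relations, and properties linked to them by default inheritance - can be sketched in a few lines of code. This is an illustrative sketch only, not the WG formalism itself: every name below (Word, DEFAULTS, ISA, inherited) is my own invention for exposition.

```python
# A minimal sketch (not the WG formalism) of dependencies as labelled,
# asymmetric word-to-word relations, with default inheritance
# approximated by walking a chain of "isa" parents.

DEFAULTS = {
    "verb": {"subject_role": "agent"},          # default linking
    "passive-verb": {"subject_role": "theme"},  # more specific: overrides
}

ISA = {"passive-verb": "verb"}  # passive-verb isa verb

def inherited(category, attribute):
    """Default inheritance: use the most specific value, else the parent's."""
    while category is not None:
        props = DEFAULTS.get(category, {})
        if attribute in props:
            return props[attribute]
        category = ISA.get(category)
    return None

class Word:
    def __init__(self, form, category):
        self.form = form
        self.category = category
        self.dependents = []  # (label, Word) pairs: asymmetric and labelled

    def add_dependent(self, label, word):
        self.dependents.append((label, word))

# "The dog chased the cat": chased has a subject and an object dependent.
chased = Word("chased", "verb")
chased.add_dependent("subject", Word("dog", "noun"))
chased.add_dependent("object", Word("cat", "noun"))

print(inherited("verb", "subject_role"))          # prints "agent" (default)
print(inherited("passive-verb", "subject_role"))  # prints "theme" (override)
```

The point of the sketch is only that a single inheritance mechanism can state both the default (subject's referent is the agent) and a more specific override, which is the pattern the chapter goes on to refine.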

In this chapter, I offer a revision to the English Word Grammar model by factoring out different kinds of subject dependency. This is because the information encoded in the EWG model is not organized at the appropriate level of granularity. It is not enough to say, for example, that by default the referent of a subject is the agent of the event denoted by the verb. This is because there are at least three kinds of subject in English: subjects triggered by finiteness (on the grounds that English is not a 'pro-drop' language); subjects triggered by predicative complementation; and 'thematic' or 'lexical' subjects such as the subjects of gerunds and other inherently predicating expressions. Subjects triggered by finiteness are not required to be in any kind of semantic relationship with the event denoted by the verb. Similar observations about the non-uniformity of the subject relationship are found in McCloskey (1997, 2001).

In this chapter, therefore, I review the inventory of dependencies in Word Grammar, and establish a more fine-grained account of subjecthood than the model of Hudson (1990) envisages. I focus on data introduced in Bresnan (1994). Bresnan (1994) explores locative inversion, shown in (1), and shows that in inverted sentences like (1b), the subject properties are split between the italicized PP and the emboldened NP:

(1) a. A lamp was in the corner.
b. In the corner was a lamp.1

Bresnan's (1994) account explains the split subject properties in terms of the parallel architecture of LFG, where grammatical information is handled in terms of a-structure, f-structure and c-structure. I show that the revised Word Grammar account can capture the same kind of data as LFG within a more parsimonious ontology.

The chapter is organized into 5 sections. In section 2, I discuss the different dimensions of subjecthood, and explore the different properties that subjects have been claimed to display since Keenan (1976). In section 3, I lay out the data that needs to be discussed (drawn from Bresnan 1994), and explain the problems that this data presents. In section 4, I present the refined view of subjecthood that this chapter argues for, and show how it accounts for the data. The final section, section 5, presents the conclusions and some prospects for future research.

2. Dimensions of Subjecthood

Subject properties have been gathered up in several different places - for example, Keenan (1976), Keenan and Comrie (1977), and Andrews (1985). Subjects have been shown to have diverse properties across languages, and it has been shown that not every subject property is always displayed by all subjects in a given language. It is this observation that drives Falk's (2004: 1) claim that 'a truly explanatory theory of subjecthood has yet to be constructed'. In this section, I itemize and exemplify some of the major features of subjecthood, which are generally held to apply crosslinguistically, and present three diagnostics which apply parochially to English. I have relied on the presentation of these properties in Falk (2004: 2-5), where they are usefully gathered together. Not all of the subject properties laid out here are directly relevant to the analysis of the split-subject phenomena found in locative inversion, but they are relevant to the broader conclusions about subjecthood that this case study takes us to, and which are laid out in section 5.


2.1 Typical subject properties

Subjects are typically the dependent which expresses the agent argument in the active voice. This is shown in (2):

(2) a. The dog chased the cat.
b. The cat was chased by the dog.

In (2a), the subject is also the agent of the action. In order for the subject not to have to be the agent, passive voice is available as in (2b). Voice phenomena are devices for re-arranging the arguments of the verb so that the agent no longer has to be presented as the subject. Of course, it is not always the case that subjects are agents, because there are verbs that do not have agentive subjects, as in (3), but many linguists follow Jackendoff (1990) in assuming a hierarchy of semantic roles, where the most agent-like is always the one which links to the subject.

(3) a. Jimmy weighs 90kg.
b. The glass shattered.

So we can use semantic role as a subject-diagnostic.

The second diagnostic is that sole arguments of intransitives typically show (other) subject properties. For example, tag questions and subject-auxiliary inversion are diagnostics of subjecthood in English, and the subject of (3b) above and (4a) can have the relevant diagnostic applied to it.

(4) a. The glass shattered, did it/*he/*she/*they?
b. Did the glass shatter?

From this, we can say that the glass in (3b) is shown to be the subject by the diagnostics in (4).

The addressee of an imperative is a subject. In the following examples, the addressee has the status of the subject, irrespective of its semantic role.

(5) a. Go away!
b. Be miserable, see if I care!

From the imperative examples, we can see that it is also possible for subjects to be covert.2

One widely noted diagnostic is to do with anaphora. There is a subject-object asymmetry, which becomes evident when subject and object are co-referential. In the case of co-reference, it is the object which is expressed as a (reflexive)3 pronoun. This is shown in (6):

(6) a. Jane hurt herself.
b. *Herself hurt Jane.

There is cross-linguistic variation in this construction. In English, there is a hierarchy of grammatical functions so that the reflexive pronoun has to be lower in the hierarchy than its antecedent. In some other languages, only subjects may be antecedents of reflexive pronouns.

The subject is the only argument which may be shared in a predicative complementation structure (in fact, in both varieties of predicative complementation - raising and control). This is shown by the examples in (7) for 'control' verbs, and (8) for 'raising' verbs. The subject of the xcomp is shared with either the object or the subject of the matrix verb.

(7) a. Jane persuaded the doctor to see Peter.
b. Jane persuaded Peter to be seen by the doctor.
c. The doctor was persuaded to see Peter.4
d. *Jane persuaded Peter the doctor to see.

In (7a), the doctor is shared between persuaded and to see Peter, because to is the xcomp of persuade. The passivization facts in (7b) show us that the doctor in (7a) is the subject of to (and see). The passivization facts in (7c) show us that the doctor is an argument that is shared with persuaded, because it is also the object of persuaded. The ungrammatical (7d) shows that Peter cannot be the object of persuaded and of see at the same time. Therefore, the property of being sharable with a higher predicate is a property of subjects, not other arguments.

(8) a. It seems that Jane likes Peter.
b. Jane seems to like Peter.
c. Peter seems to be liked by Jane.
d. *Peter seems Jane to like _.

The relationship between (8a) and (8b) shows that in (8b), Jane is the subject of both seems and to like. The example in (8c) shows that the passive subject of (to be) liked can also be shared with seems. The ungrammatical (8d) shows that it is not possible to exploit the object of like as the shared subject of seems.

Falk (2004) claims that the subject is the only argument which can be shared in coordination. The examples in (9) show that a subject can be shared by two conjoined verbs, but not an object:

(9) a. Jane kissed Peter and hugged Jim.
b. *Jane kissed Peter and Cassandra hugged _.

However, this observation is not quite right. In Right Node Raising, the object of the second conjunct can be shared, as in Jane kissed, and Cassandra hugged, Peter. Right Node Raising needs to be treated as a special construction type because, among other things, it comes with particular intonation - indicated here by the commas - which is not a necessary part of the argument sharing in (9a). But, it is also the case that it is possible to say Cassandra peeled and ate a grape. Here, both the object and the subject are shared by the conjoined verbs.

There are two remaining properties of subjects which can be stated very generally. The first is that in many languages the subject is obligatory, as it is in English (except in the case of imperatives). This observation gives rise to the Projection Principle of Chomsky (1981), and its later incarnation as the Extended Projection Principle (EPP). The second fact is that subjects are usually discourse topics.

In the next section, I identify some subject properties that are found parochially in English.

2.2 Parochial subject diagnostics for English

The first is that subject-inversion is found in main-clause interrogatives.

(10) a. Jane was running.
b. Was Jane running?
c. Jane ran.
d. Did Jane run?

As (10a-10b) show, where there is an auxiliary in the corresponding declarative clause, it inverts with the subject in interrogatives. The examples in (10c-10d) show that where there is no auxiliary in the corresponding declarative clause, one has to be supplied in the interrogative.

The next diagnostic for English is that tag-questions have properties that are unique to subjecthood: the pronoun in a tag question has to agree with the person, number and gender 'features' of the noun or noun phrase in the matrix clause it replaces. If we look back at the example in (4a), we can see that the only legal pronoun in the tag question is it, which has features appropriate to the glass.

The last diagnostic that I want to look at concerns extraction. There are two main properties in English: the Condition on Extraction Domains shows us that it is easier to extract out of complements than out of adjuncts, which in turn are easier to extract out of than subjects. And the THAT-trace effect shows that - in general terms - English subjects resist extraction. (In other languages subjects are often more extractable than other arguments. Keenan and Comrie (1977) showed that in terms of the THAT-trace effect, English is atypical: they found that cross-linguistically, subjects were more, not less, likely to be extracted.) There are relevant data in (11):

(11) a. Jane thinks that Peter is a drunk.
b. *Who does Jane think that is a drunk?
c. Who does Jane think is a drunk?
d. What does Jane think that Peter is _?

The example in (11a) gives the basic declarative sentence, (11b) shows that it is impossible to extract a subject after that, even though (11c) shows that it is possible to extract a subject out of a finite complement clause when there is no that, and (11d) shows that it is possible to extract other arguments out of a finite complement clause, like the object.


2.3 Subject-verb agreement

Subject-verb agreement is not universally found as a subject property; however, it is not a parochial property of English either. As a phenomenon, agreement is complex - some languages have agreement that works across a range of dimensions, whereas English only shows agreement in terms of number (Hudson 1999). Although English has subject-verb agreement, which is very common in Semitic, Bantu and Indo-European languages, some other languages have no agreement morphology at all - for example the modern Scandinavian languages and the Sinitic languages. English subject-verb agreement is shown in (12).5

(12) a. The girl likes the dog.
b. The girls like the dog.
c. *The girl like the dog.
d. *The girl like the dogs.

The examples in (12a-12b) show that the number feature of the finite verb co-varies with the number of the subject. If the subject is plural, so is the verb: girls triggers like. The example in (12c) shows that a plural verb requires a plural subject, and the example in (12d) shows that English does not have agreement with objects, so a plural object cannot rescue a plural verb that has a singular subject. The agreement phenomena of English are significant in the discussion of locative inversion that follows.

This, then, completes the review of subject properties. A number of these properties were exploited by Bresnan in her (1994) article, which is discussed in the next section, but before we turn to section 3, I shall just summarize the subject properties in three bullet-point lists here:

General subject properties.
• subjects are typically the dependent which expresses the Agent argument in the active voice;
• the sole arguments of intransitives typically show (other) subject properties;
• the addressee of an imperative is a subject;
• there is a subject-object asymmetry, such that where subject and object are co-referential, it is the object which is expressed as a (reflexive) pronoun;
• the subject is the only argument of an xcomp which may be shared in a predicative complementation structure (in both raising and control);
• the subject can be the shared argument in coordination;
• the subject is often obligatory;
• subjects are usually the discourse topic.

Parochial subject diagnostics for English
• English main-clause interrogatives show subject-inversion;
• tag questions show agreement between the pronoun tag and the subject;

Parochial subject diagnostics for English• English main-clause interrogatives show subject-inversion;• tag questions show agreement between the pronoun tag and the subject;

• English subjects resist extraction. (In other languages, subjects are often more extractable than other arguments.)

Agreement
• subject-verb agreement: subjects agree with their verb.

The issue, at least in as much as the locative inversion data constitute a problem for a story of subjecthood, is to do with which of these subject properties belong together. In the next section, I look at the locative inversion data presented in Bresnan (1994), and then in section 4, I look at these subject properties in the light of Bresnan's findings about the arguments in the locative inversion construction.

3. The Locative Inversion Data

Bresnan (1994) presents an account of locative inversion which carefully details the circumstances within which locative inversion can take place, and which also describes the discourse factors, as well as the grammatical factors, which affect locative inversion. Locative inversion in English is shown in (1) above, and (13) and (14) below.

(13) a. My friend Rose was sitting among the guests.
b. Among the guests was sitting my friend Rose.

(14) a. The tax collector came back to the village.
b. Back to the village came the tax inspector.

As Bresnan (1994: 75) puts it, 'locative inversion involves the preposing of a locative PP and the postposing of the subject NP after the verb. The positions of the locative and subject arguments are inverted without changing the semantic role structure of the verb.' Bresnan (1994: 75-6) sets out the limits of locative inversion, excluding other kinds of inversion around be from the discussion, and limiting the phenomenon to examples like those in (15).

(15) a. Crashing through the woods came a wild boar.
b. Coiled on the floor lay a one-hundred-and-fifty-foot length of braided nylon climbing rope three-eighths of an inch thick.

The examples in (15) can be included in the set of locative inversion data because the inverted VPs involve a locative PP, and the verbs COME and LIE number among the verbs which support locative inversion. Bresnan goes on to demonstrate that the verbs which allow locative inversion are unaccusative - thus the grammaticality difference between (16a) and (16b) - or passive (but with the BY phrase suppressed), as in (17):

(16) a. Among the guests was sitting my friend Rose.
b. *Among the guests was knitting my friend Rose.

(17) a. My mother was seated among the guests of honour.
b. Among the guests of honour was seated my mother.


From these data, Bresnan concludes that locative inversion 'can occur just in case the subject can be interpreted as the argument of which the location, change of location or direction expressed by the locative argument is predicated' (1994: 80) - to put this another way, the subject must be a 'theme' in the terms of Jackendoff (1990). This is consistent with Bresnan's account of unaccusativity, where it is claimed that unaccusativity is not a syntactic phenomenon, but one where the unaccusative subject's referent is always the theme of the sense of the verb.

The final aspect of the grammar of locative inversion is that the locative PP is always an argument of the verb, not an adjunct. The argument/adjunct distinction is hard to draw in the case of locative expressions, and I do not want to get bogged down in the debate, but the evidence that Bresnan (1994: 82-3) brings to bear on the issue is compelling enough. She shows that adjuncts can be preposed before the subjects of questions, although arguments cannot, and she uses the so-anaphora test to show that adjuncts can be excluded from the interpretation of so-anaphora, whereas locative arguments cannot.

To summarize, locative inversion:

• occurs with unaccusative verbs or passivized verbs;
• requires the subject NP's referent to be the theme of the sense of the verb;
• requires the locative PP to be an argument of the verb.

There are other facts that apply in the treatment of locative inversion. These are:

• presentational focus;
• sentential negation;
• other subject properties.

Presentational focus is not strictly syntactic. I shall not return to this. Sentential negation is more important. Bresnan gives the examples in (18). The significance is that in (18a), sentential negation is not possible, whereas in (18b), constituent negation of the postverbal NP is possible:

(18) a. *On the wall never hung a picture of U.S. Grant.
b. On the wall hangs not a picture of U.S. Grant but one of Jefferson Davis.

Bresnan (1994: 88) quotes Aissen (1975: 9) as saying that this restriction is due to the way in which the locative expression sets a backdrop for a scene. Negating the main clause undermines this discourse function, whereas contrastive negation on the postverbal NP does not have such an effect. Bresnan (1994: 88), on the other hand, contrasting English with Chichewa, argues that sentential negation in Chichewa excludes the subject, so the restriction comes down to a statement about the scope of negation.


3.1 Evidence that the subject properties are split between the locative PP and the postposed NP

We shall see in this section that by a number of different diagnostics for subjecthood, the subject properties are split between the locative PP and the postposed NP that expresses the theme argument.

Agreement
In the case of agreement, we see that the locative PP does not agree with the finite verb.

(19) a. In the swamp was/*were found a child.
b. In the swamp were/*was found two children.

In English agreement is with the NP Theme.

Control of attributive VPs (participial relatives)
In this case too, we see that the locative PP cannot be the controller (or subject) of an attributive participle. As Bresnan (1994: 95) points out, this constitutes a difference between English and other languages: Chichewa does allow examples like (21b). In borrowing Bresnan's examples, I have also taken her representational system.6

(20) a. On the corner stood a woman [cp who was standing near another woman]cp.
b. On the corner stood a woman [0 standing near another woman]cp.

Note that the locative PP cannot control the participle in the participial relative.

(21) a. She stood on the corner [cp on which was standing another woman]cp.
b. *She stood on the corner [0 standing another woman].

Subject-raising
However, English does allow apparent subject-raising of locative PPs as in (22).

(22) a. Over my windowsill seems to have crawled an entire army of ants.
b. On that hill appears to be located a cathedral.
c. In these villages are likely to be found the best examples of this cuisine.

Bresnan (1994: 96) observes that only subjects can be raised in English. She compares the two examples in (23) as evidence of this:

(23) a. It seems that John, you dislike.
b. *John seems you to dislike.

In (23a), John is the focused, and leftward-moved, object of dislike. This movement is entirely acceptable in the context of the finite verb. In (23b), however, we can see that it is not possible for the object to be focused and then raised over seems as its subject. From this, she concludes that any word or phrase which is the subject of a predicate like seem is also the subject of the xcomp of seem.

Tag questions
The argument from tag questions is a negative one: the claim is that the NP theme cannot be the subject by this diagnostic. In English tag-questions, a declarative clause expressing a statement is followed by an auxiliary verb and a pronoun which expresses a questioning of the propositional content of the main clause. Examples are given in (24). The pronoun must agree with the subject of the main clause.

(24) a. Mary fooled John, didn't she/*he?
b. John was fooled by Mary, wasn't he/*she?

As Bresnan points out, tags are in general unacceptable with locative inversion. The examples in (25) show this:

(25) a. ?Into the garden ran John, didn't he?
b. *Into the garden ran a man, didn't one/he?7

The example in (25a) is less unacceptable than that in (25b). Bresnan (1994: 97) quotes Bowers (1976: 237) who gives the example in (26) and argues that this shows that the postposed NP in locative inversions cannot be the subject.

(26) In the garden is a beautiful statue, isn't there?

The claim is that in (26) there is coreferential with [i]n the garden, which indicates, if anything, that in the garden is a more likely candidate for subject status than the postposed NP.8 Bresnan also quotes *A man arrived, didn't one/he? - an example from Gueron (1980: 661) - to show that tags are in general difficult to establish with locatives even when they do not involve inversion. However, this example does seem to be set up to make the situation more, rather than less, problematic: replace a man by the man, and the problem of pronoun choice vanishes. With appropriate context, as in the train arrived at 3, didn't it?, a tag question is fine with locative inversion.

The tag-question data are difficult to interpret, therefore; it seems that the best solution is to put them on one side as inconclusive.

Subject extraction/THAT-trace effect
In this section, I simply quote part of Bresnan's (1994: 97) section 8.2, although with renumbered examples. Bresnan is discussing the THAT-trace effect.

[The] preposed locatives in locative inversion show the constraints on subject extraction adjacent to complementizers:

(27) a. It's in these villages that we all believe can be found the best examples of this cuisine.
b. *It's in these villages that we all believe that can be found the best examples of this cuisine.

Nonsubject constituents are unaffected by this restriction, as we can see by comparing extraction of the uninverted locatives:

(28) a. It's in these villages that we all believe the finest examples of this cuisine can be found _.
b. It's in these villages that we all believe that the finest examples of this cuisine can be found _.

Only subjects show the effect.

(29) a. It's this cuisine that we all believe can be found in these villages.
b. *It's this cuisine that we all believe that can be found in these villages.

Extraction from coordinate constituents
Bresnan (1994: 98) gives the examples in (31-32) which show the constraint in (30):

(30) 'subject gaps at the top level of one coordinate constituent cannot occur with any other kind of gap in the other coordinate constituent.'

(31) a. She's someone that loves cooking and hates jogging.
b. She's someone that cooking amuses and jogging bores _.

(31a) has two subject gaps; (31b) has two non-subject gaps.

(32) a. *She's someone that cooking amuses and hates jogging.
b. She's someone that cooking amuses and I expect will hate jogging.

In (32a), a non-subject gap is coordinated with a subject gap, leading to ungrammaticality. In (32b), we see that a non-subject gap can be coordinated with an embedded subject gap, hence the careful formulation of the constraint in (30). Bresnan (1994: 98) suggests that judgments are delicate with examples like those in (33)-(34), which involve locative inversion examples, but gives these examples:

(33) a. That's the old graveyard, in which is buried a pirate and is likely to be buried a treasure. [subject-subject]
b. That's the old graveyard, in which workers are digging and a treasure is likely to be buried _. [nonsubject-nonsubject]

(34) a. ??That's the old graveyard, in which workers are digging and is likely to be buried a treasure. [nonsubject-subject]
b. That's the old graveyard, in which workers are digging and they say is buried a treasure. [nonsubject-embedded subject]

The crucial point here is that the examples in (33) show that 'the inverted locative PPs show the extraction patterning of subjects' (Bresnan 1994: 98). As Bresnan points out, (34a) is fine with there in the subject gap 'which channels the extraction to a nonsubject argument (the oblique)', which argues in favour of a subject treatment of the locative PP.


Another diagnostic for subjecthood is inversion in interrogatives (Bresnan, 1994: 102). As the examples in (35) show, it is clearly the case that the locative PP is not a subject by this criterion. We shall use this fact in the next section, where I argue that the locative inversion data can be best handled by treating syntactic subjects as distinct from morphosyntactic subjects.

(35) a. *Did over my windowsill crawl an entire army of ants?
b. *Did on that hill appear to be located a cathedral?

As these examples show, the locative PPs are not able to appear as the subjects of auxiliary do in closed interrogatives. Moreover, as we can see in (36), they cannot occur as subjects in open interrogatives, either:

(36) a. *When did over my windowsill crawl an entire army of ants?
b. *Why did on that hill appear to be located a cathedral?
c. Why did an entire army of ants crawl over my windowsill?
d. Why did a cathedral appear on that windowsill?

The examples in (36) show that the locative PP cannot appear as the subject in (36a-36b) although the theme NP can in (36c-36d) in non-inverted examples. However, Bresnan (1994: 102), in a section arguing against a null expletive subject analysis, provides the following data, which make the situation reported here more complex:

(37) a. Which portrait of the artist hung on the wall?
b. *Which portrait of the artist did hang on the wall?

The examples in (37) show that when a subject itself is questioned, as in (37a), which is the interrogative correlate of a portrait of the artist hung on the wall, subject-inversion is not triggered. In fact, as (37b) shows, auxiliaries cannot occur. We see the same facts with locatives:

(38) a. On which wall hung a portrait of the artist?
b. *On which wall did hang a portrait of the artist?

The examples in (38) correlate to on the wall hung a portrait of the artist. In these examples, on which wall behaves just like a subject in a subject-interrogative.

3.2 Results and conclusions

It is possible to organize these results into a table - we can then explore the hypothesis that the split in subject properties shown in locative inversion corresponds to a split in subject properties which can be explored elsewhere in grammar. Table 1 shows whether a given subject property applies to the locative PP or the theme NP in a locative inversion structure; for this reason the table says that it is not possible for the NP to undergo subject-to-subject raising in the case of into the room ran a child.


216 WORD GRAMMAR: PERSPECTIVES ON LANGUAGE STRUCTURE

Table 1 Subject property applied to the locative PP or the theme NP in a locative inversion structure

Subject property                           Found on PP?   Found on NP?
Agreement                                  X              /
Subject of participial relative            X              /
Subject raising                            /              X
Tag questions (note 9)                     /              X
THAT-trace effect                          /              X
Extraction from coordinated constituents   /              X (note 10)
Inversion in interrogatives                X              X (note 11)
Subject interrogatives                     /              X

The evidence from Bresnan's paper, which I have reviewed in this section, shows that several subject properties may occur on either the locative PP or the postposed subject NP. Only three properties are not able to occur on the PP: these are agreement, being the subject of a participial relative, and an inversion in non-subject interrogatives. The tag question data appears to favour a subject analysis of the locative PP rather than the postposed NP.

In the next section, I set out to accommodate those facts within Word Grammar. As I stated in the introduction, these facts are problematic for the model put forward in Hudson (1990), because there is only a single account of subjects in that theory. There, I review some of the dimensions of subjecthood, in the light of the discussion in section 2. I go on to argue that we need to split subjects into three kinds - lexical subjects, syntactic subjects and morphosyntactic subjects - and that this division can take account of the pattern of data reported in section 3.

4. Factored Out Subjects

In this section, I relate the data in section 3 to the more general discussion of subjects presented in section 2. The problem for the English Word Grammar typology of dependencies is that in this model there is only one dependency which has the 'subject-of' label - and yet in the locative inversion data, as we have seen, there are two candidates for the subject dependency. If we start with the general observations made about subjects in section 2, we can see that the subject properties detailed there gather around three poles: there are subject properties which are - broadly speaking - lexical; syntactic; and morphosyntactic.12 We can put the subject properties into three lists. Bracketed items feature in more than one list.13

Lexical properties
• Subjects are typically the dependent which expresses the Agent argument in the active voice.
• (The sole arguments of intransitives typically show (other) subject properties.)


FACTORING OUT THE SUBJECT DEPENDENCY 217

• (The addressee of an imperative is a subject.)

Syntactic properties
• The subject can be the shared argument in coordination.
• The subject is the only argument of an xcomp which may be shared in a predicative complementation structure (in both raising and control).
• English subjects resist extraction. (In other languages subjects are often more extractable than other arguments.)
• Auxiliary inversion fails with a subject interrogative.
• Subjects are usually the discourse topic.

Morphosyntactic properties
• (The addressee of an imperative is a subject.)
• The subject is often obligatory.
• Subject-verb agreement.

As we can see, the lexical properties are those properties which are primarily to do with the mapping of semantic roles to grammatical functions. The first two items listed as lexical properties concern the mapping of semantic roles in particular in nominative-accusative languages (like English) rather than absolutive-ergative languages (like West Greenlandic). I have put the second two items listed as lexical properties in brackets, because these are shared with other parts of the grammar: the identification of the subject of imperatives is not only a lexical property. What makes it a 'lexical' subject is that the subject of an imperative picks out the same semantic role as the first two criteria - the linking facts apply here as well. But this subject-criterion could also be morphosyntactic, because the imperative is a mood, and mood is a morphosyntactic feature. Excepting certain well-known construction types, it is only the imperative mood that permits subjects not to be represented by an overt noun or noun phrase in English.

The syntactic properties are those that have to do with the grammatical phenomena that are commonly called 'movement' or 'deletion'. The first three can be subsumed under the descriptive generalization that the subject is the element which can be an argument of more than one predicate. The English extraction data are at odds with the more typical extraction data, but they still show that subjects can be identified by their extraction properties. I have included the non-inversion of subject interrogatives as a syntactic fact (rather than a morphosyntactic one) on the grounds that subject-inversion is a word order rule, and is therefore syntactic. For this reason, the fact that PP locatives behave like other subjects shows that they have the same syntactic properties as other subjects: they resist subject-inversion in interrogatives.

The final observation listed under syntactic subjects is arguably not even grammatical - but there are constructional interactions between topic and syntactic structure and, indeed, focus and syntactic structure. Again, there is a descriptive generalization to be captured, that subjects generally are topics.14

The morphosyntactic properties of subjects tend to be linguistically specific.



Tenseless languages will not show any morphosyntactic subject properties, and as Hudson (1999) shows, such properties are in decay in English. I have already discussed the issue to do with the imperative. I have put the obligatory criterion here, because it is related to agreement: in languages which have a highly developed agreement morphology in both the verbal conjugation system and the nominal declension system it is possible for subjects to be omitted. English has obligatory subjects - which can be expletive - and these appear to be obligatory because of the impoverished inflectional morphology.15

It is possible, on the basis of this discussion, to make some general predictions about what might be found cross-linguistically. The lexical properties of subjects will vary according to whether the language is nominative or ergative. The morphosyntactic properties of subjects will vary according to whether the language has a rich inflectional system, an impoverished inflectional system, or no inflectional system. And the syntactic properties of subjects should be relatively consistent across languages: to the extent that there is variation in the syntactic properties of subjects, it should be attributable to the interaction between this dimension of subjecthood and one of the other dimensions.

From the point of view of Word Grammar, these observations about subjects are only salient if factored-out subjects can do two things: exist in mismatch structures, so that no single grammatical relation can be held to obtain between a verb and another element in a clause; and help capture descriptive facts better than simply treating subject-of as a single unitary relation.

If we return to the data presented in Table 1, we can see that all of the subject properties that are found on the inverted PP are syntactic subject properties in that they are all subject properties that are relevant to the ability of a single noun or noun phrase to be construed as an argument of more than one predicate. The properties that were found on the locative PP were:

• subject raising;
• tag questions;
• subject extraction;
• extraction from coordinated constituents;
• subject interrogatives.

Excepting the tag-question data (which are arguably morphosyntactic, and which were, in any case, moot), all of these subject properties are properties that are related to the ability of a single subject entity to be an argument of more than one predicate. What we find is that locative PPs can behave like syntactic subjects, and that when they do behave like syntactic subjects, the NP theme argument of the verb cannot behave like a syntactic argument.

This is half of an argument that subject properties are split. The other half of the argument is that the morphosyntactic properties are found on the NP. The properties that were found on the NP in English, but not on the preposed PP, were:



• agreement;
• subject of participial relative.

Agreement is clearly morphosyntactic, and in any case, it is not possible for a category which does not have number to show agreement. The more difficult case is that of being subject of a participial relative. Bresnan (1994: 94) introduces this diagnostic because in Chichewa it is possible for a PP to be the subject of a participial relative. Given the cross-linguistic variation, this has to be assigned to an arbitrary difference between languages. It is probably due to the fact that participial relatives are adjuncts of the noun they modify, and in the semantics of English adjuncts take their head as their argument. The adjuncts which do not take their heads as their arguments are limited in number to a small set of exceptions: adjective adjuncts of verbs in, for example, resultative constructions like Jane ran her trainers threadbare where threadbare is an adjunct of ran, and its 'er' is (her) trainers. The Chichewa data suggest that, in general, the participial relative facts need to be treated as syntactic rather than lexical or morphosyntactic.

The datum which I have not discussed is the inability of PP subjects to undergo subject-inversion in interrogatives. I repeat examples (35) and (36) here:

(35) a. *Did over my windowsill crawl an entire army of ants?
     b. *Did on that hill appear to be located a cathedral?

(36) a. *When did over my windowsill crawl an entire army of ants?
     b. *Why did on that hill appear to be located a cathedral?
     c. Why did an entire army of ants crawl over my windowsill?
     d. Why did a cathedral appear on that windowsill?

The examples in (35) and (36a-36b) show that the locative PP in locative inversion cannot undergo subject-inversion. I think that the crucial thing here is that this is not a syntactic property of subjecthood, but a morphosyntactic one.16 It is not a syntactic one, because the syntactic constraints were generally concerned with the ability of a single phrase to occur as an argument of more than one predicate. The restriction in (35) and (36) is different; in fact, it is attributable to the PP's lack of morphosyntactic properties. I take it that an auxiliary cannot invert with a phrase that it cannot agree in number with.17

However, the examples in (37) and (38), which I repeat here, show that the locative PP behaves like a subject with respect to subject interrogatives:

(37) a. Which portrait of the artist hung on the wall?
     b. *Which portrait of the artist did hang on the wall?

(38) a. On which wall hung a portrait of the artist?
     b. *On which wall did hang a portrait of the artist?

The split between morphosyntactic subjects and syntactic subjects permits an elegant account of the interrogative data. In locative inversion, the locative PP patterns with subjects in subject interrogatives, simply because this property is a



word-order property, which belongs in the domain of syntactic subjecthood. But the locative PP does not undergo subject-inversion because this is a morphosyntactic property.

I propose that, in the case of locative inversion, we reject the data from tag questions on the grounds that even uninverted theme NPs cannot be antecedents for the pronouns in tag-questions, as we saw in the discussion of (24)-(26) above. Tag questions should be diagnostic of morphosyntactic subjecthood, given that the auxiliary verb behaves like a resumptive pronoun with respect to the main verb, and given that the agreement pattern should reflect that of the main clause.

4.1 Summary and discussion

The locative inversion data argue for a differentiation between morphosyntactic subjects and syntactic subjects - a split which is supported by an examination of subject properties more generally. The situation is slightly more complicated in that the locative inversion data argue for a two-way split, but the more general discussion of subject properties suggests that there ought to be a three-way split. However, in other general discussions of subjecthood, such as Dixon (1994), Falk (2004), and Manning (1996), only a two-way split between 'subjects' and 'pivots' is maintained. In that work, 'subjects' correspond to my lexical subjects, and the subject, as opposed to the pivot, is the argument where linking is controlled. On the other hand, in Anderson (1997) a distinction between morphosyntactic subjects and syntactic subjects, such as I have been arguing for here, is maintained.

I think that the way forward is to treat the investigation of subjecthood in a similar way to the commutation-series approach to the phoneme inventory of a language. We have seen from the locative inversion data that morphosyntactic and syntactic subjects can be factored out from each other. I propose, briefly, to show that lexical and morphosyntactic subjects can be factored out, and then that lexical and syntactic subjects too can be factored out.

The account presented here contrasts with Bresnan's in two dimensions. Bresnan (1994: 103-5) argues that the locative PP is a subject in LFG's domain of f-structure, but that it does not occupy a subject position in LFG's domain of c-structure.18 The clause-initial property of the locative PP in locative inversion is attributed to its also being a topic in f-structure. The other part of Bresnan's (1994: 105) analysis is that the postposed NP is identified as the f-structure object, for both English and Chichewa. The evidence and arguments that Bresnan puts forward for her analysis are largely that because it is a PP, the locative PP cannot fulfil certain structural roles associated with subjecthood, which are contingent on the property of being a nominal category. PPs do not normally have the distribution of NPs - it is only in the particular construction of locative inversion, with the additional overlay of topichood, that locative PPs may behave as subjects.19 Bresnan claims that the f-structure element in her analysis accounts for what I have described here as syntactic subject properties; and the c-structure part accounts for the morphosyntactic subject properties.



Within Word Grammar, we cannot exploit a c-structure/f-structure mismatch. Nor is it possible to assert that certain subject properties reside in one domain of structure. However, by splitting the subject properties in the way I have here, we can account for the same phenomena within a single domain of structure: the dependency relation. This buys an advantage over Bresnan's account: as we have seen, the postposed NP has the properties of a morphosyntactic subject. Bresnan treats the postposed NP as an object, but this analysis cannot account for its agreement properties. By treating it as a morphosyntactic subject, and not an object, we can account for the agreement facts without assuming, for example, that English has object agreement.

In the next section, I discuss the distinction between lexical and other subjects a little further. This discussion does not add to the analysis of locative PPs in locative inversion, but it does complete the discussion of a three-way split in subject properties.

4.1.1 LEXICAL AND OTHER SUBJECTS
We can see that lexical and morphosyntactic subjects must be factored out from each other by looking at raising and gerunds. The examples in (39) show raising, the examples in (40) show gerunds.

(39) a. Jane seems to be running.
     b. I expect Jane to be running.

In both examples in (39), Jane is the 'er' of 'run'. However, in neither example can Jane be thought of as the morphosyntactic subject, because there is no agreement: the infinitive does not have a feature-value 'number'.

The examples in (40) are even more acute: the gerund running also has no value for number, but it does have a subject - the pronoun me in (40a) and the pronoun my in (40b). Again, in neither case can this be thought of as morphosyntactic subjecthood.

(40) a. Everyone laughed at me running.
     b. My running was funny.

Additionally, the evidence from gerunds shows that lexical subjects have to be distinguished from syntactic subjects: in (40a), me is the head of running, and in (40b), my is the head of running. From these examples, we have to conclude that there are cases where lexical subjecthood has to be distinguished from syntactic subjecthood and from morphosyntactic subjecthood.

Another example, although a negative one, comes from weather IT. The example in (41) shows that weather IT can be simultaneously a morphosyntactic and a syntactic subject, even though it cannot be a lexical subject given that the verb rain does not have any semantic arguments.

(41) It seems to be raining and to be sleeting.

The example in (41) is also important because it shows that the property of



being a syntactic subject is not co-extensive with being a topic. Both Bresnan (1994) and Falk (2004) argue that syntactic subjects are topics of some kind, but an expletive pronoun is not a candidate for topichood.

5. Conclusions

In this section, I argue that the treatment of the data presented in this chapter handles the facts and the data more satisfactorily than the mismatch account of Bresnan (1994), and that it is more compatible with other general assumptions about the architecture of grammar that Word Grammar adopts. On the basis of the locative inversion evidence, I have made a distinction between morphosyntactic and syntactic subjects, and on the basis of further evidence from other constructions have made a further distinction which separates lexical subjects out from the other kinds of subject.

Bresnan also argues for a three-way distinction, but in her case the factorization of subjecthood is over three of the domains of structure that LFG recognizes: a-structure, f-structure and c-structure. She effectively argues that the locative PPs can only be construed as subjects because they are also topics. The problem with this account is that it treats 'topic' as fundamentally syntactic, located in f-structure, when it is clear (a) that subjects need not be topics; and (b) that some subjects cannot be topics. Furthermore, we have seen that the properties of subjects itemized in section 4 do not require there to be a separate dimension of topichood - it is simply the case that some subjects are syntactic rather than morphosyntactic.

In some senses, the different approaches between this chapter and Bresnan (1994) are due to underlying assumptions that the two models have, which make them different from each other. LFG does not permit there to be a mapping of more than one f-structure relation between two elements; Word Grammar does not distinguish between argument structure and the instantiated dependencies in a given construction. But it is also the case that the WG account espoused here allows the theory of subjects to be elaborated so that it can account for a wide range of differences in the spectrum of subject properties.

There are some obvious avenues for future research: for example, both West Greenlandic and Mandarin are tenseless. For this reason, Mandarin has been argued not to have the subject and object dependencies that are witnessed in other languages. However, while Mandarin has long-distance reflexives, West Greenlandic does not. One salient difference is that Mandarin is a nominative language while West Greenlandic is an ergative language, and so this raises the question of whether these facts are attributable to differences in lexical subjects in these languages.

Certainly more research is required on the cross-linguistic typology of dependencies. Meanwhile, it is clear that the English Word Grammar model needs to be revised, to admit at least three different kinds of subject.



References

Aissen, J. (1975), 'Presentational-there insertion: a cyclic root transformation'. Chicago Linguistics Society, 11, 1-14.
Anderson, J. M. (1997), A Notional Theory of Syntactic Categories. Cambridge: Cambridge University Press.
Andrews, A. (1985), 'The major functions of the noun phrase', in T. Shopen (ed.), Language Typology and Syntactic Description, vol. 1: Clause Structure. Cambridge: Cambridge University Press, pp. 62-154.
Bowers, J. S. (1976), 'On surface structure grammatical relations and the structure-preserving hypothesis'. Linguistic Analysis, 2, 584-6.
Bresnan, J. W. (1994), 'Locative inversion and the architecture of grammar'. Language, 70, 72-131.
Chomsky, N. (1981), Lectures on Government and Binding. Dordrecht: Foris.
Dixon, R. M. W. (1994), Ergativity. Cambridge: Cambridge University Press.
Falk, Y. (2004), 'Explaining subjecthood' (Unpublished manuscript, Hebrew University of Jerusalem).
Gueron, J. (1980), 'The syntax and semantics of PP-extraposition'. Linguistic Inquiry, 11, 637-78.
Hudson, R. A. (1990), English Word Grammar. Oxford: Blackwell.
— (1999), 'Subject-verb agreement in English'. English Language and Linguistics, 3, 173-207.
Jackendoff, R. S. (1990), Semantic Structures. Cambridge, MA: MIT Press.
Keenan, E. L. (1976), 'Towards a universal definition of "subject"', in Charles Li (ed.), Subject and Topic. New York: Academic Press, pp. 303-33.
Keenan, E. L. and Comrie, B. (1977), 'Noun phrase accessibility and universal grammar'. Linguistic Inquiry, 8, 63-99.
McCloskey, J. (1997), 'Subjecthood and subject position', in L. Haegeman (ed.), Elements of Grammar: Handbook of Generative Syntax. Dordrecht: Kluwer, pp. 197-235.
— (2001), 'The distribution of subject properties in Irish', in W. D. Davies and S. Dubinsky (eds), Objects and Other Subjects. Dordrecht: Kluwer, pp. 157-92.
Manning, C. (1996), Ergativity. Stanford, CA: CSLI Publications.

Notes

1 Unless I explicitly state otherwise, the examples in section 1 and section 3 (where I present Bresnan's locative inversion data) are taken from Bresnan's (1994) paper, and the grammaticality judgements are hers. I have, however, silently amended Bresnan's spelling to British English norms.

2 This is a pre-formal statement, and I do not intend it to commit me to any particular theoretical position.

3 English makes a distinction between disjoint pronouns - the forms him, her, me and so forth, and anaphoric pronouns like himself, herself, myself and so forth. Not all languages make this distinction, and English has not always made the same distinction in the course of its history.

4 The underscore represents the subject position for to. I do not mean by this representation to suggest that there is actual movement - like Hudson (1990) I reject a movement account. The representation is intended to be pre-formal, and is borrowed from Bresnan (1994), whose examples I borrow in section 3 - and in borrowing some of these examples, I import the representation.

5 I adopt the analysis of subject-verb agreement presented in Hudson (1999), which



argues, in summary, that present-day English represents a transitional stage where, except in the case of be, number is the only remaining agreement feature.

6 The italicized part of the sentence shows that in (22a) [o]ver my windowsill appears to have been raised from the subject position of to have crawled an entire army of ants into the subject position of seems. However, in borrowing this representation, I do not commit myself to a movement analysis of these data.

7 These examples are not drawn from Bresnan's paper.
8 This claim is debatable: there in this example is not the deictic there of there it is, but the empty one of there's a problem. It might make more sense to say that it was co-referential if it were the deictic there.

9 The table shows Bresnan's (1994) evaluation of this evidence, although it seems clear that the tag-question data is rather more moot than her presentation would suggest. I return to this in section 4 below.

10 This is not, purely, a subject diagnostic: the point is that the parallelism shows that if the extracted PP is to be treated as a subject in one conjunct, it cannot be an object or other argument in the second conjunct, which suggests strongly that it is actually a subject.

11 Of course, when the theme NP is in the normal subject position it can undergo inversion: the point of these cells in the table is that neither the NP nor the PP can undergo inversion when it is the PP that is in subject position.

12 I am using these terms pre-formally as a descriptive heuristic. I refine the terms, and the analysis, below.

13 I am leaving the reflexive binding facts out of the lists. These facts could theoretically belong in all three lists - and for that reason, more research needs to be done about the relationship between reflexive binding and the dimensions of subjecthood. I shall come back to this briefly in section 5; here it suffices to point out that binding has been treated in terms of clause domains, which is either syntactic or morphosyntactic, depending on how clauses are defined, and in terms of hierarchies of arguments, which is clearly lexical.

14 Of course, a subject need not be a topic: weather it, the it of extraposition, and expletive THERE cannot be treated as topics given that they are not referential.

15 It might be objected that Mandarin has no inflectional morphology whatever, yet subjects can be omitted in Mandarin when they can be pragmatically recovered. This point is, however, consistent with my observation: Mandarin has no tense; indeed, it could be argued that it has no finiteness at all. Given that lack of morphosyntax in Mandarin, it is unsurprising that it does not have the category of morphosyntactic subjects. And given that subjects can be omitted in languages with a rich morphosyntactic system, as well as in languages lacking a morphosyntactic system, we can deduce that obligatory subjecthood is neither a lexical property, nor a syntactic property.

16 Hudson (1990: 240) also treats subject-inversion as a morphosyntactic property.
17 This claim argues for a treatment of THERE-insertion where THERE acquires its number from its 'associate', given that THERE can invert with an auxiliary.
18 She goes on to reduce the typological differences between English and Chichewa to a difference in the c-structure representations: in Chichewa, the f-structure PP subject is also a c-structure subject.
19 For this reason, Bresnan (1994: 110) distinguishes between 'place'- and 'time'-denoting PPs, which can have the same distribution as nominal elements, and the PPs found in locative inversion.


Conclusion

KENSEI SUGAYAMA

The movement of Word Grammar began largely as an approach to the analysis of grammatical structure and linguistic meaning in response to constituency-based grammar and generative grammar. In this book, we have focused on the analyses of morphology, syntax, semantics, and discourse based on the fundamental hypotheses presented in the Preface and Chapter 1: WG is monostratal; it uses word-word dependencies; it does not use phrase structure; and language is viewed as a network of knowledge, linking concepts about words, their meanings, etc. We conclude our survey by pointing out some of the ways Word Grammar has gone, and should go, beyond its boundaries.

The monostratal character of WG is an advantage, especially the absence of transformations, even of movement rules. Their role has been taken over by the acceptance of double dependency within certain limits.

Although word-word dependencies are difficult for a number of grammarians to accept, they make the grammar simpler and are also important in determining the default word order of a language. The notion of phrase is not completely lost in WG, since phrases can be seen as dependency chains. It is also a good idea to see grammatical relations (subject, object, etc.) as a subclass of dependents.

WG presents language as a network of knowledge, linking concepts about words, their meanings, etc. In this network, there are no clear boundaries between different areas of knowledge - e.g. between 'lexicon' and 'grammar', or between 'linguistic meaning' and 'encyclopedic knowledge'. It is not widely known that this hypothesis was advanced much earlier than the contemporary movement of cognitive linguistics. Thus WG has implied since its very start in the early 1980s that conceptual structures and processes proposed for language should be essentially the same as those found in nonlinguistic human cognition. It uses 'default inheritance' as a very general way of capturing the relation between 'model' or 'prototype' concepts and 'instances' or 'peripheral' concepts. 'Default inheritance' and especially 'prototypes' are now widely accepted among linguists.

As Richard Hudson puts it in his conclusion to Chapter 1, WG addresses questions from a number of different research traditions. As in formal linguistics, it is concerned with the formal properties of language structure; but it also shares with cognitive linguistics a focus on how these structures are



embedded in general cognition. Within syntax, it uses dependencies rather than phrase structure, but also recognizes the rich structures that have been highlighted in the phrase-structure tradition. In morphology it follows the European tradition which separates morphology strictly from syntax, but also allows exceptional words which contain the forms of smaller words. And so on through other areas of language. Every theoretical decision is driven by two concerns: staying true to the facts of language, and providing the simplest possible explanation for these facts.

The search for new illuminating insights is still under way, and more widespread beliefs may well have to be abandoned; but the most general conclusion so far, as Richard Hudson says, seems to be that language is mostly very much like other areas of cognition. Thus, Word Grammar in its architecture has the potential to make a contribution to a theory of cognition that goes beyond language.


Author Index

Abeille, A. 155
Anderson, J. M. 6, 220
Andrews, A. 35, 72, 205

Baltin, M. 153
Biber, D. et al. 92, 94, 95, 97, 99
Boguraev, B. 83
Borsley, R. D. 161, 167n.4, n.5, n.6
Bouma, G. 155, 157
Breheny, R. 201
Bresnan, J. 6, 84, 205, 210, 220, 223n.13, 224n.7, n.9, n.19
Briscoe, T. 83

Chomsky, N. (including Chomskyan) 9, 35, 41, 84, 87, 162

Chung, C. 145, 146, 151, 153, 158, 160-5

Comrie, B. 205
Conrad, S. 92, 94, 95, 97, 99
Copestake, A. 83
Cormack, A. 201n.2
Creider, C. 202n.13
Croft, W. 83
Cruse, D. 92

Davis, A. 111
Dixon, R. M. W. 220
Dowty, D. 95, 111, 112

Eppler, E. 118, 122, 127, 129

Falk, Y. 205
Fillmore, Ch. 6
Finegan, E. 92, 94, 95, 97, 99

Gazdar, G. 75

Ginzburg, J. 145, 146, 154-60, 164, 165, 167n.7

Godard, D. 155
Goldberg, A. 8, 83, 87, 111, 113

Haegeman, L. 146, 167n.3
Halliday, M. A. K. 5, 6
Henniss, K. 51
Holmes, J. 84, 112, 113, 114
Huddleston, R. 68, 69
Hudson, R. A. 35, 42, 43, 50, 69, 70, 84, 87, 92, 99, 121, 122, 125, 126, 145-53, 165, 167n.1, n.2, 183, 191, 202n.13, 204, 205, 218, 223n.4, n.5, 224

Jackendoff, R. 24, 111, 206, 211
Jaworska, E. 201n.3
Johansson, S. 92, 94, 95, 97, 99

Kasper, R. 146, 161
Kathol, A. 146, 161, 167n.8
Keenan, E. L. 205
Kim, J.-B. 145, 146, 151, 153, 155, 157, 158, 160-5
Koenig, J.-P. 111
Koizumi, M. 153

Lakoff, G. 3
Lamb, S. 9
Langacker, R. 3, 8
Lasnik, H. 50, 72, 162
Lecarme, J. 35
Leech, G. 92, 94, 95, 97, 99
Lemmens, M. 83
Levin, B. 83, 95-6
Levine, R. 146, 161



Lyons, J. 6

McCawley, J. 6
McCloskey, J. 205
Malouf, R. 155, 157
Manning, C. 220
Muysken, P. 118, 119, 121, 124, 130, 140

Payne, J. 201n. 2
Penn, G. 161
Pinker, S. 11
Pollard, C. J. 35, 51, 146, 161, 167n. 1
Przepiórkowski, A. 155
Pustejovsky, J. 83

Quicoli, A. C. 35
Quirk, R. et al. 180

Rappaport Hovav, M. 83
Reape, M. 146, 161
Rizzi, L. 146
Ross, J. R. 154
Rosta, A. 202n. 16, 203n. 21
Rutherford, W. E. 135, 136, 137

Sag, I. A. 35, 51, 145, 146, 154-60, 164, 165, 167n. 1, n. 7

Sankoff, D. 120, 129
Shaumyan, O. 6
Shibatani, M. 87
Sigurðsson, H. 50
Sugayama, K. 7, 167n. 6

Taylor, J. 3
Tesnière, L. 4
Trask, R. 111-2
Trier, J. 92

van Langendonck, W. 183
van Noord, G. 155

Warner, A. R. 76
Weisgerber, L. 92
Williams, E. 87


Subject Index

a(n) 202
accusative case 36, 38, 39, 41, 45, 50
accusative subject (see subject(s))
actor (see participant role(s))
adjunct 149-51, 154, 157, 159, 192
adjunction 192
adjective(s)
  attributive adjectives 182
  predicative adjective 35

adverbial 155, 157
agent (see participant role(s))
agreement 95
all but 174ff, 178
almost 174ff, 178
argument role 113
attraction 36, 40, 41, 46, 50, 53n. 3, 53n. 4
atypical complement 202
auxiliary
  quasi-auxiliary 67
  semi-auxiliary 67

be 68, 69
be to construction 67, 70, 72, 73, 75, 78, 79
because 117, 128, 129-40
beneficiary (see participant role(s))
Best Fit Principle 15
binder 188, 200
both 191
branch 188
Branch Uniqueness principle 190, 192, 202

case agreement 35, 41-2, 45, 51
clausal that 174
clitic 21-2

Code-mixing 117, 118, 119, 120, 122-4, 128, 129, 131, 139

Code-switching 118, 124, 127, 128
cognition 3, 15, 22, 24, 28
cognitive linguistics 3, 28
comment 197
compaction 161
complementizer 154, 162, 167n. 6
complex coordination (see coordination)
concept 5, 8, 9, 10, 11, 12, 13, 16, 24, 25
conjunct 189
connectivity 189
constituency 165, 167n. 9
constraint 117-20, 122, 127
constructional constraint 161
coordination 21, 22, 23, 190-1, 202
  complex coordination 191, 198
correlative 191
count interpretation (see also mass interpretation) 201n. 9

default definiteness 63
default inheritance 5, 6, 8, 12-13, 16, 20, 123, 124, 126
degree words 181
deletion 176
  VP deletion 48
demand 174, 177
dependency 6, 21-4, 27-8, 122, 125, 127, 139, 145-54, 165, 167n. 6, 204
  long-distance dependency 155
dependency types 191
depictive 192
determiner 182
difficulty 27, 28
distribution 172



ditransitive 198
DOMAIN (DOM) (see HEAD-DRIVEN PHRASE STRUCTURE GRAMMAR)

eat type verbs 56, 59
ee (see also er) 101, 104, 110
either 191
ellipsis 176, 182, 183, 193, 201n. 5, 201n. 6
ellipsis, anaphoric 48
ellipsis, determiner-complement 201n. 5
embedded clause 145-6, 156, 159, 160, 162-5
empty category 187ff, 202
empty element 35
endocentricity (see also hypocentricity) 173
English 117, 118, 123, 125, 127-40
er (see also ee) 101, 104, 110
even 178ff
event 104-5, 108-10
existence propositions 35
exocentricity 173
extractee 149-54, 167n. 6
extraction 21, 23
extraposition 183, 202

feature structure (see HEAD-DRIVEN PHRASE STRUCTURE GRAMMAR)
filler (see HEAD-DRIVEN PHRASE STRUCTURE GRAMMAR)
finite 197
form 13, 15, 18, 20-1, 25
'future in the past' 74

gap-ss 157
Generalized Head Feature Principle (GHFP) 156, 158
generative grammar 5-6
German 117, 118, 123, 125, 127, 128, 130-40
goal (see participant role(s))
grammatical relation 192
GR-word 192, 195
guardian 196, 200

head 154-60, 173
Head-Driven Phrase Structure Grammar (HPSG) 35, 42, 51, 52, 56, 145-6, 154, 160, 164, 165, 167n. 1, n. 9
  DOMAIN (DOM) 161-3
  feature structure 154
  filler 156, 158
  HEAD (in HPSG) 154
  SLASH 154-59, 167n. 7
  synsem (value) 155, 157
  SYNTAX-SEMANTICS (SYNSEM) 154-6
hypocentricity (see also endocentricity) 173, 178, 199-200

immediate dominance rules 167n. 9
incremental theme 95, 112
INDEPENDENT-CLAUSE (IC) 156, 158-60
infinitival subject (see subject(s))
inflection 4, 6, 14, 20, 25, 26
Inheritance Principle 70
inside-out interrogative construction 185ff
instance (see also model) 57, 58, 67, 69, 70, 72, 77, 78, 79, 81
interrogative 146-54, 158-60, 162
interrogative clause 185ff
inversion 75
inversion, locative 210ff
inversion, subject-auxiliary (see SAI)
INVERTED (INV) 159, 160
isa 8, 9, 10, 11, 12, 13, 14, 15, 16, 57, 58
just 178ff

landmark 105, 110, 112, 146, 152
lexeme 5, 14, 18, 21, 24, 27, 34, 36
  ACCORD 91
  AT 105
  BE 107
  COVER 114
  DEVOUR 93
  DO 96
  FOR 89-90
  GIVE 84, 85, 91, 107
  HAVE 95, 96, 107
  IN 105
  LOAD 114
  ON 105
  OPEN 90



lexeme, cont.
  POUR 114
  SPRAY 114
  TO 89, 107
  WITH 108

Lexical Functional Grammar (LFG) 204, 220, 222

lexical relationship 27
lexical subjects (see subject(s))
lexical unit 67, 77, 81
linear order 145, 146, 151, 163-5
linear precedence rule 165
local tree 161, 165
locative inversion 210ff
long-distance dependency (see dependency)

mass interpretation 185, 201n. 9
mental nodes 43
model (see also instance) 57, 69, 70, 79
modularity 11-12
more than 174ff
morpheme 5, 15, 17, 18, 19, 20
morphological case 41
morphology 4, 7, 18-21, 22, 28
morphology, derivational 18, 20
morphology, inflectional 18, 19
morphology-free syntax 4
morphosyntactic subjects (see subject(s))
multiple inheritance 12, 13, 14, 91

neither 191
network 5, 6, 7-10, 11, 13, 14, 15, 17, 20, 22, 24, 27, 123, 124, 126
never 174ff
NICE properties 72
No-Dangling Principle 150
No-Tangling Principle 147, 151-4, 167n. 2
not 174ff, 178
noun + noun premodification 193-5
null subject (see subject(s))

object 92-7, 101
object, indirect 84-92, 102, 107
objects, suppressed 54, 56, 59, 60, 62
objects, understood 59, 65n. 7
object pro-drop 47
only 178ff
Order Concord 167n. 2
order domain 146, 161-4
other than 174ff

parent 146, 148-50, 152-4, 165
parsing 27, 28
participant role(s) 113
  actor 107, 110, 111, 113
  agent 112
  beneficiary 89-90, 102
  goal 111
  patient 108, 110, 111, 112
  recipient 85, 89, 102-3
  theme 107, 110, 111, 112

passive 87, 93
past 90
patient (see participant role(s))
phrase 145-6, 154-8, 162-5
pied-piping 171, 181
plural inflection 202n. 11
pluralia tantum 47, 48
precedence concord 190
predicate 195
predicate nominal 35, 41
predication 19-56
present 99
principle of No Crossing 190
principle of Node Lexicality 190
PRO 35, 42, 44, 45, 49, 50, 51, 52
processing 7, 16, 27-28
pro-drop 45, 46-7
Promotion Principle 146
proxy 177ff, 188, 199
pseudogapping 201n. 5
purpose 89

quantity variable 43, 44
question 98

recipient (see participant role(s))
regent 172
require 174
restrictive vs. non-restrictive 135-40
result 95, 108
Right Node Raising 203n. 21

semantic phrasing 26
semantics 4, 6, 7, 13, 24-7
sharer 147
shave type verbs 59
SLASH (see HEAD-DRIVEN PHRASE STRUCTURE GRAMMAR)
sociolinguistics 4, 6, 7
SOV 127, 128, 130
split subjects (see subject(s))



SSP (see Syntax Semantics Principle)
state 104-8
Stratificational Grammar 6
structure-sharing 22, 23
SUBCAT list 36, 51
subject(s) 97-101, 195, 205
  accusative subject 36-9, 40
  infinitival subject 38, 50
  lexical subjects 216
  morphosyntactic subjects 217
  null subject 36, 44-5, 46, 49, 51, 52
  split subjects 205
  subject properties 206ff
  syntactic subjects 218

subject-auxiliary inversion (SAI) 151, 153, 158, 160, 162, 202n. 19

subordinate 172
subordinate clause 145-46, 149-54, 159-60
superordinate 172
surface structure 147, 149-51
surrogate 174, 177ff, 187, 199
SVO 127, 128, 130
synsem (value) (see HEAD-DRIVEN PHRASE STRUCTURE GRAMMAR)

syntactic subjects (see subject(s))
syntax 4, 21-4, 25, 26, 27-8
Syntax Semantics Principle (SSP) 84, 94, 103, 110
SYNTAX-SEMANTICS (SYNSEM, see HEAD-DRIVEN PHRASE STRUCTURE GRAMMAR; see also synsem)

terminal node 201n. 1
that-clause 174
theme (see participant role(s))
three-place predicate 55
token (see also type) 43, 46
topic 145, 146, 152, 153, 158, 161-4, 197
topicalization 146, 156, 157, 159
Topological Linear Precedence Constraint 162-4
type (see also token) 43, 46
type-of construction 184-5

unreal words (see words)
Unspecified Object Alternation 65n. 3
utterance 15-17

V2 128, 132, 136, 137
VP fronting 75

way 183
weil (German) 117, 130-40
wh-interrogative 146, 147, 149-54, 158-60, 162
wh-pronoun 147, 148, 149, 153, 154
win type verbs 59
word(s) 4, 5, 13, 14, 15, 18-19, 20, 21-2, 23, 24, 26, 28
  unreal words 6, 22, 23-4

Word Grammar (WG) 56-8, 117, 121-7, 137, 139, 140, 145-154, 159, 164-5, 167n. 2, 204, 205, 218, 221, 222

word order 146, 147, 152, 153, 161, 165, 172, 177, 188

word-order rule 146, 147, 152, 153