On the non-context-freeness of Romanian

Nicholas Longenbaugh

Harvard College

October 2, 2013


Overview

1 Formal language theory
    Complexity and the Chomsky hierarchy
    Why weak generative capacity?
    Weak generative capacity of natural language

2 Romanian is weakly non-context-free
    Wh-elements in Romanian
    Account

3 Weak non-context-freeness in perspective
    Swiss-German
    Swedish
    Bambara
    Romanian

4 Conclusion


Complexity and the Chomsky hierarchy

Complexity: how sophisticated are the rules we need to generate or recognize the strings of a particular language?

We can characterize languages according to their complexity:

    Regular: $\{a^n : n \ge 0\}$
    Context-free: $\{a^n b^n : n \ge 0\}$
    Context-sensitive: $\{a^n b^n c^n : n \ge 0\}$
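As a rough illustration of what separates these three example languages, the sketch below recognizes each one with a different amount of memory: none, a single stack, and explicit counting. The function names are illustrative and not from the slides, and the last recognizer simply compares against the expected string rather than modelling a linear bounded automaton.

```python
def accepts_a_n(s):
    """Finite-state style: a single accepting state looping on 'a', no memory."""
    return all(ch == "a" for ch in s)


def accepts_a_n_b_n(s):
    """Pushdown style: push every 'a', pop one 'a' for every 'b'."""
    stack = []
    seen_b = False
    for ch in s:
        if ch == "a" and not seen_b:
            stack.append(ch)
        elif ch == "b" and stack:
            seen_b = True
            stack.pop()
        else:
            return False
    return not stack


def accepts_a_n_b_n_c_n(s):
    """Three matched blocks: one stack is not enough, so we count directly."""
    n = len(s) // 3
    return s == "a" * n + "b" * n + "c" * n
```

The point of the contrast is only that each step up the hierarchy needs strictly more bookkeeping to get the counts right.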


Complexity and the Chomsky hierarchy

For each of these classes there is an associated canonical formalism:

    Regular: finite-state machines
    Context-free: context-free grammars
    Context-sensitive: linear bounded automata (nondeterministic Turing machines with linearly bounded tape)
    Recursively enumerable: Turing machines

Generative capacity: for a particular grammar or other language-theoretic device, there are two relevant concepts:

    Strong generative capacity: the set of structures that can be generated
    Weak generative capacity: the set of strings that can be generated
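The distinction between the two notions can be made concrete with a toy case. The sketch below assumes two hypothetical grammars, S → a S | ε and S → S a | ε, neither of which appears in the slides. They generate the same strings (same weak generative capacity on these inputs) but assign different bracketings (different structures).

```python
def right_branching(n):
    """Bracketed tree for n steps of S -> a S, closed off with S -> eps."""
    return "[S]" if n == 0 else "[S a " + right_branching(n - 1) + "]"


def left_branching(n):
    """Bracketed tree for n steps of S -> S a, closed off with S -> eps."""
    return "[S]" if n == 0 else "[S " + left_branching(n - 1) + " a]"


def yield_of(tree):
    """Terminal string of a bracketed tree: keep only the 'a' tokens."""
    tokens = tree.replace("[", " ").replace("]", " ").split()
    return "".join(tok for tok in tokens if tok == "a")
```

For n = 2 the two trees are `[S a [S a [S]]]` and `[S [S [S] a] a]`, yet both yield `aa`.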

Q: What is the generative capacity of natural language?

Focus on weak generative capacity


Why care about weak generative capacity?

Constrain our models and rule out approaches that aren’t formally complex enough

Gain insight into the formal properties of the language faculty

    Limits on the parsing/processing tools in the brain
    Limits on language production

Complexity parameter view (Deutscher (2010), Everett (2005), Givon (2009), Heine and Kuteva (2007), Wray and Grace (2007), Sauerland (to appear))


Weak generative capacity of natural language

Claim: natural language is weakly and strongly non-context-free

Various “proofs” over the years (Bar-Hillel and Shamir (1960), Chomsky (1959), Postal (1964), Elster (1978)), all invalid (Pullum and Gazdar, 1982)

Valid proofs of weak non-context-freeness in various languages:

    Shieber (1985): Swiss-German
    Culy (1985): Bambara
    Miller (1991): Swedish
    Higginbotham (1987): English (NOPE! Not a valid proof)

Claim: Romanian is also weakly non-context-free


Case morphology on Romanian wh-elements

Romanian overtly distinguishes accusative and dative case on wh-elements

(1)          Accusative   Dative
    who      (pe) cine    cui
    which    (pe) care    carui


Verb subcategorization

We can classify verbs as to whether they select accusative or dative complements

ruga, to ask, and lasa, to allow, require accusative complements

(2) a. Pe cine   ai     rugat?
       who.acc   have   asked
       “Who did you ask?”
    b. *Cui       i-ai      rugat?
        who.dat   cl-have   asked

(3) a. Pe cine   ai     lasat     sa   viziteze   Bucurestiul?
       who.acc   have   allowed   to   visit      Bucharest
       “Who did you allow to visit Bucharest?”
    b. *Cui       i-ai      lasat     sa   viziteze   Bucurestiul?
        who.dat   cl-have   allowed   to   visit      Bucharest


Verb subcategorization

We can classify verbs as to whether they select accusative or dative complements

spune, to tell, and scrie, to write, subcategorize for a dative complement

(4) a. Cui       i-ai      spus?
       who.dat   cl-have   told
       “Who did you tell?”
    b. *Pe cine   ai     spus?
        who.acc   have   told

(5) a. Cui       i-ai      scris?
       who.dat   cl-have   written
       “Who did you write to?”
    b. *Pe cine   ai     scris?
        who.acc   have   written


Multiple wh-questions

In sentences with multiple wh-elements, all must be extracted to interrogative clause-initial position

True even when the wh-elements originate in different clauses!

(6) a. Pe cine_i   cui_j     ai     rugat   t_i   sa-i    spuna   t_j   povestea?
       who.acc     who.dat   have   asked         to-cl   tell          story
       “Who did you ask to tell who the story?”
    b. *Pe cine ai rugat sa-i spuna cui povestea?
    c. *Cui ai rugat pe cine sa-i spuna povestea?
    d. *Ai rugat pe cine sa-i spuna cui povestea?


Multiple wh-questions

(7) a. [Pe care    baiat]_i   [carui      fete]_j    l-ai      rugat   t_i   sa-i    spuna   t_j   povestea?
       which.acc   boy.acc    which.dat   girl.dat   cl-have   asked         to-cl   tell          story
       “Which boy did you ask to tell which girl the story?”

    b. Pe cine_i   cui_j     ai     lasat     t_i   sa-i    scrie   t_j   scrisoarea?
       who.acc     who.dat   have   allowed         to-cl   write         letter
       “Who did you allow to write who the letter?”

    c. [Pe care    fata]_i    [carui      baiat]_j   ai     lasat-o      t_i   sa-i    scrie   t_j   scrisoarea?
       which.acc   girl.acc   which.dat   boy.dat    have   allowed-cl         to-cl   write         letter
       “Which girl did you allow to write which boy the letter?”


Multiple wh-questions

We aren’t limited to two wh-elements either:

(8) Pe cine_i   cui_j     la     ce_k   ai     vrut     sa   lasi    t_i   sa-i    spuna   t_j   t_k?
    who.acc     who.dat   with   what   have   wanted   to   allow         to-cl   tell
    “Who did you want to allow to tell who what?”

(9) Pe cine_i   pe cine_j   cui_k     ai     vrut     sa   lasi    t_i   sa   lase    t_j   sa-i    spuna   t_k   povestea?
    who.acc     who.acc     who.dat   have   wanted   to   allow         to   allow         to-cl   tell          story
    “Who did you want to allow to allow who to tell who the story?”

(10) [Pe care    baiat]_i   [carei      fete]_j    [carei      femei]_k    ai     vrut
     which.acc   boy.acc    which.dat   girl.dat   which.dat   woman.dat   have   wanted

     sa-l    lasi    t_i   sa-i    spuna   t_j   sa-i    scrie   t_k   scrisoarea?
     to-cl   allow         to-cl   tell          to-cl   write         letter
     “Which boy have you wanted to allow to tell which girl to write which woman the letter?”


Summary

ruga, lasa and spune, scrie subcategorize for only accusative and only dative complements, respectively

All wh-elements must be extracted to clause-initial position

Conclusion: Romanian permits structures of the following form

(11) wh-element.acc^n wh-element.dat^m you wanted verb1.acc^n verb2.dat^m something.

Or, explicitly:

(12) (Pe cine)^n   (cui)^m     ai     vrut     (sa   rogi)^n   (sa-i    spuna)^m   povestea?
     who.acc^n     who.dat^m   have   wanted   (to   ask)^n    (to-cl   tell)^m    story
     “Who did you want to ask to ask who . . . to tell who to tell who the story?”


Strategy

Homomorphism: a map from words to symbols (letters), extended to whole strings by applying it word by word

    f(romanian) = a

Intersection of language L and language L′

    L ∩ L′ = {w : w ∈ L and w ∈ L′}: everything in both L and L′

Context-free languages are closed under i) image under homomorphism, ii) intersection with regular languages

Apply a homomorphism to Romanian to simplify the representation (words ↦ symbols)

Intersect the image of Romanian with a regular language to filter out only the strings in (12)

Arrive at the non-context-free language $\{a^n b^m c^n d^m\}$

Conclusion: Romanian is weakly non-context-free


Account

Denote the set of all strings of Romanian as Romanian

Recall that all strings of the following form are in Romanian

(13) (Pe cine)^n   (cui)^m     ai     vrut     (sa   rogi)^n   (sa-i    spuna)^m   povestea?
     who.acc^n     who.dat^m   have   wanted   (to   ask)^n    (to-cl   tell)^m    story
     “Who did you want to ask to ask who . . . to tell who to tell who the story?”


Account

Define the following homomorphism

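(14)
\[
f(w) =
\begin{cases}
a & \text{if } w = \text{pe cine} \\
b & \text{if } w = \text{cui} \\
c & \text{if } w = \text{sa rogi} \\
d & \text{if } w = \text{sa-i spuna} \\
\varepsilon & \text{if } w = \text{anything else}
\end{cases}
\]

Take the intersection of the image of Romanian under f with the regular language $L = a^* b^* c^* d^*$

$f(\text{Romanian}) \cap L = \{a^n b^m c^n d^m\}$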
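The mapping in (14) and the filter by L can be tried out on strings of the shape in (13). This is a minimal sketch under two assumptions that are not in the slides: multi-word items such as "pe cine" and "sa rogi" are treated as single pre-tokenized units, and the helper names (`sentence_units`, `image_in_L`) are hypothetical. The code only computes images and checks membership in L; it does not decide whether a string is grammatical Romanian.

```python
import re

# Units from (14); every other unit is erased (mapped to the empty string).
SYMBOL_FOR = {"pe cine": "a", "cui": "b", "sa rogi": "c", "sa-i spuna": "d"}

# The regular language L = a*b*c*d*.
REGULAR_L = re.compile(r"a*b*c*d*")


def sentence_units(n, m):
    """Units of a string shaped like (13): n accusative and m dative wh-elements."""
    return (["pe cine"] * n + ["cui"] * m + ["ai", "vrut"]
            + ["sa rogi"] * n + ["sa-i spuna"] * m + ["povestea"])


def f(units):
    """Apply the homomorphism unit by unit and concatenate the images."""
    return "".join(SYMBOL_FOR.get(unit, "") for unit in units)


def image_in_L(units):
    """True if f(units) lies in a*b*c*d*; says nothing about grammaticality."""
    return REGULAR_L.fullmatch(f(units)) is not None
```

For example, `f(sentence_units(2, 3))` gives `aabbbccddd`, which has the cross-serial shape with matching counts of a/c and b/d.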


Account

$f(\text{Romanian}) \cap L = \{a^n b^m c^n d^m\}$

(15) Context-free pumping lemma: If L is a context-free language, there exists some length p such that every w ∈ L with |w| ≥ p can be split into five pieces u, v, x, y, z such that w = uvxyz, |vxy| ≤ p, |vy| ≥ 1, and for all i ≥ 0, $u v^i x y^i z \in L$.

It can be proved that $\{a^n b^m c^n d^m\}$ cannot be pumped ⇒ it is not context-free

Since context-free languages are closed under homomorphism and under intersection with regular languages, if Romanian were context-free then $\{a^n b^m c^n d^m\}$ would be context-free too

Conclusion: Romanian is non-context-free
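The pumping step is only asserted on the slide. One standard way to carry it out is sketched below; the choice of witness string $a^p b^p c^p d^p$ and the notation $\#_a$ (number of occurrences of $a$) are assumptions of this sketch, not taken from the talk.

```latex
Let $p$ be the pumping length and take $w = a^p b^p c^p d^p$, so $|w| \ge p$.
Suppose $w = uvxyz$ with $|vxy| \le p$ and $|vy| \ge 1$.
Because $|vxy| \le p$, the substring $vxy$ meets at most two adjacent blocks of $w$.
Hence $vy$ cannot contain both an $a$ and a $c$, nor both a $b$ and a $d$.
Pump down ($i = 0$) to obtain $w' = uxz$:
\begin{itemize}
  \item if $vy$ contains an $a$ or a $c$, then $\#_a(w') \ne \#_c(w')$;
  \item otherwise $vy$ contains a $b$ or a $d$, and $\#_b(w') \ne \#_d(w')$.
\end{itemize}
Either way $w' \notin \{a^n b^m c^n d^m\}$, contradicting the pumping lemma.
```

Pumping down rather than up avoids having to treat separately the case where v or y straddles a block boundary.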


Other weak non-context-freeness arguments

Q: Why should we care about Romanian?

We don’t need special phonological or morphological processes

Just pure A′-movement

Claim: all the other arguments rely on special phonological or morphological processes


Other weak non-context-freeness arguments

Swiss-German: reordering during the PF linearization process

Swedish: obligatory spelling out of wh-traces

Bambara: two morphological processes (noun reduplication, agglutinative agentive construction)


Swiss-German

Shieber (1985) proved Swiss-German to be non-context-free based on the following structures (proof details are almost identical to ours)

(16) . . . obj.acc^n obj.dat^m have wanted verb.acc^n verb.dat^m

(17) . . . mer   d’chind^n            em Hans^m    es    huus        haend   wele     laa^n   halfe^m   aastriiche
     . . . we    the children.acc^n   Hans.dat^m   the   house.acc   have    wanted   let^n   help^m    paint
     “. . . we have wanted to let the children help Hans let the children help Hans . . . paint the house” (Shieber, 1985)

This order of complements/verbs is only attested in Swiss-German and Dutch

Where does it come from?


Swiss-German

(18) . . . obj.acc^n obj.dat^m have wanted verb.acc^n verb.dat^m

(19) . . . mer   d’chind^n            em Hans^m    es    huus        haend   wele     laa^n   halfe^m   aastriiche
     . . . we    the children.acc^n   Hans.dat^m   the   house.acc   have    wanted   let^n   help^m    paint
     “. . . we have wanted to let the children help Hans let the children help Hans . . . paint the house” (Shieber, 1985)

West-Germanic is head-final, at least at VP level

(20) . . . [VP OBJ1 [. . . [VP OBJ2 [. . . [VP OBJ3 V3 ] ] V2 ] ] V1 ]

But we need (1-2-3-1-2-3) not (1-2-3-3-2-1)

Head movement!
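The contrast between the nested order predicted by the head-final structure in (20) and the cross-serial order that is actually needed can be sketched as follows. The token names OBJk and Vk follow (20); the helper functions are hypothetical and only illustrate which object-verb dependencies cross.

```python
def nested_order(n):
    """Head-final embedding as in (20): OBJ1..OBJn followed by Vn..V1."""
    objs = [f"OBJ{i}" for i in range(1, n + 1)]
    verbs = [f"V{i}" for i in range(n, 0, -1)]
    return objs + verbs


def cross_serial_order(n):
    """The attested order: OBJ1..OBJn followed by V1..Vn."""
    objs = [f"OBJ{i}" for i in range(1, n + 1)]
    verbs = [f"V{i}" for i in range(1, n + 1)]
    return objs + verbs


def has_crossing(seq):
    """True if two object-verb dependencies cross (o1 < o2 < v1 < v2)."""
    pairs = [(seq.index(tok), seq.index("V" + tok[3:]))
             for tok in seq if tok.startswith("OBJ")]
    return any(o1 < o2 < v1 < v2
               for (o1, v1) in pairs for (o2, v2) in pairs)
```

Nested dependencies are the kind a single stack can track, while crossing ones are what push the string set beyond context-free power once the counts are unbounded.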


Swiss-German

Underlying (3-2-1) order can be changed: (1-3-2), (1-2-3), etc.

If this is feature-motivated head movement, some questions:

    Why does the movement not affect the semantics like other types of verb movement?
    Why aren’t there any phonological or morphological reflexes of the features?
    What are the features that motivate this movement, and why are certain orders, such as (3-1-2), precluded?

Wurmbrand (2004, 2006, 2012): verb reordering is not head movement and occurs after Spell Out


Swiss-German

Reordering is post-syntactic (Wurmbrand, 2004)

Overview:

    In West-Germanic, there is a post-syntactic process, the infinitivus pro participio (IPP)
    Have + modal: the modal appears as an infinitive, not a participle
    IPP is distinctly post-syntactic

        Participles in German are ambiguous between simple past and perfective tenses; infinitives are not
        IPP modals are ambiguous ⇒ they are interpreted as participles

    Finally, IPP feeds reordering, not vice versa

Conclusion: Swiss-German data depends on a post-syntactic reordering operation


Swedish

Miller (1991) proved Swedish weakly non-context-free with the following structures

(21) Har    ar   {M, F, Pl}   som    jag   undrar
     here   is   {M, F, Pl}   that   I     wonder

     ({vilken   M,   vilken   F,   vilka   Pl}   Sg   undrar)+
     ({which    M,   which    F,   which   Pl}   Sg   wonder)+

     vilken   pojke   {han,   hon,   de}     trodde
     which    boy     {he,    she,   they}   thought

     (att    {han,   hon,   de}     trodde)*
     (that   {he,    she,   they}   thought)*

     {han,   hon,   de}     hade   rekommenderat   till   studenterna.
     {he,    she,   they}   had    recommended     to     the-students

     “Here is the M/F/Pl that I wonder (which M/F/Pl Sg wonders)+ which boy he/she/they thought (that he/she/they thought)* he/she/they had recommended to the students.”

Each of han, hon, de is a resumptive pronoun that is obligatorily inserted


Swedish

Condition on the resumption of gaps in Swedish:

(22) Given a gap G1 and its filler F1, G1 must be realized as a resumptive pronoun if there is a gap G2 following G1 such that the filler F2 of G2 follows F1. (Miller, 1991)

The han, hon, de are all mandatory in (21) (G2 here is the gap after rekommenderat)

Claim: Swedish resumptive pronouns are just spelled out wh-traces in some cases

Without this, (21) is the same as English
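The condition in (22) is mechanical enough to state as a small check. The sketch below assumes that each filler-gap dependency is given as a pair of string positions `(filler_pos, gap_pos)`; that encoding and the function name are hypothetical, introduced only for illustration.

```python
def must_be_resumptive(dependencies):
    """Indices of dependencies whose gap must surface as a pronoun under (22).

    dependencies: list of (filler_pos, gap_pos) pairs.
    Gap G1 is forced if some other gap G2 follows G1 and G2's filler F2 follows F1.
    """
    forced = set()
    for i, (f1, g1) in enumerate(dependencies):
        for j, (f2, g2) in enumerate(dependencies):
            if i != j and g2 > g1 and f2 > f1:
                forced.add(i)
    return forced
```

With crossing dependencies such as `[(0, 5), (1, 6)]` the first gap is forced, while a nested pair such as `[(0, 6), (1, 5)]` forces nothing. This is why the pronouns in (21) track the order of the fillers.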


Swedish

Engdahl (1985) demonstrates that resumptive pronouns in Swedish are just spelled out A′-traces

They can license parasitic gaps:

(23) Det   var   den    fangen_i   som    lakarna       inte   kunde   avgora
     it    was   that   prisoner   that   the-doctors   not    could   decide

     [C′ om   han_i   verkligen   var   sjuk]   [utan     att   tala   med    p   personligen].
         if   he      really      was   ill     without   to    talk   with       in person

     “This is the prisoner that the doctors couldn’t determine if he really was ill without talking to in person” (Engdahl, 1985)

Parasitic gaps are licensed by the following structural configuration (α is an element in A′-position, t is a variable bound by α, p is the parasitic gap; the condition holds at Spell Out)

(24) . . . α . . . t . . . p (order irrelevant) (Chomsky, 1982)


Swedish

Parasitic gaps are licensed by the following structural configuration (α is an element in A′-position, t is a variable bound by α, p is the parasitic gap; the condition holds at Spell Out)

(25) . . . α . . . t . . . p (order irrelevant) (Chomsky, 1982)

Conclusion: Swedish resumptive pronouns must be A′-bound at Spell Out

Conclusion: Swedish resumptive pronouns are spelled out A′-traces


Swedish

Conclusion: Swedish resumptive pronouns are spelled out A′-traces

Further evidence: resumptive pronouns can co-occur with A′-traces in ATB situations

(26) [Det    finns   vissa     ord]_i   (som)   jag   ofta    traffar   pa   t_i   men   inte   minns      hur   de_i   stavas.
     there   are     certain   words    that    I     often   meet                 but   not    remember   how   they   are-spelled
     “There are certain words that I often come across but never remember how they are spelled.”

The Swedish argument depends on mandatory spell out of A′-traces, a post-syntactic process


Bambara

Bambara relies on two overtly morphological processes

“Noun o noun”: whichever noun

(27) a. wulu   o   wulu
        dog        dog
        “whichever dog”
     b. *malo   o   wulu
         rice       dog
        (Culy, 1985)

Agentive construction: (T)ransitive (V)erb + (N)oun = “one who TVs Ns”

(28) a. wulu   +   nyini        +   la   =   wulunyinina
        dog        search for
        “one who searches for dogs”
     b. malo   +   file    +   la   =   malofilela
        rice       watch
        “one who watches rice” (Culy, 1985)


Bambara

Agentive construction is recursive:

(29) a. wulunyinina    +   nyini        +   la   =   wulunyininanyinina
        dog searcher       search for
        “one who searches for dog searchers”
     b. malofilela     +   file    +   la   =   malofilelafilela
        rice watcher       watch
        “one who watches rice watchers” (Culy, 1985)

Agentive construction can feed “noun o noun”

Non-context-freeness is based on the structure below

(30) wulu(filela)^n(nyinina)^m    o   wulu(filela)^n(nyinina)^m
     dog(watcher)^n(searcher)^m       dog(watcher)^n(searcher)^m

Conclusion: Bambara weak non-context-freeness depends on morphological processes
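The way the agentive construction feeds “noun o noun” in (30) can be mimicked with plain string operations. This is an illustrative sketch only: the helper names are hypothetical, and the surface forms are built by simple concatenation of the strings shown in (28)–(30).

```python
def agentive_noun(n, m):
    """Surface form wulu(filela)^n(nyinina)^m from (30)."""
    return "wulu" + "filela" * n + "nyinina" * m


def whichever(noun):
    """The 'noun o noun' construction: the same noun on both sides of 'o'."""
    return f"{noun} o {noun}"


def is_noun_o_noun(phrase):
    """True if the phrase has the shape X o X with identical, non-empty halves."""
    left, sep, right = phrase.partition(" o ")
    return sep == " o " and left != "" and left == right
```

Because the two halves must match exactly, with n and m unbounded, the well-formed phrases pattern like a copy language rather than a nested one.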


Romanian

In Swiss-German, Swedish, and Bambara, the data for the weak non-context-freeness argument depends on post- or extra-syntactic processes

Q: How can we be sure that the Romanian wh-elements aren’t reordered post-syntactically?

A: superiority

Romanian respects superiority constraints (Boskovic, 2002)

(31) a. pe cine   cui       ai     vrut     sa   lasi    sa-i    spuna   povestea?
        who.acc   who.dat   have   wanted   to   allow   to-cl   tell    story
        “Who did you want to allow to tell who the story?”
     b. ?*cui pe cine ai vrut sa lasi sa-i spuna povestea?


Conclusion

Based on overt Case morphology and multiple wh-extraction, Romanian is weakly non-context-free

Romanian provides a fundamentally new way in which natural language exceeds the generative power of context-free grammars

Romanian differs from previously attested examples in not relying on post- or extra-syntactic operations

References

Yehoshua Bar-Hillel and Eliyahu Shamir. Finite-state languages: Formal representations and adequacy problems. Weizmann Science Press of Israel, 1960.

Zeljko Boskovic. On multiple wh-fronting. Linguistic Inquiry, 33(3):351–383, 2002.

Noam Chomsky. On certain formal properties of grammars. Information and Control, 2(2):137–167, 1959.

Noam Chomsky. Some concepts and consequences of the theory of government and binding, volume 6. MIT Press, 1982.

Christopher Culy. The complexity of the vocabulary of Bambara. Linguistics and Philosophy, 8(3):345–351, 1985.

Guy Deutscher. The unfolding of language. Random House, 2010.

Jon Elster. Logic and society: Contradictions and possible worlds. Wiley New York, 1978.

Elisabet Engdahl. Parasitic gaps, resumptive pronouns, and subject extractions. Linguistics, 23(1):3–44, 1985.

Daniel L Everett. Cultural constraints on grammar and cognition in Piraha. Current Anthropology, 46(4):621–646, 2005.

Talmy Givon. The genesis of syntactic complexity: Diachrony, ontogeny, neuro-cognition, evolution. John Benjamins, 2009.

Bernd Heine and Tania Kuteva. The genesis of grammar. Newsletter, page 440, 2007.

James Higginbotham. English is not a context-free language. In The Formal Complexity of Natural Language, pages 335–348. Springer, 1987.

Philip H Miller. Scandinavian extraction phenomena revisited: Weak and strong generative capacity. Linguistics and Philosophy, 14(1):101–113, 1991.

Paul M Postal. Limitations of phrase structure grammars. The structure of language, pages 137–151, 1964.

Geoffrey K Pullum and Gerald Gazdar. Natural languages and context-free languages. Linguistics and Philosophy, 4(4):471–504, 1982.

Stuart M Shieber. Evidence against the context-freeness of natural language. Linguistics and Philosophy, 8(3):333–343, 1985.

Alison Wray and George W Grace. The consequences of talking to strangers: Evolutionary corollaries of socio-cultural influences on linguistic form. Lingua, 117(3):543–578, 2007.

Susi Wurmbrand. Syntactic vs. post-syntactic movement. In Sophie Burelle and Stanca Somesfalean, editors, Proceedings of the 2003 Annual Meeting of the Canadian Linguistic Association (CLA), pages 284–295, 2004.

Susi Wurmbrand. Verb clusters, verb raising, and restructuring. The Blackwell companion to syntax, pages 229–343, 2006.

Susi Wurmbrand. Parasitic participles: Evidence for the theory of verb clusters. Taal en Tongval, 2012.
