EVALUATING MODELS OF PARAMETER SETTING
Janet Dean Fodor, Graduate Center, City University of New York

TRANSCRIPT

Page 1: EVALUATING MODELS OF PARAMETER SETTING

1

EVALUATING MODELS OF PARAMETER SETTING

Janet Dean Fodor

Graduate Center,

City University of New York

Page 2: EVALUATING MODELS OF PARAMETER SETTING

2

Page 3: EVALUATING MODELS OF PARAMETER SETTING

3

On behalf of CUNY-CoLAG, the CUNY Computational Language Acquisition Group

With support from PSC-CUNY

William G. Sakas, co-director

Carrie Crowther, Lisa Reisig-Ferrazzano, Atsu Inoue, Iglika Stoyneshka-Raleva, Xuan-Nga Kam, Virginia Teller, Yukiko Koizumi, Lidiya Tornyova, Eiji Nishimoto, Erika Troseth, Artur Niyazov, Tanya Viger, Iana Melnikova Pugach, Sam Wagner

www.colag.cs.hunter.cuny.edu

Page 4: EVALUATING MODELS OF PARAMETER SETTING

4

Before we start…

Warning: I may skip some slides.

But not to hide them from you.

Every slide is at our website

www.colag.cs.hunter.cuny.edu

Page 5: EVALUATING MODELS OF PARAMETER SETTING

5

What we have done

A factory for testing models of parameter setting.

UG + 13 parameter values → 3,072 languages (simplified but human-like).

Sentences of a target language are the input to a learning model.

Is learning successful? How fast?

Why?

Page 6: EVALUATING MODELS OF PARAMETER SETTING

6

Our Aims

A psycho-computational model of syntactic parameter setting.

Psychologically realistic.

Precisely specified.

Compatible with linguistic theory.

And… it must work!

Page 7: EVALUATING MODELS OF PARAMETER SETTING

7

Parameter setting as the solution (1981)

Avoids problems of rule-learning.

Only 20 (or 200) facts to learn.

Triggering is fast & automatic = no linguistic computation is necessary.

Accurate.

BUT: This has never been modeled.

Page 8: EVALUATING MODELS OF PARAMETER SETTING

8

Parameter setting as the problem (1990s)

R. Clark, and Gibson & Wexler have shown:

P-setting is not labor-free and not always successful. Because of:

The parameter interaction problem.

The parametric ambiguity problem.

Sentences do not tell which parameter values generated them.

Page 9: EVALUATING MODELS OF PARAMETER SETTING

9

This evening…

Parameter setting:

How severe are the problems?

Why do they matter?

How to escape them?

Moving forward: from problems to

explorations.

Page 10: EVALUATING MODELS OF PARAMETER SETTING

10

Problem 1: Parameter interaction

Even independent parameters interact in derivations (Clark 1988, 1992).

Surface string reflects their combined effects.

So one parameter may have no distinctive, isolatable effect on sentences = no trigger, no cue (cf. the cue-based learner; Lightfoot 1991; Dresher 1999).

Parametric decoding is needed. Must disentangle the interactions, to identify which p-values a sentence requires.

Page 11: EVALUATING MODELS OF PARAMETER SETTING

11

Parametric decoding

Decoding is not instantaneous. It is hard work. Because…

To know that a parameter value is necessary, must test it in company of all other p-values.

So whole grammars must be tested against the sentence. (Grammar-testing ≠ triggering!)

All grammars must be tested, to identify one correct p-value. (exponential!)

Page 12: EVALUATING MODELS OF PARAMETER SETTING

12

Decoding

This sets: no wh-movt, p-stranding, head initial VP, V to I to C, no affix hopping, C- initial, subj initial, no overt topic marking

Doesn’t set: oblig topic, null subj, null topic

O3 Verb Subj O1[+WH] P Adv.

Page 13: EVALUATING MODELS OF PARAMETER SETTING

13

More decoding

Adv[+WH] P NOT Verb S KA.

This sets everything except ±overt topic marking.

Verb[+FIN].

This sets nothing, not even +null subject.

Page 14: EVALUATING MODELS OF PARAMETER SETTING

14

Problem 2: Parametric ambiguity

A sentence may belong to more than one language.

A p-ambiguous sentence doesn’t reveal the target p-values (even if decoded).

Learner must guess (= inaccurate) or pass (= slow, + when? )

How much p-ambiguity is there in natural language? Not quantified; probably vast.

Page 15: EVALUATING MODELS OF PARAMETER SETTING

15

Scale of the problem (exponential)

P-interaction and p-ambiguity are likely to increase with the # of parameters.

How many parameters are there?

20 parameters → 2^20 grammars = over a million
30 parameters → 2^30 grammars = over a billion
40 parameters → 2^40 grammars = over a trillion
100 parameters → 2^100 grammars = ???
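The arithmetic behind these figures can be checked directly; a quick sketch in Python (the parameter counts are the ones from the slide):

```python
# Grammar-space size doubles with every added binary parameter.
for n in (13, 20, 30, 40, 100):
    print(f"{n} parameters -> 2**{n} = {2**n:,} grammars")
```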

Page 16: EVALUATING MODELS OF PARAMETER SETTING

16

Learning models must scale up

Testing all grammars against each input sentence is clearly impossible.

So research has turned to search methods: how to sample and test the huge field of grammars efficiently.

Genetic algorithms (e.g., Clark 1992)

Hill-climbing algorithms (e.g., Gibson & Wexler’s TLA, 1994)

Page 17: EVALUATING MODELS OF PARAMETER SETTING

17

Our approach

Retain a central aspect of classic triggering:

Input sentences guide the learner toward the p-values they need.

Decode on-line; parsing routines do the work. (They’re innate.)

Parse the input sentence (just as adults do, for comprehension) until it crashes.

Then the parser draws on other p-values, to find one that can patch the parse-tree.

Page 18: EVALUATING MODELS OF PARAMETER SETTING

18

Structural Triggers Learners (CUNY)

STLs find one grammar for each sentence.

More than that would require parallel parsing, beyond human capacity.

But the parser can tell on-line if there is (possibly) more than one candidate.

If so: guess, or pass (wait for unambig).

Considers only real candidate grammars; directed by what the parse-tree needs.

Page 19: EVALUATING MODELS OF PARAMETER SETTING

19

Summary so far…

Structural triggers learners (STLs) retain an important aspect of triggering (p-decoding).

Compatible with current psycholinguistic models of sentence processing.

Hold promise of being efficient. (Home in on target grammar, within human resource limits.)

Now: Do they really work, in a domain with realistic parametric ambiguity?

Page 20: EVALUATING MODELS OF PARAMETER SETTING

20

Evaluating learning models

Do any models work?

Reliably? Fast? Within human resources?

Do decoding models work better than domain-search (grammar-testing) models?

Within decoding models, is guessing better or worse than waiting?

Page 21: EVALUATING MODELS OF PARAMETER SETTING

21

Hope it works! If not…

The challenge: What is UG good for?

All that innate knowledge, only a few facts to learn, but you can’t say how!

Instead, one simple learning procedure: Adjust the weights in a neural network; Record statistics of co-occurrence frequencies.

Nativist theories of human language are vulnerable until some UG-based learner is shown to perform well.

Page 22: EVALUATING MODELS OF PARAMETER SETTING

22

Non-UG-based learning

Christiansen, M.H., Conway, C.M. and Curtin, S. (2000) A connectionist single-mechanism account of rule-like behavior in infancy. In Proceedings of the 22nd Annual Conference of the Cognitive Science Society, 83-88. Mahwah, NJ: Lawrence Erlbaum.

Culicover, P.W. and Nowak, A. (2003) A Dynamical Grammar. Vol. Two of Foundations of Syntax. Oxford, UK: Oxford University Press.

Lewis, J.D. and Elman, J.L. (2002) Learnability and the statistical structure of language: Poverty of stimulus arguments revisited. In B. Skarabela et al. (eds) Proceedings of BUCLD 26, Somerville, Mass: Cascadilla Press.

Pereira, F. (2000) Formal Theory and Information theory: Together again? Philosophical Transactions of the Royal Society, Series A 358, 1239-1253.

Seidenberg, M.S., & MacDonald, M.C. (1999) A probabilistic constraints approach to language acquisition and processing. Cognitive Science 23, 569-588.

Tomasello, M. (2003) Constructing a Language: A Usage-Based Theory of Language Acquisition. Harvard University Press.

Page 23: EVALUATING MODELS OF PARAMETER SETTING

23

The CUNY simulation project

We program learning algorithms proposed in the literature (12 so far).

Run each one on a large domain of human-like languages; 1,000 trials (1,000 ‘children’) each.

Success rate: % of trials that identify the target.

Speed: average # of input sentences consumed until the learner has identified the target grammar.

Reliability/speed: # of input sentences for 99% of trials (99% of ‘children’) to attain the target.

Subset Principle violations and one-step local maxima excluded by fiat. (Explained below as necessary.)
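The three measures above are straightforward to compute from per-trial records. A minimal sketch, assuming each trial yields a pair (succeeded, number of inputs consumed); the trial data below are simulated placeholders, not CoLAG results:

```python
import random

# Simulated trial records: (succeeded, n_inputs_consumed) for 1,000 'children'.
random.seed(0)
trials = [(True, random.randint(50, 2000)) for _ in range(1000)]

# % of trials that fail to identify the target (complement of success rate).
failure_rate = 100 * sum(not ok for ok, _ in trials) / len(trials)

consumed = sorted(n for ok, n in trials if ok)
average = sum(consumed) / len(consumed)          # speed
q99 = consumed[int(0.99 * len(consumed)) - 1]    # reliability/speed: 99% of trials
print(f"failure {failure_rate}%, avg {average:.0f}, 99% within {q99} inputs")
```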

Page 24: EVALUATING MODELS OF PARAMETER SETTING

24

Designing the language domain

Realistically large, to test which models scale up well.

As much like natural languages as possible. Except, input limited like child-directed speech.

Sentences must have fully specified tree structure (not just word strings), to test models like the STL.

Should reflect theoretically defensible linguistic analyses (though simplified).

Grammar format should allow rapid conversion into the operations of an effective parsing device.

Page 25: EVALUATING MODELS OF PARAMETER SETTING

25

Language domains created

Model | # params | # langs | # sents per lang | Tree structure | Language properties
Gibson & Wexler (1994) | 3 | 8 | 12 or 18 | Not fully specified | Word order + V2
Bertolo et al. (1997) | 7 | 64 distinct | Many | Yes | G&W + V-raising + degree-2
Kohl (1999) | 12 | 2,304 | Many | Partial | B et al. + scrambling
Sakas & Nishimoto (2002) | 4 | 16 | 12-32 | Yes | G&W + null subj/topic
Fodor, Melnikova & Troseth (2002) | 13 | 3,072 | 168-1,420 | Yes | S&N + Imp + wh-movt + piping + etc.

Page 26: EVALUATING MODELS OF PARAMETER SETTING

26

Selection criteria for our domain

We have given priority to syntactic phenomena which:

Occur in a high proportion of known natural languages;

Occur often in speech directed to 2-3 year olds;

Pose learning problems of theoretical interest;

Are a focus of linguistic / psycholinguistic research;

Have broadly agreed-on syntactic analyses.

Page 27: EVALUATING MODELS OF PARAMETER SETTING

27

By these criteria

Questions, imperatives.

Negation, adverbs.

Null subjects, verb movement.

Prep-stranding, affix-hopping (though not widespread!).

Wh-movement, but no scrambling yet.

Page 28: EVALUATING MODELS OF PARAMETER SETTING

28

Not yet included

No LF interface (cf. Villavicencio 2000)

No ellipsis; no discourse contexts to license fragments.

No DP-internal structure; Case; agreement.

No embedding (only degree-0).

No feature checking as implementation of movement parameters (Chomsky 1995ff.)

No LCA / Anti-symmetry (Kayne 1994ff.)

Page 29: EVALUATING MODELS OF PARAMETER SETTING

29

Our 13 parameters (so far)

Parameter | Default
Subject Initial (SI) | yes
Object Final (OF) | yes
Complementizer Initial (CI) | initial
V to I Movement (VtoI) | no
I to C Movement (of aux or verb) (ItoC) | no
Question Inversion (Qinv = I to C in questions only) | no
Affix Hopping (AH) | no
Obligatory Topic (vs. optional) (ObT) | yes
Topic Marking (TM) | no
Wh-Movement obligatory (vs. none) (Wh-M) | no
Pied Piping (vs. preposition stranding) (PI) | piping
Null Subject (NS) | no
Null Topic (NT) | no

Page 30: EVALUATING MODELS OF PARAMETER SETTING

30

Parameters are not all independent

Constraints on P-value combinations:

If [+ ObT] then [- NS]. (A topic-oriented language does not have null subjects.)

If [- ObT] then [- NT]. (A subject-oriented language does not have null topics.)

If [+ VtoI] then [- AH]. (If verbs raise to I, affix hopping does not occur.)

(This is why only 3,072 grammars, not 8,192.)
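The count of 3,072 can be verified by enumerating all 2^13 = 8,192 value combinations and filtering out those that violate the three constraints. A minimal sketch (parameter names follow the slide; values are booleans):

```python
from itertools import product

# The 13 binary parameters of the domain (slide "Our 13 parameters").
PARAMS = ["SI", "OF", "CI", "VtoI", "ItoC", "Qinv", "AH",
          "ObT", "TM", "WhM", "PI", "NS", "NT"]

def is_licit(g):
    """Apply the three cross-parameter constraints from the slide."""
    if g["ObT"] and g["NS"]:        # +ObT entails -NS
        return False
    if not g["ObT"] and g["NT"]:    # -ObT entails -NT
        return False
    if g["VtoI"] and g["AH"]:       # +VtoI entails -AH
        return False
    return True

grammars = [dict(zip(PARAMS, values))
            for values in product([False, True], repeat=len(PARAMS))]
licit = [g for g in grammars if is_licit(g)]
print(len(grammars), len(licit))   # 8192 3072
```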

Page 31: EVALUATING MODELS OF PARAMETER SETTING

31

Input sentences

Universal lexicon: S, Aux, O1, P, etc.

Input is word strings only, no structure.

Except, the learner knows all word categories and all grammatical roles!

Equivalent to some semantic bootstrapping; no prosodic bootstrapping (yet!)

Page 32: EVALUATING MODELS OF PARAMETER SETTING

32

Learning procedures

In all models tested (unless noted), learning is:

Incremental = hypothesize a grammar after each input. No memory for past input.

Error-driven = if Gcurrent can parse the sentence, retain it.

Models differ in what the learner does when Gcurrent fails = grammar change is needed.
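The incremental, error-driven scheme can be sketched as a simple loop. Everything here is a toy stand-in for illustration: `GRAMMARS`, `TARGET`, and the trivial `parses` function are invented, and the "change grammar" step (random choice) is exactly the part where the tested models differ:

```python
import random

# Toy grammar space: two binary parameters; a 'sentence' names its grammar.
GRAMMARS = [(a, b) for a in (0, 1) for b in (0, 1)]
TARGET = (1, 0)

def parses(grammar, sentence):
    return grammar == sentence          # toy stand-in for the parser

def learn(inputs):
    g_current = random.choice(GRAMMARS)
    for sentence in inputs:             # incremental: one input at a time, no memory
        if parses(g_current, sentence): # error-driven: success, retain G_current
            continue
        g_current = random.choice(GRAMMARS)  # failure: change grammar (here, randomly)
    return g_current

random.seed(1)
print(learn([TARGET] * 20))   # settles on TARGET once it guesses right
```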

Page 33: EVALUATING MODELS OF PARAMETER SETTING

33

The learning models: preview

Learners that decode: STLs
Waiting (‘squeaky clean’)
Guessing

Grammar-testing learners
Triggering Learning Algorithm (G&W)
Variational Learner (Yang 2000)

…plus benchmarks for comparison:
too powerful
too weak

Page 34: EVALUATING MODELS OF PARAMETER SETTING

34

Learners that decode: STLs

Strong STL: Parallel parse input sentence, find all successful grammars. Adopt p-values they share. (A useful benchmark, not a psychological model.)

Waiting STL: Serial parse. Note any choice-point in the parse. Set no parameters after a choice. (Never guesses. Needs fully unambig triggers.)

(Fodor 1998a)

Guessing STLs: Serial. At a choice-point, guess. (Can learn from p-ambiguous input.) (Fodor 1998b)

Page 35: EVALUATING MODELS OF PARAMETER SETTING

35

Guessing STLs’ guessing principles

If there is more than one new p-value that could patch the parse tree…

Any Parse: Pick at random.

Minimal Connections: Pick the p-value that gives the simplest tree. ( MA + LC)

Least Null Terminals: Pick the parse with the fewest empty categories. ( MCP)

Nearest Grammar: Pick the grammar that differs least from Gcurrent.

Page 36: EVALUATING MODELS OF PARAMETER SETTING

36

Grammar-testing: TLA

Error-driven random: Adopt any grammar.

(Another baseline; not a psychological model.)

TLA (Gibson & Wexler, 1994): Change any one parameter. Try the new grammar on the sentence. Adopt it if the parse succeeds. Else pass.

Non-greedy TLA (Berwick & Niyogi, 1996): Change any one parameter. Adopt it. (No test of new grammar against the sentence.)

Non-SVC TLA (B&N 96): Try any grammar other than Gcurrent. Adopt it if the parse succeeds.
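The original TLA's two defining constraints, the Single Value Constraint and Greediness, can be sketched as one update step. A minimal sketch on a toy 3-parameter space; `parses` is an invented stand-in for the parser, not the CoLAG implementation:

```python
import random

def parses(grammar, sentence):
    return grammar == sentence          # toy: a 'sentence' names its grammar

def tla_step(g_current, sentence, n_params=3):
    if parses(g_current, sentence):     # error-driven: no error, no change
        return g_current
    i = random.randrange(n_params)      # Single Value Constraint: flip one parameter
    g_new = tuple(v ^ (j == i) for j, v in enumerate(g_current))
    # Greediness: adopt the new grammar only if it parses this sentence.
    return g_new if parses(g_new, sentence) else g_current

random.seed(0)
g = (0, 0, 0)
for _ in range(100):                    # feed the same target sentence repeatedly
    g = tla_step(g, (0, 0, 1))
print(g)
```

Dropping the Greediness test gives Berwick & Niyogi's non-greedy variant; allowing any grammar other than Gcurrent gives the non-SVC variant.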

Page 37: EVALUATING MODELS OF PARAMETER SETTING

37

Grammar-testing models with memory

Variational Learner (Yang 2000,2002) has memory for success / failure of p-values.

A p-value is: rewarded if in a grammar that parsed an input; punished if in a grammar that failed.

Reinforcement is approximate, because of interaction. A good p-value in a bad grammar is punished, and vice versa.
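The reward/punish step can be sketched with a linear reward-penalty update, applied to every parameter value in the grammar that was tried, which is exactly why a good p-value in a bad grammar gets punished along with the rest. A sketch under assumptions: the learning `RATE` constant and `update` signature are invented for illustration:

```python
RATE = 0.1   # assumed learning-rate constant

def update(weights, grammar, parsed):
    """weights[i] = current probability of choosing value 1 for parameter i."""
    new = []
    for w, v in zip(weights, grammar):
        p = w if v == 1 else 1 - w              # prob. of the value just tried
        # Linear reward-penalty: nudge toward 1 on success, toward 0 on failure.
        p = p + RATE * (1 - p) if parsed else p * (1 - RATE)
        new.append(p if v == 1 else 1 - p)
    return new

w = update([0.5, 0.5], (1, 0), parsed=True)     # both tried values rewarded together
print(w)
```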

Page 38: EVALUATING MODELS OF PARAMETER SETTING

38

With memory: Error-driven VL

Yang’s VL is not error-driven. It chooses p-values with probability proportional to their current success weights. So it occasionally tries out unlikely p-values.

Error-driven VL (Sakas & Nishimoto, 2002) is like Yang’s original, but:

First, set each parameter to its currently more successful value. Only if that fails, pick a different grammar as above.

Page 39: EVALUATING MODELS OF PARAMETER SETTING

39

Previous simulation results

TLA is slower than error-driven random on the G&W domain, even when it succeeds (Berwick & Niyogi 1996).

TLA sometimes performs better, e.g., in strongly smooth domains (Sakas 2000, 2003).

TLA fails on 3 of G&W’s 8 languages, and on 95.4% of Kohl’s 2,304 languages.

There is no default grammar that can avoid TLA learning failures. The best starting grammar succeeds only 43% of the time (Kohl 1999).

Some TLA-unlearnable languages are quite natural, e.g., Swedish-type settings (Kohl 1999).

Waiting-STL is paralyzed by weakly equivalent grammars (Bertolo et al. 1997).

Page 40: EVALUATING MODELS OF PARAMETER SETTING

40

Data by learning model

Algorithm | % failure rate | # inputs (99% of trials) | # inputs (average)
Error-driven random | 0 | 16,663 | 3,589
TLA original | 88 | 16,990 | 961
TLA w/o Greediness | 0 | 19,181 | 4,110
TLA without SVC | 0 | 67,896 | 11,273
Strong STL | 74 | 170 | 26
Waiting STL | 75 | 176 | 28
Guessing STL: Any Parse | 0 | 1,486 | 166
Guessing STL: Minimal Connections | 0 | 1,923 | 197
Guessing STL: Least Null Terminals | 0 | 1,412 | 160
Guessing STL: Nearest Grammar | 80 | 180 | 30

Page 41: EVALUATING MODELS OF PARAMETER SETTING

41

Summary of performance

Not all models scale up well.

‘Squeaky-clean’ models (Strong / Waiting STL) fail often. They need unambiguous triggers.

Decoding models which guess are most efficient. On-line parsing strategies make good learning strategies. (?)

Even with decoding, conservative domain search fails often (Nearest Grammar STL).

Thus: Learning-by-parsing fulfills its promise. Psychologically natural ‘triggering’ is efficient.

Page 42: EVALUATING MODELS OF PARAMETER SETTING

42

Now that we have a workable model…

Use it to investigate questions of interest:

Are some languages easier than others?

Do default starting p-values help?

Does overt morphological marking facilitate syntax learning?

etc…

Compare with psycholinguistic data, where possible. This tests the model further, and may offer guidelines for real-life studies.

Page 43: EVALUATING MODELS OF PARAMETER SETTING

43

Are some languages easier?

Guessing STL (MC) | # inputs (99% of trials) | # inputs (average)
‘Japanese’ | 87 | 21
‘French’ | 99 | 22
‘German’ | 727 | 147
‘English’ | 1,549 | 357

Page 44: EVALUATING MODELS OF PARAMETER SETTING

44

Language difficulty is not predicted by how many of the target p-settings are defaults.

Probably what matters is parametric ambiguity:

Overlap with neighboring languages

Lack of almost-unambiguous triggers

Are non-attested languages the difficult ones? (Kohl, 1999: explanatory!)

What makes a language easier?

Page 45: EVALUATING MODELS OF PARAMETER SETTING

45

Sensitivity to input properties

How does the informativeness of the input affect learning rate?

Theoretical interest: To what extent can UG-based p-setting be input-paced?

If an input-pacing profile does not match child learners, that could suggest biological timing (e.g., maturation).

Page 46: EVALUATING MODELS OF PARAMETER SETTING

46

Some input properties

Morphological marking of syntactic features: Case, Agreement, Finiteness.

The target language may not provide them. Or the learner may not know them.

Do they speed up learning? Or just create more work?

Page 47: EVALUATING MODELS OF PARAMETER SETTING

47

Input properties, cont’d

For real children, it is likely that:

Semantics / discourse pragmatics signals illocutionary force: [ILLOC DEC], [ILLOC Q] or [ILLOC IMP]

Semantics and/or syntactic context reveals SUBCAT (argument structure) of verbs.

Prosody reveals some phrase boundaries (as well as providing illocutionary cues).

Page 48: EVALUATING MODELS OF PARAMETER SETTING

48

Making finiteness audible

[+/-FIN] distinguishes Imperatives from Declaratives. (So does [ILLOC], but it’s inaudible.)

Imperatives have null subject. E.g., Verb O1.

A child who interprets an IMP input as a DEC could mis-set [+NS] for a [-NS] lang.

Does learning become faster / more accurate when [+/-FIN] is audible? No. Why not?

Because the Subset Principle requires the learner to parse IMP/DEC-ambiguous sentences as IMP.

Page 49: EVALUATING MODELS OF PARAMETER SETTING

49

Providing semantic info: ILLOC

Suppose real children know whether an input is Imperative, Declarative or Question.

This is relevant to [+ItoC] vs. [+Qinv]. ([+Qinv] = [+ItoC] only in questions.)

Does learning become faster / more accurate when [ILLOC] is audible? No. It’s slower!

Because it’s just one more thing to learn. Without ILLOC, a learner could get all word strings right, but their ILLOCs and p-values all wrong – and count as successful.

Page 50: EVALUATING MODELS OF PARAMETER SETTING

50

Providing SUBCAT information

Suppose real children can bootstrap verb argument structure from meaning / local context.

This can reveal when an argument is missing. How can O1, O2 or PP be missing? Only by [+NT].

If [+NT] then also [+ObT] and [-NS] (in our UG).

Does learning become faster / more accurate when learners know SUBCAT? Yes. Why?

SP doesn’t choose between no-topic and null-topic. Other triggers are rare. So triggers for [+NT] are useful.

Page 51: EVALUATING MODELS OF PARAMETER SETTING

51

Enriching the input: Summary

Richer input is good if it helps with something that must be learned anyway (& other cues are scarce).

It hinders if it creates a distinction that otherwise could have been ignored. (cf. Wexler & Culicover 1980)

Outcomes depend on properties of this domain, but it can be tailored to the issue at hand.

The ultimate interest is the light these data shed on real language acquisition.

We can provide profiles of UG-based / input- (in)sensitive learning, for comparison with children.

The outcomes are never quite as anticipated.

Page 52: EVALUATING MODELS OF PARAMETER SETTING

52

This is just the beginning…

Next on the agenda

Page 53: EVALUATING MODELS OF PARAMETER SETTING

53

Next steps ~ input properties

How much damage from noisy input? E.g., 1 sentence in 5 / 10 / 100 not from target language.

How much facilitation from ‘starting small’? E.g., probability of occurrence inversely proportional to sentence length.

How much facilitation (or not) from the exact mix of sentences in child-directed speech? (cf. Newport, 1977; Yang, 2002)

Page 54: EVALUATING MODELS OF PARAMETER SETTING

54

Next steps ~ learning models

Add connectionist and statistical learners.

Add our favorite STL (= Parse Naturally), with MA, MCP etc. and a p-value ‘lexicon’.

(Fodor 1998b)

Implement the ambiguity / irrelevance distinction, important to Waiting-STL.

Evaluate models for realistic sequence of setting parameters. (Time course data)

Your request here

Page 55: EVALUATING MODELS OF PARAMETER SETTING

55

The end

www.colag.cs.hunter.cuny.edu


Page 56: EVALUATING MODELS OF PARAMETER SETTING

56

REFERENCES

Bertolo, S., Broihier, K., Gibson, E., and Wexler, K. (1997) Cue-based learners in parametric language systems: Application of general results to a recently proposed learning algorithm based on unambiguous 'superparsing'. In M. G. Shafto and P. Langley (eds.) 19th Annual Conference of the Cognitive Science Society. Mahwah, NJ: Lawrence Erlbaum Associates.

Berwick, R.C. and Niyogi, P. (1996) Learning from Triggers. Linguistic Inquiry 27(2), 605-622.

Chomsky, N. (1995) The Minimalist Program. Cambridge, MA: MIT Press.

Clark, R. (1988) On the relationship between the input data and parameter setting. NELS 19, 48-62.

Clark, R. (1992) The selection of syntactic knowledge. Language Acquisition 2(2), 83-149.

Dresher, E. (1999) Charting the learning path: Cues to parameter setting. Linguistic Inquiry 30(1), 27-67.

Fodor, J.D. (1998a) Unambiguous triggers. Linguistic Inquiry 29(1), 1-36.

Fodor, J.D. (1998b) Parsing to learn. Journal of Psycholinguistic Research 27(3), 339-374.

Fodor, J.D., I. Melnikova and E. Troseth (2002) A structurally defined language domain for testing syntax acquisition models. CUNY-CoLAG Working Paper #1.

Gibson, E. and Wexler, K. (1994) Triggers. Linguistic Inquiry 25, 407-454.

Kayne, R.S. (1994) The Antisymmetry of Syntax. Cambridge, MA: MIT Press.

Kohl, K.T. (1999) An Analysis of Finite Parameter Learning in Linguistic Spaces. Master’s Thesis, MIT.

Lightfoot, D. (1991) How to Set Parameters: Arguments from Language Change. Cambridge, MA: MIT Press.

Sakas, W.G. (2000) Ambiguity and the Computational Feasibility of Syntax Acquisition. PhD Dissertation, City University of New York.

Sakas, W.G. and Fodor, J.D. (2001) The Structural Triggers Learner. In S. Bertolo (ed.) Language Acquisition and Learnability. Cambridge, UK: Cambridge University Press.

Sakas, W.G. and Nishimoto, E. (2002) Search, Structure or Heuristics? A comparative study of memoryless algorithms for syntax acquisition. 24th Annual Conference of the Cognitive Science Society. Hillsdale, NJ: Lawrence Erlbaum Associates.

Villavicencio, A. (2000) The use of default unification in a system of lexical types. Paper presented at the Workshop on Linguistic Theory and Grammar Implementation, Birmingham, UK.

Wexler, K. and Culicover, P. (1980) Formal Principles of Language Acquisition. Cambridge, MA: MIT Press.

Yang, C.D. (2000) Knowledge and Learning in Natural Language. Doctoral dissertation, MIT.

Yang, C.D. (2002) Knowledge and Learning in Natural Language. Oxford University Press.

Page 57: EVALUATING MODELS OF PARAMETER SETTING

57

Something we can’t do: production

What do learners say when they don’t know?

Sentences in Gcurrent, but not in Gtarget.

Do these sound like baby-talk?

Me has Mary not kissed why? (early)

Whom must not take candy from? (later)

Sentences in Gtarget but not in Gcurrent.

Goblins Jim gives apples to.

Page 58: EVALUATING MODELS OF PARAMETER SETTING

58

CHILD-DIRECTED SPEECH STATISTICS FROM THE CHILDES DATABASE

The current domain of 13 parameters is almost as large as it’s feasible to work with – maybe we can eventually push it up to 20.

Each ‘language’ in the domain has only the properties assigned to it by the 13 parameters.

Painful decisions – what to include? what to omit? To decide, we consult adult speech to children in CHILDES transcripts. Child age approx. 1½ to 2½ years (earliest produced syntax).

Child’s MLU very approx. 2. Adults’ MLU from 2.5 to 5. So far: English, French, German, Italian, Japanese.

(Child Language Data Exchange System; MacWhinney 1995)

Page 59: EVALUATING MODELS OF PARAMETER SETTING

59

STATISTICS ON CHILD-DIRECTED SPEECH FROM THE CHILDES DATABASE

 | ENGLISH | GERMAN | ITALIAN | JAPANESE | RUSSIAN
Name + Age (Y;M.D) | Eve 1;8-9.0 | Nicole 1;8.15 | Martina 1;8.2 | Jun (2;2.5-25) | Varvara 1;6.5-1;7.13
File name | eve05-06.cha | nicole.cha | mart03,08.cha | jun041-044.cha | varv01-02.cha
Researcher / CHILDES folder | BROWN | WAGNER | CALAMBRONE | ISHII | PROTASSOVA
Number of adults | 4,3 | 2 | 2,2 | 1 | 4
MLU child | 2.13 | 2.17 | 1.94 | 1.606 | 2.8
MLU adults (avg. of all) | 3.72 | 4.56 | 5.1 | 2.454 | 3.8
Total utterances (incl. frags.) | 1304 | 1107 | 1258 | 1691 | 1008
Usable utterances/fragments | 806/498 | 728/379 | 929/329 | 1113/578 | 727/276
USABLES (% of all utterances) | 62% | 66% | 74% | 66% | 72%
DECLARATIVES | 40% | 42% | 27% | 25% | 34%
DEICTIC DECLARATIVES | 8% | 6% | 3% | 8% | 7%
MORPHO-SYNTACTIC QUESTIONS | 10% | 12% | 0% | 18% | 2%
PROSODY-ONLY QUESTIONS | 7% | 5% | 15% | 14% | 5%
WH-QUESTIONS | 22% | 8% | 27% | 15% | 34%
IMPERATIVES | 13% | 27% | 24% | 11% | 11%
EXCLAMATIONS | 0% | 0% | 1% | 3% | 0%
LET'S CONSTRUCTIONS | 0% | 0% | 2% | 4% | 2%

Page 60: EVALUATING MODELS OF PARAMETER SETTING

60

 | ENGLISH | GERMAN | ITALIAN | JAPANESE | RUSSIAN
FRAGMENTS (% of all utterances) | 38% | 34% | 26% | 34% | 27%
NP FRAGMENTS | 25% | 24% | 37% | 10% | 35%
VP FRAGMENTS | 8% | 7% | 6% | 1% | 8%
AP FRAGMENTS | 4% | 3% | 16% | 1% | 7%
PP FRAGMENTS | 9% | 4% | 5% | 1% | 3%
WH-FRAGMENTS | 10% | 2% | 10% | 2% | 6%
OTHER (e.g. stock expressions: yes, huh) | 44% | 60% | 26% | 85% | 41%
COMPLEX NPs (not from fragments) | | | | |
Total number of complex NPs | 140 | 55 | 88 | 58 | 105
Approx. 1 per 'n' utterances | 6 | 13 | 11 | 19 | 7
NP with one ADJ | 91 | 36 | 27 | 38 | 54
NP with two ADJ | 7 | 1 | 2 | 0 | 4
NP with a PP | 20 | 3 | 15 | 14 | 18
NP with possessive ADJ | 22 | 7 | 0 | 0 | 4
NP modified by AdvP | 0 | 0 | 31 | 1 | 6
NP with relative clause | 0 | 8 | 13 | 5 | 5

Page 61: EVALUATING MODELS OF PARAMETER SETTING

61

 | ENGLISH | GERMAN | ITALIAN | JAPANESE | RUSSIAN
DEGREE-n UTTERANCES | | | | |
DEGREE 0 | 88% | 84% | 81% | 94% | 77%
Degree 0 deictic (e.g. that's a duck) | 8% | 6% | 2% | 8% | 18%
Degree 0 (all others) | 92% | 94% | 98% | 92% | 82%
DEGREE 1+ | 12% | 16% | 19% | 6% | 33%
infinitival complement clause | 36% | 1% | 31% | 2% | 30%
finite complement clause | 12% | 1% | 40% | 10% | 26%
relative clause | 10% | 16% | 12% | 8% | 3%
coordinating clause | 30% | 59% | 9% | 10% | 41%
adverbial clause | 11% | 18% | 7% | 80% | 0%
ANIMACY and CASE | | | | |
% Utt. with +animate somewhere | 62% | 60% | 37% | 8% | 31%
% Subjects (overt) | 94% | 91% | 56% | 63% | 97%
% Objects (overt) | 18% | 23% | 44% | 23% | 14%
Case-marked NPs | 238 | 439 | 282 | 100 | 949
Nominative | 191 | 283 | 36 | 45 | 552
Accusative | 47 | 79 | 189 | 4 | 196
Dative | 0 | 77 | 57 | 4 | 35
Genitive | 0 | 0 | 0 | 14 | 98
Topic | 0 | 0 | 0 | 34 | 0
Instrumental and Prepositional | 0 | 0 | 0 | 0 | 68
Subject drop | 0 | 26 | 379 | 740 | 124
Object drop | 0 | 4 | 0 | 125 | 37
Negation occurrences | 62 | 73 | 43 | 72 | 71
Nominal | 5 | 19 | 2 | 0 | 8
Sentential | 57 | 54 | 41 | 72 | 63

Page 62: EVALUATING MODELS OF PARAMETER SETTING

62

www.colag.cs.hunter.cuny.edu