Toward a pattern-based analysis of English resultatives:
Presenting a new type of usage-based approach to grammatical constructions

Masato YOSHIKAWA (Keio University)

April 24th, 2010 ELSJ International Spring Forum 2010

1. INTRODUCTION

1.1. Outline

Theme
  The Resultative Construction (RC, henceforth; e.g., (1))
    • (1) John hammered the metal flat.

Position
  Usage-based view (e.g., Kemmer & Barlow 2000; Langacker 1987)
    • Based on the Pattern Lattice Model (Kuroda & Hasebe 2009; Kuroda 2009), a radically memory-based/exemplar-based model of language

Methodology
  Quantitative research
    • Using the RC database collected by Boas (2003)

Conclusion
    • RC is a “mosaic” of partially similar conventional phrases

1.2. The aim of this talk

The aims of this talk
  To show the possibility of a new approach to grammatical constructions which is based on the usage-based view
    • Suggestion: “reductionist” approaches should not work
  To contribute to a “memory-based” or “exemplar-based” theory of human linguistic knowledge (e.g., Bod 2006; Pierrehumbert 2001; Port 2007)

What is implied
  Constructions of an abstract kind = psychologically unreal!?
  Grammar = an epiphenomenon derived from analogical applications of conventionalized expressions!?

1.3. The organization of this talk

Section 2: Provides a brief sketch of the Pattern Lattice Model (PLM)

Section 3: Reports the details of the quantitative research

Section 4: Discusses the results of the research

Section 5: Summarizes the whole discussion and remarks on the remaining problems

Section 6: Acknowledgements and additional references

2. BACKGROUND: Presenting the Pattern Lattice Model (PLM)

2.1. Pattern Lattice Model (PLM)

Pattern Lattice Model (PLM; Kuroda & Hasebe 2009; Kuroda 2009)
  Assumption 1:
    • The linguistic knowledge we have in mind = a collection of concrete exemplars of linguistic experiences
      – Exemplars are considered almost equivalent to what we call “episodes” (e.g., Tulving 2002)
    • The underlying idea: the hypothesis of “full memory”
  Assumption 2:
    • Those exemplars are connected to a vast number of “indices”
      – Indices = any kind of abstract unit (e.g., phonemes, morphemes, lexemes, etc.)
    • As for syntax: the relevant indices = “patterns”
      – whose definition is given below

2.2. Patterns [1/3]

Where do patterns come from?
  Segment an exemplar e (e.g., (1a)) into units of arbitrary size to make T(e) (e.g., (1b))
    (1) a. John hammered the metal flat.
        b. [John, hammered, the metal, flat]

[Figure: segmentation of e = “John hammered the metal flat” into T(e) = [John, hammered, the metal, flat]]

2.2. Patterns [2/3]

Where do patterns come from?
  Replace each segment in turn with a variable X (shown here as “_”)
    • The products of this procedure = patterns
    (2) {[ _, hammered, the metal, flat],
         [ John, _, the metal, flat],
         [ John, hammered, _, flat],
         [ John, hammered, the metal, _ ]}

[Figure: the segmented exemplar [John, hammered, the metal, flat] and the four patterns derived from it by replacing one segment with “_”]

2.2. Patterns [3/3]

Where do patterns come from?
  Perform the replacement recursively until all the segments are replaced with variables
    • The result = the pattern set P for e = P(e)
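
The procedure in 2.2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the script used in the study: the name `pattern_set`, the tuple representation, and the decision to include both the unmodified exemplar and the all-variable pattern in P(e) are assumptions made here. The sketch enumerates subsets of segment positions directly instead of recursing, which should yield the same set.

```python
from itertools import combinations

VAR = "_"  # the variable placeholder, written "_" as on the slides


def pattern_set(segments):
    """Return P(e) for a segmented exemplar T(e).

    Every subset of segment positions is replaced by VAR, so the result
    ranges from the exemplar itself (no variables) to the pattern made
    only of variables. Including both extremes is an assumption.
    """
    n = len(segments)
    patterns = []
    for k in range(n + 1):  # k = number of segments replaced by variables
        for positions in combinations(range(n), k):
            patterns.append(
                tuple(VAR if i in positions else seg
                      for i, seg in enumerate(segments))
            )
    return patterns


t_e = ["John", "hammered", "the metal", "flat"]  # T(e) from example (1b)
P = pattern_set(t_e)
for p in P:
    print(list(p))
```

With four segments this produces 2^4 = 16 patterns; the four patterns in (2) are the ones with exactly one variable.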

2.3. Pattern Lattice

What is a Pattern Lattice (PL)?
  A hierarchical network of patterns
  A partially ordered set where “≤” = the “is-a” relation
  The is-a relation here:
    • For pi, pj ∈ P, pi is-a pj when pj matches pi
      – x = [a, b, _, d], y = [a, _, _, d]
      – y matches x ⇒ x is-a y
  The TOP of PL = a pattern composed only of variable(s)
  The BOTTOM of PL = a set of exemplar(s)
    • Shown diagrammatically in the next slide
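
The is-a relation above can be illustrated with a small Python sketch. The function names `matches` and `is_a` and the tuple representation are hypothetical; the slides define only the relation itself. "Matching" is read here as: the more general pattern agrees with the more specific one on every segment that is not a variable.

```python
VAR = "_"  # the variable placeholder, written "_" as on the slides


def matches(general, specific):
    """True if `general` matches `specific`.

    Both patterns must have the same length, and every non-variable
    segment of `general` must be identical in `specific`.
    """
    return len(general) == len(specific) and all(
        g == VAR or g == s for g, s in zip(general, specific)
    )


def is_a(p_i, p_j):
    """p_i is-a p_j when p_j matches p_i."""
    return matches(p_j, p_i)


x = ("a", "b", "_", "d")
y = ("a", "_", "_", "d")
top = ("_", "_", "_", "_")

print(is_a(x, y))    # y matches x, so x is-a y
print(is_a(y, x))    # x does not match y
print(is_a(x, top))  # every pattern is-a the all-variable TOP
```

Under this reading the relation is reflexive and transitive, which is what allows the patterns to be drawn as a Hasse diagram.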

The Hasse diagram of PL

[Figure: Hasse diagram of a pattern lattice, with patterns arranged by RANK]

Created using Pattern Lattice Builder (http://www.kotonoba.net/rubyfca/)

2.4. Why PLM?

PLM gives us
  A solid foundation for the usage-based view of language
  A simple but powerful algorithm of pattern generation
    • This means: the current Usage-based Model (e.g., Langacker 2000) = insufficient

A pattern-based analysis = an approach based on PLM

Note
  PLM = only the beginning! We need:
    • An additional procedure which tells us which patterns are useful

3. RESEARCH

3.1. Data

RC database collected by Boas (2003)
  Contains about 6,000 examples of RCs obtained from the British National Corpus (BNC)
    • Downloadable at http://cslipublications.stanford.edu/hand/1575864088appendix.pdf

Manual coding
  Each sentence is annotated with
    • 1) the head noun of Argument 1
      – = “Object” if transitive/“Subject” if intransitive
    • 2) the head noun of Argument 2
      – = “Subject” if transitive/NONE if intransitive
    • 3) the verb
    • 4) the resultative predicate

3.1. Data in detail [1/4] to [4/4]

[Four slides presenting the data in detail; their contents are not preserved in this transcript.]

3.2. Method

VP extraction
  Extract VPs from the manually coded data
  Tally the number of different VPs

Pattern generation
  Input the VPs into a self-made Python script to get patterns
    • The tool employed ≠ what is shown in the ABSTRACT
      – Python version: 2.6.5 (Windows)

  Calculate the z-score of each pattern p, i.e., z(p)
    • $z(p) = \dfrac{f(p) - \bar{f}(k)}{s(k)}$
      – $f(p)$ = the frequency of p; $\bar{f}(k)$ = the average frequency at rank k
      – $s(k)$ = the standard deviation of the frequency at rank k
    • The z-score tells us how productive and conventional a pattern is
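
The method in 3.2 can be sketched as follows. This is not the author's script (which ran on Python 2.6.5) but a minimal Python 3 illustration under several assumptions: the rank of a pattern is taken to be its number of non-variable segments, f(p) is the number of VP tokens from which p can be generated, the population standard deviation is used, and the five toy VPs are invented for the example. All function names are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations
from statistics import mean, pstdev

VAR = "_"


def patterns_of(vp):
    """Yield every pattern derivable from one (verb, object, result) VP."""
    n = len(vp)
    for k in range(n + 1):
        for positions in combinations(range(n), k):
            yield tuple(VAR if i in positions else seg
                        for i, seg in enumerate(vp))


def rank(pattern):
    """Assumed definition of rank: the number of non-variable segments."""
    return sum(1 for seg in pattern if seg != VAR)


def z_scores(vps):
    """Compute z(p) = (f(p) - mean f at rank k) / (std. dev. of f at rank k)."""
    freq = Counter(p for vp in vps for p in patterns_of(vp))
    by_rank = defaultdict(list)
    for p, f in freq.items():
        by_rank[rank(p)].append(f)
    z = {}
    for p, f in freq.items():
        fs = by_rank[rank(p)]
        s = pstdev(fs)  # population standard deviation (an assumption)
        if s > 0:       # skip ranks with no variation, e.g. the single TOP
            z[p] = (f - mean(fs)) / s
    return z


# Invented toy data, not taken from the Boas (2003) database.
vps = [("shoot", "man", "dead")] * 3 + [
    ("shoot", "dog", "dead"),
    ("push", "door", "open"),
]
z = z_scores(vps)
best = max(z, key=z.get)
print(best, round(z[best], 2))
```

Standardizing within each rank keeps highly schematic patterns such as “_ _ off”, which are frequent simply because they are general, from being compared directly with more specific ones such as “shoot _ dead”.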

3.3. Results [1/2]

Overview
  3,376 different VPs
  11,392 patterns*
    • Notice!
      – Different from the number shown in the ABSTRACT
  The “top” pattern:
    • “shoot _ dead” (z = 43.6)

“Superior” patterns
  Shown in the table below
    • Notice!
      – Different from the table shown in the ABSTRACT

Rank  Pattern            Freq   Z
1     shoot _ dead        389   43.61
2     push door open       96   33.31
3     _ _ off            1421   30.43
4     _ door open         216   24.09
5     throw door open      57   19.56
6     make me sick         56   19.21
7     tear _ apart        167   18.56
8     make _ sick         128   14.16
9     push _ open         118   13.03
10    slam door shut       38   12.86
11    _ _ to death        590   12.53
12    beat _ off          108   11.90
13    stab _ to death     107   11.79
14    drive _ mad         106   11.67
15    make me _           106   11.67
16    _ head off          104   11.45
17    throw _ open        103   11.34
18    push door _         100   11.00
19    drive me mad         32   10.75

3.3. Results [2/2]

Rank  Pattern              Freq   Z
20    make _ _              461   9.75
21    _ them off             87   9.53
22    bite _ off             86   9.42
23    bite head off          28   9.33
24    _ _ dead              434   9.17
25    shoot _ _             409   8.63
26    shoot man dead         26   8.63
27    _ _ apart             407   8.59
28    _ door shut            77   8.4
29    _ _ open              396   8.35
30    make you sick          25   8.28
31    _ it off               75   8.18
32    get hand dirty         24   7.92
33    slam _ shut            72   7.84
34    blow _ off             71   7.72
35    beat _ to_death        68   7.39
36    _ him to_death         67   7.27
37    make _ safe            65   7.05
38    make _ ill             64   6.93
39    _ it _                316   6.63
40    blow head off          20   6.51
41    beat challenge off     20   6.51
42    _ door _              300   6.29
43    squeeze eye shut       19   6.16
44    throw door _           57   6.14
45    _ me sick              56   6.03
46    _ him _               288   6.03
47    rip _ off              55   5.92
48    _ them _              280   5.86
49    make me mad            18   5.81
50    tear _ to_pieces       53   5.69
51    put _ to_sleep         53   5.69
52    tear _ _              268   5.6
53    _ him off              52   5.58
54    _ them apart           51   5.47
55    brush _ off            51   5.47
56    make you _             50   5.35
57    _ me mad               50   5.35
58    take _ apart           50   5.35
59    slam it shut           16   5.1

4. DISCUSSION

4.1. Variety of slot positions

Inconsistency of slot positions
  As for the top 100 patterns:
    • V = “X _ _”: 5 pattern types
    • O = “_ Y _”: 6 pattern types
    • R = “_ _ Z”: 7 pattern types
    • VO = “X Y _”: 8 pattern types
    • OR = “_ Y Z”: 13 pattern types
    • VR = “X _ Z”: 29 pattern types
    • VOR = “X Y Z”: 32 pattern types
  Overall (for the patterns whose z ≥ 1)
    • V = 20; O = 10; R = 16; VO = 38; OR = 51; VR = 93; VOR = 106

This may mean:
  The resultative construction = inconsistent set??
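
The classification into V, O, R, VO, OR, VR and VOR can be reproduced mechanically from the patterns themselves. The sketch below is an illustration only: `slot_type` is a hypothetical helper, and the six input patterns are the top six rows of the results table, so the counts it prints are not the top-100 figures given above.

```python
from collections import Counter

VAR = "_"
SLOT_NAMES = ("V", "O", "R")  # verb, object, resultative predicate


def slot_type(pattern):
    """Name a pattern by the slots that are lexically filled, e.g. 'VR'."""
    return "".join(
        name for name, seg in zip(SLOT_NAMES, pattern) if seg != VAR
    )


# The top six patterns from the results table in 3.3.
top_patterns = [
    ("shoot", "_", "dead"),
    ("push", "door", "open"),
    ("_", "_", "off"),
    ("_", "door", "open"),
    ("throw", "door", "open"),
    ("make", "me", "sick"),
]

counts = Counter(slot_type(p) for p in top_patterns)
for name, n in sorted(counts.items()):
    print(name, n)
```

Applying the same function to all patterns above a chosen z threshold would give a tally of the kind reported for the top 100 patterns and for z ≥ 1.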

4.2. Remarks

Ubiquitous super-lexical patterns
  VO, OR, VR, and VOR are ubiquitous
  Suggestion: RC = irreducible to lexical factors!?
  One possibility: RC = a mosaic of conventional patterns

Bonus
  Additional examples (found in the Corpus of Contemporary American English, COCA: Davies 2008-)
    • “_ door open”: creak door open, buzz door open, etc.
      – RCs with additional verbs
    • “beat _ _”: beat ~ senseless
      – New RP

Note: Examples with the verb make ≠ RC!?

5. CONCLUDING REMARKS

5.1. Summary of this research

This talk presents
  A quantitative study of the Resultative Construction (RC)
  Under the radically usage-based model called the Pattern Lattice Model (PLM)

Findings
  Slot position of the patterns = highly inconsistent
  Productive patterns of RC = highly lexically specific = concrete

Conclusion
  RC = a mosaic of conventional patterns (e.g., shoot _ dead, _ door open, drive me mad, etc.)
    • But unfortunately this is only a suggestion…

5.2. Remaining problems

“Semi-”concreteness
  The inputs employed to generate patterns = abstract arrays (= VOR) ≠ concrete item sequences (e.g., raw sentences)
  This means: this research = NOT entirely usage-based

No direct reference to psychological reality
  Only the results of corpus research were provided
  Psychological experiments (or the like) will be needed

6. ACKNOWLEDGEMENTS AND REFERENCES

6.1. Acknowledgements

Prof. Ippei INOUE (Keio University)
Mr. Fuminori NAKAMURA (Keio University)

6.2. References

Boas, H. 2003. A constructional approach to resultatives. Stanford: CSLI Publications.

Bod, R. 2006. Exemplar-based syntax: How to get productivity from examples. The Linguistic Review, 23, 291-320.

Davies, M. 2008-. The Corpus of Contemporary American English (COCA): 400+ million words, 1990-present. Available online at http://www.americancorpus.org.

Kemmer, S., & Barlow, M. 2000. Introduction: A usage-based conception of language. In Barlow, M., & Kemmer, S. (eds.) Usage-based models of language (pp. vii-xxii). Stanford: CSLI Publications.

Kuroda, K. 2009. Pattern lattice as a model of linguistic knowledge and performance. Proceedings of the 23rd Pacific Asia Conference on Language, Information and Computation.

Kuroda, K., & Hasebe, Y. 2009. Modeling (human) knowledge and processing of natural language using pattern lattice. 15th Annual Meeting of Japanese Society of Natural Language Processing, 670-673.

Langacker, R. 1987. Foundations of cognitive grammar Vol. 1: Theoretical prerequisites. Stanford: Stanford University Press.

Langacker, R. 2000. A dynamic usage-based model. In Barlow, M., & Kemmer, S. (eds.) Usage-based models of language (pp. 1-63). Stanford: CSLI Publications.

Pierrehumbert, J. 2001. Exemplar dynamics: Word frequency, lenition and contrast. In Bybee, J., & Hopper, P. (eds.) Frequency and the emergence of linguistic structure (pp. 137-157). Amsterdam: John Benjamins.

Port, R. 2007. How words are stored in memory: Beyond phones and phonemes. New Ideas in Psychology, 25, 143-170.

Tulving, E. 2002. Episodic memory: From mind to brain. Annual Review of Psychology, 53, 1-25.

THANK YOU FOR YOUR ATTENTION
