VOT is necessary but not sufficient for describing the voicing contrast in Japanese

Eun Jong Kong*, Mary E. Beckman*, Jan Edwards †

(*Ohio State University, †Univ. of Wisconsin at Madison)

LSA 2009 January 10

Introduction

Since the seminal work of Lisker and Abramson (1964), voice onset time (VOT) has been used as the primary measure for comparing word-initial stop voicing and aspiration contrasts across languages.

e.g.,

• Spanish: /d/ vs. /t/: lead VOT vs. short lag VOT

• Cantonese: /t/ vs. /tʰ/: short lag VOT vs. long lag VOT

• English: /d/ vs. /t/: lead or short lag VOT vs. long lag VOT

Figure 1. Voice onset time distributions of apical (dental and alveolar) stops in two-category languages. Taken from Lisker & Abramson (1964). [Panels: Spanish /d/ vs. /t/; Cantonese /t/ vs. /tʰ/; English /d/ vs. /t/. x-axis: voice onset time (msec), with VOT = 0 marked; y-axis: frequency.]

Introduction

VOT has also been a useful acoustic measure for describing children’s mastery of word-initial stops in languages with voicing and/or aspiration contrasts.

e.g., Thai (Gandour et al. 1986)

- stops with a three-way contrast: /d/ vs. /t/ vs. /tʰ/

- lead VOT is mastered later than short lag VOT or long lag VOT

Figure 2. VOT distributions of alveolar stops in Thai. Taken from Gandour et al. (1986). [Panels: 3-year-olds, 5-year-olds, 7-year-olds; categories /d/, /t/, /tʰ/.]

Introduction

Is VOT the whole story?

Japanese stops and VOT

- Two-way voicing contrast (Homma 1980, Shimizu 1989)

- voiced stops: not only lead VOT, but also short lag VOT (Takada 2004)

- voiceless stops: neither clearly short lag nor clearly long lag, but intermediate between the two (Riney et al. 2007)

This results in overlap in VOT range between the two categories.

Is there another acoustic measure that helps to disambiguate?

Goal of the study

To evaluate whether VOT is a sufficient acoustic measure for distinguishing voiced stops from voiceless stops in Japanese, we investigate

- how the acoustic parameter of VOT relates to native speaker/transcriber judgments of accuracy for voiced and voiceless stop consonants in English- and Japanese-acquiring children.

- whether another acoustic parameter is also needed to predict native speaker/transcriber judgments of these productions.

Research questions

Children’s stop productions were analyzed to address the following questions.

Question 1) Are there differences between the time-courses for mastering the stop voicing contrasts in English and Japanese?

Method: judgments by trained native speaker/phoneticians, logistic regression.

Question 2) How well does the single acoustic dimension of VOT predict the native speaker/transcriber’s judgments of voiced vs. voiceless stops produced by English- and Japanese-acquiring children?

Question 3) Is there another acoustic dimension that improves the prediction of the native speaker/transcriber’s judgments of the voicing contrast in stops produced by these children?

Method: acoustic analysis, logistic regression.

Data collection

1) Production data come from project

- cross-language investigation of phonological development

www.ling.ohio-state.edu/~edwards/

2) Subjects

- 51 children (2;0-6;0) and 20 adults (18;0-30;0), recorded in Tokyo

- 50 children (2;0-6;0) and 15 adults (18;0-30;0), recorded in Ohio

3) Materials: word-initial pre-vocalic lingual stops, e.g.,

- Japanese: /d/ daikon ‘radish’ vs. /t/ tamago ‘egg’

- English: /d/ dove vs. /t/ tongue

(velar stops were also recorded but are not discussed here)

[Slides 8-9: example words tamago ‘egg’ and daikon ‘radish’.]

[Slides 10-11: example panels labeled “Correct Voicing” and “Voicing Error”.]

Analysis 1: Transcription

Question 1) Are there differences between the time-courses for mastering the stop voicing contrasts in English and Japanese?

Measure: voicing accuracy from transcriptions by a trained phonetician who is a native speaker of English/Japanese.

- voicing correct: /t/ → [t], /d/ → [d], /d/ → [g], /t/ → [k]

- voicing error: /t/ → [d], /d/ → [t], /t/ → [n]

Criterion for mastery: 75% voicing accuracy (adapted from criteria used in norming studies such as Smit et al., 1990).
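
The accuracy measure and mastery criterion on this slide can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the voicing classes cover just the segments that appear in the slide's examples, and treating exactly 75% as meeting the criterion is an assumption.

```python
# Hypothetical helper names; not the authors' scoring scripts.
# Voicing classes cover only the segments shown in the slide's examples.
VOICED = {"d", "g", "n"}
VOICELESS = {"t", "k"}

MASTERY_CRITERION = 75.0  # percent voicing accuracy, from the slide


def is_voiced(segment: str) -> bool:
    """Return True for a voiced segment, False for a voiceless one."""
    if segment in VOICED:
        return True
    if segment in VOICELESS:
        return False
    raise ValueError(f"segment not covered by this sketch: {segment!r}")


def voicing_correct(target: str, transcribed: str) -> bool:
    """A token is correct when target and transcription agree in voicing,
    even if place of articulation differs (e.g. /d/ -> [g])."""
    return is_voiced(target) == is_voiced(transcribed)


def voicing_accuracy(tokens) -> float:
    """Percent of (target, transcribed) pairs with matching voicing."""
    if not tokens:
        raise ValueError("no tokens to score")
    correct = sum(voicing_correct(t, p) for t, p in tokens)
    return 100.0 * correct / len(tokens)


def has_mastered(tokens) -> bool:
    """Assumes that reaching exactly 75% counts as mastery."""
    return voicing_accuracy(tokens) >= MASTERY_CRITERION
```

For example, a child with the tokens /t/ → [t], /d/ → [g], /t/ → [k], /d/ → [t] scores three of four correct, which just reaches the criterion under the assumption above.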

Page 13: VOT is necessary but not sufficient for describing the voicing contrast in Japanese

1330 40 50 60 70 80

02

04

06

08

01

00

dt

30 40 50 60 70 80

02

04

06

08

01

00

gk

age (months): English

% v

oici

ng a

ccur

acy

cons

.

30 40 50 60 70 80

02

04

06

08

01

00

voicedvoiceless

75% accuracy criterion

30 40 50 60 70 80

02

04

06

08

01

00

dt

30 40 50 60 70 80

02

04

06

08

01

00

g or gjk or kj

30 40 50 60 70 80

02

04

06

08

01

00

voicedvoiceless

age (months): Japanese

% v

oici

ng a

ccur

acy

cons

./d/ at 42 mo/d/ at 42 mo

Transcription: results Mixed effects logistic regression.

Dependent variable: token by token voicing accuracy (correct / incorrect)

Independent variable: age of child and target voicing (fixed effect) + subject (random effect)

before 24 mo

age in month

English

Japanese

Analysis 1: interim conclusion

Transcription Analysis

- The voicing contrast is mastered later by Japanese-speaking children than by English-speaking children.

Analysis 2: VOT

Question 2) How well does the single acoustic dimension of VOT predict the native speaker/transcriber’s judgments of voiced vs. voiceless stops produced by English- and Japanese-acquiring children?

VOT: the latency between the burst and the voicing onset.

[Figure: waveform of /t/ in “torn” (token torn4_20000), with the burst and the voice onset marked and VOT shown as the interval between them.]
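
As a minimal sketch of the definition above, VOT can be computed from two hand-marked time points. The function names are hypothetical, and the time values in the usage note are invented for illustration; they are not measurements from the study.

```python
def vot_ms(burst_time_s: float, voice_onset_time_s: float) -> float:
    """VOT in milliseconds: voicing onset time minus burst time.

    A negative value means voicing starts before the burst (lead VOT);
    a positive value means voicing starts after the burst (lag VOT).
    """
    return (voice_onset_time_s - burst_time_s) * 1000.0


def lead_or_lag(vot: float) -> str:
    """Label a VOT value by its sign only. Splitting lag values into short,
    intermediate, and long lag would need language-specific boundaries,
    which this sketch does not assume."""
    return "lead" if vot < 0 else "lag"
```

With a burst marked at 0.100 s and voicing onset at 0.175 s, `vot_ms` gives about 75 ms (lag); with voicing onset at 0.040 s it gives about -60 ms (lead).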

VOT: results (adults)

- English: clear separation between short lag (/d/) and long lag (/t/).

- Japanese: lead or short lag (/d/) vs. intermediate lag (/t/), with much overlap.

[Figure: histograms of VOT in seconds (x-axis) against number of counts (y-axis) for /d/ and /t/, with separate panels for male and female adult speakers of English and Japanese. VOT = 0 and the VOT medians are marked.]

VOT: results (children)

Language-specific VOT distributions in children’s stops

- English: clearly separated peaks.

- Japanese: intermediate values for /t/, with even more overlap with /d/ than in adults.

[Figure: histograms of VOT in seconds (x-axis, -0.2 to 0.2) against number of counts (y-axis) for /d/ and /t/, with separate panels for 2-year-olds and 5-year-olds in English and Japanese. VOT = 0 and the VOT medians are marked.]

VOT: results (children)

Mixed effects logistic regression

- Dependent variable: token-by-token voicing judgment (/t/ or not /t/)

- Independent variable: VOT

[Figure: probability of transcription as /t/ (y-axis, 0.0-1.0) as a function of VOT in seconds (x-axis, -0.3 to 0.3). English: correctly predicted 94%. Japanese: correctly predicted 80%.]
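
To make the shape of this model concrete, here is a simplified sketch: an ordinary (fixed-effects only) logistic regression of the /t/ judgment on VOT, fitted by gradient ascent in plain Python. It leaves out the per-child random effect, so it is not the mixed-effects model used in the study. The function names are hypothetical and the toy data are invented for illustration.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(vot_s, is_t, learning_rate=0.5, n_iter=5000):
    """Fit P(transcribed as /t/) = sigmoid(b0 + b1 * z), where z is
    standardized VOT, by gradient ascent on the mean log-likelihood.
    Returns a function mapping a VOT in seconds to a probability."""
    n = len(vot_s)
    mean = sum(vot_s) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in vot_s) / n)
    z = [(v - mean) / sd for v in vot_s]
    b0 = b1 = 0.0
    for _ in range(n_iter):
        g0 = g1 = 0.0
        for zi, yi in zip(z, is_t):
            err = yi - sigmoid(b0 + b1 * zi)
            g0 += err
            g1 += err * zi
        b0 += learning_rate * g0 / n
        b1 += learning_rate * g1 / n

    def predict(v: float) -> float:
        return sigmoid(b0 + b1 * (v - mean) / sd)

    return predict


# Invented toy tokens: VOT in seconds, 1 = transcribed as /t/, 0 = not /t/.
toy_vot_s = [-0.08, -0.03, 0.005, 0.012, 0.02, 0.03, 0.025, 0.04, 0.06, 0.09]
toy_is_t = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]
predict_t = fit_logistic(toy_vot_s, toy_is_t)
```

The fitted curve rises with VOT, like the curves in the figure: `predict_t(0.09)` is above 0.5 and `predict_t(-0.08)` is below it. A faithful replication would add a random intercept per child using a dedicated mixed-models package.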

VOT: results (children)

Evaluation of predictive value

- Model’s prediction accuracy with VOT as an independent variable, i.e., the proportion of tokens where the predicted probability of transcribing /t/ is greater than 50% and the transcriber actually transcribed /t/. ‘VOT model’: 94% (English) and 80% (Japanese)

- Baseline prediction accuracy with no independent variable, i.e., the proportion of tokens where the transcriber transcribed a voiceless consonant. ‘Baseline’: 49.7% and 63.3% (English and Japanese, respectively)
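
The two figures compared on this slide can be sketched as follows. The function names are hypothetical. One assumption goes beyond the slide's wording: the model score here counts agreement in both directions, that is, tokens predicted as /t/ and transcribed as /t/ as well as tokens predicted as not /t/ and transcribed as not /t/.

```python
def model_accuracy(probs_t, transcribed_t, threshold=0.5) -> float:
    """Percent of tokens where the thresholded prediction agrees with the
    transcriber. probs_t holds predicted probabilities of /t/;
    transcribed_t holds 1 for /t/ and 0 otherwise.
    Counting agreement in both directions is an assumption of this sketch."""
    hits = sum((p > threshold) == bool(y)
               for p, y in zip(probs_t, transcribed_t))
    return 100.0 * hits / len(transcribed_t)


def baseline_voiceless_rate(transcribed_t) -> float:
    """Baseline as defined on the slide: the percent of tokens that the
    transcriber transcribed as voiceless, using no predictor at all."""
    return 100.0 * sum(transcribed_t) / len(transcribed_t)


# Invented toy values, for illustration only.
toy_probs = [0.9, 0.8, 0.4, 0.2, 0.1]
toy_labels = [1, 1, 1, 0, 0]
toy_accuracy = model_accuracy(toy_probs, toy_labels)     # 80.0
toy_baseline = baseline_voiceless_rate(toy_labels)       # 60.0
```

A predictor is informative to the extent that the first number exceeds the second, which is the comparison the slide draws between the ‘VOT model’ and ‘Baseline’ figures.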

Analysis 2: interim conclusion

Transcription Analysis

- The voicing contrast is mastered later by Japanese-speaking children than by English-speaking children.

VOT

- The single acoustic dimension of VOT predicts the transcribed voicing for English productions 94% of the time.

- Accuracy of prediction for Japanese productions is much lower (80%).

Analysis 3: H1-H2 by VOT

Question 3) Is there another acoustic dimension that improves the prediction of the native speaker/transcriber’s judgments of the voicing contrast in stops produced by these children?

H1-H2

- A type of breathiness measure.

- Amplitude difference between the first harmonic and the second harmonic.

[Figure: waveform of “torn” (token torn4_20000) with a 25 ms window marked, and a spectrum (frequency 0-6000 Hz; sound pressure level in dB/Hz) with the first harmonic (H1), the second harmonic (H2), and the H1-H2 difference (dB) labeled.]
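
The slides define H1-H2 but do not describe the measurement procedure beyond the 25 ms label in the figure. The sketch below shows one simple way to obtain such a value, under stated assumptions: the fundamental frequency is already known, the analysis frame holds a whole number of pitch periods, and no window function or formant correction is applied. The function names and the synthetic signal are illustrative, not the authors' procedure.

```python
import math


def harmonic_amplitude(samples, sample_rate, freq_hz) -> float:
    """Amplitude of the sinusoidal component at freq_hz (single-bin DFT)."""
    n = len(samples)
    step = 2.0 * math.pi * freq_hz / sample_rate
    re = sum(x * math.cos(step * i) for i, x in enumerate(samples))
    im = sum(x * math.sin(step * i) for i, x in enumerate(samples))
    return 2.0 * math.hypot(re, im) / n


def h1_minus_h2_db(samples, sample_rate, f0_hz) -> float:
    """H1-H2 in dB: level of the first harmonic (at f0) minus the level of
    the second harmonic (at 2 * f0). Assumes f0_hz is known in advance."""
    h1 = harmonic_amplitude(samples, sample_rate, f0_hz)
    h2 = harmonic_amplitude(samples, sample_rate, 2.0 * f0_hz)
    return 20.0 * math.log10(h1 / h2)


# Synthetic 25 ms frame: f0 = 200 Hz with a second harmonic at half the
# amplitude of the first, so H1-H2 should be 20 * log10(2), about 6.02 dB.
sample_rate = 16000
f0 = 200.0
n_samples = round(0.025 * sample_rate)
frame = [
    1.0 * math.sin(2.0 * math.pi * f0 * i / sample_rate)
    + 0.5 * math.sin(2.0 * math.pi * 2.0 * f0 * i / sample_rate)
    for i in range(n_samples)
]
h1h2 = h1_minus_h2_db(frame, sample_rate, f0)
```

On real speech, f0 would first have to be estimated for each token, and a windowed spectrum from a phonetics tool would normally replace the single-bin computation used here.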

H1-H2 by VOT: adults

English

- Higher H1-H2 and longer VOT for /t/.

- No overlap between VOT ranges.

Japanese

- Higher H1-H2 and longer VOT for /t/.

- Overlap between VOT ranges.

[Figure: scatterplots of H1-H2 (dB, -20 to 20) against log VOT (ms, -100 to 100) by stop category, with separate panels for male and female adults in English and Japanese.]

H1-H2 by VOT: children

Perceived /t/ and /d/ by transcriber.

- English /t/: longer lag VOT

- Japanese /t/: longer lag VOT, higher H1-H2

[Figure: scatterplots of H1-H2 (dB, -20 to 20) against log VOT (ms, -100 to 100), with separate panels for 2-year-olds and 5-year-olds in English and Japanese. Legend: /d/ on target, /t/ on target, [t] off target, [d] off target.]

VOT and H1-H2: results (children)

Mixed effects logistic regression

- Dependent variable: token-by-token voicing judgment (/t/ or not /t/)

- Independent variables: VOT + H1-H2

VOT and H1-H2: results (children)

Evaluation of predictive value

- Baseline prediction accuracy with no independent variable, i.e., the proportion of tokens where the transcriber transcribed a voiceless consonant: 49.7% and 63.3%

- Model’s prediction accuracy with VOT as an independent variable: 94% and 80%

- Model’s prediction accuracy with VOT and H1-H2 as independent variables: 94% and 83%

[Figure: normalized coefficients (0.0-1.0) of VOT and H1-H2 for English and Japanese children. English: 7.91 (VOT) > 0.27 (H1-H2), labeled “29.4 times”. Japanese: 5.84 (VOT) > 1.1 (H1-H2), labeled “5.3 times”. * p < 0.05.]

Analysis 3: interim conclusion

Transcription Analysis

- The voicing contrast is acquired later by Japanese-speaking children than by English-speaking children.

VOT

- The single acoustic dimension of VOT is adequate to characterize the transcription results for English.

- However, VOT alone does not adequately characterize the transcription results for Japanese.

H1-H2 by VOT

- In Japanese, the additional acoustic parameter of H1-H2 improves the prediction of the transcription results.

- The effect of VOT relative to H1-H2 was greater in English than in Japanese.

Summary and conclusion

- Japanese-speaking children showed mastery of the voicing contrast at a later age than English-speaking children.

- However, the VOT ranges for the productions of Japanese-speaking children were similar to those of adults.

- When VOT alone was used to predict the judgments of a trained native speaker/transcriber, it was only 80% successful in Japanese, whereas it was 94% successful in English.

- Adding the acoustic parameter of H1-H2 improved the prediction of the native speaker/transcriber judgments for the productions of the Japanese-speaking children, but not for those of the English-speaking children.

Summary and conclusion

- English and Japanese encode their stop voicing contrasts in acoustic dimensions in language-specific ways.

  - English: exclusively along the VOT dimension

  - Japanese: along more than the VOT dimension

- Unlike in English, VOT is not a sufficient acoustic measure of the stop voicing contrast in Japanese.

- It was necessary to examine other relevant acoustic dimensions, such as breathiness, to correctly characterize the Japanese stop voicing contrast.

Acknowledgement

This work was supported by NIDCD grant 02932 to Jan Edwards.

We thank the children who participated in the task, the parents who gave their consent, and the principals and teachers at the schools at which the data were collected.

Thank you for your attention!

References

Gandour, J., S. H. Petty, R. Dardarananda, S. Dechongkit, and S. Mukongoen. 1986. The acquisition of the voicing contrast in Thai: A study of voice onset time in word-initial stop consonants. Journal of Child Language, 13.

Homma, Y. 1980. Voice onset time in Japanese stops. Onseigakkai Kaihoo, 163, 7-9.

Lisker, L. and A. Abramson. 1964. A cross-language study of voicing in initial stops: Acoustical measurements. Word, 20.

Riney, T., N. Takagi, K. Ota, and Y. Uchida. 2007. The intermediate degree of VOT in Japanese initial voiceless stops. Journal of Phonetics, 35.

Sander, E. 1972. When are speech sounds learned? Journal of Speech and Hearing Disorders, 37, 55-63.

Smit, A. B., L. Hand, J. Freilinger, J. Bernthal, and A. Bird. 1990. The Iowa Articulation Norms Project and its Nebraska replication. Journal of Speech and Hearing Disorders, 55.

Takada, M. 2004. VOT tendency in the initial voiced alveolar plosive /d/ in Japanese and the speakers' age. Journal of the Phonetic Society of Japan, 8(3), 57-66.

Extra I: Velars (adults scatterplots)

[Figure: English adults: coronals + velars]

[Figure: Japanese adults: coronals (top) + velars (bottom)]

Extra I: Velars (children scatterplots)

English children (alv: left, velar: right)

- VOT only model: 93%

- VOT & H1-H2 model: no improvement. VOT was the only effective parameter.

Extra I: Velars (children scatterplots)

Japanese children (alv: left, velar: right)

- VOT only model: 87%

- VOT & H1-H2 model: no improvement. VOT was the only effective parameter.
