

Infant Behavior & Development 24 (2002) 393–407

Intonation in the monosyllabic utterances of 1-year-olds

David Snow∗

Department of Audiology and Speech Sciences, 1353 Heavilon Hall, Purdue University, West Lafayette, IN 47907-1353, USA

Received 24 April 2001; received in revised form 1 August 2001; accepted 7 December 2001

Abstract

Previous research has demonstrated that falling contours predominate in infant utterances as early as 3 months of age. The precocious appearance of falling intonation is usually attributed to “biological tendencies,” that is, the physiological naturalness of descending fundamental frequency patterns. In contrast, other investigations have shown that some children do not use adultlike falling or rising intonation contours until they produce their first words. To resolve these conflicting views of prosodic development, this study acoustically investigated intonation production in the monosyllabic utterances of 10 English-speaking children from 10 to 13 months of age and the utterance-final monosyllables of ten 4-year-olds. Children in both age groups produced a wider accent range in falling contours than in rising contours. Infants produced a narrower accent range than the preschoolers. The findings suggest that biological tendencies are not sufficient to account for children’s acquisition of intonation between the ages of 1 and 4 years. © 2002 Elsevier Science Inc. All rights reserved.

Keywords: Infant babbling; Suprasegmentals; Prosody; Intonation; Nuclear tones

1. Introduction

Intonation refers to the distinctive use of pitch patterns in spoken sentences. As part of the larger system of suprasegmentals (pitch, length, and loudness), intonation itself contains subsystems, the most important of which is a set of “nuclear tones.” These are pitch patterns that occur at the ends of units called intonation-groups (Cruttenden, 1997). In this paper, intonation-groups always correspond grammatically to a simple sentence, a short phrase, or a single-syllable utterance. The pitch patterns are said to be nuclear because they occur on the highest accent of the intonation-group, or its nucleus, which is the last accent in the intonation-group.

∗ Tel.: +1-765-494-3824; fax: +1-765-494-0771. E-mail address: [email protected] (D. Snow).

0163-6383/02/$ – see front matter © 2002 Elsevier Science Inc. All rights reserved. PII: S0163-6383(02)00084-X

One of the distinctive characteristics of nuclear tones is the direction of their pitch change. Tones with a falling contour occur in many utterances functioning as statements. The upper panel of Fig. 1 shows an example from the speech of a 4-year-old girl whom we will call Clara. The figure represents the acoustic analysis of the statement A hat, which Clara produced in response to the question What can the baby put on? The top portion of the panel is the time waveform, the amplitude of the speech signal over time. The lower portion of the panel depicts the fundamental frequency contour, the primary physical correlate of pitch. The entire falling tone is carried out in the nuclear-accented, monosyllabic word hat.

Tones with a rising contour occur in many utterances functioning as yes/no questions. The lower panel of Fig. 1 illustrates the question This is your cat? (also produced by Clara). In this case, a rising tone is carried out in the nuclear-accented, utterance-final word. Note that utterances need not have the syntax of declarative or interrogative sentences to carry the appropriate tones.

Fig. 1. Time waveform and f0 contour of the utterance A hat (top panel) spoken by a 4-year-old girl (Clara) and the utterance This is your cat? (bottom panel) spoken by the same child.

Another characteristic of nuclear tones (in addition to the direction of pitch change) is their width of pitch change, or accent range. This is the absolute or logarithmic difference between the highest and lowest points in the fundamental frequency contour. In English, utterance-final tones have a wide accent range. This prominent pitch change phonetically marks the end of complete clauses and sentences and thus serves an important syntactic function. In the word hat (Fig. 1), Clara used a pitch change greater than 3/4 of an octave. In contrast, the pitch change that Clara used in nonfinal and non-nuclear words like a or this was about 1/12th of an octave.

In developmental studies of intonation, Kent and colleagues reported that falling contours predominated in infant vocalizations at 3, 6, 9, and 12 months (Kent & Murray, 1982; Kent & Bauer, 1985). The predominance of falling contours as early as 3 months is usually accounted for by respiratory constraints on speech production. The classical expression of this idea is found in Lieberman’s (1967) breath-group theory. According to this theory, speech production is organized in units of one or more vocalizations called breath groups. A falling intonation pattern at the end of breath groups is the natural result of rapidly declining subglottal pressure during speech production (the relaxation of laryngeal tension also contributes to a fall in pitch). Lieberman (1967) proposed that the universal “referential breath-group” provides an innately determined, physiological basis for the melody of infant cry and non-cry vocalizations.

The breath-group theory has also been invoked to account for the adultlike segmental lengthening that infants sometimes produce in the same phonetic contexts that carry nuclear tones, a phenomenon called “final syllable lengthening” or simply “final lengthening.” To account for this timing effect in infant utterances, Laufer (1980) suggested that articulation is coupled with the breath cycle, leading to greater segment durations at the end of breath groups (hence, at the boundary of utterances). Segmental timing is relevant to developmental intonation because most physiological mechanisms that could theoretically alter the fundamental frequency of the voice also alter speech timing. Indeed, biologically based accounts of intonation typically predict that pitch and timing are closely associated during development.

Lieberman (1967) further proposed that the breath-group is the basis for the linguistic function of intonation in adults and older children. “At some point in the development of speech, intonation takes on a linguistic reference. . . . Children come to use the innate referential breath-group as the phonetic marker of complete sentences” (pp. 46–47). This implies that intonation has the same form and the same physiological basis from babbling to meaningful speech. One source of evidence consistent with this prediction is that falling contours, which predominate in prelinguistic speech, also predominate in children’s early meaningful speech (Wells, Montgomery, & MacLure, 1979; also see Cruttenden, 1981). “Continuity” in this sense implies that the mechanisms needed to account for intonation in preverbal infants also account for intonation in older children and adults. Using a similar logic, researchers have claimed that final lengthening reflects continuity across developmental levels and therefore has a common physiological basis in infant vocalizations and adult speech (Robb & Saxman, 1990). In the pitch or timing domain, however, these predictions apply only to falling tones. In fact, the breath-group theory implies that the acquisition pattern for rising tones is different from that for falling tones because the physiological mechanisms are different. Because children must acquire rising tones through experience with the ambient language, the theory predicts that only rising tones will change across developmental levels.

In contrast to expectations based on the breath-group theory, some researchers have observed discontinuities in the development of falling as well as rising tones. Notably, child language researchers have reported that some children do not produce nuclear tones in the pre-speaking period, and only do so when they produce their first words. The classical expression of this observation is found in Leopold’s study of his daughter, Hildegard. Leopold (1947, p. 256) reported that Hildegard did not use adultlike intonation patterns (i.e., “nothing worth noting”) until the early single-word period. This is consistent with a number of diary studies indicating that children begin to produce one or more tones at about 12 or 13 months of age (Halliday, 1975; Menn, 1976). Similarly, Snow (1994) observed a period of marked change in children’s development of final lengthening at 21 months, a discontinuity that was independent of intonation. These observations suggest that early intonation development is characterized by discontinuity, on the one hand, and dissociation from speech timing, on the other. Taken together, the findings imply that mechanisms in addition to the innate referential breath-group are needed to account for children’s acquisition of intonation. Such mechanisms could include unknown changes in the physiology of speech production during early childhood. Another possibility is that falling and rising intonation is shaped by the child’s language experience.

Researchers have reached different conclusions about continuity partly because they have emphasized different prosodic boundary features. Laufer (1980) and Snow (1994), for example, focused on temporal patterns chiefly associated with falling contours. Kent and colleagues analyzed the direction of pitch change in infant vocalizations. Finally, Leopold (1947) apparently referred to the perceived salience of tones, or accent range. Leopold’s observations are especially noteworthy because accent range is a primary phonetic marker of complete sentences and clauses. However, the question of continuity vs. discontinuity in children’s use of accent range has not been adequately addressed in acoustic investigations of infant speech.

To study this question (and its implications for biologically based theories of intonation development), the present investigation analyzed accent range in the vocalizations of infants who were beginning to produce their first words, the same linguistic period in which Hildegard first used canonical intonation patterns. To test for continuity across developmental levels, we compared the infants’ meaningful and non-meaningful vocalizations to the spontaneous speech of 4-year-old children. Four-year-olds were selected as a comparison group for two reasons. First, they represent an advanced stage of phonological development. For example, 4-year-olds have mastered the falling tone pattern (Snow, 1998) and most vowel and consonant sounds (Sander, 1972). Second, most 4-year-old children respond very well to the basic play routine that we adapted to elicit speech samples from infants between 10 and 13 months of age. In both age groups, the directionality of tones was also analyzed, because falling and rising contours may reflect different acquisition patterns. Finally, we compared the accent range and length of the children’s nuclear tones to determine whether intonation and timing develop in the same way (association pattern) or in different ways (dissociation pattern).


2. Methods

2.1. Participants

Ten infants and ten 4-year-olds participated in the study. There were five girls and five boys in each age group. The infants and their families were recruited by a newspaper advertisement in the vicinity of Purdue University (West Lafayette, IN). The infants (see Table 1) met the following selection criteria: (1) between 10 and 13 months of age; (2) an expressive vocabulary of 1–35 words, by parent report; (3) no unusual prenatal, sensory, or developmental concerns; (4) from English-speaking homes (General American dialect); and (5) hearing within normal limits, defined as 20 dB HL or better at 500, 1,000, 2,000, and 4,000 Hz bilaterally (American National Standards Institute [ANSI], 1969). The infants were from predominantly European American backgrounds and middle class families.

The 4-year-old children were selected from one of two branches of a preschool in the vicinity of the University of Arizona (Tucson). They participated in a study of intonation development reported by Snow (1998). The 4-year-olds met the following criteria: (1) between the ages of 4:0 (years:months) and 4:11; (2) normal motor, cognitive, social-emotional, and language development by teacher report; (3) no history of referral for speech–language services by parent report; (4) from English-speaking homes (General American dialect); and (5) hearing within normal limits, defined as 35 dB HL or better at 500, 1,000, 2,000, and 4,000 Hz bilaterally, a level chosen to accommodate ambient noise in the preschools. The 4-year-olds were from predominantly European American backgrounds and middle class families.

Table 1
Infant participants: demographic and parent report data

                              Vocabulary measures, parent report (CDI)a
                              Words comprehended    Words produced
Child  Gender  Age (months)   Number  Percentile    Number  Percentile   Lexicon sizeb
EO     F       10             30      35            9       80           1
AB     F       12             32      10            4       30           1
CB     F       12             100     55            34      85           5
EG     F       12             63      35            18      70           5
LS     F       12             132     70            6       45           4
ZM     M       11             32      25            10      80           1
AX     M       12             69      50            9       70           10
BG     M       12             152     80            4       45           0
AH     M       13             50      20            5       2            0
CK     M       13             121     75            16      85           4

Mean           11.9           78.1    45.5          11.5    59.2         3.1
SD             .88            45.2    24.3          9.2     27.8         3.1

a CDI: MacArthur Communicative Development Inventory.
b Lexicon size: no. of different words produced in the experimental session.
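As a quick consistency check on the summary rows of Table 1, the sketch below recomputes the mean and SD for three columns from the individual rows. It assumes that the reported SD is the sample standard deviation (n − 1 denominator), which the text does not state; the names TABLE_1 and column_summary are illustrative only.

```python
from statistics import mean, stdev

# Rows transcribed from Table 1:
# (child, gender, age in months, words comprehended, comprehension percentile,
#  words produced, production percentile, lexicon size)
TABLE_1 = [
    ("EO", "F", 10, 30, 35, 9, 80, 1),
    ("AB", "F", 12, 32, 10, 4, 30, 1),
    ("CB", "F", 12, 100, 55, 34, 85, 5),
    ("EG", "F", 12, 63, 35, 18, 70, 5),
    ("LS", "F", 12, 132, 70, 6, 45, 4),
    ("ZM", "M", 11, 32, 25, 10, 80, 1),
    ("AX", "M", 12, 69, 50, 9, 70, 10),
    ("BG", "M", 12, 152, 80, 4, 45, 0),
    ("AH", "M", 13, 50, 20, 5, 2, 0),
    ("CK", "M", 13, 121, 75, 16, 85, 4),
]


def column_summary(rows, index):
    """Return (mean, sample SD) of one numeric column of the table."""
    values = [row[index] for row in rows]
    # Assumption: sample SD (n - 1), not population SD.
    return mean(values), stdev(values)


age_mean, age_sd = column_summary(TABLE_1, 2)
produced_mean, produced_sd = column_summary(TABLE_1, 5)
lexicon_mean, lexicon_sd = column_summary(TABLE_1, 7)

print(f"Age: {age_mean:.1f} ({age_sd:.2f})")
print(f"Words produced: {produced_mean:.1f} ({produced_sd:.1f})")
print(f"Lexicon size: {lexicon_mean:.1f} ({lexicon_sd:.1f})")
```

Under that assumption, the recomputed values agree with the Mean and SD rows of the table to the precision shown there.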


2.2. Materials

The play materials for the infants included blocks, balls, a bucket, bubbles, a doll baby, animal puppets, doll clothes, a cup, a bottle, a crib, and other toys. The toys were selected from a standardized set to correspond to words that each child could produce, according to parent report. For 4-year-olds, the play routine used many of the same materials in four nurturing activities centered around a doll baby: (1) introducing animal puppets; (2) feeding and dressing the baby; (3) playing with toys; and (4) going home.

2.3. Procedures

In a laboratory suite (or preschool room), the participants sat down on a play mat on the floor. The interactions were videotaped and audiotaped. The child and experimenter (or parent) each wore a wireless electret condenser lapel microphone. A Telex FMR-70 wireless system (FMR-2 in Tucson) relayed the audio signals to a stereo cassette recorder and a camcorder.

2.3.1. Infants

Within 15 days before the experimental play session, the infants passed the hearing screening, using visual reinforcement audiometry (VRA) procedures, and the parent completed the MacArthur Communicative Development Inventory (CDI), Words and Gestures Form (Fenson, Dale, Reznick et al., 1991). Each child and his or her mother participated individually in the experimental session. During the first 10–15 min, the parent and baby played together alone, as they might do at home. Parents were asked to repeat any vocalizations they believed to be meaningful. In the next 20–25 min, an experimenter joined them. The activities encouraged infants to attend to objects, to give, take, and show objects, to interact with adults, and to make requests, activities that may elicit different kinds of vocalizations (Stark, Bernstein, & Demorest, 1993). Five experimenters (the investigator, a male research assistant, and three female research assistants) each interacted with one or two of the children.

2.3.2. Four-year-olds

The hearing screening was conducted on a day prior to (and within 3 months of) the experimental play activities. Each child participated individually in the experimental session with an adult experimenter in a room at the child’s preschool. In a 25-min period of informally structured activities, the experimenter brought out the dolls and toys one at a time from a cloth bag. The session concluded with a 5-min imitation routine. There were two experimenters: the investigator, who interacted with six of the children, and a female research assistant, who interacted with four of the children.

2.4. Analysis of infant vocalizations

Two research assistants each analyzed the data for five infants. Using criteria and procedures modeled after those described by Stoel-Gammon (1989), the analyst phonetically transcribed a maximum of 100 utterances in each social context and all words produced in the entire sample. The transcription grouped vocalizations into “utterances,” that is, one or more speech-like vocalizations bounded by (1) a turn-taking, (2) a non-speech-like vocalization, or (3) a pause of at least 1,100 ms, the temporal criterion used by Branigan (1979).
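The pause criterion in this definition can be illustrated with a minimal sketch. It groups time-stamped vocalizations into utterances using only boundary type (3), a silent gap of at least 1,100 ms; turn-taking and non-speech-like vocalizations, which also bound utterances, are not modeled. The function name, the (onset, offset) representation, and the example times are hypothetical and are not taken from the study.

```python
PAUSE_MS = 1100  # pause criterion cited in the text (Branigan, 1979)


def group_into_utterances(vocalizations, pause_ms=PAUSE_MS):
    """Group (onset_ms, offset_ms) vocalizations into utterances.

    A new utterance starts whenever the silent gap since the end of the
    previous vocalization is at least pause_ms. Only the pause boundary is
    modeled here; the other two boundary types are ignored.
    """
    utterances = []
    for onset, offset in sorted(vocalizations):
        if utterances and onset - utterances[-1][-1][1] < pause_ms:
            # Gap shorter than the criterion: same utterance.
            utterances[-1].append((onset, offset))
        else:
            # First vocalization, or gap of at least pause_ms: new utterance.
            utterances.append([(onset, offset)])
    return utterances


# Hypothetical onset/offset times in ms (gaps of 400, 1,500, and 1,100 ms).
example = [(0, 300), (700, 1000), (2500, 2800), (3900, 4100)]
groups = group_into_utterances(example)
print(len(groups))  # number of utterances found in the example
```

In the example, the 400-ms gap keeps the first two vocalizations in one utterance, while the 1,500-ms and 1,100-ms gaps each start a new one.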

The transcriber judged each utterance to be non-meaningful or meaningful. Meaningful utterances (i.e., words) were identified on the basis of four criteria: (1) some phonetic relation to an adult-based word; (2) appropriate use in context; (3) consistency; and (4) the parent’s verbal or nonverbal confirmation that the child’s utterance was meaningful. Imitated words were also transcribed (but are not reported in this paper). The transcriber determined each child’s “lexicon size,” a type measure reflecting the number of different words used in the session (see Table 1).

Utterances were assigned a phonological complexity level based on Stoel-Gammon (1989). Level I utterances contained vowels, syllabic consonants, or syllables with glides or glottal consonants (e.g., [ε], [wʌ], or [hai] hi). Level I vocalizations are often described as “precanonical.” Level II utterances contained one or more CV(C) syllables with a true consonant (e.g., [bɑ], [mæ], or [dʌ] duck). Level III utterances contained two or more true consonants that differ in place or manner (e.g., [bɑt], [nʌp], or [dʌk] duck). Oller (1980) described non-meaningful utterances at levels II and III, respectively, as “canonical” and “variegated.”

A second person carried out reliability judgments based on the first 10 utterances in each social context for each child. Across all categories, the percentage of agreement was 88%.

2.5. Instrumental analysis

2.5.1. Infants

Only monosyllabic utterances were analyzed. Monosyllables were selected as the focus of this study because they constitute a basic, early developing, and frequently occurring type of infant vocalization (Vihman, 1993). Indeed, some children only produce monosyllables in early words. The instrumental analysis sampled monosyllables in three contextual categories, each having two levels: social context (parent only vs. experimenter and parent), linguistic function (non-meaningful vs. meaningful), and phonological complexity (precanonical vs. canonical/variegated). Thus, there were eight combinations of categories and levels (“utterance types”). In non-meaningful speech, a maximum of 10 consecutive utterances in each utterance type was analyzed within the first 50 non-meaningful utterances. In meaningful speech, a maximum of 10 utterances in each utterance type was analyzed within the entire sample.
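The eight utterance types follow from crossing three two-level categories (2 × 2 × 2). The sketch below enumerates them and applies the cap of 10 tokens per type. It is a simplification: the additional requirement that non-meaningful tokens be consecutive and fall within the first 50 non-meaningful utterances is not modeled, and the function name and token identifiers are hypothetical.

```python
from itertools import product

SOCIAL_CONTEXT = ("parent only", "experimenter and parent")
LINGUISTIC_FUNCTION = ("non-meaningful", "meaningful")
PHONOLOGICAL_COMPLEXITY = ("precanonical", "canonical/variegated")
MAX_TOKENS_PER_TYPE = 10  # sampling cap described in the text

# Every combination of one level from each category: 2 x 2 x 2 = 8 types.
utterance_types = list(
    product(SOCIAL_CONTEXT, LINGUISTIC_FUNCTION, PHONOLOGICAL_COMPLEXITY)
)


def cap_tokens(tokens, max_per_type=MAX_TOKENS_PER_TYPE):
    """Keep at most max_per_type tokens per utterance type, in order.

    tokens is an iterable of (utterance_type, token_id) pairs.
    """
    kept = {utype: [] for utype in utterance_types}
    for utype, token_id in tokens:
        if len(kept[utype]) < max_per_type:
            kept[utype].append(token_id)
    return kept


# Hypothetical input: 12 tokens that all belong to the first utterance type.
demo_type = utterance_types[0]
demo = cap_tokens((demo_type, token_id) for token_id in range(12))

for utype in utterance_types:
    print(utype, len(demo[utype]))
```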

2.5.2. Four-year-olds

All words meeting criteria 1–4 below were analyzed. Examples of words selected for analysis include the italicized portion of “Pig,” “A hat,” or “I’ll play with the cat” (that is, pig, hat, and cat).

1. The stem (e.g., sock) or a monosyllabic inflected form (e.g., socks) of one of the following words: bear, bed, bird, book, boy, cat, dog, duck, food, girl, hat, home, horse, juice, mat, milk, no, pig, sock, shoe, toy, or the numbers three to six.

2. Fully voiced, clearly audible, and non-imitated.

3. Spoken in isolation or in the phrase-final, accented position of a multi-word utterance.

4. The final syllable was separated from the child’s next utterance by a turn-taking or a pause of more than 400 ms, a criterion for distinguishing nonfinal from final words (Branigan, 1979).

2.5.3. Acoustic measures

The duration and f0 contour of monosyllables meeting the selection criteria were measured using the CSpeech signal analysis system (Milenkovic & Read, 1992), with a 16-bit resolution and a sampling rate of 22 kHz. In accordance with procedures described by Allen and Hawkins (1980), the analyzed portion of each syllable was the vocalic nucleus. The beginning and ending boundaries of each vocalic nucleus were set at the amplitude peak of the first or last periodic cycle that was visually distinct in the time waveform. Syllable duration was calculated as the difference between the boundaries.

The pitch extraction algorithms of CSpeech generated the fundamental frequency (f0) contour between the syllable boundaries. Voiced portions of the signal that the automatic routines failed to analyze correctly were edited via computer-assisted procedures supported by CSpeech. The contour was described as “falling” if the maximum f0 preceded the minimum f0 and “rising” if the maximum f0 followed the minimum f0. “Accent range” was calculated as the logarithmic difference between the maximum and minimum f0, expressed in cents (1 octave = 1,200 cents). This latter measure permitted the frequency data to be expressed in terms of perceptually equivalent units (Burns & Ward, 1982). For example, the falling tone in the word hat in Fig. 1 had a maximum f0 of 282 Hz and a minimum of 163 Hz. The accent range is expressed by the formula (1,200/log 2)(log(282/163)) = 953 cents. Accent range was analyzed statistically using a “magnitude criterion.” That is, intonation would reflect continuity if the average magnitude of pitch change was comparable across developmental levels.
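The two measures defined in this section can be sketched in a few lines. The function names and the sample f0 tracks are hypothetical; only the definitions (accent range as 1,200 times the base-2 logarithm of the f0 ratio, and contour direction from the order of the maximum and minimum) come from the text. With the rounded values given for hat (282 Hz and 163 Hz), the formula evaluates to about 949 cents, slightly below the 953 cents reported; the difference presumably reflects rounding of the f0 values in the text.

```python
import math

CENTS_PER_OCTAVE = 1200


def accent_range_cents(f0_max_hz, f0_min_hz):
    """Logarithmic difference between maximum and minimum f0, in cents."""
    return CENTS_PER_OCTAVE * math.log2(f0_max_hz / f0_min_hz)


def contour_direction(f0_track_hz):
    """Classify a contour by the order of its f0 maximum and minimum.

    "falling" if the maximum precedes the minimum, "rising" if it follows.
    A perfectly level track (maximum equals minimum) is labeled "level";
    that label is an addition for completeness, not a category in the text.
    """
    i_max = f0_track_hz.index(max(f0_track_hz))
    i_min = f0_track_hz.index(min(f0_track_hz))
    if i_max < i_min:
        return "falling"
    if i_max > i_min:
        return "rising"
    return "level"


# Rounded f0 extremes reported for the word "hat" in Fig. 1.
hat_cents = accent_range_cents(282, 163)
print(round(hat_cents))

# Hypothetical f0 tracks (Hz) for illustration only.
print(contour_direction([282, 250, 200, 163]))
print(contour_direction([200, 230, 260]))
```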

2.5.4. Reliability

A second person carried out reliability measurements of three randomly selected monosyllables for each infant. The mean measures for the two analysts were within 4 ms of duration, 12 Hz for minimum f0, 4 Hz for maximum f0, and 130 cents for accent range. A doctoral student in Speech and Hearing Sciences made the acoustic measures in the original study of 4-year-olds reported by Snow (1998). The author carried out reliability measurements of one monosyllable and the stressed syllable and entire word of one disyllable randomly selected from the transcript of each child. The mean measures for the two analysts were within 6 ms of duration, 6 Hz for minimum f0, 1 Hz for maximum f0, and 82 cents for pitch range.

3. Results

The perceptual analysis of infant speech included 849 non-meaningful utterances. Utterances of one, two, and three or more syllables, respectively, accounted for 53, 26, and 21%. Monosyllables were the most frequently occurring category in 9 of the 10 infants. A sign test indicated that this was significant (z = 2.121, p < .05). Monosyllables accounted for an even larger percentage of words: 72% of 110 meaningful vocalizations. The perceptual analysis of the 4-year-olds’ speech included 969 utterances. Utterances of one, two, and three or more syllables, respectively, accounted for 15, 20, and 65%. Utterances of three or more syllables were the most frequently occurring category in all 10 of the 4-year-olds.

Eight infants produced between one and 10 different words in the experimental session (see Table 1, “Lexicon size”). By parent report, all 10 children had an expressive vocabulary of 4–34 words (see Table 1, “Vocabulary Measures-Parent Report (CDI)”). Based on the experimental observations and/or parent report, all of the children were considered to be in the early single-word period. However, 92% of their utterances were non-meaningful.

3.1. Acoustic analysis

A total of 295 monosyllabic utterances produced by the infants were analyzed acoustically; of these, 56% were falls. For the 4-year-olds, 230 monosyllabic words were analyzed; of these, 77% were falls. The groups differed significantly in the proportions of falls vs. rises (chi-square = 25.9, p < .001).
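The group comparison of falls vs. rises can be approximated from the figures given above. The sketch below assumes a Pearson chi-square on a 2 × 2 table without continuity correction (the text does not say which variant was used) and reconstructs the cell counts from the rounded percentages, so the counts are estimates rather than the study's data. Under these assumptions the statistic comes out near 25, in the neighborhood of the reported 25.9.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for the 2 x 2 table [[a, b], [c, d]].

    No continuity correction is applied (an assumption of this sketch).
    """
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Cell counts reconstructed from rounded percentages (estimates only).
infant_total, infant_fall_share = 295, 0.56
child_total, child_fall_share = 230, 0.77

infant_falls = round(infant_total * infant_fall_share)
infant_rises = infant_total - infant_falls
child_falls = round(child_total * child_fall_share)
child_rises = child_total - child_falls

chi2 = chi_square_2x2(infant_falls, infant_rises, child_falls, child_rises)
print(infant_falls, infant_rises, child_falls, child_rises)
print(round(chi2, 1))
```

With 1 degree of freedom, any value above 10.83 corresponds to p < .001, so the conclusion does not depend on the small difference between the reconstructed and reported statistics.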

Table 2 lists the results for syllable duration and accent range by contour direction and age group. The accent range data are displayed in Fig. 2. The children produced a variable number of analyzable tokens (the range for 1-year-olds was from 7 to 49; for 4-year-olds, from 12 to 29). Table 2 and Fig. 2 reflect the group means of averages across tokens for each child. Each type of data (duration and accent range) was evaluated by a mixed ANOVA, with age as the between-groups variable and contour direction as the within-groups variable. There were no significant effects for duration. For accent range, there was a main effect of age [F(1, 18) = 10.867, p < .01] and contour direction [F(1, 18) = 5.278, p < .05]. The interaction was not significant. Thus, falling tones had a greater accent range than rising tones, and 4-year-olds produced a greater accent range than 1-year-olds. Because there were age-related differences in variability (as discussed further below), post-hoc tests were conducted with and without the assumption of homogeneity of variances. Tests based on pooled or separate variances produced the same significance levels. The univariate tests showed that 1- and 4-year-olds differed significantly only in the accent range of falling contours [F(1, 18) = 15.038, p < .01]. The difference between falling and rising contours as a function of age was not significant, confirming the absence of an interaction in the multivariate test.

The error bars in Fig. 2 reflect standard deviations among individuals. Bartlett’s test showed that the groups had different variances in the accent range of rising contours and the duration of falling contours. Surprisingly, the inter-individual standard deviations for accent range were greatest in the group of 4-year-olds. Variability within individuals was evaluated by submitting the individual standard deviations to the same ANOVA that was used to evaluate the individual means. None of the group differences were significant. Thus, there was no evidence that utterance-to-utterance variability differed across age groups.

Table 2
Means and SDs (in parentheses) of syllable duration and accent range in monosyllabic nuclear tones by age group and contour direction

              Falling intonation contours            Rising intonation contours
Age (years)   Duration (ms)   Accent range (cents)   Duration (ms)   Accent range (cents)
1             293 (137)       521 (205)              248 (100)       424 (164)
4             318 (54)        1,048 (378)            331 (141)       714 (522)

Fig. 2. Mean magnitude (and SD) of accent range by contour direction and the children’s age.

The 4-year-olds produced monosyllabic tones in utterances that varied in length from 1 to 12 syllables. To test for the possible effects of utterance length on intonation, the 4-year-olds’ acoustic data were divided into two sets: single-syllable utterances and multi-syllabic utterances. The duration and accent range of monosyllabic nuclear tones were relatively constant in utterances of different length. For example, the accent range of falling tones in single- vs. multi-syllable utterances was 1,014 cents vs. 1,050 cents. Paired t-tests indicated that utterance length did not significantly affect either duration or accent range.

Finally, the 4-year-olds were compared to the 1-year-olds using only monosyllabic utterances in both age groups. Because three of the 4-year-olds did not produce monosyllabic utterances with rising contours, the analysis was restricted to falling contours. There were no significant group differences in duration. However, the 4-year-olds produced a wider accent range than the infants [F(1, 18) = 5.529, p < .05], replicating the result obtained with the entire data set.

In the group of infants, accent range was not significantly affected by gender, social context, phonological complexity, or meaningfulness (babble vs. words). In some individual children, however, there were large differences in intonation between these last two categories (non-meaningful vs. meaningful speech). Fig. 3 shows two examples from the speech of one of these infants, whom we will call Abby (AB in Table 1). Comparisons with a falling tone produced by 4-year-old Clara also illustrate some of the main findings of the study. The top panel of Fig. 3 displays the analysis of a non-meaningful utterance spoken by Abby. The middle panel shows a word that she produced. The bottom panel depicts Clara’s production of a word that was illustrated in Fig. 1. In the babbled utterance, Abby’s falling contour had an accent range of only 439 cents (cf. group mean = 521 cents). This is less than the accent range that Clara produced (953 cents; cf. group mean = 1,048 cents). However, in the word that Abby produced, the accent range was 830 cents (almost twice as prominent as the babbled utterance), and the falling tone was similar to Clara’s.

Fig. 3. Time waveform and f0 contour of the babbled utterance (tɑ) (top panel) spoken by a 1-year-old girl (Abby), the single-word utterance (dε) There (middle panel) spoken by the same child, and the word hat in the utterance A hat (bottom panel) spoken by a 4-year-old girl (Clara).

4. Discussion

This study showed that 1-year-olds use fewer falling contours and a narrower accent range than 4-year-olds. As predicted by Leopold’s (1947) observations, the results support a discontinuity hypothesis of intonation development. Previous research indicating continuity had focused on contour direction. For example, Kent and Murray (1982) reported that falling contours accounted for nearly 80% of infant vocalizations. Kent and Murray pointed out that this is consistent with the preference for falling contours that has been reported in children’s early meaningful speech. In the present study, an equally high proportion of falling contours was observed in 4-year-olds. However, infants produced falling contours in only 56% of utterances, a percentage of occurrence that was significantly less than that of the 4-year-olds and less than the percentage that Kent and Murray reported for infants.

The discrepancy between studies may be less serious than it first appears. In Kent and Murray’s (1982) investigation, contours that could be classified unambiguously as “falling” (i.e., falls or rise-falls) accounted for 47% of the combined utterances at 3, 6, and 9 months of age. An additional 31% of the utterances were “flat.” Because most of these latter contours actually had a slightly falling pattern, Kent and Murray grouped them with falling contours, so that the percentage of falls, including these less clear cases, was 78%. In perception and production, however, there is little or no distinction between flat contours that are slightly falling vs. slightly rising. Linguistically, all “level” tones can be grouped with rises (Cruttenden, 1997). Thus, the discrepancy between studies could be due to variations among types of flat contours in which the functional and physiological differences between rises and falls are minimized.
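The effect of the grouping decision on the reported percentage can be made explicit with a short sketch. The shares are those quoted above for Kent and Murray (1982); the "other" category is simply the remainder (100 - 47 - 31) and is an assumption of this illustration, since the text does not break it down. The function name is hypothetical.

```python
# Shares (%) of Kent and Murray's (1982) combined utterances, as quoted in the text.
# "other" is only the remainder needed to reach 100%; it is not a reported category.
shares = {"fall": 47, "flat": 31, "other": 22}


def falling_share(shares: dict, flat_counts_as_falling: bool) -> int:
    """Percentage of utterances counted as falling under a given grouping rule."""
    total = shares["fall"]
    if flat_counts_as_falling:
        total += shares["flat"]
    return total


# Kent and Murray's grouping: flat contours are counted with the falls.
print(falling_share(shares, flat_counts_as_falling=True))   # 78

# Alternative grouping: level tones are not counted as falls.
print(falling_share(shares, flat_counts_as_falling=False))  # 47
```

Under the first rule the infant percentage is well above the 56% found in the present study; under the second it is below it, which is why the treatment of flat contours matters for the comparison.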

Kent and colleagues did not analyze accent range. However, the relatively high proportion of level contours they reported foreshadows the relevance of accent range because level contours by definition have a very narrow margin of pitch change, a feature that is not characteristic of English tones at the boundaries of complete sentences. By analyzing width of pitch change as a gradient within directional classes, the present study showed that there was a developmental change in accent range as well as a change in the proportion of falling vs. rising contours. Indeed, the average accent range of falling tones in the speech of 4-year-olds was about twice as large as the pitch change that 1-year-olds used in similar environments.

It is of special interest that accent range developed markedly in falling tones. Inasmuch as falling pitch is “natural,” it might be expected that falling intonation would show little or no developmental change. The fact that this is not the case suggests that intonation development between 1 and 4 years of age depends on more than the biological tendencies that are inherent in the innate referential breath group. In both age groups, however, accent range was narrower in rising contours than in falling contours. This finding confirms one aspect of the breath-group theory, namely, the prediction that rising and falling intonation will develop differently because the underlying biological tendencies are different.

This study investigated infants’ intonation in monosyllabic utterances. The analysis of utterances by length confirmed the claim that monosyllables are characteristic of the babble and early words of English-speaking children (Boysson-Bardies et al., 1992). The 4-year-olds, however, produced monosyllabic tones in utterances that were usually longer than one syllable. One possible hypothesis is that the longer utterances of 4-year-olds accounted for the greater pitch change they produced. Contrary to this expectation, however, there were no prosodic differences between the nuclear tones that 4-year-olds produced in short vs. long utterances. Another finding related to length is that the age groups did not differ in the duration of the nuclear tones themselves. Thus, the group difference in intonation cannot be explained as a secondary effect of differences in the length of children’s nuclear tones or the length of the utterances in which the tones occurred.


Variability in intonation was also greater among the 4-year-olds than the infants. It could be argued that 4-year-olds vary more than infants because their mean pitch excursions are wider and syllable lengths are longer. However, when the 4-year-olds demonstrated greater inter-individual variability (accent range in rising contours, duration in falling contours), there was no difference between the group means. Conversely, when the means were different (e.g., accent range in falling contours), the conditions for homogeneity of variance were met.

Although variability (between and especially within individuals) is often interpreted as an index of motor control (e.g., Kent & Forner, 1980), two of the present findings suggested that the control of f0 production was not the primary source of the difference between groups. First, the older children actually demonstrated the greater speaker-to-speaker variability. Second, there were no significant differences between the groups in utterance-to-utterance variability. Although motor speech development probably affects f0 production throughout early childhood, the present findings about speaker variability highlight instead the unexpected role of age-related individual differences. Four-year-olds seem to exploit the possibilities afforded by intonation to express individual differences in temperament and emotional expression. In single-syllable utterances, infants do not actively exploit the range of intonation in the way that older children do.

The 4-year-olds produced meaningful utterances, but most of the infants’ vocalizations were non-meaningful. This suggests that intonation is linked to meaningful speech. In the 1-year-olds, however, intonation was not significantly different in words vs. babble. Similarly, the phonological level of syllables produced by the infants did not affect accent range. Phonological complexity is relevant to early word production because the frequency of canonical vs. precanonical syllables in infant speech is usually greater in words than babble (Vihman & Miller, 1988). Finally, it could be hypothesized that words are more “communicative” than babble and that children might linguistically mark complete utterances by intonation only when those utterances are communicative. However, examples of Abby’s speech (e.g., Fig. 3) did not support the idea that babble and words have different social functions. Abby produced both meaningful and non-meaningful utterances in contexts that suggested social interaction and intentional communication. In sum, the evidence from infant speech does not suggest that word production alone accounted for the observed age-related changes in children’s intonation.

Although the infants as a group did not use intonation differently in words vs. babble, some individual children produced relatively mature intonation patterns in meaningful speech only. These children were similar to Leopold’s (1947) daughter, Hildegard, who was described in Section 1. Individual differences of this type among the infants seemed to reflect different sensitivities to segmental vs. suprasegmental features in the ambient language. In adult speech, prosodic features are represented in a different component of the grammar than segmental and lexical features. However, children like Abby in this study seem to analyze characteristic melody patterns as an inherent part of the sound structure of words. That is, the phonology of early words for these children is a fusion of segmental and suprasegmental features (Galligan, 1987). Other children, who are biased towards segmental features from the beginning, may not produce adultlike intonation patterns until later in the single-word period or perhaps after they produce their first word-combinations. In Dore’s (1973) terms, both groups of children would probably be described as “word babies.” Still other children, whom Dore (1973) described as “intonation babies,” demonstrate a selective sensitivity to suprasegmental patterns instead. These are the children who produce jargon intonation before the onset of meaningful speech. In sum, intonation development seems to be related to language experience, but there are at least three different ways that individual children may begin to use intonation in pre-word, word, and multi-word utterances. The individual differences among infants complement the variability among individuals that the 4-year-olds demonstrated. A major priority for future research is to understand more fully the basis of these individual differences in both age groups.

A related implication for future research has to do with design issues. Cross-sectional studies like the present one are intended to generate hypotheses about intonation development in an age range for which the data are scanty. Between-group studies, however, may blur the distinction between developmental differences and individual differences. In light of the large role of individual differences that the present findings suggest, the need for follow-up longitudinal studies is especially compelling. Investigations of the same children over time will not only test hypotheses generated by cross-sectional research; they will also facilitate studies of the striking individual differences among children that are apparent at widely different developmental stages.

In summary, infants’ intonation in early cry and non-cry behaviors may be influenced by the natural biological tendency to produce descending pitch patterns. However, children’s intonation develops markedly between the ages of 1 and 4 years. This pattern of developmental change suggests that biological biases stemming from the innate referential breath-group do not explain children’s acquisition of intonation after the first year of life. To account for intonation development between 1 and 4 years of age, one additional explanatory factor, perhaps, is the child’s linguistic experience. Indeed, for some children, dramatic developments in intonation are associated with a major milestone in language development: the child’s first attempts to produce adult-based words.

Acknowledgments

This research was funded in part by a Kinley Trust grant from the Purdue Research Foundation and an R03 grant (DC04365-02) from the National Institute on Deafness and Other Communication Disorders. Portions of this study were presented at the Annual Meeting of the American Speech–Language–Hearing Association, Washington, DC, November 16–19, 2000. I would like to thank Violette Hawa and Betsy Evers for their contributions to the data collection and perceptual analysis phases of this research. Special thanks also to Joshua Kelly, Jessica Yoder, and Tara Robinson for their contributions to the instrumental analyses.

References

Allen, G. D., & Hawkins, S. (1980). Phonological rhythm: Definition and development. In G. H. Yeni-Komshian, J. F. Kavanagh, & C. A. Ferguson (Eds.), Child phonology: Vol. 1. Production (pp. 227–256). New York: Academic Press.

Boysson-Bardies, de B., Vihman, M. M., Roug-Hellichius, L., Durand, C., Landberg, I., & Arao, F. (1992). Material evidence of infant selection from target language: A cross-linguistic phonetic study. In C. A. Ferguson, L. Menn, & C. Stoel-Gammon (Eds.), Phonological development: Models, research, implications (pp. 369–391). Timonium, MD: York Press.

Branigan, G. (1979). Some reasons why successive single-word utterances are not. Journal of Child Language, 6, 411–421.

Burns, E. M., & Ward, W. D. (1982). Intervals, scales, and tuning. In D. Deutsch (Ed.), The psychology of music (pp. 241–269). New York: Cambridge University Press.

Cruttenden, A. (1981). Falls and rises: Meanings and universals. Journal of Linguistics, 17, 77–91.

Cruttenden, A. (1997). Intonation (2nd ed.). Cambridge: Cambridge University Press.

Dore, J. (1973). The development of speech acts. Ph.D. Dissertation, The City University of New York.

Fenson, L., Dale, P. S., Reznick, J. S., Thal, D., Bates, E., Hartung, J. P., & Reilly, J. S. (1991). Technical manual for the MacArthur Communicative Development Inventories. San Diego, CA: San Diego State University.

Galligan, R. (1987). Intonation with single words: Purposive and grammatical use. Journal of Child Language, 14, 1–21.

Halliday, M. A. K. (1975). Learning how to mean: Explorations in the development of language. London: Edward Arnold.

Kent, R. D., & Bauer, H. R. (1985). Vocalizations of one-year-olds. Journal of Child Language, 12, 491–526.

Kent, R. D., & Forner, L. L. (1980). Speech segment durations in sentence recitations by children and adults. Journal of Phonetics, 8, 157–168.

Kent, R. D., & Murray, A. D. (1982). Acoustic features of infant vocalic utterances at 3, 6, and 9 months. Journal of the Acoustical Society of America, 72, 353–365.

Laufer, M. (1980). Temporal regularity in prespeech. In T. Murry & J. Murry (Eds.), Infant communication: Cry and early speech (pp. 284–309). Houston: College-Hill Press.

Leopold, W. F. (1947). Speech development of a bilingual child: A linguist’s record, Vol. II: Sound learning in the first two years. Evanston, IL: Northwestern University Press.

Lieberman, P. (1967). Intonation, perception, and language. Cambridge, MA: MIT Press.

Menn, L. (1976). Pattern, control and contrast in beginning speech: A case study in the development of word form and word function. Unpublished Doctoral Dissertation, University of Illinois at Urbana-Champaign.

Milenkovic, P., & Read, C. (1992). CSpeech Version 4. Department of Electrical Engineering, University of Wisconsin, Madison.

Oller, D. K. (1980). The emergence of the sounds of speech in infancy. In G. H. Yeni-Komshian, J. F. Kavanagh, & C. A. Ferguson (Eds.), Child phonology: Vol. 1. Production (pp. 93–112). New York: Academic Press.

Robb, M. P., & Saxman, J. H. (1990). Syllable durations of preword and early word vocalizations. Journal of Speech and Hearing Research, 33, 583–593.

Sander, E. K. (1972). When are speech sounds learned? Journal of Speech and Hearing Disorders, 37, 55–63.

Snow, D. (1994). Phrase-final syllable lengthening and intonation in early child speech. Journal of Speech and Hearing Research, 37, 831–840.

Snow, D. (1998). Prosodic markers of syntactic boundaries in the speech of four-year-old children with normal and disordered language development. Journal of Speech, Language, and Hearing Research, 41, 1158–1170.

Stark, R. E., Bernstein, L. E., & Demorest, M. E. (1993). Vocal communication in the first 18 months of life. Journal of Speech and Hearing Research, 36, 548–558.

Stoel-Gammon, C. (1989). Prespeech and early speech development of two late talkers. First Language, 9, 207–224.

Vihman, M. M. (1993). Variable paths to early word production. Journal of Phonetics, 21, 61–82.

Vihman, M., & Miller, R. (1988). Words and babble at the threshold of language acquisition. In M. D. Smith & J. L. Locke (Eds.), The emergent lexicon: The child’s development of a linguistic vocabulary (pp. 151–183). New York: Academic Press.

Wells, G., Montgomery, M., & MacLure, M. (1979). Adult-child discourse: Outline of a model of analysis. Journal of Pragmatics, 3, 337–380.