

Journal of Speech and Hearing Research, HILLENBRAND, Volume 26, 268-282, June 1983


PERCEPTUAL ORGANIZATION OF SPEECH SOUNDS BY INFANTS

JAMES HILLENBRAND
Northwestern University, Evanston, Illinois

An operant head-turn procedure was used to test whether 6-month-old infants recognize the auditory similarity of speech sounds sharing a value on a phonetic-feature dimension. One group of infants was reinforced for head turns when a change occurred from a series of repeating background stimuli containing nasal consonants ([m, n, ŋ]) to repetitions from a category of syllables containing voiced stop consonants ([b, d, g]), or to a change from stops to nasals. The stimuli were naturally produced by both male and female talkers. The performance of infants in this "phonetic" group was compared to that of infants in a "nonphonetic" control group. Using the same procedures, these infants were reinforced for head turns to a group of phonetically unrelated speech sounds. Results indicated that the performance of infants in the group trained on phonetically related speech sounds was far superior to that of infants in the nonphonetic control group. These findings suggest that prelinguistic infants can perceptually organize speech sounds on the basis of auditory properties related to feature similarity.

A major focus of speech-perception research over the past several decades has been an attempt to define phonetic categories in terms of acoustic properties: for example, to specify the acoustic attributes that define or "cue" the segment [g], or the feature [velar], in all the contexts in which it occurs. Much of the literature in this area has suggested that the critical cues to phonetic categories are often highly variable with changes in "context." The physical cues to speech-sound categories have been found to vary with changes in noncritical dimensions such as the phonetic environment in which the segment appears, the position that the segment occupies within the syllable, and the talker who produces the utterance. These results, combined with a variety of other findings, have led some investigators to theorize that the cues to phonetic categories are not derived from the physical signal in a direct way. Specifically, the suggestion has been made that the perception of speech is mediated in some way by knowledge of how speech is produced. According to this view, the speech waveform is assumed to be interpreted in terms of the articulatory gestures that were used to produce the signal (Liberman, 1970; Liberman, Cooper, Shankweiler, & Studdert-Kennedy, 1967; Stevens & House, 1972).

Other investigators have argued that attempts to relate phonetic categories to the acoustic signal have failed to account seriously for the psychophysical processes involved in the coding of complex auditory signals. According to this point of view, invariant acoustic cues to phonetic categories can, in fact, be derived from the physical signal without appealing to articulatory knowledge (Fant, 1967; Kuhl, 1979a; Miller, Engebretson, Spenner, & Cox, 1977; Searle, Jacobson, & Rayment, 1979; Stevens & Blumstein, 1978).

Speech-perception research with infants can provide specific kinds of evidence on the contention that articulatory knowledge is a necessary condition for the categorization of speech sounds. The reasoning is relatively simple: Since prelinguistic infants are not assumed to possess sophisticated knowledge about the production of speech, demonstrations of phonetic categorization by infants will indicate the limits of the type of articulatory knowledge likely to be involved in this process.

In a recent series of experiments, Kuhl and her associates (Kuhl & Miller, 1982; Kuhl, 1977; 1979b; Holmberg, Morgan, & Kuhl, 1977; Kuhl & Hillenbrand, Note 1) attempted to determine the extent to which young infants recognize similarities among speech sounds when variations are introduced in noncritical dimensions. For example, an experiment by Kuhl (1979b) demonstrated that 6-month-old infants could detect a change from one category of vowels to another when the tokens varied randomly in talker and pitch contour. Infants in this experiment were initially trained to make a head turn for a visual reward when a change occurred from repetitions of a single token of [a], synthesized to simulate a male voice with a falling pitch contour, to repetitions of a single token of [i], produced by the same male "talker" with the same pitch contour. The infants were then gradually exposed to a number of novel tokens synthesized to simulate female and child talkers with either falling or rising pitch contours. The results showed that infants readily transferred learning from the tokens produced by the male talker to the novel tokens produced by female and child talkers.

Similar experiments have tested the perception of an [a]-[ɔ] contrast across variations in talker and pitch contour (Kuhl, 1977), fricative contrasts across variations in vowel context and talker (Holmberg et al., 1977), a nasal-consonant place contrast across variations in vowel context and talker (Hillenbrand, 1980, Note 2), and, using a different version of the operant head-turn procedure, a stop-consonant place contrast across variations in vowel context (Fodor, Garrett, & Brill, 1975).

© 1983, American Speech-Language-Hearing Association. 0022-4685/83/2602-0268$01.00/0

To date, infant research on phonetic categories has focused exclusively on the infant's ability to recognize phonetic similarity at the level of the phone, or phonetic segment. The purpose of the present study was to extend these findings and to test infants on their ability to organize speech sounds at the more abstract level of the phonetic feature. The feature contrast was a stop/nasal distinction: [b, d, g] versus [m, n, ŋ]. This contrast seemed like a logical starting point for testing feature perception in infancy for two reasons. First, a good deal is known about the physical correlates of this distinction. During the occlusion portion of nasal consonants, a nasal murmur is produced that is characterized by (a) a low-frequency first resonance at 200-300 Hz, well separated from higher formants; (b) relatively high damping factors (large formant bandwidths and low formant levels); and (c) an antiformant that varies in frequency with place of articulation (Fant, 1960; Fujimura, 1962). Voiced stop consonants, on the other hand, (a) do not show a nasal murmur (although a low-frequency "voice bar" may be present during the occlusion), (b) are characterized by aperiodic release bursts, and (c) typically show more rapid changes in amplitude following release than nasal consonants (Fant, 1960). A second reason for studying the stop/nasal contrast is that information is available on infants' discrimination of stop and nasal consonants. Infants have been shown to discriminate individual pairs of speech sounds differing in stop-consonant place of articulation (Eimas, 1974; Morse, 1972), nasal-consonant place of articulation (Hillenbrand, Note 2), and a stop-nasal manner-class contrast (Eimas & Miller, 1980).

The present study examined the ability of infants to categorize speech sounds according to the stop-nasal distinction. In other words, the study was designed to determine whether infants recognize that the stops [b, d, g] are similar to one another and distinct from a class consisting of the nasals [m, n, ŋ].

METHODS

The general approach of the study was similar to the transfer-of-learning experiments by Kuhl and her colleagues (Kuhl, 1977; 1979b; Holmberg et al., 1977; Kuhl & Hillenbrand, Note 1). One group of 6-month-old infants was visually reinforced for head-turn responses when a change occurred from a background category of syllables containing nasal consonants ([m, n, ŋ]) to a comparison category of syllables containing voiced stop consonants ([b, d, g]), or to a change from stops to nasals. The speech sounds were produced by both male and female talkers. The performance of infants in this "phonetic" group was compared to the performance of a separate group of infants run in a procedurally identical "nonphonetic" condition. These infants were tested using the same pool of stimuli used in the phonetic condition, but the stimuli were assigned to reinforced and unreinforced categories in such a way that the categories could not be organized according to phonetic attributes or talker.

The procedure, which is described in detail below, used a visual reward to train an infant to make a head-turn response when a change occurred from a class of repeating background stimuli to repetitions from a comparison category. The experimental stages for the phonetic condition are shown in Table 1. The first stage contrasted a single token of [ma] with a single token of [ba].

TABLE 1. Experimental stages for the phonetic condition. (M = male talker; F = female talker.)

Stage                      Category 1    Category 2

1: Initial training        ba (M)        ma (M)

2: Place variation         ba (M)        ma (M)
                           da (M)        na (M)

3: Talker × Place          ba (M)        ma (M)
                           da (M)        na (M)
                           ba (F)        ma (F)
                           da (F)        na (F)

4: Transfer of learning    ba (M)        ma (M)
                           da (M)        na (M)
                           ga (M)        ŋa (M)
                           ba (F)        ma (F)
                           da (F)        na (F)
                           ga (F)        ŋa (F)

Both syllables were naturally produced by the same male voice. In the second stage, postdental consonants were added to each class; that is, [ma] and [na] were contrasted with [ba] and [da]. In the third stage, labial and postdental consonants produced by a female voice were added to each category. In the fourth and final stage, velar consonants were added to each class, resulting in a contrast between male and female [m, n, ŋ] and male and female [b, d, g]. Half of the infants were trained with the stop consonants as the comparison category, and half were trained with the nasal consonants as the comparison category.

In the final stage of the experiment, the infant's task was to make a head-turn response whenever a change occurred from a category of nasal consonants to a category of voiced stop consonants (or from stop consonants to nasal consonants), independent of random variation in place of articulation and talker. If subjects in this task succeeded in responding to the stimuli in the comparison category, it would be tempting to conclude that the infants recognized the similarity of speech sounds sharing a phonetic-feature value. It is possible, however, that infants might simply memorize which tokens were reinforced and which ones were not. Memorizing tokens, of course, would not necessarily require a perceptual grouping of the stimuli. To test for this possibility, the performance of infants run in the phonetic task described above was compared to the performance of a separate group of infants run in a nonphonetic condition. In the nonphonetic condition, categories were arranged in such a way that the six stimuli in each class could not be organized according to phonetic or acoustic characteristics. Subjects were tested using the same procedures and equipment, plus the same pool of 12 stimuli as in the phonetic condition. The experimental stages for the nonphonetic condition are shown in Table 2. Subjects were initially trained on a relatively gross contrast between a male [na] and a female [ba]. The subsequent stages were analogous to those of the phonetic condition in terms of the number of tokens added in each stage. However, sounds were added in such a way that, by the final stage, it was not possible to organize the stimuli along any simple dimension: Each class included an equal number of stops and nasals, male voices and female voices, and labials, postdentals, and velars. As in the phonetic condition, half of the subjects were trained with category 1 as the comparison class and the other half with category 2. It was reasoned that the only way an infant could succeed on this task was to memorize which individual stimuli were reinforced and which ones were not. If the performance of infants in the phonetic group proved to be superior to that of the nonphonetic group, the effect could be attributed to perceptual categorization of the speech sounds by infants in the phonetic group.

TABLE 2. Experimental stages for the nonphonetic condition. (M = male talker; F = female talker.)

Stage    Category 1    Category 2

1        ba (F)        na (M)

2        ba (F)        na (M)
         ŋa (M)        ga (F)

3        ba (F)        na (M)
         ŋa (M)        ga (F)
         da (F)        ma (F)
         ma (M)        ba (M)

4        ba (F)        na (M)
         ŋa (M)        ga (F)
         da (F)        ma (F)
         ma (M)        ba (M)
         ga (M)        da (M)
         na (F)        ŋa (F)
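The balance property claimed for the final nonphonetic stage can be checked mechanically. The sketch below is illustrative only: the token lists are read from the stage-4 rows of Table 2, with the garbled velar-nasal glyphs in the scan resolved as ŋ (an assumption), and the helper names `profile` and `is_balanced` are hypothetical, not part of the study.

```python
from collections import Counter

# Consonant classes as described in the text (stops vs. nasals, three places).
STOPS = {"b", "d", "g"}
PLACE = {"b": "labial", "m": "labial",
         "d": "postdental", "n": "postdental",
         "g": "velar", "ŋ": "velar"}

# Stage-4 nonphonetic categories as (consonant, talker) pairs.
# ASSUMPTION: garbled glyphs in the scanned table are read as "ŋ".
category_1 = [("b", "F"), ("ŋ", "M"), ("d", "F"), ("m", "M"), ("g", "M"), ("n", "F")]
category_2 = [("n", "M"), ("g", "F"), ("m", "F"), ("b", "M"), ("d", "M"), ("ŋ", "F")]


def profile(tokens):
    """Count tokens by manner, talker sex, and place of articulation."""
    manner = Counter("stop" if c in STOPS else "nasal" for c, _ in tokens)
    talker = Counter(t for _, t in tokens)
    place = Counter(PLACE[c] for c, _ in tokens)
    return manner, talker, place


def is_balanced(tokens):
    """True if stops/nasals, male/female, and the three places are equally represented."""
    manner, talker, place = profile(tokens)
    return (manner["stop"] == manner["nasal"]
            and talker["M"] == talker["F"]
            and len(place) == 3
            and len(set(place.values())) == 1)


print(is_balanced(category_1), is_balanced(category_2))
```

Under these readings both classes are balanced on every dimension, which is why no single phonetic or talker cue separates them; a phonetic-condition category such as six stop tokens fails the same check.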

Stimuli

The stimuli were naturally produced tokens of [m, n, ŋ, b, d, g] in prevocalic position with the vowel [a]. One adult male and one adult female produced several tokens of each syllable. Audio recordings were made in a sound-treated booth with a cardioid microphone (Sennheiser MKH 415T-U) and a high-quality full-track recorder (Nagra 4.2). The talkers were instructed to produce all stimuli with approximately equal durations, intensities, and slightly falling pitch contours. A VU meter was used to monitor intensity. The recorded stimuli were digitized and stored in the disk memory of a digital computer (DEC PDP 11/10). A sample rate of 20 kHz was used with a maximum amplitude resolution of eight bits within a ±4-V dynamic range. All signals were low-pass filtered at 8 kHz and conditioned with an autocorrelator noise-reduction device (Phase Linear 1000).
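As a rough illustration of what these digitization settings imply, the following sketch works out the Nyquist frequency, the quantization step, and the ideal dynamic range. The numbers come from the text; the variable names are illustrative, and treating the converter as an ideal uniform quantizer with 2^8 equal steps across the full range is an assumption.

```python
import math

# Values stated in the text.
SAMPLE_RATE_HZ = 20_000
BITS = 8
V_MIN, V_MAX = -4.0, 4.0
LOWPASS_HZ = 8_000

# Highest frequency representable without aliasing.
nyquist_hz = SAMPLE_RATE_HZ / 2

# ASSUMPTION: ideal uniform quantizer dividing the range into 2**BITS steps.
levels = 2 ** BITS
step_volts = (V_MAX - V_MIN) / levels

# Ideal dynamic range of an N-bit quantizer, about 6.02 dB per bit.
dynamic_range_db = 20 * math.log10(levels)

print(f"Nyquist frequency: {nyquist_hz:.0f} Hz (filter cutoff {LOWPASS_HZ} Hz)")
print(f"Quantization step: {step_volts * 1000:.2f} mV")
print(f"Ideal dynamic range: {dynamic_range_db:.1f} dB")
```

The 8-kHz low-pass cutoff sits below the 10-kHz Nyquist frequency, leaving a 2-kHz guard band for the filter's roll-off.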

One token of each syllable produced by the two talkers was selected for use in the discrimination tests. The tokens were chosen by selecting those stimuli that showed the closest match on computer-derived measurements of fundamental frequency contour, intensity contour, and duration. In the final set of stimuli there were no systematic differences between the stop and nasal categories in fundamental frequency, overall RMS intensity, or duration. (Measurements of these stimuli are given in Table A of the Appendix.) Formal listening tests showed that all stimuli were identified reliably by a panel of five adult listeners.

Audiotapes for discrimination testing were prepared by recording stimuli from the two categories on separate channels of tape. At the output of the D/A converter, the stimuli were low-pass filtered at 8 kHz, conditioned with an autocorrelator noise-reduction device (Phase Linear 1000), and recorded with a constant 1.7-sec onset-to-onset interstimulus interval. The onsets of the stimuli on the two channels of each tape were synchronized using a cueing procedure described by Hillenbrand, Minifie, and Edwards (1979). Gain settings at the input to the tape deck (TEAC 3340-S) were adjusted so that the two stimuli that had contrasted in the initial-training stage balanced for loudness.

Calibration

Signals were calibrated by a combination of sound-level measurements and a loudness-balance procedure. The gain setting at the output of the tape deck was adjusted so that the peak intensity of one syllable in the initial-training pair measured 65 dBA, using the fast-response setting of a sound-level meter (Brüel & Kjær, Model 2209). A loudness-balance procedure was used to adjust the output gain of the channel carrying the contrasting syllables. An experimenter used an electronic switch to alternate between the two channels. The output gain of the channel carrying the contrasting syllables was adjusted until one adult listener judged that the two signals were equally loud. These same gain settings were used for the experimental conditions involving multiple tokens of the two categories. The loudness balance was checked as part of the daily calibration procedure.

Procedures

1. General. A schematic of the experimental site is shown in Figure 1. The infant was held on the parent's lap facing an assistant. An experimenter in an adjacent room controlled the equipment and was able to observe the infant on a video monitor. A loudspeaker (Electro-Voice SP-12) was positioned at a 90° angle to the assistant. In front of the speaker was an electrically operated stuffed toy bear in a smoked plexiglass box. When activated, the box was illuminated and the bear tapped on a drum.

FIGURE 1. Experimental site for the visually reinforced head-turn procedure (from Kuhl, 1979b). Labeled elements: Experimenter (E), Assistant (A), Parent (P), Infant (I), Visual Reinforcer, Camera (C), Video Monitor (M), and the equipment.

TRIAL STRUCTURE

                 Pre-                             | Observation interval  | Post-
Change trial:    ... NASAL3 NASAL1 NASAL2 NASAL4  | STOP1 STOP4 STOP2     | NASAL3 NASAL2 NASAL4 ...
Control trial:   ... NASAL3 NASAL1 NASAL2 NASAL4  | NASAL3 NASAL1 NASAL2  | NASAL3 NASAL2 ...
                 Time →

FIGURE 2. Trial structure for the phonetic condition. The figure shows stimuli being presented before, during, and after change and control trials. The subscripts refer to the individual stimuli in the background and comparison categories. The example shown here is for stage 3 of the phonetic condition in which the stop category was reinforced (after Kuhl, 1979b).

TRIAL STRUCTURE

                 Pre-                             | Observation interval  | Post-
Change trial:    ... NASAL5 NASAL2 NASAL2 NASAL2  | STOP1 STOP1 STOP1     | NASAL6 NASAL6 NASAL6 ...
Control trial:   ... NASAL5 NASAL2 NASAL2 NASAL2  | NASAL4 NASAL4 NASAL4  | NASAL6 NASAL6 NASAL6 ...
                 Time →

FIGURE 3. Trial structure for the final stage of testing (stage 4). The stimuli are presented in random order, but each stimulus in the order is repeated three times (see Kuhl, 1979b).

The experiment was run with a tape deck (TEAC 3340-S) and a logic device. Throughout the entire experiment, tape-recorded stimuli were continuously presented at onset-to-onset intervals of 1.7 sec. The assistant's task was to keep the infant's attention by manipulating silent toys. When the assistant judged the infant to be in a "ready state," that is, quiet and attending to the toys, he pressed a button signaling the experimenter to initiate a 5-sec observation interval. Two kinds of trials could occur during the interval: change trials or control trials. Figure 2 shows stimuli being presented before, during, and after change and control trials for the phonetic condition. During a change trial, a silent switch initiated a change in tape-recorder channels from the repeating background category to three presentations from the comparison category. A hand-held vibrotactile device signaled the start of a 5-sec observation interval to the assistant; a small light mounted on the monitor signaled the start of the interval to the experimenter. If both the experimenter and the assistant judged that a head turn occurred during the observation interval, they independently pressed buttons that activated the visual reinforcer for 3 sec. AND-gate circuitry ensured that the reinforcer would be activated only on change trials in which both judges voted during the 5-sec observation interval. During a control interval, the infant continued to hear stimuli from the background category. On control trials, both the experimenter and the assistant made a judgment about the occurrence of a head turn, but reinforcement was not provided, regardless of the infant's response. For the final stage of testing (stage 4), stimuli were presented using a special three-repetition trial structure described by Kuhl (1979b). As shown in Figure 3, the stimuli were presented in random order, but each stimulus in the order was repeated three times. Since a single token was presented on any given trial, this format made it possible to assign the infant's response to a particular stimulus. On both change and control trials the experimenter recorded the stimulus that was presented and the infant's response.
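The reinforcement contingency and the stage-4 presentation format described above can be summarized in a few lines of code. This is a minimal sketch of the logic only, not of the original hardware; the function names `reinforcer_fires` and `three_repetition_stream` are hypothetical.

```python
def reinforcer_fires(trial_type, experimenter_vote, assistant_vote):
    """AND-gate rule: the visual reinforcer is activated only on change
    trials on which both judges vote that a head turn occurred."""
    return trial_type == "change" and experimenter_vote and assistant_vote


def three_repetition_stream(order, repeats=3):
    """Stage-4 format: tokens occur in the given (random) order,
    but each token in the order is repeated `repeats` times."""
    return [token for token in order for _ in range(repeats)]


print(reinforcer_fires("change", True, True))    # both judges vote on a change trial
print(reinforcer_fires("control", True, True))   # never reinforced on control trials
print(three_repetition_stream(["NASAL5", "NASAL2", "STOP1"]))
```

Because each observation interval then contains three copies of one token, a head turn can be attributed to that particular stimulus.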

For all stages of the experiment an infant's performance was measured by comparing the proportion of head turns on change trials to the proportion of head turns on control trials. To reduce the possibility that the parent or assistant might cue the infant's response, and to control for bias in judging head turns, music was presented over earphones to both adults in the test room at a level sufficient to mask a change from one stimulus to another. The experimenter was able to hear the stimuli over an audio monitor in the control room and therefore could have been biased in his judgment of head turns. Experimenter bias in this task would be revealed by his failure to agree with the assistant, who was unbiased. Interjudge agreement for all trials was 98%, indicating that experimenter bias did not play a large role in the judgment of head turns. When the two judges did fail to agree, the trials were always scored as errors. As a further effort to reduce the possibility of bias, an electronic probability generator, set at 50%, was used to determine whether a given observation interval would be a change or control trial. Since previous work with the head-turn procedure suggested that long strings of change and control trials increased the probability of infant errors, the experimenter was instructed to override the probability generator for a single trial after three consecutive change or control trials (see Kuhl, 1979b).
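The trial-selection rule and the performance measure lend themselves to a short simulation. The sketch below is an illustration under stated assumptions: a software random number generator stands in for the electronic probability generator, and the "override" is interpreted as forcing the opposite trial type after a run of three. The names `next_trial_type` and `head_turn_proportions` are hypothetical.

```python
import random


def next_trial_type(history, rng, p_change=0.5, max_run=3):
    """Pick the next trial type with probability p_change of a change trial.
    ASSUMPTION: after max_run identical trials in a row, the override
    forces the opposite type for a single trial."""
    if len(history) >= max_run and len(set(history[-max_run:])) == 1:
        return "control" if history[-1] == "change" else "change"
    return "change" if rng.random() < p_change else "control"


def head_turn_proportions(trials):
    """Given (trial_type, head_turn) pairs, return the proportion of head
    turns on change trials and on control trials (None if no such trials)."""
    def proportion(kind):
        turns = [turn for k, turn in trials if k == kind]
        return sum(turns) / len(turns) if turns else None
    return proportion("change"), proportion("control")


rng = random.Random(0)
schedule = []
for _ in range(20):
    schedule.append(next_trial_type(schedule, rng))
print(schedule)

example = [("change", True), ("change", False), ("control", False), ("control", False)]
print(head_turn_proportions(example))
```

With this interpretation of the override, no run of identical trial types can exceed three, while the sequence remains unpredictable to the judges in the test room.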


2. Conditioning the head-turn response. The head-turn response was conditioned by initiating a change trial and, after a few presentations of the comparison stimulus, activating the visual reinforcer. After a variable number of these trials, most infants began to make head turns that anticipated the activation of the visual reinforcer. To be included in the experiment, an infant was required to make three consecutive anticipatory head turns. Subjects were allowed a maximum of 25 trials to meet the conditioning criterion. Testing on the initial-training stage was not begun until the infant met the conditioning criterion. Experience with the head-turn procedure has shown that infants who meet the conditioning criterion very quickly will sometimes perform poorly on the initial-training stage. For that reason, all infants were given a minimum of 15 conditioning trials.

3. Progressing subjects through the experiment. An infant advanced from one stage of the experiment to the next when he/she met an accuracy criterion of 9 correct responses in 10 consecutive trials, half being change trials and half being control trials. If an infant did not meet this 9-out-of-10 criterion in 20 trials, he/she was automatically progressed to the next stage of the experiment. When an infant reached the final stage of the experiment, he/she was given as close to 75 trials as possible. A variety of problems prevented this in some cases, including scheduling difficulties, experimenter error, and infants who had become fussy after prolonged testing. The number of trials run on the final stage ranged from 63 to 75, with an average of 68.9 trials.
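The stage-advancement rule can be expressed as a sliding-window check. The following is a minimal sketch, assuming that "10 consecutive trials" means any window of 10 successive trials within the stage; the requirement that the window contain equal numbers of change and control trials is not modeled here. The function name `trials_to_advance` is hypothetical.

```python
def trials_to_advance(outcomes, window=10, needed=9, max_trials=20):
    """Return (trial_number, met_criterion) for one stage.

    outcomes: list of booleans, True for a correct response, in trial order.
    The infant advances at the first trial where `needed` of the last
    `window` trials were correct, or automatically after `max_trials`.
    Returns (None, False) if the stage is still in progress.
    """
    limit = min(len(outcomes), max_trials)
    for n in range(window, limit + 1):
        if sum(outcomes[n - window:n]) >= needed:
            return n, True
    if len(outcomes) >= max_trials:
        return max_trials, False
    return None, False


print(trials_to_advance([True] * 10))          # criterion met at trial 10
print(trials_to_advance([False, True] * 10))   # automatic progression at trial 20
```

An infant who starts poorly can still meet the criterion later in the stage, because only the most recent 10 trials count toward it.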

4. Retraining. It was often the case that infants at various stages of testing would show a marked drop in performance. In many cases the infant appeared to have forgotten the experimental contingencies or seemed to lose interest in the task. Infants were retrained by the presentation of conditioning trials, that is, change trials in which the visual reinforcer was manually activated if the infant did not respond within about 4 sec of the stimulus change. Two rules controlled the presentation of these retraining trials:

1. A single retraining trial was presented after three consecutive misses on change trials.

2. If after the first 15 trials of a session an infant had missed more than half of the change trials, the next five trials were retraining trials. Regardless of the stage of testing that the infant was in, these retraining trials used the pair of stimuli from the initial-training stage.
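The two retraining rules can be sketched as a single decision function. This is an illustration only, with several assumptions: "consecutive misses" is taken to mean consecutive change trials regardless of intervening control trials, rule 2 is checked exactly once when the session reaches 15 trials and takes precedence over rule 1, and the caller is expected to invoke the function right after each trial. The name `retraining_trials_due` is hypothetical.

```python
def retraining_trials_due(session):
    """Return how many retraining trials to present next (0, 1, or 5).

    session: list of (trial_type, head_turn) pairs for the current session,
    where trial_type is "change" or "control" and head_turn is a boolean.
    """
    change_turns = [turned for kind, turned in session if kind == "change"]

    # Rule 2: after the first 15 trials, more than half of the change
    # trials missed -> five retraining trials.
    if len(session) == 15 and change_turns:
        misses = change_turns.count(False)
        if misses > len(change_turns) / 2:
            return 5

    # Rule 1: three consecutive misses on change trials -> one retraining trial.
    # ASSUMPTION: intervening control trials do not break the run.
    if len(change_turns) >= 3 and not any(change_turns[-3:]):
        return 1

    return 0


print(retraining_trials_due([("change", False)] * 3))
```

A fuller model would also reset the miss counter after a retraining trial; that bookkeeping is omitted here because the text does not specify it.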

5. Testing sessions. A test session was terminated when either the experimenter or the assistant judged that the baby was becoming tired or fussy or at the end of 30 trials. Testing sessions lasted about 10-15 minutes, with an average of 20 trials per session. Infants were usually given all of the trials for a particular experimental stage within the same session. However, if a session was terminated before an infant completed testing on a given stage, testing on the next session would resume where the infant left off. Seven or eight sessions were generally required to complete the experiment.

Subjects

The subjects were normal 5½- to 6½-month-old infants selected by mail solicitation to parents in the Seattle area. A parent questionnaire was used to screen out infants who (a) had been treated for middle-ear problems, (b) had a family history of congenital hearing loss, or (c) were born more than 2 weeks premature or 2 weeks late. Subjects were assigned randomly to either the phonetic or the nonphonetic group. A total of 23 subjects began testing. Subjects were run until eight infants completed testing in each group. To be included in the study, an infant had to pass the conditioning criterion of three consecutive anticipatory head-turn responses in the first 25 trials of testing. Six subjects failed to pass the conditioning criterion on the [ma]-[ba] contrast for the phonetic study. One additional subject in the phonetic group was eliminated due to an experimenter error, leaving seven subjects in this group instead of eight. The nonphonetic condition offered subjects a much grosser, multidimensional contrast; consequently, only one subject in the nonphonetic condition failed to pass conditioning in the allotted 25 trials.

RESULTS

The most interesting results of this study come from an analysis of the babies' responses on the final stage of each condition. These analyses are discussed first, followed by a description of the infants' performance on the preliminary stages. Figure 4 displays the percentages of head turns on change trials and on control trials for infants in the phonetic and nonphonetic groups for the final stage of testing. The graph shows that more head turns were observed on change as opposed to control trials for both groups of infants. The trial-type effect, however, was much more pronounced for the phonetic group. Infants in the two groups responded about equally often on control trials, but the phonetic infants responded much more often on change trials than the nonphonetic infants. A two-way analysis of variance for trial type and group, with repeated measures on the trial-type variable, revealed significant main effects for both trial type (F = 17.4; df = 1, 13; p < .001) and group (F = 8.0; df = 1, 13; p < .01). There was also a significant group × trial-type interaction (F = 7.2; df = 1, 13; p < .05), indicating that the trial-type effect was significantly larger for the phonetic group. Post hoc analysis showed that the trial-type effect was statistically reliable for both the phonetic group (F = 11.9; df = 1, 6; p < .01) and the nonphonetic group (F = 8.0; df = 1, 7; p < .05). These comparisons indicate that infants in both groups performed significantly above chance on the final stage of testing, but that infants in the phonetic group performed with greater accuracy than those in the nonphonetic group.
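The degrees of freedom reported above follow from the group sizes (7 phonetic and 8 nonphonetic infants) and the two trial types. The sketch below reproduces them using the standard formulas for a two-factor design with repeated measures on one factor; the function names are hypothetical, and the code computes degrees of freedom only, not the F ratios themselves.

```python
def mixed_design_df(group_sizes, n_within_levels):
    """Degrees of freedom for a groups x within-subject-factor ANOVA."""
    n_subjects = sum(group_sizes)
    n_groups = len(group_sizes)
    between_error = n_subjects - n_groups           # subjects within groups
    within_error = between_error * (n_within_levels - 1)
    return {
        "group": (n_groups - 1, between_error),
        "trial_type": (n_within_levels - 1, within_error),
        "interaction": ((n_groups - 1) * (n_within_levels - 1), within_error),
    }


def repeated_measures_df(n_subjects, n_levels):
    """Degrees of freedom for a one-factor repeated-measures ANOVA."""
    return (n_levels - 1, (n_subjects - 1) * (n_levels - 1))


print(mixed_design_df([7, 8], 2))    # overall analysis
print(repeated_measures_df(7, 2))    # post hoc test, phonetic group
print(repeated_measures_df(8, 2))    # post hoc test, nonphonetic group
```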

[Figure 4: bar graph with separate panels for the phonetic group and the nonphonetic group; vertical axis, percent head-turn responses; horizontal axis, trial type (change, control).]

FIGURE 4. Percent head-turn responses on change and control trials for infants in the phonetic (n = 7) and nonphonetic (n = 8) groups. The data in this figure and in Figures 5-11 are from the final stage of testing (stage 4).

It was also of interest to determine specifically how the subjects distributed their responses among the individual sounds in the reinforced and unreinforced categories. Figure 5 presents these data for the three infants in the phonetic group who were trained to turn to the stop category. The six shaded columns to the left show the percentage of head turns to each of the six stop consonants presented on change trials; the six unshaded columns to the right show the same data for the six nasal consonants presented during control intervals. The stimulus is given on the horizontal axis. Since the stimuli were arranged in random order on the audiotape, the experimenter had no control over what stimulus would be presented on a given trial. As a consequence, the number of presentations of the stimuli varied somewhat.

The most obvious feature of Figure 5 is that, as noted previously, many more head turns were observed on change trials as compared to control trials. More specifically, however, infants seemed to turn in roughly equal proportions in response to each of the six sounds in the two categories; that is, they did not show any prominent, consistent preference for a particular talker or place-of-articulation value. This was also true for the subgroup of four infants reinforced for head turns in response to the nasal consonants (see Figure 6). Again, the general picture is one of a relatively even distribution of responses among the stimuli. It is especially interesting that the infants did not show a preference for the stimulus used in the initial-training stage, shown at the extreme left of each graph. In fact, Figure 6 shows a slight tendency to avoid the training token, although this effect is not particularly prominent.

[Figure 5: bar graph titled "Summary: Phonetic Group, Stops Reinforced"; vertical axis, percent head-turn responses; horizontal axis, stimulus, with the number of presentations in parentheses. Change trials: bM (16), dM (32), gM (10), bF (17), dF (19), gF (20). Control trials: mM (16), nM (16), ŋM (13), mF (15), nF (17), ŋF (13).]

FIGURE 5. Percent head-turn responses to each of the stimuli presented during change trials (shaded columns) and control trials (unshaded columns) for the phonetic subgroup in which the stop category was reinforced (n = 3). The figures in parentheses indicate the number of times each stimulus was presented. M = male voice; F = female voice.

[Figure 6: bar graph titled "Summary: Phonetic Group, Nasals Reinforced"; vertical axis, percent head-turn responses; horizontal axis, stimulus, with the number of presentations in parentheses. Change trials: mM (26), nM (30), ŋM (19), mF (33), nF (24), ŋF (18). Control trials: bM (21), dM (28), gM (23), bF (23), dF (22), gF (23).]

FIGURE 6. Percent head-turn responses to each of the stimuli presented during change trials and control trials for the phonetic subgroup in which the nasal category was reinforced (n = 4).

A clearer picture of these results can be obtained by combining the data for all seven infants in the phonetic group. This can be done by contrasting reinforced versus unreinforced stimuli and collapsing the data into broader categories such as "labial, male," "dental, male," and so on. A graph combining the data from all subjects in the phonetic group is shown in Figure 7. The impression of an even distribution of responding to the stimuli is even stronger in this graph. The mean response percentage to the reinforced stimuli was 67.5%, with a range of only 8% and a standard deviation of 2.9%. A three-way analysis of variance for talker (male vs. female), place of articulation (labial vs. postdental vs. velar), and trial type (change vs. control) revealed a significant main effect for the trial-type factor only (F = 13.4; df = 1, 6; p < .01). There were no effects for talker (F = 1.1; df = 1, 6; p NS) or place of articulation (F = 1.4; df = 2, 12; p NS), and none of the interactions approached significance.

[Figure 7: bar graph titled "Summary: Phonetic Group, All Subjects"; vertical axis, percent head-turn responses; horizontal axis, stimulus, with the number of presentations in parentheses. Change trials: LM (42), DM (62), VM (30), LF (50), DF (43), VF (36). Control trials: LM (37), DM (44), VM (36), LF (38), DF (39), VF (36).]

FIGURE 7. Percent head-turn responses to each of the stimuli presented during change trials and control trials for all subjects in the phonetic group (n = 7). L = labial; D = postdental; V = velar.

[Figure 8: bar graph titled "Summary: Nonphonetic Group, Training Stimulus: bF"; vertical axis, percent head-turn responses; horizontal axis, stimulus, with the number of presentations in parentheses. Change trials: bF (27), ŋM (17), dF (21), mM (27), gM (21), nF (count illegible). Control trials: nM (22), gF (25), mF (15), bM (25), dM (42), ŋF (10).]

FIGURE 8. Percent head-turn responses to each of the stimuli presented during change trials and control trials for the nonphonetic subgroup in which [ba] (female) served as the training stimulus (n = 4).

The pattern of responding in the nonphonetic group was quite different from that of the phonetic group. Figure 8 shows the percentage of head turns to each of the stimuli presented to the group of four nonphonetic infants who were reinforced in initial training for head turns to [ba] (female). As a group, these infants tended to turn more often to the six stimuli in the reinforced class than to those in the unreinforced class (25% vs. 16%). But, unlike the pattern observed for the phonetic infants, the responses were distributed very unevenly among the six reinforced stimuli. Specifically, many more responses were cued by the [ba] (female) stimulus, which served as the reinforced token in the initial-training contrast. A very similar pattern can be seen in Figure 9 for the subgroup of four infants trained with the categories reversed, that is, the infants for whom [na] (male) served as the reinforced stimulus in initial training. Again, the infants were responding most often to the stimulus used in the initial-training contrast, with relatively low levels of responding to the other stimuli. As a group, the eight subjects in the nonphonetic condition responded to 29% of the change trials, compared to 19% of the control trials. However, when data are removed from trials on which training stimuli were presented, the rate of responding on change trials is only 18%, almost identical to the response rate on control trials. This suggests that the significant trial-type effect found for this group was due almost exclusively to responses to the training stimulus.

[Figure 9: bar graph titled "Summary: Nonphonetic Group, Training Stimulus: nM"; vertical axis, percent head-turn responses; horizontal axis, stimulus, with the number of presentations in parentheses. Change trials: nM (31), gF (24), mF, bM, dM, ŋF (counts illegible). Control trials: bF (count illegible), ŋM (20), dF (24), mM (23), gM (17), nF (27).]

FIGURE 9. Percent head-turn responses to each of the stimuli presented during change trials and control trials for the nonphonetic subgroup in which [na] (male) served as the training stimulus (n = 4).

It was not possible to combine the data from the two subgroups in the nonphonetic condition. For the phonetic condition this was accomplished by combining the responses to reinforced stimuli which shared values on all dimensions except the stop/nasal dimension. This perfect symmetry, of course, did not exist for the nonphonetic categories. Consequently, it was not possible to line up each stimulus in one category with a stimulus in the other category that differed on a single feature value.

Profiles of Individual Subjects

The data presented thus far are the results of averages from groups of subjects. Results from the seven individual infants in the phonetic group are presented in Figure 10. Three measures are given to the right of each graph: (a) the percentage of head turns on change trials (CH), (b) the percentage of head turns on control trials (CL), and (c) the overall percent correct on both change and control trials (%C). These graphs should be examined with some caution because of the variation in the number of presentations of the stimuli, given in parentheses on the horizontal axis. Since the experimenter had no control over which stimulus was presented on a given trial, some of the data points in these graphs are based on very few responses. Examination of these data clearly shows that the infants do not form a homogeneous group. Two of the infants, Subjects 3 and 7, appeared to be responding randomly to the stimuli, while the remaining five infants performed with relatively high accuracy.
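The relation among the three measures can be illustrated with a short calculation. The sketch below assumes that each infant received equal numbers of change and control trials, so that overall percent correct is the average of the hit rate (CH) and the correct-rejection rate (100 - CL); that assumption is made here for illustration and is not stated in the text. The CH, CL, and %C values are those printed beside the panels of Figure 10, and the function name is hypothetical.

```python
def overall_percent_correct(ch, cl):
    """Approximate %C from CH and CL, assuming equal trial counts.

    ch: percentage of head turns on change trials (correct responses)
    cl: percentage of head turns on control trials (false alarms)
    """
    correct_rejections = 100 - cl
    return (ch + correct_rejections) / 2


# subject: (CH, CL, reported %C), phonetic group, as printed in Figure 10
reported = {
    1: (86, 30, 78),
    2: (88, 9, 90),
    3: (19, 26, 47),
    4: (79, 9, 85),
    5: (100, 26, 87),
    6: (68, 15, 77),
    7: (34, 28, 53),
}

for subject, (ch, cl, pc) in reported.items():
    print(subject, overall_percent_correct(ch, cl), pc)
```

Under this assumption the computed values fall within half a percentage point of the reported %C for every subject, which suggests that change and control trials were presented about equally often.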

Figure 11 shows the response patterns of the eight infants tested in the nonphonetic group. Intersubject variability in the performance of these subjects is also evident. Some of the infants, particularly Subjects 1, 2, 4, and 6, apparently found the task very difficult and produced what seemed to be essentially random head-turn responses to the 12 stimuli. Other infants, however, responded with some consistency to the stimulus used in the initial-training contrast, shown at the extreme left of each graph. Subject 8, in fact, appeared to have memorized a second stimulus. It is interesting that this second stimulus ([ga], female) has little in common with the training stimulus ([na], male). On the other hand, Subject 3, who was initially trained to [ba] (female), responded almost exclusively to the tokens produced by the female talker. The pattern shown by this infant is more typical of other infants who have been run using this type of procedure: that is, some attempt by the infant to formulate a general rule to organize the stimulus categories (Kuhl, Holmberg, Morgan, Hillenbrand, & Cameron, Note 3).

Results from Preliminary Stages

The data described to this point were derived from analyses of the infants' responses on the final stage of the experiment. This section provides a brief description of the results from the preliminary stages of the experiment; a more detailed account of these results can be found in Hillenbrand (Note 4). Table 3 shows the results from the first three experimental stages and from the conditioning phase for infants in the phonetic and nonphonetic groups. For the conditioning phase the criterion was three consecutive anticipatory head turns; for the three experimental stages the criterion was nine correct responses in 10 consecutive trials.

TABLE 3. Number of trials required to reach criterion for subjects in the phonetic and nonphonetic groups.

                                Experimental stage
Subject     Conditioning        1       2       3

Phonetic group
1           10                  20      16      --
2           13                  --a     --      --
3           11                  --      --      --
4           10                  --      12      --
5           20                  --      15      17
6           21                  10      --      14
7           20                  --      --      --

Nonphonetic group
1           8                   --      --      --
2           6                   10      --      --
3           7                   10      --      --
4           9                   --      --      --
5           20                  --      --      --
6           9                   10      --      --
7           5                   10      --      --
8           3                   --      --      --

a Subject failed to meet criterion (indicated by dashes).
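The two criteria just described (three consecutive anticipatory head turns for conditioning, and nine correct responses in 10 consecutive trials for the experimental stages) can each be expressed as a scan over a sequence of trial outcomes. The following is a minimal sketch; the function names and the list-of-booleans representation are assumptions made for illustration.

```python
def trials_to_run_criterion(outcomes, run_length=3):
    """1-based trial number at which `run_length` consecutive successes
    is first completed, or None if the run never occurs."""
    streak = 0
    for trial, success in enumerate(outcomes, start=1):
        streak = streak + 1 if success else 0
        if streak >= run_length:
            return trial
    return None


def trials_to_window_criterion(outcomes, needed=9, window=10):
    """1-based trial number at which `needed` correct responses within
    `window` consecutive trials is first reached, or None."""
    for end in range(window, len(outcomes) + 1):
        if sum(outcomes[end - window:end]) >= needed:
            return end
    return None


# Hypothetical outcome sequences (True = correct response).
conditioning = [False, True, True, False, True, True, True]
stage = [False, False] + [True] * 9

print(trials_to_run_criterion(conditioning))
print(trials_to_window_criterion(stage))
```

A result of None from the second function within the first 20 trials corresponds to a dash in Table 3, that is, an infant who was moved on without meeting the criterion.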

One fairly prominent finding from these tables is that, on the average, infants in the phonetic group required more trials to reach the conditioning criterion (mean = 15.0) than did infants in the nonphonetic group (mean = 8.7). This difference was predictable since the nonphonetic infants were trained on a contrast involving differences in several acoustic dimensions, while the phonetic infants were trained on a minimal pair. A second feature of interest in these tables is that in the majority of cases infants did not meet the 9-out-of-10 accuracy criterion and, consequently, were progressed to the next experimental stage after 20 trials. This was true for both groups and for all three stages. This was not particularly surprising since previous work has shown that infants typically require more than 20 trials to reach criterion on consonant contrasts (Holmberg et al., 1977).
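The group means quoted above can be recomputed from the conditioning column of Table 3. The sketch below uses the values as they appear in the table; the variable names are chosen here for illustration.

```python
from statistics import mean

# Trials to the conditioning criterion, Table 3, Subjects 1-7 and 1-8.
phonetic = [10, 13, 11, 10, 20, 21, 20]
nonphonetic = [8, 6, 7, 9, 20, 9, 5, 3]

print(mean(phonetic))
print(mean(nonphonetic))
```

With these values the phonetic mean reproduces the reported 15.0. The nonphonetic values as they appear in the table average about 8.4 rather than the reported 8.7, which may reflect a digit altered in reproduction of the table.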

A more revealing picture of the infants' performance throughout the experiment can be seen by examining the overall percentage of correct responses as a function of the experimental stage. Mean and standard deviation percent correct for each experimental stage are plotted in Figure 12 for the phonetic group and in Figure 13 for the nonphonetic group. Figure 12 shows that there was no tendency for the performance of the phonetic infants to decline as the experiment became more complex. In fact, these data show a slight trend in the opposite direction. In contrast, the performance of the nonphonetic infants dropped rather sharply from stage 1 to stage 2 and remained at a relatively low level. These results suggest that the nonphonetic infants were able to learn the head-turn task but were unable to memorize the unrelated tokens that were added as the experiment progressed.


[Figure 10: seven bar graphs, one for each subject in the phonetic group; vertical axis, percent head-turn responses; horizontal axis, stimulus (change and control trials), with the number of presentations in parentheses. Values printed beside the panels:
Subject 1: CH = 86, CL = 30, %C = 78
Subject 2: CH = 88, CL = 9, %C = 90
Subject 3: CH = 19, CL = 26, %C = 47
Subject 4: CH = 79, CL = 9, %C = 85
Subject 5: CH = 100, CL = 26, %C = 87
Subject 6: CH = 68, CL = 15, %C = 77
Subject 7: CH = 34, CL = 28, %C = 53]

FIGURE 10. Individual response profiles for subjects in the phonetic group. The figures to the right of each graph indicate the percentage of responses on change trials (CH), the percentage of responses on control trials (CL), and the overall percent correct (%C).


[Figure 11: eight bar graphs, one for each subject in the nonphonetic group; vertical axis, percent head-turn responses; horizontal axis, stimulus (change and control trials), with the number of presentations in parentheses. Values printed beside the panels:
Subject 1: CH = 15, CL = 18, %C = 49
Subject 2: CH = 14, CL = 8, %C = 53
Subject 3: CH = 42, CL = 10, %C = 66
Subject 4: CH = 28, CL = 24, %C = 52
Subject 5: CH = 18, CL = 13, %C = 53
Subject 6: CH = 37, CL = 30, %C = 54
Subject 7: CH = 32, CL = 19, %C = 57
Subject 8: CH = 43, CL = 11, %C = 66]

FIGURE 11. Individual response profiles for subjects in the nonphonetic group. The figures to the right of each graph indicate the percentage of responses on change trials (CH), the percentage of responses on control trials (CL), and the overall percent correct (%C).


[Figure 12: graph for the phonetic group; vertical axis, percent correct; horizontal axis, experimental stage (1-4).]

FIGURE 12. Overall percent correct for each experimental stage for the phonetic group. The error bars indicate one standard deviation.

[Figure 13: graph for the nonphonetic group; vertical axis, percent correct; horizontal axis, experimental stage (1-4).]

FIGURE 13. Overall percent correct for each experimental stage for the nonphonetic group. The error bars indicate one standard deviation.

An Additional Control Condition

As was discussed previously, the nonphonetic condition was designed to test infants on a set of stimuli comparable to that used in the phonetic condition but which could not be grouped on the basis of auditory similarity. The relatively good performance of infants in the phonetic group led to the conclusion that these subjects recognized similarities among sounds in the stimulus categories. However, infants in the phonetic group were initially trained on a more difficult contrast than were infants in the nonphonetic group. As a consequence, more infants in the phonetic group failed to meet the conditioning criterion. For this reason, it could be argued that the phonetic/nonphonetic difference was the result of bias in subject selection. It is possible that the more difficult initial-training contrast in the phonetic condition resulted in the selection of better subjects than those in the nonphonetic condition.

To test for this possibility, an additional control condition was run using a nonphonetic task in which the initial-training contrast was the same as that for the phonetic group: [ma] versus [ba]. The experimental stages for this condition are shown in Table 4. As in the phonetic condition, the initial-training stage contrasted [ma] (male) with [ba] (male). However, as in the nonphonetic condition described previously, stimuli were added in subsequent stages in such a way that the categories could not be organized by talker or by place or manner of production. Testing procedures were identical to those described previously except that the tape deck and modular programming logic were replaced by a digital computer (DEC PDP 11/34). A computer program presented stimuli and controlled experimental contingencies according to the same rules and with the same timing parameters as were used to design the programming logic described previously. Six 5½- to 6½-month-old infants began testing; two of these subjects failed to pass the conditioning criterion.

TABLE 4. Experimental stages for an additional nonphonetic control condition.

Stage   Category 1                                        Category 2

1       ba (M)                                            ma (M)
2       ba (M), ga (F)                                    ma (M), ŋa (M)
3       ba (M), ga (F), ma (F), na (M)                    ma (M), ŋa (M), da (F), ba (F)
4       ba (M), ga (F), ma (F), na (M), da (M), ŋa (F)    ma (M), ŋa (M), da (F), ba (F), ga (M), na (F)

The results of this control experiment do not support the possibility that the phonetic/nonphonetic difference was due exclusively to bias in subject selection. Average performance on the initial-training stage was 68% correct, comparable to that of the phonetic group. However, unlike the performance of the phonetic group, these subjects' performance fell very close to chance and stayed there for the remaining stages. Average performance for the final stage was 58% correct. These findings support the conclusion that infants in the phonetic condition performed well because they recognized the perceptual similarity of syllables sharing a value on a feature dimension.

DISCUSSION

The principal findings of this study were:

1. The overall performance of infants in the phonetic group was significantly better than that of the nonphonetic group.

2. The phonetic infants tended to distribute their responses more or less evenly among the stimuli in the reinforced category, while infants in the nonphonetic group tended to favor the stimulus that was used in the initial-training contrast.

3. There was no evidence of a systematic decline in the performance of phonetic infants as the experiment became more complex, whereas the performance of infants in the nonphonetic group tended to drop as tokens were added to the two categories.

These results suggest that infants do recognize the similarity of speech sounds that share a value on a phonetic-feature dimension. The alternate possibility that simple rote memorization was responsible for these results seems unlikely in light of the relatively poor overall performance of infants in the nonphonetic group. This same phonetic/nonphonetic difference was also found in a similar study examining categorization of fricatives (Kuhl et al., Note 3) and in a study examining categorization of nasal consonants (Hillenbrand, Note 2). It is important to point out, however, that the nonphonetic results do not prove that memorization was not involved in any form in the phonetic condition. It is a well-established finding that memorization is most efficient when the items to be recalled can be organized in some fashion (e.g., see Bartlett, 1932; Bransford & Franks, 1974; Tulving & Donaldson, 1972). The phonetic/nonphonetic effect suggests that if memorization was involved, the process was aided by the perceptual similarity of the speech sounds. Whatever the exact role of memory in these experiments, it appears that recognition of perceptual similarity is a necessary condition for good performance on this kind of task.

One additional issue that needs to be addressed in interpreting these findings concerns the discriminability of tokens within the stop and nasal categories. To qualify as categorization, it must be demonstrated that the tokens in the particular class are being treated as equivalent but different. That is, it would not be interesting to demonstrate common responses to the class [b, d, g] if infants could not discriminate stop-consonant place of articulation. The literature provides ample evidence that infants can discriminate among voiced stop consonants (Eimas, 1974; Morse, 1972). In addition, a recent experiment using procedures very similar to those described in this report provides evidence for the discrimination of nasal-consonant place of articulation by young infants (Hillenbrand, Note 2). These discrimination results suggest that infants in the present study demonstrated what Bornstein (1981) has called "equivalence classification," or "the equivalent treatment of discriminably different stimuli based on their perceptual similarity" (p. 40).

Perceptual Development and Theories of Speech Perception

The present results extend the findings of previous research on infants in which speech-sound categorization was tested at the level of the phonetic segment (Fodor et al., 1975; Holmberg et al., 1977; Kuhl, 1977; 1979b; Kuhl & Miller, 1982; Hillenbrand, Note 2). Taken as a group, these studies suggest that young infants have relatively sophisticated abilities to focus on the critical acoustic dimensions that "define" speech-sound categories while ignoring prominent variation in noncritical dimensions. These findings are analogous to the more extensive developmental literature on perceptual constancies in vision. The work of Bower (1964), for example, suggests that young infants perceive the true size of an object despite the substantial variations in retinal-image size that result when object-observer distance is changed.

The exact role of experience is not clear in these vision experiments, nor is it a simple issue in relation to the infant studies on speech-sound categories. Since the subjects in these studies were not newborns, it is not possible to rule out learning or simply the effects of exposure to speech in accounting for these results. Two conclusions seem reasonable, however. First, if these abilities are learned, they are learned very quickly and apparently without any specific training. Second, and perhaps more important than the specific question of innateness, these abilities predate the acquisition of detailed knowledge of speech production and the acquisition of sophisticated speech-comprehension abilities. This observation bears directly on specific theoretical debates in speech-perception research. An important contention of "motor theories" of speech perception is that the invariance problem is resolved by processes that involve the mediation of articulatory knowledge. The results of the present study, and other demonstrations of perceptual constancy for speech by infants, suggest that sophisticated articulatory knowledge is not a necessary condition for the demonstration of these abilities. It appears that prelinguistic infants are capable of extracting the acoustic properties that form the basis of phonetic categories. If this general finding is corroborated by further research, it would seem to support the auditory-based theories proposed by Fant and others (Fant, 1967; Miller, 1977; Miller et al., 1977; Searle et al., 1979). However, it is possible to formulate a version of an articulation-based theory consistent with the infant findings. It is necessary only to assume that the articulatory knowledge which mediates the perception of speech is phylogenetically rather than ontogenically acquired; that is, that part of human genetic endowment is a species-specific mechanism for speech perception. In fact, this sort of approach has been successful in explaining the perception of biologically relevant signals in other species (Hailman, 1969; Marler, 1970; 1976). However, recent experiments on speech perception by nonhuman listeners are not consistent with this view. Research on the dog (Baru, 1975) and the chinchilla (Burdick & Miller, 1975; Kuhl & Miller, 1975; 1978) suggests that nonhuman listeners are able to sort speech sounds on the basis of phonetic similarity across variations in noncritical dimensions. Taken together, the infant and animal findings suggest that acoustic invariants are available in the speech signal and, further, that the mammalian auditory system seems capable of extracting these properties in a variety of contexts.

A C K N O W L E D G M E N T S

This work is a portion of a dissertation conducted at the Uni- versity of Washington's Child Development and Mental Retar- dation Center under the direction of Patricia Kuhl. Her careful guidance is gratefully acknowledged, as is the advice of Fred Minifie, Wesley Wilson, and Philip Dale. I would also like to thank Jean Tully, Tristan Holmberg, Chris Prall, and Kyum-Ha Lee for their valuable contributions to this project. This work was supported by a research contract from the National Institute of Child Health and Human Development to Dr. Fred Minifie (NICHD HD-3-2793), a grant from the National Science Foun- dation to Dr. Patrieia Kuhl (BNS 79-13767), and by an Annual Fund Doctoral Fellowship to the author from the Graduate School of the University of Washington.

Implications for Phonological Development

The phonetic condition contrasted a category of voiced stop consonants with a category of nasal consonants. The performance of subjects in this task indicates that infants are capable of organizing speech sounds on the basis of categories at least this broad or "abstract." The feature categories tested, however, are phonologically organized within even broader feature classes, such as [±continuant] or [±sonorant]. It would be interesting to determine whether infants are capable of organizing speech sounds based on very broad feature categories such as these. For example, would infants reinforced for head turns to nasal consonants also respond to presentations of other sonorants, such as liquids and semivowels, but not to presentations of obstruents, such as fricatives and affricates? The importance of determining the infant's proclivities for classifying speech sounds is that these kinds of perceptual abilities may form the basis for acquiring phonological rules that appeal to feature categories.

There are a number of phonological rules that appeal to the nasal/oral distinction. For example, in most dialects of American English, voiced stops that precede homorganic syllabic nasals are released nasally rather than orally (e.g., "sadden"). Most descriptions of phonological rule systems suggest that rules such as these are specified in terms of values on feature dimensions rather than individual phonetic segments. While the present results do not argue that infants are born with anything that could be described as "phonological knowledge," it is possible that the acquisition of phonological rules may be aided by the infant's recognition of the inherent perceptual similarity of speech sounds sharing particular feature values. On a related issue, some investigators have argued that children do not learn the sound system of their language in a straightforward "segment-by-segment" fashion, but rather by learning the hierarchical organization of features and feature contrasts (Blache, 1978; Jakobson, 1968; Smith, 1973). More detailed studies of the type presented here might reveal a relationship between the acquisition of phonological rules and phonetic segments and the relative difficulty of organizing speech sounds along various feature dimensions.

REFERENCE NOTES

1. KUHL, P. K., & HILLENBRAND, J. Speech perception by young infants: Perceptual constancy for categories based on pitch contour. Paper presented at the biennial meeting of the Society for Research in Child Development, San Francisco, 1979.

2. HILLENBRAND, J. Speech perception by infants: Categorization along a nasal consonant place dimension. Manuscript submitted for publication.

3. KUHL, P. K., HOLMBERG, T. L., MORGAN, K. A., HILLENBRAND, J., & CAMERON, P. Perception of equivalence for fricatives in CV syllables. Manuscript in preparation.

4. HILLENBRAND, J. Perceptual organization of speech sounds by young infants. Unpublished doctoral dissertation, University of Washington, 1980.

5. PRALL, C. W., & HILLENBRAND, J. AUDED: A time-domain analysis and editing program for audio signals. Technical report, Northwestern University, Evanston, IL, 1980.

REFERENCES

BARTLETT, F. C. Remembering. Cambridge, England: Cambridge University Press, 1932.

BARU, A. V. Discrimination of synthesized vowels [a] and [i] with varying parameters in dog. In G. Fant & M. A. A. Tatham (Eds.), Auditory analysis and perception of speech. New York: Academic Press, 1975.

BLACHE, S. E. The acquisition of distinctive features. Baltimore: University Park Press, 1978.

BORNSTEIN, M. H. Two kinds of perceptual organization near the beginning of life. In W. A. Collins (Ed.), Aspects of the development of competence. Hillsdale, NJ: Lawrence Erlbaum Associates, 1981.

BOWER, T. G. R. Discrimination of depth in premotor infants. Psychonomic Science, 1964, 1, 368.

BRANSFORD, J. D., & FRANKS, J. J. Memory for syntactic form as a function of semantic context. Journal of Experimental Psychology, 1974, 103, 1037-1039.

BURDICK, C. K., & MILLER, J. D. Speech perception by the chinchilla: Discrimination of sustained /a/ and /i/. Journal of the Acoustical Society of America, 1975, 58, 415-427.

EIMAS, P. D. Auditory and linguistic processing of cues for place of articulation by infants. Perception & Psychophysics, 1974, 16, 513-521.

EIMAS, P. D., & MILLER, J. L. Discrimination of information for manner of articulation. Infant Behavior and Development, 1980, 3, 367-375.

FANT, G. Acoustic theory of speech production. The Hague: Mouton, 1960.

FANT, G. Auditory patterns of speech. In W. Wathen-Dunn (Ed.), Models for the perception of speech and visual form. Cambridge: MIT Press, 1967.

FODOR, J. A., GARRETT, M. F., & BRILL, S. L. Pi-ka-pu. The perception of speech sounds by pre-linguistic infants. Perception & Psychophysics, 1975, 18, 74-78.

FUJIMURA, O. Analysis of nasal consonants. Journal of the Acoustical Society of America, 1962, 34, 1865-1875.

HAILMAN, J. P. How an instinct is learned. Scientific American, 1969, 221, 98-106.

HILLENBRAND, J. Categorization of stop and nasal consonants by young infants. Journal of the Acoustical Society of America, 1980, 68(Suppl. 1), S31(A).

HILLENBRAND, J., MINIFIE, F. D., & EDWARDS, T. J. Tempo of spectrum change as a cue in speech-sound discrimination by infants. Journal of Speech and Hearing Research, 1979, 22, 147-165.

HOLMBERG, T. L., MORGAN, K. A., & KUHL, P. K. Speech perception in early infancy: Discrimination of fricative consonants. Journal of the Acoustical Society of America, 1977, 62(Suppl. 1), S99(A).

JAKOBSON, R. Child language, aphasia and phonological universals. The Hague: Mouton, 1968.

KUHL, P. K. Speech perception in early infancy: Perceptual constancy for the vowel categories /a/ and /o/. Journal of the Acoustical Society of America, 1977, 62(Suppl. 1), S39(A).

KUHL, P. K. Models and mechanisms in speech perception: Species comparisons provide further contributions. Brain, Behavior and Evolution, 1979, 16, 374-408. (a)

KUHL, P. K. Speech perception in early infancy: Perceptual constancy for spectrally dissimilar vowel categories. Journal of the Acoustical Society of America, 1979, 66, 1668-1679. (b)

KUHL, P. K., & MILLER, J. D. Speech perception by the chinchilla: Voiced-voiceless distinctions in alveolar plosive consonants. Science, 1975, 190, 69-72.

KUHL, P. K., & MILLER, J. D. Speech perception by the chinchilla: Identification for synthetic VOT stimuli. Journal of the Acoustical Society of America, 1978, 63, 905-917.

KUHL, P. K., & MILLER, J. D. Discrimination of auditory target dimensions in the presence or absence of variation in a second dimension by infants. Perception & Psychophysics, 1982, 31, 279-292.

LIBERMAN, A. M. The grammars of speech and language. Cognitive Psychology, 1970, 1, 301-323.

LIBERMAN, A. M., COOPER, F. S., SHANKWEILER, D. P., & STUDDERT-KENNEDY, M. Perception of the speech code. Psychological Review, 1967, 74, 431-461.

MARLER, P. A comparative approach to vocal learning: Song development in white-crowned sparrows. Psychological Monographs, 1970, 71, 1-25.

MARLER, P. Sensory templates in species-specific behavior. In J. Fentress (Ed.), Simpler networks and behavior. Sunderland: Sinauer Associates, 1976.

MILLER, J. D. Perception of speech by animals: Evidence for speech processing by mammalian auditory systems. In T. H. Bullock (Ed.), Recognition of complex auditory signals. Berlin: Abakon Verlagsgesellschaft, 1977.

MILLER, J. D., ENGEBRETSON, A. M., SPENNER, B. F., & COX, J. R. Preliminary analysis of speech sounds with a digital model of the ear. Journal of the Acoustical Society of America, 1977, 62(Suppl. 1), S13(A).

MORSE, P. A. The discrimination of speech and non-speech in early infancy. Journal of Child Psychology, 1972, 14, 477-492.

SEARLE, C. L., JACOBSON, J. Z., & RAYMENT, S. G. Stop consonant discrimination based on human audition. Journal of the Acoustical Society of America, 1979, 65, 799-809.

SMITH, N. V. The acquisition of phonology: A case study. Cambridge, England: Cambridge University Press, 1973.

STEVENS, K. N., & BLUMSTEIN, S. E. Invariant cues for place of articulation in stop consonants. Journal of the Acoustical Society of America, 1978, 64, 1358-1368.

STEVENS, K. N., & HOUSE, A. S. Speech perception. In J. V. Tobias (Ed.), Foundations of modern auditory theory (Vol. 2). New York: Academic Press, 1972.

TULVING, E., & DONALDSON, W. (Eds.). Organization of memory. New York: Academic Press, 1972.

Received March 2, 1982 Accepted August 12, 1982

Requests for reprints should be sent to James Hillenbrand, Department of Communicative Disorders, Northwestern University, 2299 Sheridan Road, Evanston, IL 60201.

APPENDIX

Table A shows the results of acoustic measurements on the stop-vowel and nasal-vowel stimuli used in the infant tests. All measurements were made using the program AUDED (Prall & Hillenbrand, Note 5) written for a DEC PDP 11 computer. Fundamental frequency was measured for the vocalic portion of each utterance by displaying successive 100-msec segments of the waveform on a high-resolution graphics terminal (Tektronix 4010) and using a cross-hair cursor to mark the boundaries of each pitch period. For simplicity, the table shows only mean fundamental frequency. All utterances showed rise/fall fundamental frequency contours. Intensity was measured by a program that simply calculated an RMS value over all data points in the waveform and converted the value to a decibel scale. All values in the table are given in relation to [ba] (male), which was arbitrarily set to 65 dB. The overall duration of each utterance was measured from the same graphics displays as those used to calculate fundamental frequency.
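The intensity measurement described above can be illustrated with a minimal sketch. It is not the AUDED program: the function names, the synthetic waveforms, and the use of 20 log10 of the RMS amplitude ratio as the decibel conversion are all assumptions made for illustration. The text states only that an RMS value was computed over all data points, converted to a decibel scale, and expressed relative to [ba] (male) at 65 dB.

```python
import math


def rms(samples):
    """Root-mean-square amplitude over all data points of a waveform."""
    if not samples:
        raise ValueError("waveform is empty")
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def relative_level_db(samples, reference_samples, reference_db=65.0):
    """Level of `samples` in dB, with the reference waveform pinned to `reference_db`.

    Assumes the conventional amplitude conversion, 20 * log10(ratio of RMS values).
    """
    ratio = rms(samples) / rms(reference_samples)
    return reference_db + 20.0 * math.log10(ratio)


# Synthetic stand-ins for digitized utterances (hypothetical data, not from the study).
reference = [math.sin(2 * math.pi * k / 100) for k in range(1000)]
louder = [2.0 * x for x in reference]  # doubled amplitude, roughly 6 dB higher

print(round(relative_level_db(reference, reference), 1))
print(round(relative_level_db(louder, reference), 1))
```

Pinning one stimulus to an arbitrary level, as the appendix does, means only the differences between table entries are meaningful, not the absolute values.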

TABLE A. Fundamental frequency, intensity, and duration measurements of the stop and nasal stimuli. Fundamental frequency means and standard deviations are given separately for the male and female talkers.

Stimuli         Fundamental        RMS intensity    Duration
                frequency (Hz)     (dB)             (msec)

ba (male)        80.7              65.0             479.2
da (male)        81.5              64.5             562.8
ga (male)        82.1              64.3             518.6
ba (female)     197.6              69.1             407.2
da (female)     194.8              69.5             474.4
ga (female)     197.4              68.4             560.4

mean            81.4/196.6         66.8             500.4
SD              0.7/1.6            2.4              59.4

ma (male)        83.9              64.8             505.6
na (male)        84.0              64.3             561.9
ŋa (male)        81.3              64.4             535.2
ma (female)     188.0              71.2             487.7
na (female)     197.0              71.9             508.2
ŋa (female)     192.4              71.5             484.9

mean            83.1/192.4         68.0             513.9
SD              1.5/4.5            3.9              29.6
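As a check on how the summary rows relate to the individual entries, the following sketch recomputes the mean and SD rows for the stop stimuli from the values in Table A. The variable and function names are illustrative. The use of the sample (n - 1) standard deviation is an inference rather than something the appendix states; under that assumption the recomputed stop-row values round to the printed ones.

```python
from statistics import mean, stdev

# Stop-consonant rows transcribed from Table A:
# (stimulus, talker, F0 in Hz, RMS intensity in dB, duration in msec)
STOPS = [
    ("ba", "male", 80.7, 65.0, 479.2),
    ("da", "male", 81.5, 64.5, 562.8),
    ("ga", "male", 82.1, 64.3, 518.6),
    ("ba", "female", 197.6, 69.1, 407.2),
    ("da", "female", 194.8, 69.5, 474.4),
    ("ga", "female", 197.4, 68.4, 560.4),
]


def summarize(values):
    """Return (mean, sample SD), each rounded to one decimal place."""
    return round(mean(values), 1), round(stdev(values), 1)


# F0 is summarized separately by talker, as in the table caption.
f0_male = summarize([row[2] for row in STOPS if row[1] == "male"])
f0_female = summarize([row[2] for row in STOPS if row[1] == "female"])
# Intensity and duration are pooled over all six stop stimuli.
intensity = summarize([row[3] for row in STOPS])
duration = summarize([row[4] for row in STOPS])

print("F0 (male):", f0_male)
print("F0 (female):", f0_female)
print("Intensity:", intensity)
print("Duration:", duration)
```

The same function can be applied to the nasal rows.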