1 Speech Perception 3/30/00

Upload: chester-harvey

Post on 29-Dec-2015


1

Speech Perception

3/30/00

2

Speech Perception

• How do we perceive speech?

– Multifaceted process

– Not fully understood

– Models & theories attempt to explain process

• Knowledge of speech perception advanced with two tools:

– Spectrograph

• Uses a bank of analyzing filters to analyze speech signals

• Helps determine the acoustic cues for different speech sounds

• Provides information on the fundamental frequency, harmonics, and formants of the vocal tract

– Pattern playback

• Speech synthesizer that converts painted visual patterns into speech-like sounds

• Performs the reverse of the spectrograph

• Converts visual input into perceived auditory signal
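As a rough illustration of what the spectrograph's analyzing filters do, the sketch below measures how much energy one short frame of a signal carries at a handful of analysis frequencies. The synthetic two-tone signal, the sample rate, and the `band_energies` helper are assumptions made for this example and are not part of the original slides; a real spectrograph repeats this kind of measurement over many successive frames to build the time-frequency picture.

```python
import math

def band_energies(frame, sample_rate, freqs_hz):
    """Return the signal power at each analysis frequency.

    Each "filter" here is a single-frequency DFT probe, a simplified
    stand-in (assumption) for one band of a spectrograph's filter bank.
    """
    n = len(frame)
    energies = []
    for f in freqs_hz:
        re = sum(x * math.cos(2 * math.pi * f * i / sample_rate)
                 for i, x in enumerate(frame))
        im = sum(x * math.sin(2 * math.pi * f * i / sample_rate)
                 for i, x in enumerate(frame))
        energies.append((re * re + im * im) / (n * n))
    return energies

# Hypothetical test signal: a 100 Hz component plus a weaker 700 Hz component.
sample_rate = 8000
n_samples = 400  # one 50 ms frame
frame = [math.sin(2 * math.pi * 100 * i / sample_rate)
         + 0.5 * math.sin(2 * math.pi * 700 * i / sample_rate)
         for i in range(n_samples)]

analysis_freqs = [100, 300, 500, 700, 900]
energies = band_energies(frame, sample_rate, analysis_freqs)

# Print a crude one-frame "spectrum": more stars means more energy.
for f, e in zip(analysis_freqs, energies):
    print(f"{f:4d} Hz  {'*' * round(e * 40)}")
```

Only the 100 Hz and 700 Hz probes respond strongly, which is the kind of pattern a spectrogram makes visible as dark bands.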

3

Speech Perception

• Before 1950: Speech was difficult to analyze; analysis systems were laborious

• After 1950: Speech perception became easier to study with the spectrogram & pattern playback

4

Speech Perception

• How we understand the speech of other people.

• How we select one voice in particular from a crowd.

• How we take in the acoustic signal of speech and quickly reach decisions about who said it, what was said, and how it was said.

5

Vowel Perception

• Formants

– Resonances of the human vocal tract

– Acoustic cues for the identification of vowels

– The first 2 or 3 formants (F1, F2, F3) are sufficient for the perceptual identification of vowels
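To make the idea of formants as vowel cues concrete, here is a minimal sketch that labels a vowel by finding the nearest reference point in F1/F2 space. The reference values are approximate, illustrative figures for adult male speakers and should be treated as assumptions, as should the `identify_vowel` helper and the plain Euclidean distance; human listeners also normalize for speaker differences, which this sketch ignores.

```python
import math

# Approximate (F1, F2) targets in Hz. Illustrative assumptions only.
REFERENCE_FORMANTS_HZ = {
    "i":  (270, 2290),   # as in "heed"
    "ae": (660, 1720),   # as in "had"
    "a":  (730, 1090),   # as in "hod"
    "u":  (300, 870),    # as in "who'd"
}

def identify_vowel(f1_hz, f2_hz):
    """Return the reference vowel whose (F1, F2) point is closest."""
    return min(
        REFERENCE_FORMANTS_HZ,
        key=lambda v: math.dist((f1_hz, f2_hz), REFERENCE_FORMANTS_HZ[v]),
    )

# Hypothetical measurements from three vowel tokens.
for f1, f2 in [(290, 2200), (700, 1150), (320, 900)]:
    print(f1, f2, "->", identify_vowel(f1, f2))
```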

6

Diphthong Perception

• Diphthongs (combinations of 2 vowels) exhibit formant transitions

– Frequency changes in a portion of the formants, reflecting changes in the shape of the vocal tract via articulatory movements
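A formant transition can be pictured as the stretch of a formant track where the frequency is changing quickly. The sketch below flags those frames in a made-up F2 track; the track values, the 10 ms frame spacing, the slope threshold, and the `transition_frames` helper are all hypothetical choices for illustration.

```python
def transition_frames(track_hz, frame_ms, min_slope_hz_per_ms):
    """Return indices i where the formant moves from frame i to frame i+1
    at least as fast as the given slope threshold."""
    return [
        i for i in range(len(track_hz) - 1)
        if abs(track_hz[i + 1] - track_hz[i]) / frame_ms >= min_slope_hz_per_ms
    ]

# Hypothetical F2 track (Hz): steady, then a rapid glide upward, then steady.
f2_track = [1100, 1100, 1150, 1400, 1700, 2000, 2200, 2250, 2250]

moving = transition_frames(f2_track, frame_ms=10, min_slope_hz_per_ms=10)
print(moving)  # indices of the frames that belong to the glide
```

The steady portions at either end correspond to the two vowel targets, and the flagged frames in between are the transition.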

7

Consonant Perception

• Perception is more complex because consonants depend on vowels for their recognition

– E.g., if a stop consonant is separated from its neighboring vowels, it will not be perceived as a stop

– Stop consonant perception depends on the rapidly changing formant transitions from consonant (C) to vowel (V) in a CV context

8

Suprasegmental Perception

• Suprasegmental (prosodic) features of a language are those properties of speech sounds that appear simultaneously with, or are overlaid onto, the phonetic (segmental) features, e.g., intonation, stress, and quantity timing.

• These features can alter the meaning of an utterance

9

Intonation

• Involves changes in fundamental frequency, perceived as the pitch pattern of a phrase or sentence

• Can be used to change the speaker's meaning

– Declarative sentences (rise-fall intonation)

– Questions (end-of-sentence pitch rise)

• The same words can carry a different meaning under a different intonation pattern

10

Intonation

11

Stress

• Stress is the perceived degree of force of an utterance

– Involves 3 acoustic parameters (intensity, duration, and fundamental frequency)

– Stressed syllables show an increase in all three acoustic parameters

– OBject (stress on first syllable: NOUN)

– obJECT (stress on second syllable: VERB)
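The three stress cues can be combined in a simple way: for each cue, see which syllable has the larger value, and call the syllable that wins the most cues the stressed one. This is only a sketch under stated assumptions; the measurements below are invented for illustration, and the majority-vote rule and the `Syllable` and `stressed_syllable` names are not from the slides.

```python
from dataclasses import dataclass

@dataclass
class Syllable:
    text: str
    intensity_db: float
    duration_ms: float
    f0_hz: float

def stressed_syllable(syllables):
    """Return the syllable that has the highest value on the most cues."""
    cues = ("intensity_db", "duration_ms", "f0_hz")
    votes = [0] * len(syllables)
    for cue in cues:
        winner = max(range(len(syllables)),
                     key=lambda i: getattr(syllables[i], cue))
        votes[winner] += 1
    return syllables[votes.index(max(votes))]

# Hypothetical measurements for the noun and the verb "object".
noun = [Syllable("OB", 72.0, 210.0, 140.0), Syllable("ject", 66.0, 180.0, 110.0)]
verb = [Syllable("ob", 65.0, 120.0, 115.0), Syllable("JECT", 73.0, 260.0, 145.0)]

print(stressed_syllable(noun).text)
print(stressed_syllable(verb).text)
```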

12

Quantity Timing

• Duration within a phonological system

– Changes in the relative durations of linguistic units in words can change the meaning of words

– Changes in sentence duration can indicate the mood of the speaker

13

Issues in Speech Perception

• Invariance, Linearity, Segmentation

– These issues address the primary recognition problem: how the form of a spoken word is recognized from acoustic information in the speech waveform.

14

Acoustic-Phonetic Invariance

• Proposes that there is a distinct set of acoustic features corresponding to each phoneme, so that each time the phone is produced, the same acoustic features are identified, regardless of context

15

Linearity

• Proposes that in a spoken word, a specific sound corresponds to each phoneme, with units of sound corresponding to phonemes being discrete and ordered in a particular sequence.

16

Segmentation

• The speech signal can be divided (and recombined) into acoustically independent units that correspond to specific phonemes.

17

Problems??

• The 3 principles imply a one-to-one connection between the acoustic and phonemic properties of sounds in words.

• NO. Evidence indicates that natural speech does not conform to these conditions!

– 1. Acoustic cues outnumber phonemes in words

– 2. Acoustic properties of a phoneme vary in different phonetic contexts.

– 3. At a given point, the acoustic properties of preceding and following phones may overlap

– 4. The articulators move continuously in conversation, producing coarticulation effects