1
Speech Perception 3/30/00
2
Speech Perception
• How do we perceive speech?
– Multifaceted process
– Not fully understood
– Models & theories attempt to explain process
• Knowledge of speech perception advanced with two tools:
– Spectrograph
• Uses analyzing filters to break speech signals down by frequency
• Used to determine the acoustic cues for different speech sounds
• Provides information about the fundamental frequency, harmonics, and formants of the vocal tract
– Pattern playback
• Speech synthesizer that converts painted visual patterns into speech-like sounds
• Performs the reverse of the spectrograph
• Converts visual input into a perceived auditory signal
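The spectrograph's frequency analysis can be sketched in a few lines of Python. This is only an illustration, not how the original instrument worked: a short-time discrete Fourier transform stands in for the analyzing filters, and the function names, frame length, and test tone are all assumptions made for the example.

```python
import cmath
import math


def frame_spectrum(frame):
    """Magnitude spectrum of one frame via a naive DFT (slow, but dependency-free)."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(acc))
    return mags


def spectrogram(signal, frame_len, hop):
    """Per-frame magnitude spectra; the outer list runs along time."""
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = signal[start:start + frame_len]
        # A Hann window reduces leakage between neighbouring frequency bins.
        windowed = [
            x * (0.5 - 0.5 * math.cos(2 * math.pi * i / (frame_len - 1)))
            for i, x in enumerate(chunk)
        ]
        frames.append(frame_spectrum(windowed))
    return frames


def peak_frequency(spectrum, sample_rate, frame_len):
    """Frequency in Hz of the strongest bin in one frame."""
    k = max(range(len(spectrum)), key=spectrum.__getitem__)
    return k * sample_rate / frame_len


# Hypothetical input: a pure 500 Hz tone sampled at 8000 Hz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 500 * t / sample_rate) for t in range(512)]
frames = spectrogram(tone, frame_len=256, hop=128)
print(len(frames), peak_frequency(frames[0], sample_rate, 256))
```

For real speech, each frame would show several peaks (harmonics of the fundamental) shaped by broader formant regions, rather than the single peak of this test tone.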
3
Speech Perception
• Before 1950: Speech was difficult to analyze; analysis systems were laborious
• After 1950: Speech perception became easier to study with the spectrograph & pattern playback
4
Speech Perception
• How we understand the speech of other people.
• How we select one voice in particular from a crowd.
• How we take in the acoustic signal of speech and quickly reach decisions about who said it, what was said, and how it was said.
5
Vowel Perception
• Formants
– Resonances of the human vocal tract
– Acoustic cues for the identification of vowels
– The first 2 or 3 formants (F1, F2, F3) are sufficient for the perceptual identification of vowels
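To illustrate how F1 and F2 alone can separate vowels, here is a minimal nearest-neighbour sketch. The reference values are rough averages of the kind often quoted for adult male speakers of American English. They are assumptions for illustration only, and the labels and function name are hypothetical.

```python
import math

# Approximate (F1, F2) values in Hz; illustrative assumptions, not measurements.
REFERENCE_VOWELS = {
    "i (heed)": (270, 2290),
    "ae (had)": (660, 1720),
    "a (hod)": (730, 1090),
    "u (who'd)": (300, 870),
}


def classify_vowel(f1, f2):
    """Return the reference vowel closest to the measured (F1, F2) point."""
    return min(
        REFERENCE_VOWELS,
        key=lambda label: math.dist((f1, f2), REFERENCE_VOWELS[label]),
    )


print(classify_vowel(280, 2250))
print(classify_vowel(700, 1100))
```

Plain distance in Hz is the simplest possible choice. Real listeners also adjust for speaker differences, which this sketch ignores.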
6
Diphthong Perception
• Diphthongs (combinations of 2 vowels) exhibit formant transitions
– Frequency changes in a portion of the formants, reflecting changes in the shape of the vocal tract via articulatory movements
7
Consonant Perception
• Perception is more complex because consonants depend on vowels for their recognition
– E.g., if stop consonants are separated from vowels, they will not be perceived as stops
– Stop consonant perception depends on rapidly changing formant transitions from C to V in a CV context
8
Suprasegmental Perception
• Suprasegmental (prosodic) features of a language are those properties of speech sounds that appear simultaneously with, or are overlaid onto, the phonetic (segmental) features, e.g. intonation, stress, quantity timing.
• Can alter the meaning of an utterance
9
Intonation
• Involves changes in fundamental frequency, perceived as the pitch pattern of a phrase or sentence
• Can be used to change speaker’s meaning
– Declarative sentence (rise-fall intonation)
– Questions (end-of-sentence pitch rise)
• Can use the same words but change the meaning
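The contrast between a terminal fall and an end-of-sentence rise can be sketched as a slope test on a fundamental frequency contour. Everything here is an assumption made for illustration: the contour values are invented, the final third of the contour is taken as "the end", and the threshold of 1 Hz per frame is arbitrary.

```python
def terminal_slope(f0_hz, tail_fraction=1 / 3):
    """Least-squares slope (Hz per frame) over the final part of an F0 contour."""
    tail_len = max(2, int(len(f0_hz) * tail_fraction))
    tail = f0_hz[-tail_len:]
    n = len(tail)
    mean_x = (n - 1) / 2
    mean_y = sum(tail) / n
    num = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(tail))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


def classify_contour(f0_hz, threshold=1.0):
    """Label a contour by the direction of its final pitch movement."""
    slope = terminal_slope(f0_hz)
    if slope > threshold:
        return "question-like (terminal rise)"
    if slope < -threshold:
        return "statement-like (terminal fall)"
    return "level"


# Hypothetical F0 values in Hz, one per analysis frame, for the same words.
statement = [120, 130, 140, 135, 125, 115, 105, 95, 90]
question = [120, 118, 117, 116, 118, 125, 140, 160, 180]
print(classify_contour(statement))
print(classify_contour(question))
```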
11
Stress
• The perception of stress, or the degree of force of an utterance
– Involves 3 acoustic parameters (intensity, duration, and fundamental frequency)
– Stressed syllables: increase in all acoustic parameters
– OBject (stress on first syllable: NOUN)
– obJECT (stress on second syllable: VERB)
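A rough way to picture how the three cues combine: give each syllable one vote per cue on which it has the highest value, and take the syllable with the most votes as stressed. The measurements below are invented for illustration, and the voting rule is an assumed simplification, not a model of how listeners actually weigh the cues.

```python
from dataclasses import dataclass


@dataclass
class Syllable:
    label: str
    intensity_db: float
    duration_ms: float
    f0_hz: float


def stressed_syllable(syllables):
    """Return the label of the syllable that wins the most acoustic cues."""
    votes = {s.label: 0 for s in syllables}
    for cue in ("intensity_db", "duration_ms", "f0_hz"):
        winner = max(syllables, key=lambda s: getattr(s, cue))
        votes[winner.label] += 1
    return max(votes, key=votes.get)


# Hypothetical measurements for the two readings of "object".
noun = [Syllable("ob", 72, 180, 140), Syllable("ject", 66, 150, 110)]
verb = [Syllable("ob", 64, 110, 105), Syllable("ject", 71, 240, 135)]
print(stressed_syllable(noun))
print(stressed_syllable(verb))
```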
12
Quantity Timing
• Duration within a phonological system
– Changes in relative durations of linguistic units in words can change the meaning of words
– Changes in sentence duration can indicate the mood of the speaker
13
Issues in Speech Perception
• Invariance, Linearity, Segmentation
– These issues address the primary recognition problem: “How is the form of a spoken word recognized from acoustic information in the speech waveform?”
14
Acoustic-Phonetic Invariance
• Proposes that there is a distinct set of acoustic features corresponding to each phoneme, so that each time the phone is produced, the same acoustic features are identified, regardless of context
15
Linearity
• Proposes that in a spoken word, a specific sound corresponds to each phoneme, with units of sound corresponding to phonemes being discrete and ordered in a particular sequence.
16
Segmentation
• The speech signal can be divided (and recombined) into acoustically independent units that correspond to specific phonemes.
17
Problems??
• The 3 principles imply a one-to-one connection between the acoustic and phonemic properties of sounds in words. NO… Evidence indicates that natural speech does not conform to these conditions!
– 1. Acoustic cues outnumber phonemes in words
– 2. Acoustic properties of a phoneme vary in different phonetic contexts
– 3. At a given point there may be overlapping acoustic properties of preceding or following phones
– 4. The articulators move continuously in conversation (coarticulation effects)