More Motor Theory + Fricative Acoustics

March 30, 2010

To Begin With…

• Homeworks!

• Today:

• Some more thoughts on perception

• And then a brief review of obstruent acoustics

• On Thursday, we’ll be doing:

• A brief description of vocal tract musculature.

• Static palatography demo!

• You’re welcome to bring in a camera, if you so desire.

• Also, a link:

• http://sakurakoshimizu.blogspot.com/

Revised Schedule

• Next week:

• Tuesday: auditory perception

• Thursday: speech synthesis + SSRIs

• The final week:

• Tuesday: “3-5 cool things” about your project language + a chance for catch-up and review

• Thursday: Longer/final project presentations (for volunteers only)

Motor Theory Review

• Last time, we discussed the basics of the motor theory of speech perception.

• Some basic precepts:

• Humans have a special neurological module for speech perception.

• (And other species don’t.)

• Speech perception doesn’t work on the basis of general principles.

• We perceive speech as gestures, not sounds.

• Some basic evidence:

• Categorical perception

A Modular Mind Model

central processes: judgment, imagination, memory, attention

modules: vision, hearing, touch, speech

transducers: eyes, ears, skin, etc.

external, physical reality

More Evidence for Modularity

• It has also been observed that speech is perceived multi-modally.

• i.e., we can perceive it through vision, as well as hearing (or some combination of the two).

• We’re perceiving “gestures”

• …and the gestures are abstract.

• Interesting evidence: McGurk Effect

McGurk Effect, revealed

Audio     Visual    Perceived

ba    +   ga        da

ga    +   ba        ba, bga, gba

• Some interesting facts:

• The McGurk Effect is exceedingly robust.

• Adults show the McGurk Effect more than children.

• Americans show the McGurk Effect more than Japanese.

Original McGurk Data

• Stimulus: auditory ba-ba + visual ga-ga

• Response types:

Auditory: ba-ba    Fused: da-da

Visual: ga-ga      Combo: gabga, bagba

Age      Auditory (%)   Visual (%)   Fused (%)   Combo (%)

3-5      19             36           81          0

7-8      36             0            64          0

18-40    2              0            98          0

Original McGurk Data

• Stimulus: auditory ga-ga + visual ba-ba

• Response types:

Auditory: ga-ga    Fused: da-da

Visual: ba-ba      Combo: gabga, bagba

Age      Auditory (%)   Visual (%)   Fused (%)   Combo (%)

3-5      57             10           0           19

7-8      36             21           11          32

18-40    11             31           0           54

Audio-Visual Sidebar

• Visual cues affect the perception of speech in non-mismatched conditions, as well.

• Scientific studies of lipreading date back to the early twentieth century

• The original goal: improve the speech perception skills of the hearing-impaired

• Note: visual speech cues often complement audio speech cues

• In particular: place of articulation

• However, training people to become better lipreaders has proven difficult…

• Some people get it; some people don’t.

Sumby & Pollack (1954)

• First investigated the influence of visual information on the perception of speech by normal-hearing listeners.

• Method:

• Presented individual word tokens to listeners in noise, with simultaneous visual cues.

• Task: identify spoken word

• Clear:

• +10 dB SNR:

• + 5 dB SNR:

• 0 dB SNR:

Sumby & Pollack data

[Figure panels: Auditory-Only, Audio-Visual]

• Visual cues provide an intelligibility boost equivalent to a 12 dB increase in signal-to-noise ratio.
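To get a feel for what a 12 dB difference in signal-to-noise ratio means, the sketch below converts decibels into ratios using the standard definitions (dB = 10·log10 of a power ratio, or 20·log10 of an amplitude ratio). The function names are made up for this illustration and are not taken from Sumby & Pollack.

```python
import math


def db_to_power_ratio(db):
    """Convert a level difference in dB to a ratio of signal powers."""
    return 10 ** (db / 10)


def db_to_amplitude_ratio(db):
    """Convert a level difference in dB to a ratio of signal amplitudes."""
    return 10 ** (db / 20)


def snr_db(signal_power, noise_power):
    """Signal-to-noise ratio in dB, given signal and noise power."""
    return 10 * math.log10(signal_power / noise_power)


# The SNR steps listed for the noise conditions, plus the 12 dB visual boost.
for db in (0, 5, 10, 12):
    print(f"{db:>2} dB -> power ratio {db_to_power_ratio(db):.2f}, "
          f"amplitude ratio {db_to_amplitude_ratio(db):.2f}")
```

On this reckoning, a 12 dB boost corresponds to a signal roughly 16 times more powerful relative to the noise (about 4 times larger in amplitude).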

Tadoma Method

• Some deaf-blind people learn to perceive speech through the tactile modality, by using the Tadoma method.

Audio-Tactile Perception

• Fowler & Dekle: tested ability of (naive) college students to perceive speech through the Tadoma method.

• Presented synthetic stops auditorily

• Combined with mismatched tactile information:

• Ex: audio /ga/ + tactile /ba/

• Also combined with mismatched orthographic information:

• Ex: audio /ga/ + orthographic /ba/

• Task: listeners reported what they “heard”

• Tactile condition biased listeners more towards “ba” responses

Fowler & Dekle data

[Figure panels: orthographic mismatch condition (read “ba”); tactile mismatch condition (felt “ba”)]

Another Piece of the Puzzle

• Another interesting finding which has been used to argue for the “speech is special” theory is duplex perception.

• Take an isolated F3 transition:

and present it to one ear…

Do the Edges First!

• While presenting this spectral frame to the other ear:

Two Birds with One Spectrogram

• The resulting combo is perceived in duplex fashion:

• One ear hears the F3 “chirp”;

• The other ear hears the combined stimulus as “da”.

Duplex Interpretation

• Check out the spectrograms in Praat.

• Mann and Liberman (1983) found:

• Discrimination of the F3 chirps is gradient when they’re in isolation…

• but categorical when combined with the spectral frame.

• (Compare with the F3 discrimination experiment with Japanese and American listeners)

• Interpretation: the “special” speech processor puts the two pieces of the spectrogram together.

fMRI data

• Benson et al. (2001)

• Non-Speech stimuli = notes, chords, and chord progressions on a piano

fMRI data

• Benson et al. (2001)

• Difference in activation for natural speech stimuli versus activation for sinewave speech stimuli

Mirror Neurons

• In the 1990s, researchers in Italy discovered what they called mirror neurons in the brains of macaques.

• Macaques had been trained to make grasping motions with their hands.

• Researchers recorded the activity of single neurons while the monkeys were making these motions.

• Serendipity:

• the same neurons fired when the monkeys saw the researchers making grasping motions.

• a neurological link between perception and action.

• Motor theory claim: same links exist in the human brain, for the perception of speech gestures

Motor Theory, in a nutshell

• The big idea:

• We perceive speech as abstract “gestures”, not sounds.

• Evidence:

1. The perceptual interpretation of speech differs radically from the acoustic organization of speech sounds

2. Speech perception is multi-modal

3. Direct (visual, tactile) information about gestures can influence/override indirect (acoustic) speech cues

4. Limited top-down access to the primary, acoustic elements of speech

Moving On…

• One important lesson to take from the motor theory perspective is:

• The dynamics of speech are generally more important to perception than static acoustic cues.

• Note: visual chimerism and March Madness.

Auditory Chimeras

• Speech waveform + music spectrum: [audio demos with 1, 2, 4, 8, 16 and 32 frequency bands]

• Music waveform + speech spectrum: [audio demos with 1, 2, 4, 8, 16 and 32 frequency bands]

• Originals: [audio demos]

Source: http://research.meei.harvard.edu/chimera/chimera_demos.html

Auditory Chimeras

• Speech1 waveform + speech2 spectrum: [audio demos with 1, 2, 4, 6, 8 and 16 frequency bands]

• Speech2 waveform + speech1 spectrum: [audio demos with 1, 2, 4, 6, 8 and 16 frequency bands]

• Originals: [audio demos]

Finally, Fricatives

• The last type of sound we need to consider in speech acoustics is an aperiodic, continuous noise.

• Ideally:

• Q: What would the spectrum of this waveform look like?

White Noise Spectrum

• Technical term: White noise

• has an unlimited range of frequency components

• Analogy: white light is what you get when you combine all visible frequencies of the electromagnetic spectrum
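One way to see the "all frequencies" property is to generate some noise and look at its spectrum. The sketch below is only an illustration: it assumes Gaussian random samples as the noise source, uses a deliberately naive discrete Fourier transform written with the standard library, and averages a few short frames. The function names, frame sizes and seed are arbitrary choices for the example.

```python
import cmath
import math
import random


def power_spectrum(frame):
    """Naive DFT power for bins 1 .. N/2 - 1 (DC and Nyquist bins left out)."""
    n = len(frame)
    spectrum = []
    for k in range(1, n // 2):
        total = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame))
        spectrum.append(abs(total) ** 2 / n)
    return spectrum


def average_white_noise_spectrum(n_frames=8, frame_len=256, seed=0):
    """Average the power spectra of several frames of Gaussian noise."""
    rng = random.Random(seed)
    averaged = [0.0] * (frame_len // 2 - 1)
    for _ in range(n_frames):
        frame = [rng.gauss(0.0, 1.0) for _ in range(frame_len)]
        for i, power in enumerate(power_spectrum(frame)):
            averaged[i] += power / n_frames
    return averaged


avg = average_white_noise_spectrum()
half = len(avg) // 2
low = sum(avg[:half]) / half                  # mean power, lower frequencies
high = sum(avg[half:]) / (len(avg) - half)    # mean power, higher frequencies
print(f"mean power, lower half of the spectrum: {low:.2f}")
print(f"mean power, upper half of the spectrum: {high:.2f}")
```

Any single frame looks ragged, but on average the lower and upper halves of the spectrum carry about the same power, which is what "white" means here.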

Turbulence

• We can create aperiodic noise in speech by taking advantage of the phenomenon of turbulence.

• Some handy technical terms:

• laminar flow: a fluid flowing in parallel layers, with no disruption between the layers.

• turbulent flow: a fluid flowing with chaotic property changes, including rapid variation in pressure and velocity in both space and time

• Whether or not airflow is turbulent depends on:

• the volume velocity of the fluid

• the area of the channel through which it flows

Turbulence

• Turbulence is more likely with:

• a higher volume velocity

• less channel area

• All fricatives therefore require:

• a narrow constriction

• high airflow

Fricative Specs

• Fricatives require great articulatory precision.

• Some data for [s] (Subtelny et al., 1972):

• alveolar constriction 1 mm

• incisor constriction 2-3 mm

• Larger constrictions result in [ʃ]-like sounds.

• Generally, fricatives have a cross-sectional area between 6 and 12 mm².

• Cross-sectional areas greater than 20 mm² result in laminar flow.

• Airflow = 330 cm³/sec for voiceless fricatives

• …and 240 cm³/sec for voiced fricatives
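The numbers above can be combined in a back-of-the-envelope calculation. The sketch below computes the particle velocity of the air in the constriction (volume velocity divided by channel area) and a rough Reynolds number, a standard dimensionless index of how prone a flow is to turbulence. Several things here are assumptions made for illustration and do not come from the slides: the constriction is treated as a circular opening, the kinematic viscosity of air is taken to be about 0.15 cm²/s, and no particular critical value is claimed.

```python
import math

AIR_KINEMATIC_VISCOSITY = 0.15  # cm^2/s, approximate value for air (assumption)


def particle_velocity(volume_velocity, area):
    """Particle velocity (cm/s) from volume velocity (cm^3/s) and area (cm^2)."""
    return volume_velocity / area


def reynolds_number(volume_velocity, area):
    """Rough Reynolds number for flow through a constriction.

    Assumes a circular opening, so the characteristic dimension is the
    diameter of a circle with the same cross-sectional area.
    """
    velocity = particle_velocity(volume_velocity, area)
    diameter = math.sqrt(4 * area / math.pi)
    return velocity * diameter / AIR_KINEMATIC_VISCOSITY


airflow = 330.0  # cm^3/s, the voiceless-fricative figure quoted above
for area_mm2 in (6, 12, 20):
    area_cm2 = area_mm2 / 100  # 1 cm^2 = 100 mm^2
    print(f"area {area_mm2:>2} mm^2: "
          f"velocity {particle_velocity(airflow, area_cm2):7.0f} cm/s, "
          f"Re ~ {reynolds_number(airflow, area_cm2):6.0f}")
```

For a fixed airflow, the index rises as the channel narrows, in line with the rule of thumb that turbulence is more likely with higher volume velocity and less channel area.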

Turbulence Sources

• For fricatives, turbulence is generated by forcing a stream of air at high velocity through either a narrow channel in the vocal tract or against an obstacle in the vocal tract.

• Channel turbulence

• produced when airflow escapes from a narrow channel and hits inert outside air

• Obstacle turbulence

• produced when airflow hits an obstacle in its path

Channel vs. Obstacle

• Almost all fricatives involve an obstacle of some sort.

• General rule of thumb: obstacle turbulence is much noisier than channel turbulence

• [f] vs. [ɸ]

• Also: the more perpendicular the obstacle is to the airflow, the louder the obstacle turbulence

• [s] vs. [x]

• [x] is a “wall fricative”

Sibilants

• Alveolar, dental and post-alveolar fricatives form a special class (the sibilants) because their obstacle is the back of the upper teeth.

• This yields high intensity turbulence at high frequencies.

[ʃ] vs. [θ]

“shy” “thigh”

Fricative Noise

• Fricative noise has some inherent spectral shaping

• …like “spectral tilt”

• Note: this is a source characteristic

• This resembles what is known as pink noise:

• Compare with white noise:
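As a small worked example of the difference in tilt: ideal pink noise has power density proportional to 1/f, so its level falls by about 3 dB for every doubling of frequency, while ideal white noise stays level. The sketch below simply evaluates those two idealizations; the function names and the 1000 Hz reference are arbitrary choices for the illustration, and real fricative noise only resembles the pink-noise case.

```python
import math


def pink_noise_level_db(freq_hz, ref_hz=1000.0):
    """Level of ideal 1/f (pink) noise at freq_hz, in dB re: its level at ref_hz."""
    return 10 * math.log10(ref_hz / freq_hz)


def white_noise_level_db(freq_hz, ref_hz=1000.0):
    """Level of ideal white noise is the same at every frequency."""
    return 0.0


for freq in (1000, 2000, 4000, 8000):
    print(f"{freq:>5} Hz: pink {pink_noise_level_db(freq):6.2f} dB, "
          f"white {white_noise_level_db(freq):5.2f} dB")
```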

Fricative Shaping

• The turbulence spectrum may be filtered by the resonating tube in front of the fricative.

• (Due to narrowness of constriction, back cavity resonances don’t really show up.)

• As usual, resonance is determined by length of the tube in front of the constriction.

• The longer the tube, the lower the “cut-off” frequency.

• A basic example:

• [s] vs. [ʃ]
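The link between tube length and resonance can be sketched with the usual quarter-wavelength approximation, treating the cavity in front of the constriction as a uniform tube that is closed at the constriction and open at the lips. Everything specific below is an assumption for illustration: the speed of sound is taken as 35,000 cm/s, and the two front-cavity lengths are made-up values for a shorter and a longer front cavity, not measurements of any particular fricative.

```python
SPEED_OF_SOUND = 35000.0  # cm/s, approximate value for warm, moist air (assumption)


def front_cavity_resonance(length_cm, n=1):
    """n-th resonance (Hz) of a tube closed at one end and open at the other."""
    return (2 * n - 1) * SPEED_OF_SOUND / (4 * length_cm)


# Hypothetical front-cavity lengths, chosen only to show the trend.
for label, length_cm in (("shorter front cavity", 1.5),
                         ("longer front cavity", 2.5)):
    print(f"{label} ({length_cm} cm): lowest resonance "
          f"~{front_cavity_resonance(length_cm):.0f} Hz")
```

Under these assumptions, lengthening the front cavity from 1.5 cm to 2.5 cm lowers the lowest resonance from roughly 5,800 Hz to 3,500 Hz.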

[s] vs. [ʃ]

“sigh” “shy”

Sampling Rates Revisited

• Remember: Digital representations of speech can only capture frequency components up to half the sampling rate

• the Nyquist frequency

• Speech should be sampled at a rate of at least 44,100 Hz

(although there is little frequency information in speech above 10,000 Hz)

• [s] has its highest acoustic energy between about 3,500 and 10,000 Hz

• Note: telephones sample at 8000 Hz

• 44100 Hz

• 8000 Hz
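The arithmetic behind the comparison can be sketched as follows. The Nyquist frequency is half the sampling rate, and the second function asks how much of the 3,500 to 10,000 Hz band quoted above for [s] lies below it. Note that this measures bandwidth, not energy, and that the function names are made up for the illustration.

```python
def nyquist_frequency(sampling_rate_hz):
    """Highest frequency a given sampling rate can represent."""
    return sampling_rate_hz / 2


def fraction_of_band_captured(band_low_hz, band_high_hz, sampling_rate_hz):
    """Share of a frequency band (by width) that lies below the Nyquist frequency."""
    nyquist = nyquist_frequency(sampling_rate_hz)
    captured = max(0.0, min(band_high_hz, nyquist) - band_low_hz)
    return captured / (band_high_hz - band_low_hz)


for rate in (44100, 8000):
    share = fraction_of_band_captured(3500, 10000, rate)
    print(f"{rate:>5} Hz sampling: Nyquist {nyquist_frequency(rate):.0f} Hz, "
          f"captures {share:.0%} of the 3500-10000 Hz band")
```

At 8,000 Hz the Nyquist frequency is 4,000 Hz, so only the bottom 500 Hz of that band survives, which is why [s] is so hard to make out over the telephone.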

Further Back

[xoma]

palatal vs. velar

• In more posterior fricatives, turbulence noise is generally shaped like a vowel made at the same place of articulation.

Even Further Back

• Examples from Hebrew:

At the Tail End

• [h] exhibits a lot of coarticulation

• [h] is not really a “fricative”;

• it’s more like a whispered or breathy voiced vowel.

“heed” “had”

Aspirated Fricatives

• Like stops, fricatives can be aspirated.

• [h] follows the supraglottal frication in the vocal tract.

• Examples from Chinese:

[tsa] [tsʰa]

Back at the Ranch

• There is not much of a resonating filter in front of labial fricatives…

• so their spectrum is flat and diffuse

• (like bilabial stop release bursts)

• Note: labio-dentals are more intense than bilabial fricatives

• (channel vs. obstacle turbulence)

Fricative Internal Cues

• The articulatory precision required by fricatives means that they are less affected by context than stops.

• It’s easy for listeners to distinguish between the various fricative places on the basis of the frication noise alone.

• Result of both filter and source differences.

• Examples:

• There is, however, one exception to the rule…

Huh?

• The two most confusable consonants in the English language are [f] and [θ].

• (Interdentals also lack a resonating filter)

Helping Out

• Transition cues may partially distinguish labio-dentals from interdentals.

• Normally, transitions for fricatives are similar to transitions for stops at the same place of articulation.

• Nonetheless, phonological confusions can emerge:

• Some dialects of English substitute [f] for [θ].

• Visual cues may also play a role…