1 psy280: perception prof. anderson department of psychology audition 1 & 2
TRANSCRIPT
1
Psy280: Perception
Prof. Anderson
Department of Psychology
Audition 1 & 2
2
Hearing: What’s it good for?
  Remote sensing
    Not restricted like visual field
    Can sense objects that are not visible
3
Hearing: The sound of silence
  A tree in the forest
    Physical signal but no perception
  One hand clapping
    No physical signal, no perception
  Separate physical quantity from perceptual quality
  Sound is the perceptual correlate of the physical changes in air pressure
    Or water pressure when under water
  John Cage’s 4:33 No. 2, 1962
4
What are the physical attributes associated with sound?
  Loudness
    Amplitude or height of pressure wave
  Pitch
    Frequency: number of times per second (Hz) a pressure wave repeats itself
5
What is sound quality?
  Pure tones
    Single frequency (f)
    Rarely exist in real world
  Complex tones
    More than one f
    Due to resonance
      Air pressure causes reverberations
    E.g., tuning forks
    E.g., Plucking the A string on a guitar
      Fundamental frequency 440 Hz (cycles/s)
      Harmonics
        Reverberations at multiples of the fundamental
        E.g., 880, 1320
        Creates fullness of complex sounds
  Timbre is the relative amplification of harmonics
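The relation between fundamental, harmonics and timbre can be sketched in a few lines of Python. This is only an illustration: the function names and the two amplitude lists ("bright" and "mellow") are hypothetical, chosen to show that two tones can share the same harmonic frequencies and differ only in their relative amplitudes.

```python
import math

def harmonic_frequencies(fundamental_hz, count):
    """Frequencies of the first `count` harmonics (integer multiples of the fundamental)."""
    return [fundamental_hz * n for n in range(1, count + 1)]

def complex_tone(fundamental_hz, harmonic_amplitudes, t):
    """Pressure at time t (seconds) for a sum of sine-wave harmonics.

    harmonic_amplitudes[k] is the relative amplitude of the component
    at (k + 1) * fundamental_hz.
    """
    return sum(
        amp * math.sin(2 * math.pi * (k + 1) * fundamental_hz * t)
        for k, amp in enumerate(harmonic_amplitudes)
    )

# Hypothetical amplitude profiles: same harmonics, different timbre.
bright = [1.0, 0.8, 0.6]
mellow = [1.0, 0.3, 0.1]

print(harmonic_frequencies(440, 3))
print(complex_tone(440, bright, 0.0004), complex_tone(440, mellow, 0.0004))
```

Both waveforms repeat every 1/440 s, so they share a pitch; only the shape within each cycle differs.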
6
The human ear
  Outer ear
    Focusing of sound
    Resonance amplifies 2000-5000 Hz range
    Converts from air to mechanical vibration
  Middle ear
    Amplification
      Needed because inner-ear fluid is denser than air
      Focus vibrations onto stapes/oval window
      Increased leverage from ossicles
  Inner ear
    Sensory transduction
      Physical to neural energy
      Fluid pressure changes
      Bending of hair cells
7
Auditory sensory transduction: The inner ear
  Cochlea
    Coiled and liquid filled
    3 layers
  Cochlear partition
    Contains organ of Corti
  Organ of Corti
    Cilia (hair) cells
    Between basilar and tectorial membranes
  Transduction
    Movement of cilia between membranes
8
Auditory transduction
  Bending -> physical energy
    Converted to neural signals
  Bend one direction -> depolarization
    More likely to fire AP
  Other direction -> hyperpolarization
    Less likely to fire AP
9
Auditory pathways
10
Audition: What and where
  What is it?
    *Pitch
    Identification
      Surprisingly, little is known beyond speech
  Where is it?
    *Location
11
What: Pitch
  How does neural firing signal different pitches?
    1) Timing codes
    2) Place codes
12
Pitch: Temporal coding
  Idea: Diff f’s signaled by rate of neuronal firing
  Hair cell response
    Bend one direction -> depolarization
    Other direction -> hyperpolarization
  Result?
    Bursting pattern of neural response related to frequency of oscillation
13
Problems with temporal coding
  Problem: A single neuron can’t fire at the rate necessary to represent higher f tones
    E.g., 1000-20,000 Hz (i.e., 1000-20,000 cycles per second)
    Max neuron firing rate: 500-800 per second
  Solution: volley principle
    No single neuron represents f
    Coding across many neurons with staggered firing rates
  Evidence: Phase locking
    Diff neurons respond to diff peaks
    Not every peak
    Pool across multiple neurons to represent high f’s
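The arithmetic behind the volley principle can be made concrete with a small sketch. It is an idealization, not a model of real auditory nerve fibers: it assumes perfectly phase-locked neurons that take turns, each firing on every n-th pressure peak, and the function names are hypothetical.

```python
import math

def neurons_needed(tone_hz, max_rate_hz):
    """Smallest number of neurons that can jointly mark every cycle of the tone."""
    return math.ceil(tone_hz / max_rate_hz)

def volley_spike_times(tone_hz, n_neurons, n_cycles):
    """Spike times per neuron: neuron i fires on peaks i, i + n, i + 2n, ..."""
    period = 1.0 / tone_hz
    return [
        [cycle * period for cycle in range(i, n_cycles, n_neurons)]
        for i in range(n_neurons)
    ]

tone = 4000                      # Hz, well above any single neuron's limit
n = neurons_needed(tone, 800)    # 800 spikes/s is the upper limit quoted above
trains = volley_spike_times(tone, n, n_cycles=20)

# Pooling the staggered trains recovers one spike per cycle of the tone.
pooled = sorted(t for train in trains for t in train)
print(n, len(pooled), tone / n)
```

No single train contains every peak, but the pooled train does, which is the sense in which frequency is coded across neurons.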
14
Pitch: Place coding
  Related to doctrine of specific nerve energies
  What is pitch?
    Activation of different places in auditory system
  Frequency specific
    Tonotopy
      Cochlear
      Brainstem
      Cortical
  Stimulate these regions
    Should result in pitch perception
Owl brainstem
Human auditory cortex
15
Place coding starts in cochlea
  Von Bekesy studied basilar membrane in cadavers
    Base narrower and stiffer
    Apex wider and more flexible
  Observed traveling waves
    Diff frequencies (f) result in waves w/ diff envelopes
    Higher f: Peak closer to base
    Lower f: Peak closer to apex
  Thus, f related to “place” where peak fluctuation occurs
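One way to put numbers on the frequency-to-place relation is Greenwood's empirical fit for the human cochlea. The formula and its constants are an outside assumption brought in for illustration; they do not appear in the lecture. Position is expressed as a fraction of basilar membrane length measured from the apex (0 = apex, 1 = base).

```python
import math

# Greenwood-style fit for the human cochlea (assumed constants, not from the slides).
A, a, K = 165.4, 2.1, 0.88

def place_to_frequency(x_from_apex):
    """Characteristic frequency (Hz) at fractional distance x from the apex."""
    return A * (10 ** (a * x_from_apex) - K)

def frequency_to_place(freq_hz):
    """Fractional distance from the apex where a tone of freq_hz peaks."""
    return math.log10(freq_hz / A + K) / a

for f in (200, 1000, 8000):
    print(f, "Hz peaks at", round(frequency_to_place(f), 2), "of the way from apex to base")
```

Under this fit, higher frequencies map to positions nearer the base and lower frequencies to positions nearer the apex, matching the traveling-wave observations.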
16
Frequency tuning: Neural place coding
  Tonotopic arrangement of hair cell nerves
    Diff nerves innervate diff parts of basilar membrane
    Allows for “place” code for frequency
Frequency tuning curves of single hair cells
17
Complex tones: Fourier decomposition
  Basilar membrane acts as f analyzer
  Breaks down complex f inputs into constituent pure tone components
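What "breaking a complex tone into pure tone components" means can be shown with a plain discrete Fourier transform. This is a mathematical sketch, not a model of the basilar membrane; the sample rate, duration and component amplitudes are arbitrary choices made so that each component falls exactly on an analysis bin.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Amplitude of each frequency bin up to half the sample rate (naive DFT)."""
    n = len(samples)
    mags = []
    for k in range(n // 2):
        s = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples))
        mags.append(2 * abs(s) / n)
    return mags

sample_rate = 8000            # samples per second (assumed)
n = 400                       # 0.05 s of signal, so bins are 20 Hz apart
components = {440: 1.0, 880: 0.5, 1320: 0.25}   # Hz -> relative amplitude

# Build the complex tone by summing its pure-tone components.
samples = [
    sum(amp * math.sin(2 * math.pi * f * i / sample_rate) for f, amp in components.items())
    for i in range(n)
]

mags = dft_magnitudes(samples)
bin_hz = sample_rate / n
peaks = [round(k * bin_hz) for k, m in enumerate(mags) if m > 0.1]
print(peaks)
```

The analysis recovers the frequencies that were mixed together, along with their relative amplitudes.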
18
Auditory masking: Evidence for cochlear place coding
  Auditory masking
    Presence of certain tones decreases perception of nearby tones
    Similar f result in greater masking
  Asymmetry in spread of masking
    Consistent with basilar vibrational overlap
    E.g. 400 Hz mask overlaps more with 800 than 200 Hz
400 Hz mask: Increases threshold for 800 more than 200 Hz
19
Mystery of the missing fundamental
  400 Hz fundamental plus harmonics (800, 1200, 1600, 2000)
    Sounds like 400 Hz pitch with complex timbre
  What if we remove the fundamental f (400 Hz)?
    Perceived pitch doesn’t change!
    Hence: The missing fundamental
  Problem for place coding
    No direct stimulation of 400 Hz on basilar membrane
  Harmonic structure determines perceived pitch
    Not what is present on basilar membrane
    What we hear is not what the basilar membrane tells us, but what our brain does
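A toy calculation shows why the harmonic structure alone still points to 400 Hz. The greatest common divisor is used here only as a stand-in for "the spacing the harmonics share"; it is an illustrative assumption, not a claim about how the brain computes pitch.

```python
from functools import reduce
from math import gcd

def implied_fundamental(component_hz):
    """Largest frequency of which every component is an integer multiple."""
    return reduce(gcd, component_hz)

with_fundamental = [400, 800, 1200, 1600, 2000]
without_fundamental = [800, 1200, 1600, 2000]

print(implied_fundamental(with_fundamental))
print(implied_fundamental(without_fundamental))
```

Both lists imply the same fundamental, even though the second contains no energy at 400 Hz.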
20
What does Barry White sound like on the telephone?
  Telephone carries 300-3400 Hz
  Typical male voice
    Fundamental f = 120 Hz
  Barry White
    30 Hz?
  Can’t speak to Barry on the telephone?
  Missing fundamental allows us to hear “virtual” pitch of voice
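The telephone case can be worked through numerically. The sketch assumes an idealized voice made of exact integer harmonics and a telephone that passes 300-3400 Hz perfectly and nothing else; the function names are hypothetical.

```python
from functools import reduce
from math import gcd

PASSBAND_HZ = (300, 3400)   # telephone band quoted above

def harmonics_through_phone(fundamental_hz, n_harmonics=200):
    """Harmonics of an idealized voice that fall inside the telephone passband."""
    low, high = PASSBAND_HZ
    return [
        fundamental_hz * n
        for n in range(1, n_harmonics + 1)
        if low <= fundamental_hz * n <= high
    ]

def virtual_pitch(component_hz):
    """Common spacing of the surviving harmonics (toy stand-in for perceived pitch)."""
    return reduce(gcd, component_hz)

typical_male = harmonics_through_phone(120)
very_low_voice = harmonics_through_phone(30)

print(120 in typical_male, virtual_pitch(typical_male))
print(virtual_pitch(very_low_voice))
```

The 120 Hz fundamental never reaches the listener, yet the harmonics that do get through are still spaced 120 Hz apart.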
21
If it’s too loud, you’re too old
  dB (SPL) scale
    Loudness doubles about every 10 dB at 1000 Hz
  Audibility curves
    Loudness varies with f
    Low volume
      Attenuated low and high f relative to midrange
    High volume
      Less frequency attenuation
    Low volume sounds muddy
      Mostly mid range
    I like my music loud
Pain and pleasure
Each curve represents equal loudness
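The dB (SPL) scale and the 10 dB doubling rule can be written out as two small functions. The 20 micropascal reference pressure is the standard SPL reference and is not stated on the slide; the doubling rule is applied exactly as quoted, as a rule of thumb for tones near 1000 Hz.

```python
import math

P_REF = 20e-6   # reference pressure in pascals (standard for dB SPL; assumed here)

def spl_db(pressure_pa):
    """Sound pressure level in dB SPL for a given pressure amplitude."""
    return 20 * math.log10(pressure_pa / P_REF)

def loudness_ratio(db_increase):
    """Approximate perceived loudness ratio, using 'doubles every 10 dB'."""
    return 2 ** (db_increase / 10)

def pressure_ratio(db_increase):
    """Physical pressure ratio that corresponds to a given dB increase."""
    return 10 ** (db_increase / 20)

print(spl_db(0.2))
print(loudness_ratio(10), pressure_ratio(10))
```

A 10 dB step roughly triples the pressure amplitude while only doubling the loudness, which is why the physical and perceptual scales have to be kept apart.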
22
Otoacoustic emissions: Talking ears
  Ears don’t only receive sounds, they make them!
    Discovered in 1978
    Recorded with tiny microphones
  Occur spontaneously and also in response to sound
    It’s like your ears are talking back!
  Created by movement of outer hair cells (ohc)
    Part of auditory sensitivity is movement of ohc to change region specific flexibility of basilar membrane
    Allows tuning curves to be so narrow
  Hearing impairments often start with loss of ohc function
23
Auditory localization
  Where is the sound coming from?
    Distance
    Elevation (vertical)
    Azimuth (horizontal)
  Localization not nearly as precise as vision
    Localization within 2-3.5 degrees in front of head
    20 degrees behind head
  Suggests important role of vision
    Tunes auditory localization
24
Why is auditory localization not obvious?
  Vision
    Different locations stimulate different photoreceptors in eye
  Audition
    No such separation of sound sources on sensory surface
    Sources combine to equally stimulate ear receptors
25
Why have two ears?
  Two aural perspectives on the world
    Like vision, can be used to get different sound pictures of environment
  Binaural cues
    The disparities between ears are used for localization
26
Azimuth
  Interaural (between ears) Time Difference (ITD)
    Air pressure changes travel very slowly relative to the speed of light
    ITD at side = max 600 µs
    ITD at front = 0
    Can induce perception of location by varying ITD using headphones
  Interaural Level (intensity) Difference (ILD)
    Amplitude decreases w/ distance
    Head casts sound/acoustic shadow
      Reduced amplitude due to reflection
      Measure w/ tiny microphones
    f dependent
      Greater shadow for higher f
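The ITD figures above can be approximated with a simple path-difference model. The ear separation (0.2 m), the speed of sound (343 m/s) and the straight-line geometry are all assumptions made for this sketch; real heads add diffraction effects that the model ignores.

```python
import math

SPEED_OF_SOUND = 343.0   # m/s in air (assumed)
EAR_SEPARATION = 0.2     # metres between the ears (assumed)

def itd_seconds(azimuth_deg):
    """Interaural time difference for a distant source (0 = straight ahead, 90 = to the side)."""
    return EAR_SEPARATION * math.sin(math.radians(azimuth_deg)) / SPEED_OF_SOUND

def headphone_delay_samples(azimuth_deg, sample_rate=44100):
    """Whole-sample delay to apply to one headphone channel to mimic that ITD."""
    return round(itd_seconds(azimuth_deg) * sample_rate)

for azimuth in (0, 30, 90):
    print(azimuth, round(itd_seconds(azimuth) * 1e6), "microseconds",
          headphone_delay_samples(azimuth), "samples")
```

With these assumed values the model gives zero ITD straight ahead and a little under 600 µs directly to the side, in line with the figures on the slide.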
27
Elevation
  ITD/ILD not very useful
  Use spectral cues
    Frequency information can result in different perceptual qualia
      Monaural: f serves as signal for pitch
      Binaural: f serves as signal for location
  Pinnae differentially absorb f
    Result: Notches in frequency spectra
Above
Level
Below
28
Distance
  At close distances (< 1 meter)
    ILD can discriminate near and far
    At very close distances ILD is very large (e.g. 20 dB)
    But what’s that going to do for us?
  At far distances
    We are very poor judges for unfamiliar sounds
      Suggests that sound serves as signal for visual search
    Use sound level for familiar sources
    Frequency: Auditory atmospheric haze
      Absorption of high f
      Sound muffled
    Auditory parallax
      Sounds move faster across ears at near relative to far distances
29
Brain basis for localization
  ITD detectors
    Brainstem: Superior olivary nucleus
    Primary auditory cortex
  Coincidence detection
    Neurons fire maximally when signals arrive at same time
      Thus: “coincidence”
    Axonal distances create input delays
Sound to right
Sound to left
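Coincidence detection with axonal delays can be sketched as a bank of detectors, each adding a different delay to the left-ear input. This follows the general delay-line idea described above; the Gaussian response curve, the 50 µs tuning width and the set of delays are hypothetical values chosen only to make the example run.

```python
import math

def detector_response(itd_us, left_delay_us, width_us=50.0):
    """Response of one detector whose axon adds left_delay_us to the left-ear input.

    itd_us is (arrival at right ear) minus (arrival at left ear), in microseconds,
    so it is positive when the sound reaches the left ear first.
    The two inputs coincide, and the response peaks, when left_delay_us == itd_us.
    """
    mismatch = left_delay_us - itd_us
    return math.exp(-((mismatch / width_us) ** 2))

def best_detector(itd_us, left_delays_us):
    """Index of the detector that fires most strongly for this ITD."""
    responses = [detector_response(itd_us, d) for d in left_delays_us]
    return responses.index(max(responses))

# A hypothetical array of detectors spanning the range of possible ITDs.
delays = list(range(-600, 601, 100))

for itd in (-600, 0, 300):
    print(itd, "->", delays[best_detector(itd, delays)])
```

Which detector is most active then stands in for the ITD, turning a timing difference into a place code.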
30
Auditory scene analysis
  How do we segregate different sounds being produced by many sources simultaneously?
  How do we tell what frequencies belong to what source?
    E.g., Cocktail party
    Don’t perceive an unorganized jumble of frequencies
    Not simply high vs low f
      Most f ranges overlap
  How do we segregate information as belonging to distinct auditory objects?
31
Principles of auditory grouping
  Like gestalt visual principles
  Auditory stream segregation
    Similarity
      Timbre
      Location
      Pitch
      Time
1 stream
2 streams
32
Auditory-visual interactions: Location and pitch
  Visual capture of sound
    Location: Ventriloquism effect
    Pitch: McGurk effect
      “Ba” “Va” “Tha” “Da”
  Visual information is integrated with audition
    Creates fused auditory visual perception
33
Auditory-visual interactions: Location and pitch
  Auditory experience is much more than pressure level changes