Post on 22-Dec-2015
Auditory Objects of Attention
Chris Darwin
University of Sussex
With thanks to:
• Rob Hukin (RA)
• Nick Hill (DPhil)
• Gustav Kuhn (3rd-year project)
• MRC
Need for sound segregation
• Ears receive mixture of sounds
• We hear each sound source as having its own appropriate timbre, pitch, location
• Stored information about sounds (e.g. acoustic/phonetic relations) concerns a single source
Mechanisms of segregation
• Primitive grouping mechanisms based on general heuristics
• Schema-based mechanisms based on specific knowledge.
A Paradox
• We can attend to sounds coming from a particular direction
– Everyday experience
– Auditory RTs faster to cued side (Spence & Driver, 1994)
• Interaural time differences (ITDs) are the main cue to the location of a complex sound (Wightman & Kistler, 1992).
A Paradox
On the other hand
• ITDs are ineffective at grouping together sounds from a single sound source (Culling & Summerfield, 1995; Darwin & Hukin, 1995)
[Figure: left-cochlea and right-cochlea channels at 200, 500, 1000 and 2000 Hz feeding the MSO, with ITDs of +600 µs and -600 µs; vowel labels "EE" and "AR".]
Coincidence detection and ITD
Two models of attention
Attend to common ITD:
• Peripheral filtering into frequency components
• Establish ITD of frequency components
• Attend to common ITD across components
Attend to direction of object:
• Peripheral filtering into frequency components
• Establish ITD of frequency components
• Group components by harmonicity, onset-time etc.
• Establish direction of grouped object
• Attend to direction of grouped object
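The difference between the two models can be illustrated with a toy sketch. Everything here is hypothetical and not from the talk: the `Component` type, the harmonicity test with a 3% tolerance, the use of a mean ITD as the "direction" of a group, and the example values. It only shows that the first scheme selects components one by one, whereas the second selects a whole grouped object.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Component:
    freq_hz: float
    itd_us: float


def attend_common_itd(components, attended_itd_us, tol_us=100.0):
    """Model 1: keep each component whose own ITD matches the attended ITD."""
    return [c for c in components if abs(c.itd_us - attended_itd_us) <= tol_us]


def is_harmonic(freq_hz, f0_hz, tol=0.03):
    """Crude harmonicity test (assumed): within 3% of a multiple of f0."""
    n = max(1, round(freq_hz / f0_hz))
    return abs(freq_hz - n * f0_hz) <= tol * n * f0_hz


def attend_grouped_object(components, f0s_hz, attended_itd_us):
    """Model 2: group by harmonicity first, then pick the group whose
    overall direction (here: mean ITD) is closest to the attended one."""
    groups = [[c for c in components if is_harmonic(c.freq_hz, f0)] for f0 in f0s_hz]
    groups = [g for g in groups if g]
    return min(groups, key=lambda g: abs(mean(c.itd_us for c in g) - attended_itd_us))


# Two harmonic sources (F0 = 100 Hz and 110 Hz). The 500-Hz harmonic of the
# first source carries the "wrong" ITD.
mixture = [
    Component(200, +600), Component(300, +600), Component(400, +600),
    Component(500, -600),
    Component(220, -600), Component(330, -600), Component(440, -600),
]

by_itd = attend_common_itd(mixture, attended_itd_us=+600)
by_object = attend_grouped_object(mixture, f0s_hz=[100, 110], attended_itd_us=+600)
```

In this toy case `by_itd` loses the 500-Hz component, while `by_object` keeps it because it belongs to the harmonic group whose overall direction matches.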
Plan
• Check the Culling & Summerfield result with more natural sounds
• Show evidence that grouping occurs before ITD is calculated across frequency
• Show that ITD can be a very powerful sequential grouping cue
ILD condition
[Figure: carrier sentence "Hello, you'll hear the sound X now" with target vowel /I/ or //; left-ear and right-ear panels, with and without the 600-Hz component.]
Phase Ambiguity
500 Hz: period = 2 ms
L lags by 1.5 ms = L leads by 0.5 ms
• Cross-correlation peaks at +0.5 ms and -1.5 ms
• The auditory system weights towards the peak closest to zero
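The arithmetic behind this ambiguity can be sketched in a few lines. This is an illustrative calculation, not the author's analysis: the function names and the search window are assumptions, and the sign convention follows the slide (positive = left ear leads).

```python
def equivalent_delays(delay_ms, freq_hz, window_ms=2.0):
    """All interaural delays that a pure tone cannot tell apart from
    delay_ms: they differ by whole periods of the tone."""
    period_ms = 1000.0 / freq_hz
    k_max = int(window_ms // period_ms) + 2
    candidates = [delay_ms + k * period_ms for k in range(-k_max, k_max + 1)]
    return sorted(d for d in candidates if abs(d) <= window_ms)


def preferred_delay(delay_ms, freq_hz, window_ms=2.0):
    """Pick the candidate closest to zero delay."""
    return min(equivalent_delays(delay_ms, freq_hz, window_ms), key=abs)


# 500-Hz tone, left ear lagging by 1.5 ms (written as -1.5 ms).
peaks = equivalent_delays(-1.5, 500.0)
chosen = preferred_delay(-1.5, 500.0)
```

Here `peaks` holds the two delays named on the slide, and `chosen` is the one nearest zero, which corresponds to the left ear leading by 0.5 ms.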
Disambiguating phase-ambiguity
• Narrowband noise at 500 Hz with ITD of 1.5 ms (3/4 cycle) heard at lagging side.
• Increasing the noise bandwidth moves the perceived location to the leading side.
• Explained by across-frequency consistency of ITD (Jeffress, Trahiotis & Stern).
Resolving phase ambiguity
500 Hz: period = 2 ms
L lags by 1.5 ms or L leads by 0.5 ms?
300 Hz: period = 3.3 ms
L lags by 1.5 ms or L leads by 1.8 ms?
Actual delay: left ear actually lags by 1.5 ms
[Figure: cross-correlation peaks for noise delayed in one ear by 1.5 ms. x-axis: delay of cross-correlator (ms), ticks at -2.5, -0.5, 1.5 and 3.5; y-axis: frequency of auditory filter (Hz), 200 to 800.]
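The across-frequency argument can be illustrated numerically. The sketch below makes strong simplifying assumptions that are not in the talk: each auditory filter output is treated as a pure tone at the filter's centre frequency, so its cross-correlation is a cosine, and the filters are placed at 200 to 800 Hz in 100-Hz steps. Delays use the sign convention of the figure axis, where the true delay is +1.5 ms.

```python
import numpy as np


def channel_correlation(true_delay_ms, freq_hz, lags_ms):
    """Idealised cross-correlation of one filter channel: a cosine in lag."""
    return np.cos(2.0 * np.pi * (freq_hz / 1000.0) * (lags_ms - true_delay_ms))


def channel_peaks(true_delay_ms, freq_hz, lags_ms):
    """Lags at which a single channel reaches its maximum correlation."""
    corr = channel_correlation(true_delay_ms, freq_hz, lags_ms)
    return lags_ms[corr >= 1.0 - 1e-9]


def consistent_delay(true_delay_ms, filter_freqs_hz, lags_ms):
    """Lag at which the correlation summed across channels is largest."""
    summed = sum(channel_correlation(true_delay_ms, f, lags_ms) for f in filter_freqs_hz)
    return float(lags_ms[np.argmax(summed)])


lags_ms = np.linspace(-3.5, 3.5, 701)  # 0.01-ms steps

# Narrowband case: one channel at 500 Hz gives several equal peaks,
# and the one nearest zero is on the "wrong" side.
narrow_peaks = channel_peaks(1.5, 500.0, lags_ms)
narrow_choice = float(min(narrow_peaks, key=abs))

# Wideband case: only the true delay lines up in every channel.
wide_choice = consistent_delay(1.5, range(200, 900, 100), lags_ms)
```

With one 500-Hz channel the peak nearest zero is at -0.5 ms. Summing across channels leaves a single consistent peak at the true delay of 1.5 ms.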
Segregation by onset-time
[Figure: schematic of components at 200, 400, 600 and 800 Hz, frequency (Hz) against duration (ms). Synchronous panel: ticks at 0 and 400 ms. Asynchronous panel: ticks at 0, 80 and 400 ms.]
ITD: ± 1.5 ms (3/4 cycle at 500 Hz)
Segregated tone changes location
[Figure: pointer IID (dB), from -20 to 20 (R to L), against onset asynchrony (ms) of 0, 20, 40 and 80, for pure and complex tones.]
Segregation by mistuning
[Figure: schematic of components at 200, 400, 600 and 800 Hz, frequency (Hz) against duration (ms), with ticks at 0 and 400 ms and at 0, 80 and 400 ms. Panels: In tune, Mistuned.]
Interim Summary
• ITD ineffective for simultaneous segregation
• Integration of ITD across frequency influenced by grouping cues
Question: Can attention be directed on the basis of ITD to grouped objects?
Attending to one sentence
Could you please write the word dog down now
…dog...
You’ll also hear the sound bird this time
Continuity of attention experiment
"Could you please write the word dog down now"
ITD = +45 µs, +45 µs, +45 µs
Fo = 106 Hz, 106 Hz, 100 Hz
"You'll also hear the sound bird this time"
ITD = -45 µs, -45 µs, -45 µs
Fo = 100 Hz, 100 Hz, 106 Hz
[Figure: time axis from 0.0 to 2.0 s.]
Continuity of Fo vs ITD
• Fo differences: 0, 1, 2, 4 semitones
• ITD differences: ± 45, 91, 181 µs
• Normal: Fo & ITD work together
• Switched: Fo & ITD opposed
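To give a feel for the size of these cues, the following sketch converts between semitones and Hz and expresses the ITDs as sample delays. Two things are assumptions rather than details from the talk: the 100-Hz base used for the semitone steps and the 44.1-kHz sample rate. The function names are illustrative.

```python
import math


def semitones_between(f_low_hz, f_high_hz):
    """Interval between two fundamental frequencies, in semitones."""
    return 12.0 * math.log2(f_high_hz / f_low_hz)


def shift_by_semitones(f0_hz, semitones):
    """Fundamental frequency raised by a number of semitones."""
    return f0_hz * 2.0 ** (semitones / 12.0)


def itd_in_samples(itd_us, sample_rate_hz=44100.0):
    """Interaural delay expressed in samples (sample rate is assumed)."""
    return itd_us * 1e-6 * sample_rate_hz


# 100 Hz vs 106 Hz, the Fo values quoted for the two sentences.
step = semitones_between(100.0, 106.0)

# Upper Fo for differences of 0, 1, 2 and 4 semitones above an assumed 100 Hz.
upper_f0s = [shift_by_semitones(100.0, s) for s in (0, 1, 2, 4)]

# ITDs of 45, 91 and 181 µs as delays in samples at the assumed rate.
sample_delays = [itd_in_samples(itd) for itd in (45, 91, 181)]
```

The 100 Hz and 106 Hz pair is about one semitone apart. At the assumed sample rate, the three ITDs come out close to whole numbers of samples.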
Continuity of ITD very effective
[Figure: y-axis from 50 to 100 against difference in Fo (semitones) of 0, 1, 2 and 4, with separate curves for ITDs of ±45 µs, ±91 µs and ±181 µs.]
Summary
• ITD ineffective for simultaneous grouping
• ITD provides good spatial separation for grouped objects
• Monotone pitch contours ineffective for source continuity
Vocal tract good against reverb
[Figure: effect of reverberation on relative strength of ITD, prosody and vocal tract. y-axis: change in % correct by ITD when opposed by prosody, 0 to 100. Conditions: Fo together, Fo original, Fo apart, Fo original + VT. Reverberation: RT60 = 0 and RT60 = 0.5 s. ITD = ±91 µs.]
Shadowing sentences
Jemma felt stiff and tired after 3 hours in the hot and stuffy room and she would have liked ||
…to go outdoors for a breath of fresh air
We had spent our entire time from Cairo to Luxor in a tiny bus with no proper windows and really wanted ||
…the air conditioning to be switched on
…liked the airconditioning...
Shadowing results
[Figure: switches (against ITD) in shadowing (%), 0 to 50, for Normal and Swapped conditions with same VT and different VT. ITD = ±91 µs. Marked significance levels: p<0.05, p<0.05, p<0.002. Condition labels: +ITD+Prosody; +ITD+Prosody+Vocal Tract; +ITD-Prosody; +ITD-Prosody-Vocal Tract.]