Auditory Search:
The Deployment of Attention within a Complex Auditory Scene
by
Susan Gillingham
A thesis submitted in conformity with the requirements
for the degree of Master of Arts
Department of Psychology
University of Toronto
© Copyright by Susan Gillingham (2012)
ABSTRACT
Auditory search: The deployment of attention within a complex auditory scene
Master of Arts, 2012
Susan Gillingham
Department of Psychology
University of Toronto
Current theories of auditory attention are largely based upon studies that either examine
the presentation of a single auditory stimulus or require the identification and labeling of
stimuli presented sequentially. Whether these theories apply in more complex
ecologically-valid environments where multiple sound sources are simultaneously active
is still unknown. This study examined the pattern of neuromagnetic responses elicited
when participants had to perform a search in an auditory language-based ‘scene’ for a
stimulus matching an imperative target held in working memory. The analysis of source
waveforms revealed left lateralized patterns of activity that distinguished target present
from target absent trials. Similar source waveform amplitudes were found when the
target was presented in the left or right hemispace. The results suggest that auditory
search for speech sounds engages a left-lateralized process in the superior temporal gyrus.
ACKNOWLEDGMENTS
Many people contributed to the completion of this thesis. I would like to specially thank
my academic supervisor, Dr. Claude Alain, for the opportunity to engage in this exciting
area of research and for his guidance and patience, and his lab manager, Yu He, who
played such an integral role in the initiation and monitoring of this project’s progress. I
am grateful to my committee members, Dr. Bradley Buchsbaum and Dr. Bernhard Ross
for their advice and guidance in steering the project down future paths. And finally to all
the members of the MEG and EEG labs at the Rotman Research Institute, the many
members of my graduate cohort and the administrative staff in the Psychology
Department at the University of Toronto who offered support throughout the year and
ensured that I was going to complete this project on time. This study was funded by the
Natural Sciences and Engineering Research Council of Canada (NSERC).
TABLE of CONTENTS
List of Tables
List of Figures
Chapter I: Introduction
Chapter II: Method
Chapter III: Results
Chapter IV: Discussion
Chapter V: References
Tables
Figures
LIST of TABLES
Table 1: Mean amplitude values derived from trials with correct responses
Table 2: Mean amplitude values derived from trials with errant responses
LIST of FIGURES
Figure 1: Illustration of the progression of a single experimental trial
Figure 2: The neuromagnetic activity and location of dipoles in a representative
participant
Figure 3: Graphical representation of behavioural response times
Figure 4: Graphical representation of the neuromagnetic source waveforms
CHAPTER I: INTRODUCTION
Research on attention is one of the major areas of investigation within
psychology, neurology and cognitive neuroscience. There are many areas of active
investigation that aim to understand the brain networks and mechanisms that support
attention, in addition to the relationship between attention and other cognitive processes
like working memory, vigilance, and learning. This thesis focuses on auditory attention
with a particular emphasis on the neural underpinnings of auditory search for a
predefined sound (i.e., a target) embedded in a “cluttered” auditory scene.
Previous research in auditory attention has made significant progress in advancing
our knowledge of how and where incoming sound stimuli are analysed and processed in
the brain. Behavioural and imaging studies investigating sounds presented either
sequentially (Cusack & Roberts, 2000; Snyder, Alain, & Picton, 2006), concurrently
(Alain & Izenberg, 2003; Dyson, Alain, & He, 2005), or intending to produce task
interference (Alain & Woods, 1993) have provided confirming evidence for a prominent
theory of the pre-attentive segregation of acoustic information into “sound objects” which
subsequently form the basic unit for attentional processing (Alain & Arnott, 2000). This
theory inherently suggests that multiple units are concurrently available for attentional
capture, selection, and processing. Experiments designed to manipulate the auditory
information to be attentively monitored and updated in working memory have
demonstrated the importance of the inferior parietal lobule (IPL) in serving this function
(Alain, He, & Grady, 2008) and, with consideration of how a final integrated perception
of an auditory stimulus is produced (Dyson, Dunn, & Alain, 2010), that there is
anatomical differentiation of processing within this region dependent upon the type of
auditory units drawn into attention. Sub-regions of the dorsolateral IPL are primarily
responsible for processing units that carry identifying information (i.e., the “what”
pathway) while the ventromedial IPL is involved in processing the units localizing the
origin of the sound (i.e., the “where” pathway) within our external environment (Alain,
Arnott, Hevenor, Graham, & Grady, 2001; Arnott, Binns, Grady, & Alain, 2004; Leung
& Alain 2011). These theories of acoustic segregation, identification, and localization
provide a foundational conceptualization of cortical mapping in auditory attention.
The Deployment of Attention in Complex Contexts
This cortical mapping, however, has primarily been established using paradigms that
present simple sound information in sequential streams (e.g., Alain & Woods, 1993;
Dalton & Lavie, 2004; Wetzel & Schröger, 2007). Under these circumstances, the
object-based account provides viable explanations for top-down (schematic
representations) and bottom-up (stimulus properties) interactions leading to the
deployment of attention to task-relevant auditory information at the expense of irrelevant
information (Degerman, Rinne, Salmi, Salonen, & Alho, 2006; Johnson & Zatorre, 2005,
2006). However, it is still unclear whether the same account would also apply in more
complex, ecologically valid contexts, in which sounds may occur simultaneously rather
than one at a time. This is an important issue to address if one wishes to extend theory
developed from simple laboratory tasks to more ecologically valid complex listening
situations in which listeners must segregate and select a particular sound object in the
presence of other task-irrelevant sounds occurring simultaneously, as well as sensory
information in other modalities.
Attention and auditory scene analysis have often been studied using paradigms
where contextual changes to stimulus presentation serve as cues to aid in the segregation
and identification of the multiple sound objects. The main findings using both non-speech
(Gockel, Carlyon, & Micheyl, 1999) and speech (Assmann and Summerfield, 1990;
Chalikia and Bregman, 1989; Drennan, Gatehouse, and Lever, 2003; Shackleton and
Meddis, 1992) sounds indicate that increasing the differences between concurrent stimuli
in either spectral or spatial characteristics increases accuracy in identifying
the sounds. A combination of changes in these spectral and spatial acoustic cues
appears to be linearly additive, easing the identification of sounds even further (Du et al.,
2011; Schröger, 1995; Takegata and Morotomi, 1999). Interestingly, even though spectral
and spatial information is believed to be processed in separate parietal pathways as
mentioned previously, the additivity effects, at least for speech sounds, occur early during
the preattentive processing of the multiple sound objects in auditory cortex and are
hypothesized to optimize perceptual organization (Du et al., 2011). The question still
remains, however, of how attention is deployed and processing occurs in response to
multiple sound units when a search for a specific unit, in contrast to just identifying or
locating all units, is the imperative task.
This question was addressed in the visual domain using electroencephalography
(EEG) and stimulus arrays standard in visual search experiments. Luck and Hillyard
(1994a,b) first demonstrated that target detection in a complex visual array was
characterized by a unique neural component labelled as the N2pc (N2-posterior-
contralateral). The N2pc, sourced from the posterior occipital lobe, appeared as a greater
amount of negativity 200 – 300 ms after target presentation in the hemisphere
contralateral, relative to ipsilateral, to the visual hemi-field in which the target was
presented (Luck & Hillyard, 1994a). The N2pc only occurred in complex arrays when
both a target and distracting stimuli were present and Luck and Hillyard (1994b) initially
interpreted this finding as indicative of a filtering mechanism that discounted distractors
before selecting an item as the target. Eimer (1996) demonstrated that filtering of
distractors was unlikely since the N2pc was present and similar in signal strength
regardless of the number of distractors present in the visual array. Combined with Luck
and Hillyard’s (1994a) demonstration that the N2pc occurs in response specifically to
target features, and not merely to pop-out features, this suggested that the N2pc is a
marker for the identification of task-relevant information, guided by pre-attentively
maintained information, in situations with multiple competing sources of input.
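The contralateral-minus-ipsilateral logic behind the N2pc can be sketched in a few lines. This is an illustrative reconstruction, not code from any of the cited studies; the toy waveform and window bounds are assumptions chosen to mimic the 200–300 ms negativity described above.

```python
import numpy as np

def contra_minus_ipsi(left_site, right_site, target_side):
    """Contralateral-minus-ipsilateral difference wave.

    left_site, right_site: ERP time courses from homologous posterior
    electrodes over the left and right hemispheres.
    target_side: hemifield ('left' or 'right') where the target appeared.
    """
    if target_side == 'left':
        contra, ipsi = right_site, left_site  # right hemisphere is contralateral
    else:
        contra, ipsi = left_site, right_site
    return contra - ipsi

# Toy trial: a negativity peaking near 250 ms only at the contralateral site
t_ms = np.arange(0, 500, 2.0)
contra_wave = -np.exp(-((t_ms - 250.0) ** 2) / (2 * 30.0 ** 2))
ipsi_wave = np.zeros_like(t_ms)

# Target in the left hemifield, so the right-hemisphere site is contralateral
diff = contra_minus_ipsi(ipsi_wave, contra_wave, target_side='left')
n2pc_window = (t_ms >= 200) & (t_ms <= 300)
mean_n2pc = diff[n2pc_window].mean()  # negative mean: the N2pc signature
```

A target in the opposite hemifield simply swaps which site counts as contralateral, inverting the difference wave.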
The Characterization of the N2pc and its Relation to the Deployment of Attention
The nature of the deployment of visual attention to target relevant information in
complex environments as measured by the N2pc has been extensively investigated in the
fifteen years since its first discovery. The task-relevant specificity of the N2pc was
further confirmed when it appeared only on target-present trials using displays containing
distractors with target-similar features (Akyürek & Schubo, 2011). Additional support
can be found in studies using spatial cuing; the N2pc arose only on trials when the target
object appeared in a display and not simply in response to an informative spatial cue
(Woodman, Arita, & Luck, 2009), although an attenuated component can occur after the
appearance of spatially-cued distractor-only displays (Kiss, Van Velzen, & Eimer, 2008).
While there is general agreement on this task-specific representation illustrated by
the N2pc, there is still debate as to whether or not it simultaneously incorporates the
suppression of irrelevant information from attentional capture. Arguments range from the
possibility that N2pc reflects primarily target enhancement (Mazza, Turatto, &
Caramazza, 2009; Robitaille & Jolicoeur, 2006), especially under increasingly low
conditions of target-distractor disparity (Zhao et al., 2011), to an assumption that the two
processes are inherently tied together and represented by the N2pc (Conci, Gramann,
Müller, & Elliot, 2006), to the N2pc representing a summation of separate processes of
both target selection and distractor suppression that can each elicit their own separate
components under appropriate task design (Hickey, Di Lollo, & McDonald, 2009;
Hilimire, Mounts, Parks, & Corballis, 2011). A somewhat intermediate hypothesis
suggests that the N2pc is responsible for the selection and enhancement of target
information but that increasing task complexity to incorporate the labelling of a target
stimulus (and not just noticing its presence) will elicit later components separate from the
N2pc that represent the detailed discrimination of the target (Mazza, Turatto, Umiltà, &
Eimer, 2007) and the contextually necessary suppression of distracting information (Pd –
distractor positivity; Sawaki & Luck, 2010). Regardless of the preference for either of
these hypotheses, onset time of the N2pc has been shown to be affected only by the
strength of the target selection criteria (Brisson, Robitaille, & Jolicoeur, 2007) and not by
manipulations to the interfering information (Robitaille & Jolicoeur, 2006), suggesting
that this component is a reliable measure of the initial deployment of attention to task-
relevant information.
Manipulation of the N2pc component has also addressed the question of the
interaction between top-down attentional versus bottom-up stimulus influenced control of
attentional deployment. Studies referencing the appearance of the N2pc in response to
target manipulation in contrast to distractor manipulation suggest that top-down
attentional control is predominantly responsible for the deployment and capture of
attention by task-relevant information (Eimer, Kiss, Press, & Sauter, 2009; Lien,
Ruthruff, & Cornett, 2010; Wykowska & Schubö, 2009), an effect that becomes even
stronger as task difficulty increases (Kehrer et al., 2009). Opposite stimulus-driven results
have been noted (Hickey, McDonald, Theeuwes, 2006), and it has been suggested that
the timing of the attentional shift will determine if it was motivated by top-down or
bottom-up processes (Hickey, van Zoest, & Theeuwes, 2010).
Although the N2pc occurs both in the right and left hemisphere contralateral to
the visual presentation of the target stimulus, there is some suggestion that lateralization
in the strength of the component signal may occur as a factor of task-driven hemispheric
specificity. In a study using two conditions differentiated by the target’s colour
eccentricity compared to the distractors, the amplitude of the N2pc was larger only in the
left hemisphere and only when the target differed in colour from the distractors (Liu et al.,
2009). The authors interpreted this as a left hemisphere influence due to endogenous
naming of the colour distinction.
Lastly, the N2pc has also been used in the investigation of errant attentional
processing leading to behavioural errors in target selection. Hypothesizing that activity in
frontal eye field neurons is associated with the shifts in attention represented by the N2pc
component produced from the visual cortex, Heitz, Cohen, Woodman, & Schall (2010)
used a memory-guided saccade task to measure attentional deployment in monkeys and
specifically examined the neural activity on erroneous trials. They discovered that the m-
N2pc (m = monkey) occurred on trials immediately before the behavioural response of an
erroneous eye saccade, suggesting that errors represent a mis-deployment of attention
rather than a malfunction of the attentional system as a whole (Heitz et al., 2010).
Applying Knowledge of the N2pc to the Deployment of Auditory Attention
The N2pc has been shown to be a reliable indicator of the deployment of visual
attention to task-relevant information under conditions using complex stimulus arrays. In
addition, it also likely represents characteristics of attentional mechanisms responsible for
top-down guidance of response selection, hemispheric dominance in response to
particular external stimuli, and the misdirection of processes leading to errant behavioural
responses. As such, it can be expected that a similar component could be elicited in other
modalities and may be useful for advancing the investigation of auditory attentional
deployment in more complex and ecologically valid contexts. To date only one study has
been published that has attempted a first step in this direction. Using an auditory scene
containing two of four possible acoustic stimuli composed of either white noise or a
mixture of tones or frequencies, Gamble and Luck (2011) elicited a component that is
possibly analogous to the N2pc on trials where participants correctly indicated the
presence of a predefined target stimulus. This component also occurred contralateral to
the presentation of the target auditory stimulus and only when the target was presented
simultaneously with a distractor stimulus, but was primarily sourced from an anterior
cluster of electrodes (N2ac – anterior contralateral component).
The purpose of the current experiment was four-fold. First, this study will
attempt to replicate the preliminary findings of a unique component indicative of the
deployment of auditory attention toward task-relevant stimuli in a complex acoustic
array. Second, this experiment will utilize stimuli (English vowel sounds) meant to be a
part of a step-wise approach toward translating findings from synthetic laboratory
contexts to possible real-life applicability. It is expected that these stimuli will elicit an
N2ac type component as did the noise stimuli utilized by Gamble and Luck, but that there
may also be lateralization preferences based upon the stimuli holding language-based
semantic meaning as was seen in the Liu et al. (2009) study mentioned previously. Third,
the source of the N2ac will be more accurately localized with the use of
magnetoencephalography (MEG) rather than EEG, just as Hopf and colleagues (2000)
were able to decompose the N2pc in vision research into two distinct neural responses
over the parietal and occipital cortices. MEG will also be exceptionally important
considering lateralization differences in the strength of the N2ac signal may occur. In this
thesis it is this latter function of MEG that will be important in guiding data analysis.
More in-depth examination of interactions between brain areas will occur in a second
phase of data analysis not considered here. Fourth, by manipulating task difficulty using
two conditions in which vowel sounds are either close or far apart in their frequency
values, making it respectively harder and easier to discriminate the vowels, the N2ac
component can be characterized in terms of its onset, amplitude, and offset. In addition,
the more difficult condition is expected to produce sufficiently large error rates
(pilot data suggest approximately 10–15%, compared to 4–5% in the Gamble and Luck
study) to allow analysis of errant trials for maintained but misdirected attentional features.
CHAPTER II: METHOD
Participants
Eighteen (6 male, 12 female) young, neurologically healthy adults between 20
and 30 years of age were recruited from the established in-house research participant pool
at Baycrest. One participant (female) was excluded after testing because of problems with
the MEG system during data acquisition. All remaining 17 participants are included in
the analyses (mean age = 25 ± 2.9 years). Participants were screened for
exclusion/inclusion criteria through both review of the information available in the
database and by self-report during a standard in-house recruitment questionnaire
administered during the first contact. Inclusion criteria included normal (or corrected)
vision and hearing (audiograms assessing pure-tone thresholds within octave frequencies
from 250 – 8000 Hz were performed at the beginning of each testing session), English as
first or primary language, right-handedness, and the absence of diseases that might compromise
brain function such as diabetes. All participants provided informed consent at the
beginning of each testing session abiding by the standards set forth by the Baycrest
Research Ethics Board and the Declaration of Helsinki.
Task
The main experimental task involved, on each trial, the concurrent presentation of
two different speech sounds, one to each ear, thereby composing an auditory ‘scene’. The
auditory stimuli were four synthesized American English vowel sounds: /ah/, /ee/, /er/,
/ae/ (Assmann and Summerfield, 1994; Du et al., 2011). Stimuli were presented to the
participant during magnetoencephalographic data acquisition binaurally at 75 dB through
Etymotic ER3A insert earphones connected with 1.5m of plastic tubing. To simulate
spatial hearing in natural environments, each participant completed a calibration task
where single vowels were presented at different spatial locations using multiple head-
related transfer function (HRTF) coefficients (Wenzel, Arruda, Kistler, & Wightman,
1993; Wightman and Kistler, 1989a, 1989b). Participants were required to indicate when
the vowels seemed to be occurring 45° to the left and right of the midline and that
coefficient was then applied to the vowels for the remainder of the experiment. On each
experimental trial, one vowel was presented with the fundamental frequency (f0) set at
100 Hz, and the other set at a frequency either 1 (106 Hz) or 4 (126 Hz) semitones higher
than f0 (f1 or f4) (Alain et al., 2005). The objective for the participant was to indicate
whether a predefined target vowel was present or absent by pressing one of two possible
buttons.
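The semitone offsets above follow the standard equal-tempered relation f = f0 · 2^(n/12). A quick check (illustrative only, not part of the stimulus-generation code) reproduces the reported 106 Hz and 126 Hz values:

```python
def semitones_above(f0_hz, n):
    """Frequency n equal-tempered semitones above f0."""
    return f0_hz * 2.0 ** (n / 12.0)

f0 = 100.0
f1 = semitones_above(f0, 1)  # ~105.95 Hz, rounded to the reported 106 Hz
f4 = semitones_above(f0, 4)  # ~125.99 Hz, rounded to the reported 126 Hz
```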
The experiment took place in four stages: i) a brief audiogram and calibration of
the HRTF, ii) a stimulus identification task to ensure that participants could correctly
identify the four different vowel sounds (24 trials, 4 vowels, 2 trials at each of the 3
frequency levels), run twice, iii) a brief 5 trial practice before each of the four
experimental blocks demonstrating the task and the target stimulus on that particular
block, and iv) the actual experimental block, during which MEG recording took place.
An experimental trial consisted of a silent wait period of 1000 ms, followed by the 200
ms auditory scene (this presentation length was recommended in the vision literature,
Brisson & Jolicoeur, 2007, and has been used in similar auditory studies within our own
laboratory, Alain et al., 2005; Du et al., 2011), and the subsequent participant response,
after which the 1000 ms wait period for the next trial began immediately. The participant
was required to make a button response on each trial; “1” if they felt the target was one of
the two vowels presented on that trial or “2” if they felt the target was not present. See
Figure 1 for an illustration of the progression of a trial.
Figure 1 here
There were four experimental blocks, each with one of the four vowels designated
as the target stimulus. The order of the target vowel presented in each block was
counterbalanced across participants. Each block consisted of 336 trials, half of which
contained the target vowel. On the 168 trials containing the target vowel, half of the
targets were presented to the left hemispace and the other half to the right hemispace.
Task context was manipulated by the frequency difference between the pairs of vowels
on each trial (i.e., f0/f1 or f0/f4). The presentation of both the target and distractor vowels at
each frequency to the left and right hemispace, the pairing of each vowel with the three
remaining vowels, and the number of trials representing the frequency differences
between concurrent vowels were equal, counterbalanced, and randomly presented within
each block. Each experimental block lasted between 10 and 12 minutes. After testing,
participants were debriefed with a questionnaire probing their feedback on the ease of
doing the task and their strategies for performance. A complete testing session for each
participant lasted approximately 2 hours.
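The trial structure of one block described above can be sketched as follows. This is a schematic reconstruction for illustration only; the dictionary fields and the randomization scheme are assumptions, not the actual experiment code, and the full counterbalancing of vowel pairings is simplified to random sampling.

```python
import itertools
import random

VOWELS = ['ah', 'ee', 'er', 'ae']

def build_block(target, seed=0):
    """One 336-trial block: 168 target-present trials (84 per hemispace)
    and 168 target-absent trials, with the two semitone separations
    (1 and 4) equally represented throughout."""
    rng = random.Random(seed)
    distractors = [v for v in VOWELS if v != target]
    trials = []
    # Target-present: 2 sides x 2 semitone levels x 42 = 168 trials
    for side, st in itertools.product(['left', 'right'], [1, 4]):
        for _ in range(42):
            trials.append({'present': True, 'side': side, 'semitone': st,
                           'distractor': rng.choice(distractors)})
    # Target-absent: 2 semitone levels x 84 = 168 trials of two non-target vowels
    for st in (1, 4):
        for _ in range(84):
            trials.append({'present': False, 'semitone': st,
                           'vowels': rng.sample(distractors, 2)})
    rng.shuffle(trials)
    return trials

block = build_block('ah')
```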
Data Acquisition and Analysis
Behavioural data indicating the response time and accuracy on each trial were
collected and separately analysed as a function of two within-subjects variables: Target
(three levels: Target Present Left – TPL; Target Present Right – TPR; Target Absent -
TA) and SemiTone (two levels: ST 1; ST 4) using planned within-subjects repeated
measures analysis of variance (ANOVA), collapsing across trials from all four
experimental blocks. If a main effect of Target or an interaction between Target and
SemiTone was significant, follow-up comparisons were carried out using paired sample t-
tests. Errors were calculated as a rate since the number of possible errors varied by
condition because of the differing number of trials between the Target Present Left and
Target Present Right (84 trials each) and Target Absent (168 trials) conditions.
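The error-rate normalization and the paired-samples follow-ups can be made concrete with a short sketch. The error counts below are hypothetical, and the t statistic is simply the textbook paired-samples formula, not the statistics package actually used:

```python
import math

def error_rate(n_errors, n_trials):
    """Errors as a rate, so the 168-trial Target Absent condition is
    comparable to the 84-trial Target Present conditions."""
    return n_errors / n_trials

def paired_t(x, y):
    """Paired-samples t statistic, as used for follow-up comparisons."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    m = sum(d) / n
    var = sum((v - m) ** 2 for v in d) / (n - 1)
    return m / math.sqrt(var / n)

# Hypothetical per-participant error counts, for illustration only
tpl_rate = error_rate(10, 84)   # Target Present Left
ta_rate = error_rate(8, 168)    # Target Absent
```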
For MEG recording, neuromagnetic brain activity was recorded from a 151
channel whole-head neuromagnetometer (VSM Medtech) while the participant performed
the behavioural task in a magnetically shielded room. Participants were seated in an
upright position with the top and back of their head surrounded by the scanner. Before the
start of scanning and the task, head localization coils were placed on the nasion, and on
the left and right preauricular points so that the neuromagnetic data could be coregistered with
an anatomical magnetic resonance image. Data were collected in each of the four
experimental blocks at a sampling rate of 625 Hz and low-pass filtered at 200 Hz.
Our approach to data analysis was chosen based upon our intention to mimic
conventional analysis of electroencephalographic data and results of our past studies
implicating auditory cortex involvement in sound segregation (Alain et al., 2005),
specifically its involvement in the preattentive processing of sound units from
concurrently presented stimuli with variations in acoustic characteristics (Du et al., 2011).
Depending upon the participant and the vowel used as the target stimulus, average
reaction times varied from 500 to 1400 ms. Therefore, an epoch of 200 ms before
stimulus to 1500 ms after stimulus presentation was analysed. Artefact rejection
parameters were set to retain 90% of all trials from each experimental block for each
participant. Auditory evoked fields (AEFs) were averaged for each condition (i.e., TPL
ST 1, TPL ST 4, etc.) for correct and error trials separately, and these averaged data were
then low-pass filtered at 30 Hz before source modeling.
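The epoching step can be sketched as below. This is a hedged illustration assuming a (channels × samples) array and sample-index onsets; it is not the actual acquisition pipeline, and artefact rejection and the 30 Hz filter are omitted.

```python
import numpy as np

FS = 625  # MEG sampling rate, Hz

def extract_epochs(data, onsets, pre_ms=200, post_ms=1500):
    """Cut epochs from -200 ms to +1500 ms around each stimulus onset.

    data: (n_channels, n_samples) continuous recording
    onsets: stimulus-onset sample indices
    returns: (n_trials, n_channels, n_epoch_samples)
    """
    pre = int(pre_ms * FS / 1000)    # 125 samples before onset
    post = int(post_ms * FS / 1000)  # 937 samples after onset
    keep = [data[:, o - pre:o + post] for o in onsets
            if o - pre >= 0 and o + post <= data.shape[1]]
    return np.stack(keep)

# Averaging the retained epochs per condition yields the AEF
rng = np.random.default_rng(0)
raw = rng.standard_normal((151, FS * 60))   # 151 channels, 60 s of fake data
onsets = np.arange(1000, 30000, 1500)
epochs = extract_epochs(raw, onsets)
aef = epochs.mean(axis=0)                   # average across trials
```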
For each participant, the grand average (including all experimental trials) AEFs
were modeled by placing single dipoles in the right and left auditory cortices near
Heschl’s gyrus and measuring the source strength of the waveforms using BESA
software version 5.3 (Brain Electrical Source Analysis, MEGIS Software GmbH). The
location and orientation of the dipoles were fit to represent a 40 ms interval around the
N1m wave peak, the most reliable deflection across all participants and target stimuli.
The parameters from this grand mean model were then used to calculate the source
waveforms for each of the sub-conditions of interest.
Upon examining the source waveforms, the first noticeable result was a bias toward a stronger response in the left hemisphere, which precluded the contralateral-versus-ipsilateral, target-dependent responses necessary to elicit the N2ac. We therefore
proceeded with traditional data analysis methods of examining the mean amplitude as the
main dependent variable within time intervals chosen to study classic peaks of positivity
and negativity as well as areas of interest determined from the graphical illustrations of
the source waveforms. Time intervals of 80-150 ms, 155-250 ms, and 300-450 ms were
used to quantify the N1m, P2m, and P3m, respectively, and the 500-800 ms interval was used
to examine sustained late wave activity. (The ‘m’ denotes the magnetic counterparts of
the corresponding electric components.) A first repeated measures ANOVA was
applied to the mean amplitude with three within-subjects variables: Hemisphere (2 levels;
Left, Right), Target (3 levels; Target Present Left - TPL, Target Present Right - TPR,
Target Absent - TA) and SemiTone (2 levels; ST 1 and ST 4). In cases where the Target
variable or an interaction between variables was significant, follow-up paired sample t-
tests were used to examine differences between pairs of conditions.
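The windowed mean-amplitude measure can be sketched as follows, under the assumption that each source waveform spans -200 to +1500 ms at 625 Hz; the window bounds are those named above, and the flat example waveform is purely illustrative:

```python
import numpy as np

FS = 625               # Hz
EPOCH_START_MS = -200  # epoch begins 200 ms before stimulus onset

WINDOWS = {'N1m': (80, 150), 'P2m': (155, 250),
           'P3m': (300, 450), 'late': (500, 800)}

def mean_amplitude(source_wave, t0_ms, t1_ms):
    """Mean amplitude of a source waveform within a latency window."""
    i0 = int((t0_ms - EPOCH_START_MS) * FS / 1000)
    i1 = int((t1_ms - EPOCH_START_MS) * FS / 1000)
    return float(source_wave[i0:i1].mean())

# Example on a flat waveform of ones (1062 samples = -200..1500 ms)
wave = np.ones(1062)
amps = {name: mean_amplitude(wave, *win) for name, win in WINDOWS.items()}
```

The resulting per-condition amplitudes are the dependent variable entered into the Hemisphere × Target × SemiTone ANOVA.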
For all analyses, p-values are reported as exact numbers with three decimal places
to conform to new APA guidelines.
CHAPTER III: RESULTS
Figure 2a illustrates the AEFs from all MEG sensors in one participant. The large
N1m occurs at approximately 110 ms followed by sustained negativity. The magnetic
field topography at the N1m latency matches the expected pattern of bilateral sources
near the superior temporal gyri (see Figure 2b). Figure 2c shows the left and right dipole
locations for this participant and the overall group mean at the N1m. The values for the
group mean dipole locations using Talairach coordinates were (-41, -21, 1) for the left
dipole and (46, -14, -1) for the right dipole in the superior temporal gyri. The right dipole
was closer to the lateral surface [t(16) = 39.983, p = .000] and was more anterior [t(16) =
6.649, p = .000] than the left dipole. The dipole locations along the superior-inferior axis
did not differ.
Figure 2 here
Behavioural Data - Reaction Time (RT) on Correct Trials
Both the main effects of Target [F(2,32) = 40.411, p = .000] and SemiTone
[F(1,16) = 7.602, p = .014] were significant but there was no interaction between the two
variables. Pairwise comparison of the three Target conditions showed that participants
were faster in both the TPL [t(16) = -7.561, p = .000] and TPR [t(16) = -6.569, p = .000]
conditions than the TA condition. Although the TPL and TPR conditions did not differ
significantly, the TPR condition had slightly faster response times than the TPL
condition. Contrary to what was hypothesized for the SemiTone manipulation, the
condition with a 4 semitone difference in f0 between the two concurrent vowels generated
slower responses than the 1 semitone difference condition. See Figure 3 for a graphical
illustration of the response times on trials where participants made correct responses for
the overall Target and SemiTone conditions.
Figure 3 here
Behavioural Data – Accuracy
The main effect of Target was significant [F(2,32) = 24.379, p = .000]. However,
the main effect of SemiTone was not significant nor was the interaction between Target
and SemiTone. For the Target condition, pairwise comparison showed that the error rates
were higher in both the TPL [t(16) = 6.119, p = .000] and TPR [t(16) = 5.338, p = .000]
conditions when compared to the TA condition, but they did not differ from each other.
MEG Data – Source Wave Forms on Correct Trials
The source waveforms from the left and right hemisphere derived from trials
where participants made accurate responses are illustrated in Figure 4. This figure can be
referenced for illustrations of the data reported in this section for the four time intervals
of interest.
Figure 4 here
i) Interval 80-150 ms, N1m
Table 1a summarizes the mean amplitude values for each condition during this time
interval. The main effects of the three within-subjects variables were significant
[Hemisphere: F(1,16) = 6.544, p = .021; Target: F(2,32) = 4.582, p = .018; SemiTone:
F(1,16) = 9.288, p = .008]. Overall, the N1m mean amplitude was larger in the left than in
the right hemisphere. For the Target variable, pairwise comparisons revealed that the TA
condition was statistically more negative than the TPR condition. Neither the TA-TPL
nor TPL-TPR comparisons differed significantly. The mean overall amplitude was also
larger for the 1 semitone than the 4 semitone condition.
There was a three-way interaction [F(2,32) = 5.667, p = .008]. Since all conditions
were more negative in the left hemisphere, follow-up analyses were carried out within
each hemisphere separately to examine the relationship between Target and SemiTone
conditions. In the left hemisphere, when there was 1 semitone separating the two
concurrently presented vowels, the three Target conditions did not significantly differ.
However for the 4 semitone difference, the N1m was smaller when the target was
presented to the left or right hemispace compared to when the target was absent [TPL vs.
TA: t(16) = 3.105, p = .007; TPR vs. TA: t(16) = 3.239, p = .005]. In the right
hemisphere, again there was only a difference between the Target conditions when there
was a 4 semitone difference separating the two vowels, but this time it was the TPR
condition that had a significantly smaller mean amplitude than both the TPL [t(16) =
2.657, p = .017] and TA [t(16) = 3.243, p = .005] conditions.
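Purely for illustration, follow-up comparisons of this kind can be sketched as paired-sample t-tests on per-participant mean source amplitudes. All values and variable names below are hypothetical placeholders, not the study's data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_subjects = 17  # matches the reported degrees of freedom, t(16)

# Hypothetical per-participant mean N1m source amplitudes (one
# hemisphere/semitone cell), one value per Target condition.
tpl = rng.normal(-20.0, 5.0, n_subjects)  # target present left
tpr = rng.normal(-20.5, 5.0, n_subjects)  # target present right
ta = rng.normal(-24.0, 5.0, n_subjects)   # target absent

# Paired-sample t-tests between Target conditions, mirroring
# comparisons such as TPL vs. TA reported in the text.
for label, (a, b) in {"TPL vs TA": (tpl, ta),
                      "TPR vs TA": (tpr, ta),
                      "TPL vs TPR": (tpl, tpr)}.items():
    t, p = stats.ttest_rel(a, b)
    print(f"{label}: t({n_subjects - 1}) = {t:.3f}, p = {p:.3f}")
```

Because each participant contributes a value to every condition, the paired (within-subjects) test is the appropriate follow-up to the repeated-measures ANOVA.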
ii) Interval 155-250 ms, P2m
Table 1b summarizes the mean amplitude values for each condition during this time
interval. There was a main effect of Hemisphere [F(1,16) = 19.584, p < .001] and of
SemiTone [F(1,16) = 10.821, p = .005]. The source waveforms were more negative in the
left than in the right hemisphere and in the 4 semitone compared to the 1 semitone
conditions. The three Target conditions did not significantly differ, but there was an
interaction between the Hemisphere and Target variables [F(2,32) = 4.193, p = .024].
Follow-up paired-sample t-tests between the Target conditions were carried out for each
hemisphere separately. In the left hemisphere the TPL condition was more negative than
both the TPR [t(16) = -2.631, p = .018] and TA [t(16) = -2.562, p = .021] conditions, but
there were no significant differences in the right hemisphere.
iii) Interval 300-450 ms, P3m
Table 1c summarizes the mean amplitude values for each condition during this time
interval. There was a main effect of Hemisphere [F(1,16) = 36.258, p < .001], but none
of the remaining main effects or interactions was significant. The mean amplitude was
more negative in the left compared to the right hemisphere.
iv) Interval 500-800 ms, Late sustained field
Table 1d summarizes the mean amplitude values for each condition during this time
interval. There was a significant main effect of Hemisphere [F(1,16) = 13.924, p = .002]
and Target [F(2,32) = 8.063, p = .001], but no main effect of SemiTone. As with the
previous three time intervals, the mean amplitude in the left hemisphere was more
negative than in the right hemisphere. Follow-up paired sample t-tests examining the
differences between the Target conditions showed that the mean amplitude for the TA
condition was more negative compared to both the TPL [t(16) = 2.458, p = .026] and
TPR [t(16) = 3.472, p = .003] conditions.
There was also an interaction between the Hemisphere and Target variables [F(1,16)
= 3.513, p = .042]. Again, as in the previous time intervals, follow-up analyses focused
on the relationship between the three levels of the Target variable within each hemisphere
separately. The TA condition had a larger negative amplitude than both the TPL [t(16) =
2.859, p = .011] and TPR [t(16) = 3.192, p = .006] conditions in the left hemisphere, but
was only significantly bigger than the TPR condition [t(16) = 2.543, p = .022] in the right
hemisphere.
MEG Data – Source Waveforms on Errant Trials
Tables 2a, 2b, and 2c list the mean amplitude and standard deviation values for the
trials where participants responded incorrectly in the N1m, P2m, and Late time intervals.
The P3m was not considered because this peak was not always visibly present in the
error-derived source waveforms. Using the same analysis approach as for the correct
trials, there was a main effect of hemisphere in all three time intervals, although only
marginally so for the N1m [N1m: F(1,16) = 4.459, p = .051; P2m: F(1,16) = 10.575, p =
.005; Late: F(1,16) = 22.932, p < .001]. No other main effects or interactions were
significant.
Although the expected N2ac component was not present, it was still hypothesized that
the error trials could illustrate modulation of attention processes at different stages.
Therefore, within three of the time intervals of interest (N1m, P2m, and Late), the mean
amplitude of the source waveforms derived from both correct and incorrect trials was
compared for each Target condition separately. This was carried out using a repeated-
measures ANOVA with two within-subjects variables: Accuracy (two levels; Correct,
Incorrect) and SemiTone (two levels; 1 ST, 4 ST). Given the robustness of the response
in the left hemisphere and the nature of this analysis as a follow-up exploratory
examination of the data, only responses in the left hemisphere were considered. In the
earlier two time intervals there were no significant main effects of Accuracy or SemiTone
and no interaction between them. In the Late time interval, however, the only significant
effect was a main effect of Accuracy for both the TPL [F(1,16) = 4.767, p = .044] and
TPR [F(1,16) = 8.071, p = .012] conditions. In both cases the mean amplitude was much
larger (more negative) on the error trials than on the accurate trials, and was in fact very
close to the mean amplitude of the correct TA condition. See Table sections 1d and 2c
for a direct comparison of these values. There were no significant effects in the TA
condition.
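As a minimal sketch of this follow-up design: for a within-subjects factor with only two levels, the repeated-measures F equals the squared paired t statistic, so the Accuracy and SemiTone main effects can be illustrated directly (all amplitudes below are synthetic placeholders, not the study's data):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 17  # participants, giving the reported df of (1, 16)

# Hypothetical left-hemisphere Late-interval amplitudes for the 2 x 2
# within-subject cells: Accuracy (correct/incorrect) x SemiTone (1/4 ST).
correct_1st = rng.normal(-18.0, 4.0, n)
correct_4st = rng.normal(-18.5, 4.0, n)
incorrect_1st = rng.normal(-24.0, 4.0, n)
incorrect_4st = rng.normal(-24.5, 4.0, n)

def rm_main_effect(a, b):
    """F(1, n-1) for a two-level within-subject factor: with two
    levels, the repeated-measures F is the squared paired t."""
    t, p = stats.ttest_rel(a, b)
    return t ** 2, p

# Main effect of Accuracy: average each level over SemiTone first.
f_acc, p_acc = rm_main_effect((correct_1st + correct_4st) / 2,
                              (incorrect_1st + incorrect_4st) / 2)
# Main effect of SemiTone: average each level over Accuracy first.
f_st, p_st = rm_main_effect((correct_1st + incorrect_1st) / 2,
                            (correct_4st + incorrect_4st) / 2)
print(f"Accuracy: F(1,{n - 1}) = {f_acc:.3f}, p = {p_acc:.3f}")
print(f"SemiTone: F(1,{n - 1}) = {f_st:.3f}, p = {p_st:.3f}")
```

Averaging over the other factor before testing mirrors how a main effect is defined in the 2 × 2 repeated-measures design described above.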
CHAPTER IV: DISCUSSION
Behavioural
Behaviourally, average response times were faster on trials where a target was
present than on trials where it was absent. When directly comparing trials where the
target was presented to the left hemispace versus the right hemispace, reaction times did
not differ significantly, although they were consistently faster for targets presented to the
right. This suggested the possibility of a right ear advantage for the language stimuli used
in this task, given the direct anatomical connection between the right ear and the left
hemisphere (Hugdahl, Westerhausen, Alho, Medvedev, & Hamalainen, 2008).
Reaction times were also affected by a manipulation of task context. The
frequency difference between the vowels presented to each ear within a single trial was
changed such that on half of the trials the vowels were separated by a 1 semitone
difference and on the other half by a 4 semitone difference. Although it was hypothesized
that the 1 semitone separation would create a situation where differentiating the two
vowels would be more difficult, the reaction times were slowest on the trials with a 4
semitone difference between the vowels. Accounts of sound segregation with
concurrently occurring stimuli in both humans and animals are derived from
experimental results indicating improved performance in the identification of imperative
sound stimuli with increasing differences in frequency (Alain, Schuler, and McDonald,
2002; Dyson and Alain, 2004; Nityananda and Bee, 2011), including vowel sounds
similar to those employed in this study (Alain et al., 2005; Assmann and Summerfield,
1990,1994; Snyder and Alain, 2005). These past studies, however, often involved
paradigms whereby the participants were charged with the identification and labelling of
the sounds presented to each ear and accuracy was the main dependent variable. This
current study required the participants to perform a different task – to search for a single
sound and to indicate its presence or absence, and thus may involve different or
additional attentional processes. While both task types would presumably involve the
preattentive processing steps of sound analysis (simple sound object feature processing),
the follow-up steps in the previous studies involved consciously identifying and labeling
the recognized objects. In this current task, the subsequent steps may or may not have
involved the active awareness of the exact sounds heard in each hemispace, but most
likely involved the comparison of perceptually recognized objects with the target sound
that was held in working memory throughout each block of the task. If the target sound
was identified, the appropriate response could be made. If the target was not immediately
identified, such as would happen in the TA condition, it is possible there was an extra
level of online attentional monitoring of task progression whereby the decision that the
target was not present was checked and confirmed (Rosenthal et al., 2006), translating
behaviourally into longer response times. The role of manipulating the frequency
differences between the concurrent sounds in affecting these cognitive processes is still
unknown. Since the slowing in the 4 semitone condition relative to the 1 semitone
condition occurred for both target present and target absent trials, it is possible that the
larger frequency difference actually allowed further processing: because the vowels were
more identifiable, both could be consciously identified before a choice had to be made.
The slowing in RT may therefore reflect not greater difficulty in the 4 semitone
condition, but deeper processing, simply because deeper processing was available.
These behavioural results, then, gave us several lines of inquiry to examine in the
MEG data. In addition to the original question of interest, whether there was any
indication of attentional deployment contralateral to the presentation of a target stimulus,
there was also the potential for a right hemispace advantage, as well as for modulation of
attentional responses during auditory search due to changes in task context.
MEG - Deployment of Attention and the N2ac
The primary hypothesis for this study was that there would be greater magnetic
activity in response to the detection of task relevant stimuli occurring contralateral to the
hemispace in which the target sound was presented, and that this contralateral activity,
relative to ipsilateral, would produce a unique component representing attention
deployment to relevant stimuli in complex environments. Upon inspection of the
magnetic source waveforms, the most immediately obvious result was a strong bias
toward the left hemisphere: the waveforms from the left source were much more negative
than those from the right source, regardless of the hemispace to which the sound was
presented. A comparison of contralateral and ipsilateral activity confirmed that the
expected deployment of attention dependent upon the location of the vowel did not occur.
As also suggested by the behavioural data, the strong left lateral response could
possibly reflect a right ear advantage for processing language stimuli. Each brain
hemisphere may have specialization for processing different categories of sound stimuli.
In dichotic listening studies, when assessing non-language sounds (for example,
identifying changes of pitch in simple tones) there is often a left ear advantage suggesting
a right hemisphere dominance (Sininger and Bhatara, 2012; Wioland, Rudolf, Metz-Lutz,
Mutschler, & Marescaux, 1999) while the processing of linguistic stimuli shows the
opposite right ear advantage (Hugdahl et al., 2008), although these hemispheric
preferences can be modulated with top-down attentional control (Ofek and Pratt, 2004).
In addition, the left hemisphere activity may have been amplified even further because
all of our participants were right-handed, since the right ear advantage is stronger in
right-handed people (Nachshon, 1978). However, when considering these present results in the
context of previous work within our lab using similar methodologies, the interpretations
are not immediately clear. Using the same stimuli presented concurrently as in this
current task, fMRI results have shown left thalamo-cortical network involvement in the
successful identification of the speech sounds (Alain et al., 2005), but MEG results using
a similar paradigm have not shown a strong hemispheric bias (Du et al., 2011) and in
both cases the majority of the participants were right-handed. The main difference
between these previous two studies and this current task is the focus on using attentional
processes to search for an imperative stimulus as opposed to identifying all stimuli.
Therefore, there is the potential that left lateralized working memory/task setting
processes (D'Esposito and Badre, 2012; Stuss et al., 2002) are biasing the results in
conjunction with the language component. It is also possible that the source of the N2ac
is located primarily in other brain regions and our seeding of the dipoles to the auditory
cortex does not adequately capture the potential source activity from these areas.
Electromagnetic Activity at the N1m
The overall N1m mean amplitude was larger in the left than in the right
hemisphere. It was also larger when the target was absent than when it was present,
although the difference was only statistically reliable for the TPR-TA comparison. The
semitone manipulation only had an effect on the trials where the target was present; for
both the TPL and TPR conditions the amplitude was more negative when there was a 1
semitone difference compared to when there was a 4 semitone difference between the
two vowel sounds, although in the right hemisphere this effect only reached significance
for the TPR condition.
The N1 is thought to represent sensory gating in the auditory cortex, whereby
information is sorted according to its relevance to the task at hand (Alho et al., 1998).
Studies using both EEG and MEG methods have produced N1 amplitudes, without
specified lateralization effects, that are greater in response to the presentation of task-
related, compared to unrelated, information (Alho et al., 1998; Escera, Alho, Winkler,
and Näätänen, 1998; Näätänen, Kujala, and Winkler, 2011; Näätänen and Winkler, 1999;
Tsuchida, Katayama, and Murohashi, 2012), although the task-unrelated, or distractor,
N1 amplitudes can increase with greater distractor variability that potentially makes the
target less salient (Tong and Melara, 2007). Although the N1 is considered an exogenous
component, researchers have found an increase in N1 amplitude when participants were
instructed to actively attend to sounds presented to one ear only (Hillyard, Hink,
Schwent, & Picton, 1973; Sabri, Liebenthal, Waldron, Medler, & Binder, 2006; Woldorff
and Hillyard, 1991; Woldorff et al., 1993), an effect which has been shown to be
attenuated in certain disorders that potentially involve anatomical volume reduction in
areas thought to generate the N1 (i.e., the superior temporal gyrus, Näätänen and Picton,
1987; Neelon, Williams, & Garrell, 2006) such as schizophrenia and bipolar disorder
(Force, Venables, & Sponheim, 2008).
In this current dichotic language-based study the endogenous attention-orienting
instructions were held constant throughout the task. The participants were required to
search for a specified target on each trial and use a two-choice response to indicate its
presence or absence. Overall, on trials where the target stimulus was absent, the N1m
amplitude was larger than when the target was present, regardless of the hemispace to
which the target was presented. This may suggest that the absence of target features
required further processing of the stimulus information. In contrast, upon detection of the
target stimulus features, simple feature processing stopped and processing could proceed
to the utilization of those features to represent an object, as reflected in the onset of the P2
wave (Crowley and Colrain, 2004). These results expand on our current knowledge of
top-down attentional effects on early sound processing. The common theme across the
various studies mentioned here was an increase in N1 when there was either task-relevant
information available or when people were instructed to direct their attention to a
particular task. This current study highlights the special case of an auditory search and
potentially the involvement of working memory processes. A focusing of attention and a
capture of sound objects relevant to a particular decision decreases the N1m, whereas not
finding a match to the more prominent task-relevant stimulus (i.e., the imperative target)
requires further processing instead of being discarded as a task-irrelevant (non-matching)
trial. Working memory processes have been shown to interact with early auditory sensory
gating: as working memory load in a task increases, the N1 decreases in people who
have a high working memory capacity (Tsuchida et al., 2012). However, the extent of
the influence that endogenous control and working memory processes have over the
early processing stages may also depend upon the exogenous factors influencing stimulus
quality. While in the 4 semitone condition the target absent trials elicited a larger
negative amplitude than the target present trials, there was no difference in N1m as a
function of the target being absent or present when the vowels were separated by only
1 semitone.
Electromagnetic Activity at the P2m
As with the N1m, the source waveforms differed between hemispheres with the
mean amplitudes being overall more negative in the left compared to the right
hemisphere. The TPL condition in the left hemisphere had a more negative amplitude
than both the TPR and TA conditions, the latter two being very close. There were no
differences between the conditions in the right hemisphere. There was also an overall
main effect of SemiTone, with the 4 semitone condition being more negative than the 1
semitone condition.
The peak amplitude often recorded in the 200 ms post stimulus onset range, or P2,
is thought to reflect the forming of an object representation from the identity of the
detected stimulus features (Crowley and Colrain, 2004; Näätänen and Winkler, 1999).
The greater negativity seen in the 4 semitone condition replicates findings from past
studies showing that an increase in the mistuning of concurrently presented stimuli will
lead to a more negative elicited neural response, interpreted as an improvement in the
perceptual identification of an object (the object related negativity, ORN; Alain, Arnott,
and Picton, 2001; Alain and Izenberg, 2003; Alain and McDonald, 2007; Alain, Schuler,
and McDonald, 2002; Alain, Reinke, et al., 2005). Although not statistically reliable, the
greatest change occurred in the TA condition, where there was a decrease in amplitude
from the 1 to the 4 semitone conditions. An interpretation for this change is not
immediately apparent. That the ORN would be bigger in one condition than another
implies that at this pre-attentive processing stage the brain would have made a decision,
beyond simple perception, about the identity of the concurrent stimuli, but again, as with
the N1m, this may be something unique to searching and matching to items held in
working memory.
Electromagnetic Activity at the P3m and Later Component
No statistically significant differences were found within the P3m interval other
than the left hemisphere being overall more negative than the right hemisphere, although
from the graphical illustration of the source waveforms in Figure 4 there may be an
indication of a separation of response dependent upon the Target trial type in the 4
semitone condition. This would be in line with previous research suggesting that the P3 is
affected largely by changes in task context, specifically manipulations in task difficulty
(Katayama and Polich, 1998). However, the analyses employed in this current report are
most likely not sensitive enough to capture the differences in this component and may
require measuring its onset as opposed to the mean amplitude.
The later slow wave activity was assessed between the time period of 500 and 800
ms post onset of stimulus. Our main question of interest involved broadly characterizing
the difference in amplitudes between the three Target conditions since activity in this area
is thought to represent indices of working memory and the allocation of controlled
attentional resources, differing as a function of stimulus context (i.e., target versus
distractor selection in many standard laboratory auditory tasks) (Katayama and Polich,
1998; Näätänen et al., 2011). Overall both of the target present conditions had a mean
amplitude that was significantly less negative than the target absent condition. Following
interpretations from past literature just mentioned, this would indicate that the selection
of a target occurred more quickly than the confirmation of its absence. There was no
overall main effect of SemiTone, although, as with the P3m, an apparent separation
between the Target conditions in this interval would likewise require more sensitive
analysis measures than those performed here.
Limitations
There were several methodological limitations to this study that probably
contributed to the lack of replication of the Gamble and Luck (2011) study in addition to
the use of language stimuli discussed previously. The switch from EEG to MEG
required that the sounds be presented through ear inserts and be filtered with a
head-related transfer function (HRTF) to mimic sounds heard at an angle of 45
degrees from midline, whereas in an EEG setup speakers can be physically placed at an
angle of 45 degrees relative to the head. It may also be somewhat easier to differentiate
left from right with a speaker setup than through earphones. Also,
the source of the neural generators responsible for the activity behind the N2ac may not
be near the N1 generators. Thus, we would need further analyses to explore other brain
regions.
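For illustration, the HRTF step mentioned above amounts to convolving the monaural stimulus with a head-related impulse response for each ear. The toy sketch below uses made-up impulse responses (a pure delay and attenuation at the far ear), not the filters actually applied in the study:

```python
import numpy as np

fs = 44100                            # sample rate (Hz)
t = np.arange(int(0.1 * fs)) / fs
vowel = np.sin(2 * np.pi * 220 * t)   # placeholder monaural stimulus

# Made-up head-related impulse responses for a source 45 degrees to the
# left of midline; real HRIRs are measured or drawn from a database.
hrir_near = np.zeros(64)
hrir_near[0] = 1.0                    # near (left) ear: direct sound
hrir_far = np.zeros(64)
hrir_far[30] = 0.5                    # far (right) ear: delayed, attenuated

# Convolve and trim to the stimulus length to form a stereo pair.
left_channel = np.convolve(vowel, hrir_near)[: len(t)]
right_channel = np.convolve(vowel, hrir_far)[: len(t)]
```

The interaural time and level differences introduced by the two filters are what give the listener the impression of a lateralized source through ear inserts.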
Future Analyses
The intention of this current report was to examine the magnetic source
waveforms modeled around the N1. Given the complexity of this task and the likelihood
of involving multiple attentional and memory resources to perform a search, a distributed
network of anatomical areas is likely involved from stimulus onset to response. Stimulus
search and selection in a complex auditory scene will also involve the parietal streams for
determining the identity and location of preattentive sound objects (Alain et al., 2001),
the parietal-left frontal working memory/task setting network (D'Esposito and Badre,
2012), the right lateral monitoring of items held in working memory in online response
selection, or 'epoptic process' (Petrides, 2012), and the medial frontal involvement in
'energizing' the system to respond (Stuss et al., 2005), just to name a few of the major
processes that are well documented in the empirical literature. As such, next steps in
analyses involve using a model-free approach to source localization.
CHAPTER V: REFERENCES
Akyürek, E.G., & Schubö, A. (2011). The allocation of attention in displays with
simultaneously presented singletons. Biological Psychology, 87, 218-225.
Alain, C., & Arnott, S.R. (2000). Selectively attending to auditory objects. Frontiers in
Bioscience, 5, 202-212.
Alain, C., Arnott, S.R., Hevenor, S., Graham, S., & Grady, C.L. (2001). “What” and
“where” in the human auditory system. Proceedings of the National Academy of
Sciences, 98, 12301-12306.
Alain, C., Arnott, S.R., & Picton, T.W. (2001). Bottom-up and top-down influences on
auditory scene analysis: Evidence from event-related brain potentials. Journal of
Experimental Psychology: Human Perception and Performance, 27, 1072-1089.
Alain, C., He, Y., & Grady, C. (2008). The contribution of the inferior parietal lobe to
auditory spatial working memory. Journal of Cognitive Neuroscience, 20, 285-
295.
Alain, C., & Izenberg, A. (2003). Effects of attentional load on auditory scene analysis.
Journal of Cognitive Neuroscience, 15, 1063-1073.
Alain, C., & McDonald, K.L. (2007). Age-related differences in neuromagnetic brain
activity underlying concurrent sound perception. Journal of Neuroscience, 27,
1308-1314.
Alain, C., Reinke, K., He, Y., Wang, C.H., & Lobaugh, N. (2005). Hearing two things at
once: Neurophysiological indices of speech segregation and identification.
Journal of Cognitive Neuroscience, 17, 811-818.
Alain, C., Reinke, K., McDonald, K.L., Chau, W., Tam, F., Pacurar, A., & Graham, S.
(2005). Left thalamo-cortical network implicated in successful speech separation
and identification. NeuroImage, 26, 592-599.
Alain, C., Schuler, B.M., & McDonald, K.L. (2002). Neural activity associated with
distinguishing concurrent auditory objects. Journal of the Acoustical Society of
America, 111, 990-995.
Alain, C., & Woods, D.L. (1993). Distractor clustering enhances detection speed and
accuracy during selective listening. Perception & Psychophysics, 54, 509-514.
Alho, K., Connolly, J.F., Cheour, M., Lehtokoski, A., Huotilainen, M., Virtanen, J.,
Aulanko, R., & Ilmoniemi, R.J. (1998). Hemispheric lateralization in preattentive
processing of speech sounds. Neuroscience Letters, 258, 9-12.
Alho, K., Winkler, I., Escera, C., Huotilainen, M., Virtanen, J., Jääskeläinen, I.P.,
Pekkonen, E., & Ilmoniemi, R.J. (1998). Processing of novel sounds and
frequency changes in the human auditory cortex: Magentoencephalographic
recordings. Psychophysiology, 35, 211-224.
Arnott, S.R., Binns, M.A., Grady, C.L., & Alain, C. (2004). Assessing the auditory dual-
pathway model in humans. NeuroImage, 22, 401-408.
Assmann, P.F., & Summerfield, Q. (1990). Modeling the perception of concurrent
vowels: Vowels with different fundamental frequencies. Journal of the Acoustical
Society of America, 88, 680-697.
Assmann, P.F., & Summerfield, Q. (1994). The contribution of waveform interactions to
the perception of concurrent vowels. Journal of the Acoustical Society of America,
95, 471-484.
Brisson, B., & Jolicoeur, P. (2007). The N2pc component and stimulus duration.
NeuroReport, 18, 1163-1166.
Brisson, B., Robitaille, N., & Jolicoeur, P. (2007). Stimulus intensity affects the latency
but not the amplitude of the N2pc. NeuroReport, 18, 1627-1630.
Chalikia, M.H., & Bregman, A.S. (1989). The perceptual segregation of simultaneous
auditory signals: Pulse train segregation and vowel segregation. Perception &
Psychophysics, 46, 487-496.
Conci, M., Gramann, K., Müller, H.J., & Elliot, M.A. (2006). Electrophysiological
correlates of similarity-based interference during detection of visual forms.
Journal of Cognitive Neuroscience, 18, 880-888.
Crowley, K.E., & Colrain, I.M. (2004). A review of the evidence for P2 being an
independent component process: Age, sleep and modality. Clinical
Neurophysiology, 115, 732-744.
Cusack, R., & Roberts, B. (2000). Effects of differences in timbre on sequential grouping.
Perception & Psychophysics, 62, 1112-1120.
Dalton, P., & Lavie, N. (2004). Auditory attentional capture: Effects of singleton
distractor sounds. Journal of Experimental Psychology: Human Perception and
Performance, 30, 180-193.
Degerman, A., Rinne, T., Salmi, J., Salonen, O., & Alho, K. (2006). Selective attention to
sound location or pitch studied with fMRI. Brain Research, 1077, 123-134.
D'Esposito, M., & Badre, D. (2012). Combining the insights derived from lesion and
fMRI studies to understand the function of prefrontal cortex. In B. Levine and
F.I.M. Craik (Eds.), Mind and the Frontal Lobes: Cognition, Behavior, and Brain
Imaging (pp. 93-108). New York, NY: Oxford University Press.
Drennan, W.R., Gatehouse, S., & Lever, C. (2003). Perceptual segregation of competing
speech sounds: The role of spatial location. Journal of the Acoustical Society of
America, 114, 2178-2189.
Du, Y., He, Y., Ross, B., Bardouille, T., Wu, X., Li, L., & Alain, C. (2011). Human
auditory cortex activity shows additive effects of spectral and spatial cues during
speech segregation. Cerebral Cortex, 21, 698-707.
Dyson, B.J., & Alain, C. (2004). Representation of concurrent acoustic objects in primary
auditory cortex. Journal of the Acoustical Society of America, 115, 280-288.
Dyson, B.J., Alain, C., & He, Y. (2005). Effects of visual attentional load on low-level
auditory scene analysis. Cognitive, Affective, and Behavioral Neuroscience, 5,
319-338.
Dyson, B.J., Dunn, A.K., & Alain, C. (2010). Ventral and dorsal streams as modality-
independent phenomena. Cognitive Neuroscience, 1, 64-65.
Eimer, M. (1996). The N2pc component as an indicator of attentional selectivity.
Electroencephalography and Clinical Neurophysiology, 99, 225-234.
Eimer, M., Kiss, M., Press, C., & Sauter, D. (2009). The roles of feature-specific task set
and bottom-up salience in attentional capture: An ERP study. Journal of
Experimental Psychology: Human Perception and Performance, 35, 1316-1328.
Escera, C., Alho, K., Winkler, I., & Näätänen, R. (1998). Neural mechanisms of involuntary
attention to acoustic novelty and change. Journal of Cognitive Neuroscience, 10,
590-604.
Force, R.B., Venables, N.C., & Sponheim, S.R. (2008). An auditory processing
abnormality specific to liability for schizophrenia. Schizophrenia Research, 103,
298-310.
Gamble, M.L., & Luck, S.J. (2011). N2ac: An ERP component associated with the
focusing of attention within an auditory scene. Psychophysiology, 48, 1057-1068.
Gockel, H., Carlyon, R.P., & Micheyl, C. (1999). Context dependence of fundamental-
frequency discrimination: Lateralized temporal fringes. Journal of the Acoustical
Society of America, 106, 3553-3563.
Heitz, R.P., Cohen, J.Y., Woodman, G.F., & Schall, J.D. (2010). Neural correlates of
correct and errant attentional selection revealed through N2pc and frontal eye
field activity. Journal of Neurophysiology, 104, 2433-2441.
Hickey, C., Di Lollo, V., & McDonald, J.J. (2009). Electrophysiological indices of target
and distractor processing in visual search. Journal of Cognitive Neuroscience, 21,
760-775.
Hickey, C., McDonald, J.J., & Theeuwes, J. (2006). Electrophysiological evidence of the
capture of visual attention. Journal of Cognitive Neuroscience, 18, 604-613.
Hickey, C., van Zoest, W., & Theeuwes, J. (2010). The time course of exogenous and
endogenous control of covert attention. Experimental Brain Research, 201, 789-
796.
Hilimire, M.R., Mounts, J.R.W., Parks, N.A., & Corballis, P.M. (2011). Dynamics of
target and distractor processing in visual search: Evidence from event-related
brain potentials. Neuroscience Letters, 495, 196-200.
Hillyard, S.A., Hink, R.F., Schwent, V.L., & Picton, T.W. (1973). Electrical signs of
selective attention in the human brain. Science, 182, 177-180.
Hopf, J-M., Luck, S.J., Girelli, M., Hagner, T., Mangun, G.R., Scheich, H., & Heinze, H-
J. (2000). Neural sources of focused attention in visual search. Cerebral Cortex,
10, 1233-1241.
Hugdahl, K., Westerhausen, R., Alho, K., Medvedev, S., & Hamalainen, H. (2008). The
effect of stimulus intensity on the right ear advantage in dichotic listening.
Neuroscience Letters, 431, 90-94.
Johnson, J.A., & Zatorre, R.J. (2005). Attention to simultaneous unrelated auditory and
visual events: Behavioral and neural correlates. Cerebral Cortex, 15, 1609-1620.
Johnson, J.A., & Zatorre, R.J. (2006). Neural substrates for dividing and focusing
attention between simultaneous auditory and visual events. NeuroImage, 31,
1673-1681.
Katayama, J., & Polich, J. (1998). Stimulus context determines P3a and P3b.
Psychophysiology, 35, 23-33.
Kehrer, S., Kraft, A., Irlbacher, K., Koch, S.P., Hagendorf, H., Kathmann, N., Brandt,
S.A. (2009). Electrophysiological evidence for cognitive control during conflict
processing in visual spatial attention. Psychological Research, 73, 751-761.
Kiss, M., Van Velzen, J., & Eimer, M. (2008). The N2pc component and its link to
attention shifts and spatially selective visual processing. Psychophysiology, 45,
240-249.
Leung, A.W.S., & Alain, C. (2011). Working memory load modulates auditory what and
where neural networks. NeuroImage, 55, 1260-1269.
Lien, M-C., Ruthruff, E., & Cornett, L. (2010). Attentional capture by singletons is
contingent on top-down control settings: Evidence from electrophysiological
measures. Visual Cognition, 18, 682-727.
Liu, Q., Li, H., Campos, J.L., Wang, Q., Zhang, Y., Qiu, J., Zhang, Q., & Sun, H-J.
(2009). The N2pc component in ERP and the lateralization effect of language on
color perception. Neuroscience Letters, 454, 58-61.
Luck, S.J., & Hillyard, S.A. (1994a). Electrophysiological correlates of feature analysis
during visual search. Psychophysiology, 31, 291-308.
Luck, S.J., & Hillyard, S.A. (1994b). Spatial filtering during visual search: Evidence
from human electrophysiology. Journal of Experimental Psychology: Human
Perception and Performance, 20, 1000-1014.
Mazza, V., Turatto, M., & Caramazza, A. (2009). Attention selection, distractor
suppression and N2pc. Cortex, 45, 879-890.
Mazza, V., Turatto, M., Umiltà, C., & Eimer, M. (2007). Attentional selection and
identification of visual objects are reflected by distinct electrophysiological
responses. Experimental Brain Research, 181, 531-536.
Näätänen, R., Kujala, T., & Winkler, I. (2011). Auditory processing that leads to
conscious perception: A unique window to central auditory processing opened by
the mismatch negativity and related responses. Psychophysiology, 48, 4-22.
Näätänen, R., & Picton, T.W. (1987). The N1 wave of the human electric and magnetic
response to sound: A review and an analysis of the component structure.
Psychophysiology, 24, 375-425.
Näätänen, R., & Winkler, I. (1999). The concept of auditory stimulus representation in
cognitive neuroscience. Psychological Bulletin, 125, 826-859.
Nachson, I. (1978). Handedness and dichotic listening to nonverbal features of speech.
Perceptual and Motor Skills, 47, 1111.
Neelon, M.F., Williams, J., & Garrell, P.C. (2006). The effects of auditory attention
measured from human electrocorticograms. Clinical Neurophysiology, 117, 504-
521.
Nityananda, V., & Bee, M.A. (2011). Finding your mate at a cocktail party: Frequency
separation promotes auditory stream segregation of concurrent voices in
multi-species frog choruses. PLoS ONE, 6, 1-11.
Ofek, E., & Pratt, H. (2004). Ear advantage and attention: An ERP study of auditory cued
attention. Hearing Research, 189, 107-118.
Petrides, M. (2012). The mid-dorsolateral prefrontal-parietal network and the epoptic
process. In D.T. Stuss & R.T. Knight (Eds.), Principles of Frontal Lobe Function,
2nd Edition. New York, NY: Oxford University Press.
Robitaille, N., & Jolicoeur, P. (2006). Effect of cue-target interval on the N2pc. Cognitive
Neuroscience and Neuropsychology, 17, 1655-1658.
Rosenthal, C.R., Walsh, V., Mannan, S.K., Anderson, E.J., Hawken, M.B., & Kennard,
C. (2006). Temporal dynamics of parietal cortex involvement in visual search.
Neuropsychologia, 44, 731-743.
Sabri, M., Liebenthal, E., Waldron, E.J., Medler, D.A., & Binder, J.R. (2006). Attentional
modulation in the detection of irrelevant deviance: A simultaneous ERP/fMRI
study. Journal of Cognitive Neuroscience, 18, 689-700.
Sawaki, R., & Luck, S.J. (2010). Capture versus suppression of attention by salient
singletons: Electrophysiological evidence for an automatic attend-to-me signal.
Attention, Perception, & Psychophysics, 72, 1455-1470.
Schröger, E. (1995). Processing of auditory deviants with changes in one versus two
stimulus dimensions. Psychophysiology, 32, 55-65.
Shackleton, T.M., & Meddis, R. (1992). The role of interaural time difference and
fundamental frequency difference in the identification of concurrent vowel pairs.
Journal of the Acoustical Society of America, 91, 3579-3581.
Sininger, Y.S., & Bhatara, A. (2012). Laterality of basic auditory perception. Laterality,
17, 129-149.
Snyder, J.S., & Alain, C. (2005). Age-related changes in neural activity associated with
concurrent vowel segregation. Cognitive Brain Research, 24, 492-499.
Snyder, J.S., Alain, C., & Picton, T.W. (2006). Effects of attention on neuroelectric
correlates of auditory stream segregation. Journal of Cognitive Neuroscience, 18,
1-13.
Stuss, D.T., Binns, M.A., Murphy, K.J., & Alexander, M.P. (2002). Dissociations within
the anterior attentional system: Effects of task complexity and irrelevant
information on reaction time speed and accuracy.
Stuss, D.T., Alexander, M.P., Shallice, T., Picton, T.W., Binns, M.A., Macdonald, R.,
Borowiec, A., & Katz, D.I. (2005). Multiple frontal systems controlling response
speed. Neuropsychologia, 43, 396-417.
Takegata, R., & Morotomi, T. (1999). Integrated neural representation of sound and
temporal features in human auditory sensory memory: An event-related potential
study. Neuroscience Letters, 274, 207-210.
Tong, Y., & Melara, R.D. (2007). Behavioral and electrophysiological effects of
distractor variation on auditory selective attention. Brain Research, 1166, 110-
123.
Tsuchida, Y., Katayama, J., & Murohashi, H. (2012). Working memory capacity affects
the interference control of distracters at auditory gating. Neuroscience Letters,
516, 62-66.
Wenzel, E.M., Arruda, M., Kistler, D.J., & Wightman, F.L. (1993). Localization using
nonindividualized head-related transfer functions. Journal of the Acoustical
Society of America, 94, 111-123.
Wetzel, N., & Schröger, E. (2007). Modulation of involuntary attention by the duration of
novel and pitch deviant sounds in children and adolescents. Biological
Psychology, 75, 24-31.
Wightman, F.L., & Kistler, D.J. (1989a). Headphone simulation of free-field listening, I:
Stimulus synthesis. Journal of the Acoustical Society of America, 85, 858-867.
Wightman, F.L., & Kistler, D.J. (1989b). Headphone simulation of free-field listening, II:
Psychophysical validation. Journal of the Acoustical Society of America, 85,
868-878.
Wioland, N., Rudolf, G., Metz-Lutz, M.N., Mutschler, V., & Marescaux, C. (1999).
Cerebral correlates of hemispheric lateralization during a pitch discrimination
task: An ERP study in dichotic situation. Clinical Neurophysiology, 110, 516-523.
Woldorff, M.G., Gallen, C.C., Hampson, S.A., Hillyard, S.A., Pantev, C., Sobel, D., &
Bloom, F.E. (1993). Modulation of early sensory processing in human auditory
cortex during auditory selective attention. Proceedings of the National Academy
of Sciences of the United States of America, 90, 8722-8726.
Woldorff, M.G., & Hillyard, S.A. (1991). Modulation of early auditory processing during
selective listening to rapidly presented tones. Electroencephalography and
Clinical Neurophysiology, 79, 170-191.
Woodman, G.F., Arita, J.T., & Luck, S.J. (2009). A cuing study of the N2pc component:
An index of attentional deployment to objects rather than spatial locations. Brain
Research, 1297, 101-111.
Wykowska, A., & Schubö, A. (2009). On the temporal relation of top-down and bottom-
up mechanisms during guidance of attention. Journal of Cognitive Neuroscience,
22, 640-654.
Zhao, G., Liu, Q., Zhang, Y., Jiao, J., Zhang, Q., Sun, H., & Li, H. (2011). The amplitude
of the N2pc reflects the physical disparity between target item and distractors.
Neuroscience Letters, 491, 68-72.
Table 1: Mean amplitude values derived from trials with correct responses.
Table 1a: Mean Amplitude at N1m

Overall Conditions       Left Hemisphere     Right Hemisphere
                         Group Mean (SD)     Group Mean (SD)
TPL                      -56.01 (22.31)      -50.01 (20.72)
TPR                      -55.82 (21.91)      -48.84 (19.98)
TA                       -57.24 (21.82)      -50.43 (20.47)

Semi-Tone Differences    Left Hemisphere                     Right Hemisphere
                         1 ST             4 ST               1 ST             4 ST
TPL                      -58.00 (24.03)   -53.90 (21.07)     -50.47 (21.39)   -49.45 (20.26)
TPR                      -57.03 (22.96)   -54.59 (21.12)     -50.35 (20.60)   -47.29 (19.75)
TA                       -57.09 (21.62)   -57.44 (22.11)     -51.17 (20.77)   -49.69 (20.24)
Table 1b: Mean Amplitude at P2m

Overall Conditions       Left Hemisphere     Right Hemisphere
                         Group Mean (SD)     Group Mean (SD)
TPL                      -38.28 (19.50)      -25.43 (17.03)
TPR                      -36.01 (18.65)      -24.96 (15.65)
TA                       -35.51 (17.66)      -25.52 (15.81)

Semi-Tone Differences    Left Hemisphere                     Right Hemisphere
                         1 ST             4 ST               1 ST             4 ST
TPL                      -37.85 (21.15)   -38.63 (18.07)     -24.04 (17.97)   -26.47 (16.22)
TPR                      -35.69 (19.43)   -36.23 (18.14)     -24.80 (16.50)   -25.06 (15.48)
TA                       -33.53 (16.53)   -37.54 (19.10)     -24.29 (15.51)   -26.77 (16.29)
Table 1c: Mean Amplitude at P3m

Overall Conditions       Left Hemisphere     Right Hemisphere
                         Group Mean (SD)     Group Mean (SD)
TPL                      -54.07 (24.44)      -34.63 (20.40)
TPR                      -51.49 (24.48)      -34.39 (18.24)
TA                       -52.23 (22.42)      -34.40 (18.90)

Semi-Tone Differences    Left Hemisphere                     Right Hemisphere
                         1 ST             4 ST               1 ST             4 ST
TPL                      -53.20 (26.82)   -54.54 (22.58)     -33.35 (22.29)   -35.56 (18.79)
TPR                      -52.02 (24.27)   -50.93 (25.40)     -34.94 (18.19)   -33.69 (19.74)
TA                       -51.01 (20.04)   -53.45 (25.07)     -34.46 (18.48)   -34.38 (19.64)
Table 1d: Mean Amplitude at Late

Overall Conditions       Left Hemisphere     Right Hemisphere
                         Group Mean (SD)     Group Mean (SD)
TPL                      -33.78 (20.37)      -21.86 (15.83)
TPR                      -31.09 (22.55)      -21.14 (12.46)
TA                       -42.88 (20.14)      -24.88 (13.37)

Semi-Tone Differences    Left Hemisphere                     Right Hemisphere
                         1 ST             4 ST               1 ST             4 ST
TPL                      -32.28 (23.19)   -34.87 (18.34)     -19.93 (16.96)   -23.28 (15.13)
TPR                      -31.14 (22.38)   -31.13 (23.86)     -21.72 (15.50)   -20.44 (13.19)
TA                       -40.83 (18.40)   -44.89 (22.36)     -24.50 (13.49)   -25.33 (13.85)
Table 2: Mean amplitude values derived from trials with incorrect responses.
Table 2a: Mean Amplitude at N1m

Overall Conditions       Left Hemisphere     Right Hemisphere
                         Group Mean (SD)     Group Mean (SD)
TPL                      -57.73 (23.67)      -49.41 (22.82)
TPR                      -51.99 (20.63)      -47.71 (19.91)
TA                       -54.48 (25.73)      -47.40 (21.62)

Semi-Tone Differences    Left Hemisphere                     Right Hemisphere
                         1 ST             4 ST               1 ST             4 ST
TPL                      -56.48 (23.17)   -55.61 (23.09)     -51.39 (25.99)   -50.06 (19.99)
TPR                      -50.93 (23.80)   -55.27 (22.90)     -45.47 (23.67)   -48.46 (22.13)
TA                       -52.14 (27.27)   -52.39 (25.71)     -50.71 (23.03)   -41.51 (28.35)
Table 2b: Mean Amplitude at P2m

Overall Conditions       Left Hemisphere     Right Hemisphere
                         Group Mean (SD)     Group Mean (SD)
TPL                      -35.72 (22.26)      -24.49 (20.75)
TPR                      -33.56 (18.09)      -25.37 (14.34)
TA                       -31.78 (17.11)      -21.79 (18.78)

Semi-Tone Differences    Left Hemisphere                     Right Hemisphere
                         1 ST             4 ST               1 ST             4 ST
TPL                      -36.45 (20.27)   -32.13 (36.52)     -23.55 (18.99)   -28.46 (20.39)
TPR                      -32.95 (24.11)   -35.74 (19.82)     -21.56 (16.98)   -29.09 (13.64)
TA                       -27.83 (22.94)   -34.20 (21.37)     -25.55 (15.88)   -18.40 (26.78)
Table 2c: Mean Amplitude at Late

Overall Conditions       Left Hemisphere     Right Hemisphere
                         Group Mean (SD)     Group Mean (SD)
TPL                      -43.07 (25.08)      -22.22 (21.97)
TPR                      -39.81 (25.64)      -24.45 (16.87)
TA                       -40.47 (34.50)      -21.20 (16.09)

Semi-Tone Differences    Left Hemisphere                     Right Hemisphere
                         1 ST             4 ST               1 ST             4 ST
TPL                      -46.55 (28.99)   -36.94 (22.25)     -25.21 (27.49)   -23.74 (19.50)
TPR                      -40.42 (25.19)   -41.00 (25.83)     -23.28 (15.53)   -25.55 (19.09)
TA                       -37.91 (36.19)   -36.49 (26.58)     -21.88 (14.92)   -17.50 (23.17)
[Figure 1 image: trial timeline showing start of trial (1000 ms), tones (200 ms, 500-1500 ms), and response (end of trial)]
Figure 1: Illustration of the progression of a single experimental trial.
[Figure 2 image, panels A-C. Dipole color legend: red = representative participant, right dipole; blue = representative participant, left dipole; green = group mean, right dipole; purple = group mean, left dipole]
Figure 2: Neuromagnetic activity averaged over all experimental trials. A) Auditory evoked fields (AEFs) for one representative participant. B) Contour maps for the same participant at the N1m. C) The locations of the N1m dipoles in the left and right hemispheres for the same participant, as well as the overall group mean, using an MRI template from BESA 5.2.
Figure 3: Response times averaged across all 17 participants. The top panel shows the response times for the three trial types (i.e., the presence/absence and location of target presentation) of the Target condition. TPL = Target Present Left; TPR = Target Present Right; TA = Target Absent. The lower panel shows the response times for the two trial types (i.e., frequency difference between concurrently presented stimuli) of the SemiTone condition. 1ST = 1 semi-tone difference; 4ST = 4 semi-tone difference.
[Figure 3 image: two bar graphs of response time (ms), axis range 500-900 ms; top panel: Target condition (TPL, TPR, TA); bottom panel: SemiTone condition (1ST, 4ST)]
[Figure 4 image: left- and right-hemisphere source waveforms with the N1, P2, P3, and Late deflections labeled. Panels: (a) All Correct Trials; (b) Correct Trials, 1 SemiTone; (c) Correct Trials, 4 SemiTone]
Figure 4: Source waveforms modeled from single dipoles seeded in the auditory cortex of the left and right hemispheres. In panel (a), all correct trials are included and split based upon the target being presented to the left ear (TPL), right ear (TPR), or being absent (TA). Panels (b) and (c) show the same data for correct trials, but separated according to whether the frequencies of the two vowels on a trial differed by 1 or 4 semi-tones, respectively. “Source” = Hemisphere.