Auditory Search:
The Deployment of Attention within a Complex Auditory Scene
by
Susan Gillingham
A thesis submitted in conformity with the requirements
for the degree of Master of Arts
Department of Psychology
University of Toronto
© Copyright by Susan Gillingham (2012)
ABSTRACT
Auditory search: The deployment of attention within a complex auditory scene
Master of Arts, 2012
Susan Gillingham
Department of Psychology
University of Toronto
Current theories of auditory attention are largely based upon studies that either examine
the presentation of a single auditory stimulus or require the identification and labeling of
stimuli presented sequentially. Whether these theories apply in more complex
ecologically-valid environments where multiple sound sources are simultaneously active
is still unknown. This study examined the pattern of neuromagnetic responses elicited
when participants had to perform a search in an auditory language-based ‘scene’ for a
stimulus matching an imperative target held in working memory. The analysis of source
waveforms revealed left lateralized patterns of activity that distinguished target present
from target absent trials. Similar source waveform amplitudes were found when the
target was presented in the left or right hemispace. The results suggest that auditory
search for speech sounds engages a left-lateralized process in the superior temporal gyrus.
ACKNOWLEDGMENTS
Many people contributed to the completion of this thesis. I would like to specially thank
my academic supervisor, Dr. Claude Alain, for the opportunity to engage in this exciting
area of research and for his guidance and patience, and his lab manager, Yu He, who
played such an integral role in the initiation and monitoring of this project’s progress. I
am grateful to my committee members, Dr. Bradley Buchsbaum and Dr. Bernhard Ross
for their advice and guidance in steering the project down future paths. And finally to all
the members of the MEG and EEG labs at the Rotman Research Institute, the many
members of my graduate cohort and the administrative staff in the Psychology
Department at the University of Toronto who offered support throughout the year and
ensured that I was going to complete this project on time. This study was funded by the
Natural Sciences and Engineering Research Council of Canada (NSERC).
TABLE of CONTENTS
List of Tables
List of Figures
Chapter I: Introduction
Chapter II: Method
Chapter III: Results
Chapter IV: Discussion
Chapter V: References
Tables
Figures
LIST of TABLES
Table 1: Mean amplitude values derived from trials with correct responses
Table 2: Mean amplitude values derived from trials with errant responses
LIST of FIGURES
Figure 1: Illustration of the progression of a single experimental trial
Figure 2: The neuromagnetic activity and location of dipoles in a representative
participant
Figure 3: Graphical representation of behavioural response times
Figure 4: Graphical representation of the neuromagnetic source waveforms
CHAPTER I: INTRODUCTION
Research on attention is one of the major areas of investigation within
psychology, neurology and cognitive neuroscience. There are many areas of active
investigation that aim to understand the brain networks and mechanisms that support
attention, in addition to the relationship between attention and other cognitive processes
like working memory, vigilance, and learning. This thesis focuses on auditory attention
with a particular emphasis on the neural underpinnings of auditory search for a
predefined sound (i.e., a target) embedded in a “cluttered” auditory scene.
Previous research in auditory attention has made significant progress in advancing
our knowledge of how and where incoming sound stimuli are analysed and processed in
the brain. Behavioural and imaging studies investigating sounds presented either
sequentially (Cusack & Roberts, 2000; Snyder, Alain, & Picton, 2006), concurrently
(Alain & Izenberg, 2003; Dyson, Alain, & He, 2005), or intending to produce task
interference (Alain & Woods, 1993) have provided confirming evidence for a prominent
theory of the pre-attentive segregation of acoustic information into “sound objects” which
subsequently form the basic unit for attentional processing (Alain & Arnott, 2000). This
theory inherently suggests that multiple units are concurrently available for attentional
capture, selection, and processing. Experiments designed to manipulate the auditory
information to be attentively monitored and updated in working memory have
demonstrated the importance of the inferior parietal lobule (IPL) in serving this function
(Alain, He, & Grady, 2008) and, with consideration of how a final integrated perception
of an auditory stimulus is produced (Dyson, Dunn, & Alain, 2010), that there is
anatomical differentiation of processing within this region dependent upon the type of
auditory units drawn into attention. Sub-regions of the dorsolateral IPL are primarily
responsible for processing units that carry identifying information (i.e., the “what”
pathway) while the ventromedial IPL is involved in processing the units localizing the
origin of the sound (i.e., the “where” pathway) within our external environment (Alain,
Arnott, Hevenor, Graham, & Grady, 2001; Arnott, Binns, Grady, & Alain, 2004; Leung
& Alain 2011). These theories of acoustic segregation, identification, and localization
provide a foundational conceptualization of cortical mapping in auditory attention.
The Deployment of Attention in Complex Contexts
This cortical mapping, however, has primarily been established using paradigms that
present simple sound information in sequential streams (e.g., Alain & Woods, 1993;
Dalton & Lavie, 2004; Wetzel & Schröger, 2007). Under these circumstances, the
object-based account provides viable explanations for top-down (schematic
representations) and bottom-up (stimulus properties) interactions leading to the
deployment of attention to task-relevant auditory information at the expense of irrelevant
information (Degerman, Rinne, Salmi, Salonen, & Alho, 2006; Johnson & Zatorre, 2005,
2006). However, it is still unclear whether the same account would also apply in more
complex, ecologically valid contexts, in which sounds may occur simultaneously rather
than one at a time. This is an important issue to address if one wishes to extend theory
developed from simple laboratory tasks to more ecologically valid complex listening
situations in which listeners must segregate and select a particular sound object in the
presence of other task-irrelevant sounds occurring simultaneously, as well as sensory
information in other modalities.
Attention and auditory scene analysis have often been studied using paradigms
where contextual changes to stimulus presentation serve as cues to aid in the segregation
and identification of the multiple sound objects. The main findings using both non-speech
(Gockel, Carlyon, & Micheyl, 1999) and speech (Assmann and Summerfield, 1990;
Chalikia and Bregman, 1989; Drennan, Gatehouse, and Lever, 2003; Shackleton and
Meddis, 1992) sounds indicate that increasing the differences between concurrent stimuli
in either spectral or spatial characteristics increases accuracy in identifying
the sounds. A combination of changes in these spectral and spatial acoustic cues
appears to be linearly additive, easing the identification of sounds even further (Du et al.,
2011; Schröger, 1995; Takegata and Morotomi, 1999). Interestingly, even though spectral
and spatial information is believed to be processed in separate parietal pathways as
mentioned previously, the additivity effects, at least for speech sounds, occur early during
the preattentive processing of the multiple sound objects in auditory cortex and are
hypothesized to optimize perceptual organization (Du et al., 2011). The question still
remains, however, of how attention is deployed and processing occurs in response to
multiple sound units when a search for a specific unit, in contrast to just identifying or
locating all units, is the imperative task.
This question was addressed in the visual domain using electroencephalography
(EEG) and stimulus arrays standard in visual search experiments. Luck and Hillyard
(1994a,b) first demonstrated that target detection in a complex visual array was
characterized by a unique neural component labelled as the N2pc (N2-posterior-
contralateral). The N2pc, sourced from the posterior occipital lobe, appeared as a greater
amount of negativity 200 – 300 ms after target presentation in the hemisphere
contralateral, relative to ipsilateral, to the visual hemi-field in which the target was
presented (Luck & Hillyard, 1994a). The N2pc only occurred in complex arrays when
both a target and distracting stimuli were present and Luck and Hillyard (1994b) initially
interpreted this finding as indicative of a filtering mechanism that discounted distractors
before selecting an item as the target. Eimer (1996) demonstrated that filtering of
distractors was unlikely since the N2pc was present and similar in signal strength
regardless of the number of distractors present in the visual array. Combined with Luck
and Hillyard’s (1994a) demonstration that the N2pc occurs in response specifically to
target features, and not merely to pop-out features, this suggested that the N2pc is a
marker for the identification of task-relevant information, guided by pre-attentively
maintained information, in situations with multiple competing sources of input.
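The contralateral-minus-ipsilateral logic behind the N2pc can be sketched in a few lines. This is an illustrative reconstruction, not code from any of the cited studies; the toy waveform and window bounds are assumptions chosen to mimic the 200–300 ms negativity described above.

```python
import numpy as np

def contra_minus_ipsi(left_site, right_site, target_side):
    """Contralateral-minus-ipsilateral difference wave.

    left_site, right_site: ERP time courses from homologous posterior
    electrodes over the left and right hemispheres.
    target_side: hemifield ('left' or 'right') where the target appeared.
    """
    if target_side == 'left':
        contra, ipsi = right_site, left_site  # right hemisphere is contralateral
    else:
        contra, ipsi = left_site, right_site
    return contra - ipsi

# Toy trial: a negativity peaking near 250 ms only at the contralateral site
t_ms = np.arange(0, 500, 2.0)
contra_wave = -np.exp(-((t_ms - 250.0) ** 2) / (2 * 30.0 ** 2))
ipsi_wave = np.zeros_like(t_ms)

# Target in the left hemifield, so the right-hemisphere site is contralateral
diff = contra_minus_ipsi(ipsi_wave, contra_wave, target_side='left')
n2pc_window = (t_ms >= 200) & (t_ms <= 300)
mean_n2pc = diff[n2pc_window].mean()  # negative mean: the N2pc signature
```

A target in the opposite hemifield simply swaps which site counts as contralateral, inverting the difference wave.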
The Characterization of the N2pc and its Relation to the Deployment of Attention
The nature of the deployment of visual attention to target relevant information in
complex environments as measured by the N2pc has been extensively investigated in the
fifteen years since its first discovery. The task-relevant specificity of the N2pc was
further confirmed when it appeared only on target-present trials using displays containing
distractors with target-similar features (Akyürek & Schubo, 2011). Additional support
can be found in studies using spatial cuing; the N2pc arose only on trials when the target
object appeared in a display and not simply in response to an informative spatial cue
(Woodman, Arita, & Luck, 2009), although an attenuated component can occur after the
appearance of spatially-cued distractor-only displays (Kiss, Van Velzen, & Eimer, 2008).
While there is general agreement on this task-specific representation illustrated by
the N2pc, there is still debate as to whether or not it simultaneously incorporates the
suppression of irrelevant information from attentional capture. Arguments range from the
possibility that N2pc reflects primarily target enhancement (Mazza, Turatto, &
Caramazza, 2009; Robitaille & Jolicoeur, 2006), especially under increasingly low
conditions of target-distractor disparity (Zhao et al., 2011), to an assumption that the two
processes are inherently tied together and represented by the N2pc (Conci, Gramann,
Müller, & Elliot, 2006), to the N2pc representing a summation of separate processes of
both target selection and distractor suppression that can each elicit their own separate
components under appropriate task design (Hickey, Di Lollo, & McDonald, 2009;
Hilimire, Mounts, Parks, & Corballis, 2011). A somewhat intermediate hypothesis
suggests that the N2pc is responsible for the selection and enhancement of target
information but that increasing task complexity to incorporate the labelling of a target
stimulus (and not just noticing its presence) will elicit later components separate from the
N2pc that represent the detailed discrimination of the target (Mazza, Turatto, Umiltà, &
Eimer, 2007) and the contextually necessary suppression of distracting information (Pd –
distractor positivity; Sawaki & Luck, 2010). Regardless of the preference for either of
these hypotheses, onset time of the N2pc has been shown to be affected only by the
strength of the target selection criteria (Brisson, Robitaille, & Jolicoeur, 2007) and not by
manipulations to the interfering information (Robitaille & Jolicoeur, 2006), suggesting
that this component is a reliable measure of the initial deployment of attention to task-
relevant information.
Manipulation of the N2pc component has also addressed the question of the
interaction between top-down attentional versus bottom-up stimulus influenced control of
attentional deployment. Studies referencing the appearance of the N2pc in response to
target manipulation in contrast to distractor manipulation suggest that top-down
attentional control is predominantly responsible for the deployment and capture of
attention by task-relevant information (Eimer, Kiss, Press, & Sauter, 2009; Lien,
Ruthruff, & Cornett, 2010; Wykowska & Schubö, 2009), an effect that becomes even
stronger as task difficulty increases (Kehrer et al., 2009). Opposite stimulus-driven results
have been noted (Hickey, McDonald, Theeuwes, 2006), and it has been suggested that
the timing of the attentional shift will determine if it was motivated by top-down or
bottom-up processes (Hickey, van Zoest, & Theeuwes, 2010).
Although the N2pc occurs both in the right and left hemisphere contralateral to
the visual presentation of the target stimulus, there is some suggestion that lateralization
in the strength of the component signal may occur as a factor of task-driven hemispheric
specificity. In a study using two conditions differentiated by the target’s colour
eccentricity compared to the distractors, the amplitude of the N2pc was larger only in the
left hemisphere and only when the target differed in colour from the distractors (Liu et al.,
2009). The authors interpreted this as a left hemisphere influence due to endogenous
naming of the colour distinction.
Lastly, the N2pc has also been used in the investigation of errant attentional
processing leading to behavioural errors in target selection. Hypothesizing that activity in
frontal eye field neurons is associated with the shifts in attention represented by the N2pc
component produced from the visual cortex, Heitz, Cohen, Woodman, & Schall (2010)
used a memory-guided saccade task to measure attentional deployment in monkeys and
specifically examined the neural activity on erroneous trials. They discovered that the m-
N2pc (m = monkey) occurred on trials immediately before the behavioural response of an
erroneous eye saccade, suggesting that errors represent a mis-deployment of attention
rather than a malfunction of the attentional system as a whole (Heitz et al., 2010).
Applying Knowledge of the N2pc to the Deployment of Auditory Attention
The N2pc has been shown to be a reliable indicator of the deployment of visual
attention to task-relevant information under conditions using complex stimulus arrays. In
addition, it also likely represents characteristics of attentional mechanisms responsible for
top-down guidance of response selection, hemispheric dominance in response to
particular external stimuli, and the misdirection of processes leading to errant behavioural
responses. As such, it can be expected that a similar component could be elicited in other
modalities and may be useful for advancing the investigation of auditory attentional
deployment in more complex and ecologically valid contexts. To date only one study has
been published that has attempted a first step in this direction. Using an auditory scene
containing two of four possible acoustic stimuli composed of either white noise or a
mixture of tones or frequencies, Gamble and Luck (2011) elicited a component that is
possibly analogous to the N2pc on trials where participants correctly indicated the
presence of a predefined target stimulus. This component also occurred contralateral to
the presentation of the target auditory stimulus and only when the target was presented
simultaneously with a distractor stimulus, but was primarily sourced from an anterior
cluster of electrodes (N2ac – anterior contralateral component).
The purpose of the current experiment was four-fold. First, this study will
attempt to replicate the preliminary findings of a unique component indicative of the
deployment of auditory attention toward task-relevant stimuli in a complex acoustic
array. Second, this experiment will utilize stimuli (English vowel sounds) meant to be a
part of a step-wise approach toward translating findings from synthetic laboratory
contexts to possible real-life applicability. It is expected that these stimuli will elicit an
N2ac type component as did the noise stimuli utilized by Gamble and Luck, but that there
may also be lateralization preferences based upon the stimuli holding language-based
semantic meaning as was seen in the Liu et al. (2009) study mentioned previously. Third,
the source of the N2ac will be more accurately localized with the use of
magnetoencephalography (MEG) rather than EEG, just as Hopf and colleagues (2000)
were able to decompose the N2pc in vision research into two distinct neural responses
over the parietal and occipital cortices. MEG will also be exceptionally important
considering lateralization differences in the strength of the N2ac signal may occur. In this
thesis it is this latter function of MEG that will be important in guiding data analysis.
More in-depth examination of interactions between brain areas will occur in a second
phase of data analysis not considered here. Fourth, by manipulating task difficulty using
two conditions in which vowel sounds are either close or far apart in their frequency
values, making it respectively harder and easier to discriminate the vowels, the N2ac
component can be characterized in terms of its onset, amplitude, and offset. In addition,
the more difficult condition is expected to produce sufficiently large error rates
(pilot data suggest approximately 10–15%, compared to 4–5% in the Gamble and Luck
study) to allow analysis of errant trials for maintained but misdirected attentional features.
CHAPTER II: METHOD
Participants
Eighteen (6 male, 12 female) young, neurologically healthy adults between 20
and 30 years of age were recruited from the established in-house research participant pool
at Baycrest. One participant (female) was excluded after testing because of problems with
the MEG system during data acquisition. All remaining 17 participants are included in
the analyses (mean age = 25 ± 2.9 years). Participants were screened for
exclusion/inclusion criteria through both review of the information available in the
database and by self-report during a standard in-house recruitment questionnaire
administered during the first contact. Inclusion criteria included normal (or corrected)
vision and hearing (audiograms assessing pure-tone thresholds within octave frequencies
from 250 – 8000 Hz were performed at the beginning of each testing session), English as
first or primary language, right-handedness, and the absence of diseases that might compromise
brain function such as diabetes. All participants provided informed consent at the
beginning of each testing session abiding by the standards set forth by the Baycrest
Research Ethics Board and the Declaration of Helsinki.
Task
The main experimental task involved, on each trial, the concurrent presentation of
two different speech sounds, one to each ear, thereby composing an auditory ‘scene’. The
auditory stimuli were four synthesized American English vowel sounds: /ah/, /ee/, /er/,
/ae/ (Assmann and Summerfield, 1994; Du et al., 2011). Stimuli were presented to the
participant during magnetoencephalographic data acquisition binaurally at 75 dB through
Etymotic ER3A insert earphones connected with 1.5m of plastic tubing. To simulate
spatial hearing in natural environments, each participant completed a calibration task
where single vowels were presented at different spatial locations using multiple head-
related transfer function (HRTF) coefficients (Wenzel, Arruda, Kistler, & Wightman,
1993; Wightman and Kistler, 1989a, 1989b). Participants were required to indicate when
the vowels seemed to be occurring 45° to the left and right of the midline and that
coefficient was then applied to the vowels for the remainder of the experiment. On each
experimental trial, one vowel was presented with the fundamental frequency (f0) set at
100 Hz, and the other set at a frequency either 1 (106 Hz) or 4 (126 Hz) semitones higher
than f0 (f1 or f4) (Alain et al., 2005). The objective for the participant was to indicate
whether a predefined target vowel was present or absent by pressing one of two possible
buttons.
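The semitone offsets above follow the standard equal-tempered relation f = f0 · 2^(n/12). A quick check (illustrative only, not part of the stimulus-generation code) reproduces the reported 106 Hz and 126 Hz values:

```python
def semitones_above(f0_hz, n):
    """Frequency n equal-tempered semitones above f0."""
    return f0_hz * 2.0 ** (n / 12.0)

f0 = 100.0
f1 = semitones_above(f0, 1)  # ~105.95 Hz, rounded to the reported 106 Hz
f4 = semitones_above(f0, 4)  # ~125.99 Hz, rounded to the reported 126 Hz
```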
The experiment took place in four stages: i) a brief audiogram and calibration of
the HRTF, ii) a stimulus identification task to ensure that participants could correctly
identify the four different vowel sounds (24 trials, 4 vowels, 2 trials at each of the 3
frequency levels), run twice, iii) a brief 5 trial practice before each of the four
experimental blocks demonstrating the task and the target stimulus on that particular
block, and iv) the actual experimental block, during which MEG recording took place.
An experimental trial consisted of a silent wait period of 1000 ms, followed by the 200
ms auditory scene (this presentation length was recommended in the vision literature,
Brisson & Jolicoeur, 2007, and has been used in similar auditory studies within our own
laboratory, Alain et al., 2005; Du et al., 2011), and the subsequent participant response,
after which the 1000 ms wait period for the next trial began immediately. The participant
was required to make a button response on each trial; “1” if they felt the target was one of
the two vowels presented on that trial or “2” if they felt the target was not present. See
Figure 1 for an illustration of the progression of a trial.
Figure 1 here
There were four experimental blocks, each with one of the four vowels designated
as the target stimulus. The order of the target vowel presented in each block was
counterbalanced across participants. Each block consisted of 336 trials, half of which
contained the target vowel. On the 168 trials containing the target vowel, half of the
targets were presented to the left hemispace and the other half to the right hemispace.
Task context was manipulated by the frequency difference between the pairs of vowels
on each trial (i.e., f0/f1 or f0/f4). The presentation of both the target and distractor vowels at
each frequency to the left and right hemispace, the pairing of each vowel with the three
remaining vowels, and the number of trials representing the frequency differences
between concurrent vowels were equal, counterbalanced, and randomly presented within
each block. Each experimental block lasted between 10 and 12 minutes. After testing,
participants were debriefed with a questionnaire probing their feedback on the ease of
doing the task and their strategies for performance. A complete testing session for each
participant lasted approximately 2 hours.
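The trial structure of one block described above can be sketched as follows. This is a schematic reconstruction for illustration only; the dictionary fields and the randomization scheme are assumptions, not the actual experiment code, and the full counterbalancing of vowel pairings is simplified to random sampling.

```python
import itertools
import random

VOWELS = ['ah', 'ee', 'er', 'ae']

def build_block(target, seed=0):
    """One 336-trial block: 168 target-present trials (84 per hemispace)
    and 168 target-absent trials, with the two semitone separations
    (1 and 4) equally represented throughout."""
    rng = random.Random(seed)
    distractors = [v for v in VOWELS if v != target]
    trials = []
    # Target-present: 2 sides x 2 semitone levels x 42 = 168 trials
    for side, st in itertools.product(['left', 'right'], [1, 4]):
        for _ in range(42):
            trials.append({'present': True, 'side': side, 'semitone': st,
                           'distractor': rng.choice(distractors)})
    # Target-absent: 2 semitone levels x 84 = 168 trials of two non-target vowels
    for st in (1, 4):
        for _ in range(84):
            trials.append({'present': False, 'semitone': st,
                           'vowels': rng.sample(distractors, 2)})
    rng.shuffle(trials)
    return trials

block = build_block('ah')
```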
Data Acquisition and Analysis
Behavioural data indicating the response time and accuracy on each trial were
collected and separately analysed as a function of two within-subjects variables: Target
(three levels: Target Present Left – TPL; Target Present Right – TPR; Target Absent -
TA) and SemiTone (two levels: ST 1; ST 4) using planned within-subjects repeated
measures analysis of variance (ANOVA), collapsing across trials from all four
experimental blocks. If a main effect of Target or an interaction between Target and
SemiTone was significant, follow-up comparisons were carried out using paired sample t-
tests. Errors were calculated as a rate since the number of possible errors varied by
condition because of the differing number of trials between the Target Present Left and
Target Present Right (84 trials each) and Target Absent (168 trials) conditions.
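The error-rate normalization and the paired-samples follow-ups can be made concrete with a short sketch. The error counts below are hypothetical, and the t statistic is simply the textbook paired-samples formula, not the statistics package actually used:

```python
import math

def error_rate(n_errors, n_trials):
    """Errors as a rate, so the 168-trial Target Absent condition is
    comparable to the 84-trial Target Present conditions."""
    return n_errors / n_trials

def paired_t(x, y):
    """Paired-samples t statistic, as used for follow-up comparisons."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    m = sum(d) / n
    var = sum((v - m) ** 2 for v in d) / (n - 1)
    return m / math.sqrt(var / n)

# Hypothetical per-participant error counts, for illustration only
tpl_rate = error_rate(10, 84)   # Target Present Left
ta_rate = error_rate(8, 168)    # Target Absent
```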
For MEG recording, neuromagnetic brain activity was recorded from a 151
channel whole-head neuromagnetometer (VSM Medtech) while the participant performed
the behavioural task in a magnetically shielded room. Participants were seated in an
upright position with the top and back of their head surrounded by the scanner. Before the
start of scanning and the task, head localization coils were placed on the nasion, and on
the left and right preauricular points so that the neuromagnetic data could be coregistered with
an anatomical magnetic resonance image. Data were collected in each of the four
experimental blocks at a sampling rate of 625 Hz and low-pass filtered at 200 Hz.
Our approach to data analysis was chosen based upon our intention to mimic
conventional analysis of electroencephalographic data and results of our past studies
implicating auditory cortex involvement in sound segregation (Alain et al., 2005),
specifically its involvement in the preattentive processing of sound units from
concurrently presented stimuli with variations in acoustic characteristics (Du et al., 2011).
Depending upon the participant and the vowel used as the target stimulus, average
reaction times varied from 500 to 1400 ms. Therefore, an epoch of 200 ms before
stimulus to 1500 ms after stimulus presentation was analysed. Artefact rejection
parameters were set to retain 90% of all trials from each experimental block for each
participant. Auditory evoked fields (AEFs) were averaged for each condition (i.e., TPL
ST 1, TPL ST 4, etc.) for correct and error trials separately, and these averaged data were
then low-pass filtered at 30 Hz before source modeling.
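The epoching step can be sketched as below. This is a hedged illustration assuming a (channels × samples) array and sample-index onsets; it is not the actual acquisition pipeline, and artefact rejection and the 30 Hz filter are omitted.

```python
import numpy as np

FS = 625  # MEG sampling rate, Hz

def extract_epochs(data, onsets, pre_ms=200, post_ms=1500):
    """Cut epochs from -200 ms to +1500 ms around each stimulus onset.

    data: (n_channels, n_samples) continuous recording
    onsets: stimulus-onset sample indices
    returns: (n_trials, n_channels, n_epoch_samples)
    """
    pre = int(pre_ms * FS / 1000)    # 125 samples before onset
    post = int(post_ms * FS / 1000)  # 937 samples after onset
    keep = [data[:, o - pre:o + post] for o in onsets
            if o - pre >= 0 and o + post <= data.shape[1]]
    return np.stack(keep)

# Averaging the retained epochs per condition yields the AEF
rng = np.random.default_rng(0)
raw = rng.standard_normal((151, FS * 60))   # 151 channels, 60 s of fake data
onsets = np.arange(1000, 30000, 1500)
epochs = extract_epochs(raw, onsets)
aef = epochs.mean(axis=0)                   # average across trials
```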
For each participant, the grand average (including all experimental trials) AEFs
were modeled by placing single dipoles in the right and left auditory cortices near
Heschl’s gyrus and measuring the source strength of the waveforms using BESA
software version 5.3 (Brain Electrical Source Analysis, MEGIS Software GmbH). The
location and orientation of the dipoles were fit to represent a 40 ms interval around the
N1m wave peak, the most reliable deflection across all participants and target stimuli.
The parameters from this grand mean model were then used to calculate the source
waveforms for each of the sub-conditions of interest.
Upon examining the source waveforms, the first noticeable result was a bias toward a stronger response in the left hemisphere, which precluded the contralateral-versus-ipsilateral, target-dependent responses necessary to elicit the N2ac. We therefore
proceeded with traditional data analysis methods of examining the mean amplitude as the
main dependent variable within time intervals chosen to study classic peaks of positivity
and negativity as well as areas of interest determined from the graphical illustrations of
the source waveforms. Time intervals of 80-150 ms, 155-250 ms, and 300-450 ms were
used to quantify the N1m, P2m, and P3m, respectively, and the 500-800 ms interval was used
to examine sustained late wave activity. (The ‘m’ denotes the magnetic counterparts of
the corresponding electric components.) A first repeated measures ANOVA was
applied to the mean amplitude with three within-subjects variables: Hemisphere (2 levels;
Left, Right), Target (3 levels; Target Present Left - TPL, Target Present Right - TPR,
Target Absent - TA) and SemiTone (2 levels; ST 1 and ST 4). In cases where the Target
variable or an interaction between variables was significant, follow-up paired sample t-
tests were used to examine differences between pairs of conditions.
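The windowed mean-amplitude measure can be sketched as follows, under the assumption that each source waveform spans -200 to +1500 ms at 625 Hz; the window bounds are those named above, and the flat example waveform is purely illustrative:

```python
import numpy as np

FS = 625               # Hz
EPOCH_START_MS = -200  # epoch begins 200 ms before stimulus onset

WINDOWS = {'N1m': (80, 150), 'P2m': (155, 250),
           'P3m': (300, 450), 'late': (500, 800)}

def mean_amplitude(source_wave, t0_ms, t1_ms):
    """Mean amplitude of a source waveform within a latency window."""
    i0 = int((t0_ms - EPOCH_START_MS) * FS / 1000)
    i1 = int((t1_ms - EPOCH_START_MS) * FS / 1000)
    return float(source_wave[i0:i1].mean())

# Example on a flat waveform of ones (1062 samples = -200..1500 ms)
wave = np.ones(1062)
amps = {name: mean_amplitude(wave, *win) for name, win in WINDOWS.items()}
```

The resulting per-condition amplitudes are the dependent variable entered into the Hemisphere × Target × SemiTone ANOVA.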
For all analyses, p-values are reported as exact numbers with three decimal places
to conform to new APA guidelines.
CHAPTER III: RESULTS
Figure 2a illustrates the AEFs from all MEG sensors in one participant. The large
N1m occurs at approximately 110 ms followed by sustained negativity. The magnetic
field topography at the N1m latency matches the expected pattern of bilateral sources
near the superior temporal gyri (see Figure 2b). Figure 2c shows the left and right dipole
locations for this participant and the overall group mean at the N1m. The values for the
group mean dipole locations using Talairach coordinates were (-41, -21, 1) for the left
dipole and (46, -14, -1) for the right dipole in the superior temporal gyri. The right dipole
was closer to the lateral surface [t(16) = 39.983, p = .000] and was more anterior [t(16) =
6.649, p = .000] than the left dipole. The dipole locations along the superior-inferior axis
did not differ.
Figure 2 here
Behavioural Data - Reaction Time (RT) on Correct Trials
Both the main effects of Target [F(2,32) = 40.411, p = .000] and SemiTone
[F(1,16) = 7.602, p = .014] were significant but there was no interaction between the two
variables. Pairwise comparison of the three Target conditions showed that participants
were faster in both the TPL [t(16) = -7.561, p = .000] and TPR [t(16) = -6.569, p = .000]
conditions than the TA condition. Although the TPL and TPR conditions did not differ
significantly, the TPR condition had slightly faster response times than the TPL
condition. Contrary to what was hypothesized for the SemiTone manipulation, the
condition with a 4 semitone difference in f0 between the two concurrent vowels generated
slower responses than the 1 semitone difference condition. See Figure 3 for a graphical
illustration of the response times on trials where participants made correct responses for
the overall Target and SemiTone conditions.
Figure 3 here
Behavioural Data – Accuracy
The main effect of Target was significant [F(2,32) = 24.379, p = .000]. However,
the main effect of SemiTone was not significant nor was the interaction between Target
and SemiTone. For the Target condition, pairwise comparison showed that the error rates
were higher in both the TPL [t(16) = 6.119, p = .000] and TPR [t(16) = 5.338, p = .000]
conditions when compared to the TA condition, but they did not differ from each other.
MEG Data – Source Wave Forms on Correct Trials
The source waveforms from the left and right hemisphere derived from trials
where participants made accurate responses are illustrated in Figure 4. This figure can be
referenced for illustrations of the data reported in this section for the four time intervals
of interest.
Figure 4 here
i) Interval 80-150 ms, N1m
Table 1a summarizes the mean amplitude values for each condition during this time
interval. The main effects of the three within-subjects variables were significant
[Hemisphere: F(1,16) = 6.544, p = .021; Target: F(2,32) = 4.582, p = .018; SemiTone:
F(1,16) = 9.288, p = .008]. Overall, the N1m mean amplitude was larger in the left than in
the right hemisphere. For the Target variable, pairwise comparisons revealed that the TA
condition was statistically more negative than the TPR condition. Neither the TA-TPL
nor TPL-TPR comparisons differed significantly. The mean overall amplitude was also
larger for the 1 semitone than the 4 semitone condition.
There was a three-way interaction [F(2,32) = 5.667, p = .008]. Since all conditions
were more negative in the left hemisphere, follow-up analyses were carried out within
each hemisphere separately to examine the relationship between Target and SemiTone
conditions. In the left hemisphere, when there was 1 semitone separating the two
concurrently presented vowels, the three Target conditions did not significantly differ.
However for the 4 semitone difference, the N1m was smaller when the target was
presented to the left or right hemispace compared to when the target was absent [TPL vs.
TA: t(16) = 3.105, p = .007; TPR vs. TA: t(16) = 3.239, p = .005]. In the right
hemisphere, again there was only a difference between the Target conditions when there
was a 4 semitone difference separating the two vowels, but this time it was the TPR
condition that had a significantly smaller mean amplitude than both the TPL [t(16) =
2.657, p = .017] and TA [t(16) = 3.243, p = .005] conditions.
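Purely for illustration, follow-up comparisons of this kind can be sketched as paired-sample t-tests on per-participant mean source amplitudes. All values and variable names below are hypothetical placeholders, not the study's data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_subjects = 17  # matches the reported degrees of freedom, t(16)

# Hypothetical per-participant mean N1m source amplitudes (one
# hemisphere/semitone cell), one value per Target condition.
tpl = rng.normal(-20.0, 5.0, n_subjects)  # target present left
tpr = rng.normal(-20.5, 5.0, n_subjects)  # target present right
ta = rng.normal(-24.0, 5.0, n_subjects)   # target absent

# Paired-sample t-tests between Target conditions, mirroring
# comparisons such as TPL vs. TA reported in the text.
for label, (a, b) in {"TPL vs TA": (tpl, ta),
                      "TPR vs TA": (tpr, ta),
                      "TPL vs TPR": (tpl, tpr)}.items():
    t, p = stats.ttest_rel(a, b)
    print(f"{label}: t({n_subjects - 1}) = {t:.3f}, p = {p:.3f}")
```

Because each participant contributes a value to every condition, the paired (within-subjects) test is the appropriate follow-up to the repeated-measures ANOVA.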
ii) Interval 155-250 ms, P2m
Table 1b summarizes the mean amplitude values for each condition during this time
interval. There was a main effect of Hemisphere [F(1,16) = 19.584, p < .001] and of
SemiTone [F(1,16) = 10.821, p = .005]. The source waveforms were more negative in the
left than in the right hemisphere and in the 4 semitone compared to the 1 semitone
conditions. The three Target conditions did not significantly differ, but there was an
interaction between the Hemisphere and Target variables [F(2,32) = 4.193, p = .024].
Follow-up paired-sample t-tests between the Target conditions were carried out for each
hemisphere separately. In the left hemisphere the TPL condition was more negative than
both the TPR [t(16) = -2.631, p = .018] and TA [t(16) = -2.562, p = .021] conditions, but
there were no significant differences in the right hemisphere.
iii) Interval 300-450 ms, P3m
Table 1c summarizes the mean amplitude values for each condition during this time
interval. There was a main effect of Hemisphere [F(1,16) = 36.258, p < .001], but none
of the remaining main effects or interactions was significant. The mean amplitude was
more negative in the left compared to the right hemisphere.
iv) Interval 500-800 ms, Late sustained field
Table 1d summarizes the mean amplitude values for each condition during this time
interval. There was a significant main effect of Hemisphere [F(1,16) = 13.924, p = .002]
and Target [F(2,32) = 8.063, p = .001], but no main effect of SemiTone. As with the
previous three time intervals, the mean amplitude in the left hemisphere was more
negative than in the right hemisphere. Follow-up paired sample t-tests examining the
differences between the Target conditions showed that the mean amplitude for the TA
condition was more negative compared to both the TPL [t(16) = 2.458, p = .026] and
TPR [t(16) = 3.472, p = .003] conditions.
There was also an interaction between the Hemisphere and Target variables [F(1,16)
= 3.513, p = .042]. Again, as in the previous time intervals, follow-up analyses focused
on the relationship between the three levels of the Target variable within each hemisphere
separately. The TA condition had a larger negative amplitude than both the TPL [t(16) =
2.859, p = .011] and TPR [t(16) = 3.192, p = .006] conditions in the left hemisphere, but
was only significantly bigger than the TPR condition [t(16) = 2.543, p = .022] in the right
hemisphere.
MEG Data – Source Waveforms on Errant Trials
Tables 2a, 2b, and 2c list the mean amplitude and standard deviation values for the
trials where participants responded incorrectly in the N1m, P2m, and Late time intervals.
The P3m was not considered because this peak was not always visibly present in the
error-derived source waveforms. Using the same analysis approach as for the correct
trials, there was a main effect of hemisphere in all three time intervals, although only
marginally so for the N1m [N1m: F(1,16) = 4.459, p = .051; P2m: F(1,16) = 10.575, p =
.005; Late: F(1,16) = 22.932, p < .001]. No other main effects or interactions were
significant.
Although the expected N2ac component was not present, it was still hypothesized that
the error trials could illustrate modulation of attention processes at different stages.
Therefore, within three of the time intervals of interest (N1m, P2m, and Late), the mean
amplitude of the source waveforms derived from both correct and incorrect trials was
compared for each Target condition separately. This was carried out using a repeated-
measures ANOVA with two within-subjects variables: Accuracy (two levels; Correct,
Incorrect) and SemiTone (two levels; 1 ST, 4 ST). Given the robustness of the response
in the left hemisphere and the nature of this analysis as a follow-up exploratory
examination of the data, only responses in the left hemisphere were considered. In the
earlier two time intervals there were no significant main effects of Accuracy or SemiTone
and no interaction between them. In the Late time interval, however, the only significant
effect was a main effect of Accuracy for both the TPL [F(1,16) = 4.767, p = .044] and
TPR [F(1,16) = 8.071, p = .012] conditions. In both cases the mean amplitude was much
larger (more negative) on the error trials than on the accurate trials, and was in fact very
close to the mean amplitude of the correct TA condition. See Table sections 1d and 2c
for a direct comparison of these values. There were no significant effects in the TA
condition.
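As a minimal sketch of this follow-up design: for a within-subjects factor with only two levels, the repeated-measures F equals the squared paired t statistic, so the Accuracy and SemiTone main effects can be illustrated directly (all amplitudes below are synthetic placeholders, not the study's data):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 17  # participants, giving the reported df of (1, 16)

# Hypothetical left-hemisphere Late-interval amplitudes for the 2 x 2
# within-subject cells: Accuracy (correct/incorrect) x SemiTone (1/4 ST).
correct_1st = rng.normal(-18.0, 4.0, n)
correct_4st = rng.normal(-18.5, 4.0, n)
incorrect_1st = rng.normal(-24.0, 4.0, n)
incorrect_4st = rng.normal(-24.5, 4.0, n)

def rm_main_effect(a, b):
    """F(1, n-1) for a two-level within-subject factor: with two
    levels, the repeated-measures F is the squared paired t."""
    t, p = stats.ttest_rel(a, b)
    return t ** 2, p

# Main effect of Accuracy: average each level over SemiTone first.
f_acc, p_acc = rm_main_effect((correct_1st + correct_4st) / 2,
                              (incorrect_1st + incorrect_4st) / 2)
# Main effect of SemiTone: average each level over Accuracy first.
f_st, p_st = rm_main_effect((correct_1st + incorrect_1st) / 2,
                            (correct_4st + incorrect_4st) / 2)
print(f"Accuracy: F(1,{n - 1}) = {f_acc:.3f}, p = {p_acc:.3f}")
print(f"SemiTone: F(1,{n - 1}) = {f_st:.3f}, p = {p_st:.3f}")
```

Averaging over the other factor before testing mirrors how a main effect is defined in the 2 × 2 repeated-measures design described above.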
CHAPTER IV: DISCUSSION
Behavioural
Behaviourally, average response times were faster on trials where a target was
present than on trials where it was absent. When directly comparing trials where the
target was presented to the left hemispace versus the right hemispace, reaction times did
not differ significantly, although they were consistently faster for targets presented to the
right. This suggested the possibility of a right ear advantage for the language stimuli used
in this task, given the direct anatomical connection between the right ear and the left
hemisphere (Hugdahl, Westerhausen, Alho, Medvedev, & Hamalainen, 2008).
Reaction times were also affected by a manipulation of task context. The
frequency difference between the vowels presented to each ear within a single trial was
changed such that on half of the trials the vowels were separated by a 1 semitone
difference and on the other half by a 4 semitone difference. Although it was hypothesized
that the 1 semitone separation would create a situation where differentiating the two
vowels would be more difficult, the reaction times were slowest on the trials with a 4
semitone difference between the vowels. Accounts of sound segregation with
concurrently occurring stimuli in both humans and animals are derived from
experimental results indicating improved performance in the identification of imperative
sound stimuli with increasing differences in frequency (Alain, Schuler, and McDonald,
2002; Dyson and Alain, 2004; Nityananda and Bee, 2011), including vowel sounds
similar to those employed in this study (Alain et al., 2005; Assmann and Summerfield,
1990,1994; Snyder and Alain, 2005). These past studies, however, often involved
paradigms whereby the participants were charged with the identification and labelling of
the sounds presented to each ear and accuracy was the main dependent variable. This
current study required the participants to perform a different task – to search for a single
sound and to indicate its presence or absence, and thus may involve different or
additional attentional processes. While both task types would presumably involve the
preattentive processing steps of sound analysis (simple sound object feature processing),
the follow-up steps in the previous studies involved consciously identifying and labeling
the recognized objects. In this current task, the subsequent steps may or may not have
involved the active awareness of the exact sounds heard in each hemispace, but most
likely involved the comparison of perceptually recognized objects with the target sound
that was held in working memory throughout each block of the task. If the target sound
was identified, the appropriate response could be made. If the target was not immediately
identified, such as would happen in the TA condition, it is possible there was an extra
level of online attentional monitoring of task progression whereby the decision that the
target was not present was checked and confirmed (Rosenthal et al., 2006), translating
behaviourally into longer response times. The role of manipulating the frequency
differences between the concurrent sounds in affecting these cognitive processes is still
unknown. Since the slowing in the 4 semitone condition relative to the 1 semitone
condition occurred for both target present and target absent trials, it is possible that the
larger frequency difference actually allowed further processing: because the vowels were
more identifiable, both could be consciously identified before a choice had to be made.
The slowing in RT may therefore reflect not greater difficulty in the 4 semitone
condition, but deeper processing, simply because deeper processing was available.
These behavioural results, then, gave us several lines of inquiry to examine in the
MEG data. In addition to the original question of interest, whether there was any
indication of attentional deployment contralateral to the presentation of a target stimulus,
there was also the potential for a right hemispace advantage, as well as for modulation of
attentional responses during auditory search due to changes in task context.
MEG - Deployment of Attention and the N2ac
The primary hypothesis for this study was that there would be greater magnetic
activity in response to the detection of task relevant stimuli occurring contralateral to the
hemispace in which the target sound was presented, and that this contralateral activity,
relative to ipsilateral, would produce a unique component representing attention
deployment to relevant stimuli in complex environments. Upon inspection of the
magnetic source waveforms, the most immediately obvious result was a strong bias
toward the left hemisphere: the waveforms from the left source were much more negative
than those from the right source, regardless of the hemispace to which the sound was
presented. A comparison of contralateral and ipsilateral activity confirmed that the
expected deployment of attention dependent upon the location of the vowel did not occur.
As also suggested by the behavioural data, the strong left lateral response could
possibly reflect a right ear advantage for processing language stimuli. Each brain
hemisphere may have specialization for processing different categories of sound stimuli.
In dichotic listening studies, when assessing non-language sounds (for example,
identifying changes of pitch in simple tones) there is often a left ear advantage suggesting
a right hemisphere dominance (Sininger and Bhatara, 2012; Wioland, Rudolf, Metz-Lutz,
Mutschler, & Marescaux, 1999) while the processing of linguistic stimuli shows the
opposite right ear advantage (Hugdahl et al., 2008), although these hemispheric
preferences can be modulated with top-down attentional control (Ofek and Pratt, 2004).
In addition, the left hemisphere activity may have been amplified even further because
all of our participants were right-handed, since the right ear advantage is stronger in
right-handed people (Nachshon, 1978). However, when considering these present results in the
context of previous work within our lab using similar methodologies, the interpretations
are not immediately clear. Using the same stimuli presented concurrently as in this
current task, fMRI results have shown left thalamo-cortical network involvement in the
successful identification of the speech sounds (Alain et al., 2005), but MEG results using
a similar paradigm have not shown a strong hemispheric bias (Du et al., 2011) and in
both cases the majority of the participants were right-handed. The main difference
between these previous two studies and this current task is the focus on using attentional
processes to search for an imperative stimulus as opposed to identifying all stimuli.
Therefore, there is the potential that left lateralized working memory/task setting
processes (D'Esposito and Badre, 2012; Stuss et al., 2002) are biasing the results in
conjunction with the language component. It is also possible that the source of the N2ac
is located primarily in other brain regions and our seeding of the dipoles to the auditory
cortex does not adequately capture the potential source activity from these areas.
Electromagnetic Activity at the N1m
The overall N1m mean amplitude was larger in the left than in the right
hemisphere. It was also larger when the target was absent than when it was present,
although the difference was only statistically reliable for the TPR-TA comparison. The
semitone manipulation only had an effect on the trials where the target was present; for
both the TPL and TPR conditions the amplitude was more negative when there was a 1
semitone difference compared to when there was a 4 semitone difference between the
two vowel sounds, although in the right hemisphere this effect only reached significance
for the TPR condition.
The N1 is thought to represent sensory gating in the auditory cortex, whereby
information is sorted according to its relevance to the task at hand (Alho et al., 1998).
Studies using both EEG and MEG methods have produced N1 amplitudes, without
specified lateralization effects, that are greater in response to the presentation of task-
related, compared to unrelated, information (Alho et al., 1998; Escera, Alho, Winkler,
and Näätänen, 1998; Näätänen, Kujala, and Winkler, 2011; Näätänen and Winkler, 1999;
Tsuchida, Katayama, and Murohashi, 2012), although the task-unrelated, or distractor,
N1 amplitudes can increase with greater distractor variability that potentially makes the
target less salient (Tong and Melara, 2007). Although the N1 is considered an exogenous
component, researchers have found an increase in N1 amplitude when participants were
instructed to actively attend to sounds presented to one ear only (Hillyard, Hink,
Schwent, & Picton, 1973; Sabri, Liebenthal, Waldron, Medler, & Binder, 2006; Woldorff
and Hillyard, 1991; Woldorff et al., 1993), an effect which has been shown to be
attenuated in certain disorders that potentially involve anatomical volume reduction in
areas thought to generate the N1 (i.e., the superior temporal gyrus, Näätänen and Picton,
1987; Neelon, Williams, & Garrell, 2006) such as schizophrenia and bipolar disorder
(Force, Venables, & Sponheim, 2008).
In this current dichotic language-based study the endogenous attention-orienting
instructions were held constant throughout the task. The participants were required to
search for a specified target on each trial and use a two-choice response to indicate its
presence or absence. Overall, on trials where the target stimulus was absent, the N1m
amplitude was larger than when the target was present, regardless of the hemispace to
which the target was presented. This may suggest that the absence of target features
required further processing of the stimulus information. In contrast, upon detection of the
target stimulus features, simple feature processing stopped and processing could proceed
to the utilization of those features to represent an object, as reflected in the onset of the P2
wave (Crowley and Colrain, 2004). These results expand on our current knowledge of
top-down attentional effects on early sound processing. The common theme across the
various studies mentioned here was an increase in N1 when there was either task-relevant
information available or when people were instructed to direct their attention to a
particular task. This current study highlights the special case of an auditory search and
potentially the involvement of working memory processes. A focusing of attention and a
capture of sound objects relevant to a particular decision decreases the N1m, whereas not
finding a match to the more prominent task-relevant stimulus (i.e., the imperative target)
requires further processing instead of being discarded as a task-irrelevant (non-matching)
trial. Working memory processes have been shown to interact with early auditory sensory
gating: as working memory load in a task increases, the N1 decreases in people who
have a high working memory capacity (Tsuchida et al., 2012). However, the extent of
the influence that endogenous control and working memory processes have over the
early processing stages may also depend upon the exogenous factors influencing stimulus
quality. While in the 4 semitone condition the target absent trials elicited a larger
negative amplitude than the target present trials, there was no difference in N1m as a
function of the target being absent or present when the vowels were separated by only
1 semitone.
Electromagnetic Activity at the P2m
As with the N1m, the source waveforms differed between hemispheres with the
mean amplitudes being overall more negative in the left compared to the right
hemisphere. The TPL condition in the left hemisphere had a more negative amplitude
than both the TPR and TA conditions, the latter two being very close. There were no
differences between the conditions in the right hemisphere. There was also an overall
main effect of SemiTone, with the 4 semitone condition being more negative than the 1
semitone condition.
The peak amplitude often recorded in the 200 ms post stimulus onset range, or P2,
is thought to reflect the forming of an object representation from the identity of the
detected stimulus features (Crowley and Colrain, 2004; Näätänen and Winkler, 1999).
The greater negativity seen in the 4 semitone condition replicates findings from past
studies showing that an increase in the mistuning of concurrently presented stimuli will
lead to a more negative elicited neural response, interpreted as an improvement in the
perceptual identification of an object (the object related negativity, ORN; Alain, Arnott,
and Picton, 2001; Alain and Izenberg, 2003; Alain and McDonald, 2007; Alain, Schuler,
and McDonald, 2002; Alain, Reinke, et al., 2005). Although not statistically reliable, the
greatest change occurred in the TA condition, where there was a decrease in amplitude
from the 1 to the 4 semitone conditions. An interpretation for this change is not
immediately apparent. That the ORN would be bigger in one condition than another
implies that at this pre-attentive processing stage the brain would have made a decision,
beyond simple perception, about the identity of the concurrent stimuli, but again, as with
the N1m, this may be something unique to searching and matching to items held in
working memory.
Electromagnetic Activity at the P3m and Later Component
No statistically significant differences were found within the P3m interval other
than the left hemisphere being overall more negative than the right hemisphere, although
from the graphical illustration of the source waveforms in Figure 4 there may be an
indication of a separation of response dependent upon the Target trial type in the 4
semitone condition. This would be in line with previous research suggesting that the P3 is
affected largely by changes in task context, specifically manipulations in task difficulty
(Katayama and Polich, 1998). However, the analyses employed in this current report are
most likely not sensitive enough to capture the differences in this component and may
require measuring its onset as opposed to the mean amplitude.
The later slow wave activity was assessed between the time period of 500 and 800
ms post onset of stimulus. Our main question of interest involved broadly characterizing
the difference in amplitudes between the three Target conditions since activity in this area
is thought to represent indices of working memory and the allocation of controlled
attentional resources, differing as a function of stimulus context (i.e., target versus
distractor selection in many standard laboratory auditory tasks) (Katayama and Polich,
1998; Näätänen et al., 2011). Overall both of the target present conditions had a mean
amplitude that was significantly less negative than the target absent condition. Following
interpretations from past literature just mentioned, this would indicate that the selection
of a target occurred more quickly than the confirmation of its absence. There was no
overall main effect of SemiTone, although, as with the P3m, an apparent separation
between the Target conditions in this interval would likewise require more sensitive
analysis measures than those performed here.
Limitations
There were several methodological limitations to this study that probably
contributed to the lack of replication of the Gamble and Luck (2011) study in addition to
the use of language stimuli discussed previously. The switch from EEG to MEG
required that the sounds be presented through ear inserts and be filtered with a
head-related transfer function (HRTF) to mimic sounds heard at an angle of 45
degrees from midline, whereas in an EEG setup speakers can be physically placed at an
angle of 45 degrees relative to the head. It may also be somewhat easier to differentiate
left from right with a speaker setup than through earphones. Also,
the source of the neural generators responsible for the activity behind the N2ac may not
be near the N1 generators. Thus, we would need further analyses to explore other brain
regions.
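For illustration, the HRTF step mentioned above amounts to convolving the monaural stimulus with a head-related impulse response for each ear. The toy sketch below uses made-up impulse responses (a pure delay and attenuation at the far ear), not the filters actually applied in the study:

```python
import numpy as np

fs = 44100                            # sample rate (Hz)
t = np.arange(int(0.1 * fs)) / fs
vowel = np.sin(2 * np.pi * 220 * t)   # placeholder monaural stimulus

# Made-up head-related impulse responses for a source 45 degrees to the
# left of midline; real HRIRs are measured or drawn from a database.
hrir_near = np.zeros(64)
hrir_near[0] = 1.0                    # near (left) ear: direct sound
hrir_far = np.zeros(64)
hrir_far[30] = 0.5                    # far (right) ear: delayed, attenuated

# Convolve and trim to the stimulus length to form a stereo pair.
left_channel = np.convolve(vowel, hrir_near)[: len(t)]
right_channel = np.convolve(vowel, hrir_far)[: len(t)]
```

The interaural time and level differences introduced by the two filters are what give the listener the impression of a lateralized source through ear inserts.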
Future Analyses
The intention of this current report was to examine the magnetic source
waveforms modeled around the N1. Given the complexity of this task and the likelihood
of involving multiple attentional and memory resources to perform a search, a distributed
network of anatomical areas is likely involved from stimulus onset to response. Stimulus
search and selection in a complex auditory scene will also involve the parietal streams for
determining the identity and location of preattentive sound objects (Alain et al., 2001),
the parietal-left frontal working memory/task setting network (D'Esposito and Badre,
2012), the right lateral monitoring of items held in working memory in online response
selection, or 'epoptic process' (Petrides, 2012), and the medial frontal involvement in
'energizing' the system to respond (Stuss et al., 2005), just to name a few of the major
processes that are well documented in the empirical literature. As such, next steps in
analyses involve using a model-free approach to source localization.
CHAPTER V: REFERENCES
Akyürek, E.G., & Schubö, A. (2011). The allocation of attention in displays with
simultaneously presented singletons. Biological Psychology, 87, 218-225.
Alain, C., & Arnott, S.R. (2000). Selectively attending to auditory objects. Frontiers in
Bioscience, 5, 202-212.
Alain, C., Arnott, S.R., Hevenor, S., Graham, S., & Grady, C.L. (2001). “What” and
“where” in the human auditory system. Proceedings of the National Academy of
Sciences, 98, 12301-12306.
Alain, C., Arnott, S.R., & Picton, T.W. (2001). Bottom-up and top-down influences on
auditory scene analysis: Evidence from event-related brain potentials. Journal of
Experimental Psychology: Human Perception and Performance, 27, 1072-1089.
Alain, C., He, Y., & Grady, C. (2008). The contribution of the inferior parietal lobe to
auditory spatial working memory. Journal of Cognitive Neuroscience, 20, 285-
295.
Alain, C., & Izenberg, A. (2003). Effects of attentional load on auditory scene analysis.
Journal of Cognitive Neuroscience, 15, 1063-1073.
Alain, C., & McDonald, K.L. (2007). Age-related differences in neuromagnetic brain
activity underlying concurrent sound perception. Journal of Neuroscience, 27,
1308-1314.
Alain, C., Reinke, K., He, Y., Wang, C.H., & Lobaugh, N. (2005). Hearing two things at
once: Neurophysiological indices of speech segregation and identification.
Journal of Cognitive Neuroscience, 17, 811-818.
Alain, C., Reinke, K., McDonald, K.L., Chau, W., Tam, F., Pacurar, A., & Graham, S.
(2005). Left thalamo-cortical network implicated in successful speech separation
and identification. NeuroImage, 26, 592-599.
Alain, C., Schuler, B.M., & McDonald, K.L. (2002). Neural activity associated with
distinguishing concurrent auditory objects. Journal of the Acoustical Society of
America, 111, 990-995.
Alain, C., & Woods, D.L. (1993). Distractor clustering enhances detection speed and
accuracy during selective listening. Perception & Psychophysics, 54, 509-514.
Alho, K., Connolly, J.F., Cheour, M., Lehtokoski, A., Huotilainen, M., Virtanen, J.,
Aulanko, R., & Ilmoniemi, R.J. (1998). Hemispheric lateralization in preattentive
processing of speech sounds. Neuroscience Letters, 258, 9-12.
Alho, K., Winkler, I., Escera, C., Huotilainen, M., Virtanen, J., Jääskeläinen, I.P.,
Pekkonen, E., & Ilmoniemi, R.J. (1998). Processing of novel sounds and
frequency changes in the human auditory cortex: Magentoencephalographic
recordings. Psychophysiology, 35, 211-224.
Arnott, S.R., Binns, M.A., Grady, C.L., & Alain, C. (2004). Assessing the auditory dual-
pathway model in humans. NeuroImage, 22, 401-408.
Assmann, P.F., & Summerfield, Q. (1990). Modeling the perception of concurrent
vowels: Vowels with different fundamental frequencies. Journal of the Acoustical
Society of America, 88, 680-697.
Assmann, P.F., & Summerfield, Q. (1994). The contribution of waveform interactions to
the perception of concurrent vowels. Journal of the Acoustical Society of America,
95, 471-484.
Brisson, B., & Jolicoeur, P. (2007). The N2pc component and stimulus duration.
NeuroReport, 18, 1163-1166.
Brisson, B., Robitaille, N., & Jolicoeur, P. (2007). Stimulus intensity affects the latency
but not the amplitude of the N2pc. NeuroReport, 18, 1627-1630.
Chalikia, M.H., & Bregman, A.S. (1989). The perceptual segregation of simultaneous
auditory signals: Pulse train segregation and vowel segregation. Perception &
Psychophysics, 46, 487-496.
Conci, M., Gramann, K., Müller, H.J., & Elliot, M.A. (2006). Electrophysiological
correlates of similarity-based interference during detection of visual forms.
Journal of Cognitive Neuroscience, 18, 880-888.
Crowley, K.E., & Colrain, I.M. (2004). A review of the evidence for P2 being an
independent component process: Age, sleep and modality. Clinical
Neurophysiology, 115, 732-744.
Cusack, R., & Roberts, B. (2000). Effects of differences in timbre on sequential grouping.
Perception & Psychophysics, 62, 1112-1120.
Dalton, P., & Lavie, N. (2004). Auditory attentional capture: Effects of singleton
distractor sounds. Journal of Experimental Psychology: Human Perception and
Performance, 30, 180-193.
Degerman, A., Rinne, T., Salmi, J., Salonen, O., & Alho, K. (2006). Selective attention to
sound location or pitch studied with fMRI. Brain Research, 1077, 123-134.
D'Esposito, M., & Badre, D. (2012). Combining the insights derived from lesion and
fMRI studies to understand the function of prefrontal cortex. In B. Levine and
F.I.M. Craik (Eds.), Mind and the Frontal Lobes: Cognition, Behavior, and Brain
Imaging (pp. 93-108). New York, NY: Oxford University Press.
Drennan, W.R., Gatehouse, S., & Lever, C. (2003). Perceptual segregation of competing
speech sounds: The role of spatial location. Journal of the Acoustical Society of
America, 114, 2178-2189.
Du, Y., He, Y., Ross, B., Bardouille, T., Wu, X., Li, L., & Alain, C. (2011). Human
auditory cortex activity shows additive effects of spectral and spatial cues during
speech segregation. Cerebral Cortex, 21, 698-707.
Dyson, B.J., & Alain, C. (2004). Representation of concurrent acoustic objects in primary
auditory cortex. Journal of the Acoustical Society of America, 115, 280-288.
Dyson, B.J., Alain, C., & He, Y. (2005). Effects of visual attentional load on low-level
auditory scene analysis. Cognitive, Affective, and Behavioral Neuroscience, 5,
319-338.
Dyson, B.J., Dunn, A.K., & Alain, C. (2010). Ventral and dorsal streams as modality-
independent phenomena. Cognitive Neuroscience, 1, 64-65.
Eimer, M. (1996). The N2pc component as an indicator of attentional selectivity.
Electroencephalography and Clinical Neurophysiology, 99, 225-234.
Eimer, M., Kiss, M., Press, C., & Sauter, D. (2009). The roles of feature-specific task set
and bottom-up salience in attentional capture: An ERP study. Journal of
Experimental Psychology: Human Perception and Performance, 35, 1316-1328.
Escera, C., Alho, K., Winkler, I., & Näätänen, R. (1998). Neural mechanisms of involuntary
attention to acoustic novelty and change. Journal of Cognitive Neuroscience, 10,
590-604.
Force, R.B., Venables, N.C., & Sponheim, S.R. (2008). An auditory processing
abnormality specific to liability for schizophrenia. Schizophrenia Research, 103,
298-310.
Gamble, M.L., & Luck, S.J. (2011). N2ac: An ERP component associated with the
focusing of attention within an auditory scene. Psychophysiology, 48, 1057-1068.
Gockel, H., Carlyon, R.P., & Micheyl, C. (1999). Context dependence of fundamental-
frequency discrimination: Lateralized temporal fringes. Journal of the Acoustical
Society of America, 106, 3553-3563.
Heitz, R.P., Cohen, J.Y., Woodman, G.F., & Schall, J.D. (2010). Neural correlates of
correct and errant attentional selection revealed through N2pc and frontal eye
field activity. Journal of Neurophysiology, 104, 2433-2441.
Hickey, C., Di Lollo, V., & McDonald, J.J. (2009). Electrophysiological indices of target
and distractor processing in visual search. Journal of Cognitive Neuroscience, 21,
760-775.
Hickey, C., McDonald, J.J., & Theeuwes, J. (2006). Electrophysiological evidence of the
capture of visual attention. Journal of Cognitive Neuroscience, 18, 604-613.
Hickey, C., van Zoest, W., & Theeuwes, J. (2010). The time course of exogenous and
endogenous control of covert attention. Experimental Brain Research, 201, 789-
796.
Hilimire, M.R., Mounts, J.R.W., Parks, N.A., & Corballis, P.M. (2011). Dynamics of
target and distractor processing in visual search: Evidence from event-related
brain potentials. Neuroscience Letters, 495, 196-200.
Hillyard, S.A., Hink, R.F., Schwent, V.L., & Picton, T.W. (1973). Electrical signs of
selective attention in the human brain. Science, 182, 177-180.
Hopf, J-M., Luck, S.J., Girelli, M., Hagner, T., Mangun, G.R., Scheich, H., & Heinze, H-
J. (2000). Neural sources of focused attention in visual search. Cerebral Cortex,
10, 1233-1241.
Hugdahl, K., Westerhausen, R., Alho, K., Medvedev, S., & Hamalainen, H. (2008). The
effect of stimulus intensity on the right ear advantage in dichotic listening.
Neuroscience Letters, 431, 90-94.
Johnson, J.A., & Zatorre, R.J. (2005). Attention to simultaneous unrelated auditory and
visual events: Behavioral and neural correlates. Cerebral Cortex, 15, 1609-1620.
Johnson, J.A., & Zatorre, R.J. (2006). Neural substrates for dividing and focusing
attention between simultaneous auditory and visual events. NeuroImage, 31,
1673-1681.
Katayama, J., & Polich, J. (1998). Stimulus context determines P3a and P3b.
Psychophysiology, 35, 23-33.
Kehrer, S., Kraft, A., Irlbacher, K., Koch, S.P., Hagendorf, H., Kathmann, N., Brandt,
S.A. (2009). Electrophysiological evidence for cognitive control during conflict
processing in visual spatial attention. Psychological Research, 73, 751-761.
Kiss, M., Van Velzen, J., & Eimer, M. (2008). The N2pc component and its link to
attention shifts and spatially selective visual processing. Psychophysiology, 45,
240-249.
Leung, A.W.S., & Alain, C. (2011). Working memory load modulates auditory what and
where neural networks. NeuroImage, 55, 1260-1269.
Lien, M-C., Ruthruff, E., & Cornett, L. (2010). Attentional capture by singletons is
contingent on top-down control settings: Evidence from electrophysiological
measures. Visual Cognition, 18, 682-727.
Liu, Q., Li, H., Campos, J.L., Wang, Q., Zhang, Y., Qiu, J., Zhang, Q., & Sun, H-J.
(2009). The N2pc component in ERP and the lateralization effect of language on
color perception. Neuroscience Letters, 454, 58-61.
Luck, S.J., & Hillyard, S.A. (1994a). Electrophysiological correlates of feature analysis
during visual search. Psychophysiology, 31, 291-308.
Luck, S.J., & Hillyard, S.A. (1994b). Spatial filtering during visual search: Evidence
from human electrophysiology. Journal of Experimental Psychology: Human
Perception and Performance, 20, 1000-1014.
Mazza, V., Turatto, M., & Caramazza, A. (2009). Attention selection, distractor
suppression and N2pc. Cortex, 45, 879-890.
Mazza, V., Turatto, M., Umiltà, C., & Eimer, M. (2007). Attentional selection and
identification of visual objects are reflected by distinct electrophysiological
responses. Experimental Brain Research, 181, 531-536.
Näätänen, R., Kujala, T., & Winkler, I. (2011). Auditory processing that leads to
conscious perception: A unique window to central auditory processing opened by
the mismatch negativity and related responses. Psychophysiology, 48, 4-22.
Näätänen, R., & Picton, T.W. (1987). The N1 wave of the human electric and magnetic
response to sound: A review and an analysis of the component structure.
Psychophysiology, 24, 375-425.
Näätänen, R., & Winkler, I. (1999). The concept of auditory stimulus representation in
cognitive neuroscience. Psychological Bulletin, 125, 826-859.
Nachson, I. (1978). Handedness and dichotic listening to nonverbal features of speech.
Perceptual and Motor Skills, 47, 1111.
Neelon, M.F., Williams, J., & Garrell, P.C. (2006). The effects of auditory attention
measured from human electrocorticograms. Clinical Neurophysiology, 117, 504-
521.
Nityananda, V., & Bee, M.A. (2011). Finding your mate at a cocktail party: Frequency
separation promotes auditory stream segregation of concurrent voices in
multi-species frog choruses. PLoS ONE, 6, 1-11.
Ofek, E., & Pratt, H. (2004). Ear advantage and attention: An ERP study of auditory cued
attention. Hearing Research, 189, 107-118.
Petrides, M. (2012). The mid-dorsolateral prefrontal-parietal network and the epoptic
process. In D.T. Stuss & R.T. Knight (Eds.), Principles of Frontal Lobe Function,
2nd Edition. New York, NY: Oxford University Press.
Robitaille, N., & Jolicoeur, P. (2006). Effect of cue-target interval on the N2pc. Cognitive
Neuroscience and Neuropsychology, 17, 1655-1658.
Rosenthal, C.R., Walsh, V., Mannan, S.K., Anderson, E.J., Hawken, M.B., & Kennard,
C. (2006). Temporal dynamics of parietal cortex involvement in visual search.
Neuropsychologia, 44, 731-743.
Sabri, M., Liebenthal, E., Waldron, E.J., Medler, D.A., & Binder, J.R. (2006). Attentional
modulation in the detection of irrelevant deviance: A simultaneous ERP/fMRI
study. Journal of Cognitive Neuroscience, 18, 689-700.
Sawaki, R., & Luck, S.J. (2010). Capture versus suppression of attention by salient
singletons: Electrophysiological evidence for an automatic attend-to-me signal.
Attention, Perception, & Psychophysics, 72, 1455-1470.
Schröger, E. (1995). Processing of auditory deviants with changes in one versus two
stimulus dimensions. Psychophysiology, 32, 55-65.
Shackleton, T.M., & Meddis, R. (1992). The role of interaural time difference and
fundamental frequency difference in the identification of concurrent vowel pairs.
Journal of the Acoustical Society of America, 91, 3579-3581.
Sininger, Y.S., & Bhatara, A. (2012). Laterality of basic auditory perception. Laterality,
17, 129-149.
Snyder, J.S., & Alain, C. (2005). Age-related changes in neural activity associated with
concurrent vowel segregation. Cognitive Brain Research, 24, 492-499.
Snyder, J.S., Alain, C., & Picton, T.W. (2006). Effects of attention on neuroelectric
correlates of auditory stream segregation. Journal of Cognitive Neuroscience, 18,
1-13.
Stuss, D.T., Binns, M.A., Murphy, K.J., & Alexander, M.P. (2002). Dissociations within
the anterior attentional system: Effects of task complexity and irrelevant
information on reaction time speed and accuracy.
Stuss, D.T., Alexander, M.P., Shallice, T., Picton, T.W., Binns, M.A., Macdonald, R.,
Borowiec, A., & Katz, D.I. (2005). Multiple frontal systems controlling response
speed. Neuropsychologia, 43, 396-417.
Takegata, R., & Morotomi, T. (1999). Integrated neural representation of sound and
temporal features in human auditory sensory memory: An event-related potential
study. Neuroscience Letters, 274, 207-210.
Tong, Y., & Melara, R.D. (2007). Behavioral and electrophysiological effects of
distractor variation on auditory selective attention. Brain Research, 1166, 110-
123.
Tsuchida, Y., Katayama, J., & Murohashi, H. (2012). Working memory capacity affects
the interference control of distracters at auditory gating. Neuroscience Letters,
516, 62-66.
Wenzel, E.M., Arruda, M., Kistler, D.J., & Wightman, F.L. (1993). Localization using
nonindividualized head-related transfer functions. Journal of the Acoustical
Society of America, 94, 111-123.
Wetzel, N., & Schröger, E. (2007). Modulation of involuntary attention by the duration of
novel and pitch deviant sounds in children and adolescents. Biological
Psychology, 75, 24-31.
Wightman, F.L., & Kistler, D.J. (1989a). Headphone simulation of free-field listening, I:
Stimulus synthesis. Journal of the Acoustical Society of America, 85, 858-867.
Wightman, F.L., & Kistler, D.J. (1989b). Headphone simulation of free-field listening, II:
Psychophysical validation. Journal of the Acoustical Society of America, 85,
868-878.
Wioland, N., Rudolf, G., Metz-Lutz, M.N., Mutschler, V., & Marescaux, C. (1999).
Cerebral correlates of hemispheric lateralization during a pitch discrimination
task: An ERP study in dichotic situation. Clinical Neurophysiology, 110, 516-523.
Woldorff, M.G., Gallen, C.C., Hampson, S.A., Hillyard, S.A., Pantev, C., Sobel, D., &
Bloom, F.E. (1993). Modulation of early sensory processing in human auditory
cortex during auditory selective attention. Proceedings of the National Academy
of Sciences of the United States of America, 90, 8722-8726.
Woldorff, M.G., & Hillyard, S.A. (1991). Modulation of early auditory processing during
selective listening to rapidly presented tones. Electroencephalography and
Clinical Neurophysiology, 79, 170-191.
Woodman, G.F., Arita, J.T., & Luck, S.J. (2009). A cuing study of the N2pc component:
An index of attentional deployment to objects rather than spatial locations. Brain
Research, 1297, 101-111.
Wykowska, A., & Schubö, A. (2009). On the temporal relation of top-down and bottom-
up mechanisms during guidance of attention. Journal of Cognitive Neuroscience,
22, 640-654.
Zhao, G., Liu, Q., Zhang, Y., Jiao, J., Zhang, Q., Sun, H., & Li, H. (2011). The amplitude
of the N2pc reflects the physical disparity between target item and distractors.
Neuroscience Letters, 491, 68-72.
Table 1: Mean amplitude values derived from trials with correct responses.
Table 1a: Mean Amplitude at N1m

Overall Conditions       Left Hemisphere     Right Hemisphere
                         Group Mean (SD)     Group Mean (SD)
TPL                      -56.01 (22.31)      -50.01 (20.72)
TPR                      -55.82 (21.91)      -48.84 (19.98)
TA                       -57.24 (21.82)      -50.43 (20.47)

Semi-Tone Differences    Left Hemisphere                     Right Hemisphere
                         1 ST             4 ST               1 ST             4 ST
TPL                      -58.00 (24.03)   -53.90 (21.07)     -50.47 (21.39)   -49.45 (20.26)
TPR                      -57.03 (22.96)   -54.59 (21.12)     -50.35 (20.60)   -47.29 (19.75)
TA                       -57.09 (21.62)   -57.44 (22.11)     -51.17 (20.77)   -49.69 (20.24)
Table 1b: Mean Amplitude at P2m

Overall Conditions       Left Hemisphere     Right Hemisphere
                         Group Mean (SD)     Group Mean (SD)
TPL                      -38.28 (19.50)      -25.43 (17.03)
TPR                      -36.01 (18.65)      -24.96 (15.65)
TA                       -35.51 (17.66)      -25.52 (15.81)

Semi-Tone Differences    Left Hemisphere                     Right Hemisphere
                         1 ST             4 ST               1 ST             4 ST
TPL                      -37.85 (21.15)   -38.63 (18.07)     -24.04 (17.97)   -26.47 (16.22)
TPR                      -35.69 (19.43)   -36.23 (18.14)     -24.80 (16.50)   -25.06 (15.48)
TA                       -33.53 (16.53)   -37.54 (19.10)     -24.29 (15.51)   -26.77 (16.29)
Table 1c: Mean Amplitude at P3m

Overall Conditions       Left Hemisphere     Right Hemisphere
                         Group Mean (SD)     Group Mean (SD)
TPL                      -54.07 (24.44)      -34.63 (20.40)
TPR                      -51.49 (24.48)      -34.39 (18.24)
TA                       -52.23 (22.42)      -34.40 (18.90)

Semi-Tone Differences    Left Hemisphere                     Right Hemisphere
                         1 ST             4 ST               1 ST             4 ST
TPL                      -53.20 (26.82)   -54.54 (22.58)     -33.35 (22.29)   -35.56 (18.79)
TPR                      -52.02 (24.27)   -50.93 (25.40)     -34.94 (18.19)   -33.69 (19.74)
TA                       -51.01 (20.04)   -53.45 (25.07)     -34.46 (18.48)   -34.38 (19.64)
Table 1d: Mean Amplitude at Late

Overall Conditions       Left Hemisphere     Right Hemisphere
                         Group Mean (SD)     Group Mean (SD)
TPL                      -33.78 (20.37)      -21.86 (15.83)
TPR                      -31.09 (22.55)      -21.14 (12.46)
TA                       -42.88 (20.14)      -24.88 (13.37)

Semi-Tone Differences    Left Hemisphere                     Right Hemisphere
                         1 ST             4 ST               1 ST             4 ST
TPL                      -32.28 (23.19)   -34.87 (18.34)     -19.93 (16.96)   -23.28 (15.13)
TPR                      -31.14 (22.38)   -31.13 (23.86)     -21.72 (15.50)   -20.44 (13.19)
TA                       -40.83 (18.40)   -44.89 (22.36)     -24.50 (13.49)   -25.33 (13.85)
Table 2: Mean amplitude values derived from trials with incorrect responses.
Table 2a: Mean Amplitude at N1m

Overall Conditions       Left Hemisphere     Right Hemisphere
                         Group Mean (SD)     Group Mean (SD)
TPL                      -57.73 (23.67)      -49.41 (22.82)
TPR                      -51.99 (20.63)      -47.71 (19.91)
TA                       -54.48 (25.73)      -47.40 (21.62)

Semi-Tone Differences    Left Hemisphere                     Right Hemisphere
                         1 ST             4 ST               1 ST             4 ST
TPL                      -56.48 (23.17)   -55.61 (23.09)     -51.39 (25.99)   -50.06 (19.99)
TPR                      -50.93 (23.80)   -55.27 (22.90)     -45.47 (23.67)   -48.46 (22.13)
TA                       -52.14 (27.27)   -52.39 (25.71)     -50.71 (23.03)   -41.51 (28.35)
Table 2b: Mean Amplitude at P2m

Overall Conditions       Left Hemisphere     Right Hemisphere
                         Group Mean (SD)     Group Mean (SD)
TPL                      -35.72 (22.26)      -24.49 (20.75)
TPR                      -33.56 (18.09)      -25.37 (14.34)
TA                       -31.78 (17.11)      -21.79 (18.78)

Semi-Tone Differences    Left Hemisphere                     Right Hemisphere
                         1 ST             4 ST               1 ST             4 ST
TPL                      -36.45 (20.27)   -32.13 (36.52)     -23.55 (18.99)   -28.46 (20.39)
TPR                      -32.95 (24.11)   -35.74 (19.82)     -21.56 (16.98)   -29.09 (13.64)
TA                       -27.83 (22.94)   -34.20 (21.37)     -25.55 (15.88)   -18.40 (26.78)
Table 2c: Mean Amplitude at Late

Overall Conditions       Left Hemisphere     Right Hemisphere
                         Group Mean (SD)     Group Mean (SD)
TPL                      -43.07 (25.08)      -22.22 (21.97)
TPR                      -39.81 (25.64)      -24.45 (16.87)
TA                       -40.47 (34.50)      -21.20 (16.09)

Semi-Tone Differences    Left Hemisphere                     Right Hemisphere
                         1 ST             4 ST               1 ST             4 ST
TPL                      -46.55 (28.99)   -36.94 (22.25)     -25.21 (27.49)   -23.74 (19.50)
TPR                      -40.42 (25.19)   -41.00 (25.83)     -23.28 (15.53)   -25.55 (19.09)
TA                       -37.91 (36.19)   -36.49 (26.58)     -21.88 (14.92)   -17.50 (23.17)
[Figure 1 image: trial timeline showing start of trial (1000 ms), tones (200 ms, 500-1500 ms), and response (end of trial)]
Figure 1: Illustration of the progression of a single experimental trial.
[Figure 2 image, panels A-C. Dipole color legend: red = representative participant, right dipole; blue = representative participant, left dipole; green = group mean, right dipole; purple = group mean, left dipole]
Figure 2: Neuromagnetic activity averaged over all experimental trials. A) Auditory evoked fields (AEFs) for one representative participant. B) Contour maps for the same participant at the N1m. C) The locations of the N1m dipoles in the left and right hemispheres for the same participant, as well as the overall group mean, using an MRI template from BESA 5.2.
Figure 3: Response times averaged across all 17 participants. The top panel shows the response times for the three trial types (i.e., the presence/absence and location of target presentation) of the Target condition. TPL = Target Present Left; TPR = Target Present Right; TA = Target Absent. The lower panel shows the response times for the two trial types (i.e., frequency difference between concurrently presented stimuli) of the SemiTone condition. 1ST = 1 semi-tone difference; 4ST = 4 semi-tone difference.
[Figure 3 image: two bar graphs of response time (ms), axis range 500-900 ms; top panel: Target condition (TPL, TPR, TA); bottom panel: SemiTone condition (1ST, 4ST)]
[Figure 4 image: left- and right-hemisphere source waveforms with the N1, P2, P3, and Late deflections labeled. Panels: (a) All Correct Trials; (b) Correct Trials, 1 SemiTone; (c) Correct Trials, 4 SemiTone]
Figure 4: Source waveforms modeled from single dipoles seeded in the auditory cortex of the left and right hemispheres. In panel (a), all correct trials are included and split based upon the target being presented to the left ear (TPL), right ear (TPR), or being absent (TA). Panels (b) and (c) show the same data for correct trials, but separated according to whether the frequencies of the two vowels on a trial differed by 1 or 4 semi-tones, respectively. “Source” = Hemisphere.