Audiovisual Processing and Integration in Amblyopia

by

Michael David Richards

A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy

Institute of Medical Science University of Toronto

© Copyright by Michael David Richards 2018

Abstract

Amblyopia is a developmental visual disorder caused by abnormal visual experience during early life. Accumulating evidence points to perceptual deficits in amblyopia beyond vision, in the realm of audiovisual multisensory perception. This thesis presents a systematic psychophysical investigation of audiovisual processing and integration in adults with unilateral amblyopia. Study I examines audiovisual spatial integration and reveals amblyopic deficits in localization precision for unisensory visual and auditory stimuli, but statistically optimal integration according to the maximum likelihood estimation model of multisensory integration. Study II confirms the novel deficit in sound localization described in Study I, and reveals a non-uniform spatial pattern of sound localization deficits that implicates the superior colliculus as a primary neural locus affected by abnormal visual experience. Study III measures audiovisual simultaneity perception, and shows that asynchronous audiovisual pairs are perceived as synchronous over wider temporal intervals than normal, regardless of which eye is viewing. Study IV examines audiovisual temporal integration using the temporal ventriloquism effect, and reveals successful temporal integration in amblyopia, but possibly over a wider interval of audiovisual asynchrony. In sum, the findings suggest that the capacities for spatial and temporal audiovisual integration are intact in amblyopia, but that non-integrative multisensory processes, including cross-modal temporal matching and cross-sensory calibration of sound localization, are impaired.

Acknowledgments

I would like to express my sincere gratitude to a number of people and organizations that contributed in some way to the realization of this thesis.

To my primary supervisor, Dr. Agnes Wong, thank you for your invaluable mentorship, guidance, and support over the course of this research program. I am grateful for your encouragement along this winding path, for the intellectual freedom you afforded me in my work, and for your commitment to the ideal of merging scientific rigour with clinical compassion. You have been an inspiration to me in pursuing this career.

To my co-supervisor, Dr. Herb Goltz, thank you for your day-to-day engagement with the challenges and triumphs of my graduate experience. I am grateful for the encouragement, knowledge, and thoughtful advice you shared with me during our innumerable scientific discussions. Your mentorship has fundamentally shaped my training and this work.

To the members of my advisory committee, Dr. Karen Gordon, Dr. Bob Harrison, and Dr. Daphne Maurer, thank you for your invaluable guidance and scientific insights that have pushed me to become a better scientist. I am grateful for the time and energy you have devoted to seeing me through this program.

I am also grateful for the administrative and financial support provided by the Clinician Investigator Program and the Vision Science Research Program at the University of Toronto.

To past and present members of the lab, thank you for your friendship on this journey. Thank you to Luke Gane and Al Blakeman for your technical expertise. Thank you to Linda Colpa for recruiting and screening subjects, to Inna Tsirlin for early critiques of the research proposal, and to Cindy Narinesingh for software support. Thank you to my desk mates Jaime Sklar, Marija Zivcevska, and Shaobo Lei for the camaraderie on our side of ‘the wall’, and to Arham Raashid for your friendship and unwavering enthusiasm for science.

To my family and friends, thank you for loving and supporting me through these transformative years. I am eternally grateful to my mom and dad for their belief in my ability to confront and overcome any obstacle. And to my partner, Tanner, thank you for your love, patience, and support; I could not have finished this without you.

Table of Contents

Acknowledgments

Table of Contents

Statement of Contributions

List of Abbreviations

List of Tables

List of Figures

Chapter 1 General Introduction

1.1 Amblyopia

1.1.1 Overview

1.1.2 Etiology of Amblyopia

1.1.3 Abnormalities in Spatiotemporal Visual Processing in Amblyopia

1.1.4 Neural Basis of Amblyopia

1.1.5 Normal Visual Development and Sensitive Periods for Damage and Recovery

1.2 Auditory Processing

1.2.1 Overview

1.2.2 Auditory Spatial Processing

1.2.3 Auditory Temporal Processing

1.3 Multisensory Processing and Integration

1.3.1 Overview

1.3.2 Influence of Cognitive Factors in Multisensory Processing

1.3.3 Neural Sites of Multisensory Processing

1.3.4 Multisensory Integration

1.3.5 Theories of Multisensory Integration and Modality Dominance

1.3.6 Development of Multisensory Processes

1.3.7 Cross-Sensory Calibration Hypothesis

1.3.8 Selected Psychophysical Measures of Audiovisual Processing and Integration

1.4 Multisensory Processing in Amblyopia

1.4.1 Audiovisual Temporal and Spatial Perception

1.4.2 Audiovisual Speech Perception

1.5 Summary

Chapter 2 Study Aims and Hypotheses

2.1 General Rationale and Research Aims

2.2 Specific Study Aims and Hypotheses

2.2.1 Audiovisual Spatial Perception

2.2.2 Audiovisual Temporal Perception

Chapter 3 Study I: Optimal Audiovisual Integration in the Ventriloquism Effect but Pervasive Deficits in Unisensory Spatial Localization in Amblyopia

3.1 Abstract

3.2 Introduction

3.3 Methods

3.3.1 Participants

3.3.2 Apparatus and Stimuli

3.3.3 Procedure

3.3.4 Data Analysis

3.4 Results

3.4.1 Localization Performance

3.4.2 Testing the Maximum Likelihood Estimation Model

3.5 Discussion

Chapter 4 Study II: Amblyopia and the Developmental Calibration of Sound Localization

4.1 Abstract

4.2 Introduction

4.3 Methods

4.3.1 Experiment 1: Relative sound localization (minimum audible angle task using speaker array)

4.3.2 Experiment 2: Absolute Auditory Localization

4.3.3 Experiment 3: Replication of MAA task using stereo speaker apparatus (amplitude panning)

4.4 Results

4.4.1 Experiment 1

4.4.2 Experiment 2

4.4.3 Experiment 3

4.5 Discussion

Chapter 5 Study III: Alterations in Audiovisual Simultaneity Perception in Amblyopia

5.1 Abstract

5.2 Introduction

5.3 Materials and Methods

5.3.1 Participants

5.3.2 Apparatus and Stimuli

5.3.3 Procedure

5.3.4 Analysis

5.4 Results

5.4.1 Binocular Viewing Condition

5.4.2 Monocular Viewing Conditions

5.5 Discussion

Chapter 6 Study IV: Temporal Ventriloquism Reveals Normal Audiovisual Temporal Integration in Amblyopia

6.1 Abstract

6.2 Introduction

6.3 Methods

6.3.1 Participants

6.3.2 Apparatus and Stimuli

6.3.3 Design and Procedure

6.3.4 Data Analysis

6.4 Results

6.5 Discussion

Chapter 7 General Discussion and Conclusions

7.1 Summary of Findings and Evaluation of Specific Hypotheses

7.1.1 Audiovisual Spatial Perception

7.1.2 Audiovisual Temporal Perception

7.2 Is Audiovisual Integration Impaired in Amblyopia?

7.2.1 Possible Mechanisms for the Pattern of Audiovisual Integration Abnormalities in Amblyopia

7.3 Are Non-integrative Audiovisual Processes Impaired in Amblyopia?

7.3.1 Cross-modal Matching

7.3.2 Unisensory Impairments and Cross-sensory Calibration

7.4 Clinical Implications

7.5 Conclusions

Chapter 8 Future Directions

8.1 Development and Mechanisms of Multisensory Processing and Integration

8.2 Nature and Extent of Perceptual Impairments in Amblyopia

8.3 Looking to the Future of Amblyopia Therapy

References

Copyright Acknowledgements

Statement of Contributions

Dr. Michael Richards (author) – all aspects of this work, including but not limited to: experimental design, data collection, data analysis, data interpretation, thesis preparation

Dr. Agnes Wong (supervisor) – mentorship, assistance with experimental design, assistance with data interpretation, assistance with thesis preparation

Dr. Herbert Goltz (co-supervisor) – mentorship, assistance with experimental design, assistance with data interpretation, assistance with thesis preparation

Dr. Daphne Maurer (committee member) – mentorship, assistance with experimental design, assistance with data interpretation, assistance with thesis preparation

Dr. Karen Gordon (committee member) – mentorship, assistance with experimental design,


Dr. Robert Harrison (committee member) – mentorship, assistance with experimental design,


Linda Colpa – assistance with participant recruitment, assistance with data interpretation

Arham Raashid – assistance with data analysis, assistance with data interpretation

Luke Gane – assistance with experimental apparatus, assistance with data analysis

Alan Blakeman – assistance with experimental apparatus, assistance with data analysis

Jaime Sklar – assistance with data collection, assistance with data interpretation

Cindy Narinesingh – assistance with data analysis

Dr. Inna Tsirlin – assistance with experimental design

List of Abbreviations

AE Amblyopic eye

ANCOVA Analysis of covariance

ANOVA Analysis of variance

AV Audiovisual

BOLD Blood oxygenation level-dependent

DNLL Dorsal nucleus of the lateral lemniscus

ETDRS Early Treatment Diabetic Retinopathy Study

FE Fellow eye (as compared to the amblyopic eye)

fMRI Functional magnetic resonance imaging

HRTF Head-related transfer function

IC Inferior colliculus

ILD Interaural level difference

IPS Intraparietal sulcus

ITD Interaural time difference

JND Just noticeable difference

LE Left eye

LED Light emitting diode

LGN Lateral geniculate nucleus

logMAR Logarithm of the minimum angle of resolution

LSO Lateral superior olive

MAA Minimum audible angle

MLE Maximum likelihood estimation

MNTB Medial nucleus of the trapezoid body

MRI Magnetic resonance imaging

MSO Medial superior olive

MT Middle temporal visual area

PET Positron emission tomography

PSE Point of subjective equality

PSS Point of subjective simultaneity

RE Right eye

RMS Root mean square

SC Superior colliculus

SD Standard deviation

SOA Signal onset asynchrony

STS Superior temporal sulcus

TOJ Temporal order judgment

V1 Primary visual cortex, or striate cortex

V2, V3, V3a, Vp, V4+, V8 Extrastriate visual cortices

VEP Visual evoked potential

List of Tables

Table 3.1: Characteristics of participants with amblyopia

Table 3.2: Probe stimulus displacements used for each test stimulus condition

Table 4.1: Clinical details of participants with amblyopia in Experiment 1

Table 4.2: Clinical details of participants with amblyopia in Experiment 2

Table 5.1: Characteristics of participants with amblyopia

Table 5.2: Audiovisual simultaneity window parameters by main group

Table 5.3: Audiovisual simultaneity window parameters by amblyopia severity

Table 5.4: Audiovisual simultaneity window parameters by amblyopia etiology

Table 5.5: Audiovisual simultaneity window parameters by suppression status

Table 5.6: Audiovisual simultaneity window parameters by stereopsis level

Table 5.7: Comparison of audiovisual simultaneity window parameters by viewing condition for participants with amblyopia (repeated measures ANOVA)

Table 6.1: Clinical characteristics of participants with amblyopia

Table 6.2: Visual temporal order judgment performance in the control and amblyopia groups

List of Figures

Figure 1.1: Visual evoked potential P1 amplitude and latency distributions from trial-by-trial analysis from 18 adults with unilateral amblyopia

Figure 1.2: Schematic diagram of retinal projections in the retinostriate and retinocollicular pathways

Figure 1.3: Summary of putative multisensory areas of the human brain based on primate anatomical data, human psychophysical data, and functional neuroimaging studies

Figure 1.4: Posterior-to-anterior audiovisual processing gradient in the human STS

Figure 1.5: A hypothetical audiovisual simultaneity window

Figure 1.6: A diagram of the spatial ventriloquism effect

Figure 1.7: Examples of audiovisual stimulus conditions that elicit the temporal ventriloquism effect

Figure 3.1: Audiovisual apparatus for the presentation of visual blobs and auditory clicks

Figure 3.2: Illustration of the trial timeline

Figure 3.3: Unimodal and bimodal localization task performance

Figure 3.4: Localization precision for visual-only, auditory-only, and spatially congruent bimodal audiovisual stimuli

Figure 3.5: Bimodal localization bias for audiovisual stimuli with spatial conflict

Figure 3.6: Bimodal localization precision, as observed and as predicted by the MLE model

Figure 3.7: Maximal bimodal advantage ratio for localization precision, observed, as predicted by the MLE model, and as predicted by integration failure

Figure 3.8: Perceptual weight for vision (wV), observed and as predicted by the MLE model

Figure 3.9: Visual blob size equivalent to the auditory click in terms of spatial precision (on unimodal presentation) and perceptual weight (on bimodal presentation)

Figure 4.1: Apparatus for Experiment 1, a horizontal array of 11 speakers with a central fixation LED

Figure 4.2: Apparatus for Experiment 2, stereo speakers with LED monitor

Figure 4.3: Relative sound localization performance on a horizontal speaker array

Figure 4.4: Absolute sound localization performance

Figure 4.5: Correlations between RMS error for sound localization and clinical measures of amblyopia across auditory target positions

Figure 4.6: Relative sound localization performance on stereo speaker apparatus

Figure 4.7: Correlation between minimum audible angle (MAA) values determined by amplitude panning (Experiment 3) and by physical speakers (Experiment 1)

Figure 5.1: Schematic diagram of signal onset asynchronies (SOA) for auditory-lead and visual-lead conditions

Figure 5.2: Sample audiovisual simultaneity judgment data from a visually normal control participant, fitted with a truncated Gaussian function by the maximum likelihood method

Figure 5.3: Main group analysis for audiovisual simultaneity judgment responses with both eyes viewing as a function of SOA

Figure 5.4: Subgroup analyses for audiovisual simultaneity judgment responses with both eyes viewing as a function of SOA

Figure 5.5: The audiovisual simultaneity window for binocular and monocular viewing conditions among participants with amblyopia

Figure 6.1: Schematic of the apparatus and stimuli that induce the temporal ventriloquism effect

Figure 6.2: The temporal ventriloquism effect with and without intact audiovisual integration

Figure 6.3: The audiovisual apparatus

Figure 6.4: Visual temporal order judgment performance for visual-only stimuli and audiovisual stimuli with synchronous clicks (AV sync)

Figure 6.5: The temporal ventriloquism effect in the control group and the amblyopia group

Figure 6.6: Relation between susceptibility to the temporal ventriloquism effect and visual acuity in the amblyopic eye across click timing conditions in which the second click lagged the onset of the second light

Figure 6.7: Relation between susceptibility to the temporal ventriloquism effect and stereo acuity across click lag conditions in participants with amblyopia

Figure 8.1: Possible mechanisms that determine the temporal window of audiovisual integration

Chapter 1 General Introduction

General Introduction

1.1 Amblyopia

1.1.1 Overview

The term amblyopia is derived from the Greek words amblys, meaning ‘dulled’ or ‘blunt’, and ops, meaning ‘eye’, and literally means ‘dimness of vision’. Amblyopia, also commonly referred to as ‘lazy eye’, has traditionally been defined as a decrease in visual acuity in the absence of any apparent ocular defect to account for the impairment (von Noorden & Campos, 2002). The traditional definition, however, is incomplete. Evidence from animal studies (Hubel & Wiesel, 1970; Kiorpes, Kiper, O'Keefe, Cavanaugh, & Movshon, 1998; Movshon et al., 1987) and, more recently, from human neuroimaging studies (Goodyear, Nicolle, Humphrey, & Menon, 2000) shows that the visual deficit in amblyopia is related to dysfunctional processing of visual information. Amblyopia is accompanied by one or more amblyogenic factors that disrupted normal visual experience during a sensitive period in the development of the visual pathways in infancy or early childhood (Birch, 2013). Therefore, amblyopia may be more precisely defined as an impairment in visual processing caused by abnormal visual experience during a critical period in the first years of life (Holmes & Clarke, 2006).

Clinically, amblyopia can be defined as a unilateral, or rarely bilateral, reduction in best-corrected visual acuity that cannot be directly attributed to a structural eye abnormality (American Academy of Ophthalmology Pediatric Ophthalmology/Strabismus Panel, 2012; The Lasker/IRRF Initiative for Innovation in Vision Science, 2017). Additionally, the diagnosis of amblyopia requires a history of one or more amblyogenic factors, most commonly strabismus (eye misalignment) and anisometropia (difference in refractive error between the eyes), or more rarely visual deprivation (most often from congenital cataract), that interfered with pattern vision or normal binocular interaction in early life (Preslan & Novak, 1996). Because most cases of amblyopia are unilateral, a widely used and practical definition of amblyopia is an inter-ocular difference in best-corrected visual acuity of 2 or more lines (i.e., ≥ 0.2 logMAR) on a standard eye chart (Holmes & Clarke, 2006). Visual acuity differences of fewer than 2 lines are generally within the range of test-retest variability for normal observers (Holmes et al., 2001).
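As a rough illustration of the acuity criterion above, the following sketch checks whether an inter-ocular difference reaches the two-line threshold. It assumes a standard logMAR chart on which each line corresponds to 0.1 log units; the function name and the example acuities are hypothetical and are not taken from this thesis.

```python
LOGMAR_PER_LINE = 0.1  # assumption: standard logMAR chart, 0.1 log units per line


def meets_two_line_criterion(logmar_a, logmar_b, min_lines=2):
    """Return True if the inter-ocular acuity difference is at least min_lines."""
    # Round to suppress floating-point noise (e.g. 0.3 / 0.1 is not exactly 3.0).
    diff_lines = round(abs(logmar_a - logmar_b) / LOGMAR_PER_LINE, 6)
    return diff_lines >= min_lines


# Hypothetical best-corrected acuities, in logMAR, for the two eyes
print(meets_two_line_criterion(0.0, 0.3))  # three-line difference
print(meets_two_line_criterion(0.0, 0.1))  # one-line difference
```

A one-line difference falls below the threshold, which matches the point that small differences lie within normal test-retest variability.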

Amblyopia is a significant public health concern among children and adults alike. The prevalence of amblyopia among children in developed countries is estimated at 2% to 4% (Donnelly, Stewart, & Hollinger, 2005; Friedman et al., 2009; Preslan & Novak, 1996; Robaei et al., 2006; Thompson, Woodruff, Hiscox, Strong, & Minshull, 1991; Williams et al., 2008). Failure to detect and treat amblyopia in childhood means that its effects are often lifelong. Indeed, amblyopia is the leading cause of persistent monocular blindness among adults (Buch, Vinding, La Cour, & Nielsen, 2001; Krueger & Ederer, 1984), and its estimated prevalence in adult populations is approximately 3% (Attebo et al., 1998; Brown et al., 2000; Vinding, Gregersen, Jensen, & Rindziunski, 2009). In addition to the unilateral visual deficit acquired in childhood, people with amblyopia are at markedly increased risk of bilateral blindness compared to the general population, most commonly from trauma to the fellow eye (Tommila & Tarkkanen, 1981). The lifetime risk of serious visual impairment in the fellow eye is estimated at 1.2% to 3.3% (Rahi, Logan, Timms, Russell-Eggitt, & Taylor, 2002). In total, health economists estimate that untreated amblyopia accounts for over $7 billion in lost earning power annually in the United States (Membreno, Brown, Brown, Sharma, & Beauchamp, 2002).

The impacts of amblyopia extend beyond visual perception as well. Quality of life studies report

that amblyopia has negative effects on social interactions (Horwood, Waylen, Herrick, Williams,

& Wolke, 2005; Packwood, Cruz, Rychwalski, & Keech, 1999; van de Graaf et al., 2007), self-

esteem and self-image (Packwood et al., 1999; Webber, Wood, Gole, & Brown, 2008), sports

involvement (Packwood et al., 1999), educational attainment (Chua & Mitchell, 2004), and

ultimate career choice (Adams & Karas, 1999; Packwood et al., 1999) (see Carlton and

Kaltenthaler (2011) for review). Although vision problems are widely acknowledged not to

cause primary dyslexia or learning disabilities (American Academy of Pediatrics Section on

Ophthalmology Council on Children with Disabilities, American Academy of Ophthalmology,

American Association for Pediatric Ophthalmology and Strabismus, & American Association of

Certified Orthoptists, 2009), school-aged children with unilateral amblyopia read more slowly

than their typically-sighted counterparts, even under natural binocular viewing conditions

(Kanonidou, Proudlock, & Gottlob, 2010; Kelly, Jost, De La Cruz, & Birch, 2015). Eye-hand

coordination is also affected, with slower and less precise reaching and grasping (Grant,

Melmoth, Morgan, & Finlay, 2007; Niechwiej-Szwedo, Goltz, Chandrakumar, Hirji, & Wong,

2011; Niechwiej-Szwedo, Goltz, Chandrakumar, & Wong, 2012). Additionally, perceptual

abnormalities in audiovisual multisensory processing are evident in children and adults with

unilateral amblyopia (Burgmeier et al., 2015; Chen, Lewis, Shore, & Maurer, 2017; Narinesingh,

Goltz, Raashid, & Wong, 2015; Narinesingh, Goltz, & Wong, 2017; Narinesingh, Wan, Goltz,

Chandrakumar, & Wong, 2014).

The current gold standard for the treatment of unilateral amblyopia in children involves

refractive correction and occlusion (i.e., patching) or pharmacological penalization of the fellow

eye to promote use of the amblyopic eye (American Academy of Ophthalmology Pediatric

Ophthalmology/Strabismus Panel, 2012; Stewart, Moseley, & Fielder, 2011). In the case of

visual deprivation, the cause of visual obstruction must be addressed first. If strabismus is

present, however, amblyopia treatment may commence immediately, before eye muscle surgery

to straighten the eyes. The frequency and duration of occlusion or penalization prescribed are

generally determined by the severity of the acuity deficit, and may continue for months to a year

or more, until gains in visual acuity reach a plateau. Using a primary endpoint of 0.3 logMAR

(i.e., 20/40 Snellen equivalent), the overall success rate for occlusion therapy in young children

is about 75% (Flynn, Schiffman, Feuer, & Corona, 1998). Treatment is considerably less

effective after 7 years of age, but small improvements have been observed into late adolescence

(Campos, 1995; Holmes et al., 2011; Lea, Loades, & Rubinstein, 1989; Scheiman et al., 2005).

Novel therapies such as dichoptic games (Holmes et al., 2016), dark exposure (Duffy & Mitchell,

2013) and retinal inactivation (Fong, Mitchell, Duffy, & Bear, 2016) hold promise, but their

efficacy is as yet unproven in humans.

1.1.2 Etiology of Amblyopia

Amblyopia is typically classified according to the amblyogenic factor presumed to have

interfered with visual experience during the critical period in visual maturation. Refractive

amblyopia is caused by chronic retinal defocus associated with untreated refractive error in one

or both eyes. Unilateral refractive amblyopia is termed anisometropic amblyopia, and occurs

when the refractive error between the two eyes is unequal. The more hyperopic (i.e., far-sighted)

eye typically receives the more defocused retinal image and becomes amblyopic (Birch, 2013).

Bilateral refractive amblyopia is much less common, and occurs in cases of high refractive error

affecting both eyes. Strabismic amblyopia is associated with misalignment of the visual axes.

Constant, non-alternating strabismus, and eso-deviations are particularly amblyogenic, and lead

to amblyopia in the non-fixating eye (Birch, 2013). Mixed-mechanism amblyopia is the term

applied to cases that exhibit both anisometropia and strabismus as amblyogenic factors.

Deprivational amblyopia is caused by complete or partial obstruction of the visual axis in one or

both eyes. It is most commonly associated with congenital cataract, but may also be observed in

cases of severe ptosis, corneal opacity, and vitreous hemorrhage. Deprivational amblyopia is the

rarest form of amblyopia, has the earliest onset, and generally causes visual impairment that is more

severe and refractory to treatment (American Academy of Ophthalmology Pediatric

Ophthalmology/Strabismus Panel, 2012). The rationale for this etiological classification of

amblyopia is not solely based on convenience, but also supported by differences in epidemiology

and the pattern of visual deficits (see section 1.1.3) among the groups.

In adults with a history of amblyopia, the most common etiology is anisometropia in 50%,

followed by mixed-mechanism in 27%, strabismus in 19% and deprivation in 4% (Attebo et al.,

1998). Among children, however, the relative prevalence depends on the age group under study.

Below 3 years of age, 82% of amblyopia is associated with strabismus, only 5% is associated

with anisometropia, and 13% is associated with mixed mechanism (Birch & Holmes, 2010).

Between 3 and 6 years of age, however, the proportion associated with strabismus decreases to

38%, and the proportions associated with anisometropia and mixed mechanism rise to 37% and

24%, respectively (Repka et al., 2002). These differences by age cohort suggest differential

sensitivity to amblyogenic factors as visual development progresses, with strabismus being a

stronger influence before age 3 years, and anisometropia emerging as a significant influence

primarily after age 3 years (Birch, 2013). Indeed, a longitudinal study of infants showed that

anisometropia does not confer increased risk of amblyopia unless it persists for 3 years

(Abrahamsson, Fabian, & Sjostrand, 1990).

The different etiologies of amblyopia also show differential responsiveness to treatment. Using a

final visual acuity of 0.3 logMAR (20/40) as the definition of treatment success, a meta-analysis

of 23 studies on occlusion therapy reported a success rate of 78% in strabismic amblyopia, 67%

in anisometropic amblyopia, and 59% in mixed-mechanism amblyopia (Flynn et al., 1998).

Furthermore, it reported slightly superior mean final visual acuity for anisometropic and

strabismic amblyopia (0.30 logMAR and 0.27 logMAR, respectively) than for mixed mechanism

amblyopia (0.43 logMAR). Monocular deprivational amblyopia is usually considered separately

in clinical studies because of its early age at presentation and need for urgent surgical

intervention. Anecdotally, it is regarded as more severe and resistant to therapy than other forms

of amblyopia (American Academy of Ophthalmology Pediatric Ophthalmology/Strabismus

Panel, 2012), but the visual outcome for deprivational amblyopia is highly dependent on age at

which treatment is initiated (Birch & Stager, 1988; Birch, Stager, & Wright, 1986; Birch,

Swanson, Stager, Woody, & Everett, 1993; Kugelberg, 1992). For unilateral congenital cataract,

cases treated before 2 months of age achieve a mean visual acuity of 0.38 logMAR (20/48),

compared to 0.89 logMAR (20/155) in cases treated at 3 months of age or later (Birch, Stager,

Leffler, & Weakley, 1998). A recent study reported that the current standard for cataract surgery,

performed at a median age of 1.8 months, achieves a visual acuity of 0.3 logMAR (20/40) in

only 28% of cases (Lambert et al., 2010; Lambert, DuBois, Cotsonis, Hartmann, & Drews-

Botsch, 2016).

1.1.3 Abnormalities in Spatiotemporal Visual Processing in Amblyopia

Amblyopia is typically detected and diagnosed by a reduction in optotype acuity, but its

associated findings include a wide range of visual and perceptual processing abnormalities

(American Academy of Ophthalmology Pediatric Ophthalmology/Strabismus Panel, 2012;

McKee, Levi, & Movshon, 2003). In addition to a reduction in optotype acuity, typical visual

deficits in amblyopia include reductions in contrast sensitivity, Vernier acuity (i.e., detection of

slight misalignment between two parallel line segments), grating acuity (i.e., resolution of

alternating dark and light stripes of high spatial frequencies), and stereo acuity (i.e., perception of

binocular disparity depth cues in three-dimensional space) (McKee et al., 2003). Foveal

suppression and spatial interference (i.e., the ‘crowding’ effect) in the amblyopic eye are also

characteristic findings (Babu, Clavagnier, Bobier, Thompson, & Hess, 2013; Bonneh, Sagi, &

Polat, 2007; Levi & Klein, 1985; Stuart & Burian, 1962). Although the fellow eye typically has

normal optotype acuity on clinical testing, it commonly exhibits subtle visual deficits as well

(Meier & Giaschi, 2017).

In the amblyopic eye, contrast sensitivity is typically reduced, with high spatial frequencies

preferentially affected, although broadband impairments are sometimes observed (Abrahamsson

& Sjostrand, 1988; Bradley & Freeman, 1981; Hess & Howell, 1977; Levi & Harwerth, 1977).

In the fellow eye, contrast sensitivity is also reduced (Leguire, Rogers, & Bremer, 1990; Wali,

Leguire, Rogers, & Bremer, 1991). The contrast sensitivity deficit in the fellow eye is less

severe, but correlated with the contrast sensitivity deficit in the amblyopic eye. Furthermore,

standard occlusion therapy for amblyopia (i.e., patching of the fellow eye) improves contrast

sensitivity in both the amblyopic and fellow eye, indicating binocular interaction (Leguire et al.,

1990; Wali et al., 1991). Vernier acuity and grating acuity are also impaired in the amblyopic

eye, and the two measures are highly correlated with the deficit in optotype acuity (McKee et al.,

2003). In the fellow eye, evidence suggests that Vernier acuity is usually unaffected or even

slightly enhanced (Freeman & Bradley, 1980; Levi & Klein, 1985). Stereo acuity requires

sensory fusion of correlated or aligned images from each eye, and therefore is affected by any

condition impairing monocular visual acuity, eye alignment, or binocular processing. Stereo

acuity deficits in amblyopia are very typical, and are related to the severity of the visual acuity

deficit, strabismus, and foveal suppression. Indeed, as a general rule, stereopsis is absent in deep

amblyopia (McKee et al., 2003).

Beyond these deficits in low-level visual processing, amblyopia is also associated with

impairments in higher-level processing that requires combination of visual cues over a large area

of visual space to form a coherent percept (Mirabella, Hay, & Wong, 2011; Sharma, Levi, &

Klein, 2000). Spatial distortions of the visual field and positional uncertainty in visual alignment

tasks are observed in the amblyopic and fellow eye (Barrett, Pacey, Bradley, Thibos, & Morrill,

2003; Bedell, Flom, & Barbeito, 1985; Fronius, Sireteanu, & Zubcov, 2004; Hess & Holliday,

1992; Sireteanu, Thiel, Fikus, & Iftime, 2008). Detection of global shape is impaired in the

amblyopic eye (Hess, Wang, Demanins, Wilkinson, & Wilson, 1999), and contour integration,

which requires long-range interactions between elements in the visual space, is impaired in the

amblyopic eye and possibly in the fellow eye (Kovacs, Polat, Pennefather, Chandna, & Norcia,

2000). Similarly, abnormal long-range spatial interference by flanking elements (i.e.,

‘crowding’) is evident in the amblyopic eye (Bonneh, Sagi, & Polat, 2004; Polat, Sagi, & Norcia,

1997). Additionally, individuals with amblyopia show binocular deficits in global motion

detection (Aaen-Stockdale & Hess, 2008; Ho et al., 2006), motion-defined form detection

(Giaschi, Regan, Kraft, & Hong, 1992), and real-world scene perception (Mirabella et al., 2011).

The distribution, pattern and severity of visual deficits also vary among the etiological

categories of amblyopia. In anisometropic amblyopia, spatial frequency discrimination

thresholds are higher (i.e., impaired) at most spatial frequencies in the amblyopic eye, with

greater deficits evident at higher spatial frequencies. In strabismic amblyopia, however,

discrimination thresholds vary inconsistently across spatial frequencies, with peaks and troughs

that qualitatively resemble the spatial frequency discrimination profile of the normal peripheral

retina (Mathews, Yager, Ciuffreda, & Ettinger, 1987). The deficits in contrast sensitivity also

differ between etiological subtypes, with anisometropic and deprivational amblyopia having

poorer contrast sensitivity thresholds than either strabismic or mixed-mechanism amblyopia

(McKee et al., 2003). Furthermore, in anisometropic amblyopia, the deficit in contrast sensitivity

is uniformly distributed across the central and peripheral visual field, whereas in strabismic

amblyopia, the peripheral field (i.e., beyond the central 30 degrees) is relatively spared (Hess &

Pointer, 1985). In general, anisometropic and strabismic amblyopia also differ in the way their

visual deficits co-vary with each other. In anisometropic amblyopia, Vernier acuity and grating

acuity are affected proportionately, but in strabismic amblyopia the two are decoupled, with

Vernier falling off faster than predicted in relation to grating acuity; that is, strabismic amblyopia

has a proportionally greater deficit in Vernier acuity than grating acuity (Birch & Swanson,

2000; Levi & Klein, 1985). Similarly, inaccuracies in spatial localization can be predicted from

the deficit in contrast sensitivity in anisometropic amblyopia, but in strabismic amblyopia,

deficits in contrast sensitivity and spatial localization accuracy appear decoupled (Hess &

Holliday, 1992). The etiological categories of amblyopia also vary in their level of binocular

dysfunction (e.g., stereo acuity and suppression). Empirically and intuitively, strabismic and

mixed-mechanism amblyopia tend to have poorer binocular function, and anisometropic and

deprivational amblyopia tend to have relatively preserved binocular function, as the visual axes

remain aligned (McKee et al., 2003). Indeed, binocular function has been proposed as a primary

determining factor behind the differential patterns of visual deficits observed among the

etiological categories of amblyopia (McKee et al., 2003).

Despite the variations in visual function described above, the etiological subtypes of amblyopia

share more similarities than differences (McKee et al., 2003). Indeed, the diagnostic hallmarks of

amblyopia—reduced visual acuity, reduced contrast sensitivity, reduced stereo acuity, elevated

interocular suppression, and the crowding phenomenon—are common to all etiological subtypes

(Levi & Klein, 1985; J. Li et al., 2011; McKee et al., 2003). Additionally, clinical trials support a

common therapeutic strategy (occlusion and pharmacologic penalization) as the gold standard

for all subtypes of amblyopia (American Academy of Ophthalmology Pediatric

Ophthalmology/Strabismus Panel, 2012; Holmes & Clarke, 2006). Some have also noted that

ascertainment of etiology can be difficult, and may lead to spurious classification, because of

changes in a patient’s refraction and eye alignment in the interval between actual onset and first

clinical presentation (The Lasker/IRRF Initiative for Innovation in Vision Science, 2017). The

scientific study of these etiologic subtypes as a common clinical entity is therefore well justified.

The effects of amblyopia also extend beyond the spatial domain to include temporal aspects of

visual processing that require combination of visual cues over an interval of time. Severely

affected amblyopic eyes (i.e., optotype acuity worse than 0.7 logMAR) are associated with

reduced temporal contrast sensitivity across modulation frequencies from 2 Hz to 30 Hz

(Harwerth, Smith, Boltz, Crawford, & von Noorden, 1983; Wesson & Loop, 1982). Several

studies suggest that amblyopic vision also involves temporal uncertainty: the amblyopic

eye in anisometropic and strabismic amblyopia shows an extra-foveal deficit in temporal

resolution that correlates with the severity of the spatial acuity deficit (Spang & Fahle, 2009),

and sensitivity to visual asynchrony is reduced in the fovea of amblyopic eyes (Huang, Li, Deng,

Yu, & Hess, 2012). Furthermore, reduced sensitivity to visual temporal order is observed in the

fellow eye in strabismic amblyopia when the judgment requires interhemispheric integration of

visual information (i.e., transmission across the corpus callosum) (St John, 1998). Amblyopic

eyes also show markedly reduced duration of visual persistence, apparent as an abnormal

decrease in the ability to detect differences between two visual stimuli as the inter-stimulus

interval increases (Altmann & Singer, 1986).

In addition to these impairments in visual cue combination over time, amblyopia is also

associated with increased latency in the visual system. The Pulfrich effect is a compelling

binocular visual illusion that links spatial vision and temporal processing (Pulfrich, 1922). In this

effect, a pendulum oscillating in the frontal plane is falsely perceived as having an elliptical

orbit. The illusion is not present in visually typical observers under normal binocular conditions,

but only when a signal latency difference is introduced into the visual system, either

experimentally or pathologically (Heng & Dutton, 2011). If stereopsis is intact, the image of the

moving pendulum in the lagging eye creates a binocular disparity that is perceived as three-

dimensional depth. A spontaneous Pulfrich effect has been observed in anisometropic

amblyopia, indicating delayed processing of visual information from the amblyopic eye (Tredici

& von Noorden, 1984). Indeed, this finding agrees with studies of pattern-reversal visual evoked

potentials (VEPs) which show that the latency of the P1 wave (time from stimulation to the peak

of the first major positive inflection, usually at about 100 ms) is prolonged for the amblyopic eye

(Arden & Barnard, 1979; Barnard & Arden, 1979; Sokol, 1983) and possibly for the fellow eye

(Watts, Neveu, Holder, & Sloper, 2002). Within individuals, trial-to-trial VEP latency

measurements from stimulation of the amblyopic eye are not only longer, but also more variable

(Banko, Kortvelyes, Nemeth, Weiss, & Vidnyanszky, 2013; Kelly, Tarczy-Hornoch, Herlihy, &

Weiss, 2015). VEP latency jitter, combined with the customary practice of averaging waveforms over

multiple trials, is proposed by some as the artifactual source of the reduced VEP amplitudes

observed in amblyopia, as illustrated in Figure 1.1 (Banko, Kortvelyes, Nemeth, et al., 2013).

Studies of multifocal VEP responses in amblyopia also show that increases in latency and

decreases in amplitude are not spatially uniform, but more pronounced in the central visual field

in both anisometropic and strabismic amblyopia (Yu, Brown, & Edwards, 1998; Zhang & Zhao,

2005). Furthermore, a study of anisometropic amblyopia showed the interocular difference in

retinocortical transmission time is correlated with the interocular difference in visual acuity

(Parisi, Scarale, Balducci, Fresina, & Campos, 2010). Visual processing delays are also evident

in visuomotor tasks, with increased latency in initiation of saccades (Ciuffreda, Kenyon, & Stark,

1978; McKee, Levi, Schor, & Movshon, 2016) and smooth pursuit eye movements (Raashid,

Liu, Blakeman, Goltz, & Wong, 2016) when viewing with the amblyopic eye, and increased

duration of the motor planning phase during visually-guided reaching movements (Niechwiej-

Szwedo et al., 2011).
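
The argument that latency jitter combined with trial averaging can reduce the averaged VEP amplitude can be illustrated with a small simulation. The Python sketch below is a toy model, not the analysis of Banko et al.: it assumes each single-trial P1 is a Gaussian bump of fixed amplitude whose peak latency varies from trial to trial, and all names and parameter values (15 ms width, 100 ms mean latency, 500 trials) are hypothetical choices for illustration.

```python
import math
import random


def p1_waveform(t_ms: float, peak_ms: float, amplitude: float = 1.0,
                width_ms: float = 15.0) -> float:
    """Gaussian bump standing in for a single-trial P1 component (assumption)."""
    return amplitude * math.exp(-0.5 * ((t_ms - peak_ms) / width_ms) ** 2)


def averaged_peak(jitter_sd_ms: float, n_trials: int = 500,
                  mean_latency_ms: float = 100.0, seed: int = 0) -> float:
    """Peak amplitude of the trial-averaged waveform for a given latency jitter."""
    rng = random.Random(seed)
    # Draw one peak latency per trial; every trial has the same true amplitude.
    latencies = [rng.gauss(mean_latency_ms, jitter_sd_ms) for _ in range(n_trials)]
    # Average the single-trial waveforms at each 1 ms sample from 0 to 300 ms.
    average = [
        sum(p1_waveform(t, latency) for latency in latencies) / n_trials
        for t in range(0, 301)
    ]
    return max(average)


if __name__ == "__main__":
    for sd in (0.0, 5.0, 20.0):
        print(f"jitter SD {sd:4.1f} ms -> averaged peak {averaged_peak(sd):.2f}")
```

In this toy model, every single trial has the same amplitude, yet the peak of the averaged waveform falls as the latency jitter grows, which is the sense in which a reduced averaged amplitude could be artifactual.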

A common factor implicated in many of the spatial and temporal perceptual abnormalities

outlined above is noise and its processing by the amblyopic visual system (Banko, Kortvelyes,

Weiss, & Vidnyanszky, 2013; Levi, 2013). Difficulty in handling external noise is evident from

behavioural studies. The addition of random visual noise to stimuli for global motion and global

orientation discrimination tasks degrades performance in amblyopia to a much greater degree

than it does for visually normal observers, suggesting an impairment in the ability to segregate

external noise from signal (Mansouri & Hess, 2006). The amblyopic eye also shows markedly

reduced sensitivity for the detection of white noise, particularly at high spatial frequencies (Levi,

Klein, & Chen, 2007). In addition to difficulty processing signals in external (i.e., stimulus)

noise, the amblyopic visual system is hypothesized to have higher levels of internal (i.e., neural)

noise. Behaviourally, this is indicated by greater trial-to-trial variability in perceptual tasks (Levi

& Klein, 2003; Levi, Klein, & Chen, 2008; Levi, Klein, & Yap, 1987; Levi, Waugh, & Beard,

1994), increased latency and diminished amplitude of saccadic eye movements (Niechwiej-

Szwedo, Goltz, Chandrakumar, Hirji, & Wong, 2010; Raashid, Wong, Chandrakumar,

Blakeman, & Goltz, 2013), and greater error in visually-guided grasping (Grant et al., 2007) and

reaching hand movements (Niechwiej-Szwedo, Goltz, et al., 2012). Physiologically, internal

noise is apparent as increased latency jitter in VEP responses (Banko, Kortvelyes, Nemeth, et al.,

2013) (see Figure 1.1), and reduced neural synchrony in V1 when driven by stimulation of the

amblyopic eye (Roelfsema, Konig, Engel, Sireteanu, & Singer, 1994). The importance of noise

in the pathophysiology of amblyopia is underscored by the ability to simulate amblyopic deficits

in contrast sensitivity (Nordmann, Freeman, & Casanova, 1992) and saccadic adaptation

(Raashid, Wong, Blakeman, & Goltz, 2015) in visually normal viewers by the addition of

external random noise, and by the inability of visual blur, alone, to simulate amblyopic deficits in

visually-guided reaching (Niechwiej-Szwedo, Kennedy, et al., 2012).

Figure 1.1: Visual evoked potential P1 amplitude and latency distributions from trial-by-

trial analysis from 18 adults with unilateral amblyopia. The histograms show identical P1

amplitude distributions for the fellow eye (FE) and amblyopic eye (AE), but P1 latency

distributions that are wider and skewed toward longer latencies for the AE. The density plots

illustrate that the P1 latency from stimulation of the AE is noisier and less reliable as a marker of

event timing than the signal from the FE. From Banko, Kortvelyes, Nemeth, et al. (2013).

Reprinted with permission from Elsevier.

1.1.4 Neural Basis of Amblyopia

The neural basis of amblyopia has been extensively studied in animal models, and more recently,

using neuroimaging techniques in humans. Although the current scientific consensus points to

the striate or primary visual cortex (V1) as the principal site of neurological dysfunction in

amblyopia, abnormalities have been identified at multiple sites in the visual system (Barrett,

Bradley, & McGraw, 2004; Hess, 2001).

The first significant insights into the neural basis of amblyopia arose from the pioneering work

of Hubel and Wiesel on the structure and function of the visual system in kittens deprived of

vision in one eye (Wiesel & Hubel, 1963a, 1963b). They showed that monocular deprivation by

eyelid suture beginning at birth caused marked atrophy in the layers of the lateral geniculate

nucleus (LGN) fed by the deprived eye (Wiesel & Hubel, 1963a). Deprivation for a comparable

period beginning after 1 to 2 months of normal visual experience resulted in similar but less

severe atrophy, and deprivation beginning in adulthood resulted in no atrophy in the LGN.

Electrophysiological recordings in the kittens with early-onset monocular deprivation showed

normal receptive fields, but reduced overall activity in the layers of the LGN fed by the deprived

eye. Subsequent investigations in monocularly deprived monkeys showed similar histological

changes in the primate LGN, but essentially normal physiological responses to visual stimulation

in the layers fed by the deprived eye (Baker, Grigg, & von Noorden, 1974; Blakemore & Vital-

Durand, 1986). These findings suggested experience-dependent plasticity in the morphology of

the LGN during a critical period in early life, but they did not account for the behavioural deficits

observed in deprivational amblyopia.

Unlike the LGN, neurons in the striate cortex of visually deprived animals showed profound loss

of binocularity, loss of responsiveness to stimulation of the deprived eye, and a commensurate

shift in ocular dominance toward the non-deprived eye (Baker et al., 1974; Wiesel & Hubel,

1963b). As with the atrophy observed in the LGN, the physiological changes in the feline striate

cortex were less severe if the animal was deprived after a period of normal visual experience,

and absent if the animal was deprived in adulthood. Although monocular deprivation does not

cause atrophy of the striate cortex as it does in the LGN, cortical microstructure is greatly

affected: in cortical layer IV, the ocular dominance columns serving the deprived eye are

markedly narrower, while those serving the fellow eye are expanded (Hubel, Wiesel, & LeVay,

1977; Shatz & Stryker, 1978).

In addition to the early anatomical and physiological studies of deprivational amblyopia, other

studies have examined the neuroanatomic correlates of anisometropic and strabismic amblyopia.

Histologically, macaque monkeys with optically-induced anisometropic amblyopia show atrophy

in the LGN layers fed by the amblyopic eye, and narrowing of the V1 ocular dominance columns

serving the amblyopic eye, similar to that seen in monocular deprivation (Hendrickson et al.,

1987). These neuroanatomic changes in anisometropic amblyopia are accompanied by

physiological changes in the striate cortex: the number of neurons driven binocularly (i.e.,

cortical binocularity) is reduced, the proportion of neurons driven by the fellow eye is increased,

and neurons driven by the amblyopic eye have abnormally poor spatial selectivity and contrast

sensitivity (Kiorpes et al., 1998; Movshon et al., 1987). In monkeys with surgically-induced

strabismic amblyopia, cortical binocularity is similarly reduced, but the shift in eye dominance is

only seen among animals with more profound behavioural visual impairment (Kiorpes et al.,

1998). Furthermore, it has been shown in cats that strabismic amblyopia involves a loss of long-

range horizontal intracortical fibres connecting ocular dominance columns of opposite eyes (i.e.

binocular connections), and a reduction in temporal synchronization among neurons driven by

the amblyopic eye (Löwel & Engelmann, 2002; Roelfsema et al., 1994). Histological data on

amblyopia in humans is sparse, but two post-mortem studies suggest that narrowing of ocular

dominance columns serving the amblyopic eye may not be a feature of anisometropic or

strabismic amblyopia in humans (Horton & Hocking, 1996; Horton & Stryker, 1993).

Although the animal studies described above showed a positive relation between the magnitude

of striate cortex abnormalities and the degree of visual impairment, quantitative analyses suggest that

the physiological losses in V1 do not fully account for the behavioural deficits in amblyopia

(Kiorpes et al., 1998). Non-invasive neuroimaging techniques have provided further insights into the

neural basis of amblyopia beyond V1 in humans. Positron emission tomography (PET) has

shown reduced cerebral blood flow and glucose metabolism in V1 and extrastriate visual areas

during amblyopic eye viewing (Demer, von Noorden, Volkow, & Gould, 1988; Imamura et al.,

1997). Functional magnetic resonance imaging (fMRI) studies report similar findings, with

decreased blood oxygenation level-dependent (BOLD) responses associated with amblyopic eye

viewing in areas V1, V2, V3, Vp (i.e. ventral posterior area of V3), and V3a (i.e., dorsal V3)

(Barnes, Hess, Dumoulin, Achtman, & Pike, 2001; Conner, Odom, Schwartz, & Mendola,

2007a, 2007b; Li, Dumoulin, Mansouri, & Hess, 2007). Furthermore, fMRI responses to

stimulation of the amblyopic eye are progressively reduced from V1/V2 to V4+/V8 and the

lateral occipital complex, suggesting impaired transmission of visual information to areas of

higher visual processing (Muckli et al., 2006). In amblyopic macaque monkeys, decreased fMRI

responses to the amblyopic eye have also been observed in the middle temporal (MT) visual area

(El-Shamayleh, Kiorpes, Kohn, & Movshon, 2010), which has been shown to mediate global

motion discrimination (Newsome & Pare, 1988; Salzman, Britten, & Newsome, 1990).

Similarly, an fMRI study of humans with amblyopia showed decreased activity in area MT

during visual motion tracking (Secen, Culham, Ho, & Giaschi, 2011). In agreement with these

spatially distributed reductions in cerebral blood flow and metabolism, the effective connectivity

between disparate visual brain areas (LGN-striate and striate-extrastriate) is also reduced when

driven by the amblyopic eye (Goodyear et al., 2000; Li, Mullen, Thompson, & Hess, 2011).

1.1.5 Normal Visual Development and Sensitive Periods for Damage and Recovery

The visual system is immature at birth, and its functional and anatomic development is critically

dependent upon the visual input it receives in early life (Blakemore, 1988; Boothe, Dobson, &

Teller, 1985; Maurer, Lewis, Brent, & Levin, 1999; Wiesel & Hubel, 1963b) (see Lewis and

Maurer (2005) for review). It therefore follows that if vision is disrupted during development, the

normal course of visual maturation can be derailed and lead to permanent dysfunction of the

visual system. In their landmark studies on monocular deprivation in kittens, Hubel and Wiesel

described a critical, or sensitive, period during which monocular deprivation causes changes in

the ocular dominance of neurons in the striate cortex (Hubel & Wiesel, 1970; Wiesel & Hubel,

1963b). Subsequent studies in animals and humans found that there is not one sensitive period

for visual development, but different sensitive periods for the various aspects of visual

perception (Harwerth, Smith, Duncan, Crawford, & von Noorden, 1986), and the various kinds

of abnormal visual experience (Daw, 1998). Furthermore, for any aspect of visual perception, the

sensitive period for damage by abnormal visual experience is often different from the period

during which functional recovery may be achieved, for example, by occlusion therapy for

amblyopia (Lewis & Maurer, 2005). The general timelines of normal human visual development,

sensitive periods for damage and functional recovery are outlined below.

The visual system in newborn humans is functional but immature. Shortly after birth, infants can

visually discriminate their mother’s face from a stranger’s (Pascalis, de Schonen, Morton,

Deruelle, & Fabre-Grenet, 1995), and show a preference for biological motion (Simion, Regolin,

& Bulf, 2008). Visual acuity is rudimentary at birth, measuring worse than 1.0 logMAR (i.e., worse

than 20/200) by response to an optokinetic grating stimulus (Gorman, Cogan, & Gellis, 1957).

Grating acuity measured by a forced-choice preferential looking task improves from 1 to 3

cycles/degree (i.e., 20/200 or worse) to about 10 cycles/degree (i.e., 20/60) in the first 6 months

postnatally, then improves more slowly until reaching adult levels at 4 to 6 years of age (Mayer

et al., 1995; Neu & Sireteanu, 1997; Salomao & Ventura, 1995). Spatial sweep visual evoked

potentials are in general agreement with behavioural improvements in grating acuity in early

infancy (Norcia & Tyler, 1985). While postnatal changes in retinal structure and refraction

account for much of the observed improvement in visual acuity, maturation of neural networks

involved in visual processing must also play a role (Banks & Bennett, 1988; Daw, 2006). Vernier

acuity, or hyperacuity, follows a protracted developmental trajectory distinct from grating acuity.

Vernier acuity is poorer than grating acuity until 2 to 3 months of age, after which it catches up

and parallels the gains in grating acuity during early childhood (Shimojo & Held, 1987;

Skoczenski & Norcia, 2002). While grating acuity reaches adult levels by 6 years, Vernier acuity

continues to improve, surpassing grating acuity and reaching a plateau at around 14 years of age

(Skoczenski & Norcia, 2002). Contrast sensitivity thresholds also improve rapidly in infancy.

From birth until 9 weeks of age, contrast sensitivity improves across all spatial frequencies

(Norcia, Tyler, & Hamer, 1990). Improvements in contrast sensitivity closely follow the

development of grating acuity. Contrast sensitivity at high spatial frequencies develops more

slowly than at low spatial frequencies, but the overall sensitivity function reaches adult form by 7

to 9 years of age (Adams & Courage, 2002; Ellemberg, Lewis, Liu, & Maurer, 1999). Sensitivity

to global motion and biological motion (higher-level visual functions that require combination

of spatially disparate visual cues over time) also has a prolonged course of improvement, with

adult-like thresholds not reached until 12 to 14 years of age (Hadad, Maurer, & Lewis, 2011). As

with the differential rate of development of contrast sensitivity across spatial frequencies (Adams

& Courage, 2002; Ellemberg et al., 1999), the maturation of sensitivity to global motion may be

non-uniform for different motion integration stimuli (Hadad, Schwartz, Maurer, & Lewis, 2015).
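
The acuity equivalences quoted above can be reproduced with a short calculation. The Python sketch below is illustrative only: it assumes the common convention that a grating of 30 cycles/degree corresponds to a minimum angle of resolution (MAR) of 1 arcmin (i.e., 20/20), and that logMAR is the base-10 logarithm of the MAR in arcmin. The function names are hypothetical and are not drawn from the cited studies.

```python
import math


def snellen_to_logmar(denominator: float, numerator: float = 20.0) -> float:
    """Convert a Snellen fraction (e.g., 20/200) to logMAR.

    MAR in arcmin is taken as denominator / numerator (assumed convention).
    """
    return math.log10(denominator / numerator)


def grating_to_snellen_denominator(cycles_per_degree: float,
                                   numerator: float = 20.0) -> float:
    """Approximate Snellen denominator for a grating acuity.

    Assumes 30 cycles/degree corresponds to a MAR of 1 arcmin (20/20).
    """
    mar_arcmin = 30.0 / cycles_per_degree
    return numerator * mar_arcmin


# Grating acuities mentioned in the text, plus the assumed 20/20 reference.
for cpd in (1.0, 3.0, 10.0, 30.0):
    denom = grating_to_snellen_denominator(cpd)
    print(f"{cpd:4.0f} cycles/degree ~ 20/{denom:.0f} "
          f"~ {snellen_to_logmar(denom):.2f} logMAR")
```

Under these assumptions, 3 cycles/degree maps to 20/200 (1.0 logMAR) and 10 cycles/degree maps to 20/60, in agreement with the values given in the text.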

The concept of a sensitive period for visual development is relevant to clinical practice because it

indicates the time during which visual impairment may be prevented (i.e., the sensitive period for

damage), and possibly reversed (i.e., the sensitive period for recovery) by therapeutic

intervention. In their studies of visual development in the cat, Hubel and Wiesel described a

sensitive period for changes in the ocular dominance of cells in the striate cortex and the

associated behavioural deficits caused by early-onset monocular deprivation (Hubel & Wiesel,

1970). Deprivation before 4 weeks of age had no effect on the functional architecture of the

striate cortex. Beginning at 4 weeks, however, the visual system showed an exquisite

susceptibility to monocular deprivation, with even brief periods of deprivation causing profound

physiological and behavioural deficits. Susceptibility remained high for two weeks, then tailed

off slowly until the third month, at which point the visual system was no longer affected by

deprivation. Near the end of the sensitive period, even long periods of eyelid closure produced

only mild physiological deficits. The sensitive period for plasticity of ocular dominance columns

in the striate cortex was subsequently delineated in macaque monkeys (Horton & Hocking,

1997). Unlike the feline visual system, which had a 1 month period of non-susceptibility to

monocular deprivation, the macaque visual system was vulnerable to physiological damage from

birth. Indeed, susceptibility to damage in macaque infants was highest at 1 week of age,

diminished by 5 weeks of age, and absent by 3 months of age. Furthermore, animal studies

describe multiple, partially overlapping sensitive periods for different visual functions and

different areas of the visual brain (Harwerth et al., 1986; Jones, Spear, & Tong, 1984). In

macaque monkeys, the sensitive period was found to end by 3 months of age for scotopic

spectral sensitivity, by 6 months of age for photopic spectral sensitivity, but extend up to 2 years

of age for spatial contrast sensitivity, and beyond 2 years of age for binocular summation

(Harwerth et al., 1986). In another study of monocular deprivation in kittens, the period of

susceptibility for changes in ocular dominance ended by 18 to 26 weeks of age for the

extrastriate cortex (specifically, the lateral suprasylvian area), but extended beyond 35 weeks of

age for the striate cortex (Jones et al., 1984).

In humans, the sensitive periods for damage in visual development have been extensively studied

in children with deprivational amblyopia from congenital and early-onset cataract (see Lewis and

Maurer (2005) for review). A study of early surgery for bilateral congenital cataracts showed that

if deprivation begins at birth and lasts for less than 10 days, children can sometimes attain

normal optotype acuity (Kugelberg, 1992). Another study examined the visual acuity in children

at the first contact lens fitting following surgery for congenital and acquired unilateral cataract

(Vaegan & Taylor, 1979). It showed that deprivation onset between 6 and 30 months of age

caused the greatest loss of acuity (less than or equal to counting fingers), that the losses

diminished as a function of deprivation onset after 3 years of age, and that no losses were

apparent with deprivation onset after 10 years of age. Similarly, a study of children with early

monocular and binocular cataract found that deprivation onset before 5 years of age resulted in

abnormal grating acuity, but that onset at 11 years or later had no effect on grating acuity

(Ellemberg, Lewis, Maurer, Brar, & Brent, 2002). The pattern of early high susceptibility and

gradual tailing off is reminiscent of the sensitive period described in animal models (Horton &

Hocking, 1997; Hubel & Wiesel, 1970), but over a much longer time scale. Together, these

studies suggest that the sensitive period for damage of visual spatial acuity extends from birth (or

10 days after birth) to about 10 years of age. Importantly, this is about 4 years beyond the period

of normal development for grating acuity, indicating that this aspect of visual function

remains vulnerable and plastic for some time after adult-level function is attained (Lewis &

Maurer, 2005). The sensitive period for damage and its relation to normal development is quite

different for global motion, however. A study of early onset unilateral and bilateral cataract

showed that global motion discrimination thresholds are only impaired if deprivation begins

before 4 to 8 months of age (Ellemberg et al., 2002). Similarly, a clinical case study of visual

recovery in a 43-year-old man deprived of form vision since 3 years of age reported impaired

spatial resolution consistent with deprivational amblyopia, but normal performance on motion

processing tasks (Fine et al., 2003). Furthermore, fMRI responses in the patient showed

dramatically reduced responses in V1, but normal activation in area MT to motion stimuli. In

stark contrast to spatial acuity, global motion has a brief sensitive period for damage (4 to 8

months) that is dramatically shorter than its period of normal development (12 to 14 years)

(Hadad et al., 2011). Further complicating the discussion of sensitive periods for damage is an

apparent difference between amblyogenic factors (discussed in section 1.1.2). Strabismic

amblyopia is 16 times more prevalent than anisometropic amblyopia under 3 years of age (Birch

& Holmes, 2010), but the two are equally prevalent between 3 and 6 years of age (Repka et al.,

2002). Additionally, anisometropia with onset by 1 year of age is not associated with an

increased risk of amblyopia unless it persists for 3 years (Abrahamsson et al., 1990). These

findings suggest that the sensitive period for damage may occur later for retinal defocus (i.e.,

anisometropia) than for binocular decorrelation (i.e., strabismus) or form deprivation.

While the sensitive periods for damage are of tremendous interest to basic science, the sensitive

periods for recovery are arguably more relevant to clinical care because they predict the

effectiveness of treatment. According to clinical tradition, the most common forms of amblyopia

(strabismic, anisometropic, and mixed mechanism) have the best potential for recovery of visual

acuity if occlusion therapy is undertaken before 7 years of age (Campos, 1995; von Noorden &

Campos, 2002). While this is generally borne out in clinical studies (Flynn et al., 1998; Holmes

et al., 2011; Lea et al., 1989), partial recovery is well-documented in adolescents (Scheiman et

al., 2005) and reported in adults long after the end of the sensitive period for damage (Kasser &

Feldman, 1953; Kishimoto et al., 2014; Kupfer, 1957). This suggests that the sensitive period for

recovery of visual acuity tails off more slowly than the sensitive period for damage. Other

evidence indicates that the potential for recovery may also be enhanced or re-opened. For

example, profound recovery of visual acuity in the amblyopic eye has been reported in adults

following loss of the fellow eye (Hamed, Glaser, & Schatz, 1991; Klaeger-Manzanell, Hoyt, &

Good, 1994; Vereecken & Brabant, 1984), and in animals following brief periods of darkness

(Duffy & Mitchell, 2013; He, Ray, Dennis, & Quinlan, 2007) and bilateral pharmacological

retinal inactivation (Fong et al., 2016). Training on perceptual learning tasks has also produced

modest gains in visual acuity, Vernier acuity, contrast sensitivity, and stereopsis in adults with

amblyopia (Levi & Polat, 1996; Li, Ngo, Nguyen, & Levi, 2011; Polat, Ma-Naim, Belkin, &

Sagi, 2004), but debate remains about whether these gains represent true visual recovery, or

merely changes in visual attention (Tsirlin, Colpa, Goltz, & Wong, 2015). Even if normal visual

acuity is achieved by occlusion therapy, some deficits in other visual functions persist. In

deprivational amblyopia, for example, global motion sensitivity can remain impaired despite

complete recovery of visual acuity (Constantinescu, Schmidt, Watson, & Hess, 2005). Similarly,

reading capacity may remain markedly impaired in strabismic amblyopia despite full recovery of

visual acuity (Zürcher & Lang, 1979). Like the sensitive periods for damage, these findings

suggest that different visual functions have distinct but overlapping sensitive periods for

recovery as well. Furthermore, they highlight the potential inadequacy of occlusion therapy in

addressing the full constellation of perceptual impairments that occur in amblyopia.

1.2 Auditory Processing

1.2.1 Overview

Acoustic perception is a complex process involving peripheral transduction of sound information

by the cochlea, transmission of the neural impulses to the brain, and central processing of the

auditory signals to register a conscious perception of the acoustic world. At the most basic level,

the peripheral auditory system detects and encodes the frequency and amplitude of one-

dimensional sound waves. Centrally, however, the combination and comparison of the acoustic

information from both ears (i.e., binaural hearing) permits additional information about the

auditory world to be inferred and extracted. For example, analysis of binaural differences in

signal timing and signal intensity provides information about the spatial structure of the auditory

world. Certainly, the ability to localize sound sources and to perceive movement of auditory

objects is critical for survival (Grothe, Pecka, & McAlpine, 2010). Furthermore, integration of

auditory signals over an interval of time enables perception of acoustic sequences and temporal

patterns. This ability to recognize the larger temporal structure of modulation in frequency and

amplitude is essential to communication, comprehension of speech, and appreciation of music

(Warren, 2008).

1.2.2 Auditory Spatial Processing

Sound waves produce one-dimensional oscillations of the eardrum that are transmitted by the

ossicles of the middle ear to the fluid-filled cochlea. Within the cochlea, the transmitted waves

cause deflections in neurosensory hair cells arranged along the basilar membrane. The hair cells

are tonotopically arranged, meaning that their position along the basilar membrane represents the

frequency of a sound rather than a location in space. Therefore, unlike the photoreceptors in the

neurosensory retina, which have an intrinsic spatiotopy, the peripheral auditory apparatus has no

explicit representation of space. Rather, the perception of auditory space must be inferred from

the frequency and amplitude of the signals from each ear. These spatial computations are largely

undertaken by specialized centres in the auditory midbrain (reviewed in Grothe et al. (2010)).

Despite this lack of an explicit spatiotopy, the mature human auditory system has remarkable

sound localization abilities. Indeed, using monaural and binaural cues, humans can discriminate

changes in angular direction of 1 to 2 degrees in the horizontal plane (Blauert, 1970; Klemm,

1920) and of 3.5 degrees in the vertical plane (Makous & Middlebrooks, 1990). Below, the

mechanisms of monaural and binaural sound localization in humans are reviewed.

1.2.2.1 Monaural Cues

The head and pinna of the outer ear interact with sound entering the auditory meatus, producing

many frequency-specific time delays and complex spectral changes (Wright, Hebrank, & Wilson,

1974). The spectral transformation that results is referred to as a head-related transfer function,

or HRTF, and varies depending on the azimuth and elevation of the sound source relative to the

head and pinna (Wightman & Kistler, 1989a, 1989b). These monaural cues are of primary

importance in auditory localization on the vertical plane. When the normal convolutions of the

human pinna are experimentally occluded, sound localization in the vertical plane is significantly

impaired: discrimination of changes in elevation is reduced, and the likelihood of front-back

confusions is increased (Gardner & Gardner, 1973). Discrimination of changes in azimuth,

however, appears unaffected by obliteration of monaural spectral cues (Hofman, Van Riswick, &

Van Opstal, 1998).

1.2.2.2 Binaural Cues

Accurate auditory localization in the horizontal plane (i.e., azimuth) relies on two binaural cues:

interaural level difference (ILD) and interaural time difference (ITD). The recognition that these

two cues contribute to binaural perception of azimuth and their dominance at different

frequencies is often referred to as the duplex theory of sound localization (Rayleigh, 1907).

ILD refers to the difference in sound pressure level between the two ears, and is caused by

acoustic shadowing by the head. If a sound is presented from the side, the path of the sound wave

is interrupted by the head, shadowing the far ear from the source of the sound. The amount that a

given sound source is attenuated, or acoustically shadowed, depends on the wavelength of the

sound relative to the diameter of the listener’s head. High frequency sounds with wavelengths

smaller than the diameter of the human head can be attenuated by as much as 35 decibels

(Middlebrooks, Makous, & Green, 1989), far above the 1 to 2 decibel ILD detection threshold

for clicks (Mills, 1958; Von Békésy, 1930). However, as frequency decreases (and wavelength

increases), the effect of acoustic shadowing tails off. Frequencies at or below approximately

1400 Hz have wavelengths equal to or larger than the diameter of the human head, and generally

produce ILDs too small to be useful as binaural localization cues (Mills, 1958).
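
The dependence of acoustic shadowing on wavelength can be made concrete with the relation wavelength = speed of sound / frequency. The sketch below assumes a speed of sound in air of 343 m/s (an assumed round value for room temperature, not taken from the cited studies); the function name is hypothetical.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumed value for air at about 20 degrees C


def wavelength_cm(frequency_hz: float) -> float:
    """Wavelength in centimetres of a tone of the given frequency in air."""
    return 100.0 * SPEED_OF_SOUND_M_PER_S / frequency_hz


# Compare a few tone frequencies on either side of the 1400 Hz figure.
for f in (500.0, 1400.0, 4000.0, 8000.0):
    print(f"{f:6.0f} Hz -> {wavelength_cm(f):5.1f} cm")
```

With this assumed speed of sound, a 1400 Hz tone has a wavelength of roughly 25 cm, whereas a 4000 Hz tone has a wavelength under 9 cm, smaller than the width of an adult head.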

ITD refers to the difference in acoustic signal arrival time (i.e., phase delays) between the two

ears. The relation between ITDs and horizontal angular direction is dependent on the distance

between the two ears and the velocity of sound waves through air. A sound presented from the

side has different path lengths to each ear, and given a constant velocity, will arrive at the far ear

later. In humans, the physiological range of ITDs varies from 0 μs for a central sound source to

about 750 μs for a fully lateralized sound source (van der Heijden & Trahiotis, 1999). The

discrimination thresholds for ITDs can be as short as 10 μs, however, which corresponds to a

change in azimuth of about 2 degrees (Klumpp & Eady, 1956). For frequencies below

approximately 1400 Hz, ITD provides an unambiguous spatial signal (Mills, 1958). For higher

frequencies, however, the ITD becomes spatially ambiguous because phase offsets of one or

more wavelengths could produce ITDs in the physiologic range.
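
The origin of this ambiguity can be illustrated numerically. For a pure tone, an interaural delay can only be recovered up to a whole number of periods, so a source at the midline (ITD of 0 μs) is indistinguishable from delays of one or more periods. The sketch below lists those phase-equivalent delays that fall within the physiological range, taking the 750 μs maximum quoted above. The function name, and the restriction to pure tones and a midline source, are simplifying assumptions.

```python
import math

ITD_MAX_US = 750.0  # largest physiological ITD quoted in the text


def phase_equivalent_itds_us(frequency_hz: float,
                             itd_max_us: float = ITD_MAX_US) -> list:
    """Delays (in microseconds) that share the interaural phase of a 0 us ITD.

    Only whole-period multiples lying within +/- itd_max_us are returned.
    """
    period_us = 1e6 / frequency_hz
    k_max = math.floor(itd_max_us / period_us)
    return [k * period_us for k in range(-k_max, k_max + 1)]


# A single candidate means the phase cue is unambiguous for this source.
for f in (500.0, 1000.0, 1400.0, 3000.0):
    candidates = phase_equivalent_itds_us(f)
    print(f"{f:6.0f} Hz: {len(candidates)} candidate ITD(s)")
```

In this simplified picture, tones of 500 and 1000 Hz yield a single candidate delay, whereas 1400 Hz yields three and 3000 Hz yields five. The transition occurs where one period equals the maximum ITD (1/750 μs, or about 1333 Hz), which is close to the approximate 1400 Hz limit cited above.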

1.2.2.3 Neural Pathways for Sound Localization

Signals for ITDs and ILDs from each ear converge in the auditory midbrain, where dedicated

networks transform these binaural cues into signals for sound location in head-centred

coordinates (see Grothe et al. (2010) for review).

The first site of convergence for binaural ILD cues is the lateral superior olive (LSO). The LSO

receives excitatory input directly from the ipsilateral cochlear nucleus and inhibitory inputs from

the contralateral cochlear nucleus via the medial nucleus of the trapezoid body (MNTB). These

excitatory and inhibitory inputs give rise to ILD sensitivity in LSO neurons by a subtractive

process (Boudreau & Tsuchitani, 1968; Caird & Klinke, 1983). LSO neurons are excited by

sounds that are more intense in the ipsilateral ear, and are inhibited by sounds more intense in the

contralateral ear. Neurons in each LSO send excitatory projections to the contralateral dorsal

nucleus of the lateral lemniscus (DNLL) and inferior colliculus (IC), and inhibitory projections

to the ipsilateral DNLL and IC. In addition to input from the LSOs bilaterally, each IC receives

inhibitory input from the contralateral DNLL and excitatory input directly from the contralateral

cochlear nucleus. These additional ascending inputs to the IC provide additional ILD cues and

improve the spatial sensitivity of IC neurons compared to LSO neurons (Park, Klug, Holinstat, &

Grothe, 2004).
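
The subtractive process described above can be caricatured as a rectified difference between ipsilateral excitation and contralateral inhibition. The following toy model is only a sketch: the linear gain, the rectification at zero, and the function name are assumptions made for illustration, and real LSO neurons have more complex ILD functions.

```python
def lso_response(ipsi_level_db: float, contra_level_db: float,
                 gain: float = 1.0) -> float:
    """Toy LSO output in arbitrary units (hypothetical model).

    Ipsilateral excitation minus contralateral inhibition, floored at zero.
    """
    ild_db = ipsi_level_db - contra_level_db
    return max(0.0, gain * ild_db)


# Sound louder at the ipsilateral ear, equal at both ears, louder contralaterally.
for ipsi, contra in ((70.0, 60.0), (65.0, 65.0), (60.0, 70.0)):
    print(f"ipsi {ipsi:.0f} dB, contra {contra:.0f} dB -> "
          f"{lso_response(ipsi, contra):.1f}")
```

The toy unit responds only when the sound is more intense in the ipsilateral ear, which mirrors the qualitative behaviour described for LSO neurons.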

Sensitivity to ITD cues is traditionally thought to emerge from convergence of binaural signals in

the medial superior olive (MSO), although in mammals, neurons in both the MSO and LSO

respond to ITD cues. The MSO receives excitatory input directly from the ipsilateral and

contralateral cochlear nucleus, as well as inhibitory input directly from the ipsilateral cochlear

nucleus and indirectly from the contralateral cochlear nucleus via the MNTB. For low frequency

sounds, action potentials from the cochlea are phase-locked to the stimulus waveform, thus

preserving the fine temporal structure of the acoustic stimulus (Rose, Brugge, Anderson, & Hind,

1967). These converging inputs on MSO neurons permit analysis of the phase offset between the

phase-locked binaural signals, and typically give rise to maximal excitation when the auditory

signal leads in the contralateral ear (Batra, Kuwada, & Fitzpatrick, 1997). Neurons in each MSO

send ascending excitatory projections to the ipsilateral and contralateral DNLL and IC. Although

both ILD and ITD cues contribute to sound localization in behavioural studies (Klumpp & Eady,

1956; Mills, 1958), spatial selectivity in the mammalian IC appears to depend predominantly on ILD

cues (Campbell, Doubell, Nodal, Schnupp, & King, 2006).

In humans, accurate sound localization in each spatial hemifield relies on the integrity of the

contralateral IC (Litovsky, Fligor, & Tramo, 2002). Beyond this gross lateralization of function,

however, evidence of spatiotopic organization in the IC is weak. A study of the macaque monkey

IC found that while individual neurons are spatially tuned, the overall topographic organization

within each IC is tonotopic rather than spatiotopic (Zwiers, Versnel, & Van Opstal, 2004). A

study of the guinea pig IC, however, found a weak spatiotopy, with more caudal areas of the IC

representing more peripheral locations in the contralateral auditory hemifield (Binns, Grant,

Withington, & Keating, 1992).

Each IC sends ipsilateral projections rostrally to the multisensory superior colliculus (SC)

(Moore & Goldberg, 1966; Oliver & Huerta, 1992), where the auditory inputs interact with

retinotopically organized projections from the visual system to produce a spatiotopic map of

auditory space (King & Palmer, 1983). Unlike the auditory neurons in the IC, auditory neurons

of the SC show spatial sensitivity that changes with eye position, reflecting a transformation

from a head-centred to a retina-centred encoding of auditory space (Hartline, Vimal, King,

Kurylo, & Northmore, 1995; Jay & Sparks, 1987b). A shared coordinate system between the SC

auditory space map and overlying visual space map allows topographical alignment to be

maintained as the eyes move in the orbits (Jay & Sparks, 1987a), and is likely essential for the

integration of auditory and visual location signals that occurs in deeper layers of the SC

(Meredith & Stein, 1986b).

1.2.2.4 Normal Development of Sound Localization

Like the visual system (Mayer & Dobson, 1982), the auditory system is immature at birth, and

follows a developmental trajectory shaped by sensory input in early life (King, 2009; Litovsky &

Ashmead, 1997). As early as 10 minutes after birth, neonates show a slow orienting response

to sounds presented in the left and right hemifields (Clifton, Morrongiello, Kulig, & Dowd,

1981; Muir & Field, 1979; Wertheimer, 1961). Once infants develop sufficient neck control, the

precision of horizontal sound localization can be studied using a minimum audible angle (MAA)

task, which measures the smallest change in sound source position that can be reliably perceived

(Mills, 1958). At 6 months of age, the MAA is about 20°, but improves gradually until reaching

adult levels (about 1° to 2°) sometime after 18 months of age but before 5 years of age

(Ashmead, Clifton, & Perris, 1987; Litovsky, 1997; Mills, 1958; Morrongiello, 1988). In

infancy, sensitivity to both ILD and ITD cues is poorer relative to adult thresholds, but

sensitivity to ITD cues is better than would be predicted by measurements of MAA for free-field

sound sources (Ashmead, Davis, Whalen, & Odom, 1991; Ashmead, Grantham, Murphy, &

Tharpe, 1993). This lack of real-world auditory localization precision is postulated to allow for

easy recalibration of sound localization as head growth causes rapid changes in the mapping of

ILD, ITD, and spectral cues to physical space (Ashmead et al., 1991; Clifton, Gwiazda, Bauer,

Clarkson, & Held, 1988).

1.2.2.5 Influence of Vision on the Development of Sound Localization

Spatial tuning and calibration of the developing auditory system does not occur in isolation, but

is strongly influenced by visual experience in early life. As noted previously, orienting responses

to auditory spatial cues are present in humans at birth (Clifton et al., 1981; Muir & Field, 1979;

Wertheimer, 1961). Similarly, animal studies have shown that a rudimentary map of auditory

space is present in the mammalian superior colliculus at birth (King & Carlile, 1993). Although

visual input is not necessary for the development of normal or even supra-normal spatial hearing

abilities (Lessard, Pare, Lepore, & Lassonde, 1998; Röder et al., 1999; Voss et al., 2004),

evidence from animals and humans shows that abnormal visual input can cause significant,

permanent alterations to the processing of auditory spatial cues. Knudsen and Knudsen (1989)

showed that barn owls reared wearing prism spectacles mislocalize sounds presented in darkness

in the direction of the early visual field shift. Owls reared with prisms for less than 6 months

tended to recover normal sound localization following prism removal, but those reared with

prisms for more than 6 months did not recover normal sound localization abilities, indicating a

permanent developmental miscalibration of spatial hearing (Knudsen & Knudsen, 1990). These

behavioural changes in sound localization were accompanied by shifts in the auditory space map

in the optic tectum, the avian homolog to the mammalian SC (Knudsen & Brainard, 1991).

Similarly, ferrets reared with experimentally-induced strabismus in one eye show a

corresponding shift in spatial tuning of neurons in the contralateral SC (King, Hutchings, Moore,

& Blakemore, 1988). Ferrets reared with a more severe visual distortion by rotation of one eye

about the visual axis show complete disorganization of the SC auditory map (King et al., 1988).

These animal studies indicate that vision has a dominant influence on the calibration of sound

localization, regardless of whether the visual input is spatially accurate.

Studies of clinical populations with visual impairment confirm that early visual experience can

affect the developmental calibration of spatial hearing in humans as well. Lessard et al. (1998)

found that people with early-onset bilateral visual impairment (i.e., congenital blindness in the

central visual field) localize sounds with poor precision and accuracy. Similarly, Gori, Sandini,

Martinoli, and Burr (2014) found that congenitally blind adults perform poorly on auditory

spatial bisection tasks, indicating an impairment in the encoding of Euclidean auditory

relationships. Interestingly, several studies of visually impaired populations show sensory

compensation by the auditory system for the loss of visual spatial abilities. For example, people

who are totally blind from birth can localize sounds with normal accuracy in the central region of

space (Lessard et al., 1998), and display supra-normal localization abilities in peripheral space

(i.e., when sounds are presented laterally) (Röder et al., 1999). People who lost one eye in

early life also show better-than-normal accuracy in sound localization in the

central region of space (Hoover, Harris, & Steeves, 2012). Although early visual experience

strongly influences the development of spatial hearing, vision is clearly not necessary to attain

sound localization abilities that are equivalent to or exceed normal performance in many

respects.

1.2.2.6 Sensitive Periods for the Development of Sound Localization

In general, the auditory system remains adaptable to changes in the spatial mapping of binaural

cues, even into adulthood (Moore, 1993). Ferrets raised with chronic monaural occlusion adapt

to the skewed ILD cues and localize sound normally at maturity as long as the occlusion is

maintained (King, Parsons, & Moore, 2000). Adult ferrets raised with normal binaural

experience show impaired sound localization immediately following monaural occlusion, but

recover near-normal spatial hearing within 6 months, indicating the capacity for adaptation in

adulthood (King et al., 2000). Similarly, adult humans who went through childhood with normal

sensory experience show systematic bias in sound localization immediately following monaural

occlusion, but can adapt back to pre-occlusion performance within 1 week with training

(Kumpik, Kacelnik, & King, 2010). Like the visual system, however, the auditory system has

sensitive periods in early life during which abnormal auditory and visual input can permanently

alter the processing of binaural cues and the localization of sound (see Keuroghlian and Knudsen

(2007) for review).

Evidence that accurate and precise sound localization depends on auditory experience in early

life is available from animal studies and reports on human populations with early-onset hearing

impairment. Experiments on monaural occlusion in young barn owls found that if owls are

younger than 8 weeks of age at the time of ear plugging, they can fully adapt to the abnormal

binaural cues (Knudsen, Esterly, & Knudsen, 1984). If they are older than 8 weeks, however,

adaptation does not occur and they remain tuned to normal binaural cues, indicating that the

sensitive period for damage to sound localization by abnormal binaural experience ends at about

8 weeks of age (Knudsen, Esterly, et al., 1984; Knudsen & Knudsen, 1986). The sensitive period

for recovery of sound localization in the barn owl appears to be longer, however: researchers

subjected barn owls to monaural occlusion starting before 8 weeks of age, and found that the rate

of recovery slowed as a function of the age at ear plug removal, with virtually no recovery

observed if occlusion persisted for 38 to 42 weeks (Knudsen, Knudsen, & Esterly, 1984).

Studies of children with early deafness followed by restoration of hearing by cochlear

implantation suggest an analogous sensitive period in humans. One study of children with

bilateral cochlear implants found that those who acquired profound deafness after 3.5 years of

age localized sounds more accurately than congenitally deaf children, regardless of the age of

cochlear implantation (Killan, Royle, Totten, Raine, & Lovett, 2015). Other researchers have

measured spatial release from masking (a psychoacoustic measure of the ability of a listener to

separate acoustic signals from noisy backgrounds on the basis of spatial cues) as a marker of

spatial hearing ability in children with a history of early peripheral hearing loss. A study of

children age 5 to 13 years with a prior history of otitis media with effusion (i.e., middle ear

disease causing transient hearing loss) found that spatial release from masking did not improve

even after pure tone thresholds had normalized (Pillsbury, Grose, & Hall, 1991). This effect

of early auditory deprivation was confirmed by a later study that found otitis media with effusion

for a cumulative total of 2.5 years or more during the first 5 years of life resulted in residual

impairments in binaural hearing (Hogan & Moore, 2003). These findings demonstrate that

abnormal auditory experience during the first several years can produce lasting deficits in spatial

hearing.

The influence of visual experience on sound localization also demonstrates sensitive periods for

damage and recovery in early life (see King (2009) for review). This has been most extensively

studied in the barn owl. Knudsen and Knudsen (1990) showed that the accuracy of sound

localization is maximally affected by visual field displacement when prism spectacle experience

began by 3 weeks of age. This sensitive period for damage gradually tapered off until between

15 and 38 weeks of age. They also examined the sensitive period for recovery from vision-induced

sound mislocalization, and found that the recovery was full if normal vision was restored by

about 26 weeks of age, but absent if normal vision was restored after 29 weeks of age. In barn

owls, therefore, the sensitive period for maximal recovery appears to extend beyond the sensitive

period for maximal damage. Detailed data of this kind are not available for humans, but

investigations comparing sound localization performance in the early-blind and late-blind

generally show some normal to supra-normal abilities in both groups (Abel, Figueiredo, Consoli,

Birt, & Papsin, 2009; Ashmead et al., 1998; Fieger, Röder, Teder-Salejarvi, Hillyard, & Neville,

2006; Hoover et al., 2012; Lessard et al., 1998; Röder et al., 1999; Voss et al., 2004), but deficits

on some tasks only among the early-blind (Gori et al., 2014; Lessard et al., 1998; Zwiers, Van

Opstal, & Cruysberg, 2001). These findings suggest the presence of a sensitive period for

damage by abnormal visual experience in early life for humans.

1.2.3 Auditory Temporal Processing

The human auditory system is capable of perceiving the temporal structure of sound over a

remarkable range of time scales. Successful processing and integration of temporal cues on the

order of microseconds enables perception of spatial location by ITD cues, and integration on the

order of milliseconds allows detection and discrimination of simultaneity, non-simultaneity,

temporal order, and duration (see Wittmann (1999) for review). Perception and

recognition of temporal structure and patterns on the order of seconds enables the comprehension

of speech and appreciation of music (Pöppel, 1997; Warren, 2008). The limits of temporal

resolution are not equal for all perceptual tasks, however, suggesting that different neural

mechanisms are at play (Hirsh, 1959).

Extraction of spatial information from ITD cues demands higher temporal precision than any

other neural computation in the human brain (Grothe et al., 2010). Based on spatial

discrimination thresholds, the temporal resolution limit for ITD cues is approximately 10 μs

(Klumpp & Eady, 1956). Unlike other types of auditory temporal processing, however, the

perceptual result of ITD processing is spatial, not temporal, in nature. The reliable perception of

time in the auditory system requires temporal separation at least two orders of magnitude greater

than that for ITDs: a temporal interval of at least a few milliseconds is required for a listener to reliably

perceive two brief sounds presented in sequence as non-simultaneous, and a separation of about

20 ms is needed for a listener to reliably report the order in which they occurred (Hirsh, 1959;

Hirsh & Sherrick, 1961; Kanabus, Szelag, Rojek, & Poppel, 2002).
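
To give a sense of scale for the 10 μs resolution limit cited above, the following minimal sketch converts between source azimuth and ITD. It assumes Woodworth's spherical-head approximation, ITD = (r/c)(θ + sin θ), which is not drawn from the studies cited here, together with an assumed head radius and speed of sound; the function names are illustrative only.

```python
import math

HEAD_RADIUS_M = 0.0875        # assumed average adult head radius
SPEED_OF_SOUND_M_S = 343.0    # assumed speed of sound in air at about 20 degrees C


def woodworth_itd_us(azimuth_deg: float) -> float:
    """ITD in microseconds for a distant source at the given azimuth.

    Uses the spherical-head approximation ITD = (r / c) * (theta + sin(theta)),
    intended for azimuths from 0 (straight ahead) to 90 degrees (directly lateral).
    """
    theta = math.radians(azimuth_deg)
    return 1e6 * (HEAD_RADIUS_M / SPEED_OF_SOUND_M_S) * (theta + math.sin(theta))


def azimuth_for_itd_deg(itd_us: float) -> float:
    """Azimuth (0 to 90 degrees) producing the given ITD, found by bisection."""
    lo, hi = 0.0, 90.0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if woodworth_itd_us(mid) < itd_us:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    print(f"ITD at 90 deg azimuth: {woodworth_itd_us(90.0):.0f} us")
    print(f"Azimuth giving a 10 us ITD: {azimuth_for_itd_deg(10.0):.2f} deg")
```

Under these assumptions, a 10 μs ITD corresponds to a displacement of roughly one degree from the midline, whereas the largest ITD available to the listener is well under one millisecond.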

1.2.3.1 Normal Development of Auditory Temporal Processing

As in the spatial domain, performance on auditory perceptual tasks in the temporal domain

improves throughout development. Discrimination thresholds of auditory temporal order for

clicks presented to alternate ears improve from about 130 ms at 5 years of age to about 70 ms by

11 years of age (Berwanger, Wittmann, von Steinbuchel, & von Suchodoletz, 2004), but this

methodology may be confounded with spatial discrimination by ITD cues. Others have avoided

this spatial confound by examining the thresholds for purely temporal stimuli. Davis and

McCroskey (1980) examined auditory temporal resolution by measuring the minimum inter-

stimulus interval between two brief (17 ms) sounds before they are perceptually fused into one

event. They found that the auditory fusion threshold improved dramatically from between 20 to

24 ms at 3 years of age to between 6 to 10 ms at 8 years of age, before stabilizing to between 4 to

8 ms at 9 to 11 years of age. Others have quantified auditory temporal resolution by measuring

the detection threshold for a temporal gap (i.e., silence) within a burst of continuous noise.

Similar to the auditory fusion threshold, the auditory gap detection threshold was found to

improve significantly through childhood, reaching adult levels by 11 years of age (Irwin, Ball,

Kay, Stillman, & Rosser, 1985; Wightman, Allen, Dolan, Kistler, & Jamieson, 1989). Gap

detection thresholds vary by stimulus frequency and level, but for broadband noise at 40 dB SPL,

improve from approximately 18 ms at 6 years of age to 6 ms at 11 years of age (Irwin et al.,

1985). Further supporting this time course of development for auditory temporal processing,

Tallal (1978) found that temporal order discrimination for two tones of different frequency

improved through childhood, reaching adult levels by approximately 9 years of age.

1.2.3.2 Influences of Abnormal Sensory Experience on Auditory Temporal Processing

Like the development of auditory spatial processing abilities, the development of auditory

temporal processing abilities is also dependent upon normal sensory experience in early life. A

study of auditory gap detection showed that adult listeners with congenital moderate

sensorineural hearing loss have higher gap detection thresholds than normal hearing controls,

even after controlling for audibility of the stimuli (DeFilippo & Snell, 1986). Similar results were

reported among adult cochlear implant users with later-onset hearing loss, however, raising the

possibility that this deficit is not developmental in origin (Blankenship, Zhang, & Keith, 2016).

Another study, however, found that children with a history of transient hearing loss from otitis

media with effusion prior to 5 years of age have poorer auditory gap detection thresholds

compared to unaffected children (Khavarghazalani, Farahani, Emadi, & Hosseni Dastgerdi,

2016). Their findings indicate that long-lasting effects on gap detection thresholds can arise from

auditory deprivation in early life, and suggest that early childhood encompasses a sensitive

period for auditory temporal processing (Khavarghazalani et al., 2016). Temporal processing

deficits exist at larger time scales as well. A study of older children examining their ability to

reproduce a temporal sequence presented as a series of suprathreshold tones reported that

children with early hearing loss do so less accurately than normal hearing controls (Sterritt,

Camp, & Lipman, 1966). Interestingly, early hearing impairment affects temporal processing in

other sense modalities as well. For example, in adults who became deaf before 2 years of age,

simultaneity judgment thresholds are dramatically impaired for unisensory visual and unisensory

tactile stimuli (Heming & Brown, 2005). Conversely, early visual impairment is associated with

superior temporal auditory resolution: early blind adults outperform developmentally typical

adults in auditory temporal order judgments (Stevens & Weaver, 2005), and possibly in auditory

gap detection (Muchnik, Efrati, Nemeth, Malin, & Hildesheimer, 1991; Weaver & Stevens,

2006). These heightened auditory temporal processing abilities in the early blind may represent

sensory compensation in the absence of vision.

The interactions between vision and audition are reviewed further in the following section.

1.3 Multisensory Processing and Integration

1.3.1 Overview

Each sense modality provides a unique window on the external world. The visual system detects

energy in the form of electromagnetic radiation in the visible spectrum. The auditory system

detects energy in the form of air pressure fluctuations. The somatosensory system detects thermal

and mechanical energy delivered to the body in the form of touch, pain, vibration, and

movement. In many instances, the senses provide modality-specific information about features in

the environment. Colour, for instance, is specific to vision, while pitch is specific to audition. In

other instances, however, the different senses provide information on shared features of a

stimulus that are not specific to a particular modality. Object features accessible to more than one

sensory modality, such as intensity, spatial location, and temporal frequency or rhythmic

structure, have been variously referred to as amodal attributes (Lewkowicz, 2000), intermodal

invariants (Gibson, 1966), common sensibles (Marks, 1978), and intersensory equivalencies

(Lewkowicz & Lickliter, 1994). When watching a percussionist, for example, the occurrence of

drum beats in space and time is often accessible to both the visual and auditory systems, and if

he or she is playing very intensely, to the somatosensory system as well. These unisensory

information streams converge in the central nervous system, which serves the fundamental

function of combining these multisensory information streams to synthesize a coherent and

biologically meaningful internal representation of the external world. This function, termed

perceptual binding, is of critical importance for survival because its perceptual result determines

an organism’s ability to react quickly, precisely, and appropriately to salient stimuli (Ernst &

Bulthoff, 2004).

In any scientific endeavour, clear definitions of the processes and phenomena under discussion

are essential for accurate and nuanced understanding of the subject matter and for the ultimate

advancement of knowledge. In the literature dealing with multisensory perception, however,

terminology is often applied variably. In an effort to address this semantic confusion, Stein et al.

(2010) published a multi-author consensus paper defining key terms in the field. Their

conceptual framework and essential definitions will be presented briefly. “Multisensory

processing” is an umbrella term referring to any neural or behavioural phenomenon associated

with two or more sensory modalities. “Cross-modal matching” is a multisensory process in

which stimulus features from different sensory modalities are compared to estimate their

equivalence. The unisensory features being compared may be temporal in nature (e.g., time of

onset or duration), spatial in nature (e.g., spatial location or frequency), or relate to identity (e.g.,

matching lip movements to sound). “Multisensory integration” is a multisensory process that

involves the combination of unisensory signals to produce a neural or behavioural response that

is significantly different from its component inputs. Unlike cross-modal matching, multisensory

integration does not require the preservation of the features of the unisensory stimuli, but instead

fuses the common features to create a new neural response or behavioural percept (Stein et al.,

2010).

Multisensory processing in the central nervous system not only provides a richer, more complete

perceptual gestalt, but also forms the foundation for cross-modal associative learning in infancy (Stein,

Stanford, & Rowland, 2014) and enables cross-modal communication essential for sensory

calibration during development (Gori, 2015; Knudsen & Knudsen, 1990). As these processes

undergo experience-dependent maturation, integrative capacities emerge at both the neuronal and

behavioural levels (Stein et al., 2014; Wallace, 2004). At maturity, the synthesis and integration

of sensory information across modalities confers distinct perceptual advantages in response

speed, precision, and detection thresholds (Stein & Meredith, 1993).

The adaptive advantage of multisensory processing and integration is perhaps most obvious in its

effect on reaction times. For example, simple manual response times to multisensory cues are

shorter than to cues presented in the visual, auditory, or tactile modality alone (Andreassi &

Greco, 1975; Forster, Cavina-Pratesi, Aglioti, & Berlucchi, 2002; Hershenson, 1962). Latencies

of saccadic eye movements that shift gaze to a target of interest are similarly reduced when

redundant cues are presented in more than one sensory modality (Frens, Van Opstal, & Van Der

Willigen, 1995; Harrington & Peck, 1998; Hughes, Reuter-Lorenz, Nozawa, & Fendrich, 1994).

While spatially concordant multisensory cues produce the shortest saccadic latencies, pairing a

visual target with a spatially neutral cue also improves performance compared to visual-only

presentations (Colonius & Diederich, 2004).

The benefits of multisensory processing are also evident in the ability to perceive subtle stimuli

near the threshold of detectability. In cats, detection of low-intensity visual stimuli is improved

by presentation of spatially coincident but task irrelevant auditory cues (Stein, Meredith,

Huneycutt, & McDade, 1989). In humans, sensitivity to near-threshold visual stimuli is similarly

enhanced by presentation of a sound in close spatial and temporal proximity (Frassinetti,

Bolognini, & Ladavas, 2002; Noesselt, Bergmann, Hake, Heinze, & Fendrich, 2008).

Conversely, sensitivity to faint sounds is enhanced by simultaneous presentation of a neutral

light cue (Lovelace, Stein, & Wallace, 2003).

In addition to reaction times and detection thresholds, multisensory processing can improve the

precision and accuracy of sensory perception as well. For instance, in circumstances where the

spatial reliability of visual and auditory signals is similar, localization precision and accuracy for

spatially aligned audiovisual stimuli are better than for stimuli presented in either modality alone

(Alais & Burr, 2004; Stevenson, Fister, Barnett, Nidiffer, & Wallace, 2012). In the temporal

dimension, in-phase auditory flutter improves detection of visual flicker, lowering the critical

flicker frequency (Ogilvie, 1956). Similar multisensory enhancement is also observed in the

accuracy of saccades to bimodal audiovisual targets (Corneil, Van Wanrooij, Munoz, & Van

Opstal, 2002). In the visuotactile realm, the ability to discriminate differences in object size is

better when visual and tactual cues are available compared to when cues are available in only

one modality (Ernst & Banks, 2002). Combined multisensory cues have also been shown to

improve speech comprehension. The addition of visual cues (i.e., lip-reading) significantly

improves the intelligibility of auditory speech for hearing impaired listeners (Grant, Walden, &

Seitz, 1998), and for normal hearing listeners in noisy environments (Driver, 1996; Grant &

Seitz, 2000; Sumby & Pollack, 1954).
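
The precision benefit for audiovisual localization noted earlier in this paragraph is commonly described with the maximum likelihood estimation (MLE) model of cue combination. The sketch below is a minimal illustration, assuming independent Gaussian noise in each modality; the function names and numerical values are hypothetical and are not taken from the cited studies.

```python
import math


def mle_weights(sigma_v: float, sigma_a: float) -> tuple[float, float]:
    """Visual and auditory weights, each proportional to reliability (1 / variance)."""
    reliability_v = 1.0 / sigma_v ** 2
    reliability_a = 1.0 / sigma_a ** 2
    w_v = reliability_v / (reliability_v + reliability_a)
    return w_v, 1.0 - w_v


def mle_combined_sigma(sigma_v: float, sigma_a: float) -> float:
    """Predicted SD of the bimodal estimate; never larger than the better unimodal SD."""
    return math.sqrt((sigma_v ** 2 * sigma_a ** 2) / (sigma_v ** 2 + sigma_a ** 2))


def mle_combined_estimate(x_v: float, x_a: float, sigma_v: float, sigma_a: float) -> float:
    """Reliability-weighted average of the visual and auditory location estimates."""
    w_v, w_a = mle_weights(sigma_v, sigma_a)
    return w_v * x_v + w_a * x_a


if __name__ == "__main__":
    # Hypothetical localization SDs in degrees: similar reliability in both modalities.
    print(mle_combined_sigma(4.0, 4.0))
    # Hypothetical case in which vision is far more reliable than audition.
    print(mle_combined_sigma(1.0, 10.0), mle_weights(1.0, 10.0))
```

When the two unimodal standard deviations are equal, the predicted bimodal standard deviation is smaller by a factor of √2, which is the largest proportional gain the model allows. When one modality is much more reliable, the bimodal estimate is dominated by that modality and the gain in precision is small.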

Beyond the advantages in precision, accuracy, and reaction times, multisensory processing also

plays an important role in resolving perceptual ambiguities that exist at the unisensory level

(Ernst & Bulthoff, 2004; Green & Angelaki, 2010). A common example of this is the self-motion

illusion (i.e., vection) that occurs when stationary trains begin to move relative to one another.

To the visual system of a passenger inside one train, viewing visual motion of the neighbouring

train through the window is perceptually ambiguous: the same visual stimulus may be generated

by the passenger’s train moving forward, or by the neighbouring train moving backward. This

ambiguity is resolved, however, by multisensory combination of visual motion cues with

vestibular motion cues (Ernst & Bulthoff, 2004; Fetsch, Turner, DeAngelis, & Angelaki, 2009).

In the audiovisual realm, sound can similarly shift perception of ambiguous visual motion paths.

When two simple dots are animated to converge and diverge, they may be perceived as

streaming through one another or as colliding and bouncing off one another. The addition of a

brief sound at the moment of convergence, however, biases the percept towards a collision and

bounce (Sekuler, Sekuler, & Lau, 1997).

The above examples demonstrate the adaptive advantages of combined processing and

integration of multiple sensory streams by the nervous system in scenarios with ecological

validity. The tendency toward integration may also give rise to illusions, however, in scenarios

where the stimuli are manipulated experimentally to introduce disparity or incongruity in one or

more dimensions (e.g., space, time, semantic content, or numerosity). Such illusions have been

exploited extensively by researchers to determine the neural and computational underpinnings of

multisensory perception (Calvert, Spence, & Stein, 2004; Stein & Meredith, 1993). For example,

in the classic ventriloquism effect, spatial disparity between the location of a visual stimulus and

the source of a corresponding sound biases, or ‘captures’, the perceived spatial origin of the

sound (Howard & Templeton, 1966). In the more recently described temporal ventriloquism

effect, temporal disparity between visual and auditory cues biases perception: the apparent

interval between the onset of two lights can be shortened by paired auditory clicks that

temporally intervene between the lights, or lengthened by paired clicks that temporally flank the

lights (Morein-Zamir, Soto-Faraco, & Kingstone, 2003). In the well-known McGurk effect,

incongruity between semantic content of an auditory speech signal and the accompanying visual

speech signal alters the perceived auditory signal (McGurk & MacDonald, 1976). For instance,

when audio of the syllable /ba/ is paired with video of the syllable /ga/, the resulting percept in

developmentally normal individuals is rarely veridical, but most commonly an illusory /da/.

Incongruities in numerosity between rapid sequential visual flashes and auditory beeps also elicit

a multisensory illusion: a single flash accompanied by multiple beeps is perceived as multiple

flashes (Shams, Kamitani, & Shimojo, 2000, 2002).

The phenomena described above are a small selection of the numerous multisensory effects

described in the scientific literature, but serve to underscore the extent to which perception is

shaped by multisensory interactions.

1.3.2 Influence of Cognitive Factors in Multisensory Processing

Although numerous studies suggest that multisensory perceptual binding is a rapid, pre-attentive,

stimulus-driven process that occurs without the observer’s awareness (Driver, 1996; McGurk &

MacDonald, 1976; Sekuler et al., 1997), several cognitive factors including attention and

decisional bias have been shown to modulate performance on multisensory tasks.

Attention is an essential cognitive function that enables an observer to select relevant stimuli so

that greater neural resources may be devoted to their processing (Talsma, Senkowski, Soto-

Faraco, & Woldorff, 2010). The interactions between attention and multisensory processing are

bidirectional: salient stimuli may alter attentional selection by a bottom-up alerting mechanism,

while top-down directed attention can alter the manner in which multisensory stimuli are

processed (Theeuwes, 1991). The influence of directed attention on multisensory processing is

evident in studies of the McGurk effect—an audiovisual speech illusion (Navarra, Alsius, Soto-

Faraco, & Spence, 2010). When attentional resources shift to a secondary task, susceptibility to

the McGurk effect decreases despite direct viewing of the speaker’s face (Alsius, Navarra,

Campbell, & Soto-Faraco, 2005). Similarly, when attention is directed to a tactile stimulus while

performing the McGurk task, observers’ susceptibility to the audiovisual illusion decreases

(Alsius, Navarra, & Soto-Faraco, 2007). In addition to altering the McGurk effect, directed

attention may also modulate multisensory enhancement in reaction times. For example, in an

audiovisual cue discrimination task, focusing attention on a single modality abolishes the

multisensory enhancement in reaction time observed when attention is not explicitly directed

(Mozolic, Hugenschmidt, Peiffer, & Laurienti, 2008). Importantly, however, not all multisensory

phenomena are influenced by directed attention. The spatial ventriloquism effect is influenced by

neither directed attention (i.e., top-down attentional effects) (Bertelson, Vroomen, De Gelder, &

Driver, 2000), nor automatic attention (i.e., bottom-up alerting effects) (Vroomen, Bertelson, &

De Gelder, 2001). Similarly, visual perceptual enhancement in the temporal ventriloquism effect

is not accounted for by attentional alerting or distraction by accompanying auditory clicks

(Morein-Zamir et al., 2003).

Another hypothesized effect of attention on sensory perception is that of prior entry. According

to the law of prior entry, a stimulus presented in an attended modality or location will be

perceived before a stimulus in an unattended modality or location (Titchener, 1908). Scientific

support for this effect is largely derived from studies of audiovisual temporal order judgment

(TOJ) tasks. These studies show that the point of subjective simultaneity (PSS) shifts as a

function of the attended modality, speeding perception by an estimated 14 ms (Schneider &

Bavelier, 2003; Spence, Shore, & Klein, 2001; Zampini, Shore, & Spence, 2005). While

different explanations have been proposed, including sensory acceleration (Stelmach &

Herdman, 1991; Tünnermann, Petersen, & Scharlau, 2015) and enhanced perceptual sensitivity

(Schneider & Bavelier, 2003), the underlying mechanism of the effect remains controversial

(Frey, 1990; Spence & Parise, 2010).

Many psychophysical tasks probing multisensory processes are also susceptible to response

biases that may mimic lower-level sensory interactions (Welch, 1999). In signal detection theory,

this concept is formalized as the criterion parameter (Green & Swets, 1966). Importantly, the

criterion can vary, or shift, independently of the internal signal characteristics, and both factors

determine the perceptual response. Consequently, identifying the confound of response bias or

criterion shift is particularly problematic in experimental paradigms (e.g., yes/no judgments) that

do not have a stochastically determined noise floor like that built into the 2-alternative forced

choice method (Fechner, 1889; Welch, 1999). For example, an investigation of audiovisual

simultaneity judgments and temporal order judgments showed that the mechanisms driving

perceptual changes in these tasks are ambiguous, and may be explained equally by changes in

neural timing or by criterion shifts (Yarrow, Jahn, Durant, & Arnold, 2011). Less equivocal

evidence of cognitive response bias is found in studies of intersensory bias, however, which

show a relatively larger magnitude of multisensory interaction among pairings of realistic,

contextually-rich, semantically congruent (i.e., “compelling”) stimuli (e.g., a steam kettle and a

whistle sound) versus arbitrary, unfamiliar, unrealistic stimuli (e.g. a dot and a tone) (Warren,

Welch, & McCarthy, 1981). Similarly, asynchrony between speech video clips and audio tracks

is easier to detect with gender mismatched stimuli compared with gender matched stimuli

(Vatakis & Spence, 2007), but no such effect was found for matched and mismatched non-

speech stimuli (Vatakis & Spence, 2008).
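
The distinction drawn above between sensitivity and the criterion parameter can be illustrated with the standard equal-variance Gaussian model of signal detection theory. The following is a minimal sketch under that assumption; the function names and example values are hypothetical and do not come from the cited studies.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, SD 1)


def rates_from_model(d_prime: float, criterion: float) -> tuple[float, float]:
    """Hit and false-alarm rates predicted for a given sensitivity and criterion."""
    hit = _STD_NORMAL.cdf(d_prime / 2.0 - criterion)
    false_alarm = _STD_NORMAL.cdf(-d_prime / 2.0 - criterion)
    return hit, false_alarm


def dprime_and_criterion(hit_rate: float, false_alarm_rate: float) -> tuple[float, float]:
    """Recover sensitivity (d') and criterion (c) from yes/no response rates.

    Rates must lie strictly between 0 and 1; inv_cdf raises an error otherwise.
    """
    z_hit = _STD_NORMAL.inv_cdf(hit_rate)
    z_fa = _STD_NORMAL.inv_cdf(false_alarm_rate)
    return z_hit - z_fa, -0.5 * (z_hit + z_fa)


if __name__ == "__main__":
    # Two hypothetical observers with identical sensitivity but different criteria.
    for label, c in (("neutral", 0.0), ("liberal", -0.5)):
        hit, fa = rates_from_model(1.5, c)
        print(label, round(hit, 3), round(fa, 3), dprime_and_criterion(hit, fa))
```

In this sketch the liberal observer reports "yes" more often on both signal and noise trials although the underlying sensitivity is unchanged. A shift of this kind can be mistaken for a sensory effect when only the proportion of "yes" responses is examined.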

1.3.3 Neural Sites of Multisensory Processing

Perceptual information is segregated by modality in the peripheral nervous system, but

converges on a shared population of neurons for multisensory processing in the central nervous

system. Rather than a discrete locus, however, neurophysiologic and neuroimaging studies have

revealed numerous brain regions involved in multisensory functions, and a diversity of activation

patterns that vary by perceptual task and stimulus (see Calvert (2001) and Alais, Newell, and

Mamassian (2010) for reviews).

1.3.3.1 Superior Colliculus

Multisensory integration has been most extensively studied in the superior colliculus (SC), a

phylogenetically ancient midbrain structure common to all mammals and homologous to the

optic tectum in other vertebrates (King, 2004). It receives ascending multisensory signals—

visual, auditory, and somatosensory (Meredith & Stein, 1986b)—and plays key roles in directing

attention to stimuli of interest (Robinson & Kertzman, 1995) and mediating saccadic eye

movements (Schiller & Stryker, 1972; Sparks, 1986). Neurons in the superficial layers of the SC

are unisensory and respond solely to visual stimulation, while cells in the deeper layers are

multisensory and respond to combinations of visual, auditory, and somatosensory stimuli (King,

2004). In addition to this superficial-to-deep organization by modality, neurons within each layer

of the SC are topographically arranged according to their spatial receptive fields (Cynader &

Berman, 1972; Lane, Allman, Kaas, & Miezin, 1973; Sparks, 1988; Wallace, Wilkinson, &

Stein, 1996). Importantly, the multimodal spatial maps in the SC are in spatial register, so that

overlapping receptive fields of visual-, auditory-, and somatosensory-responsive neurons map to

the same region of space (Sparks, 1988; Wallace et al., 1996). The superficial visual receptive

fields are organized in a retinotopic manner similar to the primary visual cortex: each SC

contains a representation of the contralateral visual field, with the central-peripheral axis

represented rostro-caudally, and the superior-inferior axis represented latero-medially (Lane et

al., 1973). The deeper auditory and somatosensory receptive fields have similar topography to

the visual map, and maintain their alignment by making compensatory shifts in response to

changes in eye position (Groh & Sparks, 1996; Hartline et al., 1995). The prioritization of spatial

alignment among the multisensory maps in the SC highlights its role in processing sensory

information according to spatial location—an amodal feature—regardless of its modality of

origin.

The neuroanatomic arrangement of retinocollicular projections is also distinct from the

arrangement of retinostriate projections, as shown in Figure 1.2 (Lane et al., 1973; Pollack &

Hickey, 1979). The right and left striate cortices (V1) receive approximately equal input from the

corresponding nasal and temporal retinas of each eye. Each SC, however, receives input

predominantly from the nasal retina (serving the temporal visual hemifield) of the contralateral

eye. As a result, the nasal and temporal visual hemifields of each eye are equally represented in

V1, but the temporal visual hemifield is over-represented in the SC.

Figure 1.2: Schematic diagram of retinal projections in the retinostriate and

retinocollicular pathways. (A) In the retinostriate pathway, retinal projections undergo a hemi-

decussation in the optic chiasm. The right and left primary visual cortices (V1) therefore receive

approximately equal input from the corresponding points in the nasal and temporal retinas of

each eye. (B) In the retinocollicular pathway, the majority of retinal projections originate in the

nasal retina of each eye and decussate fully in the optic chiasm. The right and left superior

colliculi (SC) therefore receive predominantly crossed input from the nasal retina of the

contralateral eye (serving the temporal visual field for that eye). LGN, lateral geniculate nucleus;

RE, right eye; LE, left eye. Modified from Zackon, Casson, Zafar, Stelmach, and Racette (1999).

Reprinted with permission from Elsevier.

Multisensory neurons in the SC exhibit several characteristic responses that offer insight into the

mechanisms and constraints of multisensory integration at this site (Holmes & Spence, 2005;

Stein, Stanford, Ramachandran, Perrault, & Rowland, 2009). First, the firing rate of multisensory

neurons tends to be enhanced when stimuli in different sense modalities originate from the same

location in space; the greater the spatial separation between two unimodal signals, the smaller the

multisensory response enhancement (Meredith & Stein, 1986a). This is referred to as the “spatial

rule” of multisensory integration. Second, neural responses in the multisensory layers of the SC

are greater when the stimuli in each modality occur at approximately the same time (Meredith,

Nemitz, & Stein, 1987). This is termed the “temporal rule” of multisensory integration. Third,

SC multisensory neurons driven by spatially congruent stimuli from different modalities show a

magnitude of response enhancement (i.e., an increase in firing rate) that is greater than the sum

of the responses to each unisensory stimulus alone (Meredith & Stein, 1986a). This phenomenon

is termed “superadditivity” (Stein & Meredith, 1993). Fourth, SC multisensory neuron response

enhancement (i.e., the magnitude of superadditivity) is generally greater for weak component

unisensory stimuli (Meredith & Stein, 1986b). This classical multisensory response is termed the

principle of “inverse effectiveness” (Stein & Meredith, 1993). Together, superadditivity and

inverse effectiveness comprise non-linear responses that enhance the saliency of weak but

spatially- and temporally-congruent stimuli when information is available from more than one

modality (Holmes & Spence, 2005).
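
Superadditivity and inverse effectiveness, as described above, can be made concrete with a small numerical sketch. It assumes the conventional enhancement index, in which the multisensory response is expressed as a percentage change relative to the best unisensory response; the firing rates used are hypothetical and serve only to illustrate the calculation.

```python
def enhancement_index(combined: float, visual: float, auditory: float) -> float:
    """Percentage change of the multisensory response relative to the best unisensory response."""
    best_unisensory = max(visual, auditory)
    return 100.0 * (combined - best_unisensory) / best_unisensory


def is_superadditive(combined: float, visual: float, auditory: float) -> bool:
    """True when the multisensory response exceeds the sum of the unisensory responses."""
    return combined > visual + auditory


if __name__ == "__main__":
    # Hypothetical mean responses (spikes per trial) for weak and strong stimulus pairs.
    weak = {"combined": 8.0, "visual": 2.0, "auditory": 1.0}
    strong = {"combined": 26.0, "visual": 20.0, "auditory": 15.0}
    for label, r in (("weak", weak), ("strong", strong)):
        print(label, enhancement_index(**r), is_superadditive(**r))
```

In this hypothetical case the weak pair yields a large proportional enhancement that exceeds the sum of its components, while the strong pair yields a modest, subadditive enhancement, which is the pattern expected under inverse effectiveness.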

These characteristic non-linear responses of multisensory SC neurons are not present at birth, but

develop in an experience-dependent manner. Single-cell recordings from the SC of newborn

monkeys show that although deep SC neurons respond to visual, auditory, and somatosensory

stimuli, the inputs are not integrated to produce response enhancement (Wallace & Stein, 2001).

Similar recordings in cats reared in darkness show that visual experience is necessary for

integrative responses to emerge in the SC (Wallace, Perrault, Hairston, & Stein, 2004). This

experience-dependent maturation of SC multisensory neurons is also mediated by descending

cortical input: in kittens, ablation of corticofugal pathways from the anterior ectosylvian sulcus

and the rostral lateral suprasylvian sulcus precludes development of integrated multisensory

responses (Jiang, Jiang, & Stein, 2006). In the adult cat, reversible cryogenic inactivation of

these cortical areas also temporarily blocks multisensory enhancement in SC neurons (Jiang,

Wallace, Jiang, Vaughan, & Stein, 2001).

Assuming normal experience-dependent development, probabilistic models suggest the

multisensory response enhancement in the SC, and its spatial and temporal constraints, represent

an optimal strategy to attend and orient to important environmental stimuli (Rowland, Stanford,

& Stein, 2007).

1.3.3.2 Cortical Areas

Beyond the SC, functional neuroimaging and electrophysiological studies show extensive

multisensory interactions in the human cerebral cortex (Figure 1.3). Although the regions

involved and precise patterns of activation are highly stimulus and task dependent, some general

patterns have emerged in the literature (Calvert, 2001). The superior temporal sulcus (STS) is

commonly activated in tasks requiring integration of complex multisensory stimuli, particularly

during audiovisual speech perception (Callan, Callan, Kroos, & Vatikiotis-Bateson, 2001;

Calvert, Campbell, & Brammer, 2000; Raij, Uutela, & Hari, 2000). Functional MRI data suggest

a posterior-to-anterior audiovisual processing gradient in the STS: the posterior regions respond

to audiovisual signals regardless of their spatiotemporal structure, the middle regions integrate

audiovisual signals according to their physical stimulus properties (i.e., spatiotemporal

correspondence), and the anterior regions integrate audiovisual signals according to their

linguistic content (Lee & Noppeney, 2011b). Activation of the STS is also observed during

illusory visual perception in the sound-induced flash illusion (Watkins, Shams, Tanaka, Haynes,

& Rees, 2006). The intraparietal sulcus (IPS) shows enhanced activation during spatial

localization and spatial attention tasks requiring cross-modal integration of audiovisual stimuli

(Bushara et al., 1999; Lewis, Beauchamp, & DeYoe, 2000) and visuotactile stimuli (E.

Macaluso, C. Frith, & J. Driver, 2000; E. Macaluso, C. D. Frith, & J. Driver, 2000). The cortex

of the insula and claustrum appears to have a prominent role in cross-modal matching of visual

and tactile stimuli (Banati, Goerres, Tjoa, Aggleton, & Grasby, 2000; Hadjikhani & Roland,

1998) and in processing temporal correspondence of visual and auditory stimuli (Bushara,

Grafman, & Hallett, 2001; Calvert, Hansen, Iversen, & Brammer, 2001). Areas of the frontal

lobes frequently show enhanced activation to multisensory stimuli (Banati et al., 2000; Bushara

et al., 2001; Callan et al., 2001; Calvert et al., 2000; Calvert et al., 2001; Lee & Noppeney,

2011b; Lewis et al., 2000; Raij et al., 2000), but their role in multisensory processing is less

distinct than more posterior areas. Some have speculated, however, that the frontal lobes serve to

process more arbitrary or abstract cross-modal associations (e.g., the association between

auditory and visual representations of alphabetical letters) (Calvert, 2001; Calvert et al., 2004;

Raij et al., 2000).

Figure 1.3: Summary of putative multisensory areas of the human brain based on primate

anatomical data, human psychophysical data, and functional neuroimaging studies. The left

image shows a lateral view of the brain; the right image shows the medial view of the brain.

From Calvert (2001). Reprinted with permission from Oxford University Press.

Figure 1.4: Posterior-to-anterior audiovisual processing gradient in the human STS.

Coloured areas show fMRI blood oxygenation level-dependent (BOLD) activation based on

different audiovisual stimulus features. The posterior STS responds to audiovisual signals

regardless of their spatiotemporal structure (magenta). The mid-STS responds to audiovisual

signals on the basis of spatiotemporal correspondence (cyan). The anterior STS responds to

audiovisual correspondence on the basis of linguistic content. The frontal lobe also shows areas

of BOLD activation to audiovisual stimuli. Figure modified from Lee and Noppeney (2011b),

with permission under the Creative Commons BY-NC-SA 3.0 Unported License.

Multisensory stimuli have also been shown to modulate functional responses in cortices

traditionally viewed as modality-specific (see Macaluso (2006) for review). For example, cross-

modal perceptual binding, as indicated by sound-induced change in visual motion, is associated

not only with activation of multimodal areas, but also reciprocal inactivation of unimodal areas

(Bushara et al., 2003). Conversely, cross-modal binding of audiovisual speech signals produces

response enhancement in the primary visual and auditory cortices as well as the multisensory

STS (Calvert et al., 1999; Calvert et al., 2000; Nath & Beauchamp, 2012). Similarly, spatial

correspondence between visual and tactile stimuli elevates brain activity not only in the IPS, but

also in the primary sensory cortices contralateral to the stimuli (Macaluso & Driver, 2001).

Findings such as these challenge the traditional view that sensory processing proceeds in a

hierarchical manner from primary unisensory areas, to secondary cortices and increasingly

multisensory areas. Instead, empirical findings indicate that multisensory perception depends on

both feed-forward and feed-back interactions between unisensory and multisensory areas

(Calvert, 2001; Macaluso, 2006).

1.3.4 Multisensory Integration

The brain processes multisensory information by comparing continuous unisensory inputs and

selectively combining, or binding together, related signals to improve the fidelity of perception

(Parise & Ernst, 2016). From this viewpoint, whether multisensory signals are perceptually

bound and to what extent they are integrated depends on their relatedness. Although signal

relatedness may be conceptualized on a continuous scale from 0% to 100%, it reflects a

probabilistic measure of a binary determination: whether unisensory signals in different

modalities arose from a common source or from different sources (Shams & Beierholm, 2010).

Indeed, the biological value of perceptual cues to an organism lies in the information they

convey about their extrinsic causes (Kording et al., 2007). If the constraints on multisensory cue

combination are very liberal, an organism risks inappropriately binding multisensory cues from

separate events, thus losing critical information about its environment. On the other hand, if the

constraints on multisensory cue combination are very strict, an organism may not integrate cues

arising from a single source, thus impairing its ability to quickly and precisely detect important

stimuli in its noisy sensory environment. The determination of relatedness and appropriate

integration of multisensory signals is therefore the central problem faced by a multisensory

perceptual system.
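
The common-source determination described above can be illustrated numerically. The following is a minimal Python sketch, assuming Gaussian sensory noise and a Gaussian prior over location, in the manner of the causal inference model associated with Kording et al. (2007). The function name, the parameter names, and all numerical values are hypothetical and chosen only for illustration; the sketch is not drawn from the studies reported in this thesis.

```python
import math


def posterior_common_cause(x_v, x_a, sigma_v, sigma_a, sigma_p,
                           mu_p=0.0, p_common=0.5):
    """Posterior probability that a visual and an auditory signal share one source.

    x_v, x_a   : noisy visual and auditory location measurements (deg)
    sigma_v/a  : standard deviations of the visual and auditory noise (deg)
    sigma_p    : standard deviation of the Gaussian prior over location (deg)
    mu_p       : mean of the prior over location (deg)
    p_common   : prior probability of a common source (assumed value)
    """
    var_v, var_a, var_p = sigma_v ** 2, sigma_a ** 2, sigma_p ** 2

    # Likelihood of both measurements given ONE shared source
    # (the shared location is integrated out analytically).
    denom = var_v * var_a + var_v * var_p + var_a * var_p
    quad_c1 = ((x_v - x_a) ** 2 * var_p
               + (x_v - mu_p) ** 2 * var_a
               + (x_a - mu_p) ** 2 * var_v) / denom
    like_c1 = math.exp(-0.5 * quad_c1) / (2 * math.pi * math.sqrt(denom))

    # Likelihood of both measurements given TWO independent sources.
    quad_c2 = ((x_v - mu_p) ** 2 / (var_v + var_p)
               + (x_a - mu_p) ** 2 / (var_a + var_p))
    like_c2 = math.exp(-0.5 * quad_c2) / (
        2 * math.pi * math.sqrt((var_v + var_p) * (var_a + var_p)))

    # Bayes' rule over the two causal structures.
    return like_c1 * p_common / (like_c1 * p_common + like_c2 * (1 - p_common))


# Hypothetical example: coincident signals versus signals 30 deg apart.
p_same = posterior_common_cause(0.0, 0.0, sigma_v=1.0, sigma_a=5.0, sigma_p=20.0)
p_far = posterior_common_cause(0.0, 30.0, sigma_v=1.0, sigma_a=5.0, sigma_p=20.0)
```

Under these assumed values, coincident measurements favour a common source, whereas widely separated measurements favour separate sources. This mirrors the trade-off between overly liberal and overly strict constraints on cue combination.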

As is evident from studies on perceptual bias with experimentally-induced discrepancy between

multisensory stimuli, the perceptual system has a tendency to produce a perceptual experience

consistent with non-discrepant stimuli originating from a common source (Welch & Warren,

1980). Welch and Warren (1980) termed the observer’s belief, or perception, that two or more

unisensory cues belong together, or originate from a common source, the “unity assumption”.

They postulated that the strength of this assumption is a function of the extent of feature

correspondence (e.g., spatial, temporal, or identity) between the unisensory signals, and

cognitive factors such as attention and the overall “compellingness” of the stimulus complex

(Welch, 1999; Welch & Warren, 1980). Indeed, in their studies of the ventriloquism effect,

Thurlow and Jack (1973) showed that the strength of visual capture was dependent upon the

degree of spatial and semantic correspondence between the unisensory signals. If the visual and

auditory cues were too widely separated, or semantically unrelated (e.g., a video of a face

mouthing syllables paired with an auditory tone), the cross-modal interaction was relatively

diminished.

While the unity assumption predicts perceptual binding of related but discrepant multisensory

stimuli, the precise manner in which the perceptual system resolves the discrepancy is dependent

upon the stimuli and the perceptual task at hand. In many instances, one modality dominates or

‘captures’ the other in terms of its perceptual representation of the shared feature. In the spatial

dimension, vision tends to dominate when in conflict with other senses. Perhaps the best-known

demonstration of visual dominance is the classic ventriloquism effect, in which the spatial

position of a visual cue dominates and overrides the perceived location of the auditory cue

(Bertelson & Radeau, 1981; Howard & Templeton, 1966; Thurlow & Jack, 1973). Visual

dominance has also been described in the context of visuotactile discrepancy. When an object is

simultaneously grasped and viewed through a distorting lens, judgments of shape are strongly

biased toward the non-veridical visual input (Rock & Victor, 1964). In the temporal dimension,

however, audition tends to dominate vision. In situations of discrepant audiovisual flutter

frequencies, for example, the auditory rate drives, or entrains, the perceived rate of visual flutter

(Gebhard & Mowbray, 1959; Recanzone, 2003; Shipley, 1964). Similarly, multiple beeps

accompanying a single flash can create the illusion of multiple flashes (Shams et al., 2000,

2002), and in the temporal ventriloquism effect, auditory cues in close temporal proximity to

visual cues alter the perceived timing of the visual cues (Morein-Zamir et al., 2003).

1.3.5 Theories of Multisensory Integration and Modality Dominance

Various theories and models have been proposed to explain how the nervous system combines

multisensory cues and determines intersensory bias under different conditions.

1.3.5.1 Directed-Attention Hypothesis

The directed-attention hypothesis proposes that modality dominance in multisensory perception

is determined by differences in attention given to signals in each modality (Posner, Nissen, &

Klein, 1976; Welch & Warren, 1980). According to this hypothesis, visual dominance in

audiovisual location judgments and visuotactile shape judgments reflects a greater amount of

attention given to vision than to audition or touch (Welch & Warren, 1980). Posner et al. (1976)

proposed that because visual cues are less alerting than auditory cues, a greater proportion of

attention is tuned to vision. Furthermore, they suggested that visual dominance in the

multisensory percept is achieved through a mechanism of sensory facilitation termed prior entry

(Titchener, 1908). This theory found its support in studies of attentional manipulation in the

ventriloquism effect (Canon, 1970, 1971). However, virtually no effect of attentional

manipulation could be elicited in the setting of visual-proprioceptive positional discrepancy

(Pick, Warren, & Hay, 1969). The proposed attentional bias toward vision, in isolation, also

failed to explain the dominance of audition in intersensory discrepancy in the temporal

dimension (Gebhard & Mowbray, 1959; Shipley, 1964).

1.3.5.2 Modality Appropriateness and Precision Hypotheses

The modality appropriateness hypothesis begins with the assumption that the various sensory

modalities are not equally suited for the perception of any given event (Freides, 1974; O’Connor

& Hermelin, 1972; Welch & Warren, 1980). This theory states that while the different sensory

modalities are capable of many overlapping information processing functions, each has a subset

of functions that it performs better than other modalities. This inherent sensory processing

superiority, in turn, determines the relative bias toward, or dominance of, a particular modality.

A related theory—the modality precision hypothesis—defines the appropriate modality as the

one that has greatest precision for a given perceptual task (Choe, Welch, Gilford, & Juola, 1975;

Welch & Warren, 1980). Put another way, a modality’s dominance in a multisensory percept

may not reflect its inherent physiologic priority over other modalities, but rather its superior

precision for the perceptual dimension being probed (Witten & Knudsen, 2005). Information

from the most precise sense will therefore dominate in the fused percept. Because vision is more

reliable and precise than other senses in the spatial dimension (Recanzone, 2009; Witten &

Knudsen, 2005), it follows that vision dominates in spatial aspects of perception (Bertelson &

Radeau, 1981; Rock & Victor, 1964; Thurlow & Jack, 1973). Indeed, visual signals are rarely

subject to environmental distortion, and the topography of the retina maps directly to physical

space. Conversely, audition is more precise than other senses in the temporal domain

(Recanzone, 2009; Witten & Knudsen, 2005). The modality precision hypothesis therefore

predicts the dominance of audition observed in temporal aspects of perception (Gebhard &

Mowbray, 1959; Morein-Zamir et al., 2003; Recanzone, 2003; Shams et al., 2000; Shipley,

1964).

More recently, investigators have found that the typical dominance of vision and audition in

spatial and temporal perception, respectively, can be diminished or reversed. For example, in an

audiovisual frequency judgment task, the typical dominance of auditory flutter frequency over

visual flicker frequency was reversed by making the auditory signal temporally

ambiguous (Wada, Kitagawa, & Noguchi, 2003). In an audiovisual spatial localization task,

blurring of the visual target reversed the typical dominance of vision over audition (Alais &

Burr, 2004). Similarly, studies of the visual and proprioceptive contributions to judgments of

limb position reported that the relative weighting of the sensory modalities dynamically adjusts

in response to degradation of the visual position signal (Mon-Williams, Wann, Jenkinson, &

Rushton, 1997). These findings indicate that modality dominance in multisensory perception is

not categorical and fixed, but continuous and flexible, with individual modalities dynamically re-

weighted based on the immediately-available sensory information. Although the modality

appropriateness and precision hypotheses predict variable dominance based on the perceptual

task, they do not explicitly account for these dynamic changes in perceptual weighting for a

given task (Wada et al., 2003).

1.3.5.3 Bayesian Inference and the Maximum Likelihood Estimation Model

As stated in section 1.3.4, the adaptive value of perceptual cues to an organism lies in the

information they convey about the external environment (Kording et al., 2007). It follows, then,

that the reliability and precision of sensory information are related to its survival value to the

organism (Kording et al., 2007). Because multisensory integration serves to enhance the

reliability and precision of sensory information (Jacobs, 2002), it can be modeled as a problem of

optimal combination (Ernst & Bulthoff, 2004). Bayes’ theorem provides a probabilistic

framework to formalize aspects of the modality precision hypothesis and allows construction of a

hypothetical ideal observer (Deneve & Pouget, 2004; Kersten & Yuille, 2003; Yuille & Bulthoff,

1996). Such an ideal observer may then be used as a reference standard with which to compare

actual human performance (Ernst & Bulthoff, 2004).

For a given external stimulus feature, $S$ (e.g., spatial location), and its sensory representation, $R$

(encoded as a random variable with noise), Bayes’ theorem states that the posterior probability

distribution, $P(S \mid R)$, is proportional to the product of the likelihood function, $P(R \mid S)$, and the

prior probability distribution, $P(S)$:

$$P(S \mid R) \propto P(R \mid S) \times P(S)$$

The likelihood function (i.e., the noise distribution) can be determined experimentally by

repeatedly presenting a stimulus at the same location, $S$, and measuring the variability in $R$. If all

positions of $S$ are equally likely, then $P(S)$ is a uniform distribution, and the theorem reduces to:

$$P(S \mid R) \propto P(R \mid S)$$

The value of $S$ that maximizes the posterior probability is therefore the optimal estimate of the

stimulus location, $\hat{S}$:

$$\hat{S} = \arg\max_{S} P(S \mid R)$$

The same approach can be applied to multisensory stimuli (Deneve & Pouget, 2004). For

example, given an audiovisual location signal, $S_{AV}$, and its independent visual, $R_V$, and auditory,

$R_A$, representations in the sensory system, the optimal location estimate may be computed by:

$$\hat{S}_{AV} = \arg\max_{S_{AV}} P(S_{AV} \mid R_V, R_A)$$

Using Bayes’ theorem and the assumption of uniform prior distributions (Deneve & Pouget,

2004; Kersten & Yuille, 2003), the posterior distribution reduces to:

$$P(S_{AV} \mid R_V, R_A) \propto P(R_V \mid S_{AV}) \times P(R_A \mid S_{AV})$$

If $P(R_V \mid S_{AV})$ and $P(R_A \mid S_{AV})$ represent Gaussian distributions, then the optimal audiovisual location

estimate, $\hat{S}_{AV}$, is the weighted sum of the unimodal location estimates, $\hat{S}_V$ and $\hat{S}_A$:

$$\hat{S}_{AV} = w_V \hat{S}_V + w_A \hat{S}_A$$

where the weights, $w_V$ and $w_A$, represent the unimodal location estimate reliabilities (i.e., the

inverse variances of their respective posterior probabilities) divided by a normalizing term:

$$w_V = \frac{1/\sigma_V^2}{1/\sigma_V^2 + 1/\sigma_A^2} \quad \text{and} \quad w_A = \frac{1/\sigma_A^2}{1/\sigma_V^2 + 1/\sigma_A^2}$$

This special case of multisensory Bayesian inference with uniform priors is also known as the

Maximum Likelihood Estimation (MLE) model of multisensory integration (Deneve & Pouget,

2004; Ernst & Bulthoff, 2004; Yuille & Bulthoff, 1996).

In summary, the MLE model states that the optimal strategy for multisensory integration is to

combine sensory information into the most reliable composite estimate possible. It assumes that

the noise associated with each unisensory estimate is independent and normally distributed, so

that the statistically optimal combination is a simple weighted average where the perceptual

weight is determined by the normalized reciprocal variance of the unisensory estimate. The

uniform prior distributions in the MLE model imply that all possible values for the stimulus are

equally likely, and that the strength of the “unity assumption” (i.e., the belief that the various

unisensory cues belong together) is invariant (Chen & Spence, 2017).
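
The weighted-average rule summarized above can be made concrete with a short numerical sketch. The Python function below is an illustration under the stated MLE assumptions (independent, normally distributed unisensory noise and uniform priors). The function name, argument names, and example values are hypothetical and do not correspond to data from any cited study.

```python
def mle_combine(est_v, var_v, est_a, var_a):
    """Combine a visual and an auditory estimate by inverse-variance weighting.

    Returns the bimodal estimate, its variance, and the two perceptual weights.
    """
    if var_v <= 0 or var_a <= 0:
        raise ValueError("variances must be positive")

    # Reliability is the reciprocal of the variance.
    rel_v = 1.0 / var_v
    rel_a = 1.0 / var_a

    # Each weight is the cue's reliability normalized by the summed reliability.
    w_v = rel_v / (rel_v + rel_a)
    w_a = 1.0 - w_v

    est_av = w_v * est_v + w_a * est_a
    var_av = 1.0 / (rel_v + rel_a)  # never larger than either unisensory variance
    return est_av, var_av, w_v, w_a


# Hypothetical example: visual cue at 0 deg, auditory cue at 10 deg.
# A sharp visual cue (variance 1) dominates; a degraded one (variance 64) does not.
sharp = mle_combine(est_v=0.0, var_v=1.0, est_a=10.0, var_a=16.0)
degraded = mle_combine(est_v=0.0, var_v=64.0, est_a=10.0, var_a=16.0)
```

With the sharp visual cue, the combined estimate lies close to the visual location. With the degraded visual cue, the visual weight falls to 0.2 and the combined estimate shifts to 8 deg, close to the auditory location. In both cases the combined variance is smaller than either unisensory variance.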

Several studies have demonstrated that human multisensory perception is consistent with cue

combination by the MLE model. In the spatial domain, vision typically dominates in visual-

haptic judgments of shape (Rock & Victor, 1964). Ernst and Banks (2002), however, showed

that the relative perceptual weights of vision and touch can be modulated by the addition of

random noise in depth to the visual signal. Using an intersensory conflict paradigm, they

measured the reliability of the unisensory and multisensory percepts, as well as the intersensory

bias in the multisensory percept across multiple visual noise levels. These empirical data agreed

with predictions from an MLE ideal observer, showing that visual-haptic integration results in

not only optimal modality weighting, but also maximal reliability, in the multisensory percept.

Similarly, Alais and Burr (2004) demonstrated that integration of visual and auditory spatial

signals is consistent with the MLE model using another paradigm of intersensory conflict—the

ventriloquism effect. Instead of adding random noise in depth, however, they modulated the

reliability of the visual stimulus by adding increasing amounts of Gaussian blur. In the temporal

domain, the validity of the MLE model is less clear. A study of audiovisual temporal integration

of flash and beep stimuli reported good agreement with the MLE model (Andersen, Tiippana, &

Sams, 2005). However, other studies of audiovisual temporal integration have not supported this

model. Quantitative analysis of human performance on an audiovisual temporal bisection task

did not support the MLE model (Burr, Banks, & Morrone, 2009). Similarly, a study of

audiovisual rate perception showed that multisensory integration in the temporal dimension is

not adequately described by the MLE model, and is more consistent with a Bayesian model that

includes a prior probability distribution (Battaglia, Jacobs, & Aslin, 2003).

1.3.6 Development of Multisensory Processes

Like the development of unisensory perceptual abilities, many multisensory processes mature

over a prolonged period of postnatal development (Wallace, 2004). These processes include

cross-modal associative learning, cross-modal matching, and eventual emergence of the capacity

to integrate multisensory stimuli optimally (Lewkowicz & Lickliter, 1994; Stein & Meredith,

1993). At the neuronal level, simultaneous activation of pre- and post-synaptic neurons by

converging multisensory stimuli is postulated to govern associative learning by simple Hebbian

rules (Cuppini, Magosso, Rowland, Stein, & Ursino, 2012; Feldman, 2012). Indeed, such

experience-dependent cross-modal associative learning has been demonstrated in the SC of

neonatal cats (Yu, Rowland, & Stein, 2010). In humans, cross-modal association and associative

learning are evident in behavioural studies of infants. Using preferential-looking paradigms, the

ability to learn sight-sound pairings has been demonstrated in infants just a few hours old

(Morrongiello, Fenwick, & Chance, 1998). At 3 to 4 weeks of age, infants show an ability to

match auditory and visual stimuli on the basis of intensity (Lewkowicz & Turkewitz, 1980).

One-month-olds have also been reported to recognize which of two visually perceived shapes matches

one they previously explored tactually (Meltzoff & Borton, 1979), although this finding has not

been supported by subsequent replication studies (Brown & Gottfried, 1986; Maurer, Stager, &

Mondloch, 1999; Pêcheux, Lepecq, & Salzarulo, 1988). At 4 to 6 months of age, infants show a

preference for cross-modal correspondence for novel visual-auditory pairings (Lyons-Ruth,

1977; Spelke, 1976) and cross-modal transfer for visual-tactual pairings (Rose, Gottfried, &

Bridger, 1981). In addition to simple stimulus pairings, behavioural and electrophysiological data

also suggest that multisensory combination of visual and auditory cues plays an important role in

speech acquisition in infancy (Kushnerenko, Teinonen, Volein, & Csibra, 2008; Lewkowicz &

Hansen-Tift, 2012). Lewkowicz (2000) conducted a review of the literature on multisensory

perception in human infants, and proposed that multisensory capacities progress from simple to

more complex in a sequential and hierarchical fashion. According to that model of multisensory

development in the first year of life, sensitivity to temporal relations between auditory and visual

stimuli emerges initially on the basis of synchrony and duration, followed by sensitivity to rate,

and lastly on the basis of complex rhythmic features (Lewkowicz, 2000).

Although the ability to compare and combine multisensory cues is present from early infancy,

evidence suggests that optimal multisensory integration does not emerge until considerably later

(Ernst, 2008). For simple audiovisual detection tasks, multisensory facilitation of reaction times

indicative of auditory and visual coactivation is not observed until approximately 8 years of age,

and remains immature until 10 to 11 years of age (Barutchu, Crewther, & Crewther, 2009;

Barutchu et al., 2010). In spatial navigational tasks, adults demonstrate optimal integration of

visual and proprioceptive cues, but performance of children up to 8 years of age is consistent

with a non-integrative strategy of unisensory dominance instead (Nardini, Jones, Bedford, &

Braddick, 2008). Similarly, when provided with visual and haptic cues of object size, statistically

optimal integration producing a bimodal enhancement in precision does not emerge until 8 to 10

years of age (Gori, Del Viva, Sandini, & Burr, 2008). Children younger than 12 years also show

non-optimal integration of auditory and visual cues in tasks of spatial bisection (Gori, Sandini, &

Burr, 2012). Audiovisual speech is an apparent exception to the relatively late emergence of

multisensory integration: when presented with stimuli to elicit the McGurk effect, infants as

young as 4 months old show behavioural (Burnham & Dodd, 2004; Desjardins & Werker, 2004;

Rosenblum, Schmuckler, & Johnson, 1997) and electrophysiological (Bristow et al., 2009)

evidence of audiovisual integration. However, the apparent mechanisms of audiovisual speech

integration vary by age. Behavioural and electrophysiological studies suggest that young children

rely on general perceptual mechanisms for audiovisual speech integration, and only after 6 to 8

years of age do they develop speech-specific and phonetic integrative mechanisms as well

(Baart, Bortfeld, & Vroomen, 2015; Baart, Vroomen, Shaw, & Bortfeld, 2014; Eskelund,

Tuomainen, & Andersen, 2011; Lalonde & Holt, 2016).

1.3.7 Cross-Sensory Calibration Hypothesis

Before the emergence of optimal integration, children tend to demonstrate unisensory dominance

when confronted with conflicting multisensory cues. For example, in children younger than 8

years, visual cues dominate haptic cues in spatial orientation discrimination, and haptic cues

dominate visual cues in size discrimination (Gori et al., 2008). Some have suggested that body

growth in childhood (e.g., increases in limb and digit length, interaural separation, and

interocular distance) necessitates prioritizing sensory recalibration over optimal combination to

prevent the accumulation of bias in multisensory perception (Ernst, 2008; Gori et al., 2008). This

idea has been formalized as the cross-sensory calibration hypothesis, which states that

multisensory interactions in childhood play a fundamental role in maintaining accurate sensory

calibration (Burr & Gori, 2012; Gori, 2015; Gori et al., 2008). Similar to the modality precision

hypothesis for modality dominance in adults (Welch & Warren, 1980), the cross-sensory

calibration hypothesis predicts that the more accurate and robust modality informs the calibration

of the other in a multisensory interaction. Consistent with cross-sensory calibration, visual

distortion from prism spectacles induces persistent bias in auditory localization in barn owls

(Knudsen & Knudsen, 1989), and strabismus (i.e., misalignment of the eyes) shifts the auditory

spatial map in the superior colliculus of ferrets (King et al., 1988). In humans, within-modality

perceptual impairments are consistent with predictions of this hypothesis as well. Haptic

orientation discrimination, typically calibrated by visual cues in childhood, is impaired in early-

blind individuals, but haptic size discrimination, which is typically calibrated by touch, is

preserved (Gori, Sandini, Martinoli, & Burr, 2010). Children with upper limb movement

disorders show the opposite pattern: visual orientation discrimination, typically calibrated by

vision, is preserved, but visual size discrimination, typically calibrated by touch, is impaired

(Gori, Tinelli, Sandini, Cioni, & Burr, 2012). In each instance, a lack of accurate information

from the more robust modality that typically dominates in childhood multisensory perceptions is

hypothesized to impair cross-sensory calibration. Furthermore, this hypothesis provides an

explanation for perceptual impairments observed in modalities not directly affected by a

peripheral sensory pathology.

1.3.8 Selected Psychophysical Measures of Audiovisual Processing and Integration

1.3.8.1 Audiovisual Simultaneity Judgment

For a given observer, the range of signal onset asynchronies (SOAs) over which a given set of

visual and auditory stimuli are perceived as simultaneous is termed the audiovisual simultaneity

window (Figure 1.5). In adulthood, the audiovisual simultaneity window is characteristically

bell-shaped with a slight skew toward conditions in which the visual signal leads the acoustic

signal (Dixon & Spitz, 1980; Lewald & Guski, 2003; Slutsky & Recanzone, 2001; Zampini,

Guest, Shore, & Spence, 2005). The consequence of this skew is that the likelihood of perceived

simultaneity between visual and auditory stimuli is maximal when the light objectively precedes

the sound. Several hypotheses have been advanced to explain this visual-lead asymmetry. Based

on differences in reaction times and evoked potential response latencies to unisensory visual and

auditory stimuli (Andreassi & Greco, 1975; King & Palmer, 1985), some have suggested that the

asymmetry is a consequence of faster internal processing of auditory compared to visual stimuli

(Vroomen & Keetels, 2010). It has been alternatively hypothesized to represent adaptive tuning

to the natural delay in sound waves compared to light waves emanating from any common

source a significant distance from the observer (e.g., the delay in thunder following a lightning

strike) (King & Palmer, 1985; Vroomen & Keetels, 2010).

Figure 1.5: A hypothetical audiovisual simultaneity window. The response distribution has a

characteristic bell shape with a skew toward the visual-lead side.
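
One simple way to describe such a skewed, bell-shaped response distribution is an asymmetric Gaussian with separate widths on the auditory-lead and visual-lead sides. The Python sketch below is illustrative only. The sign convention (positive SOA meaning that the visual signal leads), the function name, and every parameter value are assumptions, not fitted values from any study.

```python
import math


def simultaneity_probability(soa_ms, peak_ms=40.0, sd_auditory_lead_ms=80.0,
                             sd_visual_lead_ms=120.0, amplitude=1.0):
    """Hypothetical probability of a 'simultaneous' response at a given SOA.

    soa_ms > 0 means the visual signal leads (assumed convention).
    The curve is a Gaussian whose width differs on either side of its peak.
    """
    # Use the wider standard deviation on the visual-lead side of the peak.
    if soa_ms >= peak_ms:
        sd = sd_visual_lead_ms
    else:
        sd = sd_auditory_lead_ms
    return amplitude * math.exp(-0.5 * ((soa_ms - peak_ms) / sd) ** 2)


# Evaluate the hypothetical window at a few SOAs (ms).
soas = [-300, -200, -100, 0, 40, 100, 200, 300]
window = {soa: simultaneity_probability(soa) for soa in soas}
```

With these assumed parameters, the curve peaks at a visual lead of 40 ms and declines more slowly on the visual-lead side than on the auditory-lead side.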

While the temporal constraints on the audiovisual simultaneity window are obvious, other factors

also affect the likelihood of perceived simultaneity. Auditory and visual stimuli originating

from the same location are more likely to be perceived as simultaneous than those originating

from different locations (Zampini, Guest, et al., 2005). Stimulus type also has a significant effect

on the audiovisual simultaneity window, with a wider, more symmetric window observed for

audiovisual speech stimuli compared to simple and complex non-speech stimuli (Stevenson &

Wallace, 2013). The shape of the audiovisual simultaneity window can also be altered by

training. Exposure to a fixed audiovisual time lag for a period of minutes results in a

recalibration of perceived simultaneity responses toward the adapted asynchrony (Fujisaki,

Shimojo, Kashino, & Nishida, 2004). Short-term perceptual training with feedback (Powers,

Hillock, & Wallace, 2009; Stevenson, Wilson, Powers, & Wallace, 2013) and long-term musical

training (Lee & Noppeney, 2011a) may also narrow the audiovisual simultaneity window,

particularly on the visual-lead side.

Evidence suggests that a slow, attentive process of cross-modal feature comparison, rather than

true multisensory integration, may underlie audiovisual asynchrony detection (Fujisaki &

Nishida, 2005). The temporal limit of audiovisual asynchrony detection, or width of the

simultaneity window, is therefore argued to depend upon the temporal information encoded at

the unisensory level, or upon the inherent temporal resolution of the neural mechanism for cross-

modal matching (Fujisaki & Nishida, 2005).

Although sensitivity to audiovisual synchrony is posited as the initial basis for multisensory

association in early infancy (Lewkowicz, 2000), this perceptual process continues to mature

throughout early and middle childhood. The audiovisual simultaneity window narrows on both

the auditory-lead and visual-lead sides from early childhood through adolescence, reaching an

adult shape between 9 years and 17 years of age (Chen, Shore, Lewis, & Maurer, 2016; Hillock-

Dunn & Wallace, 2012; Hillock, Powers, & Wallace, 2011; Lewkowicz & Flom, 2014).

In adults, the width of the audiovisual simultaneity window is also correlated with other indices

of multisensory integration. People with a narrow simultaneity window tend to experience a

stronger McGurk effect, but are less susceptible to the sound-induced flash illusion (Stevenson,

Zemtsov, & Wallace, 2012). A common factor uniting these behavioural findings is hypothesized

to be an individual’s ability to dissociate asynchronous multisensory signals (Stevenson,

Zemtsov, et al., 2012).

1.3.8.2 Spatial Ventriloquism Effect

The spatial ventriloquism effect is an illusion involving cross-modal integration in which

spatially disparate visual and auditory stimuli are perceived as originating from the same location

(Figure 1.6). In normal subjects, localization of the visual stimulus typically dominates the fused

percept and ‘captures’ the auditory stimulus (Figure 1.6A) (Howard & Templeton, 1966; Welch

& Warren, 1980). The strength of this perceptual fusion follows the spatial and temporal rules of

multisensory integration (Holmes & Spence, 2005), diminishing with increasing spatial and

temporal separation until the two stimuli are consistently regarded as separate events (Godfroy,

Roumes, & Dauchy, 2003; Lewald, Ehrenstein, & Guski, 2001; Lewald & Guski, 2003;

Recanzone, 2009; Slutsky & Recanzone, 2001). The spatial ventriloquism effect is not

significantly affected by top-down directed attention or bottom-up automatic attention, indicating

that the phenomenon is a result of low-level, automatic cross-modal interactions (Bertelson et al.,

2000; Vroomen et al., 2001).

Alais and Burr (2004) demonstrated that blurring of the visual stimulus, which reduces its spatial

reliability, can diminish and even reverse the dominance of sight over sound (Figure 1.6B).

Quantitative analysis of their results showed that integration of visual and auditory spatial

information obeys the MLE model of optimal combination, such that the variance in the

localization estimate of the fused percept is minimized (Alais & Burr, 2004).
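
The variance-minimizing property can be stated explicitly. As a brief worked expression, assume independent Gaussian noise, and let $\sigma_V^2$ and $\sigma_A^2$ denote the variances of the unisensory visual and auditory location estimates (the notation here is assumed for illustration). The variance of the optimally combined estimate, $\sigma_{AV}^2$, is then:

```latex
\sigma_{AV}^2
  = \frac{1}{1/\sigma_V^2 + 1/\sigma_A^2}
  = \frac{\sigma_V^2 \, \sigma_A^2}{\sigma_V^2 + \sigma_A^2}
  \le \min\left(\sigma_V^2, \sigma_A^2\right)
```

The inequality holds because each of the ratios $\sigma_A^2 / (\sigma_V^2 + \sigma_A^2)$ and $\sigma_V^2 / (\sigma_V^2 + \sigma_A^2)$ is less than 1, so the bimodal variance cannot exceed either unisensory variance. When one cue is far less reliable than the other, $\sigma_{AV}^2$ approaches the variance of the more reliable cue, and the bimodal gain in precision becomes small.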

Figure 1.6: A diagram of the spatial ventriloquism effect. The speaker icon indicates the

location of the auditory stimulus, and the Gaussian blob represents the location and spatial

reliability of the visual stimulus. The red dot indicates the perceived location of the fused

audiovisual event. Diagram not to scale. (A) Classical ventriloquism, in which the location of the

fused percept is dominated by the location of the visual stimulus. (B) If the visual stimulus is

very spatially unreliable relative to the auditory stimulus, the location of the fused percept shifts

toward the auditory location.

1.3.8.3 Temporal Ventriloquism Effect

The temporal ventriloquism effect is a cross-modal phenomenon in which non-speech, non-

rhythmic, and spatially uninformative sounds alter performance on a visual temporal order

judgment (TOJ) task (Figure 1.7) (Morein-Zamir et al., 2003). If two clicks temporally intervene

between the onset of two lights, performance is worsened, as if the clicks “pull” the lights closer

together in time (Figure 1.7A). Conversely, if two clicks temporally flank the onset of two

lights, performance is enhanced, as if the clicks “pull” the lights apart in time (Figure 1.7B).

Morein-Zamir et al. (2003) conducted a series of variations on this paradigm, and concluded that

the enhancement in visual TOJ is dependent upon the second sound trailing the second light by

about 100 to 200 ms and the effect is not accounted for by mechanisms of attentional alerting or

cross-modal interference. Rather, they postulate that the phenomenon results from a low-level

integrative mechanism that resolves intersensory temporal discrepancy by drawing the stimuli

toward temporal convergence (Fendrich & Corballis, 2001; Morein-Zamir et al., 2003).

Beyond the obvious temporal constraints, the temporal ventriloquism effect exhibits no

dependency on intersensory spatial correspondence (Vroomen & Keetels, 2006) or synesthetic

congruency between the auditory and visual stimuli (Keetels & Vroomen, 2011).

Figure 1.7: Examples of audiovisual stimulus conditions that elicit the temporal

ventriloquism effect. (A) When two clicks temporally intervene between the onset times of two

lights, visual TOJ performance is degraded. (B) When two clicks temporally flank the onset of

two lights, visual TOJ performance is enhanced.

1.3.8.4 McGurk Effect

The McGurk effect is an audiovisual illusion in which the perception of an auditory speech

stimulus is altered by concurrent presentation with an incongruent visual speech stimulus. In the

original study by McGurk and MacDonald (1976), an auditory /ba/ combined with a visual /ga/

consistently produced the illusory auditory percept of /da/. Similarly, an auditory /pa/ paired with

a visual /ka/ was often perceived as an auditory /ta/. Subsequent studies found that the

phenomenon generalized to many syllabic combinations, with the resulting auditory percept

being either a fused syllable intermediate to the veridical auditory and visual cues, or dominated

by the visual cue (Burnham & Dodd, 1996; Paré, Richler, ten Hove, & Munhall, 2003).

The illusory auditory percept in the McGurk effect is a relatively robust multisensory

phenomenon. The strength of the percept is not substantially influenced by large spatial

disparities between the visual and auditory signals (Jones & Munhall, 1997). A clear view of the

speaker’s lips is also not required: the strength of the McGurk effect does not begin to diminish

until gaze is displaced beyond 10 to 20 degrees from the speaker’s mouth (Paré et al., 2003), and

the effect is relatively insensitive to spatial degradation of the visual signal by pixelation

(MacDonald, Andersen, & Bachmann, 2000). Even among observers who are aware of the

artificial stimulus pairing or asynchrony between the auditory and visual stimuli, the illusion

remains strong (Soto-Faraco & Alsius, 2009). However, unlike integrative phenomena of low-

level audiovisual stimuli (e.g., the spatial and temporal ventriloquism effects), the strength of the

McGurk effect is significantly diminished when attention is divided or diverted from the

audiovisual speech stimuli (Alsius et al., 2005; Alsius et al., 2007).

Several studies indicate that auditory and visual speech signals may be integrated by both

speech-specific and more general multisensory mechanisms (Eskelund et al., 2011; Tuomainen,

Andersen, Tiippana, & Sams, 2005; van Wassenhove, Grant, & Poeppel, 2007; Vroomen &

Stekelenburg, 2011). Furthermore, combined behavioural and electroencephalographic evidence

suggests that audiovisual speech integration in the McGurk effect proceeds in a hierarchical

fashion, with general spatial and temporal features being integrated first, within 100 ms of sound

onset, followed by integration on the basis of phonetic properties shared by the auditory and

visual signals (Baart, Stekelenburg, & Vroomen, 2014).

1.4 Multisensory Processing in Amblyopia

Given the protracted course of experience-dependent postnatal development of the visual,

auditory, and multisensory perceptual systems (Burr & Gori, 2012; Daw, 2006; Warren, 2008),

and the cross-modal dependency of spatial hearing on vision (King, 2009), and temporal vision

on hearing (Heming & Brown, 2005), some researchers have begun to examine multisensory

processing in amblyopia and other forms of early visual impairment. Below, the current

knowledge of this emerging field of study is summarized.

1.4.1 Audiovisual Temporal and Spatial Perception

Several lines of evidence indicate that amblyopia involves abnormal temporal interactions

between auditory and visual stimuli. The earliest study to explicitly examine this issue tested

adults with a history of early visual deprivation from bilateral congenital cataract on a simple

visual task with an auditory distractor (Putzar, Goerendt, Lange, Rösler, & Röder, 2007).

Participants were shown a series of rapidly changing colours, and asked to identify the colour

simultaneous with a target flash. A task-irrelevant auditory tone was presented before or after the

target flash. The distractor tone significantly biased the perceived timing of the target flash

among controls, but bilaterally deprived participants were less affected, indicating reduced cross-

modal interactions between vision and hearing. The temporal constraints on the perception of

audiovisual simultaneity have also been studied in a similar population (Chen et al., 2017). In

adults with a history of early bilateral deprivation, the audiovisual simultaneity window was

found to be normal on the auditory-lead side, but widened on the visual-lead side (i.e., they were

more likely to perceive a click and a flash as simultaneous when in fact, the flash came first). In

contrast, adults with a history of early monocular deprivation had a symmetrically widened

simultaneity window (i.e., they were more likely to perceive audiovisual simultaneity in both

auditory-lead and visual-lead conditions). In both groups, the abnormalities in simultaneity

perception persisted regardless of which eye was viewing, suggesting that these effects are

mediated by abnormalities in central audiovisual processing rather than peripheral visual input

alone. The perception of audiovisual simultaneity has not previously been investigated in

anisometropic, strabismic, or mixed mechanism amblyopia. The sound-induced flash illusion

(Shams et al., 2000, 2002) has also been employed to study audiovisual temporal interactions in

amblyopia. A preliminary study of the illusion in adults with deprivational amblyopia showed

normal susceptibility among adults with unilateral deprivational amblyopia, but reduced

susceptibility to the illusion (i.e., lower likelihood of perceiving illusory flashes) among adults

with bilateral deprivational amblyopia, especially when the visual flashes were presented

peripherally (Chen, Shore, Lewis, & Maurer, 2015). Similar to findings in unilateral

deprivational amblyopia, a study of the sound-induced flash illusion in unilateral anisometropic,

strabismic, or mixed mechanism amblyopia found no abnormal susceptibility under monocular

viewing conditions (Narinesingh et al., 2017). Under binocular viewing conditions, however,

participants with non-deprivational forms of amblyopia showed susceptibility to the illusion over

a wider range of auditory-leading SOAs, suggesting a widened temporal window of perceptual

binding (Narinesingh et al., 2017).

Audiovisual spatial processing has not been explicitly investigated in populations with

amblyopia. However, abnormal cross-modal interactions in motion perception, which

incorporates both spatial and temporal signals, have been described in adults with sight recovery

following brief bilateral deprivation from congenital cataracts (Guerreiro, Putzar, & Röder,

2016). Such individuals, who experienced a brief period of congenital blindness, exhibit a

significant visual motion aftereffect following adaptation to auditory motion that is absent in

normally sighted individuals and visually impaired individuals who acquired their deficits after

childhood. This cross-modal effect suggests abnormal involvement of audition in visual motion

processing, and supports previous findings of cross-modal reorganization of the neural substrates

for visual motion processing (e.g., area MT) described in early blind and sight recovery

individuals (Jiang, Stecker, Boynton, & Fine, 2016; Saenz, Lewis, Huth, Fine, & Koch, 2008).

In addition to effects on temporal and motion perception, a brief period of early visual

deprivation has also been shown to affect the attentional balance between vision and audition for

lateralized stimuli (de Heering et al., 2016). Compared to normally sighted individuals, adults

with a history of early visual deprivation had faster reaction times on auditory trials, and on trials

requiring an attentional switch from vision to audition (de Heering et al., 2016). These findings

suggest that auditory signals command greater attentional salience in individuals with a remote

history of brief visual deprivation.

1.4.2 Audiovisual Speech Perception

Studies of audiovisual speech perception using McGurk effect paradigms in humans with

amblyopia consistently show abnormally low susceptibility to the illusion compared to visually

normal controls (Burgmeier et al., 2015; Narinesingh et al., 2015; Narinesingh et al., 2014;

Putzar, Hötting, & Röder, 2010). An early study examined adults who had been treated for

bilateral congenital cataracts before 17 months of age, and found that early visual deprivation

was associated with normal auditory speech perception, but poorer lip-reading ability, and

reduced susceptibility to the McGurk effect (Putzar, Hötting, et al., 2010). Interestingly, this

study included a small subgroup of adults who acquired significant visual impairment after 5

years of age who also showed a reduced McGurk effect, but normal lip-reading abilities. At face

value, this finding suggests that later-onset visual impairment can also impair audiovisual

integration. The authors noted, however, that most of the participants in the acquired visual

impairment group also had mild visual impairments since birth, raising the possibility that early

visual impairments more subtle than bilateral cataracts may also interfere with multisensory

development.

These behavioural findings have been followed up with functional neuroimaging using similar

audiovisual speech paradigms. Unlike visually normal controls, participants with a history of

early transient bilateral visual deprivation lacked fMRI evidence of audiovisual integration in the

primary and higher auditory cortices and STS, and did not exhibit response enhancement in

higher-order visual areas (Calvert et al., 1999; Calvert et al., 2000; Guerreiro, Putzar, & Röder,

2015). Another fMRI study in a similar clinical population showed enhanced responses to

auditory stimuli in occipital visual cortex, indicating that early visual deprivation causes

persistent cross-modal reorganization of unisensory cortical areas (Collignon et al., 2015).

Several further studies examined the McGurk effect in the non-deprivational forms of

amblyopia. Similar to the findings in adults with early bilateral visual deprivation, children and

adults with unilateral anisometropic, strabismic, and mixed mechanism amblyopia also

demonstrated reduced susceptibility to the McGurk effect (Burgmeier et al., 2015; Narinesingh et

al., 2015; Narinesingh et al., 2014). Importantly, this audiovisual perceptual anomaly persisted

under amblyopic eye, fellow eye, and binocular viewing conditions, meaning that visual blur

cannot account for the multisensory abnormality (Narinesingh et al., 2014). Despite a clear

relation between perception of the McGurk effect and amblyopia, no study to date has found an

association between susceptibility to the McGurk effect and stereo acuity or visual acuity in the

amblyopic eye (Burgmeier et al., 2015; Narinesingh et al., 2015; Narinesingh et al., 2014).

However, one study which recorded the ages of onset and resolution of amblyopia found a

reduced McGurk effect only in children whose amblyopia remained unresolved past 5 years of

age; resolution by 5 years or onset after 5 years was associated with a normal McGurk response

(Burgmeier et al., 2015). This finding suggests the existence of a sensitive period for normal

integration of audiovisual speech signals during the first 5 years of life.

1.5 Summary

Multisensory processing and integration are ubiquitous, adaptive functions that are central to our

perception of the external world. Information encoded by different peripheral sense organs is not

only combined to enrich perceptual representations, but also interacts at multiple downstream

sites to alter the quality and calibration of perceptual functions. In the mature sensory system,

multisensory combination and integration can improve the reliability of perceptual responses to,

for example, audiovisual speech. In the developing sensory system, cross-modal interactions

influence the calibration of unisensory functions, such as sound localization. Amblyopia is a

common developmental diagnosis that has been systematically investigated as a unisensory

visual impairment. However, its effects are increasingly recognized to extend beyond vision to

the multisensory domain. Indeed, amblyopia is associated with altered cross-modal interactions

in audiovisual speech perception and audiovisual temporal processing. Knowledge of the extent

and mechanisms of the audiovisual impairments in amblyopia, however, remains in its infancy.

More study is therefore needed to understand the nature of the audiovisual processing and

integration deficits in amblyopia, the mechanisms underlying these deficits, and their relation to

conventional clinical diagnosis and treatment.

Chapter 2 Study Aims and Hypotheses

2.1 General Rationale and Research Aims

Multisensory processing and integration play fundamental roles in human perception, behaviour,

learning, and developmental sensory calibration. Amblyopia, a common developmental visual

impairment, is associated with abnormalities in multisensory processing—particularly in

audiovisual speech perception and temporal judgment tasks. Several issues remain unresolved,

however. Despite spatial vision being the most prominent area of deficit in amblyopia,

audiovisual processing in the spatial domain has not been investigated in amblyopia. More

critically, the underlying basis for multisensory abnormalities in amblyopia remains unclear.

Indeed, because multisensory processing encompasses both integrative and non-integrative

functions modulated by complex interactions involving temporal, spatial, and identity

correspondence, unambiguous conclusions about the underlying mechanisms are difficult to

draw from prior scientific evidence. The clinically relevant consequences of these audiovisual

processing abnormalities, and the adequacy of current amblyopia therapies in addressing them,

also remain unknown.

The general aim of this thesis is to characterize the extent of audiovisual spatial and temporal

processing and integration abnormalities in adult humans with amblyopia. Toward this goal, four

studies were conducted to assess multisensory processing of simple audiovisual stimuli in

participants with the commonest forms of unilateral amblyopia (anisometropic, strabismic, and

mixed mechanism).

2.2 Specific Study Aims and Hypotheses

2.2.1 Audiovisual Spatial Perception

2.2.1.1 Study I: Optimal Audiovisual Integration in the Ventriloquism Effect but Pervasive Deficits in Unisensory Spatial Localization in Amblyopia

The majority of data indicating abnormalities in multisensory processing in amblyopia comes

from studies of audiovisual speech perception (Burgmeier et al., 2015; Narinesingh et al., 2015;

Narinesingh et al., 2014; Putzar, Hötting, et al., 2010). Audiovisual integration in the spatial

dimension has not been examined previously in humans with amblyopia. This study addresses

this knowledge gap by examining the spatial ventriloquism effect in humans with unilateral

amblyopia, comparing their performance to visually normal controls, and to an ideal observer

based on the MLE model of optimal multisensory integration (Alais & Burr, 2004).

The specific aims of this study are:

1) To measure the precision of unisensory (visual and auditory) and multisensory

(audiovisual) spatial localization in participants with amblyopia under real-world

binocular viewing conditions.

2) To measure the perceptual weights of vision and audition in a ventriloquism effect

paradigm under real-world binocular viewing conditions.

3) To determine if amblyopia is associated with optimal audiovisual spatial integration

according to the MLE model.

The specific hypotheses of this study are:

1) Participants with amblyopia will show reduced precision of visual localization,

normal precision of auditory localization, and reduced precision of audiovisual

localization.

The known binocular deficits in spatial vision, spatial distortion, and positional

uncertainty (reviewed in section 1.1.3) are expected to manifest as poorer performance in

unisensory visual localization. Because vision typically dominates in audiovisual spatial

localization judgments (reviewed in section 1.3.5), reduced localization precision is also

expected for audiovisual (i.e., bimodal) stimuli. Although auditory spatial localization is

influenced by early visual experience (reviewed in section 1.2.2.5), deficits in sound

localization have only been demonstrated previously in severe bilateral early visual

impairment. Accordingly, unisensory auditory localization is expected to be within

normal limits in participants with amblyopia.

2) Participants with amblyopia will weight audition more heavily than visually normal

controls in the ventriloquism effect paradigm.

As above, visual spatial localization precision is expected to be reduced, whereas

auditory spatial localization precision is expected to be normal. The fused audiovisual

percept in the ventriloquism effect is therefore expected to reflect this differential effect

on vision and audition, with audition being weighted relatively more by participants with

amblyopia compared to visually typical controls.

3) Participants with amblyopia will integrate visual and auditory spatial signals

optimally according to the MLE model.

Optimal multisensory integration is widely thought to develop late relative to unisensory

perceptual abilities (reviewed in sections 1.3.6 and 1.3.7). Amblyogenic factors, in

contrast, exert their influences on the visual system primarily in early childhood

(reviewed in section 1.1.5). Because optimal integration likely emerges after amblyopic

visual deficits have developed, audiovisual spatial perception in amblyopia is expected to

obey the MLE model.

2.2.1.2 Study II: Amblyopia and the Developmental Calibration of Sound Localization

Early visual experience is known to influence the development of sound localization abilities and

to alter the neural representation of auditory space in the superior colliculus (reviewed in section

1.2.2.5). This study was designed to further investigate the effect of unilateral amblyopia on

sound localization suggested by the findings of Study I (see section 3).

The specific aims of this study are:

1) To measure the precision of relative sound localization in the horizontal plane (i.e., the

minimum audible angle, or MAA).

2) To measure the accuracy of absolute sound localization in the horizontal plane.

The specific hypotheses of this study are:

1) Participants with amblyopia will have a wider MAA than visually normal controls.

The sensitive periods for visual development (reviewed in section 1.1.5) and the period

for normal development of sound localization (reviewed in section 1.2.2.4) overlap.

Furthermore, abnormal early visual experience is known to affect the developmental

calibration of sound localization abilities (reviewed in section 1.2.2.5). Given that visual

positional uncertainty affects both the amblyopic and fellow eyes (reviewed in section

1.1.3), and based on the unexpected finding of reduced sound localization precision in

Study I (see section 3), participants with amblyopia are expected to demonstrate an

abnormally wide MAA.

2) Participants with amblyopia will localize sounds less accurately than visually

normal controls.

Visual spatial distortions affect not only the amblyopic eye, but also the fellow eye in

humans with amblyopia (reviewed in section 1.1.3). Based on the same reasoning

outlined in the explanation for hypothesis (1), participants with amblyopia are expected

to localize sounds less accurately than visually normal controls.

2.2.2 Audiovisual Temporal Perception

2.2.2.1 Study III: Alterations in Audiovisual Simultaneity Perception in Amblyopia

Evidence suggests that audiovisual simultaneity perception is based on a non-integrative

mechanism of cross-modal matching (Fujisaki & Nishida, 2005). However, the width of the

simultaneity window correlates with the strength of multisensory integration in the McGurk

effect (Stevenson, Zemtsov, et al., 2012). In the most common subtypes of amblyopia

(anisometropic, strabismic, and mixed mechanism), reduced audiovisual integration measured by

the McGurk effect is well-documented (reviewed in section 1.4.2). Understanding audiovisual

simultaneity perception in amblyopia may therefore provide insight into the basis for a reduced

McGurk effect. Although abnormal perception of audiovisual simultaneity has been reported in

humans with the deprivational subtype of amblyopia (reviewed in section 1.4.1), it has not been

measured in the most common subtypes that exhibit a reduced McGurk effect.

The specific aim of this study is:

1) To measure the temporal window of audiovisual simultaneity over a range of signal onset

asynchronies (SOAs).

The specific hypotheses of this study are:

1) Participants with the most common subtypes of amblyopia will be more likely than

controls to perceive asynchronous audiovisual signals as simultaneous for both

visual-lead and auditory-lead SOAs (i.e., they will have a symmetrically widened

temporal window of audiovisual simultaneity).

Based on the known correlation between the width of the simultaneity window and

susceptibility to the McGurk effect in visually normal adults, as well as the symmetrically

widened simultaneity window previously observed in adults with unilateral deprivational

amblyopia, the clinical population in this study is expected to exhibit a symmetrically

widened simultaneity window.

2) The width of the audiovisual simultaneity window in participants with amblyopia

will not be dependent upon viewing condition.

Adults with deprivational amblyopia are known to have a widened simultaneity window

regardless of whether the amblyopic or fellow eye is viewing, suggesting a

developmental origin to the abnormality (Chen et al., 2017). Participants with unilateral

anisometropic, strabismic, and mixed mechanism amblyopia are expected to have a

symmetrically widened simultaneity window under all viewing conditions, similar to the

known behaviour of adults with unilateral deprivational amblyopia.

2.2.2.2 Study IV: Temporal Ventriloquism Reveals Normal Audiovisual Temporal Integration in Amblyopia

While some hypothesize that the width of the audiovisual simultaneity window reflects the capacity

to encode and compare unisensory temporal features (i.e., non-integrative functions) (Fujisaki &

Nishida, 2005), others suggest that the simultaneity window may reflect abnormalities in the

capacity for audiovisual integration (Chen et al., 2017). This study was designed to distinguish

these possibilities by investigating the temporal ventriloquism effect—a phenomenon in which

audiovisual integration normally enhances temporal resolution on a visual temporal order

judgment (TOJ) task.

The specific aims of this study are:

1) To measure temporal resolution for a visual TOJ task with and without paired auditory

stimuli that normally elicit enhancement by the temporal ventriloquism effect.

2) To measure the temporal window of perceptual binding for the temporal ventriloquism

effect.

The specific hypotheses of this study are:

1) Participants with amblyopia will exhibit enhanced visual TOJ in the temporal

ventriloquism effect, consistent with an intact mechanism for audiovisual temporal

integration.

Evidence suggests that multisensory integration develops late in humans (reviewed in

sections 1.3.6 and 1.3.7), after the typical sensitive period for the development of

amblyopia (reviewed in section 1.1.5). In the temporal domain, audition offers greater

precision and typically dominates over vision (see sections 1.3.4, 1.3.5.2, and 1.3.7). On

these bases, participants with amblyopia are expected to exhibit multisensory

enhancement in visual TOJ performance consistent with the temporal ventriloquism

effect.

2) Participants with amblyopia will exhibit perceptual binding (i.e., multisensory

enhancement) in the temporal ventriloquism effect over a wider interval of SOAs

compared to visually normal controls.

Consistent with a widened temporal window of audiovisual simultaneity, and a widened

window of perceptual binding in the sound-induced flash illusion (reviewed in section

1.4.1), participants with amblyopia are expected to exhibit perceptual binding in the

temporal ventriloquism effect over a wider range of SOAs compared to visually normal

adults.

Chapter 3 Study I

Study I: Optimal Audiovisual Integration in the Ventriloquism Effect but Pervasive Deficits in Unisensory Spatial Localization in Amblyopia

3.1 Abstract

Purpose: Classically understood as a deficit in spatial vision, amblyopia is increasingly

recognized to also impair audiovisual multisensory processing. Studies to date, however, have

not determined whether the audiovisual abnormalities reflect a failure of multisensory

integration, or an optimal strategy in the face of unisensory impairment. We use the

ventriloquism effect and the maximum likelihood estimation (MLE) model of optimal

integration to investigate integration of audiovisual spatial information in amblyopia.

Methods: Fourteen participants with unilateral amblyopia and 16 visually normal controls

localized brief auditory-only, visual-only, and combined audiovisual stimuli during binocular

viewing using a location discrimination task. A subset of combined audiovisual trials involved

the ventriloquism effect, an illusion in which auditory and visual stimuli originating from

different locations are perceived as a unified event from a single location. Localization precision

and bias were determined by psychometric curve fitting, and the observed parameters were

compared to predictions from the MLE model.

Results: Spatial localization precision was significantly reduced in the amblyopia group for

visual-only, auditory-only, and combined audiovisual stimuli, compared to the control group.

Analyses of localization precision and bias for combined audiovisual stimuli showed no

significant deviations from the MLE model in either the amblyopia or control group.

Conclusions: Despite pervasive deficits in localization precision for visual, auditory, and

audiovisual stimuli, audiovisual spatial integration remains intact and optimal in unilateral

amblyopia.

3.2 Introduction

Amblyopia is a neurodevelopmental visual disorder that affects 2–4% of the population (Birch,

2013). Beyond its widely known effects on vision (McKee et al., 2003), emerging research

indicates that amblyopia also involves a range of abnormalities in multisensory processing. For

example, even when viewing with both eyes, people with unilateral amblyopia show reduced

susceptibility to the McGurk effect (Burgmeier et al., 2015; Narinesingh et al., 2015;

Narinesingh et al., 2014), diminished ability to perceive asynchrony between auditory and visual

stimuli (Chen et al., 2017; Richards, Goltz, & Wong, 2017b), and a

widened temporal binding window for the sound-induced flash illusion (Narinesingh et al.,

2017).

While it is clear that early visual experience is necessary for the normal development of many

multisensory processes (Hötting & Röder, 2009; Putzar et al., 2007; Röder, Rösler, & Spence,

2004; Wallace et al., 2004), it is less clear whether the multisensory abnormalities in amblyopia

represent a failure to integrate the available unisensory information, or appropriate integration of

the available, but deficient, unisensory information. Difficulty in answering this question arises

for several reasons. First, there is often ambiguity surrounding which phenomena constitute

multisensory integration. Several prominent investigators in the field define multisensory

integration as “the neural process by which unisensory signals are combined to [produce]… a

multisensory response (neural or behavioral) that is significantly different from the responses

evoked by the modality-specific component stimuli” (Stein et al., 2010). The McGurk effect,

which often elicits a multisensory percept that is distinct from the auditory and visual stimuli, fits

the above definition well. However, the nature of audiovisual asynchrony detection is more

ambiguous—it may plausibly be underpinned by a cross-modal matching process rather than

integration, but empirically, it is correlated with other indices of audiovisual integration (Chen et

al., 2017; Stevenson, Zemtsov, et al., 2012). Second, difficulty in distinguishing between a

failure of integration and a deficiency in the unisensory components being integrated arises

because we lack a model of how specific features of the unisensory components (such as spatial,

temporal, and semantic content) determine what is perceived at the multisensory level.

A paradigm to study this question in amblyopia is provided by the ventriloquism effect (Howard

& Templeton, 1966). The ventriloquism effect is an audiovisual illusion in which spatially

disparate visual and auditory stimuli are perceived as originating from the same location.

Typically, the location information of the visual unisensory component dominates in the

perceived location of the fused audiovisual percept, a process sometimes termed visual capture

(Welch & Warren, 1980). Alais and Burr (2004), however, showed that by blurring the visual

stimulus (i.e., modulating its spatial reliability or precision), the perceptual dominance of vision

over audition can be diminished or even reversed. Critically, they demonstrated that the location

and spatial precision of the multisensory percept can be predicted from the location and spatial

precision of the unimodal components using the maximum likelihood estimation (MLE) model

of optimal combination. Therefore, the MLE model of the ventriloquism effect offers a powerful

methodology to disentangle the relative contributions of unisensory impairment and integration

failure to the multisensory abnormalities observed in amblyopia.

The MLE model has been put forward by several groups as a model for multisensory integration

of spatial information involving vision (Alais & Burr, 2004; Ernst & Banks, 2002; Moro, Harris,

& Steeves, 2014). For the ventriloquism effect, the MLE model predicts that the perceived

location of a bimodal event will be the weighted average of the locations of the unimodal events,

such that:

�� = �� + �� (1)

where �� and �� are the unisensory localization estimates for vision and audition, �� and �� are

the perceptual weights for vision and audition, and �� is the resultant bimodal localization

estimate. The perceptual weights, �� and ��, sum to 1, and are proportional the variances of the

unisensory localization estimates, �� and ��, such that:

�� = �� + ��

(2)

And

�� = �� + ��

(3)

69

Experimentally, localization variance can be estimated from the psychometric curve fit to the

unimodal localization data. The combination of unisensory localization estimates in the MLE

model is mathematically optimal in that it results in a bimodal localization estimate with the

lowest possible variance (i.e. highest possible precision):

$$\sigma_{VA}^2 = \frac{\sigma_V^2 \sigma_A^2}{\sigma_V^2 + \sigma_A^2} \le \min(\sigma_V^2, \sigma_A^2) \tag{4}$$

If the psychometric response is represented by a cumulative normal function, the variance of the

function is inversely related to the maximum slope, $\beta'$, at the inflection point of the curve:

$$\beta' = \frac{1}{\sqrt{2\pi}} \cdot \frac{1}{\sqrt{\sigma^2}} \tag{5}$$

Therefore, following from Equations (4) and (5), the MLE model predicts that spatial

localization precision (represented by $\beta'$) is always greater for the bimodal event than for its

unisensory components, and that the bimodal advantage in spatial localization precision is

greatest when the precisions of the unisensory components are equal.
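As a minimal sketch of Equations 1 to 5, the following Python snippet computes the MLE-predicted weights, bimodal estimate, bimodal variance, and maximum slope from a pair of unisensory estimates. The function names and the example numbers are hypothetical and chosen only for illustration; they are not values or code from the study.

```python
import math


def mle_weights(var_v, var_a):
    """Perceptual weights (Equations 2 and 3): each weight is the other
    modality's variance divided by the summed variance, so they sum to 1."""
    total = var_v + var_a
    return var_a / total, var_v / total


def mle_combine(s_v, s_a, var_v, var_a):
    """Predicted bimodal estimate (Equation 1) and variance (Equation 4)."""
    w_v, w_a = mle_weights(var_v, var_a)
    s_va = w_v * s_v + w_a * s_a
    var_va = (var_v * var_a) / (var_v + var_a)
    return s_va, var_va


def max_slope(variance):
    """Maximum slope of a cumulative normal with this variance (Equation 5)."""
    return 1.0 / math.sqrt(2.0 * math.pi * variance)


# Hypothetical example: vision at +4 deg, audition at -4 deg, equal variances.
estimate, variance = mle_combine(4.0, -4.0, var_v=9.0, var_a=9.0)
print(estimate, variance, max_slope(variance))
```

With equal unisensory variances the two weights are equal, the predicted bimodal estimate falls midway between the two unisensory estimates, and the predicted variance is half the unisensory variance.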

In the present report, we employ the ventriloquism effect and predictions of the MLE model to

investigate integration of audiovisual spatial information in amblyopia.

3.3 Methods

3.3.1 Participants

All participants were adults with no visual disorders other than amblyopia, strabismus, or

refractive error. Participants were excluded if they had a history of neurodevelopmental or

neurological disorder, hearing impairment, high ametropia (hyperopia > +5D or myopia > -6D),

or any other ocular pathology or prior intraocular surgery. Each participant underwent ocular and

hearing assessment by a certified orthoptist or ophthalmologist. The ocular assessment

documented distance visual acuity with correction (ETDRS chart), stereo acuity (Randot circles

and Titmus fly test), foveal suppression (Worth 4-dot test), eye alignment, and refractive

correction. The hearing assessment (Student Support Services Team, 2008) ensured reliable

detection of suprathreshold pure tones (25 dBA sound pressure level) in each ear at four standard

frequencies (500, 1000, 2000, and 4000 Hz) using a screening audiometer (model MA 27,

MAICO Diagnostics, Eden Prairie, MN, USA) with circumaural headphones (model TDH 39,

MAICO Diagnostics, Eden Prairie, MN, USA). Amblyopia was defined as visual acuity of 0.18

logMAR or poorer in the amblyopic eye, and an interocular acuity difference of 0.2 logMAR or

more. Participants were classified as having anisometropic amblyopia if the interocular

difference in spherical equivalent or cylindrical error was 1 diopter (D) or more, as having

strabismic amblyopia if there was any manifest deviation on cover testing in the absence of

anisometropia, or as having mixed-mechanism amblyopia when both anisometropia and a

manifest deviation of 8 prism diopters or more were present. Visually normal was defined as

visual acuity of at least 0.1 logMAR (20/25) in each eye, with stereo acuity of 40 seconds of arc

or better, and no manifest strabismus. The study was approved by the Research Ethics Board at

The Hospital for Sick Children, and all protocols adhered to the tenets of the Declaration of

Helsinki. Written informed consent was obtained from all participants after explanation of the

nature and possible consequences of the study.

Fourteen adults with unilateral amblyopia (mean age: 28.8 years; age range: 19–48) and 16

visually normal controls (mean age: 29.2 years; age range: 23–47) participated in the study.

Clinical characteristics of the participants with amblyopia are summarized in Table 3.1.

Table 3.1: Characteristics of participants with amblyopia

| Participant | Age (years) | Subtype | Visual acuity RE (logMAR) | Visual acuity LE (logMAR) | Refractive correction RE | Refractive correction LE | Stereo acuity (arc sec) | Worth 4-dot response |
|---|---|---|---|---|---|---|---|---|
| A1 | 29 | Strab | 0.00 | 1.00 | None | None | Not measurable | LE suppressed |
| A2 | 22 | Aniso | 0.00 | 0.48 | -1.50 +0.50 x 80 | +1.00 +1.25 x 95 | 200 | Fused |
| A3 | 48 | Aniso | 0.70 | 0.00 | +2.25 +0.25 x 174 | -0.75 | 3000 | Fused |
| A4 | 29 | Aniso | 0.48 | -0.10 | -5.00 | -1.25 | 3000 | Fused |
| A5 | 23 | Aniso | -0.10 | 0.48 | -2.25 | +0.25 +2.25 x 85 | 200 | Fused |
| A6 | 29 | Mixed | 0.00 | 1.00 | Plano | +3.50 +2.00 x 90 | Not measurable | LE suppressed |
| A7 | 19 | Aniso | 0.00 | 0.18 | -0.75 +2.00 x 84 | -2.75 +4.50 x 99 | 40 | Fused |
| A8 | 27 | Strab | 0.00 | 0.48 | -6.25 +1.00 x 45 | -5.50 +1.25 x 135 | 200 | Fused |
| A9 | 37 | Mixed | -0.10 | 1.30 | -1.00 | +6.00 +2.50 x 120 | Not measurable | LE suppressed |
| A10 | 32 | Aniso | -0.10 | 0.54 | Plano | +2.00 +2.00 x 124 | 140 | Fused |
| A11 | 23 | Strab | 0.20 | 0.00 | +0.50 +0.50 x 28 | +1.25 +0.50 x 88 | Not measurable | Diplopic |
| A12 | 44 | Mixed | 0.90 | 0.00 | +6.00 +1.25 x 75 | -0.75 | Not measurable | RE suppressed |
| A13 | 22 | Aniso | 1.10 | -0.10 | -6.00 +0.75 x 174 | -4.50 +0.50 x 75 | 3000 | Fused |
| A14 | 19 | Mixed | 0.48 | 0.00 | +3.00 +1.00 x 130 | +4.25 | 3000 | Fused |

Abbreviations: RE, right eye; LE, left eye; Aniso, anisometropic; Strab, strabismic; Mixed, mixed mechanism.

3.3.2 Apparatus and Stimuli

The entire experiment was conducted in a darkened, double-walled audiometric chamber

(internal dimensions 2.0 x 2.1 x 2.2 m). The floor was carpeted, and the walls and ceiling were

lined with 5 cm acoustic wedge foam (Foam Factory, Macomb, MI, USA). Head position was

constrained by a chinrest fixed to a small table 65 cm from the centre of the audiovisual

apparatus, as shown in Figure 3.1.

Figure 3.1: Audiovisual apparatus for the presentation of visual blobs and auditory clicks.

Virtual sound sources were generated at locations on the horizontal axis between the two physical

speakers by linear amplitude panning. The speakers were driven by coherent signals of

independently variable amplitude, such that the signal amplitude gain to the right and left

speakers always summed to 1.
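As a rough illustration of linear amplitude panning with gains that sum to 1, the sketch below maps a desired virtual source position to left and right speaker gains. The linear mapping from position to gain, the function name, and the speaker positions in the example are assumptions made for illustration; the text does not give the calibration used in the study.

```python
def panning_gains(target_deg, left_deg, right_deg):
    """Linear amplitude panning between two speakers.

    Returns (gain_left, gain_right). The gains vary linearly with the
    target position and always sum to 1.
    """
    if not left_deg < right_deg:
        raise ValueError("left speaker must be left of the right speaker")
    if not left_deg <= target_deg <= right_deg:
        raise ValueError("target must lie between the two speakers")
    gain_right = (target_deg - left_deg) / (right_deg - left_deg)
    gain_left = 1.0 - gain_right
    return gain_left, gain_right


# Hypothetical speaker positions at -40 and +40 deg; a central virtual source
# receives equal gain from both speakers.
print(panning_gains(0.0, -40.0, 40.0))
```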

Visual stimuli consisted of medium contrast (39%) Gaussian blobs of 5 size variants (1 SD =

16°, 20°, 24°, 28°, or 32°), flashed for 33 ms on a large LED monitor (model E654, NEC

Corporation, Tokyo, Japan) subtending 96° x 64° of visual angle (165 cm diagonal). The monitor

was overlaid with an ND 0.9 filter to create a high-quality Gaussian blob (peak luminance 2.1

cd/m²) with imperceptible steps between gray levels. Auditory stimuli consisted of 32 ms clicks

(8 cycles of 2–5 kHz bandpass filtered white noise, 4 ms in duration, enveloped with a 2 ms

sigmoid on/off ramp), presented at 62.0 dBA through two speakers mounted on either side of the

monitor on the horizontal midline. Apparent click location was controlled by linear amplitude

panning of interaural level difference (ILD) cues (Pulkki, 2001; Warncke, 1941). The output

profile for each speaker was measured across the entire stimulus dynamic range using a sound

level meter to ensure their outputs were identical. Auditory and visual stimulus timings were

confirmed with an oscilloscope. A wireless gamepad was used to initiate trials and enter

responses.

3.3.3 Procedure

All trials were conducted with both eyes open. Participants performed a relative spatial

localization task for unimodal stimuli (visual blobs only or auditory clicks only) and bimodal

stimuli (blobs and clicks together) similar to that described by Alais and Burr (2004). A general

trial timeline is illustrated in Figure 3.2. Upon initiation of each trial, a red

fixation dot (0.66°) was presented centrally for 500 ms, followed by a randomized delay of 250–

400 ms. Two matching stimuli (a test stimulus and probe stimulus, but in random order) were

then presented in succession, 500 ms apart, and the participant was asked to “indicate whether

the second event occurred left or right relative to the first event”. Participants were instructed to

keep their head and eyes aligned centrally. There were 21 test stimulus conditions: 6 unimodal (1

click-only and 5 blob size variants), 5 bimodal with spatially congruent clicks and blobs (5 blob

size variants paired with a click), and 10 bimodal with spatially conflicting clicks and blobs (5

blob size variants paired with a click, but blob displaced 4° left and click displaced 4° right, or

click displaced 4° left and blob displaced 4° right). Bimodal test stimulus conditions with spatial

conflict were designed to elicit the ventriloquism effect, and participants were not told of this

spatial disparity. The test stimulus was presented centrally (0°) in all trials; for bimodal test

stimuli with spatial conflict, the unimodal components were displaced 4° in opposite directions

such that their average location was still 0°. The probe stimulus matched the characteristics

of the test stimulus except for horizontal displacement (specified in Table 3.2), and in some

cases, spatial congruency (bimodal test stimuli with spatial conflict were paired with spatially

congruent probe stimuli). Data were collected in separate blocks for the unimodal auditory

conditions, and for each of the 5 blob sizes within the unimodal visual and bimodal conditions.

Twenty trials were run for each probe stimulus displacement, randomly interleaved within each

block.

Figure 3.2: Illustration of the trial timeline. After trial initiation by the participant, a fixation

dot appeared centrally for 500 ms, followed by a dark interval of 250–400 ms. Two brief stimuli

(test and probe) were displayed in sequence, 500 ms apart, but in random order. The participant

judged whether the second stimulus originated left or right relative to the first.

Table 3.2: Probe stimulus displacements used for each test stimulus condition

| Stimulus condition | Probe stimulus displacements (°) |
|---|---|
| Click only | -15, -12, -9, -6, -3, 3, 6, 9, 12, 15 |
| 16°/SD blob ± click | -8, -6, -4, -2, 2, 4, 6, 8 |
| 20°/SD blob ± click | -10, -7.5, -5, -2.5, 2.5, 5, 7.5, 10 |
| 24°/SD blob ± click | -12, -9, -6, -3, 3, 6, 9, 12 |
| 28°/SD blob ± click | -14, -10.5, -7, -3.5, 3.5, 7, 10.5, 14 |
| 32°/SD blob ± click | -16, -12, -8, -4, 4, 8, 12, 16 |

N.B.: negative displacement = leftward, positive displacement = rightward, SD = standard deviation
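One way to read Table 3.2 is that the probe displacements for the blob conditions scale with blob size: each set equals ±1, ±2, ±3, and ±4 steps of one eighth of the blob SD. This scaling rule is an observation drawn from the tabulated values, not a rule stated in the text, and the function name below is hypothetical.

```python
def blob_probe_displacements(blob_sd_deg, steps=4):
    """Probe displacements (deg) as +/- multiples of one eighth of the blob SD."""
    step = blob_sd_deg / 8.0
    negative = [-step * k for k in range(steps, 0, -1)]
    positive = [step * k for k in range(1, steps + 1)]
    return negative + positive


# Reproduce the blob rows of Table 3.2.
for sd in (16, 20, 24, 28, 32):
    print(sd, blob_probe_displacements(sd))
```

The click-only row does not follow this rule; its displacements are multiples of 3° up to ±15°.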

3.3.4 Data Analysis

The proportion of ‘probe stimulus perceived left’ responses was calculated for each probe

displacement, and the data were fit with a cumulative normal function by the maximum

likelihood method. The mean of the function is the point of subjective equality (PSE) and

represents the localization estimate of the test stimulus ($\hat{S}$ in Equation 1). The standard deviation

of the function, $\sigma$, is related to the precision of the localization estimate, $\beta'$, as described by

Equation 5. For all unimodal conditions and spatially congruent bimodal conditions, the curve fit

was constrained such that PSE = 0° to avoid falsely steep fits due to undersampling around the

mean. For all bimodal conditions with spatial conflict, the curve fit was unconstrained, as

variation in the PSE was of primary interest. As is common in psychophysical methodology, the

maximum slope of the psychometric function, $\beta'$, was taken as the measure of localization

precision, and was calculated from $\sigma$ values using Equation 5 (Strasburger, 2001). All $\beta'$ values

were subsequently log10 transformed to achieve linearity and equality of variances required for

statistical analysis. The assumption of equality of variances was met by Levene’s test for all

between-group t-tests, analyses of variance (ANOVAs), and analyses of covariance

(ANCOVAs), and by Mauchly’s test of sphericity for all repeated measures ANOVAs. The

assumption of homogeneity of regression was met for all ANCOVAs. All fitted functions and

parameters were calculated with custom-written scripts in MATLAB version 2011b (Mathworks,

Inc., Natick, MA, USA). All statistical tests were computed using IBM SPSS Statistics version

22 (Armonk, NY, USA). Statistical significance was defined as p < 0.05.
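The fitting step described above can be sketched in Python as follows. This is an illustrative re-implementation under stated assumptions, not the MATLAB scripts used in the study: the sign convention for the psychometric function, the golden-section search, the search bounds, and all function names are choices made here for the example.

```python
import math


def p_left(x, pse, sigma):
    """Probability that a probe at displacement x is judged left of the test.

    Assumed convention: negative x is leftward, so the probability falls
    as x increases.
    """
    return 0.5 * (1.0 + math.erf((pse - x) / (sigma * math.sqrt(2.0))))


def neg_log_likelihood(sigma, pse, displacements, n_left, n_trials):
    """Binomial negative log-likelihood of the observed response counts."""
    total = 0.0
    for x, k, n in zip(displacements, n_left, n_trials):
        p = min(max(p_left(x, pse, sigma), 1e-9), 1.0 - 1e-9)  # avoid log(0)
        total -= k * math.log(p) + (n - k) * math.log(1.0 - p)
    return total


def fit_sigma(displacements, n_left, n_trials, pse=0.0, lo=0.1, hi=100.0):
    """Maximum likelihood sigma with the PSE held fixed (0 by default),
    found by golden-section search over log(sigma)."""
    def cost(t):
        return neg_log_likelihood(math.exp(t), pse, displacements, n_left, n_trials)

    a, b = math.log(lo), math.log(hi)
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(100):
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if cost(c) < cost(d):
            b = d
        else:
            a = c
    return math.exp((a + b) / 2.0)


def log10_precision(sigma):
    """log10 of the maximum slope given by Equation 5."""
    return math.log10(1.0 / (sigma * math.sqrt(2.0 * math.pi)))
```

For the spatial-conflict conditions, where the PSE is left unconstrained, the same likelihood would be minimized over both the PSE and sigma; that two-parameter search is omitted here for brevity.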

3.4 Results

Mean psychometric data for the unimodal and bimodal localization tasks for the visually normal

control and amblyopia groups are shown in Figure 3.3. Subsequent analyses of localization

precision, perceptual weight by modality, and agreement with the MLE model are reported in the

sections below.

3.4.1 Localization Performance

3.4.1.1 Localization Precision for Unimodal Stimuli

Localization precision, defined as the slope of the fitted psychometric function at the midpoint,

decreased monotonically in both groups for unimodal visual stimuli as the blob size increased

from 16° to 32° (Figure 3.3A, B; Figure 3.4A). A one-way ANCOVA controlling for the

covariate of blob size showed that unimodal visual localization precision was significantly

poorer in the amblyopia group compared to the control group (F(1,147) = 7.542, p = 0.007).

Surprisingly, unimodal auditory localization precision (Figure 3.3A, B; Figure 3.4B) was also

significantly reduced in the amblyopia group compared to the control group (t(28) = 2.138, p =

0.041) (Wong, Richards, & Goltz, 2017).

3.4.1.2 Localization Precision for Spatially Congruent Bimodal Stimuli

Localization precision for spatially congruent bimodal stimuli decreased monotonically in both

the control group and amblyopia group as the blob size increased from 16° to 28° (Figure 3.3C,

D; Figure 3.4C). The flattening of the relation at large blob sizes is likely a ceiling effect

imposed by the higher precision of the auditory stimulus. A one-way ANCOVA controlling for

the covariate of blob size showed that bimodal localization precision was significantly lower in

the amblyopia group compared to the control group (F(1,147) = 21.407, p < 0.001).

3.4.1.3 Localization Bias for Spatially Conflicted Bimodal Stimuli

Localization performance for bimodal stimuli with spatial conflict is illustrated for the control

group (Figure 3.3E, G) and the amblyopia group (Figure 3.3F, H). In these trials, the visual and

auditory unimodal components were displaced 4° in opposite directions from centre to elicit a

ventriloquism effect. Localization bias, or PSE, was computed for both conflict conditions (i.e.,

blob -4° and blob 4°) at each blob size for every individual. Results from the two conflict

conditions were subsequently pooled, however, as a 2 x 5 two-way repeated measures ANOVA

for the effect of conflict condition and blob size on PSE showed no significant effect of conflict

condition for the control group (F(1, 15) = 0.218, p = 0.647) or the amblyopia group (F(1,13) =

1.694, p = 0.215). Both groups showed a monotonic progression in PSE from a vision-dominant

localization to an audition-dominant localization as the blob size increased (Figure 3.5). A one-way

ANCOVA controlling for the covariate of blob size showed no significant difference in PSE

between the two groups (F(1,297) = 3.003, p = 0.084).

Figure 3.3: Unimodal and bimodal localization task performance. Data are shown for the

control group (A, C, E, G) and the amblyopia group (B, D, F, H). Symbols represent the mean

proportion of trials in which a probe stimulus was perceived leftward of a test stimulus. Visual

stimuli were Gaussian blobs of specific sizes (1 SD = 16°, red; 1 SD = 20°, orange; 1 SD = 24°,

green; 1 SD = 28°, blue; 1 SD = 32°, purple), and auditory stimuli were white noise clicks. (A,

B) Mean psychometric data for localization of unimodal visual (rainbow symbols and solid lines)

and unimodal auditory test stimuli (black symbols and dashed lines) centred at 0°. (C, D) Mean

psychometric data for localization of bimodal test stimuli whose unimodal components were

central and spatially congruent (i.e. blob and click both centred at 0°). (E–H) Mean psychometric

data for localization of bimodal stimuli whose unimodal components are in spatial conflict (i.e.,

symmetrically displaced about 0°). (E, F) Bimodal conflict conditions with blobs centred 4° left

and clicks centred 4° right. (G, H) Conflict conditions with clicks centred 4° left and blobs

centred 4° right.

Figure 3.4: Localization precision for visual-only, auditory-only, and spatially congruent

bimodal audiovisual stimuli. Control group data are shown in blue and amblyopia group data

are shown in red. Localization precision (i.e., psychometric function slope) values were log10

transformed to equalize variances and linearize the relation between localization precision and

blob size. Error bars represent ±1 SEM. For (A) visual-only stimuli, (B) auditory-only stimuli,

and (C) spatially congruent bimodal stimuli, the control and amblyopia groups differed

significantly after controlling for any differences in blob size (* p < 0.05, **p < 0.01).

Figure 3.5: Bimodal localization bias for audiovisual stimuli with spatial conflict. Control

group data are shown in blue, and amblyopia group data are shown in red. Error bars represent

±1 SEM. The unimodal components (visual blob and auditory click) were horizontally displaced

4° in opposite directions from centre, such that their average location was 0°. Positive PSE

values indicate locations toward the blob, and negative PSE values indicate locations toward the

click, and include results pooled from the two conflict conditions tested (i.e., blob left and click

right; click left and blob right). In both groups, the strength and direction of the ventriloquism

effect was modulated by visual blob size.

3.4.2 Testing the Maximum Likelihood Estimation Model

3.4.2.1 Observed Versus Predicted Localization Precision for Spatially Congruent Bimodal Stimuli

Agreement between the observed localization precision for spatially congruent bimodal stimuli

and the values predicted by the MLE model is illustrated for the control group (Figure 3.6A) and

amblyopia group (Figure 3.6B). For the control group, a 2-way repeated measures ANOVA

comparing observed and predicted bimodal localization precision across blob sizes showed no

significant interaction between factors (F(4,60) = 1.136, p = 0.348) and no significant deviation

from the MLE model (F(4,60) = 1.136, p = 0.348). The same 2-way repeated measures ANOVA

analysis in the amblyopia group showed no significant interaction between factors (F(4,52) =

2.293, p = 0.072), and no significant difference in localization precision as observed and as

predicted by the MLE model (F(1,13) = 3.671, p = 0.078).

Figure 3.6: Bimodal localization precision, as observed and as predicted by the MLE

model. Values were log10 transformed to equalize variances and linearize the relation between

localization precision and blob size. Error bars represent ±1 SEM. For (A) the control group

(shown in blue) and (B) the amblyopia group (shown in red), the observed bimodal localization

precision (solid lines) did not differ significantly from the predictions of the MLE model (dashed

lines).

According to the MLE model, audiovisual integration results in enhanced localization precision

for bimodal stimuli by optimal combination of the component unimodal spatial signals. In

complete integration failure, however, the best localization precision achievable is that of the

more precise unimodal signal. This distinction provides a test for integration in amblyopia.

Importantly, the MLE model also predicts that the bimodal enhancement in localization precision

is greatest, and therefore most detectable, when the localization precisions of the unimodal

components are equal (i.e., $\beta'_V = \beta'_A$) (see Equations 4 and 5). The bimodal localization

precision observed in this study was therefore compared to that expected with intact integration

(i.e., MLE-predicted value computed from unimodal component precisions) and with integration

failure (i.e., the most precise unimodal component) specifically for the condition in which the

unimodal components were most similar for each participant (Figure 3.7). For the control group,

a one-way repeated measures ANOVA showed a significant difference among the observed,

MLE-predicted, and best unimodal bimodal localization precisions (F(1.041,15.616) = 7.130, p =

0.016, Greenhouse-Geisser correction). As expected, post hoc multiple comparisons revealed a

significant difference between the observed bimodal localization precision and the best unimodal

localization precision (p = 0.017), but no significant difference between the observed bimodal

localization precision and the MLE-predicted values (p = 0.974), indicating that audiovisual

spatial integration was intact in the control group. For the amblyopia group, a one-way repeated

measures ANOVA showed a significant difference among the observed, MLE-predicted, and

best unimodal bimodal localization precisions (F(1.184,15.388) = 8.827, p = 0.007, Greenhouse-

Geisser correction). Post hoc multiple comparisons revealed a significant difference between the

observed bimodal localization precision and the best unimodal localization precision (p = 0.011),

but no significant difference between the observed bimodal localization precision and the MLE-

predicted values (p = 0.727), indicating that audiovisual spatial integration was intact in the

amblyopia group.
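The statement that the bimodal advantage is largest when the unimodal precisions are equal follows from Equations 4 and 5. A short derivation is sketched below, writing $\beta'_{VA}$ for the MLE-predicted bimodal slope and $r$ for the ratio of the poorer to the better unimodal slope; both symbols are introduced here for illustration.

```latex
\begin{align*}
\frac{1}{\sigma_{VA}^2} &= \frac{1}{\sigma_V^2} + \frac{1}{\sigma_A^2}
  && \text{(rearranging Equation 4)} \\
(\beta')^2 &= \frac{1}{2\pi\sigma^2}
  && \text{(squaring Equation 5)} \\
(\beta'_{VA})^2 &= (\beta'_V)^2 + (\beta'_A)^2 \\
\frac{\beta'_{VA}}{\max(\beta'_V, \beta'_A)} &= \sqrt{1 + r^2},
  \qquad r = \frac{\min(\beta'_V, \beta'_A)}{\max(\beta'_V, \beta'_A)} \in (0, 1]
\end{align*}
```

The ratio grows with $r$ and reaches its maximum of $\sqrt{2}$ (about a 41% gain in slope over the better unimodal component) when $r = 1$. Under integration failure the ratio is 1 at best, which is why the comparison is most sensitive in the condition where the two unimodal precisions are most similar.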

Figure 3.7: Maximal bimodal advantage ratio for localization precision, observed, as

predicted by the MLE model, and as predicted by integration failure. Error bars represent ±1

SEM. For (A) the control group and (B) the amblyopia group, the observed maximal bimodal

advantage ratio was consistent with intact integration as predicted by the MLE model, and

inconsistent with integration failure (i.e., best unimodal). *p < 0.05; n.s. = not significant.

3.4.2.2 Observed Versus Predicted Visual Perceptual Weight for Spatially Conflicted Bimodal Stimuli

The MLE model also makes predictions about the contribution of each modality to the perceived

location of a bimodal event when the unimodal components are in spatial conflict (Figure 3.3E–

H). The model predicts that the perceptual weights of vision, $w_V$, and audition, $w_A$, in a bimodal

percept are proportional to their unimodal localization precision (Figure 3.3A, B; Figure 3.4A,

B), according to Equations (1), (2) and (3) above. Agreement between the observed visual

perceptual weight, $w_V$, for spatially conflicted bimodal stimuli and the values predicted by the

MLE model is illustrated for the control group (Figure 3.8A) and the amblyopia group (Figure

3.8B). For both groups, a classic ventriloquism effect with near-complete visual capture was

observed for the smallest blob size (16°), while a reverse ventriloquism effect (Alais & Burr,

2004) in which audition dominated was observed for the largest blob sizes. Two-way repeated

measures ANOVAs comparing observed and predicted visual perceptual weight, $w_V$, across blob

sizes showed no significant deviation from the MLE model in the control group (F(1,15) =

2.460, p = 0.138) or the amblyopia group (F(1,13) = 0.004, p = 0.952).
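The text does not spell out how the observed visual weight was derived from the conflict-condition PSE. One plausible reading, assuming Equation 1 with weights that sum to 1 and a PSE signed positive toward the blob (as in Figure 3.5), is that a blob at +Δ and a click at -Δ give PSE = Δ(2w_V - 1). The sketch below inverts that relation and also computes the MLE-predicted weight from unimodal slopes; the function names are hypothetical.

```python
def visual_weight_from_pse(pse_toward_blob_deg, conflict_offset_deg=4.0):
    """Observed visual weight, solving PSE = offset * (2 * w_v - 1) for w_v."""
    return 0.5 * (pse_toward_blob_deg / conflict_offset_deg + 1.0)


def predicted_visual_weight(slope_v, slope_a):
    """MLE-predicted visual weight from the unimodal slopes.

    Variance is proportional to 1 / slope**2 (Equation 5), so Equation 2
    becomes slope_v**2 / (slope_v**2 + slope_a**2).
    """
    return slope_v ** 2 / (slope_v ** 2 + slope_a ** 2)


# Complete visual capture, equal weighting, and complete auditory capture.
print(visual_weight_from_pse(4.0), visual_weight_from_pse(0.0), visual_weight_from_pse(-4.0))
```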

Figure 3.8: Perceptual weight for vision ($w_V$), observed and as predicted by the MLE

model. Error bars represent ±1 SEM. For (A) the control group and (B) amblyopia group, the

perceptual weight of vision observed for bimodal stimuli with spatial conflict did not differ

significantly from that predicted by the MLE model.

3.4.2.3 Observed Equivalence Point for Localization Precision and Perceptual Weight

Another specific prediction of the MLE model is that visual and auditory stimuli will be

weighted equally in the localization estimate of the bimodal stimulus (i.e. $w_V = w_A$) when their

unimodal localization precisions are the same (i.e. $\beta'_V = \beta'_A$). To test this prediction, the visual

blob size equivalent to the auditory click in terms of unimodal spatial precision was compared to

the visual blob size equivalent to the auditory click in terms of perceptual weight (Figure 3.9).

For each participant, a linear regression was calculated to predict the unimodal visual precision

based on blob size (control: mean R² = 0.94; amblyopia: mean R² = 0.95), and the regression

equation was used to calculate the blob size at the precision level of the auditory click (i.e. when

$\beta'_V = \beta'_A$). Another linear regression was calculated to predict the visual perceptual weight, $w_V$,

based on blob size (control: mean R² = 0.89; amblyopia: mean R² = 0.86), and the regression

equation was used to calculate the blob size at $w_V = 0.5$ for each participant (i.e. when $w_V = w_A$).

Paired sample t-tests showed that the mean blob size equivalent to the click in terms of

unimodal spatial precision did not differ significantly from the mean blob size when $w_V = 0.5$ for

the control group (t(15) = -1.566, p = 0.138) or the amblyopia group (t(13) = 0.241, p = 0.834).

Therefore, this prediction of the MLE model was upheld in both groups.
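The two equivalence points described above can be sketched with an ordinary least-squares line solved for the blob size at a target value. The per-participant numbers below are invented for illustration only, and the use of log10-transformed precision in the first regression is an assumption based on the Data Analysis section.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def blob_size_at(target_y, xs, ys):
    """Blob size at which the fitted line reaches target_y."""
    slope, intercept = linear_fit(xs, ys)
    return (target_y - intercept) / slope


BLOB_SIZES = [16, 20, 24, 28, 32]

# Hypothetical single-participant values, not data from the study.
log_visual_precision = [-0.70, -0.85, -1.00, -1.15, -1.30]
log_auditory_precision = -1.00
visual_weights = [0.80, 0.65, 0.50, 0.35, 0.20]

size_equal_precision = blob_size_at(log_auditory_precision, BLOB_SIZES, log_visual_precision)
size_equal_weight = blob_size_at(0.5, BLOB_SIZES, visual_weights)
print(size_equal_precision, size_equal_weight)
```

If the MLE prediction holds for a participant, the two blob sizes should agree; the paired t-tests reported above assess that agreement across participants.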

Figure 3.9: Visual blob size equivalent to the auditory click in terms of spatial precision (on

unimodal presentation) and perceptual weight (on bimodal presentation). The MLE model

predicts that the equivalence point should be the same for unimodal spatial precision and

perceptual weight. Indeed, there was no significant difference between the two equivalence

points for either group.

3.5 Discussion

We report that under binocular viewing conditions typical of everyday experience, amblyopia is

associated with a pervasive impairment in spatial localization precision that involves visual,

auditory, and audiovisual (i.e., multisensory) perception. Using the MLE model of the

ventriloquism effect (Alais & Burr, 2004), we show that the deficits in audiovisual localization

actually represent optimal combination of the available unisensory (i.e., visual and auditory)

information. Taken together, these findings indicate that amblyopia does not involve a failure of

spatial audiovisual integration, and point to the importance of normal visual experience (or the

detrimental effect of amblyopic vision) in the developmental calibration of other senses.

The unisensory visual localization task measured relative localization precision under binocular

viewing conditions for diffuse visual blobs of various sizes. Despite normal visual acuity in the

fellow eye, the amblyopia group showed a general reduction in visual localization precision

across blob sizes. Several possibilities may account for this finding. Contrary to clinical dogma,

vision in the fellow eye is not normal (Meier & Giaschi, 2017). Careful psychophysical studies

have shown that the fellow eye has reduced optotype (Kandel, Grattan, & Bedell, 1980; McKee

et al., 2003) and vernier acuity (Levi & Klein, 1985), as well as greater spatial uncertainty and

distortion affecting both foveal and extra-foveal vision (Bedell et al., 1985; Sireteanu et al.,

2008). Another possible explanation for the reduction in visual localization precision is the

temporal interval between the two stimuli whose positions were judged. Previous studies of

spatial vision in the fellow eye (mentioned above) used static visual targets whose spatial

elements were present simultaneously. Our study, however, presented spatial elements (i.e.,

blobs) separated by a temporal interval of 500 ms. Factors such as reduced visual persistence

(Altmann & Singer, 1986) or fixation instability (Gonzalez, Wong, Niechwiej-Szwedo, Tarita-

Nistor, & Steinbach, 2012; Schor & Westall, 1984; Subramanian, Jost, & Birch, 2013) in

amblyopia may have therefore contributed to the observed visual spatial localization deficit.

This study also revealed a surprising and novel amblyopic deficit in auditory spatial localization

precision. Two features of the experimental task are particularly notable: (1) trials were

conducted in darkness with no visual cues, and (2) localization did not involve pointing of any

kind, but was a ‘left’ or ‘right’ determination entered as a button press on a gamepad. These

features mean that the spatial uncertainty in amblyopic vision (Hess & Holliday, 1992) and

visuomotor control (Niechwiej-Szwedo, Goltz, et al., 2012) cannot directly account for the

observed unisensory auditory effect. Rather, they suggest that the sensory impairment in

amblyopia extends beyond vision and into the realm of binaural spatial hearing.

The bimodal localization task measured relative localization precision under binocular viewing

conditions for diffuse visual blobs of varying sizes paired with a simultaneous auditory click. As

with unisensory stimuli, the amblyopia group showed a general impairment in visual localization

precision across blob sizes. However, analysis according to the MLE model showed that

multisensory integration was intact in both the control and amblyopia groups: the maximal

bimodal precision advantage and spatial bias in the ventriloquism effect were optimal based on

the spatial features of the unimodal component stimuli. Small deviations from the MLE model

for bimodal localization precision seen at smaller and larger blob sizes (Figure 3.6) are similar to

those reported by Alais and Burr (2004) (see Figure 2B in their manuscript). The condition in

which the auditory and visual localization precisions are most similar is that in which integration

should result in the greatest improvement in localization precision. Indeed, the MLE predictions

for the maximal bimodal advantage ratio are very close to the empirical data for both groups

(Figure 3.7). Overall, these findings provide independent validation for the MLE model of

ventriloquism (Alais & Burr, 2004) in a larger sample of typically-sighted individuals, and

suggest that at least some multisensory processing abnormalities reported in amblyopia do not

reflect disordered multisensory integration, but rather unisensory deficits that feed into and

propagate through an otherwise normal integrative network.

A common theme in studies of multisensory integration in children is that it develops relatively

late (Burr & Gori, 2012) compared to unisensory abilities (Daw, 2006; Litovsky & Ashmead,

1997) and non-integrative multisensory processes such as cross-modal matching (Pons,

Lewkowicz, Soto-Faraco, & Sebastian-Galles, 2009). By some estimates, optimal multisensory

integration does not arise until age 8 to 10 years (Gori et al., 2008; Nardini et al., 2008), which is

beyond the critical period for the development of amblyopia (Birch, 2013). This may explain

why optimal integration in the ventriloquism effect is spared in amblyopia. Furthermore, it

suggests that other multisensory perceptual anomalies in amblyopia (e.g., reduced susceptibility

to the McGurk effect and poorer audiovisual asynchrony detection) may result from deficits in

unisensory perception (e.g. spatial or temporal uncertainty) that are propagated in an otherwise

optimal multisensory percept. In addition, disrupted cross-sensory calibration may be the

mechanism by which amblyopic vision impairs unisensory functions beyond vision.

Consistent with the theory of cross-sensory calibration in which the more robust and accurate

sense informs the other (Gori, 2015), these results implicate vision as the master reference for the

calibration of auditory spatial localization during development. Indeed, similar relationships have

been described for other multisensory object features. In normal children younger than 8 years,

vision informs touch in spatial orientation discrimination, but touch informs vision in size

discrimination (Gori et al., 2008). In early bilateral visual impairment, however, cross-sensory

calibration is affected in a predictable way: haptic orientation discrimination (for which vision

typically dominates) is impaired, but haptic size discrimination (for which touch typically

dominates) is preserved (Gori et al., 2010). What is striking about our results is that the

impairment in cross-sensory calibration of auditory localization occurred despite normal visual

acuity in one eye. We have conducted further experiments to specifically investigate the effects

of abnormal vision in the calibration of auditory spatial localization during development, which

will be the subject of a separate report.

Chapter 4 Study II

Study II: Amblyopia and the Developmental Calibration of Sound Localization

4.1 Abstract

The visual system of adults with amblyopia developed with reduced binocular input because one

eye was misaligned or defocused. Here we present the first evidence that this visual impairment

interferes with the developmental calibration of auditory localization. The pattern of deficits

suggests that visual input during early development calibrates the auditory spatial map in the

phylogenetically ancient retinocollicular pathway.

4.2 Introduction

Amblyopia is a developmental visual impairment that affects approximately 3% of the

population (Attebo et al., 1998; Brown et al., 2000; Preslan & Novak, 1996). It presents

clinically as a reduction in visual acuity and is not directly attributable to a structural eye

abnormality, but is associated with some factor—most commonly strabismus (eye misalignment)

or anisometropia (unequal refractive error)—that disrupts normal visual experience during a

sensitive period in early life (American Academy of Ophthalmology Pediatric

Ophthalmology/Strabismus Panel, 2012). Beyond the deficit in letter acuity, amblyopia is

associated with a constellation of developmental impairments in spatial visual perception

(McKee et al., 2003), temporal visual perception (Huang et al., 2012; Spang & Fahle, 2009; St

John, 1998), eye movement control (Ciuffreda et al., 1978; Raashid et al., 2016; Subramanian et

al., 2013), hand-eye coordination (Niechwiej-Szwedo et al., 2011; Niechwiej-Szwedo, Goltz,

et al., 2012), and audiovisual multisensory processing (Burgmeier et al., 2015; Chen et al., 2017;

Narinesingh et al., 2017; Richards et al., 2017b).

In a previous study on audiovisual spatial integration (Study I), we detected an unexpected and

novel deficit in the precision of binaural sound localization in people with unilateral amblyopia

(Richards et al., 2017a). The present study is a follow-up on that

finding to investigate more fully the effect of amblyopia on unisensory auditory spatial

perception.

Unlike the visual system, which maps space in direct retinotopic coordinates, the human auditory

system does not have access to explicit spatial information. Instead, a listener must infer the

location of a sound indirectly from cues embedded in the acoustic signal. In the horizontal plane,

sound localization is largely based on differences in signal timing (i.e., interaural time difference,

ITD) and intensity (i.e., interaural level difference, ILD) between the ears (Rayleigh, 1907). The

relative contribution of these cues to sound localization depends on the frequency of the acoustic

signal, with ITDs predominating below 1400 Hz, and ILDs predominating for higher frequencies

(Mills, 1958). These binaural inputs converge in the auditory midbrain where spatial sensitivity

emerges in the dorsal nuclei (see Grothe et al. (2010) for review). Neurons in the primate inferior

colliculus show only coarse spatial selectivity for binaural cues (Groh, Kelly, & Underhill,

2003), but in the superior colliculus, a systematically organized spatiotopic map of auditory

space appears (King & Palmer, 1983). While ITD and ILD cues both contribute to sound

localization at the behavioural level (Mills, 1958), spatial selectivity in the mammalian superior

colliculus appears to depend exclusively on ILD cues (Campbell et al., 2006).

The superficial layers of the superior colliculus also receive direct retinal input (Pollack &

Hickey, 1979; Williams, Azzopardi, & Cowey, 1995) and show retinotopic organization similar

to that of the striate cortex (DuBois & Cohen, 2000; Lane et al., 1973). In contrast to the

balanced binocular input to the striate cortex, however, each superior colliculus receives retinal

input primarily from the contralateral eye (Lane et al., 1973; Pollack & Hickey, 1979).

Importantly, the collicular visual space map is topographically aligned with the underlying

auditory space map (King & Palmer, 1983). When eye movements shift the retina-centred visual

frame of reference away from the head-centred auditory frame of reference, this alignment tends

to be maintained by shifts in the auditory map (Jay & Sparks, 1987b). Alignment of the

unisensory space maps ensures that auditory and visual stimuli activate neurons at the same site,

and enables integration of those signals in the deeper multisensory layers of the superior

colliculus (Meredith & Stein, 1986b). This multisensory convergence and spatial alignment is

likely essential to the role of the superior colliculus in shifting gaze and attention to salient

environmental stimuli (Schiller & Stryker, 1972; Sparks, 1986).

Although a rudimentary map of auditory space is present in the superior colliculus at birth

(King & Carlile, 1993), animal studies have shown that abnormal visual input during

development causes changes in its topography and alignment with the visual space map (see

King (2009) for review). Barn owls reared with prism spectacles mislocalize sounds in the

direction of the visual field shift, and show corresponding shifts in the tectal auditory space map

(Knudsen & Brainard, 1991). After a certain age, normal visual experience is ineffective in

recovering normal sound localization abilities, indicating that these changes do not represent

short-term adaptation, but permanent alterations crystallized during a sensitive period of brain

plasticity (Knudsen & Knudsen, 1990). A similar shift is observed in the auditory space map in

the superior colliculus of ferrets reared with experimentally-induced strabismus (King et al.,

1988), and complete disorganization of the auditory map is observed when ferrets are reared with

a surgically rotated eye (King et al., 1988). Interestingly, anomalous acoustic experience caused

by chronic occlusion of one ear from birth has little effect on behavioural sound localization in

barn owls (Knudsen & Knudsen, 1990) and induces minimal change in the spatial tuning of

auditory neurons in the superior colliculus in ferrets (King et al., 1988). That the auditory map

can adjust to distorted binaural cues more readily than it can to distorted visual cues implicates

vision as the dominant guiding influence in calibrating the neural representation of auditory

space in the superior colliculus (King et al., 1988).

Spatial acuity in the human auditory system, as in the visual system (Mayer & Dobson, 1982),

follows a developmental trajectory through childhood. Binaural localization is poor at birth, but

improves dramatically during the first several years of life (Litovsky & Ashmead, 1997). In

developmentally typical infants, the smallest reliably perceptible separation between sound

sources, or minimum audible angle (MAA), improves from approximately 20° at 5 months of

age (Ashmead et al., 1987) to 4° at 18 months of age, finally reaching adult acuity of 1–2° by

about 5 years of age (Mills, 1958; Morrongiello, 1988). Abnormal visual experience in early life

is also known to affect sound localization in humans, but the relation is not simple. The

congenitally blind often exhibit sensory compensation for their loss of vision, with superior

auditory spatial tuning, particularly for peripheral stimuli (Ashmead et al., 1998; Lessard et al.,

1998; Röder et al., 1999; Voss et al., 2004). Similarly, people who had one eye enucleated (i.e.,

surgically removed) in childhood can localize sounds more accurately in the central region of

space (Hoover et al., 2012). In contrast, sound localization precision and accuracy are

significantly impaired in people whose early visual impairment is limited to the central field

bilaterally (Lessard et al., 1998). In the context of these prior findings, it is difficult to predict the

cross-sensory effect of unilateral amblyopia on sound localization. Is subnormal vision in one

eye accompanied by compensatory enhancement of spatial hearing, or is discordant binocular

input sufficient to impair spatial hearing despite normal visual acuity in the fellow eye? Our prior

investigations suggest that sound localization may indeed be impaired in amblyopia (Richards et

al., 2017a), but this cross-sensory effect has not been examined systematically in this prevalent

visual disorder.

4.3 Methods

In the present study, we measured the precision and accuracy of sound localization using a

relative localization task (Experiments 1 and 3) and an absolute localization task (Experiment 2)

in humans with unilateral amblyopia.

4.3.1 Experiment 1: Relative sound localization—minimum audible angle task using speaker array

4.3.1.1 Participants

All participants reported no history of neurological, neurodevelopmental, auditory, or visual

disorders other than amblyopia, strabismus and/or refractive error. A certified orthoptist or

ophthalmologist examined each participant to measure visual acuity (standard ETDRS chart),

stereopsis (Randot circles test and Titmus fly test), foveal suppression (Worth 4-dot test), ocular

motility and alignment, and refractive correction. Amblyopia was defined as an acuity of ≥0.18

logMAR in the affected eye, and an interocular difference of ≥0.2 logMAR. Amblyopia was

classified as anisometropic if the interocular difference in spherical or astigmatic error was ≥1

diopter (D), strabismic if there was any manifest deviation in the absence of anisometropia, and

mixed if there was a strabismus of ≥8 prism diopters (PD) in the presence of anisometropia ≥1 D.

Each participant also passed a standard hearing test (Student Support Services Team, 2008) to

ensure reliable detection of pure tones at ≤25 dBA sound pressure level (SPL) at four standard

frequencies in each ear (500, 1000, 2000, 4000 Hz). Written informed consent was obtained in

accordance with the protocol approved by the Research Ethics Board at The Hospital for Sick

Children, and in accordance with the Declaration of Helsinki.

Ten adults with amblyopia (3 males; mean age, 32 years; range, 22–46 years) and 10 normally-

sighted adults (3 males; mean age, 29 years; range, 22–47 years) participated in Experiment 1.

Demographic and clinical details for participants with amblyopia in Experiment 1 are

summarized in Table 4.1.

Table 4.1: Clinical details of participants with amblyopia in Experiment 1

| Participant | Age (sex) | Subtype | Visual acuity (logMAR), RE | Visual acuity (logMAR), LE | Refractive correction (diopters), RE | Refractive correction (diopters), LE | Alignment at 6 m (prism diopters) | Stereo acuity (arc sec) | Worth 4-dot response | Additional details |
|---|---|---|---|---|---|---|---|---|---|---|
| P1 | 27 (F) | Strab | 0.00 | 0.48 | -6.25 +1.00 x45 | -5.50 +1.25 x35 | LE esotropia 2, LE hypotropia 1 | 200 | Fused | Strab surgery age 9 |
| P2 | 22 (F) | Aniso | 0.00 | 0.48 | -1.50 +0.50 x80 | +1.00 +1.25 x95 | LE esotropia 2 | 200 | Fused | |
| P3 | 22 (M) | Aniso | 1.10 | -0.10 | -6.00 +0.75 x174 | -4.50 +0.50 x75 | RE esotropia 2 | 3000 | Fused | |
| P4 | 23 (F) | Strab | 0.20 | 0.00 | +0.50 +0.50 x28 | +1.25 +0.50 x88 | LE esotropia 8, bilateral DVD | Not measurable | Diplopic | Infantile esotropia, 2 strab surgeries as child |
| P5 | 44 (F) | Mixed | 0.90 | 0.00 | +6.00 +1.25 x75 | -0.75 | RE exotropia 35 | Not measurable | RE suppressed | |
| P6 | 37 (F) | Aniso | 0.18 | -0.10 | -3.25 +4.00 x10 | -5.25 | RE esotropia 1 | 70 | Fused | |
| P7 | 44 (M) | Aniso | -0.10 | 1.20 | -0.25 +2.00 x98 | -2.75 +2.00 x69 | LE esotropia flick | 3000 | Fused | |
| P8 | 46 (F) | Strab | -0.10 | 0.10 | +4.25 | +5.00 | LE esotropia 25, LE hypotropia 18 | Not measurable | LE suppressed | Esotropia onset at 6–8 months of age |
| P9 | 28 (M) | Aniso | 0.18 | -0.10 | +2.25 | +0.25 | Exophoria 2 | 70 | Fused | |
| P10 | 26 (F) | Aniso | -0.10 | 0.18 | +0.75 | +3.00 | LE esotropia 1 | 140 | Fused | |

Abbreviations: RE, right eye; LE, left eye; Aniso, anisometropia; Strab, strabismus; DVD, dissociated vertical deviation.

4.3.1.2 Stimuli and Design

All trials were conducted in a darkened, sound attenuating chamber (internal dimensions 2.0 x

2.1 x 2.2 m) lined with 5 cm acoustic wedge foam (Foam Factory, Macomb, MI, USA). The

background noise level was 39.0 dBA SPL. Participants were seated with the head stabilized in a

chinrest 1 m from a horizontal array of 11 speakers (model CMS0361KLX, CUI Inc., Tualatin,

OR, USA) as shown in Figure 4.1. Auditory stimuli consisted of broadband white noise bursts of

32 ms duration, including a 2 ms sigmoid on/off ramp, delivered at 76.5 dBA SPL (output level

was verified as between 76.3 and 76.6 dBA SPL for each speaker). A red LED positioned over

the central speaker was illuminated between trials to aid in maintaining head alignment with the

speaker array. Participants used a wireless gamepad (model F710, Logitech, Newark, CA, USA)

to initiate trials and enter responses.

Figure 4.1: Apparatus for Experiment 1, a horizontal array of 11 speakers with a central

fixation LED. In each trial, one click was presented at the central reference position and the

other was presented a specified distance (3°, 6°, 9°, 12° or 15°) left or right of centre (auditory

angle θ).

Each trial began with illumination of the central fixation LED for 500 ms, followed by a

randomized delay between 250 ms and 400 ms. Two clicks (a reference click and a probe click)

were then presented in succession 500 ms apart. The reference click always originated from the

central speaker (0°), and the probe click originated from a non-central speaker (3°, 6°, 9°, 12° or

15° to the left or right of centre), forming an auditory angle θ; the order of presentation was

randomized. Participants were instructed to judge whether the second click was located to the

left or right relative to the first. Twenty trials were conducted for each of the 10 auditory angles

tested, with the probe click preceding the reference click in 50% of trials for each auditory angle.

Trials were run in random order, arranged in 2 blocks of 100 trials each.
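
The trial structure described above can be sketched in a few lines of Python. This is a minimal illustration rather than the software used in the study: the function names, the dictionary keys, and the recoding rule in `probe_heard_right` are assumptions introduced here, and the fixation period and inter-click timing are omitted.

```python
import random

# Probe offsets in degrees; negative = left of centre, positive = right.
PROBE_ANGLES = [-15, -12, -9, -6, -3, 3, 6, 9, 12, 15]
TRIALS_PER_ANGLE = 20
BLOCK_SIZE = 100


def build_trial_blocks(seed=None):
    """Return two blocks of 100 trials each, in random order.

    Each trial records the probe angle and whether the probe click is
    presented before the central reference click.
    """
    rng = random.Random(seed)
    trials = []
    for angle in PROBE_ANGLES:
        # The probe precedes the reference in exactly half of the trials per angle.
        half = TRIALS_PER_ANGLE // 2
        for probe_first in [True] * half + [False] * half:
            trials.append({"probe_angle_deg": angle, "probe_first": probe_first})
    rng.shuffle(trials)
    return [trials[i:i + BLOCK_SIZE] for i in range(0, len(trials), BLOCK_SIZE)]


def probe_heard_right(trial, response):
    """Recode a 'left'/'right' judgment of the second click.

    Returns True if the response implies that the probe was heard to the
    right of the reference (assumed recoding, not stated in the source).
    """
    second_right = response == "right"
    # If the probe came first, "second click right" means the probe was heard left.
    return (not second_right) if trial["probe_first"] else second_right


if __name__ == "__main__":
    blocks = build_trial_blocks(seed=1)
    print(len(blocks), [len(block) for block in blocks])
```

Balancing the click order within each angle, rather than drawing it at random per trial, guarantees the stated 50% split for every auditory angle.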

4.3.1.3 Data Analysis

The proportion of trials in which the probe click was “heard right” of the reference click was

calculated for each auditory angle θ. A cumulative Gaussian function was fit to the psychometric

data for each participant in MATLAB version R2011b (Mathworks, Inc., Natick, MA, USA)

using the maximum likelihood method. The MAA was computed for each participant, and

defined as one half of the difference in θ between the 0.25 and 0.75 points on the y-axis of the

psychometric function (Mills, 1958).
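
For a cumulative Gaussian with standard deviation σ, the 0.25 and 0.75 points lie at about ±0.6745σ from the mean, so the MAA defined above equals roughly 0.6745σ. The following Python sketch illustrates the fitting step under stated assumptions: the original analysis was run in MATLAB, and the coarse-to-fine grid search, the starting values, and all function names here are illustrative stand-ins, not the optimizer or code used in the study.

```python
import math


def cum_gauss(theta, mu, sigma):
    """Cumulative Gaussian: probability of a 'probe heard right' response."""
    return 0.5 * (1.0 + math.erf((theta - mu) / (sigma * math.sqrt(2.0))))


def neg_log_likelihood(mu, sigma, angles, n_right, n_total):
    """Binomial negative log-likelihood of the response counts."""
    nll = 0.0
    for theta, k, n in zip(angles, n_right, n_total):
        p = min(max(cum_gauss(theta, mu, sigma), 1e-9), 1.0 - 1e-9)  # avoid log(0)
        nll -= k * math.log(p) + (n - k) * math.log(1.0 - p)
    return nll


def fit_cum_gauss(angles, n_right, n_total):
    """Maximum likelihood fit by coarse-to-fine grid search (illustrative)."""
    mu, sigma = 0.0, 5.0             # starting guess in degrees (assumption)
    mu_span, sigma_span = 10.0, 4.9  # half-widths of the search window
    for _ in range(8):
        best = None
        for i in range(21):
            m = mu - mu_span + i * mu_span / 10.0
            for j in range(21):
                s = max(sigma - sigma_span + j * sigma_span / 10.0, 0.05)
                nll = neg_log_likelihood(m, s, angles, n_right, n_total)
                if best is None or nll < best[0]:
                    best = (nll, m, s)
        _, mu, sigma = best
        mu_span /= 4.0               # shrink the window around the best point
        sigma_span /= 4.0
    return mu, sigma


def minimum_audible_angle(sigma):
    """Half the distance between the 25% and 75% points of the fitted curve."""
    z75 = 0.6744897501960817  # 75th percentile of the standard normal
    return z75 * sigma


if __name__ == "__main__":
    # Synthetic counts generated from a known curve, not data from the study.
    angles = [-15, -12, -9, -6, -3, 3, 6, 9, 12, 15]
    n_total = [20] * len(angles)
    n_right = [20 * cum_gauss(a, 0.5, 4.0) for a in angles]
    mu_hat, sigma_hat = fit_cum_gauss(angles, n_right, n_total)
    print(mu_hat, sigma_hat, minimum_audible_angle(sigma_hat))
```

A gradient-free search is used only to keep the sketch dependency-free; any standard optimizer applied to the same likelihood should give an equivalent fit.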

The mean MAA values for the control and amblyopia groups were compared using an

independent samples t-test in IBM SPSS Statistics, version 22 (Armonk, NY, USA). Normality

of the data was established by the Shapiro-Wilk test. Degrees of freedom were adjusted to

overcome possible violations in equality of variance detected by Levene’s test.

Associations between the MAA and various clinical characteristics in the amblyopia group were

examined. Associations with visual acuity in the amblyopic eye and with stereo acuity

were assessed using Spearman’s rank correlation.
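
Spearman's rank correlation is the Pearson correlation computed on ranks, with tied values sharing their mean rank. The sketch below shows that computation in plain Python; it is not the SPSS procedure used in the study, the function names are illustrative, and it returns only the coefficient, not a p-value. How "not measurable" stereo acuity was coded for ranking is not stated in the source, so any numeric placeholder for those cases would be an additional assumption.

```python
def average_ranks(values):
    """Rank values from 1 to n, assigning tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rs(x, y):
    """Spearman's rank correlation coefficient for paired observations."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5
```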

4.3.2 Experiment 2: Absolute Auditory Localization

4.3.2.1 Participants


Fourteen adults with amblyopia (mean age, range: 30, 19–48 years) and 14 normally-sighted

adults (mean age, range: 30, 23–47 years) participated in Experiment 2. Five of the participants

with amblyopia and four controls had also participated in Experiment 1. All new participants met

the same ophthalmic and auditory screening examination requirements as those in Experiment 1,

and the same definitions for amblyopia and its subtypes were used. Demographic and clinical

details for participants with amblyopia in Experiment 2 are summarized in Table 4.2. Written

informed consent was obtained in accordance with the protocol approved by the Research Ethics

Board at The Hospital for Sick Children, and in adherence to the Declaration of Helsinki.

Table 4.2: Clinical details of participants with amblyopia in Experiment 2

| Participant | Age (sex) | Subtype | Visual acuity (logMAR), RE | Visual acuity (logMAR), LE | Refractive correction (diopters), RE | Refractive correction (diopters), LE | Alignment at 6 m (prism diopters) | Stereo acuity (arc sec) | Worth 4-dot response | Additional details |
|---|---|---|---|---|---|---|---|---|---|---|
| P1* | 27 (F) | Strab | 0.00 | 0.48 | -6.25 +1.00 x45 | -5.50 +1.25 x135 | LE esotropia 2, LE hypotropia 1 | 200 | Fused | Strab surgery, age 9 years |
| P2* | 22 (F) | Aniso | 0.00 | 0.48 | -1.50 +0.50 x80 | +1.00 +1.25 x95 | LE esotropia 2 | 200 | Fused | |
| P3* | 22 (M) | Aniso | 1.10 | -0.10 | -6.00 +0.75 x174 | -4.50 +0.50 x75 | RE esotropia 2 | 3000 | Fused | |
| P4* | 23 (F) | Strab | 0.20 | 0.00 | +0.50 +0.50 x28 | +1.25 +0.50 x88 | LE esotropia 8, bilateral DVD | Not measurable | Diplopic | Infantile esotropia, 2 strab surgeries as child |
| P5* | 44 (F) | Mixed | 0.90 | 0.00 | +6.00 +1.25 x75 | -0.75 | RE exotropia 35 | Not measurable | RE suppressed | |
| P11 | 32 (F) | Aniso | -0.10 | 0.54 | Plano | +2.00 +2.00 x124 | Orthotropic | 140 | Fused | |
| P12 | 29 (F) | Mixed | 0.00 | 1.00 | Plano | +3.50 +2.00 x90 | LE exotropia 14, LE hypertropia 4 | Not measurable | LE suppressed | Strab surgery, age 4 years |
| P13 | 23 (F) | Aniso | -0.10 | 0.48 | -2.25 | +0.25 +2.25 x85 | LE esotropia 1 | 200 | Fused | |
| P14 | 19 (F) | Aniso | 0.00 | 0.18 | -0.75 +2.00 x84 | -2.75 +4.50 x99 | Exophoria 1 | 40 | Fused | |
| P15 | 19 (F) | Mixed | 0.48 | 0.00 | +3.00 +1.00 x130 | +4.25 | RE esotropia 4, RE esophoria 10 | 3000 | Fused | Accom. esotropia, strab surgery as child |
| P16 | 29 (F) | Aniso | 0.48 | -0.10 | -5.00 | -1.25 | RE esotropia 2 | 3000 | Fused | |
| P17 | 29 (F) | Strab | 0.00 | 1.00 | None | None | LE esotropia 2, bilateral DVD | Not measurable | LE suppressed | Infantile esotropia |
| P18 | 37 (F) | Mixed | -0.10 | 1.30 | -1.00 | +6.00 +2.50 x120 | LE exotropia 25 | Not measurable | LE suppressed | Strab surgery, age 23 years |
| P19 | 48 (F) | Aniso | 0.70 | 0.00 | +2.25 +0.25 x174 | -0.75 | RE esotropia 2 | 3000 | Fused | |

Abbreviations: RE, right eye; LE, left eye; Aniso, anisometropic; Strab, strabismic; DVD, dissociated vertical deviation; Accom.,

accommodative. *Also participated in Experiment 1.

4.3.2.2 Stimuli and Design

All trials were conducted in the same acoustic chamber as Experiment 1. Participants were

seated with the chin stabilized in a chinrest before a large (165 cm diagonal) LED monitor (NEC,

model E654, Tokyo, Japan) flanked by stereo speakers (HP Inc., model BR387AA#ABA, Palo

Alto, CA, USA) at ear-level (shown in Figure 4.2). Auditory stimuli consisted of 32 ms click

trains (8 cycles of 4 ms white noise clicks at 62.0 dBA, enveloped with a 2 ms sigmoid on/off

ramp), repeating at 3 Hz. The white noise was 2–5 kHz bandpass filtered to limit the auditory

stimulus to frequencies at which interaural level difference cues predominate for binaural

localization (Mills, 1958). This stereophonic arrangement allowed the generation of phantom

(i.e., virtual) sound sources whose location was perceived on the horizontal axis between the two

physical speakers according to the principles of amplitude panning and summation localization

(Pulkki, 2001; Warncke, 1941). Participants used a wireless mouse with their preferred hand to

initiate trials and enter responses.

Figure 4.2: Apparatus for Experiment 2, stereo speakers with LED monitor. Phantom

sources were generated at locations on the azimuth (auditory angle θ) between the two physical

speakers by amplitude panning. The speakers were driven by coherent signals of independently

variable amplitude, such that the signal amplitude gain to the right and left speakers always

summed to 1.
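
The gain constraint described above can be illustrated with a linear panning rule in Python. This is a sketch under stated assumptions, not the stimulus-generation code used in the study: the function name is hypothetical, the mapping from azimuth to gain is one simple rule that satisfies "gains sum to 1", and the speaker half-angle of 30° is a placeholder because the source does not report the azimuth of the physical speakers.

```python
def linear_pan_gains(target_deg, speaker_half_angle_deg):
    """Return (left_gain, right_gain) for a phantom source at target_deg.

    Gains vary linearly with azimuth and always sum to 1. Negative angles
    are left of centre; the speakers sit at +/- speaker_half_angle_deg.
    """
    if abs(target_deg) > speaker_half_angle_deg:
        raise ValueError("target lies outside the span of the two speakers")
    right_gain = 0.5 * (1.0 + target_deg / speaker_half_angle_deg)
    left_gain = 1.0 - right_gain
    return left_gain, right_gain


if __name__ == "__main__":
    SPEAKER_HALF_ANGLE_DEG = 30.0  # hypothetical value, not given in the source
    for azimuth in (-16, -12, -8, -4, 0, 4, 8, 12, 16):
        left, right = linear_pan_gains(azimuth, SPEAKER_HALF_ANGLE_DEG)
        print(azimuth, round(left, 3), round(right, 3))
```

With this rule a centred source receives equal gain in both speakers, and a source at the position of one speaker is driven by that speaker alone.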

Prior to each trial, a small red fixation dot (0.66°) was presented centrally to aid in consistent

alignment of the head and eyes with respect to the stereo speakers. Each trial began with the

offset of the fixation dot followed by the presentation of the auditory stimulus at one of 9

locations on the azimuth (-16°, -12°, -8°, -4°, 0°, 4°, 8°, 12°, or 16°). Two seconds after onset of

the click train, a visual cursor (vertical white line) appeared on the monitor. Participants aligned

the cursor with the perceived sound source, and clicked a mouse button to enter their response.

The initial horizontal location of the cursor was jittered randomly between -40° and 40° on every

trial to prevent systematic bias from ventriloquism (i.e., visual capture of the auditory stimulus

location by the visual cursor). The click train continued at 3 Hz until a response was entered. To

mitigate the potential effect of visual capture by the visual cursor further, participants were asked

to make their judgment during the 2 seconds of darkness before the cursor appeared. Participants

were instructed to hold their head still in the chinrest during trials, but eye movements were

unconstrained. Five trials were conducted at each of the 9 locations, randomized within a single

block.

4.3.2.3 Data Analysis

For participants with amblyopia, location data were signed relative to the side of the affected

eye. Positive values indicated locations in the spatial hemifield ipsilateral to the amblyopic eye,

and negative values indicated locations in the contralateral hemifield. As normally sighted

controls lack a lateralized visual impairment, location data were expressed in left/right spatial

coordinates.
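
The sign convention above amounts to mirroring the azimuth axis for participants whose left eye is amblyopic. A minimal Python sketch follows; the function name and the 'RE'/'LE' codes (borrowed from the table abbreviations) are illustrative assumptions, and the same transformation would be applied to both specified and perceived locations.

```python
def sign_relative_to_amblyopic_eye(azimuth_deg, amblyopic_eye):
    """Re-express a left/right azimuth relative to the amblyopic eye.

    Input uses negative = left, positive = right. Output uses
    positive = hemifield ipsilateral to the amblyopic eye.
    amblyopic_eye is 'RE', 'LE', or None for control participants,
    whose data stay in left/right coordinates.
    """
    if amblyopic_eye is None or amblyopic_eye == "RE":
        return azimuth_deg       # right is already positive
    if amblyopic_eye == "LE":
        return -azimuth_deg      # mirror so that left becomes positive
    raise ValueError("amblyopic_eye must be 'RE', 'LE', or None")
```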

Perceived click location was regressed on specified click location for each participant, and the linear regression parameters (intercept and slope) were compared with their expected values of

0 and 1, respectively, using a one-sample t-test.
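
The regression step can be sketched as follows in Python, assuming one ordinary least squares fit per participant followed by a one-sample t statistic across participants. The function names are illustrative, and the sketch returns only the t statistic and degrees of freedom, leaving the p-value to a statistics package.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def one_sample_t(values, expected):
    """t statistic and degrees of freedom for H0: mean(values) == expected."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    t = (mean - expected) / (var / n) ** 0.5
    return t, n - 1
```

In use, `fit_line` would be applied to each participant's specified and perceived locations, and the resulting slopes and intercepts passed to `one_sample_t` with expected values of 1 and 0, respectively.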

Between-subject localization variability within each hemifield was obtained by averaging the

standard deviation of the mean localization for each of the four click locations within each

hemifield. Hemifield asymmetry in between-subject localization variability was assessed using

a paired t-test. Normality of the data was established by the Shapiro-Wilk test.

Within-subject localization error was calculated as the root mean square (RMS) error at each

click location. For the repeated measures ANOVA comparing the effect of hemifield on RMS

localization error across click locations, homogeneity of variance was established by Mauchly’s

test of sphericity.
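
The two variability measures described above can be sketched in Python as follows. The function names are illustrative, and the use of the sample (n − 1) standard deviation for the between-subject measure is an assumption, since the source does not state which denominator was used.

```python
import math


def rms_error(responses_deg, target_deg):
    """Root mean square localization error for repeated trials at one target."""
    squared = [(r - target_deg) ** 2 for r in responses_deg]
    return math.sqrt(sum(squared) / len(squared))


def sample_sd(values):
    """Sample standard deviation (n - 1 denominator, assumed here)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))


def between_subject_variability(mean_locations_by_target):
    """Average, over targets in one hemifield, of the SD across participants.

    mean_locations_by_target maps each target azimuth to the list of
    per-participant mean localizations at that target.
    """
    sds = [sample_sd(means) for means in mean_locations_by_target.values()]
    return sum(sds) / len(sds)
```

The RMS error combines systematic offset and trial-to-trial scatter in a single value, which is why it is reported here as the magnitude of mislocalization rather than as a pure measure of bias.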

4.3.3 Experiment 3: Replication of MAA task using stereo speaker apparatus (amplitude panning)

4.3.3.1 Participants


As in Experiment 2.

4.3.3.2 Stimuli and Design


As in Experiment 1, with the following exceptions. Sound stimuli consisted of 2–5 kHz bandpass

filtered white noise to limit the auditory stimulus to frequencies at which interaural level

difference cues predominate (Mills, 1958). Instead of a central LED, a small (0.66°) red fixation

dot was presented on the LED monitor.

4.3.3.3 Data analysis

As in Experiment 1.

4.4 Results

4.4.1 Experiment 1

Performance on the relative sound localization task using the array of 11 speakers is illustrated in

Figure 4.3A. The MAA, illustrated in Figure 4.3B, was significantly larger in the amblyopia

group (mean ± SEM: 3.60 ± 0.35°) compared to the control group (mean ± SEM: 2.04 ± 0.12°),

indicating poorer sound localization precision in the amblyopia group (t(18) = -4.278, p = 0.001).

Within the amblyopia group, the MAA showed no significant correlation with visual acuity in the

amblyopic eye (Rs = 0.258, p = 0.471) or with stereo acuity (Rs = 0.644, p = 0.644).
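
As a rough arithmetic check, the group comparison can be recomputed from the rounded means and SEMs reported above. The Python sketch below is illustrative only and does not reproduce the SPSS analysis; the function name is hypothetical. With 10 participants per group it gives a t statistic of about -4.2, close to the reported value, and the small difference is consistent with rounding of the published summary values. With equal group sizes the pooled and unequal-variance t statistics coincide, so only the degrees of freedom differ between the two approaches.

```python
import math


def t_from_summary(mean_a, sem_a, n_a, mean_b, sem_b, n_b):
    """Two-sample t statistic and Welch-Satterthwaite df from means and SEMs."""
    se_diff = math.sqrt(sem_a ** 2 + sem_b ** 2)
    t = (mean_a - mean_b) / se_diff
    welch_df = (sem_a ** 2 + sem_b ** 2) ** 2 / (
        sem_a ** 4 / (n_a - 1) + sem_b ** 4 / (n_b - 1)
    )
    return t, welch_df


if __name__ == "__main__":
    # Rounded summaries for Experiment 1 as reported in the text:
    # control 2.04 +/- 0.12 deg, amblyopia 3.60 +/- 0.35 deg, n = 10 per group.
    t, welch_df = t_from_summary(2.04, 0.12, 10, 3.60, 0.35, 10)
    print(round(t, 2), round(welch_df, 1))
```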

Figure 4.3: Relative sound localization performance on a horizontal speaker array. Error

bars indicate SEM. (A) Mean psychometric data for the minimum audible angle task. Negative

and positive auditory angles represent sounds presented to the left and right of the central click,

respectively. (B) The mean minimum audible angle was significantly larger in the amblyopia

group compared to the control group (*p = 0.001).

4.4.2 Experiment 2

Performance on the absolute sound localization task using stereo speakers is illustrated in Figure

4.4. For the amblyopia group, location data are expressed relative to the side of the visual

impairment, with positive values indicating auditory locations in the hemifield ipsilateral to the

amblyopic eye.

Figure 4.4: Absolute sound localization performance. For the amblyopia group, positive

coordinates indicate auditory locations in the hemifield ipsilateral to the amblyopic eye. (A, B)

Mean localization of auditory targets ± SD. The relation between perceived and specified click

location was linear and closely matched the relation predicted by linear amplitude panning

(dotted grey line) for both groups. (C, D) Sound localization error (root mean square, RMS) by

click location ± SEM. (C) For the control group, the magnitude of sound mislocalization was

symmetric in the left and right auditory hemifields. (D) For the amblyopia group, the magnitude

of sound mislocalization was significantly greater in the auditory hemifield ipsilateral to the

amblyopic eye, compared to the contralateral hemifield (*p = 0.043). Contralateral = auditory

hemifield contralateral to the amblyopic eye, Ipsilateral = auditory hemifield ipsilateral to the

amblyopic eye.

The relation between perceived click location and specified click location was linear for all

participants in the control group (mean R2 = 0.98) and the amblyopia group (mean R2 = 0.96).

The mean intercept and slope of regression did not differ significantly from the expected values

of 0 and 1 for the control group (mean intercept = -0.73°, t(13) = -1.01, p = 0.331, mean slope =

1.10, t(13) = 1.183, p = 0.258), indicating that the virtual sound sources appeared where

specified, and that there was no systematic leftward or rightward bias in the apparatus (Figure

4.4A). Similarly, the mean intercept and slope of the regression did not differ significantly from

the expected values of 0 and 1 for the amblyopia group (mean intercept = 0.67°, t(13) = 0.626, p

= 0.542, mean slope = 1.11, t(13) = 1.677, p = 0.117), indicating that there was no systematic

bias in auditory localization toward or away from the amblyopic eye (Figure 4.4B).

Between-subject localization variability did not differ significantly between the left and right

hemifields in the control group (control: t(3) = -1.128, p = 0.342). In the amblyopia group,

however, there was significantly greater localization variability in the auditory hemifield ipsilateral

to the amblyopic eye compared to the contralateral side (amblyopia: t(3) = -4.721, p = 0.018).

The mean magnitude of auditory mislocalization, calculated as the RMS error at each specified

click location for each participant, is illustrated in Figure 4.4C for the control group and in

Figure 4.4D for the amblyopia group. For the control group, a 2 x 4 repeated measures ANOVA

comparing the two auditory hemifields across specified click locations showed no significant

interaction of Hemifield x Click Location (F(3,39) = 0.579, p = 0.632), and no main effect of

Hemifield (F(1,13) = 0.075, p = 0.788). For the amblyopia group, the same analysis showed no

significant interaction of Hemifield x Click Location (F(3,39) = 0.781, p = 0.512), but did reveal

a significant main effect of Hemifield (F(1,13) = 5.041, p = 0.043), with greater auditory

localization error in the auditory hemifield ipsilateral to the amblyopic eye.

The correlations between clinical deficits in amblyopia and the magnitude of auditory

mislocalization (RMS error) are illustrated in Figure 4.5. Within the amblyopia group, the

clinical deficit in visual acuity and the magnitude of auditory mislocalization were significantly

correlated at the 8° specified click location within the auditory hemifield ipsilateral to the

amblyopic eye (Rs = 0.66, p = 0.011). Similarly, the clinical deficit in stereo acuity and the

magnitude of auditory mislocalization were significantly correlated at the specified click

locations 4° (Rs = 0.56, p = 0.012), 8° (Rs = 0.72, p = 0.004), and 12° (Rs = 0.54, p = 0.045)

within the auditory hemifield ipsilateral to the amblyopic eye. There were no significant clinical

correlations with auditory localization error in the auditory hemifield contralateral to the

amblyopic eye.

Figure 4.5: Correlations between RMS error for sound localization and clinical measures

of amblyopia across auditory target positions. The magnitude of auditory mislocalization was

significantly correlated with visual acuity and stereo acuity deficits in the auditory hemifield

ipsilateral to the amblyopic eye.

4.4.3 Experiment 3

Performance on the relative sound localization task using the stereo speaker apparatus (amplitude

panning) is illustrated in Figure 4.6A. The MAA, shown in Figure 4.6B, was significantly larger

in the amblyopia group (mean ± SEM: 4.21 ± 0.29°) compared to the control group (mean ± SEM:

3.38 ± 0.26°), indicating poorer sound localization precision in the amblyopia group (t(26) =

-2.120, p = 0.044). Within the amblyopia group, the MAA showed no significant correlation with

visual acuity in the amblyopic eye (Rs = -0.117, p = 0.690) or with stereo acuity (Rs = -0.048, p =

0.871).

Figure 4.6: Relative sound localization performance on stereo speaker apparatus. Error bars

indicate SEM. (A) Mean psychometric data for the minimum audible angle task. (B) The

minimum audible angle was significantly larger in the amblyopia group compared to the control

group (*p = 0.044).

The MAA values obtained using amplitude panning on the stereo speaker apparatus were

compared to those obtained using the physical speaker array in Experiment 1 (Figure 4.7). For the

nine individuals tested on both apparatuses (4 control participants and 5 participants with

amblyopia), the MAA values obtained were significantly correlated (R = 0.80, p = 0.009).

Figure 4.7: Correlation between minimum audible angle (MAA) values determined by

amplitude panning (Experiment 3) and by physical speakers (Experiment 1). Open circles

represent normal control participants and solid circles represent participants with amblyopia.

4.5 Discussion

In summary, we found novel deficits in both the precision and accuracy of sound localization in

people who grew up with amblyopia in one eye. The deficit in unisensory sound localization

precision was apparent as an increase in the MAA in the central region of space. The deficit in

sound localization accuracy was apparent as greater mislocalization in the spatial hemifield

ipsilateral to the amblyopic eye. Furthermore, the magnitude of sound mislocalization in the

ipsilateral hemifield was significantly correlated with the severity of amblyopic deficits in visual

acuity and stereo acuity.

The significant correlation between the MAA obtained with the physical sources in the speaker

array and virtual sources generated by amplitude panning indicates good agreement between

these two methods of measuring the MAA. The smaller MAA values obtained using the speaker

array were expected because the stimuli provided by physical sources were broadband, and

included low-frequency ITD and possibly spectral cues absent in virtual sources generated by

amplitude panning (Middlebrooks & Green, 1991; Pulkki & Karjalainen, 2001; Stevens &

Newman, 1936).

Unlike people who lose all vision in one or both eyes at an early age (Hoover et al., 2012;

Lessard et al., 1998; Röder et al., 1999), our results indicate that people with amblyopia do not

exhibit enhanced auditory localization to compensate for their deficits in spatial vision. Rather,

the developmental effect of unilateral amblyopia on spatial hearing more closely resembles that

of partial blindness with residual vision in both eyes (Lessard et al., 1998). This indicates that

discordant binocular vision can disrupt the developmental calibration of auditory space, and that

normal spatial acuity in the fellow eye is not adequate to rescue the process. More generally, the

results support the view that it is the quality of visual input, rather than its absence, that has the

stronger influence on the visual calibration of spatial hearing.

Based on the normal trajectory of MAA improvement through childhood, auditory spatial acuity

in adults with amblyopia is similar to that of children between 1.5 and 5 years of age (Litovsky

& Ashmead, 1997). This age range corresponds roughly to the age of onset for the most common

forms of amblyopia (Birch & Holmes, 2010; Repka et al., 2002), raising the possibility that

amblyopia or its etiological factors (e.g., strabismus or anisometropia) interfere with the visually-

guided maturation of auditory spatial abilities. Alternatively, the loss of auditory spatial acuity

associated with amblyopia could represent regression to the level of a normal 1.5- to 5-year-old

caused by anomalous visual input during a sensitive period in auditory system development.

Why does this amblyopic interference with auditory spatial development occur despite access to

high resolution visual spatial information from the fellow eye? This disconnect between the

binaural (auditory) spatial acuity and binocular (visual) spatial acuity may represent

physiological differences between the retinocollicular pathway involved in aligning and

calibrating the auditory space map and the retinogeniculostriate pathway responsible for visual

perception. Under binocular viewing conditions, perceptual dominance of the fellow eye is a

function of suppression of the signal from the amblyopic eye (Babu et al., 2013; J. Li et al.,

2011). Amblyopic suppression is mediated, however, by inhibitory interactions in the primary

visual cortex (Sengpiel, Jirmann, Vorobyov, & Eysel, 2006). If visual calibration of auditory

space occurs in the superior colliculus, as suggested (King, 2009), the usual cortical mechanisms

for amblyopic suppression may be bypassed. Without an independent midbrain mechanism to

suppress signals from the amblyopic eye, signals from both eyes would likely be equally salient

in their collicular representation of visual space.

More importantly, however, the primary visual cortex is also widely posited to be the site of the

neural deficit underlying amblyopia (Kiorpes et al., 1998; Movshon et al., 1987) (see Barrett et

al. (2004) for review). Therefore, the amblyopic visual deficit, as commonly defined, likely does

not affect the retinocollicular pathway. This suggests that the loss of auditory spatial acuity may

be an auditory analog of amblyopia caused by the same amblyogenic factors, but arising de novo

in the retinocollicular pathway. A similar pathologic mechanism involving direct retinocollicular

input has been previously proposed to explain the abnormally long saccadic latencies observed in

amblyopia (Ciuffreda et al., 1978).

Clinical markers of visual impairment, namely, visual acuity in the amblyopic eye and stereo

acuity, did not correlate significantly with the width of the MAA among the participants with

amblyopia. While the relevant predictors of the amblyopic deficit in MAA remain to be

determined, the width of the MAA may depend on historical factors such as age of onset,

age at treatment, and duration of patching, that are generally not known or remembered.

Furthermore, the lack of relation between MAA and clinical markers of amblyopia may reflect a

relatively short sensitive period for recovery of MAA compared to that for visual acuity. Indeed,

another amblyopic deficit possibly mediated by the superior colliculus—prolongation of saccadic

latency—can persist despite successful visual rehabilitation (Ciuffreda et al., 1978).

In addition to widening of the MAA, people with amblyopia also showed a significant tendency

to mislocalize sounds in the auditory hemifield ipsilateral to their amblyopic eye. This pattern of

auditory localization deficits is remarkable because it does not match the pattern of visual spatial

deficits observed in amblyopia (Hess & Pointer, 1985; Sireteanu et al., 2008). Although

participants localized sounds using a visually-guided cursor, the task was done with both eyes

open, and the specified click locations were well within the field of view of the fellow eye even

for the most eccentric auditory targets at 16° left and right of the midline. The asymmetry in

sound localization error therefore cannot be attributed to difficulty seeing the visual cursor.

Furthermore, the pattern does not reflect the functional anatomy of the retinogeniculostriate

pathway, because the left and right primary visual cortices receive equal input from each eye,

ensuring that monocular visual loss does not cause blindness in half of the visual field (i.e.,

homonymous hemianopia). Rather, the hemispatial asymmetry in sound mislocalization is

suggestive of the coordinate framework of the retinocollicular pathway, because retinal input to

each superior colliculus is largely crossed from the contralateral eye (Lane et al., 1973; Pollack

& Hickey, 1979). That significant correlations between the sound mislocalization and the

severity of amblyopic visual deficits were restricted to the ipsilateral hemifield provides

additional evidence of retinocollicular involvement by the same reasoning. Taken together, these

findings provide the first behavioural evidence that the retinocollicular pathway functions to

calibrate auditory spatial abilities in humans.

Chapter 5 Study III

Study III: Alterations in Audiovisual Simultaneity Perception in Amblyopia

5.1 Abstract

Amblyopia is a developmental visual impairment that is increasingly recognized to affect higher-

level perceptual and multisensory processes. To further investigate the audiovisual perceptual

impairments associated with this condition, we characterized the temporal interval in which

asynchronous auditory and visual stimuli are perceived as simultaneous 50% of the time (i.e., the

audiovisual simultaneity window). Adults with unilateral amblyopia (n = 17) and visually normal

controls (n = 17) judged the simultaneity of a flash and a click presented with both eyes viewing.

The signal onset asynchrony (SOA) varied from 0 ms to 450 ms for auditory-lead and visual-lead

conditions. A subset of participants with amblyopia (n = 6) was tested monocularly. Compared

to the control group, the auditory-lead side of the audiovisual simultaneity window was widened

by 48 ms (36%; p = 0.002), whereas that of the visual-lead side was widened by 86 ms (37%; p =

0.02). The overall mean window width was 500 ms, compared to 366 ms among controls (37%

wider; p = 0.002). Among participants with amblyopia, the simultaneity window parameters

were unchanged by viewing condition, but subgroup analysis revealed differential effects on the

parameters by amblyopia severity, etiology, and foveal suppression status. Possible mechanisms

to explain these findings include visual temporal uncertainty, interocular perceptual latency

asymmetry, and disruption of normal developmental tuning of sensitivity to audiovisual

asynchrony.

5.2 Introduction

Amblyopia is a developmental visual impairment caused by abnormal visual experience during a

critical period in early childhood. It has a prevalence of 2–4% (Attebo et al., 1998; Brown et al.,

2000; Buch et al., 2001; Friedman et al., 2009; Preslan & Novak, 1996; Thompson et al., 1991;

Vinding et al., 2009), and is recognized as a leading cause of monocular blindness (Buch et al.,

2001; Krueger & Ederer, 1984). Clinically, it presents as a unilateral, or rarely bilateral,

reduction in best-corrected visual acuity that cannot be explained solely by a structural eye

abnormality. It is often accompanied by one or more factors, most commonly strabismus (eye

misalignment) or anisometropia (difference in refractive error between the eyes) that interfere

with normal binocular visual experience (American Academy of Ophthalmology Pediatric

Ophthalmology/Strabismus Panel, 2012).

While it is classically understood as a predominantly monocular visual disorder affecting low-

level visual functions such as optotype acuity, stereopsis, and contrast sensitivity (Abrahamsson

& Sjostrand, 1988; Hess & Howell, 1977; Levi & Harwerth, 1977; Levi et al., 1994), amblyopia

is increasingly recognized to involve deficits in higher-level perceptual processing. Affected

individuals show impairments in global shape detection (Hess et al., 1999), real-world scene

perception (Mirabella et al., 2011), motion processing (Aaen-Stockdale & Hess, 2008; Simmers,

Ledgeway, Hess, & McGraw, 2003), and feature counting (Sharma et al., 2000) that affect not

only the amblyopic eye, but also often extend to the fellow eye (Giaschi et al., 1992; Ho et al.,

2005; Kovacs et al., 2000). Beyond the purely visual domain, recent work has shown that

amblyopia also affects multisensory integration in speech perception, manifest as reduced

susceptibility to the McGurk effect, even while viewing with both eyes (Burgmeier et al., 2015;

Narinesingh et al., 2015; Narinesingh et al., 2014).

Multisensory integration is the process by which information from the various senses is

associated and merged into a unified percept. It confers broad advantages in terms of response

time (Morrell, 1968) and accuracy of discrimination (Frassinetti et al., 2002) (see Ernst and

Bulthoff (2004) for review). In infancy, normal visual experience during a critical period is

necessary for the emergence of robust integration of auditory and visual signals (Putzar et al.,

2007; Putzar, Hötting, et al., 2010; Wallace et al., 2004). In turn, audiovisual integration plays an

important role in the development of higher level perceptual functions including speech

acquisition in infancy (Kushnerenko et al., 2008; Lewkowicz & Hansen-Tift, 2012) and speech

comprehension in adulthood (Driver, 1996; Grant & Seitz, 2000; Grant et al., 1998; Sumby &

Pollack, 1954). Interestingly, deficits in multisensory integration have been increasingly

recognized as a feature of various neurodevelopmental disorders, including autism (Stevenson et

al., 2014), dyslexia (Hairston, Burdette, Flowers, Wood, & Wallace, 2005), and schizophrenia

(Foucher, Lacambre, Pham, Giersch, & Elliott, 2007; Martin, Giersch, Huron, & van

Wassenhove, 2013), but the mechanism remains elusive.

Visual and auditory stimuli presented in close temporal and spatial correspondence are likely to

be perceived as arising from a single event. This process, termed perceptual binding, is a rapid

pre-attentive process that occurs without the conscious awareness of the observer, and constitutes

a fundamental rule for learning associations between stimuli (Driver, 1996; McGurk &

MacDonald, 1976; Sekuler et al., 1997). Neuroimaging studies indicate that the temporal

correspondence of auditory and visual speech stimuli activates a broad network, including the

superior colliculus (SC), anterior insula, and anterior intraparietal sulcus (IPS), while perceptual

fusion (e.g. as in the McGurk effect) is associated with activation in the multisensory superior

temporal sulcus (mSTS), the middle IPS, and regions of the primary auditory cortex (Macaluso,

George, Dolan, Spence, & Driver, 2004; Miller & D'Esposito, 2005; Stevenson, VanDerKlok,

Pisoni, & James, 2011). Similar studies of non-speech stimuli (e.g. click-flash pairs) have shown

that temporal correspondence of simple audiovisual stimuli activates the SC, mSTS, IPS, and

insula (Calvert et al., 2001), while detection or perception of asynchrony is associated with

activation of an extensive network including the insula, posterior parietal, and prefrontal regions,

with the right insula being involved most significantly (Bushara et al., 2001). Furthermore,

Noesselt et al. (2007) showed that temporal correspondence of simple audiovisual stimuli not

only activates the mSTS, but also affects activity in the primary auditory and visual cortices,

likely by a feedback mechanism from the mSTS.

The temporal interval during which separate visual and auditory stimuli are perceived reliably as

simultaneous is termed the audiovisual simultaneity window, and reflects an equilibrium

between the sensitivity to signal asynchrony (which narrows the audiovisual simultaneity

window) and the tendency toward perceptual binding (which widens the audiovisual simultaneity

window). It is measured using a single-interval forced-choice simultaneity judgment task for

audiovisual stimulus pairs presented with varying signal onset asynchrony (SOA). It typically

has a bell-shaped distribution with a slight skew toward the visual-lead side of objective

simultaneity (Slutsky & Recanzone, 2001; Stevenson & Wallace, 2013; Zampini, Guest, et al.,

2005). Furthermore, audiovisual stimuli are typically perceived as maximally simultaneous when

the visual stimulus slightly precedes the sound. This visual-lead shift in the point of subjective

simultaneity (PSS) is commonly believed to reflect either tuning to the natural condition in

which light waves reach the eyes before sound waves reach the ears, or the neural delay related

to slower processing of visual signals (Vroomen & Keetels, 2010). The audiovisual simultaneity

window progressively narrows on both auditory-lead and visual-lead sides from childhood

through adolescence, reaching the adult shape by 9 to 17 years of age (Chen et al., 2016; Hillock-

Dunn & Wallace, 2012; Hillock et al., 2011; Lewkowicz & Flom, 2014). Interestingly,

individuals with a narrower audiovisual simultaneity window, particularly on the visual-lead

side, experience a stronger McGurk effect, suggesting that the audiovisual simultaneity window

may be an index of broader audiovisual integrative function (Stevenson, Zemtsov, et al., 2012).

For an individual with a developmentally normal sensorium, the overall width of the audiovisual

simultaneity window is not fixed, but varies depending on the characteristics of the stimuli.

Complex stimuli such as natural speech and audiovisual stimuli with high semantic congruency

result in a wider audiovisual simultaneity window than simple flash-beep stimuli (Stevenson &

Wallace, 2013; van Wassenhove et al., 2007). Increased spatial separation between the paired

stimuli (Keetels & Vroomen, 2005; Zampini, Guest, et al., 2005), as well as availability of visual

predictive information about when to expect an audiovisual event to occur (Petrini, Russell, &

Pollick, 2009), result in a narrower audiovisual simultaneity window. Its width can be further

narrowed by various forms of perceptual learning—short-term audiovisual and visual-only

training with feedback (Powers et al., 2009; Stevenson et al., 2013), long-term musical training

(Lee & Noppeney, 2011a), and video gaming experience (Donohue, Woldorff, & Mitroff, 2010).

In addition to the width of the audiovisual simultaneity window, its peak, or point of subjective

simultaneity, is also variable. Repeated exposure to asynchronous stimuli shifts it toward the

trained asynchrony in a process termed temporal recalibration (Fujisaki et al., 2004; Navarra et

al., 2005; Roseboom & Arnold, 2011). Furthermore, the presence of an additional visual stimulus

that closely precedes or follows a synchronous audiovisual pair biases the PSS away from the

additional stimulus (Roseboom, Nishida, & Arnold, 2009).

Abnormal early visual experience has been shown to affect multisensory processing. Adults with

early pattern vision deprivation from bilateral congenital cataracts have an audiovisual

simultaneity window that is selectively broadened on the visual-lead side (Chen et al., 2017), as

well as diminished audiovisual interaction in speech perception (Putzar, Hötting, et al., 2010),

and a shift in attentional balance toward audition (de Heering et al., 2016). In contrast, the

audiovisual simultaneity window of adults with unilateral congenital cataract is symmetrically

broadened, similar to that seen in typically-developing children (Chen et al., 2017). Audiovisual

interactions have also been studied in monocular adults with a history of early enucleation. Like

those with unilateral amblyopia, this population shows reduced susceptibility to the McGurk

effect, but demonstrates normal responses to illusions involving temporal audiovisual integration

such as the sound-induced flash illusion and audiovisual simultaneity judgments (Moro &

Steeves, 2015). This suggests that audiovisual integration deficits may be specific to the nature

of the visual sensory disturbance during the critical period.

Despite its relatively high prevalence, much less is known about the extent of the multisensory

deficits in unilateral amblyopia from strabismus and anisometropia. Specifically, it is unclear

whether the audiovisual integration deficits in these forms of amblyopia are specific to speech, or

whether they reflect a broader impairment in multisensory processing. Evidence from visually

normal adults suggests that susceptibility to the McGurk effect is correlated with other indices of

temporal audiovisual integration (Stevenson, Zemtsov, et al., 2012). One such index is the

audiovisual simultaneity window. Visually normal individuals with lower susceptibility to the

McGurk effect have a wider audiovisual simultaneity window, indicating altered processing of

asynchronous multimodal signals (Stevenson, Zemtsov, et al., 2012). Based on this evidence

from visually normal adults and our previous studies showing that adults with amblyopia are less

susceptible to the McGurk effect (Narinesingh et al., 2015; Narinesingh et al., 2014), we

hypothesized that adults with unilateral amblyopia would also show a symmetrically broadened audiovisual

simultaneity window under binocular and monocular viewing conditions, indicating a higher-

level alteration in audiovisual integration that is generalized beyond speech.

5.3 Materials and Methods

5.3.1 Participants

Participants were adults aged 18 to 48 years, with no history of neurological, auditory, or visual

disorders other than amblyopia, strabismus, or ametropia. Each participant was assessed by a

certified orthoptist or ophthalmologist to document visual acuity (standard ETDRS chart),

stereoacuity (Randot circles test and Titmus fly test), binocularity (Worth 4-dot test), eye

alignment (cover-uncover and alternate cover tests), and refractive correction. Amblyopia was

defined as a visual acuity of 0.18 logMAR (20/30) or worse in the amblyopic eye, and an inter-

ocular difference of at least 0.2 logMAR (2 lines on the ETDRS chart). Anisometropic

amblyopia was defined as an inter-ocular difference of 1 diopter (D) or more in either spherical

equivalent or astigmatic correction. Strabismic amblyopia was defined as any manifest deviation

on cover testing in the absence of anisometropia. Mixed amblyopia was defined as the presence

of both anisometropia and a manifest deviation of 8 prism diopters or more. Visually normal was

defined as visual acuity of 0.1 logMAR (20/25) or better in each eye. All participants completed a

hearing test on a commercially-available screening audiometer (model MA 27, MAICO

Diagnostics, Eden Prairie, MN, USA) with circumaural headphones (model TDH 39, MAICO

Diagnostics, Eden Prairie, MN, USA) to ensure reliable responses to low level (≤25 dB) pure

tones at a standard set of frequencies (0.5, 1, 2, and 4 kHz) (Student Support Services Team,

2008). Participants were excluded if they had a history of any other ocular pathology, previous

intraocular surgery, high ametropia (hyperopia > +5D or myopia > -6D), hearing impairment,

neurological disease, or neurodevelopmental disorder. Written informed consent was obtained

from all participants. The study was approved by the Research Ethics Board at The Hospital for

Sick Children, and all protocols adhered to the tenets of the Declaration of Helsinki.
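The inclusion and subtype criteria above can be expressed as a short sketch. This is an illustration only, not the screening procedure used in the study: the function names and inputs are hypothetical, and the handling of anisometropia combined with a manifest deviation smaller than 8 prism diopters (treated here as anisometropic) is an assumption, because the text does not specify that case.

```python
def meets_amblyopia_definition(va_amblyopic, va_fellow):
    """Acuity criteria from the text, both values in logMAR (higher is worse)."""
    interocular_difference = round(va_amblyopic - va_fellow, 2)
    return va_amblyopic >= 0.18 and interocular_difference >= 0.2


def is_visually_normal(va_right, va_left):
    """Visually normal: 0.1 logMAR or better in each eye."""
    return va_right <= 0.1 and va_left <= 0.1


def classify_subtype(anisometropia_d, deviation_pd):
    """Subtype from interocular refractive difference (D) and manifest deviation (PD).

    anisometropia_d: larger of the interocular differences in spherical
    equivalent and astigmatic correction (hypothetical input).
    """
    has_anisometropia = anisometropia_d >= 1.0
    if has_anisometropia and deviation_pd >= 8:
        return "mixed"
    if has_anisometropia:
        # Assumption: a deviation below 8 PD with anisometropia is not "mixed".
        return "anisometropic"
    if deviation_pd > 0:
        return "strabismic"
    return "unclassified"
```

For example, an amblyopic-eye acuity of 0.18 logMAR with a fellow-eye acuity of -0.10 logMAR satisfies both acuity criteria, since the interocular difference is 0.28 logMAR.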

Participants were recruited from November 2014 to February 2016 through flyers posted on

hospital property and advertisements posted on the social media websites Craigslist.ca and

Kijiji.ca. Of 26 individuals with amblyopia recruited, 17 passed the screening examinations and

participated in the study (3 males, mean age: 29 years, range: 19–48 years). An equal number of

visually normal naive control participants were recruited in a similar fashion (4 males, mean age:

29 years, range: 22–47 years). The clinical characteristics of the participants with amblyopia are

summarized in Table 5.1.

Table 5.1: Characteristics of participants with amblyopia

| Participant | Age | Subtype | Visual acuity RE (logMAR) | Visual acuity LE (logMAR) | Refractive correction RE | Refractive correction LE | Stereo acuity (arc sec) | Worth 4-dot response |
|---|---|---|---|---|---|---|---|---|
| 1 | 29 | Strab | 0.00 | 1.00 | None | None | Not measurable | LE suppressed |
| 2 | 22 | Aniso | 0.00 | 0.48 | -1.50 +0.50 x 80 | +1.00 +1.25 x 95 | 200 | Fused |
| 3 | 48 | Aniso | 0.70 | 0.00 | +2.25 +0.25 x 174 | -0.75 | 3000 | Fused |
| 4 | 36 | Aniso | 0.00 | 0.40 | -1.00 | +1.00 | 140 | Fused |
| 5 | 29 | Aniso | 0.48 | -0.10 | -5.00 | -1.25 | 3000 | Fused |
| 6 | 23 | Aniso | -0.10 | 0.48 | -2.25 | +0.25 +2.25 x 85 | 200 | Fused |
| 7 | 29 | Aniso | 0.10 | 0.70 | -1.50 +1.50 x 100 | -3.00 +1.50 x 93 | Not measurable | LE suppressed |
| 8 | 32 | Strab | -0.10 | 0.18 | None | None | 70 | Fused |
| 9 | 29 | Mixed | 0.00 | 1.00 | Plano | +3.50 +2.00 x 90 | Not measurable | LE suppressed |
| 10 | 19 | Aniso | 0.00 | 0.18 | -0.75 +2.00 x 84 | -2.75 +4.50 x 99 | 40 | Fused |
| 11 | 37 | Mixed | -0.10 | 1.30 | -1.00 | +6.00 +2.50 x 120 | Not measurable | LE suppressed |
| 12 | 32 | Aniso | -0.10 | 0.54 | Plano | +2.00 +2.00 x 124 | 140 | Fused |
| 13 | 23 | Strab | 0.20 | 0.00 | +0.50 +0.50 x 28 | +1.25 +0.50 x 88 | Not measurable | Diplopic |
| 14 | 44 | Mixed | 0.90 | 0.00 | +6.00 +1.25 x 75 | -0.75 | Not measurable | RE suppressed |
| 15 | 22 | Aniso | 1.10 | -0.10 | -6.00 +0.75 x 174 | -4.50 +0.50 x 75 | 3000 | Fused |
| 16 | 19 | Mixed | 0.48 | 0.00 | +3.00 +1.00 x 130 | +4.25 | 3000 | Fused |
| 17 | 27 | Strab | 0.00 | 0.48 | -6.25 +1.00 x 45 | -5.50 +1.25 x 135 | 200 | Fused |

Abbreviations: RE, right eye; LE, left eye; Aniso, anisometropic; Strab, strabismic.


Experiments were performed in a dark, sound attenuating chamber (internal dimensions 2.0 x 2.1

x 2.2 m) lined with 5 cm acoustic wedge foam (Foam Factory, Macomb, MI, USA). The

background noise was 39.0 dBA sound pressure level (SPL). Visual stimuli were gray Gaussian

blobs (6 SD = 4°) presented centrally for 33 ms (2 frames at 60 Hz) on a 165 cm LED monitor

(NEC, model E654, Tokyo, Japan). Auditory stimuli were 32 ms white noise click trains

(including a 2 ms sigmoid on/off ramp) presented at 62.0 dBA SPL via stereo speakers (HP Inc.,

model BR387AA#ABA, Palo Alto, CA, USA) mounted on either side of the monitor. Stimuli

were created digitally and controlled using a custom-written program, and participant responses

were collected directly via a gamepad (Logitech, model F710, Newark, CA, USA). The visual

and acoustic signals were horizontally aligned at eye level of the seated participant, and relative

timing was confirmed with an oscilloscope.

5.3.3 Procedure

The audiovisual simultaneity window was characterized using a two-alternative forced-choice

(2AFC) simultaneity judgment task. With the head stabilized on a chinrest 65 cm from the LED

monitor, participants were required to fixate a central red dot on the monitor (0.7°) and press a

button on the gamepad to initiate each trial. Following a random interval of 500 to 1500 ms

during which the screen was dark, a flash-click pair was presented, and the participant indicated

whether the two stimuli were “simultaneous” or “not simultaneous”. The signal onset

asynchrony (SOA) was varied from -450 ms (auditory stimulus presented first, i.e., auditory-

lead) to +450 ms (visual stimulus presented first, i.e., visual-lead) in 75 ms increments (i.e. -450,

-375, -300, -225, -150, -75, 0, +75, +150, +225, +300, +375, +450 ms) for a total of 13 SOA

levels (Figure 5.1). There were 20 trials for each SOA level, randomly interleaved in a single

block, typically taking 12–15 minutes to complete. Data were collected under binocular viewing

conditions for all participants. Data were also collected under amblyopic eye and fellow eye

monocular viewing conditions for a subset of 6 participants with amblyopia to determine if any

group effects were dependent on viewing condition.
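The trial structure described above (13 SOA levels, 20 trials per level, randomly interleaved, each preceded by a dark interval of 500 to 1500 ms) can be sketched as follows. This is a minimal illustration and not the custom-written program used in the study; the function name, the dictionary keys, and the uniform distribution of the dark interval are assumptions.

```python
import random

# SOA levels in ms: negative = auditory-lead, positive = visual-lead.
SOA_LEVELS_MS = list(range(-450, 451, 75))
TRIALS_PER_SOA = 20


def build_trial_schedule(seed=None):
    """Return one block of randomly interleaved trials.

    Each trial carries its SOA and a pre-stimulus dark interval drawn
    uniformly from 500 to 1500 ms (the distribution is an assumption).
    """
    rng = random.Random(seed)
    soas = [soa for soa in SOA_LEVELS_MS for _ in range(TRIALS_PER_SOA)]
    rng.shuffle(soas)
    return [
        {"soa_ms": soa, "dark_interval_ms": rng.uniform(500, 1500)}
        for soa in soas
    ]


schedule = build_trial_schedule(seed=1)
print(len(schedule), "trials across", len(SOA_LEVELS_MS), "SOA levels")
```

Passing a seed makes a block reproducible, which can help when checking stimulus timing against a recorded trial order.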

Figure 5.1: Schematic diagram of signal onset asynchronies (SOA) for auditory-lead and

visual-lead conditions.

5.3.4 Analysis

The proportion of “simultaneous” responses was calculated for each SOA, and the response

distribution was fitted with a previously described truncated Gaussian function using the

maximum likelihood method (Fujisaki et al., 2004). The correlation coefficient of the fit was

≥0.93 for each individual. The audiovisual simultaneity window width was defined as the width

of the fitted function at the 50% simultaneous response level, with the SOA to the left and right

of 0 ms (i.e. physical simultaneity) representing the auditory-lead threshold and visual-lead

threshold, respectively. The point of subjective simultaneity (PSS) was defined as the mean of the

fitted truncated Gaussian function. Group parameters were calculated as the arithmetic means of

the individual participant parameters. Sample data with fitted function are shown in Figure 5.2.

All curve fitting and parameter calculations were done using MATLAB version 2011b

(Mathworks, Inc., Natick, MA, USA).
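The fitting and parameter extraction steps can be sketched in Python as below. This is not the MATLAB code used in the study. The model is a scaled Gaussian clipped to stay strictly between 0 and 1, used here as a stand-in that may differ from the exact truncated Gaussian of Fujisaki et al. (2004). The coarse grid search, the grid ranges, and the example response counts are all hypothetical.

```python
import math

SOAS_MS = list(range(-450, 451, 75))  # 13 SOA levels, auditory-lead negative
N_TRIALS = 20                         # trials per SOA level


def p_simultaneous(soa, amp, mu, sigma):
    """Scaled Gaussian, clipped so the log-likelihood stays finite."""
    p = amp * math.exp(-((soa - mu) ** 2) / (2.0 * sigma ** 2))
    return min(max(p, 1e-6), 1.0 - 1e-6)


def neg_log_likelihood(amp, mu, sigma, counts):
    """Binomial negative log-likelihood of 'simultaneous' counts per SOA."""
    nll = 0.0
    for soa, k in zip(SOAS_MS, counts):
        p = p_simultaneous(soa, amp, mu, sigma)
        nll -= k * math.log(p) + (N_TRIALS - k) * math.log(1.0 - p)
    return nll


def fit_by_grid_search(counts):
    """Coarse maximum likelihood fit over an assumed parameter grid."""
    best = None
    for amp in [a / 20 for a in range(11, 21)]:   # 0.55 to 1.00
        for mu in range(-100, 201, 10):           # ms
            for sigma in range(50, 401, 10):      # ms
                nll = neg_log_likelihood(amp, mu, sigma, counts)
                if best is None or nll < best[0]:
                    best = (nll, amp, mu, sigma)
    return best[1], best[2], best[3]


def window_parameters(amp, mu, sigma):
    """SOAs where the fitted curve crosses 50%; requires amp > 0.5."""
    half_width = sigma * math.sqrt(2.0 * math.log(2.0 * amp))
    return {
        "auditory_lead_ms": mu - half_width,
        "visual_lead_ms": mu + half_width,
        "width_ms": 2.0 * half_width,
        "pss_ms": mu,  # mean of the fitted Gaussian
    }


# Hypothetical 'simultaneous' counts (out of 20) at each SOA level.
example_counts = [0, 0, 1, 4, 8, 14, 19, 20, 16, 10, 5, 2, 1]
amp, mu, sigma = fit_by_grid_search(example_counts)
print(window_parameters(amp, mu, sigma))
```

Because this stand-in curve is symmetric about its mean, the two thresholds lie at equal distances from the PSS. The grid search is chosen only to keep the sketch dependency-free; a numerical optimizer would normally replace it.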

Figure 5.2: Sample audiovisual simultaneity judgment data from a visually normal control

participant, fitted with a truncated Gaussian function by the maximum likelihood method.

The psychometric parameters (i.e., audiovisual simultaneity window width, auditory-lead

threshold and visual-lead threshold), were estimated at the 50% simultaneous response level. AV

= audiovisual.

Performance parameters (i.e., auditory-lead threshold, visual-lead threshold, audiovisual

simultaneity window width, and PSS) were compared between groups using one-way analysis of

variance (ANOVA) and Tukey post hoc multiple comparisons. Homogeneity of variances was

verified in each case by Levene’s test. Subgroup analyses were performed based on 4 common

clinical factors in amblyopia: (1) severity of the monocular acuity deficit, (2) presumed etiology,

(3) presence or absence of foveal suppression, and (4) level of stereopsis. Amblyopia severity

was classified as moderate if the acuity was ≤ 0.6 logMAR in the amblyopic eye, and as severe if

the acuity was >0.6 logMAR (American Academy of Ophthalmology Pediatric

Ophthalmology/Strabismus Panel, 2012). Presumed etiology was classified as either

anisometropic or strabismic/mixed. Foveal suppression status was classified as suppressed or

non-suppressed based on results from the Worth 4-dot test. Level of stereopsis was classified as

fine (i.e., some Randot circles; ≤400 seconds of arc) or poor (i.e., no Randot circles).

Associations between the 4 clinical factors were assessed using 2x2 contingency tables and the

phi coefficient (Φ). All statistics were computed using IBM SPSS Statistics version 22 (Armonk,

NY, USA). Statistical significance was defined as p < 0.05.
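For readers who want to check the reported statistics, the two main computations can be sketched as below. This is an illustration, not the SPSS procedure used in the study, and the function names are hypothetical. The F statistic is computed from group sizes, means, and standard deviations, so results based on the rounded values in the tables will only approximate the reported ones.

```python
import math


def one_way_anova_from_summary(groups):
    """One-way ANOVA F statistic from (n, mean, sd) tuples, one per group."""
    n_total = sum(n for n, _, _ in groups)
    grand_mean = sum(n * mean for n, mean, _ in groups) / n_total
    ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups)
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def phi_coefficient(a, b, c, d):
    """Phi for a 2x2 contingency table with rows (a, b) and (c, d)."""
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denom == 0:
        raise ValueError("phi is undefined when a row or column total is zero")
    return (a * d - b * c) / denom


# Window width, control vs. amblyopia, using the rounded values in Table 5.2.
f_stat, df1, df2 = one_way_anova_from_summary([(17, 366, 91), (17, 500, 136)])
print(f"F({df1},{df2}) = {f_stat:.2f}")
```

With the rounded window widths from Table 5.2, this gives an F of approximately 11.4, close to the reported 11.313. The small difference is expected from rounding of the means and standard deviations.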

5.4 Results

5.4.1 Binocular Viewing Condition

5.4.1.1 Main Group Analysis

The audiovisual simultaneity window in adults with unilateral amblyopia was broadened by 134

ms, or 37%, compared to control participants (F(1,32) = 11.313, p = 0.002) when viewing

binocularly (Figure 5.3 and Table 5.2). The auditory-lead side of the audiovisual simultaneity

window was wider by 48 ms (36%; F(1,32) = 11.012, p = 0.002), and the visual-lead side was

wider by 86 ms (37%; F(1,32) = 6.00, p = 0.02). There was no significant difference in the PSS

between the control and amblyopia group.

Figure 5.3: Main group analysis for audiovisual simultaneity judgment responses with both

eyes viewing as a function of SOA. Comparison between control (n = 17) and amblyopia (n =

17) participant groups. Error bars represent standard error of the mean.

Table 5.2: Audiovisual simultaneity window parameters by main group

SOA, mean ± SD (ms)

| Performance parameter | Control (n = 17) | Amblyopia (n = 17) | F(1,32) | p-value |
|---|---|---|---|---|
| Auditory-lead threshold | -136 ± 34 | -183 ± 49* | 11.012 | 0.002 |
| Visual-lead threshold | 231 ± 83 | 317 ± 119* | 6.000 | 0.020 |
| AV simultaneity window width | 366 ± 91 | 500 ± 136* | 11.313 | 0.002 |
| PSS | 47 ± 44 | 67 ± 60 | 1.131 | 0.295 |

Abbreviations: * p < 0.05 (one-way ANOVA); SOA, signal onset asynchrony; SD, standard deviation; AV, audiovisual.

5.4.1.2 Subgroup Analysis by Clinical Factors

5.4.1.2.1 Amblyopia Severity

Results of the subgroup analysis by amblyopia severity are summarized in Table 5.3 and Figure

5.4A. In the moderate amblyopia subgroup (n = 10), the auditory-lead threshold was broadened

by 45 ms (33%; p = 0.032), but the other parameters (visual-lead threshold, audiovisual

simultaneity window, and PSS) were not significantly different from the control group. In the

severe amblyopia subgroup (n = 7), three parameters were broadened compared to the control

group: the auditory-lead threshold by 51 ms (38%; p = 0.030), the visual-lead threshold by 155

ms (67%; p = 0.003), and the audiovisual simultaneity window by 207 ms (57%; p = 0.001). The

PSS in the severe amblyopia group showed a non-significant trend toward a visual-lead shift

compared to the control group (p = 0.064).

Within the amblyopia group (i.e., moderate vs. severe), severity was significantly related to only

the visual-lead threshold, with those classified as severe having a threshold 118 ms wider

compared to those classified as moderate (p = 0.043). Severe amblyopia also showed non-

significant trends toward a wider simultaneity window (p = 0.068) and a visual-lead shifted PSS

(p = 0.071) compared to the moderate group.

Figure 5.4: Subgroup analyses for audiovisual simultaneity judgment responses with both

eyes viewing as a function of SOA. (A) Comparison by amblyopia severity. (B) Comparison by

presumed etiology. (C) Comparison by foveal suppression status. (D) Comparison by level of

stereopsis. Error bars represent standard error of the mean.

Table 5.3: Audiovisual simultaneity window parameters by amblyopia severity

| Performance parameter | Control (n = 17) | Amblyopia: Moderate (n = 10) | Amblyopia: Severe (n = 7) | F(2,31) | Omnibus p-value |
|---|---|---|---|---|---|
| Auditory-lead threshold | -136 ± 34 | -180 ± 39* | -186 ± 63* | 5.393 | 0.010 |
| Visual-lead threshold | 231 ± 83 | 268 ± 103 | 386 ± 111*† | 6.700 | 0.004 |
| AV simultaneity window width | 366 ± 91 | 448 ± 126 | 572 ± 122* | 9.120 | 0.001 |
| PSS | 47 ± 44 | 44 ± 45 | 100 ± 67 | 3.281 | 0.051 |

Abbreviations: * p < 0.05 (vs. Control group post hoc); † p < 0.05 (vs. Moderate group post hoc); SOA, signal onset asynchrony; SD, standard deviation; AV, audiovisual.

5.4.1.2.2 Amblyopia Etiology

Results of the subgroup analysis by amblyopia etiology are summarized in Table 5.4 and Figure

5.4B. In the anisometropic subgroup, the auditory-lead threshold was broadened by 75 ms (56%;

p < 0.001) and the audiovisual simultaneity window was broadened by 134 ms (37%; p = 0.025),

but the visual-lead threshold and PSS were not significantly different from the control group. In

the strabismic/mixed subgroup, the visual-lead threshold was broadened by 116 ms (50%; p =

0.032), and the audiovisual simultaneity window was broadened by 133 ms (36%; p = 0.033),

but unlike the anisometropic group, the auditory-lead threshold was not significantly different

compared to the control group. There was a non-significant trend toward a visual-lead shifted

PSS in the strabismic/mixed group compared to the control group (p = 0.064).

Within the amblyopia group (i.e. anisometropic vs. strabismic/mixed), etiology was significantly

related to the auditory-lead threshold, with those classified as anisometropic having a threshold

57 ms wider compared to those classified as strabismic/mixed (p = 0.009). The PSS in the

strabismic/mixed group also showed a non-significant trend toward a visual-lead shift compared

to the anisometropic group (p = 0.058).

Table 5.4: Audiovisual simultaneity window parameters by amblyopia etiology

| Performance parameter | Control (n = 17) | Amblyopia: Aniso (n = 9) | Amblyopia: Strab/mixed (n = 8) | F(2,31) | Omnibus p-value |
|---|---|---|---|---|---|
| Auditory-lead threshold | -136 ± 34 | -210 ± 44*† | -153 ± 34 | 12.165 | <0.001 |
| Visual-lead threshold | 231 ± 83 | 289 ± 107 | 348 ± 131* | 3.689 | 0.037 |
| AV simultaneity window width | 366 ± 91 | 500 ± 134* | 500 ± 147* | 5.480 | 0.009 |
| PSS | 47 ± 44 | 40 ± 47 | 97 ± 61 | 3.513 | 0.042 |

Abbreviations: * p < 0.05 (vs. Control group post hoc); † p < 0.05 (vs. Strab/mixed group post hoc); SOA, signal onset asynchrony; SD, standard deviation; Aniso, anisometropic; Strab, strabismic; AV, audiovisual.

5.4.1.2.3 Foveal Suppression Status

Results of the subgroup analysis by foveal suppression status are summarized in Table 5.5 and

Figure 5.4C. In the non-suppressed subgroup, the auditory-lead threshold was broadened by 59

ms (43%; p = 0.002) and the audiovisual simultaneity window was broadened by 116 ms (32%;

p = 0.033), but the visual-lead threshold and PSS were not significantly different from the

control group. In the suppressed subgroup, the visual-lead threshold was broadened by 156 ms

(68%, p = 0.011), the audiovisual simultaneity window was widened by 177 ms (48%; p =

0.014), and the PSS was shifted toward the visual-lead condition by 68 ms (p = 0.025), but the

auditory-lead threshold was not significantly different compared to the control group.

Within the amblyopia group, suppression status was significantly related to the PSS only, with

those classified as suppressed having a PSS shifted 69 ms toward the visual-lead condition

compared to those classified as non-suppressed (p = 0.030).
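As a worked check on the arithmetic, the broadening values quoted for the suppressed subgroup can be recovered from the group means in Table 5.5. The following is a minimal sketch; the function name `broadening` and the dictionary keys are illustrative only, and any small discrepancy from the quoted values would reflect rounding of the tabulated means.

```python
# Group means (ms) copied from Table 5.5; names are illustrative only.
control = {"visual_lead": 231, "window_width": 366, "pss": 47}
suppressed = {"visual_lead": 387, "window_width": 543, "pss": 115}


def broadening(group_mean, control_mean):
    """Return (absolute difference in ms, percent change relative to control)."""
    diff = group_mean - control_mean
    return diff, 100.0 * diff / abs(control_mean)


for key in ("visual_lead", "window_width"):
    diff, pct = broadening(suppressed[key], control[key])
    print(f"{key}: {diff} ms ({pct:.0f}%)")

# The PSS is reported as an absolute shift rather than a percentage.
pss_shift = suppressed["pss"] - control["pss"]
print(f"pss shift: {pss_shift} ms")
```

The same function can be applied to any subgroup row in Tables 5.4 to 5.6, taking the control column as the reference.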


Table 5.5: Audiovisual simultaneity window parameters by suppression status

| Performance parameter | Control (n = 17) | Amblyopia: Non-suppressed (n = 12) | Amblyopia: Suppressed (n = 5) | F(2,31) | Omnibus p-value |
|---|---|---|---|---|---|
| Auditory-lead threshold | -136 ± 34 | -195 ± 53* | -156 ± 20 | 7.432 | 0.002 |
| Visual-lead threshold | 231 ± 83 | 287 ± 112 | 387 ± 114* | 5.041 | 0.013 |
| AV simultaneity window width | 366 ± 91 | 481 ± 149* | 543 ± 99* | 6.146 | 0.006 |
| PSS | 47 ± 44 | 46 ± 47 | 115 ± 65*† | 4.286 | 0.023 |

Abbreviations: * p < 0.05 (vs. Control group post hoc); † p < 0.05 (vs. Non-suppressed group post hoc); SOA, signal onset asynchrony; SD, standard deviation; W4D, Worth 4-dot test; AV, audiovisual.

5.4.1.2.4 Stereopsis Level

Results of the subgroup analysis by stereopsis level are summarized in Table 5.6 and Figure

5.4D. In the subgroup with fine stereopsis, none of the simultaneity window parameters were

significantly different from the control group, although there was a trend toward broadening of

the auditory-lead threshold that did not reach significance in post hoc testing (p = 0.055). In the

subgroup with gross stereopsis, the auditory-lead threshold was broadened by 49 ms (36%, p =

0.019), the visual-lead threshold was broadened by 103 ms (45%, p = 0.045), and the audiovisual

simultaneity window was broadened by 151 ms (41%; p = 0.007), but the PSS was not shifted

compared to the control group.

Within the amblyopia group, level of stereopsis was not significantly related to any simultaneity

window parameters.


Table 5.6: Audiovisual simultaneity window parameters by stereopsis level

| Performance parameter | Control (n = 17) | Amblyopia: Fine stereopsis (n = 7) | Amblyopia: Poor stereopsis (n = 10) | F(2,31) | Omnibus p-value |
|---|---|---|---|---|---|
| Auditory-lead threshold | -136 ± 34 | -182 ± 42 | -184 ± 55* | 5.343 | 0.010 |
| Visual-lead threshold | 231 ± 83 | 293 ± 112 | 333 ± 126* | 3.289 | 0.051 |
| AV simultaneity window width | 366 ± 91 | 475 ± 143 | 518 ± 136* | 5.861 | 0.007 |
| PSS | 47 ± 44 | 55 ± 45 | 75 ± 70 | 0.828 | 0.447 |

Abbreviations: * p < 0.05 (vs. Control group post hoc); SOA, signal onset asynchrony; SD, standard deviation; AV, audiovisual.

5.4.1.2.5 Associations Between Clinical Factors

Participants with strabismic/mixed amblyopia were significantly more likely to exhibit foveal

suppression on the Worth 4-dot test compared to those with anisometropic amblyopia (Φ =

0.685, p = 0.005). Etiology was not significantly associated with amblyopia severity (Φ = 0.169,

p = 0.486) or stereopsis level (Φ = 0.310, p = 0.201). Participants with severe amblyopia were

significantly more likely to demonstrate foveal suppression (Φ = 0.509, p = 0.036) and to have

poor stereopsis (Φ = 0.700, p = 0.004) compared to those with moderate amblyopia. Participants

with foveal suppression on the Worth 4-dot test were significantly more likely to have poor

stereopsis compared to those with a non-suppressed response (Φ = 0.540, p = 0.026).
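The Φ values above are phi coefficients for 2 × 2 contingency tables. A minimal sketch of the calculation follows. The cell counts are not reported in this section, so the counts used here are an assumption: they are inferred from the subgroup sizes in Tables 5.4 and 5.5 (8 strabismic/mixed, 9 anisometropic, 5 suppressed, 12 non-suppressed) by supposing that all five suppressed participants belonged to the strabismic/mixed group. The choice of a chi-square test with 1 degree of freedom for the p-value is likewise an assumption, since the test used is not named here.

```python
import math


def phi_coefficient(a, b, c, d):
    """Phi coefficient for the 2x2 table [[a, b], [c, d]]."""
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denom


def chi_square_p_1df(chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# ASSUMED counts, inferred from subgroup sizes (not reported in the text).
# Rows: strab/mixed, aniso; columns: suppressed, non-suppressed.
a, b, c, d = 5, 3, 0, 9
n = a + b + c + d

phi = phi_coefficient(a, b, c, d)
p = chi_square_p_1df(n * phi ** 2)  # chi-square = n * phi^2 for a 2x2 table
print(f"phi = {phi:.3f}, p = {p:.3f}")
```

Under these assumed counts the sketch gives values that agree with the reported association between etiology and suppression (Φ = 0.685, p = 0.005), which suggests, but does not confirm, that the inferred table is the one analyzed.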

5.4.2 Monocular Viewing Conditions

A subset of 6 participants with amblyopia was tested under monocular amblyopic eye-only and

fellow eye-only viewing conditions. The mean “simultaneous” response percentages are plotted

by SOA in Figure 5.5. Repeated measures ANOVAs, summarized in Table 5.7, showed no

significant differences in any performance parameters across viewing conditions among

participants with amblyopia.
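For readers unfamiliar with the analysis, a one-way repeated measures ANOVA removes between-participant variability before testing the effect of viewing condition. The sketch below is illustrative only: it is not the analysis code used in the study, the function name `rm_anova` is hypothetical, and the data are invented. With 6 participants and 3 viewing conditions the degrees of freedom are (2, 10), matching the F(2,10) reported in Table 5.7.

```python
def rm_anova(data):
    """One-way repeated measures ANOVA.

    data: list of rows, one per participant, each holding one value per condition.
    Returns (F, df_condition, df_error).
    """
    n, k = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n * k)
    cond_means = [sum(row[j] for row in data) / n for j in range(k)]
    subj_means = [sum(row) / k for row in data]

    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_error = ss_total - ss_cond - ss_subj  # condition-by-participant residual

    df_cond, df_error = k - 1, (k - 1) * (n - 1)
    f_stat = (ss_cond / df_cond) / (ss_error / df_error)
    return f_stat, df_cond, df_error


# HYPOTHETICAL window widths (ms): 6 participants x
# (both eyes, amblyopic eye, fellow eye).
widths = [
    [460, 450, 455],
    [300, 320, 310],
    [620, 600, 640],
    [510, 480, 470],
    [380, 400, 390],
    [500, 440, 470],
]
f_stat, df1, df2 = rm_anova(widths)
print(f"F({df1},{df2}) = {f_stat:.3f}")
```

Because each participant serves as their own control, large differences between participants (as in the hypothetical rows above) do not inflate the error term.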


Figure 5.5: The audiovisual simultaneity window for binocular and monocular viewing

conditions among participants with amblyopia. There were no significant differences between

viewing conditions (n = 6). Error bars represent standard error of the mean.


Table 5.7: Comparison of audiovisual simultaneity window parameters by viewing condition for participants with amblyopia (repeated measures ANOVA)

| Performance parameter | Both eyes | Amblyopic eye | Fellow eye | F(2,10) | Omnibus p-value |
|---|---|---|---|---|---|
| Auditory-lead threshold | -158 ± 40 | -166 ± 53 | -177 ± 73 | 0.331 | 0.726 |
| Visual-lead threshold | 304 ± 155 | 283 ± 145 | 279 ± 119 | 0.607 | 0.564 |
| AV simultaneity window width | 462 ± 157 | 449 ± 177 | 456 ± 144 | 0.054 | 0.948 |
| PSS | 73 ± 82 | 58 ± 64 | 51 ± 67 | 1.244 | 0.329 |

Abbreviations: SOA, signal onset asynchrony; SD, standard deviation; AV, audiovisual.

5.5 Discussion

We characterized the audiovisual simultaneity window in adults with unilateral amblyopia and in

visually normal control participants using a simultaneity judgment task. The window parameter

values obtained among control participants were very similar to those previously published for

similar experimental protocols (Stevenson & Wallace, 2013; Stevenson, Zemtsov, et al., 2012;

Stone et al., 2001; Zampini, Shore, et al., 2005). With both eyes viewing, the window was wider

in participants with amblyopia on both the auditory-lead and visual-lead sides. The broadening of

the simultaneity window among participants with amblyopia was similar among amblyopic eye

only, fellow eye only, and binocular viewing conditions, suggesting that these perceptual

differences may involve an abnormal central multisensory network for temporal processing. The

results are similar to those reported for adults with early monocular deprivation from congenital

cataract (Chen et al., 2017), and demonstrate that the abnormalities in audiovisual integration in

the most prevalent forms of amblyopia are not specific to the McGurk effect (i.e., audiovisual

speech perception) (Burgmeier et al., 2015; Narinesingh et al., 2015; Narinesingh et al., 2014),

but generalize to simultaneity judgments of simple, non-speech stimuli.


Subgroup analyses of the participants with amblyopia by their clinical characteristics showed

several differentiating patterns. The auditory-lead side of the simultaneity window varied with

etiology, with significant broadening seen in the anisometropic group. In contrast, the visual-lead

side varied with severity, with significant broadening seen in the severe group. The PSS is a

composite of the auditory-lead and visual-lead threshold values, and as such, exhibited an

intermediate response: the PSS trended toward visual-lead shifts in the strabismic/mixed group

and in the severe group, and showed a significant visual-lead shift in the foveal suppression

group.

A major distinction between strabismic and anisometropic amblyopia is the difference in

binocular function (McKee et al., 2003). Strabismic and mixed mechanism amblyopia tend to

show stronger suppression and poorer stereopsis than anisometropic amblyopia (Birch et al.,

2016; Harrad & Hess, 1992; McKee et al., 2003). Interestingly, the clinical characteristics

associated with a broadened visual-lead threshold and visual-lead shifted PSS in this study are

also those known to indicate poor binocularity: strabismic/mixed etiology, foveal suppression,

and a severe monocular acuity deficit. Conversely, anisometropic etiology is known to indicate

relatively better binocular function, and was the only clinical characteristic positively associated

with a broadened auditory-lead threshold in this study.

While anisometropic and strabismic/mixed etiologies were distinguished by their effect on the

auditory-lead side of the audiovisual simultaneity window, several observations are noteworthy

(see Table 5.4 and Figure 5.4B). First, the width of the audiovisual simultaneity window in

the two etiology groups was the same. Second, the magnitude and direction of the differences in

the auditory-lead threshold, visual-lead threshold, and PSS (i.e. the midpoint of the two

thresholds) between the two etiology groups were nearly identical (i.e., 57–58 ms toward the

visual-lead side), suggesting a shift in the function rather than a widening of the visual-lead side.

Third, these effects are unlikely to be confounded by amblyopia severity, as there was no

statistical association between etiology and severity in the study sample. Taken together, these

observations suggest that two distinct mechanisms may be at play: that amblyopia in the absence

of significant strabismus or suppression (e.g., anisometropic amblyopia) leads to a symmetric

broadening of the audiovisual simultaneity window without shifting the PSS, and that it is the

overlay of significant strabismus or suppression (e.g. strabismic/mixed amblyopia) that shifts the


PSS toward the visual-lead condition. A symmetric broadening of the audiovisual simultaneity

window without a shift in PSS has also been observed in unilateral deprivational amblyopia

(Chen et al., 2017). Importantly, deprivational and anisometropic amblyopia share image

degradation as a common factor, and exhibit similarities on psychophysical tests of spatial acuity

and binocularity (McKee et al., 2003), lending further support to the hypothesis outlined above.
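The relationship between the two thresholds, the window width, and the PSS can be checked against the group means in Table 5.4. The sketch below assumes, as stated above, that the PSS is the midpoint of the two thresholds and that the window width is their separation. Because it operates on rounded group means rather than individual data, its results can differ from the tabulated values by about 1 ms. Variable and function names are illustrative.

```python
# Group mean thresholds (ms) from Table 5.4: (auditory-lead, visual-lead).
thresholds = {
    "control": (-136, 231),
    "aniso": (-210, 289),
    "strab_mixed": (-153, 348),
}


def window_summary(auditory_lead, visual_lead):
    """Return (window width, PSS taken as the midpoint of the two thresholds)."""
    width = visual_lead - auditory_lead
    pss = (auditory_lead + visual_lead) / 2.0
    return width, pss


summary = {group: window_summary(*t) for group, t in thresholds.items()}
for group, (width, pss) in summary.items():
    print(f"{group}: width = {width} ms, PSS = {pss:.1f} ms")

# Between-etiology differences (strab/mixed minus aniso), in ms.
d_auditory = thresholds["strab_mixed"][0] - thresholds["aniso"][0]
d_visual = thresholds["strab_mixed"][1] - thresholds["aniso"][1]
d_pss = summary["strab_mixed"][1] - summary["aniso"][1]
print(d_auditory, d_visual, d_pss)
```

All three differences have the same sign and nearly the same size, while the two widths are almost equal, which is the pattern expected from a rigid shift of the function toward the visual-lead side rather than a one-sided widening.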

Because of the statistical associations between the clinical characteristics in the study sample, the

results must be interpreted with caution. Amblyopia severity was significantly associated with

every clinical characteristic except etiology, meaning that interpretation of the subgroup analyses

for suppression and stereopsis is confounded by unbalanced severity between groups. Some

variables may also reflect clinical factors, such as age of onset, which cannot generally be

determined accurately. Strabismus, for example, accounts for the majority of amblyopia cases

under age 3 years, while anisometropia becomes an etiologic factor primarily after age 3 (Birch,

2013). It is also likely that amblyopic etiology, suppression, stereopsis, and severity constitute

overlapping measures of common factors such as binocular function, or age of onset, although

their relations and these interactions are undoubtedly complex (McKee et al., 2003).

In visually normal individuals, the width of the audiovisual simultaneity window and PSS are not

only determined by sensory physiology, but are also modulated by cognitive factors such as

attention, and a decisional bias toward simultaneity (Zampini, Shore, et al., 2005). Attending to

either vision or audition has been shown to shift the PSS away from the attended modality in a

phenomenon termed prior entry (Spence et al., 2001). While it is possible that amblyopia is

associated with an attentional shift toward audition (de Heering et al., 2016), others have

determined that the magnitude of the prior entry effect in this task among visually normal

individuals is only 14 ms—far less than the 69 ms shift observed in the foveal suppression group

in this study (Zampini, Shore, et al., 2005). Decisional bias toward simultaneity (i.e. shift in

criterion for the unity assumption) would have the effect of widening both the auditory-lead and

visual-lead sides of the window without shifting the PSS (Welch & Warren, 1980). However, it

has been shown that within individuals, the width of the simultaneity window is stable over time

(Stone et al., 2001) and unaffected by the range of SOAs tested, suggesting that this parameter

reflects perceptual rather than decisional factors (Chen et al., 2016; Zampini, Guest, et al., 2005).

Indeed, if a decisional bias toward unity were the cause of a widened simultaneity window in

amblyopia, one might expect that susceptibility to the McGurk effect would also be

heightened, but this is not the case (Burgmeier et al., 2015; Narinesingh et al., 2014).

Multiple non-cognitive factors may also contribute to the main and subgroup differences in

audiovisual temporal perception described in this study. Hypothetically, widening of the

simultaneity window could result from strengthened multisensory perceptual binding. As with

decisional bias toward unity, however, the heightened McGurk effect expected from enhanced

audiovisual perceptual binding has not been observed in amblyopia (Burgmeier et al., 2015;

Narinesingh et al., 2015; Narinesingh et al., 2014). Rather, the accompaniment of a wide

simultaneity window in amblyopia with low susceptibility to the McGurk effect is akin to the

relation observed in visually normal individuals (Stevenson, Zemtsov, et al., 2012), and suggests

an impairment in the ability to resolve asynchronous audiovisual pairs as unique events. A

possible mechanism for such an impairment is temporal uncertainty in the visual domain.

Assuming that decisional and criterion factors are unchanged, less precise visual temporal

information would reduce the precision of audiovisual asynchrony detection, and widen the

simultaneity window. Indeed, evidence for temporal uncertainty in amblyopia exists. Spang and

Fahle (2009) reported reduced visual temporal resolution in the amblyopic eyes of anisometropic

and strabismic participants, and that the temporal deficit correlated with amblyopia severity as in

the present study. Huang et al. (2012) employed a synchrony detection task to demonstrate a

foveal temporal processing impairment in the amblyopic eye of strabismic and anisometropic

participants. Impaired temporal processing is also evident in the fellow eye in strabismic

amblyopia when the judgment of temporal order requires interhemispheric transmission across

the corpus callosum (St John, 1998). Visual temporal uncertainty such as that demonstrated in

amblyopia can be expected to have downstream effects on multisensory processes, including

audiovisual asynchrony detection, dependent on visual input.
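The argument that visual temporal uncertainty widens the simultaneity window can be illustrated with a toy observer model. This is a sketch under stated assumptions, not a model fitted or proposed in this study: perceived asynchrony is taken to be the physical SOA plus zero-mean Gaussian noise whose variance is the sum of independent auditory and visual timing variances; the observer reports "simultaneous" whenever the perceived asynchrony falls within a fixed criterion; and the window width is measured at half of the peak response rate, which may differ from the threshold definition used in the Methods. All parameter values are hypothetical.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def p_simultaneous(soa, criterion, sigma_a, sigma_v):
    """Probability of a 'simultaneous' report at a given SOA (all values in ms)."""
    sigma = math.sqrt(sigma_a ** 2 + sigma_v ** 2)  # independent noise sources
    upper = norm_cdf((criterion - soa) / sigma)
    lower = norm_cdf((-criterion - soa) / sigma)
    return upper - lower


def half_max_width(criterion, sigma_a, sigma_v):
    """Full width (ms) of the response curve at half its peak, by bisection."""
    target = 0.5 * p_simultaneous(0.0, criterion, sigma_a, sigma_v)
    lo, hi = 0.0, 5000.0  # the curve falls monotonically as |SOA| grows
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if p_simultaneous(mid, criterion, sigma_a, sigma_v) > target:
            lo = mid
        else:
            hi = mid
    return 2.0 * lo  # the model is symmetric about SOA = 0


# HYPOTHETICAL parameters: identical criterion and auditory noise,
# differing only in visual timing noise.
narrow = half_max_width(criterion=150.0, sigma_a=30.0, sigma_v=40.0)
wide = half_max_width(criterion=150.0, sigma_a=30.0, sigma_v=120.0)
print(f"precise vision: {narrow:.0f} ms; imprecise vision: {wide:.0f} ms")
```

In this toy model the criterion is held constant, so any widening comes from sensory noise alone. The model is symmetric and therefore cannot, by itself, produce a PSS shift.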

As discussed above, the PSS shift toward visual-lead SOAs among participants with foveal

suppression was larger than that which is solely attributable to attentional effects (Zampini,

Shore, et al., 2005). PSS shifts of more comparable magnitude, however, have been observed in

normal adults as a result of temporal recalibration to constant asynchrony (Fujisaki et al., 2004).

This phenomenon is likely an important mechanism to deal with the natural physical and neural

asynchrony in auditory and visual signals, and presents a possible mechanism for the PSS shifts


observed in amblyopia. In visually normal adults, the first peak cortical evoked response occurs

75 ms after onset of an auditory stimulus and 104 ms after onset of a visual stimulus, resulting in

a neural asynchrony of about 30 ms even under ideal conditions (Andreassi & Greco, 1975). In

amblyopia, however, cortical response latencies from the affected eye are increased compared to

the fellow eye (Sokol, 1983; Zhang & Zhao, 2005). This transmission latency difference may be

another source of temporal uncertainty and act as the perceptual stimulus to shift the PSS toward

visual-lead SOAs. Indeed, evidence for a significant interocular perceptual latency difference in

amblyopia is provided by the observation of a spontaneous Pulfrich effect in some observers

with amblyopia (Tredici & von Noorden, 1984). Another possible explanation for the PSS shift

in amblyopia is that suppression and poor stereopsis may interfere with the normal ability to

account for sound velocity and source distance when making audiovisual simultaneity judgments

(Engel & Dougherty, 1971; Sugita & Suzuki, 2003). This explanation, however, is unlikely, as

monocular adults who lost one eye at an early age perform as normal controls in this task (Moro

& Steeves, 2015).
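To give a sense of scale for the physical and neural asynchronies discussed above, the following sketch combines the evoked-response latencies cited from Andreassi and Greco (1975) with the acoustic travel time over a given source distance. The speed of sound (about 343 m/s in air at room temperature) is a value supplied here rather than taken from the text, and treating the peak evoked-response latencies as the only neural delays is a simplifying assumption. Function names are illustrative.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # ASSUMED: dry air at about 20 degrees C
AUDITORY_LATENCY_MS = 75.0      # first peak cortical response, as cited above
VISUAL_LATENCY_MS = 104.0       # first peak cortical response, as cited above


def acoustic_delay_ms(distance_m):
    """Time for sound to travel distance_m; light travel time is neglected."""
    return 1000.0 * distance_m / SPEED_OF_SOUND_M_PER_S


def cortical_asynchrony_ms(distance_m):
    """Visual minus auditory cortical arrival time for a physically synchronous event.

    Positive values mean the auditory response occurs first.
    """
    auditory_arrival = AUDITORY_LATENCY_MS + acoustic_delay_ms(distance_m)
    return VISUAL_LATENCY_MS - auditory_arrival


neural_lead_ms = VISUAL_LATENCY_MS - AUDITORY_LATENCY_MS
balance_distance_m = neural_lead_ms / 1000.0 * SPEED_OF_SOUND_M_PER_S
print(f"neural asynchrony: {neural_lead_ms:.0f} ms")
print(f"distance at which acoustic delay offsets it: {balance_distance_m:.1f} m")
```

At a source distance of 1 m the acoustic delay in this sketch is about 3 ms, which is small relative to the neural asynchrony; the two delays cancel only at a distance of several metres.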

If the putative audiovisual temporal correspondence detector were intact in amblyopia, one could

reasonably speculate that occlusion of the affected eye would eliminate the temporal uncertainty

and perceptual latency, and normalize the audiovisual simultaneity window parameters.

However, we found viewing condition had no significant effect on the simultaneity window

parameters. This result agrees with the findings in deprivational amblyopia (Chen et al., 2017),

and suggests that the abnormality in audiovisual simultaneity judgment is not solely a result of

amblyopic visual input, but that it involves a central alteration in the capacity to process

audiovisual temporal information. Furthermore, this interpretation is consistent with considerable

evidence that points to the importance of early sensory experience for the emergence of normal

audiovisual integration processes. Neurophysiology studies of cats reared with experimentally

manipulated or absent visual input reported abnormal audiovisual multisensory responses in the

superior colliculus (Wallace et al., 2004; Wallace & Stein, 2007). Adult humans with a history of

transient bilateral visual deprivation in early life show reduced audiovisual multisensory

interaction in behavioural studies (Chen et al., 2017; Putzar et al., 2007; Putzar, Hötting, et al.,

2010), and large-scale cross-modal reorganization of the visual cortex as assessed using

functional MRI (Collignon et al., 2015). Interestingly, typically-developing children up to age 7

years have a symmetrically broadened audiovisual simultaneity window similar to that observed


in amblyopia, suggesting that the amblyopic audiovisual simultaneity window may represent a

persistent juvenile state (Chen et al., 2016; Hillock-Dunn & Wallace, 2012; Hillock et al., 2011;

Lewkowicz & Flom, 2014). If the mechanism by which the audiovisual simultaneity window

normally narrows through childhood is experience-dependent, then amblyopia may interfere with

the calibration and refinement of the cortical processes responsible for audiovisual simultaneity

and asynchrony perception. Plausibly, amblyopic visual temporal uncertainty during a critical

period of brain development may limit the resolution of audiovisual asynchrony detection,

leading to a widened audiovisual simultaneity window.

The view that audiovisual simultaneity perception is altered developmentally by the temporal

uncertainty and perceptual latency inherent to amblyopic vision is supported by the lack of a

similar effect in monocular adults. Indeed, adults with a history of early enucleation (i.e.,

removal of one eye) have a normal simultaneity window (Moro & Steeves, 2015). This indicates

that monocular visual loss alone is not sufficient to alter the simultaneity window, and suggests

that impaired but not absent visual input is necessary to disrupt the refinement of temporal

audiovisual processes.

Although amblyopia is classically regarded as a monocular impairment of spatial vision, the

findings of this study, combined with the prior finding of reduced susceptibility to the McGurk

effect, indicate an impairment of audiovisual multisensory perception that generalizes beyond

speech (Burgmeier et al., 2015; Narinesingh et al., 2014). In addition to the main finding of a

widened audiovisual simultaneity window in amblyopia, subgroup analysis suggested that an

accompanying shift in the PSS is dependent on etiology and binocularity. Although the

mechanisms are not clear, hypotheses include visual temporal uncertainty and interocular

perceptual latency asymmetry. The findings give insight into the developmental calibration of

normal multisensory processes, and highlight a previously underappreciated impact of amblyopia

beyond vision.


Chapter 6 Study IV

Study IV: Temporal Ventriloquism Reveals Normal Audiovisual Temporal Integration in Amblyopia

6.1 Abstract

Introduction: We have shown previously that amblyopia involves impaired detection of

asynchrony between auditory and visual events. To distinguish whether this impairment

represents a defect in temporal integration or non-integrative multisensory processing (e.g.,

cross-modal matching), we employed the temporal ventriloquism effect in which visual temporal

order judgment (TOJ) is normally enhanced by a lagging auditory click.

Methods: Participants with amblyopia (n = 9) and visually normal controls (n = 9) performed a

visual TOJ task. Pairs of clicks accompanied the two lights such that the first click preceded the

first light, or the second click lagged the second light by 100, 200, or 450 ms. Baseline audiovisual

synchrony and visual-only conditions were also tested.

Results: Within both groups, just noticeable differences (JNDs) for the visual TOJ task were

significantly enhanced over baseline in the 100 ms click lag condition. Within the amblyopia

group, poorer stereo acuity was significantly correlated with greater enhancement in visual TOJ

performance in the 200 ms click lag condition.

Conclusions: Audiovisual temporal integration is intact in amblyopia, as indicated by a normal

temporal ventriloquism effect. Amblyopia with poorer binocularity is associated with a wider

temporal binding window for the effect. These findings suggest that previously reported deficits

in audiovisual multisensory processing result from impaired cross-modal matching rather than

diminished capacity for temporal audiovisual integration.

6.2 Introduction

Amblyopia, or “lazy eye”, is a visual disorder that arises from anomalous visual experience

during a sensitive period of brain development in early life. It is most commonly caused by

misalignment of the eyes (i.e., strabismus), inequality in refractive error between the eyes (i.e.,


anisometropia), or a combination of the two factors that cause non-correspondence of the retinal

images or visual blur (Birch, 2013). In rare cases, it may also be induced by congenital cataracts

that deprive one or both eyes of form vision. Although visual rehabilitation is possible in

childhood, failures in diagnosis and treatment result in amblyopia being the number one cause of

persistent monocular blindness in adulthood (Buch et al., 2001; Krueger & Ederer, 1984).

Clinically, amblyopia is often regarded as a monocular impairment in low-level visual functions

such as spatial acuity, contrast sensitivity, and binocular vision (American Academy of

Ophthalmology Pediatric Ophthalmology/Strabismus Panel, 2012). However, a considerable

body of research shows that the deficits are not limited to vision in the amblyopic eye (see

Meier & Giaschi, 2017, for review). The fellow eye also shows subtle deficits in spatial vision

(Bedell et al., 1985; Sireteanu et al., 2008), contrast sensitivity (Leguire et al., 1990; Wali et al.,

1991), and motion processing (Giaschi et al., 1992; Ho et al., 2005). Temporal aspects of visual

processing are also affected. Foveal vision in the amblyopic eye is less sensitive to asynchrony

between visual elements (Huang et al., 2012), extra-foveal regions have poorer temporal

resolution (Spang & Fahle, 2009), and the fellow eye in strabismic amblyopia shows impaired

perception of temporal order (St John, 1998). In agreement with these behavioural deficits in

temporal perception, visual evoked responses in humans show increased latency jitter and

decreased signal-to-noise ratios (Banko, Kortvelyes, Nemeth, et al., 2013; J. P. Kelly et al.,

2015). Physiological recordings in cats show reduced synchronization among neurons in the

striate cortex (Roelfsema et al., 1994) when driven by stimulation of the amblyopic eye. Beyond

vision, amblyopic abnormalities in multisensory processing and integration are also well

documented. People with unilateral amblyopia show reduced integration of incongruent auditory

and visual speech signals as demonstrated by the McGurk effect (Burgmeier et al., 2015;

Narinesingh et al., 2015; Narinesingh et al., 2014). They also show multisensory perceptual

binding over an abnormally wide range of signal onset asynchronies (SOAs) for simple visual

and auditory signals. For example, in audiovisual simultaneity judgment tasks, the temporal

window of perceived simultaneity is broadened on both the visual-lead and auditory-lead sides

for people with unilateral amblyopia (Chen et al., 2017; Richards et al., 2017b). In the sound-

induced flash illusion (Shams et al., 2000), a broader temporal window of auditory dominance

over vision when the sound precedes the flash is also evident (Narinesingh et al., 2017). These


multisensory perceptual abnormalities are observed under binocular viewing conditions, and so

cannot be fully explained by visual anomalies in the amblyopic eye alone.

It is important to distinguish multisensory integration from other multisensory processes.

Multisensory integration involves the fusion or combination of unisensory signals to produce a

new response that is significantly different from its component inputs (Stein et al., 2010).

Multisensory processing, on the other hand, is an umbrella term that encompasses multisensory

integration and non-integrative multisensory processes, such as cross-modal matching, that do

not produce a new response, but seek featural equivalencies in time, space, or identity between

unisensory inputs (Stein et al., 2010).

The fused illusory percepts in the McGurk effect and sound-induced flash illusion are examples

of multisensory integration because their perceptual products are qualitatively different from

their unisensory auditory or visual components (McGurk & MacDonald, 1976; Shams et al.,

2000). A defect in multisensory integration would therefore reduce or eliminate perception of the

fused phoneme or illusory flash. This is not the only explanation, however. The strength of

integration is also affected by the level of feature matching (e.g., temporal correspondence,

spatial correspondence, and phonetic identity) in the unisensory signal streams. For example,

degradation of phonetic identity in the auditory signal (Baart, Vroomen, et al., 2014) can reduce

the strength of the McGurk effect, and reduction of cross-modal temporal correspondence (i.e.

greater SOAs) can reduce the strength of both the McGurk effect (Munhall, Gribble, Sacco, &

Ward, 1996) and the sound-induced flash illusion (Shams et al., 2000). The factors influencing

the width of the audiovisual simultaneity window are similarly complex. In normal individuals, a

wider audiovisual simultaneity window correlates empirically with less susceptibility to the

McGurk effect (i.e. weaker integration), but greater susceptibility to the sound-induced flash

illusion (i.e. greater integration) (Stevenson, Zemtsov, et al., 2012). Therefore, the wider

audiovisual simultaneity window observed in amblyopia may represent reduced capacity for

multisensory integration, or integrative fusion over a wider range of SOAs, or both.

Alternatively, a wider audiovisual simultaneity window may represent entirely non-integrative

factors such as criterion shift toward simultaneity (Yarrow et al., 2011), or temporal uncertainty

in the unisensory streams that feed into the neural machinery of multisensory integration

(Richards et al., 2017b). Indeed, some empirical data suggest that the width of the audiovisual


simultaneity window may not be a function of multisensory integration, but rather of non-

integrative cross-modal matching of temporal features encoded within the unisensory streams

(Fujisaki & Nishida, 2005). Finally, because testing paradigms for the McGurk effect, sound-

induced flash illusion, and audiovisual simultaneity window typically involve a single interval

with a “target present” vs “target not present” response, they are more susceptible to inter-

individual response bias than traditional 2-alternative forced choice (2AFC) paradigms that have

a predictable noise floor.

The nature of multisensory processing abnormalities in amblyopia remains an unresolved

question. Two opposing mechanisms—a primary failure of integration or a primary deficiency in

unisensory information—can both lead to the same perceptual outcome for many of the

multisensory phenomena studied in amblyopia, as discussed above. To help resolve this

ambiguity, we have examined integration in another audiovisual phenomenon—the temporal

ventriloquism effect (Morein-Zamir et al., 2003). The temporal ventriloquism effect is an

example of audiovisual integration in which performance on a 2AFC visual temporal order

judgment (TOJ) task is improved by paired auditory events (Figure 6.1). Specifically, the ability

to detect the order of onset of two lights improves when spatially irrelevant clicks are presented

such that the second click lags the second light by 100 to 200 ms (Morein-Zamir et al., 2003). In

effect, the second click ‘pulls’ or ‘ventriloquises’ the perceived onset of the second light forward

in time, increasing the apparent interval between the two lights and making their temporal order

easier to judge.


Figure 6.1: Schematic of the apparatus and stimuli that induce the temporal ventriloquism

effect. The first click is simultaneous with the onset of the first light (arbitrarily shown as the top

light in this example), but the second click follows the onset of the second light. In normal

observers, discrimination performance on the visual temporal order judgment (TOJ) is enhanced

when the second click lags the second light by 100 to 200 ms.

Unlike the fused percepts of other audiovisual multisensory phenomena operating in the

temporal dimension (e.g., the McGurk effect, audiovisual simultaneity judgment and the sound-

induced flash illusion), the temporal ventriloquism effect results in perceptual enhancement that

cannot emerge from non-integrative cross-modal matching or perceptual blending. In the present

report, we use the temporal ventriloquism effect to determine whether the multisensory

processing abnormalities observed in amblyopia are related to a primary failure of audiovisual

temporal integration. If amblyopia involves a primary failure of audiovisual integration,

perceptual enhancement by temporal ventriloquism will not be observed (Figure 6.2).


Figure 6.2: The temporal ventriloquism effect with and without intact audiovisual

integration. In normal observers, the just noticeable difference (JND) in onset of the two lights

is reduced in the “Trailing 100” and “Trailing 200” conditions (i.e., when the second click trails

the second light by 100 or 200 ms) compared to the “Baseline” condition (i.e., when the two

clicks are synchronous with the onset of the two lights). If amblyopia involves a failure of

temporal integration, then this performance enhancement will not be observed (hypothesis

outlined in red). Adapted from (Morein-Zamir et al., 2003).
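As an illustration of how a JND can be obtained from TOJ data, the sketch below linearly interpolates the 25% and 75% points of a psychometric function and takes half of their separation. This is one common convention and is an assumption here; the fitting procedure actually used in this study may differ. The SOA values and response proportions are hypothetical, chosen only to show a smaller JND in a trailing-click condition than at baseline.

```python
def interpolate_soa(soas, proportions, target):
    """SOA at which a monotonically increasing proportion crosses target."""
    points = list(zip(soas, proportions))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 <= target <= y1 and y1 > y0:
            return x0 + (target - y0) * (x1 - x0) / (y1 - y0)
    raise ValueError("target proportion is not bracketed by the data")


def jnd_and_pss(soas, proportions):
    """Return (JND, PSS): half of the 25%-75% spread, and the 50% point."""
    s25 = interpolate_soa(soas, proportions, 0.25)
    s50 = interpolate_soa(soas, proportions, 0.50)
    s75 = interpolate_soa(soas, proportions, 0.75)
    return (s75 - s25) / 2.0, s50


# HYPOTHETICAL data: light SOA in ms (negative = bottom light first) and
# the proportion of "top light first" responses at each SOA.
soas = [-75, -45, -15, 15, 45, 75]
baseline = [0.05, 0.20, 0.40, 0.60, 0.80, 0.95]
trailing_100 = [0.02, 0.10, 0.35, 0.65, 0.90, 0.98]

jnd_base, _ = jnd_and_pss(soas, baseline)
jnd_trail, _ = jnd_and_pss(soas, trailing_100)
print(f"baseline JND = {jnd_base:.1f} ms, trailing-100 JND = {jnd_trail:.1f} ms")
```

A steeper psychometric function gives a smaller JND, so an intact temporal ventriloquism effect would appear as a lower JND in the trailing-click conditions than at baseline.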

6.3 Methods

6.3.1 Participants

Nine adults with unilateral amblyopia (8 female; mean age 28 years; age range 18–47 years)

participated in this study, and 9 visually typical adults (7 female; mean age 31 years; age range

22–46 years) served as controls. All participants passed a standard hearing test, and were

assessed by a certified orthoptist or an ophthalmologist to measure refractive correction, visual

acuity, stereoacuity, foveal suppression, ocular motility, and eye alignment as described in detail


elsewhere (Richards et al., 2017b). Amblyopia was defined as a visual acuity of 0.18 logMAR

(20/30) or worse in the affected eye, and an interocular difference of at least 0.2 logMAR (2 lines

on the standard ETDRS chart). Anisometropic amblyopia was defined as amblyopia with an

interocular difference of at least 1 diopter (D) in either spherical equivalent or astigmatic error.

Strabismic amblyopia was defined as amblyopia accompanied by any manifest deviation on the

cover test in the absence of significant anisometropia (defined above). Mixed-mechanism

amblyopia was defined as amblyopia accompanied by both anisometropia and strabismus of at

least 8 prism diopters. Participants were excluded if they had hyperopia greater than +5 D

spherical equivalent, myopia greater than -6 D spherical equivalent, or a history of any

neurological, neurodevelopmental, auditory, or visual disorders other than amblyopia, strabismus

and ametropia. Written informed consent was obtained from all participants. The study protocol

was approved by the Research Ethics Board at The Hospital for Sick Children, and followed the

tenets of the Declaration of Helsinki.

The clinical characteristics of the participants with amblyopia are summarized in Table 6.1.


The entire experiment was conducted in a carpeted acoustic chamber lined with 5 cm acoustic

wedge foam on the walls and ceiling. The audiovisual apparatus (shown in Figure 6.3) consisted

of two green light emitting diodes (LEDs) arranged 10 cm above and below a central speaker

(model CMS0361KLX, CUI Inc., Tualatin, OR, USA). A red LED positioned over the central

speaker served as a fixation target between trials. Auditory stimuli consisted of 2.5 ms square-

wave clicks presented at 62 dBA sound pressure level (SPL). Participants were seated with the

head stabilized in a chinrest at a standard viewing distance of 1.0 m, and used a wireless

gamepad (model F710, Logitech, Newark, CA, USA) to initiate each trial and enter responses.

Figure 6.3: The audiovisual apparatus. Two green LEDs were positioned 10 cm above and

below the point of fixation, indicated by a red LED. A central speaker was positioned

immediately behind the point of fixation. The apparatus was viewed from a distance of 1 m, such

that the visual angle between the fixation point and each green LED was 5.7°.
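The visual angle quoted in the caption follows from simple trigonometry. The short sketch below reproduces it from the 10 cm LED offset and the 1 m viewing distance; the function name is illustrative and not part of the original apparatus software.

```python
import math

def visual_angle_deg(offset_cm: float, distance_cm: float) -> float:
    """Angle (in degrees) between fixation and a target offset from it,
    for an observer at the given viewing distance."""
    return math.degrees(math.atan2(offset_cm, distance_cm))

# Each green LED sits 10 cm from fixation, viewed from 100 cm.
angle = visual_angle_deg(10.0, 100.0)
print(f"{angle:.1f} deg")  # prints "5.7 deg"
```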

6.3.3 Design and Procedure

All trials were conducted in darkness with both eyes open. Each trial began with illumination of

the central fixation LED for 500 ms. After a random delay of 500–750 ms following offset of the

fixation LED, the two green LEDs were illuminated in sequence according to predetermined

light SOA conditions. Clicks accompanied the lights according to predetermined click timing

conditions. Participants were asked to make a visual temporal order judgment (TOJ) by

determining which light appeared last (“top” or “bottom”). Responses were unspeeded, and

participants were told that the sounds did not predict the order of the lights.

Twelve light SOA levels were tested: -144, -96, -72, -48, -36, -24, 24, 36, 48, 72, 96, and 144

ms, with negative values indicating that the bottom LED was illuminated first. Eight click timing

conditions were tested for each light SOA: one “AV sync” condition, three click “lead”

conditions, three click “lag” conditions, and one “visual-only” condition. In the AV sync

condition, clicks were synchronized with the onset of the two LEDs. In the three lead conditions,

the first click preceded, or led, the first light by 450, 200 or 100 ms, and the second click and

light were synchronous. In the three lag conditions, the first click and light were synchronous,

but the second click trailed, or lagged, the second light by 100, 200 or 450 ms. In the “visual-

only” condition, there were no accompanying clicks. Twenty practice trials preceded the start of

data collection. Twenty trials were run for each light SOA and click timing condition, yielding a

total of 1,920 experimental trials. All audiovisual conditions (i.e., AV sync, lead, and lag

conditions) were randomly interleaved and run in 4 blocks. The visual-only condition was run

separately in a single block.
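The factorial design described above can be sketched in a few lines to confirm the trial count. This is a minimal illustration, not the software used in the study: the condition labels, the seeded shuffle, and the assumption that the four audiovisual blocks were of equal size are all hypothetical.

```python
import itertools
import random

LIGHT_SOAS_MS = [-144, -96, -72, -48, -36, -24, 24, 36, 48, 72, 96, 144]
# Hypothetical labels for the seven audiovisual click timing conditions.
AV_CONDITIONS = ["lead_450", "lead_200", "lead_100", "av_sync",
                 "lag_100", "lag_200", "lag_450"]
REPEATS = 20
N_AV_BLOCKS = 4

def build_trials(seed=0):
    """Return (audiovisual blocks, visual-only block) as lists of (SOA, condition)."""
    rng = random.Random(seed)
    av_trials = [(soa, cond)
                 for soa, cond in itertools.product(LIGHT_SOAS_MS, AV_CONDITIONS)
                 for _ in range(REPEATS)]
    rng.shuffle(av_trials)  # audiovisual conditions are randomly interleaved
    size = len(av_trials) // N_AV_BLOCKS  # equal block sizes are an assumption
    av_blocks = [av_trials[i * size:(i + 1) * size] for i in range(N_AV_BLOCKS)]
    visual_only = [(soa, "visual_only") for soa in LIGHT_SOAS_MS
                   for _ in range(REPEATS)]
    rng.shuffle(visual_only)  # run separately in a single block
    return av_blocks, visual_only

av_blocks, visual_only = build_trials()
total = sum(len(block) for block in av_blocks) + len(visual_only)
print(total)  # 12 SOAs x 8 conditions x 20 repeats = 1920
```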

6.3.4 Data Analysis

The just noticeable difference (JND) and point of subjective simultaneity (PSS) for the visual

TOJ task were determined for each participant in each of the eight click timing conditions. The

JND quantifies the minimum SOA for which the temporal order of the two lights can be reliably

determined, and is a measure of visual temporal resolution. To calculate the JND and PSS, the

proportion of trials in which the top LED was seen first was computed for all light SOA levels. A

cumulative Gaussian curve was then fit to the psychometric data using a maximum likelihood

method. The JND and PSS values were derived from the fitted curve. The JND was defined as

the SOA at which the top LED was seen first 75% of the time, minus the SOA at which the top

light was seen first 25% of the time, divided by two. The PSS was defined as the SOA at which

the top and bottom lights were equally likely to be seen first. All curve fits and parameters were

computed in MATLAB version R2011b (Mathworks, Inc., Natick, MA, USA).
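As an illustration of the analysis described above, the following sketch fits a cumulative Gaussian to TOJ counts by maximum likelihood and derives the PSS and JND. It is a simplified stand-in for the MATLAB routines actually used: the coarse grid search, the parameter ranges, and the response counts are assumptions made for the example. For a cumulative Gaussian, the JND as defined here (half the distance between the 25% and 75% points) equals approximately 0.6745 times the fitted standard deviation, and the PSS equals the fitted mean.

```python
import math

Z75 = 0.6744897501960817  # 75th percentile of the standard normal distribution

def cum_gauss(x, mu, sigma):
    """Cumulative Gaussian psychometric function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def neg_log_likelihood(mu, sigma, soas, n_top_first, n_trials):
    """Binomial negative log-likelihood of the 'top first' counts."""
    eps = 1e-6  # keeps log() finite when p approaches 0 or 1
    nll = 0.0
    for x, k, n in zip(soas, n_top_first, n_trials):
        p = min(max(cum_gauss(x, mu, sigma), eps), 1.0 - eps)
        nll -= k * math.log(p) + (n - k) * math.log(1.0 - p)
    return nll

def fit_psychometric(soas, n_top_first, n_trials):
    """Grid-search maximum likelihood fit (1 ms resolution; ranges are assumptions)."""
    best = None
    for mu in range(-60, 61):
        for sigma in range(10, 201):
            nll = neg_log_likelihood(mu, sigma, soas, n_top_first, n_trials)
            if best is None or nll < best[0]:
                best = (nll, mu, sigma)
    _, mu, sigma = best
    return {"pss": mu, "sigma": sigma, "jnd": Z75 * sigma}

# Light SOAs from the study; positive values mean the top LED was lit first.
soas = [-144, -96, -72, -48, -36, -24, 24, 36, 48, 72, 96, 144]
# Hypothetical 'top seen first' counts out of 20 trials per SOA.
n_top_first = [1, 3, 4, 6, 7, 8, 13, 14, 15, 17, 18, 19]
n_trials = [20] * len(soas)

fit = fit_psychometric(soas, n_top_first, n_trials)
print(f"PSS = {fit['pss']} ms, JND = {fit['jnd']:.1f} ms")
```

A gradient-based or simplex optimizer would normally replace the grid search; the grid is used here only to keep the example dependency-free.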

All statistical analyses were computed in IBM SPSS Statistics, version 22 (Armonk, NY, USA).

Homogeneity of variance was established by Levene’s test for independent samples t-tests, and

by Mauchly’s test of sphericity for repeated-measures ANOVAs. Bonferroni adjustments were

applied to post hoc multiple comparisons where indicated in the results. Participants with

non-measurable stereo acuity (worse than 3000 seconds of arc) were assigned a supramaximal value

of 3001 for Spearman rank correlation. Statistical significance was defined as p < 0.05.
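The handling of non-measurable stereo acuity can be illustrated with a small rank-correlation sketch. Because the Spearman coefficient depends only on rank order, any value above 3000 places these participants at the worst (tied) rank. The stereo acuity values below are taken from Table 6.1, whereas the JND improvement values are hypothetical placeholders, and the helper functions are an illustrative reimplementation rather than the SPSS procedure used in the study.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson product-moment correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of tie-averaged ranks."""
    return pearson(average_ranks(x), average_ranks(y))

SUPRAMAXIMAL = 3001  # value assigned to non-measurable stereo acuity
# Stereo acuity (arc sec) for A1..A9 from Table 6.1; None = not measurable.
stereo = [200, 200, 3000, None, None, 70, None, 70, 140]
stereo_for_ranking = [SUPRAMAXIMAL if s is None else s for s in stereo]

# Hypothetical JND improvements (ms), for illustration only.
jnd_improvement = [10, 12, 20, 25, 18, 5, 30, 8, 9]
rs = spearman(stereo_for_ranking, jnd_improvement)
print(f"Rs = {rs:.2f}")
```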

Table 6.1: Clinical characteristics of participants with amblyopia

| Participant | Age (sex) | Type | Visual acuity RE (logMAR) | Visual acuity LE (logMAR) | Refractive correction RE (diopters) | Refractive correction LE (diopters) | Alignment at 6 m (prism diopters) | Stereo acuity (arc sec) | Worth 4-dot response | Additional details |
|---|---|---|---|---|---|---|---|---|---|---|
| A1 | 27 (F) | Strab | 0.00 | 0.48 | -6.25 +1.00 x 45 | -5.50 +1.25 x 135 | LE esotropia 2, LE hypotropia 1 | 200 | Fused | Strab surgery at 9 years of age |
| A2 | 22 (F) | Aniso | 0.00 | 0.48 | -1.50 +0.50 x 80 | +1.00 +1.25 x 95 | LE esotropia 2 | 200 | Fused | |
| A3 | 22 (M) | Aniso | 1.10 | -0.10 | -6.00 +0.75 x 174 | -4.50 +0.50 x 75 | RE esotropia 2 | 3000 | Fused | |
| A4 | 23 (F) | Strab | 0.20 | 0.00 | +0.50 +0.50 x 28 | +1.25 +0.50 x 88 | LE esotropia 8, bilateral DVD | Not measurable | Diplopic | Infantile esotropia, 2 strab surgeries as child |
| A5 | 44 (F) | Mixed | 0.90 | 0.00 | +6.00 +1.25 x 75 | -0.75 | RE exotropia 35 | Not measurable | RE suppressed | |
| A6 | 37 (F) | Aniso | 0.18 | -0.10 | -3.25 +4.00 x 10 | -5.25 | RE esotropia 1 | 70 | Fused | |
| A7 | 46 (F) | Strab | -0.10 | 0.10 | +4.25 | +5.00 | LE esotropia 25, LE hypotropia 18 | Not measurable | LE suppressed | Esotropia onset at 6–8 months of age |
| A8 | 28 (M) | Aniso | 0.18 | -0.10 | +2.25 | +0.25 | Exophoria 2 | 70 | Fused | |
| A9 | 26 (F) | Aniso | -0.10 | 0.18 | +0.75 | +3.00 | LE esotropia 1 | 140 | Fused | |

Abbreviations: RE, right eye; LE, left eye; Aniso, anisometropic; Strab, strabismic; DVD, dissociated vertical deviation.

6.4 Results

Performance on the visual TOJ task is summarized for the amblyopia group and control group in

Table 6.2. Mean JND values did not differ significantly between the two groups for the baseline

synchronous click (i.e., AV sync) condition, the visual-only condition, or any of the

asynchronous click timing conditions. Similarly, mean PSS values did not differ significantly

between the two groups for any click timing condition. Furthermore, one-sample t-tests

comparing the PSS with the expected value of 0 showed no significant deviation of the PSS from

true simultaneity for any click timing condition in either group.

Table 6.2: Visual temporal order judgment performance in the control and amblyopia groups

| Click timing condition (ms) | JND, mean ± SEM (ms): Control | JND, mean ± SEM (ms): Amblyopia | t(16) | p | PSS, mean ± SEM (ms): Control | PSS, mean ± SEM (ms): Amblyopia | t(16) | p |
|---|---|---|---|---|---|---|---|---|
| Lead 450 | 60 ± 8 | 73 ± 9 | -1.038 | 0.32 | -1 ± 8 | -3 ± 2 | 0.197* | 0.85 |
| Lead 200 | 60 ± 8 | 70 ± 9 | -0.803 | 0.43 | -3 ± 8 | 2 ± 4 | -0.472 | 0.64 |
| Lead 100 | 66 ± 9 | 67 ± 6 | -0.102 | 0.92 | -7 ± 9 | -4 ± 5 | -0.355 | 0.73 |
| AV sync | 60 ± 6 | 64 ± 4 | -0.461 | 0.65 | -9 ± 8 | 0 ± 9 | -0.727 | 0.48 |
| Lag 100 | 45 ± 6 | 48 ± 3 | -0.526* | 0.61 | -6 ± 8 | -1 ± 7 | -0.375 | 0.71 |
| Lag 200 | 52 ± 4 | 49 ± 4 | 0.442 | 0.66 | -3 ± 6 | -7 ± 8 | 0.482 | 0.64 |
| Lag 450 | 64 ± 7 | 61 ± 6 | 0.390 | 0.70 | -3 ± 6 | -7 ± 8 | 0.393 | 0.70 |
| Visual-only | 55 ± 6 | 65 ± 5 | -1.338 | 0.20 | -5 ± 8 | -3 ± 10 | -0.137 | 0.89 |

Abbreviations: JND, just noticeable difference; PSS, point of subjective simultaneity; AV,

audiovisual; *degrees of freedom adjusted for inequality of variances

Performance on the visual TOJ task for unimodal (visual-only) and bimodal baseline (AV sync)

stimuli was compared using paired samples t-tests (Figure 6.4). There was no significant

difference in mean JNDs between the visual-only condition and AV sync condition for the

control group (t(8) = 1.752, p = 0.118) or the amblyopia group (t(8) = -0.330, p = 0.750). Pearson

correlations between JND and amblyopic eye visual acuity were not significant for the visual-

only condition (R = -0.120, p = 0.758) or the baseline AV sync condition (R = 0.089, p = 0.820).

Similarly, Spearman rank correlations between JND and stereo acuity were not significant for the

visual-only condition (Rs = -0.085, p = 0.827) or the baseline AV sync condition (Rs = 0.162, p =

0.676).

Figure 6.4: Visual temporal order judgment performance for visual-only stimuli and

audiovisual stimuli with synchronous clicks (AV sync). JNDs did not differ significantly by

stimulus modality. Error bars represent standard error of the mean.

Variation in performance on the visual TOJ task across click timing conditions is illustrated in

Figure 6.5. Mean JND values for each click timing condition were submitted to a one-way

repeated-measures ANOVA for each group. There was a significant effect of click timing

condition on JND for the control group (F(6, 48) = 3.920, p = 0.002) and the amblyopia group

(F(6, 48) = 4.407, p = 0.001). Post hoc comparisons showed that visual TOJ performance was

significantly enhanced over baseline only when the second click lagged the second light by 100

ms for the control group (p = 0.011, Bonferroni correction) and the amblyopia group (p < 0.05,

Bonferroni correction; Figure 6.5). The magnitude of this enhancement over baseline was 16 ms (25%) in the

control group and 15 ms (25%) in the amblyopia group. These findings are consistent with the

temporal ventriloquism effect previously described in visually normal participants (Morein-

Zamir et al., 2003).
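The size of the enhancement can be checked approximately against the group means in Table 6.2. The sketch below uses the rounded means from that table, so the absolute differences may deviate by about 1 ms from values computed on unrounded data; the dictionary layout and function name are illustrative.

```python
# Rounded group mean JNDs (ms) from Table 6.2.
TABLE_6_2_JND_MS = {
    "control": {"av_sync": 60, "lag_100": 45},
    "amblyopia": {"av_sync": 64, "lag_100": 48},
}

def enhancement(baseline_ms, condition_ms):
    """Return (absolute JND reduction in ms, reduction as % of baseline)."""
    diff = baseline_ms - condition_ms
    return diff, 100 * diff / baseline_ms

for group, jnd in TABLE_6_2_JND_MS.items():
    diff_ms, pct = enhancement(jnd["av_sync"], jnd["lag_100"])
    print(f"{group}: {diff_ms} ms ({pct:.0f}%)")
```

With these rounded inputs, both groups show a reduction of 25% relative to the AV sync baseline.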

Figure 6.5: The temporal ventriloquism effect in the control group and the amblyopia

group. JNDs for the control group are shown at the top in blue, and those for the amblyopia

group are shown at the bottom in red. The baseline AV sync condition is represented by a black

bar for both groups. Lead conditions are those in which the first click preceded the onset of the

first light, followed by a synchronous second click and light. Lag conditions are those in which

the first click and light were synchronous, but the second click trailed the second light. A

significant temporal ventriloquism effect was observed in both groups (*p < 0.05). Error bars

represent standard error of the mean.

To examine the relation between clinical measures of amblyopia and susceptibility to the

temporal ventriloquism effect, the JND improvement from the baseline AV sync condition was

computed for the three click lag conditions for each participant. Spearman correlations between

stereo acuity and the JND improvement from baseline (Figure 6.7) were not significant for the

100 ms click lag condition (Rs = 0.03, p = 0.93) or the 450 ms click lag condition (Rs = -0.51, p =

0.16). However, worse stereo acuity was significantly correlated with greater susceptibility to the

temporal ventriloquism effect for the intermediate 200 ms click lag condition (Rs = 0.74, p =

0.02). Pearson correlations between logMAR visual acuity in the amblyopic eye and the JND

improvement from baseline (Figure 6.6) showed a similar trend, though not statistically

significant. The relation between visual acuity and JND improvement from baseline was not

significant for the 100 ms click lag condition (R = -0.29, p = 0.45) or the 450 ms click lag

condition (R = -0.33, p = 0.39). The intermediate 200 ms click lag condition, however, showed a

trend toward greater susceptibility to the temporal ventriloquism effect in those with worse

visual acuity in the amblyopic eye (R = 0.64, p = 0.06).

Figure 6.6: Relation between susceptibility to the temporal ventriloquism effect and visual

acuity in the amblyopic eye across click timing conditions in which the second click lagged

the onset of the second light. The index of susceptibility to the temporal ventriloquism effect

was defined as the improvement in JND from each participant’s baseline performance in the AV

sync condition. The 200 ms click lag condition showed a positive relation between greater

susceptibility to the temporal ventriloquism effect and worse acuity in the amblyopic eye

(indicated by trend line), but this did not reach statistical significance. AE = amblyopic eye.

Figure 6.7: Relation between susceptibility to the temporal ventriloquism effect and stereo

acuity across click lag conditions in participants with amblyopia. The index of susceptibility

to the temporal ventriloquism effect was defined as the improvement in JND from each

participant’s baseline performance in the AV sync condition. The 200 ms click lag condition

showed a significant positive relation between greater susceptibility to the temporal

ventriloquism effect and worse stereo acuity (indicated by trend line).

6.5 Discussion

We characterized and compared the effect of paired sounds on performance in a visual TOJ task

for participants with unilateral amblyopia and visually normal controls under binocular viewing

conditions. Both the amblyopia and control groups showed a significant 25% improvement in

visual temporal precision, as measured by the JND, when the second click lagged the onset of the

second light by 100 ms, consistent with the temporal ventriloquism effect previously described

(Morein-Zamir et al., 2003). This finding suggests that the capacity for audiovisual integration in

the temporal dimension remains intact in amblyopia. By extension, it lends support to the

hypothesis that failed integration is not the primary source of multisensory processing

abnormalities observed in amblyopia. In the amblyopia group, we also found that poorer stereo

acuity was correlated with greater JND improvement from baseline (i.e., greater susceptibility to

the temporal ventriloquism effect) when the second click lagged the onset of the second light by

200 ms. A similar trend was observed between poorer visual acuity in the amblyopic eye and

greater susceptibility to the temporal ventriloquism effect at the 200 ms click lag condition, but

the relation did not reach statistical significance. These findings suggest that the width of the

temporal binding window for the effect is modulated by the severity of amblyopic visual deficits,

with an extended window observed in those with greater deficits in stereo acuity, and possibly

visual acuity.

A common factor that modulates the temporal ventriloquism effect and many other multisensory

phenomena (such as audiovisual simultaneity perception, the McGurk effect, and the sound-

induced flash illusion) is a dependency on cross-modal temporal correspondence and asynchrony

detection (Morein-Zamir et al., 2003; Munhall et al., 1996; Shams et al., 2000; Stevenson,

Zemtsov, et al., 2012). Previous work has shown that unilateral amblyopia is associated with

symmetric widening of the temporal window of audiovisual simultaneity perception (Chen et al.,

2017; Richards et al., 2017b) and reduced susceptibility to the McGurk effect under both

monocular and binocular viewing conditions (Burgmeier et al., 2015; Narinesingh et al., 2015;

Narinesingh et al., 2014). In addition, a study of the sound-induced flash illusion in amblyopia

suggested that the temporal binding window for the illusion is extended under binocular

conditions when the clicks lead the flash (Narinesingh et al., 2017). How does the observation of

a normal-like temporal ventriloquism effect with a possibly extended temporal binding window

fit in the context of these prior findings? Insight comes from the work by Stevenson, Zemtsov, et

al. (2012) who described the correlations of various indices of multisensory function in a sample

of visually normal adults. They found that the width of the audiovisual simultaneity window was

negatively correlated with susceptibility to the McGurk effect, but positively correlated with

susceptibility to the sound-induced flash illusion. They proposed that a narrower audiovisual

simultaneity window relates directly to a superior ability to dissociate, or resolve, asynchronous

unisensory components of an audiovisual stimulus pair. Because temporal correspondence is a

constraint on multisensory perceptual binding, any change in the sensitivity to audiovisual

asynchrony will necessarily alter the likelihood of audiovisual integration. In the case of the

McGurk effect, heightened sensitivity to asynchrony means that auditory and visual stimuli

perceived as synchronous are more unique, more likely to have arisen from a single event, and

therefore more strongly integrated in a fused percept. In the case of the sound-induced flash

illusion, diminished sensitivity to asynchrony means that the temporal constraints on integration

are looser. In turn, the asynchrony inherent in the sound-induced flash illusion stimulus poses

less of an impediment to integration, and therefore susceptibility to the illusory percept is

increased. In their study, Stevenson, Zemtsov, et al. (2012) did not explicitly test the width of the

temporal binding window for the sound-induced flash illusion, but based on their reasoning

(outlined above), one might expect perceptual binding and an illusory percept over a wider range

of audiovisual SOAs—that is, a wider temporal binding window—as was previously observed in

amblyopia (Narinesingh et al., 2017). Like the sound-induced flash illusion, the temporal

ventriloquism effect is also dependent upon perceptual binding of asynchronous auditory and

visual signals in the temporal dimension. Therefore, reduced sensitivity to audiovisual

asynchrony, as evidenced by a widened simultaneity window (Chen et al., 2017; Richards et al.,

2017b), would likely not diminish susceptibility to the temporal ventriloquism effect, but enable

integration over a wider range of SOAs. Indeed, a widened audiovisual temporal binding

window in amblyopia is suggested by the significant correlation between susceptibility to the

temporal ventriloquism effect and the extent of the stereo acuity deficit (Figure 6.7).

Taken together, the profile of multisensory processing abnormalities suggests that amblyopia

involves reduced temporal resolution in unisensory perception or in the mechanism for cross-

modal matching (i.e., non-integrative comparison of unisensory features) rather than a primary

deficit in audiovisual integration. Several sources of evidence point to an amblyopic deficit in

temporal resolution residing within vision rather than audition. Most obviously, amblyopia is a

primary disorder of the visual system, and its causative factors are ones that interfere with

normal visual experience. Behaviourally, deficits in temporal processing have been demonstrated

in the amblyopic and fellow eye (Huang et al., 2012; Spang & Fahle, 2009; St John, 1998).

Physiologically, cortical responses driven by stimulation of the amblyopic eye are less

synchronized (Roelfsema et al., 1994) and their temporal encoding is less reliable (Roelfsema et

al., 1994). Studies of visually normal people also demonstrate that audition is more temporally

precise than vision (Kanabus et al., 2002), and tends to be dominant in processing the temporal

dimension of audiovisual events (Aschersleben & Bertelson, 2003; Gebhard & Mowbray, 1959;

Shams et al., 2000; Shipley, 1964). Given the normal dominance of audition in temporal

audiovisual processing, any amblyopic deficit in auditory temporal resolution would likely have

diminished the magnitude of the temporal ventriloquism effect and the sound-induced flash

illusion, yet no such diminution was observed in the present study or previously (Narinesingh et

al., 2017). Finally, the width of the audiovisual simultaneity window and the width of the

temporal binding window for temporal ventriloquism vary with the extent of amblyopic deficits

in stereo acuity and visual acuity, respectively (Richards et al., 2017b). While correlation does

not equal causation, the relation is compelling.

Reduced temporal resolution in amblyopic vision may arise from noisy encoding of the visual

signal. Indeed, increased temporal jitter (Banko, Kortvelyes, Nemeth, et al., 2013; J. P. Kelly et

al., 2015) and interocular transmission latency differences (Sokol, 1983; Watts et al., 2002)

would necessarily reduce the temporal certainty for any visual event. Consequently, the

probability distribution for true visual event timing would be widened and likely skewed toward

later onset. Such a developmental mis-calibration may provide an explanation for the weaker

temporal constraints (i.e., wider temporal window) for audiovisual perceptual binding

documented in the present study and in prior work on amblyopic multisensory perception (Chen

et al., 2017; Narinesingh et al., 2017; Richards et al., 2017b).

Curiously, despite previous findings of reduced visual temporal resolution in amblyopia (Huang

et al., 2012; Spang & Fahle, 2009; St John, 1998), no such impairment was found in the present

study. Indeed, the visual-only, baseline audiovisual (AV sync), and asynchronous audiovisual

(click lead and click lag) conditions did not differ significantly between groups. There are

several possible explanations for this lack of effect, outlined below.

One possibility is that the visual temporal processing deficit is not uniformly distributed across

visual space. Huang et al. (2012) described a temporal processing deficit that was present

foveally (within 1.25° of fixation) but absent peripherally (at an eccentricity of 5°) in the

amblyopic eye, raising the possibility that the peripheral stimulus in the present study was

beyond the region of visual temporal resolution impairment. Against this, however, a temporal

processing deficit has been shown to involve regions of the amblyopic visual field well beyond

the eccentricity tested in the present study (Spang & Fahle, 2009).

Another possibility is that normal visual temporal information available from the fellow eye

overruled the deficient temporal signal from the amblyopic eye on binocular viewing. Visual

temporal resolution is indeed impaired in the fellow eye of people with strabismic amblyopia,

but only when the visual TOJ involves visual stimuli presented on opposite sides of the vertical

midline and thus requiring interhemispheric communication via the corpus callosum (St John,

1998). The visual stimuli in our study were presented on the vertical midline, however, meaning

that the visual TOJ may not have involved interhemispheric communication, and therefore may

not have induced the visual TOJ deficit previously described (St John, 1998).

A third possibility is that the temporal resolution deficit relevant to abnormal audiovisual

processing in amblyopia does not lie within unisensory visual perception, but at the interface

between auditory and visual temporal perception—at the level of cross-modal matching of

temporal features. This view is supported by prior work showing that sensitivity to audiovisual

asynchrony detection is equally impaired whether viewing with the amblyopic eye, the fellow

eye, or with both eyes together (Chen et al., 2017; Richards et al., 2017b), and by evidence from

visually normal humans indicating that detection of audiovisual asynchrony is based on

matching, rather than integration, of temporal features encoded within the unisensory streams

(Fujisaki & Nishida, 2005). The neural mechanism for cross-modal matching may have been

mis-calibrated in early life under the influence of increased temporal jitter (Banko, Kortvelyes,

Nemeth, et al., 2013; Roelfsema et al., 1994) and interocular latency differences (Sokol, 1983;

Watts et al., 2002). If audiovisual cross-modal matching is calibrated prior to the end of the

sensitive period for amblyopic visual recovery (Lewis & Maurer, 2005), then therapy that

improves vision, equalizes evoked response latencies (Arden & Barnard, 1979; Barnard &

Arden, 1979) and reduces internal temporal noise (Birch et al., 2016) may not narrow the

audiovisual temporal binding window. Asynchronous sensitive periods for unisensory and

multisensory functions may therefore explain why the temporal resolution of cross-modal

matching may be impaired despite normal visual TOJ performance.

Chapter 7 General Discussion and Conclusions

7.1 Summary of Findings and Evaluation of Specific Hypotheses

This thesis examined audiovisual processing and integration in adults with unilateral amblyopia.

It did so by measuring their performance on tasks of audiovisual spatial and temporal perception.

In the spatial dimension, the precision of auditory and visual localization, and the precision and

bias of audiovisual localization, were measured and compared to the performance of normally

sighted controls and to an ideal observer based on the maximum likelihood estimation (MLE)

model of optimal integration (Study I and II). In the temporal dimension, sensitivity to

audiovisual asynchrony (Study III) and perceptual enhancement by the temporal ventriloquism

effect (Study IV) were measured and compared to the performance of normally sighted controls.

Overall, the findings indicated that amblyopia involves spatial processing deficits in visual and

auditory localization, and temporal processing deficits in audiovisual asynchrony perception.

Despite these deficits in unisensory and non-integrative processing, however, individuals with

unilateral amblyopia exhibited optimal audiovisual spatial integration and intact audiovisual

temporal integration.

7.1.1 Audiovisual Spatial Perception

Studies I and II examined spatial localization of unisensory and multisensory audiovisual stimuli

in participants with amblyopia and visually normal controls.

7.1.1.1 Study I

Study I found that under binocular viewing conditions, participants with amblyopia localized

unimodal visual and auditory, as well as spatially congruent bimodal audiovisual stimuli, less

precisely than controls. Despite these pervasive deficits in spatial localization precision,

participants with amblyopia demonstrated optimal integration of visual and auditory spatial cues

according to the MLE model of multisensory integration (Alais & Burr, 2004; Ernst & Banks,

2002). This optimal strategy of audiovisual combination was evident not only in the spatial bias

of the fused percept in conditions of audiovisual spatial conflict (i.e., the spatial ventriloquism

paradigm), but also in the improvement in localization precision for bimodal stimuli (compared

to that for unimodal stimuli) in conditions of audiovisual spatial congruency.

The results of Study I partially support and are partially inconsistent with the specific hypotheses

(see section 2.2.1.1). In support of hypothesis (1), the observed localization precision for visual

and audiovisual stimuli was reduced compared to visually normal controls. In surprising

opposition, however, localization precision for auditory stimuli was also reduced. To the author’s

knowledge, reduced sound localization precision has not been reported in unilateral amblyopia,

or any other unilateral visual impairment. Hypothesis (2), which stated that participants with

amblyopia will weight audition more heavily than vision compared to visually normal controls,

was also rejected on the basis of empirical findings. The hypothesized sensory reweighting was

likely not observed because spatial precision for visual and auditory localization were both

reduced. Importantly, hypothesis (3), stating that audiovisual spatial integration will obey the

MLE model of optimal combination, was supported. The predictions of the MLE model were

borne out in two ways: in terms of the perceptual weight of the unisensory components in the

bimodal localization estimate, and critically, in terms of enhanced precision of the bimodal

localization estimate. Such precision enhancement is a hallmark of statistically optimal

multisensory integration.
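The two predictions of the MLE model mentioned above can be made concrete with a short sketch. Under the standard formulation (Ernst & Banks, 2002), each cue is weighted in proportion to its reliability (inverse variance), and the bimodal estimate is at least as precise as the better unisensory estimate. The function name and the example standard deviations are hypothetical and are not data from Study I.

```python
import math

def mle_prediction(sigma_v, sigma_a):
    """Predicted cue weights and bimodal SD under maximum likelihood integration.

    sigma_v, sigma_a: unisensory visual and auditory localization SDs.
    """
    var_v, var_a = sigma_v ** 2, sigma_a ** 2
    w_v = var_a / (var_v + var_a)  # visual weight rises as auditory noise rises
    w_a = 1.0 - w_v
    sigma_av = math.sqrt(var_v * var_a / (var_v + var_a))
    return w_v, w_a, sigma_av

# Hypothetical unisensory precisions (degrees), for illustration only.
w_v, w_a, sigma_av = mle_prediction(sigma_v=2.0, sigma_a=4.0)
print(f"w_V = {w_v:.2f}, w_A = {w_a:.2f}, sigma_AV = {sigma_av:.2f}")
```

An observer whose measured bimodal weights and bimodal precision match these predictions is said to integrate optimally, whatever the absolute level of unisensory precision.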

7.1.1.2 Study II

As a follow-up to Study I, Study II confirmed that amblyopia is associated with reduced relative

sound localization precision, as measured by the minimum audible angle (MAA) in the

horizontal plane. Study II also examined absolute sound localization in the horizontal plane, and

found that people with amblyopia localize sounds less accurately in the spatial hemifield

ipsilateral to their amblyopic eye. This asymmetry in sound localization accuracy correlated

significantly with the participants’ clinical deficits in stereo acuity and visual acuity.

The specific hypotheses for Study II (see section 2.2.1.2) were formulated in response to the

unexpected findings of Study I. Hypothesis (1), which stated that participants with amblyopia

will have a wider MAA than visually normal controls, was unequivocally supported. Hypothesis

(2), which predicted that participants with amblyopia will localize sounds less accurately than

visually normal controls, was supported, but with qualification: the impairment was not uniform

across the horizontal plane, but restricted to the spatial hemifield ipsilateral to the amblyopic eye.

7.1.2 Audiovisual Temporal Perception

Companion works to Studies I and II in the spatial domain, Studies III and IV examined

audiovisual multisensory interactions in the temporal domain.

7.1.2.1 Study III

Study III examined audiovisual simultaneity perception, and found that amblyopia is associated

with a greater likelihood of perceiving asynchronous auditory and visual signals as simultaneous

over a wider range of signal onset asynchronies (SOAs) compared to visually normal controls.

When participants with amblyopia were analyzed as a homogeneous group, the audiovisual

simultaneity window was widened by more than 35% for both auditory-lead and visual-lead

SOAs, and did not vary between monocular (i.e., amblyopic eye and fellow eye) and binocular

viewing conditions. When the participants were subdivided according to their clinical

characteristics, however, a pattern emerged: amblyopia with good binocularity was associated

with widening of the audiovisual simultaneity window, and the overlay of poor binocularity was

associated with a concomitant shift in the point of subjective simultaneity toward visual-lead

SOAs.

The specific hypotheses for Study III (see section 2.2.2.1), which predicted (1) a symmetrically

widened audiovisual simultaneity window, and (2) non-dependence on viewing condition, were

supported by the empirical findings.

7.1.2.2 Study IV

Study IV measured the just noticeable difference (JND) for a visual temporal order judgment

(TOJ) task, and investigated perceptual enhancement by paired auditory clicks in a temporal

ventriloquism paradigm (Morein-Zamir et al., 2003). Performance did not differ between the

amblyopia and control groups for the unisensory visual condition or the baseline multisensory

condition in which the onset of each light was paired with a synchronous auditory click. In both

groups, the JND for the visual TOJ was enhanced by 25% over baseline when the second click

lagged the onset of the second light by 100 ms. Within the amblyopia group, poorer stereo acuity

was significantly correlated with a greater JND enhancement when the second click lagged the

onset of the second light by 200 ms. The results indicated that amblyopia is associated with

intact audiovisual integration, and a wider temporal binding window for the temporal

ventriloquism effect.

The findings of Study IV generally supported the specific hypotheses stated in section 2.2.2.2.

Hypothesis (1), which predicted enhanced visual TOJ performance according to the temporal

ventriloquism effect, was unequivocally supported. Hypothesis (2), which predicted a widened

temporal binding window for the temporal ventriloquism effect in amblyopia, was supported not

by direct comparison of JNDs between the two groups, but by the significant positive correlation

between the degree of stereo acuity deficit and JND enhancement among participants with

amblyopia.

7.2 Is Audiovisual Integration Impaired in Amblyopia?

At first glance, the evidence for impaired audiovisual integration in amblyopia appears to be

conflicting. Several studies have established that unilateral amblyopia is associated with reduced

sensitivity to the McGurk effect (Burgmeier et al., 2015; Narinesingh et al., 2015; Narinesingh et

al., 2014). These investigators advanced two main hypotheses to explain this multisensory

abnormality: that amblyopia is associated with (1) a failure of audiovisual integration, or (2)

reduced reliability of the visual signal which, in turn, induces sensory reweighting to favour

audition in the combined multisensory percept. Other studies of adults with a history of early

bilateral deprivation from congenital cataracts described similar reductions in susceptibility to

the McGurk effect, as well as reduced interactions between simple auditory and visual stimuli in

the temporal dimension (Putzar et al., 2007; Putzar, Hötting, et al., 2010). Again, the findings

suggested a lack of multisensory integration caused by early anomalous visual experience

(Putzar et al., 2007). In contrast, a study of the sound-induced flash illusion suggested that

audiovisual integration is intact in adults with unilateral amblyopia (Narinesingh et al., 2017).

Rather than an integration deficit, participants with amblyopia showed a wider window of

temporal integration for the sound-induced flash illusion compared to visually normal

participants (Narinesingh et al., 2017). This thesis (Study I and Study IV) also presented strong

evidence that audiovisual integration remains intact in amblyopia. In the spatial domain, not only

did audiovisual combination produce a multisensory percept that was more precise than either of

the unisensory component percepts, but the strategy of multisensory combination was also

optimal according to the MLE model of the spatial ventriloquism effect (Study I). In the

temporal domain, not only was the strength of the temporal ventriloquism effect similar between

the amblyopia and control groups, but the temporal window of integration was in fact wider in

participants with worse binocularity (Study IV).

How can these conflicting pieces of evidence surrounding impaired audiovisual integration in

amblyopia be reconciled? Several possibilities exist.

7.2.1 Possible Mechanisms for the Pattern of Audiovisual Integration Abnormalities in Amblyopia

7.2.1.1 Differential Impact on Anatomic Sites of Audiovisual Integration

One possibility is that separate integrative mechanisms exist for the various audiovisual tasks,

and that these mechanisms are differentially affected by amblyopia. That separate neural circuits

are responsible for processing different aspects of multisensory integration is supported by the

observation that distinct anatomic sites are preferentially activated by different multisensory

tasks (reviewed in section 1.3.3). For example, the STS is preferentially activated during

integration of audiovisual speech stimuli (Callan et al., 2001; Calvert et al., 2000; Nath &

Beauchamp, 2012; Raij et al., 2000), the IPS is preferentially activated in audiovisual tasks

involving spatial localization and spatial attention (Bushara et al., 1999; Lewis et al., 2000), and

the cortex of the insula is preferentially activated by audiovisual perceptual binding on the basis

of temporal correspondence (Bushara et al., 2001; Calvert et al., 2001). If amblyopia

disproportionately alters processing in the ventral STS and relatively spares the dorsal IPS and

cortex of the insula, then one might predict that integration of audiovisual speech signals in the

McGurk effect would be more impaired than integration of simple audiovisual stimuli in the

spatial and temporal ventriloquism effects. Indeed, this is the general pattern of multisensory

integration abnormalities observed in behavioural studies of amblyopia.

Physiological and neuroimaging evidence also offers some support to the hypothesis that the

circuits for audiovisual integration in the ventral pathway are disproportionately affected in

amblyopia (Milner & Goodale, 2008). In cats with strabismic amblyopia, single-unit responses to

visual stimuli are more abnormal in the ventral pathway compared to the dorsal pathway

(Schröder, Fries, Roelfsema, Singer, & Engel, 2002). Similarly, in humans with strabismic and

anisometropic amblyopia, fMRI data suggest that transmission failure from lower to higher

visual areas affects the ventral pathway more consistently than the dorsal pathway (Muckli et al.,

2006). These findings are not conclusive, however, as amblyopic abnormalities in the dorsal

pathway and its associated functions are well-documented in behavioural (Hess, Demanins, &

Bex, 1997; Ho et al., 2005; Mansouri & Hess, 2006; Mirabella et al., 2011; Simmers et al., 2003;

Simmers, Ledgeway, Mansouri, Hutchinson, & Hess, 2006), neuroimaging (Barnes et al., 2001;

X. Li et al., 2011; Secen et al., 2011) and physiological studies (Schröder et al., 2002).

If the pattern of audiovisual perceptual abnormalities in amblyopia stems from differential

effects on the capacity for audiovisual integration among anatomically distinct areas, then why

should one anatomic area be affected more than another? Specifically, why should the putative

circuits for audiovisual integration in the ventral pathway (e.g., audiovisual speech integration in

the STS) be more affected than those residing elsewhere (e.g., audiovisual spatial integration in

the IPS)? A possible answer lies in the known differences in the extent to which the central and

peripheral visual fields are represented in the two streams (reviewed by Milner and Goodale

(2008)). In the striate cortex (V1), the central visual field is topographically over-represented,

with more neural tissue devoted to processing of central compared to peripheral visual stimuli

(Tootell, Switkes, Silverman, & Hamilton, 1988; Van Essen, Newsome, & Maunsell, 1984).

While this emphasis on the central visual field persists in the ventral pathway, the peripheral

visual field is relatively emphasized in the dorsal pathway (Brown, Halpert, & Goodale, 2005;

Van Essen & Deyoe, 1995). Indeed, some dorsal visual areas, such as the parieto-occipital area,

show almost no cortical magnification at all (Colby, Gattass, Olson, & Gross, 1988).

Importantly, the spatiotemporal visual deficits in amblyopia also show differential effects on the

central and peripheral visual fields (reviewed in section 1.1.3): in strabismic amblyopia, contrast

sensitivity is relatively more affected in the central visual field (Hess & Pointer, 1985), and in

both anisometropic and strabismic amblyopia, increased latency on multifocal VEP is more

pronounced in the central visual field (Yu et al., 1998; Zhang & Zhao, 2005). The ventral

pathway and its associated circuits for audiovisual speech integration may therefore be

particularly affected by amblyopia, as both its visual input and the amblyopic deficit

predominantly involve the central visual field. By the same reasoning, the dorsal pathway and its

associated audiovisual integrative functions may be relatively spared.

Several findings challenge this hypothesis, however. Activation of the STS is observed during

illusory perception for both the sound-induced flash illusion (Watkins et al., 2006) and the

McGurk effect (Callan et al., 2001; Calvert et al., 2000; Raij et al., 2000), yet people with

amblyopia show reduced integration only for the McGurk effect (Burgmeier et al., 2015;

Narinesingh et al., 2015; Narinesingh et al., 2014). Not only do people with amblyopia remain

susceptible to the sound-induced flash illusion, they show a widened temporal binding window

for the effect (Narinesingh et al., 2017). If the differences in visual field representation in the

dorsal and ventral streams accounted for the pattern of audiovisual integration abnormalities in

amblyopia, then one would predict that susceptibility to both illusions involving the STS—the

McGurk effect and the sound-induced flash illusion—would be diminished. Furthermore, if the

amblyopic abnormalities in perception of the McGurk effect and sound-induced flash illusion are

related to alterations in shared multisensory neural circuits, then susceptibility to the illusions

would likely co-vary. Contrary to this prediction, however, susceptibility to the McGurk effect is

negatively correlated with susceptibility to the sound-induced flash illusion (Stevenson,

Zemtsov, et al., 2012).

7.2.1.2 Differential Influences of Attention on Audiovisual Integrative Processes

Another possible mechanism for the pattern of audiovisual integration abnormalities observed in

amblyopia is attention: specifically, the interaction between a visual attention deficit in

amblyopia and the differential effect of attention on multisensory processes in normal adults

(reviewed in section 1.3.2).

The hypothesis that amblyopia involves a visual attention deficit stems from studies suggesting

that the crowding phenomenon in normal peripheral vision is not due to limits in spatial

resolution, but rather to the resolving power of visual attention (He, Cavanagh, & Intriligator,

1996; Intriligator & Cavanagh, 2001). The increased crowding effect in amblyopia may therefore

reflect a deficit in visual attention. Multiple subsequent studies of the crowding effect have found

evidence of deficient selective visual attention in observers with strabismic amblyopia while

viewing with the amblyopic eye (Hariharan, Levi, & Klein, 2005; Levi, Hariharan, & Klein,

2002; McKee et al., 2003; Sharma et al., 2000; Tripathy & Cavanagh, 2002), and a study of

spatial tracking in amblyopia suggested the visual attention deficit extends to the fellow eye of

both strabismic and anisometropic subtypes (Ho et al., 2006). More recently, a functional

neuroimaging study showed that a brief period of visual deprivation from bilateral congenital

cataracts alters the balance between visual and auditory attention, favouring audition (de Heering

et al., 2016).

The results of Study I and Study IV presented in this thesis provide strong evidence that the

capacities for spatial and temporal integration of simple audiovisual stimuli (i.e., clicks and

flashes) remain intact in amblyopia. In contrast, previous studies of the McGurk effect in

amblyopia suggested that integration of audiovisual speech signals is impaired (Burgmeier et al.,

2015; Narinesingh et al., 2015; Narinesingh et al., 2014; Putzar, Hötting, et al., 2010).

Importantly, the magnitude of the modulating influence of attention on audiovisual integration

varies according to the perceptual task. The spatial ventriloquism effect has proven insensitive to

the effects of top-down directed attention and bottom-up automatic attention (Bertelson et al.,

2000; Vroomen et al., 2001). Morein-Zamir et al. (2003) have shown that the temporal

ventriloquism effect cannot be accounted for by attentional alerting or distraction by cross-modal

interference. Similarly, a study of the sound-induced flash illusion suggests that it is not a

function of visual attentional enhancement by sound (Shams et al., 2002). In contrast, the

strength of the McGurk effect is significantly modulated by attention: susceptibility is reduced

under conditions of increased attentional load and attentional diversion to irrelevant

somatosensory stimuli (Alsius et al., 2005; Alsius et al., 2007). Therefore, an amblyopic deficit

in visual attention may hypothetically account for the observed reduction in susceptibility to the

McGurk effect and preserved integration in the spatial and temporal ventriloquism effects, and

the sound-induced flash illusion. However, an attentional explanation for the effect of amblyopia

on audiovisual integration does not offer a clear explanation for the widened windows of

temporal binding observed in the temporal ventriloquism effect (discussed in Study IV and

section 6.5) and sound-induced flash illusion (Narinesingh et al., 2017).

7.2.1.3 Differential Sensitive Periods for Audiovisual Integrative Processes

The preponderance of evidence from developmentally typical humans points to multisensory

integration being a late-emerging function in the course of sensory development (reviewed in

section 1.3.6). Multisensory facilitation of reaction times emerges at about 8 years of age, and

matures over a period of 2 to 3 years (Barutchu et al., 2009; Barutchu et al., 2010). Statistically

optimal integration also emerges late: after 8 years of age for visual and proprioceptive

navigational cues (Nardini et al., 2008), between 8 and 10 years of age for visual and haptic

object size cues (Gori et al., 2008), and after 12 years of age for visual and auditory spatial

bisection cues (Gori, Sandini, et al., 2012). An apparent exception to this pattern, however, is

integration of audiovisual speech cues. Indeed, McGurk stimuli elicit behavioural and

electrophysiological responses suggestive of audiovisual integration in infants as young as 4

months of age (Bristow et al., 2009; Burnham & Dodd, 2004; Desjardins & Werker, 2004;

Rosenblum et al., 1997). If these facets of multisensory integration are susceptible to

maldevelopment, or damage, from anomalous sensory input, then their distinct ages of

emergence imply the presence of distinct sensitive periods as well. Indeed, multiple

asynchronous sensitive periods are well-described for different facets of unisensory visual

development (reviewed in section 1.1.5 and in Lewis and Maurer (2005)). By extension, if the

sensitive period for integration of audiovisual speech signals is considerably earlier than those

for simple audiovisual spatial and temporal signals (as tested by the spatial and temporal

ventriloquism effects in Study I and Study IV), then the pattern of audiovisual integrative

capacities observed in amblyopia may be explained. That is, amblyopia or abnormal visual

experience in early life may affect audiovisual speech integration, but not other integrative

functions, because their respective sensitive periods for damage are asynchronous. This

hypothesis is supported by data suggesting that normal susceptibility to the McGurk effect is

observed in children whose amblyopia either resolved before 5 years of age, or had its onset after 5

years of age (Burgmeier et al., 2015).

The concept of sensitive periods may also be relevant to amblyopic abnormalities in processes

such as audiovisual simultaneity perception (Study III) that are multisensory but not clearly the

consequence of integration (Chen et al., 2017; Fujisaki & Nishida, 2005). Chen et al. (2016)

showed that the audiovisual simultaneity window narrows on both the auditory-lead and visual-

lead sides throughout childhood, reaching its adult shape by 9 years of age—long after the

typical age of onset for amblyopia (Birch, 2013). Interestingly, 9 years is also the approximate

age at which many aspects of multisensory integration first emerge (reviewed above and in

section 1.3.6). This timeline of multisensory development raises the possibility that mature non-

integrative multisensory processes, such as cross-modal matching based on temporal and spatial

correspondence, may be a pre-condition for the emergence of statistically optimal integration.

Indeed, Ernst (2008) noted that establishing correspondence between multisensory signals (i.e.,

determining which signals belong together) is an essential task, without which integration cannot

occur. In this way, amblyopia may exert an influence on multisensory integration through its

effect on the maturation of non-integrative multisensory processes.

7.2.1.4 Multi-stage Audiovisual Processing

If the influence of amblyopia on multisensory integration is secondary to its deleterious effect on

non-integrative multisensory processes such as audiovisual asynchrony detection (Study III),

why is susceptibility to the McGurk effect reduced (Burgmeier et al., 2015; Narinesingh et al.,

2015; Narinesingh et al., 2014; Putzar, Hötting, et al., 2010), while susceptibility to temporal

ventriloquism (Study IV) and the sound-induced flash illusion is normal or even increased

(Narinesingh et al., 2017)? A possible explanation is suggested by converging lines of evidence

for a multi-stage mechanism for multisensory processing specific to audiovisual speech

perception (reviewed in section 1.3.8.4).

Using perceptually ambiguous sine wave replicas of natural speech, Tuomainen et al. (2005)

showed that audiovisual integration in a McGurk paradigm depends on whether the listener

believes the auditory stimuli are speech or non-speech signals. If the listener was unaware that

the auditory stimuli were speech, negligible integration was observed. If the listener learned to

perceive the same auditory stimuli as speech, however, significant integration occurred (as for

natural speech). These results point to the existence of a speech-specific mode of multisensory

perception that depends on access to phonetic representations of auditory stimuli. An fMRI study

of a similar paradigm examined brain activation by visual speech paired with auditory speech

and sine wave replicas in participants trained to perceive the sine wave auditory signal as

intelligible speech or as non-speech sounds (Lee & Noppeney, 2011b). The results revealed a

posterior-to-anterior multisensory processing gradient along the STS and superior temporal gyrus

in the ventral stream (Figure 1.4). Although fMRI lacks the temporal resolution to determine the

activation sequence, this finding suggests that as audiovisual signals advance along this pathway,

they are integrated on the basis of increasingly selective and complex features. An

electrophysiological study by Baart, Stekelenburg, et al. (2014) employed a similar pairing of

visual speech and sine wave speech to examine the temporal characteristics of audiovisual

speech integration in the cerebral cortex. They found corroborating evidence for a speech-

specific mode of multisensory perception, and reported that audiovisual integration of

spatiotemporal features precedes integration of linguistic features. The authors proposed a

sequential, rather than parallel, time course of audiovisual speech integration in which

integration of spatiotemporal properties occurs first (from 50 to 100 ms) followed by integration

of phonetic properties (from 100 to 200 ms). If the output of the first (spatiotemporal) stage of

audiovisual speech integration influences or constrains integration in the second (phonetic) stage,

then the output of the first stage may be conceptually analogous to the unity assumption (Welch

& Warren, 1980), or a Bayesian prior (Magnotti & Beauchamp, 2017), that determines the

subsequent strength of integration of the audiovisual phonetic information. In other words, the

strength of phonetic integration in the second stage may be dependent upon the certainty of

common causality in the first stage. Indeed, evidence for such a relation between simple featural

binding and phonetic integration was reported in a study by Stevenson, Zemtsov, et al. (2012).

The authors measured the performance on a variety of audiovisual tasks in a sample of

developmentally normal adults, and found that those with a wider temporal window of perceived

audiovisual simultaneity generally showed lower susceptibility to the McGurk effect, but higher

susceptibility to the sound-induced flash illusion. They hypothesized that a wider simultaneity

window relates to a poorer ability to dissociate asynchronous events, leading to a reduction in the

uniqueness of perceived synchronous events, and consequently, reduced phonetic integration as

indexed by the McGurk effect. Indeed, if perceived synchrony is less unique, then its usefulness

in determining whether a given audiovisual pair arose from a single event (i.e., common

causality) is reduced. Similarly, the widened audiovisual simultaneity window observed in

amblyopia (Study III and Chen et al. (2017)) may reflect poorer ability to dissociate

asynchronous signals. This may lead to a less reliable determination of common causality in the

first stage of audiovisual speech integration, which propagates forward in the pathway to reduce

the strength of phonetic integration in the second stage. In this way, reduced susceptibility to the

McGurk effect in amblyopia may not reflect a failure of integration, but may indeed represent a

statistically optimal strategy of audiovisual speech integration.
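
The reasoning in this paragraph can be illustrated with a small numerical sketch in the spirit of Bayesian causal inference. It is not a model fitted in this thesis or in the cited studies. The function names, the equal prior on a common cause, and all standard deviations below are hypothetical values chosen only for illustration. The sketch assumes that a measured audiovisual asynchrony is Gaussian-distributed around zero, narrowly if the signals share a cause and broadly if they do not, and that temporal noise in the observer adds to both distributions.

```python
import math


def gaussian_pdf(x, sd):
    """Density of a zero-mean Gaussian with standard deviation sd."""
    return math.exp(-0.5 * (x / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def p_common_cause(soa_ms, sensory_sd_ms, prior_common=0.5,
                   common_sd_ms=20.0, separate_sd_ms=300.0):
    """Posterior probability that a sound and a flash share one cause.

    soa_ms         measured audiovisual asynchrony
    sensory_sd_ms  temporal noise of the observer (hypothetical)
    common_sd_ms   spread of true asynchronies for one event (hypothetical)
    separate_sd_ms spread of true asynchronies for two events (hypothetical)
    """
    # Sensory noise adds in quadrature to the spread of each hypothesis.
    sd_c1 = math.hypot(sensory_sd_ms, common_sd_ms)
    sd_c2 = math.hypot(sensory_sd_ms, separate_sd_ms)
    like_c1 = gaussian_pdf(soa_ms, sd_c1) * prior_common
    like_c2 = gaussian_pdf(soa_ms, sd_c2) * (1.0 - prior_common)
    return like_c1 / (like_c1 + like_c2)


if __name__ == "__main__":
    for label, noise in (("precise observer", 40.0), ("noisy observer", 120.0)):
        at_sync = p_common_cause(0.0, noise)
        at_200 = p_common_cause(200.0, noise)
        print(f"{label}: P(common | 0 ms) = {at_sync:.2f}, "
              f"P(common | 200 ms) = {at_200:.2f}")
```

Under these assumed values, the noisier observer assigns a lower probability of common cause to physically synchronous pairs and a higher probability to pairs separated by 200 ms. This is the qualitative pattern described above: perceived synchrony becomes less unique and therefore less informative about common causality.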

If a reduced McGurk effect reflects a lower certainty of common causality in amblyopia, why is

susceptibility to the temporal ventriloquism effect and the sound-induced flash illusion not also

reduced? Unlike the McGurk effect, the temporal ventriloquism effect and the sound-induced

flash illusion do not require integration of linguistic elements. They are

therefore unlikely to invoke the multi-stage, speech-specific mode of perception involving the

anterior STS (Baart, Stekelenburg, et al., 2014; Lee & Noppeney, 2011b; Tuomainen et al.,

2005). As a consequence, the additional constraints on integration imposed by phonetic

mismatch would not affect these non-speech integrative phenomena. For example, if unilateral

amblyopia involves deficits in lip-reading ability like those described in early visual deprivation,

they may only affect audiovisual speech integration at the second stage specific to phonetic

content (Lalonde & Holt, 2016; Putzar, Goerendt, et al., 2010; Putzar, Hötting, et al., 2010).

Furthermore, multi-stage processing implies that the criterion for phonetic integration in

amblyopia may be shifted independently from the criterion for spatiotemporal integration.

Reduced susceptibility to the McGurk effect in amblyopia may alternatively represent

amblyopia-related maldevelopment of the neural substrates for phonetic integration in the second

stage of audiovisual speech integration. Because temporal ventriloquism and the sound-induced

flash illusion do not involve linguistic elements, any amblyopia-related impairment specific to

phonetic integration would not affect these non-speech integrative phenomena.

7.2.1.5 Optimal Integration in the Setting of Reduced Sensory Reliability

A Bayesian framework has been successfully applied to numerous instances of multisensory

integration in humans (reviewed in section 1.3.5.3). An underlying assumption of the Bayesian

framework is that integration is statistically optimal, and that the weight of each modality in the

combined multisensory percept is a function of the relative reliability of each unisensory

component stimulus (Ernst & Bülthoff, 2004). Sensory information from the more reliable

modality is weighted more heavily than sensory information from the less reliable modality. In

normal adults, the unisensory weighting coefficients for optimal combination are not fixed for

each modality, but have been shown to dynamically readjust in response to exogenous changes

in signal reliability (Alais & Burr, 2004; Andersen et al., 2005; Battaglia et al., 2003; Ernst &

Banks, 2002; Gori et al., 2008; Moro et al., 2014; Nardini et al., 2008). This dynamic response to

exogenous changes in signal reliability indicates that normal multisensory integration remains

sensitive and flexible at maturity in many instances.
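
As a minimal sketch of the reliability-weighting rule described in this paragraph, the standard maximum likelihood estimation (MLE) combination of two Gaussian cues can be written as follows. The function name and the example values are hypothetical and are not taken from the thesis data. Reliability is defined here as the inverse variance of each unisensory estimate.

```python
import math


def mle_combine(est_v, sigma_v, est_a, sigma_a):
    """Reliability-weighted combination of a visual and an auditory estimate.

    Returns the combined estimate, its predicted standard deviation,
    and the weight given to vision.
    """
    # Reliability is the inverse of the variance of each unisensory estimate.
    rel_v = 1.0 / sigma_v ** 2
    rel_a = 1.0 / sigma_a ** 2
    # Each weight is that modality's share of the total reliability.
    w_v = rel_v / (rel_v + rel_a)
    w_a = 1.0 - w_v
    est_av = w_v * est_v + w_a * est_a
    # Predicted bimodal variance is never larger than either unisensory variance.
    var_av = (sigma_v ** 2 * sigma_a ** 2) / (sigma_v ** 2 + sigma_a ** 2)
    return est_av, math.sqrt(var_av), w_v


if __name__ == "__main__":
    # Hypothetical values: vision twice as precise as audition.
    print(mle_combine(0.0, 2.0, 10.0, 4.0))
    # Hypothetical values: visual noise doubled, so the weights equalize.
    print(mle_combine(0.0, 4.0, 10.0, 4.0))
```

In the first hypothetical case vision receives a weight of 0.8. When the visual standard deviation is doubled to match audition, the visual weight falls to 0.5. This is the kind of dynamic reweighting in response to reduced signal reliability that the cited experiments describe.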

Several experiments that demonstrate dynamic readjustment of unisensory weighting in optimal

multisensory integration have modulated the reliability of the exogenous visual signal by the

addition of random noise (Battaglia et al., 2003; Ernst & Banks, 2002). Importantly,

spatiotemporal noise is also a well-documented feature of the amblyopic visual system (reviewed

in section 1.1.3, Levi (2013), and Banko, Kortvelyes, Weiss, et al. (2013)). Downstream

multisensory areas in the amblyopic brain may not distinguish external noise from internal noise,

and may therefore weight vision according to the spatiotemporal reliability of its neural signal. Indeed, such a

mechanism is suggested by the results of Study I, which showed that audiovisual spatial

integration in amblyopia obeyed the MLE model of optimal combination (a special case of the

Bayesian framework).

Insight may be gained from other audiovisual phenomena as well. The perceptual tasks involved

in the temporal ventriloquism effect and sound-induced flash illusion involve auditory

dominance over a visual temporal judgment, whereas the McGurk effect involves visual

dominance over an auditory phonetic judgment. If modality dominance is assumed to reflect

optimal perceptual weighting based on the reliability of the component unisensory inputs, then

certain predictions follow. Assuming normal temporal reliability of the amblyopic auditory

signal, a heightened auditory contribution can be predicted for perceptual tasks typically

dominated by audition (e.g., temporal judgments), and a diminished visual contribution can be

predicted for perceptual tasks typically dominated by vision (e.g., phonetic judgments). Indeed,

this pattern is in general agreement with the empirical data from adults with amblyopia. The

widened temporal binding windows for the temporal ventriloquism effect (Study IV) and the

sound-induced flash illusion (Narinesingh et al., 2017) are consistent with a heightened auditory

contribution in response to diminished visual temporal reliability in amblyopia. Although the

magnitude of susceptibility to the temporal ventriloquism effect (Study IV) and the sound-induced

flash illusion (Narinesingh et al., 2017) was not heightened in amblyopia, this may

reflect a ceiling effect for the contribution of audition in the audiovisual percept. Reduced

sensitivity to the McGurk effect is also consistent with a greater contribution of audition to the

fused percept in amblyopia.

Curiously, susceptibility to the McGurk effect remains reduced in amblyopia even when viewing

with the fellow eye only. At first glance, this observation appears to conflict with the hypothesis

that the mechanisms of audiovisual integration remain intact and optimal in amblyopia.

Importantly, however, the McGurk effect involves integration of not only simple spatiotemporal

properties of the multisensory stimuli, but also of the more complex phonetic identity of the

linguistic content. Phonetic identity of a visual signal is derived from lip-reading abilities, and

lip-reading abilities are sensitive to damage by early-onset visual deprivation (Putzar, Goerendt,

et al., 2010; Putzar, Hötting, et al., 2010). If lip-reading abilities are similarly impaired in

unilateral amblyopia, then diminished susceptibility to the McGurk effect during fellow eye

viewing may still reflect an optimal process of multisensory phonetic integration. These issues

are not resolved, however. The effect of unilateral amblyopia on lip-reading abilities and the

optimality of the McGurk percept remain open to investigation.

7.3 Are Non-integrative Audiovisual Processes Impaired in Amblyopia?

In the preceding section, the question of whether multisensory integration is impaired in

amblyopia was explored. Study I and Study II revealed that spatial localization precision for

visual, auditory, and audiovisual stimuli is reduced in amblyopia. Comparison of the empirical

data with an MLE ideal observer model showed that participants with amblyopia demonstrated

optimal integration; that is, impairments in spatial precision at the unisensory (i.e., visual and

auditory) level accounted for spatial deficits observed at the multisensory (i.e., audiovisual)

level. Study III showed that temporal resolution for detection of audiovisual asynchrony is

reduced in amblyopia. Despite this audiovisual temporal perception deficit, integration—as

demonstrated by the temporal ventriloquism effect—was intact in amblyopia, as shown in Study

IV. Prior observations on the McGurk effect have suggested that audiovisual integration is

impaired in amblyopia (Burgmeier et al., 2015; Narinesingh et al., 2015; Narinesingh et al.,

2014). However, quantitative assessments of the unisensory contributions to the fused percept

have not been done for the McGurk effect in amblyopia. Without such measurements of

unisensory performance, the concept of an integration failure in amblyopia remains hypothetical.

As explored in section 7.2.1, many mechanisms other than a failure of appropriate integration

may explain the comparatively low susceptibility to the McGurk effect observed in amblyopia.

On the balance of evidence summarized above and reviewed in section 7.2, it can be argued that

unilateral amblyopia does not involve a primary failure of audiovisual integration. Rather, the

observed abnormalities in audiovisual integration may be explained plausibly and

parsimoniously by amblyopia-related impairments of non-integrative multisensory processes

acquired during early life. The evidence for this hypothesis will be examined below.

7.3.1 Cross-modal Matching

Cross-modal matching refers to the multisensory process by which stimuli from different sensory

modalities are compared to estimate their equivalence (Stein et al., 2010). In contrast to

multisensory integration, which involves fusion of unimodal information to produce a new

unified percept, cross-modal matching requires preservation of stimulus features within each

modality (Fujisaki & Nishida, 2005; Stein et al., 2010).

Audiovisual simultaneity judgment is an example of cross-modal matching on the basis of

audiovisual temporal correspondence. Study III showed that adults with the most common forms

of unilateral amblyopia (anisometropic, strabismic, and mixed mechanism) have a widened

temporal window of perceived audiovisual simultaneity, suggesting reduced precision in the

neural mechanism for cross-modal matching of audiovisual temporal features. The window was

widened in both auditory-lead and visual-lead SOA conditions, consistent with findings recently

reported for a sample of adults with deprivational amblyopia caused by unilateral congenital

cataract (Chen et al., 2017). In both studies, the shape of the audiovisual binding window did not

change with viewing condition, indicating that the alterations in simultaneity perception were not

real-time adjustments to amblyopic visual input, but likely reflected changes crystallized during

development. Stevenson, Zemtsov, et al. (2012) investigated the audiovisual simultaneity

window and how it relates to performance on other multisensory tasks, and found that

developmentally normal adults with a wider audiovisual simultaneity window (particularly the

visual-lead side) tend to exhibit lower susceptibility to the McGurk effect, but greater

susceptibility to the sound-induced flash illusion. The authors postulated that the correlations

between performance parameters for these tasks reflect individual variability in the underlying

ability to dissociate asynchronous audiovisual stimuli. Similarly, Chen et al. (2017) hypothesized

that the widened window of audiovisual simultaneity in deprivational amblyopia results from

lower temporal precision in the cross-modal perceptual system, and that amblyopic visual input

may interfere with normal developmental tuning of the neural circuits for audiovisual

simultaneity perception (Chen et al., 2017; Chen et al., 2016). A possible mechanism for this

apparent interference in the developmental tuning of audiovisual simultaneity perception (as

discussed in Study III) is temporal uncertainty, or noise, in the amblyopic visual signal (Banko,

Kortvelyes, Nemeth, et al., 2013; Roelfsema et al., 1994). Indeed, a recent abstract reporting a

study of 47 visually normal adults showed that the precision of temporal perception in an

audiovisual simultaneity judgment task can be predicted from the trial-to-trial variability of an

individual’s cortical evoked responses to visual or auditory stimuli (Arnold, Mathews, Keane, &

Yarrow, 2017). Extrapolating to amblyopia, this finding implicates neural noise in the visual

signal as a limit on the developmental tuning of audiovisual simultaneity perception.

Impaired cross-modal matching may also account for the effect of amblyopia on the temporal

window of integration for the temporal ventriloquism effect (Study IV). Although the strength of

integration across stimulus conditions did not differ significantly between groups, subgroup

analysis revealed a wider temporal window of integration among amblyopic participants with

poorer stereo acuity, and suggested a similar trend for those with more severe acuity loss in the

amblyopic eye (Figure 6.6 and Figure 6.7). Not only do these findings indicate that the capacity

for audiovisual temporal integration is intact in amblyopia, they also suggest that it operates over

a wider range of SOAs. Two possible explanations exist for this widened window of audiovisual

temporal integration if sequential versus parallel processing mechanisms are considered. If

temporal ventriloquism is a product of sequential processing, then a widened simultaneity

window may be the proximate cause of the widened window of integration (i.e., the simultaneity

window acts as a temporal filter that constrains subsequent integration). Conversely, if temporal

ventriloquism is a product of parallel processing, then temporal noise in the amblyopic visual

signal may be the proximate cause of both the widened window of simultaneity and the widened

window of integration. A speculated distinguishing feature between these two proposed

mechanisms is the effect of viewing condition on the width of the window of integration. In the

case of sequential processing, the window of integration will depend upon the window of

simultaneity. Because the window of simultaneity does not differ between amblyopic eye and fellow

eye viewing, the window of integration would also remain unchanged. In the case of parallel

processing, however, the window of integration will depend upon the level of temporal noise in

the visual signal. Because the capacity for optimal integration is argued to remain intact in

amblyopia, the window of integration will change with the viewing eye. Although this remains

an outstanding question, it is important to note that temporal noise in the amblyopic visual signal

(as opposed to a failure of integration) can account for the observed perceptual abnormalities in

both hypothetical mechanisms.

7.3.2 Unisensory Impairments and Cross-sensory Calibration

In addition to the effects of amblyopia on audiovisual integration and non-integrative

audiovisual processes discussed above, Study I and Study II revealed associated abnormalities in

unisensory spatial localization. The reduced precision in visual spatial localization observed in

Study I (Figure 3.4) undoubtedly represents a unimodal effect related to other spatiotemporal

visual deficits in amblyopia (reviewed in section 1.1.3). In contrast, the reduced precision in

auditory spatial localization (i.e., wider MAA) observed in Study I, and confirmed in Study II,

cannot be explained as a real-time effect of amblyopic visual input. Indeed, the experiments were

conducted in complete darkness, with neither the auditory stimuli, nor the method of response for

sound localization, involving vision. This finding was unexpected, but replicated, and constitutes

discovery of a novel clinical deficit in people with the most common forms of amblyopia

(anisometropic, strabismic, and mixed-mechanism). The mechanism for this novel deficit in

sound localization is likely one of impaired cross-sensory calibration by vision (reviewed in

section 1.3.7). That is, amblyopic visual input disrupts the developmental calibration of sound

localization during a sensitive period in early life. While animal models (King et al., 1988;

Knudsen & Knudsen, 1989) and human data (Gori et al., 2014; Lessard et al., 1998) support this

hypothesis, this novel finding is particularly intriguing because discordant binocular vision has

never before been shown to impair spatial hearing. To the contrary, the only previous work to

study sound localization in early unilateral visual impairment examined monocular adults, and

found that sound localization accuracy was slightly enhanced (Hoover et al., 2012).

In addition to reduced sound localization precision in amblyopia, Study II also found that sound

localization accuracy was poorer in the spatial hemifield ipsilateral to the amblyopic eye (Figure

4.4C, D). Furthermore, the magnitude of these hemifield-specific inaccuracies correlated

significantly with clinical markers of amblyopia, namely visual acuity in the amblyopic eye and stereo

acuity (Figure 4.5). Considering the anatomic differences in how retinal fibres decussate in the

retinotectal and retinogeniculate pathways (Lane et al., 1973; Pollack & Hickey, 1979), this

asymmetric pattern of sound localization error suggests that the superior colliculus, rather than

V1, mediates the cross-sensory calibration of sound localization by vision in humans. If this is

the case, it implies that amblyogenic factors in early life not only disrupt visual spatial

processing in the retinostriate pathway (see section 1.1.4), but that they also cause a de novo

(i.e., second primary) deficit in auditory spatial processing in the midbrain via the

retinocollicular pathway. Indeed, a similar mechanism involving the superior colliculus as a

second primary site of impairment was previously hypothesized by Ciuffreda et al. (1978) to

explain the prolongation of saccadic latencies in amblyopia.

7.4 Clinical Implications

The majority of the discussion to this point has dealt with elucidating the pattern and

pathophysiology of multisensory processing abnormalities in amblyopia. The findings herein

also have clinical implications for the diagnosis and treatment of amblyopia and its associated

deficits.

It is important to note that many multisensory phenomena (e.g., the McGurk effect and the

spatial ventriloquism effect) rely on unnatural pairings of audiovisual stimuli to induce

perceptual illusions. The normal perceptual system appears to fail the observer in such

circumstances by delivering a perceptual product that is non-veridical. For instance, at first

glance, it would appear that normal susceptibility to the McGurk effect, resulting in non-

veridical auditory perception, would be an adaptive disadvantage. On deeper consideration,

however, it reflects an adaptive ability to combine naturally-occurring stimuli to enhance the

fidelity of perception. In and of themselves, such illusory phenomena do not demonstrate the

adaptive advantages of multisensory integration, but serve as useful experimental tools to probe

the mechanistic underpinnings of multisensory function. Assessing the clinical implications of

abnormal perception of illusory percepts in amblyopia therefore necessitates extrapolation to

ecologically valid situations.

The balance of evidence reviewed and presented in this thesis points to intact spatial and

temporal audiovisual integration in amblyopia. Indeed, people with amblyopia integrate visual

and auditory spatial signals appropriately according to the MLE model, and show enhancements

in visual temporal processing by the temporal ventriloquism effect. These findings suggest that

singling out integrative processes as specific targets for rehabilitation may be misguided, and

shift the focus for clinically-relevant perceptual deficits to the realm of unisensory and non-

integrative multisensory functions.
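
As a minimal sketch of what integration "according to the MLE model" predicts, the following Python function computes the standard maximum likelihood weights and combined precision from two unisensory standard deviations. The function name and the example values are illustrative assumptions and are not data from this thesis.

```python
def mle_prediction(sigma_v, sigma_a):
    """Return (w_v, w_a, sigma_av) predicted by the MLE model.

    sigma_v, sigma_a: unisensory localization standard deviations for
    vision and audition, in the same units (e.g., degrees).
    """
    if sigma_v <= 0 or sigma_a <= 0:
        raise ValueError("standard deviations must be positive")
    # Reliability is defined as the inverse of the variance.
    r_v = 1.0 / sigma_v ** 2
    r_a = 1.0 / sigma_a ** 2
    # Each modality is weighted by its relative reliability.
    w_v = r_v / (r_v + r_a)
    w_a = r_a / (r_v + r_a)
    # The combined variance is the inverse of the summed reliabilities.
    sigma_av = (1.0 / (r_v + r_a)) ** 0.5
    return w_v, w_a, sigma_av


# Hypothetical values, for illustration only.
w_v, w_a, sigma_av = mle_prediction(sigma_v=2.0, sigma_a=4.0)
print(f"w_v={w_v:.2f}, w_a={w_a:.2f}, sigma_av={sigma_av:.2f}")
```

Under this model, the predicted bimodal precision is determined entirely by the unisensory precisions. A unisensory deficit therefore lowers bimodal precision even when the integrative process itself is optimal.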

The most surprising finding was the sound localization deficit (widened MAA) in amblyopia

described in Study I and Study II. Although standard hearing screening tests do not assess sound

localization ability, a widened MAA has a measurable impact on the level of experienced

hearing disability and handicap (Van Esch et al., 2015). The Gothenburg Profile is a validated

tool for clinical assessment of real-world hearing disability (Ringdahl, Eriksson-Mangold, &

Andersson, 1998). Responses to several items of the Gothenburg Profile, including “Are there

occasions when you cannot localize different sounds in traffic?” and “Are there occasions when

you turn your head in the wrong direction, when someone calls you?”, are significantly

correlated with poorer spatial hearing as measured by the MAA (Van Esch et al., 2015). The

ability to segregate sounds on the basis of spatial cues has also been shown to contribute

significantly to speech intelligibility in both young children and adults (Litovsky, 2005). By

extension, it is speculated that impaired cross-sensory calibration of sound localization in

amblyopia may have similar real-world consequences for situational awareness in traffic, social

interaction, and speech comprehension in noisy environments.

Findings described in Study III also support the hypothesis that amblyopia involves reduced

precision in audiovisual simultaneity perception. Although it is difficult to make a case for the

importance of more precise simultaneity perception per se, the width of the simultaneity window

may be causally related to performance on other indices of multisensory integration (Stevenson,

Zemtsov, et al., 2012). Improved multisensory integration, in turn, may confer perceptual

advantages as outlined in section 1.3.1. In this light, the importance of a widened simultaneity

window in amblyopia may lie in its demonstrated potential for clinical modification. Indeed,

various forms of perceptual learning, including short-term simultaneity training with feedback,

musical training, and video gaming experience, have been shown to narrow the audiovisual

simultaneity window (Donohue et al., 2010; Lee & Noppeney, 2011a; Powers et al., 2009;

Stevenson et al., 2013).

More broadly, the effects of amblyopia on audiovisual temporal perception and spatial hearing

presented herein lead one to ask several questions. First, do the current treatments for amblyopia

(e.g., occlusion or pharmacologic penalization of the better-seeing eye) cause or exacerbate these

impairments? It is conceivable that amblyopia therapy may deprive the developing brain of high-

fidelity spatiotemporal visual signals necessary for audiovisual temporal and spatial hearing

development. Part-time occlusion is likely insufficient to induce appreciable impairments, but the

effects of full-time occlusion or long-lasting pharmacologic penalization during a sensitive

period of multisensory development may be more significant. Second, can treatment standards

for amblyopia be improved to better address the impairments in audiovisual temporal perception

and spatial hearing? Evidence from a study of the McGurk effect in amblyopia suggests that

successful treatment before 5 years of age may prevent abnormalities in audiovisual speech

integration (Burgmeier et al., 2015). Considerable evidence from animal models and some data

from clinical populations also point to a sensitive period in early life during which spatial

hearing is vulnerable to damage from anomalous visual experience (reviewed in section 1.2.2.6).

Similar to the importance of early therapy for the visual aspects of amblyopic rehabilitation

(Campos, 1995; Flynn et al., 1998; Holmes et al., 2011; Lea et al., 1989; Scheiman et al., 2005),

outcomes for multisensory and spatial hearing abilities in amblyopia may also be improved by

early treatment. While more data are needed to support this hypothesis, if early treatment for

amblyopia improves speech integration and spatial hearing outcomes, the evidentiary weight in

favour of population-based childhood vision screening programs will undoubtedly be enhanced.

7.5 Conclusions

Below, the main conclusions of this thesis are summarized.

1) The capacity for spatial and temporal audiovisual integration in amblyopia is intact.

In the spatial domain, the manner of audiovisual integration in the spatial ventriloquism

effect was optimal according to the MLE model of multisensory combination (Study I).

The perceptual weight of each modality and differences in audiovisual localization

precision between the amblyopia and control groups were accounted for by differences in

perceptual performance at the unisensory level for vision, and surprisingly, audition.

In the temporal domain, audiovisual integration, as assessed by the temporal

ventriloquism effect, was intact (Study IV).

2) The temporal resolution of audiovisual simultaneity perception in amblyopia is diminished.

The temporal window of audiovisual simultaneity perception was widened in amblyopia,

and its width was not dependent on which eye was viewing (Study III). The results

suggest that the amblyopic impairment in audiovisual temporal perception is caused by a

central processing abnormality and is developmental in origin.

3) Sound localization in amblyopia is impaired.

Horizontal sound localization precision in the central region of space was impaired in

amblyopia (Study I and Study II). Horizontal sound localization error in the central 32° of

space was abnormally asymmetric in amblyopia, with greater error in the hemifield

ipsilateral to the amblyopic eye. The magnitude of sound localization error in the

amblyopic hemifield correlated significantly with amblyopic deficits in visual acuity and

stereo acuity. The results suggest that amblyopia disrupts spatial hearing during a

sensitive period of auditory development by a mechanism of cross-sensory calibration.

The spatial pattern of sound localization errors implicates the superior colliculus in

mediating the effect of amblyopic vision on spatial hearing.

Chapter 8 Future Directions

The experimental findings and conclusions reported in this thesis inspire further questions about

the development and mechanisms of multisensory processing and integration, the nature and

extent of perceptual impairments in amblyopia, and the adequacy and impact of current therapies

for amblyopia. A future research program encompassing several interrelated areas of study is

envisioned and outlined below.

8.1 Development and Mechanisms of Multisensory Processing and Integration

Similar to the way early visual deprivation has provided an invaluable experimental model for

the study of normal visual development (Lewis & Maurer, 2005), unilateral amblyopia provides

a unique opportunity to study the requirements for normal multisensory development.

Soto-Faraco and Alsius (2009) noted a controversy in the field of multisensory processing: are

different attributes of a multisensory object treated separately by the perceptual system and

bound by different mechanisms (i.e., multiple parallel processes), or are they treated in a unified

manner and processed by a common mechanism (i.e., a single sequential process)? In their study

of the McGurk effect, they reported that the temporal window for audiovisual speech integration

is wider than that for perceived audiovisual synchrony, supporting the hypothesis of multiple

parallel processes (Soto-Faraco & Alsius, 2009). Speech integration, however, is often

considered a special case of multisensory processing (Baart et al., 2015; Baart, Vroomen, et al.,

2014; Eskelund et al., 2011; Lalonde & Holt, 2016). Study of non-speech integrative phenomena,

such as the temporal ventriloquism effect and the sound-induced flash illusion, may therefore

provide more generalizable results. In section 7.3.1, it was hypothesized that if the temporal

window of audiovisual integration is proximally constrained by the window of audiovisual

simultaneity (Figure 8.1A), then it will not be modulated directly by the reliability of the visual

signal. Conversely, if the two processes (i.e., simultaneity perception and integration) occur by

parallel mechanisms (Figure 8.1B), then the temporal window of audiovisual integration will

respond to changes in the reliability of the visual signal. In amblyopia, it is already established

that the audiovisual simultaneity window is widened regardless of viewing condition, indicating

that it does not respond to changes in the reliability of the visual signal (Study III and Chen et al.

(2017)). However, the effect of monocular viewing condition on the temporal window of

integration for non-speech phenomena has not been fully investigated (Study IV and Narinesingh

et al. (2017)). Experimental decoupling of the response pattern for these two multisensory

processes by monocular viewing conditions (i.e., a stable simultaneity window, but variable

temporal window of integration) would provide evidence for parallel processing mechanisms.

Figure 8.1: Possible mechanisms that determine the temporal window of audiovisual

integration. (A) Sequential processing. The width of the simultaneity window is the proximal

constraint on the window of integration. Because the simultaneity window is invariant in fellow

eye and amblyopic eye viewing conditions, the integration window will be similarly invariant.

(B) Parallel processing. In this model, the simultaneity window is not a proximal constraint on

the window of integration. The window of integration will therefore vary according to whether

visual input is received from the temporally precise fellow eye, or from the temporally imprecise

amblyopic eye.

A future endeavour will also be to combine electroencephalography with behavioural studies in

amblyopia to elucidate the factors and mechanisms that determine the perception of multisensory

stimuli. Arnold et al. (2017) reported preliminary electroencephalographic data showing that

temporal variability (i.e., noise) in cortical evoked potentials predicted a visually normal

observer’s ability to judge the simultaneity of audiovisual signals. This was an important finding,

because common electroencephalographic techniques that employ time-domain averaging

discard information on trial-to-trial variability. Indeed, this technique, used to increase the

signal-to-noise ratio, has been implicated in the misinterpretation of VEP findings in amblyopia

(Banko, Kortvelyes, Nemeth, et al., 2013). The hypothesis that neural noise predicts the

precision of audiovisual simultaneity perception (Arnold et al., 2017) may be tested in a sample

of observers with amblyopia—a population established to have a widened temporal window of

audiovisual simultaneity perception. Furthermore, if reliable data on the age of onset and

treatment for amblyopic participants can be obtained, evidence for a sensitive period for the

calibration of audiovisual simultaneity perception may also be found.
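
The point about time-domain averaging can be illustrated with a small simulation. The sketch below is a toy model under stated assumptions: single-trial responses are Gaussian bumps whose peak latency varies from trial to trial, and all function names and parameter values are hypothetical rather than taken from Arnold et al. (2017) or Banko et al. (2013). It uses only the Python standard library.

```python
import math
import random
import statistics


def simulate_trials(n_trials, jitter_sd_ms, rng, peak_ms=100.0, width_ms=15.0):
    """Simulate single-trial responses as Gaussian bumps with jittered latency."""
    times = list(range(201))  # 0 to 200 ms in 1 ms steps
    trials = []
    for _ in range(n_trials):
        latency = rng.gauss(peak_ms, jitter_sd_ms)
        trials.append(
            [math.exp(-0.5 * ((t - latency) / width_ms) ** 2) for t in times]
        )
    return times, trials


def average_waveform(trials):
    """Time-domain average across trials: one value per time point."""
    return [sum(samples) / len(trials) for samples in zip(*trials)]


def peak_latencies(times, trials):
    """Latency of the maximum in each individual trial."""
    return [times[max(range(len(trial)), key=trial.__getitem__)] for trial in trials]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
times, low_jitter = simulate_trials(200, jitter_sd_ms=5.0, rng=rng)
_, high_jitter = simulate_trials(200, jitter_sd_ms=25.0, rng=rng)

avg_low = average_waveform(low_jitter)
avg_high = average_waveform(high_jitter)

# Trial-to-trial variability is only available from single-trial analysis.
sd_low = statistics.stdev(peak_latencies(times, low_jitter))
sd_high = statistics.stdev(peak_latencies(times, high_jitter))

print(f"averaged peak amplitude: low {max(avg_low):.2f}, high {max(avg_high):.2f}")
print(f"single-trial latency SD: low {sd_low:.1f} ms, high {sd_high:.1f} ms")
```

In this toy model every single trial has the same response amplitude, yet the averaged waveform is smaller and broader when latency jitter is high. The average alone cannot distinguish jitter from a weaker response, whereas the single-trial latency SD measures the variability directly.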

8.2 Nature and Extent of Perceptual Impairments in Amblyopia

Results presented in this thesis revealed a new class of perceptual impairment in amblyopia

affecting the auditory system. Because this finding is novel, future experiments need to more fully define

the nature and extent of the sound localization deficit. Study I and Study II described and

confirmed a deficit in horizontal sound localization precision (i.e., a wider MAA) for a central

auditory target. However, Study II also described a deficit in sound localization accuracy that

preferentially affected the spatial hemifield ipsilateral to the amblyopic eye. Further studies

should measure the MAA in amblyopia for auditory targets in the left and right hemifields to

determine if the spatial asymmetry identified in sound localization accuracy also applies to sound

localization precision. An asymmetric effect on MAA would further implicate the superior

colliculus as the neural site of cross-sensory calibration of spatial hearing in humans. Similar

sound localization experiments measuring horizontal sound localization precision and error may

also be conducted on a sample of non-amblyopic observers with early-onset strabismus. Results

from a strabismic population may be compared to those from an anisometropic amblyopic

population to determine the differential contributions of retinal defocus and binocular

decorrelation to the cross-sensory calibration of sound localization. As mentioned above, if

reliable data on the age of onset and treatment for the early-onset visual disturbances can be

obtained, evidence for a sensitive period for the visual influence on spatial hearing may be

found.

Earlier studies of the McGurk effect in amblyopia suggested that the visual disorder involves a

failure of multisensory integration. New data presented in this thesis and elsewhere (Narinesingh

et al., 2017), however, imply otherwise—specifically, that mechanisms for audiovisual spatial

and temporal integration remain intact in amblyopia. A difficulty in determining whether

previous studies of the McGurk effect in amblyopia demonstrate normal or deficient integration

is that their experimental designs did not incorporate measures of performance on the component

unisensory tasks (i.e., auditory speech recognition and lip-reading ability) (Burgmeier et al.,

2015; Narinesingh et al., 2015; Narinesingh et al., 2014). Whether the audiovisual speech

perception differences in amblyopia (described in the studies listed above) result from deficient

integration of the available unisensory information, or from a unisensory deficit propagated

through a normal integrative mechanism, remains an unresolved topic of speculation. An

immediate goal for future research is therefore to test adults with unilateral amblyopia on an

audiovisual speech integration task that involves unisensory and multisensory measures of

perceptual performance (as in Putzar, Hötting, et al. (2010), for example). Based on the findings

of intact audiovisual integration in this thesis (Study I and Study IV), it is hypothesized that

audiovisual speech integration in amblyopia is also intact, and that the deficits observed in earlier

studies result from reduced lip-reading ability in amblyopia. In the same way sound localization

deficits are associated with both bilateral (Gori et al., 2014; Lessard et al., 1998) and unilateral

(Study I and Study II) early-onset visual impairments, lip-reading impairments described in

bilateral early-onset visual deprivation (Putzar, Hötting, et al., 2010) may be found in unilateral

amblyopia as well.

With respect to amblyopia, much of the scientific and clinical focus has understandably been on

its prominent visual spatial deficits (McKee et al., 2003) and the pathophysiologic significance

of visual spatial noise (Levi & Klein, 2003; Levi et al., 2008; Levi et al., 1987; Levi et al., 1994;

Niechwiej-Szwedo, Kennedy, et al., 2012; Nordmann et al., 1992; Raashid et al., 2015).

However, reported visual temporal processing deficits in amblyopia (Huang et al., 2012; Spang

& Fahle, 2009; St John, 1998; Tredici & von Noorden, 1984), recent trial-by-trial analyses of

VEP data (Arnold et al., 2017; Banko, Kortvelyes, Nemeth, et al., 2013; Banko, Kortvelyes,

Weiss, et al., 2013), and the widened temporal windows of perceptual binding for a number of

audiovisual tasks (Study III, Study IV, Chen et al. (2017), and Narinesingh et al. (2017)), suggest

that visual temporal noise may be an underappreciated pathophysiologic factor in amblyopia. In

addition to the experiment modeled after Arnold et al. (2017) outlined in section 8.1, an

important future direction for amblyopia research will be psychophysical experiments to more

comprehensively assess amblyopic visual temporal perception. For example, interocular

differences in visual temporal precision and perceptual latency may be measured using a set of

2AFC visual TOJ tasks. To rule out the possibility of a global temporal processing deficit, the

integrity of auditory temporal processing in amblyopia may be confirmed by a temporal order

discrimination task for two tones of different pitch (Tallal, 1978), or by an auditory gap detection

task (Irwin et al., 1985). As noted in section 3.5, temporal factors—specifically, temporal decay

in the amblyopic visual spatial signal—may also explain the deficit in visual spatial precision

observed in Study I. This hypothesis may be tested by comparing the effect of varying the

temporal interval between sequential visual stimuli on localization performance in amblyopia

and control groups. A significant interaction between temporal interval and group would signify

differential temporal decay in the visual spatial signal.

Several studies in this thesis reported relations between clinical features of amblyopia and

performance on multisensory tasks. Study II found that the magnitude of sound localization error

in the auditory hemifield ipsilateral to the amblyopic eye was correlated with deficits in stereo

acuity and monocular visual acuity. Subgroup analysis in Study III found that the width of the

audiovisual simultaneity window related to the severity of the monocular acuity deficit, while the

point of subjective simultaneity related to the binocularity deficit. Study IV found that the

temporal window for the temporal ventriloquism effect was wider in individuals with poor stereo

acuity. Practical limitations of sample size, however, prevented a systematic examination of

potential differences between etiological subtypes of amblyopia. Future studies may delve into

this area of investigation by selecting fewer paradigms and focusing on recruiting larger numbers

of participants with anisometropic, strabismic, and mixed mechanism amblyopia.

8.3 Looking to the Future of Amblyopia Therapy

A key step in translating novel research findings into meaningful healthcare innovation is

establishing a link between laboratory results and patient function in real-world situations.

The real-world impact of the amblyopic deficit in sound localization is unknown, but may be

assessed in several ways. In hearing impaired populations, a relation between the MAA and a

person’s ability to localize voices and sounds in traffic has been established using the

Gothenburg Profile, a validated tool for clinical assessment of experienced hearing disability and

handicap (Ringdahl et al., 1998; Van Esch et al., 2015). These critical abilities may also be

assessed in people with amblyopia using the Gothenburg Profile, and their disability scores may

be correlated with experimentally-determined measures of sound localization precision and

accuracy. The clinical relevance of a widened MAA in amblyopia may also be inferred from

further experimental study. For example, poorer ability to use spatial cues to segregate speech

streams (i.e., higher thresholds for spatial release from masking) may indicate increased

difficulty with speech comprehension in noisy environments (Pillsbury et al., 1991). Difficulty in

this regard may have implications for attention and comprehension in the classroom setting. In

section 7.4, it was also speculated that some therapies for amblyopia—specifically, full-time

occlusion and long-lasting pharmacologic penalization—may exacerbate the amblyopic

disturbance in sound localization by consistently depriving the developing brain of a high-

fidelity spatial signal from the fellow eye. This hypothesis could be tested in a prospective

manner by randomizing previously untreated children with severe amblyopia to either part-time

occlusion or atropine penalization, then measuring the MAA during and at the conclusion of

therapy. The outcome of such a study may provide an evidence-based rationale for choosing

part-time occlusion over other treatment options that are currently considered equivalent

(American Academy of Ophthalmology Pediatric Ophthalmology/Strabismus Panel, 2012).

Beyond informing the evidence-based application of currently-available therapies, a future

research program is envisioned to seek novel methods to improve multisensory outcomes in

amblyopia. For example, can musical training (Lee & Noppeney, 2011a), or training on visual

and audiovisual temporal perception tasks (Donohue et al., 2010; Powers et al., 2009; Stevenson

et al., 2013) narrow the audiovisual simultaneity window in amblyopia as they do in visually

normal adults?

A fuller understanding of the complex interplay between visual and auditory perception, and an

appreciation of the far-reaching developmental consequences of anomalous sensory experience,

will undoubtedly enhance the clinician’s ability to minimize disability and maximize health in

generations to come.

References

Aaen-Stockdale, C., & Hess, R. F. (2008). The amblyopic deficit for global motion is spatial scale invariant. Vision Res, 48(19), 1965–1971. doi:10.1016/j.visres.2008.06.012

Abel, S. M., Figueiredo, J. C., Consoli, A., Birt, C. M., & Papsin, B. C. (2009). The effect of blindness on horizontal plane sound source identification: El efecto de la ceguera en la identificatión de la fuente sonora en el piano horizontal. International Journal of Audiology, 41(5), 285–292. doi:10.3109/14992020209077188

Abrahamsson, M., Fabian, G., & Sjostrand, J. (1990). A longitudinal study of a population based sample of astigmatic children. II. The changeability of anisometropia. Acta Ophthalmol (Copenh), 68(4), 435–440.

Abrahamsson, M., & Sjostrand, J. (1988). Contrast sensitivity and acuity relationship in strabismic and anisometropic amblyopia. Br J Ophthalmol, 72(1), 44–49.

Adams, G. G., & Karas, M. P. (1999). Effect of amblyopia on employment prospects. Br J Ophthalmol, 83(3), 380.

Adams, R. J., & Courage, M. L. (2002). Using a single test to measure human contrast sensitivity from early childhood to maturity. Vision Res, 42(9), 1205–1210.

Alais, D., & Burr, D. (2004). The ventriloquist effect results from near-optimal bimodal integration. Curr Biol, 14(3), 257–262. doi:10.1016/j.cub.2004.01.029

Alais, D., Newell, F. N., & Mamassian, P. (2010). Multisensory processing in review: from physiology to behaviour. Seeing Perceiving, 23(1), 3–38. doi:10.1163/187847510X488603

Alsius, A., Navarra, J., Campbell, R., & Soto-Faraco, S. (2005). Audiovisual integration of speech falters under high attention demands. Curr Biol, 15(9), 839–843. doi:10.1016/j.cub.2005.03.046

Alsius, A., Navarra, J., & Soto-Faraco, S. (2007). Attention to touch weakens audiovisual speech integration. Exp Brain Res, 183(3), 399–404.

Altmann, L., & Singer, W. (1986). Temporal integration in amblyopic vision. Vision Res, 26(12), 1959–1968.

American Academy of Ophthalmology Pediatric Ophthalmology/Strabismus Panel. (2012). Preferred practice pattern guidelines. Amblyopia. San Francisco, CA: American Academy of Ophthalmology Retrieved from www.aao.org/ppp.

American Academy of Pediatrics Section on Ophthalmology Council on Children with Disabilities, American Academy of Ophthalmology, American Association for Pediatric Ophthalmology and Strabismus, & American Association of Certified Orthoptists. (2009). Joint statement–Learning disabilities, dyslexia, and vision. Pediatrics, 124(2), 837–844. doi:10.1542/peds.2009-1445

Andersen, T. S., Tiippana, K., & Sams, M. (2005). Maximum Likelihood Integration of rapid flashes and beeps. Neurosci Lett, 380(1-2), 155–160. doi:10.1016/j.neulet.2005.01.030

Andreassi, J., & Greco, J. (1975). Effects of bisensory stimulation on reaction time and the evoked cortical potential. Physiological Psychology, 3(2), 189–194.

Arden, G. B., & Barnard, W. M. (1979). Effect of occlusion on the visual evoked response in amblyopia. Trans Ophthalmol Soc U K, 99(3), 419–426.

Arnold, D., Mathews, N., Keane, B., & Yarrow, K. (2017). Evoked neural response variability predicts poor timing precision. J Vis, 17(10), 733–733. doi:10.1167/17.10.733

Aschersleben, G., & Bertelson, P. (2003). Temporal ventriloquism: crossmodal interaction on the time dimension. 2. Evidence from sensorimotor synchronization. Int J Psychophysiol, 50(1-2), 157–163.

Ashmead, D. H., Clifton, R. K., & Perris, E. E. (1987). Precision of auditory localization in human infants. Developmental Psychology, 23(5), 641.

Ashmead, D. H., Davis, D. L., Whalen, T., & Odom, R. D. (1991). Sound localization and sensitivity to interaural time differences in human infants. Child Dev, 62(6), 1211–1226.

Ashmead, D. H., Grantham, D. W., Murphy, W., & Tharpe, A. M. (1993). Human infants’ sensitivity to interaural level differences. J Acoust Soc Am, 93(4), 2360–2360.

Ashmead, D. H., Wall, R. S., Ebinger, K. A., Eaton, S. B., Snook-Hill, M.-M., & Yang, X. (1998). Spatial hearing in children with visual disabilities. Perception, 27(1), 105–122.

Attebo, K., Mitchell, P., Cumming, R., Smith, W., Jolly, N., & Sparkes, R. (1998). Prevalence and causes of amblyopia in an adult population. Ophthalmology, 105(1), 154–159.

Baart, M., Bortfeld, H., & Vroomen, J. (2015). Phonetic matching of auditory and visual speech develops during childhood: evidence from sine-wave speech. J Exp Child Psychol, 129, 157–164. doi:10.1016/j.jecp.2014.08.002

Baart, M., Stekelenburg, J. J., & Vroomen, J. (2014). Electrophysiological evidence for speech-specific audiovisual integration. Neuropsychologia, 53, 115–121. doi:10.1016/j.neuropsychologia.2013.11.011

Baart, M., Vroomen, J., Shaw, K., & Bortfeld, H. (2014). Degrading phonetic information affects matching of audiovisual speech in adults, but not in infants. Cognition, 130(1), 31–43. doi:10.1016/j.cognition.2013.09.006

Babu, R. J., Clavagnier, S. R., Bobier, W., Thompson, B., & Hess, R. F. (2013). The regional extent of suppression: strabismics versus nonstrabismics. Invest Ophthalmol Vis Sci, 54(10), 6585–6593.

Baker, F. H., Grigg, P., & von Noorden, G. K. (1974). Effects of visual deprivation and strabismus on the response of neurons in the visual cortex of the monkey, including studies on the striate and prestriate cortex in the normal animal. Brain Res, 66(2), 185–208.

Banati, R. B., Goerres, G., Tjoa, C., Aggleton, J. P., & Grasby, P. (2000). The functional anatomy of visual-tactile integration in man: a study using positron emission tomography. Neuropsychologia, 38(2), 115–124.

Banko, E. M., Kortvelyes, J., Nemeth, J., Weiss, B., & Vidnyanszky, Z. (2013). Amblyopic deficits in the timing and strength of visual cortical responses to faces. Cortex, 49(4), 1013–1024. doi:10.1016/j.cortex.2012.03.021

Banko, E. M., Kortvelyes, J., Weiss, B., & Vidnyanszky, Z. (2013). How the visual cortex handles stimulus noise: insights from amblyopia. PloS one, 8(6), e66583. doi:10.1371/journal.pone.0066583

Banks, M. S., & Bennett, P. J. (1988). Optical and photoreceptor immaturities limit the spatial and chromatic vision of human neonates. J Opt Soc Am A, 5(12), 2059–2079.

Barnard, W. M., & Arden, G. B. (1979). Changes in the visual evoked response during and after occlusion therapy for amblyopia. Child Care Health Dev, 5(6), 421–430.

Barnes, G. R., Hess, R. F., Dumoulin, S. O., Achtman, R. L., & Pike, G. B. (2001). The cortical deficit in humans with strabismic amblyopia. J Physiol, 533(Pt 1), 281–297.

Barrett, B. T., Bradley, A., & McGraw, P. V. (2004). Understanding the neural basis of amblyopia. Neuroscientist, 10(2), 106–117. doi:10.1177/1073858403262153

Barrett, B. T., Pacey, I. E., Bradley, A., Thibos, L. N., & Morrill, P. (2003). Nonveridical visual perception in human amblyopia. Invest Ophthalmol Vis Sci, 44(4), 1555–1567.

Barutchu, A., Crewther, D. P., & Crewther, S. G. (2009). The race that precedes coactivation: development of multisensory facilitation in children. Developmental Science, 12(3), 464–473.

Barutchu, A., Danaher, J., Crewther, S. G., Innes-Brown, H., Shivdasani, M. N., & Paolini, A. G. (2010). Audiovisual integration in noise by children and adults. J Exp Child Psychol, 105(1-2), 38–50. doi:10.1016/j.jecp.2009.08.005

Batra, R., Kuwada, S., & Fitzpatrick, D. C. (1997). Sensitivity to interaural temporal disparities of low-and high-frequency neurons in the superior olivary complex. I. Heterogeneity of responses. J Neurophysiol, 78(3), 1222–1236.

Battaglia, P. W., Jacobs, R. A., & Aslin, R. N. (2003). Bayesian integration of visual and auditory signals for spatial localization. J Opt Soc Am A Opt Image Sci Vis, 20(7), 1391–1397.

Bedell, H. E., Flom, M. C., & Barbeito, R. (1985). Spatial aberrations and acuity in strabismus and amblyopia. Invest Ophthalmol Vis Sci, 26(7), 909–916.

Bertelson, P., & Radeau, M. (1981). Cross-modal bias and perceptual fusion with auditory-visual spatial discordance. Percept Psychophys, 29(6), 578–584.

Bertelson, P., Vroomen, J., De Gelder, B., & Driver, J. (2000). The ventriloquist effect does not depend on the direction of deliberate visual attention. Percept Psychophys, 62(2), 321–332. doi:10.3758/bf03205552

Berwanger, D., Wittmann, M., von Steinbuchel, N., & von Suchodoletz, W. (2004). Measurement of temporal-order judgment in children. Acta Neurobiol Exp (Wars), 64(3), 387–394.

Binns, K. E., Grant, S., Withington, D. J., & Keating, M. J. (1992). A topographic representation of auditory space in the external nucleus of the inferior colliculus of the guinea-pig. Brain Res, 589(2), 231–242. doi:10.1016/0006-8993(92)91282-J

Birch, E. E. (2013). Amblyopia and binocular vision. Prog Retin Eye Res, 33, 67–84. doi:10.1016/j.preteyeres.2012.11.001

Birch, E. E., & Holmes, J. M. (2010). The clinical profile of amblyopia in children younger than 3 years of age. J AAPOS, 14(6), 494–497.

Birch, E. E., Morale, S. E., Jost, R. M., De La Cruz, A., Kelly, K. R., Wang, Y. Z., & Bex, P. J. (2016). Assessing suppression in amblyopic children with a dichoptic eye chart. Invest Ophthalmol Vis Sci, 57(13), 5649–5654. doi:10.1167/iovs.16-19986

Birch, E. E., Stager, D., Leffler, J., & Weakley, D. (1998). Early treatment of congenital unilateral cataract minimizes unequal competition. Invest Ophthalmol Vis Sci, 39(9), 1560–1566.

Birch, E. E., & Stager, D. R. (1988). Prevalence of good visual acuity following surgery for congenital unilateral cataract. Arch Ophthalmol, 106(1), 40–43.

Birch, E. E., Stager, D. R., & Wright, W. W. (1986). Grating acuity development after early surgery for congenital unilateral cataract. Arch Ophthalmol, 104(12), 1783–1787.

Birch, E. E., & Swanson, W. H. (2000). Hyperacuity deficits in anisometropic and strabismic amblyopes with known ages of onset. Vision Res, 40(9), 1035–1040.

Birch, E. E., Swanson, W. H., Stager, D. R., Woody, M., & Everett, M. (1993). Outcome after very early treatment of dense congenital unilateral cataract. Invest Ophthalmol Vis Sci, 34(13), 3687–3699.

Blakemore, C. (1988). The sensitive periods of the monkey’s visual cortex. In Strabismus and Amblyopia (pp. 219–234): Springer.

Blakemore, C., & Vital-Durand, F. (1986). Effects of visual deprivation on the development of the monkey's lateral geniculate nucleus. J Physiol, 380, 493–511.

Blankenship, C., Zhang, F., & Keith, R. (2016). Behavioral measures of temporal processing and speech perception in cochlear implant users. J Am Acad Audiol, 27(9), 701–713. doi:10.3766/jaaa.15026

Blauert, J. (1970). Ein Versuch zum Richtungshören bei gleichzeitiger optischer Stimulation. Acustica, 23, 118–119.

Bonneh, Y. S., Sagi, D., & Polat, U. (2004). Local and non-local deficits in amblyopia: acuity and spatial interactions. Vision Res, 44(27), 3099–3110. doi:10.1016/j.visres.2004.07.031

Bonneh, Y. S., Sagi, D., & Polat, U. (2007). Spatial and temporal crowding in amblyopia. Vision Res, 47(14), 1950–1962. doi:10.1016/j.visres.2007.02.015

Boothe, R. G., Dobson, V., & Teller, D. Y. (1985). Postnatal development of vision in human and nonhuman primates. Annu Rev Neurosci, 8(1), 495–545. doi:10.1146/annurev.ne.08.030185.002431

Boudreau, J. C., & Tsuchitani, C. (1968). Binaural interaction in the cat superior olive S segment. J Neurophysiol, 31(3), 442–454.

Bradley, A., & Freeman, R. D. (1981). Contrast sensitivity in anisometropic amblyopia. Invest Ophthalmol Vis Sci, 21(3), 467–476.

Bristow, D., Dehaene-Lambertz, G., Mattout, J., Soares, C., Gliga, T., Baillet, S., & Mangin, J.-F. (2009). Hearing faces: how the infant brain matches the face it sees with the speech it hears. J Cogn Neurosci, 21(5), 905–921.

Brown, K. W., & Gottfried, A. W. (1986). Cross-modal transfer of shape in early infancy: Is there reliable evidence. Advances in infancy research, 4, 163–170.

Brown, L. E., Halpert, B. A., & Goodale, M. A. (2005). Peripheral vision for perception and action. Exp Brain Res, 165(1), 97–106. doi:10.1007/s00221-005-2285-y

Brown, S. A., Weih, L. M., Fu, C. L., Dimitrov, P., Taylor, H. R., & McCarty, C. A. (2000). Prevalence of amblyopia and associated refractive errors in an adult population in Victoria, Australia. Ophthalmic Epidemiol, 7(4), 249–258.

Buch, H., Vinding, T., La Cour, M., & Nielsen, N. V. (2001). The prevalence and causes of bilateral and unilateral blindness in an elderly urban Danish population. The Copenhagen City Eye Study. Acta Ophthalmol Scand, 79(5), 441–449. doi:10.1034/j.1600-0420.2001.790503.x

Burgmeier, R., Desai, R. U., Farner, K. C., Tiano, B., Lacey, R., Volpe, N. J., & Mets, M. B. (2015). The effect of amblyopia on visual-auditory speech perception: why mothers may say "Look at me when I'm talking to you". JAMA Ophthalmol, 133(1), 11–16. doi:10.1001/jamaophthalmol.2014.3307

Burnham, D., & Dodd, B. (1996). Auditory-visual speech perception as a direct process: The McGurk effect in infants and across languages. In Speechreading by Humans and Machines (pp. 103–114): Springer.

Burnham, D., & Dodd, B. (2004). Auditory-visual speech integration by prelinguistic infants: perception of an emergent consonant in the McGurk effect. Dev Psychobiol, 45(4), 204–220. doi:10.1002/dev.20032

Burr, D., Banks, M. S., & Morrone, M. C. (2009). Auditory dominance over vision in the perception of interval duration. Exp Brain Res, 198(1), 49–57. doi:10.1007/s00221-009-1933-z

Burr, D., & Gori, M. (2012). Multisensory integration develops late in humans. In M. M. Murray & M. T. Wallace (Eds.), The Neural Bases of Multisensory Processes. Boca Raton (FL): CRC Press/Taylor & Francis LLC.

Bushara, K. O., Grafman, J., & Hallett, M. (2001). Neural correlates of auditory-visual stimulus onset asynchrony detection. J Neurosci, 21(1), 300–304.

Bushara, K. O., Hanakawa, T., Immisch, I., Toma, K., Kansaku, K., & Hallett, M. (2003). Neural correlates of cross-modal binding. Nat Neurosci, 6(2), 190–195. doi:10.1038/nn993

Bushara, K. O., Weeks, R. A., Ishii, K., Catalan, M. J., Tian, B., Rauschecker, J. P., & Hallett, M. (1999). Modality-specific frontal and parietal areas for auditory and visual spatial localization in humans. Nat Neurosci, 2(8), 759–766. doi:10.1038/11239

Caird, D., & Klinke, R. (1983). Processing of binaural stimuli by cat superior olivary complex neurons. Experimental Brain Research, 52(3), 385–399.

Callan, D. E., Callan, A. M., Kroos, C., & Vatikiotis-Bateson, E. (2001). Multimodal contribution to speech perception revealed by independent component analysis: a single-sweep EEG case study. Brain Res Cogn Brain Res, 10(3), 349–353.

Calvert, G. A. (2001). Crossmodal processing in the human brain: insights from functional neuroimaging studies. Cereb Cortex, 11(12), 1110–1123. doi:10.1093/cercor/11.12.1110

Calvert, G. A., Brammer, M. J., Bullmore, E. T., Campbell, R., Iversen, S. D., & David, A. S. (1999). Response amplification in sensory-specific cortices during crossmodal binding. Neuroreport, 10(12), 2619–2623.

Calvert, G. A., Campbell, R., & Brammer, M. J. (2000). Evidence from functional magnetic resonance imaging of crossmodal binding in the human heteromodal cortex. Curr Biol, 10(11), 649–657.

Calvert, G. A., Hansen, P. C., Iversen, S. D., & Brammer, M. J. (2001). Detection of audio-visual integration sites in humans by application of electrophysiological criteria to the BOLD effect. Neuroimage, 14(2), 427–438. doi:10.1006/nimg.2001.0812

Calvert, G. A., Spence, C., & Stein, B. E. (2004). The handbook of multisensory processes: MIT press.

Campbell, R. A., Doubell, T. P., Nodal, F. R., Schnupp, J. W., & King, A. J. (2006). Interaural timing cues do not contribute to the map of space in the ferret superior colliculus: a virtual acoustic space study. J Neurophysiol, 95(1), 242–254.

Campos, E. (1995). Amblyopia. Surv Ophthalmol, 40(1), 23–39.

Canon, L. K. (1970). Intermodality inconsistency of input and directed attention as determinants of the nature of adaptation. J Exp Psychol, 84(1), 141.

Canon, L. K. (1971). Directed attention and maladaptive "adaptation" to displacement of the visual field. J Exp Psychol, 88(3), 403.

Carlton, J., & Kaltenthaler, E. (2011). Amblyopia and quality of life: a systematic review. Eye (Lond), 25(4), 403–413. doi:10.1038/eye.2011.4

Chen, Y. C., Lewis, T. L., Shore, D. I., & Maurer, D. (2017). Early binocular input is critical for development of audiovisual but not visuotactile simultaneity perception. Curr Biol, 27(4), 583–589. doi:10.1016/j.cub.2017.01.009

Chen, Y. C., Shore, D. I., Lewis, T. L., & Maurer, D. (2015). The role of early visual experience in the development of the later perception of audiovisual simultaneity: Evidence from cataract-reversal patients. Paper presented at the Jean Piaget Society, Toronto, ON, Canada.

Chen, Y. C., Shore, D. I., Lewis, T. L., & Maurer, D. (2016). The development of the perception of audiovisual simultaneity. J Exp Child Psychol, 146, 17–33. doi:10.1016/j.jecp.2016.01.010

Chen, Y. C., & Spence, C. (2017). Assessing the role of the 'unity assumption' on multisensory integration: a review. Front Psychol.

Choe, C. S., Welch, R. B., Gilford, R. M., & Juola, J. F. (1975). The “ventriloquist effect”: Visual dominance or response bias? Atten Percept Psychophys, 18(1), 55–60.

Chua, B., & Mitchell, P. (2004). Consequences of amblyopia on education, occupation, and long term vision loss. Br J Ophthalmol, 88(9), 1119–1121. doi:10.1136/bjo.2004.041863

Ciuffreda, K. J., Kenyon, R. V., & Stark, L. (1978). Increased saccadic latencies in amblyopic eyes. Invest Ophthalmol Vis Sci, 17(7), 697–702.

Clifton, R. K., Gwiazda, J., Bauer, J. A., Clarkson, M. G., & Held, R. M. (1988). Growth in head size during infancy: Implications for sound localization. Developmental Psychology, 24(4), 477.

Clifton, R. K., Morrongiello, B. A., Kulig, J. W., & Dowd, J. M. (1981). Newborns' orientation toward sound: Possible implications for cortical development. Child Dev, 833–838.

Colby, C. L., Gattass, R., Olson, C. R., & Gross, C. G. (1988). Topographical organization of cortical afferents to extrastriate visual area PO in the macaque: a dual tracer study. J Comp Neurol, 269(3), 392–413. doi:10.1002/cne.902690307

Collignon, O., Dormal, G., de Heering, A., Lepore, F., Lewis, T. L., & Maurer, D. (2015). Long-lasting crossmodal cortical reorganization triggered by brief postnatal visual deprivation. Curr Biol, 25(18), 2379–2383. doi:10.1016/j.cub.2015.07.036

Colonius, H., & Diederich, A. (2004). Multisensory interaction in saccadic reaction time: a time-window-of-integration model. J Cogn Neurosci, 16(6), 1000–1009. doi:10.1162/0898929041502733

Conner, I. P., Odom, J. V., Schwartz, T. L., & Mendola, J. D. (2007a). Monocular activation of V1 and V2 in amblyopic adults measured with functional magnetic resonance imaging. J AAPOS, 11(4), 341–350. doi:10.1016/j.jaapos.2007.01.119

Conner, I. P., Odom, J. V., Schwartz, T. L., & Mendola, J. D. (2007b). Retinotopic maps and foveal suppression in the visual cortex of amblyopic adults. J Physiol, 583(Pt 1), 159–173. doi:10.1113/jphysiol.2007.136242

Constantinescu, T., Schmidt, L., Watson, R., & Hess, R. F. (2005). A residual deficit for global motion processing after acuity recovery in deprivation amblyopia. Invest Ophthalmol Vis Sci, 46(8), 3008–3012. doi:10.1167/iovs.05-0242

Corneil, B. D., Van Wanrooij, M., Munoz, D. P., & Van Opstal, A. J. (2002). Auditory-visual interactions subserving goal-directed saccades in a complex scene. J Neurophysiol, 88(1), 438–454.

Cuppini, C., Magosso, E., Rowland, B., Stein, B., & Ursino, M. (2012). Hebbian mechanisms help explain development of multisensory integration in the superior colliculus: a neural network model. Biol Cybern, 106(11-12), 691–713. doi:10.1007/s00422-012-0511-9

Cynader, M., & Berman, N. (1972). Receptive-field organization of monkey superior colliculus. J Neurophysiol, 35(2), 187–201.

Davis, S. M., & McCroskey, R. L. (1980). Auditory fusion in children. Child Dev, 51(1), 75–80. doi:10.2307/1129592

Daw, N. W. (1998). Critical periods and amblyopia. Arch Ophthalmol, 116(4), 502–505.

Daw, N. W. (2006). Visual Development: Springer.

de Heering, A., Dormal, G., Pelland, M., Lewis, T., Maurer, D., & Collignon, O. (2016). A brief period of postnatal visual deprivation alters the balance between auditory and visual attention. Curr Biol, 26(22), 3101–3105. doi:10.1016/j.cub.2016.10.014

DeFilippo, C. L., & Snell, K. B. (1986). Detection of a temporal gap in low-frequency narrow-band signals by normal-hearing and hearing-impaired listeners. J Acoust Soc Am, 80(5), 1354–1358.

Demer, J. L., von Noorden, G. K., Volkow, N. D., & Gould, K. L. (1988). Imaging of cerebral blood flow and metabolism in amblyopia by positron emission tomography. Am J Ophthalmol, 105(4), 337–347.

Deneve, S., & Pouget, A. (2004). Bayesian multisensory integration and cross-modal spatial links. J Physiol Paris, 98(1-3), 249–258. doi:10.1016/j.jphysparis.2004.03.011

Desjardins, R. N., & Werker, J. F. (2004). Is the integration of heard and seen speech mandatory for infants? Dev Psychobiol, 45(4), 187–203.

Dixon, N. F., & Spitz, L. (1980). The detection of auditory visual desynchrony. Perception, 9(6), 719–721.

Donnelly, U. M., Stewart, N. M., & Hollinger, M. (2005). Prevalence and outcomes of childhood visual disorders. Ophthalmic Epidemiol, 12(4), 243–250. doi:10.1080/09286580590967772

Donohue, S. E., Woldorff, M. G., & Mitroff, S. R. (2010). Video game players show more precise multisensory temporal processing abilities. Atten Percept Psychophys, 72(4), 1120–1129.

Driver, J. (1996). Enhancement of selective listening by illusory mislocation of speech sounds due to lip-reading. Nature, 381, 66–68.

DuBois, R. M., & Cohen, M. S. (2000). Spatiotopic organization in human superior colliculus observed with fMRI. Neuroimage, 12(1), 63–70. doi:10.1006/nimg.2000.0590

Duffy, K. R., & Mitchell, D. E. (2013). Darkness alters maturation of visual cortex and promotes fast recovery from monocular deprivation. Curr Biol, 23(5), 382–386. doi:10.1016/j.cub.2013.01.017

El-Shamayleh, Y., Kiorpes, L., Kohn, A., & Movshon, J. A. (2010). Visual motion processing by neurons in area MT of macaque monkeys with experimental amblyopia. J Neurosci, 30(36), 12198–12209. doi:10.1523/JNEUROSCI.3055-10.2010

Ellemberg, D., Lewis, T. L., Liu, C. H., & Maurer, D. (1999). Development of spatial and temporal vision during childhood. Vision Res, 39(14), 2325–2333. doi:10.1016/S0042-6989(98)00280-6

Ellemberg, D., Lewis, T. L., Maurer, D., Brar, S., & Brent, H. P. (2002). Better perception of global motion after monocular than after binocular deprivation. Vision Res, 42(2), 169–179.

Engel, G. R., & Dougherty, W. G. (1971). Visual-auditory distance constancy. Nature, 234(5327), 308.

Ernst, M. O. (2008). Multisensory integration: a late bloomer. Curr Biol, 18(12), R519–521. doi:10.1016/j.cub.2008.05.002

Ernst, M. O., & Banks, M. S. (2002). Humans integrate visual and haptic information in a statistically optimal fashion. Nature, 415(6870), 429–433. doi:10.1038/415429a

Ernst, M. O., & Bulthoff, H. H. (2004). Merging the senses into a robust percept. Trends Cogn Sci, 8(4), 162–169. doi:10.1016/j.tics.2004.02.002

Eskelund, K., Tuomainen, J., & Andersen, T. S. (2011). Multistage audiovisual integration of speech: dissociating identification and detection. Exp Brain Res, 208(3), 447–457. doi:10.1007/s00221-010-2495-9

Fechner, G. T. (1889). Elemente der Psychophysik. 2 Bd. Leipzig: Breitkopf & Härtel [2. und 3. unveränderte Auflage, hrsg. von W. Wundt, 1889 und 1907].

Feldman, D. E. (2012). The spike-timing dependence of plasticity. Neuron, 75(4), 556–571. doi:10.1016/j.neuron.2012.08.001

Fendrich, R., & Corballis, P. M. (2001). The temporal cross-capture of audition and vision. Percept Psychophys, 63(4), 719–725.

Fetsch, C. R., Turner, A. H., DeAngelis, G. C., & Angelaki, D. E. (2009). Dynamic reweighting of visual and vestibular cues during self-motion perception. J Neurosci, 29(49), 15601–15612. doi:10.1523/JNEUROSCI.2574-09.2009

Fieger, A., Röder, B., Teder-Salejarvi, W., Hillyard, S. A., & Neville, H. J. (2006). Auditory spatial tuning in late-onset blindness in humans. J Cogn Neurosci, 18(2), 149–157. doi:10.1162/089892906775783697

Fine, I., Wade, A. R., Brewer, A. A., May, M. G., Goodman, D. F., Boynton, G. M., . . . MacLeod, D. I. (2003). Long-term deprivation affects visual perception and cortex. Nature neuroscience, 6(9), 915–916.

Flynn, J. T., Schiffman, J., Feuer, W., & Corona, A. (1998). The therapy of amblyopia: an analysis of the results of amblyopia therapy utilizing the pooled data of published studies. Trans Am Ophthalmol Soc, 96, 431–450; discussion 450–433.

Fong, M. F., Mitchell, D. E., Duffy, K. R., & Bear, M. F. (2016). Rapid recovery from the effects of early monocular deprivation is enabled by temporary inactivation of the retinas. Proc Natl Acad Sci U S A, 113(49), 14139–14144. doi:10.1073/pnas.1613279113

Forster, B., Cavina-Pratesi, C., Aglioti, S. M., & Berlucchi, G. (2002). Redundant target effect and intersensory facilitation from visual-tactile interactions in simple reaction time. Experimental Brain Research, 143(4), 480–487.

Foucher, J. R., Lacambre, M., Pham, B. T., Giersch, A., & Elliott, M. A. (2007). Low time resolution in schizophrenia: Lengthened windows of simultaneity for visual, auditory and bimodal stimuli. Schizophr Res, 97(1-3), 118–127. doi:10.1016/j.schres.2007.08.013

Frassinetti, F., Bolognini, N., & Ladavas, E. (2002). Enhancement of visual perception by crossmodal visuo-auditory interaction. Exp Brain Res, 147(3), 332–343. doi:10.1007/s00221-002-1262-y

Freeman, R., & Bradley, A. (1980). Monocularly deprived humans: nondeprived eye has supernormal vernier acuity. J Neurophysiol, 43(6), 1645–1653.

Freides, D. (1974). Human information processing and sensory modality: cross-modal functions, information complexity, memory, and deficit. Psychol Bull, 81(5), 284–310.

Frens, M. A., Van Opstal, A. J., & Van Der Willigen, R. F. (1995). Spatial and temporal factors determine auditory-visual interactions in human saccadic eye movements. Percept Psychophys, 57(6), 802–816. doi:10.3758/BF03206796

Frey, R. D. (1990). Selective attention, event perception and the criterion of acceptability principle: Evidence supporting and rejecting the doctrine of prior entry. Human Movement Science, 9(3), 481–530.

Friedman, D. S., Repka, M. X., Katz, J., Giordano, L., Ibironke, J., Hawse, P., & Tielsch, J. M. (2009). Prevalence of amblyopia and strabismus in white and African American children aged 6 through 71 months the Baltimore Pediatric Eye Disease Study. Ophthalmology, 116(11), 2128–2134 e2121–2122. doi:10.1016/j.ophtha.2009.04.034

Fronius, M., Sireteanu, R., & Zubcov, A. (2004). Deficits of spatial localization in children with strabismic amblyopia. Graefes Arch Clin Exp Ophthalmol, 242(10), 827–839. doi:10.1007/s00417-004-0936-5

Fujisaki, W., & Nishida, S. (2005). Temporal frequency characteristics of synchrony-asynchrony discrimination of audio-visual signals. Exp Brain Res, 166(3-4), 455–464. doi:10.1007/s00221-005-2385-8

Fujisaki, W., Shimojo, S., Kashino, M., & Nishida, S. (2004). Recalibration of audiovisual simultaneity. Nat Neurosci, 7(7), 773–778. doi:10.1038/nn1268

Gardner, M. B., & Gardner, R. S. (1973). Problem of localization in the median plane: effect of pinnae cavity occlusion. J Acoust Soc Am, 53(2), 400–408.

Gebhard, J. W., & Mowbray, G. H. (1959). On discriminating the rate of visual flicker and auditory flutter. Am J Psychol, 72(4), 521–529. doi:10.2307/1419493

Giaschi, D. E., Regan, D., Kraft, S. P., & Hong, X. H. (1992). Defective processing of motion-defined form in the fellow eye of patients with unilateral amblyopia. Invest Ophthalmol Vis Sci, 33(8), 2483–2489.

Gibson, J. J. (1966). The Senses Considered as Perceptual Systems. Oxford, UK: Houghton Mifflin.

Godfroy, M., Roumes, C., & Dauchy, P. (2003). Spatial variations of visual-auditory fusion areas. Perception, 32(10), 1233–1245.

Gonzalez, E. G., Wong, A. M., Niechwiej-Szwedo, E., Tarita-Nistor, L., & Steinbach, M. J. (2012). Eye position stability in amblyopia and in normal binocular vision. Invest Ophthalmol Vis Sci, 53(9), 5386–5394. doi:10.1167/iovs.12-9941

Goodyear, B. G., Nicolle, D. A., Humphrey, G. K., & Menon, R. S. (2000). BOLD fMRI response of early visual areas to perceived contrast in human amblyopia. J Neurophysiol, 84(4), 1907–1913.

Gori, M. (2015). Multisensory integration and calibration in children and adults with and without sensory and motor disabilities. Multisens Res, 28(1-2), 71–99. doi:10.1163/22134808-00002478

Gori, M., Del Viva, M., Sandini, G., & Burr, D. C. (2008). Young children do not integrate visual and haptic form information. Curr Biol, 18(9), 694–698. doi:10.1016/j.cub.2008.04.036

Gori, M., Sandini, G., & Burr, D. (2012). Development of visuo-auditory integration in space and time. Front Integr Neurosci, 6, 77. doi:10.3389/fnint.2012.00077

Gori, M., Sandini, G., Martinoli, C., & Burr, D. (2010). Poor haptic orientation discrimination in nonsighted children may reflect disruption of cross-sensory calibration. Curr Biol, 20(3), 223–225. doi:10.1016/j.cub.2009.11.069

Gori, M., Sandini, G., Martinoli, C., & Burr, D. C. (2014). Impairment of auditory spatial localization in congenitally blind human subjects. Brain, 137(Pt 1), 288–293. doi:10.1093/brain/awt311

Gori, M., Tinelli, F., Sandini, G., Cioni, G., & Burr, D. (2012). Impaired visual size-discrimination in children with movement disorders. Neuropsychologia, 50(8), 1838–1843. doi:10.1016/j.neuropsychologia.2012.04.009

Gorman, J. J., Cogan, D. G., & Gellis, S. S. (1957). An apparatus for grading the visual acuity of infants on the basis of opticokinetic nystagmus. Pediatrics, 19(6), 1088–1092.

Grant, K. W., & Seitz, P. F. (2000). The use of visible speech cues for improving auditory detection of spoken sentences. J Acoust Soc Am, 108(3), 1197–1208.

Grant, K. W., Walden, B. E., & Seitz, P. F. (1998). Auditory-visual speech recognition by hearing-impaired subjects: consonant recognition, sentence recognition, and auditory-visual integration. J Acoust Soc Am, 103(5 Pt 1), 2677–2690.

Grant, S., Melmoth, D. R., Morgan, M. J., & Finlay, A. L. (2007). Prehension deficits in amblyopia. Invest Ophthalmol Vis Sci, 48(3), 1139–1148. doi:10.1167/iovs.06-0976

Green, A. M., & Angelaki, D. E. (2010). Multisensory integration: resolving sensory ambiguities to build novel representations. Curr Opin Neurobiol, 20(3), 353–360. doi:10.1016/j.conb.2010.04.009

Green, D. M., & Swets, J. A. (1966). Signal Detection Theory and Psychophysics. New York: Wiley.

Groh, J. M., Kelly, K. A., & Underhill, A. M. (2003). A monotonic code for sound azimuth in primate inferior colliculus. J Cogn Neurosci, 15(8), 1217–1231. doi:10.1162/089892903322598166

Groh, J. M., & Sparks, D. L. (1996). Saccades to somatosensory targets. III. eye-position-dependent somatosensory activity in primate superior colliculus. J Neurophysiol, 75(1), 439–453.

Grothe, B., Pecka, M., & McAlpine, D. (2010). Mechanisms of sound localization in mammals. Physiol Rev, 90(3), 983–1012. doi:10.1152/physrev.00026.2009

Guerreiro, M. J., Putzar, L., & Röder, B. (2015). The effect of early visual deprivation on the neural bases of multisensory processing. Brain, 138(Pt 6), 1499–1504. doi:10.1093/brain/awv076

Guerreiro, M. J., Putzar, L., & Röder, B. (2016). Persisting cross-modal changes in sight-recovery individuals modulate visual perception. Curr Biol, 26(22), 3096–3100. doi:10.1016/j.cub.2016.08.069

Hadad, B., Schwartz, S., Maurer, D., & Lewis, T. L. (2015). Motion perception: a review of developmental changes and the role of early visual experience. Front Integr Neurosci, 9, 49. doi:10.3389/fnint.2015.00049

Hadad, B. S., Maurer, D., & Lewis, T. L. (2011). Long trajectory for the development of sensitivity to global and biological motion. Dev Sci, 14(6), 1330–1339. doi:10.1111/j.1467-7687.2011.01078.x

Hadjikhani, N., & Roland, P. E. (1998). Cross-modal transfer of information between the tactile and the visual representations in the human brain: a positron emission tomographic study. J Neurosci, 18(3), 1072–1084.

Hairston, W. D., Burdette, J. H., Flowers, D. L., Wood, F. B., & Wallace, M. T. (2005). Altered temporal profile of visual-auditory multisensory interactions in dyslexia. Exp Brain Res, 166(3-4), 474–480. doi:10.1007/s00221-005-2387-6

Hamed, L., Glaser, J., & Schatz, N. (1991). Improvement of vision in the amblyopic eye following visual loss in the contralateral normal eye: a report of three cases. Binoc Vis, 6, 97–100.

Hariharan, S., Levi, D. M., & Klein, S. A. (2005). “Crowding” in normal and amblyopic vision assessed with Gaussian and Gabor C’s. Vision Res, 45(5), 617–633. doi:10.1016/j.visres.2004.09.035

Harrad, R., & Hess, R. (1992). Binocular integration of contrast information in amblyopia. Vision Res, 32(11), 2135–2150.

Harrington, L., & Peck, C. (1998). Spatial disparity affects visual-auditory interactions in human sensorimotor processing. Experimental Brain Research, 122(2), 247–252.

Hartline, P. H., Vimal, R. P., King, A., Kurylo, D., & Northmore, D. (1995). Effects of eye position on auditory localization and neural representation of space in superior colliculus of cats. Experimental Brain Research, 104(3), 402–408.

Harwerth, R. S., Smith, E. L., 3rd, Boltz, R. L., Crawford, M. L., & von Noorden, G. K. (1983). Behavioral studies on the effect of abnormal early visual experience in monkeys: temporal modulation sensitivity. Vision Res, 23(12), 1511–1517.

Harwerth, R. S., Smith, E. L., 3rd, Duncan, G. C., Crawford, M. L., & von Noorden, G. K. (1986). Multiple sensitive periods in the development of the primate visual system. Science, 232(4747), 235–238.

He, H.-Y., Ray, B., Dennis, K., & Quinlan, E. M. (2007). Experience-dependent recovery of vision following chronic deprivation amblyopia. Nat Neurosci, 10(9), 1134–1136. doi:10.1038/nn1965

He, S., Cavanagh, P., & Intriligator, J. (1996). Attentional resolution and the locus of visual awareness. Nature, 383(6598), 334.

Heming, J. E., & Brown, L. N. (2005). Sensory temporal processing in adults with early hearing loss. Brain Cogn, 59(2), 173–182. doi:10.1016/j.bandc.2005.05.012

Hendrickson, A. E., Movshon, J. A., Eggers, H. M., Gizzi, M. S., Boothe, R. G., & Kiorpes, L. (1987). Effects of early unilateral blur on the macaque's visual system. II. Anatomical observations. J Neurosci, 7(5), 1327–1339.

Heng, S., & Dutton, G. N. (2011). The Pulfrich effect in the clinic. Graefes Arch Clin Exp Ophthalmol, 249(6), 801–808. doi:10.1007/s00417-011-1689-6

Hershenson, M. (1962). Reaction time as a measure of intersensory facilitation. J Exp Psychol, 63(3), 289.

Hess, R. F. (2001). Amblyopia: site unseen. Clin Exp Optom, 84(6), 321–336.

Hess, R. F., Demanins, R., & Bex, P. J. (1997). A reduced motion aftereffect in strabismic amblyopia. Vision Res, 37(10), 1303–1311. doi:10.1016/S0042-6989(96)00277-5

Hess, R. F., & Holliday, I. E. (1992). The spatial localization deficit in amblyopia. Vision Res, 32(7), 1319–1339.

Hess, R. F., & Howell, E. R. (1977). The threshold contrast sensitivity function in strabismic amblyopia: evidence for a two type classification. Vision Res, 17(9), 1049–1055.

Hess, R. F., & Pointer, J. S. (1985). Differences in the neural basis of human amblyopia: The distribution of the anomaly across the visual field. Vision Res, 25(11), 1577–1594. doi:10.1016/0042-6989(85)90128-2

Hess, R. F., Wang, Y. Z., Demanins, R., Wilkinson, F., & Wilson, H. R. (1999). A deficit in strabismic amblyopia for global shape detection. Vision Res, 39(5), 901–914.

Hillock-Dunn, A., & Wallace, M. T. (2012). Developmental changes in the multisensory temporal binding window persist into adolescence. Dev Sci, 15(5), 688–696. doi:10.1111/j.1467-7687.2012.01171.x

Hillock, A. R., Powers, A. R., & Wallace, M. T. (2011). Binding of sights and sounds: age-related changes in multisensory temporal processing. Neuropsychologia, 49(3), 461–467. doi:10.1016/j.neuropsychologia.2010.11.041

Hirsh, I. J. (1959). Auditory perception of temporal order. J Acoust Soc Am, 31(6), 759–767.

Hirsh, I. J., & Sherrick, C. E., Jr. (1961). Perceived order in different sense modalities. J Exp Psychol, 62(5), 423–432.

Ho, C. S., Giaschi, D. E., Boden, C., Dougherty, R., Cline, R., & Lyons, C. (2005). Deficient motion perception in the fellow eye of amblyopic children. Vision Res, 45(12), 1615–1627. doi:10.1016/j.visres.2004.12.009

Ho, C. S., Paul, P. S., Asirvatham, A., Cavanagh, P., Cline, R., & Giaschi, D. E. (2006). Abnormal spatial selection and tracking in children with amblyopia. Vision Res, 46(19), 3274–3283. doi:10.1016/j.visres.2006.03.029

Hofman, P. M., Van Riswick, J. G., & Van Opstal, A. J. (1998). Relearning sound localization with new ears. Nat Neurosci, 1(5), 417–421. doi:10.1038/1633

Hogan, S. C., & Moore, D. R. (2003). Impaired binaural hearing in children produced by a threshold level of middle ear disease. J Assoc Res Otolaryngol, 4(2), 123–129. doi:10.1007/s10162-002-3007-9

Holmes, J. M., Beck, R. W., Repka, M. X., Leske, D. A., Kraker, R. T., Blair, R. C., . . . Pediatric Eye Disease Investigator Group. (2001). The amblyopia treatment study visual acuity testing protocol. Arch Ophthalmol, 119(9), 1345–1353.

Holmes, J. M., & Clarke, M. P. (2006). Amblyopia. Lancet, 367(9519), 1343–1351.

Holmes, J. M., Lazar, E. L., Melia, B. M., Astle, W. F., Dagi, L. R., Donahue, S. P., . . . Pediatric Eye Disease Investigator Group. (2011). Effect of age on response to amblyopia treatment in children. Arch Ophthalmol, 129(11), 1451–1457. doi:10.1001/archophthalmol.2011.179

Holmes, J. M., Manh, V. M., Lazar, E. L., Beck, R. W., Birch, E. E., Kraker, R. T., . . . Pediatric Eye Disease Investigator Group. (2016). Effect of a binocular iPad game vs part-time patching in children aged 5 to 12 years with amblyopia: a randomized clinical trial. JAMA Ophthalmol, 134(12), 1391–1400. doi:10.1001/jamaophthalmol.2016.4262

Holmes, N. P., & Spence, C. (2005). Multisensory integration: space, time and superadditivity. Curr Biol, 15(18), R762–764. doi:10.1016/j.cub.2005.08.058

Hoover, A. E., Harris, L. R., & Steeves, J. K. (2012). Sensory compensation in sound localization in people with one eye. Exp Brain Res, 216(4), 565–574. doi:10.1007/s00221-011-2960-0

Horton, J. C., & Hocking, D. R. (1996). Pattern of ocular dominance columns in human striate cortex in strabismic amblyopia. Vis Neurosci, 13(4), 787–795.

Horton, J. C., & Hocking, D. R. (1997). Timing of the critical period for plasticity of ocular dominance columns in macaque striate cortex. J Neurosci, 17(10), 3684–3709.

Horton, J. C., & Stryker, M. P. (1993). Amblyopia induced by anisometropia without shrinkage of ocular dominance columns in human striate cortex. Proc Natl Acad Sci U S A, 90(12), 5494–5498.

Horwood, J., Waylen, A., Herrick, D., Williams, C., & Wolke, D. (2005). Common visual defects and peer victimization in children. Invest Ophthalmol Vis Sci, 46(4), 1177–1181. doi:10.1167/iovs.04-0597

Hötting, K., & Röder, B. (2009). Auditory and auditory-tactile processing in congenitally blind humans. Hear Res, 258(1-2), 165–174. doi:10.1016/j.heares.2009.07.012

Howard, I. P., & Templeton, W. B. (1966). Human Spatial Orientation. Oxford, England: John Wiley.

Huang, P. C., Li, J., Deng, D., Yu, M., & Hess, R. F. (2012). Temporal synchrony deficits in amblyopia. Invest Ophthalmol Vis Sci, 53(13), 8325–8332. doi:10.1167/iovs.12-10835

Hubel, D. H., & Wiesel, T. N. (1970). The period of susceptibility to the physiological effects of unilateral eye closure in kittens. J Physiol, 206(2), 419–436.

Hubel, D. H., Wiesel, T. N., & LeVay, S. (1977). Plasticity of ocular dominance columns in monkey striate cortex. Philos Trans R Soc Lond B Biol Sci, 278(961), 377–409.

Hughes, H. C., Reuter-Lorenz, P. A., Nozawa, G., & Fendrich, R. (1994). Visual-auditory interactions in sensorimotor processing: saccades versus manual responses. J Exp Psychol Hum Percept Perform, 20(1), 131–153.

Imamura, K., Richter, H., Fischer, H., Lennerstrand, G., Franzen, O., Rydberg, A., . . . Langstrom, B. (1997). Reduced activity in the extrastriate visual cortex of individuals with strabismic amblyopia. Neurosci Lett, 225(3), 173–176.

Intriligator, J., & Cavanagh, P. (2001). The spatial resolution of visual attention. Cogn Psychol, 43(3), 171–216. doi:10.1006/cogp.2001.0755

Irwin, R. J., Ball, A. K., Kay, N., Stillman, J. A., & Rosser, J. (1985). The development of auditory temporal acuity in children. Child Dev, 56(3), 614–620.

Jacobs, R. A. (2002). What determines visual cue reliability? Trends Cogn Sci, 6(8), 345–350.

Jay, M. F., & Sparks, D. L. (1987a). Sensorimotor integration in the primate superior colliculus. I. Motor convergence. J Neurophysiol, 57(1), 22–34.

Jay, M. F., & Sparks, D. L. (1987b). Sensorimotor integration in the primate superior colliculus. II. Coordinates of auditory signals. J Neurophysiol, 57(1), 35–55.

Jiang, F., Stecker, G. C., Boynton, G. M., & Fine, I. (2016). Early blindness results in developmental plasticity for auditory motion processing within auditory and occipital cortex. Front Hum Neurosci, 10, 324. doi:10.3389/fnhum.2016.00324

Jiang, W., Jiang, H., & Stein, B. E. (2006). Neonatal cortical ablation disrupts multisensory development in superior colliculus. J Neurophysiol, 95(3), 1380–1396.

Jiang, W., Wallace, M. T., Jiang, H., Vaughan, J. W., & Stein, B. E. (2001). Two cortical areas mediate multisensory integration in superior colliculus neurons. J Neurophysiol, 85(2), 506–522.

Jones, J. A., & Munhall, K. G. (1997). Effects of separating auditory and visual sources on audiovisual integration of speech. Canadian Acoustics, 25(4), 13–19.

Jones, K. R., Spear, P. D., & Tong, L. (1984). Critical periods for effects of monocular deprivation: differences between striate and extrastriate cortex. J Neurosci, 4(10), 2543–2552.

Kanabus, M., Szelag, E., Rojek, E., & Poppel, E. (2002). Temporal order judgement for auditory and visual stimuli. Acta Neurobiol Exp (Wars), 62(4), 263–270.

Kandel, G. L., Grattan, P. E., & Bedell, H. E. (1980). Are the dominant eyes of amblyopes normal? Am J Optom Physiol Opt, 57(1), 1–6.

Kanonidou, E., Proudlock, F. A., & Gottlob, I. (2010). Reading strategies in mild to moderate strabismic amblyopia: an eye movement investigation. Invest Ophthalmol Vis Sci, 51(7), 3502–3508. doi:10.1167/iovs.09-4236

Kasser, M., & Feldman, J. (1953). Amblyopia in adults: treatment of those engaged in the various industries. Am J Ophthalmol, 36(10), 1443–1446.

Keetels, M., & Vroomen, J. (2005). The role of spatial disparity and hemifields in audio-visual temporal order judgments. Exp Brain Res, 167(4), 635–640. doi:10.1007/s00221-005-0067-1

Keetels, M., & Vroomen, J. (2011). No effect of synesthetic congruency on temporal ventriloquism. Atten Percept Psychophys, 73(1), 209–218. doi:10.3758/s13414-010-0019-0

Kelly, J. P., Tarczy-Hornoch, K., Herlihy, E., & Weiss, A. H. (2015). Occlusion therapy improves phase-alignment of the cortical response in amblyopia. Vision Res, 114, 142–150. doi:10.1016/j.visres.2014.11.014

Kelly, K. R., Jost, R. M., De La Cruz, A., & Birch, E. E. (2015). Amblyopic children read more slowly than controls under natural, binocular reading conditions. J AAPOS, 19(6), 515–520. doi:10.1016/j.jaapos.2015.09.002

Kersten, D., & Yuille, A. (2003). Bayesian models of object perception. Curr Opin Neurobiol, 13(2), 150–158. doi:http://dx.doi.org/10.1016/S0959-4388(03)00042-4

Keuroghlian, A. S., & Knudsen, E. I. (2007). Adaptive auditory plasticity in developing and adult animals. Prog Neurobiol, 82(3), 109–121. doi:10.1016/j.pneurobio.2007.03.005

Khavarghazalani, B., Farahani, F., Emadi, M., & Hosseni Dastgerdi, Z. (2016). Auditory processing abilities in children with chronic otitis media with effusion. Acta Otolaryngol, 136(5), 456–459. doi:10.3109/00016489.2015.1129552

Killan, C. F., Royle, N., Totten, C. L., Raine, C. H., & Lovett, R. E. (2015). The effect of early auditory experience on the spatial listening skills of children with bilateral cochlear implants. Int J Pediatr Otorhinolaryngol, 79(12), 2159–2165. doi:10.1016/j.ijporl.2015.09.039

King, A. J. (2004). The superior colliculus. Curr Biol, 14(9), R335–338. doi:10.1016/j.cub.2004.04.018

King, A. J. (2009). Visual influences on auditory spatial learning. Philos Trans R Soc Lond B Biol Sci, 364(1515), 331–339. doi:10.1098/rstb.2008.0230

King, A. J., & Carlile, S. (1993). Changes induced in the representation of auditory space in the superior colliculus by rearing ferrets with binocular eyelid suture. Exp Brain Res, 94(3), 444–455.

King, A. J., Hutchings, M. E., Moore, D. R., & Blakemore, C. (1988). Developmental plasticity in the visual and auditory representations in the mammalian superior colliculus. Nature, 332(6159), 73–76. doi:10.1038/332073a0

King, A. J., & Palmer, A. R. (1983). Cells responsive to free-field auditory stimuli in guinea-pig superior colliculus: distribution and response properties. J Physiol, 342(1), 361–381.

King, A. J., & Palmer, A. R. (1985). Integration of visual and auditory information in bimodal neurones in the guinea-pig superior colliculus. Exp Brain Res, 60(3), 492–500.

King, A. J., Parsons, C. H., & Moore, D. R. (2000). Plasticity in the neural coding of auditory space in the mammalian brain. Proc Natl Acad Sci U S A, 97(22), 11821–11828. doi:10.1073/pnas.97.22.11821

Kiorpes, L., Kiper, D. C., O'Keefe, L. P., Cavanaugh, J. R., & Movshon, J. A. (1998). Neuronal correlates of amblyopia in the visual cortex of macaque monkeys with experimental strabismus and anisometropia. J Neurosci, 18(16), 6411–6424.

Kishimoto, F., Fujii, C., Shira, Y., Hasebe, K., Hamasaki, I., & Ohtsuki, H. (2014). Outcome of conventional treatment for adult amblyopia. Jpn J Ophthalmol, 58(1), 26–32. doi:10.1007/s10384-013-0279-z

Klaeger-Manzanell, C., Hoyt, C. S., & Good, W. V. (1994). Two step recovery of vision in the amblyopic eye after visual loss and enucleation of the fixing eye. Br J Ophthalmol, 78(6), 506–507.

Klemm, O. (1920). Untersuchungen über die Lokalisation von Schallreizen IV: über den Einfluss des binauralen Zeitunterschieds auf die Lokalisation. Arch ges Psychol, 40, 117–145.

Klumpp, R., & Eady, H. (1956). Some measurements of interaural time difference thresholds. J Acoust Soc Am, 28(5), 859–860.

Knudsen, E. I., & Brainard, M. S. (1991). Visual instruction of the neural map of auditory space in the developing optic tectum. Science, 253(5015), 85–87.

Knudsen, E. I., Esterly, S. D., & Knudsen, P. F. (1984). Monaural occlusion alters sound localization during a sensitive period in the barn owl. J Neurosci, 4(4), 1001–1011.

Knudsen, E. I., & Knudsen, P. F. (1986). The sensitive period for auditory localization in barn owls is limited by age, not by experience. J Neurosci, 6(7), 1918–1924.

Knudsen, E. I., & Knudsen, P. F. (1989). Vision calibrates sound localization in developing barn owls. J Neurosci, 9(9), 3306–3313.

Knudsen, E. I., & Knudsen, P. F. (1990). Sensitive and critical periods for visual calibration of sound localization by barn owls. J Neurosci, 10(1), 222–232.

Knudsen, E. I., Knudsen, P. F., & Esterly, S. D. (1984). A critical period for the recovery of sound localization accuracy following monaural occlusion in the barn owl. J Neurosci, 4(4), 1012–1020.

Kording, K. P., Beierholm, U., Ma, W. J., Quartz, S., Tenenbaum, J. B., & Shams, L. (2007). Causal inference in multisensory perception. PloS one, 2(9), e943. doi:10.1371/journal.pone.0000943

Kovacs, I., Polat, U., Pennefather, P. M., Chandna, A., & Norcia, A. M. (2000). A new test of contour integration deficits in patients with a history of disrupted binocular experience during visual development. Vision Res, 40(13), 1775–1783.

Krueger, D., & Ederer, F. (1984). Report on the National Eye Institute's visual acuity impairment survey pilot study. Bethesda, MD: Office of Biometry and Epidemiology, National Eye Institute, National Institutes of Health, Public Health Service, Department of Health and Human Services.

Kugelberg, U. (1992). Visual acuity following treatment of bilateral congenital cataracts. Doc Ophthalmol, 82(3), 211–215.

Kumpik, D. P., Kacelnik, O., & King, A. J. (2010). Adaptive reweighting of auditory localization cues in response to chronic unilateral earplugging in humans. J Neurosci, 30(14), 4883–4894. doi:10.1523/JNEUROSCI.5488-09.2010

Kupfer, C. (1957). Treatment of amblyopia ex anopsia in adults: a preliminary report of seven cases. Am J Ophthalmol, 43(6), 918–922.

Kushnerenko, E., Teinonen, T., Volein, A., & Csibra, G. (2008). Electrophysiological evidence of illusory audiovisual speech percept in human infants. Proc Natl Acad Sci U S A, 105(32), 11442–11445. doi:10.1073/pnas.0804275105

Lalonde, K., & Holt, R. F. (2016). Audiovisual speech perception development at varying levels of perceptual processing. J Acoust Soc Am, 139(4), 1713–1723.

Lambert, S. R., Buckley, E. G., Drews-Botsch, C., DuBois, L., Hartmann, E., Lynn, M. J., . . . Wilson, M. E. (2010). The infant aphakia treatment study: design and clinical measures at enrollment. Arch Ophthalmol, 128(1), 21–27. doi:10.1001/archophthalmol.2009.350

Lambert, S. R., DuBois, L., Cotsonis, G., Hartmann, E. E., & Drews-Botsch, C. (2016). Factors associated with stereopsis and a good visual acuity outcome among children in the Infant Aphakia Treatment Study. Eye (Lond), 30(9), 1221–1228. doi:10.1038/eye.2016.164

Lane, R., Allman, J., Kaas, J., & Miezin, F. (1973). The visuotopic organization of the superior colliculus of the owl monkey (Aotus trivirgatus) and the bush baby (Galago senegalensis). Brain Res, 60(2), 335–349.

Lea, S. J. H., Loades, J., & Rubinstein, M. P. (1989). The sensitive period for anisometropic amblyopia. Eye (Lond), 3(6), 783–790.

Lee, H., & Noppeney, U. (2011a). Long-term music training tunes how the brain temporally binds signals from multiple senses. Proc Natl Acad Sci U S A, 108(51), E1441–1450. doi:10.1073/pnas.1115267108

Lee, H., & Noppeney, U. (2011b). Physical and perceptual factors shape the neural mechanisms that integrate audiovisual signals in speech comprehension. J Neurosci, 31(31), 11338–11350. doi:10.1523/JNEUROSCI.6510-10.2011

Leguire, L. E., Rogers, G. L., & Bremer, D. L. (1990). Amblyopia: the normal eye is not normal. J Pediatr Ophthalmol Strabismus, 27(1), 32–38; discussion 39.

Lessard, N., Pare, M., Lepore, F., & Lassonde, M. (1998). Early-blind human subjects localize sound sources better than sighted subjects. Nature, 395(6699), 278–280.

Levi, D. M. (2013). Linking assumptions in amblyopia. Vis Neurosci, 30(5-6), 277–287. doi:10.1017/S0952523813000023

Levi, D. M., Hariharan, S., & Klein, S. A. (2002). Suppressive and facilitatory spatial interactions in amblyopic vision. Vision Res, 42(11), 1379–1394. doi:http://dx.doi.org/10.1016/S0042-6989(02)00061-5

Levi, D. M., & Harwerth, R. S. (1977). Spatio-temporal interactions in anisometropic and strabismic amblyopia. Invest Ophthalmol Vis Sci, 16(1), 90–95.

Levi, D. M., & Klein, S. A. (1985). Vernier acuity, crowding and amblyopia. Vision Res, 25(7), 979–991. doi:http://dx.doi.org/10.1016/0042-6989(85)90208-1

Levi, D. M., & Klein, S. A. (2003). Noise provides some new signals about the spatial vision of amblyopes. J Neurosci, 23(7), 2522–2526.

Levi, D. M., Klein, S. A., & Chen, I. (2007). The response of the amblyopic visual system to noise. Vision Res, 47(19), 2531–2542. doi:10.1016/j.visres.2007.06.014

Levi, D. M., Klein, S. A., & Chen, I. (2008). What limits performance in the amblyopic visual system: seeing signals in noise with an amblyopic brain. J Vis, 8(4), 1, 1–23. doi:10.1167/8.4.1

Levi, D. M., Klein, S. A., & Yap, Y. L. (1987). Positional uncertainty in peripheral and amblyopic vision. Vision Res, 27(4), 581–597.

Levi, D. M., & Polat, U. (1996). Neural plasticity in adults with amblyopia. Proc Natl Acad Sci U S A, 93(13), 6830–6834.

Levi, D. M., Waugh, S. J., & Beard, B. L. (1994). Spatial scale shifts in amblyopia. Vision Res, 34(24), 3315–3333.

Lewald, J., Ehrenstein, W. H., & Guski, R. (2001). Spatio-temporal constraints for auditory–visual integration. Behav Brain Res, 121(1-2), 69–79.

Lewald, J., & Guski, R. (2003). Cross-modal perceptual integration of spatially and temporally disparate auditory and visual stimuli. Brain Res Cogn Brain Res, 16(3), 468–478. doi:10.1016/s0926-6410(03)00074-0

Lewis, J. W., Beauchamp, M. S., & DeYoe, E. A. (2000). A comparison of visual and auditory motion processing in human cerebral cortex. Cereb Cortex, 10(9), 873–888.

Lewis, T. L., & Maurer, D. (2005). Multiple sensitive periods in human visual development: evidence from visually deprived children. Dev Psychobiol, 46(3), 163–183. doi:10.1002/dev.20055

Lewkowicz, D. J. (2000). The development of intersensory temporal perception: an epigenetic systems/limitations view. Psychol Bull, 126(2), 281–308.

Lewkowicz, D. J., & Flom, R. (2014). The audiovisual temporal binding window narrows in early childhood. Child Dev, 85(2), 685–694. doi:10.1111/cdev.12142

Lewkowicz, D. J., & Hansen-Tift, A. M. (2012). Infants deploy selective attention to the mouth of a talking face when learning speech. Proc Natl Acad Sci U S A, 109(5), 1431–1436. doi:10.1073/pnas.1114783109

Lewkowicz, D. J., & Lickliter, R. (1994). The Development of Intersensory Perception: Comparative Perspectives. Hillsdale, NJ: Erlbaum.

Lewkowicz, D. J., & Turkewitz, G. (1980). Cross-modal equivalence in early infancy: Auditory–visual intensity matching. Developmental Psychology, 16(6), 597.

Li, J., Thompson, B., Lam, C. S., Deng, D., Chan, L. Y., Maehara, G., . . . Hess, R. F. (2011). The role of suppression in amblyopia. Invest Ophthalmol Vis Sci, 52(7), 4169–4176. doi:10.1167/iovs.11-7233

Li, R. W., Ngo, C., Nguyen, J., & Levi, D. M. (2011). Video-game play induces plasticity in the visual system of adults with amblyopia. PLoS Biol, 9(8), e1001135. doi:10.1371/journal.pbio.1001135

Li, X., Dumoulin, S. O., Mansouri, B., & Hess, R. F. (2007). Cortical deficits in human amblyopia: their regional distribution and their relationship to the contrast detection deficit. Invest Ophthalmol Vis Sci, 48(4), 1575–1591.

Li, X., Mullen, K. T., Thompson, B., & Hess, R. F. (2011). Effective connectivity anomalies in human amblyopia. Neuroimage, 54(1), 505–516. doi:10.1016/j.neuroimage.2010.07.053

Litovsky, R. Y. (1997). Developmental changes in the precedence effect: estimates of minimum audible angle. J Acoust Soc Am, 102(3), 1739–1745.

Litovsky, R. Y. (2005). Speech intelligibility and spatial release from masking in young children. J Acoust Soc Am, 117(5), 3091–3099.

Litovsky, R. Y., & Ashmead, D. H. (1997). Development of binaural and spatial hearing in infants and children. In R. H. Gilkey & T. R. Anderson (Eds.), Binaural and spatial hearing in real and virtual environments (pp. 571–592). Mahwah, N.J.: Lawrence Erlbaum Associates.

Litovsky, R. Y., Fligor, B. J., & Tramo, M. J. (2002). Functional role of the human inferior colliculus in binaural hearing. Hear Res, 165(1-2), 177–188. doi:http://dx.doi.org/10.1016/S0378-5955(02)00304-0

Lovelace, C. T., Stein, B. E., & Wallace, M. T. (2003). An irrelevant light enhances auditory detection in humans: a psychophysical analysis of multisensory integration in stimulus detection. Brain Res Cogn Brain Res, 17(2), 447–453.

Löwel, S., & Engelmann, R. (2002). Neuroanatomical and neurophysiological consequences of strabismus: changes in the structural and functional organization of the primary visual cortex in cats with alternating fixation and strabismic amblyopia. Strabismus, 10(2), 95–105.

Lyons-Ruth, K. (1977). Bimodal perception in infancy: Response to auditory-visual incongruity. Child Dev, 820–827.

Macaluso, E. (2006). Multisensory processing in sensory-specific cortical areas. Neuroscientist, 12(4), 327–338. doi:10.1177/1073858406287908

Macaluso, E., & Driver, J. (2001). Spatial attention and crossmodal interactions between vision and touch. Neuropsychologia, 39(12), 1304–1316. doi:10.1016/S0028-3932(01)00119-1

Macaluso, E., Frith, C., & Driver, J. (2000). Selective spatial attention in vision and touch: unimodal and multimodal mechanisms revealed by PET. J Neurophysiol, 83(5), 3062–3075.

Macaluso, E., Frith, C. D., & Driver, J. (2000). Modulation of human visual cortex by crossmodal spatial attention. Science, 289(5482), 1206–1208.

Macaluso, E., George, N., Dolan, R., Spence, C., & Driver, J. (2004). Spatial and temporal factors during processing of audiovisual speech: a PET study. Neuroimage, 21(2), 725–732. doi:10.1016/j.neuroimage.2003.09.049

MacDonald, J., Andersen, S., & Bachmann, T. (2000). Hearing by eye: how much spatial degradation can be tolerated? Perception, 29(10), 1155–1168.

Magnotti, J. F., & Beauchamp, M. S. (2017). A causal inference model explains perception of the McGurk effect and other incongruent audiovisual speech. PLoS Comput Biol, 13(2), e1005229. doi:10.1371/journal.pcbi.1005229

Makous, J. C., & Middlebrooks, J. C. (1990). Two-dimensional sound localization by human listeners. J Acoust Soc Am, 87(5), 2188–2200.

Mansouri, B., & Hess, R. F. (2006). The global processing deficit in amblyopia involves noise segregation. Vision Res, 46(24), 4104–4117. doi:10.1016/j.visres.2006.07.017

Marks, L. E. (1978). The Unity of the Senses. New York: Academic Press.

Martin, B., Giersch, A., Huron, C., & van Wassenhove, V. (2013). Temporal event structure and timing in schizophrenia: preserved binding in a longer "now". Neuropsychologia, 51(2), 358–371. doi:10.1016/j.neuropsychologia.2012.07.002

Mathews, S., Yager, D., Ciuffreda, K. J., & Ettinger, E. R. (1987). Spatial frequency discrimination in anisometropic and strabismic amblyopia. Appl Opt, 26(8), 1432–1436. doi:10.1364/AO.26.001432

Maurer, D., Lewis, T. L., Brent, H. P., & Levin, A. V. (1999). Rapid improvement in the acuity of infants after visual input. Science, 286(5437), 108–110.

Maurer, D., Stager, C. L., & Mondloch, C. J. (1999). Cross‐modal transfer of shape is difficult to demonstrate in one‐month‐olds. Child Dev, 70(5), 1047–1057.

Mayer, D. L., Beiser, A. S., Warner, A. F., Pratt, E. M., Raye, K. N., & Lang, J. M. (1995). Monocular acuity norms for the Teller Acuity Cards between ages one month and four years. Invest Ophthalmol Vis Sci, 36(3), 671–685.

Mayer, D. L., & Dobson, V. (1982). Visual acuity development in infants and young children, as assessed by operant preferential looking. Vision Res, 22(9), 1141–1151.

McGurk, H., & MacDonald, J. (1976). Hearing lips and seeing voices. Nature, 264(5588), 746–748.

McKee, S. P., Levi, D. M., & Movshon, J. A. (2003). The pattern of visual deficits in amblyopia. J Vis, 3(5), 380–405. doi:10.1167/3.5.5

McKee, S. P., Levi, D. M., Schor, C. M., & Movshon, J. A. (2016). Saccadic latency in amblyopia. J Vis, 16(5), 3. doi:10.1167/16.5.3

Meier, K., & Giaschi, D. (2017). Unilateral amblyopia affects two eyes: fellow eye deficits in amblyopia. Invest Ophthalmol Vis Sci, 58(3), 1779–1800.

Meltzoff, A. N., & Borton, R. W. (1979). Intermodal matching by human neonates. Nature, 282(5737), 403–404.

Membreno, J. H., Brown, M. M., Brown, G. C., Sharma, S., & Beauchamp, G. R. (2002). A cost-utility analysis of therapy for amblyopia. Ophthalmology, 109(12), 2265–2271. doi:http://dx.doi.org/10.1016/S0161-6420(02)01286-1

Meredith, M. A., Nemitz, J. W., & Stein, B. E. (1987). Determinants of multisensory integration in superior colliculus neurons. I. Temporal factors. J Neurosci, 7(10), 3215–3229.

Meredith, M. A., & Stein, B. E. (1986a). Spatial factors determine the activity of multisensory neurons in cat superior colliculus. Brain Res, 365(2), 350–354. doi:http://dx.doi.org/10.1016/0006-8993(86)91648-3

Meredith, M. A., & Stein, B. E. (1986b). Visual, auditory, and somatosensory convergence on cells in superior colliculus results in multisensory integration. J Neurophysiol, 56(3), 640–662.

Middlebrooks, J. C., & Green, D. M. (1991). Sound localization by human listeners. Annu Rev Psychol, 42(1), 135–159. doi:10.1146/annurev.ps.42.020191.001031

Middlebrooks, J. C., Makous, J. C., & Green, D. M. (1989). Directional sensitivity of sound-pressure levels in the human ear canal. J Acoust Soc Am, 86(1), 89–108.

Miller, L. M., & D'Esposito, M. (2005). Perceptual fusion and stimulus coincidence in the cross-modal integration of speech. J Neurosci, 25(25), 5884–5893. doi:10.1523/JNEUROSCI.0896-05.2005

Mills, A. W. (1958). On the minimum audible angle. J Acoust Soc Am, 30(4), 237–246.

Milner, A. D., & Goodale, M. A. (2008). Two visual systems re-viewed. Neuropsychologia, 46(3), 774–785. doi:10.1016/j.neuropsychologia.2007.10.005

Mirabella, G., Hay, S., & Wong, A. M. (2011). Deficits in perception of images of real-world scenes in patients with a history of amblyopia. Arch Ophthalmol, 129(2), 176–183. doi:10.1001/archophthalmol.2010.354

Mon-Williams, M., Wann, J. P., Jenkinson, M., & Rushton, K. (1997). Synaesthesia in the normal limb. Proc Biol Sci, 264(1384), 1007–1010. doi:10.1098/rspb.1997.0139

Moore, D. R. (1993). Plasticity of binaural hearing and some possible mechanisms following late-onset deprivation. J Am Acad Audiol, 4(5), 277–283.

Moore, R. Y., & Goldberg, J. M. (1966). Projections of the inferior colliculus in the monkey. Experimental Neurology, 14(4), 429–438. doi:http://dx.doi.org/10.1016/0014-4886(66)90127-0

Morein-Zamir, S., Soto-Faraco, S., & Kingstone, A. (2003). Auditory capture of vision: examining temporal ventriloquism. Brain Res Cogn Brain Res, 17(1), 154–163. doi:10.1016/s0926-6410(03)00089-2

Moro, S. S., Harris, L. R., & Steeves, J. K. (2014). Optimal audiovisual integration in people with one eye. Multisens Res, 27(3-4), 173–188. doi:10.1163/22134808-00002453

Moro, S. S., & Steeves, J. K. (2015). Audiovisual integration in people with one eye: Normal temporal binding window and sound induced flash illusion but reduced McGurk effect. J Vis, 15(12), 721. doi:10.1167/15.12.721

Morrell, L. K. (1968). Temporal characteristics of sensory interaction in choice reaction times. J Exp Psychol, 77(1), 14–18.

Morrongiello, B. A. (1988). Infants’ localization of sounds along the horizontal axis: Estimates of minimum audible angle. Developmental Psychology, 24(1), 8–13.

Morrongiello, B. A., Fenwick, K. D., & Chance, G. (1998). Crossmodal learning in newborn infants: Inferences about properties of auditory-visual events. Infant Behavior and Development, 21(4), 543–553. doi:http://dx.doi.org/10.1016/S0163-6383(98)90028-5

Movshon, J. A., Eggers, H. M., Gizzi, M. S., Hendrickson, A. E., Kiorpes, L., & Boothe, R. G. (1987). Effects of early unilateral blur on the macaque's visual system. III. Physiological observations. J Neurosci, 7(5), 1340–1351.

Mozolic, J. L., Hugenschmidt, C. E., Peiffer, A. M., & Laurienti, P. J. (2008). Modality-specific selective attention attenuates multisensory integration. Exp Brain Res, 184(1), 39–52. doi:10.1007/s00221-007-1080-3

Muchnik, C., Efrati, M., Nemeth, E., Malin, M., & Hildesheimer, M. (1991). Central auditory skills in blind and sighted subjects. Scand Audiol, 20(1), 19–23. doi:10.3109/01050399109070785

Muckli, L., Kiess, S., Tonhausen, N., Singer, W., Goebel, R., & Sireteanu, R. (2006). Cerebral correlates of impaired grating perception in individual, psychophysically assessed human amblyopes. Vision Res, 46(4), 506–526. doi:10.1016/j.visres.2005.10.014

Muir, D., & Field, J. (1979). Newborn infants orient to sounds. Child Dev, 50(2), 431–436. doi:10.2307/1129419

Munhall, K. G., Gribble, P., Sacco, L., & Ward, M. (1996). Temporal constraints on the McGurk effect. Percept Psychophys, 58(3), 351–362. doi:10.3758/bf03206811

Nardini, M., Jones, P., Bedford, R., & Braddick, O. (2008). Development of cue integration in human navigation. Curr Biol, 18(9), 689–693. doi:10.1016/j.cub.2008.04.021

Narinesingh, C., Goltz, H. C., Raashid, R. A., & Wong, A. M. (2015). Developmental trajectory of McGurk effect susceptibility in children and adults with amblyopia. Invest Ophthalmol Vis Sci, 56(3), 2107–2113. doi:10.1167/iovs.14-15898

Narinesingh, C., Goltz, H. C., & Wong, A. M. (2017). Temporal binding window of the sound-induced flash illusion in amblyopia. Invest Ophthalmol Vis Sci, 58(3), 1442–1448. doi:10.1167/iovs.16-21258

Narinesingh, C., Wan, M., Goltz, H. C., Chandrakumar, M., & Wong, A. M. (2014). Audiovisual perception in adults with amblyopia: a study using the McGurk effect. Invest Ophthalmol Vis Sci, 55(5), 3158–3164. doi:10.1167/iovs.14-14140

Nath, A. R., & Beauchamp, M. S. (2012). A neural basis for interindividual differences in the McGurk effect, a multisensory speech illusion. Neuroimage, 59(1), 781–787. doi:10.1016/j.neuroimage.2011.07.024

Navarra, J., Alsius, A., Soto-Faraco, S., & Spence, C. (2010). Assessing the role of attention in the audiovisual integration of speech. Information Fusion, 11(1), 4–11. doi:10.1016/j.inffus.2009.04.001

Navarra, J., Vatakis, A., Zampini, M., Soto-Faraco, S., Humphreys, W., & Spence, C. (2005). Exposure to asynchronous audiovisual speech extends the temporal window for audiovisual integration. Brain Res Cogn Brain Res, 25(2), 499–507. doi:10.1016/j.cogbrainres.2005.07.009

Neu, B., & Sireteanu, R. (1997). Monocular acuity in preschool children: Assessment with the Teller and Keeler acuity cards in comparison to the C-test. Strabismus, 5(4), 185–202. doi:10.3109/09273979709044534

Newsome, W. T., & Pare, E. B. (1988). A selective impairment of motion perception following lesions of the middle temporal visual area (MT). J Neurosci, 8(6), 2201–2211.

Niechwiej-Szwedo, E., Goltz, H. C., Chandrakumar, M., Hirji, Z., & Wong, A. M. (2011). Effects of anisometropic amblyopia on visuomotor behavior, III: Temporal eye-hand coordination during reaching. Invest Ophthalmol Vis Sci, 52(8), 5853–5861. doi:10.1167/iovs.11-7314

Niechwiej-Szwedo, E., Goltz, H. C., Chandrakumar, M., Hirji, Z. A., & Wong, A. M. (2010). Effects of anisometropic amblyopia on visuomotor behavior, I: saccadic eye movements. Invest Ophthalmol Vis Sci, 51(12), 6348–6354. doi:10.1167/iovs.10-5882

Niechwiej-Szwedo, E., Goltz, H. C., Chandrakumar, M., & Wong, A. M. (2012). The effect of sensory uncertainty due to amblyopia (lazy eye) on the planning and execution of visually-guided 3D reaching movements. PloS one, 7(2), e31075.

Niechwiej-Szwedo, E., Kennedy, S. A., Colpa, L., Chandrakumar, M., Goltz, H. C., & Wong, A. M. (2012). Effects of induced monocular blur versus anisometropic amblyopia on saccades, reaching, and eye-hand coordination. Invest Ophthalmol Vis Sci, 53(8), 4354–4362. doi:10.1167/iovs.12-9855

Noesselt, T., Bergmann, D., Hake, M., Heinze, H. J., & Fendrich, R. (2008). Sound increases the saliency of visual events. Brain Res, 1220, 157–163. doi:10.1016/j.brainres.2007.12.060

Noesselt, T., Rieger, J. W., Schoenfeld, M. A., Kanowski, M., Hinrichs, H., Heinze, H. J., & Driver, J. (2007). Audiovisual temporal correspondence modulates human multisensory superior temporal sulcus plus primary sensory cortices. J Neurosci, 27(42), 11431–11441. doi:10.1523/JNEUROSCI.2252-07.2007

Norcia, A. M., & Tyler, C. W. (1985). Spatial frequency sweep VEP: visual acuity during the first year of life. Vision Res, 25(10), 1399–1408.

Norcia, A. M., Tyler, C. W., & Hamer, R. D. (1990). Development of contrast sensitivity in the human infant. Vision Res, 30(10), 1475–1486.

Nordmann, J. P., Freeman, R. D., & Casanova, C. (1992). Contrast sensitivity in amblyopia: masking effects of noise. Invest Ophthalmol Vis Sci, 33(10), 2975–2985.

O’Connor, N., & Hermelin, B. (1972). Seeing and hearing and space and time. Atten Percept Psychophys, 11(1), 46–48.

Ogilvie, J. C. (1956). Effect of auditory flutter on the visual critical flicker frequency. Canadian Journal of Psychology/Revue canadienne de psychologie, 10(2), 61–68. doi:10.1037/h0083662

Oliver, D. L., & Huerta, M. F. (1992). Inferior and superior colliculi. In D. B. Webster, A. N. Popper, & R. R. Fay (Eds.), The Mammalian Auditory Pathway: Neuroanatomy (pp. 168–221). New York, NY: Springer New York.

Packwood, E. A., Cruz, O. A., Rychwalski, P. J., & Keech, R. V. (1999). The psychosocial effects of amblyopia study. J AAPOS, 3(1), 15–17.

Paré, M., Richler, R. C., ten Hove, M., & Munhall, K. (2003). Gaze behavior in audiovisual speech perception: The influence of ocular fixations on the McGurk effect. Percept Psychophys, 65(4), 553–567.

Parise, C. V., & Ernst, M. O. (2016). Correlation detection as a general mechanism for multisensory integration. Nature communications, 7.

Parisi, V., Scarale, M. E., Balducci, N., Fresina, M., & Campos, E. C. (2010). Electrophysiological detection of delayed postretinal neural conduction in human amblyopia. Invest Ophthalmol Vis Sci, 51(10), 5041–5048. doi:10.1167/iovs.10-5412

Park, T. J., Klug, A., Holinstat, M., & Grothe, B. (2004). Interaural level difference processing in the lateral superior olive and the inferior colliculus. J Neurophysiol, 92(1), 289–301. doi:10.1152/jn.00961.2003

Pascalis, O., de Schonen, S., Morton, J., Deruelle, C., & Fabre-Grenet, M. (1995). Mother's face recognition by neonates: a replication and an extension. Infant Behavior and Development, 18(1), 79–85.

Pêcheux, M. G., Lepecq, J. C., & Salzarulo, P. (1988). Oral activity and exploration in 1–2‐month‐old infants. British Journal of Developmental Psychology, 6(3), 245–256.

Petrini, K., Russell, M., & Pollick, F. (2009). When knowing can replace seeing in audiovisual integration of actions. Cognition, 110(3), 432–439. doi:10.1016/j.cognition.2008.11.015

Pick, H. L., Warren, D. H., & Hay, J. C. (1969). Sensory conflict in judgments of spatial direction. Percept Psychophys, 6(4), 203–205. doi:10.3758/BF03207017

Pillsbury, H. C., Grose, J. H., & Hall, J. W., III. (1991). Otitis media with effusion in children: Binaural hearing before and after corrective surgery. Archives of Otolaryngology–Head & Neck Surgery, 117(7), 718–723. doi:10.1001/archotol.1991.01870190030008

Polat, U., Ma-Naim, T., Belkin, M., & Sagi, D. (2004). Improving vision in adult amblyopia by perceptual learning. Proc Natl Acad Sci U S A, 101(17), 6692–6697.

Polat, U., Sagi, D., & Norcia, A. M. (1997). Abnormal long-range spatial interactions in amblyopia. Vision Res, 37(6), 737–744.

Pollack, J. G., & Hickey, T. L. (1979). The distribution of retino-collicular axon terminals in rhesus monkey. J Comp Neurol, 185(4), 587–602. doi:10.1002/cne.901850402

Pons, F., Lewkowicz, D. J., Soto-Faraco, S., & Sebastian-Galles, N. (2009). Narrowing of intersensory speech perception in infancy. Proc Natl Acad Sci U S A, 106(26), 10598–10602. doi:10.1073/pnas.0904134106

Pöppel, E. (1997). A hierarchical model of temporal perception. Trends Cogn Sci, 1(2), 56–61.

Posner, M. I., Nissen, M. J., & Klein, R. M. (1976). Visual dominance: an information-processing account of its origins and significance. Psychol Rev, 83(2), 157–171.

Powers, A. R., 3rd, Hillock, A. R., & Wallace, M. T. (2009). Perceptual training narrows the temporal window of multisensory binding. J Neurosci, 29(39), 12265–12274. doi:10.1523/JNEUROSCI.3501-09.2009

Preslan, M. W., & Novak, A. (1996). Baltimore Vision Screening Project. Ophthalmology, 103(1), 105–109.

Pulfrich, C. (1922). Die Stereoskopie im Dienste der isochromen und heterochromen Photometrie. Naturwissenschaften, 10(33), 714–722.

Pulkki, V. (2001). Spatial sound generation and perception by amplitude panning techniques: Helsinki University of Technology.

Pulkki, V., & Karjalainen, M. (2001). Localization of amplitude-panned virtual sources I: stereophonic panning. Journal of the Audio Engineering Society, 49(9), 739–752.

Putzar, L., Goerendt, I., Heed, T., Richard, G., Buchel, C., & Röder, B. (2010). The neural basis of lip-reading capabilities is altered by early visual deprivation. Neuropsychologia, 48(7), 2158–2166.

Putzar, L., Goerendt, I., Lange, K., Rosler, F., & Röder, B. (2007). Early visual deprivation impairs multisensory interactions in humans. Nat Neurosci, 10(10), 1243–1245. doi:10.1038/nn1978

Putzar, L., Hötting, K., & Röder, B. (2010). Early visual deprivation affects the development of face recognition and of audio-visual speech perception. Restor Neurol Neurosci, 28(2), 251–257. doi:10.3233/RNN-2010-0526

Raashid, R. A., Liu, I. Z., Blakeman, A., Goltz, H. C., & Wong, A. M. (2016). The initiation of smooth pursuit is delayed in anisometropic amblyopia. Invest Ophthalmol Vis Sci, 57(4), 1757–1764. doi:10.1167/iovs.16-19126

Raashid, R. A., Wong, A., Blakeman, A., & Goltz, H. C. (2015). Saccadic adaptation in visually normal individuals using saccadic endpoint variability from amblyopia. Invest Ophthalmol Vis Sci, 56(2), 947–955.

Raashid, R. A., Wong, A., Chandrakumar, M., Blakeman, A., & Goltz, H. C. (2013). Short-term saccadic adaptation in patients with anisometropic amblyopia. Invest Ophthalmol Vis Sci, 54(10), 6701–6711.

Rahi, J., Logan, S., Timms, C., Russell-Eggitt, I., & Taylor, D. (2002). Risk, causes, and outcomes of visual impairment after loss of vision in the non-amblyopic eye: a population-based study. Lancet, 360(9333), 597–602.

Raij, T., Uutela, K., & Hari, R. (2000). Audiovisual integration of letters in the human brain. Neuron, 28(2), 617–625.

Rayleigh, L. (1907). XII. On our perception of sound direction. The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science, 13(74), 214–232.

Recanzone, G. H. (2003). Auditory influences on visual temporal rate perception. J Neurophysiol, 89(2), 1078–1093. doi:10.1152/jn.00706.2002

Recanzone, G. H. (2009). Interactions of auditory and visual stimuli in space and time. Hear Res, 258(1-2), 89–99. doi:10.1016/j.heares.2009.04.009

Repka, M. X., Beck, R. W., Kraker, R. T., Cole, S. R., Holmes, J. M., Birch, E. E., . . . Cotter, S. A. (2002). The clinical profile of moderate amblyopia in children younger than 7 years. Arch Ophthalmol, 120(3), 281–287.

Richards, M. D., Goltz, H. C., & Wong, A. M. (2017a). Optimal audiovisual integration in the ventriloquism effect but pervasive deficits in unisensory spatial localization in amblyopia. Invest Ophthalmol Vis Sci, (in press).

Richards, M. D., Goltz, H. C., & Wong, A. M. F. (2017b). Alterations in audiovisual simultaneity perception in amblyopia. PloS one, 12(6), e0179516. doi:10.1371/journal.pone.0179516

Ringdahl, A., Eriksson-Mangold, M., & Andersson, G. (1998). Psychometric evaluation of the Gothenburg Profile for measurement of experienced hearing disability and handicap: applications with new hearing aid candidates and experienced hearing aid users. Br J Audiol, 32(6), 375–385.

Robaei, D., Rose, K. A., Ojaimi, E., Kifley, A., Martin, F. J., & Mitchell, P. (2006). Causes and associations of amblyopia in a population-based sample of 6-year-old Australian children. Arch Ophthalmol, 124(6), 878–884. doi:10.1001/archopht.124.6.878

Robinson, D. L., & Kertzman, C. (1995). Covert orienting of attention in macaques. III. Contributions of the superior colliculus. J Neurophysiol, 74(2), 713–721.

Rock, I., & Victor, J. (1964). Vision and touch: An experimentally created conflict between the two senses. Science, 143(3606), 594–596.

Röder, B., Rosler, F., & Spence, C. (2004). Early vision impairs tactile perception in the blind. Curr Biol, 14(2), 121–124.

Röder, B., Teder-Salejarvi, W., Sterr, A., Rosler, F., Hillyard, S. A., & Neville, H. J. (1999). Improved auditory spatial tuning in blind humans. Nature, 400(6740), 162–166. doi:10.1038/22106

Roelfsema, P. R., Konig, P., Engel, A. K., Sireteanu, R., & Singer, W. (1994). Reduced synchronization in the visual cortex of cats with strabismic amblyopia. Eur J Neurosci, 6(11), 1645–1655.

Rose, J. E., Brugge, J. F., Anderson, D. J., & Hind, J. E. (1967). Phase-locked response to low-frequency tones in single auditory nerve fibers of the squirrel monkey. J Neurophysiol, 30(4), 769–793.

Rose, S. A., Gottfried, A. W., & Bridger, W. H. (1981). Cross-modal transfer in 6-month-old infants. Developmental Psychology, 17(5), 661.

Roseboom, W., & Arnold, D. H. (2011). Twice upon a time: multiple concurrent temporal recalibrations of audiovisual speech. Psychol Sci, 22(7), 872–877. doi:10.1177/0956797611413293

Roseboom, W., Nishida, S., & Arnold, D. H. (2009). The sliding window of audio-visual simultaneity. J Vis, 9(12), 4 1–8. doi:10.1167/9.12.4

Rosenblum, L. D., Schmuckler, M. A., & Johnson, J. A. (1997). The McGurk effect in infants. Atten Percept Psychophys, 59(3), 347–357.

Rowland, B., Stanford, T., & Stein, B. (2007). A Bayesian model unifies multisensory spatial localization with the physiological properties of the superior colliculus. Experimental Brain Research, 180(1), 153–161. doi:10.1007/s00221-006-0847-2

Saenz, M., Lewis, L. B., Huth, A. G., Fine, I., & Koch, C. (2008). Visual Motion Area MT+/V5 Responds to Auditory Motion in Human Sight-Recovery Subjects. J Neurosci, 28(20), 5141–5148. doi:10.1523/JNEUROSCI.0803-08.2008

Salomao, S. R., & Ventura, D. F. (1995). Large sample population age norms for visual acuities obtained with Vistech-Teller Acuity Cards. Invest Ophthalmol Vis Sci, 36(3), 657–670.

Salzman, C. D., Britten, K. H., & Newsome, W. T. (1990). Cortical microstimulation influences perceptual judgements of motion direction. Nature, 346(6280), 174.

Scheiman, M. M., Hertle, R. W., Beck, R. W., Edwards, A. R., Birch, E., Cotter, S. A., . . . Donahue, S. (2005). Randomized trial of treatment of amblyopia in children aged 7 to 17 years. Arch Ophthalmol, 123(4), 437–447.

Schiller, P. H., & Stryker, M. (1972). Single-unit recording and stimulation in superior colliculus of the alert rhesus monkey. J Neurophysiol, 35(6), 915–924.

Schneider, K. A., & Bavelier, D. (2003). Components of visual prior entry. Cogn Psychol, 47(4), 333–366. doi:10.1016/s0010-0285(03)00035-5

Schor, C. M., & Westall, C. (1984). Visual and vestibular sources of fixation instability in amblyopia. Invest Ophthalmol Vis Sci, 25(6), 729–738.

Schröder, J. H., Fries, P., Roelfsema, P. R., Singer, W., & Engel, A. K. (2002). Ocular dominance in extrastriate cortex of strabismic amblyopic cats. Vision Res, 42(1), 29–39.

Secen, J., Culham, J., Ho, C., & Giaschi, D. (2011). Neural correlates of the multiple-object tracking deficit in amblyopia. Vision Res, 51(23-24), 2517–2527. doi:10.1016/j.visres.2011.10.011

Sekuler, R., Sekuler, A. B., & Lau, R. (1997). Sound alters visual motion perception. Nature, 385(6614), 308. doi:10.1038/385308a0

Sengpiel, F., Jirmann, K. U., Vorobyov, V., & Eysel, U. T. (2006). Strabismic suppression is mediated by inhibitory interactions in the primary visual cortex. Cereb Cortex, 16(12), 1750–1758. doi:10.1093/cercor/bhj110

Shams, L., & Beierholm, U. R. (2010). Causal inference in perception. Trends Cogn Sci, 14(9), 425–432. doi:10.1016/j.tics.2010.07.001

Shams, L., Kamitani, Y., & Shimojo, S. (2000). Illusions: What you see is what you hear. Nature, 408(6814), 788–788.

Shams, L., Kamitani, Y., & Shimojo, S. (2002). Visual illusion induced by sound. Brain Res Cogn Brain Res, 14(1), 147–152. doi:10.1016/S0926-6410(02)00069-1

Sharma, V., Levi, D. M., & Klein, S. A. (2000). Undercounting features and missing features: evidence for a high-level deficit in strabismic amblyopia. Nat Neurosci, 3(5), 496–501. doi:10.1038/74872

Shatz, C. J., & Stryker, M. P. (1978). Ocular dominance in layer IV of the cat's visual cortex and the effects of monocular deprivation. J Physiol, 281, 267–283.

Shimojo, S., & Held, R. (1987). Vernier acuity is less than grating acuity in 2-and 3-month-olds. Vision Res, 27(1), 77–86.

Shipley, T. (1964). Auditory Flutter-Driving of Visual Flicker. Science, 145(3638), 1328–1330.

Simion, F., Regolin, L., & Bulf, H. (2008). A predisposition for biological motion in the newborn baby. Proc Natl Acad Sci U S A, 105(2), 809–813.

Simmers, A. J., Ledgeway, T., Hess, R. F., & McGraw, P. V. (2003). Deficits to global motion processing in human amblyopia. Vision Res, 43(6), 729–738. doi:10.1016/s0042-6989(02)00684-3

Simmers, A. J., Ledgeway, T., Mansouri, B., Hutchinson, C. V., & Hess, R. F. (2006). The extent of the dorsal extra-striate deficit in amblyopia. Vision Res, 46(16), 2571–2580. doi:10.1016/j.visres.2006.01.009

Sireteanu, R., Thiel, A., Fikus, S., & Iftime, A. (2008). Patterns of spatial distortions in human amblyopia are invariant to stimulus duration and instruction modality. Vision Res, 48(9), 1150–1163. doi:10.1016/j.visres.2008.01.028

Skoczenski, A. M., & Norcia, A. M. (2002). Late maturation of visual hyperacuity. Psychol Sci, 13(6), 537–541. doi:10.1111/1467-9280.00494

Slutsky, D. A., & Recanzone, G. H. (2001). Temporal and spatial dependency of the ventriloquism effect. Neuroreport, 12(1), 7–10.

Sokol, S. (1983). Abnormal evoked potential latencies in amblyopia. Br J Ophthalmol, 67(5), 310–314.

Soto-Faraco, S., & Alsius, A. (2009). Deconstructing the McGurk–MacDonald illusion. J Exp Psychol Hum Percept Perform, 35(2), 580–587. doi:10.1037/a0013483

Spang, K., & Fahle, M. (2009). Impaired temporal, not just spatial, resolution in amblyopia. Invest Ophthalmol Vis Sci, 50(11), 5207–5212.

Sparks, D. L. (1986). Translation of sensory signals into commands for control of saccadic eye movements: role of primate superior colliculus. Physiol Rev, 66(1), 118–171.

Sparks, D. L. (1988). Neural cartography: sensory and motor maps in the superior colliculus. Brain Behav Evol, 31(1), 49–56.

Spelke, E. (1976). Infants' intermodal perception of events. Cogn Psychol, 8(4), 553–560.

Spence, C., & Parise, C. (2010). Prior-entry: a review. Conscious Cogn, 19(1), 364–379. doi:10.1016/j.concog.2009.12.001

Spence, C., Shore, D. I., & Klein, R. M. (2001). Multisensory prior entry. J Exp Psychol Gen, 130(4), 799–832.

St John, R. (1998). Judgements of visual precedence by strabismics. Behav Brain Res, 90(2), 167–174.

Stein, B. E., Burr, D., Constantinidis, C., Laurienti, P. J., Meredith, M. A., Perrault, T. J., Jr., . . . Lewkowicz, D. J. (2010). Semantic confusion regarding the development of multisensory integration: a practical solution. Eur J Neurosci, 31(10), 1713–1720. doi:10.1111/j.1460-9568.2010.07206.x

Stein, B. E., & Meredith, M. A. (1993). The Merging of the Senses: The MIT Press.

Stein, B. E., Meredith, M. A., Huneycutt, W. S., & McDade, L. (1989). Behavioral indices of multisensory integration: orientation to visual cues is affected by auditory stimuli. J Cogn Neurosci, 1(1), 12–24. doi:10.1162/jocn.1989.1.1.12

Stein, B. E., Stanford, T. R., Ramachandran, R., Perrault, T. J., Jr., & Rowland, B. A. (2009). Challenges in quantifying multisensory integration: alternative criteria, models, and inverse effectiveness. Exp Brain Res, 198(2-3), 113–126. doi:10.1007/s00221-009-1880-8

Stein, B. E., Stanford, T. R., & Rowland, B. A. (2014). Development of multisensory integration from the perspective of the individual neuron. Nat Rev Neurosci, 15(8), 520–535. doi:10.1038/nrn3742

Stelmach, L. B., & Herdman, C. M. (1991). Directed attention and perception of temporal order. J Exp Psychol Hum Percept Perform, 17(2), 539.

Sterritt, G. M., Camp, B. W., & Lipman, B. S. (1966). Effects of early auditory deprivation upon auditory and visual information processing. Perceptual and motor skills.

Stevens, A. A., & Weaver, K. (2005). Auditory perceptual consolidation in early-onset blindness. Neuropsychologia, 43(13), 1901–1910. doi:10.1016/j.neuropsychologia.2005.03.007

Stevens, S. S., & Newman, E. B. (1936). The localization of actual sources of sound. The American Journal of Psychology, 48(2), 297–306. doi:10.2307/1415748

Stevenson, R. A., Fister, J. K., Barnett, Z. P., Nidiffer, A. R., & Wallace, M. T. (2012). Interactions between the spatial and temporal stimulus factors that influence multisensory integration in human performance. Exp Brain Res, 219(1), 121–137. doi:10.1007/s00221-012-3072-1

Stevenson, R. A., Siemann, J. K., Schneider, B. C., Eberly, H. E., Woynaroski, T. G., Camarata, S. M., & Wallace, M. T. (2014). Multisensory temporal integration in autism spectrum disorders. J Neurosci, 34(3), 691–697. doi:10.1523/JNEUROSCI.3615-13.2014

Stevenson, R. A., VanDerKlok, R. M., Pisoni, D. B., & James, T. W. (2011). Discrete neural substrates underlie complementary audiovisual speech integration processes. Neuroimage, 55(3), 1339–1345. doi:10.1016/j.neuroimage.2010.12.063

Stevenson, R. A., & Wallace, M. T. (2013). Multisensory temporal integration: task and stimulus dependencies. Exp Brain Res, 227(2), 249–261. doi:10.1007/s00221-013-3507-3

Stevenson, R. A., Wilson, M. M., Powers, A. R., & Wallace, M. T. (2013). The effects of visual training on multisensory temporal processing. Exp Brain Res, 225(4), 479–489. doi:10.1007/s00221-012-3387-y

Stevenson, R. A., Zemtsov, R. K., & Wallace, M. T. (2012). Individual differences in the multisensory temporal binding window predict susceptibility to audiovisual illusions. J Exp Psychol Hum Percept Perform, 38(6), 1517–1529. doi:10.1037/a0027339

Stewart, C. E., Moseley, M. J., & Fielder, A. R. (2011). Amblyopia therapy: an update. Strabismus, 19(3), 91–98. doi:10.3109/09273972.2011.600421

Stone, J., Hunkin, N., Porrill, J., Wood, R., Keeler, V., Beanland, M., . . . Porter, N. (2001). When is now? Perception of simultaneity. Proceedings of the Royal Society of London B: Biological Sciences, 268(1462), 31–38.

Strasburger, H. (2001). Converting between measures of slope of the psychometric function. Atten Percept Psychophys, 63(8), 1348–1355.

Stuart, J. A., & Burian, H. M. (1962). A study of separation difficulty. Its relationship to visual acuity in normal and amblyopic eyes. Am J Ophthalmol, 53(3), 471–477.

Student Support Services Team. (2008). School hearing screening guidelines. Albany, New York 12234: The University of the State of New York.

Subramanian, V., Jost, R. M., & Birch, E. E. (2013). A quantitative study of fixation stability in amblyopia. Invest Ophthalmol Vis Sci, 54(3), 1998–2003. doi:10.1167/iovs.12-11054

Sugita, Y., & Suzuki, Y. (2003). Audiovisual perception: Implicit estimation of sound-arrival time. Nature, 421(6926), 911. doi:10.1038/421911a

Sumby, W. H., & Pollack, I. (1954). Visual contribution to speech intelligibility in noise. J Acoust Soc Am, 26(2), 212–215.

Tallal, P. (1978). An experimental investigation of the role of auditory temporal processing in normal and disordered language development. Language acquisition and language breakdown: Parallels and divergences, 25–61.

Talsma, D., Senkowski, D., Soto-Faraco, S., & Woldorff, M. G. (2010). The multifaceted interplay between attention and multisensory integration. Trends Cogn Sci, 14(9), 400–410.

The Lasker/IRRF Initiative for Innovation in Vision Science. (2017). Amblyopia: Challenges and Opportunities. Retrieved from

Theeuwes, J. (1991). Exogenous and endogenous control of attention: The effect of visual onsets and offsets. Atten Percept Psychophys, 49(1), 83–90.

Thompson, J. R., Woodruff, G., Hiscox, F. A., Strong, N., & Minshull, C. (1991). The incidence and prevalence of amblyopia detected in childhood. Public health, 105(6), 455–462.

Thurlow, W. R., & Jack, C. E. (1973). Certain determinants of the “ventriloquism effect”. Perceptual and motor skills, 36(3 suppl), 1171–1184.

Titchener, E. B. (1908). Lectures on the Elementary Psychology of Feeling and Attention: Macmillan.

Tommila, V., & Tarkkanen, A. (1981). Incidence of loss of vision in the healthy eye in amblyopia. Br J Ophthalmol, 65(8), 575–577.

Tootell, R. B., Switkes, E., Silverman, M. S., & Hamilton, S. L. (1988). Functional anatomy of macaque striate cortex. II. Retinotopic organization. J Neurosci, 8(5), 1531–1568.

Tredici, T. D., & von Noorden, G. K. (1984). The Pulfrich effect in anisometropic amblyopia and strabismus. Am J Ophthalmol, 98(4), 499–503.

Tripathy, S. P., & Cavanagh, P. (2002). The extent of crowding in peripheral vision does not scale with target size. Vision Res, 42(20), 2357–2369.

Tsirlin, I., Colpa, L., Goltz, H. C., & Wong, A. M. (2015). Behavioral training as new treatment for adult amblyopia: a meta-analysis and systematic review. Invest Ophthalmol Vis Sci, 56(6), 4061–4075. doi:10.1167/iovs.15-16583

Tünnermann, J., Petersen, A., & Scharlau, I. (2015). Does attention speed up processing? Decreases and increases of processing rates in visual prior entry. J Vis, 15(3), 1–1.

Tuomainen, J., Andersen, T. S., Tiippana, K., & Sams, M. (2005). Audio-visual speech perception is special. Cognition, 96(1), B13–22. doi:10.1016/j.cognition.2004.10.004

Vaegan, & Taylor, D. (1979). Critical period for deprivation amblyopia in children. Trans Ophthalmol Soc U K, 99(3), 432–439.

van de Graaf, E. S., van der Sterre, G. W., van Kempen-du Saar, H., Simonsz, B., Looman, C. W., & Simonsz, H. J. (2007). Amblyopia and Strabismus Questionnaire (A&SQ): clinical validation in a historic cohort. Graefes Arch Clin Exp Ophthalmol, 245(11), 1589–1595. doi:10.1007/s00417-007-0594-5

van der Heijden, M., & Trahiotis, C. (1999). Masking with interaurally delayed stimuli: the use of “internal” delays in binaural detection. J Acoust Soc Am, 105(1), 388–399.

Van Esch, T., Lutman, M., Vormann, M., Lyzenga, J., Hällgren, M., Larsby, B., . . . Dreschler, W. (2015). Relations between psychophysical measures of spatial hearing and self-reported spatial-hearing abilities. International Journal of Audiology, 54(3), 182–189.

Van Essen, D. C., & Deyoe, E. A. (1995). Concurrent processing in the primate visual cortex. In M. S. Gazzaniga (Ed.), The cognitive neurosciences (pp. 383–400): MIT Press.

Van Essen, D. C., Newsome, W. T., & Maunsell, J. H. (1984). The visual field representation in striate cortex of the macaque monkey: asymmetries, anisotropies, and individual variability. Vision Res, 24(5), 429–448. doi:10.1016/0042-6989(84)90041-5

van Wassenhove, V., Grant, K. W., & Poeppel, D. (2007). Temporal window of integration in auditory-visual speech perception. Neuropsychologia, 45(3), 598–607. doi:10.1016/j.neuropsychologia.2006.01.001

Vatakis, A., & Spence, C. (2007). Crossmodal binding: evaluating the "unity assumption" using audiovisual speech stimuli. Percept Psychophys, 69(5), 744–756.

Vatakis, A., & Spence, C. (2008). Evaluating the influence of the 'unity assumption' on the temporal perception of realistic audiovisual stimuli. Acta Psychol (Amst), 127(1), 12–23. doi:10.1016/j.actpsy.2006.12.002

Vereecken, E. P., & Brabant, P. (1984). Prognosis for vision in amblyopia after the loss of the good eye. Arch Ophthalmol, 102(2), 220–224.

Vinding, T., Gregersen, E., Jensen, A., & Rindziunski, E. (2009). Prevalence of amblyopia in old people without previous screening and treatment. Acta Ophthalmologica, 69(6), 796–798. doi:10.1111/j.1755-3768.1991.tb02063.x

Von Békésy, G. (1930). Zur Theorie des Horens: Uber das Richtungshoren bei einer Zeitdifferenz oder Lautstarkenungleichheit der beiderseitigen Schalleinwirkungen. Physik. Z., 31, 824–835.

von Noorden, G. K., & Campos, E. (2002). Binocular Vision and Ocular Motility (6th ed.). St. Louis, MO: Mosby.

Voss, P., Lassonde, M., Gougoux, F., Fortin, M., Guillemot, J. P., & Lepore, F. (2004). Early- and late-onset blind individuals show supra-normal auditory abilities in far-space. Curr Biol, 14(19), 1734–1738. doi:10.1016/j.cub.2004.09.051

Vroomen, J., Bertelson, P., & De Gelder, B. (2001). The ventriloquist effect does not depend on the direction of automatic visual attention. Percept Psychophys, 63(4), 651–659. doi:10.3758/bf03194427

Vroomen, J., & Keetels, M. (2006). The spatial constraint in intersensory pairing: no role in temporal ventriloquism. J Exp Psychol Hum Percept Perform, 32(4), 1063–1071. doi:10.1037/0096-1523.32.4.1063

Vroomen, J., & Keetels, M. (2010). Perception of intersensory synchrony: a tutorial review. Atten Percept Psychophys, 72(4), 871–884. doi:10.3758/APP.72.4.871

Vroomen, J., & Stekelenburg, J. J. (2011). Perception of intersensory synchrony in audiovisual speech: not that special. Cognition, 118(1), 75–83. doi:10.1016/j.cognition.2010.10.002

Wada, Y., Kitagawa, N., & Noguchi, K. (2003). Audio-visual integration in temporal perception. Int J Psychophysiol, 50(1-2), 117–124.

Wali, N., Leguire, L., Rogers, G., & Bremer, D. (1991). CSF interocular interactions in childhood ambylopia. Optom Vis Sci, 68(2), 81–87.

Wallace, M. T. (2004). The development of multisensory processes. Cognitive Processing, 5(2), 69–83.

Wallace, M. T., Perrault, T. J., Jr., Hairston, W. D., & Stein, B. E. (2004). Visual experience is necessary for the development of multisensory integration. J Neurosci, 24(43), 9580–9584. doi:10.1523/JNEUROSCI.2535-04.2004

Wallace, M. T., & Stein, B. E. (2001). Sensory and multisensory responses in the newborn monkey superior colliculus. J Neurosci, 21(22), 8886–8894.

Wallace, M. T., & Stein, B. E. (2007). Early experience determines how the senses will interact. J Neurophysiol, 97(1), 921–926. doi:10.1152/jn.00497.2006

Wallace, M. T., Wilkinson, L. K., & Stein, B. E. (1996). Representation and integration of multiple sensory inputs in primate superior colliculus. J Neurophysiol, 76(2), 1246–1266.

Warncke, H. (1941). The fundamentals of room-related stereophonic reproduction in sound films. Akust. Zh, 6, 174–188.

Warren, D. H., Welch, R. B., & McCarthy, T. J. (1981). The role of visual-auditory “compellingness” in the ventriloquism effect: Implications for transitivity among the spatial senses. Percept Psychophys, 30(6), 557–564.

Warren, R. M. (2008). Auditory Perception: An Analysis and Synthesis: Cambridge University Press.

Watkins, S., Shams, L., Tanaka, S., Haynes, J. D., & Rees, G. (2006). Sound alters activity in human V1 in association with illusory visual perception. Neuroimage, 31(3), 1247–1256. doi:10.1016/j.neuroimage.2006.01.016

Watts, P. O., Neveu, M. M., Holder, G. E., & Sloper, J. J. (2002). Visual evoked potentials in successfully treated strabismic amblyopes and normal subjects. J AAPOS, 6(6), 389–392. doi:10.1067/mps.2002.129046

Weaver, K. E., & Stevens, A. A. (2006). Auditory gap detection in the early blind. Hear Res, 211(1-2), 1–6. doi:10.1016/j.heares.2005.08.002

Webber, A. L., Wood, J. M., Gole, G. A., & Brown, B. (2008). Effect of amblyopia on self-esteem in children. Optom Vis Sci, 85(11), 1074–1081. doi:10.1097/OPX.0b013e31818b9911

Welch, R. B. (1999). Meaning, attention, and the “unity assumption” in the intersensory bias of spatial and temporal perceptions. Advances in psychology, 129, 371–387.

Welch, R. B., & Warren, D. H. (1980). Immediate perceptual response to intersensory discrepancy. Psychol Bull, 88(3), 638–667. doi:10.1037/0033-2909.88.3.638

Wertheimer, M. (1961). Psychomotor Coordination of Auditory and Visual Space at Birth. Science, 134(3491), 1692–1692. doi:10.1126/science.134.3491.1692

Wesson, M. D., & Loop, M. S. (1982). Temporal contrast sensitivity in amblyopia. Invest Ophthalmol Vis Sci, 22(1), 98–102.

Wiesel, T. N., & Hubel, D. H. (1963a). Effects of visual deprivation on morphology and physiology of cells in the cat's lateral geniculate body. J Neurophysiol, 26(6), 978–993.

Wiesel, T. N., & Hubel, D. H. (1963b). Single-cell responses in striate cortex of kittens deprived of vision in one eye. J Neurophysiol, 26(6), 1003–1017.

Wightman, F., Allen, P., Dolan, T., Kistler, D., & Jamieson, D. (1989). Temporal resolution in children. Child Dev, 60(3), 611–624. doi:10.2307/1130727

Wightman, F. L., & Kistler, D. J. (1989a). Headphone simulation of free‐field listening. I: stimulus synthesis. J Acoust Soc Am, 85(2), 858–867.

Wightman, F. L., & Kistler, D. J. (1989b). Headphone simulation of free‐field listening. II: Psychophysical validation. J Acoust Soc Am, 85(2), 868–878.

Williams, C., Azzopardi, P., & Cowey, A. (1995). Nasal and temporal retinal ganglion cells projecting to the midbrain: implications for "blindsight". Neuroscience, 65(2), 577–586.

Williams, C., Northstone, K., Howard, M., Harvey, I., Harrad, R. A., & Sparrow, J. M. (2008). Prevalence and risk factors for common vision problems in children: data from the ALSPAC study. Br J Ophthalmol, 92(7), 959–964. doi:10.1136/bjo.2007.134700

Witten, I. B., & Knudsen, E. I. (2005). Why seeing is believing: merging auditory and visual worlds. Neuron, 48(3), 489–496. doi:10.1016/j.neuron.2005.10.020

Wittmann, M. (1999). Time Perception and Temporal Processing Levels of the Brain. Chronobiology International, 16(1), 17–32. doi:10.3109/07420529908998709

Wong, A. M., Richards, M. D., & Goltz, H. C. (2017, 4 April). The effect of amblyopia on the developmental calibration of sound localization. Paper presented at the 43rd annual NANOS meeting, Washington, DC.

Wright, D., Hebrank, J. H., & Wilson, B. (1974). Pinna reflections as cues for localization. J Acoust Soc Am, 56(3), 957–962.

Yarrow, K., Jahn, N., Durant, S., & Arnold, D. H. (2011). Shifts of criteria or neural timing? The assumptions underlying timing perception studies. Conscious Cogn, 20(4), 1518–1531. doi:10.1016/j.concog.2011.07.003

Yu, L., Rowland, B. A., & Stein, B. E. (2010). Initiating the development of multisensory integration by manipulating sensory experience. J Neurosci, 30(14), 4904–4913. doi:10.1523/JNEUROSCI.5575-09.2010

Yu, M., Brown, B., & Edwards, M. H. (1998). Investigation of multifocal visual evoked potential in anisometropic and esotropic amblyopes. Invest Ophthalmol Vis Sci, 39(11), 2033–2040.

Yuille, A. L., & Bulthoff, H. H. (1996). Bayesian decision theory and psychophysics. In D. C. Knill & W. Richards (Eds.), Perception as Bayesian inference (pp. 123–161): Cambridge University Press.

Zackon, D. H., Casson, E. J., Zafar, A., Stelmach, L., & Racette, L. (1999). The temporal order judgment paradigm: subcortical attentional contribution under exogenous and endogenous cueing conditions. Neuropsychologia, 37(5), 511–520. doi:10.1016/S0028-3932(98)00134-1

Zampini, M., Guest, S., Shore, D. I., & Spence, C. (2005). Audio-visual simultaneity judgments. Percept Psychophys, 67(3), 531–544.

Zampini, M., Shore, D. I., & Spence, C. (2005). Audiovisual prior entry. Neurosci Lett, 381(3), 217–222. doi:10.1016/j.neulet.2005.01.085

Zhang, W., & Zhao, K. (2005). Multifocal VEP difference between early- and late-onset strabismus amblyopia. Doc Ophthalmol, 110(2-3), 173–180. doi:10.1007/s10633-005-4312-5

Zürcher, B., & Lang, J. M. (1979). Reading capacity in cases of 'cured' strabismic amblyopia. Trans Ophthalmol Soc U K, 100(4), 501–503.

Zwiers, M. P., Van Opstal, A. J., & Cruysberg, J. R. M. (2001). A spatial hearing deficit in early-blind humans. J Neurosci, 21(9), RC142–RC142.

Zwiers, M. P., Versnel, H., & Van Opstal, A. J. (2004). Involvement of monkey inferior colliculus in spatial hearing. J Neurosci, 24(17), 4145–4156. doi:10.1523/JNEUROSCI.0199-04.2004

Copyright Acknowledgements

The work contained within Study III was previously published in: Richards, M. D., Goltz, H. C., & Wong, A. M. F. (2017). Alterations in audiovisual simultaneity perception in amblyopia. PLoS one, 12(6). Its text and figures have been reformatted for inclusion in this thesis, with permission under the Creative Commons Attribution 4.0 International (CC BY 4.0) license.
