Audiovisual Processing and Integration in Amblyopia
by
Michael David Richards
A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy
Institute of Medical Science University of Toronto
© Copyright by Michael David Richards 2018
Abstract
Amblyopia is a developmental visual disorder caused by abnormal visual experience during early
life. Accumulating evidence points to perceptual deficits in amblyopia beyond vision, in the
realm of audiovisual multisensory perception. This thesis presents a systematic psychophysical
investigation of audiovisual processing and integration in adults with unilateral amblyopia. Study
I examines audiovisual spatial integration and reveals amblyopic deficits in localization precision
for unisensory visual and auditory stimuli, but statistically optimal integration according to the
maximum likelihood estimation model of multisensory integration. Study II confirms the novel
deficit in sound localization described in Study I, and reveals a non-uniform spatial pattern of
sound localization deficits that implicates the superior colliculus as a primary neural locus
affected by abnormal visual experience. Study III measures audiovisual simultaneity perception,
and shows that asynchronous audiovisual pairs are perceived as synchronous over wider
temporal intervals than normal, regardless of which eye is viewing. Study IV examines
audiovisual temporal integration using the temporal ventriloquism effect, and reveals successful
temporal integration in amblyopia, but possibly over a wider interval of audiovisual asynchrony.
In sum, the findings suggest that the capacities for spatial and temporal audiovisual integration
are intact in amblyopia, but that non-integrative multisensory processes, including cross-modal
temporal matching and cross-sensory calibration of sound localization, are impaired.
Acknowledgments
I would like to express my sincere gratitude to a number of people and organizations that
contributed in some way to the realization of this thesis.
To my primary supervisor, Dr. Agnes Wong, thank you for your invaluable mentorship,
guidance, and support over the course of this research program. I am grateful for your
encouragement along this winding path, for the intellectual freedom you afforded me in my
work, and for your commitment to the ideal of merging scientific rigour with clinical
compassion. You have been an inspiration to me in pursuing this career.
To my co-supervisor, Dr. Herb Goltz, thank you for your day-to-day engagement with the
challenges and triumphs of my graduate experience. I am grateful for the encouragement,
knowledge, and thoughtful advice you shared with me during our innumerable scientific
discussions. Your mentorship has fundamentally shaped my training and this work.
To the members of my advisory committee, Dr. Karen Gordon, Dr. Bob Harrison, and Dr.
Daphne Maurer, thank you for your invaluable guidance and scientific insights that have pushed
me to become a better scientist. I am grateful for the time and energy you have devoted to seeing
me through this program.
I am also grateful for the administrative and financial support provided by the Clinician
Investigator Program and the Vision Science Research Program at the University of Toronto.
To past and present members of the lab, thank you for your friendship on this journey. Thank
you to Luke Gane and Al Blakeman for your technical expertise. Thank you to Linda Colpa for
recruiting and screening subjects, to Inna Tsirlin for early critiques of the research proposal, and
to Cindy Narinesingh for software support. Thank you to my desk mates Jaime Sklar, Marija
Zivcevska, and Shaobo Lei for the camaraderie on our side of ‘the wall’, and to Arham Raashid
for your friendship and unwavering enthusiasm for science.
To my family and friends, thank you for loving and supporting me through these transformative
years. I am eternally grateful to my mom and dad for their belief in my ability to confront and
overcome any obstacle. And to my partner, Tanner, thank you for your love, patience, and
support—I could not have finished this without you.
Table of Contents
Acknowledgments
Table of Contents
Statement of Contributions
List of Abbreviations
List of Tables
List of Figures
Chapter 1 General Introduction
1.1 Amblyopia
1.1.1 Overview
1.1.2 Etiology of Amblyopia
1.1.3 Abnormalities in Spatiotemporal Visual Processing in Amblyopia
1.1.4 Neural Basis of Amblyopia
1.1.5 Normal Visual Development and Sensitive Periods for Damage and Recovery
1.2 Auditory Processing
1.2.1 Overview
1.2.2 Auditory Spatial Processing
1.2.3 Auditory Temporal Processing
1.3 Multisensory Processing and Integration
1.3.1 Overview
1.3.2 Influence of Cognitive Factors in Multisensory Processing
1.3.3 Neural Sites of Multisensory Processing
1.3.4 Multisensory Integration
1.3.5 Theories of Multisensory Integration and Modality Dominance
1.3.6 Development of Multisensory Processes
1.3.7 Cross-Sensory Calibration Hypothesis
1.3.8 Selected Psychophysical Measures of Audiovisual Processing and Integration
1.4 Multisensory Processing in Amblyopia
1.4.1 Audiovisual Temporal and Spatial Perception
1.4.2 Audiovisual Speech Perception
1.5 Summary
Chapter 2 Study Aims and Hypotheses
2.1 General Rationale and Research Aims
2.2 Specific Study Aims and Hypotheses
2.2.1 Audiovisual Spatial Perception
2.2.2 Audiovisual Temporal Perception
Chapter 3 Study I
Study I: Optimal Audiovisual Integration in the Ventriloquism Effect but Pervasive Deficits in Unisensory Spatial Localization in Amblyopia
3.1 Abstract
3.2 Introduction
3.3 Methods
3.3.1 Participants
3.3.2 Apparatus and Stimuli
3.3.3 Procedure
3.3.4 Data Analysis
3.4 Results
3.4.1 Localization Performance
3.4.2 Testing the Maximum Likelihood Estimation Model
3.5 Discussion
Chapter 4 Study II
Study II: Amblyopia and the Developmental Calibration of Sound Localization
4.1 Abstract
4.2 Introduction
4.3 Methods
4.3.1 Experiment 1: Relative sound localization—minimum audible angle task using speaker array
4.3.2 Experiment 2: Absolute Auditory Localization
4.3.3 Experiment 3: Replication of MAA task using stereo speaker apparatus (amplitude panning)
4.4 Results
4.4.1 Experiment 1
4.4.2 Experiment 2
4.4.3 Experiment 3
4.5 Discussion
Chapter 5 Study III
Study III: Alterations in Audiovisual Simultaneity Perception in Amblyopia
5.1 Abstract
5.2 Introduction
5.3 Materials and Methods
5.3.1 Participants
5.3.2 Apparatus and Stimuli
5.3.3 Procedure
5.3.4 Analysis
5.4 Results
5.4.1 Binocular Viewing Condition
5.4.2 Monocular Viewing Conditions
5.5 Discussion
Chapter 6 Study IV
Study IV: Temporal Ventriloquism Reveals Normal Audiovisual Temporal Integration in Amblyopia
6.1 Abstract
6.2 Introduction
6.3 Methods
6.3.1 Participants
6.3.2 Apparatus and Stimuli
6.3.3 Design and Procedure
6.3.4 Data Analysis
6.4 Results
6.5 Discussion
Chapter 7 General Discussion and Conclusions
7.1 Summary of Findings and Evaluation of Specific Hypotheses
7.1.1 Audiovisual Spatial Perception
7.1.2 Audiovisual Temporal Perception
7.2 Is Audiovisual Integration Impaired in Amblyopia?
7.2.1 Possible Mechanisms for the Pattern of Audiovisual Integration Abnormalities in Amblyopia
7.3 Are Non-integrative Audiovisual Processes Impaired in Amblyopia?
7.3.1 Cross-modal Matching
7.3.2 Unisensory Impairments and Cross-sensory Calibration
7.4 Clinical Implications
7.5 Conclusions
Chapter 8 Future Directions
8.1 Development and Mechanisms of Multisensory Processing and Integration
8.2 Nature and Extent of Perceptual Impairments in Amblyopia
8.3 Looking to the Future of Amblyopia Therapy
References
Copyright Acknowledgements
Statement of Contributions
Dr. Michael Richards (author) – all aspects of this work, including but not limited to:
experimental design, data collection, data analysis, data interpretation, thesis preparation
Dr. Agnes Wong (supervisor) – mentorship, assistance with experimental design, assistance with
data interpretation, assistance with thesis preparation
Dr. Herbert Goltz (co-supervisor) – mentorship, assistance with experimental design, assistance
with data interpretation, assistance with thesis preparation
Dr. Daphne Maurer (committee member) – mentorship, assistance with experimental design,
assistance with data interpretation, assistance with thesis preparation
Dr. Karen Gordon (committee member) – mentorship, assistance with experimental design,
assistance with data interpretation, assistance with thesis preparation
Dr. Robert Harrison (committee member) – mentorship, assistance with experimental design,
assistance with data interpretation, assistance with thesis preparation
Linda Colpa – assistance with participant recruitment, assistance with data interpretation
Arham Raashid – assistance with data analysis, assistance with data interpretation
Luke Gane – assistance with experimental apparatus, assistance with data analysis
Alan Blakeman – assistance with experimental apparatus, assistance with data analysis
Jaime Sklar – assistance with data collection, assistance with data interpretation
Cindy Narinesingh – assistance with data analysis
Dr. Inna Tsirlin – assistance with experimental design
List of Abbreviations
AE Amblyopic eye
ANCOVA Analysis of covariance
ANOVA Analysis of variance
AV Audiovisual
BOLD Blood oxygenation level-dependent
DNLL Dorsal nucleus of the lateral lemniscus
ETDRS Early Treatment Diabetic Retinopathy Study
FE Fellow eye (as compared to the amblyopic eye)
fMRI Functional magnetic resonance imaging
HRTF Head-related transfer function
IC Inferior colliculus
ILD Interaural level difference
IPS Intraparietal sulcus
ITD Interaural time difference
JND Just noticeable difference
LE Left eye
LED Light emitting diode
LGN Lateral geniculate nucleus
logMAR Logarithm of the minimum angle of resolution
LSO Lateral superior olive
MAA Minimum audible angle
MLE Maximum likelihood estimation
MNTB Medial nucleus of the trapezoid body
MRI Magnetic resonance imaging
MSO Medial superior olive
MT Middle temporal visual area
PET Positron emission tomography
PSE Point of subjective equality
PSS Point of subjective simultaneity
RE Right eye
RMS Root mean square
SC Superior colliculus
SD Standard deviation
SOA Signal onset asynchrony
STS Superior temporal sulcus
TOJ Temporal order judgment
V1 Primary visual cortex, or striate cortex
V2, V3, V3a, Vp, V4+, V8 Extrastriate visual cortices
VEP Visual evoked potential
List of Tables
Table 3.1: Characteristics of participants with amblyopia
Table 3.2: Probe stimulus displacements used for each test stimulus condition
Table 4.1: Clinical details of participants with amblyopia in Experiment 1
Table 4.2: Clinical details of participants with amblyopia in Experiment 2
Table 5.1: Characteristics of participants with amblyopia
Table 5.2: Audiovisual simultaneity window parameters by main group
Table 5.3: Audiovisual simultaneity window parameters by amblyopia severity
Table 5.4: Audiovisual simultaneity window parameters by amblyopia etiology
Table 5.5: Audiovisual simultaneity window parameters by suppression status
Table 5.6: Audiovisual simultaneity window parameters by stereopsis level
Table 5.7: Comparison of audiovisual simultaneity window parameters by viewing condition for participants with amblyopia (repeated measures ANOVA)
Table 6.1: Clinical characteristics of participants with amblyopia
Table 6.2: Visual temporal order judgment performance in the control and amblyopia groups
List of Figures
Figure 1.1: Visual evoked potential P1 amplitude and latency distributions from trial-by-trial analysis from 18 adults with unilateral amblyopia
Figure 1.2: Schematic diagram of retinal projections in the retinostriate and retinocollicular pathways
Figure 1.3: Summary of putative multisensory areas of the human brain based on primate anatomical data, human psychophysical data, and functional neuroimaging studies
Figure 1.4: Posterior-to-anterior audiovisual processing gradient in the human STS
Figure 1.5: A hypothetical audiovisual simultaneity window
Figure 1.6: A diagram of the spatial ventriloquism effect
Figure 1.7: Examples of audiovisual stimulus conditions that elicit the temporal ventriloquism effect
Figure 3.1: Audiovisual apparatus for the presentation of visual blobs and auditory clicks
Figure 3.2: Illustration of the trial timeline
Figure 3.3: Unimodal and bimodal localization task performance
Figure 3.4: Localization precision for visual-only, auditory-only, and spatially congruent bimodal audiovisual stimuli
Figure 3.5: Bimodal localization bias for audiovisual stimuli with spatial conflict
Figure 3.6: Bimodal localization precision, as observed and as predicted by the MLE model
Figure 3.7: Maximal bimodal advantage ratio for localization precision, observed, as predicted by the MLE model, and as predicted by integration failure
Figure 3.8: Perceptual weight for vision (wV), observed and as predicted by the MLE model
Figure 3.9: Visual blob size equivalent to the auditory click in terms of spatial precision (on unimodal presentation) and perceptual weight (on bimodal presentation)
Figure 4.1: Apparatus for Experiment 1, a horizontal array of 11 speakers with a central fixation LED
Figure 4.2: Apparatus for Experiment 2, stereo speakers with LED monitor
Figure 4.3: Relative sound localization performance on a horizontal speaker array
Figure 4.4: Absolute sound localization performance
Figure 4.5: Correlations between RMS error for sound localization and clinical measures of amblyopia across auditory target positions
Figure 4.6: Relative sound localization performance on stereo speaker apparatus
Figure 4.7: Correlation between minimum audible angle (MAA) values determined by amplitude panning (Experiment 3) and by physical speakers (Experiment 1)
Figure 5.1: Schematic diagram of signal onset asynchronies (SOA) for auditory-lead and visual-lead conditions
Figure 5.2: Sample audiovisual simultaneity judgment data from a visually normal control participant, fitted with a truncated Gaussian function by the maximum likelihood method
Figure 5.3: Main group analysis for audiovisual simultaneity judgment responses with both eyes viewing as a function of SOA
Figure 5.4: Subgroup analyses for audiovisual simultaneity judgment responses with both eyes viewing as a function of SOA
Figure 5.5: The audiovisual simultaneity window for binocular and monocular viewing conditions among participants with amblyopia
Figure 6.1: Schematic of the apparatus and stimuli that induce the temporal ventriloquism effect
Figure 6.2: The temporal ventriloquism effect with and without intact audiovisual integration
Figure 6.3: The audiovisual apparatus
Figure 6.4: Visual temporal order judgment performance for visual-only stimuli and audiovisual stimuli with synchronous clicks (AV sync)
Figure 6.5: The temporal ventriloquism effect in the control group and the amblyopia group
Figure 6.6: Relation between susceptibility to the temporal ventriloquism effect and visual acuity in the amblyopic eye across click timing conditions in which the second click lagged the onset of the second light
Figure 6.7: Relation between susceptibility to the temporal ventriloquism effect and stereo acuity across click lag conditions in participants with amblyopia
Figure 8.1: Possible mechanisms that determine the temporal window of audiovisual integration
Chapter 1 General Introduction
1.1 Amblyopia
1.1.1 Overview
The term amblyopia is derived from the Greek words amblys, meaning ‘dulled’ or ‘blunt’, and
ops, meaning ‘eye’, and literally means ‘dimness of vision’. Amblyopia, also commonly referred
to as ‘lazy eye’, has been traditionally defined as a decrease in visual acuity in the absence of any
apparent ocular defect to account for the impairment (von Noorden & Campos, 2002). The
traditional definition, however, is incomplete. Evidence from animal studies (Hubel & Wiesel,
1970; Kiorpes, Kiper, O'Keefe, Cavanaugh, & Movshon, 1998; Movshon et al., 1987) and more
recently from human neuroimaging studies (Goodyear, Nicolle, Humphrey, & Menon, 2000)
shows that the visual deficit in amblyopia is related to dysfunctional processing of visual
information. Amblyopia is accompanied by one or more amblyogenic factors that disrupted
normal visual experience during a sensitive period in the development of the visual pathways in
infancy or early childhood (Birch, 2013). Therefore, amblyopia may be more precisely defined
as an impairment in visual processing caused by abnormal visual experience during a critical
period in the first years of life (Holmes & Clarke, 2006).
Clinically, amblyopia can be defined as a unilateral, or rarely bilateral, reduction in best-
corrected visual acuity that cannot be directly attributed to a structural eye abnormality
(American Academy of Ophthalmology Pediatric Ophthalmology/Strabismus Panel, 2012; The
Lasker/IRRF Initiative for Innovation in Vision Science, 2017). Additionally, the diagnosis of
amblyopia requires a history of one or more amblyogenic factors, most commonly strabismus
(eye misalignment) and anisometropia (difference in refractive error between the eyes), or more
rarely visual deprivation (most often from congenital cataract), that interfered with pattern vision
or normal binocular interaction in early life (Preslan & Novak, 1996). Because the majority of
amblyopia is unilateral, a widely used and practical definition of amblyopia is an inter-ocular
difference in best-corrected visual acuity of 2 or more lines (i.e. ≥ 0.2 logMAR) on a standard eye
chart (Holmes & Clarke, 2006). Visual acuity differences of less than 2 lines are generally within
the range of test-retest variability for normal observers (Holmes et al., 2001).
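
The acuity criterion above can be illustrated with a minimal sketch. It assumes a logMAR-style chart on which each line corresponds to 0.1 log unit, so that a 2-line difference equals 0.2 logMAR. The function names and the small floating-point tolerance are hypothetical, chosen only for illustration. The sketch covers the acuity criterion alone and does not capture the required history of an amblyogenic factor or the exclusion of structural eye abnormality.

```python
import math

# Assumption: on a logMAR-style chart, each line corresponds to 0.1 log unit.
LOGMAR_PER_LINE = 0.1


def snellen_to_logmar(numerator: float, denominator: float) -> float:
    """Convert a Snellen fraction (e.g. 20/40) to logMAR.

    The minimum angle of resolution (MAR, in arcmin) is denominator / numerator,
    and logMAR is its base-10 logarithm.
    """
    return math.log10(denominator / numerator)


def meets_acuity_criterion(logmar_worse: float, logmar_better: float,
                           min_lines: int = 2) -> bool:
    """Return True if the inter-ocular difference is at least `min_lines` chart lines.

    Hypothetical helper: a small tolerance absorbs floating-point rounding.
    """
    difference = logmar_worse - logmar_better
    return difference >= min_lines * LOGMAR_PER_LINE - 1e-9


# Usage: 20/40 in one eye versus 20/20 in the other is about a 0.3 logMAR difference.
worse = snellen_to_logmar(20, 40)
better = snellen_to_logmar(20, 20)
print(round(worse - better, 2), meets_acuity_criterion(worse, better))
```

A 1-line difference (0.1 logMAR) would not meet this criterion, in keeping with the test-retest variability noted above.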
Amblyopia is a significant public health concern among children and adults alike. The
prevalence of amblyopia among children in developed countries is estimated at 2% to 4%
(Donnelly, Stewart, & Hollinger, 2005; Friedman et al., 2009; Preslan & Novak, 1996; Robaei et
al., 2006; Thompson, Woodruff, Hiscox, Strong, & Minshull, 1991; Williams et al., 2008).
Failure to detect and treat amblyopia in childhood means that its effects are often lifelong.
Indeed, amblyopia is the leading cause of persistent monocular blindness among adults (Buch,
Vinding, La Cour, & Nielsen, 2001; Krueger & Ederer, 1984), and its estimated prevalence in
adult populations is approximately 3% (Attebo et al., 1998; Brown et al., 2000; Vinding,
Gregersen, Jensen, & Rindziunski, 2009). In addition to the unilateral visual deficit acquired in
childhood, people with amblyopia are at markedly increased risk of bilateral blindness compared
to the general population, most commonly from trauma to the fellow eye (Tommila &
Tarkkanen, 1981). The lifetime risk of serious visual impairment in the fellow eye is estimated at
1.2% to 3.3% (Rahi, Logan, Timms, Russell-Eggitt, & Taylor, 2002). In total, health economists
estimate that untreated amblyopia accounts for over $7 billion in lost earning power annually in
the United States (Membreno, Brown, Brown, Sharma, & Beauchamp, 2002).
The impacts of amblyopia extend beyond visual perception as well. Quality of life studies report
that amblyopia has negative effects on social interactions (Horwood, Waylen, Herrick, Williams,
& Wolke, 2005; Packwood, Cruz, Rychwalski, & Keech, 1999; van de Graaf et al., 2007), self-
esteem and self-image (Packwood et al., 1999; Webber, Wood, Gole, & Brown, 2008), sports
involvement (Packwood et al., 1999), educational attainment (Chua & Mitchell, 2004), and
ultimate career choice (Adams & Karas, 1999; Packwood et al., 1999) (see Carlton and
Kaltenthaler (2011) for review). Although vision problems are widely acknowledged not to
cause primary dyslexia or learning disabilities (American Academy of Pediatrics Section on
Ophthalmology Council on Children with Disabilities, American Academy of Ophthalmology,
American Association for Pediatric Ophthalmology and Strabismus, & American Association of
Certified Orthoptists, 2009), school-aged children with unilateral amblyopia read more slowly
than their typically-sighted counterparts, even under natural binocular viewing conditions
(Kanonidou, Proudlock, & Gottlob, 2010; Kelly, Jost, De La Cruz, & Birch, 2015). Eye-hand
coordination is also affected, with slower and less precise reaching and grasping (Grant,
Melmoth, Morgan, & Finlay, 2007; Niechwiej-Szwedo, Goltz, Chandrakumar, Hirji, & Wong,
2011; Niechwiej-Szwedo, Goltz, Chandrakumar, & Wong, 2012). Additionally, perceptual
abnormalities in audiovisual multisensory processing are evident in children and adults with
unilateral amblyopia (Burgmeier et al., 2015; Chen, Lewis, Shore, & Maurer, 2017; Narinesingh,
Goltz, Raashid, & Wong, 2015; Narinesingh, Goltz, & Wong, 2017; Narinesingh, Wan, Goltz,
Chandrakumar, & Wong, 2014).
The current gold standard for the treatment of unilateral amblyopia in children involves
refractive correction and occlusion (i.e., patching) or pharmacological penalization of the fellow
eye to promote use of the amblyopic eye (American Academy of Ophthalmology Pediatric
Ophthalmology/Strabismus Panel, 2012; Stewart, Moseley, & Fielder, 2011). In the case of
visual deprivation, the cause of visual obstruction must be addressed first. If strabismus is
present, however, amblyopia treatment may commence immediately, before eye muscle surgery
to straighten the eyes. The frequency and duration of occlusion or penalization prescribed are
generally determined by the severity of the acuity deficit, and may continue for months to a year
or more, until gains in visual acuity reach a plateau. Using a primary endpoint of 0.3 logMAR
(i.e., 20/40 Snellen equivalent), the overall success rate for occlusion therapy in young children
is about 75% (Flynn, Schiffman, Feuer, & Corona, 1998). Treatment is considerably less
effective after 7 years of age, but small improvements have been observed into late adolescence
(Campos, 1995; Holmes et al., 2011; Lea, Loades, & Rubinstein, 1989; Scheiman et al., 2005).
Novel therapies such as dichoptic games (Holmes et al., 2016), dark exposure (Duffy & Mitchell,
2013) and retinal inactivation (Fong, Mitchell, Duffy, & Bear, 2016) hold promise, but their
efficacy is as yet unproven in humans.
1.1.2 Etiology of Amblyopia
Amblyopia is typically classified according to the amblyogenic factor presumed to have
interfered with visual experience during the critical period in visual maturation. Refractive
amblyopia is caused by chronic retinal defocus associated with untreated refractive error in one
or both eyes. Unilateral refractive amblyopia is termed anisometropic amblyopia, and occurs
when the refractive error between the two eyes is unequal. The more hyperopic (i.e., far-sighted)
eye typically receives the more defocused retinal image and becomes amblyopic (Birch, 2013).
Bilateral refractive amblyopia is much less common, and occurs in cases of high refractive error
affecting both eyes. Strabismic amblyopia is associated with misalignment of the visual axes.
Constant, non-alternating strabismus and eso-deviations are particularly amblyogenic, and lead
to amblyopia in the non-fixating eye (Birch, 2013). Mixed-mechanism amblyopia is the term
applied to cases that exhibit both anisometropia and strabismus as amblyogenic factors.
Deprivational amblyopia is caused by complete or partial obstruction of the visual axis in one or
both eyes. It is most commonly associated with congenital cataract, but may also be observed in
cases of severe ptosis, corneal opacity, and vitreous hemorrhage. Deprivational amblyopia is the
rarest form of amblyopia, has the earliest onset, and generally causes visual impairment that is more
severe and refractory to treatment (American Academy of Ophthalmology Pediatric
Ophthalmology/Strabismus Panel, 2012). This etiological classification of
amblyopia is not merely a matter of convenience; it is also supported by differences in epidemiology
and the pattern of visual deficits (see section 1.1.3) among the groups.
In adults with a history of amblyopia, the most common etiology is anisometropia in 50%,
followed by mixed-mechanism in 27%, strabismus in 19% and deprivation in 4% (Attebo et al.,
1998). Among children, however, the relative prevalence depends on the age group under study.
Below 3 years of age, 82% of amblyopia is associated with strabismus, only 5% is associated
with anisometropia, and 13% is associated with mixed mechanism (Birch & Holmes, 2010).
Between 3 and 6 years of age, however, the proportion associated with strabismus decreases to
38%, and the proportions associated with anisometropia and mixed mechanism rise to 37% and
24%, respectively (Repka et al., 2002). These differences by age cohort suggest differential
sensitivity to amblyogenic factors as visual development progresses, with strabismus being a
stronger influence before age 3 years, and anisometropia emerging as a significant influence
primarily after age 3 years (Birch, 2013). Indeed, a longitudinal study of infants showed that
anisometropia does not confer increased risk of amblyopia unless it persists for 3 years
(Abrahamsson, Fabian, & Sjostrand, 1990).
The different etiologies of amblyopia also show differential responsiveness to treatment. Using a
final visual acuity of 0.3 logMAR (20/40) as the definition of treatment success, a meta-analysis
of 23 studies on occlusion therapy reported a success rate of 78% in strabismic amblyopia, 67%
in anisometropic amblyopia, and 59% in mixed-mechanism amblyopia (Flynn et al., 1998).
Furthermore, it reported slightly superior mean final visual acuity for anisometropic and
strabismic amblyopia (0.30 logMAR and 0.27 logMAR, respectively) than for mixed mechanism
amblyopia (0.43 logMAR). Monocular deprivational amblyopia is usually considered separately
in clinical studies because of its early age at presentation and need for urgent surgical
intervention. Anecdotally, it is regarded as more severe and resistant to therapy than other forms
of amblyopia (American Academy of Ophthalmology Pediatric Ophthalmology/Strabismus
Panel, 2012), but the visual outcome for deprivational amblyopia is highly dependent on the age at
which treatment is initiated (Birch & Stager, 1988; Birch, Stager, & Wright, 1986; Birch,
Swanson, Stager, Woody, & Everett, 1993; Kugelberg, 1992). For unilateral congenital cataract,
cases treated before 2 months of age achieve a mean visual acuity of 0.38 logMAR (20/48),
compared to 0.89 logMAR (20/155) in cases treated at 3 months of age or later (Birch, Stager,
Leffler, & Weakley, 1998). A recent study reported that the current standard for cataract surgery,
performed at a median age of 1.8 months, achieves a visual acuity of 0.3 logMAR (20/40) in
only 28% of cases (Lambert et al., 2010; Lambert, DuBois, Cotsonis, Hartmann, & Drews-
Botsch, 2016).
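The paired logMAR and Snellen values quoted in this section follow from a simple relation: the minimum angle of resolution (in arcminutes) is 10 raised to the logMAR value, and the Snellen denominator is the numerator multiplied by that angle. The following is a minimal sketch of the conversion, not a clinical tool; the function name is hypothetical, and the 20-foot numerator is assumed because the text reports 20/x equivalents.

```python
def logmar_to_snellen_denominator(logmar, numerator=20):
    """Return the Snellen denominator for a logMAR acuity.

    MAR (minimum angle of resolution, arcmin) = 10 ** logMAR, and the
    Snellen fraction is numerator / (numerator * MAR).
    """
    return numerator * 10 ** logmar


# Acuity values quoted in the surrounding text.
for logmar in (0.3, 0.38, 0.89):
    denominator = logmar_to_snellen_denominator(logmar)
    print(f"{logmar:.2f} logMAR is approximately 20/{denominator:.0f}")
```

Rounding the denominator to the nearest integer reproduces the Snellen equivalents given in the text for 0.3, 0.38 and 0.89 logMAR.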
1.1.3 Abnormalities in Spatiotemporal Visual Processing in Amblyopia
Amblyopia is typically detected and diagnosed by a reduction in optotype acuity, but its
associated findings include a wide range of visual and perceptual processing abnormalities
(American Academy of Ophthalmology Pediatric Ophthalmology/Strabismus Panel, 2012;
McKee, Levi, & Movshon, 2003). In addition to a reduction in optotype acuity, typical visual
deficits in amblyopia include reductions in contrast sensitivity, Vernier acuity (i.e., detection of
slight misalignment between two parallel line segments), grating acuity (i.e., resolution of
alternating dark and light stripes of high spatial frequencies), and stereo acuity (i.e., perception of
binocular disparity depth cues in three-dimensional space) (McKee et al., 2003). Foveal
suppression and spatial interference (i.e., the ‘crowding’ effect) in the amblyopic eye are also
characteristic findings (Babu, Clavagnier, Bobier, Thompson, & Hess, 2013; Bonneh, Sagi, &
Polat, 2007; Levi & Klein, 1985; Stuart & Burian, 1962). Although the fellow eye typically has
normal optotype acuity on clinical testing, it commonly exhibits subtle visual deficits as well
(Meier & Giaschi, 2017).
In the amblyopic eye, contrast sensitivity is typically reduced, with high spatial frequencies
preferentially affected, although broadband impairments are sometimes observed (Abrahamsson
& Sjostrand, 1988; Bradley & Freeman, 1981; Hess & Howell, 1977; Levi & Harwerth, 1977).
In the fellow eye, contrast sensitivity is also reduced (Leguire, Rogers, & Bremer, 1990; Wali,
Leguire, Rogers, & Bremer, 1991). The contrast sensitivity deficit in the fellow eye is less
severe, but correlated with the contrast sensitivity deficit in the amblyopic eye. Furthermore,
standard occlusion therapy for amblyopia (i.e., patching of the fellow eye) improves contrast
sensitivity in both the amblyopic and fellow eye, indicating binocular interaction (Leguire et al.,
1990; Wali et al., 1991). Vernier acuity and grating acuity are also impaired in the amblyopic
eye, and the two measures are highly correlated with the deficit in optotype acuity (McKee et al.,
2003). In the fellow eye, evidence suggests that Vernier acuity is usually unaffected or even
slightly enhanced (Freeman & Bradley, 1980; Levi & Klein, 1985). Stereo acuity requires
sensory fusion of correlated or aligned images from each eye, and therefore is affected by any
condition impairing monocular visual acuity, eye alignment, or binocular processing. Stereo
acuity deficits in amblyopia are very typical, and are related to the severity of the visual acuity
deficit, strabismus, and foveal suppression. Indeed, as a general rule, stereopsis is absent in deep
amblyopia (McKee et al., 2003).
Beyond these deficits in low-level visual processing, amblyopia is also associated with
impairments in higher-level processing that requires combination of visual cues over a large area
of visual space to form a coherent percept (Mirabella, Hay, & Wong, 2011; Sharma, Levi, &
Klein, 2000). Spatial distortions of the visual field and positional uncertainty in visual alignment
tasks are observed in the amblyopic and fellow eye (Barrett, Pacey, Bradley, Thibos, & Morrill,
2003; Bedell, Flom, & Barbeito, 1985; Fronius, Sireteanu, & Zubcov, 2004; Hess & Holliday,
1992; Sireteanu, Thiel, Fikus, & Iftime, 2008). Detection of global shape is impaired in the
amblyopic eye (Hess, Wang, Demanins, Wilkinson, & Wilson, 1999), and contour integration,
which requires long-range interactions between elements in the visual space, is impaired in the
amblyopic eye and possibly in the fellow eye (Kovacs, Polat, Pennefather, Chandna, & Norcia,
2000). Similarly, abnormal long-range spatial interference by flanking elements (i.e.,
‘crowding’) is evident in the amblyopic eye (Bonneh, Sagi, & Polat, 2004; Polat, Sagi, & Norcia,
1997). Additionally, individuals with amblyopia show binocular deficits in global motion
detection (Aaen-Stockdale & Hess, 2008; Ho et al., 2006), motion-defined form detection
(Giaschi, Regan, Kraft, & Hong, 1992), and real-world scene perception (Mirabella et al., 2011).
The distribution, pattern and severity of visual deficits also vary among the etiological
categories of amblyopia. In anisometropic amblyopia, spatial frequency discrimination
thresholds are higher (i.e., impaired) at most spatial frequencies in the amblyopic eye, with
greater deficits evident at higher spatial frequencies. In strabismic amblyopia, however,
discrimination thresholds vary inconsistently across spatial frequencies, with peaks and troughs
that qualitatively resemble the spatial frequency discrimination profile of the normal peripheral
retina (Mathews, Yager, Ciuffreda, & Ettinger, 1987). The deficits in contrast sensitivity also
differ between etiological subtypes, with anisometropic and deprivational amblyopia having
poorer contrast sensitivity thresholds than either strabismic or mixed-mechanism amblyopia
(McKee et al., 2003). Furthermore, in anisometropic amblyopia, the deficit in contrast sensitivity
is uniformly distributed across the central and peripheral visual field, whereas in strabismic
amblyopia, the peripheral field (i.e., beyond the central 30 degrees) is relatively spared (Hess &
Pointer, 1985). In general, anisometropic and strabismic amblyopia also differ in the way their
visual deficits co-vary with each other. In anisometropic amblyopia, Vernier acuity and grating
acuity are affected proportionately, but in strabismic amblyopia the two are decoupled, with
Vernier falling off faster than predicted in relation to grating acuity; that is, strabismic amblyopia
has a proportionally greater deficit in Vernier acuity than grating acuity (Birch & Swanson,
2000; Levi & Klein, 1985). Similarly, inaccuracies in spatial localization can be predicted from
the deficit in contrast sensitivity in anisometropic amblyopia, but in strabismic amblyopia,
deficits in contrast sensitivity and spatial localization accuracy appear decoupled (Hess &
Holliday, 1992). The etiological categories of amblyopia also vary in their level of binocular
dysfunction (e.g., stereo acuity and suppression). Empirically and intuitively, strabismic and
mixed-mechanism amblyopia tend to have poorer binocular function, and anisometropic and
deprivational amblyopia tend to have relatively preserved binocular function, as the visual axes
remain aligned (McKee et al., 2003). Indeed, binocular function has been proposed as a primary
determining factor behind the differential patterns of visual deficits observed among the
etiological categories of amblyopia (McKee et al., 2003).
Despite the variations in visual function described above, the etiological subtypes of amblyopia
share more similarities than differences (McKee et al., 2003). Indeed, the diagnostic hallmarks of
amblyopia—reduced visual acuity, reduced contrast sensitivity, reduced stereo acuity, elevated
interocular suppression, and the crowding phenomenon—are common to all etiological subtypes
(Levi & Klein, 1985; J. Li et al., 2011; McKee et al., 2003). Additionally, clinical trials support a
common therapeutic strategy—occlusion and pharmacologic penalization—as the gold-standard
for all subtypes of amblyopia (American Academy of Ophthalmology Pediatric
Ophthalmology/Strabismus Panel, 2012; Holmes & Clarke, 2006). Some have also noted that
ascertainment of etiology can be difficult, and may lead to spurious classification, because of
changes in a patient’s refraction and eye alignment in the interval between actual onset and first
clinical presentation (The Lasker/IRRF Initiative for Innovation in Vision Science, 2017). The
scientific study of these etiologic subtypes as a common clinical entity is therefore well justified.
The effects of amblyopia also extend beyond the spatial domain to include temporal aspects of
visual processing that require combination of visual cues over an interval of time. Severely
affected amblyopic eyes (i.e., optotype acuity worse than 0.7 logMAR) are associated with
reduced temporal contrast sensitivity across modulation frequencies from 2 Hz to 30 Hz
(Harwerth, Smith, Boltz, Crawford, & von Noorden, 1983; Wesson & Loop, 1982). Several
studies suggest that amblyopic vision also involves temporal uncertainty: the amblyopic
eye in anisometropic and strabismic amblyopia shows an extra-foveal deficit in temporal
resolution that correlates with the severity of the spatial acuity deficit (Spang & Fahle, 2009),
and sensitivity to visual asynchrony is reduced in the fovea of amblyopic eyes (Huang, Li, Deng,
Yu, & Hess, 2012). Furthermore, reduced sensitivity to visual temporal order is observed in the
fellow eye in strabismic amblyopia when the judgment requires interhemispheric integration of
visual information (i.e., transmission across the corpus callosum) (St John, 1998). Amblyopic
eyes also show markedly reduced duration of visual persistence, apparent as an abnormal
decrease in the ability to detect differences between two visual stimuli as the inter-stimulus
interval increases (Altmann & Singer, 1986).
In addition to these impairments in visual cue combination over time, amblyopia is also
associated with increased latency in the visual system. The Pulfrich effect is a compelling
binocular visual illusion that links spatial vision and temporal processing (Pulfrich, 1922). In this
effect, a pendulum oscillating in the frontal plane is falsely perceived as having an elliptical
orbit. The illusion is absent in visually typical observers under normal binocular conditions,
and arises only when a signal latency difference between the eyes is introduced into the visual system, either
experimentally or pathologically (Heng & Dutton, 2011). If stereopsis is intact, the image of the
moving pendulum in the lagging eye creates a binocular disparity that is perceived as three-
dimensional depth. A spontaneous Pulfrich effect has been observed in anisometropic
amblyopia, indicating delayed processing of visual information from the amblyopic eye (Tredici
& von Noorden, 1984). Indeed, this finding agrees with studies of pattern-reversal visual evoked
potentials (VEPs) which show that the latency of the P1 wave (time from stimulation to the peak
of the first major positive inflection, usually at about 100 ms) is prolonged for the amblyopic eye
(Arden & Barnard, 1979; Barnard & Arden, 1979; Sokol, 1983) and possibly for the fellow eye
(Watts, Neveu, Holder, & Sloper, 2002). Within individuals, trial-to-trial VEP latency
measurements from stimulation of the amblyopic eye are not only longer, but also more variable
(Banko, Kortvelyes, Nemeth, Weiss, & Vidnyanszky, 2013; Kelly, Tarczy-Hornoch, Herlihy, &
Weiss, 2015). Together, VEP latency jitter and the customary practice of averaging waveforms over
multiple trials are proposed by some as the artifactual source of the reduced VEP amplitudes
observed in amblyopia, as illustrated in Figure 1.1 (Banko, Kortvelyes, Nemeth, et al., 2013).
Studies of multifocal VEP responses in amblyopia also show that increases in latency and
decreases in amplitude are not spatially uniform, but more pronounced in the central visual field
in both anisometropic and strabismic amblyopia (Yu, Brown, & Edwards, 1998; Zhang & Zhao,
2005). Furthermore, a study of anisometropic amblyopia showed the interocular difference in
retinocortical transmission time is correlated with the interocular difference in visual acuity
(Parisi, Scarale, Balducci, Fresina, & Campos, 2010). Visual processing delays are also evident
in visuomotor tasks, with increased latency in initiation of saccades (Ciuffreda, Kenyon, & Stark,
1978; McKee, Levi, Schor, & Movshon, 2016) and smooth pursuit eye movements (Raashid,
Liu, Blakeman, Goltz, & Wong, 2016) when viewing with the amblyopic eye, and increased
duration of the motor planning phase during visually-guided reaching movements (Niechwiej-
Szwedo et al., 2011).
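The geometry of the Pulfrich effect described above can be illustrated with a short sketch. It assumes an idealized pendulum with sinusoidal horizontal position and a fixed interocular delay; the function name, the sign convention, and all parameter values are hypothetical and chosen only for illustration. The effective disparity is the difference between the target position signalled by the leading eye and the earlier position still signalled by the lagging eye, which for small delays approximates target velocity multiplied by the delay.

```python
import math


def pulfrich_disparity_deg(amplitude_deg, freq_hz, delay_s, t_s):
    """Effective horizontal disparity (deg) at time t_s.

    The pendulum position is x(t) = A * sin(2*pi*f*t). The lagging eye
    reports where the target was delay_s seconds earlier.
    """
    omega = 2 * math.pi * freq_hz
    x_leading = amplitude_deg * math.sin(omega * t_s)
    x_lagging = amplitude_deg * math.sin(omega * (t_s - delay_s))
    return x_leading - x_lagging


# Hypothetical values: 10 deg swing amplitude, 0.5 Hz, 10 ms interocular delay.
A, F, DELAY = 10.0, 0.5, 0.010
mid_swing_rightward = pulfrich_disparity_deg(A, F, DELAY, t_s=0.0)
mid_swing_leftward = pulfrich_disparity_deg(A, F, DELAY, t_s=1.0)
turnaround = pulfrich_disparity_deg(A, F, DELAY, t_s=0.5)
print(mid_swing_rightward, mid_swing_leftward, turnaround)
```

Under these assumptions the disparity has equal magnitude and opposite sign on the two halves of the swing and nearly vanishes at the turnaround points, which is why the pendulum appears to trace an ellipse in depth rather than a path in the frontal plane.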
A common factor implicated in many of the spatial and temporal perceptual abnormalities
outlined above is noise and its processing by the amblyopic visual system (Banko, Kortvelyes,
Weiss, & Vidnyanszky, 2013; Levi, 2013). Difficulty in handling external noise is evident from
behavioural studies. The addition of random visual noise to stimuli for global motion and global
orientation discrimination tasks degrades performance in amblyopia to a much greater degree
than it does for visually normal observers, suggesting an impairment in the ability to segregate
external noise from signal (Mansouri & Hess, 2006). The amblyopic eye also shows markedly
reduced sensitivity for the detection of white noise, particularly at high spatial frequencies (Levi,
Klein, & Chen, 2007). In addition to difficulty processing signals in external (i.e., stimulus)
noise, the amblyopic visual system is hypothesized to have higher levels of internal (i.e., neural)
noise. Behaviourally, this is indicated by greater trial-to-trial variability in perceptual tasks (Levi
& Klein, 2003; Levi, Klein, & Chen, 2008; Levi, Klein, & Yap, 1987; Levi, Waugh, & Beard,
1994), increased latency and diminished amplitude of saccadic eye movements (Niechwiej-
Szwedo, Goltz, Chandrakumar, Hirji, & Wong, 2010; Raashid, Wong, Chandrakumar,
Blakeman, & Goltz, 2013), and greater error in visually-guided grasping (Grant et al., 2007) and
reaching hand movements (Niechwiej-Szwedo, Goltz, et al., 2012). Physiologically, internal
noise is apparent as increased latency jitter in VEP responses (Banko, Kortvelyes, Nemeth, et al.,
2013) (see Figure 1.1), and reduced neural synchrony in V1 when driven by stimulation of the
amblyopic eye (Roelfsema, Konig, Engel, Sireteanu, & Singer, 1994). The importance of noise
in the pathophysiology of amblyopia is underscored by the ability to simulate amblyopic deficits
in contrast sensitivity (Nordmann, Freeman, & Casanova, 1992) and saccadic adaptation
(Raashid, Wong, Blakeman, & Goltz, 2015) in visually normal viewers by the addition of
external random noise, and by the inability of visual blur, alone, to simulate amblyopic deficits in
visually-guided reaching (Niechwiej-Szwedo, Kennedy, et al., 2012).
Figure 1.1: Visual evoked potential P1 amplitude and latency distributions from trial-by-
trial analysis from 18 adults with unilateral amblyopia. The histograms show identical P1
amplitude distributions for the fellow eye (FE) and amblyopic eye (AE), but P1 latency
distributions that are wider and skewed toward longer latencies for the AE. The density plots
illustrate that the P1 latency from stimulation of the AE is noisier and less reliable as a marker of
event timing than the signal from the FE. From Banko, Kortvelyes, Nemeth, et al. (2013).
Reprinted with permission from Elsevier.
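The proposal that trial averaging combined with latency jitter can reduce the measured VEP amplitude, even when single-trial amplitudes are unchanged, can be illustrated with a small simulation. This is a sketch under stated assumptions and not the analysis of Banko et al.: each single-trial P1 is modelled as a Gaussian bump of unit amplitude, and the waveform width, mean latency, jitter values and trial count are all hypothetical.

```python
import math
import random


def p1_waveform(t_ms, latency_ms, amplitude=1.0, width_ms=15.0):
    """Idealized single-trial P1: a Gaussian bump peaking at latency_ms."""
    return amplitude * math.exp(-0.5 * ((t_ms - latency_ms) / width_ms) ** 2)


def averaged_peak(latency_sd_ms, n_trials=500, mean_latency_ms=100.0, seed=0):
    """Peak amplitude of the trial-averaged waveform under latency jitter."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    latencies = [rng.gauss(mean_latency_ms, latency_sd_ms) for _ in range(n_trials)]
    average = [
        sum(p1_waveform(t, latency) for latency in latencies) / n_trials
        for t in range(0, 201)  # 0 to 200 ms in 1 ms steps
    ]
    return max(average)


# Every single trial has amplitude 1.0; only the latency variability differs.
low_jitter_peak = averaged_peak(latency_sd_ms=5.0)    # e.g. fellow eye (assumed)
high_jitter_peak = averaged_peak(latency_sd_ms=20.0)  # e.g. amblyopic eye (assumed)
print(low_jitter_peak, high_jitter_peak)
```

In this toy model the averaged peak falls as latency variability grows, because misaligned single-trial peaks partially cancel in the average. A trial-by-trial analysis avoids this artifact by measuring amplitude before averaging.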
1.1.4 Neural Basis of Amblyopia
The neural basis of amblyopia has been extensively studied in animal models, and more recently,
using neuroimaging techniques in humans. Although the current scientific consensus points to
the striate or primary visual cortex (V1) as the principal site of neurological dysfunction in
amblyopia, abnormalities have been identified at multiple sites in the visual system (Barrett,
Bradley, & McGraw, 2004; Hess, 2001).
The first significant insights into the neural basis of amblyopia arose from the pioneering work
of Hubel and Wiesel on the structure and function of the visual system in kittens deprived of
vision in one eye (Wiesel & Hubel, 1963a, 1963b). They showed that monocular deprivation by
eyelid suture beginning at birth caused marked atrophy in the layers of the lateral geniculate
nucleus (LGN) fed by the deprived eye (Wiesel & Hubel, 1963a). Deprivation for a comparable
period beginning after 1 to 2 months of normal visual experience resulted in similar but less
severe atrophy, and deprivation beginning in adulthood resulted in no atrophy in the LGN.
Electrophysiological recordings in the kittens with early-onset monocular deprivation showed
normal receptive fields, but reduced overall activity in the layers of the LGN fed by the deprived
eye. Subsequent investigations in monocularly deprived monkeys showed similar histological
changes in the primate LGN, but essentially normal physiological responses to visual stimulation
in the layers fed by the deprived eye (Baker, Grigg, & von Noorden, 1974; Blakemore & Vital-
Durand, 1986). These findings suggested experience-dependent plasticity in the morphology of
the LGN during a critical period in early life, but they did not account for the behavioural deficits
observed in deprivational amblyopia.
Unlike the LGN, neurons in the striate cortex of visually deprived animals showed profound loss
of binocularity, loss of responsiveness to stimulation of the deprived eye, and a commensurate
shift in ocular dominance toward the non-deprived eye (Baker et al., 1974; Wiesel & Hubel,
1963b). As with the atrophy observed in the LGN, the physiological changes in the feline striate
cortex were less severe if the animal was deprived after a period of normal visual experience,
and absent if the animal was deprived in adulthood. Although monocular deprivation does not
cause atrophy of the striate cortex as it does in the LGN, cortical microstructure is greatly
affected: in cortical layer IV, the ocular dominance columns serving the deprived eye are
markedly narrower, while those serving the fellow eye are expanded (Hubel, Wiesel, & LeVay,
1977; Shatz & Stryker, 1978).
In addition to the early anatomical and physiological studies of deprivational amblyopia, other
studies have examined the neuroanatomic correlates of anisometropic and strabismic amblyopia.
Histologically, macaque monkeys with optically-induced anisometropic amblyopia show atrophy
in the LGN layers fed by the amblyopic eye, and narrowing of the V1 ocular dominance columns
serving the amblyopic eye, similar to that seen in monocular deprivation (Hendrickson et al.,
1987). These neuroanatomic changes in anisometropic amblyopia are accompanied by
physiological changes in the striate cortex: the number of neurons driven binocularly (i.e.,
cortical binocularity) is reduced, the proportion of neurons driven by the fellow eye is increased,
and neurons driven by the amblyopic eye have abnormally poor spatial selectivity and contrast
sensitivity (Kiorpes et al., 1998; Movshon et al., 1987). In monkeys with surgically-induced
strabismic amblyopia, cortical binocularity is similarly reduced, but the shift in eye dominance is
only seen among animals with more profound behavioural visual impairment (Kiorpes et al.,
1998). Furthermore, it has been shown in cats that strabismic amblyopia involves a loss of long-
range horizontal intracortical fibres connecting ocular dominance columns of opposite eyes (i.e.
binocular connections), and a reduction in temporal synchronization among neurons driven by
the amblyopic eye (Löwel & Engelmann, 2002; Roelfsema et al., 1994). Histological data on
amblyopia in humans is sparse, but two post-mortem studies suggest that narrowing of ocular
dominance columns serving the amblyopic eye may not be a feature of anisometropic or
strabismic amblyopia in humans (Horton & Hocking, 1996; Horton & Stryker, 1993).
Although the animal studies described above showed a positive relation between the magnitude
of striate cortical abnormalities and the degree of visual impairment, quantitative analyses suggest that
the physiological losses in V1 do not fully account for the behavioural deficits in amblyopia
(Kiorpes et al., 1998). Non-invasive neuroimaging techniques have provided further insights into the
neural basis of amblyopia beyond V1 in humans. Positron emission tomography (PET) has
shown reduced cerebral blood flow and glucose metabolism in V1 and extrastriate visual areas
during amblyopic eye viewing (Demer, von Noorden, Volkow, & Gould, 1988; Imamura et al.,
1997). Functional magnetic resonance imaging (fMRI) studies report similar findings, with
decreased blood oxygenation level-dependent (BOLD) responses associated with amblyopic eye
viewing in areas V1, V2, V3, Vp (i.e. ventral posterior area of V3), and V3a (i.e., dorsal V3)
(Barnes, Hess, Dumoulin, Achtman, & Pike, 2001; Conner, Odom, Schwartz, & Mendola,
2007a, 2007b; Li, Dumoulin, Mansouri, & Hess, 2007). Furthermore, fMRI responses to
stimulation of the amblyopic eye are progressively reduced from V1/V2 to V4+/V8 and the
lateral occipital complex, suggesting impaired transmission of visual information to areas of
higher visual processing (Muckli et al., 2006). In amblyopic macaque monkeys, decreased fMRI
responses to the amblyopic eye have also been observed in the middle temporal (MT) visual area
(El-Shamayleh, Kiorpes, Kohn, & Movshon, 2010), which has been shown to mediate global
motion discrimination (Newsome & Pare, 1988; Salzman, Britten, & Newsome, 1990).
Similarly, an fMRI study of humans with amblyopia showed decreased activity in area MT
during visual motion tracking (Secen, Culham, Ho, & Giaschi, 2011). In agreement with these
spatially distributed reductions in cerebral blood flow and metabolism, the effective connectivity
between disparate visual brain areas (LGN-striate and striate-extrastriate) is also reduced when
driven by the amblyopic eye (Goodyear et al., 2000; Li, Mullen, Thompson, & Hess, 2011).
1.1.5 Normal Visual Development and Sensitive Periods for Damage and Recovery
The visual system is immature at birth, and its functional and anatomic development is critically
dependent upon the visual input it receives in early life (Blakemore, 1988; Boothe, Dobson, &
Teller, 1985; Maurer, Lewis, Brent, & Levin, 1999; Wiesel & Hubel, 1963b) (see Lewis and
Maurer (2005) for review). It therefore follows that if vision is disrupted during development, the
normal course of visual maturation can be derailed and lead to permanent dysfunction of the
visual system. In their landmark studies on monocular deprivation in kittens, Hubel and Wiesel
described a critical, or sensitive, period during which monocular deprivation causes changes in
the ocular dominance of neurons in the striate cortex (Hubel & Wiesel, 1970; Wiesel & Hubel,
1963b). Subsequent studies in animals and humans found that there is not one sensitive period
for visual development, but different sensitive periods for the various aspects of visual
perception (Harwerth, Smith, Duncan, Crawford, & von Noorden, 1986), and the various kinds
of abnormal visual experience (Daw, 1998). Furthermore, for any aspect of visual perception, the
sensitive period for damage by abnormal visual experience is often different from the period
during which functional recovery may be achieved, for example, by occlusion therapy for
amblyopia (Lewis & Maurer, 2005). The general timelines of normal human visual development,
sensitive periods for damage and functional recovery are outlined below.
The visual system in newborn humans is functional but immature. Shortly after birth, infants can
visually discriminate their mother’s face from a stranger’s (Pascalis, de Schonen, Morton,
Deruelle, & Fabre-Grenet, 1995), and show a preference for biological motion (Simion, Regolin,
& Bulf, 2008). Visual acuity is rudimentary at birth, measuring worse than 1.0 logMAR (i.e., worse
than 20/200) by response to an optokinetic grating stimulus (Gorman, Cogan, & Gellis, 1957).
Grating acuity measured by a forced choice preferential looking task improves from 1 to 3
cycles/degree (i.e., less than 20/200) to about 10 cycles/degree (i.e., 20/60) in the first 6 months
postnatally, then improves more slowly until reaching adult levels at 4 to 6 years of age (Mayer
et al., 1995; Neu & Sireteanu, 1997; Salomao & Ventura, 1995). Spatial sweep visual evoked
potentials are in general agreement with behavioural improvements in grating acuity in early
infancy (Norcia & Tyler, 1985). While postnatal changes in retinal structure and refraction
account for much of the observed improvement in visual acuity, maturation of neural networks
involved in visual processing must also play a role (Banks & Bennett, 1988; Daw, 2006). Vernier
acuity, or hyperacuity, follows a protracted developmental trajectory distinct from grating acuity.
Vernier acuity is poorer than grating acuity until 2 to 3 months of age, after which it catches up
and parallels the gains in grating acuity during early childhood (Shimojo & Held, 1987;
Skoczenski & Norcia, 2002). While grating acuity reaches adult levels by 6 years, Vernier acuity
continues to improve, surpassing grating acuity and reaching a plateau at around 14 years of age
(Skoczenski & Norcia, 2002). Contrast sensitivity thresholds also improve rapidly in infancy.
From birth until 9 weeks of age, contrast sensitivity improves across all spatial frequencies
(Norcia, Tyler, & Hamer, 1990). Improvements in contrast sensitivity closely follow the
development of grating acuity. Contrast sensitivity at high spatial frequencies develops more
slowly than at low spatial frequencies, but the overall sensitivity function reaches adult form by 7
to 9 years of age (Adams & Courage, 2002; Ellemberg, Lewis, Liu, & Maurer, 1999). Sensitivity
to global motion and biological motion (higher level visual functions that require combination
of spatially disparate visual cues over time) also has a prolonged course of improvement, with
adult-like thresholds not reached until 12 to 14 years of age (Hadad, Maurer, & Lewis, 2011). As
with the differential rate of development of contrast sensitivity across spatial frequencies (Adams
& Courage, 2002; Ellemberg et al., 1999), the maturation of sensitivity to global motion may be
non-uniform for different motion integration stimuli (Hadad, Schwartz, Maurer, & Lewis, 2015).
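The acuity equivalences quoted in this section (grating acuity in cycles/degree, Snellen fractions, and logMAR) can be checked with a short calculation. The following is a minimal sketch, assuming the common convention that a 30 cycles/degree grating corresponds to a minimum angle of resolution (MAR) of 1 arcmin, i.e., 20/20 or 0.0 logMAR. The function names are illustrative only and are not drawn from any of the cited studies.

```python
import math

# Assumed convention: 30 cycles/degree <-> MAR of 1 arcmin <-> 20/20 <-> 0.0 logMAR.
CPD_AT_ONE_ARCMIN = 30.0


def mar_from_cpd(cycles_per_degree: float) -> float:
    """Minimum angle of resolution (arcmin) for a grating acuity in cycles/degree."""
    return CPD_AT_ONE_ARCMIN / cycles_per_degree


def logmar_from_cpd(cycles_per_degree: float) -> float:
    """logMAR equivalent of a grating acuity."""
    return math.log10(mar_from_cpd(cycles_per_degree))


def snellen_denominator_from_cpd(cycles_per_degree: float) -> float:
    """Denominator X of the 20/X Snellen equivalent of a grating acuity."""
    return 20.0 * mar_from_cpd(cycles_per_degree)


if __name__ == "__main__":
    for cpd in (1.0, 3.0, 10.0, 30.0):
        print(
            f"{cpd:5.1f} cycles/degree -> "
            f"20/{snellen_denominator_from_cpd(cpd):.0f}, "
            f"{logmar_from_cpd(cpd):.2f} logMAR"
        )
```

Under this assumed convention, 3 cycles/degree maps to 20/200 (1.0 logMAR) and 10 cycles/degree maps to 20/60, which agrees with the equivalences given in the text. Lower cycles/degree values give larger logMAR values, i.e., poorer acuity.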
The concept of a sensitive period for visual development is relevant to clinical practice because it
indicates the time during which visual impairment may be prevented (i.e., the sensitive period for
damage), and possibly reversed (i.e., the sensitive period for recovery) by therapeutic
intervention. In their studies of visual development in the cat, Hubel and Wiesel described a
sensitive period for changes in the ocular dominance of cells in the striate cortex and the
associated behavioural deficits caused by early-onset monocular deprivation (Hubel & Wiesel,
1970). Deprivation before 4 weeks of age had no effect on the functional architecture of the
striate cortex. Beginning at 4 weeks, however, the visual system showed an exquisite
susceptibility to monocular deprivation, with even brief periods of deprivation causing profound
physiological and behavioural deficits. Susceptibility remained high for two weeks, then tailed
off slowly until the third month, at which point the visual system was no longer affected by
deprivation. Near the end of the sensitive period, even long periods of eyelid closure produced
only mild physiological deficits. The sensitive period for plasticity of ocular dominance columns
in the striate cortex was subsequently delineated in macaque monkeys (Horton & Hocking,
1997). Unlike the feline visual system, which had a 1 month period of non-susceptibility to
monocular deprivation, the macaque visual system was vulnerable to physiological damage from
birth. Indeed, susceptibility to damage in macaque infants was highest at 1 week of age,
diminished by 5 weeks of age, and absent by 3 months of age. Furthermore, animal studies
describe multiple, partially overlapping sensitive periods for different visual functions and
different areas of the visual brain (Harwerth et al., 1986; Jones, Spear, & Tong, 1984). In
macaque monkeys, the sensitive period was found to end by 3 months of age for scotopic
spectral sensitivity, by 6 months of age for photopic spectral sensitivity, but extend up to 2 years
of age for spatial contrast sensitivity, and beyond 2 years of age for binocular summation
(Harwerth et al., 1986). In another study of monocular deprivation in kittens, the period of
susceptibility for changes in ocular dominance ended by 18 to 26 weeks of age for the
extrastriate cortex (specifically, the lateral suprasylvian area), but extended beyond 35 weeks of
age for the striate cortex (Jones et al., 1984).
In humans, the sensitive periods for damage in visual development have been extensively studied
in children with deprivational amblyopia from congenital and early-onset cataract (see Lewis and
Maurer (2005) for review). A study of early surgery for bilateral congenital cataracts showed that
if deprivation begins at birth and lasts for less than 10 days, children can sometimes attain
normal optotype acuity (Kugelberg, 1992). Another study examined the visual acuity in children
at the first contact lens fitting following surgery for congenital and acquired unilateral cataract
(Vaegan & Taylor, 1979). It showed that deprivation onset between 6 and 30 months of age
caused the greatest loss of acuity (less than or equal to counting fingers), that the losses
diminished as a function of deprivation onset after 3 years of age, and that no losses were
apparent with deprivation onset after 10 years of age. Similarly, a study of children with early
monocular and binocular cataract found that deprivation onset before 5 years of age resulted in
abnormal grating acuity, but that onset at 11 years or later had no effect on grating acuity
(Ellemberg, Lewis, Maurer, Brar, & Brent, 2002). The pattern of early high susceptibility and
gradual tailing off is reminiscent of the sensitive period described in animal models (Horton &
Hocking, 1997; Hubel & Wiesel, 1970), but over a much longer time scale. Together, these
studies suggest that the sensitive period for damage of visual spatial acuity extends from birth (or
10 days after birth) to about 10 years of age. Importantly, this is about 4 years beyond the period
of normal development for grating acuity, indicating that this aspect of visual function
remains vulnerable and plastic for some time after adult-level function is attained (Lewis &
Maurer, 2005). The sensitive period for damage and its relation to normal development is quite
different for global motion, however. A study of early onset unilateral and bilateral cataract
showed that global motion discrimination thresholds are only impaired if deprivation begins
before 4 to 8 months of age (Ellemberg et al., 2002). Similarly, a clinical case study of visual
recovery in a 43 year old man deprived of form vision since 3 years of age reported impaired
spatial resolution consistent with deprivational amblyopia, but normal performance on motion
processing tasks (Fine et al., 2003). Furthermore, fMRI responses in the patient showed
dramatically reduced responses in V1, but normal activation in area MT to motion stimuli. In
stark contrast to spatial acuity, global motion has a brief sensitive period for damage (4 to 8
months) that is dramatically shorter than its period of normal development (12 to 14 years)
(Hadad et al., 2011). Further complicating the discussion of sensitive periods for damage is an
apparent difference between amblyogenic factors (discussed in section 1.1.2). Strabismic
amblyopia is 16 times more prevalent than anisometropic amblyopia under 3 years of age (Birch
& Holmes, 2010), but the two are equally prevalent between 3 and 6 years of age (Repka et al.,
2002). Additionally, anisometropia with onset by 1 year of age is not associated with an
increased risk of amblyopia unless it persists for 3 years (Abrahamsson et al., 1990). These
findings suggest that the sensitive period for damage may occur later for retinal defocus (i.e.,
anisometropia) than for binocular decorrelation (i.e., strabismus) or form deprivation.
While the sensitive periods for damage are of tremendous interest to basic science, the sensitive
periods for recovery are arguably more relevant to clinical care because they predict the
effectiveness of treatment. According to clinical tradition, the most common forms of amblyopia
(strabismic, anisometropic, and mixed mechanism) have the best potential for recovery of visual
acuity if occlusion therapy is undertaken before 7 years of age (Campos, 1995; von Noorden &
Campos, 2002). While this is generally borne out in clinical studies (Flynn et al., 1998; Holmes
et al., 2011; Lea et al., 1989), partial recovery is well-documented in adolescents (Scheiman et
al., 2005) and reported in adults long after the end of the sensitive period for damage (Kasser &
Feldman, 1953; Kishimoto et al., 2014; Kupfer, 1957). This suggests that the sensitive period for
recovery of visual acuity tails off more slowly than the sensitive period for damage. Other
evidence indicates that the potential for recovery may also be enhanced or re-opened. For
example, profound recovery of visual acuity in the amblyopic eye has been reported in adults
following loss of the fellow eye (Hamed, Glaser, & Schatz, 1991; Klaeger-Manzanell, Hoyt, &
Good, 1994; Vereecken & Brabant, 1984), and in animals following brief periods of darkness
(Duffy & Mitchell, 2013; He, Ray, Dennis, & Quinlan, 2007) and bilateral pharmacological
retinal inactivation (Fong et al., 2016). Training on perceptual learning tasks has also produced
modest gains in visual acuity, Vernier acuity, contrast sensitivity, and stereopsis in adults with
amblyopia (Levi & Polat, 1996; Li, Ngo, Nguyen, & Levi, 2011; Polat, Ma-Naim, Belkin, &
Sagi, 2004), but debate remains about whether these gains represent true visual recovery, or
merely changes in visual attention (Tsirlin, Colpa, Goltz, & Wong, 2015). Even if normal visual
acuity is achieved by occlusion therapy, some deficits in other visual functions persist. In
deprivational amblyopia, for example, global motion sensitivity can remain impaired despite
complete recovery of visual acuity (Constantinescu, Schmidt, Watson, & Hess, 2005). Similarly,
reading capacity may remain markedly impaired in strabismic amblyopia despite full recovery of
visual acuity (Zürcher & Lang, 1979). Like the sensitive periods for damage, these findings
suggest that different visual functions have distinct but overlapping sensitive periods for
recovery as well. Furthermore, they highlight the potential inadequacy of occlusion therapy in
addressing the full constellation of perceptual impairments that occur in amblyopia.
1.2 Auditory Processing
1.2.1 Overview
Acoustic perception is a complex process involving peripheral transduction of sound information
by the cochlea, transmission of the neural impulses to the brain, and central processing of the
auditory signals to register a conscious perception of the acoustic world. At the most basic level,
the peripheral auditory system detects and encodes the frequency and amplitude of
one-dimensional sound waves. Centrally, however, the combination and comparison of the acoustic
information from both ears (i.e., binaural hearing) permits additional information about the
auditory world to be inferred and extracted. For example, analysis of binaural differences in
signal timing and signal intensity provides information about the spatial structure of the auditory
world. Certainly, the ability to localize sound sources and to perceive movement of auditory
objects is critical for survival (Grothe, Pecka, & McAlpine, 2010). Furthermore, integration of
auditory signals over an interval of time enables perception of acoustic sequences and temporal
patterns. This ability to recognize the larger temporal structure of modulation in frequency and
amplitude is essential to communication, comprehension of speech, and appreciation of music
(Warren, 2008).
1.2.2 Auditory Spatial Processing
Sound waves produce one-dimensional oscillations of the eardrum that are transmitted by the
ossicles of the middle ear to the fluid-filled cochlea. Within the cochlea, the transmitted waves
cause deflections in neurosensory hair cells arranged along the basilar membrane. The hair cells
are tonotopically arranged, meaning that their position along the basilar membrane represents the
frequency of a sound rather than a location in space. Therefore, unlike the photoreceptors in the
neurosensory retina, which have an intrinsic spatiotopy, the peripheral auditory apparatus has no
explicit representation of space. Rather, the perception of auditory space must be inferred from
the frequency and amplitude of the signals from each ear. These spatial computations are largely
undertaken by specialized centres in the auditory midbrain (reviewed in Grothe et al. (2010)).
Despite this lack of an explicit spatiotopy, the mature human auditory system has remarkable
sound localization abilities. Indeed, using monaural and binaural cues, humans can discriminate
changes in angular direction of 1 to 2 degrees in the horizontal plane (Blauert, 1970; Klemm,
1920) and of 3.5 degrees in the vertical plane (Makous & Middlebrooks, 1990). Below, the
mechanisms of monaural and binaural sound localization in humans are reviewed.
1.2.2.1 Monaural Cues
The head and pinna of the outer ear interact with sound entering the auditory meatus, producing
many frequency-specific time delays and complex spectral changes (Wright, Hebrank, & Wilson,
1974). The spectral transformation that results is referred to as a head-related transfer function,
or HRTF, and varies depending on the azimuth and elevation of the sound source relative to the
head and pinna (Wightman & Kistler, 1989a, 1989b). These monaural cues are of primary
importance in auditory localization on the vertical plane. When the normal convolutions of the
human pinna are experimentally occluded, sound localization in the vertical plane is significantly
impaired: discrimination of changes in elevation is reduced, and the likelihood of front-back
confusions is increased (Gardner & Gardner, 1973). Discrimination of changes in azimuth,
however, appears unaffected by obliteration of monaural spectral cues (Hofman, Van Riswick, &
Van Opstal, 1998).
1.2.2.2 Binaural Cues
Accurate auditory localization in the horizontal plane (i.e., azimuth) relies on two binaural cues:
interaural level difference (ILD) and interaural time difference (ITD). The recognition that these
two cues contribute to binaural perception of azimuth and their dominance at different
frequencies is often referred to as the duplex theory of sound localization (Rayleigh, 1907).
ILD refers to the difference in sound pressure level between the two ears, and is caused by
acoustic shadowing by the head. If a sound is presented from the side, the path of the sound wave
is interrupted by the head, shadowing the far ear from the source of the sound. The amount that a
given sound source is attenuated, or acoustically shadowed, depends on the wavelength of the
sound relative to the diameter of the listener’s head. High frequency sounds with wavelengths
smaller than the diameter of the human head can be attenuated by as much as 35 decibels
(Middlebrooks, Makous, & Green, 1989), far above the 1 to 2 decibel ILD detection threshold
for clicks (Mills, 1958; Von Békésy, 1930). However, as frequency decreases (and wavelength
increases), the effect of acoustic shadowing tails off. Frequencies at or below approximately
1400 Hz have wavelengths equal to or larger than the diameter of the human head, and generally
produce ILDs too small to be useful as binaural localization cues (Mills, 1958).
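The dependence of acoustic shadowing on wavelength can be illustrated with a short calculation. This is a minimal sketch, assuming a speed of sound in air of 343 m/s and an adult head diameter of 0.18 m; both values are illustrative assumptions and are not taken from the cited studies.

```python
# Assumed values (not from the cited studies).
SPEED_OF_SOUND_M_PER_S = 343.0  # air at roughly 20 degrees Celsius
HEAD_DIAMETER_M = 0.18          # approximate adult head diameter


def wavelength_m(frequency_hz: float) -> float:
    """Wavelength in metres of a sound wave in air at the given frequency."""
    return SPEED_OF_SOUND_M_PER_S / frequency_hz


def is_shadowed_by_head(frequency_hz: float) -> bool:
    """True when the wavelength is shorter than the assumed head diameter,
    the condition under which appreciable acoustic shadowing is expected."""
    return wavelength_m(frequency_hz) < HEAD_DIAMETER_M


if __name__ == "__main__":
    for f in (500.0, 1400.0, 4000.0, 8000.0):
        print(
            f"{f:6.0f} Hz: wavelength = {wavelength_m(f):.3f} m, "
            f"shorter than head diameter: {is_shadowed_by_head(f)}"
        )
```

With these assumed values, the wavelength at 1400 Hz is about 0.245 m, which exceeds the assumed head diameter, whereas the wavelength at 8000 Hz is about 0.043 m and is readily shadowed by the head.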
ITD refers to the difference in acoustic signal arrival time (i.e., phase delays) between the two
ears. The relation between ITDs and horizontal angular direction is dependent on the distance
between the two ears and the velocity of sound waves through air. A sound presented from the
side has different path lengths to each ear, and given a constant velocity, will arrive at the far ear
later. In humans, the physiological range of ITDs varies from 0 μs for a central sound source to
about 750 μs for a fully lateralized sound source (van der Heijden & Trahiotis, 1999). The
discrimination thresholds for ITDs can be as short as 10 μs, however, which corresponds to a
change in azimuth of about 2 degrees (Klumpp & Eady, 1956). For frequencies below
approximately 1400 Hz, ITD provides an unambiguous spatial signal (Mills, 1958). For higher
frequencies, however, the ITD becomes spatially ambiguous because phase offsets of one or
more wavelengths could produce ITDs in the physiologic range.
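The frequency at which ITD cues become ambiguous can be estimated from the criterion stated above: ambiguity arises once a phase offset of one full period is no longer than the largest physiological ITD. The following is a minimal sketch under that criterion, using the maximum ITD of about 750 μs quoted in the text. The function names are illustrative, and the pure-tone treatment is a simplifying assumption.

```python
# Maximum physiological ITD quoted in the text (fully lateralized source).
MAX_ITD_US = 750.0


def period_us(frequency_hz: float) -> float:
    """Period of a pure tone in microseconds."""
    return 1e6 / frequency_hz


def itd_is_ambiguous(frequency_hz: float, max_itd_us: float = MAX_ITD_US) -> bool:
    """True when a one-period phase offset falls within the physiological ITD range."""
    return period_us(frequency_hz) <= max_itd_us


def critical_frequency_hz(max_itd_us: float = MAX_ITD_US) -> float:
    """Frequency at which one full period equals the maximum ITD."""
    return 1e6 / max_itd_us


if __name__ == "__main__":
    print(f"Critical frequency: {critical_frequency_hz():.0f} Hz")
    for f in (500.0, 1000.0, 2000.0, 4000.0):
        print(
            f"{f:6.0f} Hz: period = {period_us(f):7.1f} us, "
            f"ambiguous: {itd_is_ambiguous(f)}"
        )
```

Under this simplified criterion the critical frequency is about 1333 Hz, which is in line with the limit of approximately 1400 Hz cited in the text.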
1.2.2.3 Neural Pathways for Sound Localization
Signals for ITDs and ILDs from each ear converge in the auditory midbrain, where dedicated
networks transform these binaural cues into signals for sound location in head-centred
coordinates (see Grothe et al. (2010) for review).
The first site of convergence for binaural ILD cues is the lateral superior olive (LSO). The LSO
receives excitatory input directly from the ipsilateral cochlear nucleus and inhibitory inputs from
the contralateral cochlear nucleus via the medial nucleus of the trapezoid body (MNTB). These
excitatory and inhibitory inputs give rise to ILD sensitivity in LSO neurons by a subtractive
process (Boudreau & Tsuchitani, 1968; Caird & Klinke, 1983). LSO neurons are excited by
sounds that are more intense in the ipsilateral ear, and are inhibited by sounds more intense in the
contralateral ear. Neurons in each LSO send excitatory projections to the contralateral dorsal
nucleus of the lateral lemniscus (DNLL) and inferior colliculus (IC), and inhibitory projections
to the ipsilateral DNLL and IC. In addition to input from the LSOs bilaterally, each IC receives
inhibitory input from the contralateral DNLL and excitatory input directly from the contralateral
cochlear nucleus. These ascending inputs to the IC provide further ILD cues and
improve the spatial sensitivity of IC neurons compared to LSO neurons (Park, Klug, Holinstat, &
Grothe, 2004).
Sensitivity to ITD cues is traditionally thought to emerge from convergence of binaural signals in
the medial superior olive (MSO), although in mammals, neurons in both the MSO and LSO
respond to ITD cues. The MSO receives excitatory input directly from the ipsilateral and
contralateral cochlear nucleus, as well as inhibitory input directly from the ipsilateral cochlear
nucleus and indirectly from the contralateral cochlear nucleus via the MNTB. For low frequency
sounds, action potentials from the cochlea are phase-locked to the stimulus waveform, thus
preserving the fine temporal structure of the acoustic stimulus (Rose, Brugge, Anderson, & Hind,
1967). These converging inputs on MSO neurons permit analysis of the phase offset between the
phase-locked binaural signals, and typically give rise to maximal excitation when the auditory
signal leads in the contralateral ear (Batra, Kuwada, & Fitzpatrick, 1997). Neurons in each MSO
send ascending excitatory projections to the ipsilateral and contralateral DNLL and IC. Although
both ILD and ITD cues contribute to sound localization in behavioural studies (Klumpp & Eady,
1956; Mills, 1958), spatial selectivity in the mammalian IC appears to depend predominantly on ILD
cues (Campbell, Doubell, Nodal, Schnupp, & King, 2006).
In humans, accurate sound localization in each spatial hemifield relies on the integrity of the
contralateral IC (Litovsky, Fligor, & Tramo, 2002). Beyond this gross lateralization of function,
however, evidence of spatiotopic organization in the IC is weak. A study of the macaque monkey
IC found that while individual neurons are spatially tuned, the overall topographic organization
within each IC is tonotopic rather than spatiotopic (Zwiers, Versnel, & Van Opstal, 2004). A
study of the guinea pig IC, however, found a weak spatiotopy, with more caudal areas of the IC
representing more peripheral locations in the contralateral auditory hemifield (Binns, Grant,
Withington, & Keating, 1992).
Each IC sends ipsilateral projections rostrally to the multisensory superior colliculus (SC)
(Moore & Goldberg, 1966; Oliver & Huerta, 1992), where the auditory inputs interact with
retinotopically organized projections from the visual system to produce a spatiotopic map of
auditory space (King & Palmer, 1983). Unlike the auditory neurons in the IC, auditory neurons
of the SC show spatial sensitivity that changes with eye position, reflecting a transformation
from a head-centred to a retina-centred encoding of auditory space (Hartline, Vimal, King,
Kurylo, & Northmore, 1995; Jay & Sparks, 1987b). A shared coordinate system between the SC
auditory space map and overlying visual space map allows topographical alignment to be
maintained as the eyes move in the orbits (Jay & Sparks, 1987a), and is likely essential for the
integration of auditory and visual location signals that occurs in deeper layers of the SC
(Meredith & Stein, 1986b).
1.2.2.4 Normal Development of Sound Localization
Like the visual system (Mayer & Dobson, 1982), the auditory system is immature at birth, and
follows a developmental trajectory shaped by sensory input in early life (King, 2009; Litovsky &
Ashmead, 1997). As soon as 10 minutes after birth, neonates show a slow orienting response
to sounds presented in the left and right hemifields (Clifton, Morrongiello, Kulig, & Dowd,
1981; Muir & Field, 1979; Wertheimer, 1961). Once infants develop sufficient neck control, the
precision of horizontal sound localization can be studied using a minimum audible angle (MAA)
task, which measures the smallest change in sound source position that can be reliably perceived
(Mills, 1958). At 6 months of age, the MAA is about 20°, but improves gradually until reaching
adult levels (about 1° to 2°) sometime after 18 months of age but before 5 years of age
(Ashmead, Clifton, & Perris, 1987; Litovsky, 1997; Mills, 1958; Morrongiello, 1988). In
infancy, sensitivity to both ILD and ITD cues is poorer than in adults, but
sensitivity to ITD cues is better than would be predicted by measurements of MAA for free-field
sound sources (Ashmead, Davis, Whalen, & Odom, 1991; Ashmead, Grantham, Murphy, &
Tharpe, 1993). This lack of real-world auditory localization precision is postulated to allow for
easy recalibration of sound localization as head growth causes rapid changes in the mapping of
ILD, ITD, and spectral cues to physical space (Ashmead et al., 1991; Clifton, Gwiazda, Bauer,
Clarkson, & Held, 1988).
1.2.2.5 Influence of Vision on the Development of Sound Localization
Spatial tuning and calibration of the developing auditory system does not occur in isolation, but
is strongly influenced by visual experience in early life. As noted previously, orienting responses
to auditory spatial cues are present in humans at birth (Clifton et al., 1981; Muir & Field, 1979;
Wertheimer, 1961). Similarly, animal studies have shown that a rudimentary map of auditory
space is present in the mammalian superior colliculus at birth (King & Carlile, 1993). Although
visual input is not necessary for the development of normal or even supra-normal spatial hearing
abilities (Lessard, Pare, Lepore, & Lassonde, 1998; Röder et al., 1999; Voss et al., 2004),
evidence from animals and humans shows that abnormal visual input can cause significant,
permanent alterations to the processing of auditory spatial cues. Knudsen and Knudsen (1989)
showed that barn owls reared wearing prism spectacles mislocalize sounds presented in darkness
in the direction of the early visual field shift. Owls reared with prisms for less than 6 months
tended to recover normal sound localization following prism removal, but those reared with
prisms for more than 6 months did not recover normal sound localization abilities, indicating a
permanent developmental miscalibration of spatial hearing (Knudsen & Knudsen, 1990). These
behavioural changes in sound localization were accompanied by shifts in the auditory space map
in the optic tectum, the avian homolog to the mammalian SC (Knudsen & Brainard, 1991).
Similarly, ferrets reared with experimentally induced strabismus in one eye show a
corresponding shift in spatial tuning of neurons in the contralateral SC (King, Hutchings, Moore,
& Blakemore, 1988). Ferrets reared with a more severe visual distortion by rotation of one eye
about the visual axis show complete disorganization of the SC auditory map (King et al., 1988).
These animal studies indicate that vision has a dominant influence on the calibration of sound
localization, regardless of whether the visual input is spatially accurate.
Studies of clinical populations with visual impairment confirm that early visual experience can
affect the developmental calibration of spatial hearing in humans as well. Lessard et al. (1998)
found that people with early-onset bilateral visual impairment (i.e., congenital blindness in the
central visual field) localize sounds with poor precision and accuracy. Similarly, Gori, Sandini,
Martinoli, and Burr (2014) found that congenitally blind adults perform poorly on auditory
spatial bisection tasks, indicating an impairment in the encoding of Euclidean auditory
relationships. Interestingly, several studies of visually impaired populations show sensory
compensation by the auditory system for the loss of visual spatial abilities. For example, people
who are totally blind from birth can localize sounds with normal accuracy in the central region of
space (Lessard et al., 1998), and display supra-normal localization abilities in peripheral space
(i.e., when sounds are presented laterally) (Röder et al., 1999). People who lost one eye in
early life also show better-than-normal accuracy in sound localization in the
central region of space (Hoover, Harris, & Steeves, 2012). Although early visual experience
strongly influences the development of spatial hearing, vision is clearly not necessary to attain
sound localization abilities that are equivalent to or exceed normal performance in many
respects.
1.2.2.6 Sensitive Periods for the Development of Sound Localization
In general, the auditory system remains adaptable to changes in the spatial mapping of binaural
cues, even into adulthood (Moore, 1993). Ferrets raised with chronic monaural occlusion adapt
to the skewed ILD cues and localize sound normally at maturity as long as the occlusion is
maintained (King, Parsons, & Moore, 2000). Adult ferrets raised with normal binaural
experience show impaired sound localization immediately following monaural occlusion, but
recover near-normal spatial hearing within 6 months, indicating the capacity for adaptation in
adulthood (King et al., 2000). Similarly, adult humans who went through childhood with normal
sensory experience show systematic bias in sound localization immediately following monaural
occlusion, but can adapt back to pre-occlusion performance within 1 week with training
(Kumpik, Kacelnik, & King, 2010). Like the visual system, however, the auditory system has
sensitive periods in early life during which abnormal auditory and visual input can permanently
alter the processing of binaural cues and the localization of sound (see Keuroghlian and Knudsen
(2007) for review).
Evidence that accurate and precise sound localization depends on auditory experience in early
life is available from animal studies and reports on human populations with early-onset hearing
impairment. Experiments on monaural occlusion in young barn owls found that if owls are
younger than 8 weeks of age at the time of ear plugging, they can fully adapt to the abnormal
binaural cues (Knudsen, Esterly, & Knudsen, 1984). If they are older than 8 weeks, however,
adaptation does not occur and they remain tuned to normal binaural cues, indicating that the
sensitive period for damage to sound localization by abnormal binaural experience ends at about
8 weeks of age (Knudsen, Esterly, et al., 1984; Knudsen & Knudsen, 1986). The sensitive period
for recovery of sound localization in barn owls appears to be longer, however: researchers
subjected barn owls to monaural occlusion starting before 8 weeks of age, and found that the rate
of recovery slowed as a function of the age at ear plug removal, with virtually no recovery
observed if occlusion persisted for 38 to 42 weeks (Knudsen, Knudsen, & Esterly, 1984).
Studies of children with early deafness followed by restoration of hearing by cochlear
implantation suggest an analogous sensitive period in humans. One study of children with
bilateral cochlear implants found that those who acquired profound deafness after 3.5 years of
age localized sounds more accurately than congenitally deaf children, regardless of the age of
cochlear implantation (Killan, Royle, Totten, Raine, & Lovett, 2015). Other researchers have
measured spatial release from masking (a psychoacoustic measure of the ability of a listener to
separate acoustic signals from noisy backgrounds on the basis of spatial cues) as a marker of
spatial hearing ability in children with a history of early peripheral hearing loss. A study of
children age 5 to 13 years with a prior history of otitis media with effusion (i.e., middle ear
disease causing transient hearing loss) found that spatial release from masking did not improve
even after pure tone thresholds had normalized (Pillsbury, Grose, & Hall, 1991). This effect
of early auditory deprivation was confirmed by a later study that found that otitis media with effusion
for a cumulative total of 2.5 years or more during the first 5 years of life resulted in residual
impairments in binaural hearing (Hogan & Moore, 2003). These findings demonstrate that
abnormal auditory experience during the first several years can produce lasting deficits in spatial
hearing.
The influence of visual experience on sound localization also demonstrates sensitive periods for
damage and recovery in early life (see King (2009) for review). This has been most extensively
studied in the barn owl. Knudsen and Knudsen (1990) showed that the accuracy of sound
localization is maximally affected by visual field displacement when prism spectacle experience
began by 3 weeks of age. This sensitive period for damage gradually tapered off, ending between
15 and 38 weeks of age. They also examined the sensitive period for recovery from vision-induced
sound mislocalization, and found that the recovery was full if normal vision was restored by
about 26 weeks of age, but absent if normal vision was restored after 29 weeks of age. In barn
owls, therefore, the sensitive period for maximal recovery appears to extend beyond the sensitive
period for maximal damage. Detailed data of this kind are not available for humans, but
investigations comparing sound localization performance in the early-blind and late-blind
generally show some normal to supra-normal abilities in both groups (Abel, Figueiredo, Consoli,
Birt, & Papsin, 2009; Ashmead et al., 1998; Fieger, Röder, Teder-Salejarvi, Hillyard, & Neville,
2006; Hoover et al., 2012; Lessard et al., 1998; Röder et al., 1999; Voss et al., 2004), but deficits
on some tasks only among the early-blind (Gori et al., 2014; Lessard et al., 1998; Zwiers, Van
Opstal, & Cruysberg, 2001). These findings suggest the presence of a sensitive period for
damage by abnormal visual experience in early life for humans.
1.2.3 Auditory Temporal Processing
The human auditory system is capable of perceiving the temporal structure of sound over a
remarkable range of time scales. Successful processing and integration of temporal cues on the
order of microseconds enables perception of spatial location by ITD cues, and integration on the
order of milliseconds allows detection and discrimination on the basis of simultaneity,
non-simultaneity, temporal order, and duration (see Wittmann (1999) for review). Perception and
recognition of temporal structure and patterns on the order of seconds enables the comprehension
of speech and appreciation of music (Pöppel, 1997; Warren, 2008). The limits of temporal
resolution are not equal for all perceptual tasks, however, suggesting that different neural
mechanisms are at play (Hirsh, 1959).
Extraction of spatial information from ITD cues demands higher temporal precision than any
other neural computation in the human brain (Grothe et al., 2010). Based on spatial
discrimination thresholds, the temporal resolution limit for ITD cues is approximately 10 μs
(Klumpp & Eady, 1956). Unlike other types of auditory temporal processing, however, the
perceptual result of ITD processing is spatial, not temporal, in nature. The reliable perception of
time in the auditory system requires temporal separation at least two orders of magnitude greater
than that for ITDs: a temporal interval of at least a few milliseconds is required for a listener to reliably
perceive two brief sounds presented in sequence as non-simultaneous, and a separation of about
20 ms is needed for a listener to reliably report the order in which they occurred (Hirsh, 1959;
Hirsh & Sherrick, 1961; Kanabus, Szelag, Rojek, & Poppel, 2002).
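To give a sense of scale for the 10 μs limit, the sketch below converts between source azimuth and ITD using a simplified path-difference model, ITD ≈ (d/c)·sin θ. The model itself, the interaural distance of 0.18 m, the speed of sound of 343 m/s, and the function names are illustrative assumptions and are not taken from the studies cited above.

```python
import math

# Assumed constants for illustration only (not values from the cited studies).
SPEED_OF_SOUND_M_S = 343.0     # air at roughly 20 degrees Celsius
INTERAURAL_DISTANCE_M = 0.18   # approximate adult ear-to-ear distance


def itd_microseconds(azimuth_deg: float) -> float:
    """Approximate ITD for a distant source under a simple path-difference model."""
    path_difference_m = INTERAURAL_DISTANCE_M * math.sin(math.radians(azimuth_deg))
    return path_difference_m / SPEED_OF_SOUND_M_S * 1e6


def azimuth_for_itd(itd_us: float) -> float:
    """Invert the model: azimuth in degrees that would produce a given ITD."""
    ratio = itd_us * 1e-6 * SPEED_OF_SOUND_M_S / INTERAURAL_DISTANCE_M
    return math.degrees(math.asin(ratio))


if __name__ == "__main__":
    print(f"Azimuth shift for a 10 us ITD: {azimuth_for_itd(10.0):.2f} deg")
    print(f"ITD for a source at 90 deg: {itd_microseconds(90.0):.0f} us")
```

Under these assumptions, an ITD of 10 μs corresponds to a shift of roughly 1° in azimuth near the midline, and the largest possible ITD is on the order of 500 μs.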
1.2.3.1 Normal Development of Auditory Temporal Processing
As in the spatial domain, performance on auditory perceptual tasks in the temporal domain
improves throughout development. Discrimination thresholds of auditory temporal order for
clicks presented to alternate ears improve from about 130 ms at 5 years of age to about 70 ms by
11 years of age (Berwanger, Wittmann, von Steinbuchel, & von Suchodoletz, 2004), but this
methodology may be confounded with spatial discrimination by ITD cues. Others have avoided
this spatial confound by examining the thresholds for purely temporal stimuli. Davis and
McCroskey (1980) examined auditory temporal resolution by measuring the minimum inter-
stimulus interval between two brief (17 ms) sounds before they are perceptually fused into one
event. They found that the auditory fusion threshold improved dramatically from between 20 and
24 ms at 3 years of age to between 6 and 10 ms at 8 years of age, before stabilizing at between 4 and
8 ms at 9 to 11 years of age. Others have quantified auditory temporal resolution by measuring
the detection threshold for a temporal gap (i.e., silence) within a burst of continuous noise.
Similar to the auditory fusion threshold, the auditory gap detection threshold was found to
improve significantly through childhood, reaching adult levels by 11 years of age (Irwin, Ball,
Kay, Stillman, & Rosser, 1985; Wightman, Allen, Dolan, Kistler, & Jamieson, 1989). Gap
detection thresholds vary by stimulus frequency and level, but for broadband noise at 40 dB SPL, they
improve from approximately 18 ms at 6 years of age to 6 ms at 11 years of age (Irwin et al.,
1985). Further supporting this time course of development for auditory temporal processing,
Tallal (1978) found that temporal order discrimination for two tones of different frequency
improved through childhood, reaching adult levels by approximately 9 years of age.
1.2.3.2 Influences of Abnormal Sensory Experience on Auditory Temporal Processing
Like the development of auditory spatial processing abilities, the development of auditory
temporal processing abilities is also dependent upon normal sensory experience in early life. A
study of auditory gap detection showed that adult listeners with congenital moderate
sensorineural hearing loss have higher gap detection thresholds than normal hearing controls,
even after controlling for audibility of the stimuli (DeFilippo & Snell, 1986). Similar results were
reported among adult cochlear implant users with later-onset hearing loss, however, raising the
possibility that this deficit is not developmental in origin (Blankenship, Zhang, & Keith, 2016).
Another study, however, found that children with a history of transient hearing loss from otitis
media with effusion prior to 5 years of age have poorer auditory gap detection thresholds
compared to unaffected children (Khavarghazalani, Farahani, Emadi, & Hosseni Dastgerdi,
2016). Their findings indicate that long-lasting effects on gap detection thresholds can arise from
auditory deprivation in early life, and suggest that early childhood encompasses a sensitive
period for auditory temporal processing (Khavarghazalani et al., 2016). Temporal processing
deficits exist at larger time scales as well. A study of older children examining their ability to
reproduce a temporal sequence presented as a series of suprathreshold tones reported that
children with early hearing loss do so less accurately than normal hearing controls (Sterritt,
Camp, & Lipman, 1966). Interestingly, early hearing impairment affects temporal processing in
other sense modalities as well. For example, in adults who became deaf before 2 years of age,
simultaneity judgment thresholds are dramatically impaired for unisensory visual and unisensory
tactile stimuli (Heming & Brown, 2005). Conversely, early visual impairment is associated with
superior temporal auditory resolution: early blind adults outperform developmentally typical
adults in auditory temporal order judgments (Stevens & Weaver, 2005), and possibly in auditory
gap detection (Muchnik, Efrati, Nemeth, Malin, & Hildesheimer, 1991; Weaver & Stevens,
2006). These heightened auditory temporal processing abilities in the early blind may represent
sensory compensation in the absence of vision.
The interactions between vision and audition are reviewed further in the following section.
1.3 Multisensory Processing and Integration
1.3.1 Overview
Each sense modality provides a unique window on the external world. The visual system detects
energy in the form of electromagnetic radiation in the visible spectrum. The auditory system
detects energy in the form of air pressure fluctuations. The somatosensory system detects thermal
and mechanical energy delivered to the body in the form of touch, pain, vibration, and
movement. In many instances, the senses provide modality-specific information about features in
the environment. Colour, for instance, is specific to vision, while pitch is specific to audition. In
other instances, however, the different senses provide information on shared features of a
stimulus that are not specific to a particular modality. Object features accessible to more than one
sensory modality, such as intensity, spatial location, and temporal frequency or rhythmic
structure, have been variously referred to as amodal attributes (Lewkowicz, 2000), intermodal
invariants (Gibson, 1966), common sensibles (Marks, 1978), and intersensory equivalencies
(Lewkowicz & Lickliter, 1994). When watching a percussionist, for example, the occurrence of
drum beats in space and time is often accessible to both the visual and auditory systems, and if
he or she is playing very intensely, to the somatosensory system as well. These unisensory
information streams converge in the central nervous system, which serves the fundamental
function of combining these multisensory information streams to synthesize a coherent and
biologically meaningful internal representation of the external world. This function, termed
perceptual binding, is of critical importance for survival because its perceptual result determines
an organism’s ability to react quickly, precisely, and appropriately to salient stimuli (Ernst &
Bulthoff, 2004).
In any scientific endeavour, clear definitions of the processes and phenomena under discussion
are essential for accurate and nuanced understanding of the subject matter and for the ultimate
advancement of knowledge. In the literature dealing with multisensory perception, however,
terminology is often applied variably. In an effort to address this semantic confusion, Stein et al.
(2010) published a multi-author consensus paper defining key terms in the field. Their
conceptual framework and essential definitions will be presented briefly. “Multisensory
processing” is an umbrella term referring to any neural or behavioural phenomenon associated
with two or more sensory modalities. “Cross-modal matching” is a multisensory process in
which stimulus features from different sensory modalities are compared to estimate their
equivalence. The unisensory features being compared may be temporal in nature (e.g., time of
onset or duration), spatial in nature (e.g., spatial location or frequency), or relate to identity (e.g.,
matching lip movements to sound). “Multisensory integration” is a multisensory process that
involves the combination of unisensory signals to produce a neural or behavioural response that
is significantly different from its component inputs. Unlike cross-modal matching, multisensory
integration does not require the preservation of the features of the unisensory stimuli, but instead
fuses the common features to create a new neural response or behavioural percept (Stein et al.,
2010).
Multisensory processing in the central nervous system not only provides a richer, more complete
perceptual gestalt, but also forms the foundation for cross-modal associative learning in infancy (Stein,
Stanford, & Rowland, 2014) and enables cross-modal communication essential for sensory
calibration during development (Gori, 2015; Knudsen & Knudsen, 1990). As these processes
undergo experience-dependent maturation, integrative capacities emerge at both the neuronal and
behavioural levels (Stein et al., 2014; Wallace, 2004). At maturity, the synthesis and integration
of sensory information across modalities confers distinct perceptual advantages in response
speed, precision, and detection thresholds (Stein & Meredith, 1993).
The adaptive advantage of multisensory processing and integration is perhaps most obvious in its
effect on reaction times. For example, simple manual response times to multisensory cues are
shorter than to cues presented in the visual, auditory, or tactile modality alone (Andreassi &
Greco, 1975; Forster, Cavina-Pratesi, Aglioti, & Berlucchi, 2002; Hershenson, 1962). Latencies
to initiate saccadic eye movements that shift gaze to a target of interest are similarly reduced when
redundant cues are presented in more than one sensory modality (Frens, Van Opstal, & Van Der
Willigen, 1995; Harrington & Peck, 1998; Hughes, Reuter-Lorenz, Nozawa, & Fendrich, 1994).
While spatially concordant multisensory cues produce the shortest saccadic latencies, pairing a
visual target with a spatially neutral cue also improves performance compared to visual-only
presentations (Colonius & Diederich, 2004).
The benefits of multisensory processing are also evident in the ability to perceive subtle stimuli
near the threshold of detectability. In cats, detection of low-intensity visual stimuli is improved
by presentation of spatially coincident but task irrelevant auditory cues (Stein, Meredith,
Huneycutt, & McDade, 1989). In humans, sensitivity to near-threshold visual stimuli is similarly
enhanced by presentation of a sound in close spatial and temporal proximity (Frassinetti,
Bolognini, & Ladavas, 2002; Noesselt, Bergmann, Hake, Heinze, & Fendrich, 2008).
Conversely, sensitivity to faint sounds is enhanced by simultaneous presentation of a neutral
light cue (Lovelace, Stein, & Wallace, 2003).
In addition to reaction times and detection thresholds, multisensory processing can improve the
precision and accuracy of sensory perception as well. For instance, in circumstances where the
spatial reliability of visual and auditory signals is similar, localization precision and accuracy for
spatially aligned audiovisual stimuli are better than for stimuli presented in either modality alone
(Alais & Burr, 2004; Stevenson, Fister, Barnett, Nidiffer, & Wallace, 2012). In the temporal
dimension, in-phase auditory flutter improves detection of visual flicker, lowering the critical
flicker frequency (Ogilvie, 1956). Similar multisensory enhancement is also observed in the
accuracy of saccades to bimodal audiovisual targets (Corneil, Van Wanrooij, Munoz, & Van
Opstal, 2002). In the visuotactile realm, the ability to discriminate differences in object size is
better when visual and tactual cues are available compared to when cues are available in only
one modality (Ernst & Banks, 2002). Combined multisensory cues have also been shown to
improve speech comprehension. The addition of visual cues (i.e., lip-reading) significantly
improves the intelligibility of auditory speech for hearing impaired listeners (Grant, Walden, &
Seitz, 1998), and for normal hearing listeners in noisy environments (Driver, 1996; Grant &
Seitz, 2000; Sumby & Pollack, 1954).
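The precision advantage for audiovisual localization described above is the behaviour predicted by maximum likelihood estimation (MLE) models of cue combination, in which each unisensory estimate is weighted by its reliability (inverse variance). The following is a minimal sketch of that weighting rule, assuming independent Gaussian noise in each modality; the function name and the numerical values are hypothetical and serve only as illustration.

```python
import math


def mle_combine(est_v: float, sigma_v: float, est_a: float, sigma_a: float):
    """Reliability-weighted combination of a visual and an auditory estimate.

    Assumes independent Gaussian noise; sigma_v and sigma_a are the
    unisensory standard deviations (e.g., localization precision in degrees).
    Returns the combined estimate and its predicted standard deviation.
    """
    var_v, var_a = sigma_v ** 2, sigma_a ** 2
    w_v = var_a / (var_v + var_a)   # weight on vision grows as audition gets noisier
    w_a = 1.0 - w_v
    combined = w_v * est_v + w_a * est_a
    sigma_av = math.sqrt(var_v * var_a / (var_v + var_a))
    return combined, sigma_av


if __name__ == "__main__":
    # Hypothetical values: equally reliable cues located 4 degrees apart.
    location, sigma = mle_combine(est_v=0.0, sigma_v=2.0, est_a=4.0, sigma_a=2.0)
    print(f"combined location = {location:.2f} deg, predicted SD = {sigma:.2f} deg")
```

When the two cues are equally reliable, as in this example, the predicted bimodal standard deviation is smaller than either unisensory value by a factor of √2; when one cue is far more reliable, the bimodal estimate approaches that cue alone.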
Beyond the advantages in precision, accuracy, and reaction times, multisensory processing also
plays an important role in resolving perceptual ambiguities that exist at the unisensory level
(Ernst & Bulthoff, 2004; Green & Angelaki, 2010). A common example of this is the self-motion
illusion (i.e., vection) that occurs when stationary trains begin to move relative to one another.
To the visual system of a passenger inside one train, viewing visual motion of the neighbouring
train through the window is perceptually ambiguous: the same visual stimulus may be generated
by the passenger’s train moving forward, or by the neighbouring train moving backward. This
ambiguity is resolved, however, by multisensory combination of visual motion cues with
vestibular motion cues (Ernst & Bulthoff, 2004; Fetsch, Turner, DeAngelis, & Angelaki, 2009).
In the audiovisual realm, sound can similarly shift perception of ambiguous visual motion paths.
When two simple dots are animated to converge and diverge, they may be perceived as
streaming through one another or as colliding and bouncing off one another. The addition of a
brief sound at the moment of convergence, however, biases the percept towards a collision and
bounce (Sekuler, Sekuler, & Lau, 1997).
The above examples demonstrate the adaptive advantages of combined processing and
integration of multiple sensory streams by the nervous system in scenarios with ecological
validity. The tendency toward integration may also give rise to illusions, however, in scenarios
where the stimuli are manipulated experimentally to introduce disparity or incongruity in one or
more dimensions (e.g., space, time, semantic content, or numerosity). Such illusions have been
exploited extensively by researchers to determine the neural and computational underpinnings of
multisensory perception (Calvert, Spence, & Stein, 2004; Stein & Meredith, 1993). For example,
in the classic ventriloquism effect, spatial disparity between the location of a visual stimulus and
the source of a corresponding sound biases, or ‘captures’, the perceived spatial origin of the
sound (Howard & Templeton, 1966). In the more recently described temporal ventriloquism
effect, temporal disparity between visual and auditory cues biases perception: the apparent
interval between the onset of two lights can be shortened by paired auditory clicks that
temporally intervene between the lights, or lengthened by paired clicks that temporally flank the
lights (Morein-Zamir, Soto-Faraco, & Kingstone, 2003). In the well-known McGurk effect,
incongruity between semantic content of an auditory speech signal and the accompanying visual
speech signal alters the perceived auditory signal (McGurk & MacDonald, 1976). For instance,
when audio of the syllable /ba/ is paired with video of the syllable /ga/, the resulting percept in
developmentally normal individuals is rarely veridical, but most commonly an illusory /da/.
Incongruities in numerosity between rapid sequential visual flashes and auditory beeps also elicit
a multisensory illusion: a single flash accompanied by multiple beeps is perceived as multiple
flashes (Shams, Kamitani, & Shimojo, 2000, 2002).
The phenomena described above are a small selection of the numerous multisensory effects
described in the scientific literature, but serve to underscore the extent to which perception is
shaped by multisensory interactions.
1.3.2 Influence of Cognitive Factors in Multisensory Processing
Although numerous studies suggest that multisensory perceptual binding is a rapid, pre-attentive,
stimulus-driven process that occurs without the observer’s awareness (Driver, 1996; McGurk &
MacDonald, 1976; Sekuler et al., 1997), several cognitive factors including attention and
decisional bias have been shown to modulate performance on multisensory tasks.
Attention is an essential cognitive function that enables an observer to select relevant stimuli so
that greater neural resources may be devoted to their processing (Talsma, Senkowski, Soto-
Faraco, & Woldorff, 2010). The interactions between attention and multisensory processing are
bidirectional: salient stimuli may alter attentional selection by a bottom-up alerting mechanism,
while top-down directed attention can alter the manner in which multisensory stimuli are
processed (Theeuwes, 1991). The influence of directed attention on multisensory processing is
evident in studies of the McGurk effect—an audiovisual speech illusion (Navarra, Alsius, Soto-
Faraco, & Spence, 2010). When attentional resources shift to a secondary task, susceptibility to
the McGurk effect decreases despite direct viewing of the speaker’s face (Alsius, Navarra,
Campbell, & Soto-Faraco, 2005). Similarly, when attention is directed to a tactile stimulus while
performing the McGurk task, observers’ susceptibility to the audiovisual illusion decreases
(Alsius, Navarra, & Soto-Faraco, 2007). In addition to altering the McGurk effect, directed
attention may also modulate multisensory enhancement in reaction times. For example, in an
audiovisual cue discrimination task, focusing attention on a single modality abolishes the
multisensory enhancement in reaction time observed when attention is not explicitly directed
(Mozolic, Hugenschmidt, Peiffer, & Laurienti, 2008). Importantly, however, not all multisensory
phenomena are influenced by directed attention. The spatial ventriloquism effect is influenced by
neither directed attention (i.e., top-down attentional effects) (Bertelson, Vroomen, De Gelder, &
Driver, 2000), nor automatic attention (i.e., bottom-up alerting effects) (Vroomen, Bertelson, &
De Gelder, 2001). Similarly, visual perceptual enhancement in the temporal ventriloquism effect
is not accounted for by attentional alerting or distraction by accompanying auditory clicks
(Morein-Zamir et al., 2003).
Another hypothesized effect of attention on sensory perception is that of prior entry. According
to the law of prior entry, a stimulus presented in an attended modality or location will be
perceived before a stimulus in an unattended modality or location (Titchener, 1908). Scientific
support for this effect is largely derived from studies of audiovisual temporal order judgment
(TOJ) tasks. These studies show that the point of subjective simultaneity (PSS) shifts as a
function of the attended modality, speeding perception by an estimated 14 ms (Schneider &
Bavelier, 2003; Spence, Shore, & Klein, 2001; Zampini, Shore, & Spence, 2005). While
different explanations have been proposed, including sensory acceleration (Stelmach &
Herdman, 1991; Tünnermann, Petersen, & Scharlau, 2015) and enhanced perceptual sensitivity
(Schneider & Bavelier, 2003), the underlying mechanism of the effect remains controversial
(Frey, 1990; Spence & Parise, 2010).
Many psychophysical tasks probing multisensory processes are also susceptible to response
biases that may mimic lower-level sensory interactions (Welch, 1999). In signal detection theory,
this concept is formalized as the criterion parameter (Green & Swets, 1966). Importantly, the
criterion can vary, or shift, independently of the internal signal characteristics, and both factors
determine the perceptual response. Consequently, identifying the confound of response bias or
criterion shift is particularly problematic in experimental paradigms (e.g., yes/no judgments) that
do not have a stochastically-determined noise floor like that built into the 2-alternative forced
choice method (Fechner, 1889; Welch, 1999). For example, an investigation of audiovisual
simultaneity judgments and temporal order judgments showed that the mechanisms driving
perceptual changes in these tasks are ambiguous, and may be explained equally by changes in
neural timing or by criterion shifts (Yarrow, Jahn, Durant, & Arnold, 2011). Less equivocal
evidence of cognitive response bias is found in studies of intersensory bias, however, which
show a relatively larger magnitude of multisensory interaction among pairings of realistic,
contextually-rich, semantically congruent (i.e., “compelling”) stimuli (e.g., a steam kettle and a
whistle sound) versus arbitrary, unfamiliar, unrealistic stimuli (e.g., a dot and a tone) (Warren,
Welch, & McCarthy, 1981). Similarly, asynchrony between speech video clips and audio tracks
is easier to detect with gender mismatched stimuli compared with gender matched stimuli
(Vatakis & Spence, 2007), but no such effect was found for matched and mismatched non-
speech stimuli (Vatakis & Spence, 2008).
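The distinction drawn above between internal signal characteristics and the response criterion can be made concrete with the standard equal-variance Gaussian signal detection indices for a yes/no task: sensitivity d′ = z(H) − z(F) and criterion c = −[z(H) + z(F)]/2, where H and F are the hit and false-alarm rates. The sketch below is illustrative only; the function name and the example rates are hypothetical and are not drawn from any study cited here.

```python
from statistics import NormalDist


def sdt_indices(hit_rate: float, false_alarm_rate: float):
    """Return (d_prime, criterion) under the equal-variance Gaussian model.

    Both rates must lie strictly between 0 and 1; rates of exactly 0 or 1
    have no finite z-score and would need a correction before use.
    """
    z = NormalDist().inv_cdf            # inverse of the standard normal CDF
    z_hit, z_fa = z(hit_rate), z(false_alarm_rate)
    d_prime = z_hit - z_fa              # separation of signal and noise distributions
    criterion = -0.5 * (z_hit + z_fa)   # positive values indicate a conservative bias
    return d_prime, criterion


if __name__ == "__main__":
    # Two hypothetical observers: very different "yes" rates, similar sensitivity.
    for label, h, f in [("neutral", 0.77, 0.23), ("conservative", 0.60, 0.11)]:
        d, c = sdt_indices(h, f)
        print(f"{label}: d' = {d:.2f}, c = {c:.2f}")
```

Two observers can therefore produce quite different proportions of "yes" responses while having nearly the same underlying sensitivity, which is why a change in raw response rates alone cannot distinguish a sensory effect from a criterion shift.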
1.3.3 Neural Sites of Multisensory Processing
Perceptual information is segregated by modality in the peripheral nervous system, but
converges on a shared population of neurons for multisensory processing in the central nervous
system. Rather than a discrete locus, however, neurophysiologic and neuroimaging studies have
revealed numerous brain regions involved in multisensory functions, and a diversity of activation
patterns that vary by perceptual task and stimulus (see Calvert (2001) and Alais, Newell, and
Mamassian (2010) for reviews).
1.3.3.1 Superior Colliculus
Multisensory integration has been most extensively studied in the superior colliculus (SC), a
phylogenetically ancient midbrain structure common to all mammals and homologous to the
optic tectum in other vertebrates (King, 2004). It receives ascending multisensory signals—
visual, auditory, and somatosensory (Meredith & Stein, 1986b)—and plays key roles in directing
attention to stimuli of interest (Robinson & Kertzman, 1995) and mediating saccadic eye
movements (Schiller & Stryker, 1972; Sparks, 1986). Neurons in the superficial layers of the SC
are unisensory and respond solely to visual stimulation, while cells in the deeper layers are
multisensory and respond to combinations of visual, auditory, and somatosensory stimuli (King,
2004). In addition to this superficial-to-deep organization by modality, neurons within each layer
of the SC are topographically arranged according to their spatial receptive fields (Cynader &
Berman, 1972; Lane, Allman, Kaas, & Miezin, 1973; Sparks, 1988; Wallace, Wilkinson, &
Stein, 1996). Importantly, the multimodal spatial maps in the SC are in spatial register, so that
overlapping receptive fields of visual-, auditory-, and somatosensory-responsive neurons map to
the same region of space (Sparks, 1988; Wallace et al., 1996). The superficial visual receptive
fields are organized in a retinotopic manner similar to the primary visual cortex: each SC
contains a representation of the contralateral visual field, with the central-peripheral axis
represented rostro-caudally, and the superior-inferior axis represented latero-medially (Lane et
al., 1973). The deeper auditory and somatosensory receptive fields have similar topography to
the visual map, and maintain their alignment by making compensatory shifts in response to
changes in eye position (Groh & Sparks, 1996; Hartline et al., 1995). The prioritization of spatial
alignment among the multisensory maps in the SC highlights its role in processing sensory
information according to spatial location—an amodal feature—regardless of its modality of
origin.
The neuroanatomic arrangement of retinocollicular projections is also distinct from the
arrangement of retinostriate projections, as shown in Figure 1.2 (Lane et al., 1973; Pollack &
Hickey, 1979). The right and left striate cortices (V1) receive approximately equal input from the
corresponding nasal and temporal retinas of each eye. Each SC, however, receives input
predominantly from the nasal retina (serving the temporal visual hemifield) of the contralateral
eye. As a result, the nasal and temporal visual hemifields of each eye are equally represented in
V1, but the temporal visual hemifield is over-represented in the SC.
Figure 1.2: Schematic diagram of retinal projections in the retinostriate and
retinocollicular pathways. (A) In the retinostriate pathway, retinal projections undergo a hemi-
decussation in the optic chiasm. The right and left primary visual cortices (V1) therefore receive
approximately equal input from the corresponding points in the nasal and temporal retinas of
each eye. (B) In the retinocollicular pathway, the majority of retinal projections originate in the
nasal retina of each eye and decussate fully in the optic chiasm. The right and left superior
colliculi (SC) therefore receive predominantly crossed input from the nasal retina of the
contralateral eye (serving the temporal visual field for that eye). LGN, lateral geniculate nucleus;
RE, right eye; LE, left eye. Modified from Zackon, Casson, Zafar, Stelmach, and Racette (1999).
Reprinted with permission from Elsevier.
Multisensory neurons in the SC exhibit several characteristic responses that offer insight into the
mechanisms and constraints of multisensory integration at this site (Holmes & Spence, 2005;
Stein, Stanford, Ramachandran, Perrault, & Rowland, 2009). First, the firing rate of multisensory
neurons tends to be enhanced when stimuli in different sense modalities originate from the same
location in space; the greater the spatial separation between two unimodal signals, the smaller the
multisensory response enhancement (Meredith & Stein, 1986a). This is referred to as the “spatial
rule” of multisensory integration. Second, neural responses in the multisensory layers of the SC
are greater when the stimuli in each modality occur at approximately the same time (Meredith,
Nemitz, & Stein, 1987). This is termed the “temporal rule” of multisensory integration. Third,
SC multisensory neurons driven by spatially congruent stimuli from different modalities show a
magnitude of response enhancement (i.e., an increase in firing rate) that is greater than the sum
of the responses to each unisensory stimulus alone (Meredith & Stein, 1986a). This phenomenon
is termed “superadditivity” (Stein & Meredith, 1993). Fourth, SC multisensory neuron response
enhancement (i.e., the magnitude of superadditivity) is generally greater for weak component
unisensory stimuli (Meredith & Stein, 1986b). This classical multisensory response is termed the
principle of “inverse effectiveness” (Stein & Meredith, 1993). Together, superadditivity and
inverse effectiveness comprise non-linear responses that enhance the saliency of weak but
spatially- and temporally-congruent stimuli when information is available from more than one
modality (Holmes & Spence, 2005).
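Response enhancement of this kind is conventionally quantified as the percentage by which the multisensory response exceeds the best unisensory response, while superadditivity is assessed against the sum of the unisensory responses. The sketch below illustrates both computations; the function names and the firing rates are hypothetical values chosen to illustrate inverse effectiveness, not recorded data.

```python
def enhancement_percent(visual: float, auditory: float, combined: float) -> float:
    """Multisensory enhancement relative to the best unisensory response (%)."""
    best_unisensory = max(visual, auditory)
    return (combined - best_unisensory) / best_unisensory * 100.0


def is_superadditive(visual: float, auditory: float, combined: float) -> bool:
    """True when the combined response exceeds the sum of the unisensory responses."""
    return combined > visual + auditory


if __name__ == "__main__":
    # Hypothetical mean responses (spikes per trial) for weak and strong stimuli.
    examples = {
        "weak stimuli": (2.0, 3.0, 10.0),
        "strong stimuli": (20.0, 25.0, 35.0),
    }
    for label, (v, a, av) in examples.items():
        print(f"{label}: enhancement = {enhancement_percent(v, a, av):.0f}%, "
              f"superadditive = {is_superadditive(v, a, av)}")
```

In this illustration the weak stimulus pair yields a large, superadditive enhancement, whereas the strong pair yields a smaller, subadditive one, which is the pattern that inverse effectiveness describes.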
These characteristic non-linear responses of multisensory SC neurons are not present at birth, but
develop in an experience-dependent manner. Single-cell recordings from the SC of newborn
monkeys show that although deep SC neurons respond to visual, auditory, and somatosensory
stimuli, the inputs are not integrated to produce response enhancement (Wallace & Stein, 2001).
Similar recordings in cats reared in darkness show that visual experience is necessary for
integrative responses to emerge in the SC (Wallace, Perrault, Hairston, & Stein, 2004). This
experience-dependent maturation of SC multisensory neurons is also mediated by descending
cortical input: in kittens, ablation of corticofugal pathways from the anterior ectosylvian sulcus
and the rostral lateral suprasylvian sulcus precludes development of integrated multisensory
responses (Jiang, Jiang, & Stein, 2006). In the adult cat, reversible cryogenic inactivation of
these cortical areas also temporarily blocks multisensory enhancement in SC neurons (Jiang,
Wallace, Jiang, Vaughan, & Stein, 2001).
Assuming normal experience-dependent development, probabilistic models suggest the
multisensory response enhancement in the SC, and its spatial and temporal constraints, represent
an optimal strategy to attend and orient to important environmental stimuli (Rowland, Stanford,
& Stein, 2007).
1.3.3.2 Cortical Areas
Beyond the SC, functional neuroimaging and electrophysiological studies show extensive
multisensory interactions in the human cerebral cortex (Figure 1.3). Although the regions
involved and precise patterns of activation are highly stimulus and task dependent, some general
patterns have emerged in the literature (Calvert, 2001). The superior temporal sulcus (STS) is
commonly activated in tasks requiring integration of complex multisensory stimuli, particularly
during audiovisual speech perception (Callan, Callan, Kroos, & Vatikiotis-Bateson, 2001;
Calvert, Campbell, & Brammer, 2000; Raij, Uutela, & Hari, 2000). Functional MRI data suggest
a posterior-to-anterior audiovisual processing gradient in the STS: the posterior regions respond
to audiovisual signals regardless of their spatiotemporal structure, the middle regions integrate
audiovisual signals according to their physical stimulus properties (i.e., spatiotemporal
correspondence), and the anterior regions integrate audiovisual signals according to their
linguistic content (Lee & Noppeney, 2011b). Activation of the STS is also observed during
illusory visual perception in the sound-induced flash illusion (Watkins, Shams, Tanaka, Haynes,
& Rees, 2006). The intraparietal sulcus (IPS) shows enhanced activation during spatial
localization and spatial attention tasks requiring cross-modal integration of audiovisual stimuli
(Bushara et al., 1999; Lewis, Beauchamp, & DeYoe, 2000) and visuotactile stimuli (E.
Macaluso, C. Frith, & J. Driver, 2000; E. Macaluso, C. D. Frith, & J. Driver, 2000). The cortex
of the insula and claustrum appear to have prominent roles in cross-modal matching of visual
and tactile stimuli (Banati, Goerres, Tjoa, Aggleton, & Grasby, 2000; Hadjikhani & Roland,
1998) and in processing temporal correspondence of visual and auditory stimuli (Bushara,
Grafman, & Hallett, 2001; Calvert, Hansen, Iversen, & Brammer, 2001). Areas of the frontal
lobes frequently show enhanced activation to multisensory stimuli (Banati et al., 2000; Bushara
et al., 2001; Callan et al., 2001; Calvert et al., 2000; Calvert et al., 2001; Lee & Noppeney,
2011b; Lewis et al., 2000; Raij et al., 2000), but their role in multisensory processing is less
distinct than more posterior areas. Some have speculated, however, that the frontal lobes serve to
process more arbitrary or abstract cross-modal associations (e.g., the association between
auditory and visual representations of alphabetical letters) (Calvert, 2001; Calvert et al., 2004;
Raij et al., 2000).
Figure 1.3: Summary of putative multisensory areas of the human brain based on primate
anatomical data, human psychophysical data, and functional neuroimaging studies. The left
image shows a lateral view of the brain; the right image shows the medial view of the brain.
From Calvert (2001). Reprinted with permission from Oxford University Press.
Figure 1.4: Posterior-to-anterior audiovisual processing gradient in the human STS.
Coloured areas show fMRI blood oxygenation level-dependent (BOLD) activation based on
different audiovisual stimulus features. The posterior STS responds to audiovisual signals
regardless of their spatiotemporal structure (magenta). The mid-STS responds to audiovisual
signals on the basis of spatiotemporal correspondence (cyan). The anterior STS responds to
audiovisual correspondence on the basis of linguistic content. The frontal lobe also shows areas
of BOLD activation to audiovisual stimuli. Figure modified from Lee and Noppeney (2011b),
with permission under the Creative Commons BY-NC-SA 3.0 Unported License.
Multisensory stimuli have also been shown to modulate functional responses in cortices
traditionally viewed as modality-specific (see Macaluso (2006) for review). For example, cross-
modal perceptual binding, as indicated by sound-induced change in visual motion, is associated
not only with activation of multimodal areas, but also reciprocal inactivation of unimodal areas
(Bushara et al., 2003). Conversely, cross-modal binding of audiovisual speech signals produces
response enhancement in the primary visual and auditory cortices as well as the multisensory
STS (Calvert et al., 1999; Calvert et al., 2000; Nath & Beauchamp, 2012). Similarly, spatial
correspondence between visual and tactile stimuli elevates brain activity not only in the IPS, but
also in the primary sensory cortices contralateral to the stimuli (Macaluso & Driver, 2001).
Findings such as these challenge the traditional view that sensory processing proceeds in a
hierarchical manner from primary unisensory areas, to secondary cortices and increasingly
multisensory areas. Instead, empirical findings indicate that multisensory perception depends on
both feed-forward and feed-back interactions between unisensory and multisensory areas
(Calvert, 2001; Macaluso, 2006).
1.3.4 Multisensory Integration
The brain processes multisensory information by comparing continuous unisensory inputs and
selectively combining, or binding together, related signals to improve the fidelity of perception
(Parise & Ernst, 2016). From this viewpoint, whether multisensory signals are perceptually
bound and to what extent they are integrated depends on their relatedness. Although signal
relatedness may be conceptualized on a continuous scale from 0% to 100%, it reflects a
probabilistic measure of a binary determination: whether unisensory signals in different
modalities arose from a common source or from different sources (Shams & Beierholm, 2010).
Indeed, the biological value of perceptual cues to an organism lies in the information they
convey about their extrinsic causes (Kording et al., 2007). If the constraints on multisensory cue
combination are very liberal, an organism risks inappropriately binding multisensory cues from
separate events, thus losing critical information about its environment. On the other hand, if the
constraints on multisensory cue combination are very strict, an organism may not integrate cues
arising from a single source, thus impairing its ability to quickly and precisely detect important
stimuli in its noisy sensory environment. The determination of relatedness and appropriate
integration of multisensory signals is therefore the central problem faced by a multisensory
perceptual system.
As is evident from studies on perceptual bias with experimentally-induced discrepancy between
multisensory stimuli, the perceptual system has a tendency to produce a perceptual experience
consistent with non-discrepant stimuli originating from a common source (Welch & Warren,
1980). Welch and Warren (1980) termed the observer’s belief, or perception, that two or more
unisensory cues belong together, or originate from a common source, the “unity assumption”.
They postulated that the strength of this assumption is a function of the extent of feature
correspondence (e.g., spatial, temporal, or identity) between the unisensory signals, and
cognitive factors such as attention and the overall “compellingness” of the stimulus complex
(Welch, 1999; Welch & Warren, 1980). Indeed, in their studies of the ventriloquism effect,
Thurlow and Jack (1973) showed that the strength of visual capture was dependent upon the
degree of spatial and semantic correspondence between the unisensory signals. If the visual and
auditory cues were too widely separated, or semantically unrelated (e.g., a video of a face
mouthing syllables paired with an auditory tone), the cross-modal interaction was relatively
diminished.
While the unity assumption predicts perceptual binding of related but discrepant multisensory
stimuli, the precise manner in which the perceptual system resolves the discrepancy is dependent
upon the stimuli and the perceptual task at hand. In many instances, one modality dominates or
‘captures’ the other in terms of its perceptual representation of the shared feature. In the spatial
dimension, vision tends to dominate when in conflict with other senses. Perhaps the best-known
demonstration of visual dominance is the classic ventriloquism effect, in which the spatial
position of a visual cue dominates and overrides the perceived location of the auditory cue
(Bertelson & Radeau, 1981; Howard & Templeton, 1966; Thurlow & Jack, 1973). Visual
dominance has also been described in the context of visuotactile discrepancy. When an object is
simultaneously grasped and viewed through a distorting lens, judgments of shape are strongly
biased toward the non-veridical visual input (Rock & Victor, 1964). In the temporal dimension,
however, audition tends to dominate vision. In situations of discrepant audiovisual flutter
frequencies, for example, the auditory rate drives, or entrains, the perceived rate of visual flutter
(Gebhard & Mowbray, 1959; Recanzone, 2003; Shipley, 1964). Similarly, multiple beeps
accompanying a single flash can create the illusion of multiple flashes (Shams et al., 2000,
2002), and in the temporal ventriloquism effect, auditory cues in close temporal proximity to
visual cues alter the perceived timing of the visual cues (Morein-Zamir et al., 2003).
1.3.5 Theories of Multisensory Integration and Modality Dominance
Various theories and models have been proposed to explain how the nervous system combines
multisensory cues and determines intersensory bias under different conditions.
1.3.5.1 Directed-Attention Hypothesis
The directed-attention hypothesis proposes that modality dominance in multisensory perception
is determined by differences in attention given to signals in each modality (Posner, Nissen, &
Klein, 1976; Welch & Warren, 1980). According to this hypothesis, visual dominance in
audiovisual location judgments and visuotactile shape judgments reflects a greater amount of
attention given to vision than to audition or touch (Welch & Warren, 1980). Posner et al. (1976)
proposed that because visual cues are less alerting than auditory cues, a greater proportion of
attention is tuned to vision. Furthermore, they suggested that visual dominance in the
multisensory percept is achieved through a mechanism of sensory facilitation termed prior entry
(Titchener, 1908). This theory found its support in studies of attentional manipulation in the
ventriloquism effect (Canon, 1970, 1971). However, virtually no effect of attentional
manipulation could be elicited in the setting of visual-proprioceptive positional discrepancy
(Pick, Warren, & Hay, 1969). The proposed attentional bias toward vision, in isolation, also
failed to explain the dominance of audition in intersensory discrepancy in the temporal
dimension (Gebhard & Mowbray, 1959; Shipley, 1964).
1.3.5.2 Modality Appropriateness and Precision Hypotheses
The modality appropriateness hypothesis begins with the assumption that the various sensory
modalities are not equally suited for the perception of any given event (Freides, 1974; O’Connor
& Hermelin, 1972; Welch & Warren, 1980). This theory states that while the different sensory
modalities are capable of many overlapping information processing functions, each has a subset
of functions that it performs better than other modalities. This inherent sensory processing
superiority, in turn, determines the relative bias toward, or dominance of, a particular modality.
A related theory—the modality precision hypothesis—defines the appropriate modality as the
one that has greatest precision for a given perceptual task (Choe, Welch, Gilford, & Juola, 1975;
Welch & Warren, 1980). Put another way, a modality’s dominance in a multisensory percept
may not reflect its inherent physiologic priority over other modalities, but rather its superior
precision for the perceptual dimension being probed (Witten & Knudsen, 2005). Information
from the most precise sense will therefore dominate in the fused percept. Because vision is more
reliable and precise than other senses in the spatial dimension (Recanzone, 2009; Witten &
Knudsen, 2005), it follows that vision dominates in spatial aspects of perception (Bertelson &
Radeau, 1981; Rock & Victor, 1964; Thurlow & Jack, 1973). Indeed, visual signals are rarely
subject to environmental distortion, and the topography of the retina maps directly to physical
space. Conversely, audition is more precise than other senses in the temporal domain
(Recanzone, 2009; Witten & Knudsen, 2005). The modality precision hypothesis therefore
predicts the dominance of audition observed in temporal aspects of perception (Gebhard &
Mowbray, 1959; Morein-Zamir et al., 2003; Recanzone, 2003; Shams et al., 2000; Shipley,
1964).
More recently, investigators have found that the typical dominance of vision and audition in
spatial and temporal perception, respectively, can be diminished or reversed. For example, in an
audiovisual frequency judgment task, the typical dominance of auditory flutter frequency over
visual flicker frequency was reversed by making the auditory temporal signal temporally
ambiguous (Wada, Kitagawa, & Noguchi, 2003). In an audiovisual spatial localization task,
blurring of the visual target reversed the typical dominance of vision over audition (Alais &
Burr, 2004). Similarly, studies of the visual and proprioceptive contributions to judgments of
limb position reported that the relative weighting of the sensory modalities dynamically adjusts
in response to degradation of the visual position signal (Mon-Williams, Wann, Jenkinson, &
Rushton, 1997). These findings indicate that modality dominance in multisensory perception is
not categorical and fixed, but continuous and flexible, with individual modalities dynamically re-
weighted based on the immediately-available sensory information. Although the modality
appropriateness and precision hypotheses predict variable dominance based on the perceptual
task, they do not explicitly account for these dynamic changes in perceptual weighting for a
given task (Wada et al., 2003).
1.3.5.3 Bayesian Inference and the Maximum Likelihood Estimation Model
As stated in section 1.3.4, the adaptive value of perceptual cues to an organism lies in the
information they convey about the external environment (Kording et al., 2007). It follows, then,
that the reliability and precision of sensory information are related to its survival value to the
organism (Kording et al., 2007). Because multisensory integration serves to enhance the
reliability and precision of sensory information (Jacobs, 2002), it can be modeled as a problem of
optimal combination (Ernst & Bulthoff, 2004). Bayes’ theorem provides a probabilistic
framework to formalize aspects of the modality precision hypothesis and allows construction of a
hypothetical ideal observer (Deneve & Pouget, 2004; Kersten & Yuille, 2003; Yuille & Bulthoff,
1996). Such an ideal observer may then be used as a reference standard with which to compare
actual human performance (Ernst & Bulthoff, 2004).
For a given external stimulus feature, $S$ (e.g. spatial location), and its sensory representation, $R$
(encoded as a random variable with noise), Bayes’ theorem states that the posterior probability
distribution, $P(S|R)$, is proportional to the product of the likelihood function, $P(R|S)$, and the
prior probability distribution $P(S)$:
\[ P(S|R) \propto P(R|S) \times P(S) \]
The likelihood function (i.e., the noise distribution) can be determined experimentally by
repeatedly presenting a stimulus at the same location, $S$, and measuring the variability in $R$. If all
positions of $S$ are equally likely, then $P(S)$ is a uniform distribution, and the theorem reduces to:
\[ P(S|R) \propto P(R|S) \]
The value of $S$ that maximizes the posterior probability is therefore the optimal estimate of the
stimulus location, $\hat{S}$:
\[ \hat{S} = \operatorname{argmax}_{S} P(S|R) \]
The same approach can be applied to multisensory stimuli (Deneve & Pouget, 2004). For
example, given an audiovisual location signal, $S_{AV}$, and its independent visual, $R_V$, and auditory,
$R_A$, representations in the sensory system, the optimal location estimate may be computed by:
\[ \hat{S}_{AV} = \operatorname{argmax}_{S_{AV}} P(S_{AV}|R_V, R_A) \]
Using Bayes’ theorem and the assumption of uniform prior distributions (Deneve & Pouget,
2004; Kersten & Yuille, 2003), the posterior distribution reduces to:
\[ P(S_{AV}|R_V, R_A) \propto P(S_{AV}|R_V) \times P(S_{AV}|R_A) \]
If $P(S_{AV}|R_V)$ and $P(S_{AV}|R_A)$ represent Gaussian distributions, then the optimal audiovisual location
estimate, $\hat{S}_{AV}$, is the weighted sum of the unimodal location estimates, $\hat{S}_V$ and $\hat{S}_A$:
\[ \hat{S}_{AV} = w_V \hat{S}_V + w_A \hat{S}_A \]
where the weights, $w_V$ and $w_A$, represent the unimodal location estimate reliabilities (i.e., the
inverse variances of their respective posterior probabilities) divided by a normalizing term:
\[ w_V = \frac{1/\sigma_V^2}{1/\sigma_V^2 + 1/\sigma_A^2} \quad \text{and} \quad w_A = \frac{1/\sigma_A^2}{1/\sigma_V^2 + 1/\sigma_A^2} \]
This special case of multisensory Bayesian inference with uniform priors is also known as the
Maximum Likelihood Estimation (MLE) model of multisensory integration (Deneve & Pouget,
2004; Ernst & Bulthoff, 2004; Yuille & Bulthoff, 1996).
In summary, the MLE model states that the optimal strategy for multisensory integration is to
combine sensory information into the most reliable composite estimate possible. It assumes that
the noise associated with each unisensory estimate is independent and normally distributed, so
that the statistically optimal combination is a simple weighted average where the perceptual
weight is determined by the normalized reciprocal variance of the unisensory estimate. The
uniform prior distributions in the MLE model imply that all possible values for the stimulus are
equally likely, and that the strength of the “unity assumption” (i.e., the belief that the various
unisensory cues belong together) is invariant (Chen & Spence, 2017).
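The weighted-average rule summarized above can be sketched numerically. The following Python snippet is a minimal illustration, not the analysis code used in this thesis; the function name `mle_combine` and the example values (a visual estimate at 0 deg with SD 1 deg, an auditory estimate at 4 deg with SD 2 deg) are hypothetical. It assumes independent Gaussian noise and uniform priors, as the MLE model does, and uses the standard result that the combined variance is the reciprocal of the summed reliabilities.

```python
def mle_combine(est_v, sd_v, est_a, sd_a):
    """Combine visual and auditory estimates under the MLE model.

    Assumes independent Gaussian noise and uniform priors.
    Returns (combined estimate, visual weight, auditory weight, combined SD).
    """
    rel_v = 1.0 / sd_v ** 2  # reliability = inverse variance
    rel_a = 1.0 / sd_a ** 2
    w_v = rel_v / (rel_v + rel_a)  # normalized reciprocal variance
    w_a = rel_a / (rel_v + rel_a)
    est_av = w_v * est_v + w_a * est_a
    sd_av = (1.0 / (rel_v + rel_a)) ** 0.5
    return est_av, w_v, w_a, sd_av


# Hypothetical values: vision at 0 deg (SD 1 deg), audition at 4 deg (SD 2 deg).
est_av, w_v, w_a, sd_av = mle_combine(0.0, 1.0, 4.0, 2.0)
print(f"weights: vision={w_v:.2f}, audition={w_a:.2f}")
print(f"combined estimate={est_av:.2f} deg, combined SD={sd_av:.3f} deg")
```

With these values the combined SD is smaller than either unisensory SD, which is the bimodal gain in precision that the MLE model predicts.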
Several studies have demonstrated that human multisensory perception is consistent with cue
combination by the MLE model. In the spatial domain, vision typically dominates in visual-
haptic judgments of shape (Rock & Victor, 1964). Ernst and Banks (2002), however, showed
that the relative perceptual weights of vision and touch can be modulated by the addition of
random noise in depth to the visual signal. Using an intersensory conflict paradigm, they
measured the reliability of the unisensory and multisensory percepts, as well as the intersensory
bias in the multisensory percept across multiple visual noise levels. These empirical data agreed
with predictions from an MLE ideal observer, showing that visual-haptic integration results in
not only optimal modality weighting, but also maximal reliability, in the multisensory percept.
Similarly, Alais and Burr (2004) demonstrated that integration of visual and auditory spatial
signals is consistent with the MLE model using another paradigm of intersensory conflict—the
ventriloquism effect. Instead of adding random noise in depth, however, they modulated the
reliability of the visual stimulus by adding increasing amounts of Gaussian blur. In the temporal
domain, the validity of the MLE model is less clear. A study of audiovisual temporal integration
of flash and beep stimuli reported good agreement with the MLE model (Andersen, Tiippana, &
Sams, 2005). However, other studies of audiovisual temporal integration have not supported this
model. Quantitative analysis of human performance on an audiovisual temporal bisection task
did not support the MLE model (Burr, Banks, & Morrone, 2009). Similarly, a study of
audiovisual rate perception showed that multisensory integration in the temporal dimension is
not adequately described by the MLE model, and is more consistent with a Bayesian model that
includes a prior probability distribution (Battaglia, Jacobs, & Aslin, 2003).
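To illustrate how a non-uniform prior changes the prediction, the sketch below extends the reliability-weighted average with a Gaussian prior. It is an illustration under stated assumptions rather than the model fitted in the cited studies: the function name `bayes_combine`, the Gaussian form of the prior, and all numerical values are hypothetical. With Gaussian likelihoods and a Gaussian prior, the posterior mean is a reliability-weighted average in which the prior acts as an additional cue.

```python
def bayes_combine(est_v, sd_v, est_a, sd_a, prior_mean, prior_sd):
    """Posterior mean and SD for two Gaussian cues plus a Gaussian prior."""
    rel_v = 1.0 / sd_v ** 2
    rel_a = 1.0 / sd_a ** 2
    rel_p = 1.0 / prior_sd ** 2  # reliability of the prior
    total = rel_v + rel_a + rel_p
    post_mean = (rel_v * est_v + rel_a * est_a + rel_p * prior_mean) / total
    post_sd = (1.0 / total) ** 0.5
    return post_mean, post_sd


# Hypothetical cues in arbitrary units: vision 10 (SD 2), audition 12 (SD 1).
narrow = bayes_combine(10.0, 2.0, 12.0, 1.0, prior_mean=8.0, prior_sd=1.0)
broad = bayes_combine(10.0, 2.0, 12.0, 1.0, prior_mean=8.0, prior_sd=1e6)
print(f"informative prior:  mean={narrow[0]:.2f}, SD={narrow[1]:.2f}")
print(f"near-uniform prior: mean={broad[0]:.2f}, SD={broad[1]:.2f}")
```

As the prior becomes very broad, the result converges on the MLE prediction, consistent with the MLE model being the uniform-prior special case of Bayesian inference.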
1.3.6 Development of Multisensory Processes
Like the development of unisensory perceptual abilities, many multisensory processes mature
over a prolonged period of postnatal development (Wallace, 2004). These processes include
cross-modal associative learning, cross-modal matching, and eventual emergence of the capacity
to integrate multisensory stimuli optimally (Lewkowicz & Lickliter, 1994; Stein & Meredith,
1993). At the neuronal level, simultaneous activation of pre- and post-synaptic neurons by
converging multisensory stimuli is postulated to govern associative learning by simple Hebbian
rules (Cuppini, Magosso, Rowland, Stein, & Ursino, 2012; Feldman, 2012). Indeed, such
experience-dependent cross-modal associative learning has been demonstrated in the SC of
neonatal cats (Yu, Rowland, & Stein, 2010). In humans, cross-modal association and associative
learning are evident in behavioural studies of infants. Using preferential-looking paradigms, the
ability to learn sight-sound pairings has been demonstrated in infants just a few hours old
(Morrongiello, Fenwick, & Chance, 1998). At 3 to 4 weeks of age, infants show an ability to
match auditory and visual stimuli on the basis of intensity (Lewkowicz & Turkewitz, 1980).
One-month-olds have also been reported to recognize which of two visually perceived shapes matches
one they previously explored tactually (Meltzoff & Borton, 1979), although this finding has not
been supported by subsequent replication studies (Brown & Gottfried, 1986; Maurer, Stager, &
Mondloch, 1999; Pêcheux, Lepecq, & Salzarulo, 1988). At 4 to 6 months of age, infants show a
preference for cross-modal correspondence for novel visual-auditory pairings (Lyons-Ruth,
1977; Spelke, 1976) and cross-modal transfer for visual-tactual pairings (Rose, Gottfried, &
Bridger, 1981). In addition to simple stimulus pairings, behavioural and electrophysiological data
also suggest that multisensory combination of visual and auditory cues plays an important role in
speech acquisition in infancy (Kushnerenko, Teinonen, Volein, & Csibra, 2008; Lewkowicz &
Hansen-Tift, 2012). Lewkowicz (2000) conducted a review of the literature on multisensory
perception in human infants, and proposed that multisensory capacities progress from simple to
more complex in a sequential and hierarchical fashion. According to that model of multisensory
development in the first year of life, sensitivity to temporal relations between auditory and visual
stimuli emerges initially on the basis of synchrony and duration, followed by sensitivity to rate,
and lastly on the basis of complex rhythmic features (Lewkowicz, 2000).
Although the ability to compare and combine multisensory cues is present from early infancy,
evidence suggests that optimal multisensory integration does not emerge until considerably later
(Ernst, 2008). For simple audiovisual detection tasks, multisensory facilitation of reaction times
indicative of auditory and visual coactivation is not observed until approximately 8 years of age,
and remains immature until 10 to 11 years of age (Barutchu, Crewther, & Crewther, 2009;
Barutchu et al., 2010). In spatial navigational tasks, adults demonstrate optimal integration of
visual and proprioceptive cues, but performance of children up to 8 years of age is consistent
with a non-integrative strategy of unisensory dominance instead (Nardini, Jones, Bedford, &
Braddick, 2008). Similarly, when provided with visual and haptic cues of object size, statistically
optimal integration producing a bimodal enhancement in precision does not emerge until 8 to 10
years of age (Gori, Del Viva, Sandini, & Burr, 2008). Children younger than 12 years also show
non-optimal integration of auditory and visual cues in tasks of spatial bisection (Gori, Sandini, &
Burr, 2012). Audiovisual speech is an apparent exception to the relatively late emergence of
multisensory integration: when presented with stimuli to elicit the McGurk effect, infants as
young as 4 months old show behavioural (Burnham & Dodd, 2004; Desjardins & Werker, 2004;
Rosenblum, Schmuckler, & Johnson, 1997) and electrophysiological (Bristow et al., 2009)
evidence of audiovisual integration. However, the apparent mechanisms of audiovisual speech
integration vary by age. Behavioural and electrophysiological studies suggest that young children
rely on general perceptual mechanisms for audiovisual speech integration, and only after 6 to 8
years of age do they develop speech-specific and phonetic integrative mechanisms as well
(Baart, Bortfeld, & Vroomen, 2015; Baart, Vroomen, Shaw, & Bortfeld, 2014; Eskelund,
Tuomainen, & Andersen, 2011; Lalonde & Holt, 2016).
1.3.7 Cross-Sensory Calibration Hypothesis
Before the emergence of optimal integration, children tend to demonstrate unisensory dominance
when confronted with conflicting multisensory cues. For example, in children younger than 8
years, visual cues dominate haptic cues in spatial orientation discrimination, and haptic cues
dominate visual cues in size discrimination (Gori et al., 2008). Some have suggested that body
growth in childhood (e.g., increases in limb and digit length, interaural separation, and
interocular distance) necessitates prioritizing sensory recalibration over optimal combination to
prevent the accumulation of bias in multisensory perception (Ernst, 2008; Gori et al., 2008). This
idea has been formalized as the cross-sensory calibration hypothesis, which states that
multisensory interactions in childhood play a fundamental role in maintaining accurate sensory
calibration (Burr & Gori, 2012; Gori, 2015; Gori et al., 2008). Similar to the modality precision
hypothesis for modality dominance in adults (Welch & Warren, 1980), the cross-sensory
calibration hypothesis predicts that the more accurate and robust modality informs the calibration
of the other in a multisensory interaction. Consistent with cross-sensory calibration, visual
distortion from prism spectacles induces persistent bias in auditory localization in barn owls
(Knudsen & Knudsen, 1989), and strabismus (i.e., misalignment of the eyes) shifts the auditory
spatial map in the superior colliculus of ferrets (King et al., 1988). In humans, within-modality
perceptual impairments are consistent with predictions of this hypothesis as well. Haptic
orientation discrimination, typically calibrated by visual cues in childhood, is impaired in early-
blind individuals, but haptic size discrimination, which is typically calibrated by touch, is
preserved (Gori, Sandini, Martinoli, & Burr, 2010). Children with upper limb movement
disorders show the opposite pattern: visual orientation discrimination, typically calibrated by
vision, is preserved, but visual size discrimination, typically calibrated by touch, is impaired
(Gori, Tinelli, Sandini, Cioni, & Burr, 2012). In each instance, a lack of accurate information
from the more robust modality that typically dominates in childhood multisensory perceptions is
hypothesized to impair cross-sensory calibration. Furthermore, this hypothesis provides an
explanation for perceptual impairments observed in modalities not directly affected by a
peripheral sensory pathology.
1.3.8 Selected Psychophysical Measures of Audiovisual Processing and Integration
1.3.8.1 Audiovisual Simultaneity Judgment
For a given observer, the range of signal onset asynchronies (SOAs) over which a given set of
visual and auditory stimuli are perceived as simultaneous is termed the audiovisual simultaneity
window (Figure 1.5). In adulthood, the audiovisual simultaneity window is characteristically
bell-shaped with a slight skew toward conditions in which the visual signal leads the acoustic
signal (Dixon & Spitz, 1980; Lewald & Guski, 2003; Slutsky & Recanzone, 2001; Zampini,
Guest, Shore, & Spence, 2005). The consequence of this skew is that the likelihood of perceived
simultaneity between visual and auditory stimuli is maximal when the light objectively precedes
the sound. Several hypotheses have been advanced to explain this visual-lead asymmetry. Based
on differences in reaction times and evoked potential response latencies to unisensory visual and
auditory stimuli (Andreassi & Greco, 1975; King & Palmer, 1985), some have suggested that the
asymmetry is a consequence of faster internal processing of auditory compared to visual stimuli
(Vroomen & Keetels, 2010). It has been alternatively hypothesized to represent adaptive tuning
to the natural delay in sound waves compared to light waves emanating from any common
source a significant distance from the observer (e.g., the delay in thunder following a lightning
strike) (King & Palmer, 1985; Vroomen & Keetels, 2010).
Figure 1.5: A hypothetical audiovisual simultaneity window. The response distribution has a
characteristic bell shape with a skew toward the visual-lead side.
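As a rough illustration of how such a window might be summarized, the sketch below takes the proportion of "simultaneous" responses at each SOA and reports the response-weighted centre and the span of tested SOAs over which the proportion is at least half its maximum. The data are invented for illustration, negative SOAs denote auditory lead, and the half-maximum criterion and the function name `summarize_window` are assumptions; published studies use a variety of fitting procedures and criteria.

```python
def summarize_window(soas_ms, p_simultaneous, criterion=0.5):
    """Summarize a simultaneity window from response proportions.

    soas_ms: SOAs in ms, sorted ascending (negative = auditory lead).
    p_simultaneous: proportion of "simultaneous" responses at each SOA.
    Returns (centre, low edge, high edge) in ms, where the edges bound the
    tested SOAs whose proportion is >= criterion * the peak proportion.
    """
    peak = max(p_simultaneous)
    inside = [s for s, p in zip(soas_ms, p_simultaneous) if p >= criterion * peak]
    total = sum(p_simultaneous)
    centre = sum(s * p for s, p in zip(soas_ms, p_simultaneous)) / total
    return centre, min(inside), max(inside)


# Invented data with a visual-lead skew (positive SOA = visual lead).
soas = [-300, -200, -100, 0, 100, 200, 300]
props = [0.05, 0.20, 0.70, 0.95, 0.90, 0.60, 0.20]
centre, low, high = summarize_window(soas, props)
print(f"centre={centre:.0f} ms, window={low} to {high} ms, width={high - low} ms")
```

In this invented example the centre falls on the visual-lead side and the window extends further toward visual lead than toward auditory lead, mirroring the asymmetry described above.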
While the temporal constraints on the audiovisual simultaneity window are obvious, other factors
also affect the likelihood of perceived simultaneity. Auditory and visual stimuli originating
from the same location are more likely to be perceived as simultaneous than those originating
from different locations (Zampini, Guest, et al., 2005). Stimulus type also has a significant effect
on the audiovisual simultaneity window, with a wider, more symmetric window observed for
audiovisual speech stimuli compared to simple and complex non-speech stimuli (Stevenson &
Wallace, 2013). The shape of the audiovisual simultaneity window can also be altered by
training. Exposure to a fixed audiovisual time lag for a period of minutes results in a
recalibration of perceived simultaneity responses toward the adapted asynchrony (Fujisaki,
Shimojo, Kashino, & Nishida, 2004). Short-term perceptual training with feedback (Powers,
Hillock, & Wallace, 2009; Stevenson, Wilson, Powers, & Wallace, 2013) and long-term musical
training (Lee & Noppeney, 2011a) may also narrow the audiovisual simultaneity window,
particularly on the visual-lead side.
Evidence suggests that a slow, attentive process of cross-modal feature comparison, rather than
true multisensory integration, may underlie audiovisual asynchrony detection (Fujisaki &
Nishida, 2005). The temporal limit of audiovisual asynchrony detection, or width of the
simultaneity window, is therefore argued to depend upon the temporal information encoded at
the unisensory level, or upon the inherent temporal resolution of the neural mechanism for cross-
modal matching (Fujisaki & Nishida, 2005).
Although sensitivity to audiovisual synchrony is posited as the initial basis for multisensory
association in early infancy (Lewkowicz, 2000), this perceptual process continues to mature
throughout early and middle childhood. The audiovisual simultaneity window narrows on both
the auditory-lead and visual-lead sides from early childhood through adolescence, reaching an
adult shape between 9 years and 17 years of age (Chen, Shore, Lewis, & Maurer, 2016; Hillock-
Dunn & Wallace, 2012; Hillock, Powers, & Wallace, 2011; Lewkowicz & Flom, 2014).
In adults, the width of the audiovisual simultaneity window is also correlated with other indices
of multisensory integration. People with a narrow simultaneity window tend to experience a
stronger McGurk effect, but are less susceptible to the sound-induced flash illusion (Stevenson,
Zemtsov, & Wallace, 2012). A common factor uniting these behavioural findings is hypothesized
to be an individual’s ability to dissociate asynchronous multisensory signals (Stevenson,
Zemtsov, et al., 2012).
1.3.8.2 Spatial Ventriloquism Effect
The spatial ventriloquism effect is an illusion involving cross-modal integration in which
spatially disparate visual and auditory stimuli are perceived as originating from the same location
(Figure 1.6). In normal subjects, localization of the visual stimulus typically dominates the fused
percept and ‘captures’ the auditory stimulus (Figure 1.6A) (Howard & Templeton, 1966; Welch
& Warren, 1980). The strength of this perceptual fusion follows the spatial and temporal rules of
multisensory integration (Holmes & Spence, 2005), diminishing with increasing spatial and
temporal separation until the two stimuli are consistently regarded as separate events (Godfroy,
Roumes, & Dauchy, 2003; Lewald, Ehrenstein, & Guski, 2001; Lewald & Guski, 2003;
Recanzone, 2009; Slutsky & Recanzone, 2001). The spatial ventriloquism effect is not
significantly affected by top-down directed attention or bottom-up automatic attention, indicating
that the phenomenon is a result of low-level, automatic cross-modal interactions (Bertelson et al.,
2000; Vroomen et al., 2001).
Alais and Burr (2004) demonstrated that blurring of the visual stimulus, which reduces its spatial
reliability, can diminish and even reverse the dominance of sight over sound (Figure 1.6B).
Quantitative analysis of their results showed that integration of visual and auditory spatial
information obeys the MLE model of optimal combination, such that the variance in the
localization estimate of the fused percept is minimized (Alais & Burr, 2004).
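The reversal of dominance with blur can be sketched under the MLE assumptions of independent Gaussian noise and uniform priors. In the snippet below, a visual stimulus at 0 deg and an auditory stimulus at 5 deg are combined while the visual SD is increased. All values, including the fixed auditory SD of 4 deg, are hypothetical and chosen only to show the qualitative pattern, not to reproduce the data of Alais and Burr (2004).

```python
def perceived_location(loc_v, sd_v, loc_a, sd_a):
    """MLE-predicted location of the fused percept and the visual weight."""
    rel_v = 1.0 / sd_v ** 2  # reliability = inverse variance
    rel_a = 1.0 / sd_a ** 2
    w_v = rel_v / (rel_v + rel_a)
    return w_v * loc_v + (1.0 - w_v) * loc_a, w_v


# Hypothetical conflict: light at 0 deg, sound at 5 deg, auditory SD fixed at 4 deg.
results = []
for sd_v in (1.0, 4.0, 16.0):  # increasing blur = larger visual SD
    loc, w_v = perceived_location(0.0, sd_v, 5.0, 4.0)
    results.append((sd_v, w_v, loc))
    print(f"visual SD={sd_v:>4.1f} deg  visual weight={w_v:.2f}  percept={loc:.2f} deg")
```

The predicted percept moves from near the light (classical ventriloquism) to near the sound as visual reliability falls.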
Figure 1.6: A diagram of the spatial ventriloquism effect. The speaker icon indicates the
location of the auditory stimulus, and the Gaussian blob represents the location and spatial
reliability of the visual stimulus. The red dot indicates the perceived location of the fused
audiovisual event. Diagram not to scale. (A) Classical ventriloquism, in which the location of the
fused percept is dominated by the location of the visual stimulus. (B) If the visual stimulus is
very spatially unreliable relative to the auditory stimulus, the location of the fused percept shifts
toward the auditory location.
1.3.8.3 Temporal Ventriloquism Effect
The temporal ventriloquism effect is a cross-modal phenomenon in which non-speech, non-
rhythmic, and spatially uninformative sounds alter performance on a visual temporal order
judgment (TOJ) task (Figure 1.7) (Morein-Zamir et al., 2003). If two clicks temporally intervene
between the onset of two lights, performance is worsened, as if the clicks “pull” the lights closer
together in time (Figure 1.7A). Conversely, if two clicks temporally flank the onset of two
lights, performance is enhanced, as if the clicks “pull” the lights apart in time (Figure 1.7B).
Morein-Zamir et al. (2003) conducted a series of variations on this paradigm, and concluded that
the enhancement in visual TOJ is dependent upon the second sound trailing the second light by
about 100 to 200 ms and the effect is not accounted for by mechanisms of attentional alerting or
cross-modal interference. Rather, they postulate that the phenomenon results from a low-level
integrative mechanism that resolves intersensory temporal discrepancy by drawing the stimuli
toward temporal convergence (Fendrich & Corballis, 2001; Morein-Zamir et al., 2003).
Beyond the obvious temporal constraints, the temporal ventriloquism effect exhibits no
dependency on intersensory spatial correspondence (Vroomen & Keetels, 2006) or synesthetic
congruency between the auditory and visual stimuli (Keetels & Vroomen, 2011).
Figure 1.7: Examples of audiovisual stimulus conditions that elicit the temporal
ventriloquism effect. (A) When two clicks temporally intervene between the onset times of two
lights, visual TOJ performance is degraded. (B) When two clicks temporally flank the onset of
two lights, visual TOJ performance is enhanced.
1.3.8.4 McGurk Effect
The McGurk effect is an audiovisual illusion in which the perception of an auditory speech
stimulus is altered by concurrent presentation with an incongruent visual speech stimulus. In the
original study by McGurk and MacDonald (1976), an auditory /ba/ combined with a visual /ga/
consistently produced the illusory auditory percept of /da/. Similarly, an auditory /pa/ paired with
a visual /ka/ was often perceived as an auditory /ta/. Subsequent studies found that the
phenomenon generalized to many syllabic combinations, with the resulting auditory percept
being either a fused syllable intermediate to the veridical auditory and visual cues, or dominated
by the visual cue (Burnham & Dodd, 1996; Paré, Richler, ten Hove, & Munhall, 2003).
The illusory auditory percept in the McGurk effect is a relatively robust multisensory
phenomenon. The strength of the percept is not substantially influenced by large spatial
disparities between the visual and auditory signals (Jones & Munhall, 1997). A clear view of the
speaker’s lips is also not required: the strength of the McGurk effect does not begin to diminish
until gaze is displaced beyond 10 to 20 degrees from the speaker’s mouth (Paré et al., 2003), and
the effect is relatively insensitive to spatial degradation of the visual signal by pixelation
(MacDonald, Andersen, & Bachmann, 2000). Even among observers who are aware of the
artificial stimulus pairing or asynchrony between the auditory and visual stimuli, the illusion
remains strong (Soto-Faraco & Alsius, 2009). However, unlike integrative phenomena of low-
level audiovisual stimuli (e.g., the spatial and temporal ventriloquism effects), the strength of the
McGurk effect is significantly diminished when attention is divided or diverted from the
audiovisual speech stimuli (Alsius et al., 2005; Alsius et al., 2007).
Several studies indicate that auditory and visual speech signals may be integrated by both
speech-specific and more general multisensory mechanisms (Eskelund et al., 2011; Tuomainen,
Andersen, Tiippana, & Sams, 2005; van Wassenhove, Grant, & Poeppel, 2007; Vroomen &
Stekelenburg, 2011). Furthermore, combined behavioural and electroencephalographic evidence
suggests that audiovisual speech integration in the McGurk effect proceeds in a hierarchical
fashion, with general spatial and temporal features being integrated first, within 100 ms of sound
onset, followed by integration on the basis of phonetic properties shared by the auditory and
visual signals (Baart, Stekelenburg, & Vroomen, 2014).
1.4 Multisensory Processing in Amblyopia
Given the protracted course of experience-dependent postnatal development of the visual,
auditory, and multisensory perceptual systems (Burr & Gori, 2012; Daw, 2006; Warren, 2008),
and the cross-modal dependencies of spatial hearing on vision (King, 2009) and of temporal vision
on hearing (Heming & Brown, 2005), some researchers have begun to examine multisensory
processing in amblyopia and other forms of early visual impairment. Below, the current
knowledge of this emerging field of study is summarized.
1.4.1 Audiovisual Temporal and Spatial Perception
Several lines of evidence indicate that amblyopia involves abnormal temporal interactions
between auditory and visual stimuli. The earliest study to explicitly examine this issue tested
adults with a history of early visual deprivation from bilateral congenital cataract on a simple
visual task with an auditory distractor (Putzar, Goerendt, Lange, Rösler, & Röder, 2007).
Participants were shown a series of rapidly changing colours, and asked to identify the colour
simultaneous with a target flash. A task-irrelevant auditory tone was presented before or after the
target flash. The distractor tone significantly biased the perceived timing of the target flash
among controls, but bilaterally deprived participants were less affected, indicating reduced cross-
modal interactions between vision and hearing. The temporal constraints on the perception of
audiovisual simultaneity have also been studied in a similar population (Chen et al., 2017). In
adults with a history of early bilateral deprivation, the audiovisual simultaneity window was
found to be normal on the auditory-lead side, but widened on the visual-lead side (i.e., they were
more likely to perceive a click and a flash as simultaneous when in fact, the flash came first). In
contrast, adults with a history of early monocular deprivation had a symmetrically widened
simultaneity window (i.e., they were more likely to perceive audiovisual simultaneity in both
auditory-lead and visual-lead conditions). In both groups, the abnormalities in simultaneity
perception persisted regardless of which eye was viewing, suggesting that these effects are
mediated by abnormalities in central audiovisual processing rather than peripheral visual input
alone. The perception of audiovisual simultaneity has not previously been investigated in
anisometropic, strabismic, or mixed mechanism amblyopia. The sound-induced flash illusion
(Shams et al., 2000, 2002) has also been employed to study audiovisual temporal interactions in
amblyopia. A preliminary study of the illusion in adults with deprivational amblyopia showed
normal susceptibility among adults with unilateral deprivational amblyopia, but reduced
susceptibility to the illusion (i.e., lower likelihood of perceiving illusory flashes) among adults
with bilateral deprivational amblyopia, especially when the visual flashes were presented
peripherally (Chen, Shore, Lewis, & Maurer, 2015). Similar to findings in unilateral
deprivational amblyopia, a study of the sound-induced flash illusion in unilateral anisometropic,
strabismic, or mixed mechanism amblyopia found no abnormal susceptibility under monocular
viewing conditions (Narinesingh et al., 2017). Under binocular viewing conditions, however,
participants with non-deprivational forms of amblyopia showed susceptibility to the illusion over
a wider range of auditory-leading SOAs, suggesting a widened temporal window of perceptual
binding (Narinesingh et al., 2017).
Audiovisual spatial processing has not been explicitly investigated in populations with
amblyopia. However, abnormal cross-modal interactions in motion perception, which
incorporates both spatial and temporal signals, have been described in adults with sight recovery
following brief bilateral deprivation from congenital cataracts (Guerreiro, Putzar, & Röder,
2016). Such individuals, who experienced a brief period of congenital blindness, exhibit a
significant visual motion aftereffect following adaptation to auditory motion that is absent in
normally sighted individuals and visually impaired individuals who acquired their deficits after
childhood. This cross-modal effect suggests abnormal involvement of audition in visual motion
processing, and supports previous findings of cross-modal reorganization of the neural substrates
for visual motion processing (e.g., area MT) described in early blind and sight recovery
individuals (Jiang, Stecker, Boynton, & Fine, 2016; Saenz, Lewis, Huth, Fine, & Koch, 2008).
In addition to effects on temporal and motion perception, a brief period of early visual
deprivation has also been shown to affect the attentional balance between vision and audition for
lateralized stimuli (de Heering et al., 2016). Compared to normally sighted individuals, adults
with a history of early visual deprivation had faster reaction times on auditory trials, and on trials
requiring an attentional switch from vision to audition (de Heering et al., 2016). These findings
suggest that auditory signals command greater attentional salience in individuals with a remote
history of brief visual deprivation.
1.4.2 Audiovisual Speech Perception
Studies of audiovisual speech perception using McGurk effect paradigms in humans with
amblyopia consistently show abnormally low susceptibility to the illusion compared to visually
normal controls (Burgmeier et al., 2015; Narinesingh et al., 2015; Narinesingh et al., 2014;
Putzar, Hötting, & Röder, 2010). An early study examined adults who had been treated for
bilateral congenital cataracts before 17 months of age, and found that early visual deprivation
was associated with normal auditory speech perception, but poorer lip-reading ability, and
reduced susceptibility to the McGurk effect (Putzar, Hötting, et al., 2010). Interestingly, this
study included a small subgroup of adults who acquired significant visual impairment after 5
years of age who also showed a reduced McGurk effect, but normal lip-reading abilities. At face
value, this finding suggests that later-onset visual impairment can also impair audiovisual
integration. The authors noted, however, that most of the participants in the acquired visual
impairment group also had mild visual impairments since birth, raising the possibility that early
visual impairments more subtle than bilateral cataracts may also interfere with multisensory
development.
These behavioural findings have been followed up with functional neuroimaging using similar
audiovisual speech paradigms. Unlike visually normal controls, participants with a history of
early transient bilateral visual deprivation lacked fMRI evidence of audiovisual integration in the
primary and higher auditory cortices and STS, and did not exhibit response enhancement in
higher-order visual areas (Calvert et al., 1999; Calvert et al., 2000; Guerreiro, Putzar, & Röder,
2015). Another fMRI study in a similar clinical population showed enhanced responses to
auditory stimuli in occipital visual cortex, indicating that early visual deprivation causes
persistent cross-modal reorganization of unisensory cortical areas (Collignon et al., 2015).
Several further studies examined the McGurk effect in the non-deprivational forms of
amblyopia. Similar to the findings in adults with early bilateral visual deprivation, children and
adults with unilateral anisometropic, strabismic, and mixed mechanism amblyopia also
demonstrated reduced susceptibility to the McGurk effect (Burgmeier et al., 2015; Narinesingh et
al., 2015; Narinesingh et al., 2014). Importantly, this audiovisual perceptual anomaly persisted
under amblyopic eye, fellow eye, and binocular viewing conditions, meaning that visual blur
cannot account for the multisensory abnormality (Narinesingh et al., 2014). Despite a clear
relation between perception of the McGurk effect and amblyopia, no study to date has found an
association between susceptibility to the McGurk effect and stereo acuity or visual acuity in the
amblyopic eye (Burgmeier et al., 2015; Narinesingh et al., 2015; Narinesingh et al., 2014).
However, one study which recorded the ages of onset and resolution of amblyopia found a
reduced McGurk effect only in children whose amblyopia remained unresolved past 5 years of
age; resolution by 5 years or onset after 5 years was associated with a normal McGurk response
(Burgmeier et al., 2015). This finding suggests the existence of a sensitive period for normal
integration of audiovisual speech signals during the first 5 years of life.
1.5 Summary
Multisensory processing and integration are ubiquitous, adaptive functions that are central to our
perception of the external world. Information encoded by different peripheral sense organs is not
only combined to enrich perceptual representations, but also interacts at multiple downstream
sites to alter the quality and calibration of perceptual functions. In the mature sensory system,
multisensory combination and integration can improve the reliability of perceptual responses to,
for example, audiovisual speech. In the developing sensory system, cross-modal interactions
influence the calibration of unisensory functions, such as sound localization. Amblyopia is a
common developmental diagnosis that has been systematically investigated as a unisensory
visual impairment. However, its effects are increasingly recognized to extend beyond vision to
the multisensory domain. Indeed, amblyopia is associated with altered cross-modal interactions
in audiovisual speech perception and audiovisual temporal processing. Knowledge of the extent
and mechanisms of the audiovisual impairments in amblyopia, however, remains in its infancy.
More study is therefore needed to understand the nature of the audiovisual processing and
integration deficits in amblyopia, the mechanisms underlying these deficits, and their relation to
conventional clinical diagnosis and treatment.
Chapter 2 Study Aims and Hypotheses
2.1 General Rationale and Research Aims
Multisensory processing and integration play fundamental roles in human perception, behaviour,
learning, and developmental sensory calibration. Amblyopia, a common developmental visual
impairment, is associated with abnormalities in multisensory processing—particularly in
audiovisual speech perception and temporal judgment tasks. Several issues remain unresolved,
however. Despite spatial vision being the most prominent area of deficit in amblyopia,
audiovisual processing in the spatial domain has not been investigated in amblyopia. More
critically, the underlying basis for multisensory abnormalities in amblyopia remains unclear.
Indeed, because multisensory processing encompasses both integrative and non-integrative
functions modulated by complex interactions involving temporal, spatial, and identity
correspondence, unambiguous conclusions about the underlying mechanisms are difficult to
draw from prior scientific evidence. The clinically relevant consequences of these audiovisual
processing abnormalities, and the adequacy of current amblyopia therapies in addressing them,
also remain unknown.
The general aim of this thesis is to characterize the extent of audiovisual spatial and temporal
processing and integration abnormalities in adult humans with amblyopia. Toward this goal, four
studies were conducted to assess multisensory processing of simple audiovisual stimuli in
participants with the commonest forms of unilateral amblyopia (anisometropic, strabismic, and
mixed mechanism).
2.2 Specific Study Aims and Hypotheses
2.2.1 Audiovisual Spatial Perception
2.2.1.1 Study I: Optimal Audiovisual Integration in the Ventriloquism Effect but Pervasive Deficits in Unisensory Spatial Localization in Amblyopia
The majority of data indicating abnormalities in multisensory processing in amblyopia comes
from studies of audiovisual speech perception (Burgmeier et al., 2015; Narinesingh et al., 2015;
Narinesingh et al., 2014; Putzar, Hötting, et al., 2010). Audiovisual integration in the spatial
dimension has not been examined previously in humans with amblyopia. This study addresses
this knowledge gap by examining the spatial ventriloquism effect in humans with unilateral
amblyopia, comparing their performance to visually normal controls, and to an ideal observer
based on the MLE model of optimal multisensory integration (Alais & Burr, 2004).
The specific aims of this study are:
1) To measure the precision of unisensory (visual and auditory) and multisensory
(audiovisual) spatial localization in participants with amblyopia under real-world
binocular viewing conditions.
2) To measure the perceptual weights of vision and audition in a ventriloquism effect
paradigm under real-world binocular viewing conditions.
3) To determine if amblyopia is associated with optimal audiovisual spatial integration
according to the MLE model.
The specific hypotheses of this study are:
1) Participants with amblyopia will show reduced precision of visual localization,
normal precision of auditory localization, and reduced precision of audiovisual
localization.
The known binocular deficits in spatial vision, spatial distortion, and positional
uncertainty (reviewed in section 1.1.3) are expected to manifest as poorer performance in
unisensory visual localization. Because vision typically dominates in audiovisual spatial
localization judgments (reviewed in section 1.3.5), reduced localization precision is also
expected for audiovisual (i.e., bimodal) stimuli. Although auditory spatial localization is
influenced by early visual experience (reviewed in section 1.2.2.5), deficits in sound
localization have only been demonstrated previously in severe bilateral early visual
impairment. Accordingly, unisensory auditory localization is expected to be within
normal limits in participants with amblyopia.
2) Participants with amblyopia will weight audition more heavily than visually normal
controls in the ventriloquism effect paradigm.
As above, visual spatial localization precision is expected to be reduced, whereas
auditory spatial localization precision is expected to be normal. The fused audiovisual
percept in the ventriloquism effect is therefore expected to reflect this differential effect
on vision and audition, with audition being weighted relatively more by participants with
amblyopia compared to visually typical controls.
3) Participants with amblyopia will integrate visual and auditory spatial signals
optimally according to the MLE model.
Optimal multisensory integration is widely thought to develop late relative to unisensory
perceptual abilities (reviewed in sections 1.3.6 and 1.3.7). Amblyogenic factors, in
contrast, exert their influences on the visual system primarily in early childhood
(reviewed in section 1.1.5). Because optimal integration likely emerges after amblyopic
visual deficits have developed, audiovisual spatial perception in amblyopia is expected to
obey the MLE model.
2.2.1.2 Study II: Amblyopia and the Developmental Calibration of Sound Localization
Early visual experience is known to influence the development of sound localization abilities and
to alter the neural representation of auditory space in the superior colliculus (reviewed in section
1.2.2.5). This study was designed to further investigate the effect of unilateral amblyopia on
sound localization suggested by the findings of Study I (see section 3).
The specific aims of this study are:
1) To measure the precision of relative sound localization in the horizontal plane (i.e., the
minimum audible angle, or MAA).
2) To measure the accuracy of absolute sound localization in the horizontal plane.
The specific hypotheses of this study are:
1) Participants with amblyopia will have a wider MAA than visually normal controls.
The sensitive periods for visual development (reviewed in section 1.1.5) and the period
for normal development of sound localization (reviewed in section 1.2.2.4) overlap.
Furthermore, abnormal early visual experience is known to affect the developmental
calibration of sound localization abilities (reviewed in section 1.2.2.5). Given that visual
positional uncertainty affects both the amblyopic and fellow eyes (reviewed in section
1.1.3), and based on the unexpected finding of reduced sound localization precision in
Study I (see section 3), participants with amblyopia are expected to demonstrate an
abnormally wide MAA.
2) Participants with amblyopia will localize sounds less accurately than visually
normal controls.
Visual spatial distortions affect not only the amblyopic eye, but also the fellow eye in
humans with amblyopia (reviewed in section 1.1.3). Based on the same reasoning
outlined in the explanation for hypothesis (1), participants with amblyopia are expected
to localize sounds less accurately than visually normal controls.
2.2.2 Audiovisual Temporal Perception
2.2.2.1 Study III: Alterations in Audiovisual Simultaneity Perception in Amblyopia
Evidence suggests that audiovisual simultaneity perception is based on a non-integrative
mechanism of cross-modal matching (Fujisaki & Nishida, 2005). However, the width of the
simultaneity window correlates with the strength of multisensory integration in the McGurk
effect (Stevenson, Zemtsov, et al., 2012). In the most common subtypes of amblyopia
(anisometropic, strabismic, and mixed mechanism), reduced audiovisual integration measured by
the McGurk effect is well-documented (reviewed in section 1.4.2). Understanding audiovisual
simultaneity perception in amblyopia may therefore provide insight into the basis for a reduced
McGurk effect. Although abnormal perception of audiovisual simultaneity has been reported in
humans with the deprivational subtype of amblyopia (reviewed in section 1.4.1), it has not been
measured in the most common subtypes that exhibit a reduced McGurk effect.
The specific aim of this study is:
1) To measure the temporal window of audiovisual simultaneity over a range of signal onset
asynchronies (SOAs).
The specific hypotheses of this study are:
1) Participants with the most common subtypes of amblyopia will be more likely than
controls to perceive asynchronous audiovisual signals as simultaneous for both
visual-lead and auditory-lead SOAs (i.e., they will have a symmetrically widened
temporal window of audiovisual simultaneity).
Based on the known correlation between the width of the simultaneity window and
susceptibility to the McGurk effect in visually normal adults, as well as the symmetrically
widened simultaneity window previously observed in adults with unilateral deprivational
amblyopia, the clinical population in this study is expected to exhibit a symmetrically
widened simultaneity window.
2) The width of the audiovisual simultaneity window in participants with amblyopia
will not be dependent upon viewing condition.
Adults with deprivational amblyopia are known to have a widened simultaneity window
regardless of whether the amblyopic or fellow eye is viewing, suggesting a
developmental origin to the abnormality (Chen et al., 2017). Participants with unilateral
anisometropic, strabismic, and mixed mechanism amblyopia are expected to have a
symmetrically widened simultaneity window under all viewing conditions, similar to the
known behaviour of adults with unilateral deprivational amblyopia.
2.2.2.2 Study IV: Temporal Ventriloquism Reveals Normal Audiovisual Temporal Integration in Amblyopia
While some hypothesize that the width of the audiovisual simultaneity window reflects the capacity
to encode and compare unisensory temporal features (i.e., non-integrative functions) (Fujisaki &
Nishida, 2005), others suggest that the simultaneity window may reflect abnormalities in the
capacity for audiovisual integration (Chen et al., 2017). This study was designed to distinguish
these possibilities by investigating the temporal ventriloquism effect—a phenomenon in which
audiovisual integration normally enhances temporal resolution on a visual temporal order
judgment (TOJ) task.
The specific aims of this study are:
1) To measure temporal resolution for a visual TOJ task with and without paired auditory
stimuli that normally elicit enhancement by the temporal ventriloquism effect.
2) To measure the temporal window of perceptual binding for the temporal ventriloquism
effect.
The specific hypotheses of this study are:
1) Participants with amblyopia will exhibit enhanced visual TOJ in the temporal
ventriloquism effect, consistent with an intact mechanism for audiovisual temporal
integration.
Evidence suggests that multisensory integration develops late in humans (reviewed in
sections 1.3.6 and 1.3.7), after the typical sensitive period for the development of
amblyopia (reviewed in section 1.1.5). In the temporal domain, audition offers greater
precision and typically dominates over vision (see sections 1.3.4, 1.3.5.2, and 1.3.7). On
these bases, participants with amblyopia are expected to exhibit multisensory
enhancement in visual TOJ performance consistent with the temporal ventriloquism
effect.
2) Participants with amblyopia will exhibit perceptual binding (i.e., multisensory
enhancement) in the temporal ventriloquism effect over a wider interval of SOAs
compared to visually normal controls.
Consistent with a widened temporal window of audiovisual simultaneity, and a widened
window of perceptual binding in the sound-induced flash illusion (reviewed in section
1.4.1), participants with amblyopia are expected to exhibit perceptual binding in the
temporal ventriloquism effect over a wider range of SOAs compared to visually normal
adults.
Chapter 3 Study I
Study I: Optimal Audiovisual Integration in the Ventriloquism Effect but Pervasive Deficits in Unisensory Spatial Localization in Amblyopia
3.1 Abstract
Purpose: Classically understood as a deficit in spatial vision, amblyopia is increasingly
recognized to also impair audiovisual multisensory processing. Studies to date, however, have
not determined whether the audiovisual abnormalities reflect a failure of multisensory
integration, or an optimal strategy in the face of unisensory impairment. We use the
ventriloquism effect and the maximum likelihood estimation (MLE) model of optimal
integration to investigate integration of audiovisual spatial information in amblyopia.
Methods: Fourteen participants with unilateral amblyopia and 16 visually normal controls
localized brief auditory-only, visual-only, and combined audiovisual stimuli during binocular
viewing using a location discrimination task. A subset of combined audiovisual trials involved
the ventriloquism effect, an illusion in which auditory and visual stimuli originating from
different locations are perceived as a unified event from a single location. Localization precision
and bias were determined by psychometric curve fitting, and the observed parameters were
compared to predictions from the MLE model.
Results: Spatial localization precision was significantly reduced in the amblyopia group for
visual-only, auditory-only, and combined audiovisual stimuli, compared to the control group.
Analyses of localization precision and bias for combined audiovisual stimuli showed no
significant deviations from the MLE model in either the amblyopia or control group.
Conclusions: Despite pervasive deficits in localization precision for visual, auditory, and
audiovisual stimuli, audiovisual spatial integration remains intact and optimal in unilateral
amblyopia.
3.2 Introduction
Amblyopia is a neurodevelopmental visual disorder that affects 2–4% of the population (Birch,
2013). Beyond its widely known effects on vision (McKee et al., 2003), emerging research
indicates that amblyopia also involves a range of abnormalities in multisensory processing. For
example, even when viewing with both eyes, people with unilateral amblyopia show reduced
susceptibility to the McGurk effect (Burgmeier et al., 2015; Narinesingh et al., 2015;
Narinesingh et al., 2014), diminished ability to perceive asynchrony between auditory and visual
stimuli (Chen et al., 2017; Richards, Goltz, & Wong, 2017b), and a
widened temporal binding window for the sound-induced flash illusion (Narinesingh et al.,
2017).
While it is clear that early visual experience is necessary for the normal development of many
multisensory processes (Hötting & Röder, 2009; Putzar et al., 2007; Röder, Rösler, & Spence,
2004; Wallace et al., 2004), it is less clear whether the multisensory abnormalities in amblyopia
represent a failure to integrate the available unisensory information, or appropriate integration of
the available, but deficient, unisensory information. Difficulty in answering this question arises
for several reasons. First, there is often ambiguity surrounding which phenomena constitute
multisensory integration. Several prominent investigators in the field define multisensory
integration as “the neural process by which unisensory signals are combined to [produce]… a
multisensory response (neural or behavioral) that is significantly different from the responses
evoked by the modality-specific component stimuli” (Stein et al., 2010). The McGurk effect,
which often elicits a multisensory percept that is distinct from the auditory and visual stimuli, fits
the above definition well. However, the nature of audiovisual asynchrony detection is more
ambiguous—it may plausibly be underpinned by a cross-modal matching process rather than
integration, but empirically, it is correlated with other indices of audiovisual integration (Chen et
al., 2017; Stevenson, Zemtsov, et al., 2012). Second, difficulty in distinguishing between a
failure of integration and a deficiency in the unisensory components being integrated arises
because we lack a model of how specific features of the unisensory components (such as spatial,
temporal, and semantic content) determine what is perceived at the multisensory level.
A paradigm to study this question in amblyopia is provided by the ventriloquism effect (Howard
& Templeton, 1966). The ventriloquism effect is an audiovisual illusion in which spatially
disparate visual and auditory stimuli are perceived as originating from the same location.
Typically, the location information of the visual unisensory component dominates in the
perceived location of the fused audiovisual percept, a process sometimes termed visual capture
(Welch & Warren, 1980). Alais and Burr (2004), however, showed that by blurring the visual
stimulus (i.e., modulating its spatial reliability or precision), the perceptual dominance of vision
over audition can be diminished or even reversed. Critically, they demonstrated that the location
and spatial precision of the multisensory percept can be predicted from the location and spatial
precision of the unimodal components using the maximum likelihood estimation (MLE) model
of optimal combination. Therefore, the MLE model of the ventriloquism effect offers a powerful
methodology to disentangle the relative contributions of unisensory impairment and integration
failure to the multisensory abnormalities observed in amblyopia.
The MLE model has been put forward by several groups as a model for multisensory integration
of spatial information involving vision (Alais & Burr, 2004; Ernst & Banks, 2002; Moro, Harris,
& Steeves, 2014). For the ventriloquism effect, the MLE model predicts that the perceived
location of a bimodal event will be the weighted average of the locations of the unimodal events,
such that:
$$\hat{S}_{VA} = w_V \hat{S}_V + w_A \hat{S}_A \tag{1}$$
where $\hat{S}_V$ and $\hat{S}_A$ are the unisensory localization estimates for vision and audition, $w_V$ and $w_A$ are
the perceptual weights for vision and audition, and $\hat{S}_{VA}$ is the resultant bimodal localization
estimate. The perceptual weights, $w_V$ and $w_A$, sum to 1, and are inversely proportional to the variances of the
corresponding unisensory localization estimates, $\sigma_V^2$ and $\sigma_A^2$, such that:
$$w_V = \frac{\sigma_A^2}{\sigma_V^2 + \sigma_A^2} \tag{2}$$
and
$$w_A = \frac{\sigma_V^2}{\sigma_V^2 + \sigma_A^2} \tag{3}$$
Experimentally, localization variance can be estimated from the psychometric curve fit to the
unimodal localization data. The combination of unisensory localization estimates in the MLE
model is mathematically optimal in that it results in a bimodal localization estimate with the
lowest possible variance (i.e., highest possible precision):
$$\sigma_{VA}^2 = \frac{\sigma_V^2 \sigma_A^2}{\sigma_V^2 + \sigma_A^2} \leq \min(\sigma_V^2, \sigma_A^2) \tag{4}$$
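As a rough numerical illustration of Equations (1) to (4), the sketch below combines two unisensory estimates under the MLE model. The function name `mle_combine`, the `UnisensoryEstimate` container, and the example locations and variances are hypothetical and chosen only for illustration; they are not data or code from this study.

```python
from dataclasses import dataclass


@dataclass
class UnisensoryEstimate:
    """Hypothetical container for one unisensory localization estimate."""
    location_deg: float  # perceived location (illustrative units: degrees)
    variance: float      # localization variance (degrees squared)


def mle_combine(visual, auditory):
    """Return (w_v, w_a, bimodal_location, bimodal_variance) under the MLE model."""
    total = visual.variance + auditory.variance
    # Each weight is set by the variance of the *other* modality (Equations 2 and 3),
    # so the more precise modality receives the larger weight.
    w_v = auditory.variance / total
    w_a = visual.variance / total
    # Weighted average of the unisensory estimates (Equation 1).
    location = w_v * visual.location_deg + w_a * auditory.location_deg
    # Predicted bimodal variance (Equation 4).
    variance = (visual.variance * auditory.variance) / total
    return w_v, w_a, location, variance


if __name__ == "__main__":
    # Illustrative values only: vision is four times more precise than audition.
    visual = UnisensoryEstimate(location_deg=-2.0, variance=1.0)
    auditory = UnisensoryEstimate(location_deg=2.0, variance=4.0)
    w_v, w_a, location, variance = mle_combine(visual, auditory)
    print(f"w_V={w_v:.2f}, w_A={w_a:.2f}, location={location:.2f}, variance={variance:.2f}")
```

With these illustrative inputs the predicted bimodal location lies closer to the visual estimate, and the predicted bimodal variance is smaller than either unisensory variance, as the model requires.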
If the psychometric response is represented by a cumulative normal function, the variance of the
function is inversely related to the maximum slope, $m$, at the inflection point of the curve:
$$m = \left(\frac{1}{\sqrt{2\pi}}\right) \cdot \left(\frac{1}{\sqrt{\sigma^2}}\right) \tag{5}$$
Therefore, following from Equations (4) and (5), the MLE model predicts that spatial
localization precision (represented by $m$) is always greater for the bimodal event than for its
unisensory components, and that the bimodal advantage in spatial localization precision is
greatest when the precisions of the unisensory components are equal.
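These two predictions can be checked numerically. The sketch below assumes the cumulative normal form of Equation (5) and the bimodal variance of Equation (4), and computes the ratio of the predicted bimodal slope to the better of the two unisensory slopes. The function names and the variance values are illustrative assumptions, not quantities taken from this study.

```python
import math


def max_slope(variance):
    """Maximum slope of a cumulative normal with the given variance (Equation 5)."""
    return 1.0 / math.sqrt(2.0 * math.pi * variance)


def bimodal_variance(var_v, var_a):
    """MLE-predicted variance of the bimodal estimate (Equation 4)."""
    return (var_v * var_a) / (var_v + var_a)


def precision_gain(var_v, var_a):
    """Ratio of the predicted bimodal slope to the better unisensory slope."""
    best_unisensory = max(max_slope(var_v), max_slope(var_a))
    return max_slope(bimodal_variance(var_v, var_a)) / best_unisensory


if __name__ == "__main__":
    # Hold the visual variance fixed and make audition progressively less precise.
    for var_a in (1.0, 2.0, 4.0, 16.0):
        print(f"var_V=1.0, var_A={var_a}: gain={precision_gain(1.0, var_a):.3f}")
```

Under these assumptions the gain equals the square root of 2 (about 1.41) when the unisensory variances are equal, and it approaches 1 as one modality becomes much less precise than the other.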
In the present report, we employ the ventriloquism effect and predictions of the MLE model to
investigate integration of audiovisual spatial information in amblyopia.
3.3 Methods
3.3.1 Participants
All participants were adults with no visual disorders other than amblyopia, strabismus, or
refractive error. Participants were excluded if they had a history of neurodevelopmental or
neurological disorder, hearing impairment, high ametropia (hyperopia > +5D or myopia > -6D),
or any other ocular pathology or prior intraocular surgery. Each participant underwent ocular and
hearing assessment by a certified orthoptist or ophthalmologist. The ocular assessment
documented distance visual acuity with correction (ETDRS chart), stereo acuity (Randot circles
and Titmus fly test), foveal suppression (Worth 4-dot test), eye alignment, and refractive
correction. The hearing assessment (Student Support Services Team, 2008) ensured reliable
detection of suprathreshold pure tones (25 dBA sound pressure level) in each ear at four standard
frequencies (500, 1000, 2000, and 4000 Hz) using a screening audiometer (model MA 27,
MAICO Diagnostics, Eden Prairie, MN, USA) with circumaural headphones (model TDH 39,
MAICO Diagnostics, Eden Prairie, MN, USA). Amblyopia was defined as visual acuity of 0.18
logMAR or poorer in the amblyopic eye, and an interocular acuity difference of 0.2 logMAR or
more. Participants were classified as having anisometropic amblyopia if the interocular
difference in spherical equivalent or cylindrical error was 1 diopter (D) or more, as having
strabismic amblyopia if there was any manifest deviation on cover testing in the absence of
anisometropia, or as having mixed-mechanism amblyopia when both anisometropia and a
manifest deviation of 8 prism diopters or more were present. Visually normal was defined as
visual acuity of 0.1 logMAR (20/25) or better in each eye, with stereo acuity of 40 seconds of arc
or better, and no manifest strabismus. The study was approved by the Research Ethics Board at
The Hospital for Sick Children, and all protocols adhered to the tenets of the Declaration of
Helsinki. Written informed consent was obtained from all participants after explanation of the
nature and possible consequences of the study.
Fourteen adults with unilateral amblyopia (mean age: 28.8 years; age range: 19–48) and 16
visually normal controls (mean age: 29.2 years; age range: 23–47) participated in the study.
Clinical characteristics of the participants with amblyopia are summarized in Table 3.1.
Table 3.1: Characteristics of participants with amblyopia

| Participant | Age (years) | Subtype | Visual acuity RE (logMAR) | Visual acuity LE (logMAR) | Refractive correction RE | Refractive correction LE | Stereo acuity (arc sec) | Worth 4-dot response |
|---|---|---|---|---|---|---|---|---|
| A1 | 29 | Strab | 0.00 | 1.00 | None | None | Not measurable | LE suppressed |
| A2 | 22 | Aniso | 0.00 | 0.48 | -1.50 +0.50 x 80 | +1.00 +1.25 x 95 | 200 | Fused |
| A3 | 48 | Aniso | 0.70 | 0.00 | +2.25 +0.25 x 174 | -0.75 | 3000 | Fused |
| A4 | 29 | Aniso | 0.48 | -0.10 | -5.00 | -1.25 | 3000 | Fused |
| A5 | 23 | Aniso | -0.10 | 0.48 | -2.25 | +0.25 +2.25 x 85 | 200 | Fused |
| A6 | 29 | Mixed | 0.00 | 1.00 | Plano | +3.50 +2.00 x 90 | Not measurable | LE suppressed |
| A7 | 19 | Aniso | 0.00 | 0.18 | -0.75 +2.00 x 84 | -2.75 +4.50 x 99 | 40 | Fused |
| A8 | 27 | Strab | 0.00 | 0.48 | -6.25 +1.00 x 45 | -5.50 +1.25 x 135 | 200 | Fused |
| A9 | 37 | Mixed | -0.10 | 1.30 | -1.00 | +6.00 +2.50 x 120 | Not measurable | LE suppressed |
| A10 | 32 | Aniso | -0.10 | 0.54 | Plano | +2.00 +2.00 x 124 | 140 | Fused |
| A11 | 23 | Strab | 0.20 | 0.00 | +0.50 +0.50 x 28 | +1.25 +0.50 x 88 | Not measurable | Diplopic |
| A12 | 44 | Mixed | 0.90 | 0.00 | +6.00 +1.25 x 75 | -0.75 | Not measurable | RE suppressed |
| A13 | 22 | Aniso | 1.10 | -0.10 | -6.00 +0.75 x 174 | -4.50 +0.50 x 75 | 3000 | Fused |
| A14 | 19 | Mixed | 0.48 | 0.00 | +3.00 +1.00 x 130 | +4.25 | 3000 | Fused |

Abbreviations: RE, right eye; LE, left eye; Aniso, anisometropic; Strab, strabismic; Mixed, mixed-mechanism.
3.3.2 Apparatus and Stimuli
The entire experiment was conducted in a darkened, double-walled audiometric chamber
(internal dimensions 2.0 x 2.1 x 2.2 m). The floor was carpeted, and the walls and ceiling were
lined with 5 cm acoustic wedge foam (Foam Factory, Macomb, MI, USA). Head position was
constrained by a chinrest fixed to a small table 65 cm from the centre of the audiovisual
apparatus, as shown in Figure 3.1.
Figure 3.1: Audiovisual apparatus for the presentation of visual blobs and auditory clicks.
Virtual sound sources were generated at locations on the horizontal axis between the two physical
speakers by linear amplitude panning. The speakers were driven by coherent signals of
independently variable amplitude, such that the signal amplitude gain to the right and left
speakers always summed to 1.
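The gain constraint described in this caption can be sketched in Python as follows. This is a minimal illustration, not the stimulus code used in the study: the function name, the speaker positions of ±50°, and the assumption that gain varies linearly with the intended source position are all hypothetical.

```python
def linear_pan_gains(position, left_speaker, right_speaker):
    """Return (gain_left, gain_right) for a virtual source at `position`.

    Assumes (hypothetically) that the right-speaker gain rises linearly
    from 0 at the left speaker to 1 at the right speaker, so that the two
    gains always sum to 1.
    """
    if not left_speaker <= position <= right_speaker:
        raise ValueError("position must lie between the two speakers")
    gain_right = (position - left_speaker) / (right_speaker - left_speaker)
    return 1.0 - gain_right, gain_right

# Hypothetical speaker azimuths in degrees; 0 is straight ahead.
for azimuth in (-15, 0, 15):
    print(azimuth, linear_pan_gains(azimuth, -50.0, 50.0))
```

A source midway between the speakers receives equal gains of 0.5, and the apparent location shifts toward whichever speaker receives the larger share of the signal.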
Visual stimuli consisted of medium contrast (39%) Gaussian blobs of 5 size variants (1 SD =
16°, 20°, 24°, 28°, or 32°), flashed for 33 ms on a large LED monitor (model E654, NEC
Corporation, Tokyo, Japan) subtending 96° x 64° of visual angle (165 cm diagonal). The monitor
was overlaid with an ND 0.9 filter to create a high-quality Gaussian blob (peak luminance 2.1
cd/m²) with imperceptible steps between gray levels. Auditory stimuli consisted of 32 ms clicks
(8 cycles of 2–5 kHz bandpass filtered white noise, 4 ms in duration, enveloped with a 2 ms
sigmoid on/off ramp), presented at 62.0 dBA through two speakers mounted on either side of the
monitor on the horizontal midline. Apparent click location was controlled by linear amplitude
panning of interaural level difference (ILD) cues (Pulkki, 2001; Warncke, 1941). The output
profile for each speaker was measured across the entire stimulus dynamic range using a sound
level meter to ensure their outputs were identical. Auditory and visual stimulus timings were
confirmed with an oscilloscope. A wireless gamepad was used to initiate trials and enter
responses.
3.3.3 Procedure
All trials were conducted with both eyes open. Participants performed a relative spatial
localization task for unimodal stimuli (visual blobs only or auditory clicks only) and bimodal
stimuli (blobs and clicks together) similar to that described by Alais and Burr (2004). A general
trial timeline is illustrated in Figure 3.2. Upon initiation of each trial, a red
fixation dot (0.66°) was presented centrally for 500 ms, followed by a randomized delay of 250–
400 ms. Two matching stimuli (a test stimulus and probe stimulus, but in random order) were
then presented in succession, 500 ms apart, and the participant was asked to “indicate whether
the second event occurred left or right relative to the first event”. Participants were instructed to
keep their head and eyes aligned centrally. There were 21 test stimulus conditions: 6 unimodal (1
click-only and 5 blob size variants), 5 bimodal with spatially congruent clicks and blobs (5 blob
size variants paired with a click), and 10 bimodal with spatially conflicting clicks and blobs (5
blob size variants paired with a click, but blob displaced 4° left and click displaced 4° right, or
click displaced 4° left and blob displaced 4° right). Bimodal test stimulus conditions with spatial
conflict were designed to elicit the ventriloquism effect, and participants were not told of this
spatial disparity. The test stimulus was presented centrally (0°) in all trials; for bimodal test
stimuli with spatial conflict, the unimodal components were displaced 4° in opposite directions
such that their average location was still 0°. The probe stimulus matched the characteristics
of the test stimulus except for horizontal displacement (specified in Table 3.2), and in some
cases, spatial congruency (bimodal test stimuli with spatial conflict were paired with spatially
congruent probe stimuli). Data were collected in separate blocks for the unimodal auditory
conditions, and for each of the 5 blob sizes within the unimodal visual and bimodal conditions.
Twenty trials were run for each probe stimulus displacement, randomly interleaved within each
block.
Figure 3.2: Illustration of the trial timeline. After trial initiation by the participant, a fixation
dot appeared centrally for 500 ms, followed by a dark interval of 250–400 ms. Two brief stimuli
(test and probe) were displayed in sequence, 500 ms apart, but in random order. The participant
judged whether the second stimulus originated left or right relative to the first.
Table 3.2: Probe stimulus displacements used for each test stimulus condition

| Stimulus condition | Probe stimulus displacements (°) |
|---|---|
| Click only | -15, -12, -9, -6, -3, 3, 6, 9, 12, 15 |
| 16°/SD blob ± click | -8, -6, -4, -2, 2, 4, 6, 8 |
| 20°/SD blob ± click | -10, -7.5, -5, -2.5, 2.5, 5, 7.5, 10 |
| 24°/SD blob ± click | -12, -9, -6, -3, 3, 6, 9, 12 |
| 28°/SD blob ± click | -14, -10.5, -7, -3.5, 3.5, 7, 10.5, 14 |
| 32°/SD blob ± click | -16, -12, -8, -4, 4, 8, 12, 16 |

N.B.: negative displacement = leftward, positive displacement = rightward, SD = standard deviation
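The design described in Section 3.3.3 can be summarized in a short Python sketch. It is offered only as a reading aid and does not reproduce the experiment control software; the function names and the tuple layout are hypothetical. It enumerates the 21 test stimulus conditions and builds one randomly interleaved block using the click-only probe displacements from Table 3.2.

```python
import random
from itertools import product

BLOB_SIZES = (16, 20, 24, 28, 32)  # blob size variants (1 SD, degrees)
CONFLICT = 4                       # displacement of each component (degrees)

def enumerate_conditions():
    """List test conditions as (kind, blob_sd, blob_pos, click_pos)."""
    conds = [("click only", None, None, 0)]
    conds += [("blob only", sd, 0, None) for sd in BLOB_SIZES]
    conds += [("congruent", sd, 0, 0) for sd in BLOB_SIZES]
    # Conflict: blob and click displaced in opposite directions from centre.
    conds += [("conflict", sd, side * CONFLICT, -side * CONFLICT)
              for sd, side in product(BLOB_SIZES, (-1, 1))]
    return conds

def block_trials(displacements, repeats=20, seed=None):
    """Return probe displacements for one block, randomly interleaved."""
    trials = [d for d in displacements for _ in range(repeats)]
    random.Random(seed).shuffle(trials)
    return trials

click_only = (-15, -12, -9, -6, -3, 3, 6, 9, 12, 15)  # from Table 3.2
print(len(enumerate_conditions()), len(block_trials(click_only, seed=1)))
# prints: 21 200
```

With 10 probe displacements and 20 trials per displacement, the click-only block contains 200 trials; each blob block uses 8 displacements and therefore 160 trials per test condition.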
3.3.4 Data Analysis
The proportion of ‘probe stimulus perceived left’ responses was calculated for each probe
displacement, and the data were fit with a cumulative normal function by the maximum
likelihood method. The mean of the function is the point of subjective equality (PSE) and
represents the localization estimate of the test stimulus ($\hat{S}$ in Equation 1). The standard deviation
of the function, $\sigma$, is related to the precision of the localization estimate, $\beta'$, as described by
Equation 5. For all unimodal conditions and spatially congruent bimodal conditions, the curve fit
was constrained such that PSE = 0° to avoid falsely steep fits due to undersampling around the
mean. For all bimodal conditions with spatial conflict, the curve fit was unconstrained, as
variation in the PSE was of primary interest. As is common in psychophysical methodology, the
maximum slope of the psychometric function, $\beta'$, was taken as the measure of localization
precision, and was calculated from $\sigma$ values using Equation 5 (Strasburger, 2001). All $\beta'$ values
were subsequently log10 transformed to achieve linearity and equality of variances required for
statistical analysis. The assumption of equality of variances was met by Levene’s test for all
between-group t-tests, analyses of variance (ANOVAs), and analyses of covariance
(ANCOVAs), and by Mauchly’s test of sphericity for all repeated measures ANOVAs. The
assumption of homogeneity of regression was met for all ANCOVAs. All fitted functions and
parameters were calculated with custom-written scripts in MATLAB version 2011b (Mathworks,
Inc., Natick, MA, USA). All statistical tests were computed using IBM SPSS Statistics version
22 (Armonk, NY, USA). Statistical significance was defined as p < 0.05.
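The fitting step can be illustrated with a minimal Python sketch. It is not the MATLAB code used in this study: a coarse grid search stands in for whatever optimizer the original scripts used, the function names are hypothetical, and the response counts are invented for illustration. The sketch fits a cumulative normal to 'probe perceived left' counts by maximizing the binomial likelihood, optionally constraining the PSE to 0°, and converts the fitted standard deviation to a slope with Equation 5.

```python
import math

def p_left(x, mu, sigma):
    """Probability that a probe at displacement x is judged left of the test."""
    z = (mu - x) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def neg_log_likelihood(mu, sigma, xs, n_left, n_total):
    """Binomial negative log-likelihood of the observed 'left' counts."""
    nll = 0.0
    for x, k, n in zip(xs, n_left, n_total):
        p = min(max(p_left(x, mu, sigma), 1e-9), 1.0 - 1e-9)  # avoid log(0)
        nll -= k * math.log(p) + (n - k) * math.log(1.0 - p)
    return nll

def fit_cumulative_normal(xs, n_left, n_total, fix_pse_at_zero=False):
    """Grid-search maximum likelihood fit; returns (pse, sigma, max_slope)."""
    mus = [0.0] if fix_pse_at_zero else [i * 0.25 for i in range(-32, 33)]
    sigmas = [0.5 + i * 0.25 for i in range(159)]  # 0.5 to 40.0 degrees
    _, mu, sigma = min((neg_log_likelihood(m, s, xs, n_left, n_total), m, s)
                       for m in mus for s in sigmas)
    slope = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)  # Equation 5
    return mu, sigma, slope

# Hypothetical click-only data: 20 trials at each Table 3.2 displacement.
xs = [-15, -12, -9, -6, -3, 3, 6, 9, 12, 15]
n_left = [20, 19, 19, 17, 14, 6, 3, 1, 1, 0]
n_total = [20] * len(xs)
pse, sigma, slope = fit_cumulative_normal(xs, n_left, n_total, fix_pse_at_zero=True)
print(pse, sigma, math.log10(slope))
```

Setting `fix_pse_at_zero=True` mirrors the constraint applied to the unimodal and spatially congruent conditions, while the default unconstrained fit corresponds to the conflict conditions, in which the PSE is free to vary.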
3.4 Results
Mean psychometric data for the unimodal and bimodal localization tasks for the visually normal
control and amblyopia groups are shown in Figure 3.3. Subsequent analyses of localization
precision, perceptual weight by modality, and agreement with the MLE model are reported in the
sections below.
3.4.1 Localization Performance
3.4.1.1 Localization Precision for Unimodal Stimuli
Localization precision, defined as the slope of the fitted psychometric function at the midpoint,
decreased monotonically in both groups for unimodal visual stimuli as the blob size increased
from 16° to 32° (Figure 3.3A, B; Figure 3.4A). A one-way ANCOVA controlling for the
covariate of blob size showed that unimodal visual localization precision was significantly
poorer in the amblyopia group compared to the control group (F(1,147) = 7.542, p = 0.007).
Surprisingly, unimodal auditory localization precision (Figure 3.3A, B; Figure 3.4B) was also
significantly reduced in the amblyopia group compared to the control group (t(28) = 2.138, p =
0.041) (Wong, Richards, & Goltz, 2017).
3.4.1.2 Localization Precision for Spatially Congruent Bimodal Stimuli
Localization precision for spatially congruent bimodal stimuli decreased monotonically in both
the control group and amblyopia group as the blob size increased from 16° to 28° (Figure 3.3C,
D; Figure 3.4C). The flattening of the relation at large blob sizes is likely a ceiling effect
imposed by the higher precision of the auditory stimulus. A one-way ANCOVA controlling for
the covariate of blob size showed that bimodal localization precision was significantly lower in
the amblyopia group compared to the control group (F(1,147) = 21.407, p < 0.001).
3.4.1.3 Localization Bias for Spatially Conflicted Bimodal Stimuli
Localization performance for bimodal stimuli with spatial conflict is illustrated for the control
group (Figure 3.3E, G) and the amblyopia group (Figure 3.3F, H). In these trials, the visual and
auditory unimodal components were displaced 4° in opposite directions from centre to elicit a
ventriloquism effect. Localization bias, or PSE, was computed for both conflict conditions (i.e.,
blob -4° and blob 4°) at each blob size for every individual. Results from the two conflict
conditions were subsequently pooled, however, as a 2 x 5 two-way repeated measures ANOVA
for the effect of conflict condition and blob size on PSE showed no significant effect of conflict
condition for the control group (F(1, 15) = 0.218, p = 0.647) or the amblyopia group (F(1,13) =
1.694, p = 0.215). Both groups showed a monotonic progression in PSE from a vision-dominant
localization to an audition-dominant localization as the blob size increased (Figure 3.5). A
one-way ANCOVA controlling for the covariate of blob size showed no significant difference in PSE
between the two groups (F(1,297) = 3.003, p = 0.084).
Figure 3.3: Unimodal and bimodal localization task performance. Data are shown for the
control group (A, C, E, G) and the amblyopia group (B, D, F, H). Symbols represent the mean
proportion of trials in which a probe stimulus was perceived leftward of a test stimulus. Visual
stimuli were Gaussian blobs of specific sizes (1 SD = 16°, red; 1 SD = 20°, orange; 1 SD = 24°,
green; 1 SD = 28°, blue; 1 SD = 32°, purple), and auditory stimuli were white noise clicks. (A,
B) Mean psychometric data for localization of unimodal visual (rainbow symbols and solid lines)
and unimodal auditory test stimuli (black symbols and dashed lines) centred at 0°. (C, D) Mean
psychometric data for localization of bimodal test stimuli whose unimodal components were
central and spatially congruent (i.e. blob and click both centred at 0°). (E–H) Mean psychometric
data for localization of bimodal stimuli whose unimodal components are in spatial conflict (i.e.,
symmetrically displaced about 0°). (E, F) Bimodal conflict conditions with blobs centred 4° left
and clicks centred 4° right. (G, H) Conflict conditions with clicks centred 4° left and blobs
centred 4° right.
Figure 3.4: Localization precision for visual-only, auditory-only, and spatially congruent
bimodal audiovisual stimuli. Control group data are shown in blue and amblyopia group data
are shown in red. Localization precision (i.e., psychometric function slope) values were log10
transformed to equalize variances and linearize the relation between localization precision and
blob size. Error bars represent ±1 SEM. For (A) visual-only stimuli, (B) auditory-only stimuli,
and (C) spatially congruent bimodal stimuli, the control and amblyopia groups differed
significantly after controlling for any differences in blob size (* p < 0.05, **p < 0.01).
Figure 3.5: Bimodal localization bias for audiovisual stimuli with spatial conflict. Control
group data are shown in blue, and amblyopia group data are shown in red. Error bars represent
±1 SEM. The unimodal components (visual blob and auditory click) were horizontally displaced
4° in opposite directions from centre, such that their average location was 0°. Positive PSE
values indicate locations toward the blob, and negative PSE values indicate locations toward the
click, and include results pooled from the two conflict conditions tested (i.e., blob left and click
right; click left and blob right). In both groups, the strength and direction of the ventriloquism
effect was modulated by visual blob size.
3.4.2 Testing the Maximum Likelihood Estimation Model
3.4.2.1 Observed Versus Predicted Localization Precision for Spatially Congruent Bimodal Stimuli
Agreement between the observed localization precision for spatially congruent bimodal stimuli
and the values predicted by the MLE model is illustrated for the control group (Figure 3.6A) and
amblyopia group (Figure 3.6B). For the control group, a 2-way repeated measures ANOVA
comparing observed and predicted bimodal localization precision across blob sizes showed no
significant interaction between factors (F(4,60) = 1.136, p = 0.348) and no significant deviation
from the MLE model (F(4,60) = 1.136, p = 0.348). The same 2-way repeated measures ANOVA
analysis in the amblyopia group showed no significant interaction between factors (F(4,52) =
2.293, p = 0.072), and no significant difference in localization precision as observed and as
predicted by the MLE model (F(1,13) = 3.671, p = 0.078).
Figure 3.6: Bimodal localization precision, as observed and as predicted by the MLE
model. Values were log10 transformed to equalize variances and linearize the relation between
localization precision and blob size. Error bars represent ±1 SEM. For (A) the control group
(shown in blue) and (B) the amblyopia group (shown in red), the observed bimodal localization
precision (solid lines) did not differ significantly from the predictions of the MLE model (dashed
lines).
According to the MLE model, audiovisual integration results in enhanced localization precision
for bimodal stimuli by optimal combination of the component unimodal spatial signals. In
complete integration failure, however, the best localization precision achievable is that of the
more precise unimodal signal. This distinction provides a test for integration in amblyopia.
Importantly, the MLE model also predicts that the bimodal enhancement in localization precision
is greatest, and therefore most detectable, when the localization precisions of the unimodal
components are equal (i.e., $\beta'_V = \beta'_A$) (see Equations 4 and 5). The bimodal localization
precision observed in this study was therefore compared to that expected with intact integration
(i.e., MLE-predicted value computed from unimodal component precisions) and with integration
failure (i.e., the most precise unimodal component) specifically for the condition in which the
unimodal components were most similar for each participant (Figure 3.7). For the control group,
a one-way repeated measures ANOVA showed a significant difference among the observed bimodal,
MLE-predicted bimodal, and best unimodal localization precisions (F(1.041,15.616) = 7.130, p =
0.016, Greenhouse-Geisser correction). As expected, post hoc multiple comparisons revealed a
significant difference between the observed bimodal localization precision and the best unimodal
localization precision (p = 0.017), but no significant difference between the observed bimodal
localization precision and the MLE-predicted values (p = 0.974), indicating that audiovisual
spatial integration was intact in the control group. For the amblyopia group, a one-way repeated
measures ANOVA showed a significant difference among the observed bimodal, MLE-predicted
bimodal, and best unimodal localization precisions (F(1.184,15.388) = 8.827, p = 0.007,
Greenhouse-Geisser correction). Post hoc multiple comparisons revealed a significant difference between the
observed bimodal localization precision and the best unimodal localization precision (p = 0.011),
but no significant difference between the observed bimodal localization precision and the MLE-
predicted values (p = 0.727), indicating that audiovisual spatial integration was intact in the
amblyopia group.
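The size of the bimodal advantage expected under each hypothesis follows directly from Equations (4) and (5). The short derivation below is a worked restatement of those equations rather than an additional result; the symbols $\beta'_{VA}$ and $r$ are introduced here for convenience only.

```latex
% Equation (4) is equivalent to adding inverse variances:
\frac{1}{\sigma^2_{VA}} = \frac{1}{\sigma^2_V} + \frac{1}{\sigma^2_A}

% Equation (5) gives \beta'^2 = 1 / (2\pi\sigma^2), so squared slopes add:
\beta'^{2}_{VA} = \beta'^{2}_{V} + \beta'^{2}_{A}

% Intact integration (MLE) versus integration failure:
\beta'_{VA,\mathrm{MLE}} = \sqrt{\beta'^{2}_{V} + \beta'^{2}_{A}},
\qquad
\beta'_{VA,\mathrm{failure}} = \max(\beta'_V, \beta'_A)

% Advantage ratio, with r = \min(\beta'_V, \beta'_A) / \max(\beta'_V, \beta'_A) \le 1:
\frac{\beta'_{VA,\mathrm{MLE}}}{\max(\beta'_V, \beta'_A)} = \sqrt{1 + r^{2}} \le \sqrt{2}
```

The ratio reaches its maximum of $\sqrt{2}$ (approximately 1.41) only when $r = 1$, that is, when the unimodal slopes are equal, which is why the comparison was restricted to the condition in which the unimodal precisions were most similar.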
Figure 3.7: Maximal bimodal advantage ratio for localization precision, observed, as
predicted by the MLE model, and as predicted by integration failure. Error bars represent ±1
SEM. For (A) the control group and (B) the amblyopia group, the observed maximal bimodal
advantage ratio was consistent with intact integration as predicted by the MLE model, and
inconsistent with integration failure (i.e., best unimodal). *p < 0.05; n.s. = not significant.
3.4.2.2 Observed Versus Predicted Visual Perceptual Weight for Spatially Conflicted Bimodal Stimuli
The MLE model also makes predictions about the contribution of each modality to the perceived
location of a bimodal event when the unimodal components are in spatial conflict (Figure 3.3E–
H). The model predicts that the perceptual weights of vision, $w_V$, and audition, $w_A$, in a bimodal
percept are proportional to their unimodal localization precision (Figure 3.3A, B; Figure 3.4A,
B), according to Equations (1), (2) and (3) above. Agreement between the observed visual
perceptual weight, $w_V$, for spatially conflicted bimodal stimuli and the values predicted by the
MLE model is illustrated for the control group (Figure 3.8A) and the amblyopia group (Figure
3.8B). For both groups, a classic ventriloquism effect with near-complete visual capture was
observed for the smallest blob size (16°), while a reverse ventriloquism effect (Alais & Burr,
2004) in which audition dominated was observed for the largest blob sizes. Two-way repeated
measures ANOVAs comparing observed and predicted visual perceptual weight, $w_V$, across blob
sizes showed no significant deviation from the MLE model in the control group (F(1,15) =
2.460, p = 0.138) or the amblyopia group (F(1,13) = 0.004, p = 0.952).
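The comparison of observed and predicted visual weights can be sketched as follows. This Python example is an assumption-laden illustration, not the analysis code used here: it takes the standard inverse-variance form of the MLE weights, assumes the blob is displaced by +4° and the click by -4° so that a positive PSE lies toward the blob, and uses hypothetical function names.

```python
def predicted_visual_weight(sigma_v, sigma_a):
    """MLE-predicted visual weight from unimodal localization SDs.

    Assumes the standard inverse-variance weighting, with w_V + w_A = 1.
    """
    return sigma_a ** 2 / (sigma_v ** 2 + sigma_a ** 2)

def observed_visual_weight(pse, conflict=4.0):
    """Visual weight implied by the PSE for a spatially conflicted stimulus.

    Assumes blob at +conflict and click at -conflict, so that
    PSE = w_V * conflict + (1 - w_V) * (-conflict).
    """
    return 0.5 * (pse / conflict + 1.0)

# Hypothetical values: equal unimodal SDs and a PSE at the midpoint.
print(predicted_visual_weight(6.0, 6.0), observed_visual_weight(0.0))
```

Under these assumptions a PSE of +4° corresponds to complete visual capture ($w_V = 1$), a PSE of -4° to complete auditory capture ($w_V = 0$), and a PSE of 0° to equal weighting.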
Figure 3.8: Perceptual weight for vision (wV), observed and as predicted by the MLE
model. Error bars represent ±1 SEM. For (A) the control group and (B) amblyopia group, the
perceptual weight of vision observed for bimodal stimuli with spatial conflict did not differ
significantly from that predicted by the MLE model.
3.4.2.3 Observed Equivalence Point for Localization Precision and Perceptual Weight
Another specific prediction of the MLE model is that visual and auditory stimuli will be
weighted equally in the localization estimate of the bimodal stimulus (i.e. $w_V = w_A$) when their
unimodal localization precisions are the same (i.e. $\beta'_V = \beta'_A$). To test this prediction, the visual
blob size equivalent to the auditory click in terms of unimodal spatial precision was compared to
the visual blob size equivalent to the auditory click in terms of perceptual weight (Figure 3.9).
For each participant, a linear regression was calculated to predict the unimodal visual precision
based on blob size (control: mean R² = 0.94; amblyopia: mean R² = 0.95), and the regression
equation was used to calculate the blob size at the precision level of the auditory click (i.e. when
$\beta'_V = \beta'_A$). Another linear regression was calculated to predict the visual perceptual weight, $w_V$,
based on blob size (control: mean R² = 0.89; amblyopia: mean R² = 0.86), and the regression
equation was used to calculate the blob size at $w_V = 0.5$ for each participant (i.e. when $w_V = w_A$).
Paired sample t-tests showed that the mean blob size equivalent to the click in terms of
unimodal spatial precision did not differ significantly from the mean blob size when $w_V = 0.5$ for
the control group (t(15) = -1.566, p = 0.138) or the amblyopia group (t(13) = 0.241, p = 0.834).
Therefore, this prediction of the MLE model was upheld in both groups.
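The equivalence-point calculation amounts to fitting a line and solving it for a target value. The Python sketch below illustrates the idea for the perceptual weight regression; the helper names and the data for a single participant are hypothetical and were chosen to be exactly linear so the result is easy to verify by hand.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def blob_size_at(target, blob_sizes, values):
    """Blob size at which the fitted line reaches `target`."""
    slope, intercept = linear_fit(blob_sizes, values)
    return (target - intercept) / slope

# Hypothetical visual weights for one participant at the five blob sizes.
blob_sizes = [16, 20, 24, 28, 32]
visual_weights = [0.9, 0.7, 0.5, 0.3, 0.1]
print(blob_size_at(0.5, blob_sizes, visual_weights))
```

The same helper could be applied to the precision regression by passing the log-transformed unimodal visual slopes as `values` and the log-transformed auditory slope as `target`; the two resulting blob sizes are the quantities compared by the paired t-tests.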
Figure 3.9: Visual blob size equivalent to the auditory click in terms of spatial precision (on
unimodal presentation) and perceptual weight (on bimodal presentation). The MLE model
predicts that the equivalence point should be the same for unimodal spatial precision and
perceptual weight. Indeed, there was no significant difference between the two equivalence
points for either group.
3.5 Discussion
We report that under binocular viewing conditions typical of everyday experience, amblyopia is
associated with a pervasive impairment in spatial localization precision that involves visual,
auditory, and audiovisual (i.e., multisensory) perception. Using the MLE model of the
ventriloquism effect (Alais & Burr, 2004), we show that the deficits in audiovisual localization
actually represent optimal combination of the available unisensory (i.e., visual and auditory)
information. Taken together, these findings indicate that amblyopia does not involve a failure of
spatial audiovisual integration, and point to the importance of normal visual experience (or the
detrimental effect of amblyopic vision) in the developmental calibration of other senses.
The unisensory visual localization task measured relative localization precision under binocular
viewing conditions for diffuse visual blobs of various sizes. Despite normal visual acuity in the
fellow eye, the amblyopia group showed a general reduction in visual localization precision
across blob sizes. Several possibilities may account for this finding. Contrary to clinical dogma,
vision in the fellow eye is not normal (Meier & Giaschi, 2017). Careful psychophysical studies
have shown that the fellow eye has reduced optotype (Kandel, Grattan, & Bedell, 1980; McKee
et al., 2003) and vernier acuity (Levi & Klein, 1985), as well as greater spatial uncertainty and
distortion affecting both foveal and extra-foveal vision (Bedell et al., 1985; Sireteanu et al.,
2008). Another possible explanation for the reduction in visual localization precision is the
temporal interval between the two stimuli whose positions were judged. Previous studies of
spatial vision in the fellow eye (mentioned above) used static visual targets whose spatial
elements were present simultaneously. Our study, however, presented spatial elements (i.e.,
blobs) separated by a temporal interval of 500 ms. Factors such as reduced visual persistence
(Altmann & Singer, 1986) or fixation instability (Gonzalez, Wong, Niechwiej-Szwedo, Tarita-
Nistor, & Steinbach, 2012; Schor & Westall, 1984; Subramanian, Jost, & Birch, 2013) in
amblyopia may have therefore contributed to the observed visual spatial localization deficit.
This study also revealed a surprising and novel amblyopic deficit in auditory spatial localization
precision. Two features of the experimental task are particularly notable: (1) trials were
conducted in darkness with no visual cues, and (2) localization did not involve pointing of any
kind, but was a ‘left’ or ‘right’ determination entered as a button press on a gamepad. These
features mean that the spatial uncertainty in amblyopic vision (Hess & Holliday, 1992) and
visuomotor control (Niechwiej-Szwedo, Goltz, et al., 2012) cannot directly account for the
observed unisensory auditory effect. Rather, they suggest that the sensory impairment in
amblyopia extends beyond vision and into the realm of binaural spatial hearing.
The bimodal localization task measured relative localization precision under binocular viewing
conditions for diffuse visual blobs of varying sizes paired with a simultaneous auditory click. As
with unisensory stimuli, the amblyopia group showed a general impairment in visual localization
precision across blob sizes. However, analysis according to the MLE model showed that
multisensory integration was intact in both the control and amblyopia groups: the maximal
bimodal precision advantage and spatial bias in the ventriloquism effect were optimal based on
the spatial features of the unimodal component stimuli. Small deviations from the MLE model
for bimodal localization precision seen at smaller and larger blob sizes (Figure 3.6) are similar to
those reported by Alais and Burr (2004) (see Figure 2B in their manuscript). The condition in
which the auditory and visual localization precisions are most similar is that in which integration
should result in the greatest improvement in localization precision. Indeed, the MLE predictions
for the maximal bimodal advantage ratio are very close to the empirical data for both groups
(Figure 3.7). Overall, these findings provide independent validation for the MLE model of
ventriloquism (Alais & Burr, 2004) in a larger sample of typically sighted individuals, and
suggest that at least some multisensory processing abnormalities reported in amblyopia do not
reflect disordered multisensory integration, but rather unisensory deficits that feed into and
propagate through an otherwise normal integrative network.
A common theme in studies of multisensory integration in children is that it develops relatively
late (Burr & Gori, 2012) compared to unisensory abilities (Daw, 2006; Litovsky & Ashmead,
1997) and non-integrative multisensory processes such as cross-modal matching (Pons,
Lewkowicz, Soto-Faraco, & Sebastian-Galles, 2009). By some estimates, optimal multisensory
integration does not arise until age 8 to 10 years (Gori et al., 2008; Nardini et al., 2008), which is
beyond the critical period for the development of amblyopia (Birch, 2013). This may explain
why optimal integration in the ventriloquism effect is spared in amblyopia. Furthermore, it
suggests that other multisensory perceptual anomalies in amblyopia (e.g., reduced susceptibility
to the McGurk effect and poorer audiovisual asynchrony detection) may result from deficits in
unisensory perception (e.g. spatial or temporal uncertainty) that are propagated in an otherwise
optimal multisensory percept. In addition, disrupted cross-sensory calibration may be the
mechanism by which amblyopic vision impairs unisensory functions beyond vision.
Consistent with the theory of cross-sensory calibration in which the more robust and accurate
sense informs the other (Gori, 2015), these results implicate vision as the master reference for the
calibration of auditory spatial localization during development. Indeed, similar relationships have
been described for other multisensory object features. In normal children younger than 8 years,
vision informs touch in spatial orientation discrimination, but touch informs vision in size
discrimination (Gori et al., 2008). In early bilateral visual impairment, however, cross-sensory
calibration is affected in a predictable way: haptic orientation discrimination (for which vision
typically dominates) is impaired, but haptic size discrimination (for which touch typically
dominates) is preserved (Gori et al., 2010). What is striking about our results is that the
impairment in cross-sensory calibration of auditory localization occurred despite normal visual
acuity in one eye. We have conducted further experiments to specifically investigate the effects
of abnormal vision in the calibration of auditory spatial localization during development, which
will be the subject of a separate report.
Chapter 4 Study II
Study II: Amblyopia and the Developmental Calibration of Sound Localization
4.1 Abstract
The visual system of adults with amblyopia developed with reduced binocular input because one
eye was misaligned or defocused. Here we present the first evidence that this visual impairment
interferes with the developmental calibration of auditory localization. The pattern of deficits
suggests that visual input during early development calibrates the auditory spatial map in the
phylogenetically ancient retinocollicular pathway.
4.2 Introduction
Amblyopia is a developmental visual impairment that affects approximately 3% of the
population (Attebo et al., 1998; Brown et al., 2000; Preslan & Novak, 1996). It presents
clinically as a reduction in visual acuity and is not directly attributable to a structural eye
abnormality, but is associated with some factor—most commonly strabismus (eye misalignment)
or anisometropia (unequal refractive error)—that disrupts normal visual experience during a
sensitive period in early life (American Academy of Ophthalmology Pediatric
Ophthalmology/Strabismus Panel, 2012). Beyond the deficit in letter acuity, amblyopia is
associated with a constellation of developmental impairments in spatial visual perception
(McKee et al., 2003), temporal visual perception (Huang et al., 2012; Spang & Fahle, 2009; St
John, 1998), eye movement control (Ciuffreda et al., 1978; Raashid et al., 2016; Subramanian et
al., 2013), hand-eye coordination (Niechwiej-Szwedo et al., 2011; Niechwiej-Szwedo, Goltz,
et al., 2012), and audiovisual multisensory processing (Burgmeier et al., 2015; Chen et al., 2017;
Narinesingh et al., 2017; Richards et al., 2017b).
In a previous study on audiovisual spatial integration (Study I), we detected an unexpected and
novel deficit in the precision of binaural sound localization in people with unilateral amblyopia
(Richards, Goltz, & Wong, 2017a). The present study follows up on that
finding to investigate more fully the effect of amblyopia on unisensory auditory spatial
perception.
Unlike the visual system, which maps space in direct retinotopic coordinates, the human auditory
system does not have access to explicit spatial information. Instead, a listener must infer the
location of a sound indirectly from cues embedded in the acoustic signal. In the horizontal plane,
sound localization is largely based on differences in signal timing (i.e., interaural time difference,
ITD) and intensity (i.e., interaural level difference, ILD) between the ears (Rayleigh, 1907). The
relative contribution of these cues to sound localization depends on the frequency of the acoustic
signal, with ITDs predominating below 1400 Hz, and ILDs predominating for higher frequencies
(Mills, 1958). These binaural inputs converge in the auditory midbrain where spatial sensitivity
emerges in the dorsal nuclei (see Grothe et al. (2010) for review). Neurons in the primate inferior
colliculus show only coarse spatial selectivity for binaural cues (Groh, Kelly, & Underhill,
2003), but in the superior colliculus, a systematically organized spatiotopic map of auditory
space appears (King & Palmer, 1983). While ITD and ILD cues both contribute to sound
localization at the behavioural level (Mills, 1958), spatial selectivity in the mammalian superior
colliculus appears to depend exclusively on ILD cues (Campbell et al., 2006).
The superficial layers of the superior colliculus also receive direct retinal input (Pollack &
Hickey, 1979; Williams, Azzopardi, & Cowey, 1995) and show retinotopic organization similar
to that of the striate cortex (DuBois & Cohen, 2000; Lane et al., 1973). In contrast to the
balanced binocular input to the striate cortex, however, each superior colliculus receives retinal
input primarily from the contralateral eye (Lane et al., 1973; Pollack & Hickey, 1979).
Importantly, the collicular visual space map is topographically aligned with the underlying
auditory space map (King & Palmer, 1983). When eye movements shift the retina-centred visual
frame of reference away from the head-centred auditory frame of reference, this alignment tends
to be maintained by shifts in the auditory map (Jay & Sparks, 1987b). Alignment of the
unisensory space maps ensures that auditory and visual stimuli activate neurons at the same site,
and enables integration of those signals in the deeper multisensory layers of the superior
colliculus (Meredith & Stein, 1986b). This multisensory convergence and spatial alignment is
likely essential to the role of the superior colliculus in shifting gaze and attention to salient
environmental stimuli (Schiller & Stryker, 1972; Sparks, 1986).
Although a rudimentary map of auditory space is present in the superior colliculus at birth
(King & Carlile, 1993), animal studies have shown that abnormal visual input during
development causes changes in its topography and alignment with the visual space map (see
King (2009) for review). Barn owls reared with prism spectacles mislocalize sounds in the
direction of the visual field shift, and show corresponding shifts in the tectal auditory space map
(Knudsen & Brainard, 1991). After a certain age, normal visual experience is ineffective in
recovering normal sound localization abilities, indicating that these changes do not represent
short-term adaptation, but permanent alterations crystallized during a sensitive period of brain
plasticity (Knudsen & Knudsen, 1990). A similar shift is observed in the auditory space map in
the superior colliculus of ferrets reared with experimentally-induced strabismus (King et al.,
1988), and complete disorganization of the auditory map is observed when ferrets are reared with
a surgically rotated eye (King et al., 1988). Interestingly, anomalous acoustic experience caused
by chronic occlusion of one ear from birth has little effect on behavioural sound localization in
barn owls (Knudsen & Knudsen, 1990) and induces minimal change in the spatial tuning of
auditory neurons in the superior colliculus in ferrets (King et al., 1988). That the auditory map
can adjust to distorted binaural cues more readily than it can to distorted visual cues implicates
vision as the dominant guiding influence in calibrating the neural representation of auditory
space in the superior colliculus (King et al., 1988).
Spatial acuity in the human auditory system, as in the visual system (Mayer & Dobson, 1982),
follows a developmental trajectory through childhood. Binaural localization is poor at birth, but
improves dramatically during the first several years of life (Litovsky & Ashmead, 1997). In
developmentally typical infants, the smallest reliably perceptible separation between sound
sources, or minimum audible angle (MAA), improves from approximately 20° at 5 months of
age (Ashmead et al., 1987) to 4° at 18 months of age, finally reaching adult acuity of 1–2° by
about 5 years of age (Mills, 1958; Morrongiello, 1988). Abnormal visual experience in early life
is also known to affect sound localization in humans, but the relation is not simple. The
congenitally blind often exhibit sensory compensation for their loss of vision, with superior
auditory spatial tuning, particularly for peripheral stimuli (Ashmead et al., 1998; Lessard et al.,
1998; Röder et al., 1999; Voss et al., 2004). Similarly, people who had one eye enucleated (i.e.,
surgically removed) in childhood can localize sounds more accurately in the central region of
space (Hoover et al., 2012). In contrast, sound localization precision and accuracy are
significantly impaired in people whose early visual impairment is limited to the central field
bilaterally (Lessard et al., 1998). In the context of these prior findings, it is difficult to predict the
cross-sensory effect of unilateral amblyopia on sound localization. Is subnormal vision in one
eye accompanied by compensatory enhancement of spatial hearing, or is discordant binocular
input sufficient to impair spatial hearing despite normal visual acuity in the fellow eye? Our prior
investigations suggest that sound localization may indeed be impaired in amblyopia (Richards et
al., 2017a), but this cross-sensory effect has not been examined systematically in this prevalent
visual disorder.
4.3 Methods
In the present study, we measured the precision and accuracy of sound localization using a
relative localization task (Experiments 1 and 3) and an absolute localization task (Experiment 2)
in humans with unilateral amblyopia.
4.3.1 Experiment 1: Relative sound localization—minimum audible angle task using speaker array
4.3.1.1 Participants
All participants reported no history of neurological, neurodevelopmental, auditory, or visual
disorders other than amblyopia, strabismus and/or refractive error. A certified orthoptist or
ophthalmologist examined each participant to measure visual acuity (standard ETDRS chart),
stereopsis (Randot circles test and Titmus fly test), foveal suppression (Worth 4-dot test), ocular
motility and alignment, and refractive correction. Amblyopia was defined as an acuity of ≥0.18
logMAR in the affected eye, and an interocular difference of ≥0.2 logMAR. Amblyopia was
classified as anisometropic if the interocular difference in spherical or astigmatic error was ≥1
diopter (D), strabismic if there was any manifest deviation in the absence of anisometropia, and
mixed if there was a strabismus of ≥8 prism diopters (PD) in the presence of anisometropia ≥1 D.
Each participant also passed a standard hearing test (Student Support Services Team, 2008) to
ensure reliable detection of pure tones at ≤25 dBA sound pressure level (SPL) at four standard
frequencies in each ear (500, 1000, 2000, 4000 Hz). Written informed consent was obtained in
accordance with the protocol approved by the Research Ethics Board at The Hospital for Sick
Children, and in accordance with the Declaration of Helsinki.
Ten adults with amblyopia (3 males; mean age, 32 years; range, 22–46 years) and 10 normally-
sighted adults (3 males; mean age, 29 years; range, 22–47 years) participated in Experiment 1.
Demographic and clinical details for participants with amblyopia in Experiment 1 are
summarized in Table 4.1.
Table 4.1: Clinical details of participants with amblyopia in Experiment 1
| Participant | Age (sex) | Subtype | Visual acuity RE (logMAR) | Visual acuity LE (logMAR) | Refractive correction RE (diopters) | Refractive correction LE (diopters) | Alignment at 6 m (prism diopters) | Stereo acuity (arc sec) | Worth 4-dot response | Additional details |
|---|---|---|---|---|---|---|---|---|---|---|
| P1 | 27 (F) | Strab | 0.00 | 0.48 | -6.25 +1.00 x45 | -5.50 +1.25 x35 | LE esotropia 2, LE hypotropia 1 | 200 | Fused | Strab surgery age 9 |
| P2 | 22 (F) | Aniso | 0.00 | 0.48 | -1.50 +0.50 x80 | +1.00 +1.25 x95 | LE esotropia 2 | 200 | Fused | |
| P3 | 22 (M) | Aniso | 1.10 | -0.10 | -6.00 +0.75 x174 | -4.50 +0.50 x75 | RE esotropia 2 | 3000 | Fused | |
| P4 | 23 (F) | Strab | 0.20 | 0.00 | +0.50 +0.50 x28 | +1.25 +0.50 x88 | LE esotropia 8, bilateral DVD | Not measurable | Diplopic | Infantile esotropia, 2 strab surgeries as child |
| P5 | 44 (F) | Mixed | 0.90 | 0.00 | +6.00 +1.25 x75 | -0.75 | RE exotropia 35 | Not measurable | RE suppressed | |
| P6 | 37 (F) | Aniso | 0.18 | -0.10 | -3.25 +4.00 x10 | -5.25 | RE esotropia 1 | 70 | Fused | |
| P7 | 44 (M) | Aniso | -0.10 | 1.20 | -0.25 +2.00 x98 | -2.75 +2.00 x69 | LE esotropia flick | 3000 | Fused | |
| P8 | 46 (F) | Strab | -0.10 | 0.10 | +4.25 | +5.00 | LE esotropia 25, LE hypotropia 18 | Not measurable | LE suppressed | Esotropia onset at 6–8 months of age |
| P9 | 28 (M) | Aniso | 0.18 | -0.10 | +2.25 | +0.25 | Exophoria 2 | 70 | Fused | |
| P10 | 26 (F) | Aniso | -0.10 | 0.18 | +0.75 | +3.00 | LE esotropia 1 | 140 | Fused | |
Abbreviations: RE, right eye; LE, left eye; Aniso, anisometropia; Strab, strabismus; DVD, dissociated vertical deviation.
4.3.1.2 Stimuli and Design
All trials were conducted in a darkened, sound attenuating chamber (internal dimensions 2.0 x
2.1 x 2.2 m) lined with 5 cm acoustic wedge foam (Foam Factory, Macomb, MI, USA). The
background noise level was 39.0 dBA SPL. Participants were seated with the head stabilized in a
chinrest 1 m from a horizontal array of 11 speakers (model CMS0361KLX, CUI Inc., Tualatin,
OR, USA) as shown in Figure 4.1. Auditory stimuli consisted of broadband white noise bursts of
32 ms duration, including a 2 ms sigmoid on/off ramp, delivered at 76.5 dBA SPL (output level
was verified as between 76.3 and 76.6 dBA SPL for each speaker). A red LED positioned over
the central speaker was illuminated between trials to aid in maintaining head alignment with the
speaker array. Participants used a wireless gamepad (model F710, Logitech, Newark, CA, USA)
to initiate trials and enter responses.
Figure 4.1: Apparatus for Experiment 1, a horizontal array of 11 speakers with a central
fixation LED. In each trial, one click was presented at the central reference position and the
other was presented a specified distance (3°, 6°, 9°, 12° or 15°) left or right of centre (auditory
angle θ).
Each trial began with illumination of the central fixation LED for 500 ms, followed by a
randomized delay between 250 ms and 400 ms. Two clicks (a reference click and a probe click)
were then presented in succession 500 ms apart. The reference click always originated from the
central speaker (0°), and the probe click originated from a non-central speaker (3°, 6°, 9°, 12° or
15° to the left or right of centre), forming an auditory angle θ; the order of presentation was
randomized. Participants were instructed to judge whether the second click was located to the
left or right relative to the first. Twenty trials were conducted for each of the 10 auditory angles
tested, with the probe click preceding the reference click in 50% of trials for each auditory angle.
Trials were run in random order, arranged in 2 blocks of 100 trials each.
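The trial structure described above can be sketched in a few lines of Python. This is an illustrative reconstruction, not the code used in the study: the function name `build_trial_list`, the use of negative angles for leftward probes, and the seeding are assumptions made for the example.

```python
import random

def build_trial_list(angles=(3, 6, 9, 12, 15), reps=20, n_blocks=2, seed=None):
    """Return blocks of trials; each trial is (probe_angle_deg, probe_first).

    Negative angles denote probes left of centre (an assumed convention).
    """
    rng = random.Random(seed)
    signed_angles = [s * a for a in angles for s in (-1, 1)]  # 10 auditory angles
    trials = []
    for theta in signed_angles:
        # The probe precedes the reference in exactly half of the repetitions.
        order = [True] * (reps // 2) + [False] * (reps - reps // 2)
        trials.extend((theta, probe_first) for probe_first in order)
    rng.shuffle(trials)  # random order across all 200 trials
    block_size = len(trials) // n_blocks
    return [trials[i * block_size:(i + 1) * block_size] for i in range(n_blocks)]

blocks = build_trial_list(seed=1)  # 2 blocks of 100 trials each
```

Balancing the presentation order within each auditory angle, rather than drawing it at random on every trial, guarantees the stated 50% split for every angle.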
4.3.1.3 Data Analysis
The proportion of trials in which the probe click was “heard right” of the reference click was
calculated for each auditory angle θ. A cumulative Gaussian function was fit to the psychometric
data for each participant in MATLAB version R2011b (Mathworks, Inc., Natick, MA, USA)
using the maximum likelihood method. The MAA was computed for each participant, and
defined as one half of the difference in θ between the 0.25 and 0.75 points on the y-axis of the
psychometric function (Mills, 1958).
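As a minimal sketch of this analysis, the following Python code fits a cumulative Gaussian by maximum likelihood and derives the MAA from the fitted curve. It is not the MATLAB routine used in the study: the grid search, the parameter ranges, and the response counts are all hypothetical choices made so that the example runs with the standard library alone.

```python
import math
from statistics import NormalDist

def fit_cumulative_gaussian(angles, n_right, n_trials):
    """Grid-search maximum likelihood fit of P(right) = Phi((theta - mu) / sigma)."""
    best_mu, best_sigma, best_loglik = None, None, -math.inf
    for mu10 in range(-50, 51):          # mu from -5.0 to 5.0 deg in 0.1 steps
        for sigma10 in range(5, 201):    # sigma from 0.5 to 20.0 deg in 0.1 steps
            mu, sigma = mu10 / 10, sigma10 / 10
            dist = NormalDist(mu, sigma)
            loglik = 0.0
            for theta, k, n in zip(angles, n_right, n_trials):
                p = min(max(dist.cdf(theta), 1e-6), 1 - 1e-6)  # avoid log(0)
                loglik += k * math.log(p) + (n - k) * math.log(1 - p)
            if loglik > best_loglik:
                best_mu, best_sigma, best_loglik = mu, sigma, loglik
    return best_mu, best_sigma

def minimum_audible_angle(mu, sigma):
    """Half the difference in theta between the 0.25 and 0.75 points."""
    dist = NormalDist(mu, sigma)
    return 0.5 * (dist.inv_cdf(0.75) - dist.inv_cdf(0.25))

# Hypothetical data for one participant: "heard right" counts out of 20 trials.
angles = [-15, -12, -9, -6, -3, 3, 6, 9, 12, 15]
n_right = [0, 0, 0, 1, 5, 15, 19, 20, 20, 20]
n_trials = [20] * len(angles)

mu, sigma = fit_cumulative_gaussian(angles, n_right, n_trials)
maa = minimum_audible_angle(mu, sigma)
```

For a cumulative Gaussian, the 0.25 and 0.75 points lie at approximately μ ± 0.6745σ, so the MAA defined this way is about 0.6745σ and does not depend on the bias term μ.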
The mean MAA values for the control and amblyopia groups were compared using an
independent samples t-test in IBM SPSS Statistics, version 22 (Armonk, NY, USA). Normality
of the data was established by the Shapiro-Wilk test. Degrees of freedom were adjusted to
overcome possible violations in equality of variance detected by Levene’s test.
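One common form of such an adjustment is Welch's t-test with Welch-Satterthwaite degrees of freedom. The sketch below assumes that this is the kind of correction meant; the text does not name the method, and the function name and sample values are illustrative only.

```python
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Return (t, df) for an independent-samples t-test without assuming equal variances."""
    na, nb = len(sample_a), len(sample_b)
    va = variance(sample_a) / na  # squared standard error of mean A
    vb = variance(sample_b) / nb  # squared standard error of mean B
    t = (mean(sample_a) - mean(sample_b)) / (va + vb) ** 0.5
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df

# Hypothetical MAA values (degrees) for two small groups.
t_stat, df = welch_t([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
```

When the two group variances are similar, the adjusted degrees of freedom approach the pooled value of n₁ + n₂ − 2; they shrink as the variances diverge.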
Associations between the MAA and various clinical characteristics in the amblyopia group were
examined. Associations with visual acuity in the amblyopic eye and with stereo acuity
were assessed using Spearman’s rank correlation.
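For illustration, Spearman's rank correlation can be computed as the Pearson correlation of the ranked values, with tied observations sharing their mean rank. The sketch below is a generic implementation under that definition, not the SPSS procedure used here. How non-measurable stereo acuity was ranked is not stated in the text, so the example does not address it.

```python
def average_ranks(values):
    """Rank values from 1..n, assigning tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are tied; ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rs(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5
```

Tie handling matters for these data because several participants share the same stereo acuity value (e.g., 200 or 3000 arc sec).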
4.3.2 Experiment 2: Absolute Auditory Localization
4.3.2.1 Participants
Fourteen adults with amblyopia (mean age, range: 30, 19–48 years) and 14 normally-sighted
adults (mean age, range: 30, 23–47 years) participated in Experiment 2. Five of the participants
with amblyopia and four controls had also participated in Experiment 1. All new participants met
the same ophthalmic and auditory screening examination requirements as those in Experiment 1,
and the same definitions for amblyopia and its subtypes were used. Demographic and clinical
details for participants with amblyopia in Experiment 2 are summarized in Table 4.2. Written
informed consent was obtained in accordance with the protocol approved by the Research Ethics
Board at The Hospital for Sick Children, and in adherence to the Declaration of Helsinki.
Table 4.2: Clinical details of participants with amblyopia in Experiment 2
| Participant | Age (sex) | Subtype | Visual acuity RE (logMAR) | Visual acuity LE (logMAR) | Refractive correction RE (diopters) | Refractive correction LE (diopters) | Alignment at 6 m (prism diopters) | Stereo acuity (arc sec) | Worth 4-dot response | Additional details |
|---|---|---|---|---|---|---|---|---|---|---|
| P1* | 27 (F) | Strab | 0.00 | 0.48 | -6.25 +1.00 x45 | -5.50 +1.25 x135 | LE esotropia 2, LE hypotropia 1 | 200 | Fused | Strab surgery, age 9 years |
| P2* | 22 (F) | Aniso | 0.00 | 0.48 | -1.50 +0.50 x80 | +1.00 +1.25 x95 | LE esotropia 2 | 200 | Fused | |
| P3* | 22 (M) | Aniso | 1.10 | -0.10 | -6.00 +0.75 x174 | -4.50 +0.50 x75 | RE esotropia 2 | 3000 | Fused | |
| P4* | 23 (F) | Strab | 0.20 | 0.00 | +0.50 +0.50 x28 | +1.25 +0.50 x88 | LE esotropia 8, bilateral DVD | Not measurable | Diplopic | Infantile esotropia, 2 strab surgeries as child |
| P5* | 44 (F) | Mixed | 0.90 | 0.00 | +6.00 +1.25 x75 | -0.75 | RE exotropia 35 | Not measurable | RE suppressed | |
| P11 | 32 (F) | Aniso | -0.10 | 0.54 | Plano | +2.00 +2.00 x124 | Orthotropic | 140 | Fused | |
| P12 | 29 (F) | Mixed | 0.00 | 1.00 | Plano | +3.50 +2.00 x90 | LE exotropia 14, LE hypertropia 4 | Not measurable | LE suppressed | Strab surgery, age 4 years |
| P13 | 23 (F) | Aniso | -0.10 | 0.48 | -2.25 | +0.25 +2.25 x85 | LE esotropia 1 | 200 | Fused | |
| P14 | 19 (F) | Aniso | 0.00 | 0.18 | -0.75 +2.00 x84 | -2.75 +4.50 x99 | Exophoria 1 | 40 | Fused | |
| P15 | 19 (F) | Mixed | 0.48 | 0.00 | +3.00 +1.00 x130 | +4.25 | RE esotropia 4, RE esophoria 10 | 3000 | Fused | Accom. esotropia, strab surgery as child |
| P16 | 29 (F) | Aniso | 0.48 | -0.10 | -5.00 | -1.25 | RE esotropia 2 | 3000 | Fused | |
| P17 | 29 (F) | Strab | 0.00 | 1.00 | None | None | LE esotropia 2, bilateral DVD | Not measurable | LE suppressed | Infantile esotropia |
| P18 | 37 (F) | Mixed | -0.10 | 1.30 | -1.00 | +6.00 +2.50 x120 | LE exotropia 25 | Not measurable | LE suppressed | Strab surgery, age 23 years |
| P19 | 48 (F) | Aniso | 0.70 | 0.00 | +2.25 +0.25 x174 | -0.75 | RE esotropia 2 | 3000 | Fused | |
Abbreviations: RE, right eye; LE, left eye; Aniso, anisometropic; Strab, strabismic; DVD, dissociated vertical deviation; Accom.,
accommodative. *Also participated in Experiment 1.
4.3.2.2 Stimuli and Design
All trials were conducted in the same acoustic chamber as Experiment 1. Participants were
seated with the chin stabilized in a chinrest before a large (165 cm diagonal) LED monitor (NEC,
model E654, Tokyo, Japan) flanked by stereo speakers (HP Inc., model BR387AA#ABA, Palo
Alto, CA, USA) at ear-level (shown in Figure 4.2). Auditory stimuli consisted of 32 ms click
trains (8 cycles of 4 ms white noise clicks at 62.0 dBA, enveloped with a 2 ms sigmoid on/off
ramp), repeating at 3 Hz. The white noise was 2–5 kHz bandpass filtered to limit the auditory
stimulus to frequencies at which interaural level difference cues predominate for binaural
localization (Mills, 1958). This stereophonic arrangement allowed the generation of phantom
(i.e., virtual) sound sources whose location was perceived on the horizontal axis between the two
physical speakers according to the principles of amplitude panning and summation localization
(Pulkki, 2001; Warncke, 1941). Participants used a wireless mouse with their preferred hand to
initiate trials and enter responses.
Figure 4.2: Apparatus for Experiment 2, stereo speakers with LED monitor. Phantom
sources were generated at locations on the azimuth (auditory angle θ) between the two physical
speakers by amplitude panning. The speakers were driven by coherent signals of independently
variable amplitude, such that the signal amplitude gain to the right and left speakers always
summed to 1.
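The gain constraint described for Figure 4.2 can be illustrated with a simple linear panning law. The sketch below assumes that the right-speaker gain varies linearly with the target angle between the two speakers; the angular position of the physical speakers is not reported, so `speaker_angle_deg` and the value of 40° used in the example are hypothetical.

```python
def linear_pan_gains(theta_deg, speaker_angle_deg):
    """Return (gain_left, gain_right) for a phantom source at theta_deg.

    Assumes speakers at -speaker_angle_deg and +speaker_angle_deg and a
    linear mapping from angle to gain, with the two gains summing to 1.
    """
    if abs(theta_deg) > speaker_angle_deg:
        raise ValueError("phantom source must lie between the two speakers")
    gain_right = 0.5 + theta_deg / (2.0 * speaker_angle_deg)
    gain_left = 1.0 - gain_right
    return gain_left, gain_right

# Gains for the nine target locations, with a hypothetical speaker angle of 40 deg.
targets = [-16, -12, -8, -4, 0, 4, 8, 12, 16]
gains = {theta: linear_pan_gains(theta, 40.0) for theta in targets}
```

Under this assumed law, a source at 0° receives equal gain in both speakers, and the gain difference grows in proportion to the target eccentricity.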
Prior to each trial, a small red fixation dot (0.66°) was presented centrally to aid in consistent
alignment of the head and eyes with respect to the stereo speakers. Each trial began with the
offset of the fixation dot followed by the presentation of the auditory stimulus at one of 9
locations on the azimuth (-16°, -12°, -8°, -4°, 0°, 4°, 8°, 12°, or 16°). Two seconds after onset of
the click train, a visual cursor (vertical white line) appeared on the monitor. Participants aligned
the cursor with the perceived sound source, and clicked a mouse button to enter their response.
The initial horizontal location of the cursor was jittered randomly between -40° and 40° on every
trial to prevent systematic bias from ventriloquism (i.e., visual capture of the auditory stimulus
location by the visual cursor). The click train continued at 3 Hz until a response was entered. To
mitigate the potential effect of visual capture by the visual cursor further, participants were asked
to make their judgment during the 2 seconds of darkness before the cursor appeared. Participants
were instructed to hold their head still in the chinrest during trials, but eye movements were
unconstrained. Five trials were conducted at each of the 9 locations, randomized within a single
block.
4.3.2.3 Data Analysis
For participants with amblyopia, location data were signed relative to the side of the affected
eye. Positive values indicated locations in the spatial hemifield ipsilateral to the amblyopic eye,
and negative values indicated locations in the contralateral hemifield. As normally sighted
controls lack a lateralized visual impairment, location data were expressed in left/right spatial
coordinates.
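The re-signing of coordinates for the amblyopia group amounts to mirroring the azimuth for participants whose left eye is amblyopic. A minimal sketch follows, assuming that raw coordinates are positive to the right; the function name and the "RE"/"LE" labels are illustrative.

```python
def sign_relative_to_amblyopic_eye(location_deg, amblyopic_eye):
    """Convert a left/right azimuth (positive = right, assumed) so that
    positive values lie in the hemifield ipsilateral to the amblyopic eye."""
    if amblyopic_eye == "RE":
        return location_deg       # right hemifield is already ipsilateral
    if amblyopic_eye == "LE":
        return -location_deg      # mirror so the left hemifield becomes positive
    raise ValueError("amblyopic_eye must be 'RE' or 'LE'")
```

The same transformation would need to be applied to both the specified and the perceived click locations so that localization errors keep their sign.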
Linear regression parameters (intercept and slope) were compared with their expected values of
0 and 1, respectively, using a one-sample t-test.
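A sketch of this step is given below: an ordinary least-squares line is fit to each participant's perceived versus specified locations, and the resulting intercepts or slopes are then tested against their expected value across participants. The function names and example values are hypothetical, and the p-value lookup is omitted.

```python
from statistics import mean, stdev

def fit_line(specified, perceived):
    """Ordinary least-squares fit; returns (intercept, slope)."""
    mx, my = mean(specified), mean(perceived)
    sxx = sum((x - mx) ** 2 for x in specified)
    sxy = sum((x - mx) * (y - my) for x, y in zip(specified, perceived))
    slope = sxy / sxx
    return my - slope * mx, slope

def one_sample_t(values, expected):
    """Return (t, df) for a one-sample t-test of mean(values) against expected."""
    n = len(values)
    return (mean(values) - expected) / (stdev(values) / n ** 0.5), n - 1

# Hypothetical participant whose responses overshoot slightly.
specified = [-16, -12, -8, -4, 0, 4, 8, 12, 16]
perceived = [1.1 * x + 0.5 for x in specified]
intercept, slope = fit_line(specified, perceived)

# Hypothetical slopes from four participants, tested against the expected value of 1.
t_slope, df_slope = one_sample_t([1.0, 1.2, 1.1, 0.9], expected=1.0)
```

An intercept near 0 indicates no overall leftward or rightward bias, and a slope near 1 indicates that perceived eccentricity scales with specified eccentricity.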
Between-subject localization variability within each hemifield was obtained by averaging the
standard deviation of the mean localization for each of the four click locations within each
hemifield. Hemifield asymmetry in between-subject localization variability was assessed using
a paired t-test. Normality of the data was established by the Shapiro-Wilk test.
Within-subject localization error was calculated as the root mean square (RMS) error at each
click location. For the repeated measures ANOVA comparing the effect of hemifield on RMS
localization error across click locations, homogeneity of variance was established by Mauchly’s
test of sphericity.
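The RMS error at a click location is the square root of the mean squared deviation of a participant's responses from the specified location. A minimal sketch, with a hypothetical function name and response values:

```python
def rms_error(responses_deg, target_deg):
    """Root mean square deviation of responses from the specified target location."""
    squared = [(r - target_deg) ** 2 for r in responses_deg]
    return (sum(squared) / len(squared)) ** 0.5

# Hypothetical responses for the five trials at the 4 deg location.
error_at_4 = rms_error([3.0, 5.0, 4.0, 4.0, 4.0], target_deg=4.0)
```

Because deviations are squared, the RMS error reflects both constant bias and trial-to-trial scatter, and it is insensitive to the direction of the error.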
4.3.3 Experiment 3: Replication of MAA task using stereo speaker apparatus (amplitude panning)
4.3.3.1 Participants
As in Experiment 2.
4.3.3.2 Stimuli and Design
As in Experiment 1, with the following exceptions. Sound stimuli consisted of 2–5 kHz bandpass
filtered white noise to limit the auditory stimulus to frequencies at which interaural level
difference cues predominate (Mills, 1958). Instead of a central LED, a small (0.66°) red fixation
dot was presented on the LED monitor.
4.3.3.3 Data analysis
As in Experiment 1.
4.4 Results
4.4.1 Experiment 1
Performance on the relative sound localization task using the array of 11 speakers is illustrated in
Figure 4.3A. The MAA, illustrated in Figure 4.3B, was significantly larger in the amblyopia
group (mean ± SEM: 3.60 ± 0.35°) compared to the control group (mean ± SEM: 2.04 ± 0.12°),
indicating poorer sound localization precision in the amblyopia group (t(18) = -4.278, p = 0.001).
Within the amblyopia group, the MAA showed no significant correlation with visual acuity in the
amblyopic eye (Rs = 0.258, p = 0.471) or with stereo acuity (Rs = 0.644, p = 0.644).
Figure 4.3: Relative sound localization performance on a horizontal speaker array. Error
bars indicate SEM. (A) Mean psychometric data for the minimum audible angle task. Negative
and positive auditory angles represent sounds presented to the left and right of the central click,
respectively. (B) The mean minimum audible angle was significantly larger in the amblyopia
group compared to the control group (*p = 0.001).
4.4.2 Experiment 2
Performance on the absolute sound localization task using stereo speakers is illustrated in Figure
4.4. For the amblyopia group, location data are expressed relative to the side of the visual
impairment, with positive values indicating auditory locations in the hemifield ipsilateral to the
amblyopic eye.
Figure 4.4: Absolute sound localization performance. For the amblyopia group, positive
coordinates indicate auditory locations in the hemifield ipsilateral to the amblyopic eye. (A, B)
Mean localization of auditory targets ± SD. The relation between perceived and specified click
location was linear and closely matched the relation predicted by linear amplitude panning
(dotted grey line) for both groups. (C, D) Sound localization error (root mean square, RMS) by
click location ± SEM. (C) For the control group, the magnitude of sound mislocalization was
symmetric in the left and right auditory hemifields. (D) For the amblyopia group, the magnitude
of sound mislocalization was significantly greater in the auditory hemifield ipsilateral to the
amblyopic eye, compared to the contralateral hemifield (*p = 0.043). Contralateral = auditory
hemifield contralateral to the amblyopic eye, Ipsilateral = auditory hemifield ipsilateral to the
amblyopic eye.
The relation between perceived click location and specified click location was linear for all
participants in the control group (mean R² = 0.98) and the amblyopia group (mean R² = 0.96).
The mean intercept and slope of regression did not differ significantly from the expected values
of 0 and 1 for the control group (mean intercept = -0.73°, t(13) = -1.01, p = 0.331, mean slope =
1.10, t(13) = 1.183, p = 0.258), indicating that the virtual sound sources appeared where
specified, and that there was no systematic leftward or rightward bias in the apparatus (Figure
4.4A). Similarly, the mean intercept and slope of the regression did not differ significantly from
the expected values of 0 and 1 for the amblyopia group (mean intercept = 0.67°, t(13) = 0.626, p
= 0.542, mean slope = 1.11, t(13) = 1.677, p = 0.117), indicating that there was no systematic
bias in auditory localization toward or away from the amblyopic eye (Figure 4.4B).
Between-subject localization variability did not differ significantly between the left and right
hemifields in the control group (control: t(3) = -1.128, p = 0.342). In the amblyopia group,
however, there was significantly greater localization variability in the auditory hemifield ipsilateral
to the amblyopic eye compared to the contralateral side (amblyopia: t(3) = -4.721, p = 0.018).
The mean magnitude of auditory mislocalization, calculated as the RMS error at each specified
click location for each participant, is illustrated in Figure 4.4C for the control group and in
Figure 4.4D for the amblyopia group. For the control group, a 2 x 4 repeated measures ANOVA
comparing the two auditory hemifields across specified click locations showed no significant
interaction of Hemifield x Click Location (F(3,39) = 0.579, p = 0.632), and no main effect of
Hemifield (F(1,13) = 0.075, p = 0.788). For the amblyopia group, the same analysis showed no
significant interaction of Hemifield x Click Location (F(3,39) = 0.781, p = 0.512), but did reveal
a significant main effect of Hemifield (F(1,13) = 5.041, p = 0.043), with greater auditory
localization error in the auditory hemifield ipsilateral to the amblyopic eye.
The correlations between clinical deficits in amblyopia and the magnitude of auditory
mislocalization (RMS error) are illustrated in Figure 4.5. Within the amblyopia group, the
clinical deficit in visual acuity and the magnitude of auditory mislocalization were significantly
correlated at the 8° specified click location within the auditory hemifield ipsilateral to the
amblyopic eye (Rs = 0.66, p = 0.011). Similarly, the clinical deficit in stereo acuity and the
magnitude of auditory mislocalization were significantly correlated at the specified click
locations 4° (Rs = 0.56, p = 0.012), 8° (Rs = 0.72, p = 0.004), and 12° (Rs = 0.54, p = 0.045)
within the auditory hemifield ipsilateral to the amblyopic eye. There were no significant clinical
correlations with auditory localization error in the auditory hemifield contralateral to the
amblyopic eye.
Figure 4.5: Correlations between RMS error for sound localization and clinical measures
of amblyopia across auditory target positions. The magnitude of auditory mislocalization was
significantly correlated with visual acuity and stereo acuity deficits in the auditory hemifield
ipsilateral to the amblyopic eye.
4.4.3 Experiment 3
Performance on the relative sound localization task using the stereo speaker apparatus (amplitude
panning) is illustrated in Figure 4.6A. The MAA, shown in Figure 4.6B, was significantly larger
in the amblyopia group (mean ± SEM: 4.21 ± 0.29°) compared to the control group (mean ± SEM:
3.38 ± 0.26°), indicating poorer sound localization precision in the amblyopia group
(t(26) = -2.120, p = 0.044). Within the amblyopia group, the MAA showed no significant correlation with
visual acuity in the amblyopic eye (Rs = -0.117, p = 0.690) or with stereo acuity (Rs = -0.048, p =
0.871).
Figure 4.6: Relative sound localization performance on stereo speaker apparatus. Error bars
indicate SEM. (A) Mean psychometric data for the minimum audible angle task. (B) The
minimum audible angle was significantly larger in the amblyopia group compared to the control
group (*p = 0.044).
The MAA values obtained using amplitude panning on the stereo speaker apparatus were
compared to those obtained using the physical speaker array in Experiment 1 (Figure 4.7). For the
nine individuals tested on both apparatuses (4 control participants and 5 participants with
amblyopia), the MAA values obtained were significantly correlated (R = 0.80, p = 0.009).
Figure 4.7: Correlation between minimum audible angle (MAA) values determined by
amplitude panning (Experiment 3) and by physical speakers (Experiment 1). Open circles
represent normal control participants and solid circles represent participants with amblyopia.
4.5 Discussion
In summary, we found novel deficits in both the precision and accuracy of sound localization in
people who grew up with amblyopia in one eye. The deficit in unisensory sound localization
precision was apparent as an increase in the MAA in the central region of space. The deficit in
sound localization accuracy was apparent as greater mislocalization in the spatial hemifield
ipsilateral to the amblyopic eye. Furthermore, the magnitude of sound mislocalization in the
ipsilateral hemifield was significantly correlated with the severity of amblyopic deficits in visual
acuity and stereo acuity.
The significant correlation between the MAA obtained with the physical sources in the speaker
array and virtual sources generated by amplitude panning indicates good agreement between
these two methods of measuring the MAA. The smaller MAA values obtained using the speaker
array were expected because the stimuli provided by physical sources were broadband, and
included low-frequency ITD and possibly spectral cues absent in virtual sources generated by
amplitude panning (Middlebrooks & Green, 1991; Pulkki & Karjalainen, 2001; Stevens &
Newman, 1936).
Unlike people who lose all vision in one or both eyes at an early age (Hoover et al., 2012;
Lessard et al., 1998; Röder et al., 1999), our results indicate that people with amblyopia do not
exhibit enhanced auditory localization to compensate for their deficits in spatial vision. Rather,
the developmental effect of unilateral amblyopia on spatial hearing more closely resembles that
of partial blindness with residual vision in both eyes (Lessard et al., 1998). This indicates that
discordant binocular vision can disrupt the developmental calibration of auditory space, and that
normal spatial acuity in the fellow eye is not adequate to rescue the process. More generally, the
results support the view that it is the quality of visual input, rather than its absence, that has the
stronger influence on the visual calibration of spatial hearing.
Based on the normal trajectory of MAA improvement through childhood, auditory spatial acuity
in adults with amblyopia is similar to that of children between 1.5 and 5 years of age (Litovsky
& Ashmead, 1997). This age range corresponds roughly to the age of onset for the most common
forms of amblyopia (Birch & Holmes, 2010; Repka et al., 2002), raising the possibility that
amblyopia or its etiological factors (e.g., strabismus or anisometropia) interfere with the visually-
guided maturation of auditory spatial abilities. Alternatively, the loss of auditory spatial acuity
associated with amblyopia could represent regression to the level of a normal 1.5- to 5-year-old
caused by anomalous visual input during a sensitive period in auditory system development.
Why does this amblyopic interference with auditory spatial development occur despite access to
high resolution visual spatial information from the fellow eye? This disconnect between the
binaural (auditory) spatial acuity and binocular (visual) spatial acuity may represent
physiological differences between the retinocollicular pathway involved in aligning and
calibrating the auditory space map and the retinogeniculostriate pathway responsible for visual
perception. Under binocular viewing conditions, perceptual dominance of the fellow eye is a
function of suppression of the signal from the amblyopic eye (Babu et al., 2013; J. Li et al.,
2011). Amblyopic suppression is mediated, however, by inhibitory interactions in the primary
visual cortex (Sengpiel, Jirmann, Vorobyov, & Eysel, 2006). If visual calibration of auditory
space occurs in the superior colliculus, as suggested (King, 2009), the usual cortical mechanisms
for amblyopic suppression may be bypassed. Without an independent midbrain mechanism to
suppress signals from the amblyopic eye, signals from both eyes would likely be equally salient
in their collicular representation of visual space.
More importantly, however, the primary visual cortex is also widely posited to be the site of the
neural deficit underlying amblyopia (Kiorpes et al., 1998; Movshon et al., 1987) (see Barrett et
al. (2004) for review). Therefore, the amblyopic visual deficit, as commonly defined, likely does
not affect the retinocollicular pathway. This suggests that the loss of auditory spatial acuity may
be an auditory analog of amblyopia caused by the same amblyogenic factors, but arising de novo
in the retinocollicular pathway. A similar pathologic mechanism involving direct retinocollicular
input has been previously proposed to explain the abnormally long saccadic latencies observed in
amblyopia (Ciuffreda et al., 1978).
Clinical markers of visual impairment, namely, visual acuity in the amblyopic eye and stereo
acuity, did not correlate significantly with the width of the MAA among the participants with
amblyopia. While the relevant predictors of the amblyopic deficit in MAA remain to be
determined, the width of the MAA may depend on historical factors such as age of onset,
age at treatment, and duration of patching, that are generally not known or remembered.
Furthermore, the lack of relation between MAA and clinical markers of amblyopia may reflect a
relatively short sensitive period for recovery of MAA compared to that for visual acuity. Indeed,
another amblyopic deficit possibly mediated by the superior colliculus—prolongation of saccadic
latency—can persist despite successful visual rehabilitation (Ciuffreda et al., 1978).
In addition to widening of the MAA, people with amblyopia also showed a significant tendency
to mislocalize sounds in the auditory hemifield ipsilateral to their amblyopic eye. This pattern of
auditory localization deficits is remarkable because it does not match the pattern of visual spatial
deficits observed in amblyopia (Hess & Pointer, 1985; Sireteanu et al., 2008). Although
participants localized sounds using a visually-guided cursor, the task was done with both eyes
open, and the specified click locations were well within the field of view of the fellow eye even
for the most eccentric auditory targets at 16° left and right of the midline. The asymmetry in
sound localization error therefore cannot be attributed to difficulty seeing the visual cursor.
Furthermore, the pattern does not reflect the functional anatomy of the retinogeniculostriate
pathway, because the left and right primary visual cortices receive equal input from each eye,
ensuring that monocular visual loss does not cause blindness in half of the visual field (i.e.,
homonymous hemianopia). Rather, the hemispatial asymmetry in sound mislocalization is
suggestive of the coordinate framework of the retinocollicular pathway, because retinal input to
each superior colliculus is largely crossed from the contralateral eye (Lane et al., 1973; Pollack
& Hickey, 1979). That significant correlations between the sound mislocalization and the
severity of amblyopic visual deficits were restricted to the ipsilateral hemifield provides
additional evidence of retinocollicular involvement by the same reasoning. Taken together, these
findings provide the first behavioural evidence that the retinocollicular pathway functions to
calibrate auditory spatial abilities in humans.
Chapter 5 Study III
Study III: Alterations in Audiovisual Simultaneity Perception in Amblyopia
5.1 Abstract
Amblyopia is a developmental visual impairment that is increasingly recognized to affect higher-
level perceptual and multisensory processes. To further investigate the audiovisual perceptual
impairments associated with this condition, we characterized the temporal interval in which
asynchronous auditory and visual stimuli are perceived as simultaneous 50% of the time (i.e., the
audiovisual simultaneity window). Adults with unilateral amblyopia (n = 17) and visually normal
controls (n = 17) judged the simultaneity of a flash and a click presented with both eyes viewing.
The signal onset asynchrony (SOA) varied from 0 ms to 450 ms for auditory-lead and visual-lead
conditions. A subset of participants with amblyopia (n = 6) was tested monocularly. Compared
to the control group, the auditory-lead side of the audiovisual simultaneity window in the amblyopia group was widened
by 48 ms (36%; p = 0.002), and the visual-lead side was widened by 86 ms (37%; p =
0.02). The overall mean window width was 500 ms, compared to 366 ms among controls (37%
wider; p = 0.002). Among participants with amblyopia, the simultaneity window parameters
were unchanged by viewing condition, but subgroup analysis revealed differential effects on the
parameters by amblyopia severity, etiology, and foveal suppression status. Possible mechanisms
to explain these findings include visual temporal uncertainty, interocular perceptual latency
asymmetry, and disruption of normal developmental tuning of sensitivity to audiovisual
asynchrony.
5.2 Introduction
Amblyopia is a developmental visual impairment caused by abnormal visual experience during a
critical period in early childhood. It has a prevalence of 2–4% (Attebo et al., 1998; Brown et al.,
2000; Buch et al., 2001; Friedman et al., 2009; Preslan & Novak, 1996; Thompson et al., 1991;
Vinding et al., 2009), and is recognized as a leading cause of monocular blindness (Buch et al.,
2001; Krueger & Ederer, 1984). Clinically, it presents as a unilateral, or rarely bilateral,
reduction in best-corrected visual acuity that cannot be explained solely by a structural eye
abnormality. It is often accompanied by one or more factors, most commonly strabismus (eye
misalignment) or anisometropia (difference in refractive error between the eyes) that interfere
with normal binocular visual experience (American Academy of Ophthalmology Pediatric
Ophthalmology/Strabismus Panel, 2012).
While it is classically understood as a predominantly monocular visual disorder affecting low-
level visual functions such as optotype acuity, stereopsis, and contrast sensitivity (Abrahamsson
& Sjostrand, 1988; Hess & Howell, 1977; Levi & Harwerth, 1977; Levi et al., 1994), amblyopia
is increasingly recognized to involve deficits in higher-level perceptual processing. Affected
individuals show impairments in global shape detection (Hess et al., 1999), real-world scene
perception (Mirabella et al., 2011), motion processing (Aaen-Stockdale & Hess, 2008; Simmers,
Ledgeway, Hess, & McGraw, 2003), and feature counting (Sharma et al., 2000) that affect not
only the amblyopic eye, but also often extend to the fellow eye (Giaschi et al., 1992; Ho et al.,
2005; Kovacs et al., 2000). Beyond the purely visual domain, recent work has shown that
amblyopia also affects multisensory integration in speech perception, manifest as reduced
susceptibility to the McGurk effect, even while viewing with both eyes (Burgmeier et al., 2015;
Narinesingh et al., 2015; Narinesingh et al., 2014).
Multisensory integration is the process by which information from the various senses is
associated and merged into a unified percept. It confers broad advantages in terms of response
time (Morrell, 1968) and accuracy of discrimination (Frassinetti et al., 2002) (see Ernst and
Bulthoff (2004) for review). In infancy, normal visual experience during a critical period is
necessary for the emergence of robust integration of auditory and visual signals (Putzar et al.,
2007; Putzar, Hötting, et al., 2010; Wallace et al., 2004). In turn, audiovisual integration plays an
important role in the development of higher level perceptual functions including speech
acquisition in infancy (Kushnerenko et al., 2008; Lewkowicz & Hansen-Tift, 2012) and speech
comprehension in adulthood (Driver, 1996; Grant & Seitz, 2000; Grant et al., 1998; Sumby &
Pollack, 1954). Interestingly, deficits in multisensory integration have been increasingly
recognized as a feature of various neurodevelopmental disorders, including autism (Stevenson et
al., 2014), dyslexia (Hairston, Burdette, Flowers, Wood, & Wallace, 2005), and schizophrenia
(Foucher, Lacambre, Pham, Giersch, & Elliott, 2007; Martin, Giersch, Huron, & van
Wassenhove, 2013), but the mechanism remains elusive.
Visual and auditory stimuli presented in close temporal and spatial correspondence are likely to
be perceived as arising from a single event. This process, termed perceptual binding, is a rapid
pre-attentive process that occurs without the conscious awareness of the observer, and constitutes
a fundamental rule for learning associations between stimuli (Driver, 1996; McGurk &
MacDonald, 1976; Sekuler et al., 1997). Neuroimaging studies indicate that the temporal
correspondence of auditory and visual speech stimuli activates a broad network, including the
superior colliculus (SC), anterior insula, and anterior intraparietal sulcus (IPS), while perceptual
fusion (e.g. as in the McGurk effect) is associated with activation in the multisensory superior
temporal sulcus (mSTS), the middle IPS, and regions of the primary auditory cortex (Macaluso,
George, Dolan, Spence, & Driver, 2004; Miller & D'Esposito, 2005; Stevenson, VanDerKlok,
Pisoni, & James, 2011). Similar studies of non-speech stimuli (e.g. click-flash pairs) have shown
that temporal correspondence of simple audiovisual stimuli activates the SC, mSTS, IPS, and
insula (Calvert et al., 2001), while detection or perception of asynchrony is associated with
activation of an extensive network including the insula, posterior parietal, and prefrontal regions,
with the right insula being involved most significantly (Bushara et al., 2001). Furthermore,
Noesselt et al. (2007) showed that temporal correspondence of simple audiovisual stimuli not
only activates the mSTS, but also affects activity in the primary auditory and visual cortices,
likely by a feedback mechanism from the mSTS.
The temporal interval during which separate visual and auditory stimuli are perceived reliably as
simultaneous is termed the audiovisual simultaneity window, and reflects an equilibrium
between the sensitivity to signal asynchrony (which narrows the audiovisual simultaneity
window) and the tendency toward perceptual binding (which widens the audiovisual simultaneity
window). It is measured using a single-interval forced-choice simultaneity judgment task for
audiovisual stimulus pairs presented with varying signal onset asynchrony (SOA). It typically
has a bell-shaped distribution with a slight skew toward the visual-lead side of objective
simultaneity (Slutsky & Recanzone, 2001; Stevenson & Wallace, 2013; Zampini, Guest, et al.,
2005). Furthermore, audiovisual stimuli are typically perceived as maximally simultaneous when
the visual stimulus slightly precedes the sound. This visual-lead shift in the point of subjective
simultaneity (PSS) is commonly believed to reflect either tuning to the natural condition in
which light waves reach the eyes before sound waves reach the ears, or the neural delay related
to slower processing of visual signals (Vroomen & Keetels, 2010). The audiovisual simultaneity
window progressively narrows on both auditory-lead and visual-lead sides from childhood
through adolescence, reaching the adult shape by 9 to 17 years of age (Chen et al., 2016; Hillock-
Dunn & Wallace, 2012; Hillock et al., 2011; Lewkowicz & Flom, 2014). Interestingly,
individuals with a narrower audiovisual simultaneity window, particularly on the visual-lead
side, experience a stronger McGurk effect, suggesting that the audiovisual simultaneity window
may be an index of broader audiovisual integrative function (Stevenson, Zemtsov, et al., 2012).
For an individual with a developmentally normal sensorium, the overall width of the audiovisual
simultaneity window is not fixed, but varies depending on the characteristics of the stimuli.
Complex stimuli such as natural speech and audiovisual stimuli with high semantic congruency
result in a wider audiovisual simultaneity window than simple flash-beep stimuli (Stevenson &
Wallace, 2013; van Wassenhove et al., 2007). Increased spatial separation between the paired
stimuli (Keetels & Vroomen, 2005; Zampini, Guest, et al., 2005), as well as availability of visual
predictive information about when to expect an audiovisual event to occur (Petrini, Russell, &
Pollick, 2009), result in a narrower audiovisual simultaneity window. Its width can be further
narrowed by various forms of perceptual learning—short-term audiovisual and visual-only
training with feedback (Powers et al., 2009; Stevenson et al., 2013), long-term musical training
(Lee & Noppeney, 2011a), and video gaming experience (Donohue, Woldorff, & Mitroff, 2010).
In addition to the width of the audiovisual simultaneity window, its peak, or point of subjective
simultaneity is also variable. Repeated exposure to asynchronous stimuli shifts it toward the
trained asynchrony in a process termed temporal recalibration (Fujisaki et al., 2004; Navarra et
al., 2005; Roseboom & Arnold, 2011). Furthermore, the presence of an additional visual stimulus
that closely precedes or follows a synchronous audiovisual pair biases the PSS away from the
additional stimulus (Roseboom, Nishida, & Arnold, 2009).
Abnormal early visual experience has been shown to affect multisensory processing. Adults with
early pattern vision deprivation from bilateral congenital cataracts have an audiovisual
simultaneity window that is selectively broadened on the visual-lead side (Chen et al., 2017), as
well as diminished audiovisual interaction in speech perception (Putzar, Hötting, et al., 2010),
and a shift in attentional balance toward audition (de Heering et al., 2016). In contrast, the
audiovisual simultaneity window of adults with unilateral congenital cataract is symmetrically
broadened, similar to that seen in typically-developing children (Chen et al., 2017). Audiovisual
interactions have also been studied in monocular adults with a history of early enucleation. Like
those with unilateral amblyopia, this population shows reduced susceptibility to the McGurk
effect, but demonstrates normal responses to illusions involving temporal audiovisual integration
such as the sound-induced flash illusion and audiovisual simultaneity judgments (Moro &
Steeves, 2015). This suggests that audiovisual integration deficits may be specific to the nature
of the visual sensory disturbance during the critical period.
Despite its relatively high prevalence, much less is known about the extent of the multisensory
deficits in unilateral amblyopia from strabismus and anisometropia. Specifically, it is unclear
whether the audiovisual integration deficits in these forms of amblyopia are specific to speech, or
whether they reflect a broader impairment in multisensory processing. Evidence from visually
normal adults suggests that susceptibility to the McGurk effect is correlated with other indices of
temporal audiovisual integration (Stevenson, Zemtsov, et al., 2012). One such index is the
audiovisual simultaneity window. Visually normal individuals with lower susceptibility to the
McGurk effect have a wider audiovisual simultaneity window, indicating altered processing of
asynchronous multimodal signals (Stevenson, Zemtsov, et al., 2012). Based on this evidence
from visually normal adults and our previous studies showing that adults with amblyopia are less
susceptible to the McGurk effect (Narinesingh et al., 2015; Narinesingh et al., 2014), we
hypothesized that adults with unilateral amblyopia would also show a symmetrically broadened audiovisual
simultaneity window under binocular and monocular viewing conditions, indicating a higher-
level alteration in audiovisual integration that is generalized beyond speech.
5.3 Materials and Methods
5.3.1 Participants
Participants were adults aged 18 to 48 years, with no history of neurological, auditory, or visual
disorders other than amblyopia, strabismus, or ametropia. Each participant was assessed by a
certified orthoptist or ophthalmologist to document visual acuity (standard ETDRS chart),
stereoacuity (Randot circles test and Titmus fly test), binocularity (Worth 4-dot test), eye
alignment (cover-uncover and alternate cover tests), and refractive correction. Amblyopia was
defined as a visual acuity of 0.18 logMAR (20/30) or worse in the amblyopic eye, and an inter-
ocular difference of at least 0.2 logMAR (2 lines on the ETDRS chart). Anisometropic
amblyopia was defined as an inter-ocular difference of 1 diopter (D) or more in either spherical
equivalent or astigmatic correction. Strabismic amblyopia was defined as any manifest deviation
on cover testing in the absence of anisometropia. Mixed amblyopia was defined as the presence
of both anisometropia and a manifest deviation of 8 prism diopters or more. Visually normal was
defined as visual acuity of 0.1 logMAR (20/25) or better in each eye. All participants completed a
hearing test on a commercially-available screening audiometer (model MA 27, MAICO
Diagnostics, Eden Prairie, MN, USA) with circumaural headphones (model TDH 39, MAICO
Diagnostics, Eden Prairie, MN, USA) to ensure reliable responses to low level (≤25 dB) pure
tones at a standard set of frequencies (0.5, 1, 2, and 4 kHz) (Student Support Services Team,
2008). Participants were excluded if they had a history of any other ocular pathology, previous
intraocular surgery, high ametropia (hyperopia > +5D or myopia > -6D), hearing impairment,
neurological disease, or neurodevelopmental disorder. Written informed consent was obtained
from all participants. The study was approved by the Research Ethics Board at The Hospital for
Sick Children, and all protocols adhered to the tenets of the Declaration of Helsinki.
Participants were recruited from November 2014 to February 2016 through flyers posted on
hospital property and advertisements posted on the social media websites Craigslist.ca and
Kijiji.ca. Of 26 individuals with amblyopia recruited, 17 passed the screening examinations and
participated in the study (3 males, mean age: 29 years, range: 19–48 years). An equal number of
visually normal naive control participants were recruited in a similar fashion (4 males, mean age:
29 years, range: 22–47 years). The clinical characteristics of the participants with amblyopia are
summarized in Table 5.1.
Table 5.1: Characteristics of participants with amblyopia

                           Visual acuity (logMAR)  Refractive correction
Participant  Age  Subtype  RE      LE              RE                  LE                  Stereo acuity (arc sec)  Worth 4-dot response
1            29   Strab    0.00    1.00            None                None                Not measurable           LE suppressed
2            22   Aniso    0.00    0.48            -1.50 +0.50 x 80    +1.00 +1.25 x 95    200                      Fused
3            48   Aniso    0.70    0.00            +2.25 +0.25 x 174   -0.75               3000                     Fused
4            36   Aniso    0.00    0.40            -1.00               +1.00               140                      Fused
5            29   Aniso    0.48    -0.10           -5.00               -1.25               3000                     Fused
6            23   Aniso    -0.10   0.48            -2.25               +0.25 +2.25 x 85    200                      Fused
7            29   Aniso    0.10    0.70            -1.50 +1.50 x 100   -3.00 +1.50 x 93    Not measurable           LE suppressed
8            32   Strab    -0.10   0.18            None                None                70                       Fused
9            29   Mixed    0.00    1.00            Plano               +3.50 +2.00 x 90    Not measurable           LE suppressed
10           19   Aniso    0.00    0.18            -0.75 +2.00 x 84    -2.75 +4.50 x 99    40                       Fused
11           37   Mixed    -0.10   1.30            -1.00               +6.00 +2.50 x 120   Not measurable           LE suppressed
12           32   Aniso    -0.10   0.54            Plano               +2.00 +2.00 x 124   140                      Fused
13           23   Strab    0.20    0.00            +0.50 +0.50 x 28    +1.25 +0.50 x 88    Not measurable           Diplopic
14           44   Mixed    0.90    0.00            +6.00 +1.25 x 75    -0.75               Not measurable           RE suppressed
15           22   Aniso    1.10    -0.10           -6.00 +0.75 x 174   -4.50 +0.50 x 75    3000                     Fused
16           19   Mixed    0.48    0.00            +3.00 +1.00 x 130   +4.25               3000                     Fused
17           27   Strab    0.00    0.48            -6.25 +1.00 x 45    -5.50 +1.25 x 135   200                      Fused

Abbreviations: RE, right eye; LE, left eye; Aniso, anisometropic; Strab, strabismic.
5.3.2 Apparatus and Stimuli
Experiments were performed in a dark, sound attenuating chamber (internal dimensions 2.0 x 2.1
x 2.2 m) lined with 5 cm acoustic wedge foam (Foam Factory, Macomb, MI, USA). The
background noise was 39.0 dBA sound pressure level (SPL). Visual stimuli were gray Gaussian
blobs (6 SD = 4°) presented centrally for 33 ms (2 frames at 60 Hz) on a 165 cm LED monitor
(NEC, model E654, Tokyo, Japan). Auditory stimuli were 32 ms white noise click trains
(including a 2 ms sigmoid on/off ramp) presented at 62.0 dBA SPL via stereo speakers (HP Inc.,
model BR387AA#ABA, Palo Alto, CA, USA) mounted on either side of the monitor. Stimuli
were created digitally and controlled using a custom-written program, and participant responses
were collected directly via a gamepad (Logitech, model F710, Newark, CA, USA). The visual
and acoustic signals were horizontally aligned at eye level of the seated participant, and relative
timing was confirmed with an oscilloscope.
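The angular sizes reported for the stimuli can be related to physical sizes on the monitor through the standard visual angle relation, size = 2d tan(θ/2). The following Python sketch is illustrative only: it assumes the 65 cm viewing distance given in the Procedure, and the function name is hypothetical rather than part of the custom-written stimulus program.

```python
import math


def size_from_visual_angle(angle_deg, distance_cm):
    """Physical extent (cm) that subtends `angle_deg` at `distance_cm`."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


# Gaussian blob extent (6 SD = 4 degrees) at the assumed 65 cm viewing distance.
blob_extent_cm = size_from_visual_angle(4.0, 65.0)
print(f"4 deg at 65 cm is about {blob_extent_cm:.2f} cm")
```

Under these assumptions, the 4° blob spans roughly 4.5 cm on the screen.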
5.3.3 Procedure
The audiovisual simultaneity window was characterized using a two-alternative forced-choice
(2AFC) simultaneity judgment task. With the head stabilized on a chinrest 65 cm from the LED
monitor, participants were required to fixate a central red dot on the monitor (0.7°) and press a
button on the gamepad to initiate each trial. Following a random interval of 500 to 1500 ms
during which the screen was dark, a flash-click pair was presented, and the participant indicated
whether the two stimuli were “simultaneous” or “not simultaneous”. The signal onset
asynchrony (SOA) was varied from -450 ms (auditory stimulus presented first, i.e., auditory-
lead) to +450 ms (visual stimulus presented first, i.e., visual-lead) in 75 ms increments (i.e. -450,
-375, -300, -225, -150, -75, 0, +75, +150, +225, +300, +375, +450 ms) for a total of 13 SOA
levels (Figure 5.1). There were 20 trials for each SOA level, randomly interleaved in a single
block, typically taking 12–15 minutes to complete. Data were collected under binocular viewing
conditions for all participants. Data were also collected under amblyopic eye and fellow eye
monocular viewing conditions for a subset of 6 participants with amblyopia to determine if any
group effects were dependent on viewing condition.
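The trial structure described above can be sketched as follows. This is a minimal Python illustration, not the custom-written program used in the study; the function names are hypothetical, and drawing the pre-stimulus interval from a uniform distribution is an assumption.

```python
import random

# 13 SOA levels from -450 ms (auditory-lead) to +450 ms (visual-lead) in 75 ms steps.
SOA_LEVELS_MS = list(range(-450, 451, 75))
TRIALS_PER_SOA = 20


def build_trial_sequence(seed=None):
    """Return one block of SOAs, 20 per level, in randomly interleaved order."""
    rng = random.Random(seed)
    trials = [soa for soa in SOA_LEVELS_MS for _ in range(TRIALS_PER_SOA)]
    rng.shuffle(trials)
    return trials


def draw_foreperiod_ms(rng):
    """Dark-screen interval before the flash-click pair (assumed uniform, 500-1500 ms)."""
    return rng.uniform(500.0, 1500.0)


sequence = build_trial_sequence(seed=1)
print(len(SOA_LEVELS_MS), "SOA levels,", len(sequence), "trials per block")
```

A single block therefore contains 13 x 20 = 260 trials.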
Figure 5.1: Schematic diagram of signal onset asynchronies (SOA) for auditory-lead and
visual-lead conditions.
5.3.4 Analysis
The proportion of “simultaneous” responses was calculated for each SOA, and the response
distribution was fitted with a previously described truncated Gaussian function using the
maximum likelihood method (Fujisaki et al., 2004). The correlation coefficient of the fit was
≥0.93 for each individual. The audiovisual simultaneity window width was defined as the width
of the fitted function at the 50% simultaneous response level, with the SOA to the left and right
of 0 ms (i.e. physical simultaneity) representing the auditory-lead threshold and visual-lead
threshold, respectively. The point of subjective simultaneity (PSS) was defined as the mean of the
fitted truncated Gaussian function. Group parameters were calculated as the arithmetic means of
the individual participant parameters. Sample data with fitted function are shown in Figure 5.2.
All curve fitting and parameter calculations were done using MATLAB version 2011b
(Mathworks, Inc., Natick, MA, USA).
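The logic of the analysis can be sketched in Python as follows. This is not the MATLAB code used in the study: the exact form of the truncated Gaussian is not reproduced here, so the sketch assumes a plain amplitude-scaled Gaussian, and all function names and parameter values are hypothetical. It shows the binomial likelihood that a maximum likelihood fit would minimize, and how the two thresholds follow from the fitted parameters at the 50% response level.

```python
import math


def gaussian_model(soa_ms, amplitude, mean_ms, sd_ms):
    """Predicted proportion of 'simultaneous' responses at one SOA."""
    return amplitude * math.exp(-((soa_ms - mean_ms) ** 2) / (2.0 * sd_ms ** 2))


def negative_log_likelihood(params, soas_ms, n_simultaneous, n_trials):
    """Binomial negative log-likelihood of the response counts under the model."""
    amplitude, mean_ms, sd_ms = params
    total = 0.0
    for soa, k in zip(soas_ms, n_simultaneous):
        p = gaussian_model(soa, amplitude, mean_ms, sd_ms)
        p = min(max(p, 1e-6), 1.0 - 1e-6)  # clip so that log() stays finite
        total -= k * math.log(p) + (n_trials - k) * math.log(1.0 - p)
    return total


def thresholds_at_level(amplitude, mean_ms, sd_ms, level=0.5):
    """SOAs where the curve crosses `level`: (auditory-lead, visual-lead)."""
    if amplitude <= level:
        raise ValueError("fitted curve never reaches the requested response level")
    half_width = sd_ms * math.sqrt(2.0 * math.log(amplitude / level))
    return mean_ms - half_width, mean_ms + half_width


# Hypothetical fitted parameters, for illustration only.
auditory_lead, visual_lead = thresholds_at_level(amplitude=1.0, mean_ms=50.0, sd_ms=150.0)
window_width = visual_lead - auditory_lead
print(f"thresholds: {auditory_lead:.0f} ms, {visual_lead:.0f} ms; width: {window_width:.0f} ms")
```

In practice the negative log-likelihood would be passed to a numerical minimizer to obtain the fitted parameters. Note that with this symmetric form the two thresholds are equidistant from the mean, which need not hold for the truncated function used in the thesis.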
Figure 5.2: Sample audiovisual simultaneity judgment data from a visually normal control
participant, fitted with a truncated Gaussian function by the maximum likelihood method.
The psychometric parameters (i.e., audiovisual simultaneity window width, auditory-lead
threshold, and visual-lead threshold) were estimated at the 50% simultaneous response level. AV
= audiovisual.
Performance parameters (i.e., auditory-lead threshold, visual-lead threshold, audiovisual
simultaneity window width, and PSS) were compared between groups using one-way analysis of
variance (ANOVA) and Tukey post hoc multiple comparisons. Homogeneity of variances was
verified in each case by Levene’s test. Subgroup analyses were performed based on 4 common
clinical factors in amblyopia: (1) severity of the monocular acuity deficit, (2) presumed etiology,
(3) presence or absence of foveal suppression, and (4) level of stereopsis. Amblyopia severity
was classified as moderate if the acuity was ≤ 0.6 logMAR in the amblyopic eye, and as severe if
the acuity was >0.6 logMAR (American Academy of Ophthalmology Pediatric
Ophthalmology/Strabismus Panel, 2012). Presumed etiology was classified as either
anisometropic or strabismic/mixed. Foveal suppression status was classified as suppressed or
non-suppressed based on results from the Worth 4-dot test. Level of stereopsis was classified as
fine (i.e., some Randot circles; ≤400 seconds of arc) or poor (i.e., no Randot circles).
Associations between the 4 clinical factors were assessed using 2x2 contingency tables and the
phi coefficient (Φ). All statistics were computed using IBM SPSS Statistics version 22 (Armonk,
NY, USA). Statistical significance was defined as p < 0.05.
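For reference, the phi coefficient for a 2x2 contingency table with cells a, b (first row) and c, d (second row) is (ad - bc) divided by the square root of the product of the four marginal totals. The Python sketch below illustrates the calculation only; it is not the SPSS procedure used in the study, and the cell counts are hypothetical rather than taken from the participant data.

```python
import math


def phi_coefficient(a, b, c, d):
    """Phi coefficient for the 2x2 table [[a, b], [c, d]]."""
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denominator == 0:
        raise ValueError("phi is undefined when a row or column total is zero")
    return (a * d - b * c) / denominator


# Hypothetical counts: rows are two levels of one factor, columns two levels of another.
print(round(phi_coefficient(10, 2, 3, 9), 3))
```

The sign of phi depends on how the rows and columns are ordered; only its magnitude reflects the strength of association.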
5.4 Results
5.4.1 Binocular Viewing Condition
5.4.1.1 Main Group Analysis
The audiovisual simultaneity window in adults with unilateral amblyopia was broadened by 134
ms, or 37%, compared to control participants (F(1,32) = 11.313, p = 0.002) when viewing
binocularly (Figure 5.3 and Table 5.2). The auditory-lead side of the audiovisual simultaneity
window was wider by 48 ms (36%; F(1,32) = 11.012, p = 0.002), and the visual-lead side was
wider by 86 ms (37%; F(1,32) = 6.00, p = 0.02). There was no significant difference in the PSS
between the control and amblyopia groups.
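A one-way ANOVA F statistic can be recovered approximately from group means, standard deviations, and sample sizes. The Python sketch below is illustrative and is not the SPSS analysis used in the study; the function name is hypothetical, and because the inputs are the rounded window-width values from Table 5.2, the result only approximates the reported statistic.

```python
def one_way_anova_from_summary(means, sds, ns):
    """Return (F, df_between, df_within) from per-group summary statistics."""
    k = len(means)
    n_total = sum(ns)
    grand_mean = sum(m * n for m, n in zip(means, ns)) / n_total
    ss_between = sum(n * (m - grand_mean) ** 2 for m, n in zip(means, ns))
    ss_within = sum((n - 1) * s ** 2 for s, n in zip(sds, ns))
    df_between = k - 1
    df_within = n_total - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Window width, control vs. amblyopia (rounded mean and SD in ms, n = 17 each).
f_value, df_b, df_w = one_way_anova_from_summary([366, 500], [91, 136], [17, 17])
print(f"F({df_b},{df_w}) = {f_value:.2f}")
```

With these rounded inputs the sketch gives F(1,32) of about 11.4, close to the reported value of 11.313.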
Figure 5.3: Main group analysis for audiovisual simultaneity judgment responses with both
eyes viewing as a function of SOA. Comparison between control (n = 17) and amblyopia (n =
17) participant groups. Error bars represent standard error of the mean.
Table 5.2: Audiovisual simultaneity window parameters by main group
Values are SOA, mean ± SD (ms).

Performance parameter           Control (n = 17)    Amblyopia (n = 17)    F(1,32)    p-value
Auditory-lead threshold         -136 ± 34           -183 ± 49*            11.012     0.002
Visual-lead threshold           231 ± 83            317 ± 119*            6.000      0.020
AV simultaneity window width    366 ± 91            500 ± 136*            11.313     0.002
PSS                             47 ± 44             67 ± 60               1.131      0.295

Abbreviations: * p < 0.05 (one-way ANOVA); SOA, signal onset asynchrony; SD, standard
deviation; AV, audiovisual.
5.4.1.2 Subgroup Analysis by Clinical Factors
5.4.1.2.1 Amblyopia Severity
Results of the subgroup analysis by amblyopia severity are summarized in Table 5.3 and Figure
5.4A. In the moderate amblyopia subgroup (n = 10), the auditory-lead threshold was broadened
by 45 ms (33%; p = 0.032), but the other parameters (visual-lead threshold, audiovisual
simultaneity window, and PSS) were not significantly different from the control group. In the
severe amblyopia subgroup (n = 7), three parameters were broadened compared to the control
group: the auditory-lead threshold by 51 ms (38%; p = 0.030), the visual-lead threshold by 155
ms (67%; p = 0.003), and the audiovisual simultaneity window by 207 ms (57%; p = 0.001). The
PSS in the severe amblyopia group showed a non-significant trend toward a visual-lead shift
compared to the control group (p = 0.064).
Within the amblyopia group (i.e., moderate vs. severe), severity was significantly related to only
the visual-lead threshold, with those classified as severe having a threshold 118 ms wider
compared to those classified as moderate (p = 0.043). Severe amblyopia also showed non-
significant trends toward a wider simultaneity window (p = 0.068) and a visual-lead shifted PSS
(p = 0.071) compared to the moderate group.
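The moderate/severe split follows the 0.6 logMAR cut-off given in the Analysis section. As a minimal Python sketch (the function name is hypothetical), applying that rule to the amblyopic-eye acuities listed in Table 5.1 reproduces the subgroup sizes used here:

```python
SEVERE_CUTOFF_LOGMAR = 0.6


def classify_severity(amblyopic_eye_logmar):
    """Moderate if acuity is <= 0.6 logMAR in the amblyopic eye, otherwise severe."""
    return "moderate" if amblyopic_eye_logmar <= SEVERE_CUTOFF_LOGMAR else "severe"


# Amblyopic-eye (worse-eye) acuity for participants 1 to 17, read from Table 5.1.
AMBLYOPIC_EYE_LOGMAR = [1.00, 0.48, 0.70, 0.40, 0.48, 0.48, 0.70, 0.18, 1.00,
                        0.18, 1.30, 0.54, 0.20, 0.90, 1.10, 0.48, 0.48]

counts = {"moderate": 0, "severe": 0}
for acuity in AMBLYOPIC_EYE_LOGMAR:
    counts[classify_severity(acuity)] += 1
print(counts)
```

This yields 10 participants classified as moderate and 7 as severe.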
Figure 5.4: Subgroup analyses for audiovisual simultaneity judgment responses with both
eyes viewing as a function of SOA. (A) Comparison by amblyopia severity. (B) Comparison by
presumed etiology. (C) Comparison by foveal suppression status. (D) Comparison by level of
stereopsis. Error bars represent standard error of the mean.
Table 5.3: Audiovisual simultaneity window parameters by amblyopia severity
Values are SOA, mean ± SD (ms).

                                                    Amblyopia
Performance parameter           Control (n = 17)    Moderate (n = 10)    Severe (n = 7)    F(2,31)    Omnibus p-value
Auditory-lead threshold         -136 ± 34           -180 ± 39*           -186 ± 63*        5.393      0.010
Visual-lead threshold           231 ± 83            268 ± 103            386 ± 111*†       6.700      0.004
AV simultaneity window width    366 ± 91            448 ± 126            572 ± 122*        9.120      0.001
PSS                             47 ± 44             44 ± 45              100 ± 67          3.281      0.051

Abbreviations: * p < 0.05 (vs. Control group post hoc); † p < 0.05 (vs. Moderate group post
hoc); SOA, signal onset asynchrony; SD, standard deviation; AV, audiovisual.
5.4.1.2.2 Amblyopia Etiology
Results of the subgroup analysis by amblyopia etiology are summarized in Table 5.4 and Figure
5.4B. In the anisometropic subgroup, the auditory-lead threshold was broadened by 75 ms (56%;
p < 0.001) and the audiovisual simultaneity window was broadened by 134 ms (37%; p = 0.025),
but the visual-lead threshold and PSS were not significantly different from the control group. In
the strabismic/mixed subgroup, the visual-lead threshold was broadened by 116 ms (50%; p =
0.032), and the audiovisual simultaneity window was broadened by 133 ms (36%; p = 0.033),
but unlike the anisometropic group, the auditory-lead threshold was not significantly different
compared to the control group. There was a non-significant trend toward a visual-lead shifted
PSS in the strabismic/mixed group compared to the control group (p = 0.064).
Within the amblyopia group (i.e. anisometropic vs. strabismic/mixed), etiology was significantly
related to the auditory-lead threshold, with those classified as anisometropic having a threshold
57 ms wider compared to those classified as strabismic/mixed (p = 0.009). The PSS in the
strabismic/mixed group also showed a non-significant trend toward a visual-lead shift compared
to the anisometropic group (p = 0.058).
Table 5.4: Audiovisual simultaneity window parameters by amblyopia etiology
Values are SOA, mean ± SD (ms).

                                                    Amblyopia
Performance parameter           Control (n = 17)    Aniso (n = 9)    Strab/mixed (n = 8)    F(2,31)    Omnibus p-value
Auditory-lead threshold         -136 ± 34           -210 ± 44*†      -153 ± 34              12.165     <0.001
Visual-lead threshold           231 ± 83            289 ± 107        348 ± 131*             3.689      0.037
AV simultaneity window width    366 ± 91            500 ± 134*       500 ± 147*             5.480      0.009
PSS                             47 ± 44             40 ± 47          97 ± 61                3.513      0.042

Abbreviations: * p < 0.05 (vs. Control group post hoc); † p < 0.05 (vs. Strab/mixed group post
hoc); SOA, signal onset asynchrony; SD, standard deviation; Aniso, anisometropic; Strab,
strabismic; AV, audiovisual.
5.4.1.2.3 Foveal Suppression Status
Results of the subgroup analysis by foveal suppression status are summarized in Table 5.5 and
Figure 5.4C. In the non-suppressed subgroup, the auditory-lead threshold was broadened by 59
ms (43%; p = 0.002) and the audiovisual simultaneity window was broadened by 116 ms (32%;
p = 0.033), but the visual-lead threshold and PSS were not significantly different from the
control group. In the suppressed subgroup, the visual-lead threshold was broadened by 156 ms
(68%, p = 0.011), the audiovisual simultaneity window was widened by 177 ms (48%; p =
0.014), and the PSS was shifted toward the visual-lead condition by 68 ms (p = 0.025), but the
auditory-lead threshold was not significantly different compared to the control group.
Within the amblyopia group, suppression status was significantly related to the PSS only, with
those classified as suppressed having a PSS shifted 69 ms toward the visual-lead condition
compared to those classified as non-suppressed (p = 0.030).
Table 5.5: Audiovisual simultaneity window parameters by suppression status
Values are SOA, mean ± SD (ms).

                                                    Amblyopia
Performance parameter           Control (n = 17)    Non-suppressed (n = 12)    Suppressed (n = 5)    F(2,31)    Omnibus p-value
Auditory-lead threshold         -136 ± 34           -195 ± 53*                 -156 ± 20             7.432      0.002
Visual-lead threshold           231 ± 82            287 ± 112                  387 ± 114*            5.041      0.013
AV simultaneity window width    366 ± 91            481 ± 149*                 543 ± 99*             6.146      0.006
PSS                             47 ± 44             46 ± 47                    115 ± 65*†            4.286      0.023

Abbreviations: * p < 0.05 (vs. Control group post hoc); † p < 0.05 (vs. Non-suppressed group
post hoc); SOA, signal onset asynchrony; SD, standard deviation; W4D, Worth 4-dot test; AV,
audiovisual.
5.4.1.2.4 Stereopsis Level
Results of the subgroup analysis by stereopsis level are summarized in Table 5.6 and Figure
5.4D. In the subgroup with fine stereopsis, none of the simultaneity window parameters were
significantly different from the control group, although there was a trend toward broadening of
the auditory-lead threshold that did not reach significance in post hoc testing (p = 0.055). In the
subgroup with poor stereopsis, the auditory-lead threshold was broadened by 49 ms (36%, p =
0.019), the visual-lead threshold was broadened by 103 ms (45%, p = 0.045), and the audiovisual
simultaneity window was broadened by 151 ms (41%; p = 0.007), but the PSS was not shifted
compared to the control group.
Within the amblyopia group, level of stereopsis was not significantly related to any simultaneity
window parameters.
Table 5.6: Audiovisual simultaneity window parameters by stereopsis level
Values are SOA, mean ± SD (ms).

                                                    Amblyopia
Performance parameter           Control (n = 17)    Fine stereopsis (n = 7)    Poor stereopsis (n = 10)    F(2,31)    Omnibus p-value
Auditory-lead threshold         -136 ± 34           -182 ± 42                  -184 ± 55*                  5.343      0.010
Visual-lead threshold           231 ± 83            293 ± 112                  333 ± 126*                  3.289      0.051
AV simultaneity window width    366 ± 91            475 ± 143                  518 ± 136*                  5.861      0.007
PSS                             47 ± 44             55 ± 45                    75 ± 70                     0.828      0.447

Abbreviations: * p < 0.05 (vs. Control group post hoc); SOA, signal onset asynchrony; SD,
standard deviation; AV, audiovisual.
5.4.1.2.5 Associations Between Clinical Factors
Participants with strabismic/mixed amblyopia were significantly more likely to exhibit foveal
suppression on the Worth 4-dot test compared to those with anisometropic amblyopia (Φ =
0.685, p = 0.005). Etiology was not significantly associated with amblyopia severity (Φ = 0.169,
p = 0.486) or stereopsis level (Φ = 0.310, p = 0.201). Participants with severe amblyopia were
significantly more likely to demonstrate foveal suppression (Φ = 0.509, p = 0.036) and to have
poor stereopsis (Φ = 0.700, p = 0.004) compared to those with moderate amblyopia. Participants
with foveal suppression on the Worth 4-dot test were significantly more likely to have poor
stereopsis compared to those with a non-suppressed response (Φ = 0.540, p = 0.026).
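The Φ values above quantify the association between pairs of dichotomous clinical factors. As a minimal sketch, assuming that Φ is the standard phi coefficient for a 2 × 2 contingency table (the text does not state the formula or the test used for the p-values), it could be computed as follows. The function name and the cell counts are hypothetical and are not the study data.

```python
import math


def phi_coefficient(a, b, c, d):
    """Phi coefficient for a 2 x 2 table laid out as [[a, b], [c, d]].

    Rows are the levels of one dichotomous factor and columns the levels
    of the other (e.g., severity by suppression status).
    """
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denominator == 0:
        raise ValueError("phi is undefined when a row or column total is zero")
    return (a * d - b * c) / denominator


# Hypothetical counts for illustration only (not taken from the thesis).
example_phi = phi_coefficient(5, 1, 2, 9)
print(round(example_phi, 3))
```

A value of 0 indicates no association, and values approaching 1 indicate that membership in one category almost fully predicts membership in the other.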
5.4.2 Monocular Viewing Conditions
A subset of 6 participants with amblyopia was tested under monocular amblyopic eye-only and
fellow eye-only viewing conditions. The mean “simultaneous” response percentages are plotted
by SOA in Figure 5.5. Repeated measures ANOVAs, summarized in Table 5.7, showed no
significant differences in any performance parameters across viewing conditions among
participants with amblyopia.
Figure 5.5: The audiovisual simultaneity window for binocular and monocular viewing
conditions among participants with amblyopia. There were no significant differences between
viewing conditions (n = 6). Error bars represent standard error of the mean.
Table 5.7: Comparison of audiovisual simultaneity window parameters by viewing
condition for participants with amblyopia (repeated measures ANOVA)
SOA, mean ± SD (ms)

Performance parameter         Both eyes   Amblyopic eye   Fellow eye  F(2,10)   Omnibus p-value
Auditory-lead threshold       -158 ± 40   -166 ± 53       -177 ± 73   0.331     0.726
Visual-lead threshold         304 ± 155   283 ± 145       279 ± 119   0.607     0.564
AV simultaneity window width  462 ± 157   449 ± 177       456 ± 144   0.054     0.948
PSS                           73 ± 82     58 ± 64         51 ± 67     1.244     0.329
Abbreviations: SOA, signal onset asynchrony; SD, standard deviation; AV, audiovisual.
5.5 Discussion
We characterized the audiovisual simultaneity window in adults with unilateral amblyopia and in
visually normal control participants using a simultaneity judgment task. The window parameter
values obtained among control participants were very similar to those previously published for
similar experimental protocols (Stevenson & Wallace, 2013; Stevenson, Zemtsov, et al., 2012;
Stone et al., 2001; Zampini, Shore, et al., 2005). With both eyes viewing, the window was wider
in participants with amblyopia on both the auditory-lead and visual-lead sides. The broadening of
the simultaneity window among participants with amblyopia was similar among amblyopic eye
only, fellow eye only, and binocular viewing conditions, suggesting that these perceptual
differences may involve an abnormal central multisensory network for temporal processing. The
results are similar to those reported for adults with early monocular deprivation from congenital
cataract (Chen et al., 2017), and demonstrate that the abnormalities in audiovisual integration in
the most prevalent forms of amblyopia are not specific to the McGurk effect (i.e., audiovisual
speech perception) (Burgmeier et al., 2015; Narinesingh et al., 2015; Narinesingh et al., 2014),
but generalize to simultaneity judgments of simple, non-speech stimuli.
Subgroup analyses of the participants with amblyopia by their clinical characteristics showed
several differentiating patterns. The auditory-lead side of the simultaneity window varied with
etiology, with significant broadening seen in the anisometropic group. In contrast, the visual-lead
side varied with severity, with significant broadening seen in the severe group. The PSS is a
composite of the auditory-lead and visual-lead threshold values, and as such, exhibited an
intermediate response: the PSS trended toward visual-lead shifts in the strabismic/mixed group
and in the severe group, and showed a significant visual-lead shift in the foveal suppression
group.
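The relation among the thresholds, the window width, and the PSS can be checked against the group means in Table 5.6. The sketch below takes the PSS as the midpoint of the two thresholds, and assumes that the window width is the distance between them (an inference from the tabulated values, not a definition stated in this section). Function names are illustrative.

```python
def window_width_ms(auditory_lead_ms, visual_lead_ms):
    """Assumed definition: distance between the two thresholds (ms)."""
    return visual_lead_ms - auditory_lead_ms


def pss_midpoint_ms(auditory_lead_ms, visual_lead_ms):
    """PSS taken as the midpoint of the two thresholds (ms)."""
    return (auditory_lead_ms + visual_lead_ms) / 2


# Group mean thresholds (auditory-lead, visual-lead) in ms, from Table 5.6.
GROUP_THRESHOLDS = {
    "control": (-136, 231),
    "fine stereopsis": (-182, 293),
    "poor stereopsis": (-184, 333),
}

for group, (a_lead, v_lead) in GROUP_THRESHOLDS.items():
    print(group, window_width_ms(a_lead, v_lead), pss_midpoint_ms(a_lead, v_lead))
```

The computed widths and midpoints agree with the tabulated window widths and PSS values to within about 1 ms, a difference that is presumably due to rounding of the group means.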
A major distinction between strabismic and anisometropic amblyopia is the difference in
binocular function (McKee et al., 2003). Strabismic and mixed mechanism amblyopia tend to
show stronger suppression and poorer stereopsis than anisometropic amblyopia (Birch et al.,
2016; Harrad & Hess, 1992; McKee et al., 2003). Interestingly, the clinical characteristics
associated with a broadened visual-lead threshold and visual-lead shifted PSS in this study are
also those known to indicate poor binocularity: strabismic/mixed etiology, foveal suppression,
and a severe monocular acuity deficit. Conversely, anisometropic etiology is known to indicate
relatively better binocular function, and was the only clinical characteristic positively associated
with a broadened auditory-lead threshold in this study.
While anisometropic and strabismic/mixed etiologies were distinguished by their effect on the
auditory-lead side of the audiovisual simultaneity window, several observations are noteworthy
(see Table 5.4 and Figure 5.4B). First, the width of the audiovisual simultaneity window among
the two etiology groups was the same. Second, the magnitude and direction of the differences in
the auditory-lead threshold, visual-lead threshold, and PSS (i.e. the midpoint of the two
thresholds) between the two etiology groups were nearly identical (i.e., 57–58 ms toward the
visual-lead side), suggesting a shift in the function rather than a widening of the visual-lead side.
Third, these effects are unlikely to be confounded by amblyopia severity, as there was no
statistical association between etiology and severity in the study sample. Taken together, these
observations suggest that two distinct mechanisms may be at play: that amblyopia in the absence
of significant strabismus or suppression (e.g., anisometropic amblyopia) leads to a symmetric
broadening of the audiovisual simultaneity window without shifting the PSS, and that it is the
overlay of significant strabismus or suppression (e.g. strabismic/mixed amblyopia) that shifts the
PSS toward the visual-lead condition. A symmetric broadening of the audiovisual simultaneity
window without a shift in PSS has also been observed in unilateral deprivational amblyopia
(Chen et al., 2017). Importantly, deprivational and anisometropic amblyopia share image
degradation as a common factor, and exhibit similarities on psychophysical tests of spatial acuity
and binocularity (McKee et al., 2003), lending further support to the hypothesis outlined above.
Because of the statistical associations between the clinical characteristics in the study sample, the
results must be interpreted with caution. Amblyopia severity was significantly associated with
every clinical characteristic except etiology, meaning that interpretation of the subgroup analyses
for suppression and stereopsis is confounded by unbalanced severity between groups. Some
variables may also reflect clinical factors, such as age of onset, which cannot generally be
determined accurately. Strabismus, for example, accounts for the majority of amblyopia cases
under age 3 years, while anisometropia becomes an etiologic factor primarily after age 3 (Birch,
2013). It is also likely that amblyopic etiology, suppression, stereopsis, and severity constitute
overlapping measures of common factors such as binocular function or age of onset, although
their interrelations are undoubtedly complex (McKee et al., 2003).
In visually normal individuals, the width of the audiovisual simultaneity window and PSS are not
only determined by sensory physiology, but are also modulated by cognitive factors such as
attention, and a decisional bias toward simultaneity (Zampini, Shore, et al., 2005). Attending to
either vision or audition has been shown to shift the PSS away from the attended modality in a
phenomenon termed prior entry (Spence et al., 2001). While it is possible that amblyopia is
associated with an attentional shift toward audition (de Heering et al., 2016), others have
determined that the magnitude of the prior entry effect in this task among visually normal
individuals is only 14 ms—far less than the 69 ms shift observed in the foveal suppression group
in this study (Zampini, Shore, et al., 2005). Decisional bias toward simultaneity (i.e. shift in
criterion for the unity assumption) would have the effect of widening both the auditory-lead and
visual-lead sides of the window without shifting the PSS (Welch & Warren, 1980). However, it
has been shown that within individuals, the width of the simultaneity window is stable over time
(Stone et al., 2001) and unaffected by the range of SOAs tested, suggesting that this parameter
reflects perceptual rather than decisional factors (Chen et al., 2016; Zampini, Guest, et al., 2005).
Indeed, if a decisional bias toward unity were the cause of a widened simultaneity window in
amblyopia, one might expect that susceptibility to the McGurk effect would also be
heightened, but this is not the case (Burgmeier et al., 2015; Narinesingh et al., 2014).
Multiple non-cognitive factors may also contribute to the main and subgroup differences in
audiovisual temporal perception described in this study. Hypothetically, widening of the
simultaneity window could result from strengthened multisensory perceptual binding. As with
decisional bias toward unity, however, the heightened McGurk effect expected from enhanced
audiovisual perceptual binding has not been observed in amblyopia (Burgmeier et al., 2015;
Narinesingh et al., 2015; Narinesingh et al., 2014). Rather, the accompaniment of a wide
simultaneity window in amblyopia with low susceptibility to the McGurk effect is akin to the
relation observed in visually normal individuals (Stevenson, Zemtsov, et al., 2012), and suggests
an impairment in the ability to resolve asynchronous audiovisual pairs as unique events. A
possible mechanism for such an impairment is temporal uncertainty in the visual domain.
Assuming that decisional and criterion factors are unchanged, less precise visual temporal
information would reduce the precision of audiovisual asynchrony detection, and widen the
simultaneity window. Indeed, evidence for temporal uncertainty in amblyopia exists. Spang and
Fahle (2009) reported reduced visual temporal resolution in the amblyopic eyes of anisometropic
and strabismic participants, and that the temporal deficit correlated with amblyopia severity as in
the present study. Huang et al. (2012) employed a synchrony detection task to demonstrate a
foveal temporal processing impairment in the amblyopic eye of strabismic and anisometropic
participants. Impaired temporal processing is also evident in the fellow eye in strabismic
amblyopia when the judgment of temporal order requires interhemispheric transmission across
the corpus callosum (St John, 1998). Visual temporal uncertainty such as that demonstrated in
amblyopia can be expected to have downstream effects on multisensory processes, including
audiovisual asynchrony detection, dependent on visual input.
As discussed above, the PSS shift toward visual-lead SOAs among participants with foveal
suppression was larger than that which is solely attributable to attentional effects (Zampini,
Shore, et al., 2005). PSS shifts of more comparable magnitude, however, have been observed in
normal adults as a result of temporal recalibration to constant asynchrony (Fujisaki et al., 2004).
This phenomenon is likely an important mechanism to deal with the natural physical and neural
asynchrony in auditory and visual signals, and presents a possible mechanism for the PSS shifts
observed in amblyopia. In visually normal adults, the first peak cortical evoked response occurs
75 ms after onset of an auditory stimulus and 104 ms after onset of a visual stimulus, resulting in
a neural asynchrony of about 30 ms even under ideal conditions (Andreassi & Greco, 1975). In
amblyopia, however, cortical response latencies from the affected eye are increased compared to
the fellow eye (Sokol, 1983; Zhang & Zhao, 2005). This transmission latency difference may be
another source of temporal uncertainty and act as the perceptual stimulus to shift the PSS toward
visual-lead SOAs. Indeed, evidence for a significant interocular perceptual latency difference in
amblyopia is provided by the observation of a spontaneous Pulfrich effect in some observers
with amblyopia (Tredici & von Noorden, 1984). Another possible explanation for the PSS shift
in amblyopia is that suppression and poor stereopsis may interfere with the normal ability to
account for sound velocity and source distance when making audiovisual simultaneity judgments
(Engel & Dougherty, 1971; Sugita & Suzuki, 2003). This explanation, however, is unlikely, as
monocular adults who lost one eye at an early age perform as normal controls in this task (Moro
& Steeves, 2015).
If the putative audiovisual temporal correspondence detector were intact in amblyopia, one could
reasonably speculate that occlusion of the affected eye would eliminate the temporal uncertainty
and perceptual latency, and normalize the audiovisual simultaneity window parameters.
However, we found viewing condition had no significant effect on the simultaneity window
parameters. This result agrees with the findings in deprivational amblyopia (Chen et al., 2017),
and suggests that the abnormality in audiovisual simultaneity judgment is not solely a result of
amblyopic visual input, but that it involves a central alteration in the capacity to process
audiovisual temporal information. Furthermore, this interpretation is consistent with considerable
evidence that points to the importance of early sensory experience for the emergence of normal
audiovisual integration processes. Neurophysiology studies of cats reared with experimentally
manipulated or absent visual input reported abnormal audiovisual multisensory responses in the
superior colliculus (Wallace et al., 2004; Wallace & Stein, 2007). Adult humans with a history of
transient bilateral visual deprivation in early life show reduced audiovisual multisensory
interaction in behavioural studies (Chen et al., 2017; Putzar et al., 2007; Putzar, Hötting, et al.,
2010), and large-scale cross-modal reorganization of the visual cortex as assessed using
functional MRI (Collignon et al., 2015). Interestingly, typically-developing children up to age 7
years have a symmetrically broadened audiovisual simultaneity window similar to that observed
in amblyopia, suggesting that the amblyopic audiovisual simultaneity window may represent a
persistent juvenile state (Chen et al., 2016; Hillock-Dunn & Wallace, 2012; Hillock et al., 2011;
Lewkowicz & Flom, 2014). If the mechanism by which the audiovisual simultaneity window
normally narrows through childhood is experience-dependent, then amblyopia may interfere with
the calibration and refinement of the cortical processes responsible for audiovisual simultaneity
and asynchrony perception. Plausibly, amblyopic visual temporal uncertainty during a critical
period of brain development may limit the resolution of audiovisual asynchrony detection,
leading to a widened audiovisual simultaneity window.
The view that audiovisual simultaneity perception is altered developmentally by the temporal
uncertainty and perceptual latency inherent to amblyopic vision is supported by the lack of a
similar effect in monocular adults. Indeed, adults with a history of early enucleation (i.e.,
removal of one eye) have a normal simultaneity window (Moro & Steeves, 2015). This indicates
that monocular visual loss alone is not sufficient to alter the simultaneity window, and suggests
that impaired but not absent visual input is necessary to disrupt the refinement of temporal
audiovisual processes.
Although amblyopia is classically regarded as a monocular impairment of spatial vision, the
findings of this study, combined with the prior finding of reduced susceptibility to the McGurk
effect, indicate an impairment of audiovisual multisensory perception that generalizes beyond
speech (Burgmeier et al., 2015; Narinesingh et al., 2014). In addition to the main finding of a
widened audiovisual simultaneity window in amblyopia, subgroup analysis suggested that an
accompanying shift in the PSS is dependent on etiology and binocularity. Although the
mechanisms are not clear, hypotheses include visual temporal uncertainty and interocular
perceptual latency asymmetry. The findings give insight into the developmental calibration of
normal multisensory processes, and highlight a previously underappreciated impact of amblyopia
beyond vision.
Chapter 6 Study IV
Study IV: Temporal Ventriloquism Reveals Normal Audiovisual Temporal Integration in Amblyopia
6.1 Abstract
Introduction: We have shown previously that amblyopia involves impaired detection of
asynchrony between auditory and visual events. To distinguish whether this impairment
represents a defect in temporal integration or non-integrative multisensory processing (e.g.,
cross-modal matching), we employed the temporal ventriloquism effect in which visual temporal
order judgment (TOJ) is normally enhanced by a lagging auditory click.
Methods: Participants with amblyopia (n = 9) and visually normal controls (n = 9) performed a
visual TOJ task. Pairs of clicks accompanied the two lights such that the first click preceded the
first light, or the second click lagged the second light, by 100, 200, or 450 ms. Baseline audiovisual
synchrony and visual-only conditions were also tested.
Results: Within both groups, just noticeable differences (JNDs) for the visual TOJ task were
significantly enhanced over baseline in the 100 ms click lag condition. Within the amblyopia
group, poorer stereo acuity was significantly correlated with greater enhancement in visual TOJ
performance in the 200 ms click lag condition.
Conclusions: Audiovisual temporal integration is intact in amblyopia, as indicated by a normal
temporal ventriloquism effect. Amblyopia with poorer binocularity is associated with a wider
temporal binding window for the effect. These findings suggest that previously reported deficits
in audiovisual multisensory processing result from impaired cross-modal matching rather than
diminished capacity for temporal audiovisual integration.
6.2 Introduction
Amblyopia, or “lazy eye”, is a visual disorder that arises from anomalous visual experience
during a sensitive period of brain development in early life. It is most commonly caused by
misalignment of the eyes (i.e., strabismus), inequality in refractive error between the eyes (i.e.,
anisometropia), or a combination of the two factors that cause non-correspondence of the retinal
images or visual blur (Birch, 2013). In rare cases, it may also be induced by congenital cataracts
that deprive one or both eyes of form vision. Although visual rehabilitation is possible in
childhood, failures in diagnosis and treatment result in amblyopia being the number one cause of
persistent monocular blindness in adulthood (Buch et al., 2001; Krueger & Ederer, 1984).
Clinically, amblyopia is often regarded as a monocular impairment in low-level visual functions
such as spatial acuity, contrast sensitivity, and binocular vision (American Academy of
Ophthalmology Pediatric Ophthalmology/Strabismus Panel, 2012). However, a considerable
body of research shows that the deficits are not limited to vision in the amblyopic eye (see
Meier & Giaschi, 2017, for review). The fellow eye also shows subtle deficits in spatial vision
(Bedell et al., 1985; Sireteanu et al., 2008), contrast sensitivity (Leguire et al., 1990; Wali et al.,
1991), and motion processing (Giaschi et al., 1992; Ho et al., 2005). Temporal aspects of visual
processing are also affected. Foveal vision in the amblyopic eye is less sensitive to asynchrony
between visual elements (Huang et al., 2012), extra-foveal regions have poorer temporal
resolution (Spang & Fahle, 2009), and the fellow eye in strabismic amblyopia shows impaired
perception of temporal order (St John, 1998). In agreement with these behavioural deficits in
temporal perception, visual evoked responses in humans show increased latency jitter and
decreased signal-to-noise ratios (Banko, Kortvelyes, Nemeth, et al., 2013; J. P. Kelly et al.,
2015). Physiological recordings in cats show reduced synchronization among neurons in the
striate cortex (Roelfsema et al., 1994) when driven by stimulation of the amblyopic eye. Beyond
vision, amblyopic abnormalities in multisensory processing and integration are also well
documented. People with unilateral amblyopia show reduced integration of incongruent auditory
and visual speech signals as demonstrated by the McGurk effect (Burgmeier et al., 2015;
Narinesingh et al., 2015; Narinesingh et al., 2014). They also show multisensory perceptual
binding over an abnormally wide range of signal onset asynchronies (SOAs) for simple visual
and auditory signals. For example, in audiovisual simultaneity judgment tasks, the temporal
window of perceived simultaneity is broadened on both the visual-lead and auditory-lead sides
for people with unilateral amblyopia (Chen et al., 2017; Richards et al., 2017b). In the sound-
induced flash illusion (Shams et al., 2000), a broader temporal window of auditory dominance
over vision when the sound precedes the flash is also evident (Narinesingh et al., 2017). These
multisensory perceptual abnormalities are observed under binocular viewing conditions, and so
cannot be fully explained by visual anomalies in the amblyopic eye alone.
It is important to distinguish multisensory integration from other multisensory processes.
Multisensory integration involves the fusion or combination of unisensory signals to produce a
new response that is significantly different from its component inputs (Stein et al., 2010).
Multisensory processing, on the other hand, is an umbrella term that encompasses multisensory
integration and non-integrative multisensory processes, such as cross-modal matching, that do
not produce a new response, but seek featural equivalencies in time, space, or identity between
unisensory inputs (Stein et al., 2010).
The fused illusory percepts in the McGurk effect and sound-induced flash illusion are examples
of multisensory integration because their perceptual products are qualitatively different from
their unisensory auditory or visual components (McGurk & MacDonald, 1976; Shams et al.,
2000). A defect in multisensory integration would therefore reduce or eliminate perception of the
fused phoneme or illusory flash. This is not the only explanation, however. The strength of
integration is also affected by the level of feature matching (e.g., temporal correspondence,
spatial correspondence, and phonetic identity) in the unisensory signal streams. For example,
degradation of phonetic identity in the auditory signal (Baart, Vroomen, et al., 2014) can reduce
the strength of the McGurk effect, and reduction of cross-modal temporal correspondence (i.e.
greater SOAs) can reduce the strength of both the McGurk effect (Munhall, Gribble, Sacco, &
Ward, 1996) and the sound-induced flash illusion (Shams et al., 2000). The factors influencing
the width of the audiovisual simultaneity window are similarly complex. In normal individuals, a
wider audiovisual simultaneity window correlates empirically with less susceptibility to the
McGurk effect (i.e. weaker integration), but greater susceptibility to the sound-induced flash
illusion (i.e. greater integration) (Stevenson, Zemtsov, et al., 2012). Therefore, the wider
audiovisual simultaneity window observed in amblyopia may represent reduced capacity for
multisensory integration, or integrative fusion over a wider range of SOAs, or both.
Alternatively, a wider audiovisual simultaneity window may represent entirely non-integrative
factors such as criterion shift toward simultaneity (Yarrow et al., 2011), or temporal uncertainty
in the unisensory streams that feed into the neural machinery of multisensory integration
(Richards et al., 2017b). Indeed, some empirical data suggest that the width of the audiovisual
simultaneity window may not be a function of multisensory integration, but rather of non-
integrative cross-modal matching of temporal features encoded within the unisensory streams
(Fujisaki & Nishida, 2005). Finally, because testing paradigms for the McGurk effect, sound-
induced flash illusion, and audiovisual simultaneity window typically involve a single interval
with a “target present” vs “target not present” response, they are more susceptible to inter-
individual response bias than traditional 2-alternative forced choice (2AFC) paradigms that have
a predictable noise floor.
The nature of multisensory processing abnormalities in amblyopia remains an unresolved
question. Two opposing mechanisms—a primary failure of integration or a primary deficiency in
unisensory information—can both lead to the same perceptual outcome for many of the
multisensory phenomena studied in amblyopia, as discussed above. To help resolve this
ambiguity, we have examined integration in another audiovisual phenomenon—the temporal
ventriloquism effect (Morein-Zamir et al., 2003). The temporal ventriloquism effect is an
example of audiovisual integration in which performance on a 2AFC visual temporal order
judgment (TOJ) task is improved by paired auditory events (Figure 6.1). Specifically, the ability
to detect the order of onset of two lights improves when spatially irrelevant clicks are presented
such that the second click lags the second light by 100 to 200 ms (Morein-Zamir et al., 2003). In
effect, the second click ‘pulls’ or ‘ventriloquises’ the perceived onset of the second light forward
in time, increasing the apparent interval between the two lights and making their temporal order
easier to judge.
Figure 6.1: Schematic of the apparatus and stimuli that induce the temporal ventriloquism
effect. The first click is simultaneous with the onset of the first light (arbitrarily shown as the top
light in this example), but the second click follows the onset of the second light. In normal
observers, discrimination performance on the visual temporal order judgment (TOJ) is enhanced
when the second click lags the second light by 100 to 200 ms.
Unlike the fused percepts of other audiovisual multisensory phenomena operating in the
temporal dimension (e.g., the McGurk effect, audiovisual simultaneity judgment and the sound-
induced flash illusion), the temporal ventriloquism effect results in perceptual enhancement that
cannot emerge from non-integrative cross-modal matching or perceptual blending. In the present
report, we use the temporal ventriloquism effect to determine whether the multisensory
processing abnormalities observed in amblyopia are related to a primary failure of audiovisual
temporal integration. If amblyopia involves a primary failure of audiovisual integration,
perceptual enhancement by temporal ventriloquism will not be observed (Figure 6.2).
Figure 6.2: The temporal ventriloquism effect with and without intact audiovisual
integration. In normal observers, the just noticeable difference (JND) in onset of the two lights
is reduced in the “Trailing 100” and “Trailing 200” conditions (i.e., when the second click trails
the second light by 100 or 200 ms) compared to the “Baseline” condition (i.e., when the two
clicks are synchronous with the onset of the two lights). If amblyopia involves a failure of
temporal integration, then this performance enhancement will not be observed (hypothesis
outlined in red). Adapted from Morein-Zamir et al. (2003).
6.3 Methods
6.3.1 Participants
Nine adults with unilateral amblyopia (8 female; mean age 28 years; age range 18–47 years)
participated in this study, and 9 visually typical adults (7 female; mean age 31 years; age range
22–46 years) served as controls. All participants passed a standard hearing test, and were
assessed by a certified orthoptist or an ophthalmologist to measure refractive correction, visual
acuity, stereoacuity, foveal suppression, ocular motility, and eye alignment as described in detail
elsewhere (Richards et al., 2017b). Amblyopia was defined as a visual acuity of 0.18 logMAR
(20/30) or worse in the affected eye, and an interocular difference of at least 0.2 logMAR (2 lines
on the standard ETDRS chart). Anisometropic amblyopia was defined as amblyopia with an
interocular difference of at least 1 diopter (D) in either spherical equivalent or astigmatic error.
Strabismic amblyopia was defined as amblyopia accompanied by any manifest deviation on the
cover test in the absence of significant anisometropia (defined above). Mixed-mechanism
amblyopia was defined as amblyopia accompanied by both anisometropia and strabismus of at
least 8 prism diopters. Participants were excluded if they had hyperopia greater than +5 D
spherical equivalent, myopia greater than -6 D spherical equivalent, or a history of any
neurological, neurodevelopmental, auditory, or visual disorders other than amblyopia, strabismus
and ametropia. Written informed consent was obtained from all participants. The study protocol
was approved by the Research Ethics Board at The Hospital for Sick Children, and followed the
tenets of the Declaration of Helsinki.
The clinical characteristics of the participants with amblyopia are summarized in Table 6.1.
6.3.2 Apparatus and Stimuli
The entire experiment was conducted in a carpeted acoustic chamber lined with 5 cm acoustic
wedge foam on the walls and ceiling. The audiovisual apparatus (shown in Figure 6.3) consisted
of two green light emitting diodes (LEDs) arranged 10 cm above and below a central speaker
(model CMS0361KLX, CUI Inc., Tualatin, OR, USA). A red LED positioned over the central
speaker served as a fixation target between trials. Auditory stimuli consisted of 2.5 ms square-
wave clicks presented at 62 dBA sound pressure level (SPL). Participants were seated with the
head stabilized in a chinrest at a standard viewing distance of 1.0 m, and used a wireless
gamepad (model F710, Logitech, Newark, CA, USA) to initiate each trial and enter responses.
Figure 6.3: The audiovisual apparatus. Two green LEDs were positioned 10 cm above and
below the point of fixation, indicated by a red LED. A central speaker was positioned
immediately behind the point of fixation. The apparatus was viewed from a distance of 1 m, such
that the visual angle between the fixation point and each green LED was 5.7°.
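The stated visual angle follows from the geometry of the apparatus. As a brief check, assuming the 10 cm offset is measured perpendicular to the line of sight at the 1 m viewing distance, the angle is the arctangent of offset over distance. The function name is illustrative.

```python
import math


def visual_angle_deg(offset_m, viewing_distance_m):
    """Angle subtended at the eye between fixation and an offset target."""
    return math.degrees(math.atan2(offset_m, viewing_distance_m))


# Each green LED sits 0.10 m from fixation, viewed from 1.0 m.
led_angle = visual_angle_deg(0.10, 1.0)
print(round(led_angle, 1))
```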
6.3.3 Design and Procedure
All trials were conducted in darkness with both eyes open. Each trial began with illumination of
the central fixation LED for 500 ms. After a random delay of 500–750 ms following offset of the
fixation LED, the two green LEDs were illuminated in sequence according to predetermined
light SOA conditions. Clicks accompanied the lights according to predetermined click timing
conditions. Participants were asked to make a visual temporal order judgment (TOJ) by
determining which light appeared last (“top” or “bottom”). Responses were unspeeded, and
participants were told that the sounds did not predict the order of the lights.
Twelve light SOA levels were tested: -144, -96, -72, -48, -36, -24, 24, 36, 48, 72, 96, and 144
ms, with negative values indicating that the bottom LED was illuminated first. Eight click timing
conditions were tested for each light SOA: one “AV sync” condition, three click “lead”
conditions, three click “lag” conditions, and one “visual-only” condition. In the AV sync
condition, clicks were synchronized with the onset of the two LEDs. In the three lead conditions,
the first click preceded, or led, the first light by 450, 200 or 100 ms, and the second click and
light were synchronous. In the three lag conditions, the first click and light were synchronous,
but the second click trailed, or lagged, the second light by 100, 200 or 450 ms. In the “visual-
only” condition, there were no accompanying clicks. Twenty practice trials preceded the start of
data collection. Twenty trials were run for each light SOA and click timing condition, yielding a
total of 1,920 experimental trials. All audiovisual conditions (i.e. AV sync, lead, and lag
conditions) were randomly interleaved and run in 4 blocks. The visual-only condition was run
separately in a single block.
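The factorial design described above can be sketched in a few lines of Python. This is an illustrative reconstruction only, not the software used to run the experiment: the condition labels, the random seed, the equal block sizes, and the shuffling of the visual-only block are assumptions made for the example.

```python
import itertools
import random

# Light SOA levels (ms) and click timing conditions, as listed in the text.
LIGHT_SOAS_MS = [-144, -96, -72, -48, -36, -24, 24, 36, 48, 72, 96, 144]
CLICK_CONDITIONS = [
    "lead_450", "lead_200", "lead_100", "av_sync",
    "lag_100", "lag_200", "lag_450", "visual_only",
]  # label names are hypothetical
TRIALS_PER_CELL = 20
N_AV_BLOCKS = 4


def build_trials(seed=0):
    """Return (audiovisual_blocks, visual_only_block) as lists of (soa, condition)."""
    rng = random.Random(seed)  # the seed is an arbitrary illustrative choice
    cells = itertools.product(LIGHT_SOAS_MS, CLICK_CONDITIONS)
    trials = [cell for cell in cells for _ in range(TRIALS_PER_CELL)]

    # Audiovisual conditions are randomly interleaved; visual-only is run separately.
    audiovisual = [t for t in trials if t[1] != "visual_only"]
    visual_only = [t for t in trials if t[1] == "visual_only"]
    rng.shuffle(audiovisual)
    rng.shuffle(visual_only)  # assumption: SOA order is also randomized here

    # Assumption: the four audiovisual blocks are of equal size.
    size = len(audiovisual) // N_AV_BLOCKS
    av_blocks = [audiovisual[i * size:(i + 1) * size] for i in range(N_AV_BLOCKS)]
    return av_blocks, visual_only


av_blocks, visual_only_block = build_trials()
total_trials = sum(len(block) for block in av_blocks) + len(visual_only_block)
```

Under these assumptions, the 12 light SOAs, 8 click timing conditions, and 20 repetitions give the 1,920 experimental trials stated above, split into four audiovisual blocks of 420 trials and one visual-only block of 240 trials.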
6.3.4 Data Analysis
The just noticeable difference (JND) and point of subjective simultaneity (PSS) for the visual
TOJ task were determined for each participant in each of the eight click timing conditions. The
JND quantifies the minimum SOA for which the temporal order of the two lights can be reliably
determined, and is a measure of visual temporal resolution. To calculate the JND and PSS, the
proportion of trials in which the top LED was seen first was computed for all light SOA levels. A
cumulative Gaussian curve was then fit to the psychometric data using a maximum likelihood
method. The JND and PSS values were derived from the fitted curve. The JND was defined as
the SOA at which the top LED was seen first 75% of the time, minus the SOA at which the top
light was seen first 25% of the time, divided by two. The PSS was defined as the SOA at which
the top and bottom lights were equally likely to be seen first. All curve fits and parameters were
computed in MATLAB version R2011b (Mathworks, Inc., Natick, MA, USA).
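The fitting procedure can be illustrated with a short Python sketch. It is not the MATLAB code used in the study: the function names, the starting values, and the simple pattern-search optimizer are assumptions chosen to keep the example self-contained. The sketch fits a cumulative Gaussian to the counts of "top first" responses by minimizing the binomial negative log-likelihood, then reads the JND and PSS off the fitted curve using the definitions given above.

```python
import math
from statistics import NormalDist


def neg_log_likelihood(mu, sigma, soas, n_top_first, n_trials):
    """Binomial negative log-likelihood of a cumulative Gaussian psychometric curve."""
    curve = NormalDist(mu, sigma)
    nll = 0.0
    for soa, k, n in zip(soas, n_top_first, n_trials):
        p = min(max(curve.cdf(soa), 1e-9), 1.0 - 1e-9)  # guard against log(0)
        nll -= k * math.log(p) + (n - k) * math.log(1.0 - p)
    return nll


def fit_cumulative_gaussian(soas, n_top_first, n_trials):
    """Maximum likelihood estimate of (mu, sigma) by a simple pattern search."""
    def cost(point):
        return neg_log_likelihood(point[0], point[1], soas, n_top_first, n_trials)

    current, step = (0.0, 100.0), 50.0  # illustrative starting point and step (ms)
    while step > 0.01:
        neighbours = [
            (current[0] + i * step, current[1] + j * step)
            for i in (-1, 0, 1)
            for j in (-1, 0, 1)
            if current[1] + j * step > 0  # sigma must stay positive
        ]
        best = min(neighbours, key=cost)
        if cost(best) < cost(current):
            current = best   # move to the better neighbour
        else:
            step /= 2        # no improvement: refine the search
    return current


def jnd_and_pss(mu, sigma):
    """JND = (SOA at 75% - SOA at 25%) / 2; PSS = SOA at 50%."""
    curve = NormalDist(mu, sigma)
    jnd = (curve.inv_cdf(0.75) - curve.inv_cdf(0.25)) / 2
    pss = curve.inv_cdf(0.50)
    return jnd, pss
```

For a cumulative Gaussian, these definitions reduce to a PSS equal to the fitted mean and a JND of approximately 0.6745 times the fitted standard deviation, so the JND is a direct rescaling of the slope parameter.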
All statistical analyses were computed in IBM SPSS Statistics, version 22 (Armonk, NY, USA).
Homogeneity of variance was established by Levene’s test for independent samples t-tests, and
by Mauchly’s test of sphericity for repeated-measures ANOVAs. Bonferroni adjustments were
applied to post hoc multiple comparisons where indicated in the results. Participants with
non-measurable stereo acuity (worse than 3000 seconds of arc) were assigned a supramaximal value
of 3001 for Spearman rank correlation. Statistical significance was defined as p < 0.05.
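The handling of non-measurable stereo acuity can be sketched as follows. This is a minimal illustration rather than the SPSS procedure used in the study; the helper names are hypothetical. Because a Spearman correlation depends only on ranks, any value above 3000 would serve equally well: the substitution simply places these participants in a tied worst rank.

```python
from statistics import mean

SUPRAMAXIMAL_ARC_SEC = 3001  # value assigned to non-measurable stereo acuity


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        tied_rank = (start + end) / 2 + 1
        for i in order[start:end + 1]:
            ranks[i] = tied_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Stereo acuity (arc sec) for participants A1-A9 as listed in Table 6.1;
# None marks "not measurable".
stereo_raw = [200, 200, 3000, None, None, 70, None, 70, 140]
stereo = [SUPRAMAXIMAL_ARC_SEC if s is None else s for s in stereo_raw]
stereo_ranks = average_ranks(stereo)
```

With the values in Table 6.1, the three participants with non-measurable stereo acuity share the highest rank, one step above the participant measured at 3000 seconds of arc.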
Table 6.1: Clinical characteristics of participants with amblyopia

| Participant | Age (sex) | Subtype | Visual acuity RE (logMAR) | Visual acuity LE (logMAR) | Refractive correction RE (diopters) | Refractive correction LE (diopters) | Alignment at 6 m (prism diopters) | Stereo acuity (arc sec) | Worth 4-dot response | Additional details |
|---|---|---|---|---|---|---|---|---|---|---|
| A1 | 27 (F) | Strab | 0.00 | 0.48 | -6.25 +1.00 x 45 | -5.50 +1.25 x 135 | LE esotropia 2, LE hypotropia 1 | 200 | Fused | Strab surgery at 9 years of age |
| A2 | 22 (F) | Aniso | 0.00 | 0.48 | -1.50 +0.50 x 80 | +1.00 +1.25 x 95 | LE esotropia 2 | 200 | Fused | |
| A3 | 22 (M) | Aniso | 1.10 | -0.10 | -6.00 +0.75 x 174 | -4.50 +0.50 x 75 | RE esotropia 2 | 3000 | Fused | |
| A4 | 23 (F) | Strab | 0.20 | 0.00 | +0.50 +0.50 x 28 | +1.25 +0.50 x 88 | LE esotropia 8, bilateral DVD | Not measurable | Diplopic | Infantile esotropia, 2 strab surgeries as child |
| A5 | 44 (F) | Mixed | 0.90 | 0.00 | +6.00 +1.25 x 75 | -0.75 | RE exotropia 35 | Not measurable | RE suppressed | |
| A6 | 37 (F) | Aniso | 0.18 | -0.10 | -3.25 +4.00 x 10 | -5.25 | RE esotropia 1 | 70 | Fused | |
| A7 | 46 (F) | Strab | -0.10 | 0.10 | +4.25 | +5.00 | LE esotropia 25, LE hypotropia 18 | Not measurable | LE suppressed | Esotropia onset at 6–8 months of age |
| A8 | 28 (M) | Aniso | 0.18 | -0.10 | +2.25 | +0.25 | Exophoria 2 | 70 | Fused | |
| A9 | 26 (F) | Aniso | -0.10 | 0.18 | +0.75 | +3.00 | LE esotropia 1 | 140 | Fused | |

Abbreviations: RE, right eye; LE, left eye; Aniso, anisometropic; Strab, strabismic; DVD, dissociated vertical deviation.
6.4 Results
Performance on the visual TOJ task is summarized for the amblyopia group and control group in
Table 6.2. Mean JND values did not differ significantly between the two groups for the baseline
synchronous click (i.e., AV sync) condition, the visual-only condition, or any of the
asynchronous click timing conditions. Similarly, mean PSS values did not differ significantly
between the two groups for any click timing condition. Furthermore, one-sample t-tests
comparing the PSS with the expected value of 0 showed no significant deviation of the PSS from
true simultaneity for any click timing condition in either group.
Table 6.2: Visual temporal order judgment performance in the control and amblyopia groups

| Click timing condition (ms) | JND (ms), mean ± SEM: Control | JND (ms), mean ± SEM: Amblyopia | t(16) | p | PSS (ms), mean ± SEM: Control | PSS (ms), mean ± SEM: Amblyopia | t(16) | p |
|---|---|---|---|---|---|---|---|---|
| Lead 450 | 60 ± 8 | 73 ± 9 | -1.038 | 0.32 | -1 ± 8 | -3 ± 2 | 0.197* | 0.85 |
| Lead 200 | 60 ± 8 | 70 ± 9 | -0.803 | 0.43 | -3 ± 8 | 2 ± 4 | -0.472 | 0.64 |
| Lead 100 | 66 ± 9 | 67 ± 6 | -0.102 | 0.92 | -7 ± 9 | -4 ± 5 | -0.355 | 0.73 |
| AV sync | 60 ± 6 | 64 ± 4 | -0.461 | 0.65 | -9 ± 8 | 0 ± 9 | -0.727 | 0.48 |
| Lag 100 | 45 ± 6 | 48 ± 3 | -0.526* | 0.61 | -6 ± 8 | -1 ± 7 | -0.375 | 0.71 |
| Lag 200 | 52 ± 4 | 49 ± 4 | 0.442 | 0.66 | -3 ± 6 | -7 ± 8 | 0.482 | 0.64 |
| Lag 450 | 64 ± 7 | 61 ± 6 | 0.390 | 0.70 | -3 ± 6 | -7 ± 8 | 0.393 | 0.70 |
| Visual-only | 55 ± 6 | 65 ± 5 | -1.338 | 0.20 | -5 ± 8 | -3 ± 10 | -0.137 | 0.89 |

Abbreviations: JND, just noticeable difference; PSS, point of subjective simultaneity; AV, audiovisual.
*Degrees of freedom adjusted for inequality of variances.
Performance on the visual TOJ task for unimodal (visual-only) and bimodal baseline (AV sync)
stimuli was compared using paired samples t-tests (Figure 6.4). There was no significant
difference in mean JNDs between the visual-only condition and AV sync condition for the
control group (t(8) = 1.752, p = 0.118) or the amblyopia group (t(8) = -0.330, p = 0.750). Pearson
correlations between JND and amblyopic eye visual acuity were not significant for the visual-
only condition (R = -0.120, p = 0.758) or the baseline AV sync condition (R = 0.089, p = 0.820).
Similarly, Spearman rank correlations between JND and stereo acuity were not significant for the
visual-only condition (Rs = -0.085, p = 0.827) or the baseline AV sync condition (Rs = 0.162, p =
0.676).
Figure 6.4: Visual temporal order judgment performance for visual-only stimuli and
audiovisual stimuli with synchronous clicks (AV sync). JNDs did not differ significantly by
stimulus modality. Error bars represent standard error of the mean.
Variation in performance on the visual TOJ task across click timing conditions is illustrated in
Figure 6.5. Mean JND values for each click timing condition were submitted to a one-way
repeated-measures ANOVA for each group. There was a significant effect of click timing
condition on JND for the control group (F(6, 48) = 3.920, p = 0.002) and the amblyopia group
(F(6, 48) = 4.407, p = 0.001). Post hoc comparisons showed that visual TOJ performance was
significantly enhanced over baseline only when the second click lagged the second light by 100
ms for the control group (p = 0.011, Bonferroni correction) and the amblyopia group (F(6, 48) =
4.407, p = 0.001). The magnitude of this enhancement over baseline was 16 ms (25%) in the
control group and 15 ms (25%) in the amblyopia group. These findings are consistent with the
temporal ventriloquism effect previously described in visually normal participants (Morein-
Zamir et al., 2003).
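The size of the enhancement can be approximately recomputed from the group means reported in Table 6.2. The sketch below is only an arithmetic check: it uses the rounded table values, so the absolute differences it returns may differ by about a millisecond from the magnitudes quoted in the text, which were presumably derived from unrounded data. The dictionary keys are hypothetical labels.

```python
# Group-mean JNDs (ms) for the baseline and 100 ms lag conditions,
# taken from Table 6.2 as reported (rounded to the nearest millisecond).
JND_MS = {
    "control": {"av_sync": 60, "lag_100": 45},
    "amblyopia": {"av_sync": 64, "lag_100": 48},
}


def enhancement(baseline_ms, condition_ms):
    """Return (improvement in ms, improvement as a percentage of baseline)."""
    gain = baseline_ms - condition_ms
    return gain, 100.0 * gain / baseline_ms


results = {
    group: enhancement(values["av_sync"], values["lag_100"])
    for group, values in JND_MS.items()
}
```

On the rounded means, both groups show an improvement of 25% relative to their own baseline, in line with the percentages reported above.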
Figure 6.5: The temporal ventriloquism effect in the control group and the amblyopia
group. JNDs for the control group are shown at the top in blue, and those for the amblyopia
group are shown at the bottom in red. The baseline AV sync condition is represented by a black
bar for both groups. Lead conditions are those in which the first click preceded the onset of the
first light, followed by a synchronous second click and light. Lag conditions are those in which
the first click and light were synchronous, but the second click trailed the second light. A
significant temporal ventriloquism effect was observed in both groups (*p < 0.05). Error bars
represent standard error of the mean.
To examine the relation between clinical measures of amblyopia and susceptibility to the
temporal ventriloquism effect, the JND improvement from the baseline AV sync condition was
computed for the three click lag conditions for each participant. Spearman correlations between
stereo acuity and the JND improvement from baseline (Figure 6.7) were not significant for the
100 ms click lag condition (Rs = 0.03, p = 0.93) or the 450 ms click lag condition (Rs = -0.51, p =
0.16). However, worse stereo acuity was significantly correlated with greater susceptibility to the
temporal ventriloquism effect for the intermediate 200 ms click lag condition (Rs = 0.74, p =
0.02). Pearson correlations between logMAR visual acuity in the amblyopic eye and the JND
improvement from baseline (Figure 6.6) showed a similar trend, though not statistically
significant. The relation between visual acuity and JND improvement from baseline was not
significant for the 100 ms click lag condition (R = -0.29, p = 0.45) or the 450 ms click lag
condition (R = -0.33, p = 0.39). The intermediate 200 ms click lag condition, however, showed a
trend toward greater susceptibility to the temporal ventriloquism effect in those with worse
visual acuity in the amblyopic eye (R = 0.64, p = 0.06).
Figure 6.6: Relation between susceptibility to the temporal ventriloquism effect and visual
acuity in the amblyopic eye across click timing conditions in which the second click lagged
the onset of the second light. The index of susceptibility to the temporal ventriloquism effect
was defined as the improvement in JND from each participant’s baseline performance in the AV
sync condition. The 200 ms click lag condition showed a positive relation between greater
susceptibility to the temporal ventriloquism effect and worse acuity in the amblyopic eye
(indicated by trend line), but this did not reach statistical significance. AE = amblyopic eye.
Figure 6.7: Relation between susceptibility to the temporal ventriloquism effect and stereo
acuity across click lag conditions in participants with amblyopia. The index of susceptibility
to the temporal ventriloquism effect was defined as the improvement in JND from each
participant’s baseline performance in the AV sync condition. The 200 ms click lag condition
showed a significant positive relation between greater susceptibility to the temporal
ventriloquism effect and worse stereo acuity (indicated by trend line).
6.5 Discussion
We characterized and compared the effect of paired sounds on performance in a visual TOJ task
for participants with unilateral amblyopia and visually normal controls under binocular viewing
conditions. Both the amblyopia and control groups showed a significant 25% improvement in
visual temporal precision, as measured by the JND, when the second click lagged the onset of the
second light by 100 ms, consistent with the temporal ventriloquism effect previously described
(Morein-Zamir et al., 2003). This finding suggests that the capacity for audiovisual integration in
the temporal dimension remains intact in amblyopia. By extension, it lends support to the
hypothesis that failed integration is not the primary source of multisensory processing
abnormalities observed in amblyopia. In the amblyopia group, we also found that poorer stereo
acuity was correlated with greater JND improvement from baseline (i.e., greater susceptibility to
the temporal ventriloquism effect) when the second click lagged the onset of the second light by
200 ms. A similar trend was observed between poorer visual acuity in the amblyopic eye and
greater susceptibility to the temporal ventriloquism effect at the 200 ms click lag condition, but
the relation did not reach statistical significance. These findings suggest that the width of the
temporal binding window for the effect is modulated by the severity of amblyopic visual deficits,
with an extended window observed in those with greater deficits in stereo acuity, and possibly
visual acuity.
A common factor that modulates the temporal ventriloquism effect and many other multisensory
phenomena (such as audiovisual simultaneity perception, the McGurk effect, and the sound-
induced flash illusion) is a dependency on cross-modal temporal correspondence and asynchrony
detection (Morein-Zamir et al., 2003; Munhall et al., 1996; Shams et al., 2000; Stevenson,
Zemtsov, et al., 2012). Previous work has shown that unilateral amblyopia is associated with
symmetric widening of the temporal window of audiovisual simultaneity perception (Chen et al.,
2017; Richards et al., 2017b) and reduced susceptibility to the McGurk effect under both
monocular and binocular viewing conditions (Burgmeier et al., 2015; Narinesingh et al., 2015;
Narinesingh et al., 2014). In addition, a study of the sound-induced flash illusion in amblyopia
suggested that the temporal binding window for the illusion is extended under binocular
conditions when the clicks lead the flash (Narinesingh et al., 2017). How does the observation of
a normal-like temporal ventriloquism effect with a possibly extended temporal binding window
fit in the context of these prior findings? Insight comes from the work by Stevenson, Zemtsov, et
al. (2012) who described the correlations of various indices of multisensory function in a sample
of visually normal adults. They found that the width of the audiovisual simultaneity window was
negatively correlated with susceptibility to the McGurk effect, but positively correlated with
susceptibility to the sound-induced flash illusion. They proposed that a narrower audiovisual
simultaneity window relates directly to a superior ability to dissociate, or resolve, asynchronous
unisensory components of an audiovisual stimulus pair. Because temporal correspondence is a
constraint on multisensory perceptual binding, any change in the sensitivity to audiovisual
asynchrony will necessarily alter the likelihood of audiovisual integration. In the case of the
McGurk effect, heightened sensitivity to asynchrony means that auditory and visual stimuli
perceived as synchronous are more unique, more likely to have arisen from a single event, and
therefore more strongly integrated in a fused percept. In the case of the sound-induced flash
illusion, diminished sensitivity to asynchrony means that the temporal constraints on integration
are looser. In turn, the asynchrony inherent in the sound-induced flash illusion stimulus poses
less of an impediment to integration, and therefore susceptibility to the illusory percept is
increased. In their study, Stevenson, Zemtsov, et al. (2012) did not explicitly test the width of the
temporal binding window for the sound-induced flash illusion, but based on their reasoning
(outlined above), one might expect perceptual binding and an illusory percept over a wider range
of audiovisual SOAs—that is, a wider temporal binding window—as was previously observed in
amblyopia (Narinesingh et al., 2017). Like the sound-induced flash illusion, the temporal
ventriloquism effect is also dependent upon perceptual binding of asynchronous auditory and
visual signals in the temporal dimension. Therefore, reduced sensitivity to audiovisual
asynchrony, as evidenced by a widened simultaneity window (Chen et al., 2017; Richards et al.,
2017b), would likely not diminish susceptibility to the temporal ventriloquism effect, but enable
integration over a wider range of SOAs. Indeed, a widened audiovisual temporal binding
window in amblyopia is suggested by the significant correlation between susceptibility to the
temporal ventriloquism effect and the extent of the stereo acuity deficit (Figure 6.7).
Taken together, the profile of multisensory processing abnormalities suggests that amblyopia
involves reduced temporal resolution in unisensory perception or in the mechanism for cross-
modal matching (i.e., non-integrative comparison of unisensory features) rather than a primary
deficit in audiovisual integration. Several sources of evidence point to an amblyopic deficit in
temporal resolution residing within vision rather than audition. Most obviously, amblyopia is a
primary disorder of the visual system, and its causative factors are ones that interfere with
normal visual experience. Behaviourally, deficits in temporal processing have been demonstrated
in the amblyopic and fellow eye (Huang et al., 2012; Spang & Fahle, 2009; St John, 1998).
Physiologically, cortical responses driven by stimulation of the amblyopic eye are less
synchronized (Roelfsema et al., 1994) and their temporal encoding is less reliable (Roelfsema et
al., 1994). Studies of visually normal people also demonstrate that audition is more temporally
precise than vision (Kanabus et al., 2002), and tends to be dominant in processing the temporal
dimension of audiovisual events (Aschersleben & Bertelson, 2003; Gebhard & Mowbray, 1959;
Shams et al., 2000; Shipley, 1964). Given the normal dominance of audition in temporal
audiovisual processing, any amblyopic deficit in auditory temporal resolution would likely have
diminished the magnitude of the temporal ventriloquism effect and the sound-induced flash
illusion, yet no such diminution was observed in the present study or previously (Narinesingh et
al., 2017). Finally, the width of the audiovisual simultaneity window and the width of the
temporal binding window for temporal ventriloquism vary with the extent of amblyopic deficits
in stereo acuity and visual acuity, respectively (Richards et al., 2017b). While correlation does
not equal causation, the relation is compelling.
Reduced temporal resolution in amblyopic vision may arise from noisy encoding of the visual
signal. Indeed, increased temporal jitter (Banko, Kortvelyes, Nemeth, et al., 2013; J. P. Kelly et
al., 2015) and interocular transmission latency differences (Sokol, 1983; Watts et al., 2002)
would necessarily reduce the temporal certainty for any visual event. Consequently, the
probability distribution for true visual event timing would be widened and likely skewed toward
later onset. Such a developmental mis-calibration may provide an explanation for the weaker
temporal constraints (i.e., wider temporal window) for audiovisual perceptual binding
documented in the present study and in prior work on amblyopic multisensory perception (Chen
et al., 2017; Narinesingh et al., 2017; Richards et al., 2017b).
Curiously, despite previous findings of reduced visual temporal resolution in amblyopia (Huang
et al., 2012; Spang & Fahle, 2009; St John, 1998), no such impairment was found in the present
study. Indeed, the visual-only, baseline audiovisual (AV sync), and asynchronous audiovisual
(click lead and click lag) conditions did not differ significantly between groups. There are
several possible explanations for this lack of effect, outlined below.
One possibility is that the visual temporal processing deficit is not uniformly distributed across
visual space. Huang et al. (2012) described a temporal processing deficit that was present
foveally (within 1.25° of fixation) but absent peripherally (at an eccentricity of 5°) in the
amblyopic eye, raising the possibility that the peripheral stimulus in the present study was
beyond the region of visual temporal resolution impairment. Against this, however, a temporal
processing deficit has been shown to involve regions of the amblyopic visual field well beyond
the eccentricity tested in the present study (Spang & Fahle, 2009).
Another possibility is that normal visual temporal information available from the fellow eye
overruled the deficient temporal signal from the amblyopic eye on binocular viewing. Visual
temporal resolution is indeed impaired in the fellow eye of people with strabismic amblyopia,
but only when the visual TOJ involves visual stimuli presented on opposite sides of the vertical
midline and thus requiring interhemispheric communication via the corpus callosum (St John,
1998). The visual stimuli in our study were presented on the vertical midline, however, meaning
that the visual TOJ may not have involved interhemispheric communication, and therefore may
not have induced the visual TOJ deficit previously described (St John, 1998).
A third possibility is that the temporal resolution deficit relevant to abnormal audiovisual
processing in amblyopia does not lie within unisensory visual perception, but at the interface
between auditory and visual temporal perception—at the level of cross-modal matching of
temporal features. This view is supported by prior work showing that sensitivity to audiovisual
asynchrony detection is equally impaired whether viewing with the amblyopic eye, the fellow
eye, or with both eyes together (Chen et al., 2017; Richards et al., 2017b), and by evidence from
visually normal humans indicating that detection of audiovisual asynchrony is based on
matching, rather than integration, of temporal features encoded within the unisensory streams
(Fujisaki & Nishida, 2005). The neural mechanism for cross-modal matching may have been
mis-calibrated in early life under the influence of increased temporal jitter (Banko, Kortvelyes,
Nemeth, et al., 2013; Roelfsema et al., 1994) and interocular latency differences (Sokol, 1983;
Watts et al., 2002). If audiovisual cross-modal matching is calibrated prior to the end of the
sensitive period for amblyopic visual recovery (Lewis & Maurer, 2005), then therapy that
improves vision, equalizes evoked response latencies (Arden & Barnard, 1979; Barnard &
Arden, 1979) and reduces internal temporal noise (Birch et al., 2016) may not narrow the
audiovisual temporal binding window. Asynchronous sensitive periods for unisensory and
multisensory functions may therefore explain why the temporal resolution of cross-modal
matching may be impaired despite normal visual TOJ performance.
Chapter 7 General Discussion and Conclusions
7.1 Summary of Findings and Evaluation of Specific Hypotheses
This thesis examined audiovisual processing and integration in adults with unilateral amblyopia.
It did so by measuring their performance on tasks of audiovisual spatial and temporal perception.
In the spatial dimension, the precision of auditory and visual localization, and the precision and
bias of audiovisual localization, were measured and compared to the performance of normally
sighted controls and to an ideal observer based on the maximum likelihood estimation (MLE)
model of optimal integration (Study I and II). In the temporal dimension, sensitivity to
audiovisual asynchrony (Study III) and perceptual enhancement by the temporal ventriloquism
effect (Study IV) were measured and compared to the performance of normally sighted controls.
Overall, the findings indicated that amblyopia involves spatial processing deficits in visual and
auditory localization, and temporal processing deficits in audiovisual asynchrony perception.
Despite these deficits in unisensory and non-integrative processing, however, individuals with
unilateral amblyopia exhibited optimal audiovisual spatial integration and intact audiovisual
temporal integration.
7.1.1 Audiovisual Spatial Perception
Studies I and II examined spatial localization of unisensory and multisensory audiovisual stimuli
in participants with amblyopia and visually normal controls.
7.1.1.1 Study I
Study I found that under binocular viewing conditions, participants with amblyopia localized
unimodal visual and auditory, as well as spatially congruent bimodal audiovisual stimuli, less
precisely than controls. Despite these pervasive deficits in spatial localization precision,
participants with amblyopia demonstrated optimal integration of visual and auditory spatial cues
according to the MLE model of multisensory integration (Alais & Burr, 2004; Ernst & Banks,
2002). This optimal strategy of audiovisual combination was evident not only in the spatial bias
of the fused percept in conditions of audiovisual spatial conflict (i.e., the spatial ventriloquism
paradigm), but also in the improvement in localization precision for bimodal stimuli (compared
to that for unimodal stimuli) in conditions of audiovisual spatial congruency.
The results of Study I partially support and are partially inconsistent with the specific hypotheses
(see section 2.2.1.1). In support of hypothesis (1), the observed localization precision for visual
and audiovisual stimuli was reduced compared to visually normal controls. In surprising
opposition, however, localization precision for auditory stimuli was also reduced. To the author’s
knowledge, reduced sound localization precision has not been reported in unilateral amblyopia,
or any other unilateral visual impairment. Hypothesis (2), which stated that participants with
amblyopia will weight audition more heavily than vision compared to visually normal controls,
was also rejected on the basis of empirical findings. The hypothesized sensory reweighting was
likely not observed because spatial precision for visual and auditory localization were both
reduced. Importantly, hypothesis (3), stating that audiovisual spatial integration will obey the
MLE model of optimal combination, was supported. The predictions of the MLE model were
borne out in two ways: in terms of the perceptual weight of the unisensory components in the
bimodal localization estimate, and critically, in terms of enhanced precision of the bimodal
localization estimate. Such precision enhancement is a hallmark of statistically optimal
multisensory integration.
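The two predictions of the MLE model mentioned above can be made concrete with a short sketch of the standard reliability-weighted combination rule (Alais & Burr, 2004; Ernst & Banks, 2002). The function name and the numerical values in the example are hypothetical and are not data from Study I; the sketch only shows how the unisensory precisions determine both the perceptual weights and the predicted bimodal precision.

```python
def mle_combine(sigma_v, sigma_a, loc_v, loc_a):
    """Reliability-weighted combination of visual and auditory location estimates.

    sigma_v, sigma_a: standard deviations of the unisensory estimates.
    loc_v, loc_a: locations signalled by vision and audition.
    Returns (visual weight, auditory weight, combined location, combined sigma).
    """
    rel_v = 1.0 / sigma_v ** 2  # reliability is the inverse variance
    rel_a = 1.0 / sigma_a ** 2
    w_v = rel_v / (rel_v + rel_a)
    w_a = 1.0 - w_v
    loc_av = w_v * loc_v + w_a * loc_a
    sigma_av = (1.0 / (rel_v + rel_a)) ** 0.5
    return w_v, w_a, loc_av, sigma_av


# Hypothetical example: vision twice as precise as audition, 5 degree conflict.
w_v, w_a, loc_av, sigma_av = mle_combine(sigma_v=2.0, sigma_a=4.0, loc_v=0.0, loc_a=5.0)
```

In this hypothetical case the combined estimate lies closer to the more reliable visual cue, and the predicted bimodal standard deviation is smaller than either unisensory value. These are the weighting and precision-enhancement signatures against which observed bimodal performance can be compared.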
7.1.1.2 Study II
In follow-up to Study I, Study II confirmed that amblyopia is associated with reduced relative
sound localization precision, as measured by the minimum audible angle (MAA) in the
horizontal plane. Study II also examined absolute sound localization in the horizontal plane, and
found that people with amblyopia localize sounds less accurately in the spatial hemifield
ipsilateral to their amblyopic eye. This asymmetry in sound localization accuracy correlated
significantly with the participants’ clinical deficits in stereo acuity and visual acuity.
The specific hypotheses for Study II (see section 2.2.1.2) were formulated in response to the
unexpected findings of Study I. Hypothesis (1), which stated that participants with amblyopia
will have a wider MAA than visually normal controls, was unequivocally supported. Hypothesis
(2), which predicted that participants with amblyopia will localize sounds less accurately than
visually normal controls, was supported, but with qualification: the impairment was not uniform
across the horizontal plane, but restricted to the spatial hemifield ipsilateral to the amblyopic eye.
7.1.2 Audiovisual Temporal Perception
Companion works to Studies I and II in the spatial domain, Studies III and IV examined
audiovisual multisensory interactions in the temporal domain.
7.1.2.1 Study III
Study III examined audiovisual simultaneity perception, and found that amblyopia is associated
with a greater likelihood of perceiving asynchronous auditory and visual signals as simultaneous
over a wider range of signal onset asynchronies (SOAs) compared to visually normal controls.
When participants with amblyopia were analyzed as a homogeneous group, the audiovisual
simultaneity window was widened by more than 35% for both auditory-lead and visual-lead
SOAs, and did not vary between monocular (i.e., amblyopic eye and fellow eye) and binocular
viewing conditions. When the participants were subdivided according to their clinical
characteristics, however, a pattern emerged: amblyopia with good binocularity was associated
with widening of the audiovisual simultaneity window, and the overlay of poor binocularity was
associated with a concomitant shift in the point of subjective simultaneity toward visual-lead
SOAs.
The specific hypotheses for Study III (see section 2.2.2.1), which predicted (1) a symmetrically
widened audiovisual simultaneity window, and (2) non-dependence on viewing condition, were
supported by the empirical findings.
7.1.2.2 Study IV
Study IV measured the just noticeable difference (JND) for a visual temporal order judgment
(TOJ) task, and investigated perceptual enhancement by paired auditory clicks in a temporal
ventriloquism paradigm (Morein-Zamir et al., 2003). Performance did not differ between the
amblyopia and control groups for the unisensory visual condition or the baseline multisensory
condition in which the onset of each light was paired with a synchronous auditory click. In both
groups, the JND for the visual TOJ was enhanced by 25% over baseline when the second click
lagged the onset of the second light by 100 ms. Within the amblyopia group, poorer stereo acuity
was significantly correlated with a greater JND enhancement when the second click lagged the
onset of the second light by 200 ms. The results indicated that amblyopia is associated with
intact audiovisual integration, and a wider temporal binding window for the temporal
ventriloquism effect.
The findings of Study IV generally supported the specific hypotheses stated in section 2.2.2.2.
Hypothesis (1), which predicted enhanced visual TOJ performance according to the temporal
ventriloquism effect, was unequivocally supported. Hypothesis (2), which predicted a widened
temporal binding window for the temporal ventriloquism effect in amblyopia, was supported not
by direct comparison of JNDs between the two groups, but by the significant positive correlation
between the degree of stereo acuity deficit and JND enhancement among participants with
amblyopia.
7.2 Is Audiovisual Integration Impaired in Amblyopia?
At first glance, the evidence for impaired audiovisual integration in amblyopia appears to be
conflicting. Several studies have established that unilateral amblyopia is associated with reduced
sensitivity to the McGurk effect (Burgmeier et al., 2015; Narinesingh et al., 2015; Narinesingh et
al., 2014). These investigators advanced two main hypotheses to explain this multisensory
abnormality: that amblyopia is associated with (1) a failure of audiovisual integration, or (2)
reduced reliability of the visual signal which, in turn, induces sensory reweighting to favour
audition in the combined multisensory percept. Other studies of adults with a history of early
bilateral deprivation from congenital cataracts described similar reductions in susceptibility to
the McGurk effect, as well as reduced interactions between simple auditory and visual stimuli in
the temporal dimension (Putzar et al., 2007; Putzar, Hötting, et al., 2010). Again, the findings
suggested a lack of multisensory integration caused by early anomalous visual experience
(Putzar et al., 2007). In contrast, a study of the sound-induced flash illusion suggested that
audiovisual integration is intact in adults with unilateral amblyopia (Narinesingh et al., 2017).
Rather than an integration deficit, participants with amblyopia showed a wider window of
temporal integration for the sound-induced flash illusion compared to visually normal
participants (Narinesingh et al., 2017). This thesis (Study I and Study IV) also presented strong
evidence that audiovisual integration remains intact in amblyopia. In the spatial domain, not only
did audiovisual combination produce a multisensory percept that was more precise than either of
the unisensory component percepts, but the strategy of multisensory combination was also
optimal according to the MLE model of the spatial ventriloquism effect (Study I). In the
temporal domain, not only was the strength of the temporal ventriloquism effect similar between
the amblyopia and control groups, but the temporal window of integration was in fact wider in
participants with worse binocularity (Study IV).
How can these conflicting pieces of evidence surrounding impaired audiovisual integration in
amblyopia be reconciled? Several possibilities exist.
7.2.1 Possible Mechanisms for the Pattern of Audiovisual Integration Abnormalities in Amblyopia
7.2.1.1 Differential Impact on Anatomic Sites of Audiovisual Integration
One possibility is that separate integrative mechanisms exist for the various audiovisual tasks,
and that these mechanisms are differentially affected by amblyopia. That separate neural circuits
are responsible for processing different aspects of multisensory integration is supported by the
observation that distinct anatomic sites are preferentially activated by different multisensory
tasks (reviewed in section 1.3.3). For example, the STS is preferentially activated during
integration of audiovisual speech stimuli (Callan et al., 2001; Calvert et al., 2000; Nath &
Beauchamp, 2012; Raij et al., 2000), the IPS is preferentially activated in audiovisual tasks
involving spatial localization and spatial attention (Bushara et al., 1999; Lewis et al., 2000), and
the cortex of the insula is preferentially activated by audiovisual perceptual binding on the basis
of temporal correspondence (Bushara et al., 2001; Calvert et al., 2001). If amblyopia
disproportionately alters processing in the ventral STS and relatively spares the dorsal IPS and
cortex of the insula, then one might predict that integration of audiovisual speech signals in the
McGurk effect would be more impaired than integration of simple audiovisual stimuli in the
spatial and temporal ventriloquism effects. Indeed, this is the general pattern of multisensory
integration abnormalities observed in behavioural studies of amblyopia.
Physiological and neuroimaging evidence also offers some support to the hypothesis that the
circuits for audiovisual integration in the ventral pathway are disproportionately affected in
amblyopia (Milner & Goodale, 2008). In cats with strabismic amblyopia, single-unit responses to
visual stimuli are more abnormal in the ventral pathway compared to the dorsal pathway
(Schröder, Fries, Roelfsema, Singer, & Engel, 2002). Similarly, in humans with strabismic and
anisometropic amblyopia, fMRI data suggest that transmission failure from lower to higher
visual areas affects the ventral pathway more consistently than the dorsal pathway (Muckli et al.,
2006). These findings are not conclusive, however, as amblyopic abnormalities in the dorsal
pathway and its associated functions are well-documented in behavioural (Hess, Demanins, &
Bex, 1997; Ho et al., 2005; Mansouri & Hess, 2006; Mirabella et al., 2011; Simmers et al., 2003;
Simmers, Ledgeway, Mansouri, Hutchinson, & Hess, 2006), neuroimaging (Barnes et al., 2001;
X. Li et al., 2011; Secen et al., 2011) and physiological studies (Schröder et al., 2002).
If the pattern of audiovisual perceptual abnormalities in amblyopia stems from differential
effects on the capacity for audiovisual integration among anatomically distinct areas, then why
should one anatomic area be affected more than another? Specifically, why should the putative
circuits for audiovisual integration in the ventral pathway (e.g., audiovisual speech integration in
the STS) be more affected than those residing elsewhere (e.g., audiovisual spatial integration in
the IPS)? A possible answer lies in the known differences in the extent to which the central and
peripheral visual fields are represented in the two streams (reviewed by Milner and Goodale
(2008)). In the striate cortex (V1), the central visual field is topographically over-represented,
with more neural tissue devoted to processing of central compared to peripheral visual stimuli
(Tootell, Switkes, Silverman, & Hamilton, 1988; Van Essen, Newsome, & Maunsell, 1984).
While this emphasis on the central visual field persists in the ventral pathway, the peripheral
visual field is relatively emphasized in the dorsal pathway (Brown, Halpert, & Goodale, 2005;
Van Essen & Deyoe, 1995). Indeed, some dorsal visual areas, such as the parieto-occipital area,
show almost no cortical magnification at all (Colby, Gattass, Olson, & Gross, 1988).
Importantly, the spatiotemporal visual deficits in amblyopia also show differential effects on the
central and peripheral visual fields (reviewed in section 1.1.3): in strabismic amblyopia, contrast
sensitivity is relatively more affected in the central visual field (Hess & Pointer, 1985), and in
both anisometropic and strabismic amblyopia, increased latency on multifocal VEP is more
pronounced in the central visual field (Yu et al., 1998; Zhang & Zhao, 2005). The ventral
pathway and its associated circuits for audiovisual speech integration may therefore be
particularly affected by amblyopia, as both its visual input and the amblyopic deficit
predominantly involve the central visual field. By the same reasoning, the dorsal pathway and its
associated audiovisual integrative functions may be relatively spared.
Several findings challenge this hypothesis, however. Activation of the STS is observed during
illusory perception for both the sound-induced flash illusion (Watkins et al., 2006) and the
McGurk effect (Callan et al., 2001; Calvert et al., 2000; Raij et al., 2000), yet people with
amblyopia show reduced integration only for the McGurk effect (Burgmeier et al., 2015;
Narinesingh et al., 2017; Narinesingh et al., 2014). Not only do people with amblyopia remain
susceptible to the sound-induced flash illusion, they show a widened temporal binding window
for the effect (Narinesingh et al., 2017). If the differences in visual field representation in the
dorsal and ventral streams accounted for the pattern of audiovisual integration abnormalities in
amblyopia, then one would predict that susceptibility to both illusions involving the STS—the
McGurk effect and the sound-induced flash illusion—would be diminished. Furthermore, if the
amblyopic abnormalities in perception of the McGurk effect and sound-induced flash illusion are
related to alterations in shared multisensory neural circuits, then susceptibility to the illusions
would likely co-vary. Contrary to this prediction, however, susceptibility to the McGurk effect is
negatively correlated with susceptibility to the sound-induced flash illusion (Stevenson,
Zemtsov, et al., 2012).
7.2.1.2 Differential Influences of Attention on Audiovisual Integrative Processes
Another possible mechanism for the pattern of audiovisual integration abnormalities observed in
amblyopia is attention. Specifically, the pattern may arise from the interaction between a visual
attention deficit in amblyopia and the differential effect of attention on multisensory processes
in normal adults (reviewed in section 1.3.2).
The hypothesis that amblyopia involves a visual attention deficit stems from studies suggesting
that the crowding phenomenon in normal peripheral vision is not due to limits in spatial
resolution, but rather to the resolving power of visual attention (He, Cavanagh, & Intriligator,
1996; Intriligator & Cavanagh, 2001). The increased crowding effect in amblyopia may therefore
reflect a deficit in visual attention. Multiple subsequent studies of the crowding effect have found
evidence of deficient selective visual attention in observers with strabismic amblyopia while
viewing with the amblyopic eye (Hariharan, Levi, & Klein, 2005; Levi, Hariharan, & Klein,
2002; McKee et al., 2003; Sharma et al., 2000; Tripathy & Cavanagh, 2002), and a study of
spatial tracking in amblyopia suggested the visual attention deficit extends to the fellow eye of
both strabismic and anisometropic subtypes (Ho et al., 2006). More recently, a functional
neuroimaging study showed that a brief period of visual deprivation from bilateral congenital
cataracts alters the balance between visual and auditory attention, favouring audition (de Heering
et al., 2016).
The results of Study I and Study IV presented in this thesis provide strong evidence that the
capacities for spatial and temporal integration of simple audiovisual stimuli (i.e., clicks and
flashes) remain intact in amblyopia. In contrast, previous studies of the McGurk effect in
amblyopia suggested that integration of audiovisual speech signals is impaired (Burgmeier et al.,
2015; Narinesingh et al., 2015; Narinesingh et al., 2014; Putzar, Hötting, et al., 2010).
Importantly, the magnitude of the modulating influence of attention on audiovisual integration
varies according to the perceptual task. The spatial ventriloquism effect has proven insensitive to
the effects of top-down directed attention and bottom-up automatic attention (Bertelson et al.,
2000; Vroomen et al., 2001). Morein-Zamir et al. (2003) have shown that the temporal
ventriloquism effect cannot be accounted for by attentional alerting or distraction by cross-modal
interference. Similarly, a study of the sound-induced flash illusion suggests that it is not a
function of visual attentional enhancement by sound (Shams et al., 2002). In contrast, the
strength of the McGurk effect is significantly modulated by attention: susceptibility is reduced
under conditions of increased attentional load and attentional diversion to irrelevant
somatosensory stimuli (Alsius et al., 2005; Alsius et al., 2007). Therefore, an amblyopic deficit
in visual attention may hypothetically account for the observed reduction in susceptibility to the
McGurk effect and preserved integration in the spatial and temporal ventriloquism effects, and
the sound-induced flash illusion. However, an attentional explanation for the effect of amblyopia
on audiovisual integration does not offer a clear explanation for the widened windows of
temporal binding observed in the temporal ventriloquism effect (discussed in Study IV and
section 6.5) and sound-induced flash illusion (Narinesingh et al., 2017).
7.2.1.3 Differential Sensitive Periods for Audiovisual Integrative Processes
The preponderance of evidence from developmentally typical humans points to multisensory
integration being a late-emerging function in the course of sensory development (reviewed in
section 1.3.6). Multisensory facilitation of reaction times emerges at about 8 years of age, and
matures over a period of 2 to 3 years (Barutchu et al., 2009; Barutchu et al., 2010). Statistically
optimal integration also emerges late: after 8 years of age for visual and proprioceptive
navigational cues (Nardini et al., 2008), between 8 and 10 years of age for visual and haptic
object size cues (Gori et al., 2008), and after 12 years of age for visual and auditory spatial
bisection cues (Gori, Sandini, et al., 2012). An apparent exception to this pattern, however, is
integration of audiovisual speech cues. Indeed, McGurk stimuli elicit behavioural and
electrophysiological responses suggestive of audiovisual integration in infants as young as 4
months of age (Bristow et al., 2009; Burnham & Dodd, 2004; Desjardins & Werker, 2004;
Rosenblum et al., 1997). If these facets of multisensory integration are susceptible to
maldevelopment, or damage, from anomalous sensory input, then their distinct ages of
emergence imply the presence of distinct sensitive periods as well. Indeed, multiple
asynchronous sensitive periods are well-described for different facets of unisensory visual
development (reviewed in section 1.1.5 and in Lewis and Maurer (2005)). By extension, if the
sensitive period for integration of audiovisual speech signals is considerably earlier than those
for simple audiovisual spatial and temporal signals (as tested by the spatial and temporal
ventriloquism effects in Study I and Study IV), then the pattern of audiovisual integrative
capacities observed in amblyopia may be explained. That is, amblyopia or abnormal visual
experience in early life may affect audiovisual speech integration, but not other integrative
functions, because their respective sensitive periods for damage are asynchronous. This
hypothesis is supported by data suggesting that normal susceptibility to the McGurk effect is
observed in children whose amblyopia either resolved before 5 years of age, or onset after 5
years of age (Burgmeier et al., 2015).
The concept of sensitive periods may also be relevant to amblyopic abnormalities in processes
such as audiovisual simultaneity perception (Study III) that are multisensory but not clearly the
consequence of integration (Chen et al., 2017; Fujisaki & Nishida, 2005). Chen et al. (2016)
showed that the audiovisual simultaneity window narrows on both the auditory-lead and visual-
lead sides throughout childhood, reaching its adult shape by 9 years of age—long after the
typical age of onset for amblyopia (Birch, 2013). Interestingly, 9 years is also the approximate
age at which many aspects of multisensory integration first emerge (reviewed above and in
section 1.3.6). This timeline of multisensory development raises the possibility that mature non-
integrative multisensory processes, such as cross-modal matching based on temporal and spatial
correspondence, may be a pre-condition for the emergence of statistically optimal integration.
Indeed, Ernst (2008) noted that establishing correspondence between multisensory signals (i.e.,
determining which signals belong together) is an essential task, without which integration cannot
occur. In this way, amblyopia may exert an influence on multisensory integration through its
effect on the maturation of non-integrative multisensory processes.
7.2.1.4 Multi-stage Audiovisual Processing
If the influence of amblyopia on multisensory integration is secondary to its deleterious effect on
non-integrative multisensory processes such as audiovisual asynchrony detection (Study III),
why is susceptibility to the McGurk effect reduced (Burgmeier et al., 2015; Narinesingh et al.,
2015; Narinesingh et al., 2014; Putzar, Hötting, et al., 2010), while susceptibility to temporal
ventriloquism (Study IV) and the sound-induced flash illusion is normal or even increased
(Narinesingh et al., 2017)? A possible explanation is suggested by converging lines of evidence
for a multi-stage mechanism for multisensory processing specific to audiovisual speech
perception (reviewed in section 1.3.8.4).
Using perceptually ambiguous sine wave replicas of natural speech, Tuomainen et al. (2005)
showed that audiovisual integration in a McGurk paradigm depends on whether the listener
believes the auditory stimuli are speech or non-speech signals. If the listener was unaware that
the auditory stimuli were speech, negligible integration was observed. If the listener learned to
perceive the same auditory stimuli as speech, however, significant integration occurred (as for
natural speech). These results point to the existence of a speech-specific mode of multisensory
perception that depends on access to phonetic representations of auditory stimuli. An fMRI study
of a similar paradigm examined brain activation by visual speech paired with auditory speech
and sine wave replicas in participants trained to perceive the sine wave auditory signal as
intelligible speech or as non-speech sounds (Lee & Noppeney, 2011b). The results revealed a
posterior-to-anterior multisensory processing gradient along the STS and superior temporal gyrus
in the ventral stream (Figure 1.4). Although fMRI lacks the temporal resolution to determine the
activation sequence, this finding suggests that as audiovisual signals advance along this pathway,
they are integrated on the basis of increasingly selective and complex features. An
electrophysiological study by Baart, Stekelenburg, et al. (2014) employed a similar pairing of
visual speech and sine wave speech to examine the temporal characteristics of audiovisual
speech integration in the cerebral cortex. They found corroborating evidence for a speech-
specific mode of multisensory perception, and reported that audiovisual integration of
spatiotemporal features precedes integration of linguistic features. The authors proposed a
sequential, rather than parallel, time course of audiovisual speech integration in which
integration of spatiotemporal properties occurs first (from 50 to 100 ms) followed by integration
of phonetic properties (from 100 to 200 ms). If the output of the first (spatiotemporal) stage of
audiovisual speech integration influences or constrains integration in the second (phonetic) stage,
then the output of the first stage may be conceptually analogous to the unity assumption (Welch
& Warren, 1980), or a Bayesian prior (Magnotti & Beauchamp, 2017), that determines the
subsequent strength of integration of the audiovisual phonetic information. In other words, the
strength of phonetic integration in the second stage may be dependent upon the certainty of
common causality in the first stage. Indeed, evidence for such a relation between simple featural
binding and phonetic integration was reported in a study by Stevenson, Zemtsov, et al. (2012).
The authors measured the performance on a variety of audiovisual tasks in a sample of
developmentally normal adults, and found that those with a wider temporal window of perceived
audiovisual simultaneity generally showed lower susceptibility to the McGurk effect, but higher
susceptibility to the sound-induced flash illusion. They hypothesized that a wider simultaneity
window relates to a poorer ability to dissociate asynchronous events, leading to a reduction in the
uniqueness of perceived synchronous events, and consequently, reduced phonetic integration as
indexed by the McGurk effect. Indeed, if perceived synchrony is less unique, then its usefulness
in determining whether a given audiovisual pair arose from a single event (i.e., common
causality) is reduced. Similarly, the widened audiovisual simultaneity window observed in
amblyopia (Study III and Chen et al. (2017)) may reflect poorer ability to dissociate
asynchronous signals. This may lead to a less reliable determination of common causality in the
first stage of audiovisual speech integration, which propagates forward in the pathway to reduce
the strength of phonetic integration in the second stage. In this way, reduced susceptibility to the
McGurk effect in amblyopia may not reflect a failure of integration, but may indeed represent a
statistically optimal strategy of audiovisual speech integration.
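The reasoning above can be illustrated with a toy causal-inference calculation. The sketch below is not a model fitted in this thesis or in the cited studies. It assumes, purely for illustration, that perceived audiovisual asynchrony under a common cause is Gaussian with a standard deviation (sigma_ms) standing in for the width of the simultaneity window, and that asynchrony under separate causes is uniform over a fixed span. The function name p_common, the prior of 0.5, and all numerical values are hypothetical.

```python
import math

def p_common(dt_ms, sigma_ms, prior=0.5, span_ms=1000.0):
    """Posterior probability that a sound and a visual event share one cause.

    Illustrative assumptions only:
    - common cause: perceived asynchrony ~ Gaussian(0, sigma_ms)
    - separate causes: perceived asynchrony ~ Uniform over span_ms
    """
    like_common = math.exp(-0.5 * (dt_ms / sigma_ms) ** 2) / (
        sigma_ms * math.sqrt(2.0 * math.pi)
    )
    like_separate = 1.0 / span_ms
    numerator = like_common * prior
    return numerator / (numerator + like_separate * (1.0 - prior))

# Hypothetical narrow and wide simultaneity windows (values are not thesis data)
narrow_sync = p_common(0.0, sigma_ms=80.0)     # physically synchronous pair
wide_sync = p_common(0.0, sigma_ms=160.0)
narrow_async = p_common(300.0, sigma_ms=80.0)  # clearly asynchronous pair
wide_async = p_common(300.0, sigma_ms=160.0)
```

Under these assumptions, the wider window yields a lower certainty of common causality for synchronous pairs, yet a higher one for clearly asynchronous pairs. The first outcome is in keeping with reduced phonetic integration, and the second with binding over a wider range of asynchronies.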
If a reduced McGurk effect reflects a lower certainty of common causality in amblyopia, why is
susceptibility to the temporal ventriloquism effect and the sound-induced flash illusion not also
reduced? Unlike the McGurk effect, audiovisual integration in the temporal ventriloquism effect
and the sound-induced flash illusion does not require integration of linguistic elements. These effects are
therefore unlikely to invoke the multi-stage, speech-specific mode of perception involving the
anterior STS (Baart, Stekelenburg, et al., 2014; Lee & Noppeney, 2011b; Tuomainen et al.,
2005). As a consequence, the additional constraints on integration imposed by phonetic
mismatch would not affect these non-speech integrative phenomena. For example, if unilateral
amblyopia involves deficits in lip-reading ability like those described in early visual deprivation,
they may only affect audiovisual speech integration at the second stage specific to phonetic
content (Lalonde & Holt, 2016; Putzar, Goerendt, et al., 2010; Putzar, Hötting, et al., 2010).
Furthermore, multi-stage processing implies that the criterion for phonetic integration in
amblyopia may be shifted independently from the criterion for spatiotemporal integration.
Reduced susceptibility to the McGurk effect in amblyopia may alternatively represent
amblyopia-related maldevelopment of the neural substrates for phonetic integration in the second
stage of audiovisual speech integration. Because temporal ventriloquism and the sound-induced
flash illusion do not involve linguistic elements, any amblyopia-related impairment specific to
phonetic integration would not affect these non-speech integrative phenomena.
7.2.1.5 Optimal Integration in the Setting of Reduced Sensory Reliability
A Bayesian framework has been successfully applied to numerous instances of multisensory
integration in humans (reviewed in section 1.3.5.3). An underlying assumption of the Bayesian
framework is that integration is statistically optimal, and that the weight of each modality in the
combined multisensory percept is a function of the relative reliability of each unisensory
component stimulus (Ernst & Bulthoff, 2004). Sensory information from the more reliable
modality is weighted more heavily than sensory information from the less reliable modality. In
normal adults, the unisensory weighting coefficients for optimal combination are not fixed for
each modality, but have been shown to dynamically readjust in response to exogenous changes
in signal reliability (Alais & Burr, 2004; Andersen et al., 2005; Battaglia et al., 2003; Ernst &
Banks, 2002; Gori et al., 2008; Moro et al., 2014; Nardini et al., 2008). This dynamic response to
exogenous changes in signal reliability indicates that normal multisensory integration remains
sensitive and flexible at maturity in many instances.
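As a minimal sketch of the reliability-weighted rule described above, the standard maximum likelihood combination of two independent Gaussian cues can be written as follows. The function name mle_combine and the numerical values are illustrative assumptions, not measurements from this thesis. Reliability is taken as inverse variance, as in the MLE model.

```python
import math

def mle_combine(sigma_v, sigma_a):
    """Return (w_v, w_a, sigma_av) for independent Gaussian visual and auditory cues.

    Each weight is that cue's share of the total reliability (inverse variance).
    sigma_av is the predicted standard deviation of the combined estimate.
    """
    r_v = 1.0 / sigma_v ** 2
    r_a = 1.0 / sigma_a ** 2
    w_v = r_v / (r_v + r_a)
    w_a = 1.0 - w_v
    sigma_av = math.sqrt(1.0 / (r_v + r_a))
    return w_v, w_a, sigma_av

# Hypothetical localization precision in degrees (not thesis data)
w_v, w_a, sigma_av = mle_combine(sigma_v=2.0, sigma_a=4.0)

# The same auditory cue paired with a noisier visual signal
w_v_noisy, w_a_noisy, sigma_av_noisy = mle_combine(sigma_v=8.0, sigma_a=4.0)
```

In this sketch, degrading the visual signal shifts weight toward audition, while the predicted audiovisual estimate remains more precise than either unisensory estimate. Comparing the predicted sigma_av against measured audiovisual precision is the usual test of optimality.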
Several experiments that demonstrate dynamic readjustment of unisensory weighting in optimal
multisensory integration have modulated the reliability of the exogenous visual signal by the
addition of random noise (Battaglia et al., 2003; Ernst & Banks, 2002). Importantly,
spatiotemporal noise is also a well-documented feature of the amblyopic visual system (reviewed
in section 1.1.3, Levi (2013), and Banko, Kortvelyes, Weiss, et al. (2013)). Downstream
multisensory areas in the amblyopic brain may not distinguish external noise from internal noise,
and may therefore weight vision according to the spatiotemporal reliability of its neural signal. Indeed, such a
mechanism is suggested by the results of Study I, which showed that audiovisual spatial
integration in amblyopia obeyed the MLE model of optimal combination (a special case of the
Bayesian framework).
Insight may be gained from other audiovisual phenomena as well. The perceptual tasks involved
in the temporal ventriloquism effect and sound-induced flash illusion involve auditory
dominance over a visual temporal judgment, whereas the McGurk effect involves visual
dominance over an auditory phonetic judgment. If modality dominance is assumed to reflect
optimal perceptual weighting based on the reliability of the component unisensory inputs, then
certain predictions follow. Assuming normal temporal reliability of the amblyopic auditory
signal, a heightened auditory contribution can be predicted for perceptual tasks typically
dominated by audition (e.g., temporal judgments), and a diminished visual contribution can be
predicted for perceptual tasks typically dominated by vision (e.g., phonetic judgments). Indeed,
this pattern is in general agreement with the empirical data from adults with amblyopia. The
widened temporal binding windows for the temporal ventriloquism effect (Study IV) and the
sound-induced flash illusion (Narinesingh et al., 2017) are consistent with a heightened auditory
contribution in response to diminished visual temporal reliability in amblyopia. Although the
magnitude of susceptibility to the temporal ventriloquism effect (Study IV) and the sound-
induced flash illusion (Narinesingh et al., 2017) was not heightened in amblyopia, this may
reflect a ceiling effect for the contribution of audition in the audiovisual percept. Reduced
sensitivity to the McGurk effect is also consistent with a greater contribution of audition to the
fused percept in amblyopia.
Curiously, susceptibility to the McGurk effect remains reduced in amblyopia even when viewing
with the fellow eye only. At first glance, this observation appears to conflict with the hypothesis
that the mechanisms of audiovisual integration remain intact and optimal in amblyopia.
Importantly, however, the McGurk effect involves integration of not only simple spatiotemporal
properties of the multisensory stimuli, but also of the more complex phonetic identity of the
linguistic content. Phonetic identity of a visual signal is derived from lip-reading abilities, and
lip-reading abilities are sensitive to damage by early-onset visual deprivation (Putzar, Goerendt,
et al., 2010; Putzar, Hötting, et al., 2010). If lip-reading abilities are similarly impaired in
unilateral amblyopia, then diminished susceptibility to the McGurk effect during fellow eye
viewing may still reflect an optimal process of multisensory phonetic integration. These issues
are not resolved, however. The effect of unilateral amblyopia on lip-reading abilities and the
optimality of the McGurk percept remain open to investigation.
7.3 Are Non-integrative Audiovisual Processes Impaired in Amblyopia?
In the preceding section, the question of whether multisensory integration is impaired in
amblyopia was explored. Study I and Study II revealed that spatial localization precision for
visual, auditory, and audiovisual stimuli is reduced in amblyopia. Comparison of the empirical
data with an MLE ideal observer model showed that participants with amblyopia demonstrated
optimal integration; that is, impairments in spatial precision at the unisensory (i.e., visual and
auditory) level accounted for spatial deficits observed at the multisensory (i.e., audiovisual)
level. Study III showed that temporal resolution for detection of audiovisual asynchrony is
reduced in amblyopia. Despite this audiovisual temporal perception deficit, integration—as
demonstrated by the temporal ventriloquism effect—was intact in amblyopia, as shown in Study
IV. Prior observations on the McGurk effect have suggested that audiovisual integration is
impaired in amblyopia (Burgmeier et al., 2015; Narinesingh et al., 2015; Narinesingh et al.,
2014). However, quantitative assessments of the unisensory contributions to the fused percept
have not been done for the McGurk effect in amblyopia. Without such measurements of
unisensory performance, the concept of an integration failure in amblyopia remains hypothetical.
As explored in section 7.2.1, many mechanisms other than a failure of appropriate integration
may explain the comparatively low susceptibility to the McGurk effect observed in amblyopia.
On the balance of evidence summarized above and reviewed in section 7.2, it can be argued that
unilateral amblyopia does not involve a primary failure of audiovisual integration. Rather, the
observed abnormalities in audiovisual integration may be explained plausibly and
parsimoniously by amblyopia-related impairments of non-integrative multisensory processes
acquired during early life. The evidence for this hypothesis will be examined below.
7.3.1 Cross-modal Matching
Cross-modal matching refers to the multisensory process by which stimuli from different sensory
modalities are compared to estimate their equivalence (Stein et al., 2010). In contrast to
multisensory integration, which involves fusion of unimodal information to produce a new
unified percept, cross-modal matching requires preservation of stimulus features within each
modality (Fujisaki & Nishida, 2005; Stein et al., 2010).
Audiovisual simultaneity judgment is an example of cross-modal matching on the basis of
audiovisual temporal correspondence. Study III showed that adults with the most common forms
of unilateral amblyopia (anisometropic, strabismic, and mixed mechanism) have a widened
temporal window of perceived audiovisual simultaneity, suggesting reduced precision in the
neural mechanism for cross-modal matching of audiovisual temporal features. The window was
widened in both auditory-lead and visual-lead SOA conditions, consistent with findings recently
reported for a sample of adults with deprivational amblyopia caused by unilateral congenital
cataract (Chen et al., 2017). In both studies, the shape of the audiovisual binding window did not
change with viewing condition, indicating that the alterations in simultaneity perception were not
real-time adjustments to amblyopic visual input, but likely reflected changes crystallized during
development. Stevenson, Zemtsov, et al. (2012) investigated the audiovisual simultaneity
window and how it relates to performance on other multisensory tasks, and found that
developmentally normal adults with a wider audiovisual simultaneity window (particularly the
visual-lead side) tend to exhibit lower susceptibility to the McGurk effect, but greater
susceptibility to the sound-induced flash illusion. The authors postulated that the correlations
between performance parameters for these tasks reflect individual variability in the underlying
ability to dissociate asynchronous audiovisual stimuli. Similarly, Chen et al. (2017) hypothesized
that the widened window of audiovisual simultaneity in deprivational amblyopia results from
lower temporal precision in the cross-modal perceptual system, and that amblyopic visual input
may interfere with normal developmental tuning of the neural circuits for audiovisual
simultaneity perception (Chen et al., 2017; Chen et al., 2016). A possible mechanism for this
apparent interference in the developmental tuning of audiovisual simultaneity perception (as
discussed in Study III) is temporal uncertainty, or noise, in the amblyopic visual signal (Banko,
Kortvelyes, Nemeth, et al., 2013; Roelfsema et al., 1994). Indeed, a recent abstract reporting a
study of 47 visually normal adults showed that the precision of temporal perception in an
audiovisual simultaneity judgment task can be predicted from the trial-to-trial variability of an
individual’s cortical evoked responses to visual or auditory stimuli (Arnold, Mathews, Keane, &
Yarrow, 2017). Extrapolating to amblyopia, this finding implicates neural noise in the visual
signal as a limit on the developmental tuning of audiovisual simultaneity perception.
Impaired cross-modal matching may also account for the effect of amblyopia on the temporal
window of integration for the temporal ventriloquism effect (Study IV). Although the strength of
integration across stimulus conditions did not differ significantly between groups, subgroup
analysis revealed a wider temporal window of integration among amblyopic participants with
poorer stereo acuity, and suggested a similar trend for those with more severe acuity loss in the
amblyopic eye (Figure 6.6 and Figure 6.7). Not only do these findings indicate that the capacity
for audiovisual temporal integration is intact in amblyopia, they also suggest that it operates over
a wider range of SOAs. Two possible explanations exist for this widened window of audiovisual
temporal integration if sequential versus parallel processing mechanisms are considered. If
temporal ventriloquism is a product of sequential processing, then a widened simultaneity
window may be the proximate cause of the widened window of integration (i.e., the simultaneity
window acts as a temporal filter that constrains subsequent integration). Conversely, if temporal
ventriloquism is a product of parallel processing, then temporal noise in the amblyopic visual
signal may be the proximate cause of both the widened window of simultaneity and the widened
window of integration. A possible distinguishing feature between these two proposed
mechanisms is the effect of viewing condition on the width of the window of integration. In the
case of sequential processing, the window of integration will depend upon the window of
simultaneity. Because the window of simultaneity does not change on amblyopic eye or fellow
eye viewing, the window of integration would also remain unchanged. In the case of parallel
processing, however, the window of integration will depend upon the level of temporal noise in
the visual signal. Because the capacity for optimal integration is argued to remain intact in
amblyopia, the window of integration will change with the viewing eye. Although this remains
an outstanding question, it is important to note that temporal noise in the amblyopic visual signal
(as opposed to a failure of integration) can account for the observed perceptual abnormalities in
both hypothetical mechanisms.
7.3.2 Unisensory Impairments and Cross-sensory Calibration
In addition to the effects of amblyopia on audiovisual integration and non-integrative
audiovisual processes discussed above, Study I and Study II revealed associated abnormalities in
unisensory spatial localization. The reduced precision in visual spatial localization observed in
Study I (Figure 3.4) undoubtedly represents a unimodal effect related to other spatiotemporal
visual deficits in amblyopia (reviewed in section 1.1.3). In contrast, the reduced precision in
auditory spatial localization (i.e., wider MAA) observed in Study I, and confirmed in Study II,
cannot be explained as a real-time effect of amblyopic visual input. Indeed, the experiments were
conducted in complete darkness, with neither the auditory stimuli, nor the method of response for
sound localization, involving vision. This finding was unexpected, but replicated, and constitutes
discovery of a novel clinical deficit in people with the most common forms of amblyopia
(anisometropic, strabismic, and mixed-mechanism). The mechanism for this novel deficit in
sound localization is likely one of impaired cross-sensory calibration by vision (reviewed in
section 1.3.7). That is, amblyopic visual input disrupts the developmental calibration of sound
localization during a sensitive period in early life. While animal models (King et al., 1988;
Knudsen & Knudsen, 1989) and human data (Gori et al., 2014; Lessard et al., 1998) support this
hypothesis, this novel finding is particularly intriguing because discordant binocular vision has
never before been shown to impair spatial hearing. To the contrary, the only previous work to
study sound localization in early unilateral visual impairment examined monocular adults, and
found that sound localization accuracy was slightly enhanced (Hoover et al., 2012).
In addition to reduced sound localization precision in amblyopia, Study II also found that sound
localization accuracy was poorer in the spatial hemifield ipsilateral to the amblyopic eye (Figure
4.4C, D). Furthermore, the magnitude of these hemifield-specific inaccuracies correlated
significantly with clinical markers of amblyopia: visual acuity in the amblyopic eye and stereo
acuity (Figure 4.5). Considering the anatomic differences in how retinal fibres decussate in the
retinotectal and retinogeniculate pathways (Lane et al., 1973; Pollack & Hickey, 1979), this
asymmetric pattern of sound localization error suggests that the superior colliculus, rather than
V1, mediates the cross-sensory calibration of sound localization by vision in humans. If this is
the case, it implies that amblyogenic factors in early life not only disrupt visual spatial
processing in the retinostriate pathway (see section 1.1.4), but that they also cause a de novo
(i.e., second primary) deficit in auditory spatial processing in the midbrain via the
retinocollicular pathway. Indeed, a similar mechanism involving the superior colliculus as a
second primary site of impairment was previously hypothesized by Ciuffreda et al. (1978) to
explain the prolongation of saccadic latencies in amblyopia.
7.4 Clinical Implications
The majority of the discussion to this point has dealt with elucidating the pattern and
pathophysiology of multisensory processing abnormalities in amblyopia. The findings herein
also have clinical implications for the diagnosis and treatment of amblyopia and its associated
deficits.
It is important to note that many multisensory phenomena (e.g., the McGurk effect and the
spatial ventriloquism effect) rely on unnatural pairings of audiovisual stimuli to induce
perceptual illusions. The normal perceptual system appears to fail the observer in such
circumstances by delivering a perceptual product that is non-veridical. For instance, at first
glance, it would appear that normal susceptibility to the McGurk effect, resulting in non-
veridical auditory perception, would be an adaptive disadvantage. On deeper consideration,
however, it reflects an adaptive ability to combine naturally-occurring stimuli to enhance the
fidelity of perception. In and of themselves, such illusory phenomena do not demonstrate the
adaptive advantages of multisensory integration, but serve as useful experimental tools to probe
the mechanistic underpinnings of multisensory function. Assessing the clinical implications of
abnormal perception of illusory percepts in amblyopia therefore necessitates extrapolation to
ecologically valid situations.
The balance of evidence reviewed and presented in this thesis points to intact spatial and
temporal audiovisual integration in amblyopia. Indeed, people with amblyopia integrate visual
and auditory spatial signals appropriately according to the MLE model, and show enhancements
in visual temporal processing by the temporal ventriloquism effect. These findings suggest that
singling out integrative processes as specific targets for rehabilitation may be misguided, and
shift the focus for clinically-relevant perceptual deficits to the realm of unisensory and non-
integrative multisensory functions.
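As a brief illustration of the MLE prediction referred to above, the following Python sketch
computes the reliability-based weights and the predicted bimodal localization variability from
the two unisensory variabilities. The formulas are the standard MLE cue-combination expressions
(e.g., Alais & Burr, 2004); the function name and the numerical values are hypothetical and are
not data from this thesis.

```python
import math


def mle_prediction(sigma_v, sigma_a):
    """Standard MLE cue-combination predictions from unisensory variability.

    sigma_v, sigma_a: standard deviations of visual and auditory localization
    (hypothetical inputs, in degrees). Returns (w_v, w_a, sigma_av).
    """
    var_v = sigma_v ** 2
    var_a = sigma_a ** 2
    # Each modality is weighted by its relative reliability (inverse variance).
    w_v = var_a / (var_v + var_a)
    w_a = var_v / (var_v + var_a)
    # Predicted bimodal variability never exceeds the better unisensory one.
    sigma_av = math.sqrt((var_v * var_a) / (var_v + var_a))
    return w_v, w_a, sigma_av


# Hypothetical observer: vision twice as precise as audition.
w_v, w_a, sigma_av = mle_prediction(sigma_v=2.0, sigma_a=4.0)
print(f"w_v={w_v:.2f}, w_a={w_a:.2f}, sigma_av={sigma_av:.2f}")
```

In this sketch, increasing sigma_v (a less precise visual signal) shifts weight toward audition
while the combination rule itself is unchanged, which is the sense in which integration can be
optimal despite unisensory deficits.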
The most surprising finding was the sound localization deficit (widened MAA) in amblyopia
described in Study I and Study II. Although standard hearing screening tests do not assess sound
localization ability, a widened MAA has a measurable impact on the level of experienced
hearing disability and handicap (Van Esch et al., 2015). The Gothenburg Profile is a validated
tool for clinical assessment of real-world hearing disability (Ringdahl, Eriksson-Mangold, &
Andersson, 1998). Responses to several items of the Gothenburg Profile, including “Are there
occasions when you cannot localize different sounds in traffic?” and “Are there occasions when
you turn your head in the wrong direction, when someone calls you?”, are significantly
correlated with poorer spatial hearing as measured by the MAA (Van Esch et al., 2015). The
ability to segregate sounds on the basis of spatial cues has also been shown to contribute
significantly to speech intelligibility in both young children and adults (Litovsky, 2005). By
extension, it is speculated that impaired cross-sensory calibration of sound localization in
amblyopia may have similar real-world consequences for situational awareness in traffic, social
interaction, and speech comprehension in noisy environments.
Findings described in Study III also support the hypothesis that amblyopia involves reduced
precision in audiovisual simultaneity perception. Although it is difficult to make a case for the
importance of more precise simultaneity perception per se, the width of the simultaneity window
may be causally related to performance on other indices of multisensory integration (Stevenson,
Zemtsov, et al., 2012). Improved multisensory integration, in turn, may confer perceptual
advantages as outlined in section 1.3.1. In this light, the importance of a widened simultaneity
window in amblyopia may lie in its demonstrated potential for clinical modification. Indeed,
various forms of perceptual learning, including short-term simultaneity training with feedback,
musical training, and video gaming experience, have been shown to narrow the audiovisual
simultaneity window (Donohue et al., 2010; Lee & Noppeney, 2011a; Powers et al., 2009;
Stevenson et al., 2013).
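To make the notion of a "wider" simultaneity window concrete, the following Python sketch
summarizes simultaneity judgment data by a simple moment-based estimate: the response-weighted
mean asynchrony (taken here as the point of subjective simultaneity) and the response-weighted
standard deviation (taken here as the window width). This is an assumed, simplified summary and
not the fitting procedure used in Study III; the function name and all values are hypothetical.

```python
import math


def simultaneity_window(soas_ms, p_sync):
    """Moment-based summary of simultaneity judgments (illustrative only).

    soas_ms: stimulus onset asynchronies in ms (negative = audio leads).
    p_sync:  proportion of "synchronous" responses at each asynchrony.
    Returns (pss_ms, width_ms).
    """
    total = sum(p_sync)
    # Response-weighted mean asynchrony.
    pss = sum(s * p for s, p in zip(soas_ms, p_sync)) / total
    # Response-weighted spread around that mean.
    var = sum(p * (s - pss) ** 2 for s, p in zip(soas_ms, p_sync)) / total
    return pss, math.sqrt(var)


soas = [-300, -200, -100, 0, 100, 200, 300]
narrow = [0.0, 0.1, 0.6, 1.0, 0.6, 0.1, 0.0]  # hypothetical narrow window
wide = [0.1, 0.4, 0.8, 1.0, 0.8, 0.4, 0.1]    # hypothetical widened window

pss_narrow, width_narrow = simultaneity_window(soas, narrow)
pss_wide, width_wide = simultaneity_window(soas, wide)
print(f"narrow: {width_narrow:.0f} ms, wide: {width_wide:.0f} ms")
```

A training effect of the kind cited above would appear in such a summary as a reduction in the
estimated width between pre-training and post-training sessions.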
More broadly, the effects of amblyopia on audiovisual temporal perception and spatial hearing
presented herein lead one to ask several questions. First, do the current treatments for amblyopia
(e.g., occlusion or pharmacologic penalization of the better-seeing eye) cause or exacerbate these
impairments? It is conceivable that amblyopia therapy may deprive the developing brain of high-
fidelity spatiotemporal visual signals necessary for audiovisual temporal and spatial hearing
development. Part-time occlusion is likely insufficient to induce appreciable impairments, but the
effects of full-time occlusion or long-lasting pharmacologic penalization during a sensitive
period of multisensory development may be more significant. Second, can treatment standards
for amblyopia be improved to better address the impairments in audiovisual temporal perception
and spatial hearing? Evidence from a study of the McGurk effect in amblyopia suggests that
successful treatment before 5 years of age may prevent abnormalities in audiovisual speech
integration (Burgmeier et al., 2015). Considerable evidence from animal models and some data
from clinical populations also point to a sensitive period in early life during which spatial
hearing is vulnerable to damage from anomalous visual experience (reviewed in section 1.2.2.6).
Similar to the importance of early therapy for the visual aspects of amblyopic rehabilitation
(Campos, 1995; Flynn et al., 1998; Holmes et al., 2011; Lea et al., 1989; Scheiman et al., 2005),
outcomes for multisensory and spatial hearing abilities in amblyopia may also be improved by
early treatment. While more data are needed to support this hypothesis, if early treatment for
amblyopia improves speech integration and spatial hearing outcomes, the evidentiary weight in
favour of population-based childhood vision screening programs will undoubtedly be enhanced.
7.5 Conclusions
Below, the main conclusions of this thesis are summarized.
1) The capacity for spatial and temporal audiovisual integration in amblyopia is intact.
In the spatial domain, the manner of audiovisual integration in the spatial ventriloquism
effect was optimal according to the MLE model of multisensory combination (Study I).
The perceptual weight of each modality and differences in audiovisual localization
precision between the amblyopia and control groups were accounted for by differences in
perceptual performance at the unisensory level for vision, and surprisingly, audition.
In the temporal domain, audiovisual integration, as assessed by the temporal
ventriloquism effect, was intact (Study IV).
2) The temporal resolution of audiovisual simultaneity perception in amblyopia is diminished.
The temporal window of audiovisual simultaneity perception was widened in amblyopia,
and its width was not dependent on which eye was viewing (Study III). The results
suggest that the amblyopic impairment in audiovisual temporal perception is caused by a
central processing abnormality and is developmental in origin.
3) Sound localization in amblyopia is impaired.
Horizontal sound localization precision in the central region of space was impaired in
amblyopia (Study I and Study II). Horizontal sound localization error in the central 32° of
space was abnormally asymmetric in amblyopia, with greater error in the hemifield
ipsilateral to the amblyopic eye. The magnitude of sound localization error in the
amblyopic hemifield correlated significantly with amblyopic deficits in visual acuity and
stereo acuity. The results suggest that amblyopia disrupts spatial hearing during a
sensitive period of auditory development by a mechanism of cross-sensory calibration.
The spatial pattern of sound localization errors implicates the superior colliculus in
mediating the effect of amblyopic vision on spatial hearing.
Chapter 8 Future Directions
The experimental findings and conclusions reported in this thesis inspire further questions about
the development and mechanisms of multisensory processing and integration, the nature and
extent of perceptual impairments in amblyopia, and the adequacy and impact of current therapies
for amblyopia. A future research program encompassing several interrelated areas of study is
envisioned and outlined below.
8.1 Development and Mechanisms of Multisensory Processing and Integration
Similar to the way early visual deprivation has provided an invaluable experimental model for
the study of normal visual development (Lewis & Maurer, 2005), unilateral amblyopia provides
a unique opportunity to study the requirements for normal multisensory development.
Soto-Faraco and Alsius (2009) noted a controversy in the field of multisensory processing: are
different attributes of a multisensory object treated separately by the perceptual system and
bound by different mechanisms (i.e., multiple parallel processes), or are they treated in a unified
manner and processed by a common mechanism (i.e., a single sequential process)? In their study
of the McGurk effect, they reported that the temporal window for audiovisual speech integration
is wider than that for perceived audiovisual synchrony, supporting the hypothesis of multiple
parallel processes (Soto-Faraco & Alsius, 2009). Speech integration, however, is often
considered a special case of multisensory processing (Baart et al., 2015; Baart, Vroomen, et al.,
2014; Eskelund et al., 2011; Lalonde & Holt, 2016). Study of non-speech integrative phenomena,
such as the temporal ventriloquism effect and the sound-induced flash illusion, may therefore
provide more generalizable results. In section 7.3.1, it was hypothesized that if the temporal
window of audiovisual integration is proximally constrained by the window of audiovisual
simultaneity (Figure 8.1A), then it will not be modulated directly by the reliability of the visual
signal. Conversely, if the two processes (i.e., simultaneity perception and integration) occur by
parallel mechanisms (Figure 8.1B), then the temporal window of audiovisual integration will
respond to changes in the reliability of the visual signal. In amblyopia, it is already established
that the audiovisual simultaneity window is widened regardless of viewing condition, indicating
that it does not respond to changes in the reliability of the visual signal (Study III and Chen et al.
(2017)). However, the effect of monocular viewing condition on the temporal window of
integration for non-speech phenomena has not been fully investigated (Study IV and Narinesingh
et al. (2017)). Experimental decoupling of the response pattern for these two multisensory
processes by monocular viewing conditions (i.e., a stable simultaneity window, but variable
temporal window of integration) would provide evidence for parallel processing mechanisms.
Figure 8.1: Possible mechanisms that determine the temporal window of audiovisual
integration. (A) Sequential processing. The width of the simultaneity window is the proximal
constraint on the window of integration. Because the simultaneity window is invariant in fellow
eye and amblyopic eye viewing conditions, the integration window will be similarly invariant.
(B) Parallel processing. In this model, the simultaneity window is not a proximal constraint on
the window of integration. The window of integration will therefore vary according to whether
visual input is received from the temporally precise fellow eye, or from the temporally imprecise
amblyopic eye.
A future endeavour will also be to combine electroencephalography with behavioural studies in
amblyopia to elucidate the factors and mechanisms that determine the perception of multisensory
stimuli. Arnold et al. (2017) reported preliminary electroencephalographic data showing that
temporal variability (i.e., noise) in cortical evoked potentials predicted a visually normal
observer’s ability to judge the simultaneity of audiovisual signals. This was an important finding,
because common electroencephalographic techniques that employ time-domain averaging
discard information on trial-to-trial variability. Indeed, this technique, used to increase the
signal-to-noise ratio, has been implicated in the misinterpretation of VEP findings in amblyopia
(Banko, Kortvelyes, Nemeth, et al., 2013). The hypothesis that neural noise predicts the
precision of audiovisual simultaneity perception (Arnold et al., 2017) may be tested in a sample
of observers with amblyopia—a population established to have a widened temporal window of
audiovisual simultaneity perception. Furthermore, if reliable data on the age of onset and
treatment for amblyopic participants can be obtained, evidence for a sensitive period for the
calibration of audiovisual simultaneity perception may also be found.
8.2 Nature and Extent of Perceptual Impairments in Amblyopia
Results presented in this thesis revealed a new class of perceptual impairment in amblyopia
affecting the auditory system. Because this finding is novel, future experiments need to define
more fully the nature and extent of the sound localization deficit. Study I and Study II described and
confirmed a deficit in horizontal sound localization precision (i.e., a wider MAA) for a central
auditory target. However, Study II also described a deficit in sound localization accuracy that
preferentially affected the spatial hemifield ipsilateral to the amblyopic eye. Further studies
should measure the MAA in amblyopia for auditory targets in the left and right hemifields to
determine if the spatial asymmetry identified in sound localization accuracy also applies to sound
localization precision. An asymmetric effect on MAA would further implicate the superior
colliculus as the neural site of cross-sensory calibration of spatial hearing in humans. Similar
sound localization experiments measuring horizontal sound localization precision and error may
also be conducted on a sample of non-amblyopic observers with early-onset strabismus. Results
from a strabismic population may be compared to those from an anisometropic amblyopic
population to determine the differential contributions of retinal defocus and binocular
decorrelation to the cross-sensory calibration of sound localization. As mentioned above, if
reliable data on the age of onset and treatment for the early-onset visual disturbances can be
obtained, evidence for a sensitive period for the visual influence on spatial hearing may be
found.
Earlier studies of the McGurk effect in amblyopia suggested that the visual disorder involves a
failure of multisensory integration. New data presented in this thesis and elsewhere (Narinesingh
et al., 2017), however, imply otherwise—specifically, that mechanisms for audiovisual spatial
and temporal integration remain intact in amblyopia. A difficulty in determining whether
previous studies of the McGurk effect in amblyopia demonstrate normal or deficient integration
is that their experimental designs did not incorporate measures of performance on the component
unisensory tasks (i.e., auditory speech recognition and lip-reading ability) (Burgmeier et al.,
2015; Narinesingh et al., 2015; Narinesingh et al., 2014). Whether the audiovisual speech
perception differences in amblyopia (described in the studies listed above) result from deficient
integration of the available unisensory information, or from a unisensory deficit propagated
through a normal integrative mechanism, remains an unresolved topic of speculation. An
immediate goal for future research is therefore to test adults with unilateral amblyopia on an
audiovisual speech integration task that involves unisensory and multisensory measures of
perceptual performance (as in Putzar, Hötting, et al. (2010), for example). Based on the findings
of intact audiovisual integration in this thesis (Study I and Study IV), it is hypothesized that
audiovisual speech integration in amblyopia is also intact, and that the deficits observed in earlier
studies result from reduced lip-reading ability in amblyopia. In the same way sound localization
deficits are associated with both bilateral (Gori et al., 2014; Lessard et al., 1998) and unilateral
(Study I and Study II) early-onset visual impairments, lip-reading impairments described in
bilateral early-onset visual deprivation (Putzar, Hötting, et al., 2010) may be found in unilateral
amblyopia as well.
With respect to amblyopia, much of the scientific and clinical focus has understandably been on
its prominent visual spatial deficits (McKee et al., 2003) and the pathophysiologic significance
of visual spatial noise (Levi & Klein, 2003; Levi et al., 2008; Levi et al., 1987; Levi et al., 1994;
Niechwiej-Szwedo, Kennedy, et al., 2012; Nordmann et al., 1992; Raashid et al., 2015).
However, reported visual temporal processing deficits in amblyopia (Huang et al., 2012; Spang
& Fahle, 2009; St John, 1998; Tredici & von Noorden, 1984), recent trial-by-trial analyses of
VEP data (Arnold et al., 2017; Banko, Kortvelyes, Nemeth, et al., 2013; Banko, Kortvelyes,
Weiss, et al., 2013), and the widened temporal windows of perceptual binding for a number of
audiovisual tasks (Study III, Study IV, Chen et al. (2017), and Narinesingh et al. (2017)), suggest
that visual temporal noise may be an underappreciated pathophysiologic factor in amblyopia. In
addition to the experiment modeled after Arnold et al. (2017) outlined in section 8.1, an
important future direction for amblyopia research will be psychophysical experiments to more
comprehensively assess amblyopic visual temporal perception. For example, interocular
differences in visual temporal precision and perceptual latency may be measured using a set of
2AFC visual TOJ tasks. To rule out the possibility of a global temporal processing deficit, the
integrity of auditory temporal processing in amblyopia may be confirmed by a temporal order
discrimination task for two tones of different pitch (Tallal, 1978), or by an auditory gap detection
task (Irwin et al., 1985). As noted in section 3.5, temporal factors—specifically, temporal decay
in the amblyopic visual spatial signal—may also explain the deficit in visual spatial precision
observed in Study I. This hypothesis may be tested by comparing the effect of varying the
temporal interval between sequential visual stimuli on localization performance in amblyopia
and control groups. A significant interaction between temporal interval and group would signify
differential temporal decay in the visual spatial signal.
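The interaction proposed above can be expressed as a difference of differences. The following
Python sketch computes that contrast for a hypothetical 2 x 2 design (group by temporal
interval); the data layout, labels, and threshold values are assumptions for illustration only,
and a formal test of significance (for example, a mixed-design analysis of variance) would still
be required.

```python
from statistics import mean


def interaction_contrast(data):
    """Difference-of-differences for a group x temporal-interval design.

    data[group][interval] is a list of localization thresholds (hypothetical,
    in degrees). A positive value means thresholds rise more with the longer
    interval in the amblyopia group than in the control group.
    """
    cell = {g: {i: mean(v) for i, v in by_interval.items()}
            for g, by_interval in data.items()}
    # Effect of lengthening the interval within each group.
    decay_amblyopia = cell["amblyopia"]["long"] - cell["amblyopia"]["short"]
    decay_control = cell["control"]["long"] - cell["control"]["short"]
    return decay_amblyopia - decay_control


# Hypothetical thresholds, not data from this thesis.
thresholds = {
    "amblyopia": {"short": [2.0, 2.2, 1.8], "long": [3.0, 3.4, 2.6]},
    "control": {"short": [1.0, 1.2, 0.8], "long": [1.2, 1.4, 1.0]},
}
contrast = interaction_contrast(thresholds)
print(f"interaction contrast = {contrast:.2f} deg")
```

A contrast near zero would indicate that any loss of precision with longer intervals is shared by
both groups, whereas a reliably positive contrast would be consistent with differential temporal
decay in the amblyopic visual spatial signal.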
Several studies in this thesis reported relations between clinical features of amblyopia and
performance on multisensory tasks. Study II found that the magnitude of sound localization error
in the auditory hemifield ipsilateral to the amblyopic eye was correlated with deficits in stereo
acuity and monocular visual acuity. Subgroup analysis in Study III found that the width of the
audiovisual simultaneity window related to the severity of the monocular acuity deficit, while the
point of subjective simultaneity related to the binocularity deficit. Study IV found that the
temporal window for the temporal ventriloquism effect was wider in individuals with poor stereo
acuity. Practical limitations of sample size, however, prevented a systematic examination of
potential differences between etiological subtypes of amblyopia. Future studies may delve into
this area of investigation by selecting fewer paradigms and focusing on recruiting larger numbers
of participants with anisometropic, strabismic, and mixed mechanism amblyopia.
8.3 Looking to the Future of Amblyopia Therapy
A key step in translating novel research findings into meaningful healthcare innovation is
establishing a link between laboratory results and patient function in real-world situations.
The real-world impact of the amblyopic deficit in sound localization is unknown, but may be
assessed in several ways. In hearing-impaired populations, a relation between the MAA and a
person’s ability to localize voices and sounds in traffic has been established using the
Gothenburg Profile, a validated tool for clinical assessment of experienced hearing disability and
handicap (Ringdahl et al., 1998; Van Esch et al., 2015). These critical abilities may also be
assessed in people with amblyopia using the Gothenburg Profile, and their disability scores may
be correlated with experimentally-determined measures of sound localization precision and
accuracy. The clinical relevance of a widened MAA in amblyopia may also be inferred from
further experimental study. For example, poorer ability to use spatial cues to segregate speech
streams (i.e., higher thresholds for spatial release from masking) may indicate increased
difficulty with speech comprehension in noisy environments (Pillsbury et al., 1991). Difficulty in
this regard may have implications for attention and comprehension in the classroom setting. In
section 7.4, it was also speculated that some therapies for amblyopia—specifically, full-time
occlusion and long-lasting pharmacologic penalization—may exacerbate the amblyopic
disturbance in sound localization by consistently depriving the developing brain of a high-
fidelity spatial signal from the fellow eye. This hypothesis could be tested in a prospective
manner by randomizing previously untreated children with severe amblyopia to either part-time
occlusion or atropine penalization, then measuring the MAA during and at the conclusion of
therapy. The outcome of such a study may provide an evidence-based rationale for choosing
part-time occlusion over other treatment options that are currently considered equivalent
(American Academy of Ophthalmology Pediatric Ophthalmology/Strabismus Panel, 2012).
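The proposed comparison of Gothenburg Profile disability scores with experimentally determined
MAA values could be summarized, as a first step, with a correlation coefficient. The following
Python sketch computes Pearson's r; the choice of Pearson's r (rather than a rank-based
coefficient), the function name, and all values are assumptions for illustration and are not
data from this thesis.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical participants: MAA in degrees and a questionnaire disability score.
maa_deg = [1.0, 1.5, 2.0, 3.0, 4.0]
disability = [10, 14, 15, 22, 30]

r = pearson_r(maa_deg, disability)
print(f"r = {r:.2f}")
```

A positive coefficient in real data would indicate that wider MAAs accompany greater reported
disability, as has been described in hearing-impaired populations.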
Beyond informing the evidence-based application of currently-available therapies, a future
research program is envisioned to seek novel methods to improve multisensory outcomes in
amblyopia. For example, can musical training (Lee & Noppeney, 2011a), or training on visual
and audiovisual temporal perception tasks (Donohue et al., 2010; Powers et al., 2009; Stevenson
et al., 2013) narrow the audiovisual simultaneity window in amblyopia as they do in visually
normal adults?
A fuller understanding of the complex interplay between visual and auditory perception, and an
appreciation of the far-reaching developmental consequences of anomalous sensory experience,
will undoubtedly enhance the clinician’s ability to minimize disability and maximize health in
generations to come.
References
Aaen-Stockdale, C., & Hess, R. F. (2008). The amblyopic deficit for global motion is spatial scale invariant. Vision Res, 48(19), 1965–1971. doi:10.1016/j.visres.2008.06.012
Abel, S. M., Figueiredo, J. C., Consoli, A., Birt, C. M., & Papsin, B. C. (2009). The effect of blindness on horizontal plane sound source identification: El efecto de la ceguera en la identificación de la fuente sonora en el plano horizontal. International Journal of Audiology, 41(5), 285–292. doi:10.3109/14992020209077188
Abrahamsson, M., Fabian, G., & Sjostrand, J. (1990). A longitudinal study of a population based sample of astigmatic children. II. The changeability of anisometropia. Acta Ophthalmol (Copenh), 68(4), 435–440.
Abrahamsson, M., & Sjostrand, J. (1988). Contrast sensitivity and acuity relationship in strabismic and anisometropic amblyopia. Br J Ophthalmol, 72(1), 44–49.
Adams, G. G., & Karas, M. P. (1999). Effect of amblyopia on employment prospects. Br J Ophthalmol, 83(3), 380.
Adams, R. J., & Courage, M. L. (2002). Using a single test to measure human contrast sensitivity from early childhood to maturity. Vision Res, 42(9), 1205–1210.
Alais, D., & Burr, D. (2004). The ventriloquist effect results from near-optimal bimodal integration. Curr Biol, 14(3), 257–262. doi:10.1016/j.cub.2004.01.029
Alais, D., Newell, F. N., & Mamassian, P. (2010). Multisensory processing in review: from physiology to behaviour. Seeing Perceiving, 23(1), 3–38. doi:10.1163/187847510X488603
Alsius, A., Navarra, J., Campbell, R., & Soto-Faraco, S. (2005). Audiovisual integration of speech falters under high attention demands. Curr Biol, 15(9), 839–843. doi:10.1016/j.cub.2005.03.046
Alsius, A., Navarra, J., & Soto-Faraco, S. (2007). Attention to touch weakens audiovisual speech integration. Exp Brain Res, 183(3), 399–404.
Altmann, L., & Singer, W. (1986). Temporal integration in amblyopic vision. Vision Res, 26(12), 1959–1968.
American Academy of Ophthalmology Pediatric Ophthalmology/Strabismus Panel. (2012). Preferred practice pattern guidelines. Amblyopia. San Francisco, CA: American Academy of Ophthalmology. Retrieved from www.aao.org/ppp.
American Academy of Pediatrics Section on Ophthalmology Council on Children with Disabilities, American Academy of Ophthalmology, American Association for Pediatric Ophthalmology and Strabismus, & American Association of Certified Orthoptists. (2009). Joint statement–Learning disabilities, dyslexia, and vision. Pediatrics, 124(2), 837–844. doi:10.1542/peds.2009-1445
Andersen, T. S., Tiippana, K., & Sams, M. (2005). Maximum Likelihood Integration of rapid flashes and beeps. Neurosci Lett, 380(1-2), 155–160. doi:10.1016/j.neulet.2005.01.030
Andreassi, J., & Greco, J. (1975). Effects of bisensory stimulation on reaction time and the evoked cortical potential. Physiological Psychology, 3(2), 189–194.
Arden, G. B., & Barnard, W. M. (1979). Effect of occlusion on the visual evoked response in amblyopia. Trans Ophthalmol Soc U K, 99(3), 419–426.
Arnold, D., Mathews, N., Keane, B., & Yarrow, K. (2017). Evoked neural response variability predicts poor timing precision. J Vis, 17(10), 733–733. doi:10.1167/17.10.733
Aschersleben, G., & Bertelson, P. (2003). Temporal ventriloquism: crossmodal interaction on the time dimension. 2. Evidence from sensorimotor synchronization. Int J Psychophysiol, 50(1-2), 157–163.
Ashmead, D. H., Clifton, R. K., & Perris, E. E. (1987). Precision of auditory localization in human infants. Developmental Psychology, 23(5), 641.
Ashmead, D. H., Davis, D. L., Whalen, T., & Odom, R. D. (1991). Sound localization and sensitivity to interaural time differences in human infants. Child Dev, 62(6), 1211–1226.
Ashmead, D. H., Grantham, D. W., Murphy, W., & Tharpe, A. M. (1993). Human infants’ sensitivity to interaural level differences. J Acoust Soc Am, 93(4), 2360–2360.
Ashmead, D. H., Wall, R. S., Ebinger, K. A., Eaton, S. B., Snook-Hill, M.-M., & Yang, X. (1998). Spatial hearing in children with visual disabilities. Perception, 27(1), 105–122.
Attebo, K., Mitchell, P., Cumming, R., Smith, W., Jolly, N., & Sparkes, R. (1998). Prevalence and causes of amblyopia in an adult population. Ophthalmology, 105(1), 154–159.
Baart, M., Bortfeld, H., & Vroomen, J. (2015). Phonetic matching of auditory and visual speech develops during childhood: evidence from sine-wave speech. J Exp Child Psychol, 129, 157–164. doi:10.1016/j.jecp.2014.08.002
Baart, M., Stekelenburg, J. J., & Vroomen, J. (2014). Electrophysiological evidence for speech-specific audiovisual integration. Neuropsychologia, 53, 115–121. doi:10.1016/j.neuropsychologia.2013.11.011
Baart, M., Vroomen, J., Shaw, K., & Bortfeld, H. (2014). Degrading phonetic information affects matching of audiovisual speech in adults, but not in infants. Cognition, 130(1), 31–43. doi:10.1016/j.cognition.2013.09.006
Babu, R. J., Clavagnier, S. R., Bobier, W., Thompson, B., & Hess, R. F. (2013). The regional extent of suppression: strabismics versus nonstrabismics. Invest Ophthalmol Vis Sci, 54(10), 6585–6593.
Baker, F. H., Grigg, P., & von Noorden, G. K. (1974). Effects of visual deprivation and strabismus on the response of neurons in the visual cortex of the monkey, including studies on the striate and prestriate cortex in the normal animal. Brain Res, 66(2), 185–208.
Banati, R. B., Goerres, G., Tjoa, C., Aggleton, J. P., & Grasby, P. (2000). The functional anatomy of visual-tactile integration in man: a study using positron emission tomography. Neuropsychologia, 38(2), 115–124.
Banko, E. M., Kortvelyes, J., Nemeth, J., Weiss, B., & Vidnyanszky, Z. (2013). Amblyopic deficits in the timing and strength of visual cortical responses to faces. Cortex, 49(4), 1013–1024. doi:10.1016/j.cortex.2012.03.021
Banko, E. M., Kortvelyes, J., Weiss, B., & Vidnyanszky, Z. (2013). How the visual cortex handles stimulus noise: insights from amblyopia. PloS one, 8(6), e66583. doi:10.1371/journal.pone.0066583
Banks, M. S., & Bennett, P. J. (1988). Optical and photoreceptor immaturities limit the spatial and chromatic vision of human neonates. J Opt Soc Am A, 5(12), 2059–2079.
Barnard, W. M., & Arden, G. B. (1979). Changes in the visual evoked response during and after occlusion therapy for amblyopia. Child Care Health Dev, 5(6), 421–430.
Barnes, G. R., Hess, R. F., Dumoulin, S. O., Achtman, R. L., & Pike, G. B. (2001). The cortical deficit in humans with strabismic amblyopia. J Physiol, 533(Pt 1), 281–297.
Barrett, B. T., Bradley, A., & McGraw, P. V. (2004). Understanding the neural basis of amblyopia. Neuroscientist, 10(2), 106–117. doi:10.1177/1073858403262153
Barrett, B. T., Pacey, I. E., Bradley, A., Thibos, L. N., & Morrill, P. (2003). Nonveridical visual perception in human amblyopia. Invest Ophthalmol Vis Sci, 44(4), 1555–1567.
Barutchu, A., Crewther, D. P., & Crewther, S. G. (2009). The race that precedes coactivation: development of multisensory facilitation in children. Developmental Science, 12(3), 464–473.
Barutchu, A., Danaher, J., Crewther, S. G., Innes-Brown, H., Shivdasani, M. N., & Paolini, A. G. (2010). Audiovisual integration in noise by children and adults. J Exp Child Psychol, 105(1-2), 38–50. doi:10.1016/j.jecp.2009.08.005
Batra, R., Kuwada, S., & Fitzpatrick, D. C. (1997). Sensitivity to interaural temporal disparities of low- and high-frequency neurons in the superior olivary complex. I. Heterogeneity of responses. J Neurophysiol, 78(3), 1222–1236.
Battaglia, P. W., Jacobs, R. A., & Aslin, R. N. (2003). Bayesian integration of visual and auditory signals for spatial localization. J Opt Soc Am A Opt Image Sci Vis, 20(7), 1391–1397.
Bedell, H. E., Flom, M. C., & Barbeito, R. (1985). Spatial aberrations and acuity in strabismus and amblyopia. Invest Ophthalmol Vis Sci, 26(7), 909–916.
Bertelson, P., & Radeau, M. (1981). Cross-modal bias and perceptual fusion with auditory-visual spatial discordance. Percept Psychophys, 29(6), 578–584.
Bertelson, P., Vroomen, J., De Gelder, B., & Driver, J. (2000). The ventriloquist effect does not depend on the direction of deliberate visual attention. Percept Psychophys, 62(2), 321–332. doi:10.3758/bf03205552
Berwanger, D., Wittmann, M., von Steinbuchel, N., & von Suchodoletz, W. (2004). Measurement of temporal-order judgment in children. Acta Neurobiol Exp (Wars), 64(3), 387–394.
Binns, K. E., Grant, S., Withington, D. J., & Keating, M. J. (1992). A topographic representation of auditory space in the external nucleus of the inferior colliculus of the guinea-pig. Brain Res, 589(2), 231–242. doi:10.1016/0006-8993(92)91282-J
Birch, E. E. (2013). Amblyopia and binocular vision. Prog Retin Eye Res, 33, 67–84. doi:10.1016/j.preteyeres.2012.11.001
Birch, E. E., & Holmes, J. M. (2010). The clinical profile of amblyopia in children younger than 3 years of age. J AAPOS, 14(6), 494–497.
Birch, E. E., Morale, S. E., Jost, R. M., De La Cruz, A., Kelly, K. R., Wang, Y. Z., & Bex, P. J. (2016). Assessing suppression in amblyopic children with a dichoptic eye chart. Invest Ophthalmol Vis Sci, 57(13), 5649–5654. doi:10.1167/iovs.16-19986
Birch, E. E., Stager, D., Leffler, J., & Weakley, D. (1998). Early treatment of congenital unilateral cataract minimizes unequal competition. Invest Ophthalmol Vis Sci, 39(9), 1560–1566.
Birch, E. E., & Stager, D. R. (1988). Prevalence of good visual acuity following surgery for congenital unilateral cataract. Arch Ophthalmol, 106(1), 40–43.
Birch, E. E., Stager, D. R., & Wright, W. W. (1986). Grating acuity development after early surgery for congenital unilateral cataract. Arch Ophthalmol, 104(12), 1783–1787.
Birch, E. E., & Swanson, W. H. (2000). Hyperacuity deficits in anisometropic and strabismic amblyopes with known ages of onset. Vision Res, 40(9), 1035–1040.
Birch, E. E., Swanson, W. H., Stager, D. R., Woody, M., & Everett, M. (1993). Outcome after very early treatment of dense congenital unilateral cataract. Invest Ophthalmol Vis Sci, 34(13), 3687–3699.
Blakemore, C. (1988). The sensitive periods of the monkey’s visual cortex. In Strabismus and Amblyopia (pp. 219–234): Springer.
Blakemore, C., & Vital-Durand, F. (1986). Effects of visual deprivation on the development of the monkey's lateral geniculate nucleus. J Physiol, 380, 493–511.
Blankenship, C., Zhang, F., & Keith, R. (2016). Behavioral measures of temporal processing and speech perception in cochlear implant users. J Am Acad Audiol, 27(9), 701–713. doi:10.3766/jaaa.15026
Blauert, J. (1970). Ein Versuch zum Richtungshören bei gleichzeitiger optischer Stimulation. Acustica, 23, 118–119.
Bonneh, Y. S., Sagi, D., & Polat, U. (2004). Local and non-local deficits in amblyopia: acuity and spatial interactions. Vision Res, 44(27), 3099–3110. doi:10.1016/j.visres.2004.07.031
Bonneh, Y. S., Sagi, D., & Polat, U. (2007). Spatial and temporal crowding in amblyopia. Vision Res, 47(14), 1950–1962. doi:10.1016/j.visres.2007.02.015
Boothe, R. G., Dobson, V., & Teller, D. Y. (1985). Postnatal development of vision in human and nonhuman primates. Annu Rev Neurosci, 8(1), 495–545. doi:10.1146/annurev.ne.08.030185.002431
Boudreau, J. C., & Tsuchitani, C. (1968). Binaural interaction in the cat superior olive S segment. J Neurophysiol, 31(3), 442–454.
Bradley, A., & Freeman, R. D. (1981). Contrast sensitivity in anisometropic amblyopia. Invest Ophthalmol Vis Sci, 21(3), 467–476.
Bristow, D., Dehaene-Lambertz, G., Mattout, J., Soares, C., Gliga, T., Baillet, S., & Mangin, J.-F. (2009). Hearing faces: how the infant brain matches the face it sees with the speech it hears. J Cogn Neurosci, 21(5), 905–921.
Brown, K. W., & Gottfried, A. W. (1986). Cross-modal transfer of shape in early infancy: Is there reliable evidence. Advances in infancy research, 4, 163–170.
Brown, L. E., Halpert, B. A., & Goodale, M. A. (2005). Peripheral vision for perception and action. Exp Brain Res, 165(1), 97–106. doi:10.1007/s00221-005-2285-y
Brown, S. A., Weih, L. M., Fu, C. L., Dimitrov, P., Taylor, H. R., & McCarty, C. A. (2000). Prevalence of amblyopia and associated refractive errors in an adult population in Victoria, Australia. Ophthalmic Epidemiol, 7(4), 249–258.
Buch, H., Vinding, T., La Cour, M., & Nielsen, N. V. (2001). The prevalence and causes of bilateral and unilateral blindness in an elderly urban Danish population. The Copenhagen City Eye Study. Acta Ophthalmol Scand, 79(5), 441–449. doi:10.1034/j.1600-0420.2001.790503.x
Burgmeier, R., Desai, R. U., Farner, K. C., Tiano, B., Lacey, R., Volpe, N. J., & Mets, M. B. (2015). The effect of amblyopia on visual-auditory speech perception: why mothers may say "Look at me when I'm talking to you". JAMA Ophthalmol, 133(1), 11–16. doi:10.1001/jamaophthalmol.2014.3307
Burnham, D., & Dodd, B. (1996). Auditory-visual speech perception as a direct process: The McGurk effect in infants and across languages. In Speechreading by Humans and Machines (pp. 103–114): Springer.
Burnham, D., & Dodd, B. (2004). Auditory-visual speech integration by prelinguistic infants: perception of an emergent consonant in the McGurk effect. Dev Psychobiol, 45(4), 204–220. doi:10.1002/dev.20032
Burr, D., Banks, M. S., & Morrone, M. C. (2009). Auditory dominance over vision in the perception of interval duration. Exp Brain Res, 198(1), 49–57. doi:10.1007/s00221-009-1933-z
Burr, D., & Gori, M. (2012). Multisensory integration develops late in humans. In M. M. Murray & M. T. Wallace (Eds.), The Neural Bases of Multisensory Processes. Boca Raton (FL): CRC Press/Taylor & Francis LLC.
Bushara, K. O., Grafman, J., & Hallett, M. (2001). Neural correlates of auditory-visual stimulus onset asynchrony detection. J Neurosci, 21(1), 300–304.
Bushara, K. O., Hanakawa, T., Immisch, I., Toma, K., Kansaku, K., & Hallett, M. (2003). Neural correlates of cross-modal binding. Nat Neurosci, 6(2), 190–195. doi:10.1038/nn993
Bushara, K. O., Weeks, R. A., Ishii, K., Catalan, M. J., Tian, B., Rauschecker, J. P., & Hallett, M. (1999). Modality-specific frontal and parietal areas for auditory and visual spatial localization in humans. Nat Neurosci, 2(8), 759–766. doi:10.1038/11239
Caird, D., & Klinke, R. (1983). Processing of binaural stimuli by cat superior olivary complex neurons. Experimental Brain Research, 52(3), 385–399.
Callan, D. E., Callan, A. M., Kroos, C., & Vatikiotis-Bateson, E. (2001). Multimodal contribution to speech perception revealed by independent component analysis: a single-sweep EEG case study. Brain Res Cogn Brain Res, 10(3), 349–353.
Calvert, G. A. (2001). Crossmodal processing in the human brain: insights from functional neuroimaging studies. Cereb Cortex, 11(12), 1110–1123. doi:10.1093/cercor/11.12.1110
Calvert, G. A., Brammer, M. J., Bullmore, E. T., Campbell, R., Iversen, S. D., & David, A. S. (1999). Response amplification in sensory-specific cortices during crossmodal binding. Neuroreport, 10(12), 2619–2623.
Calvert, G. A., Campbell, R., & Brammer, M. J. (2000). Evidence from functional magnetic resonance imaging of crossmodal binding in the human heteromodal cortex. Curr Biol, 10(11), 649–657.
Calvert, G. A., Hansen, P. C., Iversen, S. D., & Brammer, M. J. (2001). Detection of audio-visual integration sites in humans by application of electrophysiological criteria to the BOLD effect. Neuroimage, 14(2), 427–438. doi:10.1006/nimg.2001.0812
Calvert, G. A., Spence, C., & Stein, B. E. (2004). The handbook of multisensory processes: MIT Press.
Campbell, R. A., Doubell, T. P., Nodal, F. R., Schnupp, J. W., & King, A. J. (2006). Interaural timing cues do not contribute to the map of space in the ferret superior colliculus: a virtual acoustic space study. J Neurophysiol, 95(1), 242–254.
Campos, E. (1995). Amblyopia. Surv Ophthalmol, 40(1), 23–39.
Canon, L. K. (1970). Intermodality inconsistency of input and directed attention as determinants of the nature of adaptation. J Exp Psychol, 84(1), 141.
Canon, L. K. (1971). Directed attention and maladaptive "adaptation" to displacement of the visual field. J Exp Psychol, 88(3), 403.
Carlton, J., & Kaltenthaler, E. (2011). Amblyopia and quality of life: a systematic review. Eye (Lond), 25(4), 403–413. doi:10.1038/eye.2011.4
Chen, Y. C., Lewis, T. L., Shore, D. I., & Maurer, D. (2017). Early binocular input is critical for development of audiovisual but not visuotactile simultaneity perception. Curr Biol, 27(4), 583–589. doi:10.1016/j.cub.2017.01.009
Chen, Y. C., Shore, D. I., Lewis, T. L., & Maurer, D. (2015). The role of early visual experience in the development of the later perception of audiovisual simultaneity: Evidence from cataract-reversal patients. Paper presented at the Jean Piaget Society, Toronto, ON, Canada.
Chen, Y. C., Shore, D. I., Lewis, T. L., & Maurer, D. (2016). The development of the perception of audiovisual simultaneity. J Exp Child Psychol, 146, 17–33. doi:10.1016/j.jecp.2016.01.010
Chen, Y. C., & Spence, C. (2017). Assessing the role of the 'unity assumption' on multisensory integration: a review. Front Psychol.
Choe, C. S., Welch, R. B., Gilford, R. M., & Juola, J. F. (1975). The “ventriloquist effect”: Visual dominance or response bias? Atten Percept Psychophys, 18(1), 55–60.
Chua, B., & Mitchell, P. (2004). Consequences of amblyopia on education, occupation, and long term vision loss. Br J Ophthalmol, 88(9), 1119–1121. doi:10.1136/bjo.2004.041863
Ciuffreda, K. J., Kenyon, R. V., & Stark, L. (1978). Increased saccadic latencies in amblyopic eyes. Invest Ophthalmol Vis Sci, 17(7), 697–702.
Clifton, R. K., Gwiazda, J., Bauer, J. A., Clarkson, M. G., & Held, R. M. (1988). Growth in head size during infancy: Implications for sound localization. Developmental Psychology, 24(4), 477.
Clifton, R. K., Morrongiello, B. A., Kulig, J. W., & Dowd, J. M. (1981). Newborns' orientation toward sound: Possible implications for cortical development. Child Dev, 833–838.
Colby, C. L., Gattass, R., Olson, C. R., & Gross, C. G. (1988). Topographical organization of cortical afferents to extrastriate visual area PO in the macaque: a dual tracer study. J Comp Neurol, 269(3), 392–413. doi:10.1002/cne.902690307
Collignon, O., Dormal, G., de Heering, A., Lepore, F., Lewis, T. L., & Maurer, D. (2015). Long-lasting crossmodal cortical reorganization triggered by brief postnatal visual deprivation. Curr Biol, 25(18), 2379–2383. doi:10.1016/j.cub.2015.07.036
Colonius, H., & Diederich, A. (2004). Multisensory interaction in saccadic reaction time: a time-window-of-integration model. J Cogn Neurosci, 16(6), 1000–1009. doi:10.1162/0898929041502733
Conner, I. P., Odom, J. V., Schwartz, T. L., & Mendola, J. D. (2007a). Monocular activation of V1 and V2 in amblyopic adults measured with functional magnetic resonance imaging. J AAPOS, 11(4), 341–350. doi:10.1016/j.jaapos.2007.01.119
Conner, I. P., Odom, J. V., Schwartz, T. L., & Mendola, J. D. (2007b). Retinotopic maps and foveal suppression in the visual cortex of amblyopic adults. J Physiol, 583(Pt 1), 159–173. doi:10.1113/jphysiol.2007.136242
Constantinescu, T., Schmidt, L., Watson, R., & Hess, R. F. (2005). A residual deficit for global motion processing after acuity recovery in deprivation amblyopia. Invest Ophthalmol Vis Sci, 46(8), 3008–3012. doi:10.1167/iovs.05-0242
Corneil, B. D., Van Wanrooij, M., Munoz, D. P., & Van Opstal, A. J. (2002). Auditory-visual interactions subserving goal-directed saccades in a complex scene. J Neurophysiol, 88(1), 438–454.
Cuppini, C., Magosso, E., Rowland, B., Stein, B., & Ursino, M. (2012). Hebbian mechanisms help explain development of multisensory integration in the superior colliculus: a neural network model. Biol Cybern, 106(11-12), 691–713. doi:10.1007/s00422-012-0511-9
Cynader, M., & Berman, N. (1972). Receptive-field organization of monkey superior colliculus. J Neurophysiol, 35(2), 187–201.
Davis, S. M., & McCroskey, R. L. (1980). Auditory fusion in children. Child Dev, 51(1), 75–80. doi:10.2307/1129592
Daw, N. W. (1998). Critical periods and amblyopia. Arch Ophthalmol, 116(4), 502–505.
Daw, N. W. (2006). Visual Development: Springer.
de Heering, A., Dormal, G., Pelland, M., Lewis, T., Maurer, D., & Collignon, O. (2016). A brief period of postnatal visual deprivation alters the balance between auditory and visual attention. Curr Biol, 26(22), 3101–3105. doi:10.1016/j.cub.2016.10.014
DeFilippo, C. L., & Snell, K. B. (1986). Detection of a temporal gap in low-frequency narrow-band signals by normal-hearing and hearing-impaired listeners. J Acoust Soc Am, 80(5), 1354–1358.
Demer, J. L., von Noorden, G. K., Volkow, N. D., & Gould, K. L. (1988). Imaging of cerebral blood flow and metabolism in amblyopia by positron emission tomography. Am J Ophthalmol, 105(4), 337–347.
Deneve, S., & Pouget, A. (2004). Bayesian multisensory integration and cross-modal spatial links. J Physiol Paris, 98(1-3), 249–258. doi:10.1016/j.jphysparis.2004.03.011
Desjardins, R. N., & Werker, J. F. (2004). Is the integration of heard and seen speech mandatory for infants? Dev Psychobiol, 45(4), 187–203.
Dixon, N. F., & Spitz, L. (1980). The detection of auditory visual desynchrony. Perception, 9(6), 719–721.
Donnelly, U. M., Stewart, N. M., & Hollinger, M. (2005). Prevalence and outcomes of childhood visual disorders. Ophthalmic Epidemiol, 12(4), 243–250. doi:10.1080/09286580590967772
Donohue, S. E., Woldorff, M. G., & Mitroff, S. R. (2010). Video game players show more precise multisensory temporal processing abilities. Atten Percept Psychophys, 72(4), 1120–1129.
Driver, J. (1996). Enhancement of selective listening by illusory mislocation of speech sounds due to lip-reading. Nature, 381, 66–68.
DuBois, R. M., & Cohen, M. S. (2000). Spatiotopic organization in human superior colliculus observed with fMRI. Neuroimage, 12(1), 63–70. doi:10.1006/nimg.2000.0590
Duffy, K. R., & Mitchell, D. E. (2013). Darkness alters maturation of visual cortex and promotes fast recovery from monocular deprivation. Curr Biol, 23(5), 382–386. doi:10.1016/j.cub.2013.01.017
El-Shamayleh, Y., Kiorpes, L., Kohn, A., & Movshon, J. A. (2010). Visual motion processing by neurons in area MT of macaque monkeys with experimental amblyopia. J Neurosci, 30(36), 12198–12209. doi:10.1523/JNEUROSCI.3055-10.2010
Ellemberg, D., Lewis, T. L., Liu, C. H., & Maurer, D. (1999). Development of spatial and temporal vision during childhood. Vision Res, 39(14), 2325–2333. doi:10.1016/S0042-6989(98)00280-6
Ellemberg, D., Lewis, T. L., Maurer, D., Brar, S., & Brent, H. P. (2002). Better perception of global motion after monocular than after binocular deprivation. Vision Res, 42(2), 169–179.
Engel, G. R., & Dougherty, W. G. (1971). Visual-auditory distance constancy. Nature, 234(5327), 308.
Ernst, M. O. (2008). Multisensory integration: a late bloomer. Curr Biol, 18(12), R519–521. doi:10.1016/j.cub.2008.05.002
Ernst, M. O., & Banks, M. S. (2002). Humans integrate visual and haptic information in a statistically optimal fashion. Nature, 415(6870), 429–433. doi:10.1038/415429a
Ernst, M. O., & Bulthoff, H. H. (2004). Merging the senses into a robust percept. Trends Cogn Sci, 8(4), 162–169. doi:10.1016/j.tics.2004.02.002
Eskelund, K., Tuomainen, J., & Andersen, T. S. (2011). Multistage audiovisual integration of speech: dissociating identification and detection. Exp Brain Res, 208(3), 447–457. doi:10.1007/s00221-010-2495-9
Fechner, G. T. (1889). Elemente der Psychophysik. 2 Bd. Leipzig: Breitkopf & Härtel [2. und 3. unveränderte Auflage, hrsg. von W. Wundt, 1889 und 1907].
Feldman, D. E. (2012). The spike-timing dependence of plasticity. Neuron, 75(4), 556–571. doi:10.1016/j.neuron.2012.08.001
Fendrich, R., & Corballis, P. M. (2001). The temporal cross-capture of audition and vision. Percept Psychophys, 63(4), 719–725.
Fetsch, C. R., Turner, A. H., DeAngelis, G. C., & Angelaki, D. E. (2009). Dynamic reweighting of visual and vestibular cues during self-motion perception. J Neurosci, 29(49), 15601–15612. doi:10.1523/JNEUROSCI.2574-09.2009
Fieger, A., Röder, B., Teder-Salejarvi, W., Hillyard, S. A., & Neville, H. J. (2006). Auditory spatial tuning in late-onset blindness in humans. J Cogn Neurosci, 18(2), 149–157. doi:10.1162/089892906775783697
Fine, I., Wade, A. R., Brewer, A. A., May, M. G., Goodman, D. F., Boynton, G. M., . . . MacLeod, D. I. (2003). Long-term deprivation affects visual perception and cortex. Nat Neurosci, 6(9), 915–916.
Flynn, J. T., Schiffman, J., Feuer, W., & Corona, A. (1998). The therapy of amblyopia: an analysis of the results of amblyopia therapy utilizing the pooled data of published studies. Trans Am Ophthalmol Soc, 96, 431–450; discussion 450–433.
Fong, M. F., Mitchell, D. E., Duffy, K. R., & Bear, M. F. (2016). Rapid recovery from the effects of early monocular deprivation is enabled by temporary inactivation of the retinas. Proc Natl Acad Sci U S A, 113(49), 14139–14144. doi:10.1073/pnas.1613279113
Forster, B., Cavina-Pratesi, C., Aglioti, S. M., & Berlucchi, G. (2002). Redundant target effect and intersensory facilitation from visual-tactile interactions in simple reaction time. Experimental Brain Research, 143(4), 480–487.
Foucher, J. R., Lacambre, M., Pham, B. T., Giersch, A., & Elliott, M. A. (2007). Low time resolution in schizophrenia: Lengthened windows of simultaneity for visual, auditory and bimodal stimuli. Schizophr Res, 97(1-3), 118–127. doi:10.1016/j.schres.2007.08.013
Frassinetti, F., Bolognini, N., & Ladavas, E. (2002). Enhancement of visual perception by crossmodal visuo-auditory interaction. Exp Brain Res, 147(3), 332–343. doi:10.1007/s00221-002-1262-y
Freeman, R., & Bradley, A. (1980). Monocularly deprived humans: nondeprived eye has supernormal vernier acuity. J Neurophysiol, 43(6), 1645–1653.
Freides, D. (1974). Human information processing and sensory modality: cross-modal functions, information complexity, memory, and deficit. Psychol Bull, 81(5), 284–310.
Frens, M. A., Van Opstal, A. J., & Van Der Willigen, R. F. (1995). Spatial and temporal factors determine auditory-visual interactions in human saccadic eye movements. Percept Psychophys, 57(6), 802–816. doi:10.3758/BF03206796
Frey, R. D. (1990). Selective attention, event perception and the criterion of acceptability principle: Evidence supporting and rejecting the doctrine of prior entry. Human Movement Science, 9(3), 481–530.
Friedman, D. S., Repka, M. X., Katz, J., Giordano, L., Ibironke, J., Hawse, P., & Tielsch, J. M. (2009). Prevalence of amblyopia and strabismus in white and African American children aged 6 through 71 months: the Baltimore Pediatric Eye Disease Study. Ophthalmology, 116(11), 2128–2134 e2121–2122. doi:10.1016/j.ophtha.2009.04.034
Fronius, M., Sireteanu, R., & Zubcov, A. (2004). Deficits of spatial localization in children with strabismic amblyopia. Graefes Arch Clin Exp Ophthalmol, 242(10), 827–839. doi:10.1007/s00417-004-0936-5
Fujisaki, W., & Nishida, S. (2005). Temporal frequency characteristics of synchrony-asynchrony discrimination of audio-visual signals. Exp Brain Res, 166(3-4), 455–464. doi:10.1007/s00221-005-2385-8
Fujisaki, W., Shimojo, S., Kashino, M., & Nishida, S. (2004). Recalibration of audiovisual simultaneity. Nat Neurosci, 7(7), 773–778. doi:10.1038/nn1268
Gardner, M. B., & Gardner, R. S. (1973). Problem of localization in the median plane: effect of pinnae cavity occlusion. J Acoust Soc Am, 53(2), 400–408.
Gebhard, J. W., & Mowbray, G. H. (1959). On discriminating the rate of visual flicker and auditory flutter. Am J Psychol, 72(4), 521–529. doi:10.2307/1419493
Giaschi, D. E., Regan, D., Kraft, S. P., & Hong, X. H. (1992). Defective processing of motion-defined form in the fellow eye of patients with unilateral amblyopia. Invest Ophthalmol Vis Sci, 33(8), 2483–2489.
Gibson, J. J. (1966). The Senses Considered as Perceptual Systems. Oxford, UK: Houghton Mifflin.
Godfroy, M., Roumes, C., & Dauchy, P. (2003). Spatial variations of visual-auditory fusion areas. Perception, 32(10), 1233–1245.
Gonzalez, E. G., Wong, A. M., Niechwiej-Szwedo, E., Tarita-Nistor, L., & Steinbach, M. J. (2012). Eye position stability in amblyopia and in normal binocular vision. Invest Ophthalmol Vis Sci, 53(9), 5386–5394. doi:10.1167/iovs.12-9941
Goodyear, B. G., Nicolle, D. A., Humphrey, G. K., & Menon, R. S. (2000). BOLD fMRI response of early visual areas to perceived contrast in human amblyopia. J Neurophysiol, 84(4), 1907–1913.
Gori, M. (2015). Multisensory integration and calibration in children and adults with and without sensory and motor disabilities. Multisens Res, 28(1-2), 71–99. doi:10.1163/22134808-00002478
Gori, M., Del Viva, M., Sandini, G., & Burr, D. C. (2008). Young children do not integrate visual and haptic form information. Curr Biol, 18(9), 694–698. doi:10.1016/j.cub.2008.04.036
Gori, M., Sandini, G., & Burr, D. (2012). Development of visuo-auditory integration in space and time. Front Integr Neurosci, 6, 77. doi:10.3389/fnint.2012.00077
Gori, M., Sandini, G., Martinoli, C., & Burr, D. (2010). Poor haptic orientation discrimination in nonsighted children may reflect disruption of cross-sensory calibration. Curr Biol, 20(3), 223–225. doi:10.1016/j.cub.2009.11.069
Gori, M., Sandini, G., Martinoli, C., & Burr, D. C. (2014). Impairment of auditory spatial localization in congenitally blind human subjects. Brain, 137(Pt 1), 288–293. doi:10.1093/brain/awt311
Gori, M., Tinelli, F., Sandini, G., Cioni, G., & Burr, D. (2012). Impaired visual size-discrimination in children with movement disorders. Neuropsychologia, 50(8), 1838–1843. doi:10.1016/j.neuropsychologia.2012.04.009
Gorman, J. J., Cogan, D. G., & Gellis, S. S. (1957). An apparatus for grading the visual acuity of infants on the basis of opticokinetic nystagmus. Pediatrics, 19(6), 1088–1092.
Grant, K. W., & Seitz, P. F. (2000). The use of visible speech cues for improving auditory detection of spoken sentences. J Acoust Soc Am, 108(3), 1197–1208.
Grant, K. W., Walden, B. E., & Seitz, P. F. (1998). Auditory-visual speech recognition by hearing-impaired subjects: consonant recognition, sentence recognition, and auditory-visual integration. J Acoust Soc Am, 103(5 Pt 1), 2677–2690.
Grant, S., Melmoth, D. R., Morgan, M. J., & Finlay, A. L. (2007). Prehension deficits in amblyopia. Invest Ophthalmol Vis Sci, 48(3), 1139–1148. doi:10.1167/iovs.06-0976
Green, A. M., & Angelaki, D. E. (2010). Multisensory integration: resolving sensory ambiguities to build novel representations. Curr Opin Neurobiol, 20(3), 353–360. doi:10.1016/j.conb.2010.04.009
Green, D. M., & Swets, J. A. (1966). Signal Detection Theory and Psychophysics. New York: Wiley.
Groh, J. M., Kelly, K. A., & Underhill, A. M. (2003). A monotonic code for sound azimuth in primate inferior colliculus. J Cogn Neurosci, 15(8), 1217–1231. doi:10.1162/089892903322598166
Groh, J. M., & Sparks, D. L. (1996). Saccades to somatosensory targets. III. eye-position-dependent somatosensory activity in primate superior colliculus. J Neurophysiol, 75(1), 439–453.
Grothe, B., Pecka, M., & McAlpine, D. (2010). Mechanisms of sound localization in mammals. Physiol Rev, 90(3), 983–1012. doi:10.1152/physrev.00026.2009
Guerreiro, M. J., Putzar, L., & Röder, B. (2015). The effect of early visual deprivation on the neural bases of multisensory processing. Brain, 138(Pt 6), 1499–1504. doi:10.1093/brain/awv076
Guerreiro, M. J., Putzar, L., & Röder, B. (2016). Persisting cross-modal changes in sight-recovery individuals modulate visual perception. Curr Biol, 26(22), 3096–3100. doi:10.1016/j.cub.2016.08.069
Hadad, B., Schwartz, S., Maurer, D., & Lewis, T. L. (2015). Motion perception: a review of developmental changes and the role of early visual experience. Front Integr Neurosci, 9, 49. doi:10.3389/fnint.2015.00049
Hadad, B. S., Maurer, D., & Lewis, T. L. (2011). Long trajectory for the development of sensitivity to global and biological motion. Dev Sci, 14(6), 1330–1339. doi:10.1111/j.1467-7687.2011.01078.x
Hadjikhani, N., & Roland, P. E. (1998). Cross-modal transfer of information between the tactile and the visual representations in the human brain: a positron emission tomographic study. J Neurosci, 18(3), 1072–1084.
Hairston, W. D., Burdette, J. H., Flowers, D. L., Wood, F. B., & Wallace, M. T. (2005). Altered temporal profile of visual-auditory multisensory interactions in dyslexia. Exp Brain Res, 166(3-4), 474–480. doi:10.1007/s00221-005-2387-6
Hamed, L., Glaser, J., & Schatz, N. (1991). Improvement of vision in the amblyopic eye following visual loss in the contralateral normal eye: a report of three cases. Binoc Vis, 6, 97–100.
Hariharan, S., Levi, D. M., & Klein, S. A. (2005). “Crowding” in normal and amblyopic vision assessed with Gaussian and Gabor C’s. Vision Res, 45(5), 617–633. doi:10.1016/j.visres.2004.09.035
Harrad, R., & Hess, R. (1992). Binocular integration of contrast information in amblyopia. Vision Res, 32(11), 2135–2150.
Harrington, L., & Peck, C. (1998). Spatial disparity affects visual-auditory interactions in human sensorimotor processing. Experimental Brain Research, 122(2), 247–252.
Hartline, P. H., Vimal, R. P., King, A., Kurylo, D., & Northmore, D. (1995). Effects of eye position on auditory localization and neural representation of space in superior colliculus of cats. Experimental Brain Research, 104(3), 402–408.
Harwerth, R. S., Smith, E. L., 3rd, Boltz, R. L., Crawford, M. L., & von Noorden, G. K. (1983). Behavioral studies on the effect of abnormal early visual experience in monkeys: temporal modulation sensitivity. Vision Res, 23(12), 1511–1517.
Harwerth, R. S., Smith, E. L., 3rd, Duncan, G. C., Crawford, M. L., & von Noorden, G. K. (1986). Multiple sensitive periods in the development of the primate visual system. Science, 232(4747), 235–238.
He, H.-Y., Ray, B., Dennis, K., & Quinlan, E. M. (2007). Experience-dependent recovery of vision following chronic deprivation amblyopia. Nat Neurosci, 10(9), 1134–1136. doi:10.1038/nn1965
He, S., Cavanagh, P., & Intriligator, J. (1996). Attentional resolution and the locus of visual awareness. Nature, 383(6598), 334.
Heming, J. E., & Brown, L. N. (2005). Sensory temporal processing in adults with early hearing loss. Brain Cogn, 59(2), 173–182. doi:10.1016/j.bandc.2005.05.012
Hendrickson, A. E., Movshon, J. A., Eggers, H. M., Gizzi, M. S., Boothe, R. G., & Kiorpes, L. (1987). Effects of early unilateral blur on the macaque's visual system. II. Anatomical observations. J Neurosci, 7(5), 1327–1339.
Heng, S., & Dutton, G. N. (2011). The Pulfrich effect in the clinic. Graefes Arch Clin Exp Ophthalmol, 249(6), 801–808. doi:10.1007/s00417-011-1689-6
Hershenson, M. (1962). Reaction time as a measure of intersensory facilitation. J Exp Psychol, 63(3), 289.
Hess, R. F. (2001). Amblyopia: site unseen. Clin Exp Optom, 84(6), 321–336.
Hess, R. F., Demanins, R., & Bex, P. J. (1997). A reduced motion aftereffect in strabismic amblyopia. Vision Res, 37(10), 1303–1311. doi:10.1016/S0042-6989(96)00277-5
Hess, R. F., & Holliday, I. E. (1992). The spatial localization deficit in amblyopia. Vision Res, 32(7), 1319–1339.
Hess, R. F., & Howell, E. R. (1977). The threshold contrast sensitivity function in strabismic amblyopia: evidence for a two type classification. Vision Res, 17(9), 1049–1055.
Hess, R. F., & Pointer, J. S. (1985). Differences in the neural basis of human amblyopia: The distribution of the anomaly across the visual field. Vision Res, 25(11), 1577–1594. doi:10.1016/0042-6989(85)90128-2
Hess, R. F., Wang, Y. Z., Demanins, R., Wilkinson, F., & Wilson, H. R. (1999). A deficit in strabismic amblyopia for global shape detection. Vision Res, 39(5), 901–914.
Hillock-Dunn, A., & Wallace, M. T. (2012). Developmental changes in the multisensory temporal binding window persist into adolescence. Dev Sci, 15(5), 688–696. doi:10.1111/j.1467-7687.2012.01171.x
Hillock, A. R., Powers, A. R., & Wallace, M. T. (2011). Binding of sights and sounds: age-related changes in multisensory temporal processing. Neuropsychologia, 49(3), 461–467. doi:10.1016/j.neuropsychologia.2010.11.041
Hirsh, I. J. (1959). Auditory perception of temporal order. J Acoust Soc Am, 31(6), 759–767.
Hirsh, I. J., & Sherrick, C. E., Jr. (1961). Perceived order in different sense modalities. J Exp Psychol, 62(5), 423–432.
Ho, C. S., Giaschi, D. E., Boden, C., Dougherty, R., Cline, R., & Lyons, C. (2005). Deficient motion perception in the fellow eye of amblyopic children. Vision Res, 45(12), 1615–1627. doi:10.1016/j.visres.2004.12.009
Ho, C. S., Paul, P. S., Asirvatham, A., Cavanagh, P., Cline, R., & Giaschi, D. E. (2006). Abnormal spatial selection and tracking in children with amblyopia. Vision Res, 46(19), 3274–3283. doi:10.1016/j.visres.2006.03.029
Hofman, P. M., Van Riswick, J. G., & Van Opstal, A. J. (1998). Relearning sound localization with new ears. Nat Neurosci, 1(5), 417–421. doi:10.1038/1633
Hogan, S. C., & Moore, D. R. (2003). Impaired binaural hearing in children produced by a threshold level of middle ear disease. J Assoc Res Otolaryngol, 4(2), 123–129. doi:10.1007/s10162-002-3007-9
Holmes, J. M., Beck, R. W., Repka, M. X., Leske, D. A., Kraker, R. T., Blair, R. C., . . . Pediatric Eye Disease Investigator Group. (2001). The amblyopia treatment study visual acuity testing protocol. Arch Ophthalmol, 119(9), 1345–1353.
Holmes, J. M., & Clarke, M. P. (2006). Amblyopia. Lancet, 367(9519), 1343–1351.
Holmes, J. M., Lazar, E. L., Melia, B. M., Astle, W. F., Dagi, L. R., Donahue, S. P., . . . Pediatric Eye Disease Investigator Group. (2011). Effect of age on response to amblyopia treatment in children. Arch Ophthalmol, 129(11), 1451–1457. doi:10.1001/archophthalmol.2011.179
Holmes, J. M., Manh, V. M., Lazar, E. L., Beck, R. W., Birch, E. E., Kraker, R. T., . . . Pediatric Eye Disease Investigator Group. (2016). Effect of a binocular iPad game vs part-time patching in children aged 5 to 12 years with amblyopia: a randomized clinical trial. JAMA Ophthalmol, 134(12), 1391–1400. doi:10.1001/jamaophthalmol.2016.4262
Holmes, N. P., & Spence, C. (2005). Multisensory integration: space, time and superadditivity. Curr Biol, 15(18), R762–764. doi:10.1016/j.cub.2005.08.058
Hoover, A. E., Harris, L. R., & Steeves, J. K. (2012). Sensory compensation in sound localization in people with one eye. Exp Brain Res, 216(4), 565–574. doi:10.1007/s00221-011-2960-0
Horton, J. C., & Hocking, D. R. (1996). Pattern of ocular dominance columns in human striate cortex in strabismic amblyopia. Vis Neurosci, 13(4), 787–795.
Horton, J. C., & Hocking, D. R. (1997). Timing of the critical period for plasticity of ocular dominance columns in macaque striate cortex. J Neurosci, 17(10), 3684–3709.
Horton, J. C., & Stryker, M. P. (1993). Amblyopia induced by anisometropia without shrinkage of ocular dominance columns in human striate cortex. Proc Natl Acad Sci U S A, 90(12), 5494–5498.
Horwood, J., Waylen, A., Herrick, D., Williams, C., & Wolke, D. (2005). Common visual defects and peer victimization in children. Invest Ophthalmol Vis Sci, 46(4), 1177–1181. doi:10.1167/iovs.04-0597
Hötting, K., & Röder, B. (2009). Auditory and auditory-tactile processing in congenitally blind humans. Hear Res, 258(1-2), 165–174. doi:10.1016/j.heares.2009.07.012
Howard, I. P., & Templeton, W. B. (1966). Human Spatial Orientation. Oxford, England: John Wiley.
Huang, P. C., Li, J., Deng, D., Yu, M., & Hess, R. F. (2012). Temporal synchrony deficits in amblyopia. Invest Ophthalmol Vis Sci, 53(13), 8325–8332. doi:10.1167/iovs.12-10835
Hubel, D. H., & Wiesel, T. N. (1970). The period of susceptibility to the physiological effects of unilateral eye closure in kittens. J Physiol, 206(2), 419–436.
Hubel, D. H., Wiesel, T. N., & LeVay, S. (1977). Plasticity of ocular dominance columns in monkey striate cortex. Philos Trans R Soc Lond B Biol Sci, 278(961), 377–409.
Hughes, H. C., Reuter-Lorenz, P. A., Nozawa, G., & Fendrich, R. (1994). Visual-auditory interactions in sensorimotor processing: saccades versus manual responses. J Exp Psychol Hum Percept Perform, 20(1), 131–153.
Imamura, K., Richter, H., Fischer, H., Lennerstrand, G., Franzen, O., Rydberg, A., . . . Langstrom, B. (1997). Reduced activity in the extrastriate visual cortex of individuals with strabismic amblyopia. Neurosci Lett, 225(3), 173–176.
Intriligator, J., & Cavanagh, P. (2001). The spatial resolution of visual attention. Cogn Psychol, 43(3), 171–216. doi:10.1006/cogp.2001.0755
Irwin, R. J., Ball, A. K., Kay, N., Stillman, J. A., & Rosser, J. (1985). The development of auditory temporal acuity in children. Child Dev, 56(3), 614–620.
Jacobs, R. A. (2002). What determines visual cue reliability? Trends Cogn Sci, 6(8), 345–350.
Jay, M. F., & Sparks, D. L. (1987a). Sensorimotor integration in the primate superior colliculus. I. Motor convergence. J Neurophysiol, 57(1), 22–34.
Jay, M. F., & Sparks, D. L. (1987b). Sensorimotor integration in the primate superior colliculus. II. Coordinates of auditory signals. J Neurophysiol, 57(1), 35–55.
Jiang, F., Stecker, G. C., Boynton, G. M., & Fine, I. (2016). Early blindness results in developmental plasticity for auditory motion processing within auditory and occipital cortex. Front Hum Neurosci, 10, 324. doi:10.3389/fnhum.2016.00324
Jiang, W., Jiang, H., & Stein, B. E. (2006). Neonatal cortical ablation disrupts multisensory development in superior colliculus. J Neurophysiol, 95(3), 1380–1396.
Jiang, W., Wallace, M. T., Jiang, H., Vaughan, J. W., & Stein, B. E. (2001). Two cortical areas mediate multisensory integration in superior colliculus neurons. J Neurophysiol, 85(2), 506–522.
Jones, J. A., & Munhall, K. G. (1997). Effects of separating auditory and visual sources on audiovisual integration of speech. Canadian Acoustics, 25(4), 13–19.
Jones, K. R., Spear, P. D., & Tong, L. (1984). Critical periods for effects of monocular deprivation: differences between striate and extrastriate cortex. J Neurosci, 4(10), 2543–2552.
Kanabus, M., Szelag, E., Rojek, E., & Poppel, E. (2002). Temporal order judgement for auditory and visual stimuli. Acta Neurobiol Exp (Wars), 62(4), 263–270.
Kandel, G. L., Grattan, P. E., & Bedell, H. E. (1980). Are the dominant eyes of amblyopes normal? Am J Optom Physiol Opt, 57(1), 1–6.
Kanonidou, E., Proudlock, F. A., & Gottlob, I. (2010). Reading strategies in mild to moderate strabismic amblyopia: an eye movement investigation. Invest Ophthalmol Vis Sci, 51(7), 3502–3508. doi:10.1167/iovs.09-4236
Kasser, M., & Feldman, J. (1953). Amblyopia in adults: treatment of those engaged in the various industries. Am J Ophthalmol, 36(10), 1443–1446.
Keetels, M., & Vroomen, J. (2005). The role of spatial disparity and hemifields in audio-visual temporal order judgments. Exp Brain Res, 167(4), 635–640. doi:10.1007/s00221-005-0067-1
Keetels, M., & Vroomen, J. (2011). No effect of synesthetic congruency on temporal ventriloquism. Atten Percept Psychophys, 73(1), 209–218. doi:10.3758/s13414-010-0019-0
Kelly, J. P., Tarczy-Hornoch, K., Herlihy, E., & Weiss, A. H. (2015). Occlusion therapy improves phase-alignment of the cortical response in amblyopia. Vision Res, 114, 142–150. doi:10.1016/j.visres.2014.11.014
Kelly, K. R., Jost, R. M., De La Cruz, A., & Birch, E. E. (2015). Amblyopic children read more slowly than controls under natural, binocular reading conditions. J AAPOS, 19(6), 515–520. doi:10.1016/j.jaapos.2015.09.002
Kersten, D., & Yuille, A. (2003). Bayesian models of object perception. Curr Opin Neurobiol, 13(2), 150–158. doi:http://dx.doi.org/10.1016/S0959-4388(03)00042-4
Keuroghlian, A. S., & Knudsen, E. I. (2007). Adaptive auditory plasticity in developing and adult animals. Prog Neurobiol, 82(3), 109–121. doi:10.1016/j.pneurobio.2007.03.005
Khavarghazalani, B., Farahani, F., Emadi, M., & Hosseni Dastgerdi, Z. (2016). Auditory processing abilities in children with chronic otitis media with effusion. Acta Otolaryngol, 136(5), 456–459. doi:10.3109/00016489.2015.1129552
Killan, C. F., Royle, N., Totten, C. L., Raine, C. H., & Lovett, R. E. (2015). The effect of early auditory experience on the spatial listening skills of children with bilateral cochlear implants. Int J Pediatr Otorhinolaryngol, 79(12), 2159–2165. doi:10.1016/j.ijporl.2015.09.039
King, A. J. (2004). The superior colliculus. Curr Biol, 14(9), R335–338. doi:10.1016/j.cub.2004.04.018
King, A. J. (2009). Visual influences on auditory spatial learning. Philos Trans R Soc Lond B Biol Sci, 364(1515), 331–339. doi:10.1098/rstb.2008.0230
King, A. J., & Carlile, S. (1993). Changes induced in the representation of auditory space in the superior colliculus by rearing ferrets with binocular eyelid suture. Exp Brain Res, 94(3), 444–455.
King, A. J., Hutchings, M. E., Moore, D. R., & Blakemore, C. (1988). Developmental plasticity in the visual and auditory representations in the mammalian superior colliculus. Nature, 332(6159), 73–76. doi:10.1038/332073a0
King, A. J., & Palmer, A. R. (1983). Cells responsive to free-field auditory stimuli in guinea-pig superior colliculus: distribution and response properties. J Physiol, 342(1), 361–381.
King, A. J., & Palmer, A. R. (1985). Integration of visual and auditory information in bimodal neurones in the guinea-pig superior colliculus. Exp Brain Res, 60(3), 492–500.
King, A. J., Parsons, C. H., & Moore, D. R. (2000). Plasticity in the neural coding of auditory space in the mammalian brain. Proc Natl Acad Sci U S A, 97(22), 11821–11828. doi:10.1073/pnas.97.22.11821
Kiorpes, L., Kiper, D. C., O'Keefe, L. P., Cavanaugh, J. R., & Movshon, J. A. (1998). Neuronal correlates of amblyopia in the visual cortex of macaque monkeys with experimental strabismus and anisometropia. J Neurosci, 18(16), 6411–6424.
Kishimoto, F., Fujii, C., Shira, Y., Hasebe, K., Hamasaki, I., & Ohtsuki, H. (2014). Outcome of conventional treatment for adult amblyopia. Jpn J Ophthalmol, 58(1), 26–32. doi:10.1007/s10384-013-0279-z
Klaeger-Manzanell, C., Hoyt, C. S., & Good, W. V. (1994). Two step recovery of vision in the amblyopic eye after visual loss and enucleation of the fixing eye. Br J Ophthalmol, 78(6), 506–507.
Klemm, O. (1920). Untersuchungen über die Lokalisation von Schallreizen IV: über den Einfluss des binauralen Zeitunterschieds auf die Lokalisation. Arch ges Psychol, 40, 117–145.
Klumpp, R., & Eady, H. (1956). Some measurements of interaural time difference thresholds. J Acoust Soc Am, 28(5), 859–860.
Knudsen, E. I., & Brainard, M. S. (1991). Visual instruction of the neural map of auditory space in the developing optic tectum. Science, 253(5015), 85–87.
Knudsen, E. I., Esterly, S. D., & Knudsen, P. F. (1984). Monaural occlusion alters sound localization during a sensitive period in the barn owl. J Neurosci, 4(4), 1001–1011.
Knudsen, E. I., & Knudsen, P. F. (1986). The sensitive period for auditory localization in barn owls is limited by age, not by experience. J Neurosci, 6(7), 1918–1924.
Knudsen, E. I., & Knudsen, P. F. (1989). Vision calibrates sound localization in developing barn owls. J Neurosci, 9(9), 3306–3313.
Knudsen, E. I., & Knudsen, P. F. (1990). Sensitive and critical periods for visual calibration of sound localization by barn owls. J Neurosci, 10(1), 222–232.
Knudsen, E. I., Knudsen, P. F., & Esterly, S. D. (1984). A critical period for the recovery of sound localization accuracy following monaural occlusion in the barn owl. J Neurosci, 4(4), 1012–1020.
Kording, K. P., Beierholm, U., Ma, W. J., Quartz, S., Tenenbaum, J. B., & Shams, L. (2007). Causal inference in multisensory perception. PloS one, 2(9), e943. doi:10.1371/journal.pone.0000943
Kovacs, I., Polat, U., Pennefather, P. M., Chandna, A., & Norcia, A. M. (2000). A new test of contour integration deficits in patients with a history of disrupted binocular experience during visual development. Vision Res, 40(13), 1775–1783.
Krueger, D., & Ederer, F. (1984). Report on the National Eye Institute's visual acuity impairment survey pilot study. Bethesda, MD: Office of Biometry and Epidemiology, National Eye Institute, National Institutes of Health, Public Health Service, Department of Health and Human Services.
Kugelberg, U. (1992). Visual acuity following treatment of bilateral congenital cataracts. Doc Ophthalmol, 82(3), 211–215.
Kumpik, D. P., Kacelnik, O., & King, A. J. (2010). Adaptive reweighting of auditory localization cues in response to chronic unilateral earplugging in humans. J Neurosci, 30(14), 4883–4894. doi:10.1523/JNEUROSCI.5488-09.2010
Kupfer, C. (1957). Treatment of amblyopia ex anopsia in adults: a preliminary report of seven cases. Am J Ophthalmol, 43(6), 918–922.
Kushnerenko, E., Teinonen, T., Volein, A., & Csibra, G. (2008). Electrophysiological evidence of illusory audiovisual speech percept in human infants. Proc Natl Acad Sci U S A, 105(32), 11442–11445. doi:10.1073/pnas.0804275105
Lalonde, K., & Holt, R. F. (2016). Audiovisual speech perception development at varying levels of perceptual processing. J Acoust Soc Am, 139(4), 1713–1723.
Lambert, S. R., Buckley, E. G., Drews-Botsch, C., DuBois, L., Hartmann, E., Lynn, M. J., . . . Wilson, M. E. (2010). The infant aphakia treatment study: design and clinical measures at enrollment. Arch Ophthalmol, 128(1), 21–27. doi:10.1001/archophthalmol.2009.350
Lambert, S. R., DuBois, L., Cotsonis, G., Hartmann, E. E., & Drews-Botsch, C. (2016). Factors associated with stereopsis and a good visual acuity outcome among children in the Infant Aphakia Treatment Study. Eye (Lond), 30(9), 1221–1228. doi:10.1038/eye.2016.164
Lane, R., Allman, J., Kaas, J., & Miezin, F. (1973). The visuotopic organization of the superior colliculus of the owl monkey (Aotus trivirgatus) and the bush baby (Galago senegalensis). Brain Res, 60(2), 335–349.
Lea, S. J. H., Loades, J., & Rubinstein, M. P. (1989). The sensitive period for anisometropic amblyopia. Eye (Lond), 3(6), 783–790.
Lee, H., & Noppeney, U. (2011a). Long-term music training tunes how the brain temporally binds signals from multiple senses. Proc Natl Acad Sci U S A, 108(51), E1441–1450. doi:10.1073/pnas.1115267108
Lee, H., & Noppeney, U. (2011b). Physical and perceptual factors shape the neural mechanisms that integrate audiovisual signals in speech comprehension. J Neurosci, 31(31), 11338–11350. doi:10.1523/JNEUROSCI.6510-10.2011
Leguire, L. E., Rogers, G. L., & Bremer, D. L. (1990). Amblyopia: the normal eye is not normal. J Pediatr Ophthalmol Strabismus, 27(1), 32–38; discussion 39.
Lessard, N., Pare, M., Lepore, F., & Lassonde, M. (1998). Early-blind human subjects localize sound sources better than sighted subjects. Nature, 395(6699), 278–280.
Levi, D. M. (2013). Linking assumptions in amblyopia. Vis Neurosci, 30(5-6), 277–287. doi:10.1017/S0952523813000023
Levi, D. M., Hariharan, S., & Klein, S. A. (2002). Suppressive and facilitatory spatial interactions in amblyopic vision. Vision Res, 42(11), 1379–1394. doi:http://dx.doi.org/10.1016/S0042-6989(02)00061-5
Levi, D. M., & Harwerth, R. S. (1977). Spatio-temporal interactions in anisometropic and strabismic amblyopia. Invest Ophthalmol Vis Sci, 16(1), 90–95.
Levi, D. M., & Klein, S. A. (1985). Vernier acuity, crowding and amblyopia. Vision Res, 25(7), 979–991. doi:http://dx.doi.org/10.1016/0042-6989(85)90208-1
Levi, D. M., & Klein, S. A. (2003). Noise provides some new signals about the spatial vision of amblyopes. J Neurosci, 23(7), 2522–2526.
Levi, D. M., Klein, S. A., & Chen, I. (2007). The response of the amblyopic visual system to noise. Vision Res, 47(19), 2531–2542. doi:10.1016/j.visres.2007.06.014
Levi, D. M., Klein, S. A., & Chen, I. (2008). What limits performance in the amblyopic visual system: seeing signals in noise with an amblyopic brain. J Vis, 8(4), 1 1–23. doi:10.1167/8.4.1
Levi, D. M., Klein, S. A., & Yap, Y. L. (1987). Positional uncertainty in peripheral and amblyopic vision. Vision Res, 27(4), 581–597.
Levi, D. M., & Polat, U. (1996). Neural plasticity in adults with amblyopia. Proc Natl Acad Sci U S A, 93(13), 6830–6834.
Levi, D. M., Waugh, S. J., & Beard, B. L. (1994). Spatial scale shifts in amblyopia. Vision Res, 34(24), 3315–3333.
Lewald, J., Ehrenstein, W. H., & Guski, R. (2001). Spatio-temporal constraints for auditory-visual integration. Behav Brain Res, 121(1-2), 69–79.
Lewald, J., & Guski, R. (2003). Cross-modal perceptual integration of spatially and temporally disparate auditory and visual stimuli. Brain Res Cogn Brain Res, 16(3), 468–478. doi:10.1016/s0926-6410(03)00074-0
Lewis, J. W., Beauchamp, M. S., & DeYoe, E. A. (2000). A comparison of visual and auditory motion processing in human cerebral cortex. Cereb Cortex, 10(9), 873–888.
Lewis, T. L., & Maurer, D. (2005). Multiple sensitive periods in human visual development: evidence from visually deprived children. Dev Psychobiol, 46(3), 163–183. doi:10.1002/dev.20055
Lewkowicz, D. J. (2000). The development of intersensory temporal perception: an epigenetic systems/limitations view. Psychol Bull, 126(2), 281–308.
Lewkowicz, D. J., & Flom, R. (2014). The audiovisual temporal binding window narrows in early childhood. Child Dev, 85(2), 685–694. doi:10.1111/cdev.12142
Lewkowicz, D. J., & Hansen-Tift, A. M. (2012). Infants deploy selective attention to the mouth of a talking face when learning speech. Proc Natl Acad Sci U S A, 109(5), 1431–1436. doi:10.1073/pnas.1114783109
Lewkowicz, D. J., & Lickliter, R. (1994). The Development of Intersensory Perception: Comparative Perspectives. Hillsdale, NJ: Erlbaum.
Lewkowicz, D. J., & Turkewitz, G. (1980). Cross-modal equivalence in early infancy: Auditory–visual intensity matching. Developmental Psychology, 16(6), 597.
Li, J., Thompson, B., Lam, C. S., Deng, D., Chan, L. Y., Maehara, G., . . . Hess, R. F. (2011). The role of suppression in amblyopia. Invest Ophthalmol Vis Sci, 52(7), 4169–4176. doi:10.1167/iovs.11-7233
Li, R. W., Ngo, C., Nguyen, J., & Levi, D. M. (2011). Video-game play induces plasticity in the visual system of adults with amblyopia. PLoS Biol, 9(8), e1001135. doi:10.1371/journal.pbio.1001135
Li, X., Dumoulin, S. O., Mansouri, B., & Hess, R. F. (2007). Cortical deficits in human amblyopia: their regional distribution and their relationship to the contrast detection deficit. Invest Ophthalmol Vis Sci, 48(4), 1575–1591.
Li, X., Mullen, K. T., Thompson, B., & Hess, R. F. (2011). Effective connectivity anomalies in human amblyopia. Neuroimage, 54(1), 505–516. doi:10.1016/j.neuroimage.2010.07.053
Litovsky, R. Y. (1997). Developmental changes in the precedence effect: estimates of minimum audible angle. J Acoust Soc Am, 102(3), 1739–1745.
Litovsky, R. Y. (2005). Speech intelligibility and spatial release from masking in young children. J Acoust Soc Am, 117(5), 3091–3099.
Litovsky, R. Y., & Ashmead, D. H. (1997). Development of binaural and spatial hearing in infants and children. In R. H. Gilkey & T. R. Anderson (Eds.), Binaural and spatial hearing in real and virtual environments (pp. 571–592). Mahwah, N.J.: Lawrence Erlbaum Associates.
Litovsky, R. Y., Fligor, B. J., & Tramo, M. J. (2002). Functional role of the human inferior colliculus in binaural hearing. Hear Res, 165(1-2), 177–188. doi:http://dx.doi.org/10.1016/S0378-5955(02)00304-0
Lovelace, C. T., Stein, B. E., & Wallace, M. T. (2003). An irrelevant light enhances auditory detection in humans: a psychophysical analysis of multisensory integration in stimulus detection. Brain Res Cogn Brain Res, 17(2), 447–453.
Löwel, S., & Engelmann, R. (2002). Neuroanatomical and neurophysiological consequences of strabismus: changes in the structural and functional organization of the primary visual cortex in cats with alternating fixation and strabismic amblyopia. Strabismus, 10(2), 95–105.
Lyons-Ruth, K. (1977). Bimodal perception in infancy: Response to auditory-visual incongruity. Child Dev, 820–827.
Macaluso, E. (2006). Multisensory processing in sensory-specific cortical areas. Neuroscientist, 12(4), 327–338. doi:10.1177/1073858406287908
Macaluso, E., & Driver, J. (2001). Spatial attention and crossmodal interactions between vision and touch. Neuropsychologia, 39(12), 1304–1316. doi:10.1016/S0028-3932(01)00119-1
Macaluso, E., Frith, C., & Driver, J. (2000). Selective spatial attention in vision and touch: unimodal and multimodal mechanisms revealed by PET. J Neurophysiol, 83(5), 3062–3075.
Macaluso, E., Frith, C. D., & Driver, J. (2000). Modulation of human visual cortex by crossmodal spatial attention. Science, 289(5482), 1206–1208.
Macaluso, E., George, N., Dolan, R., Spence, C., & Driver, J. (2004). Spatial and temporal factors during processing of audiovisual speech: a PET study. Neuroimage, 21(2), 725–732. doi:10.1016/j.neuroimage.2003.09.049
MacDonald, J., Andersen, S., & Bachmann, T. (2000). Hearing by eye: how much spatial degradation can be tolerated? Perception, 29(10), 1155–1168.
Magnotti, J. F., & Beauchamp, M. S. (2017). A causal inference model explains perception of the McGurk effect and other incongruent audiovisual speech. PLoS Comput Biol, 13(2), e1005229. doi:10.1371/journal.pcbi.1005229
Makous, J. C., & Middlebrooks, J. C. (1990). Two-dimensional sound localization by human listeners. J Acoust Soc Am, 87(5), 2188–2200.
Mansouri, B., & Hess, R. F. (2006). The global processing deficit in amblyopia involves noise segregation. Vision Res, 46(24), 4104–4117. doi:10.1016/j.visres.2006.07.017
Marks, L. E. (1978). The Unity of the Senses. New York: Academic Press.
Martin, B., Giersch, A., Huron, C., & van Wassenhove, V. (2013). Temporal event structure and timing in schizophrenia: preserved binding in a longer "now". Neuropsychologia, 51(2), 358–371. doi:10.1016/j.neuropsychologia.2012.07.002
Mathews, S., Yager, D., Ciuffreda, K. J., & Ettinger, E. R. (1987). Spatial frequency discrimination in anisometropic and strabismic amblyopia. Appl Opt, 26(8), 1432–1436. doi:10.1364/AO.26.001432
Maurer, D., Lewis, T. L., Brent, H. P., & Levin, A. V. (1999). Rapid improvement in the acuity of infants after visual input. Science, 286(5437), 108–110.
Maurer, D., Stager, C. L., & Mondloch, C. J. (1999). Cross‐modal transfer of shape is difficult to demonstrate in one‐month‐olds. Child Dev, 70(5), 1047–1057.
Mayer, D. L., Beiser, A. S., Warner, A. F., Pratt, E. M., Raye, K. N., & Lang, J. M. (1995). Monocular acuity norms for the Teller Acuity Cards between ages one month and four years. Invest Ophthalmol Vis Sci, 36(3), 671–685.
Mayer, D. L., & Dobson, V. (1982). Visual acuity development in infants and young children, as assessed by operant preferential looking. Vision Res, 22(9), 1141–1151.
McGurk, H., & MacDonald, J. (1976). Hearing lips and seeing voices. Nature, 264(5588), 746–748.
McKee, S. P., Levi, D. M., & Movshon, J. A. (2003). The pattern of visual deficits in amblyopia. J Vis, 3(5), 380–405. doi:10.1167/3.5.5
McKee, S. P., Levi, D. M., Schor, C. M., & Movshon, J. A. (2016). Saccadic latency in amblyopia. J Vis, 16(5), 3. doi:10.1167/16.5.3
Meier, K., & Giaschi, D. (2017). Unilateral amblyopia affects two eyes: fellow eye deficits in amblyopia. Invest Ophthalmol Vis Sci, 58(3), 1779–1800.
Meltzoff, A. N., & Borton, R. W. (1979). Intermodal matching by human neonates. Nature, 282(5737), 403–404.
Membreno, J. H., Brown, M. M., Brown, G. C., Sharma, S., & Beauchamp, G. R. (2002). A cost-utility analysis of therapy for amblyopia. Ophthalmology, 109(12), 2265–2271. doi:http://dx.doi.org/10.1016/S0161-6420(02)01286-1
Meredith, M. A., Nemitz, J. W., & Stein, B. E. (1987). Determinants of multisensory integration in superior colliculus neurons. I. Temporal factors. J Neurosci, 7(10), 3215–3229.
Meredith, M. A., & Stein, B. E. (1986a). Spatial factors determine the activity of multisensory neurons in cat superior colliculus. Brain Res, 365(2), 350–354. doi:http://dx.doi.org/10.1016/0006-8993(86)91648-3
Meredith, M. A., & Stein, B. E. (1986b). Visual, auditory, and somatosensory convergence on cells in superior colliculus results in multisensory integration. J Neurophysiol, 56(3), 640–662.
Middlebrooks, J. C., & Green, D. M. (1991). Sound localization by human listeners. Annu Rev Psychol, 42(1), 135–159. doi:10.1146/annurev.ps.42.020191.001031
Middlebrooks, J. C., Makous, J. C., & Green, D. M. (1989). Directional sensitivity of sound-pressure levels in the human ear canal. J Acoust Soc Am, 86(1), 89–108.
Miller, L. M., & D'Esposito, M. (2005). Perceptual fusion and stimulus coincidence in the cross-modal integration of speech. J Neurosci, 25(25), 5884–5893. doi:10.1523/JNEUROSCI.0896-05.2005
Mills, A. W. (1958). On the minimum audible angle. J Acoust Soc Am, 30(4), 237–246.
Milner, A. D., & Goodale, M. A. (2008). Two visual systems re-viewed. Neuropsychologia, 46(3), 774–785. doi:10.1016/j.neuropsychologia.2007.10.005
Mirabella, G., Hay, S., & Wong, A. M. (2011). Deficits in perception of images of real-world scenes in patients with a history of amblyopia. Arch Ophthalmol, 129(2), 176–183. doi:10.1001/archophthalmol.2010.354
Mon-Williams, M., Wann, J. P., Jenkinson, M., & Rushton, K. (1997). Synaesthesia in the normal limb. Proc Biol Sci, 264(1384), 1007–1010. doi:10.1098/rspb.1997.0139
Moore, D. R. (1993). Plasticity of binaural hearing and some possible mechanisms following late-onset deprivation. J Am Acad Audiol, 4(5), 277–283.
Moore, R. Y., & Goldberg, J. M. (1966). Projections of the inferior colliculus in the monkey. Experimental Neurology, 14(4), 429–438. doi:http://dx.doi.org/10.1016/0014-4886(66)90127-0
Morein-Zamir, S., Soto-Faraco, S., & Kingstone, A. (2003). Auditory capture of vision: examining temporal ventriloquism. Brain Res Cogn Brain Res, 17(1), 154–163. doi:10.1016/s0926-6410(03)00089-2
Moro, S. S., Harris, L. R., & Steeves, J. K. (2014). Optimal audiovisual integration in people with one eye. Multisens Res, 27(3-4), 173–188. doi:10.1163/22134808-00002453
Moro, S. S., & Steeves, J. K. (2015). Audiovisual integration in people with one eye: Normal temporal binding window and sound induced flash illusion but reduced McGurk effect. J Vis, 15(12), 721. doi:10.1167/15.12.721
Morrell, L. K. (1968). Temporal characteristics of sensory interaction in choice reaction times. J Exp Psychol, 77(1), 14–18.
Morrongiello, B. A. (1988). Infants’ localization of sounds along the horizontal axis: Estimates of minimum audible angle. Developmental Psychology, 24(1), 8–13.
Morrongiello, B. A., Fenwick, K. D., & Chance, G. (1998). Crossmodal learning in newborn infants: Inferences about properties of auditory-visual events. Infant Behavior and Development, 21(4), 543–553. doi:http://dx.doi.org/10.1016/S0163-6383(98)90028-5
Movshon, J. A., Eggers, H. M., Gizzi, M. S., Hendrickson, A. E., Kiorpes, L., & Boothe, R. G. (1987). Effects of early unilateral blur on the macaque's visual system. III. Physiological observations. J Neurosci, 7(5), 1340–1351.
Mozolic, J. L., Hugenschmidt, C. E., Peiffer, A. M., & Laurienti, P. J. (2008). Modality-specific selective attention attenuates multisensory integration. Exp Brain Res, 184(1), 39–52. doi:10.1007/s00221-007-1080-3
Muchnik, C., Efrati, M., Nemeth, E., Malin, M., & Hildesheimer, M. (1991). Central auditory skills in blind and sighted subjects. Scand Audiol, 20(1), 19–23. doi:10.3109/01050399109070785
Muckli, L., Kiess, S., Tonhausen, N., Singer, W., Goebel, R., & Sireteanu, R. (2006). Cerebral correlates of impaired grating perception in individual, psychophysically assessed human amblyopes. Vision Res, 46(4), 506–526. doi:10.1016/j.visres.2005.10.014
Muir, D., & Field, J. (1979). Newborn infants orient to sounds. Child Dev, 50(2), 431–436. doi:10.2307/1129419
Munhall, K. G., Gribble, P., Sacco, L., & Ward, M. (1996). Temporal constraints on the McGurk effect. Percept Psychophys, 58(3), 351–362. doi:10.3758/bf03206811
Nardini, M., Jones, P., Bedford, R., & Braddick, O. (2008). Development of cue integration in human navigation. Curr Biol, 18(9), 689–693. doi:10.1016/j.cub.2008.04.021
Narinesingh, C., Goltz, H. C., Raashid, R. A., & Wong, A. M. (2015). Developmental trajectory of McGurk effect susceptibility in children and adults with amblyopia. Invest Ophthalmol Vis Sci, 56(3), 2107–2113. doi:10.1167/iovs.14-15898
Narinesingh, C., Goltz, H. C., & Wong, A. M. (2017). Temporal binding window of the sound-induced flash illusion in amblyopia. Invest Ophthalmol Vis Sci, 58(3), 1442–1448. doi:10.1167/iovs.16-21258
Narinesingh, C., Wan, M., Goltz, H. C., Chandrakumar, M., & Wong, A. M. (2014). Audiovisual perception in adults with amblyopia: a study using the McGurk effect. Invest Ophthalmol Vis Sci, 55(5), 3158–3164. doi:10.1167/iovs.14-14140
Nath, A. R., & Beauchamp, M. S. (2012). A neural basis for interindividual differences in the McGurk effect, a multisensory speech illusion. Neuroimage, 59(1), 781–787. doi:10.1016/j.neuroimage.2011.07.024
Navarra, J., Alsius, A., Soto-Faraco, S., & Spence, C. (2010). Assessing the role of attention in the audiovisual integration of speech. Information Fusion, 11(1), 4–11. doi:10.1016/j.inffus.2009.04.001
Navarra, J., Vatakis, A., Zampini, M., Soto-Faraco, S., Humphreys, W., & Spence, C. (2005). Exposure to asynchronous audiovisual speech extends the temporal window for audiovisual integration. Brain Res Cogn Brain Res, 25(2), 499–507. doi:10.1016/j.cogbrainres.2005.07.009
Neu, B., & Sireteanu, R. (1997). Monocular acuity in preschool children: Assessment with the Teller and Keeler acuity cards in comparison to the C-test. Strabismus, 5(4), 185–202. doi:10.3109/09273979709044534
Newsome, W. T., & Pare, E. B. (1988). A selective impairment of motion perception following lesions of the middle temporal visual area (MT). J Neurosci, 8(6), 2201–2211.
Niechwiej-Szwedo, E., Goltz, H. C., Chandrakumar, M., Hirji, Z., & Wong, A. M. (2011). Effects of anisometropic amblyopia on visuomotor behavior, III: Temporal eye-hand coordination during reaching. Invest Ophthalmol Vis Sci, 52(8), 5853–5861. doi:10.1167/iovs.11-7314
Niechwiej-Szwedo, E., Goltz, H. C., Chandrakumar, M., Hirji, Z. A., & Wong, A. M. (2010). Effects of anisometropic amblyopia on visuomotor behavior, I: saccadic eye movements. Invest Ophthalmol Vis Sci, 51(12), 6348–6354. doi:10.1167/iovs.10-5882
Niechwiej-Szwedo, E., Goltz, H. C., Chandrakumar, M., & Wong, A. M. (2012). The effect of sensory uncertainty due to amblyopia (lazy eye) on the planning and execution of visually-guided 3D reaching movements. PloS one, 7(2), e31075.
Niechwiej-Szwedo, E., Kennedy, S. A., Colpa, L., Chandrakumar, M., Goltz, H. C., & Wong, A. M. (2012). Effects of induced monocular blur versus anisometropic amblyopia on saccades, reaching, and eye-hand coordination. Invest Ophthalmol Vis Sci, 53(8), 4354–4362. doi:10.1167/iovs.12-9855
Noesselt, T., Bergmann, D., Hake, M., Heinze, H. J., & Fendrich, R. (2008). Sound increases the saliency of visual events. Brain Res, 1220, 157–163. doi:10.1016/j.brainres.2007.12.060
Noesselt, T., Rieger, J. W., Schoenfeld, M. A., Kanowski, M., Hinrichs, H., Heinze, H. J., & Driver, J. (2007). Audiovisual temporal correspondence modulates human multisensory superior temporal sulcus plus primary sensory cortices. J Neurosci, 27(42), 11431–11441. doi:10.1523/JNEUROSCI.2252-07.2007
Norcia, A. M., & Tyler, C. W. (1985). Spatial frequency sweep VEP: visual acuity during the first year of life. Vision Res, 25(10), 1399–1408.
Norcia, A. M., Tyler, C. W., & Hamer, R. D. (1990). Development of contrast sensitivity in the human infant. Vision Res, 30(10), 1475–1486.
Nordmann, J. P., Freeman, R. D., & Casanova, C. (1992). Contrast sensitivity in amblyopia: masking effects of noise. Invest Ophthalmol Vis Sci, 33(10), 2975–2985.
O’Connor, N., & Hermelin, B. (1972). Seeing and hearing and space and time. Atten Percept Psychophys, 11(1), 46–48.
Ogilvie, J. C. (1956). Effect of auditory flutter on the visual critical flicker frequency. Canadian Journal of Psychology/Revue canadienne de psychologie, 10(2), 61–68. doi:http://dx.doi.org/10.1037/h0083662
Oliver, D. L., & Huerta, M. F. (1992). Inferior and superior colliculi. In D. B. Webster, A. N. Popper, & R. R. Fay (Eds.), The Mammalian Auditory Pathway: Neuroanatomy (pp. 168–221). New York, NY: Springer New York.
Packwood, E. A., Cruz, O. A., Rychwalski, P. J., & Keech, R. V. (1999). The psychosocial effects of amblyopia study. J AAPOS, 3(1), 15–17.
Paré, M., Richler, R. C., ten Hove, M., & Munhall, K. (2003). Gaze behavior in audiovisual speech perception: The influence of ocular fixations on the McGurk effect. Percept Psychophys, 65(4), 553–567.
Parise, C. V., & Ernst, M. O. (2016). Correlation detection as a general mechanism for multisensory integration. Nature communications, 7.
Parisi, V., Scarale, M. E., Balducci, N., Fresina, M., & Campos, E. C. (2010). Electrophysiological detection of delayed postretinal neural conduction in human amblyopia. Invest Ophthalmol Vis Sci, 51(10), 5041–5048. doi:10.1167/iovs.10-5412
Park, T. J., Klug, A., Holinstat, M., & Grothe, B. (2004). Interaural level difference processing in the lateral superior olive and the inferior colliculus. J Neurophysiol, 92(1), 289–301. doi:10.1152/jn.00961.2003
Pascalis, O., de Schonen, S., Morton, J., Deruelle, C., & Fabre-Grenet, M. (1995). Mother's face recognition by neonates: a replication and an extension. Infant Behavior and Development, 18(1), 79–85.
Pêcheux, M. G., Lepecq, J. C., & Salzarulo, P. (1988). Oral activity and exploration in 1–2-month-old infants. British Journal of Developmental Psychology, 6(3), 245–256.
Petrini, K., Russell, M., & Pollick, F. (2009). When knowing can replace seeing in audiovisual integration of actions. Cognition, 110(3), 432–439. doi:10.1016/j.cognition.2008.11.015
Pick, H. L., Warren, D. H., & Hay, J. C. (1969). Sensory conflict in judgments of spatial direction. Percept Psychophys, 6(4), 203–205. doi:10.3758/BF03207017
Pillsbury, H. C., Grose, J. H., & Hall, J. W., III. (1991). Otitis media with effusion in children: Binaural hearing before and after corrective surgery. Archives of Otolaryngology–Head & Neck Surgery, 117(7), 718–723. doi:10.1001/archotol.1991.01870190030008
Polat, U., Ma-Naim, T., Belkin, M., & Sagi, D. (2004). Improving vision in adult amblyopia by perceptual learning. Proc Natl Acad Sci U S A, 101(17), 6692–6697.
Polat, U., Sagi, D., & Norcia, A. M. (1997). Abnormal long-range spatial interactions in amblyopia. Vision Res, 37(6), 737–744.
Pollack, J. G., & Hickey, T. L. (1979). The distribution of retino-collicular axon terminals in rhesus monkey. J Comp Neurol, 185(4), 587–602. doi:10.1002/cne.901850402
Pons, F., Lewkowicz, D. J., Soto-Faraco, S., & Sebastian-Galles, N. (2009). Narrowing of intersensory speech perception in infancy. Proc Natl Acad Sci U S A, 106(26), 10598–10602. doi:10.1073/pnas.0904134106
Pöppel, E. (1997). A hierarchical model of temporal perception. Trends Cogn Sci, 1(2), 56–61.
Posner, M. I., Nissen, M. J., & Klein, R. M. (1976). Visual dominance: an information-processing account of its origins and significance. Psychol Rev, 83(2), 157–171.
Powers, A. R., 3rd, Hillock, A. R., & Wallace, M. T. (2009). Perceptual training narrows the temporal window of multisensory binding. J Neurosci, 29(39), 12265–12274. doi:10.1523/JNEUROSCI.3501-09.2009
Preslan, M. W., & Novak, A. (1996). Baltimore Vision Screening Project. Ophthalmology, 103(1), 105–109.
Pulfrich, C. (1922). Die Stereoskopie im Dienste der isochromen und heterochromen Photometrie. Naturwissenschaften, 10(33), 714–722.
Pulkki, V. (2001). Spatial sound generation and perception by amplitude panning techniques: Helsinki University of Technology.
Pulkki, V., & Karjalainen, M. (2001). Localization of amplitude-panned virtual sources I: stereophonic panning. Journal of the Audio Engineering Society, 49(9), 739–752.
Putzar, L., Goerendt, I., Heed, T., Richard, G., Buchel, C., & Röder, B. (2010). The neural basis of lip-reading capabilities is altered by early visual deprivation. Neuropsychologia, 48(7), 2158–2166.
Putzar, L., Goerendt, I., Lange, K., Rosler, F., & Röder, B. (2007). Early visual deprivation impairs multisensory interactions in humans. Nat Neurosci, 10(10), 1243–1245. doi:10.1038/nn1978
Putzar, L., Hötting, K., & Röder, B. (2010). Early visual deprivation affects the development of face recognition and of audio-visual speech perception. Restor Neurol Neurosci, 28(2), 251–257. doi:10.3233/RNN-2010-0526
Raashid, R. A., Liu, I. Z., Blakeman, A., Goltz, H. C., & Wong, A. M. (2016). The initiation of smooth pursuit is delayed in anisometropic amblyopia. Invest Ophthalmol Vis Sci, 57(4), 1757–1764. doi:10.1167/iovs.16-19126
Raashid, R. A., Wong, A., Blakeman, A., & Goltz, H. C. (2015). Saccadic adaptation in visually normal individuals using saccadic endpoint variability from amblyopia. Invest
Ophthalmol Vis Sci, 56(2), 947–955. Raashid, R. A., Wong, A., Chandrakumar, M., Blakeman, A., & Goltz, H. C. (2013). Short-term
saccadic adaptation in patients with anisometropic amblyopia. Invest Ophthalmol Vis Sci,
54(10), 6701–6711. Rahi, J., Logan, S., Timms, C., Russell-Eggitt, I., & Taylor, D. (2002). Risk, causes, and
outcomes of visual impairment after loss of vision in the non-amblyopic eye: a population-based study. Lancet, 360(9333), 597–602.
Raij, T., Uutela, K., & Hari, R. (2000). Audiovisual integration of letters in the human brain. Neuron, 28(2), 617–625.
Rayleigh, L. (1907). XII. On our perception of sound direction. The London, Edinburgh, and
Dublin Philosophical Magazine and Journal of Science, 13(74), 214–232. Recanzone, G. H. (2003). Auditory influences on visual temporal rate perception. J
Neurophysiol, 89(2), 1078–1093. doi:10.1152/jn.00706.2002 Recanzone, G. H. (2009). Interactions of auditory and visual stimuli in space and time. Hear Res,
258(1-2), 89–99. doi:10.1016/j.heares.2009.04.009 Repka, M. X., Beck, R. W., Kraker, R. T., Cole, S. R., Holmes, J. M., Birch, E. E., . . . Cotter, S.
A. (2002). The clinical profile of moderate amblyopia in children younger than 7 years. Arch Ophthalmol, 120(3), 281–287.
Richards, M. D., Goltz, H. C., & Wong, A. M. (2017a). Optimal audiovisual integration in the ventriloquism effect but pervasive deficits in unisensory spatial localization in amblyopia. Invest Ophthalmol Vis Sci, (in press).
Richards, M. D., Goltz, H. C., & Wong, A. M. F. (2017b). Alterations in audiovisual simultaneity perception in amblyopia. PLoS ONE, 12(6), e0179516. doi:10.1371/journal.pone.0179516
Ringdahl, A., Eriksson-Mangold, M., & Andersson, G. (1998). Psychometric evaluation of the Gothenburg Profile for measurement of experienced hearing disability and handicap: applications with new hearing aid candidates and experienced hearing aid users. Br J
Audiol, 32(6), 375–385. Robaei, D., Rose, K. A., Ojaimi, E., Kifley, A., Martin, F. J., & Mitchell, P. (2006). Causes and
associations of amblyopia in a population-based sample of 6-year-old Australian children. Arch Ophthalmol, 124(6), 878–884. doi:10.1001/archopht.124.6.878
Robinson, D. L., & Kertzman, C. (1995). Covert orienting of attention in macaques. III. Contributions of the superior colliculus. J Neurophysiol, 74(2), 713–721.
Rock, I., & Victor, J. (1964). Vision and touch: An experimentally created conflict between the two senses. Science, 143(3606), 594–596.
Röder, B., Rosler, F., & Spence, C. (2004). Early vision impairs tactile perception in the blind. Curr Biol, 14(2), 121–124.
Röder, B., Teder-Salejarvi, W., Sterr, A., Rosler, F., Hillyard, S. A., & Neville, H. J. (1999). Improved auditory spatial tuning in blind humans. Nature, 400(6740), 162–166. doi:10.1038/22106
Roelfsema, P. R., Konig, P., Engel, A. K., Sireteanu, R., & Singer, W. (1994). Reduced synchronization in the visual cortex of cats with strabismic amblyopia. Eur J Neurosci,
6(11), 1645–1655. Rose, J. E., Brugge, J. F., Anderson, D. J., & Hind, J. E. (1967). Phase-locked response to low-
frequency tones in single auditory nerve fibers of the squirrel monkey. J Neurophysiol,
30(4), 769–793. Rose, S. A., Gottfried, A. W., & Bridger, W. H. (1981). Cross-modal transfer in 6-month-old
infants. Developmental Psychology, 17(5), 661. Roseboom, W., & Arnold, D. H. (2011). Twice upon a time: multiple concurrent temporal
recalibrations of audiovisual speech. Psychol Sci, 22(7), 872–877. doi:10.1177/0956797611413293
Roseboom, W., Nishida, S., & Arnold, D. H. (2009). The sliding window of audio-visual simultaneity. J Vis, 9(12), 4.1–8. doi:10.1167/9.12.4
Rosenblum, L. D., Schmuckler, M. A., & Johnson, J. A. (1997). The McGurk effect in infants. Atten Percept Psychophys, 59(3), 347–357.
Rowland, B., Stanford, T., & Stein, B. (2007). A Bayesian model unifies multisensory spatial localization with the physiological properties of the superior colliculus. Experimental
Brain Research, 180(1), 153–161. doi:10.1007/s00221-006-0847-2 Saenz, M., Lewis, L. B., Huth, A. G., Fine, I., & Koch, C. (2008). Visual Motion Area MT+/V5
Responds to Auditory Motion in Human Sight-Recovery Subjects. J Neurosci, 28(20), 5141–5148. doi:10.1523/JNEUROSCI.0803-08.2008
Salomao, S. R., & Ventura, D. F. (1995). Large sample population age norms for visual acuities obtained with Vistech-Teller Acuity Cards. Invest Ophthalmol Vis Sci, 36(3), 657–670.
Salzman, C. D., Britten, K. H., & Newsome, W. T. (1990). Cortical microstimulation influences perceptual judgements of motion direction. Nature, 346(6280), 174.
Scheiman, M. M., Hertle, R. W., Beck, R. W., Edwards, A. R., Birch, E., Cotter, S. A., . . . Donahue, S. (2005). Randomized trial of treatment of amblyopia in children aged 7 to 17 years. Arch Ophthalmol, 123(4), 437–447.
Schiller, P. H., & Stryker, M. (1972). Single-unit recording and stimulation in superior colliculus of the alert rhesus monkey. J Neurophysiol, 35(6), 915–924.
Schneider, K. A., & Bavelier, D. (2003). Components of visual prior entry. Cogn Psychol, 47(4), 333–366. doi:10.1016/s0010-0285(03)00035-5
Schor, C. M., & Westall, C. (1984). Visual and vestibular sources of fixation instability in amblyopia. Invest Ophthalmol Vis Sci, 25(6), 729–738.
Schröder, J. H., Fries, P., Roelfsema, P. R., Singer, W., & Engel, A. K. (2002). Ocular dominance in extrastriate cortex of strabismic amblyopic cats. Vision Res, 42(1), 29–39.
Secen, J., Culham, J., Ho, C., & Giaschi, D. (2011). Neural correlates of the multiple-object tracking deficit in amblyopia. Vision Res, 51(23-24), 2517–2527. doi:10.1016/j.visres.2011.10.011
Sekuler, R., Sekuler, A. B., & Lau, R. (1997). Sound alters visual motion perception. Nature,
385(6614), 308. doi:10.1038/385308a0 Sengpiel, F., Jirmann, K. U., Vorobyov, V., & Eysel, U. T. (2006). Strabismic suppression is
mediated by inhibitory interactions in the primary visual cortex. Cereb Cortex, 16(12), 1750–1758. doi:10.1093/cercor/bhj110
Shams, L., & Beierholm, U. R. (2010). Causal inference in perception. Trends Cogn Sci, 14(9), 425–432. doi:10.1016/j.tics.2010.07.001
Shams, L., Kamitani, Y., & Shimojo, S. (2000). Illusions: What you see is what you hear. Nature, 408(6814), 788.
Shams, L., Kamitani, Y., & Shimojo, S. (2002). Visual illusion induced by sound. Brain Res
Cogn Brain Res, 14(1), 147–152. doi:10.1016/S0926-6410(02)00069-1
Sharma, V., Levi, D. M., & Klein, S. A. (2000). Undercounting features and missing features:
evidence for a high-level deficit in strabismic amblyopia. Nat Neurosci, 3(5), 496–501. doi:10.1038/74872
Shatz, C. J., & Stryker, M. P. (1978). Ocular dominance in layer IV of the cat's visual cortex and the effects of monocular deprivation. J Physiol, 281, 267–283.
Shimojo, S., & Held, R. (1987). Vernier acuity is less than grating acuity in 2- and 3-month-olds. Vision Res, 27(1), 77–86.
Shipley, T. (1964). Auditory Flutter-Driving of Visual Flicker. Science, 145(3638), 1328–1330. Simion, F., Regolin, L., & Bulf, H. (2008). A predisposition for biological motion in the
newborn baby. Proc Natl Acad Sci U S A, 105(2), 809–813. Simmers, A. J., Ledgeway, T., Hess, R. F., & McGraw, P. V. (2003). Deficits to global motion
processing in human amblyopia. Vision Res, 43(6), 729–738. doi:10.1016/s0042-6989(02)00684-3
Simmers, A. J., Ledgeway, T., Mansouri, B., Hutchinson, C. V., & Hess, R. F. (2006). The extent of the dorsal extra-striate deficit in amblyopia. Vision Res, 46(16), 2571–2580. doi:10.1016/j.visres.2006.01.009
Sireteanu, R., Thiel, A., Fikus, S., & Iftime, A. (2008). Patterns of spatial distortions in human amblyopia are invariant to stimulus duration and instruction modality. Vision Res, 48(9), 1150–1163. doi:10.1016/j.visres.2008.01.028
Skoczenski, A. M., & Norcia, A. M. (2002). Late maturation of visual hyperacuity. Psychol Sci,
13(6), 537–541. doi:10.1111/1467-9280.00494 Slutsky, D. A., & Recanzone, G. H. (2001). Temporal and spatial dependency of the
ventriloquism effect. Neuroreport, 12(1), 7–10. Sokol, S. (1983). Abnormal evoked potential latencies in amblyopia. Br J Ophthalmol, 67(5),
310–314. Soto-Faraco, S., & Alsius, A. (2009). Deconstructing the McGurk–MacDonald illusion. J Exp
Psychol Hum Percept Perform, 35(2), 580–587. doi:10.1037/a0013483 Spang, K., & Fahle, M. (2009). Impaired temporal, not just spatial, resolution in amblyopia.
Invest Ophthalmol Vis Sci, 50(11), 5207–5212. Sparks, D. L. (1986). Translation of sensory signals into commands for control of saccadic eye
movements: role of primate superior colliculus. Physiol Rev, 66(1), 118–171. Sparks, D. L. (1988). Neural cartography: sensory and motor maps in the superior colliculus.
Brain Behav Evol, 31(1), 49–56. Spelke, E. (1976). Infants' intermodal perception of events. Cogn Psychol, 8(4), 553–560. Spence, C., & Parise, C. (2010). Prior-entry: a review. Conscious Cogn, 19(1), 364–379.
doi:10.1016/j.concog.2009.12.001 Spence, C., Shore, D. I., & Klein, R. M. (2001). Multisensory prior entry. J Exp Psychol Gen,
130(4), 799–832. St John, R. (1998). Judgements of visual precedence by strabismics. Behav Brain Res, 90(2),
167–174.
Stein, B. E., Burr, D., Constantinidis, C., Laurienti, P. J., Meredith, M. A., Perrault, T. J., Jr., . . . Lewkowicz, D. J. (2010). Semantic confusion regarding the development of multisensory integration: a practical solution. Eur J Neurosci, 31(10), 1713–1720. doi:10.1111/j.1460-9568.2010.07206.x
Stein, B. E., & Meredith, M. A. (1993). The Merging of the Senses: The MIT Press. Stein, B. E., Meredith, M. A., Huneycutt, W. S., & McDade, L. (1989). Behavioral indices of
multisensory integration: orientation to visual cues is affected by auditory stimuli. J Cogn
Neurosci, 1(1), 12–24. doi:10.1162/jocn.1989.1.1.12 Stein, B. E., Stanford, T. R., Ramachandran, R., Perrault, T. J., Jr., & Rowland, B. A. (2009).
Challenges in quantifying multisensory integration: alternative criteria, models, and inverse effectiveness. Exp Brain Res, 198(2-3), 113–126. doi:10.1007/s00221-009-1880-8
Stein, B. E., Stanford, T. R., & Rowland, B. A. (2014). Development of multisensory integration from the perspective of the individual neuron. Nat Rev Neurosci, 15(8), 520–535. doi:10.1038/nrn3742
Stelmach, L. B., & Herdman, C. M. (1991). Directed attention and perception of temporal order. J Exp Psychol Hum Percept Perform, 17(2), 539.
Sterritt, G. M., Camp, B. W., & Lipman, B. S. (1966). Effects of early auditory deprivation upon auditory and visual information processing. Perceptual and motor skills.
Stevens, A. A., & Weaver, K. (2005). Auditory perceptual consolidation in early-onset blindness. Neuropsychologia, 43(13), 1901–1910. doi:10.1016/j.neuropsychologia.2005.03.007
Stevens, S. S., & Newman, E. B. (1936). The localization of actual sources of sound. The
American Journal of Psychology, 48(2), 297–306. doi:10.2307/1415748 Stevenson, R. A., Fister, J. K., Barnett, Z. P., Nidiffer, A. R., & Wallace, M. T. (2012).
Interactions between the spatial and temporal stimulus factors that influence multisensory integration in human performance. Exp Brain Res, 219(1), 121–137. doi:10.1007/s00221-012-3072-1
Stevenson, R. A., Siemann, J. K., Schneider, B. C., Eberly, H. E., Woynaroski, T. G., Camarata, S. M., & Wallace, M. T. (2014). Multisensory temporal integration in autism spectrum disorders. J Neurosci, 34(3), 691–697. doi:10.1523/JNEUROSCI.3615-13.2014
Stevenson, R. A., VanDerKlok, R. M., Pisoni, D. B., & James, T. W. (2011). Discrete neural substrates underlie complementary audiovisual speech integration processes. Neuroimage, 55(3), 1339–1345. doi:10.1016/j.neuroimage.2010.12.063
Stevenson, R. A., & Wallace, M. T. (2013). Multisensory temporal integration: task and stimulus dependencies. Exp Brain Res, 227(2), 249–261. doi:10.1007/s00221-013-3507-3
Stevenson, R. A., Wilson, M. M., Powers, A. R., & Wallace, M. T. (2013). The effects of visual training on multisensory temporal processing. Exp Brain Res, 225(4), 479–489. doi:10.1007/s00221-012-3387-y
Stevenson, R. A., Zemtsov, R. K., & Wallace, M. T. (2012). Individual differences in the multisensory temporal binding window predict susceptibility to audiovisual illusions. J
Exp Psychol Hum Percept Perform, 38(6), 1517–1529. doi:10.1037/a0027339 Stewart, C. E., Moseley, M. J., & Fielder, A. R. (2011). Amblyopia therapy: an update.
Strabismus, 19(3), 91–98. doi:10.3109/09273972.2011.600421 Stone, J., Hunkin, N., Porrill, J., Wood, R., Keeler, V., Beanland, M., . . . Porter, N. (2001).
When is now? Perception of simultaneity. Proceedings of the Royal Society of London B:
Biological Sciences, 268(1462), 31–38.
Strasburger, H. (2001). Converting between measures of slope of the psychometric function.
Atten Percept Psychophys, 63(8), 1348–1355. Stuart, J. A., & Burian, H. M. (1962). A study of separation difficulty. Its relationship to visual
acuity in normal and amblyopic eyes. Am J Ophthalmol, 53(3), 471–477.
Student Support Services Team. (2008). School hearing screening guidelines. Albany, New York 12234: The University of the State of New York.
Subramanian, V., Jost, R. M., & Birch, E. E. (2013). A quantitative study of fixation stability in amblyopia. Invest Ophthalmol Vis Sci, 54(3), 1998–2003. doi:10.1167/iovs.12-11054
Sugita, Y., & Suzuki, Y. (2003). Audiovisual perception: Implicit estimation of sound-arrival time. Nature, 421(6926), 911. doi:10.1038/421911a
Sumby, W. H., & Pollack, I. (1954). Visual contribution to speech intelligibility in noise. J
Acoust Soc Am, 26(2), 212–215. Tallal, P. (1978). An experimental investigation of the role of auditory temporal processing in
normal and disordered language development. Language acquisition and language
breakdown: Parallels and divergences, 25–61. Talsma, D., Senkowski, D., Soto-Faraco, S., & Woldorff, M. G. (2010). The multifaceted
interplay between attention and multisensory integration. Trends Cogn Sci, 14(9), 400–410.
The Lasker/IRRF Initiative for Innovation in Vision Science. (2017). Amblyopia: Challenges and Opportunities. Retrieved from
Theeuwes, J. (1991). Exogenous and endogenous control of attention: The effect of visual onsets
and offsets. Atten Percept Psychophys, 49(1), 83–90. Thompson, J. R., Woodruff, G., Hiscox, F. A., Strong, N., & Minshull, C. (1991). The incidence
and prevalence of amblyopia detected in childhood. Public health, 105(6), 455–462. Thurlow, W. R., & Jack, C. E. (1973). Certain determinants of the “ventriloquism effect”.
Perceptual and motor skills, 36(3 suppl), 1171–1184. Titchener, E. B. (1908). Lectures on the Elementary Psychology of Feeling and Attention:
Macmillan. Tommila, V., & Tarkkanen, A. (1981). Incidence of loss of vision in the healthy eye in
amblyopia. Br J Ophthalmol, 65(8), 575–577. Tootell, R. B., Switkes, E., Silverman, M. S., & Hamilton, S. L. (1988). Functional anatomy of
macaque striate cortex. II. Retinotopic organization. J Neurosci, 8(5), 1531–1568. Tredici, T. D., & von Noorden, G. K. (1984). The Pulfrich effect in anisometropic amblyopia
and strabismus. Am J Ophthalmol, 98(4), 499–503. Tripathy, S. P., & Cavanagh, P. (2002). The extent of crowding in peripheral vision does not
scale with target size. Vision Res, 42(20), 2357–2369. Tsirlin, I., Colpa, L., Goltz, H. C., & Wong, A. M. (2015). Behavioral training as new treatment
for adult amblyopia: a meta-analysis and systematic review. Invest Ophthalmol Vis Sci,
56(6), 4061–4075. doi:10.1167/iovs.15-16583 Tünnermann, J., Petersen, A., & Scharlau, I. (2015). Does attention speed up processing?
Decreases and increases of processing rates in visual prior entry. J Vis, 15(3), 1–1. Tuomainen, J., Andersen, T. S., Tiippana, K., & Sams, M. (2005). Audio-visual speech
perception is special. Cognition, 96(1), B13–22. doi:10.1016/j.cognition.2004.10.004 Vaegan, & Taylor, D. (1979). Critical period for deprivation amblyopia in children. Trans
Ophthalmol Soc U K, 99(3), 432–439. van de Graaf, E. S., van der Sterre, G. W., van Kempen-du Saar, H., Simonsz, B., Looman, C.
W., & Simonsz, H. J. (2007). Amblyopia and Strabismus Questionnaire (A&SQ): clinical validation in a historic cohort. Graefes Arch Clin Exp Ophthalmol, 245(11), 1589–1595. doi:10.1007/s00417-007-0594-5
van der Heijden, M., & Trahiotis, C. (1999). Masking with interaurally delayed stimuli: the use of “internal” delays in binaural detection. J Acoust Soc Am, 105(1), 388–399.
Van Esch, T., Lutman, M., Vormann, M., Lyzenga, J., Hällgren, M., Larsby, B., . . . Dreschler, W. (2015). Relations between psychophysical measures of spatial hearing and self-reported spatial-hearing abilities. International Journal of Audiology, 54(3), 182–189.
Van Essen, D. C., & Deyoe, E. A. (1995). Concurrent processing in the primate visual cortex. In M. S. Gazzaniga (Ed.), The cognitive neurosciences (pp. 383–400): MIT Press.
Van Essen, D. C., Newsome, W. T., & Maunsell, J. H. (1984). The visual field representation in striate cortex of the macaque monkey: asymmetries, anisotropies, and individual variability. Vision Res, 24(5), 429–448. doi:10.1016/0042-6989(84)90041-5
van Wassenhove, V., Grant, K. W., & Poeppel, D. (2007). Temporal window of integration in auditory-visual speech perception. Neuropsychologia, 45(3), 598–607. doi:10.1016/j.neuropsychologia.2006.01.001
Vatakis, A., & Spence, C. (2007). Crossmodal binding: evaluating the "unity assumption" using audiovisual speech stimuli. Percept Psychophys, 69(5), 744–756.
Vatakis, A., & Spence, C. (2008). Evaluating the influence of the 'unity assumption' on the temporal perception of realistic audiovisual stimuli. Acta Psychol (Amst), 127(1), 12–23. doi:10.1016/j.actpsy.2006.12.002
Vereecken, E. P., & Brabant, P. (1984). Prognosis for vision in amblyopia after the loss of the good eye. Arch Ophthalmol, 102(2), 220–224.
Vinding, T., Gregersen, E., Jensen, A., & Rindziunski, E. (2009). Prevalence of amblyopia in old people without previous screening and treatment. Acta Ophthalmologica, 69(6), 796–798. doi:10.1111/j.1755-3768.1991.tb02063.x
Von Békésy, G. (1930). Zur Theorie des Hörens: Über das Richtungshören bei einer Zeitdifferenz oder Lautstärkenungleichheit der beiderseitigen Schalleinwirkungen. Physik. Z., 31, 824–835.
von Noorden, G. K., & Campos, E. (2002). Binocular Vision and Ocular Motility (6th ed.). St. Louis, MO: Mosby.
Voss, P., Lassonde, M., Gougoux, F., Fortin, M., Guillemot, J. P., & Lepore, F. (2004). Early- and late-onset blind individuals show supra-normal auditory abilities in far-space. Curr
Biol, 14(19), 1734–1738. doi:10.1016/j.cub.2004.09.051 Vroomen, J., Bertelson, P., & De Gelder, B. (2001). The ventriloquist effect does not depend on
the direction of automatic visual attention. Percept Psychophys, 63(4), 651–659. doi:10.3758/bf03194427
Vroomen, J., & Keetels, M. (2006). The spatial constraint in intersensory pairing: no role in temporal ventriloquism. J Exp Psychol Hum Percept Perform, 32(4), 1063–1071. doi:10.1037/0096-1523.32.4.1063
Vroomen, J., & Keetels, M. (2010). Perception of intersensory synchrony: a tutorial review. Atten Percept Psychophys, 72(4), 871–884. doi:10.3758/APP.72.4.871
Vroomen, J., & Stekelenburg, J. J. (2011). Perception of intersensory synchrony in audiovisual speech: not that special. Cognition, 118(1), 75–83. doi:10.1016/j.cognition.2010.10.002
Wada, Y., Kitagawa, N., & Noguchi, K. (2003). Audio-visual integration in temporal perception. Int J Psychophysiol, 50(1-2), 117–124.
Wali, N., Leguire, L., Rogers, G., & Bremer, D. (1991). CSF interocular interactions in childhood ambylopia. Optom Vis Sci, 68(2), 81–87.
Wallace, M. T. (2004). The development of multisensory processes. Cognitive Processing, 5(2), 69–83.
Wallace, M. T., Perrault, T. J., Jr., Hairston, W. D., & Stein, B. E. (2004). Visual experience is necessary for the development of multisensory integration. J Neurosci, 24(43), 9580–9584. doi:10.1523/JNEUROSCI.2535-04.2004
Wallace, M. T., & Stein, B. E. (2001). Sensory and multisensory responses in the newborn monkey superior colliculus. J Neurosci, 21(22), 8886–8894.
Wallace, M. T., & Stein, B. E. (2007). Early experience determines how the senses will interact. J Neurophysiol, 97(1), 921–926. doi:10.1152/jn.00497.2006
Wallace, M. T., Wilkinson, L. K., & Stein, B. E. (1996). Representation and integration of multiple sensory inputs in primate superior colliculus. J Neurophysiol, 76(2), 1246–1266.
Warncke, H. (1941). The fundamentals of room-related stereophonic reproduction in sound films. Akust. Zh, 6, 174–188.
Warren, D. H., Welch, R. B., & McCarthy, T. J. (1981). The role of visual-auditory “compellingness” in the ventriloquism effect: Implications for transitivity among the spatial senses. Percept Psychophys, 30(6), 557–564.
Warren, R. M. (2008). Auditory Perception: An Analysis and Synthesis: Cambridge University Press.
Watkins, S., Shams, L., Tanaka, S., Haynes, J. D., & Rees, G. (2006). Sound alters activity in human V1 in association with illusory visual perception. Neuroimage, 31(3), 1247–1256. doi:10.1016/j.neuroimage.2006.01.016
Watts, P. O., Neveu, M. M., Holder, G. E., & Sloper, J. J. (2002). Visual evoked potentials in successfully treated strabismic amblyopes and normal subjects. J AAPOS, 6(6), 389–392. doi:10.1067/mps.2002.129046
Weaver, K. E., & Stevens, A. A. (2006). Auditory gap detection in the early blind. Hear Res,
211(1-2), 1–6. doi:10.1016/j.heares.2005.08.002 Webber, A. L., Wood, J. M., Gole, G. A., & Brown, B. (2008). Effect of amblyopia on self-
esteem in children. Optom Vis Sci, 85(11), 1074–1081. doi:10.1097/OPX.0b013e31818b9911
Welch, R. B. (1999). Meaning, attention, and the “unity assumption” in the intersensory bias of spatial and temporal perceptions. Advances in psychology, 129, 371–387.
Welch, R. B., & Warren, D. H. (1980). Immediate perceptual response to intersensory discrepancy. Psychol Bull, 88(3), 638–667. doi:10.1037/0033-2909.88.3.638
Wertheimer, M. (1961). Psychomotor Coordination of Auditory and Visual Space at Birth. Science, 134(3491), 1692. doi:10.1126/science.134.3491.1692
Wesson, M. D., & Loop, M. S. (1982). Temporal contrast sensitivity in amblyopia. Invest
Ophthalmol Vis Sci, 22(1), 98–102. Wiesel, T. N., & Hubel, D. H. (1963a). Effects of visual deprivation on morphology and
physiology of cells in the cat's lateral geniculate body. J Neurophysiol, 26(6), 978–993.
Wiesel, T. N., & Hubel, D. H. (1963b). Single-cell responses in striate cortex of kittens deprived
of vision in one eye. J Neurophysiol, 26(6), 1003–1017. Wightman, F., Allen, P., Dolan, T., Kistler, D., & Jamieson, D. (1989). Temporal resolution in
children. Child Dev, 60(3), 611–624. doi:10.2307/1130727 Wightman, F. L., & Kistler, D. J. (1989a). Headphone simulation of free‐field listening. I:
stimulus synthesis. J Acoust Soc Am, 85(2), 858–867. Wightman, F. L., & Kistler, D. J. (1989b). Headphone simulation of free‐field listening. II:
Psychophysical validation. J Acoust Soc Am, 85(2), 868–878. Williams, C., Azzopardi, P., & Cowey, A. (1995). Nasal and temporal retinal ganglion cells
projecting to the midbrain: implications for "blindsight". Neuroscience, 65(2), 577–586.
Williams, C., Northstone, K., Howard, M., Harvey, I., Harrad, R. A., & Sparrow, J. M. (2008). Prevalence and risk factors for common vision problems in children: data from the ALSPAC study. Br J Ophthalmol, 92(7), 959–964. doi:10.1136/bjo.2007.134700
Witten, I. B., & Knudsen, E. I. (2005). Why seeing is believing: merging auditory and visual worlds. Neuron, 48(3), 489–496. doi:10.1016/j.neuron.2005.10.020
Wittmann, M. (1999). Time Perception and Temporal Processing Levels of the Brain. Chronobiology International, 16(1), 17–32. doi:10.3109/07420529908998709
Wong, A. M., Richards, M. D., & Goltz, H. C. (2017, April 4). The effect of amblyopia on
the developmental calibration of sound localization. Paper presented at the 43rd annual NANOS meeting, Washington, DC.
Wright, D., Hebrank, J. H., & Wilson, B. (1974). Pinna reflections as cues for localization. J
Acoust Soc Am, 56(3), 957–962. Yarrow, K., Jahn, N., Durant, S., & Arnold, D. H. (2011). Shifts of criteria or neural timing? The
assumptions underlying timing perception studies. Conscious Cogn, 20(4), 1518–1531. doi:10.1016/j.concog.2011.07.003
Yu, L., Rowland, B. A., & Stein, B. E. (2010). Initiating the development of multisensory integration by manipulating sensory experience. J Neurosci, 30(14), 4904–4913. doi:10.1523/JNEUROSCI.5575-09.2010
Yu, M., Brown, B., & Edwards, M. H. (1998). Investigation of multifocal visual evoked potential in anisometropic and esotropic amblyopes. Invest Ophthalmol Vis Sci, 39(11), 2033–2040.
Yuille, A. L., & Bülthoff, H. H. (1996). Bayesian decision theory and psychophysics. In D. C. Knill & W. Richards (Eds.), Perception as Bayesian inference (pp. 123–161): Cambridge University Press.
Zackon, D. H., Casson, E. J., Zafar, A., Stelmach, L., & Racette, L. (1999). The temporal order judgment paradigm: subcortical attentional contribution under exogenous and endogenous cueing conditions. Neuropsychologia, 37(5), 511–520. doi:10.1016/S0028-3932(98)00134-1
Zampini, M., Guest, S., Shore, D. I., & Spence, C. (2005). Audio-visual simultaneity judgments. Percept Psychophys, 67(3), 531–544.
Zampini, M., Shore, D. I., & Spence, C. (2005). Audiovisual prior entry. Neurosci Lett, 381(3), 217–222. doi:10.1016/j.neulet.2005.01.085
Zhang, W., & Zhao, K. (2005). Multifocal VEP difference between early- and late-onset strabismus amblyopia. Doc Ophthalmol, 110(2-3), 173–180. doi:10.1007/s10633-005-4312-5
Zürcher, B., & Lang, J. M. (1979). Reading capacity in cases of 'cured' strabismic amblyopia. Trans Ophthalmol Soc U K, 100(4), 501–503.
Zwiers, M. P., Van Opstal, A. J., & Cruysberg, J. R. M. (2001). A spatial hearing deficit in early-blind humans. J Neurosci, 21(9), RC142–RC142.
Zwiers, M. P., Versnel, H., & Van Opstal, A. J. (2004). Involvement of monkey inferior colliculus in spatial hearing. J Neurosci, 24(17), 4145–4156. doi:10.1523/JNEUROSCI.0199-04.2004
Copyright Acknowledgements
The work contained within Study III was previously published in: Richards, M. D., Goltz, H. C.,
& Wong, A. M. F. (2017). Alterations in audiovisual simultaneity perception in amblyopia. PLoS
ONE, 12(6). Its text and figures have been reformatted for inclusion in this thesis, with permission
under the Creative Commons Attribution 4.0 International (CC BY 4.0) license.