www.helsinki.fi/yliopisto
COMPUTATIONAL MODELLING OF
VISUAL ATTENTION
Presented by
Ben Cowley
Laurent Itti and Christof Koch, Nature Reviews Neuroscience, Vol. 2, Feb. 2001, p. 1-11
• Overview
1. Neural Mechanisms in Visual Attention
2. ‘Spotlight’ of Attention
3. Saliency
4. Pre-Attentive Computation of Visual Features
5. Computational Architectures
6. Attentional Selection and Inhibition-of-Return
7. Attention and Recognition
• Video talk: Surya Ganguli on attentional modelling
• Summary
Paper/Presentation Structure
• Review of computational models of focal visual attention
• Selective visual attention: functions to direct gaze to salient objects in the
environment
• Rapidly detect relevant stimuli – prey, predators, mates
• Solve bandwidth problem: 10^7 to 10^8 bits/sec at optic nerve
• Parallel use of bottom-up, image-based saliency and top-down, task-related cues
• Current emphasis on bottom-up, image-based control of attentional deployment
Overview
Neural Mechanisms in Visual
Attention
Ventral stream: ‘what’
Dorsal stream: ‘where’
• Early visual areas process pre-attentive visual features
• Potentially based on centre-surround mechanisms
• Items of interest ‘pop-out’
• Saliency-based and fast: ~25-50 ms per item
Neural Mechanisms in Visual
Attention
• Pre-frontal cortex and other ‘higher’ areas modulate attention
• Task-based, variable selection of criteria
• Price in speed: 200 ms or more
Neural Mechanisms in Visual
Attention
• Not only selection of objects
• Also enhancement of cortical, and thus conscious, representation of those
objects
• Is the spotlight only metaphorical?
‘Spotlight’ of Attention
‘Spotlight’ of Attention
Peter Tse, 2005
illusionoftheyear.com
• Computational Architectures (beginning with Koch and Ullman) centre
around a ‘saliency map’
• 2D topographical array of stimulus conspicuity or saliency
• Receives inputs from early visual areas
• Scan saliency map in descending order
Saliency
Koch, C. & Ullman, S. Shifts in selective visual attention: towards the underlying neural circuitry.
Hum. Neurobiol. 4, 219–227 (1985).
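As a rough illustration of scanning a saliency map in descending order, the sketch below treats the map as a small 2D grid and sorts its locations by value. The function name `scan_order` and the map values are made up for this example; a real model would compute the map from feature maps rather than hard-code it.

```python
def scan_order(saliency_map):
    """Return (row, col) locations sorted from most to least salient."""
    locations = [
        (value, row, col)
        for row, line in enumerate(saliency_map)
        for col, value in enumerate(line)
    ]
    # Sort by descending saliency; the sort is stable, so ties keep raster order.
    locations.sort(key=lambda item: -item[0])
    return [(row, col) for _, row, col in locations]


# Hypothetical 3x4 saliency map (values invented for illustration).
saliency_map = [
    [0.1, 0.0, 0.3, 0.0],
    [0.0, 0.9, 0.0, 0.2],
    [0.5, 0.0, 0.0, 0.0],
]
first_fixations = scan_order(saliency_map)[:3]
print(first_fixations)
```

Sorting the whole map at once is only a convenience here; it ignores any dynamics in how attention moves from one location to the next.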
• Different features contribute at different strengths to perceptual saliency
• No strong interactions found between visual feature modalities (e.g. colour and orientation)
• Feature contrast more important than feature strength
• Neuronal response strongly modulated by context
• Inhibitory effect occurs when stimulus extends beyond classical receptive field
• Dampens excess reverberatory firing?
Saliency: Pre-Attentive Computation
of Visual Features
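The point that feature contrast matters more than feature strength can be sketched with a toy centre-surround operator. This is a minimal 1-D example under stated assumptions: `centre_surround`, the `radius` parameter and the feature values are all hypothetical, and the surround is a plain mean rather than any physiologically fitted profile.

```python
def centre_surround(feature, radius=2):
    """Absolute difference between each value and the mean of its surround."""
    n = len(feature)
    contrast = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        surround = [feature[j] for j in range(lo, hi) if j != i]
        contrast.append(abs(feature[i] - sum(surround) / len(surround)))
    return contrast


# Hypothetical 1-D feature rows (e.g. local orientation energy).
uniform_strong = [0.9] * 7                            # strong feature everywhere
weak_oddball = [0.1, 0.1, 0.1, 0.4, 0.1, 0.1, 0.1]    # weak but locally distinct

strong_contrast = centre_surround(uniform_strong)
oddball_contrast = centre_surround(weak_oddball)
print(max(strong_contrast), max(oddball_contrast))
```

The uniformly strong row produces essentially no response, while the weak oddball stands out at its own position, which is the behaviour the bullets describe.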
• Saliency map represents only importance of features
• Convergence from feature maps requires some spatial selection
• Pre-attentive feature-extraction mechanisms in future models
• Spatially-heterogeneous non-linear interactions: context
Saliency
• Some early models:
• Didday and Arbib (1975), Marr (1982)
• Saliency-based Computational Architectures heavily influenced by Koch
and Ullman (1985)
• Models differ mainly in their feature extraction and pruning
strategy
• How to build a saliency map and use it
Computational Architectures
Didday, R. L. & Arbib, M. A. Eye movements and visual perception: A “two visual system” model.
Int. J. Man–Machine Studies 7, 547–569 (1975).
Marr, D. Vision (Freeman, San Francisco, 1982).
• Wolfe (1994)
• Relevant feature selection modelled by top-down, spatially-defined,
feature-dependent weighting of feature maps
• Saliency = likelihood that the target is present at a map location
• Combines bottom-up feature contrast and top-down feature weight
Computational Architectures
Wolfe, J. M. Visual search in continuous, naturalistic stimuli.
Vision Res. 34, 1187–1195 (1994).
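The combination of bottom-up contrast and top-down weighting can be sketched as a weighted sum of feature maps. This is an illustrative simplification, not Wolfe's published model: the function `guided_saliency`, the feature names and all numbers are assumptions, and each feature gets a single scalar weight instead of a spatially defined one.

```python
def guided_saliency(contrast_maps, weights):
    """Weighted sum of bottom-up contrast maps using top-down feature weights."""
    names = list(contrast_maps)
    size = len(contrast_maps[names[0]])
    return [
        sum(weights[name] * contrast_maps[name][i] for name in names)
        for i in range(size)
    ]


def most_salient(saliency):
    """Index of the location with the highest saliency."""
    return saliency.index(max(saliency))


# Hypothetical bottom-up contrast at three display locations.
contrast_maps = {
    "red": [0.9, 0.1, 0.3],
    "vertical": [0.1, 0.8, 0.3],
}

# Same image, two different tasks (top-down weights are invented).
look_for_red = most_salient(guided_saliency(contrast_maps, {"red": 1.0, "vertical": 0.0}))
look_for_vertical = most_salient(guided_saliency(contrast_maps, {"red": 0.0, "vertical": 1.0}))
print(look_for_red, look_for_vertical)
```

The bottom-up maps are identical in both calls; only the task weights change which location wins.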
• Tsotsos et al (1995) combined:
• Feedforward bottom-up feature extraction hierarchy
• Feedback selective tuning of feature extraction mechanisms
• Saliency = hierarchical competitive pruning of non-contributory
feedforward paths
Computational Architectures
Tsotsos, J. K. et al. Modeling visual-attention via selective tuning.
Artif. Intell. 78, 507–545 (1995).
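A very reduced sketch of the feedforward/feedback idea: a binary pyramid is built bottom-up, then a top-down pass keeps only the winning branch at each level and ignores (prunes) the rest. The names `build_pyramid` and `select_winner`, the use of `max` as the feedforward operation, and the input values are assumptions for illustration; the selective tuning model itself is considerably richer. The input length is assumed to be a power of two.

```python
def build_pyramid(inputs):
    """Feedforward pass: each unit takes the max of its two children."""
    levels = [list(inputs)]
    while len(levels[-1]) > 1:
        below = levels[-1]
        levels.append(
            [max(below[i], below[i + 1]) for i in range(0, len(below), 2)]
        )
    return levels


def select_winner(levels):
    """Feedback pass: descend from the top, following only the winning child."""
    index = 0
    for level in reversed(levels[:-1]):
        left, right = 2 * index, 2 * index + 1
        # The losing child and everything beneath it is pruned (never visited).
        index = left if level[left] >= level[right] else right
    return index


# Hypothetical input-layer responses (length must be a power of two).
inputs = [0.2, 0.1, 0.4, 0.9, 0.3, 0.0, 0.5, 0.6]
pyramid = build_pyramid(inputs)
winner = select_winner(pyramid)
print(winner)
```

The feedback pass inspects two units per level rather than the whole input layer, which is the sense in which non-contributing paths are pruned in this sketch.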
• Milanese et al (1995)
• Energy minimisation approach
• Energy measure combines four terms:
• min Inter-feature incoherence
• min Intra-feature incoherence
• min Total activity of each map
• max Dynamic range of each map
Computational Architectures
Milanese, R., Gil, S. & Pun, T. Attentive mechanisms for dynamic and static scene analysis.
Opt. Eng. 34, 2428–2434 (1995).
• Itti et al (1998, 2000)
• Purely bottom-up approach
• Spatial competition for saliency directly modelled by surround modulation
• Convolution of each feature map by a large Mexican Hat filter
• Losing locations are entirely eliminated
Computational Architectures
Itti, L., Koch, C. & Niebur, E. A model of saliency-based visual attention for rapid scene analysis.
IEEE Trans. Patt. Anal. Mach. Intell. 20, 1254–1259 (1998).
Itti, L. & Koch, C. A saliency-based search mechanism for overt and covert shifts of visual attention.
Vision Res. 40, 1489–1506 (2000).
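The spatial competition described above can be sketched on a 1-D map: each iteration adds the Mexican-hat (difference-of-Gaussians) filtered map to the map itself and clips negative values to zero. All parameter values (`c_ex`, `sigma_ex`, `c_inh`, `sigma_inh`, the iteration count) are chosen for this toy example and are not the values used by Itti et al; the published model contains further terms that are omitted here, and activity is not renormalised between iterations.

```python
import math


def dog(distance, c_ex=0.5, sigma_ex=0.5, c_inh=0.3, sigma_inh=8.0):
    """Difference of Gaussians: narrow excitation minus broad inhibition."""
    excite = c_ex * math.exp(-distance ** 2 / (2 * sigma_ex ** 2))
    inhibit = c_inh * math.exp(-distance ** 2 / (2 * sigma_inh ** 2))
    return excite - inhibit


def compete(feature_map, iterations=10):
    """Repeatedly add the DoG-filtered map to itself and clip at zero."""
    current = list(feature_map)
    n = len(current)
    for _ in range(iterations):
        filtered = [
            sum(current[j] * dog(i - j) for j in range(n)) for i in range(n)
        ]
        current = [max(0.0, current[i] + filtered[i]) for i in range(n)]
    return current


# Hypothetical feature map with two isolated peaks.
feature_map = [0.0] * 21
feature_map[5] = 1.0    # strong peak
feature_map[15] = 0.6   # weaker competitor
result = compete(feature_map)
survivors = [i for i, value in enumerate(result) if value > 0.0]
print(survivors)
```

With these settings the stronger peak excites itself and inhibits the weaker one through the broad surround, so after a few iterations only the stronger location remains active.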
• Unique, centralised map is unlikely
• Multiple areas code for stimulus saliency
• Lateral intraparietal sulcus of the posterior parietal cortex (PPC)
• Frontal eye fields
• Subdivisions of pulvinar
• Superior colliculus
• Stages in the sensorimotor processing stream?
• Neurons respond in relation to location: e.g. the dorsal ‘where’ stream lies close to the
sensorimotor strip
• Neuron-level coding for saliency is established
Saliency: Biologically Plausible?
• Winner-take-all network is biologically plausible for saliency map
• Alone, it keeps selecting the same most salient location, so gaze stays fixed (seen in certain patients)
• Inhibit neurons in the saliency map once their location has been attended
• ‘Inhibition of return’ (IOR)
• Simple model: transient inhibition at the current focus
• Computationally easy, biologically implausible!
• IOR is object-bound: again the need to be context-aware
Attentional Selection and
Inhibition-of-Return
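The interplay of winner-take-all selection and inhibition of return can be sketched as a loop: pick the maximum of the saliency map, record it, suppress a patch around it, repeat. This is the simple, location-based variant only. The function `scanpath`, the square inhibition patch, the `radius` parameter and the map values are assumptions, and the inhibition here does not decay over time as a transient one would.

```python
def scanpath(saliency_map, fixations, radius=1):
    """Alternate winner-take-all selection with inhibition of return."""
    working = [row[:] for row in saliency_map]
    rows, cols = len(working), len(working[0])
    path = []
    for _ in range(fixations):
        # Winner-take-all: pick the single most active location.
        _, r0, c0 = max(
            (v, r, c) for r, row in enumerate(working) for c, v in enumerate(row)
        )
        path.append((r0, c0))
        # Inhibition of return: suppress a square patch around the winner.
        for r in range(max(0, r0 - radius), min(rows, r0 + radius + 1)):
            for c in range(max(0, c0 - radius), min(cols, c0 + radius + 1)):
                working[r][c] = 0.0
    return path


# Hypothetical saliency map (values invented for illustration).
saliency_map = [
    [0.2, 0.0, 0.0, 0.0, 0.6],
    [0.0, 0.9, 0.8, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.4, 0.0, 0.0, 0.0, 0.3],
]
path = scanpath(saliency_map, fixations=3)
print(path)
```

Without the inhibition step every iteration would return the same winner. Because the suppression is tied to map coordinates rather than to objects, the sketch also shows why this simple form is not object-bound.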
• Bottom-up models shown so far can only account for the first 300 ms or less after
stimulus onset
• Top-down model requires volitional bias also
• Computational challenge: integration
• Hierarchical models, tree-based
• Require learning ‘knowledge’ through training
• Schill et al (2001), Rybak et al (1998)
Attention and Recognition
Schill, K., Umkehrer, E., Beinlich, S., Krieger, G. & Zetzsche, C. Scene analysis with saccadic eye
movements: top-down and bottom-up modeling. J. Electronic Imaging 10(01), 152-160, (2001).
Rybak, I. A., Gusakova, V. I., Golovan, A. V., Podladchikova, L. N. & Shevtsova, N. A. A model of
attention-guided visual perception and recognition. Vision Res. 38, 2387–2400 (1998).
• Biological plausibility of these models was low
• More biologically plausible models for individual streams (ventral, dorsal)
were unintegrated
• Some post-review models achieve greater biological validity for more focal
processing
• E.g. Butts et al (2011)
• Potential for plausible integrated models on these foundations?
Attention and Recognition
Daniel A. Butts, Chong Weng, Jianzhong Jin, Jose-Manuel Alonso, and Liam Paninski. Temporal
Precision in the Visual Pathway through the Interplay of Excitation and Stimulus-Driven
Suppression. The Journal of Neuroscience, 3 August 2011, 31(31):11313-11327.
• So what is the state of the art?
Video talk: Surya Ganguli on attentional modelling
Attention and Recognition
http://fora.tv/2011/03/16/Mapping_the_Brain_and_Neural_Architectures#chapter_05
1. Neural Mechanisms in Visual Attention
2. ‘Spotlight’ of Attention
3. Saliency
4. Pre-Attentive Computation of Visual Features
5. Computational Architectures
6. Attentional Selection and Inhibition-of-Return
7. Attention and Recognition
Summary