action and cognition


Upload: marimarxan

Post on 16-Jan-2016


DESCRIPTION

This is the script of the lecture “Action and Cognition”.

TRANSCRIPT

Page 1: Action and Cognition
Page 2: Action and Cognition

a&c |

a&c | 2

Copyright Issues

The sole purpose of this version of the following document is to serve as a studying aid for the participants of the lecture “Action and Cognition I” held by Prof. Dr. Peter König in 2003/2004 at the University of Osnabrück.

This document is not yet ready for public release, as it contains pictures protected by copyright. It must therefore not be distributed, in either electronic or printed form, to anyone who is not a participant of the lecture mentioned above.

We urge you to stick to these guidelines. Seriously, we might get our pants sued off if this document is made available on the internet.

Furthermore, we would like to encourage you to take an active part in the improvement of this document. If you happen to find mistakes or have access to appropriate pictures not protected by copyright, please contact us.

If you want to use any part of the text, feel free to do so.

Page 3: Action and Cognition


Introduction

This script is supposed to provide you with all the information of the lecture “Action and Cognition”. It contains everything you need to know to do well in the final exam. But that is not its primary purpose.

The script, like the lecture, tries to make students of “Cognitive Science” familiar with the methods of neuroscience and some general theories about brain function(s). The paradigm of studies of the mental is that the human brain is the organ that “produces” all our mental abilities, such as thinking, speaking, motor action and attention, as well as desires, beliefs and all kinds of emotion. Thus, to understand brain function should be to understand the nature of human behaviour.

On the way that might lead to such a full understanding, it is most important to get detailed insight into the methods that allow evaluating brain activity under controlled circumstances. Much data is obtained from studies of monkey and cat brains, which allow “watching” the brain's activity at the scale of single neurons and groups of neurons in the living animal. Brain imaging techniques like EEG, MEG and fMRI are the most familiar ones for studies on humans. We will not dig very deep into the statistical analysis of data. All these tools for exploring the brain are used in different combinations by scientists whose most recent papers will serve as the background of the lectures. The lecture is not only meant to be a presentation of vocabulary and theories; it should underline the importance of reading and understanding contemporary scientific literature.

This script covers the first part of the lecture “Action and Cognition”, which is mainly concerned with vision, attention and consciousness. It will accompany you from the eye through different stages of visual processing, like the lateral geniculate nucleus in the thalamus, to the primary visual cortex and “higher” areas of visual processing. Curiously, it seems that the processing of form and position is partly separated into parallel pathways. But how is this information then recombined? This is the question of how the “binding” of visual features is realized in our experience of the world. This important binding problem is only one of the questions still under massive debate in Cognitive Science. If the brain is the organ that builds ourselves in all aspects, then to understand the brain is to understand who “we” are. If we can fully understand ourselves in neuroscientific terms, this will incorporate the idea of a deterministic mind. And this core problem of the nature of the human mind is just one bridge to philosophy and computer science, the other main “parts” of this one topic. As all science is a process of criticism, you should sharpen your senses in the field of Cognitive Science!

We hope that this script enables you to formulate your own critical questions and in turn helps you to take part in the process of exploring the human mind.

Page 4: Action and Cognition


Terms of Use and Copyright Issues

Introduction

Table of Contents

Terminology Map

Part 01: Vision

01 The Constructive Nature of Color Vision

02 The Primary Visual Cortex

03 Vision, Vx and a Representation of the World

Appendix A: Object Recognition in IT and STS

04 Temporal Coding

Part 02: Attention

05 Attention

06 Neural Correlates of Attentional Mechanisms

Appendix B: MRI & fMRI

07 Top-Down and Feed-Back Mechanisms in the Visual System

Part 03: Consciousness

08 Consciousness

Appendix C: Binocular Rivalry and Visual Awareness

Part 04: Motor Action

09 Spinal Motor Systems

References

Page 5: Action and Cognition


Terminology Map

Page 6: Action and Cognition


The functional organization of the brain can be investigated with different methods.

Let's put a human brain on a table: At first glance we recognize the gross anatomy, i.e. the two hemispheres heavily folded into structures called gyri (the folds) and sulci (the clefts in between). The central sulcus divides the frontal and parietal lobes. The frontal and parietal lobes in turn are separated from the temporal lobe by the lateral sulcus. The last main division is the occipital lobe, located at the back of the hemisphere. This distinction is purely anatomical and serves as a tool for describing positions in the cortex. (picture: 1. lobes.jpg)

The cortex appears to be homogeneous in structure. This might give rise to a globalist view of brain function: all cognitive functions are realized in the cortex as a whole. The ability of the cortex to compensate for severe damage supports this view. If the cortex is, in contrast, of modular organization, with separable regions for distinct functions, this should become observable using the following methods

CHAPTER 01: The Constructive Nature of (Color) Vision

Introduction

Imagine a visual decision task! You are confronted with a complex situation; say you want to buy a Merlot or a Cabernet in the supermarket. You stand in front of the shelf and look at the bottles. In physical terms, this visual image consists of electromagnetic waves of different wavelengths and amplitudes: photons reaching the retinae of our eyes. In a process that is only partly understood so far, the brain constructs a visual perception of the wine bottles on the shelf. Where in the brain is the whole scene realized? Is there a central screen that all information is projected onto? But if there were such a highest stage of information processing, how could “I” look at that screen and decide for one of the constituents of my evening meal? Is there a kind of scan taking place to “read out” the result of visual computing? If so, we could ask again to “whom” the result of the scan is presented; we would end up in an infinite regress of screens and stages “looking” at these screens. Consequently, a highest stage creating a visual image is not a plausible idea to explain how we decide. The simple computationist view of a three-step process, perception-decision-action, is not a plausible framework considering the structure of our visual experience. Thus we must look at visual processing in more detail to get a better impression of what's going on up there.

01 Gross anatomy allows a coarse definition of different cortical areas according to their position.

Page 7: Action and Cognition


of investigation: If we stain the cell bodies of all cortical neurons and then cut the cortex into slices, we can see differences in the organization of the cortical layers. A very nice example is a difference in the occipital lobe. There we find an area which appears striped under the microscope and is thus called “striate cortex”. But one stripe, the “line of Gennari”, is only present in one part of this region, dividing visual areas 1 (V1) and 2 (V2). Such differences in cytoarchitectonics are likely to reflect differences in function. (picture: 2. line of gennari.tif)

In 1909 Korbinian Brodmann presented a map of 52 cortex areas exhibiting cytoarchitectonic differences. (picture: 3. Brodmann.tif) Although this scheme is still popular, it is somewhat outdated: differences between areas are hard to reproduce across subjects. Nevertheless, the cytoarchitectonic differences within a single cortex are genuinely related to differences in function, in ways not yet fully understood.

The cortex is heavily interconnected. Thus a way to define cortical areas is to look at the afferences that arrive in a certain cortical area. For example, V1 receives most of its input from the lateral geniculate nucleus (LGN) in the thalamus. Hence anatomical connectivity may separate areas concerned with different input.

A very popular method to define areas is to look at brain activity, either at the scale of cortical regions (with imaging techniques like fMRI) or at the scale of single neurons. Differences in the response properties of single neurons or cortical regions are very likely to reflect differences in function. If you prick somebody with a needle successively from the feet to the head, the respective neighbouring neurons in primary somatosensory cortex are most active, revealing a topographic map of the whole body. Somatosensory cortex is part of the parietal lobe and located on the posterior bank of the central sulcus.

02 The visual cortex has a six-layer structure. Area V1 can be distinguished from area V2 cytoarchitectonically: the line of Gennari is present only in V1. This cytoarchitectonic difference corresponds to differences in the response properties of neurons in V1 and V2.

03 In 1909, Korbinian Brodmann presented this map of the human cortex. It relies on Brodmann’s results from microscopic investigation of the cytoarchitectonics of human cortex.

02

03

Page 8: Action and Cognition


So the methods for defining cortical areas are:

- gross anatomy
- cytoarchitectonics
- anatomical connectivity
- cellular response properties
- regional response properties
- topographic maps

If we compare the human brain with monkey, cat or mouse brains, we find only one major difference between them: size! Gross anatomy and cytoarchitectonics are to a great degree similar between humans and other higher vertebrates. This underlines the relevance of studies on monkeys and cats for questions about humans. To keep orientation in the confusing structure of the cortex, areas are named either after function (V1 – visual area 1) or after position on the cortex (MT – mediotemporal area). Our terminology map at the beginning shows you the main notions for naming cortex positions.

Felleman and Van Essen (1991) created a map of the visual areas of the macaque monkey brain, using area definitions from several studies. The map gives a good impression of how complex the organization of the primate visual system is. Indeed, about 40% of the macaque cortex is concerned with visual information processing. It is important to see what evidence supports such a map. This issue will occupy you for a good part of the lecture. The example of colour vision will guide us in this first part.

The Constructive Nature of Colour Vision

04 1. Human brain, 2. monkey brain without gyri, 3. cat brain, 4. mouse brain, 5. monkey brain with gyri
05 The Felleman & Van Essen diagram

04

05

Page 9: Action and Cognition


We enter the map at the retina. As many aspects of retinal processing are familiar to you, I will give only a short review and concentrate on the constructive nature of colour vision.

The retina

All visual information, that is, the photons of light, enters the central nervous system (CNS) through the eye. The eye already acts as part of the CNS: the circuitry of the retina serves as a processing stage where the visual information is grouped, amplified or inhibited. The outermost layer, the one farthest from the incoming light, is a layer of pigment cells that absorbs light. Attached to it we find the layer of receptor cells, i.e. rods and cones. These possess a light-sensitive outer segment, partly stuck into the pigment cell layer (pictures: 6. Retinal layers.jpg, 7. Retinal layers photo.jpg, 8. human eye.gif). The pigment cells prevent light from being reflected, thus enhancing contrast in the receptor cell layer.

Rods and Cones

Rods are predominantly luminance selective. They are widely distributed over the whole retina, but only a few are present in the central region. Due to their specific

06 Retinal layers in a human eye and the connectivity of the receptors and neurons.

07 A slice through human retina.

08 The human Eye

07

06

08

Page 10: Action and Cognition


sensitivity for luminance differences, rods are mainly involved in night vision. Cones are colour-sensitive receptor cells, present in great number in the central retina. There they allow for detailed and coloured day vision. If you fixate on a point in front of you and slowly bring a pencil of a colour unknown to you into your visual field, you won't be able to recognize the pencil's colour until it reaches a more central position. This reflects the low density of cones in the peripheral retina.

Rods and cones both connect to bipolar cells, which relay the graded receptor membrane potentials. Bipolar cells in turn contact retinal ganglion cells, which convert this input into sequences of action potentials and build the output layer of the retina: the axons of the ganglion cells leave the eye at a certain spot where they join to build the optic nerve. The optic nerve ends in parts of the diencephalon, foremost the LGN (lateral geniculate nucleus) of the thalamus.

In the most central part of the retina, the fovea centralis, the density of cones is much higher than in the periphery. Thus the spatial resolution of the retinal image is very high here. In addition, two mechanisms facilitate detailed and coloured vision: 1. ganglion cells are shifted aside in order to allow arriving light to directly stimulate the receptors; 2. each receptor cell contacts only one ganglion cell, such that detailed spatial information is preserved.

Lateral connections

Besides bipolar cells there are two other types of interneurons in the retina: horizontal cells and amacrine cells. Horizontal cells inhibitorily connect neighbouring bipolar cells. This lateral inhibition increases contrast. Amacrine cells laterally connect bipolar cells with ganglion cells, such that one ganglion cell can cover a whole area of the retina: its receptive field.

The connectivity of the retina allows for the construction of detailed coloured vision. Receptor cells respond to luminance (rods) and colour (cones), bipolar cells receive input from the receptor cells, and in turn the bipolar cells connect to ganglion cells. Lateral connections allow for lateral inhibition and convergent information processing. This results in specific response properties of bipolar and ganglion cells, the topic of the next part.

Discriminating colours

Of course, neither are the photons entering the eye themselves coloured, nor is there anything colourful about receptor potentials and action potentials. Colours are private events. Psychophysics tells us that human colour perception depends on the wavelength of light. Electromagnetic waves build a continuous wavelength spectrum. The most powerful gamma radiation is of very short wavelength, on a subatomic scale. On the other side of the spectrum we find radio signals of several metres in wavelength. The visual spectrum lies between

09 Rods and Cones

10 Subcortical visual tracts.

09

10

Page 11: Action and Cognition


300 and 800 nanometres (wavelength). Wavelength is inversely proportional to frequency; that is, the shorter the wavelength of an electromagnetic wave, the higher its frequency.

But I said it was photons, not “waves”, reaching the retina. That's true; the physical description of light is dualistic: light has wave-like and particle-like properties. The energy of a photon depends on “its” wavelength, and the number of photons is reflected in the amplitude of the corresponding light wave. So for bright light we can say that there are many photons reaching the retina. And for colour we can say that these photons are of a particular wavelength. The human visual spectrum covers the colours of the rainbow. The cones in the retina do not respond equally to those different wavelengths. There are three types of cones, each of which responds best to a certain

BOX 5.2: Tuning Curves

A tuning curve is the average response variation of a neuron to a systematic variation of a stimulus. On the left, the response of a V1 neuron is shown as a function of the orientation of a rotating bar. The global maximum is around 225° of orientation angle. This means that the neuron of interest fires most vigorously if a bar of this particular orientation enters its receptive field. Any deviation from this angle results in a gradual decline of activity.

11 Visible light is just a small portion of the electromagnetic wavelength spectrum.

10 A typical tuning curve
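The shape of a tuning curve like the one in the box can be sketched numerically. The following is a minimal illustrative model, not data from the lecture: the firing rate is a Gaussian of the circular distance from the preferred orientation of 225°, with the peak rate, baseline and tuning width chosen arbitrarily.

```python
import math

# Illustrative model of a tuning curve: firing rate as a Gaussian of the
# circular distance between stimulus orientation and the neuron's
# preferred orientation (225 deg). Peak rate, baseline and tuning width
# are arbitrary assumptions, not measured values.

PREFERRED_DEG = 225.0
PEAK_RATE = 50.0   # spikes/s at the preferred orientation (assumed)
BASELINE = 2.0     # spontaneous firing rate (assumed)
WIDTH_DEG = 30.0   # tuning width (assumed)

def circular_distance(a_deg: float, b_deg: float) -> float:
    """Smallest angular distance between two directions, in degrees."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)

def firing_rate(orientation_deg: float) -> float:
    """Mean response to a bar of the given orientation."""
    d = circular_distance(orientation_deg, PREFERRED_DEG)
    return BASELINE + (PEAK_RATE - BASELINE) * math.exp(-d * d / (2.0 * WIDTH_DEG ** 2))
```

At 225° the model fires at its peak rate; any deviation from the preferred angle yields a gradual decline towards the baseline, as the box describes.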

Page 12: Action and Cognition


colour (wavelength). Nowadays we call them S-cones (short wavelength), M-cones (medium wavelength) and L-cones (long wavelength). So the high-dimensional space of the colour spectrum is mapped onto the three-dimensional space of S-, M- and L-cone activity. But how can we then discriminate all colours of the visual spectrum?

You can approximate all colours by mixing blue, green and red. There are two ways to mix colours: the additive method and the subtractive method. The subtractive method uses filters, or pigments like those in your primary-school paint box, to exclude certain wavelengths. The additive method superimposes lights of different wavelengths. The observations made by mixing colours allow insight into the constructive nature of colour vision: cones of different wavelength sensitivity produce a specific response pattern of retinal ganglion cells that “encodes” a specific colour perception. Imagine disco lights of different colour: adding colours means adding photons of a certain wavelength, which means increasing activity of a certain cone type, which means changing the response pattern.

Colour perception is described by the Grassmann laws: a) every perception of colour can be described by mixing three basic colours, b) adding colours is a continuous operation, c) metameric* colours act identically in mixing, d) the total intensity is the sum of all intensities.

*Colours that have the same visual appearance (the same “tristimulus values”) but different spectral composition are called metameric.
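The reduction of the wavelength spectrum to three cone activations can be sketched as a linear projection. The band centres and sensitivity values below are crude illustrative assumptions, not measured cone fundamentals; the point is the principle, including the additivity stated in the Grassmann laws.

```python
# Project a light spectrum (power per coarse wavelength band) onto the
# three cone activations. Bands and sensitivities are illustrative
# assumptions, not measured cone fundamentals.

BANDS_NM = [440, 500, 560, 620]   # four coarse wavelength bands

SENSITIVITY = {                    # assumed relative sensitivity per band
    "S": [0.9, 0.2, 0.0, 0.0],
    "M": [0.1, 0.7, 0.8, 0.2],
    "L": [0.0, 0.3, 0.9, 0.8],
}

def cone_response(spectrum):
    """Reduce a spectrum to the 3-dimensional S/M/L activation vector."""
    return {cone: sum(p * s for p, s in zip(spectrum, sens))
            for cone, sens in SENSITIVITY.items()}

# Grassmann-style additivity: superimposing two lights adds their cone
# activations. Two physically different spectra that happen to yield the
# same three activations would be metamers, indistinguishable to a
# trichromat.
```

This makes the dimensionality reduction explicit: a spectrum of arbitrarily many bands is collapsed onto just three numbers, which is why metamers exist at all.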

Likewise, the pattern of ganglion cell activity is not itself colourful, and it remains an open question how perceptual qualities are generated. There does not seem to be a plausible reason why the wavelength corresponding to red does not appear as green. In philosophy of mind this is sometimes referred to as the explanatory gap. There is an ongoing qualia debate concerning this unsolved problem.

Anomalies in perception give insight into the nature of colour vision.

Most humans are trichromats, that is, they have a three-cone-type system. Dichromats, humans with only two types of cones, have severe problems distinguishing certain colours, e.g. green and red. We call such persons colour-blind. On the other hand, there are even a very few humans with four different cone types, tetrachromats. Those persons can perceive differences in colour that are not accessible to trichromats. The nice article “Looking for Mrs Tetrachromat” from the Red Herring Journal can be found at: www.redherring.com. People with impaired vision for long wavelengths are called protanopes, for middle wavelengths deuteranopes, and for short wavelengths tritanopes.

Some other animals, e.g. mantis shrimps (Stomatopoda), possess up to 16 different receptor types and are sensitive to 8 different polarizations, i.e. orientations of an electromagnetic wave. What a colourful world!


12 Stomatopoda

Page 13: Action and Cognition


Spatial Resolution

We do not perceive our environment in pixels, but have the impression of a continuous visual space. Yet, as the retina is not a continuous receptive membrane, there has to be a minimal contrast angle below which no difference in luminance is perceived. A single hair can still be easily recognized, particles of dust are only visible under high contrast, and a normal sheet of paper appears continuous at reading distance; the cellulose fibres are not visible as the paper's constituents. How fine-grained exactly is our spatial resolution?

Imagine a horizontal cut through visual space. If we draw a circle on this surface with the lens of the eye as its centre, we can define angles of spatial resolution. Those angles correspond to distances on the circle. The spatial resolution of the human fovea is about 1′ (arc minute). An arc minute is one degree of a circle divided by 60. This resolution corresponds to an object 1.5 mm in size at a distance of 5 metres. Every image is projected through the lens onto the retina, as in a photo camera. The retinal image is inverted left-right and upside-down. A visual angle of 1′ produces a retinal image 5 micrometres in size. This is exactly the size of a receptor cell in the fovea. To distinguish the constituents of an object there has to be at least one less active receptor cell between two active cells (example of the sheet of paper).

The blind spot

Not all of the retina is covered with receptor cells in equal density. From the centre to the periphery the density of cells decreases. However, we have the impression of detailed and coloured vision across the whole visual field. This reflects the constructive nature of vision. One striking example of this principle is the blind spot, the area where the optic nerve leaves the eye. In this area no receptor cells are present, because the axons have to traverse the retinal cell layers there. One would expect that we should see a dark spot in our visual field due to the absence of receptors. But the visual system reconstructs the image for that spot such that we don't notice it.
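The resolution figures above can be checked with a little trigonometry. The roughly 17 mm image distance of the eye used below is a common textbook approximation that the script does not state; it is assumed here.

```python
import math

# Worked check of the spatial-resolution figures: an object subtending
# one arc minute at 5 m, and the corresponding image on the retina,
# assuming an eye image distance of ~17 mm (a textbook approximation).

def size_at_distance(angle_deg: float, distance_m: float) -> float:
    """Size of an object subtending the given visual angle at a distance."""
    return 2.0 * distance_m * math.tan(math.radians(angle_deg) / 2.0)

ARC_MINUTE_DEG = 1.0 / 60.0

object_size = size_at_distance(ARC_MINUTE_DEG, 5.0)     # ~1.5 mm
retina_image = size_at_distance(ARC_MINUTE_DEG, 0.017)  # ~5 micrometres
```

Both values come out close to the figures quoted in the text: about 1.5 mm at 5 m, and about 5 µm on the retina, matching the diameter of a foveal receptor cell.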

Lateral inhibition, receptive fields

Another aspect of the constructive nature of vision is lateral inhibition between receptors.

Contrast

Horizontally aligned cells inhibit neighbouring cells. This results in higher contrast between differently activated neighbouring cells. If you look at a Snickers bar on a white sheet, the contrast of the edges is thus increased. A very nice example of lateral inhibition is the so-called “Hermann grid”. At the intersections of the white lines, dark dots appear in the non-central image; in the foveal region spatial resolution is too high to reveal the effect. Looking at such a grid you will also notice a dynamic flickering of the dots. Our saccadic eye movements are recognizable here. So the sharp edges in high-contrast images are a result of the constructive nature of vision.
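The contrast enhancement at edges can be sketched in one dimension: each cell's output is its own input minus a fraction of its neighbours' inputs. The inhibition weight and luminance values are illustrative assumptions.

```python
# One-dimensional sketch of lateral inhibition: each cell subtracts a
# fraction of its two neighbours' inputs from its own. At a luminance
# step this produces an undershoot on the dark side and an overshoot on
# the bright side, exaggerating the edge. Weight is an assumption.

INHIBITION = 0.2

def laterally_inhibit(luminance):
    out = []
    for i, centre in enumerate(luminance):
        left = luminance[i - 1] if i > 0 else centre
        right = luminance[i + 1] if i < len(luminance) - 1 else centre
        out.append(centre - INHIBITION * (left + right))
    return out

# A step edge between a dark (1.0) and a bright (5.0) region:
edge = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]
response = laterally_inhibit(edge)
# response -> [0.6, 0.6, -0.2, 3.8, 3.0, 3.0]: the cells flanking the
# edge deviate most from their neighbours, enhancing the contrast.
```

The dip and peak flanking the step are the same mechanism that produces the dark dots in the Hermann grid and the perceived sharpening of edges.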

Receptive fields

There is more to say about lateral inhibition. Ganglion cells receive input from many bipolar cells, so a ganglion cell represents a receptive field on the retina. Receptive fields of ganglion cells are mainly of two different types. One type receives excitatory input from the centre of its receptive field and inhibitory input from the periphery; in the other type it is vice versa. If the receptive field of an on-centre cell matches exactly an intersection of the Hermann grid, the inhibitory input from the off-surround suppresses the centre excitation such that a dark dot is perceived. An on-centre ganglion cell responds best to a white dot presented to the central receptive field with no stimulation of the periphery. Total illumination of the receptive field will prevent a reaction, due to the inhibitory surround.
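The on-centre/off-surround behaviour can be sketched as a centre sum minus a normalized surround sum over a small luminance patch. The 3×3 patch size and equal weights are illustrative assumptions.

```python
# Sketch of an on-centre/off-surround receptive field over a 3x3 patch:
# excitation from the centre pixel minus the mean of the 8 surround
# pixels. Patch size and weights are illustrative assumptions.

CENTRE = {(1, 1)}
SURROUND = {(r, c) for r in range(3) for c in range(3)} - CENTRE

def on_centre_response(patch):
    """Centre excitation minus averaged surround inhibition."""
    centre = sum(patch[r][c] for r, c in CENTRE)
    surround = sum(patch[r][c] for r, c in SURROUND)
    return centre - surround / len(SURROUND)

spot    = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]   # white dot on the centre
uniform = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]   # total illumination

# on_centre_response(spot) == 1.0: the cell responds strongly.
# on_centre_response(uniform) == 0.0: surround inhibition cancels the
# centre excitation, as described in the text.
```

Swapping the signs of centre and surround gives the complementary off-centre cell of the second type described above.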

Coloured receptive fields

Centre and periphery of concentric broadband cells react to combinations of different wavelengths. Take for example the wavelengths of red and green and build all combinations (e.g. green-on/red-on centre). The periphery always reacts antagonistically to the centre. In

13 Count the black dots for Bush and the white dots for Kerry

Page 14: Action and Cognition


simple concentric antagonist colour cells, the receptive field has a basic-colour centre and an antagonist-colour surround (in one of the two possibilities: centre-on/surround-off or vice versa). Simple coextensive antagonist colour cells respond to combinations of basic colours without a complementary centre-surround structure of their receptive field. If such a cell additionally shows a complementary centre-surround organization, it is called a concentric double antagonist colour cell. Remember the concept rather than the vocabulary here!

Two main types of ganglion cells can be distinguished.

Those with big cell bodies are the M- or magnocells; the smaller ones are the P- or parvocells. The axons of these types group to build parallel magnocellular and parvocellular pathways in the optic nerve. M cells get input from many bipolar cells; thus their spatial resolution is low, due to large receptive fields. As M cells are mostly present in peripheral retinal areas, they are not colour sensitive, but they respond better to low contrast. Their response is fast and phasic (responding to dynamic changes in their receptive fields, e.g. stimulus on- and offsets). So the magnocellular pathway is considered to provide “quick and dirty” information. In contrast, P cells are most abundant in the central retinal area. They are colour sensitive and respond better to high contrast. Their response is slow and tonic (sustained). Therefore the parvocellular pathway is considered to provide “slow and detailed” information.

The next stage of processing is the LGN (lateral geniculate nucleus) in the thalamus. Axons from the nasal part of each retina cross to the contralateral hemisphere at the optic chiasm, so information from one side of the visual field is sent to the contralateral LGN. The LGN consists of six different cell layers, each of which gets information from either the left or the right eye. Layers one and two consist of large cells, the magnocellular layers; layers three to six are the parvocellular layers. Axons of the magnocellular pathway terminate solely in the magnocellular layers, and the same holds for the parvocellular pathway. Axons from cells in the LGN project to the visual cortex in separated pathways that even enter different sublayers of the primary visual cortex.

Visual processing is to a great degree integration of input.

Comparing the number of rods and cones with the number of ganglion cells shows that a strong integration of information takes place: on average, 120 receptor cells are analyzed by one ganglion cell. Integration processes occur at several further stages of the visual information path. Integration of information is also a kind of filtering that provides the cortex with supposedly very relevant information, regions of high contrast for example. You will find the principle of integrative processing again in the chapter focusing on primary visual cortex (V1). Simple cells and complex cells in V1 integrate information from neighbouring receptive fields provided by LGN cells. This enables them to react to light bars of a certain orientation. So convergent information processing is maintained along the P- and M-pathways, leading to more conceptual response properties of the involved cells.

Summary

Information about a visual scene is already processed at the retinal level. The circuitry of the retina allows for enhancing contrast and for detailed colour vision in the central region. Colour vision depends on the activation of three types of cones. The principles

13 Scheme for centre-surround antagonistic coloured receptive fields

Page 15: Action and Cognition


of mixing colours are reflected in the activity of cones. Bipolar and ganglion cells have receptive fields, either colour sensitive or not, and respond depending on the shape of the light stimulating their receptive field. Two types of ganglion cells, big and small, build the magnocellular and the parvocellular pathway, providing the cortex with different visual information. The magnocellular pathway carries “quick and dirty” information from the non-central visual field. The parvocellular pathway carries “slow and detailed” information from the central visual field. These two pathways stay segregated at least up to the primary visual cortex, where they allow for more elaborate cell responses.

Page 16: Action and Cognition


CHAPTER 02: The Primary Visual Cortex

Introduction

In the previous chapter, we have seen the stages and properties of subcortical visual processing. In contrast to processing at cortical levels, which we will cover in this chapter, subcortical visual processing is mostly hierarchical and sequential: higher areas are engaged later in the processing pathway and tend to have bigger receptive fields. Additionally, within all stages of these early processing pathways, visual information is represented topographically: information from each visual hemifield is projected in an orderly fashion onto the lateral geniculate nucleus (LGN) in the contralateral thalamus. Moreover, the major portion of the LGN represents the fovea, the small region of the retina with the highest density of photoreceptors. We have also seen that signalling from retinal ganglion cells to the thalamus occurs via the magno- and parvocellular pathways. These pathways have different response properties and give rise to a functional distinction into parallel pathways. In the LGN, distinct laminar structures, the M- and P-layers, receive information exclusively from one of the pathways.

This chapter will provide you with information about the subsequent cortical stages of visual processing. We will investigate the major output region of LGN cells, the primary visual cortex (V1). V1 is located on the occipital pole of the primate brain and serves as the major input gate for cortical visual processing. We will learn that the response properties of neurons increase in complexity as we move from the LGN to V1. Nevertheless, we will see that the segregation between the M- and P-pathways is partially maintained in V1.

The existence of distributed and parallel processing in higher visual pathways leads to the so-called binding problem. The binding problem refers to the question of how different sensory information from different processing areas leads to a unitary percept. In this chapter we will discuss the classical approach, and in chapter 4 a more recent approach, towards the problem of perceptual binding.

01 The Primary Visual Cortex, located at the occipital poles of primate brains.

Page 17: Action and Cognition


02 The visual pathways leading to the primary visual cortex. The primary visual cortex is the major input gate for cortical visual processing.

Page 18: Action and Cognition


Primary Visual Cortex V1

Location and Nomenclature

V1 sits caudally in the occipital lobe of the primate brain. Other expressions for V1 are striate cortex and also Brodmann's area 17. V1 is the initial site for cortical processing of visual information.

Cytoarchitectonics

V1 in humans is about 2 mm thick and contains six layers of cells. These layers are located between the pia mater, the membrane surrounding the primate CNS, and the underlying white matter. The laminar organization results from different cell types in each layer. Like the whole cortex, V1 is made up of pyramidal and stellate cells. Whereas stellate cells connect locally, pyramidal cells form projections into other cortical and subcortical areas.

V1 is classified cytoarchitectonically as Brodmann's area 17.

The human cortex contains mainly two different types of cells: projection neurons and local interneurons. Projection neurons are large, pyramidally shaped cells. They are located in layers 2, 3, 4, 5 and 6 and form excitatory (transmitter: glutamate) projections onto other cortical and subcortical areas. Local inhibitory (transmitter: GABA) interneurons are found in all cortical layers and typically form connections within the area they originate in. These inhibitory interneurons are distinguished by the connectivity patterns they form and the cotransmitters they contain. For example, basket cells form inhibitory connections with the cell bodies - somata - of other neurons.

The interplay between projection neurons and interneurons results in global and local, excitatory and inhibitory cell interactions. These interactions help to explain both the specialization and the flexibility of cortical networks. Also important for V1 are local excitatory stellate-shaped interneurons. They are most prominent in layer 4 of the human cortex and are the primary targets of thalamocortical projections. Besides the laminar organization, the human neocortex also contains columns which span several layers and make up functional modules. In V1, for example, we will encounter orientation-selective columns made up of cells with the same response properties to light bars of a specific orientation.

BOX 2.1: Cell Types in Primate Cortex

03 Connectivity in V1. Projection neurons and interneurons make up the primary visual cortex.

04 Two different staining techniques reveal V1's cytoarchitectonic structure.

This classification was made because area 17 contains a prominent stripe-like structure in the internal granular layer (the fourth layer from the top). The stripes result mainly from heavily myelinated fibers arranged in bands (the stria of Gennari). This structure also gave rise to the term striate cortex for V1.

Functional Characterization

Functionally, we will look at the input to V1, its internal connectivity and its output to other cortical structures.

Input: Projections from LGN to V1

V1 receives its main input from the parallel M- and P-pathways of the LGN. Layer 4 is the major input layer from the LGN and is subdivided into different sublayers - the stripes the term striate cortex refers to: 4A, 4B, 4Cα and 4Cβ. In V1 the anatomical segregation of the M- and P-pathway is maintained: M-cells (mαgno) project mainly onto layer 4Cα, whereas P-cells (βarvo) project onto layer 4Cβ. Once the input from the LGN arrives in layer 4, spiny stellate cells relay the incoming information to further layers in V1. Thus the orderly representation of visual information in the LGN is conserved in the input relays of V1.

Connectivity

But what happens to the information in V1? The functional organization of V1 - its connectivity and its cells' response properties - mainly reveals that 1) the represented information becomes more complex and 2) the functional segregation of the M- and P-pathways is maintained. We will first address the form of information representation in V1. To this end, we will first investigate the response properties of single cells and then the functional organization of cell groups in V1.

05 Connectivity in V1. In part A you see V1's input from the LGN. Part B shows the most prominent cell types in V1. Part C shows a cybernetic scheme of the information flow in V1.

06 Input to V1. The early segregation of M- and P-pathways remains partially maintained in V1.

Cell Response Properties: The classical experiments

In the early 1960s, the later Nobel laureates Hubel and Wiesel studied the response properties of cells in the early stages of visual processing (such as the LGN and V1). They found that cells in V1 have strikingly different response properties from cells in the LGN.

Simple and Complex cells

We have already seen that visual processing from the retina to the LGN is based on the concept of concentric receptive fields: receptive fields in these areas are organized in a center-surround fashion, with antagonistic excitatory and inhibitory regions. The cells respond optimally to small spots of light in their excitatory region.

Hubel and Wiesel found that cells in all other layers of V1 (except 4C) reveal different response properties. Instead of being optimally responsive to small spots of light, neurons in V1 respond best to linear stimuli of specific orientations. They encountered primarily two different functional types of cells: simple cells and complex cells.

Simple cells are abundant in layers 4 and 6. A simple cell is tuned to a bar of light with a specific orientation: if a bar at the optimal orientation falls within the receptive field, the cell responds maximally, and its response decreases for other, non-preferred orientations. In order to explain this increase of response complexity, Hubel and Wiesel used a feed-forward explanation: a simple cell receives convergent input from thalamic cells with the usual concentric receptive fields. If the circular centers of these thalamic cells are aligned along a straight axis, the ensemble of these neurons can represent light falling on a straight line of that orientation.
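Hubel and Wiesel's feed-forward scheme can be sketched computationally. The following toy model is a hedged illustration only - the unit parameters, grid sizes and function names are assumptions, not physiological values. It sums half-rectified responses of difference-of-Gaussians "LGN" units whose ON centers lie along a vertical line; the resulting "simple cell" prefers a vertical bar over a horizontal one:

```python
import numpy as np

def dog_response(img, cy, cx, sc=1.0, ss=2.0):
    """Center-surround (difference-of-Gaussians) response of one
    LGN-like unit with an ON center at (cy, cx)."""
    h, w = img.shape
    y, x = np.mgrid[0:h, 0:w]
    d2 = (y - cy) ** 2 + (x - cx) ** 2
    center = np.exp(-d2 / (2 * sc ** 2)) / (2 * np.pi * sc ** 2)
    surround = np.exp(-d2 / (2 * ss ** 2)) / (2 * np.pi * ss ** 2)
    return np.sum((center - surround) * img)

def simple_cell(img, centers):
    """Feed-forward simple cell: half-rectified sum over LGN units
    whose ON centers lie on a straight (here: vertical) line."""
    total = sum(max(dog_response(img, cy, cx), 0.0) for cy, cx in centers)
    return max(total, 0.0)

# LGN centers aligned along a vertical axis -> vertical preferred orientation
centers = [(10, 15), (15, 15), (20, 15)]

vertical = np.zeros((31, 31)); vertical[8:23, 15] = 1.0     # preferred bar
horizontal = np.zeros((31, 31)); horizontal[15, 8:23] = 1.0  # orthogonal bar

print(simple_cell(vertical, centers) > simple_cell(horizontal, centers))  # True
```

The vertical bar drives all three aligned ON centers, while the horizontal bar drives only one of them, so the summed (rectified) response is larger for the preferred orientation.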

The other cell type they encountered is the complex cell. Complex cells are found in layers 2, 3 and 5 and are also tuned to stimuli of a specific orientation. So both cell types are orientation selective. But, in contrast to simple cells, the position of stimuli with

07 Response properties of center-surround cells in the LGN and of simple cells in V1. Below: Hubel and Wiesel's feed-forward explanation for V1 simple cells.

08 Hubel and Wiesel's feed-forward explanation for V1 complex cells.

optimal orientation within the receptive field is less important. A light bar of optimal orientation moved across the whole receptive field of such a cell triggers a strong response; other orientations trigger weaker responses. Complex cells are therefore tuned to movements of linear stimuli along a specific oriented axis. In order to explain this increase of response abstraction, Hubel and Wiesel again used a feed-forward explanation: they assumed that a complex cell receives convergent input from simple cells with the same orientation selectivity but with slightly offset receptive field positions. In this way the receptive field of a complex cell is built up from the individual adjacent receptive fields of preceding simple cells.
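The pooling idea can likewise be sketched in a few lines. In this toy model (all tuning widths, constants and names are illustrative assumptions), a "complex cell" takes the maximum over simple cells sharing a preferred orientation but differing in receptive-field position, yielding orientation selectivity combined with position tolerance:

```python
import numpy as np

def simple_response(theta, x, pref_theta, pref_x, k_theta=4.0, k_x=2.0):
    """Toy simple cell: tuned to both orientation AND position of a bar."""
    return np.exp(-k_theta * (theta - pref_theta) ** 2
                  - k_x * (x - pref_x) ** 2)

def complex_response(theta, x, pref_theta, positions):
    """Feed-forward complex cell: pools (max) over simple cells with the
    same preferred orientation but offset receptive-field positions."""
    return max(simple_response(theta, x, pref_theta, p) for p in positions)

positions = np.linspace(-1.0, 1.0, 9)

# Preferred orientation at two different positions: strong response at both
r1 = complex_response(0.0, -0.8, 0.0, positions)
r2 = complex_response(0.0, +0.8, 0.0, positions)
# Orthogonal orientation: weak response everywhere
r3 = complex_response(np.pi / 2, 0.0, 0.0, positions)

print(r1 > 0.9, r2 > 0.9, r3 < 0.1)  # True True True
```

Max-pooling over position is what makes the model cell respond to an optimally oriented bar anywhere in its (larger) receptive field, mirroring the behavior described above.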

Whether the response behavior of simple and complex cells can be fully explained by Hubel and Wiesel's feed-forward approach is still under debate. The major shortcoming of the classic feed-forward explanation is that it cannot account for highly selective cells with a narrow tuning to a specific orientation. Current theories on orientation selectivity (e.g. Shapley et al., 2003) point out that orientation tuning cannot be explained solely by feed-forward excitation; it rather results from more complex global and local, both excitatory and inhibitory, interactions.

Modular organization in V1 – the ice-cube hypothesis

Additionally, Hubel and Wiesel found that V1 is organized into layer-spanning functional columns. Each column contains cells in layer 4C with concentric receptive fields and extends from the pial surface to the underlying white matter. Above and below the input layer 4, each column contains simple cells with almost identical response properties, monitoring a designated portion of one retina. An array of different functional columns is necessary to analyze a discrete region of the visual field. Such arrays are called modules or hypercolumns because they are considered functional units within V1. Each module contains three prominent structures: orientation columns, ocular dominance columns, and blobs.

Orientation columns

Hubel and Wiesel found that columns with differing preferred axes of orientation lie next to each other in an orderly fashion. These results were obtained by vertical and oblique electrode penetrations, measuring the response properties of the penetrated cells to specifically oriented stimuli. The common feature of each such column is that, above and below the input layer, simple cells monitor almost identical retinal positions with identical response properties to a specific axis of orientation. A complete cycle of orientation preferences repeats about every 3/4 mm of cortex. Another topological feature of V1 is that neighboring areas of the visual field are mapped onto adjacent orientation columns.

Ocular dominance columns

Remember that in the binocular zone of the visual field each point is captured by both eyes. For example, all

The transition from the concentric receptive fields of the LGN to the orientation selective simple cells of the cortex is still not fully understood.

The classical feedforward model of Hubel and Wiesel states that linearly adjacent LGN cells with identical antagonistic response properties converge onto simple cells in layers 4 and 6. The LGN cell activities are summed linearly and thus yield the activity of simple cells in V1. This model has the great virtue of being explicit and calculable. Nevertheless, it does not account for highly orientation-selective cells. These cells show sharply peaked tuning curves and zero activity for stimuli orthogonal to their

BOX 2.2: Paper Review: Shapley et al., Dynamics of orientation selectivity in V1 and the importance of inhibition

preferred orientation.

In a recently published paper, Shapley, Hawken and Ringach (2003) investigated the mechanisms underlying dynamic orientation tuning of cortical cells. They found that the orientation tuning of V1 cells is generated mainly by tuned enhancement from thalamocortical feed-forward connections and global suppression supplied by cortico-cortical inhibition. Additionally, they found that tuned suppression near a cell's preferred orientation sharpens tuning peaks and may be an important mechanism for generating highly selective cells with a narrow response bandwidth.

A more refined explanation of the complex behavior of cortical cells should therefore not only take subcortical excitation into account but also the interplay between global and local, excitatory and inhibitory stimulation.
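The sharpening effect of untuned suppression described in the box can be illustrated with a minimal sketch (the Gaussian width and suppression strength are arbitrary assumptions, not values fitted to Shapley et al.'s data): subtracting a constant cortical suppression from broadly tuned feed-forward excitation, followed by rectification, narrows the tuning curve and silences the response to orthogonal stimuli:

```python
import numpy as np

thetas = np.linspace(-90, 90, 181)  # stimulus orientation (degrees)

# Broadly tuned feed-forward (thalamocortical) excitation
excitation = np.exp(-thetas ** 2 / (2 * 30.0 ** 2))

# Untuned (global) cortical suppression, then half-wave rectification
suppression = 0.5
tuned = np.maximum(excitation - suppression, 0.0)

def halfwidth(r):
    """Half width at half height of a tuning curve centred at 0 degrees."""
    half = r.max() / 2.0
    return thetas[r >= half].max()

print(halfwidth(excitation))  # broad feed-forward tuning
print(halfwidth(tuned))       # narrower: inhibition sharpens the peak
assert halfwidth(tuned) < halfwidth(excitation)
assert tuned[thetas == 90].item() == 0.0  # zero response to orthogonal bars
```

The same "iceberg effect" is why the rectified curve, unlike the purely feed-forward one, shows zero activity for orthogonal stimuli, as highly selective cells do.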

09 The ice-cube model. Orientation and ocular dominance columns in V1.

10 Orientation columns in the monkey primary visual cortex.

11 Organization of orientation columns, ocular dominance columns and blobs in V1.

points in the right visual hemifield are captured by the right nasal hemiretina and the left temporal hemiretina. Therefore the primary visual cortex of each hemisphere conveys information about the contralateral visual field delivered from both eyes. In addition to columns of cells sensitive to specific orientations, cells in V1 are also organized into alternating ocular dominance columns. Ocular dominance columns represent an orderly arrangement of cells that are responsive only to input from the left or from the right eye.

Blobs

The orderly arrangement of orientation columns is regularly interrupted by peg-like structures of 150-200 µm in diameter - the blobs. Blobs are most prominent within layers 2 and 3 and are located at the centers of the ocular dominance columns of V1. Cells in the blobs do not respond to specific axes of orientation but are assumed to deal with color processing.

So far, the modular organization of V1 reveals that each module monitors the whole information from a particular spot of the visual field. The three structures making up a module are capable of representing information about orientation, binocular interaction, motion and color. A complete sequence of ocular dominance columns and orientation columns is repeated in a regular and precise manner over the cortical surface of V1.

BOX 2.3: Non-Classical Receptive Fields

Experiments have shown that the behaviour of orientation-selective cells is more complex than suggested by early computational feed-forward models (Hubel & Wiesel, 1961). We have already stressed the importance of cortico-cortical interactions for the tuning of highly orientation-specific cells. Cortico-cortical interactions may additionally explain contextual effects on a cell's response behavior: measurements have shown that once a cell is activated by a stimulus in its classical receptive field (CRF), another, simultaneously presented stimulus outside that field can modulate the cell's response. This mostly inhibitory effect is referred to as non-classical receptive field (non-CRF) inhibition and is exhibited to a varying extent by about 80% of the orientation-selective cells.

12 Connectivity properties in primary visual cortex lead to the concept of non-classical receptive fields: cells with similar response properties are connected more intensively.

13 A computational model for non-CRFs. a) Non-CRF inhibition is caused by the surround of the CRF. b) The pop-out effect of oriented line segments on a background of other segments. The segment pops out only if its orientation is sufficiently different from that of the background. c) The three legs of the triangle are not perceived in the same way: the leg which is parallel to the bars of the grating does not pop out like the other two legs.
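A minimal sketch of non-CRF (surround) suppression, assuming a simple divisive form (the weights and drive values are illustrative and not taken from any cited model): a cell's CRF drive is suppressed by the activity of like-oriented neighbours, so a segment on an orthogonally oriented background "pops out":

```python
import numpy as np

def v1_response(center_drive, surround_drives, w=0.5):
    """Toy non-CRF model: the classical-RF drive is divisively suppressed
    by the average activity of like-oriented neighbours (the surround)."""
    return center_drive / (1.0 + w * np.mean(surround_drives))

# A vertical segment on a background of vertical segments: strong
# iso-orientation surround suppression -> weak response, no pop-out
uniform = v1_response(1.0, [1.0] * 8)

# The same segment on a background of horizontal segments: the surround
# neurons preferring vertical are barely driven -> strong response, pop-out
popout = v1_response(1.0, [0.1] * 8)

print(popout > uniform)  # True
```

The same mechanism accounts for panel c) of the figure: the leg parallel to the grating sits among strongly driven like-oriented neighbours and is suppressed, while the other two legs are not.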

Connectivity

Vertical and Horizontal Connections

Within the primary visual cortex one has to distinguish between vertical and horizontal connections. Vertical connections pass through single or adjacent columns. Input from the M- and P-cells terminates in layers 4Cα and 4Cβ, respectively.

In contrast, horizontal connections link the different vertically oriented columnar units in V1 and enable communication between cells within a particular layer. Two major properties of horizontal connections have been found:

1. Horizontal connections tend to link adjacent regions, as shown by radiolabeled tracing experiments.

2. Cells with the same response properties - e.g. the same preferred orientation, or the same functional property within the blobs - tend to be highly interconnected.

These features reveal that horizontal connections tie together cells with the same functional and topographical response properties. Horizontal connectivity may therefore enable cells in V1 to be modified in their responses by stimuli outside their classical receptive fields. This implies contextual effects on the behavior of particular neurons in V1 and gives rise to the concept of non-classical receptive fields.

The Classic Conception: Does Convergence Solve the Binding Problem?

In the preceding paragraphs we traced the visual processing pathway up to the primary visual cortex. One of the crucial properties of visual information processing in the early stages is the striking bottom-up convergence of cells combined with an increasing response complexity:

First, light elicits responses in single photoreceptors. A circular area of these receptors on the retina is monitored by a single bipolar cell. Bipolar cells already differentiate their responses in an antagonistic center-surround fashion with respect to the position of the light. Bipolar cells in turn are grouped into the receptive fields of retinal ganglion cells, and each lateral geniculate nucleus cell surveys the activity of several retinal ganglion cells. Retinal ganglion and geniculate cells thus respond to contrasts. This information is transformed by the cells in V1: simple cells respond to preferred oriented bars of light at specific positions within their receptive fields, and the receptive fields of several simple cells are in turn monitored by complex cells, which are tuned to specifically oriented bars anywhere within their receptive fields.
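The chain of convergence described above can be put into a toy calculation (the fan-in numbers are invented for illustration only; real convergence factors vary widely across the retina and cortex):

```python
# Hypothetical convergence factors along the early visual hierarchy
# (illustrative numbers, not measured values):
stages = [
    ("photoreceptor", 1),
    ("bipolar cell", 10),     # ~10 receptors per bipolar RF (assumed)
    ("ganglion cell", 10),
    ("LGN cell", 2),
    ("V1 simple cell", 3),    # a few aligned LGN centres
    ("V1 complex cell", 4),   # a few offset simple cells
]

rf = 1
for name, fan_in in stages:
    rf *= fan_in
    print(f"{name:15s} pools ~{rf} photoreceptors")
```

Even with these modest made-up fan-ins, a complex cell already pools thousands of photoreceptors, which is the multiplicative growth of receptive-field size that underlies the convergence argument.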

14 The classic (but superseded) solution to the binding problem postulates binding solely by convergence.

The early findings of these response properties led to the classic solution of the binding problem. The underlying hypothesis is that unitary perceptual processes may result from a bottom-up process of increasing convergence and complexity: moving up the visual processing hierarchy, the response and information processing properties of the cells involved become more and more abstract and conceptual, until finally semantic levels are reached. These levels are made up of highly specialized neurons which may encode your mama or the very specific combination of properties making up your brand-new red Manta Torano sports car. The classic conception therefore fits well with our intuition of how the brain/mind works: according to good old Descartes, somewhere in the brain lies a center of convergence where all information comes together and is subject to a uniform perceptual interpretation.

The problem for the classic conception is obviously the numerical explosion of neurons engaged in the encoding of our percepts. For recognition, every object of our perception would have to be represented by at least one neuron in the hypothetical center of convergence. Additionally, there would have to be neurons encoding every possible transformation of each object, such as rotations or perspectival distortions. Further, it is implausible that we have a cell for each instance determined by the values of specific properties: if we had a cellular representation of a well-known IKEA cupboard in each color variation, we would likely have a brain full of Billys. Besides this purely theoretical objection, the classical answer also has to face the fact that no center of convergence has been found in the neural architecture of the brain. In chapters four and nine we will look at more elaborate answers to the binding problem.
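A back-of-envelope calculation makes the numerical explosion concrete. Every count below is a rough, hypothetical assumption chosen only to illustrate the order of magnitude:

```python
# Back-of-envelope illustration of the combinatorial explosion
# (all counts are rough, hypothetical assumptions for the argument):
objects = 30_000         # recognizable object categories
viewpoints = 100         # rotations / perspective distortions per object
positions = 1_000        # retinal positions
colors = 50              # colour variants (all those IKEA Billys)

dedicated_cells = objects * viewpoints * positions * colors
cortical_neurons = 16_000_000_000  # order of magnitude for human cortex

print(dedicated_cells)                      # prints 150000000000
print(dedicated_cells > cortical_neurons)   # True
```

Even with these conservative made-up numbers, a dedicated-cell scheme needs about an order of magnitude more neurons than the cortex contains, before counting conjunctions of multiple objects in a scene.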

Overview

This chapter dealt with the primary visual cortex V1. V1 is located in the most caudal part of the occipital lobe. We investigated the cytoarchitectonic, histological and functional structure of V1. Cytoarchitectonically, V1 reveals the typical pattern of a primary sensory cortex: it contains a large thalamocortical input layer 4, which is further subdivided into sublayers. The sublayers 4Cα and 4Cβ convey information almost exclusively from M- or P-cells of the LGN, respectively. We thus saw that these parallel pathways remain partially segregated in V1.

Histologically, besides cells with concentric receptive fields, which are also found in earlier stages of visual processing, V1 contains mainly two different cell types: simple and complex cells. Whereas simple cells are tuned to a bar of light of a specific orientation, complex cells monitor movement along a specific axis of orientation within their receptive fields. Functionally, we have encountered vertical groupings of cells with almost identical response properties - the columnar structures. Regular patterns of orientation and ocular dominance columns are interrupted by color-sensitive structures - the blobs. An array of these structures covering all aspects of visual information for a particular spot of the visual field is grouped into a functional module or hypercolumn. Horizontal connections between columnar units sharing identical or similar response properties further enable visual processing to exploit the computational concept of non-classical receptive fields. The next chapter will introduce the two most prominent processing pathways - the dorsal and the ventral pathway - and the most important areas along them.

CHAPTER 03: Vision, Vx, Representation of the World

Introduction

This chapter gives a rough outline of cortical visual processing beyond area V1 and of how a representation of the visual world might be constructed in the brain. At the beginning, we review the functionality and anatomical structure of the primary visual cortex (V1) and reconsider its input and output relations with respect to its laminar structure. Starting in V1, we then follow the gross connections towards other areas of visual processing.

It was originally thought that visual processing is serial and purely hierarchical: more elementary areas project increasingly complex information onto subsequent, more complex areas. This assumption was supported by the strong convergence of low-level neurons onto neurons of higher areas. The implied increase of abstraction in neuronal response and representation properties quickly suggested a solution to the problem of perceptual binding. The core of the classical view - increasing abstraction of processed information - still holds. Nevertheless, it has become widely accepted that the classical view is misleading. Methods of cognitive neuroscience, such as lesion studies and brain imaging techniques, have shown that cortical visual processing occurs in parallel across different, heavily interconnected areas. Interconnected means that these areas are not solely unidirectionally connected; instead, areas connect bidirectionally, though asymmetrically, with other areas.

Therefore, we will focus on a more elaborate physiological pathway hypothesis. The pathway hypothesis states that several parallel pathways exist which process different aspects of a visual scene. It is traditionally assumed that a dorsal processing stream is concerned with spatial and motion information, whereas a ventral stream is engaged in form and color processing. Findings on neuronal response properties in the areas along these hypothetical pathways support the hypothesis. Nevertheless, we will find that the heavy interconnectivity throughout the entire cortex casts massive doubt on oversimplified structural models.

In the second part of the chapter, we will follow both pathways and describe the most extensively studied areas within them.

01 The subway map (Felleman & Van Essen, 1991).

The Subway Map

In chapter 01 we have already encountered the Felleman & Van Essen map (1991). It shows the interconnections of cortical areas in a schematized way: areas are represented as nodes in a network, and connections between them are plotted as arcs. The map was compiled on the basis of metastudies on macaque monkeys, which means that a huge amount of data from various studies underlies the plot. We can see at least six important features of visual processing emerging from the image:

1. Visual processing occurs in multiple areas.

2. Visual processing is partially hierarchical (at lower levels). This hierarchy may emerge from the anatomical structure. Although the map only shows the interconnections between cortical areas, lower levels of visual processing, such as the retina or the lateral geniculate nucleus (LGN), are plotted at the bottom of the map, while higher visual areas concerned with more specific processing, such as the inferior temporal (IT) regions, are located towards the top. Additionally, the functional architecture of the map resembles the topological arrangement of the visual areas.

3. Visual processing acts in parallel. The diagram reveals a strong degree of interconnectivity. The arcs in the network are not merely unidirectional but reciprocal: they connect areas bidirectionally. It is important to mention that areas are not connected completely symmetrically; the connections in the two directions originate and terminate in different layers.

4. Lower-level processing differs from higher-level processing. The network properties of the lower part of the map differ from those of the upper part: the lower part has a more vertically arranged structure, while the upper part appears more horizontally aligned. These differences reflect distinct processing principles: lower levels, such as the retino-geniculate-cortical pathways, are organized more sequentially and serially, whereas higher levels act more in parallel and in interdependence.

5. There exist several functional pathways.

6. There is no unique final representation. Instead of a final convergence region, the diagram clearly shows that visual processing occurs in a highly distributed and parallel manner. How unitary perceptual impressions may be generated - bound together - from multiple representations is still under heavy scientific debate. We will address one solution - the binding-by-synchronization mechanism according to Singer et al. - in the next chapter.
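The reciprocal-but-asymmetric connectivity in features 3 and 6 can be made concrete with a tiny stand-in for the map (the area subset and edge labels below are illustrative; the real diagram contains dozens of areas and hundreds of connections):

```python
# Illustrative (heavily simplified) subset of the Felleman & Van Essen
# connectivity graph. Edges come in reciprocal pairs, but the two
# directions differ: feedforward and feedback connections originate
# and terminate in different cortical layers.
connections = {
    ("LGN", "V1"): "feedforward",
    ("V1", "LGN"): "feedback",
    ("V1", "V2"): "feedforward",
    ("V2", "V1"): "feedback",
    ("V2", "V4"): "feedforward",  # ventral stream
    ("V4", "V2"): "feedback",
    ("V2", "MT"): "feedforward",  # dorsal stream
    ("MT", "V2"): "feedback",
}

# Every connection in this subset is reciprocated...
reciprocal = all((b, a) in connections for (a, b) in connections)
# ...but the graph is not symmetric: the two directions have different types
symmetric = all(connections[(a, b)] == connections[(b, a)]
                for (a, b) in connections)

print(reciprocal, symmetric)  # True False
```

Note also that V4 and MT each receive input without any edge between them here: already in this toy subset there is no single node where all paths converge.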

Is the connectivity of your brain represented by the Felleman and Van Essen diagram?

The Felleman and Van Essen diagram reveals the general scheme underlying almost the whole connectivity of visual processing. But is it representative of your brain? The diagram results from large-scale statistical analyses of monkey studies. Felleman and Van Essen interpolated a lot of data in order to plot a general scheme of cortical connectivity. This involved data from various subjects, which showed a large statistical variance in cortical topography and connectivity. Although the gross features revealed by the map (1-5) seem to underlie vision, it is highly probable that the connectivity map of an individual brain would look somewhat different.

Central Visual Pathways

The two early parallel pathways - the magno- and parvocellular pathways - convey different types of visual information from the retina through the LGN to V1. Whereas the M-pathway is mainly sensitive to dynamic changes, which it transmits at poor spatial resolution, the P-pathway delivers information with good spatial but poor temporal resolution. We have also seen that these pathways remain fully segregated in the input layer 4C of V1: P-cells project to layer 4Cβ, M-cells to layer 4Cα. Given facts, like the maintained segregation in layer 4C of V1, together with facts still missing, gave rise to the early pathway hypothesis.

The early pathway hypothesis

The early pathway hypothesis assumes that the M- and P-pathways remain segregated beyond V1 and form two parallel and distinct processing streams: the M-pathway feeds into a dorsal stream leading to the posterior parietal cortex, which mainly processes spatial and motion information. The P-pathway instead feeds into a ventral stream leading to the inferior temporal region. Traditionally, the ventral stream is associated with "what" information - color and form - whereas the dorsal stream is associated with processing "where" information. The early pathway hypothesis is attractive since it is compellingly simple. We can take for granted that a dorsal and a ventral visual pathway with different functions exist. But both the inputs to the M- and P-pathways and their functional structure are not as simple as the early pathway hypothesis states.

A more refined pathway model

In the following paragraphs, we will augment the early pathway hypothesis. The augmented version is also based on the existence of a dorsal - where - and a ventral - what - pathway. First, we will investigate the processing of information from the M- and P-pathways in V1 and subsequent areas. We will identify several sites at which information from both pathways converges. This makes up the argument against the simplified early hypothesis of two totally independent

channels. Furthermore, we will formulate a more refined pathway model and point out the respective areas engaged in each path. Although the refined pathway model is still the most prominent one, we will present some experimental results casting doubt even on the augmented model.

From V1 to V2

In the projection from V1 to V2 the M- and P-pathways remain partially segregated. As in the early model, the P-pathway is associated with the ventral stream and the M-pathway with the dorsal stream.

P cells feed into the ventral pathway:

The P-cell paths run almost exclusively along the ventral stream: you can trace projections from P-cells up to the inferior temporal cortex, sitting near your ears. P-cell projections from the LGN terminate in layer 4Cβ. From 4Cβ, information passes to the blob and interblob regions in layers 2 and 3 of V1. Remember that blob regions are embedded in the columnar structure of V1 and contain wavelength-sensitive cells, while an interblob region subsumes the remaining areas in the columnar structure. Thus interblob regions are not tuned to color but merely to the orientation of stimuli.

Layers 2 and 3 therefore mark the initial site where the processing of chromatic and achromatic stimuli proceeds in parallel within the ventral pathway.

This parallel processing is maintained throughout the rest of the ventral path: the blobs project strongly onto the thin stripes of area V2, whereas the interblobs project strongly onto the neighboring interstripe regions of V2. From these regions, projections go mainly into area V4. Following the processing hierarchy of the ventral stream upwards, increasingly complex form and color information is processed. For example, Hegdé and Van Essen found that cells in V2 respond best to complex stimuli such as circular shapes filled with hyperbolic gratings.

The M cells feed into the dorsal pathway

M cells are sensitive to dynamic changes within their receptive fields and are thus capable of detecting motion. M-cells terminate both in layer 4Cα and layer 4B of V1. From the input layer of the cortex, this input is apparently relayed without considerable modification: cells in 4Cα project onto layer 4B. Layer 4B contains direction-sensitive neurons; these cells respond best to stimuli moving in the direction they are tuned for. Layer 4B in turn sends axons either to the thick stripe regions of V2 or to the middle temporal area (MT). Both areas are situated along the dorsal stream.

Integration of both

Although it is taken for granted that M- and P-cells quantitatively dominate the bottom-up input of their respective pathways, the pathways do not remain exclusively segregated. There is plenty of neuroscientific evidence suggesting that total segregation, as proposed by the early model, is misleading:

- In V1, both M- and P-cells provide input to the blobs and interblobs. Besides, several cross-connections exist between blob and interblob regions.

- Selective blocking of the M-Pathway leads to an

02 Connectivity in V1

03 The Pathway Hypothesis

altered response behavior of cells in the superficial layers of V1.

We can conclude that there is substantial segregation of the M- and P-pathways up to V1, probably a separation into V2, a likely predominance of M input to the dorsal pathway towards MT and the parietal cortex, and a mixture of P and M input to the ventral pathway leading to the inferior temporal lobe. In the next part of this chapter, we will follow each pathway and investigate its most prominent regions. We will start with the ventral pathway, which processes form information: it extends from the blob and interblob regions of V1 to the thin stripe and interstripe regions of V2, and then passes through V4 on its way to the inferior temporal cortex.

Area V2

Location

Area V2 was classified cytoarchitectonically by Korbinian Brodmann as Area 18. It surrounds area V1 both dorsally and ventrally. It is therefore quite plausible that area V2 functions as a subsequent stage in visual processing.

Anatomy

Like in V1, the upper layers of V2 consist of different functional modules. A cytochrome oxidase (CO) stain marks regions which contain the enzyme cytochrome oxidase; these regions appear dark in the stain. Applying such a stain to V1 reveals a dot-like pattern, since only the blobs in V1 contain CO. In contrast, a cytochrome oxidase stain reveals stripe-like patterns in V2. One can identify thin and thick stripes with a high density of CO as well as pale interstripe regions containing almost no CO.

Input

As we already mentioned, the major projections from V1 to V2 reveal a very parallel structure: The thin stripe regions in V2 receive their main input from the blob regions in V1. Both regions contain a high amount of cytochrome oxidase. It is assumed that color-relevant information is processed here. Xiao et al. suggested that the thin stripes in V2 contain functional maps where the color of a stimulus is represented topographically (Xiao, Wang & Felleman, 2003). The pale interstripe regions in V2 receive strong afferents from the interblob regions in V1. The interblob regions contain orientation-sensitive columns; they are tuned solely to the orientation of stimuli, not to their color. It is therefore assumed that form processing takes place here. The thick stripe regions in V2 receive their major input from cells within layer 4B of area V1. Since 4B gets its major input from the magnocellular LGN pathway, V2 thick stripes are closely associated with that pathway. A significant proportion of V2 neurons projecting to the superior colliculus resides in the thick stripes; a significantly smaller proportion of labelled cells lies in the thin and interstripe compartments (Abel et al., 1997). Therefore the spatial distribution of subcortically projecting neurons can correlate with the internal modular organization of visual area V2. Moreover, different CO compartments in V2 are strongly associated with functionally different pathways (Abel et al., 1997).

04 Top: Cytochrome oxidase stain of V2. Bottom: Areas engaged in ‘early’ cortical visual processing. (Kandel, Jessell & Schwartz, 2000)

05 Areas engaged in cortical visual processing. (Kandel, Jessell & Schwartz, 2000)

Output

Thin stripe and pale interstripe regions project onto structures within V4. It has been shown that they partially maintain their separation into a color and a form channel, although even in area V2 strong interconnections exist between them. Thick stripes instead project onto the middle temporal area (MT).

Area V4

Location

Area V4 is located at the inferior occipito-temporal junction. Therefore V4 is a prestriate region. It is located in the ventral processing stream.

Cells in V4 receive signals from the thin and pale stripes of V2. Although V4 is associated with the analysis of global shape and color, the function of V4 remains somewhat unclear, because cells in V4 exhibit differing response properties: Some neurons in V4 respond to figures with a particular shape, like disks, crosses, gratings and spirals. These extracted shapes may be elementary units for perceiving the shapes of complex objects. Some neurons in V4 respond to edges with a particular orientation; in this respect, V4 neurons resemble neurons in V1 and V2. Notably, the activities and characteristics of neurons in V4 strongly depend on whether we focus on an object to which those neurons respond. Neurons in V4 are not selective to the direction of motion or even to the spatial position. Although area V4 is visuotopically organized, it lacks a fine retinotopic map; its neurons analyze the shape and color of an object regardless of its position. Early suggestions for the functionality of V4 were that it is a centre for color perception. The color-selective neurons in the visual areas before V4 (e.g. blob neurons in V1 and thin stripe neurons in V2) are actually selective not to color but to wavelength. This means that they respond to a particular wavelength of light reflected from a surface. The received wavelength depends on the wavelength of the illuminating light. For example, neurons sensitive to middle-wavelength (green) light do not respond at all to a green surface if the surface is illuminated

In two subsequent studies, Jay Hegdé and David Van Essen (1999, 2003) studied the neuronal response properties in V2. They tried to find out which visual information is represented in V2. The first paper, “Selectivity for Complex Shapes in Primate Visual Area V2” (1999), assesses the response properties of V2 neurons to 128 grating and geometric line stimuli. Hegdé and Van Essen report that many cells in V2 are preferentially responsive to complex contours, including angles, circular shapes, intersections and arcs. Additionally, they found that cells are even better tuned to hyperbolic or polar gratings than to linear gratings. Hegdé and Van Essen also found that individual cells differ substantially in their response properties to stimuli.

Individual V2 cells thus convey complex and very specific information about the contour and texture characteristics associated with visual stimuli. In the second paper, Hegdé and Van Essen focussed on “Strategies of Shape Representation in Macaque Visual Area V2” (2003). They report that populations of V2 cells vary widely in their selectivity for various shape characteristics.

BOX 2.1 Paper Reviews: Hegdé & Van Essen on V2 (1999, 2003)

06 Response profiles of V2 neurons to a stimulus set in Hegdé and Van Essen's experiment. Red indicates strong responses, blue weak responses. (Hegdé & Van Essen, 1999, 2003)


by reddish light. However, in natural circumstances, we perceive the color of a surface to some extent regardless of the wavelength of the illumination. This is called color constancy. Neurons in V4 support this constant color perception: they are selective to the color of a surface rather than to the wavelength of light reflected from it.

V4 is the major source of visual input to the inferior temporal cortex (IT), which is known to be crucial for object recognition. For the rest of this chapter, we will follow the dorsal processing stream. It extends from the direction-selective cells in layer 4B to the thick stripe regions of V2 and on to MT. Within the dorsal stream, spatial and dynamic information crucial for motion processing is computed.

Area MT/V5

Location and Nomenclature

Area MT refers to the middle temporal area. MT is located at the middle temporal sulcus at the edge of the parietal cortex. The term MT is used for non-human primate brains; the human brain contains the homologue hMT+, which is also labeled MT/V5.

As with every area, there are several ways to characterize MT:

Cytoarchitectonics

MT has a high density of myelinated axons, which results in fast connectivity. MT is surrounded by satellite areas dedicated to higher-order motion processing (see Schmolesky, Signal timing across the macaque visual system). The two satellites are the medial superior temporal area (MST) and the fundus of the superior temporal sulcus (FST).

Connectivity

MT receives input from layer 4B of V1 and the thick stripe compartments of V2. These two areas, as already mentioned, mainly process magnocellular information. Further input to MT is provided by a projection from the pulvinar, which in turn receives input from the superior colliculus. The superior colliculus receives a 10% share of the retinal input, while the other 90% take the known route via the lateral geniculate nucleus. This connection therefore bypasses the LGN and V1. MT projects to its satellite area MST and to intraparietal regions like LIP.

Response properties

The major part of neurons in MT is tuned to the direction and speed of moving luminance contrasts. Some cells within MT also respond to movements of color contrasts. Each cell is tuned to a specific direction. The large receptive fields of cells in MT may provide a basis for determining whether a number of points are moving together. This would be an implementation of the Gestalt law of “common fate”.

Topography

MT contains a topographic representation of the visual field. Cells with similar direction selectivity are arranged into vertical columns. Adjacent columns are tuned to similar directions and to adjacent parts of the visual field. This structure resembles the hypercolumns in primary visual cortex: in MT, a “direction hypercolumn” contains a set of adjacent neurons that together cover a whole 360° turn of direction selectivity. Just like in other higher-order areas, the receptive fields of single neurons are comparatively large (the receptive fields of MT cells are about 10 times wider than those of cells in the striate cortex). Additionally, it is important to mention that there is ocular invariance in MT.

07 MT is located at the middle temporal sulcus at the edge of the parietal cortex.

Function

The MT area appears to be engaged in motion processing. From the response properties section we know that a major part of MT cells is selectively tuned to motion in one direction. Only for a minor part of MT cells are the response properties substantially altered by the form and shape of the moving stimulus. Lesion studies in monkey MT or the human MT homologue (hMT+) revealed that subjects with lesions in these areas show impaired motion perception. One can say that MT not only contains motion-selective neurons, but also that activity in MT stands in a causal relationship to our perception of movement.

Higher systems - PPC, FEF, Area 46

In the following section we will investigate three higher-order areas of the dorsal pathway. It is assumed that these are highly integrative areas which may form a neurophysiological basis for attention.

PPC

Location and Connectivity

Neuroimaging and unit-electrode recordings, e.g. from Beauchamp et al., show that V5/MT responses can be modulated by attention. A likely source of this modulation is the posterior parietal cortex (PPC). Area PPC is densely interconnected with V5/MT: it receives extensive feed-forward input and also sends reciprocal connections back to V5/MT.

Topography and function

PPC (also labeled Area 7a) is part of a distributed system of integrative areas of the dorsal pathway. This higher-order system contains PPC, the frontal eye fields (FEF), the cingulate cortex, the prefrontal cortex (Area 46), thalamic nuclei (pulvinar), and other regions. It is assumed that this system plays a crucial role in visual attention. For example, PPC unit firing is enhanced by attentive fixation. The frontal eye fields have been implicated in attention by ablation of areas 6 and 8, engendering the “premotor theory” of attention. Reciprocal projections from the PPC include areas FEF, the frontal operculum, and area 46 (prefrontal cortex). Activity in PPC may be sufficient to explain modulatory effects on feed-forward connections from V2 to V5/MT.

Area 46

Now we come to an area which is not really a part of the visual pathways, but a major player in many discussions about higher brain functions. Area 46 is situated in the frontal lobes. Another expression for area 46, which you may have heard already, is prefrontal cortex (PFC). The PFC features in a lot of coarse theoretical models in cognitive psychology. E.g., it is part of a working memory hypothesis which assumes that PFC functions as an executive control mechanism for the storage and retrieval of memory. Although the functioning of frontal cortical areas still remains somewhat unclear, PFC relates to motivational control and interacts with sensorimotor integration.

07 MT variability across human subjects. MT shown in red.

Is our hierarchy well defined?

The augmented pathway hypothesis assumes a distinction into a mainly dorsal (motion) and a ventral (form) pathway. We have learned that these pathways are not mutually exclusive; various areas, such as the blob and interblob regions in V1, exhibit interconnections between the pathways. The Felleman & Van Essen diagram shows the interconnectivity of cortical areas. We can read a hierarchy both from the arrangement of the areas in the plot and from the increasingly complex types of response properties along the pathways. Several studies on latency measurements address whether the imposed hierarchy is well defined or just arbitrary.

Latencies

Studies on latency measurement record the time delay of cell responses in a specific area relative to the onset of a stimulus. The figure below shows the signal timing across the macaque visual system. Plotted are probability functions over time for different areas engaged in visual processing. On the x-axis we see the time from stimulus onset; a high x-value therefore means a long latency with respect to stimulus onset. The y-value gives the probability with which cells in the respective area have responded. At the maximum y-value of 1.0, a response has definitely occurred in the respective area; the 0.5 value marks chance level. First, the graph shows that most areas are active simultaneously. That fits very well with the idea of distributed and parallel processing. The existence of parallelism weakens the hypothesis of a hierarchical manner of processing. Nevertheless, we can see several indicators which support a hierarchy hypothesis and a distinction between different pathways:

1) There are significant differences between the parvo- and magnocellular pathways. Cells in the magnocellular pathway react more quickly than cells in the parvocellular pathway. The notion of “quick and dirty” processing therefore fits well with M-pathway processing.

2) On average, V1 is activated before other cortical areas. This supports the claim that V1 marks the initial site of cortical processing.

3) Cells in areas of the dorsal pathway on the whole respond earlier than those in areas of the ventral pathway. The most prominent areas here are the M layers of the LGN, V1, MT and MST. Their “counterparts” along the ventral pathway, V2 and V4, have longer latencies. This, in combination with 1), may support the idea that the M and P pathways remain partially segregated during processing in extrastriate cortical areas.

01 Time from stimulus onset measurements help to evaluate the consistency of hierarchy and pathway hypotheses.


Appendix A: Object recognition in IT and STS

Introduction

In the ventral processing stream, the inferotemporal cortex (IT) and the adjacent superior temporal sulcus (STS) (fig 1 and 2) are involved in the recognition of objects and faces, respectively. In the case of face recognition, however, it must be stated that face selective neurons are also common in IT. As a general principle it might be kept in mind that face selective neurons responsive to face identity are found in IT, and ones responsive to facial expressions and view point in STS (Logothetis, 1998). As might be expected from such high-level visual areas, the visually responsive neurons of these areas have very large RFs, most of which include the foveal region. This property might play an important role in the recognition of the same object regardless of its locus in the visual field. This issue will be reconsidered after a brief discussion of the underlying principles of “object selectivity” and the presentation of object selectivity in this area.

01 Location of IT in the subway map and in the primate brain.

02 Location of IT and STS in the primate brain.



Characteristics and Roots of Object Selectivity

Considering the underlying principles governing “object selectivity”, it becomes apparent that the cells are selective for a certain configuration of object features (fig 3), and not for the mere simultaneous presence of these features in an arbitrary arrangement. In this context, which features and which feature arrangements are important can be found out experimentally through single cell recordings. Through this paradigm, selectivity for certain complex stimuli can be assigned to different subareas of IT (fig 4). This also allows one to generate the “simple forms of complex shapes”: simple forms which elicit the same response in a neuron as a complex natural stimulus (fig 5).

03 The responses of face and hand selective cells under display of faces, hands, and the features of these objects are plotted as a function of time. The black bars under the x-axis indicate the timing of the display.

04 A rough assignment of selectivity for different kinds of objects to the areas of cortex which are involved in shape recognition.

05 Natural stimuli which elicit activity in certain populations of IT and STS, and the corresponding “simple” complex stimuli which elicit similar activity.

The roots of this feature selectivity were investigated by Sigala & Logothetis in a monkey study. The monkeys learned to categorize 10 faces (for simplicity, only the results of face categorization will be discussed in this script; the reader might check the original article in case of interest) into two groups (fig 6) in an appetitively motivated operant conditioning task (if the animal responded correctly, it was given a juice reward). The faces varied along 4 dimensions (such as eye height or nose length), and only two of these features (eye height and eye separation) were diagnostic of the category to which a given face belonged. If the eye height was low, or if the eye height was moderate but the eye separation large, the face belonged to the first category; if it did not fulfil these criteria, it belonged to the second (fig 7). Nose length and mouth height were not informative about the categorization. After this task was learned, visually responsive neurons in IT were isolated and their selectivity to these 4 features was computed (fig 8). The results indicated that selectivity for the diagnostic features was significantly above the selectivity for the non-diagnostic ones. This implies that feature selectivity in IT has its roots in the categorization of objects.
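The diagnostic rule of the task can be written out as a small decision function. This is a toy sketch: the discrete feature levels ("low", "moderate", "large") are simplifications of the continuous stimulus dimensions used in the actual study, and nose length and mouth height are deliberately not parameters, mirroring that they were non-diagnostic.

```python
def face_category(eye_height, eye_separation):
    """Toy version of the diagnostic categorization rule:
    only eye height and eye separation matter. Category 1 if the
    eye height is low, or if it is moderate while the eyes are
    far apart; category 2 otherwise. Feature levels are
    illustrative simplifications, not values from the study."""
    if eye_height == "low":
        return 1
    if eye_height == "moderate" and eye_separation == "large":
        return 1
    return 2
```

Written this way, it is immediately visible why a neuron tuned to eye height or eye separation carries information about category membership, while a neuron tuned to nose length does not.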

06 The stimulus set consisted of line drawings of faces with four varying features: eye height, eye separation, nose length and mouth height.

07 Two-dimensional representation of the stimulus space. Black stars represent the stimuli of the first category and ovals represent the stimuli of the second category. Each number indicates the position of one corresponding stimulus from fig 6. As the stimuli differ along 4 dimensions, the two-dimensional representations in this figure result in overlap of stimuli that are distinct. P1 and P2 represent the prototypes. The rest of the circles represent test exemplars that did not belong to a fixed category. The two categories were linearly separable along the eye height and eye separation dimensions (solid line) but not along the nose length and mouth height dimensions.

08 Plots of the average selectivity index of each neuron for the diagnostic vs the non-diagnostic features. Each point represents a neuron. Circles represent neurons with statistically significant selectivity for the diagnostic dimensions. Triangles represent neurons with no selectivity. (The other symbols in the figure are discussed in the original text, not in this script.)


Columns in IT

The “columnar” and “topographical” organization common to many areas of the cerebral cortex is also encountered in IT cortex. Neurons responsive to the same or similar objects, or to the same object features, lie in the same columns (fig 9); adjacent columns code different positions (in terms of view point) of the same or similar shapes (fig 10).

Size and Position (in)variance in IT

In this context, invariance means that a neuron will respond to a stimulus whatever its size and location in the receptive field. Whether the responses of IT neurons are position and/or size independent, as their large RFs might suggest, is still a matter of debate. Some researchers failed to find changes in the neural responses in IT as they changed the size and location of the same face, and this finding was replicated with different types and sizes of objects (Ito et al., 1994; Tovee et al., 1994). But in a more recent study (DiCarlo and Maunsell, 2003) the invariance of neural responses was questioned. In this study, monkeys were trained in an object recognition task with small (0.6° wide) stimuli. The change in neural responses for small differences (1.5°) in retinal position was very high, implying that IT neurons were not position invariant. According to the authors, the discrepancy between this study and the previous ones might be due to the size of the stimuli used in the experiments, or to plasticity effects in IT after very extensive training. Although this is compatible with the observation that the studies using large stimuli found position invariance and the ones using small stimuli found variance, it is clear that further research will be required for a final decision.

09 Columnar organization in IT.

10 Topographical organization in IT.

STS

Face selectivity in STS follows the general rules of object selectivity mentioned above. It is again a certain arrangement of certain features that matters: the absence of a single feature, or even a slight change in the configuration, results in diminished responses (fig 11). Some populations of face selective neurons in STS are also “view” selective. For instance, a given neuron might respond to the side view of a face but not to the front view of the same face (fig 12).

11 Response of a face selective neuron during the presentation of different complex stimuli. The blue shaded area refers to the time of presentation.

12 Some face selective cells in STS are also view-dependent, in the sense that they only respond to the view of a face from a certain angle.


What is temporal coding?

First, let us come to a definition of temporal coding. Roughly speaking, anything that is not integrated over extended periods of time can be called a temporal code. There are plenty of examples, e.g. the form of the PSTH explained in the next paragraph, the latency of the first spike, or the temporal relationship of different spikes.

How can temporal codes be read out?

A big part of our knowledge about neuronal behaviour stems from recordings of single neurons. In these recordings, a single electrode is used to measure the currents in a single neuron. There are different methods to do these recordings, but the common point is that only one site of the brain is measured at a time. These recordings do allow for the examination of the change in firing rate over time, but only as far as it is stimulus-locked, i.e., if the timing of the neuronal reaction depends only on the stimulus and not on other neurons. For example, a reoccurring burst of spikes that always appears 120 ms after stimulus onset may easily be detected, also in different neurons, but synchronized firing between neurons that is not stimulus-locked cannot be detected. Therefore, multi-electrode recordings are used more and more. In these recordings, multiple electrodes are used to measure currents at different points in the brain at the same time. They can thus be used to analyze internally generated covariations of firing patterns.

The most frequent form to report neuronal activity is the so-called peri-stimulus time histogram (PSTH). To get a PSTH, a time line is divided into single time windows, and for every window the number of occurring action potentials is counted. The result is a graph with bars as in figure 01.

In the following we will discuss some statistical means to analyze measured data, especially those needed to evaluate data from multi-electrode measures.

Correlation: Auto-correlation

Suppose you have measured data from ten trials. This data is rastered, i.e. you have only the points in time where the peak of an action potential (AP) occurs; an AP is supposed to occur at one exact point in time and is considered discrete (this type of data is sometimes referred to as a “spike train”). Even for a single trial, you can do more than just look at when the APs occur. You can calculate the distance in time between all pairs of APs. This

CHAPTER 04: Temporal Coding

Introduction

This part of the script will be about temporal codes. We will define what temporal codes are, discuss some methods of statistical analysis of data, and finally see some examples of experiments where temporal coding is important.

01 A post- (or peri-) stimulus time histogram (PSTH).
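The binning procedure behind a PSTH can be sketched in a few lines. The bin width, window length and spike times below are made-up illustration values, not data from the lecture:

```python
import numpy as np

def psth(spike_times_per_trial, bin_ms, window_ms):
    """Divide the time axis after stimulus onset (t = 0) into bins
    of width bin_ms and count, over all trials, how many action
    potentials fall into each bin."""
    edges = np.arange(0.0, window_ms + bin_ms, bin_ms)
    counts = np.zeros(len(edges) - 1)
    for spikes in spike_times_per_trial:
        hist, _ = np.histogram(spikes, bins=edges)
        counts += hist
    return edges[:-1], counts

# Two toy trials; spike times in ms after stimulus onset.
trials = [[12.0, 15.0, 123.0], [14.0, 119.0, 121.0]]
bin_starts, counts = psth(trials, bin_ms=10.0, window_ms=200.0)
```

The early burst around 12–15 ms and the later response around 120 ms each pile up in their respective bins, which is exactly the bar pattern shown in figure 01.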


will yield a new function with points for the time distances of the APs. It will show at what time difference the neuron spikes most often. This function is called the auto-correlation function. Additionally, if you average over your ten trials, you get the averaged auto-correlation function. This function allows for the detection of temporal patterns in your data.

Correlation: Cross-correlation

A slight variant of the strategy described above can be used for more than one neuron. Instead of comparing one neuron to itself, i.e., measuring the interval between two of its spikes, you can do the same for two neurons. The resulting function describes how often the two neurons spike with a certain time difference; it is called the cross-correlation. Both correlation functions can be plotted in so-called correlograms (as in figure 02 b and c). If no correlation is present, these correlograms show just a flat line; this forms some kind of a zero point. The more the function differs from this line, the higher the correlation at that point in time. If correlation occurs at regular time intervals, a typical waveform is present (e.g. in figure b). This is usually read as a sign of synchronization.
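Both correlograms can be computed as a histogram of pairwise spike-time differences. As a toy sketch (spike times, lag window and bin size are invented for illustration):

```python
import numpy as np

def cross_correlogram(spikes_a, spikes_b, max_lag, bin_ms):
    """Histogram of all pairwise time differences t_b - t_a (within
    ±max_lag) between the spikes of two neurons. With
    spikes_b == spikes_a this yields the auto-correlogram (where
    the trivial self-pairs pile up at lag 0)."""
    diffs = [tb - ta for ta in spikes_a for tb in spikes_b
             if -max_lag <= tb - ta <= max_lag]
    edges = np.arange(-max_lag, max_lag + bin_ms, bin_ms)
    counts, _ = np.histogram(diffs, bins=edges)
    return edges[:-1], counts

# Toy neurons: B fires 3 ms after A on every event, so the
# cross-correlogram peaks in the bin containing +3 ms.
a = [10.0, 110.0, 210.0]
b = [13.0, 113.0, 213.0]
lags, counts = cross_correlogram(a, b, max_lag=20.0, bin_ms=5.0)
```

A flat histogram here would correspond to the flat baseline in the correlograms of figure 02; the single peak at a positive lag is the kind of structure read as synchronization with a fixed delay.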

Evoked & Induced

In many experiments, data is not only recorded in a single trial. Multiple trials are performed and the resulting data is averaged. The advantage of this method is that errors due to measurement techniques or to external circumstances that are irrelevant to the experiment become less significant. It is thereby possible to measure the reaction of a neuron to a certain stimulus more precisely than it would otherwise be. There are two possibilities to average over trials. The first possibility is to sum over trials first (this is equivalent to averaging) and to apply a non-linear measure afterwards (this might e.g. be a power function). The result is called evoked (e.g. evoked power). The second possibility is to apply a non-linear measure to every single trial and to sum up afterwards. In this case the result is called induced. The main difference concerns activity that is not stimulus-locked and varies over trials. Such activity may be considered noise. When computing evoked power, it will average out before the non-linear measure is applied.
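The difference between the two orders of operation can be made concrete with a toy sketch. Squaring stands in for the non-linear measure, and the two opposite-phase "trials" are invented illustration data:

```python
import numpy as np

def evoked_power(trials):
    """Evoked: sum/average over trials FIRST, then apply the
    non-linear measure (here simply squaring)."""
    return np.mean(np.asarray(trials), axis=0) ** 2

def induced_power(trials):
    """Induced: apply the non-linear measure to every single
    trial, then average over trials."""
    return np.mean(np.asarray(trials) ** 2, axis=0)

# A toy oscillation that is not stimulus-locked: its sign flips
# from trial to trial, so it cancels in the trial average.
trials = [[1.0, -1.0, 1.0], [-1.0, 1.0, -1.0]]
```

For these trials the evoked power is zero everywhere, while the induced power is one everywhere: activity that is not phase-locked to the stimulus survives only the induced computation.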

Correlation function & Shift predictor

Another important differentiation is that between an average correlation function and a shift predictor. If correlation functions are computed for several trials and the results are summed up, the result is called the average correlation function. A so-called shift predictor, on the other hand, is computed if the trials are summed up first and a correlation function is applied afterwards. Now what is a shift predictor good for? The cross-correlation is used to describe coincidences in the spiking of two neurons. But the stimulus is presented at a certain time, and thus some relationship between the neurons is artificially created. This relationship is described by the shift predictor; that is exactly why it is called a shift predictor: it predicts the correlation introduced into the data by the alignment to stimulus onset. If we take the raw cross-correlation data and subtract the shift predictor, we can correct our data and remove this artificial component. The remaining correlation cannot be due merely to the time of stimulus onset; if it is still significant, some other cause has to exist.

02 Cross-correlation functions (CCF) plotted as correlograms.
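A minimal sketch of this correction, following the script's definition (correlate per trial and sum for the average correlation function; average the trials first and then correlate for the shift predictor). All data below are invented toy values:

```python
import numpy as np

def xcorr(x, y, max_lag):
    """Cross-correlation sum_t x[t] * y[t + lag] for each lag."""
    n = len(x)
    return np.array([
        sum(x[t] * y[t + lag] for t in range(n) if 0 <= t + lag < n)
        for lag in range(-max_lag, max_lag + 1)
    ])

def average_correlation(trials_a, trials_b, max_lag):
    """Correlate within each trial first, then sum over trials."""
    return sum(xcorr(a, b, max_lag) for a, b in zip(trials_a, trials_b))

def shift_predictor(trials_a, trials_b, max_lag):
    """Average the trials of each neuron first, then correlate:
    only the stimulus-locked part of the correlation survives."""
    return xcorr(np.mean(trials_a, axis=0),
                 np.mean(trials_b, axis=0), max_lag)

# Toy binned spike counts: both neurons fire at bin 1 in every
# trial (stimulus-locked) and fire together at a different bin in
# each trial (internally generated synchrony).
trials_a = [[0, 1, 0, 1, 0, 0], [0, 1, 0, 0, 1, 0]]
trials_b = [[0, 1, 0, 1, 0, 0], [0, 1, 0, 0, 1, 0]]

ac = average_correlation(trials_a, trials_b, max_lag=1)
sp = shift_predictor(trials_a, trials_b, max_lag=1)
corrected = ac - sp  # correlation not explained by stimulus timing
```

The corrected function stays positive at lag 0, reflecting the internally generated co-firing that the stimulus alignment alone cannot account for.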

What temporal coding is good for: Binding by synchronization

The binding problem has been mentioned already in chapters 1 and 3. We will now take a look at what binding has to do with temporal codes and why synchronization may solve some problems in this area. Nearly all stimulus features that can be coded by traditional rate codes may be represented by temporal means as well. But there is more to temporal coding. In particular, the relation between two features may be represented, which is not possible with traditional rate codes. Imagine the picture of a triangle in the upper part of a screen (see figure 03), and another one with a square in the bottom part. The first one has the features “triangle” and “top”, whereas the second has the features “square” and “bottom”. Now, what happens if both stimuli are presented at once? All four features will be present, but one needs a possibility to differentiate between a triangle on top of a square and a square on top of a triangle. One possibility to do so is to synchronize the features “top” and “triangle” and the features “bottom” and “square”. Thus, a simple syntax is given to bind features together. So, in addition to the classical means, synchronization may give us a possibility to represent feature bindings.
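This “syntax” can be illustrated with a deliberately simplified toy model. The feature names, phase values and tolerance below are pure illustration, not a claim about real neuronal phases:

```python
from collections import defaultdict

# Hypothetical firing phases (in radians) of four feature
# detectors: features bound to the same object fire in synchrony,
# i.e. at the same phase of a common oscillation.
firing_phase = {
    "triangle": 0.0, "top": 0.0,
    "square": 3.14, "bottom": 3.14,
}

def bind_by_synchrony(phases, tolerance=0.1):
    """Group features whose firing phases coincide within the
    given tolerance; each group is read out as one bound object."""
    groups = defaultdict(list)
    for feature, phase in phases.items():
        groups[round(phase / tolerance)].append(feature)
    return sorted(sorted(g) for g in groups.values())

objects = bind_by_synchrony(firing_phase)
```

A downstream reader that only checks coincidence thus recovers the two correct conjunctions (triangle-top, square-bottom) and never the false ones (triangle-bottom, square-top).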

Experimental evidence

In an experiment carried out by Engel et al. (1991), a measurement with four electrodes was carried out (see figure 04). The cells at sites 1 and 3 preferred vertical orientation, while those at sites 2 and 4 preferred horizontal orientation. In trials with single bars with an orientation of 0° (A), 60° (B) and 135° (C), synchronized firing occurred between all responding sites. Interestingly, however, when two bars were used, one with 0° and one with 112°, sites 1 and 3 were synchronized, as well as sites 2 and 4. But there was no synchronization between 2 and 3 (shown in the figure), nor between 1 and 2, 1 and 4, and 3 and 4 (not shown). This is coherent with the theory that features are bound by synchronization. Another experiment (see figure 05) shows that synchronization might also play a role in the relation between the two hemispheres. The experiment is similar to the already mentioned split-brain studies. Two electrodes

03 Binding by synchronization helps to solve the question which features belong together, so false conjunctions of features can be avoided.

04 Correlograms from the experiment by Engel et al. discussed below.


were located in area 17 (this experiment was done on a cat) of the left and right hemisphere, close to the representation of the vertical meridian (B), i.e. at the site in the brain where optical signals are processed that come from near a vertical line presented exactly in front of the subject. When two light bars with optimal orientation were shown, the responses in C were evoked. The cross-correlogram (E) again shows strong synchronization between the cells. Animals with a cut corpus callosum (see figure 06), on the other hand, show completely different results. In an experiment quite similar to the one above, the autocorrelation functions still show oscillatory behaviour (C, D and E), but the interhemispheric cross-correlation (F) is virtually flat. Thus, it seems that synchronization between cells with similar response properties in the two hemispheres is established via the corpus callosum.
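The auto- and cross-correlograms used in these experiments can be computed from lists of spike times roughly as follows (a minimal sketch; the bin width and lag window are illustrative choices, not the ones used by Engel et al.):

```python
def cross_correlogram(spikes_a, spikes_b, max_lag=50.0, bin_ms=1.0):
    """Histogram of spike-time differences t_b - t_a within +/- max_lag ms.

    A peak at lag zero indicates synchronized firing of the two cells;
    a periodic modulation of the counts indicates oscillatory coupling.
    Passing the same train twice yields the autocorrelogram.
    """
    n_bins = int(2 * max_lag / bin_ms)
    counts = [0] * n_bins
    for ta in spikes_a:
        for tb in spikes_b:
            d = tb - ta
            if -max_lag <= d < max_lag:          # half-open lag window
                counts[int((d + max_lag) / bin_ms)] += 1
    return counts
```

For two perfectly synchronized regular trains, all pair counts pile up in the central (zero-lag) bin, exactly the signature seen in the correlograms of figure 04.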

Oscillation and why it is important for synchronization

As you might have noticed, most of the synchronization we observed was quite regular, i.e. the correlograms showed a typical waveform. This is because the firing pattern of a single neuron is usually not random at all, but regular in most cases. In other words: the neurons are oscillating. This fact has some impact on synchronization. While a complicated firing pattern may be hard to synchronize, a simple, regular pattern, such as a plain oscillation, is repeated more easily. Short- and long-range oscillations may have different effects on neuronal activity.

State dependence & spike-time-dependent plasticity

The exact role of the timing of spikes in our brain is not well understood. However, for some areas it has been shown that the timing of spikes does have an effect on the plasticity of neurons. That is, through a certain spiking behaviour, the state of a neuron may be changed such that it fires more or less easily. The dependency of the peak conductance at a synapse on spike timing is shown in figure 07. It is to be read like this: if the presynaptic spike arrives a bit earlier than the postsynaptic spike, the conductance is increased; the smaller the time difference between the spikes, the more it is increased. If the presynaptic spike arrives later than the postsynaptic one, the conductance is lowered; again, the smaller the time difference, the more it is lowered.


05 Synchronized activity between the two hemispheres can clearly be seen in these correlograms.

06 Interhemispheric synchronization is abolished by dissection of the corpus callosum.





07 Dependency of the peak conductance at a synapse on spike timing.

08 Model for dynamic recruitment

09 Predictions of the model for dynamic recruitment



Thus, plasticity may in fact be influenced by spike timing.
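The spike-timing dependence sketched in figure 07 is commonly modelled as an exponential window (e.g. Song, Miller & Abbott, 2000). The following sketch uses that form; the parameter values are illustrative defaults, not values from the lecture:

```python
import math

def stdp_weight_change(dt_ms, a_plus=0.01, a_minus=0.0105, tau_ms=20.0):
    """Exponential STDP window.

    dt_ms = t_post - t_pre. If the presynaptic spike precedes the
    postsynaptic one (dt_ms > 0) the peak conductance grows; if it
    follows (dt_ms < 0) the conductance shrinks. The closer the two
    spikes, the larger the change.
    """
    if dt_ms > 0:
        return a_plus * math.exp(-dt_ms / tau_ms)    # potentiation
    if dt_ms < 0:
        return -a_minus * math.exp(dt_ms / tau_ms)   # depression
    return 0.0
```

Slightly asymmetric amplitudes (a_minus > a_plus) are often chosen so that uncorrelated firing depresses a synapse on average, keeping the network stable.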

Dynamic recruitment

The mechanism mentioned in the last paragraph is far from the whole story. There are more complex ways in which spikes may influence the plasticity of neurons. In the model shown in figure 08, the action potential may also inhibit the dendrite and thereby change its conductance. It is not yet resolved how closely this model resembles the events in the human brain. However, by this mechanism, networks of neurons may adapt to a presented stimulus (as shown in figure 09).

Experiments with strabismic humans / animals

In the next part we will talk about strabismus (squint). In squinting subjects, the information from the two eyes is not integrated in the usual way. This also influences the synchronization between the two hemispheres. As you might remember, the information from the two eyes is processed in separate pathways: still segregated in the LGN, it is blended only at the level of the cortex, and even there ocular dominance columns exist in the primary visual areas.

Usually, objects that are close to each other in feature space (or in Euclidean space) show synchronization. This is the case because in a retinotopic map adjacent neurons are simultaneously


10 Synchronization between the two hemispheres in strabismic cats

11 Feature binding in ambiguous pictures by synchronization



activated by an object covering a certain portion of the visual field. In squinting cats, however, there is a significant difference depending on whether the two recorded neurons are driven by the same eye or by different eyes (see figure 10). This indicates that synchronization serves as a mechanism for defining the coherence of objects across the two eyes.

Again: Binding by synchronization

Maybe you have already seen pictures like figure 11 (a). The twist is that you might see either a woman behind a candle or two women facing each other. It is possible to see each of them separately, but subjects never perceive both variants at the same time. This is not only reported by subjects, but can also be shown by measurement: there is a different neuronal state for each of the perceived pictures. The problem we encounter here is how to define functional relations between the neurons responsible for different parts of this stimulus. The stimulus never changes; however, the relations between certain neurons change drastically, as the perceived picture is a completely different one.

The temporal binding we discussed above may be a solution to this problem. It has been suggested that a temporal binding mechanism establishes synchrony between neurons responsible for certain parts of the image. In this way, a differentiation between figure and ground may be achieved.

Is synchronization really relevant?

Another experiment used mirrors to present different pictures to the two eyes of a cat (see figure 12). It could be shown that the neuronal reaction is synchronized only with respect to one of the presented pictures. There was one "winning eye" whose stimulus was processed: for this eye, synchronization occurred between some of the responsible neurons, while for the "losing eye" no such synchronization could be measured. This is another hint that synchronization may be relevant for our perception of stimuli.

Summary

In this part of the script, you should have learned that there are different types of temporal codes. You should know roughly which statistical methods are used to evaluate the data, especially auto- and cross-correlations. Although there is a lot of ongoing debate about synchronization, there is experimental evidence pointing toward synchronization as a mechanism to bind object features. Synchronization appears between neurons that are close in feature or Euclidean space, and also occurs between the two hemispheres. In strabismic animals, synchronization does in fact reflect the perception of stimuli presented to different eyes.

12 Experimental setup for synchronization in cats


Defining attention

"Everyone knows what attention is. It is the taking possession by the mind, in clear and vivid form, of one out of what seem several simultaneously possible objects or trains of thought" (William James). This phenomenological definition resembles our common-sense conception of attention.

There is also a more evolutionarily grounded definition: organisms have limited processing capacities, which restrict the amount of information that can be processed. Due to ecological and evolutionary constraints, organisms have to select the most relevant information from their input. Therefore, in order to understand the concept of attention a bit better, we will now dig into scientifically accepted mechanisms of selection. These can be either overt or covert.

Overt selection

The term overt selection refers to the selective capturing of environmental signals by conformational changes in the sensory apparatus of a specific modality. Puzzled? Overt means external and therefore observable for others. Overt selection processes are thus crucially dependent on the direction and alignment of the respective sensory organs. The most important and notable example is overt selection in the visual system.

overt selection in the visual system:

In the visual system, the direction of gaze initially defines your visual field. The direction of gaze is driven by head movements and by the oculomotor system, which moves your eyeballs. In addition, the direction of gaze selects a special region in the visual field: visual information from this region is taken up by the fovea. From previous chapters we know that the fovea is the area with the highest density of photoreceptors in the center of the retina. We are also aware that the fovea, although tiny in size (approx. 1 mm), is the most strongly represented part of the retina in the subsequent processing tracts: "For example, from the high-resolution foveal representation where most processing resources are allocated to the central 5° of the visual field, to late stages of visual cortical processing, where receptive fields invariably grow to encompass the fovea, the neural architecture disproportionately represents the

CHAPTER 05: Attention

Introduction

This chapter introduces you to the concept of attention, psychologically and neuroscientifically. We start with a definition of attention. Perceptual attention is crucially dependent on selection mechanisms and on processing capacities in the perceptual pathways. Capacity is the amount of perceptual resources available for the processing of information. Factors modifying attentional capacity are e.g. alertness, time of day and motivation. Because of limited processing capacities, selection mechanisms have to somehow "filter" the input in order to gain a benefit in performance. These selection mechanisms can be either overt (external and observable) or covert (internal and unobservable for others). Thus the two most prominent subcategories of attentional selection we will encounter are overt and covert attention.

Afterwards we dig into psychological theories of attention. You can roughly distinguish the most prominent ones by the processing stage at which the selective mechanisms take place. We will review theories of early and late selection. Early selection claims that attentional mechanisms operate at a preprocessing stage; thus they may act as a "filter" for the most relevant or appropriate stimuli. This stimulus information undergoes further processing, whereas other stimulus information is neglected. Theories of late selection instead claim that attentional filtering occurs relatively late in the processing pathways. They are supported by experiments in which subjects also grasp semantic (higher-level) features of neglected stimuli. If selection occurred very late in the processing pathway, there would be no gain for our limited capacities, and the role of attention would seem paradoxical and useless. Theories in between, such as Anne Treisman's attenuation theory, provide a more sophisticated approach: attention acts as an attenuator, which reduces the activation of certain stimuli rather than blocking them totally.

Neuropsychologists traditionally study attention in patients with unilateral neglect. As you know from your cognitive neuroscience class, neglect patients fail to respond to stimuli on the contralesional side of the visual field, and even to the contralesional side of particular objects. We distinguish between spatial neglect and object-based neglect. At the end of this chapter, the study of these symptoms will lead us to subcategorize attention into spatial and object-based attention.


central visual field." (Parkhurst et al., 2002). Eye movements therefore select the most important parts of the visual input in order to perceive a scene. When you think about the limitations of visual information processing within a single fixation, it becomes obvious that you have to redirect your fixation many times in order to perceive the scene around you. Typical experimental research on overt attention asks what makes points attractive (salient) as targets of eye movements. Thus we will often encounter studies which correlate visual features of a scene with tracked eye movements in that scene.

EyeLink II is one of the latest eye-tracking devices. It provides very high resolution and fast data acquisition. While two cameras with built-in illuminators track the eyes continuously, a third camera attached to a headband keeps track of the motion of the head. The eye cameras make use of corneal reflection together with pupil tracking to determine the movements of the pupils with very high accuracy; the head camera provides the information needed to compute the point of gaze. This allows head motion and speech during the experiment, which were quite problematic with older types of eye-tracking devices.

BOX 5.1: EyeLink II

BOX 5.2: Saliency map models (Parkhurst, Niebur et al., 2002)

Saliency maps are computational models. A saliency map represents the attentional significance of the points in a given visual scene. In the construction process, visual scenes are given as coloured high-resolution pictures. The computational construction of a saliency map proceeds in several serial stages, in a bottom-up (stimulus-driven) and parallel manner. As we will see below, aspects of these computational processes are found to a certain extent in the visual system as well; therefore the saliency map model serves as a biologically motivated computational model. The stages for constructing a saliency map (according to Niebur and Parkhurst, 2002) are:

1. Separation of the visual scene into feature channels
The input image is segregated into several parallel feature channels. For example, the image may be divided into a colour, an intensity and an orientation channel. In each of these feature channels the feature-specific image is sampled on a series of spatial scales. Such sampling results in several subfeature images; an example of such an image represents the saturation of a particular green tone. Biologically, feature channeling may be found in the different types of photoreceptors or in the various types of bipolar cells.

2. Creation of center-surround subfeature maps
At the next level of the construction process, the features are reorganized into a center-surround arrangement. The algorithm behind this maximizes local differences and increases contrast. Center-surround arrangements resemble the properties of receptive fields of bipolar cells, retinal ganglion cells, geniculate cells and some cortical input cells in layer 4C.

3. Normalization of the center-surround subfeature maps
The third processing step normalizes the images. Such a normalization is a statistical transformation that usually scales all activation values with respect to average values and maxima.

4. Construction of unimodal saliency maps
Finally, the normalized subfeature center-surround maps in each feature channel are combined. Since the normalized subfeature center-surround maps represent a specific feature, their combination results in a unimodal saliency map. This combination is a linear summation of the normalized subfeature data within each channel.

5. Normalization of unimodal saliency maps and construction of a multimodal saliency map.


The eye tracker your institute is using is the EyeLink II system (see BOX 5.1).

Normally, eye-fixation and linear interpolation methods help to calibrate eye-tracking devices before the experimental trials start. In this paradigm, participants are shown visual scenes for a given time window while their eye movements are recorded. This approach makes it possible for experimenters to correlate subjects' eye movements with the computed unimodal saliency maps (see also BOX 5.2).

Let's have a look at two examples of this research paradigm. The experiments of Parkhurst et al. (2002) revealed the following results:
1. There is a significant correlation between stimulus salience and the point of first fixation.
2. The influence of the different feature channels varies across image types. For example, in the fractal and home-interior databases, colour played a significant role, whereas luminance intensity dominated the saliency in the natural-landscape and building databases.
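The kind of correlation reported in result 1 rests on an analysis of the following logic: compare salience at fixated points with salience at randomly drawn control points. This is our own minimal sketch of that logic, not the authors' actual analysis code; the function name and parameters are illustrative.

```python
import random

def fixation_salience_ratio(saliency_map, fixations, n_control=1000, seed=0):
    """Mean salience at fixated pixels divided by mean salience at
    random control pixels. A ratio above 1 means fixations land on
    points that are more salient than expected by chance.

    saliency_map: 2-D list of salience values; fixations: (row, col) pairs.
    """
    rng = random.Random(seed)                      # seeded for reproducibility
    rows, cols = len(saliency_map), len(saliency_map[0])
    fix_mean = sum(saliency_map[r][c] for r, c in fixations) / len(fixations)
    ctrl_mean = sum(saliency_map[rng.randrange(rows)][rng.randrange(cols)]
                    for _ in range(n_control)) / n_control
    return fix_mean / ctrl_mean
```

If subjects' first fixations systematically hit the high-salience pixels of the map, this ratio comes out well above 1, which is the pattern Parkhurst et al. report.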

First, their results justify the association of fixation traces with saliency maps. Since overt visual attention itself is associated with fixation traces, Parkhurst et al. thus give further justification that a saliency map models overt visual attention and may be biologically implemented. This strengthens the hypothesis that overt visual attention is a stimulus-driven, bottom-up mechanism. Second, their results also fit very well into an embodied theory of mind, which you might know from your philosophy classes. Embodiment takes the agent's being in its environment into account: different environments determine the features on the basis of which scenes are analysed and segmented. Thus, our perceptual mechanisms may adapt to the environment in which we see, act and live.

A similar experiment by Einhäuser & König (2003) also presented subjects with images taken from outdoor environments. The experiment consisted of two parts: (i) they recorded fixation points and correlated them with the luminance contrast at the respective part of the unmodified scene; (ii) additionally, they varied the contrast levels within the scenes and tested whether high contrast by itself causally attracts overt attention. Einhäuser & König found that no causal contribution of luminance contrast to the saliency map underlying human overt attention is detectable. They explained this surprising result by the structure of the visual apparatus: although neurons in the early visual system are strongly sensitive to luminance contrast (as you can see e.g. in the tuning curves of cells in V1), the saliency of points in a visual scene becomes more and more invariant to contrast. This could imply that saliency maps for overt

Afterwards, the unimodal saliency maps are normalized again and the maps for the feature channels are summed. The final normalization enhances feature competition across spatial scales and remaps all unimodal saliency maps to equivalent spatial scales. The result is the multimodal representation, the final common saliency map.

A saliency map therefore tells you where in an image something is happening, but not what is happening there. Salient features are represented, whereas non-salient features of the visual scene are not.
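The construction stages of BOX 5.2 can be condensed into a miniature sketch. This is pure illustration: the multi-scale sampling and the center-surround stage are omitted, and the normalization operator is reduced to simple min-max scaling.

```python
def normalize(m):
    """Scale a 2-D map to [0, 1] (stand-in for the normalization steps)."""
    flat = [v for row in m for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0            # avoid division by zero on flat maps
    return [[(v - lo) / span for v in row] for row in m]

def saliency(channels):
    """Steps 1, 4 and 5 in miniature: normalize each feature channel
    (e.g. colour, intensity, orientation) and sum them into one
    multimodal saliency map, which is normalized once more."""
    channels = [normalize(ch) for ch in channels]
    summed = [[sum(ch[i][j] for ch in channels)
               for j in range(len(channels[0][0]))]
              for i in range(len(channels[0]))]
    return normalize(summed)
```

A location that stands out in several channels at once accumulates the largest summed value and ends up as the peak of the final map, which is exactly the "where, not what" behaviour described above.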

01 Construction of a multimodal saliency map (Parkhurst et al.)


visual attention in natural scenes do not originate in early visual areas, and rather strengthen hypotheses of top-down effects of attention.

overt attention in other modalities:

Now we will briefly consider overt attention/selection in other modalities, although not much scientific work has been done in these domains yet. Recall that overt attention mechanisms depend on the alignment of the particular sensory apparatus. Therefore, apart from vision, mainly haptic perception can exhibit overt selection processes (since only a few of us can align our ears to a specific source). Selection mechanisms in the auditory, olfactory and vestibular systems act more covertly: they work largely independently of the alignment of the sensory organs and are driven more by internal mechanisms.

Covert attention

Covert attention also refers to a selective enhancement in the processing of attended stimuli at the expense of neglected stimuli. Covert means unobservable for others. The crucial difference from overt attention, as already mentioned, is that mechanisms of covert selection are not driven by external adjustments of the sensory apparatus but are based on internal processes.

Experiments like shadowing tasks investigate mechanisms of covert attention. In shadowing tasks, subjects hypothetically have the same sensorimotor access to all presented stimuli. This is important in order to diminish the influence of overt

selection in these tasks. Experiments on shadowing tasks began in the auditory domain: subjects wore headphones and had to repeat a message coming from either the left or the right channel. Initial results showed that subjects fail to detect more fine-grained information from the unattended channel. In addition to auditory shadowing (figure 02), shadowing tasks for the visual system indicate that shadowing performance depends crucially on the physical properties of the concurrent stimuli. If the stimuli differ physically (e.g. messages from different voices in the headphone task), subjects show better performance than in conditions where the physical properties are identical or very similar. You can explain this if you assume that covert selection mechanisms occur relatively early in the processing hierarchy. As the input moves up the hierarchy, highly similar inputs are grouped together and unattended stimuli are selected out. Therefore, shadowing of physically differing stimuli is easier and faster because the selection already occurs at low levels, which are sensitive to the physical differences between stimuli. Separation of similar stimuli is difficult since they are not successfully segregated at low levels; it is more probable that these different but similar stimuli are already bound together by the time you need to shadow one of them.

You also perform shadowing in various real-world situations. If you dive, for example, into the exciting and pulsating nightlife of Osnabrück and go to a private party of a coursemate, you have certainly noticed that you can successfully attend to just one communication stream. All other sounds from your environment are faded out, although people talk about

02 Covert selection in the auditory system: a shadowing task (Gazzaniga et al., 2001)


interesting (e.g. A&C) and sometimes funny (e.g. A&C) topics. A further important aspect is that you may pretend to listen to one communication stream (e.g. the talk with your coursemate) while actually attending to some other, more interesting stream (Peter König's talk). Your poor coursemate may not even notice it.

Early and/or late selection

In the previous section we investigated different mechanisms of selection. You might have noticed that we started to use words like early, late and hierarchy. In this section we will look at theories that differ with respect to the stage at which the selection mechanisms take place.

Early selection

Early findings on covert attention gave rise to Broadbent's filter theory (1958). In filter theory, attention acts like a filter at a preprocessing stage: the filter analyses some gross features of the input, selects a limited amount of the incoming stimuli and permits only this selection to be processed further. All unattended stimuli are filtered out and are not processed at higher levels. Although Broadbent's model is quite intuitive, subsequent studies showed that attention is not that simple. Their results indicate that even higher-level semantic features of unattended stimuli may already be processed before the selection mechanisms take place.

Late selection

The results of Tipper & Driver suggest several shortcomings of Broadbent's filter model. Attention takes place not only at early but also at late stages of perceptual processing. Representations at later

03 Broadbent’s Filter Theory

04 Theories of Early and Late Selection




processing stages allow the selection mechanisms to determine which part of the input is important based not only on physical but also on semantic features. Treisman's attenuation theory (AT, 1960) combines early and late selection mechanisms. According to AT, selection operates both at early and at late stages of perceptual processing. During early phases of perception, selection is based on the physical features of the input. The crucial difference from filter theory is that this initial selection process is only

partial: it reduces (attenuates) the activation levels of unattended stimuli rather than blocking them totally. At late stages of processing, the activation levels of the input representations are compared with their activation thresholds. Late selection enables the dynamic modification of activation thresholds for semantically highly salient items. Again a party example: when someone suddenly utters your own name, you will probably become aware of it, although you had not attended to the speaker until she uttered it.
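The interplay of early attenuation and late, threshold-based selection can be made concrete in a toy model. All names, numbers and thresholds below are purely illustrative; this is a sketch of the theory's logic, not a psychological simulation.

```python
def heard(inputs, attended_channel, attenuation=0.3,
          threshold=0.5, salient_threshold=0.15,
          salient_words=("your_name",)):
    """Toy model of Treisman's attenuation theory.

    Unattended channels are attenuated rather than blocked (early,
    partial selection). An item becomes conscious when its residual
    activation exceeds its dictionary-unit threshold; semantically
    salient items, such as one's own name, have a permanently
    lowered threshold (late selection).
    """
    conscious = []
    for channel, word, activation in inputs:
        if channel != attended_channel:
            activation *= attenuation            # attenuated, not zeroed
        thr = salient_threshold if word in salient_words else threshold
        if activation > thr:
            conscious.append(word)
    return conscious
```

Note how the own name "breaks through" in this model not because the unattended channel was left unfiltered, but because its dictionary unit has a lowered threshold that even the attenuated activation can exceed.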

BOX 5.3: Paper review: Tipper & Driver, "Negative priming between pictures and words in a selective attention task: Evidence for semantic processing of ignored stimuli"

Cognitive psychology is not just about stimuli and input, but also about internal representations and the transformations that lead to actions. In their paper, Tipper and Driver investigate the role attention plays in the perceptual processes that produce internal representations.

First, they compare the precategorical and the postcategorical view of the role of attention. Both views concern the stage at which attentional mechanisms take place. The precategorical view holds that attention is critical for the processes involved in object recognition: objects which are ignored are represented only by their physical, low-level features; only for selected objects are semantic features such as identity represented. It is plausible that precategorical theories are predicated upon filter models (e.g. Broadbent, 1958), in which objects are selected solely on their physical (precategorical) basis. Postcategorical selection theories (Deutsch & Deutsch) hold that attentional mechanisms play a limited role in object recognition: whether or not an object is attended to, the perceptual system will encode its internal representation, and attentional effects occur at a higher, categorical stage. The mechanisms of attention therefore operate relatively late, as proposed e.g. by Treisman's FIT and attenuation theory. Tipper and Driver conducted an experiment in which participants were presented with superimposed drawings of two objects and had to attend to one of them. In the experimental condition, participants also had to identify the superordinate category of the unattended object (e.g. cat: animal). Performance for these superordinate categories was significantly above chance. Tipper and Driver concluded that even unattended stimuli reach a processing level at which semantic features such as object category are represented. The results therefore support a postcategorical view of the role of attention, which implies late selection mechanisms.

05 Treisman's attenuation theory of auditory attention. The message coming from the unattended left ear is attenuated, but both messages activate central dictionary units, only one of which typically exceeds threshold and becomes conscious. This theory contains mechanisms for both early and late selection.


The inattention paradigm

To study mechanisms of early and late selection in the visual domain, the vision scientists Mack and Rock developed the "inattention paradigm". In this setting, experimenters investigate which features of unattended stimuli are processed. The experiment works as follows: in several initial trials (A), subjects have to perform a rather difficult discrimination task on a single object. Then, in the inattention trial (B), an extra, unexpected element is presented. In a final recognition test (C), the experimenters evaluate the subjects' perception of the features of the unexpected element.

In addition, various attentional instructions (full attention, divided attention and inattention conditions) were given to the subjects with respect to the extra elements. The figure below shows that simple sensory properties were perceived without attention, but more complex ones were not.

Change Blindness

Change blindness experiments have revealed that subjects perform poorly at detecting even gross changes in visually presented stimuli. In a change blindness experiment, subjects are presented with two complex scenes in the form of high-resolution pictures that are identical except for one object or feature. If the two scenes are aligned and presented one after

06 Left: The inattention paradigm. Subjects are instructed to determine whether the horizontal or the vertical line of a briefly presented cross is longer (A); on the inattention trial, an unexpected element is presented (B). Subjects are asked whether they saw anything besides the cross and are then given a recognition test (C) to evaluate their perception of the extra element.

Right: Results from the inattention paradigm. Subjects perform better than chance at recognizing color, location and number of elements, but not shape (adapted from Rock et al., 1992).

07 Change blindness stimulus set. Subjects need several trials in order to detect the gross change between the two pictures.




another, this task is very easy for subjects to perform: the difference initially pops out, since our visual system is very good at detecting sudden changes in the environment. If, however, the experimenters insert a blank between the two scenes, the task becomes extremely difficult and can only be solved after a large number of trials, since subjects have to perform a serial search through the whole scene. An explanation of the poor performance in change blindness tasks is the hypothesis of a grand illusion (O'Regan, 1992): our impression that conscious perception is rich, complete and detailed is illusory. We perceive only the objects we directly attend to; the rest is filled in and just seems to be in our perception. "If only attended parts of the environment are represented in the brain, how can we have the impression of such richness and completeness in the visual world outside us? The answer may be that the visual world acts as an external memory. We have the impression of simultaneously seeing everything because any portion of the visual field that awakens our interest is immediately available for scrutiny via an unconscious flick of the eye or of attention. On the other hand, those parts of the scene which are not being currently processed (and in some sense not "seen") nevertheless constitute a background or setting that, like the supporting cast of a film, vivifies our visual experience."

Hemineglect

Up to now we have investigated attention mainly theoretically; now we'll dig into some clinical disorders of attention. The best-investigated group of subjects for neuroscientific theories of attention are patients with unilateral neglect (hemineglect) syndrome. Hemineglect mostly results from damage to the right parietal lobe; patients have usually suffered a stroke leading to anoxia (a lack of oxygen) in these regions. The symptoms and the location of neglect have given rise to various brain imaging and electrophysiological studies localizing attention and investigating its neurophysiological mechanisms. The symptoms of neglect are most commonly an unawareness of objects and events in the contralesional visual field: typically, hemineglect patients perform poorly at detecting, memorizing and reporting objects on the contralesional side. Hemineglect occurs even though the sensory pathways remain intact. This has been suggested by neuroanatomical studies and by psychological studies involving mental imagery tasks, which are supposed to operate on high perceptual levels.

Further evidence for the non-sensory nature of neglect comes from cases in which the neglect is not tied solely to spatial locations within the visual field. There are findings in which hemineglect patients fail to perceive the left half of certain objects, although

BOX 5.4 Patient R.R.: A case of object-based neglect

Usually, patients with visual hemineglect show no conscious perception of objects and events in the contralesional visual hemifield. The symptoms can be summed up as poor performance in visual detection and memory tasks and a reduced number of saccades into the contralesional hemifield. Patient R.R. exhibited a different form of neglect: whereas his awareness of the whole visual field remained intact, R.R. showed an object-based form of neglect. His perception omits the left sides of objects, and his performance in detecting chimaeric stimuli is impaired. Findings such as R.R.’s performance suggest that attention is not a unitary spatial construct but instead operates at various stages of perception and representation.

08 Self-portraits by the German artist Jung, painted while recovering from hemineglect after a stroke (2, 3.5, 6 and 9 months after the stroke).

Page 58: Action and Cognition

a&c | Attention

a&c | 58

they are presented in the intact (ipsilesional) visual hemifield. These findings include patients who fail to eat the contralesional half of the meal on their plate unless they rotate it to the ipsilesional side.

Further, cases of object-based neglect imply that purely space-based theories of attention, such as the Zoom Lens Theory (Eriksen & St. James, 1986) or the Spotlight Metaphor (Posner, 1978), do not suffice to capture the whole concept of attention.

An important alternative, or augmentation, is therefore the possibility that attention selects particular objects rather than regions in the visual field.

Further strong evidence for an object-based concept of attention comes from studies of patients suffering from Balint’s syndrome. The pathology of Balint’s syndrome is even more severe than that of hemineglect: it results in a complete inability to perceive anything beyond the single object the patient is currently attending to. Patients suffering from Balint’s syndrome are functionally blind. They must use conscious strategies, such as closing their eyes, in order to fixate another object and redirect their attention.

An overall view

The debate between object-based and space-based attention does not mean that the two are mutually exclusive.

Since perception operates on multiple levels, attentional effects most likely take place both in a space-based and in an object-based manner. Besides, we found evidence for the influence of attention on various stages of processing, as you could see in the debate between early and late selection. Since attention seems to have various properties and to occur on nearly all levels, classical and oversimplified psychological approaches such as the spotlight metaphor discussed in this chapter cannot really capture and explain attention. One of the most compelling and interesting approaches to the apparent ubiquity of attention is O’Regan’s grand illusion hypothesis from 1992. It explains attention the other way round: it equates attention with direct perception and addresses instead the filling-in performed by our conscious mechanisms. These mechanisms may use the world as an external memory in order to complete the limited perceptual input, rather than filter rich and detailed environmental input streams. As we have seen in the preceding chapters, vision is a highly constructive and active mechanism at all levels - so this approach is excitingly appealing!

09 Diagrams showing unilateral and bilateral parietal lesions (a) typical of neglect(b) typical of Balint’s syndrome

Page 59: Action and Cognition

a&c | Neural Correlates of Attention

a&c | 59

Where to start?

Lesions to the posterior parietal cortex (PPC) are notoriously accompanied by deficits in attentional control. Areas in the PPC are therefore the objects of interest in the following experiments.

Two groups of researchers followed the leads from attention to the PPC and tried to understand the relation between these two entities. Both groups were looking for bottom-up as well as top-down driven mechanisms of attentional control. There are considerable differences between the groups regarding both methods and results. An overview of the experiments as well as a short discussion of the results is given in the following paragraphs.

Bisley & Goldberg - Neuronal Activity in the Lateral Intraparietal Area and Spatial Attention

The first series of experiments regarding attention in the PPC was conducted by James W. Bisley and Michael E. Goldberg. Previous studies suggested that an area called LIP (the lateral intraparietal area) is a major player in the generation of visual attention in monkeys.

The Paradigm

The “paradigm” employed by the experimenters was the following: At first they used a behavioural task (which will be discussed at length later on) to track the monkey’s changing locus of attention in the visual field. Then they compared these dynamics to the activity of neurons with the corresponding receptive fields in LIP. Two monkeys were trained to prepare a saccade towards a remembered position on a computer screen. This target position was pointed out beforehand by a cue stimulus (consequently, this position is referred to as the “target”). As the target stimulus appeared, the monkeys had to plan the saccade and wait for a second stimulus called the ‘probe’. This probe was a GO/NOGO stimulus upon which the monkeys were to decide whether to execute the saccade towards the cued target location (GO) or not (NOGO). The probe stimulus could appear inside or outside of the target location... just anywhere on the screen. The researchers varied the contrast of the probe throughout the experiment. By doing so they were able to determine the contrast threshold of the probe. This threshold is the contrast value below which the animals were not able to discriminate between the GO and NOGO stimulus, so that correct trials were based on mere guessing. Above this contrast threshold the monkeys recognized the probe correctly more often and consequently performed as they were trained to.

An attentional effect on contrast threshold

Bisley and Goldberg found that the monkeys performed better when the probe appeared at the saccade target location. This means that the contrast threshold was lowered due to an increased sensitivity at the target location. The logical consequence of this statement is that the contrast threshold away from the

CHAPTER 06: Neural correlates of attentional mechanisms

Introduction

In the preceding passages we learned about saliency maps. We also learned that these maps correspond in a way to human exploratory oculomotor behaviour (for example, the most salient point is usually equivalent to the point of first fixation). This correspondence raises the question whether there exists a neural mechanism responsible for selecting relevant parts of a visual scene and directing the focus of attention towards them. This hypothetical mechanism should be able to allocate attention in two basic ways.

1. There should be a bottom-up control of visual attention. This means that a novel stimulus enters the centre of attention simply because it is salient enough. This mechanism can be understood as an implementation of a saliency map, because the most salient stimuli drive the exploratory behaviour of the attentional focus. Saliency may arise from a number of factors like contrast, luminance, movement, novelty... whatever (for further information see: construction of saliency maps in chapter 05).

2. Furthermore, a mechanism for the top-down control of attention is needed. There is the possibility to voluntarily attend to a certain location in visual space. This voluntary guidance of attention is not driven by an external, salient stimulus at the site of attention that makes the location ‘interesting’. One may say that the locus of attention is focussed by pure willpower.
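As a toy illustration of the bottom-up mechanism just described, the following sketch picks the locus of attention by winner-take-all over a saliency map; the saliency values and the inhibition-of-return scheme are invented for illustration, not taken from any model discussed here.

```python
import numpy as np

# Toy saliency map over a 1-D array of visual-field locations
# (arbitrary "conspicuity" scores, assumed for illustration).
saliency = np.array([0.2, 0.9, 0.1, 0.6, 0.3])

def next_fixation(sal_map, inhibited):
    """Winner-take-all: attend to the most salient non-inhibited location."""
    masked = np.where(inhibited, -np.inf, sal_map)
    return int(np.argmax(masked))

inhibited = np.zeros_like(saliency, dtype=bool)
scanpath = []
for _ in range(3):
    winner = next_fixation(saliency, inhibited)
    scanpath.append(winner)
    inhibited[winner] = True  # inhibition of return: don't revisit immediately

print(scanpath)  # locations visited in order of decreasing saliency
```

The scanpath simply walks down the saliency ranking, which is the behaviour attributed to a pure bottom-up mechanism.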

The search for such a mechanism will keep us occupied for a major portion of this chapter. In Appendix B we will see how MRI and functional MRI enable us to look at the insides of a living organism without necessarily killing it. In the last section we will discuss what goes on in the brain during visual imagery.


This map of the PPC might look a bit confusing now, but I can assure you that it will prove handy later on. So don’t be irritated by the following picture and its description. Just skip it and return to it whenever you read about an unknown PPC area and want to find out where it is located. All of the following is taken from “Neuroimaging of cognitive functions in human parietal cortex” by Culham & Kanwisher.

Comparison of monkey (a) and human (b) parietal cortex. The parietal area is white. How to read the figure above: bold text: major sulci; italic text: lobules; plain text: functional or anatomical areas. Parietal boundaries are based on anatomical criteria rather than on functional attributes. The central sulcus (CS), Sylvian fissure (SF) and parieto-occipital sulcus (POS) provide unambiguous boundaries, with the remaining boundaries extrapolated from other landmarks. The most salient parietal landmark is the intraparietal sulcus (IPS), which divides the parietal lobe into the superior parietal (SPL) and inferior parietal lobules (IPL) in both species. In humans, the IPS is a long (~7 cm), deep

BOX 6.1 Culham & Kanwisher: Our guides to the posterior parietal cortex

(~2 cm) sulcus between the transverse occipital sulcus ([TrOS], near the POS) and the postcentral sulcus (PCS). In the monkey, parietal cortex contains many specialized regions including primary somatosensory cortex (S1); Brodmann’s areas 5, 7A and 7B; visual areas V3A (occipitoparietal boundary), V6A and the anterior (AIP), ventral (VIP), medial (MIP), lateral (LIP) and caudal (cIPS) sections of the IPS [6,7]. The IPS and adjacent lunate sulcus (LS) in the monkey brain have been opened up to reveal the fundus and banks of each sulcus. Human neuroanatomy differs substantially from that of the monkey. It is generally believed that the human SPL is homologous to the monkey IPL. Several human areas have been proposed to be putative homologues of monkey areas (appended with question marks to indicate speculative relationships). Other areas without clear homologies have also been reported, including: V7; the supramarginal (SMG) and angular (AG) gyri; functional areas at the IPS/TrOS junction (IPTO); the temporoparietal junction (TPJ) and the parieto-occipital (PO) region. Medial parietal areas have not been well characterized in either species. STS, superior temporal sulcus.

01 Macaque PPC vs. human PPC


target location was higher than inside this location.

“We suggest that this lowering of the threshold is an index of attention allocated to the goal of the planned saccade.”

Bisley and Goldberg make this basic assumption in order to describe the locus of attention as an area of increased visual sensitivity. Conversely, areas where the contrast threshold is higher (lower sensitivity) are those areas where attention is not focussed at the moment. Figure 02 shows that the locus of attention is defined by a leftward shift of the psychophysical curve: probes at the target site are treated as if they were of a higher contrast. This is due to enhanced visual sensitivity, which is an attentional effect. Figure 03 shows the normalized contrast thresholds for three trials with different stimulus onset asynchronies (SOA) per animal. In each trial the probe was presented at the saccade target location. Points significantly below the dashed line show attentional enhancement - the lower, the stronger. As one can see, there is a clear attentional effect at the target site when the monkey prepares a saccade towards it. Up to this point, the attentional effects were strictly top-down, that is, there was only a voluntary, goal-directed focussing towards the target location. But, as experience shows, novel stimuli with an appropriate degree of salience can attract attention. This would be the bottom-up or stimulus-driven allocation of attention. To include this effect, Bisley and Goldberg introduced a distracter point. This point was flashed occasionally after the target location had been cued. It could occur either in the saccade target location or on the opposite side of the screen.
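The leftward shift of the psychophysical curve can be sketched numerically. The logistic function, its slope and the two threshold values below are illustrative assumptions, not fitted data from the paper; the point is only that a leftward shift lowers the contrast at which performance reaches a fixed criterion (here 75% correct).

```python
import numpy as np

def performance(contrast, threshold, slope=8.0):
    """Toy logistic psychometric function: P(correct) rises from 0.5
    (guessing) to 1.0; at contrast == threshold, P(correct) = 0.75."""
    return 0.5 + 0.5 / (1.0 + np.exp(-slope * (np.log(contrast) - np.log(threshold))))

contrasts = np.logspace(-2, 0, 200)                    # 1% .. 100% contrast
p_unattended = performance(contrasts, threshold=0.20)  # curve away from target
p_attended   = performance(contrasts, threshold=0.10)  # leftward-shifted curve

def contrast_threshold(contrasts, p, criterion=0.75):
    """Smallest contrast at which performance reaches the criterion."""
    return contrasts[np.argmax(p >= criterion)]

t_un = contrast_threshold(contrasts, p_unattended)
t_at = contrast_threshold(contrasts, p_attended)
print(t_at < t_un)  # attention lowers the contrast threshold
```

Reading the threshold off the attended curve yields a lower contrast, which is exactly the "index of attention" Bisley and Goldberg describe.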


02 The effect of attention on the psychophysical curve ( contrast vs. performance ).

03 The effect of attention on contrast threshold at target location.

04 The shift of the attentional effect from target to distracter and back.



Shifting of attention by distraction

When the distracter flashed on the opposite side, half a second after the target stimulus, an interesting phenomenon could be observed: the locus of attention shifted from the potential saccade target location to the site of the distracter and back. Figure 04 depicts the time course of this shift of attention. At first we will describe the repositioning of attention from the target to the distracter site. The investigators presented the GO/NOGO probe at the site of the distracter 200 ms after the distracter flashed. Their finding was that the contrast threshold at the distracter site had been depressed. A lowered contrast threshold is equivalent to an increase in sensitivity, and increased sensitivity is assumed to be an attentional effect. If attention was drawn to the site of the distracter, it must have been drawn away from the original saccade target location. So Bisley and his colleague presented the probe at the original target site 200 ms after the distracter flashed. Just as expected, the contrast threshold was elevated at the site where it had been lowered 200 ms before. The locus of attention was now on the opposite side of the screen. When the probe was presented about 700 ms after distracter presentation, the results showed that the monkey’s attention had returned to the same configuration as before the distraction. The contrast threshold was heightened at the distracter site as well as anywhere else on the screen except the original saccade target location, where it was low again.

Or as the two researchers put it :“Thus, as in humans, the monkey’s attention is involuntarily drawn to the flashed distracter. This occurs even when the animal is planning a saccade elsewhere, but the attentional effect lasts for less than 700 ms, by which time the attention has returned to the saccade goal”

What’s going on in LIP during attention and shifts of attention?

The next step for Bisley and Goldberg was to relate the gathered behavioural data about the dynamics of attention to activity in the lateral intraparietal area. They recorded the firing activity of 41 LIP neurons using a single-cell recording technique. The two graphs shown in figure 05 depict the firing pattern of a single neuron in LIP. In the top row of both graphs, one can see spike trains from individual trials. The graph below is simply a summation of these spikes. Blue and red bars represent the presentation of the target and distracter stimulus. Just in case these are too many colours for you to follow:

- blue bar = target stimulus presentation
- red bar = distracter stimulus presentation

The blue graph at the top shows the activity of our neuron as the target stimulus appeared in its receptive field, followed by the flashing of the distracter outside the receptive field. Conversely, the red graph shows the


05 Activity in LIP neurons with receptive fields at target or distracter location.

06 Activity of populations of neurons at target and distracter location in both animals.



activity of a neuron (it might be the same neuron, or any other in LIP) upon presentation of the distracter stimulus in its receptive field, while the preceding target stimulus was not within the range of its receptive field.
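The summation of spike trains described above is a peri-stimulus time histogram (PSTH). A minimal sketch of its construction follows; the spike times and the bin width are invented purely for illustration.

```python
import numpy as np

# Spike times (ms) from three hypothetical trials of one LIP neuron
# (made-up numbers, purely for illustration).
trials = [
    [12, 55, 60, 61, 140],
    [10, 58, 62, 150],
    [15, 57, 59, 63, 64],
]

# Peri-stimulus time histogram: count spikes in 20 ms bins,
# then average over trials to get a firing rate in spikes/s.
bin_ms = 20
edges = np.arange(0, 200 + bin_ms, bin_ms)
counts = np.zeros(len(edges) - 1)
for spikes in trials:
    counts += np.histogram(spikes, bins=edges)[0]
rate = counts / len(trials) / (bin_ms / 1000.0)  # spikes per second

peak_bin = int(np.argmax(rate))
print(edges[peak_bin], edges[peak_bin + 1])  # bin with the strongest response
```

Aligning the bins to stimulus onset makes the peak in the PSTH directly comparable across trials, which is what the summed graphs in figure 05 show.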

Now we are going to enlarge our focus to whole populations of LIP neurons. Figure 06 is in a way comparable to the single-cell recordings we saw above, but there are subtle differences. While the blue and red bars are already known to us, the graphs need a little explaining: the blue graph represents the normalized activity of the group of neurons whose receptive fields capture the flashing of the target stimulus while being ‘blind’ to the appearance of the distracter stimulus. The red trace describes just the opposite. A further difference is the fact that the graphs vary in thickness. The thickness represents the standard error of the mean (SEM), which reflects the fact that even similar neurons under similar conditions exhibit small fluctuations in activity, sometimes a bit below and sometimes a bit above the mean. The most obvious deviation concerns the plotting of a red and a blue graph into the same coordinate system. The problem is that these recordings are not taken from a single trial, yet they are in the same coordinate system along one and the same time axis. The two graphs are assembled from single-cell recordings in two populations of neurons. As there are about 20 neurons in each population (and therefore in each graph), the number of trials stuffed into a single coordinate system is about 40. But according to Bisley

and Goldberg this does not pose any problem to us: “...one could as easily reinterpret the activity as that simultaneously seen in two different populations of neurons, one with receptive fields at the saccade goal and the other with receptive fields at the distracter site.” This fits us just right.
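The mean trace and its SEM band can be computed as follows; the random traces below stand in for real recordings and are purely illustrative. SEM is the across-neuron standard deviation divided by the square root of the number of neurons.

```python
import numpy as np

rng = np.random.default_rng(42)
# Hypothetical normalized activity: 20 neurons x 50 time bins
# (random stand-in data, assumed for illustration).
traces = rng.normal(1.0, 0.3, size=(20, 50))

mean_trace = traces.mean(axis=0)
# Sample standard deviation (ddof=1) over neurons, per time bin,
# divided by sqrt(n): the standard error of the mean.
sem_trace = traces.std(axis=0, ddof=1) / np.sqrt(traces.shape[0])

# The plotted band in figure 06 would span mean +/- SEM at each time bin.
print(mean_trace.shape, sem_trace.shape)
```

The thickness of the graphs in figure 06 corresponds to this mean ± SEM band at each point in time.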

Comparison of LIP activity and behaviour

The next step for Bisley and Goldberg was to correlate the behavioural data with the recordings from LIP. Figure 07 is almost completely composed of data we already know, so don’t be afraid. The top row shows the performance of the monkeys as observed in the now familiar behavioural trials. You needn’t be confused by the triangles and circles: the triangles represent data from the trials shown in figure 04. This series of trials was conducted before the single-cell recording. Circles represent data from trials performed after the single-cell recording. As usual, the blue triangles/circles show the performance of the monkey when the probe stimulus was presented at the saccade target site, while the red symbols correspond to the animal’s performance when the probe was presented at the distracter site. Again: a low contrast threshold means a high sensitivity at this site, which is an attentional effect. So far we recognize the dynamics of the shift of the locus of enhanced sensitivity as we have seen it before. The function of the mysterious grey vertical column will be explained later. The red and blue graphs in the bottom row are taken from figure 06. Each stands for the averaged, normalized activity of a neuronal population. Neurons

07 Comparison of LIP activity and behavioural data.


with receptive fields at the saccade target location are included in the blue graph, while neurons ‘watching’ the distracter location are shown in red. As you may have noticed, the time axis is arranged differently from figure 06: the axis below is stimulus-locked to the presentation of the distracter, while the axis in figure 06 starts a bit earlier and includes both target and distracter presentation. This is the reason why we don’t see a blue peak below. The black line represents the difference between the activities in the ‘blue’ and ‘red’ populations of neurons. The significance of this difference is assessed by the Wilcoxon signed-rank test, as denoted on the right. A significant difference is represented by a low value close to zero, while a point where there is no difference between the activities of the two neuronal groups has a value approaching one. There is indeed an interval on the time axis where the difference between the two populations is not significant. A few milliseconds after the distracter presentation period, the activity in the population of cells with their receptive fields at the site of the distracter begins to decrease. Simultaneously, there is a slow but significant increase in the neuronal population at the target site. As a consequence, there is a point where the red graph (distracter population) and the blue graph (target population) cross. During this interval, there is no significant difference in activity between the two sites. The time window during which this effect occurs is called the ‘window of ambiguity’. Its duration is marked in figure 07 by the vertical grey column. As a consequence of these findings, Bisley and Goldberg went on and looked for a correlate of the window of ambiguity in the behavioural tasks. Their findings can be read very clearly from figure 07: again, the grey column extending into the top row of the figure marks the length of the window of ambiguity. During this time, where there is no significant difference in the activity of the underlying LIP neurons, there is no spatial region with enhanced sensitivity.
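A minimal sketch of the statistics behind the black line: for a given time bin, the two samples of population activity are compared with a Wilcoxon signed-rank test (hand-rolled here with a normal approximation; the firing rates are invented), yielding a p-value that is small when the populations differ clearly and large inside a window of ambiguity.

```python
from math import erf
import numpy as np

def wilcoxon_signed_rank_p(x, y):
    """Two-sided p-value for the Wilcoxon signed-rank test
    (normal approximation; zeros dropped, ties not corrected)."""
    d = np.asarray(x, float) - np.asarray(y, float)
    d = d[d != 0]
    n = len(d)
    ranks = np.argsort(np.argsort(np.abs(d))) + 1   # ranks of |d|
    w_plus = ranks[d > 0].sum()                     # rank sum of positive diffs
    mean = n * (n + 1) / 4.0
    sd = np.sqrt(n * (n + 1) * (2 * n + 1) / 24.0)
    z = (w_plus - mean) / sd
    # two-sided p from the standard normal, via the error function
    return 2.0 * (1.0 - 0.5 * (1.0 + erf(abs(z) / np.sqrt(2.0))))

rng = np.random.default_rng(0)
# Hypothetical per-trial firing rates of the two populations in one time bin:
target_pop   = rng.normal(20.0, 2.0, size=40)  # receptive field at saccade goal
distract_pop = rng.normal(12.0, 2.0, size=40)  # receptive field at distracter

p_separated = wilcoxon_signed_rank_p(target_pop, distract_pop)
p_ambiguous = wilcoxon_signed_rank_p(target_pop, target_pop + rng.normal(0, 2.0, 40))
print(p_separated, p_ambiguous)
```

Sliding such a comparison across all time bins and plotting the p-values is, in essence, what the black trace in figure 07 shows.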

Graded vs. Binary

There is a remarkable feature about LIP that needs a little explanation here:

Activity in LIP can obviously be correlated with the behavioural shift of attention and the movement of the locus of enhanced sensitivity. But the sensitivity advantage is either on or off - it is binary. One site in the visual field is perceptually enhanced while every other site is not. Sometimes, during the window of ambiguity, there is no such preferred site, but the attentional effect is still binary. Activity in LIP, on the other hand, is graded: there are varying levels of activity, for example in the target and the distracter populations. And even though both activities may be above baseline at the same time, there can only be one winner (or none). There is no threshold value that has to be exceeded in order to ‘win’ the attentional effect in the corresponding part of visual space; the population with the highest activity relative to the activity at every other site is the winner. Another nice quote outlines this idea: “Thus, the attentional advantage lay in the spatial location subtended by the receptive fields of the neurons with the greatest activity, regardless of the absolute value of that activity”
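This graded-to-binary read-out can be sketched in a few lines; the activity values and the margin parameter are purely illustrative assumptions, not quantities from the paper.

```python
import numpy as np

# Graded LIP activity for populations covering different visual-field sites
# (arbitrary normalized values, assumed for illustration).
activity = {"target": 1.4, "distracter": 0.9, "elsewhere": 0.6}

def attended_site(activity, margin=0.0):
    """Binary read-out of graded activity: the site with the highest
    population activity wins the attentional advantage, regardless of
    the absolute values. If no site clearly leads, report ambiguity."""
    ranked = sorted(activity.items(), key=lambda kv: kv[1], reverse=True)
    (best, a), (_, b) = ranked[0], ranked[1]
    return best if a - b > margin else None  # None = window of ambiguity

print(attended_site(activity))                            # the clear winner
print(attended_site({"target": 1.0, "distracter": 1.0}))  # ambiguous case
```

Note that there is no absolute threshold anywhere in the read-out: only the relative ordering of the populations matters, exactly as the quote states.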

LIP activity predicts the quality of performance

Moreover, the activity in LIP predicted the behaviour of the monkey in the sense that it predicted the quality of its response. When an animal performed a task correctly, responses to the distracter measured in the 100 ms before the probe appeared were lower than in incorrect trials. During correct trials, activity at the saccade target site was larger. The weaker the response to the distracter and the stronger the activity at the saccade goal, the better was the result. “We suggest that this activity in LIP provides a general index of the quality of a monkey’s performance.” In figure 08, one can see the relation between the quality of the response and the activity in the saccade target and distracter populations. On the left, the ratio of correct trials increases along with the activity in the target population. The right side shows the opposite for the activity in the distracter population.

Summary LIP

Bisley and Goldberg have shown that activity in LIP can

08 Activity in LIP vs. quality of response.


indeed continuously describe the dynamics of the locus of attention. They described the attentional effect as an increase in visual sensitivity at the attended location. The attentional advantage is always of the same nature, no matter whether it is driven top-down (by a planned saccade to the site) or bottom-up (by a distracter stimulus). Furthermore, they pointed out the distinction between the binary nature of the attentional effect and the graded activity in LIP. They proposed that the graded activity of a neuron can be seen as a priority value attached to the underlying receptive field. Additionally, it was remarked that one must look at the whole set of neurons in LIP to find the monkey’s locus of attention, because the highest activity in LIP describes this locus, not just any neuron above a threshold activity. Last but not least, we have gained the insight that LIP firing can act as an index of the quality of task performance.

Corbetta et al. - Voluntary orienting is dissociated from target detection in human PPC

As already touched upon above, we are going to discuss a further approach to the exploration of attentional control in the posterior parietal cortex.

Clinical inspiration

Maurizio Corbetta and colleagues drew inspiration from clinical neuroscience to formulate their hypotheses. There are clinical cases that suggest a dissociation between areas responsible for top-down and bottom-up driven attentional control, respectively.

(Now would be a great time to have a look at the PPC map a few pages above.) Patients with damage to the right temporoparietal cortical junction (TPJ; it includes the inferior parietal lobule and the superior temporal gyrus) show a unilateral visual neglect affecting novel, unexpected stimuli on the left side of the visual field. If these patients receive a cue to attend to a stimulus in the neglected hemifield, they can manage to do so. So top-down attentional control is still intact while bottom-up control is impaired. The syndrome may occur after damage to the left TPJ as well, but the effects are more severe after right than after left lesions. On the other hand, there is data indicating that another area along the intraparietal sulcus (IPs), which separates the superior from the inferior parietal lobule, is involved in the top-down driven control of attention. This hypothesis is supported by results from experiments with monkeys and humans.

The experiment

This fitting dissociation between bottom-up, stimulus-driven attentional control in the TPJ and top-down, goal-directed commanding of attention along the IPs was just too catchy for Corbetta and colleagues to ignore. So they tested the predictions with human subjects using event-related fMRI (ERfMRI). Figure 09: subjects had to climb into an MRI scanner and fixate a green fixation cross in the middle of a rectangular box. The green cross was flanked on either side by square boxes. Accurate fixation had to be maintained throughout the trial. At the beginning of the trial, the green fixation cross was covered by a cue arrow pointing towards the left or the right box, respectively.


09 Fixation time !

10 In a cue trial, nothing but a cue is presented

11 In a noise trial, a cue was followed by a period of waiting for the stimulus to come.

12 A valid and an invalid trial.



The arrow indicated the most likely location of a subsequent target stimulus (a white asterisk) that was to be expected in one of the square boxes. The duration of the cue presentation was one MR frame (2360 ms). What happened after cue presentation depended on the type of trial. There were cue (20% of all trials), noise (20%), valid (44%) and invalid (16%) trials. Figure 10: on a cue trial, the trial ended immediately after cue presentation. The end of a trial was signalled by a change in the colour of the fixation cross: it turned from green to red. After having seen the cue, the subject’s attention shifted towards the location indicated by the cue and was maintained there until the red cross indicated the end of the trial. A noise trial, as seen in figure 11, was composed of cue presentation and a subsequent test period with a duration of two MR frames. This trial is similar to the cue trial, but attention to the cued box has to be sustained for a longer time. Figure 12 shows a valid and an invalid trial. In a valid trial, the target stimulus appeared at the cued location during the test period. In an invalid trial, the target appeared at the side of the rectangle opposite to the cued location. The trials lasted between 4 and 7 MR frames. The (human) subjects were told to press a button upon target detection. They had to withhold the response during cue and noise trials.
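As a small sketch of the trial bookkeeping, a session with the stated trial-type proportions could be generated like this. The helper make_session, its seed and the session length are hypothetical; only the four proportions come from the text.

```python
import random

# Trial-type proportions as stated in the design described above.
proportions = {"cue": 0.20, "noise": 0.20, "valid": 0.44, "invalid": 0.16}

def make_session(n_trials, proportions, seed=1):
    """Build a shuffled trial list matching the stated proportions
    (hypothetical helper, for illustration only)."""
    trials = []
    for kind, p in proportions.items():
        trials += [kind] * round(n_trials * p)
    random.Random(seed).shuffle(trials)  # fixed seed for reproducibility
    return trials

session = make_session(100, proportions)
print(len(session), session.count("valid"))
```

Shuffling matters here: if the trial types were blocked instead of interleaved, subjects could anticipate whether a target would appear at all.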

Results & Conclusion

There are obvious similarities between this paradigm and the one used by Bisley and Goldberg in the experiment discussed before. Upon recognizing the cue arrow, the subject has to direct attention voluntarily towards the cued side of the rectangle. Again, this is what we call top-down attentional control. In an invalid trial, the target stimulus is equivalent to the distracter stimulus in the Bisley and Goldberg experiment: it attracts attention in a bottom-up manner. Responses in valid trials were faster than in invalid trials - an indication of the attention devoted to the cued location. The results from the fMRI scan provided direct confirmation of the two predictions that were made before. These predictions stated that two different cortical areas are responsive during top-down and bottom-up control of attention, respectively. The cortex around the intraparietal sulcus (IPs) was active before target presentation. This is the period during which voluntary attention is paid

to the target square. This process does not depend on bottom-up target detection. Conversely, the right temporoparietal cortical junction (TPJ) responded best to unexpected target presentation at an unattended location. These results may explain the clinical cases mentioned above. Neglect cases, in which bottom-up stimulus detection in the contralateral field is disturbed, are commonly associated with a lesion in the TPJ of the right hemisphere. When the IPs is still intact, a verbal cue may help the patient to voluntarily (top-down) attend to the left visual hemifield. With the help of such cues, top-down guided attention can help localize objects in the left hemifield that failed to attract attention bottom-up.

Summary

Bisley and Goldberg have shown that LIP possesses similarities to a saliency map. Corbetta and colleagues dissociated two distinct neural saliency maps for top-down and bottom-up guided attention.

You have to bear in mind that you tend to find what you are looking for. Researchers may interpret their findings according to what they expect, and the chosen paradigm limits the set of possible answers to those that fit our way of looking at things. Some people can write a whole research paper without using the word ‘representation’; others, who have another way of looking at things, tend to see representations everywhere. The same holds for the notion of saliency maps. LIP is not a saliency map. The model of a saliency map just fits very well with what we observe in LIP when looking for correlates of saliency maps. Later on we will see that other areas, like the FEF and the superior colliculus, possess properties that make them similar to a saliency map as well.

Imagery

What is imagery?

Stephen M. Kosslyn definitely has an opinion about imagery. Being responsible for the “imagery” section of the MIT Encyclopedia of Cognitive Science, Kosslyn has managed to put together these formidable sentences (actually taken from Kosslyn et al., “Neural Foundations of Imagery”), giving a neat little introduction to imagery. So I don’t have to.

13 The illusory contour used in the experiment by Goebel et al.: the Kanizsa square.


“Mental imagery occurs when perceptual information is accessed from memory, giving rise to the experience of ‘seeing with the mind’s eye’, ‘hearing with the mind’s ear’ and so on. By contrast, perception occurs when information is registered directly from the senses. Mental images need not result simply from the recall of previously perceived objects or events; they can also be created by combining and modifying stored perceptual information in novel ways.”

The questions

As usual, we want to know which areas might be involved in visual imagery. Furthermore, we want to know whether there is any overlap between the ensemble of areas active during ‘normal’ visual perception and the areas active during imagery of the same percepts.

The constructive nature of vision: direct evidence from functional magnetic resonance imaging studies of apparent motion and motion imagery

R. Goebel and his colleagues try to answer this question.

They use fMRI to observe the effects of various kinds of motion perception on brain activity. These perceptions include:

- Objectively perceived motion: a real stimulus, usually a cloud of moving dots.

- Apparent motion: when multiple visual stimuli are presented in an alternating manner at a certain frequency, subjects perceive a movement from one stimulus site to the other instead of two alternately flickering stimuli.

- Apparent motion of illusory contours: one can create the perception of apparent motion using illusory contours like Kanizsa squares (Figure 13).

- Imagined movement: subjects imagine one of the above stimuli.

- Controls: flickering controls, fixation periods as well as static stimuli are presented, too.


14 Stimuli and controls presented during the “Apparent motion I” series.

15 Stimuli and controls presented during the “Apparent motion II” series.




16 Brain activity during “Apparent motion I”

17 Brain activity during “Apparent motion II”. Abbreviations: subjective contour apparent motion (SCAM); subjective contour flickering control (SCFC); no contour apparent motion (NCAM); no contour flickering control (NCFC).

18 Brain activity during “Motion Imagery I”


The Paradigm

Having a closer look at the list above, one may notice that the amount of objective motion is gradually reduced from top to bottom, while the amount of internally generated motion is gradually increased. So there is a gradual shift from ‘real’ to imagined stimuli. The idea is to observe the effect of this shift from physically present to internally generated percepts using functional imaging techniques.

Experimental setup

Subjects have to focus on a fixation cross on a computer screen. Then one of the experimental setups (described below) follows. (All of the following setups included fixation periods and objective motion periods.)

Setup: Apparent motion I (Figure 14)

During the “Apparent motion I” series, the perception of apparent motion is created by two concentric rings of different diameter that are switched on and off in an alternating manner. As a result, the two flashing rings appear to be a single ring, continuously shrinking and expanding. While this shrinking and expanding is already an illusion, there still are objectively observable stimulus contours involved. The control condition consists of one or two double rings flickering at 0.5 and 1 times the frequency of the original stimulus.

Setup: Apparent motion II (Figure 15)

Here, two Kanizsa squares are presented in alternation on the left and right side of the fixation cross. At both locations there are four pacmen, their mouths being open or closed according to the alternation frequency. This creates the impression of a single square moving from left to right. This experimental setup combines illusory contours and illusory (apparent) movement. The control conditions include a no-contour trial with messed-up pacmen as well as a trial where two Kanizsa squares flicker in unison.

Setup: Motion imagery I

After watching an objective or apparent motion stimulus, subjects have to pass a fixation period (to let the haemodynamic response return to baseline) and are then signalled to imagine the preceding stimulus as intensely as possible.

Setup: Motion imagery II

In this experimental setup, motion imagery is compared with imagery of static stimuli.

Results: Apparent motion I (Figure 16)

During ‘Apparent motion I’, V1 shows just the same amount of activity as in all trials where remainders of observable stimuli are present, as there simply is visual information entering the system. MT/MST responds vigorously to apparent motion and (of course) to objective motion.

Conclusion: The MT/MST complex seems to be the first area along the dorsal pathway that is responsive to apparent motion stimuli.

Results: Apparent motion II (Figure 17)

As already mentioned above, V1 responds as usual to the incoming visual signals. V2 and V3A respond significantly more strongly to the two conditions with subjective contours than to the respective control conditions with the outward-turned pacmen. Apparent motion of subjective contours activates MT/MST more effectively than the no-contour controls and the flickering controls.

Conclusion: MT/MST is able to use subjective contour information. As V2 already responds more strongly to Kanizsa squares than to no-contour controls, subjective contour information is thought to be extracted at the level of V2.

Results: Motion imagery I (Figure 18)

All subjects reported that they were able to evoke a clear motion experience, yet this percept lacked details. Interestingly, most of the areas involved in objective and apparent motion perception are active during the imagery conditions as well:

1. The MT/MST complex has an average activation of 60% compared to objective motion perception. The upper anterior part of MT/MST (the putative MST homologue) is activated 80% as intensely, while the lower posterior part of MT/MST reaches a 40% activation ratio compared to objective motion.
2. V2 and V3A are weakly but significantly activated.
3. V1 shows no significant activation during imagery.

Additional cortical areas in the inferior, dorsolateral and superior temporal cortex are strongly and bilaterally activated.

Conclusion: The activity during imagery becomes more similar to that during objective motion with synaptic distance from V1 along the dorsal stream. The reported lack of detail in imagined stimuli might be connected to the lack of activation in V1. MT/MST may be able to evoke an experience of motion detached from the figural details available to V1 in tasks with real stimuli.

Results: Motion imagery II

During the ‘Motion imagery II’ setup, the results are similar to ‘Motion imagery I’, but MT/MST is not involved in imagery and perception of static stimuli.

Overall conclusion:


From the apparent motion experiments we can deduce that MT/MST is the first area to use apparent motion cues. MT/MST can also use subjective contour information. The latter is thought to be extracted at the level of V2, because V2 responds more strongly to illusory contour stimuli than to messed-up Kanizsa squares. By doing this, V2 may make this information accessible to MT/MST. V1 is not significantly activated during imagery of moving stimuli. The reported lack of detail in imagery may be due to the lack of activation in V1. Many higher-level areas are active during imagery, among them FEF, BA 9/46, IPL and SPL (inferior and superior parietal lobule). Some of these areas may be involved in spatial attention, as previous parts of the script suggested.

Full Stop ?

As you possibly imagined, this is not the whole story. The publication by Goebel et al. is from 1998. In a review from 2001 (see Kosslyn et al. - Neural Foundations of Imagery), Kosslyn stated that there have been numerous experiments indicating an activation of V1 during imagery. In the majority of the reviewed experiments this activation was significant. Possible sources for these deviations are, as usual: experimental setups, data acquisition and data analysis. So keep in mind that science is a continuous process, and note that up to now the question whether V1 is involved in imagery is to be answered with: “Yes, it is!”


a&c | Appendix B: MRI, fMRI


Appendix B: MRI & fMRI

Introduction

Imaging techniques are the art of creating images of the inside of an organism without (necessarily) killing it. MRI as well as fMRI are imaging techniques. As you are probably studying cognitive science or a neuroscience-related subject, you will probably know what techniques like MRI and fMRI are good for. I will leave out the section where I praise and glorify the advances of functional imaging techniques and their wondrous impact on neuroscience. You’ll get enough of that in cognitive psychology and other subjects. The following part is heavily “inspired” by Douglas C. Noll’s “A Primer on MRI and Functional MRI”. The primer is freely available on Noll’s homepage.

01 A typical result of an fMRI scan. The raw images (below) are integrated into the Talairach brain coordinate system (above).


Magnetic Resonance Imaging

MRI is based on the physics of Nuclear Magnetic Resonance (NMR). NMR is a phenomenon based upon the quantum mechanical property of nuclear spin. This spin is associated with a magnetic moment. Having a look at figure 02, we can more easily imagine what the spin of a nucleus is: it simply is its magnetization vector. It may point in any direction and wobble around its imaginary axis. All atomic nuclei with an odd number of neutrons or protons have a nuclear spin. There are a lot of isotopes with an odd atomic weight, like 13C, 31P or 23Na, that do great service in MR imaging. For imaging in living things, hydrogen is the atom of choice. Hydrogen (1H) consists of only a single proton. One is an odd number. 1H (or H+, or simply the proton) is contained in large concentrations in organisms. Therefore hydrogen is the entity of interest in most MRI studies involving organisms. An object like the brain is full of hydrogen nuclei, each with the characteristic spin. In an NMR experiment, this object is placed in a strong, static magnetic field called B0. The magnetic moments associated with nuclear spin tend to align themselves parallel or antiparallel (parallel but oppositely directed) to the static magnetic field. As a slightly larger fraction aligns parallel to the magnetic field B0, they form a net magnetization parallel to B0. Figure 03 shows what I just described: an incredibly large number (well, there are only six in figure 03) of nuclei with magnetic moments become aligned to B0 and form a net magnetization called M, because the majority of nuclei is aligned parallel to B0. There is no net magnetization in the transverse plane (orthogonal to B0), because the spins precess at random phases and not in unison. This magnetization shows resonance (our NMR phenomenon) to oscillating magnetic fields at a characteristic frequency. This frequency is determined by the Larmor equation:

f = γ⋅B0

B0 is the strength of the applied magnetic field measured in tesla (T). γ is a constant specific to the type of nucleus and describes a frequency per tesla. For 1H, γ is 42.58 MHz per tesla. At a B0 strength of 1.5 T, the resonance frequency of 1H consequently is about 63.87 MHz. As these frequencies are in the radio range, resonance frequencies are often referred to as radio frequencies or RF pulses. In most MR systems, the strength of the main magnetic field is between 0.002 T and 5 T. To generate field strengths above 0.5 T you need to own a superconducting magnet. Higher field strengths yield a better signal-to-noise ratio (SNR). (If you want more information on this, read the primer!)
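Since the Larmor relation is just a product, the numbers above are easy to sanity-check. A minimal sketch (the constant is the standard value for 1H; the function names are our own, not from any MRI library):

```python
# Larmor equation: f = gamma * B0
GAMMA_H_MHZ_PER_T = 42.58  # gyromagnetic ratio of 1H, in MHz per tesla

def larmor_frequency_mhz(b0_tesla, gamma_mhz_per_t=GAMMA_H_MHZ_PER_T):
    """Resonance (radio) frequency in MHz for a given static field strength."""
    return gamma_mhz_per_t * b0_tesla

for b0 in (0.5, 1.5, 3.0):
    print(f"B0 = {b0} T -> f = {larmor_frequency_mhz(b0):.2f} MHz")
```

At 1.5 T this gives 63.87 MHz, squarely in the radio band.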

02 The Spin

03 The magnetization vectors of some nuclei align to B0

04 The net magnetization M spiralling down towards the transverse plane. Excellent wobbling, M!



So what does this frequency value tell us?

Now the NMR phenomenon comes into play. If we apply a magnetic field oscillating at a frequency of 63.87 MHz to our object, the spinning nuclei will absorb energy and get excited. This excitation results in the magnetization beginning to precess around the axis of B0, because it is tipped into the plane orthogonal to B0. This precession occurs at the specific resonance frequency associated with the target nucleus. As seen in figure 04, the magnetization M begins to wobble around B0’s axis and to spiral down towards the transverse plane. The higher the power of the radio frequency pulse (remember: the power is proportional to the square of the amplitude), the smaller the angle between M and B1. In other words, the higher the RF power, the lower M may wobble. Figure 06 shows that if one places a coil near the excited object, the precessing, rotating magnetization M will induce a voltage s(t) in the coil. s(t) oscillates proportionally to the rotation speed of M. The signal s(t) contains a whole spectrum of signals from different sources and has to be decomposed later on. This is done by frequency analysis. After excitation, the magnetization vector returns to its equilibrium state.

This decay of the wobbling occurs exponentially according to the time constant T2. The decay speed is sensitive to interactions with other spins (the number of 1H nuclei in the tissue) as well as to small inhomogeneities in the magnetic field. Different tissues produce different magnetic field inhomogeneities. The time constant T2 together with spin-spin interaction and magnetic field inhomogeneities is the basis of the new time constant T2*. This decay constant describes the cessation of precession in the plane orthogonal to the original field: the transverse signal decays with T2* once excitation is stopped. Finally, the magnetization returns to its equilibrium state and M realigns with B0 again. As a result, the net magnetization M returns to the value it had before excitation, because the magnetization vectors of the nuclei are aligned to B0 just like before excitation. The time constant for this realigning process is named T1 and again varies from tissue to tissue. These two time constants can be read out of the voltage induced in the coil. Much of MRI is based on exploiting differences in these parameters to discriminate different tissues.
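The two relaxation processes described above can be sketched as simple exponentials. The time constants used below are illustrative order-of-magnitude placeholders, not values taken from the text:

```python
import math

def transverse_signal(t_ms, t2_star_ms):
    """Decay of the transverse (wobbling) component: exp(-t / T2*)."""
    return math.exp(-t_ms / t2_star_ms)

def longitudinal_recovery(t_ms, t1_ms):
    """Realignment of M with B0: 1 - exp(-t / T1)."""
    return 1.0 - math.exp(-t_ms / t1_ms)

# Illustrative constants only: T1 on the order of 900 ms, T2* of a few tens of ms.
T1, T2_STAR = 900.0, 50.0
for t in (0, 25, 50, 100, 900):
    print(f"t = {t:>3} ms   transverse = {transverse_signal(t, T2_STAR):.3f}"
          f"   longitudinal = {longitudinal_recovery(t, T1):.3f}")
```

At t = T2* the transverse signal has fallen to 1/e (about 37%), while longitudinal recovery only reaches about 63% at t = T1; because T2* is much shorter than T1, the two constants leave separable fingerprints in the coil voltage.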


05 An MRI scan as a projection onto one-dimensional space: frequency distribution as a function of x-position.

a) The magnetic field gradient is responsible for the difference in frequency (see Larmor equation).

b) The corresponding decomposed frequency spectrum

06 Rotation of M induces a current in a nearby coil.


Now we know that different tissues emit different radio responses. But we want to localize the different tissues, not just know that they are there. Spatial localization in MRI is made possible by manipulating the main magnetic field. A magnetic field gradually changing in strength will change the resonance frequency of magnetized tissue according to the Larmor equation. Usually special gradient coils are used to manipulate the main magnetic field linearly with spatial location. The new Larmor equation looks like this:

f = γ ⋅ B(x)

where

B(x) = G ⋅ x + B0

G is the gradient of the strength of the magnetic field along an axis called x. Now there is a one-to-one correspondence between spatial location and the resonance frequency of the nuclei at this location. We find somebody who can do Fourier transformation and have them decompose s(t) into its different response frequencies. Now we can see the magnetization properties of different tissues along our x axis.
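With the gradient in place, frequency and position are interchangeable, which can be checked numerically. The values of G and x below are made-up illustrations, not values from the text:

```python
# Frequency encoding along x: B(x) = G*x + B0, so f(x) = gamma * (G*x + B0).
GAMMA_HZ_PER_T = 42.58e6  # gyromagnetic ratio of 1H, in Hz per tesla
B0 = 1.5                  # main field strength in tesla
G = 0.01                  # illustrative gradient strength, tesla per metre

def frequency_at(x_m):
    """Resonance frequency (Hz) of the nuclei at position x."""
    return GAMMA_HZ_PER_T * (G * x_m + B0)

def position_of(f_hz):
    """Invert the gradient-modified Larmor equation to recover x."""
    return (f_hz / GAMMA_HZ_PER_T - B0) / G

x = 0.05  # 5 cm from the isocentre
f = frequency_at(x)
print(f"x = {x} m -> f = {f / 1e6:.4f} MHz -> x recovered = {position_of(f):.4f} m")
```

This one-to-one mapping is exactly what the Fourier decomposition of s(t) exploits: each frequency bin in the spectrum corresponds to one x position.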

There are different ways to receive such a one-dimensional function. By combining these methods one can create two-dimensional slices of an object. These can be stacked to form a three-dimensional image of the magnetization properties of an object. MRI produces exquisite images of the soft tissue of an organism and is completely non-invasive, requiring neither surgical intervention nor ionizing radiation.

fMRI - Functional MRI

Physiological reactions to neuronal activity involve many complex processes. A neuron’s metabolism utilizes an increased amount of glucose, oxygen and nutrients during activation. As a result, cerebral blood flow has to be increased in the activated region to transport oxygenated blood and ‘raw materials’ to the site of activity.

Oxyhaemoglobin (oxygenated haemoglobin, haemoglobin carrying oxygen, the principal haemoglobin in arterial blood) is diamagnetic: it does not noticeably disturb the magnetic field. Deoxygenated blood is the blood in the draining veins after oxygen has been unloaded in the active tissue. Its haemoglobin is paramagnetic: it disturbs the static magnetic field. This means it can be used as an intrinsic contrast agent in an appropriately performed MRI session. Such an MRI session is then called an fMRI session, and the s(t) signal is called the BOLD signal. BOLD is an acronym for blood oxygen level dependent. Due to its influence on the main magnetic field, a change in blood oxygenation causes a change in the decay parameter T2*. This leads to an increase of intensity in T2*-weighted images.

07 Interactions in the formation of the BOLD signal. Positive / Negative arrows indicate positive / negative correlations between the parameters


The BOLD signal can be turned into an image just like conventional MRI signals and is usually superimposed onto a high-resolution MRI scan to localize regions of metabolic activity. The weakness of fMRI is its poor temporal resolution, as it measures changes in blood flow and not neural activity directly. Consequently, an fMRI measurement is broken into a sequence of MR-frames. One MR-frame covers two seconds of measured BOLD signal.
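The sluggishness of the blood-flow response can be illustrated with a double-gamma model of the BOLD impulse response. This particular parameterization is a widespread convention (not taken from the text), sampled here at the 2 s MR-frame spacing:

```python
import math

def gamma_pdf(t, shape, scale=1.0):
    """Gamma-distribution density, used here only as a smooth bump shape."""
    if t <= 0:
        return 0.0
    return (t ** (shape - 1)) * math.exp(-t / scale) / (math.gamma(shape) * scale ** shape)

def hrf(t_s):
    """Double-gamma haemodynamic response: peak near 5 s, late undershoot."""
    return gamma_pdf(t_s, 6.0) - gamma_pdf(t_s, 16.0) / 6.0

# Sample the response at one MR-frame (2 s) resolution.
frames = [hrf(2.0 * k) for k in range(12)]
peak_frame = max(range(len(frames)), key=lambda k: frames[k])
print(f"BOLD peak at frame {peak_frame}, i.e. {2.0 * peak_frame} s after the event")
```

The sampled peak lands several frames after the event: a brief burst of neural activity only shows up in the BOLD signal seconds later, which is why fMRI cannot resolve fast neural dynamics.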

ERfMRI

Event-related fMRI is the imaging of the response to a single trial. This is done by an experimental design that allows these responses to be extracted. The inter-trial interval (ITI) can be extended to let the response arrive and vanish again, so that the whole duration of the response is seen. The ITI or the trial types can be randomized to observe the fluctuations between interacting or overlapping trial responses. By mathematical means, the overall shape of a single-trial response can be extracted. This means that (for example) incorrect and correct trials can be analysed separately.
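The simplest extraction scheme, with ITIs long enough that responses do not overlap, is plain epoch averaging. Everything below (response shape, onsets, noise level) is an invented toy example, not data from any study:

```python
import random

# A made-up single-trial response shape, in units of MR-frames.
true_response = [0.0, 0.4, 1.0, 0.7, 0.3, 0.1, 0.0, 0.0]

# Simulate a run: widely spaced trial onsets (long ITI) plus measurement noise.
random.seed(0)
n_frames = 200
onsets = [10, 40, 75, 110, 150]
signal = [random.gauss(0.0, 0.02) for _ in range(n_frames)]
for onset in onsets:
    for i, amplitude in enumerate(true_response):
        signal[onset + i] += amplitude

# Event-related averaging: cut an epoch after each onset, average across trials.
epoch_len = len(true_response)
average = [sum(signal[onset + i] for onset in onsets) / len(onsets)
           for i in range(epoch_len)]
print([round(v, 2) for v in average])
```

With randomized ITIs and overlapping responses, the same idea is carried out with a regression (deconvolution) instead of a plain average, which is what allows correct and incorrect trials to be modelled separately.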

It ain’t that easy after all

However, there is a complex interdependence between many physiological factors. Increased activity does not simply mean increased T2* signals. Figure 07 gives a glimpse of this complexity.


a&c | Top-Down and Feed-Back Mechanisms in the Visual System


CHAPTER 07: Top-Down and Feed-Back Mechanisms in the Visual System

The aim of this section is to discuss the role of feedback connections in the visual hierarchy. First, a high-level area (the Frontal Eye Field, FEF) will be presented as a possible candidate responsible for attentional modulation in a top-down fashion. This apparently contradicts the exclusively bottom-up theories of attention discussed in earlier lectures. In the second part, the role of another high-level area, in the cat cortex (the posterior middle suprasylvian area, pMS), in the generation of direction maps in lower areas will be examined. In the remainder of this section, a general framework for perceptual learning and conscious perception will be suggested, which underlines the importance of top-down mechanisms with a large body of experimental evidence.

Frontal Eye Field

The Frontal Eye Field (FEF) is a frontal lobe area located in the rostral bank of the arcuate sulcus in the frontal cortex of the macaque monkey (fig1). This high-level area might be involved in attentional regulation (of course in a top-down fashion), acting as a saliency map, a possibility which will be discussed after a brief summary of the response properties of the cells in this area.

- The FEF is heavily interconnected in a topographic fashion with areas in both the dorsal and the ventral streams of extrastriate visual cortex (e.g. Schall et al 1995b).
- The ventrolateral FEF, which is responsible for generating short-amplitude saccades, receives visual afferents from the foveal representation in retinotopically organized areas, such as MT and V4; from areas that represent central vision in inferotemporal cortex, such as TEO and caudal TE; and from areas in parietal cortex having little retinotopic order, such as the lateral intraparietal area.
- In contrast, the dorsomedial FEF, which is responsible for generating longer-amplitude saccades, is innervated by the peripheral visual field representation of retinotopically organized areas, by areas that emphasize peripheral vision, such as the dorsal medial superior temporal and parieto-occipital areas, as well as by the lateral intraparietal area.
- The FEF contains visually responsive neurons as well as neurons which elicit saccades.

01 Location of FEF


- However, the activity of presaccadic movement neurons in the FEF appears to be strongly context-dependent, being absent before spontaneous saccades and maximal before volitional saccades.

Selective gating of visual signals by microstimulation of frontal cortex

Moore and Armstrong studied the role of top-down interactions between FEF and V4 by monitoring attention-like effects of FEF stimulation. First, the RFs of some V4 neurons were isolated, and the orientation preferences of these neurons were determined (fig2). The FEF sites were stimulated to elicit saccades, and the directions of these saccades were mapped as well (fig2). The saccade vector of any given FEF site could fall either into the RF of a given V4 neuron or outside it. Aligned (saccade into the RF) and non-aligned V4 neuron/FEF site couples were determined. During the course of the experiment, the effects of sub-threshold FEF stimulation (i.e. below the minimal stimulation level needed to generate a saccade) on the activation levels of V4 neurons were monitored for the aligned and non-aligned conditions, while the monkeys maintained fixation. First of all, FEF stimulation had no

02 As FEF was stimulated, the activity of V4 neurons was monitored. The RFs of V4 neurons, and the saccade targets of FEF cells, were predetermined.

03 The effects of FEF stimulation on the activity of V4 neurons (black trace: no stimulation; red trace: stimulation).


Table: The effect of FEF stimulation on the activity of a given V4 neuron under different stimulus presentations in its RF. (0) indicates no effect, (+) a positive effect, (-) a suppression. The number of (+) or (-) signs roughly indicates how large the effect is.


effect when there was no stimulus presented in the RF. The RF stimulus presentation conditions varied according to i) whether the stimulus was in the preferred orientation of the neuron or not, ii) whether another stimulus was simultaneously presented outside the RF or not, and iii) whether the V4/FEF site coupling was aligned or non-aligned. The activation of the V4 neuron before and after FEF stimulation was compared (e.g. fig3). The modulation was dependent upon the three conditions

04 The effects of FEF stimulation on activity in V4 in different experimental conditions (see text for details)

05 A sketch of the methods used in the experiment.

06 The single condition orientation maps before (baseline) and during cooling.

07 The effects of cooling on orientation and direction maps.



mentioned before: in the aligned condition, FEF stimulation increased the activity of the V4 neuron, especially if the stimulus was of the preferred orientation of the cell and if there was a simultaneous presentation of a distracter outside the RF. In the non-aligned condition, FEF stimulation was only effective when the RF stimulus was of the preferred orientation and there was a distracter as well (in this condition the evoked-saccade vector was aligned with the distracter location); importantly, in this case the modulation was in the negative direction, i.e. it was a suppression of the activation (fig4 and table). These results convincingly show that the FEF is involved in the “activation of a network that controls the gain of visually driven signals”, to borrow the words of the authors, rather than being a centre that passively increases the activation of lower-level neurons. As such the FEF can act in a top-down fashion as a saliency map*, where planned eye movements are involved in (positive or negative) attentional modulation.

*This property of the FEF may be reminiscent of the saliency-map-like properties of LIP, discussed in chapter 6.

08 The effects of cooling on orientation maps, and the reversibility of these effects.

09 Pixelwise comparisons of direction maps, indicating the correlation between different conditions.



The Role Of Feedback In Shaping Neural Representations In Cat Visual Cortex

In another demonstration of the importance of top-down mechanisms, Galuske et al. discussed the role of feedback connections in the generation of direction and orientation maps in area 18. The posterior middle suprasylvian area (pMS, see fig5; cat areas 17 and 18 can be regarded as the V1 homologue) was cooled, i.e. the feedback connections from pMS to area 18 were deactivated. After the calculation of single-condition maps for direction (obtained through activation with coherently moving dots) and orientation (obtained through activation with moving gratings) (fig6), images of orientation and direction maps in area 18 were gathered (fig7). The ones corresponding to the cooling conditions were compared to before-cooling and after-cooling maps, in order to reveal the changes induced by the deactivation of the feedback connections. While the general layout of the orientation maps and the distribution of map features such as pinwheels were preserved, the response strength and orientation selectivity were reduced during cooling; this effect was reversible (fig8). The same reduction in response strength was also present for direction maps generated during cooling. Much more importantly, however, in this case the general layout of the map was also disrupted to a great extent (fig7). This difference can be clearly visualized when pixelwise comparisons of the direction preferences between the “during cooling” vs. “before cooling” direction maps and the “before cooling” vs. “after cooling” direction maps are compared (fig9). This comparison demonstrates that the before- and after-cooling maps display a high correlation, but the during-cooling maps are not correlated with either of them. Thus, while the feedback connections from pMS to area 18 enhance the strength of the response to orientation and sharpen the orientation selectivity in this area, they are not responsible for the layout of this map; whereas the representation of direction selectivity in the same area arises largely from these feedback connections. The enormous change in direction maps before and during cooling is a result of the cooling effects on individual neurons. These effects were selective and depended on the properties of the neurons rather than being the same for all units (fig10). The examination of neuron tuning curves before, during and after cooling revealed a change in the direction preference of less selective neurons, together with a near silencing of highly selective neurons, just during cooling (fig11). Thus the feedback from the high-level area pMS is the main determining factor for the direction selectivity in area 18, as displayed above through the changes in maps due to these top-down connections

10 Average response changes during pMS cooling, separated according to the initial degree of direction selectivity. The impact of pMS cooling was greatest on neurons exhibiting higher degrees of direction selectivity.

11 Polar plots of the spiking response of 3 units in area 18 before, during, and after cooling. Top: a highly selective unit. Middle: a less selective unit. Bottom: a non-selective unit.



which regulate the activity of single neurons.
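The pixelwise comparison of fig9 can be mimicked with synthetic direction maps. Directions are circular quantities, so the difference between two preferences has to be wrapped; all numbers below are synthetic illustrations, not data from Galuske et al.:

```python
import random

def circular_diff_deg(a, b):
    """Smallest absolute difference between two direction preferences, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def mean_map_difference(map_a, map_b):
    """Pixelwise mean angular difference between two direction maps."""
    return sum(circular_diff_deg(a, b) for a, b in zip(map_a, map_b)) / len(map_a)

random.seed(1)
before = [random.uniform(0.0, 360.0) for _ in range(1000)]
# After rewarming, the layout returns: small jitter around the original preferences.
after = [(a + random.gauss(0.0, 5.0)) % 360.0 for a in before]
# During cooling, the layout is disrupted: preferences unrelated to the original map.
during = [random.uniform(0.0, 360.0) for _ in range(1000)]

print("before vs after :", round(mean_map_difference(before, after), 1), "deg")
print("before vs during:", round(mean_map_difference(before, during), 1), "deg")
```

The before/after difference stays small (high correlation), while the before/during difference approaches the 90-degree chance level, mirroring the qualitative pattern reported in fig9.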

The Reverse Hierarchy Theory (RHT)

In the remaining part of this section a new way to look at the visual hierarchy (an upside-down look, actually), which differs from the classical view, will be introduced, together with its underlying theory and its implications.

The Classical View of the Visual Hierarchy

In the classical view, visual processing and perception start at the low levels (such as V1), where simple features (colour, orientation etc.) are represented, and move up to higher levels as the representation becomes more complex (shape, 3-D etc.). The logic of this view can be exemplified with a discussion of the popout phenomenon. An odd element in an array of identical distracters can pop out (catch the gaze instantly and be detected) or might require a serial search among the distracters before it is identified by the viewer as different (fig12). According to the classical view, those elements which differ from the distracters in a basic feature, for which the low-level areas are selective, will pop out; accordingly this very fast phenomenon is automatic and preattentive. But if the element differs in a

12 Arrays which pop out (e.g. “A”) and which require a serial search (e.g. “C”).

13 The arrays used in popout tasks: on the left with the odd element, on the right without.

14 After the presentation of the array, a mask was displayed after a variable time to counter the afterimage effects.

15 Average performance improvement (all-positions condition) from the first (circles), second (stars), and final (crosses) sessions.



conjunction of features (say vertical “and” red, in an array of red horizontals and green verticals), a serial search will be necessary: first, the low-level information for each of the features must be transmitted to higher levels, where the information is combined. Only then, with the focus of attention, will the simultaneous presence of the relevant features be available to the viewer. To conclude, as one moves up the visual hierarchy, the representations become more complex, the selectivity of the neurons decreases, and their RFs enlarge.
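The classical distinction can be caricatured in a few lines: a feature singleton is unique on a single feature map, while a conjunction target is unique only in the joint feature space. The displays and feature names below are invented purely for illustration:

```python
def singletons(items, feature_index):
    """Items whose value along ONE feature dimension is unique in the display."""
    values = [item[feature_index] for item in items]
    return [item for item in items if values.count(item[feature_index]) == 1]

COLOUR, ORIENTATION = 0, 1

# Pop-out display: one green bar among red bars -> unique on the colour map alone.
popout = [("red", "vertical")] * 7 + [("green", "vertical")]
print(singletons(popout, COLOUR))  # found by inspecting a single feature map

# Conjunction display: a red vertical among red horizontals and green verticals.
conjunction = ([("red", "horizontal")] * 5 + [("green", "vertical")] * 5
               + [("red", "vertical")])
print(singletons(conjunction, COLOUR), singletons(conjunction, ORIENTATION))  # both empty

# Only the joint (colour, orientation) space singles the target out.
counts = {}
for item in conjunction:
    counts[item] = counts.get(item, 0) + 1
print([item for item, count in counts.items() if count == 1])
```

This is exactly the classical argument: the single-map check is a parallel, preattentive operation, while the joint check requires combining maps, i.e. focused attention.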

RHT - Task Difficulty and the Specificity of Perceptual Learning

Ahissar and Hochstein (1997) discuss the relationship between the specificity of perceptual learning and task difficulty in terms of receptive field properties along the visual hierarchy, as a starting point for questioning the validity of the classical view.

Experimental Procedure

The basic experimental procedure was a feature detection paradigm based on the popout phenomenon. The subjects viewed a stimulus array consisting of oriented bars, which either included an oddly oriented bar or not, and were asked to report the absence or presence of

16 Swapped-orientation display. The orientations of the targets and distractors were now the reverse of before.

17 Comparison of learning specificity under 4 training schemes: the target could appear at any array position (all: right column) or at one of 2 positions (2 pos: left column). Target orientation deviated from that of the distractors by 90, 30 or 16 degrees (top, middle, bottom rows, respectively).


18 Position specificity of learning at 2 positions. Detection at short SOA is plotted (for trained and neighbouring positions) as a function of horizontal distance from fixation. Left: training the easy 2-pos-30-degrees condition yields improvement along the horizontal meridian, including interpolated and extrapolated positions; detection was measured during a session testing all positions following 2-position training (squares). For comparison, asymptotic detection for the group trained on all-pos-30-degrees is also shown (crosses). Right: training the difficult 2-pos-16-degrees condition leads to more spatially specific improvement.


this odd element (fig13). The brief display of this array was followed by a mask (fig14), which appeared after a variable delay (stimulus-to-mask onset asynchrony, SOA). The larger the target-distracter orientation difference and the longer the SOA, the easier the detection becomes. After asymptotic levels of learning were reached (the stage of learning after which no further performance increase is observed), the subjects were tested with a manipulated set of stimuli, to observe the transfer of the initial learning to this new condition.

Specificity of Learning - Task Difficulty

The first result, which contradicts the classical view, is the fact that even in the very easy conditions a significant amount of learning was observed. If popout were

an exclusively preattentive and automatic phenomenon, it should be devoid of any learning effects, which is clearly not the case (fig15). In the case of “orientation specificity”, the transfer was monitored with “swapped orientations” (fig16). Two of the initial training schemes were easy and two were hard (fig17), according to the distracter-target orientation difference (difficulty increased by orientation) and the number of possible locations where the target could appear (difficulty increased by positional uncertainty). After training in the easy conditions, subjects showed complete learning transfer to the swapped orientations, whereas the subjects trained under the hard schemes had to learn anew. In a similar manner, difficulty increased by orientation led to position specificity (fig18). In this case the stimulus manipulation used to measure learning

19 Comparison of relative specificity for 4 conditions and different SOAs.

20 Training with interleaved easy and difficult trials leads to initial training of easy trials and subsequent learning of the more difficult trials (see text for details).

21 Eureka effect. If subjects view a single easy condition, they show dramatic improvement for the hard tasks.



specificity was to change the location of the target element to be detected, after subjects had been trained with two possible target locations. The generalization in the easy conditions is explained by learning in higher-level areas with large receptive fields, where no fine discrimination of orientation or retinal position is possible. On the

other hand, the specificity of learning in the hard conditions suggests learning in lower-level areas, which are capable of such fine discrimination due to their characteristically small receptive fields.

22 Learning is similarly enabled when the eureka presentation uses the same stimulus as the test (squares) or swapped orientations (circles). However, showing a paper sketch of the stimulus (disks) is ineffective.

23 A: Repetition blindness. B: The very first view of an ambiguous figure produces an integrated percept of one possible interpretation, not an ambiguous collection of lines and coloured regions. C: When large letters are composed of smaller letters, the initial percept matches the global stimulus. D: Subjects more easily detect a masked word than a masked letter, even if the difference between the words to be chosen from lies only in that letter. E: Apparent motion. If the image presented alternates between the white and the yellow dots, the global percept is of either horizontal or vertical motion of all the dots, but not of intermixed motions.



Cascade of Learning

Can these two learning processes in different areas proceed independently of each other, or is there a relationship between their occurrences? If the data from the difficult cases of the “orientation specificity” experiment are reconsidered, it becomes apparent that when the different SOA trial blocks are compared, the specificity of learning increases with shorter SOAs (harder cases) (fig19). This is very similar to the increase in specificity induced by difficulty (according to orientation or position). When the improvement at each SOA block is plotted separately, an early performance increase for long SOAs becomes visible. Only after some improvement in these easier cases can performance start to improve for the shorter (harder) SOAs (fig20). This cascade of learning suggests that learning the easy cases (learning in higher areas) is a necessary condition for the start of learning in lower areas for the difficult cases. Hence, learning in higher areas guides learning in lower ones, in a reverse hierarchical manner.

Eureka effect

The prediction that follows is that there should not be any learning if the trained stimuli consist only of difficult examples (such as training only with very short SOAs, or with distracters which differ from the target only slightly), because the low-level region that should be involved in the learning remains undetermined, due to the lack of high-level guidance. But there can be learning if, before training on the difficult cases, a single easy case is trained: the high levels, activated by the initial easy-condition training, can then highlight the low-level region required for learning the difficult case. To test this prediction, subjects were presented with arrays similar to the ones described above and had to detect the presence or absence of the target, but the SOA was always kept at 50 ms, which is very short. These subjects failed to show any sign of improvement throughout the session (fig21). Another group of subjects, however, who first viewed a 30 s long display of the array, once with and once without the odd element, showed dramatic improvement during the subsequent 50 ms presentations (fig21) (the “Eureka” effect). Additionally, this learning-enabling effect, which

24 Schematic diagram of Classical Hierarchy and RHT (see text for details).

25 Contrary to the predictions of the common view (but in line with those of RHT), the odd elements on the left do not pop out, but the ones on the right do.

26 Visual illusion pop out (see text for details).



should be a high-level phenomenon according to the model, can be generalized. The subjects who had the “eureka” effect and started to learn to detect the target in the array could also learn to detect the odd element in swapped-orientation versions of the same arrays. But they had to learn nevertheless (their performance at the start of the swapped-orientation condition was low), because learning to discriminate the new target orientation requires learning in a different low-level region (fig22). The inability to learn difficult cases related to low levels in the absence of high-level guidance (such as the “eureka” effect) is compatible with this reverse hierarchical model.

View from the Top: Hierarchies and Reverse Hierarchies in the Visual System

Generalizing RHT

Further evidence suggests that this reverse hierarchy can hold not only for perceptual learning but for conscious perception in general. When an ambiguous figure is viewed, the perception is one of the possible interpretations of the figure, not something in between (fig23). Many repetitions (fig23) or major changes (change blindness) in a scene may go unnoticed, as

long as the gist remains the same. In this framework, the first route of processing, which starts from lower areas (where cells respond to simple features and have small RFs) and reaches higher areas (where complex features, categories etc. are represented), acts in an implicit fashion; only afterwards does the first explicit perception occur at these higher levels. This initial perception is a rough integration of information from the lower areas, a “gist of the scene”, based on spread attention (making use of large RFs). Only later, via feedback connections, can the details of a scene be added to perception, with focused attention, through the return to the lower areas, which are very specific (fig24). This, of course, is not very different from perceptual learning, which starts at the higher levels and is then directed to the lower levels.

More and more ‘popout’

The classical discussion of popout experiments, in terms of assigning popout to a level in the hierarchy as described above, is further challenged in the light of the differences between high-level and low-level properties. For instance, complex features like three-dimensionality or depth do pop out (fig25), and such a detection cannot be attributed to low-level areas. And in the case of visual illusions, like the crossed-lines or Müller-Lyer illusions, whether an element pops out or not depends not upon its actual size but upon its perceived size (fig26) (at low levels it is the retinal size that matters, not the perceived size of

27 A brief summary of RHT and the comparison of approaches of common view and RHT for various phenomena.



objects). Additionally, when the subjects were asked to report the exact location of the odd element, together with its presence or absence, performance dropped to serial-search levels. If the lower levels were responsible for popout, they would have the location information, due to their small specific RFs; yet this information is lacking during the task. The explanation of such results provided by RHT is that popout occurs if the odd element differs in a feature underlying a “basic category”, which can be represented only in higher-level areas; and serial search with focused attention, through the involvement of lower areas, is necessary to localize an odd element which does not pop out.

A Brief Summary

To recap and sum up: RHT assumes that the visual input reaches the low levels of the processing stream, where the simple features are represented. The products go to upper levels in a converging fashion, where the specificity of the feature information (such as location or orientation) starts to degrade. The first conscious perception and perceptual learning, up to which all processing has continued implicitly, occur in the higher areas, where detailed information is absent. Later, these areas guide attention to select the relevant low-level areas for any given situation, either to extract detailed information from a scene or to direct learning to the relevant low-level areas (fig24). If one compares the mechanisms attributed by RHT and by the common view, in terms of the sites responsible for them and their expected latency in the hierarchy, it becomes apparent what a “strange twist in the story of attention”, or even in the story of all visual processing, this theory is (fig27).

Concluding Remarks

In this section, the traditional view of visual processing as an exclusively bottom-up process, in which the visual representations become more complex as one moves upward, has been convincingly questioned, and the role and importance of feedback connections between visual areas, about which little was known until recently, has been underlined. At this point it would not be an exaggeration to conclude that the top-down processing streams are as important as the bottom-up ones, if not more so.

a&c | Consciousness

Paper: Srinivasan et al.: “Increased Synchronization of Neuromagnetic Responses during Conscious Perception”

Synchronization, a neuroscientific paradigm

The investigation of human consciousness is a difficult task for many reasons. Foremost, it is not clear how to define consciousness. Consciousness is a private “event”; that is, I can only ascribe consciousness to somebody else on the basis of his behaviour. Even if neuroscience identified the neural activity underlying conscious perception, a question would remain: what is conscious about that neural activity? If we posited a perceptive stage reading that activity, we would end up in an infinite regress of perceptive stages, so the question is not ill-founded. But, as with other issues, it might be helpful to see how far we get with a neuroscientific approach. Srinivasan, Russell, Edelman and Tononi, all from The Neuroscience Institute in San Diego, published an article in 1999 addressing the neural correlate of conscious perception. A general difficulty in investigating neural processes related to conscious states is to filter out the activity caused by physical changes of the perceived object. Srinivasan et al. used an experimental design that overcomes this hindrance: they took advantage of the phenomenon of binocular rivalry. If two competing stimuli are presented, one to each eye, the observer is conscious of only one stimulus at a time; the presented stimuli alternate in perception. So changes in neural activity are very likely to correlate with changed conscious perception, since the stimuli to both eyes remain constant. Srinivasan et al. found that conscious perception is correlated with increased synchronization of neuromagnetic responses. This supports the paradigm of synchrony as a basic feature of the neural activity producing mental states. In this summary I will give a brief overview of the experimental setup, explain the measurements used and discuss the results.

Experimental setup:

Eleven persons (nine men, two women) took part in the experiment. They were asked to look at a screen showing a vertical red grating and a horizontal blue grating simultaneously. To produce binocular rivalry, they had to wear glasses with red and green filter lenses (familiar from old 3-D cinemas), such that only

one grating was presented to each eye. The gratings flickered at different frequencies in order to identify the neural activity caused by each stimulus. Four frequencies were used: 7.41, 8.33, 9.55 and 11.12 Hz. Only the data for 7.41 and 8.33 Hz were analyzed; 9.55 and 11.12 Hz served as control frequencies in the following way: two different frequency pairs were used in each trial, with one unchanged or common frequency. Those common frequencies were the two analyzed ones. The neural activity was measured with a 148-sensor MEG system placed over the person’s scalp. The persons were asked to press a button with their left index finger while the perceived stimulus was the red vertical grating; when it changed to the blue horizontal grating, another button was to be pressed with the right index finger. For each frequency pair, each grating was presented at each frequency and to each eye, for a total of four trials. This leads to 8 trials for each common frequency. Take for example the frequency pair 7.41/9.55 Hz. First the red vertical grating is presented at 7.41 Hz to the left eye, then at 9.55 Hz to the left eye: that makes 2 trials. Then the grating is presented to the other eye, again at both frequencies: that makes 4 trials. Now the control frequency is changed and the procedure is repeated with, say, 7.41/11.12 Hz: that makes a total of 8 trials for the common frequency 7.41 Hz.
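The trial bookkeeping described above can be sketched as a short enumeration (a toy illustration, not code from the study; the frequency values are those given in the text):

```python
from itertools import product

# Common (analyzed) and control frequencies in Hz, as given in the text.
COMMON = [7.41, 8.33]
CONTROL = [9.55, 11.12]

def trials_for_common(common):
    """List all trials that share one common frequency.

    For each control frequency, the grating is shown to each eye,
    once flickering at the common frequency and once at the control frequency.
    """
    trials = []
    for control, eye, role in product(CONTROL, ("left", "right"), ("common", "control")):
        freq = common if role == "common" else control
        trials.append({"pair": (common, control), "eye": eye, "frequency": freq})
    return trials

for c in COMMON:
    print(c, len(trials_for_common(c)))  # 2 controls x 2 eyes x 2 roles = 8 trials
```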

Measurements:

MEG measures the magnetic field at the scalp produced by electrical currents in the cortex.

Power analysis:

To show cortical activity, the power spectrum is plotted, that is, the power of the magnetic field at different oscillatory frequencies, averaged over all 148 sensors. Differences in the average power at a certain frequency indicate activity differences at that frequency.
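Such a sensor-averaged power spectrum can be sketched in a few lines (a minimal FFT periodogram over an assumed `(sensors, samples)` data array, not the study’s pipeline; real MEG analyses add windowing and epoch averaging):

```python
import numpy as np

def average_power_spectrum(data, fs):
    """Power of each frequency component, averaged over sensors.

    data: array of shape (n_sensors, n_samples), one time series per sensor
    fs:   sampling rate in Hz
    Returns (frequencies, mean power across sensors).
    """
    n = data.shape[1]
    spectrum = np.fft.rfft(data, axis=1)   # per-sensor frequency content
    power = (np.abs(spectrum) ** 2) / n    # simple periodogram
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return freqs, power.mean(axis=0)       # average over sensors

# Toy check: sensors sharing a 7.41 Hz oscillation produce a spectral peak near 7.41 Hz.
fs = 500.0
t = np.arange(0, 4, 1 / fs)
rng = np.random.default_rng(1)
data = np.vstack([np.sin(2 * np.pi * 7.41 * t) + 0.1 * rng.standard_normal(t.size)
                  for _ in range(8)])
freqs, mean_power = average_power_spectrum(data, fs)
print(freqs[np.argmax(mean_power)])
```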

Coherence analysis:

To analyze coherence between local and distant cortical areas at the common frequency, all possible pairs of sensors were taken into account. Coherence measures the correlation between the signals from different sensors, that is, whether their activity at a certain frequency is in phase or not. So if the coherence

CHAPTER 08: Consciousness

Introduction

In this chapter we present three papers addressing the problem of consciousness. While Srinivasan et al. and Logothetis et al. give an account of the neural correlate of consciousness, O’Regan and Noë present a fundamentally different approach to the issue.


between two signals is increased, this indicates increased synchrony between the corresponding cortical areas.
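The coherence measure itself can be illustrated with synthetic signals (a toy sketch using SciPy’s magnitude-squared coherence estimate; the signals and parameters are invented, not the study’s data):

```python
import numpy as np
from scipy.signal import coherence

# Two simulated "sensors" sharing a 7.41 Hz component with a fixed phase lag
# are coherent at that frequency, despite independent noise at all others.
fs = 500.0
t = np.arange(0, 8, 1 / fs)
rng = np.random.default_rng(0)
x = np.sin(2 * np.pi * 7.41 * t) + 0.5 * rng.standard_normal(t.size)
y = np.sin(2 * np.pi * 7.41 * t + 0.8) + 0.5 * rng.standard_normal(t.size)

f, cxy = coherence(x, y, fs=fs, nperseg=1024)  # magnitude-squared coherence
peak_bin = np.argmin(np.abs(f - 7.41))
print(cxy[peak_bin])   # high at the shared frequency
print(np.median(cxy))  # low at non-stimulus frequencies
```

Note that the stable phase relation at the tagged frequency, not the raw amplitude, is what drives the coherence value; this matches the text’s reading of coherence as a measure of synchrony.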

Results:

Power analysis:

Srinivasan et al. confirmed the findings of previous studies that the amplitude of the neural response at the stimulus frequency is strongly modulated by perceptual dominance. The largest increase of power during perceptual dominance at the common frequency was found over occipital areas. As MEG is primarily sensitive to synchronous synaptic activity, this result indicates an increase in local synchronization. Although the main increase of power was found in occipital areas, some more anterior areas, which are not part of the visual system, also showed increased power at the stimulus frequency under perceptual dominance.

Coherence analysis:

Coherence between distant cortical areas can result from different neural mechanisms: increased synchronization of the input to different cortical regions may be responsible for increased coherence between these regions; on the other hand, the architecture of intercortical connections, especially recurrent connections, may cause the effect. Srinivasan et al. observed high coherence both between nearby and between distant pairs of sensors. As all sensors pick up signals from regions covered by nearby sensors, nearby coherence can be pseudo-coherence. To obtain genuine coherence, only sensor pairs more than 12 cm apart were considered. Generally, during perceptual dominance, increased coherence was observed between the distant sensors taken into account. In contrast, at non-stimulus frequencies the switch from dominance to non-dominance of a stimulus had no effect. This supports the idea of increased synchrony for consciously perceived stimuli, because non-stimulus frequencies are not part of the input. High coherence was observed between sensors over opposite hemispheres as well as between anterior and posterior regions within one hemisphere. Synchronization between distant cortical regions differed considerably between the tested persons.

Conclusion:

Srinivasan et al. could verify their thesis that conscious perception is accompanied by increased synchronous activity within and between distant cortical regions. As the binding of stimulus features is an important aspect of conscious perception, these findings also support the resolution of the binding problem by synchronization.

I can recommend working through this paper. It combines many methods of contemporary neuroscience and can thus help to form a useful methodological background for your studies.

Paper: Leopold, D.A. & Logothetis, N.K.: “Multistable phenomena: changing views in perception”

Further hints about how consciousness might be elicited by the brain are provided by Leopold & Logothetis’ (L&L) paper about the nature of multistable phenomena. As we have seen in the chapter about binocular rivalry, multistability has strong implications for how the brain might mediate between percepts, as well as for the location of the areas involved. It has been a useful paradigm for studying the functions of perceptual organization because it addresses the puzzling fact that although there might be several possibilities to interpret an image (multistable pictures, see figure 01), or a different picture is presented to each eye (binocular rivalry), we “see” only one at a time. The authors suggest that active perception is strongly influenced by central integration areas, which act upon representations of the sensory stimuli as well as combining input for sensorimotor coordination. In this sense they view the “flipping” of percepts not as some passive, automatic process but rather as similar to certain forms of attention. Leopold and Logothetis imply that if there indeed exists a higher-level influence on general sensory processing, this would have consequences not only for our theories about perceptual organization but also for the much harder problem of consciousness. Let’s take a look at what L&L cite as evidence to back up their theory. In a first approach, L&L did a series of experiments measuring neural activity in various visual areas (such as V1, V2, V4, MT, MST, IT, STS) of the monkey brain during a binocular rivalry paradigm.

A preliminary assumption made by the authors concerned the location where the competition between the two inputs takes place. Unlike the “monocular blockade” theory, which locates the elimination of the non-dominant input before or in V1, L&L rather consider rivalry to be closely related to forms of multistable perception in which different stimulus representations compete. A first result showed no evidence for the elimination of the non-dominant stimulus representation (large numbers of both monocular and binocular neurons continued firing for the suppressed stimulus). On the other hand, the change of the perceived stimulus also resulted in changes of activity for many neurons in these areas. L&L attributed most of these changes to the transition process (upon the change of the perceived stimulus), because longer-lasting effects of the change could only be observed in IT. L&L concluded that this might reflect the early (reaction to a signal announcing the reorganization) and late (result of the reorganization) stages of the perceptual change. The authors also cite the theory that synchrony in firing behaviour might be the means behind such a change, and again refer to the possibility that top-down processes could influence the coherent firing of neurons.



02 Multistable perception in the monkey brain: experimental setup

03 Modulation of firing behaviour according to perceptual changes. Most neurons in V1/V2 are independent of perceptual changes; the percentage of percept-related cells increases at higher levels of the visual processing hierarchy.



Several fMRI studies with human subjects are cited by L&L as evidence for the involvement of structures outside the visual cortex. Most strikingly, they conclude that areas in the fronto-parietal cortex are activated exclusively under multistable viewing conditions (and not during passive viewing). Another assumption made by the authors considered perceptual reorganization as the result of some special form of behaviour. First they claim that perceptual changes have random characteristics and thus are similar to exploratory behaviour (e.g. eye fixations in scene exploration), as long as they are not influenced by either the input features or voluntary control. Such voluntary processes make up important evidence for their view. The intention of the viewer can be a major factor in multistable perception, in terms of both the “flipping” rate and even the choice of what is perceived. Furthermore, it is possible to practise control over perceptual changes and even to learn it when necessary (have a look at figure 01 and try it for yourself). Other factors include intelligence, personality and mood, as well as pharmacological substances that influence cognitive abilities. Finally, L&L describe how the involvement of brain areas mediating voluntary control has been investigated in a study of patients with brain damage. Patients with damage to the right frontal areas (involved in the control of behaviour) had difficulties changing the percept, whereas patients with damage restricted to visual areas showed no differences in their ability to perceive changes (e.g. reversals in multistable images).

L&L continue by comparing the influence of top-down processes on multistable perception to that exhibited by certain mechanisms of attention (see chapter 05). We have seen how voluntary control plays a role in both (though on a larger scale for attention). Top-down influence in multistability is similar to that in attention; however, it does not result in enhanced processing of a certain part of the stimulus but rather in a change of the entire percept. Although L&L admit that the relationship between multistable perception and attention needs to be more accurately defined, it gives an idea of how the activity of sensory neurons in the visual cortex can be interpreted.

L&L close their argumentation by summing up the properties common to all forms of multistability. Some can easily be reproduced with the help of our examples (thanks to Marlon Brando). Uniqueness, for instance, is the obvious fact that only a single percept can exist, both for ambiguous input (multistable images) and for conflicting input (rivalry). Percepts are constantly influenced by processes also found in attention, as well as by unconscious processes initiated in higher brain areas that actively alter what is perceived; this opposes the claim that perceptual changes are automatic, passive processes. A third characteristic is the randomness we have discussed above. L&L suggest that the random nature of perceptual change

provides a mechanism to compensate for any bias due to the anatomical and functional properties of the visual pathway.

Given the evidence and theories by L&L, what can we derive for the problem of (neural correlates of) consciousness? Is there indeed an ongoing influence of sensorimotor integration areas upon the sensory processing in the visual system and only in cases of ambiguous stimuli can we become aware of it? Could examples like assigning something meaningful to a noisy stimulus (where a direct interpretation is impossible) provide strong hints for this? Accepting such a theory could enable us to analyze the respective neural assemblies and their firing behaviour in an attempt to explain perceptual organization but also to identify possible mechanisms of consciousness.


An alternative approach

In our list of theories concerning the locus of consciousness, we still have a strikingly different approach to consider. Compared to similar theories that incorporate behaviour into perception, O’Regan & Noë (O&N) present a more elaborate framework for analyzing the nature of vision and visual consciousness.

They claim that the experience of seeing is not generated by activity in internal representations but rather is a way of acting. Indeed, they suggest that if their theories are found plausible, consciousness itself cannot be located in the brain, so the whole concept of the NCC must be reconsidered. Let’s try to dig deeper into their theoretical basis for these rather strong claims.

We have seen a few theories dealing with visual consciousness in the preceding chapters about perceptual organization. In the eyes of O&N, however, most theories do not resolve the hard problem. For instance: what is it about synchronized firing between cell assemblies (remember chapter 04) that might create consciousness? What would make potential quantum processes in the neurons’ microtubules (Hameroff - Quantum coherence in microtubules: A neural basis for emergent consciousness?), cortico-cortical connections (Edelman - The remembered present) or some sort of commentary system (Weiskrantz - Consciousness lost

and found: A neuropsychological exploration) able to generate consciousness? Another problem is the distinctiveness of the sensed quality of stimuli from the different modalities. Why exactly is seeing the colour red so very different from hearing a sound or smelling a smell? The classic notion of Müller’s “specific nerve energies”, in one variant or another, is generally enough for most scientists to explain the problem. O&N, however, doubt even its modern form by questioning how particular sensory pathways, cortical areas, cell types, or whatever is specific to the respective sensory processing, could give rise to experience.

Vision as an exploratory activity

In the framework presented by O&N, vision is assumed to be an exploratory activity rather than being based on internal representations. This exploration of the environment is performed in a lawful way with the help of knowledge of what they call sensorimotor contingencies, of which the authors describe two types. Visual-modality-related sensorimotor contingencies relate to sensation (how the senses are affected by stimuli). Visual-attribute-related sensorimotor contingencies, on the other hand, relate to perception (the result of the categorization of objects and events). This distinction between sensation and perception is commonly found in psychology.

04 J. Kevin O’Regan

05 Alva Noë



06 The case of the villainous monster

07 The situation faced by our brain

08 Sensorimotor contingencies induced by the sensory apparatus. Looking at a line does not produce a “straight” image: moving the eye changes the geometry of the image, the thickness, and the cone contribution. In essence, the images of parallel lines have little in common; what they share is the structure of the rules. The sensorimotor contingencies within different sensory domains are subject to different (in)variance properties (e.g. the effect of head movements on audition, smell and vision).



1. Visual-modality-related sensorimotor contingencies

Imagine yourself operating a remote-controlled submarine diving to the wreck of the Titanic. Now what if a big bad sea-monster of some sort screwed up all the connections between the cameras, the sonar equipment, the robot arms and the sensors? All the wiring has been mixed up, none of the many screens and blinking lights makes sense anymore, and none of the original functions work. What would you have to do in order to regain control over the machine? Just observe the structure of the changes that occur when you try out the various levers and buttons: how does the movement change, which lights correspond to the information from which sensors, and so on. Our brain is faced with a similar situation in terms of the nervous input arriving from the different modalities, or from any other neurons. The different kinds of sensory input can only be distinguished by the structure of the rules that govern how they change with motor actions. This is what O&N call sensorimotor contingencies. They assign different sensorimotor contingencies to the different modalities precisely because of their differing structures of rules.

In vision, most laws of sensorimotor contingencies are derived from the changes of the retinal signal when rotating the eyes, from the irregular distribution of photoreceptors on the retina, and from the varying cortical projections that result from head movement. A perfectly straight horizontal line, for example, would then be characterized by the retinal changes that occur when you look at a point above the line, as well as by the fact that there is no change when you just move along the line (in case it is infinitely long) (see figure 08). On the other hand, the modality-related sensorimotor contingencies of the auditory domain exhibit different structures in terms of the changes upon, e.g., movement of the head. Hence, the specific laws of sensorimotor contingencies of the visual domain are established because exploration is performed with the visual apparatus.

2. Visual - attribute related sensorimotor contingencies

Objects in the three-dimensional visual field possess properties such as size, shape, texture and colour as well as position. Visual exploration of objects thus determines how these properties change upon movement. The shape, the reflection of light and the colour of an object may change as we move around it, depending on our point of view. This is not the case for tactile exploration, because such movements would not result in changes of what is perceived. Consequently, tactile sensorimotor contingencies possess a different structure of rules. The quality of shape, for instance, is described by O&N as the set of all potential changes that the shape exhibits when we move around it or move it relative to us. The resulting abstraction of laws from a subset of changes is a very important mechanism in their theory. One particular example from a classic study shows how natural such changes are for us: patients who were born blind but whose vision was restored later in life express surprise about the changes of objects, e.g. the rotation of a coin. Another record reported:

09 Subjects tend to remember having seen a greater expanse of a scene than was shown in a photograph. For example, when drawing the close-up view in Panel A from memory, the subject's drawing (Panel C) contained extended boundaries. Another subject, shown a more wide-angle view of the same scene (Panel B), also drew the scene with extended boundaries (Panel D).

Page 95: Action and Cognition

a&c | Consciousness

a&c | 95

“Being shown his father’s picture in a locket at his mother’s watch, and told what it was, he acknowledged a likeness, but was vastly surprised; asking, how it could be, that a large face could be expressed in so little room, saying, it should have seemed as impossible to him, as to put a bushel of any thing into a pint.”

Mastery

O&N require not only that vision satisfy these conditions of exploration according to the two forms of sensorimotor contingencies, but also that the brain be actively exercising its mastery of these rules, in a sense being tuned to them. Thus, to be a visual perceiver is to be able to exercise mastery of the vision-related rules of sensorimotor contingencies.

Integration

The two forms of sensorimotor contingencies related to the visual domain constitute a first step towards explaining the qualitative nature of vision. A third very important aspect is the integration of these laws into thought and action guidance. O&N claim that we only truly "see" something once we are not only exercising the mastery of the respective sensorimotor contingencies in the object's environment but also including this exercise in our planning, reasoning and speech behaviour. In other words, this is where attention (see: chapter 05) comes into play. The authors also indicate that this is a matter of degree, because only a restricted number of aspects of a scene can influence our reasoning and action guidance. When driving a car, for instance, you exercise the appropriate sensorimotor contingencies for many aspects of the scene without being fully aware of it.

Got Qualia?

Still, there seems to be something missing in what O&N present as their framework. Referring to what philosophers call the "qualia" of vision, how can the phenomenal experience be explained? What constitutes the subjective character of visual experience? O&N spend a great deal of time discussing the philosophical implications of their approach. To avoid extending the scope of this lecture too much, it should suffice to mention how O&N try to avoid the problems most theories of vision encounter when dealing with the phenomenal character of visual experience. They basically claim that if qualia are described as intrinsic properties of experiential states, they simply do not exist in the current framework, because visual experience is a way of acting and not a state or event. However, O&N caution us not to think that they deny the qualitative character of visual experience. If visual experience is something we do, then its qualitative features are simply aspects of this activity.

Further details of the theoretical framework

Now that we have seen why vision requires knowledge of sensorimotor contingencies, some concepts need to be elaborated in more detail. This knowledge, for instance, is created by abstracting particular lawful relations between actions and the resulting sensory changes. The laws of sensorimotor contingencies are thus a practical rather than a propositional form of knowledge. Furthermore, among all the sensorimotor contingencies we acquire throughout our life, only a subset can be exercised at the present moment, and it is this subset that characterizes the object currently perceived.

The world as an outside memory

Another important related idea, developed by the first author, is the denial of an exact internal representation of the world. It leads to the concept of the world as an outside memory, which can be inspected with the methods described above. Specifically, it is stated that the world acts as its own external representation. The relation between this external memory and ordinary memory lies in the attention you cast on specific aspects. Just as you cannot think about everything you remember about a scene at once, but can nevertheless relocate your awareness continually, not everything is currently perceived when you look at a visual scene. Through shifts of attention, however, information about the specific sensorimotor contingencies of the currently attended aspects becomes available, and this information can influence your thought and behaviour. In the strict framework of O&N, only at this stage are the parts of a visual scene actually being seen.

How come we usually have the impression of seeing everything if we can in fact only speak of seeing partial aspects of a scene? It is the result of being able to process every detail of a scene voluntarily: if we tried to test whether we can see everything in a scene, we would shift our attention to each detail and thus consciously see it. We get the impression of seeing everything in a scene because everything we check on, we see.

O&N also emphasize that our feeling of the vividness of a scene stems from our sensitivity to visual transients. A visual transient denotes any change in a scene and appears to trigger some kind of "alerting" process that shifts attention to the location of the change. The concept of the world as an outside memory readily supports this feeling of the presence of a scene in vision, where all changes are immediately visible to us. Now imagine not being able to fulfil the criteria vision requires under the current framework. Many experimental paradigms used some form of tachistoscopic stimulus presentation in which stimuli appear for very short periods of time (around 150 ms) to prevent eye saccades. O&N conclude from various findings that even though some information about the stimulus seems to be available for recognizing (highly familiar) categories, it probably cannot be said that the subjects were really "seeing" the stimuli. The authors describe an experiment in which the subjects' eye movements were measured by a computer. The subjects had to learn to distinguish three unknown symbols resembling Chinese characters by fixating the symbols in the central visual field. Upon any attempt to make an eye saccade, the symbol would disappear, which was extremely irritating for the subjects. Hundreds of trials were necessary until the subjects could distinguish the symbols; however, once the symbols were displayed as little as half a degree away from the learnt (central) position, the subjects failed to recognize them.

Consider the fact that you would speak of seeing a partially occluded object. Applying your knowledge of the sensorimotor contingencies of what would change when you move your head, you get the impression of seeing the object (or at least almost seeing it). Results from experiments indicate that we generally assume to have seen more than we actually did (as figure 09 shows). The theory of the world as an outside memory thus provides further groundwork to explain many phenomena in the visual domain with the help of sensorimotor contingencies. The most interesting and difficult-to-explain experimental discoveries are outlined by O&N in a section of their paper that deals exclusively with empirical data intended to show how powerful their framework can be.

Empirical data

O&N stress that the experimental data they provide should not be considered as tests of their theory in the manner in which new theories are usually tested. Their framework should rather be viewed as an attempt to unite several lines of research, including the new work on "change blindness", so that the results can be seen from a single point of view.

An extra-retinal signal?

Classic theories claim that an internal representation of a visual scene is created by continually inserting snapshots of parts of the scene. The exact location at which to insert the consecutive parts is thought to be determined by some sort of extraretinal signal. O&N refute this idea and suggest that even if something like an extraretinal signal exists, it is very inaccurate (as indicated by experimental findings showing that the position and time of an eye movement are estimated correctly only about 1 second after the eye reaches its final position).

Trans-saccadic fusion, saccadic suppression, the blind spot

Many research topics are non-problems in the current framework. The question of how the snapshots created by each saccadic shift are fused to form an internal copy of a visual scene simply does not arise under the assumption that the world acts as an external memory. Trans-saccadic fusion is not required because eye movements are precisely one of the methods of scanning this outside memory, which is part of what "seeing" is.

A similar argument can be applied to the issue of saccadic suppression, which deals with the potential smearing of the visual field during an eye saccade. Again, O&N claim that since eye movements are part of seeing, it is not necessary to postulate a suppression mechanism for saccadic smear, since the smear is part of what it is to see. In other words, if the brain did not receive signals of smear from the retina, it would assume that the visual scene is hallucinated or dreamt. Similarly, there is no need to postulate some sort of "filling-in" mechanism for visual input falling on the blind spot of the retina. On the contrary, in the authors' framework the blind spot is part of what it is to see and is included in the sensorimotor contingencies of the visual apparatus.

Colour perception, inverting the visible world

Just as there is no need to compensate for the blind spot, the uneven distribution of cones between the foveal and peripheral regions of the retina does not give us the impression that our colour perception is somehow non-uniform. Although these facts can be demonstrated to us, they are nevertheless non-problems in the current framework. Concerning the perception of colour, O&N apply the notion of visual experience as a structure of changes to the example of a red colour patch. Depending on how you look at the patch, the visual input arriving via the retina will have varying properties due to e.g. the differences in the distribution of rods and cones. In other words, perceiving red is knowing the relevant sensorimotor contingencies (the changes that red causes). Even if we assume the observer to be a dichromat, the structure of changes upon moving relative to a surface contains several further cues to disambiguate its colour. For instance, the spectrum of the reflected light changes when the surface is moved so as to reflect more yellow sunlight than blue shades from the sky. (Have a little look at Box 8.1 to refresh your knowledge about the retina.) What if we exploit the idea that seeing red is determined by the structure of changes in a thought experiment? Imagine an experimental setup consisting of a computer that changes the colour of a displayed patch from red to green once we move our eyes away from the central fixation point. Shouldn't we get the impression, after some training, that green patches in peripheral vision and red patches in central vision are the same colour? More findings that can be explained in O&N's framework come from experiments in which subjects wear goggles that invert the visual signals falling onto the retina so that the world appears upside-down or left-right inverted. Most subjects were able to adapt within a range of days, and continuous wearing of the apparatus made their visual world "normal" again. Given that vision in the current framework is knowledge of sensorimotor contingencies, adaptation naturally proceeds by the subjects forming new lawful relations of changes.

BOX 8.1 The non-uniform distribution of receptor cells on the retina

The famous "experiment" you might remember from your Sensory Physiology lecture reveals how colour vision drastically diminishes in the periphery of your visual field. The uneven distribution of cones across the retina, with particularly few outside the fovea, can be shown by this little test. Grab a randomly coloured pen with your eyes closed. Now hold the pen to the left or right of your head at eye level and slowly move it horizontally towards the centre of your visual field while constantly staring straight ahead. Remember not to move your eyes, but try to find out when exactly you can distinguish the colour of the pen. You will find that the pen only appears to have a colour from a certain point on, since the reduced number of cones in the periphery of the retina results in less accurate colour vision in your peripheral visual field.

10 Experimental data show that integration across saccades is very limited. This is consistent with the theory presented here, in which the problem of visual stability is a non-problem because the world serves as an external memory. Along similar lines, saccadic suppression is a non-problem.

The change-blindness phenomenon

The idea of the world as an outside memory led O'Regan and collaborators to a series of experiments dealing with the "change blindness" phenomenon. As we have seen in the paragraph about visual transients (chapter 05), changes in a visual scene are immediately detected and attention is shifted to their location. But what if this transient were prevented from being detected? This can be done by a global flicker or a film cut at the time of the change. In most cases, people were plainly unable to see any change in the visual scene, even when the changes occurred in very large objects and in full view. You can try for yourself (there are sample videos on this CD) and verify that the changes in one of the pictures are often very difficult to locate when you don't know that there are changes. An intriguing example of the consequences such change blindness can have is an experiment in which experienced professional pilots had to land a plane in a simulator under poor visibility conditions (figure 11). The pilots used a head-up display, which projected flight guidance and control data onto the windshield. Although the runway was continually visible during the landing task, two of eight pilots didn't notice the large jet plane that was introduced in some trials as an unexpected change in the visual field. These pilots simply landed their plane through the obstacle and were vastly surprised when they were shown a replay of their performance.

The results of a series of experiments involving a target-search paradigm are once more related to the idea of the world as an outside memory. A visual display consisting of a target symbol among a number of distracter symbols was used to investigate the search performance of subjects. The display was continually visible and unchanged during the trials; only the target symbol was changed. If we assumed that this were enough time to build an internal representation of the display, we would predict the search rate to improve with each new trial. However, the results show no such improvement. This can easily be explained with the assumptions inherent in O&N's framework: why should there be a representation of the world when the world can serve as its own memory, to be scanned and accessed? There is no need to postulate some kind of "inattentional amnesia" for these findings. Further examples of well-known visual phenomena concern the claim that we do not see everything that can be seen. Take a look at the triangle and the example of figure-ground competition (figure 13) to test this for yourself. Surely most of you have experienced looking at something but not really seeing it (e.g. the brake lights of the car ahead of you, without pressing the brake).
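The flicker procedure described above can be sketched as a simple frame schedule. This is a purely illustrative sketch; the function name and the timings (240 ms scene, 80 ms blank) are our own assumptions, not values from the original studies:

```python
def flicker_schedule(cycles, scene_ms=240, blank_ms=80):
    """Return a list of (frame_name, duration_ms) tuples for one run.

    Scene A and the modified scene B alternate, but a blank mask is
    inserted at every transition. The mask swamps the local visual
    transient that would otherwise draw attention to the change.
    """
    frames = []
    for _ in range(cycles):
        frames.append(("A", scene_ms))       # original scene
        frames.append(("blank", blank_ms))   # mask hides the transient
        frames.append(("B", scene_ms))       # scene with one changed object
        frames.append(("blank", blank_ms))
    return frames

schedule = flicker_schedule(cycles=2)
print(schedule[:4])
```

Without the blanks, the A-to-B transition itself would produce a transient at the changed location and the change would pop out immediately; with them, observers typically need many cycles to find it.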

Sensory substitution

The theoretical groundwork of defining visual experience as derived from the changes upon movement leads O&N to strong claims about sensory substitution. They suggest that visual experience should be possible through other modalities. One primary example is echolocation devices that produce auditory signals depending on the direction, distance, size and surface texture of nearby objects. One such device described by O&N consists of a single photoreceptor attached to the blind subject's index finger. The subject was thus able to derive new rules of sensorimotor contingencies from the auditory signals that were transmitted upon the various ways of moving relative to the light sources in the environment. Interestingly, subjects quickly stated that they no longer heard the sounds but rather "sensed" the objects in their environment. Similar statements were also obtained from subjects using a TVSS device (tactile visual sensory substitution; shown in figure 14). For TVSS, a blind subject wears an array of vibratory stimulators that receives information about the luminance distribution from a camera attached to the spectacle frames. In the case of stationary stimuli, such as an object placed in front of the static camera, the subjects were generally unable to identify what was in front of them. However, once the subjects could manipulate the camera's position and move around the object, identification became possible. Note that it didn't matter onto which part of the body the array was attached, and after some practice the array could even be reattached to a different body part (e.g. the forehead). Moreover, at some stage of training the subjects started to locate objects in space and not on their skin, although they could still feel the tactile stimulation. O&N mention one case where the experimenter manipulated the zoom function of the camera. This mean trick led to the rapid zooming-in of an object and a vastly surprised subject, who threw back his head and raised his arms to avoid the seemingly approaching object.

11 The visual scene as presented to the pilots. Note the overlaid HUD display and the obstacle on the runway.

12 No, sir! That's not: "The illusion of seeing"
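The core of the TVSS mapping, camera luminance to a coarse grid of vibrator intensities, can be sketched in a few lines. The 20x20 array size, the block-averaging scheme and the function name are our own illustrative assumptions, not a description of the actual device electronics:

```python
import numpy as np

def luminance_to_tactile(image, rows=20, cols=20):
    """Average a grayscale camera image over coarse blocks, one block
    per vibrator, and normalize to drive intensities in [0, 1]."""
    h, w = image.shape
    bh, bw = h // rows, w // cols
    # crop so the image tiles evenly, then block-average
    blocks = image[:bh * rows, :bw * cols].reshape(rows, bh, cols, bw)
    drive = blocks.mean(axis=(1, 3))
    return drive / 255.0

camera_frame = np.zeros((200, 200))      # dark scene...
camera_frame[80:120, 80:120] = 255.0     # ...with one bright object
drive = luminance_to_tactile(camera_frame)
print(drive.shape)
```

Note that such a static mapping alone is not "seeing" in O&N's framework; it is the subject's active movement of the camera, and the lawful changes of the tactile pattern that result, that establish the new sensorimotor contingencies.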

A somewhat more natural form of sensory substitution is the so-called "facial vision" of the blind. Blind people often report the impression of a slight touch on their face when approaching objects (especially large objects of 30-80 cm). In an attempt to refute theories about some form of extrasensory perception, one scientist plugged the ears of his blind subjects and showed that audition was involved in the facial sense. Specifically, the intensity, direction and frequency shifts of reflected sound were found to play a major role. Nevertheless, it would be interesting to investigate the relevant sensorimotor contingencies of the auditory modality and why they potentially have a similar structure to the tactile sensorimotor contingencies of a slight touch on the face. Finally, there are many more interesting claims to investigate in the current framework. Based on the sensory-substitution findings, could it be possible to become tuned to a sense of orientation (relative to the earth's axis)? What about constructing a device that maps an ultrasonic pulse-echo sense (like the bat's) onto the skin? The theory of visual experience in the current framework is surely tempting to accept as an elegant way to answer many intuitive and experimental questions, but are its parts really that plausible? Is it really as complete and elaborated as O'Regan & Noë intend to present it? The paper (available on Stud.IP) and the seminar will bring more about the sensorimotor account of vision and visual consciousness.

14 A blind subject with a tactile visual substitution system (TVSS). A TV camera (mounted on spectacle frames) sends signals through electronic circuitry (displayed in the right hand) to an array of small vibrators (left hand) which is strapped against the subject's skin. The pattern of tactile stimulation corresponds roughly to a greatly enlarged visual image.

13 Figure-ground competition.



Page 101: Action and Cognition

a&c | Spinal Motor Systems

a&c | 101

CHAPTER 09: Spinal Motor Systems

Introduction

The spinal cord is clearly one of the most underestimated parts of the human body. Many people even tend to assume that the spinal cord is part of the peripheral nervous system. In their eyes this organ is only a slave to the brain, providing input/output pathways and some reflex circuits. In this chapter we will learn the truth about the spinal cord. We will learn about its anatomy and function as well as about spinal reflexes. Afterwards, the concept of "force fields" will be introduced, granting the spinal cord an extra share of intelligence.

The Spinal Cord

The Spine

At first, let's have a look at the spine. The spine is the bony shell around the spinal cord and the most crucial support of the body. The vertebrae are the stacked modules the spine is composed of; there are 33 of them. The names of the vertebral regions are important because they are a common landmark for localizing structures in the spinal area. From top to bottom:

- 7 cervical vertebrae: These vertebrae are designed to allow all types of movements, such as flexion or extension. This is made possible by their small size.
- 12 thoracic vertebrae: These are the vertebrae your ribs are attached to. This region has a relatively narrow spinal canal, as there are no limbs nearby.
- 5 lumbar vertebrae: The size of the vertebrae increases from cervical to lumbar level, because the body weight has to be supported. As a result, the lumbar vertebrae are the largest. Below L1 there is no spinal cord, but only the cauda equina. (This term will be explained later on.)
- 5 fused sacral vertebrae
- 4 fused coccygeal vertebrae

01 The human spine and the vertebral regions.

02 The meninges are a system of membranes that contain the brain and the spinal cord. They consist of three layers: the dura mater, the arachnoid layer and the pia mater.

The spinal cord itself

The spinal cord and the brain together form the central nervous system (CNS). The human spinal cord extends from the occipital bone at the base of the skull to the first lumbar vertebra (L1), spanning a length of about 45 cm. Running through the spinal canal of the spine, it is surrounded by a system of membranes called meninges. This three-layer system encapsulates the whole CNS; its structure is shown in figure 02. If you were asked to give a rough outline of spinal cord function, the following two sentences would go a long way: "It receives sensory input from the skin, joints, and muscles of the trunk and limbs. Furthermore it contains the motor neurons responsible for both voluntary and involuntary movement." Of course you have to know what you're talking about, so we will take a closer look:

A longitudinal view

Although the spinal cord is a continuous structure, there is a repeating pattern of enlargements along its length. These enlargements occur at the junctions between vertebrae, where the spinal nerves emerge. The size of these enlargements varies throughout the body, depending on whether the emerging motor nerves innervate the limbs or the trunk. The spinal nerves consist of axons connecting the CNS to the peripheral nervous system (PNS): afferent and efferent axons project into and out of the spinal cord through the gaps between the vertebrae. As a matter of fact, each spinal nerve is named after the vertebra above (in the thoracic and lumbar region) or below (in the cervical region) its location. There are 8 cervical, 12 thoracic, 5 lumbar, 5 sacral and 1 coccygeal spinal nerves: 8 + 12 + 5 + 5 + 1 = 31. So there are 31 spinal nerves. As we just read, the spinal cord only descends to L1, which does not mean that the spinal canal below this point is empty. Its content below this point is cerebrospinal fluid (CSF) and nerve roots. Due to the fibrous consistency of these nerve roots, the part below L1 is also known as the cauda equina (horse's tail... because it looks like the hair of a horse's tail; see figure 03).
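The bookkeeping above can be tallied in one place. A trivial sketch (the dictionary names are ours; the counts are those given in the text):

```python
# Spinal nerves per region, as listed in the text.
spinal_nerves = {
    "cervical": 8,    # one more than the 7 cervical vertebrae
    "thoracic": 12,
    "lumbar": 5,
    "sacral": 5,
    "coccygeal": 1,
}

# Vertebrae per region, as listed in the section on the spine.
vertebrae = {
    "cervical": 7,
    "thoracic": 12,
    "lumbar": 5,
    "sacral": 5,      # fused
    "coccygeal": 4,   # fused
}

print(sum(spinal_nerves.values()))  # 31 spinal nerves
print(sum(vertebrae.values()))      # 33 vertebrae
```

The asymmetry is worth remembering for exams: there are 8 cervical nerves but only 7 cervical vertebrae, which is why cervical nerves are named after the vertebra below their exit point.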

A cross-section

If we take a look at a fresh cross-section of the spinal cord, the first thing that strikes us is that, unlike in the brain, the white matter of myelinated fibres surrounds the butterfly-shaped gray core composed of cell bodies. So the location of gray and white matter is reversed in the spinal cord. The 'gray butterfly' is composed of pairs of dorsal and ventral horns, equivalent to the tips of the wings of our metaphorical butterfly. Neurons situated in the dorsal horn receive sensory information from the periphery and may give rise to reflex circuits by forming synapses on local motor neurons in the spinal cord, or may project to higher areas. The ventral horn is composed of neurons (spinal/alpha motor neurons) receiving input either from higher areas such as the neocortex or brainstem, or from local dorsal horn neurons. They project onto muscle fibres in the periphery, causing contraction upon activation. These efferent nerves departing from the ventral horn form the ventral root. The spinal nerves exiting and entering the spinal cord at cervical and lumbar levels appear swollen due to the high bandwidth required for communication with the limbs. At thoracic levels and in the upper part of the lumbar region, protuberances known as lateral horns can be found. The lateral horns contain the preganglionic neurons of the autonomic nervous system. (This is only mentioned for the sake of completeness; our focus is not the autonomic nervous system.) The white matter surrounding the gray butterfly mainly consists of ascending and descending myelinated axons. These axons are organized in tracts. As these tracts conduct clearly defined information, injury to single tracts may cause clearly defined deficits.

03 The spinal canal below L1 is filled with the cauda equina.

04 A cross-section of the spinal cord.

Two divisions

There are two divisions of spinal nerves: the sensory division, composed of afferent axon fibres, and the motor division, composed of efferent axon fibres. The sensory division enters the spinal cord through the dorsal root, while the motor division exits via the ventral root.

Afferent:

Cutaneous receptors (touch), nociceptors (pain) and an ensemble of 'deep' receptors located in the skeletal muscles (proprioception) project from their locations in the periphery into the gray matter of the spinal cord. The cell bodies of these pseudounipolar sensory neurons are accumulated in the dorsal root ganglion, which is located outside the spinal cord but inside the spinal canal. (A ganglion is an accumulation of cell bodies.) Due to differences in myelination and diameter, there are variations in conduction speed among the afferent projections. All sensory information from neurons in the periphery has to enter the spinal cord at a specific dorsal root. This bundling into discrete portions gives rise to the notion of "dermatomes". (See: Box 9.1)

BOX 9.1 Dermatomes

A dermatome is a portion of skin that becomes anesthetized by the destruction of the corresponding pair of dorsal roots (see figure 05). As there are 31 pairs of dorsal root ganglia, there are 31 dermatomes. Figure 05 shows the approximate boundaries of the dermatomes; the location of the matching dorsal root is described with respect to the neighbouring vertebrae.

05 Dermatomes

Upon entering the spinal cord, the various types of sensory neurons can be discriminated according to their connectivity patterns. The first pattern is to ascend directly to higher areas, without synapsing and without crossing to the contralateral side (e.g. neurons transmitting pressure information). Second and third, some of these neurons form synapses as they enter the gray butterfly. The contacted neurons may either cross to the contralateral side and then ascend to the brain, or give rise to local reflex circuits. An example of the former: nociceptors contact a further sensory neuron in the spinal cord, and the axon of this second neuron ascends in the so-called spinothalamic tract (ye-es... towards the thalamus). An example of the latter: neurons bearing information from Golgi tendon receptors. These receptors respond to minute changes in muscle tension.

Efferent:

The neurons of the motor division innervate the skeletal (somatic) muscles and the smooth muscles of the inner organs (autonomic nervous system). The cell bodies of the neurons connecting to the somatic muscles can be found in the ventral horn, while the preganglionic neurons belonging to the autonomic nervous system are located in the lateral horn of the spinal cord. The ventral roots and the cranial nerves of the autonomic nervous system comprise the so-called final common pathway. Every bit of information flowing to the periphery has to take the route through the final common pathway. Its destruction at a certain stage will cause paralysis below this point as well as reflex disturbances due to the lack of top-down modulatory effects.

Reflexes

One of the most popular features of the spinal cord is its reflex circuits. A reflex is an automatic, involuntary behaviour caused by the activation of a sensory neuron in the periphery or by autonomic reflex neurons in the brainstem. Accordingly, reflexes can be divided into somatic and autonomic reflexes. In the following section we will focus on two types of somatic reflexes, namely a monosynaptic and a polysynaptic reflex in the human leg.

A Monosynaptic Refl ex The most common example for a monosynaptic reflex is the knee jerk reflex. By hitting the patella with a small rubber hammer, your doctor causes the quadriceps fomaris (ye-es, that’s a muscle) to stretch. This stretch

06

0806 The typical shape of a pseudounipolar cell

07 Nerve roots leaving and entering the spinal cord

08 A Muscle spindle : What where muscle spindles good for again ? Muscle spindles are sensitive to length (stretch) of a muscle and also signal the rate of change of length.

07

Page 105: Action and Cognition

a&c | Spinal Motor Systems

a&c | 105

is detected by the muscle spindles inside the muscle. After excitation, a chain reaction is elicited (see figure 10). Information from the muscle spindle travels along the axon of the sensory neuron and enters the spinal cord via the dorsal root. Upon arrival in the ipsilateral ventral horn, it excites the alpha motor neuron group activating the quadriceps femoris. Consequently, your leg performs a kicking movement towards your doctor. This process is called monosynaptic because there is only one synaptic relay station between stimulus and response. Note that the quadriceps femoris is the only active muscle in the knee-jerk reflex: it is stretched by an external force and contracted by the reflex circuit. Additionally, inhibitory interneurons, which block the alpha motor neurons of the antagonist muscle, are also innervated by the afferent fibers from the quadriceps femoris.
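The reflex arc just described can be written down as a toy computation. This is a minimal sketch, not a biophysical model: the linear gains and the baseline antagonist tone are invented numbers chosen only to make the sign pattern of the circuit visible.

```python
# Toy sketch of the monosynaptic stretch-reflex arc described above.
# All gains and the baseline tone are invented illustrative numbers.

def stretch_reflex(spindle_stretch: float) -> dict:
    """Map quadriceps spindle stretch to drive on both muscle groups."""
    # Ia afferent: fires in proportion to the stretch of the spindle.
    ia_afferent = max(0.0, spindle_stretch)

    # One excitatory synapse onto the alpha motor neuron -- hence "monosynaptic".
    quadriceps_drive = 1.0 * ia_afferent

    # A collateral excites an inhibitory interneuron, which silences the
    # antagonist (hamstrings) so the kick is not opposed.
    antagonist_baseline = 0.2
    hamstring_drive = max(0.0, antagonist_baseline - ia_afferent)

    return {"quadriceps": quadriceps_drive, "hamstrings": hamstring_drive}

print(stretch_reflex(0.0))  # at rest: only the baseline antagonist tone remains
print(stretch_reflex(0.8))  # hammer tap: quadriceps fires, antagonist is silenced
```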

A Polysynaptic Reflex

An example of a polysynaptic reflex is the withdrawal reflex, where an afferent fiber from a nociceptor located below the patella branches in the dorsal horn, connecting to both excitatory and inhibitory interneurons. These interneurons in turn innervate the motor neurons in the contra- and ipsilateral ventral horns. Depending on the circuitry of inhibitory and excitatory interneurons, certain muscle groups are contracted

09 Schematic depiction of a monosynaptic reflex

10 The notorious knee-jerk reflex

11 The polysynaptic withdrawal reflex


while others remain relaxed. In the example of the withdrawal reflex, the ipsilateral leg is withdrawn while the contralateral leg is extended, keeping the overall balance of the body.
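The withdrawal circuit can be sketched in the same toy style. Only the sign pattern (excite vs. inhibit, ipsilateral vs. contralateral) follows the text; the baseline tone and the unit gain are invented numbers.

```python
# Toy wiring of the polysynaptic withdrawal reflex with crossed extension.
# Only the sign pattern follows the text; all numbers are invented.

def withdrawal_reflex(pain: float) -> dict:
    """Nociceptor input flexes the stimulated leg and extends the other one."""
    baseline = 0.2  # resting tone of every muscle group

    # Ipsilateral side: excitatory interneurons drive the flexor, inhibitory
    # ones silence the extensor -> the leg is withdrawn.
    ipsi_flexor = baseline + pain
    ipsi_extensor = max(0.0, baseline - pain)

    # Contralateral side: the pattern is mirrored -> the other leg extends
    # and keeps the body balanced (crossed extension).
    contra_extensor = baseline + pain
    contra_flexor = max(0.0, baseline - pain)

    return {"ipsi_flexor": ipsi_flexor, "ipsi_extensor": ipsi_extensor,
            "contra_flexor": contra_flexor, "contra_extensor": contra_extensor}

print(withdrawal_reflex(1.0))  # strong pain: withdraw one leg, extend the other
```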

Not that stupid

Contrary to popular belief, reflexes are neither simple nor completely stereotyped. Information about joint angle, skin and muscle stretch as well as nociception converges onto reflex circuits, so that every kind of information available to peripheral receptors is taken into account. Thus fine differences between, say, a pinch and a small squeeze can be integrated into reflex actions. Top-down modulatory effects play an important role in the regulation of reflexes. The overall condition of the system, including proprioceptive information (posture) and state of arousal, is of great importance for reflex responses. As has been shown in gruesome experiments with living animals, transection of the spinal cord or removal of the brain significantly influences the intensity of reflex actions.

Not stupid at all: Force fields

The traditional view that the spinal cord is a passive relay station in the production and control of motor behaviours


12 Experimental setting (see text for details)

13 Schematic diagram of the frog leg’s workspace, which is covered by the 9 points in the picture.

14 An example of a Convergent Force Field (EP: Equilibrium Point)

15 A Convergent Force Field visualized as being superimposed on the workspace of the leg.


has been challenged by Bizzi and colleagues in a series of experiments. The typical experimental setup consisted of clamping and microstimulating the lumbar spinal cord of a frog while the frog's ankle was attached to a six-axis force transducer, measuring the force generated by the ankle in response to the stimulation (Fig 12). With the site of stimulation held constant across trials, the force was measured at different ankle locations in the workspace of the leg, which can be defined as the region of the horizontal plane reachable by the leg (Fig 13). The spatial variation of the gathered force vectors defines a force field which, according to the direction of the vectors, is both convergent and characterized by a single equilibrium point: the location where the leg would not move even if it were free to move (Figs 14 and 15). It has been argued that the frog spinal cord contains at least four regions, each with a distinct type of convergent force field (Fig 17). The presence of these functional spinal units, responsible for precise muscle-group contractions, requires that an active role be attributed to the spinal

16 The different regions of the lumbar spinal cord where stimulation leads to different CFFs.

17 Different CFFs as gathered by stimulation of the different sites in the lumbar spinal cord.

18 Vectorial summation principle of the CFFs.


cord in the control and production of motor behaviours. Moreover, the wide variety of motor behaviours can be explained with these few functional units. With the same experimental procedure the researchers were able to show that the vectorial summation of the two force fields corresponding to two different spinal stimulation sites is equivalent to the force field generated by simultaneous co-stimulation of these two sites (Fig 18). Accordingly, the functional units in the spinal cord might be considered the "motor primitives" from which, via this simple summation principle, the wide range of movements can be generated through their exploitation by supraspinal systems.
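A minimal numerical sketch can make the two key properties concrete: convergence toward an equilibrium point and vectorial summation of fields. The spring-like force law F = gain * (eq - x), the gains and the equilibrium points below are assumptions chosen for illustration, not values taken from the frog experiments.

```python
import numpy as np

# Sketch of convergent force fields (CFFs) and their vectorial summation.
# The linear spring-like law F = gain * (eq - x) is an illustrative assumption.

def cff(eq_point, gain):
    """Return a force field that pulls any leg position x toward eq_point."""
    eq = np.asarray(eq_point, dtype=float)
    return lambda x: gain * (eq - np.asarray(x, dtype=float))

field_a = cff([1.0, 0.0], gain=0.5)  # "stimulation site A"
field_b = cff([0.0, 1.0], gain=1.5)  # "stimulation site B"

def co_stimulation(x):
    # Summation principle: co-stimulating both sites yields the vector sum.
    return field_a(x) + field_b(x)

print(co_stimulation([0.3, 0.2]))

# The summed field is again convergent; its equilibrium point (where the
# summed force vanishes) is the gain-weighted average of the two EPs.
eq_combined = (0.5 * np.array([1.0, 0.0]) + 1.5 * np.array([0.0, 1.0])) / (0.5 + 1.5)
assert np.allclose(co_stimulation(eq_combined), 0.0)
```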

a&c | Consciousness

PAPER REVIEW: Tong, Nakayama, Vaughan, Kanwisher (1998)

In 1998, Tong et al. carried out an experiment to investigate perception-dependent activity in higher visual areas. The areas investigated were the fusiform face area (FFA) and the parahippocampal place area (PPA), both located in extrastriate cortex. The FFA responds selectively to faces, whereas the PPA responds to houses and other places. The FFA is located in the mid-fusiform gyrus, the PPA in the parahippocampal gyrus (cf. figure). The reaction of these two areas to rivalrous and non-rivalrous stimuli was measured with fMRI.

experimental setup

Three different scans were performed: first a localizer scan, used to precisely localize the FFA and PPA in each subject; then a scan with rivalrous stimuli; and finally a scan with non-rivalrous stimuli.

For the localizer scan, the subjects were shown alternating sequences of faces and houses. The responses were measured; the area in the mid-fusiform gyrus responding only to faces but not to houses was defined as the FFA, while the area in the parahippocampal gyrus responding only to houses but not to faces was defined as the PPA.

For the rivalrous scans, one house stimulus and one face stimulus were presented at the same time, one to each eye, so that perception switched between face and house while the stimulus remained constant. For the non-rivalrous scans, a sequence of faces and houses was shown to both eyes. The subjects were given switches to report which image they perceived during the rivalrous scans.

results

The localizer scans were successful, and the localization of the FFA and PPA was consistent with previous studies. The data gained from the rivalrous and non-rivalrous scans showed a striking resemblance: the results are not only qualitatively similar, but even the amplitudes are nearly equal.
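The localizer logic (a region counts as face-selective if it responds to faces but not to houses, and vice versa for places) can be sketched with made-up voxel responses. All numbers and the threshold below are hypothetical; real localizer analyses use statistical contrasts, not a fixed cutoff.

```python
import numpy as np

# Toy version of the localizer logic: a voxel counts as part of the FFA if it
# responds to faces but not to houses, and as part of the PPA in the opposite
# case. The response values and the threshold are invented for illustration.

resp_faces  = np.array([2.1, 0.2, 1.8, 0.1, 0.3, 1.9])  # mean response to faces
resp_houses = np.array([0.3, 1.7, 0.2, 2.0, 0.2, 0.4])  # mean response to houses

threshold = 1.0  # minimal difference required to call a voxel selective
ffa_voxels = np.where(resp_faces - resp_houses > threshold)[0]
ppa_voxels = np.where(resp_houses - resp_faces > threshold)[0]

print("FFA voxels:", ffa_voxels)  # face-selective voxels
print("PPA voxels:", ppa_voxels)  # house-selective voxels
```

Note that voxel 4 in this toy example responds weakly to both categories and is assigned to neither region, mirroring how only clearly selective voxels define the areas.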

APPENDIX C: Binocular rivalry and visual awareness

Introduction

One interesting question in neuroscience is that of abstraction in the brain. The representation of objects in early visual processing has been discussed already, but we have not yet talked about the representation in higher areas. A good way to investigate this field are binocular rivalry experiments, in which a different stimulus is presented to each eye. The subject does not, as one might think, perceive a blend of both pictures. The actual perception in fact switches from one image to the other. This switching occurs only in perception, whereas the actual stimulus remains unchanged. Thus, higher brain areas containing high-level descriptions should change their activity when the perception changes, whereas lower brain areas should remain in the same state.

Early studies have already shown that lower visual areas, especially primary visual cortex, indeed show a strong dependency on the stimulus itself, i.e. on the image on the retina. In these brain areas, different visual stimuli always seem to lead to different reactions, even if they are perceived as similar or even identical.

01 The FFA is localized in the mid-fusiform gyrus, while the PPA is in the parahippocampal gyrus.




summary

The study of Tong et al. indeed shows a strong coupling between visual awareness and activity in the FFA and PPA. The switching between the perceived images in the rivalry scans was mirrored by the measurements: while a house was perceived, the PPA showed increased activity, and the perception of a face led to more activity in the FFA.

This experiment contributes to our understanding of binocular rivalry. It shows that the difference between the stimuli, which is present in lower brain areas, is resolved in higher areas, and that brain activity actually reflects the percept and no longer the mere stimulus.

02 The FFA is localized in the mid-fusiform gyrus, while the PPA is in the parahippocampal gyrus.

a&c | References

References & Background Material

Most of the listed sources are also available on the CD. Go to Action & Cognition/References.

01 The Constructive Nature of Color Vision

Kandel, Schwartz, Jessell. Principles of Neural Science. Chapter 25: Constructing the Visual Image; Chapter 26: Visual Processing by the Retina; Chapter 29: Color Vision

02 The Primary Visual Cortex

Kandel, Schwartz, Jessell. Principles of Neural Science. Chapter 25: Constructing the Visual Image; Chapter 27: Central Visual Pathways

Palmer. Vision Science: From Photons to Phenomenology. MIT Press. 2001.

David H. Hubel, Torsten N. Wiesel. Early Exploration of the Visual Cortex. (Neuron, Vol. 20, 401–412, March 1998)

Shapley, Hawken, Ringach. Dynamics of Orientation Selectivity in the Primary Visual Cortex and the Importance of Cortical Inhibition. (Neuron, Vol. 38, 689–699, June 5, 2003)

03 Vision, Vx and a Representation of the World

Kandel, Schwartz, Jessell. Principles of Neural Science. Chapter 25: Constructing the Visual Image; Chapter 27: Central Visual Pathways; Chapter 28: Perception of Motion, Depth and Form; Chapter 29: Color Vision

Palmer. Vision Science: From Photons to Phenomenology. MIT Press. 2001.

Singer. Der Beobachter im Gehirn. Suhrkamp Verlag. 2002.

Hegdé, Van Essen. Selectivity for Complex Shapes in Primate Visual Area V2. (The Journal of Neuroscience, 2000, Vol. 20)

Felleman, Van Essen. Distributed hierarchical processing in the primate cerebral cortex. (Cereb Cortex. 1991 Jan-Feb;1(1):1-47)

Abel et al. Distribution of neurons projecting to the superior colliculus correlates with thick cytochrome oxidase stripes in macaque visual area V2. (J Comp Neurol. 1997 January)

Schall et al. Topography of visual cortex connections with frontal eye field in macaque: convergence and segregation of processing streams. (J Neurosci. 1995 June)

Beauchamp et al. Graded effects of spatial and featural attention on human area MT and associated motion processing areas. (J Neurophysiol. 1997 Jul;78(1):516-20)

Schmolesky et al. Signal timing across the macaque visual system. (J Neurophysiol. 1998)

04 Appendix A: Object Recognition in IT and STS

DiCarlo, Maunsell. Anterior Inferotemporal Neurons of Monkeys Engaged in Object Recognition Can be Highly Sensitive to Object Retinal Position. (J Neurophysiol 89: 3264–3278, 2003)

Sigala, Logothetis. Visual categorization shapes feature selectivity in the primate temporal cortex. (Nature, Vol. 415, 17 Jan. 2002)

04 Temporal Coding

Singer. Neuronal Synchrony: A Versatile Code for the Definition of Relations? (Review)

05 Attention

Treue. Neural correlates of attention in primate visual cortex. (Trends in Neurosciences, Vol. 24, No. 5, May 2001)

Parkhurst, Niebur. Modeling the role of salience in the allocation of overt visual attention. (Vision Res. 2002 Jan;42(1):107-23)

Parkhurst, Niebur. Texture contrast attracts overt visual attention in natural scenes. (Eur J Neurosci. 2004 Feb;19(3):783-9)

Einhäuser, König. Does Luminance-Contrast Contribute to a Saliency Map for Overt Visual Attention? (Eur J Neurosci 17(5):1089-1097, 2003)

O'Regan et al. Change blindness as a result of mudsplashes. (Nature, March 4, 1998)

Palmer. Vision Science: From Photons to Phenomenology. MIT Press. 2001.

06 Neural Correlates of Attentional Mechanisms

Culham, Kanwisher. Neuroimaging of cognitive functions in human parietal cortex. (Current Opinion in Neurobiology 2001, 11:157–163)

Kosslyn et al. Neural foundations of imagery. (Nature Reviews Neuroscience, Vol. 2, September 2001)

Bisley, Goldberg. Neuronal Activity in the Lateral Intraparietal Area and Spatial Attention. (Science, Vol. 299, 3 January 2003)

Corbetta, Shulman. Control of goal-directed and stimulus-driven attention in the brain. (Nature Reviews Neuroscience, Vol. 3, March 2002)

Corbetta, Kincade, Ollinger, McAvoy, Shulman. Voluntary orienting is dissociated from target detection in human posterior parietal cortex. (Nature Neuroscience, Vol. 3, No. 3, March 2000)

Yantis. To see is to attend. (Science, Vol. 299, 3 January 2003)

07 Top-Down and Feed-Back Mechanisms in the Visual System

Hochstein, Ahissar. View from the Top: Hierarchies and Reverse Hierarchies in the Visual System. (Neuron, Vol. 36, 791–804, December 5, 2002)

Ahissar, Hochstein. Task difficulty and the specificity of perceptual learning. (Nature, 1997, Vol. 387)

Galuske et al. The role of feedback in shaping neural representations in cat visual cortex. (PNAS, December 24, 2002, Vol. 99, No. 26)

Moore, Armstrong. Selective gating of visual signals by microstimulation of frontal cortex. (Nature, Vol. 421, 23 Jan. 2003)

Ahissar. Perceptual learning. (Current Directions in Psychological Science, 1999)

08 Consciousness

Srinivasan et al. Increased Synchronization of Neuromagnetic Responses during Conscious Perception. (The Journal of Neuroscience, July 1, 1999)

Leopold, Logothetis. Multistable phenomena: changing views in perception. (Trends in Cognitive Sciences, Vol. 3, No. 7, July 1999)

Frith, Perry, Lumer. The neural correlates of conscious experience: an experimental framework.

O'Regan, Noë. A sensorimotor account of vision and visual consciousness.

09 Spinal Motor Systems

Eccles. The Spinal Cord and Reflex Action

Kandel, Schwartz, Jessell. Principles of Neural Science. Chapter 36: Spinal Reflexes

Bizzi et al. New Perspectives on Spinal Motor Systems. (Nature Reviews Neuroscience, Nov. 2000)

Appendix B: MRI and fMRI

Noll. Primer on MRI and Functional MRI, available on Douglas Noll's homepage.


Authors

“... words ... don't come easy ... to me”

01 The Constructive Nature of Color Vision written by Christian Honey

02 The Primary Visual Cortex

written by Boris Bernhardt

03 Vision, Vx and a Representation of the World

written by Boris Bernhardt

Appendix A: Object Recognition in IT and STS

written by Alper Açık

04 Temporal Coding

written by Stephan Weller

05 Attention

written by Boris Bernhardt

06 Neural Correlates of Attention

written by Robert Märtin

09 Appendix B: MRI, fMRI

written by Robert Märtin

07 Top-Down and Feed-Back Mechanisms in the Visual System

written by Alper Açık

08 Consciousness

written by Anton Stasche

09 Spinal Motor Systems

written by Robert Märtin


09 Appendix C: Binocular Rivalry and Visual Awareness

written by Robert Märtin