TRANSCRIPT
THE FORMATION OF HABITS The implicit supervision of the basal ganglia
MEROPI TOPALIDOU
12e Colloque de Société des Neurosciences
Montpellier May 19-22, 2015
Goal-Directed Actions VS Habits
Goal-directed actions:
→ initiation of response is under direct control of the current value of outcome
→ sensitive to devaluation of the outcome
→ behavior adjusts to reflect the new value of the outcome that the action would obtain
Habits:
→ direct initiation of responding by stimulus and/or context presentation
→ resistant to devaluation of the outcome
→ habits persist even if the reward becomes less attractive or if the action is not necessary to earn the reward.
Belin et al. (2008), Yin (2008), Foerde & Shohamy (2011), Doll et al. (2012)
Cortex Basal Ganglia
Novel behaviors require attention and flexible thinking and are therefore dependent on cortex, whereas automatic behaviors have been assumed to be primarily mediated by subcortical structures. Much evidence suggests, however, that subcortical structures, such as the striatum, make significant contributions to initial learning. More recently, evidence has been accumulating that neurons in the associative striatum are selectively activated during early learning, whereas those in the sensorimotor striatum are more active after automaticity has developed. At the same time, other recent reports suggest that automatic behaviors are striatum- and dopamine-independent, and may be mediated entirely within cortex. Resolving this apparent conflict should be a major goal of future research.
These ideas led to the theory that dominated the 20th century: novel behaviors require attention and flexible thinking and are therefore dependent on cortex, whereas automatic behaviors require neither of these and so are not mediated primarily by cortex. Instead, it has long been assumed that automatic behaviors are primarily mediated by subcortical structures.
Ashby, Turner & Horvitz (2010)
Habits go there
Goal-directed actions go here
Cortex leads decision once learned
BG “teach” cortex during learning phase
Daw, Niv & Dayan (2005)
Outline
• Experiment
• Computational model
• Results
Experimental setup
Trial start → Cue presentation → Go signal → Decision → Reward → Trial stop
Time axis: three intervals of 1.0s - 1.5s
[Figure: Habitual Condition with pre-learned cues, Novel Condition with novel cues (every day); reward probabilities 0.75 and 0.25 in each condition.]
Two monkeys, simple two-armed bandit task with P=0.75 and P=0.25.
→ Habitual condition (known stimuli pair, same every day)
→ Novel condition (unfamiliar stimuli pair, new every day)
Piron et al. (submitted)
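The task structure described on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the `policy` hook, the session length of 120 trials (read off the axis of the results plots) and the definition of success as choosing the cue rewarded with P=0.75 are not taken from the experimental protocol.

```python
import random

P_REWARD = (0.75, 0.25)  # reward probabilities of the two cues (from the slide)


def run_session(policy, n_trials=120, seed=0):
    """Simulate one session of the two-armed bandit task.

    `policy(trial, rng)` is a hypothetical hook returning the chosen cue (0 or 1).
    Returns two lists: successes (better cue chosen, an assumed definition)
    and rewards (whether the trial was actually rewarded).
    """
    rng = random.Random(seed)
    successes, rewards = [], []
    for trial in range(n_trials):
        choice = policy(trial, rng)
        successes.append(choice == 0)
        rewards.append(rng.random() < P_REWARD[choice])
    return successes, rewards


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def always_best(trial, rng):
    """Reference policy: always pick the cue with P=0.75."""
    return 0


def random_choice(trial, rng):
    """Reference policy: pick either cue with equal probability."""
    return rng.randrange(2)


successes, rewards = run_session(random_choice)
print("first 25 trials:", mean(successes[:25]))
print("last 25 trials:", mean(successes[-25:]))
```

Note that even a policy that always picks the better cue is rewarded on only about three trials out of four, which is why success rate and reward rate are tracked separately here.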
Experimental results
[Figure: mean success rate (0.0 to 1.0) against number of trials (0 to 120) for HC and NC under saline. Bar plot: mean success rate over the first 25 trials and over the last 25 trials for HC and NC under saline; one comparison is marked **.]
Piron et al. (submitted)
Experimental results
Muscimol injection in GPi disrupts learning in novel conditions (NC), but performance remains intact (although slower) in habitual conditions (HC).
[Figure: mean success rate (0.0 to 1.0) against number of trials (0 to 120) for HC and NC under saline and muscimol. Bar plot: mean of last 25 trials for HC and NC, saline versus muscimol; comparisons marked *** and *.]
Piron et al. (submitted)
Experimental conclusion
If habits were stored in the basal ganglia, monkeys would not achieve peak performance in muscimol conditions for familiar stimuli.
If habits were learned in cortex, monkeys would be able to reach peak performance in muscimol conditions for unfamiliar stimuli.
Piron et al. (submitted)
Computational model
Two segregated loops:
→ The cognitive loop chooses a shape
→ The motor loop reaches for a shape
[Figure: model architecture. Populations: Cortex cognitive (4 units), Cortex motor (4 units), Cortex associative (4x4 units); Striatum cognitive (4 units), Striatum motor (4 units), Striatum associative (4x4 units); STN cognitive (4 units), STN motor (4 units); GPe cognitive (4 units), GPe motor (4 units); GPi cognitive (4 units), GPi motor (4 units); Thalamus cognitive (4 units), Thalamus motor (4 units). Excitatory (+) and inhibitory (-) connections form the direct, indirect and hyperdirect pathways. Three external current inputs are shown, and three sites are numbered 1, 2 and 3.]
Topalidou et al. (in prep.)
Neural Network
Neuron Rate model
Cortico-basal competition
Cognitive decision has to intervene in motor decision.
Thanks to lateral competition, the cortex can make a decision without interacting with the BG.
Cortical decision
Cortico-Basal decision
Topalidou et al. (in prep.)
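The idea that lateral competition lets the cortex settle on a decision by itself can be illustrated with a small rate-coded sketch. It is not the model presented in the talk: the linear-threshold units, the self-excitation `w_self`, the lateral inhibition `w_lat`, the step size and the drive values are all assumptions, chosen so that a single winner emerges.

```python
def relu(x):
    """Linear-threshold activation: negative net input gives zero rate."""
    return x if x > 0.0 else 0.0


def cortical_decision(drive, w_self=0.5, w_lat=1.0, dt_over_tau=0.1, steps=500):
    """Euler-integrate a hypothetical cortical population of rate units.

    Each unit excites itself (w_self) and inhibits every other unit (w_lat).
    `drive` is the external current per unit. Returns (winner index, rates).
    """
    v = [0.0] * len(drive)
    for _ in range(steps):
        total = sum(v)
        new_v = []
        for vi, ext in zip(v, drive):
            net = ext + w_self * vi - w_lat * (total - vi)
            new_v.append(vi + dt_over_tau * (-vi + relu(net)))
        v = new_v
    winner = max(range(len(v)), key=lambda i: v[i])
    return winner, v


# Two of the four units receive external current; unit 0 is slightly favored.
winner, rates = cortical_decision([1.0, 0.9, 0.0, 0.0])
print("winner:", winner, "rates:", [round(r, 3) for r in rates])
```

In this sketch the small difference in drive stands in for a preference the cortex has already learned. With exactly equal drives nothing breaks the tie in this deterministic version, so a fuller simulation would need noise or an external bias to pick a winner.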
Acting is learning
Learning occurs at three different places simultaneously.
① & ② Hebbian learning
③ Reinforcement learning
Cortex learns to reproduce previous repertoires, regardless of whether or not they are appropriate (HL).
Fast basal ganglia trial-and-error learning (RL) biases the slow cortical learning (HL), ensuring that the correct behavior is produced.
Hélie et al. (2014)
Topalidou et al. (in prep.)
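The two kinds of learning named on this slide can be written in a generic textbook form. The slide does not give the model's actual equations, so the following is only a sketch: the symbols (reward r, value estimate V, prediction error δ, learning rates α, α_w and η, pre- and postsynaptic activities x_pre and x_post) are all assumptions.

```latex
% Reinforcement learning (site 3), assumed prediction-error form:
% the weight change depends on the reward through \delta.
\[
\delta = r - V, \qquad
V \leftarrow V + \alpha\,\delta, \qquad
w^{\mathrm{str}} \leftarrow w^{\mathrm{str}} + \alpha_w\,\delta\,x_{\mathrm{pre}}\,x_{\mathrm{post}}
\]

% Hebbian learning (sites 1 and 2), assumed reward-independent form:
% the weight change depends only on co-activity.
\[
w^{\mathrm{ctx}} \leftarrow w^{\mathrm{ctx}} + \eta\,x_{\mathrm{pre}}\,x_{\mathrm{post}}
\]
```

The reward appears only in the first rule. In this reading, the Hebbian rule strengthens whatever the cortex has just done, which is why it needs the reward-driven rule to bias what gets done.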
Computational results
Intact model
→ peak performances on familiar conditions
→ can learn novel conditions
Lesioned model (GPi)
→ peak performances on familiar conditions
→ cannot learn novel conditions
[Figure, labelled (Monkey results): mean success rate (0.0 to 1.0) against number of trials (0 to 120) for HC and NC under saline and muscimol. Bar plot: mean of last 25 trials for HC and NC, saline versus muscimol; comparisons marked *** and *.]
Topalidou et al. (in prep.)
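The qualitative pattern listed on this slide can be reproduced with a deliberately abstract toy, which may help to see why the pattern arises. It is not the neural model from the talk: it collapses each structure into two numbers per cue, and the names `simulate_session` and `late_success`, the learning rates, the noise level, the neutral starting values and the pre-learned weights are all assumptions.

```python
import random


def simulate_session(w_cortex, lesion_gpi, seed, n_trials=120,
                     p_reward=(0.75, 0.25), alpha_rl=0.2,
                     rate_hebb=0.005, noise=0.1):
    """Return one boolean per trial: True when the better cue (index 0) is chosen."""
    rng = random.Random(seed)
    w = list(w_cortex)   # slow cortical weights, updated by a Hebbian rule
    q = [0.5, 0.5]       # fast basal ganglia value estimates, neutral start (assumption)
    successes = []
    for _ in range(n_trials):
        # A GPi lesion is modelled as removing the basal ganglia bias on the decision.
        bias = [0.0, 0.0] if lesion_gpi else q
        score = [w[i] + bias[i] + rng.gauss(0.0, noise) for i in range(2)]
        choice = 0 if score[0] >= score[1] else 1
        reward = 1.0 if rng.random() < p_reward[choice] else 0.0
        q[choice] += alpha_rl * (reward - q[choice])  # RL: driven by the reward
        w[choice] += rate_hebb * (1.0 - w[choice])    # Hebbian: driven by the choice alone
        successes.append(choice == 0)
    return successes


def late_success(w_cortex, lesion_gpi, n_sessions=200):
    """Mean success rate over the last 25 trials, averaged across sessions."""
    total = 0.0
    for seed in range(n_sessions):
        total += sum(simulate_session(w_cortex, lesion_gpi, seed)[-25:]) / 25
    return total / n_sessions


FAMILIAR = (1.0, 0.0)  # habit already stored in the cortical weights (assumption)
NOVEL = (0.0, 0.0)     # no cortical preference yet

for label, w0 in (("HC", FAMILIAR), ("NC", NOVEL)):
    for lesion in (False, True):
        state = "lesioned" if lesion else "intact"
        print(label, state, round(late_success(w0, lesion), 2))
```

Under these assumptions the lesioned toy keeps choosing the familiar cue because the preference already sits in the cortical weights, while for novel cues its Hebbian rule reinforces whichever cue happens to be chosen, so the average success rate stays near chance. With the basal ganglia bias in place, the reward-driven values should steer the early choices toward the better cue, and the Hebbian rule then consolidates that choice.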
Sensitivity to reward devaluation
Conclusion
The acquisition and the expression of habits are two entangled processes that can be dissociated experimentally.
This experimental dissociation sheds light on the nature of the interaction between the basal ganglia and the cortex and on their respective roles in the initial formation and the later expression of habits.
The model suggests that the basal ganglia implicitly supervise the cortex, where habits are actually stored, and that the cortex cannot learn them on its own.
In the future, the model will be tested in different protocols in order to ensure the accuracy of its predictions.
Acknowledgements
• Nicolas Rougier
• T. Boraud
• C. Piron
• D. Kase
• A. Leblois
[Figure: reaction period (ms) for HC and NC under saline and muscimol; axis ticks at 0 and from 400 to 550; two comparisons marked **.]