optimal control approaches to language · 2017. 6. 14. · howes, lewis & vera (2009,...
TRANSCRIPT
Optimal Control Approaches to LanguageHow the Architecture “Shows Through”
Richard L. Lewis Michael Shvartsman Satinder Singh
Psychology, Computer ScienceUniversity of Michigan
Soar Workshop 32(!)21 June 2012
Lewis (University of Michigan) Optimal Control Approaches 21 June 2012
The idea: Applying Newell & Simon’s scissors to language(plus utility maximization)
All aspects of linguistic processing and
behavior—from parsing strategies to
production strategies to control of short
and long-term memory to eye-movement
control—may be understood as the solution
to the constrained optimization problem
posed by the external task environment,
task structure, and internal processing
structure/constraints (e.g. representation
noise, knowledge).
arg max
organism structure
(constraints) task goals (
payoff)
envir
onmen
t st
ructu
re
behavior (optimal policies)
Howes, Lewis & Vera (2009,
Psychological Review)
We’ll pursue this via the application of state-of-the-art theoretical ideas
. . . from the 1940-60s: optimal control and optimal state estimation.Lewis (University of Michigan) Optimal Control Approaches 21 June 2012
Modeling eye-movements
Overview
1 What determines the nature of eye-movements in linguistictasks?
The task and modelPredictions vs. human behaviorHow architecture shapes adaptation
2 Conclusions and looking ahead
Lewis (University of Michigan) Optimal Control Approaches 21 June 2012
Modeling eye-movements Task and model
The List Lexical Decision Task
TASK ENVIRONMENT
MODEL
Control policies optimizing reward
predicted behavior under ACCURACY payoff
predicted behavior under BALANCED payoff
predicted behavior under SPEED payoff
behavior under ACCURACY payoff
behavior under BALANCED payoff
behavior under SPEED payoff
ACCURACY emphasis
payoff
BALANCEDpayoff
Varying architectural constraints
HUMANSin eye-tracking experiments
SPEED emphasis
payoff
2
3
4
5
6
lead hilt robe helm guru east1
varying architectural constraints
Version of this task first used by Schvanaveldt & Meyer (1973);Meyer & Schvanavedlt (1972)
Lewis (University of Michigan) Optimal Control Approaches 21 June 2012
Modeling eye-movements Task and model
Explicit payoffs: Motivating with cash
4 LEWIS, SHVARTSMAN & SINGH
Accuracy Balanced Speed
Incorrect penalty -150 -50 -25Speed bonus (persecond under 5s)
8 6.7 5.7
Table 1Quantitative payo↵s given to both model and hu-man participants. These payo↵ points translatedinto cash bonuses for the human participants.
Three distinct payo↵s
We evaluated both model and human participantsaccording to three di↵erent payo↵ functions (spec-ified in Table 1. The payo↵s were designed to im-pose di↵erent speed-accuracy tradeo↵s for a givenlevel of success, and were all defined in terms of abonus for speed and penalty for incorrect responses.The bonus was continuous at the millisecond level,starting at zero points for responses longer than 5sand rising by a di↵erent number of points per sec-ond for each payo↵.
An optimal control model
Main theoretical assumptions
We can now briefly state our three main theoret-ical assumptions:
1. Saccadic control is a “rise-to-threshold” sys-tem (Brodersen et al., 2008) conditioned on task-specific decision variables that reflect the accu-mulation and integration of noisy evidence overtime. We model the dynamic evidence accumula-tion as Bayesian sequential sampling, and in oursimple two-alternative task this is equivalent to a Se-quential Probability Ratio Test (Wald & Wolfowitz,1948).
2. The saccade thresholds are set to maximizetask-specific payo↵, but this is one part of a jointoptimization problem that includes all other policyparameters that determine behavior in the task. Inour model of the LLDT, this consists of a separatedecision variable and threshold that determines thetask-level response to the entire trial (but does notinclude architectural parameters, which are fixed).These two thresholds together determine how longthe model fixates on individual strings, how manystrings it reads, and when and how it responds.
3. The shape of the payo↵ surface (and thus itsmaxima) over the multi-dimensional policy space isdetermined jointly by the payo↵ function and prop-erties of the perceptual and oculomotor system, in-cluding saccade programming duration, eye-brain-lag, saccade execution duration, manual motor pro-gramming duration, and representational noise.
Overview
We provide a brief overview of a typical trialbefore focusing in on specific detail of each as-pect of the model specification. See Figure 2 fora schematic diagram of the full model, and Fig-ure 3 for simulated traces from two sample trials.On a given trial, the first fixation starts on the left-most string. During each fixation, noisy informationabout the fixated string is acquired at every timestep(with some delay, the eye-brain-lag, (VanRullen &Thorpe, 2001)). This noisy information is usedfor updating the model’s beliefs about the status ofthe current string as well as the trial as a whole.This continues until either the string-level or thetrial-level belief reaches some threshold, at whichpoint either a saccade is initiated (if the string-levelthreshold is reached), or a manual response is initi-ated (if the trial-level threshold is hit). We will re-fer to these thresholds as the saccade threshold anddecision threshold. Information acquisition contin-ues while the saccade or manual response is beingprogrammed and until the saccade begins execution(with some visual persistence o↵set). Once saccadeprogramming and execution is complete, the modelfixates on the following string (if there are stringsremaining), or initiates a response otherwise. Oncemotor programming and execution is complete, themodel receives point feedback (i.e. the payo↵) andthe trial is over.
Dynamics assumptions: Oculo-motor ar-chitecture and noise
The model’s sequential perceptual inferencemechanism is embedded in a simple oculomotorcontrol machine, drawing upon modern mathemati-cal models of oculomotor control in reading. Thedelays noted above (eye-brain-lag, saccade pro-gramming and execution times, and motor time) aredrawn from gamma distributions (chosen for con-venience because they are constrained to be posi-tive and have been previously used to model these
Lewis (University of Michigan) Optimal Control Approaches 21 June 2012
Modeling eye-movements Task and model
Explicit payoffs: Motivating with cash
4 LEWIS, SHVARTSMAN & SINGH
Accuracy Balanced Speed
Incorrect penalty -150 -50 -25Speed bonus (persecond under 5s)
8 6.7 5.7
Table 1Quantitative payo↵s given to both model and hu-man participants. These payo↵ points translatedinto cash bonuses for the human participants.
Three distinct payo↵s
We evaluated both model and human participantsaccording to three di↵erent payo↵ functions (spec-ified in Table 1. The payo↵s were designed to im-pose di↵erent speed-accuracy tradeo↵s for a givenlevel of success, and were all defined in terms of abonus for speed and penalty for incorrect responses.The bonus was continuous at the millisecond level,starting at zero points for responses longer than 5sand rising by a di↵erent number of points per sec-ond for each payo↵.
An optimal control model
Main theoretical assumptions
We can now briefly state our three main theoret-ical assumptions:
1. Saccadic control is a “rise-to-threshold” sys-tem (Brodersen et al., 2008) conditioned on task-specific decision variables that reflect the accu-mulation and integration of noisy evidence overtime. We model the dynamic evidence accumula-tion as Bayesian sequential sampling, and in oursimple two-alternative task this is equivalent to a Se-quential Probability Ratio Test (Wald & Wolfowitz,1948).
2. The saccade thresholds are set to maximizetask-specific payo↵, but this is one part of a jointoptimization problem that includes all other policyparameters that determine behavior in the task. Inour model of the LLDT, this consists of a separatedecision variable and threshold that determines thetask-level response to the entire trial (but does notinclude architectural parameters, which are fixed).These two thresholds together determine how longthe model fixates on individual strings, how manystrings it reads, and when and how it responds.
3. The shape of the payo↵ surface (and thus itsmaxima) over the multi-dimensional policy space isdetermined jointly by the payo↵ function and prop-erties of the perceptual and oculomotor system, in-cluding saccade programming duration, eye-brain-lag, saccade execution duration, manual motor pro-gramming duration, and representational noise.
Overview
We provide a brief overview of a typical trialbefore focusing in on specific detail of each as-pect of the model specification. See Figure 2 fora schematic diagram of the full model, and Fig-ure 3 for simulated traces from two sample trials.On a given trial, the first fixation starts on the left-most string. During each fixation, noisy informationabout the fixated string is acquired at every timestep(with some delay, the eye-brain-lag, (VanRullen &Thorpe, 2001)). This noisy information is usedfor updating the model’s beliefs about the status ofthe current string as well as the trial as a whole.This continues until either the string-level or thetrial-level belief reaches some threshold, at whichpoint either a saccade is initiated (if the string-levelthreshold is reached), or a manual response is initi-ated (if the trial-level threshold is hit). We will re-fer to these thresholds as the saccade threshold anddecision threshold. Information acquisition contin-ues while the saccade or manual response is beingprogrammed and until the saccade begins execution(with some visual persistence o↵set). Once saccadeprogramming and execution is complete, the modelfixates on the following string (if there are stringsremaining), or initiates a response otherwise. Oncemotor programming and execution is complete, themodel receives point feedback (i.e. the payo↵) andthe trial is over.
Dynamics assumptions: Oculo-motor ar-chitecture and noise
The model’s sequential perceptual inferencemechanism is embedded in a simple oculomotorcontrol machine, drawing upon modern mathemati-cal models of oculomotor control in reading. Thedelays noted above (eye-brain-lag, saccade pro-gramming and execution times, and motor time) aredrawn from gamma distributions (chosen for con-venience because they are constrained to be posi-tive and have been previously used to model these
Lewis (University of Michigan) Optimal Control Approaches 21 June 2012
Modeling eye-movements Task and model
Explicit payoffs: Motivating with cash
4 LEWIS, SHVARTSMAN & SINGH
Accuracy Balanced Speed
Incorrect penalty -150 -50 -25Speed bonus (persecond under 5s)
8 6.7 5.7
Table 1Quantitative payo↵s given to both model and hu-man participants. These payo↵ points translatedinto cash bonuses for the human participants.
Three distinct payo↵s
We evaluated both model and human participantsaccording to three di↵erent payo↵ functions (spec-ified in Table 1. The payo↵s were designed to im-pose di↵erent speed-accuracy tradeo↵s for a givenlevel of success, and were all defined in terms of abonus for speed and penalty for incorrect responses.The bonus was continuous at the millisecond level,starting at zero points for responses longer than 5sand rising by a di↵erent number of points per sec-ond for each payo↵.
An optimal control model
Main theoretical assumptions
We can now briefly state our three main theoret-ical assumptions:
1. Saccadic control is a “rise-to-threshold” sys-tem (Brodersen et al., 2008) conditioned on task-specific decision variables that reflect the accu-mulation and integration of noisy evidence overtime. We model the dynamic evidence accumula-tion as Bayesian sequential sampling, and in oursimple two-alternative task this is equivalent to a Se-quential Probability Ratio Test (Wald & Wolfowitz,1948).
2. The saccade thresholds are set to maximizetask-specific payo↵, but this is one part of a jointoptimization problem that includes all other policyparameters that determine behavior in the task. Inour model of the LLDT, this consists of a separatedecision variable and threshold that determines thetask-level response to the entire trial (but does notinclude architectural parameters, which are fixed).These two thresholds together determine how longthe model fixates on individual strings, how manystrings it reads, and when and how it responds.
3. The shape of the payo↵ surface (and thus itsmaxima) over the multi-dimensional policy space isdetermined jointly by the payo↵ function and prop-erties of the perceptual and oculomotor system, in-cluding saccade programming duration, eye-brain-lag, saccade execution duration, manual motor pro-gramming duration, and representational noise.
Overview
We provide a brief overview of a typical trialbefore focusing in on specific detail of each as-pect of the model specification. See Figure 2 fora schematic diagram of the full model, and Fig-ure 3 for simulated traces from two sample trials.On a given trial, the first fixation starts on the left-most string. During each fixation, noisy informationabout the fixated string is acquired at every timestep(with some delay, the eye-brain-lag, (VanRullen &Thorpe, 2001)). This noisy information is usedfor updating the model’s beliefs about the status ofthe current string as well as the trial as a whole.This continues until either the string-level or thetrial-level belief reaches some threshold, at whichpoint either a saccade is initiated (if the string-levelthreshold is reached), or a manual response is initi-ated (if the trial-level threshold is hit). We will re-fer to these thresholds as the saccade threshold anddecision threshold. Information acquisition contin-ues while the saccade or manual response is beingprogrammed and until the saccade begins execution(with some visual persistence o↵set). Once saccadeprogramming and execution is complete, the modelfixates on the following string (if there are stringsremaining), or initiates a response otherwise. Oncemotor programming and execution is complete, themodel receives point feedback (i.e. the payo↵) andthe trial is over.
Dynamics assumptions: Oculo-motor ar-chitecture and noise
The model’s sequential perceptual inferencemechanism is embedded in a simple oculomotorcontrol machine, drawing upon modern mathemati-cal models of oculomotor control in reading. Thedelays noted above (eye-brain-lag, saccade pro-gramming and execution times, and motor time) aredrawn from gamma distributions (chosen for con-venience because they are constrained to be posi-tive and have been previously used to model these
Lewis (University of Michigan) Optimal Control Approaches 21 June 2012
Modeling eye-movements Task and model
An adaptive model that performs the complete task
Manualresponse Posterior Update Lexicon +
Model of experiment
Experiment Environment
Critic
Feedback
Noisy sample from position
(Bounded) Optimal Control
Initiate saccade program
Reward Function
Stimulus
Oculomotor System
Initiate press
Button press indicating trial response
s kI(·)
saccade program saccade eye-brain lag ⇠ Gamma(50ms)
⇠ Gamma(250ms)
0 10 20 30 40 50
0.00.2
0.40.6
0.81.0
RN.match
SD= 1.75, threshold=0.95Samples
Posterio
r
RT dist. for RN.match
RT
Frequen
cy
300 500 700
0500
10001
500200
0
0 10 20 30 40 50
0.00.2
0.40.6
0.81.0
RN.mismatch
SD= 1.75, threshold=0.95Samples
Posterio
r
RT dist. for RN.mismatch
RT
Frequen
cy
300 500 700
0500
1500
2500
0 10 20 30 40 50
0.00.2
0.40.6
0.81.0
YES.match
SD= 1.75, threshold=0.95Samples
Posterio
r
RT dist. for YES.match
RT
Frequen
cy
300 500 700
0100
0200
0300
0
0 10 20 30 40 50
0.00.2
0.40.6
0.81.0
NRN.match
SD= 2.00, threshold=0.95Samples
Posterio
r
RT dist. for NRN.match
RT
Frequen
cy
300 500 700
0100
0300
0
0 10 20 30 40 50
0.00.2
0.40.6
0.81.0
NRN.mismatch
SD= 2.00, threshold=0.95Samples
Posterio
r
RT dist. for NRN.mismatch
RT
Frequen
cy
400 500 600 700
0500
1500
2500
0 10 20 30 40 50
0.00.2
0.40.6
0.81.0
RN.match
SD= 2.00, threshold=0.95Samples
Posterio
r
RT dist. for RN.match
RT
Frequen
cy
300 500 700
0500
1000
1500
0 10 20 30 40 50
0.00.2
0.40.6
0.81.0
RN.mismatch
SD= 2.00, threshold=0.95Samples
Posterio
r
RT dist. for RN.mismatch
RT
Frequen
cy
300 500 700
0500
1500
2500
0 10 20 30 40 50
0.00.2
0.40.6
0.81.0
YES.match
SD= 2.00, threshold=0.95Samples
Posterio
r
RT dist. for YES.match
RT
Frequen
cy
300 500 700
0500
1500
2500
0 10 20 30 40 50
0.00.2
0.40.6
0.81.0
RN.mismatch
SD= 3.25, threshold=0.92Samples
Posterio
r
RT dist. for RN.mismatch
RT
Frequen
cy
300 500 700
0500
10001
500200
0
0 10 20 30 40 50
0.00.2
0.40.6
0.81.0
YES.match
SD= 3.25, threshold=0.92Samples
Posterio
r
RT dist. for YES.match
RT
Frequen
cy
300 500 700
0500
1000
1500
2000
0 10 20 30 40 50
0.00.2
0.40.6
0.81.0
NRN.match
SD= 3.50, threshold=0.92Samples
Posterio
r
RT dist. for NRN.match
RT
Frequen
cy
300 500 700
0500
1000
2000
0 10 20 30 40 50
0.00.2
0.40.6
0.81.0
NRN.mismatch
SD= 3.50, threshold=0.92Samples
Posterio
r
RT dist. for NRN.mismatch
RT
Frequen
cy
300 500 700
0500
1500
2500
0 10 20 30 40 50
0.00.2
0.40.6
0.81.0
RN.match
SD= 3.50, threshold=0.92Samples
Posterio
r
RT dist. for RN.match
RT
Frequen
cy
300 500 700
0100
0300
0
0 10 20 30 40 50
0.00.2
0.40.6
0.81.0
RN.mismatch
SD= 3.50, threshold=0.92Samples
Posterio
r
RT dist. for RN.mismatch
RT
Frequen
cy
300 500 700
0500
1500
2500
0 10 20 30 40 50
0.00.2
0.40.6
0.81.0
YES.match
SD= 3.50, threshold=0.92Samples
Posterio
rRT dist. for YES.match
RT
Frequen
cy
300 500 700
0500
1500
0 10 20 30 40 50
0.00.2
0.40.6
0.81.0
NRN.match
SD= 1.50, threshold=0.93Samples
Posterio
r
RT dist. for NRN.match
RT
Frequen
cy
350 400 450 500
0500
1500
2500
saccade decision
trial decision
⇠ Gamma(40ms)⇠ Gamma(125ms)
optimal thresholds
18 LEWIS, SHVARTSMAN & SINGH
Next, we update the string-level beliefs:
Prnew(S k = Wi|T ! N k, sk) =
=Pr(sk |S k = Wi,T ! N k)Prold(S k = Wi|T ! N k)
Prold(sk |T ! N k)
=Pr(sk |S k = Wi)Prold(S k = Wi|T ! N k)
Prold(sk |T ! N k)(4)
Prnew(S k = Ni|T = Nk, sk) =
=Pr(sk |S k = Ni,T = N k)Prold(S k = Ni|T = N k)
Prold(sk |T = N k)
=Pr(sk |S k = Ni)Prold(S k = Ni|T = N k)
Prold(sk |T = N k)(5)
Finally, we update the trial level beliefs:
Prnew(T =W|sk) =
=Prold(sk |T =W)Prold(T =W)
Prold(sk)=
=Prold(sk |T ! N k)Prold(T =W)
Prold(sk)
(6)
Prnew(T = N j!k |sk) =
=Prold(sk |T = N j)Prold(T = N j)
Prold(sk)
=Prold(sk |T ! N k)Prold(T = N j)
Prold(sk),
(7)
Prnew(T = N k |sk) =
=Prold(sk |T = N k)Prold(T = N k)
Prold(sk)
(8)
In order to make decisions, in addition to the proba-bility that the trial is a word trial or not that we computeabove, we also need the probability that the string at po-sition k is a word or nonword, i.e., Pr(S k ∈ ∪n
i=1Wi) andPr(S k ∈ ∪m
i=1Ni):
Pr(S k ∈ ∪ni=1Wi) =
n∑
i=1
Pr(S k = Wi|T ! N k)Pr(T ! Nk)
(9)
Pr(S k ∈ ∪mi=1Ni) = 1.0 − Pr(S k ∈ ∪n
i=1Wi) (10)
The full process then iterates, with each Prnew becom-ing the next Prold.
AppendixReferences
Anderson, J. R. (1990). The Adaptive Character ofThought. Hillsdale, NJ: Lawrence Erlbaum.
Ballard, D. H., & Hayhoe, M. M. (2009). Modelling therole of task in the control of gaze. Visual Cognition,17(6-7), 1185-1204.
Bicknell, K. (2010). Rational eye movements in read-ing combining uncertainty about previous words withcontextual probability. Proceedings of the 32nd An-nual Conference of the Cognitive Science Society.
Bicknell, K., & Levy, R. (2010). A rational model of eyemovement control in reading. Proceedings of the 48thAnnual Meeting of the Association for ComputationalLinguistics, 1168–1178.
Bicknell, K., & Levy, R. (2012). The utility of modelingword identification from visual input within models ofeye movements in reading. Visual Cognition, 1–18.
Bogacz, R., Brown, E., Moehlis, J., Holmes, P., & Cohen,J. D. (2006, October). The physics of optimal decisionmaking: A formal analysis of models of performancein two-alternative forced-choice tasks. PsychologicalReview, 113(4), 700-765.
Brodersen, K. H., Penny, W. D., Harrison, L. M., Dau-nizeau, J., Ruff, C. C., Duzel, E., et al. (2008). Inte-grated bayesian models of learning and decision mak-ing for saccadic eye movements. Neural Networks,21(9), 1247–1260.
Edwards, W. (1961, January). Behavioral decision theory.Annual Review of Psychology, 12, 473–98.
Edwards, W. (1965). Optimal strategies for seeking infor-mation: Models for statistics, choice reaction-times,and human information-processing. Journal of Math-ematical Psychology, 2(2), 312-329.
Engbert, R., Nuthmann, A., Richter, E., & Kliegl, R.(2005, October). Swift: A dynamical model of sac-cade generation during reading. Psychological Re-view, 112(4), 777-813.
Forster, K. (1979). Levels of processing and the struc-ture of the language processor. In W. E. Cooper &E. C. Walker (Eds.), Sentence processing: Psycholin-guistic studies presented to merrill garrett. Hillsdale,NJ: Lawrence Erlbaum.
Geisler, W. (1989). Sequential ideal-observer analysis ofvisual discriminations. Psychological Review, 96(2),267.
Grainger, J. (1990). Word frequency and neighborhoodfrequency effects in lexical decision and naming. Jour-nal of Memory and Language, 29(2), 228–244.
Green, D. M., & Swets, J. A. (1966). Signal DetectionTheory and Psychophysics. New York: Wiley.
Hale, J. T. (2011). What a rational parser would do. Cog-nitive Science, 35, 399–443.
Howes, A., Lewis, R. L., & Vera, A. H. (2009). Rationaladaptation under task and processing constraints: Im-plications for testing theories of cognition and action.Psychological Review, 116(4), 717–751.
Kaelbling, L. P., Littman, M. L., & Moore, A. W. (1996).
,... (see Appendix)
1
2
34
5
6
7
lead hilt robe helm guru east
See Legge et al (1997); Bicknell & Levy (2010); Ratcliff & Mckoon (2008); Norris(2009).
Lewis (University of Michigan) Optimal Control Approaches 21 June 2012
Modeling eye-movements Task and model
“Random walk” of Bayesian posterior update
accumulators that are assumed to gradually accrue noisysensory input until a threshold of activation is reached[19,20]. More recent models proposed in the field of compu-tational neuroscience explicitly model individual integratorneurons [21–23] or assume that each accumulator corre-sponds to a population of integrator neurons associatedwith a particular choice alternative [24,25]. Importantly,all these models assume that SAT is controlled by thedistance between the initial activity of the integrators (i.e.the baseline) and the response threshold. If this difference issmall, decisions are fast but error-prone; if the difference islarge, decisions are accurate but slow (Figure 1). Therefore,these models predict that neural changes associated withSAT should be visible in brain areas involved in decisionmaking (i.e. areas containing integrator neurons), ratherthan in areas specialized in stimulus encoding and motorexecution.
Mathematical models can account for SAT in two ways:either by changing the baseline of accumulators or bychanging their threshold (Figure 2). Although many mod-elling studies have assumed for simplicity that the base-line stays constant and that SAT is controlled by changingthe threshold, in almost all mathematical models thecrucial factor controlling SAT is the baseline–thresholddistance. In most models an increase in baseline and acorresponding decrease in threshold produce exactly thesame changes in simulated behavior. Thus, behavioraldata alone do not allow one to distinguish whether SATis controlled by changing the baseline, threshold, or both.To address this question, one needs to analyze neuralactivity.
fMRI studies of SAT: advantages and limitationsTo date, the most direct evidence concerning the neuralbasis of SAT comes from three recent BOLD-fMRI studiesin humans [4–6]. BOLD is an fMRI technique that revealsthe local changes in blood oxygenation that are closelycoupled with local increases in neural activation [26].Compared to cell recordings in animals, human fMRIhas distinct advantages as a method for studying SAT.First, unlike animals, human subjects can simply beinstructed to be fast or accurate. Furthermore, fMRI per-mits whole-brain coverage at a spatial resolution sufficientto delineate regional changes in activation. Whole-braincoverage is important for studying phenomena, like SAT,that are likely to be dependent on the interplay betweenvarious brain areas.
Two general limitations of fMRI are its low spatial andtemporal resolution. In contrast to neurophysiologicalrecordings, fMRI does not allow one to monitor the activityof specific integrator neurons as evidence accumulates overtime. In addition to these general limitations, fMRI studiesof SAT are also confronted with two specific challenges.First, as pointed out by van Veen et al. [6], the amplitude ofdecision-related BOLD responses is not only proportionalto the baseline–threshold distance but also to the durationof the decision process [26], and this duration is likely to belonger in the accuracy condition. Therefore it is hard toattribute changes in decision-related BOLD signalsunequivocally to changes in baseline–threshold distance.Second, it is difficult to examine response thresholds with
Figure 1. An accumulator model account of SAT. The figure shows a simulation ofa choice between two alternatives. The model includes two accumulators, whoseactivity is shown by blue lines. The inputs to both accumulators are noisy, but theinput to the accumulator shown in dark blue has a higher mean, because thisaccumulator represents the correct response. Lowering the threshold (horizontallines) leads to faster responses at the expense of an increase in error rate. In thisexample, the green threshold leads to a correct but relatively slow response,whereas the red threshold leads to an incorrect but relatively fast response.
Figure 2. Schematic illustration of changes in the activity of neural integrators associated with SAT. Horizontal axes indicate time, while vertical axes indicate firing rate.The blue lines illustrate the average activity of a neural integrator selective for the chosen alternative, and the dashed lines indicate baseline and threshold. (a) Accuracyemphasis is associated with a large baseline–threshold distance. (b,c) Speed emphasis can be caused either by increasing the baseline (panel b) or by lowering thethreshold (panel c); in formal models, these changes are often mathematically equivalent.
Review Trends in Neurosciences Vol.33 No.1
11
OPTIMAL CONTROL = setting thresholds optimally
OPTIMAL STATE ESTIMATION =evidence integration via Bayesian update
Graph from Bogacz (2009), TINS.
Lewis (University of Michigan) Optimal Control Approaches 21 June 2012
Modeling eye-movements Task and model
Bayesian priors and stimulus representation
Model maintains belief probabilities over:
(a) probability distribution over all possible strings in the currentlyfixated position;
(b) the probability of a nonword in each position; probability currenttrial is a word trial is 1− sum over these.
The prior over (a) is based on Brown Corpus frequency; prior over (b) is
probability of a nonword trial (0.5) divided by the # of positions (6).
The string stimulus is represented with a simple indicator vector coding
(length 26 × 4) (Norris, 2009). At each sample (10ms), Guassian noise
of mean zero and SD = 1.2 (more on this) is added to the true
representation.
Lewis (University of Michigan) Optimal Control Approaches 21 June 2012
Modeling eye-movements Task and model
Sample model behavior
0 500 1000 1500 2000
CORRECT WORD Trial
Saccade threshold=0.92000, Decision yes= 0.99900, Decision no= 0.99900, Noise=1.15, Sac. prog =125 msTrial Time (ms)
FIXATION
brag
FIXATION
none
FIXATION
mean
FIXATION
bloc
FIXATION
hilt
FIXATION
hair
EBL
EBL
EBL
EBL
EBL
EBLsampling sampling sampling sampling sampling sampling
prog prog prog prog prog prog
MOTOR All words!!
Lewis (University of Michigan) Optimal Control Approaches 21 June 2012
Modeling eye-movements Task and model
Sample model behavior
0 500 1000 1500 2000
INCORRECT WORD Trial
Saccade threshold=0.92000, Decision yes= 0.99900, Decision no= 0.99900, Noise=1.15, Sac. prog =125 msTrial Time (ms)
FIXATION
lard
FIXATION
sane
FIXATION
helm toot east lost
EBL
EBL
EBLsampling sampling sampling
prog prog prog
MOTOR Not all words!!
Lewis (University of Michigan) Optimal Control Approaches 21 June 2012
Modeling eye-movements Task and model
Generating predictions
Selecting a policy (pair of thresholds) “programs” themachine to perform the task.
Policy selected through payoff optimization—not data fitting.
Optimal OptimalPayoff condition Saccade Threshold Response Threshold
Accuracy 0.99 0.999Balanced 0.97 0.999
Speed 0.92 0.99
I.e, π∗speed = (0.92, 0.99) and so on. With fixed policy, machine generates
dozens of behavioral measures (e.g. RTs, errors, RTs for accurate vs./
inaccurate, fixation durations for words, nonwords, frequency effects, . . . )
Lewis (University of Michigan) Optimal Control Approaches 21 June 2012
Modeling eye-movements Task and model
Generating predictions
Selecting a policy (pair of thresholds) “programs” themachine to perform the task.
Policy selected through payoff optimization—not data fitting.
Optimal OptimalPayoff condition Saccade Threshold Response Threshold
Accuracy 0.99 0.999Balanced 0.97 0.999
Speed 0.92 0.99
I.e, π∗speed = (0.92, 0.99) and so on. With fixed policy, machine generates
dozens of behavioral measures (e.g. RTs, errors, RTs for accurate vs./
inaccurate, fixation durations for words, nonwords, frequency effects, . . . )
Lewis (University of Michigan) Optimal Control Approaches 21 June 2012
Modeling eye-movements Task and model
Finding the sweet spot: Payoff as function of thresholds
π∗ = argmaxπ∈ΠEtrial∼ExperimentU(π, trial) (1)
0.80 0.85 0.90 0.95 1.00
14
16
18
20
22
24
26
ACC payoff vs. Saccade Threshold
Saccade Threshold
Accu
racy
Pay
off
(MODEL, noise=1.20)
●(0.9900, 0.9990)
0.80 0.85 0.90 0.95 1.00
14
16
18
20
22
24
26
BAL payoff vs. Saccade Threshold
Saccade Threshold
Bala
nce
Payo
ff
(MODEL, noise=1.20)
●(0.9700, 0.9990)
●
0.80 0.85 0.90 0.95 1.00
14
16
18
20
22
24
26
SPD payoff vs. Saccade Threshold
Saccade Threshold
Spee
d Pa
yoff
(MODEL, noise=1.20)
●(0.9200, 0.9900)
0.80 0.85 0.90 0.95 1.00
−10
0
10
20
ACC payoff vs. Decision Threshold
Decision Threshold
Accu
racy
Pay
off
(MODEL, noise=1.20)
●(0.9900, 0.9990)
0.80 0.85 0.90 0.95 1.00
−10
0
10
20
BAL payoff vs. Decision Threshold
Decision Threshold
Bala
nce
Payo
ff
(MODEL, noise=1.20)
●(0.9700, 0.9990)
0.80 0.85 0.90 0.95 1.00
−10
0
10
20
SPD payoff vs. Decision Threshold
Decision Threshold
Spee
d Pa
yoff
(MODEL, noise=1.20)
●(0.9200, 0.9900)Lewis (University of Michigan) Optimal Control Approaches 21 June 2012
Modeling eye-movements Task and model
Payoff as a function of behavior
0.80 0.85 0.90 0.95 1.00
200
250
300
350
SFD vs. Saccade Threshold
Saccade Threshold
Sing
le F
ixatio
n Du
ratio
n (m
s)
(MODEL, noise=1.20)
●
●
●
0.80 0.85 0.90 0.95 1.00
1000
1200
1400
1600
1800
2000
Trial RT vs. Saccade Threshold
Saccade Threshold
Tria
l Res
pons
e Ti
me
(ms)
(MODEL, noise=1.20)
●
●●
0.80 0.85 0.90 0.95 1.00
0.75
0.80
0.85
0.90
0.95
1.00
% Correct vs. Saccade Threshold
Saccade Threshold
% C
orre
ct
(MODEL, noise=1.20)
●
●●
200 250 300 350
18
20
22
24
26
Payoffs vs. SFD
Single Fixation Duration (ms)
Payo
ff
(MODEL, noise=1.20)
●
●
●
1300 1500 1700 1900
18
20
22
24
26
Payoffs vs. Trial RT
Trial Response Time (ms)
Payo
ff
(MODEL, noise=1.20)
●
●
●
0 2 4 6 8 10 12
18
20
22
24
26
Payoffs vs. Frequency Effect
Log Frequency Effect on SFD
Payo
ff
(MODEL, noise=1.20)
●
●
●
Lewis (University of Michigan) Optimal Control Approaches 21 June 2012
Modeling eye-movements Model vs. human
Model and human at level of trial
●
1000
1200
1400
1600
1800
2000
2200
Response Time for Word Trials
Tria
l Res
pons
e Ti
me
(ms)
(HUMAN)
Accuracy Balance Speed
●
●●
●
●
●
CORRECT trialINCORRECT trial
1000
1200
1400
1600
1800
2000
2200
Response Time for Word Trials
Tria
l Res
pons
e Ti
me
(ms)
(MODEL, noise=1.20)
Accuracy Balance Speed
●
●●
●
●●●●
●
●
●
●●●●
●
●●
●
●●
●●
●●
●●
●
●
●
●
●
●●●●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
1000
1200
1400
1600
1800
2000
2200
Response Time for Nonword Trials
Tria
l Res
pons
e Ti
me
(ms)
(HUMAN)
Accuracy Balance Speed
●
●●
●
●●
CORRECT trialINCORRECT trial
1000
1200
1400
1600
1800
2000
2200
Response Time for Nonword Trials
Tria
l Res
pons
e Ti
me
(ms)
(MODEL, noise=1.20)
Accuracy Balance Speed
●
●●
●
●●●●●●
●●●●●
●
●●
●
●●
●
●
●
●
●
●
●
●●●●
●●
●
●
●
●
●
●●
●●●●●●●●●●●●
●●
●
●
●
●
●
●
●
85
90
95
100
Percent Correct
Perc
ent C
orre
ct
(HUMAN)
Accuracy Balance Speed
●
●●
85
90
95
100
Percent Correct
Perc
ent C
orre
ct
(MODEL, noise=1.20)
Accuracy Balance Speed
●●●●
●●●●●●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
1000
1200
1400
1600
1800
2000
2200
Response Time for Word Trials
Tria
l Res
pons
e Ti
me
(ms)
(HUMAN)
Accuracy Balance Speed
●
●●
●
●
●
CORRECT trialINCORRECT trial
1000
1200
1400
1600
1800
2000
2200
Response Time for Word Trials
Tria
l Res
pons
e Ti
me
(ms)
(MODEL, noise=1.20)
Accuracy Balance Speed
●
●●
●
●●●●
●
●
●
●●●●
●
●●
●
●●
●●
●●
●●
●
●
●
●
●
●●●●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
1000
1200
1400
1600
1800
2000
2200
Response Time for Nonword Trials
Tria
l Res
pons
e Ti
me
(ms)
(HUMAN)
Accuracy Balance Speed
●
●●
●
●●
CORRECT trialINCORRECT trial
1000
1200
1400
1600
1800
2000
2200
Response Time for Nonword Trials
Tria
l Res
pons
e Ti
me
(ms)
(MODEL, noise=1.20)
Accuracy Balance Speed
●
●●
●
●●●●●●
●●●●●
●
●●
●
●●
●
●
●
●
●
●
●
●●●●
●●
●
●
●
●
●
●●
●●●●●●●●●●●●
●●
●
●
●
●
●
●
●
85
90
95
100
Percent Correct
Perc
ent C
orre
ct
(HUMAN)
Accuracy Balance Speed
●
●●
85
90
95
100
Percent Correct
Perc
ent C
orre
ct
(MODEL, noise=1.20)
Accuracy Balance Speed
●●●●
●●●●●●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
Lewis (University of Michigan) Optimal Control Approaches 21 June 2012
Modeling eye-movements Model vs. human
Model and human at level of trial
●
1000
1200
1400
1600
1800
2000
2200
Response Time for Word Trials
Tria
l Res
pons
e Ti
me
(ms)
(HUMAN)
Accuracy Balance Speed
●
●●
●
●
●
CORRECT trialINCORRECT trial
1000
1200
1400
1600
1800
2000
2200
Response Time for Word Trials
Tria
l Res
pons
e Ti
me
(ms)
(MODEL, noise=1.20)
Accuracy Balance Speed
●
●●
●
●●●●
●
●
●
●●●●
●
●●
●
●●
●●
●●
●●
●
●
●
●
●
●●●●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
1000
1200
1400
1600
1800
2000
2200
Response Time for Nonword Trials
Tria
l Res
pons
e Ti
me
(ms)
(HUMAN)
Accuracy Balance Speed
●
●●
●
●●
CORRECT trialINCORRECT trial
1000
1200
1400
1600
1800
2000
2200
Response Time for Nonword Trials
Tria
l Res
pons
e Ti
me
(ms)
(MODEL, noise=1.20)
Accuracy Balance Speed
●
●●
●
●●●●●●
●●●●●
●
●●
●
●●
●
●
●
●
●
●
●
●●●●
●●
●
●
●
●
●
●●
●●●●●●●●●●●●
●●
●
●
●
●
●
●
●
85
90
95
100
Percent Correct
Perc
ent C
orre
ct
(HUMAN)
Accuracy Balance Speed
●
●●
85
90
95
100
Percent Correct
Perc
ent C
orre
ct
(MODEL, noise=1.20)
Accuracy Balance Speed
●●●●
●●●●●●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
1000
1200
1400
1600
1800
2000
2200
Response Time for Word Trials
Tria
l Res
pons
e Ti
me
(ms)
(HUMAN)
Accuracy Balance Speed
●
●●
●
●
●
CORRECT trialINCORRECT trial
1000
1200
1400
1600
1800
2000
2200
Response Time for Word Trials
Tria
l Res
pons
e Ti
me
(ms)
(MODEL, noise=1.20)
Accuracy Balance Speed
●
●●
●
●●●●
●
●
●
●●●●
●
●●
●
●●
●●
●●
●●
●
●
●
●
●
●●●●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
1000
1200
1400
1600
1800
2000
2200
Response Time for Nonword Trials
Tria
l Res
pons
e Ti
me
(ms)
(HUMAN)
Accuracy Balance Speed
●
●●
●
●●
CORRECT trialINCORRECT trial
1000
1200
1400
1600
1800
2000
2200
Response Time for Nonword Trials
Tria
l Res
pons
e Ti
me
(ms)
(MODEL, noise=1.20)
Accuracy Balance Speed
●
●●
●
●●●●●●
●●●●●
●
●●
●
●●
●
●
●
●
●
●
●
●●●●
●●
●
●
●
●
●
●●
●●●●●●●●●●●●
●●
●
●
●
●
●
●
●
85
90
95
100
Percent Correct
Perc
ent C
orre
ct
(HUMAN)
Accuracy Balance Speed
●
●●
85
90
95
100
Percent Correct
Perc
ent C
orre
ct
(MODEL, noise=1.20)
Accuracy Balance Speed
●●●●
●●●●●●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
Lewis (University of Michigan) Optimal Control Approaches 21 June 2012
Modeling eye-movements Model vs. human
Model and human at level of word/string
●
200
220
240
260
280
300
Single Fixation Duration
Sing
le F
ixatio
n Du
ratio
n (m
s)
(HUMAN)
Accuracy Balance Speed
●
●
●
200
220
240
260
280
300
Single Fixation Duration
Sing
le F
ixatio
n Du
ratio
n (m
s)
(MODEL, noise=1.20)
Accuracy Balance Speed
●●
●●
●
●
●
●
●
●
●
●●
●●●
●●●
●●●
●●
●●
●●
●
●
●
●
200
220
240
260
280
300
SFD by Frequency Class
Sing
le F
ixatio
n Du
ratio
n (m
s)
(HUMAN)
Accuracy Balance Speed
●
●
●●
●
●
LOW frequencyHIGH frequency
200
220
240
260
280
300
SFD by Frequency Class
Sing
le F
ixatio
n Du
ratio
n (m
s)
(MODEL, noise=1.20)
Accuracy Balance Speed
●●
●●
●
●
●
●
●
●
●
●●
●●●
●●●
●●●
●●
●●
●●
●●
●●
●
●
●
●
●
●
●
●●
●●●
●●●
●●●
●●
●●
●●
●
●
●
●
●
●
●
3
4
5
6
7
8
9
Frequency Effect on SFD
Log
Freq
uenc
y Ef
fect
on
SFD
(ms)
(HUMAN)
Accuracy Balance Speed
●
●
●
3
4
5
6
7
8
9
Frequency Effect on SFD
Log
Freq
uenc
y Ef
fect
on
SFD
(ms)
(MODEL, noise=1.20)
Accuracy Balance Speed
●●
●●
●
●
●
●
●
●
●
●●
●●●
●●●
●●●
●●
●●
●●
●
●
●
●
200
220
240
260
280
300
Single Fixation Duration
Sing
le F
ixatio
n Du
ratio
n (m
s)
(HUMAN)
Accuracy Balance Speed
●
●
●
200
220
240
260
280
300
Single Fixation Duration
Sing
le F
ixatio
n Du
ratio
n (m
s)
(MODEL, noise=1.20)
Accuracy Balance Speed
●●
●●
●
●
●
●
●
●
●
●●
●●●
●●●
●●●
●●
●●
●●
●
●
●
●
200
220
240
260
280
300
SFD by Frequency Class
Sing
le F
ixatio
n Du
ratio
n (m
s)
(HUMAN)
Accuracy Balance Speed
●
●
●●
●
●
LOW frequencyHIGH frequency
200
220
240
260
280
300
SFD by Frequency Class
Sing
le F
ixatio
n Du
ratio
n (m
s)
(MODEL, noise=1.20)
Accuracy Balance Speed
●●
●●
●
●
●
●
●
●
●
●●
●●●
●●●
●●●
●●
●●
●●
●●
●●
●
●
●
●
●
●
●
●●
●●●
●●●
●●●
●●
●●
●●
●
●
●
●
●
●
●
3
4
5
6
7
8
9
Frequency Effect on SFD
Log
Freq
uenc
y Ef
fect
on
SFD
(ms)
(HUMAN)
Accuracy Balance Speed
●
●
●
3
4
5
6
7
8
9
Frequency Effect on SFD
Log
Freq
uenc
y Ef
fect
on
SFD
(ms)
(MODEL, noise=1.20)
Accuracy Balance Speed
●●
●●
●
●
●
●
●
●
●
●●
●●●
●●●
●●●
●●
●●
●●
●
●
●
Lewis (University of Michigan) Optimal Control Approaches 21 June 2012
Modeling eye-movements Model vs. human
Model and human at level of word/string
●
200
220
240
260
280
300
Single Fixation Duration
Sing
le F
ixatio
n Du
ratio
n (m
s)
(HUMAN)
Accuracy Balance Speed
●
●
●
200
220
240
260
280
300
Single Fixation Duration
Sing
le F
ixatio
n Du
ratio
n (m
s)
(MODEL, noise=1.20)
Accuracy Balance Speed
●●
●●
●
●
●
●
●
●
●
●●
●●●
●●●
●●●
●●
●●
●●
●
●
●
●
200
220
240
260
280
300
SFD by Frequency Class
Sing
le F
ixatio
n Du
ratio
n (m
s)
(HUMAN)
Accuracy Balance Speed
●
●
●●
●
●
LOW frequencyHIGH frequency
200
220
240
260
280
300
SFD by Frequency Class
Sing
le F
ixatio
n Du
ratio
n (m
s)
(MODEL, noise=1.20)
Accuracy Balance Speed
●●
●●
●
●
●
●
●
●
●
●●
●●●
●●●
●●●
●●
●●
●●
●●
●●
●
●
●
●
●
●
●
●●
●●●
●●●
●●●
●●
●●
●●
●
●
●
●
●
●
●
3
4
5
6
7
8
9
Frequency Effect on SFD
Log
Freq
uenc
y Ef
fect
on
SFD
(ms)
(HUMAN)
Accuracy Balance Speed
●
●
●
3
4
5
6
7
8
9
Frequency Effect on SFD
Log
Freq
uenc
y Ef
fect
on
SFD
(ms)
(MODEL, noise=1.20)
Accuracy Balance Speed
●●
●●
●
●
●
●
●
●
●
●●
●●●
●●●
●●●
●●
●●
●●
●
●
●
●
200
220
240
260
280
300
Single Fixation Duration
Sing
le F
ixatio
n Du
ratio
n (m
s)
(HUMAN)
Accuracy Balance Speed
●
●
●
200
220
240
260
280
300
Single Fixation Duration
Sing
le F
ixatio
n Du
ratio
n (m
s)
(MODEL, noise=1.20)
Accuracy Balance Speed
●●
●●
●
●
●
●
●
●
●
●●
●●●
●●●
●●●
●●
●●
●●
●
●
●
●
200
220
240
260
280
300
SFD by Frequency Class
Sing
le F
ixatio
n Du
ratio
n (m
s)
(HUMAN)
Accuracy Balance Speed
●
●
●●
●
●
LOW frequencyHIGH frequency
200
220
240
260
280
300
SFD by Frequency Class
Sing
le F
ixatio
n Du
ratio
n (m
s)
(MODEL, noise=1.20)
Accuracy Balance Speed
●●
●●
●
●
●
●
●
●
●
●●
●●●
●●●
●●●
●●
●●
●●
●●
●●
●
●
●
●
●
●
●
●●
●●●
●●●
●●●
●●
●●
●●
●
●
●
●
●
●
●
3
4
5
6
7
8
9
Frequency Effect on SFD
Log
Freq
uenc
y Ef
fect
on
SFD
(ms)
(HUMAN)
Accuracy Balance Speed
●
●
●
3
4
5
6
7
8
9
Frequency Effect on SFD
Log
Freq
uenc
y Ef
fect
on
SFD
(ms)
(MODEL, noise=1.20)
Accuracy Balance Speed
●●
●●
●
●
●
●
●
●
●
●●
●●●
●●●
●●●
●●
●●
●●
●
●
●
Lewis (University of Michigan) Optimal Control Approaches 21 June 2012
Modeling eye-movements Model vs. human
Model and human: Words vs. nonwords, position effects
Modeling eye-movements Model vs. human
Model and human at level of word/string
●
200
250
300
350
400
Word SFD by correctness
Sing
le F
ixatio
n Du
ratio
n (m
s)
(HUMAN)
Accuracy Balance Speed
●
●●
●
●●
CORRECT trialsINCORRECT trials
200
250
300
350
400
Word SFD by correctness
Sing
le F
ixatio
n Du
ratio
n (m
s)
(MODEL, noise=1.20)
Accuracy Balance Speed
●●
●●
●
●
●
●
●
●
●
●●
●●●
●●●
●●●
●●
●●
●●
●●
●●
●
●
●
●
●
●
●
●●
●●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
200
250
300
350
400
Nonword SFD by correctness
Sing
le F
ixatio
n Du
ratio
n (m
s)
(HUMAN)
Accuracy Balance Speed
●
●
●
●
●
●
CORRECT trialsINCORRECT trials
200
250
300
350
400
Nonword SFD by correctness
Sing
le F
ixatio
n Du
ratio
n (m
s)
(MODEL, noise=1.20)
Accuracy Balance Speed
●●
●●
●
●
●
●
●
●
●
●●
●●●
●●●
●●●
●●
●●
●●
●●
●●
●●
●●
●
●
●
●●
●●●●●●
●●●●●
●●●●
●
●
●
●
●
●
●
220
240
260
280
300
320
Word SFD by Position in List
Position in List
Sing
le F
ixatio
n Du
ratio
n (m
s)
(HUMAN)
2 3 4 5
●
●●
●
●● ●
●
●
● ●●
ACCURACYBALANCESPEED
220
240
260
280
300
320
Word SFD by Position in List
Sing
le F
ixatio
n Du
ratio
n (m
s)
(MODEL, noise=1.20)
2 3 4 5
●●
●●
●●
●●
●●
●●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●●
●●●
●●●
●●
●●
●●
●●
●●●
●●●
●●●
●●
●●
●●
●
●●
●●●
●●●
●●●
●●
●●
●●
●
●●
●●●
●●●
●●●
●●
●●
●●
● ● ● ●
● ● ● ●
●●
●●
●
200
250
300
350
400
Word SFD by correctness
Sing
le F
ixatio
n Du
ratio
n (m
s)
(HUMAN)
Accuracy Balance Speed
●
●●
●
●●
CORRECT trialsINCORRECT trials
200
250
300
350
400
Word SFD by correctness
Sing
le F
ixatio
n Du
ratio
n (m
s)
(MODEL, noise=1.20)
Accuracy Balance Speed
●●
●●
●
●
●
●
●
●
●
●●
●●●
●●●
●●●
●●
●●
●●
●●
●●
●
●
●
●
●
●
●
●●
●●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
200
250
300
350
400
Nonword SFD by correctness
Sing
le F
ixatio
n Du
ratio
n (m
s)
(HUMAN)
Accuracy Balance Speed
●
●
●
●
●
●
CORRECT trialsINCORRECT trials
200
250
300
350
400
Nonword SFD by correctness
Sing
le F
ixatio
n Du
ratio
n (m
s)
(MODEL, noise=1.20)
Accuracy Balance Speed
●●
●●
●
●
●
●
●
●
●
●●
●●●
●●●
●●●
●●
●●
●●
●●
●●
●●
●●
●
●
●
●●
●●●●●●
●●●●●
●●●●
●
●
●
●
●
●
●
220
240
260
280
300
320
Word SFD by Position in List
Position in List
Sing
le F
ixatio
n Du
ratio
n (m
s)
(HUMAN)
2 3 4 5
●
●●
●
●● ●
●
●
● ●●
ACCURACYBALANCESPEED
220
240
260
280
300
320
Word SFD by Position in List
Sing
le F
ixatio
n Du
ratio
n (m
s)
(MODEL, noise=1.20)
2 3 4 5
●●
●●
●●
●●
●●
●●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●●
●●●
●●●
●●
●●
●●
●●
●●●
●●●
●●●
●●
●●
●●
●
●●
●●●
●●●
●●●
●●
●●
●●
●
●●
●●●
●●●
●●●
●●
●●
●●
● ● ● ●
● ● ● ●
●●
●●
Lewis (University of Michigan) Optimal Control Approaches 5 May 2012 42 / 54Lewis (University of Michigan) Optimal Control Approaches 21 June 2012
Modeling eye-movements Model vs. human
Model and human: Words vs. nonwords, position effects
Modeling eye-movements Model vs. human
Model and human at level of word/string
●
200
250
300
350
400
Word SFD by correctness
Sing
le F
ixatio
n Du
ratio
n (m
s)
(HUMAN)
Accuracy Balance Speed
●
●●
●
●●
CORRECT trialsINCORRECT trials
200
250
300
350
400
Word SFD by correctness
Sing
le F
ixatio
n Du
ratio
n (m
s)
(MODEL, noise=1.20)
Accuracy Balance Speed
●●
●●
●
●
●
●
●
●
●
●●
●●●
●●●
●●●
●●
●●
●●
●●
●●
●
●
●
●
●
●
●
●●
●●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
200
250
300
350
400
Nonword SFD by correctness
Sing
le F
ixatio
n Du
ratio
n (m
s)
(HUMAN)
Accuracy Balance Speed
●
●
●
●
●
●
CORRECT trialsINCORRECT trials
200
250
300
350
400
Nonword SFD by correctness
Sing
le F
ixatio
n Du
ratio
n (m
s)
(MODEL, noise=1.20)
Accuracy Balance Speed
●●
●●
●
●
●
●
●
●
●
●●
●●●
●●●
●●●
●●
●●
●●
●●
●●
●●
●●
●
●
●
●●
●●●●●●
●●●●●
●●●●
●
●
●
●
●
●
●
220
240
260
280
300
320
Word SFD by Position in List
Position in List
Sing
le F
ixatio
n Du
ratio
n (m
s)
(HUMAN)
2 3 4 5
●
●●
●
●● ●
●
●
● ●●
ACCURACYBALANCESPEED
220
240
260
280
300
320
Word SFD by Position in List
Sing
le F
ixatio
n Du
ratio
n (m
s)
(MODEL, noise=1.20)
2 3 4 5
●●
●●
●●
●●
●●
●●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●●
●●●
●●●
●●
●●
●●
●●
●●●
●●●
●●●
●●
●●
●●
●
●●
●●●
●●●
●●●
●●
●●
●●
●
●●
●●●
●●●
●●●
●●
●●
●●
● ● ● ●
● ● ● ●
●●
●●
●
200
250
300
350
400
Word SFD by correctness
Sing
le F
ixatio
n Du
ratio
n (m
s)
(HUMAN)
Accuracy Balance Speed
●
●●
●
●●
CORRECT trialsINCORRECT trials
200
250
300
350
400
Word SFD by correctness
Sing
le F
ixatio
n Du
ratio
n (m
s)
(MODEL, noise=1.20)
Accuracy Balance Speed
●●
●●
●
●
●
●
●
●
●
●●
●●●
●●●
●●●
●●
●●
●●
●●
●●
●
●
●
●
●
●
●
●●
●●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
200
250
300
350
400
Nonword SFD by correctness
Sing
le F
ixatio
n Du
ratio
n (m
s)
(HUMAN)
Accuracy Balance Speed
●
●
●
●
●
●
CORRECT trialsINCORRECT trials
200
250
300
350
400
Nonword SFD by correctness
Sing
le F
ixatio
n Du
ratio
n (m
s)
(MODEL, noise=1.20)
Accuracy Balance Speed
●●
●●
●
●
●
●
●
●
●
●●
●●●
●●●
●●●
●●
●●
●●
●●
●●
●●
●●
●
●
●
●●
●●●●●●
●●●●●
●●●●
●
●
●
●
●
●
●
220
240
260
280
300
320
Word SFD by Position in List
Position in List
Sing
le F
ixatio
n Du
ratio
n (m
s)
(HUMAN)
2 3 4 5
●
●●
●
●● ●
●
●
● ●●
ACCURACYBALANCESPEED
220
240
260
280
300
320
Word SFD by Position in List
Sing
le F
ixatio
n Du
ratio
n (m
s)
(MODEL, noise=1.20)
2 3 4 5
●●
●●
●●
●●
●●
●●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●●
●●●
●●●
●●
●●
●●
●●
●●●
●●●
●●●
●●
●●
●●
●
●●
●●●
●●●
●●●
●●
●●
●●
●
●●
●●●
●●●
●●●
●●
●●
●●
● ● ● ●
● ● ● ●
●●
●●
Lewis (University of Michigan) Optimal Control Approaches 5 May 2012 42 / 54Lewis (University of Michigan) Optimal Control Approaches 21 June 2012
Modeling eye-movements Model vs. human
Model and human: Words vs. nonwords, position effects
Modeling eye-movements Model vs. human
Model and human at level of word/string
●
200
250
300
350
400
Word SFD by correctness
Sing
le F
ixatio
n Du
ratio
n (m
s)
(HUMAN)
Accuracy Balance Speed
●
●●
●
●●
CORRECT trialsINCORRECT trials
200
250
300
350
400
Word SFD by correctness
Sing
le F
ixatio
n Du
ratio
n (m
s)
(MODEL, noise=1.20)
Accuracy Balance Speed
●●
●●
●
●
●
●
●
●
●
●●
●●●
●●●
●●●
●●
●●
●●
●●
●●
●
●
●
●
●
●
●
●●
●●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
200
250
300
350
400
Nonword SFD by correctness
Sing
le F
ixatio
n Du
ratio
n (m
s)
(HUMAN)
Accuracy Balance Speed
●
●
●
●
●
●
CORRECT trialsINCORRECT trials
200
250
300
350
400
Nonword SFD by correctness
Sing
le F
ixatio
n Du
ratio
n (m
s)
(MODEL, noise=1.20)
Accuracy Balance Speed
●●
●●
●
●
●
●
●
●
●
●●
●●●
●●●
●●●
●●
●●
●●
●●
●●
●●
●●
●
●
●
●●
●●●●●●
●●●●●
●●●●
●
●
●
●
●
●
●
220
240
260
280
300
320
Word SFD by Position in List
Position in List
Sing
le F
ixatio
n Du
ratio
n (m
s)
(HUMAN)
2 3 4 5
●
●●
●
●● ●
●
●
● ●●
ACCURACYBALANCESPEED
220
240
260
280
300
320
Word SFD by Position in List
Sing
le F
ixatio
n Du
ratio
n (m
s)
(MODEL, noise=1.20)
2 3 4 5
●●
●●
●●
●●
●●
●●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●●
●●●
●●●
●●
●●
●●
●●
●●●
●●●
●●●
●●
●●
●●
●
●●
●●●
●●●
●●●
●●
●●
●●
●
●●
●●●
●●●
●●●
●●
●●
●●
● ● ● ●
● ● ● ●
●●
●●
●
200
250
300
350
400
Word SFD by correctness
Sing
le F
ixatio
n Du
ratio
n (m
s)
(HUMAN)
Accuracy Balance Speed
●
●●
●
●●
CORRECT trialsINCORRECT trials
200
250
300
350
400
Word SFD by correctness
Sing
le F
ixatio
n Du
ratio
n (m
s)
(MODEL, noise=1.20)
Accuracy Balance Speed
●●
●●
●
●
●
●
●
●
●
●●
●●●
●●●
●●●
●●
●●
●●
●●
●●
●
●
●
●
●
●
●
●●
●●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
200
250
300
350
400
Nonword SFD by correctness
Sing
le F
ixatio
n Du
ratio
n (m
s)
(HUMAN)
Accuracy Balance Speed
●
●
●
●
●
●
CORRECT trialsINCORRECT trials
200
250
300
350
400
Nonword SFD by correctness
Sing
le F
ixatio
n Du
ratio
n (m
s)
(MODEL, noise=1.20)
Accuracy Balance Speed
●●
●●
●
●
●
●
●
●
●
●●
●●●
●●●
●●●
●●
●●
●●
●●
●●
●●
●●
●
●
●
●●
●●●●●●
●●●●●
●●●●
●
●
●
●
●
●
●
220
240
260
280
300
320
Word SFD by Position in List
Position in List
Sing
le F
ixatio
n Du
ratio
n (m
s)
(HUMAN)
2 3 4 5
●
●●
●
●● ●
●
●
● ●●
ACCURACYBALANCESPEED
220
240
260
280
300
320
Word SFD by Position in List
Sing
le F
ixatio
n Du
ratio
n (m
s)
(MODEL, noise=1.20)
2 3 4 5
●●
●●
●●
●●
●●
●●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●●
●●●
●●●
●●
●●
●●
●●
●●●
●●●
●●●
●●
●●
●●
●
●●
●●●
●●●
●●●
●●
●●
●●
●
●●
●●●
●●●
●●●
●●
●●
●●
● ● ● ●
● ● ● ●
●●
●●
Lewis (University of Michigan) Optimal Control Approaches 5 May 2012 42 / 54Lewis (University of Michigan) Optimal Control Approaches 21 June 2012
Modeling eye-movements Model vs. human
Model and human: Words vs. nonwords, position effects
Modeling eye-movements Model vs. human
Model and human at level of word/string
●
200
250
300
350
400
Word SFD by correctness
Sing
le F
ixatio
n Du
ratio
n (m
s)
(HUMAN)
Accuracy Balance Speed
●
●●
●
●●
CORRECT trialsINCORRECT trials
200
250
300
350
400
Word SFD by correctness
Sing
le F
ixatio
n Du
ratio
n (m
s)
(MODEL, noise=1.20)
Accuracy Balance Speed
●●
●●
●
●
●
●
●
●
●
●●
●●●
●●●
●●●
●●
●●
●●
●●
●●
●
●
●
●
●
●
●
●●
●●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
200
250
300
350
400
Nonword SFD by correctness
Sing
le F
ixatio
n Du
ratio
n (m
s)
(HUMAN)
Accuracy Balance Speed
●
●
●
●
●
●
CORRECT trialsINCORRECT trials
200
250
300
350
400
Nonword SFD by correctness
Sing
le F
ixatio
n Du
ratio
n (m
s)
(MODEL, noise=1.20)
Accuracy Balance Speed
●●
●●
●
●
●
●
●
●
●
●●
●●●
●●●
●●●
●●
●●
●●
●●
●●
●●
●●
●
●
●
●●
●●●●●●
●●●●●
●●●●
●
●
●
●
●
●
●
220
240
260
280
300
320
Word SFD by Position in List
Position in List
Sing
le F
ixatio
n Du
ratio
n (m
s)
(HUMAN)
2 3 4 5
●
●●
●
●● ●
●
●
● ●●
ACCURACYBALANCESPEED
220
240
260
280
300
320
Word SFD by Position in List
Sing
le F
ixatio
n Du
ratio
n (m
s)
(MODEL, noise=1.20)
2 3 4 5
●●
●●
●●
●●
●●
●●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●●
●●●
●●●
●●
●●
●●
●●
●●●
●●●
●●●
●●
●●
●●
●
●●
●●●
●●●
●●●
●●
●●
●●
●
●●
●●●
●●●
●●●
●●
●●
●●
● ● ● ●
● ● ● ●
●●
●●
●
200
250
300
350
400
Word SFD by correctness
Sing
le F
ixatio
n Du
ratio
n (m
s)
(HUMAN)
Accuracy Balance Speed
●
●●
●
●●
CORRECT trialsINCORRECT trials
200
250
300
350
400
Word SFD by correctness
Sing
le F
ixatio
n Du
ratio
n (m
s)
(MODEL, noise=1.20)
Accuracy Balance Speed
●●
●●
●
●
●
●
●
●
●
●●
●●●
●●●
●●●
●●
●●
●●
●●
●●
●
●
●
●
●
●
●
●●
●●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
200
250
300
350
400
Nonword SFD by correctness
Sing
le F
ixatio
n Du
ratio
n (m
s)
(HUMAN)
Accuracy Balance Speed
●
●
●
●
●
●
CORRECT trialsINCORRECT trials
200
250
300
350
400
Nonword SFD by correctness
Sing
le F
ixatio
n Du
ratio
n (m
s)
(MODEL, noise=1.20)
Accuracy Balance Speed
●●
●●
●
●
●
●
●
●
●
●●
●●●
●●●
●●●
●●
●●
●●
●●
●●
●●
●●
●
●
●
●●
●●●●●●
●●●●●
●●●●
●
●
●
●
●
●
●
220
240
260
280
300
320
Word SFD by Position in List
Position in List
Sing
le F
ixatio
n Du
ratio
n (m
s)
(HUMAN)
2 3 4 5
●
●●
●
●● ●
●
●
● ●●
ACCURACYBALANCESPEED
220
240
260
280
300
320
Word SFD by Position in List
Sing
le F
ixatio
n Du
ratio
n (m
s)
(MODEL, noise=1.20)
2 3 4 5
●●
●●
●●
●●
●●
●●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●●
●●●
●●●
●●
●●
●●
●●
●●●
●●●
●●●
●●
●●
●●
●
●●
●●●
●●●
●●●
●●
●●
●●
●
●●
●●●
●●●
●●●
●●
●●
●●
● ● ● ●
● ● ● ●
●●
●●
Lewis (University of Michigan) Optimal Control Approaches 5 May 2012 42 / 54Lewis (University of Michigan) Optimal Control Approaches 21 June 2012
Modeling eye-movements Model vs. human
Model and human: Words vs. nonwords, position effects
Modeling eye-movements Model vs. human
Model and human at level of word/string
●
200
250
300
350
400
Word SFD by correctness
Sing
le F
ixatio
n Du
ratio
n (m
s)
(HUMAN)
Accuracy Balance Speed
●
●●
●
●●
CORRECT trialsINCORRECT trials
200
250
300
350
400
Word SFD by correctness
Sing
le F
ixatio
n Du
ratio
n (m
s)
(MODEL, noise=1.20)
Accuracy Balance Speed
●●
●●
●
●
●
●
●
●
●
●●
●●●
●●●
●●●
●●
●●
●●
●●
●●
●
●
●
●
●
●
●
●●
●●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
200
250
300
350
400
Nonword SFD by correctness
Sing
le F
ixatio
n Du
ratio
n (m
s)
(HUMAN)
Accuracy Balance Speed
●
●
●
●
●
●
CORRECT trialsINCORRECT trials
200
250
300
350
400
Nonword SFD by correctness
Sing
le F
ixatio
n Du
ratio
n (m
s)
(MODEL, noise=1.20)
Accuracy Balance Speed
●●
●●
●
●
●
●
●
●
●
●●
●●●
●●●
●●●
●●
●●
●●
●●
●●
●●
●●
●
●
●
●●
●●●●●●
●●●●●
●●●●
●
●
●
●
●
●
●
220
240
260
280
300
320
Word SFD by Position in List
Position in List
Sing
le F
ixatio
n Du
ratio
n (m
s)
(HUMAN)
2 3 4 5
●
●●
●
●● ●
●
●
● ●●
ACCURACYBALANCESPEED
220
240
260
280
300
320
Word SFD by Position in List
Sing
le F
ixatio
n Du
ratio
n (m
s)
(MODEL, noise=1.20)
2 3 4 5
●●
●●
●●
●●
●●
●●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●●
●●●
●●●
●●
●●
●●
●●
●●●
●●●
●●●
●●
●●
●●
●
●●
●●●
●●●
●●●
●●
●●
●●
●
●●
●●●
●●●
●●●
●●
●●
●●
● ● ● ●
● ● ● ●
●●
●●
●
200
250
300
350
400
Word SFD by correctness
Sing
le F
ixatio
n Du
ratio
n (m
s)
(HUMAN)
Accuracy Balance Speed
●
●●
●
●●
CORRECT trialsINCORRECT trials
200
250
300
350
400
Word SFD by correctness
Sing
le F
ixatio
n Du
ratio
n (m
s)
(MODEL, noise=1.20)
Accuracy Balance Speed
●●
●●
●
●
●
●
●
●
●
●●
●●●
●●●
●●●
●●
●●
●●
●●
●●
●
●
●
●
●
●
●
●●
●●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
200
250
300
350
400
Nonword SFD by correctness
Sing
le F
ixatio
n Du
ratio
n (m
s)
(HUMAN)
Accuracy Balance Speed
●
●
●
●
●
●
CORRECT trialsINCORRECT trials
200
250
300
350
400
Nonword SFD by correctness
Sing
le F
ixatio
n Du
ratio
n (m
s)
(MODEL, noise=1.20)
Accuracy Balance Speed
●●
●●
●
●
●
●
●
●
●
●●
●●●
●●●
●●●
●●
●●
●●
●●
●●
●●
●●
●
●
●
●●
●●●●●●
●●●●●
●●●●
●
●
●
●
●
●
●
220
240
260
280
300
320
Word SFD by Position in List
Position in List
Sing
le F
ixatio
n Du
ratio
n (m
s)
(HUMAN)
2 3 4 5
●
●●
●
●● ●
●
●
● ●●
ACCURACYBALANCESPEED
220
240
260
280
300
320
Word SFD by Position in List
Sing
le F
ixatio
n Du
ratio
n (m
s)
(MODEL, noise=1.20)
2 3 4 5
●●
●●
●●
●●
●●
●●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●●
●●●
●●●
●●
●●
●●
●●
●●●
●●●
●●●
●●
●●
●●
●
●●
●●●
●●●
●●●
●●
●●
●●
●
●●
●●●
●●●
●●●
●●
●●
●●
● ● ● ●
● ● ● ●
●●
●●
Lewis (University of Michigan) Optimal Control Approaches 5 May 2012 42 / 54Lewis (University of Michigan) Optimal Control Approaches 21 June 2012
Modeling eye-movements How architecture shapes adaptation
1 What determines the nature of eye-movements in linguistictasks?
The task and modelPredictions vs. human behaviorHow architecture shapes adaptation
2 Conclusions and looking ahead
Lewis (University of Michigan) Optimal Control Approaches 21 June 2012
Modeling eye-movements How architecture shapes adaptation
Does the processing architecture matter?
The theoretical claim here is that eye-movement control is jointlyshaped by both task payoff and architecture. What is the evidencefor this?
Through modeling we can explore adaptation to differentarchitectures than the one hypothesized for the human oculomotorsystem.
Lewis (University of Michigan) Optimal Control Approaches 21 June 2012
Modeling eye-movements How architecture shapes adaptation
The “minimal model”
Manualresponse Posterior Update Lexicon +
Model of experiment
Experiment Environment
Critic
Feedback
Noisy sample from position
(Bounded) Optimal Control
Initiate saccade program
Reward Function
Stimulus
Oculomotor System
Initiate press
Button press indicating trial response
s kI(·)
saccade program saccade eye-brain lag ⇠ Gamma(50ms)
⇠ Gamma(250ms)
0 10 20 30 40 50
0.00.2
0.40.6
0.81.0
RN.match
SD= 1.75, threshold=0.95Samples
Posterio
r
RT dist. for RN.match
RTFreq
uency
300 500 700
0500
10001
500200
0
0 10 20 30 40 50
0.00.2
0.40.6
0.81.0
RN.mismatch
SD= 1.75, threshold=0.95Samples
Posterio
r
RT dist. for RN.mismatch
RT
Frequen
cy
300 500 700
0500
1500
2500
0 10 20 30 40 50
0.00.2
0.40.6
0.81.0
YES.match
SD= 1.75, threshold=0.95Samples
Posterio
r
RT dist. for YES.match
RT
Frequen
cy
300 500 700
0100
0200
0300
0
0 10 20 30 40 50
0.00.2
0.40.6
0.81.0
NRN.match
SD= 2.00, threshold=0.95Samples
Posterio
r
RT dist. for NRN.match
RT
Frequen
cy
300 500 700
0100
0300
0
0 10 20 30 40 50
0.00.2
0.40.6
0.81.0
NRN.mismatch
SD= 2.00, threshold=0.95Samples
Posterio
r
RT dist. for NRN.mismatch
RT
Frequen
cy
400 500 600 700
0500
1500
2500
0 10 20 30 40 50
0.00.2
0.40.6
0.81.0
RN.match
SD= 2.00, threshold=0.95Samples
Posterio
r
RT dist. for RN.match
RT
Frequen
cy
300 500 700
0500
1000
1500
0 10 20 30 40 50
0.00.2
0.40.6
0.81.0
RN.mismatch
SD= 2.00, threshold=0.95Samples
Posterio
r
RT dist. for RN.mismatch
RT
Frequen
cy
300 500 700
0500
1500
2500
0 10 20 30 40 50
0.00.2
0.40.6
0.81.0
YES.match
SD= 2.00, threshold=0.95Samples
Posterio
r
RT dist. for YES.match
RT
Frequen
cy
300 500 700
0500
1500
2500
0 10 20 30 40 50
0.00.2
0.40.6
0.81.0
RN.mismatch
SD= 3.25, threshold=0.92Samples
Posterio
r
RT dist. for RN.mismatch
RT
Frequen
cy
300 500 700
0500
10001
500200
0
0 10 20 30 40 50
0.00.2
0.40.6
0.81.0
YES.match
SD= 3.25, threshold=0.92Samples
Posterio
r
RT dist. for YES.match
RT
Frequen
cy
300 500 700
0500
1000
1500
2000
0 10 20 30 40 50
0.00.2
0.40.6
0.81.0
NRN.match
SD= 3.50, threshold=0.92Samples
Posterio
r
RT dist. for NRN.match
RT
Frequen
cy
300 500 700
0500
1000
2000
0 10 20 30 40 50
0.00.2
0.40.6
0.81.0
NRN.mismatch
SD= 3.50, threshold=0.92Samples
Posterio
r
RT dist. for NRN.mismatch
RT
Frequen
cy
300 500 700
0500
1500
2500
0 10 20 30 40 50
0.00.2
0.40.6
0.81.0
RN.match
SD= 3.50, threshold=0.92Samples
Posterio
r
RT dist. for RN.match
RT
Frequen
cy
300 500 700
0100
0300
0
0 10 20 30 40 50
0.00.2
0.40.6
0.81.0
RN.mismatch
SD= 3.50, threshold=0.92Samples
Posterio
r
RT dist. for RN.mismatch
RT
Frequen
cy
300 500 700
0500
1500
2500
0 10 20 30 40 50
0.00.2
0.40.6
0.81.0
YES.match
SD= 3.50, threshold=0.92Samples
Posterio
rRT dist. for YES.match
RT
Frequen
cy
300 500 700
0500
1500
0 10 20 30 40 50
0.00.2
0.40.6
0.81.0
NRN.match
SD= 1.50, threshold=0.93Samples
Posterio
r
RT dist. for NRN.match
RT
Frequen
cy
350 400 450 500
0500
1500
2500
saccade decision
trial decision
⇠ Gamma(40ms)⇠ Gamma(125ms)
optimal thresholds
18 LEWIS, SHVARTSMAN & SINGH
Next, we update the string-level beliefs:
Prnew(S k = Wi|T ! N k, sk) =
=Pr(sk |S k = Wi,T ! N k)Prold(S k = Wi|T ! N k)
Prold(sk |T ! N k)
=Pr(sk |S k = Wi)Prold(S k = Wi|T ! N k)
Prold(sk |T ! N k)(4)
Prnew(S k = Ni|T = Nk, sk) =
=Pr(sk |S k = Ni,T = N k)Prold(S k = Ni|T = N k)
Prold(sk |T = N k)
=Pr(sk |S k = Ni)Prold(S k = Ni|T = N k)
Prold(sk |T = N k)(5)
Finally, we update the trial level beliefs:
Prnew(T =W|sk) =
=Prold(sk |T =W)Prold(T =W)
Prold(sk)=
=Prold(sk |T ! N k)Prold(T =W)
Prold(sk)
(6)
Prnew(T = N j!k |sk) =
=Prold(sk |T = N j)Prold(T = N j)
Prold(sk)
=Prold(sk |T ! N k)Prold(T = N j)
Prold(sk),
(7)
Prnew(T = N k |sk) =
=Prold(sk |T = N k)Prold(T = N k)
Prold(sk)
(8)
In order to make decisions, in addition to the proba-bility that the trial is a word trial or not that we computeabove, we also need the probability that the string at po-sition k is a word or nonword, i.e., Pr(S k ∈ ∪n
i=1Wi) andPr(S k ∈ ∪m
i=1Ni):
Pr(S k ∈ ∪ni=1Wi) =
n∑
i=1
Pr(S k = Wi|T ! N k)Pr(T ! Nk)
(9)
Pr(S k ∈ ∪mi=1Ni) = 1.0 − Pr(S k ∈ ∪n
i=1Wi) (10)
The full process then iterates, with each Prnew becom-ing the next Prold.
AppendixReferences
Anderson, J. R. (1990). The Adaptive Character ofThought. Hillsdale, NJ: Lawrence Erlbaum.
Ballard, D. H., & Hayhoe, M. M. (2009). Modelling therole of task in the control of gaze. Visual Cognition,17(6-7), 1185-1204.
Bicknell, K. (2010). Rational eye movements in read-ing combining uncertainty about previous words withcontextual probability. Proceedings of the 32nd An-nual Conference of the Cognitive Science Society.
Bicknell, K., & Levy, R. (2010). A rational model of eyemovement control in reading. Proceedings of the 48thAnnual Meeting of the Association for ComputationalLinguistics, 1168–1178.
Bicknell, K., & Levy, R. (2012). The utility of modelingword identification from visual input within models ofeye movements in reading. Visual Cognition, 1–18.
Bogacz, R., Brown, E., Moehlis, J., Holmes, P., & Cohen,J. D. (2006, October). The physics of optimal decisionmaking: A formal analysis of models of performancein two-alternative forced-choice tasks. PsychologicalReview, 113(4), 700-765.
Brodersen, K. H., Penny, W. D., Harrison, L. M., Dau-nizeau, J., Ruff, C. C., Duzel, E., et al. (2008). Inte-grated bayesian models of learning and decision mak-ing for saccadic eye movements. Neural Networks,21(9), 1247–1260.
Edwards, W. (1961, January). Behavioral decision theory.Annual Review of Psychology, 12, 473–98.
Edwards, W. (1965). Optimal strategies for seeking infor-mation: Models for statistics, choice reaction-times,and human information-processing. Journal of Math-ematical Psychology, 2(2), 312-329.
Engbert, R., Nuthmann, A., Richter, E., & Kliegl, R.(2005, October). Swift: A dynamical model of sac-cade generation during reading. Psychological Re-view, 112(4), 777-813.
Forster, K. (1979). Levels of processing and the struc-ture of the language processor. In W. E. Cooper &E. C. Walker (Eds.), Sentence processing: Psycholin-guistic studies presented to merrill garrett. Hillsdale,NJ: Lawrence Erlbaum.
Geisler, W. (1989). Sequential ideal-observer analysis ofvisual discriminations. Psychological Review, 96(2),267.
Grainger, J. (1990). Word frequency and neighborhoodfrequency effects in lexical decision and naming. Jour-nal of Memory and Language, 29(2), 228–244.
Green, D. M., & Swets, J. A. (1966). Signal DetectionTheory and Psychophysics. New York: Wiley.
Hale, J. T. (2011). What a rational parser would do. Cog-nitive Science, 35, 399–443.
Howes, A., Lewis, R. L., & Vera, A. H. (2009). Rationaladaptation under task and processing constraints: Im-plications for testing theories of cognition and action.Psychological Review, 116(4), 717–751.
Kaelbling, L. P., Littman, M. L., & Moore, A. W. (1996).
,... (see Appendix)
1
2
34
5
6
7
lead hilt robe helm guru east
Lewis (University of Michigan) Optimal Control Approaches 21 June 2012
Modeling eye-movements How architecture shapes adaptation
The “minimal model”
Manualresponse Posterior Update Lexicon +
Model of experiment
Experiment Environment
Critic
Feedback
Noisy sample from position
(Bounded) Optimal Control
Initiate saccade program
Reward Function
Stimulus
Oculomotor System
Initiate press
Button press indicating trial response
s kI(·)
saccade program saccade eye-brain lag ⇠ Gamma(50ms)
⇠ Gamma(250ms)
0 10 20 30 40 50
0.00.2
0.40.6
0.81.0
RN.match
SD= 1.75, threshold=0.95Samples
Posterio
r
RT dist. for RN.match
RTFreq
uency
300 500 700
0500
10001
500200
0
0 10 20 30 40 50
0.00.2
0.40.6
0.81.0
RN.mismatch
SD= 1.75, threshold=0.95Samples
Posterio
r
RT dist. for RN.mismatch
RT
Frequen
cy
300 500 700
0500
1500
2500
0 10 20 30 40 50
0.00.2
0.40.6
0.81.0
YES.match
SD= 1.75, threshold=0.95Samples
Posterio
r
RT dist. for YES.match
RT
Frequen
cy
300 500 700
0100
0200
0300
0
0 10 20 30 40 50
0.00.2
0.40.6
0.81.0
NRN.match
SD= 2.00, threshold=0.95Samples
Posterio
r
RT dist. for NRN.match
RT
Frequen
cy
300 500 700
0100
0300
0
0 10 20 30 40 50
0.00.2
0.40.6
0.81.0
NRN.mismatch
SD= 2.00, threshold=0.95Samples
Posterio
r
RT dist. for NRN.mismatch
RT
Frequen
cy
400 500 600 700
0500
1500
2500
0 10 20 30 40 50
0.00.2
0.40.6
0.81.0
RN.match
SD= 2.00, threshold=0.95Samples
Posterio
r
RT dist. for RN.match
RT
Frequen
cy
300 500 700
0500
1000
1500
0 10 20 30 40 50
0.00.2
0.40.6
0.81.0
RN.mismatch
SD= 2.00, threshold=0.95Samples
Posterio
r
RT dist. for RN.mismatch
RT
Frequen
cy
300 500 700
0500
1500
2500
0 10 20 30 40 50
0.00.2
0.40.6
0.81.0
YES.match
SD= 2.00, threshold=0.95Samples
Posterio
r
RT dist. for YES.match
RT
Frequen
cy
300 500 700
0500
1500
2500
0 10 20 30 40 50
0.00.2
0.40.6
0.81.0
RN.mismatch
SD= 3.25, threshold=0.92Samples
Posterio
r
RT dist. for RN.mismatch
RT
Frequen
cy
300 500 700
0500
10001
500200
0
0 10 20 30 40 50
0.00.2
0.40.6
0.81.0
YES.match
SD= 3.25, threshold=0.92Samples
Posterio
r
RT dist. for YES.match
RT
Frequen
cy
300 500 700
0500
1000
1500
2000
0 10 20 30 40 50
0.00.2
0.40.6
0.81.0
NRN.match
SD= 3.50, threshold=0.92Samples
Posterio
r
RT dist. for NRN.match
RT
Frequen
cy
300 500 700
0500
1000
2000
0 10 20 30 40 50
0.00.2
0.40.6
0.81.0
NRN.mismatch
SD= 3.50, threshold=0.92Samples
Posterio
r
RT dist. for NRN.mismatch
RT
Frequen
cy
300 500 700
0500
1500
2500
0 10 20 30 40 50
0.00.2
0.40.6
0.81.0
RN.match
SD= 3.50, threshold=0.92Samples
Posterio
r
RT dist. for RN.match
RT
Frequen
cy
300 500 700
0100
0300
0
0 10 20 30 40 50
0.00.2
0.40.6
0.81.0
RN.mismatch
SD= 3.50, threshold=0.92Samples
Posterio
r
RT dist. for RN.mismatch
RT
Frequen
cy
300 500 700
0500
1500
2500
0 10 20 30 40 50
0.00.2
0.40.6
0.81.0
YES.match
SD= 3.50, threshold=0.92Samples
Posterio
rRT dist. for YES.match
RT
Frequen
cy
300 500 700
0500
1500
0 10 20 30 40 50
0.00.2
0.40.6
0.81.0
NRN.match
SD= 1.50, threshold=0.93Samples
Posterio
r
RT dist. for NRN.match
RT
Frequen
cy
350 400 450 500
0500
1500
2500
saccade decision
trial decision
⇠ Gamma(40ms)⇠ Gamma(125ms)
optimal thresholds
18 LEWIS, SHVARTSMAN & SINGH
Next, we update the string-level beliefs:
Prnew(S k = Wi|T ! N k, sk) =
=Pr(sk |S k = Wi,T ! N k)Prold(S k = Wi|T ! N k)
Prold(sk |T ! N k)
=Pr(sk |S k = Wi)Prold(S k = Wi|T ! N k)
Prold(sk |T ! N k)(4)
Prnew(S k = Ni|T = Nk, sk) =
=Pr(sk |S k = Ni,T = N k)Prold(S k = Ni|T = N k)
Prold(sk |T = N k)
=Pr(sk |S k = Ni)Prold(S k = Ni|T = N k)
Prold(sk |T = N k)(5)
Finally, we update the trial level beliefs:
Prnew(T =W|sk) =
=Prold(sk |T =W)Prold(T =W)
Prold(sk)=
=Prold(sk |T ! N k)Prold(T =W)
Prold(sk)
(6)
Prnew(T = N j!k |sk) =
=Prold(sk |T = N j)Prold(T = N j)
Prold(sk)
=Prold(sk |T ! N k)Prold(T = N j)
Prold(sk),
(7)
Prnew(T = N k |sk) =
=Prold(sk |T = N k)Prold(T = N k)
Prold(sk)
(8)
In order to make decisions, in addition to the proba-bility that the trial is a word trial or not that we computeabove, we also need the probability that the string at po-sition k is a word or nonword, i.e., Pr(S k ∈ ∪n
i=1Wi) andPr(S k ∈ ∪m
i=1Ni):
Pr(S k ∈ ∪ni=1Wi) =
n∑
i=1
Pr(S k = Wi|T ! N k)Pr(T ! Nk)
(9)
Pr(S k ∈ ∪mi=1Ni) = 1.0 − Pr(S k ∈ ∪n
i=1Wi) (10)
The full process then iterates, with each Prnew becom-ing the next Prold.
AppendixReferences
Anderson, J. R. (1990). The Adaptive Character ofThought. Hillsdale, NJ: Lawrence Erlbaum.
Ballard, D. H., & Hayhoe, M. M. (2009). Modelling therole of task in the control of gaze. Visual Cognition,17(6-7), 1185-1204.
Bicknell, K. (2010). Rational eye movements in read-ing combining uncertainty about previous words withcontextual probability. Proceedings of the 32nd An-nual Conference of the Cognitive Science Society.
Bicknell, K., & Levy, R. (2010). A rational model of eyemovement control in reading. Proceedings of the 48thAnnual Meeting of the Association for ComputationalLinguistics, 1168–1178.
Bicknell, K., & Levy, R. (2012). The utility of modelingword identification from visual input within models ofeye movements in reading. Visual Cognition, 1–18.
Bogacz, R., Brown, E., Moehlis, J., Holmes, P., & Cohen,J. D. (2006, October). The physics of optimal decisionmaking: A formal analysis of models of performancein two-alternative forced-choice tasks. PsychologicalReview, 113(4), 700-765.
Brodersen, K. H., Penny, W. D., Harrison, L. M., Dau-nizeau, J., Ruff, C. C., Duzel, E., et al. (2008). Inte-grated bayesian models of learning and decision mak-ing for saccadic eye movements. Neural Networks,21(9), 1247–1260.
Edwards, W. (1961, January). Behavioral decision theory.Annual Review of Psychology, 12, 473–98.
Edwards, W. (1965). Optimal strategies for seeking infor-mation: Models for statistics, choice reaction-times,and human information-processing. Journal of Math-ematical Psychology, 2(2), 312-329.
Engbert, R., Nuthmann, A., Richter, E., & Kliegl, R.(2005, October). Swift: A dynamical model of sac-cade generation during reading. Psychological Re-view, 112(4), 777-813.
Forster, K. (1979). Levels of processing and the struc-ture of the language processor. In W. E. Cooper &E. C. Walker (Eds.), Sentence processing: Psycholin-guistic studies presented to merrill garrett. Hillsdale,NJ: Lawrence Erlbaum.
Geisler, W. (1989). Sequential ideal-observer analysis ofvisual discriminations. Psychological Review, 96(2),267.
Grainger, J. (1990). Word frequency and neighborhoodfrequency effects in lexical decision and naming. Jour-nal of Memory and Language, 29(2), 228–244.
Green, D. M., & Swets, J. A. (1966). Signal DetectionTheory and Psychophysics. New York: Wiley.
Hale, J. T. (2011). What a rational parser would do. Cog-nitive Science, 35, 399–443.
Howes, A., Lewis, R. L., & Vera, A. H. (2009). Rationaladaptation under task and processing constraints: Im-plications for testing theories of cognition and action.Psychological Review, 116(4), 717–751.
Kaelbling, L. P., Littman, M. L., & Moore, A. W. (1996).
,... (see Appendix)
1
2
34
5
6
7
lead hilt robe helm guru east
Lewis (University of Michigan) Optimal Control Approaches 21 June 2012
Modeling eye-movements How architecture shapes adaptation
Payoff structure & predictions for the minimal model
0.80 0.85 0.90 0.95 1.00
−10
0
10
20
ACC payoff vs. Saccade Threshold
Saccade Threshold
Accu
racy
Pay
off
(MODEL, noise=1.90)
●(0.9990, 0.9999)
0.80 0.85 0.90 0.95 1.00
5
10
15
20
25
BAL payoff vs. Saccade Threshold
Saccade Threshold
Bala
nce
Payo
ff
(MODEL, noise=1.90)
●(0.9950, 0.9990)
●
0.80 0.85 0.90 0.95 1.00
5
10
15
20
25
SPD payoff vs. Saccade Threshold
Saccade Threshold
Spee
d Pa
yoff
(MODEL, noise=1.90)
●(0.9900, 0.9990)
0.93 0.95 0.97 0.99
−10
0
10
20
ACC payoff vs. Decision Threshold
Decision Threshold
Accu
racy
Pay
off
(MODEL, noise=1.90)
●(0.9990, 0.9999)
0.93 0.95 0.97 0.99
−10
0
10
20
BAL payoff vs. Decision Threshold
Decision Threshold
Bala
nce
Payo
ff
(MODEL, noise=1.90)
●(0.9950, 0.9990)
0.93 0.95 0.97 0.99
−10
0
10
20
SPD payoff vs. Decision Threshold
Decision Threshold
Spee
d Pa
yoff
(MODEL, noise=1.90)
●(0.9900, 0.9990)
●
200
220
240
260
280
300
SFD by Frequency Class
Sing
le F
ixatio
n Du
ratio
n (m
s)
(HUMAN)
Accuracy Balance Speed
●
●
●●
●
●
LOW frequencyHIGH frequency
150
200
250
300
SFD by Frequency Class
Sing
le F
ixatio
n Du
ratio
n (m
s)
(MODEL, noise=1.85)
Accuracy Balance Speed
●●●
●●●
●●●●
●●●●
●●●
●●●
●●●●
●●●●
●
●
●
●
●
●
●
3
4
5
6
7
8
9
Frequency Effect on SFD
Log
Freq
uenc
y Ef
fect
on
SFD
(ms)
(HUMAN)
Accuracy Balance Speed
●
●
●
20
21
22
23
24
25
Frequency Effect on SFD
Log
Freq
uenc
y Ef
fect
on
SFD
(ms)
(MODEL, noise=1.85)
Accuracy Balance Speed
●●●
●●●
●●●●
●●●●
●
●
●
●
220
240
260
280
300
320
Word SFD by Position in List
Position in List
Sing
le F
ixatio
n Du
ratio
n (m
s)
(HUMAN)
2 3 4 5
●
●●
●
●● ●
●
●
● ●●
ACCURACYBALANCESPEED
150
200
250
300
Word SFD by Position in List
Sing
le F
ixatio
n Du
ratio
n (m
s)
(MODEL, noise=1.85)
2 3 4 5
●
●●
●
●●
●
●●
●
●●
●●●
●
●●
●
●●
●
●●
●
●●●
●
●●●
●
●●●
●
●
●●
●
●●●
●
●
●●
●
●●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
Review word-level results
Lewis (University of Michigan) Optimal Control Approaches 21 June 2012
Modeling eye-movements How architecture shapes adaptation
Payoff structure & predictions for the minimal model
0.80 0.85 0.90 0.95 1.00
−10
0
10
20
ACC payoff vs. Saccade Threshold
Saccade Threshold
Accu
racy
Pay
off
(MODEL, noise=1.90)
●(0.9990, 0.9999)
0.80 0.85 0.90 0.95 1.00
5
10
15
20
25
BAL payoff vs. Saccade Threshold
Saccade Threshold
Bala
nce
Payo
ff
(MODEL, noise=1.90)
●(0.9950, 0.9990)
●
0.80 0.85 0.90 0.95 1.00
5
10
15
20
25
SPD payoff vs. Saccade Threshold
Saccade Threshold
Spee
d Pa
yoff
(MODEL, noise=1.90)
●(0.9900, 0.9990)
0.93 0.95 0.97 0.99
−10
0
10
20
ACC payoff vs. Decision Threshold
Decision Threshold
Accu
racy
Pay
off
(MODEL, noise=1.90)
●(0.9990, 0.9999)
0.93 0.95 0.97 0.99
−10
0
10
20
BAL payoff vs. Decision Threshold
Decision Threshold
Bala
nce
Payo
ff
(MODEL, noise=1.90)
●(0.9950, 0.9990)
0.93 0.95 0.97 0.99
−10
0
10
20
SPD payoff vs. Decision Threshold
Decision Threshold
Spee
d Pa
yoff
(MODEL, noise=1.90)
●(0.9900, 0.9990)
●
200
220
240
260
280
300
SFD by Frequency Class
Sing
le F
ixatio
n Du
ratio
n (m
s)
(HUMAN)
Accuracy Balance Speed
●
●
●●
●
●
LOW frequencyHIGH frequency
150
200
250
300
SFD by Frequency Class
Sing
le F
ixatio
n Du
ratio
n (m
s)
(MODEL, noise=1.85)
Accuracy Balance Speed
●●●
●●●
●●●●
●●●●
●●●
●●●
●●●●
●●●●
●
●
●
●
●
●
●
3
4
5
6
7
8
9
Frequency Effect on SFD
Log
Freq
uenc
y Ef
fect
on
SFD
(ms)
(HUMAN)
Accuracy Balance Speed
●
●
●
20
21
22
23
24
25
Frequency Effect on SFD
Log
Freq
uenc
y Ef
fect
on
SFD
(ms)
(MODEL, noise=1.85)
Accuracy Balance Speed
●●●
●●●
●●●●
●●●●
●
●
●
●
220
240
260
280
300
320
Word SFD by Position in List
Position in List
Sing
le F
ixatio
n Du
ratio
n (m
s)
(HUMAN)
2 3 4 5
●
●●
●
●● ●
●
●
● ●●
ACCURACYBALANCESPEED
150
200
250
300
Word SFD by Position in List
Sing
le F
ixatio
n Du
ratio
n (m
s)
(MODEL, noise=1.85)
2 3 4 5
●
●●
●
●●
●
●●
●
●●
●●●
●
●●
●
●●
●
●●
●
●●●
●
●●●
●
●●●
●
●
●●
●
●●●
●
●
●●
●
●●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
Review word-level resultsLewis (University of Michigan) Optimal Control Approaches 21 June 2012
Modeling eye-movements How architecture shapes adaptation
Model fit for the alternative architectures
1.2 1.4 1.6 1.8 2.0
2040
6080
Model Error vs. Noise for Architectural Variants
Noise
RM
SE
aga
inst
Hum
an S
FD
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●
●
●
●
●●
●
●
●
●Complete architectureArchitecture with only saccade programmingArchitecture without saccade programmingMinimal model (noise only)
Lewis (University of Michigan) Optimal Control Approaches 21 June 2012
Conclusion
1 What determines the nature of eye-movements in linguistictasks?
2 Conclusions and looking ahead
Lewis (University of Michigan) Optimal Control Approaches 21 June 2012
Conclusion
Summary
We applied optimal control and state estimation techniquesto pursue, computationally, an interesting theoretical idea.Doing so yielded two things: arg max
organism structure
(constraints) task goals (
payoff)
envir
onmen
t st
ructu
re
behavior (optimal policies)
1 What determines the nature of eye-movements in linguistic tasks?Answer: Eye-movement control is the solution to a constrainedoptimization problem posed by task structure and payoff, linguisticknowledge, and oculomotor processing architecture.
2 A novel empirical demonstration: Humans adapt their oculomotorcontrol at the level of single fixation durations to maximize payoff inlinguistic tasks, and do so in ways sensitive to the specificcontingencies of the task at hand.
Lewis (University of Michigan) Optimal Control Approaches 21 June 2012
Conclusion
Stronger ties between psycholinguistics and linguistics andother areas of cognitive science?
Bayesian memory & perception
reinforcement learning
decision making
rational analysis
bounded rationality
psycholinguistics:classic parsing strategiesrational approacheslanguage-as-action
syntactic theory
language evolution
MEMORY constraints(CHANNEL CAPACITY)
PERCEPTUAL/OCULAR-MOTOR
constraints
PARSING PROCESSES
SUBJECTIVE UTILITY FUNCTION
Bounded Optimal Control Analysis
ADAPTIVE, INTEGRATED CONTROL POLICIES
for PARSING, EYE-MOVEMENTS, ACTION and COGNITION
TASK ENVIRONMENT (linguistic stimuli)
Lewis (University of Michigan) Optimal Control Approaches 21 June 2012
Conclusion
Optimal control and syntactic theory and evolution??
The question we can pose is: What optimization problem(specifically, bounded optimal control problem) is human languagethe solution to?
This offers a perspective on language evolution/emergencethat complements existing approaches by placing emphasis onhow the details of cognitive architecture and utility shapelanguage, abstracting away from processes of evolution.
It perhaps offers another way to pursue the “StrongMinimalist” thesis of optimality in language (recent work byChomsky).
For more, see Bratman, Shvartsman, Lewis & Singh (2010) on my web
site.Lewis (University of Michigan) Optimal Control Approaches 21 June 2012
Conclusion
Grammar as bounded optimal policyBratman, Shvartsman, Lewis & Singh (2012)
Table 1: Summary of three sets of experiments and policies learned. See text for detailed description.
ENVIRONMENT AGENT MEMORY LEXICON SIZE (S) PROPERTIES OF EMERGENT LINGUISTIC SYSTEM
Two Rooms
one symbolworking memory+ one symbollong-termmemory
3
Association and systematic order, where in additionsingle symbols uttered in isolation denote specific box-key combinations. Can only achieve 75% success.
4 Association and systematic symbol order. SPEAKERfirst describes the box, then the key (see Figure 2b).
8Highly context-dependent and idiosyncratic symbolmeanings. For example key 2 is represented by sym-bol 4 if uttered before box, but symbol 5 after.
16 Each symbol denotes a box-key combination. For ex-ample symbol 5 means key 1 and box 1.
Two rooms
two symbolworking memory(no long-termmemory)
3 Similar to case with 3 symbols above.
4Complex lexical forms. Describes entire box-key com-bination with two symbols which can be observed si-multaneously by LISTENER effectively creating a 2-symbol length word (see Figure 3b).
One room
one symbolworking memory+ one symbollong-termmemory
3 Symbols act as direct orders to LISTENER, but other-wise policy is similar to the cases of 3 symbols above.
4
Association and symbol order, but no storing or re-trieving from long-term memory is necessary becauseLISTENER can act immediately upon hearing a symbol(see Figure 4b).
Experiment set 1: Exploring constraints on the lexicon.We explore four different lexicon sizes: S = 16, S = 8, S = 4,and S = 3. Figure 2 shows 30 independent learning trajecto-ries for each value of S. The high variance is due to the natureof the learning algorithm which may not converge for bothagents every trial (or may get stuck on a less-than-optimalpolicy)—but what we are interested in are the best policieslearned (because the mechanism used can be improved sig-nificantly beyond our initial implementation of Sarsa(l) withfixed parameters across all experiments).
The first four rows of Table 1 summarize the results. Herewe will discuss the resulting policies in more detail. For 16available symbols, as expected, a different symbol is associ-ated with each box-key combination and the agents arrive atperfect performance. With eight symbols, again the best per-forming policies use two-symbol utterances for each box-keycombination, but not always in the same order (i.e. for somecombinations keys are uttered first and in other boxes are ut-tered first). For the case of four symbols, the best performingpolicies communicate box and key in a particular order, witheach symbol able to refer to either box or key (see Figure 2b).Of particular interest is that the the agents settle on a consis-tent order across box-key combinations, but this order mightbe different over seperate experiments: the linear position is
necessary but the specific order is not. Finally, for the case ofonly three symbols the agents again learn a policy where lin-ear symbol order matters. Curiously, this alone should onlyafford success in 56% of combinations; some policies how-ever achieved 75% success. The policy succeeds in the addi-tional box-key combinations by associating each with a singlesymbol uttered in isolation. That is, with limitations in sym-bol size utterance length becomes informative in addition topositional information.
As we can see, this method of systematically altering onlya single constraint (lexicon size) yields broad variation in lin-guistic properties even in this extremely simple domain, in-cluding the denotation of symbols and the use of order in-formation. The case of three and four symbols suggests thatlimited memory (paired with environmental pressures) leadsto the systematic use of symbol order in optimal performance,especially when the lexicon size is limited.
Experiment set 2: Modified agent constraints. Here ouraim is to explore further what specific constraints led to thesystematic use of order in Experiment 1. We alter the con-straints on the agents by allowing the LISTENER two symbolsin working memory instead of one (and no long-term mem-ory). All the other dynamics of the Treasure Box Domainare kept constant. The actions of store and retrieve have new
Lewis (University of Michigan) Optimal Control Approaches 21 June 2012
Conclusion
Collaborators and sponsors
Michael ShvartsmanU of Michigan Psychology
Satinder SinghU of Mich Computer Sci
Andrew HowesU of Birmingham Comp Sci
Marc BermanU of Toronto Psych
Julie BolandU of Michigan Psychology
Sam EpsteinU of Michigan Linguistics
Miki ObataMie University Linguistics
John JonidesU of Michigan Psychology
Language & Cognitive Architecture Lab: Bryan Berend, Yasaman
Kazerooni, Emmanuel Kumar, Craig Sanders, Mehgha Shyam
National Science Foundation
Lewis (University of Michigan) Optimal Control Approaches 21 June 2012