8/13/2019 Aczel - Strategy Analysis of Probability Learning
Strategy analysis
of probability learning
Balázs Aczél
Institute of Psychology,
Eötvös Loránd University
Budapest, Hungary
MA Thesis
(Psychology Course)
2005/2006 Spring Term
Supervisor: Mihály Racsmány, PhD
Contents
I. Introduction 4
II. The origins of probabilism in psychological research 5
II.1. Egon Brunswik 5
II.1.1. Brunswik's Lens model 6
II.1.2. Brunswik on learning 8
II.1.3. Multiple cue probability learning 10
II.1.4. The expansion of Brunswik's works 11
II.2. Estes and the phenomenon of probability matching 12
III. New interest in probability learning 14
III.1. Probabilistic associative learning 15
III.2. Probabilistic classification learning 17
IV. Methodological considerations 20
IV.1. Base-rate neglect 20
IV.2. Strategy analysis of PCL 23
IV.3. The question of consciousness 24
IV.4. Analysing the test results 26
V. The Dynamical approach 29
V.1. Dynamic models of cognition 29
V.1.1. Static and dynamic models of learning 30
V.2. Dynamical analyses 31
VI. Experiments 32
VI.1. Methods 34
VI.1.1. Participants 34
VI.1.2. Materials 34
VI.1.3. Apparatus 37
VI.1.4. Procedure 37
VI.1.5. Data collection 38
VII. Results 39
VII.1. Experiment 1 39
VII.1.1. Hit rate 39
VII.1.2. Rolling regression 40
VII.1.3. State-space model 43
VII.1.4. Test phase and explicit measures 43
VII.2. Experiment 2 45
VII.2.1. Hit rate 46
VII.2.2. Rolling regression 47
VII.2.3. State-space model 47
VII.2.4. Implicit, explicit measures 48
VII.3. Summary of results 49
VIII. Discussion 51
VIII.1. Rationality under uncertainty 52
VIII.2. The power of statistical learning 53
VIII.3. From duality to multiple systems 55
IX. Conclusion 58
References 60
Appendix
God has afforded us only the twilight of probability; suitable, I presume, to that state of mediocrity and probationership he has been pleased to place us in here.
John Locke, 1690
I. Introduction
All of us spend our lives learning, yet our goal of understanding its basic processes remains unaccomplished. Learning theory is a critical field of psychological investigation because most human behaviour involves some form of learning (Robertson, 1970). Learning may be defined as a process in which behavioural patterns are changed as the result of experience (Kelso, 1997). Learning is an adaptive process of our cognition for predicting the future on the basis of past experience. In an uncertain world this prediction can rely only on probabilistic relations within the environment (Lagnado, Newell, Kahan, & Shanks, in press). Therefore, understanding how people learn probabilistic information from experience is a fundamental question of human behaviour.
In this explorative study I concentrate on four basic aspects of probability learning. First of all, throughout my discussion I rely on Brunswik's theoretical framework concerning the probabilistic nature of human psychology. Secondly, I wish to provide a critical review of the methodology used in previous studies of probability learning. Thirdly, I am motivated to examine the role of conscious and unconscious processes behind the applied decisional strategies. Finally, I consider learning to be a dynamic process and find the dynamical approach and methodology relevant for the investigation. In the experiment I demonstrate how these aspects of probability, consciousness and dynamic processes play an essential role in probability learning.
The research question is to explore what decisional strategies we use in experiment-based probability learning situations. The modified experimental task makes it possible to explore the decisional strategies used. The employed methodology provides the field with useful tools for investigating simultaneous processes underlying learning behaviour. The results may yield direct implications for understanding the well-known suboptimalities of human learning and decision making. The work as a whole may raise new questions about the interaction of the association-based and rule-based processes behind human learning.
I gratefully acknowledge Dénes Tóth's collaboration in the planning, execution and analysis of the experiments reported. Special thanks for the careful revisions to Tamás Makány and for the linguistic check of the manuscript to James Wason from Cambridge.
II. The origins of probabilism in psychological research
Probabilism in psychological research started on its way with the argument of Egon Brunswik (e.g., 1939) that the relations between organisms and their environments should be described in probabilistic terms. Brunswik's scientific work was consistently provocative. He believed that psychology needed a revolutionary turn to understand behaviour with regard to its function and in terms of probabilistic relations. Although the impact of his work did not live up to his original aim of changing the mainstream of the field, it is becoming apparent that his thoughts were basically right.
In this chapter I wish to describe some of the views and works of Egon Brunswik, because his conceptions of perception, learning and experimental design serve as a theoretical background to the methodologies discussed in this work. I also give a short description of the contribution of William Estes, who initiated research to model probability learning in mathematical terms. Contemporary conceptions in probability learning studies originate directly from these two approaches.
II.1. Egon Brunswik
Brunswik, although born in Budapest, began his career in psychology as the first assistant of Karl Bühler in Vienna (Doherty & Kurz, 1996). Under the auspices of Bühler, Brunswik's views turned against the popular psycho-physical parallelism, and this attitude determined his later theories. He began to argue in Vienna, and continued in Berkeley, that both incoming perception and outgoing behaviour have a rather ambiguous nature. In his view, the probable partial causes and probable partial effects of behaviour should be the focus when we wish to understand the great compatibility between organism and environment (Hammond, 2001). Brunswik argued, from an evolutionary viewpoint, that in natural environments survival is possible only if the organism is able to establish compensatory balance in the face of "comparative chaos within the physical environment" (Brunswik, 1943, p. 257). In that era of psychology's physics envy, his concept of probable behaviour in a somewhat unpredictable environment stood in sharp contrast with the mainstream thinking that sought stability in laws and research on behaviour. With his Lens model, which intended to describe this compensatory balance of the organism amid the inherent uncertainty within the environment and within the person, he went against the dominating determinism of his time. Indeed, Brunswik's
belief was that "the probability character of the causal (partial cause-and-effect) relationship in the environment calls for a fundamental, all-inclusive shift in our methodological ideology regarding psychology" (Brunswik, p. 261). These harsh words, according to Hammond, Brunswik's student and follower, established a distance between Brunswik's views and those of Hull, Lewin and many generations of future experimental psychologists that is still hard to overcome (Hammond, p. 56).
II.1.1. Brunswik's Lens Model
In the domain of perceptual constancy, Brunswik's first research was conducted with Lajos Kardos (Brunswik & Kardos, 1929) on Bühler's duplicity principle. This principle opposed the view that context has only a modifying effect on perception that comes into account only after the object; instead, Bühler and his students stated that context is always present and never subordinate in perception (Brunswik, 1937). This reconceptualisation of context served as a crystallisation point for Brunswik's ideas, leading to his early lens analogy and his later generalised lens model (Cooksey, 2001).
The analogy of the doubly convex lens is a heuristic tool that he conceived of as a "composite picture of the functional unit of behaviour", or the unit of achievement (Brunswik, 1952, pp. 19-20). This tool was meant to help the researcher in structuring the investigation of organismic achievement. The explanation of the model contains another metaphor, the "intuitive statistician", which depicts the perceptual system as being equipped with latent capacities capable of basic statistical functioning amid the uncertainty of the environment (Brunswik, 1956, p. 80). The cues in the environment are only probabilistically related to the objective of the individual. In that sense, the decision maker has only probabilistic information about the environment and also about how to utilize the perceived cues. During judgemental processes the decision maker relies on these environmental cues to attain his/her goal. Not all of the cues may have equal relevance for predicting the outcome of the decision, but according to this view all of them are taken into account (cf. Brunswik's views on context as additional mediating data (1937)). The decision maker uses his/her memory of cue-outcome correlations from previous experiences. In Brunswik's view, this statistical processing occurs involuntarily. A further principle needed to understand the Lens model is his principle of parallel concepts (Doherty & Kurz, 1996). This principle states that the perceived environment and the cognitive system should be described by the same type of
constructs. These thoughts provided the basic theory for the construction of the Lens model. Brunswik (1952) thought that these constructs become measurable using correlation statistics.
Figure 1.
Illustration of the Lens model. The objective weights (w) between the environment (Y_E) and the cues (X) and the subjective weights (j) between the cues and the judgement (Y_S) are parallel concepts. The functional achievement (r_a) is the correlation between the person's judgement and the ecological criterion value. (Based on Cooksey, 1996.)
Thus, as Figure 1 depicts, ecological validity is defined as the correlation between the values of the distal criterion of the environment and the perceived cues; cue utilization validity is defined as the correlation between the values of the perceived cues and the individual's judgements; finally, achievement is measured by the correlation between the values of the distal criterion of the environment and the individual's judgements. Achievement, as the most general measurement of the model, reflects Brunswik's broadest descriptive term, probabilistic functionalism. In this terminology, achievement is the degree to which the organism successfully attains its goals (Doherty & Kurz, 1996). This conception marks an apparent distinction from most other traditions in psychology. Investigators of human behaviour almost exclusively define the correctness of human performance by comparison with some kind of normative model (mostly adapted from statistics or logic). Unlike the appliers of these coherence standards of assessment (e.g., investigators of heuristics and biases), those using the Lens model look for
correspondence in performance. In their words, successful performance (i.e., achievement) [is] "the degree to which a person's responses agree with, or correspond with, the environmental events" (Doherty & Kurz, p. 123).
The biggest contribution of the model is that it provided a general tool for investigations of cognition focused on any kind of judgemental process under uncertainty. Brunswik himself gave only a few demonstrations using the Lens model (e.g., 1952), but after his death the approach, starting from Hammond's clinically oriented initial research (Hammond, 1955), evolved into Social Judgement Theory, which reached many areas of investigation on judgement and decision making, including educational decision making (e.g., Cooksey, 1988), medical decision making (e.g., Wigton, 1988), accounting (e.g., Krogstad, Ettenson, & Shanteau, 1984), and risk judgement (e.g., Earle & Cvetkovich, 1988). The other expansion of Brunswik's Lens model had an enriching effect on theories of learning, setting off the studies of multiple cue probability learning (described in Section II.1.3.).
II.1.2. Brunswik on learning
Brunswik dedicated two research studies to the investigation of probability learning. The first, "Probability as a determiner of rat behavior" (1939), was executed in 1936 and 1937 as his first experiment in the United States. This paper presents not only an excellent reflection of Brunswik's main attitude towards the psychology of his time, but also the first experiment on probability learning in the literature. The design of the experiment followed the standard design of his age, the T-maze. The T-maze was a usual experimental setup of behaviourism, in which the animal (usually a rat) was placed in a two-armed T-form maze. Food reward was exclusively conditioned to one side and never to the other, thus training the animal to learn the expected behaviour. Brunswik's innovation was to alter the predictability of the sides across the running series. In every run there was food on only one side, but the location of the rewarded side was not constant within each group. Brunswik calibrated the predictabilities of certainty for the groups following 100:0, 75:25 and 67:33 ratios. In each case, the generally profitable side was counted as the "correct" choice, the generally unprofitable side as "error". In this sense, in the exceptional cases some of the successful choices were "errors" and some of the unsuccessful choices were "correct" responses. Further, to study the effect of ambiguity, after 4 days of training Brunswik gave reversal training to the animals. Now the profitable and unprofitable sides were exchanged, however, keeping the previously set probabilities the same for
all groups. In order to study the effect further, he added repeated reversal training for the 100:0 group. On the following six consecutive days these animals had the same type of training, except that the directions were reversed each day (Brunswik, 1939). The results in general showed that discrimination increases with the difference in reward probability between the two sides.
The description of the design here serves not as an introduction to the results of the study, but to present its innovative conception. Leaving aside Thorndike's ideas on the probabilistic nature of the environment (Thorndike, 1932), most behaviourist studies steadily stuck to deterministic reinforcements in their designs. It was Brunswik's explicit goal, argued strongly, to reform psychology so that it would accept the genuine uncertainty of observation and judgement into its models. It is worth taking notice of his argument, which can be called passionate in a scientific context. The title already hides an ironic paradox: probability as a determiner. Further, the first two sentences concisely sum up his overall conception: "In the natural environment of a living being, cues, means or pathways to a goal are usually neither absolutely reliable nor absolutely wrong. In most cases there is, objectively speaking, no perfect certainty that this or that will, or will not, lead to a certain end, but only a higher or lesser degree of probability." (Brunswik, 1939, p. 195).
Brunswik's other study on probability learning, conducted with Hans Herma in 1951, was titled "Probability learning of perceptual cues in the establishment of the weight illusion". This work applied probability learning to Brunswik's genuine interest, perception. In brief, in this perceptual experiment the participants had to lift heavy and light objects simultaneously, one in each hand. The objects were painted in two colours. Each participant was told that after some presentations of weights, he/she would have to tell in a snap judgement which of the two objects appeared heavier at the first moment of lifting (Brunswik & Herma, 1951). Because of the successive weight contrast effect, the participants underestimated the relatively heavier weight in the subsequent trial, and vice versa for the light weights. The estimated weight contrast was under focus since, in the balanced-weight test trials, "the one presented on the side with generally lesser frequency of heavy objects is judged as the heavier of the pair" (p. 174), thus showing the effect of probability learning. Thus, the location and the colour served as cues and the estimated weights as the dependent variable. The uncertainty of the environment was represented in the probability by which the multiple cues predicted the objects. The results showed that the contrast responses, after an early maximum, declined under continued reinforcement. This paradoxical
result was speculated to be a special characteristic of probabilistic learning (Brunswik & Herma, 1951).
The interest of this work lies not in its design or results, but in the questions it aims to study: how rapidly the organism adapts to the probabilistic structure of the environment, and whether probability learning reaches a final level at which it stabilizes (Björkman, 2001). These questions remain relevant to our current research on and understanding of probability learning.
Although learning was not closely related to Brunswik's original interests, in these works he presented probability learning and multiple-cue probability learning experiments for the first time. The implications of this concept are best detectable in the various areas of the multiple-cue probability learning paradigm.
II.1.3. Multiple cue probability learning
In a typical multiple cue probability learning (MCPL) situation, participants make judgements based on cues probabilistically associated with feedback over a series of trials. The aim of this experimental design is to model the organism's attempt to learn the relationships among the variables of the environment, and to model its function of predicting the efficiency of behaviour. The probabilistic reinforcement of cues by feedback represents the conception of the general uncertainty of real-life environments, and of their perception as well.
This description is in stark contrast with many traditional learning models, where the degree of learning is tested and defined by the number of correct retrievals of items or associations previously presented in deterministic pairings. The degree of learning in probability learning studies is assessed by the percentage of correct decisions made on the basis of previous experience in the task.
The merit of this model lies not just in its contribution to learning theories, but also in its essential involvement in most cognitive processes, ranging from perception and categorization to decision making. In this section I wish to provide insight into the evolution of MCPL research, to elucidate what attempts have led to the concepts and experimental trials of today.
II.1.4. The expansion of Brunswik's works
Although the debate continues as to whether Brunswik and Herma rightfully used the term probability learning in their experiment published in 1951 (described in Section II.1.2.), or whether a present-day reader would rather call it partial reinforcement (Björkman, 2001), some readers of the article understood the originality of the concept and started a series of experiments that triggered an emergence of questions for the coming five decades.
The first work after Brunswik's death in 1955 applying multiple and single cues in probability learning was conducted by Smedslund (1955) in Norway. His inquiry into the origins of perception assumed perception to be established by a process of multiple-probability learning, meaning that in learning people utilise complex configurations of ambiguous and probabilistic cues of the environment. One of his two experiments explored the possibility of utilising the probability learning procedure as a diagnostic tool in clinical psychology. He found the method to be slow and inefficient (Holzworth, 1999).
An extensive research program was initiated only from 1964 by Hammond, Brunswik's follower, and his students in the United States (e.g., Hammond, Hursch, & Todd, 1964); they began to analyze the components of clinical inference explicitly in the framework of Brunswik's Lens model (sketched earlier in Section II.1.1.). The other main propagator of the approach was Björkman, who started a long research project in Sweden (e.g., Björkman, 1965, 1987), as did his student Brehmer, who published 77 articles on the topic between 1972 and 1988 (Holzworth, 1999).
Until the late 1980s a large proportion of the studies documented substantial learning effects that nevertheless did not reach optimal levels (cf. Hammond's results), while Brehmer expressed a pessimistic conclusion, stating: "When we learn from outcomes, it may, in fact, be almost impossible to discover that one really does not know anything. This is especially true when the concepts are very complex in the sense that each instance contains many dimensions. In this case, there are too many ways of explaining why a certain outcome occurred, and to explain away failures of predicting the correct outcome. Because of this, the need to change may not be apparent to us, and we may fail to learn that our rule is invalid, not only for particular cases but for the general case also." (Brehmer, 1980, pp. 228-229).
It seems that interest in the MCPL approach dwindled considerably after the mid-1980s, but in fact the question merely merged into a parallel research tradition of probability learning, marked by the name of Estes.
II.2. Estes and the phenomenon of probability matching
Another theoretical origin of probability learning research can be found in the early articles of William Estes. In contrast to Brunswik's case, here the thought of probabilistic relations came from the inner circles of behaviourism. In Estes's time, the 1930s and 40s, after the fruitless attempts of the great search for global theories of learning, the field was looking for a new direction. The orientation of the time was to suppose that all psychological phenomena could be understood in terms of some version of associationism, meaning that situational stimuli (Ss) control all behavioural responses (Rs). Estes wanted to keep this paradigm while bringing probability into learning. As a good follower of his mentor Skinner, he found a way to execute the task in his 1950 article. First, he defined the response classes for the organism in a given situation. The advantage of having a closed set of responses is that the organism's behavioural state in a given situation can be fully characterized in terms of the probabilities of its emitting each of the N response classes (Bower, 1994). In this way, learning can be defined as an increase in the probability of the correct responses alongside a decrease in the competing alternative responses in a given situation. This concept enabled Estes to construct difference equations to predict quantitative behavioural data. This statistical theory of learning brought about the mathematical models that flourished in the field for the coming 25 years.
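Trial-by-trial models of this family are commonly written in the linear operator form p(n+1) = p(n) + theta * (lambda - p(n)), where p is the probability of the correct response, theta is a learning-rate parameter and lambda the asymptote. The sketch below is only an illustration of that general form, with invented parameter values, not a reconstruction of Estes's own equations:

```python
def linear_operator(p0, theta, lam, n_trials):
    """Iterate the linear learning equation p <- p + theta * (lam - p).

    p0:    initial probability of the correct response
    theta: learning-rate parameter (0 < theta < 1)
    lam:   asymptote toward which the response probability moves
    """
    p = p0
    trajectory = [p]
    for _ in range(n_trials):
        p = p + theta * (lam - p)  # move a fixed fraction toward the asymptote
        trajectory.append(p)
    return trajectory

# Illustrative run: start at chance, converge toward lam = 1.0.
traj = linear_operator(p0=0.5, theta=0.1, lam=1.0, n_trials=50)
print(round(traj[-1], 3))  # close to the asymptote after 50 trials
```

The negatively accelerated curve this equation produces is the classic "learning curve" shape that the statistical learning theorists fitted to data.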
After describing some finite difference equations for trial-by-trial changes in learning (Estes & Burke, 1953), Estes, the former rat runner, started his first probability learning experiments with his students in the mid-1950s. Adapting the experimental situations from the earlier studies of Brunswik, subjects had to predict which of two possible outcomes would occur on each trial in a given situation (e.g., whether a light would appear on the left or the right). The events occurred in random sequence, and only the base rate was available to help subjects predict which event would show up on a given trial. Thus the fixed probability was independent of the history of the outcomes and of the behaviour of the subject. The optimal strategy to maximize expected utility would be to always choose the option which appeared with probability greater than one half. The striking feature of the results of the following experiments was that, in their decisions, subjects matched the underlying probabilities of the two outcomes. For instance, the experimenter programmed the lights to flash randomly, with the red light flashing 70% of the time and the blue 30% of the time. Over the course of the experiment, participants would most often predict the red light roughly 70% of the time and the blue light roughly 30% of the time.
This strategy is suboptimal, since they will predict correctly only 58% of the time (0.7 x 0.7 + 0.3 x 0.3 = 0.58), while always predicting the more likely light would bring a 70% hit rate (1 x 0.7 + 0 x 0.3 = 0.70). The seeming illogicality of these results inconveniently surprised the researchers of the field. It is worth citing the economist Kenneth Arrow's comment from the time: "We have here an experimental situation which is essentially of an economic nature in the sense of seeking to achieve a maximum of expected reward, and yet the individual does not in fact, at any point, even in a limit, reach the optimal behavior. I suggest that this result points out strongly the importance of learning theory, not only in the greater understanding of the dynamics of economic behavior, but even in suggesting that equilibria may be different from those that we have predicted in our usual theory." (Arrow, 1958, p. 14). Bush and Mosteller's (1955) stochastic learning theory, as much as Estes's (1957) probability matching theorem, did predict this kind of behaviour with linear difference equations.
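The arithmetic behind this suboptimality is simple enough to verify directly. As a minimal sketch (the 70/30 values mirror the red/blue light example above):

```python
def expected_hit_rate(p_event, p_predict):
    """Long-run accuracy when the more frequent event occurs with
    probability p_event and is predicted with probability p_predict."""
    return p_predict * p_event + (1 - p_predict) * (1 - p_event)

matching   = expected_hit_rate(0.7, 0.7)  # probability matching
maximizing = expected_hit_rate(0.7, 1.0)  # always predict the likelier light

print(f"matching:   {matching:.2f}")    # 0.58
print(f"maximizing: {maximizing:.2f}")  # 0.70
```

Note that for any p_event above one half, the expected hit rate is strictly increasing in p_predict, so maximizing always dominates matching.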
Researchers' enthusiasm for the equations of the learning curves began to wane during the following decades. Estes's colleague Roger Shepard mentions three reasons explaining this turndown (1992). First, Estes and his fellows of that time based their theories on the assumption that associative bonds come about through temporal contiguity between events; with the cognitive approach, however, it turned out that the predictive significance of events is a better explanatory basis for this kind of learning (Rescorla & Wagner, 1972). Secondly, they realized only late that what can be learned must be genetically internalized, and thus experiments have to be set up to be ecologically valid (Gibson, 1979). Finally, these mathematical derivations could not be employed as expected for more complex classification tasks, and with the availability of stored-program computers the search for a few general principles waned and what started was "a seemingly endless patching together of details, as ad hoc heuristics explicitly engineered to accomplish various practical tasks" (Shepard, p. 420).
Even if the equations of learning curves did not remain a central research topic after the 1950s, the earlier robust findings that people use probability matching instead of normatively optimal strategies in binary prediction tasks still spurred researchers to further studies. Searching for an explanation, a vast number of experiments varied the parameters of the task over the coming decades. In general they found that matching decreased with the size of the reward (Brackbill, Kappy, & Starr, 1962; Siegel & Goldstein, 1959) and with the number of trials (Edwards, 1961). Shanks, Tunney, and McCarthy (2002), supporting rational choice theory, showed that three factors
contribute to reaching the optimal response strategy. These factors are (1) large financial incentives; (2) meaningful and regular feedback; (3) extensive training.
Rewards:
Although Friedman and Massaro (1998) failed to find evidence that monetary payoff affects performance, we can find documentation of such effects in the literature (Siegel & Goldstein, 1959; Vulkan, 2000). In general, even under monetary payoff, the asymptotic levels of responding rarely exceeded 95% correct choice (Vulkan, 2000). Although Shanks et al. (2002) paid their participants almost 40 on average, they could show only a slight positive effect on performance, and the magnitude of the payoff did not correlate with performance.
Number of trials:
Restle (1961) showed that the sequence effect may disappear after 1,000 trials; Goodie and Fantino (1999) found a gradual transition towards optimal responding over 1,600 trials. Considering these data, we should conclude that humans are slow learners rather than rational strategists. These findings even seem inapplicable to real life since, as Fantino (1998) noted, "[l]ife rarely offers 1,600 trials" (p. 213).
Feedback:
At least since Thorndike (1898) we have known that one is more likely to choose an option in the future if one receives positive feedback for it. Nevertheless, previous research has found that outcome feedback is quite limited in its usefulness, particularly in comparison to cognitive feedback (Balzer, Doherty, & O'Connor, 1989). Cognitive feedback refers to information about the relations between responses and outcomes (functional validity information), or a summary of the relations between cues and outcomes (task information). Shanks et al. (2002) found that individuals may be differentially sensitive to the motivating properties of feedback. Despite these observations, we can say in summary that previous research has paid little attention to the role of feedback, and its effects on asymptotic levels of performance are far from explained.
III. New interest in probability learning
Although from Brunswik up to the late 1980s some 280 journal articles, book chapters, doctoral dissertations and technical reports were published on MCPL, research on this approach to probability learning dwindled after the mid-1970s. Renewed interest in the topic started on its way again only in the late 1980s. This was the time when personal computers became available for psychological experiments demanding complex computations, thus allowing the development of models of
connectionist networks. This was the movement that eventually helped Brunswikian probabilism become adopted into the classical schools of human learning studies.
III.1. Probabilistic associative learning
With the appearance of the cognitive approach, animal and human learning research
started to follow separate routes. Researchers of animal learning continued to focus on
elementary learning, while interest in human research shifted from learning to memory and from
classical models to models of artificial intelligence. With the development of
connectionist networks after the mid-1980s, the methodological apparatus became available to
study the elementary aspects of human learning within the domain of cognitive psychology. Apart
from a few exceptions (e.g., Dickinson & Shanks, 1985), no studies before Gluck and Bower (1988a)
had attempted to bridge the results of animal experiments directly to human learning. In their
seminal papers (1988a,b), they developed an adaptive connectionist network to test the Rescorla-
Wagner model of associative learning (Rescorla & Wagner, 1972) in human category learning.
Gluck and Bower were looking for a learning model that incorporates probabilistic relations in its
formulation. The Rescorla-Wagner model is based on Rescorla's earlier demonstration (1968)
that, in animal associative learning, conditioning to a CS varies with the probability of the US in
the presence of the CS compared with the probability of the US in its absence (Gluck & Bower,
1988a). Gluck and Bower found that the Rescorla-Wagner rule for association formation can be
regarded as a special case of the least-mean-squares (LMS) learning rule that was used to train
the adaptive connectionist networks of the time. To evaluate the LMS rule as a component of
human learning, Gluck and Bower (1988a,b) conducted a series of experiments exploring how
accurately the model fits probabilistic classification learning situations.
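The correspondence between the Rescorla-Wagner rule and the LMS (delta) rule can be illustrated with a short simulation. This is a minimal sketch, not Gluck and Bower's actual network: the learning rate, the 80% cue-outcome contingency, and the four binary cues below are illustrative assumptions.

```python
import random

def lms_update(weights, cues, outcome, lr=0.02):
    """One delta-rule (LMS / Rescorla-Wagner) step: each active cue's
    weight moves in proportion to the prediction error."""
    prediction = sum(w * c for w, c in zip(weights, cues))
    error = outcome - prediction
    return [w + lr * error * c for w, c in zip(weights, cues)]

random.seed(0)
weights = [0.0] * 4
for _ in range(5000):
    # Cue 1 is always present; cues 2-4 are uncorrelated noise.
    cues = [1] + [random.randint(0, 1) for _ in range(3)]
    outcome = 1 if random.random() < 0.8 else 0
    weights = lms_update(weights, cues, outcome)

# The weight on the predictive cue drifts toward the 0.8 contingency,
# while the uncorrelated cues stay near zero.
print([round(w, 2) for w in weights])
```

Because the update is driven by the same error term as a least-squares regression, the asymptotic weights approximate the objective cue validities, which is the formal link Gluck and Bower exploited.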
They adapted the experimental task from Medin's medical classification task (e.g.,
Medin, Altom, Edelson, & Freko, 1982). In each trial of this task, participants, playing
medical diagnosticians, saw one or more of four symptoms of hypothetical patients in medical
charts. They had to classify each patient as having one of two fictitious diseases. After each
trial they received feedback about the correct diagnosis. Unknown to the participants, the
combinations of the four cues (symptoms) were imperfectly, probabilistically associated with the
feedback (diagnoses) (Figure 2), following the scheme of the multiple-cue probability learning
studies (e.g., Castellan, 1977). During the training, participants learnt the relationship of the
symptom patterns with the diseases, and at the end of the experiment they were asked to estimate
directly the conditional probabilities relating each symptom to the diseases.
Figure 2. Cue-outcome relations in the probabilistic association task (Gluck & Bower, 1988a). As in
Brunswik's Lens model (Figure 1), objective weights (w) represent the relation between the
environment and the cues.
The aim of this design was to obtain predictions from their adaptive network that were
distinguishable from those of three competing models of category learning (the exemplar,
feature-frequency, and prototype models). The best-fitting model could shed light on the question of to what
extent we use similarity and base-rate (category probability) information in probabilistic
categorization (Estes, Campbell, Hatsopoulos, & Hurwitz, 1989). The exemplar (or context)
model assumes that the learner stores all exemplars of each category, and a new instance gets
categorised on the basis of its relative similarity to the stored exemplars (e.g., Nosofsky, Kruschke,
& McKinley, 1992); the feature-frequency model presumes that the learner stores the relative
frequencies of occurrence of cues within the categories and then classifies an instance according to
the relative likelihood of its particular pattern of features arising from each of the categories
(Gluck & Bower, 1988b; e.g., Reed, 1972); the prototype model assumes that the learner abstracts
an average description of each category, and the new instance gets classified according to its
similarity to this prototype (e.g., Matsuka, 2004).
The adaptive network was an error-driven one-layer LMS network. To generate
differential predictions of the LMS model and the alternative models, Gluck and Bower (1988a)
had to unbalance the overall frequencies of the two diseases. This way, one of the diseases
occurred more often than the other. The results from the learning phase showed that the base-
rate information (the overall frequencies of the two diseases) was reflected in performance as a
form of probability matching; however, when the participants were given explicit test trials at the end,
they showed substantial base-rate neglect. As a result, it was found here and in several other
experiments (e.g., Estes et al., 1989; Gluck & Bower, 1988b) that the simple network model has
stronger predictive value for these results than the alternative models. More recently, out of the 12
current models of category learning, the COVIS (competition between verbal and implicit
systems) model has been regarded as the best descriptor of probabilistic classification (Ashby, Alfonso-
Reese, Turken, & Waldron, 1998; Kéri, 2003).
III.2. Probabilistic classification learning
Recently, four different kinds of category learning tasks have been generally used: rule-
based tasks, information-integration tasks, prototype distortion tasks, and the weather prediction
task (Ashby & Maddox, 2005). The weather prediction (WP) task is a version of the probabilistic
classification learning task developed by Knowlton, Squire and Gluck (1994). This test follows
the structure of Gluck and Bower's (1988a) construction, except that participants play a weather
forecaster. On the basis of one of the 14 combinations of four tarot cards (binary cues),
participants have to predict rainy or sunny weather (binary outcome) (Figure 3).
Figure 3. Probabilistic classification learning task. In this task people have to guess the weather on the
basis of the presented combination of cards with geometric signs on them (adapted from Aczel &
Gonci, 2005).
Outcomes associated with the patterns appeared with fixed probabilities, but in random order.
Thus, the WP task is substantially analogous to Gluck and Bower's (1988a) medical diagnosis task and
to the MCPL tasks in a Brunswikian sense, since the participants have to make judgements on the
basis of the experienced relation between multiple cues and a distal object, just as the Lens model
describes. The standard analysis of the test proceeds by averaging correct responses in blocks of
10 trials and measuring the deviation of these mean values from chance level. A response counts
as correct on a particular trial if the selected outcome was the one more frequently associated with the
given pattern over the course of the whole experiment (Knowlton & Squire, 1996).
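The standard scoring just described can be sketched in a few lines. The pattern labels, the `optimal_outcome` mapping, and the toy response sequence below are hypothetical; only the scoring rule (match against the most frequently associated outcome, averaged in 10-trial blocks) follows the text.

```python
def score_trials(responses, patterns, optimal_outcome, block=10):
    """Score each trial as correct if the response matches the outcome
    most often paired with that pattern over the whole experiment, then
    average within successive blocks (10 trials in the standard analysis)."""
    correct = [int(r == optimal_outcome[p]) for r, p in zip(responses, patterns)]
    return [sum(correct[i:i + block]) / block
            for i in range(0, len(correct), block)]

# Hypothetical data: pattern 'A' is mostly followed by 'sun', 'B' by 'rain'.
optimal = {'A': 'sun', 'B': 'rain'}
patterns = ['A', 'B'] * 10
responses = ['sun', 'rain'] * 5 + ['sun', 'sun'] * 5  # performance drops later
print(score_trials(responses, patterns, optimal))  # [1.0, 0.5]
```

Block means above 0.5 indicate learning of the dominant cue-outcome contingencies, which is the deviation-from-chance measure the standard analysis tests.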
During the last decade, this task became extensively used in cognitive neuroscience
(e.g., Eldridge, Masterman, & Knowlton, 2002; Knowlton et al., 1994, 1996; Poldrack, Clark,
Paré-Blagoev, Shohamy, Creso Moyano et al., 2001; Reber, Knowlton, & Squire, 1996; Reber &
Squire, 1999). The exciting result that started this interest in clinical research was Knowlton et
al.'s (1994) finding in an initial trial of this task. They compared the performance of amnesiacs
with normal controls on the WP task. The two groups performed equally well
during the first 50 trials of learning; however, in the extended part of the training, the amnesiacs'
performance decreased relative to the healthy group (Knowlton et al., 1994). The impairment of
declarative memory in amnesiacs is often coupled with the finding of relatively intact non-
declarative learning (Milner, Corkin, & Teuber, 1968; Warrington & Weiskrantz, 1968). On the
basis of this neurological observation, Knowlton et al. (1994) interpreted the results as showing that the
performance of both groups was supported by non-declarative learning systems in the first
part of the task, whereas in late training the control participants began memorising the test, which the
amnesiacs could not (Knowlton et al., 1994; see also Gluck, Oliver, & Myers, 1996). In contrast,
patients with Parkinson's disease show a learning deficit in the first 50 trials of the test, which
continues throughout the training (Knowlton et al., 1996). A learning pattern similar to the
amnesiacs' was found with Alzheimer patients in the early stages of the disease. In both
cases the anterograde amnesic symptoms are connected to neurodegenerative processes in the
medial temporal lobes (Eldridge et al., 2002). Szabolcs Kéri and his colleagues (Kéri et al., 2000)
examined schizophrenic patients, who are well known to have abnormalities in executive
function and explicit memory, with the WP task. The results showed normal performance for the
schizophrenic patients compared to controls. These results suggested that the WP task is processed by
non-declarative neural systems.
The key cortical activations generally found to correlate with probabilistic
classification learning were in the occipital cortex and the right caudate nucleus (Kéri, 2003). This
corresponds with the observation that people with impaired basal ganglia have difficulties in
implicit learning tasks. Correct responses correlate positively with caudate and prefrontal
activation; however, the role of the prefrontal and parietal cortices in this task is not yet
understood (Fera, Weickert, Goldberg, Tessitore, Hariri et al., 2005). The abnormal functioning of
the basal ganglia is well documented in Tourette patients (e.g., Peterson, Leckman, Duncan,
Ketzles, Riddle et al., 1994). In a clinical study, the WP task was used with children with Tourette
syndrome (Kéri, Szlobodnyik, Benedek, Janka, & Gádoros, 2002). The children exhibited
impaired learning in the WP; however, in an explicit transfer version of the test they showed
normal learning (Kéri et al., 2002). A further study showed that transcranial direct current
stimulation of the left prefrontal cortex could improve implicit learning in the WP task in healthy
people (Kincses, Antal, Nitsche, Bártfai, & Paulus, 2003). Interestingly, while healthy people
show activation in the striatum and no activation in the MTL during learning in an implicit motor
sequence task, people with obsessive-compulsive disorder exhibited no activation in the striatum
and activation in the MTL (Moody, Bookheimer, Vanek, & Knowlton, 2004). Poldrack et al.
(2001) set out to study this interaction of the basal ganglia and the medial temporal lobe (MTL)
during probabilistic classification learning. The striking finding was that during the initial
part of the WP task the MTL was active while the caudate was inactive, but very soon the MTL
became deactivated (presumably inhibited), whereas the caudate nucleus became activated.
Poldrack et al. (2001) interpreted the results as the first substantive evidence of competition
between memory systems. They supposed that in the initial part of the task the two systems
(explicit vs. implicit) may compete, and as it turns out that the task demands implicit processing,
the MTL becomes inhibited (Poldrack & Rodriguez, 2004). This result supports the view that the
two systems are not imperviously dissociated, but are in constant interaction and are influenced by
common factors (e.g., McDonald, Devan, & Hong, 2004; Turk-Browne, Yi, & Chun, 2006).
The probabilistic associative learning (Gluck & Bower, 1988a,b) and probabilistic
classification learning (Knowlton et al., 1994) tasks were developed for concrete
computational and clinical analyses, but their main contribution to the field is that they provide an
experimental and analytic methodology for probability learning studies. These initial
examinations produced sufficient data and experience to allow the basic procedures
and methodologies to be reconsidered for further studies.
IV. Methodological considerations
It became apparent very early that probabilistic experiments bring about confusing
results if they are not measured with special attention. This section highlights the three main
problematic points of the field. Base-rate neglect is a phenomenon that has intrigued not only
researchers of learning; indeed, it seems that this field has the most elaborated explanations of the issue.
Strategy analysis of probability learning has no long history in the literature, but bears special
interest for this explorative study. All of these methodological problems are entangled with our lack
of insight into the questions of consciousness in these processes.
IV.1. Base-rate neglect
Gluck and Bower's (1988a,b) observation that, although the participants reflected the
experienced probabilities in their decisions during the training phase, they did not consider the base-
rate information in their decisions in the test part, calls for special attention. This result may seem
uninterpretable, yet it fits well with the literature on the base-rate fallacy. The dominant
research on judgement and decision making in the 1970s and 1980s was concerned with the
heuristics and biases paradigm (Koehler, 1993). This approach was developed by Daniel
Kahneman and Amos Tversky (1972), who elaborated the attractive theory that people's
intuitive judgements about probabilistic events are made via error-prone heuristics. In
their 1973 seminal paper, presenting empirical support for this view, they concluded that by the
[representativeness] heuristic, "people predict the outcome that appears most representative of the
evidence. Consequently, intuitive predictions are insensitive to the reliability of the evidence or to
the prior probability of the outcome, in violation of the logic of statistical prediction. [...] It is
shown that [...] people erroneously predict rare events and extreme values if these happen to be
representative" (Kahneman & Tversky, 1973, p. 237). As evidence in support of this theory
mounted, the "base-rate fallacy" (Bar-Hillel, 1980) became a favourite example of the
heuristics and biases paradigm. These results fostered a common view of human judgement
as genuinely biased and generally poor (e.g., Lopes, 1991). However, by the early 1990s,
converging evidence had made it apparent that the generality of the base-rate fallacy had been
overstated: base rates are not uniformly ignored (Koehler, 1994, 1996). The
question arises: if base rates are not always ignored, when are they likely to be used? One can find
two main streams of researchers answering this question. One emphasizes that relative frequencies
are represented better than single-event probabilities (e.g., Gigerenzer, 1991; Hoffrage,
Gigerenzer, Krauss, & Martignon, 2002); the other seeks the answer in the structure of the
tests (e.g., Koehler, 1996; Spellman, 1993).
Gigerenzer and colleagues (e.g., 1995) forcefully argue that the dominant heuristics
theory is flawed, because representations in terms of natural frequencies facilitate the
use of probability (or frequency) information better than conditional probabilities do (e.g., Hoffrage et
al., 2002). This conception comes from the empirical work generated by ecological views (e.g.,
Gigerenzer, 1996). Natural frequencies originate from natural sampling (Gigerenzer & Hoffrage,
1995), an automatic way of encountering statistical information in the natural environment
(Hoffrage et al., 2002). To give a concrete example of these concepts:
Natural frequencies:
Out of each 1000 patients, 40 are infected.
Out of 40 infected patients, 30 will test positive.
Out of 960 uninfected patients, 120 will also test positive.
Normalized frequencies:
Out of each 1000 patients, 40 are infected.
Out of 1000 infected patients, 750 will test positive.
Out of 1000 uninfected patients, 125 will also test positive. (Hoffrage et al., 2002, p. 346)
Thus, a probability is the natural frequency normalised to a reference class of fixed size. The
authors point to this as a reason for the fallacy, since the computation is simpler when natural
frequencies are provided than when normalised frequencies or probabilities are given (Gigerenzer & Hoffrage,
1995).
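The computational advantage Gigerenzer and Hoffrage describe can be checked directly on the example above: with natural frequencies the posterior probability of infection given a positive test is a single division, while the probability format requires the full Bayes' rule.

```python
# Hoffrage et al.'s example, computed both ways.

# Natural frequencies: 30 true positives, 120 false positives.
posterior_nf = 30 / (30 + 120)

# Probability format: base rate 4%, hit rate 75%, false-alarm rate 12.5%.
p_inf, p_pos_inf, p_pos_noinf = 0.04, 0.75, 0.125
posterior_pr = (p_inf * p_pos_inf) / (
    p_inf * p_pos_inf + (1 - p_inf) * p_pos_noinf)

print(posterior_nf, posterior_pr)  # both 0.2
```

With the frequency format the answer (30 of the 150 positive testers are infected, i.e. 0.2) can be read off almost directly, which is exactly the authors' point about why the format matters.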
Koehler (1996) advocates the view that the fallacy is brought about by the way in which
base-rate summary statistics are provided in typical base-rate tasks. He claims that people, given the
opportunity for implicit base-rate learning, will be more sensitive to probabilities and will make
greater use of base rates in final judgements. Previous studies reported that when base
rates were directly experienced, through trial-by-trial feedback, they seemed to be used more
accurately in judgements (Lindeman, Van Den Brink, & Hoogstraten, 1988; Manis, Dovalina,
Avis, & Cardoze, 1980; Medin & Edelson, 1988), in contrast to the method of presenting mere
summary statistics. Directly experiencing base rates has been found helpful, for example, for auditors
learning financial statement errors (Butt, 1988), or for physicians learning the relationship of base
rates and diagnostic information (Christensen-Szalanski & Beach, 1982). Koehler (1996) assumes
that directly experienced base rates may rely on implicit rather than explicit learning
systems, and that this is why they are better remembered, or more easily accessed, than information that is
learned explicitly. He offers a reason for this: since the implicit learning experience comes in
the form of trial-by-trial learning, the information in each trial may be encoded as a separate
"trace". In this way, multiple traces develop, and the information associated with these traces may
be cognitively available. This contrasts with the explicit learning of a single summary statistic,
which does not produce multiple traces and which has been associated with less accurate
judgements (Koehler, 1996). Other explanations hold that personally experienced
information is more vivid or salient, and thus more available (Brekke & Borgida, 1988), or that people are
more trusting of self-generated base rates (Ungar & Sever, 1989). This conception is consistent
with many observations in the category learning literature, where the probability matching strategy
indicates that people learn the experienced base rates and use them in their decisions (although not
optimally), while in the explicit test phase they show the base-rate fallacy (e.g., Estes et al.,
1989; Gluck & Bower, 1988a,b; Medin & Edelson, 1988).
Holyoak and Spellman (1993) considered the phenomenon and suggested that two
components lie behind base-rate usage: (1) acquisition, which, in a trial-by-trial format, proceeds
implicitly and quite accurately (perhaps based on learning conditional probabilities);
and (2) access, which (depending on the type of test) may be under explicit and conscious control.
Consequently, when both the acquisition and the access part of the test tap the implicit system, people
will rely on base rates better than when one of the phases is not implicit (Spellman, 1993).
In sum, we can conclude that the base-rate literature holds substantial relevance for the
study of probabilistic categorization. For a comprehensive understanding of the issue, we must
consider the aspects described above both in experiment construction and in data interpretation.
IV.2. Strategy analysis of PCL
In probabilistic classification learning people acquire information about cue-outcome
relations; the category learning literature therefore technically regards PCL as an information-
integration task (Ashby & Maddox, 2005). However, little is known about how people integrate
the observed cues. As will be argued below, a variety of different strategies are about equally
effective for solving the task.
It is a reasonable hypothesis that evolved animals and humans should be optimal in
categorization decisions (Ashby & Maddox, 1992). According to Thorndike's law of effect (1898),
the probability of successful trials will increase with time. Still, robust deviations from this law
have been observed since the 1950s; one popular instantiation is probability matching, first studied by
Estes (1950) (see Section II.2). Ashby and Gott (1988) examined performance on many
human categorization tasks, comparing it to an optimal classifier (a hypothetical device
maximizing reward, e.g., Morrison, 1990). The overall data showed that human classification
cannot be described as optimal. Decision bound theory (Ashby & Townsend, 1986) attributes
two inherent suboptimalities to the decision-making processes of humans (and all other organisms). Both
suboptimalities are by-products of the neural system (Maddox & Bohil, 1998). Perceptual
noise comes from spontaneous activity within the central nervous system; in addition, the
optimality of the cognitive system is limited by the observer's memory (criterial noise).
Deviation from optimality is observable at the strategy level of decision making as well.
Suboptimal strategies are also often found in the probability learning literature (Vulkan, 2000). The
known deviations from optimality in human decision making still await a proper explanation;
however, one can find at least three distinct effects attributed to the phenomenon in previous
work. One set of explanations is classified as payoff variability effects (Busemeyer & Townsend,
1993; Haruvy & Erev, 2001). This theory claims that an increase in payoff variability moves
choice behaviour towards random choice (Erev & Barron, 2005). The second set of explanations
is classified as underweighting of rare events (Barron & Erev, 2003). In these situations people
tend to rely on the typical outcomes that have the best payoff. The third set of explanations
involves loss aversion (Kahneman & Tversky, 1979). This counterproductive tendency is observed in
stock markets, where people tend to avoid any loss. But in probabilistic cases, the
choice of the less probable alternative decreases the overall chance of maximal profit (e.g., Gneezy
& Potters, 1997). These three cognitive strategies may each seem reasonable from some point of view,
but their negative by-product is deviation from the maximization of gain.
Recently it has become apparent that the WP task is solvable by a range of different
strategies. Gluck, Shohamy and Myers (2002) presented post-hoc analysis techniques from which
they deduced that participants may use at least three different strategies. Post-experiment
questionnaires revealed that most of the participants believed themselves to use one of the following strategies:
(1) the optimal multi-cue strategy, in which they respond according to the outcome probability of each
combination of the four presented cues; (2) the one-cue strategy, in which they respond on the basis of
the presence or absence of only one cue; and (3) the singleton strategy, in which they
focus on and learn only about the patterns in which a single cue is present. Only the first strategy
is optimal in a normative sense, but all of these strategies can raise performance above
chance level. Based on these reports, Gluck et al. (2002) developed a strategy analysis method for
the WP task, describing an ideal judgement profile for each of these strategies. This method
identified the learning strategies applied by individuals in two follow-up experiments. The results
showed that 90% of the participants in Experiment 1 and 80% of the participants in Experiment 2
fitted the singleton criterion; however, when the task was broken into 50-trial blocks, a gradual shift
towards multi-cue strategies was observable (Gluck et al., 2002). Strikingly, there was little
correspondence between the explicit self-reports and the strategies the individuals actually used.
Recently, Lagnado and colleagues (in press) introduced an alternative to Gluck's multi-cue
strategy: the multi-match strategy supposes that people match the underlying probabilities of the
two outcomes with their predictions. As we will see later, these simple heuristics
represent only a few of the plausible options that may be involved in probabilistic decision-
making situations.
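The logic of Gluck et al.'s (2002) strategy analysis, matching each participant's response profile against ideal strategy profiles, can be sketched as follows. The three-pattern task and the ideal profiles here are illustrative toys, not the actual 14-pattern profiles used in their study.

```python
def best_strategy(observed, ideal_profiles):
    """Pick the strategy whose ideal judgement profile is closest
    (least squared deviation) to the observed per-pattern response
    proportions, in the spirit of Gluck et al.'s (2002) analysis."""
    def score(name):
        ideal = ideal_profiles[name]
        return sum((observed[p] - ideal[p]) ** 2 for p in observed)
    return min(ideal_profiles, key=score)

# Hypothetical toy task: values are P('sun' response | pattern).
profiles = {
    'multi-cue': {'p1': 1.0, 'p2': 0.0, 'p3': 1.0},
    'one-cue':   {'p1': 1.0, 'p2': 1.0, 'p3': 0.0},
    'singleton': {'p1': 1.0, 'p2': 0.5, 'p3': 0.5},
}
observed = {'p1': 0.9, 'p2': 0.4, 'p3': 0.6}
print(best_strategy(observed, profiles))  # 'singleton'
```

In the actual method, profiles are defined over all 14 cue patterns and the fit is recomputed per block, which is what allowed Gluck et al. to track the shift from singleton towards multi-cue responding.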
The fact that good performance can be achieved by explicit memorisation of heuristic
strategies (e.g., the one-cue or the singleton strategy) makes the implicit nature of the test quite
questionable. As argued in this study, besides finer strategy-analysis methods, a better
alternative to the WP task might prevent the method from being susceptible to identifiability
problems.
IV.3. The question of consciousness
Most previous studies of probabilistic classification learning assumed that decision
makers lack self-insight into the judgemental policies underlying their judgements, and
therefore regarded the test as a purely implicit learning test (e.g., Evans, Clibbens, Cattini, Harris, &
Dennis, 2003; Gluck et al., 2002; Wigton, 1996; York, Doherty, & Kamouri, 1987). This view was
also supported by those who emphasized the implicit nature of experience-based learning tasks
(e.g., Spellman, 1993). Although the question is important, the relation between people's learning
performance and their knowledge has received less attention.
Although the thought that learning can be based on separate conscious and non-conscious
systems is detectable in some early theories of learning (e.g., Tolman, 1932), systematic
research did not get under way until the late 1960s (Reber, 1967, 1969). Reber's concept of
implicit learning holds that people acquire information from the environment without intending
to do so (Cleeremans, Destrebecqz, & Boyer, 1998). This thesis made it possible to study
unconscious phenomena in cognitive psychology without relying on any psychoanalytic
conception. In a later paper, Reber (1992), taking an evolutionary standpoint, argued that
consciousness is a novel phenomenon that evolved after many higher perceptual and cognitive
processes. He stated four hypotheses about implicit mechanisms: (1) they are robust in the face of
psychological and neurological disorder; (2) they are independent of IQ; (3) they are independent of age;
and (4) they show little variance across populations. Others derive from this view that implicit learning
requires little effort and is often accurate, even optimised, compared to explicit ways of learning
(e.g., Holyoak & Spellman, 1993). The PCL task is one of the few tests that can be used to assess the validity of these assumptions.
A large body of research has documented apparent dissociations with the WP task and
taken them as evidence of the separation of the two learning systems (Ashby, Ell, & Waldron, 2003;
Knowlton et al., 1996; Reber & Squire, 1999; Squire, 1994). The two systems were also
demonstrated in neuropsychological studies arguing that they are differentially
impaired in certain clinical cases (Ashby et al., 2003; Knowlton et al., 1996; Poldrack et al.,
2001).
David Shanks is one of the few researchers who questions the need for any implicit
concept in the explanation. In a voluminous paper with a colleague (Shanks & St. John, 1994), he
reviewed the implicit learning literature according to their sensitivity criterion. This criterion
holds that before terming a learning behaviour implicit, one has to rule out the possibility that
the explicit test was simply insensitive. Shanks and St. John (1994) argue that explicit tests may fail
to measure the explicit processes that occurred during learning for two reasons: retrospective
questionnaires can distort the validity of the assessment because of memory
constraints and because of possible interference. Taking this criterion seriously, they found no
previous study in which implicit learning was satisfyingly demonstrated.
More than a decade later, he and his colleagues pointed to further methodological
shortcomings of the field (Lagnado et al., in press). First of all, they emphasized the need to
distinguish between someone's insight into the task (task knowledge) and someone's insight into
his or her own judgemental processes (self-insight). In the case of the WP task, this is the
difference between the learner's knowledge of the cue-outcome relations and of how to use this
knowledge to predict the outcome. Lagnado et al. (in press) conjecture that the two might be
separate, although previous research has mistakenly run the two together. Thus, the proclaimed
dissociations could have referred to a dissociation between insight and learning, between task knowledge
and learning, or both (Lagnado et al., in press). A further problem of explicit testing was added to
the list by the reasonable claim that the difficulty of verbalising probabilistic inferences might be a
natural obstacle to a valid report.
Recently, Lagnado et al. (in press) conducted a series of experiments to test these
claims. They used strongly and weakly predictive cards in the basic WP task. In Experiment 1, to
measure knowledge and insight, participants were given explicit questions after each block of
50 trials. One of the two sets of questions asked the participants to rate, on a continuous
scale, the probability of the outcome (rainy vs. sunny weather) for each individual card
(measuring task knowledge). The other question asked the participants how much they had
relied on each card in making their decisions (self-insight); this rating was also registered on a
similar continuous scale. The results indicated a strong correspondence between performance and
both task knowledge and self-insight, all of which were accurate.
In Experiment 2 they asked the participants similar explicit questions on the screen
during the same WP task. The difference was that the questions about how much they relied on each card
were presented after each trial. The results revealed that from early on in the
task, people rated strong cards as more important than weak ones. The authors concluded that
participants developed insight into their cue usage relatively early in the task.
In Experiment 3 they tested whether the explicit questions after each trial directed
conscious attention to the task, thus biasing the characteristics of learning performance. In the
final results there were no changes in any measurement of the third experiment compared to the
second one. These findings strongly supported the authors' doubt that the WP is a purely implicit
task.
Lagnado and colleagues' (in press) argument gains credence from these findings, but the
appealing results supporting the existence of an implicit mode of learning, as well as the lack of
consensus in analysing the test results, leave us without a satisfactory answer to the question.
IV.4. Analysing the test results
The standard analysis of the PCL task follows a simple procedure. It computes a mean
percentage of correct responses for the whole task by averaging across both trials and
participants. The aim of this analysis was mostly to develop categorization models (e.g., Gluck
& Bower, 1988a,b) or to study clinical groups (e.g., Knowlton et al., 1994, 1996). While the test
became popular in these research fields, this analysing method tells little about the
process of probability learning. It is insensitive to both individual
differences and the dynamic processes unfolding over the course of the test.
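The standard summary described above can be stated in a few lines. The following is a minimal sketch (the array names are illustrative, not taken from any published analysis script):

```python
import numpy as np

def mean_hit_rate(correct):
    """Standard PCL summary: mean percentage of correct responses,
    averaged across both trials and participants.

    correct: (n_participants, n_trials) matrix of 1 (hit) / 0 (miss).
    Returns a single percentage for the whole task.
    """
    return 100.0 * np.asarray(correct, dtype=float).mean()

# Two hypothetical participants, five trials each:
hits = [[1, 0, 1, 1, 1],
        [0, 1, 1, 0, 1]]
score = mean_hit_rate(hits)   # 7 hits out of 10 trials -> 70.0
```

Note how the computation collapses every trial and every participant into one number, which is exactly the insensitivity criticized above.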
Probabilistic classification learning tasks were developed from the multiple-cue
probability learning paradigm; thus, although unacknowledged, their basic structure reflects the
Brunswikian Lens model (for details see Section II.1.1.). This framework is applicable to every
judgemental process in which the decision maker has to rely on environmental cues. The original
theory was based on the view that people construct internal cognitive models that reflect the
probabilistic properties of the environment (Doherty & Kurz, 1996; Lagnado et al., in press). The central tenet of this approach was to analyse individual judgemental processes prior to computing
group averages. Individual judgemental policies can be examined by considering the relation between
the given cues and the patterns of judgement (for an overview see Cooksey, 1996). More specifically,
by computing a multiple regression analysis of the judgements on the cue values across all the trials
of the task, we can obtain the cue-utilization weights from the resultant beta coefficients. Simply
put, these are the weights that the individuals have given to each cue during their judgements.
Once the judges' policy models are obtained, they can be compared with the actual structure of the
environment. This is achieved by computing a parallel multiple linear regression for the
environmental cues. Here the beta coefficients will be the objective cue weights, which are the
same for all participants exposed to the same task. This way, the judgement policies (via their cue-utilization
weights) become comparable with the objective weights. The analysis eventually reveals
how well the individuals learnt the task environment. The method can be applied in a parallel
way to the assessment of the explicit judgements as well, measuring task knowledge and
self-insight.
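The regression step described above can be sketched in a few lines of code. This is only an illustration of the idea, assuming ordinary least squares and invented cue weights; it is not the analysis script used in any of the cited studies:

```python
import numpy as np

def cue_weights(cues, responses):
    """Estimate cue-utilization weights by multiple linear regression.

    cues: (n_trials, n_cues) 0/1 matrix of cue presence per trial.
    responses: (n_trials,) vector of judgements (or outcomes).
    Returns one beta coefficient (weight) per cue.
    """
    X = np.column_stack([np.ones(len(cues)), cues])   # add intercept
    beta, *_ = np.linalg.lstsq(X, responses, rcond=None)
    return beta[1:]                                   # drop the intercept

# Illustrative environment: 200 trials, 4 binary cues, cue 0 strongly
# and cue 1 weakly predictive (weights chosen arbitrarily here).
rng = np.random.default_rng(0)
cues = rng.integers(0, 2, size=(200, 4))
outcomes = cues @ np.array([0.8, 0.1, 0.1, 0.3]) + rng.normal(0, 0.2, 200)

objective = cue_weights(cues, outcomes)   # objective cue weights
# A judge's policy model uses the same function on their judgements:
# subjective = cue_weights(cues, judgements)
```

Running the same function once on the outcomes and once on each participant's judgements yields the two sets of weights that the Lens-model comparison places side by side.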
Brunswik's Lens model and its later developments provided a useful framework for
studying judgemental processes, but their shortcomings prevent us from obtaining a detailed picture
of the dynamics of the process, from the perspective of both the environment and the decision maker.
Regarding the WP task, averaging performance across all trials not only
ignores the possibility that one may vary his or her subjective weights over the trials, but also overlooks
the fact that the individual cannot obtain a representative picture of the probabilistic structure
early in the task. In fact, the observed probability of an outcome changes from trial to trial and
reaches the final (given) value only after the last feedback (so it is actually never
measured). Consider the structure of the following Figure 4. It depicts the binary decisional tree in
PCL. Going downwards from the top of the tree, one can observe the percentage of the normative
expectancy of one of the feedbacks in each step.
Figure 4.
Binary decisional tree in WP. Going downwards from the top of the tree, the numbers on the
diagonal lines indicate whether the correct response was set to be sun (1) or rain (0). The numbers (x)
inside the rectangles represent the percentage of the normative expectancy of sun feedback at each given step (for rain the values are 100 - x).
In practice, the outcome patterns are (quasi-)randomly distributed (with fixed overall
probabilities) in the WP experiments. The illustration (Figure 4.) shows that, although the
final percentage values are the same, each participant might observe different probabilities on the
preceding trials. This makes the group-averaging technique coarse at best, if not inadequate.
Furthermore, on each of the 14 trials that first present a particular pattern, participants have no previous
knowledge about the outcomes at all, so their responses cannot be the result of learning.
It is also plausible that people have no perfect memory for all the observed stimuli,
so recency may affect the decisions. People in experience-based learning situations have
to update their impressions according to the newly sampled outcomes (Hogarth & Einhorn, 1992).
Recently presented outcomes may carry greater weight than earlier ones (e.g., Hertwig, Barron,
Weber, & Erev, 2004). Moreover, individual memory constraints may vary with respect to the number
of samples considered in decisions (e.g., Jones, Love, & Maddox, in press).
In view of these facts, the conclusion is inevitable that a more sensitive analysing method is
required for a precise description of probability learning, together with a coherent framework for modelling the dynamics of the process.
V. The Dynamical approach
A more sensitive methodology for this analysis is provided by the dynamical
approach. The dynamical approach to cognitive science rejects the idea that cognition is the operation
of a special mental computer located in the brain; rather, it provides a framework for understanding
cognitive processes by treating cognition as a complex natural system (van Gelder & Port, 1995). One of the two main tenets of the Dynamical Hypothesis (van Gelder, 1998) that distinguishes it
from traditional views of cognition is its primary focus on processes in real time. Contrary
to the computational perspective, the main aim of the approach is to describe behaviour in its temporal
course. Instead of the input-output relation, its concern is the changing of the overall system
over time. In contrast to the computer-storage analogy, followers of the dynamic view reinforce the
common psychological view that we are not passive recipients of information but rather
actively manipulate, reconstruct, and bias it (MacLeod, Uttl, & Ohta, 2005). The other key aspect
of the approach is an emphasis on total state. Total state refers to the conjunction of all aspects of
the system at a given point in time (Beer, 2000; Bosse, 2005).
V.1. Dynamic Models of Cognition
The approach has proved expedient for theories of decision making. Busemeyer and
Townsend (1993) were among the first to publish a usable dynamic model of high-level
cognitive processing. Their study provided a new perspective for understanding the relations among decisional
models. They classified decision-making models along two attributes: deterministic
versus probabilistic, and static versus dynamic (1993). This dynamic-cognitive decision field
theory, in contrast to the earlier dominant deterministic or static theories, successfully
accounts for many time-varying aspects of the phenomenon and can offer a more detailed, process-oriented
explanation of the motivational and cognitive mechanisms of decision making (Port, 2000).
Before we exult over finding a revolutionary new alternative to the traditional theories
of cognitive science, we should note not only the long-established presence of its components
but also the considerable limitations of its current applicability. The general framework was
developed to study biological and cognitive systems in their evolution, but unfortunately in
many fields there is not much realistic prospect of empirically testing it (French & Thomas, 2001). In most cases of cognitive processing, transitions are extremely rapid and the variables
are exceedingly numerous and hardly detectable. Except for a few carefully constrained simple
cases, we have insufficient amounts of data and inadequate methods of computation compared with
what these analyses would require (Port, 2000).
V.1.1. Static and Dynamic Models of Learning
Still, one area where dynamic analyses may provide stronger descriptive and predictive power in modelling cognitive processes is the modelling of learning processes. As
discussed before, learning is a process in which we change our predictions about the environment
on the basis of new experiences. This process is an evolving product of changing factors over the
course of time (Smith et al., 2004). In typical experiments, learning curves can be documented by
recording responses (decisions) in multiple-trial tasks. Multiple-trial tasks provide
continuous sampling of the changing phases of the process. In these studies (usually with binary
responses), stimuli can be associated deterministically or probabilistically with reinforcement.
Following Busemeyer and Townsend's (1993) categorization of decision theories, I propose here
an outline classification of human learning models according to the deterministic versus
probabilistic and static versus dynamic attributes, as in Table 1.
Table 1
Categorization of Learning Models

Category        Static                           Dynamic
Deterministic   Classical/Operant Conditioning   Skill Learning
Probabilistic   Rescorla-Wagner model            Dynamic Probability Learning model

Note. The matrix depicts a hypothetical categorization of learning models according to the
deterministic-probabilistic and static-dynamic axes.
A model is static when it regards and measures learning not in its course, as
dynamic models do, but as a property of the system at a certain point in time. Traditional
conditioning theories therefore fall in this block, but most standard tests also belong here. Regarding the
other attribute, a model is deterministic if the stimulus is always, or never, associated with the
response, while in probabilistic theories the stimuli are reinforced according to a varying probability
distribution. In this sense the Rescorla-Wagner model (1972) is static, because it does not account for
the changing factors from trial to trial. Considering these aspects, the study of probability learning
requires the techniques of a dynamic learning model.
V.2. Dynamical analyses
Two techniques have been published recently to monitor learning behaviour over time.
The rolling regression analysis (or sequential least-squares technique) was introduced to illuminate
individual behavioural differences in price forecasting (Kelley & Friedman, 2002). This method
computes a series of regressions over a moving window, generating trial-by-trial estimates of the
individual's responsiveness to the observed cues. The learning curve is then compared with the
curve that an ideal learner would show on the same task. The method takes into account the trial-by-trial
information that the participant has actually observed; thus, ideal learners' strategies are defined
according to the current state of knowledge of the ideal observer at each trial. This technique
makes it possible to examine decisional attitudes individually along the course of the experiment
and provides a tool to compare learning performance with ideal learners following various strategies
(e.g., Kitzis, Kelley, Berg, Massaro, & Friedman, 1998; Lagnado et al., in press). The other,
explicitly dynamic statistical analysis of learning is the state-space model paradigm (Smith et al.,
2004). This model computes the probability of a correct response at each state of the learning
process by maximum likelihood, applying expectation-maximization algorithms. Knowing the
learning curve and its confidence intervals permits us to identify the first trial on the curve at which
an individual performs better than chance level (Smith et al., 2004). This technique gives a precise
definition of learning and a coherent statistical framework for learning studies with binary
responses.
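The moving-window idea behind rolling regression can be illustrated compactly. The sketch below is a simplified stand-in for the sequential least-squares technique, not the published implementation, and the learner data are invented:

```python
import numpy as np

def rolling_regression(cues, responses, window=20):
    """Trial-by-trial cue weights from a moving-window least-squares
    regression (a sketch of the sequential technique).

    cues: (n_trials, n_cues) matrix; responses: (n_trials,) vector.
    Returns one weight vector per window position.
    """
    weights = []
    for t in range(window, len(responses) + 1):
        X = np.column_stack([np.ones(window), cues[t - window:t]])
        beta, *_ = np.linalg.lstsq(X, responses[t - window:t], rcond=None)
        weights.append(beta[1:])               # drop the intercept
    return np.array(weights)

# Hypothetical learner who first follows the wrong cue, then switches
# to the predictive one halfway through the task:
cues = np.array([[i % 2, (i // 2) % 2] for i in range(100)], dtype=float)
responses = np.where(np.arange(100) < 50, cues[:, 1], cues[:, 0])
w = rolling_regression(cues, responses)
# Early windows load on cue 1, late windows on cue 0, so the weight
# trajectories expose the strategy shift that a single overall
# regression would average away.
```

Plotting each column of `w` against the trial index yields exactly the kind of trial-by-trial weight trajectory that the rolling-regression analyses in the Results section rely on.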
VI. Experiments
The present explorative study aims to analyse what decisional strategies are applied in probabilistic learning situations. The question was examined with some modifications of the standard
PCL task. In addition, two novel statistical analyses, rolling regression and the state-space model, were
applied to the data.
First of all, I used a new version of the usual PCL scheme (see Sections III.1. and III.2.). As
reviewed in Section IV.2., a variety of different strategies are all about equally effective for solving the WP
task. The ambiguity springs from the perceptual design of the four cues. In this task the four cues are four
geometric forms whose combinations are the basis of the judgements. As Gluck et al. (2002)
demonstrated, applying the singleton strategy (focusing only on the patterns where
a single cue is present) can lead the participant to performance as good as the multi-cue
strategy (where all the cue combinations are considered). For example, if a participant associates every
single triangle with rain and guesses randomly on the other trials, he may reach a good score over the whole
experiment even though no probability learning has occurred. To prevent this complication I used a
PCL task different from the WP. The experimental setup was adapted from Shohamy and colleagues (Shohamy et
al., 2004), who constructed a design that kept the basic structure of the WP but used less detached
cues. As described in detail below, the participants had to make guesses on the basis of the features
(cues) of a toy figure (hat, moustache, etc.) (Figure 6.).
Another modification was that the sequence of trials and feedbacks was set to be identical
for each participant. As noted in Section IV.2., the random distribution of the patterns (with fixed
overall probability) limits the possibilities of individual comparison. With fixed pathways of the patterns
in the decisional tree, the order of the stimuli and outcomes became identical for all participants. This
made the data usable for an aggregated trial-by-trial evaluation within the group and between
individuals.
To measure performance differences after different feedback pathways, several distinct
pathways led to some of the final probabilities. Figure 5. demonstrates that more than one pathway
leads to the final probability values of 33.3, 66.7, 25, 75, 16.6, 83.3, and 50. For some of
the pathways with a common final point, one answer was more probably correct in the beginning
but later the alternative answer became dominant, while in other cases one answer was
dominant over the whole course of trials. The fixed-pathway method also permits us to compare
learning differences at the same final point after different paths.
Figure 5.
Feedback pathways. The 14 pathways following the arrows represent the feedback sequences of
the 14 patterns. Numbers (x) in the rectangles indicate the percentage of the normative expectancy
of vanilla feedback at each given step (for chocolate the values are 100 - x).1
1 Constructed by Dénes Tóth.
In practice, the final value that represents the normative overall probability of a pattern
can be measured only after the last feedback (on previous practice see Section IV.4.).
Therefore, I inserted one extra trial of each pattern (14 in total) with no feedback after its last
presentation. These extra trials provide data for measuring learning at the final point.
In Section IV.1. we saw that experience-based learning is assumed to rest on
implicit rather than explicit mechanisms. Ignoring this led previous studies into confusion
about the explanation of the learning processes in this task (see Section IV.3.). A large body of
researchers emphasized that the misinterpretation of the data originates from the malpractice of
testing experimental learning with explicit assessments. To avoid this, I added an implicit test
phase following the learning phase. This phase was identical to the learning phase except that the
participants did not get feedback after the trials but were told that they would receive their results at the
end of the session. The advantage of this extra part is that it tests the subjects in a situation similar to
the learning phase, and since there is no feedback, the observed probability remains unchanged,
so we can compute mean scores from the sequences of test trials.
To measure the participants' explicit task knowledge, the third part of the experiment was
an overt test of the probability value they associated with each pattern. The explicit test had to
follow the other tests so as not to interfere with them.
VI.1. Methods
VI.1.1. Participants
Forty-five undergraduate students from the Psychology Institute of ELTE, Budapest,
participated in the present study (mean age = 23.46 years; SD = 3.77 years). There were 15 males
and 30 females. The participants were divided into two groups: a baseline and a dual-task group.
Twenty-eight participants were in the baseline group (n = 28; 6 males and 22 females; mean
age = 22.00 years, SD = 2.87 years). The dual-task group consisted of 17 participants (n =
17; 9 males and 8 females; mean age = 25.88 years, SD = 3.92 years). The
participants received course credits.
VI.1.2. Materials
A modified version of the PCL task (as introduced by Shohamy et al., 2004) was
used in this study. In this version participants are told that they are selling ice cream in an ice
cues present (1111) and with no cue present (0000) were never used (following the scheme of the
previous works since Knowlton et al., 1994).
Table 2
Note. Each card could be present (1) or absent (0) for each pattern. The all-present (1111) and all-
absent (0000) patterns were never used. The overall probability of vanilla outcome for all patterns
is 50%.
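The pattern set described in the note follows directly from its construction rule. A short sketch of how such a set could be generated (the function name is illustrative):

```python
from itertools import product

def make_patterns(n_cues=4):
    """All presence/absence combinations of the cues, excluding the
    never-shown all-absent (0000) and all-present (1111) patterns."""
    return [p for p in product((0, 1), repeat=n_cues)
            if 0 < sum(p) < n_cues]

patterns = make_patterns()   # 2**4 - 2 = 14 usable patterns
```

With four cues this yields exactly the 14 patterns used throughout the learning and test phases.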
Additionally, two digital photographs of vanilla and chocolate ice creams and an image
of the tip jar that showed the extra tips were used on the screens. All the materials were presented
on a computer screen with an identical black background.
Using the 14 stimulus patterns, 214 trials were constructed for the learning phase and 70 for the test phase. For the explicit test, 14 PowerPoint slides were created using the 14 patterns.
During the learning phase the two feedbacks (vanilla, chocolate) were equally probable, but each
pattern was independently associated with one of the feedbacks according to the scheme of Table 2.
Two hundred of the 214 learning trials followed the pathways of Figure 5.; the remaining 14 trials were not
associated with feedback.