8/13/2019 Aczel - Strategy Analysis of Probability Learning
Strategy analysis
of probability learning
Balázs Aczél
Institute of Psychology,
Eötvös Loránd University
Budapest, Hungary
MA Thesis
(Psychology Course)
2005/2006 Spring Term
Supervisor: Mihály Racsmány, PhD
Contents
I. Introduction 4
II. The origins of probabilism in psychological research 5
II.1. Egon Brunswik 5
II.1.1. Brunswik's Lens model 6
II.1.2. Brunswik on learning 8
II.1.3. Multiple cue probability learning 10
II.1.4. The expansion of Brunswik's works 11
II.2. Estes and the phenomenon of probability matching 12
III. New interest in probability learning 14
III.1. Probabilistic associative learning 15
III.2. Probabilistic classification learning 17
IV. Methodological considerations 20
IV.1. Base-rate neglect 20
IV.2. Strategy analysis of PCL 23
IV.3. The question of consciousness 24
IV.4. Analysing the test results 26
V. The Dynamical approach 29
V.1. Dynamic models of cognition 29
V.1.1. Static and dynamic models of learning 30
V.2. Dynamical analyses 31
VI. Experiments 32
VI.1. Methods 34
VI.1.1. Participants 34
VI.1.2. Materials 34
VI.1.3. Apparatus 37
VI.1.4. Procedure 37
VI.1.5. Data collection 38
VII. Results 39
VII.1. Experiment 1 39
VII.1.1. Hit rate 39
VII.1.2. Rolling regression 40
VII.1.3. State-space model 43
VII.1.4. Test phase and explicit measures 43
VII.2. Experiment 2 45
VII.2.1. Hit rate 46
VII.2.2. Rolling regression 47
VII.2.3. State-space model 47
VII.2.4. Implicit, explicit measures 48
VII.3. Summary of results 49
VIII. Discussion 51
VIII.1. Rationality under uncertainty 52
VIII.2. The power of statistical learning 53
VIII.3. From duality to multiple systems 55
IX. Conclusion 58
References 60
Appendix
God has afforded us only the twilight of probability; suitable, I presume, to that state of mediocrity and probationership he has been pleased to place us in here.
John Locke, 1690
I. Introduction
All of us spend our lives learning, yet our goal of understanding its basic processes remains unaccomplished. Learning theory is a critical field of psychological investigation because most human behaviour involves some form of learning (Robertson, 1970). Learning may be defined as a process in which behavioural patterns are changed as the result of experience (Kelso, 1997). Learning is an adaptive process of our cognition for predicting the future on the basis of past experience. In an uncertain world this prediction can rely only on probabilistic relations within the environment (Lagnado, Newell, Kahan, & Shanks, in press). Therefore, understanding how people learn probabilistic information from experience is a fundamental question of human behaviour.
In this explorative study I concentrate on four basic aspects of probability learning. First of all, throughout my discussion I rely on Brunswik's theoretical framework concerning the probabilistic nature of human psychology. Secondly, I wish to provide a critical review of the methodology used in previous studies of probability learning. Thirdly, I am motivated to examine the role of conscious and unconscious processes behind the applied decisional strategies. Finally, I consider learning to be a dynamic process and find the dynamical approach and methodology relevant for the investigation. In the experiment I demonstrate how these aspects of probability, consciousness and dynamic processes play an essential role in probability learning.
The research question is to explore what decisional strategies we use in experiment-based probability learning situations. The modified experimental task makes it possible to explore the decisional strategies used. The employed methodology provides the field with useful tools for investigating simultaneous processes underlying learning behaviour. The results may yield direct implications for understanding the well-known suboptimalities of human learning and decision making. The work as a whole may raise new questions about the interaction of the association-based and rule-based processes behind human learning.
I gratefully acknowledge Dénes Tóth's collaboration in the planning, execution and analysis of the experiments reported. Special thanks for the careful revisions to Tamás Makány and for the linguistic check of the manuscript to James Wason from Cambridge.
II. The origins of probabilism in psychological research
Probabilism in psychological research started on its way with the argument of Egon Brunswik (e.g., 1939) that the relations between organisms and their environments should be described in probabilistic terms. Brunswik's scientific work was consistently provocative. He believed that psychology needed a revolutionary turn to understand behaviour with regard to its function and in terms of probabilistic relations. Although the impact of his work did not live up to his original aim of changing the mainstream of the field, it is becoming apparent that his thoughts were basically right.
In this chapter I wish to describe some of the views and works of Egon Brunswik, because his conceptions of perception, learning and experimental design serve as a theoretical background to the methodologies discussed in this work. I also give a short description of the contribution of William Estes, who initiated research to model probability learning in mathematical terms. Contemporary conceptions in probability learning studies originate directly from these two approaches.
II.1. Egon Brunswik
Brunswik, although born in Budapest, began his career in psychology as the first assistant of Karl Bühler in Vienna (Doherty & Kurz, 1996). Under the auspices of Bühler, Brunswik's views turned against the popular psycho-physical parallelism, and this attitude determined his later theories. He began to argue in Vienna, and continued in Berkeley, that both incoming perception and outgoing behaviour have a rather ambiguous nature. In his view, the probable partial causes and probable partial effects of behaviour should be the focus when we wish to understand the great compatibility between organism and environment (Hammond, 2001). Brunswik argued, from an evolutionary viewpoint, that in natural environments survival is possible only if the organism is able to establish compensatory balance in the face of "comparative chaos within the physical environment" (Brunswik, 1943, p. 257). In that era of psychology's physics envy, his concept of probable behaviour in a somewhat unpredictable environment stood in sharp contrast with the mainstream thinking that sought stability in laws and research on behaviour. With his Lens model, which intended to describe this compensatory balance of the organism amid the inherent uncertainty within the environment and within the person, he went against the dominating determinism of his time. Indeed, Brunswik's
belief was that "the probability character of the causal (partial cause-and-effect) relationship in the environment calls for a fundamental, all-inclusive shift in our methodological ideology regarding psychology" (Brunswik, p. 261). These harsh words, according to Hammond, Brunswik's student and follower, established a distance between Brunswik's views and those of Hull, Lewin and many generations of future experimental psychologists that is still hard to overcome (Hammond, p. 56).
II.1.1. Brunswik's Lens Model
In the domain of perceptual constancy, Brunswik's first research was conducted with Lajos Kardos (Brunswik & Kardos, 1929) on Bühler's duplicity principle. This principle opposed the view that context has only a modifying effect on perception that comes into account only after the object; instead, Bühler and his students stated that context is always present and never subordinate in perception (Brunswik, 1937). This reconceptualisation of context served as a crystallisation point for Brunswik's ideas, leading to his early lens analogy and his later generalised lens model (Cooksey, 2001).
The analogy of the doubly convex lens is a heuristic tool that he conceived of as a "composite picture of the functional unit of behaviour", or the unit of achievement (Brunswik, 1952, pp. 19-20). This tool was meant to help the researcher in structuring the investigation of organismic achievement. The explanation of the model contains another metaphor, the "intuitive statistician", which depicts the perceptual system as being equipped with latent capacities capable of basic statistical functioning amid the uncertainty of the environment (Brunswik, 1956, p. 80). The cues in the environment are only probabilistically related to the objective of the individual. In that sense, the decision maker has only probabilistic information about the environment and also about how to utilize the perceived cues. During judgemental processes the decision maker relies on these environmental cues to attain his/her goal. Not all of the cues may have equal relevance for predicting the outcome of the decision, but according to this view all of them are taken into account (cf. Brunswik's views on context as additional mediating data (1937)). The decision maker uses his/her memory of cue-outcome correlations from previous experiences. In Brunswik's view, this statistical processing occurs involuntarily. A further principle needed to understand the Lens model is his principle of parallel concepts (Doherty & Kurz, 1996). This principle states that the perceived environment and the cognitive system should be described by the same type of
constructs. These thoughts provided the basic theory for the construction of the Lens model. Brunswik (1952) thought that these constructs become measurable using correlation statistics.
Figure 1.
Illustration of the Lens model. The objective weights (w) between the environment (Y_E) and the cues (X) and the subjective weights (j) between the cues and the judgement (Y_S) are parallel concepts. The functional achievement (r_a) is the correlation between the person's judgement and the ecological criterion value. (Based on Cooksey, 1996.)
Thus, as Figure 1 depicts, ecological validity is defined as the correlation between the values of the distal criterion of the environment and the perceived cues; cue utilization validity is defined as the correlation between the values of the perceived cues and the individual's judgements; finally, achievement is measured by the correlation between the values of the distal criterion of the environment and the individual's judgements. Achievement, as the most general measurement of the model, reflects Brunswik's broadest descriptive term, probabilistic functionalism. In this terminology, achievement is the degree to which the organism successfully attains its goals (Doherty & Kurz, 1996). This conception marks an apparent distinction from most other traditions in psychology. Investigators of human behaviour almost exclusively define the correctness of human performance by comparison with some kind of normative model (mostly adapted from statistics or logic). Unlike the appliers of these coherence standards of assessment (e.g., investigators of heuristics and biases), those using the Lens model look for
correspondence in performance. In their words, successful performance (i.e., achievement) [is] "the degree to which a person's responses agree with, or correspond with, the environmental events" (Doherty & Kurz, p. 123).
The biggest contribution of the model is that it provided a general tool for investigations of cognition focused on any kind of judgemental process under uncertainty. Brunswik himself gave only a few demonstrations using the Lens model (e.g., 1952), but after his death the approach, starting from Hammond's clinically oriented initial research (Hammond, 1955), evolved into Social Judgement Theory, which reached many areas of investigation on judgement and decision making, including educational decision making (e.g., Cooksey, 1988), medical decision making (e.g., Wigton, 1988), accounting (e.g., Krogstad, Ettenson, & Shanteau, 1984), and risk judgement (e.g., Earle & Cvetkovich, 1988). The other expansion of Brunswik's Lens model had an enriching effect on theories of learning, setting off the studies of multiple cue probability learning (described in Section II.1.3.).
II.1.2. Brunswik on learning
Brunswik dedicated two research studies to the investigation of probability learning. The first, "Probability as a determiner of rat behavior" (1939), was executed in 1936 and 1937 as his first experiment in the United States. This paper presents not only an excellent reflection of Brunswik's main attitude towards the psychology of his time, but also the first experiment on probability learning in the literature. The design of the experiment followed the standard design of his age, the T-maze. The T-maze was a usual experimental setup of behaviourism, in which the animal (usually a rat) was placed in a two-armed T-form maze. Food reward was exclusively conditioned to one side and never to the other, thus training the animal to learn the expected behaviour. Brunswik's innovation was to alter the predictability of the sides across the running series. In every run there was food on only one side, but the location of the rewarded side was not constant within each group. Brunswik calibrated the predictabilities of certainty for the groups following 100:0, 75:25 and 67:33 ratios. In each case, the generally profitable side was counted as the "correct" choice, the generally unprofitable side as "error". In this sense, in the exceptional cases some of the successful choices were "errors" and some of the unsuccessful choices were "correct" responses. Further, to study the effect of ambiguity, after 4 days of training Brunswik gave reversal training to the animals. Now the profitable and unprofitable sides were exchanged, however, keeping the previously set probabilities the same for
all groups. In order to study the effect further, he added repeated reversal training for the 100:0 group. On the following six consecutive days these animals had the same type of training, except that the directions were reversed each day (Brunswik, 1939). The results in general showed that discrimination increases with the difference in reward probability between the two sides.
The description of the design here serves not as an introduction to the results of the study, but to present its innovative conception. Leaving aside Thorndike's ideas on the probabilistic nature of the environment (Thorndike, 1932), most behaviourist studies steadily stuck to deterministic reinforcements in their designs. It was Brunswik's explicit goal, argued strongly, to reform psychology so that it would accept the genuine uncertainty of observation and judgement into its models. It is worth taking notice of his argument, which can be called passionate in a scientific context. The title already hides an ironic paradox: probability as a determiner. Further, the first two sentences concisely sum up his overall conception: "In the natural environment of a living being, cues, means or pathways to a goal are usually neither absolutely reliable nor absolutely wrong. In most cases there is, objectively speaking, no perfect certainty that this or that will, or will not, lead to a certain end, but only a higher or lesser degree of probability." (Brunswik, 1939, p. 195).
Brunswik's other study on probability learning, conducted with Hans Herma in 1951, was titled "Probability learning of perceptual cues in the establishment of the weight illusion". This work applied probability learning to Brunswik's genuine interest, perception. In brief, in this perceptual experiment the participants had to lift heavy and light objects simultaneously, one in each hand. The objects were painted in two colours. Each participant was told that after some presentations of weights, he/she would have to tell in a snap judgement which of the two objects appeared heavier at the first moment of lifting (Brunswik & Herma, 1951). Because of the successive weight contrast effect, the participants underestimated the relatively heavier weight in the subsequent trial, and vice versa for the light weights. The estimated weight contrast was under focus since, in the balanced-weight test trials, "the one presented on the side with generally lesser frequency of heavy objects is judged as the heavier of the pair" (p. 174), thus showing the effect of probability learning. Thus, the location and the colour served as cues and the estimated weights as the dependent variable. The uncertainty of the environment was represented in the probability by which the multiple cues predicted the objects. The results showed that the contrast responses, after an early maximum, declined under continued reinforcement. This paradoxical
result was speculated to be a special characteristic of probabilistic learning (Brunswik & Herma, 1951).
The interest of this work lies not in its design or results, but in the questions it aims to study: how rapidly the organism adapts to the probabilistic structure of the environment, and whether probability learning reaches a final level at which it stabilizes (Björkman, 2001). These questions remain relevant to our current research on and understanding of probability learning.
Although learning was not closely related to Brunswik's original interests, in these works he presented probability learning and multiple-cue probability learning experiments for the first time. The implications of this concept are best detectable in the various areas of the multiple-cue probability learning paradigm.
II.1.3. Multiple cue probability learning
In a typical multiple cue probability learning (MCPL) situation, participants make judgements based on cues probabilistically associated with feedback over a series of trials. The aim of this experimental design is to model the organism's attempt to learn the relationships among the variables of the environment, and to model its function of predicting the efficiency of behaviour. The probabilistic reinforcement of cues by feedback represents the conception of the general uncertainty of real-life environments, and of their perception as well.
This description is in stark contrast with many traditional learning models, where the degree of learning is tested and defined by the number of correct retrievals of items or associations previously presented in deterministic pairings. The degree of learning in probability learning studies is assessed by the percentage of correct decisions made on the basis of previous experience in the task.
The merit of this model lies not just in its contribution to learning theories, but also in its essential involvement in most cognitive processes, ranging from perception and categorization to decision making. In this section I wish to provide insight into the evolution of MCPL research, to elucidate what attempts have led to the concepts and experimental trials of today.
II.1.4. The expansion of Brunswik's works
Although the debate continues as to whether Brunswik and Herma rightfully used the term probability learning in their experiment published in 1951 (described in Section II.1.2.), or whether a present-day reader would rather call it partial reinforcement (Björkman, 2001), some readers of the article understood the originality of the concept and started a series of experiments that triggered an emergence of questions for the coming five decades.
The first work after Brunswik's death in 1955 applying multiple and single cues in probability learning was conducted by Smedslund (1955) in Norway. His inquiry into the origins of perception assumed perception to be established by a process of multiple-probability learning, meaning that in learning people utilise complex configurations of ambiguous and probabilistic cues of the environment. One of his two experiments explored the possibility of utilising the probability learning procedure as a diagnostic tool in clinical psychology. He found the method to be slow and inefficient (Holzworth, 1999).
An extensive research program was initiated only from 1964 by Hammond, Brunswik's follower, and his students in the United States (e.g., Hammond, Hursch, & Todd, 1964); they began to analyze the components of clinical inference explicitly in the framework of Brunswik's Lens model (sketched earlier in Section II.1.1.). The other main propagator of the approach was Björkman, who started a long research project in Sweden (e.g., Björkman, 1965, 1987), as did his student Brehmer, who published 77 articles on the topic between 1972 and 1988 (Holzworth, 1999).
Until the late 1980s a large proportion of the studies documented substantial learning effects that nevertheless did not reach optimal levels (cf. Hammond's results), while Brehmer expressed a pessimistic conclusion, stating: "When we learn from outcomes, it may, in fact, be almost impossible to discover that one really does not know anything. This is especially true when the concepts are very complex in the sense that each instance contains many dimensions. In this case, there are too many ways of explaining why a certain outcome occurred, and to explain away failures of predicting the correct outcome. Because of this, the need to change may not be apparent to us, and we may fail to learn that our rule is invalid, not only for particular cases but for the general case also." (Brehmer, 1980, pp. 228-229).
It seems that interest in the MCPL approach dwindled considerably after the mid-1980s, but in fact the question merely merged into a parallel research tradition of probability learning, marked by the name of Estes.
II.2. Estes and the phenomenon of probability matching
Another theoretical origin of probability learning research can be found in the early articles of William Estes. In contrast to Brunswik's case, here the thought of probabilistic relations came from the inner circles of behaviourism. In Estes's time, the 1930s and 40s, after the fruitless attempts of the great search for global theories of learning, the field was looking for a new direction. The orientation of the time was to suppose that all psychological phenomena could be understood in terms of some version of associationism, meaning that situational stimuli (Ss) control all behavioural responses (Rs). Estes wanted to keep this paradigm while bringing probability into learning. As a good follower of his mentor Skinner, he found a way to execute the task in his 1950 article. First, he defined the response classes for the organism in a given situation. The advantage of having a closed set of responses is that the organism's behavioural state in a given situation can be fully characterized in terms of the probabilities of its emitting each of the N response classes (Bower, 1994). In this way, learning can be defined as an increase in the probability of the correct responses alongside a decrease in the competing alternative responses in a given situation. This concept enabled Estes to construct difference equations to predict quantitative behavioural data. This statistical theory of learning brought about the mathematical models that flourished in the field for the coming 25 years.
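Trial-by-trial models of this family are commonly written in the linear operator form p(n+1) = p(n) + theta * (lambda - p(n)), where p is the probability of the correct response, theta is a learning-rate parameter and lambda the asymptote. The sketch below is only an illustration of that general form, with invented parameter values, not a reconstruction of Estes's own equations:

```python
def linear_operator(p0, theta, lam, n_trials):
    """Iterate the linear learning equation p <- p + theta * (lam - p).

    p0:    initial probability of the correct response
    theta: learning-rate parameter (0 < theta < 1)
    lam:   asymptote toward which the response probability moves
    """
    p = p0
    trajectory = [p]
    for _ in range(n_trials):
        p = p + theta * (lam - p)  # move a fixed fraction toward the asymptote
        trajectory.append(p)
    return trajectory

# Illustrative run: start at chance, converge toward lam = 1.0.
traj = linear_operator(p0=0.5, theta=0.1, lam=1.0, n_trials=50)
print(round(traj[-1], 3))  # close to the asymptote after 50 trials
```

The negatively accelerated curve this equation produces is the classic "learning curve" shape that the statistical learning theorists fitted to data.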
After describing some finite difference equations for trial-by-trial changes in learning (Estes & Burke, 1953), Estes, the former rat runner, started his first probability learning experiments with his students in the mid-1950s. Adapting the experimental situations from the earlier studies of Brunswik, subjects had to predict which of two possible outcomes would occur on each trial in a given situation (e.g., whether a light would appear on the left or the right). The events occurred in random sequence, and only the base rate was available to help subjects predict which event would show up on a given trial. Thus the fixed probability was independent of the history of the outcomes and of the behaviour of the subject. The optimal strategy to maximize expected utility would be to always choose the option which appeared with probability greater than one half. The striking feature of the results of the following experiments was that, in their decisions, subjects matched the underlying probabilities of the two outcomes. For instance, the experimenter programmed the lights to flash randomly, with the red light flashing 70% of the time and the blue 30% of the time. Over the course of the experiment, participants would most often predict the red light roughly 70% of the time and the blue light roughly 30% of the time.
This strategy is suboptimal, since they will predict correctly only 58% of the time (0.7 x 0.7 + 0.3 x 0.3 = 0.58), while always predicting the more likely light would bring a 70% hit rate (1 x 0.7 + 0 x 0.3 = 0.70). The seeming illogicality of these results inconveniently surprised the researchers of the field. It is worth citing the economist Kenneth Arrow's comment from the time: "We have here an experimental situation which is essentially of an economic nature in the sense of seeking to achieve a maximum of expected reward, and yet the individual does not in fact, at any point, even in a limit, reach the optimal behavior. I suggest that this result points out strongly the importance of learning theory, not only in the greater understanding of the dynamics of economic behavior, but even in suggesting that equilibria may be different from those that we have predicted in our usual theory." (Arrow, 1958, p. 14). Bush and Mosteller's (1955) stochastic learning theory, as much as Estes's (1957) probability matching theorem, did predict this kind of behaviour with linear difference equations.
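The arithmetic behind this suboptimality is simple enough to verify directly. As a minimal sketch (the 70/30 values mirror the red/blue light example above):

```python
def expected_hit_rate(p_event, p_predict):
    """Long-run accuracy when the more frequent event occurs with
    probability p_event and is predicted with probability p_predict."""
    return p_predict * p_event + (1 - p_predict) * (1 - p_event)

matching   = expected_hit_rate(0.7, 0.7)  # probability matching
maximizing = expected_hit_rate(0.7, 1.0)  # always predict the likelier light

print(f"matching:   {matching:.2f}")    # 0.58
print(f"maximizing: {maximizing:.2f}")  # 0.70
```

Note that for any p_event above one half, the expected hit rate is strictly increasing in p_predict, so maximizing always dominates matching.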
Researchers' enthusiasm for the equations of the learning curves began to wane during the following decades. Estes's colleague Roger Shepard mentions three reasons explaining this turndown (1992). First, Estes and his fellows of that time based their theories on the assumption that associative bonds come about through temporal contiguity between events; with the cognitive approach, however, it turned out that the predictive significance of events is a better explanatory basis for this kind of learning (Rescorla & Wagner, 1972). Secondly, they realized only late that what can be learned must be genetically internalized, and thus experiments have to be set up to be ecologically valid (Gibson, 1979). Finally, these mathematical derivations could not be employed as expected for more complex classification tasks, and with the availability of stored-program computers the search for a few general principles waned and what started was "a seemingly endless patching together of details, as ad hoc heuristics explicitly engineered to accomplish various practical tasks" (Shepard, p. 420).
Even if the equations of learning curves did not remain a central research topic after the 1950s, the earlier robust findings that people use probability matching instead of normatively optimal strategies in binary prediction tasks still spurred researchers to further studies. Searching for an explanation, a vast number of experiments varied the parameters of the task over the coming decades. In general they found that matching decreased with the size of the reward (Brackbill, Kappy, & Starr, 1962; Siegel & Goldstein, 1959) and with the number of trials (Edwards, 1961). Shanks, Tunney, and McCarthy (2002), supporting rational choice theory, showed that three factors
contribute to reaching the optimal response strategy. These factors are (1) large financial incentives; (2) meaningful and regular feedback; (3) extensive training.
Rewards:
Although Friedman and Massaro (1998) failed to find evidence that monetary payoff affects performance, we can find documentation of such effects in the literature (Siegel & Goldstein, 1959; Vulkan, 2000). In general, even under monetary payoff, the asymptotic levels of responding rarely exceeded 95% correct choice (Vulkan, 2000). Although Shanks et al. (2002) paid their participants almost 40 on average, they could show only a slight positive effect on performance, and the magnitude of the payoff did not correlate with performance.
Number of trials:
Restle (1961) showed that the sequence effect may disappear after 1,000 trials; Goodie and Fantino (1999) found a gradual transition towards optimal responding over 1,600 trials. Considering these data, we should conclude that humans are slow learners rather than rational strategists. These findings even seem inapplicable to real life since, as Fantino (1998) noted, "[l]ife rarely offers 1,600 trials" (p. 213).
Feedback:
At least since Thorndike (1898) we have known that one is more likely to choose an option in the future if one receives positive feedback for it. Nevertheless, previous research has found that outcome feedback is quite limited in its usefulness, particularly in comparison to cognitive feedback (Balzer, Doherty, & O'Connor, 1989). Cognitive feedback refers to information about the relations between responses and outcomes (functional validity information), or a summary of the relations between cues and outcomes (task information). Shanks et al. (2002) found that individuals may be differentially sensitive to the motivating properties of feedback. Despite these observations, we can say in summary that previous research has paid little attention to the role of feedback, and its effects on asymptotic levels of performance are far from explained.
III. New interest in probability learning
Although from Brunswik up to the late 1980s some 280 journal articles, book chapters, doctoral dissertations and technical reports were published on MCPL, research on this approach to probability learning dwindled after the mid-1970s. Renewed interest in the topic started on its way again only in the late 1980s. This was the time when personal computers became available for psychological experiments demanding complex computations, thus allowing the development of models of
connectionist networks. This was the movement that eventually helped Brunswikian probabilism become adopted into the classical schools of human learning studies.
III.1. Probabilistic associative learning
With the appearance of the cognitive approach, animal and human learning research
started to follow separate routes. Researchers of animal learning continued to focus on
elementary learning, while interest in human research shifted from learning to memory and from
classical models to models of artificial intelligence. With the development of
connectionist networks after the mid-1980s, the methodological apparatus became available to
study the elementary aspects of human learning within the domain of cognitive psychology. Apart
from a few exceptions (e.g., Dickinson & Shanks, 1985), no studies before Gluck and Bower (1988a)
had attempted to bridge the results of animal experiments directly to human learning. In their
seminal papers (1988a,b), they developed an adaptive connectionist network to test the Rescorla-
Wagner model of associative learning (Rescorla & Wagner, 1972) in human category learning.
Gluck and Bower were looking for a learning model that incorporates probabilistic relations in its
formulation. The Rescorla-Wagner model is based on Rescorla's earlier demonstration (1968)
that, in animal associative learning, conditioning to a CS varies with the probability of the US in
the presence of the CS compared with the probability of the US in its absence (Gluck & Bower,
1988a). Gluck and Bower found that the Rescorla-Wagner rule for association formation can be
regarded as a special case of the least-mean-squares (LMS) learning rule that was used to train
the adaptive connectionist networks of the time. To evaluate the LMS rule as a component of
human learning, Gluck and Bower (1988a,b) conducted a series of experiments exploring how
accurately the model fits probabilistic classification learning situations.
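The correspondence between the Rescorla-Wagner rule and the LMS (delta) rule can be illustrated with a short simulation. This is a minimal sketch, not Gluck and Bower's actual network: the learning rate, the 80% cue-outcome contingency, and the four binary cues below are illustrative assumptions.

```python
import random

def lms_update(weights, cues, outcome, lr=0.02):
    """One delta-rule (LMS / Rescorla-Wagner) step: each active cue's
    weight moves in proportion to the prediction error."""
    prediction = sum(w * c for w, c in zip(weights, cues))
    error = outcome - prediction
    return [w + lr * error * c for w, c in zip(weights, cues)]

random.seed(0)
weights = [0.0] * 4
for _ in range(5000):
    # Cue 1 is always present; cues 2-4 are uncorrelated noise.
    cues = [1] + [random.randint(0, 1) for _ in range(3)]
    outcome = 1 if random.random() < 0.8 else 0
    weights = lms_update(weights, cues, outcome)

# The weight on the predictive cue drifts toward the 0.8 contingency,
# while the uncorrelated cues stay near zero.
print([round(w, 2) for w in weights])
```

Because the update is driven by the same error term as a least-squares regression, the asymptotic weights approximate the objective cue validities, which is the formal link Gluck and Bower exploited.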
They adapted the experimental task from Medin's medical classification task (e.g.,
Medin, Altom, Edelson, & Freko, 1982). In each trial of this task, participants, playing
medical diagnosticians, saw one or more of four symptoms of hypothetical patients in medical
charts. They had to classify each patient as having one of two fictitious diseases. After each
trial they received feedback about the correct diagnosis. Unknown to the participants, the
combinations of the four cues (symptoms) were imperfectly, probabilistically associated with the
feedback (diagnoses) (Figure 2), following the scheme of the multiple-cue probability learning
studies (e.g., Castellan, 1977). During the training, participants learnt the relationship of the
symptom patterns with the diseases, and at the end of the experiment they were asked to estimate
directly the conditional probabilities relating each symptom to the diseases.
Figure 2. Cue-outcome relations in the probabilistic association task (Gluck & Bower, 1988a). As in
Brunswik's Lens model (Figure 1), objective weights (w) represent the relation between the
environment and the cues.
The aim of this design was to obtain predictions from their adaptive network that were
distinguishable from those of three competing models of category learning (the exemplar,
feature-frequency, and prototype models). The best-fitting model could shed light on the question of to what
extent we use similarity and base-rate (category probability) information in probabilistic
categorization (Estes, Campbell, Hatsopoulos, & Hurwitz, 1989). The exemplar (or context)
model assumes that the learner stores all exemplars of each category, and a new instance gets
categorised on the basis of its relative similarity to the stored exemplars (e.g., Nosofsky, Kruschke,
& McKinley, 1992); the feature-frequency model presumes that the learner stores the relative
frequencies of occurrence of cues within the categories and then classifies an instance according to
the relative likelihood of its particular pattern of features arising from each of the categories
(Gluck & Bower, 1988b; e.g., Reed, 1972); the prototype model assumes that the learner abstracts
an average description of each category, and the new instance gets classified according to its
similarity to this prototype (e.g., Matsuka, 2004).
The adaptive network was an error-driven one-layer LMS network. To generate
differential predictions of the LMS model and the alternative models, Gluck and Bower (1988a)
had to unbalance the overall frequencies of the two diseases. This way, one of the diseases
occurred more often than the other. The results from the learning phase showed that the base-
rate information (the overall frequencies of the two diseases) was reflected in performance as a
form of probability matching; however, when the participants were given explicit test trials at the end,
they showed substantial base-rate neglect. As a result, it was found here and in several other
experiments (e.g., Estes et al., 1989; Gluck & Bower, 1988b) that the simple network model has
stronger predictive value for these results than the alternative models. More recently, out of the 12
current models of category learning, the COVIS (competition between verbal and implicit
systems) model has been regarded as the best descriptor of probabilistic classification (Ashby, Alfonso-
Reese, Turken, & Waldron, 1998; Kéri, 2003).
III.2. Probabilistic classification learning
Recently, four different kinds of category learning tasks have been generally used: rule-
based tasks, information-integration tasks, prototype distortion tasks, and the weather prediction
task (Ashby & Maddox, 2005). The weather prediction (WP) task is a version of the probabilistic
classification learning task developed by Knowlton, Squire and Gluck (1994). This test follows
the structure of Gluck and Bower's (1988a) construction, except that participants play a weather
forecaster. On the basis of one of the 14 combinations of four tarot cards (binary cues),
participants have to predict rainy or sunny weather (binary outcome) (Figure 3).
Figure 3. Probabilistic classification learning task. In this task people have to guess the weather on the
basis of the presented combination of cards with geometric signs on them (adapted from Aczel &
Gonci, 2005).
Outcomes associated with the patterns appeared with fixed probabilities, but in random order.
Thus, the WP task is substantially analogous to Gluck and Bower's (1988a) medical diagnosis task and
to the MCPL tasks in a Brunswikian sense, since the participants have to make judgements on the
basis of the experienced relation between multiple cues and a distal object, just as the Lens model
describes. The standard analysis of the test proceeds by averaging correct responses in blocks of
10 trials and measuring the deviation of these mean values from chance level. A response counts
as correct on a particular trial if the selected outcome was the one more frequently associated with the
given pattern over the course of the whole experiment (Knowlton & Squire, 1996).
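The standard scoring just described can be sketched in a few lines. The pattern labels, the `optimal_outcome` mapping, and the toy response sequence below are hypothetical; only the scoring rule (match against the most frequently associated outcome, averaged in 10-trial blocks) follows the text.

```python
def score_trials(responses, patterns, optimal_outcome, block=10):
    """Score each trial as correct if the response matches the outcome
    most often paired with that pattern over the whole experiment, then
    average within successive blocks (10 trials in the standard analysis)."""
    correct = [int(r == optimal_outcome[p]) for r, p in zip(responses, patterns)]
    return [sum(correct[i:i + block]) / block
            for i in range(0, len(correct), block)]

# Hypothetical data: pattern 'A' is mostly followed by 'sun', 'B' by 'rain'.
optimal = {'A': 'sun', 'B': 'rain'}
patterns = ['A', 'B'] * 10
responses = ['sun', 'rain'] * 5 + ['sun', 'sun'] * 5  # performance drops later
print(score_trials(responses, patterns, optimal))  # [1.0, 0.5]
```

Block means above 0.5 indicate learning of the dominant cue-outcome contingencies, which is the deviation-from-chance measure the standard analysis tests.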
During the last decade, this task became extensively used in cognitive neuroscience
(e.g., Eldridge, Masterman, & Knowlton, 2002; Knowlton et al., 1994, 1996; Poldrack, Clark,
Paré-Blagoev, Shohamy, Creso Moyano et al., 2001; Reber, Knowlton, & Squire, 1996; Reber &
Squire, 1999). The exciting result that started this interest in clinical research was Knowlton et
al.'s (1994) finding in an initial trial of this task. They compared the performance of amnesiacs
with normal controls on the WP task. The two groups performed equally well
during the first 50 trials of learning; however, in the extended part of the training, the amnesiacs'
performance decreased relative to the healthy group (Knowlton et al., 1994). The impairment of
declarative memory in amnesiacs is often coupled with the finding of relatively intact non-
declarative learning (Milner, Corkin, & Teuber, 1968; Warrington & Weiskrantz, 1968). On the
basis of this neurological observation, Knowlton et al. (1994) interpreted the results as showing that the
performance of both groups was supported by non-declarative learning systems in the first
part of the task, whereas in late training the control participants began memorising the test, which the
amnesiacs could not (Knowlton et al., 1994; see also Gluck, Oliver, & Myers, 1996). In contrast,
patients with Parkinson's disease show a learning deficit in the first 50 trials of the test, which
continues throughout the training (Knowlton et al., 1996). A learning pattern similar to the
amnesiacs' was found with Alzheimer patients in the early stages of the disease. In both
cases the anterograde amnesic symptoms are connected to neurodegenerative processes in the
medial temporal lobes (Eldridge et al., 2002). Szabolcs Kéri and his colleagues (Kéri et al., 2000)
examined schizophrenic patients, who are well known to have abnormalities in executive
function and explicit memory, with the WP task. The results showed normal performance for the
schizophrenic patients compared to controls. These results suggested that the WP task is processed by
non-declarative neural systems.
The key cortical activations generally found to correlate with probabilistic
classification learning were in the occipital cortex and the right caudate nucleus (Kéri, 2003). This
corresponds with the observation that people with impaired basal ganglia have difficulties in
implicit learning tasks. Correct responses correlate positively with caudate and prefrontal
activation; however, the role of the prefrontal and parietal cortices in this task is not yet
understood (Fera, Weickert, Goldberg, Tessitore, Hariri et al., 2005). The abnormal functioning of
the basal ganglia is well documented in Tourette patients (e.g., Peterson, Leckman, Duncan,
Ketzles, Riddle et al., 1994). In a clinical study, the WP task was used with children with Tourette
syndrome (Kéri, Szlobodnyik, Benedek, Janka, & Gádoros, 2002). The children exhibited
impaired learning in the WP; however, in an explicit transfer version of the test they showed
normal learning (Kéri et al., 2002). A further study showed that transcranial direct current
stimulation of the left prefrontal cortex could improve implicit learning in the WP task in healthy
people (Kincses, Antal, Nitsche, Bártfai, & Paulus, 2003). Interestingly, while healthy people
show activation in the striatum and no activation in the MTL during learning in an implicit motor
sequence task, people with obsessive-compulsive disorder exhibited no activation in the striatum
and activation in the MTL (Moody, Bookheimer, Vanek, & Knowlton, 2004). Poldrack et al.
(2001) set out to study this interaction of the basal ganglia and the medial temporal lobe (MTL)
during probabilistic classification learning. The striking finding was that during the initial
part of the WP task the MTL was active while the caudate was inactive, but very soon the MTL
became deactivated (presumably inhibited), whereas the caudate nucleus became activated.
Poldrack et al. (2001) interpreted the results as the first substantive evidence of competition
between memory systems. They supposed that in the initial part of the task the two systems
(explicit vs. implicit) may compete, and as it turns out that the task demands implicit processing,
the MTL becomes inhibited (Poldrack & Rodriguez, 2004). This result supports the view that the
two systems are not imperviously dissociated, but are in constant interaction and are influenced by
common factors (e.g., McDonald, Devan, & Hong, 2004; Turk-Browne, Yi, & Chun, 2006).
The probabilistic associative learning (Gluck & Bower, 1988a,b) and probabilistic
classification learning (Knowlton et al., 1994) tasks were developed for concrete
computational and clinical analyses, but their main contribution to the field is that they provide an
experimental and analytic methodology for probability learning studies. These initial
examinations produced sufficient data and experience to allow the basic procedures
and methodologies to be reconsidered for further studies.
IV. Methodological considerations
It became apparent very early that probabilistic experiments bring about confusing
results if they are not measured with special attention. This section highlights the three main
problematic points of the field. Base-rate neglect is a phenomenon that has intrigued not only
researchers of learning; indeed, it seems that this field has the most elaborated explanations of the issue.
Strategy analysis of probability learning has no long history in the literature, but bears special
interest for this explorative study. All of these methodological problems are entangled with our lack
of insight into the questions of consciousness in these processes.
IV.1. Base-rate neglect
Gluck and Bower's (1988a,b) observation that, although the participants reflected the
experienced probabilities in their decisions during the training phase, they did not consider the base-
rate information in their decisions in the test part, calls for special attention. This result may seem
uninterpretable, yet it fits well with the literature on the base-rate fallacy. The dominant
research on judgement and decision making in the 1970s and 1980s was concerned with the
heuristics and biases paradigm (Koehler, 1993). This approach was developed by Daniel
Kahneman and Amos Tversky (1972), who elaborated the attractive theory that people's
intuitive judgements about probabilistic events are made via error-prone heuristics. In
their 1973 seminal paper, presenting empirical support for this view, they concluded that by the
[representativeness] heuristic, "people predict the outcome that appears most representative of the
evidence. Consequently, intuitive predictions are insensitive to the reliability of the evidence or to
the prior probability of the outcome, in violation of the logic of statistical prediction. [...] It is
shown that [...] people erroneously predict rare events and extreme values if these happen to be
representative" (Kahneman & Tversky, 1973, p. 237). As evidence in support of this theory
mounted, the "base-rate fallacy" (Bar-Hillel, 1980) became a favourite example of the
heuristics and biases paradigm. These results fostered a common view of human judgement
as genuinely biased and generally poor (e.g., Lopes, 1991). However, by the early 1990s,
converging evidence had made it apparent that the generality of the base-rate fallacy had been
overstated: base rates are not uniformly ignored (Koehler, 1994, 1996). The
question arises: if base rates are not always ignored, when are they likely to be used? One can find
two main streams of researchers answering this question. One emphasizes that relative frequencies
are represented better than single-event probabilities (e.g., Gigerenzer, 1991; Hoffrage,
Gigerenzer, Krauss, & Martignon, 2002); the other seeks the answer in the structure of the
tests (e.g., Koehler, 1996; Spellman, 1993).
Gigerenzer and colleagues (e.g., 1995) forcefully argue that the dominant heuristics
theory is flawed, because representations in terms of natural frequencies facilitate the
use of probability (or frequency) information better than conditional probabilities do (e.g., Hoffrage et
al., 2002). This conception comes from the empirical work generated by ecological views (e.g.,
Gigerenzer, 1996). Natural frequencies originate from natural sampling (Gigerenzer & Hoffrage,
1995), an automatic way of encountering statistical information in the natural environment
(Hoffrage et al., 2002). To give a concrete example of these concepts:
Natural frequencies:
Out of each 1000 patients, 40 are infected.
Out of 40 infected patients, 30 will test positive.
Out of 960 uninfected patients, 120 will also test positive.
Normalized frequencies:
Out of each 1000 patients, 40 are infected.
Out of 1000 infected patients, 750 will test positive.
Out of 1000 uninfected patients, 125 will also test positive. (Hoffrage et al., 2002, p. 346)
Thus, a probability is the natural frequency normalised to a reference class of fixed size. The
authors point to this as a reason for the fallacy, since the computation is simpler when natural
frequencies are provided than when normalised frequencies or probabilities are given (Gigerenzer & Hoffrage,
1995).
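The computational advantage Gigerenzer and Hoffrage describe can be checked directly on the example above: with natural frequencies the posterior probability of infection given a positive test is a single division, while the probability format requires the full Bayes' rule.

```python
# Hoffrage et al.'s example, computed both ways.

# Natural frequencies: 30 true positives, 120 false positives.
posterior_nf = 30 / (30 + 120)

# Probability format: base rate 4%, hit rate 75%, false-alarm rate 12.5%.
p_inf, p_pos_inf, p_pos_noinf = 0.04, 0.75, 0.125
posterior_pr = (p_inf * p_pos_inf) / (
    p_inf * p_pos_inf + (1 - p_inf) * p_pos_noinf)

print(posterior_nf, posterior_pr)  # both 0.2
```

With the frequency format the answer (30 of the 150 positive testers are infected, i.e. 0.2) can be read off almost directly, which is exactly the authors' point about why the format matters.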
Koehler (1996) advocates the view that the fallacy is brought about by the way in which
base-rate summary statistics are provided in typical base-rate tasks. He claims that people, given the
opportunity for implicit base-rate learning, will be more sensitive to probabilities and will make
greater use of base rates in final judgements. Previous studies reported that when base
rates were directly experienced, through trial-by-trial feedback, they seemed to be used more
accurately in judgements (Lindeman, Van Den Brink, & Hoogstraten, 1988; Manis, Dovalina,
Avis, & Cardoze, 1980; Medin & Edelson, 1988), in contrast to the method of presenting mere
summary statistics. Directly experiencing base rates has been found helpful, for example, for auditors
learning financial statement errors (Butt, 1988), or for physicians learning the relationship of base
rates and diagnostic information (Christensen-Szalanski & Beach, 1982). Koehler (1996) assumes
that directly experienced base rates may rely on implicit rather than explicit learning
systems, and that this is why they are better remembered, or more easily accessed, than information that is
learned explicitly. He offers a reason for this: since the implicit learning experience comes in
the form of trial-by-trial learning, the information in each trial may be encoded as a separate
"trace". In this way, multiple traces develop, and the information associated with these traces may
be cognitively available. This contrasts with the explicit learning of a single summary statistic,
which does not produce multiple traces and which has been associated with less accurate
judgements (Koehler, 1996). Other explanations hold that personally experienced
information is more vivid or salient, and thus more available (Brekke & Borgida, 1988), or that people are
more trusting of self-generated base rates (Ungar & Sever, 1989). This conception is consistent
with many observations in the category learning literature, where the probability matching strategy
indicates that people learn the experienced base rates and use them in their decisions (although not
optimally), while in the explicit test phase they show the base-rate fallacy (e.g., Estes et al.,
1989; Gluck & Bower, 1988a,b; Medin & Edelson, 1988).
Holyoak and Spellman (1993) considered the phenomenon and suggested that two
components lie behind base-rate usage: (1) acquisition, which, in a trial-by-trial format, proceeds
implicitly and quite accurately (perhaps based on learning conditional probabilities);
and (2) access, which (depending on the type of test) may be under explicit and conscious control.
Consequently, when both the acquisition and the access part of the test tap the implicit system, people
will rely on base rates better than when one of the phases is not implicit (Spellman, 1993).
In sum, we can conclude that the base-rate literature holds substantial relevance for the
study of probabilistic categorization. For a comprehensive understanding of the issue, we must
consider the aspects described above both in experiment construction and in data interpretation.
IV.2. Strategy analysis of PCL
In probabilistic classification learning people acquire information about cue-outcome
relations; the category learning literature therefore technically regards PCL as an information-
integration task (Ashby & Maddox, 2005). However, little is known about how people integrate
the observed cues. As will be argued below, a variety of different strategies are about equally
effective for solving the task.
It is a reasonable hypothesis that evolved animals and humans should be optimal in
categorization decisions (Ashby & Maddox, 1992). According to Thorndike's law of effect (1898),
the probability of successful trials will increase with time. Still, robust deviations from this law
have been observed since the 1950s; one popular instantiation is probability matching, first studied by
Estes (1950) (see Section II.2). Ashby and Gott (1988) examined performance on many
human categorization tasks, comparing it to an optimal classifier (a hypothetical device
maximizing reward, e.g., Morrison, 1990). The overall data showed that human classification
cannot be described as optimal. Decision bound theory (Ashby & Townsend, 1986) attributes
two inherent suboptimalities to the decision-making processes of humans (and all other organisms). Both
suboptimalities are by-products of the neural system (Maddox & Bohil, 1998). Perceptual
noise comes from spontaneous activity within the central nervous system; in addition, the
optimality of the cognitive system is limited by the observer's memory (criterial noise).
Deviation from optimality is observable at the strategy level of decision making as well.
Suboptimal strategies are also often found in the probability learning literature (Vulkan, 2000). The
known deviations from optimality in human decision making still await a proper explanation;
however, one can find at least three distinct effects attributed to the phenomenon in previous
work. One set of explanations is classified as payoff variability effects (Busemeyer & Townsend,
1993; Haruvy & Erev, 2001). This theory claims that an increase in payoff variability moves
choice behaviour towards random choice (Erev & Barron, 2005). The second set of explanations
is classified as underweighting of rare events (Barron & Erev, 2003). In these situations people
tend to rely on the typical outcomes that have the best payoff. The third set of explanations
involves loss aversion (Kahneman & Tversky, 1979). This counterproductive tendency is observed in
stock markets, where people tend to avoid any loss. But in probabilistic cases, the
choice of the less probable alternative decreases the overall chance of maximal profit (e.g., Gneezy
& Potters, 1997). These three cognitive strategies may each seem reasonable from some point of view,
but their negative by-product is deviation from the maximization of gain.
Recently it has become apparent that the WP task is solvable by a range of different
strategies. Gluck, Shohamy and Myers (2002) presented post-hoc analysis techniques from which
they deduced that participants may use at least three different strategies. Post-experiment
questionnaires revealed that most of the participants believed themselves to use one of the following strategies:
(1) the optimal multi-cue strategy, in which they respond according to the outcome probability of each
combination of the four presented cues; (2) the one-cue strategy, in which they respond on the basis of
the presence or absence of only one cue; and (3) the singleton strategy, in which they
focus on and learn only about the patterns in which a single cue is present. Only the first strategy
is optimal in a normative sense, but all of these strategies can raise performance above
chance level. Based on these reports, Gluck et al. (2002) developed a strategy analysis method for
the WP task, describing an ideal judgement profile for each of these strategies. This method
identified the learning strategies applied by individuals in two follow-up experiments. The results
showed that 90% of the participants in Experiment 1 and 80% of the participants in Experiment 2
fitted the singleton criterion; however, when the task was broken into 50-trial blocks, a gradual shift
towards multi-cue strategies was observable (Gluck et al., 2002). Strikingly, there was little
correspondence between the explicit self-reports and the strategies the individuals actually used.
Recently, Lagnado and colleagues (in press) introduced an alternative to Gluck's multi-cue
strategy: the multi-match strategy supposes that people match the underlying probabilities of the
two outcomes with their predictions. As we will see later, these simple heuristics
represent only a few of the plausible options that may be involved in probabilistic decision-
making situations.
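The logic of Gluck et al.'s (2002) strategy analysis, matching each participant's response profile against ideal strategy profiles, can be sketched as follows. The three-pattern task and the ideal profiles here are illustrative toys, not the actual 14-pattern profiles used in their study.

```python
def best_strategy(observed, ideal_profiles):
    """Pick the strategy whose ideal judgement profile is closest
    (least squared deviation) to the observed per-pattern response
    proportions, in the spirit of Gluck et al.'s (2002) analysis."""
    def score(name):
        ideal = ideal_profiles[name]
        return sum((observed[p] - ideal[p]) ** 2 for p in observed)
    return min(ideal_profiles, key=score)

# Hypothetical toy task: values are P('sun' response | pattern).
profiles = {
    'multi-cue': {'p1': 1.0, 'p2': 0.0, 'p3': 1.0},
    'one-cue':   {'p1': 1.0, 'p2': 1.0, 'p3': 0.0},
    'singleton': {'p1': 1.0, 'p2': 0.5, 'p3': 0.5},
}
observed = {'p1': 0.9, 'p2': 0.4, 'p3': 0.6}
print(best_strategy(observed, profiles))  # 'singleton'
```

In the actual method, profiles are defined over all 14 cue patterns and the fit is recomputed per block, which is what allowed Gluck et al. to track the shift from singleton towards multi-cue responding.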
The fact that good performance can be achieved by explicit memorisation of heuristic
strategies (e.g., the one-cue or the singleton strategy) makes the implicit nature of the test quite
questionable. As argued in this study, besides finer strategy-analysis methods, a better
alternative to the WP task might prevent the method from being susceptible to identifiability
problems.
IV.3. The question of consciousness
Most previous studies of probabilistic classification learning assumed that decision
makers lack self-insight into the judgemental policies underlying their judgements, and
therefore regarded the test as a purely implicit learning test (e.g., Evans, Clibbens, Cattini, Harris, &
Dennis, 2003; Gluck et al., 2002; Wigton, 1996; York, Doherty, & Kamouri, 1987). This view was
also supported by those who emphasized the implicit nature of experience-based learning tasks
(e.g., Spellman, 1993). Although the question is important, the relation between people's learning
performance and their knowledge has received less attention.
Although the thought that learning can be based on separate conscious and non-conscious
systems is detectable in some early theories of learning (e.g., Tolman, 1932), systematic
research did not get under way until the late 1960s (Reber, 1967, 1969). Reber's concept of
implicit learning holds that people acquire information from the environment without intending
to do so (Cleeremans, Destrebecqz, & Boyer, 1998). This thesis made it possible to study
unconscious phenomena in cognitive psychology without relying on any psychoanalytic
conception. In a later paper, Reber (1992), taking an evolutionary standpoint, argued that
consciousness is a novel phenomenon that evolved after many higher perceptual and cognitive
processes. He stated four hypotheses about implicit mechanisms: (1) they are robust in the face of
psychological and neurological disorder; (2) they are independent of IQ; (3) they are independent of age;
and (4) they show little variance across populations. Others derive from this view that implicit learning
requires little effort and is often accurate, even optimised, compared to explicit ways of learning
(e.g., Holyoak & Spellman, 1993). The PCL task is one of the few tests that can be used to assess the validity of these assumptions.
A large body of research has documented apparent dissociations with the WP task and
taken them as evidence of the separation of the two learning systems (Ashby, Ell, & Waldron, 2003;
Knowlton et al., 1996; Reber & Squire, 1999; Squire, 1994). The two systems were also
demonstrated in neuropsychological studies arguing that they are differentially
impaired in certain clinical cases (Ashby et al., 2003; Knowlton et al., 1996; Poldrack et al.,
2001).
David Shanks is one of the few researchers who questions the need for any implicit
concept in the explanation. In a voluminous paper with a colleague (Shanks & St. John, 1994), he
reviewed the implicit learning literature according to their sensitivity criterion. This criterion
holds that before terming a learning behaviour implicit, one has to rule out the possibility that
the explicit test was simply insensitive. Shanks and St. John (1994) argue that explicit tests may fail
to measure the explicit processes that occurred during learning for two reasons: retrospective
questionnaires can distort the validity of the assessment because of memory
constraints and because of possible interference. Taking this criterion seriously, they found no
previous study in which implicit learning was satisfyingly demonstrated.
More than a decade later, he and his colleagues pointed to further methodological
shortcomings of the field (Lagnado et al., in press). First of all, they emphasized the need to
distinguish between someone's insight into the task (task knowledge) and someone's insight into
his or her own judgemental processes (self-insight). In the case of the WP task, this is the
difference between the learner's knowledge of the cue-outcome relations and of how to use this
knowledge to predict the outcome. Lagnado et al. (in press) conjecture that the two might be
separate, although previous research has mistakenly run the two together. Thus, the proclaimed
dissociations could have referred to a dissociation between insight and learning, between task knowledge
and learning, or both (Lagnado et al., in press). A further problem of explicit testing was added to
the list by the reasonable claim that the difficulty of verbalising probabilistic inferences might be a
natural obstacle to a valid report.
Recently, Lagnado et al. (in press) conducted a series of experiments to test these
claims. They used strongly and weakly predictive cards in the basic WP task. In Experiment 1, to
measure knowledge and insight, participants were given explicit questions after each block of
50 trials. One of the two sets of questions asked the participants to rate, on a continuous
scale, the probability of the outcome (rainy vs. sunny weather) for each individual card
(measuring task knowledge). The other question asked the participants how much they had
relied on each card in making their decisions (self-insight); this rating was also registered on a
similar continuous scale. The results indicated a strong correspondence between performance and
both task knowledge and self-insight, all of which were accurate.
In Experiment 2 they asked the participants similar explicit questions on the screen
during the same WP task. The difference was that the questions about how much they relied on each card
were presented after each trial. The results revealed that from early on in the
task, people rated strong cards as more important than weak ones. The authors concluded that
participants developed insight into their cue usage relatively early in the task.
In Experiment 3 they tested whether the explicit questions after each trial directed
conscious attention to the task, thus biasing the characteristics of learning performance. In the
final results there were no changes in any measurement of the third experiment compared to the
second one. These findings strongly supported the authors' doubt that the WP is a purely implicit
task.
Lagnado and colleagues' (in press) argument gains credence from these findings, but the
appealing results supporting the existence of an implicit mode of learning, as well as the lack of
consensus in analysing the test results, leave us without a satisfactory answer to the question.
IV.4. Analysing the test results
The standard analysis of the PCL task follows a simple procedure. It computes a mean
percentage of correct responses for the whole task by averaging across both trials and
participants. The aim of this analysis was mostly to develop categorization models (e.g., Gluck
& Bower, 1988a,b) or to study clinical groups (e.g., Knowlton et al., 1994, 1996). While the test
became popular in these research fields, this analysing method tells little about the
process of probability learning. It is insensitive to both individual
differences and the dynamic processes unfolding over the course of the test.
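The standard summary described above can be stated in a few lines. The following is a minimal sketch (the array names are illustrative, not taken from any published analysis script):

```python
import numpy as np

def mean_hit_rate(correct):
    """Standard PCL summary: mean percentage of correct responses,
    averaged across both trials and participants.

    correct: (n_participants, n_trials) matrix of 1 (hit) / 0 (miss).
    Returns a single percentage for the whole task.
    """
    return 100.0 * np.asarray(correct, dtype=float).mean()

# Two hypothetical participants, five trials each:
hits = [[1, 0, 1, 1, 1],
        [0, 1, 1, 0, 1]]
score = mean_hit_rate(hits)   # 7 hits out of 10 trials -> 70.0
```

Note how the computation collapses every trial and every participant into one number, which is exactly the insensitivity criticized above.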
Probabilistic classification learning tasks were developed from the multiple-cue
probability learning paradigm; thus, although unacknowledged, their basic structure reflects the
Brunswikian Lens model (for details see Section II.1.1.). This framework is applicable to every
judgemental process in which the decision maker has to rely on environmental cues. The original
theory was based on the view that people construct internal cognitive models that reflect the
probabilistic properties of the environment (Doherty & Kurz, 1996; Lagnado et al., in press). The central tenet of this approach was to analyse individual judgemental processes prior to computing
group averages. Individual judgemental policies can be examined by considering the relation between
the given cues and the patterns of judgement (for an overview see Cooksey, 1996). More specifically,
by computing a multiple regression analysis of the judgements on the cue values across all the trials
of the task, we can obtain the cue-utilization weights from the resultant beta coefficients. Simply
put, these are the weights that the individuals have given to each cue during their judgements.
Once the judges' policy models are obtained, they can be compared with the actual structure of the
environment. This is achieved by computing a parallel multiple linear regression for the
environmental cues. Here the beta coefficients will be the objective cue weights, which are the
same for all participants exposed to the same task. This way, the judgement policies (via their cue-utilization
weights) become comparable with the objective weights. The analysis eventually reveals
how well the individuals learnt the task environment. The method can be applied in a parallel
way to the assessment of the explicit judgements as well, measuring task knowledge and
self-insight.
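The regression step described above can be sketched in a few lines of code. This is only an illustration of the idea, assuming ordinary least squares and invented cue weights; it is not the analysis script used in any of the cited studies:

```python
import numpy as np

def cue_weights(cues, responses):
    """Estimate cue-utilization weights by multiple linear regression.

    cues: (n_trials, n_cues) 0/1 matrix of cue presence per trial.
    responses: (n_trials,) vector of judgements (or outcomes).
    Returns one beta coefficient (weight) per cue.
    """
    X = np.column_stack([np.ones(len(cues)), cues])   # add intercept
    beta, *_ = np.linalg.lstsq(X, responses, rcond=None)
    return beta[1:]                                   # drop the intercept

# Illustrative environment: 200 trials, 4 binary cues, cue 0 strongly
# and cue 1 weakly predictive (weights chosen arbitrarily here).
rng = np.random.default_rng(0)
cues = rng.integers(0, 2, size=(200, 4))
outcomes = cues @ np.array([0.8, 0.1, 0.1, 0.3]) + rng.normal(0, 0.2, 200)

objective = cue_weights(cues, outcomes)   # objective cue weights
# A judge's policy model uses the same function on their judgements:
# subjective = cue_weights(cues, judgements)
```

Running the same function once on the outcomes and once on each participant's judgements yields the two sets of weights that the Lens-model comparison places side by side.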
Brunswik's Lens model and its later developments provided a useful framework for
studying judgemental processes, but their shortcomings prevent us from obtaining a detailed picture
of the dynamics of the process, from the perspective of both the environment and the decision maker.
Regarding the WP task, averaging performance across all trials not only
ignores the possibility that one may vary his or her subjective weights over the trials, but also overlooks
the fact that the individual cannot obtain a representative picture of the probabilistic structure
early in the task. In fact, the observed probability of an outcome changes from trial to trial and
reaches the final (given) value only after the last feedback (so it is actually never
measured). Consider the structure of the following Figure 4. It depicts the binary decisional tree in
PCL. Going downwards from the top of the tree, one can observe the percentage of the normative
expectancy of one of the feedbacks in each step.
Figure 4.
Binary decisional tree in WP. Going downwards from the top of the tree, the numbers on the
diagonal lines indicate whether the correct response was set to be sun (1) or rain (0). The numbers (x)
inside the rectangles represent the percentage of the normative expectancy of sun feedback at each given step (for rain the values are 100 - x).
In practice, the outcome patterns are (quasi-)randomly distributed (with fixed overall
probabilities) in the WP experiments. The illustration (Figure 4.) shows that, although the
final percentage values are the same, each participant might observe different probabilities on the
preceding trials. This makes the group-averaging technique coarse at best, if not inadequate.
Furthermore, on each of the 14 trials that first present a particular pattern, participants have no previous
knowledge about the outcomes at all, so their responses cannot be the result of learning.
It is also plausible that people have no perfect memory for all the observed stimuli,
so recency may affect the decisions. People in experience-based learning situations have
to update their impressions according to the newly sampled outcomes (Hogarth & Einhorn, 1992).
Recently presented outcomes may carry greater weight than earlier ones (e.g., Hertwig, Barron,
Weber, & Erev, 2004). Moreover, individual memory constraints may vary with respect to the number
of samples considered in decisions (e.g., Jones, Love, & Maddox, in press).
In view of these facts, the conclusion is inevitable that a more sensitive analysing method is
required for a precise description of probability learning, together with a coherent framework for modelling the dynamics of the process.
V. The Dynamical approach
A more sensitive methodology for this analysis is provided by the dynamical
approach. The dynamical approach to cognitive science rejects the idea that cognition is the operation
of a special mental computer located in the brain; rather, it provides a framework for understanding
cognitive processes by treating cognition as a complex natural system (van Gelder & Port, 1995). One of the two main tenets of the Dynamical Hypothesis (van Gelder, 1998) that distinguishes it
from traditional views of cognition is its primary focus on processes in real time. Contrary
to the computational perspective, the main aim of the approach is to describe behaviour in its temporal
course. Instead of the input-output relation, its concern is the changing of the overall system
over time. In contrast to the computer-storage analogy, followers of the dynamic view reinforce the
common psychological view that we are not passive recipients of information but rather
actively manipulate, reconstruct, and bias it (MacLeod, Uttl, & Ohta, 2005). The other key aspect
of the approach is an emphasis on total state. Total state refers to the conjunction of all aspects of
the system at a given point in time (Beer, 2000; Bosse, 2005).
V.1. Dynamic Models of Cognition
The approach has proved expedient for theories of decision making. Busemeyer and
Townsend (1993) were among the first to publish a usable dynamic model of high-level
cognitive processing. Their study provided a new perspective for understanding the relations among decisional
models. They classified decision-making models along two attributes: deterministic
versus probabilistic, and static versus dynamic (1993). This dynamic-cognitive decision field
theory, in contrast to the earlier dominant deterministic or static theories, successfully
accounts for many time-varying aspects of the phenomenon and can offer a more detailed, process-oriented
explanation of the motivational and cognitive mechanisms of decision making (Port, 2000).
Before we exult over finding a revolutionary new alternative to the traditional theories
of cognitive science, we should note not only the long-established presence of its components
but also the considerable limitations of its current applicability. The general framework was
developed to study biological and cognitive systems in their evolution, but unfortunately in
many fields there is not much realistic prospect of empirically testing it (French & Thomas, 2001). In most cases of cognitive processing, transitions are extremely rapid and the variables
are exceedingly numerous and hardly detectable. Except for a few carefully constrained simple
cases, we have insufficient amounts of data and inadequate methods of computation compared with
what these analyses would require (Port, 2000).
V.1.1. Static and Dynamic Models of Learning
Still, one area where dynamic analyses may provide stronger descriptive and predictive power in modelling cognitive processes is the modelling of learning processes. As
discussed before, learning is a process in which we change our predictions about the environment
on the basis of new experiences. This process is an evolving product of changing factors over the
course of time (Smith et al., 2004). In typical experiments, learning curves can be documented by
recording responses (decisions) in multiple-trial tasks. Multiple-trial tasks provide
continuous sampling of the changing phases of the process. In these studies (usually with binary
responses), stimuli can be associated deterministically or probabilistically with reinforcement.
Following Busemeyer and Townsend's (1993) categorization of decision theories, I propose here
an outline classification of human learning models according to the deterministic versus
probabilistic and static versus dynamic attributes, as in Table 1.
Table 1
Categorization of Learning Models

Category        Static                           Dynamic
Deterministic   Classical/Operant Conditioning   Skill Learning
Probabilistic   Rescorla-Wagner model            Dynamic Probability Learning model

Note. The matrix depicts a hypothetical categorization of learning models according to the
deterministic-probabilistic and static-dynamic axes.
A model is static when it regards and measures learning not in its course, as
dynamic models do, but as a property of the system at a certain point in time. Traditional
conditioning theories therefore fall in this block, but most standard tests also belong here. Regarding the
other attribute, a model is deterministic if the stimulus is always, or never, associated with the
response, while in probabilistic theories the stimuli are reinforced according to a varying probability
distribution. In this sense the Rescorla-Wagner model (1972) is static, because it does not account for
the changing factors from trial to trial. Considering these aspects, the study of probability learning
requires the techniques of a dynamic learning model.
V.2. Dynamical analyses
Two techniques have been published recently to monitor learning behaviour over time.
The rolling regression analysis (or sequential least-squares technique) was introduced to illuminate
individual behavioural differences in price forecasting (Kelley & Friedman, 2002). This method
computes a series of regressions over a moving window, generating trial-by-trial estimates of the
individual's responsiveness to the observed cues. The learning curve is then compared with the
curve that an ideal learner would show on the same task. The method takes into account the trial-by-trial
information that the participant has actually observed; thus, ideal learners' strategies are defined
according to the current state of knowledge of the ideal observer at each trial. This technique
makes it possible to examine decisional attitudes individually along the course of the experiment
and provides a tool to compare learning performance with ideal learners following various strategies
(e.g., Kitzis, Kelley, Berg, Massaro, & Friedman, 1998; Lagnado et al., in press). The other,
explicitly dynamic statistical analysis of learning is the state-space model paradigm (Smith et al.,
2004). This model computes the probability of a correct response at each state of the learning
process by maximum likelihood, applying expectation-maximization algorithms. Knowing the
learning curve and its confidence intervals permits us to identify the first trial on the curve at which
an individual performs better than chance level (Smith et al., 2004). This technique gives a precise
definition of learning and a coherent statistical framework for learning studies with binary
responses.
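The moving-window idea behind rolling regression can be illustrated compactly. The sketch below is a simplified stand-in for the sequential least-squares technique, not the published implementation, and the learner data are invented:

```python
import numpy as np

def rolling_regression(cues, responses, window=20):
    """Trial-by-trial cue weights from a moving-window least-squares
    regression (a sketch of the sequential technique).

    cues: (n_trials, n_cues) matrix; responses: (n_trials,) vector.
    Returns one weight vector per window position.
    """
    weights = []
    for t in range(window, len(responses) + 1):
        X = np.column_stack([np.ones(window), cues[t - window:t]])
        beta, *_ = np.linalg.lstsq(X, responses[t - window:t], rcond=None)
        weights.append(beta[1:])               # drop the intercept
    return np.array(weights)

# Hypothetical learner who first follows the wrong cue, then switches
# to the predictive one halfway through the task:
cues = np.array([[i % 2, (i // 2) % 2] for i in range(100)], dtype=float)
responses = np.where(np.arange(100) < 50, cues[:, 1], cues[:, 0])
w = rolling_regression(cues, responses)
# Early windows load on cue 1, late windows on cue 0, so the weight
# trajectories expose the strategy shift that a single overall
# regression would average away.
```

Plotting each column of `w` against the trial index yields exactly the kind of trial-by-trial weight trajectory that the rolling-regression analyses in the Results section rely on.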
VI. Experiments
The present explorative study aims to analyse what decisional strategies are applied in probabilistic learning situations. The question was examined with some modifications of the standard
PCL task. In addition, two novel statistical analyses, rolling regression and the state-space model, were
applied to the data.
First of all, I used a new version of the usual PCL scheme (see Sections III.1. and III.2.). As
reviewed in Section IV.2., a variety of different strategies are all about equally effective for solving the WP
task. The ambiguity springs from the perceptual design of the four cues. In this task the four cues are four
geometric forms whose combinations are the basis of the judgements. As Gluck et al. (2002)
demonstrated, applying the singleton strategy (focusing only on the patterns where
a single cue is present) can lead the participant to performance as good as the multi-cue
strategy (where all the cue combinations are considered). For example, if a participant associates every
single triangle with rain and guesses randomly on the other trials, he may reach a good score over the whole
experiment even though no probability learning has occurred. To prevent this complication I used a
PCL task different from the WP. The experimental setup was adapted from Shohamy and colleagues (Shohamy et
al., 2004), who constructed a design that kept the basic structure of the WP but used less detached
cues. As described in detail below, the participants had to make guesses on the basis of the features
(cues) of a toy figure (hat, moustache, etc.) (Figure 6.).
Another modification was that the sequence of trials and feedbacks was set to be identical
for each participant. As noted in Section IV.2., the random distribution of the patterns (with fixed
overall probability) limits the possibilities of individual comparison. With fixed pathways of the patterns
in the decisional tree, the order of the stimuli and outcomes became identical for all participants. This
made the data usable for an aggregated trial-by-trial evaluation within the group and between
individuals.
To measure performance differences after different feedback pathways, several distinct
pathways led to some of the final probabilities. Figure 5. demonstrates that more than one pathway
leads to the final probability values of 33.3, 66.7, 25, 75, 16.6, 83.3, and 50. For some of
the pathways with a common final point, one answer was more probably correct in the beginning
but later the alternative answer became dominant, while in other cases one answer was
dominant over the whole course of trials. The fixed-pathway method also permits us to compare
learning differences at the same final point after different paths.
Figure 5.
Feedback pathways. The 14 pathways following the arrows represent the feedback sequences of
the 14 patterns. Numbers (x) in the rectangles indicate the percentage of the normative expectancy
of vanilla feedback at each given step (for chocolate the values are 100 - x).1
1 Constructed by Dénes Tóth.
In practice, the final value that represents the normative overall probability of a pattern
can be measured only after the last feedback (on previous practice see Section IV.4.).
Therefore, I inserted one extra trial of each pattern (14 in total) with no feedback after its last
presentation. These extra trials provide data for measuring learning at the final point.
In Section IV.1. we saw that experience-based learning is assumed to rest on
implicit rather than explicit mechanisms. Ignoring this led previous studies into confusion
about the explanation of the learning processes in this task (see Section IV.3.). A large body of
researchers emphasized that the misinterpretation of the data originates from the malpractice of
testing experimental learning with explicit assessments. To avoid this, I added an implicit test
phase following the learning phase. This phase was identical to the learning phase except that the
participants did not get feedback after the trials but were told that they would receive their results at the
end of the session. The advantage of this extra part is that it tests the subjects in a situation similar to
the learning phase, and since there is no feedback, the observed probability remains unchanged,
so we can compute mean scores from the sequences of test trials.
To measure the participants' explicit task knowledge, the third part of the experiment was
an overt test of the probability value they associated with each pattern. The explicit test had to
follow the other tests so as not to interfere with them.
VI.1. Methods
VI.1.1. Participants
Forty-five undergraduate students from the Psychology Institute of ELTE, Budapest,
participated in the present study (mean age = 23.46 years; SD = 3.77 years). There were 15 males
and 30 females. The participants were divided into two groups: a baseline and a dual-task group.
Twenty-eight participants were in the baseline group (n = 28; 6 males and 22 females; mean
age = 22.00 years, SD = 2.87 years). The dual-task group consisted of 17 participants (n =
17; 9 males and 8 females; mean age = 25.88 years, SD = 3.92 years). The
participants received course credits.
VI.1.2. Materials
A modified version of the PCL task (as introduced by Shohamy et al., 2004) was
used in this study. In this version participants are told that they are selling ice cream in an ice
cues present (1111) and with no cue present (0000) were never used (following the scheme of the
previous works since Knowlton et al., 1994).
Table 2
Note. Each card could be present (1) or absent (0) for each pattern. The all-present (1111) and all-
absent (0000) patterns were never used. The overall probability of vanilla outcome for all patterns
is 50%.
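The pattern set described in the note follows directly from its construction rule. A short sketch of how such a set could be generated (the function name is illustrative):

```python
from itertools import product

def make_patterns(n_cues=4):
    """All presence/absence combinations of the cues, excluding the
    never-shown all-absent (0000) and all-present (1111) patterns."""
    return [p for p in product((0, 1), repeat=n_cues)
            if 0 < sum(p) < n_cues]

patterns = make_patterns()   # 2**4 - 2 = 14 usable patterns
```

With four cues this yields exactly the 14 patterns used throughout the learning and test phases.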
Additionally, two digital photographs of vanilla and chocolate ice creams and an image
of the tip jar that showed the extra tips were used on the screens. All the materials were presented
on a computer screen with an identical black background.
Using the 14 stimulus patterns, 214 trials were constructed for the learning phase and 70 for the test phase. For the explicit test, 14 PowerPoint slides were created using the 14 patterns.
During the learning phase the two feedbacks (vanilla, chocolate) were equally probable, but each
pattern was independently associated with one of the feedbacks according to the scheme of Table 2.
Two hundred of the 214 learning trials followed the pathways of Figure 5.; the remaining 14 trials were not
associated with feedback.