
8/13/2019 Aczel - Strategy Analysis of Probability Learning

Strategy analysis of probability learning

Balázs Aczél

Psychology Institute,
Eötvös Loránd University
Budapest, Hungary

MA Thesis
(Psychology Course)
2005/2006 Spring Term

Supervisor: Mihály Racsmány, PhD


Contents

I. Introduction
II. The origins of probabilism in psychological research
  II.1. Egon Brunswik
    II.1.1. Brunswik's Lens model
    II.1.2. Brunswik on learning
    II.1.3. Multiple cue probability learning
    II.1.4. The expansion of Brunswik's work
  II.2. Estes and probability matching
III. New interest in probability learning
  III.1. Probabilistic associative learning
  III.2. Probabilistic classification learning
IV. Methodological considerations
  IV.1. Base-rate neglect
  IV.2. Strategy analysis of PCL
  IV.3. The question of consciousness
  IV.4. Analysing the test results
V. The dynamical approach
  V.1. Dynamic models of cognition
    V.1.1. Static and dynamic models of learning
  V.2. Dynamical analyses
VI. Experiments
  VI.1. Methods
    VI.1.1. Participants
    VI.1.2. Materials
    VI.1.3. Apparatus
    VI.1.4. Procedure
    VI.1.5. Data collection
VII. Results
  VII.1. Experiment 1
    VII.1.1. Hit rate
    VII.1.2. Rolling regression
    VII.1.3. State-space model
    VII.1.4. Test phase and explicit measures
  VII.2. Experiment 2
    VII.2.1. Hit rate
    VII.2.2. Rolling regression
    VII.2.3. State-space model
    VII.2.4. Implicit and explicit measures
  VII.3. Summary of results
VIII. Discussion
  VIII.1. Rationality under uncertainty
  VIII.2. The power of statistical learning
  VIII.3. From duality to multiple systems
IX. Conclusion
References
Appendix


"God has afforded us only the twilight of probability; suitable, I presume, to that state of mediocrity and probationership he has been pleased to place us in here."

John Locke, 1690

I. Introduction

All of us spend our lives learning, yet our goal of understanding its basic processes remains unaccomplished. Learning theory is a critical field of psychological investigation because most human behaviour involves some form of learning (Robertson, 1970). Learning may be defined as a process in which behavioural patterns are changed as the result of experience (Kelso, 1997). Learning is an adaptive process of our cognition for predicting the future on the basis of past experience. In an uncertain world this prediction can rely only on probabilistic relations in the environment (Lagnado, Newell, Kahan, & Shanks, in press). Therefore, understanding how people learn probabilistic information from experience is a fundamental question of human behaviour.

In this explorative study I concentrate on four basic aspects of probability learning. First of all, throughout my discussion I rely on Brunswik's theoretical framework concerning the probabilistic nature of human psychology. Secondly, I wish to provide a critical review of the methodology used in previous studies of probability learning. Thirdly, I am motivated to examine the role of conscious and unconscious processes behind the applied decisional strategies. Finally, I consider learning to be a dynamic process and find the dynamical approach and methodology to be relevant for the investigation. In the experiment I demonstrate how these aspects of probability, consciousness and dynamic processes play an essential role in probability learning.

The research question intends to explore what decisional strategies we use in experiment-based probability learning situations. The modified experimental task is suitable for exploring the decisional strategies used. The employed methodology provides the field with useful tools for investigating simultaneous processes underlying learning behaviour. The results may yield direct implications for understanding the well-known suboptimalities of human learning and decision making. The work as a whole may raise new questions about the interaction of the association- and rule-based processes behind human learning.

I gratefully acknowledge Dénes Tóth's collaboration in the planning, execution and analysis of the experiments reported. Special thanks for the careful revisions to Tamás Makány and for the linguistic check of the manuscript to James Wason from Cambridge.


II. The origins of probabilism in psychological research

Probabilism in psychological research started with the argument of Egon Brunswik (e.g., 1939) that the relations between organisms and their environments should be described in probabilistic terms. Brunswik's scientific work was consistently provocative. He believed that psychology needed a revolutionary turn to understand behaviour with regard to its function and in terms of probabilistic relations. Although the impact of his work did not live up to his original aim of changing the mainstream of the field, it is becoming apparent that his thoughts were basically right.

In this chapter I wish to describe some of the views and works of Egon Brunswik, because his conceptions of perception, learning and experimental design serve as a theoretical background to the methodologies discussed in this work. I also give a short description of the contribution of William Estes, who initiated research on modelling probability learning in mathematical terms. Contemporary conceptions in probability learning studies originate directly from these two approaches.

II.1. Egon Brunswik

Brunswik, although born in Budapest, began his career in psychology as the first assistant of Karl Bühler in Vienna (Doherty & Kurz, 1996). Under the auspices of Bühler, Brunswik's views turned against the popular psycho-physical parallelism, and this attitude determined his later theories. He began to argue in Vienna, and continued to do so in Berkeley, that both incoming perception and outgoing behaviour have a rather ambiguous nature. In his view, the probable partial causes and probable partial effects of behaviour should be the focus when we wish to understand the great compatibility between organism and environment (Hammond, 2001). Brunswik meant, from an evolutionary standpoint, that in natural environments survival is possible only if the organism is able to establish "compensatory balance in the face of comparative chaos within the physical environment" (Brunswik, 1943, p. 257). In that physics-envying period of psychology, his concept of probable behaviour in a somewhat unpredictable environment stood in sharp contrast with the mainstream thinking that sought stability in laws and research on behaviour. With his Lens model, which was intended to describe this compensatory balance of the organism amid the inherent uncertainty within the environment and within the person, he went against the dominating determinism of his time. Indeed, Brunswik's belief was that the probability character of the causal (partial cause-and-effect) relationship in the environment "calls for a fundamental, all-inclusive shift in our methodological ideology regarding psychology" (Brunswik, p. 261). These harsh words, according to Hammond, Brunswik's student and follower, established a distance between Brunswik's views and those of Hull, Lewin and many generations of future experimental psychologists that is still hard to overcome (Hammond, p. 56).

II.1.1. Brunswik's Lens Model

In the domain of perceptual constancy, Brunswik's first research was conducted with Lajos Kardos (Brunswik & Kardos, 1929) on Bühler's duplicity principle. This principle opposed the view that context has only a modifying effect on perception that comes into account only after the object; instead, Bühler and his students stated that context is always present and never subordinate in perception (Brunswik, 1937). This reconceptualisation of context served as a crystallisation point for his ideas, which led to his early lens analogy and his later generalised lens model (Cooksey, 2001).

The analogy of the doubly convex lens is a heuristic tool that he conceived as "a composite picture of the functional unit of behaviour", or the unit of achievement (Brunswik, 1952, p. 19-20). This tool was meant to help the researcher in structuring the investigation of organismic achievement. The explanation of the model contains another metaphor, the "intuitive statistician", which depicts the perceptual system as being equipped with latent capacities capable of basic statistical functioning in the uncertainty of the environment (Brunswik, 1956, p. 80). The cues in the environment are only probabilistically related to the objective of the individual. In that sense, the decision maker has only probabilistic information about the environment, and also about how to utilize these perceived cues. During judgemental processes the decision maker relies on these environmental cues to attain his/her goal. Not all of the cues may have equal relevance for predicting the outcome of the decision, but according to this view all of them are taken into account (cf. Brunswik's views on context as additional mediating data (1937)). The decision maker uses his/her memory of cue-outcome correlations from previous experiences. In Brunswik's view, this statistical processing occurs involuntarily. A further principle needed to understand the Lens model is his principle of parallel concepts (Doherty & Kurz, 1996). This principle states that the perceived environment and the cognitive system should be described by the same type of constructs. These thoughts gave the basic theory for the construction of the Lens model. Brunswik (1952) thought that with correlation statistics these constructs become measurable.

Figure 1. Illustration of the Lens model. The objective weights (w) between the environment (Y_E) and the cues (X) and the subjective weights (j) between the cues and the judgement (Y_S) are parallel concepts. The functional achievement (r_a) is the correlation between the person's judgement and the ecological criterion value. (Based on Cooksey, 1996.)

Thus, as Figure 1 depicts, ecological validity is defined as the correlation between the values of the distal criterion of the environment and the perceived cues; cue utilization validity is defined as the correlation between the values of the perceived cues and the individual's judgements; finally, achievement is measured by the correlation between the values of the distal criterion of the environment and the individual's judgements. Achievement, as the most general measurement of the model, reflects Brunswik's broadest descriptive term, probabilistic functionalism. In this terminology, achievement is the degree to which the organism successfully attains its goals (Doherty & Kurz, 1996). This conception makes an apparent distinction from most other traditions in psychology. Investigators of human behaviour almost exclusively define the correctness of human performance in comparison with some kind of normative model (mostly adapted from statistics or logic). Unlike those applying such coherence standards of assessment (e.g., investigators of heuristics and biases), users of the Lens model look for correspondence in performance. In their words, successful performance (i.e. achievement) "[is] the degree to which a person's responses agree with, or correspond with, the environmental events" (Doherty & Kurz, p. 123).
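The three Lens model correlations are straightforward to compute. The sketch below illustrates the definitions on simulated data; the weights, noise levels and variable names are hypothetical choices of my own, not taken from Brunswik, and only the correlational definitions follow the text.

```python
# A minimal sketch of the three Lens model correlations on simulated data.
# The weights and noise levels are hypothetical; only the definitions of
# ecological validity, cue utilization and achievement follow the text.
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Environment: a distal criterion Y_E probabilistically generates three cues X.
Y_E = rng.normal(size=n)
objective_w = np.array([0.8, 0.5, 0.2])                 # "objective weights" w
X = Y_E[:, None] * objective_w + rng.normal(scale=0.7, size=(n, 3))

# Person: a judgement Y_S formed from the same cues via subjective weights j.
subjective_w = np.array([0.6, 0.6, 0.1])
Y_S = X @ subjective_w + rng.normal(scale=0.5, size=n)

def corr(a, b):
    return np.corrcoef(a, b)[0, 1]

ecological_validity = [corr(Y_E, X[:, k]) for k in range(3)]  # criterion vs. each cue
cue_utilization     = [corr(Y_S, X[:, k]) for k in range(3)]  # judgement vs. each cue
achievement         = corr(Y_E, Y_S)                          # r_a: criterion vs. judgement
```

With positive weights on both sides, all three statistics come out positive, and achievement is bounded by how well the noisy cues carry the criterion in the first place.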

The biggest contribution of the model is that it provided a general tool for investigations of cognition that focus on any kind of judgemental process under uncertainty. Brunswik himself gave only a few demonstrations using the Lens model (e.g., 1952), but after his death the approach, starting from Hammond's clinically oriented research initiatives (Hammond, 1955), evolved into Social Judgement Theory, which reached many areas of investigation on judgement and decision making, including educational decision making (e.g., Cooksey, 1988), medical decision making (e.g., Wigton, 1988), accounting (e.g., Krogstad, Ettenson, & Shanteau, 1984), and risk judgement (e.g., Earle & Cvetkovich, 1988). The other expansion of Brunswik's Lens model had an enriching effect on theories of learning, setting off the studies of multiple cue probability learning (described in Section II.1.3.).

II.1.2. Brunswik on learning

Brunswik dedicated two research studies to the investigation of probability learning. The first work, "Probability as a determiner of rat behaviour" (1939), was executed in 1936 and 1937 as his first experiment in the United States. This paper presents not only an excellent reflection of Brunswik's main attitude towards the psychology of his time, but also the first experiment on probability learning in the literature. The design of the experiment followed the standard design of his age, the T-maze. The T-maze was a usual experimental setup of behaviourism, in which the animal (usually a rat) was placed in a two-armed T-form maze. Food reward was exclusively conditioned to one side and never to the other, thus training the animal to learn the expected behaviour. Brunswik's innovation was that he altered the predictability of the sides across the running series. In every run there was food on only one side, but the location of the rewarded side was not consistent in each group. Brunswik calibrated the predictabilities for the groups following 100:0, 75:25 and 67:33 ratios. In each case, the generally profitable side was counted as the "correct" choice and the generally unprofitable side as the "error". In this sense, in the exceptional cases some of the successful choices were errors and some of the unsuccessful choices were correct responses. To study the effect of ambiguity further, after 4 days of training Brunswik gave reversal training to the animals: the profitable and unprofitable sides were exchanged, while the previously set probabilities were kept the same for all groups. In order to learn more about the effect, he added repeated reversal training to the 100:0 group. On the following six consecutive days these animals had the same type of training, except that the directions were reversed each day (Brunswik, 1939). The results in general showed that discrimination increases with the difference in reward probability between the two sides.
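The probabilistic schedule is easy to make concrete. The sketch below is hypothetical code of my own, not from the original paper: one arm is baited on every run, the generally profitable arm with probability p, so even an animal that always chooses that arm is rewarded on only about p of the runs.

```python
# A sketch of a probabilistic T-maze schedule in the spirit of Brunswik's
# 75:25 group (hypothetical code, not from the original paper). On every
# run exactly one arm is baited; the left arm with probability p_left.
import random

def run_schedule(p_left=0.75, runs=10_000, seed=1):
    rng = random.Random(seed)
    rewarded = 0
    for _ in range(runs):
        baited_left = rng.random() < p_left   # which arm holds the food this run
        choose_left = True                    # always pick the generally profitable side
        rewarded += (choose_left == baited_left)
    return rewarded / runs

# Under 75:25, even the "correct" (generally profitable) choice is rewarded
# on only ~75% of runs; on the rest, a correct choice goes unrewarded.
```

The 100:0 group reduces to the deterministic T-maze of classical behaviourism; the 75:25 and 67:33 groups are where Brunswik's "exceptional cases" arise.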

The description of the design here serves not as an introduction to the results of the study, but to present its innovative conception. Leaving aside Thorndike's ideas on the probabilistic nature of the environment (Thorndike, 1932), most behaviourist studies steadily stuck to deterministic reinforcements in their designs. It was Brunswik's explicit, strongly argued goal to reform psychology to accept genuine uncertainty in observation and judgement into the model. It is worth taking notice of his argument, which could be called passionate in a scientific context. The title already hides an ironic paradox: probability as a determiner. Further, the first two sentences concisely sum up his overall conception: "In the natural environment of a living being, cues, means or pathways to a goal are usually neither absolutely reliable nor absolutely wrong. In most cases there is, objectively speaking, no perfect certainty that this or that will, or will not, lead to a certain end, but only a higher or lesser degree of probability." (Brunswik, 1939, p. 195).

Brunswik's other study on probability learning, conducted with Hans Herma in 1951, was titled "Probability learning of perceptual cues in the establishment of the weight illusion". This work applied probability learning to Brunswik's genuine interest, perception. In brief, in this perceptual experiment the participants had to lift heavy and light objects simultaneously in both hands. The objects were painted in two colours. Each participant was told that after some presentations of weights, he/she would have to tell in a snap judgement which of the two objects appeared heavier at the first moment of lifting (Brunswik & Herma, 1951). Because of the successive weight contrast effect, the participants underestimated the relatively heavier weight in the subsequent trial, and vice versa for the light weights. The estimated weight contrast was under focus, since, in the balanced-weight test trials, "the one presented on the side with generally lesser frequency of heavy objects is judged as the heavier of the pair" (p. 174), thus showing the effect of probability learning. Thus, the location and the colour served as cues and the estimated weights as the dependent variable. The uncertainty of the environment was represented in the probability by which the multiple cues predicted the objects. The results showed that the contrast responses, after an early maximum, declined under continued reinforcement. This paradoxical result was speculated to be a special characteristic of probabilistic learning (Brunswik & Herma, 1951).

The interest of this work lies not in its design or results, but in the questions it aims to study: how rapidly the organism adapts to the probabilistic structure of the environment, and whether probability learning reaches a final level at which it stabilizes (Björkman, 2001). For our current research on and understanding of probability learning, the relevance of these questions has not expired.

Although learning was not closely related to Brunswik's original interest, in these works he presented probability learning and multiple-cue probability learning experiments for the first time. The implications of this concept are best detectable in the various areas of the multiple-cue probability learning paradigm.

II.1.3. Multiple cue probability learning

In a typical multiple cue probability learning (MCPL) situation, participants make judgements based on cues probabilistically associated with feedback over a series of trials. The aim of this experimental design is to model the organism's attempt to learn the relationships among the variables of the environment, and to model its function of predicting the efficiency of behaviour. The probabilistic reinforcement of cues by feedback represents the general uncertainty of real-life environments, and of their perception as well.

This description is in striking contrast with many traditional learning models, where the degree of learning is tested and defined by the number of correct retrievals of items or associations previously presented in deterministic pairings. The degree of learning in probability learning studies is assessed by the percentage of correct decisions made on the basis of previous experience with the task.
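As a concrete illustration, the following sketch (with a cue structure and parameters that are hypothetical choices of my own) generates MCPL-style trials and scores a learner by hit rate rather than by retrieval of deterministic pairs:

```python
# A minimal sketch of an MCPL-style task and its scoring (all parameters
# hypothetical). Feedback follows each cue only probabilistically, and the
# degree of learning is the proportion of correct predictions, not the
# retrieval of deterministically paired items.
import random

def mcpl_trials(n_trials=200, p_outcome_given_cue=(0.8, 0.6), seed=2):
    """Generate (cue, outcome) pairs; outcome is True with a cue-specific probability."""
    rng = random.Random(seed)
    trials = []
    for _ in range(n_trials):
        cue = rng.randrange(len(p_outcome_given_cue))        # which cue pattern is shown
        outcome = rng.random() < p_outcome_given_cue[cue]    # probabilistic feedback
        trials.append((cue, outcome))
    return trials

def hit_rate(trials, predict):
    """Degree of learning: proportion of trials where the prediction was correct."""
    return sum(predict(cue) == outcome for cue, outcome in trials) / len(trials)

# Even a learner that always predicts the likelier outcome for every cue
# scores around the average of the cue-outcome probabilities, never 100%.
```

The ceiling below 100% is the point of the design: residual error reflects the irreducible uncertainty of the cue-outcome relations, not a failure of memory.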

The merit of this model lies not only in its contribution to learning theories, but also in its essential involvement in most cognitive processes, ranging from perception and categorization to decision making. In this section I wish to provide an insight into the evolution of MCPL research, to elucidate what attempts have led to the concepts and experimental designs of today.


II.1.4. The expansion of Brunswik's work

Although the debate continues as to whether Brunswik and Herma used the term probability learning rightfully in their experiment published in 1951 (described in Section II.1.2.), or whether a present-day reader would distinguish it as partial reinforcement (Björkman, 2001), some readers of the article understood the originality of the concept and started a series of experiments that triggered an outpouring of questions for the coming five decades.

The first work after Brunswik's death in 1955 applying multiple and single cues in probability learning was conducted by Smedslund (1955) in Norway. His inquiry into the origins of perception assumed perception to be established by a process of multiple-probability learning, meaning that in learning people utilise complex configurations of ambiguous and probabilistic cues from the environment. One of his two experiments explored the possibility of utilising the probability learning procedure as a diagnostic tool in clinical psychology. He found the method to be slow and inefficient (Holzworth, 1999).

An extensive research program was initiated only from 1964 by Hammond, Brunswik's follower, and his students in the United States (e.g., Hammond, Hursch, & Todd, 1964), who began to analyze the components of clinical inference explicitly in the framework of Brunswik's Lens model (sketched earlier in Section II.1.1.). The other main propagator of the approach was Björkman, who started a long research project in Sweden (e.g., Björkman, 1965, 1987), as did his student Brehmer, who published 77 articles on the topic between 1972 and 1988 (Holzworth, 1999).

Until the late 1980s a large proportion of the studies documented substantial learning effects, although not reaching optimal levels (cf. Hammond's results), while Brehmer expressed a pessimistic conclusion: "When we learn from outcomes, it may, in fact, be almost impossible to discover that one really does not know anything. This is especially true when the concepts are very complex in the sense that each instance contains many dimensions. In this case, there are too many ways of explaining why a certain outcome occurred, and to explain away failures of predicting the correct outcome. Because of this, the need to change may not be apparent to us, and we may fail to learn that our rule is invalid, not only for particular cases but for the general case also." (Brehmer, 1980, p. 228-229).

It seems that interest in the MCPL approach dwindled considerably after the mid-1980s, but in fact the question merely merged into a parallel research tradition of probability learning, marked by the name of Estes.


II.2. Estes and the phenomenon of probability matching

Another theoretical origin of probability learning research can be found in the early articles of William Estes. In contrast to Brunswik's case, here the thought of probabilistic relations came from the inner circles of behaviourism. In Estes's time, the 1930s-40s, after the fruitless attempts of the great search for global theories of learning, the field was looking for a new direction. The orientation of that time was to suppose that all psychological phenomena could be understood in terms of some version of associationism, meaning that situational stimuli (Ss) control all behavioural responses (Rs). Estes wanted to keep this paradigm while bringing probability into learning. As a good follower of his mentor Skinner, he found a way to execute the task in his 1950 article. First, he defined the response classes for the organism in a given situation. The advantage of having a closed set of responses is that the organism's behavioural state in a given situation can be fully characterized in terms of the probabilities of its emitting each of the N response classes (Bower, 1994). In this way, learning can be defined as an increase in the probability of the correct responses alongside a decrease in the competing alternative responses in a given situation. This concept enabled Estes to construct differential equations to predict quantitative behavioural response data. This statistical theory of learning brought about the mathematical models that flourished in the field for the coming 25 years.

After describing some finite difference equations for trial-by-trial changes in learning (Estes & Burke, 1953), Estes, the former rat runner, started his first probability learning experiments with his students in the mid-1950s. Adapting the experimental situations from the earlier studies of Brunswik, the subjects had to predict which of two possible outcomes would occur on each trial in the given situation (e.g., a light would appear on the left or the right). The events occurred in random sequence, and only the base rate was available to help subjects predict which event would show up on a given trial. Thus the fixed probability was independent of the history of the outcomes and of the behaviour of the subject. The optimal strategy to maximize expected utility would be to always choose the option that appeared with probability greater than one half. The striking feature of the results of the following experiments was that subjects matched the underlying probabilities of the two outcomes with their decisions. For instance, the experimenter would program the lights to flash randomly, with the red light flashing 70% of the time and the blue 30% of the time. During the course of the experiment, participants would most often predict the red light roughly 70% of the time and the blue light roughly 30% of the time. This strategy is suboptimal, since they will then predict correctly only 58% of the time (0.7x0.7+0.3x0.3=0.58), while always predicting the more likely light would bring a 70% hit rate (1x0.7+0x0.3=0.70). The seemingly illogical results were an unwelcome surprise to the researchers of the field. It is worth citing the economist Kenneth Arrow's comment from the time: "We have here an experimental situation which is essentially of an economic nature in the sense of seeking to achieve a maximum of expected reward, and yet the individual does not in fact, at any point, even in a limit, reach the optimal behavior. I suggest that this result points out strongly the importance of learning theory, not only in the greater understanding of the dynamics of economic behavior, but even in suggesting that equilibria may be different from those that we have predicted in our usual theory." (Arrow, 1958, p. 14). Bush & Mosteller's (1955) stochastic learning theory, as much as Estes's (1957) probability matching theorem, did predict this kind of behaviour with linear difference equations.
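The accuracy comparison in the text generalizes to any base rate p, and a simple linear-operator update drifts toward matching rather than maximizing. The update rule below is a common textbook form in the spirit of the Bush & Mosteller models, not their exact equations, and all parameter values are hypothetical:

```python
# A sketch contrasting probability matching with maximizing for a base
# rate p, plus a simple linear-operator learner (a common textbook form
# in the spirit of Bush & Mosteller, not their exact equations).
import random

def matching_accuracy(p):
    """Expected accuracy when predicting each event at its own rate."""
    return p * p + (1 - p) * (1 - p)

def maximizing_accuracy(p):
    """Expected accuracy when always predicting the likelier event."""
    return max(p, 1 - p)

def linear_operator(p_event=0.7, theta=0.05, trials=5000, seed=3):
    """Update p_choice <- p_choice + theta * (outcome - p_choice); the
    choice probability drifts toward the event's base rate, i.e. matching.
    Returns the average choice probability over the last 1000 trials."""
    rng = random.Random(seed)
    p_choice, tail = 0.5, []
    for t in range(trials):
        outcome = 1.0 if rng.random() < p_event else 0.0
        p_choice += theta * (outcome - p_choice)
        if t >= trials - 1000:
            tail.append(p_choice)
    return sum(tail) / len(tail)

# For p = 0.7: matching yields 0.58 expected accuracy, maximizing 0.70,
# and the linear-operator learner settles near a 0.7 choice probability.
```

The point of the sketch is that a learner tracking the base rate by such a linear operator matches in the long run, reproducing the 58% versus 70% gap described above.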

Researchers' enthusiasm for the equations of the learning curves began to wane over the following decades. Estes's colleague Roger Shepard (1992) mentions three reasons for this downturn. First, Estes and his colleagues of that time based their theories on the assumption that associative bonds come about through temporal contiguity between events; with the cognitive approach, however, it turned out that the predictive significance of events is a better explanatory basis for this kind of learning (Rescorla & Wagner, 1972). Secondly, they realized only later that what can be learned must be genetically internalized, and thus experiments have to be set up to be ecologically valid (Gibson, 1979). Finally, the mathematical derivations could not be employed as expected for more complex classification tasks, and with the availability of stored-program computers the search for a few general principles waned, and what started was "a seemingly endless patching together of details, as ad hoc heuristics explicitly engineered to accomplish various practical tasks" (Shepard, p. 420).

Even if the equations of learning curves did not remain a central research topic after the 1950s, the earlier robust finding that people use probability matching instead of normatively optimal strategies in binary prediction tasks still spurred researchers to further studies. Searching for an explanation, a vast number of experiments varied the parameters of the task over the coming decades. In general they found that matching decreased with the size of the reward (Brackbill, Kappy, & Starr, 1962; Siegel & Goldstein, 1959) and with the number of trials (Edwards, 1961). Shanks, Tunney, & McCarthy (2002), supporting rational choice theory, showed that three factors contribute to reaching the optimal response strategy: (1) large financial incentives; (2) meaningful and regular feedback; (3) extensive training.

Rewards:

Although Friedman and Massaro (1998) failed to find evidence that monetary payoff affects performance, we can find documentation of such an effect in the literature (Siegel & Goldstein, 1959; Vulkan, 2000). In general, even under monetary payoff, the asymptotic levels of responding rarely exceeded 95% correct choice (Vulkan, 2000). Although Shanks et al. (2002) paid almost £40 on average to their participants, they could show only a slight positive effect on performance, and the magnitude of payoff did not correlate with performance.

Number of trials:

Restle (1961) showed that the sequence effect may disappear after 1000 trials; Goodie and Fantino (1999) found a gradual transition towards optimal responding over 1600 trials. Considering these data, we should conclude that humans are slow learners rather than rational strategists. These findings even seem inapplicable to real life, since, as Fantino (1998) noted, "[l]ife rarely offers 1,600 trials" (p. 213).

    Feedback:

    At least since Thorndike (1898) we have known that one is more likely to choose an option in the future if it was followed by positive feedback. Nevertheless, previous research has found that outcome feedback is quite limited in its usefulness, particularly in comparison to cognitive feedback (Balzer, Doherty, & O'Connor, 1989). Cognitive feedback refers to information about the relations between responses and outcomes (functional validity information), or to a summary of the relations within the task environment (task information). Shanks et al. (2002) found that individuals may be differentially sensitive to the motivating properties of feedback. Despite these observations, we can say in summary that previous research paid little attention to the role of feedback, and its effects on asymptotic levels of performance are far from explained.

    III. New interest in probability learning

    Although from Brunswik up to the late 1980s some 280 journal articles, book chapters, doctoral dissertations, and technical reports were published on MCPL, research on this approach to probability learning dwindled after the mid-1970s. Interest in the topic was renewed only in the late 1980s. This was the time when personal computers became available for psychological experiments demanding complex computation, thus allowing the development of models of


    connectionist networks. This was the movement that eventually helped Brunswikian probabilism become adopted into the classical schools of human learning research.

    III.1. Probabilistic associative learning

    With the appearance of the cognitive approach, animal and human learning research

    started to follow separate routes. The researchers of animal learning continued to focus on

    elementary learning, while interest in human research shifted from learning to memory and from

    the classical models to the models of artificial intelligence. With the development of the

    connectionist networks after the mid-1980s the methodological apparatus became available to

    study the elementary aspects of human learning within the domain of cognitive psychology. Apart from a few exceptions (e.g., Dickinson & Shanks, 1985), no studies had attempted to directly bridge the results of animal experiments to human learning before Gluck and Bower (1988a). In their

    seminal papers (1988a,b), they developed an adaptive connectionist network to test the Rescorla-

    Wagner model of associative learning (Rescorla & Wagner, 1972) in human category learning.

    Gluck and Bower were looking for a learning model that involves probabilistic relations in its formulation. The Rescorla-Wagner model is based on Rescorla's earlier demonstration (1968) that, in animal associative learning, conditioning to a CS varies with the probability of the US in the presence of the CS compared to the probability of the US in the absence of the CS (Gluck & Bower, 1988a). Gluck and Bower found that the Rescorla-Wagner rule for association formation can be regarded as a special case of the least-mean-squares (LMS) learning rule, which was used at that time for training adaptive connectionist networks.

    Attempting to evaluate the LMS rule as a component of human learning, Gluck and Bower

    (1988a,b) conducted a series of experiments to explore the model's accuracy in probabilistic classification learning situations.
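    The shared logic of the Rescorla-Wagner update and the LMS (delta) rule can be sketched in a few lines. This is a minimal illustration, not Gluck and Bower's actual network; the cue contingencies and the learning rate below are hypothetical:

```python
import random

def lms_train(trials, n_cues, lr=0.05):
    """Delta (LMS) rule: every weight of a cue present on the trial is
    nudged in proportion to a single shared prediction error. With binary
    cues this reduces to the Rescorla-Wagner update."""
    w = [0.0] * n_cues
    for cues, outcome in trials:
        prediction = sum(wi * ci for wi, ci in zip(w, cues))
        error = outcome - prediction          # shared error term
        for i, ci in enumerate(cues):
            w[i] += lr * error * ci           # only present cues change
    return w

if __name__ == "__main__":
    random.seed(1)
    trials = []
    for _ in range(4000):
        cues = [random.randint(0, 1), random.randint(0, 1)]
        p = 0.2 + 0.6 * cues[0]               # hypothetical contingencies
        trials.append((cues, 1 if random.random() < p else 0))
    weights = lms_train(trials, n_cues=2)
    print(weights)  # the weight of the predictive cue 0 ends up clearly larger
```

    The point of the sketch is that the blocking-like competition between cues falls out of the shared error term: cue 1 gains little weight because cue 0 already accounts for most of the outcome variance.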

    They adapted the experimental task from Medin's medical classification task (e.g., Medin, Altom, Edelson, & Freko, 1982). On each trial of this task, participants, playing medical diagnosticians, saw one or more of four symptoms of hypothetical patients on medical charts. They had to classify each patient as having one of two fictitious diseases. After each trial they received feedback about the correct diagnosis. The combinations of the four cues (symptoms), unknown to the participants, were imperfectly, probabilistically associated with the outcomes (diagnoses) (Figure 2.), following the scheme of the multiple-cue probability learning studies (e.g., Castellan, 1977). During the training, participants learnt the relationship of the


    symptom patterns with the diseases, and at the end of the experiment they were asked directly to estimate the conditional probability of each disease given each symptom.

    Figure 2. Cue-outcome relations in the probabilistic association task (Gluck & Bower, 1988a). As in Brunswik's Lens model (Figure 1.), objective weights (w) represent the relation between the environment and the cues.

    The aim of this design was to obtain predictions from their adaptive network that were distinguishable from those of three competing models of category learning (the exemplar, feature-frequency, and prototype models). The best-fitting model could shed light on the question of to what extent we use similarity and base-rate (category probability) information in probabilistic categorization (Estes, Campbell, Hatsopoulos, & Hurwitz, 1989). The exemplar (or context) model assumes that the learner stores all exemplars of each category, and a new instance gets categorised on the basis of its relative similarity to the stored exemplars (e.g., Nosofsky, Kruschke, & McKinley, 1992); the feature-frequency model presumes that the learner stores the relative frequencies of occurrence of cues within the categories and then classifies an instance according to the relative likelihood of its particular pattern of features arising from each of the categories (Gluck & Bower, 1988b; e.g., Reed, 1972); the prototype model assumes that the learner abstracts an average description of each category, and the new instance gets classified according to its similarity to this prototype (e.g., Matsuka, 2004).
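    The contrast between two of these accounts can be sketched for binary cue vectors like the symptom patterns above. The similarity measure, the stored exemplars, and the category labels below are hypothetical simplifications, not the models' published formalisations:

```python
def exemplar_classify(instance, categories):
    """Exemplar (context) model sketch: score each category by the summed
    similarity of the instance to every stored exemplar."""
    def similarity(a, b):
        # toy similarity for binary cue vectors: number of matching features
        return sum(ai == bi for ai, bi in zip(a, b))
    scores = {label: sum(similarity(instance, ex) for ex in exemplars)
              for label, exemplars in categories.items()}
    return max(scores, key=scores.get)

def prototype_classify(instance, categories):
    """Prototype model sketch: compare the instance to each category's
    average (prototype) feature vector."""
    def prototype(exemplars):
        n = len(exemplars)
        return [sum(col) / n for col in zip(*exemplars)]
    def distance(a, proto):
        return sum(abs(ai - pi) for ai, pi in zip(a, proto))
    return min(categories,
               key=lambda lab: distance(instance, prototype(categories[lab])))

if __name__ == "__main__":
    # hypothetical stored exemplars over four binary cues (symptoms)
    categories = {
        "disease_A": [[1, 1, 0, 0], [1, 0, 0, 0], [1, 1, 1, 0]],
        "disease_B": [[0, 0, 1, 1], [0, 1, 1, 1], [0, 0, 0, 1]],
    }
    new_patient = [1, 1, 0, 1]
    print(exemplar_classify(new_patient, categories))   # disease_A
    print(prototype_classify(new_patient, categories))  # disease_A
```

    On clear-cut cases the two sketches agree, as here; the models diverge on instances that are close to one category's prototype but closer to individual exemplars of the other.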

    The adaptive network was an error-driven one-layer LMS network. To generate

    differential predictions of the LMS model and the alternative models, Gluck and Bower (1988a)

    had to unbalance the overall frequencies of the two diseases, so that one disease occurred more often than the other. The results from the learning phase showed that the base-rate information (the overall frequencies of the two diseases) was reflected in performance as a


    form of probability matching; however, when the participants were given explicit test trials at the end, they showed substantial base-rate neglect. As a result, it was found here and in several other experiments (e.g., Estes et al., 1989; Gluck & Bower, 1988b) that the simple network model had stronger predictive value for these results than the alternative models. Recently, out of the 12 current models of category learning, the COVIS (competition between verbal and implicit systems) model has been regarded as the best account of probabilistic classification (Ashby, Alfonso-Reese, Turken, & Waldron, 1998; Kéri, 2003).

    III.2. Probabilistic classification learning

    More recently, four different kinds of category learning tasks have been in general use: rule-based tasks, information-integration tasks, prototype distortion tasks, and the weather prediction task (Ashby & Maddox, 2005). The weather prediction (WP) task is a version of the probabilistic classification learning task, developed by Knowlton, Squire, and Gluck (1994). This test follows the structure of Gluck and Bower's (1988a) construction, except that participants play weather forecasters. On the basis of one of 14 combinations of four tarot cards (binary cues), participants had to predict rainy or sunny weather (binary outcome) (Figure 3.).

    Figure 3.

    Probabilistic Classification Learning task. In this task, people have to guess the weather on the basis of the presented combination of cards with geometric signs on them (adapted from Aczel & Gonci, 2005).


    Outcomes associated with the patterns appeared with fixed probabilities but in random order. Thus, the WP task is substantially analogous to Gluck and Bower's (1988a) medical diagnosis task and to the MCPL tasks in a Brunswikian sense, since the participants have to make judgements on the basis of the experienced relation between multiple cues and a distal object, just as the Lens model describes. The standard analysis of the test proceeds by averaging correct responses in blocks of 10 trials and measuring the deviation of these mean values from chance level. A response counts as correct on a particular trial if the selected outcome was the one more frequently associated with the given pattern over the course of the whole experiment (Knowlton & Squire, 1996).
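    The scoring rule just described can be sketched as follows; the trial records and cue-pattern labels below are hypothetical, and a 4-trial block size is used only to keep the example short:

```python
from collections import Counter

def score_wp_responses(trials, block_size=10):
    """Average 'correct' responses in fixed-size blocks, where a response is
    correct if it names the outcome more frequently paired with that cue
    pattern over the whole experiment."""
    counts = {}
    for pattern, outcome, _response in trials:
        counts.setdefault(pattern, Counter())[outcome] += 1
    majority = {p: c.most_common(1)[0][0] for p, c in counts.items()}
    correct = [int(response == majority[pattern])
               for pattern, _outcome, response in trials]
    return [sum(correct[i:i + block_size]) / block_size
            for i in range(0, len(correct), block_size)]

if __name__ == "__main__":
    # hypothetical records: (cue pattern, actual outcome, participant response)
    trials = [(("card1",), "sun", "sun"), (("card1",), "rain", "sun"),
              (("card1",), "sun", "rain"), (("card2",), "rain", "rain")] * 5
    print(score_wp_responses(trials, block_size=4))  # [0.75, 0.75, 0.75, 0.75, 0.75]
```

    Note that, by this rule, predicting "sun" on a trial where it happens to rain still counts as correct when "sun" is the majority outcome for that pattern; the measure tracks strategy quality rather than trial-by-trial luck.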

    During the last decade, this task became extensively used in cognitive neuroscience

    (e.g., Eldridge, Masterman, & Knowlton, 2002; Knowlton et al., 1994, 1996; Poldrack, Clark, Paré-Blagoev, Shohamy, Creso Moyano et al., 2001; Reber, Knowlton, & Squire, 1996; Reber &

    Squire, 1999). The exciting result that started this interest in clinical research was Knowlton et al.'s (1994) finding with an initial trial of this task. They compared the performance of amnesic patients with that of normal controls on the WP task. The result was that the two groups performed equally well during the first 50 trials of learning; however, in the extended part of the training, the amnesiacs' performance decreased relative to the healthy group (Knowlton et al., 1994). The impairment of declarative memory in amnesiacs is often coupled with relatively intact non-declarative learning (Milner, Corkin, & Teuber, 1968; Warrington & Weiskrantz, 1968). On the basis of this neurological observation, Knowlton et al. (1994) interpreted the results as showing that both groups relied on non-declarative learning systems in the first part of the task, whereas in the late training the controls began memorising the test, which the amnesiacs could not (Knowlton et al., 1994; see also Gluck, Oliver, & Myers, 1996). In contrast, patients with Parkinson's disease show a learning deficit in the first 50 trials of the test, which continues throughout the training (Knowlton et al., 1996). A learning pattern similar to the amnesiacs' was found with Alzheimer patients in the early stages of the disease. In both cases, the anterograde amnesic symptoms are connected to neurodegenerative processes in the medial temporal lobes (Eldridge et al., 2002). Szabolcs Kéri and his colleagues (Kéri et al., 2000) examined schizophrenic patients, who are well known to have abnormalities in executive function and explicit memory, with the WP task. The results showed normal performance for the schizophrenic patients compared to controls. These results suggested that the WP task is processed by non-declarative neural systems.


    The key cortical activations generally found to correlate with probabilistic classification learning were in the occipital cortex and the right caudate nucleus (Kéri, 2003). This corresponds with the observation that people with impaired basal ganglia have difficulties in implicit learning tasks. Correct responses correlate positively with caudate and prefrontal activation; however, the role of the prefrontal and parietal cortices in this task is not yet

    understood (Fera, Weickert, Goldberg, Tessitore, Hariri et al., 2005). The abnormal functioning of

    the basal ganglia is well documented in Tourette patients (e.g., Peterson, Leckman, Duncan, Ketzles, Riddle et al., 1994). In a clinical study, the WP task was used with children with Tourette syndrome (Kéri, Szlobodnyik, Benedek, Janka, & Gádoros, 2002). The children exhibited impaired learning in the WP task; however, in an explicit transfer version of the test they showed normal learning (Kéri et al., 2002). A further study showed that transcranial direct current stimulation of the left prefrontal cortex can improve implicit learning in the WP task in healthy people (Kincses, Antal, Nitsche, Bártfai, & Paulus, 2003). Interestingly, while healthy people show activation in the striatum with no activation in the MTL during learning in an implicit motor sequence task, people with obsessive-compulsive disorder exhibited no striatal activation but did show MTL activation (Moody, Bookheimer, Vanek, & Knowlton, 2004). Poldrack et al.

    (2001) intended to study this interaction of the basal ganglia and the medial temporal lobe (MTL)

    during probabilistic classification learning. The striking finding was that during the initial part of the WP task the MTL was active while the caudate was inactive, but very soon the MTL became deactivated (presumably inhibited), whereas the caudate nucleus became activated. Poldrack et al. (2001) interpreted the results as the first substantive evidence of competition between memory systems. They supposed that in the initial part of the task the two systems (explicit vs. implicit) may compete, and once it turns out that the task demands implicit processing, the MTL becomes inhibited (Poldrack & Rodriguez, 2004). This result supports the view that the two systems are not impermeably dissociated, but are in constant interaction and are influenced by common factors (e.g., McDonald, Devan, & Hong, 2004; Turk-Browne, Yi, & Chun, 2006).

    The probabilistic associative learning (Gluck & Bower, 1988a,b) and probabilistic classification learning (Knowlton et al., 1994) tasks were developed for concrete computational and clinical analyses, but their main contribution to the field is that they provide an experimental and analytical methodology for probability learning studies. These initial examinations yielded sufficient data and experience to allow the basic procedures and methodologies to be reconsidered for further studies.


    IV. Methodological considerations

    It became apparent very early that probabilistic experiments produce confusing results if they are not measured with special attention. This section highlights the three main problematic points of the field. Base-rate neglect is a phenomenon that has intrigued not just the researchers of learning, yet it seems that this field has the most elaborated explanations of the issue. Strategy analysis of probability learning has no long history in the literature, but it bears special interest for this explorative study. All of these methodological problems are entangled with our lack of insight into the question of consciousness in the underlying processes.

    IV.1. Base-rate neglect

    Gluck and Bower's (1988a,b) observation that, although the participants reflected the experienced probabilities in their decisions in the training phase, they did not consider the base-rate information in their decisions in the test part calls for special attention. This result may seem uninterpretable, yet it fits well with the literature on the base-rate fallacy. The dominating research on judgement and decision making in the 1970s and 1980s was concerned with the heuristics and biases paradigm (Koehler, 1993). This approach was developed by Daniel Kahneman and Amos Tversky (1972), who elaborated the attractive theory that people's intuitive judgements about probabilistic events are made via error-prone heuristics. In their seminal 1973 paper, presenting empirical support for the view, they concluded that by the "[representativeness] heuristic, people predict the outcome that appears most representative of the evidence. Consequently, intuitive predictions are insensitive to the reliability of the evidence or to the prior probability of the outcome, in violation of the logic of statistical prediction. [...] It is shown that [...] people erroneously predict rare events and extreme values if these happen to be representative" (Kahneman & Tversky, 1973, p. 237). As evidence in support of this theory mounted, the "base-rate fallacy" (Bar-Hillel, 1980) became a favoured example of the heuristics and biases paradigm. The results promoted a common view of human judgement as genuinely biased and generally poor (e.g., Lopes, 1991). However, by the early 1990s, converging evidence made it apparent that the generality of the base-rate fallacy had previously been overstated: base rates are not uniformly ignored (Koehler, 1994, 1996). The question arises: if base rates are not always ignored, when are they likely to be used? One can find


    two main streams of researchers answering the question. One emphasizes that relative frequencies can be better represented than single-event probabilities (e.g., Gigerenzer, 1991; Hoffrage, Gigerenzer, Krauss, & Martignon, 2002); the other seeks the answer in the structure of the tests (e.g., Koehler, 1996; Spellman, 1993).

    Gigerenzer and colleagues (e.g., 1995) forcefully argue that the dominant heuristics theory is flawed, because representations in terms of natural frequencies facilitate the use of probability (or frequency) information better than given conditional probabilities do (e.g., Hoffrage et al., 2002). This conception comes from the empirical work generated by ecological views (e.g., Gigerenzer, 1996). Natural frequencies originate from natural sampling (Gigerenzer & Hoffrage, 1995), an automatic way of encountering statistical information in the natural environment (Hoffrage et al., 2002). To give a concrete example of these concepts:

    Natural frequencies:

    Out of each 1000 patients, 40 are infected.

    Out of 40 infected patients, 30 will test positive.

    Out of 960 uninfected patients, 120 will also test positive.

    Normalized frequencies:

    Out of each 1000 patients, 40 are infected.

    Out of 1000 infected patients, 750 will test positive.

    Out of 1000 uninfected patients, 125 will also test positive. (Hoffrage et al., 2002, p. 346)

    Thus, a probability is the value of a natural frequency normalised to a common reference class. The authors point to this as a reason for the fallacy, since the computation is simpler when natural frequencies are provided than when normalised frequencies or probabilities are given (Gigerenzer & Hoffrage, 1995).

    Koehler (1996) advocates the view that the fallacy is brought about by the way in which base-rate summary statistics are provided in typical base-rate tasks. He claims that people, given the opportunity for implicit base-rate learning, will be more sensitive to probabilities and will make greater use of base rates in final judgements. Previous studies reported that when base rates were directly experienced, through trial-by-trial feedback, they seemed to be used more accurately in judgements (Lindeman, Van Den Brink, & Hoogstraten, 1988; Manis, Dovalina, Avis, & Cardoze, 1980; Medin & Edelson, 1988), in contrast to the method of presenting mere summary statistics. Directly experiencing base rates has been found to be helpful, e.g., for auditors learning financial statement errors (Butt, 1988), or for physicians learning the relationship of base


    rates and diagnostic information (Christensen-Szalanski & Beach, 1982). Koehler (1996) assumes that directly experienced base rates may rely on implicit rather than explicit learning systems, and that this is why they are better remembered, or more easily accessed, than information that is learned explicitly. He offers a reason for this, stating that because the implicit learning experience comes in the form of trial-by-trial learning, the information at each trial may be encoded as a separate "trace". In this way, multiple traces develop, and the information associated with these traces may be cognitively available. This contrasts with the explicit learning of a single summary statistic, which does not produce multiple traces and which has been associated with less accurate judgments (Koehler, 1996). Other explanations support the view that personally experienced information is more vivid or salient, and thus more available (Brekke & Borgida, 1988), or that people are more trusting of self-generated base rates (Ungar & Sever, 1989). This conception is consistent with many observations from the category learning literature, where the probability matching strategy indicates that people learn the experienced base rates and use them in their decisions (though not optimally), while in the explicit test phase they show the base-rate fallacy (e.g., Estes et al., 1989; Gluck & Bower, 1988a,b; Medin & Edelson, 1988).

    Holyoak and Spellman (1993) considered the phenomenon and suggested two components behind base-rate usage: (1) acquisition, which, in a trial-by-trial format, is processed implicitly and quite accurately (perhaps based on learning conditional probabilities); and (2) access, which (depending on the type of test) may be under explicit and conscious control. Consequently, when both the acquisition and the access parts of the test tap the implicit system, people will make better use of base rates than when one of the phases is not implicit (Spellman, 1993).

    In sum, we can conclude that the base-rate literature holds substantial relevance to the study of probabilistic categorization. For a comprehensive understanding of the issue, we must consider the above-described aspects both in experiment construction and in data interpretation.

    IV.2. Strategy analysis of PCL

    In probabilistic classification learning people acquire information about cue-outcome relations; therefore, the category learning literature technically regards the PCL as an information-integration task (Ashby & Maddox, 2005). However, little is known about how people integrate the observed cues. As will be argued below, a variety of different strategies are all about equally effective at solving the task.


    It is a reasonable hypothesis that evolved animals and humans should be optimal in categorization decisions (Ashby & Maddox, 1992). According to Thorndike's law of effect (1898), the probability of successful trials will increase with time. Still, robust deviations from this law have been observed since the 1950s; one popular instantiation is probability matching, first studied by Estes (1950) (see Section II.2.). Ashby and Gott (1988) examined performance on many human categorization tasks, comparing it to an optimal classifier (a hypothetical device maximizing reward; e.g., Morrison, 1990). The overall data showed that human classification cannot be described as optimal. Decision bound theory (Ashby & Townsend, 1986) attributed two inherent suboptimalities to the decision-making processes of humans (and all other organisms). Both suboptimalities are by-products of the neural system (Maddox & Bohil, 1998). Perceptual noise comes from spontaneous activity within the central nervous system; further, the optimality of the cognitive system is also limited by the observer's memory (criterial noise).

    Deviation from optimality is observable at the strategy level of decision making as well. Suboptimal strategies are also often found in the probability learning literature (Vulkan, 2000). The known deviations from optimality in human decision making still await a proper explanation; however, one can find at least three distinct effects attributed to the phenomenon in previous work. One set of explanations is classified as payoff variability effects (Busemeyer & Townsend, 1993; Haruvy & Erev, 2001). This theory claims that an increase in pay-off variability moves choice behaviour towards random choice (Erev & Barron, 2005). The second set of explanations is classified as underweighting of rare events (Barron & Erev, 2003). In these situations people tend to rely on the typical outcomes that have the best pay-off. The third set of explanations involves loss aversion (Kahneman & Tversky, 1979). This counterproductive behaviour is observed in stock markets, showing that people tend to avoid any loss. But in probabilistic cases the choice of the less probable alternative decreases the overall chance of maximal profit (e.g., Gneezy & Potters, 1997). These three cognitive strategies may seem reasonable from a certain point of view, but their negative by-product is deviation from the maximization of gain.

    Recently it became apparent that the WP task can be solved by a range of different strategies. Gluck, Shohamy, and Myers (2002) presented post-hoc analysis techniques from which they deduced that participants may use at least three different strategies. Post-experiment questionnaires uncovered that most of the participants believed they used one of the following strategies: (1) the optimal multi-cue strategy, in which they respond according to the outcome probability of each combination of the presented four cues; (2) the one-cue strategy, in which they respond on the basis of


    their focus on the presence or absence of only one cue; and (3) the singleton strategy, in which they focus on and learn only about the patterns in which only one cue is present. Only the first strategy is optimal in a normative sense, but all of these strategies can raise performance above chance level. Based on these reports, Gluck et al. (2002) developed a strategy analysis method for the WP task, describing ideal judgement profiles for each of these strategies. This method identified the individual learning strategies applied in two follow-up experiments. The results showed that 90% of the participants in Experiment 1 and 80% of the participants in Experiment 2 fitted the singleton criterion; however, when the task was broken into 50-trial blocks, a gradual shift towards multi-cue strategies was observable (Gluck et al., 2002). Strikingly, there was little correspondence between the explicit self-reports and the strategies the individuals actually used. Recently, Lagnado and colleagues (in press) introduced an alternative to Gluck's multi-cue strategy. The multi-match strategy supposes that people match the underlying probabilities of the two outcomes with their predictions. As we will see later, these simple heuristics represent only a few of the plausible options that may be involved in probabilistic decision-making situations.
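    The core of such a profile-based analysis can be sketched as follows. The patterns, response codes, and ideal profiles below are hypothetical simplifications of the Gluck et al. (2002) procedure, which compared each participant's per-pattern choice probabilities against the profile each strategy would produce:

```python
def strategy_fit(responses, patterns, profiles):
    """Pick the strategy whose ideal per-pattern P(outcome A) profile has
    the smallest mean squared deviation from the observed choice rates."""
    # observed probability of choosing outcome A for each cue pattern
    observed = {}
    for pat in set(patterns):
        choices = [r for p, r in zip(patterns, responses) if p == pat]
        observed[pat] = sum(choices) / len(choices)
    def mse(profile):
        return sum((observed[p] - profile[p]) ** 2 for p in observed) / len(observed)
    return min(profiles, key=lambda name: mse(profiles[name]))

if __name__ == "__main__":
    # Two hypothetical single-cue patterns; responses code outcome A as 1.
    patterns = ["c1", "c1", "c1", "c1", "c2", "c2", "c2", "c2"]
    responses = [1, 1, 1, 1, 0, 0, 0, 1]
    # Hypothetical ideal profiles: a one-cue responder always says A to c1
    # and B to c2; a random guesser sits at 0.5 for both patterns.
    profiles = {"one-cue": {"c1": 1.0, "c2": 0.0},
                "guessing": {"c1": 0.5, "c2": 0.5}}
    print(strategy_fit(responses, patterns, profiles))  # one-cue
```

    Because several profiles can fit the same noisy response record almost equally well, such analyses face exactly the identifiability problems discussed next.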

    The fact that good performance can be achieved by explicit memorisation of heuristic strategies (e.g., the one-cue or singleton strategy) makes the implicit nature of the test quite questionable. As argued in this study, besides finer strategy-analysis methods, a better alternative to the WP task might make the method less susceptible to identifiability problems.

    IV.3. The question of consciousness

    Most of the previous studies of probabilistic classification learning assumed that decision makers lack self-insight into the judgemental policies underlying their judgements, and therefore regarded the test as a purely implicit learning test (e.g., Evans, Clibbens, Cattini, Harris, & Dennis, 2003; Gluck et al., 2002; Wigton, 1996; York, Doherty, & Kamouri, 1987). This view was also supported by those who emphasized the implicit nature of experience-based learning tasks (e.g., Spellman, 1993). Although the question is important, the relation between people's learning performance and their knowledge has received less attention.

    The thought that learning can be based on separate conscious and non-conscious systems is detectable in some of the early theories of learning (e.g., Tolman, 1932), but systematic research did not get under way until the late 1960s (Reber, 1967, 1969). Reber's concept of


    implicit learning holds that people acquire information from the environment without intending to do so (Cleeremans, Destrebecqz, & Boyer, 1998). This thesis made it possible to study unconscious phenomena in cognitive psychology without relying on any psychoanalytic conception. In a later paper, Reber (1992), taking an evolutionist standpoint, argued that consciousness is a novel phenomenon, evolving after many higher perceptual and cognitive processes. He stated four hypotheses about implicit mechanisms: (1) they are robust in the face of psychological and neurological disorders; (2) they are independent of IQ; (3) they are independent of age; and (4) they show little variance across populations. Others derive from this view that implicit learning requires little effort and is often accurate, even optimised, compared to explicit ways of learning (e.g., Holyoak & Spellman, 1993). The PCL task is one of the few tests that can be used to examine the validity of these assumptions.

    A large body of research has documented apparent dissociations with the WP task and has taken them as evidence for the separation of the two learning systems (Ashby, Ell, & Waldron, 2003; Knowlton et al., 1996; Reber & Squire, 1999; Squire, 1994). The two systems were also demonstrated in neuropsychological studies arguing that they are differentially impaired in certain clinical cases (Ashby et al., 2003; Knowlton et al., 1996; Poldrack et al., 2001).

    David Shanks is one of the few researchers who question the need for any implicit concept in the explanation. In a voluminous paper with a colleague (Shanks & St. John, 1994), he reviewed the implicit learning literature according to their sensitivity criterion. This criterion states that, before terming a learning behaviour implicit, one has to rule out the possibility that the explicit test was simply insensitive. Shanks and St. John (1994) argue that explicit tests may be insensitive measures of the explicit processes that occurred during learning for two reasons: they suspect that retrospective questionnaires can distort the validity of the assessment because of memory constraints and possible interference. Taking this criterion seriously, they found no previous study in which implicit learning was satisfactorily demonstrated.

More than a decade later, Shanks and his colleagues pointed to further methodological shortcomings of the field (Lagnado et al., in press). First of all, they emphasized the need to distinguish between someone's insight into the task (task knowledge) and someone's insight into his or her own judgemental processes (self-insight). In the case of the WP task, this refers to the difference between the learner's knowledge of the cue-outcome relations and his or her knowledge of how that information is used to predict the outcome. Lagnado et al. (in press) conjecture that the two might be


separate, although previous research has run the two together. Thus, the proclaimed dissociations could have referred to a dissociation between insight and learning, between task knowledge and learning, or both (Lagnado et al., in press). A further problem of explicit testing is raised by the reasonable claim that the difficulty of verbalising probabilistic inferences might be a natural obstacle to a valid report.

Recently, Lagnado et al. (in press) conducted a series of experiments to demonstrate these claims. They used strongly and weakly predictive cards in the basic WP task. In Experiment 1, to measure knowledge and insight, participants were given explicit questions after each block of 50 trials. One set of questions asked participants to rate, on a continuous scale, the probability of the outcome (rainy vs. sunny weather) for each individual card (measuring task knowledge). The other set asked how much they had relied on each card in making their decisions (measuring self-insight), registered on a similar continuous scale. The results indicated a strong correspondence between performance, task knowledge, and self-insight, all of which were accurate.

In Experiment 2 they asked participants similar explicit questions on the screen of the same WP task, with the difference that the questions (how much they relied on each card) were presented after each trial. The results revealed that from early on in the task people rated strong cards as more important than weak ones. The authors concluded that participants developed insight into their cue usage relatively early in the task.

In Experiment 3 they tested whether the explicit questions after each trial directed conscious attention to the task, thus biasing the characteristics of the learning performance. In the final results, no measure of the third experiment differed from the second one. These findings strongly support the authors' doubt that the WP is a purely implicit task.

Lagnado and colleagues' (in press) argument gains credence from these findings, but the appealing results supporting the existence of an implicit way of learning, as well as the lack of consensus in analysing test results, leave us without a satisfactory answer to the question.

    IV.4. Analysing the test results

The standard analysis of the PCL task follows a simple procedure: it computes a mean percentage of correct responses for the whole task by averaging across both trials and participants. The aim of this analysis was mostly to develop categorization models (e.g., Gluck


& Bower, 1988a,b) or to study clinical groups (e.g., Knowlton et al., 1994, 1996). While the test became popular in these research fields, this method of analysis tells us little about the process of probability learning: it is insensitive both to individual differences and to the dynamics of the process over the course of the test.

Probabilistic classification learning tasks were developed from the multiple-cue probability learning paradigm; thus, although unacknowledged, their basic structure reflects the Brunswikian Lens model (for details see Section II.1.1.). This framework is applicable to every judgemental process in which the decision maker has to rely on environmental cues. The original theory was based on the view that people construct internal cognitive models that reflect the probabilistic properties of the environment (Doherty & Kurz, 1996; Lagnado et al., in press). The central tenet of this approach was to analyse individual judgemental processes before computing group averages. Individual judgemental policies can be examined by considering the relation between the given cues and the patterns of judgement (for an overview see Cooksey, 1996). More specifically, by computing a multiple regression analysis of the judgements on the cue values across all trials of the task, we can measure the cue-utilization weights from the resulting beta coefficients. Simply put, these are the weights that the individual has given to each cue during his or her judgements.

Having a judge's policy model, it can be compared to the actual structure of the environment. This is achieved by computing a parallel multiple linear regression on the environmental cues. Here the beta coefficients will be the objective cue weights, which are the same for all participants exposed to the same task. In this way the judgement policies (via their cue-utilization weights) become comparable with the objective weights, and the analysis eventually reveals how the individuals learnt the task environment. The same method can be applied in parallel to the assessment of the explicit judgements, measuring task knowledge and self-insight.
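As a rough illustration of this policy-capturing logic, the two parallel regressions can be sketched in Python. All the data here are simulated: the trial counts, cue weights, and the simulated judge are illustrative assumptions, not values from any actual experiment.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical task: 200 trials, 4 binary cues (present = 1, absent = 0).
n_trials, n_cues = 200, 4
cues = rng.integers(0, 2, size=(n_trials, n_cues)).astype(float)

# Objective environment: outcomes generated from known cue weights.
true_w = np.array([0.4, 0.2, -0.2, -0.4])
p_outcome = 1 / (1 + np.exp(-(cues @ true_w)))
outcomes = (rng.random(n_trials) < p_outcome).astype(float)

# A simulated judge who over-relies on cue 0 and ignores cue 2.
judge_w = np.array([0.9, 0.2, 0.0, -0.4])
judgements = (rng.random(n_trials) <
              1 / (1 + np.exp(-(cues @ judge_w)))).astype(float)

def cue_weights(X, y):
    """Least-squares coefficients (with intercept) of y regressed on the cues."""
    X1 = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return beta[1:]  # drop the intercept: the cue-utilization weights

subjective = cue_weights(cues, judgements)  # the judge's policy model
objective = cue_weights(cues, outcomes)     # the task's ecological weights

print("subjective:", np.round(subjective, 2))
print("objective: ", np.round(objective, 2))
```

Comparing the two weight profiles shows where the judge's policy departs from the ecological structure, which is exactly the comparison the Lens model prescribes.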

Brunswik's Lens model and its later developments provided a useful framework for studying judgemental processes, but its shortcomings prevent us from obtaining a detailed picture of the dynamics of the process, both from the aspect of the environment and from that of the decision maker.

Regarding the WP task, averaging performance across all trials not only ignores the possibility that one can vary one's subjective weights over the trials, but also overlooks the fact that the individual cannot obtain a representative picture of the probabilistic structure early in the task. In fact, the observed probability of an outcome changes from trial to trial and reaches the final (given) value only after the last feedback (so it is actually never


measured). Consider the structure of Figure 4., which depicts the binary decisional tree in PCL. Going downwards from the top of the tree, one can observe the percentage of the normative expectancy of one of the feedbacks at each step.

Figure 4.

Binary decisional tree in WP. Going downwards from the top of the tree, the numbers on the diagonal lines indicate whether the correct response was set to be sun (1) or rain (0). The numbers (x) inside the rectangles represent the percentage of the normative expectancy of the sun feedback at each given step (for rain the values are 100 - x).

In practice, the outcome patterns are (quasi-)randomly distributed (with fixed overall probabilities) in WP experiments. The illustration (Figure 4.) shows that, even with identical final percentage values, each participant might observe different probabilities on the preceding trials. This makes the group-averaging technique quite coarse, if not inadequate.
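The point can be made concrete with a small sketch: the probability a participant has actually observed for a pattern is a running proportion that drifts from trial to trial and coincides with the programmed value only at the very end. The feedback sequence below is a made-up example, not one from the experiment.

```python
# Hypothetical feedback sequence for one pattern whose programmed
# probability of "sun" is 75% over 8 presentations (1 = sun, 0 = rain).
feedback = [1, 1, 0, 1, 1, 1, 0, 1]

observed = []
for t in range(1, len(feedback) + 1):
    observed.append(sum(feedback[:t]) / t)  # running proportion of "sun"

print(observed)
# The running value fluctuates and equals the programmed 0.75
# only after the final feedback has been seen.
```

Any analysis that compares responses against the final 75% value on early trials therefore holds the participant to a probability he or she has not yet observed.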


Furthermore, on the 14 trials that present a particular pattern for the first time, participants have no previous knowledge about the outcomes at all, so their responses cannot be a result of learning.

It is also plausible that people do not have perfect memory for all the observed stimuli; thus recency may have an effect on decisions. People in experience-based learning situations have to update their impressions according to the newly sampled outcomes (Hogarth & Einhorn, 1992), and recently presented outcomes may carry greater weight than earlier ones (e.g., Hertwig, Barron, Weber, & Erev, 2004). Moreover, individuals may vary in how many remembered samples they consider in their decisions (e.g., Jones, Love, & Maddox, in press).
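One simple way to formalise such recency weighting is an exponentially decaying average of past outcomes. The function below is a loose sketch of this idea under my own assumptions (the decay parameter and the example history are arbitrary); it is not any specific published model.

```python
def recency_estimate(outcomes, decay=0.8):
    """Weighted proportion of outcome 1, giving more recent trials
    exponentially greater weight (decay < 1). A generic recency sketch,
    not a specific model from the literature."""
    n = len(outcomes)
    weights = [decay ** (n - 1 - i) for i in range(n)]
    return sum(w * o for w, o in zip(weights, outcomes)) / sum(weights)

history = [1, 1, 1, 0, 0]          # early "sun" run, recent "rain" run
print(recency_estimate(history))   # pulled below the plain mean of 0.6
print(sum(history) / len(history))
```

With decay closer to 1 the estimate approaches the plain running proportion; with decay near 0 only the last few outcomes matter, mimicking a short memory window.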

In view of these facts, the conclusion is inevitable that a more sensitive method of analysis is required for a precise description of probability learning, along with a coherent framework for modelling the dynamics of the process.

    V. The Dynamical approach

A more sensitive methodology for this analysis is provided by the dynamical approach. The dynamical approach to cognitive science rejects the idea that cognition is the operation of a special mental computer located in the brain; rather, it provides a framework for understanding cognition as a complex natural system (van Gelder & Port, 1995). One of the two main tenets of the Dynamical Hypothesis (van Gelder, 1998) that distinguishes it from traditional views of cognition is its primary focus on processes in real time. Contrary to the computational aspects, the main aim of the approach is to describe behaviour in its temporal course: instead of the input-output relation, its concern is with how the overall system changes in time. In contrast to the computer-storage analogy, followers of the dynamic view reinforce the common psychological view that we are not passive recipients of information but actively manipulate, reconstruct, and bias it (MacLeod, Uttl, & Ohta, 2005). The other key aspect of the approach is an emphasis on total state, which refers to the conjunction of all aspects of the system at a given point in time (Beer, 2000; Bosse, 2005).

    V.1. Dynamic Models of Cognition

The approach has proved expedient for theories of decision making. Busemeyer and Townsend (1993) were among the first to publish a usable dynamic model of high-level cognitive processing. Their study provided a new perspective for understanding the relations among decisional


models. They classified decision making models according to two attributes: deterministic versus probabilistic, and static versus dynamic (1993). This dynamic-cognitive decision field theory, in contrast to the earlier dominant deterministic and static theories, successfully accounts for many time-varying aspects of the phenomenon and can offer a more detailed, process-oriented explanation of the motivational and cognitive mechanisms of decision making (Port, 2000).

Before we exult over finding a revolutionary alternative to the traditional theories of cognitive science, we should note not only the long-established presence of its components, but also the considerable limitations of its current applicability. The general framework was developed to study biological and cognitive systems in their evolution, but in many fields of the science there is not much realistic prospect for its empirical testing (French & Thomas, 2001). In most cases of cognitive processing the transitions are extremely rapid and the variables exceedingly numerous and hardly detectable. Except for a few carefully constrained simple cases, we have insufficient amounts of data and inadequate methods of computation compared to what these analyses would require (Port, 2000).

    V.1.1. Static and Dynamic Models of Learning

Still, one of the areas where dynamic analyses may provide stronger descriptive and predictive power is the modelling of learning processes. As discussed before, learning is a process in which we change our predictions about the environment on the basis of new experiences. This process is an evolving product of changing factors along the course of time (Smith et al., 2004). In typical experiments, learning curves can be documented by recording responses (decisions) in multiple-trial tasks, which provide continuous sampling of the changing phases of the process. In these studies (usually with binary responses), stimuli can be associated deterministically or probabilistically with reinforcement. Following Busemeyer and Townsend's (1993) categorization of decision theories, I propose here an outline classification of human learning models along the deterministic versus probabilistic and static versus dynamic attributes, as in Table 1.


Table 1

Categorization of Learning Models

Category        Static                            Dynamic
Deterministic   Classical/Operant Conditioning    Skill Learning
Probabilistic   Rescorla-Wagner model             Dynamic Probability Learning model

Note. The matrix depicts a hypothetical categorization of learning models according to the deterministic-probabilistic and static-dynamic axes.

A model is static when it regards and measures learning not in its course, as dynamic models do, but as a property of the system at a certain point in time. Traditional conditioning theories therefore fall into this block, and most standard tests also belong here. Regarding the other attribute, a model is deterministic if the stimulus is always, or never, associated with the response, while in probabilistic theories the stimuli are reinforced according to some varying distribution. In this sense the Rescorla-Wagner model (1972) is static, because it does not account for the factors that change from trial to trial. Considering these aspects, the study of probability learning requires the techniques of a dynamic learning model.
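For reference, the Rescorla-Wagner delta rule mentioned above can be sketched in a few lines of Python. The learning rate, asymptote, and the single-cue example are illustrative choices of mine, not parameters from the original 1972 formulation.

```python
def rescorla_wagner(stimuli, rewards, alpha=0.1, lam=1.0):
    """Trial-by-trial associative strengths under the Rescorla-Wagner
    delta rule: dV = alpha * (lambda * reward - sum of V for present cues).
    `stimuli` is a list of tuples of present-cue indices; `rewards`
    marks whether the outcome occurred (1) or not (0) on each trial."""
    V = {}
    for present, r in zip(stimuli, rewards):
        total = sum(V.get(c, 0.0) for c in present)
        delta = alpha * (lam * r - total)
        for c in present:
            V[c] = V.get(c, 0.0) + delta
    return V

# A single cue reinforced on every trial climbs toward the asymptote lam = 1.
V = rescorla_wagner([(0,)] * 50, [1] * 50)
print(round(V[0], 3))
```

Note that although V is updated on every trial, the model's predictions are driven by a single running strength per cue, which is one way to read the claim above that it treats learning as a state rather than as a trajectory.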

    V.2. Dynamical analyses

Two techniques have been published recently for monitoring learning behaviour over time. The rolling regression analysis (or sequential least squares technique) was introduced to illuminate individual behavioural differences in price forecasting (Kelley & Friedman, 2002). This method computes a series of regressions over a moving window, generating trial-by-trial estimates of the individual's responsiveness to the observed cues. The learning curve is then compared with the curve that an ideal learner would show on the same task. The method takes into account the trial-by-trial information that the participant has actually observed; thus, the ideal learner's strategies are defined according to the current state of knowledge of an ideal observer at each trial. This technique makes it possible to examine decisional attitudes individually along the course of the experiment and provides a tool to compare learning performance with ideal learners of various strategies (e.g., Kitzis, Kelley, Berg, Massaro, & Friedman, 1998; Lagnado et al., in press). The other,


explicitly dynamic statistical analysis of learning is the state-space model paradigm (Smith et al., 2004). This model computes the probability of a correct response at each state of the learning process by maximum likelihood, applying expectation-maximization algorithms. Knowing the learning curve and its confidence intervals permits us to identify the first trial on the curve at which an individual performs better than chance level (Smith et al., 2004). This technique gives a precise definition of learning and a coherent statistical framework for learning studies with binary responses.
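The core of the rolling regression technique can be sketched as follows. This is a hedged illustration with simulated data (my own window size, weights, and accuracy level), not Kelley and Friedman's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated task: 120 trials, 4 binary cues, outcome driven by fixed weights.
n, window = 120, 40
cues = rng.integers(0, 2, size=(n, 4)).astype(float)
w = np.array([0.5, 0.25, -0.25, -0.5])
outcomes = (rng.random(n) < 1 / (1 + np.exp(-(cues @ w)))).astype(float)
# A learner whose responses track the outcomes imperfectly (80% agreement).
responses = np.where(rng.random(n) < 0.8, outcomes, 1 - outcomes)

def rolling_betas(X, y, window):
    """Regression coefficients recomputed over a moving window of trials,
    yielding one cue-weight profile per window position."""
    betas = []
    for start in range(len(y) - window + 1):
        Xw = np.column_stack([np.ones(window), X[start:start + window]])
        b, *_ = np.linalg.lstsq(Xw, y[start:start + window], rcond=None)
        betas.append(b[1:])  # drop the intercept
    return np.array(betas)

trajectory = rolling_betas(cues, responses, window)
print(trajectory.shape)  # one set of four cue weights per window position
```

Plotting each column of `trajectory` against the window position gives the trial-by-trial picture of cue reliance that the static whole-task regression averages away.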

    VI. Experiments

The present explorative study aims to analyse what decisional strategies are applied in probabilistic learning situations. The question was examined with some modifications of the standard PCL task, and two novel statistical analyses, rolling regression and the state-space model, were applied to the data.

First of all, I used a new version of the usual PCL scheme (see Sections III.1. and III.2.). As reviewed in Section IV.2., a variety of different strategies are about equally effective for solving the WP task. The ambiguity springs from the perceptual design of the four cues: in this task the cues are four geometric forms whose combinations are the basis of judgement. As Gluck et al. (2002) demonstrated, applying the singleton strategy (focusing only on the patterns in which a single cue is present) can lead the participant to as good a performance as the multi-cue strategy (in which all cue combinations are considered). For example, if the participant associates every single triangle with rain and guesses randomly on the other trials, he may reach a good score on the whole experiment even though no probability learning occurred. To prevent this complication I used a PCL task different from the WP. The experimental setup was adapted from Shohamy and colleagues (Shohamy et al., 2004), who constructed a design that kept the basic structure of the WP but used less detached cues. As described in detail below, participants had to make guesses on the basis of the features (cues) of a toy figure (hat, moustache, etc.) (Figure 6.).

Another modification was that the sequence of trials and feedbacks was set to be identical for each participant. As written in Section IV.2., the random distribution of the patterns (with fixed overall probability) limits the possibilities of individual comparison. With fixed pathways of the patterns in the decisional tree, the order of stimuli and outcomes became identical for all participants, making the data usable for aggregated trial-by-trial evaluation within the group and between individuals.


To measure performance differences after different feedback pathways, several pathways led to some of the final probabilities. Figure 5. shows that multiple pathways lead to final probability values of 33.3, 66.7, 25, 75, 16.6, 83.3, and 50 percent. For some of the pathways with a common final point, one answer was more probably correct in the beginning but the alternative answer became dominant later on, while in other cases one answer was dominant over the whole course of trials. The fixed-pathway method also permits us to compare learning differences at the same final point after different paths.

    Figure 5.

Feedback pathways. The 14 pathways following the arrows represent the feedback sequences of the 14 patterns. Numbers (x) in the rectangles indicate the percentage of the normative expectancy of the vanilla feedback at each given step (for chocolate the values are 100 - x).1

1 Constructed by Dénes Tóth.


In practice, the final value that represents the normative overall probability of a pattern can be measured only after the last feedback (on previous practice see Section IV.4.). Therefore, I inserted one more trial of each pattern (14 in total) with no feedback after its last presentation. These extra trials provide data for measuring learning at the final point.

In Section IV.1. we saw that experience-based learning is assumed to rest on implicit rather than explicit mechanisms. Ignoring this led previous studies to confusion about the explanation of the learning processes in this task (see Section IV.3.). Many researchers emphasized that the misinterpretation of data originates from the malpractice of testing experimental learning with explicit assessments. To avoid this, I added an implicit test phase following the learning phase. This phase was identical to the learning phase except that participants did not receive feedback after the trials but were told that they would receive their results at the end of the session. The advantage of this extra part is that it tests the subjects in a situation similar to the learning phase, and since there is no feedback, the observed probability remains unchanged, so we can compute mean scores from the sequences of test trials.

To measure participants' explicit task knowledge, the third part of the experiment was an overt test of the probability value they associated with each pattern. The explicit test had to follow the other tests so as not to interfere with them.

    VI.1. Methods

    VI.1.1. Participants

Forty-five undergraduate students from the Psychology Institute of ELTE, Budapest, participated in the present study (mean age = 23.46 years; SD = 3.77 years); 15 were male and 30 female. The participants were divided into a baseline and a dual-task group. Twenty-eight participants were in the baseline group (n = 28; 6 males and 22 females; mean age = 22.00 years, SD = 2.87 years). The dual-task group consisted of 17 participants (n = 17; 9 males and 8 females; mean age = 25.88 years, SD = 3.92 years). The participants received course credits.

    VI.1.2. Materials

A modified version of the PCL task (as introduced by Shohamy et al., 2004) was used in this study. In this version participants are told that they are selling ice cream in an ice


cues present (1111) and with no cue present (0000) were never used (following the scheme of previous works since Knowlton et al., 1994).

    Table 2

Note. Each card could be present (1) or absent (0) in each pattern. The all-present (1111) and all-absent (0000) patterns were never used. The overall probability of the vanilla outcome across all patterns is 50%.
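The 14 patterns follow directly from the binary cue design: every combination of four present/absent cues except the two excluded ones. A quick sketch of their enumeration (the pattern-to-probability assignments themselves are the experiment's own and are not reproduced here):

```python
from itertools import product

# All combinations of 4 present/absent cues, excluding the never-used
# all-absent (0000) and all-present (1111) patterns.
patterns = [p for p in product((0, 1), repeat=4)
            if any(p) and not all(p)]

print(len(patterns))  # 16 raw combinations minus the two excluded ones = 14
```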

Additionally, two digital photographs of vanilla and chocolate ice cream and an image of the tip jar showing the extra tips were used on the screens. All materials were presented on a computer screen against an identical black background.

Using the 14 stimulus patterns, 214 trials were constructed for the learning phase and 70 for the test phase. For the explicit test, 14 PowerPoint slides were created from the 14 patterns. During the learning phase the two feedbacks (vanilla, chocolate) were equally probable, but each pattern was independently associated with one of the feedbacks according to the scheme of Table 2. Of the 214 learning trials, 200 followed the pathways of Figure 5.; the remaining 14 trials were not associated with feedback.
