

Rescorla, R. A., & Wagner, A. R. (1972). A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and nonreinforcement. In A. H. Black & W. F. Prokasy (Eds.), Classical conditioning II: Current research and theory (pp. 64-99). New York: Appleton-Century-Crofts.

ROBERT A. RESCORLA
ALLAN R. WAGNER

3 A Theory of Pavlovian Conditioning: Variations in the Effectiveness of Reinforcement and Nonreinforcement

In several recent papers (Rescorla, 1969; Wagner, 1969a, 1969b) we have entertained similar theories of Pavlovian conditioning. The separate statements have in fact differed more in the language of their expression than in their substance. The major intent of the present paper is to explicate a more precise version of the form of theory involved, and to indicate how it may be usefully applied to a variety of phenomena involving associative learning.

The impetus for a new theoretical model is not generally a new datum which clearly disconfirms existing theory. It is more likely to be the accumulation of a salient pattern of data, separate portions of which may be adequately handled by separate existing theories, but which appears to invite a more integrated theoretical account. Such, at least, is the better description of the background of the present work.

In the sections which follow we will first describe certain data from our laboratories which exemplify the kind of observations which have encouraged the present theorizing. The theory will then be presented in sufficient detail to show how it may be applied to experimental situations involving a variety of Pavlovian conditioning arrangements. Finally, we will briefly discuss the theory in relationship to more conventional approaches.

BACKGROUND

The background data pattern embraces a considerable range of phenomena. At the core, however, is a rather simple set of observations involving Pavlovian conditioning with compound CSs.

Suppose we have inferential knowledge concerning the "associative strength" of some stimulus element A and of a second element X. This generally requires that we know certain things about the organism's history of experience with the separate cues, and something about the organism's behavior in their presence. We may know, for example, that A is a CS which has frequently been paired with a US, and which consistently elicits a sizeable CR. And we may know that X is a novel CS which neither elicits a CR nor inhibits the occurrence of otherwise elicited CRs. In this case, we would commonly attribute a high excitatory strength to A and a zero strength to X.

The preparation of this manuscript and the research reported were supported in part by National Science Foundation grants GB-64[illegible] and GB-6554. The order of authorship was determined by lot. We would like to thank Miss Karen Gould, Mrs. Maria Saavedra, and Mr. Gerd Lehmann for assistance in data collection, and Mr. Donald Rightm[illegible] for writing the computer program for the model.

Suppose further then, that A and X cues which have been arranged to have special strength characteristics are presented concurrently and the AX compound is either reinforced by a US or is nonreinforced. What effect will such an AX trial, or a series of similar AX trials, have upon the behavioral influence, or "associative strength" of X alone? The answer, it appears, is that the effects will depend in a systematic fashion not only upon the current strength of X, but also upon the current strength of A, and hence upon the net strength of the AX compound. For example, if X has a relatively low excitatory value, a series of AX reinforced trials will increase the CR eliciting characteristic of X much more when A is arranged to have a relatively low excitatory value than when A is arranged to have a high excitatory value. Similarly, a series of AX nonreinforced trials will decrease the CR eliciting characteristic of X, or will increase the CR inhibitory characteristic of X much more if A is arranged to have a relatively high excitatory value than if A is arranged to have a low excitatory value.

Support for such generalizations may be drawn from a number of sources (e.g., Kamin, 1968; Egger & Miller, 1962; Konorski, 1948; Pavlov, 1927). But, it will be convenient to use several experiments from our laboratories to indicate the systematic variation involved. We will first illustrate the manner in which the effects of reinforcement appear to depend upon the net strength of the compound, and then the manner in which the effects of nonreinforcement appear also to depend upon the net strength of the compound.

Variation in the effects of reinforcement. Wagner and Saavedra (Wagner, 1969b) trained three groups of 20 rabbits in an eyelid conditioning situation in which the US was a 100 msec. 4.5-ma. shock to the area of the eye. In the reference condition (Group II) Ss received 200 conditioning trials in which a 1100 msec. compound CS, consisting of a flashing light (A) and a tone (X), always terminated with reinforcement. Two additional groups received an equal number of reinforced AX trials, but also received 200 trials with the A cue alone, irregularly interspersed among the compound trials. In Group I, A alone was always reinforced; in Group III, A alone was always nonreinforced.


Immediately following training all Ss received 16 reinforced test trials with X presented alone for the first time. In each of the three training conditions X had been experienced an equal number of times, and had always been followed by reinforcement. However, the conditions were designed to encourage different degrees of conditioning or "associative strength" to A with which X was always experienced in compound. In comparison to Group II which received only compound trials, reinforcing A alone in Group I should have increased the associative strength of A and hence of the AX compound during training, whereas nonreinforcing A alone in Group III should have decreased the strength of A and hence of the AX compound during training. The question was whether X would be differentially responded to in the three conditions as a function of this differential experience with A.

Figure 1 summarizes the percentage test trial responses of the three groups to the X element. Also included for comparison are the percentages of conditioned eyeblink responses to the AX compound and to the A element, where appropriate, during the immediately preceding block of training trials. As may be seen, relative to Condition II increasing the associative strength of A in Condition I decreased conditioned responding acquired by X, whereas depreciating the associative strength of A in Condition III increased conditioned responding acquired by X. This ordering of the treatments is not only statistically reliable, but is very reproducible in different situations. Wagner (1969b) has reported essentially identical results from similar comparisons involving Conditioned Emotional Response (CER) training or discriminated bar-press training with rats.

Figure 1. Median percentage conditioned eyeblink responses to an AX compound and to the A and X elements alone, in three groups receiving either no training with A alone (II), training with A alone reinforced (I), or training with A alone nonreinforced (III), contemporaneous with AX reinforced.

Rescorla, in a previously unpublished experiment, obtained similar effects, but in a situation in which the associative value of the A cue was manipulated prior to the start of AX training. Four groups of rats were first trained to bar-press on a VI schedule for food reinforcement. The several groups then received different Pavlovian conditioning treatments with a 2-min. tone CS (A) and a 0.5 sec. 1-ma. foot shock, while confined in a separate shock chamber. These treatments were designed to establish different behavior effects to A. For all groups 12 presentations of A occurred in each of 5 2-hour conditioning sessions. In Group .8-0, A was trained to elicit fear by presenting the shock with a probability of .8 during the CS but never in its absence. For a second group (Group 0-.8) shocks occurred with a frequency of .8 per 2-min. interval in the absence of A but the onset of A signalled a 4-min. period free from shocks. This procedure could be expected (e.g., Rescorla, 1969) to make A a conditioned inhibitor of fear. The remaining two groups were control groups in which the conditioning treatments could be expected to leave A relatively neutral. Thus, Group Control 0-.8 received the same number of shocks as Group 0-.8, and the same number of exposures to A, but the two were uncorrelated in time. Group Shock received the same schedule of shocks as Group 0-.8, but never experienced A.

Following this conditioning to A alone, all Ss received 8 trials in which a flashing light (X) was presented in conjunction with A and the compound was reinforced with shock on a 50% schedule. Finally, on each of four test days following this compound conditioning all Ss received 4 nonreinforced test presentations of X alone while bar-pressing.

Figure 2 summarizes the results of these extinction test sessions, in the form of mean suppression ratios (Annau & Kamin, 1961). This ratio yields a value of zero when the CS completely disrupts bar-pressing, and a value of .5 when bar-pressing behavior is unaffected by the CS. Thus, the lower the value indicated, the more effective was X alone. It should also be noted that the behavior of the two reference groups (Group Shock and Group Control 0-.8) was virtually identical throughout testing, so that the two groups have been combined in Figure 2.
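The suppression ratio is cited but not spelled out in the text. As a minimal sketch, assuming the form usually attributed to Annau and Kamin, B / (A + B), where B is the number of bar-presses during the CS and A the number during an equal period just before it, the two anchor values mentioned above follow directly. The function and argument names are illustrative, not taken from the chapter.

```python
def suppression_ratio(cs_responses: float, pre_cs_responses: float) -> float:
    """Return B / (A + B): B = responses during the CS, A = responses in an
    equal pre-CS period. This formula is an assumption; the chapter only
    cites the measure and gives its two anchor values (0 and .5)."""
    total = cs_responses + pre_cs_responses
    if total == 0:
        # No bar-pressing in either period: the ratio is undefined.
        raise ValueError("no responses in either period; ratio undefined")
    return cs_responses / total


# CS completely disrupts bar-pressing.
print(suppression_ratio(0, 40))   # 0.0
# Bar-pressing is unaffected by the CS.
print(suppression_ratio(40, 40))  # 0.5
```

Lower values therefore indicate a more effective CS, which is the reading the text applies to Figure 2.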

Although X was experienced, and had been followed by reinforcement an equal number of times in the several groups, X was not similarly effective during testing. In comparison to the performance of the reference groups, pretraining cue A to elicit fear in Group .8-0 decreased the acquisition of fear to X as a result of the AX reinforcements. In addition, pretraining cue A to be an inhibitor of fear in Group 0-.8 increased the acquisition of fear to X as a result of the AX reinforcements.

Figure 2. Mean suppression ratio for X following AX reinforced trials. Groups .8-0 and 0-.8 had prior excitatory and inhibitory training, respectively, to A, while the reference groups had prior treatment not expected to influence the associative strength of A.

Variation in the effects of nonreinforcement. A study conducted by Wagner, Saavedra, and Lehmann (Wagner, 1969b) was designed to evaluate whether nonreinforcement would also have different effects upon a stimulus element, depending upon the strength of the compound in which the element was imbedded. The study was conducted in eyelid conditioning and generally employed parameters similar to those in the earlier Wagner and Saavedra study.

Thirty-six rabbits were first conditioned to three separate stimulus elements, which will be referred to as A, B, and X. Over the course of two days training there were 224 A, 28 B, and 224 X trials, irregularly ordered, in which the respective cues were presented alone and reinforced.

The A and B cues, by virtue of the different numbers of reinforcements in their presence, were designed to have different associative strengths, i.e., A was designed to be a relatively strong cue, and B a relatively weak cue by the end of acquisition. For half of the Ss A was a flashing light and B was a vibration applied to Ss' chest. For the remaining Ss the nature of the cues was reversed.

The X cue, which for all Ss was a [frequency illegible] Hz tone, was the element of special interest. Immediately following acquisition, Ss were assigned to one of two treatment conditions and administered 32 extinction trials, in which X was presented and nonreinforced. For half of the Ss X was presented during extinction in compound with Ss' A cue, while for the remaining 18 Ss it was presented in compound with the B cue.

On the 32 trials immediately following the extinction phase, X was again presented alone to all Ss and was reinforced. Comparison of Ss' responding during this reacquisition phase with the level of responding to X at the end of original acquisition, allowed a determination of the decremental effects suffered as a result of the intervening extinction, with either of the two compounds containing X.

Figure 3 represents the mean percentages conditioned eyeblink responses to the several CSs during the three phases of the experiment. The acquisition functions which summarize the responding of all 36 Ss prior to differential treatment, indicate that there was appreciable acquisition to X, and, importantly, different amounts of acquisition to the A and B cues. Further evidence that A and B attained different associative strengths may be seen in the extinction phase panel of Figure 3. That group which received X in compound with the A cue responded more frequently during extinction than did that group which received X in compound with the presumably weaker, B cue.

Figure 3. Mean percentage eyeblink responses during three training phases, involving acquisition to each of three separate component CSs, extinction with one of two compounds formed from the acquisition components, and reacquisition to the component common to the two extinction compounds. Panels: Acquisition (all components), Extinction (selected compound), Reacquisition (X). (From Wagner, 1969b.)

The data of major interest, however, are depicted in the reacquisition functions which summarize the subsequent responding to X alone, in each of the two treatment groups. As is apparent, there was less responding to X following the AX extinction than following the BX extinction condition. That group in which the 32 nonreinforced exposures to X involved a relatively strong compound containing the A cue, experienced a significantly greater decrement in responding to X than did that group in which the same nonreinforced exposures to X involved a relatively weak compound containing the B cue.

Nonreinforcement may not only cause a CS to lose its tendency to elicit conditioned responses, but under appropriate circumstances may cause a CS to become "inhibitory," i.e., to act so as to decrease the likelihood of otherwise elicited CRs. The circumstances which are known to favor this occurrence are in fact consistent with the data from the previous study. That is, while simply nonreinforcing a previously neutral cue in isolation is unlikely to make that cue inhibitory, consistently nonreinforcing the same cue when in compound with an otherwise excitatory cue can result in a "conditioned inhibitor" (e.g., Konorski, 1948). This fact may be viewed as further indicating that the "decremental" effects of nonreinforcement are greater, the greater the net associative strength of all of the cues which precede the nonreinforcement.

To further evaluate this proposition, Wagner and Saavedra, in a previously unpublished experiment, only slightly modified the procedure of the Wagner, Saavedra, and Lehmann study referred to above. During an initial acquisition phase, cues A and B were again trained, as a result of differential numbers of reinforced trials (240 vs. 8), to have different associative strengths, and a third cue C, necessary for the test phase, was also highly trained (548 trials).

Following such training, a novel cue X was introduced in compound with either A or B for different groups of 20 Ss, and the compound was nonreinforced. Sixty-four such nonreinforced trials were irregularly alternated with a similar number of trials in which the cue paired with X continued to be presented alone and reinforced.

The X cue should have become a conditioned inhibitor as a result of either training schedule (e.g., Pavlov, 1927; Rescorla & LoLordo, 1965). The question was whether X would become more inhibitory as a result of being nonreinforced in compound with the stronger A cue, as compared to the weaker B cue. This was evaluated by returning the C cue, and determining in both groups the reduction in responding to C when in compound with X. This final test phase involved 16 reinforced presentations of C and of the CX compound.

Figure 4. Mean percentage conditioned eyelid responses, in evaluation of the conditioned inhibitory properties of X in two groups (AX- and BX-), tested with C and CX. In one group X had been nonreinforced in compound with a relatively excitatory cue, A, while in the other group X had been nonreinforced in compound with a less excitatory cue, B.

For all Ss, C was a flashing light, X a vibratory stimulus, and A and B dissimilar auditory cues, the identification of the two as A and B counterbalanced within experimental groups. Conditioned responding observed during the initial training phases was appropriate to the experimental intention that A have a greater associative strength than B: prior to the introduction of X all Ss were responding at a higher level to their A than to their B cue, and a similar difference was continued in the performance of the separate groups subsequently receiving A reinforced vs. AX nonreinforced or B reinforced vs. BX nonreinforced.

Figure 4 presents the data of major interest from the final test phase. The two groups responded at the same high level to the C cue alone. The addition of the X cue, however, decreased this responding considerably (and reliably) more in the case of that group which had previously experienced the nonreinforcement of X in compound with the relatively strong A cue, than in the case of that group which had experienced the nonreinforcement of X in compound with the relatively weak B cue.

In all of the previous studies the associative strength of the cue with which X was eventually treated in compound was manipulated by varying the schedule of reinforcements and nonreinforcements with respect to that cue alone. There are, however, other variables which should influence the learning which would accrue to such cues alone and it might be expected that these variables would have an influence similar to that produced by varying the reinforcement schedule. For example, in the Wagner and Saavedra inhibition study above, it might have been as effective to bring A and B to differential strengths as a result of the same number of pairings with a US, but with a higher intensity US associated with A than with B.

Figure 5. Median suppression ratio to C alone and the CX compound in three groups (0 ma., .5 ma., 1 ma.). Stimulus X had received prior inhibitory training contrasted with different intensities of the US.

Such reasoning gains support from an unpublished experiment by Rescorla. Following VI food-rewarded bar-press training, three groups of 8 rats received CER conditioning, with a 1200 Hz tone alone (A) and a compound (AX) composed of this tone plus a flashing light. In total, 45 A and 75 AX trials were irregularly distributed over 30 training days. For all Ss the AX trials were consistently nonreinforced. The groups differed in the intensity of a .5 sec. shock US which they received on the A alone trials, being either 0-ma. (nonreinforcement), .5-ma., or 1-ma.

In order to evaluate the inhibitory effects of X in each of the three treatments it was necessary to train the CER to an additional cue (C). This cue was a 250 Hz tone, introduced for all Ss after the differential treatment phase and reinforced with a .5-ma. shock on a 50% reinforcement schedule. Testing was then accomplished by evaluating the degree of suppression produced when X was presented in compound with C, as compared to C alone. Two C and two CX trials, each nonreinforced, were presented on each of 6 consecutive test days.

Figure 5 presents the median suppression ratios during testing under the two cue conditions in the three groups. Again it should be noted that the smaller the ratio value the more effective was the CS in disrupting bar-pressing. As may be seen, C alone produced equivalent degrees of suppression in all three groups. The addition of X had little effect upon suppression in the 0-ma. group but increasingly interfered with suppression in the .5-ma. and 1-ma. groups. There was a clear and statistically reliable tendency for X to be a more effective conditioned inhibitor of the CER as a result of AX nonreinforcement, the more intense the US with which the A cue alone was paired.

THE BASIC THEORY

The generalization which applies to all of the results in the previous section is that the effect of a reinforcement or nonreinforcement in changing the associative strength of a stimulus depends upon the existing associative strength, not only of that stimulus, but also of other stimuli concurrently present. It appears that the changes in associative strength of a stimulus as a result of a trial can be well-predicted from the composite strength resulting from all stimuli present on that trial. If this composite strength is low, the ability of a reinforcement to produce increments in the strength of component stimuli will be high; if the composite strength is high, reinforcement will be relatively less effective. Similar generalizations appear to govern the effectiveness of a nonreinforced stimulus presentation. If the composite associative strength of a stimulus compound is high, then the degree to which a nonreinforced presentation will produce decrements in the associative strength of the components will be large; if the composite strength is low, the effect of a nonreinforcement will be reduced.

Certain similarities and differences between these generalizations and Hull's postulates for growth of sHR will be readily recognized. The changes in associative strength are acknowledged to depend upon current levels of that strength. However, the statements above assert that changes in the strength of a stimulus depend upon the total associative strength of the compound in which that stimulus appears, whereas for Hull only the strength of the component in question was relevant. It is just this dependence upon total associative strength which is central to the theory we wish to develop here.
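The contrast between dependence on the component's own strength and dependence on the compound's total strength can be illustrated numerically. The following is only a sketch under stated assumptions: the text at this point gives the idea verbally, so the linear update, the summation of strengths, and the values of `rate` and `asymptote` are all illustrative choices rather than anything specified above.

```python
def reinforced_trial(strengths, present, rate=0.2, asymptote=1.0, composite=True):
    """Return new strengths after one reinforced trial with the cues in `present`.

    composite=True : each cue's change depends on the summed strength of all
                     cues present (the idea stated verbally in the text).
    composite=False: each cue's change depends only on its own strength
                     (the Hull-like alternative).
    The linear form and parameter values are assumptions for illustration.
    """
    total = sum(strengths[cue] for cue in present)
    updated = dict(strengths)
    for cue in present:
        basis = total if composite else strengths[cue]
        updated[cue] = strengths[cue] + rate * (asymptote - basis)
    return updated


def train(strengths, present, n_trials, composite=True):
    """Apply n_trials reinforced compound trials and return the final strengths."""
    for _ in range(n_trials):
        strengths = reinforced_trial(strengths, present, composite=composite)
    return strengths


strong_a = {"A": 1.0, "X": 0.0}  # A already strongly conditioned
weak_a = {"A": 0.0, "X": 0.0}    # A as weak as X

x_after_strong_a = train(strong_a, ["A", "X"], 20)["X"]
x_after_weak_a = train(weak_a, ["A", "X"], 20)["X"]
x_component_only = train(strong_a, ["A", "X"], 20, composite=False)["X"]

print(x_after_strong_a, x_after_weak_a, x_component_only)
```

Under these assumptions X gains essentially nothing from AX reinforcement when A is already strong and gains more when A is weak, whereas the component-only rule lets X grow regardless of A. That is the qualitative ordering reported in the background experiments.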

There are a variety of theoretical languages in which this central idea can be expressed. One rather peripheralistic formulation has been suggested by Rescorla (1969). He proposed that the change in CR conditioned to a CS, as a result of a CS-US pairing may depend upon the discrepancy between the CR actually evoked on that trial and the maximum CR which the particular US will support. The CR occurring on a trial arises from all of the stimuli present on that trial, not simply the CS in question. Increments in conditioning may be assumed to occur when the actual CR evoked on a trial is smaller than the maximum which the ensuing US will support. Correspondingly, decrements result when the actual CR is larger than the maximum CR. Rescorla has particularly emphasized the implication that Pavlovian conditioned inhibition can be established to a CS by presenting it at a time when the actual CR is larger than the maximum CR which the subsequent US will support.

A somewhat different version of the central notion, that conditioning depends upon the associative strength of all stimuli occurring on a trial, has been suggested by Wagner (1969a, b). Wagner couched his proposal in terms of the changes in "signal value" of a cue, an associative construct meant to embrace both the incremental effects of reinforcement and the decremental effects of nonreinforcement. Specifically, the changes in signal value as a result of a trial were assumed to be linear functions of the composite signal value resulting from all stimuli present on that trial. Separate sets of such linear functions were suggested to be appropriate for the cases of reinforcement and nonreinforcement. The resultant signal value of the stimulus would presumably be reflected in the overt CR, although the specific relationship was not treated.

A less completely formulated version of these ideas has been suggested by Kamin (1968). Indeed, it was Kamin's notions concerning the "surprisingness" of a US that originally encouraged the formulations of Rescorla and Wagner. Attempting to account for his data on the so-called blocking effect, Kamin argued that conditioning will occur only when the US event is somehow "surprising" for the animal. Although the conditions which produce this surprise were not detailed, Kamin clearly intended that the surprise generated by a US be assumed to be reduced if that US is preceded by a CS which has previously been paired with it. Consequently, the surprise generated on a CS-US trial (and the resulting increment in conditioning to the CS) should depend upon the degree to which all stimuli present predict the US which occurs. It is not clear from Kamin's formulation what should be the consequences for conditioned responding when the animal is variously "surprised" by the nonoccurrence of a US.

The central notion suggested here can also be phrased in somewhat more cognitive terms. One version might read: organisms only learn when events violate their expectations. Certain expectations are built up about the events following a stimulus complex; expectations initiated by that complex and its component stimuli are then only modified when consequent events disagree with the composite expectation.

A more precise formulation of the theory. It should be clear that these formulations all express the same one idea. They all generate essentially similar expectations with respect to the variable effects of reinforcement, as reported in the previous section. However, the ability of any of these formulations to make specific predictions is limited by their imprecise verbal nature. It has seemed profitable to us to ask whether, if we make more specific, formal assumptions around the central notion involved, we could expand the possibilities for experimental evaluation. In what follows we will attempt one such specification; the formulation follows most closely Wagner's (1969a, b) version of the theory.

As indicated above, one way to look at the central notion of this theory is as a modification of Hull's account of the growth of ${}_{S}H_{R}$. Similarly, one way to view the particular formalization to be proposed is as a modification of the mathematical model most closely related to the Hullian theory, the linear model. This model (e.g., Bush & Mosteller, 1955) specifies the changes in probability of a response as a result of a trial by the following equation:

$$\Delta p_n = \beta(\lambda - p_n),$$

where $\beta$ is the learning rate parameter, $p_n$ the probability of a response on trial $n$, and $\lambda$ the asymptote of learning. The particular values of $\beta$ and $\lambda$ are determined by the US and CS events involved on the trial. Clearly the model incorporates the basic Hullian assumption that the increment (or decrement) in learning on each trial is dependent upon the amount already conditioned at the beginning of that trial as well as upon the final asymptote of learning which that US will support. Notice, however, that the model specifies the rules for growth in response probability while Hull's equations are for growth of habit strength, ${}_{S}H_{R}$.

The model we wish to propose constitutes a modification of the linear model in several ways. First, it describes the learning curves for strength of association, not response probability. In that sense it is more in line with the Hullian theory than is the linear model. Independent assumptions will necessarily have to be made about the mapping of associative strengths into responding in any particular situation. Secondly, we will explicitly recognize that learning is tied to various external stimuli and discuss associative strength to various stimuli. In recognition of these two modifications, we will describe the model in terms of $V_i$, the strength of association to stimulus $i$.

It is also important to note that $V_i$ will be allowed to take on both positive and negative values, corresponding roughly to conditioned excitation and conditioned inhibition. But the most significant departure from the linear model is that when a stimulus compound, AX, is followed by a US, the changes in the strength to each of the component stimuli, A and X, will be taken to be a function of $V_{AX}$, i.e., the strength of the compound, rather than the strength of the respective components.

When a compound, AX, is followed by $US_1$, the changes in associative strength of the respective components may be represented as:

$$\Delta V_A = \alpha_A \beta_1 (\lambda_1 - V_{AX})$$

and

$$\Delta V_X = \alpha_X \beta_1 (\lambda_1 - V_{AX}).$$

If AX is followed by a different valued US, i.e., $US_2$, which may include 0 or nonreinforcement, the changes in associative strength of the respective components may be represented as:

$$\Delta V_A = \alpha_A \beta_2 (\lambda_2 - V_{AX})$$

and

$$\Delta V_X = \alpha_X \beta_2 (\lambda_2 - V_{AX}).$$

As may be seen in the above equations, there are three sets of parameters which affect the magnitude of the changes involved. The alphas are learning rate parameters, each associated with one component stimulus, and are appropriately subscripted to indicate this identification. The value of alpha roughly represents stimulus salience and indicates our assumption that different stimuli may acquire associative strength at different rates despite equal reinforcement. The betas are learning rate parameters associated with the USs. The assignment of different beta values to different USs indicates our assumption that the rate of learning may depend upon the particular US employed. Alpha and beta values are confined to the unit interval, $0 \le \alpha, \beta \le 1$. Finally, the $\lambda$ values represent the asymptotic level of associative strength which each US will support; presumably different USs will yield different asymptotic levels. Although $\lambda$ is not formally bounded, changing the range of its permissible values simply shifts the scale on which we observe Vs.

In order to apply the model, two further specifications are needed. The associative strength of the compound, $V_{AX}$, must somehow be specified in terms of the strengths of the components. The simplest assumption, and the one we will make here, is $V_{AX} = V_A + V_X$. Notice that although the Vs are in principle unbounded, in application the $\lambda$ values set limits on the compound Vs.
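The two sets of equations and the summation assumption can be sketched as a single trial update. The Python below is only an illustration: the function name `rw_trial`, the dictionary representation of the Vs, and the parameter values are assumptions made for this example and are not part of the theory's statement.

```python
def rw_trial(V, present, lam, beta, alpha):
    """Update associative strengths in place for one trial.

    V       -- dict mapping stimulus name to its current strength
    present -- names of the stimuli presented together on this trial
    lam     -- asymptote supported by the US (0.0 for nonreinforcement)
    beta    -- rate parameter of that US
    alpha   -- dict mapping stimulus name to its salience
    """
    v_compound = sum(V[s] for s in present)  # summation: V_AX = V_A + V_X
    error = lam - v_compound                 # same discrepancy for every component
    for s in present:
        V[s] += alpha[s] * beta * error
    return V


# One reinforced AX trial from a naive start (illustrative values only).
V = {"A": 0.0, "X": 0.0}
alpha = {"A": 0.5, "X": 0.25}
rw_trial(V, ["A", "X"], lam=1.0, beta=0.2, alpha=alpha)
print(V)
```

Because the discrepancy is computed once from the compound before any component is changed, both components are driven by the same term and differ only through their alphas.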

Secondly, we need to provide some mapping of V values into behavior. We are not prepared to make detailed assumptions in this instance. In fact, we would assume that any such mapping would necessarily be peculiar to each experimental situation, and depend upon a large number of "performance" variables. For the analyses we wish to present in this paper, it will generally be sufficient simply to assume that the mapping of Vs into magnitude or probability of conditioned responding preserves their ordering. Stimulus compounds whose net V is negative would all be expected generally to map into a zero CR, but differential negative values could also be distinguished among by a variety of experimental procedures (Rescorla, 1969).

ELEMENTARY DEDUCTIONS FROM THE THEORY

Without making more specific assumptions about parameter values, certain general deductions can be made from the model. It should be clear that for the case of repeated reinforcement or nonreinforcement of a single cue, A, the equations reduce to essentially the linear model. For instance, as $V_A$ increases with repeated reinforcement of A, the difference between $V_A$ and $\lambda$ will decrease. Consequently, increments in $V_A$ will decrease and a negatively accelerated learning curve will result with an asymptote of $\lambda$. Similarly, if we assume that the $\lambda$ value associated with nonreinforcement is lower than $V_A$, then a negatively accelerated extinction function is generated by repeated nonreinforcement of A.
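The single-cue case can be checked numerically. In this minimal sketch the helper `single_cue`, the combined rate (standing for the product of alpha and beta), and the trial counts are illustrative assumptions; nonreinforcement is given an asymptote of zero.

```python
def single_cue(v, lam, rate, n_trials):
    """Return V_A after each of n_trials trials; rate stands for alpha * beta."""
    history = []
    for _ in range(n_trials):
        v += rate * (lam - v)
        history.append(v)
    return history


acquisition = single_cue(0.0, lam=1.0, rate=0.2, n_trials=30)
extinction = single_cue(acquisition[-1], lam=0.0, rate=0.1, n_trials=30)

# Trial-by-trial changes: each is smaller than the one before it.
gains = [b - a for a, b in zip([0.0] + acquisition, acquisition)]
losses = [a - b for a, b in zip([acquisition[-1]] + extinction, extinction)]
print(round(acquisition[-1], 3), round(extinction[-1], 3))
```

The shrinking `gains` and `losses` are what make both curves negatively accelerated.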

Reinforcement of compound stimuli. But the more interesting cases result from compound stimuli, as in the experiments of the previous section. Consider first the case of reinforcement of an AX compound. The experiments of the previous section, together with those of Kamin (1968), indicate that prior conditioning of A reduces the degree to which reinforcement of an AX compound increments the associative strength of X. From the above equations it is clear that changes in $V_X$ are governed by the difference between $\lambda$ and the composite $V_{AX}$. The result of prior conditioning to A is that $V_A$, and thus $V_{AX}$, is large; hence the difference between $\lambda$ and $V_{AX}$ is reduced and the effectiveness of reinforcement correspondingly limited. Similarly, the prior establishment of A as an inhibitor, as in Rescorla's 0-.4 group, means that $V_A$ is negative. As a consequence, $V_{AX}$ is reduced and the difference between $\lambda$ and $V_{AX}$ enlarged; thus X can be incremented proportionately more through reinforcement.
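The effect of A's prior history on conditioning to X can be illustrated with a short simulation. This is a sketch under stated assumptions: the starting values of $V_A$ (1.0, 0.0, and -0.5) simply stand in for prior excitatory conditioning, no pretreatment, and prior inhibitory conditioning, and the helper names and parameter values are hypothetical.

```python
def rw_trial(V, present, lam, beta, alpha):
    """One trial: every presented stimulus changes by alpha * beta * (lam - V_compound)."""
    error = lam - sum(V[s] for s in present)
    for s in present:
        V[s] += alpha[s] * beta * error


def vx_after_compound(v_a_start, n_trials=20):
    """Reinforce AX n_trials times, starting from a given V_A and V_X = 0."""
    V = {"A": v_a_start, "X": 0.0}
    alpha = {"A": 0.5, "X": 0.5}
    for _ in range(n_trials):
        rw_trial(V, ["A", "X"], lam=1.0, beta=0.2, alpha=alpha)
    return V["X"]


vx_pretrained = vx_after_compound(1.0)   # A already at the asymptote
vx_novel = vx_after_compound(0.0)        # A untreated
vx_inhibitor = vx_after_compound(-0.5)   # A assumed to be an inhibitor
print(vx_pretrained, vx_novel, vx_inhibitor)
```

With these values X gains nothing when A already predicts the US, an intermediate amount when A is novel, and the most when A starts out inhibitory.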

The arguments for the Wagner and Saavedra experiment are essentially similar; here A is not pretreated, but $V_A$ is modified by interspersing trials of A alone with reinforced AX trials. In the case where A is reinforced on those intermixed trials, again $V_A$ will be large and result in an enlarged $V_{AX}$, thus limiting the amount of conditioning which can accrue to X on the compound trials. Early in conditioning, reinforcements will occur to AX while $V_{AX}$ is still below asymptote and consequently $V_X$ will increase initially. Nevertheless, $V_X$ will eventually decrease to zero. Since $V_A$ will increase toward $\lambda$ as a result of the A-alone trials, $V_{AX}$ will come to exceed $\lambda$. Notice that when this happens, the result of a reinforced AX trial is to decrement the associative strength of the components. As A and AX are both reinforced, increments to A will occur on the reinforced A trials and decrements to A and X on the reinforced compound trials. The result will be a transfer to A of whatever associative strength X may have initially acquired. It is an important characteristic of the model that even on reinforced trials, if $V_{AX}$ exceeds $\lambda$, A and X will be decremented.
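The rise and eventual loss of $V_X$ when reinforced A trials are intermixed with reinforced AX trials can be traced in a simulation. The strict alternation of trial types, the equal saliences, and the rate values below are assumptions chosen for illustration, not the schedule of the Wagner and Saavedra experiment.

```python
def rw_trial(V, present, lam, beta, alpha):
    """One trial: every presented stimulus changes by alpha * beta * (lam - V_compound)."""
    error = lam - sum(V[s] for s in present)
    for s in present:
        V[s] += alpha[s] * beta * error


V = {"A": 0.0, "X": 0.0}
alpha = {"A": 0.5, "X": 0.5}
vx_history = []        # V_X at the end of each AX / A cycle
compound_at_trial = [] # V_AX just before each reinforced AX trial

for _ in range(500):
    compound_at_trial.append(V["A"] + V["X"])
    rw_trial(V, ["A", "X"], lam=1.0, beta=0.2, alpha=alpha)  # reinforced AX
    rw_trial(V, ["A"], lam=1.0, beta=0.2, alpha=alpha)       # reinforced A alone
    vx_history.append(V["X"])

peak = max(vx_history)
print(round(peak, 3), vx_history[-1], max(compound_at_trial) > 1.0)
```

Once $V_{AX}$ exceeds the asymptote, the discrepancy on compound trials is negative, which is why $V_X$ falls even though every trial is reinforced.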

A similar account can be given of the results of nonreinforced presentations of A alone in the Wagner and Saavedra study. These presentations should lead to a reduction of $V_A$ and hence a reduction of $V_{AX}$. This provides increased opportunity to condition X on the AX trials, as compared to a condition involving only reinforced AX trials.

Kamin (1968) has provided considerable additional data for the particular case in which AX reinforcement is preceded by a history of reinforcement of A. His experiments, carried out in a CER situation, indicate that with a high degree of prior conditioning to A, reinforcement of AX can be rendered almost completely ineffective in conditioning fear to X.

Several variations in the treatment of A, however, were found to attenuate the ability of A to "block" conditioning of X. For instance, as the number of prior conditioning trials to A was reduced, the ability of A to block the conditioning of X was lessened. Alternatively, if A was first highly conditioned but then extinguished, the extinction disrupted A's ability to block. Finally, if the intensity of A was decreased, blocking was reduced.

All of the latter manipulations might be expected to yield a lower $V_A$ and thus a lower $V_{AX}$ at the time of reinforcement of the compound. This deduction should require no elaboration in the cases where number of reinforced or extinguished trials to A alone was manipulated. In the case of decreased A intensity, it is only necessary to make the reasonable assumption that a lower $V_A$ was attained prior to compound training as a result of a lower $\alpha$ associated with the weaker stimulus.

Kamin also reported that the nature of the US at the time AX is introduced is critical. For instance, if the prior conditioning of A is done with a 1-ma. shock and then AX is followed by a 1-ma. shock, a large interference with the conditioning of X is observed. But if the AX compound is followed instead by a 4-ma. shock, considerably more conditioning to X results. This conditioning to X depends not simply upon the use of a high shock intensity, but requires an increase in shock intensity from the conditioning of A to that of AX. Alternatively, if A is followed by a single shock and then AX is followed by two shocks in close succession, similar conditioning to X results.

There is a natural way for the present model to handle these outcomes. There is evidence available (e.g., Annau & Kamin, 1961) that higher asymptotes of conditioning result from higher shock intensities. Thus it would not be unreasonable to assume that the $\lambda$ value associated with a 4-ma. shock is larger than that associated with a 1-ma. shock. The result of increasing shock intensity when shifting from conditioning of A to conditioning of AX is that the potential for conditioning X is enhanced; consequently, the reduction of the blocking is not surprising. Notice that it is the increasing of the $\lambda$ value between the two stages of the experiment, rather than simply having a larger $\lambda$ throughout, that is critical to this prediction. A similar kind of reasoning might be applied to the case of shifting from a single shock following A to a double shock following AX.1
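The role of the shift in $\lambda$ can be made concrete with a two-stage simulation. The asymptote values assigned to the weaker and stronger shock, the common beta, and the trial counts are assumptions for this sketch; only the ordering of the two $\lambda$ values is taken from the argument above.

```python
def rw_trial(V, present, lam, beta, alpha):
    """One trial: every presented stimulus changes by alpha * beta * (lam - V_compound)."""
    error = lam - sum(V[s] for s in present)
    for s in present:
        V[s] += alpha[s] * beta * error


def vx_after_shift(lam_stage1, lam_stage2):
    """Condition A alone with one US, then AX with a possibly different US."""
    V = {"A": 0.0, "X": 0.0}
    alpha = {"A": 0.5, "X": 0.5}
    for _ in range(200):                                   # stage 1: A alone
        rw_trial(V, ["A"], lam_stage1, 0.2, alpha)
    for _ in range(50):                                    # stage 2: AX compound
        rw_trial(V, ["A", "X"], lam_stage2, 0.2, alpha)
    return V["X"]


LAM_WEAK, LAM_STRONG = 0.5, 1.0   # assumed asymptotes for the weaker and stronger shock

vx_weak_to_weak = vx_after_shift(LAM_WEAK, LAM_WEAK)
vx_weak_to_strong = vx_after_shift(LAM_WEAK, LAM_STRONG)
vx_strong_to_strong = vx_after_shift(LAM_STRONG, LAM_STRONG)
print(vx_weak_to_weak, vx_weak_to_strong, vx_strong_to_strong)
```

Only the shifted condition leaves a discrepancy for X to absorb; a large $\lambda$ used in both stages blocks X as thoroughly as a small one.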

There are aspects of Kamin's data, however, with which the present model does not deal so well. For instance, Kamin found that the initial trial on which X is presented in conjunction with A generally yields less suppression than the previous A-alone trial. Furthermore, he provided evidence that most of the conditioning to X which occurs in his paradigm results from that first AX trial. The present theory provides no statement of performance axioms which might lead us to expect the introduction of X to interfere with suppression to A unless it is assumed that X is initially associated with a negative V. More problematic is the fact that while the theory predicts that the first AX trial will produce more X conditioning than any subsequent compound trial, it does not anticipate Kamin's claim that the first AX trial accounts for all of the conditioning to X. Further analysis of this problem must await additional data collection as well as the development of more detailed performance statements for the theory.

1 It might be noted in passing that just as the present model predicts a possible increment in associative strength of X when the reinforcement magnitude for AX is increased over that for A, so it predicts a decrement in associative value of X when the reinforcement magnitude for AX is decreased with respect to that for A. In that case, $\lambda$ will be lowered and, assuming that $V_A$ has been made to approach the pre-shift $\lambda$, $V_{AX}$ will be greater than the post-shift $\lambda$, resulting in decrements to both A and X. If X begins with no associative strength, this procedure might be expected to produce a conditioned inhibitor. Experiments are currently underway in our laboratories to investigate this possibility.

Kamin has also investigated a phenomenon related to the blocking effect, so-called "overshadowing." If an AX compound is repeatedly reinforced and A is simply a more salient stimulus, little conditioning to X may result. Even though A is not reinforced more frequently than X, because it is a more salient stimulus it may overshadow X and interfere with development of associative strength to the latter cue. Kamin has demonstrated that the degree to which A overshadows X is dependent upon the relative intensities of the two stimuli, relatively more intense stimuli yielding more overshadowing. From the point of view of the present model, the effect of having A be a more salient stimulus is that it will have a larger alpha value. Hence when AX is reinforced, $V_A$ will grow rapidly with respect to $V_X$. Consequently, the more salient A is with respect to X, the greater proportion of $V_{AX}$ will be due to $V_A$ and the more limited the conditioning to X. More precisely, it is expected that when $V_A$ and $V_X$ both begin conditioning at zero, after any large number of conditioning trials with AX reinforced $V_X$ will be equal to

$$\frac{\alpha_X}{\alpha_A + \alpha_X} V_{AX}.$$

Notice that, nevertheless, the more potent the US used, the higher should be $\lambda$ and the more conditioning to AX and hence to X should result. Thus, overshadowing, measured in terms of absolute responding to X, should be attenuated by employing greater US magnitudes, a finding confirmed by Kamin (1968).
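The proportional split of $V_{AX}$ between A and X, and the attenuation of overshadowing with a stronger US, can both be checked with a small simulation. The salience values, the two asymptotes, and the helper name `train_compound` are illustrative assumptions.

```python
def rw_trial(V, present, lam, beta, alpha):
    """One trial: every presented stimulus changes by alpha * beta * (lam - V_compound)."""
    error = lam - sum(V[s] for s in present)
    for s in present:
        V[s] += alpha[s] * beta * error


def train_compound(alpha_a, alpha_x, lam, n_trials=100, beta=0.2):
    """Reinforce AX from a naive start and return the component strengths."""
    V = {"A": 0.0, "X": 0.0}
    alpha = {"A": alpha_a, "X": alpha_x}
    for _ in range(n_trials):
        rw_trial(V, ["A", "X"], lam, beta, alpha)
    return V


V = train_compound(alpha_a=0.8, alpha_x=0.2, lam=1.0)
share = V["X"] / (V["A"] + V["X"])           # compare with alpha_X / (alpha_A + alpha_X)
V_equal = train_compound(alpha_a=0.2, alpha_x=0.2, lam=1.0)
V_strong_us = train_compound(alpha_a=0.8, alpha_x=0.2, lam=2.0)
print(share, V["X"], V_equal["X"], V_strong_us["X"])
```

Because both components start at zero and share one discrepancy term, X's share of the compound equals its share of the total salience on every trial; raising $\lambda$ scales the absolute strength of X without changing that share.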

Nonreinforcement of compound stimuli. The preceding paragraphs indicate that the present model is capable of integrating a considerable amount of data on the effects of reinforcement, both from our own laboratories and those of others. We now turn to an account of some elementary effects of nonreinforcement. Because of the symmetry of the model, the arguments for nonreinforcement are analogous to those for reinforcement and can be presented briefly.

As pointed out above, if we assume that the $\lambda$ value associated with a US of zero intensity is zero, then the model naturally generates a negatively accelerated extinction function. Considering the nonreinforced presentation of a compound stimulus, the changes in component associative strength should be dependent upon the total V of the compound: the larger $V_{AX}$, the larger should be the decrement expected in both A and X as a result of a nonreinforced presentation of the compound. Consequently, any operation which enhances $V_A$ and hence $V_{AX}$ should result in a larger decrement to X as a result of nonreinforcement of AX. This prediction is consistent with the findings from the two eyeblink conditioning studies from Wagner's laboratory reported above, in which the number of prior reinforcements of A critically affected the decrementing of X. In fact, in the Wagner and Saavedra study where X was presumably introduced with a zero V, greater conditioned inhibition accrued to X when it was nonreinforced in conjunction with that stimulus which had been more frequently reinforced in the past. If we assume that different intensities of the US result in different levels of conditioning, the conditioned inhibition results of the Rescorla experiment also fall into place. If A is followed by a more intense US and as a consequence $V_A$ is larger, then nonreinforcement of X in the presence of A should result in greater conditioned inhibition to X.

There is an additional case of nonreinforcement which is of interest to consider even though no data are currently available. Suppose that stimulus A were pretreated so as to give it a negative V and that a novel cue, X, were then combined with it and the AX compound nonreinforced. The value of $V_{AX}$ should then be negative, and assuming that the $\lambda$ associated with nonreinforcement is zero, $\lambda - V_{AX}$ should be positive. Thus nonreinforcing X in conjunction with an inhibitor should give X positive associative strength. This prediction, however paradoxical, should further emphasize the kind of symmetry inherent in the model's treatment of nonreinforcement and reinforcement.
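This prediction follows directly from the update rule, as a brief simulation shows. The starting value of -0.5 for the pretreated inhibitor, the equal saliences, and the rate are assumptions made only for the sketch.

```python
def rw_trial(V, present, lam, beta, alpha):
    """One trial: every presented stimulus changes by alpha * beta * (lam - V_compound)."""
    error = lam - sum(V[s] for s in present)
    for s in present:
        V[s] += alpha[s] * beta * error


V = {"A": -0.5, "X": 0.0}          # A assumed pretreated to a negative V; X novel
alpha = {"A": 0.5, "X": 0.5}

for _ in range(50):                # nonreinforced AX trials, lambda = 0
    rw_trial(V, ["A", "X"], lam=0.0, beta=0.2, alpha=alpha)

print(V)
```

The compound climbs from a negative value toward zero, and the climb is shared between A, which becomes less inhibitory, and X, which becomes excitatory.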

APPLICATION TO A PROBLEM IN DISCRIMINATION LEARNING

In the foregoing applications of the model, reasonably adequate predictions from the basic theory might well have been drawn without benefit of any quantitative formulations. The cue of interest was never presented in compound with more than a single additional stimulus, and it would generally have been possible and sufficient, in order to account for the data discussed, to specify that the associative strength of the latter stimulus had been manipulated to have various ordered degrees of excitatory or inhibitory value.

In other instances to which the theory should be applicable, however, it is not possible to proceed at such a level. The present section will attempt to illustrate one such instance, in which certain quantitative assumptions become critical.

The problem to be considered involves the differential reinforcement of stimulus compounds containing a so-called common cue. Suppose a compound CS, composed of experimentally isolatable cues A and X, is consistently reinforced, while another compound CS composed of cues B and X is consistently nonreinforced. If the several components are adequately discriminable we should, of course, expect that differential associative strengths will be acquired, with $V_{AX}$ approaching that asymptote appropriate to the US employed, and $V_{BX}$ approaching that asymptote appropriate to nonreinforcement. But what should be the expected fate of $V_X$?

What makes this question especially interesting is that there is reason to believe (e.g., Wagner, Logan, Haberlandt, & Price, 1968) that a cue occupying the place of X in an AX, BX discrimination will come to be less responded to alone than might be expected simply on the basis of the schedule of reinforcement and nonreinforcement with which it is associated. And several theories have been advanced (e.g., Restle, 1957; Sutherland, 1964; Mackintosh, 1965) which propose that such a common, or "irrelevant," cue will become "adapted," "neutralized," or "unattended-to" by virtue of the availability of more valid cues.

The present theory includes no such special "neutralization" assumption, but indicates only that the learning which accrues to X should be a function of the trial by trial strength of both AX and BX in relationship to the reinforcing events with which they are separately associated. But should X then be expected to become neutral? Or, at least, more "neutral" than in various comparison treatments not involving a discrimination?

In order to illustrate certain relatively general features expected in the course of discrimination learning when AX trials are alternated with BX trials, the former followed by reinforcement and the latter by nonreinforcement, a sample learning run was computed. It should be recalled that the theory specifies that

$$\Delta V_A = \alpha_A \beta_1 (\lambda_1 - V_{AX}), \qquad \Delta V_X = \alpha_X \beta_1 (\lambda_1 - V_{AX})$$

when AX is reinforced, and

$$\Delta V_B = \alpha_B \beta_2 (\lambda_2 - V_{BX}), \qquad \Delta V_X = \alpha_X \beta_2 (\lambda_2 - V_{BX})$$

when BX is nonreinforced.

For purposes of this example, A, B, and X were assumed to begin with Vs of zero prior to the first learning trial, and were assumed to be equally salient ($\alpha_A = \alpha_B = \alpha_X = 1.0$). Reinforcement and nonreinforcement were taken to be associated with $\lambda$s of 1.0 and 0 respectively, and with equal rate parameters ($\beta_1 = \beta_2 = .05$). Any set of parameter assumptions identifies a special case, but this set was intended to have some semblance to certain experimental arrangements (e.g., the initial Vs of zero might be relatively typical of untrained cues) and to avoid any assumptions that would beg justification (e.g., unequal $\alpha$s or $\beta$s).

The left panel of Figure 6 depicts the mean $V_{AX}$ and mean $V_{BX}$ computed over successive blocks of 4 trials. The right panel of Figure 6 depicts the corresponding V values of the separate components. Several features of the plotted functions should be noted. With all Vs beginning at zero, $V_{AX}$ and $V_{BX}$ both increase over the early trials. Although the rate parameters $\beta_1$ and $\beta_2$ associated with reinforcement and nonreinforcement were set equal, $\lambda_1 - V_{AX}$ is initially large, whereas $\lambda_2 - V_{BX}$ is concurrently small. As a consequence $V_X$ as well as $V_A$ increase rapidly in relationship to a slower decline in $V_B$. Only when the absolute value of $\beta_1(\lambda_1 - V_{AX})$ becomes equal to $\beta_2(\lambda_2 - V_{BX})$ does $V_X$ cease to grow and does the rate of decrease in $V_B$ equal the rate of increase in $V_A$. As trials continue, $V_{AX}$ and $V_{BX}$ necessarily approach $\lambda_1$ and $\lambda_2$, so that in this example, the terminal value of $V_A$ is sufficiently positive so that $V_A + V_X = 1$, and the value of $V_B$ is sufficiently negative that $V_B + V_X = 0$.

Figure 6. Mean Vs computed from a sample learning run in which AX reinforced trials were alternated with BX nonreinforced trials. The left panel depicts the compound theoretical values, the right panel the corresponding component values. (Ordinate: V; abscissa: blocks of four trials.)
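The sample learning run can be recomputed from the stated parameters. The sketch below uses the values given in the text (all alphas 1.0, both betas .05, asymptotes 1.0 and 0); the strict AX, BX alternation starting with AX, the recording of compound strength at the start of each trial, and the blocking of four trials per compound are assumptions about details the text does not specify.

```python
def rw_trial(V, present, lam, beta, alpha):
    """One trial: every presented stimulus changes by alpha * beta * (lam - V_compound)."""
    error = lam - sum(V[s] for s in present)
    for s in present:
        V[s] += alpha[s] * beta * error


V = {"A": 0.0, "B": 0.0, "X": 0.0}
alpha = {"A": 1.0, "B": 1.0, "X": 1.0}
v_ax, v_bx = [], []   # compound strengths at the start of each trial

for trial in range(2000):
    if trial % 2 == 0:                       # reinforced AX trial
        v_ax.append(V["A"] + V["X"])
        rw_trial(V, ["A", "X"], lam=1.0, beta=0.05, alpha=alpha)
    else:                                    # nonreinforced BX trial
        v_bx.append(V["B"] + V["X"])
        rw_trial(V, ["B", "X"], lam=0.0, beta=0.05, alpha=alpha)


def block_means(values, size=4):
    """Average successive blocks of `size` values."""
    return [sum(values[i:i + size]) / size for i in range(0, len(values), size)]


ax_blocks = block_means(v_ax)[:16]
bx_blocks = block_means(v_bx)[:16]
print([round(v, 2) for v in bx_blocks])
print({name: round(value, 3) for name, value in V.items()})
```

The early BX blocks are positive before declining, reproducing the initial rise in the nonreinforced compound, and the terminal component values satisfy $V_A + V_X = 1$ and $V_B + V_X = 0$.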

It is unnecessary for present purposes to completely detail the way in which this picture changes as all the values of the model are manipulated. For example, whereas $V_{AX}$ and $V_{BX}$ will always approach $\lambda_1$ and $\lambda_2$, the terminal values of $V_A$, $V_B$, and $V_X$ in relationship to $\lambda_1$ and $\lambda_2$ will depend appreciably upon their initial starting values. But, for the conclusions we wish to draw, we can restrict ourselves without danger to instances like that exemplified, in which all Vs begin at zero.

It is worth noting, however, the effects of variations in $\alpha$ and $\beta$. For instance, if $\beta_1$ is made larger than $\beta_2$, the initial positive course of $V_{BX}$, associated with the nonreinforced compound, would attain a higher absolute level and be more protracted. Similarly, $V_X$ would initially rise to a higher value and then fall. However, for all values of $\beta_1$ and $\beta_2$ greater than zero, the asymptotic levels of $V_A$, $V_X$, and $V_B$ as well as $V_{AX}$ and $V_{BX}$ are independent of the $\beta$s.

Variation in the relative saliences of A, B, and X has a more marked effect. For example, if $\alpha_X$ is made larger relative to $\alpha_A$ and $\alpha_B$, the formation of the discrimination in $V_{AX}$ and $V_{BX}$ is not only slowed, but the terminal values of $V_A$, $V_B$, and $V_X$ are modified. It can be shown that under the conditions $\lambda_1 = 1$ and $\lambda_2 = 0$, and where $\alpha_A = \alpha_B$, the asymptotic value of $V_X$ is equal to

$$\frac{\alpha_X}{\alpha_A + 2\alpha_X}.$$

The picture of discrimination learning in $V_{AX}$ and $V_{BX}$ shown in Figure 6 resembles many empirical functions. Especially satisfying is the initial rise predicted in the strength of the nonreinforced compound, a phenomenon which is frequently observed (e.g., Wagner, 1968). It is evident, however, that the theory does not generally predict that the associative strength of a common cue such as X will attain a value of zero. Rather, as indicated above, X should attain some associative strength depending upon its relative salience in comparison to that of the discriminative cues. It seems, in fact, that an important expectation from the model for the case described is that X will have some positive value, such that discriminative cues in the nonreinforced compound must become inhibitory in order for the strength of the latter compound to approach zero.
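The asymptotic expression for $V_X$ can be compared with simulated values for several salience settings. The helper name, the particular alpha pairs, the common beta of .1, and the number of alternation cycles are assumptions chosen so that the simulation has effectively converged.

```python
def rw_trial(V, present, lam, beta, alpha):
    """One trial: every presented stimulus changes by alpha * beta * (lam - V_compound)."""
    error = lam - sum(V[s] for s in present)
    for s in present:
        V[s] += alpha[s] * beta * error


def asymptotic_vx(alpha_a, alpha_x, beta=0.1, n_cycles=5000):
    """Alternate AX+ and BX- trials from zero (alpha_B = alpha_A); return the final V_X."""
    V = {"A": 0.0, "B": 0.0, "X": 0.0}
    alpha = {"A": alpha_a, "B": alpha_a, "X": alpha_x}
    for _ in range(n_cycles):
        rw_trial(V, ["A", "X"], 1.0, beta, alpha)   # reinforced, lambda_1 = 1
        rw_trial(V, ["B", "X"], 0.0, beta, alpha)   # nonreinforced, lambda_2 = 0
    return V["X"]


results = []
for alpha_a, alpha_x in [(1.0, 1.0), (0.5, 1.0), (1.0, 0.25)]:
    predicted = alpha_x / (alpha_a + 2 * alpha_x)
    simulated = asymptotic_vx(alpha_a, alpha_x)
    results.append((alpha_a, alpha_x, simulated, predicted))
    print(alpha_a, alpha_x, round(simulated, 4), round(predicted, 4))
```

In each case the simulated terminal $V_X$ is positive and agrees with the closed form, so the common cue is not driven to zero.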

Still, it may be more informative to ask whether X should be more "neutral" as a result of being imbedded in such a discrimination than as a result of other treatments involving the same associated schedule of reinforcement.

We may simply declare that the strength of X should, according to the model, generally be less following discrimination training than following an identical partial reinforcement schedule of X in isolation. But, granted the discussion thus far, this is a rather uninteresting comparison, since it can largely be viewed in terms of the overshadowing that occurs when a cue is trained in compound with other cues, as compared to being trained in isolation.

A more interesting empirical comparison was provided by Wagner, Logan, Haberlandt, and Price (1968). In both CER and eyelid conditioning, these investigators compared responding to an isolatable common cue occupying the place of X in an AX, BX discrimination with the responding to a similar cue experienced in a "pseudodiscrimination" treatment in which AX and BX were both partially reinforced on a 50% schedule. Although X in the two treatments was associated during training with the same reinforcement schedule, and in compound with the same cues, it was much more responded to when tested alone following pseudodiscrimination as compared to discrimination training.

We have already specified the asymptotic value of $V_X$ to be expected in an AX, BX discrimination in which each compound is experienced on half of the trials. It will thus be useful to compare this expected value with the comparable value expected following a pseudodiscrimination procedure in which half of each of the AX and BX trials are reinforced and half nonreinforced, in the manner of Wagner et al. (1968).

It can be shown that according to the model, the asymptotic value approached by a partially reinforced compound should be equal to

$$\frac{\pi\beta_1\lambda_1 - (\pi - 1)\beta_2\lambda_2}{\pi\beta_1 - (\pi - 1)\beta_2}$$

where $\pi$ is the proportion of reinforced trials. Adopting, as we have previously, the assumption that $\lambda_1 = 1.0$ and $\lambda_2 = 0$, then in the case of a 50% reinforcement schedule this becomes

$$\frac{\beta_1}{\beta_1 + \beta_2}.$$

This quantity thus expresses the theoretical asymptotic value of $V_{AX}$ and $V_{BX}$ in the pseudodiscrimination case under consideration.
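One way to see where this expression comes from is to iterate the expected change per trial until it vanishes. The sketch below does that for a single compound treated as one unit; the function names, the use of the expected (mean) update in place of a random trial sequence, and the parameter values are assumptions made for illustration.

```python
def partial_asymptote(pi, beta1, beta2, lam1=1.0, lam2=0.0):
    """Closed-form asymptote for a compound reinforced on a proportion pi of trials."""
    return (pi * beta1 * lam1 - (pi - 1) * beta2 * lam2) / (pi * beta1 - (pi - 1) * beta2)


def expected_strength(pi, beta1, beta2, lam1=1.0, lam2=0.0, alpha=1.0, n_trials=2000):
    """Iterate the expected change per trial: a reinforced trial occurs with
    probability pi and a nonreinforced trial with probability 1 - pi."""
    v = 0.0
    for _ in range(n_trials):
        v += alpha * (pi * beta1 * (lam1 - v) + (1 - pi) * beta2 * (lam2 - v))
    return v


half = partial_asymptote(0.5, beta1=0.2, beta2=0.1)
print(half, expected_strength(0.5, beta1=0.2, beta2=0.1))
print(partial_asymptote(0.25, 0.2, 0.1), expected_strength(0.25, 0.2, 0.1))
```

The iteration settles where the expected increment from reinforced trials exactly offsets the expected decrement from nonreinforced trials, which is the condition the closed form expresses. On a real random schedule V would fluctuate around this value from trial to trial.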

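Of course $V_{AX} = V_A + V_X$ and $V_{BX} = V_B + V_X$, and it can be demonstrated that in the instance in which X is presented on all trials, and A and B each on half the trials, that

$$V_A = \frac{\alpha_A}{2\alpha_X} V_X \quad \text{and} \quad V_B = \frac{\alpha_B}{2\alpha_X} V_X.$$

Thus the asymptotic $V_X$ attained under the pseudodiscrimination procedure should be:

$$\frac{2\alpha_X}{\alpha_A + 2\alpha_X} \left( \frac{\beta_1}{\beta_1 + \beta_2} \right).$$

An appreciation of the relative $V_X$ expected in the discrimination and pseudodiscrimination treatments is most evident if we now express the asymptotic $V_X$ in the pseudodiscrimination condition, minus the asymptotic $V_X$ in the discrimination condition, which difference is:

$$\frac{\alpha_X}{\alpha_A + 2\alpha_X} \left[ 2 \left( \frac{\beta_1}{\beta_1 + \beta_2} \right) - 1 \right].$$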

Wagner (1969b) has noted that the present form of the theory can account for the greater responding to X alone under a pseudodiscrimination as compared to a discrimination treatment only if one is willing to specify certain quantitative differences in the effects of reinforced and nonreinforced trials. The mathematical expression above makes this evident in terms of the current model. Only when the rate parameter associated with reinforcement ($\beta_1$) is greater than the rate parameter associated with nonreinforcement ($\beta_2$) will the quantity expressed be positive, i.e., will $V_X$ be greater in the pseudodiscrimination than in the discrimination treatment. The above expression also indicates that the magnitude of this effect will depend upon the cue saliences, so that any difference between the two conditions will be augmented as $\alpha_X$ approaches 1.0 and $\alpha_A$ becomes small.
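The dependence of the comparison on the two rate parameters can be tabulated directly. In this sketch the closed forms for the two treatments are written as functions, and the pseudodiscrimination value is cross-checked by iterating expected changes per trial; all function names, the mean-update simplification, and the parameter values are assumptions for illustration, with $\lambda_1 = 1$, $\lambda_2 = 0$, and $\alpha_A = \alpha_B$.

```python
def vx_discrimination(alpha_a, alpha_x):
    """Asymptotic V_X after AX+ / BX- training."""
    return alpha_x / (alpha_a + 2 * alpha_x)


def vx_pseudodiscrimination(alpha_a, alpha_x, beta1, beta2):
    """Asymptotic V_X when AX and BX are each reinforced on half their trials."""
    return 2 * alpha_x / (alpha_a + 2 * alpha_x) * beta1 / (beta1 + beta2)


def difference(alpha_a, alpha_x, beta1, beta2):
    """Pseudodiscrimination minus discrimination."""
    return (vx_pseudodiscrimination(alpha_a, alpha_x, beta1, beta2)
            - vx_discrimination(alpha_a, alpha_x))


def expected_pseudo_vx(alpha_a, alpha_x, beta1, beta2, n_trials=5000):
    """Iterate expected changes per trial; each compound appears on half the
    trials and is reinforced on half of its own appearances."""
    va = vb = vx = 0.0
    for _ in range(n_trials):
        g_ax = 0.5 * (0.5 * beta1 * (1.0 - (va + vx)) + 0.5 * beta2 * (0.0 - (va + vx)))
        g_bx = 0.5 * (0.5 * beta1 * (1.0 - (vb + vx)) + 0.5 * beta2 * (0.0 - (vb + vx)))
        va += alpha_a * g_ax
        vb += alpha_a * g_bx
        vx += alpha_x * (g_ax + g_bx)
    return vx


print(difference(1.0, 1.0, beta1=0.2, beta2=0.1))   # beta_1 greater than beta_2
print(difference(1.0, 1.0, beta1=0.1, beta2=0.1))   # equal rate parameters
print(difference(1.0, 1.0, beta1=0.1, beta2=0.2))   # beta_1 smaller than beta_2
print(difference(0.1, 1.0, beta1=0.2, beta2=0.1))   # small alpha_A, alpha_X at 1.0
```

The sign of the difference follows the sign of $\beta_1 - \beta_2$, and for a fixed pair of rate parameters the difference grows as $\alpha_A$ shrinks relative to $\alpha_X$.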

The robustness of the effect demonstrated by Wagner, Logan, Haberlandt, and Price would suggest that $\beta_1$ is considerably larger than $\beta_2$ in the situations which they employed. How adequate this assumption will otherwise turn out to be remains to be determined. It is worth noting, however, that a similar assumption has frequently been deemed necessary in the application of related models to other data areas (e.g., Bush & Mosteller, 1955; Lovejoy, 1966). The advantage of the present analysis is that it makes clear the kinds of additional quantitative assumptions which are necessary in order to account for the differences in strength of the common cue in these different treatments.2

2 Because of the summation assumption, it may appear that the present model will be inappropriate for certain cases of discrimination learning. For instance, there is some evidence that organisms are capable of learning to respond to a compound stimulus while withholding their response to its components, and vice versa (Woodbury, 1943). Such discriminations seem to call for an appeal to some special characteristic of the compound not present in its components. However, Mr. Donald Rightmer has suggested to us a way of conceptualizing such discriminations without recourse to "configuring." Consider a compound composed of two component stimuli. These components, although readily discriminable from each other, will nevertheless contain some common properties. To indicate this, we may describe the components as AX and BX and the compound formed from their combinations as ABX. If we apply the model to a case in which this compound is consistently reinforced and these components nonreinforced, it correctly predicts learning of that discrimination. In this case $V_{ABX}$ approaches $\lambda_1$ while $V_{AX}$ and $V_{BX}$ both approach $\lambda_2$, as a result of $V_A$ and $V_B$ both approaching $\lambda_1 - \lambda_2$, while $V_X$ approaches $2\lambda_2 - \lambda_1$. Much of the burden of the learning rests with the common parts of the component stimuli and consequently the rate of learning will depend upon the assumed salience of X. A similar result occurs for the case of reinforcement of AX and BX and the nonreinforcement of ABX. We mention these examples only to indicate that at least one kind of evidence commonly cited in criticism of summation notions is compatible with the present model.

A common approach to discrimination learning (e.g., Estes & Burke, 1953) is to assume, however, not only that any pair of CSs can be theoretically conceptualized as being composed of unique and common cues, but that the discriminability of the CSs depends only upon the relative weights of the two sets of elements. While this latter manner of analysis might appear especially congenial to the present theory, it presents sufficient difficulties that we would prefer not to commit ourselves to this more general strategy. A consideration of the alternatives would, unfortunately, take us beyond the scope of the present paper.

APPLICATION TO BACKGROUND STIMULI AS COMPONENTS

Although the present model is stated in terms of increments and decrements in the associative strength of component stimuli as a result of reinforcement and nonreinforcement of compounds, it also has implications for situations not obviously involving compound stimuli. One interesting such application is to data recently collected by Rescorla (1969) pointing to the importance of CS-US correlations in Pavlovian fear conditioning. Consider a situation in which an animal receives brief, intense electric shocks randomly distributed in time. Suppose further that tonal stimuli are presented irregularly without regard to the occurrence of the shocks, i.e., in such a way that shocks may occur in both the presence and absence of the CS, and there is no correlation between the CS and shock. The question of interest is to what degree the tones will acquire associative strength.

A typical experiment asking this question employed a CER procedure with rats (Rescorla, 1968, Experiment 1). Three groups of animals received tone CSs and shock USs. Group I received the tones and shocks in random relation to each other, with shocks occurring both in the presence and absence of the CS. Group II received the identical treatment except that all shocks programmed to occur in the absence of the CS were omitted. Notice that these two groups received the same number of shocks during the CS; they differed only in that the first group also received shocks at other times. Finally, a third group received the same reduced number of shocks as Group II, but those shocks were distributed randomly in time, in the manner of Group I. When these stimuli were subsequently presented while the rats bar-pressed for food reward, only Group II showed fear of the CS. The two groups for which the CS and shock were independent showed no measurable fear conditioning.

This result suggests that the correlation of the CS and US, in addition to the number of reinforced CS presentations, is an important determinant of fear conditioning. One way to describe this correlation is in terms of the probability of occurrence of the US in the presence and absence of the CS. When the two events are positively correlated, then the probability of shock is higher during the CS than in its absence; when they are uncorrelated, then those probabilities are equal. Furthermore, this description suggests a third case: when the probability of the US is higher in the absence of the CS than in its presence, the two events are negatively correlated. In a series of experiments, Rescorla (1969) has accumulated evidence that these relative probabilities are important in determining the amount of conditioning obtained. According to that evidence, when the probability of shock is higher during the CS than in its absence, the CS becomes a conditioned elicitor of fear; when the CS signals a period which is relatively free from otherwise probable shocks, it becomes a conditioned inhibitor of fear. Finally, when the probabilities of shock are equal in the presence or absence of the CS, little or no conditioning of either sort occurs.

Somehow the organism appears to evaluate the probability with which shocks occur both in the presence and in the absence of the CS, and it is the relation between these two probabilities which determines the amount of fear conditioning observed to the CS. The organism is apparently behaving as a relatively complex probability comparator. What we wish to suggest here is that the present model may provide one way of understanding how the animal can be sensitive to such subtle relations via a relatively simple process.

The important point to notice for this analysis is that the CS occurs against a background of uncontrolled stimuli. To speak of shocks occurring in the absence of the CS is to say that they occur in the presence of situational stimuli arising from the experimental environment. Although these stimuli are not explicitly manipulated by the experimenter, they nevertheless can be expected to influence the animal. Thus, one way to think about the occurrence of the CS is as an event transforming the background stimulus, A, into background-plus-CS, AX. The present model, of course, has been designed to account for the conditioning of X when it appears in such a compound, as a function of the treatment of A elsewhere.

In order to exemplify the application of the model to this particular case, the experimental session was taken to be divisible into time segments the length of the CS duration. Each segment containing the CS is thus treated as an AX "trial" and each segment not containing the CS as an A "trial." It is possible then to specify the sequence of reinforcement and nonreinforcement over each of the two kinds of trials.

Sample learning runs were computed from the model with schedules of background alone (A) and background-plus-CS (AX) as might be the case in experiments such as those of Rescorla (1968). Figure 7 shows the results of one such application of the model. This figure describes the V value of the CS over trials, as a function of different shock probabilities in the presence and absence of the CS. The first digit labeling each curve indicates shock probability during the CS; the second, the probability in the absence of the CS. The particular parameter selections used in arriving at the functions plotted were as follows: The CS was assumed to be present 1/5 of the time according to an irregular sequence and to have a salience 5 times that of the background ($\alpha_A = .1$, $\alpha_X = .5$); the $\lambda$ values associated with reinforcement and nonreinforcement were taken to be 1 and 0, respectively, while the rate parameter associated with reinforcement was set at twice that associated with nonreinforcement ($\beta_1 = .1$, $\beta_2 = .05$).
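A minimal sketch of how such a learning run might be computed is given here in Python. It assumes the usual statement of the model's update rule, $\Delta V_i = \alpha_i \beta (\lambda - V_{total})$, applied once per time segment to every stimulus present, and it samples the CS and shock schedules from a seeded pseudorandom generator. The function name `simulate`, its arguments, the seed, and the number of segments are illustrative choices and are not taken from the original computations.

```python
import random


def simulate(p_cs, p_bg, n_segments=3000, cs_fraction=0.2,
             alpha_a=0.1, alpha_x=0.5, beta1=0.1, beta2=0.05,
             lam1=1.0, lam2=0.0, seed=0):
    """Return V_X after each time segment for one sampled schedule.

    p_cs / p_bg: shock probability in segments with / without the CS.
    """
    rng = random.Random(seed)          # seeded so that a run can be repeated
    v_a = v_x = 0.0
    history = []
    for _ in range(n_segments):
        cs_present = rng.random() < cs_fraction      # AX segment or A segment
        shocked = rng.random() < (p_cs if cs_present else p_bg)
        lam, beta = (lam1, beta1) if shocked else (lam2, beta2)
        # The error term uses the summed strength of every stimulus present.
        error = lam - (v_a + (v_x if cs_present else 0.0))
        v_a += alpha_a * beta * error                # background is always present
        if cs_present:
            v_x += alpha_x * beta * error
        history.append(v_x)
    return history


# Final V_X for three of the conditions described in the text.
for p_cs, p_bg in [(0.8, 0.0), (0.8, 0.8), (0.0, 0.8)]:
    print(p_cs, p_bg, round(simulate(p_cs, p_bg)[-1], 2))
```

Here `simulate(0.8, 0.0)` corresponds to the curve labeled .8-0 and `simulate(0.8, 0.8)` to the .8-.8 condition. A single run fluctuates from segment to segment because the schedule is sampled, so any one trace should be read as an example rather than as the plotted function.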

The asymptotic values of the functions represented in Figure 7 are in general agreement with Rescorla's data in that they are clearly ordered by the relative probability of shock in the presence and absence of the CS. In addition, positive V values are associated with positive correlations between the CS and shock; negative V values are associated with negative correlations. Furthermore, the magnitude of the correlation may be seen to be important, with stronger correlations generating Vs more removed from zero. Finally, the asymptotic V of a CS uncorrelated with the occurrence of shock is zero.

Figure 7. Predicted strength of association of X as a function of different US probabilities in the presence and absence of X. The first number next to each curve indicates the probability of the US during the CS; the second, the probability in the absence of the CS. [Curves, from top to bottom: .8-0, .8-.2, .8-.4, .8-.8, .4-.8, .2-.8, 0-.8. Vertical axis: V, from -1.0 to 1.0; horizontal axis: trials.]

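In fact, it is possible to arrive at a relatively simple expression describing the asymptote of conditioning ($V_X$) for these various treatments. The equation for the asymptotic value of $V_A$ (taking $\lambda_1 = 1$ and $\lambda_2 = 0$) is:

$$V_A = \frac{\pi_A \beta_1}{\pi_A \beta_1 + (1 - \pi_A)\,\beta_2}$$

Similarly, the equation for the asymptotic value of $V_{AX}$ is:

$$V_{AX} = \frac{\pi_{AX} \beta_1}{\pi_{AX} \beta_1 + (1 - \pi_{AX})\,\beta_2}$$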
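As a check on these expressions, a short Python sketch can evaluate them for the seven conditions plotted in Figure 7. It assumes the parameter values quoted earlier ($\beta_1 = .1$, $\beta_2 = .05$, $\lambda_1 = 1$, $\lambda_2 = 0$) and, because the strength of a compound is the sum of the strengths of its components, obtains the asymptote of X by subtraction. The helper names `asymptote` and `asymptotic_vx` are hypothetical.

```python
def asymptote(pi, beta1=0.1, beta2=0.05, lam1=1.0, lam2=0.0):
    """Strength at which the expected change on a trial is zero.

    Solves pi*beta1*(lam1 - V) + (1 - pi)*beta2*(lam2 - V) = 0 for V.
    """
    return (pi * beta1 * lam1 + (1 - pi) * beta2 * lam2) / (pi * beta1 + (1 - pi) * beta2)


def asymptotic_vx(pi_ax, pi_a, **params):
    """Asymptotic V_X: compound asymptote minus background asymptote."""
    return asymptote(pi_ax, **params) - asymptote(pi_a, **params)


# (shock probability during the CS, shock probability in its absence)
FIGURE_7_CONDITIONS = [(0.8, 0.0), (0.8, 0.2), (0.8, 0.4), (0.8, 0.8),
                       (0.4, 0.8), (0.2, 0.8), (0.0, 0.8)]

for pi_ax, pi_a in FIGURE_7_CONDITIONS:
    print(f"{pi_ax:.1f}-{pi_a:.1f}: V_X = {asymptotic_vx(pi_ax, pi_a):+.3f}")
```

Under these assumptions the values fall in the same order as the curve labels, are symmetric about zero, and vanish when the two probabilities are equal.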


Thus, one may arrive at the asymptotic value of $V_X$ as:

$$V_X = V_{AX} - V_A$$

It may be seen that $V_X$ will depend only upon the probability of reinforcement in the presence of X ($\pi_{AX}$) and in the absence of X ($\pi_A$), as well as upon the rate parameters associated with reinforcement and nonreinforcement. Since the latter parameters are constants in the two equations for $V_{AX}$ and $V_A$, it should be evident how $V_X$ will vary with the relative probabilities of reinforcement. When $\pi_{AX}$ is greater than $\pi_A$, $V_X$ will have a positive value; as $\pi_{AX}$ becomes equal to $\pi_A$, $V_X$ will approach zero; and finally, when $\pi_A$ is greater than $\pi_{AX}$, $V_X$ will become negative.

Although we will discuss below the effects of other variables upon $V_X$ prior to asymptote, none of these influence the final product of learning. In particular it is worth pointing out that in this instance the initial Vs, at the beginning of any of the probability treatments, leave no permanent effect; whatever the starting $V_A$ and $V_X$, a given treatment will eventually yield asymptotic values, as specified, appropriate to that treatment. For example, should $V_X$ first be incremented to some high value prior to a .8-.8 "extinction" treatment, the asymptotic value of X would still be zero according to the model.

Whatever the overall shock probability, if AX and A are reinforced with equal probability, the value of X will approach zero. Notice, however, that this zero level of conditioning for X occurs against different levels of conditioning to A, which are dependent upon the overall shock probability.

Random control procedure. It is of interest to consider further the case in which shock probability is equal in the presence and absence of the CS. This is the "truly random" control treatment suggested by Rescorla (1967) as a procedure against which to evaluate the effects of CS-US contingencies. As noted above, asymptotically the predictions from the model agree with Rescorla's findings of little conditioning using such a procedure. However, the model suggests that early in conditioning such a CS may in fact show an initial rise in V, followed by a return to the zero point.

The basis for this initial rise in V lies in the rates of conditioning of A and AX. Early in conditioning, reinforcement of the AX compound occurs while $V_A$ and hence $V_{AX}$ is relatively low. Consequently, $V_X$ can receive a sizable increment. As $V_A$ increases due to shocks in the absence of the CS, $V_{AX}$ approaches its asymptotic value and the increments in X correspondingly decrease in size. Furthermore, when $V_A$ approaches its asymptote, $V_{AX}$ will begin to exceed its asymptote due to the contribution of $V_X$. When this occurs, $V_{AX}$ will be decremented. What follows is a period in which $V_A$ is incremented from shocks in the absence of the CS and $V_{AX}$ is decremented because it exceeds its asymptote; effectively this produces a redistribution of the associative strength of AX among the components A and X. The process stops when $V_A$ and $V_{AX}$ are equal, i.e., $V_X$ is zero.

This account makes clear that the degree of initial conditioning of the CS predicted by the model in the random procedure is a function of the relative conditioning rates of A and AX. There are a variety of experimental manipulations and parameters of the model which consequently should influence the magnitude and duration of this rise. Two experimental conditions are especially important. The first is the overall probability of shock in both the presence and absence of the CS. As the overall shock probability increases, the magnitude of the initial conditioning to X is greater, and the approach to the final zero asymptote is slower. It is interesting to note, however, that the stage of training at which the maximum $V_X$ is reached remains the same. A second influential manipulation is the proportion of the total session during which the CS is present. In Figure 7, the CS was assumed to be present 1/5 of the time, as it was in fact in Rescorla's experiments. However, it can be shown that as the proportion of the session during which the CS is present is increased, the magnitude of the initial rise in V of the CS is increased; furthermore, the peak magnitude occurs earlier in conditioning, and the attainment of the final asymptote is retarded.

In addition to these experimental manipulations, two parameters of the model are important in determining the magnitude of the initial conditioning of X. As might be expected, one of these is the relative stimulus salience assumed for the background and the CS. According to the model, as the relative salience of the CS increases, the initial positive value taken on by a random CS is enlarged and its duration prolonged. Finally, the assumed relative importance of reinforcement and nonreinforcement is also relevant. As the rate parameter associated with reinforcement is assumed to be progressively larger than the rate parameter associated with nonreinforcement, the magnitude of this initial rise would be expected to increase.

Since most of the experiments employing this procedure have only assessed conditioning to the CS after extended training, there is relatively little direct evidence bearing on the details of these predictions from the model. However, because the manipulations which are predicted to affect the magnitude of the initial rise are also predicted to prolong its presence, some studies employing extended training might yet reflect the effects in question. One interesting example is a recent dissertation carried out at McMaster by Kremer. Kremer (1968) reported nonzero terminal levels of fear following a random procedure. His conditions were similar to those of Rescorla except that he used a more


salient CS (white noise vs. 720 Hz tone) and the CS was present a larger proportion of the session (1/3 or 1/2 vs. 1/5). It should be noted that these modifications of Rescorla's procedure are ones which according to the present model should enhance and prolong the initial positive conditioning of the CS. Furthermore, Kremer found that more frequent alternation between CS and non-CS periods produced more fear conditioning in the random procedure. It seems likely that more frequent alternation of these periods would also enhance CS salience, in which case the model again predicts a prolongation of the positive phase of the CS. Consequently, if we assume that Kremer's data were collected prior to asymptotic levels of conditioning, his results would generally fit with the model.
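The initial rise and eventual return to zero in the random procedure can be illustrated with a short Python sketch. Rather than sampling one particular schedule, it iterates the expected update per time segment; because the update rule is linear in V, this recursion traces the average of $V_X$ over random schedules. That averaging, the helper names `expected_curve` and `peak`, and the run length are assumptions made for illustration, while the parameter values are those quoted for Figure 7.

```python
def expected_curve(p_cs, p_bg, cs_fraction=0.2, alpha_a=0.1, alpha_x=0.5,
                   beta1=0.1, beta2=0.05, n_segments=5000):
    """Expected V_X after each time segment, with lambda1 = 1 and lambda2 = 0."""

    def drive(p_shock, v_total):
        # Expected value of beta * (lambda - V) over shocked and unshocked segments.
        return p_shock * beta1 * (1.0 - v_total) - (1.0 - p_shock) * beta2 * v_total

    v_a = v_x = 0.0
    curve = []
    for _ in range(n_segments):
        on_ax = cs_fraction * drive(p_cs, v_a + v_x)      # segments containing the CS
        on_a = (1.0 - cs_fraction) * drive(p_bg, v_a)     # background-alone segments
        v_a += alpha_a * (on_ax + on_a)
        v_x += alpha_x * on_ax
        curve.append(v_x)
    return curve


def peak(curve):
    """Return (height, segment number) of the maximum of a curve."""
    height = max(curve)
    return height, curve.index(height) + 1


# Random (.8-.8) treatment with the CS present 1/5, 1/3, or 1/2 of the time.
for fraction in (1 / 5, 1 / 3, 1 / 2):
    curve = expected_curve(0.8, 0.8, cs_fraction=fraction)
    height, when = peak(curve)
    print(f"CS present {fraction:.2f} of the time: peak {height:.2f} "
          f"at segment {when}, final {curve[-1]:.4f}")
```

Under these assumptions the .8-.8 curve rises, peaks, and decays toward zero, and raising the proportion of the session occupied by the CS from 1/5 to 1/2 gives a higher and earlier peak.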

Some comment should be made about the consequences of these predictions for the suitability of the truly random procedure as a "control" treatment in Pavlovian conditioning. This procedure was designed as a control for a particular operation, namely the establishment of a contingency between CS and US. On the assumption that contingencies are important in conditioning, this procedure gives a baseline of no correlation with which to make comparisons. This application of the procedure remains indifferent to the assignment of any particular theoretical learning value to the CS. What the present account attempts is a theoretical understanding of the results of arranging various degrees of correlation between a CS and US. And according to that account, although conditioning results are ordered by degree of correlation at every stage of learning and although asymptotically the random procedure does attain an associative value of zero, nevertheless there are stages at which it has positive value. But other theoretical accounts of this procedure are also possible and all would leave equally unaffected the status of the random treatment as a procedural control.

Positive CS-US correlations. From Figure 7 it is clear that throughout learning the degree of excitatory conditioning varies with the magnitude of the correlation between the CS and US; furthermore, positive correlations always yield positive Vs. However, the learning curves predicted from this account of positively correlated situations differ from typical learning curves in that they are not all monotonic. When shocks are delivered both in the presence and absence of the CS but with a higher probability during the CS, the V associated with the CS may sometimes attain a value early in conditioning which exceeds its final asymptotic value. The reason for this is similar to that for the initial rise in the random treatment; initially AX may approach its asymptote more rapidly than A approaches its final level. Consequently, after AX has ceased to grow, A continues to increase and during AX trials the compound V is redistributed among the components at the expense of X. The magnitude of this overshooting in $V_X$ will depend upon the relative rates of conditioning of A and AX. Thus, the same parameters will affect the rise in the case of positive correlations as were important in the case of no correlation.

Negative CS-US correlations. One of the more interesting results from experiments exploring correlations between CSs and USs is the finding that negative correlations lead to CSs which are conditioned inhibitors. Furthermore, the magnitude of the conditioned inhibition is a function of the degree of negative correlation (Rescorla, 1969). The present model is in general agreement with these findings; however, conditions which asymptotically generate negative Vs may, according to the model, initially generate positive Vs. For instance, the .4-.8 condition in Figure 7, although asymptoting at a level of -.33, attains considerable positive associative strength early in conditioning. Again, this initial rise is controlled by parameters affecting relative rates of conditioning for A and AX; it is only when $V_A$ is sufficiently large that $V_{AX}$ exceeds its asymptote that X can begin to acquire negative associative value. Thus prior to the setting up of X as a conditioned inhibitor, A must first be established as a conditioned excitor. Furthermore, notice that if a negatively correlated treatment is terminated early in conditioning, conditioned excitation may be observed despite the fact that the negative correlation was in force from the outset.

In summary, the present model seems consistent with the major asymptotic results of arranging various correlations between a CS and US. Furthermore, it specifies the set of experimental manipulations which might be expected to influence these results. In addition, it makes a number of interesting predictions about preasymptotic consequences of arranging correlations between CSs and USs.
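The nonmonotonic curves described for correlated treatments can be sketched in the same hedged way. The Python fragment below iterates the expected update per time segment (an averaging assumption, not necessarily how the curves of Figure 7 were produced) using the parameter values quoted for Figure 7; `mean_vx` is a hypothetical helper.

```python
def mean_vx(p_cs, p_bg, n_segments=4000, cs_fraction=0.2,
            alpha_a=0.1, alpha_x=0.5, beta1=0.1, beta2=0.05):
    """Expected V_X per time segment (lambda1 = 1, lambda2 = 0)."""
    v_a = v_x = 0.0
    curve = []
    for _ in range(n_segments):
        v_ax = v_a + v_x
        # Expected beta * (lambda - V) on CS segments and on background-alone segments.
        d_ax = p_cs * beta1 * (1 - v_ax) - (1 - p_cs) * beta2 * v_ax
        d_a = p_bg * beta1 * (1 - v_a) - (1 - p_bg) * beta2 * v_a
        v_a += alpha_a * (cs_fraction * d_ax + (1 - cs_fraction) * d_a)
        v_x += alpha_x * cs_fraction * d_ax
        curve.append(v_x)
    return curve


# One positively and one negatively correlated treatment from Figure 7.
for label, p_cs, p_bg in [(".8-.4", 0.8, 0.4), (".4-.8", 0.4, 0.8)]:
    curve = mean_vx(p_cs, p_bg)
    print(f"{label}: max {max(curve):+.2f}, final {curve[-1]:+.2f}")
```

Under these assumptions the .8-.4 curve peaks above its final level before settling, and the .4-.8 curve is positive early in training before settling at about -0.32, close to the -.33 quoted above.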

RELATION TO ATTENTIONAL THEORY

The model we have presented was designed to account for instances in which identical stimuli, although associated with equal reinforcement schedules, nevertheless acquire different associative strengths, as a result of the stimulus context in which they are imbedded. The correspondence between the model and data, as described in the preceding sections, would appear to offer encouragement to the line of theorizing which has been developed. There is, however, another plausible, more conventional theoretical approach to this same general problem, and one which has otherwise received some measure of support (e.g., Mackintosh, 1965). It thus becomes pertinent to ask what relative advantage, if any, is enjoyed by the present theory.

The alternative which must be considered is an "attentional" or "stimulus selection" interpretation (e.g., Sutherland, 1964). The familiar notion is that an organism can learn only about those cues to which it is attending, and that it has a limited attentional capacity. Attending to one cue is thus presumed to decrease the likelihood of attending to, and hence learning about, other available cues. Such theory has no apparent difficulty, for example, in accounting for the fact that the reinforcement of one stimulus element will have less incremental effect upon the learning to that element if there is concurrently present a stronger (better attended-to) cue.

The kinds of "in principle" arguments in favor of attentional theory are very seductive. Certainly the organism does not have an unlimited capacity to process sensory information. Thus, it must be expected that under some circumstances an environmental stimulus will not be reacted to, or may be less reacted to, as a result of the processing of concurrent stimuli. We would hardly quarrel with this. The proper question, however, is whether the organism's capacity is so limited that it is necessary to assume that several highly distinctive stimuli, as employed in compound-stimulus Pavlovian training, cannot generally be simultaneously attended to. And even if it were advisable to make such assumptions, would an attentional theory still account for the range of data with which we are presently concerned?

In many instances an attentional theory appears quite adequate. If we pretrain associative strength to a stimulus (A) and then reinforce the same stimulus in conjunction with a novel stimulus (X), the resultant associative strength of X will be reduced compared with that of a group not pretrained on A. It seems natural to assume that pretraining on A leads the animal to attend to A to the detriment of X, and thus to show little conditioning to X. A similar account seems applicable to the failure of a common cue to acquire considerable associative strength in a discriminative conditioning situation. The animal may be assumed to attend to the dimension defining the primary discriminanda and thereby fail to attend to stimuli less well correlated with the US. And the same reasoning may appear to apply to the failure to observe substantial evidence of learning in the "truly random" conditioning procedure. The CS is no more informative than are the contextual cues concerning the occurrence of the US, is thus not especially attended to, and is not learned about.

A major difficulty with this approach is that we are not provided with a specification of the trial-by-trial events which control attention in Pavlovian conditioning. It is not sufficient to argue that stimuli uncorrelated with reinforcement are not attended to; one needs to know how the trial-by-trial events which compose the uncorrelated treatment are processed by the animal so as to generate failure to attend to the CS. In the absence of such mechanisms the attentional account is little more than a redescription of the data.

But, to our view, the most significant fact is that while there are obvious symmetries in the results we have discussed, the attentional account seems only to apply to portions of the data. For instance, just as prior reinforcement of A will reduce the amount learned about X on subsequent reinforced AX trials, so prior inhibitory training of A will augment the amount learned about X on reinforced AX trials. It is not clear what modification in attention to A could be produced by inhibitory training which would enhance the amount learned about X on AX trials. It does not seem plausible to argue that inhibitory training of A makes the animal especially fail to attend to that cue, since Rescorla (1969) has shown that such training gives A decremental control over responding. In addition, an attentional account of the effects of prior reinforcement of A upon the subsequent nonreinforcement of AX does not seem satisfactory. Why should training an animal to attend to A make nonreinforced AX trials especially potent in conditioning inhibition to X?

Even in the case of certain phenomena which attentional theory has been thought to handle well, the apparent adequacy of the theory may be illusory. Consider Kamin's (1968) blocking experiment in which prior conditioning to A makes the subsequent reinforcement of AX practically ineffective in conditioning X. According to attentional theory, pretreatment of A causes the animal to attend to A on AX trials, so that all of the reinforcement effects that occur go to A. However, since A is already well conditioned, the influence of this reinforcement is difficult to detect. According to the present model, the prior conditioning of A results in an AX with a high associative value which in turn devalues the reinforcer; consequently, neither A nor X should receive additional conditioning.
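The model's side of these comparisons can be sketched numerically in Python. The fragment assumes that each AX trial changes both components in proportion to the discrepancy between the asymptote and the summed strength of the compound, with a single hypothetical `rate` standing for the product of salience and rate parameter; the starting values for A, the rates, and the trial counts are illustrative and are not taken from any of the experiments cited.

```python
def compound_trials(v_a, v_x=0.0, lam=1.0, rate=0.2, n_trials=10):
    """Run n_trials AX trials toward asymptote lam; rate stands for alpha * beta."""
    for _ in range(n_trials):
        error = lam - (v_a + v_x)      # discrepancy is computed on the whole compound
        v_a += rate * error
        v_x += rate * error
    return v_a, v_x


# Reinforced AX trials (lam = 1) after different histories for A.
for label, v_a0 in [("A excitatory", 1.0), ("A novel", 0.0), ("A inhibitory", -0.5)]:
    print("reinforced AX,", label, "-> V_X =", round(compound_trials(v_a0)[1], 3))

# Nonreinforced AX trials (lam = 0) after A has or has not been reinforced.
for label, v_a0 in [("A excitatory", 1.0), ("A novel", 0.0)]:
    v_x = compound_trials(v_a0, lam=0.0, rate=0.1)[1]
    print("nonreinforced AX,", label, "-> V_X =", round(v_x, 3))
```

With these illustrative values, X gains nothing when A already stands at the asymptote, gains more after inhibitory training of A than when A is novel, and becomes inhibitory on nonreinforced AX trials only when A has previously been reinforced.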

An unpublished experiment from Rescorla's laboratory was designed to evaluate these alternative interpretations. The strategy was to produce "blocking" using a high $V_{AX}$ arrived at through a low level of pretraining to both A and X rather than considerable training of A alone. According to the present model, any way of arriving at the same $V_{AX}$ should interfere equally with the effectiveness of the reinforcer and preclude further conditioning to either stimulus on the AX trials. According to an attentional notion, however, the reinforcer remains effective. Since both A and X separately should have low associative values, any selection of one of the stimuli to which to attend should result in further conditioning of that stimulus. Consequently, some conditioning should occur to at least one of the stimuli.

Four groups of rats were bar-press trained on a VI schedule of food reinforcement. They then all received 6 conditioning trials while bar-pressing, with a 2-min CS and a 0.5-sec, 1-mA shock; on three trials the CS was a flashing light, on three a 1200-Hz tone. This training resulted in less than asymptotic suppression to both CSs, such that subsequent TL compound trials could be shown to produce more complete suppression. Group TL then received 10 conditioning trials on which the tone-light compound terminated in shock. Groups T and L each received 10 reinforced trials with the tone or light alone, respectively. Group N received no further conditioning with either stimulus. All animals were then tested with 2 nonreinforced presentations of each component stimulus over each of 5 test days. Finally, each animal received 3 test sessions during which the TL compound was presented 4 times.
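What the model would lead one to expect from this design can be sketched in Python. The fragment is illustrative only: it assumes equal salience for tone and light, a single hypothetical `rate` of .2 per reinforced trial (standing for the product of salience and rate parameter), an asymptote of 1, and blocked rather than intermixed pretraining trials, none of which is specified by the experiment itself.

```python
def reinforce(strengths, present, n_trials, rate=0.2, lam=1.0):
    """Apply n_trials reinforced trials to the stimuli listed in `present`."""
    v = dict(strengths)                # leave the caller's values untouched
    for _ in range(n_trials):
        error = lam - sum(v[s] for s in present)   # based on the whole compound
        for s in present:
            v[s] += rate * error
    return v


# Pretraining: three reinforced trials with each stimulus alone.
start = reinforce(reinforce({"T": 0.0, "L": 0.0}, ["T"], 3), ["L"], 3)

# Second phase for the four groups.
groups = {
    "TL": reinforce(start, ["T", "L"], 10),
    "T": reinforce(start, ["T"], 10),
    "L": reinforce(start, ["L"], 10),
    "N": start,
}

for name, v in groups.items():
    print(f"Group {name}: V_T = {v['T']:.3f}, V_L = {v['L']:.3f}, "
          f"V_TL = {v['T'] + v['L']:.3f}")
```

With these values the components barely change in Group TL, because the compound is already near the asymptote after pretraining, whereas Groups T and L gain substantially on the separately trained element and so end with a stronger compound than Group TL.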

Figure 8 shows the mean suppression ratios for the light and tone separately for each of the 4 groups over the initial 5 test days. Looking first at the suppression observed to the tone CS, as represented in the left panel of Figure 8, it is clear that further conditioning to the tone alone (Group T) resulted in more suppression than did failure to give additional training (Group N). Group L, which had received only further conditioning to the light, showed suppression to the tone similar to that of Group N. The most interesting result, however, was that the suppression in Group TL was not different from that of Group N, but was considerably less than that of Group T. Reinforcing the tone in the presence of the light evidently produced no additional acquisition of fear to the tone.

The suppression observed to the light CS, as shown in the right panel of Figure 8, indicates that the findings for the tone in Group TL were not due simply to all animals attending to the light during compound training. The overall suppression to the light was greater than that to the tone, but the pattern of results was similar. Groups N, T, and TL did not differ in responding to the light, but all suppressed less than did Group L. Thus, additional training to the light alone, but not to the light in compound with the tone, yielded further conditioning to the light.

The results of the subsequent compound test trials were consonant with the above results obtained with the components. Over the three days of testing the TL compound, Groups TL and N gave mean suppression ratios of .23 and .28; the combined T and L groups gave a ratio of .13. Thus, additional training to either T or L alone yielded more subsequent suppression to the TL compound than did additional training to the compound itself.

[Figure 8: mean suppression ratio to the tone (left panel) and to the light (right panel) for Groups TL, T, L, and N over the 5 test days.]

These data clearly demonstrate that by giving a small amount of prior conditioning to each of two stimuli, it is possible to interfere severely with further conditioning to both stimuli when their compound presentation is reinforced. This finding is difficult to reconcile with an attentional theory, but is an obvious deduction from the present model.

Whatever the other virtues or liabilities of attentional theory, it simply does not fare well in relationship to the present model when applied to the range of Pavlovian conditioning arrangements under consideration.

CONCLUDING COMMENTS

We have attempted to point out some general principles governing the effectiveness of reinforcement and nonreinforcement in Pavlovian conditioning situations. Experiments from our own laboratories indicate that the incremental or decremental effects upon a component stimulus as a result of the reinforcement or nonreinforcement of a stimulus compound containing that component depend upon the total associative strength of the compound, not simply upon the associative strength of the component. This general dependence, incorporated within a more quantitative formulation of our earlier theoretical position (e.g., Wagner, 1969a, 1969b; Rescorla, 1969), provides a way of integrating a sizeable number of empirical findings. Several sample derivations made from the theory have been demonstrated to match well with available data. But the greater value of the more specific theoretical formulation which has been proposed may be, as we have seen in several instances, in the identification of additional variables of importance to Pavlovian conditioning. It at least invites a fresh look at a data area in which the available theoretical alternatives have been meager.

References

Annau, Z., & Kamin, L. J. The conditioned emotional response as a function of intensity of the US. Journal of Comparative and Physiological Psychology, 1961, 54, 428-432.

Bush, R. R., & Mosteller, F. Stochastic models for learning. New York: Wiley, 1955.

Egger, D. M., & Miller, N. E. Secondary reinforcement in rats as a function of information value and reliability of the stimulus. Journal of Experimental Psychology, 1962, 64, 97-104.

Estes, W. K., & Burke, C. J. A theory of stimulus variability in learning. Psychological Review, 1953, 60, 276-286.

Hull, C. L. Principles of behavior. New York: Appleton-Century-Crofts, 1943.

Kamin, L. J. Attention-like processes in classical conditioning. In M. R. Jones (Ed.), Miami Symposium on the prediction of behavior: Aversive stimulation. Miami: University of Miami Press, 1968.

Kamin, L. J. Predictability, surprise, attention, and conditioning. In R. Church and B. Campbell (Eds.), Punishment and aversive behavior. New York: Appleton-Century-Crofts, 1969.

Konorski, J. Conditioned reflexes and neuron organization. Cambridge: The University Press, 1948.

Kremer, E. Pavlovian conditioning and the random control procedure. Unpublished doctoral dissertation, McMaster University, 1968.

Lovejoy, E. P. An analysis of the overlearning reversal effect. Psychological Review, 1966, 73, 87-103.

Mackintosh, N. J. Selective attention in animal discrimination learning. Psychological Bulletin, 1965, 64, 124-150.

Pavlov, I. P. Conditioned reflexes. London: Oxford University Press, 1927.

Rescorla, R. A. Pavlovian conditioning and its proper control procedures. Psychological Review, 1967, 74, 71-80.

Rescorla, R. A. Probability of shock in the presence and absence of CS in fear conditioning. Journal of Comparative and Physiological Psychology, 1968, 66, 1-5.

Rescorla, R. A. Conditioned inhibition of fear. In W. K. Honig and N. J. Mackintosh (Eds.), Fundamental issues in associative learning. Halifax: Dalhousie University Press, 1969.

Rescorla, R. A., & LoLordo, V. M. Inhibition of avoidance behavior. Journal of Comparative and Physiological Psychology, 1965, 59, 406-412.

Restle, F. A theory of discrimination learning. Psychological Review, 1955, 62, 11-19.

Sutherland, N. S. Visual discrimination in animals. British Medical Bulletin, 1964, 20, 54-59.

Wagner, A. R. Stimulus validity and stimulus selection. In W. K. Honig and N. J. Mackintosh (Eds.), Fundamental issues in associative learning. Halifax: Dalhousie University Press, 1969. (a)

Wagner, A. R. Stimulus-selection and a "modified continuity theory." In G. H. Bower and J. T. Spence (Eds.), The psychology of learning and motivation. Vol. 3. New York: Academic Press, 1969. (b)

Wagner, A. R. Incidental stimuli and discrimination learning. In G. Gilbert and N. S. Sutherland (Eds.), Discrimination learning. London: Academic Press, 1968.

Wagner, A. R., Logan, F. A., Haberlandt, K., & Price, T. Stimulus selection in animal discrimination learning. Journal of Experimental Psychology, 1968, 76, 171-180.

Woodbury, C. B. The learning of stimulus patterns by dogs. Journal of Comparative Psychology, 1943, 35, 29-40.