Operant Conditioning The Learner is NOT passive. Learning based on consequence!!!

Upload: henry-henderson

Post on 23-Dec-2015


Page 2: Operant Conditioning The Learner is NOT passive. Learning based on consequence!!!

Operant Conditioning
• Learning controlled by a connection to the consequence of one’s behavior
• Consequences of behavior determine whether it will be repeated in the future

Vs. Classical Conditioning
• Behavior is…
  CC: elicited, automatic, reflexive
  OC: emitted, voluntary, complex behaviors
• Reward is…
  CC: provided independent of actions
  OC: dependent on behavior


B.F. Skinner

• The most influential behaviorist and proponent of Operant Conditioning.

• Nurture guy through and through.

• Used a Skinner Box (Operant Conditioning Chamber) to demonstrate his concepts.


Skinner’s operant box: non-reflexive behaviors could be altered by learning


Chaining Behaviors

Subjects are taught a number of responses successively in order to get a reward.


Thorndike’s Puzzle and The Law of Effect

• Edward Thorndike
• Locked cats in a cage (the puzzle box)
• Behavior changes because of its consequences.
• If a response is rewarded, that response is more likely to occur again.
• If consequences are unpleasant, the stimulus-response connection will weaken. (LOE)
• Called the whole process instrumental learning.
• Instrumental behaviors


Thorndike


Operant Conditioning

Reinforcement: increases probability of response
• Positive: desirable stimulus is added
• Negative: undesirable stimulus is removed

Punishment: decreases probability of response
• Positive: adding something bad
• Negative: removing something good


Reinforcement

When an event increases the likelihood that a response will occur again

Positive
• Adding something good
• Designed to increase behavior

Negative
• Removing something bad
• Designed to increase behavior


Types of reinforcers

Primary vs. secondary
• Primary: inherently satisfying to most people
• Secondary: gain value from conditioning

Immediate & delayed
• Usually needs to be immediate, but humans can handle delayed reinforcers
• Important for self-control


Rat basketball

What type of learning was this an example of?

Can you explain what helped the rats learn to score a basket?


Punishment/Consequence

When an event decreases the likelihood that a response will occur again

Two types: Positive & Negative

Positive ≠ Good. POSITIVE = ADD
• Adding something bad
• Designed to decrease behavior

Negative ≠ Bad. NEGATIVE = SUBTRACT
• Removing something good
• Designed to decrease behavior


Importance of reinforcement
• Punishment signals undesirable behavior but doesn’t inform of desired behavior
• Punished behavior is suppressed
• Punishment teaches stimulus discrimination
• Punishment (esp. physical) teaches fear & aggression
• Ignore behavior that one wants to punish; look for what to reinforce


Punishment tends to be ineffective
• It tells the organism what not to do, rather than what to do
• Creates anxiety that can interfere with future learning
• Encourages subversive behavior (sneakiness)
• Provides a model for aggressive behavior
• Only true for some races/cultures


Neg. reinforcement ≠ punishment


The Decision Tree

How to solve operant conditioning problems

1. Should the behavior increase or decrease?
• Increase: Reinforcement
• Decrease: Punishment

2. Is something being added or taken away?
• Added: Positive
• Removed: Negative
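The two questions of the decision tree can be written as a small function. This is only an illustrative sketch in Python; the name `classify_consequence` and its two boolean parameters are made up for this example and are not part of the slides.

```python
def classify_consequence(behavior_increases: bool, stimulus_added: bool) -> str:
    """Label a consequence using the two decision-tree questions."""
    # Question 1: should the behavior increase or decrease?
    effect = "reinforcement" if behavior_increases else "punishment"
    # Question 2: is something being added or taken away?
    sign = "positive" if stimulus_added else "negative"
    return f"{sign} {effect}"


# Taking away a privilege so that a behavior decreases:
print(classify_consequence(behavior_increases=False, stimulus_added=False))
# negative punishment
```

Answering the "increase or decrease" question first keeps "negative" from being read as "bad": the sign only says whether something was added or removed.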


Review

Punishment (decreases behavior)
• Positive: ADD something unfavorable
• Negative: SUBTRACT something desirable

Reinforcement (increases behavior)
• Positive: ADD something desirable
• Negative: SUBTRACT something unfavorable


Behavior Modification
• Started with Thorndike
• Altering individual behavior (frequency) through positive and negative reinforcement and positive and negative punishment
• Adaptive behaviors
• Reduction of behavior through its extinction and punishment
• A.K.A. Applied Behavior Analysis or Positive Behavior Support (PBS)

A child is riding with an adult, and the child is thirsty. So the child asks to stop and get a drink. The adult says no, and the child asks again, and again, and again... Finally, the adult gives in, saying, "All right, just this once." Big mistake, right? Why? The adult has now put the child on a partial reinforcement schedule, making a repetition of the same behavior likely later on. Instead, the adult should have said, "All right, I'll get you a drink IF you don't ask for one for the next 10 minutes" (the time may have to vary, depending on the child). Then the adult is providing the child with positive reinforcement for being quiet.

Ending a relationship?


Behavior Modification

Reinforcement provides a system of rewards and punishments to change negative behavior into positive responses.
• Provides rewards when someone acts in a positive manner. Rewards can range from a compliment to granting a special privilege to the patient whose behavior becomes desirable.
• A negative consequence might be the result of unwanted behavior, with the removal of a favorite object or taking away a privilege.

Cognitive behavior modification techniques focus on thought patterns that affect behavior.
• Involve teaching a patient to recognize thoughts that may be unrealistic or distort reality.
• Keeping a journal, role-playing, and being asked to defend thoughts that defy reality.
• Applied to eating disorders, anxiety disorders, OCD, panic attacks

Aversion behavior modification techniques center on the premise that all behavior is learned and can be unlearned (aka CC).
• Electrical shock treatment is one example of an aversive stimulus used to treat deviant behavior.
• (Mild) medication given to alcoholics that might make them ill if they drink while using the drug.

The token system provides immediate rewards while setting goals for future conduct.
• Distribute a token or similar object each time a patient or student exhibits positive behavior.
• Tokens can be amassed and later exchanged for a prize or privilege, or lost due to unwanted behavior.
• This form of behavior modification is commonly used in mental institutions and prisons to help control individuals who show violent tendencies.
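The token system described above is essentially a small ledger, which can be sketched in Python. The class name `TokenEconomy`, its method names, the one-token-per-event amounts, and the rule that the balance never drops below zero are all assumptions made for illustration; the slides do not specify them.

```python
class TokenEconomy:
    """Hypothetical token ledger: earn for positive behavior, lose for unwanted behavior."""

    def __init__(self) -> None:
        self.tokens = 0

    def positive_behavior(self) -> None:
        # One token is handed out each time positive behavior is shown.
        self.tokens += 1

    def unwanted_behavior(self) -> None:
        # A token is lost; the balance never drops below zero (assumption).
        self.tokens = max(0, self.tokens - 1)

    def exchange(self, cost: int) -> bool:
        # Trade tokens for a prize or privilege if enough have been amassed.
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


ledger = TokenEconomy()
for _ in range(3):
    ledger.positive_behavior()
ledger.unwanted_behavior()
print(ledger.tokens)       # 2
print(ledger.exchange(2))  # True
print(ledger.exchange(1))  # False
```

The token arrives immediately after the behavior, while the prize is the delayed goal the tokens are saved toward.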


Premack principle
• A less frequently performed behavior can be increased by reinforcing it with a more frequent behavior
• Eat your vegetables before you can have dessert!


Operant Conditioning in Daily Life

Do we wait for the subject to deliver the desired behavior?

Sometimes, we use a process called shaping.

Shaping is reinforcing small steps on the way to the desired behavior.

To train a dog to get your slippers, you would have to reinforce him in small steps. First, to find the slippers. Then to put them in his mouth. Then to bring them to you and so on…this is shaping behavior.

To get Barry to become a better student, you need to do more than give him a massage when he gets good grades. You have to give him massages when he studies for ten minutes, or when he completes his homework. Small steps to get to the desired behavior.


Shaping

Reinforcing responses that come successively closer to the desired response

Successive approximations


Shaping
• Reinforcers gradually increase organism’s actions toward desired end behavior
• Successive approximations: behaviors closer & closer to end learning goal get rewarded

1. Simply turning toward the lever will be reinforced

2. Only stepping toward the lever will be reinforced

3. Only moving to within a specified distance from the lever will be reinforced

4. Only touching the lever with a part of the body will be reinforced

5. Only touching the lever with a specified paw will be reinforced

6. Only depressing the lever partially with the specified paw will be reinforced

7. Only depressing the lever completely with the specified paw will be reinforced
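The seven steps above amount to a criterion that keeps rising. Here is a rough Python sketch of that idea; the names `STEPS` and `shape`, and the rule that a single reinforced trial is enough to raise the criterion, are assumptions made for illustration, not something the slides state.

```python
STEPS = [
    "turn toward the lever",
    "step toward the lever",
    "move within a specified distance of the lever",
    "touch the lever with any part of the body",
    "touch the lever with the specified paw",
    "depress the lever partially with the specified paw",
    "depress the lever completely with the specified paw",
]


def shape(observed_levels: list[int]) -> list[bool]:
    """Return, per trial, whether the response was reinforced.

    Each observed level is the index in STEPS of the furthest step reached.
    """
    criterion = 0  # start by reinforcing the easiest approximation
    reinforced = []
    for level in observed_levels:
        meets = level >= criterion
        reinforced.append(meets)
        # Assumption: one reinforced trial is enough to raise the criterion.
        if meets and criterion < len(STEPS) - 1:
            criterion += 1
    return reinforced


print(shape([0, 0, 1, 2, 2, 3]))
# [True, False, True, True, False, True]
```

Note how the second trial goes unrewarded: turning toward the lever was enough on trial one, but not after the criterion moved up.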


Schedules of reinforcement

• How often do you give the reinforcer?
• Every time, or just sometimes when you see the behavior?


Schedules of Reinforcement

Continuous reinforcement schedule: reinforcing a response every time
• Learning occurs rapidly, extinction occurs rapidly

Partial reinforcement schedule: reinforcing a response only some of the time
• Slower acquisition, but resistant to extinction

Fixed vs. Variable, Ratio vs. Interval
• Fixed ratio: after set # of responses
• Variable ratio: after unpredictable # of responses
• Fixed interval: after set amount of time has passed
• Variable interval: after unpredictable amount of time has passed


Continuous v. Partial Reinforcement

Continuous
• Reinforce the behavior EVERY TIME the behavior is exhibited.
• Usually done when the subject is first learning to make the association.
• Acquisition comes really fast.
• But so does extinction.

Partial
• Reinforce the behavior only SOME of the times it is exhibited.
• Acquisition comes more slowly.
• But is more resistant to extinction.
• FOUR types of Partial Reinforcement schedules.


Schedules of reinforcement Continuous vs. partial


Ratio schedules

1. Fixed-ratio (FR) schedules: Reinforcement after a fixed (predictable) number of responses
• Ex: paid $1 for every 20 apples you pick

2. Variable-ratio (VR) schedules: Reinforcement after a varying (unpredictable) number of responses
• Induces very high rate of responding
• Ex: scratch & win lottery tickets
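To make the difference between the two ratio schedules concrete, here is a minimal Python sketch. The function names are invented for this example, and the way the variable requirement is drawn (uniformly from 1 to 2 × mean − 1, so that it averages out to the mean) is just one possible assumption; the slides only say the number is unpredictable.

```python
import random


def fixed_ratio_reinforcers(responses: int, ratio: int) -> int:
    """Reinforcers earned on an FR schedule: one per `ratio` responses."""
    return responses // ratio


def variable_ratio_reinforcers(responses: int, mean_ratio: int, seed: int = 0) -> int:
    """Reinforcers earned on a VR schedule.

    Assumption: the number of responses required for each reinforcer is drawn
    uniformly from 1 to 2 * mean_ratio - 1, so it averages mean_ratio.
    """
    rng = random.Random(seed)
    earned = 0
    remaining = responses
    while True:
        required = rng.randint(1, 2 * mean_ratio - 1)
        if required > remaining:
            return earned
        remaining -= required
        earned += 1


# FR-20 from the slide: $1 for every 20 apples picked.
print(fixed_ratio_reinforcers(responses=65, ratio=20))  # 3
```

On the fixed schedule the picker can count down to the next dollar; on the variable one the next response might always be the one that pays.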


Interval Schedules

3. Fixed-interval (FI) schedule: Reinforcement after a fixed (predictable) amount of time

4. Variable-interval (VI) schedule: Reinforcement after varying (unpredictable) amounts of time
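On interval schedules, what counts is the first response after the required time has passed. A hedged Python sketch follows; the function names are hypothetical, the clock is assumed to start at 0 and restart at each reinforcer, and for the variable case the caller supplies the list of unpredictable waits.

```python
def fixed_interval_reinforced(response_times, interval):
    """Return the response times that earn a reinforcer on an FI schedule."""
    reinforced = []
    last = 0.0  # assumption: timing starts at 0 and restarts at each reinforcer
    for t in sorted(response_times):
        if t - last >= interval:
            reinforced.append(t)
            last = t
    return reinforced


def variable_interval_reinforced(response_times, intervals):
    """Same idea on a VI schedule; `intervals` lists the successive required waits."""
    reinforced = []
    last = 0.0
    waits = iter(intervals)
    wait = next(waits, None)
    for t in sorted(response_times):
        if wait is None:
            break  # no more scheduled reinforcers
        if t - last >= wait:
            reinforced.append(t)
            last = t
            wait = next(waits, None)
    return reinforced


times = [2, 9, 11, 12, 19, 25]  # seconds at which the subject responds
print(fixed_interval_reinforced(times, interval=10))   # [11, 25]
print(variable_interval_reinforced(times, [3, 8, 5]))  # [9, 19, 25]
```

Extra responses made before the interval is up earn nothing, which is why responding faster does not pay off here the way it does on ratio schedules.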


Reinforcement Schedules
• Fixed ratio: after set number of responses
• Fixed interval: after set amount of time
• Variable ratio: after random number of responses
• Variable interval: after random amount of time



Name that Schedule!

1. Winning at the slot machines
2. Getting a free flight after accumulating 10,000 flight miles
3. Receiving an allowance every Saturday regardless of chores, as long as you’ve done one chore
4. Random drug testing at your job

A. Variable Ratio
B. Fixed Ratio
C. Variable Interval
D. Fixed Interval

Answers: 1. A, 2. B, 3. D, 4. C