Operant Conditioning
Psychology
This week we are covering Chapter 5: Learning. This is the chapter where students start to think, “Why am I studying learning? Don’t we all just _know_ how we learn?” I wish that were the case! But it turns out that learning is a pretty complicated topic. Check out Dr. Stephen Chew’s How to Get the Most Out of Studying website, with its 5 associated short videos, to learn how you can study more effectively (in this or any other class).
Keep in mind that some of the topics we’ll cover in this chapter are going to seem easy, but I assure you that they do require deep thinking on your part. We’ll cover the following sections in your book:
· An Introduction to Learning
· Classical Conditioning
· Operant Conditioning
We won’t cover Observational Learning and Cognition.
The infographics in this chapter are EXCELLENT for understanding the concepts described in class. I’ll give some links to other illustrative examples below, but check these out and study them:
· 5.1 Learning through Classical Conditioning
· 5.2 Learning through Operant Conditioning
· 5.3 Partial Reinforcement: Schedules of Reinforcements
· 5.4 Learning: Reinforcement and Punishment
Classical Conditioning
Classical conditioning is something you’ve heard about before – yes, it has to do with the drooling dog and Ivan Pavlov. Basically, Pavlov was studying the digestive system of dogs when he realized something. Whenever he came into the lab wearing a white lab coat, the dogs began to salivate. It was as if they were anticipating being fed. Pavlov named the stimuli and behaviors in this situation: unconditioned response (UCR), unconditioned stimulus (UCS), neutral stimulus (NS), conditioned response (CR), and conditioned stimulus (CS). Note: “condition” in this context is just a fancy word for learning.
Before the dog became conditioned to expect food when he saw someone in a lab coat:
· The unconditioned response (UCR) was a behavior that the dog did naturally. He didn’t have to learn (or be conditioned) to do this behavioral response (i.e., drool).
· The unconditioned stimulus (UCS) was something that elicited the UCR (i.e., food).
· The neutral stimulus (NS) was something that did not elicit any specific behavior before conditioning.
In Pavlov’s case, the UCS was the sight of food and the UCR was the dog drooling. You don’t have to teach a dog to drool when he sees food, right? The NS was the person in a lab coat. Dogs don’t typically do anything when they see this.
During conditioning, Pavlov said that there was a repeated pairing of the UCS and the NS. In Pavlov’s case, the dog was learning that when someone in a lab coat (NS) came into the lab, the dog would get food (UCS). When the dog got food, he would salivate (UCR to the food).
After conditioning, the dog learned to associate the appearance of a person in a lab coat with food. So the NS has become the conditioned stimulus. And the dog’s response to the conditioned stimulus is his conditioned response.
· The conditioned response (CR) was the learned (i.e., conditioned) behavior that the dog showed. He started to drool whenever he saw someone in a lab coat.
· The conditioned stimulus (CS) was the thing that elicited the CR (i.e., lab coat).
In Pavlov’s case, the dog learned (was conditioned) to associate a person coming into the lab wearing a white lab coat (CS) with getting food, so the dog will salivate whenever he sees the person (CR) even if the person does not have any food!
So in summary:
· UCS = food
· UCR = drooling in response to the food
· NS = person in a white lab coat before conditioning
· CS = previously the NS; person in a white lab coat after conditioning
· CR = drooling in response to the person
This was Pavlov’s original observation. In his classical (heh, get it?) experiment, he paired a tone with the appearance of food. Check out this video on the classic experiment here. So:
· UCS = food
· UCR = drooling in response to the food
· NS = bell ringing before conditioning
· CS = previously the NS; bell ringing after conditioning
· CR = drooling in response to the bell ringing
A person can generalize their classically conditioned (learned) behaviors to other stimuli. For example, if a 50 Hz tone is the CS that leads to drooling as the CR, a person (or dog or other animal) can generalize the CR (drooling) to other, similar stimuli (other tones), such as a 100 Hz tone or even a loud horn. In this way, we say the behavior has become generalized: the response (drooling) occurs whenever something similar (any noise) occurs. On the other hand, a CR might be tied to one specific CS. This is called discrimination. In this case, the dog will salivate only to a 50 Hz tone, and not to any other tone.
You can see generalization in this video of the Little Albert experiment, which shows how behaviorist John Watson classically conditioned a baby named Albert to become afraid (CR) of a previously neutral stimulus, a white rat. Before conditioning, Little Albert showed no fear of the test stimuli he was shown, such as fire, a monkey, a dog, a rabbit, and a white rat. During conditioning, Watson presented a loud noise (UCS), which elicited a fear response in Albert (UCR) – he cried, yelled, and tried to crawl away. By pairing the loud noise (UCS) with the white rat (NS), he conditioned Little Albert to show fear (CR) of the rat (now the CS). You can then see how Little Albert’s fear response (the CR) generalized to other furry objects like a fur coat or Santa mask.
You might ask if you can ever get rid of these associations. You can, and the process is called extinction. If the CS is repeatedly presented without the UCS, the person will eventually stop showing the CR. In the case of Pavlov’s dog, if you keep ringing the bell (CS) without presenting the food (UCS), then the dog will eventually stop drooling (CR) whenever it hears the bell. BUT – extinction is not forgetting! If you wait a period of time (hours, days, weeks, months, years) and then present the CS again, the person might spontaneously show the CR. This is called spontaneous recovery. In the case of Pavlov’s dog, if you stop ringing the bell (CS) once the drooling (CR) has stopped, and then ring the bell again a few hours later, the dog will salivate once again. The response hasn’t been forgotten – think about it as being dormant.
Operant Conditioning
Operant conditioning explains how changes in voluntary behaviors (like studying, exercising, etc.) are learned through the good and bad consequences of that behavior. The idea here is that future behavior is governed by what happened the last time you engaged in that behavior. For example, let’s say you passed your Intro Psych test because you studied. If you want to pass the next test, you’ll know you have to study again. If you failed your last test because you only read your textbook but didn’t really study, then you know you’re going to have to change your study habits if you want to pass the next one. (See the Chew website named above.)
In operant conditioning, a behavior is either increased or decreased. Something that leads to an increase in behavior is called reinforcement. Something that leads to a decrease in behavior is called punishment. Not all punishment is bad in this context. If you want to decrease your alcohol use, for example, then you would use a form of punishment.
Both reinforcement and punishment can be positive or negative. Again, this does not mean good or bad. It refers to whether something is being given (positive) or taken away (negative).
In the case of decreasing your alcohol use:
· positive punishment might be something like every time you drink, you get sick, while
· negative punishment might be something like every time you drink, you get your phone taken away
In the case of increasing your study habits:
· positive reinforcement might be something like every time you study a chapter, you reward yourself with a cup of coffee
· negative reinforcement might be something like every time you study a chapter, you get to stop feeling bad about not studying
So there is a 2 (reinforcement, punishment) x 2 (positive, negative) matrix here of possible outcomes:
[Image: 2 x 2 matrix of positive/negative reinforcement and punishment]
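If it helps to see the 2 x 2 matrix spelled out as logic, here is a minimal Python sketch. The function name classify_consequence and its two yes/no parameters are my own illustration (assumptions), not terms from your book. It simply asks the two questions from above: was something given or taken away, and did the behavior increase or decrease?

```python
def classify_consequence(something_given: bool, behavior_increases: bool) -> str:
    """Name the cell of the 2 x 2 matrix that a consequence falls into."""
    # Positive = something is given; negative = something is taken away.
    sign = "positive" if something_given else "negative"
    # Reinforcement increases the behavior; punishment decreases it.
    effect = "reinforcement" if behavior_increases else "punishment"
    return f"{sign} {effect}"


# The four examples from the text: (something given?, behavior increases?)
examples = {
    "you get sick every time you drink": (True, False),
    "your phone is taken away every time you drink": (False, False),
    "you get a cup of coffee after studying a chapter": (True, True),
    "you stop feeling bad after studying a chapter": (False, True),
}

for description, (given, increases) in examples.items():
    print(f"{description}: {classify_consequence(given, increases)}")
```

Notice that "positive" and "negative" never depend on whether the consequence feels good or bad, only on whether something is added or removed.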
So when does someone receive the consequence (positive or negative) that leads to a change in their behavior (reinforcement, punishment)? Both reinforcements and punishments can be given out on a schedule. These are called schedules of reinforcement (which is confusing, because punishment can also be doled out on a schedule of reinforcement). There is a good video explaining it here.
The schedules describe the circumstances under which your behavior will lead to a positive or negative consequence. Under ratio schedules, a person receives a consequence after a certain number of behaviors (e.g., a certain number of chapters studied). Under interval schedules, a person receives a consequence after a certain period of time has passed. The number of behaviors or the amount of time passed can be either fixed or random (called variable). So again, you have a 2×2 matrix of schedules of reinforcement:
[Image: 2×2 matrix of schedules of reinforcement (fixed or variable x ratio or interval)]
Some examples:
[Table: examples of each schedule of reinforcement]
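The four schedules described above can also be sketched as a small simulation. This is only an illustration under my own assumptions: the function names are hypothetical, a "variable" requirement is drawn as a whole number between 1 and 2 × mean − 1 (so it averages out to the mean), and an interval schedule delivers the consequence on the first behavior that occurs once the required time has passed.

```python
import random


def next_requirement(mean: int, variable: bool, rng: random.Random) -> int:
    # Fixed schedules always require `mean`; variable schedules draw a whole
    # number from 1 to 2 * mean - 1, which averages to `mean` (an assumption).
    return rng.randint(1, 2 * mean - 1) if variable else mean


def ratio_schedule(n_behaviors: int, ratio: int, variable: bool = False, seed: int = 0) -> list:
    """Return which behaviors (1st, 2nd, ...) earn a consequence."""
    rng = random.Random(seed)
    delivered, count = [], 0
    required = next_requirement(ratio, variable, rng)
    for behavior in range(1, n_behaviors + 1):
        count += 1
        if count >= required:  # enough behaviors since the last consequence
            delivered.append(behavior)
            count = 0
            required = next_requirement(ratio, variable, rng)
    return delivered


def interval_schedule(behavior_times: list, interval: int, variable: bool = False, seed: int = 0) -> list:
    """Return the times of the behaviors that earn a consequence."""
    rng = random.Random(seed)
    delivered, last = [], 0
    required = next_requirement(interval, variable, rng)
    for t in behavior_times:
        if t - last >= required:  # enough time since the last consequence
            delivered.append(t)
            last = t
            required = next_requirement(interval, variable, rng)
    return delivered


# Fixed ratio: a consequence after every 3rd chapter studied.
print(ratio_schedule(10, 3))                          # [3, 6, 9]
# Fixed interval: a consequence for the first behavior after 5 time units.
print(interval_schedule([1, 2, 5, 6, 11, 12], 5))     # [5, 11]
# Variable ratio: the required number of behaviors changes each time.
print(ratio_schedule(20, 3, variable=True))
```

Try changing variable=True to variable=False (or the reverse) and watch how predictable the consequences become.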
Some websites that may be helpful:
· TED-Ed video on the differences between classical and operant conditioning
· Classical conditioning sites:
  · Classical Conditioning from Lumen
  · Classical Conditioning, generalization, extinction, discrimination from Khan Academy
  · Example of classical conditioning from the TV show The Office
· Operant conditioning sites:
  · Operant Conditioning from Simply Psychology
  · Example of operant conditioning from the TV show The Big Bang Theory
  · Operant conditioning from Lumen, which has a good quiz at the end to test your understanding of concepts.
Announcements about upcoming assignments:
· Your RWP #1 stage 2 response is due Mon, 6/5. Remember that each RWP is worth 10% of your final grade – don’t wait until the last minute. Please let me know if you have any questions. Please enter your post for stage 2 in a new thread by hitting REPLY to my reply to your initial post.
Hit reply and type your answers to the following:
1. Pick one of the short videos on Dr. Stephen Chew’s How to Get the Most Out of Studying website and summarize what it talks about. Was there anything that surprised you? How will you use this information to inform your future study habits?
2. What is the difference between operant and classical conditioning? Include in your answer: the type of behavior that is being changed, the experimental procedure used, extinction and generalization.
3. Pick ONE of the following:
A. We learn a lot through classical conditioning. Explain how a drug addict might be classically conditioned to yearn for drugs when they pass their dealer’s house. Be sure to mention what happens before, during, and after conditioning and indicate the UCS, UCR, NS, CS, and CR. What happens during spontaneous recovery?
B. Many animal trainers use operant conditioning to train animals to do things like jump through a hoop or to “sit”. Explain how you would use operant conditioning to train an elephant to walk in a circle on command using successive approximations. Be sure to mention and explain what type of conditioning you are using (reinforcement, punishment, positive, negative) and which schedule of reinforcement you would employ (ratio, interval, fixed, variable).