Operant conditioning can be observed in most areas of a growing child’s life, as well as in the adult world. It has been widely applied in human behavior modification using reinforcement techniques. A student who receives a positive response from the teacher may be encouraged to repeat the behavior. The opposite is also true: being reprimanded in front of classmates may discourage a student from repeating their conduct. Both examples are operant conditioning, where positive reinforcement encourages repetition of a behavior and a negative response discourages it.
Behavioral psychologist B.F. Skinner coined the term operant conditioning in 1937. He defined it as behavior controlled by its consequences. Operant conditioning strengthens or discourages behavior based on the consequences of that behavior. It is a learning process utilizing reinforcement and punishment.
How B.F. Skinner Discovered Operant Conditioning
Burrhus Frederic Skinner grew up in the small town of Susquehanna, Pennsylvania. He received a B.A. in English literature at Hamilton College in 1926 because he wanted to become a writer. Despite mentorship from the famous poet Robert Frost, he soon realized that writing wasn’t his future.
While working at a bookstore, he came across the works of Pavlov and Watson, which inspired him to focus on psychology. B.F. Skinner received his Ph.D. in psychology at Harvard University in 1931. He stayed on as a researcher until 1936. He then left Harvard for academic posts at the University of Minnesota and Indiana University before returning to Harvard in 1948.
Influenced by John B. Watson’s philosophy of behaviorism, which rejected Jung and Freud’s psychoanalytic theories, Skinner believed that action was a consequence of conditioning. He deemed free will insignificant and believed humanity’s future lay in creating an environment where human behavior was controlled systematically, resulting in desirable outcomes.
Edward Thorndike’s Law of Effect also influenced Skinner. The Law of Effect holds that actions with desirable outcomes are more likely to be repeated than actions with undesirable results.
Skinner was interested in behaviors that affected the environment and coined the term operant conditioning to distinguish them from the reflex-related behaviors Pavlov studied. Skinnerian conditioning (or operant conditioning) wasn’t very different from the already known term, instrumental learning. His schedules of reinforcement, however, distinguished his theory from instrumental learning. Operant conditioning studied reversible behavior maintained by schedules of reinforcement. His discovery led to new tools for learning processes.
At Harvard, B.F. Skinner developed the Skinner box, an operant conditioning apparatus, and the cumulative recorder. The cumulative recorder device recorded responses as an upward movement of the line. Response rates were read by looking at the slope of the line. The Skinner box is a small environment used to analyze animal behavior through conditioning and reinforcement.
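The idea of reading a response rate from the slope of the cumulative record can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `cumulative_record` and `response_rate` and the sample response times are hypothetical and do not come from Skinner's apparatus.

```python
def cumulative_record(response_times):
    """Return (time, cumulative count) points.

    The line steps up by one at every response, like the pen of a
    cumulative recorder.
    """
    return [(t, i + 1) for i, t in enumerate(sorted(response_times))]


def response_rate(response_times, start, end):
    """Slope of the cumulative record over [start, end).

    The slope is the number of responses in the window divided by the
    length of the window, i.e. responses per unit of time.
    """
    if end <= start:
        raise ValueError("end must be greater than start")
    count = sum(1 for t in response_times if start <= t < end)
    return count / (end - start)


# Hypothetical response times in seconds: fast responding, then slow.
times = [1, 2, 3, 10, 20]
print(cumulative_record(times))
print(response_rate(times, 0, 5))   # steep part of the line
print(response_rate(times, 5, 25))  # shallow part of the line
```

A steep segment of the record corresponds to a high response rate and a flat segment to no responding at all.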
Skinner primarily used rats and pigeons, which he isolated to observe their responses while learning a targeted behavior. For example, a rat may be conditioned to press a lever attached to a feeding tube to receive a food pellet. At first, it touches the lever by accident, but later it learns that pressing the lever is rewarded with food pellets. The rat learns a new behavior through reinforcement. Skinner discovered with these experiments that the response rate depended on what occurred after the behavior was performed. He called these behaviors operant behaviors and the process of reinforcement operant conditioning.
What is the Difference Between Classical and Operant Conditioning?
Ivan Pavlov’s experiment with dogs is a famous demonstration of classical conditioning. Classical conditioning refers to involuntary responses that result from experiences that occur before a response. By repeatedly pairing a sound, the ringing of a bell, with giving the dogs food, the dogs learned to associate the two. Pavlov noticed the dogs began to salivate when hearing the sound.
Classical conditioning pairs a neutral stimulus, the sound of a bell, with an unconditioned stimulus, the taste of food. The latter automatically triggers salivation, an unconditioned response. With repetition, the sound becomes a conditioned stimulus evoking a conditioned response of salivating.
In classical conditioning, the test subject learns the association between two stimuli, resulting in a reflexive response. In operant conditioning, the test subject learns the association between a behavior and its consequences. Their behavioral response is strengthened or weakened through reinforcement.
In Pavlov’s experiment, the dogs started salivating before they received the food. In the Skinner box rat experiment, the rat’s behavior changed because of what happened after the behavior. By learning the target behavior, the rat pressed the right lever to receive food, resulting in a positive response.
What Are The 4 Types of Operant Conditioning?
The four types of operant conditioning combine two choices: a stimulus is either added (positive) or removed (negative), and the behavior is either strengthened (reinforcement) or weakened (punishment).
Positive reinforcement strengthens a behavior with positive rewards or conditioning. For example, giving a pet a treat to teach them to sit or stay, praising a student or giving them a star for completing their homework, or an employee receiving a performance bonus for exceeding a target.
Negative reinforcement strengthens a behavior by removing a negative stimulus once the desired behavior occurs. For example, shortening a prison sentence for good behavior, keeping a child’s favorite toy away until they make their bed, or withholding final payment until the job is completed to the client’s satisfaction. The prisoner learns that good behavior shortens prison time. A child who doesn’t make their bed learns to change their behavior and do the chores to get their favorite toy back.
Positive Punishment / Punishment
Positive punishment weakens a behavior by adding a negative stimulus or conditioning. For example, scolding a child when they misbehave, reprimanding a student who talks during class, or giving a laboratory rat an electric shock when it pushes a lever.
Negative Punishment / Extinction
Negative punishment weakens a behavior by removing a positive stimulus. For example, ignoring a screaming child withdraws the attention the child wants, so the child screams less in the future. Other examples are ending playtime with a cat when it jumps on the food counter, or docking an employee’s pay for being late so that lateness becomes less frequent.
Questions one could ask to assess what type of operant conditioning was used:
- What behavior changed?
- Was the behavior strengthened or weakened?
- What were the consequences that followed the behavior?
- Were the consequences added or removed?
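The last two questions are enough to name the type, and that decision can be written as a small lookup. The sketch below is illustrative only: the function name `classify_consequence` and its string arguments are hypothetical labels chosen for this example.

```python
def classify_consequence(stimulus_change, behavior_effect):
    """Name the type of operant conditioning.

    stimulus_change: "added" or "removed" (was a consequence added or
        taken away after the behavior?)
    behavior_effect: "strengthened" or "weakened" (did the behavior
        become more or less frequent?)
    """
    types = {
        ("added", "strengthened"): "positive reinforcement",
        ("removed", "strengthened"): "negative reinforcement",
        ("added", "weakened"): "positive punishment",
        ("removed", "weakened"): "negative punishment",
    }
    key = (stimulus_change, behavior_effect)
    if key not in types:
        raise ValueError(f"unrecognized combination: {key}")
    return types[key]


# A pet gets a treat (added) and sits more often (strengthened).
print(classify_consequence("added", "strengthened"))  # positive reinforcement
# A student is scolded (added) and talks less in class (weakened).
print(classify_consequence("added", "weakened"))      # positive punishment
```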
What Are The Steps In Operant Conditioning?
Complex behaviors are learned through shaping that happens with step-by-step reinforcement. Initially, reinforcement occurs during the learning of the first part of the behavior. When the first part of the behavior is mastered, shaping moves to the next part.
During the next step, reinforcement no longer happens with the first part but has moved forward to occur with the second part of the behavior. The reinforcement pattern continues in the same way until the final part of the behavior is mastered.
For example, when you teach a child to ride a bike, the child is praised for climbing on the bike. Next, the child is praised for correct posture. As the child progresses through the stages until they ride the bike, they are praised to encourage them to take the next step.
The four steps of shaping are summarized as:
- At step 1, you reinforce any response that resembles the target behavior.
- At step 2, you reinforce a response that resembles the target behavior more closely. The response from step 1 is no longer reinforced because it has been mastered.
- At step 3, a closer and closer approximation of the target behavior is reinforced. Responses that have been mastered are no longer reinforced.
- At the final step, only the target behavior is reinforced until mastered.
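The steps above can be sketched as a small program that reinforces only the current approximation and moves on once it is mastered. This is a simplified illustration, not a model of real learning: the class name `ShapingProgram`, the `mastery_count` parameter, and the rule that mastery means a fixed number of reinforced responses are all assumptions made for the example. The approximation labels follow the bike example.

```python
class ShapingProgram:
    """Reinforce successive approximations of a target behavior."""

    def __init__(self, approximations, mastery_count=3):
        # Ordered from the roughest approximation to the target behavior.
        self.approximations = list(approximations)
        # Assumption: a step counts as mastered after this many
        # reinforced responses.
        self.mastery_count = mastery_count
        self.step = 0
        self.reinforced = 0

    @property
    def current_goal(self):
        return self.approximations[self.step]

    def observe(self, response):
        """Return True if the response is reinforced."""
        if response != self.current_goal:
            # Earlier, already mastered approximations are no longer
            # reinforced, and later ones are not expected yet.
            return False
        self.reinforced += 1
        is_last_step = self.step == len(self.approximations) - 1
        if self.reinforced >= self.mastery_count and not is_last_step:
            # Mastered: move the reinforcement to the next approximation.
            self.step += 1
            self.reinforced = 0
        return True


program = ShapingProgram(
    ["climbs on the bike", "sits with correct posture", "rides the bike"],
    mastery_count=2,
)
for response in ["climbs on the bike", "climbs on the bike",
                 "climbs on the bike", "sits with correct posture"]:
    print(response, "->", "praise" if program.observe(response) else "no praise")
```

In this run the third "climbs on the bike" earns no praise, because the program has already moved on to the next approximation.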
Shaping is a common teaching and training method used for learning new behaviors in humans and animals. It is used for training dogs, and it is a familiar technique that parents and teachers may use in human learning.
The schedule of reinforcement controls the timing and frequency of reinforcement used to bring about a target behavior. Different types of reinforcement schedules affect the rate of learning differently.
Continuous and Intermittent Schedules
- Continuous schedules reward the learner or test subject every time they produce the target behavior. Learning happens fast, but the behavior fades just as quickly when the reinforcement stops.
- Intermittent or partial schedules reward only some responses, either after intervals of time or after a certain ratio of responses.
Fixed-Ratio and Variable-Ratio Schedules
- Fixed-ratio schedules reward the test subject or learner after a specified number of target behavior responses.
- Variable-ratio schedules vary how many responses are needed before the test subject is rewarded. The response rate stays high even after the rewards stop because the subject cannot tell when the next reward may occur.
Fixed-Interval and Variable-Interval Schedules
- Fixed-interval schedules reward after a specified amount of time has passed. With a fixed ratio, the test subject may be rewarded after every fourth response, whereas with a fixed interval the test subject is rewarded after 15 minutes or an hour.
- Variable-interval schedules vary the amount of time between rewards, whereas variable-ratio schedules vary the number of responses required before a reward.
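The four intermittent schedules can be compared side by side in a short Python sketch. It is a simplified illustration under stated assumptions: the class names, the `respond` method, and the way the variable schedules draw their requirements (uniformly around a mean) are choices made for this example. The interval schedules here reward the first response made after the required time has passed, which is the usual laboratory convention.

```python
import random


class FixedRatio:
    """Reward every n-th response."""

    def __init__(self, n):
        self.n = n
        self.count = 0

    def respond(self, t=None):
        self.count += 1
        if self.count == self.n:
            self.count = 0
            return True
        return False


class VariableRatio:
    """Reward after a varying number of responses that averages `mean`."""

    def __init__(self, mean, seed=0):
        self.mean = mean
        self.rng = random.Random(seed)
        self.count = 0
        self.required = self._draw()

    def _draw(self):
        # Assumption: requirement drawn uniformly from 1 .. 2*mean - 1.
        return self.rng.randint(1, 2 * self.mean - 1)

    def respond(self, t=None):
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self._draw()
            return True
        return False


class FixedInterval:
    """Reward the first response at least `interval` after the last reward."""

    def __init__(self, interval):
        self.interval = interval
        self.last_reward = 0.0

    def respond(self, t):
        if t - self.last_reward >= self.interval:
            self.last_reward = t
            return True
        return False


class VariableInterval:
    """Like FixedInterval, but the wait varies around `mean`."""

    def __init__(self, mean, seed=0):
        self.mean = mean
        self.rng = random.Random(seed)
        self.last_reward = 0.0
        self.wait = self._draw()

    def _draw(self):
        # Assumption: wait drawn uniformly between 0 and 2*mean.
        return self.rng.uniform(0, 2 * self.mean)

    def respond(self, t):
        if t - self.last_reward >= self.wait:
            self.last_reward = t
            self.wait = self._draw()
            return True
        return False


# Reward after every fourth response, as in the fixed-ratio example.
fr = FixedRatio(4)
print([fr.respond() for _ in range(8)])

# Reward the first response after 15 minutes, whatever the response count.
fi = FixedInterval(15)
print([fi.respond(t) for t in (5, 15, 20, 31)])
```

Ratio schedules count responses and ignore the clock, while interval schedules watch the clock and ignore how many responses were made in between.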
How To Use Operant Conditioning in The Classroom?
Operant conditioning is highly effective in the classroom for managing student behavior. One of its most significant benefits is the immediate feedback students receive, which reinforces or discourages the behavior.
Classroom discipline is a good example of encouraging a wanted behavior through positive or negative reinforcement. The teacher reinforces the required behavior with positive reinforcers such as vocal praise, treats, or symbols like stickers or stamps. The pleasing experience reinforces the desire to repeat the behavior.
Other examples of using positive reinforcement are
- behavioral charts that reward a student for following the rules of the class
- praise that reinforces good behavior and shows the rest of the class what is expected.
Punishment, in contrast, lowers the rate of an undesired behavior, and the rest of the class becomes less keen to behave in the same way. Examples of punishment are
- being ignored by the teacher after shouting out the answer instead of raising a hand
- being separated from a friend for talking in class
- getting detention due to misbehavior
Shaping is another tool a teacher may use to encourage certain behaviors. For example, if the teacher wants to inspire students to participate by answering questions, they can use shaping. Initially, the teacher praises every attempt to answer a question to encourage participation. Gradually the teacher narrows the praise to correct answers only.
The disadvantage of operant conditioning in the classroom is that it is an extrinsic motivator, which isn’t as lasting as intrinsic motivation. Student behavior management gets better results when intrinsic motivation is combined with extrinsic motivation.