Unit 3.8: Operant Conditioning

AP Psychology | Unit 3: Development and Learning

🎯 Exam Focus

Operant conditioning is learning through consequences: behaviors increase with reinforcement and decrease with punishment. Master the four key types (positive reinforcement, negative reinforcement, positive punishment, negative punishment), understand B.F. Skinner's contributions and the Skinner Box, know Thorndike's Law of Effect, memorize all schedules of reinforcement (fixed ratio, variable ratio, fixed interval, variable interval) with their resistance to extinction, understand shaping and successive approximations, distinguish primary from secondary reinforcers, know extinction and spontaneous recovery, and be able to tell operant conditioning apart from classical conditioning. This major topic appears frequently on both multiple-choice and FRQ sections. You must distinguish between reinforcement (increases behavior) and punishment (decreases behavior), and between positive (adding a stimulus) and negative (removing a stimulus).

📚 What is Operant Conditioning?

Operant conditioning (also called instrumental conditioning) is a type of learning in which behavior is strengthened if followed by reinforcement or weakened if followed by punishment.

Unlike classical conditioning, where responses are elicited automatically, operant conditioning involves voluntary behaviors that the organism performs on its environment. The organism actively "operates" on its surroundings to produce consequences, and those consequences affect whether the behavior will be repeated.

The key principle: Behaviors followed by satisfying consequences tend to be repeated; behaviors followed by unpleasant consequences tend not to be repeated. This is the foundation of how we learn from experience and how behaviors are shaped over time.

🔬 Key Researchers

Edward Thorndike: Law of Effect

Edward Thorndike pioneered research on instrumental learning through his puzzle box experiments with cats.

The Puzzle Box Experiment:

Hungry cat placed in wooden puzzle box with food visible outside
Cat had to perform specific action (pull string, press lever) to escape
Initially, cat tried random behaviors
Eventually, cat accidentally performed correct action and escaped
With repeated trials, cat escaped faster and faster
Cat learned through trial and error which behaviors led to desired outcome

⚖️ Law of Effect

The Law of Effect states: Behaviors followed by satisfying consequences are more likely to be repeated; behaviors followed by unpleasant consequences are less likely to be repeated. This principle laid the foundation for understanding how consequences shape behavior.
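
The Law of Effect can be pictured as a simple update rule. The sketch below is a toy model only, not Thorndike's own formulation: the function name apply_law_of_effect, the numeric "strengths," the step size, and the choice to treat staying trapped as an unpleasant consequence are all assumptions made for illustration.

```python
def apply_law_of_effect(strengths, behavior, satisfying, step=1):
    """Return new response strengths after one trial.

    A satisfying consequence strengthens the behavior that preceded it;
    an unpleasant one weakens it (never below zero).
    """
    updated = dict(strengths)
    change = step if satisfying else -step
    updated[behavior] = max(0, updated[behavior] + change)
    return updated


# Hypothetical starting strengths for a cat in the puzzle box.
strengths = {"scratch walls": 3, "meow": 3, "pull string": 3}

# Assumption: pulling the string opens the box (satisfying), while
# scratching leaves the cat trapped (treated here as unpleasant).
for _ in range(3):
    strengths = apply_law_of_effect(strengths, "pull string", satisfying=True)
    strengths = apply_law_of_effect(strengths, "scratch walls", satisfying=False)

print(strengths)  # {'scratch walls': 0, 'meow': 3, 'pull string': 6}
```

After a few trials the effective behavior dominates, which mirrors the cat escaping faster and faster.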

B.F. Skinner: The Operant Chamber (Skinner Box)

B.F. Skinner expanded on Thorndike's work and is considered the father of operant conditioning. He developed the operant chamber (commonly called the Skinner Box) to study animal behavior systematically.

The Skinner Box:

Enclosed chamber with lever (rats) or disc (pigeons)
Food dispenser activated by pressing lever/pecking disc
Could deliver reinforcers (food pellets) or punishers (mild shocks)
Recording device tracked responses automatically
Allowed precise control over environment and measurement of behavior

Negative Punishment Examples:

Teenager breaks curfew → Loses car privileges → Breaks curfew less
Child misbehaves → Favorite toy taken away → Misbehaves less
Student talks back → Loses recess time → Talks back less
Employee arrives late → Bonus docked → Arrives late less

Also called: Response cost or omission training — taking away something good to reduce unwanted behavior.

📊 Quick Reference Table

Type	Action	Effect on Behavior	Example
Positive Reinforcement	ADD pleasant	INCREASES	Get praise for studying
Negative Reinforcement	REMOVE unpleasant	INCREASES	Aspirin stops headache
Positive Punishment	ADD unpleasant	DECREASES	Get scolded for talking
Negative Punishment	REMOVE pleasant	DECREASES	Lose phone for breaking rule
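
The table boils down to two questions: was a stimulus added or removed, and did the behavior increase or decrease? A minimal sketch of that decision rule follows. The function name classify_consequence and its string arguments are hypothetical and chosen only for this example.

```python
def classify_consequence(stimulus_change, behavior_change):
    """Label a consequence using the two questions from the table.

    stimulus_change: "add" or "remove" (was a stimulus added or taken away?)
    behavior_change: "increase" or "decrease" (what happens to the behavior?)
    """
    sign = {"add": "Positive", "remove": "Negative"}[stimulus_change]
    kind = {"increase": "Reinforcement", "decrease": "Punishment"}[behavior_change]
    return f"{sign} {kind}"


# Aspirin removes a headache, so taking aspirin increases.
print(classify_consequence("remove", "increase"))  # Negative Reinforcement

# Phone is taken away, so rule-breaking decreases.
print(classify_consequence("remove", "decrease"))  # Negative Punishment
```

Note that "positive" and "negative" never mean good or bad here; they only describe whether something was added or removed.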

🎁 Types of Reinforcers

Primary Reinforcers

Innate, biological needs that are naturally reinforcing

Characteristics:

No learning required
Satisfy biological needs
Universally reinforcing

Examples: Food, water, sleep, warmth, sex

Secondary Reinforcers

Learned through association with primary reinforcers

Characteristics:

Acquired value through conditioning
Not inherently reinforcing
Can vary by culture/individual

Examples: Money, grades, praise, tokens, trophies

⏰ Schedules of Reinforcement

Schedules of reinforcement determine when and how often a behavior is reinforced. Different schedules produce different patterns of responding and different resistance to extinction.

Continuous Reinforcement

Behavior is reinforced every single time it occurs.

Best for establishing new behaviors quickly
Fastest acquisition
But: rapid extinction when reinforcement stops

Example: Vending machine — insert money, get snack every time

Partial (Intermittent) Reinforcement

Behavior is reinforced only some of the time. More resistant to extinction than continuous reinforcement.

Partial Reinforcement Effect: Behaviors learned through partial reinforcement are more persistent and harder to extinguish than behaviors learned through continuous reinforcement.

Four Types of Partial Reinforcement Schedules

1. Fixed-Ratio (FR) Schedule

Reinforcement after a fixed number of responses. Predictable ratio.

Characteristics:

Produces high, steady rate of responding
Brief pause after reinforcement (post-reinforcement pause)
Higher ratio = more responses per reinforcer
Moderate resistance to extinction

Examples: FR-5 means reinforcement after every 5 responses. Piecework pay (paid per item produced), frequent flyer programs (free flight after X miles), buy 10 get 1 free punch cards
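
A fixed-ratio rule is easy to state as a counting check. The sketch below is illustrative only; the function name fixed_ratio_reinforced is an assumption, and responses are simply numbered 1, 2, 3, and so on.

```python
def fixed_ratio_reinforced(response_count, ratio):
    """Return True if the response with this 1-based count earns a reinforcer."""
    return response_count % ratio == 0


# FR-5: which of the first 15 responses are reinforced?
reinforced = [n for n in range(1, 16) if fixed_ratio_reinforced(n, 5)]
print(reinforced)  # [5, 10, 15]
```

Because the payoff always lands on a predictable count, the organism can "rest" briefly after each reinforcer, which matches the post-reinforcement pause.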

2. Variable-Ratio (VR) Schedule ⭐ HIGHEST RESISTANCE

Reinforcement after an unpredictable number of responses. The required number varies from one reinforcer to the next around a set average.

Characteristics:

Produces highest, most consistent rate of responding
No post-reinforcement pause (can't predict when next reinforcer comes)
MOST resistant to extinction
Organism keeps responding because "next response might be the one"

Examples: VR-10 means reinforcement after average of 10 responses (could be 5, then 15, then 8). Gambling/slot machines (most addictive schedule!), lottery tickets, fishing (unpredictable catches), sales commissions with variable success
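
To see how a variable-ratio requirement can change every time while still averaging out, here is a small simulation sketch. The function name variable_ratio_requirements, the seed, and the choice of a uniform distribution are assumptions for illustration; real experiments may generate the requirements differently.

```python
import random


def variable_ratio_requirements(mean_ratio, n_reinforcers, seed=0):
    """Draw how many responses each successive reinforcer requires.

    Requirements are uniform on 1..(2 * mean_ratio - 1), so they average
    mean_ratio in the long run (an illustrative choice, not a standard).
    """
    rng = random.Random(seed)
    return [rng.randint(1, 2 * mean_ratio - 1) for _ in range(n_reinforcers)]


# VR-10: simulate the requirement for 10,000 reinforcers.
requirements = variable_ratio_requirements(mean_ratio=10, n_reinforcers=10_000)
print(requirements[:5])                        # differs from reinforcer to reinforcer
print(sum(requirements) / len(requirements))   # close to 10
```

No single requirement is predictable, so there is never a "safe" moment to pause, yet the long-run cost per reinforcer stays near the schedule's average.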

3. Fixed-Interval (FI) Schedule

Reinforcement for first response after a fixed time period has elapsed. Predictable time.

Characteristics:

Produces "scalloped" response pattern
Low responding right after reinforcement
Responding increases as time approaches next reinforcement
Lower resistance to extinction

Examples: FI-60 seconds means first response after 60 seconds gets reinforced. Checking the mailbox (mail delivered once per day at a fixed time), studying pattern before exams (cramming right before test), weekly paychecks
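
The fixed-interval rule depends on a clock rather than a count. The following sketch assumes the timer starts at time 0 and restarts whenever a reinforcer is delivered; the function name fixed_interval_reinforced and the sample response times are hypothetical.

```python
def fixed_interval_reinforced(response_times, interval):
    """Return the response times (in seconds) that earn a reinforcer.

    Only the first response made at least `interval` seconds after the
    previous reinforcer is reinforced; earlier responses earn nothing.
    """
    reinforced = []
    last_reinforcer = 0  # assumption: the timer starts at t = 0
    for t in sorted(response_times):
        if t - last_reinforcer >= interval:
            reinforced.append(t)
            last_reinforcer = t  # timer restarts at reinforcer delivery
    return reinforced


# FI-60: responses made at these times (seconds).
print(fixed_interval_reinforced([10, 30, 61, 65, 100, 125, 190], 60))
# [61, 125, 190]
```

Responses made early in the interval (10, 30, 65, 100) are wasted effort, which is why responding tends to bunch up near the end of each interval and produce the scalloped pattern.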

4. Variable-Interval (VI) Schedule

Reinforcement for first response after an unpredictable time period. The length of the interval varies from one reinforcer to the next around a set average.

Characteristics:

Produces steady, moderate rate of responding
No post-reinforcement pause
High resistance to extinction (but less than VR)
Can't predict when next reinforcer available, so keep checking

Examples: VI-30 seconds means reinforcement available after average of 30 seconds (could be 15, then 45, then 20). Pop quizzes (unpredictable timing → study consistently), checking social media notifications, fishing in unpredictable waters, surprise inspections
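
A variable-interval schedule can be sketched the same way as a clock-based rule, except that each waiting period is drawn at random. This is a rough illustration under stated assumptions: the function name variable_interval_reinforced, the seed, and the uniform distribution of waiting times are all made up for the example.

```python
import random


def variable_interval_reinforced(response_times, mean_interval, seed=0):
    """Return the response times (in seconds) that earn a reinforcer.

    Each waiting period is drawn uniformly from 0..(2 * mean_interval),
    so waits average mean_interval (an illustrative choice).
    """
    rng = random.Random(seed)
    reinforced = []
    # Time at which the next reinforcer becomes available.
    available_at = rng.uniform(0, 2 * mean_interval)
    for t in sorted(response_times):
        if t >= available_at:
            reinforced.append(t)
            available_at = t + rng.uniform(0, 2 * mean_interval)
    return reinforced


# Steady responding every 5 seconds for 10 minutes on a VI-30 schedule.
responses = list(range(5, 601, 5))
earned = variable_interval_reinforced(responses, mean_interval=30)
print(len(earned), "reinforcers earned from", len(responses), "responses")
```

Responding faster would not earn many more reinforcers because availability is governed by elapsed time, but checking steadily ensures none are missed for long. That fits the steady, moderate response rate described above.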

📊 Resistance to Extinction Ranking

Most Resistant: Variable-Ratio (VR) — gambling, slot machines
High Resistance: Variable-Interval (VI) — pop quizzes
Moderate: Fixed-Ratio (FR) — piecework pay
Least Resistant: Fixed-Interval (FI) — weekly paychecks
Fastest Extinction: Continuous Reinforcement

🎨 Shaping

What is Shaping?

Shaping is the process of reinforcing successive approximations (gradual steps) toward a desired behavior. Used when the target behavior is unlikely to occur spontaneously.

How It Works:

Identify target behavior (final goal)
Identify starting behavior (what organism already does)
Break down into small, achievable steps
Reinforce each successive approximation
Gradually raise criteria for reinforcement
Eventually reach target behavior

Example: Teaching Rat to Press Lever

Step 1: Reinforce rat for facing the lever
Step 2: Reinforce for moving toward the lever
Step 3: Reinforce for touching lever area
Step 4: Reinforce for touching lever
Step 5: Reinforce only for pressing lever
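
The rat example can be sketched as a loop that raises the criterion after each reinforced approximation. This is a simplified illustration: the names STEPS and shape are hypothetical, and raising the criterion after a single reinforcement is an assumption (a real trainer would usually reinforce each step several times first).

```python
# Approximations ordered from starting behavior to target behavior
# (taken from the rat example above).
STEPS = [
    "faces lever",
    "moves toward lever",
    "touches lever area",
    "touches lever",
    "presses lever",
]


def shape(observed_behaviors, steps=STEPS):
    """Reinforce a behavior only if it meets the current criterion.

    After each reinforced behavior the criterion moves one step past it,
    so earlier approximations stop paying off. Returns a log of
    (behavior, reinforced) pairs.
    """
    criterion = 0  # index of the approximation currently required
    log = []
    for behavior in observed_behaviors:
        level = steps.index(behavior)
        reinforced = level >= criterion
        log.append((behavior, reinforced))
        if reinforced:
            # Raise the bar, but never past the target behavior itself.
            criterion = min(level + 1, len(steps) - 1)
    return log


session = [
    "faces lever",
    "faces lever",          # no longer good enough
    "moves toward lever",
    "touches lever area",
    "moves toward lever",   # a step backward earns nothing
    "touches lever",
    "presses lever",
    "presses lever",        # target behavior keeps being reinforced
]
for behavior, reinforced in shape(session):
    print(f"{behavior:20} -> {'reinforced' if reinforced else 'not reinforced'}")
```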

Real-World Applications: Teaching children new skills (toilet training, riding bike), animal training, therapy for developmental disabilities, athletic coaching

🔧 Additional Key Concepts

Extinction

Extinction in operant conditioning occurs when a previously reinforced behavior is no longer reinforced, leading to a decrease in that behavior.

Stop reinforcing behavior → behavior gradually decreases
May see extinction burst (temporary increase before decrease)
Spontaneous recovery can occur after rest period

Example: If vending machine stops giving snacks when you insert money, you'll eventually stop putting money in.

Discrimination

Discrimination is learning to respond differently to different stimuli based on which behaviors are reinforced in which situations.

Discriminative stimulus: Signal that indicates whether a behavior will be reinforced
Organism learns "when" and "where" behavior will be reinforced

Example: Asking parent for candy when they're in good mood (discriminative stimulus) vs. bad mood

Generalization

Generalization occurs when a behavior reinforced in one situation is also performed in similar situations.

Example: Child learns saying "please" gets treats from parents, then says "please" to other adults expecting same result

💡 Real-World Applications

Token Economy

A behavior modification system using secondary reinforcers (tokens) that can be exchanged for primary reinforcers or privileges.

Used in classrooms, psychiatric hospitals, prisons
Earn tokens for desired behaviors
Exchange tokens for rewards (candy, privileges, activities)
Effective for managing groups and teaching new behaviors
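
The earn-then-exchange cycle of a token economy can be sketched as a small ledger. Everything specific here is hypothetical: the class name TokenEconomy, the behaviors, the token values, and the reward prices are invented for illustration.

```python
class TokenEconomy:
    """Track tokens (secondary reinforcers) earned and spent."""

    def __init__(self, earn_rules, prices):
        self.earn_rules = earn_rules  # behavior -> tokens earned
        self.prices = prices          # reward -> token cost
        self.balances = {}            # person -> current token count

    def record_behavior(self, person, behavior):
        """Award tokens for a desired behavior; return the new balance."""
        tokens = self.earn_rules.get(behavior, 0)
        self.balances[person] = self.balances.get(person, 0) + tokens
        return self.balances[person]

    def exchange(self, person, reward):
        """Spend tokens on a reward; return True if the person could afford it."""
        cost = self.prices[reward]
        if self.balances.get(person, 0) < cost:
            return False
        self.balances[person] -= cost
        return True


# Hypothetical classroom setup.
classroom = TokenEconomy(
    earn_rules={"turned in homework": 2, "helped classmate": 1},
    prices={"extra recess": 3, "sticker": 1},
)
classroom.record_behavior("Ava", "turned in homework")
classroom.record_behavior("Ava", "helped classmate")
print(classroom.exchange("Ava", "extra recess"))  # True
print(classroom.balances["Ava"])                  # 0
```

The tokens have no value on their own; they work only because they can be traded for something the person already wants.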

Other Applications

Parenting: Time-outs (negative punishment), praise for good behavior (positive reinforcement)
Education: Grades, gold stars, extra credit, privileges
Workplace: Bonuses, promotions, performance reviews
Animal training: Clicker training, treats for tricks
Therapy: Behavior modification for developmental disabilities, addiction treatment
Video games: Points, achievements, level-ups (randomized rewards resemble variable-ratio reinforcement, which helps explain why players keep playing)

⚖️ Classical vs. Operant Conditioning

Feature	Classical Conditioning	Operant Conditioning
Type of Behavior	Involuntary, automatic (reflexes)	Voluntary, active
Focus	Association between TWO STIMULI	Association between BEHAVIOR and CONSEQUENCE
Response	Elicited (drawn out automatically)	Emitted (voluntarily performed)
Timing	Neutral stimulus (NS) presented just before or with unconditioned stimulus (UCS)	Consequence after behavior
Key Researcher	Ivan Pavlov	B.F. Skinner
Example	Bell → salivation (learned)	Press lever → get food (learned)

📝 AP Exam Strategy

Multiple Choice Tips

Master the four types: Know whether it's reinforcement (increase) or punishment (decrease), and whether adding or removing stimulus
Memorize schedules: FR, VR, FI, VI — know which produces highest response rate (VR) and most resistance to extinction (VR)
Identify scenarios: Given example, label as positive/negative reinforcement/punishment
Distinguish from classical: Operant = voluntary behavior with consequences; Classical = involuntary stimulus association
Know Skinner's contributions: Operant chamber, radical behaviorism, schedules of reinforcement
Understand shaping: Successive approximations toward target behavior

Free Response Question (FRQ) Tips

Label precisely: Don't just say "reinforcement" — specify positive or negative reinforcement
Explain WHY behavior increases/decreases: Connect consequence to behavioral outcome
Use correct terminology: "Positive punishment" not "positive consequence"
Apply schedules correctly: If asked about gambling, identify as variable-ratio and explain why resistant to extinction
Show understanding: Explain mechanism, not just memorize definitions
Compare when asked: Clearly distinguish operant from classical conditioning

✨ Quick Review Summary

🔑 The Big Picture

Operant conditioning is learning through consequences: voluntary behaviors increase with reinforcement and decrease with punishment. Thorndike's Law of Effect: behaviors followed by satisfying consequences are repeated. B.F. Skinner studied this systematically using the Skinner Box. Four key types: positive reinforcement (add pleasant, increase behavior), negative reinforcement (remove unpleasant, increase behavior), positive punishment (add unpleasant, decrease behavior), negative punishment (remove pleasant, decrease behavior). Primary reinforcers (biological needs: food, water) vs. secondary reinforcers (learned value: money, grades). Schedules of reinforcement: continuous (every time), fixed-ratio (fixed number of responses), variable-ratio (unpredictable number of responses; most resistant to extinction, e.g., gambling), fixed-interval (fixed time; scalloped pattern), variable-interval (unpredictable time; steady responding). Shaping uses successive approximations to teach complex behaviors. Extinction occurs when reinforcement stops. Differs from classical: operant = voluntary behavior with consequences; classical = involuntary stimulus association. Applications: token economy, parenting, education, animal training, therapy.

💡 Essential Concepts

Operant conditioning
Edward Thorndike
Law of Effect
B.F. Skinner
Skinner Box
Positive reinforcement
Negative reinforcement
Positive punishment
Negative punishment
Primary reinforcers
Secondary reinforcers
Continuous reinforcement
Partial reinforcement
Fixed-Ratio (FR)
Variable-Ratio (VR)
Fixed-Interval (FI)
Variable-Interval (VI)
Shaping
Successive approximations
Extinction
Discrimination
Generalization
Discriminative stimulus
Token economy
Behavior modification
