Unit 3.8: Operant Conditioning
AP Psychology | Unit 3: Development and Learning
Exam Focus
Operant conditioning is learning through consequences: behaviors increase with reinforcement and decrease with punishment. Master the four key types (positive reinforcement, negative reinforcement, positive punishment, negative punishment), understand B.F. Skinner's contributions and the Skinner Box, and know Thorndike's Law of Effect. Memorize all schedules of reinforcement (fixed ratio, variable ratio, fixed interval, variable interval) along with their resistance to extinction. Understand shaping and successive approximations, distinguish primary from secondary reinforcers, know extinction and spontaneous recovery, and be able to tell operant from classical conditioning. This major topic appears frequently on both the multiple-choice and FRQ sections. You must distinguish between reinforcement (increases behavior) and punishment (decreases behavior), and between positive (adding a stimulus) and negative (removing a stimulus).
What is Operant Conditioning?
Operant conditioning (also called instrumental conditioning) is a type of learning in which behavior is strengthened if followed by reinforcement or weakened if followed by punishment.
Unlike classical conditioning where responses are elicited automatically, operant conditioning involves voluntary behaviors that are "operated" or performed on the environment. The organism actively "operates" on its surroundings to produce consequences that affect whether the behavior will be repeated.
The key principle: Behaviors followed by satisfying consequences tend to be repeated; behaviors followed by unpleasant consequences tend not to be repeated. This is the foundation of how we learn from experience and how behaviors are shaped over time.
Key Researchers
Edward Thorndike: Law of Effect
Edward Thorndike pioneered research on instrumental learning through his puzzle box experiments with cats.
The Puzzle Box Experiment:
- Hungry cat placed in wooden puzzle box with food visible outside
- Cat had to perform specific action (pull string, press lever) to escape
- Initially, cat tried random behaviors
- Eventually, cat accidentally performed correct action and escaped
- With repeated trials, cat escaped faster and faster
- Cat learned through trial and error which behaviors led to desired outcome
Law of Effect
The Law of Effect states: Behaviors followed by satisfying consequences are more likely to be repeated; behaviors followed by unpleasant consequences are less likely to be repeated. This principle laid the foundation for understanding how consequences shape behavior.
B.F. Skinner: The Operant Chamber (Skinner Box)
B.F. Skinner expanded on Thorndike's work and is considered the father of operant conditioning. He developed the operant chamber (commonly called the Skinner Box) to study animal behavior systematically.
The Skinner Box:
- Enclosed chamber with lever (rats) or disc (pigeons)
- Food dispenser activated by pressing lever/pecking disc
- Could deliver reinforcers (food pellets) or punishers (mild shocks)
- Recording device tracked responses automatically
- Allowed precise control over environment and measurement of behavior
Skinner's Contribution: Showed that behavior is largely determined by its consequences. He advocated radical behaviorism: the view that internal mental states do not need to be studied, and that psychology should focus only on observable behavior and its environmental consequences.
Four Types of Consequences
Understanding these four types is CRITICAL for the AP exam. The key is to understand two dimensions: (1) Does it increase or decrease behavior? (2) Is something added or removed?
Memory Framework
REINFORCEMENT = INCREASES behavior
PUNISHMENT = DECREASES behavior
POSITIVE = ADDING something
NEGATIVE = REMOVING something
1. Positive Reinforcement
ADDING a pleasant stimulus to INCREASE behavior.
Formula:
Behavior → Add Pleasant Consequence → Behavior Increases
Examples:
- Student studies hard → Receives praise from teacher → Studies more
- Rat presses lever → Gets food pellet → Presses lever more
- Employee works extra hours → Gets bonus → Works extra hours more often
- Child cleans room → Gets allowance → Cleans room more regularly
2. Negative Reinforcement
REMOVING an unpleasant stimulus to INCREASE behavior.
Formula:
Behavior → Remove Unpleasant Consequence → Behavior Increases
Examples:
- Taking aspirin → Headache goes away → Take aspirin more when headache occurs
- Rat presses lever → Electric shock stops → Presses lever more to escape shock
- Fastening seatbelt → Annoying beeping stops → Fasten seatbelt more regularly
- Student submits homework on time → Teacher stops nagging → Submits on time more
Common Confusion:
Negative reinforcement is NOT punishment! "Negative" means removing/subtracting, not "bad." The behavior still INCREASES because something unpleasant is taken away (which feels good).
3. Positive Punishment
ADDING an unpleasant stimulus to DECREASE behavior.
Formula:
Behavior → Add Unpleasant Consequence → Behavior Decreases
Examples:
- Child misbehaves → Gets spanked/scolded → Misbehaves less
- Speeding → Get traffic ticket → Speed less
- Talking in class → Teacher gives detention → Talk less
- Dog jumps on guests → Owner says "No!" sharply → Jumps less
4. Negative Punishment
REMOVING a pleasant stimulus to DECREASE behavior.
Formula:
Behavior → Remove Pleasant Consequence → Behavior Decreases
Examples:
- Teenager breaks curfew → Loses car privileges → Breaks curfew less
- Child misbehaves → Favorite toy taken away → Misbehaves less
- Student talks back → Loses recess time → Talks back less
- Employee arrives late → Bonus docked → Arrives late less
Also called response cost or omission training: taking away something good to reduce unwanted behavior.
Quick Reference Table
| Type | Action | Effect on Behavior | Example |
|---|---|---|---|
| Positive Reinforcement | ADD pleasant | INCREASES | Get praise for studying |
| Negative Reinforcement | REMOVE unpleasant | INCREASES | Aspirin stops headache |
| Positive Punishment | ADD unpleasant | DECREASES | Get scolded for talking |
| Negative Punishment | REMOVE pleasant | DECREASES | Lose phone for breaking rule |
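The two questions behind the table (is a stimulus added or removed, and does the behavior increase or decrease?) can be sketched as a small lookup. This is only an illustrative Python sketch: the function name `classify_consequence` and the string labels are assumptions made for this example, not standard terminology beyond the four type names themselves.

```python
def classify_consequence(stimulus_change: str, behavior_change: str) -> str:
    """Label a consequence using the two questions from the framework.

    stimulus_change: "added" or "removed"        (hypothetical labels)
    behavior_change: "increases" or "decreases"  (hypothetical labels)
    """
    # Positive = something is added; negative = something is removed.
    sign = {"added": "positive", "removed": "negative"}[stimulus_change]
    # Reinforcement = behavior increases; punishment = behavior decreases.
    kind = {"increases": "reinforcement", "decreases": "punishment"}[behavior_change]
    return f"{sign} {kind}"


# The four table rows, expressed as (description, stimulus change, behavior change).
examples = [
    ("praise given for studying", "added", "increases"),
    ("headache ends after aspirin", "removed", "increases"),
    ("scolding for talking in class", "added", "decreases"),
    ("phone taken away for breaking a rule", "removed", "decreases"),
]
for description, stimulus_change, behavior_change in examples:
    print(f"{description}: {classify_consequence(stimulus_change, behavior_change)}")
```

Note that the label depends only on what happens to the stimulus and to the behavior, never on whether the consequence seems "good" or "bad", which is the usual source of confusion about negative reinforcement.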
Types of Reinforcers
Primary Reinforcers
Stimuli that satisfy innate biological needs and are naturally reinforcing
Characteristics:
- No learning required
- Satisfy biological needs
- Universally reinforcing
Examples: Food, water, sleep, warmth, sex
Secondary Reinforcers
Learned through association with primary reinforcers
Characteristics:
- Acquired value through conditioning
- Not inherently reinforcing
- Can vary by culture/individual
Examples: Money, grades, praise, tokens, trophies
Schedules of Reinforcement
Schedules of reinforcement determine when and how often a behavior is reinforced. Different schedules produce different patterns of responding and different resistance to extinction.
Continuous Reinforcement
Behavior is reinforced every single time it occurs.
- Best for establishing new behaviors quickly
- Fastest acquisition
- But: rapid extinction when reinforcement stops
Example: Vending machine (insert money, get a snack every time)
Partial (Intermittent) Reinforcement
Behavior is reinforced only some of the time. More resistant to extinction than continuous reinforcement.
Partial Reinforcement Effect: Behaviors learned through partial reinforcement are more persistent and harder to extinguish than behaviors learned through continuous reinforcement.
Four Types of Partial Reinforcement Schedules
1. Fixed-Ratio (FR) Schedule
Reinforcement after a fixed number of responses. Predictable ratio.
Characteristics:
- Produces high, steady rate of responding
- Brief pause after reinforcement (post-reinforcement pause)
- Higher ratio = more responses per reinforcer
- Moderate resistance to extinction
Examples: FR-5 means reinforcement after every 5 responses. Piecework pay (paid per item produced), frequent flyer programs (free flight after X miles), buy 10 get 1 free punch cards
2. Variable-Ratio (VR) Schedule: HIGHEST RESISTANCE
Reinforcement after an unpredictable number of responses. The required number varies from one reinforcer to the next around a set average.
Characteristics:
- Produces highest, most consistent rate of responding
- No post-reinforcement pause (can't predict when next reinforcer comes)
- MOST resistant to extinction
- Organism keeps responding because "next response might be the one"
Examples: VR-10 means reinforcement after average of 10 responses (could be 5, then 15, then 8). Gambling/slot machines (most addictive schedule!), lottery tickets, fishing (unpredictable catches), sales commissions with variable success
3. Fixed-Interval (FI) Schedule
Reinforcement for first response after a fixed time period has elapsed. Predictable time.
Characteristics:
- Produces "scalloped" response pattern
- Low responding right after reinforcement
- Responding increases as time approaches next reinforcement
- Lower resistance to extinction
Examples: FI-60 seconds means the first response after 60 seconds gets reinforced. Checking the mailbox when mail is delivered once per day at a fixed time, studying patterns before exams (cramming right before the test), weekly paychecks
4. Variable-Interval (VI) Schedule
Reinforcement for first response after an unpredictable time period. The required interval varies from one reinforcer to the next around a set average.
Characteristics:
- Produces steady, moderate rate of responding
- No post-reinforcement pause
- High resistance to extinction (but less than VR)
- Can't predict when next reinforcer available, so keep checking
Examples: VI-30 seconds means reinforcement available after average of 30 seconds (could be 15, then 45, then 20). Pop quizzes (unpredictable timing, so students study consistently), checking social media notifications, fishing in unpredictable waters, surprise inspections
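The four partial schedules differ only in the rule that decides whether a given response earns a reinforcer. A minimal Python sketch of those rules follows. The class names `RatioSchedule` and `IntervalSchedule`, the `respond` method, and the choice of a uniform distribution for the variable schedules are all assumptions made for illustration; they are not how any particular experiment was run.

```python
import random


class RatioSchedule:
    """Reinforce after a required number of responses.

    variable=False -> fixed ratio (FR): always exactly `mean` responses.
    variable=True  -> variable ratio (VR): the requirement is redrawn after
    each reinforcer, uniformly from 1 .. 2*mean - 1, so it averages `mean`
    (the uniform distribution is an assumption of this sketch).
    """

    def __init__(self, mean, variable=False, rng=None):
        self.mean = mean
        self.variable = variable
        self.rng = rng if rng is not None else random.Random()
        self.count = 0
        self.required = self._next_requirement()

    def _next_requirement(self):
        if self.variable:
            return self.rng.randint(1, 2 * self.mean - 1)
        return self.mean

    def respond(self, now=None):
        """Register one response; return True if it earns a reinforcer.

        `now` is ignored; it is accepted only so both classes share a signature.
        """
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self._next_requirement()
            return True
        return False


class IntervalSchedule:
    """Reinforce the first response made after a required time has elapsed.

    variable=False -> fixed interval (FI): always `mean` seconds.
    variable=True  -> variable interval (VI): the wait is redrawn after each
    reinforcer, uniformly from 0 .. 2*mean seconds, so it averages `mean`.
    """

    def __init__(self, mean, variable=False, rng=None):
        self.mean = mean
        self.variable = variable
        self.rng = rng if rng is not None else random.Random()
        self.last_reinforced = 0.0
        self.required = self._next_requirement()

    def _next_requirement(self):
        if self.variable:
            return self.rng.uniform(0, 2 * self.mean)
        return self.mean

    def respond(self, now):
        """Register a response at time `now` (seconds); True if reinforced."""
        if now - self.last_reinforced >= self.required:
            self.last_reinforced = now
            self.required = self._next_requirement()
            return True
        return False


# FR-5: every fifth response is reinforced.
fr5 = RatioSchedule(5)
print([fr5.respond() for _ in range(10)])
# [False, False, False, False, True, False, False, False, False, True]

# FI-60: only the first response at or after 60 s since the last reinforcer pays off.
fi60 = IntervalSchedule(60)
print([fi60.respond(t) for t in (10, 59, 60, 61, 119, 125)])
# [False, False, True, False, False, True]
```

On the ratio schedules, responding faster earns reinforcers faster, while on the interval schedules extra responses before the time is up earn nothing. This is one way to see why ratio schedules tend to produce higher response rates.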
Resistance to Extinction Ranking
Most Resistant: Variable-Ratio (VR), e.g., gambling, slot machines
High Resistance: Variable-Interval (VI), e.g., pop quizzes
Moderate: Fixed-Ratio (FR), e.g., piecework pay
Least Resistant: Fixed-Interval (FI), e.g., weekly paychecks
Fastest Extinction: Continuous Reinforcement
Shaping
What is Shaping?
Shaping is the process of reinforcing successive approximations (gradual steps) toward a desired behavior. Used when the target behavior is unlikely to occur spontaneously.
How It Works:
- Identify target behavior (final goal)
- Identify starting behavior (what organism already does)
- Break down into small, achievable steps
- Reinforce each successive approximation
- Gradually raise criteria for reinforcement
- Eventually reach target behavior
Example: Teaching Rat to Press Lever
- Step 1: Reinforce rat for facing the lever
- Step 2: Reinforce for moving toward the lever
- Step 3: Reinforce for touching lever area
- Step 4: Reinforce for touching lever
- Step 5: Reinforce only for pressing lever
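The rat example above can be sketched as a loop in which the criterion for reinforcement is gradually raised. This is a simplified Python illustration, not a model of real training: the function name `shape` and the rule "advance after three reinforced responses" are hypothetical choices made for this example.

```python
# Successive approximations from the rat example, ordered from easiest to target.
STEPS = [
    "facing the lever",
    "moving toward the lever",
    "touching the lever area",
    "touching the lever",
    "pressing the lever",
]


def shape(observed_behaviors, successes_to_advance=3):
    """Return a log of (behavior, reinforced?, criterion in force) tuples.

    A behavior is reinforced only if it is at or beyond the current
    criterion step. After `successes_to_advance` reinforced responses
    (a hypothetical threshold), the criterion moves to the next step,
    so earlier approximations stop being reinforced.
    """
    criterion = 0
    successes = 0
    log = []
    for behavior in observed_behaviors:
        level = STEPS.index(behavior)
        reinforced = level >= criterion
        log.append((behavior, reinforced, STEPS[criterion]))
        if reinforced:
            successes += 1
            at_target = criterion == len(STEPS) - 1
            if successes >= successes_to_advance and not at_target:
                criterion += 1  # raise the bar for reinforcement
                successes = 0
    return log


session = ["facing the lever"] * 4 + ["moving toward the lever"]
for behavior, reinforced, criterion in shape(session):
    print(f"{behavior!r}: reinforced={reinforced} (criterion: {criterion})")
```

In this run the fourth "facing the lever" response is no longer reinforced, because the criterion has already moved on to "moving toward the lever".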
Real-World Applications: Teaching children new skills (toilet training, riding bike), animal training, therapy for developmental disabilities, athletic coaching
Additional Key Concepts
Extinction
Extinction in operant conditioning occurs when a previously reinforced behavior is no longer reinforced, leading to a decrease in that behavior.
- Stop reinforcing behavior → behavior gradually decreases
- May see extinction burst (temporary increase before decrease)
- Spontaneous recovery can occur after rest period
Example: If vending machine stops giving snacks when you insert money, you'll eventually stop putting money in.
Discrimination
Discrimination is learning to respond differently to different stimuli based on which behaviors are reinforced in which situations.
- Discriminative stimulus: Signal that indicates whether a behavior will be reinforced
- Organism learns "when" and "where" behavior will be reinforced
Example: Asking parent for candy when they're in good mood (discriminative stimulus) vs. bad mood
Generalization
Generalization occurs when a behavior reinforced in one situation is also performed in similar situations.
Example: Child learns saying "please" gets treats from parents, then says "please" to other adults expecting same result
Real-World Applications
Token Economy
A behavior modification system using secondary reinforcers (tokens) that can be exchanged for primary reinforcers or privileges.
- Used in classrooms, psychiatric hospitals, prisons
- Earn tokens for desired behaviors
- Exchange tokens for rewards (candy, privileges, activities)
- Effective for managing groups and teaching new behaviors
Other Applications
- Parenting: Time-outs (negative punishment), praise for good behavior (positive reinforcement)
- Education: Grades, gold stars, extra credit, privileges
- Workplace: Bonuses, promotions, performance reviews
- Animal training: Clicker training, treats for tricks
- Therapy: Behavior modification for developmental disabilities, addiction treatment
- Video games: Points, achievements, level-ups (variable-ratio reinforcement = addictive!)
Classical vs. Operant Conditioning
| Feature | Classical Conditioning | Operant Conditioning |
|---|---|---|
| Type of Behavior | Involuntary, automatic (reflexes) | Voluntary, active |
| Focus | Association between TWO STIMULI | Association between BEHAVIOR and CONSEQUENCE |
| Response | Elicited (drawn out automatically) | Emitted (voluntarily performed) |
| Timing | NS before/with UCS | Consequence after behavior |
| Key Researcher | Ivan Pavlov | B.F. Skinner |
| Example | Bell → salivation (learned) | Press lever → get food (learned) |
AP Exam Strategy
Multiple Choice Tips
- Master the four types: Know whether it's reinforcement (increase) or punishment (decrease), and whether adding or removing stimulus
- Memorize schedules: FR, VR, FI, VI β know which produces highest response rate (VR) and most resistance to extinction (VR)
- Identify scenarios: Given example, label as positive/negative reinforcement/punishment
- Distinguish from classical: Operant = voluntary behavior with consequences; Classical = involuntary stimulus association
- Know Skinner's contributions: Operant chamber, radical behaviorism, schedules of reinforcement
- Understand shaping: Successive approximations toward target behavior
Free Response Question (FRQ) Tips
- Label precisely: Don't just say "reinforcement"; specify positive or negative reinforcement
- Explain WHY behavior increases/decreases: Connect consequence to behavioral outcome
- Use correct terminology: "Positive punishment" not "positive consequence"
- Apply schedules correctly: If asked about gambling, identify as variable-ratio and explain why resistant to extinction
- Show understanding: Explain mechanism, not just memorize definitions
- Compare when asked: Clearly distinguish operant from classical conditioning
Quick Review Summary
The Big Picture
Operant conditioning is learning through consequences: voluntary behaviors increase with reinforcement and decrease with punishment. Thorndike's Law of Effect states that behaviors followed by satisfying consequences tend to be repeated. B.F. Skinner studied this systematically using the Skinner Box. Four key types: positive reinforcement (add pleasant, increase behavior), negative reinforcement (remove unpleasant, increase behavior), positive punishment (add unpleasant, decrease behavior), negative punishment (remove pleasant, decrease behavior). Primary reinforcers (biological needs: food, water) differ from secondary reinforcers (learned value: money, grades). Schedules of reinforcement: Continuous (every time), Fixed-Ratio (fixed number of responses), Variable-Ratio (unpredictable number; most resistant to extinction, as in gambling), Fixed-Interval (fixed time; scalloped pattern), Variable-Interval (unpredictable time; steady responding). Shaping uses successive approximations to teach complex behaviors. Extinction occurs when reinforcement stops. Operant differs from classical conditioning: operant = voluntary behavior with consequences; classical = involuntary stimulus association. Applications: token economy, parenting, education, animal training, therapy.
Essential Concepts
- Operant conditioning
- Edward Thorndike
- Law of Effect
- B.F. Skinner
- Skinner Box
- Positive reinforcement
- Negative reinforcement
- Positive punishment
- Negative punishment
- Primary reinforcers
- Secondary reinforcers
- Continuous reinforcement
- Partial reinforcement
- Fixed-Ratio (FR)
- Variable-Ratio (VR)
- Fixed-Interval (FI)
- Variable-Interval (VI)
- Shaping
- Successive approximations
- Extinction
- Discrimination
- Generalization
- Discriminative stimulus
- Token economy
- Behavior modification