Unit 3.8: Operant Conditioning

AP Psychology | Unit 3: Development and Learning

🎯 Exam Focus

Operant conditioning is learning through consequences - behaviors increase with reinforcement and decrease with punishment. Master the four key types (positive reinforcement, negative reinforcement, positive punishment, negative punishment), understand B.F. Skinner's contributions and the Skinner Box, know Thorndike's Law of Effect, memorize all schedules of reinforcement (fixed ratio, variable ratio, fixed interval, variable interval) with their resistance to extinction, understand shaping and successive approximations, distinguish primary from secondary reinforcers, know extinction and spontaneous recovery, and be able to identify operant vs. classical conditioning. This major topic appears frequently on both multiple-choice and FRQ sections. You must distinguish between reinforcement (increases behavior) and punishment (decreases behavior), and between positive (adding stimulus) and negative (removing stimulus).

📚 What is Operant Conditioning?

Operant conditioning (also called instrumental conditioning) is a type of learning in which behavior is strengthened if followed by reinforcement or weakened if followed by punishment.

Unlike classical conditioning, where responses are elicited automatically, operant conditioning involves voluntary behaviors that the organism performs on its environment. The organism actively "operates" on its surroundings to produce consequences, and those consequences affect whether the behavior will be repeated.

The key principle: Behaviors followed by satisfying consequences tend to be repeated; behaviors followed by unpleasant consequences tend not to be repeated. This is the foundation of how we learn from experience and how behaviors are shaped over time.

🔬 Key Researchers

Edward Thorndike: Law of Effect

Edward Thorndike pioneered research on instrumental learning through his puzzle box experiments with cats.

The Puzzle Box Experiment:

  • Hungry cat placed in wooden puzzle box with food visible outside
  • Cat had to perform specific action (pull string, press lever) to escape
  • Initially, cat tried random behaviors
  • Eventually, cat accidentally performed correct action and escaped
  • With repeated trials, cat escaped faster and faster
  • Cat learned through trial and error which behaviors led to desired outcome

⚖️ Law of Effect

The Law of Effect states: Behaviors followed by satisfying consequences are more likely to be repeated; behaviors followed by unpleasant consequences are less likely to be repeated. This principle laid the foundation for understanding how consequences shape behavior.

B.F. Skinner: The Operant Chamber (Skinner Box)

B.F. Skinner expanded on Thorndike's work and is considered the father of operant conditioning. He developed the operant chamber (commonly called the Skinner Box) to study animal behavior systematically.

The Skinner Box:

  • Enclosed chamber with lever (rats) or disc (pigeons)
  • Food dispenser activated by pressing lever/pecking disc
  • Could deliver reinforcers (food pellets) or punishers (mild shocks)
  • Recording device tracked responses automatically
  • Allowed precise control over environment and measurement of behavior

Skinner's Contribution: Showed that behavior is largely determined by consequences. He believed in radical behaviorism: the view that psychology does not need to study internal mental states and should focus only on observable behavior and its environmental consequences.

🔑 Four Types of Consequences

Understanding these four types is CRITICAL for the AP exam. The key is to understand two dimensions: (1) Does it increase or decrease behavior? (2) Is something added or removed?

💡 Memory Framework

REINFORCEMENT = INCREASES behavior
PUNISHMENT = DECREASES behavior

POSITIVE = ADDING something
NEGATIVE = REMOVING something

1. Positive Reinforcement

ADDING a pleasant stimulus to INCREASE behavior.

Formula:

Behavior → Add Pleasant Consequence → Behavior Increases

Examples:

  • Student studies hard → Receives praise from teacher → Studies more
  • Rat presses lever → Gets food pellet → Presses lever more
  • Employee works extra hours → Gets bonus → Works extra hours more often
  • Child cleans room → Gets allowance → Cleans room more regularly

2. Negative Reinforcement

REMOVING an unpleasant stimulus to INCREASE behavior.

Formula:

Behavior → Remove Unpleasant Consequence → Behavior Increases

Examples:

  • Taking aspirin → Headache goes away → Take aspirin more when headache occurs
  • Rat presses lever → Electric shock stops → Presses lever more to escape shock
  • Fastening seatbelt → Annoying beeping stops → Fasten seatbelt more regularly
  • Student submits homework on time → Teacher stops nagging → Submits on time more

⚠️ Common Confusion:

Negative reinforcement is NOT punishment! "Negative" means removing/subtracting, not "bad." The behavior still INCREASES because something unpleasant is taken away (which feels good).

3. Positive Punishment

ADDING an unpleasant stimulus to DECREASE behavior.

Formula:

Behavior → Add Unpleasant Consequence → Behavior Decreases

Examples:

  • Child misbehaves → Gets spanked/scolded → Misbehaves less
  • Speeding → Get traffic ticket → Speed less
  • Talking in class → Teacher gives detention → Talk less
  • Dog jumps on guests → Owner says "No!" sharply → Jumps less

4. Negative Punishment

REMOVING a pleasant stimulus to DECREASE behavior.

Formula:

Behavior → Remove Pleasant Consequence → Behavior Decreases

Examples:

  • Teenager breaks curfew → Loses car privileges → Breaks curfew less
  • Child misbehaves → Favorite toy taken away → Misbehaves less
  • Student talks back → Loses recess time → Talks back less
  • Employee arrives late → Bonus docked → Arrives late less

Also called: Response cost or omission training - taking away something good to reduce unwanted behavior.

📊 Quick Reference Table

Type | Action | Effect on Behavior | Example
Positive Reinforcement | ADD pleasant | INCREASES | Get praise for studying
Negative Reinforcement | REMOVE unpleasant | INCREASES | Aspirin stops headache
Positive Punishment | ADD unpleasant | DECREASES | Get scolded for talking
Negative Punishment | REMOVE pleasant | DECREASES | Lose phone for breaking rule
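
The two-question framework above (is a stimulus added or removed, and does the behavior increase or decrease?) can be written as a small lookup. This is only an illustrative sketch: the function name `classify_consequence` and its string arguments are made up for this example and are not AP terminology.

```python
def classify_consequence(stimulus_change: str, behavior_change: str) -> str:
    """Label a consequence using the two dimensions from the notes.

    stimulus_change: "add" or "remove"        (positive vs. negative)
    behavior_change: "increase" or "decrease" (reinforcement vs. punishment)
    """
    sign = {"add": "positive", "remove": "negative"}
    effect = {"increase": "reinforcement", "decrease": "punishment"}
    if stimulus_change not in sign or behavior_change not in effect:
        raise ValueError("expected add/remove and increase/decrease")
    return f"{sign[stimulus_change]} {effect[behavior_change]}"


# Aspirin removes a headache, so aspirin-taking increases.
print(classify_consequence("remove", "increase"))  # negative reinforcement
# Losing the phone removes something pleasant, so rule-breaking decreases.
print(classify_consequence("remove", "decrease"))  # negative punishment
```

Note that the label depends only on these two answers, never on whether the consequence sounds "good" or "bad."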

🎁 Types of Reinforcers

Primary Reinforcers

Innate, biological needs that are naturally reinforcing

Characteristics:

  • No learning required
  • Satisfy biological needs
  • Universally reinforcing

Examples: Food, water, sleep, warmth, sex

Secondary Reinforcers

Learned through association with primary reinforcers

Characteristics:

  • Acquired value through conditioning
  • Not inherently reinforcing
  • Can vary by culture/individual

Examples: Money, grades, praise, tokens, trophies

⏰ Schedules of Reinforcement

Schedules of reinforcement determine when and how often a behavior is reinforced. Different schedules produce different patterns of responding and different resistance to extinction.

Continuous Reinforcement

Behavior is reinforced every single time it occurs.

  • Best for establishing new behaviors quickly
  • Fastest acquisition
  • But: rapid extinction when reinforcement stops

Example: A vending machine. Insert money, get a snack every time.

Partial (Intermittent) Reinforcement

Behavior is reinforced only some of the time. More resistant to extinction than continuous reinforcement.

Partial Reinforcement Effect: Behaviors learned through partial reinforcement are more persistent and harder to extinguish than behaviors learned through continuous reinforcement.

Four Types of Partial Reinforcement Schedules

1. Fixed-Ratio (FR) Schedule

Reinforcement after a fixed number of responses. Predictable ratio.

Characteristics:

  • Produces high, steady rate of responding
  • Brief pause after reinforcement (post-reinforcement pause)
  • Higher ratio = more responses per reinforcer
  • Moderate resistance to extinction

Examples: FR-5 means reinforcement after every 5 responses. Piecework pay (paid per item produced), frequent flyer programs (free flight after X miles), buy 10 get 1 free punch cards

2. Variable-Ratio (VR) Schedule ⭐ HIGHEST RESISTANCE

Reinforcement after an unpredictable number of responses. The required number varies around an average.

Characteristics:

  • Produces highest, most consistent rate of responding
  • No post-reinforcement pause (can't predict when next reinforcer comes)
  • MOST resistant to extinction
  • Organism keeps responding because "next response might be the one"

Examples: VR-10 means reinforcement after average of 10 responses (could be 5, then 15, then 8). Gambling/slot machines (most addictive schedule!), lottery tickets, fishing (unpredictable catches), sales commissions with variable success

3. Fixed-Interval (FI) Schedule

Reinforcement for first response after a fixed time period has elapsed. Predictable time.

Characteristics:

  • Produces "scalloped" response pattern
  • Low responding right after reinforcement
  • Responding increases as time approaches next reinforcement
  • Lower resistance to extinction

Examples: FI-60 seconds means the first response after 60 seconds gets reinforced. Checking the mailbox (mail delivered once per day at a fixed time), studying patterns before exams (cramming right before the test), weekly paychecks

4. Variable-Interval (VI) Schedule

Reinforcement for the first response after an unpredictable time period. The interval length varies around an average.

Characteristics:

  • Produces steady, moderate rate of responding
  • No post-reinforcement pause
  • High resistance to extinction (but less than VR)
  • Can't predict when next reinforcer available, so keep checking

Examples: VI-30 seconds means reinforcement available after average of 30 seconds (could be 15, then 45, then 20). Pop quizzes (unpredictable timing → study consistently), checking social media notifications, fishing in unpredictable waters, surprise inspections
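
The rules behind the four partial schedules can be sketched as a toy simulation. Everything here is an assumption made for illustration: the function names, the choice of uniform random draws for the variable schedules, and a learner that responds exactly once per second. It models only when a reinforcer is delivered, not how a real organism's response rate changes.

```python
import random


def fixed_ratio(n):
    """FR-n: reinforce every n-th response."""
    count = 0

    def respond(_time):
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False

    return respond


def variable_ratio(mean_n, rng):
    """VR: reinforce after a random number of responses averaging mean_n."""
    count = 0
    target = rng.randint(1, 2 * mean_n - 1)  # uniform draw with mean mean_n

    def respond(_time):
        nonlocal count, target
        count += 1
        if count >= target:
            count = 0
            target = rng.randint(1, 2 * mean_n - 1)
            return True
        return False

    return respond


def fixed_interval(seconds):
    """FI: reinforce the first response made after `seconds` have elapsed."""
    available_at = seconds

    def respond(time):
        nonlocal available_at
        if time >= available_at:
            available_at = time + seconds
            return True
        return False

    return respond


def variable_interval(mean_seconds, rng):
    """VI: like FI, but each wait is random with mean mean_seconds."""
    available_at = rng.uniform(0, 2 * mean_seconds)

    def respond(time):
        nonlocal available_at
        if time >= available_at:
            available_at = time + rng.uniform(0, 2 * mean_seconds)
            return True
        return False

    return respond


rng = random.Random(0)  # seeded so repeated runs give the same draws
schedules = {
    "FR-5": fixed_ratio(5),
    "VR-5": variable_ratio(5, rng),
    "FI-10": fixed_interval(10),
    "VI-10": variable_interval(10, rng),
}
for name, respond in schedules.items():
    # One response per second for 60 seconds; count reinforcers earned.
    earned = sum(respond(t) for t in range(1, 61))
    print(name, "reinforcers earned:", earned)
```

The key contrast to notice: ratio schedules count responses, so responding faster earns reinforcers faster, while interval schedules watch the clock, so extra responses before the interval ends earn nothing.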

📊 Resistance to Extinction Ranking

Most Resistant: Variable-Ratio (VR) - gambling, slot machines
High Resistance: Variable-Interval (VI) - pop quizzes
Moderate: Fixed-Ratio (FR) - piecework pay
Least Resistant: Fixed-Interval (FI) - weekly paychecks
Fastest Extinction: Continuous Reinforcement

🎨 Shaping

What is Shaping?

Shaping is the process of reinforcing successive approximations (gradual steps) toward a desired behavior. Used when the target behavior is unlikely to occur spontaneously.

How It Works:

  1. Identify target behavior (final goal)
  2. Identify starting behavior (what organism already does)
  3. Break down into small, achievable steps
  4. Reinforce each successive approximation
  5. Gradually raise criteria for reinforcement
  6. Eventually reach target behavior

Example: Teaching Rat to Press Lever

  • Step 1: Reinforce rat for facing the lever
  • Step 2: Reinforce for moving toward the lever
  • Step 3: Reinforce for touching lever area
  • Step 4: Reinforce for touching lever
  • Step 5: Reinforce only for pressing lever
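
The rat example above can be sketched as a loop that reinforces any behavior meeting the current criterion and then raises the criterion. This is a deliberately simplified sketch: the `shape` function and the `STEPS` list are hypothetical names, and a real trainer would reinforce each approximation many times before moving on.

```python
STEPS = [
    "faces lever",
    "moves toward lever",
    "touches lever area",
    "touches lever",
    "presses lever",
]


def shape(observed_behaviors, steps=STEPS):
    """Return (behavior, reinforced) pairs, raising the criterion step by step."""
    criterion = 0  # index of the approximation currently required
    log = []
    for behavior in observed_behaviors:
        level = steps.index(behavior)
        # Reinforce only behavior at or beyond the current approximation.
        reinforced = level >= criterion
        if reinforced and criterion < len(steps) - 1:
            # Raise the bar: earlier approximations no longer earn food.
            criterion = min(level + 1, len(steps) - 1)
        log.append((behavior, reinforced))
    return log


session = [
    "faces lever",
    "faces lever",
    "moves toward lever",
    "touches lever area",
    "moves toward lever",
    "touches lever",
    "presses lever",
    "touches lever",
    "presses lever",
]
for behavior, reinforced in shape(session):
    print(f"{behavior:20} -> {'food' if reinforced else 'no food'}")
```

In this sketch, once an approximation has been reinforced, repeating it (or falling back to an earlier one) no longer earns food, which is the "gradually raise criteria" step.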

Real-World Applications: Teaching children new skills (toilet training, riding bike), animal training, therapy for developmental disabilities, athletic coaching

🔧 Additional Key Concepts

Extinction

Extinction in operant conditioning occurs when a previously reinforced behavior is no longer reinforced, leading to a decrease in that behavior.

  • Stop reinforcing behavior → behavior gradually decreases
  • May see extinction burst (temporary increase before decrease)
  • Spontaneous recovery can occur after rest period

Example: If vending machine stops giving snacks when you insert money, you'll eventually stop putting money in.

Discrimination

Discrimination is learning to respond differently to different stimuli based on which behaviors are reinforced in which situations.

  • Discriminative stimulus: Signal that indicates whether a behavior will be reinforced
  • Organism learns "when" and "where" behavior will be reinforced

Example: Asking parent for candy when they're in good mood (discriminative stimulus) vs. bad mood

Generalization

Generalization occurs when a behavior reinforced in one situation is also performed in similar situations.

Example: Child learns saying "please" gets treats from parents, then says "please" to other adults expecting same result

💡 Real-World Applications

Token Economy

A behavior modification system using secondary reinforcers (tokens) that can be exchanged for primary reinforcers or privileges.

  • Used in classrooms, psychiatric hospitals, prisons
  • Earn tokens for desired behaviors
  • Exchange tokens for rewards (candy, privileges, activities)
  • Effective for managing groups and teaching new behaviors
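
The earn-then-exchange cycle of a token economy can be sketched as a simple ledger. The class name `TokenEconomy`, the behaviors, the token values, and the prices are all invented for illustration; an actual program would set these to fit its setting.

```python
class TokenEconomy:
    """Toy ledger: tokens are secondary reinforcers traded for backup rewards."""

    def __init__(self, earn_rates, prices):
        self.earn_rates = earn_rates  # desired behavior -> tokens earned
        self.prices = prices          # reward -> token cost
        self.balances = {}            # person -> current tokens

    def record(self, person, behavior):
        """Award tokens for a desired behavior; other behaviors earn nothing."""
        earned = self.earn_rates.get(behavior, 0)
        self.balances[person] = self.balances.get(person, 0) + earned
        return earned

    def exchange(self, person, reward):
        """Trade tokens for a reward; return False if the balance is too low."""
        price = self.prices[reward]
        if self.balances.get(person, 0) < price:
            return False
        self.balances[person] -= price
        return True


classroom = TokenEconomy(
    earn_rates={"raises hand": 1, "finishes homework": 3},
    prices={"extra recess": 4},
)
classroom.record("Ana", "finishes homework")
classroom.record("Ana", "raises hand")
print(classroom.exchange("Ana", "extra recess"))  # True
print(classroom.balances["Ana"])                  # 0
```

The tokens have no value on their own; they work as reinforcers only because they can be exchanged for something the person already wants.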

Other Applications

  • Parenting: Time-outs (negative punishment), praise for good behavior (positive reinforcement)
  • Education: Grades, gold stars, extra credit, privileges
  • Workplace: Bonuses, promotions, performance reviews
  • Animal training: Clicker training, treats for tricks
  • Therapy: Behavior modification for developmental disabilities, addiction treatment
  • Video games: Points, achievements, level-ups (randomized rewards follow a variable-ratio schedule, which makes play hard to stop)

⚖️ Classical vs. Operant Conditioning

Feature | Classical Conditioning | Operant Conditioning
Type of Behavior | Involuntary, automatic (reflexes) | Voluntary, active
Focus | Association between TWO STIMULI | Association between BEHAVIOR and CONSEQUENCE
Response | Elicited (drawn out automatically) | Emitted (voluntarily performed)
Timing | NS before/with UCS | Consequence after behavior
Key Researcher | Ivan Pavlov | B.F. Skinner
Example | Bell → salivation (learned) | Press lever → get food (learned)

📝 AP Exam Strategy

Multiple Choice Tips

  • Master the four types: Know whether it's reinforcement (increase) or punishment (decrease), and whether adding or removing stimulus
  • Memorize schedules: FR, VR, FI, VI. Know which produces the highest response rate (VR) and the most resistance to extinction (VR)
  • Identify scenarios: Given example, label as positive/negative reinforcement/punishment
  • Distinguish from classical: Operant = voluntary behavior with consequences; Classical = involuntary stimulus association
  • Know Skinner's contributions: Operant chamber, radical behaviorism, schedules of reinforcement
  • Understand shaping: Successive approximations toward target behavior

Free Response Question (FRQ) Tips

  • Label precisely: Don't just say "reinforcement"; specify positive or negative reinforcement
  • Explain WHY behavior increases/decreases: Connect consequence to behavioral outcome
  • Use correct terminology: "Positive punishment" not "positive consequence"
  • Apply schedules correctly: If asked about gambling, identify as variable-ratio and explain why resistant to extinction
  • Show understanding: Explain mechanism, not just memorize definitions
  • Compare when asked: Clearly distinguish operant from classical conditioning

✨ Quick Review Summary

🔑 The Big Picture

Operant conditioning is learning through consequences: voluntary behaviors increase with reinforcement and decrease with punishment.

  • Thorndike's Law of Effect: behaviors followed by satisfying consequences tend to be repeated.
  • B.F. Skinner studied operant conditioning systematically using the Skinner Box.
  • Four key types: positive reinforcement (add pleasant, increase behavior), negative reinforcement (remove unpleasant, increase behavior), positive punishment (add unpleasant, decrease behavior), negative punishment (remove pleasant, decrease behavior).
  • Reinforcers: primary (biological needs: food, water) vs. secondary (learned value: money, grades).
  • Schedules of reinforcement: continuous (every time), fixed-ratio (fixed number of responses), variable-ratio (unpredictable number; most resistant to extinction, as in gambling), fixed-interval (fixed time; scalloped pattern), variable-interval (unpredictable time; steady responding).
  • Shaping uses successive approximations to teach complex behaviors.
  • Extinction occurs when reinforcement stops.
  • Operant vs. classical: operant = voluntary behavior with consequences; classical = involuntary stimulus association.
  • Applications: token economy, parenting, education, animal training, therapy.

💡 Essential Concepts

  • Operant conditioning
  • Edward Thorndike
  • Law of Effect
  • B.F. Skinner
  • Skinner Box
  • Positive reinforcement
  • Negative reinforcement
  • Positive punishment
  • Negative punishment
  • Primary reinforcers
  • Secondary reinforcers
  • Continuous reinforcement
  • Partial reinforcement
  • Fixed-Ratio (FR)
  • Variable-Ratio (VR)
  • Fixed-Interval (FI)
  • Variable-Interval (VI)
  • Shaping
  • Successive approximations
  • Extinction
  • Discrimination
  • Generalization
  • Discriminative stimulus
  • Token economy
  • Behavior modification
