AP Statistics 2026 FRQ Solutions: Complete Step-by-Step Answer Guide
This complete solution guide explains every question from the 2026 AP Statistics Free-Response Questions in a simple, beginner-friendly way. Each answer includes the method, formula, substitution, final result, and interpretation in context.
Question 1 Overview
A goat farmer compares the weights of two goat breeds: Breed H and Breed J. We are given the actual data for Breed H and a boxplot for Breed J. The main skills tested are finding a five-number summary and comparing distributions.
Part A: Find the five-number summary for Breed H
The Breed H goat weights are already listed from smallest to largest:
A five-number summary has five values:
- Minimum
- First quartile, \(Q_1\)
- Median
- Third quartile, \(Q_3\)
- Maximum
Step 1: Minimum
The minimum is the smallest value.
Step 2: Maximum
The maximum is the largest value.
Step 3: Median
There are \(14\) values. Since \(14\) is even, the median is the average of the 7th and 8th values.
Step 4: First quartile, \(Q_1\)
The lower half of the data is:
The middle value is \(56\), so:
\[Q_1 = 56\]
Step 5: Third quartile, \(Q_3\)
The upper half of the data is:
The middle value is \(72\), so:
\[Q_3 = 72\]
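The quartile steps above can be sketched in Python. This is a minimal sketch, assuming the median-of-halves rule used in this guide. The `weights` list is hypothetical and is not the Breed H data from the exam, and `five_number_summary` is an illustrative helper name.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]            # values below the median position
    upper = data[half + n % 2:]    # values above it (the middle value is skipped when n is odd)
    return (data[0], median(lower), median(data), median(upper), data[-1])

# Hypothetical weights for illustration only, NOT the Breed H data
weights = [41, 45, 48, 52, 55, 59, 60, 62, 66, 70, 71, 75, 78, 83]
summary = five_number_summary(weights)
print(summary)
```

With 14 values, each half contains 7 values, so each quartile is the 4th value of its half, mirroring the reasoning in Steps 4 and 5.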
Part B: Compare the center and variability of Breed H and Breed J
To compare two distributions, focus on two major ideas:
- Center
- Variability
Compare center
Breed H has median:
\[\text{Median}_H = 64\]
From the Breed J boxplot, the median is also about \(64\), so the two breeds have similar typical weights.
Compare variability
For Breed H:
\[IQR = Q_3 - Q_1 = 72 - 56 = 16\]
For Breed J, the boxplot shows approximately:
Part C(i): What does the stem-and-leaf plot show that the boxplot does not?
The stem-and-leaf plot shows the individual data values. Because individual values are visible, we can see the shape of the distribution more clearly.
Breed H has values clustered around the 50s and again around the 70s. This suggests a bimodal shape.
Part C(ii): Why does a boxplot not show this?
A boxplot only displays the five-number summary. It does not show every individual value, so clusters and gaps, such as the two clusters in the Breed H weights, are hidden.
Question 2 Overview
Holly wants to know whether adding coffee grounds to soil helps rosebushes produce more roses. She grows \(30\) rosebushes in a greenhouse, randomly assigns \(15\) to receive coffee grounds, and leaves \(15\) without coffee grounds.
Part A(i): Identify the treatments
A treatment is the condition applied to the experimental units.
- One-half cup of coffee grounds added weekly.
- No coffee grounds added.
Part A(ii): Identify the experimental units
Experimental units are the individuals or objects that receive the treatments. Here, the experimental units are the \(30\) rosebushes.
Part A(iii): Identify the response variable
The response variable is what is measured after the treatments are applied. Here, it is the number of roses each rosebush produces.
Part B: Describe random assignment
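Random assignment tends to balance other variables, such as the initial health of each rosebush, across the two groups. This makes the groups comparable before treatment, so a difference in rose production can reasonably be attributed to the coffee grounds.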
Step-by-step random assignment method
- Label the rosebushes from \(1\) to \(30\).
- Use a random number generator to choose \(15\) unique numbers from \(1\) to \(30\).
- Assign the selected \(15\) rosebushes to receive coffee grounds.
- Assign the remaining \(15\) rosebushes to receive no coffee grounds.
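The four steps above can be sketched in Python. This is one possible implementation, not the required exam answer. The seed `2026` is an arbitrary assumption used only to make the output reproducible.

```python
import random

rosebushes = list(range(1, 31))        # labels 1 through 30
rng = random.Random(2026)              # arbitrary fixed seed (an assumption) for reproducibility

# sample() draws without replacement, so the 15 labels are unique
coffee_group = sorted(rng.sample(rosebushes, 15))
control_group = sorted(set(rosebushes) - set(coffee_group))

print("Coffee grounds:   ", coffee_group)
print("No coffee grounds:", control_group)
```

Drawing without replacement matters here: it is what guarantees that no rosebush is selected twice and that each group ends up with exactly \(15\) rosebushes.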
Part C: Explain “statistically significant” in context
The result was statistically significant at:
This means that, if coffee grounds truly had no effect, a difference in rose production as large as the one observed would be unlikely to occur from the chance of random assignment alone.
Question 3 Overview
The time it takes for a team song to be performed follows a normal distribution with mean \(109\) seconds and standard deviation \(16\) seconds.
Part A: Probability one performance lasts longer than 120 seconds
We need:
\[P(\text{time} > 120)\]
Step 1: Convert 120 seconds to a z-score
\[z = \frac{120 - 109}{16} = 0.6875\]
Step 2: Find the area to the right
Using a normal calculator or table:
\[P(Z > 0.6875) \approx 0.2459\]
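The normal calculation can be checked with Python's standard library. This is a minimal sketch that uses the mean of \(109\) seconds and standard deviation of \(16\) seconds from the problem. The variable names are illustrative.

```python
from statistics import NormalDist

song_time = NormalDist(mu=109, sigma=16)   # model for performance time in seconds

z = (120 - 109) / 16                       # z-score for 120 seconds
p_longer = 1 - song_time.cdf(120)          # area to the right of 120

print(f"z = {z:.4f}, P(time > 120) = {p_longer:.4f}")
```

The result should be close to \(0.2459\). A printed table that rounds \(z\) to \(0.69\) gives a slightly different answer, about \(0.2451\), which is normally acceptable.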
Part B: Probability at least 3 of 10 performances last longer than 120 seconds
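From Part A, the probability of success is:
\[p \approx 0.2459\]
There are \(10\) performances, so the number \(X\) of performances longer than \(120\) seconds follows a binomial distribution, assuming the performances are independent:
\[X \sim \text{Binomial}(n = 10,\ p = 0.2459)\]
We need:
\[P(X \geq 3)\]
This means \(3\) or more performances last longer than \(120\) seconds. Using the complement:
\[P(X \geq 3) = 1 - P(X \leq 2) \approx 0.462\]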
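The binomial probability can be computed directly from the binomial formula. This is a sketch that assumes \(p = 0.2459\) from Part A and independent performances. `binomial_pmf` is an illustrative helper name.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n = 10
p = 0.2459   # P(one performance lasts longer than 120 seconds), from Part A

# Complement rule: P(X >= 3) = 1 - [P(X = 0) + P(X = 1) + P(X = 2)]
p_at_most_2 = sum(binomial_pmf(k, n, p) for k in range(3))
p_at_least_3 = 1 - p_at_most_2

print(f"P(X >= 3) = {p_at_least_3:.3f}")
```

Using the complement keeps the sum to three terms instead of eight.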
Part C: Geometric distribution
Ben attends games until he sees the first performance longer than \(120\) seconds. This is geometric because we are waiting for the first success.
Part C(i): Mean of \(Y\)
For a geometric random variable:
\[\mu_Y = \frac{1}{p} = \frac{1}{0.2459} \approx 4.07\]
Part C(ii): Standard deviation of \(Y\)
For a geometric random variable:
\[\sigma_Y = \frac{\sqrt{1 - p}}{p} = \frac{\sqrt{0.7541}}{0.2459} \approx 3.53\]
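The geometric mean and standard deviation formulas can be evaluated in a few lines. This sketch assumes \(p = 0.2459\) from Part A. The variable names are illustrative.

```python
from math import sqrt

p = 0.2459                  # probability a performance lasts longer than 120 seconds

mean_y = 1 / p              # expected number of games until the first success
sd_y = sqrt(1 - p) / p      # standard deviation of the number of games

print(f"mean = {mean_y:.2f} games, standard deviation = {sd_y:.2f} games")
```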
Part D: Interpret the standard deviation
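The standard deviation tells us how much the number of games usually varies from the average. Here, the number of games Ben attends until he first sees a performance longer than \(120\) seconds typically differs from the mean of about \(4.07\) games by about \(3.53\) games.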
Question 4 Overview
A farmer wants to know whether there is a difference in the mean number of oranges produced by trees fertilized with Brand C and Brand N.
| Fertilizer | Sample Size | Mean | Standard Deviation |
|---|---|---|---|
| Brand C | \(58\) | \(141\) | \(15\) |
| Brand N | \(58\) | \(148\) | \(19\) |
Step 1: State the hypotheses
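Let:
- \(\mu_C\) = the true mean number of oranges produced by trees fertilized with Brand C
- \(\mu_N\) = the true mean number of oranges produced by trees fertilized with Brand N
The null hypothesis says there is no difference:
\[H_0: \mu_C = \mu_N\]
The alternative hypothesis says there is a difference:
\[H_a: \mu_C \neq \mu_N\]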
Step 2: Choose the test
We are comparing two means, so we use a two-sample \(t\)-test.
Step 3: Check conditions
Random condition
The farmer randomly assigned trees to the fertilizer treatments, so the random condition is satisfied.
Independent groups
Each tree received only one fertilizer, so the two treatment groups are independent.
Large sample condition
Both sample sizes are \(58\), and \(58\geq 30\), so by the central limit theorem the sampling distribution of the difference in sample means is approximately normal.
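Step 4: Calculate the test statistic
\[t = \frac{\bar{x}_C - \bar{x}_N}{\sqrt{\dfrac{s_C^2}{n_C} + \dfrac{s_N^2}{n_N}}}\]
Substitute the values:
\[t = \frac{141 - 148}{\sqrt{\dfrac{15^2}{58} + \dfrac{19^2}{58}}} = \frac{-7}{\sqrt{10.103}} \approx -2.20\]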
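Step 5: Find the p-value
The alternative hypothesis is two-sided, so the p-value is the area in both tails beyond \(|t| = 2.20\). Using technology with about \(108\) degrees of freedom:
\[p\text{-value} = 2P(T \leq -2.20) \approx 0.03\]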
Step 6: Make the decision
The significance level is:
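Since:
\[p\text{-value} < \alpha\]
we reject \(H_0\). There is convincing evidence of a difference in the mean number of oranges produced by trees fertilized with Brand C and trees fertilized with Brand N.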
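The test statistic and p-value can be checked with a short, standard-library-only sketch. The degrees of freedom use the Welch approximation, which is an assumption about the method (a calculator's two-sample \(t\)-test reports the same kind of value). The \(t\) tail area is found by numerical integration with Simpson's rule rather than a library routine. `t_pdf` and `two_sided_p_value` are illustrative helper names.

```python
from math import sqrt, lgamma, exp, log, pi

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_c - (df + 1) / 2 * log(1 + x * x / df))

def two_sided_p_value(t, df, steps=2000):
    """1 minus the area between -|t| and |t|, using Simpson's rule (steps must be even)."""
    a, b = -abs(t), abs(t)
    h = (b - a) / steps
    total = t_pdf(a, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(a + i * h, df)
    return 1 - total * h / 3

# Summary statistics from the table
n_c, mean_c, sd_c = 58, 141, 15   # Brand C
n_n, mean_n, sd_n = 58, 148, 19   # Brand N

var_c, var_n = sd_c**2 / n_c, sd_n**2 / n_n
t_stat = (mean_c - mean_n) / sqrt(var_c + var_n)

# Welch approximation for the degrees of freedom
df = (var_c + var_n) ** 2 / (var_c**2 / (n_c - 1) + var_n**2 / (n_n - 1))
p_value = two_sided_p_value(t_stat, df)

print(f"t = {t_stat:.2f}, df = {df:.1f}, p-value = {p_value:.3f}")
```

The conservative choice \(df = 57\) (the smaller sample size minus \(1\)) gives a slightly larger p-value but leads to the same rounded value of about \(0.03\).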
Question 5 Overview
The table classifies \(4,193\) professional athletes by sport and age group. The main skills are probability, conditional probability, mutual exclusivity, independence, and deciding whether a chi-square test is appropriate.
| Age Group | Basketball | Football | Baseball | Total |
|---|---|---|---|---|
| Age \( \lt 25\) | \(232\) | \(807\) | \(259\) | \(1298\) |
| \(25\leq Age\lt 30\) | \(175\) | \(1326\) | \(620\) | \(2121\) |
| \(30\leq Age\lt 35\) | \(90\) | \(287\) | \(276\) | \(653\) |
| \(35\leq Age\) | \(19\) | \(41\) | \(61\) | \(121\) |
| Total | \(516\) | \(2461\) | \(1216\) | \(4193\) |
Part A(i): Probability that a randomly selected athlete is a football player
Number of football players:
\[2461\]
Total number of athletes:
\[4193\]
So:
\[P(\text{Football}) = \frac{2461}{4193} \approx 0.587\]
Part A(ii): Probability athlete is age 25 to under 30, given they are a football player
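This is a conditional probability. We only look at the \(2461\) football players, of whom \(1326\) are in the age group \(25\leq Age\lt 30\):
\[P(25\leq Age\lt 30 \mid \text{Football}) = \frac{1326}{2461} \approx 0.539\]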
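Both probabilities in Part A can be read off the two-way table programmatically. This is a minimal sketch. The dictionary layout and the row labels are illustrative choices, while the counts come from the table above.

```python
# Counts from the two-way table: counts[age_group][sport]
counts = {
    "under 25":       {"Basketball": 232, "Football": 807,  "Baseball": 259},
    "25 to under 30": {"Basketball": 175, "Football": 1326, "Baseball": 620},
    "30 to under 35": {"Basketball": 90,  "Football": 287,  "Baseball": 276},
    "35 and over":    {"Basketball": 19,  "Football": 41,   "Baseball": 61},
}

total = sum(sum(row.values()) for row in counts.values())
football_total = sum(row["Football"] for row in counts.values())

# Marginal probability: all athletes are in the denominator
p_football = football_total / total

# Conditional probability: only football players are in the denominator
p_age_given_football = counts["25 to under 30"]["Football"] / football_total

print(f"P(Football) = {p_football:.3f}")
print(f"P(25 <= Age < 30 | Football) = {p_age_given_football:.3f}")
```

The only difference between the two calculations is the denominator, which is the key idea behind conditional probability.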
Part B(i): Which probability does \(b\) represent?
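In a mosaic plot, the width of a category represents the probability of that category. The width \(b\) is the width of the football section, so \(b\) represents the probability that a randomly selected athlete is a football player:
\[b = P(\text{Football}) = \frac{2461}{4193} \approx 0.587\]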
Part B(ii): What probability does \(x\) represent?
The mosaic plot shows:
In a mosaic plot, area represents a joint probability. So \(x\) represents the probability that a randomly selected athlete is both a football player and in the age group \(25\leq Age\lt 30\).
Part C(i): Are “Baseball” and “\(35\leq Age\)” mutually exclusive?
Two events are mutually exclusive if they cannot happen at the same time.
The table shows \(61\) athletes are both baseball players and age \(35\) or older. Because both events can happen at the same time, they are not mutually exclusive.
Part C(ii): Are “Baseball” and “\(35\leq Age\)” independent?
Two events are independent if knowing one event happened does not change the probability of the other event.
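Step 1: Find \(P(35\leq Age)\)
\[P(35\leq Age) = \frac{121}{4193} \approx 0.0289\]
Step 2: Find \(P(35\leq Age\mid \text{Baseball})\)
\[P(35\leq Age\mid \text{Baseball}) = \frac{61}{1216} \approx 0.0502\]
Step 3: Compare
\[0.0289 \neq 0.0502\]
Knowing that an athlete plays baseball changes the probability that the athlete is age \(35\) or older, so the events are not independent.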
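The three-step independence check can be sketched as follows. Because the table covers the whole population, the comparison is exact, so this sketch compares the two fractions by cross-multiplying integers instead of comparing rounded decimals. The variable names are illustrative.

```python
total = 4193             # all athletes
age_35_plus = 121        # row total for 35 <= Age
baseball = 1216          # column total for Baseball
both = 61                # Baseball and 35 <= Age

p_age = age_35_plus / total
p_age_given_baseball = both / baseball

# both/baseball == age_35_plus/total exactly when the cross products are equal
independent = both * total == age_35_plus * baseball

print(f"P(35 <= Age) = {p_age:.4f}")
print(f"P(35 <= Age | Baseball) = {p_age_given_baseball:.4f}")
print("Independent:", independent)
```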
Part D: Is a chi-square test for independence appropriate?
A chi-square test for independence is used when we have sample data and want to make an inference about a larger population.
But this table includes all \(4{,}193\) professional athletes in these sports for the recent year. Since we already have the entire population of interest, inference is not needed, so a chi-square test for independence is not appropriate.
Question 6 Overview
This question studies the relationship between number of hits and number of runs for professional baseball teams.
Part A(i): Describe the scatterplot
From the scatterplot, as the number of hits increases, the number of runs tends to increase. This means the association is positive.
The points follow a roughly straight-line pattern, so the relationship is approximately linear. The relationship is also moderately strong because the points generally follow the upward pattern.
Part A(ii): Predict runs for \(1,250\) hits
The regression equation is:
Substitute \(1250\) for hits:
Part B(i): Compare Team A with other teams in the same salary group
Team A is shown as a square. The problem says squares represent teams with salaries less than the median.
Therefore, Team A is a lower-salary team. Compared with other lower-salary teams, Team A has one of the greatest numbers of hits and one of the greatest numbers of runs.
Part B(ii): Compare strength of linear relationships
Dots represent teams with salaries greater than the median. Squares represent teams with salaries less than the median.
The dots appear closer to a straight-line pattern, while the squares are more spread out. So the relationship is stronger for teams with salaries greater than the median.
Part C(i): Find the critical value
There are \(30\) teams, so:
\[n = 30\]
For regression inference:
\[df = n - 2 = 30 - 2 = 28\]
For a \(95\%\) confidence level with \(28\) degrees of freedom:
\[t^* \approx 2.048\]
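On the exam the critical value comes from a \(t\) table or a calculator's inverse-\(t\) function. As a cross-check, the sketch below finds it with the standard library only, by integrating the \(t\) density numerically and solving for the cutoff with bisection. `t_pdf`, `t_cdf`, and `t_critical` are illustrative helper names, and the search interval \([0, 50]\) is an assumption that comfortably contains the answer.

```python
from math import lgamma, exp, log, pi

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_c - (df + 1) / 2 * log(1 + x * x / df))

def t_cdf(x, df, steps=1000):
    """P(T <= x) for x >= 0: one half plus Simpson's rule on [0, x]."""
    h = x / steps
    total = t_pdf(0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(confidence, df):
    """Find t* such that the middle `confidence` area lies between -t* and t*."""
    target = 1 - (1 - confidence) / 2      # e.g. 0.975 for 95% confidence
    lo, hi = 0.0, 50.0                     # assumed search interval
    for _ in range(60):                    # bisection
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

n = 30
df = n - 2
t_star = t_critical(0.95, df)
print(f"df = {df}, t* = {t_star:.3f}")
```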
Part C(ii): 95% confidence interval for the mean number of runs
This interval estimates the mean number of runs for all teams with \(1,250\) hits.
Point estimate:
Standard error:
Critical value:
\[t^* \approx 2.048\]
Use:
\[\hat{y} \pm t^* \cdot SE\]
Part C(iii): 95% prediction interval for one team
This interval predicts the number of runs for one individual team with \(1,250\) hits.
Point estimate:
Standard error:
Critical value:
\[t^* \approx 2.048\]
Part D(i): Which has more variability: sample means or individual observations?
Individual observations usually vary more. Sample means vary less because averaging smooths out extreme values.
Part D(ii): Why is the prediction interval wider than the confidence interval?
The confidence interval estimates the average number of runs for all teams with \(1,250\) hits. The prediction interval predicts the number of runs for one single team with \(1,250\) hits.
Predicting one team is harder because individual teams vary more than averages. That is why the prediction interval has more uncertainty.
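The confidence interval standard error is:
\[SE_{\text{mean}} = s\sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{\sum (x_i - \bar{x})^2}}\]
The prediction interval standard error is:
\[SE_{\text{pred}} = s\sqrt{1 + \frac{1}{n} + \frac{(x^* - \bar{x})^2}{\sum (x_i - \bar{x})^2}}\]
Here \(s\) is the standard deviation of the residuals, \(n\) is the number of teams, \(x^* = 1250\) hits, and \(\bar{x}\) is the mean number of hits.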
The prediction interval formula has an extra \(1\) inside the square root. That extra \(1\) makes the prediction standard error larger.
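The effect of the extra \(1\) can be seen numerically. This sketch uses the standard simple linear regression formulas for the two standard errors. Every numeric input except \(n = 30\), \(x^* = 1250\), and \(t^* = 2.048\) is hypothetical, chosen only for illustration, and is not taken from the exam's computer output.

```python
from math import sqrt

def se_mean_response(s, n, x_star, x_bar, sxx):
    """Standard error for the mean response at x_star (confidence interval)."""
    return s * sqrt(1 / n + (x_star - x_bar) ** 2 / sxx)

def se_prediction(s, n, x_star, x_bar, sxx):
    """Standard error for one new observation at x_star (prediction interval)."""
    return s * sqrt(1 + 1 / n + (x_star - x_bar) ** 2 / sxx)

n, x_star, t_star = 30, 1250, 2.048   # from the problem and Part C(i)

# HYPOTHETICAL values, NOT from the exam:
s = 30.0          # residual standard deviation
x_bar = 1300.0    # mean number of hits
sxx = 150000.0    # sum of squared deviations of hits from their mean
y_hat = 650.0     # predicted runs at 1250 hits

se_ci = se_mean_response(s, n, x_star, x_bar, sxx)
se_pi = se_prediction(s, n, x_star, x_bar, sxx)

ci = (y_hat - t_star * se_ci, y_hat + t_star * se_ci)
pi_ = (y_hat - t_star * se_pi, y_hat + t_star * se_pi)

print(f"SE (mean response) = {se_ci:.2f}, 95% CI = ({ci[0]:.1f}, {ci[1]:.1f})")
print(f"SE (prediction)    = {se_pi:.2f}, 95% PI = ({pi_[0]:.1f}, {pi_[1]:.1f})")
```

Whatever inputs are used, the prediction standard error is always at least \(s\), while the confidence interval standard error shrinks toward \(0\) as \(n\) grows. Both intervals share the same center and critical value, so the prediction interval is always the wider one.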
FAQ: AP Statistics 2026 FRQ Solutions
What topics appeared in the 2026 AP Statistics FRQs?
The questions covered descriptive statistics, boxplots, experimental design, statistical significance, normal probability, binomial and geometric distributions, two-sample \(t\)-tests, conditional probability, independence, mosaic plots, chi-square reasoning, and linear regression intervals.
Why is showing work important in AP Statistics?
AP Statistics free-response questions are scored for method, calculations, and communication. A correct final number without explanation may not receive full credit.
What is the difference between a confidence interval and a prediction interval?
A confidence interval estimates a mean response for all individuals with a certain \(x\)-value. A prediction interval predicts the response for one individual with that \(x\)-value. Prediction intervals are wider because individual outcomes vary more than averages.
Why was the chi-square test not appropriate in Question 5?
The table included the entire population of professional athletes in those sports for that year. Since the data were not a random sample used to infer to a larger population, a chi-square inference test was not needed.
What is the easiest way to improve AP Statistics FRQ answers?
Use a clear structure: identify the method, write the formula, substitute values, calculate carefully, and interpret the result in the context of the problem.