IB Mathematics AI – Topic 4
Statistics & Probability: Estimation & Confidence Intervals
Reliability and Validity of Data Collection
Validity
Definition: Validity refers to whether the data collection method actually measures what it is intended to measure. Valid data accurately represents the concept being studied.
Types of Validity:
- Face Validity: Does the measure appear to test what it claims to test?
- Content Validity: Does the measure cover all aspects of the concept?
- Construct Validity: Does the measure relate to other variables as expected?
- External Validity: Can results be generalized to other contexts?
Threats to Validity:
- Poor question design: Leading, ambiguous, or biased questions
- Inappropriate sampling method: Sample doesn't represent population
- Measurement errors: Instrument doesn't measure correctly
- Confounding variables: Other factors influencing results
Reliability
Definition: Reliability refers to the consistency and repeatability of measurements. A reliable method produces similar results under consistent conditions.
Types of Reliability:
- Test-Retest Reliability: Same results when repeated over time
- Inter-rater Reliability: Different observers produce consistent results
- Internal Consistency: Multiple items measuring same concept produce consistent results
Factors Affecting Reliability:
- Random measurement errors
- Environmental conditions
- Observer/rater variability
- Instrument calibration and precision
Key Relationship:
A measure can be reliable without being valid (consistently wrong), but cannot be valid without being reliable (inconsistent measurements cannot be accurate).
⚠️ Common Pitfalls & Tips:
- High reliability doesn't guarantee validity – consistent but wrong
- Sample size affects reliability – larger samples more reliable
- Random sampling increases validity of generalization
- Pilot studies help identify reliability and validity issues
- Clear operational definitions improve both reliability and validity
Unbiased Estimators
Definition & Concepts
Definition: An unbiased estimator is a statistic whose expected value equals the population parameter it estimates. On average, over many samples, it produces the correct value.
Sample Mean (Unbiased Estimator of \(\mu\)):
\[ \bar{x} = \frac{\sum x_i}{n} \]
The sample mean is an unbiased estimator: \(E(\bar{x}) = \mu\)
Sample Variance (Two Formulas):
Biased Sample Variance (NOT used for estimation):
\[ s_n^2 = \frac{\sum (x_i - \bar{x})^2}{n} \]
On average, this underestimates the population variance: \(E(s_n^2) = \frac{n-1}{n}\sigma^2\)
Unbiased Sample Variance (used for estimation):
\[ s_{n-1}^2 = \frac{\sum (x_i - \bar{x})^2}{n-1} \]
This is an unbiased estimator: \(E(s_{n-1}^2) = \sigma^2\)
Unbiased Standard Deviation:
\[ s_{n-1} = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n-1}} \]
Why divide by (n-1)?
Dividing by n produces a biased estimator that systematically underestimates the population variance. Using (n-1) corrects this bias. This is called Bessel's correction.
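The bias can be checked exactly on a tiny example. The sketch below assumes a hypothetical population of four values and lists every equally likely sample of size 3 drawn with replacement, then averages both variance formulas over all of them. The population, the sample size, and the helper name `variance` are illustrative choices, not part of the syllabus.

```python
from fractions import Fraction
from itertools import product

def variance(sample, divisor):
    """Sum of squared deviations from the sample mean, divided by `divisor`."""
    mean = Fraction(sum(sample), len(sample))
    return sum((x - mean) ** 2 for x in sample) / divisor

# Hypothetical population, chosen only for illustration.
population = [2, 4, 6, 8]
n = 3

# The whole population is known here, so its variance divides by N.
sigma_squared = variance(population, len(population))

# Every equally likely ordered sample of size n, drawn with replacement.
samples = list(product(population, repeat=n))

mean_biased = sum(variance(s, n) for s in samples) / len(samples)
mean_unbiased = sum(variance(s, n - 1) for s in samples) / len(samples)

print("population variance:  ", sigma_squared)   # 5
print("average of s_n^2:     ", mean_biased)     # 10/3
print("average of s_(n-1)^2: ", mean_unbiased)   # 5
```

Dividing by n gives an average of 10/3, which is \(\frac{n-1}{n} = \frac{2}{3}\) of the true variance 5, while dividing by n-1 recovers 5 exactly.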
⚠️ Common Pitfalls & Tips:
- Critical: Use \(n-1\) in denominator for sample variance when estimating population variance
- Your GDC reports both: \(\sigma_x\) (divides by n, i.e. \(s_n\), population/biased) and \(s_x\) (divides by n-1, i.e. \(s_{n-1}\), sample/unbiased)
- For confidence intervals and t-tests, always use \(s_{n-1}\)
- Larger samples reduce the difference between biased and unbiased estimators
- IB exams typically expect unbiased estimators unless otherwise stated
📝 Worked Example 1: Calculating Unbiased Estimates
Question: The test scores of a sample of 5 students are: 72, 85, 78, 92, 83
(a) Calculate the sample mean.
(b) Calculate the unbiased estimate of the population variance.
(c) Calculate the unbiased estimate of the population standard deviation.
Solution:
(a) Sample Mean:
\[ \bar{x} = \frac{72 + 85 + 78 + 92 + 83}{5} = \frac{410}{5} = 82 \]
(b) Unbiased Variance:
First, calculate the squared deviations from the mean:
\((72-82)^2 = (-10)^2 = 100\)
\((85-82)^2 = (3)^2 = 9\)
\((78-82)^2 = (-4)^2 = 16\)
\((92-82)^2 = (10)^2 = 100\)
\((83-82)^2 = (1)^2 = 1\)
Sum of squared deviations:
\[ \sum (x_i - \bar{x})^2 = 100 + 9 + 16 + 100 + 1 = 226 \]
Unbiased variance (divide by n-1):
\[ s_{n-1}^2 = \frac{226}{5-1} = \frac{226}{4} = 56.5 \]
(c) Unbiased Standard Deviation:
\[ s_{n-1} = \sqrt{56.5} = 7.52 \text{ (3 s.f.)} \]
Using GDC: Enter data in list, use 1-Var Stats, read \(s_x\) (which is \(s_{n-1}\))
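For checking answers away from the GDC, the same values can be reproduced with Python's built-in `statistics` module. This is only a cross-check sketch: `variance` and `stdev` divide by n-1, while `pvariance` divides by n.

```python
import statistics

scores = [72, 85, 78, 92, 83]

sample_mean = statistics.mean(scores)
unbiased_variance = statistics.variance(scores)   # divides by n - 1
biased_variance = statistics.pvariance(scores)    # divides by n
unbiased_sd = statistics.stdev(scores)            # square root of the n - 1 variance

print(sample_mean)             # 82
print(unbiased_variance)       # 56.5
print(round(unbiased_sd, 2))   # 7.52
print(biased_variance)         # 45.2, smaller than the unbiased estimate
```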
Central Limit Theorem (CLT)
The Theorem
Statement: The Central Limit Theorem states that the distribution of sample means approaches a normal distribution as the sample size increases, regardless of the shape of the population distribution (provided the population has a finite variance).
Formal Statement:
If \(X_1, X_2, \ldots, X_n\) is a random sample from a population with mean \(\mu\) and standard deviation \(\sigma\), then for large n:
\[ \bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right) \quad \text{approximately} \]
Key Properties:
- Mean of sampling distribution: \(E(\bar{X}) = \mu\)
- Variance of sampling distribution: \(\text{Var}(\bar{X}) = \frac{\sigma^2}{n}\)
- Standard error: \(SE(\bar{X}) = \frac{\sigma}{\sqrt{n}}\)
Standardization:
\[ Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \sim N(0, 1) \quad \text{approximately} \]
When Does CLT Apply?
- Sample size guideline: Generally n ≥ 30 is sufficient
- If population is already normal, \(\bar{X}\) is exactly normal for any n (the CLT is not needed)
- If population is moderately skewed, n ≥ 30 usually sufficient
- If population is heavily skewed, may need n ≥ 100
Implications:
- Allows use of normal distribution methods even when population is not normal
- Justifies confidence intervals and hypothesis tests for large samples
- As n increases, sampling distribution becomes more normal
- As n increases, standard error decreases (more precise estimates)
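A small simulation can make these implications concrete. The sketch below assumes a strongly skewed population (exponential with mean 1 and standard deviation 1), a sample size of 36, and 20,000 repeated samples. All of these are illustrative choices, and the printed values will vary slightly with the seed.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

n = 36                 # sample size (assumed for illustration)
num_samples = 20_000   # number of repeated samples (assumed)

# Skewed population: exponential with mean 1 and standard deviation 1.
mu, sigma = 1.0, 1.0

sample_means = [
    statistics.fmean([random.expovariate(1.0) for _ in range(n)])
    for _ in range(num_samples)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.pstdev(sample_means)
standard_error = sigma / math.sqrt(n)

# Share of sample means within one standard error of mu
# (roughly 68% if the sampling distribution is close to normal).
within_one_se = sum(abs(m - mu) < standard_error for m in sample_means) / num_samples

print("mean of sample means:", round(mean_of_means, 3))
print("SD of sample means:  ", round(sd_of_means, 3))
print("sigma / sqrt(n):     ", round(standard_error, 3))
print("within one SE:       ", round(within_one_se, 3))
```

The mean of the sample means should sit close to \(\mu\), and their spread close to \(\sigma/\sqrt{n}\), even though individual observations are far from normal.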
⚠️ Common Pitfalls & Tips:
- CLT applies to the distribution of sample means, not individual observations
- Don't confuse \(\sigma\) (population SD) with \(\sigma/\sqrt{n}\) (standard error)
- Larger samples → smaller standard error → narrower confidence intervals
- CLT doesn't require population to be normal, just sample size to be large
- Rule of thumb: n ≥ 30 for most applications
📝 Worked Example 2: Applying Central Limit Theorem
Question: The time taken to complete a task has mean 45 minutes and standard deviation 12 minutes (distribution unknown). A random sample of 36 people is selected.
(a) State the distribution of the sample mean \(\bar{X}\).
(b) Find the probability that the sample mean is greater than 48 minutes.
(c) Find the probability that the sample mean is between 43 and 47 minutes.
Solution:
Given: \(\mu = 45\) minutes, \(\sigma = 12\) minutes, \(n = 36\)
(a) Distribution of Sample Mean:
Since n = 36 ≥ 30, by the Central Limit Theorem:
Mean of sampling distribution: \(E(\bar{X}) = \mu = 45\)
Standard error: \(SE = \frac{\sigma}{\sqrt{n}} = \frac{12}{\sqrt{36}} = \frac{12}{6} = 2\)
Variance: \(\frac{\sigma^2}{n} = \frac{144}{36} = 4\)
\[ \bar{X} \sim N(45, 4) \quad \text{or} \quad \bar{X} \sim N(45, 2^2) \]
(b) P(\(\bar{X}\) > 48):
Using GDC: normalcdf(48, 1×10⁹⁹, 45, 2)
Or using standardization:
\[ Z = \frac{48 - 45}{2} = \frac{3}{2} = 1.5 \]
\[ P(\bar{X} > 48) = P(Z > 1.5) = 1 - P(Z < 1.5) = 1 - 0.9332 = 0.0668 \]
Answer: 0.0668 or 6.68%
(c) P(43 < \(\bar{X}\) < 47):
Using GDC: normalcdf(43, 47, 45, 2)
\[ P(43 < \bar{X} < 47) = 0.683 \text{ or } 68.3\% \]
Interpretation: Due to CLT, even though we don't know the original distribution, we can use the normal distribution for the sample mean because n ≥ 30. There's about a 68% chance the average time for 36 people falls within 2 minutes of the population mean.
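The GDC results in parts (b) and (c) can be cross-checked with `statistics.NormalDist` from the Python standard library. This is a sketch of the same calculation, not a replacement for the GDC steps expected in an exam.

```python
from statistics import NormalDist

mu, sigma, n = 45, 12, 36
standard_error = sigma / n ** 0.5          # 12 / 6 = 2

# Approximate sampling distribution of the sample mean under the CLT.
sample_mean_dist = NormalDist(mu, standard_error)

p_above_48 = 1 - sample_mean_dist.cdf(48)
p_between_43_and_47 = sample_mean_dist.cdf(47) - sample_mean_dist.cdf(43)

print(round(p_above_48, 4))            # 0.0668
print(round(p_between_43_and_47, 3))   # 0.683
```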
Confidence Intervals
What is a Confidence Interval?
Definition: A confidence interval is a range of values that is likely to contain the true population parameter with a specified level of confidence.
Interpretation:
A 95% confidence interval means: If we repeated the sampling process many times and constructed a confidence interval each time, approximately 95% of these intervals would contain the true population parameter.
Common Misconceptions:
- WRONG: "There is a 95% probability that \(\mu\) is in this interval"
- WRONG: "95% of the data falls in this interval"
- CORRECT: "We are 95% confident that this interval contains \(\mu\)"
- CORRECT: "95% of such intervals will contain the true parameter"
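The "repeated sampling" interpretation can be illustrated by simulation. The sketch below assumes a hypothetical normal population (\(\mu = 50\), \(\sigma = 10\)), samples of size 25, and the known-\(\sigma\) interval \(\bar{x} \pm 1.96\,\sigma/\sqrt{n}\). It builds 5,000 intervals and counts how many contain \(\mu\). The parameter values are assumptions for illustration only.

```python
import math
import random
from statistics import NormalDist, fmean

random.seed(2)  # fixed seed so the run is repeatable

mu, sigma = 50.0, 10.0    # hypothetical population parameters
n = 25                    # sample size
repetitions = 5_000       # number of intervals to build

z = NormalDist().inv_cdf(0.975)        # critical value for 95% confidence
margin = z * sigma / math.sqrt(n)      # same margin for every sample (sigma known)

hits = 0
for _ in range(repetitions):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    x_bar = fmean(sample)
    if x_bar - margin < mu < x_bar + margin:
        hits += 1

coverage = hits / repetitions
print("proportion of intervals containing mu:", coverage)
```

The proportion should land near 0.95. Each individual interval either contains \(\mu\) or does not. The 95% describes the method, not any single interval.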
General Form:
\[ \text{Point Estimate} \pm \text{Margin of Error} \]
\[ \text{or} \quad \bar{x} \pm (\text{critical value}) \times SE \]
z-Interval (Population SD Known)
Use when: Population standard deviation \(\sigma\) is known, and either the population is normal or the sample is large enough for the CLT to apply.
Formula:
\[ \bar{x} - z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} \]
Where:
- \(\bar{x}\) = sample mean
- \(z_{\alpha/2}\) = critical z-value
- \(\sigma\) = population standard deviation
- \(n\) = sample size
Common Critical Values:
| Confidence Level | \(\alpha\) | \(\alpha/2\) | \(z_{\alpha/2}\) |
|---|---|---|---|
| 90% | 0.10 | 0.05 | 1.645 |
| 95% | 0.05 | 0.025 | 1.96 |
| 99% | 0.01 | 0.005 | 2.576 |
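
The critical values in the table come from the inverse normal function. As a sketch, they can be regenerated with `NormalDist.inv_cdf` from the Python standard library. The helper name `z_critical` is made up for this example.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

def z_critical(confidence_level):
    """Two-tailed critical value z_(alpha/2) for a given confidence level."""
    alpha = 1 - confidence_level
    return standard_normal.inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(level, round(z_critical(level), 3))
# 0.9 1.645
# 0.95 1.96
# 0.99 2.576
```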
t-Interval (Population SD Unknown)
Use when: Population standard deviation \(\sigma\) is unknown and must be estimated from sample.
Formula:
\[ \bar{x} - t_{\alpha/2} \cdot \frac{s_{n-1}}{\sqrt{n}} < \mu < \bar{x} + t_{\alpha/2} \cdot \frac{s_{n-1}}{\sqrt{n}} \]
Where:
- \(\bar{x}\) = sample mean
- \(t_{\alpha/2}\) = critical t-value with df = n-1
- \(s_{n-1}\) = unbiased sample standard deviation
- \(n\) = sample size
Degrees of Freedom:
\[ df = n - 1 \]
Properties of t-Distribution:
- Symmetric, bell-shaped (like normal)
- Heavier tails than normal distribution
- As df increases, approaches standard normal
- More spread out than normal for small samples
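To see the heavier tails numerically, the sketch below evaluates the t-distribution density at x = 3 for several degrees of freedom and compares it with the standard normal density at the same point. The function `t_pdf` is a hand-written helper using the standard density formula for Student's t. It is not a GDC or library function.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_coeff = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
    )
    return math.exp(log_coeff - (df + 1) / 2 * math.log1p(x * x / df))

normal_density_at_3 = NormalDist().pdf(3)

# Tail density shrinks towards the normal value as df grows.
for df in (2, 10, 100, 1000):
    print("df =", df, " density at 3:", round(t_pdf(3, df), 5))
print("standard normal density at 3:", round(normal_density_at_3, 5))
```

The density three units from the centre is largest for small df and approaches the normal value as df increases, which is why small samples need larger critical values.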
⚠️ Common Pitfalls & Tips:
- Always use GDC for confidence intervals in exams
- z-interval requires known \(\sigma\); t-interval uses sample \(s_{n-1}\)
- Higher confidence level → wider interval (more confident, less precise)
- Larger sample → narrower interval (more precise)
- GDC: ZInterval for known σ, TInterval for unknown σ
- For the same sample size and standard deviation value, the t-interval is wider than the z-interval (accounts for extra uncertainty from estimating \(\sigma\))
📝 Worked Example 3: z-Interval (Known σ)
Question: The heights of adult males are known to have a standard deviation of 8 cm. A random sample of 50 males has a mean height of 175 cm.
(a) Construct a 95% confidence interval for the population mean height.
(b) Interpret this confidence interval.
(c) If we wanted a narrower interval, what could we do?
Solution:
Given: \(\sigma = 8\) cm (known), \(\bar{x} = 175\) cm, \(n = 50\), confidence level = 95%
(a) 95% Confidence Interval:
Since \(\sigma\) is known, use z-interval.
Using GDC:
STAT → TESTS → ZInterval
Input: Stats, \(\sigma = 8\), \(\bar{x} = 175\), \(n = 50\), C-Level = 0.95
Calculate
Manual calculation:
For 95% confidence, \(z_{\alpha/2} = 1.96\)
Standard error: \(SE = \frac{\sigma}{\sqrt{n}} = \frac{8}{\sqrt{50}} = \frac{8}{7.071} = 1.131\) cm
Margin of error: \(ME = z_{\alpha/2} \times SE = 1.96 \times 1.131 = 2.217\) cm
Confidence interval:
\[ 175 - 2.217 < \mu < 175 + 2.217 \]
\[ 172.8 < \mu < 177.2 \text{ cm} \]
Or: (172.8, 177.2) cm
(b) Interpretation:
We are 95% confident that the true mean height of the population of adult males lies between 172.8 cm and 177.2 cm. If we repeated this sampling process many times, approximately 95% of the confidence intervals constructed would contain the true population mean.
(c) To narrow the interval:
- Increase sample size: Larger n reduces standard error
- Decrease confidence level: Lower confidence (e.g., 90% instead of 95%) gives narrower interval
- Reduce population variability: Not usually controllable
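The manual calculation in part (a) can be wrapped in a short function as a cross-check on the GDC output. This is a sketch using only the Python standard library. The function name `z_interval` and its parameter names are invented for this example.

```python
import math
from statistics import NormalDist

def z_interval(sample_mean, sigma, n, confidence_level):
    """Confidence interval for mu when the population SD sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    margin = z * sigma / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

lower, upper = z_interval(sample_mean=175, sigma=8, n=50, confidence_level=0.95)
print(round(lower, 1), round(upper, 1))   # 172.8 177.2
```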
📝 Worked Example 4: t-Interval (Unknown σ)
Question: A company wants to estimate the average daily production. A random sample of 10 days gives the following production numbers (in units):
120, 135, 128, 142, 115, 138, 125, 132, 140, 130
(a) Calculate the sample mean and unbiased standard deviation.
(b) Construct a 90% confidence interval for the population mean.
(c) What assumption is required for this interval to be valid?
Solution:
(a) Sample Statistics:
Using GDC: Enter data in list, use 1-Var Stats
Sample mean: \(\bar{x} = 130.5\) units
Unbiased standard deviation: \(s_{n-1} = 8.75\) units (approximately)
Sample size: \(n = 10\)
(b) 90% Confidence Interval:
Since \(\sigma\) is unknown, use t-interval.
Degrees of freedom: \(df = n - 1 = 10 - 1 = 9\)
Using GDC:
STAT → TESTS → TInterval
Input: Data (list) or Stats, C-Level = 0.90
Calculate
Manual calculation:
For 90% confidence with df = 9, \(t_{\alpha/2} = 1.833\) (from t-table or GDC)
Standard error: \(SE = \frac{s_{n-1}}{\sqrt{n}} = \frac{8.75}{\sqrt{10}} = \frac{8.75}{3.162} = 2.767\)
Margin of error: \(ME = 1.833 \times 2.767 = 5.072\)
Confidence interval:
\[ 130.5 - 5.072 < \mu < 130.5 + 5.072 \]
\[ 125.4 < \mu < 135.6 \text{ units} \]
GDC gives: (125.4, 135.6) units
(c) Assumption Required:
For a t-interval with small sample size (n = 10), we must assume that the population distribution is approximately normal. With larger samples (n ≥ 30), the Central Limit Theorem allows us to relax this assumption.
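The same interval can be rebuilt from the raw data in Python. The standard library has no t-distribution, so this sketch takes the critical value 1.833 straight from the worked solution rather than computing it. If SciPy happens to be installed, `scipy.stats.t.ppf(0.95, 9)` should give the same value, but that is an optional extra and is not used here.

```python
import math
import statistics

production = [120, 135, 128, 142, 115, 138, 125, 132, 140, 130]

n = len(production)
x_bar = statistics.mean(production)
s = statistics.stdev(production)      # unbiased estimate, divides by n - 1

# Critical value copied from the worked solution (90% confidence, df = 9).
t_critical = 1.833

standard_error = s / math.sqrt(n)
margin = t_critical * standard_error
lower, upper = x_bar - margin, x_bar + margin

print(x_bar, round(s, 2))                 # 130.5 8.75
print(round(lower, 1), round(upper, 1))   # 125.4 135.6
```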
Factors Affecting Confidence Intervals
Width of Confidence Interval
Width of Interval:
\[ \text{Width} = 2 \times \text{Margin of Error} = 2 \times z_{\alpha/2} \times \frac{\sigma}{\sqrt{n}} \]
Three Key Factors:
1. Confidence Level (C):
- Higher confidence level → larger critical value → wider interval
- 90% CI is narrower than 95% CI, which is narrower than 99% CI
- Trade-off: More confidence requires wider interval
2. Sample Size (n):
- Larger sample → smaller standard error → narrower interval
- Relationship: \(SE \propto \frac{1}{\sqrt{n}}\)
- To halve the width, need 4 times the sample size
- Most effective way to improve precision
3. Population Variability (\(\sigma\)):
- More variable population → wider interval
- Usually cannot control this factor
- Homogeneous populations give narrower intervals
Summary Table:
| To make interval... | Do this... | Trade-off |
|---|---|---|
| Narrower (more precise) | Increase n OR decrease confidence level | Cost of larger sample OR less confidence |
| Wider (more confident) | Increase confidence level | Less precise estimate |
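
The three factors can be compared side by side with the width formula. The sketch below reuses \(\sigma = 8\) and \(n = 50\) from Worked Example 3 as a baseline and changes one factor at a time. The function name `z_interval_width` is hypothetical.

```python
import math
from statistics import NormalDist

def z_interval_width(sigma, n, confidence_level):
    """Width = 2 * z_(alpha/2) * sigma / sqrt(n) for a known-sigma interval."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    return 2 * z * sigma / math.sqrt(n)

sigma = 8  # value borrowed from Worked Example 3

base = z_interval_width(sigma, n=50, confidence_level=0.95)
four_times_n = z_interval_width(sigma, n=200, confidence_level=0.95)
higher_level = z_interval_width(sigma, n=50, confidence_level=0.99)
more_variable = z_interval_width(2 * sigma, n=50, confidence_level=0.95)

print("baseline width:         ", round(base, 2))
print("4 times the sample size:", round(four_times_n, 2))   # half the baseline
print("99% instead of 95%:     ", round(higher_level, 2))   # wider
print("double the sigma:       ", round(more_variable, 2))  # twice the baseline
```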
📊 Quick Reference Summary
Unbiased Estimators
- Mean: \(\bar{x} = \frac{\sum x_i}{n}\)
- Variance: \(s_{n-1}^2 = \frac{\sum(x_i-\bar{x})^2}{n-1}\)
- Use \(n-1\) for estimation
Central Limit Theorem
- \(\bar{X} \sim N(\mu, \sigma^2/n)\)
- Rule of thumb: applies for n ≥ 30 (any n if the population is normal)
- \(SE = \sigma/\sqrt{n}\)
z-Interval (σ known)
- \(\bar{x} \pm z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}\)
- 95% CI: use z = 1.96
- Use when σ known
t-Interval (σ unknown)
- \(\bar{x} \pm t_{\alpha/2} \cdot \frac{s_{n-1}}{\sqrt{n}}\)
- df = n - 1
- Wider than z-interval for the same n and standard deviation value
🎯 Which Confidence Interval Should I Use?
START: What parameter are you estimating?
↓ Estimating population mean μ
Is population standard deviation σ known?
→ YES: Use z-Interval (ZInterval on GDC)
→ NO: Use t-Interval (TInterval on GDC)
↓ Estimating population proportion p
→ Use 1-PropZInt on GDC
🖩 Essential GDC Functions
- 1-Var Stats: Calculate \(\bar{x}\), \(s_x\) (which is \(s_{n-1}\)), \(\sigma_x\) (which is \(s_n\))
- ZInterval: STAT → TESTS → ZInterval (when σ known)
- TInterval: STAT → TESTS → TInterval (when σ unknown)
- 1-PropZInt: For confidence intervals for proportions
- Input options: Can use raw Data or Stats (summary statistics)
- Remember: GDC automatically uses \(s_{n-1}\) for t-intervals
✍️ IB Exam Strategy for Confidence Intervals
- Identify the scenario: Is σ known or unknown? This determines z or t
- State what you're doing: "Using GDC: TInterval" or "ZInterval"
- Show key values: Write \(\bar{x}\), \(s_{n-1}\) or \(\sigma\), n, confidence level
- Report the interval: Write as (lower, upper) or \(a < \mu < b\)
- Interpret in context: "We are 95% confident that the true mean [in context]..."
- For calculations by hand: Show margin of error formula explicitly
- Check reasonableness: Does the interval make sense for the data?
- Answer follow-up questions: Be ready to discuss width, confidence level effects
🚫 Top Mistakes to Avoid
- Using n instead of n-1: Always use \(s_{n-1}\) for unbiased estimates
- Wrong interval type: t when you should use z, or vice versa
- Misinterpreting CI: Don't say "μ has 95% probability of being in interval"
- Forgetting \(\sqrt{n}\): Standard error is \(\sigma/\sqrt{n}\), not \(\sigma/n\)
- Using wrong σ: Population σ for z-interval, sample \(s_{n-1}\) for t-interval
- Not checking assumptions: t-interval requires normal population for small n
- Confusing confidence level with significance level: 95% CI uses α = 0.05
- Wrong critical value: Make sure to use two-tailed value (\(\alpha/2\))
- No context in interpretation: Always relate conclusion to the problem
💬 How to Interpret Confidence Intervals
Example: 95% CI for mean height is (172, 178) cm
❌ INCORRECT Interpretations:
- "There is a 95% probability that μ is between 172 and 178"
- "95% of the population has height between 172 and 178"
- "95% of sample means fall in this interval"
- "The probability that this interval contains μ is 0.95"
✓ CORRECT Interpretations:
- "We are 95% confident that the true mean height is between 172 and 178 cm"
- "If we repeated this process many times, 95% of intervals would contain the true mean"
- "This interval was constructed using a method that captures the true mean 95% of the time"