IB Mathematics AI – Topic 4

Statistics & Probability: Estimation & Confidence Intervals

Reliability and Validity of Data Collection

Validity

Definition: Validity refers to whether the data collection method actually measures what it is intended to measure. Valid data accurately represents the concept being studied.

Types of Validity:

  • Face Validity: Does the measure appear to test what it claims to test?
  • Content Validity: Does the measure cover all aspects of the concept?
  • Construct Validity: Does the measure relate to other variables as expected?
  • External Validity: Can results be generalized to other contexts?

Threats to Validity:

  • Poor question design: Leading, ambiguous, or biased questions
  • Inappropriate sampling method: Sample doesn't represent population
  • Measurement errors: Instrument doesn't measure correctly
  • Confounding variables: Other factors influencing results

Reliability

Definition: Reliability refers to the consistency and repeatability of measurements. A reliable method produces similar results under consistent conditions.

Types of Reliability:

  • Test-Retest Reliability: Same results when repeated over time
  • Inter-rater Reliability: Different observers produce consistent results
  • Internal Consistency: Multiple items measuring same concept produce consistent results

Factors Affecting Reliability:

  • Random measurement errors
  • Environmental conditions
  • Observer/rater variability
  • Instrument calibration and precision

Key Relationship:

A measure can be reliable without being valid (consistently wrong), but cannot be valid without being reliable (inconsistent measurements cannot be accurate).

⚠️ Common Pitfalls & Tips:

  • High reliability doesn't guarantee validity – consistent but wrong
  • Sample size affects reliability – larger samples give more consistent estimates
  • Random sampling increases validity of generalization
  • Pilot studies help identify reliability and validity issues
  • Clear operational definitions improve both reliability and validity

Unbiased Estimators

Definition & Concepts

Definition: An unbiased estimator is a statistic whose expected value equals the population parameter it estimates. On average, over many samples, it produces the correct value.

Sample Mean (Unbiased Estimator of \(\mu\)):

\[ \bar{x} = \frac{\sum x_i}{n} \]

The sample mean is an unbiased estimator: \(E(\bar{x}) = \mu\)

Sample Variance (Two Formulas):

Biased Sample Variance (NOT used for estimation):

\[ s_n^2 = \frac{\sum (x_i - \bar{x})^2}{n} \]

On average, this underestimates the population variance

Unbiased Sample Variance (used for estimation):

\[ s_{n-1}^2 = \frac{\sum (x_i - \bar{x})^2}{n-1} \]

This is an unbiased estimator: \(E(s_{n-1}^2) = \sigma^2\)

Unbiased Standard Deviation:

\[ s_{n-1} = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n-1}} \]

Why divide by (n-1)?

Dividing by n produces a biased estimator that systematically underestimates the population variance. Using (n-1) corrects this bias. This is called Bessel's correction.
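
One way to see the bias concretely is to list every possible sample from a tiny population and average each variance formula over all of them. The sketch below assumes a made-up population {2, 4, 6}, samples of size 2 drawn with replacement, and a hypothetical helper called variance; it is an illustration under those assumptions, not a proof.

```python
from itertools import product

def variance(values, ddof):
    """Sum of squared deviations divided by (n - ddof)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - ddof)

# Hypothetical tiny population, chosen only for illustration.
population = [2, 4, 6]
sigma_squared = variance(population, ddof=0)  # true population variance

# Every ordered sample of size 2, drawn with replacement (9 samples).
samples = list(product(population, repeat=2))

# Average each estimator over all equally likely samples.
mean_biased = sum(variance(s, ddof=0) for s in samples) / len(samples)
mean_unbiased = sum(variance(s, ddof=1) for s in samples) / len(samples)

print(f"population variance  : {sigma_squared:.4f}")
print(f"average of s_n^2     : {mean_biased:.4f}")
print(f"average of s_(n-1)^2 : {mean_unbiased:.4f}")
```

In this setup, dividing by n averages to half the true variance (the factor (n-1)/n with n = 2), while dividing by n-1 averages to the population variance itself.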

⚠️ Common Pitfalls & Tips:

  • Critical: Use \(n-1\) in denominator for sample variance when estimating population variance
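  • Your GDC reports both: \(\sigma_x\) (divides by \(n\), population/biased; labelled \(\sigma_n\) on some models) and \(s_x\) (divides by \(n-1\), sample/unbiased; labelled \(\sigma_{n-1}\) on some models)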
  • For confidence intervals and t-tests, always use \(s_{n-1}\)
  • Larger samples reduce the difference between biased and unbiased estimators
  • IB exams typically expect unbiased estimators unless otherwise stated

📝 Worked Example 1: Calculating Unbiased Estimates

Question: The test scores of a sample of 5 students are: 72, 85, 78, 92, 83

(a) Calculate the sample mean.

(b) Calculate the unbiased estimate of the population variance.

(c) Calculate the unbiased estimate of the population standard deviation.

Solution:

(a) Sample Mean:

\[ \bar{x} = \frac{72 + 85 + 78 + 92 + 83}{5} = \frac{410}{5} = 82 \]

(b) Unbiased Variance:

First, calculate the squared deviations from the mean:

\((72-82)^2 = (-10)^2 = 100\)

\((85-82)^2 = (3)^2 = 9\)

\((78-82)^2 = (-4)^2 = 16\)

\((92-82)^2 = (10)^2 = 100\)

\((83-82)^2 = (1)^2 = 1\)

Sum of squared deviations:

\[ \sum (x_i - \bar{x})^2 = 100 + 9 + 16 + 100 + 1 = 226 \]

Unbiased variance (divide by n-1):

\[ s_{n-1}^2 = \frac{226}{5-1} = \frac{226}{4} = 56.5 \]

(c) Unbiased Standard Deviation:

\[ s_{n-1} = \sqrt{56.5} = 7.52 \text{ (3 s.f.)} \]

Using GDC: Enter data in list, use 1-Var Stats, read \(s_x\) (which is \(s_{n-1}\))
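
If a GDC is not to hand, the same values can be checked with Python's standard statistics module. This is a minimal sketch: the variable names are illustrative, and pvariance is included only to show the divide-by-n value for comparison.

```python
import statistics

scores = [72, 85, 78, 92, 83]

mean = statistics.mean(scores)              # sample mean
var_unbiased = statistics.variance(scores)  # divides by n - 1
sd_unbiased = statistics.stdev(scores)      # square root of the unbiased variance
var_biased = statistics.pvariance(scores)   # divides by n, for comparison only

print("mean:", mean)
print("unbiased variance:", var_unbiased)
print("unbiased SD (3 s.f.):", round(sd_unbiased, 2))
print("divide-by-n variance:", var_biased)
```

Here statistics.variance and statistics.stdev play the role of \(s_{n-1}^2\) and \(s_{n-1}\), while statistics.pvariance corresponds to \(s_n^2\).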

Central Limit Theorem (CLT)

The Theorem

Statement: The Central Limit Theorem states that the distribution of sample means approaches a normal distribution as the sample size increases, regardless of the shape of the population distribution.

Formal Statement:

If \(X_1, X_2, \ldots, X_n\) is a random sample from a population with mean \(\mu\) and standard deviation \(\sigma\), then for large n:

\[ \bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right) \quad \text{approximately} \]

Key Properties:

  • Mean of sampling distribution: \(E(\bar{X}) = \mu\)
  • Variance of sampling distribution: \(\text{Var}(\bar{X}) = \frac{\sigma^2}{n}\)
  • Standard error: \(SE(\bar{X}) = \frac{\sigma}{\sqrt{n}}\)

Standardization:

\[ Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \sim N(0, 1) \]

When Does CLT Apply?

  • Sample size guideline: Generally n ≥ 30 is sufficient
  • If population is already normal, \(\bar{X}\) is exactly normal for any n (the CLT is not needed)
  • If population is moderately skewed, n ≥ 30 usually sufficient
  • If population is heavily skewed, may need n ≥ 100

Implications:

  • Allows use of normal distribution methods even when population is not normal
  • Justifies confidence intervals and hypothesis tests for large samples
  • As n increases, sampling distribution becomes more normal
  • As n increases, standard error decreases (more precise estimates)
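
The properties above can be explored with a short simulation. The sketch below assumes a right-skewed exponential population with mean 1 and standard deviation 1 (an arbitrary choice for illustration), draws many samples of size 36, and compares the mean and spread of the sample means with \(\mu\) and \(\sigma/\sqrt{n}\).

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

n = 36          # size of each sample
trials = 5000   # number of samples drawn

# Assumed population: exponential with rate 1 (mean 1, SD 1, right-skewed).
sample_means = []
for _ in range(trials):
    sample = [random.expovariate(1.0) for _ in range(n)]
    sample_means.append(statistics.mean(sample))

mean_of_means = statistics.mean(sample_means)
sd_of_means = statistics.stdev(sample_means)
theoretical_se = 1 / n ** 0.5   # sigma / sqrt(n) with sigma = 1

print(f"mean of sample means : {mean_of_means:.3f}")
print(f"SD of sample means   : {sd_of_means:.3f}")
print(f"sigma / sqrt(n)      : {theoretical_se:.3f}")
```

The first value should come out close to 1 and the second close to 1/6 ≈ 0.167, even though the individual observations are far from normal.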

⚠️ Common Pitfalls & Tips:

  • CLT applies to the distribution of sample means, not individual observations
  • Don't confuse \(\sigma\) (population SD) with \(\sigma/\sqrt{n}\) (standard error)
  • Larger samples → smaller standard error → narrower confidence intervals
  • CLT doesn't require population to be normal, just sample size to be large
  • Rule of thumb: n ≥ 30 for most applications

📝 Worked Example 2: Applying Central Limit Theorem

Question: The time taken to complete a task has mean 45 minutes and standard deviation 12 minutes (distribution unknown). A random sample of 36 people is selected.

(a) State the distribution of the sample mean \(\bar{X}\).

(b) Find the probability that the sample mean is greater than 48 minutes.

(c) Find the probability that the sample mean is between 43 and 47 minutes.

Solution:

Given: \(\mu = 45\) minutes, \(\sigma = 12\) minutes, \(n = 36\)

(a) Distribution of Sample Mean:

Since n = 36 ≥ 30, by the Central Limit Theorem:

Mean of sampling distribution: \(E(\bar{X}) = \mu = 45\)

Standard error: \(SE = \frac{\sigma}{\sqrt{n}} = \frac{12}{\sqrt{36}} = \frac{12}{6} = 2\)

Variance: \(\frac{\sigma^2}{n} = \frac{144}{36} = 4\)

\[ \bar{X} \sim N(45, 4) \quad \text{or} \quad \bar{X} \sim N(45, 2^2) \]

(b) P(\(\bar{X}\) > 48):

Using GDC: normalcdf(48, 1×10⁹⁹, 45, 2)

Or using standardization:

\[ Z = \frac{48 - 45}{2} = \frac{3}{2} = 1.5 \]

\[ P(\bar{X} > 48) = P(Z > 1.5) = 1 - P(Z < 1.5) = 1 - 0.9332 = 0.0668 \]

Answer: 0.0668 or 6.68%

(c) P(43 < \(\bar{X}\) < 47):

Using GDC: normalcdf(43, 47, 45, 2)

\[ P(43 < \bar{X} < 47) = 0.683 \text{ or } 68.3\% \]

Interpretation: Due to CLT, even though we don't know the original distribution, we can use the normal distribution for the sample mean because n ≥ 30. There's about a 68% chance the average time for 36 people falls within 2 minutes of the population mean.
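
The two probabilities can also be checked outside the GDC. This is a minimal sketch using NormalDist from Python's standard library (Python 3.8 or later); the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma, n = 45, 12, 36
standard_error = sigma / n ** 0.5   # 12 / 6 = 2

# Approximate sampling distribution of the sample mean (by the CLT).
x_bar = NormalDist(mu=mu, sigma=standard_error)

p_greater_48 = 1 - x_bar.cdf(48)            # part (b)
p_between = x_bar.cdf(47) - x_bar.cdf(43)   # part (c)

print(round(p_greater_48, 4), round(p_between, 3))
```

Note that NormalDist takes the standard deviation (here the standard error, 2), not the variance, just like the last argument of normalcdf.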

Confidence Intervals

What is a Confidence Interval?

Definition: A confidence interval is a range of values that is likely to contain the true population parameter with a specified level of confidence.

Interpretation:

A 95% confidence interval means: If we repeated the sampling process many times and constructed a confidence interval each time, approximately 95% of these intervals would contain the true population parameter.

Common Misconceptions:

  • WRONG: "There is a 95% probability that \(\mu\) is in this interval"
  • WRONG: "95% of the data falls in this interval"
  • CORRECT: "We are 95% confident that this interval contains \(\mu\)"
  • CORRECT: "95% of such intervals will contain the true parameter"
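
The "95% of such intervals" reading can be illustrated by simulation. The sketch below assumes a hypothetical normal population with \(\mu = 50\) and \(\sigma = 10\), builds a 95% z-interval from each of 2000 samples of size 25, and counts how many intervals contain \(\mu\). All numbers are made up for illustration.

```python
import random
from statistics import NormalDist, mean

random.seed(2)  # fixed seed so the run is repeatable

mu, sigma, n = 50, 10, 25          # hypothetical population and sample size
z = NormalDist().inv_cdf(0.975)    # two-tailed critical value for 95%
margin = z * sigma / n ** 0.5      # same margin for every sample (sigma known)
trials = 2000

hits = 0
for _ in range(trials):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    x_bar = mean(sample)
    # Does this particular interval contain the true mean?
    if x_bar - margin < mu < x_bar + margin:
        hits += 1

coverage = hits / trials
print(f"proportion of intervals containing mu: {coverage:.3f}")
```

The proportion should land near 0.95. Each individual interval either contains \(\mu\) or it does not; the 95% describes the long-run success rate of the method.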

General Form:

\[ \text{Point Estimate} \pm \text{Margin of Error} \]

\[ \text{or} \quad \bar{x} \pm (\text{critical value}) \times SE \]

z-Interval (Population SD Known)

Use when: Population standard deviation \(\sigma\) is known, and either the population is normal or the sample is large enough for the CLT to apply.

Formula:

\[ \bar{x} - z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} \]

Where:

  • \(\bar{x}\) = sample mean
  • \(z_{\alpha/2}\) = critical z-value
  • \(\sigma\) = population standard deviation
  • \(n\) = sample size

Common Critical Values:

| Confidence Level | \(\alpha\) | \(\alpha/2\) | \(z_{\alpha/2}\) |
|---|---|---|---|
| 90% | 0.10 | 0.05 | 1.645 |
| 95% | 0.05 | 0.025 | 1.96 |
| 99% | 0.01 | 0.005 | 2.576 |
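
These critical values come from the inverse normal function: \(z_{\alpha/2}\) is the value with area \(1 - \alpha/2\) to its left under the standard normal curve. A minimal sketch using Python's standard library (the helper name z_critical is illustrative):

```python
from statistics import NormalDist

def z_critical(confidence_level):
    """Two-tailed critical value: area 1 - alpha/2 lies to its left."""
    alpha = 1 - confidence_level
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z = {z_critical(level):.3f}")
```

On a GDC the equivalent is the inverse normal function with area \(1 - \alpha/2\), mean 0 and standard deviation 1.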

t-Interval (Population SD Unknown)

Use when: Population standard deviation \(\sigma\) is unknown and must be estimated from sample.

Formula:

\[ \bar{x} - t_{\alpha/2} \cdot \frac{s_{n-1}}{\sqrt{n}} < \mu < \bar{x} + t_{\alpha/2} \cdot \frac{s_{n-1}}{\sqrt{n}} \]

Where:

  • \(\bar{x}\) = sample mean
  • \(t_{\alpha/2}\) = critical t-value with df = n-1
  • \(s_{n-1}\) = unbiased sample standard deviation
  • \(n\) = sample size

Degrees of Freedom:

\[ df = n - 1 \]

Properties of t-Distribution:

  • Symmetric, bell-shaped (like normal)
  • Heavier tails than normal distribution
  • As df increases, approaches standard normal
  • More spread out than normal for small samples

⚠️ Common Pitfalls & Tips:

  • Always use GDC for confidence intervals in exams
  • z-interval requires known \(\sigma\); t-interval uses sample \(s_{n-1}\)
  • Higher confidence level → wider interval (more confident, less precise)
  • Larger sample → narrower interval (more precise)
  • GDC: ZInterval for known σ, TInterval for unknown σ
  • For the same \(n\) and the same standard deviation value, the t-interval is wider than the z-interval (it accounts for the extra uncertainty in estimating \(\sigma\))

📝 Worked Example 3: z-Interval (Known σ)

Question: The heights of adult males are known to have a standard deviation of 8 cm. A random sample of 50 males has a mean height of 175 cm.

(a) Construct a 95% confidence interval for the population mean height.

(b) Interpret this confidence interval.

(c) If we wanted a narrower interval, what could we do?

Solution:

Given: \(\sigma = 8\) cm (known), \(\bar{x} = 175\) cm, \(n = 50\), confidence level = 95%

(a) 95% Confidence Interval:

Since \(\sigma\) is known, use z-interval.

Using GDC:

STAT → TESTS → ZInterval

Input: Stats, \(\sigma = 8\), \(\bar{x} = 175\), \(n = 50\), C-Level = 0.95

Calculate

Manual calculation:

For 95% confidence, \(z_{\alpha/2} = 1.96\)

Standard error: \(SE = \frac{\sigma}{\sqrt{n}} = \frac{8}{\sqrt{50}} = \frac{8}{7.071} = 1.131\) cm

Margin of error: \(ME = z_{\alpha/2} \times SE = 1.96 \times 1.131 = 2.217\) cm

Confidence interval:

\[ 175 - 2.217 < \mu < 175 + 2.217 \]

\[ 172.8 < \mu < 177.2 \text{ cm} \]

Or: (172.8, 177.2) cm

(b) Interpretation:

We are 95% confident that the true mean height of the population of adult males lies between 172.8 cm and 177.2 cm. If we repeated this sampling process many times, approximately 95% of the confidence intervals constructed would contain the true population mean.

(c) To narrow the interval:

  • Increase sample size: Larger n reduces standard error
  • Decrease confidence level: Lower confidence (e.g., 90% instead of 95%) gives narrower interval
  • Reduce population variability: Not usually controllable
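
The manual calculation in part (a), and the effect of lowering the confidence level in part (c), can be checked with a short function. This is a sketch only; z_interval is a hypothetical helper name, not a GDC or library function.

```python
from statistics import NormalDist

def z_interval(x_bar, sigma, n, confidence_level):
    """Confidence interval for the mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    margin = z * sigma / n ** 0.5
    return x_bar - margin, x_bar + margin

# Part (a): 95% interval for the height data.
lower, upper = z_interval(175, 8, 50, 0.95)
print(f"95% CI: ({lower:.1f}, {upper:.1f}) cm")

# Part (c): a 90% interval from the same data is narrower.
lower_90, upper_90 = z_interval(175, 8, 50, 0.90)
print(f"90% CI: ({lower_90:.1f}, {upper_90:.1f}) cm")
```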

📝 Worked Example 4: t-Interval (Unknown σ)

Question: A company wants to estimate the average daily production. A random sample of 10 days gives the following production numbers (in units):

120, 135, 128, 142, 115, 138, 125, 132, 140, 130

(a) Calculate the sample mean and unbiased standard deviation.

(b) Construct a 90% confidence interval for the population mean.

(c) What assumption is required for this interval to be valid?

Solution:

(a) Sample Statistics:

Using GDC: Enter data in list, use 1-Var Stats

Sample mean: \(\bar{x} = 130.5\) units

Unbiased standard deviation: \(s_{n-1} = 8.75\) units (approximately)

Sample size: \(n = 10\)

(b) 90% Confidence Interval:

Since \(\sigma\) is unknown, use t-interval.

Degrees of freedom: \(df = n - 1 = 10 - 1 = 9\)

Using GDC:

STAT → TESTS → TInterval

Input: Data (list) or Stats, C-Level = 0.90

Calculate

Manual calculation:

For 90% confidence with df = 9, \(t_{\alpha/2} = 1.833\) (from t-table or GDC)

Standard error: \(SE = \frac{s_{n-1}}{\sqrt{n}} = \frac{8.75}{\sqrt{10}} = \frac{8.75}{3.162} = 2.767\)

Margin of error: \(ME = 1.833 \times 2.767 = 5.072\)

Confidence interval:

\[ 130.5 - 5.072 < \mu < 130.5 + 5.072 \]

\[ 125.4 < \mu < 135.6 \text{ units} \]

GDC gives: (125.4, 135.6) units

(c) Assumption Required:

For a t-interval with small sample size (n = 10), we must assume that the population distribution is approximately normal. With larger samples (n ≥ 30), the Central Limit Theorem allows us to relax this assumption.
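
The manual calculation in parts (a) and (b) can be reproduced as follows. Python's standard library has no t-distribution, so this sketch takes the critical value 1.833 directly from the worked example rather than computing it; the variable names are illustrative.

```python
import statistics

production = [120, 135, 128, 142, 115, 138, 125, 132, 140, 130]

n = len(production)
x_bar = statistics.mean(production)
s = statistics.stdev(production)   # unbiased SD (divides by n - 1)

# Critical value for 90% confidence with df = 9, copied from the
# worked example above (not computed here).
t_crit = 1.833

margin = t_crit * s / n ** 0.5
lower, upper = x_bar - margin, x_bar + margin

print(f"mean = {x_bar}, s = {s:.2f}")
print(f"90% CI: ({lower:.1f}, {upper:.1f}) units")
```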

Factors Affecting Confidence Intervals

Width of Confidence Interval

Width of Interval:

\[ \text{Width} = 2 \times \text{Margin of Error} = 2 \times z_{\alpha/2} \times \frac{\sigma}{\sqrt{n}} \]

Three Key Factors:

1. Confidence Level (C):

  • Higher confidence level → larger critical value → wider interval
  • 90% CI is narrower than 95% CI, which is narrower than 99% CI
  • Trade-off: More confidence requires wider interval

2. Sample Size (n):

  • Larger sample → smaller standard error → narrower interval
  • Relationship: \(SE \propto \frac{1}{\sqrt{n}}\)
  • To halve the width, need 4 times the sample size
  • Most effective way to improve precision

3. Population Variability (\(\sigma\)):

  • More variable population → wider interval
  • Usually cannot control this factor
  • Homogeneous populations give narrower intervals

Summary Table:

| To make interval... | Do this... | Trade-off |
|---|---|---|
| Narrower (more precise) | Increase n OR decrease confidence level | Cost of larger sample OR less confidence |
| Wider (more confident) | Increase confidence level | Less precise estimate |
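
The effect of each factor on the width can be seen by evaluating the width formula directly. The sketch below reuses \(\sigma = 8\) and \(n = 50\) from Worked Example 3 purely as sample inputs; interval_width is a hypothetical helper name.

```python
def interval_width(z, sigma, n):
    """Width of a z-interval: 2 * z * sigma / sqrt(n)."""
    return 2 * z * sigma / n ** 0.5

# Effect of sample size (95% confidence, sigma = 8).
width_50 = interval_width(1.96, 8, 50)
width_200 = interval_width(1.96, 8, 200)   # four times the sample size
print(f"n = 50 : width = {width_50:.2f}")
print(f"n = 200: width = {width_200:.2f}")

# Effect of confidence level (n = 50, sigma = 8).
for level, z in (("90%", 1.645), ("95%", 1.96), ("99%", 2.576)):
    print(f"{level}: width = {interval_width(z, 8, 50):.2f}")
```

Quadrupling n halves the width because the width is proportional to \(1/\sqrt{n}\).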

📊 Quick Reference Summary

Unbiased Estimators

  • Mean: \(\bar{x} = \frac{\sum x_i}{n}\)
  • Variance: \(s_{n-1}^2 = \frac{\sum(x_i-\bar{x})^2}{n-1}\)
  • Use \(n-1\) for estimation

Central Limit Theorem

  • \(\bar{X} \sim N(\mu, \sigma^2/n)\)
  • Applies for n ≥ 30
  • \(SE = \sigma/\sqrt{n}\)

z-Interval (σ known)

  • \(\bar{x} \pm z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}\)
  • 95% CI: use z = 1.96
  • Use when σ known

t-Interval (σ unknown)

  • \(\bar{x} \pm t_{\alpha/2} \cdot \frac{s_{n-1}}{\sqrt{n}}\)
  • df = n - 1
  • Wider than z-interval

🎯 Which Confidence Interval Should I Use?

START: What parameter are you estimating?

↓ Estimating population mean μ

Is population standard deviation σ known?

→ YES: Use z-Interval (ZInterval on GDC)

→ NO: Use t-Interval (TInterval on GDC)

↓ Estimating population proportion p

→ Use 1-PropZInt on GDC
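
The decision path above is small enough to write as a function. This is only a sketch of the flowchart: choose_interval is a hypothetical helper, and the strings it returns are the TI-style GDC menu names used in these notes.

```python
def choose_interval(parameter, sigma_known=False):
    """Return the GDC function suggested by the flowchart."""
    if parameter == "proportion":
        return "1-PropZInt"
    if parameter == "mean":
        # Known sigma -> z-interval; otherwise estimate with s_(n-1) -> t-interval.
        return "ZInterval" if sigma_known else "TInterval"
    raise ValueError("parameter must be 'mean' or 'proportion'")

print(choose_interval("mean", sigma_known=True))
print(choose_interval("mean", sigma_known=False))
print(choose_interval("proportion"))
```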

🖩 Essential GDC Functions

  • 1-Var Stats: Calculate \(\bar{x}\), \(s_x\) (which is \(s_{n-1}\)), \(\sigma_x\) (which is \(s_n\))
  • ZInterval: STAT → TESTS → ZInterval (when σ known)
  • TInterval: STAT → TESTS → TInterval (when σ unknown)
  • 1-PropZInt: For confidence intervals for proportions
  • Input options: Can use raw Data or Stats (summary statistics)
  • Remember: GDC automatically uses \(s_{n-1}\) for t-intervals

✍️ IB Exam Strategy for Confidence Intervals

  1. Identify the scenario: Is σ known or unknown? This determines z or t
  2. State what you're doing: "Using GDC: TInterval" or "ZInterval"
  3. Show key values: Write \(\bar{x}\), \(s_{n-1}\) or \(\sigma\), n, confidence level
  4. Report the interval: Write as (lower, upper) or \(a < \mu < b\)
  5. Interpret in context: "We are 95% confident that the true mean [in context]..."
  6. For calculations by hand: Show margin of error formula explicitly
  7. Check reasonableness: Does the interval make sense for the data?
  8. Answer follow-up questions: Be ready to discuss width, confidence level effects

🚫 Top Mistakes to Avoid

  1. Using n instead of n-1: Always use \(s_{n-1}\) for unbiased estimates
  2. Wrong interval type: t when you should use z, or vice versa
  3. Misinterpreting CI: Don't say "μ has 95% probability of being in interval"
  4. Forgetting \(\sqrt{n}\): Standard error is \(\sigma/\sqrt{n}\), not \(\sigma/n\)
  5. Using wrong σ: Population σ for z-interval, sample \(s_{n-1}\) for t-interval
  6. Not checking assumptions: t-interval requires normal population for small n
  7. Confusing confidence level with significance level: 95% CI uses α = 0.05
  8. Wrong critical value: Make sure to use two-tailed value (\(\alpha/2\))
  9. No context in interpretation: Always relate conclusion to the problem

💬 How to Interpret Confidence Intervals

Example: 95% CI for mean height is (172, 178) cm

❌ INCORRECT Interpretations:

  • "There is a 95% probability that μ is between 172 and 178"
  • "95% of the population has height between 172 and 178"
  • "95% of sample means fall in this interval"
  • "The probability that this interval contains μ is 0.95"

✓ CORRECT Interpretations:

  • "We are 95% confident that the true mean height is between 172 and 178 cm"
  • "If we repeated this process many times, 95% of intervals would contain the true mean"
  • "This interval was constructed using a method that captures the true mean 95% of the time"