IB Mathematics AI – Topic 4
Statistics & Probability: Probability
Basic Probability Concepts
Fundamental Definitions
Key Terms:
- Experiment: An action or process that produces outcomes
- Outcome: A possible result of an experiment
- Sample Space (U or S): The set of all possible outcomes
- Event (A, B, C, ...): A subset of the sample space
- Trial: A single performance of an experiment
Basic Probability Formula:
\[ P(A) = \frac{n(A)}{n(U)} = \frac{\text{Number of favorable outcomes}}{\text{Total number of possible outcomes}} \]
This formula applies only when all outcomes are equally likely.
Properties of Probability:
- \(0 \leq P(A) \leq 1\) for any event A
- \(P(U) = 1\) (certainty)
- \(P(\emptyset) = 0\) (impossibility)
- Complementary Events: \(P(A') = 1 - P(A)\) where \(A'\) means "not A"
Expected Number of Occurrences:
\[ \text{Expected number} = n \times P(A) \]
where n is the number of trials
⚠️ Common Pitfalls & Tips:
- Probabilities must always be between 0 and 1 (or 0% and 100%)
- Express answers as fractions, decimals, or percentages as requested
- Remember: \(P(A) + P(A') = 1\) always
- Check if outcomes are equally likely before using the basic formula
- Be careful with "at least" problems – often easier to use complement
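As a small illustration of the complement approach (the scenario is an assumed example, and it relies on the multiplication rule for independent events covered later in these notes): the probability of at least one six in four rolls of a fair die is
\[ P(\text{at least one six}) = 1 - P(\text{no sixes}) = 1 - \left(\frac{5}{6}\right)^4 = 1 - \frac{625}{1296} = \frac{671}{1296} \approx 0.518 \]
Counting the "at least one" cases directly would mean adding the probabilities of exactly 1, 2, 3 and 4 sixes, which is far more work.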
📝 Worked Example 1: Basic Probability
Question: A bag contains 5 red balls, 3 blue balls, and 2 green balls. A ball is selected at random.
(a) Find the probability that the ball is red.
(b) Find the probability that the ball is NOT blue.
(c) If 50 balls are selected with replacement, how many would you expect to be green?
Solution:
Setting up:
Total number of balls = 5 + 3 + 2 = 10
Sample space: \(n(U) = 10\)
(a) Probability of red:
Number of red balls = 5
\[ P(\text{Red}) = \frac{5}{10} = \frac{1}{2} = 0.5 \text{ or } 50\% \]
(b) Probability NOT blue:
Method 1 – Using complement:
\(P(\text{Blue}) = \frac{3}{10}\)
\[ P(\text{Not Blue}) = 1 - P(\text{Blue}) = 1 - \frac{3}{10} = \frac{7}{10} = 0.7 \]
Method 2 – Direct counting:
Not blue means red or green = 5 + 2 = 7 balls
\[ P(\text{Not Blue}) = \frac{7}{10} = 0.7 \text{ or } 70\% \]
(c) Expected number of green balls:
\(P(\text{Green}) = \frac{2}{10} = \frac{1}{5}\)
\[ \text{Expected number} = 50 \times \frac{1}{5} = 10 \text{ green balls} \]
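The calculations in Worked Example 1 can be checked with a short script. This is only a sketch: the names `bag` and `prob` are illustrative, and exact fractions are used so the results match the hand working.

```python
from fractions import Fraction

# Counts from Worked Example 1; the dictionary name is illustrative.
bag = {"red": 5, "blue": 3, "green": 2}
total = sum(bag.values())  # n(U) = 10


def prob(color):
    """P(color) for one random draw, assuming every ball is equally likely."""
    return Fraction(bag[color], total)


p_red = prob("red")                   # (a) favorable / total
p_not_blue = 1 - prob("blue")         # (b) complement rule
expected_green = 50 * prob("green")   # (c) n x P(A)

print(p_red, p_not_blue, expected_green)  # 1/2 7/10 10
```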
Venn Diagrams
Combined Events & Set Operations
Definition: Venn diagrams use overlapping circles to represent events and their relationships, showing all possible combinations visually.
Set Operations & Probability Rules:
1. Union (A ∪ B): "A or B or both"
\[ P(A \cup B) = P(A) + P(B) - P(A \cap B) \]
2. Intersection (A ∩ B): "Both A and B"
The overlap region in a Venn diagram
3. Complement (A'): "Not A"
\[ P(A') = 1 - P(A) \]
4. Mutually Exclusive Events:
Events that cannot occur together: \(P(A \cap B) = 0\)
For mutually exclusive events:
\[ P(A \cup B) = P(A) + P(B) \]
⚠️ Common Pitfalls & Tips:
- Always subtract the intersection when using the addition rule to avoid double-counting
- The probability of the "only A" region is \(P(A) - P(A \cap B)\)
- Check that all probabilities in your Venn diagram sum to 1
- Start with the intersection when filling in a Venn diagram from given information
- "A or B" means A ∪ B (includes both); "A or B but not both" excludes intersection
📝 Worked Example 2: Venn Diagrams
Question: In a class of 30 students:
- 18 students study French (F)
- 15 students study Spanish (S)
- 8 students study both French and Spanish
(a) Draw a Venn diagram to represent this information.
(b) Find the probability that a randomly selected student studies French or Spanish.
(c) Find the probability that a randomly selected student studies neither language.
Solution:
(a) Venn Diagram:
Step 1: Start with the intersection (both languages):
\(n(F \cap S) = 8\)
Step 2: Find "only French":
Only F = Total F - Both = 18 - 8 = 10 students
Step 3: Find "only Spanish":
Only S = Total S - Both = 15 - 8 = 7 students
Step 4: Find "neither":
Neither = Total - (Only F + Both + Only S) = 30 - (10 + 8 + 7) = 5 students
Venn Diagram Structure:
Outside both circles: 5
Circle F only: 10 | Overlap: 8 | Circle S only: 7
[Rectangle represents total: 30 students]
[Left circle = French, Right circle = Spanish]
(b) P(French or Spanish):
Method 1 – Using the formula:
\[ P(F \cup S) = P(F) + P(S) - P(F \cap S) \]
\[ P(F \cup S) = \frac{18}{30} + \frac{15}{30} - \frac{8}{30} = \frac{25}{30} = \frac{5}{6} \approx 0.833 \]
Method 2 – Direct counting:
Students studying at least one language = 10 + 8 + 7 = 25
\[ P(F \cup S) = \frac{25}{30} = \frac{5}{6} \approx 0.833 \text{ or } 83.3\% \]
(c) P(Neither language):
Method 1 – Using complement:
\[ P(\text{Neither}) = 1 - P(F \cup S) = 1 - \frac{5}{6} = \frac{1}{6} \approx 0.167 \]
Method 2 – Direct:
\[ P(\text{Neither}) = \frac{5}{30} = \frac{1}{6} \text{ or } 16.7\% \]
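The region-by-region method from part (a) can be sketched in code as a check. The function name `venn_regions` and its parameter names are hypothetical, chosen for this illustration only.

```python
from fractions import Fraction


def venn_regions(total, n_a, n_b, n_both):
    """Split a two-set Venn diagram into its four regions (as counts)."""
    only_a = n_a - n_both          # start from the intersection and work outward
    only_b = n_b - n_both
    neither = total - (only_a + n_both + only_b)
    return {"only_a": only_a, "both": n_both, "only_b": only_b, "neither": neither}


# Worked Example 2: 30 students, 18 French, 15 Spanish, 8 both.
regions = venn_regions(total=30, n_a=18, n_b=15, n_both=8)
p_union = Fraction(30 - regions["neither"], 30)
p_neither = Fraction(regions["neither"], 30)

print(regions)
print(p_union, p_neither)  # 5/6 1/6
```

A negative value in any region would signal inconsistent input data, which is a useful sanity check when filling in a diagram by hand.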
Tree Diagrams & Sample Space Diagrams
Tree Diagrams
Definition: A tree diagram shows all possible outcomes of sequential events, with branches representing different possibilities and their probabilities.
How to Use Tree Diagrams:
- Draw branches for each outcome at each stage
- Label each branch with the outcome and its probability
- Multiply along branches to find probability of a sequence
- Add the probabilities of different paths that lead to the same event
Key Rules:
- Probabilities on branches from same point must sum to 1
- For independent events, second-stage probabilities don't change
- For dependent events (without replacement), second-stage probabilities change
⚠️ Common Pitfalls & Tips:
- Remember: multiply along branches, add across different paths
- For "without replacement," probabilities change on second draw
- Check all probabilities at each stage sum to 1
- Label all branches clearly to avoid confusion
- Tree diagrams work best for 2-3 stages; become unwieldy beyond that
Sample Space Diagrams (Tables of Outcomes)
Definition: A systematic way to list all possible outcomes, often using a table or grid. Particularly useful for two simultaneous events.
When to Use:
- Rolling two dice
- Selecting two items simultaneously
- Any situation with two independent events occurring together
Each cell in the table represents one outcome, making it easy to count favorable outcomes.
📝 Worked Example 3: Tree Diagram (Without Replacement)
Question: A box contains 4 red marbles and 3 blue marbles. Two marbles are drawn at random without replacement.
(a) Draw a tree diagram showing all possible outcomes.
(b) Find the probability that both marbles are red.
(c) Find the probability that the marbles are different colors.
Solution:
(a) Tree Diagram:
Initial: 4 red, 3 blue (total = 7 marbles)
First Draw → Second Draw
Red (4/7)  ─┬─ Red (3/6)  → RR: (4/7) × (3/6)
            └─ Blue (3/6) → RB: (4/7) × (3/6)
Blue (3/7) ─┬─ Red (4/6)  → BR: (3/7) × (4/6)
            └─ Blue (2/6) → BB: (3/7) × (2/6)
Explanation of probabilities:
- First draw: P(Red) = 4/7, P(Blue) = 3/7
- If 1st is Red: 3 red and 3 blue remain (6 total) → P(2nd Red) = 3/6
- If 1st is Blue: 4 red and 2 blue remain (6 total) → P(2nd Blue) = 2/6
(b) P(Both red):
Follow the RR path and multiply:
\[ P(\text{RR}) = \frac{4}{7} \times \frac{3}{6} = \frac{12}{42} = \frac{2}{7} \approx 0.286 \text{ or } 28.6\% \]
(c) P(Different colors):
Different colors means RB or BR. Add these paths:
\[ P(\text{RB}) = \frac{4}{7} \times \frac{3}{6} = \frac{12}{42} = \frac{2}{7} \]
\[ P(\text{BR}) = \frac{3}{7} \times \frac{4}{6} = \frac{12}{42} = \frac{2}{7} \]
\[ P(\text{Different}) = \frac{2}{7} + \frac{2}{7} = \frac{4}{7} \approx 0.571 \text{ or } 57.1\% \]
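The "multiply along branches, add across paths" rule can be sketched as follows. The helper `two_draw_tree` is a hypothetical name; it assumes exactly two draws without replacement.

```python
from fractions import Fraction


def two_draw_tree(counts):
    """Return {(first, second): probability} for two draws without replacement."""
    total = sum(counts.values())
    tree = {}
    for first, n_first in counts.items():
        p_first = Fraction(n_first, total)
        remaining = dict(counts)
        remaining[first] -= 1                  # one marble of this color is gone
        for second, n_second in remaining.items():
            p_second = Fraction(n_second, total - 1)
            tree[(first, second)] = p_first * p_second   # multiply along the branch
    return tree


tree = two_draw_tree({"R": 4, "B": 3})
p_both_red = tree[("R", "R")]
p_different = tree[("R", "B")] + tree[("B", "R")]        # add across paths

print(p_both_red, p_different)  # 2/7 4/7
```

The four path probabilities sum to 1, which mirrors the check recommended for hand-drawn trees.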
📝 Worked Example 4: Sample Space Diagram
Question: Two fair six-sided dice are rolled. Find the probability that:
(a) The sum of the scores is 8.
(b) The product of the scores is greater than 20.
Solution:
Sample Space (Sum of two dice):
| Die 1 \ Die 2 | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| 1 | 2 | 3 | 4 | 5 | 6 | 7 |
| 2 | 3 | 4 | 5 | 6 | 7 | 8 |
| 3 | 4 | 5 | 6 | 7 | 8 | 9 |
| 4 | 5 | 6 | 7 | 8 | 9 | 10 |
| 5 | 6 | 7 | 8 | 9 | 10 | 11 |
| 6 | 7 | 8 | 9 | 10 | 11 | 12 |
(a) P(Sum = 8):
Favorable outcomes (the cells equal to 8 in the table): (2,6), (3,5), (4,4), (5,3), (6,2)
Total outcomes = 6 × 6 = 36
\[ P(\text{Sum} = 8) = \frac{5}{36} \approx 0.139 \text{ or } 13.9\% \]
(b) P(Product > 20):
Need to find outcomes where \(\text{Die}_1 \times \text{Die}_2 > 20\)
Favorable outcomes:
- (4,6): product = 24 ✓
- (5,5): product = 25 ✓
- (5,6): product = 30 ✓
- (6,4): product = 24 ✓
- (6,5): product = 30 ✓
- (6,6): product = 36 ✓
Total favorable outcomes = 6
\[ P(\text{Product} > 20) = \frac{6}{36} = \frac{1}{6} \approx 0.167 \text{ or } 16.7\% \]
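A sample space diagram is an enumeration of equally likely outcomes, so both answers can be confirmed by listing all 36 ordered pairs. This is a minimal sketch; `prob` is an illustrative helper, not a standard function.

```python
from fractions import Fraction
from itertools import product

# All 36 ordered pairs (die 1, die 2), each equally likely for fair dice.
outcomes = list(product(range(1, 7), repeat=2))


def prob(event):
    """P(event), where event is a function that returns True for favorable outcomes."""
    favorable = [o for o in outcomes if event(o)]
    return Fraction(len(favorable), len(outcomes))


p_sum_8 = prob(lambda o: o[0] + o[1] == 8)
p_product_gt_20 = prob(lambda o: o[0] * o[1] > 20)

print(p_sum_8, p_product_gt_20)  # 5/36 1/6
```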
Conditional Probability & Independent Events
Conditional Probability
Definition: The probability of event A occurring given that event B has already occurred, denoted \(P(A|B)\) and read as "probability of A given B".
Conditional Probability Formula:
\[ P(A|B) = \frac{P(A \cap B)}{P(B)} \]
where \(P(B) > 0\)
Rearranged (useful for finding intersection):
\[ P(A \cap B) = P(A|B) \times P(B) = P(B|A) \times P(A) \]
Interpretation:
\(P(A|B)\) restricts the sample space to only those outcomes where B occurs, then finds the proportion of those where A also occurs.
Independent Events
Definition: Two events A and B are independent if the occurrence of one does not affect the probability of the other.
Test for Independence:
Events A and B are independent if and only if:
\[ P(A \cap B) = P(A) \times P(B) \]
Equivalent conditions (valid when the conditioning event has non-zero probability):
- \(P(A|B) = P(A)\)
- \(P(B|A) = P(B)\)
Important Distinction:
- Independent: One event doesn't affect the other (e.g., two coin flips)
- Mutually Exclusive: Events cannot both occur (\(P(A \cap B) = 0\))
- Mutually exclusive events with \(P(A) > 0\) and \(P(B) > 0\) are NOT independent
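To see why, apply the independence test to an assumed example: on a single roll of a fair die, let A = "roll a 1" and B = "roll a 2". The events cannot both occur, yet each has positive probability, so
\[ P(A \cap B) = 0 \neq \frac{1}{36} = \frac{1}{6} \times \frac{1}{6} = P(A) \times P(B) \]
Knowing that A occurred tells you B did not, so the events are strongly dependent.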
⚠️ Common Pitfalls & Tips:
- Don't confuse \(P(A|B)\) with \(P(B|A)\) – they're usually different!
- For independent events, probabilities don't change regardless of what happened before
- "With replacement" typically means independent; "without replacement" means dependent
- Use tree diagrams for conditional probability – they make the structure clear
- Check independence by testing \(P(A \cap B) = P(A) \times P(B)\)
📝 Worked Example 5: Conditional Probability
Question: In a school, 60% of students play sports (S) and 40% are in the music club (M). 25% of students both play sports and are in the music club.
(a) Find \(P(M|S)\), the probability a student is in music club given they play sports.
(b) Find \(P(S|M)\), the probability a student plays sports given they're in music club.
(c) Are playing sports and being in music club independent? Justify your answer.
Solution:
Given information:
\(P(S) = 0.60\)
\(P(M) = 0.40\)
\(P(S \cap M) = 0.25\)
(a) Find \(P(M|S)\):
Using the conditional probability formula:
\[ P(M|S) = \frac{P(M \cap S)}{P(S)} = \frac{0.25}{0.60} = \frac{5}{12} \approx 0.417 \]
Interpretation: 41.7% of students who play sports are also in the music club.
(b) Find \(P(S|M)\):
\[ P(S|M) = \frac{P(S \cap M)}{P(M)} = \frac{0.25}{0.40} = \frac{5}{8} = 0.625 \]
Interpretation: 62.5% of students in the music club also play sports.
(c) Test for independence:
For independence, we need \(P(S \cap M) = P(S) \times P(M)\)
Calculate \(P(S) \times P(M)\):
\[ P(S) \times P(M) = 0.60 \times 0.40 = 0.24 \]
But \(P(S \cap M) = 0.25\)
Since \(0.25 \neq 0.24\), the events are NOT independent.
Alternative check using conditional probability:
\(P(M|S) = 0.417 \neq 0.40 = P(M)\)
Since \(P(M|S) \neq P(M)\), this confirms the events are dependent. Being an athlete slightly increases the probability of being in music club.
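The three parts of Worked Example 5 can be reproduced with exact fractions. This is a sketch only; the helper name `conditional` is hypothetical.

```python
from fractions import Fraction

# Given values from Worked Example 5.
p_s = Fraction(60, 100)      # P(S): plays sports
p_m = Fraction(40, 100)      # P(M): in music club
p_both = Fraction(25, 100)   # P(S and M)


def conditional(p_intersection, p_given):
    """P(A|B) = P(A and B) / P(B); only defined when P(B) > 0."""
    if p_given == 0:
        raise ValueError("conditioning event must have positive probability")
    return p_intersection / p_given


p_m_given_s = conditional(p_both, p_s)    # (a)
p_s_given_m = conditional(p_both, p_m)    # (b)
independent = (p_both == p_s * p_m)       # (c) compares 1/4 with 6/25

print(p_m_given_s, p_s_given_m, independent)  # 5/12 5/8 False
```

Using fractions rather than decimals avoids rounding problems in the equality test for independence.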
Transition Matrices & Markov Chains
Definition & Structure
Definition: A Markov chain is a mathematical system that transitions from one state to another with probabilities that depend only on the current state (memoryless property). A transition matrix organizes these probabilities.
Transition Matrix Structure:
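A transition matrix T is a square matrix where (using the column convention, which matches the column state vectors and the rule \(\mathbf{s}_{n+1} = T\mathbf{s}_n\) used in these notes):
- Entry \(t_{ij}\) = probability of moving TO state i FROM state j
- Each column represents the starting (current) state
- Each row represents the ending (next) state
- All entries are between 0 and 1
- Each column sums to 1 (must go somewhere)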
State Vector:
Current probabilities of being in each state, written as a column vector:
\[ \mathbf{s} = \begin{pmatrix} p_1 \\ p_2 \\ \vdots \\ p_n \end{pmatrix} \]
where \(p_i\) is the probability of being in state i, and \(\sum p_i = 1\)
Finding Future States:
\[ \mathbf{s}_{n+1} = T \mathbf{s}_n \]
After k steps: \(\mathbf{s}_k = T^k \mathbf{s}_0\)
Steady State (Long-term Behavior):
A steady state vector \(\mathbf{s}_{\infty}\) satisfies:
\[ T \mathbf{s}_{\infty} = \mathbf{s}_{\infty} \]
This represents the long-term equilibrium probabilities.
Finding Steady State:
- Set up equation \(T\mathbf{s} = \mathbf{s}\)
- Solve the system of equations
- Use constraint that probabilities sum to 1
- Alternatively: Calculate high powers of T (e.g., \(T^{20}\)) and read steady state from any column
⚠️ Common Pitfalls & Tips:
- Check column sums: with column state vectors, every column of a transition matrix must sum to 1
- Multiply on the LEFT: \(T\mathbf{s}\), not \(\mathbf{s}T\)
- State vectors must sum to 1 (represent 100% of probability)
- Use your GDC for matrix multiplication and finding powers
- Steady state is found when \(T\mathbf{s} = \mathbf{s}\), not \(\mathbf{s}T = \mathbf{s}\)
- Not all Markov chains have a unique steady state
📝 Worked Example 6: Markov Chains
Question: A streaming service categorizes viewers as Active (A) or Inactive (I). Each month:
- 80% of Active users remain Active; 20% become Inactive
- 40% of Inactive users become Active; 60% remain Inactive
Currently, 70% of users are Active and 30% are Inactive.
(a) Write the transition matrix T.
(b) Find the distribution after 1 month.
(c) Find the steady-state distribution.
Solution:
(a) Transition Matrix:
Columns: FROM (current state); Rows: TO (next state)
\[ T = \begin{pmatrix} 0.8 & 0.4 \\ 0.2 & 0.6 \end{pmatrix} \begin{matrix} \leftarrow \text{TO Active} \\ \leftarrow \text{TO Inactive} \end{matrix} \]
\[ \begin{matrix} \uparrow \\ \text{FROM A} \end{matrix} \quad \begin{matrix} \uparrow \\ \text{FROM I} \end{matrix} \]
Initial state vector:
\[ \mathbf{s}_0 = \begin{pmatrix} 0.7 \\ 0.3 \end{pmatrix} \]
(b) After 1 month:
\[ \mathbf{s}_1 = T\mathbf{s}_0 = \begin{pmatrix} 0.8 & 0.4 \\ 0.2 & 0.6 \end{pmatrix} \begin{pmatrix} 0.7 \\ 0.3 \end{pmatrix} \]
Calculate:
Active after 1 month: \(0.8(0.7) + 0.4(0.3) = 0.56 + 0.12 = 0.68\)
Inactive after 1 month: \(0.2(0.7) + 0.6(0.3) = 0.14 + 0.18 = 0.32\)
\[ \mathbf{s}_1 = \begin{pmatrix} 0.68 \\ 0.32 \end{pmatrix} \]
After 1 month: 68% Active, 32% Inactive
(c) Steady State:
Let \(\mathbf{s}_{\infty} = \begin{pmatrix} a \\ i \end{pmatrix}\) where \(a + i = 1\)
We need \(T\mathbf{s}_{\infty} = \mathbf{s}_{\infty}\):
\[ \begin{pmatrix} 0.8 & 0.4 \\ 0.2 & 0.6 \end{pmatrix} \begin{pmatrix} a \\ i \end{pmatrix} = \begin{pmatrix} a \\ i \end{pmatrix} \]
This gives us:
Equation 1: \(0.8a + 0.4i = a\) → \(0.4i = 0.2a\) → \(2i = a\)
Equation 2 (Constraint): \(a + i = 1\)
Solving:
Substitute \(a = 2i\) into the constraint:
\( (2i) + i = 1 \) → \(3i = 1\) → \(i = 1/3\)
Then \(a = 2i = 2(1/3) = 2/3\)
\[ \mathbf{s}_{\infty} = \begin{pmatrix} 2/3 \\ 1/3 \end{pmatrix} \approx \begin{pmatrix} 0.667 \\ 0.333 \end{pmatrix} \]
Steady State: 66.7% Active, 33.3% Inactive
In the long run, regardless of starting distribution, the system stabilizes at 2/3 Active and 1/3 Inactive users.
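Parts (b) and (c) can be checked numerically by applying T repeatedly, which is what computing a high power of T on a GDC does. The sketch below uses plain Python lists and exact fractions; the function name `step` and the choice of 50 iterations are illustrative assumptions.

```python
from fractions import Fraction

# Column convention, as in Worked Example 6: column j holds the
# probabilities of leaving state j, so each column sums to 1.
T = [[Fraction(8, 10), Fraction(4, 10)],
     [Fraction(2, 10), Fraction(6, 10)]]
s0 = [Fraction(7, 10), Fraction(3, 10)]


def step(matrix, state):
    """One transition: matrix times state, with state treated as a column vector."""
    return [sum(row[j] * state[j] for j in range(len(state))) for row in matrix]


s1 = step(T, s0)                      # (b) distribution after 1 month

state = s0
for _ in range(50):                   # (c) approximate the long-run behavior
    state = step(T, state)

steady = [Fraction(2, 3), Fraction(1, 3)]
print([str(p) for p in s1])           # ['17/25', '8/25']
print([round(float(p), 3) for p in state])
print(step(T, steady) == steady)      # True
```

The last line confirms that the algebraic answer satisfies \(T\mathbf{s}_{\infty} = \mathbf{s}_{\infty}\) exactly, while the loop shows the starting distribution drifting toward it.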
📊 Quick Reference Summary
Basic Probability
- \(P(A) = \frac{n(A)}{n(U)}\)
- \(P(A') = 1 - P(A)\)
- \(0 \leq P(A) \leq 1\)
Combined Events
- \(P(A \cup B) = P(A) + P(B) - P(A \cap B)\)
- Mutually exclusive: \(P(A \cap B) = 0\)
Conditional Probability
- \(P(A|B) = \frac{P(A \cap B)}{P(B)}\)
- \(P(A \cap B) = P(A|B) \times P(B)\)
Independence
- \(P(A \cap B) = P(A) \times P(B)\)
- \(P(A|B) = P(A)\)
Markov Chains
- Next state: \(\mathbf{s}_{n+1} = T\mathbf{s}_n\)
- Steady state: \(T\mathbf{s}_{\infty} = \mathbf{s}_{\infty}\)
- Each column of T sums to 1 (column-vector convention used in these notes)
📐 Choosing the Right Diagram
| Situation | Best Diagram |
|---|---|
| Two or more overlapping groups | Venn Diagram |
| Sequential events (with/without replacement) | Tree Diagram |
| Two simultaneous events (e.g., rolling dice) | Sample Space Diagram (table) |
| System transitions over time | Transition Matrix |
🖩 GDC Tips for Probability
- Learn matrix operations for Markov chains (multiplication, powers)
- Store transition matrices for repeated calculations
- Use combinations/permutations functions when appropriate
- Factorial button (n!) useful for counting outcomes
- Can verify Venn diagram calculations by adding regions
- For high powers of transition matrices (steady state), calculate \(T^{20}\) or \(T^{50}\)
✍️ IB Exam Strategy for Probability
- Draw diagrams even if not required – they prevent errors and may earn method marks
- Label clearly: Define events with letters (e.g., "Let A = event student passes exam")
- Show working: Write probability formulas before substituting values
- Check answers: Probabilities must be between 0 and 1; all outcomes should sum to 1
- "At least" problems: Often easier using complement: \(P(\text{at least 1}) = 1 - P(\text{none})\)
- Conditional probability: Clearly identify "given" condition and adjust sample space
- Tree diagrams: Multiply along paths, add across paths to same outcome
- Markov chains: Always verify that each column of the transition matrix sums to 1
- Express fractions in simplest form unless told to give decimal answer
🚫 Top 10 Common Mistakes to Avoid
- Forgetting to subtract \(P(A \cap B)\) when finding \(P(A \cup B)\)
- Confusing \(P(A|B)\) with \(P(B|A)\) – they're different!
- Assuming events are independent without checking
- Thinking mutually exclusive events are independent (they're not, unless one has probability 0)
- Forgetting probabilities must sum to 1 in Venn diagrams
- Not adjusting probabilities for "without replacement" in tree diagrams
- Adding probabilities when you should multiply (along tree branches)
- Getting rows and columns confused in transition matrices
- Forgetting that complement means "not" → \(P(A') = 1 - P(A)\)
- Not simplifying fractions in final answers