Unit 2.1 – Introducing Statistics: Are Variables Related?
Unit Focus:
In Unit 2, we explore two-variable data—looking for patterns, relationships, and associations between variables. This is a central theme in statistics!
🔗 Relationships Between Variables: The Basics
- Bivariate Data: Data involving two variables measured on the same individuals.
- Response Variable (y): The outcome you measure (dependent variable).
- Explanatory Variable (x): The variable that explains, predicts, or influences y (independent variable).
- Goal: Describe, visualize, and quantify how x and y are related (if at all).
- Examples: Study time (\(x\)) and exam score (\(y\)); age (\(x\)) and cholesterol (\(y\))
Key Vocabulary
- Association: When values of one variable tend to occur with certain values of another
- Positive Association: As \(x\) increases, \(y\) tends to increase
- Negative Association: As \(x\) increases, \(y\) tends to decrease
- No Association: Changes in \(x\) are not related to changes in \(y\)
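To make the vocabulary concrete, here is a minimal sketch that labels the direction of an association from the sign of the sample covariance. The function name `association_direction`, the tolerance `tol`, and the study-time data are all hypothetical, made up for illustration. Covariance only captures a linear tendency, so treat the label as a first look, not a verdict.

```python
from statistics import mean

def association_direction(xs, ys, tol=1e-9):
    """Label direction by the sign of the sample covariance (illustrative helper)."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Sample covariance: average product of deviations from the means (n - 1 divisor).
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (len(xs) - 1)
    if cov > tol:
        return "positive"
    if cov < -tol:
        return "negative"
    return "none"

# Hypothetical data: study time in hours (x) and exam score (y).
study_hours = [1, 2, 3, 4, 5]
exam_scores = [62, 68, 75, 79, 88]
print(association_direction(study_hours, exam_scores))
```

With this made-up data, larger x values pair with larger y values, so the helper reports a positive association.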
📊 Visual Tools to Explore Two-Variable Data
- Scatterplot: Graph of points showing the relationship between two quantitative variables.
- Side-by-side Boxplots: Compare a quantitative variable across groups defined by a categorical variable.
- Time Plot: Visualize change in a variable over time (if time is one variable).
- Color or Symbol Coding: Show a third variable (usually categorical) by giving the points in a scatterplot different colors or symbols.
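Side-by-side boxplots are drawn from a five-number summary for each group. The sketch below computes those summaries with Python's standard library, as one possible way to compare groups numerically before plotting. The group names, the scores, and the helper `five_number_summary` are hypothetical. Quartile conventions differ between textbooks and software, so the `"inclusive"` method used here is an assumption and may not match your calculator exactly.

```python
from statistics import quantiles

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for one group (illustrative helper)."""
    # quantiles(..., n=4) returns the three quartile cut points.
    q1, med, q3 = quantiles(values, n=4, method="inclusive")
    return min(values), q1, med, q3, max(values)

# Hypothetical data: exam scores (quantitative) split by a categorical variable.
scores_by_group = {
    "tutored": [70, 75, 80, 85, 90],
    "not tutored": [55, 60, 65, 70, 75],
}

for group, scores in scores_by_group.items():
    print(group, five_number_summary(scores))
```

Comparing the medians and the spread (Q3 minus Q1) across groups is the numerical counterpart of reading the boxes side by side.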
Scatterplot Features (for quantitative relationships):
- Direction: Positive, negative, or no direction
- Form: Linear or nonlinear (for example, curved); also note any distinct clusters
- Strength: How tightly points follow a pattern (strong/moderate/weak)
- Outliers: Points not fitting the main pattern
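Direction and strength of a linear pattern can be quantified with the correlation coefficient \(r\). The sketch below computes \(r\) by hand and attaches a rough label. The cutoffs 0.8 and 0.5 are rule-of-thumb assumptions, not official thresholds, and the helper names and data are hypothetical. Always check the scatterplot first, because \(r\) says nothing about form or outliers.

```python
from math import sqrt
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from deviations about the means."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def describe_linear(xs, ys, strong=0.8, moderate=0.5):
    """Return (direction, strength, r); the cutoffs are assumed rules of thumb."""
    r = pearson_r(xs, ys)
    direction = "positive" if r > 0 else "negative" if r < 0 else "no"
    size = abs(r)
    strength = "strong" if size >= strong else "moderate" if size >= moderate else "weak"
    return direction, strength, round(r, 3)

# Hypothetical data: study time in hours (x) and exam score (y).
study_hours = [1, 2, 3, 4, 5]
exam_scores = [62, 68, 75, 79, 88]
print(describe_linear(study_hours, exam_scores))
```

A written description would still add form and outliers, for example "a strong, positive, linear association with no apparent outliers."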
💡 Tips, Tricks & Exam Strategies
- Always state which variable is explanatory (x, independent) and which is response (y, dependent)
- Describe scatterplots with direction, form, strength, and outliers (DFSO)
- Label axes on all graphs; include units
- Look for clusters, unusual points, or subgroups in scatterplots
- Don't assume causation—association does not imply causation!
- Use side-by-side boxplots for comparing groups on a quantitative outcome
❌ Common Mistakes
- Switching x (explanatory) and y (response) variables
- Ignoring outliers or failing to mention them in scatterplot descriptions
- Assuming a linear relationship when the pattern is clearly not linear
- Forgetting to mention strength (strong/moderate/weak)
- Claiming causation without experimental evidence
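The linearity mistake is easy to demonstrate. In the sketch below, \(y\) is completely determined by \(x\) through a curve (\(y = x^2\)), yet the linear correlation comes out as zero. The data and the helper `pearson_r` are made up for illustration. The point is that a summary built for linear patterns can miss a clear nonlinear relationship entirely.

```python
from math import sqrt
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from deviations about the means."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: a perfect U-shaped (quadratic) relationship.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r = pearson_r(xs, ys)
print(f"r = {r:.3f}")  # the linear summary finds no linear trend
```

Here the positive and negative deviations cancel exactly, so the correlation reports no linear association even though the scatterplot would show an obvious curve. Plot first, then summarize.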
Summary:
Unit 2.1 lays the groundwork for all two-variable data analysis. Know the vocabulary (explanatory vs. response, positive vs. negative association), know how to describe and visualize relationships, and be precise in reporting patterns. Always question whether variables are truly related or whether an association is just coincidental!