Unit 2.1 – Introducing Statistics: Are Variables Related?

Unit Focus:
In Unit 2, we explore two-variable data, looking for patterns, relationships, and associations between variables. This is a central theme in statistics!

🔗 Relationships Between Variables: The Basics

  • Bivariate Data: Data involving two variables measured on the same individuals.
  • Response Variable (y): The outcome you measure (dependent variable).
  • Explanatory Variable (x): The variable that explains, predicts, or influences y (independent variable).
  • Goal: Describe, visualize, and quantify how x and y are related (if at all).
  • Examples: Study time (\(x\)) and exam score (\(y\)); age (\(x\)) and cholesterol (\(y\))

Key Vocabulary:
  • Association: When values of one variable tend to occur with certain values of another
  • Positive Association: As \(x\) increases, \(y\) tends to increase
  • Negative Association: As \(x\) increases, \(y\) tends to decrease
  • No Association: Changes in \(x\) are not related to changes in \(y\)
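
One way to put a number on the direction of an association is the sample correlation coefficient r. The sketch below is only an illustration: the study-time data are hypothetical, the helper names pearson_r and direction are made up for this example, and the 0.1 tolerance is an arbitrary cutoff, not a standard rule. Note that r only captures linear association, so a value near zero does not rule out a curved pattern.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample correlation coefficient for two equal-length lists of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable has no spread")
    return sxy / sqrt(sxx * syy)

def direction(r, tolerance=0.1):
    """Label the direction of association (tolerance is an assumed cutoff)."""
    if r > tolerance:
        return "positive"
    if r < -tolerance:
        return "negative"
    return "no clear linear association"

# Hypothetical data: study time in hours (x) and exam score (y)
study_hours = [1, 2, 3, 4, 5, 6]
exam_scores = [55, 61, 68, 70, 79, 85]

r = pearson_r(study_hours, exam_scores)
print(round(r, 3), direction(r))
```

Because the scores rise as the hours rise, the sketch reports a positive association for this made-up data set.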

📊 Visual Tools to Explore Two-Variable Data

  • Scatterplot: Graph of points showing the relationship between two quantitative variables.
  • Side-by-side Boxplots: Compare a quantitative variable across groups defined by a categorical variable.
  • Time Plot: Visualize change in a variable over time (if time is one variable).
  • Color or Symbol Coding: Add a third variable by marking points in a scatterplot.
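
Side-by-side boxplots are drawn from a five-number summary (minimum, Q1, median, Q3, maximum) computed separately for each group. The sketch below computes those summaries in plain Python. The group names and scores are hypothetical, five_number_summary is a helper written for this example, and the quartiles use the "median of each half, excluding the overall median" convention, which is only one of several conventions in use (software may give slightly different quartiles).

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves convention."""
    data = sorted(values)
    n = len(data)
    if n < 2:
        raise ValueError("need at least 2 values")
    half = n // 2
    lower = data[:half]
    # For an odd count, leave the middle value out of both halves
    upper = data[half + 1:] if n % 2 else data[half:]
    return (data[0], median(lower), median(data), median(upper), data[-1])

# Hypothetical data: exam scores (quantitative) by study method (categorical)
scores_by_method = {
    "flashcards": [62, 70, 71, 75, 80, 84, 90],
    "rereading": [50, 58, 60, 65, 66, 72, 88],
}

for method, scores in scores_by_method.items():
    print(method, five_number_summary(scores))
```

Comparing the summaries group by group (centers, spreads, and extremes) is the numerical counterpart of reading side-by-side boxplots.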

Scatterplot Features (for quantitative relationships):
  • Direction: Positive, negative, or no direction
  • Form: Linear or nonlinear (curved, clustered, etc.)
  • Strength: How tightly points follow a pattern (strong/moderate/weak)
  • Outliers: Points not fitting the main pattern
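
When a correlation coefficient r is available for a roughly linear pattern, the strong/moderate/weak wording can be tied to its size. The sketch below is only a rule of thumb: the cutoffs 0.8 and 0.5 are assumptions chosen for illustration (textbooks and teachers differ), and describe_strength is a hypothetical helper name. It should not replace looking at the scatterplot, since r can mislead when the form is nonlinear or outliers are present.

```python
def describe_strength(r, strong=0.8, moderate=0.5):
    """Map a correlation coefficient to a strength word (cutoffs are assumed)."""
    if not -1 <= r <= 1:
        raise ValueError("r must be between -1 and 1")
    size = abs(r)  # strength ignores the sign; the sign gives direction
    if size >= strong:
        return "strong"
    if size >= moderate:
        return "moderate"
    return "weak"

# Strength depends only on the size of r, not on its sign
for r in (0.92, -0.65, 0.2):
    print(r, describe_strength(r))
```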

💡 Tips, Tricks & Exam Strategies

  • Always state which variable is explanatory (x, independent) and which is response (y, dependent)
  • Describe scatterplots with direction, form, strength, and outliers (DFSO)
  • Label axes on all graphs; include units
  • Look for clusters, unusual points, or subgroups in scatterplots
  • Don't assume causation: association does not imply causation!
  • Use side-by-side boxplots for comparing groups on a quantitative outcome

❌ Common Mistakes

  • Switching x (explanatory) and y (response) variables
  • Ignoring outliers or failing to mention them in scatterplot descriptions
  • Assuming linear relationship when the pattern is clearly not linear
  • Forgetting to mention strength (strong/moderate/weak)
  • Claiming causation without experimental evidence

Summary:
Unit 2.1 lays the groundwork for all two-variable data analysis. Know the vocabulary (explanatory vs. response, positive vs. negative association), know how to describe and visualize relationships, and be precise when reporting patterns. Always question whether the variables are truly related or whether the association is just coincidental!