Two-Variable Data & Scatterplots
A scatterplot shows how two variables relate to each other. Each dot represents one observation — a single data point with both an x-value and a y-value. The SAT tests scatterplots in two main ways: identifying which equation models the data (linear, quadratic, or exponential) and interpreting what the slope or y-intercept means in the context of the problem. The hardest questions don't test math computation — they test whether you can read a graph and connect it to algebra.
Reading a scatterplot
Every scatterplot has the same basic structure: an x-axis (independent variable) and a y-axis (dependent variable), with dots showing how the two variables co-vary. The first thing to do with any scatterplot is identify the shape of the relationship.
Besides linear and quadratic patterns, the SAT also tests exponential relationships: these look like a curve that gets steeper and steeper (or shallower and shallower) without a peak.
Linear models
When the data falls roughly along a straight line, the best-fit equation is in slope-intercept form, y = mx + b. Here m is the slope (the rate at which y changes per unit increase in x) and b is the y-intercept (the value of y when x = 0).
The SAT loves to ask what the slope or y-intercept means in the context of the real-world scenario.
A scatterplot shows the relationship between the size of a house (in thousands of square feet, x-axis) and its sale price (in thousands of dollars, y-axis). The best-fit line is y = 100x + 100.
What does the slope of 100 represent in this context?
- Slope = change in y per unit increase in x.
- x is in thousands of square feet. y is in thousands of dollars.
- So a slope of 100 means: for every additional 1,000 square feet, the predicted sale price increases by $100,000.
Using the same equation y = 100x + 100, what does the y-intercept of 100 represent?
- Y-intercept = value of y when x = 0. So when x = 0 thousand square feet, the predicted price is y = 100 thousand dollars.
- In context, this is the predicted price of a house with zero square feet — clearly not a realistic value, but mathematically it represents the model's "baseline" before adding any size.
Y-intercepts often represent something physically nonsensical (like a house with zero square feet). That's normal — the model is only meaningful in the range of the actual data.
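The slope and y-intercept readings above can be checked numerically. This is a minimal Python sketch, assuming the model y = 100x + 100 from the house-price example; the function name `predicted_price_thousands` is hypothetical and used only for illustration.

```python
def predicted_price_thousands(size_thousands_sqft):
    """Best-fit line from the example: y = 100x + 100 (y in thousands of dollars)."""
    return 100 * size_thousands_sqft + 100

# Slope: the change in y when x increases by one unit (1,000 square feet).
slope = predicted_price_thousands(3) - predicted_price_thousands(2)

# Y-intercept: the value of y when x = 0.
intercept = predicted_price_thousands(0)

# Convert from "thousands of dollars" to dollars before reporting.
print(f"Each extra 1,000 sq ft adds ${slope * 1000:,} to the predicted price")
print(f"Predicted price at x = 0: ${intercept * 1000:,}")
```

The multiplication by 1000 at the end is the unit conversion that the axis labels require; skipping it is how a slope of 100 gets misread as $100.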
Matching an equation to a scatterplot
The SAT often gives you a scatterplot and four equations as answer choices. To pick the right one:
- Check the y-intercept. Find where the trend line would cross the y-axis (x = 0). Eliminate equations whose b-value doesn't match.
- Check the slope sign. If the trend goes up as x increases, slope is positive. If it goes down, slope is negative. Eliminate any equation with the wrong sign.
- Check the slope size. Estimate the rise/run from two visible points on the trend. Eliminate equations whose slope is wildly off.
A scatterplot shows the relationship between time t (in years) and distance d (in some unit). The trend line passes through approximately (0, 84) and (10, 414). Which equation is the best fit?
- (A) d = 0.03t + 402
- (B) d = 10t + 402
- (C) d = 33t + 300
- (D) d = 33t + 84
- The y-intercept (where x = 0) is about 84. Eliminate (A), (B), (C) — none have 84 as the constant.
- Verify the slope on (D). From (0, 84) to (10, 414):
rise = 414 − 84 = 330; run = 10; slope = 330 / 10 = 33 ✓
The y-intercept is usually the fastest way to eliminate three of four answer choices. Look there first.
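The elimination steps can also be written out as a short Python sketch. It assumes the four answer choices and the two trend-line points from the example above; the 10% tolerance in `close` is an arbitrary assumption standing in for "not wildly off".

```python
# Answer choices from the example, stored as (slope, y_intercept)
# for d = slope * t + y_intercept.
choices = {
    "A": (0.03, 402),
    "B": (10, 402),
    "C": (33, 300),
    "D": (33, 84),
}

# Two points read off the trend line.
(t1, d1), (t2, d2) = (0, 84), (10, 414)

estimated_intercept = d1                 # the trend line crosses the d-axis at t = 0
estimated_slope = (d2 - d1) / (t2 - t1)  # rise over run

def close(value, target, tolerance=0.1):
    """True when value is within a relative tolerance of target (assumed 10%)."""
    return abs(value - target) <= tolerance * abs(target)

# Keep only the choices whose intercept and slope both match the estimates.
survivors = [
    label
    for label, (slope, intercept) in choices.items()
    if close(intercept, estimated_intercept) and close(slope, estimated_slope)
]
print(survivors)
```

Only choice D passes both checks, which agrees with the hand elimination.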
Quadratic models
When data forms a parabolic (U-shaped or inverted-U) curve, the best-fit equation is quadratic. The most-tested feature is the sign of the leading coefficient a:
- If a is positive, the parabola opens UP (like a U). The vertex is a minimum.
- If a is negative, the parabola opens DOWN (like an inverted U). The vertex is a maximum.
The SAT often gives four answer choices that differ ONLY in the signs of the coefficients, as in the worked example below.
The trick: the actual quadratic computation isn't needed. Just check (a) the parabola direction (sign of x² coefficient) and (b) the y-intercept (constant term). Two checks usually eliminate three options.
A scatterplot shows electricity generated by nuclear sources over a 12-year period. The data forms an inverted-U shape (peaks in the middle). The y-intercept (where t = 0) is approximately 745. Which is the best-fit equation?
- (A) y = 1.674x² + 19.76x − 745.73
- (B) y = −1.674x² − 19.76x − 745.73
- (C) y = 1.674x² + 19.76x + 745.73
- (D) y = −1.674x² + 19.76x + 745.73
- The parabola opens DOWN (inverted-U), so the leading coefficient must be negative. Eliminate (A) and (C).
- The y-intercept is +745, so the constant term must be POSITIVE. Eliminate (B) (which has −745.73).
No quadratic computation needed. Two visual checks, parabola direction and y-intercept sign, fully determine the answer: (D).
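The same two sign checks can be sketched in Python. The coefficients are the four answer choices from the example above; the two boolean flags are what you would read off the scatterplot by eye.

```python
# Answer choices from the example as (a, b, c) for y = a*x**2 + b*x + c.
choices = {
    "A": (1.674, 19.76, -745.73),
    "B": (-1.674, -19.76, -745.73),
    "C": (1.674, 19.76, 745.73),
    "D": (-1.674, 19.76, 745.73),
}

opens_down = True          # the scatterplot is an inverted U
intercept_positive = True  # the curve crosses the y-axis near +745

# Keep choices whose leading-coefficient sign and constant-term sign both match.
survivors = [
    label
    for label, (a, b, c) in choices.items()
    if (a < 0) == opens_down and (c > 0) == intercept_positive
]
print(survivors)
```

Notice that the middle coefficient b is never inspected, which mirrors the point that no quadratic computation is required.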
Recognizing the model type
Before matching to an equation, you need to know what TYPE of equation to look for. Here's how to tell at a glance:
| Pattern in the scatter | Model type | Equation form |
|---|---|---|
| Roughly straight line (up or down) | Linear | y = mx + b |
| U-shaped or inverted-U | Quadratic | y = ax² + bx + c |
| Curve that grows faster and faster (or decays slower and slower) | Exponential | y = a · b^x |
A scatter that curves upward could be either quadratic OR exponential. The difference:
Quadratic goes through a minimum (or maximum) somewhere — even if the visible portion is just one side of the parabola.
Exponential never has a peak or trough — it just keeps curving in the same direction.
On the SAT, look for the answer-choice form: if you see x² in the choices, it's quadratic. If you see b^x (like 1.05^t), it's exponential. Often the problem tells you the model type explicitly ("the data is best modeled by a quadratic function").
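To make the "peak or trough" distinction concrete, here is a small Python sketch. Both models are hypothetical, chosen only for illustration, and `has_turning_point` is an illustrative helper rather than a standard function.

```python
def has_turning_point(f, xs):
    """True if f switches between rising and falling somewhere along xs."""
    ys = [f(x) for x in xs]
    # Change in y from each x to the next.
    steps = [later - earlier for earlier, later in zip(ys, ys[1:])]
    # A sign flip between consecutive steps marks a peak or trough.
    return any(s1 * s2 < 0 for s1, s2 in zip(steps, steps[1:]))

# Hypothetical models, not taken from any SAT problem.
quadratic = lambda x: x**2 - 6 * x + 20  # parabola with its minimum at x = 3
exponential = lambda x: 5 * 1.5**x       # keeps growing, never turns around

xs = range(0, 11)
print(has_turning_point(quadratic, xs))
print(has_turning_point(exponential, xs))
```

The check only detects the turning point when the chosen x-range includes it. If the visible data covers just one side of the parabola, the two shapes can look alike, which is why the form of the answer choices is the more reliable clue.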
Sample SAT-style problems
A scatterplot shows the relationship between the years since 1940 (x) and the federal minimum wage in dollars (y). The best-fit line is:
What does the slope of 0.096 mean in this context?
- Slope = change in y per unit change in x. Here y is in dollars and x is in years.
- So a slope of 0.096 means: per year, the predicted minimum wage increases by $0.096 (about 10 cents).
A scatterplot of beach visitors (y) vs. average temperature in °C (x) suggests a linear relationship. The trend line passes through (25, 80) and (35, 560). Which equation best models the data?
- (A) y = 48x − 1120
- (B) y = 48x + 1120
- (C) y = −48x + 1120
- (D) y = 30x + 80
- Compute the slope from the two points:
m = (560 − 80) / (35 − 25) = 480 / 10 = 48. Eliminate (C) and (D), whose slopes are not 48.
- Plug (25, 80) into y = 48x + b to solve for b:
80 = 48(25) + b, so b = 80 − 1200 = −1120. That matches (A).
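The two-point calculation above can be sketched as a small Python function. The name `line_through` is hypothetical; the points are the ones given in the problem.

```python
def line_through(p1, p2):
    """Return (slope, intercept) of the line through two points."""
    (x1, y1), (x2, y2) = p1, p2
    slope = (y2 - y1) / (x2 - x1)  # rise over run
    intercept = y1 - slope * x1    # solve y1 = slope * x1 + b for b
    return slope, intercept

m, b = line_through((25, 80), (35, 560))
print(m, b)
```

As a sanity check, plugging the second point back in gives 48(35) − 1120 = 560, so both points lie on the line.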
A scatterplot shows the relationship between the depth of a swimming pool (x) and the water pressure (y). The data forms a U-shape with the lowest point near x = 5, and the y-intercept appears to be around 25. Which is the best fit?
- (A) y = 2x² − 20x + 25
- (B) y = −2x² + 20x − 25
- (C) y = 2x² − 20x − 25
- (D) y = −2x² − 20x + 25
- Parabola opens UP (U-shape) → leading coefficient is POSITIVE. Eliminate (B) and (D).
- Y-intercept is +25 → constant term must be +25. Eliminate (C), leaving (A).
The scatterplot of ice cream sales (y, in dollars) vs. temperature (x, in °C) is modeled by y = 45x − 200.
Use the model to predict ice cream sales when the temperature is 22°C.
- Substitute x = 22 into the equation:
y = 45(22) − 200 = 990 − 200 = 790, so the model predicts $790 in sales.
Common mistakes
1. Misreading the axis units. If the y-axis is "thousands of dollars" and you read 100, the actual value is $100,000, not $100. Always check the axis labels for units like "thousands," "millions," "percent."
2. Swapping slope and y-intercept. "What does the y-intercept represent?" asks about the constant term (b in y = mx + b), not the slope. Read the question carefully — they often appear together to test whether you know which is which.
3. Ignoring the parabola direction on quadratics. A U-shape and an inverted-U are both parabolas, but their leading coefficients have OPPOSITE signs. Always check which way the parabola opens FIRST when matching a quadratic equation.
4. Using the line of best fit as if it were exact. The line of best fit is a model — individual data points scatter around it. The SAT sometimes asks "according to the line of best fit" or "according to the model" — that means use the equation, not the actual scattered points.