Track C — Chapter 16 Problem Set: Linear Regression
This problem set mirrors the Chapter 16 topics:
Prediction with a line of best fit: \(\hat{Y} = a + bX\)
Least squares logic (minimizing squared residuals)
Standard Error of the Estimate (SEE): “how wrong are our predictions on average?”
Multiple regression and incremental \(R^2\)
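For reference, the quantities named above are usually written as follows. These are the standard textbook forms, not necessarily the exact definitions used in Chapter 16. In particular, the \(N - 2\) denominator in the SEE is one common convention, and some texts divide by \(N\) instead.

\[
b = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sum (X - \bar{X})^2}, \qquad a = \bar{Y} - b\bar{X}
\]

\[
\text{SEE} = \sqrt{\frac{\sum (Y - \hat{Y})^2}{N - 2}}, \qquad R^2 = 1 - \frac{\sum (Y - \hat{Y})^2}{\sum (Y - \bar{Y})^2}
\]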
How to run
Run the script directly:
python -m scripts.psych_ch16_problem_set
Or run the unit tests:
pytest -q tests/test_psych_ch16_problem_set.py
Exercise 1 — Strong simple regression
You’re given a dataset with a clear linear relationship between x and y.
Tasks:
Fit a simple linear regression predicting y from x.
Interpret the slope (what does a 1-unit change in x imply for y?).
Report \(R^2\) and the SEE (standard error of estimate).
Expected pattern:
The slope is clearly non-zero (very small p-value).
\(R^2\) is moderate-to-high.
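The tasks above can be sketched with NumPy alone. This is a minimal illustration, not the code in scripts.psych_ch16_problem_set: the function name `fit_simple_regression`, the simulated data, and the seed are all hypothetical, and the SEE uses an \(N - 2\) denominator, which is an assumption about the chapter's convention.

```python
import numpy as np


def fit_simple_regression(x, y):
    """Least-squares fit of y = a + b*x; returns intercept, slope, R^2, and SEE."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size

    # Slope and intercept from the closed-form least-squares solution.
    x_dev = x - x.mean()
    y_dev = y - y.mean()
    slope = np.sum(x_dev * y_dev) / np.sum(x_dev ** 2)
    intercept = y.mean() - slope * x.mean()

    # Residuals are the prediction errors Y - Y_hat.
    residuals = y - (intercept + slope * x)
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum(y_dev ** 2)

    r2 = 1.0 - ss_res / ss_tot
    see = np.sqrt(ss_res / (n - 2))  # assumed convention: N - 2 denominator
    return {"intercept": intercept, "slope": slope, "r2": r2, "see": see}


# Hypothetical data with a strong linear signal (not the problem set's dataset).
rng = np.random.default_rng(16)
x = rng.normal(50, 10, size=200)
y = 5.0 + 0.8 * x + rng.normal(0, 4, size=200)

fit = fit_simple_regression(x, y)
print(f"slope={fit['slope']:.3f}  R^2={fit['r2']:.3f}  SEE={fit['see']:.3f}")
```

The slope reads as the predicted change in y for a 1-unit increase in x, and the SEE is expressed in the units of y.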
Exercise 2 — Weak/noisy regression
You’re given a dataset where a true relationship exists but is weak relative to the noise.
Tasks:
Fit a simple linear regression predicting y from x.
Compare this model to Exercise 1:
- What happens to \(R^2\)?
- What happens to SEE?
Expected pattern:
\(R^2\) is small (close to 0).
SEE is larger than in Exercise 1 (predictions are less accurate on average).
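To see the expected pattern, one option is to hold the slope fixed and change only the noise level. The sketch below assumes simulated data and a hypothetical helper `r2_and_see`; it is not the problem set's own dataset or code, and it again assumes an \(N - 2\) denominator for the SEE.

```python
import numpy as np


def r2_and_see(x, y):
    """Fit y = a + b*x with np.polyfit and return (R^2, SEE)."""
    slope, intercept = np.polyfit(x, y, 1)  # degree-1 fit: [slope, intercept]
    residuals = y - (intercept + slope * x)
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot, np.sqrt(ss_res / (len(x) - 2))


# Same underlying slope in both datasets; only the noise differs (hypothetical values).
rng = np.random.default_rng(16)
x = rng.normal(0, 1, size=300)
signal = 0.5 * x
y_strong = signal + rng.normal(0, 0.25, size=300)  # little noise
y_weak = signal + rng.normal(0, 3.0, size=300)     # noise swamps the signal

r2_strong, see_strong = r2_and_see(x, y_strong)
r2_weak, see_weak = r2_and_see(x, y_weak)
print(f"strong: R^2={r2_strong:.3f}  SEE={see_strong:.3f}")
print(f"weak:   R^2={r2_weak:.3f}  SEE={see_weak:.3f}")
```

Because both outcomes are on the same scale here, the two SEE values can be compared directly. When the outcome scales differ, \(R^2\) is the safer comparison.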
Exercise 3 — Multiple regression and incremental \(R^2\)
You’re given a dataset with two predictors x1 and x2. The predictors share variance,
but both contribute to predicting y.
Tasks:
Fit a simple regression: y ~ x1 and record \(R^2\).
Fit a multiple regression: y ~ x1 + x2 and record \(R^2\).
Compute the incremental improvement: \(\Delta R^2 = R^2_{(x1,x2)} - R^2_{(x1)}\).
Interpret the coefficient for x2 (does it add unique predictive power?).
Expected pattern:
The multiple regression has meaningfully higher \(R^2\) than the x1-only model.
x2 is typically significant, showing unique contribution beyond x1.
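The incremental \(R^2\) computation can be sketched as below. This assumes simulated correlated predictors and a hypothetical helper `r_squared` built on `np.linalg.lstsq`; it does not reproduce the problem set's data or its significance test for x2.

```python
import numpy as np


def r_squared(predictors, y):
    """Least-squares fit with an intercept; returns (R^2, coefficients)."""
    # Design matrix: a column of ones for the intercept, then the predictors.
    X = np.column_stack([np.ones(len(y)), predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ coef
    r2 = 1.0 - np.sum(residuals ** 2) / np.sum((y - y.mean()) ** 2)
    return r2, coef


# Hypothetical data: x2 shares variance with x1, and both predict y.
rng = np.random.default_rng(16)
n = 300
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(scale=np.sqrt(0.75), size=n)
y = 0.5 * x1 + 0.5 * x2 + rng.normal(size=n)

r2_x1, _ = r_squared(x1, y)                                # y ~ x1
r2_full, coef_full = r_squared(np.column_stack([x1, x2]), y)  # y ~ x1 + x2
delta_r2 = r2_full - r2_x1

print(f"R^2 (x1)      = {r2_x1:.3f}")
print(f"R^2 (x1, x2)  = {r2_full:.3f}")
print(f"Delta R^2     = {delta_r2:.3f}")
print(f"x2 coefficient = {coef_full[2]:.3f}")
```

Adding a predictor never lowers \(R^2\) in an ordinary least-squares fit, so \(\Delta R^2\) is always non-negative. The question in this exercise is whether the gain is large enough to matter.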