Track C — Chapter 16 Problem Set: Linear Regression =================================================== This problem set mirrors the Chapter 16 topics: - Prediction with a line of best fit: :math:`\hat{Y} = a + bX` - Least squares logic (minimizing squared residuals) - Standard Error of the Estimate (SEE): “how wrong are our predictions on average?” - Multiple regression and incremental :math:`R^2` How to run ---------- Run the script directly: .. code-block:: bash python -m scripts.psych_ch16_problem_set Or run the unit tests: .. code-block:: bash pytest -q tests/test_psych_ch16_problem_set.py Exercise 1 — Strong simple regression ------------------------------------- You’re given a dataset with a clear linear relationship between ``x`` and ``y``. Tasks: 1. Fit a simple linear regression predicting ``y`` from ``x``. 2. Interpret the slope (what does a 1-unit change in ``x`` imply for ``y``?). 3. Report :math:`R^2` and the SEE (standard error of estimate). Expected pattern: - The slope is clearly non-zero (very small p-value). - :math:`R^2` is moderate-to-high. Exercise 2 — Weak/noisy regression ---------------------------------- You’re given a dataset where the true relationship exists but is weak relative to noise. Tasks: 1. Fit a simple linear regression predicting ``y`` from ``x``. 2. Compare this model to Exercise 1: - What happens to :math:`R^2`? - What happens to SEE? Expected pattern: - :math:`R^2` is small (close to 0). - SEE is larger (predictions are less accurate on average). Exercise 3 — Multiple regression and incremental :math:`R^2` ------------------------------------------------------------ You’re given a dataset with two predictors ``x1`` and ``x2``. The predictors share variance, but both contribute to predicting ``y``. Tasks: 1. Fit a simple regression: ``y ~ x1`` and record :math:`R^2`. 2. Fit a multiple regression: ``y ~ x1 + x2`` and record :math:`R^2`. 3. Compute the incremental improvement: :math:`\Delta R^2 = R^2_{(x1,x2)} - R^2_{(x1)}`. 4. Interpret the coefficient for ``x2`` (does it add unique predictive power?). Expected pattern: - The multiple regression has meaningfully higher :math:`R^2` than the x1-only model. - ``x2`` is typically significant, showing unique contribution beyond ``x1``.