4  Modelling and Inference

Note

This section guides you through classical inference for a simple linear regression. Keep your code concise and your explanations clear.

Checklist

  1. Delete everything in this yellow callout (including BEGIN/END lines) when finished.
  2. Fit the provided basic model and interpret 1–2 key coefficients in plain language.
  3. Produce diagnostic plots and comment on linearity, constant variance, and normality.
  4. Implement a stepwise procedure (forward and backward) using MASS::stepAIC and compare to the basic model.
  5. Write a short paragraph interpreting coefficients, including p-values and 95% CIs, for your final chosen model.

4.1 Fit a basic regression model

basic_mod <- fit_basic_model(analysis_data)
summary(basic_mod)

Call:
lm(formula = G3 ~ absences, data = d)

Residuals:
     Min       1Q   Median       3Q      Max 
-10.3033  -2.3033   0.5007   3.4811   9.6183 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 10.30327    0.28347  36.347   <2e-16 ***
absences     0.01961    0.02886   0.679    0.497    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 4.585 on 393 degrees of freedom
Multiple R-squared:  0.001173,  Adjusted R-squared:  -0.001369 
F-statistic: 0.4615 on 1 and 393 DF,  p-value: 0.4973

confint(basic_mod)
                 2.5 %      97.5 %
(Intercept)  9.7459577 10.86057555
absences    -0.0371337  0.07634406

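Briefly interpret the intercept and the absences coefficient in the context of final grade G3. The basic model shown above contains absences as its only predictor, so there is no G2 coefficient to interpret here.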
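One way to put the slope into plain language is to rescale it to a more meaningful change in the predictor. The sketch below copies the estimate and 95% CI for absences from the output above and expresses them per 10 additional absences. The choice of 10 as the unit is an illustrative assumption, not part of the assignment.

```r
# Values copied from summary(basic_mod) and confint(basic_mod) above
est_absences <- 0.01961
ci_absences  <- c(lower = -0.0371337, upper = 0.07634406)

# Expected change in G3 per 10 additional absences (10 is an illustrative unit)
per10_est <- 10 * est_absences
per10_ci  <- 10 * ci_absences

per10_est  # 0.1961 grade points
per10_ci   # roughly -0.37 to 0.76 grade points

# The interval contains 0, which agrees with the p-value of 0.497:
# the data are compatible with no linear association.
ci_contains_zero <- ci_absences[["lower"]] < 0 && ci_absences[["upper"]] > 0
ci_contains_zero
```

Read this way, 10 extra absences are associated with an estimated change of about 0.2 points in G3, with an interval running from a small decrease to a modest increase.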

4.2 Diagnostics (minimal)

op <- par(mfrow = c(1,2))
plot(basic_mod, which = 1)  # Residuals vs Fitted
plot(basic_mod, which = 2)  # Normal Q-Q

par(op)

Comment on the diagnostic plots: do you see curvature, non-constant variance, or strong deviations from normality? If so, suggest a simple remedy or note the limitation.
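The two plots above address linearity and normality most directly. For constant variance, a Scale-Location plot is a common complement. The sketch below is a minimal example on simulated data: d_sim and mod_sim are hypothetical stand-ins for analysis_data and basic_mod, and the Shapiro-Wilk test is an optional numeric check rather than a replacement for reading the Q-Q plot.

```r
# Simulated stand-in data (hypothetical; replace with your own model object)
set.seed(1)
d_sim <- data.frame(absences = rpois(200, lambda = 5))
d_sim$G3 <- 10 + 0.02 * d_sim$absences + rnorm(200, sd = 4.5)
mod_sim <- lm(G3 ~ absences, data = d_sim)

# Scale-Location plot: sqrt(|standardized residuals|) against fitted values.
# A roughly flat trend suggests constant variance.
plot(mod_sim, which = 3)

# Standardized residuals, useful for spotting unusually large values
std_res <- rstandard(mod_sim)
summary(std_res)

# Optional numeric check of normality; with large samples it can flag
# departures that are too small to matter in practice
sw <- shapiro.test(residuals(mod_sim))
sw$p.value
```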

4.3 Stepwise variable selection (student TODO)

Implement run_stepwise() in R/03_model.R using MASS::stepAIC, then try both forward and backward directions. Compare the selected model to the basic model and justify your choice.

# After you implement run_stepwise(), uncomment and run:
# step_forward  <- run_stepwise(analysis_data, direction = "forward")
# step_backward <- run_stepwise(analysis_data, direction = "backward")
# summary(step_forward)
# summary(step_backward)
# AIC(step_forward); AIC(step_backward)
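As a starting point, run_stepwise() could be structured roughly as follows. This is a sketch under stated assumptions, not the reference solution: it assumes the response column is named G3, that every other column in the data is a candidate predictor, and that forward selection starts from the intercept-only model. The data set sim_data and its columns are simulated for illustration only.

```r
library(MASS)

# Sketch of run_stepwise(); assumes the response is G3 and all other
# columns are candidate predictors
run_stepwise <- function(data, direction = c("forward", "backward")) {
  direction <- match.arg(direction)
  full_mod <- lm(G3 ~ ., data = data)   # largest model considered
  null_mod <- lm(G3 ~ 1, data = data)   # intercept-only model
  scope <- list(lower = formula(null_mod), upper = formula(full_mod))
  # Forward selection starts small and adds terms; backward starts full
  start <- if (direction == "forward") null_mod else full_mod
  stepAIC(start, scope = scope, direction = direction, trace = FALSE)
}

# Simulated stand-in for analysis_data (hypothetical columns and effects)
set.seed(123)
n <- 200
sim_data <- data.frame(
  absences  = rpois(n, lambda = 5),
  failures  = rbinom(n, size = 3, prob = 0.2),
  studytime = sample(1:4, n, replace = TRUE)
)
sim_data$G3 <- 12 - 2 * sim_data$failures + rnorm(n, sd = 3)

step_forward  <- run_stepwise(sim_data, direction = "forward")
step_backward <- run_stepwise(sim_data, direction = "backward")

# Compare the selected models with a basic model by AIC (lower is better)
basic_sim <- lm(G3 ~ absences, data = sim_data)
aic_values <- c(basic    = AIC(basic_sim),
                forward  = AIC(step_forward),
                backward = AIC(step_backward))
aic_values
formula(step_forward)
formula(step_backward)
```

Forward and backward searches need not end at the same model, so it is worth printing both formulas before choosing between them.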

4.4 Inference for your final model

Choose your final model (basic or one of the stepwise results) and comment on key coefficients with p-values and 95% CIs. State any caveats about interpretation.
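To report estimates, p-values, and 95% CIs side by side, the coefficient table and the confidence intervals can be combined into one matrix. The sketch below uses simulated data; final_mod, d_sim, and the predictor failures are hypothetical stand-ins for whichever model you choose.

```r
# Simulated stand-in for the chosen final model (hypothetical)
set.seed(42)
n <- 150
d_sim <- data.frame(
  absences = rpois(n, lambda = 5),
  failures = rbinom(n, size = 3, prob = 0.2)
)
d_sim$G3 <- 11 - 2 * d_sim$failures + rnorm(n, sd = 3)
final_mod <- lm(G3 ~ absences + failures, data = d_sim)

# Combine estimates, standard errors, and p-values with the 95% CIs;
# both matrices list the coefficients in the same row order
coef_tab <- cbind(
  coef(summary(final_mod))[, c("Estimate", "Std. Error", "Pr(>|t|)")],
  confint(final_mod, level = 0.95)
)
round(coef_tab, 4)
```

One caveat worth stating for a stepwise-selected model: p-values and CIs computed after selection do not account for the search itself, so they tend to look more convincing than they should.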