Final Exam (Dec 9): review and practice materials
Format
The format is similar to Exams 1 and 2. There are two parts: the first is pen-and-paper, and the second is open-computer.
- Part 1:
- Closed computer and closed notes
- Handwritten answers
- Part 2:
- Open computer. Any online resource that is “not alive” may be used (i.e. you may freely use online resources, but you may not communicate with any other person).
- Electronic submission on Moodle: you will fill in a Quarto script with R code and written answers
Once you turn in Part 1, you may open your computer and download the blank Quarto script for Part 2. You may not return to Part 1 after turning it in and beginning the open-computer portion.
Topics
The exam is based on all material covered so far in the class.
- R Programming Fundamentals
- Variable assignment and basic operations
- Data types and classes (typeof(), class(), str())
- Type coercion (implicit and explicit)
- Vectors: creation, subsetting, and operations
- Data frames: structure, column extraction, subsetting
- Factors for categorical data
- Missing values (NA)
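As a quick refresher on the fundamentals listed above, here is a minimal sketch. The vector and data frame are invented purely for illustration and are not taken from any course dataset.

```r
# Hypothetical data, invented for illustration only
x <- c(4, 8, 15, NA, 23)

typeof(x)   # storage type of the vector
class(x)    # class used for method dispatch

# Implicit coercion: mixing numbers, strings, and logicals yields character
mixed <- c(1, "a", TRUE)
typeof(mixed)

# Explicit coercion from character to numeric
as.numeric(c("3.5", "7"))

# Subsetting by position and by logical condition
x[2]
x[!is.na(x) & x > 5]

# NA propagates through mean() unless it is removed
mean(x)
mean(x, na.rm = TRUE)

# A small data frame with a factor column
df <- data.frame(
  score = c(90, 72, 85),
  section = factor(c("A", "B", "A"))
)
str(df)                   # structure: column names and types
df$score                  # extract one column as a vector
df[df$section == "A", ]   # keep only the rows in section A
levels(df$section)        # categories of the factor
```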
- Descriptive Statistics
- Measures of center: mean, median
- Measures of spread: standard deviation, variance, IQR, range
- Outlier-resistant vs. outlier-sensitive statistics
- Frequency tables and proportions for categorical data
- Data visualization: histograms, boxplots
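The descriptive statistics above can be practiced with base R alone. The sketch below uses a made-up sample containing one extreme value, so the difference between outlier-resistant and outlier-sensitive statistics is visible.

```r
# Hypothetical sample with one extreme value (40)
y <- c(2, 3, 3, 4, 5, 40)

mean(y)     # 9.5: pulled upward by the outlier
median(y)   # 3.5: resistant to the outlier
sd(y)       # standard deviation (outlier-sensitive)
var(y)      # variance (outlier-sensitive)
IQR(y)      # interquartile range (outlier-resistant)
range(y)    # minimum and maximum

# Frequency table and proportions for a categorical variable
colors <- c("red", "blue", "red", "green", "red")
tab <- table(colors)
tab
prop.table(tab)

# Base R plots
hist(y)
boxplot(y)
```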
- Statistical Inference Concepts
- Populations, samples, parameters, and statistics
- Hypothesis testing framework
- P-values and their interpretation
- Confidence intervals and their interpretation
- Statistical significance (α = 0.05)
- Hypothesis Tests and Confidence Intervals in R
- One-sample t-test for means: t.test(x, mu = .)
- Two-sample t-test for comparing means: t.test(y ~ group)
- One-sample proportion test: prop.test(x, n, p = .)
- Two-sample proportion test: prop.test()
- Chi-squared test of independence: chisq.test()
- Constructing and interpreting confidence intervals
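For reference, the test functions listed above might be called as follows. This is only a sketch: the data are simulated, and the counts and null values are arbitrary choices made for illustration.

```r
set.seed(1)  # simulated data, for illustration only
x <- rnorm(30, mean = 52, sd = 10)
group <- rep(c("a", "b"), each = 15)

# One-sample t-test: H0 is that the population mean equals 50
res <- t.test(x, mu = 50)
res$p.value
res$conf.int   # 95% confidence interval by default

# Two-sample t-test (Welch by default) using formula syntax
t.test(x ~ group)

# One-sample proportion test: 18 successes in 40 trials, H0: p = 0.5
prop.test(18, 40, p = 0.5)

# Two-sample proportion test: successes and trial counts for each group
prop.test(x = c(18, 25), n = c(40, 40))

# Chi-squared test of independence on a 2x2 table of counts
counts <- matrix(c(20, 15, 10, 25), nrow = 2)
chisq.test(counts)
```

In each case the printed output includes the test statistic, the p-value, and (for the t-tests and proportion tests) a confidence interval.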
- Data Visualization
- Creating plots with ggplot2: ggplot(), aes(), geom_histogram(), etc.
- Basic plot customization (titles, axis labels)
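A minimal ggplot2 sketch, assuming the ggplot2 package is installed. It uses the built-in mtcars data rather than any course dataset, and the titles and labels are placeholders.

```r
library(ggplot2)  # assumes the ggplot2 package is installed

# Histogram with a title and axis labels
p <- ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(bins = 10) +
  labs(title = "Fuel economy in mtcars", x = "Miles per gallon", y = "Count")
p

# Side-by-side boxplots: one box per number of cylinders
b <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_boxplot() +
  labs(x = "Cylinders", y = "Miles per gallon")
b
```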
- Linear Regression
- Simple linear regression: fitting, interpreting, predicting
- The lm() function and formula syntax
- Regression equation: \(Y = \beta_0 + \beta_1 X + \varepsilon\)
- Interpreting coefficients (intercept and slope)
- Understanding residuals
- Regression diagnostics
- Hypothesis tests and confidence intervals for model coefficients
- F-tests for comparing nested models (anova())
- R-squared and adjusted R-squared
- Making predictions
- Confidence intervals and prediction intervals for model predictions
- Multiple linear regression
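The regression workflow above could look roughly like this. The sketch uses the built-in mtcars data; the choice of mpg as the response and wt and hp as predictors is arbitrary and only for illustration.

```r
# Simple linear regression: mpg is the response, wt the predictor
fit <- lm(mpg ~ wt, data = mtcars)
summary(fit)       # coefficient t-tests, R-squared, adjusted R-squared
coef(fit)          # estimated intercept and slope
confint(fit)       # 95% confidence intervals for the coefficients
head(resid(fit))   # residuals: observed minus fitted values

# Multiple linear regression and a nested-model F-test:
# does adding hp improve on the wt-only model?
fit2 <- lm(mpg ~ wt + hp, data = mtcars)
anova(fit, fit2)

# Predictions for a new observation
new_car <- data.frame(wt = 3)
ci <- predict(fit, new_car, interval = "confidence")        # mean response
pred_int <- predict(fit, new_car, interval = "prediction")  # single new car
ci
pred_int

# One diagnostic plot: residuals vs. fitted values
plot(fit, which = 1)
```

The prediction interval is wider than the confidence interval at the same wt, because it also accounts for the variability of an individual observation around the regression line.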
- Functions and Function-Oriented Programming
- Writing functions in R: syntax and structure
- Function arguments and default values
- Return values (implicit and explicit)
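To illustrate default arguments and the two kinds of return values, here is a small sketch. The function names standardize and cv are hypothetical, chosen only for this example.

```r
# Explicit return, with a default value for the na.rm argument
standardize <- function(x, na.rm = TRUE) {
  z <- (x - mean(x, na.rm = na.rm)) / sd(x, na.rm = na.rm)
  return(z)
}

# Implicit return: the value of the last evaluated expression is returned
cv <- function(x) {
  sd(x) / mean(x)
}

standardize(c(1, 2, 3))   # -1 0 1
cv(c(10, 12, 14))         # coefficient of variation: sd divided by mean
```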
- Logistic regression for binary outcomes
- Basic syntax with glm()
- Inference for coefficients (hypothesis tests)
- Making predictions
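A minimal logistic regression sketch, again using the built-in mtcars data. Treating am (transmission type, coded 0/1) as the binary outcome and wt as the predictor is an arbitrary choice for illustration.

```r
# Logistic regression: family = binomial tells glm() the outcome is binary
logit_fit <- glm(am ~ wt, data = mtcars, family = binomial)
summary(logit_fit)   # z-tests and p-values for the coefficients

new_car <- data.frame(wt = 3)

# Default predictions are on the log-odds (link) scale
log_odds <- predict(logit_fit, new_car)

# type = "response" gives the predicted probability that am equals 1
prob <- predict(logit_fit, new_car, type = "response")
log_odds
prob
```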
Review materials
For Part 1 (on paper, closed computer), I have created a practice exam to give you a sense of the format and topics: part_1_practice.pdf.
The solution is here: part_1_practice_solution.pdf.
Additional practice questions for the live competition on Menti: link. Downloadable version: extra_practice.md