Exam 1 (Sep 30): review and practice materials

Solutions

Format

As described in the syllabus, the exam will have two parts: a pen-and-paper part followed by an open-computer part.

  • Part 1:
    • Closed computer and closed notes
    • Handwritten answers
  • Part 2:
    • Open computer. Any online resource that is “not alive” may be used (i.e. you may freely use online resources, but you may not communicate with any other person).
    • Electronic submission on Moodle: you will fill in a Quarto script with R code and written answers

Once you turn in Part 1, you may open your computer and download the blank Quarto script for Part 2. You may not return to Part 1 after turning it in and beginning the open-computer portion.

Topics

Based on material from Weeks 1-5 and Labs 1-5, the exam will cover:

  • R Programming Fundamentals
    • Variable assignment and basic operations
    • Data types and classes (typeof(), class(), str())
    • Type coercion (implicit and explicit)
    • Vectors: creation, subsetting, and operations
    • Data frames: structure, column extraction, subsetting
    • Factors for categorical data
    • Missing values (NA)
  • Descriptive Statistics
    • Measures of center: mean, median
    • Measures of spread: standard deviation, variance, IQR, range
    • Outlier-resistant vs. outlier-sensitive statistics
    • Frequency tables and proportions for categorical data
    • Data visualization: histograms, boxplots
  • Statistical Inference Concepts
    • Populations, samples, parameters, and statistics
    • Hypothesis testing framework
    • P-values and their interpretation
    • Confidence intervals and their interpretation
    • Statistical significance (α = 0.05)
  • Hypothesis Tests and Confidence Intervals in R
    • One-sample t-test for means: t.test(x, mu = .)
    • Two-sample t-test for comparing means: t.test(y ~ group)
    • One-sample proportion test: prop.test(x, n, p = .)
    • Two-sample proportion test: prop.test()
    • Chi-squared test of independence: chisq.test()
    • Constructing and interpreting confidence intervals
  • Data Visualization
    • Creating plots with ggplot2: ggplot(), aes(), geom_histogram(), etc.
    • Basic plot customization (titles, axis labels)
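As a refresher on the test functions named above, here is a minimal sketch in base R. It is only an illustration, not a preview of the exam: the built-in mtcars dataset stands in for real data, the null value of 20 is arbitrary, and the counts passed to the two-sample prop.test() and to chisq.test() are made up.

```r
# Minimal sketch of the tests listed above, using base R only.
# mtcars is a built-in stand-in dataset; the counts further down are hypothetical.
data(mtcars)

# One-sample t-test: is the mean mpg different from 20? (20 is an arbitrary null value)
one_sample <- t.test(mtcars$mpg, mu = 20)
one_sample$p.value     # p-value for H0: mu = 20
one_sample$conf.int    # 95% confidence interval for the mean

# Two-sample t-test (formula interface): mpg by transmission type (am = 0 or 1)
two_sample <- t.test(mpg ~ am, data = mtcars)

# One-sample proportion test: share of manual cars vs. a null proportion of 0.5
n_manual <- sum(mtcars$am == 1)
one_prop <- prop.test(x = n_manual, n = nrow(mtcars), p = 0.5)

# Two-sample proportion test: hypothetical success counts in two groups of 100
two_prop <- prop.test(x = c(45, 30), n = c(100, 100))

# Chi-squared test of independence on a hypothetical 2x2 table of counts
counts <- matrix(c(30, 20, 15, 35), nrow = 2,
                 dimnames = list(group = c("A", "B"), outcome = c("yes", "no")))
indep <- chisq.test(counts)

# Compare each p-value with alpha = 0.05
alpha <- 0.05
p_values <- c(one_t = one_sample$p.value, two_t = two_sample$p.value,
              one_prop = one_prop$p.value, two_prop = two_prop$p.value,
              chisq = indep$p.value)
p_values < alpha
```

Each of these functions returns an object whose p.value element can be pulled out with `$`. t.test() and prop.test() also return a conf.int element, which is the interval you would be asked to interpret.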

Practice materials

Part 1

For Part 1 (on paper, closed computer), I have created a practice exam to give you a sense of the format and topics: part_1_practice.pdf.

The solution is here: part_1_practice_solution.pdf.

Part 2

For Part 2, the format will be very similar to the in-class Labs. Given the open format of this part of the exam, I will not provide a separate practice exam for Part 2.

You will download a Quarto script and edit it to fill in blank code chunks and free-text responses (again, just like the labs).

The questions will involve exploration of a single real dataset. Coding tasks and questions about this dataset will be drawn from the list of topics above.

I suggest reviewing or even re-doing the labs as a way of practicing for Part 2.
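If you want a quick warm-up before re-doing the labs, the sketch below walks through a few of the listed fundamentals on a single dataset. It assumes the built-in airquality dataset as a stand-in (the exam dataset will be different), and the variable names aq, hot_days, and so on are only illustrative.

```r
# Warm-up sketch on a built-in dataset (a stand-in, not the exam data)
aq <- airquality            # work on a copy

# Structure, class, and type
str(aq)
class(aq$Ozone)
typeof(aq$Temp)

# Turn the numeric month into a factor with readable labels
aq$Month <- factor(aq$Month, levels = 5:9, labels = month.abb[5:9])

# Missing values: count them, then see how they affect summaries
n_missing <- sum(is.na(aq$Ozone))
mean(aq$Ozone)                                   # NA, because Ozone has missing values
ozone_mean   <- mean(aq$Ozone, na.rm = TRUE)     # outlier-sensitive center
ozone_median <- median(aq$Ozone, na.rm = TRUE)   # outlier-resistant center
ozone_sd     <- sd(aq$Ozone, na.rm = TRUE)       # outlier-sensitive spread
ozone_iqr    <- IQR(aq$Ozone, na.rm = TRUE)      # outlier-resistant spread

# Frequency table and proportions for a categorical variable
month_counts <- table(aq$Month)
month_props  <- prop.table(month_counts)

# Subsetting a data frame: rows by condition, columns by name
hot_days <- aq[aq$Temp > 90, c("Month", "Temp")]
```

Here the mean of Ozone is larger than the median, which is the pattern you would expect from a right-skewed variable and a useful reminder of why outlier-resistant statistics matter.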