DATA505 Case Study
Predictors of Student Performance
1 About
What can we learn from data about key factors that influence student performance?
In this project, you will explore this question using a dataset (described in Chapter 2) about student achievement in secondary education at two Portuguese schools, and its relation to various social, demographic, and educational factors.
We will extend our use of Quarto to do literate programming in R. You will write R functions to execute the steps of the data analysis, and will also write narrative text (in Quarto) to explain the analysis and your findings in plain English.
1.1 Objectives
- Produce a “filled-in” version of this website, following the overall structure in the table of contents at left. That is, you will:
  - Perform exploratory data analysis, creating visual and descriptive summaries of the data
  - Develop a regression model to predict student performance in relation to other characteristics of the students and their environments, describing your model selection process
  - Draw conclusions in the context of this problem
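As a rough illustration of the first two objectives, the sketch below summarizes a small simulated data frame and compares two linear regressions fitted with lm(). The column names (final_grade, study_hours, absences) and all values are assumptions made up for this example; they are not the variables of the actual dataset, which Chapter 2 describes. AIC is shown as one possible selection criterion, not a required one.

```r
# Hypothetical toy data: simulated values, NOT the real dataset
set.seed(505)
n <- 100
toy <- data.frame(
  study_hours = runif(n, min = 0, max = 10),
  absences    = rpois(n, lambda = 4)
)
# Simulate a grade that depends on both predictors plus random noise
toy$final_grade <- 8 + 0.6 * toy$study_hours - 0.3 * toy$absences +
  rnorm(n, sd = 2)

# Exploratory step: descriptive summaries and pairwise correlations
summary(toy)
cor(toy)

# Modelling step: fit a smaller and a larger candidate model
fit_small <- lm(final_grade ~ study_hours, data = toy)
fit_full  <- lm(final_grade ~ study_hours + absences, data = toy)

# One possible way to compare candidates: lower AIC is preferred
AIC(fit_small, fit_full)

# Coefficient table for the larger model
summary(fit_full)$coefficients
```

In the real project, the corresponding steps would be wrapped in functions and explained in the narrative text of each chapter.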
1.2 Checklist
- Read the dataset description in Chapter 2 (completed for you)
- To get a sense of what you will be asked to do, read the instructions in Chapter 3 as an example. For each section, you will be writing R functions to perform data-analysis tasks, and then filling in a Quarto document to explain your work and findings. This will entail parallel editing of both R scripts (to define the functions) and Quarto files (to explain your work in plain language), filling in a data-analysis project directory.
- Download the zip file student_performance.zip, which contains a template of the project directory, including all the code used to produce this site.
- Find and unzip it on your computer, then open the project in RStudio or Positron
- Edit _quarto.yml to add your name
- Read and follow the instructions in each section to complete the analysis and the website
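The parallel editing of R scripts and Quarto files mentioned in the checklist might look roughly like the sketch below. The function name load_students() and its arguments are hypothetical, and the small CSV written here is a stand-in so that the example runs on its own; the real template may organize this differently.

```r
# Hypothetical helper of the kind that might be defined in R/01_data.R.
# The name load_students() and the default separator are assumptions.
load_students <- function(path, sep = ",") {
  stopifnot(file.exists(path))
  read.csv(path, sep = sep, stringsAsFactors = TRUE)
}

# Stand-in data file (made-up contents) so the sketch runs anywhere
demo_path <- tempfile(fileext = ".csv")
write.csv(data.frame(school = c("A", "B"), grade = c(12, 15)),
          demo_path, row.names = FALSE)

# In a Quarto document, a code chunk would first load the script, e.g.
#   source("R/01_data.R")
# (the exact relative path depends on where the document is rendered from),
# and then call the function. The surrounding narrative explains the result.
students <- load_students(demo_path)
str(students)
```

Keeping the function definition in the R script and only the call in the Quarto file keeps the narrative readable while the analysis logic stays reusable.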
1.3 Project organization
The project directory is organized as follows:
student_performance/
├── _brand.yml                  # Drew branding (colors, logo, fonts)
├── _quarto.yml                 # Quarto book configuration
├── index.qmd                   # This file - project introduction and overview
├── student_performance.Rproj   # RStudio project file
│
├── quarto/                     # Quarto documents for the analysis narrative
│   ├── 01_data.qmd             # Data description and loading
│   ├── 02_exploration.qmd      # Exploratory data analysis
│   ├── 03_modelling.qmd        # Model development and selection
│   └── 04_conclusions.qmd      # Final conclusions and interpretation
│
├── R/                          # R scripts with analysis functions
│   ├── 01_data.R               # Functions for data loading and preparation
│   ├── 02_exploration.R        # Functions for exploratory analysis
│   └── 03_modelling.R          # Functions for model fitting and evaluation
│
├── data/                       # Data files
│   ├── raw/                    # Original, unmodified data
│   │   └── student-mat.csv     # Student performance dataset
│   └── derived/                # Processed/cleaned data (you will create these)
│
└── extras/                     # Additional resources
    └── drew_logo.png           # Drew logo for branding
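The split between data/raw/ and data/derived/ suggests a workflow in which raw files are only read, and any cleaned version is written to a separate location. A minimal sketch of that idea follows, using a temporary directory and a made-up file (example_raw.csv) so that it is self-contained; the file names, columns, and cleaning rule are assumptions for illustration only.

```r
# Recreate the data/ layout inside a temporary directory (illustration only)
root        <- file.path(tempdir(), "student_performance_demo")
raw_dir     <- file.path(root, "data", "raw")
derived_dir <- file.path(root, "data", "derived")
dir.create(raw_dir, recursive = TRUE, showWarnings = FALSE)
dir.create(derived_dir, recursive = TRUE, showWarnings = FALSE)

# Hypothetical raw file with made-up contents (not the real student-mat.csv)
raw_path <- file.path(raw_dir, "example_raw.csv")
write.csv(data.frame(id = 1:4, grade = c(10, NA, 14, 17)),
          raw_path, row.names = FALSE)

# Read the raw data, drop rows with a missing grade, and save the result
# under derived/. The raw file itself is never modified.
raw          <- read.csv(raw_path)
cleaned      <- raw[!is.na(raw$grade), ]
derived_path <- file.path(derived_dir, "example_clean.csv")
write.csv(cleaned, derived_path, row.names = FALSE)

# Show the resulting files relative to the project root
list.files(root, recursive = TRUE)
```

Building paths with file.path() rather than pasting strings keeps the code portable across operating systems.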
1.4 Optional: rendering and publishing online
See documentation on Posit Publisher
1.4.1 In Positron
- Type Ctrl+Shift+P, then search for and run “Posit Publisher: New Credential” and choose Posit Connect Cloud. You should be taken through some steps automatically:
  - You will need to create an additional account at Posit Connect Cloud
  - You will also need to authenticate Positron to this service
- Type Ctrl+Shift+P, then search for and run “Posit Publisher: New Deployment” and choose index.qmd. You will have an option between deploying with source code or deploying the rendered document only. Either is acceptable, but choosing the rendered document may be simpler.
  - A window should open at the left to “Deploy Your Project”. Click that button.
This should “deploy” your website to a Posit Connect server. You can then share the resulting link, for example with me for grading.