Appendices

Test Your Knowledge: Solutions

  1. Log-transform the variable age in data and save the result as age.log.
age.log <- log(data$age)


  1. Square all values in cesd.1 and save the result as cesd.1.squared within the data set data.
data$cesd.1.squared <- data$cesd.1^2


  1. Calculate the mean and standard deviation (\(SD\)) of the variable cesd.2. If necessary, use the Internet to find out which function in R calculates the standard deviation of a numeric vector.
mean(data$cesd.2, na.rm=TRUE)
sd(data$cesd.2, na.rm=TRUE)
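The na.rm=TRUE argument matters whenever a variable contains missing values. The following minimal sketch illustrates this with a made-up vector x, which merely stands in for data$cesd.2 (the values are invented for illustration and are not taken from the trial data):

```r
# Toy vector standing in for data$cesd.2 (values are made up)
x <- c(10, 20, NA, 30)

# Without na.rm, a single missing value makes the result NA
mean(x)

# With na.rm=TRUE, missing values are dropped before calculating
m <- mean(x, na.rm = TRUE)
s <- sd(x, na.rm = TRUE)
m
s
```

If a variable has no missing values, both calls return the same result, so setting na.rm=TRUE is a safe default here.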


  1. Save the calculated mean and standard deviation of cesd.2 in a list element.
# The object name cesd.2.stats is arbitrary; any valid name works.
cesd.2.stats <- list(mean = mean(data$cesd.2, na.rm=TRUE),
                     sd = sd(data$cesd.2, na.rm=TRUE))


  1. Does the variable atsphs.0 in data have the desired class numeric? Try to confirm this using R code.
class(data$atsphs.0)
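If the class turns out not to be numeric, the variable can usually be converted. Below is a small sketch using a hypothetical data frame toy, in which atsphs.0 was (by assumption) imported as text; the is.numeric function returns a simple TRUE/FALSE answer, which can be handier than reading the class name:

```r
# Hypothetical data frame in which atsphs.0 was imported as character
toy <- data.frame(atsphs.0 = c("3", "5", "4"), stringsAsFactors = FALSE)

class(toy$atsphs.0)             # shows the current class
before <- is.numeric(toy$atsphs.0)

# Convert to numeric and check again
toy$atsphs.0 <- as.numeric(toy$atsphs.0)
after <- is.numeric(toy$atsphs.0)

before
after
```

Note that as.numeric turns entries that cannot be read as numbers into NA (with a warning), so it is worth inspecting the variable after converting it.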


  1. Create two new variables in data: (1) age.65plus, a logical variable which indicates if a person’s age is 65 or above; and (2) cesd.diff, a variable that contains the difference between cesd.0 and cesd.1 for each patient.
data$age.65plus <- data$age >= 65
data$cesd.diff <- data$cesd.0 - data$cesd.1


  1. Using the pipe operator, filter the records so that only patients who are male (sex=0) and part of the intervention group (group=1) remain; then, in this subset, select the variable age and calculate its mean.
data %>%
  filter(sex==0, group==1) %>%
  pull(age) %>%
  mean()
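For comparison, the same result can be obtained in base R without the pipe. The sketch below uses a made-up data frame toy (variable names follow the exercise, values are invented) so that it runs on its own:

```r
# Made-up data with the same variable names as in the exercise
toy <- data.frame(sex   = c(0, 0, 1, 0),
                  group = c(1, 1, 1, 0),
                  age   = c(30, 40, 50, 60))

# Logical condition: male (sex=0) AND intervention group (group=1)
keep <- toy$sex == 0 & toy$group == 1

# which() drops rows where the condition is NA, mirroring how filter() behaves
mean.age <- mean(toy$age[which(keep)])
mean.age
```

The pipe version is usually easier to read once several steps are chained; the base R version has the advantage of not requiring any additional packages.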


  1. In the fifth and sixth row of data, change the value of degree to NA (missing).
data[5:6,"degree"] <- c(NA, NA)
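To confirm that the replacement worked, the is.na function can be used. This is a minimal sketch with a hypothetical data frame toy; it also shows that a single NA is sufficient on the right-hand side, because R recycles it across all selected cells:

```r
# Hypothetical data frame with seven rows
toy <- data.frame(degree = c(1, 2, 3, 4, 5, 6, 7))

# A single NA is recycled across both selected rows
toy[5:6, "degree"] <- NA

# Check the changed rows and count all missing values in the variable
changed <- is.na(toy$degree[5:6])
n.missing <- sum(is.na(toy$degree))
changed
n.missing
```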



R Package Information

All code in this tutorial was tested under R version 4.2.0. The following package versions were used:

    dplyr_1.0.10        openxlsx_4.2.5      scales_1.2.0  
    mice_3.14.0         plot.matrix_1.6.2   skimr_2.1.4  
    miceadds_3.12-26    psych_2.2.3         stdReg_3.4.1  
    mitml_0.4-3         purrr_0.3.4         tidyr_1.2.0       
    mmrm_0.2.2.9013     RefBasedMI_0.1.0    tidyverse_1.3.1 
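To compare these versions with the ones installed on your own computer, something along the following lines can be used. This is only a sketch: the vector pkgs lists three of the packages above as an example and can be extended as needed; packages that are not installed are skipped.

```r
# Example selection of packages from the list above
pkgs <- c("dplyr", "mice", "mitml")

# Check which of them are installed (without attaching them)
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

# Print the installed version of each available package
for (p in pkgs[installed]) {
  cat(p, as.character(packageVersion(p)), "\n")
}

# R version currently in use
R.version.string
```

Running sessionInfo() after loading all packages gives a similar overview in one call.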


Data & Downloads

The data we use in this tutorial has been uploaded to a GitHub repository, which can be found at github.com/MathiasHarrer/rct-tutorial. The repository has also been permanently archived on Zenodo. We also uploaded an R script that includes all the code we used in this tutorial. Quick download links are provided below.

  • rct-tutorial.zip. A ZIP folder containing the example trial data set (see data.xlsx), as well as the complete tutorial code (see code.R).

  • data.xlsx. The original, unimputed example data set, which includes simulated patient data of a randomized controlled trial comparing an Internet-based depression intervention to a waitlist control (\(N=546\)).

  • imp.rda. The imputed data set object generated by the mice function (see Section 3.2.2). After being imported into R, the object has the name imp.

  • imp.j2r.rda. The imputed data set object generated by the RefBasedMI function (“jump-to-reference” imputation; see Section 3.2.3). After being imported into R, the object has the name imp.j2r.

  • implist.rda. The imp object transformed to a list of imputed data sets (mitml.list). This multiply imputed data format can be used to pool analyses using the testEstimates function (see Section 4.1.3).

  • implist.j2r.rda. The imp.j2r object transformed to a list of imputed data sets (mitml.list). This multiply imputed data format can be used to pool analyses using the testEstimates function (see Chapter ).

  • code.R. All code used in the tutorial, collected in one script.

  • skimReport.R. Code of the skimReport function, which is used in Section 5.3 to generate descriptive tables.

To open .rda files, save them in your analysis folder. Then, open the folder through the Files pane in RStudio and click on the .rda file. This automatically imports the object into your R environment. To see how the data.xlsx Excel file can be imported, see Section 1.3.
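Alternatively, .rda files can be imported with code using the load function. The sketch below is self-contained: it first saves a made-up object called imp.demo to a temporary file (both the object and the file name are hypothetical and only serve as an illustration) and then loads it again.

```r
# Create a made-up object and save it as an .rda file in a temporary folder
imp.demo <- list(note = "placeholder for an imputed data object")
path <- file.path(tempdir(), "imp.demo.rda")
save(imp.demo, file = path)

# Remove the object, then restore it from the file
rm(imp.demo)
loaded <- load(path)

# load() returns the names of the restored objects
loaded
exists("imp.demo")
```

With the downloaded files, the equivalent call would be load("imp.rda"), assuming the file is located in your working directory. As described above, the object then appears in the environment under the name imp.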


Errata & Corrections

Statistical software is constantly evolving, and it is possible that some of the code provided in this tutorial may stop working over time. Errata and/or corrections will be documented here.

If you find an error in the tutorial, feel free to contact Mathias (mathias.harrer@tum.de).

Last updated 2023-06-23.