One way repeated measures ANOVA in R

Description of data
Clear R environment
Importing data
One way repeated measures ANOVA model
Visualizing results
- Line graph
Mean separation test

In this video you will learn how to carry out a one-way repeated measures ANOVA in RStudio. A repeated measures design is one in which at least one factor consists of repeated measurements on the same subjects or experimental units, under different conditions or at different time points. It can be viewed as an extension of the paired samples t-test, which involves only two related measures. Unlike in a regular ANOVA, these measures are correlated rather than independent.

Why use a repeated measures ANOVA? The reasons are:

Individual differences between subjects are separated from the error term, so they contribute much less to the apparent differences between conditions.
The sample is not split between conditions or groups, so inferential tests are more powerful.
The design is economical when participants are difficult to recruit.

We can consider the repeated measures ANOVA as a special case of two-way ANOVA in which each cell holds a single measurement for one research subject or participant. The columns are the repeated measurements and the rows are the individual participants or subjects.

The simplest example of a repeated measures design is the paired samples t-test, in which each participant is assigned to two treatment levels or, equivalently, is measured at two time points.
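The link between the two methods can be checked numerically. The sketch below uses simulated data (the data frame two_points and all of its values are made up for illustration) and shows that, with only two time points, the F value from the repeated measures ANOVA equals the square of the paired t statistic.

```r
# Hypothetical data: 6 subjects measured at two time points (values are simulated)
set.seed(1)
two_points <- data.frame(
  subject = factor(rep(LETTERS[1:6], times = 2)),
  time    = factor(rep(1:2, each = 6)),
  resp    = c(rnorm(6, mean = 2), rnorm(6, mean = 3))
)

# Paired t-test: rows are in the same subject order within each time point
tt <- t.test(two_points$resp[two_points$time == 2],
             two_points$resp[two_points$time == 1],
             paired = TRUE)

# Repeated measures ANOVA on the same data
fit <- aov(resp ~ time + Error(subject/time), data = two_points)
f_value <- summary(fit)[["Error: subject:time"]][[1]][["F value"]][1]

# With two levels, the F statistic equals the squared t statistic
all.equal(unname(tt$statistic)^2, f_value)
```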

If we observe subjects at more than two time points, we need a repeated measures ANOVA.

It decomposes the variability into:

A random subject effect
A fixed treatment or time effect

Treating subject as a random effect lets us generalize the conclusions to the population from which these subjects were drawn.
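This decomposition is often written as an additive model. The following is a sketch in common textbook notation; the symbols are introduced here for illustration and do not appear in the original analysis.

\[
y_{ij} = \mu + \pi_i + \tau_j + \varepsilon_{ij}, \qquad \pi_i \sim N(0, \sigma_\pi^2), \quad \varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2)
\]

Here \(y_{ij}\) is the response of subject \(i\) at time \(j\), \(\mu\) is the overall mean, \(\pi_i\) is the random subject effect, \(\tau_j\) is the fixed time effect, and \(\varepsilon_{ij}\) is the residual error.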

Description of data

Here is the data set used in this analysis. The first variable identifies the subjects, or individuals, each of whom was given a dose of a pain-relieving drug. The second variable is the time of measurement in weeks; tolerance was measured at four time points. The third variable is the response, that is, the tolerance of each subject at each time point.
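This layout is usually called long format: one row per subject per time point. As a rough sketch, the code below builds a data frame with the same three columns and checks that the design is balanced. The object name layout_demo and the response values are made up; only the column names follow the data set described above.

```r
# Hypothetical long-format layout: one row per subject per week.
# Subject labels and response values are made up for illustration.
set.seed(42)
layout_demo <- expand.grid(subject = LETTERS[1:6], time = 1:4)
layout_demo$resp <- round(runif(nrow(layout_demo), min = 0, max = 5), 2)

# Cross-tabulate subject by time: a balanced design has exactly one
# measurement in every cell
counts <- table(layout_demo$subject, layout_demo$time)
print(counts)
all(counts == 1)
```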

Clear R environment

As a first step, I always recommend clearing the objects in the global environment with the rm() function. Setting the all argument of ls() to TRUE (short for all.names) also removes hidden objects whose names begin with a dot. Close all graphics windows with graphics.off(). Passing "cls" to shell() is meant to clear the console; note that shell() is available only on Windows, and in RStudio cat("\014") is the usual way to clear the console.

rm(list = ls(all = TRUE))
graphics.off()
shell("cls")

Importing data

To import the data set, I have placed the CSV file in the project working directory. Create an object named data and assign to it the result of read.csv(). In the file argument, type a pair of quotation marks after the equals sign and press the Tab key between them to list the files in the working directory, then choose the CSV data file. Set the header argument to TRUE to indicate that the first row of the file contains the variable names.

In the next line, head() prints the first six rows of the data frame. The first two variables are then converted to factors with as.factor(). Finally, attach() adds the data frame to the search path so that its columns can be referred to directly by name.

data = read.csv(
          file = 'repeated.csv',
          header = TRUE
)

head(data)

data$subject = as.factor(data$subject)
data$time = as.factor(data$time)

attach(data)

#   subject time resp
# 1       A    1 0.12
# 2       B    1 1.25
# 3       C    1 2.35
# 4       D    1 3.31
# 5       E    1 2.21
# 6       F    1 2.47
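As a hedged alternative to attach(), the sketch below checks the factor conversion with str() and then uses with() to work inside the data frame. It writes a tiny CSV to a temporary file so that it runs on its own. The object name demo, the file contents, and the week 2 values are assumptions for illustration, not the real data file.

```r
# Write a small hypothetical CSV to a temporary file so the example is self-contained
csv_path <- tempfile(fileext = ".csv")
write.csv(
  data.frame(subject = rep(c("A", "B"), times = 2),
             time    = rep(1:2, each = 2),
             resp    = c(0.12, 1.25, 1.40, 2.10)),
  file = csv_path, row.names = FALSE
)

# Read it back and convert the grouping columns to factors
demo <- read.csv(file = csv_path, header = TRUE)
demo$subject <- as.factor(demo$subject)
demo$time    <- as.factor(demo$time)

# str() confirms the column types after conversion
str(demo)

# with() evaluates an expression inside the data frame without attach()
with(demo, tapply(resp, time, mean))
```

Working through with() or a data argument avoids the name clashes that attach() can cause when an object in the global environment has the same name as a column.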

One way repeated measures ANOVA model

To perform a repeated measures ANOVA in R, we treat time as the within-subject variable and subject as a random factor. Use the aov() function with a formula in which the response variable is modelled by time, the grouping variable. The Error() term takes subject/time, which nests time within subject rather than dividing one by the other, and splits the error into a subject stratum and a subject-by-time interaction stratum. Calling summary() on the model object prints the output of the fitted model.

model = aov(formula = resp ~ time + Error(subject/time), data = data)
summary(model)

# 
# Error: subject
#           Df Sum Sq Mean Sq F value Pr(>F)
# Residuals  5  6.895   1.379               
# 
# Error: subject:time
#           Df Sum Sq Mean Sq F value   Pr(>F)    
# time       3 28.561   9.520    27.4 2.47e-06 ***
# Residuals 15  5.212   0.347                     
# ---
# Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Here you can see the difference between a one-way ANOVA and a one-way repeated measures ANOVA: it lies in the residuals. In the repeated measures analysis the residuals are split into a subject stratum and a subject-by-time interaction stratum, whereas a one-way ANOVA has a single residual term. Because time is tested against the interaction residuals only (F = 9.520 / 0.347 ≈ 27.4 on 3 and 15 degrees of freedom), this decomposition considerably reduces the residual variance used to calculate the F value.
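The split of the residuals can be verified directly. The following sketch uses simulated balanced data (the object sim and its values are hypothetical, not the data set analysed here) and confirms that the residual sum of squares from an ordinary one-way ANOVA equals the sum of the residuals in the two error strata.

```r
# Hypothetical balanced data: 6 subjects, 4 time points (values are simulated)
set.seed(123)
sim <- expand.grid(subject = factor(LETTERS[1:6]), time = factor(1:4))
subject_shift <- rnorm(6, sd = 1)  # one random shift per subject
sim$resp <- 2 + as.integer(sim$time) +
  subject_shift[as.integer(sim$subject)] +
  rnorm(nrow(sim), sd = 0.5)

# Ordinary one-way ANOVA: a single residual term (second row of the table)
one_way <- summary(aov(resp ~ time, data = sim))[[1]]
ss_resid_one_way <- one_way[["Sum Sq"]][2]

# Repeated measures ANOVA: residuals are split across two error strata
rm_fit <- summary(aov(resp ~ time + Error(subject/time), data = sim))
ss_subject     <- rm_fit[["Error: subject"]][[1]][["Sum Sq"]][1]
ss_interaction <- rm_fit[["Error: subject:time"]][[1]][["Sum Sq"]][2]

# The one-way residual sum of squares equals the sum of the two strata residuals
all.equal(ss_resid_one_way, ss_subject + ss_interaction)
```

Only ss_interaction enters the denominator of the F test for time, which is why removing the subject variability makes the test more sensitive.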

Visualizing results

Line graph

Now examine a line graph of the mean response plotted over time to see whether the trend is positive. To draw it, we first need summary statistics for the data object, which can be obtained with the dplyr package.

library(dplyr)
summary = data %>%
          group_by(time) %>%
          summarize(Mean_resp = mean(resp),
                    SD_resp = sd(resp),
                    SE_resp = sd(resp)/sqrt(length(resp)),
                    n = n())
print(summary)

# # A tibble: 4 x 5
#   time  Mean_resp SD_resp SE_resp     n
#   <fct>     <dbl>   <dbl>   <dbl> <int>
# 1 1          1.95   1.11    0.454     6
# 2 2          3.04   0.903   0.369     6
# 3 3          3.64   0.449   0.183     6
# 4 4          4.97   0.409   0.167     6

Now use the plot() function to draw the graph. The type argument selects the kind of plot; the value 'o' draws both points and lines, overplotted. The xlab and ylab arguments set the x-axis and y-axis labels. The means rise from 1.95 in week 1 to 4.97 in week 4, so the line graph shows a positive trend.

plot(summary$Mean_resp,
     type = 'o',
     xlab = 'Time',
     ylab = 'response')
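The standard errors from the summary table can also be shown on the graph. This is one possible sketch using base graphics. The means and standard errors are copied from the summary table printed earlier, while the variable names (weeks, mean_resp, se_resp, lower, upper) are introduced here for illustration.

```r
# Summary values copied from the summary table (mean and standard error per week)
weeks     <- 1:4
mean_resp <- c(1.95, 3.04, 3.64, 4.97)
se_resp   <- c(0.454, 0.369, 0.183, 0.167)
lower <- mean_resp - se_resp
upper <- mean_resp + se_resp

# Draw the line graph with a y-axis wide enough to hold the error bars
plot(weeks, mean_resp, type = "o",
     ylim = range(lower, upper),
     xlab = "Time (weeks)", ylab = "Mean response", xaxt = "n")
axis(side = 1, at = weeks)

# arrows() with angle = 90 and code = 3 draws flat caps at both ends
arrows(x0 = weeks, y0 = lower, x1 = weeks, y1 = upper,
       angle = 90, code = 3, length = 0.05)
```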

Mean separation test

To compute pairwise comparisons between the group levels with a correction for multiple testing, use the pairwise.t.test() function. The p.adjust.method argument accepts any of the methods listed in p.adjust.methods, such as 'holm', 'bonferroni', or 'BH'. I shall use the Bonferroni correction to explore the differences between the means.

pairwise.t.test(
          x = resp,
          g = time,
          p.adjust.method = 'bonferroni'
)

# 
#   Pairwise comparisons using t tests with pooled SD 
# 
# data:  resp and time 
# 
#   1       2      3     
# 2 0.1501  -      -     
# 3 0.0074  1.0000 -     
# 4 9.1e-06 0.0021 0.0456
# 
# P value adjustment method: bonferroni

The results show that the comparisons between weeks 1 and 2 (p = 0.1501) and between weeks 2 and 3 (p = 1.0000) are not significant at the 5% level, while the remaining comparisons are significant.
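The call above uses a pooled SD and therefore treats the weeks as independent groups. A variant that respects the repeated measurements sets paired = TRUE. The sketch below illustrates this on simulated data (the object sim_pairs and its values are hypothetical) and also shows what the Bonferroni adjustment does to the raw p values.

```r
# Hypothetical balanced data, ordered so that subjects line up across weeks
set.seed(2024)
n_subjects <- 6
sim_pairs <- data.frame(
  subject = factor(rep(LETTERS[1:n_subjects], times = 4)),
  time    = factor(rep(1:4, each = n_subjects))
)
sim_pairs$resp <- 1 + as.integer(sim_pairs$time) +
  rep(rnorm(n_subjects), times = 4) +
  rnorm(nrow(sim_pairs), sd = 0.5)

# paired = TRUE pairs observations by position within each level of g,
# so rows must be in the same subject order within every week
raw <- pairwise.t.test(sim_pairs$resp, sim_pairs$time, paired = TRUE,
                       p.adjust.method = "none")
adj <- pairwise.t.test(sim_pairs$resp, sim_pairs$time, paired = TRUE,
                       p.adjust.method = "bonferroni")

# Bonferroni multiplies each raw p value by the number of comparisons
# (6 for four weeks) and caps the result at 1
all.equal(adj$p.value, pmin(raw$p.value * 6, 1))
```

Whether the paired or the pooled version is appropriate depends on the analysis. The paired form is shown here only as an option that is consistent with the repeated measures design.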

