One way repeated measures ANOVA in R
In this video you will learn how to carry out one-way repeated measures ANOVA using RStudio. A repeated measures design is one in which at least one of the factors consists of repeated measurements on the same subjects or experimental units under different conditions or at different time points. It can be viewed as an extension of the paired samples t-test, which involves only two related measures. Unlike in regular ANOVA, these measures are correlated and not independent.
Why should one use repeated measures ANOVA? The reasons are:
- Individual differences are considerably reduced as a source of between-group differences, because every subject is measured under every condition.
- The sample is not divided between conditions or groups, and thus inferential testing becomes more powerful.
- The design proves to be economical when sample members are difficult to recruit.
We can consider repeated measures ANOVA as a special case of two-way ANOVA in which each cell holds a single measurement for one research subject or participant. The columns are the repeated measurements, and the rows are the individual participants or subjects.
The simplest example of a repeated measures design is the paired samples t-test, where each participant is assigned to two treatment levels or, equivalently, is measured at two time points.
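The link between the paired t-test and repeated measures ANOVA can be checked numerically. The following is a minimal sketch on simulated data (the values and the seed are assumptions for illustration, not the data set used in this article): with only two time points, the F value for time should equal the square of the paired t statistic.

```r
# Simulated data (hypothetical values, NOT the article's data set):
# 6 subjects, each measured at two time points.
set.seed(1)
subject = factor(rep(LETTERS[1:6], times = 2))
time    = factor(rep(1:2, each = 6))
resp    = c(rnorm(6, mean = 2), rnorm(6, mean = 3))

# Paired t-test on the two sets of measurements
# (rows are in the same subject order within each time point)
tt = t.test(resp[time == 1], resp[time == 2], paired = TRUE)

# Repeated measures ANOVA on the same data
fit = aov(resp ~ time + Error(subject/time))
f_value = summary(fit)[["Error: subject:time"]][[1]][["F value"]][1]

# With two levels, F for time equals the squared t statistic
t_squared = unname(tt$statistic)^2
print(c(t_squared = t_squared, F = f_value))
```

The two p values agree as well, so for two time points either test may be used.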
If we observe subjects at more than two time points, then we need to conduct a repeated measures ANOVA.
It will decompose the variability into:
- A random subject effect
- A fixed treatment or time effect
Treating subject as a random effect allows the conclusions to be generalized to the population from which these subjects were drawn.
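To make the decomposition concrete, here is a rough sketch that computes the sums of squares by hand for a balanced design. The data are simulated (an assumption for illustration only), and the variable names ss_subject, ss_time and ss_error are my own labels.

```r
# Simulated balanced design (hypothetical values, NOT the article's data):
# 6 subjects, each measured at 4 time points.
set.seed(42)
n_subj  = 6
n_time  = 4
subject = factor(rep(LETTERS[1:n_subj], times = n_time))
time    = factor(rep(1:n_time, each = n_subj))
# Response = time trend + a per-subject shift + noise
resp    = as.integer(time) + rep(rnorm(n_subj), times = n_time) +
  rnorm(n_subj * n_time, sd = 0.5)

grand      = mean(resp)
ss_total   = sum((resp - grand)^2)
# Random subject effect: variation of subject means around the grand mean
ss_subject = n_time * sum((tapply(resp, subject, mean) - grand)^2)
# Fixed time effect: variation of time means around the grand mean
ss_time    = n_subj * sum((tapply(resp, time, mean) - grand)^2)
# What is left over is the subject-by-time (error) variation
ss_error   = ss_total - ss_subject - ss_time

# F value for time, using the reduced error term
f_time = (ss_time / (n_time - 1)) /
  (ss_error / ((n_subj - 1) * (n_time - 1)))
print(c(subject = ss_subject, time = ss_time, error = ss_error, F = f_time))
```

For a balanced design these hand-computed values should match the sums of squares reported by aov() with an Error() term.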
Description of data
Here is the data set used in this analysis. The first variable identifies the subjects or individuals, each of whom was given a dose of a pain-relieving drug. Tolerance was then measured at four time intervals. The second variable gives the time interval in weeks, and the third variable is the response, or tolerance, of each subject at each interval.
Clear R environment
As a first step, I always recommend clearing data objects and values from the global environment with the rm() function. Set the argument all (which R matches to all.names) to TRUE so that hidden objects created earlier are removed as well. Shut down all graphics windows with the graphics.off() function. Passing the value "cls" to the shell() function clears the console; note that shell() is available only on Windows.
rm(list = ls(all = TRUE))
graphics.off()
shell("cls")
Importing data
To import the data set, I have placed the CSV data file in the project working directory. Create an object data and assign to it the result of the read.csv() function. In the file argument, place the cursor between the quotation marks after the equal sign and press the Tab key to list the files in the working directory, then choose the CSV data file. In the next argument, header, type TRUE to indicate that the first row of the data file contains the variable names.
In the next line, use the head() function to print the first six rows of the data frame. Here I am converting the first two variables to factors with the as.factor() function. Finally, use the attach() function on the data object so that the variables in the data frame can be referred to directly by name.
data = read.csv(
  file = 'repeated.csv',
  header = TRUE
)
head(data)
data$subject = as.factor(data$subject)
data$time = as.factor(data$time)
attach(data)
# subject time resp
# 1 A 1 0.12
# 2 B 1 1.25
# 3 C 1 2.35
# 4 D 1 3.31
# 5 E 1 2.21
# 6 F 1 2.47
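Before fitting the model it can help to confirm that the factors were converted and that every subject was measured once at every time point. The sketch below uses a simulated stand-in for the data frame (the values are an assumption; only the column names subject, time and resp follow the article). It also shows that aov() accepts a data argument, which is an alternative to attach().

```r
# Hypothetical stand-in for the imported data frame
# (simulated values, NOT the contents of repeated.csv)
set.seed(7)
data = data.frame(
  subject = rep(LETTERS[1:6], times = 4),
  time    = rep(1:4, each = 6),
  resp    = round(runif(24, min = 0, max = 5), 2)
)
data$subject = as.factor(data$subject)
data$time    = as.factor(data$time)

# Check the variable types
str(data)

# Each subject should appear exactly once at each time point
counts = table(data$subject, data$time)
print(counts)
balanced = all(counts == 1)

# Alternative to attach(): pass the data frame to aov() directly
fit = aov(resp ~ time + Error(subject/time), data = data)
```

Passing data explicitly avoids name clashes between attached columns and other objects in the global environment.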
One way repeated measures ANOVA model
To perform repeated measures ANOVA in R, we treat subject as a random factor and time as the within-subject factor. Use the aov() function with a formula in which the response variable is modelled by time, the grouping variable. The Error() term subject/time is not a ratio: in R's formula syntax it means time nested within subject, which splits the error into a subject stratum and a subject-by-time interaction stratum. Calling the summary() function on the model object prints the output of the fitted model.
model = aov(formula = resp ~ time + Error(subject/time))
summary(model)
#
# Error: subject
# Df Sum Sq Mean Sq F value Pr(>F)
# Residuals 5 6.895 1.379
#
# Error: subject:time
# Df Sum Sq Mean Sq F value Pr(>F)
# time 3 28.561 9.520 27.4 2.47e-06 ***
# Residuals 15 5.212 0.347
# ---
# Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Here you can see the difference between one-way ANOVA and one-way repeated measures ANOVA. The difference is in the residuals. In one-way repeated measures ANOVA, the residuals are split into subject residuals and interaction residuals, whereas one-way ANOVA has a single residual term. This decomposition of the error term considerably reduces the residual variance used in calculating the F value.
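The split of the residuals can be verified by fitting both models to the same data. This is a sketch on simulated data (an assumption for illustration); the object names oneway and rm_fit are my own. In the output shown above, the same arithmetic gives 6.895 + 5.212 = 12.107 as the residual sum of squares an ordinary one-way ANOVA would use, on 5 + 15 = 20 degrees of freedom.

```r
# Simulated balanced design (hypothetical values, NOT the article's data)
set.seed(123)
subject = factor(rep(LETTERS[1:6], times = 4))
time    = factor(rep(1:4, each = 6))
resp    = as.integer(time) + rep(rnorm(6), times = 4) + rnorm(24, sd = 0.5)

# Ordinary one-way ANOVA: a single residual term
oneway = aov(resp ~ time)
ss_resid_oneway = summary(oneway)[[1]][["Sum Sq"]][2]

# Repeated measures ANOVA: residuals split into two strata
rm_fit = aov(resp ~ time + Error(subject/time))
rm_tab = summary(rm_fit)
ss_subject     = rm_tab[["Error: subject"]][[1]][["Sum Sq"]][1]
ss_interaction = rm_tab[["Error: subject:time"]][[1]][["Sum Sq"]][2]

# The one-way residual SS equals subject SS plus interaction SS
print(c(oneway = ss_resid_oneway,
        subject_plus_interaction = ss_subject + ss_interaction))
```

Only the interaction part enters the denominator of the F value for time, which is why the repeated measures test is usually more sensitive.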
Visualizing results
Line graph
Now examine a line graph of the mean response plotted over time to see whether the trend is a positive one. To plot the graph, we first need summary statistics for the data object. These can be obtained with the dplyr package.
library(dplyr)
summary = data %>%
  group_by(time) %>%
  summarize(Mean_resp = mean(resp),
            SD_resp = sd(resp),
            SE_resp = sd(resp)/sqrt(length(resp)),
            n = n())
print(summary)
# # A tibble: 4 x 5
# time Mean_resp SD_resp SE_resp n
# <fct> <dbl> <dbl> <dbl> <int>
# 1 1 1.95 1.11 0.454 6
# 2 2 3.04 0.903 0.369 6
# 3 3 3.64 0.449 0.183 6
# 4 4 4.97 0.409 0.167 6
Now use the plot() function to draw the graph. The type argument accepts one of several plot types; here the value 'o' draws both points and lines, overplotted. You can specify the x- and y-axis labels with the xlab and ylab arguments. The line graph lets you check whether the trend is a positive one.
plot(summary$Mean_resp,
type = 'o',
xlab = 'Time',
ylab = 'response')
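The standard errors computed in the summary table can be added to the line graph as error bars. The sketch below is one possible way to do it in base graphics; it re-enters the means and standard errors from the table printed above, and the names weeks, lower and upper are my own.

```r
# Means and standard errors copied from the summary table above
means = c(1.95, 3.04, 3.64, 4.97)
ses   = c(0.454, 0.369, 0.183, 0.167)
weeks = 1:4

# Limits of the error bars: mean plus/minus one standard error
lower = means - ses
upper = means + ses

plot(weeks, means,
     type = 'o',
     xlab = 'Time',
     ylab = 'response',
     ylim = range(c(lower, upper)),  # leave room for the bars
     xaxt = 'n')                     # draw the x axis manually
axis(1, at = weeks)

# arrows() with flat heads at both ends draws the error bars
arrows(weeks, lower, weeks, upper, angle = 90, code = 3, length = 0.05)
```

Setting ylim from the bar limits keeps the bars from being clipped at the edges of the plot.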
Mean separation test
To calculate pairwise comparisons between the group levels with correction for multiple testing, you can use the pairwise.t.test() function. The p.adjust.method argument accepts any of the available adjustment methods for pairwise comparisons. I shall use bonferroni to explore the differences between means.
pairwise.t.test(
x = resp,
g = time,
p.adjust.method = 'bonferroni'
)
#
# Pairwise comparisons using t tests with pooled SD
#
# data: resp and time
#
# 1 2 3
# 2 0.1501 - -
# 3 0.0074 1.0000 -
# 4 9.1e-06 0.0021 0.0456
#
# P value adjustment method: bonferroni
The results show that the differences between weeks 1 and 2 (p = 0.1501) and between weeks 2 and 3 (p = 1.0000) are not significant, while the rest of the comparisons are significant at the 0.05 level.
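The comparisons above use t tests with pooled SD, which treat the time levels as independent groups. Since the same subjects are measured at every week, a paired version may be worth considering. The following is a sketch on simulated data (an assumption for illustration, not the article's data); it relies on the rows being in the same subject order within each time level.

```r
# Simulated balanced design (hypothetical values, NOT the article's data)
set.seed(2024)
subject = factor(rep(LETTERS[1:6], times = 4))
time    = factor(rep(1:4, each = 6))
resp    = as.integer(time) + rep(rnorm(6), times = 4) + rnorm(24, sd = 0.5)

# Adjustment methods accepted by p.adjust.method
print(p.adjust.methods)

# Paired comparisons: each subject is compared with itself across weeks.
# Rows must be in the same subject order within every time level.
res = pairwise.t.test(
  x = resp,
  g = time,
  p.adjust.method = 'bonferroni',
  paired = TRUE
)
print(res$p.value)
```

The result has the same layout as the unpaired output, a lower-triangular matrix of adjusted p values, so it can be read in the same way.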