Mean-mean multiple comparison plot using R

See Video  ☝


Introduction

You will learn how to display mean difference multiple comparison plot using R. A mean-mean scatter plot or MMC plot shows a two dimensional representation of the differences between many means.

The MMC scatter plot is different from deffograph however, both provide the same information. The deffograph shows the mean of a group on the horizontal axis against the mean of the other group on the vertical axis with a dot at the intersection. A vector centered at the intersection with a unit slope and a length proportional to the width of the confidence interval represents the confidence interval. A gray identity line represents equality of means; that is the difference is equal to zero.

If the vector does not cross the identity line, you can conclude there is a significant difference between the means. 
Lines crossing the diagonal in this chart are equivalent to non-significant differences between treatment means.

MMC plot is designed as a crossing of the means of a response variable at the levels of one factor with itself. It is then rotated 45 degree unlike the diffograph so the horizontal axis can be interpreted as the differences in mean levels and the vertical axis can be interpreted as the weighted averages of the means comprising each comparison. This class of plots is used to display the results of a multiple comparison procedure. MMC plot contains reference lines representing means for each factor level rotated at 45 degree unlike diffograph.

Each confidence interval on a mean difference is represented by a horizontal lines called vector lines. The width of the vector line represents the confidence interval for each set of contrast or mean comparison. There is a vertical segmented line also called the zero line, indicate the significance according to certain rules. If the vector line is not covered by the zero or segmented line, the corresponding population means are declared significantly different. If vector lines cross the segmented line, the corresponding population means are declared non significant. If an end of a vector line is close to the vertical zero line — the declaration of significance was a close call.

Sometimes it happened that the MMC plot shows the overlapping of labels for each set of mean comparison on the right axis. The Tiebreaker panel is needed to respond to the overprinting of labels in the right axis of the MMC panel. The tiebreaker graph shows an estimate for the difference between means and the Tukey-adjusted 95% confidence intervals for the difference.

Intervals that contain zero indicate that the difference of means is not significant. Intervals that do not contain zero indicate significant differences.

Import data file

I often recommend to first clear all the objects or values in global environment using rm(list = ls(all = TRUE)) before importing the data set. You can also clear the plots using graphics.off() and clear everything in console using shell() function.

rm(list = ls(all = TRUE))
graphics.off()
shell("cls")

Now let’s import the data set using read.csv() function. I have already saved the data file as CSV (comma delimited) in the working directory. The file argument specify the file name with extension (data-strip.csv) CSV. In header argument you can set a logical value that will indicate whether the data file contains variable names as first row. In my data set the file contains the variable names in the first row, so I shall use TRUE for this argument. The head() function will print the first six rows of the data set. The str() will show the format of variables as how they are being read by R.

data <- read.csv(file = "data_mmc.csv", 
                 header = TRUE)
head(data)
#   treatment yield
# 1      Ctrl  29.8
# 2      Ctrl  23.0
# 3      Ctrl  24.3
# 4        NP  16.2
# 5        NP  11.9
# 6        NP  14.4
str(data)
# 'data.frame': 12 obs. of  2 variables:
#  $ treatment: Factor w/ 4 levels "Ctrl","HP","NP",..: 1 1 1 3 3 3 2 2 2 4 ...
#  $ yield    : num  29.8 23 24.3 16.2 11.9 14.4 39.5 33.4 29.3 43.1 ...

Fit analysis of variance model

First attach the data set to the R search path using attach() function. This means that the database is searched by R when evaluating a variable, so objects in the database can be accessed by simply giving their names. This will help us to reduce the argument data$treatment to treatment only.

We can access the functions of one way analysis of variance using R stats package. Load this package using the library() function before fitting the one way analysis of variance model using aov() function.

This function requires formula as its argument to represent what output you wants to get from it. In this model formula the response variable (yield) is separated from factor variable (treatment) using tilde operator ~. To print this model use anova() function for this object.

attach(data)
library(stats)
results <- aov(yield ~ treatment)
anova(results)
# Analysis of Variance Table
# 
# Response: yield
#           Df  Sum Sq Mean Sq F value   Pr(>F)    
# treatment  3 1258.66  419.55  36.698 5.04e-05 ***
# Residuals  8   91.46   11.43                     
# ---
# Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Display data: Heiberger and Holland

The data set can be displayed by providing the output from glht() and mmc() functions.

General linear hypothesis

The results of general linear hypothesis can be obtained by using glht() function. The argument model specify a fitted model, for example an object returned by lm, glm, or aov etc. The second argument linfct specify linear hypothesis to be tested.

Multiple comparisons in AN(C)OVA models are specified by objects returned from function mcp

The third argument specify the alternative hypothesis to be used. The alternative hypothesis must be one of the following;

'"two.sided"' (default), '"greater"' or '"less"'
library(HH)
res.glht <- glht(model = results,
                 linfct = mcp(treatment = "Tukey"),
                 alternative = "two.sided")
print(res.glht)
# 
#    General Linear Hypotheses
# 
# Multiple Comparisons of Means: Tukey Contrasts
# 
# 
# Linear Hypotheses:
#                Estimate
# HP - Ctrl == 0    8.367
# NP - Ctrl == 0  -11.533
# OP - Ctrl == 0   16.067
# NP - HP == 0    -19.900
# OP - HP == 0      7.700
# OP - NP == 0     27.600

Display plot from glht function

To display the results from object of glht() function created above use mmcplot() function. The argument mmc specify mmc object or other object as indicated by method. Second argument focus specify the name of the factor for which the glht object was constructed.

mmcplot(mmc = res.glht,
        col = c("black","red"), 
        lwd = c(1,1), lty = c(2,1),
        focus = "treatment")

Mean-mean multiple comparisons

The results from mmc() function constructs a mmc.multicomp object from the formula and other arguments.

The constructed object must be explicitly plotted with the mmcplot function

The argument model specify the output obtained from aov() or lm() functions. The second and third arguments linfct and focus are the same as discussed above in glht() function.

res.mmc <- mmc(model = results,
               linfct = mcp(treatment = "Tukey"),
               focus = "treatment")
print(res.mmc)
# Tukey contrasts
# Fit: aov(formula = yield ~ treatment) 
# Estimated Quantile = 3.201923 
# 95% family-wise confidence level
# $mca
#          estimate   stderr     lower    upper   height
# OP-HP    7.700000 2.760737 -1.139666 16.53967 37.91667
# OP-Ctrl 16.066667 2.760737  7.227001 24.90633 33.73333
# HP-Ctrl  8.366667 2.760737 -0.472999 17.20633 29.88333
# OP-NP   27.600000 2.760737 18.760334 36.43967 27.96667
# HP-NP   19.900000 2.760737 11.060334 28.73967 24.11667
# Ctrl-NP 11.533333 2.760737  2.693668 20.37300 19.93333
# $none
#      estimate   stderr     lower    upper   height
# OP   41.76667 1.952136 35.516079 48.01725 41.76667
# HP   34.06667 1.952136 27.816079 40.31725 34.06667
# Ctrl 25.70000 1.952136 19.449412 31.95059 25.70000
# NP   14.16667 1.952136  7.916079 20.41725 14.16667

Display MMC plot

To display MMC plot use the same mmcplot() function. The argument mmc specify mmc object or other object as indicated by method. The argument type specify the type of plot to be drawn. You can set following values for this argument;

mca: for the default paired-comparisons plot

linfct: for user-specified set of contrasts

none: for confidence intervals on the set of group means

Using none for type argument will result in a plot of contrast values for the group even if style argument is set to isomeans. Using mca for type and isomeans for style argument will result in MMC plot without tiebreaker.

mmcplot(mmc = res.mmc,
# MMC plot and tiebreaker
        style = both, type = "mca")
# MMC plot only
        style = isomeans, type = "mca"
# Tiebreaker only when type = "none"
        style = both, type = "none"
# Tiebreaker only when type = "none"
        style = isomeans, type = "none"

User-specified set of contrasts

To set user specified contrasts first you need to create contrast coefficients for the argument focus.lmat. In this example following possible contrasts can be specified.

1. Control vs Seed priming
2. Non-primed vs Osmopriming
3. Hydropriming vs Osmopriming
Sum of contrast coefficient must be zero
res.lmat <- cbind("Ctrl-Seed priming" = c( 2, 0, -1,-1),
                  "NP-OP" = c( 0, 1, 0, -1),
                  "HP-OP"= c( 0, 0, 1, -1))

Use level() function to assign names for treatment or factor levels.

dimnames(res.lmat)[[1]] <- levels(data$treatment)
dimnames(res.lmat)[[1]]
# [1] "Ctrl" "HP"   "NP"   "OP"
res.lmat
#      Ctrl-Seed priming NP-OP HP-OP
# Ctrl                 2     0     0
# HP                   0     1     0
# NP                  -1     0     1
# OP                  -1    -1    -1

Now again applying mmc() while specifying user specified set of contrasts for focus.lmat argument and “Tukey” test for mcp argument will the following results.

res.mmc <- mmc(results,
               linfct = mcp(treatment = "Tukey"),
               focus.lmat = res.lmat)
res.mmc
# Tukey contrasts
# Fit: aov(formula = yield ~ treatment) 
# Estimated Quantile = 3.200984 
# 95% family-wise confidence level
# $mca
#          estimate   stderr     lower    upper   height
# OP-HP    7.700000 2.760737 -1.137075 16.53707 37.91667
# OP-Ctrl 16.066667 2.760737  7.229592 24.90374 33.73333
# HP-Ctrl  8.366667 2.760737 -0.470408 17.20374 29.88333
# OP-NP   27.600000 2.760737 18.762925 36.43707 27.96667
# HP-NP   19.900000 2.760737 11.062925 28.73707 24.11667
# Ctrl-NP 11.533333 2.760737  2.696259 20.37041 19.93333
# $none
#      estimate   stderr     lower    upper   height
# OP   41.76667 1.952136 35.517911 48.01542 41.76667
# HP   34.06667 1.952136 27.817911 40.31542 34.06667
# Ctrl 25.70000 1.952136 19.451245 31.94876 25.70000
# NP   14.16667 1.952136  7.917911 20.41542 14.16667
# $lmat
#                   estimate   stderr     lower     upper   height
# OP-NP             7.700000 2.760737 -1.137075 16.537075 37.91667
# OP-HP            27.600000 2.760737 18.762925 36.437075 27.96667
# Seedpriming-Ctrl  2.266667 2.390868 -5.386464  9.919798 26.83333

In addition to mca and lmat default set of contrasts third component lmat represents the user specified set of contrasts.

Now again use mmcplot() function for res.mmc object but use “lmat” in type argument to get graphical display of user specified contrasts. If you will use none for type argument it will result in graphical display for confidence interval for group of treatments. Similarly, using mca will result in graphical display of all pairwise contrast comparisons.

mmcplot(mmc = res.mmc, 
        style = "both", type = "lmat")
mmcplot(mmc = res.mmc, 
        style = "isomeans", type = "lmat")

Changing name of the plot

The name of the plot can changed by specifying the value of MMCname argument.

mmcplot(mmc = res.mmc, type = "mca", style="both",
        MMCname = "Seed priming")

Customizing color and lines

The color of the lines can be customized by specifying the value for col argument. Similarly line width and line type can be specified by using values for lwd and lty argument.

mmcplot(mmc = res.mmc, type = "mca", style="both",
        col = c("green","orange"), 
        lwd = c(2,2), 
        lty = c(2,1))

Please comment below if you have any questions

Download data file — Click_here

Download Rscript — Download Rscript



Download R program — Click_here

Download R studio — Click_here


Comments

Popular posts from this blog

Two way repeated measures analysis in R

Split plot analysis in R

Elegant barplot using ggplot function in R