Analysis of Latin Square Design using R
AGRON stats
September 15, 2018
Introduction
A Latin square is a design in which two gradients are controlled with crossed blocks, and each row-column intersection receives exactly one treatment. Its major feature is the capacity to handle two known sources of variation among experimental units simultaneously. Unlike the randomized complete block design, which has a single blocking criterion, the Latin square design has two independent blocking criteria: horizontal blocking is referred to as row blocking and vertical blocking as column blocking. This two-directional blocking is achieved by ensuring that each treatment appears exactly once in each row block and once in each column block. The variation among row blocks and among column blocks can then be estimated and removed from the experimental error, which can reduce that error considerably.
Conditions
Latin square design can be used in the following conditions:
- Field trials where fertility gradients exist in two directions perpendicular to each other, or where there is a unidirectional fertility gradient together with residual effects from previous trials.
- Insecticide field trials where the insect migration has a predictable direction that is perpendicular to the dominant fertility gradient of the experimental field.
- Greenhouse trials in which the experimental pots are arranged in a straight line perpendicular to the glass or screen walls, such that the difference among rows of pots and the distance from the glass wall are expected to be the two major sources of variability among the experimental pots.
- Laboratory trials with replication over time, such that the difference among experimental units conducted at the same time and among those conducted over time constitute the two known sources of variability.
- Animal nutrition studies, for example nutrition trials on dairy cattle, where only a few cows may be available for financial reasons.
Restrictions
Latin square design has the following restrictions:
- The requirement that every treatment appears exactly once in each row block and in each column block is also a major restriction. The number of replications must equal the number of treatments, so the experiment becomes impractical when the number of treatments is large.
- On the other hand, if the number of treatments is small, the degrees of freedom associated with experimental error become too few for the error to be reliably estimated.
- In agricultural experiments where the land requirement is rigid, laying out the design in the field is laborious and access to the central plots becomes difficult.
- Because of these restrictions, the Latin square design is in practice used only in experiments where the number of treatments is not less than four and not greater than eight. Despite its great potential for controlling experimental error, the design is not widely used in agricultural experiments.
- If there are missing observations in the experiment, the analysis becomes complicated.
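The trade-off between too few and too many treatments can be checked with a little arithmetic. The sketch below assumes the usual analysis of variance partition for an unreplicated t by t Latin square, where the total degrees of freedom (t² - 1) are split among rows, columns, treatments (t - 1 each) and error. The variable names are illustrative only.

```r
# Number of treatments considered (assumption: one square, no replication)
t_levels <- 3:8

# Number of plots needed: one per row-column intersection
plots <- t_levels^2

# Error df = total df minus df for rows, columns and treatments
error_df <- (t_levels^2 - 1) - 3 * (t_levels - 1)   # same as (t - 1) * (t - 2)

print(data.frame(treatments = t_levels, plots = plots, error_df = error_df))
```

With three treatments only two degrees of freedom remain for error, while eight treatments already require 64 plots, which is consistent with the practical range of four to eight treatments given above.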
Randomization
Before carrying out an experiment, the design should be randomized with the restriction that each treatment occurs once within each row and once within each column. First randomize the order of the rows, then the order of the columns, and finally assign the treatments to the letters.
A | B | C | D |
B | C | D | A |
C | D | A | B |
D | A | B | C |
Randomize the order of rows
For this purpose, draw a square with the letters in alphabetical order, then randomize the order of the rows using random numbers. According to the ranked order of rows:
- \(4^{th}\) row is placed \(1^{st}\)
- \(3^{rd}\) row is placed \(2^{nd}\)
- \(1^{st}\) row is placed \(3^{rd}\) and
- \(2^{nd}\) row is placed \(4^{th}\)
Random numbers | Ranked order of rows |
---|---|
0.910 | 4 |
0.843 | 3 |
0.324 | 1 |
0.679 | 2 |
Square before randomizing the rows:
A | B | C | D |
B | C | D | A |
C | D | A | B |
D | A | B | C |
Square after randomizing the rows:
D | A | B | C |
C | D | A | B |
A | B | C | D |
B | C | D | A |
Randomize the order of columns
Now randomize the columns. According to the random numbers, the ranked order for the columns is \(2\), \(3\), \(1\) and \(4\). So, according to the ranked order of columns:
- \(2^{nd}\) column is placed \(1^{st}\)
- \(3^{rd}\) column is placed \(2^{nd}\)
- \(1^{st}\) column is placed \(3^{rd}\) and
- \(4^{th}\) column is placed \(4^{th}\)
Random numbers | Ranked order of columns |
---|---|
0.628 | 2 |
0.871 | 3 |
0.158 | 1 |
0.947 | 4 |
Square before randomizing the columns:
D | A | B | C |
C | D | A | B |
A | B | C | D |
B | C | D | A |
Square after randomizing the columns:
A | B | D | C |
D | A | C | B |
B | C | A | D |
C | D | B | A |
Randomize the order of treatments
Finally, assign the treatments at random. Draw one random number for each letter and rank them; the rank decides which treatment the letter receives. Here the ranks are \(1\), \(4\), \(3\) and \(2\), so A receives T1, B receives T4, C receives T3 and D receives T2.
Random numbers | Ranked order of treatments |
---|---|
0.039 | 1 |
0.718 | 4 |
0.569 | 3 |
0.182 | 2 |
Assignment of treatments to letters:
A | B | C | D |
T1 | T4 | T3 | T2 |
Final layout:
T1 | T4 | T2 | T3 |
T2 | T1 | T3 | T4 |
T4 | T3 | T1 | T2 |
T3 | T2 | T4 | T1 |
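The three randomization steps can be sketched in base R as follows. This is a minimal illustration that reuses the random numbers from the tables above so that it reproduces the same layout; the object names are my own. For a real experiment you would draw fresh numbers, for example with runif(4) or sample(4).

```r
# Starting square with the letters in alphabetical (cyclic) order
square <- matrix(c("A", "B", "C", "D",
                   "B", "C", "D", "A",
                   "C", "D", "A", "B",
                   "D", "A", "B", "C"),
                 nrow = 4, byrow = TRUE)

# Random numbers taken from the worked example in the text
row_rand <- c(0.910, 0.843, 0.324, 0.679)
col_rand <- c(0.628, 0.871, 0.158, 0.947)
trt_rand <- c(0.039, 0.718, 0.569, 0.182)

# Ranks give the new order: position i receives the row (or column) rank[i]
row_order <- rank(row_rand)   # 4 3 1 2
col_order <- rank(col_rand)   # 2 3 1 4
trt_order <- rank(trt_rand)   # 1 4 3 2

# Step 1: reorder the rows, Step 2: reorder the columns
square <- square[row_order, ]
square <- square[, col_order]

# Step 3: map letters A-D to treatments according to the treatment ranks
trt_labels <- setNames(paste0("T", trt_order), c("A", "B", "C", "D"))
layout <- matrix(unname(trt_labels[as.vector(square)]), nrow = 4)

print(layout)
```

Because rows and columns are only permuted and the letters are only relabelled, every treatment still appears exactly once in each row and each column of the final layout.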
Now we shall proceed with an example for analysis of Latin square design using R.
Importing data
Suppose we have data obtained from an experimental area with a two-directional gradient in soil fertility. The data show the yield of four varieties of wheat arranged in a four by four Latin square design. To analyse them, you first need to import the data into R. Before importing the data set, I recommend clearing all objects from the global environment with the remove() function, shutting down all open graphics devices with the graphics.off() function, and clearing the console by passing a system command to the shell() function.
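A housekeeping sketch along these lines might look as follows. The exact arguments are my assumption, since the text only names the functions. Note that shell() exists only in Windows builds of R, so the call is guarded here.

```r
# Remove every object from the current workspace
# (rm() and remove() are the same function)
rm(list = ls())

# Close all open graphics devices
graphics.off()

# Clear the console with a system command. shell() is Windows-only, and the
# "cls" command is an assumption about what is passed to it.
if (.Platform$OS.type == "windows" && interactive()) {
  shell("cls")
}
```

Running this at the top of a script means later steps cannot silently pick up objects left over from an earlier session.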
The next step is importing the data set. Load the readxl package using the library() function. To import data from an Excel spreadsheet, use the read_excel() function. In the path argument, provide the path to the file. Set the col_names argument to TRUE if the first row of the file contains the variable names. Your data are now available in the R session, stored in an object that I have chosen to call data.
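The import step might look like the sketch below. The file name latin_square.xlsx is hypothetical, and readxl is called through a requireNamespace() guard so that the block still runs where the package or the file is missing. In that case it falls back to a stand-in data frame. The sixteen yields in the stand-in are not listed in full anywhere in this post; they are an assumed reconstruction, chosen because they reproduce the rows and variety means printed in the outputs of this post.

```r
path <- "latin_square.xlsx"   # hypothetical file name

if (requireNamespace("readxl", quietly = TRUE) && file.exists(path)) {
  # Import as described in the text: first row holds the variable names
  data <- readxl::read_excel(path = path, col_names = TRUE)
} else {
  # Stand-in data (assumed reconstruction, see the note above)
  data <- data.frame(
    Row       = rep(1:4, each = 4),
    Column    = rep(1:4, times = 4),
    Varieties = c("B", "D", "C", "A",
                  "C", "A", "D", "B",
                  "A", "C", "B", "D",
                  "D", "B", "A", "C"),
    Yield     = c(1.640, 1.210, 1.425, 1.345,
                  1.475, 1.185, 1.400, 1.290,
                  1.670, 0.710, 1.665, 1.180,
                  1.565, 1.290, 1.655, 0.660)
  )
}

# One observation per row-column intersection is expected
print(dim(data))
```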
Viewing data
You can inspect the data by typing View(). You can also use the head() or tail() function to display only the beginning or the end of the data set. These functions cannot be used to edit the data. The fix() function opens a small spreadsheet in R, which can be used to view and edit the data. The first six rows, as returned by head(), look like this:
# # A tibble: 6 x 4
# Row Column Varieties Yield
# <dbl> <dbl> <chr> <dbl>
# 1 1 1 B 1.64
# 2 1 2 D 1.21
# 3 1 3 C 1.42
# 4 1 4 A 1.34
# 5 2 1 C 1.48
# 6 2 2 A 1.18
Verify the variables structure
To verify the structure of the data, use the str() function. It shows whether each variable is being read as character, numeric, integer or factor.
# tibble [16 x 4] (S3: tbl_df/tbl/data.frame)
# $ Row : num [1:16] 1 1 1 1 2 2 2 2 3 3 ...
# $ Column : num [1:16] 1 2 3 4 1 2 3 4 1 2 ...
# $ Varieties: chr [1:16] "B" "D" "C" "A" ...
# $ Yield : num [1:16] 1.64 1.21 1.42 1.34 1.48 ...
The structure shows that Row and Column are being read as numeric and Varieties (the treatment variable) as character, instead of factor. We can convert them to factors using the as.factor() function. The attach() function gives direct access to the variables of a data frame by typing the variable name as it is written in the first row of the file; the code here refers to the variables with data$ instead.
data$Row <- as.factor(data$Row)
data$Column <- as.factor(data$Column)
data$Varieties <- as.factor(data$Varieties)
str(data)
# tibble [16 x 4] (S3: tbl_df/tbl/data.frame)
# $ Row : Factor w/ 4 levels "1","2","3","4": 1 1 1 1 2 2 2 2 3 3 ...
# $ Column : Factor w/ 4 levels "1","2","3","4": 1 2 3 4 1 2 3 4 1 2 ...
# $ Varieties: Factor w/ 4 levels "A","B","C","D": 2 4 3 1 3 1 4 2 1 3 ...
# $ Yield : num [1:16] 1.64 1.21 1.42 1.34 1.48 ...
Apply Latin Square model
To apply the Latin square design model, define an object model using the linear model function lm(). In the formula argument, give the response variable Yield, then a tilde (~), then Row, Column and Varieties joined by plus signs. The output can be obtained using the anova() or summary() function. The analysis of variance table shows that the varieties differ significantly in grain yield (p = 0.025). There is also a highly significant difference in yield due to column blocking (p = 0.005), while the row effect is not significant (p = 0.717).
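A self-contained sketch of the model fit is given here. The data frame is an assumed reconstruction of the sixteen observations, which are not listed in full in this post; the values were chosen because they reproduce the printed rows, means and sums of squares. In your own session you would use the imported data object instead.

```r
# Assumed reconstruction of the data set (see the note above)
data <- data.frame(
  Row       = rep(1:4, each = 4),
  Column    = rep(1:4, times = 4),
  Varieties = c("B", "D", "C", "A",
                "C", "A", "D", "B",
                "A", "C", "B", "D",
                "D", "B", "A", "C"),
  Yield     = c(1.640, 1.210, 1.425, 1.345,
                1.475, 1.185, 1.400, 1.290,
                1.670, 0.710, 1.665, 1.180,
                1.565, 1.290, 1.655, 0.660)
)

# Blocking and treatment variables must be factors, not numbers
data$Row       <- as.factor(data$Row)
data$Column    <- as.factor(data$Column)
data$Varieties <- as.factor(data$Varieties)

# Additive model: rows and columns as blocks, varieties as treatments
model <- lm(Yield ~ Row + Column + Varieties, data = data)
anova(model)
```

Row and Column enter the formula only to remove their variation from the error term; the test of interest is the one for Varieties.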
# Analysis of Variance Table
#
# Response: Yield
# Df Sum Sq Mean Sq F value Pr(>F)
# Row 3 0.03015 0.010052 0.4654 0.716972
# Column 3 0.82734 0.275781 12.7692 0.005148 **
# Varieties 3 0.42684 0.142281 6.5879 0.025092 *
# Residuals 6 0.12958 0.021597
# ---
# Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Mean separation test
The analysis of variance table does not identify the specific pairs or groups of varieties that differ. For example, the F test cannot answer whether each of the three varieties gave a significantly higher yield than the check variety, or whether there are significant differences among the three varieties themselves. To answer these questions, use a mean comparison test.
Now let’s go deeper and examine the performance of these varieties by applying a suitable mean separation test. Let’s apply the Least Significant Difference (LSD) test to see which variety performed best in grain yield. To apply the LSD test, first load the agricolae package using the library() function. For multiple comparison of treatments, use the LSD.test() function. Set the y argument to either the fitted model or the response variable. If y is a model object (aov or lm), give the trt argument as the name of the treatment factor in quotation marks. If y is the response variable, pass the treatment vector itself to trt, without quotation marks.
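library(agricolae)
# LSD test on the fitted model; trt is the quoted name of the treatment factor
LSD.test(y = model,
         trt = "Varieties",
         DFerror = model$df.residual,                    # error degrees of freedom
         MSerror = deviance(model)/model$df.residual,    # error mean square
         alpha = 0.05,
         group = TRUE,
         console = TRUE)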
#
# Study: model ~ "Varieties"
#
# LSD t Test for Yield
#
# Mean Square Error: 0.0215974
#
# Varieties, means and individual ( 95 %) CI
#
# Yield std r LCL UCL Min Max
# A 1.46375 0.2386900 4 1.2839503 1.64355 1.185 1.670
# B 1.47125 0.2095382 4 1.2914503 1.65105 1.290 1.665
# C 1.06750 0.4426153 4 0.8877003 1.24730 0.660 1.475
# D 1.33875 0.1795538 4 1.1589503 1.51855 1.180 1.565
#
# Alpha: 0.05 ; DF Error: 6
# Critical Value of t: 2.446912
#
# least Significant Difference: 0.2542752
#
# Treatments with the same letter are not significantly different.
#
# Yield groups
# B 1.47125 a
# A 1.46375 a
# D 1.33875 a
# C 1.06750 b
The results show that varieties B, A and D were statistically at par with one another, and each of them yielded significantly more than variety C.
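The grouping can be cross-checked by hand from the numbers printed in the LSD output above. The sketch below assumes the usual formula for equal replication, LSD = t × sqrt(2 × MSE / r), and uses only base R; the object names are illustrative.

```r
# Values copied from the LSD.test() output above
mse      <- 0.0215974   # error mean square
df_error <- 6           # error degrees of freedom
r        <- 4           # replications per variety
alpha    <- 0.05
means    <- c(A = 1.46375, B = 1.47125, C = 1.06750, D = 1.33875)

# Two-sided critical t value and the least significant difference
t_crit <- qt(1 - alpha / 2, df_error)
lsd    <- t_crit * sqrt(2 * mse / r)

# Absolute differences between every pair of variety means
diffs <- abs(outer(means, means, "-"))

print(lsd)
print(diffs > lsd)   # TRUE marks pairs that differ significantly
```

Only the pairs involving variety C exceed the LSD, which agrees with the letter grouping reported by LSD.test().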