ANOVA For Two-Way Factorial Designs
1 Factorial (or Crossed) Experimental Designs
In many experiments, multiple types of treatments are applied to the experimental units. We first consider the case in which there are two factors (types of treatments), which we will call “Factor A” and “Factor B”. If every possible combination of (1) the levels of Factor A and (2) the levels of Factor B is applied to experimental units, then we say that we have a Factorial Treatment Design, or that Factor A and Factor B are Crossed with each other.
For example, if there are 4 levels of Factor A and 3 levels of Factor B, then to complete one replication of the experiment we will need 12 experimental units to accommodate the 12 treatment combinations. With the addition of crossed factors, the number of experimental units required increases very quickly, and so tough decisions often have to be made regarding the number of factors and the number of levels of each factor.
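As a quick sketch, the following R code enumerates the treatment combinations for a 4 × 3 factorial (the level labels A1–A4 and B1–B3 are made up for illustration):

```r
# Enumerate all treatment combinations for a 4 x 3 factorial design.
# Labels A1..A4 and B1..B3 are hypothetical.
A <- paste0("A", 1:4)   # 4 levels of Factor A
B <- paste0("B", 1:3)   # 3 levels of Factor B
combos <- expand.grid(FactorA = A, FactorB = B)
nrow(combos)            # 12 combinations, so 12 units per replication
```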
2 The Two-Way Complete ANOVA Model
The two-way complete ANOVA model for a factorial design is
Yijt = μ + αi + βj + (αβ)ij + εijt,   εijt ~ iid N(0, σ²)

i = 1, 2, ..., a   j = 1, 2, ..., b   t = 1, 2, ..., rij
where the first index i indexes the levels of Factor A, with
i = 1,2,...,a = the number of treatment levels of Factor A
and the second index j indexes the levels of Factor B, with
j = 1, 2, ..., b = the number of treatment levels of Factor B.
The main effects are αi, which encodes the effect of the i-th level of Factor A on the mean response, and βj, which is the effect of the j-th level of Factor B on the mean response.
The interaction effects are (αβ)ij and encode the additional effect on the mean response of the joint combination of level i of Factor A and level j of Factor B.
2.1 Effects Coding
We could write this model as an equivalent one-way ANOVA model by letting τij be the total effect of the combination of level i of Factor A and level j of Factor B on the mean response:
Yijt = μ + τij + εijt,   εijt ~ iid N(0, σ²)

i = 1, 2, ..., a   j = 1, 2, ..., b   t = 1, 2, ..., rij
This coding provides a link between everything we’ve developed for the one-way ANOVA model and
corresponding results in the 2-way complete model.
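To see this equivalence concretely, here is a small sketch using simulated data for hypothetical factors A and B: the two-way complete model and the one-way cell-means model, fit via a combined treatment factor, give the same fitted values.

```r
# Sketch: the two-way complete model is equivalent to a one-way model
# on the combined treatment factor tau_ij. Data here are simulated.
set.seed(1)
dat <- expand.grid(A = factor(1:2), B = factor(1:3), rep = 1:4)
dat$y <- rnorm(nrow(dat), mean = 10)

# Combined treatment factor: one level per (i, j) combination
dat$AB <- interaction(dat$A, dat$B)

fit2way <- aov(y ~ A + B + A:B, data = dat)  # two-way complete model
fit1way <- aov(y ~ AB, data = dat)           # equivalent one-way model

# Both models span the same space, so the fitted values agree
all.equal(fitted(fit2way), fitted(fit1way))  # TRUE
```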
2.2 Estimable Functions
Recall that a function of parameters is estimable if it is a linear combination of the treatment means E(Yijt). From the one-way ANOVA representation in Section 2.1, we can see that estimable functions of the parameters in the two-way complete model can be written as
Σi,j cij (μ + τij) = Σi,j cij (μ + αi + βj + (αβ)ij)
for real numbers {cij}.
3 Hypothesis Tests
The most important hypothesis tests for the 2-way complete model are done in sequence:

1. First test for significant interaction terms. The null hypothesis is

H0 : (αβ)ij = 0 for all i, j

and the test statistic is

[SSAB/((a − 1)(b − 1))] / [SSE/(n − ab)] ∼ F(a−1)(b−1), n−ab

where the distribution is under the null.
2. If there ARE significant interactions, then we can't interpret the effects of each factor individually, and we move directly to testing for pairwise differences between all treatment combinations:

H0 : τ11 − τ12 = 0   H0 : τ11 − τ13 = 0   H0 : τ21 − τ12 = 0   ...

and so on.
3. If there are NOT significant interactions (i.e., we do not reject H0 : (αβ)ij = 0 in number 1 above), then we can test for differences in each factor's treatment means:

H0 : α1 = α2 = ... = αa

and the test statistic used is

[SSA/(a − 1)] / [SSE/(n − ab)] ∼ Fa−1, n−ab.

If significant differences are found, we can then look at pairwise differences in the levels of Factor A:

H0 : α1 − α2 = 0   H0 : α1 − α3 = 0   H0 : α2 − α3 = 0   ...

We can then repeat this for Factor B, first testing for overall differences

H0 : β1 = β2 = ... = βb

with test statistic

[SSB/(b − 1)] / [SSE/(n − ab)] ∼ Fb−1, n−ab,

and then testing for significant pairwise differences in levels of Factor B:

H0 : β1 − β2 = 0   H0 : β1 − β3 = 0   H0 : β2 − β3 = 0   ...
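As a sketch of how such an F statistic is assembled from the ANOVA quantities, the following R code reproduces the interaction test from the bleach ANOVA table in Section 5 (SSAB = 688824 on (a − 1)(b − 1) = 2 df, SSE = 3048428 on n − ab = 24 df):

```r
# Sketch: computing the interaction F statistic "by hand" from sums of
# squares, using the values in the bleach example's ANOVA table.
SSAB <- 688824;  df_AB <- 2    # interaction: (a-1)(b-1) = 2
SSE  <- 3048428; df_E  <- 24   # error: n - ab = 30 - 6 = 24

F_AB <- (SSAB / df_AB) / (SSE / df_E)
p_AB <- pf(F_AB, df_AB, df_E, lower.tail = FALSE)
round(F_AB, 4)   # 2.7115, matching the conc:stain row
p_AB             # ~0.0867, matching the table's Pr(>F)
```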
4 Setting up R for ANOVA analysis
So far, we have been able to do most of our computations in R using standard software and options. As we begin to use more complicated designs, we will need some additional packages and options. Before running any analyses for the rest of the semester, it will be helpful to run the following block of code, which will load the required packages and set some important options:
```r
library(lsmeans)       # load package for pairwise comparisons
library(multcompView)  # load package for Tukey Grouping
library(car)           # load package for more complicated ANOVA analyses
options(contrasts = c("contr.sum", "contr.poly"))  ## code needed to make estimation correct
```
You may need to install some of these packages using the “install.packages” command, but you only need to do this once per computer.
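For example, a one-time setup along these lines installs whichever of the three packages are missing (a sketch; run once per machine):

```r
# One-time setup (per computer): install any packages used in this
# handout that are not already present.
pkgs <- c("lsmeans", "multcompView", "car")
missing <- pkgs[!pkgs %in% rownames(installed.packages())]
if (length(missing) > 0) install.packages(missing)
```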
5 Example: Bleach
As an example, consider the following experiment (Abagail 1993). 30 different cloths were given one of three different “stains”, by placing either Blue Ink, Raspberry Jam, or Tomato Sauce on the cloth. Each cloth was then immersed in a solution of water and bleach, where the concentration of bleach was either 3 tsp/liter or 7 tsp/liter. Each of the six combinations of stain and concentration was applied to 5 cloths, and the cloths were immersed in the liquid until the stain was lifted. Read the data into R as follows:
```r
conc <- c(rep("3", 15), rep("7", 15))
stain <- c(rep("Ink", 5), rep("Jam", 5), rep("Sauce", 5),
           rep("Ink", 5), rep("Jam", 5), rep("Sauce", 5))
time <- c(3600, 3340, 3173, 2452, 3920,
          495, 236, 515, 573, 555,
          733, 525, 793, 510, 1026,
          3660, 4105, 4545, 3569, 3342,
          410, 225, 437, 350, 140,
          539, 1354, 347, 584, 781)
df <- data.frame(conc = conc, stain = stain, time = time)
```
This is a two-factor crossed design, because experimental units are assigned to each combination of the levels of Factor A (stain type) and the levels of Factor B (bleach concentration). The two-way ANOVA model for this data is
Yijt = μ + αi + βj + (αβ)ij + εijt,   εijt ~ iid N(0, σ²)

i = Ink, Jam, Sauce   j = 3, 7   t = 1, 2, 3, 4, 5

The following block of code fits the 2-way complete model using R:
```r
modelAB <- aov(time ~ conc + stain + conc:stain, data = df)
anova(modelAB)
```

```
## Analysis of Variance Table
##
## Response: time
##             Df   Sum Sq  Mean Sq  F value  Pr(>F)
## conc         1   125712   125712   0.9897 0.32974
## stain        2 61099421 30549711 240.5151 < 2e-16 ***
## conc:stain   2   688824   344412   2.7115 0.08675 .
## Residuals   24  3048428   127018
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```
```r
Anova(modelAB, type="III")
```

```
## Anova Table (Type III tests)
##
## Response: time
##               Sum Sq Df  F value  Pr(>F)
## (Intercept) 73114119  1 575.6209 < 2e-16 ***
## conc          125712  1   0.9897 0.32974
## stain       61099421  2 240.5151 < 2e-16 ***
## conc:stain    688824  2   2.7115 0.08675 .
## Residuals    3048428 24
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```
Note: the conc:stain term tells R to include the interaction effects (αβ)ij. Also, the anova and Anova commands both give us ANOVA tables, and in this case give identical analyses. However, for more complicated models and for data from unbalanced designs, there are many possible types of ANOVA analysis. In this class, we will teach what are called “Type III sums of squares”, which are obtained using the Anova command from the car package. This will be our default from here on out.
The residuals for this model fit show that the assumption of constant error variance is not met (check this!). We tried multiple transformations, and found that the best was the square-root of the response. The resulting 2-way complete model is
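A minimal sketch of that residual check, assuming the modelAB object from the code above, is:

```r
# Sketch: residual diagnostics for the untransformed fit (assumes modelAB
# from the code above). A funnel shape in the residuals-vs-fitted plot
# indicates non-constant error variance.
plot(fitted(modelAB), resid(modelAB),
     xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)
qqnorm(resid(modelAB))   # normality check
qqline(resid(modelAB))
```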
√Yijt = μ + αi + βj + (αβ)ij + εijt,   εijt ~ iid N(0, σ²)

i = Ink, Jam, Sauce   j = 3, 7   t = 1, 2, 3, 4, 5
The following block of code fits this model using R. The residuals here look much better. The ANOVA table for this analysis is:

```r
df$sqrttime <- sqrt(df$time)
modeltrAB <- aov(sqrttime ~ conc + stain + conc:stain, data = df)
Anova(modeltrAB, type="III")
```

```
## Anova Table (Type III tests)
##
## Response: sqrttime
##             Sum Sq Df   F value    Pr(>F)
## (Intercept)  37016  1 1735.2500 < 2.2e-16 ***
## conc             0  1    0.0000    0.9947
## stain         9207  2  215.7985 4.566e-16 ***
## conc:stain      99  2    2.3246    0.1194
## Residuals      512 24
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```
The conc:stain row in the ANOVA table has the test for significant interactions. As this is not significant at the α = 0.05 level, we fail to reject the null hypothesis and proceed as if all interaction effects are zero.
The following code plots the means for all 6 combinations of the two factors. The interaction plot shows a weak interaction, with the lines for concentration=3 and concentration=7 being nearly parallel.
```r
interaction.plot(x.factor = df$stain, trace.factor = df$conc,
                 response = df$sqrttime, type = "b", col = 2:3,
                 xlab = "stain", ylab = "Mean", trace.label = "concentration")
```

[Interaction plot: mean sqrttime (y-axis, "Mean") against stain (x-axis: Ink, Jam, Sauce), with separate traces for concentration = 3 and concentration = 7; the two traces are nearly parallel.]
As we found that the interaction effects are all zero, we proceed to considering each factor by itself. Our analysis from here on will be similar to a 1-way ANOVA analysis.
The conc row in the ANOVA table indicates that there is no significant difference in the time to stain removal between the two levels of concentration. The stain row in the ANOVA table indicates that there ARE significant differences in time to stain removal between at least one pair of the different stains. To find which, we will use the lsmeans command in R. The following code looks for differences between the levels of Factor A (stain).
```r
library(lsmeans)
lsmstain = lsmeans(modeltrAB, ~ stain)
contrast(lsmstain, method="pairwise")
```

```
## contrast     estimate       SE df t.ratio p.value
## Ink - Jam   40.129186 2.065519 24  19.428  <.0001
## Ink - Sauce 33.227062 2.065519 24  16.087  <.0001
## Jam - Sauce -6.902124 2.065519 24  -3.342  0.0074
##
## Results are averaged over the levels of: conc
## P value adjustment: tukey method for comparing a family of 3 estimates
```
The results show that there are significant differences between the mean time to remove stains for all pairs of stains. Ink stains take the longest to remove, on average, followed by Tomato Sauce stains, and Jam stains are removed the fastest on average.
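Note that these estimated means are on the square-root scale. As a rough sketch (ignoring retransformation bias), squaring the lsmeans gives approximate mean removal times in the original time units; this assumes the lsmstain object from the code above:

```r
# Sketch: the lsmeans for stain are on the square-root scale (sqrttime),
# so squaring them gives rough mean removal times in the original units
# (this back-transformation ignores retransformation bias).
sqrt_means <- summary(lsmstain)$lsmean
round(sqrt_means^2)
```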
5.1 Analysis including interaction effects
While we didn’t find a significant interaction for the bleach data, we will use the data to illustrate how to proceed with an analysis when we do find a significant interaction effect.
Instead of looking at the levels of Factor A (stain) and Factor B (concentration) independently, when we have a significant interaction we proceed directly to looking for significant differences between all possible combinations of treatments. The following R code does this:
```r
lsminter = lsmeans(modeltrAB, ~ stain:conc)
contrast(lsminter, method="pairwise")
```

```
## contrast            estimate       SE df t.ratio p.value
## Ink,3 - Jam,3     35.6898687 2.921086 24  12.218  <.0001
## Ink,3 - Sauce,3   30.6976355 2.921086 24  10.509  <.0001
## Ink,3 - Ink,7     -4.6572533 2.921086 24  -1.594  0.6098
## Ink,3 - Jam,7     39.9112501 2.921086 24  13.663  <.0001
## Ink,3 - Sauce,7   31.0992362 2.921086 24  10.646  <.0001
## Jam,3 - Sauce,3   -4.9922333 2.921086 24  -1.709  0.5393
## Jam,3 - Ink,7    -40.3471220 2.921086 24 -13.812  <.0001
## Jam,3 - Jam,7      4.2213813 2.921086 24   1.445  0.7003
## Jam,3 - Sauce,7   -4.5906326 2.921086 24  -1.572  0.6239
## Sauce,3 - Ink,7  -35.3548887 2.921086 24 -12.103  <.0001
## Sauce,3 - Jam,7    9.2136146 2.921086 24   3.154  0.0437
## Sauce,3 - Sauce,7  0.4016007 2.921086 24   0.137  1.0000
## Ink,7 - Jam,7     44.5685033 2.921086 24  15.258  <.0001
## Ink,7 - Sauce,7   35.7564894 2.921086 24  12.241  <.0001
## Jam,7 - Sauce,7   -8.8120139 2.921086 24  -3.017  0.0587
##
## P value adjustment: tukey method for comparing a family of 6 estimates
```
There are so many pairwise comparisons it can be difficult to interpret these results! The “Tukey Grouping” is an attempt to make this easier. It shows which of the combinations of factor levels are significantly different from each other. Groups that share a letter are not significantly different from each other. The cld function in the R package multcompView is used for this.
```r
library(multcompView)
lsminter = lsmeans(modeltrAB, ~ stain:conc)
cld(lsminter, alpha=0.05)
```

```
## stain conc   lsmean       SE df lower.CL upper.CL .group
## Jam   7    17.33869 2.065519 24 13.07567 21.60171  1
## Jam   3    21.56007 2.065519 24 17.29705 25.82309  12
## Sauce 7    26.15070 2.065519 24 21.88768 30.41373  12
## Sauce 3    26.55230 2.065519 24 22.28928 30.81533   2
## Ink   3    57.24994 2.065519 24 52.98692 61.51296    3
## Ink   7    61.90719 2.065519 24 57.64417 66.17022    3
##
## Confidence level used: 0.95
## P value adjustment: tukey method for comparing a family of 6 estimates
## significance level used: alpha = 0.05
```
We could interpret the results of this analysis as follows:

- Jam7 has a significantly lower mean than Sauce3, Ink3, and Ink7.
- Jam3 has a significantly lower mean than Ink3 and Ink7.
- Sauce7 has a significantly lower mean than Ink3 and Ink7.
- Sauce3 has a significantly lower mean than Ink3 and Ink7.
- No other comparisons are significantly different from zero.
Assignment
1. Greenhouse. Consider an experiment to study the effect of three types of fertilizer (F1, F2, and F3) on the growth of two species of plant (SppA and SppB). The data are as follows:
```r
Fert <- c(rep("control", 12), rep("f1", 12), rep("f2", 12), rep("f3", 12))
Species <- c(rep(c(rep("SppA", 6), rep("SppB", 6)), 4))
Height <- c(21.0, 19.5, 22.5, 21.5, 20.5, 21.0, 23.7, 23.8, 23.8, 23.7, 22.8, 24.4,
            32.0, 30.5, 25.0, 27.5, 28.0, 28.6, 30.1, 28.9, 30.9, 34.4, 32.7, 32.7,
            22.5, 26.0, 28.0, 27.0, 26.5, 25.2, 30.6, 31.1, 28.1, 34.9, 30.1, 25.5,
            28.0, 27.5, 31.0, 29.5, 30.0, 29.2, 36.1, 36.6, 38.7, 37.1, 36.8, 37.1)
df <- data.frame(Fert = Fert, Species = Species, Height = Height)
```
(a) Write out the 2-way complete model for this experiment.
(b) Fit the model using R and examine the residuals. Transform the response if needed to address any problems with normality or constant error variance. If you transform the response, clearly show the residuals from the un-transformed response, and your best transformation, and describe why you chose the transformation you did.
(c) Describe the effect of species and fertilizer on mean height. This description should use the results of hypothesis tests and p-values as described in class. Discuss any relevant interaction effects, main effects and pairwise differences between treatment means. Provide a plot that shows the means for all combinations of factor levels. Provide R code and output that supports your results.
2. Consider the following data, the result of a 2-factor factorial experiment with 5 replications for each combination of Factor A and Factor B. Treatment combinations were assigned at random to the 20 experimental units.
```r
A <- c(rep(1, 10), rep(2, 10))
B <- rep(c(rep(1, 5), rep(2, 5)), 2)
resp <- c(12.9, 11.3, 11.7, 12.1, 12.3,
          13.7, 12.8, 13.6, 13.1, 13.5,
          14.2, 14.5, 13.9, 13.6, 14.4,
          13.5, 13.1, 13.3, 13.1, 13.4)
df <- data.frame(A = A, B = B, resp = resp)
```
(a) Write out the 2-way complete model for this experiment.
(b) Fit the model using R and examine the residuals. Transform the response if needed to address any problems with normality or constant error variance. If you transform the response, clearly show the residuals from the un-transformed response, and your best transformation, and describe why you chose the transformation you did.
(c) Describe the effect of Factors A and B on mean response. This description should use the results of hypothesis tests and p-values as described in class. Discuss any relevant interaction effects, main effects and pairwise differences between treatment means. Provide a plot that shows the means for all combinations of factor levels. Provide R code and output that supports your results.
