File: jmfuns.rda
function ( lm.input, margin = 1, contrast.list, strata = NA,
Confidence.Level = 0.95, var.equal = TRUE,
show.cellcode = TRUE, show.test.def = TRUE,
p.digits = 6)
ARGUMENTS
lm.input A 'lm' or 'aov' object, i.e., the output of a call to 'lm' or 'aov'.
margin A vector of positive integers that indicate the factor or factors over whose levels the contrast is defined. The factors are listed in the same order as in the 'formula' argument of the call to 'aov'. For example, if the 'Y ~ A1*A2*A3' is the formula in the call to 'aov', then 'margin=3' indicates that the contrast coefficients apply to the levels of the 'A3' factor.
contrast.list A list of numeric vectors. Each vector indicates the contrast coefficients for the marginal means specified by 'margin'. All vectors must have the same length as the number of marginal means determined by 'margin'.
strata If 'strata' is non-missing, it must be a vector of integers indicating the factor or factors that play the role of strata. Each contrast is computed within every fixed level of the stratifying factors.
Confidence.Level Level of confidence for the confidence intervals.
var.equal Set var.equal = TRUE if and only if homogeneity of variance is assumed in the test of the contrast. The default is var.equal = TRUE.
show.cellcode When margin specifies more than one factor, e.g., 'margin = c(2, 3)', the contrast coefficients define a linear combination with respect to all combinations of factors 2 and 3. If factor 2 has k levels and factor 3 has m levels, there are k*m combinations of levels. If 'show.cellcode = TRUE', then the output displays the correspondence between the sequence of coefficients specified through 'contrast.list' and the k*m levels of factors 2 and 3.
show.test.def
p.digits The number of digits to the right of the decimal place on the p-value. Set 'p.digits = NA' to request scientific notation.
Details
The 'aov' object that is specified by 'lm.input' determines the error term from which all contrast standard errors are calculated. The 'call' component of 'lm.input' identifies the dependent variable and factors in terms of which the contrast is computed. The 'aov.contrast' function can identify these variables under either of two conditions: (a) the variable is contained in a dataframe that was specified in the 'data' argument of the call to 'aov' that produced 'lm.input', or (b) the variable is on the search path. If neither (a) nor (b) is satisfied, then 'aov.contrast' will fail to find the variables in terms of which the contrasts are defined. Each component of 'contrast.list' should be a numeric vector of the same length as the number of marginal means specified by 'margin'. If var.equal = TRUE, the standard error of a contrast is based on the error term from lm.input (deviance(lm.input)) and the degrees of freedom (df's) are the same as the df's of deviance(lm.input). If var.equal = FALSE, the standard error of a contrast is based on the within cell error of the cells that have non-zero coefficients in the contrast, and the df's are computed by the Welch-Satterthwaite formula (Brownlee, 1965).
Value
The output is a matrix with one row for each contrast that is specified in the 'contrast.list' argument. The output includes the estimate of the contrast, confidence bounds, t statistic, p-value of the t statistic, the standard error of the contrast, and the degrees of freedom for error.
References
Brownlee, K. A. (1965). Statistical theory and methodology in science and engineering. New York: Wiley, 1965.
Satterthwaite, F. E. (1946). An approximate distribution of estimates of variance components. Biometrics Bulletin, 2, 110-114.
Welch, B. L. (1947). The generalization of 'Student's problem when several different population variances are involved. Biometrika, 34, 23-35.
First we create a data set with 3 factors (2 x 3 x 4 design, 2 replications per cell).
x1 <- rep(sort(rep(1:4, 2)), 6)
x2 <- rep(sort(rep(1:3,8)), 2)
x3 <- sort(rep(1:2, 24))
Set the random number seed to an arbitrary value so that the results obtained
here will agree with those of any other user.
set.seed(10)
y <- x1 + x2 + x3 + x1*x3 + x2*x3 + rnorm(length(x1), 0, 3)
data.set <- data.frame(y = y,
x1 = factor(x1), x2 = factor(x2), x3 = factor(x3))
aov.out <- aov(y ~ x1*x2*x3, data = data.set)
The next command computes 2 contrasts with respect to the x1 factor.
The first contrast is the difference between the first and second marginal mean.
The second contrast is the difference between the average of the first and
second marginal mean and the third marginal mean.
aov.contrast(aov.out, margin=1,
contrast.list=list(c( 1, -1, 0, 0), c( .5, .5, -1, 0)) )
Model: y ~ x1 * x2 * x3
MS Error = 5.1090441795071, df.error = 24, N.valid = 48, N.missing = 0
Confidence level = 0.95
Contrasts for means of factor x1
Contrast coefficients:
[,1] [,2] [,3] [,4]
C1 1.000 -1.000 0.000 0.000
C2 0.500 0.500 -1.000 0.000
Contrast tests assume homogeneity of variance.
Estimate Conf.Bounds Std.Error Df T.Stat Signif
C1 -2.5715785 1.9045069 0.92277157 24 -2.7867986 0.010238
C2 -3.5299750 1.6493514 0.79914362 24 -4.4171973 0.000183
The next set of commands checks that the preceding aov.contrast commands computed
the correct estimates for the two contrasts.
cell.means <- tapply(data.set$y, data.set[c("x1","x2","x3")], mean)
x1.means <- apply(cell.means, 1, mean)
c("contrast 1" = x1.means[1] - x1.means[2],
"contrast 2" = .5*x1.means[1] + .5*x1.means[2] - x1.means[3])
For more precision on the p-values in the output, use either
aov.contrast(aov.out, margin=1,
contrast.list=list(c( 1, -1, 0, 0), c( .5, .5, -1, 0)),
p.digits=9 )
aov.contrast(aov.out, margin=1,
contrast.list=list(c( 1, -1, 0, 0), c( .5, .5, -1, 0)),
p.digits=NA )
The next command uses the Welch-Satterthwaite procedure for computing a test of the contrast without assuming homogeneity of variance.
aov.contrast(aov.out, margin=1,
contrast.list=list(c( 1, -1, 0, 0), c( .5, .5, -1, 0)),
var.equal = FALSE )
Data: data.set
Model: y ~ x1 * x2 * x3
MS Error = NA, df.error = 24, N.valid = 48, N.missing = 0
Confidence level = 0.95
Contrasts for means of factor x1
Contrast coefficients:
[,1] [,2] [,3] [,4]
C1 1.000 -1.000 0.000 0.000
C2 0.500 0.500 -1.000 0.000
These tests do not assume homogeneity of variance. Welch-Satterthwaite procedure used.
Estimate Conf.Bounds Std.Error Df.Adjusted T.Stat Signif
C1 -2.5715785 2.1417287 0.83062465 4.9497216 -3.0959574 0.027341
C2 -3.5299750 2.2773601 0.87320736 4.7715719 -4.0425392 0.010895
The next command computes the same two contrasts as above, except now the
contrasts are computed separately for each level of x3.
aov.contrast(aov.out, margin=1,
contrast.list=list(c( 1, -1, 0, 0), c( .5, .5, -1, 0)),
strata = 3 )
Data: data.set
Model: y ~ x1 * x2 * x3
MS Error = 5.1090441795071, df.error = 24, N.valid = 48, N.missing = 0
Confidence level = 0.95
Contrasts for means of factor x1 within levels of factor x3
Notation: 'C1' alone refers to the 'C1' contrast averaged over the other factors.
'C1 | i' refers to the 'C1' contrast computed within x3 = i
Contrast coefficients:
[,1] [,2] [,3] [,4]
C1 1.000 -1.000 0.000 0.000
C2 0.500 0.500 -1.000 0.000
Contrast tests assume homogeneity of variance.
Estimate Conf.Bounds Std.Error Df T.Stat Signif
C1 -2.57157854 1.9045069 0.92277157 24 -2.78679863 0.010238
C1 | 1 -4.24715711 2.6933795 1.30499606 24 -3.25453634 0.003365
C1 | 2 -0.89599996 2.6933795 1.30499606 24 -0.68659208 0.498920
C2 -3.52997502 1.6493514 0.79914362 24 -4.41719729 0.000183
C2 | 1 -2.80177384 2.3325351 1.13015974 24 -2.47909542 0.020589
C2 | 2 -4.25817620 2.3325351 1.13015974 24 -3.76776489 0.000945
The next set of commands checks that the preceding aov.contrast command computed
the correct estimates for the two contrasts within the levels of x3.
x1.3.means <- apply(cell.means, c(1,3), mean)
x1.3.means[1,] - x1.3.means[2,]
.5*x1.3.means[1,] + .5*x1.3.means[2,] - x1.3.means[3,]
The next command repeats this analysis without assuming homogeneity of variance.
aov.contrast(aov.out, margin=1,
contrast.list=list(c( 1, -1, 0, 0), c( .5, .5, -1, 0)),
strata = 3, var.equal=FALSE )
The next aov.contrast command computes an interaction contrast between
levels 2 and 3 of the x1 factor and levels 1 and 2 of the x2 factor.
aov.contrast(aov.out, margin=c(1,2), contrast.list=list(
c( 0, 1, -1, 0, -1, 1, 0, 0, 0, 0, 0, 0) ))
Data: data.set
Model: y ~ x1 * x2 * x3
MS Error = NA, df.error = 24, N.valid = 48, N.missing = 0
Confidence level = 0.95
Contrasts for means of factor x1 within levels of factor x3
Notation: 'C1' alone refers to the 'C1' contrast averaged over the other factors.
'C1 | i' refers to the 'C1' contrast computed within x3 = i
Contrast coefficients:
[,1] [,2] [,3] [,4]
C1 1.000 -1.000 0.000 0.000
C2 0.500 0.500 -1.000 0.000
These tests do not assume homogeneity of variance. Welch-Satterthwaite procedure used.
Estimate Conf.Bounds Std.Error Df.Adjusted T.Stat Signif
C1 -2.57157854 2.1417287 0.83062465 4.9497216 -3.0959574 0.027341
C1 | 1 -4.24715711 3.1154273 0.92300069 2.7209125 -4.6014669 0.023806
C1 | 2 -0.89599996 4.5185465 1.38123820 2.8615183 -0.6486933 0.564815
C2 -3.52997502 2.2773601 0.87320736 4.7715719 -4.0425392 0.010895
C2 | 1 -2.80177384 3.7633413 1.10545638 2.6851456 -2.5344952 0.094904
C2 | 2 -4.25817620 4.9832181 1.35200982 2.3980326 -3.1495157 0.069298
The next set of commands checks that the preceding aov.contrast command computed
the correct estimate for the interaction contrast.
x1.x2.means <- apply(cell.means, c(1, 2), mean)
x1.x2.means[1,2] - x1.x2.means[1,3] - x1.x2.means[2,2] + x1.x2.means[2,3]
The next set of commands shows the computation of contrasts between adjusted means
in a oneway analysis of covariance (ancova).
data.set <- data.frame(
score = c(43, 55, 51, 36, 52, 61, 58, 62, 55, 79, 63),
method = factor(c("A","A","A","A","B","B","B","C","C","C","C")),
age = c(72, 74, 71, 69, 70, 73, 73, 75, 70, 75, 73))
lm.results <- lm(score ~ method + age, data = data.set)
aov.contrast(lm.results,
contrast.list = list(c(1, -1, 0), c(1, 0, -1), c(0, 1, -1)))
Data: data.set
Model: score ~ method + age
MS Error = 31.7724775224775, df.error = 7, N.valid = 11, N.missing = 0
Confidence level = 0.95
Contrasts for means of factor method
Contrast coefficients:
[,1] [,2] [,3]
C1 1.000 -1.000 0.000
C2 1.000 0.000 -1.000
C3 0.000 1.000 -1.000
Contrast tests assume homogeneity of variance.
Estimate Conf.Bounds Std.Error Df T.Stat Signif
C1 -9.2010490 10.240797 4.3308348 7 -2.12454395 0.071242
C2 -13.0786713 10.200279 4.3136999 7 -3.03189178 0.019064
C3 -3.8776224 10.554439 4.4634743 7 -0.86874532 0.413778
----------------------------------------------------------------------
Section IIs2: How to use 'aov.contrast' to compute contrasts between marginal means in a between-subjects analysis of variance (completely randomized design)
The 'aov.contrast' function computes tests and confidence intervals for linear combinations of means in a between-subjects analysis of variance. The tests can apply to main effects of factors, to interaction contrasts, or to contrasts computed within fixed levels of other factors.
The following R commands create a hypothetical data set that will be used in all subsequent examples.
x1 <- rep(sort(rep(1:4, 2)), 6) x2 <- rep(sort(rep(1:3,8)), 2) x3 <- sort(rep(1:2, 24)) |
These commands create the codes for three factors. |
cbind(x1, x2, x3) |
This command lets you see the values of x1, x2 and x3. |
set.seed(10)
|
'set.seed' sets the initial state of the random number generator to '10'. The only reason to do this is that it causes all users to draw the same random sample. |
y <- x1 + x2 + x3 + x1*x3 + |
This command creates a variable, y, that results from the additive effects of x1, x2, and x3, and the interaction of x3 with x1 and with x2. It also has a normal random error component. |
data.set <- data.frame(y = y, |
This command creates a dataframe called 'data.set' with variables, y, x1, x2, and x3. The command also converts x1, x2, and x3 from numeric vectors to factors. |
∘ x1 <- rep(sort(rep(1:4, 2)), 6)
x2 <- rep(sort(rep(1:3,8)), 2)
x3 <- sort(rep(1:2, 24))
cbind(x1, x2, x3)
y <- x1 + x2 + x3 + x1*x3 +
x2*x3 + rnorm(48, 0, 8)
data.set <- data.frame(y = y,
x1 = factor(x1),
x2 = factor(x2),
x3 = factor(x3))
⋅
x3 = 1 |
|
|
|
|
|
|
x2 |
|
|
x1 |
Level 1 |
Level 2 |
Level 3 |
Marginal |
Level 1 |
m111 |
m121 |
m131 |
m1·1 |
Level 2 |
m211 |
m221 |
m231 |
m2·1 |
Level 3 |
m311 |
m321 |
m331 |
m3·1 |
Level 4 |
m411 |
m421 |
m431 |
m4·1 |
Marginal |
m·11 |
m·21 |
m·31 |
|
|
|
|
|
|
x3 = 2 |
|
|
|
|
|
|
x2 |
|
|
x1 |
Level 1 |
Level 2 |
Level 3 |
Marginal |
Level 1 |
m112 |
m122 |
m132 |
m1·2 |
Level 2 |
m212 |
m222 |
m232 |
m2·2 |
Level 3 |
m312 |
m322 |
m332 |
m3·2 |
Level 4 |
m412 |
m422 |
m432 |
m4·2 |
Marginal |
m·12 |
m·22 |
m·32 |
|
The experimental design is a 2 x 3 x 4 design with 2 replications per cell. Notation for the population cell means for this design are shown in Table 1RefTo: t1.
The marginal means shown in Table 1RefTo: t1 average the cell means along a row or column within a level of x3. For example, m1·1 is the mean of m111, m112, and m113. When averaging over all other factors, the marginal means are m1··, m2··, m3··, and m4·· for the x1 factor; m·1·, m·2·, and m·3· for the x2 factor; and m··1 and m··2 for the x3 factor.
Contrasts can be defined in terms of linear combinations of cell means or linear combinations of marginal means. For example, m1·· - m2·· is the difference between levels 1 and 2 of x1 averaged over all other factors; m1·1 - m2·1 is the difference between levels 1 and 2 of x1 averaged over x2 with x3 held fixed at x3 = 1; m111 - m211 is the difference between levels 1 and 2 of x1 with x2 and x3 held fixed at x2 = 1 and x3 = 1, respectively. Interaction contrasts involve more than one level of several factors. For example, m12· - m13· - m22· + m23· examines whether the difference between levels 2 and 3 of x2 is different at x1 = 1 and x1 = 2, after averaging over x3. Examples will be given that show how to test all of these different types of contrasts.
The R functions, 'lm' and 'aov', are used to compute analyses of variance[1]. This documentation assumes that you are already familiar with the use of 'lm' and 'aov'. All examples in this documentation are based on 'aov' but nothing in these examples would change if 'lm' were to replace the use of 'aov' in the examples. For example,
aov.out <- aov(y ~ x1*x2*x3, data = data.set) |
This command computes an analysis of variance with dependent variable y and independent variables x1, x2 and x3. All interactions are included in the model. The results of the analysis of variance are stored under the name, 'aov.out'. |
The preceding code stores the output of the call to 'aov' in an object called 'aov.out' (any other name could have been chosen, e.g., 'Nancy'); 'aov' is called an 'aov' object because it was produced by a call to the 'aov' function. The 'data' argument is not strictly necessary because the variables, y, x1, x2, and x3, exist on the search path, but it is generally good practice to use an explicit 'data' argument as a way to maintain a clear record of where the variables are to be found. There are many ways to extract the information that is stored in 'aov.out'. The following code displays a few relevant aspects of the intormation in 'aov.out'.
anova(aov.out) This command shows the standard anova table for this analysis.
aov.out$call This command shows the 'call' component of the 'aov.out' object. The 'aov.contrast' function uses the 'call' component to identify the data set and variables with respect to which the contrasts are defined.
deviance(aov.out)
aov.out$df.resid These commands shows the value of the error term and degrees of freedom for error.
Example 1x1: The simplest version of the 'aov.contrast' function is as follows.
aov.contrast(lm.input = aov.out, margin = 1, contrast = c(1, -.5, -.5, 0))
Specifying 'lm.input = aov.out' tells 'aov.contrast' what data set to use (data.set) and the names of the dependent and independent variables. Specifying 'margin = 1' indicates that the contrasts will be defined with respect to the first independent variable in the call to 'aov' - in this case, this would be the x1 factor. Contrasts on the marginals of the other factors would be specified with 'margin = 2' or 'margin = 3'. The 'contrast.list' argument (abbreviated to 'contrast') indicates that we want to compute statistics for the difference between the first marginal mean of x1 and the average of the second and third marginal mean. The output from this call to 'aov.contrast' is shown below:
Data: data.set
Model: y ~ x1 * x2 * x3
MS Error = 36.3309808320505, df.error = 24, N.valid = 48, N.missing = 0
Confidence level = 0.95
Contrasts for means of factor x1
Contrast coefficients:
[,1] [,2] [,3] [,4]
C1 1.000 -0.500 -0.500 0.000
Contrast tests assume homogeneity of variance.
Estimate Conf.Bounds Std.Error Df T.Stat Signif
1 -3.5997904 4.3982703 2.1310496 24 -1.6892100 0.104131
The output tells us that the data for the analysis is in a dataframe called 'data.set'; that the original call to 'aov' fitted the model: 'y ~ x1 * x2 * x3'; and that the error term that was found for this model was .32.76 with 24 degrees of freedom (df's). The output then shows the correspondence between the levels of the x1 factor and the contrast coefficients. One could fit a different model, e.g., 'y ~ x1 + x2 + x3', and the error term would naturally change. The model formula in the 'aov.contrast' output reminds us that the tests are based on the error term from the fit of a specific anova model, in this case, 'y ~ x1 * x2 * x3'.
The output gives us the estimate of the contrast, the 95% confidence bounds for the contrast, the t-statistic, p-value, and standard error of the contrast, and the degrees of freedom for error. If these results were based on real data, one could report the results as follows: "The test for the difference between m1·· and the mean of m2·· and m3·· was highly significant (t(24) = -1.69, MS error = 36.33, p = .10). The 95% confidence interval for m1·· - (m2·· + m3··)/2 was -3.60 ± 4.40."
Example 2x2: The 'aov.contrast' function allows us to compute statistics several contrasts at once, as long as all the contrast pertain to the same 'aov' object, and the same factor(s). For example:
aov.contrast(lm.input = aov.out, margin = 1,
contrast = list( c( 1, -1, 0, 0),
c( 1, -.5, -.5, 0),
c( 1, -1/3, -1/3, -1/3) ) )
yields statistics for three contrasts. Note that the 'contrast.list' argument must now be a list of numeric vectors, each having the same length as the levels of the first factor.
Data: data.set
Model: y ~ x1 * x2 * x3
MS Error = 36.3309808320505, df.error = 24, N.valid = 48, N.missing = 0
Confidence level = 0.95
Contrasts for means of factor x1
Contrast coefficients:
[,1] [,2] [,3] [,4]
C1 1.000 -1.000 0.000 0.000
C2 1.000 -0.500 -0.500 0.000
C3 1.00000000 -0.33333333 -0.33333333 -0.33333333
Contrast tests assume homogeneity of variance.
Estimate Conf.Bounds Std.Error Df T.Stat Signif
C1 -2.6908761 5.0786851 2.4607242 24 -1.0935302 0.285016
C2 -3.5997904 4.3982703 2.1310496 24 -1.6892100 0.104131
C3 -4.1995700 4.1467290 2.0091729 24 -2.0901984 0.047370
The names for the contrasts, 'C1', 'C2', and 'C3', were automatically generated, but the user can specify names that override these default names, e.g.,
aov.contrast(lm.input = aov.out, margin = 1, contrast = list(
"Dif 1 v 2" = c( 1, -1, 0, 0),
"Dif 1 v 23" = c( 1, -.5, -.5, 0),
"Dif 1 v 234" = c( 1, -1/3, -1/3, -1/3) ) )
[Output omitted]
Example 3x3: The 'strata' argument of 'aov.contrast' allows one to test a contrast within each level of the stratifying factor(s). For example,
aov.contrast(lm.input = aov.out, margin = 1, contrast = c(1, -1, 0, 0), strata = 3)
computes a test of the difference between the first and second levels of x1 within each of the levels of x3. It also computes a test of this contrast averaged over the levels of x3. Here is the output.
Data: data.set
Model: y ~ x1 * x2 * x3
MS Error = 36.3309808320505, df.error = 24, N.valid = 48, N.missing = 0
Confidence level = 0.95
Contrasts for means of factor x1 within levels of factor x3
Notation: 'C1' alone refers to the 'C1' contrast averaged over the other factors.
'C1 | i' refers to the 'C1' contrast computed within x3 = i
Contrast coefficients:
[,1] [,2] [,3] [,4]
C1 1.000 -1.000 0.000 0.000
Contrast tests assume homogeneity of variance.
Estimate Conf.Bounds Std.Error Df T.Stat Signif
C1 -2.6908761 5.0786851 2.4607242 24 -1.09353016 0.285016
C1 | 1 -7.9924190 7.1823453 3.4799895 24 -2.29667904 0.030668
C1 | 2 2.6106668 7.1823453 3.4799895 24 0.75019386 0.460430
The output that is labeled 'C1' shows the results for the contrast averaged over the levels of x2 and x3, i.e., this output shows that the 95% confidence interval for m1·· - m2·· is -2.28 ± 4.82. (See Table 1RefTo: t1 for the notation for population means.) The output that is labeled 'C1 | 1' shows the same contrast computed within x3 = 1 (still averaged over x2) and the output that is labeled 'C1 | 2' shows this contrast computed within x3 = 2 (again averaged over x2). Thus the 95% confidence interval for m1·1 - m2·1 is -2.19 ± 6.82 and the 95% confidence interval for m1·2 - m2·2 is -2.37 ± 6.82.
The 'strata' argument can also be a vector representing several factors, in which case the specified contrast is computed within fixed levels of every combination of the stratifying factors. For example:
aov.contrast(lm.input = aov.out, margin = 1, contrast = c(1, -1, 0, 0), strata = c(2, 3))
specifies the contrast between the first and second levels of x1 computed within each combination of levels of x2 and x3. Here is the output.
Data: data.set
Model: y ~ x1 * x2 * x3
MS Error = 36.3309808320505, df.error = 24, N.valid = 48, N.missing = 0
Confidence level = 0.95
Contrasts for means of factor x1 within levels of factors x2 * x3
Notation: 'C1' alone refers to the 'C1' contrast averaged over the other factors.
'C1 | i*j' refers to the 'C1' contrast computed within x2 = i & x3 = j
Contrast coefficients:
[,1] [,2] [,3] [,4]
C1 1.000 -1.000 0.000 0.000
Contrast tests assume homogeneity of variance.
Estimate Conf.Bounds Std.Error Df T.Stat Signif
C1 -2.6908761 5.0786851 2.4607242 24 -1.09353016 0.285016
C1 | 1*1 5.2199676 12.4401870 6.0275186 24 0.86602264 0.395053
C1 | 1*2 -3.3165813 12.4401870 6.0275186 24 -0.55023991 0.587240
C1 | 2*1 -16.9628483 12.4401870 6.0275186 24 -2.81423408 0.009606
C1 | 2*2 12.8758010 12.4401870 6.0275186 24 2.13616942 0.043068
C1 | 3*1 -12.2343761 12.4401870 6.0275186 24 -2.02975335 0.053609
C1 | 3*2 -1.7272194 12.4401870 6.0275186 24 -0.28655563 0.776912
In addition to showing that the 95% confidence interval for m1·· - m2·· is -2.69 ± 5.08, the output shows, e.g., that the 95% confidence interval for m111 - m211 is 5.21 ± 12.44, and the 95% confidence interval for m112 - m212 is -3.32 ± 12.44, etc.
Example 4x4: It is permissible to specify more than one factor in the 'margin' argument. This is useful for testing interaction contrasts. For example, suppose that the following contrast is specified:
aov.contrast(lm.input = aov.out, margin = c(1, 2),
contrast = c( 0, 1, -1,
0, -1, 1,
0, 0, 0,
0, 0, 0 ))
|
|
|
|
|
|
x2 |
|
x1 |
Level 1 |
Level 2 |
Level 3 |
Level 1 |
1 |
2 |
3 |
Level 2 |
4 |
5 |
6 |
Level 3 |
7 |
8 |
9 |
Level 4 |
10 |
11 |
12 |
To understand this call to 'aov.contrast', we need to see how 'aov.contrast' enumerates the levels of factors x1 and x2. Table 2RefTo: t2 shows the correspondence between levels of x1 and x2 and the sequence of contrast coefficients. From Table 2RefTo: t2, we can see that the contrast specified above is: (m12· - m13·) - (m22· - m23·). Essentially, when the 'margin' argument specifies multiple factors, the factor levels are incremented starting with the last (right-most) factor. Here is the output.
Data: data.set
Model: y ~ x1 * x2 * x3
MS Error = 36.3309808320505, df.error = 24, N.valid = 48, N.missing = 0
Confidence level = 0.95
Contrasts for means of factors x1 * x2
Correspondence between the contrast coefficients and the levels of [1] x1 * x2
x2
x1 1 2 3
1 1 2 3
2 4 5 6
3 7 8 9
4 10 11 12
Contrast coefficients:
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
C1 0.000 1.000 -1.000 0.000 -1.000 1.000 0.000 0.000 0.000 0.000 0.000 0.000
Contrast tests assume homogeneity of variance.
Estimate Conf.Bounds Std.Error Df T.Stat Signif
1 4.9372741 12.440187 6.0275186 24 0.81912216 0.420777
The output shows that we cannot reject (m12· - m13·) - (m22· - m23·) = 0 at the .05 level (t(24) = .82, MS error = 36.33, p = .42).
----------------------------------------------------------------------
Section IIIs3. Analysis of Covariance (Ancova) Examples
The computation of contrasts in an ancova is essentially the same as in an anova, except that the contrasts are linear combinations of adjusted cell means rather than cell means. Adjusted cell means are the predicted values for cells where the predictions are computed with respect to the mean values of the covariates. The following examples show how to compute adjusted cell means and contrasts among the adjusted cell means, but the anova contrasts that were computed in previous examples can also be computed in the context of an ancova.
Oneway Ancova Data Set:
Suppose that children are taught a skill by one of three methods, A, B, or C. Their proficiency with the skill is measured on a test, and stored in a variable called 'score'. Their age in months at the beginning of training is recorded in a variable, 'age'.
data.set <- data.frame(
score = c(43, 55, 51, 36, 52, 61, 58, 62, 55, 79, 63),
method = factor(c("A","A","A","A","B","B","B","C","C","C","C")),
age = c(72, 74, 71, 69, 70, 73, 73, 75, 70, 75, 73))
Example 5x5. Adjusted Cell means
lm.results <- lm(score ~ method + age, data = data.set)
aov.contrast(lm.results,
contrast.list = list(A = c(1,0,0), B = c(0,1,0), C = c(0,0,1)))
Data: data.set
Model: score ~ method + age
MS Error = 31.7724775224775, df.error = 7, N.valid = 11, N.missing = 0
Confidence level = 0.95
Contrasts for means of factor method
Contrast coefficients:
[,1] [,2] [,3]
A 1.000 0.000 0.000
B 0.000 1.000 0.000
C 0.000 0.000 1.000
Contrast tests assume homogeneity of variance.
Estimate Conf.Bounds Std.Error Df T.Stat Signif
A 48.643833 6.8833692 2.9109780 7 16.710478 0.000001
B 57.844882 7.7193051 3.2644954 7 17.719394 0.000000
C 61.722505 7.0113887 2.9651175 7 20.816209 0.000000
The adjusted cell means are 48.64, 57.84, and 61.72 for levels A, B, and C of the 'method' factor. These means are the predicted values of the A, B, and C methods at the mean value of 'age'. A separate computation shows that the 'mean(age)' is 72.27 (months).
Example 6x6. Contrasts between levels in a oneway ancova.
aov.contrast(lm.results,
contrast.list = list(c(1, -1, 0), c(1, 0, -1), c(0, 1, -1)))
Data: data.set
Model: score ~ method + age
MS Error = 31.7724775224775, df.error = 7, N.valid = 11, N.missing = 0
Confidence level = 0.95
Contrasts for means of factor method
Contrast coefficients:
[,1] [,2] [,3]
C1 1.000 -1.000 0.000
C2 1.000 0.000 -1.000
C3 0.000 1.000 -1.000
Contrast tests assume homogeneity of variance.
Estimate Conf.Bounds Std.Error Df T.Stat Signif
C1 -9.2010490 10.240797 4.3308348 7 -2.12454395 0.071242
C2 -13.0786713 10.200279 4.3136999 7 -3.03189178 0.019064
C3 -3.8776224 10.554439 4.4634743 7 -0.86874532 0.413778
The difference in mean adjusted 'score' for A and B is -9.20 ± 10.24; for A and C it is -13.08 ± 10.20; for B and C it is -3.88 ± 10.55. Only the second of these differences is significant at the .05 level.
Twoway Ancova Data Set:
This data set is like the preceding one except that in this case there are two factors and two covariates. Suppose that children are taught a skill by one of three methods, A, B, or C. The gender of the children is recorded in a factor variable 'sex'. Their proficiency with the skill is measured on a test before and after training. Proficiency before training is stored in a variable 'score0' and proficiency after training is stored in a variable 'score1'. Their age in months at the beginning of training is recorded in a variable, 'age'. In this analysis, the dependent variable is 'score1', the factor variables are 'method' and 'sex', and the covariates are 'score0' and 'age'.
data.set <- data.frame(
score0 = c(38, 54, 52, 48, 30, 44, 56, 56, 50, 54, 43, 65, 67, 53),
score1 = c(43, 55, 53, 51, 36, 52, 61, 64, 58, 62, 55, 79, 73, 63),
method = factor(c('A','A','A','A','A','B','B','B','B',
'C','C','C', 'C','C')),
sex = factor(c('M','M','F','M','F','F','F','F','M','F','F','M','F','M')),
age = c(72, 74, 68, 71, 69, 70, 73, 69, 73, 75, 70, 75, 74, 73))
lm.results <- lm(score1 ~ score0 + age + method*sex, data = data.set)
aov.contrast(lm.input = lm.results, margin = c(1,2),
contrast.list = list(
c(1, 0, 0, 0, 0, 0), c(0, 1, 0, 0, 0, 0), c(0, 0, 1, 0, 0, 0),
c(0, 0, 0, 1, 0, 0), c(0, 0, 0, 0, 1, 0), c(0, 0, 0, 0, 0, 1)))
Data: data.set
Model: score1 ~ score0 + age + method * sex
MS Error = 4.27192271900576, df.error = 6, N.valid = 14, N.missing = 0
Confidence level = 0.95
Contrasts for means of factors method * sex
Correspondence between the contrast coefficients and the levels of [1] method * sex
sex
method 1 2
1 1 2
2 3 4
3 5 6
Contrast coefficients:
[,1] [,2] [,3] [,4] [,5] [,6]
C1 1.000 0.000 0.000 0.000 0.000 0.000
C2 0.000 1.000 0.000 0.000 0.000 0.000
C3 0.000 0.000 1.000 0.000 0.000 0.000
C4 0.000 0.000 0.000 1.000 0.000 0.000
C5 0.000 0.000 0.000 0.000 1.000 0.000
C6 0.000 0.000 0.000 0.000 0.000 1.000
Contrast tests assume homogeneity of variance.
Estimate Conf.Bounds Std.Error Df T.Stat Signif
C1 51.819632 4.7547179 1.9431505 6 26.667843 0.000000
C2 53.215635 3.1183318 1.2743948 6 41.757574 0.000000
C3 57.596043 3.2176075 1.3149667 6 43.800383 0.000000
C4 58.908146 5.2076366 2.1282486 6 27.679166 0.000000
C5 60.295875 3.1166319 1.2737001 6 47.339146 0.000000
C6 64.564965 4.1423765 1.6928998 6 38.138681 0.000000
To determine the correspondence between the contrast estimates and the combinations of 'method' and 'sex', consider the table that is shown in the output:
sex
method 1 2
1 1 2
2 3 4
3 5 6
Bear in mind that 'F' precedes 'M' in the alphabetical order, so 'sex' 1 is female and 'sex' 2 is male. Similarly, 'A' precedes 'B' precedes 'C' in the alphabetical order, so 'method' 1 corresponds to A, and similarly for 'method' 2 and 3. Therefore the adjusted means for the factor combinations of 'method' and 'gender are:
|
gender |
|
method |
F |
M |
A |
51.819632 |
53.215635 |
B |
57.596043 |
58.908146 |
C |
60.295875 |
64.564965 |
Example 8x8. Gender effect within each method.
lm.results <- lm(score1 ~ score0 + age + method*sex, data = data.set)
aov.contrast(lm.input = lm.results, margin = 2, contrast.list=c(1, -1),
strata = 1)
Data: data.set
Model: score1 ~ score0 + age + method * sex
MS Error = 4.27192271900576, df.error = 6, N.valid = 14, N.missing = 0
Confidence level = 0.95
Contrasts for means of factor sex within levels of factor method
Notation: 'C1' alone refers to the 'C1' contrast averaged over the other factors.
'C1 | i' refers to the 'C1' contrast computed within method = i
Contrast coefficients:
[,1] [,2]
C1 1.000 -1.000
Contrast tests assume homogeneity of variance.
Estimate Conf.Bounds Std.Error Df T.Stat Signif
C1 -2.3257318 3.7037437 1.5136400 6 -1.53651580 0.175322
C1 | 1 -1.3960030 5.8550411 2.3928288 6 -0.58341114 0.580868
C1 | 2 -1.3121027 6.3892034 2.6111294 6 -0.50250391 0.633220
C1 | 3 -4.2690898 4.7255343 1.9312238 6 -2.21056193 0.069090
Averaged over methods, the difference between females and males was -2.33 ± 3.70 with 95% confidence. Within method A, the difference was -1.40 ± 5.86; within method B, it was -1.31 ± 6.39; within method C, it was -4.27 ± 4.73.
As noted, examples that were presented in the context of an anova can also be implemented in the context of an ancova. The only exception is the Welch-Satterthwaite method for testing contrasts without assuming homogeneity of variance. This method is only available with respect to anovas, not with respect to ancovas.
----------------------------------------------------------------------
Other Arguments to the Function
Confidence.Level specifies the confidence level of the confidence intervals. The default is 'Confidence.Level = .95' but one could change this, e.g., to 'Confidence.Level = .90'.
var.equal If var.equal = TRUE, then the
standard error of a contrast is calculated by the standard formula from the
pooled within cell error variance (mean squared within groups). If
var.equal = FALSE, then the standard error is calculated by the formula , where ci
is the contrast coefficient for the i-th cell,
is the variance of the
i-th cell, and ni is the sample size of i-th cell. Furthermore, if
var.equal = FALSE, the degrees of freedom for error are computed by the
Welch-Satterthwaite formula.
show.cellcode specifies whether the table should be displayed that shows the correspondence between contrast coefficients and the combinations of factors specified in the 'margin' argument. The 'show.cellcode' argument has no effect (no table is displayed) when 'margin' specifies only one factor.
p.digits specifies the number of digits to the right of the decimal place to be shown in p-value of the test. Set 'p.digits = NA' if scientific notation is preferred.
----------------------------------------------------------------------
Conclusion
This is my first attempt at writing a R function for user-defined contrasts in a between-subjects anova. A number of improvements would be desirable and still need to be written. Specifically, one would like at least the following: Scheffé, Tukey, and Bonferroni multiple comparison procedures.
Let me be the first to say that I am a very crude R programmer, and the code for the 'aov.contrast' function could use a lot of improvement. If anyone notices errors in the computations of this function, please describe the error in an email to jmiyamot@u.washington.edu. It would be helpful if the message included a small data set and some R code that illustrates the error. In suggesting improvements to the code for 'aov.contrast', please bear in mind that I may have difficulty understanding more sophisticated programming techniques.
#-
#-------------- The 'doc' bookmark highlights all of the documentation but not the function skeleton at the top of the file. Use rdocToString macro to convert documentation to an R string variable --------------------------------------------------------
Format for this html document: Page Size = Tabloid: 11" x 17". Note that Tabloid page size is available if Adobe PDF is set as the default printer, but not if the default printer is a physical printer that prints 8.5" x 11" standard paper. Table widths = 9.5"