aov.contrast

File: jmfuns.rda

function ( lm.input, margin = 1, contrast.list, strata = NA,

Confidence.Level = 0.95, var.equal = TRUE,

show.cellcode = TRUE, show.test.def = TRUE,

p.digits = 6)

ARGUMENTS

lm.input A 'lm' or 'aov' object, i.e., the output of a call to 'lm' or 'aov'.

margin A vector of positive integers that indicate the factor or factors over whose levels the contrast is defined. The factors are listed in the same order as in the 'formula' argument of the call to 'aov'. For example, if the 'Y ~ A1*A2*A3' is the formula in the call to 'aov', then 'margin=3' indicates that the contrast coefficients apply to the levels of the 'A3' factor.

contrast.list A list of numeric vectors. Each vector indicates the contrast coefficients for the marginal means specified by 'margin'. All vectors must have the same length as the number of marginal means determined by 'margin'.

strata If 'strata' is non-missing, it must be a vector of integers indicating the factor or factors that play the role of strata. Each contrast is computed within every fixed level of the stratifying factors.

Confidence.Level Level of confidence for the confidence intervals.

var.equal Set var.equal = TRUE if and only if homogeneity of variance is assumed in the test of the contrast. The default is var.equal = TRUE.

show.cellcode When margin specifies more than one factor, e.g., 'margin = c(2, 3)', the contrast coefficients define a linear combination with respect to all combinations of factors 2 and 3. If factor 2 has k levels and factor 3 has m levels, there are k*m combinations of levels. If 'show.cellcode = TRUE', then the output displays the correspondence between the sequence of coefficients specified through 'contrast.list' and the k*m levels of factors 2 and 3.

show.test.def

p.digits The number of digits to the right of the decimal place on the p-value. Set 'p.digits = NA' to request scientific notation.

Details

The 'aov' object that is specified by 'lm.input' determines the error term from which all contrast standard errors are calculated. The 'call' component of 'lm.input' identifies the dependent variable and factors in terms of which the contrast is computed. The 'aov.contrast' function can identify these variables under either of two conditions: (a) the variable is contained in a dataframe that was specified in the 'data' argument of the call to 'aov' that produced 'lm.input', or (b) the variable is on the search path. If neither (a) nor (b) is satisfied, then 'aov.contrast' will fail to find the variables in terms of which the contrasts are defined. Each component of 'contrast.list' should be a numeric vector of the same length as the number of marginal means specified by 'margin'. If var.equal = TRUE, the standard error of a contrast is based on the error term from lm.input (deviance(lm.input)) and the degrees of freedom (df's) are the same as the df's of deviance(lm.input). If var.equal = FALSE, the standard error of a contrast is based on the within cell error of the cells that have non-zero coefficients in the contrast, and the df's are computed by the Welch-Satterthwaite formula (Brownlee, 1965).

Value

The output is a matrix with one row for each contrast that is specified in the 'contrast.list' argument. The output includes the estimate of the contrast, confidence bounds, t statistic, p-value of the t statistic, the standard error of the contrast, and the degrees of freedom for error.

References

Brownlee, K. A. (1965). Statistical theory and methodology in science and engineering. New York: Wiley, 1965.

Satterthwaite, F. E. (1946). An approximate distribution of estimates of variance components. Biometrics Bulletin, 2, 110-114.

Welch, B. L. (1947). The generalization of 'Student's problem when several different population variances are involved. Biometrika, 34, 23-35.

Section Is0001: Examples

First we create a data set with 3 factors (2 x 3 x 4 design, 2 replications per cell).

x1 <- rep(sort(rep(1:4, 2)), 6)

x2 <- rep(sort(rep(1:3,8)), 2)

x3 <- sort(rep(1:2, 24))

Set the random number seed to an arbitrary value so that the results obtained

here will agree with those of any other user.

set.seed(10)

y <- x1 + x2 + x3 + x1*x3 + x2*x3 + rnorm(length(x1), 0, 3)

data.set <- data.frame(y = y,

x1 = factor(x1), x2 = factor(x2), x3 = factor(x3))

aov.out <- aov(y ~ x1*x2*x3, data = data.set)

The next command computes 2 contrasts with respect to the x1 factor.

The first contrast is the difference between the first and second marginal mean.

The second contrast is the difference between the average of the first and

second marginal mean and the third marginal mean.

aov.contrast(aov.out, margin=1,

contrast.list=list(c( 1, -1, 0, 0), c( .5, .5, -1, 0)) )

Data: data.set

Model: y ~ x1 * x2 * x3

MS Error = 5.1090441795071, df.error = 24, N.valid = 48, N.missing = 0

Confidence level = 0.95

Contrasts for means of factor x1

Contrast coefficients:

[,1] [,2] [,3] [,4]

C1 1.000 -1.000 0.000 0.000

C2 0.500 0.500 -1.000 0.000

Contrast tests assume homogeneity of variance.

Estimate Conf.Bounds Std.Error Df T.Stat Signif

C1 -2.5715785 1.9045069 0.92277157 24 -2.7867986 0.010238

C2 -3.5299750 1.6493514 0.79914362 24 -4.4171973 0.000183

The next set of commands checks that the preceding aov.contrast commands computed

the correct estimates for the two contrasts.

cell.means <- tapply(data.set$y, data.set[c("x1","x2","x3")], mean)

x1.means <- apply(cell.means, 1, mean)

c("contrast 1" = x1.means[1] - x1.means[2],

"contrast 2" = .5*x1.means[1] + .5*x1.means[2] - x1.means[3])

For more precision on the p-values in the output, use either

aov.contrast(aov.out, margin=1,

contrast.list=list(c( 1, -1, 0, 0), c( .5, .5, -1, 0)),

p.digits=9 )

aov.contrast(aov.out, margin=1,

contrast.list=list(c( 1, -1, 0, 0), c( .5, .5, -1, 0)),

p.digits=NA )

The next command uses the Welch-Satterthwaite procedure for computing a test of the contrast without assuming homogeneity of variance.

aov.contrast(aov.out, margin=1,

contrast.list=list(c( 1, -1, 0, 0), c( .5, .5, -1, 0)),

var.equal = FALSE )

Data: data.set

Model: y ~ x1 * x2 * x3

MS Error = NA, df.error = 24, N.valid = 48, N.missing = 0

Confidence level = 0.95

Contrasts for means of factor x1

Contrast coefficients:

[,1] [,2] [,3] [,4]

C1 1.000 -1.000 0.000 0.000

C2 0.500 0.500 -1.000 0.000

These tests do not assume homogeneity of variance. Welch-Satterthwaite procedure used.

Estimate Conf.Bounds Std.Error Df.Adjusted T.Stat Signif

C1 -2.5715785 2.1417287 0.83062465 4.9497216 -3.0959574 0.027341

C2 -3.5299750 2.2773601 0.87320736 4.7715719 -4.0425392 0.010895

The next command computes the same two contrasts as above, except now the

contrasts are computed separately for each level of x3.

aov.contrast(aov.out, margin=1,

contrast.list=list(c( 1, -1, 0, 0), c( .5, .5, -1, 0)),

strata = 3 )

Data: data.set

Model: y ~ x1 * x2 * x3

MS Error = 5.1090441795071, df.error = 24, N.valid = 48, N.missing = 0

Confidence level = 0.95

Contrasts for means of factor x1 within levels of factor x3

Notation: 'C1' alone refers to the 'C1' contrast averaged over the other factors.

'C1 | i' refers to the 'C1' contrast computed within x3 = i

Contrast coefficients:

[,1] [,2] [,3] [,4]

C1 1.000 -1.000 0.000 0.000

C2 0.500 0.500 -1.000 0.000

Contrast tests assume homogeneity of variance.

Estimate Conf.Bounds Std.Error Df T.Stat Signif

C1 -2.57157854 1.9045069 0.92277157 24 -2.78679863 0.010238

C1 | 1 -4.24715711 2.6933795 1.30499606 24 -3.25453634 0.003365

C1 | 2 -0.89599996 2.6933795 1.30499606 24 -0.68659208 0.498920

C2 -3.52997502 1.6493514 0.79914362 24 -4.41719729 0.000183

C2 | 1 -2.80177384 2.3325351 1.13015974 24 -2.47909542 0.020589

C2 | 2 -4.25817620 2.3325351 1.13015974 24 -3.76776489 0.000945

The next set of commands checks that the preceding aov.contrast command computed

the correct estimates for the two contrasts within the levels of x3.

x1.3.means <- apply(cell.means, c(1,3), mean)

x1.3.means[1,] - x1.3.means[2,]

.5*x1.3.means[1,] + .5*x1.3.means[2,] - x1.3.means[3,]

The next command repeats this analysis without assuming homogeneity of variance.

aov.contrast(aov.out, margin=1,

contrast.list=list(c( 1, -1, 0, 0), c( .5, .5, -1, 0)),

strata = 3, var.equal=FALSE )

The next aov.contrast command computes an interaction contrast between

levels 2 and 3 of the x1 factor and levels 1 and 2 of the x2 factor.

aov.contrast(aov.out, margin=c(1,2), contrast.list=list(

c( 0, 1, -1, 0, -1, 1, 0, 0, 0, 0, 0, 0) ))

Data: data.set

Model: y ~ x1 * x2 * x3

MS Error = NA, df.error = 24, N.valid = 48, N.missing = 0

Confidence level = 0.95

Contrasts for means of factor x1 within levels of factor x3

Notation: 'C1' alone refers to the 'C1' contrast averaged over the other factors.

'C1 | i' refers to the 'C1' contrast computed within x3 = i

Contrast coefficients:

[,1] [,2] [,3] [,4]

C1 1.000 -1.000 0.000 0.000

C2 0.500 0.500 -1.000 0.000

These tests do not assume homogeneity of variance. Welch-Satterthwaite procedure used.

Estimate Conf.Bounds Std.Error Df.Adjusted T.Stat Signif

C1 -2.57157854 2.1417287 0.83062465 4.9497216 -3.0959574 0.027341

C1 | 1 -4.24715711 3.1154273 0.92300069 2.7209125 -4.6014669 0.023806

C1 | 2 -0.89599996 4.5185465 1.38123820 2.8615183 -0.6486933 0.564815

C2 -3.52997502 2.2773601 0.87320736 4.7715719 -4.0425392 0.010895

C2 | 1 -2.80177384 3.7633413 1.10545638 2.6851456 -2.5344952 0.094904

C2 | 2 -4.25817620 4.9832181 1.35200982 2.3980326 -3.1495157 0.069298

The next set of commands checks that the preceding aov.contrast command computed

the correct estimate for the interaction contrast.

x1.x2.means <- apply(cell.means, c(1, 2), mean)

x1.x2.means[1,2] - x1.x2.means[1,3] - x1.x2.means[2,2] + x1.x2.means[2,3]

The next set of commands shows the computation of contrasts between adjusted means

in a oneway analysis of covariance (ancova).

data.set <- data.frame(

score = c(43, 55, 51, 36, 52, 61, 58, 62, 55, 79, 63),

method = factor(c("A","A","A","A","B","B","B","C","C","C","C")),

age = c(72, 74, 71, 69, 70, 73, 73, 75, 70, 75, 73))

lm.results <- lm(score ~ method + age, data = data.set)

aov.contrast(lm.results,

contrast.list = list(c(1, -1, 0), c(1, 0, -1), c(0, 1, -1)))

Data: data.set

Model: score ~ method + age

MS Error = 31.7724775224775, df.error = 7, N.valid = 11, N.missing = 0

Confidence level = 0.95

Contrasts for means of factor method

Contrast coefficients:

[,1] [,2] [,3]

C1 1.000 -1.000 0.000

C2 1.000 0.000 -1.000

C3 0.000 1.000 -1.000

Contrast tests assume homogeneity of variance.

Estimate Conf.Bounds Std.Error Df T.Stat Signif

C1 -9.2010490 10.240797 4.3308348 7 -2.12454395 0.071242

C2 -13.0786713 10.200279 4.3136999 7 -3.03189178 0.019064

C3 -3.8776224 10.554439 4.4634743 7 -0.86874532 0.413778

----------------------------------------------------------------------

Section IIs2: How to use 'aov.contrast' to compute contrasts between marginal means in a between-subjects analysis of variance (completely randomized design)

The 'aov.contrast' function computes tests and confidence intervals for linear combinations of means in a between-subjects analysis of variance. The tests can apply to main effects of factors, to interaction contrasts, or to contrasts computed within fixed levels of other factors.

The following R commands create a hypothetical data set that will be used in all subsequent examples.

x1 <- rep(sort(rep(1:4, 2)), 6) x2 <- rep(sort(rep(1:3,8)), 2) x3 <- sort(rep(1:2, 24))	These commands create the codes for three factors.
cbind(x1, x2, x3)	This command lets you see the values of x1, x2 and x3.
set.seed(10)	'set.seed' sets the initial state of the random number generator to '10'. The only reason to do this is that it causes all users to draw the same random sample.
y <- x1 + x2 + x3 + x1x3 + x2x3 + rnorm(48, 0, 8)	This command creates a variable, y, that results from the additive effects of x1, x2, and x3, and the interaction of x3 with x1 and with x2. It also has a normal random error component.
data.set <- data.frame(y = y, x1 = factor(x1), x2 = factor(x2), x3 = factor(x3))	This command creates a dataframe called 'data.set' with variables, y, x1, x2, and x3. The command also converts x1, x2, and x3 from numeric vectors to factors.

∘ x1 <- rep(sort(rep(1:4, 2)), 6)

x2 <- rep(sort(rep(1:3,8)), 2)

x3 <- sort(rep(1:2, 24))

cbind(x1, x2, x3)

set.seed(10)

y <- x1 + x2 + x3 + x1*x3 +
x2*x3 + rnorm(48, 0, 8)

data.set <- data.frame(y = y,
x1 = factor(x1),
x2 = factor(x2),
x3 = factor(x3))

⋅

Table 1t1

x3 = 1
		x2
x1	Level 1	Level 2	Level 3	Marginal
Level 1	m₁₁₁	m₁₂₁	m₁₃₁	m₁_·1
Level 2	m₂₁₁	m₂₂₁	m₂₃₁	m₂_·1
Level 3	m₃₁₁	m₃₂₁	m₃₃₁	m₃_·1
Level 4	m₄₁₁	m₄₂₁	m₄₃₁	m₄_·1
Marginal	m_·₁₁	m_·₂₁	m_·₃₁

x3 = 2
		x2
x1	Level 1	Level 2	Level 3	Marginal
Level 1	m₁₁₂	m₁₂₂	m₁₃₂	m₁_·2
Level 2	m₂₁₂	m₂₂₂	m₂₃₂	m₂_·2
Level 3	m₃₁₂	m₃₂₂	m₃₃₂	m₃_·2
Level 4	m₄₁₂	m₄₂₂	m₄₃₂	m₄_·2
Marginal	m_·₁₂	m_·₂₂	m_·₃₂

The experimental design is a 2 x 3 x 4 design with 2 replications per cell. Notation for the population cell means for this design are shown in Table 1RefTo: t1.

The marginal means shown in Table 1RefTo: t1 average the cell means along a row or column within a level of x3. For example, m1·1 is the mean of m111, m112, and m113. When averaging over all other factors, the marginal means are m1··, m2··, m3··, and m4·· for the x1 factor; m·1·, m·2·, and m·3· for the x2 factor; and m··1 and m··2 for the x3 factor.

Contrasts can be defined in terms of linear combinations of cell means or linear combinations of marginal means. For example, m1·· - m2·· is the difference between levels 1 and 2 of x1 averaged over all other factors; m1·1 - m2·1 is the difference between levels 1 and 2 of x1 averaged over x2 with x3 held fixed at x3 = 1; m111 - m211 is the difference between levels 1 and 2 of x1 with x2 and x3 held fixed at x2 = 1 and x3 = 1, respectively. Interaction contrasts involve more than one level of several factors. For example, m12· - m13· - m22· + m23· examines whether the difference between levels 2 and 3 of x2 is different at x1 = 1 and x1 = 2, after averaging over x3. Examples will be given that show how to test all of these different types of contrasts.

The R functions, 'lm' and 'aov', are used to compute analyses of variance [1]. This documentation assumes that you are already familiar with the use of 'lm' and 'aov'. All examples in this documentation are based on 'aov' but nothing in these examples would change if 'lm' were to replace the use of 'aov' in the examples. For example,

aov.out <- aov(y ~ x1*x2*x3, data = data.set)

This command computes an analysis of variance with dependent variable y and independent variables x1, x2 and x3. All interactions are included in the model. The results of the analysis of variance are stored under the name, 'aov.out'.

The preceding code stores the output of the call to 'aov' in an object called 'aov.out' (any other name could have been chosen, e.g., 'Nancy'); 'aov' is called an 'aov' object because it was produced by a call to the 'aov' function. The 'data' argument is not strictly necessary because the variables, y, x1, x2, and x3, exist on the search path, but it is generally good practice to use an explicit 'data' argument as a way to maintain a clear record of where the variables are to be found. There are many ways to extract the information that is stored in 'aov.out'. The following code displays a few relevant aspects of the intormation in 'aov.out'.

anova(aov.out) This command shows the standard anova table for this analysis.

aov.out$call This command shows the 'call' component of the 'aov.out' object. The 'aov.contrast' function uses the 'call' component to identify the data set and variables with respect to which the contrasts are defined.

deviance(aov.out)

aov.out$df.resid These commands shows the value of the error term and degrees of freedom for error.

Example 1x1: The simplest version of the 'aov.contrast' function is as follows.

aov.contrast(lm.input = aov.out, margin = 1, contrast = c(1, -.5, -.5, 0))

Specifying 'lm.input = aov.out' tells 'aov.contrast' what data set to use (data.set) and the names of the dependent and independent variables. Specifying 'margin = 1' indicates that the contrasts will be defined with respect to the first independent variable in the call to 'aov' - in this case, this would be the x1 factor. Contrasts on the marginals of the other factors would be specified with 'margin = 2' or 'margin = 3'. The 'contrast.list' argument (abbreviated to 'contrast') indicates that we want to compute statistics for the difference between the first marginal mean of x1 and the average of the second and third marginal mean. The output from this call to 'aov.contrast' is shown below:

Data: data.set

Model: y ~ x1 * x2 * x3

MS Error = 36.3309808320505, df.error = 24, N.valid = 48, N.missing = 0

Confidence level = 0.95

Contrasts for means of factor x1

Contrast coefficients:

[,1] [,2] [,3] [,4]

C1 1.000 -0.500 -0.500 0.000

Contrast tests assume homogeneity of variance.

Estimate Conf.Bounds Std.Error Df T.Stat Signif

1 -3.5997904 4.3982703 2.1310496 24 -1.6892100 0.104131

The output tells us that the data for the analysis is in a dataframe called 'data.set'; that the original call to 'aov' fitted the model: 'y ~ x1 * x2 * x3'; and that the error term that was found for this model was .32.76 with 24 degrees of freedom (df's). The output then shows the correspondence between the levels of the x1 factor and the contrast coefficients. One could fit a different model, e.g., 'y ~ x1 + x2 + x3', and the error term would naturally change. The model formula in the 'aov.contrast' output reminds us that the tests are based on the error term from the fit of a specific anova model, in this case, 'y ~ x1 * x2 * x3'.

The output gives us the estimate of the contrast, the 95% confidence bounds for the contrast, the t-statistic, p-value, and standard error of the contrast, and the degrees of freedom for error. If these results were based on real data, one could report the results as follows: "The test for the difference between m1·· and the mean of m2·· and m3·· was highly significant (t(24) = -1.69, MS error = 36.33, p = .10). The 95% confidence interval for m1·· - (m2·· + m3··)/2 was -3.60 ± 4.40."

Example 2x2: The 'aov.contrast' function allows us to compute statistics several contrasts at once, as long as all the contrast pertain to the same 'aov' object, and the same factor(s). For example:

aov.contrast(lm.input = aov.out, margin = 1,

contrast = list( c( 1, -1, 0, 0),

c( 1, -.5, -.5, 0),

c( 1, -1/3, -1/3, -1/3) ) )

yields statistics for three contrasts. Note that the 'contrast.list' argument must now be a list of numeric vectors, each having the same length as the levels of the first factor.

Data: data.set

Model: y ~ x1 * x2 * x3

MS Error = 36.3309808320505, df.error = 24, N.valid = 48, N.missing = 0

Confidence level = 0.95

Contrasts for means of factor x1

Contrast coefficients:

[,1] [,2] [,3] [,4]

C1 1.000 -1.000 0.000 0.000

C2 1.000 -0.500 -0.500 0.000

C3 1.00000000 -0.33333333 -0.33333333 -0.33333333

Contrast tests assume homogeneity of variance.

Estimate Conf.Bounds Std.Error Df T.Stat Signif

C1 -2.6908761 5.0786851 2.4607242 24 -1.0935302 0.285016

C2 -3.5997904 4.3982703 2.1310496 24 -1.6892100 0.104131

C3 -4.1995700 4.1467290 2.0091729 24 -2.0901984 0.047370

The names for the contrasts, 'C1', 'C2', and 'C3', were automatically generated, but the user can specify names that override these default names, e.g.,

aov.contrast(lm.input = aov.out, margin = 1, contrast = list(

"Dif 1 v 2" = c( 1, -1, 0, 0),

"Dif 1 v 23" = c( 1, -.5, -.5, 0),

"Dif 1 v 234" = c( 1, -1/3, -1/3, -1/3) ) )

[Output omitted]

Example 3x3: The 'strata' argument of 'aov.contrast' allows one to test a contrast within each level of the stratifying factor(s). For example,

aov.contrast(lm.input = aov.out, margin = 1, contrast = c(1, -1, 0, 0), strata = 3)

computes a test of the difference between the first and second levels of x1 within each of the levels of x3. It also computes a test of this contrast averaged over the levels of x3. Here is the output.

Data: data.set

Model: y ~ x1 * x2 * x3

MS Error = 36.3309808320505, df.error = 24, N.valid = 48, N.missing = 0

Confidence level = 0.95

Contrasts for means of factor x1 within levels of factor x3

Notation: 'C1' alone refers to the 'C1' contrast averaged over the other factors.

'C1 | i' refers to the 'C1' contrast computed within x3 = i

Contrast coefficients:

[,1] [,2] [,3] [,4]

C1 1.000 -1.000 0.000 0.000

Contrast tests assume homogeneity of variance.

Estimate Conf.Bounds Std.Error Df T.Stat Signif

C1 -2.6908761 5.0786851 2.4607242 24 -1.09353016 0.285016

C1 | 1 -7.9924190 7.1823453 3.4799895 24 -2.29667904 0.030668

C1 | 2 2.6106668 7.1823453 3.4799895 24 0.75019386 0.460430

The output that is labeled 'C1' shows the results for the contrast averaged over the levels of x2 and x3, i.e., this output shows that the 95% confidence interval for m1·· - m2·· is -2.28 ± 4.82. (See Table 1RefTo: t1 for the notation for population means.) The output that is labeled 'C1 | 1' shows the same contrast computed within x3 = 1 (still averaged over x2) and the output that is labeled 'C1 | 2' shows this contrast computed within x3 = 2 (again averaged over x2). Thus the 95% confidence interval for m1·1 - m2·1 is -2.19 ± 6.82 and the 95% confidence interval for m1·2 - m2·2 is -2.37 ± 6.82.

The 'strata' argument can also be a vector representing several factors, in which case the specified contrast is computed within fixed levels of every combination of the stratifying factors. For example:

aov.contrast(lm.input = aov.out, margin = 1, contrast = c(1, -1, 0, 0), strata = c(2, 3))

specifies the contrast between the first and second levels of x1 computed within each combination of levels of x2 and x3. Here is the output.

Data: data.set

Model: y ~ x1 * x2 * x3

MS Error = 36.3309808320505, df.error = 24, N.valid = 48, N.missing = 0

Confidence level = 0.95

Contrasts for means of factor x1 within levels of factors x2 * x3

Notation: 'C1' alone refers to the 'C1' contrast averaged over the other factors.

'C1 | i*j' refers to the 'C1' contrast computed within x2 = i & x3 = j

Contrast coefficients:

[,1] [,2] [,3] [,4]

C1 1.000 -1.000 0.000 0.000

Contrast tests assume homogeneity of variance.

Estimate Conf.Bounds Std.Error Df T.Stat Signif

C1 -2.6908761 5.0786851 2.4607242 24 -1.09353016 0.285016

C1 | 1*1 5.2199676 12.4401870 6.0275186 24 0.86602264 0.395053

C1 | 1*2 -3.3165813 12.4401870 6.0275186 24 -0.55023991 0.587240

C1 | 2*1 -16.9628483 12.4401870 6.0275186 24 -2.81423408 0.009606

C1 | 2*2 12.8758010 12.4401870 6.0275186 24 2.13616942 0.043068

C1 | 3*1 -12.2343761 12.4401870 6.0275186 24 -2.02975335 0.053609

C1 | 3*2 -1.7272194 12.4401870 6.0275186 24 -0.28655563 0.776912

In addition to showing that the 95% confidence interval for m1·· - m2·· is -2.69 ± 5.08, the output shows, e.g., that the 95% confidence interval for m111 - m211 is 5.21 ± 12.44, and the 95% confidence interval for m112 - m212 is -3.32 ± 12.44, etc.

Example 4x4: It is permissible to specify more than one factor in the 'margin' argument. This is useful for testing interaction contrasts. For example, suppose that the following contrast is specified:

aov.contrast(lm.input = aov.out, margin = c(1, 2),

contrast = c( 0, 1, -1,

0, -1, 1,

0, 0, 0,

0, 0, 0 ))

Table 2t2
		x2
x1	Level 1	Level 2	Level 3
Level 1	1	2	3
Level 2	4	5	6
Level 3	7	8	9
Level 4	10	11	12

To understand this call to 'aov.contrast', we need to see how 'aov.contrast' enumerates the levels of factors x1 and x2. Table 2RefTo: t2 shows the correspondence between levels of x1 and x2 and the sequence of contrast coefficients. From Table 2RefTo: t2, we can see that the contrast specified above is: (m12· - m13·) - (m22· - m23·). Essentially, when the 'margin' argument specifies multiple factors, the factor levels are incremented starting with the last (right-most) factor. Here is the output.

Data: data.set

Model: y ~ x1 * x2 * x3

MS Error = 36.3309808320505, df.error = 24, N.valid = 48, N.missing = 0

Confidence level = 0.95

Contrasts for means of factors x1 * x2

Correspondence between the contrast coefficients and the levels of [1] x1 * x2

x1 1 2 3

1 1 2 3

2 4 5 6

3 7 8 9

4 10 11 12

Contrast coefficients:

[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]

C1 0.000 1.000 -1.000 0.000 -1.000 1.000 0.000 0.000 0.000 0.000 0.000 0.000

Contrast tests assume homogeneity of variance.

Estimate Conf.Bounds Std.Error Df T.Stat Signif

1 4.9372741 12.440187 6.0275186 24 0.81912216 0.420777

The output shows that we cannot reject (m12· - m13·) - (m22· - m23·) = 0 at the .05 level (t(24) = .82, MS error = 36.33, p = .42).

----------------------------------------------------------------------

Section IIIs3. Analysis of Covariance (Ancova) Examples

The computation of contrasts in an ancova is essentially the same as in an anova, except that the contrasts are linear combinations of adjusted cell means rather than cell means. Adjusted cell means are the predicted values for cells where the predictions are computed with respect to the mean values of the covariates. The following examples show how to compute adjusted cell means and contrasts among the adjusted cell means, but the anova contrasts that were computed in previous examples can also be computed in the context of an ancova.

Oneway Ancova Data Set:

Suppose that children are taught a skill by one of three methods, A, B, or C. Their proficiency with the skill is measured on a test, and stored in a variable called 'score'. Their age in months at the beginning of training is recorded in a variable, 'age'.

data.set <- data.frame(

score = c(43, 55, 51, 36, 52, 61, 58, 62, 55, 79, 63),

method = factor(c("A","A","A","A","B","B","B","C","C","C","C")),

age = c(72, 74, 71, 69, 70, 73, 73, 75, 70, 75, 73))

Example 5x5. Adjusted Cell means

lm.results <- lm(score ~ method + age, data = data.set)

aov.contrast(lm.results,

contrast.list = list(A = c(1,0,0), B = c(0,1,0), C = c(0,0,1)))

Data: data.set

Model: score ~ method + age

MS Error = 31.7724775224775, df.error = 7, N.valid = 11, N.missing = 0

Confidence level = 0.95

Contrasts for means of factor method

Contrast coefficients:

[,1] [,2] [,3]

A 1.000 0.000 0.000

B 0.000 1.000 0.000

C 0.000 0.000 1.000

Contrast tests assume homogeneity of variance.

Estimate Conf.Bounds Std.Error Df T.Stat Signif

A 48.643833 6.8833692 2.9109780 7 16.710478 0.000001

B 57.844882 7.7193051 3.2644954 7 17.719394 0.000000

C 61.722505 7.0113887 2.9651175 7 20.816209 0.000000

The adjusted cell means are 48.64, 57.84, and 61.72 for levels A, B, and C of the 'method' factor. These means are the predicted values of the A, B, and C methods at the mean value of 'age'. A separate computation shows that the 'mean(age)' is 72.27 (months).

Example 6x6. Contrasts between levels in a oneway ancova.

aov.contrast(lm.results,

contrast.list = list(c(1, -1, 0), c(1, 0, -1), c(0, 1, -1)))

Data: data.set

Model: score ~ method + age

MS Error = 31.7724775224775, df.error = 7, N.valid = 11, N.missing = 0

Confidence level = 0.95

Contrasts for means of factor method

Contrast coefficients:

[,1] [,2] [,3]

C1 1.000 -1.000 0.000

C2 1.000 0.000 -1.000

C3 0.000 1.000 -1.000

Contrast tests assume homogeneity of variance.

Estimate Conf.Bounds Std.Error Df T.Stat Signif

C1 -9.2010490 10.240797 4.3308348 7 -2.12454395 0.071242

C2 -13.0786713 10.200279 4.3136999 7 -3.03189178 0.019064

C3 -3.8776224 10.554439 4.4634743 7 -0.86874532 0.413778

The difference in mean adjusted 'score' for A and B is -9.20 ± 10.24; for A and C it is -13.08 ± 10.20; for B and C it is -3.88 ± 10.55. Only the second of these differences is significant at the .05 level.

Twoway Ancova Data Set:

This data set is like the preceding one except that in this case there are two factors and two covariates. Suppose that children are taught a skill by one of three methods, A, B, or C. The gender of the children is recorded in a factor variable 'sex'. Their proficiency with the skill is measured on a test before and after training. Proficiency before training is stored in a variable 'score0' and proficiency after training is stored in a variable 'score1'. Their age in months at the beginning of training is recorded in a variable, 'age'. In this analysis, the dependent variable is 'score1', the factor variables are 'method' and 'sex', and the covariates are 'score0' and 'age'.

data.set <- data.frame(

score0 = c(38, 54, 52, 48, 30, 44, 56, 56, 50, 54, 43, 65, 67, 53),

score1 = c(43, 55, 53, 51, 36, 52, 61, 64, 58, 62, 55, 79, 73, 63),

method = factor(c('A','A','A','A','A','B','B','B','B',

'C','C','C', 'C','C')),

sex = factor(c('M','M','F','M','F','F','F','F','M','F','F','M','F','M')),

age = c(72, 74, 68, 71, 69, 70, 73, 69, 73, 75, 70, 75, 74, 73))

Example 7x7.

lm.results <- lm(score1 ~ score0 + age + method*sex, data = data.set)

aov.contrast(lm.input = lm.results, margin = c(1,2),

contrast.list = list(

c(1, 0, 0, 0, 0, 0), c(0, 1, 0, 0, 0, 0), c(0, 0, 1, 0, 0, 0),

c(0, 0, 0, 1, 0, 0), c(0, 0, 0, 0, 1, 0), c(0, 0, 0, 0, 0, 1)))

Data: data.set

Model: score1 ~ score0 + age + method * sex

MS Error = 4.27192271900576, df.error = 6, N.valid = 14, N.missing = 0

Confidence level = 0.95

Contrasts for means of factors method * sex

Correspondence between the contrast coefficients and the levels of [1] method * sex

sex

method 1 2

1 1 2

2 3 4

3 5 6

Contrast coefficients:

[,1] [,2] [,3] [,4] [,5] [,6]

C1 1.000 0.000 0.000 0.000 0.000 0.000

C2 0.000 1.000 0.000 0.000 0.000 0.000

C3 0.000 0.000 1.000 0.000 0.000 0.000

C4 0.000 0.000 0.000 1.000 0.000 0.000

C5 0.000 0.000 0.000 0.000 1.000 0.000

C6 0.000 0.000 0.000 0.000 0.000 1.000

Contrast tests assume homogeneity of variance.

Estimate Conf.Bounds Std.Error Df T.Stat Signif

C1 51.819632 4.7547179 1.9431505 6 26.667843 0.000000

C2 53.215635 3.1183318 1.2743948 6 41.757574 0.000000

C3 57.596043 3.2176075 1.3149667 6 43.800383 0.000000

C4 58.908146 5.2076366 2.1282486 6 27.679166 0.000000

C5 60.295875 3.1166319 1.2737001 6 47.339146 0.000000

C6 64.564965 4.1423765 1.6928998 6 38.138681 0.000000

To determine the correspondence between the contrast estimates and the combinations of 'method' and 'sex', consider the table that is shown in the output:

sex

method 1 2

1 1 2

2 3 4

3 5 6

Bear in mind that 'F' precedes 'M' in the alphabetical order, so 'sex' 1 is female and 'sex' 2 is male. Similarly, 'A' precedes 'B' precedes 'C' in the alphabetical order, so 'method' 1 corresponds to A, and similarly for 'method' 2 and 3. Therefore the adjusted means for the factor combinations of 'method' and 'gender are:

	gender
method	F	M
A	51.819632	53.215635
B	57.596043	58.908146
C	60.295875	64.564965

Example 8x8. Gender effect within each method.

lm.results <- lm(score1 ~ score0 + age + method*sex, data = data.set)

aov.contrast(lm.input = lm.results, margin = 2, contrast.list=c(1, -1),

strata = 1)

Data: data.set

Model: score1 ~ score0 + age + method * sex

MS Error = 4.27192271900576, df.error = 6, N.valid = 14, N.missing = 0

Confidence level = 0.95

Contrasts for means of factor sex within levels of factor method

Notation: 'C1' alone refers to the 'C1' contrast averaged over the other factors.

'C1 | i' refers to the 'C1' contrast computed within method = i

Contrast coefficients:

[,1] [,2]

C1 1.000 -1.000

Contrast tests assume homogeneity of variance.

Estimate Conf.Bounds Std.Error Df T.Stat Signif

C1 -2.3257318 3.7037437 1.5136400 6 -1.53651580 0.175322

C1 | 1 -1.3960030 5.8550411 2.3928288 6 -0.58341114 0.580868

C1 | 2 -1.3121027 6.3892034 2.6111294 6 -0.50250391 0.633220

C1 | 3 -4.2690898 4.7255343 1.9312238 6 -2.21056193 0.069090

Averaged over methods, the difference between females and males was -2.33 ± 3.70 with 95% confidence. Within method A, the difference was -1.40 ± 5.86; within method B, it was -1.31 ± 6.39; within method C, it was -4.27 ± 4.73.

As noted, examples that were presented in the context of an anova can also be implemented in the context of an ancova. The only exception is the Welch-Satterthwaite method for testing contrasts without assuming homogeneity of variance. This method is only available with respect to anovas, not with respect to ancovas.

----------------------------------------------------------------------

Other Arguments to the Function

Confidence.Level specifies the confidence level of the confidence intervals. The default is 'Confidence.Level = .95' but one could change this, e.g., to 'Confidence.Level = .90'.

var.equal If var.equal = TRUE, then the standard error of a contrast is calculated by the standard formula from the pooled within cell error variance (mean squared within groups). If var.equal = FALSE, then the standard error is calculated by the formula , where ci is the contrast coefficient for the i-th cell, is the variance of the i-th cell, and ni is the sample size of i-th cell. Furthermore, if var.equal = FALSE, the degrees of freedom for error are computed by the Welch-Satterthwaite formula.

show.cellcode specifies whether the table should be displayed that shows the correspondence between contrast coefficients and the combinations of factors specified in the 'margin' argument. The 'show.cellcode' argument has no effect (no table is displayed) when 'margin' specifies only one factor.

p.digits specifies the number of digits to the right of the decimal place to be shown in p-value of the test. Set 'p.digits = NA' if scientific notation is preferred.

----------------------------------------------------------------------

Conclusion

This is my first attempt at writing a R function for user-defined contrasts in a between-subjects anova. A number of improvements would be desirable and still need to be written. Specifically, one would like at least the following: Scheffé, Tukey, and Bonferroni multiple comparison procedures.

Let me be the first to say that I am a very crude R programmer, and the code for the 'aov.contrast' function could use a lot of improvement. If anyone notices errors in the computations of this function, please describe the error in an email to jmiyamot@u.washington.edu. It would be helpful if the message included a small data set and some R code that illustrates the error. In suggesting improvements to the code for 'aov.contrast', please bear in mind that I may have difficulty understanding more sophisticated programming techniques.

#-------------- The 'doc' bookmark highlights all of the documentation but not the function skeleton at the top of the file. Use rdocToString macro to convert documentation to an R string variable --------------------------------------------------------

Format for this html document: Page Size = Tabloid: 11" x 17". Note that Tabloid page size is available if Adobe PDF is set as the default printer, but not if the default printer is a physical printer that prints 8.5" x 11" standard paper. Table widths = 9.5"

[1] 'lm' is used to fit regression and ancova models as well as anova models.