Stat 421, Fall 2008
Applied Statistics and Experimental Design
Instructor: Fritz Scholz
Office: Padelford C-310
Office Hours: Mo 9:00-10:00, We 11:30-12:30 or by appointment.
Office Phone: 206-543-3866 (can leave messages on answering service)
Statistics Department phone: 206-543-7237 (for leaving messages in my mail box)
e-mail: fscholz at u dot washington dot edu (best way to reach me)
Teaching Assistant:
Julia Palacios
Office: Padelford C24 (Lower level, one level below first floor)
Office Hours: TBD
Office Phone: (206) 543-9076
e-mail: jpalacio[at]u.washington.edu
Lab Page
Lectures: Mo/We/Fr 2:30-3:20 Guggenheim 218
Lab: We 3:30-4:20 Communications B027
Text (recommended but optional):
Angela Dean and Daniel Voss, Design and Analysis of Experiments, Springer Verlag 1999.
Lecture Notes from Stat 502 in 2006 by Peter Hoff, which I will follow to a great extent.
Your main reference will be the lecture slides, see below.
Grades: Homework 40%, midterm 20%, final 40%.
Midterm: Time to be determined.
Approximate Schedule:
Week 1: Introduction/Review of R.
Week 2: Observational Studies and Controlled Experiments, first randomization tests (Fisher's Exact Test).
Week 3: Experimental Design Principles, comparing two treatments, completely randomized designs, randomization tests.
Weeks 4/5: Review of the normal distribution population model, distributional facts (t, chi-square, F with noncentral versions), basic tests and confidence intervals, power and sample size, diagnostic tests.
Week 6: Several treatment effects model, ANOVA, sum of squares decomposition, geometric interpretation.
Week 7: Treatment comparisons, model diagnostics.
Week 8: Factorial treatment designs, ANOVA decomposition for the additive model.
Week 9: ANOVA for the interaction model, model comparison, and normal-theory testing.
Week 10: Randomized block experiments, more if time permits.
Reviewing Stat 421
Final 2007 with Solutions
Midterm Solutions
Final Solutions
Final Scores and Grades
Grade comparison over 2006, 2007, and 2008
Homework for Stat 421
Homework 1 due Oct. 3, 2008
Homework 1 Solutions
Homework 2 due Oct. 10, 2008
Homework 2 Solutions
Homework 3 due Oct. 17, 2008
Homework 3 Solutions
Homework 4 due Oct. 24, 2008
Homework 4 Solutions
Homework 5 due Oct. 31, 2008
Homework 5 Solutions
Homework 6 due Nov. 7, 2008
Homework 6 Solutions
Homework 7 due Nov. 14, 2008
Homework 7 Solutions
Homework 8 due Nov. 21, 2008
Homework 8 Solutions
Homework 9 due Dec. 1, 2008
Homework 9 Solutions
R Installation Info and Introductory Guides:
Download site: The Comprehensive R Archive Network
Free guides:
R Primer by Chris Green
An Introduction to R: the introductory guide that comes with R.
Verzani-SimpleR
For a large collection of books on and related to R see:
Books
In the past I have recommended the text by Peter Dalgaard, Introductory Statistics with R, now in its second edition; however, the above free guides should serve as well.
An Approach to Providing Mathematical Annotation in Plots by Paul Murrell and Ross Ihaka (may prompt for UW login)
The material that follows is what was used in Fall 2006 and 2007.
Some modifications will take place.
The date after the lecture slides indicates how up-to-date they are.
Files for STAT 421
Data Files:
MS Excel csv-file Flux.csv or R file flux containing the flux data.
MS Excel csv-file Flux3.csv containing the Flux3 data.
MS Excel csv-file aidresponse.csv containing the aid car response data.
MS Excel csv-file crab.csv containing the hermit crab count data.
MS Excel csv-file workerdata.csv containing the worker productivity data for HW 9.
MS Excel csv-file poison.csv containing the insecticide data used in the Factorial Design section.
MS Excel csv-file fertilizerdata.csv containing the fertilizer data used in the Factorial Design section.
MS Excel csv-file FluxBatch.csv containing the Flux by Batch data used in the Factorial Design section.
MS Excel csv-file Battery.csv containing the Battery data (Montgomery, p. 165) used in the Factorial Design section.
    
R function Battery.anal doing various analyses on the Battery data.
R function Battery.random, which compares the randomization reference distributions for the Battery data with the respective F-distributions.
Randomization reference distributions of the F-test for the temperature effect, using the Battery data.
Randomization reference distributions of the F-test for the type effect, using the Battery data.
Randomization reference distributions of the F-test for the interaction effect, using the Battery data.
R Packages:
R package nortest_1.0.zip, a package for testing normality.
R Code for Lecture Examples:
R function flux.plots that generated the flux plots in the lectures. This assumes that the data frame flux is loaded in R.
R function sample2.plots that produced the
20 normal QQ-plots of standard normal samples of sizes m=9 and n=9.
R functions producing the CLT illustrations in lecture: clt1, clt2, clt3, clt4.
R function producing the hypothesis testing illustrations in lecture: sampling.dist.
R function producing chi-square densities, t-densities, and F-densities (make sure qnct is loaded into the workspace): densities.
R function producing power functions for the 1-sample t-test (make sure qnct is loaded into the workspace): power.function.
R function typeIerror.rateRand for simulating pairwise and overall p-values in the pairwise treatment of ANOVA via randomization tests in order to illustrate the multiple comparison issue.
R function Ftest.rand used for doing the ANOVA randomization test analysis for the Flux3 data to produce slides 25 and 28 in Stat421ANOVA.pdf. It is an updated version of what is on slide 26.
R function sample.sizeANOVA for determining sample sizes to achieve desired
power in a simple ANOVA test.
R function Fmin.test for simulating the Fmin distribution.
R function kruskal.wallis.pvalue for simulating the p-value of the Kruskal-Wallis test.
Reference:
K-Sample Anderson-Darling Tests (may prompt for UW login)
R function adk.pvalue for simulating the p-value of the Anderson-Darling k-sample test.
R function coag.tukey for doing the pairwise intervals and plot for slides 93 and 94 of ANOVA.
R function fertilizer.analysis for performing the fertilizer ANOVA for the treatment (Slides 105 & 107).
R function fertilizer.randomization.analysis for performing the randomization test on the fertilizer data for the treatment.
R function fertilizer.randomization.analysis.Block for performing the randomization test on the fertilizer data for the blocking.
The simulation functions comparing the Kruskal-Wallis test, the Anderson-Darling k-sample test, and the F-test on 3 normal samples, either all with the same normal distribution, or with the same mean but different variances, or with the same sample variances but different means: kruskal.wallis.demo0, kruskal.wallis.demo1, kruskal.wallis.demo2.
To use any of these you should install the adk package and execute library(adk).
Lecture Slides:
Introduction/Review of R. Last updated 9/23/2008.
Observational Studies & Controlled Experiments. Last updated 10/7/2008.
Two Sample Experiments: Treatment vs Control. Last updated 10/15/2008.
Finite Population Samples and Hypergeometric Distribution.
Normal and Related Distributions, Tests, Power, Sample Size & Confidence Intervals, Tests of Fit. Last updated 10/31/2008.
One-Factor ANOVA. Last updated 11/21/2008.
Linear Model and Least Squares Estimation. Last updated 9/23/2008.
Factorial Design. Last updated 12/05/2008.
Home page of Fritz Scholz.