Stat 421, Fall 2008
Applied Statistics and Experimental Design


Instructor:
  • Fritz Scholz
  • Office: Padelford C-310
  • Office Hours: Mo 9:00-10:00, We 11:30-12:30 or by appointment.
  • Office Phone: 206-543-3866 (can leave messages on answering service)
  • Statistics Department phone: 206-543-7237 (for leaving messages in my mail box)
  • e-mail: fscholz at u dot washington dot edu (best way to reach me)

    Teaching Assistant:
  • Julia Palacios
  • Office: Padelford C24 (Lower level, one level below first floor)
  • Office Hours: TBD
  • Office Phone: (206) 543-9076
  • e-mail: jpalacio[at]u.washington.edu
  • Lab Page

    Lectures: Mo/We/Fr 2:30-3:20 Guggenheim 218
    Lab: We 3:30-4:20 Communications B027
    Text (recommended but optional):
    Angela Dean and Daniel Voss, Design and Analysis of Experiments, Springer-Verlag, 1999.

    Lecture Notes from Stat 502 in 2006 by Peter Hoff,
    which I will follow to a great extent.

    Your main reference will be the lecture slides, see below.

    Grades: Homework 40%, midterm 20%, final 40%.
    Midterm: Time to be determined.

    Approximate Schedule:

    Week   1 : Introduction/Review of R.
    Week   2 : Observational Studies and Controlled Experiments,
                    first randomization tests (Fisher's Exact Test).
    Week   3 : Experimental Design Principles, comparing two treatments,
                    completely randomized designs, randomization tests.
    Weeks   4/5 : Review of the normal distribution population model,
                    distributional facts (t, chi-square, F with noncentral versions),
                    basic tests and confidence intervals, power and sample size,
                    diagnostic tests.
    Week   6 : Several treatment effects model, ANOVA,
                    sum of squares decomposition, geometric interpretation.
    Week   7 : Treatment comparisons, model diagnostics.
    Week   8 : Factorial treatment designs,
                    ANOVA decomposition for the additive model.
    Week   9 : ANOVA for the interaction model, model comparison, and normal-theory testing.
    Week   10 : Randomized block experiments, more if time permits.


  • Reviewing Stat 421

  • Final 2007 with Solutions

  • Midterm Solutions

  • Final Solutions

  • Final Scores and Grades

  • Grade comparison over 2006, 2007, and 2008

    Homework for Stat 421

  • Homework 1 due Oct. 3, 2008

  • Homework 1 Solutions

  • Homework 2 due Oct. 10, 2008

  • Homework 2 Solutions

  • Homework 3 due Oct. 17, 2008

  • Homework 3 Solutions

  • Homework 4 due Oct. 24, 2008

  • Homework 4 Solutions

  • Homework 5 due Oct. 31, 2008

  • Homework 5 Solutions

  • Homework 6 due Nov. 7, 2008

  • Homework 6 Solutions

  • Homework 7 due Nov. 14, 2008

  • Homework 7 Solutions

  • Homework 8 due Nov. 21, 2008

  • Homework 8 Solutions

  • Homework 9 due Dec. 1, 2008

  • Homework 9 Solutions

    R Installation Info and Introductory Guides:

    Download site:
    The Comprehensive R Archive Network

    Free guides:
    R Primer by Chris Green
    An Introduction to R The introductory guide that comes with R.
    Verzani-SimpleR

    For a large collection of books on and related to R see: Books

    In the past I have recommended the text by Peter Dalgaard, Introductory Statistics with R,
    now in its second edition; however, the above free guides should serve as well.

    An Approach to Providing Mathematical Annotation in Plots by Paul Murrell and Ross Ihaka (may prompt for UW login)

    The material that follows is what was used in Fall 2006 and 2007.
    Some modifications will take place.
    The date after the lecture slides indicates how up-to-date they are.


    Files for STAT 421

    Data Files:
  • MS Excel csv-file Flux.csv or R file flux containing the flux data.
  • MS Excel csv-file Flux3.csv containing the Flux3 data.
  • MS Excel csv-file aidresponse.csv containing the aid car response data.
  • MS Excel csv-file crab.csv containing the hermit crab count data.
  • MS Excel csv-file workerdata.csv containing the worker productivity data for HW 9.
  • MS Excel csv-file poison.csv containing the insecticide data used in the Factorial Design section.
  • MS Excel csv-file fertilizerdata.csv containing the fertilizer data used in the Factorial Design section.
  • MS Excel csv-file FluxBatch.csv containing the Flux by Batch data used in the Factorial Design section.
  • MS Excel csv-file Battery.csv containing the Battery data (Montgomery, p. 165) used in the Factorial Design section.
         R function Battery.anal doing various analyses on the Battery data.
         R function Battery.random, which compares the randomization reference distributions
         for the Battery data with the respective F-distributions.
         Randomization reference distributions of the F-test for the temperature effect using the Battery data.
         Randomization reference distributions of the F-test for the type effect using the Battery data.
         Randomization reference distributions of the F-test for the interaction effect using the Battery data.


    R Packages:
  • R package nortest_1.0.zip, Package for testing normality.


    R Code for Lecture Examples:
  • R function flux.plots that generated the flux plots in the lectures. This assumes that the data frame flux is loaded in R.
  • R function sample2.plots that produced the 20 normal QQ-plots of standard normal samples of sizes m=9 and n=9.
  • R functions producing the CLT illustrations in lecture: clt1, clt2, clt3, clt4.
  • R function sampling.dist producing the hypothesis testing illustrations in lecture.
  • R function densities producing chi-square densities, t-densities, and F-densities (make sure qnct is loaded into the workspace).
  • R function power.function producing power functions for the 1-sample t-test (make sure qnct is loaded into the workspace).
  • R function typeIerror.rateRand for simulating pairwise and overall p-values in the pairwise treatment of ANOVA
        via randomization tests in order to illustrate the multiple comparison issue.
  • R function Ftest.rand used for doing the ANOVA randomization test analysis for the Flux3 data to produce slides 25
        and 28 in Stat421ANOVA.pdf. It is an updated version of what is on slide 26.
  • R function sample.sizeANOVA for determining sample sizes to achieve desired power in a simple ANOVA test.
  • R function Fmin.test for simulating the Fmin distribution.
  • R function kruskal.wallis.pvalue for simulating the p-value of the Kruskal-Wallis test.

  • Reference: K-Sample Anderson-Darling Tests (may prompt for UW login)

  • R function adk.pvalue for simulating the p-value of the Anderson-Darling k-sample test.
  • R function coag.tukey for doing the pairwise intervals and plot for slides 93 and 94 of ANOVA.
  • R function fertilizer.analysis for performing the fertilizer ANOVA (Slides 105 & 107) for the treatment.
  • R function fertilizer.randomization.analysis for performing the randomization test on the fertilizer data for the treatment.
  • R function fertilizer.randomization.analysis.Block for performing the randomization test on the fertilizer data for the blocking.
  • The simulation functions comparing the Kruskal-Wallis test, the Anderson-Darling k-sample test, and the F-test
        on 3 normal samples: either all with the same normal distribution, or with the same mean
        but different variances, or with the same sample variances but different means.
        kruskal.wallis.demo0,
        kruskal.wallis.demo1,
        kruskal.wallis.demo2.
        To use any one of these, install the adk package and execute library(adk).


    Lecture Slides:

  • Introduction/Review of R.
         Last updated 9/23/2008.

  • Observational Studies & Controlled Experiments.
         Last updated 10/7/2008.

  • Two Sample Experiments: Treatment vs Control.
         Last updated 10/15/2008.

  • Finite Population Samples and Hypergeometric Distribution.


  • Normal and Related Distributions, Tests, Power, Sample Size & Confidence Intervals, Tests of Fit.
         Last updated 10/31/2008.

  • One-Factor ANOVA.
         Last updated 11/21/2008.

  • Linear Model and Least Squares Estimation.
         Last updated 9/23/2008.

  • Factorial Design.
         Last updated 12/5/2008.


  • Home page of Fritz Scholz.