Confidence Intervals
 
  • if we can say with a given degree of certainty that a sample statistic falls within a given range around the population value, then we know with the same degree of certainty that the population value falls within the same range around the sample statistic

  • if 95% of sample means are within 2 SE of the population mean, there is a 95% chance that the population mean is within 2 SE of the sample mean

  • confidence interval (CI) = range of values computed from sample data that includes population value to a specified degree of certainty

  • 95% CI = standard in most of social and behavioral sciences; standards vary in other sciences (99%, 99.9%, 99.99% CIs)

  • margin of error = 2 SE (half the width of the 95% CI)
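The claim that about 95% of "mean +/- 2 SE" intervals contain the population mean can be checked with a small simulation. This is only a sketch: the population values (mean 50, sd 10), the sample size, the number of trials, and the function name `coverage_rate` are assumptions chosen for illustration and are not from the notes.

```python
import random
import statistics


def coverage_rate(pop_mean=50.0, pop_sd=10.0, n=100, trials=2000, seed=0):
    """Share of simulated samples whose mean +/- 2 SE interval contains pop_mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        # Draw one random sample from a normal population (illustrative assumption).
        sample = [rng.gauss(pop_mean, pop_sd) for _ in range(n)]
        mean = statistics.fmean(sample)
        se = statistics.stdev(sample) / n ** 0.5  # SE of the mean = sd / sqrt(n)
        if mean - 2 * se <= pop_mean <= mean + 2 * se:
            hits += 1
    return hits / trials


rate = coverage_rate()
print(f"share of intervals containing the population mean: {rate:.3f}")
```

The printed share should land close to 0.95, though it will not equal it exactly because the simulation uses a finite number of trials.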


    Some principles about CIs
     

  • increased sample size yields narrower CI

  • smaller sample sd yields narrower CI for mean

  • width of CI grows with increasing confidence level (from 90% to 95% to 99%)

  • CI widths vary across samples, because a sample statistic (e.g., sd, proportion) is used as the estimate of the population parameter, and that sample statistic varies across samples
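The first three principles can be illustrated numerically. The sketch below assumes a CI for a mean of the form mean +/- z x sd / SQRT n, with z taken from the standard normal distribution (about 1.96 for 95%, where the notes round to 2). The function name `ci_width` and the example values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def ci_width(sd, n, confidence=0.95):
    """Full width of a normal-approximation CI for a mean."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return 2 * z * sd / sqrt(n)


# Larger n gives a narrower CI (quadrupling n halves the width).
for n in (100, 400, 1600):
    print(f"n = {n:5d}: width = {ci_width(10, n):.2f}")

# Smaller sample sd gives a narrower CI.
for sd in (20, 10, 5):
    print(f"sd = {sd:3d}: width = {ci_width(sd, 100):.2f}")

# Higher confidence level gives a wider CI.
for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} CI: width = {ci_width(10, 100, level):.2f}")
```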


    annual household electricity cost in 1% sample of 1990 CA census

    mean = $703.09
    sd = 585.68    n = 290,968

    95% CI = $700.91-705.27

    SE = sd / SQRT n = 585.68 / SQRT 290968 = 1.09

    99% CI if results based on sample of n = 250?

    SEM = 585.68 / SQRT 250 = 37.04

    99% CI = mean +/- 2.58 SE = $703.09 +/- 95.56 = $607.53-798.65
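The electricity-cost figures can be reproduced with a few lines of arithmetic. This is a sketch using the mean, sd, and sample sizes given above. The helper name `mean_ci` is hypothetical, and because the code does not round SE before multiplying, its results can differ from hand calculations by a cent or so.

```python
from math import sqrt


def mean_ci(mean, sd, n, multiplier=2.0):
    """Return (SE, lower bound, upper bound) for a CI around a sample mean."""
    se = sd / sqrt(n)  # standard error of the mean
    return se, mean - multiplier * se, mean + multiplier * se


# 95% CI with the full census sample (multiplier of 2, as in the notes).
se_full, lo_full, hi_full = mean_ci(703.09, 585.68, 290_968)
print(f"n = 290,968: SE = {se_full:.2f}, 95% CI = ${lo_full:.2f}-{hi_full:.2f}")

# 99% CI if the same mean and sd came from only 250 households.
se_small, lo_small, hi_small = mean_ci(703.09, 585.68, 250, multiplier=2.58)
print(f"n = 250:     SE = {se_small:.2f}, 99% CI = ${lo_small:.2f}-{hi_small:.2f}")
```

With n = 250 the interval is roughly $190 wide, compared with about $4 for the full sample.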

    CI for difference between two means
     

  • testing whether the difference between two groups' means on same variable could be due to sampling error

    1) compute difference between means

    2) compute SE for each group

    3) compute SE for difference between means by squaring SE for each group, and then taking the square root of the sum of these squared SEs

    4) use this resulting SE for difference between means to construct CI
     

    reported number of children 1996 GSS

    n: men = 1277     women = 1612

    mean:  men = 1.68 women = 1.95

    sd: men = 1.66    women = 1.69

    can the difference between men and women be explained by sampling error?

    difference in means = 0.27

    SE (men) = 1.66 / SQRT 1277 = .05

    SE (women) = 1.69 / SQRT 1612 = .04

    SE (difference in means) = SQRT (.05^2 + .04^2) = .06

    95% CI = 0.27 +/- 2(.06) = .15 to .39

    conclusion: women reported more children than men, and this cannot be accounted for by sampling error (0 not included in CI)
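The four steps for a difference between two means can be sketched in code using the GSS figures above. The function name `diff_of_means_ci` is hypothetical. The code keeps full precision for each SE rather than rounding to two decimals, so its bounds (about .14 and .40) differ slightly from the hand-rounded ones.

```python
from math import sqrt


def diff_of_means_ci(mean_a, sd_a, n_a, mean_b, sd_b, n_b, multiplier=2.0):
    """Return (difference, SE of difference, lower bound, upper bound)."""
    diff = mean_b - mean_a                # 1) difference between means
    se_a = sd_a / sqrt(n_a)               # 2) SE for each group
    se_b = sd_b / sqrt(n_b)
    se_diff = sqrt(se_a ** 2 + se_b ** 2) # 3) root of the sum of squared SEs
    # 4) CI built from the SE of the difference
    return diff, se_diff, diff - multiplier * se_diff, diff + multiplier * se_diff


# Group a = men, group b = women (reported number of children, 1996 GSS).
diff, se_diff, low, high = diff_of_means_ci(1.68, 1.66, 1277, 1.95, 1.69, 1612)
print(f"difference = {diff:.2f}, SE = {se_diff:.3f}, 95% CI = {low:.2f} to {high:.2f}")
print("0 inside CI:", low <= 0 <= high)
```

Either way the interval excludes 0, so the conclusion is unchanged.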
     

    Confidence interval for a proportion

    95% CI = sample proportion +/- 2 SE of proportion, where SE = SQRT ((p x (1 - p)) / n)
     
     

    proportion of Californians who speak a language aside from English at home (1% sample of CA 1990 Census)

    proportion = .316 speak a language other than English at home

    n = 267,117

    SE = SQRT ((.316 x .684)/ 267117) = .001

    95% CI = .314-.318

    shows the strong influence of sample size on CI width: with n = 267,117, the 95% CI spans only .004
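The proportion example can be checked the same way. This is a minimal sketch that uses SE = SQRT (p(1 - p) / n) and the multiplier of 2 from the notes. The function name `proportion_ci` is an assumption.

```python
from math import sqrt


def proportion_ci(p, n, multiplier=2.0):
    """Return (SE, lower bound, upper bound) for a CI around a sample proportion."""
    se = sqrt(p * (1 - p) / n)  # standard error of a proportion
    return se, p - multiplier * se, p + multiplier * se


# Proportion speaking a language other than English at home, 1990 CA census sample.
se, low, high = proportion_ci(0.316, 267_117)
print(f"SE = {se:.4f}, 95% CI = {low:.3f}-{high:.3f}")
```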
     
     

    Gallup telephone poll about taxes (4/6/01-4/8/01)

    n = 1,025 U.S. adults (random sample)

    51% think income tax paid this year is fair

    SE = SQRT ((.51 x .49) / 1025) = .016

    90% CI = +/- 1.64 SE = 48.4-53.6%

    95% CI = +/- 1.96 SE = 47.9-54.1%

    99% CI = +/- 2.58 SE = 46.9-55.1%

    95% CI if based on a sample of 60?
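The Gallup intervals, and the closing question about a sample of 60, can be worked through with a short sketch. The multipliers 1.64, 1.96, and 2.58 are the ones used above. The names `poll_ci` and `MULTIPLIERS` are hypothetical. The code does not round SE to .016 first, so its 99% bounds come out about 0.1 percentage point narrower than the hand-rounded ones.

```python
from math import sqrt

# Multipliers for each confidence level, as used in the notes.
MULTIPLIERS = {90: 1.64, 95: 1.96, 99: 2.58}


def poll_ci(p, n, z):
    """Return (lower, upper) bounds of a CI for a sample proportion."""
    se = sqrt(p * (1 - p) / n)
    return p - z * se, p + z * se


# 51% of n = 1,025 respondents think their income tax is fair.
intervals = {level: poll_ci(0.51, 1025, z) for level, z in MULTIPLIERS.items()}
for level, (low, high) in intervals.items():
    print(f"{level}% CI: {low:.1%} to {high:.1%}")

# Same 51% result, but from a sample of only 60 respondents.
small_low, small_high = poll_ci(0.51, 60, MULTIPLIERS[95])
print(f"95% CI with n = 60: {small_low:.1%} to {small_high:.1%}")
```

With n = 60 the 95% interval runs from roughly 38% to 64%, far too wide to say whether a majority considers the tax fair.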
     


  • rules for constructing CIs vary for different statistics, but interpretation remains the same

  • a 95% CI is interpreted as a 95% chance that the true population value falls in the reported range (likewise for other confidence levels)
  • standard errors often shown in graphs as "error bars"

  • inferential statistics, like CIs, generally only appropriate when applied to data from probability samples
