> dclus1 1 - level Cluster Sampling design With (15) clusters. svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc) > summary(rclus1) Survey with replicate weights: Call: as.svrepdesign(dclus1) Unstratified cluster jacknife (JK1) with 15 replicates. Variables: [1] "cds" "stype" "name" "sname" "snum" "dname" [7] "dnum" "cname" "cnum" "flag" "pcttest" "api00" [13] "api99" "target" "growth" "sch.wide" "comp.imp" "both" [19] "awards" "meals" "ell" "yr.rnd" "mobility" "acs.k3" [25] "acs.46" "acs.core" "pct.resp" "not.hsg" "hsg" "some.col" [31] "col.grad" "grad.sch" "avg.ed" "full" "emer" "enroll" [37] "api.stu" "fpc" "pw"All the analysis functions take a survey design object as one of the arguments, and use a model formula to specify variables for analysis. First look at svymean.
> svymean(~api00, dclus1) mean SE api00 644.17 23.542This asks for the mean (and standard error of the mean) for the variable api00, the year 2000 Academic Performance Index. We can ask for means of more than one variable at a time:
> svymean(~api00+api99+stype, dclus1) mean SE api00 644.169399 23.5422 api99 606.978142 24.2250 stypeE 0.786885 0.0463 stypeH 0.076503 0.0268 stypeM 0.136612 0.0296Here we have the means of 1999 and 2000 API and school type (elementary, middle, high). Note that for the factor variable stype the proportion in each category is reported. The same syntax is used to compute means from the survey object rclus1, but here the standard errors are computed from the jackknife replicate weights and are slightly different.
> svymean(~api00+api99+stype, rclus1) mean SE api00 644.169399 26.3294 api99 606.978142 26.9985 stypeE 0.786885 0.0514 stypeH 0.076503 0.0278 stypeM 0.136612 0.0332Totals are estimated with svytotal. Here we estimate the total number of students enrolled, and the total number of schools by type, across the California population.
> svytotal(~enroll+stype, dclus1) total SE enroll 3404940.13 932235.03 stypeE 4873.97 1333.32 stypeH 473.86 158.70 stypeM 846.17 167.55Note again that totals for factor variables are interpreted as total numbers in each category.
Again, the syntax is the same for the rclus1 object that incorporates replicate weights:
> svytotal(~enroll+stype, rclus1) total SE enroll 3404940.13 932235.03 stypeE 4873.97 1333.32 stypeH 473.86 158.70 stypeM 846.17 167.55
The functions for totals and means can also report the design effect, with the option deff=TRUE
> svytotal(~enroll+stype, dclus1, deff=TRUE) total SE DEff enroll 3404940.13 932235.03 31.4827 stypeE 4873.97 1333.32 52.1047 stypeH 473.86 158.70 1.7521 stypeM 846.17 167.55 1.1698though this is not currently available with replicate weights. Ratio estimates are computed with svyratio. This has two formula arguments, specifying one or more numerator variables and one or more denominator variables. In this example we estimate the proportion of students who took the API test from the number who took the test and the number enrolled.
> svyratio(~api.stu,~enroll, dclus1) Ratio estimator: svyratio.survey.design2(~api.stu, ~enroll, dclus1) Ratios= enroll api.stu 0.8497087 SEs= enroll api.stu 0.008386297Again, the syntax is the same for a version with replicate weights The population variance is estimated with svyvar, with a similar syntax to means and totals
> svyvar(~api00+api99, rclus1) variance SE api00 11122 1729.2 api99 12666 1655.3Quantiles are a more difficult estimation problem. It is easy enough to find a point estimate, but many confidence interval methods fail. There are two confidence interval calculation methods for quantiles in objects created with svydesign. The default method is substantially faster but probably less accurate for tail quantiles or for small data sets. In addition to specifying the variables and the design object, it is necessary to specify which quantiles to estimate. Here we estimate the median and quartiles.
> svyquantile(~api00, dclus1, c(.25,.5,.75), ci=TRUE) $quantiles 0.25 0.5 0.75 api00 551.75 652 717.5 $CIs , , api00 0.25 0.5 0.75 (lower 493.2835 564.3250 696.0000 upper) 622.6495 710.8375 761.1355 > dclus1<-svydesign(id=~dnum, weights=~pw, data=apiclus1, fpc=~fpc) > (qapi<-svyquantile(~api00, dclus1, c(.25,.5,.75),ci=TRUE, interval.type="score"))) $quantiles 0.25 0.5 0.75 api00 551.75 652 717.5 $CIs , , api00 0.25 0.5 0.75 (lower 514.9998 581.9996 669.0003 upper) 632.0005 699.9997 749.0001 > SE(qapi) 0.25 0.5 0.75 29.84766 30.10264 20.40850The confidence intervals are not symmetric and so cannot be generated by adding and subtracting 1.96 standard errors. Nonetheless, a reasonable estimate of the standard error is the length of the confidence interval divided by (2x1.96).
The syntax for replicate weights is similar. Again, there are two methods of variance estimation. The default is valid for all types of replicate weights and is based on computing a confidence interval for the probability and transforming it. The alternative, directly using the variance of replicates, is not valid for jackknife weights.