
Survey analysis in R

This is the homepage for the "survey" package, which provides facilities in R for analyzing data from complex surveys. The current version is 3.18. A much earlier version (2.2) was published in the Journal of Statistical Software.

A port of an older version of the package (version 3.6-8) to S-PLUS 8.0 is available from CSAN (thanks to Patrick Aboyoun at Insightful).

Features:

The NEWS file gives a history of features and bug fixes.

Comparison shopping:
Alan Zaslavsky keeps a comprehensive list of survey analysis software for the ASA Section on Survey Research Methods.


Using the survey package:

Some examples (in PDF) translated from Stata and SUDAAN examples at UCLA Academic Technology Services.

Notes on the sparse matrix algorithms used in version 3.15 for two-phase designs (and perhaps more widely in future versions).

Notes on standard errors for survival curves.

A 2009 CDC report compared five other survey analysis packages in the context of the Youth Risk Behavior Survey. I have written an extension that makes the same feature and results comparisons for R and the survey package. Some of this is copied from the CDC report (which I believe is in the public domain), but its authors are (of course) not responsible for any of the conclusions or results.


Tutorials

Norman Breslow and I gave a course at STATISTICALPS 2009, at the beginning of September in the Italian Alps. The course included an introduction to the survey package, but focused on two-phase designs in epidemiology. We will have some code and data up soon.

Slides from a short tutorial at the US Census Bureau, August 10.

I gave a tutorial at useR 2009, on the afternoon of July 7, 2009.

A brief 1.5-hour introduction to R, including a bit on the survey package, at the AAPOR conference, Friday May 15, 10:30am.

There was a one-day course at the University of Copenhagen Center for Health and Society on April 3, 2009. Slides are available at that link.

Tobias Verbeke has packaged data sets and exercises from Sharon Lohr's Sampling: Design and Analysis for use with the survey package.

I gave a short course for the Washington Statistical Society on March 15-16, 2007. The first day was on R, and the slides are a selection from these. The second day was on the survey package; slides here.

Norman Breslow and I gave a short course on complex survey designs for epidemiology at the 2008 WNAR (Biometric Society) meeting, UC Davis, June 22, 2008. My sessions were an overview of the survey package and an introduction to calibration. Norm's data sets and code are also online.

There is an article on version 3.6-12 of the package in the January 2008 issue of Survey Statistician (note: large PDF file).


Book

I have been writing a book, Complex Surveys: A Guide to Analysis Using R. It will be published by John Wiley & Sons in January 2010. It already has a web site.


Help pages:

The PEAS project at Napier University has Practical Exemplars for the Analysis of Surveys using R (as well as other packages).
Thomas Lumley