Datasets and R Code for IEA Short Course
Porto Alegre, Brazil, September 2008

Regression Analysis of Two-Phase Stratified Case-Control Data

Datasets, documentation, and R code for "weighted likelihood" or "Horvitz-Thompson" analysis of two-phase stratified data with binary (case-control) outcomes, as described by N.E. Breslow and N. Chatterjee, "Design and analysis of two-phase studies with binary outcome applied to Wilms tumour prognosis," Applied Statistics 48:457-68, 1999. The R code is to be used in conjunction with Thomas Lumley's R survey package.
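As a rough illustration of the Horvitz-Thompson idea, the sketch below fits a weighted logistic regression to a two-phase stratified sample with the survey package. It is not the course code: the data are simulated, and the names phase1, stratum, in.phase2, x, y and z are made up for the example. Phase two keeps every case and a fixed number of controls from each stratum, and twophase() derives the sampling weights from the phase-one and phase-two stratum counts.

```r
library(survey)

set.seed(2008)
n <- 2000

# Phase one: outcome y and a cheap stratifying variable z, known for everyone.
z <- rbinom(n, 1, 0.3)
# x stands in for an expensive covariate; in a real study it would be
# observed only for subjects selected at phase two.
x <- rnorm(n, mean = 0.8 * z)
y <- rbinom(n, 1, plogis(-2 + 0.7 * x))
phase1 <- data.frame(id = seq_len(n), y = y, z = z, x = x)
phase1$stratum <- interaction(phase1$y, phase1$z)

# Phase two: keep every case, plus 150 controls drawn from each level of z.
phase1$in.phase2 <- phase1$y == 1
for (level in 0:1) {
  controls <- which(phase1$y == 0 & phase1$z == level)
  phase1$in.phase2[sample(controls, 150)] <- TRUE
}

# Two-phase design object: simple random sampling at phase one,
# stratified sampling on (y, z) at phase two.
design <- twophase(id = list(~id, ~id), strata = list(NULL, ~stratum),
                   subset = ~in.phase2, data = phase1)

# Weighted (Horvitz-Thompson) logistic regression with design-based
# standard errors.
fit <- svyglm(y ~ x, design = design, family = quasibinomial())
summary(fit)
```

The quasibinomial family is used only to avoid warnings about non-integer weighted counts; the point estimates are those of a weighted logistic regression.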

Cox Regression Analysis of Stratified Case-Cohort Data

Dataset from the National Wilms Tumor Study (NWTS), originally used by M. Kulich and D.Y. Lin, "Improving the efficiency of relative-risk estimation in case-cohort studies," J Amer Statis Assoc 99:832-844, 2004, together with R code for the analysis of the NWTS data using weighted likelihood or Horvitz-Thompson estimating equations, with weights possibly adjusted by calibration or estimation. The code is to be used in conjunction with Thomas Lumley's R survey package. For theoretical background see

The dataset is drawn from the third and fourth Wilms tumor studies (NWTS-3 and NWTS-4). Please reference this website together with the following publications in any work that makes use of the data:

Results of a simulation study comparing the properties of estimates obtained from these data using adjusted (calibrated or estimated) weights with those obtained from standard methods of analysis of (stratified) case-cohort data are available in a manuscript currently under consideration for publication.
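For orientation, a weighted Cox analysis of case-cohort data, with and without calibrated weights, might look roughly like the sketch below. It is not the code distributed here. It assumes that the nwtco data frame shipped with the R survival package (variables seqno, rel, edrel, stage, histol, instit, age, in.subcohort) is an acceptable stand-in for the dataset described above, and the choice of calibration variables is purely illustrative.

```r
library(survival)
library(survey)

# Assumption: survival::nwtco is used as a stand-in for the course dataset.
data(nwtco, package = "survival")

# Case-cohort design: phase two is the random subcohort plus all relapses,
# so phase-two sampling is stratified on relapse status.
dcch <- twophase(id = list(~seqno, ~seqno), strata = list(NULL, ~rel),
                 subset = ~I(in.subcohort | rel), data = nwtco)

# Horvitz-Thompson (weighted) Cox regression with the design weights.
fit_ht <- svycoxph(Surv(edrel, rel) ~ factor(stage) + factor(histol) + I(age / 12),
                   design = dcch)

# Calibrate the phase-two weights to phase-one totals of variables known
# for the whole cohort (illustrative choice: relapse, stage, local histology).
dcal <- calibrate(dcch, phase = 2, formula = ~interaction(rel, stage, instit))

fit_cal <- svycoxph(Surv(edrel, rel) ~ factor(stage) + factor(histol) + I(age / 12),
                    design = dcal)

# Compare point estimates and standard errors under the two sets of weights.
cbind(ht = coef(fit_ht), cal = coef(fit_cal))
cbind(ht = SE(fit_ht), cal = SE(fit_cal))
```

Calibration leaves the estimating equations unchanged and only adjusts the weights, so any gain shows up as smaller standard errors when the calibration variables are informative about the phase-two covariates.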