Biostatistics: van Belle, Fisher, Heagerty, Lumley

Chapter 16: Analysis of the Time to an Event

Datasets

Primary Biliary Cirrhosis data and documentation (from Dr Scott Emerson)

The CHS Stroke prediction model, a Java applet showing the results of a predictive model for stroke risk in elderly people in the US.

Setting up data for recurrent event analyses:
Here we show how to perform three popular types of recurrent event analysis on a data set from Appendix D of Fleming & Harrington (1991), also used by Therneau & Hamilton (1997) and Grambsch & Therneau (2003). The data come from a randomised trial of interferon-gamma, an immune system messenger, for treating chronic granulomatous disease (CGD), a group of rare genetic disorders in which the immune response to bacterial infection is impaired. A total of 128 patients were randomised to three injections per week of either interferon-gamma or placebo, with time to first serious infection as the outcome measurement. The study stopped early when it became obvious that the treatment was successful. After all the data were collected there were 14 first serious infections in the interferon-gamma group and 30 in the control group; counting serious infections after the first, the totals were 20 and 56 respectively.

There are 203 records in the data set, each containing an identification variable id, descriptive variables including treatment group, and the information on time to infection. For each time to infection there is a sequence number (1, 2, 3, ...), a duration, a start and stop time, and a censoring/infection indicator. The start time is zero for the first infection; for each subsequent infection it is one day after the time of the previous one.
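The record layout just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name build_records, the field names, the input values, and the choice duration = stop - start are all hypothetical and are not taken from the CGD data set or its documentation.

```python
def build_records(patient_id, infection_days, last_day):
    """Build one record per at-risk interval for a single patient.

    infection_days: sorted study days on which serious infections occurred.
    last_day: final day of follow-up (hypothetical input).
    """
    records = []
    start = 0  # the first interval starts at time zero
    for seq, day in enumerate(infection_days, start=1):
        records.append({"id": patient_id, "seq": seq, "start": start,
                        "stop": day, "duration": day - start, "status": 1})
        start = day + 1  # next interval starts one day after this infection
    if start < last_day:
        # Follow-up continued after the last infection: add a censored record.
        records.append({"id": patient_id, "seq": len(infection_days) + 1,
                        "start": start, "stop": last_day,
                        "duration": last_day - start, "status": 0})
    return records


# Made-up patient: infections on days 100 and 250, followed until day 400.
for row in build_records(1, [100, 250], 400):
    print(row)
```

A patient with no infections contributes a single censored record running from zero to the end of follow-up, so the number of records per patient is the number of infections plus, usually, one.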

The first analysis is a simple generalisation of the Cox model in which the rates of new events for any two individuals are assumed to be proportional at any given time since the start of the study, regardless of how many events each has already had. This is often called the Andersen–Gill model.
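As a rough sketch of what this means for the risk sets, the Python snippet below lists who is at risk at a given study time when everyone under observation is compared on the same clock. The records are made up, the function name is hypothetical, and the convention that an interval covers start < t <= stop is an assumption rather than something stated in the data documentation.

```python
# Hypothetical records: (id, seq, start, stop, status)
records = [
    (1, 1, 0, 100, 1),
    (1, 2, 101, 250, 1),
    (1, 3, 251, 400, 0),
    (2, 1, 0, 180, 1),
    (2, 2, 181, 300, 0),
    (3, 1, 0, 350, 0),
]


def risk_set_andersen_gill(records, t):
    """Ids at risk at study time t, ignoring how many events each has had."""
    return sorted({pid for pid, seq, start, stop, status in records
                   if start < t <= stop})


print(risk_set_andersen_gill(records, 180))  # [1, 2, 3]
print(risk_set_andersen_gill(records, 380))  # [1]
```

At day 180 patient 1 is on a second at-risk interval while patients 2 and 3 are still on their first, yet all three fall in the same risk set.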
In the second type of analysis we compare individuals only to others who have had the same number of events. This is called a conditional or transitional model. Here we give three examples:

Time is time from the start of the study, hazard ratios do not change.
Time is time from the most recent event, hazard ratios do not change.
Time is time from the most recent event, hazard ratios may change after each event.
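The difference between the two time scales can be illustrated with a small Python sketch of the risk sets. The records, the function name, and the clock argument are hypothetical, and the interval convention start < t <= stop is an assumption. Restricting to a single sequence number is what limits the comparison to individuals with the same number of previous events.

```python
# Hypothetical records: (id, seq, start, stop, status)
records = [
    (1, 1, 0, 100, 1),
    (1, 2, 101, 250, 1),
    (1, 3, 251, 400, 0),
    (2, 1, 0, 180, 1),
    (2, 2, 181, 300, 0),
    (3, 1, 0, 350, 0),
]


def risk_set_conditional(records, t, seq, clock="total"):
    """Ids at risk at time t among records with the given sequence number.

    clock="total": t is time since the start of the study.
    clock="gap":   t is time since the start of the current interval.
    """
    at_risk = []
    for pid, s, start, stop, status in records:
        if s != seq:
            continue  # only compare people with the same event history count
        if clock == "total":
            in_interval = start < t <= stop
        else:
            in_interval = 0 < t <= stop - start  # clock restarts at zero
        if in_interval:
            at_risk.append(pid)
    return sorted(at_risk)


print(risk_set_conditional(records, 200, seq=2, clock="total"))  # [1, 2]
print(risk_set_conditional(records, 130, seq=2, clock="gap"))    # [1]
```

Letting hazard ratios change after each event does not alter these risk sets; it would correspond to estimating a separate treatment effect within each sequence number.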

In the third type of analysis time is again time from the start of the study, but the first, second, third, ... infections are treated as different types of event. Each type receives its own baseline hazard function, and an individual is treated as "at risk" for the second event even before the first event has occurred. While this sounds strange, it does turn out to be useful in some circumstances.
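One way to picture the data layout for this third analysis, often called a marginal model, is sketched below in Python. Every patient gets one record for each event number, all starting at time zero. The function name, the field names, the max_events argument, and the input values are hypothetical and do not come from the CGD data set.

```python
def marginal_records(patient_id, infection_days, last_day, max_events):
    """One record per event number, each at risk from time zero.

    infection_days: sorted study days of observed infections.
    last_day: final day of follow-up (hypothetical input).
    max_events: how many event numbers to model (an analyst's choice).
    """
    rows = []
    for k in range(1, max_events + 1):
        if k <= len(infection_days):
            stop, status = infection_days[k - 1], 1  # k-th infection observed
        else:
            stop, status = last_day, 0  # k-th infection never observed
        rows.append({"id": patient_id, "seq": k, "start": 0,
                     "stop": stop, "status": status})
    return rows


# Made-up patient with a single infection on day 100, followed to day 400.
for row in marginal_records(1, [100], 400, max_events=3):
    print(row)
```

The records for the second and third infections both run from day 0 to day 400, so this patient counts as at risk for a second infection during the first 100 days, before the first infection has happened.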