BIOST 540, Spring 2018
Analysis of Correlated Data
Assignments

  • Week 1: article discussion on Tuesday 03 April 2018
    • Reading: Overview chapter from van Belle, Fisher, Heagerty and Lumley (2004)
    • Applications: please locate a journal article that uses longitudinal or multilevel data analysis methods (e.g. look for "change" analysis, or mixed models, or GEE) -- be prepared to discuss the following aspects:
      • Population
      • Scientific question(s)
      • Analysis approaches
      • Issues (missing data, model selection, time-dependent covariate, etc.)
    • Summary Form to complete: CorrelatedData-PaperReview.doc (MS Word document)

  • Week 2:
    • Reading: DHLZ Chapters 3 and 4; or FLW Chapters 2 and 3

    • Analysis: MACS CD4 and viral load data

      • MACS-cd4-vload0.raw --- Multicenter AIDS Cohort Data
      • MACS-cd4-vload0.txt --- documentation

      • MACS-cd4-vload0-999.raw --- Multicenter AIDS Cohort Data with -999 rather than NA for missing values

      • [1] Input the data and summarize the distribution of baseline viral load.
      • [2] Summarize the distribution of CD4 in years 1, 2, 3, and 4.
      • [3] Summarize the mean CD4 (and standard deviation) in years 1, 2, 3, and 4 separately for groups based on baseline viral load (see Table 18.1 of VFHL; Table 1.1 of PDF above).
      • [4] Plot individual series of longitudinal observations for a selection of subjects.
      • [5] Characterize the correlation among CD4 measurements.
      • [6] Compute a slope over time for each subject and summarize these slopes.
      • [7] Plot slopes versus the log baseline viral load. Is there an apparent association between the baseline viral load and the subsequent rate of decline? (see Figure 18.5 in VFHL; Figure 1.5 of PDF above).
      • [8] Use linear mixed models to evaluate whether the level of CD4 and/or the rate of decline in CD4 is associated with the baseline viral load.
      • [9] Provide an interpretation for the estimates of the variance for the random effects (random intercepts, random slopes).
      • [10] Use generalized estimating equations (GEE) to evaluate whether the level of CD4 and/or the rate of decline in CD4 is associated with the baseline viral load.
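Task [6] asks for a slope over time for each subject. A minimal sketch of that step follows, assuming the data have already been read into long-format (subject, year, CD4) tuples with -999 marking missing values, as in MACS-cd4-vload0-999.raw. The tuple layout, the function name, and the toy values are hypothetical and are not taken from the MACS file or its documentation.

```python
from collections import defaultdict
from statistics import mean

MISSING_CODE = -999  # sentinel used in the "-999" version of the data file


def subject_slopes(records):
    """Return {subject_id: least-squares slope of CD4 on time}.

    records: iterable of (subject_id, time, cd4) tuples (assumed layout).
    Missing CD4 values are dropped; subjects left with fewer than two
    distinct observation times get no slope.
    """
    by_subject = defaultdict(list)
    for subject_id, time, cd4 in records:
        if cd4 == MISSING_CODE:
            continue  # treat the sentinel as missing, not as a CD4 count
        by_subject[subject_id].append((time, cd4))

    slopes = {}
    for subject_id, points in by_subject.items():
        t_bar = mean(t for t, _ in points)
        y_bar = mean(y for _, y in points)
        sxx = sum((t - t_bar) ** 2 for t, _ in points)
        if sxx == 0:
            continue  # a single time point cannot define a slope
        sxy = sum((t - t_bar) * (y - y_bar) for t, y in points)
        slopes[subject_id] = sxy / sxx
    return slopes


# Made-up toy records, for illustration only: (subject, year, CD4)
toy = [
    (1, 1, 800), (1, 2, 700), (1, 3, 600), (1, 4, 500),
    (2, 1, 500), (2, 2, -999), (2, 3, 460), (2, 4, 440),
    (3, 1, 650), (3, 2, -999), (3, 3, -999), (3, 4, -999),
]
slopes = subject_slopes(toy)
print(slopes)
```

The resulting dictionary can then be summarized (mean, quantiles) or plotted against log baseline viral load for task [7]. Leaving -999 in as a number would badly distort the slopes, which is why the sentinel is filtered first.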

    • Comments on Exercise:

  • Week 3:
    • Reading: Estimation and Inference for LMM -- DHLZ Chpts 4, 5; FLW Chpts 8 (and 7)

    • Analysis: MACS CD4 and viral load data

      • [Note] I needed to add the option "clear" to my use of "statsby".
      • [Note] The chapter presents the cut-offs for L,M,H VIRAL0 as rounded values -- I used slightly different values in analysis.
      • [1] Do you think the regression analysis presented in VFHL (2004) should adjust for CD8? Justify.
      • [2] Do you think the regression analysis presented in VFHL (2004) should adjust for AGE of the subject? Justify.
      • [3] The analysis in VFHL (2004) used categories of viral load at baseline (VLOAD0). However, the scatterplot presented on page 740 (handout p. 21) shows a solid line fit to the points. What regression model would allow the CD4 slope to decrease linearly with increasing logVLOAD0?
      • [4] Note that an interpretation is given on page 750 (handout p. 37) for the coefficient of MONTH. The estimate for this coefficient is shown as -5.398 using a linear mixed model with random intercepts and slopes. However, on page 740 (handout p. 20) the average slope for the low viral load group is estimated as -5.715. These values are similar -- does the coefficient in the linear mixed model also have an interpretation as an average slope? Justify.
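Question [3] concerns a model in which the CD4 slope changes linearly with log baseline viral load. One candidate formulation, offered as a sketch rather than as the model used in VFHL, adds a MONTH by logVLOAD0 interaction to a linear mixed model with random intercepts and slopes. The symbols $\beta_k$, $b_{0i}$, $b_{1i}$, and $\varepsilon_{ij}$ are notation assumed here, not taken from the handout.

```latex
\begin{align*}
\mathrm{CD4}_{ij} &= \beta_0 + \beta_1\,\mathrm{MONTH}_{ij}
  + \beta_2\,\log\mathrm{VLOAD0}_i
  + \beta_3\,\mathrm{MONTH}_{ij}\times\log\mathrm{VLOAD0}_i \\
  &\quad + b_{0i} + b_{1i}\,\mathrm{MONTH}_{ij} + \varepsilon_{ij}, \\
\frac{\partial\,E[\mathrm{CD4}_{ij}]}{\partial\,\mathrm{MONTH}_{ij}}
  &= \beta_1 + \beta_3\,\log\mathrm{VLOAD0}_i .
\end{align*}
```

Under this formulation the mean slope is a linear function of logVLOAD0, and a negative $\beta_3$ would correspond to faster decline at higher baseline viral load.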

    • Comments on Exercise:

    • Discussion: review exercises for discussion on Tuesday 17 April 2018

  • Week 4:
    • Reading: Mean Models -- FLW Chpts 5 and 6

    • Analysis: TLC Data (see Lecture section of web page)

      • [Note] The TLC data are in a "wide" format with one record per row (each row holds one subject's data).
      • [1] Analyze blood lead at each of the three follow-up times using simple cross-sectional methods (i.e. t-test, and estimated means). Compare this to our estimates/inference obtained using a mixed model (notes pp. 201-202, and 205-207).
      • [2] Another approach would analyze the change-since-baseline in blood lead at each of the three follow-up times using simple cross-sectional methods (i.e. t-test, and estimated means). Compare this to our estimates/inference obtained using a mixed model (notes pp. 201-202, and 205-207).
      • [3] A third approach would analyze the blood lead at each of the three follow-up times using ANCOVA. This is conducted by regressing the outcome at Time(j) on the treatment indicator and the baseline response. Compare this to our estimates/inference obtained using a mixed model (notes pp. 201-202, and 205-207).
      • [4] Inspect the coefficient of the baseline response in each of the three ANCOVA models fit in [3]. Are these coefficients approximately the same magnitude, or is there a trend in the values? Would you expect these coefficients to be the same? Justify.
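The three cross-sectional approaches in [1] to [3] can be compared side by side at a single follow-up time. The sketch below assumes each subject is a (treated, baseline, outcome) tuple. With a binary treatment indicator and one covariate, the least-squares fit of the outcome on treatment and baseline reduces to the pooled within-group slope, which is used here in place of a regression library. The function name and the toy values are hypothetical; the values are constructed to lie exactly on two parallel lines and are not TLC data.

```python
from statistics import mean


def ancova(rows):
    """Least-squares fit of outcome ~ treatment + baseline.

    rows: (treated, baseline, outcome) tuples, treated coded 0 or 1.
    Returns (adjusted treatment effect, baseline coefficient).
    """
    sxx = sxy = 0.0
    group_means = {}
    for arm in (0, 1):
        x = [r[1] for r in rows if r[0] == arm]
        y = [r[2] for r in rows if r[0] == arm]
        x_bar, y_bar = mean(x), mean(y)
        group_means[arm] = (x_bar, y_bar)
        # accumulate within-group sums of squares and cross-products
        sxx += sum((xi - x_bar) ** 2 for xi in x)
        sxy += sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta_baseline = sxy / sxx  # pooled within-group slope
    (x0, y0), (x1, y1) = group_means[0], group_means[1]
    effect = (y1 - y0) - beta_baseline * (x1 - x0)
    return effect, beta_baseline


# Made-up toy rows, for illustration only: (treated, baseline, outcome)
toy = [
    (0, 20, 20), (0, 24, 22), (0, 28, 24),
    (1, 22, 15), (1, 26, 17), (1, 30, 19),
]

treated = [r for r in toy if r[0] == 1]
control = [r for r in toy if r[0] == 0]

# [1] difference in mean outcome at follow-up
raw_diff = mean(r[2] for r in treated) - mean(r[2] for r in control)
# [2] difference in mean change since baseline
change_diff = (mean(r[2] - r[1] for r in treated)
               - mean(r[2] - r[1] for r in control))
# [3] ANCOVA: treatment effect adjusted for baseline
ancova_effect, beta_baseline = ancova(toy)

print(raw_diff, change_diff, ancova_effect, beta_baseline)
```

The change-score analysis corresponds to fixing the baseline coefficient at 1 and the unadjusted comparison to fixing it at 0, while ANCOVA estimates it from the data. Applying `ancova` separately at each follow-up time gives the baseline coefficients that question [4] asks about.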

    • Comments on Exercise:

  • Week 5:

Last Updated: 28 March 2017

Contact the instructor at: heagerty@u.washington.edu