As described earlier, if the :allow-missing flag is set to t, the program allows different observation times to be present in different groups. It is convenient to be able to analyse data that are unbalanced, whether by design or because of a small number of missing observations. When data are missing, however, and especially in longitudinal studies, it is important to consider the missingness mechanism. GEEs, like other moment-based estimators, are valid with data missing completely at random (MCAR) but not necessarily with data missing at random (MAR). Likelihood ratios, on the other hand, are preserved when data are missing at random, which implies that direct likelihood inference, Bayesian inference, and asymptotic frequentist inference based on maximum likelihood estimates all remain valid.
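For reference, the two conditions can be written out as follows. The notation is an assumption introduced here, not taken from the program or its documentation: R is the vector of missingness indicators, Y_obs and Y_mis are the observed and unobserved parts of the response, theta indexes the response model, psi indexes the missingness model, and covariates are suppressed.

```latex
\begin{align*}
\text{MCAR:}\quad & \Pr(R \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}; \psi) = \Pr(R; \psi), \\
\text{MAR:}\quad  & \Pr(R \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}; \psi) = \Pr(R \mid Y_{\mathrm{obs}}; \psi).
\end{align*}
% Under MAR the missingness term does not involve Y_mis, so it factors out of the integral:
\begin{align*}
f(Y_{\mathrm{obs}}, R; \theta, \psi)
  &= \int f(Y_{\mathrm{obs}}, Y_{\mathrm{mis}}; \theta)\,
       \Pr(R \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}; \psi)\, dY_{\mathrm{mis}} \\
  &= \Pr(R \mid Y_{\mathrm{obs}}; \psi)\, f(Y_{\mathrm{obs}}; \theta).
\end{align*}
```

If theta and psi are distinct parameters, the first factor cancels from any likelihood ratio comparing two values of theta, which is the sense in which likelihood ratios are preserved under MAR.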
This disadvantage of moment-based methods is more theoretical than practical, for three reasons. First, and most obviously, likelihood methods have these theoretical properties only when they use the right likelihood. Score equations based on an incorrectly or incompletely modelled likelihood have no necessary advantage over any other estimating functions in handling missing data. Second, although correctly modelled likelihood estimation can handle any MAR data pattern, at least asymptotically, other estimation methods will typically also give correct answers with some MAR data patterns. For example, Liang & Zeger (1986) claim that the Normal-model GEE is consistent when the missingness depends on any number of previous observations, and the binomial-model GEE when the missingness depends on any single previous observation. Finally, and most importantly, it is the exception rather than the rule that data are missing at random, especially if there is any measurement error in the response, because the MAR condition requires that the missingness depend only on the recorded values of the non-missing observations and not on the underlying true response. If there is any substantial amount of missing data, no analysis that ignores the missingness mechanism should be trusted.
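The distinctions above can be illustrated with a small simulation. The sketch below is written in Python for convenience and is not part of the program. Every name in it (simulate, the mechanism labels, the dropout model) is hypothetical. Two visits are simulated with true mean zero at each, and the second response may be missing. The available-case mean stands in for a moment-based estimate that ignores the within-subject correlation. The regression estimate, which predicts the second response from the first among completers and averages over everyone, coincides with the maximum likelihood estimate under a bivariate Normal model with this monotone dropout pattern. In the "latent" mechanism, dropout depends on the true first value rather than the recorded one, so measurement error breaks the MAR condition.

```python
import math
import random


def simulate(mechanism, n=100_000, rho=0.7, noise_sd=1.0, seed=1):
    """Estimate E[y2] (true value 0) when the second visit may be missing.

    Returns (available_case_mean, regression_estimate).
    All names and the dropout model are illustrative assumptions.
    """
    rng = random.Random(seed)
    all_y1 = []    # first-visit responses, always observed
    complete = []  # (y1, y2) pairs for subjects seen at both visits
    for _ in range(n):
        # Latent true values: standard bivariate Normal with correlation rho.
        u1 = rng.gauss(0.0, 1.0)
        u2 = rho * u1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        # Recorded responses include independent measurement error.
        y1 = u1 + rng.gauss(0.0, noise_sd)
        y2 = u2 + rng.gauss(0.0, noise_sd)
        if mechanism == "mcar":      # unrelated to any response
            p_observed = 0.5
        elif mechanism == "mar":     # depends on the recorded first response
            p_observed = 1.0 / (1.0 + math.exp(-2.0 * y1))
        elif mechanism == "latent":  # depends on the unrecorded true value
            p_observed = 1.0 / (1.0 + math.exp(-2.0 * u1))
        else:
            raise ValueError(f"unknown mechanism: {mechanism}")
        all_y1.append(y1)
        if rng.random() < p_observed:
            complete.append((y1, y2))

    m = len(complete)
    mean_x = sum(x for x, _ in complete) / m
    mean_y = sum(y for _, y in complete) / m  # available-case mean of y2
    sxx = sum((x - mean_x) ** 2 for x, _ in complete)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in complete)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Predict y2 at the mean of y1 over all subjects, not just completers.
    regression_estimate = intercept + slope * (sum(all_y1) / n)
    return mean_y, regression_estimate


results = {name: simulate(name) for name in ("mcar", "mar", "latent")}
for name, (moment, regression) in results.items():
    print(f"{name:7s} available-case mean = {moment:+.3f}   "
          f"regression estimate = {regression:+.3f}")
```

With these settings both estimates should be close to zero under MCAR, only the regression estimate should be close to zero under MAR, and neither should be under the latent mechanism. Because the available-case mean ignores the correlation between visits, its bias under MAR says nothing about a GEE fitted with a correctly specified correlation structure, which is the case the Liang & Zeger claim concerns.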