Date on Master's Thesis/Doctoral Dissertation


Document Type

Master's Thesis

Degree Name



Bioinformatics and Biostatistics

Committee Chair

Kong, Maiying


Longitudinal method


Clustered longitudinal data arise when repeated measurements are collected over time on subjects who are themselves grouped into clusters. Examples include longitudinal community intervention studies and family studies with repeated measures on each member. Cluster size is sometimes informative, meaning that the risk for the outcome is related to the cluster size. In this situation, generalized estimating equations (GEE) lead to invalid inferences because GEE assumes that the cluster size is non-informative. In this study, we investigated the performance of GEE, cluster-weighted generalized estimating equations (CWGEE), and within-cluster resampling (WCR) on clustered longitudinal data. Based on our extensive simulation studies, we conclude that all three methods provide comparable estimates when the cluster size is non-informative. When cluster size is informative, however, GEE gives biased estimates, while WCR and CWGEE still provide unbiased and consistent estimates under different within-subject "working correlation structures". Because WCR is computationally intensive, whereas CWGEE requires solving only one estimating equation and is asymptotically equivalent to WCR, CWGEE is the best choice for clustered longitudinal data.
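The contrast between the three methods can be illustrated with a small simulation. The sketch below is not the simulation design used in the thesis; it is a deliberately simplified, intercept-only toy example with one observation per cluster member (the longitudinal dimension is omitted). All function names, cluster sizes, and parameter values are assumptions chosen for illustration. Under an independence working correlation, the intercept-only GEE estimate reduces to the pooled mean, the CWGEE estimate (weights equal to the inverse cluster size) reduces to the mean of the cluster means, and WCR averages the estimates obtained from repeatedly sampling one member per cluster.

```python
import random
import statistics


def simulate_clusters(n_clusters=2000, seed=0):
    """Simulate clusters whose size depends on a cluster-level random effect.

    Assumed toy design: clusters with a positive random effect have 8 members,
    the others have 2, so cluster size is informative. The mean outcome for a
    typical member of a typical cluster is 0.
    """
    rng = random.Random(seed)
    clusters = []
    for _ in range(n_clusters):
        b = rng.gauss(0.0, 1.0)          # cluster-level random effect
        size = 8 if b > 0 else 2         # size is tied to the random effect
        clusters.append([b + rng.gauss(0.0, 1.0) for _ in range(size)])
    return clusters


def pooled_mean(clusters):
    """Intercept-only GEE (independence): every observation weighted equally."""
    values = [y for cluster in clusters for y in cluster]
    return sum(values) / len(values)


def cluster_weighted_mean(clusters):
    """Intercept-only CWGEE: observations weighted by 1 / cluster size."""
    return sum(sum(c) / len(c) for c in clusters) / len(clusters)


def wcr_mean(clusters, n_resamples=200, seed=1):
    """WCR: draw one member per cluster, estimate, and average over resamples."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_resamples):
        draw = [rng.choice(cluster) for cluster in clusters]
        estimates.append(sum(draw) / len(draw))
    return statistics.fmean(estimates)


if __name__ == "__main__":
    data = simulate_clusters()
    print("GEE (pooled mean):      ", round(pooled_mean(data), 3))
    print("CWGEE (mean of means):  ", round(cluster_weighted_mean(data), 3))
    print("WCR (resampled average):", round(wcr_mean(data), 3))
```

In this toy setting the pooled mean is pulled toward the larger, higher-risk clusters, while the cluster-weighted and resampled estimates stay close to each other and to zero. The WCR estimate requires hundreds of passes over the data, whereas the cluster-weighted estimate is computed once, which mirrors the computational argument made in the abstract.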