Date on Master's Thesis/Doctoral Dissertation


Document Type

Doctoral Dissertation

Degree Name

Ph. D.

Department

Bioinformatics and Biostatistics

Degree Program

Biostatistics, PhD

Committee Chair

Kong, Maiying

Committee Co-Chair (if applicable)

Zheng, Qi

Committee Member

Zheng, Qi

Committee Member

Mitra, Riten

Committee Member

Gaskins, Jeremy

Committee Member

Egger, Michael

Author's Keywords

Causal Inference; ATE; generalized propensity score; censoring weights; mediation analysis; health disparities


This dissertation comprises two projects on causal inference from observational data. In healthcare research, where observational data such as claims data and electronic records are abundant, researchers often aim to study a treatment effect and the pathway through which it operates. Estimating treatment effects from observational data is challenging, however, because of confounding. The first project focuses on estimating the effect of a continuous treatment on survival outcomes, while the second concentrates on mediation analysis, which allows the pathway of the causal effect to be explored. Both projects involve adjusting for confounding variables. In the first project, I investigate estimation of the average treatment effect (ATE) of a continuous treatment on a time-to-event outcome, adjusting for multiple confounding factors and accounting for censored observations. To adjust for confounding factors, various propensity score methods, such as multinomial regression and covariate balancing propensity score models, are used to estimate the ATE via the inverse probability of treatment weighting (IPTW) method. For continuous treatments, the IPTW is generated from the covariate balancing generalized propensity score. To remedy the possible bias that censoring introduces when estimating the ATE for time-to-event data, we incorporate censoring weights. We propose using both the IPTW and the censoring weights (a double weighting approach) to estimate the ATE through a marginal structural accelerated failure time (AFT) model, in which the IPTW adjusts for confounding factors and the censoring weights correct for the impact of censored observations. Comprehensive simulation studies demonstrated that the proposed method performed well. We applied the method to examine whether blood lead level affects the time to death of older people in the United States, using data from the NHANES III survey.
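The double weighting idea in the first project can be illustrated with a minimal sketch. It is not the dissertation's implementation, and it rests on several assumptions not stated in the abstract: the generalized propensity score is modeled with a plain normal linear regression (a simple stand-in for the covariate balancing generalized propensity score), the censoring weights come from a Kaplan-Meier estimate of the censoring distribution, censoring is independent of the covariates, there are no tied times, and the marginal structural AFT model is fitted by weighted least squares on log time. All function names and the simulated data are hypothetical.

```python
import numpy as np

def stabilized_gps_weights(a, x):
    """Stabilized IPTW f(A) / f(A | X) for a continuous treatment, assuming
    normal densities (a stand-in for a covariate balancing GPS)."""
    design = np.column_stack([np.ones(len(a)), x])
    coef, *_ = np.linalg.lstsq(design, a, rcond=None)
    resid_cond = a - design @ coef   # residuals of A given X
    resid_marg = a - a.mean()        # residuals of A alone

    def normal_pdf(r):
        s = r.std()
        return np.exp(-0.5 * (r / s) ** 2) / (s * np.sqrt(2 * np.pi))

    return normal_pdf(resid_marg) / normal_pdf(resid_cond)

def censoring_weights(time, event):
    """Inverse probability of censoring weights event / G(T-), where G is the
    Kaplan-Meier estimate of the censoring survival function (no ties assumed)."""
    n = len(time)
    order = np.argsort(time, kind="stable")
    censored = 1 - event[order]
    at_risk = n - np.arange(n)
    surv = np.cumprod(1.0 - censored / at_risk)        # G just after each time
    surv_before = np.concatenate([[1.0], surv[:-1]])   # G just before each time
    g = np.empty(n)
    g[order] = surv_before
    return np.where(event == 1, 1.0 / np.clip(g, 1e-8, None), 0.0)

def double_weighted_aft(time, event, a, x):
    """Fit log(T) = b0 + b1 * A by least squares weighted with IPTW x IPCW."""
    w = stabilized_gps_weights(a, x) * censoring_weights(time, event)
    design = np.column_stack([np.ones(len(a)), a])
    root_w = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(design * root_w[:, None],
                               np.log(time) * root_w, rcond=None)
    return coef, w

# Simulated example (hypothetical data): x confounds the effect of a on log time.
rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)
a = 0.5 * x + rng.normal(size=n)
log_t = 1.0 - 0.3 * a + 0.5 * x + rng.normal(scale=0.5, size=n)
t_event = np.exp(log_t)
t_cens = np.exp(rng.normal(loc=2.0, scale=1.0, size=n))  # independent censoring
time = np.minimum(t_event, t_cens)
event = (t_event <= t_cens).astype(int)

coef, w = double_weighted_aft(time, event, a, x)
print("estimated effect of treatment on log survival time:", coef[1])
```

In this simulation the marginal structural slope is -0.3 by construction, so the printed estimate can be compared with that value. Censored observations receive weight zero and the uncensored ones are up-weighted to stand in for them, which is how the second set of weights offsets the loss of information due to censoring.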
In the second project, I examine the more complex causal pathways from exposure to outcome using mediation analysis. I begin with basic mediation analysis and progress to the more advanced four-way decomposition of the causal effect of exposure on outcome, which includes the interaction between multiple mediators and the exposure. I then extend mediation analysis and the four-way decomposition to survival analysis and express the disparity as defined by the Institute of Medicine (IOM) in terms of the four-way decomposition effects within the mediation analysis framework. Mediation analysis proves to be a crucial tool for unraveling the pathways that contribute to disparities among racial groups. Extensive simulation studies are conducted to examine the contribution of each decomposition effect under various settings of mediators and outcomes. Finally, I investigate the factors influencing racial disparity between the Black and White populations in the United States using the NHANES III database.
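As a small illustration of the four-way decomposition mentioned above, the sketch below computes the four components in the simplest textbook setting: a single continuous mediator, a continuous outcome, and linear models with an exposure-mediator interaction. This is an assumed special case rather than the dissertation's method, which covers multiple mediators and survival outcomes. The function names and coefficient values are hypothetical.

```python
def four_way_decomposition(theta1, theta2, theta3, beta0, beta1, beta2c,
                           a, a_star, m_star):
    """Four-way decomposition under the assumed linear models
        E[Y | a, m, c] = theta0 + theta1*a + theta2*m + theta3*a*m + theta4'c
        E[M | a, c]    = beta0 + beta1*a + beta2'c
    beta2c is the covariate contribution beta2'c at the chosen covariate value;
    a and a_star are the compared exposure levels, m_star the reference mediator level."""
    d = a - a_star
    return {
        "CDE": (theta1 + theta3 * m_star) * d,                            # controlled direct effect
        "INTref": theta3 * (beta0 + beta1 * a_star + beta2c - m_star) * d,  # reference interaction
        "INTmed": theta3 * beta1 * d * d,                                 # mediated interaction
        "PIE": (theta2 * beta1 + theta3 * beta1 * a_star) * d,            # pure indirect effect
    }

def total_effect(theta1, theta2, theta3, beta0, beta1, beta2c, a, a_star):
    """Total effect E[Y_a] - E[Y_a*] computed directly from the same two models."""
    def mean_m(t):
        return beta0 + beta1 * t + beta2c

    def mean_y(t):
        # theta0 and theta4'c cancel in the difference, so they are omitted
        return theta1 * t + theta2 * mean_m(t) + theta3 * t * mean_m(t)

    return mean_y(a) - mean_y(a_star)

# Hypothetical coefficients, comparing exposure a = 1 with a* = 0 at m* = 0.
params = dict(theta1=0.5, theta2=0.4, theta3=0.2, beta0=1.0, beta1=0.3, beta2c=0.0)
parts = four_way_decomposition(**params, a=1.0, a_star=0.0, m_star=0.0)
total = total_effect(**params, a=1.0, a_star=0.0)

for name, value in parts.items():
    print(f"{name}: {value:.3f} ({value / total:.1%} of total effect)")
```

The four components sum to the total effect, so each one can be reported as a proportion of the total. That is the sense in which a decomposition of this kind attributes parts of an overall difference to direct, interactive, and mediated pathways.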