Date on Master's Thesis/Doctoral Dissertation


Document Type

Doctoral Dissertation

Degree Name

Ph. D.


Department

Bioinformatics and Biostatistics

Degree Program

Biostatistics, PhD

Committee Chair

Kong, Maiying

Committee Co-Chair (if applicable)

Riten, Mitra

Committee Member

Kulasekera, Karunarathna B.

Committee Member

McClain, Craig James

Committee Member

Zheng, Qi

Author's Keywords

Causal inference; ordinal outcome; personalized medicine


This dissertation consists of two projects investigating statistical methods for causal inference and personalized medicine using observational data. In the first project, we propose a parametric marginal structural ordinal logistic regression model (MS-OLRM) to assess treatment effects on ordinal outcomes. The average treatment effect (ATE) measures the difference between the mean outcome if all patients had been treated and the mean outcome if none had been treated. Many statistical methods have been developed to estimate the ATE when the outcome is continuous or binary, but methodology for assessing treatment effects on an ordinal outcome is less studied. For an ordinal outcome, the concept of a mean may not be appropriate. For example, the difference between stage II and stage I breast cancer is quite different from that between stage IV and stage III. We therefore propose using the superiority score, which measures whether the outcome under treatment is stochastically larger than the outcome under control, to quantify the treatment effect. We propose using the MS-OLRM along with inverse probability of treatment weighting (IPTW) to estimate the superiority score under treatment compared with control. This methodology adjusts for confounding between treatment and outcome through IPTW: in the weighted sample, all covariates become balanced across treatment groups. Extensive simulation studies are carried out to examine the performance of the proposed method. We apply the proposed method to assess the effects of medications and behavioral therapies on patients’ recovery from alcohol use disorders using the Kentucky Medicaid 2012-2019 database.
In the second project, we propose a doubly robust method for selecting the optimal treatment regimen for a survival outcome using observational data. In the proposed method, we apply generalized partial linear single-index models (GPLSIMs) directly to model the contrast functions (i.e., the outcome difference between treatment and control). We treat the outcome under control as a nuisance function and estimate the contrast functions using the A-learning method and a structural mean model. The optimal treatment regimen is defined as the treatment that results in the optimal outcome. The contrast functions can be consistently estimated if either the outcome model under control or the generalized propensity scores are correctly specified. When the outcome model under control is estimated using a GPLSIM, it is less prone to misspecification, which results in more robust estimation of the contrast functions and of the optimal treatment selection. Extensive simulation studies are carried out to examine the performance of the proposed method, and the results show that it performs well. We apply the proposed method to select the optimal exercise level based on patients’ comorbidity and other characteristics using the National Health and Nutrition Examination Survey (NHANES) III data sets.
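
As an illustration of the first project's estimand, the following Python sketch computes an IPTW-weighted superiority score for a binary treatment. It is a simplified stand-in, not the dissertation's method: it replaces the parametric MS-OLRM with weighted category proportions, it takes the propensity scores as given, and it assumes one common definition of the superiority score, P(Y1 > Y0) + 0.5 P(Y1 = Y0), where Y1 and Y0 are independent draws from the weighted outcome distributions under treatment and control. The abstract does not state the exact definition, and all function and argument names here are hypothetical.

```python
import numpy as np


def iptw_weights(treated, propensity):
    """Inverse probability of treatment weights: 1/e(x) if treated, 1/(1 - e(x)) otherwise."""
    treated = np.asarray(treated, dtype=float)
    propensity = np.asarray(propensity, dtype=float)
    return treated / propensity + (1.0 - treated) / (1.0 - propensity)


def weighted_category_probs(outcome, weights, levels):
    """Weighted proportion of subjects in each ordinal level (levels listed in ascending order)."""
    outcome = np.asarray(outcome)
    weights = np.asarray(weights, dtype=float)
    totals = np.array([weights[outcome == k].sum() for k in levels])
    return totals / totals.sum()


def superiority_score(p_treat, p_control):
    """Assumed definition: P(Y1 > Y0) + 0.5 * P(Y1 = Y0) for independent Y1, Y0."""
    p_treat = np.asarray(p_treat, dtype=float)
    p_control = np.asarray(p_control, dtype=float)
    # P(Y0 < k) for each level k: cumulative probability minus the mass at k.
    below = np.cumsum(p_control) - p_control
    return float(np.sum(p_treat * (below + 0.5 * p_control)))


def iptw_superiority(outcome, treated, propensity, levels):
    """Superiority score of treatment over control in the IPTW-weighted sample."""
    outcome = np.asarray(outcome)
    is_treated = np.asarray(treated).astype(bool)
    w = iptw_weights(is_treated, propensity)
    p1 = weighted_category_probs(outcome[is_treated], w[is_treated], levels)
    p0 = weighted_category_probs(outcome[~is_treated], w[~is_treated], levels)
    return superiority_score(p1, p0)
```

Under this assumed definition, a score of 0.5 indicates no stochastic ordering between the two weighted outcome distributions, and values above 0.5 favor treatment. In practice the propensity scores would themselves be estimated, for example by logistic regression of treatment on the confounders.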