Date on Master's Thesis/Doctoral Dissertation


Document Type

Doctoral Dissertation

Degree Name

Ph. D.

Department
Bioinformatics and Biostatistics

Degree Program

Biostatistics, PhD

Committee Chair

Kong, Maiying

Committee Co-Chair (if applicable)

Kulasekera, Karunarathna B.

Committee Member

Kulasekera, Karunarathna B.

Committee Member

Huang, Jiapeng

Committee Member

Antimisiaris, Demetra

Committee Member

Gaskins, Jeremy

Committee Member

Zheng, Qi

Author's Keywords

Average treatment effect; causal inference; propensity score; observational studies

Abstract
This dissertation consists of three projects related to causal inference based on observational data.

In the first project, we propose a doubly robust method to identify effect modifiers and estimate the optimal treatment. Observational studies differ from experimental studies in that the assignment of subjects to treatments is not randomized but instead arises from natural mechanisms, which are usually hidden from the researchers. Many statistical methods developed for experimental studies to identify the treatment effect and select the optimal personalized treatment may no longer be suitable for observational studies. In this project, we propose a flexible outcome model for selecting the optimal personalized treatment that is suitable for experimental studies as well as observational studies. In the proposed model, the control group response profile is captured by a non-parametric function, and treatment heterogeneity is captured by the interaction between treatment and a linear combination of covariates. An L1 penalty and the A-learning method are used to select the important variables in the interaction terms, so that the effect modifiers can be identified and the optimal treatment can be determined. The proposed approach is flexible and doubly robust in the sense that the estimated individual treatment effect is consistent if either the control group response profile or the propensity score model is correctly specified.

In the second project, we propose a statistical method for assessing drug interactions with binary treatments. With advances in medicine, many drugs and treatments have become available. On the one hand, polydrug use (i.e., using more than one drug at a time) has been used to treat patients with multiple morbid conditions, but it may cause severe side effects. On the other hand, combination treatments have been successfully developed to treat severe diseases such as cancer and chronic diseases. Observational data, such as electronic health record data, may provide useful information for assessing drug interactions. In this project, we propose using marginal structural models (MSMs) to assess the average treatment effect (ATE) and the causal interaction of two drugs while controlling for confounding variables. The causal effect and the interaction of the two drugs are estimated using a weighted likelihood approach, with weights equal to the inverse probability of the treatment assigned. Simulation studies were conducted to examine the performance of the proposed method and showed that it estimates the causal parameters consistently. Case studies were conducted to examine the joint effect of metformin and glyburide use on reducing hospital readmission among patients with type 2 diabetes, and the joint effect of antecedent statin and opioid use on immune and inflammatory biomarkers among patients hospitalized with COVID-19.

In the third project, we propose a statistical method for assessing treatment interactions when each treatment may be measured on a continuous scale, such as different dose levels or treatment intensities. Combination treatment is often used to treat certain diseases such as cancer or alcohol use disorders (AUD). For example, medication and psychotherapy can be applied together to treat patients with AUD. Observational data from electronic health records or claims databases are examples of data resources that can be used to examine treatment effects and treatment interactions. In the second project, we proposed generalized MSMs and provided procedures for estimating the ATE and treatment interactions using observational data, with confounding variables controlled via inverse probability of treatment weighting (IPTW). However, those MSMs and algorithms apply only when two treatments are used together and each has only two levels (present or absent), which makes them unsuitable when a treatment has multiple levels or is measured on a continuous scale. In this project, we propose the marginal structural semiparametric model (MSSM) to estimate the ATE and treatment interactions, using the generalized propensity score (GPS) method and spline functions, where each treatment either has multiple levels or is measured on a continuous scale. The statistical method developed here can be used to investigate the ATE and treatment interactions with respect to treatment efficacy as well as adverse events, depending on the outcome of interest.
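
The weighting idea behind the second project can be illustrated with a minimal sketch. This is not the dissertation's implementation; it rests on several assumptions made only for illustration: a single simulated confounder `x`, a continuous outcome (so that the weighted likelihood reduces to weighted least squares), logistic propensity models fitted by Newton-Raphson, and hypothetical helper names (`fit_logistic`, `prob_assigned`, `fit_msm`). The MSM assumed here is E[Y(a1, a2)] = b0 + b1*a1 + b2*a2 + b3*a1*a2, where b3 is the causal interaction.

```python
import numpy as np


def fit_logistic(X, y, n_iter=25):
    """Logistic regression by Newton-Raphson; X must contain an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (y - p)
        hess = X.T @ (X * (p * (1.0 - p))[:, None])
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def prob_assigned(X, a, beta):
    """Fitted probability of the treatment level each subject actually received."""
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    return np.where(a == 1, p, 1.0 - p)


def fit_msm(a1, a2, y, w):
    """Weighted least squares for E[Y(a1, a2)] = b0 + b1*a1 + b2*a2 + b3*a1*a2."""
    D = np.column_stack([np.ones(len(y)), a1, a2, a1 * a2])
    WD = D * w[:, None]
    return np.linalg.solve(D.T @ WD, WD.T @ y)


# Simulated data (illustrative values only): x confounds both treatments and the outcome.
rng = np.random.default_rng(0)
n = 20000
x = rng.normal(size=n)
a1 = rng.binomial(1, 1.0 / (1.0 + np.exp(-0.5 * x)))
a2 = rng.binomial(1, 1.0 / (1.0 + np.exp(0.5 * x - 0.3 * a1)))
y = 1.0 + 1.0 * a1 + 0.5 * a2 - 0.8 * a1 * a2 + 2.0 * x + rng.normal(size=n)

# Joint propensity factorized as P(A1 | X) * P(A2 | A1, X).
ones = np.ones(n)
X1 = np.column_stack([ones, x])
X2 = np.column_stack([ones, x, a1])
p1 = prob_assigned(X1, a1, fit_logistic(X1, a1))
p2 = prob_assigned(X2, a2, fit_logistic(X2, a2))
w = 1.0 / (p1 * p2)  # inverse probability of the treatment combination received

est = fit_msm(a1, a2, y, w)        # IPTW estimate of (b0, b1, b2, b3)
naive = fit_msm(a1, a2, y, ones)   # unweighted fit, confounded by x
print("IPTW :", np.round(est, 2))
print("naive:", np.round(naive, 2))
```

In this simulation the data-generating coefficients are (1.0, 1.0, 0.5, -0.8), so the weighted fit can be compared against them, while the unweighted fit shows how ignoring the confounder distorts the estimated drug effects.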

Included in

Biostatistics Commons