Date on Master's Thesis/Doctoral Dissertation
8-2014
Document Type
Doctoral Dissertation
Degree Name
Ph. D.
Department
Bioinformatics and Biostatistics
Committee Chair
Kong, Maiying
Committee Co-Chair (if applicable)
Datta, Susmita
Committee Member
Datta, Susmita
Committee Member
Kulasekera, Karunarathna
Committee Member
Wu, Dongfeng
Committee Member
Jones, Stephen P.
Subject
Regression analysis; Mass spectrometry
Abstract
The focus of this dissertation is to develop statistical methods, under the framework of penalized regressions, to handle three different problems. The first topic addresses the missing data problem for variable selection models, including the elastic net (ENet) method and sparse partial least squares (SPLS). I proposed a multiple imputation (MI) based weighted ENet (MI-WENet) method that uses the stacked MI data and a weighting scheme for each observation. Numerical simulations were conducted to examine the performance of the MI-WENet method and to compare it with competing alternatives. I then applied the MI-WENet method to examine the predictors of endothelial function, characterized by median effective dose and maximum effect, in an ex-vivo experiment. The second topic is to develop monotonic single-index models for assessing drug interactions. In single-index models, the link function f is not necessarily monotonic. However, in combination drug studies, a monotonic link function f is desired. I proposed to estimate f by using penalized splines with an I-spline basis. An algorithm for estimating f and the parameter a in the index was developed. Simulation studies were conducted to examine the performance of the proposed models in terms of accuracy in estimating f and a. Moreover, I applied the proposed method to examine the interaction of two drugs in a real case study. The third topic focused on SPLS- and ENet-based accelerated failure time (AFT) models for predicting patient survival time with mass spectrometry (MS) data. A typical MS data set contains a limited number of spectra, while each spectrum contains tens of thousands of intensity measurements representing an unknown number of peptide peaks as the key features of interest. Due to the high dimension and high correlations among features, traditional linear regression modeling is not applicable.
The semi-parametric AFT model with an unspecified error distribution is a well-accepted approach in survival analysis. To reduce the bias caused in the denoising step, we proposed a nonparametric imputation approach based on the Kaplan-Meier estimator. Numerical simulations and a real case study were conducted using the proposed method.
Recommended Citation
Wan, Yubing, "Penalized regressions for variable selection model, single index model and an analysis of mass spectrometry data." (2014). Electronic Theses and Dissertations. Paper 1508.
https://doi.org/10.18297/etd/1508