Electronic Theses and Dissertations

Bayesian variable selection strategies in longitudinal mixture models and categorical regression problems.

Md Nazir Uddin, University of Louisville

Date on Master's Thesis/Doctoral Dissertation

8-2021

Document Type

Doctoral Dissertation

Degree Name

Ph. D.

Department

Bioinformatics and Biostatistics

Degree Program

Biostatistics, PhD

Committee Chair

Gaskins, Jeremy

Committee Co-Chair (if applicable)

Kong, Maying

Committee Member

Kong, Maying

Committee Member

Mitra, Riten

Committee Member

Pal, Subhadip

Committee Member

Gill, Ryan

Author's Keywords

Variable screening; mixture models; shrinkage; Bayesian analysis; variable selection

Abstract

In this work, we seek to develop a variable screening and selection method for Bayesian mixture models with longitudinal data. To develop this method, we consider data from the Health and Retirement Survey (HRS) conducted by the University of Michigan. Taking yearly out-of-pocket expenditures as the longitudinal response variable, we fit a Bayesian mixture model with $K$ components. The data consist of a large collection of demographic, financial, and health-related baseline characteristics, and we wish to find a subset of these that impact cluster membership. An initial mixture model without any cluster-level predictors is fit to the data through an MCMC algorithm, and then a variable screening step finds a set of candidate predictors that may be associated with the cluster configurations found in the initial fit. For each predictor, we choose a discrepancy measure, such as a frequentist hypothesis test, that quantifies the differences in the predictor values across clusters. A large discrepancy provides evidence that the clusters (and the corresponding response trajectories) differ with respect to that baseline characteristic, and these discrepancies are used to choose a small set of predictors to include in a multinomial logit model for cluster membership. A stepwise logit model, along with other choices, is considered as a multivariate variable screening approach. The performance of this methodology is explored in both simulations and real data. Additionally, we consider the problem of variable selection in the baseline categorical logit model for categorical regression. While a number of studies consider variable selection in the regression paradigm with a numerical response, research is limited for a categorical response variable.
The main goal of this project is to leverage the features of the global-local shrinkage framework to improve variable selection in baseline categorical logistic regression by introducing new shrinkage priors that encourage similar predictors to be selected across the models for different response levels. To that end, the proposed shrinkage priors share information across response models through local parameters that favor similar levels of shrinkage for all coefficients (log odds ratios) of a predictor. We explore different shrinkage approaches using the horseshoe and normal-gamma priors within our setting and compare them to a spike-and-slab setup and to other shrinkage priors that do not share information across models. We explore the performance of our approach in both simulations and a real data application.
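
The screening step described in the abstract can be illustrated with a minimal sketch. It assumes that cluster labels are already available from an initial mixture-model fit (here they are simulated) and uses the one-way ANOVA F statistic as one possible discrepancy measure for a continuous predictor. The abstract does not state which tests the dissertation uses, so the choice of statistic, the function names `f_statistic` and `screen`, and the toy predictors are all hypothetical.

```python
import random
from statistics import fmean


def f_statistic(values, labels):
    """One-way ANOVA F statistic of `values` grouped by cluster `labels`.

    Assumes at least two clusters and nonzero within-cluster variation.
    """
    groups = {}
    for v, g in zip(values, labels):
        groups.setdefault(g, []).append(v)
    n, k = len(values), len(groups)
    grand = fmean(values)
    means = {g: fmean(vs) for g, vs in groups.items()}
    # Between-cluster and within-cluster sums of squares.
    between = sum(len(vs) * (means[g] - grand) ** 2 for g, vs in groups.items())
    within = sum((v - means[g]) ** 2 for g, vs in groups.items() for v in vs)
    return (between / (k - 1)) / (within / (n - k))


def screen(predictors, labels, top=2):
    """Score every predictor and keep the `top` largest discrepancies."""
    scores = {name: f_statistic(x, labels) for name, x in predictors.items()}
    selected = sorted(scores, key=scores.get, reverse=True)[:top]
    return selected, scores


# Toy data: three clusters (stand-ins for labels from an initial fit).
rng = random.Random(0)
labels = [i % 3 for i in range(300)]
predictors = {
    "age": [rng.gauss(60 + 5 * g, 3) for g in labels],     # strongly cluster-related
    "income": [rng.gauss(10 + 2 * g, 3) for g in labels],  # weakly cluster-related
    "noise": [rng.gauss(0, 1) for _ in labels],            # unrelated to clusters
}
selected, scores = screen(predictors, labels, top=2)
print(selected)
```

In this sketch the predictors with the largest discrepancies would be the candidates passed on to a multinomial logit model for cluster membership. A categorical predictor would need a different statistic, such as a chi-square test of independence.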

Recommended Citation

Uddin, Md Nazir, "Bayesian variable selection strategies in longitudinal mixture models and categorical regression problems." (2021). Electronic Theses and Dissertations. Paper 3701.
https://doi.org/10.18297/etd/3701

Included in

Applied Statistics Commons, Biostatistics Commons, Categorical Data Analysis Commons, Longitudinal Data Analysis and Time Series Commons, Multivariate Analysis Commons, Probability Commons, Statistical Methodology Commons, Statistical Models Commons

ThinkIR: The University of Louisville's Institutional Repository