Date on Master's Thesis/Doctoral Dissertation
8-2025
Document Type
Doctoral Dissertation
Degree Name
Ph. D.
Department
Bioinformatics and Biostatistics
Degree Program
School of Public Health and Information Sciences, Biostatistics
Committee Chair
Gaskins, Jeremy
Committee Member
Kong, Maiying
Committee Member
Sekula, Michael
Committee Member
Huang, Shih-Ting
Committee Member
Gill, Ryan
Author's Keywords
Bayesian nonparametric clustering; predictor-informed; pyramid group model; common atom model
Abstract
In this dissertation, we perform clustering of observations such that cluster membership is influenced by a set of predictors. To that end, we employ the Bayesian nonparametric Common Atom Model (CAM), a nested clustering algorithm that uses a (fixed) group membership for each observation to encourage more similar clustering of members of the same group. CAM operates by assuming each group has its own vector of cluster probabilities, which are themselves clustered to allow similar clustering for some groups. We extend this approach by treating the group membership as an unknown latent variable determined by a flexible nonparametric function of the covariate vector. Consequently, observations with similar predictor values will be in the same latent group and are more likely to be clustered together than observations with disparate predictors. We propose a pyramid group model that flexibly partitions the predictor space into these latent groups. This pyramid model operates similarly to a Bayesian regression tree process, except that it uses the same splitting rule at all nodes at the same tree depth, which facilitates improved mixing. We propose a block Gibbs sampler to perform posterior inference for our model. Our methodology is demonstrated in simulation and real data examples. In the real data application, we use the RAND Health and Retirement Study to cluster and predict patient outcomes in terms of the number of overnight hospital stays.
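As a minimal illustration of the depth-shared splitting idea described in the abstract, the following Python sketch maps a predictor vector to a latent group index when every node at a given tree depth applies the same rule. It is not the dissertation's implementation. The binary threshold form of each rule, the name `pyramid_group`, the `(feature_index, threshold)` representation, and the example values are all assumptions made for illustration. In the model itself the rules are unknown and are inferred by the sampler rather than fixed in advance.

```python
from typing import List, Sequence, Tuple

# Hypothetical representation (an assumption, not from the dissertation):
# one (feature_index, threshold) rule per tree depth.
SplitRule = Tuple[int, float]


def pyramid_group(x: Sequence[float], rules: List[SplitRule]) -> int:
    """Return a latent group index in [0, 2**len(rules)) for predictor vector x.

    Because all nodes at the same depth share one rule, the path through the
    tree is fully described by one bit per depth.
    """
    group = 0
    for feature, threshold in rules:
        # Same rule for every node at this depth: go right if above threshold.
        bit = 1 if x[feature] > threshold else 0
        group = 2 * group + bit
    return group


# Illustrative rules: depth 0 splits feature 0 at 0.5, depth 1 splits feature 1 at 10.0.
rules = [(0, 0.5), (1, 10.0)]
groups = [pyramid_group(x, rules) for x in ([0.2, 3.0], [0.2, 12.0], [0.9, 3.0], [0.9, 12.0])]
print(groups)
```

With D depths this yields at most 2^D groups from only D rules, whereas a general binary tree of the same depth could need up to 2^D - 1 distinct rules. Having fewer rules to update is one plausible reading of why the shared-rule structure helps mixing.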
Recommended Citation
Parh, Md Yasin Ali, "Predictor-informed Bayesian nonparametric clustering." (2025). Electronic Theses and Dissertations. Paper 4629.
Retrieved from https://ir.library.louisville.edu/etd/4629