Bioinformatics and Biostatistics
Single-cell; RNA-seq; hurdle model; factor model; differential expression; gene networks
Single-cell RNA sequencing (scRNA-seq) technology allows researchers to gain a better understanding of health and disease by analyzing gene expression data at the cellular level. However, scRNA-seq data tend to have high proportions of zero values, increased cell-to-cell variability, and overdispersion due to abnormally large expression counts, all of which create new statistical problems that need to be addressed. This dissertation includes three research projects that propose Bayesian methodology suitable for scRNA-seq analysis. The first project presents a hurdle model for identifying differentially expressed genes across cell types in scRNA-seq data. This model incorporates a correlated random effects structure, based on an initial clustering of cells, to capture the cell-to-cell variability within treatment groups, but it can easily be adapted to an independent random effects structure if needed. The second project introduces a sparse Bayesian factor model to uncover network structures associated with genes in scRNA-seq data. Latent factors impact the gene expression values for each cell and provide flexibility to account for the common features of scRNA-seq data. The third project expands upon this latent factor model to allow for the comparison of networks across different treatment groups.
Sekula, Michael, "Novel Bayesian methodology for the analysis of single-cell RNA sequencing data." (2020). Electronic Theses and Dissertations. Paper 3416.
Retrieved from https://ir.library.louisville.edu/etd/3416