Date on Master's Thesis/Doctoral Dissertation
8-2025
Document Type
Master's Thesis
Degree Name
M.S.
Department
Bioinformatics and Biostatistics
Degree Program
Biostatistics, MS
Committee Chair
Sekula, Michael
Committee Co-Chair
Kong, Maiying
Committee Member
Cash, Elizabeth
Author's Keywords
Breast cancer; TCGA-BRCA; RNA-Seq; Cox proportional hazards model; LASSO feature selection; cancer genomics
Abstract
High-dimensional genomic data offer both promise and challenges for identifying clinically relevant biomarkers. This study developed a parallelized survival modeling pipeline to identify genes associated with overall survival in breast cancer, with a focus on gene-by-treatment interactions and patient heterogeneity. RNA-Seq data from female patients in the TCGA-BRCA cohort were analyzed. Univariate Cox proportional hazards models were used to screen genes, adjusting for age, race/ethnicity, treatment status, and cancer stage. A LASSO-penalized Cox regression was fit across 2000 random seeds to assess feature stability. Genes were filtered by expression level, statistical significance, and hazard ratios (effect sizes) in either direction, then re-evaluated in a multivariable Cox model. Several genes with statistically significant treatment interactions were identified, including novel candidates not present in established prognostic panels. These findings highlight the value of interaction-aware survival modeling for improving personalized prognostic prediction in breast cancer and underscore the importance of accounting for treatment heterogeneity in high-dimensional genomic analyses.
Recommended Citation
Pratt, David, "Genes that matter: Survival modeling in TCGA-BRCA with treatment interactions." (2025). Electronic Theses and Dissertations. Paper 4643.
Retrieved from https://ir.library.louisville.edu/etd/4643