Date on Master's Thesis/Doctoral Dissertation


Document Type

Master's Thesis

Degree Name



Bioinformatics and Biostatistics

Committee Chair

Brock, Guy

Author's Keywords

Gene-gene interactions; Elastic net; Survival; Survival MDR; Lasso; Random survival forest


Genes--Analysis--Data processing; Cancer--Research


In recent years, a number of computational and statistical problems in identifying SNP-SNP interactions in high-dimensional survival data have been studied, and several data mining approaches have been proposed. However, the relative performance of these methods in detecting SNP-SNP interactions has not been thoroughly investigated. In this study, we directly compared the performance of four techniques for detecting gene-gene interactions, using data from a recently conducted study of genetic polymorphisms associated with breast cancer survival and recurrence. The four methods evaluated were Survival Multifactor Dimensionality Reduction (Survival MDR), Cox regression with L1 (Lasso) and L1-L2 (Elastic Net) penalties, and Random Survival Forest (RSF). Methods were contrasted on the basis of which SNPs they selected. The results demonstrate how the methods perform in detecting gene-gene interactions in survival data and can inform researchers choosing an analysis tool for their own applications.