Date on Master's Thesis/Doctoral Dissertation
5-2015
Document Type
Doctoral Dissertation
Degree Name
Ph. D.
Department
Computer Engineering and Computer Science
Degree Program
Computer Science and Engineering, PhD
Committee Chair
Rouchka, Eric Christian
Committee Co-Chair (if applicable)
Chang, Dar-jen
Committee Member
Moseley, Hunter
Committee Member
Petruska, Jeffrey
Committee Member
Yampolskiy, Roman Vladimirovich
Subject
Nucleotide sequence; RNA--Analysis
Abstract
High-throughput mRNA sequencing (also known as RNA-Seq) promises to be the technique of choice for studying transcriptome profiles, offering several advantages over older techniques such as microarrays. This technique makes it possible to develop precise methodologies for a variety of RNA-Seq applications, including gene expression quantification, novel transcript and exon discovery, differential expression (DE) analysis, and splice variant detection. Detecting features (e.g., genes, transcript isoforms, exons) whose expression changes significantly across biological samples is a primary application of RNA-Seq. Uncovering which features are significantly differentially expressed between samples can provide insight into their functions. One major limitation of most recently developed methods for RNA-Seq differential expression is their dependency on annotated biological features to detect expression differences across samples. This restricts the identification of expression levels and the detection of significant changes to known genomic regions. Thus, any significant changes occurring in unannotated regions will not be captured. To overcome this limitation, we developed a novel segmentation approach, Island-Based (IBSeq), for analyzing differential expression in RNA-Seq and targeted sequencing (exome capture) data without specific knowledge of an isoform. IBSeq segmentation determines individual islands of expression based on windowed read counts; these islands can be compared across experimental conditions to determine differential island expression. To detect differentially expressed features, the significance values of the DE islands corresponding to each feature are combined using combined p-value methods. We evaluated the performance of our approach by comparing it to a number of existing gene DE methods using several benchmark MAQC RNA-Seq datasets. Using the area under the ROC curve (auROC) as a performance metric, results show that IBSeq clearly outperforms all other methods compared.
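The two steps named in the abstract, segmenting windowed read counts into islands and combining island-level p-values per feature, can be illustrated with a minimal sketch. This is not the IBSeq implementation: the abstract does not state the window size, the expression threshold, or which combined p-value method is used. The sketch assumes a simple count threshold for island calling and uses Fisher's method as one common combined p-value technique. The function names `find_islands` and `fisher_combined_pvalue` and the parameter `min_count` are hypothetical.

```python
import math


def find_islands(window_counts, min_count=1):
    """Group consecutive windows with count >= min_count into islands.

    Returns half-open (start, end) window-index ranges.
    The threshold rule is an assumption, not taken from the dissertation.
    """
    islands = []
    start = None
    for i, count in enumerate(window_counts):
        if count >= min_count:
            if start is None:
                start = i  # a new island begins at this window
        elif start is not None:
            islands.append((start, i))  # island ended at the previous window
            start = None
    if start is not None:
        islands.append((start, len(window_counts)))
    return islands


def fisher_combined_pvalue(pvalues):
    """Combine independent p-values with Fisher's method (assumed choice).

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null; for even degrees of freedom
    the survival function has the closed form
    exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!.
    """
    if not pvalues:
        raise ValueError("at least one p-value is required")
    if any(p <= 0.0 or p > 1.0 for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")
    k = len(pvalues)
    half_stat = -sum(math.log(p) for p in pvalues)  # this is X / 2
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return min(1.0, math.exp(-half_stat) * total)


if __name__ == "__main__":
    counts = [0, 3, 5, 0, 0, 2, 0]  # made-up windowed read counts
    print(find_islands(counts))
    print(fisher_combined_pvalue([0.05, 0.05]))  # made-up island p-values
```

In this sketch, a feature overlapping several islands would receive one combined p-value from the p-values of those islands; how islands are tested and assigned to features is left out because the abstract does not describe it.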
Recommended Citation
Eteleeb, Abdallah, "An island-based approach for RNA-SEQ differential expression analysis." (2015). Electronic Theses and Dissertations. Paper 2072.
https://doi.org/10.18297/etd/2072