Electronic Theses and Dissertations

Integrated analysis of miRNA/mRNA expression and gene methylation using sparse canonical correlation analysis.

Dake Yang, University of Louisville

Date on Master's Thesis/Doctoral Dissertation

5-2016

Document Type

Doctoral Dissertation

Degree Name

Ph. D.

Department

Bioinformatics and Biostatistics

Degree Program

Biostatistics, PhD

Committee Chair

Brock, Guy

Committee Co-Chair (if applicable)

Lorenz, Douglas

Committee Member

Kong, Maiying

Committee Member

Kulasekera, K. B.

Committee Member

Mukhopadhyay, Partha

Committee Member

Wu, Dongfeng

Author's Keywords

Bioinformatic; China; Xinjiang; Louisville; Biostatistics: Dake Yang

Abstract

MicroRNAs (miRNAs) are a large class of small endogenous non-coding RNA molecules (18-25 nucleotides in length) that regulate gene expression post-transcriptionally. While a variety of algorithms exist for determining the targets of miRNAs, they are generally based on sequence information and frequently produce lists consisting of thousands of genes. Canonical correlation analysis (CCA) is a multivariate statistical method that can be used to find linear relationships between two data sets, and here we apply CCA to find the linear combinations of differentially expressed miRNAs and their corresponding target genes that have maximal negative correlation. Because of the high dimensionality of the data, sparse CCA is used to constrain the problem and obtain a solution. A novel gene set enrichment analysis (GSEA) statistic, based on the sparse CCA results, is proposed for estimating the significance of predefined gene sets. The methods are illustrated with both a simulation study and real miRNA-mRNA expression data. DNA methylation, the addition of a methyl group to DNA by a group of enzymes collectively known as DNA methyltransferases, is an epigenetic modification critical to normal genome regulation and development. In order to understand the role of DNA methylation in gene differentiation, we analyze genome-scale DNA methylation patterns and gene expression data using sparse CCA to find linear combinations of the two data sets that have maximal negative correlation. In a similar spirit to the miRNA-mRNA study, we create a GSEA statistic with weight vectors from the sparse CCA method and assess the significance of predefined gene sets. The method is exemplified with real gene expression / DNA methylation data regarding the development of the embryonic murine palate.
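
The idea of using sparse CCA to find sparse linear combinations of two data sets with maximal negative correlation can be illustrated with a small sketch. The code below is not the dissertation's algorithm: it is a minimal penalized-matrix-decomposition style iteration, assuming fixed soft-threshold penalties (`lam_u`, `lam_v`) and non-negative weights on both sides so that "negative correlation" is well defined. All function names, parameter values, and the simulated data are hypothetical and chosen only for illustration.

```python
import numpy as np


def standardize(M):
    """Center each column and scale it to unit variance (assumes no constant columns)."""
    return (M - M.mean(axis=0)) / M.std(axis=0)


def nonneg_sparse_unit(a, lam):
    """Soft-threshold at lam, keep only positive entries, rescale to unit L2 norm."""
    w = np.maximum(a - lam, 0.0)
    norm = np.linalg.norm(w)
    return w / norm if norm > 0 else w


def sparse_cca_negative(X, Y, lam_u=0.3, lam_v=0.3, n_iter=100, tol=1e-8):
    """Return sparse non-negative weights (u, v) so that X @ u and Y @ v
    are strongly negatively correlated.

    Assumption: alternating soft-thresholded updates on the negated
    cross-correlation matrix, with fixed penalties rather than tuned ones.
    """
    X, Y = standardize(X), standardize(Y)
    K = -(X.T @ Y) / X.shape[0]  # negated cross-correlation matrix
    v = np.full(K.shape[1], 1.0 / np.sqrt(K.shape[1]))  # uniform start
    u = np.zeros(K.shape[0])
    for _ in range(n_iter):
        u = nonneg_sparse_unit(K @ v, lam_u)
        v_new = nonneg_sparse_unit(K.T @ u, lam_v)
        converged = np.linalg.norm(v_new - v) < tol
        v = v_new
        if converged:
            break
    return u, v


def simulate_toy_data(n=500, p=10, q=15, seed=0):
    """Hypothetical data: 2 'miRNA' columns and 3 'gene' columns share a
    latent signal with opposite sign; all other columns are pure noise."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)
    X = rng.standard_normal((n, p))
    Y = rng.standard_normal((n, q))
    X[:, :2] = z[:, None] + 0.5 * rng.standard_normal((n, 2))
    Y[:, :3] = -z[:, None] + 0.5 * rng.standard_normal((n, 3))
    return X, Y
```

Calling `sparse_cca_negative(*simulate_toy_data())` returns two weight vectors whose non-zero entries mark the columns that drive the negative association. In practice the penalties would be tuned (for example by permutation or cross-validation) rather than fixed as they are here.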

Recommended Citation

Yang, Dake, "Integrated analysis of miRNA/mRNA expression and gene methylation using sparse canonical correlation analysis." (2016). Electronic Theses and Dissertations. Paper 2439.
https://doi.org/10.18297/etd/2439

Included in

Bioinformatics Commons, Statistics and Probability Commons

ThinkIR: The University of Louisville's Institutional Repository
