Date on Master's Thesis/Doctoral Dissertation


Document Type

Doctoral Dissertation

Degree Name

Ph. D.


Computer Engineering and Computer Science

Committee Chair

Rouchka, Eric Christian

Author's Keywords

Intercellular interactions; Tissue interaction; Absolute gene ID conversion; Interval tree; AbsIDconvert; Genome annotation


Cell interaction


Current high-throughput gene expression experiments have a straightforward design: they examine the gene expression of one group or condition relative to that of another. The data are typically analyzed as if they represent strictly intracellular events, and genes are often treated as coming from a homogeneous cell population. Although intracellular events are crucial to nearly all biological processes, cell-cell interactions are often just as important, especially when gene expression data are generated from heterogeneous cell populations, such as whole tissues. Cell-cell molecular interactions are generally lost in the available analytical procedures and, as a result, are not examined experimentally, at least not accurately or efficiently. Most importantly, this imposes major limitations when studying gene expression changes in multiple samples that interact with one another. To address the limitations of current techniques, we have developed a novel systems-based approach that expands the traditional analysis of gene expression in two stages. The first is a novel sequence-based meta-analytic tool, AbsIDconvert, that converts annotated features by storing and querying absolute genomic coordinates in an interval tree, allowing multi-scale macromolecule identifiers to be compared across platforms and/or organisms. The second is a systems-based heuristic algorithm that finds intercellular interactions between two sets of genes, potentially from different tissues, by combining location information for each gene with the interaction, pathway, and signaling information available in secondary databases. AbsIDconvert is shown to provide high accuracy in identifier conversion compared to other available methodologies (typically at an average rate of 84%) while maintaining higher efficiency (O(n*log(n))).
Our intercellular interaction approach and its underlying visualization show promise in allowing researchers to uncover novel intercellular signaling pathways, which has not been possible to this point.
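
To illustrate the idea of storing and querying absolute genomic coordinates in an interval tree, the following is a minimal sketch and not the AbsIDconvert implementation. It assumes features lie on a single chromosome and strand with inclusive integer coordinates; the names `Node`, `build`, and `overlaps`, and the identifiers `geneA`, `geneB`, and `geneC`, are hypothetical. It uses a static, balanced, augmented search tree in which each node records the largest end coordinate in its subtree, so that whole subtrees can be skipped during an overlap query.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# (start, end, identifier); coordinates are inclusive (an assumption of this sketch)
Feature = Tuple[int, int, str]


@dataclass
class Node:
    feature: Feature
    max_end: int                     # largest end coordinate in this subtree
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(features: List[Feature]) -> Optional[Node]:
    """Build a balanced tree keyed on start coordinate (sorting costs O(n log n))."""
    def rec(items: List[Feature]) -> Optional[Node]:
        if not items:
            return None
        mid = len(items) // 2
        node = Node(items[mid], items[mid][1])
        node.left = rec(items[:mid])
        node.right = rec(items[mid + 1:])
        for child in (node.left, node.right):
            if child is not None:
                node.max_end = max(node.max_end, child.max_end)
        return node
    return rec(sorted(features))


def overlaps(node: Optional[Node], start: int, end: int,
             out: Optional[List[str]] = None) -> List[str]:
    """Return identifiers of all stored features overlapping [start, end]."""
    if out is None:
        out = []
    # Nothing in this subtree ends at or after the query start: skip it.
    if node is None or node.max_end < start:
        return out
    overlaps(node.left, start, end, out)
    s, e, ident = node.feature
    if s <= end and start <= e:
        out.append(ident)
    # Features to the right start at or after s; visit only if they can begin
    # before the query ends.
    if s <= end:
        overlaps(node.right, start, end, out)
    return out


# Hypothetical annotations from one platform, queried with a region from another.
tree = build([(100, 200, "geneA"), (150, 300, "geneB"), (400, 500, "geneC")])
print(overlaps(tree, 180, 250))  # ['geneA', 'geneB']
```

In this sketch, converting an identifier amounts to looking up the coordinates of a feature from one annotation source and reporting the identifiers of the overlapping features stored from another source. A real tool would also need to handle multiple chromosomes, strands, and genome builds, which are omitted here.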