Date on Master's Thesis/Doctoral Dissertation


Document Type

Doctoral Dissertation

Degree Name

Ph. D.


Interdisciplinary and Graduate Studies

Degree Program

Interdisciplinary Studies with a specialization in Bioinformatics, PhD

Committee Chair

Rouchka, Eric

Committee Co-Chair (if applicable)

Moseley, Hunter

Committee Member

Petruska, Jeffrey

Committee Member

Rai, Shesh

Committee Member

Brock, Guy

Author's Keywords

compressed angle; aberrant coordination geometry; structure-function relationship; bidentation; 3D structure


Metalloproteins are proteins that bind at least one metal ion as a cofactor. They use metal ions for a variety of biological purposes and are essential in all domains of life. Because metalloproteins are involved in so many of these processes, how proteins coordinate metal ions for different biochemical functions is highly relevant to understanding how the processes are carried out. One of the most important aspects of metal binding is its coordination geometry (CG), which often implies functional activity. Most current studies assume previously reported CG models, which were established mainly in a non-biological chemical context. While this general procedure provides good measures of the closest CG model a metal site adopts, it also biases and limits the binding ligand selection and coordination results to the canonical CG models examined. Thus, if a CG model exists that has never been reported previously or is not accounted for in a study, instances of that CG would either be misclassified into an expected model, causing high in-class variation, or be considered outliers. To solve this problem, we developed an analysis in which the less-biased, low-variation measure, bond length, was used to determine the binding ligands, and the higher-variation measure, angle, was used to cluster the metal shells into canonical or novel CGs with functional associations. This methodology is model-free and allows us to derive the CG models from the data itself. Thus, it can handle unknown CGs that may cause problems for classification methods. This new methodology has enabled the discovery of several previously uncharacterized CGs for zinc and other highly abundant metalloproteins. By recognizing these novel/aberrant CGs in our clustering analyses, we achieved high correlations between structural and functional descriptions of metal ion coordination.
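The two-stage idea in the abstract (bond length to choose the binding ligands, then angles to describe the metal shell) can be illustrated with a minimal sketch. This is not the dissertation's implementation: the function names, the 2.8 angstrom cutoff, and the toy coordinates are all assumptions made for illustration, and the clustering step itself is left out. The sorted angle profile is only one plausible model-free feature that a clustering method could take as input.

```python
import math
from itertools import combinations

def select_ligands(metal, atoms, cutoff=2.8):
    """Keep atoms within `cutoff` angstroms of the metal.
    The 2.8 value is an assumed placeholder, not a figure from the source."""
    return [a for a in atoms if math.dist(metal, a) <= cutoff]

def angle_deg(metal, a, b):
    """Ligand-metal-ligand angle in degrees."""
    va = [x - m for x, m in zip(a, metal)]
    vb = [x - m for x, m in zip(b, metal)]
    cos = sum(p * q for p, q in zip(va, vb)) / (math.hypot(*va) * math.hypot(*vb))
    # Clamp to guard against floating-point values slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))

def angle_profile(metal, ligands):
    """Sorted list of all pairwise ligand-metal-ligand angles.
    Sorting makes the profile independent of ligand order."""
    return sorted(angle_deg(metal, a, b) for a, b in combinations(ligands, 2))

# Toy example: four ligands at ideal tetrahedral positions plus one distant atom.
metal = (0.0, 0.0, 0.0)
s = 1.2  # scales each ligand to about 2.08 angstroms from the metal
atoms = [
    (s, s, s), (s, -s, -s), (-s, s, -s), (-s, -s, s),
    (5.0, 0.0, 0.0),  # too far away, so the distance filter should drop it
]
ligands = select_ligands(metal, atoms)
profile = angle_profile(metal, ligands)
print(len(ligands), [round(x, 2) for x in profile])
```

In this sketch no canonical geometry is consulted at any point: the ligand set comes from distances alone, and the angle profile is computed from whatever ligands survive the filter. Profiles from many metal sites could then be grouped by any clustering method, so that unexpected geometries would appear as their own clusters rather than being forced into a predefined model.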