Date on Master's Thesis/Doctoral Dissertation


Document Type

Master's Thesis

Degree Name




Degree Program

Anthropology, MA

Committee Chair

Tillquist, Christopher

Committee Co-Chair (if applicable)

Crespo, Fabian

Committee Member

Crespo, Fabian

Committee Member

Perlin, Michael

Author's Keywords

linkage disequilibrium; haplotype phasing; IL-10; immune genes; MAPKAP-K2; population genetics


The block-like structure of the human genome has been the subject of many scientific papers and is of practical significance in large-scale genome-wide association studies. How stringent haplotype block boundaries are within and between populations remains a matter of ongoing debate in human population genetics. This thesis contributes to the description of universal and population-specific haplotype blocks at functional sites, namely across the IL-10 gene family (including IL-10, IL-19, IL-20 and IL-24), which is involved in a number of immune system processes, and MAPKAP-K2, an adjacent and functionally significant kinase gene. Beyond describing blocks across these sites in different populations, this thesis also measures the impact of the haplotype phasing process on downstream applications of linkage disequilibrium (LD) analysis, which underlies much of the research on human haplotype blocks. The five genes in this analysis span just over 200 kb on the q arm of chromosome 1. A total of 80 samples from the Coriell Institute for Medical Research are used in this analysis and represent Andean, Basque, Chinese, Iberian, Indo-Pakistani, Middle Eastern, Russian, South African and North African populations. Some haplotype block boundaries were concordant with gene boundaries, with most populations showing a consistent boundary between IL-20 and IL-24 and at least half of the study populations showing consistent boundaries between MAPKAP-K2, IL-10 and IL-20. The only gene boundary lacking a persistent haplotype block boundary was between IL-19 and IL-20. The haplotype phasing programs PHASE and Beagle shared 13 of 15 haplotype block boundaries, while MDBlocks and Beagle shared only 2 haplotype block boundaries and PHASE and MDBlocks shared only 1. These data indicate that there are indeed population-specific differences in the distribution of LD across these five sites. Despite these differences, there is a general trend of high LD across each gene, with a breakdown of LD at gene boundaries, across all populations.
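
As an illustration of the kind of LD analysis referred to above, the following minimal Python sketch computes the standard pairwise statistics D, D' and r² between two biallelic sites from phased haplotypes. It is not the thesis's own pipeline: the function name `pairwise_ld`, the '0'/'1' allele coding, and the choice to report 0 for monomorphic sites are all assumptions made for this example.

```python
def pairwise_ld(haplotypes, i, j):
    """Return (D, D', r^2) between biallelic sites i and j.

    haplotypes: list of equal-length strings, one per phased chromosome,
    with alleles coded '0' or '1' (an assumed coding for this sketch).
    """
    n = len(haplotypes)
    # Frequency of the '1' allele at each site, and of the '1'-'1' haplotype.
    p_a = sum(h[i] == "1" for h in haplotypes) / n
    p_b = sum(h[j] == "1" for h in haplotypes) / n
    p_ab = sum(h[i] == "1" and h[j] == "1" for h in haplotypes) / n

    # D: observed haplotype frequency minus the expectation under independence.
    d = p_ab - p_a * p_b

    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        # A monomorphic site carries no LD information; 0 is reported here
        # by assumption rather than raising an error.
        return d, 0.0, 0.0

    # D' rescales |D| by its maximum possible value given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max

    # r^2 is the squared correlation between alleles at the two sites.
    r2 = d * d / denom
    return d, d_prime, r2


# Hypothetical toy data: four phased chromosomes, two sites in complete LD.
toy = ["11", "11", "00", "00"]
d, d_prime, r2 = pairwise_ld(toy, 0, 1)
```

Because both statistics are computed from haplotype frequencies, any difference in how a phasing program resolves heterozygous genotypes changes the inputs to this calculation, which is why the choice of phasing method can shift the inferred block boundaries.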