In the search for genetic variants affecting liability to complex diseases, haplotype data have received increasing recognition. When there are more than a few haplotypes, one way to assess the association between disease status and haplotypes is to reconstruct their evolutionary history, i.e. to build a tree that specifies the relative age of each haplotype. To achieve that goal, we combine information from two sources: (1) the inferred network relating haplotypes; and (2) the frequency and mutational history of each haplotype. The former defines a space of candidate trees, while the latter requires probabilistic modeling to estimate the absolute age of each haplotype.
To maximize the "tree likelihood", i.e. the joint likelihood of the absolute ages of the haplotypes for a given tree, we develop a quasi-boundary procedure that enables us to compare the trees consistent with the inferred network. These comparisons make explicit which trees are most likely to have generated the observed data and thus guide decisions for data analysis. We evaluate these methods using data on monoamine oxidase (MAO) haplotypes from two ethnic backgrounds, African-American and European-American. Our results show that the relative ages of common haplotypes are consistent between the evolutionary trees for the two ethnic groups.