B. Devlin, B.L. Jones, S-A. Bacanu and K. Roeder
To determine the genetic etiology of complex diseases, a common study design is to recruit affected sib/relative pairs (ASP/ARP) and evaluate their genome-wide distribution of identity-by-descent (IBD) sharing using a set of highly polymorphic markers. Other attributes or environmental exposures of the ASP/ARP that are thought to affect liability to disease are sometimes collected. Conceivably, these covariates could refine the linkage analysis. Most published methods for ASP/ARP linkage with covariates can be conceptualized as logistic models in which the IBD status of the ASP is predicted by pair-specific covariates. We develop a different approach to the problem of ASP analysis in the presence of covariates, one that extends naturally to ARP under certain conditions. For ASP linkage analysis, we formulate a mixture model in which a disease mutation is segregating in only a fraction of the sibships, with the remaining sibships being unlinked. Covariate information is used to predict membership within groups; in this report, the two groups correspond to the linked and unlinked sibships. For an ASP with covariate(s) Z=z and multilocus genotype X=x, the model is a two-component mixture of g1(x) and g0(x), in which g0(x) follows the distribution of genotypes under the null IBD distribution and g1(x) allows for increased IBD sharing. Two mixture models are developed. The 'Pre-clustering' model uses covariate information to form probabilistic clusters and then tests for excess IBD sharing independent of the covariates. The 'Cov-IBD' model determines probabilistic group membership by joint consideration of covariate and IBD values. Simulations show that incorporating covariates into linkage analysis can enhance power substantially. A feature of our conceptualization of ASP linkage analysis with covariates is that it makes apparent how data analysts might evaluate covariates prior to the linkage analysis, thus avoiding the loss of power described by Leal and Ott when data are stratified.
Keywords: clustering algorithms, mixing distribution, score statistics, likelihood ratio, asymptotic distributions
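The mixture structure described in the abstract can be illustrated with a small numerical sketch. This is not the authors' implementation. The logistic form of the mixing weight, the function and parameter names (`mixing_weight`, `intercept`, `slope`), and the values in `G1` are assumptions made for illustration. The sketch also assumes a single, fully informative marker, so that the number of alleles shared IBD (0, 1, or 2) is observed directly instead of being inferred from multilocus genotypes. Under the null, a sib pair shares 0, 1, or 2 alleles IBD with probabilities 1/4, 1/2, and 1/4.

```python
import math

# Null IBD distribution for a sib pair at one locus: P(share 0, 1, 2 alleles IBD).
G0 = (0.25, 0.50, 0.25)
# Hypothetical linked-component distribution with excess sharing (illustrative values only).
G1 = (0.10, 0.45, 0.45)


def mixing_weight(z, intercept, slope):
    """Assumed logistic model for the probability that a sibship with covariate z is linked."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * z)))


def mixture_prob(ibd, z, intercept, slope):
    """Two-component mixture probability of IBD count `ibd` given covariate z."""
    w = mixing_weight(z, intercept, slope)
    return w * G1[ibd] + (1.0 - w) * G0[ibd]


def posterior_linked(ibd, z, intercept, slope):
    """Posterior probability of membership in the linked group, by Bayes' rule."""
    w = mixing_weight(z, intercept, slope)
    return w * G1[ibd] / mixture_prob(ibd, z, intercept, slope)


def log_likelihood(pairs, intercept, slope):
    """Log-likelihood of (ibd, z) observations, assuming independent sib pairs."""
    return sum(math.log(mixture_prob(ibd, z, intercept, slope)) for ibd, z in pairs)
```

In this sketch, a pair sharing two alleles IBD receives a posterior probability of being linked that exceeds its covariate-based prior weight, while a pair sharing none receives a lower one. This is the sense in which covariate and IBD values can jointly inform probabilistic group membership.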