Outlier Detection and False Discovery Rates for Whole-genome DNA Matching

Tzeng, Jung-Ying; Byerley, W.; Devlin, B.; Roeder, Kathryn; and Wasserman, Larry


We define a statistic, called the matching statistic, for locating regions of the genome that exhibit excess similarity among cases when compared to controls. Such regions are reasonable candidates for harboring disease genes. We find the asymptotic distribution of the statistic while accounting for correlations among sampled individuals. We then use the Benjamini and Hochberg false discovery rate (FDR) method for multiple hypothesis testing to find regions of excess sharing. Because the p-value for each region involves estimated nuisance parameters, we show that, under appropriate conditions, the FDR method applied to such p-values asymptotically preserves the FDR property. Finally, we apply the method to a pilot study on schizophrenia.
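
As an illustration of the multiple-testing step mentioned above, the following is a minimal sketch of the standard Benjamini and Hochberg step-up procedure applied to a list of per-region p-values. It is not the report's own implementation: the function name `benjamini_hochberg`, the default level `alpha=0.05`, and the example p-values are assumptions made for illustration, and the sketch takes the p-values as given rather than computing them from the matching statistic or from estimated nuisance parameters.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Standard BH step-up procedure (illustrative sketch).

    Returns a list of booleans, one per input p-value, that is True
    where the corresponding null hypothesis is rejected.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest 1-based rank k with p_(k) <= k * alpha / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            k_max = rank
    # Reject every hypothesis whose p-value rank is at most k_max.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values for ten genomic regions (made up for illustration).
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
flags = benjamini_hochberg(example_p, alpha=0.05)
```

Because the procedure is step-up, a region can be flagged even when its own p-value exceeds its rank-specific threshold, as long as some larger p-value meets its threshold.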

Keywords: Association study, Case-control, False discovery rate with nuisance parameters, Linkage disequilibrium
