Technical Report 747

Controlling the False Discovery Rate in Astrophysical Data Analysis

Christopher J. Miller, Christopher Genovese, Robert C. Nichol, Larry Wasserman, Andrew Connolly, Daniel Reichart, Andrew Hopkins, Jeff Schneider and Andrew Moore

Abstract:

The False Discovery Rate (FDR) is a new statistical procedure for controlling the number of mistakes made when performing multiple hypothesis tests, i.e. when testing many data points against a given model hypothesis. The key advantage of FDR is that it allows one to control, a priori, the expected fraction of false rejections of the null hypothesis among all rejections made. We compare FDR to the standard procedure of rejecting all tests that depart from the null hypothesis beyond some arbitrarily chosen confidence limit, e.g. $2\sigma$, or the 95% confidence level. We find a similar rate of correct detections but significantly fewer false detections. Moreover, the FDR procedure is quick and easy to compute and can be trivially adapted to work with correlated data. The purpose of this paper is to introduce the FDR procedure to the astrophysics community. We illustrate the power of FDR through several astronomical examples, including the detection of features against a smooth one-dimensional function, e.g. seeing the ``baryon wiggles'' in a power spectrum of matter fluctuations, and source pixel detection in imaging data. In this era of large datasets and high-precision measurements, FDR provides the means to adaptively control a scientifically meaningful quantity: the number of false discoveries made in conducting multiple hypothesis tests.
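To make the procedure concrete: the FDR method is the Benjamini-Hochberg step-up test, which sorts the $N$ p-values and rejects every test whose p-value is at or below the largest $p_{(j)}$ satisfying $p_{(j)} \le j\,\alpha/N$. The following is a minimal sketch in Python; the function name bh_threshold and the simulated p-values are illustrative assumptions, not taken from the report.

import numpy as np

def bh_threshold(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up test: return the p-value cutoff that
    controls the expected fraction of false rejections at level alpha."""
    p = np.sort(np.asarray(p_values))
    m = len(p)
    # Compare each sorted p-value p_(j) to its linear bound j * alpha / m.
    below = p <= alpha * np.arange(1, m + 1) / m
    if not below.any():
        return 0.0  # no test survives its bound; reject nothing
    # Reject every test up to the largest p-value under its bound.
    return p[np.nonzero(below)[0].max()]

# Hypothetical example: 1000 pure-noise tests plus 50 genuine sources.
rng = np.random.default_rng(0)
p_all = np.concatenate([rng.uniform(size=1000),            # null pixels
                        rng.uniform(0.0, 1e-3, size=50)])  # signal pixels
cutoff = bh_threshold(p_all, alpha=0.05)
print(f"reject p <= {cutoff:.2e}: {np.sum(p_all <= cutoff)} detections")

Unlike a fixed $2\sigma$ cut, the cutoff returned here adapts to the data: on average no more than a fraction $\alpha$ of the reported detections are false, however many tests are run.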






