A Fast Clustering Algorithm with Application to Cosmology

Woncheol Jang


We present a fast clustering algorithm for density contour clusters (Hartigan, 1975) that is a modified version of the Cuevas, Febrero and Fraiman (2000) algorithm. By Hartigan's definition, clusters are the connected components of a level set $S_c \equiv \{f \geq c\}$, where $f$ is the probability density function. We use kernel density estimators and orthogonal series estimators to estimate $f$, and we modify the Cuevas, Febrero and Fraiman (2000) algorithm to extract the connected components from the level set estimators $\hat{S}_c \equiv \{\hat{f} \geq c\}$. Unlike the original algorithm, our method does not require an extra smoothing parameter and can use the Fast Fourier Transform (FFT) to speed up the calculations. We show that the cosmological definition of clusters of galaxies is equivalent to density contour clusters and present an application in cosmology.
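
To make the definition concrete, the following minimal sketch estimates $\hat{f}$ on a regular grid with a binned Gaussian kernel estimator computed by FFT, thresholds it at level $c$, and labels the connected components of the resulting $\hat{S}_c$. It is an illustration under stated assumptions, not the modified Cuevas, Febrero and Fraiman algorithm of this report: the grid-based flood fill stands in for the component-extraction step, and the function names (`fft_kde_grid`, `label_level_set`), the bandwidth, the grid size, and the simulated data are all hypothetical choices made for the example.

```python
from collections import deque

import numpy as np


def fft_kde_grid(points, bins=64, bandwidth=0.05, extent=(0.0, 1.0)):
    """Binned Gaussian kernel density estimate on a square 2-D grid via FFT.

    Assumption: the convolution is circular, so the data should lie well
    inside `extent` to avoid wrap-around at the boundary.
    """
    lo, hi = extent
    counts, _, _ = np.histogram2d(
        points[:, 0], points[:, 1], bins=bins, range=[[lo, hi], [lo, hi]]
    )
    step = (hi - lo) / bins
    # Grid offsets in wrap-around order: 0, step, 2*step, ..., -2*step, -step.
    offsets = np.fft.fftfreq(bins, d=1.0 / bins) * step
    g = np.exp(-0.5 * (offsets / bandwidth) ** 2)
    kernel = np.outer(g, g)
    kernel /= kernel.sum() * step**2  # kernel integrates to 1 on the grid
    smoothed = np.fft.ifft2(np.fft.fft2(counts) * np.fft.fft2(kernel))
    return np.real(smoothed) / len(points)


def label_level_set(density, c):
    """Label 4-connected components of the grid level set {density >= c}."""
    mask = density >= c
    labels = np.zeros(mask.shape, dtype=int)
    n_clusters = 0
    for start in zip(*np.nonzero(mask)):
        if labels[start]:
            continue  # cell already belongs to a labelled component
        n_clusters += 1
        labels[start] = n_clusters
        queue = deque([start])
        while queue:  # breadth-first flood fill from the seed cell
            i, j = queue.popleft()
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ni, nj = i + di, j + dj
                inside = 0 <= ni < mask.shape[0] and 0 <= nj < mask.shape[1]
                if inside and mask[ni, nj] and not labels[ni, nj]:
                    labels[ni, nj] = n_clusters
                    queue.append((ni, nj))
    return labels, n_clusters


# Simulated data (hypothetical): two well-separated Gaussian groups.
rng = np.random.default_rng(0)
points = np.vstack(
    [
        rng.normal(loc=0.3, scale=0.03, size=(500, 2)),
        rng.normal(loc=0.7, scale=0.03, size=(500, 2)),
    ]
)
density = fft_kde_grid(points, bins=64, bandwidth=0.05)
labels, n_clusters = label_level_set(density, c=5.0)
print("clusters found at level c = 5:", n_clusters)
```

Raising or lowering `c` changes which components appear, which is why the level is part of the cluster definition. The bandwidth here plays the role of the density estimator's smoothing parameter; the sketch introduces no second smoothing parameter for the component-extraction step.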

Keywords: Density contour cluster; clustering; Fast Fourier Transform

Heidi Sestrich 2004-06-18