In many modern applications, including analysis of gene expression
and text documents, the data are noisy, high-dimensional, and
unordered -- with no particular meaning to the given order of the
variables. Yet, successful learning is often possible due to
sparsity; the fact that the data are typically redundant with
underlying structures that can be represented by only a few
features. In this paper we present treelets -- a novel construction
of multi-scale orthonormal bases that extends wavelets to non-smooth
signals. Treelets capture the internal structure of the data and can
as a dimensionality reduction tool significantly improve inference
and prediction. We examine a variety of situations where treelets
outperform principal component analysis and some common variable
selection methods. The proposed method is illustrated on a linear
mixture model, and on two real data sets: internet advertisements
and DNA microarray data.