In many modern applications, including analysis of gene expression and text documents, the data are noisy, high-dimensional, and unordered -- with no particular meaning to the given order of the variables. Yet, successful learning is often possible due to sparsity; the fact that the data are typically redundant with underlying structures that can be represented by only a few features. In this paper we present treelets -- a novel construction of multi-scale orthonormal bases that extends wavelets to non-smooth signals. Treelets capture the internal structure of the data and can as a dimensionality reduction tool significantly improve inference and prediction. We examine a variety of situations where treelets outperform principal component analysis and some common variable selection methods. The proposed method is illustrated on a linear mixture model, and on two real data sets: internet advertisements and DNA microarray data.