656

# VARIABLE FUSION: A NEW ADAPTIVE SIGNAL REGRESSION METHOD

**Stephanie R. Land and Jerome H. Friedman**

### Abstract:

Signal and image processing are active areas of research in both
statistics and engineering. Most of this research has emphasized the
reconstruction of a "true" underlying pattern from one measured with
noise. Our research has a different goal: recognition or prediction
of an ancillary quantity **y** associated with each observed pattern.
We propose a nonlinear regularized regression technique, *variable
fusion*. Variable fusion produces models of a simple
parsimonious form, readily explained to the non-statistician and
possibly affording savings in data collection. In addition, variable
fusion models perform well in terms of prediction. In this paper we assume
that the quantity **y** is real and single-valued and the pattern
is a "signal", i.e., the space of index values **t** is
one-dimensional, although we describe the generalization of the method
to a multidimensional index space. We use the patterns as the
predictors of **y**. The patterns generally originate as analog signals
and are measured at a large set of discrete index values, giving rise
to a correspondingly large set of predictor variables. The problem is
therefore ill-posed and requires regularization. Variable fusion
regularizes by exploiting the spatial nature of the predictor variable
index through powerful variable-bandwidth nonlinear smoothing analogs
based on adaptive splines. We compare this method with partial
least squares, ridge regression and cubic spline smoothing. The first
two of these methods apply regularization that is equivariant to the
labeling of the predictors. They therefore ignore the spatial nature
of the predictor index. The last method, cubic spline smoothing, is
linear and cannot adapt to sharp structure as readily as a
nonlinear method can. We compare the methods in Monte Carlo simulation
and on two examples: phoneme classification based on log-periodograms
of spoken words, and prediction of a post-traumatic stress disorder
diagnostic test score from psychometric data.

**Keywords:** pattern regression, adaptive splines, cubic spline
regression, ridge regression, partial least squares.
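
The remark that ridge regression is equivariant to the labeling of the predictors can be checked numerically. The following is a minimal sketch, not taken from the report: the data are random placeholders, `ridge_fit` is a hypothetical helper implementing the standard closed-form ridge solution without an intercept, and the first-difference roughness measure is a generic smoothness penalty used only for contrast, not the variable fusion penalty itself.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge coefficients (X'X + lam*I)^{-1} X'y, no intercept."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

rng = np.random.default_rng(0)
n, p = 30, 100                     # more predictors than observations
X = rng.standard_normal((n, p))    # placeholder "signals", one row per pattern
y = rng.standard_normal(n)         # placeholder response
perm = rng.permutation(p)          # scramble the index values t

beta = ridge_fit(X, y, lam=1.0)
beta_perm = ridge_fit(X[:, perm], y, lam=1.0)

# Ridge: the coefficients are merely relabeled and the fit is unchanged.
coef_match = np.allclose(beta[perm], beta_perm)
fit_match = np.allclose(X @ beta, X[:, perm] @ beta_perm)

# Contrast: a smooth coefficient curve has the same ridge penalty after
# scrambling, but a far larger first-difference (roughness) penalty.
t = np.linspace(0.0, 1.0, p)
smooth = np.sin(2.0 * np.pi * t)
ridge_pen_same = np.isclose(np.sum(smooth**2), np.sum(smooth[perm]**2))
rough_before = np.sum(np.diff(smooth) ** 2)
rough_after = np.sum(np.diff(smooth[perm]) ** 2)

print(coef_match, fit_match, ridge_pen_same, rough_after > rough_before)
```

Because the ridge penalty depends only on the sizes of the coefficients, it cannot tell an ordered signal from a scrambled one. A penalty defined on neighboring index values can, and that ordering is the spatial information the abstract says variable fusion exploits.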
