# Linear Methods for Noise Removal and Filtering

27 September 2018

$\newcommand{\Expect}[1]{\mathbb{E}\left[ #1 \right]} \newcommand{\Var}[1]{\mathrm{Var}\left[ #1 \right]} \newcommand{\SampleVar}[1]{\widehat{\mathrm{Var}}\left[ #1 \right]} \newcommand{\Cov}[1]{\mathrm{Cov}\left[ #1 \right]} \newcommand{\TrueRegFunc}{\mu} \newcommand{\EstRegFunc}{\widehat{\TrueRegFunc}} \DeclareMathOperator{\tr}{tr} \DeclareMathOperator*{\argmin}{argmin} \DeclareMathOperator{\det}{det} \newcommand{\TrueNoise}{\epsilon} \newcommand{\EstNoise}{\widehat{\TrueNoise}} \newcommand{\SignalNoise}{N}$

# In our previous episodes

• General approach to optimal linear prediction
  • Predict $$Y$$ from $$\vec{Z} = [Z_1, Z_2, \ldots Z_p]$$
  • Prediction is $$\alpha + \vec{\beta} \cdot \vec{Z}$$
  • Best $$\alpha = \Expect{Y} - \vec{\beta} \cdot \Expect{\vec{Z}}$$
  • Best $$\vec{\beta} = \Var{\vec{Z}}^{-1} \Cov{\vec{Z}, Y}$$
• Previously:
  • Interpolating and extrapolating time series
  • Interpolating and extrapolating random fields
• Today:
  • Extracting a random signal from random noise
  • Extracting a deterministic, periodic trend from noise
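The two formulas in the recap can be made concrete in a few lines. A minimal sketch for $$p=2$$ predictors, using the closed-form $$2\times 2$$ inverse; all the moment values below are made-up illustrations, not from any data set:

```python
# Optimal linear prediction with p = 2 predictors.
# All moments below are illustrative assumptions.

EY = 1.0                      # E[Y]
EZ = [0.5, -0.2]              # E[Z1], E[Z2]
VarZ = [[2.0, 0.6],           # Var[Z]: 2x2 covariance matrix of Z
        [0.6, 1.5]]
CovZY = [0.8, 0.3]            # Cov[Z1, Y], Cov[Z2, Y]

# beta = Var[Z]^{-1} Cov[Z, Y], via the closed-form 2x2 inverse
a, b = VarZ[0]
c, d = VarZ[1]
det = a * d - b * c
beta = [( d * CovZY[0] - b * CovZY[1]) / det,
        (-c * CovZY[0] + a * CovZY[1]) / det]

# alpha = E[Y] - beta . E[Z]
alpha = EY - (beta[0] * EZ[0] + beta[1] * EZ[1])

def predict(z1, z2):
    """Optimal linear prediction of Y from Z = (z1, z2)."""
    return alpha + beta[0] * z1 + beta[1] * z2

print(beta, alpha)
```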

# Optimal linear filtering for time series

• Given: $$X(t_1), X(t_2), \ldots X(t_n)$$
• Assumption: $$X(t) = S(t) + \SignalNoise(t)$$, with $$\Expect{\SignalNoise(t)} = 0$$
  • Signal $$S(t)$$ plus noise $$\SignalNoise(t)$$
  • NOTE: Assuming the signal $$S$$ is also a random process!
• Desired: prediction of $$S(t_0)$$
  • This is filtering

# Optimal linear filtering for time series

$\begin{eqnarray} \hat{S}(t_0) & = & \alpha + \vec{\beta} \cdot \left[\begin{array}{c} X(t_1) \\ X(t_2) \\ \vdots \\ X(t_n) \end{array}\right]\\ \alpha & = & \Expect{S(t_0)} - \vec{\beta} \cdot \left[\begin{array}{c} \Expect{X(t_1)}\\ \Expect{X(t_2)} \\ \vdots \\ \Expect{X(t_n)}\end{array}\right] ~ \text{(goes away if everything's centered)}\\ \vec{\beta} & = & {\left[\begin{array}{cccc} \Var{X(t_1)} & \Cov{X(t_1), X(t_2)} & \ldots & \Cov{X(t_1), X(t_n)}\\ \Cov{X(t_1), X(t_2)} & \Var{X(t_2)} & \ldots & \Cov{X(t_2), X(t_n)}\\ \vdots & \vdots & \ldots & \vdots\\ \Cov{X(t_1), X(t_n)} & \Cov{X(t_2), X(t_n)} & \ldots & \Var{X(t_n)}\end{array}\right]}^{-1} \left[\begin{array}{c} \Cov{S(t_0), X(t_1)}\\ \Cov{S(t_0), X(t_2)}\\ \vdots \\ \Cov{S(t_0), X(t_n)}\end{array}\right] \end{eqnarray}$

# This should look familiar

• It’s almost the same as optimal linear prediction for time series
• It’s exactly the same as optimal linear prediction of one time series from another

# Simple case, to build intuition

• Assume $$S$$ is stationary, with $$\Expect{S(t)} = 0$$, autocovariance function $$\gamma(h)$$
• Assume $$\SignalNoise$$ is stationary, with variance $$\sigma^2$$ and no autocorrelation or correlation with $$S$$
$\begin{eqnarray} \Cov{X(t), X(t+h)} & = & \Cov{S(t)+\SignalNoise(t), S(t+h)+\SignalNoise(t+h)}\\ & = & \Cov{S(t), S(t+h)} + \Cov{\SignalNoise(t), \SignalNoise(t+h)}\\ & & + \Cov{S(t), \SignalNoise(t+h)} + \Cov{S(t+h), \SignalNoise(t)}\\ & = & \gamma(h) + \sigma^2\mathbf{1}(h=0)\\ \Cov{S(t), X(t+h)} & = & \Cov{S(t), S(t+h) + \SignalNoise(t+h)}\\ & = & \Cov{S(t), S(t+h)} + \Cov{S(t), \SignalNoise(t+h)}\\ & = & \gamma(h) \end{eqnarray}$
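A quick Monte Carlo sanity check of these identities, under one concrete assumed model: $$S$$ is an AR(1) process, $$\SignalNoise$$ is white noise independent of $$S$$; every parameter value is illustrative.

```python
# Monte Carlo check: Cov[X(t), X(t+h)] = gamma(h) + sigma^2 1(h=0),
# and Cov[S(t), X(t+h)] = gamma(h).  Assumed model: AR(1) signal,
# independent white noise; gamma(h) = phi^|h| tau^2 / (1 - phi^2).
import math
import random

random.seed(0)
phi, tau, sigma = 0.7, 1.0, 0.5     # AR(1) coefficient, innovation sd, noise sd
n = 200_000

# Simulate S(t) = phi S(t-1) + innovation, started at stationarity
s = [random.gauss(0, tau / math.sqrt(1 - phi**2))]
for _ in range(n - 1):
    s.append(phi * s[-1] + random.gauss(0, tau))
x = [si + random.gauss(0, sigma) for si in s]   # X = S + N

def cov(a, b):
    """Sample covariance of two equal-length series."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / len(a)

gamma0 = tau**2 / (1 - phi**2)      # Var[S(t)]
gamma1 = phi * gamma0               # Cov[S(t), S(t+1)]

print(cov(x, x), gamma0 + sigma**2)     # h = 0: noise variance shows up
print(cov(x[:-1], x[1:]), gamma1)       # h = 1: noise drops out
print(cov(s, x), gamma0)                # signal-observation cross-covariance
```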

# One observation, one estimate

• Given: $$X(t)$$
• Desired: $$S(t)$$
$\begin{eqnarray} \hat{S}(t) & = & \beta X(t)\\ \beta & = & \frac{\Cov{S(t), X(t)}}{\Var{X(t)}}\\ &= & \frac{\gamma(0)}{\gamma(0) + \sigma^2} < 1\\ \hat{S}(t) & = & \frac{\gamma(0)}{\gamma(0) + \sigma^2} X(t) \end{eqnarray}$
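In code, the one-observation filter is just a shrinkage factor; the numbers below are illustrative:

```python
# One observation, one estimate: S-hat(t) = beta X(t) with
# beta = gamma(0) / (gamma(0) + sigma^2) < 1, so we always shrink
# the observation toward the (zero) mean of the signal.
gamma0 = 4.0        # Var[S(t)]  (illustrative)
sigma2 = 1.0        # noise variance (illustrative)

beta = gamma0 / (gamma0 + sigma2)   # = 0.8 here
x_t = 2.5                           # one made-up observed value
s_hat = beta * x_t

print(beta, s_hat)
# Noisier data (larger sigma2) means a smaller beta, i.e. more shrinkage:
assert gamma0 / (gamma0 + 10.0) < beta
```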

# Two observations, one estimate

• Given: $$X(t-1)$$, $$X(t)$$
• Desired: $$S(t)$$

• Estimate is $\begin{eqnarray} \hat{S}(t) & = & \beta_0 X(t) + \beta_1 X(t-1)\\ \left[ \begin{array}{c} \beta_0 \\ \beta_1 \end{array}\right] & = & \left[\begin{array}{cc} \Var{X(t)} & \Cov{X(t), X(t-1)}\\ \Cov{X(t-1), X(t)} & \Var{X(t-1)} \end{array}\right]^{-1} \left[\begin{array}{c} \Cov{X(t), S(t)} \\ \Cov{X(t-1), S(t)}\end{array} \right] \end{eqnarray}$

EXERCISE: Find $$\beta_0, \beta_1$$ in terms of the function $$\gamma$$ and $$\sigma^2$$

Hint 1: $$\left[\begin{array}{cc} a & b \\ c & d\end{array}\right]^{-1} = \frac{1}{ad - bc}\left[\begin{array}{cc} d & -b \\ -c & a\end{array}\right]$$

Hint 2: $$\Cov{X(t), X(t+h)} = \gamma(h) + \sigma^2\mathbf{1}(h=0)$$ and $$\Cov{S(t), X(t+h)} = \gamma(h)$$

# Two observations, one estimate

$\begin{eqnarray} \left[ \begin{array}{c} \beta_0 \\ \beta_1 \end{array}\right] & = & \left[\begin{array}{cc} \Var{X(t)} & \Cov{X(t), X(t-1)}\\ \Cov{X(t-1), X(t)} & \Var{X(t-1)}\end{array}\right]^{-1} \left[\begin{array}{c} \Cov{X(t), S(t)} \\ \Cov{X(t-1), S(t)}\end{array} \right]\\ & = & \left[\begin{array}{cc} \gamma(0)+\sigma^2 & \gamma(1)\\ \gamma(1) & \gamma(0)+\sigma^2\end{array}\right]^{-1} \left[\begin{array}{c} \gamma(0) \\ \gamma(1) \end{array} \right]\\ &=& \frac{1}{(\gamma(0)+\sigma^2)^2-\gamma^2(1)}\left[\begin{array}{cc} \gamma(0)+\sigma^2 & -\gamma(1)\\ -\gamma(1) & \gamma(0)+\sigma^2\end{array}\right]\left[\begin{array}{c} \gamma(0) \\ \gamma(1) \end{array} \right]\\ & = & \frac{1}{(\gamma(0)+\sigma^2)^2-\gamma^2(1)}\left[\begin{array}{c} (\gamma(0)+\sigma^2)\gamma(0) - \gamma^2(1) \\ \gamma(1)\sigma^2\end{array}\right] \end{eqnarray}$
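A numeric check of the closed form against a direct solve of the $$2\times 2$$ system, with illustrative values for $$\gamma(0)$$, $$\gamma(1)$$, and $$\sigma^2$$:

```python
# Two observations, one estimate: compare a direct solve of the
# 2x2 linear system with the closed-form answer on the slide.
g0, g1, s2 = 4.0, 2.0, 1.0      # gamma(0), gamma(1), sigma^2 (illustrative)

# Direct solve of [[g0+s2, g1], [g1, g0+s2]] beta = [g0, g1]
a, b = g0 + s2, g1
det = a * a - b * b
beta0 = ( a * g0 - b * g1) / det
beta1 = (-b * g0 + a * g1) / det

# Closed form from the slide
denom = (g0 + s2) ** 2 - g1 ** 2
beta0_cf = ((g0 + s2) * g0 - g1 ** 2) / denom
beta1_cf = g1 * s2 / denom

print(beta0, beta1)
assert abs(beta0 - beta0_cf) < 1e-12 and abs(beta1 - beta1_cf) < 1e-12
```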

# What’s going on here?

• Slope on $$X(t)$$ is $$< 1$$ $$\Rightarrow$$ the estimate of $$S(t)$$ should not change 1-for-1 with $$X(t)$$
• Weak noise $$\Rightarrow$$ pay more attention to the data
• Strong correlation in the signal $$\Rightarrow$$ pay more attention to $$X(t-1)$$ when estimating $$S(t)$$
• Reduces to the one observation, one estimate case when $$\gamma(1) = 0$$

# The general pattern: the Wiener filter

• Assume $$X(t) = S(t) + \SignalNoise(t)$$
• Assume $$\SignalNoise$$ uncorrelated with $$S$$
• Then $$\Cov{X(t), X(t+h)} = \Cov{S(t), S(t+h)} + \Cov{\SignalNoise(t), \SignalNoise(t+h)}$$
• and $$\Cov{S(t), X(t+h)} = \Cov{S(t), S(t+h)}$$
• Calculate coefficients in the usual way
• Wiener filter = applying these coefficients to the same series to get $$\hat{S}(t)$$
• Need to know, or guess at, the covariances of either the signal or the noise to do de-noising
• Do not need any direct observations of the signal
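A small sketch of the whole pattern for $$n = 3$$ observations, assuming (as an illustration, not as the lecture's example) an exponentially decaying signal autocovariance and white noise:

```python
# Wiener filter for n = 3 observations.  Assumed second moments:
# gamma(h) = 2 * 0.5**|h| for the signal, white noise with sigma^2 = 1.
def gamma(h):
    return 2.0 * 0.5 ** abs(h)

sigma2 = 1.0
times = [0, 1, 2]
t0 = 1                      # estimate S at the middle time point

# Var[X] matrix: Cov[S(t_i), S(t_j)] plus sigma^2 on the diagonal
A = [[gamma(ti - tj) + (sigma2 if ti == tj else 0.0) for tj in times]
     for ti in times]
# Cov[S(t0), X(t_i)] = gamma(t0 - t_i)
b = [gamma(t0 - ti) for ti in times]

# Solve A beta = b by Gaussian elimination (no pivoting; fine for this A)
n = len(A)
M = [row[:] + [bi] for row, bi in zip(A, b)]
for i in range(n):
    for j in range(i + 1, n):
        f = M[j][i] / M[i][i]
        M[j] = [mj - f * mi for mj, mi in zip(M[j], M[i])]
beta = [0.0] * n
for i in reversed(range(n)):
    beta[i] = (M[i][n] - sum(M[i][j] * beta[j] for j in range(i + 1, n))) / M[i][i]

print(beta)   # filter weights: S-hat(t0) = beta . [X(0), X(1), X(2)]
```

Note the symmetry: the two observations equidistant from $$t_0$$ get equal weight.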

# Extracting periodic or cyclic patterns

• Back to the $$X(t) = \TrueRegFunc(t) + \TrueNoise(t)$$ representation
• Trend may be periodic, $$\TrueRegFunc(t) = \TrueRegFunc(t+T)$$ for period $$T$$
• Also call this cyclic or seasonal
• Can also have $$\TrueRegFunc(t) = m(t) + p(t)$$ with $$p$$ periodic and $$m$$ not
• How can we use this, if we know $$T$$?

# Estimating the periodic component

For simplicity, say we have $$X(0), X(1), \ldots X(n)$$, with $$n=kT$$

$\begin{eqnarray} \EstRegFunc(0) & = & \frac{1}{k}\sum_{i=0}^{k-1}{X(iT)}\\ \EstRegFunc(1) & = & \frac{1}{k}\sum_{i=0}^{k-1}{X(1+iT)}\\ \EstRegFunc(t) & = & \frac{1}{k}\sum_{i=0}^{k-1}{X(t+iT)}\\ & \vdots & \\ \EstRegFunc(T-1) & = & \frac{1}{k}\sum_{i=0}^{k-1}{X(T-1+iT)} \end{eqnarray}$

# Estimating the periodic component

```r
library(gstat)
data(wind)
dub.jan1 <- wind[wind$month == 1 & wind$day == 1, "DUB"]
mean(dub.jan1)
## [1] 12.67722
colMeans(wind[wind$month == 1 & wind$day == 1, -(1:3)])
##       RPT       VAL       ROS       KIL       SHA       BIR       DUB
## 14.610000 11.200556 12.931667  6.845556 10.798333  7.678333 12.677222
##       CLA       MUL       CLO       BEL       MAL
##  8.952778 10.057222  9.936667 13.987222 18.536667
```

Now repeat for every day of the calendar
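The "repeat for every phase of the cycle" loop can be sketched generically (toy data, not the wind set; period and values are made up):

```python
# Period-averaging: given x[0..n-1] with n = k*T, estimate the periodic
# component at phase t by averaging x[t], x[t+T], ..., x[t+(k-1)T].
T = 4                           # assumed known period
x = [1.0, 2.1, 3.0, 0.1,        # toy series: k = 3 full periods of a
     0.9, 1.9, 3.2, -0.1,       # period-4 pattern plus small perturbations
     1.1, 2.0, 2.8, 0.0]
k = len(x) // T

# mu-hat(t) = (1/k) sum_i x[t + i*T], one value per phase t = 0, ..., T-1
mu_hat = [sum(x[t + i * T] for i in range(k)) / k for t in range(T)]
print(mu_hat)

# Residual fluctuations: subtract the fitted periodic component
resid = [x[t] - mu_hat[t % T] for t in range(len(x))]
```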

# After extracting the periodic component

• Observation = trend + fluctuation $\begin{eqnarray} X(t) & = & \TrueRegFunc(t) + \TrueNoise(t)\\ X(t) & = & \EstRegFunc(t) + \EstNoise(t)\\ \EstNoise(t) & = & X(t) - \EstRegFunc(t) \end{eqnarray}$
• If we think $$\TrueRegFunc$$ is periodic, we can estimate it by averaging over periods
• Then remove the periodic component from the observations

# Summary

• Extracting a random signal from noise looks just like any other linear prediction problem
  • $$X(t) = S(t) + \SignalNoise(t)$$, predict $$S(t)$$ from $$X$$
  • Need to make assumptions about the noise, $$\Cov{\SignalNoise(t), \SignalNoise(t+h)}$$
• Extracting periodic components by averaging
  • $$X(t) = \TrueRegFunc(t) + \TrueNoise(t)$$, with $$\TrueRegFunc$$ deterministic
  • Periodic/cyclic/seasonal component means $$\TrueRegFunc(t) = \TrueRegFunc(t+T)$$
  • Estimate the periodic trend by averaging over periods
  • Need to know the period $$T$$
• Next week: how to analyze any process into a sum of periodic components with multiple periods
  • Helps with identifying $$T$$
  • Helps with covariance estimation
  • Helps with any sort of linear filtering
  • Changes how you see the world

# Trend + periodicity + fluctuations

$\begin{eqnarray} X(t) & = & m(t) + p(t) + \TrueNoise(t)\\ & = & \text{long-run trend} + \text{periodic component} + \text{random fluctuations} \end{eqnarray}$
• Smooth and remove the smooth trend
• Average over periods and remove the periodic component
• What’s left is $$\approx$$ fluctuations
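The three steps can be sketched on a toy series; the data, window width, and the one-sided moving average standing in for a real smoother are all assumptions made for illustration:

```python
# Trend + periodicity + fluctuations: smooth, then period-average, then
# keep the remainder.  Toy series: linear trend + period-4 cycle + wiggle.
T, k = 4, 10
n = T * k
x = [0.05 * t + [1.0, -1.0, 0.5, -0.5][t % T] + 0.01 * (-1) ** t
     for t in range(n)]

# Step 1: a moving average over one full period estimates m(t);
# averaging over a window of length T kills a zero-mean periodic component
m_hat = [sum(x[t:t + T]) / T for t in range(n - T + 1)]
detrended = [x[t] - m_hat[t] for t in range(len(m_hat))]

# Step 2: average the detrended series over periods to estimate p(t)
kk = len(detrended) // T
p_hat = [sum(detrended[t + i * T] for i in range(kk)) / kk for t in range(T)]

# Step 3: what's left is (approximately) the fluctuations
resid = [detrended[t] - p_hat[t % T] for t in range(len(detrended))]
print(p_hat)
```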