Introduction

Suppose we are attempting to match two images, each with a different amount of seeing and noise. The first we call the reference image {R}; it is obtained with no noise. The second is called the science image {S}; it generally has more severe seeing and contains noise. We wish to find some transformation that maps {R} to {S} such that the difference is noise-like when there are no additional sources.

In particular, we posit a family of operators {(K_{\lambda})_{\lambda \in \Lambda}} and a {\sigma > 0} such that there exists a {K_{\lambda_0}} with {\mathbb{E}[S] = \mathbb{E}[K_{\lambda_0}R + \sigma Z] = K_{\lambda_0}R}, where {Z} is standard Gaussian white noise. Our goal is to choose {\hat\lambda \in \Lambda} based only on the data.
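To fix ideas, here is a minimal Python sketch of this model, assuming (as in the calculation at the end of this post) that {K_\lambda} acts by Gaussian convolution; the grid size, {\lambda_0}, and {\sigma} are illustrative choices, and scipy's gaussian_filter (a normalized kernel parameterized by its standard deviation) merely stands in for {K_\lambda}.

    import numpy as np
    from scipy.ndimage import gaussian_filter

    rng = np.random.default_rng(0)

    # Noise-free reference image R: a single point source on a 64x64 grid.
    R = np.zeros((64, 64))
    R[32, 32] = 1.0

    # Science image S = K_{lambda_0} R + sigma * Z.  gaussian_filter takes a
    # standard deviation, so we pass sqrt(lambda_0) for a kernel of variance lambda_0.
    lambda_0 = 4.0   # illustrative true kernel variance
    sigma = 0.01     # illustrative noise level
    Z = rng.standard_normal(R.shape)
    S = gaussian_filter(R, np.sqrt(lambda_0)) + sigma * Z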

1. Result

We propose a multiresolution noise-like statistic based on an idea in Davies and Kovac (2001), defined as:

\displaystyle  NL(\lambda, \mathcal{I}) := \sup_{I \in \mathcal{I}}\frac{1}{\sqrt{|I|}} \left| \sum_{i \in I} (K_\lambda R - S)_i \right|

where {\mathcal{I}} is a multiresolution analysis of the pixelized grid and {\lambda \in \Lambda}. Note that we suppress the multiplicative factor {\frac{1}{\sqrt{|I|}}} whenever it is not explicitly needed.
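As a concrete (and purely illustrative) choice, take {\mathcal{I}} to be the set of dyadic squares of the grid; continuing the Python sketch above, the statistic can then be computed as follows.

    def dyadic_squares(n):
        """All dyadic squares of an n x n grid (n a power of two), as slice pairs."""
        squares = []
        size = n
        while size >= 1:
            for r in range(0, n, size):
                for c in range(0, n, size):
                    squares.append((slice(r, r + size), slice(c, c + size)))
            size //= 2
        return squares

    def NL(residual, index_sets):
        """sup over I of |sum over I of residual| / sqrt(|I|)."""
        return max(abs(residual[I].sum()) / np.sqrt(residual[I].size)
                   for I in index_sets)

    # Residual K_lambda R - S for a trial value of lambda.
    lam = 3.0   # illustrative trial kernel variance
    print(NL(gaussian_filter(R, np.sqrt(lam)) - S, dyadic_squares(64)))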

Observe that we can write this as

\displaystyle  NL(\lambda, \mathcal{I}) = NL(\lambda) = \sup_{I \in \mathcal{I}} \left| \sum_{i \in I} (K_\lambda - K_{\lambda_0})R + \sigma Z_i \right|  \ \ \ \ \ (1)

by adding and subtracting {K_{\lambda_0}R}. Note that the pixel index {i} is suppressed in the first term for notational clarity. Also, when {\mathcal{I}} is fixed, we suppress that argument.
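Explicitly, writing {S = K_{\lambda_0}R + \sigma Z},

\displaystyle  K_\lambda R - S = (K_\lambda - K_{\lambda_0})R - \sigma Z \stackrel{d}{=} (K_\lambda - K_{\lambda_0})R + \sigma Z,

where the last step uses the symmetry of the noise {Z}, so the statistic has the same distribution with either sign of the noise term.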

Now, one quality this statistic could have is the ability to distinguish asymptotically between competing hypotheses. In this setting, low-noise asymptotics is more natural than large-sample asymptotics, so we work in the regime {\sigma \rightarrow 0}.

Our goal is to study the power of this statistic to discriminate among hypotheses asymptotically. It is known (DasGupta 2008) that asymptotics for a fixed alternative hypothesis lead to trivial results, such as the power always tending to 1.

Hence, we wish to look at an analogue of the Pitman slope. This can be phrased as follows. Let {\tau > 0} be given. Then we want to examine

\displaystyle  \lim_{\sigma \rightarrow 0} \mathbb{P} \left( \frac{NL(\lambda_0 + \Delta C(\sigma))}{NL(\lambda_0)} > \tau \right)  \ \ \ \ \ (2)

where {C(\sigma)} is a function going to zero with {\sigma} and {\Delta} is a constant. We look at the ratio of the statistic under the alternative and null hypotheses as a way of rescaling. Alternatively, we could make {\tau} a function of {\sigma}; we will see in what follows that the ratio in effect chooses that function.
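One can estimate the probability in (2) by Monte Carlo; the sketch below (continuing the Python examples above, with {C(\sigma) = \sigma^{\alpha}} and all parameter values illustrative) does so along a decreasing sequence of {\sigma}.

    def ratio_prob(alpha, sigma, Delta=1.0, tau=1.5, trials=200):
        """Estimate P( NL(lambda_0 + Delta*C(sigma)) / NL(lambda_0) > tau )
        for C(sigma) = sigma**alpha, by simulating the science image."""
        squares = dyadic_squares(64)
        KR_null = gaussian_filter(R, np.sqrt(lambda_0))
        KR_alt = gaussian_filter(R, np.sqrt(lambda_0 + Delta * sigma**alpha))
        hits = 0
        for _ in range(trials):
            S_sim = KR_null + sigma * rng.standard_normal(R.shape)
            hits += NL(KR_alt - S_sim, squares) > tau * NL(KR_null - S_sim, squares)
        return hits / trials

    for s in (1e-1, 1e-2, 1e-3):
        print(s, ratio_prob(alpha=0.5, sigma=s))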

Lemma 1 We can rewrite (2) as

\displaystyle  \begin{array}{rcl}  \lim_{\sigma \rightarrow 0} \mathbb{P} \left( \frac{NL(\lambda_0 + \Delta C(\sigma))}{NL(\lambda_0)} > \tau \right) & = & \lim_{\sigma \rightarrow 0} \mathbb{P} \left( \frac{ \sup_{I \in \mathcal{I}} \left| \sum_{i \in I} (K_{\lambda_0 + \Delta C(\sigma)} - K_{\lambda_0})R + \sigma Z_i \right| } { \sup_{I \in \mathcal{I}} \left| \sum_{i \in I} Z_i \right| } > \sigma \tau \right) \end{array}

Proof: Use (1) and factor {\sigma} out of the denominator. \Box
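In more detail, under the null hypothesis (1) reduces to

\displaystyle  NL(\lambda_0) = \sup_{I \in \mathcal{I}} \left| \sum_{i \in I} \sigma Z_i \right| = \sigma \sup_{I \in \mathcal{I}} \left| \sum_{i \in I} Z_i \right|,

and moving the factor of {\sigma} to the right-hand side of the inequality gives the displayed form.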

This probability seems difficult to calculate, even asymptotically. Hence, as a heuristic, we use the fact that the denominator inside the probability no longer involves {\sigma}, and consider the following instead:

\displaystyle  \lim_{\sigma \rightarrow 0} \mathbb{P} \left( \sup_{I \in \mathcal{I}} \left| \sum_{i \in I} (K_{\lambda_0 + \Delta C(\sigma)} - K_{\lambda_0})R + \sigma Z_i \right| > \sigma \tau \right)

Before continuing, we need a result for exchanging {\sup} and {\mathbb{P}}:

Lemma 2 Let {(X_t)_{t \in T}} be a collection of random variables over some index set {T} such that {\sup_t X_t = X_{t_*}} for some {t_* \in T}. Then for any {\tau > 0}

\displaystyle  \mathbb{P} ( \sup_t X_t > \tau) \geq \sup_t \mathbb{P}( X_t > \tau)

Proof: Write {\mathbb{P} ( \sup_t X_t > \tau) = \mathbb{E} \mathbf{1}(\sup_t X_t > \tau)}. Now, since {\mathbf{1}(\sup_t X_t > \tau) = \sup_t \mathbf{1}(X_t > \tau)}, we see that

\displaystyle  \mathbb{P} ( \sup_t X_t > \tau) = \mathbb{E} \sup_t \mathbf{1}( X_t > \tau) \geq \sup_t \mathbb{E} \mathbf{1}( X_t > \tau)

where for the last inequality we use that {\sup_t \int f_t \leq \int \sup_t f_t}, which holds whenever {\sup_t f_t} is measurable, as is the case for the indicator functions here. \Box
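As a quick numerical sanity check of Lemma 2 (not needed for the argument), one can compare the two sides for, say, an exchangeable correlated Gaussian family; the construction below is purely illustrative.

    # Check P(sup_t X_t > tau) >= sup_t P(X_t > tau) for a correlated Gaussian family.
    tau = 1.0
    Sigma = 0.5 * np.eye(5) + 0.5 * np.ones((5, 5))     # exchangeable covariance
    X = rng.standard_normal((100000, 5)) @ np.linalg.cholesky(Sigma).T
    lhs = (X.max(axis=1) > tau).mean()                  # P(sup_t X_t > tau)
    rhs = (X > tau).mean(axis=0).max()                  # sup_t P(X_t > tau)
    print(lhs, rhs, lhs >= rhs)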

Using Lemma 2, we can write

\displaystyle  \mathbb{P} \left( \sup_{I \in \mathcal{I}} \left| \sum_{i \in I} (K_{\lambda_0 + \Delta C(\sigma)} - K_{\lambda_0})R + \sigma Z_i \right| > \sigma \tau \right) \geq \sup_{I \in \mathcal{I}} \mathbb{P} \left( \left| \sum_{i \in I} (K_{\lambda_0 + \Delta C(\sigma)} - K_{\lambda_0})R + \sigma Z_i \right| > \sigma \tau \right).  \ \ \ \ \ (3)

Now, we would like to examine the {C(\sigma)} such that the RHS of (3) {\stackrel{\sigma \rightarrow 0}{\rightarrow} 1}. First, we can compute the RHS of (3) as follows. Define {\mu_{I,\sigma} := \sum_{i \in I}(K_{\lambda_0 + \Delta C(\sigma)} - K_{\lambda_0})R}. Then

\displaystyle  \mathbb{P} \left( \left| \sum_{i \in I} (K_{\lambda_0 + \Delta C(\sigma)} - K_{\lambda_0})R + \sigma Z_i \right| > \sigma \tau \right) = 1 + \Phi\left( -\frac{1}{\sqrt{|I|}}\left( \tau + \frac{\mu_{I,\sigma}}{\sigma} \right) \right) - \Phi\left( \frac{1}{\sqrt{|I|}}\left( \tau - \frac{\mu_{I,\sigma}}{\sigma} \right) \right)  \ \ \ \ \ (4)

by noticing that the sum inside the absolute value is a {N\left(\mu_{I,\sigma},\sigma^2 |I|\right)} random variable.
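Identity (4) is just the Gaussian tail formula; as a quick check, the sketch below compares it to a direct Monte Carlo estimate for arbitrary (illustrative) values of {\mu_{I,\sigma}}, {\sigma}, {\tau} and {|I|}.

    from scipy.stats import norm

    mu, sig, tau0, card = 0.3, 0.2, 1.5, 16   # illustrative mu_{I,sigma}, sigma, tau, |I|
    # Closed form (4): the sum inside |.| is N(mu, sig**2 * card).
    closed = (1 + norm.cdf(-(tau0 + mu / sig) / np.sqrt(card))
                - norm.cdf((tau0 - mu / sig) / np.sqrt(card)))
    # Direct Monte Carlo estimate of P(|N(mu, sig**2 * card)| > sig * tau0).
    draws = mu + sig * np.sqrt(card) * rng.standard_normal(1_000_000)
    print(closed, (np.abs(draws) > sig * tau0).mean())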

Using that {\liminf_m \sup_n x_{m,n} \geq \sup_n \liminf_m x_{m,n}} for any doubly indexed sequence {x_{m,n}}, we see by combining (3) and (4) that

\displaystyle  \begin{array}{rcl}  \lim_{\sigma \rightarrow 0} \mathbb{P} \left( \sup_{I \in \mathcal{I}} \left| \sum_{i \in I} (K_{\lambda_0 + \Delta C(\sigma)} - K_{\lambda_0})R + \sigma Z_i \right| > \sigma \tau \right) & \geq & \sup_{I \in \mathcal{I}} \lim_{\sigma \rightarrow 0} \bigg[ 1 + \Phi\left( -\frac{1}{\sqrt{|I|}}\left( \tau + \frac{\mu_{I,\sigma}}{\sigma} \right) \right)- \\ && \Phi\left( \frac{1}{\sqrt{|I|}}\left( \tau - \frac{\mu_{I,\sigma}}{\sigma} \right) \right)\bigg]. \end{array}

Now, we see that this probability goes to 1 when {\mu_{I,\sigma}/\sigma \rightarrow \infty}. We carried out this calculation for the case where the kernel of {K_{\lambda}} is a non-normalized Gaussian with variance {\lambda}, for all {\lambda \in \Lambda}.

The result was that

\displaystyle  \frac{\mu_{I,\sigma}}{\sigma} \rightarrow \infty \quad \textrm{if} \quad C'(\sigma) \rightarrow \infty

and

\displaystyle  \frac{\mu_{I,\sigma}}{\sigma} \rightarrow 0 \quad \textrm{if} \quad C'(\sigma) \rightarrow 0

However, the second of the two results is not informative, since (3) only bounds the power from below. If we additionally assume that {C(\sigma) = \sigma^\alpha} for {\alpha > 0}, we get

\displaystyle  \frac{\mu_{I,\sigma}}{\sigma} \rightarrow \infty \quad \textrm{if} \quad \alpha \in (0,1)

and

\displaystyle  \frac{\mu_{I,\sigma}}{\sigma} \rightarrow 0 \quad \textrm{if} \quad \alpha > 1.

We’re not sure what happens if {\alpha = 1}.
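To illustrate the trichotomy numerically, the following sketch (continuing the Python examples, with scipy's normalized Gaussian filter again standing in for the kernel) tracks {|\mu_{I,\sigma}|/\sigma} for {C(\sigma) = \sigma^{\alpha}} with {\alpha} below, at, and above 1, using a fixed index set {I}.

    # Track |mu_{I,sigma}| / sigma for C(sigma) = sigma**alpha as sigma -> 0.
    # (The absolute value is harmless here: the event in (4) is two-sided.)
    I0 = (slice(28, 36), slice(28, 36))   # a fixed 8x8 box around the source
    Delta = 1.0
    KR0 = gaussian_filter(R, np.sqrt(lambda_0))
    for alpha in (0.5, 1.0, 1.5):
        for s in (1e-1, 1e-2, 1e-3, 1e-4):
            mu = (gaussian_filter(R, np.sqrt(lambda_0 + Delta * s**alpha)) - KR0)[I0].sum()
            print(alpha, s, abs(mu) / s)

In this illustrative computation the ratio grows for {\alpha = 0.5} and vanishes for {\alpha = 1.5}, as above, while for {\alpha = 1} it appears to settle near a nonzero constant, consistent with {\mu_{I,\sigma} \approx c \cdot C(\sigma)} for small {C(\sigma)}; we do not pursue the boundary case further here.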
