We introduced: Markov Chains, Monte Carlo simulation, Bootstrap method

Today: Applications to Probability

Where the Bayesian revolution takes over: drawing from complex probability distributions

  • rejection sampling
  • The Metropolis algorithm
  • General introduction to "Markov Chain Monte Carlo"

Optional reading: Geyer, "Practical Markov Chain Monte Carlo", Statistical Science 7 (1992): 473–483

One-dimensional probability distributions

We have [dpqr]{norm, binom, pois, …} for known distributions. For many good reasons, we'd like to be able to:

  • draw from complex distributions
  • find the solution to relevant integrals
  • eventually, go to many higher dimensions

What's available?

Integration by Simulation

Law of large numbers: if \(X_1, X_2, \ldots, X_n\) are all IID with p.d.f. \(p\),

\[ \frac{1}{n}\sum_{i=1}^{n}{f(X_i)} \rightarrow \mathbb{E}_{p}[f(X)] = \int{f(x) p(x) dx} \]

The Monte Carlo principle: to find \(\int{g(x) dx}\), draw from a density \(p\) that is positive wherever \(g\) is nonzero, and take the sample mean of \(f(x) = g(x)/p(x)\).

And it works, too

Central limit theorem:

\[ \frac{1}{n}\sum_{i=1}^{n}{\frac{g(x_i)}{p(x_i)}} \rightsquigarrow \mathcal{N}\left(\int{g(x)dx}, \frac{\sigma^2_{g/p}}{n}\right) \]

The Monte Carlo approximation to the integral is unbiased, with root mean square error \(\propto n^{-1/2}\) – just keep taking Monte Carlo draws, and the error gets as small as you like, even if \(g\) or \(x\) are very complicated.
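As a concrete sketch of the principle (written in Python for illustration), estimate \(\int_0^1 e^{-x^2} dx\) by drawing \(X_i \sim \mathrm{Unif}(0,1)\), so that \(p(x) = 1\) on the interval and the estimator is just the sample mean of \(g\):

```python
import math
import random

random.seed(42)

def g(x):
    return math.exp(-x * x)

# Draw from p = Uniform(0, 1); the Monte Carlo estimate of the
# integral is then the sample mean of g(X_i)
n = 100_000
estimate = sum(g(random.random()) for _ in range(n)) / n

# True value is (sqrt(pi)/2) * erf(1), about 0.7468
print(estimate)
```

With \(n = 10^5\) draws the root mean square error is on the order of \(10^{-3}\), consistent with the \(n^{-1/2}\) rate above.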

Taking unknown samples

Generating from \(p\) is easy if it's a standard distribution or we have a nice, invertible CDF (quantile method). What can we do if all we've got is the probability density function \(p\)?
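To illustrate the quantile method (a Python sketch): the exponential CDF \(F(x) = 1 - e^{-\lambda x}\) inverts to \(F^{-1}(u) = -\log(1-u)/\lambda\), so feeding uniforms through the inverse CDF yields exponential draws:

```python
import math
import random

random.seed(0)

def rexp_inverse(n, rate=1.0):
    """Draw n Exponential(rate) variates by the quantile (inverse-CDF) method."""
    # If U ~ Uniform(0,1), then -log(1 - U) / rate has the Exponential(rate) law
    return [-math.log(1.0 - random.random()) / rate for _ in range(n)]

draws = rexp_inverse(100_000, rate=2.0)
mean = sum(draws) / len(draws)
print(mean)  # should be near 1 / rate = 0.5
```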

Suppose the pdf \(p\) is zero outside an interval \([c,d]\), and \(\leq M\) on the interval.


Rejection sampling

We know how to draw from uniform distributions in any dimension. Do it in two:

x1 <- runif(100000, 0, 1)  # horizontal coordinate, uniform on the support [0, 1]
y1 <- runif(100000, 0, 3)  # vertical coordinate, uniform on [0, M] with M = 3
selected <- y1 < dbeta(x1, 3, 6)  # accept points falling under the Beta(3,6) density
accepted <- x1[selected]          # the accepted x's are draws from Beta(3,6)
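The same scheme can be written as a general routine (a Python sketch; the Beta(3,6) density is coded directly, using \(1/B(3,6) = 8!/(2!\,5!) = 168\), and its maximum is about 2.55, so the bound \(M = 3\) works):

```python
import random

random.seed(1)

def dbeta36(x):
    # Beta(3,6) density: x^2 (1-x)^5 / B(3,6), with 1/B(3,6) = 168
    return 168.0 * x**2 * (1.0 - x)**5

def rejection_sample(pdf, c, d, M, n):
    """Draw n points from pdf, assumed zero outside [c, d] and <= M there."""
    out = []
    while len(out) < n:
        x = random.uniform(c, d)  # proposal location
        y = random.uniform(0, M)  # uniform height in the bounding box
        if y < pdf(x):            # keep the point if it lies under the curve
            out.append(x)
    return out

draws = rejection_sample(dbeta36, 0.0, 1.0, 3.0, 20_000)
mean = sum(draws) / len(draws)
print(mean)  # Beta(3,6) has mean 3/(3+6) = 1/3
```

Note the acceptance rate is the ratio of the area under the density (always 1) to the area of the box, here \(1/3\), so a tighter bound \(M\) wastes fewer proposals.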