---
title: "Lecture 25"
author: "36-350"
date: "11/18/2014"
output: ioslides_presentation
---

## Previously...

We introduced: Markov chains, Monte Carlo simulation, the bootstrap method

## Today: Applications to Probability

Where the Bayesian revolution takes over: drawing from complex probability distributions

- rejection sampling
- the Metropolis algorithm
- general introduction to "Markov chain Monte Carlo"

**Optional reading**: Geyer, "Practical Markov Chain Monte Carlo", _Statistical Science_ **7** (1992): 473--483

## One-dimensional probability distributions

We have `[dpqr]{norm, binom, pois, ...}` for known distributions.

For many good reasons, we'd like to be able to:

- draw from complex distributions
- find the solution to relevant integrals
- eventually, go to many higher dimensions

What's available?

## Integration by Simulation

Law of large numbers: if $X_1, X_2, \ldots X_n$ are all IID with p.d.f. $p$,

$\frac{1}{n}\sum_{i=1}^{n}{f(X_i)} \rightarrow \mathbb{E}_{p}[f(X)] = \int{f(x) p(x) dx}$

The **Monte Carlo principle**: to find $\int{g(x) dx}$, draw from $p$ and take the sample mean of $f(x) = g(x)/p(x)$.

## And it works, too

Central limit theorem:

$\frac{1}{n}\sum_{i=1}^{n}{\frac{g(x_i)}{p(x_i)}} \rightsquigarrow \mathcal{N}\left(\int{g(x)dx}, \frac{\sigma^2_{g/p}}{n}\right)$

The Monte Carlo approximation to the integral is unbiased, with root mean square error $\propto n^{-1/2}$ -- just keep taking Monte Carlo draws, and the error gets as small as you like, even if $g$ or $x$ are very complicated.

## Taking unknown samples

Generating from $p$ is easy if it's a standard distribution or we have a nice, invertible CDF (quantile method). What can we do if all we've got is the probability density function $p$?

Suppose the pdf $f$ is zero outside an interval $[c,d]$, and $\leq M$ on the interval.
```{r echo=FALSE}
plot(c(0,1), c(0,3), type="n", main="A Sample Distribution",
     ylab="Density f(x)", xlab="x")
curve(dbeta(x, 3, 6), add=TRUE)
lines(c(0,0,1,1), c(0,3,3,0))
```

## Rejection sampling

We know how to draw from uniform distributions in any dimension. Do it in two:

```{r}
x1 <- runif(300, 0, 1)
y1 <- runif(300, 0, 2.6)
selected <- y1 < dbeta(x1, 3, 6)
```

```{r echo=FALSE}
plot(c(0,1), c(0,3), type="n", main="A Sample Distribution",
     ylab="Density f(x)", xlab="x")
curve(dbeta(x, 3, 6), add=TRUE)
lines(c(0,0,1,1), c(0,3,3,0))
points(x1, y1, col=1+1*selected, cex=0.1)
```

## Rejection sampling

```{r}
mean(selected)
accepted.points <- x1[selected]
mean(accepted.points < 0.5)
pbeta(0.5, 3, 6)
```

## Rejection sampling

For this to work efficiently, we have to cover the target distribution with one that sits close to it.

```{r}
x2 <- runif(100000, 0, 1)
y2 <- runif(100000, 0, 10)
selected <- y2 < dbeta(x2, 3, 6)
mean(selected)
```

## Rejection sampling

```{r echo=FALSE}
plot(c(0,1), c(0,6), type="n", main="A Sample Distribution",
     ylab="Density f(x)", xlab="x")
curve(dbeta(x, 3, 6), add=TRUE)
lines(c(0,0,1,1), c(0,3,3,0))
points(x2, y2, col=1+1*selected, cex=0.1)
```

## Metropolis Algorithm

(Really: Metropolis, Rosenbluth x2, Teller x2, 1953)

Rejection sampling works if we know the complete distribution $p(x)$, with $\int p(x) dx = 1$. For many, many problems, we only know the density up to the normalizing constant.

Example: the canonical ensemble over energy states in uncontrolled hydrogen fusion,

$p(E) = \frac{1}{|Z|} \exp(f(E)/kT)$

$|Z|$ is legendarily hard to calculate precisely.

## Basic Metropolis

On integer steps:

```{r}
metropolis.one <- function (xx, fun=dpois, ...) {
  prop <- 2*rbinom(1,1,0.5) - 1 + xx
  acc.ratio <- fun(prop, ...)/fun(xx, ...)
  output <- if (acc.ratio > runif(1)) prop else xx
  output
}
replicate(10, metropolis.one(10, lambda=5))
```

## Basic Metropolis

On integer steps:

```{r}
start <- 50
draws <- rep(NA, 10000)
for (iteration in 1:10000) {
  start <- metropolis.one(start, lambda=10)
  draws[iteration] <- start
}
```

## Basic Metropolis

```{r}
plot(draws)
```

## Basic Metropolis

Have to discard the initial state for "burn-in"

```{r}
plot(draws[-(1:100)])
```

## Basic Metropolis

Have to discard the initial state for "burn-in"

```{r}
hist(draws[-(1:100)], prob=TRUE)
hist(rpois(10000, 10), border=2, add=TRUE, prob=TRUE)
```

## Basic Metropolis

Now on a continuous state space:

```{r}
metropolis.cts <- function (xx, fun=dgamma, ...) {
  prop <- rnorm(1, xx, 1)
  acc.ratio <- fun(prop, ...)/fun(xx, ...)
  output <- if (acc.ratio > runif(1)) prop else xx
  output
}
replicate(20, metropolis.cts(20, shape=100, rate=10))
```

## Basic Metropolis

```{r}
start <- 50
draws <- rep(NA, 100000)
for (iteration in 1:100000) {
  start <- metropolis.cts(start, fun=dgamma, shape=100, rate=10)
  draws[iteration] <- start
}
```

## Basic Metropolis

```{r}
plot(draws)
```

## Basic Metropolis

Have to discard the initial state for "burn-in"

```{r}
plot(draws[-(1:200)])
```

## Basic Metropolis

Have to discard the initial state for "burn-in"

```{r}
hist(draws[-(1:200)], prob=TRUE)
hist(rgamma(10000, 100, 10), border=2, add=TRUE, prob=TRUE)
```
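## Aside: the Monte Carlo principle in action

A small illustration (added here, not part of the original lecture): estimate $\int_{0}^{1}{e^x dx}$ by drawing from the uniform density $p(x) = 1$ on $[0,1]$ and averaging $g(X)/p(X) = e^X$.

```{r}
# Illustrative sketch: Monte Carlo estimate of the integral of exp(x) on [0,1]
n <- 1e5
x <- runif(n)          # draws from p(x) = 1 on [0,1]
mean(exp(x))           # compare to the exact value exp(1) - 1 = 1.718...
sd(exp(x))/sqrt(n)     # estimated standard error, shrinking like n^(-1/2)
```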
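## Aside: rejection sampling as a function

The rejection-sampling steps can be wrapped in a reusable function. A minimal sketch (added here, not part of the original lecture; `rreject` is a made-up name), assuming a density `f` supported on `[lo, hi]` and bounded by `M`:

```{r}
# Hypothetical helper (not from the lecture): draw n points from density f
# on [lo, hi] with bound M, keeping uniform proposals under the density curve
rreject <- function(n, f, lo, hi, M) {
  out <- numeric(0)
  while (length(out) < n) {
    x <- runif(n, lo, hi)
    y <- runif(n, 0, M)
    out <- c(out, x[y < f(x)])  # accept points below the curve
  }
  out[1:n]
}
samples <- rreject(10000, function(x) dbeta(x, 3, 6), 0, 1, 2.6)
mean(samples < 0.5)   # compare to pbeta(0.5, 3, 6)
```

The acceptance probability is $1/(M(hi - lo))$, which is why a tight bound $M$ matters.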
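## Aside: sanity-checking the chain

A quick check (added here, not part of the original lecture): after burn-in, draws from the Gamma(shape=100, rate=10) chain should have mean near $100/10 = 10$ and standard deviation near $\sqrt{100}/10 = 1$; the sampler is redefined so the chunk is self-contained.

```{r}
# Sanity check (not from the lecture); repeats the sampler definition
metropolis.cts <- function (xx, fun=dgamma, ...) {
  prop <- rnorm(1, xx, 1)
  acc.ratio <- fun(prop, ...)/fun(xx, ...)
  if (acc.ratio > runif(1)) prop else xx
}
set.seed(36350)
start <- 50
draws <- rep(NA, 20000)
for (iteration in 1:20000) {
  start <- metropolis.cts(start, shape=100, rate=10)
  draws[iteration] <- start
}
kept <- draws[-(1:500)]
c(mean(kept), sd(kept))  # near shape/rate = 10 and sqrt(shape)/rate = 1
mean(diff(draws) != 0)   # fraction of proposals accepted
```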