--- title: "Lecture 24: Random Simulation, For Fun And Profit" output: ioslides_presentation --- ## Previously... Topics we got to earlier in the course: - How to make draws from, and use random distributions - Writing functions with known run-times and verifiable results - Why not to use `for` loops ## Today... - Writing simulations using random number generation - From the jackknife to the bootstrap - Processes that have memory - Finally, `for` loops! ## `[dpqr]unif`, etc. Recall how we drew from various distributions: `runif()`: draws from the uniform distribution (others follow) Home-baked discrete distributions: Use `sample()` ```{r} population.values <- 1:3 probs <- c(5,2,3) my.draw <- sample (population.values, 100000, probs, replace=TRUE) table(my.draw) ``` ## Permutations with `sample()` `sample()` is powerful -- it works on any object that has a defined `length()`. Permutations: ```{r} sample(5) sample(1:6) replicate(3,sample(c("Curly","Larry","Moe","Shemp"))) sample(list("A",3,sum)) ``` ## Resampling with `sample()` Resampling from any existing distribution gets us the **bootstrap** family of estimators: ```{r} bootstrap.resample <- function (object) sample (object, length(object), replace=TRUE) replicate(5, bootstrap.resample (6:10)) ``` Recall: the *jackknife* worked by removing one point from the sample and recalculating the statistic of interest. Here we resample the same length with replacement. ## Bootstrap test The 2-sample `t`-test checks for differences in means according to a known null distribution. Let's resample and generate the sampling distribution under the bootstrap assumption: ```{r} library(MASS) diff.in.means <- function(df) { mean(df[df$Sex=="M","Hwt"]) - mean(df[df$Sex=="F","Hwt"]) } resample.diffs <- replicate(1000, diff.in.means(cats[bootstrap.resample(1:nrow(cats)),])) hist(resample.diffs); abline(v=diff.in.means(cats), col=2, lwd=3) ``` ## Dependence Most interesting examples in probability have a little dependence added in: "If it rained yesterday, what is the probability it rains today?" Use this to generate weather patterns and probabilities for some time in the future. Almost always occurs with time series; can occur with other dependence (spatial -- if it's sunny in Dallas today, will it also be sunny in Fort Worth?) ## Markov Dependence Suppose we have a sequence of observations that are dependent. In a time series, what happens next depends on what happened before: \[ p(X_1, X_2, ..., X_n) = p(X_1)p(X_2|X_1)...p(X_n|X_{n-1},...,X_1) \] (Note: you could, of course, condition on the future to predict the past, if you had a time machine.) Markov dependence: each outcome only depends on the one that came before. \[ p(X_1, X_2, ..., X_n) = p(X_1)\prod_{i=2}^np(X_i|X_{i-1}) \] ## Generating a Markov Chain 1. Set up the conditional distribution. 2. Draw the initial state of the chain. 3. For every additional draw, use the previous draw to inform the new one. ## And `for` loops are back in play! Alluded to but not taught explicitly because we had better ways of doing things in R -- until now. Simple weather model: - if it was sunny yesterday, today's chance of sun is 80%. - if it wasn't sunny yesterday, today's chance of sun is 20%. Simulate for one year (at the equator?) ```{r} sunny.year <- rep(NA, 365) sunny.year[1] <- 1 for (day in 2:365) sunny.year[day] <- rbinom(1,1,0.2 + 0.6*sunny.year[day-1]) ``` ## And `for` loops are back in play! ```{r} plot(sunny.year, main="Sunny Days in An Equatorial City", xlab="Day", ylab="Sunshine?", ty="l") ``` ## Different From Independence ```{r} boring.year <- rbinom(365, 1, 0.5) plot(boring.year, main="Sunny Days in A Coin Flip City", xlab="Day", ylab="Sunshine?", ty="l") ``` ## The above chain Transitions are represented as a matrix: $Q_{ij}$ is $P(X_t = j|X_{t-1} = i)$. ```{r echo=FALSE} matrix (c(0.8, 0.2, 0.2, 0.8), nrow=2) ``` ## Flippy chain ```{r} weird.year <- rep(NA, 365) weird.year[1] <- 1 transition.matrix <- matrix (c(0.2, 0.8, 0.8, 0.2), nrow=2) for (day in 2:365) weird.year[day] <- sample(1:2, 1, prob=transition.matrix[weird.year[day-1],]) ``` ## Flippy chain ```{r} plot(weird.year, main="Sunny Days in Al Gore's Nightmare", xlab="Day", ylab="Sunshine?", ty="l") ``` ## General Markov Chain ```{r} rmarkovchain <- function (nn, transition.matrix, start=sample(1:nrow(transition.matrix), 1)) { output <- rep (NA, nn) output[1] <- start for (day in 2:nn) output[day] <- sample(ncol(transition.matrix), 1, prob=transition.matrix[output[day-1],]) } ``` ## Simple Unbounded Markov Chain "Unbiased Random Walk": Independent events atop a dependent structure. ```{r} randomwalk <- function (nn, upprob=0.5, start=50) { output <- rep (NA, nn) output[1] <- start for (iteration in 2:nn) output[iteration] <- output[iteration-1] + 2*rbinom(1, 1, upprob) - 1 output } ``` ## Simple Unbounded Markov Chain ```{r} plot (randomwalk(10000, start=200), main="Simple Random Walk") ``` ## Simple Unbounded Markov Chain ```{r} plot (randomwalk(10000, start=200), main="Simple Random Walk") ``` ## Simple Unbounded Markov Chain ```{r} plot (randomwalk(10000, start=200), main="Simple Random Walk") ``` ## Simple Unbounded Markov Chain ```{r} plot (randomwalk(10000, start=200), main="Simple Random Walk") ```