---
title: "Iteration Basics"
author: "Statistical Computing, 36-350"
date: "Monday October 3, 2016"
---

Iteration
===

Computers: good at applying rigid rules over and over again. Humans: not so good at this. Iteration is at the heart of programming

Summary of methods for performing iteration in R:

- `for()`, `while()` loops
- Vectorization
- `apply()` family of functions
- `**ply()` family of functions

Reminder: conditionals and boolean operators
===

Control flow:

- `if()`, `else if()`, `else`: standard conditionals
- `ifelse()`: conditional function that vectorizes nicely
- `switch()`: handy for deciding between several options

Boolean operators:

- `&` and `|` work like `+` or `*`: they combine terms elementwise
- `&&` and `||` give just a single boolean, lazily
- Standard flow control: we just want one boolean value. Hence we can skip calculating what's not needed

In summary, use `&&` and `||` for conditionals, `&` and `|` for indexing or subsetting

Examples of boolean operators
===

```{r}
set.seed(0)
x = runif(10, -1, 1)
x
x[0 <= x & x <= 0.5] = 999 # Elementwise AND
x
(0 > 0) && all(matrix(0,2,2) == matrix(0,3,3)) # Lazy AND
(0 > 0) && (ThisVariableIsNotDefined == 0) # Lazy AND
```

In the last two lines, the left side `(0 > 0)` is already FALSE, so R *never* evaluates the expression on the right (each of these would throw an error on its own!)

`for()` loop
===

A `for()` loop increments a **counter** variable along a vector. It repeatedly runs a code block, called the **body** of the loop, with the counter set at its current value, until it runs through the vector

```{r}
n = 10
log.vec = vector(length=n, mode="numeric")
for (i in 1:n) {
  log.vec[i] = log(i)
}
log.vec
```

Here `i` is the counter and the vector we are iterating over is `1:n`.
The body is the code in between the braces

Breaking from the loop
===

We can **break** out of a `for()` loop early (before the counter has run through the whole vector), using `break`

```{r}
n = 10
log.vec = vector(length=n, mode="numeric")
for (i in 1:n) {
  if (log(i) > 2) {
    cat("I'm outta here. I don't like numbers bigger than 2\n")
    break
  }
  log.vec[i] = log(i)
}
log.vec
```

Variations on standard `for()` loops
===

Many different variations on standard `for()` are possible. Two common ones to be aware of:

- Nonnumeric counters: counter variable always gets iterated over a vector, but it doesn't have to be numeric
- Nested loops: body of the `for()` loop can contain another `for()` loop (or several others)

```{r}
for (str in c("Prof", "Ryan", "Tibs")) {
  cat(paste(str, "declined to comment\n"))
}

for (i in 1:4) {
  for (j in 1:i^2) {
    cat(paste(j,""))
  }
  cat("\n")
}
```

`while()` loop: conditional iteration
===

A `while()` loop repeatedly runs a code block, called the **body**, until some condition is no longer TRUE

```{r}
i = 1
log.vec = c()
while (log(i) <= 2) {
  log.vec = c(log.vec, log(i))
  i = i+1
}
log.vec
```

`for()` versus `while()`
===

- `for()` is better when the number of times to repeat (values to iterate over) is clear in advance
- `while()` is better when you can recognize when to stop once you're there, even if you can't guess it to begin with
- `while()` is more general, in that every `for()` could be replaced with a `while()` (but not vice versa)

`while(TRUE)` or `repeat`: unconditional iteration
===

`while(TRUE)` and `repeat`: both do the same thing, they just repeat the body indefinitely, until something causes the flow to break

Try running the code below in your console

```
repeat {
  ans = readline("Who is the best Professor of Statistics at CMU? ")
  if (ans == "Tibs" || ans == "Tibshirani" || ans == "Ryan") {
    cat("Yes! You get an 'A'.")
    break
  }
  else {
    cat("Wrong answer!\n")
  }
}
```

Avoiding explicit iteration
===

Warning: some people have a tendency to **overuse** `for()` and `while()` loops in R. They aren't always needed. Useful alternatives:

- Vectorization: act on whole objects rather than iterate over individual elements. When it applies, often simpler, and faster (sometimes a little, sometimes drastically)
- `apply()` family of functions: apply a given function over dimensions of an object. E.g., elements of a vector, elements of a list, rows/columns of a matrix. Often simpler, not often faster
- `**ply()` family of functions: like `apply()` family, but much more transparent about what goes in and what comes out
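
Example: loop versus vectorization versus `sapply()`
===

As a rough sketch of the first two alternatives listed above, the `for()` loop that filled `log.vec` earlier can be written three ways. The variable names `log.vec.loop`, `log.vec.vectorized`, and `log.vec.sapply` are made up for this illustration, and `sapply()` stands in here as just one member of the `apply()` family

```{r}
n = 10

# Explicit for() loop: fill a preallocated vector one element at a time
log.vec.loop = vector(length=n, mode="numeric")
for (i in 1:n) {
  log.vec.loop[i] = log(i)
}

# Vectorization: log() acts on the whole vector 1:n at once
log.vec.vectorized = log(1:n)

# apply() family: sapply() calls log() on each element of 1:n
# and simplifies the results to a numeric vector
log.vec.sapply = sapply(1:n, log)

# Check that the three approaches agree
all.equal(log.vec.loop, log.vec.vectorized)
all.equal(log.vec.loop, log.vec.sapply)
```

For a function like `log()` that is already vectorized, the second version is usually the one to reach for; `sapply()` is more useful when the function being applied only handles one element at a time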