Lecture 16: Functions as Objects ===== author: 36-350 date: 20 October 2014 font-family: Garamond transition: none Previously... === - Writing our own functions - Dividng labor with multiple functions - Refactoring to create higher-level operations - Using apply, sapply, etc., to avoid iteration Agenda === - Functions are objects, and can be arguments to other functions - Functions are objects, and can be returned by other functions - Example: surface **Reading**: Sections 7.5, 7.11 and 7.13 of Matloff **Optional Recommended Reading**: Chapter 3 of Chambers Functions as Objects === - In R, functions are objects, just like everything else - This means that they can be passed to functions as arguments and returned by functions as outputs as well Functions of Functions: Computationally === - We often want to do very similar things to many different functions - The procedure is the same, only the function we're working with changes - $\therefore$ Write one function to do the job, and pass the function as an argument - Because R treats a function like any other object, we can do this simply: invoke the function by its argument name in the body - We have already seen examples R Functions That Take Functions as Arguments === - apply(), sapply(), etc.: Take _this_ function and use it on all of _these_ objects - nlm(): Take _this_ function and try to make it small, starting from _here_ - ks.test(): Compare _these_ data to _this_ cumulative distribution function - curve(): Evaluate _this_ function over _that_ range, and plot the results Some R Syntax Facts About Functions === - Typing a function's name, without parentheses, in the terminal gives you its source code: {r} sample  Some R Syntax Facts About Functions === - Functions are their own **class** in R: {r} class(sin) class(sample) resample <- function(x) { sample(x, size=length(x), replace=TRUE) } class(resample)  Some R Syntax Facts About Functions === - Functions can be put into lists or even arrays - A call to function returns a function object + body executed; access with body(foo) + arguments required: access with formals(foo) gives argument list of foo: names are argument names, values are expressions for defaults (if any) + parent environment: access with environment(foo) Some R Syntax Facts About Functions === - R has separate **types** for built-in functions and for those written in R: {r} typeof(resample) typeof(sample) typeof(sin)  Why closure for written-in-R functions? Because expressions are "closed" by referring to the parent environment There's also a 2nd class of built-in functions called primitive Anonymous Functions === - function() returns an object of class function - So far we've assigned that object to a name - If we don't have an assignment, we get an **anonymous function** - Usually part of some larger expression: {r} sapply((-2):2,function(log.ratio){exp(log.ratio)/(1+exp(log.ratio))})  Anonymous Functions === - Often handy when connecting other pieces of code + especially in things like apply and sapply - Won't cluttering the workspace - Can't be examined or re-used later Example: grad() === - Problems in stats. come down to optimization So do lots of problems in econ., physics, CS, bio, ... - Lots of optimization problems require the gradient of the **objective function** - Gradient of $f$ at $x$: $\nabla f(x) = \left[\left.\frac{\partial f}{\partial x_1}\right|_{x} \ldots \left.\frac{\partial f}{\partial x_p}\right|_{x}\right]$ Example: grad() === - We do the same thing to get the gradient of $f$ at $x$ no matter what $f$ is:  find the partial derivative of f with respect to each component of x return the vector of partial derivatives  - It makes no sense to re-write this every time we change $f$! - $\therefore$ write code to calculate the gradient of an arbitrary function - We _could_ write our own, but there are lots of tricky issues + Best way to calculate partial derivative + What if $x$ is at the edge of the domain of $f$? - Fortunately, someone has already done this Example: grad() === From the package numDeriv {r,eval=FALSE} grad(func, x, ...) # Plus other arguments  - Assumes func is a function which returns a single floating-point value - Assumes x is a vector of arguments to func + If x is a vector and func(x) is also a vector, then it's assumed func is vectorized and we get a vector of derivatives - Extra arguments in ... get passed along to func - Other functions in the package for the Jacobian of a vector-valued function, and the matrix of 2nd partials (Hessian) Example: grad() === - Does it work as advertized? {r} require("numDeriv") just_a_phase <- runif(n=1,min=-pi,max=pi) all.equal(grad(func=cos,x=just_a_phase),-sin(just_a_phase)) phases <- runif(n=10,min=-pi,max=pi) all.equal(grad(func=cos,x=phases),-sin(phases)) grad(func=function(x){x[1]^2+x[2]^3}, x=c(1,-1))  Note: grad is perfectly happy with func being an anonymous function! gradient.descent() === Now we can use this as a piece of a larger machine:  gradient.descent <- function(f,x,max.iterations,step.scale, stopping.deriv,...) { for (iteration in 1:max.iterations) { gradient <- grad(f,x,...) if(all(abs(gradient) < stopping.deriv)) { break() } x <- x - step.scale*gradient } fit <- list(argmin=x,final.gradient=gradient,final.value=f(x,...), iterations=iteration) return(fit) }  - Works equally well whether f is mean squared error of a regression, $\psi$ error of a regression, (negative log) likelihood, cost of a production plan, ... Cautions === - _Scoping_ f takes values for all names which aren't its arguments from the environment where it was defined, not the one where it is called (e.g., not from inside grad or gradient.descent) - _Debugging_ If f and g are both complicated, avoid debugging g(f) as a block; divide the work by writing _very simple_ f.dummy to debug/test g, and debug/test the real f separately Returning Functions: A trivial example === Functions can be return values like anything else {r} make.noneuclidean <- function(ratio.to.diameter=pi) { circumference <- function(d) { return(ratio.to.diameter*d) } return(circumference) }  Returning Functions: A trivial example (cont'd.) ==== {r} try(circumference(10)) kings.i <- make.noneuclidean(3) try(kings.i(10)) formals(kings.i) body(kings.i) environment(kings.i) try(circumference(10))  A Less Trivial Example === Create a linear predictor, based on sample values of two variables {r} make.linear.predictor <- function(x,y) { linear.fit <- lm(y~x) predictor <- function(x) { return(predict(object=linear.fit,newdata=data.frame(x=x))) } return(predictor) }  The predictor function persists and works, even when the data we used to create it is gone A Less Trivial Example === {r} library(MASS); data(cats) vet_predictor <- make.linear.predictor(x=cats$Bwt,y=cats$Hwt) rm(cats) # Data set goes away vet_predictor(3.5) # My cat's body mass in kilograms  A more mathematical example === - Instead of finding $\nabla f(x)$, find the function $\nabla f$: {r} nabla <- function(f,...) { require("numDeriv") g <- function(x,...) { grad(func=f,x=x,...) } return(g) }  Exercise: Write a test case! Example: curve() === - You learned to use curve in the first week (because you did all of the assigned reading, including section 2.3.3 of the textbook) - A call to curve looks like this: {r,eval=FALSE} curve(expr, from = a, to = b, ...)  expr is some expression involving a variable called x which is swept from the value a to the value b ... are other plot-control arguments - curve feeds the expression a vector x and expects a numeric vector back, e.g.  curve(x^2 * sin(x))  is fine Using curve() with our own functions === - If we have defined a function already, we can use it in curve: {r,eval=FALSE} psi <- function(x,c=1) {ifelse(abs(x)>c,2*c*abs(x)-c^2,x^2)} curve(psi(x,c=10),from=-20,to=20)  Try this! Also try {r,eval=FALSE} curve(psi(x=10,c=x),from=-20,to=20)  and explain it to yourself Using curve() with our own functions === - If our function doesn't take vectors to vectors, curve becomes unhappy {r,echo=FALSE} gmp <- read.table("http://www.stat.cmu.edu/~cshalizi/statcomp/14/lectures/06/gmp.dat") gmp$pop <- round(gmp$gmp/gmp$pcgmp)  {r} mse <- function(y0,a,Y=gmp$pcgmp,N=gmp$pop) { mean((Y - y0*(N^a))^2) }  {r,eval=FALSE} > curve(mse(a=x,y0=6611),from=0.10,to=0.15) Error in curve(mse(a = x, y0 = 6611), from = 0.1, to = 0.15) : 'expr' did not evaluate to an object of length 'n' In addition: Warning message: In N^a : longer object length is not a multiple of shorter object length  How do we solve this? Using curve() with our own functions === - Define a new, vectorized function, say with sapply: {r} sapply(seq(from=0.10,to=0.15,by=0.01),mse,y0=6611) mse(6611,0.10) mse.plottable <- function(a,...){ return(sapply(a,mse,...)) } mse.plottable(seq(from=0.10,to=0.15,by=0.01),y0=6611)  Using curve() with our own functions === {r} curve(mse.plottable(a=x,y0=6611),from=0.10,to=0.20,xlab="a",ylab="MSE") curve(mse.plottable(a=x,y0=5100),add=TRUE,col="blue")  Using curve() with our own functions === - Alternate strategy: Vectorize() returns a new, vectorized function {r} mse.vec <- Vectorize(mse, vectorize.args=c("y0","a")) mse.vec(a=seq(from=0.10,to=0.15,by=0.01),y0=6611) mse.vec(a=1/8,y0=c(5000,6000,7000))  Using curve() with our own functions === {r} curve(mse.vec(a=x,y0=6611),from=0.10,to=0.20,xlab="a",ylab="MSE") curve(mse.vec(a=x,y0=5100),add=TRUE,col="blue")  Example: surface() === - curve takes an expression and, as a side-effect, plots a 1-D curve by sweeping over x - Suppose we want something like that but sweeping over two variables - Built-in plotting function contour:  contour(x,y,z, [[other stuff]])  x and y are vectors of coordinates, z is a matrix of the corresponding shape (see help(contour) for graphical options) - Strategy: surface should make x and y sequences, evaluate the expression at each combination to get z, and then call contour First attempt at surface() === - Only works with vector-to-number functions: {r} surface.1 <- function(f,from.x=0,to.x=1,from.y=0,to.y=1,n.x=101, n.y=101,...) { x.seq <- seq(from=from.x,to=to.x,length.out=n.x) y.seq <- seq(from=from.y,to=to.y,length.out=n.y) plot.grid <- expand.grid(x=x.seq,y=y.seq) z.values <- apply(plot.grid,1,f) z.matrix <- matrix(z.values,nrow=n.x) contour(x=x.seq,y=y.seq,z=z.matrix,...) invisible(list(x=x.seq,y=y.seq,z=z.matrix)) }  First attempt at surface() === {r} surface.1(function(p){return(sum(p^3))},from.x=-1,from.y=-1)  Expressions and Evaluation === - curve doesn't require us to write a function every time --- what's it's trick? - **Expressions** are just another class of R object, so they can be created and manipulated - One manipulation is **evaluation**  eval(expr,envir)  evaluates the expression expr in the environment envir, which can be a data frame or even just a list - When we type something like x^2+y^2 as an argument to surface.1, R tries to evaluate it prematurely - substitute returns the _unevaluted_ expression - curve uses first substitute(expr) and then eval(expr,envir), having made the right envir Second attempt at surface() === {r} surface.2 <- function(expr,from.x=0,to.x=1,from.y=0,to.y=1,n.x=101, n.y=101,...) { x.seq <- seq(from=from.x,to=to.x,length.out=n.x) y.seq <- seq(from=from.y,to=to.y,length.out=n.y) plot.grid <- expand.grid(x=x.seq,y=y.seq) unevaluated.expression <- substitute(expr) z.values <- eval(unevaluated.expression,envir=plot.grid) z.matrix <- matrix(z.values,nrow=n.x) contour(x=x.seq,y=y.seq,z=z.matrix,...) invisible(list(x=x.seq,y=y.seq,z=z.matrix)) }  Second attempt at surface() === {r} surface.2(abs(x^3)+abs(y^3),from.x=-1,from.y=-1)  Evaluating at Combinations === - Evaluating a function at every combination of two arguments is a really common task - There is a function to do it for us: outer Third attempt at surface() === {r} surface.3 <- function(expr,from.x=0,to.x=1,from.y=0,to.y=1,n.x=101, n.y=101,...) { x.seq <- seq(from=from.x,to=to.x,length.out=n.x) y.seq <- seq(from=from.y,to=to.y,length.out=n.y) unevaluated.expression <- substitute(expr) z <- function(x,y) { return(eval(unevaluated.expression,envir=list(x=x,y=y))) } z.values <- outer(X=x.seq,Y=y.seq,FUN=z) z.matrix <- matrix(z.values,nrow=n.x) contour(x=x.seq,y=y.seq,z=z.matrix,...) invisible(list(x=x.seq,y=y.seq,z=z.matrix, func=z)) }  Third attempt at surface() === {r} surface.3(x^4-y^4,from.x=-1,from.y=-1)  surface() === {r} surface.3(mse.vec(a=x,y0=y),from.x=0.10,to.x=0.15, from.y=6e3,to.y=7e3,nlevels=20)  Summary === - In R, functions are objects, and can be arguments to other functions + Use this to do the same thing to many different functions + Separates writing the high-level operations and the first-order functions + Use sapply (etc.), wrappers, anonymous functions as adapters - Functions can also be returned by other functions + Variables other than the arguments to the function are fixed by the environment of creation + Manipulating expressions lets us flexibly create functions Functions of Functions: Mathematically === - Maximum, and location of the maximum: takes$f$, gives number $\max_{x}{f(x)} ~, ~ \argmax_{x}{f(x)}$ - Derivative of$f$at$x_0$: takes a function and a point, gives a number $\frac{df}{dx}(x_0) \equiv \lim_{h\rightarrow 0}{\frac{f(x_0+h) - f(x_0)}{h}}$ - Definite integral of$f$over$[a,b]$: takes a function and two points, gives a number $\int_{a}^{b}{f(x) dx} \equiv \lim_{n\rightarrow \infty}{\sum_{i=0}^{n-1}{\left(\frac{b-a}{n}\right)f\left(a+i\frac{b-a}{n}\right)}}$ Mathematical view cont'd. === - Functions of functions which return numbers sometimes are sometimes called **functionals**, e.g., expectation values: $\mathbb{E}[f(X)] \equiv \int_{\mathrm{all}~x}{f(x) p(x) dx}$ -$\nabla f(x_0)$takes$f$and$x_0$, gives vector: not strictly a functional -$\nabla f$is another, vector-valued function$\nabla$takes a function and returns a function$\nabla$is an **operator**, not a functional Mathematically === - Something which takes a function in and gives a function back is an **operator** - Differentiation: the operator$d/dx$takes$f$and gives a new function - Gradient: the operator$\nabla$takes$f$and gives a new function similarly$\nabla \cdot$,$\nabla \times$, ... - Indefinite integration:$\int_{-\infty}^{x}{f(u) du}$takes$f$and gives a new function - Fourier transform: takes$f$and gives a new function $\tilde{f}(\omega) = \int_{x=-\infty}^{x=\infty}{f(x) e^{2i\pi \omega x} dx}$ Bonus: Writing Our Own gradient() === - Suppose we didn't know about the numDeriv package.. -Use the simplest possible method: change x by some amount, find the difference in f, take the slope method="simple" option in numDeriv::grad - Start with pseudo-code {r,eval=FALSE} gradient <- function(f,x,deriv.steps) { # not real code evaluate the function at x and at x+deriv.steps take slopes to get partial derivatives return the vector of partial derivatives }  Bonus Example: gradient() === A naive implementation would use a for loop {r,eval=FALSE} gradient <- function(f,x,deriv.steps,...) { p <- length(x) stopifnot(length(deriv.steps)==p) f.old <- f(x,...) gradient <- vector(length=p) for (coordinate in 1:p) { x.new <- x x.new[coordinate] <- x.new[coordinate]+deriv.steps[coordinate] f.new <- f(x.new,...) gradient[coordinate] <- (f.new - f.old)/deriv.steps[coordinate] } return(gradient) }  Works, but it's so repetitive! Bonus Example: gradient() === Better: use matrix manipulation and apply {r,eval=FALSE} gradient <- function(f,x,deriv.steps,...) { p <- length(x) stopifnot(length(deriv.steps)==p) x.new <- matrix(rep(x,times=p),nrow=p) + diag(deriv.steps,nrow=p) f.new <- apply(x.new,2,f,...) gradient <- (f.new - f(x,...))/deriv.steps return(gradient) }  (clearer, and half as long) - Presumes that f takes a vector and returns a single number - Any extra arguments to gradient will get passed to f - Check: Does this work when f is a function of a single number? Bonus Example: gradient() === - Acts badly if f is only defined on a limited domain and we ask for the gradient somewhere near a boundary - Forces the user to choose deriv.steps - Uses the same deriv.steps everywhere, imagine$f(x) = x^2\sin{x}\$ ... and so on through much of a first course in numerical analysis (or at least sec. 5.7 of _Numerical Recipes_)