FINGER EXERCISES FOR WEEK 08
TO BE DISCUSSED MON OCT 21 2002
------------------------------------------------------------

1. In hw03, I asked you to find the MLE of "loc", with "sc" fixed, for
   the Cauchy likelihood

      f(x|loc,sc) = (1/(pi*sc)) * (1/(1 + (x-loc)^2/sc^2))

   and the data

      1.2 2.4 1.3 1.3 0.0 1.0 1.8 0.8 4.6 1.4

   Now consider this as a two-parameter model

      f(x|loc,sc) = (1/(pi*sc)) * (1/(1 + (x-loc)^2/sc^2))

   in which both "loc" and "sc" are unknown.

   (a) Use persp() [or one of the trellis functions like "wireframe" if
       you know about / prefer them] to graph the log-likelihood over a
       suitable range of the parameter space (loc,sc).

       [Note: the trick in the "optimization.txt" handout of using
       "outer()" to quickly generate all the z-values for this plot may
       require some modification to work in this case.]

   (b) Find the MLE for (loc,sc) using ms [so this will have to be in
       Splus].  Try it with nlmin, nlminb, nlm, or optim too [in Splus
       or R, whichever you prefer].

       Note: the "natural" way to call ms is

          ms( ~ [??], start=c(loc=[??], sc=[??]), data=[??])

       where the first [??] is -log(likelihood) for one data point, and
       you have to fill in appropriate values for the other [??]'s.
       "ms" will automatically sum the -log(likelihood) expression over
       the data in the data frame you provide.

       Use the graph in part (a) to verify/confirm that you've found
       the maximum.  Is it a clear maximum?

   (c) Calculate the standard errors for the MLEs in (b) by obtaining
       and inverting the appropriate Hessian matrix.

       Notes:

       - Supply whatever derivatives are needed, using deriv or deriv3,
         so that you can get Hessian information from the optimizer you
         used in part (b).

       - Do you get the same SEs using the Hessian estimates from
         different optimizers [e.g. ms vs nlminb vs optim]?

       [A rough outer()/persp()/optim()-style sketch of parts (a)-(c)
       appears at the end of this handout.]

--------------------------------------------------------------------

2. The functions

      nll1.s   nll2.s   nll3.s

   saved in the directory

      http://www.stat.cmu.edu/~brian/711/data/week08

   offer several different ways to calculate the negative
   log-likelihood of a normal, in R or Splus, given a data vector x and
   a vector of possible means theta.  For example,

      > source("nll1.s")
      > x _ rnorm(10)
      > th _ seq(0, 2, by=0.1)
      > nll1(x,th)

   will return a vector of 21 elements, each one the value of the
   negative log-likelihood for x at one of the 21 theta values 0, 0.1,
   0.2, ..., 1.9, 2.0.

   (a) What are the computational differences between the three
       routines?  Which do you expect to be fastest?  Second fastest?
       Third?  Why [in each case]?

   (b) A fourth way to calculate -log(likelihood) for the normal would
       be something like this:

          nll4 <- function(x, theta) {
            #
            # nll4
            #
            # Compute the normal(theta,1) negative log-likelihood for
            # the data in the vector x, at each of the values in the
            # vector theta.  Do all the computations in C.
            #
            n <- length(x)
            m <- length(theta)
            y <- rep(0, m)
            z <- .C("mynll", as.integer(n), as.integer(m),
                    as.double(x), as.double(theta), ans = as.double(y))
            return(z$ans)
          }

       Of course you will also have to write a C subroutine "mynll" to
       do the computation, and load it into Splus or R using the
       appropriate recipe.  Do this, and verify that it works by
       comparing its output to the output of one or more of the other
       routines in part (a).

   (c) To get a better answer to the last question in part (b), write a
       function like the following in Splus:

          nlltime <- function(nll, x, th, n = 100) {
            # nlltime
            #
            # Print the time before and after n calls to nll(x, th),
            # to get a rough idea of how long each call takes.
            print(unix("date"))
            for(i in 1:n) nll(x, th)
            print(unix("date"))
            invisible()
          }

       Now the command "nlltime(nll2,x,theta)" will show approximately
       how long it takes to run "nll2" 100 times; similarly for
       "nlltime(nll3,x,theta)", etc.
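       For example, a timing run might look something like this (just a
       sketch: the vector lengths are arbitrary choices, and the nll
       functions are assumed to have been sourced already):

          > x  <- rnorm(100)
          > th <- seq(-2, 2, by = 0.01)
          > nlltime(nll1, x, th)   # prints the date before and after 100 calls
          > nlltime(nll2, x, th)
          > nlltime(nll3, x, th)
          > nlltime(nll4, x, th)

       (If you work in R, where unix() is not available, print(date())
       or system.time() can play the same role inside nlltime.)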
       Experiment with different lengths of "x" and "theta" to find out
       which of the four nll routines is fastest under which
       conditions.

-----------------------------------------------------------------
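APPENDIX: a rough R sketch for problem 1.  This is only one possible
approach, not a prescribed solution: it uses optim() rather than ms(),
the grid ranges, starting values, and the name "negll" are arbitrary,
and it ignores the constraint sc > 0 beyond picking a positive start.

   x <- c(1.2, 2.4, 1.3, 1.3, 0.0, 1.0, 1.8, 0.8, 4.6, 1.4)

   # negative log-likelihood, summed over the data, at one (loc,sc) pair
   negll <- function(p, x) {
     loc <- p[1]; sc <- p[2]
     sum(log(pi) + log(sc) + log(1 + ((x - loc)/sc)^2))
   }

   # (a) log-likelihood surface: negll() is not vectorized in (loc,sc),
   #     so build the grid with nested sapply() calls instead of outer()
   locs <- seq(-1, 4, length = 50)
   scs  <- seq(0.2, 3, length = 50)
   z <- -sapply(scs, function(s) sapply(locs, function(l) negll(c(l, s), x)))
   persp(locs, scs, z, xlab = "loc", ylab = "sc", zlab = "log-likelihood")

   # (b), (c) minimize the negative log-likelihood; with hessian=TRUE,
   #     optim() also returns a numerical Hessian whose inverse
   #     approximates the covariance matrix of the MLEs
   fit <- optim(c(loc = 1, sc = 1), negll, x = x, hessian = TRUE)
   fit$par                         # MLEs of loc and sc
   sqrt(diag(solve(fit$hessian)))  # approximate standard errors

For part (c) proper you would compare these SEs with the ones obtained
from the Hessians reported by ms, nlminb, etc.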