FINGER EXERCISES FOR WEEK 08
TO BE DISCUSSED MON OCT 21 2002
------------------------------------------------------------

1. In hw03, I asked you to find the MLE of "loc", with "sc" fixed, for
   the Cauchy likelihood

      f(x|loc,sc) = (1/(pi*sc)) * (1/(1 + (x-loc)^2/sc^2))

   and the data

      1.2 2.4 1.3 1.3 0.0 1.0 1.8 0.8 4.6 1.4

   Now consider this as a two-parameter model

      f(x|loc,sc) = (1/(pi*sc)) * (1/(1 + (x-loc)^2/sc^2))

   in which both "loc" and "sc" are unknown.

   (a) Use persp() [or one of the trellis functions like "wireframe" if
       you know about / prefer them] to graph the log-likelihood over a
       suitable range of the parameter space (loc,sc).

       [Note: the trick in the "optimization.txt" handout of using
       "outer()" to quickly generate all the z-values for this plot may
       require some modification to work in this case.]

   (b) Find the MLE for (loc,sc) using ms [so this will have to be in
       Splus].  Try it with nlmin, nlminb, nlm, or optim too [in Splus
       or R, whichever you prefer].

       Note: the "natural" way to call ms is

          ms( ~ [??], start=c(loc=[??], sc=[??]), data=[??])

       where the first [??] is -log(likelihood) for one data point, and
       you have to fill in appropriate values for the other [??]'s.
       "ms" will automatically sum the -log(likelihood) expression over
       the data in the data frame you provide.

       Use the graph in part (a) to verify/confirm that you've found
       the maximum.  Is it a clear maximum?

   (c) Calculate the standard errors for the MLEs in (b) by obtaining
       and inverting the appropriate Hessian matrix.

       Notes:

       - Supply whatever derivatives are needed, using deriv or deriv3,
         so that you can get Hessian information from the optimizer you
         used in part (b).

       - Do you get the same SEs using the Hessian estimates from
         different optimizers [e.g. ms vs nlminb vs optim]?

       [A rough outer()/persp()/optim()-style sketch of parts (a)-(c)
       appears at the end of this handout.]

--------------------------------------------------------------------

2. The functions

      nll1.s   nll2.s   nll3.s

   saved in the directory

      http://www.stat.cmu.edu/~brian/711/data/week08

   offer several different ways to calculate the negative
   log-likelihood of a normal, in R or Splus, given a data vector x and
   a vector of possible means theta.  For example,

      > source("nll1.s")
      > x _ rnorm(10)
      > th _ seq(0, 2, by=0.1)
      > nll1(x,th)

   will return a vector of 21 elements, each one the value of the
   negative log-likelihood for x at one of the 21 theta values 0, 0.1,
   0.2, ..., 1.9, 2.0.

   (a) What are the computational differences between the three
       routines?  Which do you expect to be fastest?  Second fastest?
       Third?  Why [in each case]?

   (b) A fourth way to calculate -log(likelihood) for the normal would
       be something like this:

          nll4 <- function(x, theta) {
            #
            # nll4
            #
            # Compute the normal(theta,1) negative log-likelihood for
            # the data in the vector x, at each of the values in the
            # vector theta.  Do all the computations in C.
            #
            n <- length(x)
            m <- length(theta)
            y <- rep(0, m)
            z <- .C("mynll", as.integer(n), as.integer(m),
                    as.double(x), as.double(theta), ans = as.double(y))
            return(z$ans)
          }

       Of course you will also have to write a C subroutine "mynll" to
       do the computation, and load it into Splus or R using the
       appropriate recipe.  Do this, and verify that it works by
       comparing its output to the output of one or more of the other
       routines in part (a).

   (c) To get a better answer to the last question in part (b), write a
       function like the following in Splus:

          nlltime <- function(nll, x, th, n = 100) {
            # nlltime
            #
            # Print the time before and after n calls to nll(x, th),
            # to get a rough idea of how long each call takes.
            print(unix("date"))
            for(i in 1:n) nll(x, th)
            print(unix("date"))
            invisible()
          }

       Now the command "nlltime(nll2,x,theta)" will show approximately
       how long it takes to run "nll2" 100 times; similarly for
       "nlltime(nll3,x,theta)", etc.
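       For example, a timing run might look something like this (just a
       sketch: the vector lengths are arbitrary choices, and the nll
       functions are assumed to have been sourced already):

          > x  <- rnorm(100)
          > th <- seq(-2, 2, by = 0.01)
          > nlltime(nll1, x, th)   # prints the date before and after 100 calls
          > nlltime(nll2, x, th)
          > nlltime(nll3, x, th)
          > nlltime(nll4, x, th)

       (If you work in R, where unix() is not available, print(date())
       or system.time() can play the same role inside nlltime.)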
       Experiment with different lengths of "x" and "theta" to find out
       which of the four nll routines is fastest under which
       conditions.

-----------------------------------------------------------------
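APPENDIX: a rough R sketch for problem 1.  This is only one possible
approach, not a prescribed solution: it uses optim() rather than ms(),
the grid ranges, starting values, and the name "negll" are arbitrary,
and it ignores the constraint sc > 0 beyond picking a positive start.

   x <- c(1.2, 2.4, 1.3, 1.3, 0.0, 1.0, 1.8, 0.8, 4.6, 1.4)

   # negative log-likelihood, summed over the data, at one (loc,sc) pair
   negll <- function(p, x) {
     loc <- p[1]; sc <- p[2]
     sum(log(pi) + log(sc) + log(1 + ((x - loc)/sc)^2))
   }

   # (a) log-likelihood surface: negll() is not vectorized in (loc,sc),
   #     so build the grid with nested sapply() calls instead of outer()
   locs <- seq(-1, 4, length = 50)
   scs  <- seq(0.2, 3, length = 50)
   z <- -sapply(scs, function(s) sapply(locs, function(l) negll(c(l, s), x)))
   persp(locs, scs, z, xlab = "loc", ylab = "sc", zlab = "log-likelihood")

   # (b), (c) minimize the negative log-likelihood; with hessian=TRUE,
   #     optim() also returns a numerical Hessian whose inverse
   #     approximates the covariance matrix of the MLEs
   fit <- optim(c(loc = 1, sc = 1), negll, x = x, hessian = TRUE)
   fit$par                         # MLEs of loc and sc
   sqrt(diag(solve(fit$hessian)))  # approximate standard errors

For part (c) proper you would compare these SEs with the ones obtained
from the Hessians reported by ms, nlminb, etc.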