Next: About this document Up: No Title Previous: The Zero-One version

Budworms

This data set is analyzed by Venables and Ripley also, pp. 230 ff.

The data is the result of an experiment on the toxicity of the tobacco budworm to doses of particular drug, that the moths are beginning to show resistance to. Batches of 20 moths of each sex were exposed for 3 days to the chemical and the number of dead (or disabled) moths in each group were recorded. The data is,

Dose Response Gender
1       1       M
2       4       M
4       9       M
8       13      M
16      18      M
32      20      M
1       0       F
2       2       F
4       6       F
8       10      F
16      12      F
32      16      F

The doses were in micro-grams, and the dose level when up by powers of two. It is a judgement call as to whether you will try to model with dose as the response or . We will try both.

402 > bud <- read.table("budworm.dat", head=T)
402 > bud
   Dose Response Gender 
 1    1        1      M
 2    2        4      M
 3    4        9      M
 4    8       13      M
 5   16       18      M
 6   32       20      M
 7    1        0      F
 8    2        2      F
 9    4        6      F
10    8       10      F
11   16       12      F
12   32       16      F
402 > attach(bud,pos=1)
402 > N <- 20

First, lets plot the data, and see if anything obvious shows up.

402 > plot( Dose, Response/N)         
402 > points( Dose[Gender=="F"], (Response/N)[Gender=="F"], pch="F")             
402 > points( Dose[Gender=="M"], (Response/N)[Gender=="M"], pch="M")

The same plot on the log(Dose) scale, is

402 > plot( Dose, Response/N, log="x")                              
402 > points( Dose[Gender=="F"], (Response/N)[Gender=="F"], pch="F")
402 > points( Dose[Gender=="M"], (Response/N)[Gender=="M"], pch="M")

Given that we have more than 0-1 data, it makes a lot of sense to at least try to plot the (observed) logits,

402 > par(mfrow=c(2,1))
402 > plot( Dose, logit(Response/N), log="x")
402 > title("Logit vs Log(Dose)")            
402 > plot( Dose, logit(Response/N))         
402 > title("Logit vs Dose")

This plot makes it quite clear, that the log scale has a much better chance of working.

So, proceed to fit the model. We surely want a model with different slopes for the two sexes

402 > ldose <- log(Dose)
402 > SF <- cbind(Response, N - Response)
402 > mymod <- glm(SF ~ Gender * ldose, family=binomial)   
402 > summary(mymod)

 ..........
Coefficients:
                  Value Std. Error    t value 
 (Intercept) -2.9935414  0.5525295 -5.4178852
      Gender  0.1749865  0.7781556  0.2248733
       ldose  1.3071341  0.2410133  5.4234939
Gender:ldose  0.5091459  0.3894474  1.3073547


    Null Deviance: 124.8756 on 11 degrees of freedom

Residual Deviance: 4.993727 on 8 degrees of freedom

402 > anova(mymod,test="Chisq")
Analysis of Deviance Table

Binomial model

Response: SF

Terms added sequentially (first to last)
             Df Deviance Resid. Df Resid. Dev   Pr(Chi) 
        NULL                    11   124.8756          
      Gender  1   6.0770        10   118.7986 0.0136955
       ldose  1 112.0415         9     6.7571 0.0000000
Gender:ldose  1   1.7633         8     4.9937 0.1842088
402 >

Now, what is going on here? Gender seems to have no effect (based on the t-value), but the deviance is quite large? Even more, our plots seem to imply what looks like a gender effect. The best way to find out is to distrust the parameter stuff (it can get easily confused), and try fitting a model without the interaction. This makes sense anyway, because there is little/no evidence of an interaction.

402 > anova(mymod <- glm(SF ~ Gender + ldose, family=binomial))        

Terms added sequentially (first to last)
       Df Deviance Resid. Df Resid. Dev 
  NULL                    11   124.8756
Gender  1   6.0770        10   118.7986
 ldose  1 112.0415         9     6.7571
402 > summary(mymod)
Coefficients:
                Value Std. Error   t value 
(Intercept) -3.473154  0.4682939 -7.416612
     Gender  1.100743  0.3557226  3.094385
      ldose  1.535336  0.1890119  8.122959

    Null Deviance: 124.8756 on 11 degrees of freedom

Residual Deviance: 6.757064 on 9 degrees of freedom

That is a little more like it, and seems to be a VERY good fitting model. We should at least look at some diagnostics, and some other plots, but unless they show something very odd, we are done.

402 > plot(mymod)

That looks very good, we should finish by plotting the data and fitted curves on the same graph, and that is left as an exercise for you (or look it up in the book). The book also goes on to talk about LD50, and how to estimate it.

About this document ...

Next: About this document Up: No Title Previous: The Zero-One version

Brian Junker
Sun Mar 15 22:17:44 EST 1998