Analysis of the Fit of the Model

Figure 1 presents a residual analysis for this logistic regression.

One relatively easy way to assess the fit of the logistic regression model is to compare the predicted probability of spontaneous language recovery under the model in (1) with the observed rate of language recovery. Specifically, substitute the estimated regression coefficients given above into the multiple logistic regression equation in (1). For each patient in the study, calculate his or her individual predicted probability of language recovery by forming the linear combination of that patient's explanatory variables with the estimated regression coefficients, and then apply the inverse of the logit transformation to solve for the predicted probability of language recovery, $\hat{p}$. For example, consider a patient whose stroke was on the left side (SIDE=0), who was a 61-year-old male (GENDER=1, AGE=61), who stayed in the hospital 19 days (LOSD=19), and whose WAB at discharge was 81.6 (WAB1=81.6). The estimated probability of spontaneous language recovery based on the model for such a patient is approximately 0.77.

This value can be interpreted to mean that about 77 of every 100 patients with the above characteristics can be expected to recover normal language functioning spontaneously within the first 2 months after hospital discharge. To study the goodness of fit of the model, we obtain the frequency distribution of the predicted probabilities $\hat{p}$ and determine, say, the quintiles of the distribution. The observed number of patients who had normal language functioning is tallied within each quintile, and the expected number of patients who had normal language functioning (predicted by the model) is determined by summing the $\hat{p}$'s for all the patients in the quintile. Table 3 presents the expected and observed numbers of patients who had normal language functioning for each quintile of predicted probability of language recovery. Comparison of the expected numbers predicted by the model with the observed numbers suggests that this model fits the data reasonably well.
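The grouping procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the coefficient values below are hypothetical placeholders (the fitted estimates are not reproduced in this section), and the function names `inverse_logit`, `predicted_probability`, and `fit_table` are invented for the example. Only the patient's covariate values are taken from the text.

```python
import math


def inverse_logit(eta):
    """Map a linear predictor (log-odds) back to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def predicted_probability(intercept, coefs, patient):
    """Linear combination of covariates and coefficients, then inverse logit."""
    eta = intercept + sum(coefs[name] * patient[name] for name in coefs)
    return inverse_logit(eta)


def fit_table(p_hat, recovered, n_groups=5):
    """Rank patients by predicted probability, split them into n_groups
    groups of (nearly) equal size, and return one tuple per group:
    (group size, observed recoveries, expected recoveries)."""
    order = sorted(range(len(p_hat)), key=lambda i: p_hat[i])
    n = len(order)
    table = []
    for g in range(n_groups):
        members = order[g * n // n_groups:(g + 1) * n // n_groups]
        observed = sum(recovered[i] for i in members)  # count of 1's
        expected = sum(p_hat[i] for i in members)      # sum of p-hats
        table.append((len(members), observed, expected))
    return table


# HYPOTHETICAL coefficients, chosen only to make the example run.
# They are NOT the estimates from the fitted model in Equation (1).
intercept = -2.0
coefs = {"SIDE": 0.5, "GENDER": 0.1, "AGE": -0.01, "LOSD": -0.02, "WAB1": 0.04}

# Covariate values of the example patient described in the text.
patient = {"SIDE": 0, "GENDER": 1, "AGE": 61, "LOSD": 19, "WAB1": 81.6}
print("predicted probability:", round(predicted_probability(intercept, coefs, patient), 3))

# Made-up predicted probabilities and outcomes (1 = recovered) for ten patients.
demo_p_hat = [0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95]
demo_recovered = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]
for size, observed, expected in fit_table(demo_p_hat, demo_recovered):
    print(size, observed, round(expected, 2))
```

Splitting by rank rather than by fixed probability cut-points keeps the group sizes nearly equal, which matches the quintile construction in the text. Ties in the predicted probabilities are broken by input order in this sketch.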

A test for whether there is a significant difference between the observed and expected numbers in Table 3 can be constructed based on a Pearson chi-square test. This chi-square test has degrees of freedom equal to the number of groups (e.g., 5 for quintiles) minus 2 (Hosmer and Lemeshow, 1989, chapter 5). In our example, the chi-square statistic is equal to 3.60 with 3 degrees of freedom, which gives a non-significant p-value of about 0.31 and so provides no evidence of lack of fit. A small p-value would indicate a significant lack of fit of the model.
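As a rough check on this calculation, the sketch below computes a grouped Pearson chi-square statistic and its upper-tail probability using only the Python standard library. The function names `chi2_sf` and `pearson_fit_statistic` are invented for the example, the group counts are hypothetical (the entries of Table 3 are not reproduced here), and summing over both the recovered and not-recovered cells of each group is an assumption about how the statistic is formed. For a statistic of 3.60 on 3 degrees of freedom, the upper-tail probability is about 0.31 (its complement, the lower-tail probability, is about 0.69).

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable with a
    positive integer number of degrees of freedom (closed-form sums)."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: finite Poisson-type sum.
        total = sum(half ** i / math.factorial(i) for i in range(df // 2))
        return math.exp(-half) * total
    # Odd df: start from the df = 1 case (erfc) and add correction terms.
    total = sum(half ** (i - 0.5) / math.gamma(i + 0.5)
                for i in range(1, (df - 1) // 2 + 1))
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total


def pearson_fit_statistic(table):
    """table holds one (group size, observed, expected) tuple per group.
    Pearson terms are summed over recovered and not-recovered cells
    (an assumption of this sketch)."""
    stat = 0.0
    for size, observed, expected in table:
        stat += (observed - expected) ** 2 / expected
        stat += ((size - observed) - (size - expected)) ** 2 / (size - expected)
    return stat


# p-value for the statistic reported in the text: 3.60 on 5 - 2 = 3 df.
print("p-value:", round(chi2_sf(3.60, 5 - 2), 3))

# HYPOTHETICAL group counts, used only to exercise the function.
demo_table = [(20, 3, 2.4), (20, 6, 6.9), (20, 10, 9.8), (20, 14, 13.1), (20, 17, 17.8)]
stat = pearson_fit_statistic(demo_table)
print("statistic:", round(stat, 2), "p-value:", round(chi2_sf(stat, len(demo_table) - 2), 3))
```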

Table 3: Analysis of the fit of the logit model in Equation (1)

Figure 1: Residual plots

Brian Junker
Sun Mar 15 22:19:21 EST 1998