Figure 1 presents a residual analysis for this logistic regression.
One relatively easy way to assess the fit of the logistic regression
model is to compare the predicted probability of spontaneous language
recovery under the model in (1) with the observed probability of
language recovery.
Specifically, substitute the estimated regression coefficients given
above into the multiple logistic regression equation in (1).
For each patient in the study, calculate his or her individual
predicted probability of language recovery: form the linear
combination of the patient's explanatory variables with the estimated
regression coefficients, then apply the inverse of the logit
transformation to solve for the probability of language recovery,
$\hat{p} = \exp(\hat{\eta}) / (1 + \exp(\hat{\eta}))$, where
$\hat{\eta}$ denotes that linear combination.
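The calculation just described can be sketched in a few lines of Python. This is a minimal illustration rather than the analysis code used for the study: the intercept and coefficient values below are placeholders chosen only so the example runs (they are NOT the fitted estimates), the presence of an intercept term is assumed, and the function names are hypothetical. The covariate names follow the variable labels used in the text.

```python
import math


def inverse_logit(eta):
    """Map a linear predictor (log-odds) to a probability in (0, 1)."""
    # Branch on the sign so that math.exp never overflows for large |eta|.
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    z = math.exp(eta)
    return z / (1.0 + z)


def predicted_probability(intercept, coefficients, covariates):
    """Linear combination of covariates and coefficients, then inverse logit."""
    eta = intercept + sum(
        coefficients[name] * covariates[name] for name in coefficients
    )
    return inverse_logit(eta)


# PLACEHOLDER values for illustration only; substitute the fitted estimates.
intercept = -2.0
coefficients = {"SIDE": 0.5, "GENDER": -0.3, "AGE": -0.01,
                "LOSD": -0.02, "WAB1": 0.05}

# One patient's covariate values, keyed by the same variable names.
patient = {"SIDE": 0, "GENDER": 1, "AGE": 61, "LOSD": 19, "WAB1": 81.6}

p_hat = predicted_probability(intercept, coefficients, patient)
print(f"predicted probability of recovery: {p_hat:.3f}")
```

Because the coefficients are placeholders, the printed value is not expected to match the probability reported for the fitted model.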
For example, consider a 61-year-old male patient (GENDER=1, AGE=61)
whose stroke was on the left side (SIDE=0), who stayed in the
hospital 19 days (LOSD=19), and whose WAB at discharge was 81.6
(WAB1=81.6). The estimated probability of spontaneous language
recovery based on the model for such a patient is approximately 0.77.
This value can be interpreted to mean that 77 patients out of 100 with
the above characteristics can be expected to recover normal language
functioning spontaneously within the first 2 months post-hospital
discharge. To study the goodness of fit of the model, we obtain the
frequency distribution of the predicted probabilities and determine,
say, the quintiles of the distribution. The observed number of
patients who had normal language functioning is tallied within each
quintile, and the expected number of patients who had normal language
functioning (predicted by the model) is determined by summing the
predicted probabilities for all the patients in the quintile.
Table 3 presents the expected and observed numbers of patients who had
normal language functioning for each quintile of risk of language
recovery. Comparison of the expected numbers predicted by the model
with the observed numbers suggests that this model fits the data
reasonably well.
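The grouping step behind such a table can be sketched as follows. This is an illustrative Python version under stated assumptions: patients are ranked by predicted probability and split into groups of (nearly) equal size, ties are broken arbitrarily by the sort, and the ten probabilities and outcomes are made-up numbers, not the study data. The function name `quantile_group_table` is hypothetical.

```python
def quantile_group_table(probabilities, outcomes, n_groups=5):
    """Tabulate observed and expected event counts within risk groups.

    probabilities: predicted probability of the event for each patient.
    outcomes: 1 if the event was observed for that patient, else 0.
    """
    if len(probabilities) != len(outcomes):
        raise ValueError("probabilities and outcomes must have equal length")
    # Rank patients from lowest to highest predicted probability.
    order = sorted(range(len(probabilities)), key=lambda i: probabilities[i])
    n = len(order)
    table = []
    for g in range(n_groups):
        # Integer arithmetic gives contiguous groups of near-equal size.
        members = order[g * n // n_groups:(g + 1) * n // n_groups]
        table.append({
            "group": g + 1,
            "n": len(members),
            # Observed count: number of patients with the event.
            "observed": sum(outcomes[i] for i in members),
            # Expected count: sum of the predicted probabilities.
            "expected": sum(probabilities[i] for i in members),
        })
    return table


# Made-up data for illustration only.
probs = [0.10, 0.15, 0.30, 0.35, 0.50, 0.55, 0.70, 0.75, 0.90, 0.95]
recovered = [0, 0, 0, 1, 1, 0, 1, 1, 1, 1]

table = quantile_group_table(probs, recovered)
for row in table:
    print(row["group"], row["n"], row["observed"], round(row["expected"], 2))
```

Cutting at the empirical quintiles of the predicted probabilities, as the text describes, gives the same groups when there are no ties at the cut points; the rank-based split is used here only because it keeps the sketch short.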
A test for whether there is a significant difference between the observed and expected numbers in Table 3 can be constructed based on a Pearson chi-square test. This chi-square test has degrees of freedom equal to the number of groups (e.g., 5 for quintiles) minus 2 (Hosmer and Lemeshow, 1989, chapter 5). In our example, the chi-square statistic is equal to 3.60 with 3 degrees of freedom and has a non-significant p-value of approximately 0.31, indicating that the model adequately fits the data. A small p-value would indicate a significant lack of fit of the model.
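One way such a test might be computed is sketched below in Python, using only the standard library. Two points are assumptions rather than details given in the text: the statistic sums Pearson contributions from both the recovered and the not-recovered cell of each group (the usual Hosmer and Lemeshow form), and the upper-tail chi-square probability is evaluated with the closed-form series valid for integer degrees of freedom. The function names and the group counts are hypothetical and do not reproduce Table 3.

```python
import math


def hosmer_lemeshow_statistic(groups):
    """Pearson chi-square over groups given as (n, observed, expected)."""
    stat = 0.0
    for n, observed, expected in groups:
        # Contribution of the cell counting patients with the event.
        stat += (observed - expected) ** 2 / expected
        # Contribution of the cell counting patients without the event.
        stat += ((n - observed) - (n - expected)) ** 2 / (n - expected)
    return stat


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square with integer df >= 1."""
    if df < 1 or x < 0:
        raise ValueError("requires df >= 1 and x >= 0")
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite series in odd half-powers of x.
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    total = 0.0
    for k in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * k + 1)
    return math.erfc(math.sqrt(half)) + total


# Hypothetical (n, observed, expected) counts for five risk groups.
groups = [(20, 3, 4.1), (20, 8, 7.2), (20, 11, 10.6),
          (20, 14, 14.3), (20, 18, 17.4)]

statistic = hosmer_lemeshow_statistic(groups)
df = len(groups) - 2  # number of groups minus 2
p_value = chi2_sf(statistic, df)
print(f"chi-square = {statistic:.2f}, df = {df}, p = {p_value:.3f}")
```

Where SciPy is available, `scipy.stats.chi2.sf(statistic, df)` returns the same upper-tail probability and could replace the hand-written `chi2_sf`.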
Table 3: Analysis of the fit of the logit model in Equation (1)