In 1960, researchers studied tenants' satisfaction with their housing
conditions in Copenhagen. The data (from the file
copenhagenhousing.dat) are broken
down into a four-dimensional table, with the dimensions being:
type of housing (tower blocks, apartments, atrium houses, terraced houses),
tenant's influence on management (low, medium, high),
contact with other residents (low, high), and level of satisfaction
(low, medium, high). The number of respondents in each category can be
studied with a Poisson model.
> housing <- read.table("copenhagenhousing.dat",col.names=c("HousingType",
+ "MgmtInfluence","Contact","Satisfaction","NumberRespondents"))
> attach(housing)
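As a quick sanity check on the layout, the sketch below builds the grid of categories described above, without any counts. The level labels are assumptions (the spellings in the data file may differ); only the number of levels per dimension is taken from the text.

```r
# Hypothetical level labels; only the number of levels per factor
# comes from the description of the study.
cells <- expand.grid(
  HousingType   = c("Towerblocks", "Apartments", "Atriumhouses", "Terracedhouses"),
  MgmtInfluence = c("Low", "Med", "High"),
  Contact       = c("Low", "High"),
  Satisfaction  = c("Low", "Med", "High")
)

n_cells   <- nrow(cells)        # one row per cell of the table
table_dim <- dim(table(cells))  # extents of the four-dimensional table

print(n_cells)
print(table_dim)
```

Since 4 x 3 x 2 x 3 = 72, the data file should hold 72 rows, one count per cell.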
Now that the data have been read in, fit a generalized linear model to the
data. If you use summary to examine the results, you will get
a long list of coefficients and correlations, which you will have
to sort through to find what you are looking for. An easier
way to parse the results is through an ANOVA (analysis of variance) table.
We don't want to use up all of the degrees of freedom (the model with
every interaction is saturated and fits the 72 cell counts exactly), so we
should omit the four-way interaction, which is unlikely to be
significant anyway.
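To see where the degrees of freedom go, here is a minimal sketch that counts parameters on an empty grid of cells rather than on the real data. The level labels are placeholders, and the `^3` formula is just a shorthand for "all terms up to three-way"; neither is taken from the data file.

```r
# Placeholder labels; the parameter counts depend only on the number of levels.
cells <- expand.grid(
  HousingType   = c("Apartments", "Atriumhouses", "Terracedhouses", "Towerblocks"),
  MgmtInfluence = c("High", "Low", "Med"),
  Contact       = c("High", "Low"),
  Satisfaction  = c("High", "Low", "Med")
)

# Design matrix with every interaction, including the four-way term
saturated <- model.matrix(
  ~ HousingType * MgmtInfluence * Contact * Satisfaction, data = cells)

# Design matrix with all terms up to the three-way interactions
no_four_way <- model.matrix(
  ~ (HousingType + MgmtInfluence + Contact + Satisfaction)^3, data = cells)

resid_df_saturated   <- nrow(cells) - ncol(saturated)    # 72 - 72 = 0
resid_df_no_four_way <- nrow(cells) - ncol(no_four_way)  # 72 - 60 = 12

print(c(saturated = resid_df_saturated, no_four_way = resid_df_no_four_way))
```

Dropping the four-way interaction frees 3 x 2 x 1 x 2 = 12 degrees of freedom, which is what is left over for judging the fit.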
> copenhagen.glm <- glm(NumberRespondents ~ HousingType*MgmtInfluence*Contact*Satisfaction
+ - HousingType:MgmtInfluence:Contact:Satisfaction, family=poisson)
> anova(copenhagen.glm)
Analysis of Deviance Table
Poisson model
Response: NumberRespondents
Terms added sequentially (first to last)
                                       Df Deviance Resid. Df Resid. Dev
NULL                                                      71   833.6570
HousingType                             3 376.3000        68   457.3570
MgmtInfluence                           2  78.5163        66   378.8407
Contact                                 1  38.8321        65   340.0086
Satisfaction                            2  44.6569        63   295.3518
HousingType:MgmtInfluence               6  16.8914        57   278.4604
HousingType:Contact                     3  39.0578        54   239.4026
MgmtInfluence:Contact                   2  16.6992        52   222.7033
HousingType:Satisfaction                6  60.6687        46   162.0346
MgmtInfluence:Satisfaction              4 102.0653        42    59.9693
Contact:Satisfaction                    2  16.0175        40    43.9518
HousingType:MgmtInfluence:Contact       6   5.2896        34    38.6622
HousingType:MgmtInfluence:Satisfaction 12  22.5550        22    16.1072
HousingType:Contact:Satisfaction        6   9.2788        16     6.8284
MgmtInfluence:Contact:Satisfaction      4   0.8841        12     5.9443
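One way to judge the size of each term is to compare its deviance with a chi-squared distribution on its degrees of freedom. The sketch below does this by hand for the three-way terms, using the numbers copied from the table. These are sequential tests, so the p-values depend on the order in which the terms were added, and they should be read as a rough guide only.

```r
# Df and Deviance for the three-way terms, copied from the table above
three_way <- data.frame(
  term = c("HousingType:MgmtInfluence:Contact",
           "HousingType:MgmtInfluence:Satisfaction",
           "HousingType:Contact:Satisfaction",
           "MgmtInfluence:Contact:Satisfaction"),
  df       = c(6, 12, 6, 4),
  deviance = c(5.2896, 22.5550, 9.2788, 0.8841)
)

# Upper-tail chi-squared probability for each drop in deviance
three_way$p_value <- pchisq(three_way$deviance, three_way$df,
                            lower.tail = FALSE)

print(three_way)
```

In R, anova(copenhagen.glm, test = "Chisq") should add a comparable column to the table directly.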
It looks like three of the three-way interactions
(HousingType:MgmtInfluence:Contact, HousingType:Contact:Satisfaction, and
MgmtInfluence:Contact:Satisfaction) are not significant: they reduce the
residual deviance by only a small amount for their degrees of freedom.
Use update, just like for regression, to remove them from the model.
> anova(copenhagen.glm <- update(copenhagen.glm, . ~ . - HousingType:MgmtInfluence:Contact
+ - HousingType:Contact:Satisfaction - MgmtInfluence:Contact:Satisfaction))
Analysis of Deviance Table
Poisson model
Response: NumberRespondents
Terms added sequentially (first to last)
                                       Df Deviance Resid. Df Resid. Dev
NULL                                                      71   833.6570
HousingType                             3 376.3000        68   457.3570
MgmtInfluence                           2  78.5163        66   378.8407
Contact                                 1  38.8321        65   340.0086
Satisfaction                            2  44.6569        63   295.3518
HousingType:MgmtInfluence               6  16.8914        57   278.4604
HousingType:Contact                     3  39.0578        54   239.4026
MgmtInfluence:Contact                   2  16.6992        52   222.7033
HousingType:Satisfaction                6  60.6687        46   162.0346
MgmtInfluence:Satisfaction              4 102.0653        42    59.9693
Contact:Satisfaction                    2  16.0175        40    43.9518
HousingType:MgmtInfluence:Satisfaction 12  21.8200        28    22.1318
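As a rough check that removing the three terms did not hurt the fit, the final residual deviances of the two models can be compared. This is a sketch using only the numbers printed in the two tables; the usual chi-squared approximation is assumed to be adequate here.

```r
# Final residual deviance and residual Df, copied from the two tables
dev_larger  <- 5.9443;  df_larger  <- 12   # all three-way interactions
dev_smaller <- 22.1318; df_smaller <- 28   # after the update

# Increase in deviance caused by dropping the three terms
delta_dev <- dev_smaller - dev_larger      # 16.1875
delta_df  <- df_smaller - df_larger        # 16

# Test of the dropped terms, and overall goodness of fit of the smaller model
p_dropped <- pchisq(delta_dev, delta_df, lower.tail = FALSE)
p_fit     <- pchisq(dev_smaller, df_smaller, lower.tail = FALSE)

print(c(delta_dev = delta_dev, delta_df = delta_df,
        p_dropped = p_dropped, p_fit = p_fit))
```

A deviance increase of about 16 on 16 degrees of freedom is what you would expect by chance, which supports leaving the three terms out.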
Suppose the question of interest is, "which variables have an
effect on resident satisfaction?" To answer that question, look
at the terms interacting with Satisfaction. It appears that
housing type, influence on management, and contact with other residents
all have an effect on tenant satisfaction. Also, it appears that
housing type and influence on management interact to have an
effect on satisfaction.
Now, to check the direction of the effects:
> copenhagen.glm$coefficients
(Intercept) 4.135457 ...
There is a lot of output, and it is not in a particularly legible format. Here is a subset of the output, after some editing:
HousingTypeAtriumhouses: -1.239491
HousingTypeTerracedhouses: -1.444709
HousingTypeTowerblocks: -1.007473

HousingTypeAtriumhousesSatisfactionLow: 0.2352837
HousingTypeAtriumhousesSatisfactionMed: 0.432526
HousingTypeTerracedhousesSatisfactionMed: 0.2217337
HousingTypeTerracedhousesSatisfactionLow: 0.3838231
HousingTypeTowerblocksSatisfactionLow: -0.5432325
HousingTypeTowerblocksSatisfactionMed: -0.3314662
The first three coefficients say that fewer respondents live in atrium houses, terraced houses, and tower blocks than in apartments (the baseline housing type).
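Because the Poisson model works on the log scale, exponentiating a coefficient gives a ratio of expected counts. The sketch below assumes treatment (dummy) coding with apartments as the reference level, which is what the coefficient names suggest. Since the model also contains interactions, each ratio strictly applies with the other factors at their reference levels.

```r
# Main-effect coefficients for housing type, copied from the output above.
# Assumption: treatment coding, apartments as the reference level.
housing_coef <- c(Atriumhouses   = -1.239491,
                  Terracedhouses = -1.444709,
                  Towerblocks    = -1.007473)

# exp() turns a log-scale coefficient into a ratio of expected counts
# relative to apartments
count_ratio <- exp(housing_coef)

print(round(count_ratio, 3))
```

All three ratios are well below 1, so each of these housing types has fewer expected respondents than apartments in the reference cell.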
The next batch of coefficients reveals that tower blocks are the most preferred type of housing (their residents are less likely than apartment residents to report low or medium rather than high satisfaction), followed by apartments, and then by terraced houses and atrium houses.
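The ranking is easier to see if the interaction coefficients are laid out as a small table and exponentiated. This sketch again assumes treatment coding, with apartments and high satisfaction as reference levels (so their entries are 0 on the log scale). Because the model keeps the HousingType:MgmtInfluence:Satisfaction term, these ratios apply at the reference level of MgmtInfluence only.

```r
# Housing-by-satisfaction coefficients, copied from the output above.
# The Apartments row is the assumed reference level (log ratio of 0).
log_ratio <- rbind(
  Apartments     = c(Low = 0,          Med = 0),
  Atriumhouses   = c(Low = 0.2352837,  Med = 0.432526),
  Terracedhouses = c(Low = 0.3838231,  Med = 0.2217337),
  Towerblocks    = c(Low = -0.5432325, Med = -0.3314662)
)

# Ratio of (low or medium) to high satisfaction, relative to apartments
ratio <- exp(log_ratio)

# Sort from least to most relative dissatisfaction in the Low column
ranked <- ratio[order(ratio[, "Low"]), ]

print(round(ranked, 2))
```

Values below 1 mean relatively fewer dissatisfied respondents than in apartments; values above 1 mean relatively more.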
You can examine the rest of the coefficients on your own.