Finger Exercises --- Due Tuesday, March 17, 1998

Please prepare answers for these exercises, to be turned in. Your grade will be based on how many questions you made a reasonably strong effort to answer, not how many questions you get right or partially right. There is no partial credit, but the grader may give you additional written feedback on each question you attempt.
Feel free to discuss these exercises with each other and with me. You will get the most benefit from the grader's remarks, however, if the work you turn in is your own.

This homework, like most material for this course, is posted on the World Wide Web at http://www.stat.cmu.edu/~brian/402/ ; you can cut and paste data directly out of the TeX file if you wish. Look under week8 or week9.

There are three problems.

1. Please select one of the two problems from HW 2 and re-do it in SAS, using whichever of PROC CATMOD, PROC GENMOD, PROC FREQ, PROC INSIGHT (and possibly other PROCs) you find useful. Try to reproduce as much of your S-PLUS analysis in SAS as possible.

2. Again from the book "The Analysis of Cross-Classified Categorical Data" by Fienberg (1987). Goodman (1973) analyzed the following data of Lazarsfeld, which give the cross-classification of 266 respondents, each interviewed at two successive points in time, with respect to their voting intentions (Republican or Democrat) and their opinion of the Republican candidate (For or Against) [see file t138.dat].

The following parts should be done more or less together.
(a) Find the most parsimonious log-linear model that fits these data. Illustrate your work with appropriate exploratory and residual plots, and a little narrative about your work.
(b) Draw a graph indicating the conditional independence relations you found in the data.
and finally
(c) Use what you know about the time-order and other relationships between these four measurements to change (some of) the edges in your graph for part (b) into arrows going from variables that you suspect are causes to the variables that you suspect are their effects.
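To illustrate the mechanics of part (a) only, here is a minimal sketch in S-style syntax as accepted by R. It assumes the 16 cell counts can be taken from the `vote` listing in the next problem, with the `yes`/`no` columns treated as the time-2 opinion; that reading, the dimension name `op2`, and the object names `tab`, `indep` and `twoway` are assumptions made for illustration. The `loglin` call follows R's `stats` package and may differ in S-PLUS. The two models fitted are arbitrary reference points, not the intended answer.

```r
# Counts copied from the `vote` listing; rows run with int2 varying fastest,
# then op1, then int1. The "yes" column comes first, then the "no" column.
counts <- c(129, 1, 11, 0, 1, 12, 1, 2,    # "yes" column
            3, 2, 23, 1, 0, 11, 1, 68)     # "no" column

# 2 x 2 x 2 x 2 table; `op2` is a hypothetical name for the yes/no response.
tab <- array(counts, dim = c(2, 2, 2, 2),
             dimnames = list(int2 = c("int21", "int22"),
                             op1  = c("op11", "op12"),
                             int1 = c("int11", "int12"),
                             op2  = c("yes", "no")))

# Reference model 1: mutual independence of the four variables.
indep <- loglin(tab, margin = list(1, 2, 3, 4), fit = TRUE, print = FALSE)

# Reference model 2: all two-factor interactions.
twoway <- loglin(tab,
                 margin = list(c(1, 2), c(1, 3), c(1, 4),
                               c(2, 3), c(2, 4), c(3, 4)),
                 fit = TRUE, print = FALSE)

# Likelihood-ratio statistics and degrees of freedom for each model.
print(c(lrt = indep$lrt,  df = indep$df))
print(c(lrt = twoway$lrt, df = twoway$df))

# Approximate p-value for the fit of the two-factor model.
print(1 - pchisq(twoway$lrt, twoway$df))
```

Models between these two extremes can be explored by dropping pairs from the `margin` list and comparing the change in `lrt` against the change in `df`.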

3. Suppose that, in problem #2, we are specifically interested in which variables influence the opinion voters have of the Republican candidate at time 2. If the data are arranged as follows,
402 > vote
   int2  op1  int1 yes no
1 int21 op11 int11 129  3
2 int22 op11 int11   1  2
3 int21 op12 int11  11 23
4 int22 op12 int11   0  1
5 int21 op11 int12   1  0
6 int22 op11 int12  12 11
7 int21 op12 int12   1  1
8 int22 op12 int12   2 68
then we could fit logistic regression models as follows,
402 > attach(vote)
402 > YN <- cbind(yes,no)
402 > mymod <- glm(YN ~ [model terms involving op1, int1, int2],family=binomial)
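As a concrete but deliberately unselective illustration of the call above, the sketch below rebuilds the data frame from the listing and fits a main-effects model. It is written in S-style syntax as accepted by R. The main-effects formula is only an assumed starting point for model selection, and `fit.main` is a hypothetical object name.

```r
# Rebuild the data frame from the listing (counts copied row by row).
vote <- data.frame(
  int2 = factor(rep(c("int21", "int22"), times = 4)),
  op1  = factor(rep(rep(c("op11", "op12"), each = 2), times = 2)),
  int1 = factor(rep(c("int11", "int12"), each = 4)),
  yes  = c(129, 1, 11, 0, 1, 12, 1, 2),
  no   = c(3, 2, 23, 1, 0, 11, 1, 68)
)

# Two-column (successes, failures) response with main effects only.
# This is one candidate model, not the answer to the exercise.
fit.main <- glm(cbind(yes, no) ~ op1 + int1 + int2,
                family = binomial, data = vote)

# Residual deviance and its degrees of freedom as a rough measure of fit.
dev <- deviance(fit.main)
rdf <- fit.main$df.residual
print(c(deviance = dev, df = rdf, p.value = 1 - pchisq(dev, rdf)))

# Pearson residuals, one per row of `vote`, for residual checking.
print(round(residuals(fit.main, type = "pearson"), 2))
```

Interaction terms such as `op1:int1` can be added to or dropped from the formula, and the change in deviance compared with the change in degrees of freedom, while keeping every lower-order term that a retained interaction contains.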

(a) Find the most parsimonious logistic regression model (it should obey the Hierarchical Terms Rule) that fits these data. Also assess the residuals of the models you try.
(b) Compare the model you have found here with the model you found in #2.

Brian Junker
Thu Mar 12 08:46:29 EST 1998