Discrete Multivariate Analysis:  Statistics 36-720


Department of Statistics, Carnegie Mellon University

Fall, 2009

Instructor: Stephen E. Fienberg

  • Lectures: MW 10:30am-11:50am  BH  232Q

  • Teaching Assistant:  Daniel Manrique <dmanriqu@stat.cmu.edu>


     

    Primary Text:  Fienberg, Stephen E.  (1980). Analysis of Cross-Classified Categorical Data. 2nd Edition.  MIT Press.  Reprinted by Springer Verlag (2007).

    Additional References:

    Agresti, Alan (2002). Categorical Data Analysis, 2nd Edition, Wiley.

    Agresti, Alan (2007). An Introduction to Categorical Data Analysis, 2nd Edition, Wiley.

    Bishop, Yvonne M. M., Fienberg, Stephen E., and Holland, Paul W. (1975). Discrete Multivariate Analysis: Theory and Practice,  MIT Press.  Reprinted by Springer Verlag (2007).

    Edwards, D. (2000). Introduction to Graphical Modelling, 2nd Edition,  Springer.

    Lauritzen, S. (1996).  Graphical Models.  Oxford University Press.

    Venables, W. N. and Ripley, B. D. (2002).  Modern Applied Statistics with S, 4th Edition, Springer-Verlag.
        (See especially Chapter 7 on generalized linear models.)

    Whittaker, Joe (1990).  Graphical Models in Applied Multivariate Statistics, Wiley.
     

    Computer Programs:

    R: available on the Department of Statistics system, and also free to download.

    MIM has been installed on the PCs in the Statistics Department student computing room. If you have your own PC, you can download it from
    www.hypergraph.dk/


    Announcements 
     

    Rochdale Data Table:

    Women's economic activity and husbands' unemployment in the Rochdale urban region, UK. Source: Whittaker (1990).

    The table below gives the cell counts of an 8-way contingency table (2^8 = 256 cells) for N=665 households.
    The entries are written in standard order, with H varying fastest, then G, and so on, with A varying slowest.  The variables are:

    A: wife economically active: no, yes
    B: age of wife >= 34: no, yes
    C: husband unemployed: no, yes
    D: child aged 4 or under: no, yes
    E: wife's education, O level+: no, yes
    F: husband's education, O level+: no, yes
    G: Asian origin: no, yes
    H: other household member working: no, yes
     

    5 0 2 1 5 1 0 0 4 1 0 0 6 0 2 0
    8 0 11 0 13 0 1 0 3 0 1 0 26 0 1 0
    5 0 2 0 0 0 0 0 0 0 0 0 0 0 1 0
    4 0 8 2 6 0 1 0 1 0 1 0 0 0 1 0
    17 10 1 1 16 7 0 0 0 2 0 0 10 6 0 0
    1 0 2 0 0 0 0 0 1 0 0 0 0 0 0 0
    4 7 3 1 1 1 2 0 1 0 0 0 1 0 0 0
    0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0
    18 3 2 0 23 4 0 0 22 2 0 0 57 3 0 0
    5 1 0 0 11 0 1 0 11 0 0 0 29 2 1 1
    3 0 0 0 4 0 0 0 1 0 0 0 0 0 0 0
    1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    41 25 0 1 37 26 0 0 15 10 0 0 43 22 0 0
    0 0 0 0 2 0 0 0 0 0 0 0 3 0 0 0
    2 4 0 0 2 1 0 0 0 1 0 0 2 1 0 0
    0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
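
The ordering convention described above (H varying fastest, A slowest) matters when the counts are read into R, because array() fills its first dimension fastest. The following is a minimal sketch of one way to build the 8-way table under that convention; the object names rochdale.counts, rochdale.HtoA, rochdale.data and yn are assumptions chosen for illustration and are not part of the course materials.

```r
# Cell counts copied from the listing above, in its standard order:
# H varies fastest, then G, ..., and A varies slowest.
rochdale.counts <- c(
   5,  0,  2,  1,  5,  1,  0,  0,  4,  1,  0,  0,  6,  0,  2,  0,
   8,  0, 11,  0, 13,  0,  1,  0,  3,  0,  1,  0, 26,  0,  1,  0,
   5,  0,  2,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  1,  0,
   4,  0,  8,  2,  6,  0,  1,  0,  1,  0,  1,  0,  0,  0,  1,  0,
  17, 10,  1,  1, 16,  7,  0,  0,  0,  2,  0,  0, 10,  6,  0,  0,
   1,  0,  2,  0,  0,  0,  0,  0,  1,  0,  0,  0,  0,  0,  0,  0,
   4,  7,  3,  1,  1,  1,  2,  0,  1,  0,  0,  0,  1,  0,  0,  0,
   0,  0,  3,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
  18,  3,  2,  0, 23,  4,  0,  0, 22,  2,  0,  0, 57,  3,  0,  0,
   5,  1,  0,  0, 11,  0,  1,  0, 11,  0,  0,  0, 29,  2,  1,  1,
   3,  0,  0,  0,  4,  0,  0,  0,  1,  0,  0,  0,  0,  0,  0,  0,
   1,  1,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
  41, 25,  0,  1, 37, 26,  0,  0, 15, 10,  0,  0, 43, 22,  0,  0,
   0,  0,  0,  0,  2,  0,  0,  0,  0,  0,  0,  0,  3,  0,  0,  0,
   2,  4,  0,  0,  2,  1,  0,  0,  0,  1,  0,  0,  2,  1,  0,  0,
   0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0
)

yn <- c("no", "yes")

# array() fills its first dimension fastest, so the dimensions are
# declared in the order H, G, ..., A to match the listing.
rochdale.HtoA <- array(rochdale.counts, dim = rep(2, 8),
                       dimnames = list(H = yn, G = yn, F = yn, E = yn,
                                       D = yn, C = yn, B = yn, A = yn))

# Reverse the dimension order so that the array is indexed A, B, ..., H.
rochdale.data <- aperm(rochdale.HtoA, 8:1)

# Simple checks: the total should reproduce N = 665, and the one-way
# margin for A gives the split of households by wife's economic activity.
sum(rochdale.data)
apply(rochdale.data, 1, sum)
```

Declaring the dimensions in reverse and then calling aperm() keeps the data vector identical to the printed listing, which makes transcription errors easier to spot than reordering the counts by hand.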
     


    # Coronary heart disease data: 64 cell counts of a 2^6 table, with A varying fastest
    data.1 <- c(44,40,112,67,129,145,12,23,35,12,80,33,109,67,7,9,
              23,32,70,66,50,80,7,13,24,25,73,57,51,63,7,16,5,7,21,
              9,9,17,1,4,4,3,11,8,14,17,5,2,7,3,14,14,9,16,2,3,4,0,
              13,11,5,14,4,4)
    # Arrange the counts as a six-way array with labelled levels
    coronary.data <- array(data.1, dim = c(2,2,2,2,2,2),
                           dimnames = list(A = c("no","yes"), B = c("no","yes"),
                                           C = c("no","yes"), D = c("<140",">=140"),
                                           E = c("<3",">=3"), F = c("neg","pos")))
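
As one possible illustration of how a six-way array like the one above might be used, the sketch below computes a two-way marginal table and fits the mutual-independence log-linear model with loglin() from R's stats package. The counts are repeated only so that the block runs on its own. The choice of model and the object names counts, coronary, margin.AB and fit.indep are assumptions for illustration, not part of the course materials.

```r
# Counts repeated from the listing above so that this block runs on its own.
counts <- c(44,40,112,67,129,145,12,23,35,12,80,33,109,67,7,9,
            23,32,70,66,50,80,7,13,24,25,73,57,51,63,7,16,
            5,7,21,9,9,17,1,4,4,3,11,8,14,17,5,2,
            7,3,14,14,9,16,2,3,4,0,13,11,5,14,4,4)
coronary <- array(counts, dim = rep(2, 6),
                  dimnames = list(A = c("no","yes"), B = c("no","yes"),
                                  C = c("no","yes"), D = c("<140",">=140"),
                                  E = c("<3",">=3"), F = c("neg","pos")))

# Two-way marginal table for A and B, summing over C, D, E and F.
margin.AB <- apply(coronary, c(1, 2), sum)

# Mutual-independence model [A][B][C][D][E][F], fitted by iterative
# proportional fitting to the six one-way margins.
fit.indep <- loglin(coronary, margin = list(1, 2, 3, 4, 5, 6),
                    fit = TRUE, print = FALSE)

fit.indep$lrt   # likelihood-ratio statistic G^2
fit.indep$df    # degrees of freedom: 64 - 1 - 6 = 57
```

Richer hierarchical models can be specified the same way by listing higher-order margins, for example margin = list(c(1, 2), c(3, 4), 5, 6).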
     



         Available Handouts (including syllabus, lecture notes, and assignments)


    A Java applet for exploring the geometry of 2 x 2 tables:

    http://www.cs.cmu.edu/~eairoldi/tetraHedron3D/

    http://www.cs.cmu.edu/~eairoldi/tetraHedron3D/tetraHedron3D.jar

    For details on the existence of MLEs for log-linear models and algebraic geometry, see

    N. Eriksson, S. E. Fienberg, A. Rinaldo, S. Sullivant (2006). Polyhedral Conditions for the Nonexistence of the MLE for Hierarchical Log-linear Models. Journal of Symbolic Computation, 41, 222-233.


    Last Updated:  August 23, 2009.