Factor Analysis

36-462/662, Fall 2019

16 September 2019

\[ \newcommand{\X}{\mathbf{x}} \newcommand{\w}{\mathbf{w}} \newcommand{\V}{\mathbf{v}} \newcommand{\S}{\mathbf{s}} \newcommand{\Expect}[1]{\mathbb{E}\left[ #1 \right]} \newcommand{\Var}[1]{\mathrm{Var}\left[ #1 \right]} \newcommand{\SampleVar}[1]{\widehat{\mathrm{Var}}\left[ #1 \right]} \newcommand{\Cov}[1]{\mathrm{Cov}\left[ #1 \right]} \DeclareMathOperator{\tr}{tr} \DeclareMathOperator*{\argmin}{argmin} \newcommand{\FactorLoadings}{\mathbf{\Gamma}} \newcommand{\Uniquenesses}{\mathbf{\psi}} \]

Recap

PCA is not a model

This is where factor analysis comes in

Remember PCA: \[ \S = \X \w \] and \[ \X = \S \w^T \]

(because \(\w^T = \w^{-1}\))

If we use only \(q\) PCs, then \[ \S_q = \X \w_q \] but \[ \X \neq \S_q \w_q^T \]
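As a quick check on both facts, here is a sketch in R with simulated data (everything below is made up purely for illustration):

# Sketch: with all p PCs, the scores reconstruct X exactly; with q < p
# of them, they don't (simulated data, for illustration only)
set.seed(42)
x <- matrix(rnorm(200 * 5), nrow = 200, ncol = 5)
x <- scale(x, center = TRUE, scale = FALSE)  # PCA works with centered data
w <- prcomp(x)$rotation        # p x p matrix of PC weights
s <- x %*% w                   # all p scores
max(abs(crossprod(w) - diag(5)))   # ~ 0, since w^T w = I
max(abs(s %*% t(w) - x))           # ~ 0: full reconstruction
q <- 2
s.q <- x %*% w[, 1:q]          # keep only the first q PCs
max(abs(s.q %*% t(w[, 1:q]) - x))  # visibly > 0: truncation loses information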

The factor model

\(\vec{X}\) is \(p\)-dimensional, and manifest (i.e., observable)

\(\vec{F}\) is \(q\)-dimensional, with \(q < p\), and latent (i.e., hidden or unobserved)

The model: \[\begin{eqnarray*} \vec{X} & = & \FactorLoadings \vec{F} + \vec{\epsilon}\\ (\text{observables}) & = & (\text{factor loadings}) (\text{factor scores}) + (\text{noise}) \end{eqnarray*}\]

The factor model

\[ \vec{X} = \FactorLoadings \vec{F} + \vec{\epsilon} \]

The factor model: summary

\[\begin{eqnarray} \vec{X} & = & \FactorLoadings \vec{F} + \vec{\epsilon}\\ \Cov{\vec{F}, \vec{\epsilon}} & = & \mathbf{0}\\ \Expect{\vec{\epsilon}} & = & \vec{0}\\ \Var{\vec{\epsilon}} & \equiv & \Uniquenesses, ~ \text{diagonal}\\ \Var{\vec{F}} & = & \mathbf{I} \end{eqnarray}\]
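To make the assumptions concrete, here is a minimal sketch of simulating from the model in R; the dimensions and parameter values are arbitrary choices for illustration, not estimates from any data:

# Sketch: draw n samples from X = Gamma F + epsilon under the assumptions
# above (arbitrary illustrative dimensions and parameters)
set.seed(7)
n <- 1000; p <- 4; q <- 2
Gamma <- matrix(rnorm(p * q), nrow = p, ncol = q)  # factor loadings
psi <- diag(runif(p, 0.5, 1.5))                    # diagonal uniquenesses
f <- matrix(rnorm(n * q), nrow = n, ncol = q)      # Var[F] = I, E[F] = 0
epsilon <- matrix(rnorm(n * p), nrow = n, ncol = p) %*% sqrt(psi)
x <- f %*% t(Gamma) + epsilon      # each row is one draw of X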

Some consequences of the assumptions

\[\begin{eqnarray} \vec{X} & = & \FactorLoadings \vec{F} + \vec{\epsilon}\\ \Expect{\vec{X}} & = & \FactorLoadings \Expect{\vec{F}} \end{eqnarray}\]

Some consequences of the assumptions

\[\begin{eqnarray} \vec{X} & = & \FactorLoadings \vec{F} + \vec{\epsilon}\\ \Var{\vec{X}} & = & \FactorLoadings \Var{\vec{F}} \FactorLoadings^T + \Var{\vec{\epsilon}}\\ & = & \FactorLoadings \FactorLoadings^T + \Uniquenesses \end{eqnarray}\]

Some consequences of the assumptions

\[\begin{eqnarray} \vec{X} & = & \FactorLoadings \vec{F} + \vec{\epsilon}\\ \Var{\vec{X}} & = & \FactorLoadings \FactorLoadings^T + \Uniquenesses\\ \Cov{X_i, X_j} & = & \text{what?} \end{eqnarray}\]
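Since \(\Uniquenesses\) is diagonal, it contributes nothing off the diagonal, so \(\Cov{X_i, X_j} = \sum_{k=1}^{q}{\Gamma_{ik} \Gamma_{jk}}\) when \(i \neq j\). Continuing the simulation sketch from above, we can check this numerically:

# Sketch, continuing the simulation above: the sample covariance of x
# should approach Gamma Gamma^T + psi, and its off-diagonal entries
# should match Gamma Gamma^T alone
implied <- Gamma %*% t(Gamma) + psi
max(abs(cov(x) - implied))          # small, and shrinks as n grows
(Gamma %*% t(Gamma))[1, 2]          # model-implied Cov[X_1, X_2]
cov(x)[1, 2]                        # sample version; should be close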

Geometry

The model says the data concentrate around a \(q\)-dimensional linear subspace of the \(p\)-dimensional space, the one spanned by the columns of \(\FactorLoadings\), with the noise \(\vec{\epsilon}\) scattering points away from that subspace.

How do we estimate?

\[ \vec{X} = \FactorLoadings \vec{F} + \vec{\epsilon} \]

Can’t regress \(\vec{X}\) on \(\vec{F}\) because we never see \(\vec{F}\)

Suppose we knew \(\Uniquenesses\)

then we'd say \[\begin{eqnarray} \Var{\vec{X}} & = & \FactorLoadings\FactorLoadings^T + \Uniquenesses\\ \Var{\vec{X}} - \Uniquenesses & = & \FactorLoadings\FactorLoadings^T \end{eqnarray}\] and we could recover \(\FactorLoadings\), up to a rotation, by factoring the left-hand side (say, from its top \(q\) eigenvectors and eigenvalues)

Suppose we knew \(\FactorLoadings\)

then we’d say \[\begin{eqnarray} \Var{\vec{X}} & = & \FactorLoadings\FactorLoadings^T + \Uniquenesses\\ \Var{\vec{X}} - \FactorLoadings\FactorLoadings^T & = & \Uniquenesses \end{eqnarray}\]

“One person’s vicious circle is another’s iterative approximation”:
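Here is a minimal sketch of that alternation in R, under the assumptions above; the starting values, stopping rule, and the name estimate.factors are my own illustrative choices, not code from the course:

# Sketch of the iterative approximation: alternate between getting Gamma
# from Var[X] - psi (given psi) and psi from Var[X] - Gamma Gamma^T
# (given Gamma); details here are illustrative choices
estimate.factors <- function(x, q, max.iter = 100, tol = 1e-8) {
    V <- cov(x)
    psi <- rep(0, ncol(V))   # crude starting guess for the uniquenesses
    for (iter in 1:max.iter) {
        # Given psi: factor V - psi through its top q eigencomponents
        eig <- eigen(V - diag(psi), symmetric = TRUE)
        Gamma <- eig$vectors[, 1:q, drop = FALSE] %*%
            diag(sqrt(pmax(eig$values[1:q], 0)), q)
        # Given Gamma: whatever diagonal variance is left over is psi
        psi.new <- pmax(diag(V - Gamma %*% t(Gamma)), 0)
        if (max(abs(psi.new - psi)) < tol) break
        psi <- psi.new
    }
    list(loadings = Gamma, uniquenesses = psi)
}
# On the simulated data from earlier: loadings are only identified up to
# rotation, but the uniquenesses should roughly match diag(psi)
fit <- estimate.factors(x, q = 2)
round(cbind(estimated = fit$uniquenesses, truth = diag(psi)), 2)

The cate package used below provides more carefully worked-out estimators; here it is at work on the dress images: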

small.height <- 80
small.width <- 60
source("../../hw/03/eigendresses.R")
dress.images <- image.directory.df(path = "../../hw/03/amazon-dress-jpgs/", 
    pattern = "*.jpg", width = small.width, height = small.height)
library("cate")  # For high-dimensional (p > n) factor models
# The function is a little fussy, and needs its input to be a matrix rather
# than a data frame. Also, it has a bunch of estimation methods, but the
# default one doesn't work nicely when some observables have zero variance
# (here, the white pixels at the edges of every image), so use something a
# little more robust.
dresses.fa.1 <- factor.analysis(as.matrix(dress.images), r = 1, method = "pc")
summary(dresses.fa.1)  # Factor loadings, factor scores, uniquenesses
##       Length Class  Mode   
## Gamma 14400  -none- numeric
## Z       205  -none- numeric
## Sigma 14400  -none- numeric

par(mfrow = c(1, 2))
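# The overall sign of a factor's loadings is arbitrary, so look at both signs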
plot(vector.to.image(dresses.fa.1$Gamma, height = small.height, width = small.width))
plot(vector.to.image(-dresses.fa.1$Gamma, height = small.height, width = small.width))

par(mfrow = c(1, 1))

dresses.fa.5 <- factor.analysis(as.matrix(dress.images), r = 5, method = "pc")
summary(dresses.fa.5)
##       Length Class  Mode   
## Gamma 72000  -none- numeric
## Z      1025  -none- numeric
## Sigma 14400  -none- numeric