The problem of estimating a covariance matrix in small samples has been considered by several authors following early work by Stein. This problem can be especially important in hierarchical models where the standard errors of fixed and random effects depend on estimation of the covariance matrix of the distribution of the random effects. We propose a set of hierarchical priors for the covariance matrix that produce posterior shrinkage toward a specified structure; here we examine shrinkage toward diagonality. We then address the computational difficulties raised by incorporating these priors, and nonconjugate priors in general, into hierarchical models. We apply a combination of approximation, Gibbs sampling (possibly with a Metropolis step), and importance reweighting to fit the models, and compare this hybrid approach to alternative MCMC methods.
Our investigation involves three alternative hierarchical priors. The first works with the spectral decomposition of the covariance matrix and produces both shrinkage of the eigenvalues toward each other and shrinkage of the rotation matrix toward the identity. The second produces shrinkage of the correlations toward zero, and the third uses a conjugate Wishart distribution to shrink toward diagonality. A simulation study shows that such hierarchical priors, especially the first, can be very effective in reducing small-sample risk. We evaluate the computational algorithm in the context of a Normal nonlinear random-effects model and illustrate the methodology with a Poisson random-effects model.
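To make the two nonconjugate shrinkage targets concrete, the following minimal sketch applies each to a small-sample covariance estimate. It is an illustration only, not the hierarchical priors proposed here: the shrinkage weight `w` is a fixed, hypothetical tuning constant, whereas under a hierarchical prior the amount of shrinkage would be determined by the posterior. The function names, the log scale used for the eigenvalues, and the simulated data are likewise assumptions made for the example.

```python
import numpy as np


def shrink_correlations(S, w):
    """Shrink the correlations of covariance S toward zero.

    w is a fixed weight in [0, 1] (an assumption for illustration):
    w = 0 returns S, w = 1 returns the diagonal matrix of variances.
    """
    sd = np.sqrt(np.diag(S))
    scale = np.outer(sd, sd)
    R = S / scale                                   # correlation matrix
    R_shrunk = (1.0 - w) * R + w * np.eye(S.shape[0])
    return R_shrunk * scale                         # variances are unchanged


def shrink_eigenvalues(S, w):
    """Shrink the log-eigenvalues of S toward their mean.

    The eigenvectors (the rotation matrix) are left untouched here;
    only the eigenvalues are pulled toward each other.
    """
    vals, vecs = np.linalg.eigh(S)                  # S = V diag(vals) V^T
    log_vals = np.log(vals)
    log_shrunk = (1.0 - w) * log_vals + w * log_vals.mean()
    return (vecs * np.exp(log_shrunk)) @ vecs.T


# Simulated small sample: n = 8 observations in p = 5 dimensions,
# drawn from a distribution whose true covariance is the identity.
rng = np.random.default_rng(0)
p, n = 5, 8
X = rng.standard_normal((n, p))
S = np.cov(X, rowvar=False)

S_corr = shrink_correlations(S, w=0.5)
S_eig = shrink_eigenvalues(S, w=0.5)

print("condition number, sample estimate:   ", np.linalg.cond(S))
print("condition number, eigenvalue shrunk: ", np.linalg.cond(S_eig))
print("largest |off-diagonal|, sample:      ", np.abs(S - np.diag(np.diag(S))).max())
print("largest |off-diagonal|, corr. shrunk:", np.abs(S_corr - np.diag(np.diag(S_corr))).max())
```

In this sketch, shrinking the correlations leaves the variances unchanged and scales every off-diagonal entry by `1 - w`, while shrinking the log-eigenvalues leaves the eigenvectors unchanged and reduces the spread of the spectrum. Shrinking the rotation matrix toward the identity, the other component of the first prior, is not shown.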