## 36-724: Applied Bayesian and Computational Methods

Time/Place: MWF 2:30-3:30, Porter Hall 125B

Website: http://www.stat.cmu.edu/~acthomas/724/

Instructor: Andrew C. Thomas (acthomas -at- stat.cmu.edu). Office Hours: Thursday, 3:00-5:00, Baker Hall 132H (or by appointment).

TA: Dancsi Percival (dperciva -at- stat.cmu.edu). Office Hours: Friday 10:00-11:00, Wean 8110.

Remaining Class Schedule: Class as usual February 20, 22, 24. No class February 27. Special presentations by Prof. Junker on February 29 and Prof. Shalizi on March 2. One final class meeting March 5.

Required text: Gelman, Carlin, Stern and Rubin (2003), Bayesian Data Analysis, Second Edition. Chapman & Hall. I will often refer to this in shorthand as "the red book".

Suggested texts:

- Joseph B. Kadane, "Principles of Uncertainty", CRC. Available for free!
- Peter Congdon, "Bayesian Statistical Modelling" and "Applied Bayesian Modelling", Wiley. Texts used in previous iterations of the course. The code in these books is written in WinBUGS.
- Larry Wasserman, "All of Statistics", Springer. I will often refer to this as "the magenta book".
- Andrew Gelman and Jennifer Hill, "Data Analysis using Regression and Multilevel/Hierarchical Models", Cambridge University Press. Buy the softcover version.

Prerequisites: 36-705 "Intermediate Statistics" and 36-707 "Intermediate Regression", or permission of the instructor.

Outline: The goal of this short course is to give a meaningful introduction to, and exploration of, Bayesian statistical methods through computational techniques. We will focus on the principles of Bayesian hierarchical modelling methods that can be programmed efficiently and remain scientifically valid, and on methods for debugging without pulling too much hair out. We will not explicitly cover discriminative machine-learning topics, but the debugging concepts we cover will make those methods easier to code up as well.

Grading: There will be six weekly homework assignments. The lowest score will be dropped, so each of your top five homeworks is worth 16% of the total; the remaining 20% is for homework follow-ups, classroom discussion and participation. No late homeworks will be accepted! There will be no final exam.

Programming language: R will be the only supported language for the course.

Course Notes: Available here; will be continually updated throughout the course.

Demo files:

- Demo 1: Sampling Random Variables
- Demo 2: Basic Form for Homework Problems
- Demo 3: Markov Chain Methods
- Demo 4: The Gelman-Rubin R statistic for multiple MCMC chains.
- Demo 5: General MCMC routines.
- Demo 6: MCMC for the Efron-Morris data and model.
- Demo 7: The radon data applied problem.
- Demo 8: Importance Sampling.
- Demo 9: Sequential Importance Sampling and Kalman Filter.

Assignments: Submit to 724homeworksgohere@gmail.com. Email submissions only!

- Homework 1, due January 30 at 14:30.
- Homework 2, due February 6 at 14:30. Data files: walker.RData, weathersequence.RData.
- Homework 3, due February 13 at 14:30.
- Homework 4, due February 20 at 14:30. Data: lupus.csv, efron-morris-batting.csv.
- Homework 5, due February 27 at 14:30. Data: simple.set.RData.
- Homework 6, due Wednesday March 7 at 14:30. Data: d1.RData.

Reading:

- Metropolis et al 1953 - Equation of State Calculations by Fast Computing Machines
- Hastings 1970 - Monte Carlo sampling methods using Markov chains and their applications
- Geman and Geman 1984 - Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images
- Cook et al 2006 - Validation of Software for Bayesian Models Using Posterior Quantiles
- Gelman and Rubin 1992 - Inference from iterative simulation using multiple sequences (with discussion)
- Efron and Morris 1975 - Data Analysis Using Stein's Estimator and its Generalizations
- Gelman et al 1996 - Posterior predictive assessment of model fitness via realized discrepancies (with discussion)

Tentative outline of the course (subject to reordering, but topics will likely be conserved):

- Week of January 16 (no class Monday!):
  - Introductions. What is Bayesian Statistics?
  - One-level, one variable models; conjugate prior distributions.
  - (Re-)Introduction to sampling and simulation in R. Grid approximation sampling, rejection sampling.
- Week of January 23:
  - A reintroduction to Markov Chain theory, beginning with discrete models and moving to one-dimensional continuous models.
  - "Central Dogma of Generative Modelling"; or, how to have confidence in your coding.
- Week of January 30:
  - Metropolis-Hastings algorithm and Gibbs sampling. Generalized linear models.
- Week of February 6:
  - Gaussian multi-level models, including partial and full pooling of variance components.
  - Autocorrelation and cross-correlation in chains.
  - Diagnostics for convergence.
- Week of February 13:
  - Generalized multilevel models.
  - Posterior predictive checking.
  - Model comparison and selection methods.
- Week of February 20:
  - Varying-slope models in the multilevel context.
- Week of February 27:
  - Special topics to be determined; maybe time for Bayesian graphical models, causal inference.
- Week of March 5:
  - Course wrap-up.

References:

- The official R Reference Card, containing many, many useful functions in R.