Four key uncertainties about future climate are: (i) the climate system response, as measured by the temperature change under a specified external perturbation; (ii) climate variations due to natural and anthropogenic aerosols; (iii) the magnitude and character of natural climate variability; and (iv) the spatio-temporal patterns of change in climate variables. An honest assessment of these uncertain climate properties is key to supporting any scientific statement about the current state of Earth's climate, and to constructing forward-looking projections that may be used in policy decisions. To quantify climate system properties, computer models are run under different parameterizations. The present work attempts to provide a fully probabilistic quantification of the uncertainties in the climate system properties. It is the result of a multi-year collaboration between climate scientists at MIT and statisticians at UCSC.<br>

<p>In this work, we consider the MIT 2D climate model (MIT2DCM; Forest et al., 2006). This model provides simulations of ocean, surface and upper atmospheric temperature behavior. It uses a system of latitude and height coordinates to simulate the average state of the climate over zones defined by latitude bands. Despite the averaging over longitude, the model is sufficiently complex to match climate observations and to make predictions similar to those of full 3D atmosphere-ocean general circulation models (GCMs). In the MIT2DCM, the climate system properties control the climate system response via three parameters: (i) climate sensitivity, defined as the equilibrium global-mean surface temperature response to a doubling of CO<sub>2</sub>; (ii) the diffusion of deep-ocean temperature anomalies; and (iii) the net anthropogenic aerosol and unmodeled forcings. We collect these three parameters in a single vector. In our approach, we estimate the posterior distribution of this parameter vector using historical records and simulations from the MIT2DCM and GCMs.<br>

<p>Typical output from a run of the MIT2DCM consists of temperatures at 46 different latitudes and 11 vertical layers, every 30 minutes, for the 1860-1995 period. Such output is summarized in three "diagnostics": (i) upper air temperature changes between the 1986-1995 and 1961-1980 periods at 36 latitudes and 8 levels, giving a vector of 288 components; (ii) surface temperature change, consisting of the differences between the decadal average temperatures for the 1946-1995 period and the average temperature for 1906-1995 in 4 equal-area zonal bands, giving a 20-dimensional vector; (iii) the deep ocean temperature trend, calculated for the 1952-1995 period. Historical observations and GCM output are obtained in correspondence to the three diagnostics.<br>
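<p>As an illustration of how the 20-dimensional surface diagnostic could be assembled, the following minimal sketch assumes that the model output has already been reduced to annual mean temperatures for 1906-1995 in 4 zonal bands. The function name <code>surface_diagnostic</code>, the array layout and the synthetic input are hypothetical and are not taken from the MIT2DCM code.<br>

```python
import numpy as np


def surface_diagnostic(annual_temps, first_year=1906):
    """Decadal means for 1946-1995 minus the 1906-1995 mean, per zonal band.

    annual_temps: array of shape (n_years, n_bands) holding annual mean
    temperatures for first_year..1995 (hypothetical layout, assumed here).
    Returns a vector of length 5 * n_bands, ordered decade by decade.
    """
    annual_temps = np.asarray(annual_temps, dtype=float)
    n_years, n_bands = annual_temps.shape
    if n_years != 1995 - first_year + 1:
        raise ValueError("expected one row per year from first_year to 1995")

    baseline = annual_temps.mean(axis=0)           # 1906-1995 mean per band
    start = 1946 - first_year                      # row index of year 1946
    last_50 = annual_temps[start:start + 50]       # years 1946-1995
    decadal = last_50.reshape(5, 10, n_bands).mean(axis=1)  # (5, n_bands)
    return (decadal - baseline).reshape(-1)        # 5 decades x 4 bands = 20


# Synthetic stand-in for model output: 90 years, 4 equal-area zonal bands.
rng = np.random.default_rng(0)
fake_output = 0.01 * np.arange(90)[:, None] + 0.1 * rng.standard_normal((90, 4))
print(surface_diagnostic(fake_output).shape)
```

<p>The same function could be applied to observations and to GCM output, so that all three sources are compared on the same 20 components.<br>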

<p>The MIT2DCM is run by letting the three parameters vary on a discrete grid. Prior information on climate properties is available from the literature on climate models. For a given diagnostic, consider the observation and the corresponding output of the model. Since the model output depends on the parameters, we can obtain a likelihood function by assuming that the difference between the observation and the model output is Gaussian. The resulting posterior provides information for inference on the climate properties. Unfortunately, output from the MIT2DCM is available only at a few hundred pre-specified points on an irregular grid. Even though the MIT2DCM is much faster than a GCM, the running and post-processing times needed to obtain a value for one of the diagnostics prevent us from embedding its evaluation within an iterative method. So, to fully explore the posterior distribution of the parameters, we create an auxiliary statistical model that provides an approximation to the model output. For this purpose we use a Gaussian process. This is justified, from a Bayesian viewpoint, by the fact that for a given parameter value the model output is unknown, so we may consider it as a random process. In essence, the setting of our problem is that of calibration of computer model parameters as described in Kennedy and O'Hagan (2001). The focus is not on prediction or data assimilation, but on inference for the parameters that control the computer model, which have a precise physical meaning. For two of the diagnostics the output is multivariate, so the calibration procedure has to incorporate information about a covariance matrix. We use GCM output to elicit a prior distribution for such matrices.<br>
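<p>To make the idea of a Gaussian process surrogate concrete, here is a minimal sketch for a scalar output. Everything in it is an assumption made for illustration: the toy function <code>toy_simulator</code> stands in for an expensive model run, the squared-exponential kernel and its fixed hyperparameters are arbitrary choices, and the two-dimensional parameter grid is hypothetical. It is not the authors' emulator.<br>

```python
import numpy as np


def toy_simulator(theta):
    """Hypothetical cheap stand-in for one scalar diagnostic of a model run."""
    return np.sin(theta[:, 0]) + 0.5 * theta[:, 1]


def sq_exp_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential covariance between rows of a (n, d) and b (p, d)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return variance * np.exp(-0.5 * d2 / length_scale**2)


def gp_predict(theta_train, y_train, theta_new,
               length_scale=1.0, variance=1.0, jitter=1e-6):
    """Zero-mean GP posterior mean and variance at theta_new."""
    n = theta_train.shape[0]
    K = sq_exp_kernel(theta_train, theta_train, length_scale, variance)
    K += jitter * np.eye(n)                      # numerical stabilisation
    K_star = sq_exp_kernel(theta_new, theta_train, length_scale, variance)

    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_star @ alpha

    v = np.linalg.solve(L, K_star.T)
    var = variance - (v**2).sum(axis=0)
    return mean, np.maximum(var, 0.0)            # clip tiny negative values


# Design points: a 6 x 6 grid of parameter values where the "model" was run.
g = np.linspace(0.0, 3.0, 6)
theta_train = np.array([[a, b] for a in g for b in g])
y_train = toy_simulator(theta_train)

# Emulate the output at parameter values that were never run.
theta_new = np.array([[0.9, 1.5], [2.1, 0.3]])
mean, var = gp_predict(theta_train, y_train, theta_new)
print(mean, var)
```

<p>In a calibration setting, the emulator's predictive mean and variance at a candidate parameter value could replace the unavailable model run inside the Gaussian likelihood, which is what makes an iterative exploration of the posterior affordable.<br>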

<p>For a technical description of the model, consider the available parameter configurations and the locations of the components of a diagnostic, of which there are 20 or 288. We distinguish the model runs at the available configurations from the value of the MIT2DCM at the `true' value of the parameter. We assume that the observations equal the MIT2DCM output at the true parameter value plus an error term. This error term encompasses observational errors and model inaccuracies and biases, and we give it a covariance matrix. In the climate literature, this covariance is usually referred to as the unforced variability. Estimating it will be a byproduct of our model, but it is a problem that has an interest of its own. We then assume that the model output is a Gaussian process whose mean is a polynomial with a vector of coefficients. Additionally, its covariance is specified through a correlation function. We assume that the MIT2DCM can reproduce a correlation structure compatible with the unforced variability. We obtain prior information about the unforced variability by considering output from a GCM and assuming that it also captures the unforced variability.<br>
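<p>One way to write down a model of this kind is sketched below. All symbols are assumptions introduced here for illustration, since the original notation is not available: y is the observed diagnostic with m components, &theta;* the `true' parameter value, &eta; the model output, &Sigma; the unforced variability, h a vector of polynomial terms, &beta; its coefficients, and r a correlation function over parameter values. The zero-mean Gaussian error and the absence of an extra variance scale are also assumptions.<br>

```latex
\begin{align*}
  y &= \eta(\theta^{*}) + \varepsilon,
      \qquad \varepsilon \sim N_m(0, \Sigma), \\
  \mathrm{E}\bigl[\eta(\theta, s_j)\bigr] &= h(\theta, s_j)^{\top}\beta,
      \qquad j = 1, \dots, m, \\
  \operatorname{Cov}\bigl(\eta(\theta, s_j),\, \eta(\theta', s_l)\bigr)
    &= r(\theta, \theta')\,\Sigma_{jl}.
\end{align*}
```

<p>Under this sketch, the same matrix &Sigma; describes both the error term and the dependence between components of the model output, which is one reading of the assumption that the MIT2DCM reproduces a correlation structure compatible with the unforced variability.<br>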

<p>The proposed model assumes a separable structure for the covariance of the model output. The resulting covariance matrix can be written as a Kronecker product of two smaller matrices. This is a key modeling choice given our use of MCMC for the exploration of the posterior distribution. In fact, the full covariance matrices for the surface and upper air diagnostics are large enough to make computations, and even storage, very difficult for an unstructured matrix.<br>
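<p>The computational benefit of a Kronecker structure can be sketched as follows. The example uses two standard identities: a linear system in the Kronecker product of A and B reduces to solves with A and B separately, and its log-determinant is a weighted sum of the two smaller log-determinants. The function names, the random test matrices and the small sizes are hypothetical and chosen only so the result can be checked against the explicit product.<br>

```python
import numpy as np


def kron_solve(A, B, x):
    """Solve (A kron B) z = x without forming the Kronecker product.

    A is n x n, B is m x m, x has length n * m (row-major blocks, as np.kron).
    Uses (A kron B) vec(Z) = vec(A Z B^T) for row-major vec.
    """
    n, m = A.shape[0], B.shape[0]
    X = x.reshape(n, m)
    Y = np.linalg.solve(A, X)          # A^{-1} X
    Z = np.linalg.solve(B, Y.T).T      # (A^{-1} X) B^{-T}
    return Z.reshape(-1)


def kron_logdet(A, B):
    """log det(A kron B) = m * log det(A) + n * log det(B)."""
    n, m = A.shape[0], B.shape[0]
    _, logdet_a = np.linalg.slogdet(A)
    _, logdet_b = np.linalg.slogdet(B)
    return m * logdet_a + n * logdet_b


def random_spd(size, rng):
    """Random symmetric positive definite matrix, for illustration only."""
    M = rng.standard_normal((size, size))
    return M @ M.T + size * np.eye(size)


rng = np.random.default_rng(0)
R = random_spd(6, rng)       # stands in for correlation across model runs
S = random_spd(4, rng)       # stands in for covariance across components
x = rng.standard_normal(6 * 4)

z = kron_solve(R, S, x)
print(np.allclose(np.kron(R, S) @ z, x))
```

<p>With a few hundred runs and 288 components, the explicit matrix would have tens of thousands of rows, whereas the structured version only ever factorizes the two small matrices.<br>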

<p>Our model tackles several interesting issues for both the climatologist and the statistician. We study the properties of the climate using three parameters that are key for long-term climate projections. Such parameters provide information on how the climate will be affected by an increase in human-produced emissions and by the capacity of the oceans to absorb heat. We provide a comprehensive answer that incorporates scientific information, structural assumptions and different sources of data, both observed and synthetic. The uncertainty is quantified using a joint probability distribution that yields a full description of the dependencies between the three parameters. The model tackles a computer calibration problem where the output is of moderate multivariate dimension. It accounts for all parameter estimation uncertainty, provides an estimate of the unforced variability, and uses information from different summaries of the Earth's climate that have different physical responses to the accumulation of heat.<br>