1 Introduction
Climate models are built around models of the atmosphere, which are based on the laws of thermodynamics and on Newton’s laws of motion for air as a fluid. Since they were first developed in the 1960s (Smagorinsky, 1963; Smagorinsky et al., 1965; Manabe et al., 1965; Mintz, 1965; Kasahara and Washington, 1967), they have evolved from atmosphere-only models, via coupled atmosphere-ocean models with dynamic oceans, to Earth system models (ESMs) with dynamic cryospheres and biogeochemical cycles (Bretherton et al., 2012; Intergovernmental Panel on Climate Change, 2013). Atmosphere and ocean models compute approximate numerical solutions to the laws of fluid dynamics and thermodynamics on a computational grid. For the atmosphere, the computational grid currently consists of cells, spaced – apart in the horizontal; for the oceans, the grid consists of cells, spaced apart in the horizontal. Scales smaller than the mesh size of a climate model cannot be resolved, yet they are essential for its predictive capabilities. The unresolved scales are modeled by a variety of semiempirical parameterization schemes, which represent the dynamics on subgrid scales as parametric functions of the resolved dynamics on the computational grid (Stensrud, 2007). For example, the dynamical scales of stratocumulus clouds, the most common type of boundary layer clouds, are and smaller, which will remain unresolvable on the computational grid of global atmosphere models for the foreseeable future (Wood, 2012; Schneider et al., 2017). Similarly, the submesoscale dynamics of oceans that may be important for biological processes near the surface have length scales of , which will also remain unresolvable for the foreseeable future (Fox-Kemper et al., 2014). Such smaller-scale dynamics in the atmosphere and oceans must be represented in climate models through parameterization schemes.
Additionally, ESMs contain parameterization schemes for many processes for which the governing equations are not known or are only poorly known, for example, ecological or biogeochemical processes.
All of these parameterization schemes contain parameters that are uncertain, and the structure of the equations underlying them is uncertain itself. That is, there is parametric and structural uncertainty (Draper, 1995). For example, entrainment and detrainment rates are parameters or parametric functions of state variables such as the vertical velocity of updrafts. They control the interaction of convective clouds with their environment and affect cloud properties and climate. But how they depend on state variables is uncertain, as is the structure of the closure equations in which they appear (e.g., Stainforth et al., 2005; Holloway and Neelin, 2009; Neelin et al., 2009; Romps and Kuang, 2010; Nie and Kuang, 2012; de Rooy et al., 2013). Or, as another example, the residence times of carbon in different reservoirs (e.g., soil, litter, plants) control how rapidly and where in the biosphere carbon accumulates. They affect the climate response of the biosphere. But they are likewise uncertain, differing by factors among models (Friedlingstein et al., 2006, 2014; Friend et al., 2014; Bloom et al., 2016).
Typically, parameterization schemes are developed and parameters in them are estimated independently of the model into which they are eventually incorporated. They are tested with observations from field studies at a relatively small number of locations. For processes such as boundary-layer turbulence that are computable if sufficiently high resolution is available, parameterization schemes are increasingly also tested with data generated computationally in local process studies with high-resolution models (e.g., Jakob, 2003, 2010). After the parameterization schemes are developed and incorporated in a climate model or ESM, modelers adjust (“tune”) parameters to satisfy large-scale physical constraints, such as a closed energy balance at the top of the atmosphere (TOA), or selected observational constraints, such as reproduction of the 20th-century global-mean surface temperature record. This model tuning process currently relies on the modelers’ knowledge and intuition about plausible ranges of the tunable parameters and about the effect of parameter changes on the simulated climate of a model (Randall and Wielicki, 1997; Mauritsen et al., 2012; Golaz et al., 2013; Hourdin et al., 2013; Flato et al., 2013; Hourdin et al., 2017). But because of the nonlinear and interacting multiscale nature of the climate system, the simulated climate can depend sensitively and in unexpected ways on settings of tunable parameters (e.g., Suzuki et al., 2013; Zhao et al., 2016). It also remains unclear to what extent the resulting parameter choice is optimal, or how uncertain it is. Moreover, typically only a minute fraction of the available observations is used in the tuning process, usually only highly aggregated data such as global or large-scale mean values accumulated over periods of years or more. In part, this may be done to avoid overfitting, but more importantly, it is done because the tuning process usually involves parameter adjustments by hand, each of which must be evaluated by a forward integration of the model. This makes the tuning process tedious and precludes adjustments of a larger set of parameters to fit more complex observational datasets or a wider range of high-resolution process simulations. It also precludes quantification of uncertainties (Schirber et al., 2013; Hourdin et al., 2017).
Climate models have improved over the past decades, leading, for example, to better simulations of El Niño, storm tracks, and tropical waves (Guilyardi et al., 2009; Hung et al., 2013; Flato et al., 2013). Weather prediction models, the higher-resolution siblings of the climate models’ atmospheric component, have undergone a parallel evolution. Along with data assimilation techniques for the initialization of weather forecasts, this has led to great strides in the accuracy of weather forecasts (Bauer et al., 2015). But the accuracy of climate projections has not improved as much, and unacceptably large uncertainties remain. For example, if one asks how high CO2 concentrations can rise before Earth’s surface will have warmed above preindustrial temperatures (the warming target of the 2015 Paris Agreement, of which about remains because about has already been realized), the answers range from 480 to 600 ppm across current climate models (Schneider et al., 2017). A concentration of 480 ppm will be reached in the late 2030s or early 2040s; 600 ppm may not be reached before 2060 even if emissions continue to increase rapidly. Between these extremes lie vastly different optimal policy responses and socioeconomic costs of climate change (Hope, 2015).
These large and longstanding uncertainties in climate projections have their root in uncertainties in parameterization schemes. Parameterizations of clouds dominate the uncertainties in physical processes (Cess et al., 1989, 1990; Stephens, 2005; Bony et al., 2006; Soden and Held, 2006; Vial et al., 2013; Webb et al., 2013; Brient and Schneider, 2016). There are uncertainties both in the representation of the turbulent dynamics of clouds and in the representation of their microphysics, which control, for example, the distribution of droplet sizes in a cloud, the fraction of cloud condensate that precipitates out, and the phase partitioning of cloud condensate into liquid and ice (e.g., Stainforth et al., 2005; Jiang et al., 2012; Suzuki et al., 2013; Golaz et al., 2013; Bodas-Salcedo et al., 2014; Zhao et al., 2016; Kay et al., 2016). Additionally, there are numerous other parameterized processes that contribute to uncertainties in climate projections. For example, it is not precisely known what fraction of the CO2 that is emitted by human activities will remain in the atmosphere, and so it is uncertain which emission pathways will lead to a given atmospheric concentration target (Knutti et al., 2008; Meinshausen et al., 2009; Friedlingstein, 2015). Currently, only about half the emitted CO2 accumulates in the atmosphere. The other half is taken up by oceans and on land. It is unclear in particular what fraction of the emitted CO2 terrestrial ecosystems will take up in the future (Friedlingstein et al., 2006; Canadell et al., 2007; Knorr, 2009; Le Quéré et al., 2013; Todd-Brown et al., 2013; Friedlingstein et al., 2014; Friend et al., 2014). Reducing such uncertainties through the traditional approach to developing and improving parameterization schemes (attempting to develop one “correct” global parameterization scheme for each process in isolation, on the basis of observational or computational process studies that are usually focused on specific regions) has met only limited success (Jakob, 2003, 2010; Randall, 2013).
Here we propose a new approach to improving parameterization schemes. The new approach invests considerable computational effort up front to exploit global observations and targeted high-resolution simulations through the use of data assimilation and machine learning within physical, biological, and chemical process models. We first outline in broad terms how we envision ESMs to learn from global observations and targeted high-resolution simulations (section 2). Then we discuss in more concrete terms the framework underlying such learning ESMs (section 3). We illustrate the approach by learning parameters in a relatively simple dynamical system that mimics characteristics of the atmosphere and oceans (section 4). We conclude with an outlook on the opportunities that the framework presents and on the research program that needs to be pursued to realize it (section 5).
2 Learning from Observations and Targeted High-Resolution Simulations
2.1 Information Sources for Parameterization Schemes
Parameterization schemes can learn from two sources of information:

Global observations. We live in the golden age of Earth observations from space (L’Ecuyer et al., 2015). A suite of satellites flying in the formation known as the A-Train has been streaming coordinated measurements of the composition of the atmosphere and of physical variables in the Earth system. We have nearly simultaneous measurements of variables such as temperature, humidity, and cloud and sea ice cover, with global coverage for more than a decade (Stephens et al., 2002; Jiang et al., 2012; Simmons et al., 2016; Stephens et al., 2017). Space-based measurements of biogeochemical tracers and processes, such as measurements of column-average CO2 concentrations and of photosynthesis in terrestrial ecosystems, are also beginning to become available (e.g., Crisp et al., 2004; Yokota et al., 2009; Frankenberg et al., 2011; Joiner et al., 2011; Frankenberg et al., 2014; Bloom et al., 2016; Eldering et al., 2017; Liu et al., 2017; Sun et al., 2017), and so are more detailed observations of the cryosphere (e.g., Shepherd et al., 2012; Gardner et al., 2013; Vaughan et al., 2013). Parameterization schemes can learn from such space-based global data, which can be augmented and validated with more detailed local observations from the ground and from field studies.

Local high-resolution simulations. Some processes parameterized in ESMs are in principle computable; only the globally achievable resolution precludes their explicit computation. For example, the turbulent dynamics (though currently not the microphysics) of clouds can be computed with high fidelity in limited domains in large-eddy simulations (LES) with grid spacings of (Siebesma et al., 2003; Stevens et al., 2005; Khairoutdinov et al., 2009; Matheou and Chung, 2014; Schalkwijk et al., 2015; Pressel et al., 2015, 2017). Increased computational performance has made LES domain widths of – feasible in recent years, while the horizontal mesh size in atmosphere models has shrunk, to the point that the two scales have converged. Thus, while global LES that reliably resolve low clouds such as cumulus or stratocumulus will not be feasible for decades, it is possible to nest LES in selected grid columns of atmosphere models and conduct high-fidelity local simulations of cloud dynamics in them (Schneider et al., 2017). Local high-resolution simulations of ocean mesoscale turbulence or sea ice dynamics can be conducted similarly. Parameterization schemes can learn from such nested high-resolution simulations.
Of course, both observations and high-resolution simulations have been exploited in the development of parameterization schemes for some time. For example, data assimilation techniques have been used to estimate parameters in parameterization schemes from observations. Parameters especially in cloud, convection, and precipitation parameterizations have been estimated by minimizing errors in short-term weather forecasts over timescales of hours or days (e.g., Emanuel and Živković Rothman, 1999; Grell and Dévényi, 2002; Aksoy et al., 2006; Schirber et al., 2013; Ruiz et al., 2013; Ruiz and Pulido, 2015), or by minimizing deviations between simulated and observed longer-term aggregates of climate statistics, such as global-mean TOA radiative fluxes accumulated over seasons or years (e.g., Jackson et al., 2008; Järvinen et al., 2010; Neelin et al., 2010; Solonen et al., 2012; Tett et al., 2013). High-resolution simulations have been used to provide detailed dynamical information such as vertical velocity and turbulence kinetic energy profiles in convective clouds, which are not easily available from observations. They have often been employed to augment observations from local field studies, and parameterization schemes have been fit to and evaluated with the observations and the high-resolution simulations used in tandem (e.g., Liu et al., 2001; Siebesma et al., 2003; Stevens et al., 2005; Siebesma et al., 2007; Hohenegger and Bretherton, 2011; de Rooy et al., 2013; Romps, 2016).
High-resolution deep convection-resolving simulations with horizontal grid spacing and, most recently, LES with horizontal grid spacing have also been nested in small, usually two-dimensional subdomains of atmospheric grid columns, as a parameterization surrogate that explicitly resolves some aspects of cloud dynamics (e.g., Grabowski and Smolarkiewicz, 1999; Grabowski, 2001; Khairoutdinov and Randall, 2001; Randall et al., 2003; Khairoutdinov et al., 2005; Randall, 2013; Grabowski, 2016; Parishani et al., 2017). Such multiscale modeling approaches, often called superparameterization, have led to markedly improved simulations, for example, of the Asian monsoon, of tropical surface temperatures, and of precipitation and its diurnal cycle, albeit at great computational expense (e.g., Benedict and Randall, 2009; Pritchard and Somerville, 2009a, b; Stan et al., 2010; DeMott et al., 2013). However, multiscale modeling relies on a scale separation between the global-model mesh size and the domain size of the nested high-resolution simulation (E et al., 2007). Multiscale modeling is computationally advantageous relative to global high-resolution simulations as long as it suffices for the nested high-resolution simulation to subsample only a small fraction of the footprint of a global-model grid column, and to extrapolate the information so obtained to the entire footprint on the basis of statistical homogeneity assumptions. As the mesh size of global atmosphere models shrinks to horizontal scales of kilometers, resolutions that are already feasible in short integrations or limited areas and that will become routine in the next decade (Palmer, 2014; Ban et al., 2015; Ohno et al., 2016; Schneider et al., 2017), the scale separation to the minimum necessary domain size of nested high-resolution simulations will disappear, and with it the computational advantage of multiscale modeling.
What we propose here combines elements of these existing approaches in a novel way. At its core are still parameterization schemes that are based on physical, biological, or chemical process models, whose mathematical structure is developed on the basis of theory, local observations, and, where possible, high-resolution simulations. But we propose that these parameterization schemes, when they are embedded in ESMs, learn directly from observations and high-resolution simulations that both sample the globe. High-resolution simulations are employed in a targeted way, akin to targeted or adaptive observations in weather forecasting (Palmer et al., 1998; Lorenz and Emanuel, 1998; Bishop et al., 2001), to reduce uncertainties where observations are insufficient to obtain tight parameter estimates. Instead of incorporating high-resolution simulations globally in a small fraction of the footprint of each grid column, as in multiscale modeling approaches, the ESM we envision deploys them locally, in entire grid columns, albeit only in a small subset of them. High-resolution simulations can be targeted to grid columns selected based on measures of uncertainty about model parameters. If the nested high-resolution simulations feed back onto the ESM, this corresponds to a locally extreme mesh refinement; however, two-way nesting may not always be necessary (e.g., Moeng et al., 2007; Zhu et al., 2010). The model learns parameters from observations and from nested high-resolution simulations in a computationally intensive learning phase, after which it can be used in a computationally more efficient manner, like models in use today. Nonetheless, even in simulations of climates beyond what has been observed, bursts of targeted high-resolution simulations can continue to be deployed to refine parameters and estimate their uncertainties.
2.2 Computable and Noncomputable Parameters
Learning from high-resolution simulations and observations is aimed at determining two different kinds of parameters in parameterization schemes: computable and noncomputable parameters. (Since parameters and parametric functions of state variables play essentially the same role in our discussion, we simply use the term parameter, with the understanding that this can include parametric functions and even nonparametric functions.) Computable parameters are those that can in principle be inferred from high-resolution simulations alone. They include parameters in radiative transfer schemes, which can be inferred from detailed line-by-line calculations; dynamical parameters in cloud turbulence parameterizations, such as entrainment rates, which can be inferred from LES; or parameters in ocean mixing parameterizations, which can be inferred from high-resolution simulations. Noncomputable parameters are parameters that, currently, cannot be inferred from high-resolution simulations, either because computational limitations make it necessary for them to also appear in parameterization schemes in high-resolution simulations, or because the microscopic equations governing the processes in question are unknown. They include parameters in cloud microphysics parameterizations, which are still necessary to include in LES, and many parameters characterizing ecological and biogeochemical processes, whose governing equations are unknown. Cloud microphysics parameters will increasingly become computable through direct numerical simulation (Devenish et al., 2012; Grabowski and Wang, 2013), but ecological and biogeochemical parameters will remain noncomputable for the foreseeable future. Both computable and noncomputable parameters can, in principle, be learned from observations; the only restrictions to their identifiability come from the well-posedness of the learning problem and its computational tractability. But only computable parameters can be learned from targeted high-resolution simulations. To be able to learn computable parameters, it is essential to represent noncomputable aspects of a parameterization scheme consistently in the high-resolution simulation and in the parameterization scheme that is to learn from the high-resolution simulation. For example, radiative transfer and microphysical processes need to be represented consistently in a high-resolution LES and in a parameterization scheme if the parameterization scheme is to learn computable dynamical parameters such as entrainment rates from the LES.
This approach presents challenges for parameter learning, since it implies the need to use observational data and high-resolution simulations in tandem to improve model parameterizations. But it also presents an opportunity: in doing so, the reliability and predictive power of ESMs can be improved, and uncertainties in parameters and predictions can be quantified.
2.3 Objectives: Bias Reduction and Exploitation of Emergent Constraints
Computational tractability is paramount for the success of any parameter learning algorithm for ESMs (e.g., Annan and Hargreaves, 2007; Jackson et al., 2008; Neelin et al., 2010; Solonen et al., 2012). The central issue is the number of times the objective function needs to be evaluated, and hence an ESM needs to be run, in the process of parameter learning. Standard parameter estimation and inverse problem approaches may require function or derivative evaluations to learn parameters, especially if uncertainty in the estimates is also required (Cotter et al., 2013). This many forward integrations and/or derivative evaluations of ESMs are not feasible if each involves accumulation of longer-term climate statistics. Fast parameterized processes in climate models often exhibit errors within a few hours or days of integration that are similar to errors in the mean state of the model (Phillips et al., 2004; Rodwell and Palmer, 2007; Xie et al., 2012; Ma et al., 2013; Klocke and Rodwell, 2014). This has given rise to hopes that it may suffice to evaluate objective functions by weather hindcasts over timescales of only hours, making many evaluations of an objective function feasible (Aksoy et al., 2006; Ruiz et al., 2013; Wan et al., 2014). But experience has shown that such short-term optimization may not always lead to the desired improvements in climate simulations (Schirber et al., 2013). Additionally, slower parameterized processes, for example, involving biogeochemical cycles or the cryosphere, require longer integration times to accumulate statistics entering any meaningful objective function. Therefore, we focus on objective functions involving climate statistics accumulated over windows that we anticipate to be wide compared with the timescale over which the atmosphere forgets its initial condition. Then the accumulated statistics do not depend sensitively on atmospheric initial conditions. This reduces the onus of correctly assimilating atmospheric initial conditions in parameter learning, which would be required if one were to match simulated and observed trajectories, as in approaches that assimilate model parameters jointly with the state of the system by augmenting state vectors with parameters (e.g., Dee, 2005; Aksoy et al., 2006; Anderson et al., 2009). The minimum window over which climate statistics will need to be accumulated will vary from process to process, generally being longer for slower processes (e.g., the cryosphere) than for faster processes (e.g., the atmosphere). For slower processes whose initial condition is not forgotten over the accumulation window, it will remain necessary to correctly assimilate initial conditions.
The objective functions to be minimized in the learning phase can be chosen to directly minimize biases in climate simulations, for example, precipitation biases such as the longstanding double-ITCZ bias in the tropics (Lin, 2007; Li and Xie, 2014; Zhang et al., 2015; Adam et al., 2016, 2017), or cloud cover biases such as the “too few–too bright” bias in the subtropics (Webb et al., 2001; Zhang et al., 2005; Karlsson et al., 2008; Nam et al., 2012). Because the sensitivity with which an ESM responds to increases in greenhouse gas concentrations correlates with the spatial structure of some of these biases in the models (e.g., Tian, 2015; Siler et al., 2017), minimizing regional biases will likely reduce uncertainties in climate projections, in addition to leading to more reliable simulations of the present climate. To minimize biases, the objective function needs to include mean-field terms penalizing mismatch between spatially and at least seasonally resolved simulated and observed mean fields, for example, of precipitation, ecosystem primary productivity, and TOA radiative energy fluxes.
Additionally, there is a growing literature on “emergent constraints,” which typically are fluctuation-dissipation relationships that relate measurable fluctuations in the present climate to the response of the climate system to perturbations (Hall and Qu, 2006; Collins et al., 2012; Klein and Hall, 2015). For example, how strongly tropical low-cloud cover covaries with surface temperature from year to year or even seasonally in the present climate correlates in climate models with the amplitude of the cloud response to global warming (Qu et al., 2014, 2015; Brient and Schneider, 2016). Therefore, the observable low-cloud cover covariation with surface temperature in the present climate can be used to constrain the cloud response to global warming. Or, as another example, how strongly atmospheric CO2 concentrations covary with surface temperature in the present climate correlates in climate models with the amplitude of the terrestrial ecosystem response to global warming (e.g., the balance between CO2 fertilization of plants and enhanced soil and plant respiration under warming) (Cox et al., 2013; Wenzel et al., 2014). Therefore, the observable CO2 concentration covariation with surface temperature can be used to constrain the terrestrial ecosystem response to global warming. Such emergent constraints are usually used post facto, in the evaluation of ESMs. They lead to inferences about the likelihood of a model given the measured natural variations, and they therefore can be used to assess how likely it is that its climate change projections are correct (e.g., Brient and Schneider, 2016). But emergent constraints usually are not used directly to improve models. In what we propose, they are used directly to learn parameters in ESMs and to reduce uncertainties in the climate response. To do so, covariance terms (e.g., between surface temperature and cloud cover or TOA radiative fluxes, or between surface temperature and CO2 concentrations) need to be included in the objective function.
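As a minimal sketch of the kind of covariance term discussed here, the following Python snippet computes the covariance between two annual-mean time series and the corresponding regression slope. The function name `covariance_statistics` and the synthetic temperature and cloud-cover numbers are assumptions made purely for illustration; they are not taken from any model, dataset, or published constraint.

```python
import numpy as np

def covariance_statistics(temperature, cloud_cover):
    """Return (covariance, regression slope) of cloud cover on temperature.

    Both inputs are 1-D series of time-averaged values (e.g., annual means).
    """
    t = np.asarray(temperature, dtype=float)
    c = np.asarray(cloud_cover, dtype=float)
    t_fluct = t - t.mean()            # fluctuation about the time mean
    c_fluct = c - c.mean()
    cov = np.mean(t_fluct * c_fluct)  # time-mean product of fluctuations
    slope = cov / np.mean(t_fluct ** 2)
    return cov, slope

# Synthetic example (assumed numbers): cloud cover decreases with temperature.
rng = np.random.default_rng(0)
temperature = 290.0 + rng.normal(0.0, 0.3, size=30)
cloud_cover = 0.6 - 0.02 * (temperature - 290.0) + rng.normal(0.0, 0.002, size=30)
cov, slope = covariance_statistics(temperature, cloud_cover)
print(f"covariance = {cov:.5f}, slope = {slope:.4f} per K")
```

A statistic of this kind, evaluated for both simulated and observed series, is what a covariance term in an objective function would compare.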
The choice of objective functions to be employed is key to the success of what we propose. The use of time-averaged statistics such as mean-field and covariance terms will make the objective functions smoother and hence reduce the computational cost of minimization, compared with minimizing objective functions that directly penalize mismatch between simulated and observed trajectories of the Earth system. From the point of view of statistical theory, the objective functions should contain the sufficient statistics for the parameters of interest, but what these are is not usually known a priori. In practice, the choice of objective functions will be guided by expertise specific to the relevant subdomains of Earth system science, as well as computational cost. Given that current ESM components such as clouds and the carbon cycle exhibit large seasonal biases (e.g., Keppel-Aleks et al., 2012; Karlsson and Svensson, 2013; Lin et al., 2014), and their response to long-term warming in some respects resembles their response to seasonal variations (e.g., Brient and Schneider, 2016; Wenzel et al., 2016), accumulating seasonal statistics in the objective functions suggests itself as a starting point.
3 Machine Learning Framework for Earth System Models
3.1 Models and Data
To outline how we envision parameterization schemes in ESMs to learn from diverse data, we first set up notation. Let denote the vector of model parameters to be learned, consisting of computable parameters that can be learned from high-resolution simulations, and noncomputable parameters that can only be learned from observations (for example, because high-resolution simulations themselves depend on ). The parameters appear in parameterization schemes in a model, which may be viewed as a map , parameterized by time , that takes the parameters to the state variables ,
(1) 
The state variables can include temperatures, humidity variables, and cloud, cryosphere, and biogeochemical variables, and the map may depend on initial conditions and time-evolving boundary or forcing conditions. The map typically represents a global ESM. The state variables are linked to observables through a map representing an observing system, so that
(2) 
The observables might represent surface temperatures, CO2 concentrations, or spectral radiances emanating from the TOA. The map in practice will be realized through an observing system simulator, which simulates how observables are impacted by a multitude of state variables . The actual observations (e.g., space-based measurements) are denoted by , so is the mismatch between simulations and observations. Since is parameterized by , while is independent of , mismatches between and can be used to learn about .
Local high-resolution simulations nested in a grid column of an ESM may be viewed as a time-dependent map from the state variables of the ESM to simulated state variables ,
(3) 
The map is parameterized by noncomputable parameters and time , and it can involve the time history of the state variables up to time . The vector contains statistics of high-resolution variables whose counterparts in the ESM are computed by parameterization schemes, such as the mean cloud cover or liquid water content in a grid box. The corresponding variables in the ESM are obtained by a time-dependent map that takes state variables and parameters to ,
(4) 
The map typically represents a single grid column of the ESM with its parameterization schemes, taking as input from the ESM. It is structurally similar to . Crucially, however, generally depends on all parameters , while only depends on noncomputable parameters . Thus, the mismatch can be used to learn about the computable parameters .
The same framework also covers other ways of learning about parameterization schemes from data. For example, the map may represent a single grid column of an ESM, driven by time-evolving boundary conditions from reanalysis data at selected sites. Observations at the sites can then be used to learn about the parameterization schemes in the column (Neggers et al., 2012). Or, similarly, the map may represent a local high-resolution simulation driven by reanalysis data, with parameterization schemes, e.g., for cloud microphysics, about which one wants to learn from observations.
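Because the mathematical symbols are not legible in the text of this subsection, the four maps it describes can be summarized in one possible notation. Every symbol below (the parameter vector, the map names, and the tilde marking observed or high-resolution quantities) is an assumption chosen for this sketch and is not guaranteed to match the original notation; only the structure follows the verbal description above.

```latex
\begin{align}
\theta &= (\theta_c, \theta_n)
  && \text{computable and noncomputable parameters} \\
x(t) &= \mathcal{G}(\theta, t)
  && \text{ESM: parameters to state variables, cf. (1)} \\
y(t) &= \mathcal{H}\bigl(x(t)\bigr)
  && \text{observing system: state variables to observables, cf. (2)} \\
\tilde{z}(t) &= \mathcal{L}(\theta_n, t; x)
  && \text{nested high-resolution simulation, cf. (3)} \\
z(t) &= \mathcal{S}(\theta, t; x)
  && \text{single-column parameterization schemes, cf. (4)}
\end{align}
```

In this assumed notation, with \tilde{y} denoting the actual observations, the mismatch y - \tilde{y} carries information about all of \theta, whereas the mismatch z - \tilde{z} carries information about the computable parameters \theta_c, because both z and \tilde{z} share the dependence on \theta_n.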
3.2 Objective Functions
Objective functions are defined through mismatch between the simulated data and observations , on the one hand, and simulated data and high-resolution simulations , on the other hand. We define mismatches using time-averaged statistics because they do not suffer from sensitivity to atmospheric initial conditions; indeed, matching trajectories directly requires assimilating atmospheric initial conditions, which would make it difficult to disentangle mismatches due to errors in climatically unimportant atmospheric initial conditions from those due to parameterization errors. However, the time averages can still depend on initial conditions for slowly evolving components of the Earth system, such as ocean circulations or ice sheets.
We denote the time average of a function over the time interval by
(5) 
The observational objective function can then be written in the generic form
(6) 
with the 2-norm
(7) 
normalized by error standard deviations and covariance information captured in . The function of the observables typically involves first- and second-order quantities, for example,
(8) 
where, for any observable , denotes the fluctuation of about its mean . With given by (8), the objective function penalizes mismatch between the vectors of mean values and and between the covariance components and for some indices and . The least-squares form of the objective function (6) follows from assuming an error model
(9) 
with the matrix encoding an assumed covariance structure of the noise vector . The relevant components of may be chosen very small for quantities that are used as constraints on the ESM (e.g., the requirement of a closed global energy balance at TOA).
For the mismatch to high-resolution simulations, we accumulate statistics over an ensemble of high-resolution simulations in different grid columns of the ESM and at different times, possibly, but not necessarily, also accumulating in time. We denote the corresponding ensemble and time average by $\langle \cdot \rangle_E$, and define an objective function for the mismatch between the simulated data $z$ and the high-resolution data $\tilde{z}$ analogously to that for the observations through
$$J_s(\theta) = \frac{1}{2} \bigl\| \langle g(z) \rangle_E - \langle g(\tilde{z}) \rangle_E \bigr\|_{\Sigma_z}^2. \tag{10}$$
Like the function $f$ above, the function $g$ typically involves first- and second-order quantities, and the least-squares form of the objective function follows from the assumed covariance structure $\Sigma_z$ of the noise.
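The time-averaged least-squares mismatch described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used for the results in this paper: the function names `moment_vector` and `objective`, the restriction to a diagonal noise covariance, the inclusion of every covariance component, and the synthetic data are all choices made here for concreteness.

```python
import numpy as np

def moment_vector(series):
    """Time-averaged means and covariance components of a (time, variables) array."""
    series = np.asarray(series, dtype=float)
    mean = series.mean(axis=0)                 # discrete stand-in for the time average
    fluct = series - mean                      # fluctuations about the time mean
    cov = fluct.T @ fluct / series.shape[0]    # time-averaged products of fluctuations
    iu = np.triu_indices(series.shape[1])      # keep each covariance component once
    return np.concatenate([mean, cov[iu]])

def objective(simulated, reference, noise_var):
    """Least-squares mismatch of moment vectors with a diagonal noise covariance."""
    diff = moment_vector(simulated) - moment_vector(reference)
    return 0.5 * np.sum(diff**2 / noise_var)

rng = np.random.default_rng(0)
reference = rng.normal(size=(5000, 3))         # synthetic stand-in for observations
simulated = 1.5 * rng.normal(size=(5000, 3))   # synthetic model output with inflated variance
noise_var = np.full(3 + 6, 0.01)               # assumed variances of the 9 moment entries

print(objective(reference, reference, noise_var))   # 0.0
print(objective(simulated, reference, noise_var))
```

Because only time-averaged statistics enter, two series with the same statistics but different trajectories give a small mismatch, which is the property that makes such objective functions insensitive to atmospheric initial conditions.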
3.3 Learning Algorithms
Learning algorithms attempt to choose parameters that minimize the objective functions (6) and (10). However, minimization of these objective functions does not always determine the parameters uniquely, for example, if there are strongly correlated parameters or if the number of parameters to be learned exceeds the number of available observational degrees of freedom. In such cases, regularization is necessary to choose a good solution for the parameters among the multitude of possible solutions. This may be achieved in various ways: by adding to the least-squares objective functions (6) and (10) regularizing penalty terms that incorporate prior knowledge about the parameters (Engl et al., 1996), by Bayesian probabilistic regularization (Kaipio and Somersalo, 2005), or by restriction of the parameters to a subset, as in ensemble Kalman inversion (Iglesias et al., 2013). All of these regularization approaches may be useful in ESMs. They involve different tradeoffs between computational expense and the amount of information about the parameters they provide.

Classical regularized least squares leads to an optimization problem that is typically tackled by gradient descent or Gauss-Newton methods, in which derivatives of the parameter-to-data map are employed (Nocedal and Wright, 2006). Such methods usually require integrations of the forward model or evaluations of its derivatives with respect to parameters.

Bayesian inversions usually employ Markov chain Monte Carlo (MCMC) methods (Brooks et al., 2011) and variants such as sequential Monte Carlo (Del Moral et al., 2006) to approximate the posterior probability density function (PDF) of parameters, given data and a prior PDF. A PDF of parameters provides much more information than a point estimate, and consequently MCMC methods typically require many more forward model integrations, sometimes on the order of
. The computational demands can be decreased by an order of magnitude by judicious use of derivative information where available (see Beskos et al. (2017) and references therein) or by improved sampling strategies (e.g., Jackson et al., 2008; Järvinen et al., 2010; Solonen et al., 2012). Nonetheless, the cost remains orders of magnitude higher than for optimization techniques. 
Ensemble Kalman methods are easily parallelizable, derivative-free alternatives to the classical optimization and Bayesian approaches (Houtekamer and Zhang, 2016). Although theory for them is less well developed, empirical evidence demonstrates behavior similar to derivative-based algorithms in complex inversion problems, with a comparable number of forward model integrations (Iglesias, 2016). Ensemble methods for joint state and parameter estimation have recently been systematically developed (Bocquet and Sakov, 2013, 2014; Carrassi et al., 2017), and they are emerging as a promising way to solve inverse problems and to obtain qualitative estimates of uncertainty. However, numerical experiments have indicated that such uncertainty information is qualitative at best: the Kalman methods invoke Gaussian assumptions, which may not be justified, and even if the Gaussian approximation holds, the ensemble sizes needed for uncertainty quantification may not be practical (Law and Stuart, 2012; Iglesias et al., 2013).
An important consideration is how to blend the information about parameters contained in the high-resolution simulations and in the observations. One approach is as follows, although others may turn out to be preferable. Minimizing the high-resolution objective function in principle gives the computable parameters as an implicit function of the non-computable parameters. This implicit function may then be used as prior information to minimize the observational objective function over the non-computable parameters. Bayesian MCMC approaches may be feasible for fitting the high-resolution objective function, since the single-column model is relatively cheap to evaluate, and the ensemble of high-resolution simulations needed may not be large. Although Bayesian approaches may not be feasible for fitting the observational objective function, for which accumulation of statistics of the model is required, this hierarchical approach does have the potential to incorporate detailed uncertainty estimates coming from the high-resolution simulations.
The choice of normalization (i.e., of the noise covariance matrices) in the objective functions plays a significant role in parameter learning, and learning about it has been demonstrated to have considerable impact on data assimilation for weather forecasts (Dee, 1995; Stewart et al., 2014). We will not discuss this issue in any detail, but note that it may be addressed by the use of hierarchical Bayesian methodology and ensemble Kalman analogues. Nor will we dwell on the important issue of structural uncertainty (model error), other than to note that this can, in principle, be addressed through the inverse problem approach advocated here: additional unknown parameters, placed judiciously within the model to account for model error, can be learned from data (Kennedy and O’Hagan, 2001; Dee, 2005). The choice of normalization is especially important in this context as it relates to disentangling learning about model error from learning about the other parameters of interest.
Learning algorithms for ESMs can be developed further in several ways:

Minimization of the objective functions may be performed by online filtering algorithms, akin to those used in the initialization of weather forecasts, which sequentially update parameters as information becomes available (Law et al., 2015). This can reduce the number of forward model integrations required for parameter estimation, and it can allow parameterization schemes to learn adaptively from high-resolution simulations during the course of a global simulation.

Where to employ targeted high-resolution simulations can be chosen to optimize aspects of the learning process. The simplest approach would be to deploy them randomly, for example, by selecting regions with a probability proportional to their climatological cloud fraction for high-resolution simulations of clouds. More efficient would be techniques of optimal experimental design (see Alexanderian et al. (2016) and references therein), within online filtering algorithms. With such techniques, high-resolution simulations could be generated to order, to update aspects of parameterization schemes that have the most influence on the global system with which they interact.
Progress along these lines will require innovation. For example, filtering algorithms need to be adapted to deal with strong serial correlations such as those that arise when averages are accumulated over increasing spans and parameters are updated from one average to a longer average. And optimal experimental design techniques require the development of cheap computational methods to evaluate sensitivities of the ESM to individual aspects of parameterization schemes.
4 Illustration With Dynamical System
We envision that ESMs will eventually learn parameters online, with targeted high-resolution simulations triggering parameter updates on the fly. Here we illustrate, in offline mode, some of the opportunities and challenges of learning parameters in a relatively simple dynamical system. We use the Lorenz-96 model (Lorenz, 1996), which has nonlinearities resembling the advective nonlinearities of fluid dynamics and a multiscale coupling of slow and fast variables similar to what is seen in ESMs. The model has been used extensively in the development and testing of data assimilation methods (e.g., Lorenz and Emanuel, 1998; Anderson, 2001; Ott et al., 2004).
4.1 Lorenz-96 Model
The Lorenz-96 model consists of $K$ slow variables $X_k$ ($k = 1, \dots, K$), each of which is coupled to $J$ fast variables $Y_{j,k}$ ($j = 1, \dots, J$) (Lorenz, 1996):
$$\frac{dX_k}{dt} = -X_{k-1} (X_{k-2} - X_{k+1}) - X_k + F - h c \, \bar{Y}_k, \tag{11}$$
$$\frac{1}{c} \frac{dY_{j,k}}{dt} = -b \, Y_{j+1,k} (Y_{j+2,k} - Y_{j-1,k}) - Y_{j,k} + \frac{h}{J} X_k. \tag{12}$$
The overbar denotes the mean value over $j$,
$$\bar{Y}_k = \frac{1}{J} \sum_{j=1}^{J} Y_{j,k}. \tag{13}$$
Both the slow and fast variables are taken to be periodic in $k$ and $j$, forming a cyclic chain with $X_{k+K} = X_k$, $Y_{j,k+K} = Y_{j,k}$, and $Y_{j+J,k} = Y_{j,k+1}$. The slow variables $X_k$ may be viewed as resolved-scale variables and the fast variables $Y_{j,k}$ as unresolved variables in an ESM. Each of the $K$ slow variables $X_k$ may represent a property such as surface air temperature in a cyclic chain of grid cells spanning a latitude circle. Each slow variable $X_k$ affects the $J$ fast variables $Y_{j,k}$ in the grid cell, which might represent cloud-scale variables such as liquid water path in each of $J$ cumulus clouds. In turn, the mean value of the fast variables over the cell, $\bar{Y}_k$, feeds back onto the slow variables $X_k$. The strength of the coupling between fast and slow variables is controlled by the parameter $h$, which represents an interaction coefficient, for example, an entrainment rate that couples cloud-scale variables to their large-scale environment. Time is nondimensionalized by the linear-damping timescale of the slow variables, which we nominally take to be 1 day, a typical thermal relaxation time of surface temperatures (Swanson and Pierrehumbert, 1997). The parameter $c$ controls how rapidly the fast variables are damped relative to the slow; it may be interpreted as a microphysical parameter controlling relaxation of cloud variables, such as a precipitation efficiency. The parameter $F$ controls the strength of the external large-scale forcing, and $b$ the amplitude of the nonlinear interactions among the fast variables. Following Lorenz (1996), albeit relabeling parameters, we choose $K = 36$, $J = 10$, $h = 1$, and $F = c = b = 10$, which ensures chaotic dynamics of the system.
The quadratic nonlinearities in this dynamical system resemble advective nonlinearities, e.g., in the sense that they conserve the quadratic invariants (“energies”) $\sum_k X_k^2$ and $\sum_{j,k} Y_{j,k}^2$ (Lorenz and Emanuel, 1998). The interaction between the slow and fast variables conserves the “total energy” $\sum_k \bigl( X_k^2 + \sum_j Y_{j,k}^2 \bigr)$. Energies are damped by the linear terms; they are prevented from decaying to zero by the external forcing $F$. Eventually, the system approaches a statistically steady state in which driving by the external forcing balances the linear damping.
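The model described above is compact enough to integrate in a few lines. The following sketch assumes the two-scale form with forcing `F`, coupling `h`, timescale separation `c`, and fast-variable nonlinearity `b`, and uses $K = 36$, $J = 10$ with default parameter values of 10, 1, 10, and 10. The function names, the fourth-order Runge-Kutta scheme, and the time step are illustrative assumptions, not a description of the code used for the results here.

```python
import numpy as np

def l96_rhs(X, Y, F=10.0, h=1.0, c=10.0, b=10.0):
    """Tendencies of the two-scale Lorenz-96 model.

    X has shape (K,); Y has shape (K, J), with Y[k, j] the j-th fast variable
    in cell k. The fast variables form a single cyclic chain of length K*J.
    """
    K, J = Y.shape
    Ybar = Y.mean(axis=1)                              # cell mean of the fast variables
    dX = -np.roll(X, 1) * (np.roll(X, 2) - np.roll(X, -1)) - X + F - h * c * Ybar
    y = Y.reshape(-1)                                  # chain index j + J*k
    adv = -b * np.roll(y, -1) * (np.roll(y, -2) - np.roll(y, 1))
    dY = c * (adv.reshape(K, J) - Y + (h / J) * X[:, None])
    return dX, dY

def rk4_step(X, Y, dt, **params):
    """Advance the state by one fourth-order Runge-Kutta step."""
    k1 = l96_rhs(X, Y, **params)
    k2 = l96_rhs(X + 0.5 * dt * k1[0], Y + 0.5 * dt * k1[1], **params)
    k3 = l96_rhs(X + 0.5 * dt * k2[0], Y + 0.5 * dt * k2[1], **params)
    k4 = l96_rhs(X + dt * k3[0], Y + dt * k3[1], **params)
    X_new = X + dt / 6.0 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
    Y_new = Y + dt / 6.0 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return X_new, Y_new

rng = np.random.default_rng(0)
K, J = 36, 10
X = rng.normal(size=K)
Y = 0.1 * rng.normal(size=(K, J))
for _ in range(500):                                   # 0.5 time units with dt = 0.001
    X, Y = rk4_step(X, Y, dt=0.001)
print(X.mean(), (Y**2).mean())
```

A useful consistency check on such an implementation is that the advective and coupling terms drop out of the total energy tendency, so that `np.sum(X * dX) + np.sum(Y * dY)` equals `-np.sum(X**2) + F * np.sum(X) - c * np.sum(Y**2)` for any state.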
Let $\langle \cdot \rangle_\infty$ denote a long-term time mean in the statistically steady state, and note that all slow variables $X_k$ are statistically identical, as are all fast variables $Y_{j,k}$, so we can use the generic symbols $X$ and $Y$ in statistics of the variables. Multiplication of (11) by $X_k$, using that all variables $X_k$ are statistically identical, and averaging shows that, in the statistically steady state, second moments of the slow variables satisfy
$$\langle X^2 \rangle_\infty = F \langle X \rangle_\infty - h c \, \langle X \bar{Y} \rangle_\infty. \tag{14}$$
Similarly, second moments of the fast variables satisfy
$$\langle \overline{Y^2} \rangle_\infty = \frac{h}{J} \langle X \bar{Y} \rangle_\infty, \tag{15}$$
where the overbar again denotes a mean value over the fast-variable index $j$. That is, the interaction coefficient $h$ can be determined from estimates of the one-point statistics $\langle \overline{Y^2} \rangle_\infty$ and $\langle X \bar{Y} \rangle_\infty$. Its inverse is proportional to the regression coefficient of the fast variables onto the slow: $h^{-1} = \langle X \bar{Y} \rangle_\infty / (J \langle \overline{Y^2} \rangle_\infty)$. So the regression of the fast variables onto the slow can be viewed as providing an “emergent constraint” on the system, insofar as the interaction coefficient $h$ affects the response of the system to perturbations (e.g., in $F$). Estimates of $\langle X \rangle_\infty$ and $\langle X^2 \rangle_\infty$ provide an additional constraint (14) on the parameters $F$ and $c$. Taking mean values of the dynamical equations (11) and (12) would provide further constraints on these parameters, as well as on $b$, in terms of two-point statistics involving shifts in $k$ and $j$, e.g., covariances of $X_k$ and $X_{k+1}$.
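The intermediate step behind the slow-variable moment balance can be written out explicitly. The following is a sketch that assumes the slow-variable equation has the form $dX_k/dt = -X_{k-1}(X_{k-2} - X_{k+1}) - X_k + F - hc\,\bar{Y}_k$ and writes $\langle \cdot \rangle_\infty$ for the long-term mean; both are notational assumptions of this sketch.

```latex
\begin{align*}
0 = \frac{1}{2} \frac{d}{dt} \langle X_k^2 \rangle_\infty
  &= -\langle X_k X_{k-1} X_{k-2} \rangle_\infty
     + \langle X_{k-1} X_k X_{k+1} \rangle_\infty \\
  &\quad - \langle X_k^2 \rangle_\infty
     + F \langle X_k \rangle_\infty
     - h c \, \langle X_k \bar{Y}_k \rangle_\infty .
\end{align*}
```

Because all slow variables are statistically identical, shifting the index $k$ by one maps the second triple product onto the first, so the two advective terms cancel. What remains is the balance $\langle X^2 \rangle_\infty = F \langle X \rangle_\infty - hc \, \langle X \bar{Y} \rangle_\infty$ between forcing, damping, and coupling. The same argument applied to the fast-variable equation, with the mean taken over $j$ as well as time, removes the fast advective term in the same way.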
In what follows, we demonstrate the performance of learning algorithms in a perfect-model setting, first focusing on one-point statistics to show how to learn about parameters in the full dynamical system from them. Subsequently, we use two-point statistics to learn about parameters in a single “grid column” of fast variables only.
4.2 Parameter Learning in Perfect-Model Setting
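We generate data from the dynamical system (11) and (12) with the parameters $\theta = (F, h, c, b)$ set to $\tilde{\theta} = (10, 1, 10, 10)$. The role of “observations” in the perfect-model setting is played by data $\tilde{X}$ and $\tilde{Y}$ generated by the dynamical system with parameters set to their “true” values $\tilde{\theta}$. That is, the dynamical system (11) and (12) with parameters $\theta$ stands for the global model, the observing system map is the identity, and the data $\tilde{X}$ and $\tilde{Y}$ generated by the dynamical system with parameters $\tilde{\theta}$ are a surrogate for observations. The parameters $\theta$ of the dynamical system are then learned by matching statistics accumulated over a finite time interval $T$ (with 1 day denoting the unit time of the system), using discrete sums in place of the time integral in the average (5) and minimizing the “observational” objective function
$$J_o(\theta) = \frac{1}{2} \bigl\| \langle f(X, Y) \rangle_T - \langle f(\tilde{X}, \tilde{Y}) \rangle_\infty \bigr\|_{\Sigma}^2. \tag{16}$$
The moment function to be matched,
$$f(X, Y) = \bigl( X_k, \; \bar{Y}_k, \; X_k^2, \; X_k \bar{Y}_k, \; \overline{Y_k^2} \bigr)^T, \tag{17}$$
has an entry for each of the $K$ indices $k$, giving a vector of length $5K$. The noise covariance matrix $\Sigma$ is chosen to be diagonal, with entries that are proportional to the sample variances of the moments contained in the vector $f$,
$$\Sigma = r^2 \, \mathrm{diag} \bigl[ \mathrm{var} \bigl( f(\tilde{X}, \tilde{Y}) \bigr) \bigr]. \tag{18}$$
Here $\mathrm{var}(\phi)$ denotes the variance of $\phi$, and $r$ is an empirical parameter indicating the noise level. The variances and the “true moments” $\langle f(\tilde{X}, \tilde{Y}) \rangle_\infty$ are estimated from a long (46,416 days) control simulation of the dynamical system with the true parameters $\tilde{\theta}$.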
As an illustrative example, we use normal priors for , with mean values and variances . Enforcing positivity of , we use a lognormal prior for , with a mean value and variance for (i.e., a mean value of for ). We take the parameters a priori to be uncorrelated, so that the prior covariance matrix is diagonal.
To illustrate the landscape learning algorithms have to navigate, Figure 1 (top row) shows sections through the potential energy, defined as the negative logarithm of the posterior PDF,
$$U(\theta) = J_o(\theta) - \sum_i \log p(\theta_i), \tag{19}$$
where $p(\theta_i)$ is the prior PDF of parameter $\theta_i$. The figure shows the marginal potential energies obtained as one parameter at a time is varied and the objective function is accumulated by forward integration, while the other parameters are held fixed at their true values. As the noise level increases, the contribution of the log-likelihood of the data () is downweighted relative to the prior, the posterior modes shift toward the prior modes, and the posterior is smoothed. Here the objective function for each parameter setting is accumulated over a long period ( days) to minimize sampling variability. However, even with this wide accumulation window, sampling variability remains in some parameter regimes and there noticeably affects . An example is the roughness around , which appears to be caused by metastability on timescales longer than the accumulation window. The roughness could be smoothed by accumulating over periods that are yet longer, or by averaging over an ensemble of initial conditions, but analogous smoothing might be impractical for ESMs. Time-averaged ESM statistics may exhibit similarly rough dependencies on some parameters (e.g., Suzuki et al., 2013; Zhao et al., 2016), although the dependence on other parameters appears to be relatively smooth (e.g., Neelin et al., 2010), perhaps because ESM parameters targeted for tuning are chosen for the smooth dependence of the climate state on them. Roughness of the potential energy landscape can present challenges for learning algorithms, which may get stuck in local minima. Note also the bimodality in , which arises because the one-point statistics we fit cannot easily distinguish prograde wave modes of the system (which propagate toward increasing ) from retrograde modes (cf. Lorenz and Emanuel, 1998).
4.2.1 Bayesian Inversion
We use the random-walk Metropolis (RWM) MCMC algorithm (Brooks et al., 2011) for a full Bayesian inversion of parameters in the dynamical system (11), (12), thereby sampling from the posterior PDF. To reduce burn-in (MCMC spin-up) time, we initialize the algorithm close to the true parameter values with the result of an ensemble Kalman inversion (see below). The RWM algorithm is then run over iterations, the first iterations are discarded as burn-in, and the posterior PDF is estimated by binning every other of the remaining 2000 samples. The objective function for each sample is accumulated over , using the end state of the previous forward integration as initial condition for the next one, without discarding any spin-up after a parameter update.
The resulting marginal posterior PDFs do not all peak exactly at the true parameter values, but the true parameter values lie in a region that contains most of the posterior probability mass (Figure 1, second row). The posterior PDFs indicate the uncertainties inherent in estimating the parameters. The posterior PDF of has the largest spread, in terms of standard deviation normalized by mean, indicating relatively large uncertainty in this parameter. The uncertainty appears to arise from the roughness of the potential energy (Figure 1, first row), which reflects inherent sensitivity of the system response to parameter variability; additional roughness of the posterior PDFs may be caused by sampling variability from finite-time averages (Wang et al., 2014). For all four parameters, the posterior PDFs differ significantly from the priors, demonstrating the information content provided by the synthetic data. Finally, although these results have been obtained with forward model integrations and objective function evaluations, more objective function evaluations may be required for more complex forward models, such as ESMs.
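The structure of a random-walk Metropolis sampler of the kind used here can be sketched as follows. This is a generic illustration, not the code behind Figure 1: the quadratic `potential` is a hypothetical stand-in for the negative log-posterior (in the setting above, each evaluation would instead require a forward integration of the dynamical system), and the target mean, spread, proposal step, chain length, and burn-in length are arbitrary choices made for the example.

```python
import numpy as np

def random_walk_metropolis(potential, theta0, n_iter, step, rng):
    """Sample from the density proportional to exp(-potential(theta))."""
    theta = np.array(theta0, dtype=float)
    u = potential(theta)
    chain = np.empty((n_iter, theta.size))
    accepted = 0
    for i in range(n_iter):
        proposal = theta + step * rng.normal(size=theta.size)  # Gaussian random-walk proposal
        u_prop = potential(proposal)
        # Accept with probability min(1, exp(u - u_prop)).
        if np.log(rng.uniform()) < u - u_prop:
            theta, u = proposal, u_prop
            accepted += 1
        chain[i] = theta
    return chain, accepted / n_iter

target_mean = np.array([10.0, 1.0])   # hypothetical posterior mode
target_std = np.array([0.5, 0.1])     # hypothetical posterior spread

def potential(theta):
    """Toy negative log-posterior: an uncorrelated Gaussian."""
    return 0.5 * np.sum(((theta - target_mean) / target_std) ** 2)

rng = np.random.default_rng(0)
chain, acc = random_walk_metropolis(potential, [9.0, 0.8], 20000, 1.7 * target_std, rng)
samples = chain[2000:]                # discard the first iterations as burn-in
print(acc, samples.mean(axis=0), samples.std(axis=0))
```

Every iteration costs one potential evaluation whether or not the proposal is accepted, which is why the number of forward model integrations is the relevant measure of cost for MCMC in this context.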
4.2.2 Ensemble Kalman Inversion
Ensemble Kalman inversion may be an attractive learning algorithm for ESMs when Bayesian inversion with MCMC is computationally too demanding. To illustrate its performance, we use the algorithm of Iglesias et al. (2013), initializing ensembles of size with parameters drawn from the prior PDFs. In the analysis step of the Kalman inversion, we perturb the target data by addition of noise with zero mean and variance given by (18), that is, replacing by with for each ensemble member . As in the MCMC algorithm, the objective function for each parameter setting is accumulated over , without discarding any spin-up after each parameter update. As initial state for the integration of the ensemble, we use a state drawn from the statistically steady state of a simulation with the true parameters.
Table 1
Noise | Mean ($M = 10$) | Mean ($M = 100$) | Std ($M = 100$)
 | (9.62, 0.579, 9.37, 2.63) | (9.71, 0.992, 8.70, 9.95) | (0.023, 0.001, 0.104, 0.022)
 | (9.57, 0.516, 7.90, 3.15) | (9.77, 0.994, 9.07, 10.04) | (0.107, 0.005, 0.524, 0.103)
 | (9.77, 0.522, 9.29, 5.31) | (9.63, 0.982, 8.34, 9.93) | (0.295, 0.017, 1.477, 0.350)
 | (9.70, 0.633, 7.68, 6.13) | (9.53, 0.952, 7.97, 9.37) | (0.385, 0.039, 1.964, 0.701)
Table 1 summarizes the solutions obtained by this ensemble Kalman inversion after iterations, for different ensemble sizes and noise levels . The ensemble mean of the Kalman inversion provides reasonable parameter estimates. But the ensemble standard deviation does not always provide quantitatively accurate uncertainty information. For example, for low noise levels, the true parameter values often lie more than two standard deviations away from the ensemble mean. The ensemble spread also differs quantitatively from the posterior spread in the MCMC simulations. In experiments in which we did not perturb the target data, the smaller ensembles () occasionally collapsed, with each ensemble member giving the same point estimate of the parameters. In such cases, the ensemble contains no uncertainty information, illustrating potential pitfalls of using ensemble Kalman inversion for uncertainty quantification. However, with the perturbed data and for larger ensembles, the ensemble standard deviation is qualitatively consistent with the posterior PDF estimated by MCMC (Figure 1, second row). It provides some uncertainty information, especially for higher noise levels, for example, in the sense that the parameter is demonstrably the most uncertain (Table 1 and Figure 2b). Methods such as localization and variance inflation can help with issues related to ensemble collapse and can also be used to improve ensemble statistics more generally (see Law et al. (2015) and the references therein). However, systematic principles for their application with the aim of correctly reproducing Bayesian posterior statistics have not been found, and so we have not adopted this approach.
The ensemble Kalman inversion typically converges within a few iterations (Figure 2 indicates iterations when ). Larger ensembles lead to solutions closer to the truth (Figure 2a). Convergence within 5 iterations for ensembles of size 10 or 100 implies 50 or 500 objective function evaluations, representing substantial computational savings over the MCMC algorithm with 2000 objective function evaluations. These computational savings come at the expense of detailed uncertainty information. Where the optimal tradeoff lies between computational efficiency, on the one hand, and precision of parameter estimates and uncertainty quantification, on the other hand, remains to be investigated.
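The analysis step with perturbed target data can be sketched as follows. This is an illustration in the spirit of iterative ensemble Kalman inversion, not a transcription of the algorithm of Iglesias et al. (2013) as used for Table 1: the linear map `forward` is a toy stand-in for the map from parameters to time-averaged statistics, and the function name `eki_update`, the `1/(M - 1)` covariance normalization, the noise level, and the prior ensemble are assumptions made for the example.

```python
import numpy as np

def eki_update(thetas, outputs, target, noise_cov, rng):
    """One ensemble Kalman inversion step with perturbed target data.

    thetas: (M, p) parameter ensemble; outputs: (M, d) forward-model outputs.
    """
    M = thetas.shape[0]
    dtheta = thetas - thetas.mean(axis=0)
    dg = outputs - outputs.mean(axis=0)
    c_tg = dtheta.T @ dg / (M - 1)            # parameter-output cross-covariance, (p, d)
    c_gg = dg.T @ dg / (M - 1)                # output covariance, (d, d)
    # Each member sees the target data plus an independent noise draw.
    perturbed = target + rng.multivariate_normal(np.zeros(target.size), noise_cov, size=M)
    innovation = np.linalg.solve(c_gg + noise_cov, (perturbed - outputs).T)   # (d, M)
    return thetas + (c_tg @ innovation).T

A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0], [2.0, 1.0]])

def forward(theta):
    """Toy linear forward map standing in for accumulated statistics."""
    return A @ theta

truth = np.array([10.0, 1.0])
noise_cov = 0.05**2 * np.eye(5)
target = forward(truth)

rng = np.random.default_rng(0)
thetas = rng.normal(loc=[8.0, 0.0], scale=[3.0, 1.0], size=(100, 2))   # prior ensemble
for _ in range(5):
    outputs = np.array([forward(t) for t in thetas])   # one forward evaluation per member
    thetas = eki_update(thetas, outputs, target, noise_cov, rng)
print(thetas.mean(axis=0), thetas.std(axis=0))
```

The forward evaluations inside each iteration are independent of one another, so they can run in parallel, and no derivatives of the forward map are needed. The cost is the ensemble size times the number of iterations, which is the accounting used in the comparison with MCMC above.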
4.3 Parameter Learning From Fast Dynamics
Finally, we investigate learning about parameters from the fast dynamics (12) alone. This is similar to learning about computable parameters from local high-resolution simulations, e.g., of clouds. That is, the fast dynamics (12) with the true parameters stand for the high-resolution model, which generates data $\tilde{Y}$, and the fast dynamics with parameters $\theta$ play the role of the single-column model, which generates data $Y$. We choose a cell index $k$ and fix the slow variable $X_k$ at a value taken from the statistically steady state of the full dynamics. There are three parameters to learn from the fast dynamics: $\theta = (h, c, b)$. The one-point statistics of the fast variables are not enough to recover all three. Therefore, we consider the moment function
$$g(Y) = \bigl( Y_{j,k}, \; Y_{j,k} Y_{j',k} \bigr)^T, \qquad j, j' = 1, \dots, J, \tag{20}$$
containing all $J$ first moments and $J(J+1)/2$ second moments (including cross-moments), giving a vector of length $J + J(J+1)/2$. We minimize the “high-resolution” objective function
$$J_s(\theta) = \frac{1}{2} \bigl\| \langle g(Y) \rangle_T - \langle g(\tilde{Y}) \rangle_\infty \bigr\|_{\Sigma}^2, \tag{21}$$
using a diagonal noise covariance matrix with diagonal elements proportional to the variances of the statistics in $g$, with a noise level analogous to the noise covariance matrix (18). The variances of the statistics are estimated from a long control integration of the fast dynamics with fixed $X_k$. Because the fast variables $Y_{j,k}$ evolve more rapidly than the slow variables $X_k$, we accumulate statistics over a correspondingly shorter time interval.
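The moment vector for a single cell of fast variables can be assembled as in the following sketch. It assumes fast dynamics of the form $(1/c)\,dY_j/dt = -b\,Y_{j+1}(Y_{j+2} - Y_{j-1}) - Y_j + (h/J) X$ with $J = 10$ and, as a simplification made here, periodicity in $j$ within the cell. The function names, the time step, the integration length, and the fixed value `x_fixed = 2.5` are hypothetical placeholders and are not the settings used for Figure 1.

```python
import numpy as np

def fast_rhs(Y, x_fixed, h=1.0, c=10.0, b=10.0):
    """Tendency of the J fast variables in one cell for a fixed slow variable."""
    J = Y.size
    adv = -b * np.roll(Y, -1) * (np.roll(Y, -2) - np.roll(Y, 1))
    return c * (adv - Y + (h / J) * x_fixed)

def integrate_fast(Y0, x_fixed, dt, n_steps, **params):
    """Fourth-order Runge-Kutta integration; returns the (n_steps, J) trajectory."""
    Y = np.array(Y0, dtype=float)
    out = np.empty((n_steps, Y.size))
    for n in range(n_steps):
        k1 = fast_rhs(Y, x_fixed, **params)
        k2 = fast_rhs(Y + 0.5 * dt * k1, x_fixed, **params)
        k3 = fast_rhs(Y + 0.5 * dt * k2, x_fixed, **params)
        k4 = fast_rhs(Y + dt * k3, x_fixed, **params)
        Y = Y + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
        out[n] = Y
    return out

def moment_function(traj):
    """Time means of all first moments and all distinct second moments."""
    first = traj.mean(axis=0)
    second = traj.T @ traj / traj.shape[0]     # time-averaged products Y_j * Y_j'
    iu = np.triu_indices(traj.shape[1])        # count each cross-moment once
    return np.concatenate([first, second[iu]])

rng = np.random.default_rng(1)
traj = integrate_fast(0.1 * rng.normal(size=10), x_fixed=2.5, dt=0.001, n_steps=5000)
g = moment_function(traj)
print(g.shape)   # (65,)
```

For $J = 10$ there are 10 first moments and 55 distinct second moments, which is where the length of 65 comes from. Including the cross-moments is what adds two-point information beyond the one-point statistics.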
Bayesian inversion with RWM, with the same priors and algorithmic settings as before and with noise level , again gives marginal posterior PDFs with modes close to the truth (Figure 1, third row). The posterior PDFs exhibit similar multimodality and reflect similar uncertainties and biases of posterior modes as those obtained from the full dynamics, especially with respect to the relatively large uncertainties in (cf. Figure 1, second row).
These examples illustrate the potential of learning about parameters from observations and from local high-resolution simulations under selected conditions (here, for just one value of the slow variable). An important question for future investigations is to what extent such results generalize to imperfect parameterization schemes, whose dynamics are usually not identical to the data-generating dynamics, so that structural in addition to parametric uncertainties arise. This issue can be studied for the Lorenz-96 system, for example, by using approximate models as parameterizations of the fast dynamics (e.g., Fatkullin and Vanden-Eijnden, 2004; Wilks, 2005; Crommelin and Vanden-Eijnden, 2008).
5 Outlook
Just as weather forecasts have made great strides over the past decades thanks to improvements in the assimilation of observations (Bauer et al., 2015), climate projections can advance similarly by harnessing observations and modern computational capabilities more systematically. New methods from data assimilation, inverse problems, and machine learning make it possible to integrate observations and targeted high-resolution simulations in an ESM that learns from both and uses both to quantify uncertainties. As an objective of such parameter learning, we propose the reduction of biases and exploitation of emergent constraints through the matching of mean values and covariance components between ESMs, observations, and targeted high-resolution simulations.
Coordinated space-based observations of crucial processes in the climate system are now available. For example, more than a decade’s worth of coordinated observations of clouds, precipitation, temperature, and humidity with global coverage is available; parameterizations of clouds, convection, and turbulence can learn from them. Similarly, simultaneous measurements of CO2 concentrations and photosynthesis are becoming available; parameterizations of terrestrial ecosystems can learn from them. So far, such observations have been primarily used to evaluate models and identify their deficiencies. Their potential to improve models has not yet been harnessed. Additionally, it is feasible to conduct faithful local high-resolution simulations of processes such as the dynamics of clouds or sea ice, which are in principle computable but are too costly to compute globally. Parameterizations can also learn from such high-resolution simulations, either online by nesting them in an ESM or offline by creating libraries of high-resolution simulations representing different regions and climates to learn from. Such a systematic approach to learning parameterizations from data allows the quantification of uncertainties in parameterizations, which in turn can be used to produce ensembles of climate simulations to quantify the uncertainty in predictions.
The machine learning of parameterizations in our view should be informed by the governing equations of subgrid-scale processes whenever they are known. The governing equations can be systematically coarse-grained, for example, by modeling the joint PDF of the relevant variables as a mixture of Gaussian kernels and generating moment equations for the modeled PDF from the governing equations (cf. Lappen and Randall, 2001a; Golaz et al., 2002; Guo et al., 2015; Firl and Randall, 2015). The closure parameters that necessarily arise in any such coarse-graining of nonlinear governing equations can then be learned from a broad range of observations and high-resolution simulations, as parametric or nonparametric functions of ESM state variables (cf. Parish and Duraisamy, 2016). The fineness of the coarse-graining (measured by the number of Gaussian kernels in the above example) can adapt to the information available to learn closure parameters. Such equation-informed machine learning will provide a more versatile means of modeling subgrid-scale processes than the traditional approach of fixing closure parameters ad hoc or on the basis of a small sample of observations or high-resolution simulations. Because parameterizations learned within the structure of the known governing equations respect the relevant symmetries and conservation laws to within the closure approximations, they likely have greater out-of-sample predictive power than unstructured parameterization schemes, such as neural networks that are fit to subgrid-scale processes without explicit regard for symmetries and conservation laws (e.g., Krasnopolsky et al., 2013). Out-of-sample predictive power will be crucial if high-resolution simulations performed in selected locations and under selected conditions are to provide information globally and in changed climates. However, for non-computable processes whose governing equations are unknown, like many ecological or biogeochemical processes, more empirical, data-driven parameterization approaches may well be called for.

An ESM that is designed from the outset to learn systematically from observations and high-resolution simulations represents an opportunity to achieve a leap in fidelity of parameterization schemes and thus of climate projections. Such an ESM can be expected to have attendant benefits for weather forecasting, because weather forecasting models and the atmospheric component of ESMs are essentially the same. However, challenges lie along the path toward realization of such an ESM:

We need innovation in learning algorithms. Our relatively simple example showed that parameters in a perfect-model setting can be learned effectively and efficiently by ensemble Kalman inversion. It remains to investigate questions such as the optimal ensemble size in Kalman inversions, how to adapt inversion algorithms to imperfect models, and how to quantify uncertainties. To increase computational efficiency, online filtering algorithms need to be developed that update parameters on the fly as Earth system statistics are being accumulated.

We need investigations of the best metrics to use when learning parameterization schemes from observations or high-resolution simulations. For example, are least-squares objective functions the best ones to use? Which covariance components or other statistics should be included in the objective functions? There are tradeoffs between the number of covariance components that can be estimated from data and the information they can provide about parameterization schemes.

We need innovation in how learning from observations should interact with learning from targeted high-resolution simulations. How should high-resolution simulations be targeted? Where is the optimum tradeoff between the added computational cost of conducting high-resolution simulations and the marginal information about parameterization schemes they provide?

We need innovation in parameterization schemes themselves, to design them such that they can learn effectively from diverse data sources and can be systematically refined when more information becomes available. It will be important to develop parameterizations that treat subgrid-scale motions (e.g., boundary layer turbulence, shallow convection, deep convection) in a unified manner, to eliminate artificial spectral gaps that do not exist in nature and to reduce the number of correlated parameters in the schemes (e.g., Lappen and Randall, 2001a, b; Köhler et al., 2011; Suselj et al., 2013; Park, 2014a, b; Guo et al., 2015). Novel approaches that exploit ideas ranging from stochastic parameterization to systematic coarse-graining likely have roles to play here (e.g., Majda et al., 2003, 2008; Klein and Majda, 2006; Palmer et al., 2005; Palmer and Williams, 2010; Majda, 2012; Wouters and Lucarini, 2013; Lucarini et al., 2014; Wouters et al., 2016; Berner et al., 2017). Furthermore, as the resolution of ESMs increases, it will also be necessary to revisit the common practice of modeling subgrid-scale dynamics in grid columns, because the lateral exchange of subgrid-scale information across grid columns will play increasingly important roles.
The time is right to seize the opportunities that the available global observations and our computational resources present. Fundamentally re-engineering atmospheric parameterization schemes, such as cloud and boundary layer parameterizations, will become a necessity as atmosphere models, within the next decade, reach horizontal grid spacings of 1–10 km and begin to resolve deep convection (Schneider et al., 2017). At such resolutions, common assumptions made in existing parameterization schemes, such as that clouds and the planetary boundary layer adjust instantaneously to changes in resolved-scale dynamics, will become untenable. Additionally, advances in high-performance computing (e.g., many-core computational architectures based on graphics processing units) will soon require a redesign of the software infrastructure of ESMs (Bretherton et al., 2012; Schulthess, 2015; Schalkwijk et al., 2015). So it is timely now to re-engineer ESMs and parameterization schemes, and to design them from the outset so that they can learn systematically from observations and targeted high-resolution simulations.
Integrating observations and targeted high-resolution simulations in an Earth system modeling framework would have multiple attendant benefits. Solving the inverse problems of learning about parameterizations from observations requires observing system simulators that map model state variables to observables (Figure 3). The same observing system simulators, integrated in an Earth system modeling framework, can be used to answer questions about the value new observations would provide, for example, in terms of reduced uncertainties in ESMs. Addressing such questions in observing system simulation experiments (OSSEs) is increasingly required before the acquisition of new observing systems (e.g., as part of the U.S. Weather Research and Forecasting Innovation Act of 2017). Such questions are naturally answered within the framework we propose.
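As a minimal illustration of the kind of question an OSSE addresses, the following sketch considers a toy linear-Gaussian setting. It is not the framework proposed here: the scalar parameter, the linear observing system simulator `h`, the noise variances, and the two instruments are all hypothetical choices made for illustration. In this simplified setting, the posterior variance does not depend on the measured value, so the expected uncertainty reduction from a candidate instrument can be computed before any data are acquired.

```python
# Toy OSSE sketch under illustrative assumptions (not the paper's code).
# A scalar parameter theta has a Gaussian prior with variance prior_var.
# A candidate observing system measures y = h * theta + noise, where h is a
# hypothetical linearized observing system simulator and the noise is
# Gaussian with variance obs_var.


def posterior_variance(prior_var, h, obs_var):
    """Posterior variance of theta after one observation y = h * theta + noise."""
    return 1.0 / (1.0 / prior_var + h * h / obs_var)


def variance_reduction(prior_var, h, obs_var):
    """Fraction of the prior variance removed by the candidate observation."""
    return 1.0 - posterior_variance(prior_var, h, obs_var) / prior_var


# Two hypothetical instruments with different sensitivity and noise levels.
candidates = {
    "instrument_a": {"h": 2.0, "obs_var": 4.0},
    "instrument_b": {"h": 1.0, "obs_var": 0.25},
}
prior_var = 1.0

reductions = {
    name: variance_reduction(prior_var, spec["h"], spec["obs_var"])
    for name, spec in candidates.items()
}

for name, value in sorted(reductions.items()):
    print(f"{name}: fractional variance reduction = {value:.2f}")
```

In a realistic setting, the observing system simulator is nonlinear and the parameters are high-dimensional, so the uncertainty reduction would have to be estimated with ensemble or sampling methods rather than in closed form.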
Acknowledgements.
We gratefully acknowledge financial support by Charles Trimble, by the Office of Naval Research (grant N00014-17-1-2079), and by the President’s and Director’s Fund of Caltech and the Jet Propulsion Laboratory. We also thank V. Balaji, Michael Keller, Dan McCleese, and John Worden for helpful discussions and comments on drafts, and Momme Hell for preparing Figure 3. The program code used in this paper is available at climatedynamics.org/publications/. Part of this research was carried out at the Jet Propulsion Laboratory, California Institute of Technology, under a contract with the National Aeronautics and Space Administration.
References
 Adam et al. (2016) Adam, O., T. Schneider, F. Brient, and T. Bischoff (2016), Relation of the double-ITCZ bias to the atmospheric energy budget in climate models, Geophys. Res. Lett., 43, 7670–7677, doi:10.1002/2016GL069465.
 Adam et al. (2017) Adam, O., T. Schneider, and F. Brient (2017), Regional and seasonal variations of the double-ITCZ bias in CMIP5 models, Climate Dyn., doi:10.1007/s00382-017-3909-1.
 Aksoy et al. (2006) Aksoy, A., F. Zhang, and J. W. Nielsen-Gammon (2006), Ensemble-based simultaneous state and parameter estimation with MM5, Geophys. Res. Lett., 33, L12,801, doi:10.1029/2006GL026186.
 Alexanderian et al. (2016) Alexanderian, A., N. Petra, G. Stadler, and O. Ghattas (2016), A fast and scalable method for A-optimal design of experiments for infinite-dimensional Bayesian nonlinear inverse problems, SIAM J. Sci. Comp., 38, A243–A272.
 Anderson et al. (2009) Anderson, J., T. Hoar, K. Raeder, H. Liu, N. Collins, R. Torn, and A. Avellano (2009), The data assimilation research testbed: A community facility, Bull. Amer. Meteor. Soc., 90, 1283–1296, doi:10.1175/2009BAMS2618.1.

 Anderson (2001) Anderson, J. L. (2001), An ensemble adjustment Kalman filter for data assimilation, Mon. Wea. Rev., 129, 2884–2903, doi:10.1175/1520-0493(2001)129<2884:AEAKFF>2.0.CO;2.
 Annan and Hargreaves (2007) Annan, J. D., and J. C. Hargreaves (2007), Efficient estimation and ensemble generation in climate modelling, Phil. Trans. R. Soc. A, 365, 2077–2088, doi:10.1098/rsta.2007.2067.
 Ban et al. (2015) Ban, N., J. Schmidli, and C. Schär (2015), Heavy precipitation in a changing climate: Does short-term summer precipitation increase faster?, Geophys. Res. Lett., 42, 1165–1172, doi:10.1002/2014GL062588.
 Bauer et al. (2015) Bauer, P., A. Thorpe, and G. Brunet (2015), The quiet revolution of numerical weather prediction, Nature, 525, 47–55, doi:10.1038/nature14956.
 Benedict and Randall (2009) Benedict, J. J., and D. A. Randall (2009), Structure of the Madden-Julian oscillation in the superparameterized CAM, J. Atmos. Sci., 66, 3277–3296, doi:10.1175/2009JAS3030.1.
 Berner et al. (2017) Berner, J., U. Achatz, L. Batte, L. Bengtsson, A. De La Camara, H. M. Christensen, M. Colangeli, D. R. Coleman, D. Crommelin, S. I. Dolaptchiev, et al. (2017), Stochastic parameterization: towards a new view of weather and climate models, Bull. Amer. Meteor. Soc., 98, 565–587, doi:10.1175/BAMS-D-15-00268.1.
 Beskos et al. (2017) Beskos, A., M. Girolami, S. Lan, P. E. Farrell, and A. M. Stuart (2017), Geometric MCMC for infinite-dimensional inverse problems, J. Comp. Phys., 335, 327–351, doi:10.1016/j.jcp.2016.12.041.
 Bishop et al. (2001) Bishop, C. H., B. J. Etherton, and S. J. Majumdar (2001), Adaptive sampling with the ensemble transform Kalman filter. Part I: Theoretical aspects, Mon. Wea. Rev., 129, 420–436, doi:10.1175/1520-0493(2001)129<0420:ASWTET>2.0.CO;2.
 Bloom et al. (2016) Bloom, A. A., J.-F. Exbrayat, I. R. van der Velde, L. Feng, and M. Williams (2016), The decadal state of the terrestrial carbon cycle: Global retrievals of terrestrial carbon allocation, pools, and residence times, Proc. Natl. Acad. Sci., 113, 1285–1290.
 Bocquet and Sakov (2013) Bocquet, M., and P. Sakov (2013), Joint state and parameter estimation with an iterative ensemble Kalman smoother, Nonlin. Processes Geophys., 20, 803–818, doi:10.5194/npg-20-803-2013.
 Bocquet and Sakov (2014) Bocquet, M., and P. Sakov (2014), An iterative ensemble Kalman smoother, Quart. J. Roy. Meteor. Soc., 140, 1521–1535, doi:10.1002/qj.2236.
 Bodas-Salcedo et al. (2014) Bodas-Salcedo, A., K. D. Williams, M. A. Ringer, I. Beau, J. N. S. Cole, J.-L. Dufresne, T. Koshiro, B. Stevens, Z. Wang, and T. Yokohata (2014), Origins of the solar radiation biases over the Southern Ocean in CFMIP2 models, J. Climate, 27, 41–56, doi:10.1175/JCLI-D-13-00169.1.
 Bony et al. (2006) Bony, S., R. Colman, V. M. Kattsov, R. P. Allan, C. S. Bretherton, J.-L. Dufresne, A. Hall, S. Hallegatte, M. M. Holland, W. Ingram, D. A. Randall, B. J. Soden, G. Tselioudis, and M. J. Webb (2006), How well do we understand and evaluate climate change feedback processes?, J. Climate, 19, 3445–3482, doi:10.1175/JCLI3819.1.
 Bretherton et al. (2012) Bretherton, C., V. Balaji, T. Delworth, R. E. Dickinson, J. A. Edmonds, J. S. Famiglietti, I. Fung, J. S. Hack, J. S. Hurrell, D. J. Jacob, J. L. Kinter III, L.-Y. R. Leung, S. Marshall, W. Maslowski, L. O. Mearns, R. B. Rood, L. L. Smarr, et al. (2012), A National Strategy for Advancing Climate Modeling, The National Academies Press, Washington, D.C.
 Brient and Schneider (2016) Brient, F., and T. Schneider (2016), Constraints on climate sensitivity from space-based measurements of low-cloud reflection, J. Climate, 29, 5821–5835, doi:10.1175/JCLI-D-15-0897.1.
 Brooks et al. (2011) Brooks, S., A. Gelman, G. L. Jones, and X.-L. Meng (2011), Handbook of Markov Chain Monte Carlo, 619 pp., Chapman and Hall/CRC.
 Canadell et al. (2007) Canadell, J. G., C. Le Quéré, M. R. Raupach, C. B. Field, E. T. Buitenhuis, P. Ciais, T. J. Conway, N. P. Gillett, R. A. Houghton, and G. Marland (2007), Contributions to accelerating atmospheric CO_{2} growth from economic activity, carbon intensity, and efficiency of natural sinks, Proc. Natl. Acad. Sci., 104, 18,866–18,870, doi:10.1073/pnas.0702737104.
 Carrassi et al. (2017) Carrassi, A., M. Bocquet, A. Hannart, and M. Ghil (2017), Estimating model evidence using data assimilation, Quart. J. Roy. Meteor. Soc., 143, 866–880, doi:10.1002/qj.2972.
 Cess et al. (1989) Cess, R. D., G. Potter, J. Blanchet, G. Boer, S. Ghan, J. Kiehl, H. Le Treut, Z.-X. Li, X.-Z. Liang, J. Mitchell, et al. (1989), Interpretation of cloud-climate feedback as produced by 14 atmospheric general circulation models, Science, 245, 513–516.
 Cess et al. (1990) Cess, R. D., G. L. Potter, J. P. Blanchet, G. J. Boer, A. D. Del Genio, M. Déqué, V. Dymnikov, V. Galin, W. L. Gates, S. J. Ghan, J. T. Kiehl, A. A. Lacis, H. Le Treut, Z.-X. Li, X.-Z. Liang, B. J. McAvaney, V. P. Meleshko, J. F. B. Mitchell, J.-J. Morcrette, D. A. Randall, L. Rikus, E. Roeckner, J. F. Royer, U. Schlese, D. A. Sheinin, A. Slingo, A. P. Sokolov, K. E. Taylor, W. M. Washington, R. T. Wetherald, I. Yagai, and M.-H. Zhang (1990), Intercomparison and interpretation of climate feedback processes in 19 atmospheric general circulation models, J. Geophys. Res., 95, 16,601–16,615, doi:10.1029/JD095iD10p16601.
 Collins et al. (2012) Collins, M., R. E. Chandler, P. M. Cox, J. M. Huthnance, J. Rougier, and D. B. Stephenson (2012), Quantifying future climate change, Nature Climate Change, 2, 403–409, doi:10.1038/NCLIMATE1414.
 Cotter et al. (2013) Cotter, S. L., G. O. Roberts, A. M. Stuart, and D. White (2013), MCMC methods for functions: Modifying old algorithms to make them faster, Statist. Science, 28(3), 424–446.
 Cox et al. (2013) Cox, P. M., D. Pearson, B. B. Booth, P. Friedlingstein, C. Huntingford, C. D. Jones, and C. M. Luke (2013), Sensitivity of tropical carbon to climate change constrained by carbon dioxide variability, Nature, 494, 341–344, doi:10.1038/nature11882.
 Crisp et al. (2004) Crisp, D., R. M. Atlas, F.-M. Breon, L. R. Brown, J. P. Burrows, P. Ciais, B. J. Connor, S. C. Doney, I. Y. Fung, D. J. Jacob, C. E. Miller, et al. (2004), The Orbiting Carbon Observatory (OCO) mission, Adv. Space Res., 34, 700–709, doi:10.1016/j.asr.2003.08.062.
 Crommelin and Vanden-Eijnden (2008) Crommelin, D., and E. Vanden-Eijnden (2008), Subgrid-scale parameterization with conditional Markov chains, J. Atmos. Sci., 65, 2661–2675, doi:10.1175/2008JAS2566.1.
 de Rooy et al. (2013) de Rooy, W. C., P. Bechtold, K. Fröhlich, C. Hohenegger, H. Jonker, D. Mironov, A. P. Siebesma, J. Teixeira, and J.-I. Yano (2013), Entrainment and detrainment in cumulus convection: an overview, Quart. J. Roy. Meteor. Soc., 139, 1–19.
 Dee (1995) Dee, D. P. (1995), On-line estimation of error covariance parameters for atmospheric data assimilation, Mon. Wea. Rev., 123, 1128–1145, doi:10.1175/1520-0493(1995)123<1128:OLEOEC>2.0.CO;2.
 Dee (2005) Dee, D. P. (2005), Bias and data assimilation, Quart. J. Roy. Meteor. Soc., 131, 3323–3343, doi:10.1256/qj.05.137.
 Del Moral et al. (2006) Del Moral, P., A. Doucet, and A. Jasra (2006), Sequential Monte Carlo samplers, J. Roy. Statist. Soc. B, 68, 411–436, doi:10.1111/j.1467-9868.2006.00553.x.
 DeMott et al. (2013) DeMott, C. A., C. Stan, and D. A. Randall (2013), Northward propagation mechanisms of the boreal summer intraseasonal oscillation in the ERA-Interim and SP-CCSM, J. Climate, 26, 1973–1992, doi:10.1175/JCLI-D-12-00191.1.
 Devenish et al. (2012) Devenish, B. J., P. Bartello, J.-L. Brenguier, L. R. Collins, W. W. Grabowski, R. H. A. IJzermans, S. P. Malinowski, M. W. Reeks, J. C. Vassilicos, L.-P. Wang, and Z. Warhaft (2012), Droplet growth in warm turbulent clouds, Quart. J. Roy. Meteor. Soc., 138, 1401–1429, doi:10.1002/qj.1897.
 Draper (1995) Draper, D. (1995), Assessment and propagation of model uncertainty, J. Roy. Statist. Soc. B, 57, 45–97.
 E et al. (2007) E, W., B. Engquist, X. Li, W. Ren, and E. Vanden-Eijnden (2007), Heterogeneous multiscale methods: A review, Commun. Comput. Phys., 3, 367–450.
 Eldering et al. (2017) Eldering, A., P. O. Wennberg, D. Crisp, D. S. Schimel, M. R. Gunson, A. Chatterjee, J. Liu, F. M. Schwandner, Y. Sun, C. W. O’Dell, C. Frankenberg, T. Taylor, B. Fisher, G. B. Osterman, D. Wunch, J. Hakkarainen, J. Tamminen, and B. Weir (2017), The Orbiting Carbon Observatory-2 early science investigations of regional carbon dioxide fluxes, Science, 358, eaam5745, doi:10.1126/science.aam5745.
 Emanuel and Živković-Rothman (1999) Emanuel, K. A., and M. Živković-Rothman (1999), Development and evaluation of a convection scheme for use in climate models, J. Atmos. Sci., 56, 1766–1782, doi:10.1175/1520-0469(1999)056<1766:DAEOAC>2.0.CO;2.
 Engl et al. (1996) Engl, H. W., M. Hanke, and A. Neubauer (1996), Regularization of Inverse Problems, 321 pp., Kluwer Academic Publishers, Dordrecht.
 Fatkullin and Vanden-Eijnden (2004) Fatkullin, I., and E. Vanden-Eijnden (2004), A computational strategy for multiscale systems with applications to Lorenz 96 model, J. Comp. Phys., 200, 605–638, doi:10.1016/j.jcp.2004.04.013.
 Firl and Randall (2015) Firl, G. J., and D. A. Randall (2015), Fitting and analyzing LES using multiple trivariate Gaussians, J. Atmos. Sci., 72, 1094–1116, doi:10.1175/JAS-D-14-0192.1.
 Flato et al. (2013) Flato, G., J. Marotzke, B. Abiodun, P. Braconnot, S. C. Chou, W. Collins, P. Cox, F. Driouech, S. Emori, V. Eyring, C. Forest, P. Gleckler, E. Guilyardi, C. Jakob, V. Kattsov, C. Reason, and M. Rummukainen (2013), Evaluation of climate models, in Climate Change 2013: The Physical Science Basis. Contribution of Working Group I to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change, edited by T. F. Stocker, D. Qin, G.-K. Plattner, M. Tignor, S. K. Allen, J. Boschung, A. Nauels, Y. Xia, V. Bex, and P. M. Midgley, chap. 9, pp. 741–853, Cambridge University Press, Cambridge, UK, and New York, NY, USA.
 Fox-Kemper et al. (2014) Fox-Kemper, B., S. Bachman, B. Pearson, and S. Reckinger (2014), Principles and advances in subgrid modelling for eddy-rich simulations, CLIVAR Exchanges No. 65, 19(2).
 Frankenberg et al. (2011) Frankenberg, C., J. B. Fisher, J. Worden, G. Badgley, S. S. Saatchi, J.-E. Lee, G. C. Toon, A. Butz, M. Jung, A. Kuze, and T. Yokota (2011), New global observations of the terrestrial carbon cycle from GOSAT: Patterns of plant fluorescence with gross primary productivity, Geophys. Res. Lett., 38, L17,706, doi:10.1029/2011GL048738.
 Frankenberg et al. (2014) Frankenberg, C., C. O’Dell, J. Berry, L. Guanter, J. Joiner, P. Köhler, R. Pollock, and T. E. Taylor (2014), Prospects for chlorophyll fluorescence remote sensing from the Orbiting Carbon Observatory-2, Remote Sens. Env., 147, 1–12, doi:10.1016/j.rse.2014.02.007.
 Friedlingstein (2015) Friedlingstein, P. (2015), Carbon cycle feedbacks and future climate change, Phil. Trans. R. Soc. A, 373, 20140,421, doi:10.1098/rsta.2014.0421.
 Friedlingstein et al. (2006) Friedlingstein, P., P. Cox, R. Betts, L. Bopp, W. von Bloh, V. Brovkin, P. Cadule, S. Doney, M. Eby, I. Fung, G. Bala, J. John, C. Jones, F. Joos, T. Kato, M. Kawamiya, W. Knorr, K. Lindsay, H. D. Matthews, T. Raddatz, P. Rayner, C. Reick, E. Roeckner, K.-G. Schnitzler, R. Schnur, K. Strassmann, A. J. Weaver, C. Yoshikawa, and N. Zeng (2006), Climate–carbon cycle feedback analysis: Results from the C^{4}MIP model intercomparison, J. Climate, 19, 3337–3353, doi:10.1175/JCLI3800.1.
 Friedlingstein et al. (2014) Friedlingstein, P., M. Meinshausen, V. K. Arora, C. D. Jones, A. Anav, S. K. Liddicoat, and R. Knutti (2014), Uncertainties in CMIP5 climate projections due to carbon cycle feedbacks, J. Climate, 27, 511–526, doi:10.1175/JCLI-D-12-00579.1.
 Friend et al. (2014) Friend, A. D., W. Lucht, T. T. Rademacher, R. Keribin, R. Betts, P. Cadule, P. Ciais, D. B. Clark, R. Dankers, P. D. Falloon, et al. (2014), Carbon residence time dominates uncertainty in terrestrial vegetation responses to future climate and atmospheric CO_{2}, Proc. Natl. Acad. Sci., 111, 3280–3285, doi:10.1073/pnas.1222477110.
 Gardner et al. (2013) Gardner, A. S., G. Moholdt, J. G. Cogley, B. Wouters, A. A. Arendt, J. Wahr, E. Berthier, R. Hock, W. T. Pfeffer, G. Kaser, S. R. M. Ligtenberg, T. Bolch, M. J. Sharp, J. O. Hagen, M. R. van den Broeke, and F. Paul (2013), A reconciled estimate of glacier contributions to sea level rise: 2003 to 2009, Science, 340, 852–857, doi:10.1126/science.1234532.
 Golaz et al. (2002) Golaz, J.-C., V. E. Larson, and W. R. Cotton (2002), A PDF-based model for boundary layer clouds. Part I: Method and model description, J. Atmos. Sci., 59, 3540–3551.
 Golaz et al. (2013) Golaz, J.-C., L. W. Horowitz, and H. Levy II (2013), Cloud tuning in a coupled climate model: Impact on 20th century warming, Geophys. Res. Lett., 40, 2246–2251, doi:10.1002/grl.50232.
 Grabowski (2001) Grabowski, W. W. (2001), Coupling cloud processes with the large-scale dynamics using the cloud-resolving convection parameterization (CRCP), J. Atmos. Sci., 58, 978–997, doi:10.1175/1520-0469(2001)058<0978:CCPWTL>2.0.CO;2.
 Grabowski (2016) Grabowski, W. W. (2016), Towards global large eddy simulation: Superparameterization revisited, J. Meteor. Soc. Japan, 94, 327–344, doi:10.2151/jmsj.2016-017.
 Grabowski and Smolarkiewicz (1999) Grabowski, W. W., and P. K. Smolarkiewicz (1999), CRCP: A cloud resolving convection parameterization for modeling the tropical convecting atmosphere, Physica D, 133, 171–178, doi:10.1016/S0167-2789(99)00104-9.
 Grabowski and Wang (2013) Grabowski, W. W., and L.-P. Wang (2013), Growth of cloud droplets in a turbulent environment, Ann. Rev. Fluid Mech., 45, 293–324, doi:10.1146/annurev-fluid-011212-140750.
 Grell and Dévényi (2002) Grell, G. A., and D. Dévényi (2002), A generalized approach to parameterizing convection combining ensemble and data assimilation techniques, Geophys. Res. Lett., 29, 1693, doi:10.1029/2002GL015311.
 Guilyardi et al. (2009) Guilyardi, E., A. Wittenberg, A. Fedorov, M. Collins, C. Wang, A. Capotondi, G. J. van Oldenborgh, and T. Stockdale (2009), Understanding El Niño in ocean-atmosphere general circulation models, Bull. Amer. Meteor. Soc., 90, 325–340, doi:10.1175/2008BAMS2387.1.
 Guo et al. (2015) Guo, H., J.-C. Golaz, L. J. Donner, B. Wyman, M. Zhao, and P. Ginoux (2015), CLUBB as a unified cloud parameterization: Opportunities and challenges, Geophys. Res. Lett., 42, 4540–4547, doi:10.1002/2015GL063672.
 Hall and Qu (2006) Hall, A., and X. Qu (2006), Using the current seasonal cycle to constrain snow albedo feedback in future climate change, Geophys. Res. Lett., 33, L03,502, doi:10.1029/2005GL025127.
 Hohenegger and Bretherton (2011) Hohenegger, C., and C. S. Bretherton (2011), Simulating deep convection with a shallow convection scheme, Atmos. Chem. Phys., 11, 10,389–10,406, doi:10.5194/acp-11-10389-2011.
 Holloway and Neelin (2009) Holloway, C. E., and J. D. Neelin (2009), Moisture vertical structure, column water vapor, and tropical deep convection, J. Atmos. Sci., 66, 1665–1683.
 Hope (2015) Hope, C. (2015), The $10 trillion value of better information about the transient climate response, Phil. Trans. R. Soc. A, 373, 20140,429, doi:10.1098/rsta.2014.0429.
 Hourdin et al. (2013) Hourdin, F., J.-Y. Grandpeix, C. Rio, S. Bony, A. Jam, F. Cheruy, N. Rochetin, L. Fairhead, A. Idelkadi, I. Musat, J.-L. Dufresne, A. Lahellec, M.-P. Lefebvre, and R. Roehrig (2013), LMDZ5B: the atmospheric component of the IPSL climate model with revisited parameterizations for clouds and convection, Clim. Dyn., 40, 2193–2222, doi:10.1007/s00382-012-1343-y.
 Hourdin et al. (2017) Hourdin, F., T. Mauritsen, A. Gettelman, J.-C. Golaz, V. Balaji, Q. Duan, D. Folini, D. Ji, D. Klocke, Y. Qian, et al. (2017), The art and science of climate model tuning, Bull. Amer. Meteor. Soc., 98, 589–602, doi:10.1175/BAMS-D-15-00135.1.
 Houtekamer and Zhang (2016) Houtekamer, P. L., and F. Zhang (2016), Review of the ensemble Kalman filter for atmospheric data assimilation, Mon. Wea. Rev., 144, 4489–4532, doi:10.1175/MWR-D-15-0440.1.
 Hung et al. (2013) Hung, M.-P., J.-L. Lin, W. Wang, D. Kim, T. Shinoda, and S. J. Weaver (2013), MJO and convectively coupled equatorial waves simulated by CMIP5 climate models, J. Climate, 26, 6185–6214, doi:10.1175/JCLI-D-12-00541.1.
 Iglesias (2016) Iglesias, M. A. (2016), A regularizing iterative ensemble Kalman method for PDE-constrained inverse problems, Inverse Problems, 32, 025,002, doi:10.1088/0266-5611/32/2/025002.
 Iglesias et al. (2013) Iglesias, M. A., K. J. H. Law, and A. M. Stuart (2013), Ensemble Kalman methods for inverse problems, Inverse Problems, 29, 045,001 (20pp), doi:10.1088/0266-5611/29/4/045001.
 Intergovernmental Panel on Climate Change (2013) Intergovernmental Panel on Climate Change (2013), Climate Change 2013: The Physical Science Basis, Cambridge University Press, New York.
 Jackson et al. (2008) Jackson, C. S., M. K. Sen, G. Huerta, Y. Deng, and K. P. Bowman (2008), Error reduction and convergence in climate prediction, J. Climate, 21, 6698–6709, doi:10.1175/2008JCLI2112.1.
 Jakob (2003) Jakob, C. (2003), An improved strategy for the evaluation of cloud parameterizations in GCMs, Bull. Amer. Meteor. Soc., 84, 1387–1401, doi:10.1175/BAMS-84-10-1387.
 Jakob (2010) Jakob, C. (2010), Accelerating progress in global atmospheric model development through improved parameterizations: Challenges, opportunities, and strategies, Bull. Amer. Meteor. Soc., 91, 869–875, doi:10.1175/2009BAMS2898.1.
 Järvinen et al. (2010) Järvinen, H., P. Räisänen, M. Laine, J. Tamminen, A. Ilin, E. Oja, A. Solonen, and H. Haario (2010), Estimation of ECHAM5 climate model closure parameters with adaptive MCMC, Atmos. Chem. Phys., 10, 9993–10,002, doi:10.5194/acp-10-9993-2010.
 Jiang et al. (2012) Jiang, J. H., H. Su, C. Zhai, V. S. Perun, A. D. Del Genio, L. S. Nazarenko, L. J. Donner, L. Horowitz, C. Seman, J. Cole, A. Gettelman, M. A. Ringer, L. Rotstayn, S. Jeffrey, T. Wu, F. Brient, J.-L. Dufresne, H. Kawai, T. Koshiro, M. Watanabe, T. S. L’Ecuyer, E. M. Volodin, T. Iversen, H. Drange, M. D. S. Mesquita, W. G. Read, J. W. Waters, B. Tian, J. Teixeira, and G. L. Stephens (2012), Evaluation of cloud and water vapor simulations in CMIP5 climate models using NASA “A-Train” satellite observations, J. Geophys. Res., 117, D14,105, doi:10.1029/2011JD017237.
 Joiner et al. (2011) Joiner, J., Y. Yoshida, A. Vasilkov, L. Corp, and E. Middleton (2011), First observations of global and seasonal terrestrial chlorophyll fluorescence from space, Biogeosci., 8, 637–651, doi:10.5194/bg-8-637-2011.
 Kaipio and Somersalo (2005) Kaipio, J., and E. Somersalo (2005), Statistical and Computational Inverse Problems, vol. 160, Springer-Verlag, New York, NY.
 Karlsson and Svensson (2013) Karlsson, J., and G. Svensson (2013), Consequences of poor representation of Arctic sea-ice albedo and cloud-radiation interactions in the CMIP5 model ensemble, Geophys. Res. Lett., 40, 4374–4379, doi:10.1002/grl.50768.
 Karlsson et al. (2008) Karlsson, J., G. Svensson, and H. Rodhe (2008), Cloud radiative forcing of subtropical low level clouds in global models, Clim. Dyn., 30, 779–788, doi:10.1007/s00382-007-0322-1.
 Kasahara and Washington (1967) Kasahara, A., and W. M. Washington (1967), NCAR global general circulation model of the atmosphere, Mon. Wea. Rev., 95, 389–402.
 Kay et al. (2016) Kay, J. E., C. Wall, V. Yettella, B. Medeiros, C. Hannay, P. Caldwell, and C. Bitz (2016), Global climate impacts of fixing the Southern Ocean shortwave radiation bias in the Community Earth System Model (CESM), J. Climate, 29, 4617–4636, doi:10.1175/JCLI-D-15-0358.1.
 Kennedy and O’Hagan (2001) Kennedy, M. C., and A. O’Hagan (2001), Bayesian calibration of computer models, J. Roy. Statist. Soc. B, 63, 425–464, doi:10.1111/1467-9868.00294.
 Keppel-Aleks et al. (2012) Keppel-Aleks, G., P. O. Wennberg, R. A. Washenfelder, D. Wunch, T. Schneider, G. C. Toon, R. J. Andres, J.-F. Blavier, B. Connor, K. J. Davis, A. R. Desai, J. Messerschmidt, J. Notholt, C. M. Roehl, V. Sherlock, B. B. Stephens, S. A. Vay, and S. C. Wofsy (2012), The imprint of surface fluxes and transport on variations in total column carbon dioxide, Biogeosci., 9, 875–891.
 Khairoutdinov et al. (2005) Khairoutdinov, M., D. Randall, and C. DeMott (2005), Simulations of the atmospheric general circulation using a cloud-resolving model as a superparameterization of physical processes, J. Atmos. Sci., 62, 2136–2154, doi:10.1175/JAS3453.1.
 Khairoutdinov and Randall (2001) Khairoutdinov, M. F., and D. A. Randall (2001), A cloud resolving model as a cloud parameterization in the NCAR Community Climate System Model: Preliminary results, Geophys. Res. Lett., 28, 3617–3620.
 Khairoutdinov et al. (2009) Khairoutdinov, M. F., S. K. Krueger, C.-H. Moeng, P. A. Bogenschutz, and D. A. Randall (2009), Large-eddy simulation of maritime deep tropical convection, J. Adv. Model. Earth Sys., 1, Art. #15, 13 pp., doi:10.3894/JAMES.2009.1.15.
 Klein and Majda (2006) Klein, R., and A. J. Majda (2006), Systematic multiscale models for deep convection on mesoscales, Theor. Comput. Fluid Dyn., 20, 525–551, doi:10.1007/s00162-006-0027-9.
 Klein and Hall (2015) Klein, S. A., and A. Hall (2015), Emergent constraints for cloud feedbacks, Curr. Clim. Change Rep., 1, 276–287, doi:10.1007/s40641-015-0027-1.
 Klocke and Rodwell (2014) Klocke, D., and M. J. Rodwell (2014), A comparison of two numerical weather prediction methods for diagnosing fast-physics errors in climate models, Quart. J. Roy. Meteor. Soc., 140, 517–524, doi:10.1002/qj.2172.
 Knorr (2009) Knorr, W. (2009), Is the airborne fraction of anthropogenic CO_{2} emissions increasing?, Geophys. Res. Lett., 36, L21,710, doi:10.1029/2009GL040613.
 Knutti et al. (2008) Knutti, R., M. R. Allen, P. Friedlingstein, J. M. Gregory, G. C. Hegerl, G. A. Meehl, M. Meinshausen, J. M. Murphy, G.-K. Plattner, S. C. B. Raper, T. F. Stocker, P. A. Stott, H. Teng, and T. M. L. Wigley (2008), A review of uncertainties in global temperature projections over the twenty-first century, J. Climate, 21, 2651–2663, doi:10.1175/2007JCLI2119.1.
 Köhler et al. (2011) Köhler, M., M. Ahlgrimm, and A. Beljaars (2011), Unified treatment of dry convective and stratocumulus-topped boundary layers in the ECMWF model, Quart. J. Roy. Meteor. Soc., 137, 43–57, doi:10.1002/qj.713.
 Krasnopolsky et al. (2013) Krasnopolsky, V. M., M. S. Fox-Rabinovitz, and A. A. Belochitski (2013), Using ensemble of neural networks to learn stochastic convection parameterizations for climate and numerical weather prediction models from data simulated by a cloud resolving model, Adv. Art. Neur. Sys., 2013, 485,913, doi:10.1155/2013/485913.
 Lappen and Randall (2001a) Lappen, C.-L., and D. A. Randall (2001a), Toward a unified parameterization of the boundary layer and moist convection. Part I: A new type of mass-flux model, J. Atmos. Sci., 58, 2021–2036.
 Lappen and Randall (2001b) Lappen, C.-L., and D. A. Randall (2001b), Toward a unified parameterization of the boundary layer and moist convection. Part II: Lateral mass exchanges and subplume-scale fluxes, J. Atmos. Sci., 58, 2037–2051.
 Law et al. (2015) Law, K., A. Stuart, and K. Zygalakis (2015), Data Assimilation: A Mathematical Introduction, Texts in Applied Mathematics, vol. 62, Springer.
 Law and Stuart (2012) Law, K. J. H., and A. M. Stuart (2012), Evaluating data assimilation algorithms, Mon. Wea. Rev., 140, 3757–3782, doi:10.1175/MWR-D-11-00257.1.
 Le Quéré et al. (2013) Le Quéré, C., R. J. Andres, T. Boden, T. Conway, R. A. Houghton, J. I. House, G. Marland, G. P. Peters, G. Van der Werf, A. Ahlström, et al. (2013), The global carbon budget 1959–2011, Earth Sys. Sci. Data, 5, 165–185, doi:10.5194/essd-5-165-2013.
 L’Ecuyer et al. (2015) L’Ecuyer, T. S., H. K. Beaudoing, M. Rodell, W. Olson, B. Lin, S. Kato, C. A. Clayson, E. Wood, J. Sheffield, R. Adler, et al. (2015), The observed state of the energy budget in the early twenty-first century, J. Climate, 28, 8319–8346, doi:10.1175/JCLI-D-14-00556.1.
 Li and Xie (2014) Li, G., and S.-P. Xie (2014), Tropical biases in CMIP5 multimodel ensemble: The excessive equatorial Pacific cold tongue and double ITCZ problems, J. Climate, 27, 1765–1780, doi:10.1175/JCLI-D-13-00337.1.
 Lin (2007) Lin, J.-L. (2007), The double-ITCZ problem in IPCC AR4 coupled GCMs: Ocean–atmosphere feedback analysis, J. Climate, 20, 4497–4525, doi:10.1175/JCLI4272.1.
 Lin et al. (2014) Lin, J.-L., T. Qian, and T. Shinoda (2014), Stratocumulus clouds in Southeastern Pacific simulated by eight CMIP5–CFMIP global climate models, J. Climate, 27, 3000–3022, doi:10.1175/JCLI-D-13-00376.1.
 Liu et al. (2001) Liu, C., M. W. Moncrieff, and W. W. Grabowski (2001), Hierarchical modelling of tropical convective systems using explicit and parametrized approaches, Quart. J. Roy. Meteor. Soc., 127, 493–515.
 Liu et al. (2017) Liu, J., K. W. Bowman, D. S. Schimel, N. C. Parazoo, Z. Jiang, M. Lee, A. A. Bloom, D. Wunch, C. Frankenberg, Y. Sun, C. W. O’Dell, K. R. Gurney, D. Menemenlis, M. Gierach, D. Crisp, and A. Eldering (2017), Contrasting carbon cycle responses of the tropical continents to the 2015–2016 El Niño, Science, 358, eaam5690, doi:10.1126/science.aam5690.
 Lorenz (1996) Lorenz, E. N. (1996), Predictability—a problem partly solved, in Proc. Seminar on Predictability, vol. 1, pp. 1–18, ECMWF, Reading, Berkshire, UK, Reprinted in T. N. Palmer and R. Hagedorn, eds., Predictability of Weather and Climate, Cambridge UP (2006).
 Lorenz and Emanuel (1998) Lorenz, E. N., and K. A. Emanuel (1998), Optimal sites for supplementary weather observations: Simulation with a small model, J. Atmos. Sci., 55, 399–414, doi:10.1175/1520-0469(1998)055<0399:OSFSWO>2.0.CO;2.
 Lucarini et al. (2014) Lucarini, V., R. Blender, C. Herbert, F. Ragone, S. Pascale, and J. Wouters (2014), Mathematical and physical ideas for climate science, Rev. Geophys., 52, 809–859, doi:10.1002/2013RG000446.
 Ma et al. (2013) Ma, H.-Y., S. Xie, J. S. Boyle, S. A. Klein, and Y. Zhang (2013), Metrics and diagnostics for precipitation-related processes in climate model short-range hindcasts, J. Climate, 26, 1516–1534, doi:10.1175/JCLI-D-12-00235.1.
 Majda (2012) Majda, A. J. (2012), Challenges in climate science and contemporary applied mathematics, Comm. Pure Appl. Math., 65, 920–948, doi:10.1002/cpa.21401.
 Majda et al. (2003) Majda, A. J., I. Timofeyev, and E. Vanden-Eijnden (2003), Systematic strategies for stochastic mode reduction in climate, J. Atmos. Sci., 60, 1705–1722.
 Majda et al. (2008) Majda, A. J., C. Franzke, and B. Khouider (2008), An applied mathematics perspective on stochastic modelling for climate, Phil. Trans. R. Soc. A, 366, 2429–2455, doi:10.1098/rsta.2008.0012.
 Manabe et al. (1965) Manabe, S., J. Smagorinsky, and R. F. Strickler (1965), Simulated climatology of a general circulation model with a hydrologic cycle, Mon. Wea. Rev., 93, 769–798.
 Matheou and Chung (2014) Matheou, G., and D. Chung (2014), Large-eddy simulation of stratified turbulence. Part II: Application of the stretched-vortex model to the atmospheric boundary layer, J. Atmos. Sci., 71, 4439–4460, doi:10.1175/JAS-D-13-0306.1.
 Mauritsen et al. (2012) Mauritsen, T., B. Stevens, E. Roeckner, T. Crueger, M. Esch, M. Giorgetta, H. Haak, J. Jungclaus, D. Klocke, D. Matei, U. Mikolajewicz, D. Notz, R. Pincus, H. Schmidt, and L. Tomassini (2012), Tuning the climate of a global model, J. Adv. Model. Earth Sys., 4, M00A01, doi:10.1029/2012MS000154.
 Meinshausen et al. (2009) Meinshausen, M., N. Meinshausen, W. Hare, S. C. B. Raper, K. Frieler, R. Knutti, D. J. Frame, and M. R. Allen (2009), Greenhouse-gas emission targets for limiting global warming to 2 °C, Nature, 458, 1158–1162, doi:10.1038/nature08017.
 Mintz (1965) Mintz, Y. (1965), Very long-term global integration of the primitive equations of atmospheric motion, in WMO-IUGG Symposium on Research and Development Aspects of Long-Range Forecasting, Boulder, Colo., 1964, pp. 141–155, World Meteorological Organization, Geneva.
 Moeng et al. (2007) Moeng, C.-H., J. Dudhia, J. Klemp, and P. Sullivan (2007), Examining two-way grid nesting for large eddy simulation of the PBL using the WRF Model, Mon. Wea. Rev., 135, 2295–2311, doi:10.1175/MWR3406.1.
 Nam et al. (2012) Nam, C., S. Bony, J.-L. Dufresne, and H. Chepfer (2012), The ‘too few, too bright’ tropical low-cloud problem in CMIP5 models, Geophys. Res. Lett., 39, L21,801, doi:10.1029/2012GL053421.
 Neelin et al. (2009) Neelin, J. D., O. Peters, and K. Hales (2009), The transition to strong convection, J. Atmos. Sci., 66, 2367–2384.
 Neelin et al. (2010) Neelin, J. D., A. Bracco, H. Luo, J. C. McWilliams, and J. E. Meyerson (2010), Considerations for parameter optimization and sensitivity in climate models, Proc. Natl. Acad. Sci., 107, 21,349–21,354, doi:10.1073/pnas.1015473107.
 Neggers et al. (2012) Neggers, R. A. J., A. P. Siebesma, and T. Heus (2012), Continuous single-column model evaluation at a permanent meteorological supersite, Bull. Amer. Meteor. Soc., 93, 1389–1400, doi:10.1175/BAMS-D-11-00162.1.
 Nie and Kuang (2012) Nie, J., and Z. Kuang (2012), Temperature and moisture perturbations: a comparison of large-eddy simulations and a convective parameterization based on stochastically entraining parcels, J. Atmos. Sci., 69, in press.
 Nocedal and Wright (2006) Nocedal, J., and S. J. Wright (2006), Numerical Optimization, Springer Series in Operations Research, 2nd ed., Springer.
 Ohno et al. (2016) Ohno, T., M. Satoh, and Y. Yamada (2016), Warm cores, eyewall slopes, and intensities of tropical cyclones simulated by a 7-km-mesh global nonhydrostatic model, J. Atmos. Sci., 73, 4289–4309, doi:10.1175/JAS-D-15-0318.1.
 Ott et al. (2004) Ott, E., B. R. Hunt, I. Szunyogh, A. V. Zimin, E. J. Kostelich, M. Corazza, E. Kalnay, D. J. Patil, and J. A. Yorke (2004), A local ensemble Kalman filter for atmospheric data assimilation, Tellus, 56, 415–428, doi:10.1111/j.1600-0870.2004.00076.x.
 Palmer (2014) Palmer, T. (2014), Build high-resolution global climate models, Nature, 515, 338–339, doi:10.1038/515338a.
 Palmer and Williams (2010) Palmer, T., and P. Williams (Eds.) (2010), Stochastic Physics and Climate Modelling, 480 pp., Cambridge Univ. Press.
 Palmer et al. (1998) Palmer, T. N., R. Gelaro, J. Barkmeijer, and R. Buizza (1998), Singular vectors, metrics, and adaptive observations, J. Atmos. Sci., 55, 633–653, doi:10.1175/1520-0469(1998)055<0633:SVMAAO>2.0.CO;2.
 Palmer et al. (2005) Palmer, T. N., G. J. Shutts, R. Hagedorn, F. J. Doblas-Reyes, T. Jung, and M. Leutbecher (2005), Representing model uncertainty in weather and climate prediction, Annu. Rev. Earth Planet. Sci., 33, 163–193, doi:10.1146/annurev.earth.33.092203.122552.
 Parish and Duraisamy (2016) Parish, E. J., and K. Duraisamy (2016), A paradigm for data-driven predictive modeling using field inversion and machine learning, J. Comp. Phys., 305, 758–774, doi:10.1016/j.jcp.2015.11.012.
 Parishani et al. (2017) Parishani, H., M. S. Pritchard, C. S. Bretherton, M. C. Wyant, and M. Khairoutdinov (2017), Toward low-cloud-permitting cloud superparameterization with explicit boundary layer turbulence, J. Adv. Model. Earth Sys., 9, 1542–1571, doi:10.1002/2017MS000968.
 Park (2014a) Park, S. (2014a), A unified convection scheme (UNICON). Part I: Formulation, J. Atmos. Sci., 71, 3902–3930.
 Park (2014b) Park, S. (2014b), A unified convection scheme (UNICON). Part II: Simulation, J. Atmos. Sci., 71, 3931–3973.
 Phillips et al. (2004) Phillips, T. J., G. L. Potter, D. L. Williamson, R. T. Cederwall, J. S. Boyle, M. Fiorino, J. J. Hnilo, J. G. Olson, S. Xie, and J. J. Yio (2004), Evaluating parameterizations in general circulation models: Climate simulation meets weather prediction, Bull. Amer. Meteor. Soc., 85, 1903–1915, doi:10.1175/BAMS-85-12-1903.
 Pressel et al. (2015) Pressel, K. G., C. M. Kaul, T. Schneider, Z. Tan, and S. Mishra (2015), Large-eddy simulation in an anelastic framework with closed water and entropy balances, J. Adv. Model. Earth Sys., 7, 1425–1456, doi:10.1002/2015MS000496.
 Pressel et al. (2017) Pressel, K. G., S. Mishra, T. Schneider, C. M. Kaul, and Z. Tan (2017), Numerics and subgrid-scale modeling in large eddy simulations of stratocumulus clouds, J. Adv. Model. Earth Sys., 9, 1342–1365, doi:10.1002/2016MS000778.
 Pritchard and Somerville (2009a) Pritchard, M. S., and R. C. J. Somerville (2009a), Empirical orthogonal function analysis of the diurnal cycle of precipitation in a multiscale climate model, Geophys. Res. Lett., 36, L05,812, doi:10.1029/2008GL036964.
 Pritchard and Somerville (2009b) Pritchard, M. S., and R. C. J. Somerville (2009b), Assessing the diurnal cycle of precipitation in a multiscale climate model, J. Adv. Model. Earth Sys., 1, doi:10.3894/JAMES.2009.1.12.
 Qu et al. (2014) Qu, X., A. Hall, S. A. Klein, and P. M. Caldwell (2014), On the spread of changes in marine low cloud cover in climate model simulations of the 21st century, Climate Dyn., 42, 2603–2626, doi:10.1007/s00382-013-1945-z.
 Qu et al. (2015) Qu, X., A. Hall, S. A. Klein, and A. M. DeAngelis (2015), Positive tropical marine low-cloud cover feedback inferred from cloud-controlling factors, Geophys. Res. Lett., 42, doi:10.1002/2015GL065627.
 Randall (2013) Randall, D. A. (2013), Beyond deadlock, Geophys. Res. Lett., 40, 5970–5976, doi:10.1002/2013GL057998.
 Randall and Wielicki (1997) Randall, D. A., and B. A. Wielicki (1997), Measurements, models, and hypotheses in the atmospheric sciences, Bull. Amer. Meteor. Soc., 78, 400–406.
 Randall et al. (2003) Randall, D. A., M. Khairoutdinov, A. Arakawa, and W. Grabowski (2003), Breaking the cloud parameterization deadlock, Bull. Amer. Meteor. Soc., 84, 1547–1564, doi:10.1175/BAMS-84-11-1547.
 Rodwell and Palmer (2007) Rodwell, M. J., and T. N. Palmer (2007), Using numerical weather prediction to assess climate models, Quart. J. Roy. Meteor. Soc., 133, 129–146, doi:10.1002/qj.23.
 Romps (2016) Romps, D. M. (2016), The Stochastic Parcel Model: A deterministic parameterization of stochastically entraining convection, J. Adv. Model. Earth Sys., 8, 319–344, doi:10.1002/2015MS000537.
 Romps and Kuang (2010) Romps, D. M., and Z. Kuang (2010), Nature versus nurture in shallow convection, J. Atmos. Sci., 67, 1655–1666.
 Ruiz and Pulido (2015) Ruiz, J., and M. Pulido (2015), Parameter estimation using ensemble-based data assimilation in the presence of model error, Mon. Wea. Rev., 143, 1568–1582, doi:10.1175/MWR-D-14-00017.1.
 Ruiz et al. (2013) Ruiz, J. J., M. Pulido, and T. Miyoshi (2013), Estimating model parameters with ensemble-based data assimilation: A review, J. Meteor. Soc. Japan, 91, 79–99, doi:10.2151/jmsj.2013-201.
 Schalkwijk et al. (2015) Schalkwijk, J., H. J. J. Jonker, A. P. Siebesma, and E. Van Meijgaard (2015), Weather forecasting using GPU-based large-eddy simulations, Bull. Amer. Meteor. Soc., 96, 715–723, doi:10.1175/BAMS-D-14-00114.1.
 Schirber et al. (2013) Schirber, S., D. Klocke, R. Pincus, J. Quaas, and J. L. Anderson (2013), Parameter estimation using data assimilation in an atmospheric general circulation model: From a perfect toward the real world, J. Adv. Model. Earth Sys., 5, 58–70, doi:10.1029/2012MS000167.
 Schneider et al. (2017) Schneider, T., J. Teixeira, C. S. Bretherton, F. Brient, K. G. Pressel, C. Schär, and A. P. Siebesma (2017), Climate goals and computing the future of clouds, Nature Climate Change, 7, 3–5, doi:10.1038/nclimate3190.
 Schulthess (2015) Schulthess, T. C. (2015), Programming revisited, Nature Phys., 11, 369–373.
 Shepherd et al. (2012) Shepherd, A., E. R. Ivins, A. Geruo, V. R. Barletta, M. J. Bentley, S. Bettadpur, K. H. Briggs, D. H. Bromwich, R. Forsberg, N. Galin, M. Horwath, S. Jacobs, I. Joughin, M. A. King, J. T. M. Lenaerts, J. Li, et al. (2012), A reconciled estimate of ice-sheet mass balance, Science, 338, 1183–1189, doi:10.1126/science.1228102.
 Siebesma et al. (2003) Siebesma, A. P., C. S. Bretherton, A. Brown, A. Chlond, J. Cuxart, P. G. Duynkerke, H. Jiang, M. Khairoutdinov, D. Lewellen, C. H. Moeng, E. Sanchez, B. Stevens, and D. E. Stevens (2003), A large eddy simulation intercomparison study of shallow cumulus convection, J. Atmos. Sci., 60, 1201–1219.
 Siebesma et al. (2007) Siebesma, A. P., P. M. M. Soares, and J. Teixeira (2007), A combined eddy-diffusivity mass-flux approach for the convective boundary layer, J. Atmos. Sci., 64, 1230–1248, doi:10.1175/JAS3888.1.
 Siler et al. (2017) Siler, N., S. Po-Chedley, and C. S. Bretherton (2017), Variability in modeled cloud feedback tied to differences in the climatological spatial pattern of clouds, Clim. Dyn., doi:10.1007/s00382-017-3673-2.
 Simmons et al. (2016) Simmons, A., J.-L. Fellous, V. Ramaswamy, K. Trenberth, G. Asrar, M. Balmaseda, J. P. Burrows, P. Ciais, M. Drinkwater, P. Friedlingstein, N. Gobron, E. Guilyardi, D. Halpern, M. Heimann, J. Johannessen, P. F. Levelt, E. Lopez-Baeza, J. Penner, R. Scholes, and T. Shepherd (2016), Observation and integrated Earth-system science: A roadmap for 2016–2025, Adv. Space Res., 57, 2037–2103, doi:10.1016/j.asr.2016.03.008.
 Smagorinsky (1963) Smagorinsky, J. (1963), General circulation experiments with the primitive equations. I. The basic experiment, Mon. Wea. Rev., 91, 99–164.
 Smagorinsky et al. (1965) Smagorinsky, J., S. Manabe, and J. L. Holloway, Jr. (1965), Numerical results from a nine-level general circulation model of the atmosphere, Mon. Wea. Rev., 93, 727–768.
 Soden and Held (2006) Soden, B. J., and I. M. Held (2006), An assessment of climate feedbacks in coupled ocean-atmosphere models, J. Climate, 19, 3354–3360, doi:10.1175/JCLI3799.1.
 Solonen et al. (2012) Solonen, A., P. Ollinaho, M. Laine, H. Haario, J. Tamminen, and H. Järvinen (2012), Efficient MCMC for climate model parameter estimation: Parallel adaptive chains and early rejection, Bayesian Anal., 7, 715–736, doi:10.1214/12-BA724.
 Stainforth et al. (2005) Stainforth, D. A., T. Aina, C. Christensen, M. Collins, N. Faull, D. J. Frame, J. A. Kettleborough, S. Knight, A. Martin, J. M. Murphy, C. Piani, D. Sexton, L. A. Smith, R. A. Spicer, A. J. Thorpe, and M. R. Allen (2005), Uncertainty in predictions of the climate response to rising levels of greenhouse gases, Nature, 433, 403–406.
 Stan et al. (2010) Stan, C., M. Khairoutdinov, C. A. DeMott, V. Krishnamurthy, D. M. Straus, D. A. Randall, J. L. Kinter, and J. Shukla (2010), An ocean-atmosphere climate simulation with an embedded cloud resolving model, Geophys. Res. Lett., 37, L01,702, doi:10.1029/2009GL040822.
 Stensrud (2007) Stensrud, D. J. (2007), Parameterization Schemes: Keys to Understanding Numerical Weather Prediction Models, 477 pp., Cambridge Univ. Press, Cambridge, UK.
 Stephens et al. (2017) Stephens, G., D. Winker, J. Pelon, C. Trepte, D. Vane, C. Yuhas, T. L’Ecuyer, and M. Lebsock (2017), CloudSat and CALIPSO within the A-Train: Ten years of actively observing the earth system, Bull. Amer. Meteor. Soc., in press, doi:10.1175/BAMS-D-16-0324.1.
 Stephens (2005) Stephens, G. L. (2005), Cloud feedbacks in the climate system: A critical review, J. Climate, 18, 237–273, doi:10.1175/JCLI3243.1.
 Stephens et al. (2002) Stephens, G. L., D. G. Vane, R. J. Boain, G. G. Mace, K. Sassen, Z. Wang, A. J. Illingworth, E. J. O’Connor, W. B. Rossow, S. L. Durden, et al. (2002), The CloudSat mission and the A-train, Bull. Amer. Meteor. Soc., 83, 1771–1790, doi:10.1175/BAMS-83-12-1771.
 Stevens et al. (2005) Stevens, B., C.-H. Moeng, A. S. Ackerman, C. S. Bretherton, A. Chlond, S. de Roode, J. Edwards, J.-C. Golaz, H. Jiang, M. Khairoutdinov, et al. (2005), Evaluation of large-eddy simulations via observations of nocturnal marine stratocumulus, Mon. Wea. Rev., 133, 1443–1462, doi:10.1175/MWR2930.1.
 Stewart et al. (2014) Stewart, L., S. L. Dance, N. K. Nichols, J. Eyre, and J. Cameron (2014), Estimating interchannel observation-error correlations for IASI radiance data in the Met Office system, Quart. J. Roy. Meteor. Soc., 140, 1236–1244, doi:10.1002/qj.2211.
 Sun et al. (2017) Sun, Y., C. Frankenberg, J. D. Wood, D. S. Schimel, M. Jung, L. Guanter, D. T. Drewry, M. Verma, A. Porcar-Castell, T. J. Griffis, L. Gu, T. S. Magney, P. Köhler, B. Evans, and K. Yuen (2017), OCO-2 advances photosynthesis observation from space via solar-induced chlorophyll fluorescence, Science, 358, eaam5747, doi:10.1126/science.aam5747.
 Suselj et al. (2013) Suselj, K., J. Teixeira, and D. Chung (2013), A unified model for moist convective boundary layers based on a stochastic eddy-diffusivity/mass-flux parameterization, J. Atmos. Sci., 70, 1929–1953.
 Suzuki et al. (2013) Suzuki, K., J.-C. Golaz, and G. L. Stephens (2013), Evaluating cloud tuning in a climate model with satellite observations, Geophys. Res. Lett., 40, 4463–4468, doi:10.1002/grl.50874.
 Swanson and Pierrehumbert (1997) Swanson, K. L., and R. T. Pierrehumbert (1997), Lower-tropospheric heat transport in the Pacific storm track, J. Atmos. Sci., 54, 1533–1543.
 Tett et al. (2013) Tett, S. F. B., M. J. Mineter, C. Cartis, D. J. Rowlands, and P. Liu (2013), Can top-of-atmosphere radiation measurements constrain climate predictions? Part I: Tuning, J. Climate, 26, 9348–9366, doi:10.1175/JCLI-D-12-00595.1.
 Tian (2015) Tian, B. (2015), Spread of model climate sensitivity linked to double-intertropical convergence zone bias, Geophys. Res. Lett., 42, 4133–4141, doi:10.1002/2015GL064119.
 Todd-Brown et al. (2013) Todd-Brown, K., J. Randerson, W. Post, F. Hoffman, C. Tarnocai, E. Schuur, and S. Allison (2013), Causes of variation in soil carbon simulations from CMIP5 Earth system models and comparison with observations, Biogeosci., 10, 1717–1736, doi:10.5194/bg-10-1717-2013.
 Vaughan et al. (2013) Vaughan, D. G., J. C. Comiso, I. Allison, J. Carrasco, G. Kaser, R. Kwok, P. Mote, T. Murray, F. Paul, J. Ren, E. Rignot, O. Solomina, K. Steffen, and T. Zhang (2013), Observations: Cryosphere, in Climate Change 2013: The Physical Science Basis. Contribution of Working Group I to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change, edited by T. F. Stocker, D. Qin, G.-K. Plattner, M. Tignor, S. K. Allen, J. Boschung, A. Nauels, Y. Xia, V. Bex, and P. M. Midgley, chap. 4, pp. 317–382, Cambridge University Press, Cambridge, UK, and New York, NY, USA.
 Vial et al. (2013) Vial, J., J.-L. Dufresne, and S. Bony (2013), On the interpretation of inter-model spread in CMIP5 climate sensitivity estimates, Clim. Dyn., 41, 3339–3362, doi:10.1007/s00382-013-1725-9.
 Wan et al. (2014) Wan, H., P. J. Rasch, K. Zhang, Y. Qian, H. Yan, and C. Zhao (2014), Short ensembles: an efficient method for discerning climate-relevant sensitivities in atmospheric general circulation models, Geosci. Model Dev., 7, 1961–1977, doi:10.5194/gmd-7-1961-2014.
 Wang et al. (2014) Wang, Q., R. Hu, and P. Blonigan (2014), Least Squares Shadowing sensitivity analysis of chaotic limit cycle oscillations, J. Comp. Phys., 267, 210–224, doi:10.1016/j.jcp.2014.03.002.
 Webb et al. (2001) Webb, M., C. Senior, S. Bony, and J.-J. Morcrette (2001), Combining ERBE and ISCCP data to assess clouds in the Hadley Centre, ECMWF and LMD atmospheric climate models, Clim. Dyn., 17, 905–922.
 Webb et al. (2013) Webb, M. J., F. H. Lambert, and J. M. Gregory (2013), Origins of differences in climate sensitivity, forcing and feedback in climate models, Clim. Dyn., 40, 677–707, doi:10.1007/s00382-012-1336-x.
 Wenzel et al. (2014) Wenzel, S., P. M. Cox, V. Eyring, and P. Friedlingstein (2014), Emergent constraints on climate-carbon cycle feedbacks in the CMIP5 Earth system models, J. Geophys. Res. Biogeosci., 119, 794–807.
 Wenzel et al. (2016) Wenzel, S., P. M. Cox, V. Eyring, and P. Friedlingstein (2016), Projected land photosynthesis constrained by changes in the seasonal cycle of atmospheric CO₂, Nature, 538, 499–501, doi:10.1038/nature19772.
 Wilks (2005) Wilks, D. S. (2005), Effects of stochastic parametrizations in the Lorenz ’96 system, Quart. J. Roy. Meteor. Soc., 131, 389–407, doi:10.1256/qj.04.03.
 Wood (2012) Wood, R. (2012), Stratocumulus clouds, Mon. Wea. Rev., 140, 2373–2423, doi:10.1175/MWR-D-11-00121.1.
 Wouters and Lucarini (2013) Wouters, J., and V. Lucarini (2013), Multi-level dynamical systems: Connecting the Ruelle response theory and the Mori-Zwanzig approach, J. Stat. Phys., 151, 850–860, doi:10.1007/s10955-013-0726-8.
 Wouters et al. (2016) Wouters, J., S. I. Dolaptchiev, and V. Lucarini (2016), Parameterization of stochastic multiscale triads, Nonlin. Processes Geophys., 23, 435–445, doi:10.5194/npg-23-435-2016.
 Xie et al. (2012) Xie, S., H.-Y. Ma, J. S. Boyle, S. A. Klein, and Y. Zhang (2012), On the correspondence between short- and long-time-scale systematic errors in CAM4/CAM5 for the Year of Tropical Convection, J. Climate, 25, 7937–7955, doi:10.1175/JCLI-D-12-00134.1.
 Yokota et al. (2009) Yokota, T., Y. Yoshida, N. Eguchi, Y. Ota, T. Tanaka, H. Watanabe, and S. Maksyutov (2009), Global concentrations of CO₂ and CH₄ retrieved from GOSAT: First preliminary results, SOLA, 5, 160–163.
 Zhang et al. (2005) Zhang, M. H., W. Y. Lin, S. A. Klein, J. T. Bacmeister, S. Bony, R. T. Cederwall, A. D. Del Genio, J. J. Hack, N. G. Loeb, U. Lohmann, P. Minnis, I. Musat, R. Pincus, et al. (2005), Comparing clouds and their seasonal variations in 10 atmospheric general circulation models with satellite measurements, J. Geophys. Res., 110, D15S02, doi:10.1029/2004JD005021.
 Zhang et al. (2015) Zhang, X., H. Liu, and M. Zhang (2015), Double ITCZ in coupled ocean-atmosphere models: From CMIP3 to CMIP5, Geophys. Res. Lett., 42, 8651–8659, doi:10.1002/2015GL065973.
 Zhao et al. (2016) Zhao, M., I. M. Held, V. Ramaswamy, S.-J. Lin, Y. Ming, P. Ginoux, B. Wyman, L. J. Donner, and D. Paynter (2016), Uncertainty in model climate sensitivity traced to representations of cumulus precipitation microphysics, J. Climate, 29, 543–560, doi:10.1175/JCLI-D-15-0191.1.
 Zhu et al. (2010) Zhu, P., B. A. Albrecht, V. P. Ghate, and Z. Zhu (2010), Multiple-scale simulations of stratocumulus clouds, J. Geophys. Res., 115, D23,201, doi:10.1029/2010JD014400.