In my last post, I gave some of the details of Andrew Dessler’s latest paper, which criticizes a recent paper by Roy Spencer and Danny Braswell. One of the criticisms I highlighted was the charge that S&B said they had analyzed output from 14 climate models, but only compared 6 of the models to the data: the 3 with the lowest, and the 3 with the highest, climate sensitivity. They argued that the 3 least sensitive models did a slightly better job (on average) than the 3 most sensitive ones, but none of them were very good at reproducing the data, so maybe that indicates the real climate is less sensitive than ANY of the models. They also used the temperature series (there are several) that produced the most marked model-data discrepancy. I provided a number of links to show that Spencer has a history of botching his statistics, and noted that in the past he has simply brushed off criticisms of his statistical abuse, relying on the statistical naïveté of his core audience.
Well, true to form, Spencer has now responded to Dessler’s criticism with more statistical sleight of hand. But as I predicted, the facade seems to be cracking a bit, because this time the errors are a bit too obvious, and much too egregious. I won’t go into all of Spencer’s points (since some of them deal with points of Dessler’s that I didn’t cover last time), and will instead focus on the issues of statistical abuse and data hiding.
One of Dessler’s points was that SOME of the models (i.e., the ones that simulate El Niño cycles the best) do pretty well at mimicking the pattern S&B pointed out in the data. And since models that don’t allow clouds to force the system can mimic the data well, the fact that another model (S&B’s) that does allow clouds to force the system can also mimic the data well is sort of meaningless. In other words, Dessler wasn’t claiming to have proven anything about clouds this way; he was showing that the kind of argument S&B were trying to make doesn’t hold water. But let’s see how Roy responded.
But look at what Dessler has done: he has used models which DO NOT ALLOW cloud changes to affect temperature, in order to support his case that cloud changes do not affect temperature! While I will have to think about this some more, it smacks of circular reasoning.
Unbelievable. He has completely turned Dessler’s argument on its head.
Another of Dessler’s criticisms was that if Spencer and Braswell want to make an argument that one data set (e.g., the observational data) is “different” than another, they have to do a statistical analysis. That’s just the nature of the problem. But S&B failed to calculate error bars for either the model results (which Trenberth and Fasullo did) or the slopes they calculated for the data (which Dessler did). This is all just standard statistics that every scientist should know. But what does Spencer think of error bars?
Figure 2 in his paper, we believe, helps make our point for us: there is a substantial difference between the satellite measurements and the climate models. He tries to minimize the discrepancy by putting 2-sigma error bounds on the plots and claiming the satellite data are not necessarily inconsistent with the models.
But this is NOT the same as saying the satellite data SUPPORT the models. After all, the IPCC’s best estimate projections of future warming from a doubling of CO2 (3 deg. C) is almost exactly the average of all of the models sensitivities! So, when the satellite observations do depart substantially from the average behavior of the models, this raises an obvious red flag.
With that, Spencer dispenses with a standard statistical technique for dealing with uncertainty! For anyone wondering what those 2-sigma bounds actually do, here’s a quick illustration; then we can review Spencer’s further argument that it’s a big deal if the AVERAGE behavior of all the models doesn’t match the data in this case.
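Here’s a minimal sketch of the idea, with made-up numbers standing in for monthly anomaly series (the variable names and values are mine, purely for illustration):

```python
# Illustrative only: synthetic monthly anomalies standing in for observed
# temperature and flux. Names and numbers are hypothetical, not S&B's.
import numpy as np
from scipy.stats import linregress

rng = np.random.default_rng(0)
n_months = 120
temp = rng.normal(0.0, 0.2, n_months)               # temperature anomalies (K)
flux = 2.0 * temp + rng.normal(0.0, 1.0, n_months)  # noisy flux anomalies (W/m^2)

fit = linregress(temp, flux)                        # OLS slope and its standard error
lo, hi = fit.slope - 2 * fit.stderr, fit.slope + 2 * fit.stderr
print(f"observed slope = {fit.slope:.2f}, 2-sigma interval = [{lo:.2f}, {hi:.2f}]")

# A model's slope is only "inconsistent with the data" if it falls outside
# the interval -- comparing bare point estimates tells you nothing.
model_slope = 1.5                                   # invented model value for the demo
print("inconsistent" if not (lo <= model_slope <= hi) else "consistent")
```

The point is simple: if the observed slope, with its 2-sigma interval, overlaps the model values, you cannot claim the data contradict the models, and waving the error bars away doesn’t change that.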
What Dessler (and Trenberth and Fasullo) showed was that the deviation of the model results from the observed values probably had little, if anything, to do with climate sensitivity. The models that do well are the ones that have ALREADY been shown to mimic the El Niño cycle well. That makes sense, because the data correlations S&B calculated were over the span of MONTHS, the kind of timescale El Niño operates on. Given that the models that do well are NOT among the “3 least sensitive” and “3 most sensitive” models S&B showed, how can this analysis possibly be getting at climate sensitivity in any meaningful way?
Now, if deviations from the observations S&B highlight are just due to how well the models reproduce El Niño, then that means some models are probably better than others at mimicking short-term behavior in the climate system… something else that was already well known. What it doesn’t tell us is how good any of the models are at projecting long-term behavior, which is WHAT EVERYONE CARES ABOUT.
[UPDATE: Down in the comments, HAS points out this paper by Belmadani et al., which he says indicates that the models that did the best at mimicking the observational data pattern in the lag regression statistics aren’t necessarily “the best” at mimicking ENSO. This is based on a different kind of comparison (less direct), and I don’t know what other model-data comparisons others have made regarding this issue, so it’s hard for me to say at this point whether I think HAS is right. The fact is, however, that the lag regression analysis certainly gets at SOMETHING about short-term variability, so whether it’s El Niño alone, or a combination of factors that Spencer’s analysis is getting at, it still doesn’t seem to have anything much to do with climate sensitivity.]
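For readers who haven’t seen this kind of diagnostic, here’s a rough sketch of what a lag regression looks like: regress flux anomalies against temperature anomalies at a range of leads and lags, in months, and see how the slope varies with lag. The data here are synthetic and the 3-month lag is an arbitrary choice for the demo; this is not S&B’s actual code or data.

```python
# Minimal lag-regression sketch: OLS slope of flux vs. temperature at each
# lead/lag in months. Synthetic series; not S&B's actual data or method.
import numpy as np

rng = np.random.default_rng(1)
n = 240                                                   # 20 years of monthly anomalies
temp = rng.normal(0.0, 0.2, n)
flux = 1.0 * np.roll(temp, 3) + rng.normal(0.0, 0.5, n)   # flux trails temp by 3 months (assumed)

def lag_slope(x, y, lag):
    """Slope of y at time t+lag regressed on x at time t."""
    if lag > 0:
        x, y = x[:-lag], y[lag:]
    elif lag < 0:
        x, y = x[-lag:], y[:lag]
    return np.polyfit(x, y, 1)[0]

for lag in range(-12, 13):
    print(f"lag {lag:+3d} months: slope = {lag_slope(temp, flux, lag):+.2f} W/m^2 per K")
```

The peak of the slope-vs-lag curve tells you about covariability on exactly the monthly timescales discussed above; nothing in it speaks to equilibrium sensitivity.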
Roy Spencer really, really wants to use the data to say something profound about climate sensitivity (as long as it is low). If that’s his aim, here’s a tip. He could come up with a statistical test for how well the models reproduce the data (ALL the data sets, not just the one he chose). For example, he could simply use the sum of squared error, and then plot that against the climate sensitivity of the models; see the sketch below. Maybe a statistician can pipe up here and suggest a more sophisticated way of doing it. I have no idea what Spencer would come up with (although it’s clear from Dessler’s plots that it wouldn’t be a strong correlation), but it would at least nominally address the question he’s attempting to answer.
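To make that concrete, here’s a hypothetical sketch of the kind of test I mean. Every number here is invented except the model count of 14; the point is the procedure, not the values.

```python
# Hypothetical sketch of the proposed test: score each of the 14 models by the
# sum of squared error between its lag-regression curve and the observed one,
# then correlate those scores with the models' climate sensitivities.
import numpy as np
from scipy.stats import pearsonr

lags = np.arange(-12, 13)
obs_curve = np.exp(-np.abs(lags) / 4.0)              # stand-in for the observed curve

rng = np.random.default_rng(2)
n_models = 14
model_curves = obs_curve + rng.normal(0.0, 0.3, (n_models, lags.size))
sensitivity = rng.uniform(2.0, 4.5, n_models)        # invented sensitivities (K per CO2 doubling)

sse = ((model_curves - obs_curve) ** 2).sum(axis=1)  # goodness-of-fit score, one per model

r, p = pearsonr(sensitivity, sse)
print(f"fit error vs. sensitivity: r = {r:.2f}, p = {p:.2f}")
```

If low sensitivity really were driving the model-data mismatch, a correlation computed like this, across ALL 14 models and ALL the data sets, is what would show it.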
As it is, he’s waving a red herring, while simultaneously accusing Dessler of doing so.
Finally, what does he say about the “missing” data–the model output that undercut his case, but somehow didn’t make it into his Results or figures?
How is picking the 3 most sensitive models AND the 3 least sensitive models going to “provide maximum support for (our) hypothesis”? If I had picked ONLY the 3 most sensitive, or ONLY the 3 least sensitive, that might be cherry picking…depending upon what was being demonstrated.
And where is the evidence those 6 models produce the best support for our hypothesis? I would have had to run hundreds of combinations of the 14 models to accomplish that. Is that what Dr. Dessler is accusing us of?
Instead, the point of using the 3 most sensitive and 3 least sensitive models was to emphasize that not only are the most sensitive climate models inconsistent with the observations, so are the least sensitive models.
Remember, the IPCC’s best estimate of 3 deg. C warming is almost exactly the warming produced by averaging the full range of its models’ sensitivities together. The satellite data depart substantially from that. I think inspection of Dessler’s Fig. 2 supports my point.
Again, he’s still on the sensitivity issue. This is why I didn’t go all the way and accuse Roy of deliberately hiding data. I’ve demonstrated over and over that he has a tendency to find low climate sensitivity wherever he looks, whether or not his analysis really addresses the issue. In this case, he was probably so intent on his goal that it didn’t even cross his mind that he was leaving out data that undercut his case.