In catastrophe risk management, where estimates of property losses due to hurricane activity, for example, can easily escalate into the billions, rigorous uncertainty measures are critical. It's easy to focus on a single number representing an estimate of some variable of interest, like total U.S. annual losses due to hurricanes. But without a quantification of its uncertainty, that estimate is useless in any context where actual decisions need to be made.
While pursuing my doctorate in Applied Mathematics, I earned my graduate stipend one semester teaching introductory undergraduate statistics. Most students were taking the class just to meet a minimum math requirement, and did not seem to share my enthusiasm for quantitative thinking. I found it particularly effective to cue them verbally when I was about to present material requiring multiple logical steps to give them a chance to summon their mental focus. This pedagogical trick was particularly handy the day I covered confidence intervals, and it went something like this:
- "OK guys, steel yourselves for this next definition...I promise it will eventually end."
- <Writes on board while speaking:> "An interval estimator is a rule that specifies the method for using sample measurements to calculate two numbers that form the endpoints of an interval."
- <Turns to face class.> "OK, hold that thought for a second and notice that the interval itself is random, since it's a function of the random sample." <Writes on board: 'Nota Bene! The interval is *random*!'>
- "Now, stay with me <continues to write on board:> the probability that this randomly generated interval covers the parameter you want to estimate is the confidence coefficient; together the interval estimator and the confidence coefficient are called a confidence interval."
- <Faces class again.> "OK, it's over. Are you all still awake?"
Even for this statistics nerd, re-reading through that definition induces heavy sighs. Let me make it clear that my problem with confidence intervals has nothing to do with their utility: the frequentist concept is one tool available for rigorous uncertainty quantification. My problem is that their definition and interpretation are completely non-intuitive.
The snag is that the definition treats the interval as random, while the parameter it is meant to contain is treated as fixed. The "95%" modifier on a 95% confidence interval indicates that if the data sampling were repeated many times, the rule for building the interval would produce intervals that cover the true value of the parameter about 95 times out of 100. As a practical matter, repeated sampling and estimation happen very infrequently in the real world, especially when the data come from historical observations, as in the estimation of losses from extreme weather events!
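The repeated-sampling reading of "95%" can be checked with a small simulation. What follows is only a minimal sketch under toy assumptions that are mine, not part of any loss model: normally distributed data with a known standard deviation, and made-up values for `true_mean`, `sigma`, and `n`. It redraws the sample many times, rebuilds the interval each time, and counts how often the fixed true mean lands inside.

```python
import random
import statistics


def coverage_rate(true_mean=10.0, sigma=3.0, n=50, trials=2000, seed=0):
    """Fraction of repeated samples whose 95% interval covers true_mean.

    Toy setup (an assumption for illustration): normal data with known sigma.
    """
    rng = random.Random(seed)
    z = statistics.NormalDist().inv_cdf(0.975)  # two-sided 95% critical value
    half_width = z * sigma / n ** 0.5           # fixed because sigma is known
    covered = 0
    for _ in range(trials):
        # A fresh random sample gives a fresh random interval.
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        xbar = statistics.fmean(sample)
        # The parameter is fixed; only the interval endpoints move.
        if xbar - half_width <= true_mean <= xbar + half_width:
            covered += 1
    return covered / trials


print(f"Empirical coverage over repeated samples: {coverage_rate():.3f}")
```

The printed fraction should sit near 0.95. Notice that the statement is about the long-run behavior of the interval-building rule, not about any single interval computed from one historical record.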
Somewhat more philosophically, confidence intervals seem to be just plain awkward for the human mind. Well-documented social experiments have repeatedly shown that people consistently overestimate their confidence in estimates when trying to mentally construct confidence intervals [1, 2], indicating that it's hard, if not impossible, for us to get a good mental handle on their squirrely technical definition.
Lesser-known credible intervals, which arise in Bayesian contexts, are the much more intuitive cousins of confidence intervals. In the Bayesian paradigm, any parameter of interest, such as losses incurred on U.S. property due to hurricanes in any given year, is viewed as a random quantity. A Bayesian 95% credible interval is simply an interval containing 95% of the probability mass of the parameter's posterior distribution.
Boom. See how easy and intuitive the definition of credible intervals was? (I didn't even have to pump you up to get your brain "in the zone" before presenting it!)
So if someone gives you a 95% credible interval for the annual U.S. property loss due to hurricanes, they mean to tell you that, given their model and data, there is a 95% probability that the losses in a given year will fall within that interval. Credible intervals don't require you to imagine replaying the earth's history over and over, rebuilding the uncertainty interval from a new set of historical storm data each time. But that awkward thought experiment is necessary to understand the relationship of a frequentist confidence interval to the expected loss.
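To make the credible-interval definition concrete, here is a hedged sketch that reads an equal-tailed 95% interval directly off draws from a posterior distribution. Everything specific in it is an assumption for illustration: the annual storm counts are made up, the Gamma prior with a Poisson likelihood is a textbook conjugate model rather than anything a catastrophe modeler necessarily uses, and the parameter here is a mean annual storm count rather than a dollar loss.

```python
import random


def posterior_rate_draws(counts, prior_shape=1.0, prior_rate=1.0,
                         n_draws=20000, seed=0):
    """Draws from the Gamma posterior of a Poisson rate (conjugate update)."""
    rng = random.Random(seed)
    shape = prior_shape + sum(counts)  # posterior shape
    rate = prior_rate + len(counts)    # posterior rate
    # random.gammavariate takes a scale parameter, i.e. 1 / rate.
    return [rng.gammavariate(shape, 1.0 / rate) for _ in range(n_draws)]


def equal_tailed_interval(draws, level=0.95):
    """Interval that leaves (1 - level) / 2 of the draws in each tail."""
    ordered = sorted(draws)
    tail = (1.0 - level) / 2.0
    lo = ordered[int(tail * len(ordered))]
    hi = ordered[int((1.0 - tail) * len(ordered)) - 1]
    return lo, hi


# Hypothetical annual storm counts, invented purely for this example.
counts = [2, 0, 1, 3, 1, 0, 2, 1, 4, 1]
draws = posterior_rate_draws(counts)
lo, hi = equal_tailed_interval(draws)

# Fraction of posterior draws that fall inside the interval.
inside = sum(lo <= d <= hi for d in draws) / len(draws)
print(f"95% credible interval for the mean annual count: ({lo:.2f}, {hi:.2f})")
print(f"Share of posterior draws inside the interval: {inside:.3f}")
```

The interval is a direct probability statement about the parameter, given the model and the data, with no appeal to hypothetical repeated samples. An equal-tailed interval is only one convention; a highest-density interval would be another reasonable choice.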
Clearly, credible intervals are my personal uncertainty interval of choice. But the real takeaway for decision makers is a reminder to take the time to fully understand whatever uncertainty intervals are presented along with any model estimate.
[1] Alpert, Marc; Howard Raiffa (1982). "A progress report on the training of probability assessors". In Daniel Kahneman, Paul Slovic, Amos Tversky (eds.), Judgment under Uncertainty: Heuristics and Biases. Cambridge University Press.
[2] Soll, J., & Klayman, J. (2004). Overconfidence in interval estimates. Journal of Experimental Psychology: Learning, Memory, and Cognition, 30, 299-314.